Forum Moderators: Robert Charlton & goodroi

Massive Negative SEO Attack - What to do?

         

westcoast

7:32 pm on Apr 7, 2021 (gmt 0)

5+ Year Member Top Contributors Of The Month



We are a very old website, over 20 years old. Decently large. Over the past year our traffic has dropped sharply (-60%) and unexpectedly, and I had always felt something was off.

We have fought off negative-SEO attacks in the past, but this new one is just massive in scope.

It appears 10,000+ hacked domains are being used to generate spam links to our site. The pages use lifted text mushed up from a bunch of sites like ours, and then a link at the bottom pointing to our site. When I look at Google, I see that there are 160,000 such links to our site currently INDEXED by Google from these domains.

It is completely out of control, and there are far, FAR too many to even work out what all these domains are, let alone disavow them all.

Are there any suggestions on what can be done? It is a clear case where we are just being steamrolled by this negative SEO, our rankings are sharply down, and there is no obvious way for us to escape.

Please help?

NickMNS

7:22 pm on May 24, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Your only solution then is to block the scraping. If this is recurring on an ongoing basis, it should be doable to find the offending IPs/user agents that are doing the scraping and block them.

To find the offender you will need to look through your logs for IPs that appear frequently and that have visited many pages, including those that you know have links pointing to them. This will require some work and may entail a game of whack-a-mole, but I assume that with sufficient effort the spammers will move on to a site that is not protecting its content instead of wasting resources trying to outsmart you.
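
For anyone who wants to try the log search described above, here is a rough sketch of one way to do it. It assumes an Apache/nginx-style common or combined access log; the function name, the `watched_paths` argument and the `min_pages` threshold are made up for illustration and will need tuning against your own traffic.

```python
import re
from collections import defaultdict

# Matches the start of a common/combined log line:
# client IP, two ignored fields, [timestamp], "METHOD path ..."
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|HEAD|POST) (\S+)')

def suspicious_ips(lines, watched_paths, min_pages=50):
    """Return (ip, distinct_pages, watched_hits) tuples, most watched hits first.

    watched_paths: pages you know have been copied onto spam pages.
    min_pages: hypothetical threshold for "visited many pages".
    """
    pages = defaultdict(set)
    for line in lines:
        m = LOG_LINE.match(line)
        if m:
            ip, path = m.groups()
            pages[ip].add(path)
    watched = set(watched_paths)
    report = [
        (ip, len(paths), len(paths & watched))
        for ip, paths in pages.items()
        if len(paths) >= min_pages and paths & watched
    ]
    return sorted(report, key=lambda row: (-row[2], -row[1]))
```

Feed it the lines of an access log (for example `suspicious_ips(open("access.log"), ["/some-scraped-page"])`, where both names are placeholders) and review the top of the list by hand before blocking anything, since legitimate crawlers will show up there too.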

WebOpz

7:27 pm on May 25, 2021 (gmt 0)

5+ Year Member Top Contributors Of The Month



@westcoast I feel your pain and have experienced much the same. I don't think people who have blamed you or your SEM efforts are being particularly helpful. You are the victim here; we have all been in your shoes and are here to help. Instead of looking at the problem purely as an SEM issue, have you considered its potential security roots as well?

If you have a DevOps/SecOps team have you reviewed the situation with them?
Do you have a WAF (Web Application Firewall) that might aid you with features to better filter out this garbage traffic?
Have you considered a cloud SaaS CDN/WAF (such as Cloudfront or Cloudflare) to help block these malicious bots and incoming links for the time being?
Is there a common hosting provider (or providers) behind all the traffic? Have you tried contacting abuse@isp to notify them that they are hosting malicious bots? I know this can be very difficult, as there are many 'providers' who are actual cybercriminals. You could at least strike at the biggest visible roots of this network.

These things should be done in addition to any of your SEM efforts/investments, and I'm sure you are already doing them, but I figured I'd bring them to the fore just the same. Once you have further locked down your site you can focus on your content again. It is a whack-a-mole game, which is why I suggest going with a SaaS company that does most of it (or has most of the needed features).

Most search engines (Google included) are so bad at detecting these botnets and malicious actors. No doubt, they need to get better. They should have some method of support for paying customers for this.

westcoast

8:15 pm on May 25, 2021 (gmt 0)

5+ Year Member Top Contributors Of The Month



WebOpz: Thanks for your thoughts. There's not much that can be done here, since the scraping activity isn't really intense at all and is pretty much undetectable. All they need to do is spoof a Chrome user agent and then scrape a few pages every hour. It's so, so easy to blend in with normal traffic while scraping content that there's really no way to stop it.

On a positive note: over the past 24 hours it appears that Google has taken aim at the specific hacked-domain issue that led to this thread. I have watched the number of indexed hacked pages pointing to a number of domains, including ours, decrease by the hour, and hopefully this is the beginning of the end for this specific issue. I am curious to see what happens to my traffic as this junk is yanked out and our site's external-link profile starts normalizing. Thank you, Google, for finally noticing this issue.

WebOpz

8:25 pm on May 25, 2021 (gmt 0)

5+ Year Member Top Contributors Of The Month



@westcoast Great news, I'm glad to hear that it is currently subsiding.

JorgeV

9:58 pm on May 25, 2021 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



Hello,

Block requests from IP ranges that belong to data centers, and you'll get rid of most* of the scrapers.

* Some might use domestic connections or hacked computers, but these will account for a very low number.
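
A minimal sketch of the range check, using Python's standard `ipaddress` module. The two networks below are placeholder documentation ranges (RFC 5737), not real data-center ranges; you would substitute the CIDR lists that the hosting providers publish. The function and variable names are only for illustration.

```python
import ipaddress

# Placeholder ranges from the RFC 5737 documentation blocks, NOT real
# data-center ranges. Replace with the lists your target providers publish.
DATA_CENTER_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

def is_data_center_ip(address, ranges=DATA_CENTER_RANGES):
    """Return True if the address falls inside any of the listed networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in ranges)
```

In practice the same check is usually done in the firewall, WAF or server config rather than in application code, and you will want to whitelist the legitimate crawlers you care about before blocking whole ranges.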

martinibuster

1:43 am on May 26, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



>>>I don't think people who have blamed you or your SEM efforts

Nobody is "blaming" the OP. You are mistaken.

I am offering useful advice to help the OP focus on what will help rather than waste time on activities that won't make a difference.

If you're using WordPress then certainly a plugin like Wordfence will be useful for making it easy to see the user agents, IP addresses and hosts where those scraper attacks are coming from.

You might be surprised how many rogue hackers and scrapers attack from the IPs that trace back to Bluehost, AWS, and OVH. Bots on Bluehost were hitting me thousands of times per day on one of my sites. Bots originating from OVH servers were the major attackers on another server.

Heavy scraping can lead to 500 errors and crawl anomalies when Google tries to crawl a site that's under a massive attack, even on a dedicated server.

Cloudflare is not something I've used, but many people are happy with it for combating bot attacks.

If you're on a dedicated server there are also server level solutions for blocking bots that can be subscribed to as well.

My main concern about bots is for stopping the scraping that could slow down the user experience for site visitors and keep out legitimate bots from Google and Bing.

archiweb

7:59 am on May 26, 2021 (gmt 0)

5+ Year Member Top Contributors Of The Month



Like already suggested – block these #*$!s via WAF or your server configs.
Put a rate-limit cap of 30 requests per second per IP (globally or per fingerprint), and go lower if necessary.
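
To make the per-IP cap concrete, here is a small sliding-window limiter sketch. It is only an illustration of the idea; a WAF or server config would normally do this for you. The class name and the `clock` parameter are assumptions added here so the logic can be tested without waiting on a real clock.

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Allow at most `limit` requests per `window` seconds for each key."""

    def __init__(self, limit=30, window=1.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.hits = defaultdict(deque)  # key -> timestamps of recent requests

    def allow(self, key):
        now = self.clock()
        recent = self.hits[key]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True
```

The key can be the client IP or any fingerprint you compute. Note that this sketch never evicts idle keys, so a long-running process would need some cleanup on top of it.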

Even blindfolded, I would start by blocking Amazon AWS, OVH, Huawei clouds, Wowrack, Semrush, Kyivstar, Hetzner, DigitalOcean, and even GoogleUserContent because Google themselves host most of these #*$! on their Google Cloud.

Good luck!

westcoast

3:59 pm on Jun 6, 2021 (gmt 0)

5+ Year Member Top Contributors Of The Month



The massive Google cloaked-hacked-site-spam campaign that this thread has been tracking continues, and has increased significantly. We're now going on 3 months.

Our site: Had 15,000 spam backlinks a month ago
Today... 24,000 spam backlinks

Other sites we have been tracking:
First line is the # of spam pages indexed in Google a month ago. Second line is what there is in there today.

Searching for "See full list on <sitename>" is the best way to identify the spam.

Sooo.....

"See full list on blog" - 548,000 indexed doorway pages
Today... 599,000 indexed cloaked doorway pages

"See full list on msn" - 159,000 indexed doorway pages
Today... 210,000

"See full list on bbc" - 32,000 indexed doorway pages
Today.... 242,000

"See full list on adobe" - 15,000 indexed doorway pages
Today... 189,000

"See full list on pcworld" - 27,600 indexed doorway pages
Today... 38,000

"See full list on buzzfeed" - 73,000 indexed doorway pages
Today... 96,000

"See full list on cnn" - 17,000 indexed doorway pages
Today... 138,000

Our "top 1000 domains" external backlink profile on Google Search Console remains completely filled with hacked domains.

Our traffic and rankings continue to drop. Daily. Weekly. Monthly.

2z7B

2:58 pm on Jun 9, 2021 (gmt 0)



Hi,

Thank you all for taking the time to write this analysis.
The same problem is afflicting one of my sites.

That is not even the first time. It has happened before - last year - and the site recovered. It even doubled in traffic.

But now again. What is going on?
Since April the number of indexed spam in the search console links report has been growing weekly. Now reaching new peaks.

What is the effect?
Some sort of penalty triggers as soon as your site passes a certain amount of spam "signals". Each site has its own threshold.

Then the site gets throttled: fewer bot crawls and reduced traffic.

You can see it from the ratio between impressions and clicks in SC, slowly changing week by week. Impressions up, clicks slowly bleeding.

Now with the core update the site lost Discover visibility, almost overnight.

Not enough trust signals. Can’t blame them, when over 80% of my link profile is spam.

I submit a weekly disavow - 8k domains so far - to try to mitigate the problem. Have been for years.
Right now the links are impossible to remove by yourself.

Reporting the domains or URLs with their spam form is useless.

Last year forcing bots to recrawl the spam was kinda helping. Now the success rate makes the process useless.
Spam piles up in your link profile and there is nothing you can do.

I mean, there is something you could do, buy some links to balance the spam effect. To me it feels like paying for ransomware.

Considering the economic health of the company, can’t G hire 50 people to manually check every site with more than x,xxx indexed pages to see if it’s spam or not?

Or even better, couldn’t the disavow JUST WORK?


What to hope for? I try to stay positive, but it gets harder with time.

It would be nice to have some kind of exposure for this thing.
Unfortunately we will only get canned responses and links to webmaster or core updates guidelines.


Best of luck to everyone going through this.

westcoast

5:51 pm on Jun 9, 2021 (gmt 0)

5+ Year Member Top Contributors Of The Month



Someone had written a post in this thread with information on how to report this massive Google spam issue to McAfee (I think it was?), with the thought that perhaps reporting it to them would help get it removed from Google.

They seem to have deleted their post. Does anyone have any experience in reporting hacked-domain spam campaigns to McAfee, and if so, how does one go about it?

Clearly Google's not doing anything about the spam in their own index.

Thanks

westcoast

2:02 am on Jun 10, 2021 (gmt 0)

5+ Year Member Top Contributors Of The Month



"Since April the number of indexed spam in the search console links report has been growing weekly. Now reaching new peaks."

I've been watching the spam attack closely... a couple of weeks ago Google started removing pages from the index, almost as if they had detected the issue and were finally doing something about it.

At that point, the website starting "possessedcr..." (no domains are allowed to be posted here, so that's just the first bit of the long name) was doing all the forwarding and cloaking. Now the spammers have inserted a second intermediary 301 in there, executed by another domain (inspiration...). It then forwards to "possessedcr...", which then forwards to the cloaked ads.

Some of those cloaked domains have over a million pages. I would guess this operation alone is probably well over a billion cloaked pages and counting.

Like you, the # of hacked site links into our site is at an all time high, and climbing with each passing day.

westcoast

4:02 am on Jun 12, 2021 (gmt 0)

5+ Year Member Top Contributors Of The Month



Another huge round of additional hacked domains appeared today from this spam-attack.

Come on, Google. This is getting stupid.

This is the code that these tens of thousands (HUNDREDS of thousands?) of hacked domain sites are using. I created a script in 3 minutes to scan all sites in my "external links" GSC list to detect these sites. Google, are you telling me your team of programmers can't work out how to detect this spam?

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> <title>Loading...</title> <script>window.location.href = "/click.php";</script> </head> <body> <h1>Loading...</h1> </body></html>
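
The script itself isn't posted, so purely as an illustration, a check for the signature in the page source quoted above might look something like this. It only tests HTML you have already fetched; how you fetch it (and which user agent you present, since these pages are cloaked) is left out. The function name is made up for the example.

```python
import re

# Both markers are taken from the page source quoted above.
TITLE_MARKER = re.compile(r"<title>\s*Loading\.\.\.\s*</title>", re.IGNORECASE)
REDIRECT_MARKER = re.compile(
    r"window\.location\.href\s*=\s*[\"']/click\.php[\"']", re.IGNORECASE
)

def looks_like_doorway(html):
    """Return True if the HTML has both the 'Loading...' title and the /click.php redirect."""
    return bool(TITLE_MARKER.search(html) and REDIRECT_MARKER.search(html))
```

Requiring both markers keeps ordinary pages that merely say "Loading..." from being flagged, at the cost of missing any variant of the spam that changes the redirect target.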

FranticFish

7:51 am on Jun 12, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If Google would just decide to IGNORE links they don't like, false positives wouldn't be an issue.

I'm yet to read a satisfactory explanation as to why punishment is deemed necessary within an algorithm designed to reward, especially given that it has been many years since Google were made aware of the real world consequences of that mindset. The only notable thing they did initially was to change a wording in their support section from something like 'There is nothing a competitor can do to damage your Google rankings' to 'There is almost nothing a competitor can do to damage your Google rankings'.

nomis5

4:11 pm on Jun 12, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The only notable thing they did initially was to change a wording in their support section from something like 'There is nothing a competitor can do to damage your Google rankings' to 'There is almost nothing a competitor can do to damage your Google rankings'.


A bit like the "rules" in George Orwell's book Animal Farm!

In the end the changes became so numerous that pigs running the farm did away with the rules altogether. Maybe the same will happen to Google's support section in the end.

westcoast

4:03 pm on Jun 20, 2021 (gmt 0)

5+ Year Member Top Contributors Of The Month



Another week, another fail for Google as the massive, obvious cloaked-hacked-domain-spam campaign I have been following continues to accelerate in size and speed.

I added another 1000 hacked domains to our disavow list this week alone.
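
For anyone keeping a list like this by hand, a rough sketch of turning a pile of spam URLs or hostnames into disavow-file text. It relies on the `domain:` line prefix and `#` comment lines that Google's disavow file format accepts; the function name and the `note` argument are made up for the example.

```python
from urllib.parse import urlsplit

def build_disavow(entries, note="hacked-domain spam"):
    """Return disavow-file text with one `domain:` line per unique host."""
    hosts = set()
    for entry in entries:
        entry = entry.strip().lower()
        if not entry:
            continue
        # Accept either bare hostnames or full URLs.
        host = urlsplit(entry if "//" in entry else "//" + entry).hostname
        if host:
            hosts.add(host)
    lines = ["# " + note] + ["domain:" + h for h in sorted(hosts)]
    return "\n".join(lines) + "\n"
```

Write the returned text to a plain `.txt` file and upload it through the disavow tool. Deduplicating by host keeps the file small when the same hacked domain shows up with thousands of different URLs.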

Queries that illustrate the issue. First line is what it was a few weeks ago, second line as of this morning:

"See full list on blog" - 548,000 indexed doorway pages
Today.. 599,000 indexed cloaked doorway pages.

"See full list on msn" - 159,000 indexed doorway pages
Today... 249,000

"See full list on bbc" - 32,000 indexed doorway pages
Today.... 242,000

"See full list on adobe" - 15,000 indexed doorway pages
Today... 134,000

"See full list on pcworld" - 27,600 indexed doorway pages
Today... 38,400

"See full list on buzzfeed" - 73,000 indexed doorway pages
Today... 114,000

"See full list on cnn" - 17,000 indexed doorway pages
Today... 145,000

Our site: spam backlinks from this cloaked spam rose from 23,000 indexed backlinks last week to now 30,000 indexed backlinks today.

DAMAGE TO US: We have plummeted in SERPs since these spammy backlinks started appearing a few months ago. We have gone from #2 in our primary keyword that we held for 20 years down to #30 as of this morning. We have also plummeted across the board in most other queries. A lot of this damage has occurred BETWEEN core updates. Could it be a coincidence? Possible. But the timing of the appearance of this flood of spam backlinks and the feeling that we are in some sort of penalty box is suspicious...

Our Google Search Console lists hacked domains as 700 of the 1000 top backlinking domains to our site. In Google's eyes, our entire 20-year link profile has been trashed and filled with total spam.

Google has ignored all of the submissions we have made to their spam reporting. Their algorithms are ignoring the most basic of cloaking strategies. They are indexing massive -- MASSIVE -- numbers of spam pages from hacked domains every hour.

ESTIMATE OF SPAM SIZE CURRENTLY: At least 10,000 domains, and I'd say probably closer to 20,000 given the speed at which I'm discovering new hacked spamming domains. Each domain has, on average, somewhere around 50,000 pages indexed in Google, although I have found a few with well over a million.

That gives a current estimate of 500 million to 1 BILLION indexed pages (10,000 to 20,000 domains at about 50,000 pages each).

Most of them continue to use the exact same code. /click.php sends cloaked stuff to GoogleBot and ads / scams / malware to users.

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> <title>Loading...</title> <script>window.location.href = "/click.php";</script> </head> <body> <h1>Loading...</h1> </body></html>

delorean

7:53 am on Jun 29, 2021 (gmt 0)

5+ Year Member Top Contributors Of The Month



@westcoast
Hi sir! What is your technique in collecting the spam links?

OldFaces

6:09 am on Sep 26, 2021 (gmt 0)

10+ Year Member Top Contributors Of The Month



Hey WestCoast, whatever happened to you with this? We are having the exact same issue as you, and it began in July. The only difference is that we are seeing about 30 new links per day according to one backlink tool.