Forum Moderators: martinibuster


Scrapers - Where are the beefs?

Gone but not forgotten, or forgotten but not gone?

         

europeforvisitors

6:50 pm on Aug 29, 2006 (gmt 0)



It wasn't so long ago that angry threads about scrapers were as common as angry threads about MFAs are now.

What happened? Are scrapers doing less well than they did six months or a year ago? Or are forum members just tired of talking about them?

Brett_Tabke

7:15 pm on Aug 29, 2006 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



> Or are forum members just tired of talking about them?

Google changed their spider algo and the dupe content detection system started working mucho better. Dupe/ripped content is not as big an issue as it was.

europeforvisitors

8:36 pm on Aug 29, 2006 (gmt 0)



In other words, positive changes come to those who are patient. :-)

bumpski

8:47 pm on Aug 29, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm fortunate that with one search per site I can find virtually all the sites that have scraped titles from the pages of each of my sites. Across many hundreds of results reviewed, 98% or more of these scraper (mostly MFA) sites are marked as supplemental.

There are definitely still some succeeding at this game. When the Adsense MFA game started many of these sites ranked higher in the SERPs than the originator of the content. Seeing this keyword rich nonsense content at the top of the SERPs did disturb many content originators.

Google has finally figured out who the original owner of the content is!

Now if they could just keep all the legitimate content pages indexed, a goal MSN and Yahoo seem to easily achieve.

EFV you must be bored, you usually seem to dislike the type of posts you're now soliciting!

Remember the counter posts: stop whining about other webmasters' sites and pay attention to your own. Has the word "whining" been banned? I haven't seen it in a post in a while. Or maybe site owners with enormous supplemental content have migrated away from WW (too busy trying to trick Google). (Note: I'm sure there are a few legitimate sites that are in the supplemental quagmire, sorry!)

So where's the beef?

There does seem to be yet another wave of successful MFA scraper sites.

Now if only the Adsense earnings per page, and per click, migrated back to the pre-scraping mania days (2003). Google, in trying to monopolize the market, has cut prices too far, facilitating scraping again. Now they're sort of trying to fix it with the Oct. 1 changes, and others. Let's see, Quality versus instantaneous Income, Hmmm, tough decision. (I'm an Adwords advertiser.)

So was it tenderloin or hamburger?

incrediBILL

9:00 pm on Aug 29, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The scrapers are still scraping, but it's much harder to locate your scraped content these days unless you use MSN or Yahoo ;)

> Dupe/ripped content is not as big an issue as it was.

That would explain the emergence of all those sites I've been showing, where the content is "blenderized": partial sentences from multiple pages are mixed into one big page, making the content "unique" once more.

I hate to disagree with Brett [oh who am I kidding] but his own articles [google.com] are scraped verbatim all over the place, maybe with or without permission. Most at least give attribution, and some don't even seem to have duplicate content penalties, but clicking "the omitted results" really shows the depth of the duplicate content being filtered.

However, scroll to the very bottom of the Google page for a true scraping in the Hotel ads.

Maybe it's the newer scraper sites being scrutinized harder as some of the old die-hard sites continue to slip thru the cracks.

[edited by: incrediBILL at 9:06 pm (utc) on Aug. 29, 2006]

Lorel

12:20 am on Sep 1, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month




> The scrapers are still scraping, but it's much harder to locate your scraped content these days unless you use MSN or Yahoo ;)

I found an easy way to catch them. Just search for inurl:domain (without the .com on the end). Often they put your title, description, and keywords on the page, in their page title, and sometimes in comment or alt tags, or else they just copy the whole page.
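For anyone checking several sites, the query above can be built with a few lines of Python. This is only a sketch: the function name `inurl_query` is made up for illustration, and it naively takes the first label of the hostname after dropping "www.", so a subdomain such as blog.example.com would need handling by hand.

```python
from urllib.parse import urlsplit


def inurl_query(site: str) -> str:
    """Return an inurl: query for the site's name without 'www.' or the TLD."""
    # Accept either a bare hostname or a full URL.
    host = urlsplit(site if "//" in site else "//" + site).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    # Assumption: the first remaining label is the name scrapers echo in URLs.
    name = host.split(".")[0]
    return f"inurl:{name}"
```

Paste the resulting string into the search box; `inurl_query("www.example.com")` gives the query for the placeholder domain example.com.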

I just found one and reported it to G.AdSense. They had 1 adsense ad showing and 2 adsense Ads that were hidden from view. I'm not sure how they would benefit from that but reported them anyway.

Jalinder

6:08 am on Sep 1, 2006 (gmt 0)

10+ Year Member



Scrapers are still very much active. One such scraper took up most of our time in the first 3 weeks of August. The site had copied all our articles. It was just 3 months old but had a PR of 5.

We gathered proof, contacted the search engines, and filed a DMCA complaint.

Some of our proof:
1. After noticing the scraper was copying our articles, we introduced HTML comments within the article content. The HTML comments had crawling data within them. The scraper continued copying our new articles, but now we had proof ... just view source on the scraper site and see the info within the HTML comments.
2. Many of the articles mention our site in the content.
3. A lot of the articles are about events that took place between 2000 and 2005, when this scraper site did not even exist.
4. Authors could be contacted to verify who paid them.
5. etc. etc.
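Point 1 could be sketched roughly like this in Python. The post does not say what the "crawling data" was, so the fields used here (request time, client IP, user agent) and both function names are assumptions for illustration, not the poster's actual setup.

```python
import html
from datetime import datetime, timezone


def tracking_comment(client_ip: str, user_agent: str, when: datetime) -> str:
    """Build an HTML comment recording who requested the page and when."""
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Escape <, > and & so a crafted user agent cannot close the comment early.
    ip = html.escape(client_ip)
    ua = html.escape(user_agent)
    return f"<!-- served {stamp} to {ip} ua={ua} -->"


def embed_comment(article_html: str, comment: str) -> str:
    """Insert the comment after the first closing </p>, or append it if none."""
    marker = "</p>"
    index = article_html.find(marker)
    if index == -1:
        return article_html + comment
    cut = index + len(marker)
    return article_html[:cut] + comment + article_html[cut:]
```

A scraper that copies the raw HTML carries the comment along, so viewing source on the copy shows the time and address of its own fetch, which can then be matched against your server logs.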

MSN was quickest in banning the scraper from their search index.

Google took longer but was very supportive. Three days ago Google removed from its index all of the scraper's pages that were copied from our site. But the scraper site's homepage is still in Google's index and has PR 5. Even the scraper's category and other pages are still in Google's index.

The best part is that a Google search for the scraper (site:) shows this message:
"In response to a complaint we received under the US Digital Millennium Copyright Act, we have removed x result(s) from this page. If you wish, you may read the DMCA complaint that caused the removal(s) at ChillingEffects.org."

Yahoo and Ask have still not responded to our request (sent on 10th August by FedEx as well as email). The scraper continues to get traffic from Yahoo.

Our team appreciates MSN and Google for handling complaints against scrapers properly.