I understand that duplicate content can mean using "black hat" techniques to stuff a site with huge numbers of pages without making sure that each page has unique content on it. So if you printed the same article on dozens of your pages, Google would penalise you for having duplicate content on your site.
But I'm getting confused about duplicate content in the sense of article directories or having your articles appear on multiple sites (for example, if scraper sites have harvested your article or parts of it and put it on their sites).
Let's say I have Article A on my website. I also submit this article to article directories. After it is submitted to directories it is likely to be harvested and used in whole or in snippets on scraper sites. So now Article A is on my website, it appears on the article directory site, and the article or bits of it are now starting to appear on scraper sites.
Can my main site be punished for duplicate content because of this?
If I only submit the article to directories, and I don't post it on my website, and the article is then picked up and harvested by scraper sites, and if I include a URL in the article pointing back to my website, will my main site (the one referenced in the URL) be punished for duplicate content?
And what if I write Article A and find that the content would be suitable for several of my own websites - let's say they are websites covering different industries and keywords, but this article is still something I want to publish on several of my sites. I do this, posting it on several of my OWN sites. Do any of my sites get punished for duplicate content?
And lastly, if I write several blogs on related subjects and for the most part post entirely new, original content on each blog, and each blog has its own RSS feed and is spidered pretty regularly, and if I choose to post Article A in its entirety on several of my own blogs, does the duplicate content issue arise? Who gets penalized? One or all of my blogs?
I guess I'm just confused about what duplicate content really is and the repercussions or fallout that can happen once you have an essay, blog entry, or article and it starts appearing in more than one place.
Thanks in advance for your help and guidance!
Cosmokid
If the same content is on 1,000 different web sites, why should Google (or any search engine) index each copy of it? It's not efficient and it doesn't help the user. (If the user is looking for that, then he only needs 1 link to it. If he isn't looking for it, then why make him wade through 1,000 listings to something he doesn't want?) The assumption is that the first copy found is the original and all others are copies.
Similarly, let's say you can get to the exact same content on your site by going to:
mysite.com/article1
mysite.com/articles/1
mysite.com/articles.php?aid=1
Why should Google index all 3 pages? One is the "original" and all the others are copies.
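To make the idea concrete, here is a rough sketch in Python of how a crawler could group pages with the same text and treat the first copy it saw as the "original". This is only an illustration of the reasoning above, not how Google actually does it. The function names, the `first_seen` numbers, and the sample text are all made up for the example.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing and collapsing whitespace."""
    normalized = re.sub(r"\s+", " ", text.strip().lower())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def group_duplicates(pages):
    """pages: iterable of (url, first_seen, text) tuples.

    Returns a dict mapping the URL treated as the "original"
    to the list of URLs treated as copies of it.
    """
    groups = defaultdict(list)
    for url, first_seen, text in pages:
        groups[fingerprint(text)].append((first_seen, url))

    result = {}
    for entries in groups.values():
        entries.sort()  # earliest first_seen wins; ties fall back to URL order
        original = entries[0][1]
        result[original] = [url for _, url in entries[1:]]
    return result


# Hypothetical crawl data: first_seen is just an ordering number.
pages = [
    ("mysite.com/articles/1", 2, "Article A  body text."),
    ("mysite.com/article1", 1, "Article A body text."),
    ("mysite.com/articles.php?aid=1", 3, "article a body text."),
    ("mysite.com/about", 1, "About this site."),
]
duplicates = group_duplicates(pages)
```

Here the three article URLs end up in one group with mysite.com/article1 as the "original" because it was seen first, and the other two are listed as copies. A real search engine would use fuzzier matching than an exact hash, so treat this as a toy model only.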
If you have too much duplicate content, it looks too much like you're trying to game the system. Then you may be penalized.
And I agree that duplicate content is going to happen.
Again, I'm just asking from those more experienced folks here what actions they would avoid when developing a website to minimize the possibilities of their site ever being penalized for duplicate content.
Would you submit an article to a directory while also posting it on your own site? Probably not, right? Because Google will try to determine which version is the "original" by looking at the age of the sites in question - and if your site is newer than the directory site, you're screwed. Right? Am I understanding this correctly?
By the same token, I guess it's not a good idea to post an ongoing column that you write on multiple sites that you own, even though it expands your reach as a writer, because your sites might get penalized for hosting duplicate content? Do you agree?
I'm a syndicated columnist and extremely prolific writer with multiple blogs and sites and I'm just trying to put together an efficient strategy for maximizing the use of my own, original content without incurring duplicate content problems. I have only recently realized that one of my columns which appears on dozens of websites (as well as newspapers) is probably really goofing me up. So I'm trying to wrap my brains around the duplicate content issue ASAP.
Cosmokid
Because Google will try to determine which version is the "original" by looking at the age of the sites in question