We use Joomla for our CMS, and it has a module that allows the creation of PDFs on most pages. Google has started indexing the PDF version of each page, and these PDFs are now showing up as supplemental pages in its index.
Should I be concerned about duplicate content?
jonrichd
12:14 pm on Dec 12, 2006 (gmt 0)
It sounds to me like Google is doing pretty much what you would want it to do: keeping the HTML versions in the main index and the PDFs as supplemental. In this case, I don't think that the supplemental classification is bad, since the pages are actually different URLs.
I also don't think that this is going to cause any sort of penalty with respect to duplicate content. However, if you are really concerned, you could block crawling of the PDFs with robots.txt.
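A rough sketch of what that robots.txt could look like follows. Both URL patterns are assumptions, not something confirmed for your site: the first guesses that the PDF links carry a do_pdf=1 query parameter, the second that they end in .pdf. Check the actual PDF URLs your Joomla install generates and adjust the patterns to match. The * and $ wildcards are honored by Googlebot but not necessarily by every crawler.

```
# Hypothetical patterns; replace with the real PDF URL format of your site
User-agent: Googlebot
# Assumed case: PDF view is selected with a query parameter
Disallow: /*do_pdf=1
# Assumed case: PDF URLs end in .pdf
Disallow: /*.pdf$
```

The file goes in the site root (so it is served at /robots.txt), and only the rule that matches your URLs is needed.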
travelin cat
5:17 pm on Dec 12, 2006 (gmt 0)
Thanks for the reply. That was my feeling as well.