Forum Moderators: Robert Charlton & goodroi
Today it was announced that the three big search engines have come up with a new tag to help with canonical issues.
Announcements:
[googlewebmastercentral.blogspot.com ]
[ysearchblog.com ]
[blogs.msdn.com ]
Using the new canonical tag
Specify the canonical version using a tag in the head section of the page as follows:
<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish"/>
That’s it!
You can only use the tag on pages within a single site (subdomains and subfolders are fine).
You can use relative or absolute links, but the search engines recommend absolute links.
This tag will operate in a similar way to a 301 redirect for all URLs that display the page with this tag. Links to all URLs will be consolidated to the one specified as canonical.
Search engines will consider this URL a “strong hint” as to the one to crawl and index.
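The rules above (tag in the head, relative or absolute href, same site only) can be checked against a page with a short script. This is only a sketch using Python's standard library: the names `CanonicalFinder`, `canonical_url` and the `site_domain` parameter are made up for illustration, and the same-site test is a naive hostname comparison, not whatever the search engines actually do.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> seen.

    Simplification: it does not verify that the tag sits inside <head>.
    """

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.href = attr["href"]


def canonical_url(page_url, html, site_domain):
    """Return the absolute canonical URL, or None if absent or off-site."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.href is None:
        return None
    # A relative href is resolved against the URL the page was fetched from.
    absolute = urljoin(page_url, finder.href)
    host = (urlsplit(absolute).hostname or "").lower()
    # Naive same-site test: the domain itself or any subdomain of it.
    same_site = host == site_domain or host.endswith("." + site_domain)
    return absolute if same_site else None
```

Called with the swedish-fish page and `site_domain="example.com"`, it returns the absolute canonical URL whether the href was written in relative or absolute form, and `None` if the tag points at another site.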
< See also Canonical Tag Results: Share the stories - Positive / Negative / No Impact [webmasterworld.com] >
[edited by: tedster at 6:08 pm (utc) on April 2, 2009]
A guide to fixing duplicate content & URL issues on Apache
How to canonicalize all of your URLs with a single redirect
[webmasterworld.com...]
heavy meal for a saturday; i'll dig in on tuesday ;)
I refuse to do anything specifically for the SE's benefit.
How is it to the search engines' benefit? Do the search engines care if John Doe's site is screwed in the rankings because of a canonical issue?
It seems to me that the beneficiaries of the optional "canonical tag" are people like John Doe. For the search engines, implementing that option for John and his peers is just another chore and expense.
How is it to the search engines' benefit? Do the search engines care if John Doe's site is screwed in the rankings because of a canonical issue?
The nofollow tag also started out being promoted as being for 'our own good'. A few years later, it's used very much, perhaps primarily, for the SE's benefit, to the detriment of end users.
In fact, based on nothing other than the SEs' treatment of nofollow, webmasters and SEOs would be well advised to be very suspicious of another big wooden horse parked at the front door.
[edited by: tedster at 7:06 pm (utc) on Feb. 15, 2009]
In other words, the sentence says "Search engines will consider this URL a “strong hint” as to the one to crawl and index." and I'm wondering how strong that means. This might be helpful to understand for some very large sites where policing all the inline anchors would be unwieldy.
...very suspicious of another big wooden horse parked at the front door
Just like nofollow, this is another ill-thought-out patch where once again webmasters have the responsibility for the search engines' own algo failings foisted upon them.
Just like nofollow, this is another ill-thought-out patch where once again webmasters have the responsibility for the search engines' own algo failings foisted upon them.
Nope, it's a rescue tool to help site owners clean up the messes they've made by having multiple URLs for each page.
What's more, nobody is required to use it.
I don't need it myself, but I can see why some people will be grateful for it, and I don't see why they should be deprived of it because a few other people have nightmares about Trojan horses.
Nope, it's a rescue tool to help site owners clean up the messes they've made by having multiple URLs for each page.
Rather like being handed a hatchet so you can repair the hole in the bottom of your boat...
In my opinion, the example of nofollow speaks for itself. Originally offered as a tool to help cut back on blog spam, it is now a blunt force instrument used for everything from "PR" flow to those dastardly paid links.
I'm all for industry standards. SE's do not (yet) own the net.
Let's take the third example, items.php?sortby=name&page=2&query=foo
That's a unique representation of the data. Never mind that the items in the list are repeated in other views with different sorting and filtering - it's still unique and in my books, it's canonical.
I agree with your statement, but question the wisdom of presenting the search engine with a potentially huge number of views on the same data set.
I would be tempted to misuse? the tag to reduce the number of combinations offered for indexing in the hope of all items being visible in search results at least once.
I'm all for industry standards.
I'd say this tag is needed mostly because some server software and some hosting environments have been ignoring standards for a long time. I just did a quick audit for a new client who is on shared Windows hosting. Unless they move to a new host, this tag is going to be their only real hope to control a combination of ten different canonical issues.
Yes this tag will also benefit the search engines. For one, they'll have a better shot at ranking some websites whose content deserves it but has been crippled through url problems.
I've often observed that websites+search engines creates a competitive/cooperative environment. If you only see one half of that, and I'm talking about either half, then you're missing the real picture and can make some poor decisions. This environment is a common kind of "game" in game theory or ecology. The warfare model misses the boat, and so does the sweetness-and-light model.
----
Anyone with technical questions may find some help from Matt Cutts' new blog post [mattcutts.com]. He also links to slides, an instructional video, and some new plug-ins for WordPress, Drupal, and Magento that Joost de Valk created for the canonical tag.
[edited by: tedster at 8:48 am (utc) on Feb. 16, 2009]
Or how about sites that allow some query strings but have trouble scripting a rule that wipes out the crazy variations that sometimes appear. If this new canonical tag gets used as advertised, then many webmasters will have an easier time of it. And "as advertised" includes combining link juice. That's a promise that remains to be seen in practice.
If this new canonical tag gets used as advertised, then many webmasters will have an easier time of it.
Most webmasters don't even know what canonical is or even how to spell it ;).
This is going to go right over the head of 99% of webmasters. Of the remaining 1%, 3/4's of them are going to screw up the implementation.
Many site owners didn't write the code of their web platform and are not aware they have multiple URLs.
A long-time competitor that does their website in-house finally implemented database/asp on their product line. I know through spying on their forum posts that they hired the work out-of-country, that it's a mess, and they have no idea where to begin. But they know enough to properly utilize this tool.
Perhaps that's one of the reasons for some of the (mostly old-school it seems) backlash: it empowers the semi-clueless competitor instantly, with what took some of us many hours ($$) to properly implement.
The one thing I can think of that no one else here has mentioned at all -- this might slow down some scrapers, who scrape your content verbatim. This tag being so new, the scraper will not know to remove it. Thus the search engines will not give them the credit for the page when they get crawled.
Or, just to play it safe, should it be used on every page, for the same reason?
..........................
IF the only 'clean-up' of potential duplicate content is to only have www url...
That would be a rare bird indeed. In a recent thread, we identified around 30 canonical problems [webmasterworld.com] that can occur in combination. Just because you don't see them in your WMT account or site: operator results doesn't mean they aren't affecting you. Google has been working to combine the various forms of urls for a while, and the effects can be seen in site: results especially. But it is a huge challenge, given the variety that is represented across the entire web.
I'm not about to drop the no-www with-www redirects, and yes, if that truly is the only problem then this tag will not offer your site anything much.
'Copy this link into the <head> section of all non-canonical versions of the page, such as http://www.example.com/product.php?item=swedish-fish&sort=price.'
If you rely on dynamically drawing URLs from a DB, which can produce a range of URLs for the same page and create duplicate content issues, this is a nice solution without the tech fix.
However, as these pages all use a single header, will it affect the canonical page to have this in its header too? Will this create a spidering loop that will have negative implications?
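For the shared-header case, one way to picture it is a template helper that works out the canonical URL from whatever URL was requested, so every variant (and the canonical page itself) prints the identical tag. A rough sketch under stated assumptions: `CONTENT_PARAMS` and `canonical_link_tag` are hypothetical names, and the idea that only `item` selects the content of product.php is taken from the swedish-fish example, not from any real site.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical: the query parameters that select the content, per script.
# Paths not listed here lose every parameter, which is an assumption too.
CONTENT_PARAMS = {"/product.php": ("item",)}


def canonical_link_tag(requested_url):
    """Build the canonical <link> tag for any variant of a page."""
    parts = urlsplit(requested_url)
    keep = CONTENT_PARAMS.get(parts.path, ())
    # Keep only whitelisted parameters; sort, session ids etc. are dropped.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k in keep]
    canonical = urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), "")
    )
    # Escape the URL so an "&" in it stays valid inside the attribute.
    return '<link rel="canonical" href="%s"/>' % escape(canonical, quote=True)
```

With this, product.php?item=swedish-fish&sort=price and product.php?item=swedish-fish both emit a tag pointing at the second URL. The sketch only shows that one header can serve all variants; it does not settle how the engines treat a page that names itself as canonical.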
1. If a website has Google Analytics and Webmaster Tools and there is NO Duplicate Content showing in the Content Analysis, does that mean that Google doesn't view that there are duplicate content problems? If duplicate content problems do arise, is it fair to say that they would first appear in the Webmaster Tools? Or is that too simplistic? If they aren't appearing there, as per Tedster's post, then how can we find out if they are affecting a site?
2. Following on from the post above by jonny0000, Can the Canonical tag be added to the head section of a Dynamic Web Template used in Microsoft Expression? Would that work? (although all pages are .html and therefore I cannot see the value of the tag apart from making sure the spiders include or not the www - all pages are viewed fine with their full extension and can only be viewed that way.)
[edited by: Gemini23 at 11:59 am (utc) on Feb. 17, 2009]
I have set up a basic test which I hope will confirm what other factors Google alludes to (if any) influence the implementation of this type of redirect. Foremost in my mind are:
Does the referred page get passed value if it is not linked to on the site at all but the referrer is?
Does the referred page get passed value if both pages are linked to on the site?
If any savvy expert plz help me
Let say my site urls are like this
www.example.com/category/post.php?post_id=1&cat=1
www.example.com/category/post.php?post_id=2&cat=1
www.example.com/category/post.php?post_id=1&cat=2
www.example.com/category/post.php?post_id=2&cat=2
www.example.com/category/post.php?post_id=1&cat=100
and so on
www.example.com/category/post.php?category.php?cat_id=1
www.example.com/category/post.php?category.php?cat_id=1
and so on
well my site generate dynamic links like for post it generates links like this
www.example.com/category/post.php?post_id=2&cat=1&mostviews
www.example.com/category/post.php?post_id=2&cat=1&random
www.example.com/category/post.php?post_id=2&cat=1&mostemailed
and same for categories
so will u plz help me adding this code in header
And my 2nd question: a few days back I blocked these dynamic URLs with the extra strings using the robots file, and I hope it will remove the duplicate URLs from Google soon.
So should I remove the robots rule and add this new tag
Or
Leave the robots rule as it is and go for the new tag, or what?
My site has been losing ranking for a few months and I suspect this could be the reason, so how should I go about it?
Thanks for ur time
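For URLs like the ones in the post above, the href for the tag could be produced by dropping the view flags and keeping post_id and cat. A minimal sketch, assuming that `mostviews`, `random` and `mostemailed` never change which post is shown; `VIEW_FLAGS` and `strip_view_flags` are illustrative names, and an http:// scheme is added to the URLs for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these valueless flags only change ordering or display.
VIEW_FLAGS = {"mostviews", "random", "mostemailed"}


def strip_view_flags(url):
    """Return the URL without the view flags, for use as the canonical href."""
    parts = urlsplit(url)
    # keep_blank_values=True so flags written without "=value" are parsed
    # (and can be filtered) instead of being silently skipped.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(k, v) for k, v in pairs if k not in VIEW_FLAGS]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )
```

All three flagged variants of post_id=2&cat=1 reduce to the same URL. On the robots question, one thing is certain: a crawler that is barred from fetching a URL cannot read a tag placed on that URL, so the two approaches do not combine on the same set of pages.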
The introduction of this Canonical Tag has created more confusion than it was meant to solve. I could go back through this topic and probably come up with a good 50 questions so far. That would lead me to believe that there are going to be some challenges in the implementation of this tag for many. If you are not sure, leave it alone. The worst thing you can do at this point is add insult to injury.