Javascript to hide web analytics link tags from spiders?

         

monkeylytics

5:41 am on Jan 15, 2008 (gmt 0)

10+ Year Member



Does anybody here hide their web analytics link tags from search engine bots? We're currently using a web analytics package that tracks links on a page with a unique variable, e.g. http://www.example.com/page?analyticstag=foo

We'd rather not have these links indexed by a search engine as the tags sometimes change (for instance A/B testing), and we don't want to introduce all of these unique URLs for a particular page of content.

We were thinking about just creating a Javascript function that would allow us to do something like

<a href="http://www.widgets.com/page" onClick="addTag(this, 'foo');">

with the idea that the search engines would process the link properly for crawling and indexing purposes but wouldn't execute the Javascript. We would then have clean URLs for the search engines without ever worrying about analytics-laden URLs competing with the canonical ones for a page, while human visitors, whose browsers do run the Javascript, would cause the link tags to be added when they click.
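A minimal sketch of what such an addTag function might look like. The thread never shows an implementation, so the helper name withTag is hypothetical, and the parameter name analyticstag is taken from the example URL in the question; a real analytics package may expect something different.

```javascript
// Hypothetical helper (not from the thread): returns the URL with the
// analytics parameter set. "analyticstag" follows the example URL above.
function withTag(href, tag) {
  const url = new URL(href);
  // set() replaces an existing value, so a second click does not duplicate the tag
  url.searchParams.set('analyticstag', tag);
  return url.toString();
}

// onClick handler: rewrites the link's href just before the browser follows it.
// A crawler that does not execute JavaScript only ever sees the clean href.
function addTag(link, tag) {
  link.href = withTag(link.href, tag);
  return true; // allow the default navigation to proceed
}
```

With this sketch, clicking a link whose href is http://www.example.com/page with addTag(this, 'foo') would send the visitor to http://www.example.com/page?analyticstag=foo, while the markup itself keeps the untagged URL.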

This seems harmless enough, but is it considered kosher to search engines or does it look cloaky to have all of these links with JS additions to them?

Thanks,

Steve

[edited by: tedster at 7:38 am (utc) on Jan. 15, 2008]
[edit reason] switch to example.com - it can never be owned [/edit]

lammert

9:17 pm on Jan 15, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you don't want the analytics links to show up in the search engines, you might consider tagging them with rel="nofollow". That attribute is recognized by the major search engines.

I use it as a standard procedure on my affiliate links etc to prevent them showing up in the SERPs.

caveman

1:10 am on Jan 16, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You could also run the links through a robots.txt protected directory.

monkeylytics

5:57 am on Jan 16, 2008 (gmt 0)

10+ Year Member




If the link is to the widgets page, we want the search engines to follow the link and index the content of the widgets page and do so regularly. What we don't want are 5 different URL variations to the same widgets page just because we changed the web analytics tag on the link for whatever tracking reasons.

page=widgets
page=widgets&tag=1
page=widgets&tag=2
etc

We just want search engines to have one URL, page=widgets, associated with the widgets content, and not think there are 3+ different URLs pointing to the same content.
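To make the goal concrete, here is a rough sketch of the collapse we are after: every tagged variation should reduce to the same clean URL. The function name canonicalUrl and the full example.com URLs are assumptions for illustration only; the parameter name tag comes from the list above.

```javascript
// Hypothetical helper: strips the tracking parameter so that all
// variations of a link reduce to a single canonical URL.
function canonicalUrl(href) {
  const url = new URL(href);
  url.searchParams.delete('tag'); // "tag" is the tracking parameter from the list above
  return url.toString();
}

// The three variations from the list, written out as full (assumed) URLs.
const variants = [
  'http://www.example.com/?page=widgets',
  'http://www.example.com/?page=widgets&tag=1',
  'http://www.example.com/?page=widgets&tag=2',
];

// All three map to the same string, so the set holds exactly one entry.
const distinct = new Set(variants.map(canonicalUrl));
```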

Putting a nofollow on the link with the analytics tag means the link won't be followed or pass rank mojo which is a bad thing. That link might be part of our navigation for example. Same with the robots.txt. We do want the link crawled and the destination indexed. We just don't want URL permutations that all point to the same content.

If we could do it through the Javascript mechanism described in the first post, the search engine would follow the link but not execute the JS (and hence not tag the link with the analytics tag, whereas users would). But this is an option only if it doesn't raise any red flags with search engines.

I suppose we could test it out on some unimportant links and see what the effects are. But if others have been down this road before...

Thanks,

Steve

caveman

8:16 pm on Jan 16, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ah, got it.

Personally I would not use the JS solution for a few reasons, but if you want to go that way, I'd first test it to ensure I understood how all of the major search engines treated your specific implementation. Even then, use caution.

The reasons I would not go the JS route are:

1) I believe that there is some chance that using JS on all of your internal links would send a red flag to one or more of the engines, or that it might even trip filters or create issues with the engines' algos. This is because JS has been used in sneaky ways over the years ... ways the SEs don't like ... and I would not be surprised if the engines had their radar up on extensive use of it on internal links with respect to algos and rankings.

I also know that MC said at one point: "If you make lots of pages, don’t put JavaScript redirects on all of them." I know you're not exactly talking about JS redirects here, but again, even if you test your solution and it works, I'd worry about extensive use of JS in association with links. One can run afoul of algos despite the best of intentions, by finding new ways to do things that look like problems to automated systems.

2) The engines' ability to follow JS has changed over the years. I have no idea exactly how each of the three big SE's are handling and treating JS as we speak. What I can say is that Google does a pretty good job of following JS these days and I know MC acknowledged that recently.

How about IP delivery? Search engines don't like "cloaking" but it's my impression that they've come around to defining cloaking as serving different content to users than to bots. Using IP delivery to dish up clean URL's to the bots and tracking URL's to users seems a very, very legitimate way to operate. Many very large, very well known sites do exactly that. The engines don't want to kill sites; they just want to prevent sneaky and deceptive practices.
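A rough sketch of the IP delivery idea, purely for illustration. The crawler address list, the function name linkFor, and the parameter name tag are all assumptions; real crawler detection is considerably more involved than a fixed set of addresses.

```javascript
// Hypothetical crawler addresses. 192.0.2.0/24 is a reserved documentation
// range, used here only as a placeholder for a real, maintained list.
const CRAWLER_IPS = new Set(['192.0.2.10', '192.0.2.11']);

// Returns the URL to print in the page for this requester:
// the clean URL for known crawlers, the tracking URL for everyone else.
function linkFor(href, tag, clientIp) {
  if (CRAWLER_IPS.has(clientIp)) {
    return href; // bots see the clean URL
  }
  const url = new URL(href);
  url.searchParams.set('tag', tag); // users get the tracking parameter
  return url.toString();
}
```

Both audiences reach the same page content; only the tracking parameter on the link differs.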

Another option is to capture the info, and use 301 redirects to send users and bots on to the canonical pages. But you might end up with a hecka lot of 301's that way.
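A minimal sketch of the capture-then-301 approach, assuming the tracking parameter is named tag as in the earlier examples. The function name handleTaggedRequest and the in-memory log are hypothetical stand-ins for whatever the server and analytics package actually provide.

```javascript
// Hypothetical request handler: records the tag, then answers with a 301
// pointing at the same URL without the tracking parameter.
// Returns null when the URL carries no tag and can be served normally.
function handleTaggedRequest(requestUrl, log) {
  const url = new URL(requestUrl);
  const tag = url.searchParams.get('tag');
  if (tag === null) {
    return null; // already canonical, no redirect needed
  }
  // Capture the analytics info before the tag disappears from the URL.
  log.push({ page: url.searchParams.get('page'), tag: tag });
  url.searchParams.delete('tag');
  return { status: 301, location: url.toString() };
}
```

Every tagged click costs an extra round trip this way, which is where the large number of 301s comes from.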

monkeylytics

5:12 am on Jan 17, 2008 (gmt 0)

10+ Year Member



Thanks, caveman. We have similar concerns. If I were a search engine I would be curious as to why so many links have an onClick payload if it were used on everything.

We actually have been using an IP delivery mechanism for the pages that are primarily driven by application code, to avoid getting lost in URL variation loops for the same content.

However, for other content, we use a content management system. Putting that same application code in the content violates that framework enough that we started looking for a nicer compromise. A JS onClick mechanism is a bit less offensive for an occasional link within content that we would like "clean" and still trackable.

And then there are links that fall in between these two areas where we will need to make a call as to what mechanism to use.

Steve