

Will spiders follow & index encoded URLs?

         

spyder_tek

2:33 am on Jul 28, 2006 (gmt 0)

10+ Year Member



Hi,

Will search engine spiders follow and index a percent-encoded URL, such as this one?

<a href="http://%77%77%77%2e%79%61%68%6f%6f%2e%63%6f%6d">Link</a>

In other words, do spiders understand encoded URLs?

The above is just a simple link to Yahoo.

Thanks!

lammert

10:39 am on Jul 28, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have some sites linking to me with erroneous links, for example with a space after the domain name: www.example.com%20

I have seen in my log files that bots, especially Slurp, follow these links and try to load www.example.com%20. It seems that the %xx sequences are not translated, but passed literally to the crawler. I have seen other examples where %xx sequences weren't translated to their ASCII equivalents, even when they encoded ordinary alphanumeric characters. So as far as my experience goes, the %-version of the URL will be spidered, but the spider doesn't make the connection between the percent-encoded version and the normal version of the URL. This is from observation in my log files only; I didn't test it exhaustively.

Google after the BigDaddy update may behave differently, because one of the enhancements in BigDaddy was better recognition of duplicate content and of multiple URLs pointing to the same page.
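To illustrate what "making the connection" between the two versions could look like, here is a minimal Python sketch. It is not how Slurp or Googlebot actually work; that is not known from this thread. The function name `decode_unreserved` is made up for the example. It only decodes %xx sequences that stand for letters, digits, and `-._~`, which are the characters that are safe to decode without changing what the URL means, and leaves everything else (such as %20) encoded.

```python
import re
import string

# Characters that may be percent-decoded without changing a URL's meaning
# (letters, digits, hyphen, period, underscore, tilde).
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

def decode_unreserved(url: str) -> str:
    """Hypothetical helper: decode only the %xx escapes of unreserved characters."""
    def replace(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        # Keep other escapes as they are, but with uppercase hex digits.
        return char if char in UNRESERVED else match.group(0).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", replace, url)

# The encoded link from the first post maps to the plain form.
print(decode_unreserved("http://%77%77%77%2e%79%61%68%6f%6f%2e%63%6f%6d"))
# The trailing %20 is a space, so it stays encoded and the host stays different.
print(decode_unreserved("www.example.com%20"))
```

A crawler that compared URLs only after a step like this would treat the encoded Yahoo link and the plain one as the same address, while www.example.com%20 would still count as a separate (and broken) one.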

rj87uk

12:35 pm on Jul 28, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've been having a problem along the same lines, and I have no idea how to fix it. It looks like the SEs are indexing my site with a %20www. in front of the address, but clicking on that link in the SERPs gives a 404 error. The weird thing is that the SERP listing has all the page details, like the title and description. Weird?

Jim Catanich

2:36 pm on Aug 1, 2006 (gmt 0)

10+ Year Member



The %20 is an encoded "space" in the URL. Problems like this come from different character sets as they are applied to links.
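As a quick check of the %20 point, the sketch below uses Python's standard `urllib.parse` module to show the space turning into %20 and back. The `clean_href` helper is only an assumption about how a stray space could be removed before a link is written out; it is not taken from anyone's site in this thread.

```python
from urllib.parse import quote, unquote

def clean_href(raw: str) -> str:
    """Hypothetical helper: drop stray leading/trailing whitespace from a link."""
    return raw.strip()

# A space is percent-encoded as %20, and %20 decodes back to a space.
print(quote(" "))
print(repr(unquote("www.example.com%20")))
print(repr(unquote("%20www.example.com")))

# Stripping the raw value first avoids emitting the %20 at all.
print(quote(clean_href(" www.example.com ")))
```

If the space comes from the page markup or a template (for example a space inside the href quotes), trimming the value before output is usually enough to stop new %20 links from appearing.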

Jim Catanich