Forum Moderators: Robert Charlton & goodroi
Googlebot ignoring robots.txt and nofollow
I have also nofollowed them all, but I have noticed that Googlebot is now ignoring these directives.
I have not bothered to no-follow affiliate links and have no problems. It does not look like they have anything against affiliate links per se - just when there is no content or thin content to support those links. (Or maybe more links than the content supports?)
help me understand...
your affiliate links go through an external script and you are using robots.txt to exclude googlebot or all bots from crawling that external script?
[edited by: phranque at 11:06 am (utc) on Apr 27, 2017]
Googlebot has been crawling & indexing redirect links that have always been blocked in robots.txt.
A description for this result is not available because of this site's robots.txt
robots.txt doesn't block bots from accessing & following links... it asks that the disallowed files not be indexed.
The "Disallow: /" tells the robot that it should not visit any pages on the site.
Disallow
The value of this field specifies a partial URL that is not to be visited. This can be a full path, or a partial path; any URL that starts with this value will not be retrieved. For example, Disallow: /help disallows both /help.html and /help/index.html, whereas Disallow: /help/ would disallow /help/index.html but allow /help.html.
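To see the prefix matching described in that quote in action, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The helper name `allowed` and the agent name `ExampleBot` are made up for illustration. This shows how one compliant parser reads the rules, not how Googlebot is implemented.

```python
from urllib.robotparser import RobotFileParser


def allowed(disallow_value, path, agent="ExampleBot"):
    """Return True if a compliant bot may fetch `path` under a one-rule robots.txt.

    `ExampleBot` is a hypothetical user agent; the rule applies to all agents.
    """
    parser = RobotFileParser()
    # Feed the rules directly instead of downloading a real robots.txt.
    parser.parse(["User-agent: *", f"Disallow: {disallow_value}"])
    return parser.can_fetch(agent, path)


# "Disallow: /help" is a plain prefix, so it covers both URLs.
print(allowed("/help", "/help.html"))
print(allowed("/help", "/help/index.html"))

# "Disallow: /help/" only covers URLs under the directory.
print(allowed("/help/", "/help/index.html"))
print(allowed("/help/", "/help.html"))

# "Disallow: /" covers every path on the site.
print(allowed("/", "/anything.html"))
```

Note that this only answers "may I fetch this URL?". It says nothing about whether a search engine will list the URL based on links pointing to it.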
Now, wait. This message means that the search engine (G is not the only one) has indexed but not crawled the page.
You may choose to take the opposite approach: permit crawling but deny indexing by putting a robots meta on each page. (Some search engines also recognize the X-Robots-Tag header, which can be applied globally. I don't remember if Google does.)
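As a rough sketch of the "permit crawling but deny indexing" approach, the snippet below wraps a WSGI application so every response carries an `X-Robots-Tag: noindex` header, and the demo page also includes the equivalent robots meta tag. The names `add_noindex_header` and `demo_app` are hypothetical. In practice the header is more often set in the web server configuration, and either signal is only seen if the URL is not disallowed in robots.txt.

```python
def add_noindex_header(app):
    """Wrap a WSGI app so each response includes X-Robots-Tag: noindex."""

    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            # Drop any existing X-Robots-Tag so the header is not duplicated.
            headers = [h for h in headers if h[0].lower() != "x-robots-tag"]
            headers.append(("X-Robots-Tag", "noindex"))
            return start_response(status, headers, exc_info)

        return app(environ, patched_start)

    return wrapped


def demo_app(environ, start_response):
    """Hypothetical page that also carries the per-page robots meta tag."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    body = (
        "<html><head>"
        '<meta name="robots" content="noindex">'
        "</head><body>Redirect landing page</body></html>"
    )
    return [body.encode("utf-8")]


application = add_noindex_header(demo_app)
```

The header route is handy for non-HTML responses such as redirects or PDFs, where there is no place to put a meta tag.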
To my knowledge Google does need to crawl in order to index. You disagree?
I see hits on in within the blog
I can't believe people here with 10+ year experience says, robots.txt does not block from crawling
Believe it. Robots.txt doesn't *block* anything. Hundreds of bots ignore it altogether. Robots.txt only works on those bots that support it.
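A small sketch of why that is: the robots.txt check runs inside the client, so a bot that skips the check is not stopped by anything. The `/go/` path, the bot names, and the helpers `build_rules` and `will_request` are all invented for illustration. Actually blocking a bot would need server-side measures, which this does not show.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that disallows a redirect script under /go/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /go/
"""


def build_rules(text):
    """Parse robots.txt text into a rule set a compliant bot would consult."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def will_request(path, agent, rules, honors_robots):
    """Decide whether a bot sends the request at all.

    The decision is made by the bot itself; the server plays no part in it.
    """
    if not honors_robots:
        return True  # a bot that ignores robots.txt requests everything
    return rules.can_fetch(agent, path)


rules = build_rules(ROBOTS_TXT)
print(will_request("/go/offer", "PoliteBot", rules, honors_robots=True))
print(will_request("/go/offer", "RudeBot", rules, honors_robots=False))
print(will_request("/blog/post", "PoliteBot", rules, honors_robots=True))
```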