GET /index.php?xml_sitemap=params=pt-post-2014-10 HTTP/1.1
GET /index.php?xml_sitemap=params=pt-post-2013-11 HTTP/1.1
GET /?p=35 HTTP/1.1
GET /?m=201312 HTTP/1.1
GET /?page_id=2 HTTP/1.1
GET /?paged=5&cat=1 HTTP/1.1
User-agent: *
sitemap: http://example.com/Sitemap.xml
User-agent: MauiBot
Disallow: /
I don't believe any power on earth will make Bing forget a URL. Pages removed up to three years ago? Still requested by bingbot. Pages redirected up to seven years ago? Still requested by bingbot. {snip}
Has your site entirely stopped using parameters, or have only selected ones been dropped?
Even if bingbot is the most visible annoyance, there are other approaches, such as redirecting any with-query request to the queryless version of the same URL, or returning a 410 for URLs that are genuinely gone and have no current equivalent.
What happens to humans who have an outdated URL bookmarked?
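To make the two options above concrete, here is a minimal Python sketch of the decision only, not of any particular server setup. The function name `decide` and the `GONE_PARAMS` set are made up for illustration, and treating `xml_sitemap` as "genuinely gone" is purely an assumption; which parameters count as gone depends on your site.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlsplit

# Hypothetical: parameters whose old pages have no current equivalent.
GONE_PARAMS = {"xml_sitemap"}


def decide(request_target: str) -> Tuple[int, Optional[str]]:
    """Return (status, redirect_location) for a request target like '/?p=35'."""
    parts = urlsplit(request_target)
    if not parts.query:
        # No query string: nothing to clean up, serve the page normally.
        return 200, None
    params = parse_qs(parts.query, keep_blank_values=True)
    if any(name in params for name in GONE_PARAMS):
        # Genuinely gone, no equivalent: say so explicitly.
        return 410, None
    # Otherwise send the visitor to the queryless version of the same URL.
    return 301, parts.path or "/"


print(decide("/?p=35"))
print(decide("/index.php?xml_sitemap=params=pt-post-2014-10"))
```

A human with an old bookmark to `/?p=35` would follow the redirect to `/` under this sketch, while a request for the retired sitemap URL would get a 410.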
Some crawlers will recognize robots.txt directives involving parameters; some won’t.
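As a rough illustration of why results differ, the sketch below compares two ways a crawler might interpret a rule such as `Disallow: /*?`. One treats `*` and a trailing `$` as wildcards in the manner described by RFC 9309; the other treats the rule as a literal prefix. Both function names are hypothetical, and this is not the code of any real crawler.

```python
import re


def wildcard_match(pattern: str, target: str) -> bool:
    """'*' matches any run of characters; a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece and join the pieces with '.*' for each '*'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, so this is a prefix match by default.
    return re.match(regex, target) is not None


def literal_prefix_match(pattern: str, target: str) -> bool:
    """A crawler without wildcard support sees '*' as an ordinary character."""
    return target.startswith(pattern)


for target in ("/?p=35", "/index.php?xml_sitemap=params=pt-post-2014-10", "/about/"):
    print(target, wildcard_match("/*?", target), literal_prefix_match("/*?", target))
```

Under these assumptions the wildcard-aware matcher blocks both with-query requests and leaves `/about/` alone, while the literal matcher blocks nothing, which is one reason a parameter rule in robots.txt can appear to be ignored.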
It's generally more efficient to return responses such as 404 or 410 straight from the server, rather than letting the request get all the way to /index.php and making the script do the work. It also makes it easier to see what's going on if you ever have occasion to study your server access logs, since a 404 or 410 will then be logged as such, instead of as 200.
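If you do end up studying the logs, a small script can show how with-query requests are currently being answered. This Python sketch assumes your server writes the common or combined log format, where the quoted request line is followed by the status code; the regular expression, the function name, and the three sample lines are made up for illustration and are not taken from any real log.

```python
import re
from collections import Counter

# Assumes common/combined log format: "METHOD target PROTOCOL" status ...
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<target>\S+) [^"]*" (?P<status>\d{3})')


def status_counts_for_query_requests(lines):
    """Count response statuses for requests whose target has a query string."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and "?" in match.group("target"):
            counts[match.group("status")] += 1
    return counts


# Invented sample lines, for illustration only.
SAMPLE = [
    '192.0.2.1 - - [01/Jan/2015:00:00:00 +0000] "GET /?p=35 HTTP/1.1" 200 5120',
    '192.0.2.1 - - [01/Jan/2015:00:00:05 +0000] "GET /?m=201312 HTTP/1.1" 410 312',
    '192.0.2.1 - - [01/Jan/2015:00:00:09 +0000] "GET /about/ HTTP/1.1" 200 2048',
]
print(status_counts_for_query_requests(SAMPLE))
```

A large count of 200s for URLs that should no longer exist suggests those requests are still reaching /index.php rather than being answered by the server itself.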