173.252.87.5 - - [24/Oct/2018:00:35:27 -0700] "GET /robots.txt HTTP/1.1" 206 1029 "-" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
2a03:2880:11ff:6::face:b00c - - [13/Oct/2018:12:41:25 -0700] "GET /robots.txt HTTP/1.1" 206 1020 "-" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
I've been seeing this sporadically since mid-October on all sites; interestingly, the earliest was my test site. (In theory, a human could blunder by and then post a link. I don't think it has ever happened, though I have seen the twitterbot a time or two.) A comprehensive search of my raw logs confirms that I have never before seen FB asking for robots.txt. On one site they enthusiastically asked for it in batches of five. What they're planning to do with it remains unclear.

69.171.251.13 [12/Oct/2018:16:50:18 GET /robots.txt HTTP/1.1 206 1551 - facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
69.171.251.13 [15/Oct/2018:19:11:46 GET /example/wp-content/uploads/2013/07/example-vending-machine.jpg HTTP/1.1 200 34892 - facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
2018-10-12:20:57:11
URL: /example/
IP: 69.171.251.17
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
Accept-Encoding: gzip, deflate
Accept-Language: en-US
Authorization:
Connection: keep-alive
Host: example.com
Upgrade-Insecure-Requests: 1
User-Agent: facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
That IP was used only twice

Huh, interesting. I know that FB has used a lot of different IPs--I don't generally keep track of which ones are currently active--but the two I listed were the only ones I'd ever seen asking for robots.txt. It's not unheard-of for an operator to use one IP for robots.txt requests and a different one for pages. But apparently that isn't the case here.
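
For anyone who wants to check their own logs for the same split (one IP for robots.txt, another for pages), here is a rough sketch of one way to tally it. It assumes the standard Apache/nginx "combined" log format used in the first two sample lines of this thread; the trimmed-down format in the later post would need a different pattern. The function name `robots_vs_pages` and the regex are illustrative, not anything Facebook or a log tool provides.

```python
import re
from collections import defaultdict

# Assumed layout: Apache/nginx "combined" log format, i.e.
# ip ident user [time] "METHOD path PROTO" status size "referer" "agent"
COMBINED = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def robots_vs_pages(lines, agent_substring="facebookexternalhit"):
    """Per IP, count robots.txt requests and all other requests
    made by user agents containing agent_substring."""
    counts = defaultdict(lambda: {"robots": 0, "other": 0})
    for line in lines:
        m = COMBINED.match(line)
        if m is None or agent_substring not in m.group("agent"):
            continue  # unparseable line or a different user agent
        path = m.group("path").split("?", 1)[0]  # ignore any query string
        kind = "robots" if path == "/robots.txt" else "other"
        counts[m.group("ip")][kind] += 1
    return dict(counts)

if __name__ == "__main__":
    # The two combined-format lines quoted at the top of this thread.
    sample = [
        '173.252.87.5 - - [24/Oct/2018:00:35:27 -0700] "GET /robots.txt HTTP/1.1" 206 1029 "-" '
        '"facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"',
        '2a03:2880:11ff:6::face:b00c - - [13/Oct/2018:12:41:25 -0700] "GET /robots.txt HTTP/1.1" 206 1020 "-" '
        '"facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"',
    ]
    for ip, tally in robots_vs_pages(sample).items():
        print(ip, tally)
```

An IP that shows up with both a nonzero "robots" and a nonzero "other" count is fetching robots.txt and content from the same address; an IP with only "robots" hits would point to the split-duty setup instead.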