An open source and collaborative framework for extracting the data you need from websites. In a fast, simple, yet extensible way.
For “need”, read “want”.
Edit: But what does this have to do with Google SEO?
I think lucy24's question was to tell me to wake up and move the thread to where it belongs. That has been done now. ;)
User-agent: Scrapy
Allow: /
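The two lines above only make a difference if a broader rule already shuts bots out. For anyone who wants to confirm that such an exception does what it is meant to, here is a minimal sketch using Python's standard urllib.robotparser. The catch-all "User-agent: *" block, the example.com URL and the agent strings are assumptions for illustration, not anyone's actual robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a blanket exclusion plus the Scrapy exception.
ROBOTS_TXT = """\
User-agent: Scrapy
Allow: /

User-agent: *
Disallow: /
"""


def can_fetch(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt lets the given user-agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Agent strings and URL are made up for the example.
scrapy_allowed = can_fetch("Scrapy/2.11.0 (+https://scrapy.org)",
                           "https://example.com/page.html")
other_allowed = can_fetch("SomeOtherBot/1.0",
                          "https://example.com/page.html")

print("Scrapy allowed:", scrapy_allowed)
print("Other bot allowed:", other_allowed)
```

This only shows what a well-behaved robot would conclude from the file; it says nothing about whether a given visitor actually honours it.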
using a Scrapy-based tool in the past and had to poke a hole in an existing bot-excluding directive
Further poring over archived logs--now that we're in SSID--suggests that users can readily modify the script to make Scrapy disregard robots.txt. It comes from a random selection of IPs, presumably those of whoever is running the script. (I didn't check whether they are server ranges, human ranges or both.) Sometimes it’s compliant but more often not.
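As far as I know, the switch in question is Scrapy's ROBOTSTXT_OBEY setting in the project's settings file, so compliance really is up to each user. For checking your own logs, here is a rough sketch that lists, per IP, the requests a Scrapy user-agent made to paths robots.txt denies it. It assumes Apache/nginx combined log format and a hypothetical rule excluding Scrapy from the whole site. The sample log lines are invented for the example.

```python
import re
from collections import defaultdict
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: Scrapy is excluded from the whole site.
ROBOTS_TXT = """\
User-agent: Scrapy
Disallow: /
"""

# Combined log format (assumed):
# ip - - [time] "METHOD path proto" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?:\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def scrapy_violations(log_lines, robots_txt=ROBOTS_TXT):
    """Map each IP to the disallowed paths it requested as Scrapy."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    hits = defaultdict(list)
    for line in log_lines:
        m = LOG_LINE.match(line)
        if not m or "scrapy" not in m["agent"].lower():
            continue
        # Fetching robots.txt itself is never a violation.
        if m["path"] == "/robots.txt":
            continue
        if not parser.can_fetch(m["agent"], m["path"]):
            hits[m["ip"]].append(m["path"])
    return dict(hits)


# Invented sample entries using documentation-only IP ranges.
SAMPLE_LOG = [
    '203.0.113.7 - - [01/Jan/2024:00:00:01 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" "Scrapy/2.11.0 (+https://scrapy.org)"',
    '203.0.113.7 - - [01/Jan/2024:00:00:02 +0000] "GET /page.html HTTP/1.1" 200 5120 "-" "Scrapy/2.11.0 (+https://scrapy.org)"',
    '198.51.100.9 - - [01/Jan/2024:00:00:03 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" "Scrapy/2.11.0 (+https://scrapy.org)"',
    '192.0.2.4 - - [01/Jan/2024:00:00:04 +0000] "GET /page.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

violations = scrapy_violations(SAMPLE_LOG)
print(violations)
```

Matching on the user-agent string only catches visitors who leave the default name in place; anyone who changes it will not show up here.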