When updating a site, how long can you safely use robots.txt?
JAB Creations
8:24 am on Nov 8, 2010 (gmt 0)
I'm going to be updating my site in the semi-near future and was wondering how long I can safely use robots.txt to prevent good spiders from indexing my site until I've finished updating everything?
- John
piatkow
11:11 am on Nov 8, 2010 (gmt 0)
No idea, but if I couldn't upload the revamped site in one go, I would build the update in phases that could each stand being indexed in their own right.
That is the classic static HTML site build, edited and tested offline. Is there anything about the software you are using that affects the approach?
JAB Creations
8:30 pm on Nov 9, 2010 (gmt 0)
Actually, I generally only take a few hours to update my site. However, I have considered doing some extra testing with the new version I've been working on, just to make sure that when it goes live it'll work smoothly for visitors. It's database-driven, and I avoid static like shark-filled pools with overhanging giant hair dryers.
- John
phranque
10:27 am on Nov 13, 2010 (gmt 0)
safe from what? indexing the content? robots.txt will successfully exclude content from being crawled by well-behaved bots indefinitely. in some cases you may be better off putting the content on a subdomain or in a subdirectory and hiding it behind basic authentication, so crawler requests get a 401 Unauthorized response.
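As a rough illustration of the robots.txt option, here is a minimal Python sketch that checks what a blanket exclusion would tell a well-behaved crawler. The file contents, the example.com URL and the helper name is_allowed are assumptions made up for this example, not anything taken from John's site. It only shows how a compliant bot reads the rules; a bot that ignores robots.txt is not stopped by it, which is where the basic authentication approach comes in.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking every crawler to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler with this user agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The blanket Disallow applies to every path, so this prints False.
    print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/index.php"))
```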
experienced
12:59 pm on Nov 13, 2010 (gmt 0)
A good idea can be to make those changes in a separate folder and, when you are done, move all the files to the root. You don't even have to tell anyone not to crawl, since you won't have any links pointing to that folder. Or, for safety, you can list that folder in your robots.txt file and, when the changes are done, move all the updated files out of it.
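A quick sketch of the folder approach, assuming the work-in-progress copy lives in a folder called /staging/ (a made-up name, as are example.com and the helper can_crawl). It checks that a compliant crawler would skip the staging folder while the live pages stay crawlable.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that excludes only the assumed /staging/ folder.
STAGING_ROBOTS_TXT = """\
User-agent: *
Disallow: /staging/
"""


def can_crawl(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if a compliant crawler may fetch url under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The staging copy is excluded, so this prints False.
    print(can_crawl(STAGING_ROBOTS_TXT, "https://example.com/staging/index.php"))
    # The live page is not covered by any rule, so this prints True.
    print(can_crawl(STAGING_ROBOTS_TXT, "https://example.com/index.php"))
```

Once the updated files have been moved out of the folder, the Disallow line no longer covers them, so nothing else has to change for crawling to resume.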
JAB Creations
9:45 am on Nov 27, 2010 (gmt 0)
I kept spiders out of my site for six days, apparently without any major issues. :)