Forum Moderators: Robert Charlton & goodroi
I still don't know whether I can simply add X-Robots-Tag straight into my sitemap.xml file, or whether I have to do some .htaccess exercise.
If the head of my sitemap.xml looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
Can I now put "X-Robots-Tag: noindex" somewhere in there, or do I have to serve it some other way?
If I can, what would the new head look like, please?
I'm just confused by the syntax, as everybody writes this specific tag as X-Robots-Tag: noindex, while I see that the attributes in an XML file always have their values in quotes.
Is it all about the quotes?
Thanks
[edited by: Robert_Charlton at 6:38 pm (utc) on Sep. 23, 2009]
[edit reason] fixed typo & delinked sample links [/edit]
I have read in non-authoritative blogs that an x-robots directive can be added to the .htaccess file on an Apache server, but I cannot confirm that information. You might get more specific server help than I can offer by asking in our Apache Forum [webmasterworld.com].
It is just that people like myself lack basic knowledge (about headers and how things really work, for example), while we run sites and play with Apache or other servers.
That's where questions like this one come from, and they wait for a moderator to answer them, as most other participants go "Huh?!".
# Set HTTP X-Robots-Tag response header to "noindex" for sitemap.xml requests
<FilesMatch "^sitemap\.xml$">
Header set X-Robots-Tag: "noindex"
</FilesMatch>
Jim
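To confirm that the header from the snippet above is really being sent, you can look at the HTTP response headers rather than at the XML itself. Below is a minimal Python sketch of such a check. The function names `has_noindex` and `fetch_headers` and the example.com URL are made up for illustration, and the check only handles the plain form of the header (no per-bot prefix such as "googlebot: noindex").

```python
from urllib.request import Request, urlopen


def has_noindex(headers):
    """Return True if any X-Robots-Tag value in `headers` contains 'noindex'.

    `headers` is an iterable of (name, value) pairs.
    Assumption: only the plain form is handled, not "botname: noindex".
    """
    for name, value in headers:
        # HTTP header names are case-insensitive.
        if name.lower() != "x-robots-tag":
            continue
        # One value may carry several comma-separated directives,
        # e.g. "noindex, nofollow".
        directives = [d.strip().lower() for d in value.split(",")]
        if "noindex" in directives:
            return True
    return False


def fetch_headers(url):
    """Send a HEAD request and return the response headers as (name, value) pairs."""
    request = Request(url, method="HEAD")
    with urlopen(request) as response:
        return list(response.headers.items())


# Usage (hypothetical URL, replace with your own sitemap location):
#   print(has_noindex(fetch_headers("http://www.example.com/sitemap.xml")))
```

If this prints False after you have added the FilesMatch block, one thing worth checking is whether mod_headers is enabled on the server, since the Header directive depends on it.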
Unless for some reason people search for XML files in SERPs to add to their Reader, without visiting the site!
Previous thread started by me: [webmasterworld.com ]
Is there a solid value in hiding the file?
We do have certain pages on websites that we don't want in the files, but that is a separate issue and easy enough to leave them out.
She never gets above the 60 mark in the SERPs for those pages on long-tail searches.
Scrapers' paradise! Her content is all over the web, and it gets spidered elsewhere before GBot hits her site.
Not destructive, but undesirable. Why would I want someone to open a .gz or .xml file to visit my content? I already have my pages listed, and they present the information the way I want it to appear. If a bot can get access to this information, any other computer can as well!
This raises a good question about the priority/ranks in sitemaps. Time to save my sitemaps (but I cannot, or I have to ban the same bot ranges).