Forum Moderators: phranque
a way to ensure that the text is not indexed with the actual page content
Put the pop-over file in a roboted-out directory.
To ensure it doesn't get indexed, I not only disallow the directory in robots.txt but also use an .htaccess in that directory with:
Header set X-Robots-Tag "noindex, noarchive"
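For anyone copying that directive, here is a minimal sketch of the .htaccess as it might sit inside the blocked directory. It assumes Apache with mod_headers available; the IfModule wrapper is my addition, not part of the original post, and only keeps the server from throwing an error if the module is missing.

```apache
# .htaccess placed inside the roboted-out directory that holds the pop-over file.
# Assumption: mod_headers is loaded; the IfModule guard is an added safety net.
<IfModule mod_headers.c>
    # Sent with every response served from this directory
    Header set X-Robots-Tag "noindex, noarchive"
</IfModule>
```

The header applies only to URLs served from this directory, so pages elsewhere that trigger the pop-over are not affected.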
the X-Robots-Tag will only prevent indexing of the URLs being requested from this directory (the popover text) rather than affecting the URL of the actual page content
Yes.
How does the header work if the contents of the directory exist only as included material within pages that live elsewhere?
It's a pop-over. The file is not writing to the page. The content of the pop-over file is native only to the directory/file.
Every time I post about additional insurance to robots.txt, everyone jumps in saying the bot will never see it if it's blocked in robots.txt.
That's the point... It's insurance
Just a fail-safe in case the file is requested *prior* to the SE bot requesting robots.txt. It happens.
It's a pop-over. The file is not writing to the page. The content of the pop-over file is native only to the directory/file.
Put the pop-over file in a roboted-out directory.
My implementation is to use a server-side include to inject the jQuery code into the pages.
The server-side include adds the jQuery to the page. But the actual pop-over code, the HTML, is only requested by the AJAX call, which in turn is triggered by a scroll event. So, based on the fetch and render, it appears to successfully prevent Googlebot from seeing the content. The problem is that the AJAX call includes a URL that Googlebot follows and then indexes.
The page that was indexed consists of only a few lines of HTML, with no head and no body. How am I supposed to add a meta "noindex" tag?
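One possible answer, sketched under assumptions: a fragment with no head cannot carry a meta tag, but the same noindex instruction can be sent as an HTTP response header, which is what the X-Robots-Tag directive quoted earlier in the thread does. The file name popover.html below is hypothetical (the thread never names the file), and the block assumes Apache with mod_headers.

```apache
# .htaccess in the directory that serves the AJAX fragment.
# "popover.html" is a hypothetical name; substitute the real fragment file.
<IfModule mod_headers.c>
    <Files "popover.html">
        # Header equivalent of <meta name="robots" content="noindex, noarchive">
        Header set X-Robots-Tag "noindex, noarchive"
    </Files>
</IfModule>
```

Keep in mind the point raised above: a bot only sees this header if it is allowed to fetch the URL. If the same URL is disallowed in robots.txt, the header acts purely as insurance for the cases where the file gets requested anyway.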