Best Practice - hundreds of thousands of 404 pages every few weeks

         

MonkFish

1:59 pm on Dec 23, 2011 (gmt 0)

10+ Year Member



Hi everyone,

Long time reader, first time poster.

I have a question about serving a vast number of 404 pages every week. I have just joined this company, and a process they put in place (only implemented in the past couple of months) gives users the option of removing their profiles from Google's index.

The way they went about it is this: if the user opts out, the profile URL returns a 404 and only registered users are able to see the profile. As we have over 100 million profiles, with hundreds of thousands being created almost weekly, this is producing 100k-plus 404s nearly every week.

I was considering reversing this and just putting a noindex, follow meta tag on the profiles of users who opt out of the search results. So my questions are:

1. Can having this many 404s every other week harm the website (rankings, trust, etc.)?
2. Would noindex,follow or even nofollow produce the same result? I have seen instances of a URL appearing in the rankings even when it was blocked by robots.txt, because it had links pointing to it. We cannot have this happen.
3. Is there any other way to go about doing this?

Thanks

tedster

7:22 pm on Dec 23, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Welcome to the forums, MonkFish.

If the profile page is not supposed to be indexed from the moment it is created, then I'd say any link that points to that profile page could use a rel="nofollow" attribute. Otherwise Googlebot will continue to use those newly appearing links for attempted URL discovery.
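As a rough sketch of that idea, assuming the site renders profile links through one helper and knows the opt-out flag at render time (the function name and parameters here are hypothetical, not anything from the site in question):

```python
from html import escape


def profile_link(url: str, name: str, opted_out: bool) -> str:
    """Render a link to a profile page.

    Hypothetical helper: adds rel="nofollow" only when the profile
    owner has opted out of search results.
    """
    rel = ' rel="nofollow"' if opted_out else ""
    # Escape both the URL and the visible name before putting them in HTML.
    return f'<a href="{escape(url, quote=True)}"{rel}>{escape(name)}</a>'
```

Links to profiles that have not opted out are left untouched, so only the private profiles stop being offered for discovery.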

Since a 404 status is being returned, Googlebot will not even see any meta tag on the page that logged-in users would be served.

However, I'd actually expect to see a 403 Forbidden status for any request coming from a not-logged-in "user". It's a much more accurate status, since the profile does "exist." That alone would save a lot of potential trouble.
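A minimal sketch of that status choice, assuming the application already knows three things about each request (the flag names are made up for illustration and the real site's logic may differ):

```python
from http import HTTPStatus


def profile_status(exists: bool, opted_out: bool, logged_in: bool) -> HTTPStatus:
    """Pick the HTTP status for a profile request.

    Assumed inputs: whether the profile exists, whether its owner opted
    out of search results, and whether the requester is logged in.
    """
    if not exists:
        # Only a profile that really is gone gets a 404.
        return HTTPStatus.NOT_FOUND
    if opted_out and not logged_in:
        # The profile exists but is private to registered users.
        return HTTPStatus.FORBIDDEN
    return HTTPStatus.OK
```

Since Googlebot is never logged in, it would get the 403 for every opted-out profile, while registered users still get the page with a 200.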

MonkFish

10:37 am on Jan 3, 2012 (gmt 0)

10+ Year Member



Thank you for the reply, tedster. I understand the 403 idea, but my concern is that it could be millions of pages a month returning a 403 status. Would Google be OK with a Forbidden status even though it was previously able to crawl that URL and other pages on the website link to it?

g1smd

11:01 am on Jan 3, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It's quite all right to have private areas of a website that search engines have no business accessing. Returning 403 would be correct for those.

MonkFish

11:47 am on Jan 3, 2012 (gmt 0)

10+ Year Member



OK cool, thanks tedster/g1smd. I will use the 403 option with a customized page for users (you must be logged in, search, etc.) in case we are getting any visits to it from social media etc.

g1smd

12:03 pm on Jan 3, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Make sure the error message includes links to other parts of the site so that visitors can continue on your site rather than simply hitting the browser back button.
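Something along these lines would do for the body of the error page; this is only a sketch, and the wording, function name, and link targets are placeholders rather than anything from the actual site:

```python
from html import escape


def forbidden_page(links: list[tuple[str, str]]) -> str:
    """Build the HTML body served alongside the 403 status.

    `links` is a list of (url, label) pairs pointing at other parts of
    the site, e.g. the login page or the search page (assumed URLs).
    """
    items = "\n".join(
        f'    <li><a href="{escape(url, quote=True)}">{escape(label)}</a></li>'
        for url, label in links
    )
    return (
        "<h1>This profile is private</h1>\n"
        "<p>You must be logged in to view it.</p>\n"
        "<ul>\n"
        f"{items}\n"
        "</ul>"
    )
```

For example, `forbidden_page([("/login", "Log in"), ("/search", "Search profiles")])` gives visitors two ways onward instead of a dead end.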