osCommerce, SE friendly URLs, and 404 headers

         

Nutter

12:48 am on Apr 11, 2007 (gmt 0)

10+ Year Member



This could have gone in here, the Apache board, or the PHP board. But I figure I'm most likely to run across someone who is using osCommerce and has had similar problems here...

Sometime in the past couple of weeks, all of my product pages started returning a 404 status instead of 200, which of course is a bad thing. The page comes up and looks perfect to the visitor, so I didn't notice until awstats started showing hundreds of 404 errors each day.

If I turn off the search engine friendly URLs the pages return 200 like they're supposed to. But I prefer the SE URLs because they look cleaner and it was easier to write a robots.txt to block duplicate content than it would have been to add meta robots blocks to any page with a cPath variable. More importantly, the site is several months old and most of the pages are indexed in the big 3 SEs so I don't want to change every URL.
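To illustrate the robots.txt approach mentioned here, this is a minimal Python sketch that checks which friendly URLs a rule would block. The Disallow path, the example URLs, and the helper name `is_crawlable` are assumptions for illustration, not the poster's actual robots.txt. The standard library parser does plain prefix matching, so the rule only catches URLs where cPath is the first path segment after the script name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule: block friendly URLs that start with a cPath segment.
ROBOTS_TXT = """\
User-agent: *
Disallow: /product_info.php/cPath/
"""


def is_crawlable(path, robots_txt=ROBOTS_TXT, agent="*"):
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)
```

Under these assumed rules, a category-scoped URL such as /product_info.php/cPath/21/products_id/116 would be blocked while the plain /product_info.php/products_id/116 stays crawlable.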

My guess is that it's an Apache issue, since I can't find anything in the osCommerce code that would send a 404 header. It only happens on pages like /product_info.php/products_id/116. To confuse me further, I have a WordPress blog embedded using a blog.php file, and /blog.php/post-plug/ works correctly and sends a 200 header.
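Because the page renders normally in a browser, the status line has to be checked directly. A minimal Python sketch for doing that is below; the function name `http_status` is made up for this example, and it deliberately does not follow redirects so that the code reported is the one the server sent for that exact URL.

```python
import http.client
from urllib.parse import urlsplit


def http_status(url, timeout=10.0):
    """Return the raw HTTP status code the server sends for `url`."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        # Rebuild the request target: path plus optional query string.
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn.request("GET", target)
        return conn.getresponse().status
    finally:
        conn.close()
```

Running it against both a /product_info.php/products_id/... URL and the /blog.php/post-plug/ URL would show whether the two path-style URLs really get different status codes from the same server.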

Any suggestions where to look for a fix? I've posted on the osc boards as well. And I've found a page that mentions this type of issue but only in relation to Apache 2 and my server is running version 1.3.

ss2sanjay

6:03 pm on Apr 23, 2007 (gmt 0)

10+ Year Member



If a product is removed from the site, it will keep showing up in the search engine's cache until the site is crawled again and the cache is updated. If a user clicks through from a cached page to that product, you can get a 404 error.

Nutter

6:19 pm on Apr 23, 2007 (gmt 0)

10+ Year Member



They're not products that I've removed. They're all active products. The closest thing to a cause I could find was that Apache, not OSC, was incorrectly sending the 404. But I could only find information on fixing it in Apache 2, and I'm using 1.3.

Either way, I wound up dropping the "SE Friendly URLs" and going back to query strings, and set up a 301 redirect from the friendly URLs that have been indexed. I also added a line in the product_info file so that if a cPath is set, it adds a robots meta tag telling search engines not to index the page.
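The post does not say how the redirect was implemented (presumably in Apache or PHP), so the following Python sketch only illustrates the URL mapping such a 301 would need: turning path-style key/value segments back into a query string. The function name `friendly_to_query` and its default script path are assumptions for this example.

```python
from urllib.parse import urlencode


def friendly_to_query(path, script="/product_info.php"):
    """Map /script/key/value/... to /script?key=value&...

    Returns None when the path is not a friendly URL for `script`,
    so the caller can leave other requests untouched.
    """
    if not path.startswith(script + "/"):
        return None
    segments = [s for s in path[len(script):].split("/") if s]
    if not segments or len(segments) % 2 != 0:
        return None  # not a clean key/value sequence; do not guess
    pairs = list(zip(segments[0::2], segments[1::2]))
    return script + "?" + urlencode(pairs)
```

A redirect handler could send a 301 with the returned value as the Location whenever the result is not None, and fall through to normal handling otherwise.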

Heck, I just checked and G has more pages indexed than before I changed this back so I guess there's something to be said for the friendliness of the osc friendly urls.