stop indexing https

         

jake66

3:02 pm on Aug 9, 2006 (gmt 0)

10+ Year Member



i've tried relentlessly for the past 2 weeks to figure out how to stop bots from crawling https pages.

- https is in the TLD of the website, i do not have control over this and i cannot change it to [secure.mysite.com...]

- i have tried to use the handlers to switch based on ports, but my host does not allow this (why? beyond me)

- i cannot insert a robots.txt to disallow, because again my https runs through the entire site

anyone have any suggestions?

FalseDawn

4:20 pm on Aug 9, 2006 (gmt 0)

10+ Year Member



If your entire site is https:// and if you expect to have any pages indexed at all, pages that do not need to be secure should be changed to http:// ASAP.

Why?
1) In general, search engines do not like to add https:// pages to their index.
2) It makes your site much slower and adds load to the server.

When you say https:// is in the TLD of the website, this is not correct - https:// is the URL scheme, not part of the domain name, and it just indicates a secure connection. What happens if you try plain [?...]

[edited by: FalseDawn at 4:23 pm (utc) on Aug. 9, 2006]

jake66

4:32 pm on Aug 9, 2006 (gmt 0)

10+ Year Member



they all are already http.

but this does not stop somebody from linking to an https:// page and search bots finding it. once that happens, that particular page is indexed as both http and https.

it would take a lot of work for someone to do this to my site because even if someone links to an https page, the links within that https page are not https... but i would like to fix the problem before it happens.

tedster

5:03 pm on Aug 9, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Have you made sure that any links from your secure pages are full, absolute links that use "http"?

In my experience, it is links on a secure page that cause this problem most of the time. Once in a while there's an accidental inbound link from someone (often grabbed from the address bar after following a relative link on a secure page) and even more rarely a real attempt to sabotage a site by a competitor.

Given the limitations of your hosting set-up, the only other approach I can see is moving to a new host who will oblige your need.
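
The question about absolute "http" links can be checked mechanically. Below is a minimal sketch in Python, assuming each page is available as an HTML string; the names LinkAudit and risky_links are made up for illustration. It lists every href that is not an absolute http:// URL, because a relative (or protocol-relative) link on a page served over https resolves to an https URL.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkAudit(HTMLParser):
    """Collect <a href> values that are not absolute http:// URLs."""

    def __init__(self):
        super().__init__()
        self.risky = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name != "href" or not value:
                continue
            parts = urlsplit(value)
            # In-page anchors and non-web schemes cannot create https duplicates.
            if value.startswith("#") or parts.scheme in ("mailto", "javascript", "tel"):
                continue
            # Anything without an explicit http scheme inherits or uses https.
            if parts.scheme != "http":
                self.risky.append(value)


def risky_links(html):
    """Return the hrefs in `html` that could lead a bot to an https URL."""
    parser = LinkAudit()
    parser.feed(html)
    return parser.risky
```

Running risky_links over the HTML of each secure page gives a list of links to rewrite as full http:// URLs.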

Bewenched

9:58 pm on Aug 9, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Same thing happened to us. Honestly I think that a competitor linked to us this way to cause us harm, but of course there is no way to prove it.

We had to write a script that returns a 301 Moved Permanently for the https URLs, pointing to the http versions, to get them out of the search engines' indexes. They were being seen as duplicate content.
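
The script itself was not posted, so the following is only a sketch of the general idea, written in Python as WSGI middleware. It assumes the server reports the scheme in the standard wsgi.url_scheme key; SECURE_PREFIXES, http_location and redirect_middleware are hypothetical names, and the listed paths are placeholders for whatever genuinely needs https.

```python
from urllib.parse import quote

# Hypothetical paths that must stay on https (checkout, login and so on).
SECURE_PREFIXES = ("/checkout", "/account")


def http_location(environ):
    """Return the http:// URL to redirect to, or None if no redirect is needed."""
    if environ.get("wsgi.url_scheme") != "https":
        return None
    path = environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "")
    path = path or "/"
    if path.startswith(SECURE_PREFIXES):
        return None
    host = environ.get("HTTP_HOST", "").split(":")[0]  # drop ":443" if present
    query = environ.get("QUERY_STRING", "")
    return "http://" + host + quote(path) + ("?" + query if query else "")


def redirect_middleware(app):
    """Wrap a WSGI app so https requests for ordinary pages get a 301 to http."""

    def wrapped(environ, start_response):
        target = http_location(environ)
        if target is None:
            return app(environ, start_response)
        start_response(
            "301 Moved Permanently",
            [("Location", target), ("Content-Length", "0")],
        )
        return [b""]

    return wrapped
```

Because the check happens inside the application rather than in the server configuration, an approach like this may still be possible on a host that does not allow switching handlers by port.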

jake66

4:49 am on Aug 10, 2006 (gmt 0)

10+ Year Member



tedster, everything linked within the site is 100% http. but again, this won't stop unsavory individuals from posting the links all over the web to sabotage one's ranking.

i know i can do the meta trick, but my pages are cached. is anyone familiar with osc? if so, how can you disable the cache add-on for https pages? this alone would solve my problem.
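
The thread does not say how the osc cache add-on works internally, so this is not its API. It is only a sketch, in Python, of the general idea behind the question: assuming the "meta trick" means a robots noindex meta tag, https requests bypass the cache and get the tag, while http requests are cached without it. The names robots_meta and render_page, and the build_page callback, are hypothetical.

```python
def robots_meta(scheme):
    """Return a noindex tag for https requests and nothing for http."""
    if scheme == "https":
        return '<meta name="robots" content="noindex,follow">'
    return ""


def render_page(scheme, path, build_page, cache):
    """Serve http pages from the cache; always build https pages fresh.

    build_page(meta) is assumed to return the full HTML with `meta`
    placed inside <head>. `cache` is any dict-like store keyed by path.
    """
    if scheme == "https":
        # Bypass the cache so the noindex tag never leaks into http copies.
        return build_page(robots_meta(scheme))
    if path not in cache:
        cache[path] = build_page("")
    return cache[path]
```

An alternative with the same effect is to keep caching both variants but include the scheme in the cache key, so the http and https copies are stored separately.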