Tip for Validating Links

         

jomaxx

10:58 pm on Sep 10, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm sure everyone knows it's a good idea to use a link validator periodically if they have a significant number of links, but here's a little trick that helped me find a lot of broken links that were invisible to my link checker:

Deliberately create a file of broken links on the domains you want to check (for me this is a fairly simple task using TextPad), and validate that file.

Any deliberately broken link that does not get flagged points to a site that is not reporting errors in a standard way (i.e. not returning a 404 or similar error status), so your real links to that site should be verified manually. Even after doing a full link validation last month, I just tried this and discovered a further few percent of problem links that for all I know could have been broken for a very long time.

The most egregious example is people scooping up and parking expired domains -- I assume that they deliberately throw up a status 200 so that old broken links NEVER get fixed. But you also end up with a lot of badly configured domains that serve up a valid web page containing some kind of error/not found message.
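
For anyone who would rather script this than build the file by hand, here is a minimal Python sketch of the same trick. It is only an illustration under some assumptions: the names `probe_url`, `fetch_status` and `hosts_hiding_errors` are made up for this post, the bogus path is just a random string, and it treats any 2xx answer to a nonexistent page as suspicious. It uses only the standard library.

```python
import uuid
from urllib.error import HTTPError
from urllib.parse import urlsplit
from urllib.request import Request, urlopen


def probe_url(link: str) -> str:
    """Build a URL on the same host that almost certainly does not exist."""
    parts = urlsplit(link)
    return f"{parts.scheme}://{parts.netloc}/no-such-page-{uuid.uuid4().hex}.html"


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status for url, or 0 if the host cannot be reached."""
    request = Request(url, headers={"User-Agent": "link-probe-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # Proper error responses (404, 410, 500, ...) end up here.
        return err.code
    except OSError:
        # DNS failures, refused connections and timeouts.
        return 0


def hosts_hiding_errors(links, fetch=fetch_status):
    """Return hosts that answer a deliberately broken URL with a 2xx status."""
    suspects = set()
    seen = set()
    for link in links:
        host = urlsplit(link).netloc
        if not host or host in seen:
            continue  # probe each host only once
        seen.add(host)
        status = fetch(probe_url(link))
        if 200 <= status < 300:
            suspects.add(host)
    return sorted(suspects)
```

Feed it the list of external links you already have; every host it returns is one where a normal link checker cannot be trusted, so those links go on the manual list. The `fetch` parameter is there so the logic can be tried with a stand-in function before pointing it at real sites.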

Hope that's useful for some people. Any other suggestions for non-obvious link verification techniques, or for especially useful software packages or web services, would fit nicely here.

Status_203

8:11 am on Sep 11, 2009 (gmt 0)

10+ Year Member



Good tip. Thanks.

I suppose any changes anywhere up the chain might indicate a review is required:

* More than x% change in HTML or content. (May or may not want to take more interest in any addition or change in links on the page.)
* Change in certain response headers.
* Change of IP.
* Change of nameserver.
* Change in whois (might want to pick out particular parts, given that some anonymity services regenerate the email address every y days).

All of which of course require caching what they used to be (and keeping a history might be useful as well).
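
A rough sketch of what comparing against the cached copy might look like, in Python. Everything here is an assumption for illustration: the snapshot is a plain dict, the field names (`body`, `server_header`, `ip`, `nameservers`, `whois_registrant`) are invented, the 30% threshold stands in for "x%", and how you actually collect the IP, nameserver and whois values is left out entirely.

```python
from difflib import SequenceMatcher


def content_change(old_text: str, new_text: str) -> float:
    """Fraction of words that changed: 0.0 is identical, 1.0 is nothing in common."""
    matcher = SequenceMatcher(None, old_text.split(), new_text.split(), autojunk=False)
    return 1.0 - matcher.ratio()


def review_reasons(old: dict, new: dict, threshold: float = 0.3,
                   watched=("server_header", "ip", "nameservers", "whois_registrant")):
    """List which parts of a cached snapshot differ enough to warrant a manual look."""
    reasons = []
    if content_change(old.get("body", ""), new.get("body", "")) > threshold:
        reasons.append("body")
    for field in watched:
        if old.get(field) != new.get(field):
            reasons.append(field)
    return reasons
```

An empty list means nothing up the chain moved; anything else names what changed, which could also be appended to a history file per link.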

piatkow

8:22 am on Sep 11, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



A lot depends on what niche you are in. For many years I ran a site for a community arts organisation. We had a big directory of local groups, performers, suppliers etc. Most of these were run by volunteers or amateur/semi-pro musicians who had a lovely habit of leaving sites in place when they moved on. As these often took advantage of free hosting from ISPs they usually lasted as long as the owner kept a connection through that supplier.

Once you add in parked domains redirected to MFA (made for AdSense) or "buy this domain" pages, I found that automated checks only caught a minority of problem pages. The overhead of maintaining a manual review cycle was too much, so on top of the automated checks I just settled for manually checking two other links every time I made a change.

D_Blackwell

5:00 pm on Sep 12, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



A lot depends on what niche you are in......I just settled for manually checking two other links every time I made a change.

I don't know how one confirms external links without a manual review; no control or monitoring of content. The live page/site could become anything down the road.

I manage a website for a craft guild in fiber arts. Almost all of their links (90%+) are external - websites that they do not control: member sites, artist sites, supply sources, informational sources.....

I've always looked at this situation as a question of commitment (investment of time and/or money) to the quality of the website. Truth be told, most sites don't see much change after a very short period of time from launch. People lose interest, move on, consider the job 'done'.....

Those sites that see the most change are the most likely to be 'perfect' or the most likely to get flagged in a scan. These sites are easily caught and fixed or dropped.

It's those pesky sites that haven't been updated once in two years that are a PITA. They may still be as useful as the day first linked. Their time may have passed. Somebody has to physically look and make a decision.

Because so many sites so often see so little change, a schedule or policy that works through the external links over time is usually going to be adequate. I'm really not worried about a stale link here and there. An organization with some money can afford the manual check. It doesn't have to be me. I can give them a URL list in an Excel file or some such, and they can have someone (volunteer or paid) check and review X links a day. For example, 5,000 external links worked through once in about 50 weeks = 100 checks per week, or 20 per working day. A large site can afford it - or has other issues. A smaller site should have an easy time keeping up if they want to.
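
The arithmetic is easy to redo for any list size. A small Python sketch, assuming one full pass takes about 50 working weeks of 5 days each (which is what the 100-per-week and 20-per-day figures imply); the function names are made up for illustration.

```python
import math


def checks_per_day(total_links: int, weeks_per_cycle: int = 50, days_per_week: int = 5) -> int:
    """Links to review each working day to finish one full pass per cycle."""
    return math.ceil(total_links / (weeks_per_cycle * days_per_week))


def daily_batches(links: list, per_day: int) -> list:
    """Split the master list into consecutive per-day chunks."""
    return [links[i:i + per_day] for i in range(0, len(links), per_day)]
```

With 5,000 links, `checks_per_day(5000)` gives 20, and `daily_batches` turns the master list into 250 daily chunks that can be handed to whoever is doing the checking.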

If I supply a master list, even the lowliest staffer or volunteer with a set of guidelines should be able to fly through this. Send back the file as a monthly report with notes of links checked and drops/edits to be made. I'm in, done, out, and updated on the master list in a few minutes. Problems are almost always tied directly to commitment; or wanting more than they are willing to support. (My! What big eyes you have!)

If commitment to maintaining the website is an issue, then they have other (and more serious) problems than the links. They should reevaluate what the site should be, or what the site should offer.

Same issues as websites that put up date-sensitive content (calendar, events....) and then don't stay on top of updates. There is no faster way for a site to look like garbage than to post date-sensitive information and then not keep it current.

The guild site that I manage does not pay to be kept tip top, and 'website priority' fluctuates. In their region they have THE 'go to' site and are proud of this - but do a pitifully awful job of taking advantage of this and really looking good. Major commitment issues. They want the benefit, but don't want to make much effort or investment. It is what it is. I am not paid to keep up the quality of the links - they don't do it - my warnings and concerns are well documented - it's not my problem. (I've stopped doing 'extras' for them because it was a lopsided relationship. All take - no give.)