If User-Agents had PGP Public Key Serial Numbers

Would it be a useful thing?

trillianjedi

12:29 pm on Dec 7, 2006 (gmt 0)

Consider having something like this in a request header field, such as the User-Agent:

"BrowserPublicKey={12345436-45353-48eb-4524-5645745529BE}"

Would that be the ultimate in user identification? Any response from the server would be encrypted with that public key, and only the owner of the matching private key (the browser) would be able to read it. You'd have an encrypted, authorising "handshake" at the start of any session.

Large crawler-operating organisations like Google, Yahoo! and MSN could have just one key for all their spiders.

Thoughts?

celgins

3:44 pm on Dec 7, 2006 (gmt 0)

Interesting. But from a programming perspective, how is this different from the way current session variables are handled?

trillianjedi

4:02 pm on Dec 7, 2006 (gmt 0)

The session variables cannot ascertain identity.

If I get a file request from someone calling themselves "googlebot" and I have Google's public key, I can handshake with that client using the key and know for certain whether or not they are who they say they are.

And no-one, within the realms of the security of PGP in this example, could spoof that.

jtara

7:20 pm on Dec 7, 2006 (gmt 0)

The useful idea here is having crawlers' identities authenticated, giving you the option to permit only those whose identity can be verified.

However, this probably isn't the best way to do it.

First, let's clarify how this works for those not familiar with public-key encryption. The crawler would present a public key when crawling. The response would be encrypted with that key. Only the holder of the matching private key can decrypt the response. If the crawler were a fake, pretending to be, say, Googlebot, it would get meaningless garbage.
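As a rough illustration of that description, here is a minimal sketch using textbook RSA with tiny primes. It is a toy, not secure, and nothing like real PGP; the function names, the numbers, and the "impostor" key pair are all made up for the example.

```python
# Toy textbook RSA with tiny primes: illustration only, NOT secure.
# All names and numbers are hypothetical, chosen for this example.

def make_keypair(p, q, e=17):
    """Return ((n, e), (n, d)) built from two small primes p and q."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e (needs Python 3.8+)
    return (n, e), (n, d)

def encrypt(public_key, message):
    n, e = public_key
    return pow(message, e, n)

def decrypt(private_key, ciphertext):
    n, d = private_key
    return pow(ciphertext, d, n)

# The genuine crawler owns this pair and presents the public half.
crawler_public, crawler_private = make_keypair(61, 53)

# An impostor can copy the public key, but only holds its own private key.
_, impostor_private = make_keypair(67, 71)

response = 65  # stand-in for the server's response (must be smaller than n)
ciphertext = encrypt(crawler_public, response)

print(decrypt(crawler_private, ciphertext))   # the genuine crawler recovers 65
print(decrypt(impostor_private, ciphertext))  # the impostor gets a different number
```

A real system would encrypt a session key rather than the whole response, but the asymmetry is the same: copying the public key does not help the fake.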

But there's really no need to encrypt the response, as this is a case where only authentication is needed, not encryption of the traffic. Of course, there is no challenge-response protocol defined for dealing with public-key identification presented in the user-agent header.
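Since no such protocol is defined, the following is purely a hypothetical sketch of what an authentication-only challenge-response could look like: the server sends a random challenge, the crawler signs it with its private key, and the server checks the signature against the known public key. It uses toy textbook RSA (primes 61 and 53), which is not secure; the names `digest`, `sign` and `verify` are assumptions for the example.

```python
import hashlib
import secrets

# Toy textbook RSA parameters (primes 61 and 53): illustration only, NOT secure.
N, E = 3233, 17  # public key, known to the server
D = 2753         # private key, held only by the genuine crawler

def digest(challenge: bytes) -> int:
    """Hash the challenge and reduce it into the toy modulus."""
    return int.from_bytes(hashlib.sha256(challenge).digest(), "big") % N

def sign(challenge: bytes, private_exponent: int) -> int:
    """Crawler side: sign the server's challenge with the private key."""
    return pow(digest(challenge), private_exponent, N)

def verify(challenge: bytes, signature: int) -> bool:
    """Server side: check the signature using only the public key."""
    return pow(signature, E, N) == digest(challenge)

challenge = secrets.token_bytes(16)  # fresh per request, so old answers cannot be replayed
genuine = sign(challenge, D)
tampered = (genuine + 1) % N         # any value other than the real signature

print(verify(challenge, genuine))    # True
print(verify(challenge, tampered))   # False
```

Nothing in the response itself is encrypted here; the server only learns whether the client holds the private key.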

No need, though, because a means to do this already exists: client-side SSL/TLS certificates, which authenticate the client as part of the connection handshake.
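For illustration, this is roughly how a server could be set up to demand a client certificate using Python's standard `ssl` module. It is a sketch under assumptions: the helper name `make_server_context` and its file-path parameters are hypothetical, and no real certificate files are loaded here.

```python
import ssl

def make_server_context(ca_file=None, cert_file=None, key_file=None):
    """Build a TLS server context that rejects clients without a valid certificate.

    The three paths are hypothetical placeholders; pass real files to use it.
    """
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.verify_mode = ssl.CERT_REQUIRED  # handshake fails if no client cert is sent
    if ca_file:
        # Certificate authorities trusted to vouch for clients (e.g. crawlers).
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file:
        # The server's own certificate and private key.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx

ctx = make_server_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True
```

Once a connection wrapped with such a context is established, the socket's `getpeercert()` method returns the verified client certificate, which is where the server would read the crawler's identity.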

Of course, if widely adopted, this would just drive fake/rogue crawlers to identify themselves as normal browsers even more so than they do now. So, I'm not sure this solves a problem.

What it would do is improve the accuracy of logfiles. You would know that visits shown in your log as being from Googlebot really ARE from Googlebot. I'm not sure that's a benefit worth going to the trouble for.