Google: EU Wants Answers on Google Data Storage

Forum Moderators: goodroi

engine

4:23 pm on May 28, 2007 (gmt 0)

A European Union committee is questioning how Google Inc. manages and stores the personal data it collects from consumers who use the company's popular search engine.
The inquiry, confirmed in a letter sent to Google, concerns the massive amount of information the powerful Mountain View business gathers about its users, and what could be done with it.

Google: EU Wants Answers on Google Data Storage [sfgate.com]

lammert

4:34 pm on May 28, 2007 (gmt 0)

Google handled nearly 3.8 million queries in April in the United States, or about 55 percent of all search traffic, according to Nielsen/NetRatings.

That is 1.5 queries per second. You don't need a dozen datacenters for that :) Journalists should first investigate and analyze, and only then write an article.

Bondings

5:05 pm on May 28, 2007 (gmt 0)

@lammert, the number of queries of the report in question is a mistake. It's not 3.8 million, but 3.8 billion. They forgot the "Searches (000)".

[nielsen-netratings.com...] (the report)
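
A quick sanity check of both figures, as a minimal sketch. It assumes the queries are spread evenly over the 30 days of April, which real traffic is not; the function name is made up for illustration.

```python
# Average query rate for April (30 days), assuming an even spread over the month.
SECONDS_IN_APRIL = 30 * 24 * 60 * 60  # 2,592,000 seconds


def queries_per_second(monthly_queries: float) -> float:
    """Return the average queries per second for a monthly total."""
    return monthly_queries / SECONDS_IN_APRIL


as_printed = queries_per_second(3.8e6)  # "3.8 million", as the article printed it
corrected = queries_per_second(3.8e9)   # "3.8 billion", per the Nielsen//NetRatings report

print(f"As printed: {as_printed:.2f} queries/s")
print(f"Corrected:  {corrected:.0f} queries/s")
```

The printed figure works out to roughly 1.5 queries per second, matching lammert's estimate, while the corrected figure is about a thousand times higher.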

zeus

8:40 pm on May 28, 2007 (gmt 0)

I like that the EU is finally doing something about all this data collecting. I don't care if they only use it for ads.

Trax

9:29 pm on May 28, 2007 (gmt 0)

I like the question they are asking... I wouldn't mind knowing about that, too.

oneguy

9:55 pm on May 28, 2007 (gmt 0)

Google uses the records to help protect users from fraud and abuse, according to the article.

How do they do that?

IanKelley

10:05 pm on May 28, 2007 (gmt 0)

EU's concern stems from Google's decision in March to keep that data, but to "anonymize" it -- for instance, deleting eight digits from the computer's network address -- after 18 to 24 months.

The obsessive privacy concerns regarding Google are, IMO, ignorant and potentially damaging to everyone that operates a website.

There are a lot of very good reasons to store usage data. Some sites could not operate without that data (in unadulterated form). No site could operate as effectively without it.

Google is an easy target because they're big, but the truth is that they do more to protect privacy than most, if not all, other major sites.

I don't know of any other site that has a policy of anonymizing user data after X amount of time.

Everyone keeps web logs, or visitor stats, or usage data, whatever you want to call it. Some sites do more with it, some sites delete it more frequently than others, but the data is the same.

Should they not be allowed to do that?

To me that just doesn't make sense... And let's be clear: if a major website gets legislated or sued into changing the way it stores data, that affects everyone, because we're all doing exactly the same thing. Even those of us who are unaware of it.

zeus

10:22 pm on May 28, 2007 (gmt 0)

IanKelley - I totally do NOT agree with you on any of this. With all their tools, Gmail and search, they have HUGE gaps in their privacy policy for the user.

Also, I don't need any info from my users. I keep stats, but no IPs.

ronin

12:22 am on May 29, 2007 (gmt 0)

the truth is that they do more to protect privacy than most, if not all, other major sites.

If that's true then great. If Google is regulated, the regulatory regime will ensure that it continues to do so. Any problems with industry regulation?

walkman

4:28 am on May 29, 2007 (gmt 0)

I'll be damned... I never thought the EU would do this, but kudos to them. The difference is that G has that info categorized and can map your entire life. You or I can't, as we only see a small part.

IanKelley

6:38 am on May 29, 2007 (gmt 0)

The difference is that G has that info categorized and can map your entire life

That's definitely true. But it's also a separate issue.

The issue here, unless I misread the article, is that Google has chosen to store usage data and then anonymize it in the future.

Again, all websites store usage data, aside possibly from one that has specifically told its web server and firewall not to. That would be a mistake, as the server in question would then have little defense against DoS and brute-force attacks.

Beyond that, any website that needs to fight fraud or abuse absolutely has to store usage data (including IP addresses); they really have no choice.

I'm not saying that there isn't a real possibility that Google will at some point become irresponsible about the way they use data. At that point governments would need to step in, but the answer should not have anything to do with what they are allowed to store. There is already little enough data available to websites about their visitors.
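
To illustrate why abuse control depends on stored IP addresses, here is a minimal sketch of a per-IP attempt counter over a sliding time window. The class name, thresholds and addresses are hypothetical, and this is not a description of how Google or any particular server does it.

```python
from collections import defaultdict, deque


class AttemptTracker:
    """Flags an IP that makes too many attempts within a time window (illustrative only)."""

    def __init__(self, max_attempts: int = 5, window_seconds: float = 60.0):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        # Recent attempt timestamps, keyed by IP address.
        self._attempts = defaultdict(deque)

    def record(self, ip: str, now: float) -> bool:
        """Record one attempt at time `now` (seconds); return True if the IP exceeds the limit."""
        recent = self._attempts[ip]
        recent.append(now)
        # Forget attempts that have fallen out of the window.
        while recent and now - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_attempts


tracker = AttemptTracker(max_attempts=3, window_seconds=10)
for t in (0, 1, 2, 3):
    blocked = tracker.record("192.0.2.1", t)
print(blocked)
```

Without the IP (or some other per-visitor key) there is nothing to count against, which is the point made above.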

msaqibansari

10:42 am on May 29, 2007 (gmt 0)

I think the European Union committee is right to ask how Google manages personal data. At least someone has the power to ask GOOOOOOOOOOOOGLE something.

zeus

1:28 pm on May 29, 2007 (gmt 0)

Well, the EU doesn't let anything pass that easily, so I thought this would come at some point, because Google has so much power. The EU also protested that thumbprints are scanned when traveling to the US and that the data is kept.

jecasc

2:39 pm on May 29, 2007 (gmt 0)

If you put all of Google's services together, Google knows pretty much everything about most people's online behaviour. The average webmaster only knows what someone did on his website and where he came from.

Google knows all the websites you reached through Google Search. It knows what you did on the website if the website uses Google Analytics. Google knows who you are if you use one of its services like Google Checkout. And Google even has access to your emails if you are a Gmail user.

And they want to store all that information for an undefined period of time.

In my opinion this is simply too much information for one company to handle without any supervision.

And when I think of it now that is a lot more information than I want any company to have about me.

IanKelley

8:01 pm on May 29, 2007 (gmt 0)

So while the EU's issue seems to revolve around data storage, it seems that most posters in this thread are concerned about the breadth of information that Google potentially has access to about a given user.

The thing is, aside from search, all of this data is collected by services where users sign up (or download an application) and give Google the data of their own free will. Most of the services could not function without the data collection, and storage thereof.

No one is in any way required to use these services.

The same thing applies to other major web service providers such as Yahoo and MSN which both have a similar array of services/web properties. Both also have more traffic than Google, at least according to all the metrics I've seen.

Heck, Microsoft makes Google look like Uncle Joe's Sausages on a Stick.com by comparison. They've got the Windows automated updates service running on PCs beyond count sending them IP addresses AND a unique ID associated with the OS installation every single day. It would be a simple thing for them to cross reference this with data collected from their various web services. Talk about privacy issues.

Any large Internet company has enough data to make people uncomfortable.

The point, as I see it, is that they need this data to function and it's perfectly reasonable for them to collect and keep it. Users are always free to opt out.

I agree that limits as to what they can ultimately do with this data should be considered. It does not make sense, however, to talk about limiting their ability to collect and store the data.

SlyOldDog

9:08 pm on May 29, 2007 (gmt 0)

Ian, I think it does matter. The amount of data you can store about a user should be inversely proportional to your ability to collect data on that user. This way privacy can be protected.

You cannot have a catch-all rule separating data storage rules from the power and reach of the data collection process.

The question is... how can you quantify the ability of each organization to collect data? Tough one. This may be a whole new science one day.

[edited by: SlyOldDog at 9:10 pm (utc) on May 29, 2007]

IanKelley

9:38 pm on May 29, 2007 (gmt 0)

That's an interesting thought. When you talk about limiting data collection, though, how would that work?

Basically we're talking about three kinds of data here:

1) User details from a signup or application form. So then would a large web property be compelled to use signup forms that ask for less data?

2) Access logs. Basically IP Address, Date/Time, User Agent and Page Visited. That's not much data. Which parts of it would a company with a large reach be disallowed from collecting/storing? Keep in mind that the most personal of these (IP address) is required for fraud and abuse control.

3) Historical usage data. This is probably the one that it makes the most sense to limit. The problem, though, is that a lot of businesses rely on that kind of data. Amazon.com, for instance, owes a lot of its success to innovative use of historical shopping data.

Search engines don't really need historical data to be associated with individual users in order to operate effectively (at least not long term) so it's fair to ask that they don't store the data in that way.

But then isn't anonymizing 100% of all data after X months reasonable?
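
As a rough sketch of what "anonymizing after X months" could look like for the access-log fields listed in 2): the record layout, the function names and the 548-day (about 18 months) default are assumptions for illustration, and "deleting eight digits" from the quoted article is read here as zeroing the last 8 bits of an IPv4 address. It does not handle IPv6.

```python
import ipaddress
from dataclasses import dataclass, replace
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AccessLogEntry:
    """One access-log line: IP address, date/time, user agent, page visited."""
    ip: str
    timestamp: datetime
    user_agent: str
    page: str


def mask_last_octet(ip: str) -> str:
    """Zero the last 8 bits of an IPv4 address, e.g. 203.0.113.77 -> 203.0.113.0."""
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)


def anonymize_old_entries(entries, now, retention=timedelta(days=548)):
    """Return entries with the IP masked on anything older than the retention period."""
    cutoff = now - retention
    return [
        replace(entry, ip=mask_last_octet(entry.ip)) if entry.timestamp < cutoff else entry
        for entry in entries
    ]


now = datetime(2007, 5, 28)
logs = [
    AccessLogEntry("203.0.113.77", datetime(2005, 1, 1), "ExampleBrowser/1.0", "/search"),
    AccessLogEntry("203.0.113.77", datetime(2007, 5, 1), "ExampleBrowser/1.0", "/search"),
]
for entry in anonymize_old_entries(logs, now):
    print(entry.timestamp.date(), entry.ip)
```

Recent entries keep the full address for fraud and abuse control; older ones can only be tied to a block of 256 addresses.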

jecasc

6:11 pm on May 30, 2007 (gmt 0)

No one is in any way required to use these services.

This is an argument that does not count. Just because you offer a service does not mean you can do what you want and tell people just to beat it if they do not like it.

Imagine phone companies started taping and analyzing your phone calls (bad enough that governments are doing this) to profile your behaviour and send your data to telemarketers. Hey, if you do not like it, don't use your phone anymore.

But that is not an option. Especially when services have become indispensable (like search or email have) and companies have reached a certain market share (like Google has), it is normal to regulate what they may and may not do.

Besides, many countries already have rules about what user data you may collect and how long you may store it. For example, in Germany you are not even allowed to offer a newsletter service on your webpage and make information like name or address a requirement for subscription. You have to offer the possibility to subscribe anonymously.

IanKelley

8:06 pm on May 30, 2007 (gmt 0)

Just because you offer a service does not mean you can do what you want and tell people just to beat it if they do not like it

I may have missed something but I'm pretty sure that none of the major web service providers are ignoring widespread user concerns and telling people to "beat it" :-)

In fact, ironically, the EU's issues are in response to Google taking a step to protect privacy completely on its own initiative in response to concerns.

Imagine phone companies...

I understand what you're getting at here but it's not reasonable to compare web based email and search to a public utility.

...a certain market share (like google has)...

Google's large market share is pretty much restricted to search. Other companies dwarf them in other areas (with the possible exception of YouTube).

So going back to public utilities... Search is not a communication channel, there are no conversations being recorded. The data being collected and stored is part of the necessary operation of the service. Phone companies log every call you make.

For example in Germany...

Wow, I hadn't heard that. Entertaining, thanks.

europeforvisitors

2:45 am on May 31, 2007 (gmt 0)

I'd be more concerned with the EU's data snooping than with Google's data storage. (The same goes for the NSA's data snooping in the United States.)

Look up "EU Data Retention Directive" and "NSA data snooping" if you aren't familiar with the issues.