Jun 12, 2007 at 4:10am ET by Danny Sullivan
In response to an EU letter over data retention, Google has announced that it will now anonymize server log data after 18 months, rather than the previous maximum time it had announced of 24 months. It is also reconsidering how long its cookies last. It’s nice to see Google make such a fast, responsive move, though it is reacting to something that felt more like a political show than a real effort to improve privacy protection. The EU privacy group that sent the letter can feel it got a result. Google can look like it perhaps protected privacy more, but some important core issues remain unresolved, though Google shines hope on the idea of a digital dashboard or console to put users directly in control of all their private data with the company. And will the new privacy summit now being called for actually happen?
Google Announced Log Anonymizing Program
To understand today’s announcement, let’s go back in time. Google Anonymizing Search Records To Protect Privacy from me covers how back in March, Google announced that it would make server log records anonymous after a period of 18 to 24 months. That article goes into detail about what server log data does — and does not — record that could be personally identifiable. Some key issues to understand from this original announcement:
EU Asks Questions On Data Retention
In April, news came out (see EU Group May Serve Google With Letter Over Data Retention Policies) that the European Union’s Article 29 Data Protection Working Party was going to send Google a letter over its data retention policies. At the end of May, the letter was finally sent (see European Union Questions Google’s Data Retention Policy).
The letter (PDF format) outlines the working group’s concern over log data retention:
It is of the opinion that the new storage period of 18 to 24 months on the basis indicated by Google thus far, does not seem to meet the requirements of the European legal data protection framework.
The Article 29 Working Party is concerned that Google has so far not sufficiently specified the purposes for which server logs need to be kept, as required by Article 6 (1) (e) of Data Protection Directive 95/46/EC. Taking account of Google’s market position and ever-growing importance, the Article 29 Working Party would like further clarification as to why this long storage period was chosen. The Working Party would also be keen to hear Google’s legal justification for the storage of server logs in general….
Concerning the “google cookie”, the lifetime of this cookie, which has a validity of approximately 30 years, is disproportionate with respect to the purpose of the data processing which is performed and goes beyond what seems to be “strictly necessary” for the provision of the service.
Things The EU Should Have Done, Should Have Known
I find the letter fairly amazing from a group that supposedly is concerned about privacy, in how it fails to ask any substantial questions and suggests, frankly, technical ignorance. It simply feels motivated out of political posturing. Let me count the reasons, some of which I’ll revisit in more depth as part of Google’s response to the EU letter further below.
Yet, by sending this letter to Google only — rather than sending a slightly different letter to all the major search engines that would have addressed the same issues across the board — the group rewards Google with headlines about how it is effectively being knuckle-rapped over privacy.
Cookie, Smookie!
I need to step out of the bullet points to best dive into my disgust over the cookie issue coming up in this letter.
Daniel Brandt of Google Watch deserves credit as being the main person to scream out against Google’s 30 year cookie (which would be longer, but I believe it was originally set to the maximum safe time according to the Year 2038 bug). But when I interviewed him back in 2003 about this, even he wasn’t that worried about the time period, ultimately. From what I wrote back then:
In conclusion, don’t be worried that Google’s cookie won’t expire for 35 years. Even Brandt agrees that’s not the issue. He just doesn’t like the unique ID portion of the cookie.
“Getting rid of the unique ID is the most important thing. The expiration date is a second indicator of how sensitive they are to privacy issues, even without the unique ID. But the expiration date issue is close to trivial once the unique ID is gone,” Brandt said.
Here we are four years later, and the Working Group is acting as if it is simply reacting to cookie scaremongering rather than understanding that cookie length means little when no one has a computer that lasts 30 years. Sure, it would be nice to see the cookie expiration shortened. But it means little in terms of real data protection.
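For the curious, the Year 2038 limit mentioned above can be worked out in a few lines. This is a minimal Python sketch, assuming (as I believe, though Google has not confirmed it) that the cookie's expiration was capped at the largest value a signed 32-bit Unix timestamp can hold. The function names are made up for this illustration.

```python
from datetime import datetime, timezone

# Largest value a signed 32-bit Unix timestamp can hold:
# 2**31 - 1 seconds after 1970-01-01 00:00:00 UTC.
MAX_INT32 = 2**31 - 1


def max_32bit_expiry() -> datetime:
    """Return the latest moment a signed 32-bit timestamp can represent (UTC)."""
    return datetime.fromtimestamp(MAX_INT32, tz=timezone.utc)


def years_until(expiry: datetime, now: datetime) -> float:
    """Approximate number of years between two timezone-aware datetimes."""
    return (expiry - now).days / 365.25


if __name__ == "__main__":
    expiry = max_32bit_expiry()
    print(expiry.isoformat())  # 2038-01-19T03:14:07+00:00

    # Hypothetical reference point: the date this article was published.
    published = datetime(2007, 6, 12, tzinfo=timezone.utc)
    print(f"{years_until(expiry, published):.1f} years from publication")
```

Any cookie pinned to that ceiling shrinks in effective lifetime the later it is set, which is one reason a "35 year" cookie in 2003 looks like a "30 year" cookie today.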
Comparing Cookies
Still, let’s try something. I took Internet Explorer 7, cleared everything out of it and then changed my settings to get prompted for each cookie requested. Then I did a tour of search engines.
First, Yahoo. Visiting the home page gave me four different cookies, with these expiration dates:
Then I did a search, which caused me to get two more cookies:
Next, Microsoft’s Live.com search engine. It gave me double the number that Yahoo did, to reach the home page, eight cookies in all. Expiration dates:
After entering a search request, I got six more:
Next, Google. I received only two cookies:
That’s it. No more when I did a search.
Yes, Google has the longest lasting cookie, but barely. Look at the longest period of time for each service, which I’ve rounded roughly:
Basically, both Google and Yahoo have 30 year cookies. So where’s the letter for Yahoo from the Working Group? And isn’t 14 years from Microsoft excessive? As I said, I think focusing on the time period of a cookie is just scaremongering and not diving into more substantial and important issues. But if Google’s going to get called out, why aren’t the others?
Google’s Response To The EU
But enough of my reaction to the Working Group’s letter. Today, Google’s given its own response in an open letter (PDF), linked from a blog post called, How long should Google remember searches?
Why keep server data? Google responds:
As part of the general blog post, it also succinctly lists:
Why Keep It 24 Months?
Moving along, Google provides a variety of reasons for why the 24 month maximum period was initially selected:
We need to have a sufficient amount of historical log server data. In fact, all search engine companies need sufficient data to evaluate and improve their services based on the needs of users, as online services evolve very rapidly. In addition, there is tremendous growth in fraud on the Internet, posing serious challenges for service providers to keep their services secure. In determining a retention period, we closely examined the evolution of search engine services, and the needs of our engineers to ensure the security of Google services. The period chosen, 18 to 24 months, represents a period lengthy enough to achieve these purposes without being excessive. We therefore believe that this is a proportionate period for the retention of log server data.
In addition to proportionality, data retention policies must also respect the principle of legality set forth in Article 6(1)(a) of the General Data Protection Directive. The Data Retention Directive requires all EU Member States to pass data retention laws by 2009 with retention for periods between 6 and 24 months. Google is therefore potentially subject (both inside and outside the EU) to legal requirements to retain data for a certain period. Since not many Member States have implemented the Directive thus far, it is too early to know the final retention time periods, the jurisdictional impact, and the scope of applicability. Because Google may be subject to the requirements of the Directive in some Member States, under the principle of legality, we have no choice but to be prepared to retain log server data for up to 24 months.
Problems With The EU Data Retention Law
You can see the issue of the EU data retention law comes up. Google then goes into some interesting depth of the problems of trying to figure out how exactly this is applied, and to whom:
There are many unanswered questions regarding the EU Data Retention Directive. The Working Party has criticized its lack of clarity in many respects, particularly with regard to divergent implementations in each Member State. We would welcome a definitive debate across Europe to answer such basic questions as:
1) What is an “electronic communication service provider” subject to data retention obligations, and would it include Google services, such as Gmail, Google Talk, or Google Search, in light of different definitions in each Member State?
2) What is the binding retention period for a global Internet company doing business in each Member State, when retention periods range from 6 to 24 months?
3) Do data retention requirements apply to the storage of personal data outside the EU by service providers established in the EU?
4) Will EU Member States go beyond the Directive and implement more stringent retention requirements?
For example, the German Ministry of Justice has proposed that webmail providers should be required to verify the identity of their account holders. Would the German authorities attempt to apply that requirement to Google? Could we challenge its legality in court, either as an unconstitutional infringement of privacy, or as an example of jurisdictional over-reach?
In short, there is tremendous confusion in legal circles across Europe on these issues, and both individuals and companies would benefit from greater clarity from authorities responsible for the Data Retention Directive to answer these very fundamental questions. A public discussion is needed between officials working in data protection and law enforcement to resolve these issues.
Complying With Other Laws
Google also notes that it is subject to laws outside the EU and interestingly works in an argument that it might want to retain data to help law enforcement:
It is also important to remember that in the U.S., the Department of Justice and others have similarly called for a 24-month data retention period. Thus, there seems to be an emerging international consensus on 24 months as the outer limit for data retention. This period makes sense for a global company like Google that must comply with the laws of all countries where it does business. Regardless of data retention requirements, logs are an important tool for law enforcement to investigate and prosecute many serious crimes, such as child exploitation. While we have resisted excessive requests from governments in the past, we believe that it is our responsibility to respect law enforcement requests for logs information when law enforcement follows valid legal process. Once again, a reasonable balance needs to be struck between the goals of privacy and the legitimate goals of law enforcement.
In addition, data protection laws, such as Article 17 of the General Directive and Article 4 of the E-Privacy Directive, require companies to ensure that adequate security measures are taken to protect user data. As explained above, our systems engineers require a sufficient historical sample of log server data in order to analyze security threats. A period of 18 to 24 months provides our engineers with sufficient data to analyze these threats without being excessive.
Of course, other laws also impose obligations on companies to retain information. In the U.S., for example, the Sarbanes-Oxley law requires us to retain business records sufficient to establish adequate financial and other controls. The same is true of tax and accounting requirements, especially for paid services, such as clicks on sponsored links, where we have a contractual and accounting obligation to retain data, at a minimum until invoices are paid and the period for legal disputes has expired. These legal obligations must also be considered in connection with our server log retention policies.
Shortening To 18 Months & Reconsidering Cookie Expirations
As for cookies, Google writes:
We believe that cookies data management in a user’s browser is fundamentally a browser/client issue, not a service/server issue. Therefore, the lifetime of a cookie does not indicate or imply any enforcement of data retention. We also believe that cookie lifetimes should not be so short as to expire and force users to re-enter basic preferences (such as language preference). Nonetheless, we acknowledge that cookie lifetimes should be “proportionate” to the data processing being performed.
The real kicker, of course, is Google concluding that it will shorten the retention period and reconsider how long cookies last:
After considering the Working Party’s concerns, we are announcing a new policy: to anonymize our search server logs after 18 months, rather than the previously-established period of 18 to 24 months. We believe that we can still address our legitimate interests in security, innovation and anti-fraud efforts with this shorter period. However, we must point out that future data retention laws may obligate us to raise the retention period to 24 months. We also firmly reject any suggestions that we could meet our legitimate interests in security, innovation and anti-fraud efforts with any retention period shorter than 18 months. We are considering the Working Party’s concerns regarding cookie expiration periods, and we are exploring ways to redesign cookies and to reduce their expiration without artificially forcing users to re-enter basic preferences such as language preference. We plan to make an announcement about privacy improvements for our cookies in the coming months.
Lest the EU or other privacy groups try to jump in and claim credit if the cookie gets reduced, a reminder again that the attention on cookie length was sparked all those years ago by Daniel Brandt, who coincidentally and before this announcement from Google remarked in a privacy discussion on Googler Matt Cutts’ blog:
How about drastically trimming back on that cookie that expires in 2038? That would impress me as a symbolic gesture of good will. It was that cookie that first alerted me to the fact, way back in year 2000, that Google was going to be a problem when it came to privacy. I was right.
Beyond The Obvious
Yesterday in my Google Bad On Privacy? Maybe It’s Privacy International’s Report That Sucks article, I spent a considerable amount of time being upset with Privacy International for doing what I thought was a slipshod report on privacy. Today, I’m similarly critical about the EU move. It’s not, as I said yesterday, that I’m a Google fanboy that thinks it can do no wrong. In fact, it’s the opposite: I think Google, as well as all the major search engines (and big companies for that matter), needs to have outside privacy groups and governmental bodies keeping them honest. My upset is that both Privacy International and the EU have seemed more concerned with style than substance.
As I pointed out in my article yesterday, Google has a variety of privacy policies that cover a range of services that it offers. These services can have data well beyond what’s in server logs, and it’s difficult for me — someone who regularly writes about Google — to know what happens with my data. Consider the accounts I have:
That’s just some of my accounts. If I delete my web history, I know that data is destroyed, though what’s kept on offline archives currently is not destroyed, from what I was last told. If I go to the Gmail privacy FAQ (far more useful than the Gmail privacy policy, which fails to link directly to the FAQ), I’m told deleting my mail really deletes it, even off backups, though that might take time. But then again, are these deleted from online backups only? What about offline?
The Privacy Control Panel
Figuring out where all my data resides and how to kill it is a pain — at Google or Microsoft or Yahoo, for that matter. John Battelle had a good suggestion back in early 2006 for a sort of private data control panel that could show you exactly what was stored where and put the user in control:
I bet 95% of the public will never edit, or even view the data more than once. But the sense that the control panel is there, just in case, will be invaluable to establishing trust.
We could use that more than ever. Google especially could use that, if it wants to stop the privacy attacks or at least stem them. How about it? I asked Google’s global privacy counsel Peter Fleischer about this yesterday, when talking to him about the Privacy International survey.
“We’re thinking hard internally along the digital dashboard-type of approach. Is there a way to give users a dashboard and visibility to all these elements and give them control,” he said. “It would be hugely complicated to build, but in terms of that vision, I completely share it, and we’re having deep discussions about it.”
As for Privacy International, it has now come out with a call for a privacy summit to be held on July 23 in San Francisco:
Following the recent publication of its consultative privacy rankings, PI has called on the major Internet companies to meet with the organization in July in San Francisco. The meeting has been called to clarify a number of data handling practices and is seen by PI as the first step to achieving an accord that will provide customers with consistent and strengthened privacy protections, and to give companies a greater understanding of the key challenges.
The meeting has been called for the week of 23rd July in San Francisco. Privacy International will reach out to all major Internet organizations with invitations to the event. These will be sent by Tuesday 12th July, 12.00 EST. We will then publish the full list of invited organizations together with a status report on their responses to the invitation.
A wide-ranging summit is a good idea, but after the publicity stunt Privacy International effectively pulled yesterday, it seems wrong that they are the ones to set the date and place, and to hold out a status report of invitations accepted as a name-and-shame attempt. If they had been serious about this, they would have never published that inept report in the first place, causing them to lose credibility. Instead, they would have called for exactly this type of summit, perhaps with other organizations, and not polarized the situation even more.
Yes — let’s see that privacy summit happen — and soon. Yes — involve the group. But no, they aren’t the right group to be leading it now, nor setting the terms.
Postscript: I went back to Fleischer and asked if Google would take part in what PI suggests. He said:
Google is always open to a thoughtful dialogue with people who care about privacy. We have not received an invitation to this event yet, even though it has been reported publicly. So, at this stage, we cannot evaluate whether this would be a forum for a thoughtful exchange of views, or a publicity stunt.
Regarding..
“Google Keeps Data Up To 24 Months Because The EU Tells It To: Some EU members may require companies to retain data up to two years…All this seems pretty easily knowable by the Working Group, and asking about it feels like a bit of written theater.”
Dude, if you want to understand the EU Data Retention saga, read this paper
http://www.law.ed.ac.uk/ahrc/script-ed/vol3-4/rauhofer.asp
The point is that EU Data Retention applies to ISPs and telcos, and there’s some argument about webmail. No way does it apply to search engines, and nobody ever suggested it did until Google started blowing this smoke. Contrary to the Google Art.29 reply, there is no ambiguity about whether a search engine is an “electronic communication service provider” – it ain’t, and that’s settled terminology across three other Directives.
The Directive is here
http://europa.eu.int/eur-lex/lex/LexUriServ/site/en/oj/2006/l_105/l_10520060413en00540063.pdf
…and the retention obligations on ISPs and telcos are in Article 5 – it’s obvious they don’t apply to search engines (only references to “Internet access, Internet e-mail and Internet telephony”)
Moreover
“Recital 13: This Directive relates only to data generated or processed as a consequence of a communication or a communication service and does not relate to data that are the content of the information communicated…”
“Recital 23: Given that the obligations on providers of electronic communications services should be proportionate, this Directive requires that they retain only such data as are generated or processed in the process of supplying their communications services. To the extent that such data are not generated or processed by those providers, there is no obligation to retain them…”
The EU cannot prohibit individual EU countries imposing further (“national security”) requirements which *could* include search engine logs (or my pet tortoise’s iris scan), but no country has done so, and if they did they would have to explain to the European Court of Human Rights (nothing to do with EU BTW) why that was “necessary” in a democratic society.
Thanks for the links. I gather Google will disagree on the blowing smoke part, but perhaps you’re right.
Let’s assume so. Issue still doesn’t go away. Why restrict Google from keeping server logs that are relatively incomplete about particular people (they only see a limited amount of what you do) when ISPs are being told to keep more complete data longer. Making Google (or any site) destroy the data faster than ISPs doesn’t necessarily protect privacy, when that data is still accessible (and leakable) by ISPs.
Moreover, it still doesn’t let the Working Group off the hook, sorry. If they know this isn’t applicable to search engines, then again, why not say that right up front in the letter (IE: Please don’t argue that point). And again, why not go after other major search engines at the same time. It still smacks of political theater.
“Why restrict Google from keeping server logs that are relatively incomplete about particular people (they only see a limited amount of what you do) when ISPs are being told to keep more complete data longer”
- The Data Retention Directive means ISPs will have to retain logs of the changing dynamic IP addresses assigned to customers, BUT ISPs don’t have to retain web traffic passing through. In fact it would be illegal for them to do so under Data *Protection* Directive, and this is reinforced and made explicit by Recitals 12 and 23 above.
“Making Google (or any site) destroy the data faster than ISPs doesn’t necessarily protect privacy, when that data is still accessible (and leakable) by ISPs.”
- now you see from previous why that’s not true? The Retention Directive is a privacy monstrosity, but not anything like as monstrous as search retention
“..doesn’t let the Working Group off the hook, sorry. If they know this isn’t applicable to search engines, then again, why not say that right up front in the letter”
- the Art.29 letter to Google doesn’t mention the *Retention* Directive at all! Why should they tell Google pre-emptively not to cite some manifestly inapplicable law?
Thanks again for the points. If the data retention law really doesn’t require ISPs to maintain records of web traffic going through them, yep, less of an issue. It’s odd that press accounts report the opposite, and even odder, then, why the law was passed at all. ISPs would have relatively little info of use.
Why mention the data retention act at all? Because the Working Group knows it would come up.
That’s the main point I have in all this. I’m not — NOT — saying that Google has no privacy issues. But I am saying that this Working Group letter was stage theater. Everything they asked, they could get answers to directly from statements Google has already published, first-hand statements. So why ask what’s been answered? Answer, to look like you are doing something. And they did — they got Google to cut back to 18 months from 24.
Big whoop. Microsoft and Yahoo still keep data far longer than that, but the Working Group doesn’t seem to care, because neither was in the news recently. That’s how I interpret their letter. And that’s disappointing to me, because if there were real concerns they had about Google, they were applicable to others and similar letters should have gone out. That’s not an excuse for Google to say, “We’re just doing what others do.” It’s a criticism of this group for not stepping up and doing what I presume its job actually is.
“It’s odd that press accounts report the opposite …”
Danny, how often do *US* press accounts of a *European* law get it right? I’ve got an email tax for you.
Google plays geeks like a fiddle. They’ve practically fabricated that EU retention directive reason, and no amount of experts pointing out that it is a fabricated reason makes a significant dent in the packaged-for-public-fear story.
Seth nailed it
Google can’t say they weren’t warned – they were at the conference where this was decided in Nov 2006
http://ec.europa.eu/justice_home/fsj/privacy/news/docs/pr_google_annex_16_05_07_en.pdf
“Many IP-logs, especially when combined with respective data stored with access providers, allow for the identification of users…
[search engines] …specifically, they shall not record any information about the search that ***can*** be linked to users or about the search engine users themselves. After the end of a search session, ***no*** data that can be linked to an individual user should be kept stored unless the user has given his explicit, informed consent to have data necessary to provide a service stored (e.g. for use in future searches)”
There’s a pretty big gap between search queries with identifiable IP-addresses for 18 months, rather than NONE AT ALL (identifiable) without express consent. This has all been a big 6-month snow job by Google to make believe 18 months compulsory retention is other than insane. And PI whacked ‘em off course in mid-flack. Good on PI.
Reality check. They didn’t do this because of PI. They did this because of the EU. If anything, PI is simply riding in on the EU’s coattails.
Then it’s a snowjob by Google? Hey, they have flacks, they do PR, sure. But then again, they are the only major search engine to first come out with a 2 year limit, which is now 18 months. Can’t say Microsoft and Yahoo weren’t warned, either.
What you can say is that when Google actually did something to reduce data retention, it got a big PR brick in the eye from the EU.
Microsoft and Yahoo got to keep doing whatever they want without any attention or public pressure, despite them being on par as privacy threats. Heck, both of them have user data going back years longer than Google, since they are older companies.
I can’t keep repeating over and over that I agree Google has privacy issues. I’m not excusing them of that. But why cut slack on the EU privacy group that attacks the only company that actually did something, for reasons that frankly smack more of generating publicity than of actually trying to protect users? And how long does the EU keep server data again? Because when I looked, that wasn’t listed on their site.
http://europa.eu/geninfo/legal_notices_en.htm#personaldata
That’s an excellent point – they don’t. Maybe it would be fun for your next blog on this stuff to complain to :
http://www.edps.europa.eu/EDPSWEB/Jahia/lang/en/pid/32
(it’s their job to regulate)
Would make an interesting story for someone from US to argue a beef with European privacy
http://www.out-law.com/page-8147
“Data retention laws do not cover Google searches, says Europe”
Google is not bound by the Data Retention Directive when it comes to search engine logs, Europe’s data protection committee has said. Google has used the Directive to justify keeping data, but OUT-LAW has learned that the law does not apply.
I’m based out of the UK, so I might not be the best for that test case :)
That’s a great article. I cut all but the opening paragraph since we can’t reprint without permission, but yep, that seems to settle Google’s questions about whether that particular law applies. But see — someone from the Data Retention group was also on the Data Protection group. So when sending that letter to Google, why not say something from the start like:
“We don’t know why you are storing this data so long. We’ve seen you justify this in part by suggesting EU law might require it, but that’s not the case according to blah blah.”
Anyway, apparently the Working Group is happy. From the AP:
The EU justice and home affairs commissioner welcomed a letter sent by Google officials to an independent EU data protection panel earlier this week in which the company said it would raise its data privacy standards for all users.
“It is indeed a good step, I have appreciated the commitment of Google not only to meet our expectations in terms of protection of privacy or better on cutting the time and reducing the time of retention of personal data,” Frattini said.