Google wrote on its public policy blog last night that it will "anonymize" its search logs after nine months instead of 18.
While that news mitigates the potential privacy threat, it doesn't remove it. For one thing, Google hasn't said exactly how it will anonymize the logs. "We haven't sorted out all of the implementation details, and we may not be able to use precisely the same methods for anonymizing as we do after 18 months, but we are committed to making it work," the company wrote on its blog.
Methods of anonymization vary widely. When AOL released search query logs online, the company first replaced the actual IP addresses with other numerals. Still, that wasn't enough to protect users' privacy: with enough search queries, some users can be identified from the content of those queries alone, especially if they've run vanity searches for their own names or addresses.
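To see why swapping out the identifier falls short, consider a minimal Python sketch. It is an illustration only, not AOL's or Google's actual method: the log entries, the IP addresses, and the function names `pseudonymize` and `profiles` are all hypothetical. Each IP address is replaced with a random number, yet every query from the same source still ends up grouped under one number.

```python
import secrets
from collections import defaultdict


def pseudonymize(log):
    """Replace each distinct IP address with a random number (hypothetical scheme).

    The same IP always maps to the same number, so queries stay linked.
    """
    mapping = {}
    used = set()
    result = []
    for ip, query in log:
        if ip not in mapping:
            # Draw random numbers until we get one not already assigned.
            new_id = secrets.randbelow(10**9)
            while new_id in used:
                new_id = secrets.randbelow(10**9)
            used.add(new_id)
            mapping[ip] = new_id
        result.append((mapping[ip], query))
    return result


def profiles(log):
    """Group queries by identifier, as anyone holding the log could do."""
    grouped = defaultdict(list)
    for user_id, query in log:
        grouped[user_id].append(query)
    return dict(grouped)


# Made-up sample data; the addresses come from ranges reserved for documentation.
sample_log = [
    ("203.0.113.7", "jane example"),
    ("203.0.113.7", "dentists near 12 elm street"),
    ("198.51.100.4", "weather"),
]

anonymized = pseudonymize(sample_log)
for user_id, queries in profiles(anonymized).items():
    print(user_id, queries)
```

The IP addresses are gone from the output, but the vanity search and the address search still sit side by side under a single number, which is all a reader needs to guess who typed them.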
Second, nine months is still a long time. In Europe, where there's a broad privacy law, some regulators have said companies should purge users' personal data as soon as possible, and within six months at the longest.
Google last night filed a 20-page paper with the EU stating that it uses IP logs to fight Web spam and click fraud. But even that paper doesn't explain exactly why Google needs IP addresses for nine months to do so.
In its blog post, the company presents itself as something of a martyr to privacy officials. "While we're glad that this will bring some additional improvement in privacy, we're also concerned about the potential loss of security, quality, and innovation that may result from having less data," the company writes.
At the same time, Google argues in conclusory terms that cutting the time it stores data to even less than nine months would result in diminishing privacy returns. "As the period prior to anonymization gets shorter, the added privacy benefits are less significant and the utility lost from the data grows," the company argues.
But the company doesn't explain why this should be so. In fact, despite Google's assertion on this point, it seems obvious that the less data Google retains about individual users, the less likely it is that people's privacy will be compromised.