Commentary

Microsoft To EU: We'll Anonymize Search Logs If Everybody Else Follows

Microsoft executives today told European privacy officials that the company is willing to anonymize its search logs after six months if its competitors do likewise.

EU officials have been pressing search companies for years to more quickly anonymize IP logs, but with mixed success. Recently, Google said it would remove the last octet of IP addresses after nine months, while Yahoo said it would do so after three months. Microsoft currently retains IP logs for 18 months, but says it immediately strips out names and addresses from search queries. Microsoft also says that after 18 months, it completely removes IP addresses from its records, as opposed to just stripping out the final digits.
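To make the difference between the two approaches concrete, here is a minimal Python sketch. It is an illustration only: the article does not describe how any of these companies actually process their logs, and the record layout and the function names `truncate_last_octet` and `remove_ip` are hypothetical.

```python
import ipaddress


def truncate_last_octet(ip: str) -> str:
    """Zero the final octet of an IPv4 address (hypothetical helper)."""
    addr = ipaddress.IPv4Address(ip)
    # Mask off the low 8 bits, keeping only the first three octets.
    return str(ipaddress.IPv4Address(int(addr) & 0xFFFFFF00))


def remove_ip(record: dict) -> dict:
    """Return a copy of a log record with the IP field dropped entirely."""
    return {key: value for key, value in record.items() if key != "ip"}


# Assumed log record layout, for illustration only.
record = {"ip": "203.0.113.77", "query": "weather"}

truncated = {**record, "ip": truncate_last_octet(record["ip"])}
removed = remove_ip(record)

print(truncated)  # {'ip': '203.0.113.0', 'query': 'weather'}
print(removed)    # {'query': 'weather'}
```

A truncated address still narrows the user down to one of 256 possible addresses, while full removal leaves no address at all. In both cases the query text itself stays in the record.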

Unlike the U.S., Europe has a sweeping privacy law that regulates companies' collection and use of personal data. Some EU officials have gone on record as saying that IP addresses are personal data, even though IP addresses can change over time. Last April, the Article 29 Working Party (the EU's privacy officials) said search engines should purge IP addresses as soon as possible, with an outer limit of six months. After that, companies should delete the logs or anonymize them using an irreversible process.

In the U.S., some industry watchers contend that IP addresses aren't "personally identifiable information" because they don't in themselves reveal users' names or addresses. But even though there's no publicly available reverse directory for IP addresses, it's possible to identify users simply by examining their search queries -- as Thelma Arnold, AOL User 4417749, learned after AOL released search data for 650,000 "anonymous" members.

Today wasn't the first time Microsoft offered to anonymize search records after six months if rivals also did so. The company made similar statements late last year.

But, frankly, six months still sounds like a long time. Ask Jeeves offers to delete logs after three days, while European search engine Ixquick doesn't retain the data at all. The major U.S. search engines typically say they retain the logs to fight click fraud and improve search results. But they have yet to adequately explain why either goal requires holding on to data for months on end.

1 comment about "Microsoft To EU: We'll Anonymize Search Logs If Everybody Else Follows".
  1. Privacy Dude from Self, February 11, 2009 at 6:24 a.m.

    I am always intrigued that Thelma Arnold is referenced in these "are IP addresses PII" conversations. Notably, the AOL data leak (not a breach: it was posted, not stolen) did not include IP addresses. It did include a field containing a unique identifier called anon_id. Anon_id became the common linkage point that showed the same person (whoever they were) made 224 distinct searches, including terms like "60 single men" and "jarrett arnold".

    What this points to is that any identifier may or may not become PII, depending on the quantity and specificity of the data linked to it and on the scope of those linkages (what constitutes PII to party A may not to party B, since not all controllers share common linkages). The larger point is to connect less (or more aggregated) data and share it less if you want to keep data anonymous or, more correctly, pseudonymous. As Ms. Arnold demonstrates, this unfortunately may be a real business challenge for search.
