Now, however, privacy advocates are fuming at the news that the company has changed its mind and will start retaining IP logs for 18 months.
"It's a sign of a race to the bottom," says Center for Democracy & Technology policy analyst Erica Newland. With Google retaining data for at least nine months, "it stands to reason that Yahoo feels like they're at a competitive disadvantage," she says. And, indeed, Yahoo said in its blog post announcing the change that one reason for the move is that "the Internet has changed, our business has changed, and the competitive landscape has changed."
Yet it remains unclear exactly how or why the log files will help Yahoo. The company said only that retaining the data will allow it to give consumers "a more robust individualized experience." But it's not immediately obvious how the data will result in greater personalization. (Google, for its part, says that it uses IP logs in order to detect click fraud and to improve the relevance of its search results.)
Regardless, this much is known: logs tying search queries to IP addresses can in and of themselves identify users. That point was brought home in the Data Valdez debacle, in which AOL released "anonymized" search queries for 650,000 users. Within days of the data's public release, The New York Times identified and profiled one AOL user based solely on her search queries.
While the Federal Trade Commission's recent do-not-track proposal has drawn much attention, it wouldn't affect Yahoo's decision to retain search queries. That's because the FTC's proposal would give users an easy way to opt out of being tracked as they navigate from site to site, but wouldn't affect how search companies or other publishers can use the data they collect from their own sites.
But other proposals could make it harder for Yahoo to hold onto this type of information. For instance, the Commerce Department recommended that the government enact baseline privacy legislation based on the Fair Information Practice Principles. Among those principles is one that requires companies to minimize the length of time they retain data.
Should that principle be enacted, Yahoo might have to come up with a more precise reason for retaining query logs than the company has offered to date.