But in the last three years, it's become apparent that the difference between personally identifiable and non-personally identifiable information can be illusory. Thanks to AOL, we know definitively that computer users can be identified simply by examining their search queries. In 2006, AOL released search logs showing queries made by more than 650,000 members. While the company changed people's IP addresses, the queries themselves were sufficiently detailed that The New York Times was able to find and profile one "anonymized" user, Thelma Arnold, within days.
And thanks to Netflix, we also know that movie reviewers who post critiques of obscure films can be identified, even when they write pseudonymously.
Because people can piece together Web users' identities without directly collecting names or addresses, regulators are backing away from the idea that collecting personal data always requires more safeguards than collecting supposedly anonymous data.
But the court system has been slow to acknowledge just how quickly users can be identified based on supposedly anonymous information. In June, a federal judge in Seattle ruled that IP addresses aren't personal information. In that case, the court held that Microsoft didn't violate its user agreement by collecting users' IP addresses, even though the agreement said the company would only gather data that doesn't personally identify users.
Additionally, last summer a federal judge in New York ordered YouTube to provide Viacom with the IP addresses of users, as part of Viacom's copyright infringement lawsuit against the video-sharing site. The judge wrote at the time that IP addresses alone can't identify users. (The companies later agreed that Google would replace the actual IP address with a substitute.)
Now, a federal judge in Kentucky has ruled that a nursing student at the University of Louisville didn't reveal personally identifiable information when she posted information on MySpace about a patient who had just given birth to a baby girl.
"The blog post does not disclose the birth mother's name, address, social security number, or the like. It does not disclose her age, race, or ethnicity. The blog post does not contain 'financial' or 'employment related information' about the birth mother. It does not disclose where she was in labor," the court wrote.
Santa Clara University law professor Eric Goldman thinks the judge made the wrong call on that narrow point. "I'm confident that any savvy investigator could combine the blog post with other data sources and quickly identify the mom with a high degree of certainty," he writes.
On the other hand, the Citizen Media Law Project points out that medical personnel would never be able to discuss their professional experiences if that kind of post were held to violate confidentiality.
Still, it doesn't appear as if the judge in this case seriously balanced the risk of de-anonymization against the student's free speech right to blog about her work. Technology often moves faster than the legal system. But when judges are called on to determine matters involving online privacy, one would think they would at least keep up with what's been happening in the last few years, as opposed to relying on outdated definitions of personal information.