Many of the most popular online publishers are leaking readers' names, addresses, home phone numbers, email addresses and other personal data to outside parties, according to a new study.
"The problem of privacy has worsened significantly in spite of the various proposals and reports by researchers, government agencies, and privacy advocates," researchers from AT&T and Worcester Polytechnic Institute state in their most recent report about online privacy. "The ability of advertisers and third-party aggregators to collect a vast amount of increasingly personal information about users who visit various Web sites has been steadily growing."
For the paper, "Privacy leakage vs. Protection measures: the growing disconnect," the researchers examined 120 heavily trafficked sites, including health, travel, jobs and news sites. All of the sites examined -- except for those in the health category -- required users to register, often by providing a name, email address, or other identifiable information.
The majority of the sites (56%) were found to leak user data that is both sensitive and personally identifiable.
The report suggests that online publishers should take a more active role in preserving users' privacy. "Whether it's intentional or whether it's inadvertent, these first-party sites are not doing a good job of making sure that information does not get passed through," says Worcester Polytechnic's Craig Wills, one of the report authors.
Health and travel sites were especially prone to leakage, with 9 of 10 sites examined in each of those categories passing along potentially sensitive information about users. The health sites transmitted information about what search terms people use within the sites, while the travel sites passed along the users' possible itineraries. "We suspect that this would come as new and unwelcome news to most users," the report states.
Leakage of names or search terms often occurred via referrer headers.
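To illustrate the mechanism: when a page loads content from an outside server, the browser typically sends that server a Referer header containing the URL of the page the user is on. If that URL includes a search term or a name in its query string, the outside server receives it. The following is a minimal sketch of what a third party could read from such a header. The host names, the `q` parameter, and the function name are hypothetical and are not taken from the report.

```python
from urllib.parse import urlsplit, parse_qs


def query_terms_in_referer(headers):
    """Return the query parameters readable from a request's Referer header."""
    referer = headers.get("Referer", "")
    query = urlsplit(referer).query
    # parse_qs decodes percent-escapes and '+' and returns {name: [values]};
    # keep only the first value of each parameter for simplicity.
    return {name: values[0] for name, values in parse_qs(query).items()}


# Hypothetical headers a browser might send to a third-party server when
# loading an ad or tracker embedded in a search-results page.
third_party_request_headers = {
    "Host": "ads.example.net",
    "Referer": "https://health.example.com/search?q=pancreatic+cancer",
}

leaked = query_terms_in_referer(third_party_request_headers)
print(leaked)
```

In this sketch the third party never asks the user for anything; the search term arrives as a side effect of the page URL being forwarded.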
The report also explored methods that data aggregators can use to connect small amounts of information from a variety of sites in order to arrive at detailed profiles.
In some cases, leaked information is attached to cookies that have been placed on users' computers by outside companies. For instance, the report says, a health site might leak the fact that a user is searching for information about pancreatic cancer, while another site might leak that same user's email address. If a third party ties both pieces of information to the same cookie on that user's computer, that company then knows the email address of a user interested in learning more about pancreatic cancer.
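The linkage described above can be sketched as a simple grouping step. This is an illustration under stated assumptions, not the method of any particular aggregator: the cookie identifiers, field names, and values are invented, and `build_profiles` is a hypothetical helper.

```python
from collections import defaultdict


def build_profiles(observations):
    """Group leaked (cookie_id, field, value) records by cookie identifier."""
    profiles = defaultdict(dict)
    for cookie_id, field, value in observations:
        # Records that share a cookie ID end up in the same profile,
        # regardless of which site leaked them.
        profiles[cookie_id][field] = value
    return dict(profiles)


# Hypothetical records received by one third party from different sites.
observations = [
    ("cookie-123", "search_term", "pancreatic cancer"),  # from a health site
    ("cookie-123", "email", "user@example.com"),         # from another site
    ("cookie-456", "search_term", "flights to Boston"),  # a different user
]

profiles = build_profiles(observations)
print(profiles["cookie-123"])
```

Neither site in this sketch reveals much on its own; the profile becomes identifying only once the two records are joined on the shared identifier.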
Users can delete those cookies, or prevent them from being set, but even without cookies, third parties can still compile information about users. For instance, the report says that aggregators also can tie data about one user from different publishers to an IP address. While some IP addresses are shared between family members or work colleagues, others correspond to specific individuals.
The same team that released the study previously reported on the leakage of users' names by social networks and mobile social networks. Those reports resulted in potential class-action lawsuits against Facebook and MySpace. A federal judge dismissed the lawsuit against Facebook two weeks ago, but said the users could attempt to refile the matter.