Ad tech companies frequently argue that online tracking doesn't pose a threat to privacy because it's anonymous -- meaning that ad networks and other tech companies don't know the names of the people being tracked.
But a new study concludes that ad networks theoretically can figure out many users' identities by examining publicly available data from social media services.
"Browsing histories can be linked to social media profiles such as Twitter, Facebook, or Reddit accounts," states the report, "De-anonymizing Web Browsing Data with Social Networks," authored by researchers at Princeton and Stanford.
"Users may assume they are anonymous when they are browsing a news or a health website, but our work adds to the list of ways in which tracking companies may be able to learn their identities," Princeton faculty member Arvind Narayanan said Thursday in a statement.
Prior reports have found that Web publishers and social networks can leak people's personal information by including user names or other data in "referrer headers" -- the information that is automatically transmitted to ad networks and other third parties.
The new report rests on a few premises, including that social media users follow a distinct set of other people, and that users are particularly likely to click on the links that appear in their feeds.
Those assumptions won't be true in all cases. But when they are, companies can de-anonymize some users by comparing the links in their publicly available social media feeds -- like their Twitter feeds -- with the sites they have visited, according to the report.
"Our approach is based on a simple observation: each person has a distinctive social network, and thus the set of links appearing in one’s feed is unique," the authors write. "Assuming users visit links in their feed with higher probability than a random user, browsing histories contain tell-tale marks of identity."
The authors tested their theory by recruiting 400 people who allowed their Web browsing histories to be tracked, and then comparing the sites they visited to sites mentioned in Twitter accounts they followed. The researchers say they were able to use that method to identify more than 70% of the volunteers.
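The matching idea the report describes can be illustrated with a small sketch. This is a simplified toy, not the researchers' actual statistical model: it assumes the set of links appearing in each candidate's feed is already known, and it scores each candidate by the links their feed shares with a browsing history, giving rarer links more weight. The function name `rank_candidates`, the weighting formula, and the sample data are all hypothetical.

```python
import math
from collections import Counter


def rank_candidates(history, feeds):
    """Rank candidate profiles by overlap between their feed links and a history.

    history: iterable of visited URLs (the "anonymous" browsing history).
    feeds:   dict mapping a candidate profile name to the set of links in its feed.
    Returns profile names, best match first.
    """
    # Count how many candidate feeds contain each link.
    # A link that shows up in few feeds says more about identity.
    link_counts = Counter(link for links in feeds.values() for link in set(links))
    n = len(feeds)
    visited = set(history)

    scores = {}
    for user, links in feeds.items():
        overlap = visited & set(links)
        # Hypothetical weighting: rarer links contribute more to the score.
        scores[user] = sum(math.log((1 + n) / link_counts[link]) for link in overlap)

    return sorted(scores, key=scores.get, reverse=True)


# Made-up example data for illustration only.
feeds = {
    "alice": {"a.example/1", "b.example/2", "news.example/x"},
    "bob": {"news.example/x", "c.example/3"},
    "carol": {"d.example/4", "news.example/x"},
}
history = ["a.example/1", "b.example/2", "news.example/x", "unrelated.example"]

print(rank_candidates(history, feeds)[0])
```

In this toy data, the widely shared link appears in every feed and so barely separates the candidates, while the two links found only in one feed dominate the score. That mirrors the report's premise that a person's feed contains a distinctive set of links.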
"Any social media site can be used for such an attack, provided that a list of each user’s subscriptions can be inferred, the content is public, and the user visits sufficiently many links from the site," the report states. "For example, on Facebook subscriptions can be inferred based on 'likes,' and on Reddit based on comments, albeit incompletely and with some error."