This week, reports from two different sources detailed how information that users post on social networking sites is being mined. The Electronic Frontier Foundation
released records
showing that the government gathers intelligence by scouring social networking sites, and
The Wall Street Journal published a report examining how marketing companies, data brokers and others are using automated scrapers to
collect data about Web users.
Both reports left some observers unsettled, though whether the news should have outraged anyone is debatable.
On one hand, when people post information
about themselves to message boards, and use their real names, that data is inherently public. Maybe in the past, when Facebook's system was more closed, people could have reasonably expected that at
least information on Facebook wouldn't go beyond their friends. But even on Facebook these days, much of what people write is available throughout the Web.
On the other hand, many
users clearly don't expect that their posts will be investigated by outsiders -- whether the government, marketing companies or human resources directors -- for insights into their personalities or
potential buying behavior.
The legal issues of scraping data are extremely murky -- though not necessarily for privacy reasons. When companies allege that scraping is unlawful, they tend to
argue that scraping infringes sites' copyrights, or constitutes a trespass, or violates terms of service clauses that forbid accessing the sites through automated means.
Those allegations
are central to a pending lawsuit that Facebook filed against Power.com over scraping. Power aggregates information from a variety of social networking sites, enabling users with accounts at
services like Orkut, MySpace, LinkedIn and Twitter to access their information from one portal. To do so, Power asks users to provide log-in information for their social networking sites and then
imports their information.
Facebook objects to the practice, arguing that Power is violating a federal computer fraud law by scraping. A judge recently dismissed some of Facebook's claims, but ruled that Power could be liable if it circumvented technical barriers on
Facebook. Facebook has publicly said that Power's technology could threaten members' privacy because Power enables users to easily transfer photos or messages marked "private" to other social
networking services. Power counters that users have the ability to do this manually anyway and that Facebook's objection to the practice is driven by the desire to keep control over the data.
(Facebook recently released a tool allowing users to download some -- but, critically, not all -- of
the information associated with their profiles with a single click.)
As that lawsuit proceeds through the courts, questions continue to swirl over the legality of accessing data on social
networking sites. Even without using automated scrapers, companies can potentially violate a site's terms of service by collecting data about members. For example, Facebook's statement of rights and responsibilities includes this directive: "If you collect information from users, you will: obtain their consent, make it clear
you (and not Facebook) are the one collecting their information, and post a privacy policy explaining what information you collect and how you will use it."
It's not clear whether courts
would enforce this provision against, say, a company that's monitoring the site in order to gather intelligence about job applicants. As a practical matter, however, it's probably impossible to
prevent people from manually collecting information that users have themselves made available in a public forum.