Judge Sides Against Meta In Battle Over Scraping Public Data

A federal judge has largely sided against Meta Platforms in its battle with the Israeli analytics company Bright Data over data posted by Facebook and Instagram users and designated by them as public.

In a ruling issued Tuesday, U.S. District Court Judge Edward Chen in the Northern District of California said Bright Data didn't violate Meta's terms of service by scraping publicly available data from Instagram and Facebook because Bright Data wasn't logged in when it gathered the data.

“Bright Data only engaged in logged-out scraping of public data ... and thus did not breach the terms either while it had accounts with Facebook and Instagram or after terminating its accounts,” Chen wrote in a 37-page order awarding Bright Data summary judgment on the claim that it broke its contract with Meta.

The ruling allows Meta to amend its allegations against Bright Data, and also to continue to pursue a claim that Bright Data interfered in Meta's contracts with other companies. That claim centers on allegations that Bright Data sold user data to companies that themselves have Facebook and Instagram accounts, whose terms prohibit scraping.

A Meta spokesperson said the company is “evaluating next steps in the ongoing litigation.”

The decision comes in a legal battle dating to January 2023, when Meta alleged in a federal complaint that Bright Data wrongfully used automated tools to gather data from Facebook and Instagram -- including “users’ profile information, followers, and posts that users have shared with others” -- and then sold that data.

Meta contended in court papers that Bright Data offered to sell an Instagram data set containing 615 million records for $860,000. That data set “includes at least 34 fields from Instagram users’ profiles, including full name, ID, country code, region, post count, biography, business category, hashtags, followers, following, posts, profile image, highlights, verification status, business email, and business addresses,” according to Meta.

The complaint included a claim that Bright Data -- which itself had accounts with Meta from April 2021 until December 2022 -- violated the platform's terms of service. 

For its part, Bright Data argued that it only gathered information that was publicly available on the web -- meaning posts Instagram and Facebook users made available to all web users, regardless of whether they had Meta accounts.

“Bright Data searches information that does not require any Facebook or Instagram account,” the company wrote in a motion seeking summary judgment.

“Meta says it can contractually block any public web search it does not like simply by posting terms and conditions at the bottom of its web page. It cannot,” Bright Data added.

Chen agreed, writing that even though Meta's terms of service prohibit scraping, they only apply to users who are logged into their social media accounts.

Meta's terms of service “do not apply to and prohibit Bright Data's scraping of publicly available data while logged off,” Chen wrote.

Colorado attorney Kieran McCarthy, who has represented companies in battles over web scraping, calls the ruling the “most important legal opinion related to web scraping to date.”

He says most judges who have considered breach of contract claims similar to the one raised by Meta have ruled against companies that scrape data when they know scraping violates the terms of service.

“This opinion is definitely at odds with the bulk of historical case law,” he tells MediaPost.

One factor that appears to have carried weight with Chen was that Meta revised its terms of service in 2009 by removing language providing that anyone who visited Facebook, whether a user or not, was bound by its terms of service.

“The now-obsolete clause demonstrates that Meta was fully aware of how to write a clear provision that applied to both logged-in and logged-off users and made a conscious decision not to include the distinction in the most recent iteration of the terms for Facebook and Instagram,” Chen wrote. “Therefore, it is reasonable to infer that the current terms contemplate a 'user' as an account holder,” he added.

Given that wrinkle, it's not yet clear whether Chen's ruling is “just a one-off opinion,” McCarthy says.

“The meaning of this case will be litigated ad infinitum,” he adds.

Bright Data is facing a separate lawsuit by X Corp. (formerly Twitter), which also centers on allegations that Bright Data used automated tools to collect data from the platform.
