Commentary

Google Services Become Big Data Leak Through Third Parties

by Laurie Sullivan, Staff Writer, July 6, 2018

Google makes it possible for third-party developers to integrate with its services like Chrome and Gmail, but The Wall Street Journal set off a firestorm after reporting that Google has been allowing third-party companies to scrape keywords and data in emails from Gmail.

In an effort to reassure the industry about its practices, Google published a blog post earlier this week explaining how it puts developers through a “multi-step review process that includes automated and manual review of the developer, assessment of the app’s privacy policy and homepage to ensure it is a legitimate app, and in-app testing to ensure the app works as it says it does.”

Gmail appears to be only one of several Google products through which user data reaches third parties. Stripe software engineer Robert Heaton recently found that a plug-in available for the Google Chrome and Mozilla Firefox browsers sends users’ online browsing history to a third-party company.

Heaton points to Stylish, a browser plug-in that he found has recorded every website its 2 million users visit since January 2017. The Firefox version of the plug-in began doing the same in March 2018.

The Stylish plug-in sends each user’s complete browsing history back to its servers, together with a unique identifier. There its parent company, SimilarWeb, a service that measures website traffic and related mobile app data, can tie in other pieces of data.

“This allows its new owner, SimilarWeb, to connect all of an individual’s actions into a single profile,” Heaton wrote in a blog post. “And for users like me who have created a Stylish account on userstyles.org, this unique identifier can easily be linked to a login cookie.”

Heaton suggests that SimilarWeb, through Stylish, now owns a copy of each user’s complete browsing history and enough other data to tie search and browser histories to email addresses and real-world identities.
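
The linkage Heaton describes can be illustrated with a minimal sketch. Everything in it is hypothetical: the function name `build_profiles`, the identifier strings, and the sample URLs are invented for illustration and do not reflect the actual code or data format used by Stylish or SimilarWeb. It only shows why a stable identifier attached to every page visit is enough to assemble one profile per person.

```python
from collections import defaultdict


def build_profiles(events, account_links):
    """Group visited URLs by identifier, then attach any known account.

    events: iterable of (identifier, url) pairs, one per page visit.
    account_links: dict mapping an identifier to an account name or email.
    Both inputs are hypothetical stand-ins for whatever a tracker collects.
    """
    profiles = defaultdict(lambda: {"account": None, "urls": []})

    # Every visit carries the same identifier, so visits that look
    # unrelated on their own collapse into a single browsing history.
    for identifier, url in events:
        profiles[identifier]["urls"].append(url)

    # One login that exposes the identifier next to an account is enough
    # to put a name on the whole history.
    for identifier, account in account_links.items():
        profiles[identifier]["account"] = account

    return dict(profiles)


if __name__ == "__main__":
    sample_events = [
        ("id-1", "https://news.example/story"),
        ("id-2", "https://shop.example/cart"),
        ("id-1", "https://clinic.example/results"),
    ]
    sample_links = {"id-1": "user@example.com"}
    for identifier, profile in build_profiles(sample_events, sample_links).items():
        print(identifier, profile)
```

In this sketch the second identifier stays anonymous only because no account was ever seen alongside it.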

Search Insider reached out to SimilarWeb, but did not receive a response.

Heaton writes that sensitive URLs get swept up in the collection as well. His online medical provider shows him his medical documents using secret, 1000-character-long URLs generated by Amazon S3 that expire within a minute or so.

These pages do not require login authentication beyond simply knowing the URL, and anyone who guessed the authentication token in the URL before it expired would be able to view and download his medical documents.

"In my opinion this is not best practice on the part of my online medical provider’s engineering team," he wrote. "But the real world is full of things that are not best practice, and no conventional attacker is actually going to be able to guess a 1000-character long URL within a minute."
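
For readers unfamiliar with how an expiring secret URL works, here is a minimal sketch of the general idea. It is not Amazon S3's actual signing scheme: the secret key, the `files.example.com` host, and the `sign_url` and `is_url_valid` functions are all assumptions made for illustration. The server signs the path and an expiry time, and later accepts any request that presents a matching, unexpired signature.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical server-side secret; a real service would store this securely.
SECRET_KEY = b"demo-secret-known-only-to-the-server"


def _signature(path, expires):
    """Return the hex HMAC-SHA256 of the path and expiry timestamp."""
    message = f"{path}|{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(path, ttl_seconds, now=None):
    """Build a URL that stays valid for ttl_seconds after `now`."""
    now = int(time.time()) if now is None else int(now)
    expires = now + ttl_seconds
    query = urlencode({"expires": expires, "signature": _signature(path, expires)})
    return f"https://files.example.com{path}?{query}"


def is_url_valid(url, now=None):
    """Accept the URL only if it is unexpired and its signature matches."""
    now = int(time.time()) if now is None else int(now)
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False  # missing or malformed parameters
    if now > expires:
        return False  # the link has expired
    expected = _signature(parsed.path, expires)
    # Constant-time comparison; note that no login or cookie is checked.
    return hmac.compare_digest(signature.encode(), expected.encode())


if __name__ == "__main__":
    link = sign_url("/records/report.pdf", ttl_seconds=60)
    print(link)
    print(is_url_valid(link))
```

The check never asks who is making the request. Guessing a valid signature is impractical, which is Heaton's point about conventional attackers, but software that records the full URL while it is still live does not need to guess.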



About the Author

Laurie Sullivan is a writer and editor for MediaPost. You can reach Laurie at lauriesullivan@gmail.com.
