'WSJ' Charts Media's Data Leakage -- But Is Anyone Using The PII?
This is the chart many people in the data and privacy worlds will be discussing over the next few days, so you may as well take a look this morning. WSJ.com just published results of its own survey of how the most trafficked online sites share the personal data of their registered, logged-in users with tech and ad partners. The interesting thing about the survey is that it gets beyond the cookie fetish into the issue of how major media are passing on, deliberately or not, PII (personally identifiable information) like email addresses, full names and account usernames, as well as less personal demographic info like age and zip code.
The survey finds that Ask.com, CNN, Pinterest, and WSJ itself were routinely seen passing along the email addresses of their users to tech and ad partners such as (though not in every case) Google Analytics, Facebook, AudienceScience and others. In the case of CNN, for instance, email is being sent to Adobe, AudienceScience, Disqus, Facebook, Nielsen and comScore’s ScorecardResearch. In response to WSJ’s finding, CNN said that it was not knowingly providing email addresses to any partners and “is investigating the findings and, if appropriate, will remedy the problem.” Likewise, WSJ itself, which was detected passing along email to AudienceScience, Opt Intelligence, Peer39 and ScorecardResearch, said this was unintentional. The company claimed the data was passed along “in error” and that it is “working to close the leaks.”
Several other sites, like Ask.com and WhitePages, argued that they no longer pass along emails. Some of the data recipients consulted (especially Google and Facebook) argued they neither ask for nor use the PII sent to them by sites.
To its credit, WSJ did a pretty good job of reporting all sides of this issue, and seemed to employ a fair methodology in analyzing traffic. It registered for the sites and browsed pages, and inspected the data being transmitted. And it sought comments from both the publishers and the data providers.
The first point that jumps out from the survey is that publishers still are not fully aware of what is going on at their sites. In fact, it's unclear how many of the publishers who claimed they no longer pass the data WSJ detected were actually first alerted to the leakage by the WSJ research. Probably most important is how the sites are handling their own registered users. These are, after all, the most valuable users for the content provider, both in terms of loyalty and depth of profile. This is where the trust bar needs to be set highest.
And in an interesting twist on conventional wisdom, it is the third-party data-mongers like Google and Facebook who are arguing they don’t want the PII anyway. This probably bears a bit more scrutiny. If email addresses are being passed to the two biggest aggregators of data on the Internet, then I think we need to know more and get greater proof than a passing denial that they don’t keep or want it.
But there is an ironic point buried in this. Perhaps these third parties are right: PII, like email, is not as valuable to the ad ecosystem as the hashed behavioral data these sites are piling up attached to your cookie. Is what you have done more relevant to their purposes than knowing your inbox?
If data falls in the woods and there is no one there to capture it, did it make a sound?
Still, even if the actual recipients of the data argue that they're not using or storing it, that is less important overall than the fact that the data is being circulated. In an age of data trading and sharing, users will trust the brand with which they feel the closest bond. And despite their massive presence in our everyday lives, neither Google nor Facebook seems to be a brand with which we feel any personal relationship. In that respect neither is a media brand in the traditional sense of having a voice to which the user can attach some identity and thus an emotional link.
That can work to Google and Facebook's advantage. They are the Sheldon Coopers of the new media world. Our expectations of them as corporate actors are lower than for NYTimes.com or WSJ. Sometimes the tech company narrative comes in handy. We persistently correct their missteps, but in the end chalk it up to corporate Asperger’s Syndrome. The media that provide rich and manmade content are still the ones users expect to protect their identity.