Not surprisingly, Google, Facebook and Amazon are at the top, not only because they are among the largest companies in the Fortune 500, but because their whole business is built on data. The largest telco, CPG, retail and financial services brands follow as large data producers, given how many people they reach and how often they touch those people’s lives.
Yet there are still large differences among the Fortune 500 in the amount of data they have.
An easy way to think about the distribution of data wealth is to consider Zipf’s Law, which states that, given a large sample, the frequency of any one thing is inversely proportional to its rank in the frequency table. So the #2 ranked thing is ½ the size of #1 and the #3 thing is ⅓ the size of #1.
This empirical law has been demonstrated in many classes of data relating to people and human activity, such as city sizes, corporation sizes, language and more.
Applying Zipf’s Law, the largest owner of advertising data would have twice as much data as the next one. This means the top 50 would have more data than the next 450 of the Fortune 500 companies ranked by data collected.
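The arithmetic behind that claim can be checked with a short sketch. It assumes an idealized Zipf distribution in which the company at rank r holds data proportional to 1/r; the function name `zipf_weights` and the 500-company cutoff are illustrative choices, not measurements of any real company's data.

```python
def zipf_weights(n):
    """Relative size of ranks 1..n under an idealized Zipf's Law (size proportional to 1/rank)."""
    return [1.0 / rank for rank in range(1, n + 1)]


# Hypothetical ranking of 500 companies by data collected.
weights = zipf_weights(500)
total = sum(weights)

top_50 = sum(weights[:50])    # ranks 1 through 50
next_450 = sum(weights[50:])  # ranks 51 through 500

print(f"#2 relative to #1: {weights[1] / weights[0]:.2f}")
print(f"Top 50 share of all data:   {top_50 / total:.1%}")
print(f"Next 450 share of all data: {next_450 / total:.1%}")
```

Under this assumption, the top 50 hold roughly two-thirds of the total and the remaining 450 share about one-third. Real distributions will only approximate this shape.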
In truth, many brand marketers who say they do a lot of data-driven marketing don’t actually generate or access much data, considering where they might rank even within this group of 500, whose data base is already far larger than that of the 6 million-plus digital advertisers worldwide. Instead, many marketers have to rely on borrowing from the one or two leading publishers that own most user data. There are thus data owners and data renters.
The largest data owners have built sophisticated data practices. They historically hired data scientists, built data lakes and deployed data analysts to maximize their investment. They value the data because they know what it can do for their strategy, but they can also be the most vulnerable to swings in fortune because they have so much to lose.
This division is about to get worse as restrictions on data tighten. First, GDPR in Europe and California’s new law restrict data collection by companies and require greater consumer permission to use personal data.
These regulations, as a result, make it easier for the largest platforms that touch the most people, especially free services, to collect the most data.
Also, Google has decided to shut off all sharing of DoubleClick IDs in the log-level data of ads run through its advertising platform by the end of the year. Advertisers that relied on advertising to generate most of their data, and to reach more consumers than those they have a direct relationship with, will lose a vital source for learning and improving their advertising strategy.
Perhaps one outcome of these factors is that, among advertisers, everyone will be more equal, but poorer. Redistribution of wealth may narrow the gap between the top and bottom somewhat, but it can lead to an even larger gap between the very top and the rest.
It will take strong action and commitment to owning the data to change fortunes. As Marcus Startzel, SVP of AppNexus, says, "If you are a brand, you must demand."