When More Data Is Too Much Data

Venue: A top-level ad industry fireside chat

Moderator: “Can there ever be too much data for a brand?”

Industry exec: “Interesting - I have never been asked that question before.”

While we praise the opportunities to create more insights from Big Data and to make better decisions, nobody seems to question what the right amount of data is. Just because Big Data by definition requires more data does not mean it is a good idea to keep collecting.

There are three more or less obvious consequences of gathering ever more data: it’s costly, it slows down decisions, and it can be outright misleading.

Data collection is expensive

Together with the CFO of a major media company, we recently looked at just a couple of subscriptions to typical industry data providers like Nielsen, comScore, etc. In a heartbeat, he was able to cut nearly $40,000 a year simply by questioning what was really being used (let alone analyzing what is instrumental to making a decision).

Other data costs include storage, bandwidth, and processing, which can quickly run into thousands of dollars. In one example, keeping a record of all the paid and earned Facebook data that a mid-size social media provider manages will increase the storage bill alone by $80,000 in one year.

Keeping data becomes a liability, too. Certain manipulations of data sets, like aggregation, might not be allowed under the original license agreements, but who is tracking that, and how? Even discarding data requires effort, as somebody needs to decide what can be removed and then has to get around to actually purging it. The data assessment project at a major agency holding company took four months and countless meetings, emails, conference calls and spreadsheets before anybody knew how big the problem was.

Data can be overwhelming

Somehow the term "information overload" isn't trending anymore. According to Google Trends, searches for answers to information overload are at just 20% of their 2004 level. However, it is obvious that few people really know what relevance that newfound "treasure chest" of data actually has, and thus they waste their time with useless data. That leads to further delays in making decisions, a behavior that spreads like a disease: once thought to be confined to large corporations, it is now becoming pervasive even in nimble start-ups.

Additional data can be misleading

Data scientists constantly warn about the danger of uncontrolled data visualization, where pretty correlations are mistaken for causation and lead to wrong conclusions. That spike in sales shortly after a great increase in Facebook "likes" certainly means we should do another Facebook campaign, shouldn’t we?

While all of the points above are rather straightforward, a more fundamental psychological reaction to more data is less obvious: making wrong decisions because of the active pursuit of additional data, even when that data was deemed non-instrumental beforehand (that is, it should not influence the decision). In a study more than 25 years old, researchers found that people felt compelled to take data into account as soon as it was offered. And when they did, they disregarded their original decision criteria and overweighted the additional information.

Imagine your data scientists have created a well-balanced media mix model, and just before you commit budgets, somebody suggests waiting for the latest Nielsen-Twitter data on engagement from unique authors with the shows in your media plan. According to the psychological study mentioned above, you will most probably wait for the outcome, and your go-ahead decision will be influenced by the relative weighting of the tweet engagements, even if it has no relevance to your model whatsoever. Say "Dancing with the Stars" had just 26,000 unique authors versus 34,000 for "The Voice," both at 66,000 tweets, and you don’t have "The Voice" in your media plan. Does it matter? Should it matter?

Ray of light

Kimberly-Clark CMO Clive Sirkin was recently quoted as saying, "We can't see where we are supposed to go, so we naturally go where data is."
As a company moves from "small data" to Big Data, it makes sense to get your initial batch of data sources connected to the business intelligence infrastructure and start populating the first models. However, once confidence in those models is high and the expected data is flowing in, it’s time to avoid the pitfalls of more data and go where you have planned to go. A good data management strategy will document these choices precisely.

It’s just like the old story about the lost keys in a dark alley. At the beginning, it makes sense to search where the light (data) is. But once you are able to form a solid model retracing your steps from the bar to the car and get your own flashlight, you need to follow that search plan relentlessly. Just because somebody turned on a big floodlight on the other side of the street doesn’t mean you should go and search there.

So always ask yourself if you really need more data. Because if you don't know what to do with that data, don't even bother with it.

Tags: big data, metrics
2 comments about "When More Data Is Too Much Data".
  1. Nathan Easom from WAYN (Where Are You Now?), June 4, 2014 at 6:22 p.m.
    Really good article... made some similar points at a presentation this week at the Eye for Travel conference in Miami.
  2. Ed Papazian from Media Dynamics Inc, June 5, 2014 at 10:38 a.m.
    Very good article. There are some people who really believe that the typical media/marketing executive of tomorrow is going to spend most of his/her business time plugged into one or more "big data" sources, constantly weighing and sifting the findings and reacting with instantly placed media buys and cancellations. Sorry, but that's not the way most humans behave, or want to behave, unless we are talking about a new kind of media/marketing executive: most likely, a robot. Nobody is against "big data", provided the data is really valid, not just available in huge doses broken down into minute detail. But advocates of big data as the ultimate solution will have to recognize that you can't overload people with data and expect them not to tune out. Instead, you must come up with ways to interpret the information and validate whatever assumptions you have made to help your users cut through the flood of statistics you are providing. Also, the media (TV, for example) must be clued in so rapid-fire revisions to a media buy can be made. If such rapid reactions are not feasible, as seems likely, then what exactly do big data users do with the insights they glean? And how timely is their response going to be?