Commentary

How Accurate Are Those Ratings? It's Easy to Find Out

by Steve Sternberg, Featured Columnist, November 30, 2016

I was one of the founding members of The Council for Research Excellence (CRE), a group of 40 top industry researchers from major Nielsen clients. As part of its Media Consumption and Engagement Committee, I helped spearhead the landmark “Video Consumer Mapping Study,” which still stands as the best original research into consumer media habits that I’ve ever seen.

The key players involved in the study were the CRE, the Ball State University Center for Media Design, and Sequent Partners. The study was conducted in 2008, and its findings released to the industry in early 2009.

While the CRE was designed as an independently operated group (which was basically set up so Nielsen could avoid government intervention in how it measures media audiences), Nielsen funded the research, at a cost of about $3.5 million. Such a major expenditure for one research study is simply out of the ballpark for any agency or network to conduct on its own — one of the key benefits of the CRE.

One of the lesser-known findings of the study was that while Nielsen’s broad television usage data for households and demos such as Adults 18-49 were remarkably accurate, as the audience segments got narrower, the gap between reported Nielsen data and observed behavior got wider.

At the time, I suggested an analysis that I thought could provide definitive insights into the accuracy of Nielsen’s reported ratings. Nielsen, of course, did not want to have anything to do with this, and we moved on to other things (as usually happens when Nielsen wants to move on to other things).

I bring this up because at the time, it seemed as though the CRE was the one industry entity capable of doing real, worthwhile independent research, designed solely to advance how audiences are measured, without any sales-related agenda or bias. In retrospect, perhaps that was a naïve notion (even though we original CRE members had pledged to do just that).

In the past, Nielsen has made some attempts to “validate” its currency audience data through telephone coincidentals and the like. My proposal was a little different. I wanted Nielsen to meter the homes of research executives from the top 20 or 30 media agencies, all the broadcast networks, and major cable networks (anywhere from 50-100 people).

For one day, these execs would write down their media behavior in minute detail: what they were watching, which platforms they were using, whether they were time-shifting, when they were fast-forwarding through commercials, when they switched channels, which commercials they saw, etc. Participating research executives could set up their own scenarios that they think might present measurement difficulties, then simply compare their actual viewing to what Nielsen reports.
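To make the comparison concrete, here is a minimal sketch of how one executive's written log could be lined up against the metered record, minute by minute. Everything in it is an assumption for illustration: the `compare_logs` function, the minute-of-day keys, and the labels such as "NBC live" are hypothetical, and neither Nielsen nor comScore is known to deliver data in this form.

```python
from typing import Dict

# Hypothetical minute-level logs: minute of day -> what was on screen.
Log = Dict[int, str]


def compare_logs(diary: Log, metered: Log) -> dict:
    """Compare a self-recorded diary with a metered record, minute by minute."""
    minutes = set(diary) | set(metered)
    # A minute counts as a match only if both sources logged the same thing.
    matched = sum(1 for m in minutes if diary.get(m) == metered.get(m))
    return {
        # Share of all logged minutes on which the two sources agree.
        "agreement": matched / len(minutes) if minutes else 1.0,
        # Viewing the exec wrote down that the meter never captured.
        "missed_by_meter": sorted(m for m in diary if m not in metered),
        # Viewing the meter reported that the exec did not write down.
        "not_in_diary": sorted(m for m in metered if m not in diary),
        # Minutes both sources logged, but with different content.
        "mismatched": sorted(
            m for m in diary if m in metered and diary[m] != metered[m]
        ),
    }


# Made-up example: the exec switched to DVR playback at minute 1202,
# but the (hypothetical) meter kept crediting live viewing.
diary = {1200: "NBC live", 1201: "NBC live",
         1202: "DVR playback", 1203: "DVR playback"}
metered = {1200: "NBC live", 1201: "NBC live",
           1202: "NBC live", 1204: "NBC live"}

result = compare_logs(diary, metered)
print(result)
```

Splitting the disagreements into missed, unrecorded, and mismatched minutes is one way to see which kind of behavior the measurement is getting wrong, rather than only how often it is wrong.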

Unlike regular viewers, senior researchers are used to doing such detailed work, and should have no problem accurately recording their activity. This can be done not just for Nielsen, but comScore -- and any other company that claims it can measure video audiences.

Not only will this provide, for the first time, a look at how accurate reported ratings are, but it will also tell us exactly where improvements to audience measurement need to be made (which I believe was the original purpose of the CRE).

This analysis should be overseen by an un-biased third party with no stake in the results — perhaps a small group of former industry researchers who are no longer working for a buyer, seller, programmer, measurement service, or producer of video content.

As the industry seems to be hurtling (stumbling?) toward “total” audience measurement and TV everywhere, we should pause for just a moment to see whether we are measuring TV anywhere correctly.


11 comments about "How Accurate Are Those Ratings? It's Easy to Find Out".


Ed Papazian from Media Dynamics Inc, November 30, 2016 at 1 p.m.
An interesting idea, Steve. In my own way, I did something similar regarding magazine total audience measurements some time earlier. In this case, we had Simmons send his top interviewers to each of my senior media execs---planning and research---and conduct a standard "through-the-book" interview as was done in his syndicated studies. My purpose was not to validate but to show my people----the end users of the resulting data----exactly what was involved. It was a most informative experience and cast some doubts on the total audience concept in several cases.

I am somewhat skeptical about your idea of getting media researchers to do what Nielsen asks of its rating panel members for a single day, then comparing the results to what the meters say. A typical Nielsen panelist is probably in the panel for two to three years and, no matter how carefully the instructions are explained at the outset---even if the panelist has the best of intentions---there is likely to be quite a lot of slippage as the days and weeks pass on and on and on---with almost 2000 hours of TV viewed per year per panelist. In other words, while one day is a start, it may wind up demonstrating that the ratings are very precise, while the same exercise, if extended to replicate the actual long term involvement of a panelist, might show quite a different result.

Steve Sternberg from The Sternberg Report, November 30, 2016 at 1:31 p.m.

This is true, Ed, but it will show which aspects of audience measurement need the most improvement. And if there are wide gaps between what researchers record as their activity and what Nielsen reports for a single day, we know there is a problem. After the single-day analysis, it can then be decided if it is worth doing for a week or so.

M Cohen from marshall cohen associates, November 30, 2016 at 5:44 p.m.

This is a good idea to check and audit the mechanics. However, I have found (perhaps due to the services that I have worked on -- e.g. kids at Nick, teens at MTV, young people at VH1 and more recently, Spanish speaking and "over the air" Univision viewers) that the "correct" composition of the sample (after recruiting, installing, keeping homes in-tab, etc.) and various field force issues are critical. There are so many places where this can go wrong making the sample not representative, (Spanish speaking installers, various materials and leave behinds, various dialects, neighborhood issues, and I could go on and on). And, of course, as all of our statistics professors always taught us, "when the sample isn't right, you really have no data."

Steve Sternberg from The Sternberg Report replied, November 30, 2016 at 5:50 p.m.

You are, of course, correct, but for this analysis, sample composition is not relevant. It is just to discover how accurate the current measurement systems are. Anything beyond that can be done in future analyses.

Ed Papazian from Media Dynamics Inc, November 30, 2016 at 6:10 p.m.

You are both right and I accept Steve's point about distinguishing between the actual measurement process and other issues---like panel composition, how the data is tabulated and weighted, etc. I should add one more item to the list, namely the distinct possibility that heavy TV viewers are over represented in such a panel while light viewers are probably under represented. This tends to happen whenever it is clear to potential panel members---or respondents in a one time only study----that recording TV set usage/viewing is the object of the panel. Same thing applies to other media. If people know you are doing a radio study, you may get more heavy listeners than you wish in your sample; an obvious magazine study will tend to gain cooperation from heavy readers to a greater degree than light readers, etc. The implications, of course, are obvious.

Christopher OHearn from 3M3A, December 1, 2016 at 2:51 a.m.

It's a good idea but I think I'd be most interested in the second point you make, about where improvements need to be made.

Not so much a case of "Is what is being measured accurate?"... more a question of "Are we actually measuring it?".

Richard Zackon from Council for Research Excellence replied, December 1, 2016 at 9:53 a.m.

As the Facilitator of the CRE who was present at its founding, I owe it to the readers of MediaPost to offer a response to this piece.

Regarding the findings from the VCM study, trained statisticians on the CRE fully expected the gap between results from the Nielsen meter sample and the observational results to grow as audience segments narrowed. It is an a priori principle of statistics, not a noteworthy empirical finding, that sampling error increases as sample size decreases.

Regarding the proposal made in the piece, Steve implies his idea was not pursued because Nielsen was opposed to it. In fact, CRE members choose research projects independently of Nielsen. It was the case that most of Steve’s CRE colleagues did not vote to support the proposal. While the methodology set forth in the column may yield some anecdotal insights, it is fraught with issues of reliability and validity and is therefore unlikely to yield definitive results. Having 50-100 TV executives write down their media behavior for one day or one week hardly qualifies as a definitive ground truth to assess ratings accuracy. If anyone believes that written self-reports by small samples warrant our confidence, they should encourage Nielsen to maintain its soon-to-be-retired local market diary methodology.

Finally, contrary to the implication here, the CRE, even after Steve’s departure, has remained an entity capable of doing real, worthwhile independent research designed solely to advance how audiences are measured, without any sales-related agenda or bias. Steve characterizes that spirit well and it has proven not to be a naïve notion. See for example, our latest work product: http://www.researchexcellence.com/files/pdf/201611/id402_guide_for_validating_audience_data_2016_11_11.pdf

Innovative thinking is to be encouraged and the VCM study is a testament to that. To improve audience measurement, however, methodological rigor must be applied and easy shortcuts are usually best avoided.

Tony Jarvis from Olympic Media Consultancy, December 1, 2016 at 1:55 p.m.

Reading between the lines, Mr. Zackon's (compensated by Nielsen!) response sadly reflects the naïveté of many of the members of the Nielsen, so-called "independent", CRE including our esteemed Steve Sternberg - sorry. Richard's "holier than thou" comments regarding the array of potentially valuable research executed notably with very limited access to non-Nielsen clients (you also have to be a Nielsen client to sit on the CRE!) merely underline the original Nielsen objective correctly identified by Steve, "basically set up so Nielsen could avoid government intervention in how it measures media audiences". That being said, the millions that have been invested by Nielsen are unquestionably commendable but we should not ignore their primary self-serving motivation. That he would draw a parallel between Steve's recommendations for further study and local market diary measurement is of course laughable. If audience measurement in the US was funded and managed by Joint Industry Committees (JICs), we would not need a CRE managed and funded by an unregulated monopoly.

Steve Sternberg from The Sternberg Report, December 1, 2016 at 3:04 p.m.

Hi Richard. I did not mean to imply that the CRE is no longer capable of doing great independent research in the mold of the VCM study. I've been a vocal advocate of the CRE since its inception. My point was that it might be best for a 3rd party to conduct and oversee this type of analysis simply because Nielsen funding a study into how accurate its ratings or ratings of its competitors are will lead to too many perception issues that would affect the "integrity" of the findings in the minds of many people who have not had direct experience with the CRE.

And just to correct the record slightly, it is true that the CRE, not Nielsen, selects the projects to pursue. But when I originally made the proposal, one of the Nielsen reps at the meeting said they would not provide the equipment for such an analysis, so it was actually never voted on.

That said, I do think the CRE continues to do vital work, and I will continue to be an ardent supporter of your efforts. And, of course, I disagree with your assessment that this type of study would not have tremendous value.

John Grono from GAP Research, December 1, 2016 at 4:24 p.m.

Here in Australia we have an independent auditor of the system and panel. This includes the auditor going to panellist homes and observing respondent behaviour and compliance, as well as the auditor also having 'test meters' available (e.g. the auditor can create a viewing session which they record or annotate, which is then compared to the 'captured' tuning/viewing behaviour).

The auditor also checks and reports to the Technical Committee (on which I sat, representing the media agencies, for 15 years) the health of the panel - installations, in-tab rates, composition etc.

And just a comment on Non-English Speaking Background (NESB) participants, yes that is an issue. For ethnic focussed channels and content, there is little doubt that their numbers are under-cooked. The issue is not so much with "the system" as across every panel and sample I have ever worked on, they are under-represented. Every type of recruitment and incentivisation scheme has only helped to improve the performance but has not totally solved the problem. NESB homes with children tend to co-operate more as the children learn English at school and proxy to the adult (assuming your research codes of conduct allow contact with children). The issue also revolves not only around the type of content they consume but also the volume of content and the method - in essence NESB is not a significant predictor of how much TV (or any medium) they consume, so it tends NOT to be a factor in the weighting regime. Unfortunately we do not have the luxury of massive sample sizes to include language as a weighting rim alongside all the other weighting rims, such as number of TVs, cable, household size, presence of children, PVR, lifestage/household composition etc.

Richard Zackon from Council for Research Excellence, December 1, 2016 at 6:59 p.m.

Steve,

First, I apologize for failing in my reply to tip my hat to you for your leadership on the VCM project. It clearly was a legacy study in which you and your colleagues can take pride.

I do not recall Nielsen declining to make equipment available to CRE for the proposed study. I’m not refuting, I’m just saying I don’t recall. Informed observers know that CRE members do not follow Nielsen’s marching orders, but they certainly respect Nielsen’s intellectual property rights. Perhaps seeing this exchange CRE may want to revisit the idea.

Finally, regarding the potential value of the research proposal, my assessment is that it would not qualify as a definitive ground truth. One of the key findings of VCM was that serious caution is called for in interpreting self-report data for media use.