Commentary

Cross-Platform Measurement And The Golden Spike

by Josh Chasin, Op-Ed Contributor, March 12, 2015

After viewability and fraud, what’s the biggest issue in digital metrics today? Now, I know what you’re thinking: “There are other issues in digital metrics?” Humor me for a few hundred words.

From my perspective, given that we’re taking a column-long reprieve from viewability and fraud, the biggest measurement issue currently confronting buyers, sellers and researchers in the digital space is cross-platform or multiscreen measurement. In short: the convergence of TV and digital measurement, encompassing computers, smartphones, tablets, so-called “traditional” TV, and over-the-top (OTT).

Consider:

The phenomenon of “cord-cutting” is real. I know most estimates peg this at between 4% and 8%, which may not seem like much now. But in 2004, telephonic cord-cutting (aka cell-only households) was at about 4%. Now it’s at about 44%. TV cord-cutting may not get to 44% in 10 years, but it’s not going to abate any time soon.

The digital space now has its own version of TV’s upfront: the Newsfronts, where both traditional and digital-only video providers convene to present short- and long-form program opportunities to advertisers. These programs are consumed across a diverse multiplicity of screen types.

To an entire generation, “TV” has taken on a new meaning. Recently I asked my 10-year-old daughter what she was doing. “I’m watching TV,” she called from her room. When I looked in on her, she was watching Netflix on her iPad.

As measurement providers strive to build cross-platform solutions for this new multiscreen world, the question before us is this: What do we really want from cross-platform measurement? Here are some suggestions:

Combined, unduplicated reach. To you media mathematicians out there, this probably seems self-evident. Unduplicated reach is the primary goal of all cross-media measurement systems, whether via single-source collection, fusion, or some other form of integration across data sets. Unduplicated reach is the core building block that makes other measurements possible (without it you haven’t got reach and frequency). Since individual media generally have their own measurement solutions, it is the technique for handling cross-media duplication that tends to drive the efficacy of cross-media solutions.

Reporting granularity. When we talk about granularity, we mean, how small a media vehicle audience can the system report on with sufficient robustness? And hand-in-hand with how small, how many entities are reported? (You digital folks will recognize “small” and “many” as the long tail.) Once you bring TV, mobile, and computer measurement together, you need to report on TV networks and programs across traditional TV distribution channels, OTT, the Web, the mobile Web, and on specific apps. Traditional audience measurement constructs tend to fall apart when confronted with the requirement for this level of granularity in measurement and reporting.

Media allocation, optimization and scenario planning: The classic question in cross-platform planning is, “What happens if I shift some spending from here to there?” A successful cross-platform solution must allow users to test alternative media and vehicle allocations. In many respects, this ability to optimize is the great promise of unified cross-platform measurement, because it represents a previously untapped opportunity to increase ad buy effectiveness.
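To make the first of these suggestions concrete, here is a minimal sketch of why unduplicated reach matters, assuming we somehow had person-level audiences for one campaign on each platform. The person IDs and function names are hypothetical, invented for illustration; real systems arrive at this via single-source panels, fusion, or other integration rather than a simple set union.

```python
# Hypothetical person IDs reached by the same campaign on each platform.
tv = {"p1", "p2", "p3", "p4"}
desktop = {"p3", "p4", "p5"}
mobile = {"p4", "p6"}


def unduplicated_reach(*audiences):
    """Count each person once, however many platforms reached them."""
    return len(set().union(*audiences))


def average_frequency(*audiences):
    """Gross exposures divided by unduplicated reach.

    Assumption: one exposure per person per platform.
    """
    gross = sum(len(audience) for audience in audiences)
    return gross / unduplicated_reach(*audiences)


gross = len(tv) + len(desktop) + len(mobile)       # naive sum double-counts p3 and p4
reach = unduplicated_reach(tv, desktop, mobile)    # people counted once
print(gross, reach, average_frequency(tv, desktop, mobile))
```

Simply adding the per-platform audiences overstates how many people were reached; the gap between the gross count and the unduplicated count is exactly the cross-media duplication that the measurement technique has to handle.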

So how do we get there? Cross-platform measurement should not be TV measurement with some digital tossed in; neither should it be digital measurement with a little TV on the side. Ideally, it should be the holistic joining of the two, without the baggage of either: either via brand-new measurements, or by the intelligent integration of existing systems. Of course, that’s easier said than done, because one man’s baggage is another man’s industry infrastructure.

If you look at the activity internationally -- specifically, the TV audience measurement (TAM) and Internet audience measurement (IAM) JICs -- in both cases, the movement is toward expansion of legacy currency systems to incorporate mobile measurement. This makes mobile the core focal point of multiscreen measurement.

I expect that we’re going to see the TAM and IAM providers around the world begin to approach each other about building mobile measurement assets in partnership, in order to bring their two systems together into a single holistic, cross-platform measurement system. In this scenario, mobile measurement is the “Golden Spike” -- the final stake in the ground sealing the connection between two tracks running inexorably toward each other.

Let’s loop back to my daughter, using Netflix and her iPad to “watch TV.” I think we all recognize that this is where media consumers are at these days. They move between screens and program sources, following the content. In doing so, they make our jobs more complicated than ever before. I believe that the future of cross-platform measurement will require multiple currency data sets and machine-level census data sets, and I suspect much of the action will rotate around the axis of mobile measurement, where TV and digital come together.

metrics, tv, video

13 comments about "Cross-Platform Measurement And The Golden Spike".


Tony Jarvis from Olympic Media Consultancy, March 12, 2015 at 10:35 a.m.
While it is implied, sort of, I respectfully suggest that an additional fundamental measurement principle needs to be fully embraced in the industry's move towards a "Golden Spike" that would serve all media players of any channel digital or otherwise: Measurement at the ad and at the program exposure or "contact" level, i.e. well beyond OTS to "Eyes-On" or ears-on. Without a meaningful comparable "viewable" currency across all media measurement your three principles while unquestionably critical will remain problematic.

Ed Papazian from Media Dynamics Inc, March 12, 2015 at 10:51 a.m.

I have to agree with Tony on this one. It's all very well to deal with opportunity to see (OTS), but this is not a level playing field when you are comparing various TV/video platforms. All are suspect, to some degree, regarding whether the "audience" is actually there and whether it is really engaged by the ad, but some platforms are much more suspect than others in this regard. How does one solve this problem? That's a difficult question. Trying to rely only on electronic means may not be enough. A combination of electronic indicators plus human response -- surveys about ad recall/motivation, content attentiveness, liking, etc. -- may have to be melded together in some sensible manner. Unfortunately, attempting this will take time and, I suspect, reliance will continue to be placed on electronics, because no one will be willing to wait for better indicators, or willing to fund their development. Ultimately, this will work against the "digital revolution". Haste makes waste.

Kevin Horne from Verizon, March 12, 2015 at 12:38 p.m.

At a higher level than the previous commenters, a key element in all this for me is understanding a "single viewer over time" - meaning your daughter, for example. Yes, she was watching Netflix on a tablet anecdotally, but what share was that of her total viewing time for the week? It would be great to see some focus on developing some "models" (or whatever) for video consumption. These static data points (5% cut the cord, etc.) are going to prove their complete lack of "Metrics Insider" usefulness very quickly...

Joshua Chasin from KnotSimpler, March 12, 2015 at 12:50 p.m.

Tony (and Ed)-- In no way do I disagree that this is important. I guess I was simply assuming that an audience measurement service would, a priori, measure the relevant entities that audiences accrue to (and thus it didn't need a call-out.) But you are correct, as we build cross-screen currency solutions, we must keep in mind the most granular, relevant level of exposure we can-- the program and ad/campaign.

Ed-- while I understand "haste makes waste," I also know that on April 3, 2010, Apple released the first iPad; and on April 4, 2010, we received our first client question about "When will you have an iPad panel?" As far as some technique to ascertain whether the screen the program is on actually has anyone in front of it-- agree.

Kevin-- I think "share of screen time" or some such metric will inevitably be one of the metrics we can key in on, once we have been able to properly isolate the unduplicated reach.

John Grono from GAP Research, March 12, 2015 at 7:30 p.m.

Hi Josh, Ed and Tony. This is not the million dollar question but the billion dollar question. The Holy Grail is to be able to report on the finest granularity possible, which in this case would be a "15 second video ad" (or whatever its equivalent in the future becomes) on any and all platforms and devices. And by report we mean those actually seeing the ad. My gut feel is that we need to build an 'Opportunity-To-See' system (which would be quite rigorous), and overlay it with a matrix of 'Likelihood-To-See' factors to produce ad-viewing estimates. We took this approach in Australia when building the O-O-H measurement system MOVE. We had lots of data (such as traffic counts past billboards) which, with additional survey data, we were able to 'convert' to drivers and passengers. We were then able to apply visibility factors for the driver and passengers as to what proportion would see each different sign format (e.g. larger front-on signs are better - but not if you are a passenger in the seat behind the driver). What we ended up with was an acceptable estimate. We of course did this for all OOH formats and sign types. The parallel is that with linear/time-shifted TV, the programme's audience is the OTS for the ad, but we simply don't have the LTS factors. This is where we have to do a LOT of work. This has been the case for the decades I have been involved in audience measurement - but few if any have the appetite to fund such research. The additional issue, as Tony points out, is that the OTS counts across the different media are not comparable. This is the first issue that needs to be resolved, so that one person on medium A equals one person on medium B, medium C etc. Further, the LTS is different from medium to medium, so we will need LTS factors by medium and for a variety of factors relevant to each medium. We can then have a crack at pulling them all together - fingers crossed.
Of course, once that gargantuan task is done, we will have our Apple moment as Josh points out - the following day we will be asked by an advertiser ... Can I see that score for my specific ad as my ad is better than all the other ones? I think that's when we invoke the sociologist William Bruce Cameron's famous "Not everything that counts can be counted, and not everything that can be counted counts" quotation.

Tony Jarvis from Olympic Media Consultancy, March 12, 2015 at 9:32 p.m.

For those of you that were following the Media Post discourse after Joe Mandese's piece on the Industry's concerns regarding Spot TV measurement by Nielsen, you may think that John Grono & I are in league to somehow magically "fix" audience measurement in the US by providing advertisers (and programmers) the research principles to develop truly comparable exposure data - real equivalent metrics that do not need calibration - across all media channels at the most relevant granular level for each medium per his excellent description. (Ed: Not noticing or awareness but simply "Eyes-On", "Ears-On" or contact within a visibility or audio zone - remember recall studies to measure noticing or awareness are seriously problematic and include creative and/or brand equity effects that are campaign specific.)
Our ability to address (albeit imperfect I am sure) this fundamental common metrics requirement comes from our common audience research experience - Out-Of-Home measurement. What John has described is the integrated approach, now used by TAB in the US and MOVE in OZ that was developed by Route in the UK. Measuring OOH audiences broke all the "rules of media measurement" at the beginning of this century and "we" believe it offers some terrific lessons to those willing to take the blinkers off. One of the key principles to "visibility adjustments" is to drive this OTS conversion to "LTS" or "contacts" off the primary physical media characteristics of the channel or format independent overall of the ad on the screen or on the billboard or on your watch; unless technology can measure exposure (not noticing) directly (nearly there?). In other words the exposure adjustment measurement must in the end be creative neutral or "average", as best as possible, but allow for all the key physical characteristics of the display vehicle (measured) relative to the consumer that could affect "eyes-on" to be examined - a media effect. Applying visibility (or audio) adjustments to OTS measures across a variety of formats in OOH has produced a truly common currency at the ad exposure level for all measured formats making intra-media comparisons valid and meaningful.
Josh: Unless we address this fundamental common currency issue there surely will be no Golden Spike because the railway lines have no hope in hell of meeting?
John: I trust I have this correct?

John Grono from GAP Research, March 12, 2015 at 10:32 p.m.

You sure do, Tony. MOVE in Australia piggy-backed on POSTAR in the UK (thank you Simon!) which was a national measure of billboard coverage. MOVE was five discrete markets with five different formats (roadside, retail, airport, buses, trains/trams). The de-duplication between markets was easy (basically next to nil apart from flyers) but the duplication between formats was a tough nut to crack, as was the temporal and longitudinal duplication of roadside in particular (do you always take the same route to work?). We had to use multiple primary data sources (just like we use Census data and panel data) and overlay with visibility. Route in the UK extended on POSTAR and learned from MOVE, as has TAB in the US. Incidentally, MOVE has just celebrated its fifth birthday with some nice new additions such as the composition of the audience - drivers, passengers, pedestrians (e.g. high pedestrian count could influence the creative to, say, include QR codes or WiFi interactivity). The key difference is that with OOH the medium is the message. There is no other purpose apart from advertising. With TV content <> advertising, with OOH content = advertising. As Tony sagely points out, the AMS' objective is to provide 'the average' as the media owner can't control the creative. That is, a 'bad ad' will deliver worse than the average, whereas a 'good ad' will deliver better. But in my opinion applying an 'average' LTS (based on robust research and empirical evidence) would be at least a step in the right direction. My guess is that the variation around that average for linear TV is pretty small compared to the variation between linear TV, time-shifted TV and SVOD TV. I think the 'merging' of these sources will make an LTS factor imperative, otherwise we won't be comparing apples with apples regarding audiences to ads (assuming we have conquered the issue about comparing apples with apples for programme content). Cheers.

Ed Papazian from Media Dynamics Inc, March 13, 2015 at 6:12 a.m.

Tony, I realize that ad recall measurements can be influenced by campaign and ad execution variables. The solution to that problem is to hold that aspect constant by surveying only the same ads in each medium. I strongly believe that sole reliance on electronic indicators will not get us very far in the quest for comparability between platforms. Certainly, the effects -- and possible biases -- of each indicator need to be established before it is deemed acceptable. As I mentioned earlier, I doubt that the advertising and/or the media industry has the will or even the interest to support the kinds of research that will be required to get at this issue properly. More likely, we will come up with some sort of OTS compromise that leaves so many questions unanswered that the current practice of making arbitrary media mix decisions continues in place. I sincerely hope that I'm wrong about this, but...

John Grono from GAP Research, March 13, 2015 at 7:50 p.m.

Ed, isn't ad recall the domain of the client and not part of the 'currency' measurement? If the model went along the lines of the currency providing OTS (i.e. the programme's audience estimate based on average minute audience), then an overall LTS factor (by market, by demographic, by station by half hour) that reflected the 'likely' drop-off in audience in ad-breaks for the mythical 'average ad', couldn't it be incumbent on the advertiser to test their ad against the LTS benchmark to see what that particular ad execution's likely delivery would be? No - I have no idea how we do it (well I have some ideas about some components) but if we focus on a model that just may work we can then work on the components. In this vague outline, I would see the networks (i) reporting programme performance on the OTS (ii) trading on the 'generic' LTS, while (iii) the media agencies report to the client and be judged on the 'generic' LTS, but also report to the client on the 'ad-specific' LTS - but as they don't control the creative not be judged on it. As the old saying goes ... bad media placement can kill a good ad, but good media placement can't save a bad ad. Just some thought starters.

Ed Papazian from Media Dynamics Inc, March 14, 2015 at 1:56 a.m.

John, ad recall and motivation scores have always been a factor in both the advertiser's assessments of ad effectiveness as well as the media guy's view of the ability of a medium to generate ad exposure -- at least in my experience. Tony is right, in the sense that one can expect major variations in ad recall from ad to ad, due to the nature of the product or service, how well the ad is done, the relevance of its message, whether it's part of a new campaign or an old one that is dying out, etc. Still, properly conducted studies that utilize only the same ads on all platforms, under similar conditions (same amount of clutter per break, for example, or same demos) should be a very strong indicator, not of the absolute level of exposure -- as many viewers will not remember that they saw an ad -- but, rather, of relative differences between platforms in terms of intensity of viewing, which plays a strong role in determining whatever ad exposure takes place. Couple ad recall with other respondent evaluations like attentiveness claims, plus dial switching or other electronic measures of holding power, loyalty, etc. and you may find something. Use only electronic data and what appear to be significant indicators may provide very misleading "distinctions". After all, isn't the whole point to compare platforms based on their ability to deliver ad messages, not just opportunities to see them? I have noted a number of studies where the expected differences in ad exposure, as indicated by purely mechanical indicators, were substantially refuted when commercial recall/motivation measures were introduced in TV. Anyway, all I'm saying is this: let's not walk away from human-based research just because we place greater trust in electronics. The two, if they corroborate each other, make a far more compelling case and can prevent us from making mistakes.

John Grono from GAP Research, March 14, 2015 at 8:53 a.m.

I'm sorry if you thought that was what I meant, Ed. The first stage has to be a way to differentiate the platform effect on an ad. Put another way, is an ad on TV to 1 million people (in target) the equivalent to an ad seen (not served) by 1 million people (in target) on a PC? Or a laptop? Or a tablet? Or a smartphone? Clearly there is no 'magic number' but with sufficient study of ad recall, ad takeout, and brand intention we may/should (OK, hope to) be able to get some acceptable way to equivalise the same ad across different platforms. Inherent in this would be to also study the variability between ads, creative formats etc. That is, what is the variability around that norm, and is it 'acceptable'? Following that would be what is the variability by a whole host of other factors - demographic, market, network, programme type, programme, daypart etc.? Clients could pre-test against such norms to assist in establishing both the weight of the buy, and the allocation of the buy across platforms. Again, I stress that (i) such granularity may not be possible (ii) it may not be affordable, but (iii) if we look at models of how things MAY work we MIGHT get closer to unification of what COULD be done to get closer than we already are (which every day we do nothing gets further away from where we need to be).

Ed Papazian from Media Dynamics Inc, March 14, 2015 at 9:36 a.m.

I understood you, John. I was just emphasizing my point. No problemo. As I commented earlier, I doubt that advertisers, agencies and media sellers have the stomach or the patience for such an in-depth, albeit valuable, investigation. If past experience is a guide, the media will be expected to do much of the required funding and once it is realized that the results will not always be favorable, heels will be dragged, big time. The usual "solution" is to establish broad based committees representing all factions, but this, invariably, leads to compromise and a watering down of the research to minimize the "damage" it might do. I hope I'm wrong about this but in The States, the electronic media, especially TV, often operate from a position of strength relative to advertisers as the latter rarely join together in a common front against the sellers.

John Grono from GAP Research, March 14, 2015 at 5:54 p.m.

I tend to agree, and share your concerns. One of the issues is that "the media will be expected to do much of the required funding". When we built MOVE here it was 100% media-owner funded. It basically had to be, as prior to that they only had highly discredited traffic counts, footfall counts etc. The 'compromise' was that the OTS data wouldn't be released - just the LTS data (which is what the agency/advertiser wanted). The reason was LTS / OTS = Visibility Index, and they wanted to keep the visibility factors in camera at a site-by-site level. [There were good reasons for this as some sites have multiple 'audiences' - a billboard may have a 'head-on audience' with 90% visibility and a 'side-on audience' from a different road with 10% visibility, so it would appear that the Visibility Index was just 50%.] I still think it is the industry's responsibility to provide visibility to what they are selling - i.e. the LTS of ad-breaks compared to programmes. But it is an unreasonable expectation for the industry to provide the LTS for each and every creative execution in ad-breaks - that must remain the responsibility of the owner of that content, the advertiser. But in order to retain comparability, the industry must provide access to the advertiser to 'rate' their ad via exactly the same system. In my experience, once the media owner is not paying, all the air is let out of the tyres and it grinds to a halt.
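The Visibility Index arithmetic in the comment above can be sketched roughly as follows. This is only an illustration: the function name and audience counts are hypothetical, and the 50% figure falls out only if the head-on and side-on audiences are assumed to be the same size, which the comment implies but does not state.

```python
def visibility_index(audiences):
    """Blended Visibility Index for one site: total LTS / total OTS.

    `audiences` is a list of (ots, visibility) pairs, where `ots` is the
    opportunity-to-see count and `visibility` is the share of that
    audience likely to see the sign (0.0 to 1.0).
    """
    total_ots = sum(ots for ots, _ in audiences)
    total_lts = sum(ots * visibility for ots, visibility in audiences)
    return total_lts / total_ots


# Assumption: equal-sized head-on and side-on audiences (counts are made up).
equal_site = [(1000, 0.90), (1000, 0.10)]
print(visibility_index(equal_site))

# If most of the traffic were head-on, the blended index would sit much higher.
skewed_site = [(3000, 0.90), (1000, 0.10)]
print(visibility_index(skewed_site))
```

The blended figure hides how different the two audiences are, and it shifts with the traffic mix, which is one way to read the caution about publishing a single site-level index.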