Commentary

On Measurement Standards

by Josh Chasin, Op-Ed Contributor, December 16, 2008

At the Interactive Advertising Bureau's Audience Measurement Leadership Forum this past Monday, its Audience Measurement Reach Guidelines document was released for public comment. This document, over 18 months in the making and a joint initiative between the IAB and the Media Research Council, goes a long way in helping sort out the differences in methods different sources use to count and report on Internet audiences.

Perhaps most helpful, the document makes the point that the term "Unique" is itself misleading. Unique Visitors or Users (i.e. People), Unique Browsers, Unique Devices, and Unique Cookies are all very different things, and the user of audience measurement data must understand these differences in order to correctly interpret the data, notes the document.

One of the foundational principles upon which the guidelines are based is that "client-Initiated Counting is crucial... [and] that counting should occur on the client side, not the server side, and that counting should occur as close as possible to the final delivery of an ad to the client."

I think this finding, coming from the IAB, will have a huge impact on the dialogue taking place in the industry around audiences.

Publisher executives are accustomed to seeing regular reports from their Web analytics group that track KPIs, logically including one called "Uniques." When people-centric data provided by market research suppliers deviate from internal site-centric Unique counts, much hand-wringing (and worse) ensues. The reality is that site-server data do not measure people, but instead count unique cookies, which, because of cookie deletion and multiple browser usage, can greatly overstate the true number of different people visiting. I am hopeful that the IAB document will be a great tool for educating all parties in the industry about the differences between site-centric data (aka Web analytics data) and people-centric audience measurement data.

The document also explains appropriate filtrations that should be applied to site-centric counts in order to reasonably compare them to people-centric counts. One such filtration is to exclude international traffic. On more than one occasion, I've worked with a publisher who was convinced that comScore Media Metrix data were undercounting their audience because they compared our U.S. Media Metrix data to their internal Unique counts. After some investigation, we discovered that the internal data were not filtered to exclude international traffic, and that indeed the worldwide Media Metrix data lined up very well against the (worldwide) internal data.

At the IAB Leadership Forum, we broke into discussion groups to talk about the document. In my section, a gentleman who runs the internal analytics group at a leading publisher (who shall remain unnamed, but I get the newspaper they publish delivered at home each day) noted that internal data cannot be used to track duration. There is a "first page" issue and a "last page" issue. The first page issue pertains to one-page visits. Without a second event (like a second server call), site-centric data cannot assign duration to a single-page visit. And, comScore data show that there are an awful lot of single-page visits. The last page issue is similar. On a multi-page visit, site-centric data can track duration, except for that accruing to the last page. There is generally no way to determine the end of the visit based on any observable trigger activity.
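The first-page and last-page issues can be illustrated with a minimal Python sketch. It assumes a visit is logged as a time-ordered list of (URL, timestamp) page requests and that a page's duration is inferred from the gap to the next request; the function names and sample visits are hypothetical and not taken from the IAB document.

```python
from datetime import datetime, timedelta

def site_centric_durations(page_views):
    """Per-page durations observable from server-side request timestamps.

    page_views: list of (url, datetime) tuples for one visit, in time order.
    A page's duration is the gap until the next request. The last page has
    no following request, so its duration is unknown (None).
    """
    durations = []
    for i, (url, ts) in enumerate(page_views):
        if i + 1 < len(page_views):
            durations.append((url, page_views[i + 1][1] - ts))
        else:
            durations.append((url, None))
    return durations

def measurable_visit_duration(page_views):
    """Sum only the durations the server can actually observe."""
    known = [d for _, d in site_centric_durations(page_views) if d is not None]
    return sum(known, timedelta())

# Hypothetical visits for illustration only.
t0 = datetime(2008, 12, 15, 9, 0, 0)
multi_page_visit = [
    ("/home", t0),
    ("/article", t0 + timedelta(seconds=40)),
    ("/video", t0 + timedelta(seconds=100)),
]
single_page_visit = [("/home", t0)]

print(measurable_visit_duration(multi_page_visit))   # time on /video is missing
print(measurable_visit_duration(single_page_visit))  # no second event, so zero
```

In this sketch the multi-page visit is credited only with the time before the last request, and the single-page visit gets no duration at all, however long the person actually stayed.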

Consequently, as duration becomes more and more important to publishers and advertisers, I expect people-centric data to also become more important. Without measuring people, you can't really measure duration.

But, please, go to the IAB Web site and review the document for yourself. Be a part of the industry coalition that moves our collective understanding and use of research forward.

6 comments about "On Measurement Standards".

Joshua Chasin from KnotSimpler, December 16, 2008 at 4:43 p.m.
Just an amendment to my own column. The MRC isn't the "Media Research Council." Of course it is the Media Rating Council. My bad.

John Grono from GAP Research, December 16, 2008 at 5:46 p.m.

Hi Josh. I have to concur 100% with both the AMRG and your post. It is refreshing that such an august body would stress that the measures have to be as close as possible to final delivery to the client. Hallelujah!

I would stress to all readers of Josh's post to download and read the document. The examples of calculating duration (and ensuring that the sum of reported durations across multiple sites or pages do not exceed total duration) are very concise and clear.

The mantra I have adopted down here in Australia while wrestling with these issues is "browses not browsers" and "users not uses". My other catchphrase that we need to shift online from being the "most countable medium" to the "most accountable medium" is greeted with shrieks of horror!! But when you explain that the top publishers when aggregated account for around 250% of the "uniques" in Australia - that is, 40 million unique users when we have a population of 21 million of which around 75% are online in any single month - they begin to see the problem that advertisers look very warily on server-centric data.

Cheers,

John Grono
GAP Research
Sydney Australia

Richard Markus, December 16, 2008 at 8:17 p.m.

In looking at the list of report contributors, publishers seem to be overwhelmingly represented by very high traffic sites or their parent organizations. Nearly all the other contributors are agencies and third party reporting companies with (obviously) a huge stake in the outcome of this standardization effort.

Standardization is a noble goal. But is it standardization if the rules are being written by those with the most to lose?

It's with a jaundiced eye that we review claims that a third party, especially one using mysterious and proprietary panel-based extrapolation methodologies, is more accurate at describing our traffic (visits and demographics) than our own web dev team and in-house survey tools.

Let's make sure that the interests of all parties are represented, not the least of which being the small and medium publishers, for-profit and nonprofit alike.

We'll pore over the document here at Mother Jones over the coming days and happily share comments with IAB and the committee.

Lameness Plocko, December 16, 2008 at 10:01 p.m.

Let me get this straight - you guys are suggesting Google hasn't thought about these things?

Secondly - you suggest only two things that would make the count inaccurate for site based stats: Cookie deletion and multi-browser use. How often do people actually delete cookies? Most people don't even know what they are! As for browsers? Only hard core users bother to open multiple browser types on a regular basis.

You may be correct about duration measurement for now, but until comScore and Nielsen are free to all users, I'll take this type of study with a pretty heavy handed grain of salt.

John Grono from GAP Research, December 17, 2008 at 9:01 a.m.

Richard, you are correct in that a panel will NOT be able to measure accurately smaller sites and the 'long-tail'. This is simply due to the small sample size. Clearly what is needed is a 'hybrid' solution which merges the benefits of client-side observed measures with the traffic 'census' data. Think of this as yet another algorithm - slightly less mysterious than those used by search engines to return site lists in a sorted order - but one which can say that for every x-thousand page requests onto the server, then y-thousand of those page requests were actually completely rendered on an active browser on a computer on which a human was logged in. For smaller sites there is generally a strong correlation between traffic and people, less so for larger sites. I certainly look forward to your comments Richard and I understand your skepticism.

Regarding the comments from 'Lameness' ... sadly I am suggesting that if Google et al. have thought of these things, then they have done little to ameliorate the issues associated with them and the inherent overstatement of audience data. I do not think that Josh (nor myself) were saying that there are only two issues - there are a myriad. As for how often do people delete cookies, the answer is that it is a LOT more frequently than you would suspect - Josh may be able to jump in with some firm data, but I seem to recall a stat that something like half of all cookie deletion is done by some 2.5% of the online population - those being the heavier users. Also, some sites generate temporary cookies only which are deleted at the end of the site-session - they are deleted without the user's knowledge. Some people block cookies (which understates the unique audience) while some like myself delete cookies daily. And as for browsers, you can have multiple tabs open in a browser, but a person can only look at a single tab at a time (technically you can have paned tabs - but look/read only one at a time). If a page that is served and fully downloaded goes to a browser or browser-tab that is not looked at then how is it fair to count that as an impression? While the numbers may be (are?) low for these instances, the fact is that site-centric measures have absolutely no way of knowing this. Therefore, they assume that anything that is requested (bot, spider, human) is fully downloaded and always viewed in its entirety. It is this assumption, due to the lack of client-side knowledge, that causes the overstatement of online audiences. And as for why comScore or Nielsen Online should be free - when it costs good dollars to recruit and conduct client-side research such as this that quantifies such previously undiscussed issues - well that beats me. Next we should get free petrol because we have already paid for the car?

To put the issue simply, and using roughly correct Australian data (I don't have the most up-to-date to hand), here are the metrics:
* Australia's population 2+ is 21m people (Bureau of Statistics)
* In any single month between 75% and 80% go online (the measure varies according to the age-group of the research conducted - but they all cluster in this range)
* This gives a monthly online audience of around 16m people - maybe as high as 16.5m
* When you aggregate the server-data for just a handful of the major publishers you get a unique (unduplicated) audience of 40m.
* However, when you use a panel (which is not good at collecting online usage in the workplace as most businesses don't approve of such things) you get an audience of around 12m.

That is, the server-side data OVERSTATES the gross audience by almost 100%, while the panel-data UNDERSTATES the gross audience by around 25%. This is why we're looking towards a hybrid system to 'bootstrap' the panel upwards and to 'discount' the server-data downwards. Isn't it amazing that the much maligned panel is CLOSER to the truth than the much lauded census of server data! Until we resolve this issue of gross audience measurement - then I am afraid everything else is academic. I am also very afraid of marketers' likely reactions to the magnitude of the over-statement - we have to get hybrid working in a hurry! It won't be perfect to start with until some of the algorithms are tested and recalibrated over and over again, but it will sure be better than either server-side or panel-based data alone!
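The arithmetic behind these figures can be checked with a short Python sketch. It uses only the rounded numbers from the list above; the choice of baseline is an assumption, and the helper name `pct_diff` is hypothetical. Measured against the 16m online audience, the 40m server figure is 2.5 times the baseline (150% above it), which matches the "around 250%" quoted earlier in this thread. Measured against the 21m total population it is roughly 90% above, which is closer to the "almost 100%" wording.

```python
# Rounded Australian figures quoted in the list above.
population = 21_000_000
online_audience = 16_000_000   # roughly 75% to 80% of the population
server_uniques = 40_000_000    # aggregated server-side "uniques"
panel_audience = 12_000_000    # panel-based estimate

def pct_diff(measured, baseline):
    """Percentage by which `measured` is above (+) or below (-) `baseline`."""
    return (measured - baseline) / baseline * 100

# Sanity check: the quoted online audience sits inside the 75%-80% range.
assert 0.75 * population <= online_audience <= 0.80 * population

server_vs_online = pct_diff(server_uniques, online_audience)
panel_vs_online = pct_diff(panel_audience, online_audience)
server_vs_population = pct_diff(server_uniques, population)

print(f"Server vs online audience: {server_vs_online:+.1f}%")
print(f"Panel vs online audience:  {panel_vs_online:+.1f}%")
print(f"Server vs population:      {server_vs_population:+.1f}%")
```

Whichever baseline is used, the panel estimate lands closer to the online audience than the aggregated server data does.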

John Grono
GAP Research
Sydney Australia

Joshua Chasin from KnotSimpler, January 2, 2009 at 11:28 a.m.

Note to Lameness:

1. I just reread my column, and I'm pretty sure I suggested nothing at all about Google.

2. How often do people delete cookies? Read the literature on the topic. About 30% of web users delete cookies; these users see about 4.7 cookies per site per month, and that can lead to a 250% overstatement in Uniques.
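To see how cookie deletion inflates a cookie-based Unique count, here is a rough Python sketch. It assumes a deliberately simple model that is not taken from the cited literature: every non-deleter is counted as exactly one cookie per site per month, and every deleter as 4.7. Under that assumption the average comes out at a little over two cookies per person, so the 250% figure presumably reflects a different base or factors this model leaves out.

```python
# Figures quoted above; the one-cookie-per-non-deleter model is an assumption.
deleter_share = 0.30        # share of web users who delete cookies
cookies_per_deleter = 4.7   # cookies per site per month for those users
cookies_per_non_deleter = 1.0

def cookies_per_person(share, per_deleter, per_non_deleter=1.0):
    """Average number of distinct cookies a site sees per real visitor."""
    return (1 - share) * per_non_deleter + share * per_deleter

inflation = cookies_per_person(deleter_share, cookies_per_deleter,
                               cookies_per_non_deleter)

# Example: a site with 1,000,000 real monthly visitors under this model.
real_visitors = 1_000_000
cookie_uniques = real_visitors * inflation

print(f"Cookies per person: {inflation:.2f}")
print(f"Cookie-based Uniques for {real_visitors:,} people: {cookie_uniques:,.0f}")
```

Even this conservative model roughly doubles the count, which is why site-centric Uniques and people-centric audience figures diverge so sharply.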

To Richard from Mother Jones: I am loath to speak for IAB, but this is a period of public comment on the document. I'd urge you and your organization to review it and make the appropriate comments back to IAB. This is the window for a broader spectrum of interested parties to participate.