Commentary

Tricks With Clicks

by Jonathan Blum, November 29, 2005

A guide to surviving online measurement dilemmas.

We all know the adage: That which can be measured is that which can be bought. But what happens if that which can be measured can't be measured exactly?

Online media and marketing professionals are asking themselves that very question as conflicting metrics and evolving standards and tools blur what should be a crisp, bit-by-bit Internet vision of the market. Cloudy data isn't fatal online. The Internet ad market still posted a 26 percent increase in revenues in the first half of 2005 compared to the same period last year. It would appear that the online industry is comfortable doing business with the existing, albeit imprecise, tools. But the industry still faces the stubborn problem of wrestling with less-than-ideal measures.

"In online everything is measurable," says Scott Spencer, vice president of product management at DoubleClick, the online ad-serving firm. "Some [of the things measured online] have meaning, some do not. It can be hard to tell them apart."

Of the many examples of muddy online data, one of the worst is how far apart reach and frequency calculations can get depending on inputs. According to an ad agency insider, reach and frequency results (which seek to capture the number of users who experience a media asset in a given period) can vary widely depending on what measure is used: panel figures from Nielsen NetRatings or comScore Media Metrix, ad server information from Atlas and DoubleClick, or actual server records from the content publisher.

Though specific audience data are considered proprietary information by publishers like espn.com, which declined to comment for this article, measurement companies confirmed that disparities in calculations can exist depending on which methodologies are used.
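To see how the same traffic can yield different reach and frequency figures, consider a rough sketch. It assumes the common definitions (reach as the number of distinct users, frequency as average impressions per user) and a made-up impression log; the field names and values are hypothetical and do not represent any vendor's actual method.

```python
from collections import Counter

def reach_and_frequency(impressions, key):
    """Return (reach, average frequency) when users are identified by `key`."""
    counts = Counter(imp[key] for imp in impressions)
    reach = len(counts)
    frequency = sum(counts.values()) / reach if reach else 0.0
    return reach, frequency

# Hypothetical log: two real people, three ad impressions each.
# Both sit behind one office address, and person A deletes a cookie midweek.
LOG = [
    {"ip": "10.0.0.1", "cookie": "a1"},
    {"ip": "10.0.0.1", "cookie": "a1"},
    {"ip": "10.0.0.1", "cookie": "a2"},  # person A again, with a new cookie
    {"ip": "10.0.0.1", "cookie": "b1"},
    {"ip": "10.0.0.1", "cookie": "b1"},
    {"ip": "10.0.0.1", "cookie": "b1"},
]

by_ip = reach_and_frequency(LOG, "ip")          # one "user" seen six times
by_cookie = reach_and_frequency(LOG, "cookie")  # three "users" seen twice each
```

In this toy case the true answer (two people, three impressions each) matches neither count, which is the kind of disparity the measurement companies describe.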

"The problem has always been translating the information that a server can record into estimates of user behavior," says Jeffrey Goldberg, a consultant with Goldmark Systems, a marketing and information technology company. "The data was easy to collect and people fooled themselves into believing that it contained the information that they wanted."

The larger dilemma is whether what is now merely inconvenient will metastasize into a major problem as the online ad industry becomes more technologically sophisticated. All advertising -- online and off -- is facing challenges from a raft of game-changing media platforms, including wireless, word-of-mouth marketing, video-on-demand, and advergaming, each of which brings its own pressures to measurement methodologies. These new forces are bound to exploit weaknesses, and in turn, to stymie the process of buying and selling ad inventory. Insiders say this will eventually affect pricing.

"In my mind, what's coming down will finally atomize the market," says Court Cunningham, chief operating officer at Community Connect, an urban online social space. "You are already seeing the squeeze on the old model with Google auctioning off keywords without a human media buyer. The pace will only increase."

Though new technologies will always be agents of change, the real monster in the box is, as always, the client.

"We want the transparency that the Web gives us," says Peter Goodnough, customer relationship manager at Samsung Electronics America. "We are not about creating proxies for effectiveness."

The key to surviving the coming changes in ad measurement lies in understanding what makes the Web so challenging to quantify accurately. Here are a few rules for Web measurement.

1. Online ain't 'The Matrix'

Statistical life is not like the movies. Reality and statistical models are far, far apart. In fact, in the strictest sense, the only thing that can be proven with figures is that something is not so, and even then only within strictly defined boundaries of arithmetical certainty. Consequently, Web data, for all its abundance and complexity, is better suited for comparisons and trends. Beware of absolute predictions or hard figures.

The Quote: "The real question is not whether data is accurate in absolute or census terms. It's whether it's consistent," DoubleClick's Spencer says.

The Bottom Line: Online is a trend thing, baby. Any time you hear a hard number, beware.

2. Apples and Oranges

Nielsen NetRatings, comScore, Hitwise, DoubleClick, and other commercial online measurement providers are all reputable organizations trying hard to deliver products and services. But they are competitors, and it can be tough to align their data in common statistical terms.

Though there are standards bodies, like the Interactive Advertising Bureau and the Advertising Research Foundation's Online Media Council, that are trying to force a common standard into the market, the consensus is: Don't hold your breath.

The Quote: "There are the makings of a data oligopoly now, and there aren't the dedicated research staffs at agencies to get them to work together," says Mike Zeman, director of insights and analytics at Starcom IP.

Bottom Line: Data is only as good as its source. Compare and contrast between sources at your own risk.

3. Online ain't the telephone

Think about it. When you log on, go to Google, and log off, that Google page stays there. It's cached on your computer's hard drive so you can still see it and work it. That local cached activity is, at its essence, beyond measure. There are rumors afoot, though, that eventually new operating systems such as Windows Vista will report such by-file usage.

Caching would be a minor problem if 80 percent of all broadband use didn't come from the office. But at-work usage is what's driving Web marketing, and most business networks have servers sitting behind firewalls that cache content much as your local computer does. So content and activity are stored locally, out of reach of direct online measurement. So-called cache-busting technologies seem to account for the problem. Nevertheless, it's telling that no hard estimates exist for how big a problem caching is.

The Quote: "We use direct ISP [Internet Service Provider] information to provide excellent traffic data, but no, we cannot directly measure the effect of caching, though we feel it is probably minimal," says Bill Tancer, general manager of global research at Hitwise, a company providing traffic data.

Bottom Line: Caching isn't a market killer, but many wonder why impressions don't add up to page views.
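One widely used form of the cache-busting mentioned above is to tack a random value onto each ad request so that no cache recognizes the address as one it has already stored. The sketch below illustrates the idea only; the function name, the `ord` parameter name, and the example address are assumptions made for illustration, not any ad server's documented interface.

```python
import random
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def cache_busted(url, param="ord", rng=random):
    """Append a random query parameter so caches treat each request as new."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    # The parameter name is hypothetical; any otherwise unused name would do.
    query.append((param, str(rng.randrange(10**12))))
    return urlunsplit(parts._replace(query=urlencode(query)))

# Hypothetical ad request; the appended number changes from call to call.
example = cache_busted("http://ads.example.com/banner.gif?site=news")
```

Because the address differs on every request, a proxy or browser cache has nothing stored under it and must go back to the ad server, where the impression can be counted.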

4. Online ain't the phone book

Online addresses are expensive. So most service providers, like phone companies, cable companies, and Internet Service Providers, don't give a single Web address to each user. Instead, these Internet service providers use a pool of so-called dynamic addresses to offer access.

These dynamic addresses cut the hard tie between the network session and the discrete user. While the industry has developed many strategies, such as cookies, to work around the problem, the view of what a user is doing is undoubtedly dim.

The Quote: "If you are relying on a single figure like traffic information or something like that, you are probably making a mistake," says Jeff Lanctot, vice president of media at Avenue A/Razorfish.

Bottom Line: Want yet another reason why tracking who's doing what online can get spooky? Look to dynamic Web addressing.

5. Cookies ain't perfect

The advertising industry's answer to the problems of caching and dynamic addressing has been, among others, cookies. These are small bits of data that browsers keep to create discrete user identifiers that report behavior no matter what the network topology. There are two problems with cookies. One, consumers are getting wise to the fact that they're being tracked and are deleting cookies. The industry seems to be managing the problem with extrapolation techniques based on referrer information, log-on data, and smarter cookies. Two -- and more pernicious and problematic -- is that cookies are capturing data of immeasurable complexity.

Browsers don't render fixed images like movies or television. They only interpret instructions, assembling previously defined digital objects locally into viewable frames. That means there are literally hundreds of thousands of visual elements rendered over tens of millions of users, for hundreds of thousands of Internet minutes. Try merely thinking about that, much less quantifying it statistically. Yet the industry is asking for a tiny bit of code -- a cookie -- to do just that. Cookies are being asked to describe monstrously complex Web scenarios day in and day out, and to report that information to Web trackers. As a result, cookie data can fall seriously out of mathematical whack.

For example, those close to the ad serving process say that often a full session of online behavior isn't being captured by the cookie stream. Only the last bit of code in the cookie file is being shipped back to the ad server for monetization.

The Quote: "Even the most basic things like file size can distort usage," says Eric Garland, chief executive officer at BigChampagne, a measurement firm specializing in peer-to-peer applications.

Bottom Line: In the words of Bobby Baccalieri of "The Sopranos," cookies "got limits too, ya know." They're great tools, but there is only so much they can be expected to do.

6. Online ain't ever finished

Again, think about it. What happens to all those lovely online measurement tools when you migrate your browser from Internet Explorer to Mozilla's Firefox? All your cookies and tracking information must be changed. Statisticians do something called "biasing" to compensate for the shift of client applications. Biasing for software migration isn't the end of the measurement world. Far from it. Biasing occurs all the time with excellent results. But adjusting numbers for the software shift is specific to the Internet and can become complicated very quickly as browsers, operating systems, and ancillary software migrate, update, or fall off the Web.

Good biasing is critical, particularly for measurement tools that use installed software to measure usage and develop demographic panel data. What is an art form in one instance can very quickly turn into troubled data.

The Quote: "Any estimate that comes from a panel is fundamentally tied to the quality of that panel. Size itself does not equal quality," says Manish Bhatia, senior vice president of product services at Nielsen NetRatings.

Bottom Line: Hate old-school Arbitron numbers all you want, but those panels are 80 years old and the industry knows how to bias them.
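Assuming that "biasing" here means what survey statisticians usually call weighting, the adjustment can be sketched in a few lines: each panelist counts for more or less depending on how over- or under-represented that panelist's group is. The browser shares below are invented for illustration, and the function is a minimal sketch rather than any measurement firm's procedure.

```python
def panel_weights(panel_counts, population_shares):
    """Weight per panelist in a group = population share / panel share."""
    total = sum(panel_counts.values())
    return {
        group: population_shares[group] / (count / total)
        for group, count in panel_counts.items()
        if count  # skip empty groups to avoid dividing by zero
    }

# Hypothetical panel: 90 Internet Explorer users and 10 Firefox users,
# measured against an assumed real-world split of 80 percent to 20 percent.
weights = panel_weights(
    {"ie": 90, "firefox": 10},
    {"ie": 0.8, "firefox": 0.2},
)
```

Here each Firefox panelist ends up counting double, while each Internet Explorer panelist counts slightly less than one. When browser shares shift, the weights have to be recalculated, which is where the complication sets in.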

7. Online ain't even all human

There is one thing about online measurement that is like the movies: bots. Taking a page right out of "The Matrix," a lot of online traffic isn't from flesh-and-blood humans, at least not directly. Automated search agents called bots are used by everyone from Google to hackers to search the Web. Bots can affect traffic, page views, and virtually every other piece of data collected online.

Worse, bot research and scholarship are among the most controversial and secretive topics on the Web, so solid facts are hard to come by. Therefore, advertisers would be wise to discount all their data for what an automated program might do to it.

The Quote: "I assume that about half of what is coming onto the sites we track is automated, and I think that holds true for everybody else out there as well," says BigChampagne's Garland.

Bottom Line: Yes, they are bizarre, but bots cut to the heart of what makes the Web good or evil. Understanding bots is vital.
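Discounting for bots can be pictured as two steps: throw out traffic that identifies itself as automated, then shave the remainder by whatever share you assume is automated but unannounced. The sketch below is illustrative only; the token list is far from complete, the user-agent strings are made up, and the discount rate is an assumption the analyst supplies, not a measured value.

```python
BOT_TOKENS = ("bot", "crawler", "spider")  # illustrative, not exhaustive

def is_probable_bot(user_agent):
    """Flag user agents that announce themselves as automated."""
    ua = user_agent.lower()
    return any(token in ua for token in BOT_TOKENS)

def human_page_views(user_agents, hidden_bot_share=0.0):
    """Drop self-identified bots, then discount the rest by an assumed share."""
    remaining = sum(1 for ua in user_agents if not is_probable_bot(ua))
    return remaining * (1.0 - hidden_bot_share)

# Hypothetical log of four page views, one user-agent string per view.
AGENTS = [
    "Mozilla/4.0 (compatible; MSIE 6.0)",
    "Googlebot/2.1",
    "Mozilla/5.0 Firefox/1.5",
    "ExampleSpider/1.0",
]

filtered = human_page_views(AGENTS)         # self-identified bots removed
discounted = human_page_views(AGENTS, 0.5)  # plus an assumed 50 percent haircut
```

The 50 percent figure simply echoes Garland's working assumption of "about half"; bots that disguise themselves as ordinary browsers slip past the first step, which is why the second, cruder step exists at all.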

8. Tracking P2P is tough

You'd think that tracking peer-to-peer downloads and file swaps would be simple, but it's not. Think about it. What can you track? Say you want to download Madonna's new single "Hung Up." You go to Kazaa, BearShare, or another P2P service and enter the phrase "Hung Up." A list of files comes up, but did you get the Madonna track? Maybe. Maybe not. For example, you may get a list of files with references to the words "Hung Up," but not the actual song.

With P2P downloading, Garland notes there are no base level market surveys to benchmark the data in the peered world. There are no Nielsen panels or anything comparable.

The Quote: "Measuring peered activity accurately is fairly difficult," says BigChampagne's Garland.

Bottom Line: Don't count on gleaning effective data from P2P systems.

The Final Word

Online measurement probably works better than any other media measurement system. But it's far from perfect, and if you're pulling your hair out trying to get, say, impressions and page views to reconcile, don't bother. That's the Internet.

Measuring the Web is a bit like a trip through Willy Wonka's Chocolate Factory: It produces a marvelous product, but at the cost of great confusion and drama.
