A Next Step In Set-Top-Box Data Adoption: The CIMM Lexicon

Much has been written about set-top-box data. Some say the data are one of the most exciting opportunities for media measurement in recent years. Others point out the vexing problems and inherent challenges. One thing is clear, however: set-top-box data offer granularity that, with some standardization and agreed-upon rules, could add to our overall knowledge of media usage.

But where to start? The advancement of set-top-box data has been hampered by concerns about privacy, footprints versus samples, editing rules and full access to the data, among other issues. Despite years of collection, aggregation and analysis, set-top-box data are still essentially in their formative stages. For one thing, new measurement terms and editing rules are being created every day. The result is a "Tower of Babel" in which suppliers and users may not always be speaking the same language.

As a first step toward standardization and fuller implementation of set-top-box data as another measurement tool, the Coalition for Innovative Media Measurement (CIMM) has embarked on an ambitious project to collect, collate and interpret the myriad terms that are being developed as part of the set-top-box data measurement initiative. The result is a comprehensive lexicon of set-top-box data terms and their definitions. Until now, there has been no single source that offered a full reference list of these metrics and their definitions.



This week at the CTAM Research & Insights Conference in Los Angeles (part of Cable Connections - Spring), CIMM debuts the first Set-Top-Box Lexicon. This is part of an effort to create a common language so that the standardization and adoption of set-top-box data as a media measurement tool can proceed smoothly and efficiently across the industry. The lexicon is a work in progress and will be continually updated.

CIMM is well positioned in the industry to help with the standardization of terms and usage. CIMM represents a coalition of end users of media measurement who have a vested interest in helping to develop strong, accurate set-top-box measurement standards that can help to grow their businesses.

The analysis of set-top-box data in the U.S. has been expanding in recent months, and there are indications that the necessary steps are being taken to standardize data coming from a wide variety of configurations of networks, hardware and software. The data that are starting to become available include not only linear TV data, but also data from DVR playback, VOD (video-on-demand) sessions, ITV (interactive TV) applications, the EPG (Electronic Program Guide) and remote controls.

But this is not without measurement constraints that need to be resolved, including the standardization of terms and metrics and an agreement on the best algorithms for such metrics as capping and latency. There are also coverage considerations, because return path data aren't available for over-the-air homes (a diminishing universe) or for satellite homes that aren't connected to phone lines. Finally, there is a range of basic technical issues, such as synchronizing time across systems, developing edit rules for data outages, accurately identifying content and ads across different boxes and systems, and managing the enormous volume of data generated. These are all being addressed now.
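
To make the capping question concrete, here is a minimal sketch in Python. It assumes that "capping" means truncating a tuning event that runs implausibly long (for example, a box left on after the TV set is switched off); the lexicon's own definition may differ. The `TuningEvent` record, the `cap_event` and `credited_seconds` helpers and the four-hour cap are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass

# Illustrative assumption: a 4-hour cap. This is not an industry-agreed value.
CAP_SECONDS = 4 * 60 * 60


@dataclass(frozen=True)
class TuningEvent:
    """One hypothetical return-path record: a box tuned to a channel."""
    box_id: str
    channel: str
    start: int  # seconds since some reference time
    end: int    # exclusive end, same clock as start


def cap_event(event: TuningEvent, cap_seconds: int = CAP_SECONDS) -> TuningEvent:
    """Return the event unchanged, or truncated to the cap if it runs longer."""
    if event.end - event.start <= cap_seconds:
        return event
    return TuningEvent(event.box_id, event.channel,
                       event.start, event.start + cap_seconds)


def credited_seconds(events, cap_seconds: int = CAP_SECONDS) -> dict:
    """Sum capped tuning seconds per channel."""
    totals = {}
    for event in events:
        capped = cap_event(event, cap_seconds)
        totals[capped.channel] = (
            totals.get(capped.channel, 0) + capped.end - capped.start
        )
    return totals


events = [
    TuningEvent("box-A", "news", 0, 3600),         # 1 hour, kept as is
    TuningEvent("box-B", "news", 0, 10 * 3600),    # 10 hours, capped to 4
    TuningEvent("box-C", "sports", 100, 400),      # 5 minutes, kept as is
]
totals = credited_seconds(events)
print(totals)
```

Different suppliers could reasonably pick different cap values or apply the cap per program rather than per event, which is exactly why an agreed-upon rule matters.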

Despite the current constraints, the potential uses of set-top-box data are many: 

  • Set-top-box data, with their larger footprint, can enable unmeasured networks (often highly targeted networks) to finally be measured with statistical stability, enabling them to attract more advertising and grow.

  • Set-top-box data can also provide the granularity needed to produce TV ratings for smaller audiences who are underrepresented in current measurement.

  • Tuning data can be provided at second-by-second intervals, making it possible to measure the audience for commercials shorter than a minute.

  • Set-top-box data offer an opportunity to measure out-of-home and second-home viewing and usage patterns.

  • Set-top-box data can help sellers of local advertising across all networks in small markets, who often find themselves without enough stable data to get credit for advertisements aired on their networks.

  • Additionally, set-top-box data enable forms of segmented advertising and creative versioning for programs, networks, dayparts and geographic areas. Set-top-box data can provide new opportunities for the TV industry to attract advertiser spending that has historically gone to direct mail.
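
As a rough illustration of the second-by-second point in the list above, the Python sketch below counts how many boxes were tuned to a channel during each second of a 15-second commercial and averages the result. The event layout `(box_id, channel, start, end)` and both function names are hypothetical, and the output describes tuned boxes, not people watching.

```python
def boxes_tuned_each_second(events, channel, ad_start, ad_length):
    """Count distinct boxes tuned to `channel` for each second of the ad.

    `events` is an iterable of (box_id, channel, start, end) tuples, with
    `end` exclusive. This layout is an assumption made for illustration.
    """
    counts = []
    for second in range(ad_start, ad_start + ad_length):
        tuned = {
            box_id
            for box_id, event_channel, start, end in events
            if event_channel == channel and start <= second < end
        }
        counts.append(len(tuned))
    return counts


def average_ad_audience(events, channel, ad_start, ad_length):
    """Average number of tuned boxes across the seconds of the ad."""
    counts = boxes_tuned_each_second(events, channel, ad_start, ad_length)
    return sum(counts) / len(counts)


events = [
    ("box-A", "news", 0, 200),    # tuned for the whole ad
    ("box-B", "news", 0, 105),    # tunes away 5 seconds into the ad
    ("box-C", "sports", 0, 200),  # never on the ad's channel
]

# A 15-second ad on "news" starting at second 100.
per_second = boxes_tuned_each_second(events, "news", 100, 15)
average = average_ad_audience(events, "news", 100, 15)
print(per_second, average)
```

Any real calculation would also depend on accurate time synchronization between the ad log and the box clocks, since an error of a few seconds matters a great deal for a 15-second unit.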

    The challenges facing the multichannel operators in bringing set-top-box data products to market aren't only technical. Regulatory and privacy concerns are being addressed to make sure that data are aggregated and de-identified. Business models also need to be built, since the data business is new for multichannel operators, whose primary concern is their relationship with subscribers and whose secondary concern is advertising partnerships.

    All these new opportunities take time to develop, and the initiatives taken now will shape the nature of set-top-box data products. CIMM's goal is to support development of the set-top-box data business, and the first step in that direction is for all of us to begin to speak a common language, starting with the lexicon.

  • 3 comments about "A Next Step In Set-Top-Box Data Adoption: The CIMM Lexicon".
    1. Andrew Crowley from ggiyt, May 12, 2010 at 12:22 p.m.

      Second-by-second data is worthless information. Networks and sponsors don't have the time to monitor it; they need total numbers. There are also no demographics when using set-top boxes. A thirteen-year-old can be watching the news and watching a car commercial. There is only one company out there that is almost foolproof, with viewers volunteering information. They have several patent-pending applications on their system, and my guess is that they will dominate the market soon.

    2. John Grono from GAP Research, May 12, 2010 at 5:05 p.m.

      An excellent, precise summary Charlene and Jane.

      Paula, STB data can NOT identify the individual viewer. This is like saying that a browser or tracking software can identify the individual on a computer - it is simply wrong in the vast majority of cases.

      In a single-person household that has no guests visiting and viewing in that household, you can safely attribute the tuning to that person. For all other households you are taking an 'educated guess'. By fusing STB data with panel data you will get a reasonably accurate conversion from household tuning to people-based viewing. For smaller channels, the "base" of tuning will be more accurate than the sample so the resultant data will be an improvement. For larger channels, the resultant data COULD be less representative - the challenge is to ensure that the fusion process does not let this happen.

      Andrew, you have made the point I wanted to make. Broadcasters (who correctly fund the majority of the research costs) are more interested in data at a programme level than the second-by-second level. It is advertisers - who are historically loath to contribute to the solution and hide behind the claim that "we gave at the office" by buying airtime - who have this interest. While we continue to monitor data streams by time-stamps as opposed to content (video and/or audio authenticated attribution), then in a small advertising unit like a 15-second ad, the chances of applying all the algorithms correctly but ending up with the wrong result remain very high.

      Keep up the discussion!

    3. Paula Lynn from Who Else Unlimited, May 13, 2010 at 8:09 p.m.

      Tongue in cheek, John. So far......
