Validating Attribution Models
Attribution models have emerged as a powerful tool for helping advertisers understand which parts of their marketing efforts drive sales. An attribution model assigns partial credit to each advertising event that influenced a user to convert. These models generally fall into two categories: simple models, which assign credit using predetermined weights, and advanced models, which infer credit statistically from the data itself.
All simple attribution models have rules for assigning credit to each touchpoint on the path to conversion. Last-event attribution assigns all the credit to the last ad, while even attribution spreads the credit out evenly. Other simple models assign each event a predetermined share of credit depending on its position in the sequence of events leading to a conversion. Typically, the first touchpoint to reach a user is called the “introducer,” the last touchpoint is called the “closer,” and every touchpoint in between is called a “promoter.” Introducers and closers typically receive outsized credit, while promoters divide up the rest.
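These rule-based schemes can be sketched in a few lines of code. The functions below are illustrative, and the 40/20/40 split in the position-based model is an assumed weighting, not an industry standard; each function returns one credit share per touchpoint in the path.

```python
def last_event_credit(path):
    """Last-event attribution: the closer gets all the credit."""
    return [0.0] * (len(path) - 1) + [1.0]

def even_credit(path):
    """Even attribution: credit is spread uniformly across the path."""
    return [1.0 / len(path)] * len(path)

def position_based_credit(path, first_weight=0.4, last_weight=0.4):
    """Position-based attribution: the introducer and closer get fixed,
    outsized shares; the promoters split whatever remains."""
    n = len(path)
    if n == 1:
        return [1.0]
    if n == 2:
        # Only an introducer and a closer: split their weights proportionally.
        total = first_weight + last_weight
        return [first_weight / total, last_weight / total]
    promoter_share = (1.0 - first_weight - last_weight) / (n - 2)
    return [first_weight] + [promoter_share] * (n - 2) + [last_weight]
```

For example, a three-touch path such as display → email → paid search would receive credits of 0.4, 0.2, and 0.4 under the assumed position-based weights.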
Advanced attribution is substantially different from these simple methods. Advanced attribution models are “data-driven”: rather than applying predetermined weights, they let the bottom-up data determine the importance of each ad in a sequence. A data-driven model examines the entire data set of converting and non-converting paths, analyzing every sequence that leads to a conversion to determine how much credit to give each ad.
Simple attribution models give quick back-of-the-envelope answers and some general insight, but serious marketers rely on the more scientific advanced models for an objective understanding of how each part of an advertising campaign actually performs.
But how do we know that one attribution model is superior to another? How can we tell whether an advanced attribution model outperforms a simple one, and how accurate either model is? One way to find out is model validation.
There are several techniques to validate an attribution model. One of the most effective is lift analysis. A lift analysis compares the conversion rates of two groups of users that have been carefully chosen such that they are similar in all respects except one.
As an example, let the first group be users who never saw a display ad before clicking on a paid search ad and converting. Let the second group be users who did see display ads before clicking on a paid search ad and converting. We can then compare the conversion rates of the two groups.
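The group split and rate comparison described above can be sketched as follows. The field names (`saw_display`, `clicked_search`, `converted`) are hypothetical placeholders for whatever the underlying user log actually records.

```python
def lift_analysis(users):
    """Compare conversion rates of search-only vs. display-assisted users.

    users: list of dicts with boolean fields 'saw_display',
    'clicked_search', and 'converted'.
    Returns (baseline_rate, assisted_rate, lift).
    """
    search_only = [u for u in users
                   if u["clicked_search"] and not u["saw_display"]]
    display_assisted = [u for u in users
                        if u["clicked_search"] and u["saw_display"]]

    def conversion_rate(group):
        return sum(u["converted"] for u in group) / len(group) if group else 0.0

    baseline = conversion_rate(search_only)   # group one
    assisted = conversion_rate(display_assisted)  # group two
    lift = (assisted - baseline) / baseline if baseline else float("inf")
    return baseline, assisted, lift
```

A lift of 1.0 here would mean the display-assisted group converts at double the baseline rate.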
The difference in the conversion rates between group one and group two will tell us quantitatively at an aggregate level how much the top-of-funnel display activity has lifted the conversion rate above the baseline of the paid search users’ conversion rate. The lift in conversion rate experienced by the second group should be credited to the aggregate display events seen by those users.
For example, if the conversion rate doubles when users see assisting display impressions, then half of the second group's conversions are incremental, so 50% of the credit for those conversions should go to the display impressions and the other 50% to the paid search ads clicked on by the display-assisted user group.
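That arithmetic generalizes beyond the doubling case: the assisting channel earns the incremental share of the assisted conversion rate, and the closing channel earns the baseline share. A minimal sketch, with illustrative function and parameter names:

```python
def credit_split(baseline_rate, assisted_rate):
    """Split conversion credit between an assisting channel and the closer.

    baseline_rate: conversion rate without the assisting channel
    assisted_rate: conversion rate with the assisting channel
    Returns (assisting_channel_share, closing_channel_share).
    """
    incremental = assisted_rate - baseline_rate
    assisting_share = incremental / assisted_rate  # lift-driven conversions
    closing_share = baseline_rate / assisted_rate  # would-have-converted-anyway
    return assisting_share, closing_share
```

With a 5% baseline and a 10% assisted rate (a doubling), this yields the 50/50 split described above; a 5% baseline lifted to 7.5% would instead credit display with one third.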
While seemingly simple, this technique is extremely valuable because it solves a complex problem. It allows us to validate an attribution model by comparing it with a ground truth that is established without relying on the attribution model itself. Hence, we can measure the efficacy and accuracy of the attribution model by using a top-down approach to validate our bottom-up results.