What Looking At Billions Of Emails Has Taught Me

Every Friday, I look at a few billion emails.  

More specifically, I run some standard reports on an anonymized database of a few million active consumer mailboxes every Friday. This data is generated by users who have installed one of several email applications in exchange for the ability to see aggregated anonymous data about what’s going on in the user’s mailbox.

The installation of these applications includes explicit, positive permission to use the data on an anonymous and aggregated basis. I can’t tell who the user is; subscriber IDs have been anonymized. We primarily collect “metadata” about each message. Was the message read? Was it placed in the inbox? Was it moved to a folder? Did the user delete it without opening it?

The ability to look at aggregate data across millions of mailboxes gives unique insight into the levers that matter most for email marketing performance. The really interesting thing about this data is that it offers a view into the entirety of the messages that are sent to panelists’ mailboxes. Because every sender is competing for the same subscriber's attention, the data makes it clear what kind of messaging works, and what doesn’t, for that subscriber.

Here are a few of the things that I’ve learned through analysis of this data—only some of which were obvious:

“Engagement-based” spam filtering at large mailbox providers is real, and it accounts for a lot of potential clicks and opens gained or lost. Large mailbox providers, particularly Gmail, consider the individual subscriber’s historical engagement with an email’s sender when deciding whether that email should be delivered to the inbox. Some marketers believe that this means deliverability isn’t a problem, because only mail sent to disengaged subscribers will miss the inbox. Our analysis indicates that delivery problems impact both engaged and disengaged segments, just at different rates.

List quality is the foundation on which good email marketing results are built. Not all subscribers are created equal. Marketers drive more opens and revenue when they mail to lists that consist largely of consumers who actively engage with commercial email. Our analysis indicates this is the most important driver of email marketing performance—even more important than your email program strategy. In other words, a great marketer with a poor list will be outperformed by a mediocre marketer with a great list that is made up of extremely engaged subscribers.

There is a “magic circle” of mailers for most subscribers. Most active subscribers have a handful of “go to” emails that they open on a regular basis. Brands enter and exit this “magic circle” over time. The subscribers who have admitted you into their circle represent a disproportionate percentage of your opens. Understanding what is working for the brands that your subscribers have admitted into their magic circle is a shortcut to driving more opens, clicks, and conversions.

Cadence is an important lever. There is a surprisingly large variation in weekly send volume per subscriber for mailers in the same vertical with the same business model. A faster cadence typically drives higher termination rates (terminations here are defined as unsubscribes, complaints, and subscribers that are no longer responsive), but will drive higher opens per subscriber up to a point. By looking at domain-level data from the panel, it’s possible to find the optimal average frequency for an entire list. Many marketers are far from that optimal cadence.
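To make the "up to a point" idea concrete, here is a minimal sketch in Python of how one might look for the cadence at which opens per subscriber peak. The observations, the bucketing by sends per week, and the function name best_cadence are all assumptions made for illustration; they are not panel data and not the method used for the reports described here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical observations, one per sender-week:
# (emails sent per subscriber per week, opens per subscriber per week).
# The numbers are made up to show the shape of the analysis.
observations = [
    (1, 0.30), (1, 0.34),
    (3, 0.62), (3, 0.58),
    (5, 0.71), (5, 0.69),
    (7, 0.64), (7, 0.60),
]

def best_cadence(rows):
    """Return (cadence, mean opens per subscriber) for the cadence
    bucket with the highest mean opens per subscriber."""
    buckets = defaultdict(list)
    for cadence, opens in rows:
        buckets[cadence].append(opens)
    averages = {c: mean(values) for c, values in buckets.items()}
    best = max(averages, key=averages.get)
    return best, averages[best]

cadence, opens = best_cadence(observations)
print(f"Opens per subscriber peak at about {cadence} sends per week")
```

In this toy data, opens per subscriber rise through five sends per week and fall at seven. A real analysis would also weigh termination rates against the extra opens rather than looking at opens alone.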

Triggered/contextual messaging works, as does segmentation. No real surprises here, except for the disproportionate impact that triggered messaging can have. The read rates from triggered/contextual messages are much higher than campaign-based messages. Increasing the number of triggered/contextual messages quickly increases opens per subscriber.

Spending time on optimizing subject lines appears to be worth the effort. When normalizing for subscriber activity level, cadence, message type (e.g., campaign-based mail or triggered message), send time, and the number of messages sent by competitors at the same time, there is still a significant amount of unexplained variation in read rate between campaigns for the same domain. A lot of this variation appears to be due to the quality of the subject line.
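One simple way to picture "normalizing" is to compare each campaign's read rate with the average for campaigns that share the same known factors, and then look at what is left over. The Python sketch below does this for two factors only; the field names, the sample campaigns, and the function residual_read_rates are hypothetical, and a real analysis would control for all of the factors listed above, likely with a regression model rather than simple group averages.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical campaigns from one sending domain. Values are illustrative.
campaigns = [
    {"id": "a", "message_type": "campaign", "send_hour": 9, "read_rate": 0.18},
    {"id": "b", "message_type": "campaign", "send_hour": 9, "read_rate": 0.26},
    {"id": "c", "message_type": "triggered", "send_hour": 9, "read_rate": 0.45},
    {"id": "d", "message_type": "triggered", "send_hour": 9, "read_rate": 0.39},
]

def residual_read_rates(rows, keys=("message_type", "send_hour")):
    """Subtract each group's mean read rate from its members.

    What remains is the variation not explained by the grouping
    factors, which is where subject-line quality would show up.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row["read_rate"])
    baselines = {group: mean(rates) for group, rates in groups.items()}
    return {
        row["id"]: row["read_rate"] - baselines[tuple(row[k] for k in keys)]
        for row in rows
    }

for campaign_id, residual in residual_read_rates(campaigns).items():
    print(campaign_id, f"{residual:+.2f}")
```

Here campaigns "a" and "b" are identical on the controlled factors yet differ by eight points of read rate, which is the kind of leftover gap the paragraph above attributes largely to subject lines.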

What other questions do you have that might be answered by this kind of data?
