Charlene Weisler: You have been working with data to predict the path and impact of the coronavirus in the United States. What data and what forecast methods are you using?
Rex Briggs: As part of my trend watching, I took note of the closures in China back in January and decided to review the model coming out of China that said the outbreak would be over in 30 days with fewer than 10,000 cases. I ran the data, and published at the end of January that this had the potential to get into the millions.
My February report forecast almost exactly (within one day) where cases would be at the end of the month.
I then began to focus on the U.S. My initial forecast used the WHO’s estimate for infection fatality rate (IFR), combined with my projection that 15%-20% of the U.S. would become infected by the end of the year.
About a month later, I was able to generate my own IFR, based on my analysis of data from Hubei and the Diamond Princess Cruise. I found a lower IFR of 0.4%, and I lowered my first forecast from 780k to 300k. However, in September, even though the daily death rate was declining, polling data was showing me that masks and other common-sense measures were becoming politicized.
In addition, my analysis had found seasonality suggesting the fall should be much worse. I raised my forecast of the percent infected by five points, and projected 350,000 deaths in the U.S. at the end of the year. The actual number was 350,730.
Weisler: Wow. That’s impressive. Which data sets were the most important?
Briggs: I use a lot of data to inform my forecast: mobility data (thank you, Geopath), cases, deaths, closed population studies, serology studies, clinical trials, cases by latitude, academic research on masks, excess deaths data from CDC, cases and deaths by age, historical flu vaccines, polling data on vaccine intention, ZIP code MOSAIC analysis, the adverse reaction database, and economic forecasts. There is a lot to integrate.
Weisler: What national forecasts have you made since the virus hit?
Briggs: My next forecast after China in January was for when the U.S. would cross 1 million cases. It was also accurate to within a day. However, my first forecast back in February, when we had fewer than 10 deaths in the U.S., was that we could end 2020 with as many as 780k deaths, with 80% of them among people 65 and older. That estimate was too high because I was using WHO’s best estimate of the infection fatality rate. I had data to suggest their 2% estimate was too high, and I discounted it by 20%, but that was still too high.
In March, I completed my Diamond Princess Cruise analysis and found the IFR was actually 0.4%, and I therefore re-forecast about 300k deaths by the end of 2020. In September, I bumped this forecast up to 350k by the end of 2020 because the virus had become so politicized and polls indicated lower compliance with common-sense safety measures.
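[Editor's illustration: the arithmetic behind these revisions can be sketched as deaths ≈ population × share infected × IFR. This is a back-of-the-envelope sketch, not Briggs's actual model. The 330 million population figure, the function name, the reading of "discounted by 20%" as 2% × 0.8, and the pairing of infection shares with each IFR are all assumptions made for illustration.]

```python
# Back-of-the-envelope sketch: deaths ≈ population × share infected × IFR.
# US_POPULATION is an assumed round number, not a figure from the interview.
US_POPULATION = 330_000_000


def projected_deaths(share_infected: float, ifr: float,
                     population: int = US_POPULATION) -> float:
    """Return projected deaths for a given infection share and IFR."""
    return population * share_infected * ifr


# Early scenario: WHO's 2% IFR discounted by 20% (assumed to mean 2% x 0.8),
# with 15% of the population infected (low end of the 15%-20% range).
discounted_ifr = 0.02 * (1 - 0.20)
early = projected_deaths(0.15, discounted_ifr)

# Revised scenario: the 0.4% IFR from the Diamond Princess analysis,
# with an assumed 20% of the population infected.
revised = projected_deaths(0.20, 0.004)

print(f"Early scenario:   {early:,.0f} deaths")
print(f"Revised scenario: {revised:,.0f} deaths")
```

[With these assumed inputs, the early scenario comes out near 790k and the revised one near 260k. That is the same range as the 780k and 300k forecasts quoted above, but not a reproduction of them, since the published forecasts draw on many more inputs.]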
The two places I look for comparison to my models are the IHME estimates and the COVID Forecasting Project, which is an ensemble of about 50 models. The IHME forecast was made about three months prior to the end of the year, similar to my September forecast. The IHME forecast was too high by about 60,000 for 2020. The Ensemble forecast is only a four-week forecast. Their Dec. 1 forecast was too low by about 17,000. My forecast was off by less than 1,000 and, based on my review of these forecasts, was the most accurate for 2020.
Looking forward, for Q1, the forecasts are closer in range. You can read more about the comparison on my Forecast Blog, where I keep a running list of news I’m following and updates to the forecast. In terms of 2021, coincidentally, we may hit the 780k by the end of 2021, according to the 3.1 forecast.
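[Editor's illustration: the 2020 comparison above can be put on a common scale by expressing each miss as a percentage of the year-end total. This sketch uses only figures quoted in the interview. The IHME and Ensemble values are the rounded "about" numbers, and the forecasts were made at different horizons, so dividing all of them by the same year-end total is a simplifying assumption. The dictionary labels and function name are illustrative.]

```python
# Year-end 2020 U.S. death count quoted in the interview.
ACTUAL_2020_DEATHS = 350_730

# Signed errors (forecast minus actual), as described in the interview.
# The first two are rounded "about" figures, so results are approximate.
errors = {
    "IHME (about three months out)": 60_000,
    "Ensemble (four weeks out)": -17_000,
    "Briggs (September forecast)": 350_000 - ACTUAL_2020_DEATHS,
}


def percent_error(error: float, actual: float = ACTUAL_2020_DEATHS) -> float:
    """Express a signed error as a percentage of the actual total."""
    return 100 * error / actual


for name, err in errors.items():
    print(f"{name}: {err:+,} deaths ({percent_error(err):+.1f}%)")
```

[On these numbers, the misses are roughly 17% high, 5% low, and 0.2% low, respectively.]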
Weisler: What about global forecasts?
Briggs: I did a European model as part of my November projection that called the peak of the pandemic in Q1, with an effective end in Q3. If there is interest, I’m considering extending the model to about 50 countries. If you have your readers contact me with their country of interest, I’ll see what I can do.
Weisler: Why do you think any of your forecasts diverged from other forecasters?
Briggs: IHME ran too high most of 2020. On the other hand, almost all of the 50 models in the Ensemble ran too low. They probably ran too low because, over the summer, the 7-day moving average of deaths was trending downward. I expected this based on my early analysis of cases and deaths by latitude.
There is something about the season that suggested we’d see a resurgence in the fall. I recall saying to my wife over the summer, “I’m afraid this is as good as it is going to get.” The other factor is politics. To inform my model, I pay attention to polling data. I find that attitudes and behaviors are both useful in forecasting.
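[Editor's illustration: a 7-day moving average, as mentioned above, smooths out day-of-week reporting swings so the underlying trend is easier to see. This is a minimal sketch. The function name is illustrative and the daily counts are made-up numbers, not real data.]

```python
def moving_average(values, window=7):
    """Trailing moving average: one value for each full window of days."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Made-up daily death counts for two weeks (illustration only).
daily_deaths = [900, 850, 400, 350, 1100, 1000, 950,
                880, 820, 380, 330, 1050, 960, 900]

smoothed = moving_average(daily_deaths)

# A falling smoothed series is what a "downward trend" refers to.
is_declining = smoothed[-1] < smoothed[0]
print([round(x) for x in smoothed], is_declining)
```

[A model fitted only to a declining smoothed series like this one would tend to extrapolate the decline, which is the failure mode Briggs describes.]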
Weisler: What is important to take into account when looking at this type of data?
Briggs: No one has a crystal ball, including me. We have data. We have assumptions. We have models to feed data and assumptions into, but ultimately we have to follow the news, polls, medical research and consumer behavior, and consider when enough signals suggest we have to update the model.
For example, most people believed that we’d see herd immunity when about 70% of the population had been infected. I used that assumption, too. But now that I have more data on infections by state, and on the growth rate in cases and deaths, a regression line points to a herd immunity threshold closer to 85%, so I updated my model accordingly.
I am also seeing more evidence that there are re-infections. Maybe immunity for 20% of the population that have very mild or asymptomatic exposure lasts less than a year. We don’t have enough data yet, but if that turns out to be the case, then we will need more vaccinations to achieve herd immunity and end the pandemic. The point is, models are frameworks of data and assumptions and when new data comes in, the models should be updated. For reference, I am currently on version 3.1. You can see a summary of the model versions here.
Weisler: What do you project for 2021 and do you see any wildcards?
Briggs: To see my 2021 projections, visit Speaker Rex. There's a lot to the 2021 projections, and the dashboard shows the key trend lines of actual and forecast figures, with links to the blog and methodology paper. Positive wildcards are therapeutics that should lower fatality rates among those infected and, very importantly, marketing communication to encourage more people to get the facts and get vaccinated. Negative wildcards: the UK variant, vaccination hesitancy and misinformation threats.
Weisler: What advice can you give researchers in using data for pandemic forecasting?
Briggs: Be a voracious reader. There is a good Reddit (COVID-19) covering academic research on COVID-19. And be a part of the solution. I initially created the model to help others be informed and data-driven about their decisions on social distancing and mask wearing. I hoped the model would be shared with political leaders at the local and state levels to help them make better decisions.
I still think data and models are the best way to make decisions, and the more people who know about the forecast and act accordingly, the faster we will end this pandemic.