A new coronavirus disease, known as Covid-19, that could be transmitted between people was identified in China in December 2019. By 3rd March 2020 it had spread to every continent except Antarctica, with a total of 92,840 confirmed cases and 3,118 deaths.
As scientists worldwide scrambled to understand this new virus, a fundamental and immediate question was: how many more people are likely to die, and what impact can governmental interventions have?
To answer this question, we have two valuable resources available to us. The first is numerical models, which encode the key equations that can be used to describe a pandemic. The second is observational data, which detail the number of deaths and hospitalisations due to Covid-19 that have occurred to date. Neither models nor observations are perfect, but by combining (assimilating) them we can utilise the best parts of both whilst minimising their flaws. A huge benefit of data assimilation is that it also provides a robust estimate of the uncertainties in the output, offering an understanding of the worst-case, best-case and most likely outcomes.
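To make the idea of combining a model with an observation concrete, here is a minimal sketch of the simplest possible case: a single quantity with one model forecast and one observation, each with a known error variance. This is a textbook scalar update, not the method used in the study, and the function name and all numbers are hypothetical.

```python
def assimilate(forecast, forecast_var, observation, observation_var):
    """Blend a model forecast with an observation, trusting the less uncertain one more.

    Returns the combined estimate (the "analysis") and its variance.
    """
    # Weight given to the observation: close to 1 when the model is very uncertain.
    gain = forecast_var / (forecast_var + observation_var)
    analysis = forecast + gain * (observation - forecast)
    # The combined estimate is never more uncertain than either input.
    analysis_var = (1.0 - gain) * forecast_var
    return analysis, analysis_var


# Hypothetical numbers, purely for illustration: the model says 1000 with a
# standard deviation of 200, the observation says 900 with a standard deviation of 100.
analysis, analysis_var = assimilate(
    forecast=1000.0, forecast_var=200.0**2,
    observation=900.0, observation_var=100.0**2,
)
print(analysis, analysis_var**0.5)
```

The combined estimate sits closer to the more trustworthy observation, and its uncertainty is smaller than that of either the model or the observation alone, which is the sense in which assimilation uses the best parts of both.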
Assimilating observational data with models is routinely performed in geophysics. In fact, data assimilation is fundamental to modern-day weather forecasting. Evidence for this is provided by the step change in the accuracy of weather forecasts that has accompanied the increasing availability of information from satellites orbiting the Earth.
Can data assimilation tools that have been developed for the geosciences be applied to pandemic modelling?
A team of scientists from 8 different countries (Argentina, Canada, England, France, Netherlands, Norway, Brazil and the United States of America) diverted their attention from geophysics for a few months to examine this very question. Each employed a state-of-the-art data assimilation tool typically used for geophysical problems to explain and predict the course of the pandemic in their own host country. The evolution of the epidemic is seen to vary widely between these 8 countries. Factors affecting this include differences in location (e.g. hemisphere), population densities, social habits, health-care systems, and importantly the government interventions employed.
We found that, by using data assimilation to derive key parameters of the pandemic, we could fit a classic metapopulation model to explain the reported deaths and hospitalisations in each of the 8 countries. The model itself is a version of the Susceptible-Exposed-Infected-Recovered (SEIR) compartment model that has been adapted to Covid-19 by including age stratification and additional compartments for quarantine and care homes. This is analogous to the compartmental models often used in geophysics, such as those used for studying carbon dynamics. Using this approach, we were able to successfully represent the impact of the (very different) interventions taken in the 8 countries, and to visualise the rapid drop-off in person-to-person transmission on the different dates of lockdown.
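For readers unfamiliar with compartment models, the following is a minimal sketch of a basic SEIR model stepped forward in time. It deliberately omits the age stratification and the quarantine and care-home compartments described above, and the function names, parameter values and time step are all assumptions chosen for illustration rather than values from the study.

```python
def seir_step(s, e, i, r, beta, sigma, gamma, dt):
    """Advance the four compartments by one small time step (forward Euler)."""
    n = s + e + i + r
    new_exposed = beta * s * i / n * dt   # susceptible people who catch the virus
    new_infectious = sigma * e * dt       # exposed people who become infectious
    new_recovered = gamma * i * dt        # infectious people who recover
    return (
        s - new_exposed,
        e + new_exposed - new_infectious,
        i + new_infectious - new_recovered,
        r + new_recovered,
    )


def run_seir(population, initial_infected, beta, sigma, gamma, days, dt=0.1):
    """Return the (S, E, I, R) state at the start and at the end of each day."""
    s, e, i, r = population - initial_infected, 0.0, float(initial_infected), 0.0
    history = [(s, e, i, r)]
    steps_per_day = round(1.0 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            s, e, i, r = seir_step(s, e, i, r, beta, sigma, gamma, dt)
        history.append((s, e, i, r))
    return history


# Hypothetical parameters (per day): transmission rate 0.5, a 5-day incubation
# period and a 5-day infectious period, in a population of one million.
history = run_seir(population=1_000_000.0, initial_infected=10,
                   beta=0.5, sigma=0.2, gamma=0.2, days=120)
peak_infectious = max(state[2] for state in history)
print(round(peak_infectious))
```

In a data assimilation setting, parameters such as `beta` would not be fixed by hand as they are here but estimated, with uncertainties, from the reported deaths and hospitalisations.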
Given the success of data assimilation in explaining the reported deaths, the next step is to provide predictions under different possible scenarios. For England we took three possible scenarios from the 1st June, when lockdown began to be eased. These were defined in terms of the now familiar R number, which quantifies the average number of people an infected person will pass the virus on to. The three values chosen were 0.5 (the number of cases falls with time), 1 (the number of cases stays steady) and 1.2 (the number of cases rises with time).
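The effect of the three R values can be illustrated with a back-of-the-envelope calculation: if each infected person passes the virus on to R others on average, the number of new cases is multiplied by R with every generation of infection. The sketch below assumes a hypothetical starting point of 1,000 cases and 5 generations; it is a simplification for intuition only, not the model used for the projections.

```python
def cases_after(initial_cases, r_number, generations):
    """Cases in a generation, if each case infects r_number others on average."""
    return initial_cases * r_number ** generations


# Hypothetical starting point: 1,000 cases, followed for 5 generations of infection.
scenarios = {r: cases_after(1000.0, r, 5) for r in (0.5, 1.0, 1.2)}
for r, cases in scenarios.items():
    print(f"R = {r}: about {cases:.0f} cases per generation")
```

With R below 1 the outbreak shrinks quickly, at exactly 1 it holds steady, and even a value only slightly above 1 leads to sustained growth.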
As of 1st June, approximately 45,000 deaths had been attributed to Covid-19 in England in all settings (source: ONS). Our projections under the three different scenarios predict that by the 1st September the total deaths will be 57,000±1,900 (R=0.5), 63,600±2,700 (R=1) and 76,400±4,900 (R=1.2). Given how widespread Covid-19 already is in England, these results highlight the potential of measures that substantially reduce person-to-person contact to save tens of thousands of lives. The uncertainty in the numbers reflects both the uncertainty in the simple model and the uncertainty in the reported values. The collection of data on deaths, hospitalisations and positive cases is marred by a myriad of political and social complications, problems we do not normally need to consider when measuring winds and rainfall.
Figure: Evolution of the Covid-19 epidemic in England. Top: Total deaths. Middle: Number in hospital. Bottom: The estimated R value (average number of person-to-person transmissions). The black dots show the reported values up to 5th June for deaths (source: ONS) and up to 12th June for number in hospital (source: daily Gov. Press Conference). Blue lines indicate the initial estimates and red lines indicate the values after assimilation, with the bold line indicating the most likely value. After 1st June, three predictions are made based on the three different R values.