Sheila Bird,
formerly Programme Leader, MRC Biostatistics Unit, Cambridge Institute of Public Health, CB2 0SR
&
Bent Nielsen,
Department of Economics
and
Nuffield College
University of Oxford
Supported by the European Research Council (grant 694262, DisCont)
Weekly reporting: The number of cases has now fallen to a much lower level than at the peak of the epidemic. We will therefore move to weekly updating of the website, on Saturdays. (24 May 2020)
Now-casting for English regions:
Plots similar to the one above are also produced for the English regions. The interpretation is the same.
As the number of cases is smaller, the statistical uncertainty is larger.
We are now beginning to see that the method for computing confidence bands breaks down as the number of cases becomes smaller. This
is seen for the North East and Yorkshire and for the South West and
This methodological issue is more pronounced for regions with a shorter reporting delay, such as North East and Yorkshire.
(13 May 2020)
The reporting delay: Each day NHS England reports information on deaths of patients who have died in hospitals in England and had tested positive for COVID-19 at time of death. All deaths are recorded against the date of death rather than the date the deaths were announced. NHS points out that the totals reported on any day may not include all deaths that occurred on that day or on recent prior days.
Reporting delay is a well-known feature of death statistics (Bird, 2013). Interpretation of the NHS figures should take into account the fact that totals by date of death, particularly for the most recent days, are likely to be updated in future releases. NHS England writes on the reporting delay in the data: "Interpretation of the figures should take into account the fact that totals by date of death are likely to be updated in future releases for more recent dates. For example, a positive result for COVID-19 may occur days after confirmation of death. Cases are only included in the data when the positive COVID-19 test result is received, or death certificate confirmed with COVID-19 mentioned. This results in a lag between a given date of death and exhaustive daily death figures for that day." (13 May 2020)
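To make the reporting delay concrete, the following minimal sketch (in Python, not the authors' R code) shows one way to turn successive daily releases of cumulative totals by date of death into counts of newly reported deaths by delay. The function name, the data layout and the figures are assumptions made for illustration only; they are not NHS data and not the format of CovidReportingNHS.xlsx.

```python
from datetime import date, timedelta


def increments_by_delay(releases):
    """Turn successive cumulative releases into new-report counts by delay.

    `releases` maps a release date to {date_of_death: cumulative total so far}.
    Returns {date_of_death: {delay_in_days: newly reported deaths}}, where the
    delay is the release date minus the date of death.
    """
    table = {}
    previous = {}  # latest cumulative total seen for each date of death
    for release_day in sorted(releases):
        for death_day, total in releases[release_day].items():
            new = total - previous.get(death_day, 0)
            delay = (release_day - death_day).days
            table.setdefault(death_day, {})[delay] = new
            previous[death_day] = total
    return table


# Hypothetical figures, for illustration only.
d1, d2 = date(2020, 5, 1), date(2020, 5, 2)
releases = {
    d1 + timedelta(days=1): {d1: 40},
    d1 + timedelta(days=2): {d1: 70, d2: 35},
    d1 + timedelta(days=3): {d1: 78, d2: 66},
}
table = increments_by_delay(releases)
print(table[d1])  # {1: 40, 2: 30, 3: 8}
```

Arranged with dates of death as rows and delays as columns, such counts form the kind of triangular array on which reporting-delay adjustments operate: the most recent dates of death have the fewest observed cells.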
Method:
The method adjusts for overall delay (across age-groups and regions) in the reporting-in of hospitalized COVID-19 deaths for England.
The method self-adapts to temporal changes in the reporting-distribution
and deliberately does not parameterize how we expect the trajectory to look a priori.
Specifically, we chose an over-dispersed Poisson model with an age-cohort specification.
This method corresponds to the chain-ladder method used in general insurance for estimating unknown liabilities (England, Verrall 2002).
We apply a recent theory for the uncertainty of estimates and now-casts in the presence of over-dispersion (Harnau, Nielsen 2018) extending
(Martínez-Miranda, Nielsen, Nielsen, 2015, 2016). After some experimentation we settled on an approach that only exploits data from the
7 most recent reporting days. This is because the delay distribution varies over time. As a consequence, the now-casts may jump
from one reporting day to the next when there are shifts in the data. An alternative would be a smoother approach that would appear
more stable over time, but it would be less able to follow the shifts in the data.
The number of cases has by now decreased considerably since the peak. At the same time the delay distribution has become tighter,
although still varying considerably throughout the week. The parameters of the method have been adjusted so as to use data from the
5 most recent reporting days and 7 most recent dates-of-death.
(24 May 2020)
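As a rough illustration of the chain-ladder idea mentioned above, the sketch below (in Python, not the authors' adapted apc code) computes point now-casts from a small reporting triangle. It assumes a square triangle with strictly positive counts, and it omits both the over-dispersion-based error bands and the restriction to recent reporting days. The function name and the figures are hypothetical.

```python
def chain_ladder_nowcast(triangle):
    """Point now-casts of total deaths per date of death.

    triangle[i][j] is the number of deaths on date i first reported with
    delay j; cells not yet reported are None. Row i of an n-row triangle is
    assumed to have its first n - i cells observed (oldest date first).
    """
    n = len(triangle)
    # Cumulate the observed part of each row.
    cum = []
    for i, row in enumerate(triangle):
        running, out = 0, []
        for x in row[: n - i]:
            running += x
            out.append(running)
        cum.append(out)
    # Development factor from delay j to j + 1, pooled over the rows in
    # which both cumulative counts are observed.
    factors = []
    for j in range(n - 1):
        rows = [i for i in range(n) if len(cum[i]) > j + 1]
        num = sum(cum[i][j + 1] for i in rows)
        den = sum(cum[i][j] for i in rows)
        factors.append(num / den)
    # Roll each incomplete row forward to the final delay.
    totals = []
    for i in range(n):
        value = float(cum[i][-1])
        for j in range(len(cum[i]) - 1, n - 1):
            value *= factors[j]
        totals.append(value)
    return totals


# Hypothetical triangle: rows are dates of death, columns are delays.
triangle = [
    [10, 30, 10],
    [12, 36, None],
    [20, None, None],
]
print(chain_ladder_nowcast(triangle))  # [50.0, 60.0, 100.0]
```

In this sketch, passing only the last seven rows of a larger triangle would mimic the use of the 7 most recent dates-of-death; limiting the reporting days would require handling a trapezoid rather than a triangle, which is not attempted here.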
Software:
We used an adapted version of the R package apc (Nielsen, 2015).
Download:
apc from CRAN
and
further documentation and development version.
Additional code and data are needed.
Download:
code from
CovidReporting.zip [17 Apr 2020: 8.43]
and (daily updated) data from
CovidReportingNHS.xlsx.
This contains five R files and one data file in xlsx format. Update the parameters
(drive & choice of region & choice of destination for plots)
in CovidReporting_Main_16apr2020.r
and run in R.
Further instructions on
Regions & Archive
page.
Recursive nowcasts: These help in tracking the performance of the forecasts over time. In the figure below, the black crosses, plusses and lines are the same as in the graph above; that is, they use the most recently reported data. The red crosses, plusses and lines are drawn using the data reported one day earlier, and so on. Now consider, for instance, the date-of-death 5 days ago. The 5 crosses of different colours show how the information about the number of cases grows day by day. Higher up there are five error bands of matching colours. They tend to get narrower. Looking at an older date-of-death, we see that the final observation tends to be included in all error bands for that date.
Short term forecasting: Castle, Doornik & Hendry present short term forecasts for a variety of countries.
Media:
David Spiegelhalter on Twitter
(16 Apr 2020)
The Scientist
(18 May 2020)
References:
Bird SM. Editorial: Counting the dead properly and promptly. Journal of the Royal Statistical Society Series A 2013; 176: 815 - 817.
England PD, Verrall RJ. Stochastic claims reserving in general insurance. British Actuarial Journal 2002; 8: 443 - 518.
Harnau J, Nielsen B. Over-dispersed age-period-cohort models. Journal of the American Statistical Association 2018; 113: 1722 - 1732.
Martínez-Miranda MD, Nielsen B, Nielsen JP. Inference and forecasting in the age-period-cohort model with unknown exposure with an application
to mesothelioma mortality.
Journal of the Royal Statistical Society Series A 2015; 178: 29 - 55.
Martínez-Miranda MD, Nielsen B, Nielsen JP. Simple benchmark for mesothelioma projection for Great Britain.
Occupational and Environmental Medicine 2016; 73: 561 - 563.
Nielsen B. apc: An R package for age-period-cohort analysis. The R Journal 2015; 7: 52 - 64.