The Department of Health and Social Care has been publishing daily death tolls for the UK, and these are analysed below. On April 29th, it changed the basis for reporting. The main reason was to include care home deaths in England, as they were already included in Scotland, Wales, and Northern Ireland. The daily data are provided at https://coronavirus.data.gov.uk/. The archive is at https://coronavirus.data.gov.uk/archive, and is now quite complicated. Each day’s figure is now reported again on each following day, and is subject to revision. Under the new reporting scheme, we have only a very short series of figures as they were actually released on the day. This makes it hard to analyse the data usefully until that series is long enough.
For that reason, I paused the analyses on 28th April. I may resume when there is enough data to make some sense of each day’s new figure.
The published daily death tolls seem to vary wildly. We can make more sense of them by looking at them by day of the week, as follows:
##           Mar 15- Mar 22- Mar 29-  Apr 5- Apr 12- Apr 19- Apr 26-
## Sunday         14      48     209     621     737     596     413
## Monday         20      54     180     439     717     449     360
## Tuesday        16      87     381     786     778     823     586
## Wednesday      32      41     563     938     761     759
## Thursday       41     115     569     881     861     616
## Friday         33     181     684     980     847     684
## Saturday       56     260     708     917     888     813
Table 1. The number of UK hospital coronavirus deaths reported each day by day of the week, and week.
This helps, because deaths are sometimes reported days after they occur, and the processes of collating the data are affected by day of the week. By comparing figures on the same day of the week, we remove some of that variation. Although there is still ‘random’ variation, of course, there is more systematic signal in a whole week’s gap than in a one-day gap, so we win both ways. The obvious next step is to look at each figure as a ratio to the figure from exactly one week before, like this:
##           Mar 22- Mar 29-  Apr 5- Apr 12- Apr 19- Apr 26-
## Sunday       3.43    4.35    2.97    1.19    0.81    0.69
## Monday       2.70    3.33    2.44    1.63    0.63    0.80
## Tuesday      5.44    4.38    2.06    0.99    1.06    0.71
## Wednesday    1.28   13.73    1.67    0.81    1.00
## Thursday     2.80    4.95    1.55    0.98    0.72
## Friday       5.48    3.78    1.43    0.86    0.81
## Saturday     4.64    2.72    1.30    0.97    0.92
Table 2. The ratio on each day of the number of reported deaths to the number from one week previously.
A “1” means it’s the same as last week, greater than one means it’s more than last week, and less than one is the preferable case of being less than last week. First, note that up until Monday 13th April, every figure is greater than one. Since then, every figure is less than one, except 21st April. It looks like a corner was turned on Tuesday 14th.
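As a minimal sketch of the calculation behind Table 2, the ratios can be reproduced from the Table 1 counts as follows. Python is used here purely for illustration, and the names `deaths` and `weekly_ratios` are hypothetical, not part of the original analysis.

```python
# Reported UK hospital deaths from Table 1, one list per day of the week,
# ordered by week beginning Mar 15, Mar 22, Mar 29, Apr 5, Apr 12, Apr 19, Apr 26.
deaths = {
    "Sunday":    [14, 48, 209, 621, 737, 596, 413],
    "Monday":    [20, 54, 180, 439, 717, 449, 360],
    "Tuesday":   [16, 87, 381, 786, 778, 823, 586],
    "Wednesday": [32, 41, 563, 938, 761, 759],
    "Thursday":  [41, 115, 569, 881, 861, 616],
    "Friday":    [33, 181, 684, 980, 847, 684],
    "Saturday":  [56, 260, 708, 917, 888, 813],
}


def weekly_ratios(counts):
    """Ratio of each count to the count from exactly one week earlier."""
    return [later / earlier for earlier, later in zip(counts, counts[1:])]


ratios = {day: weekly_ratios(counts) for day, counts in deaths.items()}

# Print one row per day of the week, rounded to two places as in Table 2.
for day, values in ratios.items():
    print(f"{day:<10}" + " ".join(f"{v:6.2f}" for v in values))
```

Because each ratio pairs a day with the same weekday one week earlier, the first week (beginning Mar 15) has no ratio, which is why Table 2 has one column fewer than Table 1.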
Some other patterns need explaining. The biggest figure is 13.7 on Wednesday 1st April. That happened because the reporting system changed on Wednesday 25th March, and the figure that day covered only a few hours. So Wednesday 25th March had a very low figure, and the ratio of 1st April to 25th March was very big. In between those Wednesdays, the ratio is between one reporting system in the later week and a different reporting system in the earlier week, and the numbers do bump around a lot.
After Wednesday 1st April, the ratios are much less variable for 10 days, and largely go down steadily with one small exception. This period presents a very positive picture, declining from 4.95 on Thursday 2nd April to 1.19 on Sunday 12th April, presumably the effect of increasing social distancing three to four weeks earlier.
After Sunday 12th April, there is some more turbulence, but that was Easter Sunday, and the holidays may well have disturbed the recording processes again. From Tuesday 14th onwards, every figure is less than one apart from 21st April. If the reporting system doesn’t change again, we would expect further turbulence in the week ahead, with opposite effects from those of Easter week, because the Easter figures will be on the bottom of the fraction this time, instead of on the top. Such “turbulence” may account for 21st April: if the figure for 14th April was too low, the true start of the downturn may have been a day or two later. After that, we hope for a return to smoothly changing figures in the rest of Week 7. The big question, though, is how low the ratio will get.
The ratio should go down until social distancing has been unchanged for the previous four weeks. If the ratio goes down only to a bit less than one, then this wave of the epidemic will last for a long time, and we can expect many more deaths in it. If the social distancing has been enough to reduce the ratio well below a half, then this wave could be over in only a few weeks, and the deaths could be mercifully fewer.
If we assume the ratio will systematically go down, apart from random variation, then projecting from the current ratio should give us approximately an upper bound to the duration and the eventual death toll. We can hope to see both come down as the turn begun on Tuesday 14th April continues. Tables 1 and 2 are facts and simple manipulations of facts. I’ve made some very tentative projections below, on the pessimistic basis that the current ratios are as far as the ratios will go down.
##           Mar 22- Mar 29-  Apr 5- Apr 12- Apr 19- Apr 26-
## Sunday     1.7776  2.1224  1.5711  0.2471 -0.3064 -0.5292
## Monday     1.4330  1.7370  1.2862  0.7078 -0.6753 -0.3187
## Tuesday    2.4429  2.1307  1.0447 -0.0148  0.0811 -0.4900
## Wednesday  0.3576  3.7794  0.7365 -0.3017 -0.0038
## Thursday   1.4879  2.3068  0.6307 -0.0331 -0.4831
## Friday     2.4555  1.9180  0.5188 -0.2104 -0.3084
## Saturday   2.2150  1.4452  0.3732 -0.0464 -0.1273
Table 3. The number of doublings of the daily death toll in one week. When a negative figure is stripped of the minus sign, it represents the number of halvings in one week. Each figure is based on the corresponding ratio in Table 2.
This table shows how many doublings of the death toll happen in one week, or (if the figure is negative) how many halvings happen in one week, in both cases if the corresponding ratio in Table 2 continued unchanged. Thus, the Table 2 ratio on Sunday 19th April suggests there would be only 0.3064 halvings in a week, and so it would take over three weeks to halve the number of daily deaths. The first six negative values all suggest not many halvings happen per week, and at those rates it would take a very long time to reduce the death tolls by 50%. Monday 20th does look a lot better, but is only one day, and Tuesday 21st has reverted to increasing; these may be explained together by a change to the usual reporting pattern over Easter. The situation may be reverting to slow continuous change by Friday 24th, but we have to hope for a continuing drop in the ratios.
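The conversion from a weekly ratio to doublings (or halvings) per week is a base-2 logarithm, which the Table 3 figures are consistent with. A minimal sketch, again in Python for illustration only, with hypothetical function names:

```python
import math


def doublings_per_week(ratio):
    """Number of doublings in one week implied by a week-on-week ratio.

    A negative result is the number of halvings per week.
    """
    return math.log2(ratio)


def weeks_to_halve(ratio):
    """Weeks needed to halve the daily toll if the ratio stayed constant.

    Returns None when the ratio is 1 or more, since the toll is not falling.
    """
    d = math.log2(ratio)
    if d >= 0:
        return None
    return 1.0 / -d


# The 19th April figure: 596 deaths against 737 one week earlier (Table 1).
ratio_19_april = 596 / 737
print(doublings_per_week(ratio_19_april))  # about -0.306, i.e. 0.306 halvings
print(weeks_to_halve(ratio_19_april))      # a little over three weeks
```

A ratio of exactly 0.5 gives one halving per week, and a ratio of 2 gives one doubling per week, which is why the two directions can be read on the same scale.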
It doesn’t make sense at the moment to calculate projections of remaining death totals, as they would be very large and, we hope, very unrealistically large.
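To see why such projections come out so large, here is a rough sketch under the pessimistic assumption that the weekly ratio stays fixed. The symbols W (deaths in the current week) and r (the constant week-on-week ratio) are introduced here for illustration and are not part of the original analysis.

```latex
% W = deaths in the current week, r = constant week-on-week ratio (0 < r < 1).
% Each later week is r times the week before it, so the remaining weeks
% form a geometric series:
\text{remaining deaths} \;\approx\; W r + W r^{2} + W r^{3} + \dots
  \;=\; \frac{W r}{1 - r}, \qquad 0 < r < 1 .
% Illustrative values: r = 0.9 gives 9W, whereas r = 0.5 gives W.
```

With a ratio only a little below one, the remaining total is many times the current weekly toll, so the projection is dominated by how far the ratio eventually falls.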
Figure 1 plots the data in Table 1 by day of the week. Each day has a much simpler pattern than the data all plotted in a single sequence.
Figure 2 plots the ratios from Table 2. The day of the week correction in calculating the ratios means we get a clear pattern in the simple plot against time.
Figure 3 plots the number of doublings (or halvings, when negative) per week predicted by the Figure 2/Table 2 ratio on each day. The pattern is simple for the same reason as in Figure 2.
Figure 3 is a minor re-plotting of Figure 2, essentially by taking the logarithm to base 2 of the ratio. The reason this is useful is that the scale gives negative values the same meaning as positive ones. If the epidemic is to go down as fast as it came up, the negative values now need to be as large as the positive values were a few weeks ago. However, they are smaller in magnitude than the positive values up until well into April. At this rate, the decline will not be rapid.
This is the best published dataset for studying the UK pandemic on a daily basis, and I find the new figure each day in the online Guardian. The series used to be found on https://en.wikipedia.org/wiki/2020_coronavirus_pandemic_in_the_United_Kingdom, but they have changed the basis of reporting. The data are available on a histogram at https://coronavirus.data.gov.uk/, but not so far as I can see in a spreadsheet (it used to be, at https://www.arcgis.com/).
Testing in the UK has been applied so variably and unsystematically that the number of known cases depends at least as much on the testing regime as on the actual number of infections. The efforts to save lives mean that the UK hospital deaths are actually recorded in a reasonably definite way. The Office for National Statistics publishes figures of deaths on a weekly basis that are more comprehensive, but they are also considerably out of date by the time of publication, as the system for making them comprehensive takes time.
Within the UK daily hospital deaths, there are points to remember. First, there is a large effect of day of the week, and that effect changes – this is one of the main issues discussed in the text above. Second, it was announced on 24th April that 40 deaths had been omitted, because of past incorrect figures from one hospital trust. The corrected daily figures have not, so far as I am aware, been published, and so the analyses cannot be corrected. However, the desire for an analysis of the figures as published each day means that it is moot whether the corrected analysis would be preferable.
I began by constructing statistical models of the numbers, using log-linear models through glm() in R. As the weeks went on, I added day of the week to the model, and a factor for before versus after the change of reporting method. To allow the rates to change, I added linear splines. A surprising feature was that with only a few splines, the residual deviance became consistent with Poisson variation. However, the graphical analysis above is better, because (i) it is clear that the shape of the systematic part of the curve is determined by lockdown conditions and other factors, and so is not going to equal any predetermined curve shape, and (ii) the Easter perturbations would have required adding special factors (or equivalently dropping datapoints).
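For readers who want the shape of such a model, one way to write a log-linear Poisson model with these ingredients is sketched below. This is an assumed form for illustration; the symbols, the number of knots K, and the knot positions are hypothetical and not taken from the models actually fitted.

```latex
% Y_t      = reported deaths on day t
% d(t)     = day of the week of day t
% A_t      = 1 after the change of reporting method, 0 before
% \kappa_k = knot positions of the linear splines (hypothetical)
Y_t \sim \mathrm{Poisson}(\mu_t), \qquad
\log \mu_t \;=\; \alpha + \beta_{d(t)} + \gamma A_t + \delta t
  + \sum_{k=1}^{K} \theta_k \,(t - \kappa_k)_{+},
\qquad (x)_{+} = \max(x, 0).
```

Each spline term lets the daily growth rate change by \theta_k at its knot, so the slope of the log death toll is piecewise constant between knots.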
I don’t see that the mathematical/statistical models are of any more help than looking at the graphs, as I don’t see that any particular model would reasonably bear that weight. The biggest question is how large does the halving number per week become? The larger the better, obviously, but I don’t see that modelling is going to help.