The Deaths Per Million statistic was designed to account for a weakness in the Total Deaths number. The total number of deaths from COVID-19 is correlated with the size of the country's population. That means Total Deaths tells us as much about the size of a country's population as it does about the quality of its policies.
The natural response is to divide by population size in order to produce a standardised result. This is a sensible adjustment, and it gives us a new number with a new interpretation: this is the total death rate, or the proportion of the population that has died from COVID-19. This proportion is very small, so we multiply by 1 million to make it easier to read, and that gives us Deaths Per Million.
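The calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name deaths_per_million and the example inputs are my own assumptions, not official figures.

```python
def deaths_per_million(total_deaths: float, population: float) -> float:
    """Scale total deaths to a per-million-residents figure."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply first, then divide, so round numbers stay exact.
    return total_deaths * 1_000_000 / population


# Hypothetical inputs, chosen only to show the arithmetic.
example_dpm = deaths_per_million(total_deaths=500, population=5_000_000)
print(example_dpm)
```

With these made-up inputs, 500 deaths in a population of 5 million works out at 100 deaths per million.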
The Denominator Effect
Deaths Per Million (DPM) is a more useful statistic than Total Deaths, but it retains a flaw that is not well understood. When we divide by population size, we create a bias that favours countries with larger populations. The effect isn’t as big or as obvious as it is for Total Deaths, but it is definitely there, and it makes sense too. Let’s use Ireland and China as an example.
When the first Irish person died from COVID-19, Ireland’s DPM was (1 / 5 million) * 1 million = 1 / 5. When the first Chinese person died from COVID-19, the Chinese DPM was (1 / 1,500 million) * 1 million = 1 / 1,500.
In a hypothetical scenario where both countries had the same policies, the same demographic risk profile, the same everything, Ireland’s DPM after that first death would still be 300x higher, and would therefore appear 300x worse, even though there was no difference at all. In reality, it would just be a mathematical effect resulting from a small denominator. It would tell us nothing about the competency of policymakers in either nation.
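The Ireland and China example can be reproduced with a short sketch. The population figures are the rounded ones implied by the text (5 million and 1,500 million), and the helper name dpm is just an illustrative choice.

```python
IRELAND_POP = 5_000_000        # rounded figure used in the text
CHINA_POP = 1_500_000_000      # rounded figure implied by "1 / 1,500"


def dpm(deaths: float, population: float) -> float:
    """Deaths per million residents."""
    return deaths * 1_000_000 / population


ireland_first = dpm(1, IRELAND_POP)    # one death in Ireland
china_first = dpm(1, CHINA_POP)        # one death in China
ratio = ireland_first / china_first    # how much worse Ireland appears

print(f"Ireland: {ireland_first:.4f}, China: {china_first:.6f}")
print(f"Ireland appears about {round(ratio)}x worse")
```

The same single death produces a ratio of roughly 300 purely because of the difference in denominators.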
Sample Population Adjustment
The denominator effect is a problem, but it is not terminal. The good news is that we can still use DPM to compare international outcomes, but only if we restrict the comparison group to countries of similar population sizes.
How similar is similar?
I am not aware of any statistical rule or formula that adjusts for this effect, so we will have to make an educated guess. Clearly nations like China, India or the USA are too populous to be compared to Ireland. Similarly, San Marino, Barbados, or Monaco are too small. China’s population is more than 100 times Ireland’s, while Ireland’s population is more than 100 times San Marino’s. It doesn’t make sense to compare these groups either structurally or mathematically.
If we apply a factor of 10 to Ireland’s population, we get a range of 500,000 to 50,000,000. Instinctively, that range seems too wide to me. We might be able to justify comparing Ireland to Spain at one end, or to Malta at the other, but it is difficult to argue that Spain and Malta should be in the same dataset when one is 100 times the size of the other.
Restricting the group to populations within a factor of 5 of Ireland gives us a range of 1,000,000 to 25,000,000. That puts Estonia and Latvia at one end, and the Netherlands, Australia, and Taiwan at the other. It excludes Spain, Saudi Arabia, Fiji, and Andorra from the dataset. The factor of 5 seems like a good rule of thumb to me. It dampens the denominator effect, it reflects economic and social commonality, it remains suitable at different scales, and it is wide enough to give a decent sample size. Let’s go with that for now.
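For readers who want to try a different factor, the rule of thumb can be sketched as a simple filter. The function names are hypothetical, and the population figures below are rough, rounded approximations that I have added for illustration only; they are not taken from the table in this piece.

```python
def comparison_range(reference_population: float, factor: float):
    """Return the (low, high) population bounds for a given factor."""
    return reference_population / factor, reference_population * factor


def comparable(populations: dict, reference_population: float, factor: float = 5) -> dict:
    """Keep only the countries whose population falls inside the range."""
    low, high = comparison_range(reference_population, factor)
    return {name: pop for name, pop in populations.items() if low <= pop <= high}


# Rough, rounded populations (illustrative assumptions only).
approx_populations = {
    "Andorra": 77_000,
    "Fiji": 900_000,
    "Estonia": 1_300_000,
    "Netherlands": 17_400_000,
    "Taiwan": 23_600_000,
    "Saudi Arabia": 34_800_000,
    "Spain": 47_000_000,
}

low, high = comparison_range(5_000_000, 5)
kept = comparable(approx_populations, reference_population=5_000_000, factor=5)
print(low, high)
print(sorted(kept))
```

Changing the factor argument to 10 would widen the range to 500,000 to 50,000,000, which is the wider group discussed earlier.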
Interpreting Deaths Per Million
The table below ranks developed countries with populations between 1 million and 25 million (i.e. within a factor of 5 of Ireland’s population) by Deaths Per Million, as at 30 June 2020. I chose that date because by mid-2020, most developed countries had been through the first wave and come out the other side, with their outbreaks under control. It is a convenient point of comparison and effectively means that this table ranks nations by their performance in the first wave.
Table 1: Global Deaths Per Million
Restricting the analysis to countries of comparable size to Ireland allows us to adjust for the denominator effect. However, the DPM numbers are still biased by discrepancies in how deaths were officially recorded in the different jurisdictions. We can deal with these discrepancies by adding margins of error to the official headline numbers. In a previous piece we suggested that applying a margin of error with a factor of two would be a conservative adjustment for this effect.
For example, among the Baltics, Latvia has the lowest DPM at 16. Latvia might enjoy bragging rights over its neighbours, but as a researcher, I wouldn’t feel confident that Latvia’s DPM number demonstrated a higher level of competence. The range of outcomes across the three countries is too narrow to be confident that one was clearly outperforming the others. Applying a margin of error with a factor of 2 (halving the official number for the lower bound and doubling it for the upper bound), we would get the following ranges:
Latvia: 8 to 32
Lithuania: 15 to 58
Estonia: 26 to 104
Allowing for differences in how deaths were officially recorded, Latvia’s true DPM could be as high as 32. That means it could be higher than both Lithuania’s and Estonia’s, given that those countries' lower bounds are 15 and 26 respectively. Similarly, Lithuania could be higher or lower than the other two, depending on which end of the range you want to work from.
Indeed, you could use these ranges to rank the countries in any order, but that only serves to emphasise the fact that there probably isn’t much difference between them. Their results are all within their margins of error of each other, so we can’t be confident that the difference in performance is down to more skilful policymaking, rather than random variation.
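The overlap argument can be checked mechanically. The sketch below uses the three ranges quoted above; the overlaps helper is my own illustrative function, and it treats two ranges as overlapping if they share at least one value.

```python
from itertools import combinations

# Factor-of-2 ranges quoted in the text (deaths per million).
ranges = {
    "Latvia": (8, 32),
    "Lithuania": (15, 58),
    "Estonia": (26, 104),
}


def overlaps(a, b) -> bool:
    """True if the closed intervals a and b share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


# Check every pair of countries.
pairwise = {
    (x, y): overlaps(ranges[x], ranges[y]) for x, y in combinations(ranges, 2)
}
all_overlap = all(pairwise.values())

for pair, result in pairwise.items():
    print(pair, "overlap" if result else "no overlap")
```

Every pair overlaps, which is the formal version of saying that the three results cannot be confidently ranked.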
The Nordics tell a much clearer story. The differences between Sweden (546) and the other Nordic nations (26, 59, and 104) are so great that we can rule out the possibility that Sweden’s deaths were simply counted differently. Even with a factor-of-2 margin of error, the bottom of Sweden’s range (273) sits above the top of its nearest neighbour’s range (208). A gap that wide must reflect policy errors made by Sweden.
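The same factor-of-2 test can be applied to the Nordic figures quoted above. This is a minimal sketch: error_range is a hypothetical helper, and the three non-Swedish figures are kept as an unlabelled list because the text does not say which country each belongs to.

```python
def error_range(dpm: float, factor: float = 2):
    """Lower and upper bounds after applying a multiplicative margin of error."""
    return dpm / factor, dpm * factor


sweden_dpm = 546
other_nordic_dpms = [26, 59, 104]   # figures quoted in the text

sweden_low, sweden_high = error_range(sweden_dpm)
# Highest upper bound among the other Nordic countries.
highest_other_upper = max(error_range(d)[1] for d in other_nordic_dpms)

clearly_separated = sweden_low > highest_other_upper
print(sweden_low, highest_other_upper, clearly_separated)
```

Unlike the Baltic case, the ranges here do not touch, so the difference cannot be explained away by how deaths were recorded under this margin of error.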
There is one thing that we can all agree on: Taiwan was the clear winner. At 0.3, Taiwan’s DPM was so low that it was an order of magnitude better than the chasing pack. Australia, Singapore, New Zealand, and Slovakia all recorded DPMs between 4.1 and 5.1 – and they are excellent results – but the huge difference between their DPMs and Taiwan’s tells us that Taiwan’s performance was unique. Taiwan was able to do something that no one else could.
If I want a quick and easy metric to compare international COVID-19 outcomes, I always start with DPM. It is easily accessible, it reflects the ultimate health cost of the disease, we can put a reliable margin of error around the official death counts, and the statistic is flexible, meaning we can slice and dice the data by population, geographic region, or other demographics.
To account for the inconsistencies in how deaths from COVID-19 were officially recorded, we can create a range around the official deaths number using a factor of 2.
Population size distorts the Deaths Per Million number, so it should only be used to compare countries with similar populations, and we suggest a factor of 5. Choose your own factor if you disagree, but I don’t think your conclusions will change much.