Data 101: Total Cases and Total Deaths

We measure national COVID-19 outcomes to help us make comparisons between different nations, and then to understand which policies offer the best protection from contagious outbreaks.

We want to know which countries performed the best so that we can learn from their successes. We want to know which countries performed the worst so that we can avoid their mistakes. The latter is perhaps the more important concern. There won’t always be a recipe for success, but there are many recipes for failure. If we can avoid the worst outcomes, we have done most of the hard work. From there we can start to build a framework that will protect us from the next contagious outbreak.

To make these comparisons, we need high-quality data and we need to be able to interpret it correctly. Ireland has not distinguished itself on either front. Both the quality of the information and access to it have been poor, and the national media lacks the statistical grounding to interpret the data we have.

This piece is an attempt to address the second issue – to find a simple basis on which we can make high-level comparisons between nations. Today we will discuss Total Cases and Total Deaths.

Total Cases

Total Cases is a useful piece of information. If new infections are increasing, we know the virus is spreading and causing further harm. As more people become infected now, we can expect more demand for our public health services in the weeks to come. In contrast, if the growth in Total Cases slows, it suggests we are beginning to control the outbreak.
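To make the idea of "growth slowing" concrete, here is a minimal sketch in Python. It converts a cumulative Total Cases series into daily new cases and compares the most recent few days with the few days before them. The function names, the three-day window and the sample numbers are all our own assumptions for illustration; they are not real data and not an official method.

```python
def daily_new_cases(total_cases):
    """Convert a cumulative Total Cases series into daily new cases."""
    return [today - yesterday
            for yesterday, today in zip(total_cases, total_cases[1:])]


def is_growth_slowing(total_cases, window=3):
    """Return True if the average of the last `window` daily increases is
    lower than the average of the `window` increases before them.

    The window length is an arbitrary choice made for this sketch.
    """
    new_cases = daily_new_cases(total_cases)
    if len(new_cases) < 2 * window:
        raise ValueError("need at least 2 * window daily increases")
    recent = sum(new_cases[-window:]) / window
    previous = sum(new_cases[-2 * window:-window]) / window
    return recent < previous


# Hypothetical cumulative counts, invented purely for illustration.
hypothetical_totals = [100, 150, 220, 300, 360, 400, 420]

print(daily_new_cases(hypothetical_totals))    # [50, 70, 80, 60, 40, 20]
print(is_growth_slowing(hypothetical_totals))  # True
```

Note that Total Cases itself never falls. It is the size of the daily increase that tells us whether the outbreak is accelerating or coming under control.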

There are, however, several caveats.

The first is that this metric is unreliable. Total Cases is hugely dependent on the government’s willingness and the health service’s ability to test for the disease. In the early days of the crisis, not all governments were especially keen to test their populations, Ireland’s among them. Some wanted to test only at-risk demographics or people with symptoms. Others did not want to risk unfavourable comparisons. Ireland’s government had not acquired enough tests.

The second caveat is that Total Cases is a lagging indicator. The time from initial infection to test result is currently about 10 days, so the daily new case numbers that we see every night reflect the spread of the virus 10 days previously, give or take. The lag is so great because an individual has to develop symptoms, then book a test, then get tested, and then get their results back. It’s only when they have been notified that the confirmed case goes in the official statistics. A couple of weeks could have passed since the initial infection. This is one of the downstream problems of Ireland’s decision to use only the slower PCR tests.
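The lag can be sketched as a simple date shift. The Python snippet below moves each reported figure back by the roughly 10 days described above to estimate when the infections actually happened. The constant, the function names and the sample figures are assumptions made for illustration. In reality the lag varies from person to person, so this is a rough approximation rather than a correction.

```python
from datetime import date, timedelta

# The approximate lag quoted in this piece; an assumption, not a measured constant.
REPORTING_LAG_DAYS = 10


def estimated_infection_date(report_date, lag_days=REPORTING_LAG_DAYS):
    """Estimate when infections reported on `report_date` actually occurred."""
    return report_date - timedelta(days=lag_days)


def shift_to_infection_dates(cases_by_report_date, lag_days=REPORTING_LAG_DAYS):
    """Re-key a {report_date: new_cases} mapping by estimated infection date."""
    return {
        estimated_infection_date(reported, lag_days): new_cases
        for reported, new_cases in cases_by_report_date.items()
    }


# Hypothetical daily figures, invented purely for illustration.
reported = {date(2021, 1, 15): 300, date(2021, 1, 16): 280}

for infected_on, new_cases in shift_to_infection_dates(reported).items():
    print(infected_on.isoformat(), new_cases)
```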

Finally, while Total Cases gives us an indication of the virus’s spread and helps us to prepare for the following weeks, we shouldn’t mistake the map for the territory. Ultimately, we are trying to save lives. Preventing infections is consistent with that goal, but we will primarily judge our health responses by the number of lives lost, the number of people suffering long-term effects of COVID-19, the treatments postponed and all the other suffering that results from an overwhelmed health care system.

Overall, Total Cases gives us useful information about the spread of the disease, but inconsistencies in national testing regimes make it an unreliable, noisy statistic. Total Cases naturally lags the true spread of the virus, both in time and in scale, meaning the statistic is best interpreted as a minimum that loosely correlates with the true spread of the virus.

Total Deaths

Deaths should be the first and most important metric by which countries judge their COVID-19 efforts. All the suffering that COVID-19 has brought into our lives can be traced back to our natural desire to not die. There may be issues with the way deaths have been recorded, but that shouldn’t blind us to the fact that avoiding death is the ultimate goal of all our pandemic risk management policies.

Common sense dictates that countries with larger populations will tend to suffer more deaths from COVID-19, and this is confirmed in the data. Statisticians call this ‘correlation’: population size and Total Deaths are positively correlated. Naturally, Total Cases also suffers from this effect. The implication for us is that ranking by Total Deaths will tell us more about a country’s population, and less about the quality of its risk management policies.
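For readers who want to see what ‘positively correlated’ means in practice, the Python sketch below computes the standard Pearson correlation coefficient by hand. A value near +1 means the two quantities rise together. The population and death figures are hypothetical, invented only to show the mechanics, and the function name is our own.

```python
from math import sqrt


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical countries, invented purely for illustration.
population_millions = [5, 10, 60, 80, 330]
total_deaths = [4_000, 20_000, 90_000, 70_000, 500_000]

r = pearson_correlation(population_millions, total_deaths)
print(f"correlation between population and Total Deaths: {r:.2f}")
```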

The official Total Deaths statistic has a second flaw: there is no international standard for classifying deaths from COVID-19. If nations use different criteria to classify deaths, the official Total Deaths counts may not offer a fair basis on which to make comparisons.

The WHO offers the following guidance on recording deaths from COVID-19:

COVID-19 should be recorded on the medical certificate of cause of death for ALL decedents where the disease caused, or is assumed to have caused, or contributed to death.

The purpose of this definition is to enable the WHO to collect data on every death that was either caused or affected by COVID-19. By asking the member nations to record every possible case of COVID-19, the WHO ensures that its future research can be done on a consistent, high-quality data set. The WHO was not directing the world to use this one metric alone, as some appeared to believe.

There are many ways to define death due to COVID-19 and none is perfect. They each serve different purposes and nations are free to use multiple criteria. An internationally agreed ‘fair’ basis would clearly be useful for making international comparisons. The EU should have its own definition for the same reason. Similarly, nations should develop their own metrics to support their internal evaluations. Rather than having an argument over which is the 'correct' metric to use, we should recognise that different metrics serve different purposes and record them all.

Unfortunately, most countries produced only one official estimate of Total Deaths. Some chose to interpret the WHO’s request as a directive, while others created their own definitions, which lacked comparability.

Belgium, for example, took a particularly conservative approach, choosing to record the maximum possible death toll from COVID-19 rather than attempting to record a neutral estimate of the true number. Various media reports suggest this deliberate overestimate accounts for 40-50% of Belgium’s official Total Deaths. If so, a neutral estimate would be 50-60% of the official figure, which means the official number could be up to twice the size of a neutral estimate.

Belgium’s criteria for recording deaths from COVID-19 appear to be the broadest in the developed world. As such, Belgium offers us an upper bound on how far an official count is likely to exceed an unbiased or ‘fair’ estimate of a nation’s true death count: if Belgium’s criteria are the broadest, every other country’s overstatement must be smaller. Applying a factor of 2 on both sides of the official number for symmetry, we can put a rough range around the headline numbers. The factor appears especially conservative as an allowance for under-counting, since developed countries are unlikely to have recorded fewer than half of their total COVID-19 deaths.

If a country declares Total Deaths of 1,000, we can be confident that the true number of deaths from COVID-19 is somewhere between 500 and 2,000. That may seem like a large range, but it needs to be. The adjustment accounts for the discrepancy in classification criteria, which is a variable that we cannot observe or quantify directly. We must therefore be conservative in our estimates, and that means we will have a wider range of possible outcomes.
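The arithmetic is simple enough to write down as a short Python sketch. The function name is our own, and the factor of 2 is the rule of thumb proposed in this piece, not an established statistical standard.

```python
def fair_estimate_range(official_total_deaths, factor=2.0):
    """Return (low, high) bounds around an official Total Deaths figure.

    The bounds are official / factor and official * factor. The default
    factor of 2 is this article's rule of thumb, not an official standard.
    """
    if official_total_deaths < 0:
        raise ValueError("official_total_deaths must be non-negative")
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return official_total_deaths / factor, official_total_deaths * factor


low, high = fair_estimate_range(1000)
print(f"true deaths plausibly between {low:,.0f} and {high:,.0f}")
# prints: true deaths plausibly between 500 and 2,000
```

Because the factor is applied by dividing and multiplying, the range is symmetric in ratio terms rather than in absolute numbers: the upper bound sits further from the headline figure than the lower bound does.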

Key Points

Both Total Cases and Total Deaths offer useful information about the status and future path of an outbreak. These statistics can and should be used in combination with each other, and in combination with other metrics, to create a more detailed map of the outbreak within a country. However, they lack the consistency to enable meaningful international comparisons. In the next piece, we will try to account for these inconsistencies with Deaths Per Million.

  • Total Cases is driven by the government's willingness and ability to test for the virus.

  • Total Cases reflects the minimum spread of the virus, and in countries with bad outbreaks the true number is likely to be several times higher.

  • The time lag on recording new cases is currently around 10 days in Ireland, so the daily figures reflect the spread from 10 days ago.

  • Total Deaths is driven by population size, and by the government's definition of death from COVID-19.

  • We can adjust for the diversity in definitions of death from COVID-19 by dividing and multiplying the headline number by 2, giving us a range of values for a fair estimate.