Wednesday, April 15, 2020

A Simple Model of the Coronavirus Pandemic

Take a Petri dish, fill it with agar, put a drop of bacteria sample in it, close it and put it in a warm place. The bacteria will grow explosively at first but then growth will slow down and eventually stop completely once the bacteria have consumed all the nutrients the agar can provide. Mathematically this process can be characterized quite accurately using the logistic function:
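The formula itself is not reproduced in this copy of the post. In its standard three-parameter form it can be written as below, where the symbols are our own choice for illustration: L is the maximum, t_0 is the midpoint and k is the growth rate.

```latex
f(t) = \frac{L}{1 + e^{-k\,(t - t_0)}}, \qquad f(t_0) = \frac{L}{2}, \qquad \lim_{t \to \infty} f(t) = L
```

At the midpoint the curve has reached exactly half of its maximum, and for large t it flattens out at L.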


The logistic function also works well for characterizing pandemics, since the underlying process is similar. The rate of growth of bacteria depends on the number of bacteria and slows down as nutrients run out; the rate of spread of an infection depends on the number of infected individuals and slows down as the number of individuals left to infect declines.

Mathematical models can be arbitrarily complicated and, as an immediate consequence, arbitrarily wrong. It is possible to fit a polynomial to just about any data just by adding enough terms to it, but the predictive value of such an exercise is pretty much nil. The logistic model is simple. It uses just three parameters: midpoint, maximum and growth rate. And it models real, physical phenomena that are ubiquitous in nature: exponential growth and exponential saturation.

In modeling the coronavirus pandemic, we had to add a fourth parameter: a small offset. This is because the Chinese data, for whatever reason, is difficult to treat as part of the world model. In mid-February the coronavirus jumped Petri dishes, if you will. After that point the Petri dish became the entire planet.
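The four-parameter model can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, we assume the offset is simply added to the logistic curve, and apart from the offset of 2770 the numbers are placeholders rather than fitted values.

```python
import math

def logistic_with_offset(t, midpoint, maximum, growth_rate, offset=0.0):
    """Cumulative deaths on day t: a constant offset plus a logistic curve."""
    return offset + maximum / (1.0 + math.exp(-growth_rate * (t - midpoint)))

# Placeholder parameters for illustration only. Day 0 is taken as the
# midpoint, and the growth rate of 0.2 per day is an assumption.
params = dict(midpoint=0.0, maximum=170000.0, growth_rate=0.2, offset=2770.0)

at_midpoint = logistic_with_offset(0.0, **params)    # offset + maximum / 2
far_future = logistic_with_offset(200.0, **params)   # approaches offset + maximum
```

With these placeholder values the curve passes through the offset plus half the maximum at the midpoint and levels off at the offset plus the maximum.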

Once we adjust it for the Chinese data, the model produces an excellent fit with a standard deviation of 2%. The data we chose to analyze is the number of coronavirus deaths as reported by worldometers.info [https://www.worldometers.info/coronavirus/worldwide-graphs/#total-deaths]. The reported number of deaths from coronavirus is far more accurate than any of the other statistics. The number of infected individuals depends on the accuracy of the test and the number of individuals tested. The number of severe cases depends on measures of severity that may vary and may have a subjective component. And although there is a nonzero chance that a death certificate listing COVID-19 as the cause of death may be wrong too, such errors, and whatever interplay of factors gives rise to them, don’t seem to affect the accuracy of our curve-fitting exercise.

Here is a plot of our model (in red) and the data on coronavirus deaths (in blue).


Except for the Chinese coronavirus deaths (which we skip over by introducing an offset of 2770 deaths) the data and the model differ by less than the thickness of the line. Although this is visually satisfying, there may be some non-obvious trends in the data that the model is missing, which we can check for by examining the residuals between the data and our curve:


If the logistic function weren’t capturing all that’s going on with the coronavirus pandemic, then we’d see some trend in the residuals: the data would deviate from the model in a systematic manner. However, all we see here is a bunch of random noise that averages out over time.
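The difference between noise and a systematic trend can be illustrated with synthetic numbers. In the sketch below the "observations" are generated from a logistic curve plus a small alternating wobble that stands in for noise. They are not the worldometers data, and all parameter values and helper names are our own assumptions. Residuals against the correct curve show almost no slope over time, while residuals against a curve with too low a maximum climb steadily.

```python
import math

def logistic(t, midpoint, maximum, growth_rate):
    return maximum / (1.0 + math.exp(-growth_rate * (t - midpoint)))

def trend_slope(ts, values):
    """Least-squares slope of values against time; near zero means no trend."""
    n = len(ts)
    mt, mv = sum(ts) / n, sum(values) / n
    sxy = sum((t - mt) * (v - mv) for t, v in zip(ts, values))
    sxx = sum((t - mt) ** 2 for t in ts)
    return sxy / sxx

days = list(range(60))
# Synthetic observations: true curve plus a +2 / -2 wobble (a stand-in for noise).
observed = [logistic(t, 30.0, 1000.0, 0.2) + (2.0 if t % 2 == 0 else -2.0)
            for t in days]

def residuals(maximum):
    """Observed minus model, for a model with the given maximum."""
    return [o - logistic(t, 30.0, maximum, 0.2) for t, o in zip(days, observed)]

good_trend = trend_slope(days, residuals(1000.0))  # correct model: slope near zero
bad_trend = trend_slope(days, residuals(900.0))    # maximum too low: residuals rise
```

A model that misses part of what is going on leaves its signature in the residuals, which is exactly what we are looking for and not finding.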

Another method for validating our model is regression analysis. It shows that the linearity is excellent.
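One common way to run such a check, and we assume here that this is the kind of regression meant, is to regress the observed values on the model's predictions. A good model gives a slope close to 1, an intercept close to 0 and a coefficient of determination close to 1. The sketch below uses synthetic data and hypothetical helper names, not our actual dataset.

```python
import math

def logistic(t, midpoint, maximum, growth_rate):
    return maximum / (1.0 + math.exp(-growth_rate * (t - midpoint)))

def linear_fit(xs, ys):
    """Ordinary least squares of ys on xs: (slope, intercept, r_squared)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx, (sxy * sxy) / (sxx * syy)

days = range(60)
predicted = [logistic(t, 30.0, 1000.0, 0.2) for t in days]
# Synthetic observations: predictions plus a +2 / -2 wobble standing in for noise.
observed = [p + (2.0 if t % 2 == 0 else -2.0) for t, p in zip(days, predicted)]

slope, intercept, r_squared = linear_fit(predicted, observed)
```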


A useful way to look at the data is by plotting global deaths per day. In the plot below the blue bars are COVID-19 deaths as reported and the dotted red line is our model.
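If the bars show deaths per day, then the model curve drawn over them is the derivative of the logistic function. Using our own illustrative symbols (L for the maximum, t_0 for the midpoint, k for the growth rate), that derivative is a bell-shaped curve that peaks exactly at the midpoint:

```latex
f(t) = \frac{L}{1 + e^{-k\,(t - t_0)}}
\quad\Longrightarrow\quad
f'(t) = k\, f(t)\left(1 - \frac{f(t)}{L}\right),
\qquad
\max_t f'(t) = f'(t_0) = \frac{kL}{4}
```

This is why the peak of the daily curve and the midpoint of the cumulative curve fall on the same date.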


Here we can see where our model predicts the midpoint of the pandemic to occur. It is currently set at April 8. However, as the data has been coming in, the midpoint has been drifting forward. On March 31 it was set at April 4. That is, over 15 days it drifted by 4 days. We can only guess why this is. Perhaps there is a lag in reporting COVID-19 deaths. Or perhaps the size of the planet plays a role and, in spite of near-instant air travel, the geographic spread of the virus is taking a non-negligible amount of time. But these are simply guesses. Suffice it to say that if the drift remains consistent, then by mid-May the model will place the midpoint around April 16.
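The drift arithmetic can be spelled out as a short calculation. The dates are the ones given above; taking "mid-May" to mean May 15 and assuming the drift stays linear are our simplifications.

```python
from datetime import date, timedelta

# Midpoint estimates quoted in the post.
estimate_on_march_31 = date(2020, 4, 4)
estimate_on_april_15 = date(2020, 4, 8)

elapsed_days = (date(2020, 4, 15) - date(2020, 3, 31)).days      # 15
drift_days = (estimate_on_april_15 - estimate_on_march_31).days  # 4
drift_rate = drift_days / elapsed_days  # days of drift per calendar day

# Assumption: "mid-May" is May 15 and the drift continues at the same rate.
target = date(2020, 5, 15)
extra_drift = drift_rate * (target - date(2020, 4, 15)).days     # about 8 days
projected_midpoint = estimate_on_april_15 + timedelta(days=round(extra_drift))
```

Thirty more days at 4 days of drift per 15 days adds 8 days to April 8, which gives April 16.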

As the midpoint drifts, so does the upper bound. A week ago it was close to 140000 total COVID-19 deaths, but now it’s nearing 170000. This drift makes it dangerous to make exact predictions based on our model. The ultimate number of COVID-19 deaths could end up as much as 50% higher. It may very well be the case that the current data, and therefore the model, fail to take into account future geographic spread of the virus into areas with scarce medical resources, where COVID-19 cases will go both undiagnosed and unreported.

As for the many countries that are currently actively battling the coronavirus by shutting down air traffic, imposing curfews and travel restrictions, forcing people to remain indoors, requiring people to wear face masks and to stand far apart in public, we can perhaps assume that for them the data on COVID-19 deaths is both accurate and timely. Based on this assumption, we can arrive at some tentative conclusions.

Different countries have imposed different measures. Some are requiring people to have written passes to set foot outside; others do not. Some have forced a complete economic shutdown; others haven’t. Some test lots of healthy people for coronavirus; others only test suspected cases; and a few conduct tests only as part of a post-mortem, if at all. How does this seem to be affecting the number of COVID-19 deaths? Well, not at all, actually! It seems to be making about as much difference as would frowning and wagging your finger at a Petri dish. The coronavirus is spreading just as it would, and most people who are exposed to it do not even know that they have been exposed to anything out of the ordinary.

Even if we dramatically increase our current estimate of 170000 ultimate COVID-19 deaths to half a million and assume that the coronavirus spreads to every corner of the Earth, this would give us a lethality of roughly 0.007% of the world's population. This is very much in line with the death toll from the 2009 H1N1 pandemic. Note, however, that the 2009 pandemic did not cause financial and economic collapse. We may speculate that the damage from the largely futile measures being taken to control the spread of the coronavirus, which includes millions of job losses and business bankruptcies, will be far more severe than the damage from the coronavirus itself. Directly lethal side-effects of these measures will include significantly higher murder and suicide rates, deaths from malnutrition and starvation and deaths due to lack of medical care from health care systems which have been commandeered to focus on COVID-19.
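The percentage is a one-line division and easy to check. The world population figure below is our assumption (approximately 7.8 billion in 2020); half a million deaths spread over the whole planet comes out at a little over 0.006% of the population.

```python
deaths = 500_000            # the deliberately inflated estimate from the text
world_population = 7.8e9    # assumption: approximate world population in 2020

share_percent = deaths / world_population * 100   # about 0.0064
per_100k = deaths / world_population * 100_000    # about 6.4 deaths per 100,000
```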

Do sane, responsible people shut down the economy and bankrupt themselves for the sake of a virus not unlike the dozens of others in circulation which cause people to cough and sneeze, and sometimes (very rarely) die? No, they do not. Thus, we are forced to formulate other hypotheses. One such hypothesis is that global finance collapsed some time ago, and eventually the trick of “kicking the can down the road” stopped working. And then the global economy collapsed. Luckily, this coronavirus came along just in time to allow the leadership to avoid taking responsibility for what happened.

They have been hyping the coronavirus for all it’s worth, but this won’t work for much longer. What our model is telling us is that the coronavirus pandemic is already past its midpoint and will be over in a month or two. The virus will be gone, and leaders will declare victory, but economic collapse will remain. Some countries will be able to restart their economies while others (the ones that have been kicking the proverbial can down the proverbial road) won’t be able to. And there political collapse will follow economic collapse.

The most significant conclusion we can make from our model is that the coronavirus pandemic is already past its midpoint. This conclusion is tentative, because new data could break the model. For instance, we might see a significant new spike a couple of weeks after the lockdowns and curfews end. But until that happens our model’s simplicity and accuracy make it a good tracking tool. We will continue adjusting our model as new data comes in and publishing updates. For now, however, our model seems to be telling us that the worst is already over. If you need a reason to be optimistic (as most people do right now), here it is!