Friday, 6 March 2020

Estimating the Fatality Rate of the Coronavirus, from Time Series Correlation Analysis (Update March 5, 2020)


Estimating a More Realistic Fatality Rate of the Coronavirus, from Correlation Analysis (Update March 5, 2020)

March 5, 2020 Update

Here is a quick update on the estimated fatality rate of the coronavirus (Covid-19), which the World Health Organization now puts at 3.4%, up from previous estimates of about 2%.  I think this estimate is closer to the actual rate, but still a bit off.  Here’s why.

The graph below shows the worldwide Cases per Day (left scale) and Deaths per Day (right scale).  Though there is a certain spikiness to the data, there appears to be a definite relationship between the Cases Reported and the Deaths Reported.  To a first approximation, the form of the Deaths graph resembles the form of the Cases graph, but shifted to the right, or lagged by some period of time, apparently between a week and two weeks. (This data is from the website below). 





You can get a feel for that by trying to match up some points, as I have done with the colour-coded symbols below, using a 9-day lag between case reports and death reports.



Here is the same graph, but lagged by 7 days.  Visually, there isn’t much difference between the 9-day and 7-day lags.



It turns out that the statistically most likely lag time from this dataset is somewhere between 5 and 10 days.  The correlation between Cases and Deaths is highest for the 6- and 7-day lags, with correlation coefficients of about 0.80.  To be sure, the values for 5, 8, 9, and 10 days are high as well, all well over 0.70.



A quick digression about correlations: they show the relationship between two variables, indicating that if one variable changes, the other also changes.  The correlation coefficient can vary between -1 and +1, with numbers closer to +1 or -1 indicating stronger relationships.  A negative number indicates an inverse relationship, a positive number indicates a direct relationship.  A value of 0 indicates that there is no linear relationship.  Intermediate values show a weaker relationship.
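For readers who want to try the lag search themselves, here is a minimal sketch of the idea in Python.  It is not the calculation behind the graphs in this post, just an illustration: the function names (`pearson`, `lagged_correlation`, `best_lag`) are my own, and the daily counts are made up so that deaths follow cases by exactly 3 days.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(cases, deaths, lag):
    """Correlate cases on day t with deaths on day t + lag."""
    if lag == 0:
        return pearson(cases, deaths)
    # Drop the last `lag` days of cases and the first `lag` days of deaths
    # so that the two lists line up.
    return pearson(cases[:-lag], deaths[lag:])


def best_lag(cases, deaths, lags):
    """Return the lag (from `lags`) with the highest correlation."""
    return max(lags, key=lambda lag: lagged_correlation(cases, deaths, lag))


# Made-up daily counts, for illustration only (NOT the real Covid-19 data).
cases = [10, 40, 90, 160, 120, 80, 60, 100, 150, 70, 30, 20]
# Assumption: deaths are 10% of the cases reported 3 days earlier.
deaths = [0, 0, 0] + [c // 10 for c in cases[:-3]]

for lag in range(6):
    print(lag, round(lagged_correlation(cases, deaths, lag), 2))
print("best lag:", best_lag(cases, deaths, range(6)))
```

With real data the peak is much less sharp than in this toy example, which is why several neighbouring lags can all show high correlations.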

The actual Death Rates

 (Total Deaths at time t+Lag) / (Total Cases at time t)

can be seen on the graph below, for various lags.  Choosing a lag of 7 would give a fatality rate of about 4.2%, and the entire range of likely rates is from 3.7% to 4.6%. 

This is about a percentage point higher than the World Health Organization’s latest figure.  That figure is merely Current Deaths/Current Cases, which is a less precise estimate (though it works fine if you have a steady-state condition, or at least a very slowly changing one).
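To see why the lag matters, here is a small sketch of the lagged rate in Python.  The function name `lagged_fatality_rate` and the cumulative totals are invented for illustration, and I am assuming the rate is evaluated at the most recent day for which both numbers are available.

```python
def lagged_fatality_rate(total_cases, total_deaths, lag):
    """Total deaths at day t + lag divided by total cases at day t.

    Assumption: t is the latest day for which t + lag is still in the data.
    Both lists are cumulative totals, one entry per day.
    """
    t = len(total_cases) - 1 - lag
    if t < 0:
        raise ValueError("lag is longer than the data series")
    return total_deaths[t + lag] / total_cases[t]


# Made-up cumulative totals, doubling each day (NOT the real Covid-19 data).
total_cases = [100, 200, 400, 800, 1600]
total_deaths = [1, 2, 4, 8, 16]

naive = lagged_fatality_rate(total_cases, total_deaths, 0)   # Current Deaths/Current Cases
lagged = lagged_fatality_rate(total_cases, total_deaths, 2)  # deaths today / cases 2 days ago

print(f"naive rate:  {naive:.1%}")
print(f"lagged rate: {lagged:.1%}")
```

In this toy series the cases are still growing quickly, so dividing by today's cases gives a lower rate than dividing by the cases from two days earlier.  If the case count were flat, the two rates would come out the same, which is the steady-state situation mentioned above.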



It should be kept in mind that this analysis shows the time lag between the number of people counted in the Covid-19 case list and the number of people counted in the list of deaths caused by the disease.  It doesn’t follow individual people, just the aggregate numbers.  So, this doesn’t precisely calculate the time from diagnosis to death, for those who did die, but rather uses a proxy for that figure.  Nonetheless, it does provide a useful way of estimating the death rate from the disease, without requiring the tracking of a specific cohort of people over time.



And, here’s a more pleasant travel story than anticipating the worldwide journey of a virus.

On the Road with Bronco Billy

What follows is an account of a ten-day journey through western North America during a working trip, delivering lumber from Edmonton, Alberta to Dallas, Texas, and returning with oilfield equipment. The writer had the opportunity to accompany a friend who is a professional truck driver, which he eagerly accepted. The writer works as a statistician for the University of Alberta, and is therefore generally confined to a desk, chair, and computer. The chance to see the world from the cab of a truck, and be immersed in the truck driving culture, was intriguing. In early May 1997 they hit the road.

Some time has passed since this journal was written, and many things have changed since the late 1990s. That makes the journal not just a geographical account but also a historical one, which I think only increases its interest.

We were fortunate to have an eventful trip - a mechanical breakdown, a near miss from a tornado, and a large-scale flood were among these events. But even without these turns of fate, the drama of the landscape, the close-up view of the trucking lifestyle, and the opportunity to observe the cultural habits of a wide swath of western North America would have been sufficient to fill up an interesting journal.

The travelogue is about 20,000 words, about 60 to 90 minutes of reading, at typical reading speeds.





