Estimating the Coronavirus (Covid-19) Transmission Rate from the Diamond Princess Data
Since the outbreak of the novel coronavirus 2019-nCoV (now
called Covid-19), there have been numerous questions concerning the nature and
effects of the virus. One of the key questions,
obviously of great interest to everybody, is just how fast it can
spread.
The cruise ship Diamond Princess presents us with a “natural
experiment” that can provide some clues.
We have good data on cases per day, which should be quite reliable, as
the crew and passengers were under constant scrutiny for symptoms and were
tested quite soon after showing any. Given the attention of the world on the
situation, it seems probable that the data were accurately reported.
It is worth noting that the Diamond Princess case is
different from the virus in the wild, so to speak, for a number of reasons:
- Great efforts were being made to prevent the spread of the virus within the ship as it remained in harbor, with strict rules about access to the ship, travel within the ship and contact between people on board (passengers and crew). That is, after all, what is meant by quarantine. Granted, there are questions about just how effective those efforts were, but they should have reduced the risk of transmission, at least in theory.
- Conversely, though, keeping all of these people in close quarters, with a virulent virus on board, presents a greater than average risk of any particular person in this limited population coming into contact with the virus. Even though great efforts were presumably being made to enforce the quarantine, the situation was quite favourable, from the virus’s point of view. As has been said, it was a sort of giant petri dish.
Given those facts, it is hard to say how generalizable the
data on spreading is to other conditions, such as an urban environment. Nonetheless, it is worth examining and
learning what lessons we can, keeping in mind these caveats.
Here are some graphs showing the progress of the disease,
while the ship was in quarantine, in numbers of cases. The graphs show the numbers of cases
reported, a best-fit line that gives an idea of the underlying mathematical
function, the statistical properties of that best-fit function (equation and
R-square) and a projection of future cases predicted by the function, had the
situation remained unchanged over the next couple of weeks. Note that this data is publicly available,
including on the ship’s website.
I will present these graphs in ascending order of their
R-square. This is a statistical measure
that gives an idea of how closely the data fits the best-fit function; the
higher the R-square, the better the data fits the functional form. An R-square of 1 is a perfect fit (i.e. no
error-term between the actual data and what would be predicted by the
functional form). Any R-square close to
1 is a good fit, though just how good is a bit of a judgement call.
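For readers who want to see the arithmetic, here is a minimal sketch of how an R-square is usually computed: one minus the ratio of the squared errors around the fitted values to the squared deviations around the mean. The function name and the toy numbers are purely illustrative; they are not the ship's case counts, and I am not claiming this is exactly how the graphs' figures were produced.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy numbers for illustration only (not the Diamond Princess data).
observed = [1.0, 2.0, 3.0, 4.0]
perfect = [1.0, 2.0, 3.0, 4.0]   # predictions that match exactly
rough = [1.5, 1.5, 3.5, 3.5]     # predictions that are each off by 0.5

print(r_squared(observed, perfect))  # a perfect fit gives 1.0
print(r_squared(observed, rough))    # roughly 0.8
```
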
Case 1 – Linear Relationship
This is the best case scenario, where the spread of the
virus is slowest. The graph shows a
relatively slow but steady increase in cases.
In this scenario, the number of cases would remain below 1000 until two
more weeks had passed.
Though the actual data (the blue points) lie relatively
close to the line, the fit doesn’t look that great. The R-square is fairly high at 0.855, though.
This relationship seems fairly unlikely on theoretical
grounds. It implies that the same number of new people become infected every
day. That could happen, but only if the rate of
transmission from person to person were quite low. If the amount of time an infected
person was infectious to others was rather short and contact was limited,
something like this might prevail.
Basically, although the number of people who had been infected would
grow, the number who were actually infectious at any point in time, and
who could therefore spread the disease, would remain stable.
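As a rough illustration of what a linear best-fit line involves, the sketch below fits a straight line by ordinary least squares and projects it forward. The slope is the constant number of new cases per day that the linear model implies. The day and case numbers are made up for the example and the function name is my own; the actual graph may have been produced with different software.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up cumulative case counts, for illustration only.
days = [1, 2, 3, 4, 5, 6]
cases = [12, 19, 31, 42, 49, 60]

slope, intercept = fit_line(days, cases)
print("implied new cases per day:", round(slope, 2))
print("projected cases on day 20:", round(slope * 20 + intercept))
```
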
Case 2 – Exponential Relationship
This is probably the worst case scenario, where the spread
of the virus is the most rapid. The best
fit line on the graph shows a rapid rise in cases, with the numbers exploding
to infinity, as is often said. In
fact, based on this functional relationship, everyone on board the ship would
be expected to fall ill before another week was out (the ship’s complement of
passengers and crew totalled 3711 people).
Again, the actual data (the blue points) lie reasonably close
to the line, but the visual impression of the fit isn’t that great, especially the
lack of fit in the last three points. At
0.859, the R-square is nearly identical to that of the linear relationship.
On theoretical grounds, a pandemic can grow at an exponential
rate, at least for a while. If each
infected person can infect several people, the pandemic can grow at a very
rapid clip. Of course, no function in the
real world can remain exponential for too long. In the case of a virus, it will
eventually run out of hosts as it grows, as the hosts develop immunity or
succumb to the disease and die.
Eventually, the growth rate must turn downward.
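One common way to fit an exponential curve is to take logarithms of the counts and fit a straight line to those, since ln(y) = ln(a) + b*x when y = a*exp(b*x). The sketch below does that and then solves for the day on which the fitted curve would reach the ship's complement of 3711. The data are synthetic (generated from a known exponential so the fit can be checked), and the function names are hypothetical; I am assuming, not asserting, that the graph's trendline was computed in a similar way.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by least squares on (x, ln y). Needs y > 0."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxl / sxx
    a = math.exp(mean_l - b * mean_x)
    return a, b


def days_to_reach(a, b, target):
    """Solve a * exp(b * t) = target for t."""
    return math.log(target / a) / b


# Synthetic data from y = 10 * exp(0.3 * day), for illustration only.
days = list(range(1, 11))
cases = [10 * math.exp(0.3 * d) for d in days]

a, b = fit_exponential(days, cases)
doubling_time = math.log(2) / b      # days for the fitted curve to double
t_all = days_to_reach(a, b, 3711)    # day the curve reaches 3711 people

print("doubling time (days):", round(doubling_time, 2))
print("day on which curve reaches 3711:", round(t_all, 1))
```
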
Case 3 – Quadratic Relationship
The quadratic form is a second order polynomial, which can
also indicate rapid growth for an underlying phenomenon: faster than
the linear model, but slower than the exponential.
This function predicts that about 2000 people
would be infected within two weeks, over half the people on the ship.
In this case, the data points appear to fit the function
very well, with some points slightly below the line and some slightly above
(technically, the errors do not look heteroscedastic, as they did for the exponential). The R-square is very high at 0.964.
The quadratic model indicates that there
is both a linear trend and a second order trend to the data. The latter means that the rate of change is
accelerating, so to speak, as the days go on.
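To make the idea of acceleration concrete, here is a small sketch using a made-up quadratic, y = 2*d^2 + 3*d + 5 (not the fitted equation from the graph). The day-over-day increases grow by a fixed step, and the change in those increases is constant, equal to twice the coefficient on the squared term.

```python
def differences(values):
    """Day-over-day changes in a sequence."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Made-up quadratic for illustration: y = 2*d**2 + 3*d + 5
days = range(0, 7)
cases = [2 * d ** 2 + 3 * d + 5 for d in days]

new_per_day = differences(cases)          # the daily increase keeps growing
acceleration = differences(new_per_day)   # constant: 2 * 2 = 4

print(cases)         # [5, 10, 19, 32, 49, 70, 95]
print(new_per_day)   # [5, 9, 13, 17, 21, 25]
print(acceleration)  # [4, 4, 4, 4, 4]
```

Under a linear model the second list would be constant and the third would be all zeros, which is the difference between the two cases.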
Like the exponential, a second order relationship can only
go on for so long in the real world. It
will eventually be bounded by real world constraints.
Case 4 – Power Law Relationship
A pure quadratic term is a special case of a power law, where
the exponent is equal to the integer 2. The
exploratory power law model shown below is very close to the quadratic model. It would also have about half of the people
on the ship sick within a couple more weeks.
In this case, the data points also appear to fit the
function very well, closely resembling the quadratic. The R-square is slightly higher, at 0.988.
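A power law y = a * x^k becomes a straight line when both axes are logged, since ln(y) = ln(a) + k*ln(x), so one common way to fit it is least squares on the logged values. The sketch below uses synthetic data generated from a known power law with exponent 2, so the recovered exponent can be checked. The function name is hypothetical, and whether the graph's fit was done this way is an assumption on my part.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**k by least squares on (ln x, ln y). Needs x > 0, y > 0."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    sxx = sum((u - mean_x) ** 2 for u in lx)
    k = sxy / sxx
    a = math.exp(mean_y - k * mean_x)
    return a, k


# Synthetic data from y = 1.5 * day**2, for illustration only.
days = list(range(1, 9))
cases = [1.5 * d ** 2 for d in days]

a, k = fit_power_law(days, cases)
print("scale a:", round(a, 3))
print("exponent k:", round(k, 3))  # an exponent near 2 behaves like the quadratic
```
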
The Death Rate on Diamond Princess
At the time that passengers were being transferred to other
locations, there had been only 2 deaths among the 634 cases. We don’t know for sure
when those deaths occurred, but we will make the assumption that they were on the last day (Feb 20,
day 20 of the outbreak if we count its start, Feb 1, as day 1).
Given those parameters, we can calculate a rough death rate,
assuming different lag times for median time from diagnosis to death. Doing that gives the graph below, indicating
a death rate of somewhere between 2 and 5%, assuming that the lag is
between one and two weeks. These are
admittedly rough figures, but they correspond fairly well with the experience
in China, which gives a fatality rate of about 5 to 6 percent, using a similar
lag time.
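The lag adjustment can be sketched as follows: instead of dividing deaths by the latest case count, divide by the cumulative count from some number of days earlier, on the reasoning that those who have died were diagnosed around then. The function name and the toy series are hypothetical; only the 2 deaths and 634 cases come from the figures above, and they are used here just for the no-lag calculation.

```python
def lagged_fatality_rate(cumulative_cases, deaths, lag_days):
    """Deaths to date divided by cumulative cases lag_days before the last day."""
    if not 0 <= lag_days < len(cumulative_cases):
        raise ValueError("lag must fall within the case series")
    denominator = cumulative_cases[len(cumulative_cases) - 1 - lag_days]
    return deaths / denominator


# With no lag, the ship's figures give a very low rate:
print(round(100 * 2 / 634, 2))  # 0.32 (percent)

# Toy cumulative series (NOT the ship's data) to show the effect of a lag.
toy_cases = [5, 10, 20, 40, 80, 160]
toy_deaths = 4
for lag in (0, 1, 2):
    rate = lagged_fatality_rate(toy_cases, toy_deaths, lag)
    print(lag, round(100 * rate, 1))
```

Because case counts were growing quickly, the denominator shrinks fast as the assumed lag lengthens, which is why the estimated rate is so sensitive to the lag chosen.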
It is difficult to extrapolate to the death rates that might
occur in populations less selected than that of a cruise ship. On the one hand, cruise ship populations skew
older, which could lead to higher fatality rates than in a more general
population.
But on the other hand, people generally don’t take cruises if
they are in extremely bad health or are extremely old. Plus, a cruise ship population will be drawn
from economically well off populations, who have benefitted from good health
care all their lives. So, these factors might
tend to indicate a lower death rate.
At any rate, the ship quarantine has now been called off,
and people are being air-lifted back to their home countries or to mainland
Japan, though they may well continue being kept in quarantine in those
locations. So, this interesting natural experiment
is now over. Here’s hoping that
epidemiologists and other public health workers learn some useful lessons from
it, statistical and otherwise.
And, here’s a more pleasant travel story than anticipating
the worldwide journey of a virus.
On the Road with Bronco Billy
What follows is an account of a ten-day
journey through western North America during a working trip, delivering lumber
from Edmonton, Alberta to Dallas, Texas, and returning with oilfield equipment.
The writer had the opportunity to accompany a friend who is a professional
truck driver, which he eagerly accepted. He works as a statistician for the
University of Alberta, and is therefore generally confined to desk, chair,
and computer. The chance to see the world from the cab of a truck, and be
immersed in the truck driving culture, was intriguing. In early May 1997 they
hit the road.
Some time has passed
since this journal was written and many things have changed since the late
1990s. That makes the account not just a geographical journey, but also a
historical one, which I think only increases its interest.
We were fortunate to have an eventful trip - a mechanical breakdown, a near miss from a tornado, and a large-scale flood were among these events. But even without these turns of fate, the drama of the landscape, the close-up view of the trucking lifestyle, and the opportunity to observe the cultural habits of a wide swath of western North America would have been sufficient to fill up an interesting journal.
The travelogue is about 20,000 words, about 60 to 90 minutes of reading, at typical reading speeds.
Amazon U.S.: http://www.amazon.com/gp/product/B00X2IRHSK
Amazon U.K.: http://www.amazon.co.uk/gp/product/B00X2IRHSK
Amazon Germany: http://www.amazon.de/gp/product/B00X2IRHSK
Amazon France: https://www.amazon.fr/dp/B00X2IRHSK
Amazon Spain: https://www.amazon.es/dp/B00X2IRHSK
Amazon Italy: https://www.amazon.it/dp/B00X2IRHSK
Amazon Netherlands: https://www.amazon.nl/dp/B00X2IRHSK
Amazon Japan: https://www.amazon.co.jp/dp/B00X2IRHSK
Amazon Brazil: https://www.amazon.com.br/dp/B00X2IRHSK
Amazon Canada: http://www.amazon.ca/gp/product/B00X2IRHSK
Amazon Mexico: https://www.amazon.com.mx/dp/B00X2IRHSK
Amazon Australia: https://www.amazon.com.au/dp/B00X2IRHSK
Amazon India: https://www.amazon.in/dp/B00X2IRHSK