Investigating Wuhan Pandemic Spread and Determinants

A number of studies are emerging on the impact of the Chinese flu. The impact has varied across countries, and some explanation is needed for the differences in incidence and impact. Anecdotally, the impact seems more profound in Europe and the US and reasonably moderate in Asian and African countries. Very little is known about China, however. The secrecy in reporting cases and deaths in China adds to the woes. If not for the Chinese secrecy and cover-up at every stage, the world would have been spared the severity of the pandemic. Yet China seems unapologetic and is looking for an external scapegoat. While there is a need to hold China accountable for the human costs of the Wuhan pandemic, combating the pandemic must be the first priority.

There are many theories circulating about the incidence of cases and deaths. Cases may well be underreported, yet the same cannot be said of deaths. A study of the pandemic is therefore more useful if it is framed in terms of the death rate. If the death rate is the dependent variable, what would be the independent variables? The number of cases is an obvious candidate: it can be safely hypothesised that deaths increase with cases, assuming the percentage of positives among those tested remains within a certain range. Yet in absolute numbers, deaths, cases and even the number of diagnostic tests differ sharply across countries, so the variables need to be put on a uniform footing. The analysis is therefore better served by the incidence of deaths, cases and tests, where incidence refers to the total number per million of population. The study uses deaths per million as the dependent variable, and cases per million and tests per million as independent variables.
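
The per-million normalisation described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual workflow: the function name `per_million` and every figure in the example are hypothetical and are not taken from the Worldometer data.

```python
def per_million(count, population):
    """Convert an absolute count into an incidence per million inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count / population * 1_000_000


# Hypothetical country with 10 million inhabitants and made-up counts.
population = 10_000_000
death_rate = per_million(250, population)      # deaths per million
case_rate = per_million(5_000, population)     # cases per million
test_rate = per_million(120_000, population)   # tests per million
```

Dividing by population before comparing countries keeps a large country with many cases from dominating the regression simply because of its size.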

While it would be reasonable to assume linkages among these variables, a number of myths are also going around. Prima facie, advanced economies look to be the worst hit. To test this, the study incorporates advanced economies as a dummy variable. Another myth is that malarial regions are less affected than non-malarial regions. The myth got an impetus from the increasing use of hydroxychloroquine in the treatment of the Wuhan flu. India is using it as prophylaxis for high-risk groups, and President Trump’s advocacy added to its aura. While the debate on its efficacy remains inconclusive, it would be interesting to see whether being a malarial region has any impact on the spread of the pandemic.

Similarly, a number of scientists are discussing the feasibility of the BCG vaccine in controlling the spread of the pandemic. Studies are currently underway in Australia and Germany. India is cautious and is about to begin a trial. Some Indian researchers and scientists are advocating the use of BCG as prophylaxis.

To factor the two into the equation, two dummy variables are introduced: MALREG (for malarial region countries) and BCGCOU (for countries with BCG vaccine programs). The first classifies countries as malarial or non-malarial. The second separates countries that currently have a BCG vaccination program from those that do not. For simplicity, countries that have discontinued the vaccine program are treated as non-BCG countries. If anti-malarial drugs and BCG vaccines both play some role, a point to ponder is the impact on countries that lie in the malarial belt and also have a BCG vaccine program. An interaction variable, MALBCG, has therefore been created to account for the combined effect. The advanced economies are factored in through the variable ADVECO.
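
The coding of the dummy variables can be sketched as follows. The function `encode_country` is a hypothetical helper, and the sketch assumes that MALBCG is the product of MALREG and BCGCOU, which is the usual way to build an interaction term; the text does not spell out the exact construction.

```python
def encode_country(in_malarial_region, has_bcg_program, is_advanced_economy):
    """Return the 0/1 dummy variables for one country as a dict."""
    malreg = 1 if in_malarial_region else 0
    # Countries that discontinued BCG are passed as has_bcg_program=False.
    bcgcou = 1 if has_bcg_program else 0
    adveco = 1 if is_advanced_economy else 0
    return {
        "ADVECO": adveco,
        "MALREG": malreg,
        "BCGCOU": bcgcou,
        # Assumed interaction term: 1 only when MALREG and BCGCOU are both 1.
        "MALBCG": malreg * bcgcou,
    }


# Hypothetical example: a malarial-belt country with a current BCG program.
example = encode_country(True, True, False)
```

With this coding, the MALBCG coefficient measures any extra effect that appears only when both conditions hold, over and above the separate MALREG and BCGCOU effects.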

The data on death incidence, case incidence and test incidence was sourced from the Worldometer database on coronavirus, as of April 19, 2020. Because death, case and test incidence change slowly from day to day, a single-day snapshot should be largely robust to small daily variations. The data on malarial incidence was sourced from the CDC database. The BCG program data was sourced from multiple databases and research papers. The classification of advanced economies was sourced from UN and OECD data. The study collected a total of 165 observations comprising both countries and dependencies.

The equation, assuming a linear relationship, is

DEATHRATE = a + b*CASERATE + c*TESTRATE + d*ADVECO + e*MALREG + f*BCGCOU + g*MALBCG
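
For readers who want to see how such a linear equation is estimated, here is a minimal ordinary least squares sketch in plain Python. It is not the tool used for the study: the helper names `solve_linear_system` and `fit_ols` are hypothetical, the approach (solving the normal equations directly) is chosen only for brevity, and the toy data is made up with just two predictors.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy [a | b] so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Return least squares coefficients, intercept first."""
    x = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(x[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Toy data only: two predictors (CASERATE, ADVECO), with DEATHRATE values
# generated exactly from DEATHRATE = 1 + 2*CASERATE - 3*ADVECO.
toy_rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
toy_deathrate = [1, 3, -2, 0, 2]
a, b, d = fit_ols(toy_rows, toy_deathrate)
```

The same function accepts six predictors per row in the order CASERATE, TESTRATE, ADVECO, MALREG, BCGCOU, MALBCG. For real work a statistics package is preferable, since it also reports standard errors and p-values.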

The results of the regression are tabulated below.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.898625
R Square             0.807528
Adjusted R Square    0.800218
Standard Error       52.38903
Observations         165

ANOVA
              df     SS          MS          F           Significance F
Regression    6      1819393     303232.2    110.4828    6.2E-54
Residual      158    433648.4    2744.61
Total         164    2253042
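
The summary statistics follow directly from the ANOVA sums of squares, which gives a quick consistency check on the output above. The sketch below uses only the figures printed in the table; the variable names are illustrative.

```python
# Figures copied from the ANOVA table.
ss_regression = 1819393.0
ss_residual = 433648.4
df_regression = 6
df_residual = 158

ms_regression = ss_regression / df_regression   # mean square, regression
ms_residual = ss_residual / df_residual         # mean square, residual
f_statistic = ms_regression / ms_residual

ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total

n_observations = df_regression + df_residual + 1
adjusted_r_squared = 1 - (1 - r_squared) * (n_observations - 1) / df_residual

# The regression "Standard Error" is the square root of the residual mean square.
standard_error = ms_residual ** 0.5

print(f"F = {f_statistic:.4f}, R^2 = {r_squared:.6f}, "
      f"adj. R^2 = {adjusted_r_squared:.6f}, SE = {standard_error:.5f}")
```

The recomputed values agree with the reported F, R Square, Adjusted R Square and Standard Error up to rounding of the inputs.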

The overall regression, as indicated by Significance F, is significant. An adjusted R-square above 0.80 is quite high. What is more interesting is the result for each variable, tabulated below.

             Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%
Intercept    2.79426         8.058685          0.346739    0.729249    -13.1224     18.71091
CASERATE     0.074948        0.003338          22.45273    5.01E-51    0.068355     0.081541
TESTRATE     -0.00278        0.00028           -9.90794    2.67E-18    -0.00333     -0.00222
ADVECO       38.13036        11.7346           3.249396    0.001413    14.95345     61.30726
MALREG       -5.16432        11.26784          -0.45832    0.64735     -27.4193     17.09072
BCGCOU       -10.7096        11.7305           -0.91297    0.36265     -33.8784     12.45924
MALBCG       12.75067        18.48501          0.689784    0.491342    -23.7589     49.26026
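
The t statistics and confidence bounds in the coefficient table can be re-derived from each coefficient and its standard error. The sketch below does this for the four dummy variables. The critical value of 1.975 is an assumed approximation to the two-sided 95% Student's t value for 158 residual degrees of freedom, and `t_and_interval` is a hypothetical helper name.

```python
# (coefficient, standard error) pairs copied from the coefficient table.
estimates = {
    "ADVECO": (38.13036, 11.7346),
    "MALREG": (-5.16432, 11.26784),
    "BCGCOU": (-10.7096, 11.7305),
    "MALBCG": (12.75067, 18.48501),
}

# Assumption: approximate two-sided 95% critical value of Student's t
# with 158 degrees of freedom.
T_CRITICAL = 1.975


def t_and_interval(coefficient, standard_error, t_critical=T_CRITICAL):
    """Return (t statistic, lower bound, upper bound) for one coefficient."""
    t_stat = coefficient / standard_error
    half_width = t_critical * standard_error
    return t_stat, coefficient - half_width, coefficient + half_width


for name, (coef, se) in estimates.items():
    t_stat, lower, upper = t_and_interval(coef, se)
    spans_zero = lower < 0 < upper
    print(f"{name}: t = {t_stat:.3f}, 95% CI = ({lower:.2f}, {upper:.2f}), "
          f"interval spans zero: {spans_zero}")
```

A confidence interval that spans zero corresponds to a coefficient that is not significant at the 5% level, which is the case for MALREG, BCGCOU and MALBCG but not for ADVECO.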

As expected, both CASERATE and TESTRATE show significant P-values. MALREG, BCGCOU and the interaction variable MALBCG are not significant. At first glance, therefore, running a BCG vaccine program or not seems to have no impact on the death rate. The strongly argued claim that the lack of a BCG vaccine program led to higher deaths in Italy may prove unfounded. Similarly, there seems to be no impact from the presence of malarial antibodies in people living in regions where malaria is prevalent. However, ADVECO is significant at the 99% confidence level. The coefficient of 38.13 indicates about 38 more deaths per million in advanced economies than in emerging economies, with the other variables held constant. This is food for thought.
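
To make the reading of the ADVECO coefficient concrete, the fitted equation can be evaluated for two otherwise identical countries. This is only an illustrative sketch: the coefficients are copied from the table, the case and test rates are made-up inputs, `predict_death_rate` is a hypothetical helper, and MALBCG is assumed to be the product of MALREG and BCGCOU.

```python
# Coefficients copied from the regression output.
COEFFICIENTS = {
    "INTERCEPT": 2.79426,
    "CASERATE": 0.074948,
    "TESTRATE": -0.00278,
    "ADVECO": 38.13036,
    "MALREG": -5.16432,
    "BCGCOU": -10.7096,
    "MALBCG": 12.75067,
}


def predict_death_rate(case_rate, test_rate, adveco, malreg, bcgcou):
    """Evaluate the fitted linear equation for one country (deaths per million)."""
    c = COEFFICIENTS
    malbcg = malreg * bcgcou  # assumed construction of the interaction term
    return (c["INTERCEPT"]
            + c["CASERATE"] * case_rate
            + c["TESTRATE"] * test_rate
            + c["ADVECO"] * adveco
            + c["MALREG"] * malreg
            + c["BCGCOU"] * bcgcou
            + c["MALBCG"] * malbcg)


# Made-up profile: 1,000 cases and 10,000 tests per million, no malaria, no BCG.
emerging = predict_death_rate(1000, 10000, adveco=0, malreg=0, bcgcou=0)
advanced = predict_death_rate(1000, 10000, adveco=1, malreg=0, bcgcou=0)
gap = advanced - emerging  # equals the ADVECO coefficient
```

Whatever case and test rates are plugged in, the gap between the two predictions is the ADVECO coefficient, which is what "38 more deaths per million" means in a linear model.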

A puzzle is the reason behind the higher death rate in the developed world as opposed to the emerging and underdeveloped world. One reason could be higher rates of reporting: it is possible that underreporting is happening in the underdeveloped world. But primary reports do not indicate such a high level of underreporting. In fact, many reports do indicate fewer cases in countries in the underdeveloped or emerging regions. Yet, in the absence of medical facilities and reporting infrastructure, it is quite possible that many deaths are being attributed not to the Wuhan flu but to a host of other epidemics. Nonetheless, even accounting for this, ample thinking is needed to decode the reasons rather than merely attributing the gap to inefficiencies in reporting.

The statistical results do point to a few indicators worth investigating further. Factors such as temperature and population density could be integrated into the equation to gain further insight into the determinants of the pandemic's spread and the dissimilar patterns of spread across the world.
