1 AC: What the data tell us

Be wise today so you don’t cry tomorrow.

E.A. Bucchianeri

We’re well into Year 2 AC – After Covid. Clearly, we don’t know all we need to know. Conversely, we are awash in data and probably know more – collectively – than we think we do. In this series of posts, I’ve been presenting my observations, preliminary conclusions they’ve led me to, and what might be better approaches to future pandemics and other disasters, from a community perspective. In this post, I want to focus on the restrictions placed on all of us in response to the pandemic. There is a lot of misinformation out there (particularly if you listen to the media or the rather unprofessional rantings of Rochelle Walensky). While we don’t have “final” data for the pandemic, I think we’re close enough to the end to draw some conclusions about the effectiveness – or not – of the restrictions that we’ve been living with.

Probably front and center in most people’s minds is “Did the lockdowns work?” We paid a high price in terms of our economy, our social fabric and our kids’ lives; we need to know whether we got value for the disruptions. In answering this question, we are faced with several important hurdles:
Goals. Initially, we were told that lockdowns were necessary to flatten the curve. Then we were told that they were continued to control the pandemic (whatever that means).
Data. We have case data that is not very good, primarily because we did so little testing early on. The death data seems to be better, but again has some biases because of differing protocols for attribution across jurisdictions, misstatements by some public officials, and some question about the accuracy of early data. As noted in my last post, a better early warning system could have helped us have better data early on. For this analysis, I’m going to look at both case and death data, current to 4/5/21.
Lockdowns and NPIs. If it were simply a matter of looking at lockdowns vs no lockdowns, analysis would be so much simpler! Unfortunately, almost every state has had a mix of non-pharmaceutical interventions (NPIs – mask mandates, social distancing, quarantines and lockdowns), making it difficult to isolate the effects of lockdowns. Further, these have changed over time. There are at least two attempts to develop an index that tries to reflect this spectrum of NPI responses on a common scale – I’m going to use the one developed by WalletHub, and the values for 2/26/21. (Although I don’t present the data here, I’ve looked at the indices for a few dates. While the absolute values change, the general conclusions are the same.) I have renamed their index “State Openness.”

In the following figure, I’ve plotted both cases and deaths by states (treating both Puerto Rico and the District of Columbia as states). Clearly, there is no relation between “state openness” and the death rate (R² ~ 0). The data suggests that the states in the red box might want to compare their practices with those in the green boxes – a factor of five fewer deaths! There is a rough correlation (R² = 0.35) between the case rate and state openness, but it is heavily influenced by outliers, especially the cluster of states in the green box (the gray box represents the standard error). Thus, it appears that the NPIs may reduce the case rate, but have little to do with the death rate. This makes sense because the case rate depends on the public’s actions; NPIs influence those. The death rate is more a reflection of the quality of the health care system; NPIs have little influence there.
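For readers who want to check an R² value like the ones quoted above, here is a minimal sketch of the calculation for a straight-line fit. The function name `r_squared` and the numbers in the example are my own placeholders, not the WalletHub index or the actual state data; the point is only to show what the statistic measures.

```python
def r_squared(x, y):
    """Coefficient of determination for a least-squares line y = a + b*x.

    For a simple straight-line fit this equals the squared Pearson
    correlation between x and y.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # no variation in x or y, so the line explains nothing
    return (sxy ** 2) / (sxx * syy)


# Made-up placeholder values, NOT the real state data.
openness = [10, 20, 30, 40, 50, 60]
deaths_per_100k = [150, 90, 160, 100, 155, 95]
cases_per_100k = [6000, 7000, 6500, 8000, 9000, 8500]

print("deaths vs openness R^2:", round(r_squared(openness, deaths_per_100k), 2))
print("cases vs openness R^2:", round(r_squared(openness, cases_per_100k), 2))
```

An R² near 0 means a straight line through the points explains almost none of the state-to-state variation; an R² of 0.35 means it explains roughly a third of it.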

I’ve also looked at county data. Ideally, if there were a significant predictor of cases, counties and states could be better prepared to deal with potential “hot spots.” I’ve based my search on the CDC’s 2019 county health rankings data, thinking those data were likely to be the best source for a predictor. I looked at all of the data – but I won’t bore you with a plethora of scatter plots! One predictor that was discovered early on still holds – population density is a good predictor of the number of cases, as is the total population of the county (well, duh!). However, the two counties with the highest incidence of covid-19 are Chattahoochee County, GA, and Crowley County, CO; neither is a large metro area. In both, about one-third of the residents were infected.

There appeared to be “fuzzy” relationships between median household income and the prevalence of both cases and deaths in a county. The highest case and death rates per 100,000 residents seen at a given income level fell as median household income rose. This was true for all residents, as well as when broken down by race. Let me stress this was not a correlation; rather, it appeared that a low median household income was a necessary (but not sufficient) condition for high case and death rates.
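One way to look for a “necessary but not sufficient” pattern like this is to ignore the average and track the ceiling: group counties into income bands and record the worst rate seen in each band. The sketch below is only an illustration of that idea. The function name `ceiling_by_income_bin`, the $10,000 band width, and the sample numbers are all my own assumptions, not the county data.

```python
from collections import defaultdict


def ceiling_by_income_bin(incomes, rates, bin_width=10_000):
    """Return {income band start: highest rate seen in that band}.

    A ceiling that drops as income rises, while low-income bands still
    contain plenty of low-rate counties, is the signature of a necessary
    (but not sufficient) condition rather than a correlation.
    """
    ceilings = defaultdict(float)  # rates are assumed to be non-negative
    for income, rate in zip(incomes, rates):
        band_start = int(income // bin_width) * bin_width
        ceilings[band_start] = max(ceilings[band_start], rate)
    return dict(sorted(ceilings.items()))


# Made-up placeholder counties: (median household income, cases per 100,000).
incomes = [32_000, 38_000, 45_000, 47_000, 71_000]
rates = [12_000, 3_000, 8_000, 2_000, 4_000]

for band, worst in ceiling_by_income_bin(incomes, rates).items():
    print(f"${band:,} and up: worst rate {worst:,}")
```

In this toy example the low-income band contains both a very high and a fairly low rate, which is why a plain correlation would look weak even though the ceiling clearly falls with income.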

Beyond these, I didn’t find any other data that were correlated with either case or death data.* Perhaps most notably, neither the Covid Community Vulnerability Index nor the Social Vulnerability Index correlated with either cases or deaths. This is particularly unfortunate, because they are intended to indicate potential hot spots. At least at the county level, they don’t.

The county data was further broken down by the type of county. The CDC classifies counties as large, medium or small metro centers; large fringe centers; micropolitan centers; or non-core (rural areas). Rather than plot all of the data (a confusing profusion of colors and shapes), I’ve plotted the best fit lines for each county type vs state openness. While there is not a good fit for any of these, the “bunching” of the lines for the case data indicates that the county type did not make much of a difference in terms of cases. However, as the second graph of this pair shows, non-core counties tended to have significantly more deaths than the other county types. I’ve plotted the raw data for the large metro counties (red) and the non-core counties (green) in the lower graph. The data suggest that the health care systems in many of the rural counties – but not all – are simply inferior. This may be due to a lack of medical personnel and hospitals, or the distance between those who died and health care centers; i.e., poorer care or poorer delivery. As a matter of interest, all four of the large metro counties with the highest deaths per resident were in NY – Queens, Bronx, Kings and Richmond Counties. Foard County, TX; Emporia, VA; and Jerauld County, SD, were the highest of all counties.
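The “one best fit line per county type” comparison can be sketched as follows. This is a plain least-squares fit done separately for each group, not necessarily the exact procedure behind the graphs; the function names `fit_line` and `fit_by_group` and the sample rows are hypothetical placeholders.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_by_group(rows):
    """rows: iterable of (county_type, state_openness, rate) tuples.

    Returns {county_type: (slope, intercept)}, one fitted line per type.
    """
    grouped = defaultdict(lambda: ([], []))
    for county_type, openness, rate in rows:
        xs, ys = grouped[county_type]
        xs.append(openness)
        ys.append(rate)
    return {name: fit_line(xs, ys) for name, (xs, ys) in grouped.items()}


# Made-up placeholder rows, NOT the real county data.
rows = [
    ("large metro", 20, 110), ("large metro", 40, 120), ("large metro", 60, 130),
    ("non-core", 20, 180), ("non-core", 40, 200), ("non-core", 60, 250),
]
for county_type, (slope, intercept) in fit_by_group(rows).items():
    print(f"{county_type}: rate = {intercept:.1f} + {slope:.2f} * openness")
```

Lines that bunch together have similar slopes and intercepts; a group whose line sits well above the others at every level of openness is the pattern described for deaths in non-core counties.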

Finally, I’ve looked at state unemployment numbers for February (latest available data). Again, there is a rough (negative; R² ~ 0.4) correlation between unemployment and state openness. The most interesting outlier (at least to me) is Vermont (lower left corner) – one of the states with the most restrictions (NPIs) and yet very low unemployment. Perhaps unsurprisingly, California and New York have very high unemployment; but surprisingly (to me) Hawaii has the highest unemployment – probably indicative of restrictions on travel.

So, what’s the data trying to tell us? Lockdowns and the other NPIs have had a modest impact in limiting the number of cases but have also led to higher unemployment. The NPIs have had no measurable impact on the number of deaths. In that sense, they have done nothing to control the pandemic – lots of pain for little gain. The data on cases and deaths by county type clearly show that there are major disparities in rural health care for virtually every state. Perhaps most unfortunately, the data don’t point to a good predictor of impacts at the county level. The CCVI and SVI were worthy attempts to provide this, but ultimately have not been shown to be useful. It could be useful if health professionals dug more into the relationships between cases and deaths and household incomes; there could be a pony in there!

Clearly, I’m not a health professional. I have tried to present the data in as apolitical a way as I can because the messages from the media have been filtered through their political biases. As Ernie Broussard has said, “Pain is inevitable, but suffering is optional.” If our communities are to avoid unnecessary suffering when the next pandemic hits, we will have to make some hard decisions and take difficult steps to alter our approaches. Let us hope that our leaders will base those decisions on cold facts instead of the hot passions of the moment, or the emotional push to “just do something.” Let us hope that they are wise, lest the rest of us shed tears.


*The data from the 2019 county health rankings that did not correlate with either cases or deaths were:
Life expectancy (overall and by race);
Age adjusted mortality;
Child and infant mortality;
% of the population experiencing frequent physical and mental stress;
% of the population with diabetes;
Number and prevalence of HIV cases;
Number and prevalence of food insecurity;
Number and prevalence of limited access to health care;
Number and prevalence of drug overdoses resulting in death;
Number and prevalence of deaths due to motorcycles;
% of the population with insufficient sleep;
Number and ratio of primary care physicians to residents;
% of the population who are disconnected youth;
% of the population on free lunch;
Segregation index;
Homicide rate;
Number and prevalence of firearms deaths;
Number and prevalence of homeowners;
Number and prevalence of severe housing cost burden;
Fraction of the population under 18;
Number and fraction of the population over 65 (overall and by race);
Number and prevalence of English as a second language;
Fraction of the population who are female;
Number and fraction of the population living in a rural area;
The individual themes and the overall CCVI;
The SVI.