1. Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the etiologic agent of the coronavirus disease 2019 (COVID-19) pandemic. As of 31 August 2020, the associated death toll in the United States is reported to have surpassed 180,000 [
1], the highest of any country in raw numbers but comparable to that of many other developed countries when adjusted for population [
2]. The proper recognition and remediation of the disease are pressing concerns and each will likely be subject to debate in the months prior to the 2020 presidential election [
3,
4]. However, there is some concern surrounding the veracity of the data and factors contributing to COVID-19 deaths. Media outlets provide daily updates on the number of cases and deaths but draw this information from data collection and reporting agencies that have adjusted their methods over time [
5]. The resulting inconsistencies have led to charges of underreporting [
6,
7] and overreporting [
8,
9], and have contributed to the politicization of the pandemic.
COVID-19 data inconsistencies and potential political bias in data reporting can have significant implications. If the data that politicians rely on are faulty, subsequent policies may harm public health, the economy, and other aspects of society. Testing differences, false positives, false negatives, and other factors that likely differ from state to state and county to county make the underlying official reports of deaths with COVID-19 somewhat suspect; however, this study leverages the official data used by the Centers for Disease Control and Prevention (CDC). Several recent studies have examined COVID-19 at the county level. Badr et al. (2020) evaluated mobility patterns and COVID-19 transmission [
10]. This study provided county level spread data but did not focus on deaths. Scannell et al. (2020) demonstrated racial disparities at the county level for COVID-19 cases and deaths [
11]. Cases, unfortunately, suffer from severe measurement problems, as will be discussed. Ives and Bozzuto (2020) analyzed county level estimates of R0, the basic reproduction number for COVID-19 [
12]. Altiera et al. (2020) estimated county level deaths for use in estimating required medical supplies [
13]. Two articles consider political factors—Flanders et al. (2020) assessed voter turnout as related to COVID-19 [
14], and Makridis and Rothwell (2020) evaluated the effects of political polarization but not in terms of death rates [
15]. We found no other paper that addresses death rate disparities by including a political variable. Thus, given the novel nature of the virus and its progression and the known inconsistencies in the reported data, we sought to gain a deeper understanding of the factors that contribute to reported deaths from COVID-19.
1.1. Research Questions
We investigated three research questions. First, what attributes of geography, demography, population density, economy, population health, hospital characteristics, and politics might explain the deaths per 100,000 (death rate) at the county level as of 31 August 2020? Second, did COVID-19 death rates at the state level differ based upon the governor’s party affiliation after accounting for other relevant variables? Third, as a control for the second question, did previous flu/pneumonia death rates (2014–2018) vary with the governor’s party affiliation?
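The death rate used throughout is a simple normalization of death counts by population. As a minimal sketch (the function name and the example county are hypothetical and not drawn from the study's data), it could be computed as:

```python
def death_rate_per_100k(deaths, population):
    """Return deaths per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply first so that integer inputs stay exact before the division.
    return deaths * 100_000 / population

# Hypothetical county: 45 deaths among 250,000 residents.
example_rate = death_rate_per_100k(45, 250_000)
```

Normalizing in this way is what allows small rural counties and large urban counties to be compared on the same scale.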
1.2. Significance and Motivation
To our knowledge, this research is the first to evaluate COVID-19 using combined data from multiple areas covering demographic, socioeconomic, health system, population health, and political factors using a spatial regression. It is also the first study to evaluate the effects of state and county political affiliation on COVID-19 death rates. The motivation behind this study is to address the media promulgation of explanatory factors that may or may not be scientifically verifiable (e.g., population density and political factors), particularly when placed in the context of other known factors established at the individual unit of analysis (e.g., race).
4. Discussion
4.1. Summary of Results
In this study, we first ran a county level analysis for death rates based on geographical, socioeconomic, health status, health capability, and political groupings. Our investigations were reduced to two full OLS models and two geospatial models. From our analysis, it was clear that geospatial models with lags were preferred to the OLS models. Further, the reduced GIS model using only variables identified from lasso produced nearly the same R² as the full GIS model (0.500 versus 0.507, respectively). Thus, the reduced model performs nearly as well as the full model in estimating county death rates. In that model, we see significant geospatial effects (ρ), as well as those associated with population density, race, and the winning party in the 2016 election. The estimate for Democratic counties (untransformed) was 4.503 deaths per 100,000.
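The role of the geospatial term ρ can be illustrated with the standard spatial lag form, y = ρWy + Xβ + ε, in which ρ scales the average outcome among a county's neighbors. The sketch below assumes that form and a row-standardized contiguity matrix W. The ring of five counties, the coefficient values, and all function names are hypothetical; this is not the specification fitted in the study.

```python
import numpy as np

def row_standardize(adjacency):
    """Scale each row of a 0/1 adjacency matrix so that it sums to 1."""
    adjacency = np.asarray(adjacency, dtype=float)
    row_sums = adjacency.sum(axis=1, keepdims=True)
    # Units with no neighbors keep an all-zero row.
    return np.divide(adjacency, row_sums,
                     out=np.zeros_like(adjacency), where=row_sums > 0)

def simulate_spatial_lag(W, X, beta, rho, eps):
    """Solve y = rho * W @ y + X @ beta + eps for y (the reduced form)."""
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - rho * W, X @ beta + eps)

# Hypothetical toy geography: five counties arranged in a ring.
n = 5
A = np.zeros((n, n))
for i in range(n):
    A[i, (i - 1) % n] = 1
    A[i, (i + 1) % n] = 1
W = row_standardize(A)

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one covariate
beta = np.array([1.0, 0.5])                            # assumed coefficients
eps = rng.normal(scale=0.1, size=n)
rho = 0.4                                              # assumed spatial dependence

y = simulate_spatial_lag(W, X, beta, rho, eps)
spatial_lag = W @ y  # average outcome among each county's neighbors
```

When ρ is nonzero, each county's outcome depends on its neighbors' outcomes, which is the dependence that an OLS model without a lag term leaves in the residuals.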
For the state level analysis, we found effects associated with the proportion minority, three principal components associated with health status variables, and the interaction between the governor’s party and the first health status variable. However, when removing the four states with the highest death rates (New Jersey, New York, Massachusetts, and Connecticut), we found that the only predictive variable was the minority proportion in the state. Further, an analysis of influenza death rates showed no effect associated with political party.
4.2. Population Density Effects
Population density has been identified as a predictive factor in disease progression [
41,
42]. A superficial examination of county level data indicates that a relationship might exist between population density and death rate from COVID-19 (see
Table 2). Consistent with prior analysis [
43,
44],
Table 2 also shows urban areas tended to vote Democrat in the 2016 presidential election. Due to these associations, media outlets have presented the urban–rural divide as a viable explanation for the difference in death rates between counties that voted Democrat in 2016, and those that voted Republican [
45,
46]. This divide has also provided an explanation for the divergent response to the disease based on party affiliation. For example, Democrats are more concerned about COVID-19 than Republicans, and are more likely to wear a facemask and practice other forms of social distancing [
28,
47,
48]. However, the effect size of population density at the county level is negligible when other factors are considered. For example, in the reduced GIS model for counties, the standardized coefficient is only 0.051. Population density does not appear as a significant variable in the state level models. The failure of population density to provide a more substantial explanation for deaths from COVID-19 is one of the surprising results of our analysis.
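A standardized coefficient expresses an effect in standard deviation units, which is why a value such as 0.051 can be read as small regardless of how density is measured. A minimal sketch follows, assuming the common convention of rescaling a raw slope by the ratio of the predictor's standard deviation to the outcome's. The study's exact standardization is not stated here, and the data and function name are hypothetical.

```python
import numpy as np

def standardized_coefficient(b, x, y):
    """Rescale a raw slope b to standard deviation units of x and y."""
    return b * np.std(x, ddof=1) / np.std(y, ddof=1)

# Hypothetical predictor and outcome values.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

slope = np.polyfit(x, y, 1)[0]          # raw OLS slope of y on x
beta_std = standardized_coefficient(slope, x, y)
```

In a regression with a single predictor, this quantity equals the Pearson correlation between x and y; in a multiple regression it is the expected change in the outcome, in standard deviations, per one standard deviation change in the predictor with the other variables held fixed.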
4.3. Race and Ethnicity/Minority Effects
At the county level, our study confirms the findings of numerous researchers pertaining to healthcare disparities in the United States, particularly with respect to Native American, Hispanic, and African American populations [
49,
50,
51]. We found an increase in the percentage of these populations to be associated with an increase in mortality from COVID-19 at the county and state levels of analysis. McLaren (2020) attributes this difference to disparities in education, occupation, and commuting patterns [
51]. The causes of disparity, however, are not explained by the covariates in this study (see Carl, 2020 [
52]). Although we did not include these factors in our analysis, we did find that the mortality disparities do not appear to be attributable to differences in unemployment rates or household income. Our county findings suggest that there are healthcare disparities in the United States, but they may also indicate a pathogenesis of COVID-19 that has a greater and disproportionate effect within these three racial groups [
53,
54]. At the state level, increases in minority population proportions were also associated with increases in death rates per 100,000.
4.4. Health Status Effects
At the state level, health status (measured by three principal components and the interaction between the governor’s party and the first principal component) was a predictor for the n = 51 state observations. These health status effects disappeared after removing the four outlier states from the model. Thus, it would appear that minority status is the predominant predictor such that increases in the proportion of minorities are associated with increases in deaths per 100,000.
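The state level predictors described above (three principal components plus an interaction with the governor's party) can be sketched as follows. The example assumes standardized inputs and a 0/1 party indicator; the six random health indicators, the indicator coding, and the function name are hypothetical stand-ins rather than the study's variables.

```python
import numpy as np

def principal_component_scores(X, n_components):
    """Return scores on the leading principal components of standardized X."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T

rng = np.random.default_rng(1)
health = rng.normal(size=(51, 6))            # hypothetical: 51 states x 6 indicators
scores = principal_component_scores(health, 3)

governor_dem = rng.integers(0, 2, size=51)   # hypothetical 0/1 party indicator
interaction = governor_dem * scores[:, 0]    # party x first principal component

# Design matrix: intercept, three components, party, and the interaction term.
design = np.column_stack([np.ones(51), scores, governor_dem, interaction])
```

Because component scores are mutually uncorrelated by construction, they avoid the collinearity that the original, correlated health indicators would introduce into a model with only 51 observations.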
4.5. Unemployment Effects
At the county level (and consistent with prior research), unemployment characteristics were identified as having a significant association with COVID-19-related deaths [
44,
45]. While this association is clear, its causation is not. It is possible that unemployment increases exposure to the disease; for example, cost-cutting might lead to increased use of public transportation. It is possible that unemployment increases vulnerability to the disease through elevated stress levels and poor nutrition. The unemployed may also be left without access to healthcare, which increases mortality from disease. However, it is also possible that unemployment increases the incidence of deaths of despair (deaths due to drugs, alcohol, and suicide), and that these excess deaths (defined by the CDC as the difference between the observed number of deaths and the expected number of deaths in a specific time period) [
55] are being reported as COVID-related. For example, on 13 April 2020, New York City added more than 3700 people to the COVID-19 death total, people who were presumed to have died of the coronavirus but had never tested positive [
56,
57]. Without a positive test, it is impossible to know if these additional deaths—at the time, 37% of the city’s total—were actually COVID-related, were deaths of despair, or were due to other causes.
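Two pieces of arithmetic underlie this paragraph: the CDC definition of excess deaths (observed minus expected) and the share of New York City's total represented by the added presumed deaths. The sketch below uses the rounded figures quoted in the text (3700 and 37%), so the implied total is approximate. The function names and the weekly counts are hypothetical.

```python
def excess_deaths(observed, expected):
    """Observed deaths minus expected deaths for the same period."""
    return observed - expected

def implied_total(part, share):
    """Total implied when `part` makes up fraction `share` of it."""
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    return part / share

# Hypothetical week: 1,250 deaths observed where 1,000 were expected.
weekly_excess = excess_deaths(1250, 1000)

# Rounded figures from the text: about 3700 presumed deaths, about 37% of the total.
nyc_total_estimate = implied_total(3700, 0.37)
```

On these rounded figures the city's total at the time works out to roughly 10,000 deaths. The calculation says nothing about cause, which is the point the text makes about deaths recorded without a positive test.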
Periods of economic downturn have long been found to be associated with declines in health status and higher suicide rates compared with periods of relative prosperity [
46,
47,
48]. Recent research has found a 17% increase in drug overdose nationally during April and May 2020 [
58]. Compounding the problem, there are indications that a prolonged and overly restrictive COVID response is deepening an already deleterious economic cycle, the result of which is increased unemployment [
49]. As unemployment increases, so does the mortality rate, whether directly or indirectly from the disease. In short, extended efforts to eradicate the disease may cause additional harmful secondary and tertiary effects that may be worse than the disease itself.
4.6. Political Party Effect
The influence of politics on the reporting of COVID-19 mortality was a significant finding in our analysis. County level Democratic affiliation was significantly associated with increased COVID-19 deaths, even after controlling for factors such as population density. To the best of our knowledge, this is the first time that population density and urbanization have been used as controls when comparing death rates between Democratic and Republican states.
In past years, the CDC retrospectively tabulated the number of flu-associated illnesses, hospitalizations, and deaths—a process that takes up to two years to generate an estimate. The process relies on estimation modeling in and out of hospitals based on behavioral algorithms [
59]. The CDC never relies solely on death certificate data because it recognizes that large-scale testing is never performed and that clinicians do not routinely list influenza on death certificates if the patient died of pneumonia, heart failure, or deteriorating lung disease. According to the CDC, this leads to significant underreporting of deaths due to flu every year [
59].
On 20 February 2020, the CDC published guidelines for the diagnosis and mandatory reporting of COVID-19 for any patients evaluated with “COVID related” illnesses. This applied to all healthcare practitioners and included a comprehensive set of instructions and codes to document any relationship to COVID-19 on the death certificates [
60]. This represents a significant change in how the disease is reported and, consequently, in its inclusion on death certificates. Three additional guidelines issued in March and April affirmed these measures. In addition, the new CDC guidance stated that: “In cases where a definite diagnosis of COVID–19 cannot be made, but it is suspected or likely, it is acceptable to report COVID–19 on a death certificate as ‘probable’ or ‘presumed’” [
60]. This change introduced significant potential variations in the tabulation of COVID-19 death tolls.
At approximately the same time, the Centers for Medicare and Medicaid Services (CMS) authorized an additional 20% reimbursement for patients carrying a diagnosis of COVID-19 pursuant to Sections 3710 and 3711 of the CARES Act [
61]. These changes created a financial incentive for hospitals to classify patients as positive for COVID-19. Importantly, at the time these measures were introduced, the dominant model used by policy-makers—based on Ferguson et al. [
62]—predicted an exceptionally high mortality rate [
63]. By late March, more accurate estimates predicted a mortality rate well below original expectations [
64]. This should have triggered a policy reversal from the CDC and CMS, but no changes were noted. In short, in the politically charged landscape of 2020, the CDC’s new way of collecting data, combined with CMS’ monetary incentives, may have resulted in the overreporting of COVID-19 deaths. The introduction of these two new sources of reporting bias makes historical comparisons unreliable at best. Without reliable data, it is difficult to effectively fight a pandemic. This conundrum associated with the reliability of data on COVID-related deaths highlights the need for objective and uniform standards for case identification and data collection.
5. Conclusions
During our analysis, we evaluated the data that pointed toward political interference in the reporting of COVID-related deaths. As of 31 August 2020, it is clear that the national death rate from COVID-19 is higher than that from flu pandemics, but the increase in the reported death rate in states with Democratic governors has been greater than the increase in states with Republican governors. Much more research in the area of politicization of medical reporting is needed, particularly given the political climate of the United States.
One of the major limitations of this study is that the associated methods are unable to estimate causality. Any variable found to be unimportant in this analysis might have its effects mediated out by others. The coefficient estimates are associated with the model built, and the associated p-values suggest the importance of that model. A second important limitation is that this analysis is current only as of 31 August 2020. The analysis will continue to change as the pandemic peaks and subsides.
Future research should supplement this analysis by investigating whether states with contested gubernatorial elections (e.g., those with ballot purges, an issue that is becoming more commonplace [
65]) report higher mortality rates than those with normal elections. Additional research should focus on time series models as well as simulations to generate forecasts with the external regressors identified by this research.