1. Introduction
The novel coronavirus disease (COVID-19; caused by SARS-CoV-2) was declared a global health emergency by the World Health Organization on 30 January 2020 [
1]. By mid-November 2020, there were more than 60 million cases worldwide, with over 13 million cases occurring in the United States (US) alone, according to the Johns Hopkins University COVID-19 dashboard [
2,
3]. The current outbreak of COVID-19 has led to an unprecedented impact on daily life and exposed critical weaknesses in the public health infrastructure in the US. Controlling COVID-19 requires swift identification and containment of cases and contacts to prevent further community transmission [
4]. Understanding the transmission dynamics within community settings and determining which groups are at the highest risk of infection is the cornerstone of public health interventions for reducing COVID-19 morbidity and mortality. Geospatial analytics could represent an important tool for determining community level risk factors, including social determinants of health (SDOH).
Examining neighborhood-level stressors and assets provides an important framework for understanding the SDOH [
5]. Simultaneously, at least in the US, renewed attention is directed toward the significance of the SDOH in contemporary health care discourse [
6,
7,
8]. Important aspects of the SDOH include poverty, low educational attainment, rapid urbanization and substandard housing, and lack of employment opportunities [
9,
10]. Additional SDOH relevant to COVID-19 include occupational risks from essential work, multigenerational households, homelessness, and food insecurity [
11]. Given the fact that neighborhood-level social, economic, and environmental factors have both direct and indirect effects on health [
12,
13,
14], understanding how they affect the community spread of COVID-19 is invaluable. In essence, knowledge gained about the spatial structures of any relationship between SDOH and COVID-19 may be used to plan and target intervention programming differently across a given study area [
15,
16].
Across the US, researchers have identified spatiotemporal trends in COVID-19 incidence, determined case-fatality rates [
17,
18], and compared the spatial patterns of socioeconomic variables to identify the factors that correlate with mortality in urban and rural settings [
19]. Studies have reported that significant neighborhood-level inequities underlie the variance in COVID-19 community incidence and mortality rates in the USA [
20,
21,
22]. To date, studies that have conducted geospatial analysis of COVID-19 within communities have predominantly used county [
17,
19,
23,
24,
25,
26,
27,
28] or zip code [
29] as their primary unit of analysis. However, the geographic unit chosen for an area-based analysis strongly affects the precision of the resulting estimates, the generalizability of the findings, and the potential for bias. Quantifying associations between COVID-19 and relevant factors aggregated to the county or zip code level may obscure heterogeneity in both the dependent and independent variables of interest [
29]. In general, smaller geographic units provide more accurate estimates of neighborhood-level characteristics, the only exception being situations where the number of records available at the smaller geography is too small to represent stable estimates of the outcome of interest [
30,
31]. Therefore, the census tract may represent an ideal unit of analysis for neighborhood-level characteristics. Additionally, several socioeconomic and demographic variables are available at the census tract level. Providing COVID-19 surveillance data at the census tract level may facilitate spatial analysis of related phenomena and potentially benefit public health response efforts. We believe that our analysis using census tract-level data overcomes limitations faced by other analyses that used larger geographic units. We are unaware of any current reporting or surveillance system in the USA that provides COVID-19 data below the zip code level while integrating SDOH.
In the current analysis, we used extended spatial analytic procedures to disaggregate COVID-19 community incidence estimates provided at the zip-code geographic unit into census tract estimates. Subsequently, we assessed the associations between census tract measures of SDOH and the community incidence of COVID-19 using a series of aspatial and spatially weighted regression models to determine neighborhood drivers of disease transmission in Harris County, Texas, the most diverse county in the USA and the third most populous with 4.7 million people [
32].
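The disaggregation step can be illustrated with a minimal sketch. The exact procedure is not detailed in this section, so the example assumes simple population-weighted allocation: each zip code's cases are split among the census tracts it overlaps, in proportion to the population living in each zip/tract intersection. The function name, identifiers, and toy numbers are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def disaggregate_cases(zip_cases, overlaps):
    """Allocate zip-code case counts to census tracts.

    zip_cases: {zip_code: case_count}
    overlaps:  list of (zip_code, tract_id, population_in_overlap)

    Assumption: cases are distributed within a zip code in proportion
    to where its population lives (population-weighted allocation).
    """
    # Total population of each zip code, summed over its tract overlaps
    zip_pop = defaultdict(float)
    for zip_code, _tract, pop in overlaps:
        zip_pop[zip_code] += pop

    # Each tract receives its population share of every overlapping zip code
    tract_cases = defaultdict(float)
    for zip_code, tract, pop in overlaps:
        if zip_pop[zip_code] > 0:
            share = pop / zip_pop[zip_code]
            tract_cases[tract] += zip_cases.get(zip_code, 0) * share
    return dict(tract_cases)


# Hypothetical toy data: two zip codes overlapping three tracts
zip_cases = {"zip_1": 120, "zip_2": 60}
overlaps = [
    ("zip_1", "tract_A", 3000),
    ("zip_1", "tract_B", 1000),
    ("zip_2", "tract_B", 2000),
    ("zip_2", "tract_C", 2000),
]

tract_cases = disaggregate_cases(zip_cases, overlaps)
# tract_A receives 90 cases, tract_B 60, tract_C 30 (total preserved: 180)
print(tract_cases)
```

Allocation of this kind preserves the zip-code totals, but it assumes a uniform infection rate within each zip code, which is itself a simplification.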
4. Discussion
Central to effective control measures for a pandemic is understanding the epidemiology of transmission in the community. Our study joins the list of recent and growing research studies examining various aspects of the relationships between socioeconomic/environmental factors and the incidence of COVID-19 [
11,
49]. We used a series of aspatial and spatially weighted regression models to identify neighborhood-level characteristics that are associated with higher COVID-19 incidence at the census tract level. Our study area, Harris County, is a major US metropolitan county. Across our analysis steps, characteristics that represent either minority population or socioeconomic disadvantage had positive associations with COVID-19 incidence. Of the 29 variables that we considered, 7 remained significant correlates of COVID-19 community incidence in our final global model. Five of these were the percentage of the Black or African American population, the percentage of the foreign-born population, the area deprivation index (ADI), the percentage of households with no vehicle available, and the percentage of the population over 65. The remaining two, both found to be protective, were the percentage in education, training, or library occupations and assisted living capacity. By understanding variables that correlate with community transmission, we can better direct resources, expand testing capacity, and focus disease control measures.
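One way to read these associations is through incidence rate ratios. The expression below is a sketch that assumes the global model takes a Poisson log-linear form with tract population as an offset; this form and the symbols are illustrative assumptions rather than the specification reported in the study.

```latex
\ln \mathbb{E}[Y_i] = \ln N_i + \beta_0 + \sum_{k=1}^{7} \beta_k x_{ik},
\qquad
\mathrm{IRR}_k = e^{\beta_k}
```

Here $Y_i$ is the case count in tract $i$, $N_i$ is its population, and $x_{ik}$ is the $k$-th retained covariate. Under this form, a protective variable corresponds to $\beta_k < 0$ (so $\mathrm{IRR}_k < 1$), whereas a variable associated with higher incidence has $\beta_k > 0$.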
Conducting this analysis at the neighborhood level is a critical component of this study. Over the last decade, scholars have argued for and validated the importance of examining the impact of “place-based” socioenvironmental factors on health outcomes [
50,
51]. In this regard, the places where people live, work, and play are frequently considered, though the residential neighborhood is appropriately the typical unit of analysis. Neighborhoods are not randomly constructed; they are patterned around social status, ethnicity, and income [
52]. These factors strongly influence an individual’s determinants of health and have been shown to correlate with health status and overall mortality rates [
53]. Understanding how neighborhood factors influence transmission of this novel disease will be critical in preventing future outbreaks.
One variable that was highlighted in our analysis and showed the strongest relationship to increased risk of COVID-19 was the area deprivation index (ADI). This index is a validated composite measure of neighborhood socioeconomic inequalities and disadvantage [
42]. Significant inequalities have been found to influence historic pandemics. Sydenstricker, as far back as 1931, demonstrated inequalities in the working-class population of the USA during the Spanish influenza pandemic of 1918–1919 [
54]. Contemporary evidence has also shown that these inequalities during times of pandemics existed in terms of key spatial attributes such as affluence of neighborhoods, socioeconomic strata, and the urban–rural gradient [
55,
56,
57,
58]. The ADI has been used to examine disease risk factors [
59], predict healthcare utilization [
60], and understand healthcare disparities [
43,
61]. Recently, Singh and colleagues recognized the ADI as a powerful tool for documenting and monitoring population health inequalities across time and space in the USA [
61]. ADI was one of the strongest correlates of high COVID-19 incidence at the community level in our analysis. Community-level poverty can influence health on many levels. It affects everything from health care utilization and access to healthy foods to recreational activities, the built environment, and neighborhood safety. The index therefore likely captures a complex relationship between community and health. Our findings are consistent with those of several recent studies indicating that factors related to social and economic disadvantage are associated with COVID-19 [
19,
62].
Our analysis also indicated that racial/ethnic composition and nativity of neighborhood populations were significantly correlated with COVID-19 incidence. The underlying causes of health disparities among racial minorities in the US are likely complex and cannot be easily summarized. They derive from relationships among social structure, cultural norms, racism, and socioeconomic factors. This is likely why the variables representing the percentage of the Black or African American population and the percentage of the foreign-born population remained significant in our analysis, whereas many other factors that contribute to inequality were not significant in the final model. Further research to understand these community-level drivers of health inequality is critical to determine points of intervention to prevent disease transmission and reduce the disproportionate morbidity and mortality that these communities have experienced as a result of COVID-19.
These findings have potential relevance to the release of COVID-19 vaccines as part of Operation Warp Speed. Given the potential COVID-19 risks to African American and foreign-born populations, prioritizing vaccine access for such vulnerable groups in Houston and Harris County is especially urgent. Of particular concern are recent reports of COVID-19 vaccine hesitancy in African American populations [
63]. Another issue is the high rate of COVID-19 deaths among both African American and Hispanic groups at younger ages (<65 years old) compared with non-Hispanic Whites, such that relying on 65-year age cut-offs for vaccinations might miss highly vulnerable subpopulations [
36].
Our geographically weighted Poisson regression analysis produced local beta coefficients for each of the census tracts in our study area. These local coefficients allow the impact of each independent variable to be visualized for individual neighborhoods. Interestingly, the impact of ADI appears to be homogeneous across our study area, indicating that variation in ADI affects neighborhoods equally regardless of other factors. It appears that an increased percentage of African American or foreign-born population within the community has a greater impact in the less densely populated periphery of the county. Conversely, a larger share of residents over 65 years old had a greater impact in the more densely populated parts of the county. Conducting this local analysis by census tract provides a visual output of each variable’s impact that is easily interpreted for directing public health interventions.
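The idea of local beta coefficients can be sketched in a few lines. The example below is not the implementation used in the study; it assumes a fixed Gaussian distance kernel, a single covariate, and a population offset, and it fits a separate kernel-weighted Poisson regression around each tract by Newton's method. All names, the bandwidth, and the toy data are hypothetical.

```python
import math


def local_poisson_fit(target, coords, x, y, pop, bandwidth, max_iter=50):
    """Fit log(E[y_j]) = log(pop_j) + b0 + b1 * x_j around one target tract.

    Every tract j is weighted by a Gaussian kernel of its distance to the
    target, so nearby tracts dominate the local estimate of (b0, b1).
    """
    tx, ty = coords[target]
    w = [math.exp(-0.5 * (math.hypot(cx - tx, cy - ty) / bandwidth) ** 2)
         for cx, cy in coords]

    # Start from the kernel-weighted crude rate and a zero slope
    b0 = math.log(sum(wj * yj for wj, yj in zip(w, y)) /
                  sum(wj * pj for wj, pj in zip(w, pop)))
    b1 = 0.0

    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for wj, xj, yj, pj in zip(w, x, y, pop):
            mu = pj * math.exp(b0 + b1 * xj)   # expected cases in tract j
            g0 += wj * (yj - mu)               # weighted score, intercept
            g1 += wj * (yj - mu) * xj          # weighted score, slope
            h00 += wj * mu                     # weighted information matrix
            h01 += wj * mu * xj
            h11 += wj * mu * xj * xj
        # Newton step: solve the 2x2 system (information) * step = score
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if max(abs(d0), abs(d1)) < 1e-10:
            break
    return b0, b1


# Hypothetical toy data: 8 tracts on a west-to-east line. The covariate
# effect is weak (0.2) in the four western tracts and strong (0.8) in the
# four eastern ones.
coords = [(float(i), 0.0) for i in range(8)]
x = [-1.0, 1.0] * 4                  # standardized covariate
pop = [1000.0] * 8
true_b1 = [0.2] * 4 + [0.8] * 4
# Noise-free expected counts (not integer draws), to keep the sketch deterministic
y = [p * 0.02 * math.exp(b * xi) for p, b, xi in zip(pop, true_b1, x)]

local_b1 = [local_poisson_fit(i, coords, x, y, pop, bandwidth=1.5)[1]
            for i in range(len(coords))]
for i, b in enumerate(local_b1):
    print(f"tract {i}: local beta = {b:.3f}")
```

In this toy setting the local coefficient rises from west to east, which is the kind of spatial pattern that a map of local betas makes visible. A coefficient that stays roughly constant across tracts would instead correspond to the homogeneous effect described for ADI. In practice, the bandwidth would be chosen by a data-driven criterion rather than fixed by hand.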
Our study has some noteworthy limitations. Our dependent variable, COVID-19 incidence, was derived from publicly available data provided by public health authorities in Harris County and the City of Houston. While this is currently the only source of COVID-19 incidence data, we cannot ensure that it is always timely and accurate. Inadequate access to testing, delayed testing results, and backlogged data could affect our data quality. The independent variables considered in this analysis may not represent a comprehensive list of all factors influencing COVID-19 transmission in this community. While we believe that we accurately represented likely risk factors, we cannot rule out other influences. As with any epidemiologic analysis using non-individual-level estimates, our analysis is susceptible to the ecological fallacy. Additionally, this is a correlational study, and therefore, causal inferences cannot be made; as such, coefficients should be interpreted cautiously. While these limitations are important to consider, the large number of cases strengthens the power of our study, and we believe that the novel analytic workflow and the importance of the findings for local public health officials outweigh these limitations.
The assessment of disparities in health outcomes requires the ability to understand spatial and spatially driven structures that influence exposure–disease relationships. Understanding health disparities through spatial processes is perhaps especially useful in societies where heterogeneous neighborhoods composed of diverse groups are seldom the norm. Of course, geospatial analytics should not be used for their own sake. Findings from this type of geographically weighted analysis should provide insight into neighborhood-level drivers of infection that would otherwise have been missed by public health officials. This analytic process can inform holistic policy prescriptions, which are often operationalized in geographical space [
64]. The utility of spatial analyses for understanding and managing the COVID-19 pandemic may create levels of structural resources that, when adequately leveraged, could facilitate effective intervention strategies, allocation of resources, and delivery of care to all, and especially those disproportionately burdened by the pandemic.