1. Introduction
The COVID-19 pandemic has exacerbated a multitude of existing inequalities in the United States, with non-White communities, in particular, being disproportionately affected by the crisis [1,2]. From economic insecurity and job loss to inadequate access to healthcare resources, many devastating mechanisms of stratification present in the United States have been reproduced during the pandemic [3]. While the unequal distribution of resources and opportunities has a long legacy in American society, the pandemic has served to amplify disparities and reshape the mechanisms that enable them.
The digital divide, which describes the unequal distribution of access to technology and the internet, constitutes a growing set of inequalities reproducing and exacerbating long-standing social and economic divides [4,5,6,7]. These inequalities particularly persist along racial lines, and Black and Hispanic Americans are consequently less likely to participate in the digital economy and access vital resources such as education and healthcare. This unequal access to technology and the internet has far-reaching consequences, hindering upward mobility and perpetuating existing disparities. This set of inequalities is not a novel trend but a manifestation of deeper disparities that have long been entrenched in American society. Ultimately, scholars must study and address these disparities in order to work toward a better-connected and more equitable society.
In examining the digital divide, some scholars have put forth Technology Maintenance theory, which argues that although most Americans own devices that connect to the internet, access to the internet could be unreliable and subject to irregular periods that negatively affect their relationships to necessary digital services [8]. Although individuals from marginalized backgrounds mostly have devices to access the internet, the access itself could be unreliable and undependable, interrupting essential activities such as education, telehealth, and work. In some cases, negative attitudes towards the online adoption of previously in-person services were legitimized by technology maintenance issues such as poor internet connectivity [8]. Even among more generally privileged groups, such as university students, those from lower socioeconomic backgrounds and students of color were more likely to experience technology maintenance issues, including internet connectivity issues.
Internet speed has been particularly essential during the COVID-19 pandemic [9]. Necessitated by social distancing policies, virtual schooling became widespread in 2020 [10]. Research has found that high-speed internet connectivity and functioning devices have been critical factors in education outcomes and learning proficiency during the pandemic [11]. Inequalities in internet connectivity during the pandemic may be exacerbated by broader existing inequalities in communication skills and knowledge of resources that are just as relevant for remote learning outcomes. More generally, lower quality of internet access has been found to correlate with less internet use for communication and information purposes during COVID lockdown periods [12].
Internet speed was also crucial during the COVID-19 pandemic in terms of employment opportunities [13]. Internet speed is essential for working from home, enabling real-time communication and information sharing within an organization [4]. Without sufficient speed, employees may struggle to complete tasks, negatively impacting their employers’ perception of their performance. Additionally, low internet speeds may limit access to working from home entirely, particularly for higher-skilled jobs that rely on the internet and video communication. The inability to work from home may limit both employment opportunities and individuals’ ability to practice social distancing.
As an additional potential impact, internet speed is a critical factor in delivering telehealth services, which have become crucial during the COVID-19 pandemic to maintain health and well-being while practicing social distancing [14]. Particularly during the pandemic, internet access (which speed can affect) has been essential for accessing vital health information. Additionally, fast and reliable internet speeds are necessary for accessing telehealth appointments. Fast internet speeds enable quality video communication and ensure that critical information is communicated effectively. Ultimately, poor internet connectivity can severely affect healthcare access and impede healthcare providers’ ability to deliver timely and effective care to already marginalized populations [15,16,17].
One of the central units by which to understand the quality of connectivity and internet speed is the neighborhood [18]. In most cases, broadband infrastructure tends to vary between neighborhoods rather than within them [19]. Consequently, understanding neighborhood variation in internet speed is crucial for understanding stratification processes related to the digital divide. A growing body of sociology research has examined how inequality has been exacerbated and reshaped during the COVID pandemic [20]. Neighborhoods, in particular, are a central component of the broader social and racial stratification process in the United States [21]. The outcomes of children, in terms of educational attainment, income, and health, among many other things, are a causal product of the neighborhoods they grow up in [22]. While the exact mechanisms are somewhat unclear, resources, both in social and institutional terms, are highly implicated in neighborhood effects [23].
Neighborhood effects are historically split into two categories: Developmental and Situational Neighborhood effects [24]. Developmental effects tend to have long-term impacts that stretch well beyond the time of exposure and are focal to understanding how neighborhoods impact adolescents. Situational effects distinctly refer to short-term neighborhood effects directly related to current residence. We argue that the impacts of the COVID-19 pandemic related to the digital divide can drive both types of effects. If children’s educational attainment suffers because they lack quality internet access, their life course trajectory may be adversely impacted. Similarly, if individuals cannot work from home because of poor internet connections, they may be forced into lower-skill occupations with lower wages and less opportunity for advancement. Effects on children related to internet access may be long-lasting, while temporary inconveniences for adults related to internet access may have fewer lasting consequences.
Overall, limited and constrained access to the internet and internet devices persists along the same racial lines along which social inequality exists. Non-White groups can be at greater risk of social disconnection when in-person social opportunities are restricted [25,26,27]. This article explores a novel dataset for studying inequality during the COVID pandemic: longitudinal geocoded internet speed data. With fine-grained geocoding, it is possible to engage in neighborhood-level analysis of internet speed. Our analysis reveals that neighborhoods with higher proportions of Black residents tend to have better download speeds but worse upload speeds. Importantly, upload speeds are crucial for video communication, which became increasingly prevalent during the pandemic. Our findings show that upload speeds in Black neighborhoods have consistently fallen during the pandemic compared to White neighborhoods, highlighting the persistent racial inequalities in access to high-quality internet. We find little evidence of Hispanic neighborhoods facing substantial disparities in internet speed relative to White neighborhoods. The results of this study have significant implications for understanding the digital divide and its impact on social and economic inequality.
2. Data and Methods
Data for this project come from the Ookla Speed test open dataset. Ookla’s speed test is publicly available and voluntarily completed by users concerned with their internet speed. This data has been used repeatedly before to analyze inequalities in internet speed [27]. Ookla’s speed test measures the performance of an internet connection by sending a small amount of data between Ookla’s server and the user’s computer and measuring the time it takes for the data to be sent and received [28]. The first resulting measurement is the “download speed”, or the speed at which data can be transferred from the internet to the user’s device. The second resulting measurement is the “upload speed”, or the speed at which data can be transferred from the user’s device to the internet. These measurements are crucial in determining an internet connection’s performance and identifying potential connectivity issues.
For this project, data is specifically downloaded using the OoklaOpenDataR package. Fixed broadband data for all four quarters of 2019, 2020, and 2021 is obtained. Ookla speed test data is aggregated into tiles that are approximately 610.8 m by 610.8 m. Upload and download speeds are averaged across all tests performed within each tile, and the number of tests and unique devices that tests are performed on is recorded. Using the tigris package in R, we geocode all tiles to a census tract using each tile’s centroid [29]. Demographic census tract data is subsequently obtained from the 2017–2021 American Community Survey 5-year estimates. We then calculate average upload and download speeds for each census tract, weighting the average speeds in each tile by the number of unique devices those speeds are based on. Through this approach, users who take many tests are less likely to disproportionately affect our results. In line with the central limit theorem, census tracts with fewer than 30 unique tests are dropped. Across all quarters, the median census tract had measures of upload and download speed based on 194 unique tests. Census tracts dropped due to low sample sizes make up between 3.8% and 8.5% of total census tracts depending on the quarter. Across all census tracts and time periods, upload speeds ranged from 240 to 425,315 kilobits per second (kbps), with a median of 24,178. Download speeds ranged from 1479 to 486,954 kbps, with a median of 136,198. In our analyses, we log upload and download speed to account for the apparent skew of the data.
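The tract-level aggregation described above can be illustrated with a minimal Python sketch. It is not the authors' R code. The dictionary keys are assumed to mirror the column names of the Ookla open data (`avg_u_kbps`, `avg_d_kbps`, `tests`, `devices`), the `tract` key stands in for the result of the centroid geocoding step, and applying the 30-test threshold to the summed `tests` count is an assumption about how the cutoff was implemented.

```python
import math
from collections import defaultdict

MIN_TESTS = 30  # threshold stated in the text; applying it to `tests` is an assumption


def tract_speeds(tiles, min_tests=MIN_TESTS):
    """Aggregate tile records to tracts and return logged speeds.

    Each tile is a dict with the (assumed) keys
    tract, avg_u_kbps, avg_d_kbps, tests, devices.
    Returns {tract: (log_upload_kbps, log_download_kbps)}.
    """
    # Per tract: device-weighted upload sum, download sum, devices, tests
    sums = defaultdict(lambda: [0.0, 0.0, 0, 0])
    for tile in tiles:
        acc = sums[tile["tract"]]
        acc[0] += tile["avg_u_kbps"] * tile["devices"]
        acc[1] += tile["avg_d_kbps"] * tile["devices"]
        acc[2] += tile["devices"]
        acc[3] += tile["tests"]

    result = {}
    for tract, (up, down, devices, tests) in sums.items():
        if tests < min_tests or devices == 0:
            continue  # drop tracts with too few tests
        result[tract] = (math.log(up / devices), math.log(down / devices))
    return result


# Made-up tiles: tract A has two tiles, tract B falls under the threshold
example_tiles = [
    {"tract": "A", "avg_u_kbps": 10000, "avg_d_kbps": 100000, "tests": 20, "devices": 10},
    {"tract": "A", "avg_u_kbps": 30000, "avg_d_kbps": 200000, "tests": 40, "devices": 30},
    {"tract": "B", "avg_u_kbps": 5000, "avg_d_kbps": 50000, "tests": 12, "devices": 6},
]
logged = tract_speeds(example_tiles)
```

Weighting by devices rather than tests means a tile in which one user ran dozens of tests counts no more than a tile in which one user ran a single test.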
An important limitation of the internet speed data is that it is not necessarily representative. Individuals voluntarily test their internet speed, and as a result, internet speeds may be taken disproportionately by individuals who have issues with their internet or have recently obtained a new device/service. Unfortunately, we have no sense of these unique circumstances and have no means by which to distinguish between tests that may be more or less representative of general internet speed. As an additional limitation, we also have no means by which to identify the specific internet provider a user has. We encourage our results to be interpreted with these limitations in mind.
The primary methodological strategy employed is county fixed-effects ordinary least squares (OLS) models of logged census tract internet speed. County fixed effects control for unmeasured regional variation in internet speed. Different providers and different broadband infrastructure play a massive role in determining internet speed [30]. The need for and usage of broadband may also vary highly between regions [31]. Internet speed is logged to account for the skewed distribution. All models are run using the fixest package in R [32].
Models are estimated separately based on data from all 12 quarters in 2019–2021. Models are estimated using fixed-effects OLS regression in R. The form of the model can be written as follows:

$$\ln(\text{Speed}_{i}) = \beta_{1}\,\text{Black}_{i} + \beta_{2}\,\text{Hispanic}_{i} + \beta_{3}\,\text{Other}_{i} + \gamma_{j} + \varepsilon_{i}$$

where $\text{Speed}_{i}$ represents the upload (download) speed for census tract $i$, $\text{Black}_{i}$ is an indicator variable indicating if more than 50% of the residents in census tract $i$ are non-Hispanic Black (7.6% of all census tracts), $\text{Hispanic}_{i}$ is an indicator variable indicating if more than 50% of the residents in census tract $i$ are Hispanic (of any race) (10.9% of all census tracts), $\text{Other}_{i}$ is an indicator variable indicating if no more than 50% of the residents in census tract $i$ are non-Hispanic White, non-Hispanic Black, or Hispanic of any race (16.3% of all census tracts), $\gamma_{j}$ represents county-level fixed effects (where county $j$ is the county census tract $i$ is located in), and $\varepsilon_{i}$ is an error term with the usual assumed statistical properties.
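To make the role of the county fixed effects concrete, the following is a minimal Python sketch of the within (demeaning) estimator, which is numerically equivalent to including a dummy for every county. It is an illustration on made-up data, not the fixest routine the authors used. The function names and the coefficient values are hypothetical, and no standard errors are computed.

```python
from collections import defaultdict


def demean_by_group(values, groups):
    """Subtract each group's mean from its members (the within transformation)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def county_fe_ols(y, regressors, county):
    """OLS of y on the regressors after removing county means from every variable."""
    y_w = demean_by_group(y, county)
    x_w = [demean_by_group(x, county) for x in regressors]
    k = len(x_w)
    xtx = [[sum(p * q for p, q in zip(x_w[i], x_w[j])) for j in range(k)] for i in range(k)]
    xty = [sum(p * q for p, q in zip(x_w[i], y_w)) for i in range(k)]
    return solve(xtx, xty)


# Made-up tracts in two counties with different baseline (logged) speeds
county = ["c1"] * 4 + ["c2"] * 5
black = [0, 1, 0, 0, 0, 1, 0, 0, 0]
hispanic = [0, 0, 1, 0, 0, 0, 1, 0, 0]
other = [0, 0, 0, 1, 0, 0, 0, 1, 0]
baseline = {"c1": 9.0, "c2": 10.0}
log_speed = [
    baseline[c] - 0.10 * b + 0.05 * h - 0.02 * o
    for c, b, h, o in zip(county, black, hispanic, other)
]

beta = county_fe_ols(log_speed, [black, hispanic, other], county)
```

Because the county means are subtracted first, the one-unit gap in baseline logged speed between the two counties does not affect `beta`. The coefficients reflect only differences between tracts in the same county.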
This model is notably very simple and excludes any control variables besides county-level fixed effects. We would emphasize that the goal of this analysis is to provide a descriptive portrait of neighborhood-level racial inequality. We view analyses that seek to explore the potential mechanisms of neighborhood racial inequality as outside the scope of this article’s aims. Rather, since surprisingly little research has explored neighborhood racial inequality in internet speed, we would emphasize that we aim to primarily explore whether this inequality exists and, if so, how it has changed over time.
Additionally, we must emphasize one caveat of the descriptive portrait we are seeking to provide. We are not seeking to make nationwide claims on the state and trends in neighborhood racial inequality in internet speed. Rather, by estimating county-level fixed effects, we are estimating intra-county (within-county) neighborhood racial inequality in internet speed. We view this form of inequality as perhaps more important than absolute nationwide racial inequality. Generally, neighborhood internet access and speed are highly confounded by urban/rural divides. These divides have been investigated and described in great detail in past research. However, we argue that intra-county variation in internet speed may be more important since the need for internet access and speed may vary somewhat from county to county. In rural counties with lower internet access, the labor market may simply be less oriented towards jobs or positions that require quality internet access. By contrast, in urban counties with more widespread internet access and speed, there may be more jobs that require quality internet access. Additionally, in rural counties with lower internet access, schools may have less of an expectation that students have access to the internet, and they may be less inclined to structure teaching strategies around technology and virtual learning. Conversely, in urban counties where most households have access to quality high-speed internet, schools may be much more likely to orient teaching strategies around technology and virtual learning. The general point is that variation in internet speed within counties may be more important than variation in internet speed between counties. By specifically examining intra-county racial inequality in internet speed, we are measuring a specific form of inequality that we believe to be focal to understanding internet speed inequalities.
We present the main results of these models in figures. We organize the results into three periods: pre-pandemic, pandemic-onset, and post-pandemic. This strategy allows us to visualize how inequality in internet speed has trended over time. We must note that we are not making causal claims regarding the effects of the COVID pandemic on internet speeds or internet inequality. Instead, we seek to highlight trends crucial to understanding how the COVID pandemic may have exacerbated existing inequalities. As we stated earlier, the COVID pandemic has substantially reshaped the mechanisms of inequality in the United States. In this analysis, we hope to provide a descriptive portrait of how racial inequalities in neighborhood-level internet speed may be associated with this.
As an additional analysis to more precisely capture changes in neighborhood speed inequalities over time, we also create two pooled models of upload and download speed with two-way fixed effects. By controlling for census tract and time, we can isolate the relative shifts in internet speed that non-White neighborhoods experience. The model can be written as follows:

$$\ln(\text{Speed}_{ik}) = \beta_{1}(\text{ON}_{k} \times \text{Black}_{i}) + \beta_{2}(\text{ON}_{k} \times \text{Hispanic}_{i}) + \beta_{3}(\text{ON}_{k} \times \text{Other}_{i}) + \beta_{4}(\text{POST}_{k} \times \text{Black}_{i}) + \beta_{5}(\text{POST}_{k} \times \text{Hispanic}_{i}) + \beta_{6}(\text{POST}_{k} \times \text{Other}_{i}) + \alpha_{i} + \gamma_{jk} + \varepsilon_{ik}$$

where $\text{Speed}_{ik}$ represents the upload (download) speed for census tract $i$ in quarter-year $k$, $\text{Black}_{i}$ is an indicator variable indicating if more than 50% of the residents in census tract $i$ are non-Hispanic Black, $\text{Hispanic}_{i}$ is an indicator variable indicating if more than 50% of the residents in census tract $i$ are Hispanic (of any race), $\text{Other}_{i}$ is an indicator variable indicating if no more than 50% of the residents in census tract $i$ are non-Hispanic White, non-Hispanic Black, or Hispanic of any race, $\text{ON}_{k}$ is an indicator variable indicating if the quarter and year of measurement was during the onset of the pandemic (e.g., Quarter 1 2020), $\text{POST}_{k}$ is an indicator variable indicating if the quarter and year of measurement was after the onset of the pandemic (e.g., after Quarter 1 2020), $\alpha_{i}$ represents census tract-fixed effects, $\gamma_{jk}$ represents county-quarter-year-combination fixed effects (where county $j$ is the county census tract $i$ is located in and $k$ is the quarter and year of the measurement), and $\varepsilon_{ik}$ is an error term with the usual assumed statistical properties.
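The period indicators and their interactions with neighborhood racial composition can be sketched as follows. This is a hedged Python illustration, not the authors' code. It assumes the onset period is the single quarter 2020 Q1, following the "e.g., Quarter 1 2020" wording, since the exact period boundaries are not stated. The function names and the `ON x Black` style labels are illustrative.

```python
# Assumption: the onset period is exactly 2020 Q1; the text gives this only as an example.
ONSET = (2020, 1)


def period_flags(year, quarter, onset=ONSET):
    """Return (ON, POST) for one quarter; both are 0 in the pre-pandemic reference period."""
    current = (year, quarter)
    return int(current == onset), int(current > onset)


def interaction_terms(groups, year, quarter):
    """Build the six period-by-composition regressors for one tract-quarter.

    `groups` maps a label (Black, Hispanic, Other) to a 0/1 majority indicator;
    majority White tracts have all three set to 0.
    """
    on, post = period_flags(year, quarter)
    terms = {}
    for label, flag in groups.items():
        terms[f"ON x {label}"] = on * flag
        terms[f"POST x {label}"] = post * flag
    return terms


black_tract = {"Black": 1, "Hispanic": 0, "Other": 0}
row_2019q4 = interaction_terms(black_tract, 2019, 4)  # all zeros: reference period
row_2020q1 = interaction_terms(black_tract, 2020, 1)  # only "ON x Black" is 1
row_2021q3 = interaction_terms(black_tract, 2021, 3)  # only "POST x Black" is 1
```

Only the interactions need to be constructed. A tract's composition indicator does not change across quarters here, so it is absorbed by the census tract fixed effect, and the period indicators themselves are absorbed by the county-quarter-year fixed effects.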
The results of these models for upload and download speed provide a more refined statistical analysis of how internet speed inequality shifted over time between neighborhoods. While the figures provide a visualization of cross-sectional inequalities over time, these models provide a statistical test of whether the differences in these inequalities between periods are significant. More specifically, by including census tract-fixed effects and county-quarter-year-combination fixed effects, coefficients can be interpreted as the differences in internet speed that certain types of neighborhoods experience during and after the onset of the pandemic, relative to White neighborhoods pre-pandemic.
3. Results
Figure 1 presents results for upload speeds for majority Black neighborhoods. The x-axis represents the quarter and year of observation, while the y-axis represents differences in logged average upload speed between majority Black and majority White neighborhoods. Since the models include county-level fixed effects, these results are not absolute nationwide differences but account for inter-county variation in internet speed. We color-code the figure to highlight three different periods, pre-pandemic, pandemic onset, and post-pandemic (after the onset of the pandemic).
Figure 1 shows that, before the pandemic, majority Black neighborhoods had slightly slower upload speeds than majority White neighborhoods, by only about 5 to 6%. Around the onset of the pandemic at the end of 2019 and the beginning of 2020, however, upload speeds in majority Black neighborhoods fell considerably relative to majority White neighborhoods and remained low through 2021. At the lowest point in 2021, majority Black neighborhoods experienced upload speeds 13.2% slower than majority White neighborhoods. This difference is quite substantial and constitutes the difference between an upload speed of 20,000 kbps and an upload speed of 17,360 kbps. Notably, upload speeds were statistically different between majority Black and majority White neighborhoods for all 12 quarters observed in this data.
Figure 2 presents results for download speeds for majority Black neighborhoods. These results indicate that download speeds are actually slightly higher in majority Black neighborhoods, though only between 4 and 6 percent higher. In addition, download speeds appeared to increase in majority Black neighborhoods relative to majority White neighborhoods just before the onset of the pandemic, though they fell after the pandemic began and reached the lowest levels in the analysis window during 2021. Notably, download speeds were statistically different between majority Black and majority White neighborhoods for all 12 quarters observed in this data.
Figure 3 presents results for upload speeds for majority Hispanic neighborhoods. These results indicate that upload speeds fluctuate between being higher and lower in majority Hispanic neighborhoods compared to majority White neighborhoods. Upload speeds appeared to increase in majority Hispanic neighborhoods relative to majority White neighborhoods just before the onset of the pandemic, though they fell after the pandemic began and reached the lowest levels in the analysis window during 2021. For all quarters, however, upload speeds were not statistically different in majority Hispanic neighborhoods compared to majority White neighborhoods.
Figure 4 presents results for download speeds for majority Hispanic neighborhoods. These results indicate that download speeds are actually slightly higher in majority Hispanic neighborhoods, though only between 4 and 7 percent higher. In addition, download speeds appeared to increase in majority Hispanic neighborhoods relative to majority White neighborhoods, rising initially shortly before the beginning of the pandemic, before falling, then rising again. Notably, download speeds were statistically different between majority Hispanic and majority White neighborhoods for all 12 quarters observed in this data.
Ultimately, these four figures demonstrate the differences in internet speeds between majority non-White and majority White neighborhoods before and during the COVID-19 pandemic. Generally, these results indicate that, relative to White neighborhoods, Black neighborhoods had substantially slower upload speeds but slightly faster download speeds prior to the pandemic. Hispanic neighborhoods, however, had both slightly faster upload and download speeds. These results suggest that the COVID-19 pandemic was accompanied by a substantial shift in internet speeds in all neighborhoods and that internet speed inequality between majority non-White and majority White neighborhoods has fluctuated over time but generally worsened since the onset of the pandemic.
Table 1 presents the results of the two-way fixed-effects longitudinal models. Model 1 estimates logged upload speed by census tract and time period. The focal interactions in this model are between indicators for neighborhood racial composition and indicator variables for time period. The reference categories are White neighborhoods and pre-pandemic onset (e.g., 2019). In this sense, all other variable coefficients should be interpreted as effects relative to those reference categories. For example, the “ON X Black” coefficient indicates that, relative to White neighborhoods, logged upload speeds in Black neighborhoods are 0.021 lower during the onset of the pandemic compared to before it, controlling for the census tract and the period. This value is statistically significant at p < 0.01.
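Because the outcome is logged, a coefficient is a difference in log speed rather than a percentage. The short Python sketch below shows the standard exact conversion, 100 × (exp(b) − 1), applied to the 0.021 coefficient quoted above, and checks the 13.2% gap arithmetic reported for Figure 1. The function names are illustrative, and whether the percentages reported in the text use this exact conversion or the small-coefficient approximation (b × 100) is not stated in the source.

```python
import math


def log_gap_to_percent(coef):
    """Exact percent difference implied by a difference in logged speeds."""
    return 100.0 * (math.exp(coef) - 1.0)


def percent_to_log_gap(percent):
    """Inverse conversion: the log difference implied by a percent difference."""
    return math.log(1.0 + percent / 100.0)


# "ON X Black" coefficient quoted in the text: logged upload speed 0.021 lower
on_black_pct = log_gap_to_percent(-0.021)  # roughly -2.1 percent

# 13.2% slower upload speed applied to a 20,000 kbps baseline
slower_speed = 20000 * (1 + (-13.2) / 100)

# The same 13.2% gap expressed as a difference in logs
gap_in_logs = percent_to_log_gap(-13.2)
```

For coefficients this close to zero, the exact and approximate conversions nearly coincide, so a logged difference of 0.021 can be read as a gap of about 2%.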
Other major findings from Model 1 pertain to ‘POST’ coefficients. Specifically, for all three types of non-White neighborhoods, we find that logged upload speeds have been lower relative to White neighborhoods during the pandemic compared to before it. For Black neighborhoods, upload speeds have been nearly 5% lower, statistically significant at p < 0.001. For Hispanic neighborhoods, upload speeds have been approximately 1.5% lower, an effect that is statistically significant at p < 0.01. For Other neighborhoods, upload speeds have been approximately 0.8% lower, an effect that is statistically significant at p < 0.05. These results generally confirm the findings of the figures: upload speeds have gotten relatively worse in majority non-White neighborhoods since the onset of the pandemic through the end of 2021.
In addition, as the Figures suggest, the story is somewhat different when it comes to download speeds. Model 2 predicts download speeds using the exact same form as Model 1. The ‘ON’ variables reveal how download speed inequality shifted during the onset of the pandemic relative to before the pandemic. These results reveal that Black, Hispanic, and Other neighborhoods all experienced statistically significant increases in download speeds relative to White neighborhoods during the onset of the pandemic. It should be emphasized, however, that these increases were strikingly small. Majority Hispanic neighborhoods experienced the largest increase, but even that was only equivalent to approximately 1.5%. The post-onset period was somewhat similar. Hispanic and Other, but not Black, neighborhoods experienced significant increases in download speed relative to White neighborhoods. While these two coefficients were both statistically significant, these increases were again markedly small—only approximately 1.0% and 0.5%, respectively.
4. Discussion
Overall, the results of this study suggest that the COVID-19 pandemic accompanied a considerable shift in internet speeds and that inequalities between majority Black and Hispanic neighborhoods and majority White neighborhoods have fluctuated substantially over time. Specifically, these results indicate that prior to the pandemic, majority Black neighborhoods had slightly faster download speeds and slightly slower upload speeds than majority White neighborhoods. However, during the onset of the pandemic, upload speeds in Black neighborhoods relative to White neighborhoods substantially worsened. This disparity is especially notable because upload speeds are essential for video communication, usage of which greatly increased during the pandemic.
We observe that the size of disparities between most neighborhoods is relatively minute. For example, for majority Hispanic neighborhoods, upload speeds are never more than 2% greater than in majority White neighborhoods. Download speeds are similarly never more than 7% higher. The largest observed disparity is for upload speeds between majority Black and majority White neighborhoods, where disparities reached 13.2% at one point in 2021. This was notably the only disparity we observed where internet speeds were slower in non-White neighborhoods compared to White neighborhoods, as majority Black neighborhoods had slightly faster download speeds than majority White neighborhoods. Ultimately, while it is a pleasant surprise that more racial disparities in internet speed do not exist, it is unfortunate that the single disparity we observe is so stark and seems likely to be especially relevant.
Several limitations to this study should be considered when interpreting these results. First, the data applied in this study is based on voluntary internet speed tests initiated by individuals concerned about their internet speed. As a result, these findings may not be representative of the entire U.S. population as sampling bias may be present in who chooses to do these tests. Second, it is possible that the pandemic had a unique impact on sample bias as different sets of users may have been running internet speed tests. This possible sample bias could potentially affect our ability to appropriately interpret changes in disparities over time, as differences may be simply attributable to changes in the sample. Finally, this study analyzed census-tract level variation in internet speed, but individual factors, such as income, spending power, and technological prowess, may also confound associations between neighborhood of residence and internet speed. Further research is needed to examine the role of these factors in internet speed and determine whether or not these inequalities are truly neighborhood-level effects.
There is still much work to be done to address internet speed inequality in neighborhoods. The nexus of neighborhoods and internet speed is extremely understudied from a sociological perspective. Future research should further focus on identifying the root causes of this inequality and exploring potential solutions. One avenue for exploration could include investigating the role of government policy and regulation in ensuring geographically consistent access to high-speed internet. Ultimately, the best way to address these disparities in internet speed is for scholars to work to determine their root causes. Future research should also examine the long-term impacts of internet speed inequality on individuals and communities, including empirical analyses of exactly how inequalities in internet speed exacerbate racial disparities. By shedding light on these issues, future research can inform policies and practices that can help create a more equitable and connected society.