3.1. Full Sample Empirical Regression
Model (1) was indexed by the number of states in the United States, the number of days from 1 January 2020 to 4 April 2020, and the number of predictors. The unit root test results for each variable are shown in Table 2, and the Hausman test result for the full sample is shown in Table 3.
Table 2 presents the results of the two unit root tests (ADF and PP) for each variable. If a sample sequence were non-stationary, it would need to be transformed by a difference or lag operator to obtain a stationary time series. However, both the ADF and PP tests rejected the null hypothesis of a unit root at the 0.1% level for all 15 variables, so all 15 variables were treated as stationary.
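Because all 15 variables were found to be stationary, no differencing was needed here. For illustration only, the following minimal sketch shows what applying a first-difference operator to a non-stationary sequence would look like. The function name first_difference and the toy series are hypothetical and are not taken from the study.

```python
from typing import List


def first_difference(series: List[float]) -> List[float]:
    """Return y_t - y_{t-1} for each consecutive pair (one observation is lost)."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


# Made-up trending series for illustration only (not data from the study).
levels = [1.0, 3.0, 6.0, 10.0, 15.0]
diff_once = first_difference(levels)      # increments of the levels
diff_twice = first_difference(diff_once)  # increments of the increments
```

In practice, the unit root tests would be re-run on the differenced series before it is used in the regression.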
After the stationarity tests, the second step was to apply a Hausman test to assess whether the panel data Model (1) should be estimated as a fixed effect or a random effect model. The Hausman test result is shown in Table 3. The p-value was 0.994, which meant that the null hypothesis of the Hausman test was not rejected at the 0.1%, 1%, or 5% level; we therefore chose a random effect model (Model (3)) to analyze the full sample.
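For reference, the textbook form of the Hausman statistic compares the fixed effect and random effect coefficient estimates. The notation below (the subscripts FE and RE, and k for the number of coefficients compared) is assumed for illustration and is not taken from the original models.

```latex
H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})^{\top}
    \left[ \widehat{\mathrm{Var}}(\hat{\beta}_{FE}) - \widehat{\mathrm{Var}}(\hat{\beta}_{RE}) \right]^{-1}
    (\hat{\beta}_{FE} - \hat{\beta}_{RE})
    \;\sim\; \chi^{2}_{k} \quad \text{under } H_{0}
```

Under the null hypothesis, both estimators are consistent and the random effect estimator is efficient, so a large p-value such as the one reported here gives no evidence against the random effect specification.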
Based on the result of the Hausman test, the random effect model was selected for the full sample estimation in this study. To answer RQ1, the random effect estimates of Model (3) for the different Google search query keywords over the full sample are listed in Table 4.
Regarding the change of the disease name from coronavirus to COVID-19, pneumonia (2.11) was the only disease-name-related search query found to have a positive coefficient with newly added cases.
In terms of symptom-related queries, fever had a significant positive coefficient (1.972), which indicated that high temperature was the most significant symptom for predicting infection. Other symptom queries, such as shortness of breath (1.576), were not significant.
For searches related to treatments and medical resources, positive correlations were found between infected cases and searches for ventilator (8.424), hospital (0.77), and mask (0.574). This meant that an increase in searches for medical resources of these kinds could predict an increase in the number of positive cases. This was a warning sign to authorities that medical resources such as masks, hospitals, and especially ventilators could be generally insufficient.
Among public-health-measures-related search terms, quarantine had a significant positive coefficient (0.587), while lockdown (−2.172) and self-isolation (−6.25) had significant negative coefficients. Searches for passive individual control measures such as quarantine positively predicted newly added cases, which meant that when citizens searched for public controls at the passive individual level, infections were increasing rapidly. By contrast, the coefficient for collective-level search terms such as lockdown was negative, which meant that when people noticed a lockdown in an area, it was associated with a decrease in new cases. Search terms on social distancing and self-isolation were also significantly associated with a drop in newly added COVID-19 cases. These results suggest that relevant medical management departments needed to increase public concern about preventive measures at the active individual level and to carry out further collective-level control measures, while the effect of passive individual control measures may not be significant.
To answer RQ1, the results showed that the increase or decrease in COVID-19 cases in the United States was partly related to these Google search variables. The query terms related to treatments and medical resources (ventilator, hospital, and mask) were positive predictors of COVID-19 cases. This was probably because COVID-19 was most harmful to the respiratory system and ventilators were very expensive and scarce. In terms of symptoms, people could find and compare symptoms online to see whether they had COVID-19. This allowed potential patients to be detected and confirmed as early as possible. They could thereby reduce travel, maintain safe social distancing, self-isolate, and seek medical treatment at hospitals. Public safety measures such as lockdown and self-isolation were negative predictors of the number of new cases, reflecting that these measures could have reduced the number of new cases.
3.2. Sub-Sample Empirical Regression
To answer RQ2 and discuss the impact of public health measures across states, this study collected data from 50 states in the United States and focused on the effect of the significant public health measure variables in Table 4 (quarantine, lockdown, and self-isolation) on the increase in COVID-19 cases.
This article defined the first day with a non-zero increase in new cases as the day when the outbreak period began. For each state, we recorded the number of days in the outbreak period and the sum of the daily positive increases; the average daily positive increase during the outbreak period was this sum divided by the number of days in the outbreak period. Additionally, this study used the average daily positive increase to sort the 50 states from lowest to highest, as shown in Table 5.
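The ranking procedure described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names, the toy series, and the assumption that the outbreak period runs from the first non-zero day to the end of the sample are all hypothetical.

```python
from typing import Dict, List, Tuple


def outbreak_summary(daily_new_cases: List[int]) -> Tuple[int, int, float]:
    """Return (days in outbreak period, total increase, average daily increase).

    Assumption: the outbreak period starts on the first day with a non-zero
    increase and runs to the end of the series.
    """
    # Index of the first day with a non-zero increase, if there is one.
    start = next((i for i, n in enumerate(daily_new_cases) if n != 0), None)
    if start is None:
        return 0, 0, 0.0
    period = daily_new_cases[start:]
    days = len(period)
    total = sum(period)
    return days, total, total / days


def rank_states(series_by_state: Dict[str, List[int]]) -> List[Tuple[str, float]]:
    """Sort states from lowest to highest average daily increase."""
    averages = {s: outbreak_summary(v)[2] for s, v in series_by_state.items()}
    return sorted(averages.items(), key=lambda kv: kv[1])


# Made-up illustrative series, not data from the study.
toy = {
    "State A": [0, 0, 1, 2, 3],
    "State B": [0, 5, 10, 15, 30],
    "State C": [0, 0, 0, 4, 2],
}
ranking = rank_states(toy)
```

Here "State B" has the longest outbreak period and the highest average, so it is ranked last, which mirrors how the state with the highest average appears at the end of the ordering.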
As shown in Table 5, the average daily positive increase of Alaska was the lowest, while that of New York was the highest among the 50 states of the United States. To better discuss the 50 states, we divided them into five categories according to the state rankings and calculated, for each category, the mean number of days in the outbreak period and the mean average daily positive increase during the outbreak period, as shown in Table 6.
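The grouping step can be sketched as below. This is a hedged illustration only: it assumes the five categories are equal-sized consecutive blocks of the ranking, and the function name category_means and the ten toy records are hypothetical rather than taken from the study.

```python
from statistics import mean
from typing import List, Tuple

# Each record is (state name, days in outbreak period, average daily increase).
Record = Tuple[str, int, float]


def category_means(ranked: List[Record], n_categories: int = 5) -> List[Tuple[float, float]]:
    """Split an already-ranked list into equal consecutive categories and
    return (mean days, mean average daily increase) for each category."""
    if len(ranked) % n_categories != 0:
        raise ValueError("number of states must be divisible by n_categories")
    size = len(ranked) // n_categories
    results = []
    for k in range(n_categories):
        group = ranked[k * size:(k + 1) * size]
        mean_days = mean(days for _, days, _ in group)
        mean_avg = mean(avg for _, _, avg in group)
        results.append((mean_days, mean_avg))
    return results


# Ten made-up states, already sorted by average daily increase.
toy_ranked = [(f"S{i}", 20 + i, float(i * 10)) for i in range(1, 11)]
means = category_means(toy_ranked)
```

With 50 ranked states and five categories, each block would contain 10 states under this assumption.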
According to Table 6, both the mean number of days in the outbreak period and the mean average daily positive increase during the outbreak period increased gradually across the five categories. The mean number of days was between 24 and 30 in each category, but the mean average daily positive increase differed considerably, with the lowest being 11.648 and the highest being 793.319. We then used Model (1) to perform a panel data regression on the first 20 states (Alaska to Oregon) and the last 20 states (Missouri to New York); the Hausman test results are shown in Table 7.
Table 7 presents the Hausman tests for assessing whether the two sub-sample panel data models were fixed effect or random effect models. The p-value for the states ranking 1 to 20 was 1.000, and the p-value for the states ranking 31 to 50 was 0.846, which meant that the null hypotheses of the two Hausman tests were not rejected at the 0.1%, 1%, or 5% level. Based on these results, the random effect model was selected for both sub-sample estimations in this study. The partial estimation results for the public health measure variables that were significant in the full sample regression are shown in Table 8 and Table 9, respectively.
Table 8 shows that in the states ranking 1 to 20, among public-health-measures-related search terms, quarantine (−0.024), lockdown (−0.035), and self-isolation (−0.185) had negative coefficients, and the significant variables were the same as in the full sample. It is particularly important to note that quarantine and self-isolation search queries reflected individual passive and active behaviors in public health measures, while lockdown was an overall government-imposed or suggested public health measure. This meant that in mild states (states ranking 1 to 20), such as Alaska, North Dakota, Wyoming, South Dakota, and Montana, all public health measure search queries (i.e., quarantine, lockdown, and self-isolation) were significantly associated with a reduction in COVID-19 cases.
However, as Table 9 shows, in the states ranking 31 to 50, lockdown (−3.208) and self-isolation (−20.735) had significant negative coefficients, but quarantine was not significant at the 0.1%, 1%, or 5% level. This meant that in serious states (states ranking 31 to 50), such as Washington, Illinois, Pennsylvania, Florida, and Massachusetts, only the government-imposed or suggested lockdown and the self-isolation search queries were significantly associated with a reduction in COVID-19 cases; the self-isolation search query in particular had a large negative correlation with COVID-19 cases.
To answer RQ2, the searching behaviors suggested that the public health measures taken by the government in response to COVID-19 were closely related to the situation of the pandemic. In mild states (states ranking 1 to 20), all public health measure search queries were significantly associated with a reduction in COVID-19 cases. In serious states (states ranking 31 to 50), only lockdown and self-isolation search queries were significantly associated with a reduction in COVID-19 cases; the self-isolation search query in particular had a large negative correlation with COVID-19 cases.