#### **4. Results**

This section reports the results of the empirical analyses. It first explains how the optimal lag order for the VAR model was selected. It then reports the core estimation results, focusing on the impulse response figures of the different variables and the forecast error variance decomposition (FEVD) estimates. Finally, several robustness checks on the results are presented.

#### *4.1. Selection of Optimal Lag Order*

Before estimating a VAR model, it is necessary to choose the lag order on the basis of lag order selection statistics. Table 2 reports these statistics and shows that different criteria suggest different lag orders. The LR (likelihood ratio), FPE (final prediction error), and AIC (Akaike's information criterion) statistics suggested eight lags. In contrast, the HQIC (Hannan and Quinn information criterion) suggested six lags, and the SBIC (Schwarz's Bayesian information criterion) suggested three lags. Since three of the five criteria suggested eight lags, eight lags were initially selected for VAR estimation. Specifications with three and six lags are examined in the robustness check section (Section 4.3.3), and they yielded robust results.


**Table 2.** Lag order selection statistics for vector autoregression (VAR) model. LR, likelihood ratio; FPE, final prediction error; AIC, Akaike's information criterion; HQIC, Hannan and Quinn information criterion; SBIC, Schwarz's Bayesian information criterion.

Note: \* indicates optimal lag order selection according to each statistic.
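The majority rule described above can be sketched in a few lines. The lag orders per criterion are taken from the text; the function name `majority_lag` and the variable names are illustrative assumptions and do not come from the original analysis.

```python
from collections import Counter

# Lag order preferred by each criterion, as reported in the text for Table 2.
suggested_lags = {"LR": 8, "FPE": 8, "AIC": 8, "HQIC": 6, "SBIC": 3}


def majority_lag(suggestions):
    """Return the lag order chosen by the most criteria and its vote count."""
    counts = Counter(suggestions.values())
    lag, votes = counts.most_common(1)[0]
    return lag, votes


baseline_lag, votes = majority_lag(suggested_lags)

# Lag orders suggested by the remaining criteria, kept for robustness checks.
alternative_lags = sorted(set(suggested_lags.values()) - {baseline_lag})

print(f"baseline lag order: {baseline_lag} ({votes} of {len(suggested_lags)} criteria)")
print(f"alternative lag orders for robustness checks: {alternative_lags}")
```

The alternative lag orders collected here correspond to the specifications re-estimated in Section 4.3.3.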

#### *4.2. Estimation Results*

In order to generate meaningful estimation results, the whole VAR system must be stable. Figure 2 shows that all eigenvalues were inside the unit circle. This indicates that the established VAR system satisfied the required stability condition, and it could be relied on to analyze interactions among variables.

**Figure 2.** Eigenvalue stability condition.
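For reference, the stability condition can be written out explicitly. The notation below ($K$ variables, coefficient matrices $A_i$, companion matrix $F$) is introduced here for illustration and does not appear in the original text; it is the standard textbook formulation for a VAR with $p$ lags.

```latex
y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + u_t,
\qquad
F =
\begin{pmatrix}
A_1 & A_2 & \cdots & A_{p-1} & A_p \\
I_K & 0   & \cdots & 0       & 0   \\
0   & I_K & \cdots & 0       & 0   \\
\vdots &  & \ddots &         & \vdots \\
0   & 0   & \cdots & I_K     & 0
\end{pmatrix},
\qquad
\text{stable} \iff |\lambda_i(F)| < 1 \quad \text{for all } i = 1, \dots, Kp .
```

With three variables and eight lags, as in the baseline specification, $F$ would be a $24 \times 24$ matrix with 24 eigenvalues to check against the unit circle.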

Since the estimation results of a VAR model are rarely interpreted through the estimated coefficients alone, the analyses primarily focused on impulse response figures (IRFs). Figure 3 shows the impulse response figures based on the estimated coefficients.
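To make the mechanics of an orthogonalized impulse response concrete, the following is a minimal sketch for a VAR with a single lag. It is not the specification estimated in this study (which used eight lags). The coefficient matrix, the residual covariance matrix, the variable ordering (Shanghai AQI, Beijing AQI, Baidu index), and the use of a Cholesky decomposition for orthogonalization are all assumptions made for illustration. Note also that a Cholesky-based shock has the size of one standard deviation of the orthogonalized innovation.

```python
import math


def cholesky(sigma):
    """Return lower-triangular P with P * P^T = sigma (sigma positive definite)."""
    n = len(sigma)
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(p[i][k] * p[j][k] for k in range(j))
            if i == j:
                p[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                p[i][j] = (sigma[i][j] - s) / p[j][j]
    return p


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def orthogonalized_irf(a1, sigma, horizon):
    """Responses for y_t = a1 * y_{t-1} + u_t with Cov(u_t) = sigma.

    Returns a list `irf` where irf[h][i][j] is the response of variable i,
    h periods after an orthogonalized shock to variable j.
    """
    n = len(a1)
    p = cholesky(sigma)
    phi = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    responses = []
    for _ in range(horizon + 1):
        responses.append(matmul(phi, p))  # Theta_h = A1^h * P
        phi = matmul(a1, phi)             # advance to A1^(h+1)
    return responses


# Hypothetical values; assumed ordering: [Shanghai AQI, Beijing AQI, Baidu index].
a1 = [[0.50, 0.10, -0.05],
      [0.10, 0.60, 0.00],
      [0.20, 0.05, 0.40]]
sigma = [[1.0, 0.3, 0.2],
         [0.3, 1.0, 0.1],
         [0.2, 0.1, 1.0]]

irf = orthogonalized_irf(a1, sigma, horizon=10)
for h in range(4):
    print(h, round(irf[h][2][0], 4))  # Baidu index response to a Shanghai AQI shock
```

Under this ordering, the variable placed first can move the others in the same period, whereas a shock to the last variable affects the others only from the next period onward. The ordering therefore matters for whether a response can appear "without any time lag".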

The three subfigures in the first row of Figure 3 show the responses of variables to an orthogonalized, one unit, positive shock of (logarithmic) AQI in Shanghai. As shown in Figure 3(i.a), Shanghai's AQI notably increased after such a shock. As displayed in Figure 3(i.b), the AQI in Beijing fluctuated slightly, but the variation was small and not statistically significant. Figure 3(i.c) shows that, in response to the AQI shock, the Baidu index rose significantly. To observe the response more clearly, this subfigure was amplified and is displayed in Figure 4(i). The AQI shock raised the Baidu index immediately, without any time lag, and the response remained significantly positive even after ten days. Therefore, Hypothesis 1 is supported: residents in Shanghai paid close attention to ambient air quality. As air quality deteriorated, Shanghai citizens expressed more concern about local air pollution, as reflected by the increase in the Baidu index.

The three subfigures in the second row of Figure 3 show the variable responses to a positive shock of (logarithmic) AQI in Beijing. In Figure 3(ii.a), an increase in Shanghai's AQI was observed. This could be explained by the spatial spillover effects of air pollution, as discussed in previous studies reporting the spatial interactions of air pollution among different regions (e.g., [50–52]). Figure 3(ii.b) shows that Beijing's AQI increased persistently. An important finding was that, as demonstrated by Figure 3(ii.c), the Baidu index for Shanghai's air quality rose in response to air pollution in Beijing. This implied that Shanghai residents' concern about local air quality increased after they observed that Beijing's air quality became worse. This graph was amplified and is displayed in Figure 4(ii). As demonstrated in that graph, one day after Beijing's AQI increased, the Baidu index for Shanghai's air quality began to increase. This increase was persistent for five days. The Baidu index returned to its original value after the sixth day. Therefore, Hypothesis 2 is supported. Air pollution problems in another large city indeed tend to increase the residents' concern about air pollution in Shanghai.

**Figure 3.** Impulse response figures (IRFs). Note: Each subfigure with the title of "X →Y" demonstrates the response of variable Y to an orthogonalized positive shock of variable X. In other words, X is an impulse variable, and Y is a response variable. One period in the figure denotes one day.

(i) AQI → Baidu index

(ii) AQI (Beijing) → Baidu index

(iii) Baidu index → AQI

**Figure 4.** Amplified impulse response figures (IRFs) of interest. Note: Each subfigure with the title of "X→Y" demonstrates the response of variable Y to an orthogonalized positive shock of variable X. In other words, X is an impulse variable, and Y is a response variable. One period in the figure denotes one day.

The subfigures in the last row of Figure 3 show the variable responses to a positive shock of the (logarithmic) Baidu index. Figure 3(iii.a) presents the interesting finding that the AQI in Shanghai actually decreased after the Baidu index increased. To show the details more clearly, this impulse response figure was amplified in Figure 4(iii). As can be seen from the graph, one day after an increase in the Baidu index, the AQI in Shanghai was below its initial level. This lasted for two days, after which the AQI returned to its initial level. This finding supports Hypothesis 3. A plausible explanation is that, when public concern about air pollution intensifies, people take action to ameliorate air quality or, at least, avoid aggravating pollution and wait for the air quality to improve naturally. Figure 3(iii.b) shows that the Baidu index had no significant impact on Beijing's AQI. Figure 3(iii.c) shows the persistence of the increase in the Baidu index.

Table 3 shows FEVD estimates for the Baidu index. Within a horizon of fourteen days, local AQI explained around 30%–40% of the forecast error variance of Shanghai residents' concern about air pollution. The AQI in Beijing explained roughly 5% of the variance. The rest was explained by the Baidu index itself. Local air quality was therefore quite important in forecasting fluctuations of the Baidu index, and air quality in another major city could also partly explain changes in the Baidu index for Shanghai's air quality. These estimates support the findings obtained from the impulse response figures.


**Table 3.** Forecast error variance decomposition (FEVD) estimates for the Baidu index.
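The variance shares of the kind reported in Table 3 are computed from the orthogonalized impulse responses: for each variable, the squared responses to each shock are summed over the forecast horizon and then normalized. The sketch below illustrates this calculation. The response matrices are hypothetical numbers chosen for readability, and the variable ordering (Shanghai AQI, Beijing AQI, Baidu index) is an assumption; none of these values are estimates from this study.

```python
def fevd(thetas, horizon):
    """Forecast error variance shares at the given horizon.

    thetas[h][i][j] is the orthogonalized response of variable i to shock j
    at step h. Returns shares[i][j]: the fraction of the horizon-step
    forecast error variance of variable i attributed to shock j.
    """
    n = len(thetas[0])
    shares = []
    for i in range(n):
        # Sum squared responses over steps 0 .. horizon-1 for each shock.
        contrib = [sum(thetas[h][i][j] ** 2 for h in range(horizon))
                   for j in range(n)]
        total = sum(contrib)
        shares.append([c / total for c in contrib])
    return shares


# Hypothetical orthogonalized responses at steps 0 and 1.
# Assumed ordering: [Shanghai AQI, Beijing AQI, Baidu index].
thetas = [
    [[1.0, 0.0, 0.0],
     [0.3, 0.9, 0.0],
     [0.4, 0.1, 0.8]],
    [[0.5, 0.1, 0.0],
     [0.2, 0.5, 0.0],
     [0.3, 0.1, 0.4]],
]

shares = fevd(thetas, horizon=2)
baidu = shares[2]
print("Baidu index variance shares (Shanghai AQI, Beijing AQI, own):",
      [round(s, 3) for s in baidu])
```

By construction, the shares for each variable sum to one, which is why the text can attribute "the rest" of the variance to the Baidu index itself.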

#### *4.3. Robustness Analyses*

In this subsection, we outline several robustness checks that were conducted on previous estimation results. First, whether results were sensitive to the selection of the search engine keyword for the Baidu index was examined. Second, alternative sample periods were considered. Third, alternative selections of lag orders in the VAR model were inspected. Lastly, whether air quality in other cities aside from Beijing affected the public concern about air pollution in Shanghai was further investigated. The impulse response figures are displayed in the subfigures of Figure 5.

#### 4.3.1. Alternative Baidu Index Keyword

In previous analyses, the keywords "Shanghai air quality" ("shanghai kongqi zhiliang" in Chinese) were relied on to derive the Baidu index. Next, another search term, "Shanghai haze" ("shanghai wumai" in Chinese), was utilized to get the Baidu index. The results are shown in Figure 5(i.a–i.c). It is apparent that the AQI in Shanghai positively affected the Baidu index; the AQI in Beijing positively affected the Baidu index for Shanghai; and a rise in the Baidu index tended to depress the AQI in Shanghai. Thus, the previous findings of this study remained unchanged. In addition, other search terms such as "Shanghai PM2.5", "air quality", "haze", and "PM2.5" were checked, and similar results were generated. To save space, the results using those alternative keywords are not reported here. The Supplementary Material attached to this paper provides additional information to demonstrate the robustness of the study results to the selection of Baidu index keywords. In the Supplementary Material, it is shown that the Baidu index values for different keywords were strongly correlated, indicating that different keywords actually reflected highly consistent online searching behaviors and provided similar information. Moreover, the Supplementary Material demonstrates the IRFs for the keywords "Shanghai PM2.5" (which had the lowest correlation coefficient with "Shanghai air quality", compared to other candidate keywords), which were almost the same as the IRFs shown in Figure 3.

#### 4.3.2. Shorter Sample Period

The baseline analyses were based on the sample period between 2 December 2013 and 31 July 2019. Admittedly, the level of the Baidu index was determined not only by the degree of public interest in the specific topic, but also by other factors such as changes in the market share of the Baidu search engine, the total number of Internet users, and Internet users' habits. The longer the sample period, the larger the impact those factors might have. As pointed out by Lu et al. [12], the long term annual trend of public concern about air pollution probably had characteristics different from those of short term fluctuations in air pollution. To mitigate this issue and inspect whether the study results were robust to the selection of the sample period, a shorter sample period from 1 January 2017 to 31 July 2019 was considered. The results, presented in Figure 5(ii.a–ii.c), are similar to those derived previously. Other subsample periods, such as between 1 January 2018 and 31 July 2019, were also checked. The results were analogous, but are not reported here to save space.

#### 4.3.3. Alternative Selection of Variable Lag Order

Previously, eight lags of variables were selected for the VAR model according to the LR, FPE, and AIC statistics. Since the HQIC and SBIC suggested different lag orders, the model was re-estimated on the basis of these alternative selections. According to the SBIC, three lags might be suitable. The corresponding impulse response figures are shown in Figure 5(iii.a–iii.c). Notably, these new impulse response figures did not alter the previous conclusions of this study. Other lag orders, such as six, following the suggestion of the HQIC, were also tested. Similar results were obtained.

#### 4.3.4. VAR Model with AQI in Another City

Previous analyses used a VAR model containing the AQI variable for Beijing. Next, whether the results were sensitive to the selection of this specific city was checked. The city of Nanjing was taken instead of Beijing. Nanjing is one of the largest and most important cities in East China. The obtained impulse response figures are shown in Figure 5(iv.a–iv.c). Compared to the baseline, the results were similar when Nanjing rather than Beijing was selected. Moreover, a specification using the AQI in Guangzhou, which is the largest city in South China, was also inspected. The results were similar, but are not reported here.

Overall, robustness checks strengthened the findings of this study. Hypotheses 1–3 were all supported.

*IJERPH* **2019**, *16*, 4784

**Figure 5.** Cont.

#### **5. Discussion and Implications**

The estimation results in this study provided three interesting findings. First, local residents perceived the deterioration of air quality and expressed their concern about air pollution quickly, within the day on which the air quality index rose. This supported previous studies that found a strong correlation between perceived and actual levels of air quality (e.g., [22,24]). This implies that air quality is consistently monitored and assessed by Shanghai residents and that they show high awareness of air pollution. It also lends support to the finding by Yan et al. [36] that people who live in richer and more polluted cities are more likely to perceive air pollution. In addition, the concurrent rise of social media platforms in China might also contribute to this strong association. These platforms provide a way to share news and public opinions quickly. Media alerts on AQI could trigger heated debate and discussion on air quality, as well as information seeking behavior.

Second, a decline in air quality in another major city, such as Beijing, also raised the local concern about air quality in Shanghai. This was plausible because air pollutants could be transported by wind, causing pollution to spread over an extensive region within a short time interval. Prior studies have provided evidence that air pollution has a negative spatial spillover impact on neighboring cities' public health [50]. Given that Beijing is 1200 km away from Shanghai, air pollution in Beijing might not directly cause health problems in Shanghai. However, it increases public concern about air pollution.

Third, a rise in Shanghai residents' concern had a beneficial impact on air quality improvement. On the one hand, this could be explained by prior findings revealing that environmental concerns could promote people's pro-environmental behaviors [41,42]. On the other hand, public concerns about air pollution could force governments to take actions to improve air quality [11,12]. It has been reported that China has curbed industrial emissions, restricted the use of cars on the road, and shut down coal mines in large cities such as Beijing, Shanghai, and Guangzhou [53].

This study contributed to the air pollution literature by empirically examining the reciprocal relationship between public concern about air quality and actual air quality using data on a daily basis. Different from prior studies that relied on survey data to measure perceived air quality, this study utilized a big data based dataset dating back to 2013 to conduct more accurate analyses. Additionally, this study performed VAR analysis rather than only basic correlation analysis, which helped demonstrate the rapidness and persistency of the rise in public concern about air pollution.

From a practical perspective, the findings imply that providing timely air quality indices to residents could be a powerful tool to raise public awareness of air pollution and to encourage steps toward pollution mitigation. The government could use online search engines to display, alongside residents' keyword search results, more information and advice on how people can minimize their contribution to air pollution. When observing a significant drop in air quality, the government should initiate immediate actions to tackle air pollution. To prevent excessive concern about pollution, it should also provide information on how people can minimize their exposure to polluted air and on which air pollution reduction strategies have been implemented. Furthermore, displaying AQI in other major Chinese cities could be utilized by local governments to promote local residents' awareness of air pollution and, subsequently, support for environmental protection.

#### **6. Conclusions, Limitations, and Future Research**

Using Shanghai as a case study, this study empirically examined the interactive relationship between the actual level of air pollution and residents' concern about air pollution on the basis of the daily Baidu search index and AQI data. This study highlighted that residents in Shanghai expressed immediate concern about air pollution whenever the air quality in Shanghai or in other major Chinese cities got worse. The study results also suggested that raising awareness of air pollution would motivate individuals or the government to carry out actions to improve air quality.

In evaluating the significant findings from this study, two major limitations need to be acknowledged. First, this paper focused on the circumstances in only one city because of difficulties in data collection. If we can collect data for a larger number of cities in the future and use the panel VAR approach, we could re-examine the findings based on Shanghai. A dataset covering more regions could also allow us to investigate possible heterogeneities across different regions, which may provide further policy implications. Second, as a preliminary exploration, this study only considered three variables in the VAR model because other variables were not available on a daily frequency. There is no doubt that some other factors may also be influential in the relationship between actual air pollution level and public concern about the problem. In the future, more variables may be introduced into the model as long as the data availability problem is overcome.

**Supplementary Materials:** The Supplementary Material is available online at http://www.mdpi.com/1660-4601/16/23/4784/s1.

**Author Contributions:** Conceptualization and funding acquisition, D.D. and X.X.; methodology, data curation, formal analysis, and original draft preparation, D.D.; literature review and review and editing, X.X.; software, validation, and supervision, W.X. and J.X.

**Funding:** This research was funded by the Fundamental Research Funds for the Central Universities (Grant Nos. JBK1801039 and JBK1809054).

**Acknowledgments:** The authors are grateful to the Editors and two anonymous referees for their comments and suggestions.

**Conflicts of Interest:** The authors declare no conflict of interest.
