**3. Methods**

#### *3.1. Model*

This study is based on time series data on air pollution in Shanghai. VAR analysis, a powerful tool for modeling complex time series, was used. Unlike basic correlation analysis or ordinary least squares (OLS) regression, the VAR model treats all variables as endogenous and includes time-lagged variables. Thus, the VAR model can capture the reciprocal reactions among variables as well as lagged and persistent effects. The following reduced-form VAR model was built:

$$\mathbf{y}\_t = \mathbf{c} + \sum\_{i=1}^{p} \mathbf{A}\_i \mathbf{y}\_{t-i} + \boldsymbol{\varepsilon}\_t, \tag{1}$$

where *yt* is the vector of endogenous variables in period *t*, *c* is the intercept vector, *Ai* is the autoregressive coefficient matrix at lag *i*, which captures the system dynamics, and *εt* is the vector of residuals. The lag order *p* of the model is selected on the basis of statistical criteria that are discussed later.
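As an illustration of Equation (1), the following is a minimal sketch of how a reduced-form VAR can be estimated by least squares. It is not the estimation code used in this study: the function names `simulate_var` and `fit_var_ols`, the coefficient matrix `A_true`, and the simulated data are all hypothetical, and the lag order is fixed at 1 purely for the example.

```python
import numpy as np


def simulate_var(c, A_list, n_obs, rng, burn_in=200):
    """Simulate y_t = c + sum_i A_i y_{t-i} + e_t with standard-normal residuals."""
    p, k = len(A_list), len(c)
    y = np.zeros((n_obs + burn_in + p, k))
    for t in range(p, len(y)):
        y[t] = c + sum(A @ y[t - i - 1] for i, A in enumerate(A_list))
        y[t] += rng.standard_normal(k)
    return y[-n_obs:]  # drop the burn-in so start-up values do not matter


def fit_var_ols(y, p):
    """Estimate c and A_1..A_p in Equation (1) by least squares."""
    n, k = y.shape
    # The regressor row for period t is [1, y_{t-1}, ..., y_{t-p}].
    lags = [y[p - i:n - i] for i in range(1, p + 1)]
    X = np.hstack([np.ones((n - p, 1))] + lags)
    B, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    c_hat = B[0]
    # Each k-row block of B is the transpose of one coefficient matrix A_i.
    A_hat = [B[1 + i * k:1 + (i + 1) * k].T for i in range(p)]
    return c_hat, A_hat


# Hypothetical three-variable system (not the paper's estimates).
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.1, 0.0],
                   [0.0, 0.4, 0.0],
                   [0.2, 0.1, 0.3]])
y = simulate_var(np.zeros(3), [A_true], n_obs=5000, rng=rng)
c_hat, A_hat = fit_var_ols(y, p=1)
print(np.round(A_hat[0], 2))
```

Because every equation in a reduced-form VAR shares the same regressors, a single least-squares solve over all columns of `y` is equivalent to running OLS equation by equation.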

In addition to the AQI and the Baidu index in Shanghai, the AQI in Beijing was included in the model as a potential explanatory variable that might affect Shanghai residents' concern about air pollution. Information today is highly mobile across many channels, so people's views and opinions are shaped not only by conditions in their local area but also by events in other regions.

Therefore, in this study, the vector *yt* = [*AQIt*, *AQIBeijing t*, *BaiduIndext*] was used. It contains three variables: *AQIt*, the daily air quality index in Shanghai; *AQIBeijing t*, the daily air quality index in Beijing; and *BaiduIndext*, the Baidu index for the keyword "Shanghai air quality". Following common practice, all three variables were log-transformed to mitigate potential scaling problems. Hence, changes in the variables can be interpreted approximately as percentage changes.
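The percentage-change reading of log-transformed variables rests on the approximation log(1 + x) ≈ x for small x. The short sketch below compares the two quantities for a few illustrative value pairs; the helper names `log_change` and `pct_change` and the numbers are hypothetical and are not taken from the study's data.

```python
import math


def log_change(old, new):
    """Change in the log-transformed value."""
    return math.log(new) - math.log(old)


def pct_change(old, new):
    """Exact proportional change."""
    return (new - old) / old


# Illustrative index values only: a 1%, a 10%, and a 100% increase.
for old, new in [(100, 101), (100, 110), (100, 200)]:
    print(old, new,
          round(log_change(old, new), 4),
          round(pct_change(old, new), 4))
```

The two measures are close for small changes and diverge as the change grows, so the percentage interpretation is best read as an approximation for moderate day-to-day movements.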

#### *3.2. Data for Measuring Actual Air Pollution*

Daily AQI data used to measure the actual air pollution level in Shanghai are publicly available from the website of the China Air Quality Online Monitoring and Analysis Platform: https://www.aqistudy.cn. The AQI is a composite index reflecting the degree of pollution in ambient air, calculated from measured concentrations of several major air pollutants (PM2.5, PM10, CO, NO2, O3, SO2) according to the guidelines of the official environmental protection authorities. A low AQI value indicates a low degree of air pollution, and a high AQI value indicates a high degree of air pollution.

The daily AQI values for the 2068 days between 2 December 2013 and 31 July 2019 were retrieved to measure actual air pollution in Shanghai. Additionally, to answer the second research question of whether a decline in air quality in another major city, such as Beijing, affects public concern about air quality in Shanghai, the daily AQI data for Beijing over the same period were also retrieved. Data prior to 2 December 2013 were not included in this study because they were unavailable.
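The sample length can be checked directly from the two endpoint dates, counting both ends. The sketch below does this and adds a simple gap check of the kind one might run on a downloaded daily series; the helper names and the `observed` list are hypothetical and stand in for real data.

```python
from datetime import date, timedelta

START = date(2013, 12, 2)
END = date(2019, 7, 31)


def calendar_days(start, end):
    """Return every calendar date from start to end, both ends included."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def missing_dates(observed, start, end):
    """Return the dates in [start, end] that are absent from `observed`."""
    seen = set(observed)
    return [d for d in calendar_days(start, end) if d not in seen]


full_sample = calendar_days(START, END)
print(len(full_sample))  # 2068

# Hypothetical series with one day dropped, to show the gap check.
observed = [d for d in full_sample if d != date(2016, 2, 29)]
print(missing_dates(observed, START, END))
```

A complete daily series over this window therefore has exactly 2068 observations, which matches the sample size reported above.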

#### *3.3. Data for Measuring Residents' Concern about Air Pollution*

The Baidu index was used to measure residents' concern about air pollution in Shanghai. The Baidu index data were provided by Baidu, Inc., and are publicly available from the web page http://index.baidu.com. Baidu is the most popular Internet search engine in China and holds a major share of the market. The Baidu index is calculated from the daily search volume for a specific search item at the municipal, provincial, and national levels [16]. A high (low) value of the Baidu index for a certain item indicates that many (few) people searched for information on the item and cared about the relevant topic. Compared to self-administered survey methods, the Baidu index has two advantages in measuring the degree of public concern about air pollution. First, it covers a wide range of study samples. Since Baidu dominates the search engine market in China, the Baidu index can reflect the aggregate behavior of most Chinese Internet users. Second, the Baidu index data are available at a daily frequency for several years. This property enabled us to examine not only the long-term trend but also the short-term fluctuations of public concern about air pollution.

In this study, the Baidu index was restricted to the district of Shanghai in order to exclude individuals who were not living in Shanghai but were still interested in Shanghai's air quality. The keyword "Shanghai air quality" ("shanghai kongqi zhiliang" in Chinese) was selected as the search item. Other related search terms, such as "Shanghai haze" ("shanghai wumai" in Chinese) and "Shanghai PM2.5" ("shanghai pm2.5" in Chinese), were also tested as robustness checks. Consistent with the sample period for the AQI data, the Baidu index data between 2 December 2013 and 31 July 2019 were used.

Table 1 shows the summary statistics of the variables used in the analyses. In the table, both the original level and logarithmic value of variables are shown. In the VAR estimation, the logarithmic values were used to mitigate potential scaling problems.


