#### *3.1. Model*

To evaluate the impact of air pollution on tourism precisely, the empirical analysis should carefully control for other determinants of tourism. Tourist flows can be modelled well using the gravity model, which has been widely used in the international trade literature. As discussed in previous literature (e.g., [43]), the general form of a typical gravity model used in tourism research can be expressed as *Tij* = *f*(*Destinationi*, *Originj*, *Interactionij*), where *Tij* denotes the number of tourist visits to destination region *i* from origin region *j*; *Destinationi* refers to the features of the destination that act as forces pulling tourists to region *i* (e.g., clean environment, famous scenic spots, attractive culture); *Originj* refers to the features of the origin region that act as forces pushing tourists from region *j* (e.g., large population size, high disposable income); and *Interactionij* denotes the interactive factors that determine the costs for tourists from origin *j* to visit destination *i* (e.g., geographic distance, convenience of applying for a visa). The gravity model has been used in previous studies to explore the determinants of tourism. For instance, Huang et al. [44], Xu et al. [45], and Yang and Wong [46] investigated inbound tourism flows to China.

In this paper, in order to estimate the impact of air pollution on inbound tourism in different Chinese provinces, the following gravity model in a linear form was used:

$$\begin{aligned} T_{ijt} &= \eta\,\text{AirPollution}_{it} + \phi\,\text{AirPollution}_{jt} + \text{Destination}_{it}\,\alpha \\ &\quad + \text{Origin}_{jt}\,\beta + \text{Interaction}_{ijt}\,\gamma + s_{i} + u_{j} + v_{t} + \varepsilon_{ijt}, \end{aligned} \tag{1}$$

where *Tijt* is the dependent variable, the number of inbound tourist arrivals (in person-times) in China's province *i* from country *j* in period *t*. Here, "inbound tourist" refers to a tourist who is not a resident of Mainland China. *AirPollutionit* and *AirPollutionjt* are the degrees of air pollution in province *i* and country *j*, respectively. *Destinationit* is a vector containing a set of variables capturing the characteristics of destination province *i*. *Originjt* is a vector containing the variables measuring the features of origin country *j*. *Interactionijt* is a vector containing the variables describing the interactive relationship between province *i* and country *j*. *si* is the province-fixed effect; *uj* is the country-fixed effect; and *vt* is the time-fixed effect. *εijt* is the error term. The dependent variable is log-transformed to address the scaling problem. Thus, variations in tourist arrivals are expressed in percentage changes. *η*, *ϕ*, *α*, *β*, and *γ* are coefficients to be estimated.
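Because the dependent variable is in logarithms, a coefficient on a regressor measured in levels can be read as a semi-elasticity. The following minimal Python sketch illustrates this reading, assuming that the pollution variables enter Equation (1) in levels, as the notation suggests. The function name and the numerical coefficient are hypothetical placeholders and are not estimates from this study.

```python
import math


def pct_change_in_arrivals(coef: float, delta_pm25: float) -> float:
    """Percentage change in arrivals implied by a PM2.5 change of delta_pm25.

    With ln(T) on the left-hand side, a change of delta_pm25 shifts ln(T)
    by coef * delta_pm25, so T is multiplied by exp(coef * delta_pm25).
    """
    return 100.0 * (math.exp(coef * delta_pm25) - 1.0)


# Hypothetical coefficient, chosen only for illustration.
eta_example = -0.01
print(round(pct_change_in_arrivals(eta_example, 10.0), 2))
```

For small values of the product of the coefficient and the change, the result is close to 100 times that product, which is the usual shorthand interpretation.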

#### *3.2. Selection of Explanatory Variables*

#### 3.2.1. Air Pollution

In this study, the degree of air pollution was measured by the PM2.5 concentration in ambient air. PM2.5 is one of the most significant air pollutants and is well known to the public. Previous studies have reported that PM2.5 seriously harms public health and the tourism experience (e.g., [29,47–52]). Thus, in Equation (1), the variables *AirPollutionit* and *AirPollutionjt* refer to the annual average PM2.5 concentration (μg/m³). According to Hypotheses 1 and 2, we expected both variables to have negative coefficients.

#### 3.2.2. Destination Features

The vector *Destinationit* in Equation (1) contains the following variables for province *i* in year *t*: *ln*(*Population*)*it*, *ln*(*GDPpc*)*it*, *ln*(*Scenic*)*it*, *ln*(*Hotel*)*it*, *Hospitalit*, *Transportit*, *Urbanit*, *GDPgrit*, *Structureit*, *Temperatureit*, and *Rainit*.

*ln*(*Population*)*it* and *ln*(*GDPpc*)*it* are the logarithmic values of population and real GDP per capita, respectively. Previous studies suggested that the scale of the destination economy is a determinant of cross-border tourism volume [53,54]. Ceteris paribus, a bigger economy is associated with a larger tourism volume than a smaller economy. The economic scale is typically measured by the size of GDP, which can be decomposed into GDP per capita multiplied by population. GDP per capita can be considered an indicator of the level of economic development and of people's income. Countries with higher GDP per capita usually enjoy better infrastructure and more developed transportation networks, and are able to provide higher-quality services. All of these attributes help constitute a favored tourism destination [55]. We expected a larger population and a higher income level to be associated with more inbound tourist arrivals.
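The decomposition of economic scale can be written out explicitly. Taking logarithms turns the product into a sum of the two regressors, so entering both allows the income effect and the population effect to carry separate coefficients. The symbol $GDP_{it}$ for total GDP is introduced here only for illustration:

$$\ln(GDP_{it}) = \ln(GDPpc_{it} \times Population_{it}) = \ln(GDPpc_{it}) + \ln(Population_{it}).$$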

*ln*(*Scenic*)*it* refers to the logarithmic value of the number of 4A- and 5A-rated scenic spots, classified by the China National Tourism Administration. Because 5A spots are usually considered to be much more attractive than 4A spots, one 5A spot was assumed to be equal to two 4A spots. *ln*(*Hotel*)*it* is the logarithmic value of the number of star-rated hotels. The number of scenic spots is a proxy for the attractiveness of the destination region. The number of hotels indicates the capacity to provide accommodation services. Both variables were expected to be positively correlated with tourist arrivals.
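The weighting rule for scenic spots can be illustrated with a short Python sketch. It assumes the weighted count is simply the number of 4A spots plus twice the number of 5A spots. The function name, the guard against a zero count, and the example figures are hypothetical and are not taken from the data set.

```python
import math


def ln_scenic(n_4a: int, n_5a: int, weight_5a: float = 2.0) -> float:
    """Log of the weighted count of rated scenic spots.

    Each 5A spot counts as weight_5a 4A-equivalent spots (two in the text).
    """
    weighted = n_4a + weight_5a * n_5a
    if weighted <= 0:
        # The logarithm is undefined when a province has no rated spots.
        raise ValueError("weighted scenic count must be positive")
    return math.log(weighted)


# Hypothetical province with 40 4A spots and 5 5A spots: 40 + 2 * 5 = 50.
print(ln_scenic(40, 5))
```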

*Hospitalit* is the ratio of the number of health-care workers to the local population. This variable is a proxy for the abundance of public health infrastructure, which is a component of tourism services. As many foreign tourists stay in China for days or even weeks, the availability of public health services might be a concern. *Transportit* is a variable reflecting the convenience of transportation, measured by the length of road per capita. Since the transportation system carries travelers and related tourist products, its infrastructure should be seen as one of the most vital foundations of tourism services. Convenient transportation infrastructure increases the likelihood that international travelers will visit. In contrast, poor traffic conditions leave a negative impression on foreign visitors, harm the tourist experience, and limit the expansion of tourism. Previous studies, such as those by Khadaroo and Seetanah [53] and Zheng et al. [56], have widely confirmed that transportation infrastructure is a significant determinant of tourism development. These two variables, measuring hospital and transportation availability, were expected to facilitate inbound tourism.

*Urbanit* is the urbanization rate, measured by the proportion of urban population in total population. *GDPgrit* is the GDP growth rate. These two variables were used to describe social and macroeconomic status. As tourism is a part of the aggregate socioeconomic system, it is probably related to these two variables. *Structureit* is the industrial structure, proxied by non-agricultural value added as a share of GDP. Industrial upgrading was expected to be positively correlated with the development of tourism.

*Temperatureit* and *Rainit* refer to the annual average temperature and proportion of rainy days. These two variables might be relevant to the number of tourist arrivals because tourism is a weather-dependent industry.

#### 3.2.3. Origin Features

The vector *Originjt* in Equation (1) contains the following variables for country *j* in year *t*: *ln*(*Population*)*jt*, *ln*(*GDPpc*)*jt*, *Transportjt*, *Urbanjt*, and *GDPgrjt*.

*ln*(*Population*)*jt* is the logarithmic value of population size. *ln*(*GDPpc*)*jt* is the logarithmic value of GDP per capita. The scale of the origin country may be a crucial determinant of cross-border tourism volume [53,54]. Generally speaking, a big country generates a larger tourism volume than a small country. The economic scale can be measured by the size of GDP, which equals GDP per capita multiplied by population. A higher GDP per capita indicates a higher average level of personal income, implying that more people can afford international travel. A larger population base is associated with a greater scale of population mobility. We expected both variables to be positively correlated with the number of tourists visiting China.

*Transportjt* reflects the convenience of air transport, measured by the ratio of the number of registered carrier departures to local population. The transportation system is responsible for transporting travelers and tourist products. Transportation infrastructure is one of the most vital bases for tourism activities. Since most international tourists need to travel long distances from their origin countries to China, the convenience of air transport in their countries is a crucial concern. We supposed that this variable had a positive effect on tourist arrivals in China.

*Urbanjt* is the urbanization rate, and *GDPgrjt* is the GDP growth rate. We used them as control variables to represent the basic social and macroeconomic status in tourists' home countries. We did not impose any prior expectation on the sign of their coefficients.

#### 3.2.4. Interaction Variables

The vector *Interactionijt* in Equation (1) contains the following variables: *ln*(*ER*)*ijt*, *ln*(*Distance*)*ij*, *TradeOpenijt*, and *VisaFreeijt*.

*ln*(*ER*)*ijt* is the logarithmic value of the relative exchange rate between Chinese currency and foreign currency adjusted by price level. It was calculated according to the formula: *ln*(*ER*)*ijt* = *ln*[(*CPIit*/*Eit*)/(*CPIjt*/*Ejt*)], where *CPI* is the consumer price index (value in 2010 = 100) and *E* is the exchange rate of the local currency against the US dollar (value in 2010 = 100). The relative price of tourism products and services between origin and destination countries influences tourists' decisions [57–59]. The price factor demonstrates the relative cost of staying in the destination country compared to staying in the origin country. Two major elements of tourists' expenditures are the costs of travel and living [60,61]. The consumer price index together with the exchange rate are key indices used to evaluate the cost. We expected that the relative price negatively affected the tourist arrivals in China.
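The formula for the relative price can be sketched in Python as follows. The function and argument names are hypothetical, and the example values are placeholders rather than data from this study. Under the formula as stated, a higher destination price index raises the value of the variable.

```python
import math


def ln_relative_price(cpi_dest: float, e_dest: float,
                      cpi_orig: float, e_orig: float) -> float:
    """ln(ER) = ln[(CPI_it / E_it) / (CPI_jt / E_jt)], all indices with 2010 = 100."""
    return math.log((cpi_dest / e_dest) / (cpi_orig / e_orig))


# In the base year every index equals 100, so the log relative price is zero.
print(ln_relative_price(100.0, 100.0, 100.0, 100.0))

# Hypothetical case: the destination price index rises to 110, others unchanged.
print(ln_relative_price(110.0, 100.0, 100.0, 100.0))
```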

*ln*(*Distance*)*ij* is the logarithmic value of the geographic distance between the capital city of province *i* and the capital of country *j*. The geographic distance between two regions directly affects the cost of travel and should therefore be taken into account. International tourists usually depart from and arrive in big cities because of the good infrastructure and convenient transportation in those cities. Thus, researchers often use the distance between capital cities of origin and destination countries as a proxy for intercountry distance. Besides geographic distance, the literature has found that cultural distance might also play a role in shaping tourism demand [62]. For instance, Yang and Wong [46] reported that cultural distance had a negative effect on inbound tourism flow into China. Because of the lack of data, we did not include cultural distance in our model. The geographic distance variable was expected to be negatively correlated with the dependent variable.
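One common way to obtain a distance between two capital cities is the great-circle (haversine) formula. The text does not state how the distance was computed, so the Python sketch below is only an assumption about one possible construction. The function names, the Earth radius constant, and the approximate coordinates are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a conventional approximation


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine great-circle distance in kilometres; inputs in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def ln_distance(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Log of the great-circle distance between two distinct points."""
    return math.log(great_circle_km(lat1, lon1, lat2, lon2))


# Approximate coordinates of Beijing and Tokyo, used only as an illustration.
print(ln_distance(39.90, 116.41, 35.68, 139.65))
```

Because the distance between a given pair of capitals does not change over time, this variable carries no time subscript.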

*TradeOpenijt* refers to the degree of international trade openness, calculated by *TradeOpenijt* = *Tradeijt*/*GDPjt*, where *Tradeijt* is the volume of international trade between province *i* and country *j* and *GDPjt* is the GDP of country *j*. Trade flows can be used to assess the intensity of economic interactions between two regions. The more intensive the economic relationship, the more business tourists travel between the regions. Thus, the relative trade volume is a meaningful proxy for the closeness of the intercountry economic relationship and can partially explain the volume of tourism flow [44,45]. Because of its great size in terms of macroeconomy and international trade, China is one of the most important business partners for many countries. The trade flow variable therefore should not be omitted in research on inbound tourism to China. It was expected that, if the degree of trade openness was high, the number of cross-border tourists would also be large.

*VisaFreeijt* is a dummy variable indicating whether there is a 72-h visa-free policy for tourists from country *j* to the capital city of province *i*. Its value is equal to 1 if the policy was implemented, and 0 otherwise. As a sign of the relationship between different countries, tourism liberalization policies also play an important role in tourism development. Liberalization policies provide convenience for potential visitors and present a welcome attitude to tourists. Arita et al. [63] found that Approved Destination Status (ADS), which allows government-approved travel agencies to obtain visas to ADS destinations in bulk, resulted in a significant increase in the number of cross-border tourists. Gil-Pareja et al. [64] reported that tourism-related agreements had a significant impact on international tourism inflow. For many ordinary tourists, the process of applying for a visa is time-consuming. Thus, a visa-free policy might significantly stimulate people's intention to visit.
