*3.3. Data*

The sample covered 30 provinces in Mainland China. Tibet was excluded because weather-related data were unavailable. Province-level tourist arrivals data were available for 13 origin countries: Australia, Canada, France, Germany, Japan, South Korea, Malaysia, the Philippines, Russia, Singapore, Thailand, the United Kingdom, and the United States. Arrivals from these 13 countries accounted for more than half of total foreign tourist arrivals in China. The sample period covered seven years, from 2010 to 2016.

The PM2.5 data for the Chinese provinces were obtained from the Chinese Research Data Services Platform (CNRDS), available at https://www.cnrds.com. The data on tourist arrivals and on the destination regions' economic and social features were derived from the database provided by EPS China Data, accessed at http://www.epschinadata.com. The weather data were obtained from the China Meteorological Data Service Center, accessed at http://data.cma.cn/en. The data for the origin-country variables were publicly available from the World Bank's World Development Indicators (WDI) dataset at http://datatopics.worldbank.org/world-development-indicators.

The interaction variables were constructed by combining several data sources. The relative exchange rate variable required information about both the exchange rate and the price level. The exchange rate data came mainly from the WDI dataset. Since the WDI does not report exchange rates for Euro Zone countries, the exchange rate data for France and Germany were obtained from the website of the Federal Reserve Bank of St. Louis at https://fred.stlouisfed.org. CPI data for the foreign countries were obtained from the WDI, and CPI data for the Chinese provinces were obtained from the EPS database. The distances between regions were calculated from the longitude and latitude information supplied by the World Cities Database, available at https://simplemaps.com/data/world-cities. Trade openness required data at both the province and country level, which were available from the EPS and WDI, respectively. The data on the 72-h visa-free policy were collected from news reports published by the official mass media.
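The distance calculation described above can be illustrated with a short sketch. The text does not state which formula was applied to the longitude and latitude data, so the haversine great-circle formula used here is an assumption, as are the function name `haversine_km`, the Earth radius constant, and the approximate example coordinates.

```python
import math

# Mean Earth radius in kilometres (assumed constant; the text does not specify one).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees.

    Hypothetical helper: the haversine formula is one common choice for
    distances computed from latitude and longitude, not necessarily the
    formula used in the study.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# Illustrative, approximate coordinates (not taken from the World Cities Database).
beijing = (39.90, 116.41)
tokyo = (35.68, 139.65)
distance_beijing_tokyo = haversine_km(*beijing, *tokyo)
```

In a setting like this one, such a function would be applied to each province and origin-country pair, for example using a representative city for each region; which reference points were paired is not specified in the text.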

Summary statistics of the variables are reported in Table 1. The table shows that the sample was highly diversified, including regions with different levels of air pollution and degrees of tourism development. The final dataset used in the regression analyses comprised 23 variables and 2651 observations in total.



Abbreviations: SD, standard deviation; Min, minimum; Max, maximum.
