#### *3.1. Data and Variables*

We use data from the China Health and Nutrition Survey (CHNS), a large-scale panel dataset that employed a multi-stage stratified sampling method to select households from 12 provinces and municipal cities in China, spread across the eastern, central and western regions. The selected households were followed for 10 waves (from 1989 to 2015), and surveyed on a wide range of topics including demographics, socio-economic characteristics and health outcomes (Popkin et al. 2010).

In this study, we use six waves of data from 2000 to 2015 and exclude the three municipal cities that were newly added in 2011. Data before 2000 are not used because the structure of the early questionnaires differed from that of the later waves. We measure income-related inequality in the use of different types of health services (formal medical care, preventive care, folk doctors and inpatient care) and different levels of health facilities (from low-level to high-level facilities: village clinics/community health centres, township health centres, county hospitals and city hospitals). We also consider the burden of OOP payments, defined, following previous literature (Wagstaff and Lindelow 2008), as the ratio of OOP payments in the last month to total monthly per capita household income. We censor this variable at a maximum of 100% in order to eliminate extremely high values for individuals belonging to households with a very low income (3.25% of the sample). To measure living standards, we adjust household income by applying the Organisation for Economic Co-operation and Development (OECD)-modified equivalence scale, which assigns a value of 1 to the household head, 0.5 to each additional adult and 0.3 to each child (OECD n.d.). The CHNS income measure is the net monetary income received by household members and includes income from farming, fishing, gardening, livestock and small commercial household businesses. We exclude households with negative income, so the final sample consists of 24,762 individuals, 6789 households and 67,856 person-wave observations. About 34.9% of respondents were interviewed only once in the survey, 20.2% twice, 13.8% three times, 9.8% four times, 10.2% five times, and 11.3% in all six waves. The attrition rate is quite high, and individuals reporting more use of formal medical care and higher levels of OOP payments were more likely to drop out of the sample.
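The two income-based constructions described above can be illustrated with a minimal sketch. The function and argument names are hypothetical and are not taken from the CHNS files; the treatment of zero income is an assumption made only so that the example is well defined.

```python
def equivalence_scale(n_adults, n_children):
    """OECD-modified scale: 1 for the household head, 0.5 for each
    additional adult and 0.3 for each child."""
    if n_adults < 1:
        raise ValueError("a household needs at least one adult (the head)")
    return 1.0 + 0.5 * (n_adults - 1) + 0.3 * n_children


def equivalised_income(household_income, n_adults, n_children):
    """Household income divided by the OECD-modified equivalence scale."""
    return household_income / equivalence_scale(n_adults, n_children)


def oop_burden(oop_last_month, monthly_income_per_capita, cap=1.0):
    """OOP payments as a share of monthly per capita household income,
    censored at `cap` (100% by default)."""
    if monthly_income_per_capita < 0:
        raise ValueError("households with negative income are excluded")
    if monthly_income_per_capita == 0:
        # Assumption for illustration only: with zero income the ratio is
        # undefined, so any positive payment is assigned the cap.
        return cap if oop_last_month > 0 else 0.0
    return min(oop_last_month / monthly_income_per_capita, cap)
```

For instance, a household with two adults and two children has a scale of 2.1, and an individual paying 500 out of pocket against a monthly per capita income of 200 is assigned the censored burden of 100%.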

In the decomposition analysis we explore how the major determinants of individuals' care-seeking behaviour are associated with income-related health inequality. We look at health-related characteristics (number of major diseases, number of symptoms and illness status during the four weeks preceding the survey), demographic characteristics (age, gender, number of children, ethnicity, urban/rural residential status, marital status) and socio-economic characteristics (house ownership, education levels, employment, occupation, social health insurance coverage and geographic region). Health status is thought to be the most important factor driving the utilization of health care. Since self-rated health status is not available in the CHNS 2009 and 2011 surveys, we use the number of major diseases and illness status during the four weeks preceding the survey as proxy variables for patients' health status (O'Donnell and Propper 1991; Van Doorslaer et al. 1992). We create 14 variables by interacting age categories with gender and control for other demographics such as ethnicity, urban/rural residential status and marital status. The socio-economic characteristics include house ownership, education levels and job status. House ownership can be a good indicator of a household's wealth in addition to income, especially for rural households who live on subsistence farming and have little or no income. Better education can lead either to an increase in health care use or to better health status, which in turn lowers the need for health care. Job status is particularly relevant in the Chinese context since most state welfare benefits (including UEBMI) are tied to the type of industry. For example, we would expect state government officials to have better access to health care facilities than self-employed businessmen because they are granted more generous state welfare benefits.

We also investigate the impact of geographic factors by dividing the nine provinces into three groups: eastern, middle and western regions. The eastern coastal area is generally more affluent and better supplied with high-quality medical infrastructure and services than the middle and western areas. Coverage by the three social health insurance schemes is also included: UEBMI and URBMI for urban residents and NCMS for rural residents. Table 1 shows the definitions of all variables and their summary statistics.


**Table 1.** Number of observations, mean, standard deviation of all variables, pooling all years from 2000 to 2015.


<sup>1</sup> N refers to the total number of observations that pool the repeated observations of the same individuals over the years, and SD to the standard deviation. <sup>2</sup> OOP stands for out-of-pocket expenditures.

#### *3.2. Measurement of Inequality*

Our first goal in this paper is to measure to what extent the health outcomes we have selected are related to incomes, and to examine whether and how these relationships have changed over the study period. Put differently: is there any evidence that wealthy people tend to have better access to health services than poor people, and have the health reforms changed anything? It is customary to use indices to measure the degree of socioeconomic inequality. Since we are looking at the joint distribution of health and income, the indices must be of the bivariate type. Positive index values indicate that health and income are positively correlated, and negative values that they are negatively correlated. Due to a lack of reliability in self-reported health measures in the setting of low-income countries (Van Doorslaer and O'Donnell 2011), we focus on measuring inequality in the allocation of health care resources without standardizing for the differences in health needs.

Broadly speaking, two types of bivariate indices of socioeconomic inequality of health can be distinguished: rank-dependent indices, such as the well-known concentration index (CI) (Wagstaff et al. 1991), and level-dependent indices. Rank-dependent indices measure the degree of correlation between health levels and income ranks, and can be expressed as weighted sums of health levels, where the weights are defined by a function of the income ranks (Coveney et al. 2016). The standard (or relative) CI is usually applied to non-negative ratio-scale health variables (Erreygers and Van Ourti 2011). Given that our health care and OOP burden variables are bounded, we use a modified version of the CI developed for bounded health variables (Erreygers 2009; Erreygers and Van Ourti 2011). However, since rank-dependent indices rely only on income ranks, they ignore relevant information about the levels of income (Erreygers and Kessels 2017). Level-dependent indices are similar to rank-dependent indices, but take into account income levels rather than income ranks. They too can be expressed as weighted sums of health levels, but the weights are now determined by a function of the income levels. These indices exploit more information about the income distribution and measure both income and health consistently by their levels. Here too, we have to use a modified version appropriate for bounded variables. The precise definitions of the indices calculated in this paper can be found in Appendix A.
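To make the rank-dependent case concrete, the following is a minimal sketch of the standard CI and of the correction for bounded variables proposed by Erreygers (2009), in which the CI is multiplied by four times the mean of the health variable divided by the width of its range. It is an illustration under simplifying assumptions (no sampling weights, ties in income broken by sort order, fractional ranks of the form (i − 0.5)/n) and is not the exact estimator defined in Appendix A; the function names are hypothetical.

```python
def concentration_index(health, income):
    """Standard (relative) CI: 2 * cov(h, R) / mean(h), where R is the
    fractional income rank. No sampling weights; ties in income are
    broken by sort order (a simplification)."""
    n = len(health)
    if n == 0 or n != len(income):
        raise ValueError("health and income must be non-empty and equal in length")
    order = sorted(range(n), key=lambda i: income[i])  # poorest first
    h = [health[i] for i in order]
    mu = sum(h) / n
    if mu == 0:
        raise ValueError("the CI is undefined when mean health is zero")
    ranks = [(i + 0.5) / n for i in range(n)]  # fractional ranks, mean 0.5
    cov = sum((hi - mu) * (ri - 0.5) for hi, ri in zip(h, ranks)) / n
    return 2.0 * cov / mu


def erreygers_index(health, income, lower=0.0, upper=1.0):
    """Corrected CI for a variable bounded on [lower, upper]:
    E = 4 * mean(h) / (upper - lower) * CI."""
    mu = sum(health) / len(health)
    return 4.0 * mu / (upper - lower) * concentration_index(health, income)
```

With a binary utilization indicator and four individuals ordered by income, `erreygers_index([0, 0, 1, 1], [100, 200, 300, 400])` returns 1.0, the maximal pro-rich value, and reversing the health vector gives -1.0.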

#### *3.3. Decomposition of Inequality*

Our second goal is to increase our understanding of the determinants of income-related inequalities. To this end, we decompose the inequality indices by means of demographic, socio-economic and health-related variables at the individual level. The conventional regression-based decomposition approach rests on a regression of the health variable only (Wagstaff et al. 2003), and for this reason has been subjected to criticism (Erreygers and Kessels 2013). In recent years, two alternative methods have been developed. The first is based on the recentred influence function approach (Heckley et al. 2016) and has already been applied to Chinese data (Cai et al. 2017). The second, which we employ in this study, is based on a regression of a composite variable that incorporates both health and income (Kessels and Erreygers 2019). The idea is that this variable can be interpreted as an indicator of an individual's deviation from a reference position in the income-health space, with the reference position determined by average health and average income. The exact definitions of the dependent variables of our decomposition regressions can be found in Appendix A.

We apply ordinary least squares (OLS) regressions to estimate the marginal effects of each individual variable on the inequality index. Previous studies found little difference between OLS and non-linear models for decomposition analysis, while the approximation techniques required by non-linear models might introduce additional errors (Van Doorslaer et al. 2004; Van Doorslaer and Masseria 2004; Van Doorslaer et al. 2000). A positive (negative) regression coefficient means that the associated explanatory variable is positively (negatively) correlated with both income and health. In contrast to what is often done in applications of the conventional regression-based decomposition technique, we do not estimate the contribution of each factor to the inequality indices. Instead, we calculate logworth values based on the *p*-values of the F tests to evaluate the relative importance of the (groups of) variables in influencing the correlation between the income and health dimensions.
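As a small illustration of the logworth statistic, the sketch below assumes the usual definition, the negative base-10 logarithm of a *p*-value, so that larger values indicate stronger evidence that a variable (or group of variables) matters. The *p*-values shown are made up for the example and are not results from this paper; the function name is hypothetical.

```python
import math


def logworth(p_value):
    """Logworth of a test: -log10(p). Assumes 0 < p <= 1."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p-value must lie in (0, 1]")
    return -math.log10(p_value)


# Hypothetical F-test p-values for three groups of regressors.
p_values = {"insurance": 0.04, "education": 0.0002, "region": 0.3}

# Rank the groups from most to least important by logworth.
ranking = sorted(p_values, key=lambda name: logworth(p_values[name]), reverse=True)
```

A *p*-value of 0.01 corresponds to a logworth of 2, so the conventional 1% significance threshold translates into logworth values above 2.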

#### **4. Results**

#### *4.1. Income-Related Inequality in Health Care Utilization and OOP Burden*

Tables 2 and 3 present the rank- and level-dependent indices measuring income-related inequality for health care utilization and OOP burden. Broadly speaking, both rank- and level-dependent indices give similar results in terms of the direction of inequality and its significance. There is substantial pro-rich inequality in the use of preventive care and pro-poor inequality in the use of folk doctors. The results suggest that low-income people have limited access to preventive services and are more likely to use folk doctors, who are traditional Chinese medical practitioners in rural areas. They are usually less qualified providers who received minimal basic medical and paramedical training and had no more than middle or high school education (Li et al. 2017b). They offer cheaper services compared to formal health providers, but also tend to conduct unnecessary or even dangerous practices. However, some folk doctor care also includes traditional Chinese medicine, which is considered appropriate in some clinical settings (Cui et al. 2004; Chen et al. 2008; Xiang et al. 2019; Harmsworth and Lewith 2001).

**Table 2.** Rank-dependent indices for income-related inequality of health care utilization and medical expenditure in China.


<sup>1</sup> For each outcome, the first row shows coefficients and the second standard errors. \* indicates statistically significant at the 10% level, \*\* at the 5% level and \*\*\* at the 1% level.

**Table 3.** Level-dependent indices for income-related inequality of health care utilization and medical expenditure in China.


<sup>1</sup> For each outcome, the first row shows coefficients and the second standard errors. \* indicates statistically significant at the 10% level, \*\* at the 5% level and \*\*\* at the 1% level.

Significant values of the inequality indices are found for the health categories preventive care, folk doctors, village clinics/community health centres, and city hospitals. For clarity we represent the values in Figures 1 and 2, respectively for rank-dependent indices and level-dependent indices. In terms of health facility use, the direction of inequality varies by provider level: pro-poor inequality is observed for the use of village clinics/community health centres and pro-rich inequality for the use of city hospitals. Wealthier people seem to have better access to high-level hospitals that offer more sophisticated care and require higher OOP expenditures.

**Figure 1.** Rank-dependent indices for income-related inequality of health care utilization in China.

**Figure 2.** Level-dependent indices for income-related inequality of health care utilization in China.

Even though the rich tend to use more expensive health care facilities than the poor, OOP expenditures seem to weigh more heavily on the poor than on the rich. Because we define the OOP burden as the ratio of OOP payments to per capita household income, a pro-poor distribution of the OOP burden indicates that in relative terms OOP expenditures tend to fall more heavily on the poor than on the rich (see Figure 3). In spite of the rapid expansion of social health insurance and other health care reform efforts, the disparities in health care utilization across incomes have remained similar over time.

**Figure 3.** Rank- and level-dependent indices for income-related inequality of OOP burden in China.
