Article

Investigating the Impact of Public Services on Rental Prices in Chinese Super Cities Based on Interpretable Machine Learning

1 School of Design and Art, Changsha University of Science and Technology, Changsha 410114, China
2 Yibin Institute of Urban and Rural Planning Research, Yibin 644000, China
3 Queen’s Business School, Queen’s University Belfast, Belfast BT9 5EE, UK
4 Lingnan College, Sun Yat-sen University, Guangzhou 510275, China
5 School of Natural and Built Environment, Queen’s University Belfast, Belfast BT7 1NN, UK
* Author to whom correspondence should be addressed.
Sustainability 2024, 16(17), 7861; https://doi.org/10.3390/su16177861
Submission received: 9 April 2024 / Revised: 27 August 2024 / Accepted: 4 September 2024 / Published: 9 September 2024

Abstract

In China, approximately 20% of the permanent population are renters, with 91% of leased land concentrated in first-tier and new first-tier cities. Education and healthcare are primary concerns for residents, significantly influencing rental decisions due to the household registration (hukou) system, competitive educational environment, and uneven distribution of medical resources. This study explores the distinct factors affecting rental decisions in China’s super cities, differing from other countries where renters prioritize proximity to work or urban amenities. Using advanced interpretable machine learning techniques, the study analyses rental markets in Beijing, Shanghai, and Shenzhen. The random forest model demonstrates superior performance in rent prediction across all three cities. The results indicate that the impact of public service resources on rent is notably higher in Beijing and Shanghai, while in Shenzhen, balanced urban planning results in property characteristics being more prominent in tenant preferences. These findings enhance the understanding of global rental market dynamics and provide recommendations for promoting sustainable rental housing development. The scientific novelty of this study lies in its application of advanced machine learning models to identify and quantify the unique influences of public service resources on rental markets in different urban contexts.


1. Introduction

The market system of the modern economy in the context of globalisation is composed of the real and the virtual economy. As one of the representatives of the former, the real estate industry has long held an influential position, supporting the development of the national financial system. China, as the second largest economy, has long sought to move from rapid one-off expansion towards steady and sustainable prosperity. Housing prices are rising rapidly in most major cities in China, so affordability has become a central concern for society. The continued rise in property prices is partly due to the regulation of housing land supply by local authorities [1]. Since 2004, the Chinese government has completely controlled the land market, and the supply of residential land has since declined, especially in the first-tier cities. As government intervention in the land market increased, the elasticity of housing supply decreased significantly [2]. Local authorities rely heavily on revenue from land transfers. This leads to a consensus that property and land prices cannot fall much, further boosting confidence in property prices. Additionally, the expansion of cities could be due to local governments switching from promoting industrial growth to ‘urbanisation’ in the face of lower tax revenues from the industrial sector and higher revenues from urban development and land transfer [3]. With the continued large influx of people without household registration into the developed regions, the limited resources in the central urban area of each super city mean that inner-city infrastructure cannot fully meet the demands of both native citizens and migrant residents [4].
In this controversial atmosphere, renting has become the first choice for a considerable number of people, and location, education, and medical care have become the most important, but not the only factors influencing rental prices [5].
In China, the State Council classifies a city with a permanent population of more than 10 million as a super city. With such a population base, Beijing, Shanghai, and Shenzhen are the economic cores of the Beijing–Tianjin–Hebei, Yangtze River Delta, and Guangdong–Hong Kong–Macao Greater Bay Area urban agglomerations, respectively. With a strong economic upswing, employment opportunities also increase. Unlike many countries where renters are primarily young adults without children and are less concerned with school quality, in China, the household registration system (hukou) plays a pivotal role. This system ties access to public services such as education and healthcare to residents’ registered location, making these factors critical in rental decisions [6,7]. Despite the inconvenience of the hukou system, the number of rural residents who have moved to cities for economic reasons since the reform and opening-up has increased sharply and currently accounts for about 20% of China’s total population [8]. This unique context significantly influences the rental market, as access to quality education and healthcare can vary greatly between districts within the same city, thus affecting rental prices [5].
In major Chinese cities like Beijing, Shanghai, and Shenzhen, where the hukou system determines access to local services, proximity to high-quality schools and healthcare facilities significantly influences rental prices. Families prioritize renting homes within catchment areas of reputable schools to secure better educational opportunities for their children. Similarly, access to advanced medical services in nearby hospitals or clinics is a key determinant for tenants concerned about healthcare quality and convenience [9,10]. The hukou system exacerbates these disparities, as residents with an urban hukou generally have easier access to superior public services compared to those with a rural hukou, despite residing in the same urban areas. This disparity not only affects individual households’ rental decisions but also shapes broader urban planning strategies and social welfare policies. Urban development and infrastructure investments are often influenced by the distribution and quality of public services, reflecting the government’s efforts to balance socio-economic disparities and promote equitable access to resources across different city districts [11,12].
This study applies five machine learning models—multi-linear regression (MLR), artificial neural network (ANN), support vector regression (SVR), random forest (RF) regression, and eXtreme gradient boosting (XGBoost)—to model rental prices in three Chinese super cities. The final choice of the RF model was based on its ability to handle large volumes of data and provide reliable predictions despite the presence of high-dimensional variables. Additionally, interpretable machine learning techniques are employed to identify the key factors influencing rental prices. In particular, this study investigates the influence of education and medical resources on rental prices. This study aims to provide insights into how these factors shape rental prices and inform policy recommendations for urban planning. Additionally, this framework is poised to be applicable to other global cities, particularly megacities and those undergoing rapid development. This broader application highlights the significance of understanding rental price dynamics beyond China’s internal policies, illustrating how these insights can inform urban governance and housing policy frameworks worldwide.
While the impact of public services on rental prices has been studied in other markets, this specific impact on the Chinese real estate market has not been sufficiently explored. Therefore, the research tasks of this study can be summarised as follows: (1) How accurately can five different machine learning models (MLR, ANN, SVR, RF, and XGBoost) predict rental prices in the three major Chinese super cities: Beijing, Shanghai, and Shenzhen? (2) What are the key factors, particularly the impact of public services including education and healthcare, that influence rental prices in these cities, as identified through interpretable machine learning techniques? (3) How can actionable policy recommendations be formulated for local governments to address the unequal distribution of public services in rental markets, and how can insights from Chinese super cities be adapted to improve urban governance and housing policies in rapidly urbanizing cities worldwide? The rest of this paper is structured as follows: Section 2 reviews the literature. Section 3 describes the data and variables used in this study. Section 4 explains the research methodology. Section 5 discusses the empirical results. Finally, Section 6 summarises the whole paper and makes appropriate recommendations.

2. Literature Review

2.1. Theory in Housing Markets

Hedonic pricing theory, a classical framework [13], suggests that buyers value intrinsic product characteristics over the product itself, influencing their willingness to pay. This theory contends that the combination of these attributes shapes customers’ perception of a product’s economic value [14]. As a typical heterogeneous product, housing follows the hedonic pricing pattern in both the sales and rental markets. In China, prior to 2017, tenants mainly considered the location of the property and accessibility to public transport in their rental decisions [15]. Most tenants at that time tended to sign short-term leases because they feared the instability caused by rapid rent fluctuations and the lack of a sense of belonging due to unequal rights vis-à-vis the landlord [16]. Since the 2017 report of the 19th National Congress of the Communist Party of China emphasized the notion that “houses are for living, not speculation”, a series of policies promoting equality between renting and buying have been introduced. Consequently, the factors influencing a tenant’s decision to rent have become more diverse. The new generation prioritizes property quality and proximity to public transport and amenities when making rental decisions. They are inclined towards long-term leases due to the promotion of Tenant–Owner Right Equality (2017), which ensures equal access to basic public services such as schooling, employment, social security, and medical care for families living in rental housing [17]. Thus, the rights granted to tenants when they move into the property have become part of the commodity and have acquired an implicit economic value that is reflected in the rental price.
The decision-making process for pricing is usually based on hedonic pricing theory and contains an interaction of supply and demand, which is influenced by both macro and micro factors. At the macro level, rental prices are strongly influenced by socio-economic variables such as tenant demographics and employment opportunities, which together determine a reference price [18]. Based on this reference price, the rent is further adjusted according to the micro characteristics of the property, such as its structure, location, and neighbourhood. Therefore, rental prices are likely to be higher if the property has a micro-level advantage over other offers and lower if the property has a micro-level disadvantage [17]. In most existing studies on housing prices, factors affecting rental prices are classified into three main categories: (1) architectural characteristics; (2) neighbourhood characteristics; and (3) location characteristics [19]. As urban rental markets are not spatially homogeneous, they can be subdivided by geographic area into numerous submarkets comprising rental housing within the same administrative district.
In many countries, rental markets are influenced by different factors compared to China. For instance, in Western countries, proximity to employment centres and urban amenities often outweighs the importance of educational facilities for renters [18]. In contrast, in China, the hukou system and the associated access to public services such as education and healthcare significantly impact rental decisions [17]. This study aims to fill this gap by providing insights into the Chinese rental market, which diverges from international trends.
Beijing, Shanghai, and Shenzhen are among the four cities with the strongest economies in mainland China. The comprehensive strength and competitiveness of these cities are among the highest in the country. They have strong economic foundations, considerable political resources, influence that extends to many surrounding provinces, strong educational resources, a profound culture, and extremely convenient transport links. The rental market has been widely studied as an important economic sector in these super cities. One study argues that traditional hedonic pricing models may overlook residents’ willingness to pay for school quality. It analyses Beijing’s primary school zones and housing values and finds that housing prices are sensitive to school quality but that willingness to pay does not increase [20]. Another suggests that high rental prices are concentrated in Shanghai’s city centre, influenced by regional-level factors, highlighting inequalities in housing affordability and social mobility [21].

2.2. Machine Learning in Housing Market Prediction

The application of machine learning in the housing market has garnered significant attention in recent years due to its ability to handle large datasets and complex interactions among variables [22]. Traditional statistical models, like hedonic pricing, while effective in some scenarios, often assume linearity and fail to capture the complex, non-linear relationships between the various factors influencing rental prices [19]. As a result, machine learning models, such as random forest (RF) and extreme gradient boosting (XGBoost) models, have been increasingly employed to improve the predictive accuracy of housing market analyses [23,24]. These models excel at managing large datasets, dealing with missing data, and understanding variable importance, making them particularly suitable for predicting housing prices and rental prices in rapidly growing urban markets like China.
This study employs five machine learning models—MLR, ANN, SVR, RF, and XGBoost—to predict rental prices in three Chinese super cities. The inclusion of interpretable machine learning techniques, particularly SHAP, allows us to unpack the decision-making process of the models. Unlike traditional “black box” models, interpretable machine learning can reveal how key features, such as proximity to educational and healthcare facilities, influence rent predictions. This ability to deconstruct model outputs and feature interactions is critical for understanding the unique dynamics of China’s rental markets [25].
The superior predictive power of machine learning techniques in the rental market has been widely demonstrated. Lorenz et al. [23] investigate and validate the analytical capabilities of interpretable machine learning in the German real estate industry. Waddell and Besharati-Zadeh [26] employ two methods, RF and multiple least squares regression methods, to predict rental prices in the San Francisco Bay Area. The results suggest that although the predictive accuracy of the RF model is much higher, useful predictions can be made with both models by using almost exclusively local accessibility features. Embaye et al. [27] point out that machine learning models can provide better rent predictions for the three countries Tanzania, Uganda, and Malawi than the ordinary least squares method. Yoshida et al. [24] highlight that as the sample size increases, the XGBoost and RF models can provide higher accuracy than nearest neighbour Gaussian processes in predicting apartment rental prices in Japan. The superior performance of machine learning techniques in the housing market motivates us to use this method in this study.
This study contributes to the literature in the following ways: First, most of the existing articles focus on one city when analysing the real estate market. The research objects of this study include three Chinese super cities, and the similarities and differences between them are examined. Therefore, the research findings are conducive to mutual learning between these super cities. Second, by comparing the RF model with simple hedonic regression, the effect of interactions between features on rent is highlighted, which helps governments to implement public resource allocation strategies in a more optimal way. Finally, this study validates the applicability and effectiveness of machine learning technology in the housing rental market, uncovering the decision-making processes through advanced interpretable machine learning techniques. The research framework is designed to be transferable to cities worldwide, offering insights for policy recommendations.

3. Data and Methodology

The aim of this study is to identify and compare the factors that influence housing rental prices in three Chinese super cities (Beijing, Shanghai, and Shenzhen) using a three-stage approach. First, a set of statistical/machine learning regressors is used to fit the rental prices of these three cities separately. Second, an interpretable machine learning technique is applied to explore the complexity in the feature space of the best-fitting model from the first stage. Third, the result of the best-performing machine learning model (RF) is compared with the results of the multiple linear regression, and the superiority of the former is further demonstrated. This study’s methodology is not only applicable to Chinese cities but also offers a framework that can be adapted to analyse rental markets in other countries, providing a broader analytical tool for global rental market dynamics.

3.1. Data and Variables

The dataset used in this study includes information on rental properties obtained from three mainstream rental websites in China: www.fang.com, www.58.com, and www.Ganji.com from December 2020 to January 2021. This time frame was selected for several reasons. First, it represents a period of relative market stability in the Chinese housing market, where there were no significant policy changes or external shocks that could have affected rental prices drastically. Second, the data collection during this period ensures that seasonal effects, which could skew rental prices (e.g., major holidays or fiscal year-end influences), were minimized. Finally, the two-month window provided sufficient data points to conduct a robust analysis while maintaining data quality and relevance for the study’s objectives.
This study utilizes expected rental prices instead of actual prices to provide a more reliable measure of how public resources influence rental prices in major Chinese cities. Expected prices help smooth out short-term market fluctuations, giving us a clearer view of long-term trends and regional impacts. Moreover, using expected rental prices facilitates meaningful comparisons between cities, aiding in the exploration of effective urban planning strategies and policy implications related to housing and public service provision. In addition to the basic information provided by the rental websites about the properties, information including latitude, longitude, and surrounding facilities is also collected from the Baidu Map Application Programming Interface. The full dataset is divided into three subsets according to the city where the property is located, with rental price (rent) as the response variable and a total of 27 variables as independent variables, mainly reflecting property information in the following four respects: (1) location and physical factors; (2) provider information; (3) furnishings and facilities; and (4) nearby amenities. The detailed data dictionary is provided in Table A1.
During the pre-processing of the data, observations with missing values are removed to avoid biased predictions. Subsequently, the boxplot method is used to check the data quality for both the response variable and the predictors, and all observations with outliers are removed. After screening, a total of 11,827 observations in Beijing (2454), Shanghai (3823), and Shenzhen (5550) are used in this study. The descriptive statistics of all numerical variables for the rental properties in three cities are summarised in Table A2. Table A2 shows that the average rent in Beijing is higher than in Shanghai, which in turn is higher than in Shenzhen. This is consistent with the Summary of China’s Housing Rental Market in 2020 published by China Economic Net. Among these three cities, rental properties in Beijing have the largest average size, are closest to both educational institutions and medical/healthcare facilities and have the most medical/healthcare facilities within 3 km of the property. Rental properties in Shenzhen have the highest floors of residential buildings. Rental properties in Shanghai have the most educational institutions in the area. Then, the descriptive statistics of all categorical variables except ‘district’ are summarised in Table A3 (‘district’ is not statistically compared because this information is unique for each city. See Table A4 for the codes of the districts in the three cities). Table A3 highlights that on average, Shanghai has the most south-facing rental properties. Rental properties in Shenzhen have the most rooms, bathrooms, and common areas, are located on the highest level of floors, and are the most privately listed. A larger proportion of rental properties in Beijing have balconies. 
In terms of furnishings and facilities, Beijing rentals are generally better equipped in this regard (leading on six factors), followed by Shanghai (leading on four factors), while Shenzhen only excels in AC (air conditioning), which could be because the temperature in Shenzhen is usually the highest among the three cities.
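The boxplot screening described in the pre-processing step can be sketched as follows. This is a minimal illustration only: the text does not state the whisker rule, so the conventional 1.5 × IQR limits are assumed, and the listings, the field names `rent` and `sqmt`, and the helper names are hypothetical rather than taken from the study's data.

```python
import statistics

def boxplot_bounds(values, k=1.5):
    """Return (lower, upper) whisker limits using the k * IQR rule (assumed)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def drop_outliers(rows, columns, k=1.5):
    """Keep only rows whose value in every listed column lies inside the whiskers."""
    bounds = {c: boxplot_bounds([r[c] for r in rows], k) for c in columns}
    return [r for r in rows
            if all(bounds[c][0] <= r[c] <= bounds[c][1] for c in columns)]

# Hypothetical listings: monthly rent (yuan) and size (square metres).
listings = [
    {"rent": 4200, "sqmt": 45},
    {"rent": 5100, "sqmt": 60},
    {"rent": 4800, "sqmt": 55},
    {"rent": 5600, "sqmt": 70},
    {"rent": 6100, "sqmt": 80},
    {"rent": 52000, "sqmt": 65},  # implausibly high rent
]
clean = drop_outliers(listings, ["rent", "sqmt"])
```

Here the whisker limits are computed once on the full sample and then applied to every column, so a row is dropped as soon as any one of its values falls outside its column's limits.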

3.2. Methodology

In this study, the data for each city are randomly split into a training set and a test set in a 25/75 ratio to build predictive models. Five regression algorithms, namely multi-linear regression, ANN, SVR, RF, and XGBoost, are used to fit the models. The RF model is selected for in-depth analysis as the model with the best fit.
Among numerous machine learning techniques, the RF technique is one of the most popular, most powerful, and most accurate in data mining and analysis [28]. The RF methodology randomizes variables and data to generate multiple decision trees (the forest), with the output determined by combining the outputs of all the individual decision trees. Breiman [29] pioneered the combining of decision trees into an RF model. A decision tree is a predictive model that represents a mapping between object attributes and object values. Learning a decision tree generally involves three main steps: feature selection, decision tree generation, and pruning of the decision tree [30]. The underlying principle of RF, which derives from the ensemble-learning framework, is to randomize the use of variables (columns) and data (rows) to generate many decision trees and then summarize the results of these trees. Several advantages of the RF approach have been reported: it improves prediction accuracy without a significant increase in computational burden; it is insensitive to multicollinearity, with results that are more robust to missing data and unbalanced data; it can assess the role of up to several thousand explanatory variables; and it can evaluate the importance of variables when deciding the category. In model construction, the RF approach can internally produce an unbiased estimate of the generalization error. It can also detect variable interactions, and its learning speed is considered to be good [31,32,33]. There are, however, some limitations.
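The principle of randomizing rows and columns and then summarizing many trees can be illustrated with a deliberately small sketch. It is not the implementation used in this study: each "tree" is a depth-one regression stump, the feature subset is drawn once per tree rather than per split, and the data, function names, and parameter values are hypothetical.

```python
import random
from statistics import fmean

def fit_stump(rows, targets, feature_ids):
    """Fit a depth-one regression tree on the given feature columns."""
    best = None
    for f in feature_ids:
        # Candidate thresholds: every observed value except the largest,
        # so both sides of the split are non-empty.
        for thr in sorted({r[f] for r in rows})[:-1]:
            left = [t for r, t in zip(rows, targets) if r[f] <= thr]
            right = [t for r, t in zip(rows, targets) if r[f] > thr]
            lm, rm = fmean(left), fmean(right)
            sse = (sum((t - lm) ** 2 for t in left)
                   + sum((t - rm) ** 2 for t in right))
            if best is None or sse < best[0]:
                best = (sse, f, thr, lm, rm)
    if best is None:
        # No split possible (sampled feature is constant): predict the mean.
        mean = fmean(targets)
        return feature_ids[0], float("inf"), mean, mean
    return best[1:]

def fit_forest(rows, targets, n_trees=50, n_features=1, seed=0):
    """Bagging: each stump sees a bootstrap sample of rows and a random column subset."""
    rng = random.Random(seed)
    n, m = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # rows, drawn with replacement
        feats = rng.sample(range(m), n_features)     # columns, drawn without replacement
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Summarize the trees by averaging their individual predictions."""
    return fmean(lm if row[f] <= thr else rm for f, thr, lm, rm in forest)

# Hypothetical data: (size in square metres, number of nearby schools) -> rent.
rows = [(40, 2), (50, 3), (60, 5), (70, 6), (80, 8), (90, 10)]
rents = [3000, 3800, 4200, 5200, 5400, 6500]
forest = fit_forest(rows, rents, n_trees=50, n_features=1, seed=0)
small = predict(forest, (40, 2))
large = predict(forest, (90, 10))
```

Averaging over trees is what makes the ensemble less sensitive to any single bootstrap sample. A full random forest would grow deeper trees and re-draw the candidate features at every split.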
Then, the SHapley Additive exPlanations (SHAP) technique is used to explain the best-fitting model. The main idea behind the SHAP technique is the Shapley value, a method from coalition game theory developed by Shapley in 1953. The Shapley value calculates the marginal contribution of attributes to the model output and then interprets the machine learning model at both global and local levels. The SHAP technique is used to construct an additive explanatory model that treats all attributes as contributing to the prediction. In this process, the importance of feature i for the output of model y can be evaluated as follows:
$$\phi_i(y) = \sum_{S \subseteq \{x_1, \ldots, x_m\} \setminus \{x_i\}} \frac{|S|!\,(m - |S| - 1)!}{m!} \left[ y(S \cup \{x_i\}) - y(S) \right] \qquad (1)$$
where $x_i$ represents feature $i$, $S$ stands for a subset of features, $|S|$ is the number of features in $S$, and $m$ is the number of all features [34]. In Equation (1), the exclamation mark (!) denotes a factorial, which is the product of all positive integers less than or equal to a given number. For example, $|S|!$ is the factorial of the size of the subset $S$, and $m!$ is the factorial of $m$, the total number of features. The SHAP interaction effect is the additional combined feature effect after taking into account the individual feature effects. The SHAP interaction effect of features $i$ and $j$ is defined as follows:
$$\phi_{i,j}(y) = \sum_{S \subseteq \{x_1, \ldots, x_m\} \setminus \{x_i, x_j\}} \frac{|S|!\,(m - |S| - 2)!}{2\,(m - 1)!}\, \delta_{ij}(S) \qquad (2)$$
where $i \neq j$ and $\delta_{ij}(S) = \hat{y}(x_{S \cup \{i,j\}}) - \hat{y}(x_{S \cup \{i\}}) - \hat{y}(x_{S \cup \{j\}}) + \hat{y}(x_S)$ [25].
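The Shapley value and the SHAP interaction effect defined above can be computed exactly by enumerating subsets when the number of features is small. The sketch below does this for a toy value function. The rent rule in `toy_value`, the feature names, and the numbers are hypothetical and serve only to show how the weights and marginal contributions combine; practical SHAP implementations for tree models use faster algorithms than brute-force enumeration.

```python
from itertools import combinations
from math import factorial

def shapley_value(value, features, i):
    """Weighted average of the marginal contribution of feature i over all subsets."""
    others = [f for f in features if f != i]
    m = len(features)
    total = 0.0
    for size in range(len(others) + 1):
        weight = factorial(size) * factorial(m - size - 1) / factorial(m)
        for subset in combinations(others, size):
            s = frozenset(subset)
            total += weight * (value(s | {i}) - value(s))
    return total

def shapley_interaction(value, features, i, j):
    """SHAP interaction effect of two distinct features i and j."""
    others = [f for f in features if f not in (i, j)]
    m = len(features)
    total = 0.0
    for size in range(len(others) + 1):
        weight = factorial(size) * factorial(m - size - 2) / (2 * factorial(m - 1))
        for subset in combinations(others, size):
            s = frozenset(subset)
            delta = value(s | {i, j}) - value(s | {i}) - value(s | {j}) + value(s)
            total += weight * delta
    return total

def toy_value(subset):
    """Hypothetical rent (yuan) predicted when only the features in `subset` are known."""
    rent = 3000
    rent += 1500 if "sqmt" in subset else 0
    rent += 400 if "num_edus" in subset else 0
    rent += 200 if "num_meds" in subset else 0
    # Extra premium when size and school count are both known (an interaction).
    rent += 600 if {"sqmt", "num_edus"} <= subset else 0
    return rent

features = ["sqmt", "num_edus", "num_meds"]
phi = {f: shapley_value(toy_value, features, f) for f in features}
phi_pair = shapley_interaction(toy_value, features, "sqmt", "num_edus")
```

In this toy case the 600-yuan joint premium is shared equally between the two interacting features, so `sqmt` receives 1500 + 300 and `num_edus` receives 400 + 300, and the three Shapley values sum to the difference between the full prediction and the baseline (5700 − 3000). The interaction effect of the ordered pair is 300, half of the joint premium, because the definition assigns the other half to the reversed pair.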
Standard feature importance bar graphs give an idea of the relative importance of features, but they reflect neither the range and distribution of a feature’s influence on the model’s output nor the relationship between the feature’s value and its influence. By using individualized feature attributions, the SHAP summary plot illustrates all these aspects of feature importance without compromising visual simplicity.
The SHAP partial dependence plot is used in this study to explain how the independent features individually influence the dependent variable. It illustrates the predicted output of a model given the value of an individual feature. The values of this feature are varied and the predicted output can be plotted. Therefore, the way the predicted output of the model changes when the value of a feature changes can help demonstrate how the output depends on that feature. Additionally, the SHAP dependence plot is combined with SHAP interaction values to illustrate global interaction patterns [35].
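The idea of varying one feature and recording the model's predicted output can be sketched in a few lines. This illustrates a plain partial dependence curve under stated assumptions, not the SHAP plotting routine itself: `toy_predict` is a hypothetical stand-in for a fitted model, and the rows and grid values are invented.

```python
from statistics import fmean

def partial_dependence(predict, rows, feature, grid):
    """Average model output when `feature` is forced to each value in `grid`."""
    curve = []
    for value in grid:
        modified = [{**row, feature: value} for row in rows]
        curve.append((value, fmean(predict(r) for r in modified)))
    return curve

def toy_predict(row):
    """Hypothetical rent rule: schools only add value for larger properties."""
    rent = 2000 + 40 * row["sqmt"]
    if row["sqmt"] > 70:
        rent += 100 * row["num_edus"]
    return rent

rows = [{"sqmt": 50, "num_edus": 3}, {"sqmt": 90, "num_edus": 5}]
curve = partial_dependence(toy_predict, rows, "num_edus", [0, 5, 10])
```

Because the curve averages over all rows, it shows the overall dependence on the feature but hides the fact that only the larger property responds to it; that is the kind of pattern the interaction values are meant to expose.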

4. Empirical Results

4.1. Study Areas

The study area of this paper comprises the three Chinese super cities of Beijing, Shanghai, and Shenzhen, which represent the economic centres of the Beijing–Tianjin–Hebei, Yangtze River Delta, and Guangdong–Hong Kong–Macao Greater Bay Area urban agglomerations, respectively. The district maps of these three super cities are shown in Figure 1. In Figure 1b, the area with * represents the central urban area in Shanghai, which refers to the core, densely populated region of the city, typically including key districts such as Huangpu, Jing’an, and Xuhui (listed in the left tab), where the majority of the city’s commercial, cultural, and political activities are concentrated.

4.2. Model Comparison

Since a main objective of this study is to identify the optimal models for predicting rental prices for three cities, including Beijing, Shanghai, and Shenzhen, Table 1, Table 2 and Table 3 below list all the performance metrics of the models described in Section 3.2 for comparison.
Based on the above model performance measures, the RF model appears to be the most efficient and accurate model for the rental markets of all three cities studied, with an R² of 84%, 85%, and 86% for the test sets, respectively. In addition, the XGBoost model is the second-best model for rent prediction with an R² of 81%, 84%, and 84% for the test sets, respectively. This may be because, as mainstream ensemble learning algorithms that use a set of weak learners to produce a strong learner, these two models have the following inherent adaptive advantages: (1) the risk of choosing incorrect hypotheses is reduced by taking an average of different hypotheses, which improves the overall performance of the prediction and effectively avoids overfitting; (2) the risk of obtaining local minima is reduced, which offers computational advantages; and (3) by combining different models, the search space of optimal hypotheses can be expanded to better capture the data space [36]. In particular, the RF model is insensitive to the parameters it is run with, and it is easy to identify which parameters to use. Overfitting is less problematic than with a single decision tree, and the arduous task of pruning the tree is not required. Moreover, it can work efficiently with large datasets and can handle very high-dimensional data without dimensionality reduction [37].
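For reference, the R² values quoted above measure the share of variance in test-set rents that a model explains. A minimal sketch of the calculation, using invented actual and predicted rents rather than the study's data:

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical test-set rents (yuan) and model predictions.
actual = [4000, 5000, 6000, 7000]
predicted = [4200, 4900, 6100, 6800]
score = r_squared(actual, predicted)
```

Here the residual sum of squares is 100,000 against a total of 5,000,000, giving an R² of 0.98.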

4.3. Feature Analysis

In this section, the SHAP technique is used to explain the decision-making process of the RF models constructed for the three super cities. The SHAP summary plot is used to determine the feature importance levels, and the SHAP interaction plot is used to examine the effect of the interactivity of features on rent. In particular, public services, including education and medical care, are mainly investigated.

4.3.1. Beijing

Figure 2 shows the SHAP summary plot for the developed model of rentals in Beijing. Sqmt (size) turns out to be the most influential variable on price, followed by four other important variables: Num_edus (number of educational institutions within 3 km of the property), latitude, Num_meds (number of medical/healthcare facilities within 3 km of the property), and longitude.
The result of Sqmt shows a positive correlation between the rental price and the size of the property, which is consistent with the basic realities of the Chinese property rental market [38]. The number of educational institutions is the second most important factor affecting housing rental prices. This is because Beijing has fully promoted the practice of enrolling students based on the proximity of residential addresses in public primary and junior high schools [39]. Additionally, families of senior high school students may choose to rent a property nearby so that they spend less time travelling to school. For higher education institutions, a large number of staff seeking affordable housing and students preferring to live off campus have created a huge demand, contributing to competition for nearby rental properties and the rise in prices. This indicates that the quality and accessibility of surrounding educational institutions can stimulate not only housing prices but also rental prices [40]. Finally, as for the number of medical facilities, Beijing is one of the cities with the best medical facilities in China. In 2015, the Beijing Municipal Bureau of Statistics reported that about 70% of hospitals in Beijing are located within 735 km² of central Beijing, while about 30% are located in the other 16,073 km² of the city [41]. This uneven distribution of medical resources has exacerbated the differentiation of rental prices in different districts of the city.
Figure 3 shows the SHAP interaction plots of latitude and longitude, Sqmt and number of educational institutions, Sqmt and number of medical facilities, and number of educational institutions and number of medical facilities. Two features, longitude and latitude, together define the geographical location of a property. According to Figure 3, rental prices are higher in areas within latitude (39.95, 40.00) and longitude (116.30, 116.42). This result corresponds to the four districts of Xicheng, Dongcheng, Chaoyang, and Haidian, which are all located in the central urban area of Beijing [42]. This suggests that the model developed in this study matches reality well.
The plot of Sqmt and number of educational institutions shows that for small properties (less than 70 square metres), properties of the same size can be rented at a higher price if there are fewer educational institutions within 3 kilometres. However, for large properties (more than 70 square metres), a larger number of educational institutions within a 3-kilometre radius leads to an increase in rental prices for properties of the same size. This could be because young migrants without children tend to rent small properties to save money. These tenants generally do not value educational facilities, and places that are far from educational facilities are rich in entertainment venues, shopping malls, etc., which increases their interest in renting there. Families with children in education need to rent larger properties. For these tenants, educational services are an important consideration, which helps explain why more extensive educational services are associated with higher rental prices.
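To make the idea of a SHAP interaction concrete, the following minimal Python sketch computes exact Shapley values and a pairwise Shapley interaction value by brute-force enumeration for a toy rent function. It is not the RF model used in this study: the function `toy_rent`, its coefficients, the baseline values, and the two example properties are all hypothetical, chosen only so that the effect of educational institutions changes sign at 70 square metres, as in the plot described above. Absent features are replaced with baseline values, which is one simple convention among several.

```python
from itertools import combinations
from math import factorial

# Hypothetical baseline ("average") property; values are illustrative only.
BASELINE = {"sqmt": 70.0, "num_edus": 50.0, "num_meds": 100.0}


def toy_rent(x):
    """Toy rent function (NOT the fitted RF model); coefficients are invented.

    The last term makes educational institutions raise rent above 70 sqm
    and lower it below 70 sqm.
    """
    return (60.0 * x["sqmt"] + 5.0 * x["num_meds"]
            + 0.2 * (x["sqmt"] - 70.0) * x["num_edus"])


def value(model, x, baseline, present):
    """Model output when only the features in `present` take their real values."""
    mixed = {k: (x[k] if k in present else baseline[k]) for k in baseline}
    return model(mixed)


def shapley_values(model, x, baseline):
    """Exact Shapley value of every feature, enumerating all subsets."""
    names = list(baseline)
    m = len(names)
    phi = {}
    for i in names:
        others = [n for n in names if n != i]
        total = 0.0
        for size in range(m):
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(model, x, baseline, s | {i})
                                   - value(model, x, baseline, s))
        phi[i] = total
    return phi


def shapley_interaction(model, x, baseline, i, j):
    """Pairwise Shapley interaction value for two distinct features i and j.

    Uses the convention in which the interaction is split equally between
    (i, j) and (j, i), hence the factor 2 in the denominator.
    """
    names = list(baseline)
    m = len(names)
    others = [n for n in names if n not in (i, j)]
    total = 0.0
    for size in range(m - 1):
        weight = factorial(size) * factorial(m - size - 2) / (2 * factorial(m - 1))
        for subset in combinations(others, size):
            s = set(subset)
            total += weight * (value(model, x, baseline, s | {i, j})
                               - value(model, x, baseline, s | {i})
                               - value(model, x, baseline, s | {j})
                               + value(model, x, baseline, s))
    return total


# Two hypothetical properties that differ only in size.
large_flat = {"sqmt": 100.0, "num_edus": 80.0, "num_meds": 120.0}
small_flat = {"sqmt": 50.0, "num_edus": 80.0, "num_meds": 120.0}

phi_large = shapley_values(toy_rent, large_flat, BASELINE)
phi_small = shapley_values(toy_rent, small_flat, BASELINE)
print(phi_large["num_edus"], phi_small["num_edus"])
print(shapley_interaction(toy_rent, large_flat, BASELINE, "sqmt", "num_edus"))
```

In this toy setting the Shapley value of `num_edus` is positive for the large flat and negative for the small one, and the Shapley values of each property sum to the difference between its predicted rent and the baseline prediction. Enumeration grows exponentially with the number of features, so it is only practical for illustrations of this size.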
The plot of Sqmt and number of medical facilities shows a similar trend to the plot of Sqmt and number of educational institutions, which could be because both education and medical services are concentrated in the central urban area of Beijing. From the above findings, it can be inferred that tenants living in smaller properties are mainly young Beijing drifters for whom convenience of work and daily life is the priority. Tenants of larger properties include not only families with children in education, but also many older and even elderly tenants. For these tenants, medical care is an important consideration, which drives up rental prices in places with better medical conditions.
Finally, the plot of the number of educational institutions and number of medical facilities shows the uneven distribution of public facilities in Beijing. There are a large number of areas with few educational and medical facilities, and rental prices in these areas are relatively stable at a low level. Areas that are rich in educational facilities also tend to have better medical facilities. This may be because the capital was built under the earliest urban infrastructure planning in China. However, at the development stage of the 1950s, it was difficult to anticipate such rapid urban expansion. Over time, resources for education and medicine in Beijing have become unevenly distributed among the districts. The design and development of the central city areas are already mature, and educational and medical facilities in districts such as Chaoyang, Haidian, Dongcheng, and Xicheng are already very complete, so rental prices in these areas can be extremely high. In the outer districts, the quality and quantity of public facilities are still in great need of improvement compared to the central districts, which has largely contributed to the generally lower rental prices in these areas.

4.3.2. Shanghai

Figure 4 illustrates the SHAP summary plot for the developed model of rental prices in Shanghai. In Shanghai, the national economic centre, the five most important features are the same as in Beijing. Like Beijing, Shanghai has fully implemented the policy of enrolling students by school district. Combined with the high demand in the housing market around senior high schools and higher educational institutions, educational services have a significant impact on the rental housing market in Shanghai. Medical facilities are also an important criterion for tenants in Shanghai. Location is decisive for rental prices in Shanghai. According to the SHAP interaction plots (see Figure 5), rental prices are higher in areas within latitude (31.15, 31.25) and longitude (121.45, 121.80). This corresponds to the central districts of Shanghai such as Xuhui, Jing’an, and Pudong. This is consistent with the real rental housing market in Shanghai and is further evidence of the superior performance of the developed model.
The other plots in Figure 5 are broadly similar to the plots for Beijing. However, the overall distribution of resources for education and medical care is more even than in Beijing, although the imbalance remains. This may be related to the fact that Shanghai’s urban planning, which followed the Western modernisation model, was designed by local governments in the 1920s. Since then, with the common prosperity of the economy in the Yangtze River Delta region, the peripheral and central areas of Shanghai have formed an efficient synergy in the vertical integration of the industrial chain, and many industries and services are closely linked, with high-frequency interactions between the different regions. This reduces the development gap between the districts.

4.3.3. Shenzhen

Figure 6 shows the SHAP summary plot for the developed model of rental prices in Shenzhen. The importance of the features decreases in the following order: Sqmt, latitude, floors (total number of floors), longitude, and room (number of rooms). According to the SHAP interaction plots (see Figure 7), rental prices are higher in areas within latitude (22.50, 22.58) and longitude (113.95, 114.10). The results demonstrate that housing rental prices in Nanshan, Futian, and Luohu are higher than in other districts, which is consistent with other empirical studies [42]. This finding may be explained by the fact that the Nanshan, Futian, and Luohu districts were developed at a very early stage in Shenzhen’s history and belong to the central area of the city, where a higher density of recreational, financial, commercial, and educational facilities and businesses can be found.
It is worth noting that floors and rooms are among the top five variables only in Shenzhen. This might suggest that tenants in Shenzhen value the comfort of the property itself more than the surrounding public facilities compared to tenants in Beijing and Shanghai. This may be because Shenzhen is the youngest city in China, with an average age of the resident population of only 33. Most of the tenants are young migrant workers who have no children and are not overly dependent on medical services but are more concerned with their own comfort in life. As for floors, buildings with more floors in total are associated with higher residential rental prices. This could be because residential buildings with more floors tend to be concentrated in the more central and denser areas of the city, such as Nanshan, Futian, and Luohu, where housing commands a higher value. In terms of the number of rooms, a larger number of rooms is associated with lower residential rental prices. A plausible explanation is that renting a house with fewer rooms (e.g., a one-bedroom flat) usually indicates a more independent and private use of the house and avoids sharing spaces with other tenants, resulting in higher residential rental prices.
The trends of the other plots in Figure 7 are similar to those of Beijing and Shanghai. The interaction plot of the number of educational institutions and number of medical facilities shows a relatively clear linear trend despite some noise. It is clear that properties with better access to educational facilities are always associated with better access to medical facilities, which has led to higher housing rental prices, and vice versa. Of the three super cities investigated, Shenzhen has the most even spatial distribution of educational and medical resources. This could be due to the fact that Shenzhen is a city with a relatively short history. Only since its establishment as a special economic zone in 1980 has it undergone unprecedented urban growth and residential development, so social and environmental inequalities between different areas are likely to be less pronounced [5].

4.4. Comparison with Simple Hedonic Regression

A multiple linear regression model based on hedonic theory is run and the results are shown in Table 4. It is worth noting that rent, number of educational institutions, and number of medical facilities are log-transformed to obtain a better distribution. Room, Common_area, Sqmt, floors, AC, number of educational institutions, and distance to nearest medical facility are significant in all three cities, suggesting that more rooms, more common areas, larger property sizes, higher total floors, air conditioners, more educational institutions within 3 km of the property, and a greater distance to the nearest medical/healthcare facilities tend to lead to higher rental prices. Of the three location-determining variables, district, longitude, and latitude, at least two are significant in each city, showing that location is always an important factor in determining rental prices. Orientation is shown to be important in Shanghai and Shenzhen. Bathroom and number of medical facilities are positively significant in Beijing and Shanghai, suggesting that more bathrooms and more medical/healthcare facilities within 3 km of the property tend to lead to higher rental prices in these two cities. Agency and wardrobe only play a role in Shanghai, indicating that properties listed by individuals and properties with a wardrobe command higher rental prices there. Sofas, TVs, and broadband are positively significant in Shanghai and Shenzhen, suggesting that having these three items increases rental prices. Fridges, heating, and distance to nearest educational institution are important in Beijing and Shenzhen, suggesting that tenants in these two cities place more value on these three variables. Moreover, a greater distance to the nearest educational institution tends to lead to higher rental prices in Beijing, while the situation is exactly the opposite in Shanghai. Finally, laundry and hot water are significant in Shenzhen and beds are significant in Beijing, showing the attractiveness of these variables for tenants in the respective cities.
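As a reading aid, the log-transformed hedonic regression described above can be written in the following general form. This is a sketch under stated assumptions rather than the exact specification behind Table 4: the symbols $\beta_e$, $\beta_m$, $\gamma_k$, and $x_{ik}$ are notation introduced here, and details such as the coding of categorical variables are not given in this section.

```latex
\ln(\mathrm{Rent}_i) = \beta_0
  + \beta_e \ln(\mathrm{Num\_edus}_i)
  + \beta_m \ln(\mathrm{Num\_meds}_i)
  + \sum_{k} \gamma_k x_{ik}
  + \varepsilon_i
```

Here $x_{ik}$ stands for the remaining, untransformed characteristics of property $i$ (for example Sqmt, Room, or Dis_med). Under this form, $\beta_e$ and $\beta_m$ are elasticities: a 1% increase in the number of facilities within 3 km is associated with a change of approximately $\beta_e$% or $\beta_m$% in rent. For an untransformed variable, a one-unit increase in $x_k$ is associated with a change in rent of $100\,(e^{\gamma_k} - 1)$%, which is close to $100\,\gamma_k$% when $\gamma_k$ is small.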
The above findings are essentially consistent with the results of the machine learning model. The regression model nevertheless has some weaknesses. For example, the multiple linear regression model suggests that in Beijing, higher latitude leads to higher rental prices. However, this is not the case: as the results of the RF model show, latitude is associated with higher rental prices only in a relatively high range (39.95, 40.00). Similarly, the regression results for distance to nearest educational institution and distance to nearest medical facility in Beijing suggest that a greater distance to the nearest public facilities tends to lead to higher rental prices. The RF model, however, shows that the relationship between distance to public facilities and rental prices is not monotonically increasing: a moderate distance is associated with higher rental prices, and rental prices are lower when the property is too close or too far away. This shows that machine learning models can describe the relationship between features and response variables better than traditional models. This is because, compared to traditional statistical models, machine learning models do not assume that samples conform to a particular distribution and can fit non-linear relationships without requiring the possible high correlation between features to be taken into account. These properties are more in line with reality [43]. However, one thing is certain: regardless of the model, the number of public facilities within a three-kilometre radius of the property shows a strong positive correlation with rental prices in all three cities, which points to the importance of public facilities for residents.
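The contrast drawn above between a single linear coefficient and a non-monotonic relationship can be illustrated with a small synthetic example. The sketch below uses invented data (an inverted-U relationship between distance and rent, peaking at a hypothetical 2000 m) and compares an ordinary least squares line with simple binned means. The binned means are only a crude stand-in for a flexible learner and are not the RF model used in this study; the function names and all numbers are assumptions made for illustration.

```python
def ols_slope_intercept(xs, ys):
    """Closed-form simple linear regression of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def binned_means(xs, ys, edges):
    """Mean of y within each half-open bin (edges[k], edges[k + 1]]."""
    means = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = [y for x, y in zip(xs, ys) if lo < x <= hi]
        means.append(sum(in_bin) / len(in_bin))
    return means


# Synthetic data: distance to the nearest facility (m) and monthly rent (CNY).
# Rent peaks at 2000 m and falls off on both sides (hypothetical shape).
distances = [100.0 * k for k in range(1, 31)]  # 100 m, 200 m, ..., 3000 m
rents = [7000.0 - 0.001 * (d - 2000.0) ** 2 for d in distances]

slope, intercept = ols_slope_intercept(distances, rents)
means = binned_means(distances, rents, [0.0, 1000.0, 2000.0, 3000.0])
print(round(slope, 3))
print([round(v, 1) for v in means])
```

On this synthetic data the fitted slope is positive, which on its own would suggest that rent always rises with distance, whereas the binned means rise from the first bin to the second and then fall in the third.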

5. Discussion of Findings

5.1. Model Accuracy in Predicting Rental Prices

This study evaluated five machine learning models: MLR, ANN, SVR, RF, and XGBoost. The results show that the RF model consistently outperformed the other models in predicting rental prices across the three cities, achieving the highest predictive accuracy with R² values of 84%, 85%, and 86% for Beijing, Shanghai, and Shenzhen, respectively. The XGBoost model also performed well, but its accuracy was slightly lower than that of the RF model. These findings are consistent with previous studies [26,27], where the RF model was found to be superior in handling large datasets and detecting complex interactions among variables.
The superior performance of the RF model is likely due to its ability to handle a large number of variables and provide insights into variable importance without overfitting, which is particularly useful in the context of rapidly growing urban areas such as Beijing, Shanghai, and Shenzhen. The overall accuracy of the models demonstrates the effectiveness of machine learning techniques in rental price prediction, especially in cities where housing markets are influenced by numerous, often non-linear, factors.
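For readers less familiar with the accuracy metric reported above, the following short Python sketch shows how the coefficient of determination (R²) is computed from observed and predicted rents. The rent values are made up for illustration, and this section does not state how the evaluation data were split, so the sketch shows only the formula and not the study's evaluation procedure.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted monthly rents in CNY.
observed = [3000.0, 5000.0, 7000.0, 9000.0]
predicted = [3500.0, 4500.0, 7500.0, 8500.0]

score = r_squared(observed, predicted)
print(f"R^2 = {score:.2f}")
```

An R² of 1 means the predictions match the observed rents exactly, while a model that always predicts the mean rent scores 0.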

5.2. Key Factors Influencing Rental Prices

Interpretable machine learning techniques, specifically the SHAP technique, allowed this study to uncover the key factors influencing rental prices in the three super cities. In all three cities, property size (Sqmt), latitude, and longitude were consistently among the most significant predictors of rental prices. In Beijing and Shanghai, these were followed by the number of educational institutions (Num_edus) and medical facilities (Num_meds) within a 3 km radius, whereas in Shenzhen the remaining top-five features were the total number of floors and the number of rooms.
In Beijing and Shanghai, public services such as access to high-quality educational and medical facilities played a crucial role in determining rental prices. In contrast, in Shenzhen, where public services are more evenly distributed, tenants placed greater importance on the property’s physical characteristics, such as the number of rooms and floors. This aligns with the city’s younger demographic and more balanced public service infrastructure. The findings highlight the importance of public services in shaping rental markets, especially in cities where access to these services is unevenly distributed. They align with those of previous studies that emphasize the importance of public services, especially education and healthcare, in shaping housing markets. For example, Su et al. [17] note that access to high-quality educational facilities is a key driver of rental prices in China’s super cities. Similarly, Gu et al. [6] highlight the critical role of the hukou system in determining access to healthcare, which further influences housing demand. Miao and Phelps [44] highlight the importance of addressing public resource allocation in rapidly growing urban areas. Additionally, studies in Western contexts have demonstrated comparable findings regarding the influence of public services on housing markets. Waltert and Schläpfer [18] find that proximity to urban amenities, such as schools and hospitals, significantly affects property prices in developed countries. This underscores the global relevance of public service accessibility as a factor in housing affordability.
The results also reveal notable differences in how educational facilities influence rental prices across the three cities. For instance, in Beijing, the impact of proximity to educational institutions on rental prices is particularly strong. This may be due to Beijing’s concentration of highly ranked public schools and the competitive nature of the city’s education system, where proximity to these schools is highly valued by families seeking high-quality education for their children. In contrast, in Shenzhen, the influence of educational facilities on rental prices is less pronounced. This could reflect the relatively more balanced distribution of private and public schooling options, as well as Shenzhen’s status as a younger, rapidly developing city with a more mobile population. Government policies in Shenzhen may also play a role, as the city has been actively promoting education reforms and the establishment of new schools to meet the growing demand from its expanding population. Shanghai, on the other hand, presents a middle ground, where both public and private schools significantly affect rental prices. This could be due to the city’s mix of both local government education policies and a growing number of international schools catering to expatriates and high-income families. Such variations highlight how different socio-economic conditions and government policies across cities shape the rental market in unique ways.

5.3. Policy Implications and Global Application

The insights from this study offer several policy implications for local governments in Beijing, Shanghai, and Shenzhen. First, the uneven distribution of public services in Beijing and Shanghai suggests the need for more equitable resource allocation to prevent excessively high rental prices in certain areas. Local governments could consider redistributing public resources or developing long-term rental housing in high-cost areas to ensure housing affordability for a broader segment of the population. Second, the balanced distribution of public services in Shenzhen presents a potential model for other cities. By ensuring equitable access to education and healthcare, cities can reduce the premium placed on these services in rental markets, allowing tenants to prioritize housing characteristics over public service access. Finally, this study’s analytical framework can be applied to other global cities, particularly those undergoing rapid urbanization, to better understand how public services and housing characteristics interact to influence rental prices.

5.4. Comparison with Other Countries

While the study focuses on the Chinese rental market, it is valuable to compare these findings with those from other international markets, both in Asia and the West. In many Western countries, such as the United States and the United Kingdom, access to public services, particularly high-quality educational institutions, is similarly a key driver of housing prices and rental rates. For example, in the UK, properties within the catchment areas of top-rated schools tend to command significantly higher rental prices, reflecting a strong demand for access to prestigious educational institutions [45,46]. This trend is consistent with our findings in Beijing and Shanghai, where proximity to highly ranked public schools leads to increased rental prices. However, unlike in China, where the hukou system creates additional barriers to accessing public services, Western markets typically do not have such rigid residency requirements, leading to different dynamics in how public services influence rental demand [47].
In other Asian markets, such as Japan and South Korea, public services like education also play a crucial role in determining rental prices. Similar to China, these countries place a strong emphasis on education, and families often prioritize renting homes near top schools [48]. However, the distribution of public and private schooling options in these countries is more balanced than in China, potentially leading to less stark differences in rental prices based solely on proximity to educational facilities. Furthermore, the rental markets in Japan and South Korea are generally more regulated, which may mitigate the price premiums observed near key public services in Chinese cities like Beijing and Shanghai.
Therefore, in contrast to China’s rental market, which is deeply affected by its unique legal and planning frameworks, other countries display a range of regulatory and planning approaches that shape their rental dynamics in distinct ways. In the U.S., rental markets vary widely due to differing regulations across states and cities. In cities with strict rent control policies, such as San Francisco and New York City, regulated rent increases and strong tenant protections aim to enhance housing affordability and stability [49]. In these high-cost areas, proximity to employment centres is a significant factor for renters, who are often willing to pay a premium for shorter commutes, thereby reducing time and costs associated with travel. In contrast, cities like Houston and Dallas, which have minimal rent regulation, experience more volatile rental prices driven by market forces. This can lead to affordability issues but offers greater flexibility for landlords [50]. Additionally, renters in the U.S. highly value access to urban amenities such as restaurants, entertainment, and shopping. High-demand neighbourhoods with these amenities often see increased rental prices due to their desirability. The quality of housing, including modern appliances and well-maintained conditions, also plays a crucial role. In competitive markets, renters may be willing to pay higher rental prices for properties with desirable features [51].
Germany’s rental market benefits from tenant-friendly regulations, including strict rent control and long-term rental agreements. These regulations are highly valued by tenants, as they limit annual rent increases and cap new rental prices relative to local market rates. This system provides stability and predictability in rental prices, making affordability a key concern for German renters. As a result, tenants in Germany are generally less worried about rapid rent changes and more focused on securing stable, affordable housing. In contrast, China’s rental market experiences significant rent volatility due to rapid urbanization and limited regulatory measures. The influence of public services, such as education and healthcare, on rental prices is more pronounced in China, where access to these services often drives rental prices. Additionally, German renters prioritize well-planned neighbourhoods with good access to green spaces, public transport, and local services. This emphasis on residential comfort and accessibility contributes to a more balanced and predictable rental market compared to China [52].
Japan’s rental market is characterized by stringent regulations designed to protect tenants, including rent control and strong renewal rights. These protections contribute to market stability by ensuring that rent increases are gradual and predictable, though they may also limit landlords’ ability to adjust rental prices in response to changing market conditions. In contrast, China’s rental market is highly volatile, driven by rapid urban development and less robust regulatory frameworks. The market dynamics in China are significantly influenced by the proximity to high-quality public services and the availability of housing relative to demand. Japan’s urban planning policies promote high-density development, particularly in major cities like Tokyo, which helps stabilize the rental market by balancing supply and demand. Additionally, Japanese cultural attitudes view renting as a long-term solution rather than a temporary measure, further influencing market dynamics and rental prices [53].

6. Conclusions

This study enhances the existing understanding of the factors driving rental prices in China’s super cities, with a specific focus on the influence of public services including education and healthcare. By employing advanced machine learning models, this study demonstrated how these services contribute to rental price disparities across Beijing, Shanghai, and Shenzhen. The primary contribution of this research is the development of a transferable analytical framework that can be applied to rental markets in other rapidly urbanizing regions worldwide. This framework offers valuable insights for policymakers and urban planners, enabling them to make informed decisions about the distribution of public services and its impact on housing affordability.
By addressing the unique socio-economic conditions of the Chinese rental market, this study fills a gap in existing research and provides actionable recommendations for reducing rental inequalities through a more equitable public service distribution. This study highlights the significant influence of education and medical resources on rental prices in Chinese super cities, a factor that is less emphasized in many international rental markets where renters are primarily concerned with proximity to employment centres.
Based on these findings, the following policy recommendations are made. (1) Targeted investments in underserved areas: Local governments could focus on increasing investments in areas with lower access to educational and healthcare facilities. By developing new schools, hospitals, and clinics in underserved regions, governments can create a more balanced distribution of public services, potentially mitigating the disparity in rental prices across different neighbourhoods. This could also promote more equitable access to essential services for residents. (2) Incentives for private sector participation: Authorities can offer incentives to the private sector to develop high-quality educational and healthcare facilities in regions where public services are lacking. Public-private partnerships could play a key role in expanding the availability of these services, thus reducing pressure on public resources and minimizing the rental premium associated with access to such facilities. (3) Transport infrastructure improvements: Enhancing public transport links between peripheral or underserved areas and regions with concentrated public services could alleviate some of the disparities in rental prices. Improved transportation would enable residents to access high-quality public services without needing to reside in the immediate vicinity, which could lead to a more balanced rental market across the city. (4) Urban planners and policymakers in other rapidly urbanizing regions could benefit from considering the Chinese experience, particularly the role of public services in shaping rental market dynamics. Additionally, by adopting the analytical framework proposed in this study, cities worldwide can improve their understanding of rental price determinants and develop more informed housing policies.
While the machine learning models provided valuable insights into the factors influencing rental prices, the study also faced several challenges. One notable limitation is the restricted timeframe of data collection (two months), which may not capture long-term rental trends or market fluctuations. Additionally, while interpretable machine learning techniques like the SHAP technique help reveal key influences, they also introduce complexities that may obscure simpler, more actionable insights for policymakers. Future studies could extend the analysis period or explore additional models to ensure a more comprehensive understanding of rental dynamics. Moreover, balancing the depth of the methodological explanation with broader implications of the findings remains a challenge in this type of research.

Author Contributions

R.K.: conceptualization, validation, formal analysis, writing—original draft, Writing—review and editing, supervision, and funding acquisition. Y.L.: validation, formal analysis, writing—original draft, and writing—review and editing. Y.Z.: conceptualization, data curation, methodology, software, formal analysis, and writing—original draft. W.L.: validation, formal analysis, writing—original draft, writing—review and editing. X.H.: validation, formal analysis, writing—original draft, and writing—review and editing. Q.P.: conceptualization, methodology, software, validation, formal analysis, writing—original draft, writing—review and editing, and project administration. All authors have read and agreed to the published version of the manuscript.

Funding

This work was financially supported by the Arts Crafts Industrial Design Centre of Hunan Province (No. 2022GYMSZ1).

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Data dictionary.

| # | Variable | Description |
|---|---|---|
| 1 | Rent | Dependent variable, rental price per month in Chinese Yuan (CNY) |
| | Location and Physical Factors | |
| 2 | District | District where the property is located. Beijing: 13 districts; Shanghai: 12 districts; Shenzhen: 9 districts |
| 3 | Orientation | Orientation of the property’s main natural light source, i.e., SW = southwest. 8 categories in total (E, S, W, N, NE, SE, NW, SW) |
| 4 | Longitude | Longitude of the property |
| 5 | Latitude | Latitude of the property |
| 6 | Room | Number of room(s) in the property |
| 7 | Bathroom | Number of bathroom(s) in the property |
| 8 | Common_area | Number of common area(s) in the property |
| 9 | Balcony | Whether the property has balcony or not (0 = no, 1 = yes) |
| 10 | Sqmt | Size of the property in square metres |
| 11 | Level | Level where property is located in the building (1 = basement, 2 = low levels, 3 = medium levels, 4 = high levels) |
| 12 | Floors | Total number of floors of the building where the property is located |
| | Provider Information | |
| 13 | Agency | Whether the listing is published by agency or individual (0 = Agency, 1 = individual) |
| | Furnishings and Facilities | |
| 14 | Bed | Whether bed is provided or not (0 = no, 1 = yes) |
| 15 | Wardrobe | Whether wardrobe is provided or not (0 = no, 1 = yes) |
| 16 | Sofa | Whether sofa is provided or not (0 = no, 1 = yes) |
| 17 | TV | Whether television is provided or not (0 = no, 1 = yes) |
| 18 | Fridge | Whether fridge is provided or not (0 = no, 1 = yes) |
| 19 | Laundry | Whether laundry is provided or not (0 = no, 1 = yes) |
| 20 | AC | Whether air conditioning is provided or not (0 = no, 1 = yes) |
| 21 | Hot water | Whether water heater is provided or not (0 = no, 1 = yes) |
| 22 | Broadband | Whether broadband is provided or not (0 = no, 1 = yes) |
| 23 | Gas | Whether gas is provided or not (0 = no, 1 = yes) |
| 24 | Heating | Whether heating is provided or not (0 = no, 1 = yes) |
| | Nearby Amenities | |
| 25 | Dis_edu | Distance to the nearest educational institution, in metres |
| 26 | Num_edus | Number of educational institutions within 3 km of the property |
| 27 | Dis_med | Distance to the nearest medical/healthcare facility, in metres |
| 28 | Num_meds | Number of medical/healthcare facilities within 3 km of the property |
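The four "Nearby Amenities" variables in Table A1 can be derived from coordinates alone. The sketch below shows one way to do this in Python, assuming straight-line (great-circle) distances computed with the haversine formula. Whether the study used straight-line or road-network distances is not stated in this section, and the function names, the example coordinates, and the amenity list are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def amenity_features(lat, lon, amenities, radius_m=3000.0):
    """Return (distance to nearest amenity in metres, count within radius_m).

    `amenities` is a non-empty list of (lat, lon) tuples.
    """
    distances = [haversine_m(lat, lon, a_lat, a_lon) for a_lat, a_lon in amenities]
    return min(distances), sum(d <= radius_m for d in distances)


# Hypothetical property and school locations (not taken from the dataset).
schools = [(39.91, 116.40), (39.90, 116.45), (39.95, 116.40)]
dis_edu, num_edus = amenity_features(39.90, 116.40, schools)
print(round(dis_edu), num_edus)
```

The same function applied to a list of medical facilities would give values analogous to Dis_med and Num_meds.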
Table A2. Descriptive statistics for numerical variables.

| Variable | Beijing Mean | Beijing Std.Dev. | Shanghai Mean | Shanghai Std.Dev. | Shenzhen Mean | Shenzhen Std.Dev. |
|---|---|---|---|---|---|---|
| Rent | 7031.71 | 5170.78 | 6873.08 | 6124.97 | 4769.57 | 4097.34 |
| Longitude | 116.39 | 0.15 | 121.46 | 0.081 | 114.04 | 0.12 |
| Latitude | 39.93 | 0.11 | 31.22 | 0.09 | 22.62 | 0.08 |
| Sqmt | 77.10 | 34.06 | 76.30 | 44.82 | 63.63 | 32.59 |
| Floors | 18.04 | 8.35 | 13.44 | 10.22 | 22.80 | 11.77 |
| Dis_edu | 1505.76 | 822.96 | 1863.90 | 827.71 | 672.98 | 914.50 |
| Num_edus | 66.95 | 37.59 | 97.65 | 143.53 | 74.05 | 26.72 |
| Dis_med | 1722.35 | 1104.09 | 2205.80 | 1329.82 | 2230.09 | 1247.99 |
| Num_meds | 118.48 | 39.72 | 100.64 | 41.46 | 69.06 | 37.18 |
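Table A3. Descriptive statistics for categorical variables.

| Variable | Category | Beijing | Shanghai | Shenzhen |
|---|---|---|---|---|
| Orientation | E | 9.49% | 1.83% | 6.90% |
| | SE | 4.36% | 0.16% | 0.40% |
| | N | 6.97% | 0.94% | 5.62% |
| | NE | 4.28% | 0.26% | 3.50% |
| | NW | 3.99% | 0.24% | 3.46% |
| | S | 55.83% | 95.00% | 70.29% |
| | SW | 6.15% | 0.92% | 7.12% |
| | W | 8.92% | 0.65% | 2.72% |
| Room | 1 | 43.19% | 41.93% | 40.05% |
| | 2 | 44.01% | 41.20% | 29.44% |
| | 3 | 12.80% | 14.33% | 24.81% |
| | 4 | 0.00% | 2.54% | 5.69% |
| Bathroom | 1 | 85.13% | 83.78% | 81.32% |
| | 2 | 14.87% | 13.68% | 18.68% |
| | 3 | 0.00% | 2.54% | 0.00% |
| Common_area | 0 | 5.87% | 5.05% | 18.31% |
| | 1 | 74.57% | 60.92% | 36.65% |
| | 2 | 19.56% | 34.03% | 45.05% |
| Balcony | 0 | 25.79% | 38.06% | 26.68% |
| | 1 | 74.21% | 61.94% | 73.32% |
| Level | 1 | 0.00% | 0.05% | 0.00% |
| | 2 | 30.32% | 18.21% | 21.48% |
| | 3 | 40.46% | 52.31% | 45.60% |
| | 4 | 29.22% | 29.43% | 32.92% |
| Agency | 0 | 89.98% | 91.37% | 83.89% |
| | 1 | 10.02% | 8.63% | 16.11% |
| Bed | 0 | 24.61% | 14.26% | 21.78% |
| | 1 | 75.39% | 85.74% | 78.22% |
| Wardrobe | 0 | 22.25% | 24.85% | 27.01% |
| | 1 | 77.75% | 75.15% | 72.99% |
| Sofa | 0 | 24.90% | 31.52% | 30.11% |
| | 1 | 75.10% | 68.48% | 69.89% |
| TV | 0 | 23.39% | 26.08% | 35.19% |
| | 1 | 76.61% | 73.92% | 64.81% |
| Fridge | 0 | 16.06% | 11.25% | 21.30% |
| | 1 | 83.94% | 88.75% | 78.70% |
| Laundry | 0 | 14.10% | 6.80% | 17.32% |
| | 1 | 85.90% | 93.20% | 82.68% |
| AC | 0 | 41.16% | 49.59% | 31.93% |
| | 1 | 58.84% | 50.41% | 68.07% |
| Hot water | 0 | 11.65% | 6.30% | 10.95% |
| | 1 | 88.35% | 93.70% | 89.05% |
| Broadband | 0 | 21.96% | 23.28% | 29.69% |
| | 1 | 78.04% | 76.72% | 70.31% |
| Gas | 0 | 20.25% | 31.81% | 29.37% |
| | 1 | 79.75% | 68.19% | 70.63% |
| Heating | 0 | 11.04% | 30.66% | 40.83% |
| | 1 | 88.96% | 69.34% | 59.17% |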
Table A4. District codes list.

| Code | Beijing | Shanghai | Shenzhen |
|---|---|---|---|
| 1 | Changping | Baoshan | Baoan |
| 2 | Chaoyang | Changning | Buji |
| 3 | Daxing | Fengxian | Futian |
| 4 | Dongcheng | Hongkou | Guangming |
| 5 | Fangshan | Huangpu | Longgang |
| 6 | Fengtai | Jiading | Longhua |
| 7 | Haidian | Jingan | Luohu |
| 8 | Huairou | Minxing | Nanshan |
| 9 | Mentougou | Pudong | Yantian |
| 10 | Shijingshan | Putuo | |
| 11 | Shunyi | Xuhui | |
| 12 | Tongzhou | Yangpu | |
| 13 | Xicheng | | |

References

  1. Waxman, A.; Liang, Y.; Li, S.; Barwick, P.J.; Zhao, M. Tightening belts to buy a home: Consumption responses to rising housing prices in urban China. J. Urban Econ. 2019, 115, 103190. [Google Scholar] [CrossRef]
  2. Yan, S.; Ge, X.J.; Wu, Q. Government intervention in land market and its impacts on land supply and new housing supply: Evidence from major Chinese markets. Habitat Int. 2014, 44, 517–527. [Google Scholar] [CrossRef]
  3. Mo, J. Land financing and economic growth: Evidence from Chinese counties. China Econ. Rev. 2018, 50, 218–239. [Google Scholar] [CrossRef]
  4. Qi, W.; Li, G. Residential carbon emission embedded in China’s inter-provincial population migration. Energy Policy 2020, 136, 111065. [Google Scholar] [CrossRef]
  5. Li, H.; Chen, P.; Grant, R. Built environment, special economic zone, and housing prices in Shenzhen, China. Appl. Geogr. 2021, 129, 102429. [Google Scholar] [CrossRef]
  6. Gu, H.; Liu, Z.; Shen, T. Spatial pattern and determinants of migrant workers’ interprovincial hukou transfer intention in China: Evidence from a National Migrant Population Dynamic Monitoring Survey in 2016. Popul. Space Place 2020, 26, e2250. [Google Scholar] [CrossRef]
  7. Zhou, J.; Lin, L.; Tang, S.; Zhang, S. To settle but not convert hukou among rural migrants in urban China: How does family-level eligibility for citizenship benefits matter? Habitat Int. 2022, 120, 102511. [Google Scholar] [CrossRef]
  8. Wallace, J. Cities and Stability: Urbanization, Redistribution, and Regime Survival in China; Oxford University Press: Oxford, UK, 2014. [Google Scholar]
  9. Qian, X.; Qiu, S.; Zhang, G. The impact of COVID-19 on housing price: Evidence from China. Financ. Res. Lett. 2021, 43, 101944. [Google Scholar] [CrossRef]
  10. Zheng, S.; Hu, W.; Wang, R. How much is a good school worth in Beijing? Identifying price premium with paired resale and rental data. J. Real Estate Financ. Econ. 2016, 53, 184–199. [Google Scholar] [CrossRef]
  11. Liu, F.; Min, M.; Zhao, K.; Hu, W. Spatial-temporal variation in the impacts of urban infrastructure on housing prices in Wuhan, China. Sustainability 2020, 12, 1281. [Google Scholar] [CrossRef]
  12. Wen, H.; Zhang, Y.; Zhang, L. Do educational facilities affect housing price? An empirical study in Hangzhou, China. Habitat Int. 2014, 42, 155–163. [Google Scholar] [CrossRef]
  13. Linneman, P. Some empirical results on the nature of the hedonic price function for the urban housing market. J. Urban Econ. 1980, 8, 47–68. [Google Scholar] [CrossRef]
  14. Solakis, K.; Pena-Vinces, J.; Lopez-Bonilla, J.M. Value co-creation and perceived value: A customer perspective in the hospitality context. Eur. Res. Manag. Bus. Econ. 2022, 28, 100175. [Google Scholar] [CrossRef]
  15. Zhang, C.; Jia, S.; Yang, R. Housing affordability and housing vacancy in China: The role of income inequality. J. Hous. Econ. 2016, 33, 4–14. [Google Scholar] [CrossRef]
  16. Zheng, S.; Cheng, Y.; Ju, Y. Understanding the intention and behavior of renting houses among the young generation: Evidence from Jinan, China. Sustainability 2019, 11, 1507. [Google Scholar] [CrossRef]
  17. Su, S.; He, S.; Sun, C.; Zhang, H.; Hu, L.; Kang, M. Do landscape amenities impact private housing rental prices? A hierarchical hedonic modeling approach based on semantic and sentimental analysis of online housing advertisements across five Chinese megacities. Urban For. Urban Green. 2021, 58, 126968. [Google Scholar] [CrossRef]
  18. Waltert, F.; Schläpfer, F. Landscape amenities and local development: A review of migration, regional economic and hedonic pricing studies. Ecol. Econ. 2010, 70, 141–152. [Google Scholar] [CrossRef]
  19. Cui, N.; Gu, H.; Shen, T.; Feng, C. The impact of micro-level influencing factors on home value: A housing price-rent comparison. Sustainability 2018, 10, 4343. [Google Scholar] [CrossRef]
  20. Cui, N.; Gu, H. Homeowner and Renter Payment for School Quality in Beijing: Boundary Fixed Effect Analysis with Housing Price–Rent Comparison. J. Urban Plan. Dev. 2021, 147, 05021025. [Google Scholar] [CrossRef]
  21. Li, H.; Wei, Y.D.; Wu, Y. Analyzing the private rental housing market in Shanghai with open data. Land Use Policy 2019, 85, 271–284. [Google Scholar] [CrossRef]
  22. Arribas-Bel, D.; Garcia-López, M.-À.; Viladecans-Marsal, E. Building(s and) cities: Delineating urban areas with a machine learning algorithm. J. Urban Econ. 2021, 125, 103217. [Google Scholar] [CrossRef]
  23. Lorenz, F.; Willwersch, J.; Cajias, M.; Fuerst, F. Interpretable machine learning for real estate market analysis. Real Estate Econ. 2022, 51, 1178–1208. [Google Scholar] [CrossRef]
  24. Yoshida, T.; Murakami, D.; Seya, H. Spatial prediction of apartment rent using regression-based and machine learning-based approaches with a large dataset. J. Real Estate Financ. Econ. 2022, 69, 1–28. [Google Scholar] [CrossRef]
  25. Molnar, C. Interpretable Machine Learning; Lulu.com: Morrisville, NC, USA, 2020. [Google Scholar]
  26. Waddell, P.; Besharati-Zadeh, A. A Comparison of Statistical and Machine Learning Algorithms for Predicting Rents in the San Francisco Bay Area. arXiv 2020, arXiv:2011.14924. [Google Scholar]
  27. Embaye, W.T.; Zereyesus, Y.A.; Chen, B. Predicting the rental value of houses in household surveys in Tanzania, Uganda and Malawi: Evaluations of hedonic pricing and machine learning approaches. PLoS ONE 2021, 16, e0244953. [Google Scholar] [CrossRef] [PubMed]
  28. Emerson, S.; Kennedy, R.; O’Shea, L.; O’Brien, J. Trends and applications of machine learning in quantitative finance. In Proceedings of the 8th International Conference on Economics and Finance Research (ICEFR 2019), Lyon, France, 18–21 June 2019. [Google Scholar]
  29. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  30. Liu, K.; Peng, Q.; Liu, Y.; Cui, N.; Zhang, C. Explainable neural network for sensitivity analysis of lithium-ion battery smart production. IEEE/CAA J. Autom. Sin. 2024, 11, 1944–1953. [Google Scholar] [CrossRef]
  31. Shi, T.; Horvath, S. Unsupervised learning with random forest predictors. J. Comput. Graph. Stat. 2006, 15, 118–138. [Google Scholar] [CrossRef]
  32. Liu, K.; Peng, Q.; Liu, Z.; Li, W.; Cui, N.; Zhang, C. Adaptive battery thermal management systems in unsteady thermal application contexts. J. Energy Chem. 2024, 97, 650–668. [Google Scholar] [CrossRef]
  33. Zhu, T.; Cruden, A.; Peng, Q.; Liu, K. Enabling extreme fast charging. Joule 2023, 7, 2660–2662. [Google Scholar] [CrossRef]
  34. Bowen, D.; Ungar, L. Generalized SHAP: Generating multiple types of explanations in machine learning. arXiv 2020, arXiv:2006.07155. [Google Scholar]
  35. Lundberg, S.M.; Erion, G.G.; Lee, S.I. Consistent individualized feature attribution for tree ensembles. arXiv 2018, arXiv:1802.03888. [Google Scholar] [CrossRef]
  36. Sagi, O.; Rokach, L. Ensemble learning: A survey. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2018, 8, e1249. [Google Scholar] [CrossRef]
  37. Horning, N. Random Forests: An algorithm for image classification and generation of continuous fields data sets. In Proceedings of the International Conference on Geoinformatics for Spatial Infrastructure Development in Earth and Allied Sciences, Osaka, Japan, 9–11 December 2010; Volume 911, pp. 1–6. [Google Scholar]
  38. Hanink, D.M.; Cromley, R.G.; Ebenstein, A.Y. Spatial variation in the determinants of house prices and apartment rents in China. J. Real Estate Financ. Econ. 2012, 45, 347–363. [Google Scholar] [CrossRef]
  39. Wen, H.; Xiao, Y.; Zhang, L. School district, education quality, and housing price: Evidence from a natural experiment in Hangzhou, China. Cities 2017, 66, 72–80. [Google Scholar] [CrossRef]
  40. Wen, H.; Xiao, Y.; Hui, E.C. Quantile effect of educational facilities on housing price: Do homebuyers of higher-priced housing pay more for educational resources? Cities 2019, 90, 100–112. [Google Scholar] [CrossRef]
  41. Lu, C.; Zhang, Z.; Lan, X. Impact of China’s referral reform on the equity and spatial accessibility of healthcare resources: A case study of Beijing. Soc. Sci. Med. 2019, 235, 112386. [Google Scholar] [CrossRef]
  42. Hu, S.; Fan, Y.; Zhang, T. Assessing the effect of land use change on surface runoff in a rapidly urbanized city: A case study of the central area of Beijing. Land 2020, 9, 17. [Google Scholar] [CrossRef]
  43. Alpaydin, E. Machine Learning; MIT Press: Cambridge, MA, USA, 2021. [Google Scholar]
  44. Miao, J.T.; Phelps, N.A. Urban sprawl as policy sprawl: Distinguishing Chinese capitalism’s suburban spatial fix. Ann. Am. Assoc. Geogr. 2022, 112, 1179–1194. [Google Scholar] [CrossRef]
  45. Cheshire, P.; Sheppard, S. Capitalising the value of free schools: The impact of supply characteristics and uncertainty. Econ. J. 2004, 114, F397–F424. [Google Scholar] [CrossRef]
  46. Gibbons, S.; Machin, S.; Silva, O. Valuing school quality using boundary discontinuities. J. Urban Econ. 2013, 75, 15–28. [Google Scholar] [CrossRef]
  47. Huang, Y.; Clark, W.A. Housing tenure choice in transitional urban China: A multilevel analysis. Urban Stud. 2002, 39, 7–32. [Google Scholar] [CrossRef]
  48. Shimizu, C.; Watanabe, T. Housing Bubble in Japan and the United States; Research Center for Price Dynamics, Institute of Economic Research, Hitotsubashi University: Tokyo, Japan, 2010. [Google Scholar]
  49. Malpezzi, S. Housing prices, externalities, and regulation in US metropolitan areas. J. Hous. Res. 1996, 7, 209–241. Available online: https://www.jstor.org/stable/24832860 (accessed on 3 September 2024).
  50. Orlando, A.W.; Redfearn, C.L. Houston, you have a problem: How large cities accommodate more housing. Real Estate Econ. 2022, 52, 1045–1074. [Google Scholar] [CrossRef]
  51. Metzger, R.E. Substandard Rental Housing in the Promise Zone of a Mid-Sized US City. Ph.D. Dissertation, Walden University, Minneapolis, MN, USA, 2018. [Google Scholar]
  52. Breidenbach, P.; Eilers, L.; Fries, J. Temporal dynamics of rent regulations–The case of the German rent control. Reg. Sci. Urban Econ. 2022, 92, 103737. [Google Scholar] [CrossRef]
  53. Hirayama, Y. Housing and the rise and fall of Japan’s social mainstream. In Housing East Asia: Socioeconomic and Demographic Challenges; Palgrave Macmillan: London, UK, 2014; pp. 116–139. [Google Scholar] [CrossRef]
Figure 1. District Maps of Beijing, Shanghai, and Shenzhen. (a) Administrative divisions of Beijing (Source: Beijing Municipal Commission of Planning and Natural Resources). (b) Administrative divisions of Shanghai (Source: Shanghai Surveying and Mapping Institute). (c) Administrative divisions of Shenzhen (Source: Shenzhen Planning and Natural Resources Bureau).
Figure 2. SHAP summary plot for Beijing.
Figure 3. SHAP interaction plots for Beijing.
Figure 4. SHAP summary plot for Shanghai.
Figure 5. SHAP interaction plots for Shanghai.
Figure 6. SHAP summary plot for Shenzhen.
Figure 7. SHAP interaction plots for Shenzhen.
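SHAP values, as shown in the summary and interaction plots of Figures 2–7, attribute a model's prediction to its input features using Shapley values. As a minimal illustration only, the sketch below computes exact Shapley values by averaging marginal contributions over all feature orderings. The toy rent function `toy_rent`, its coefficients, and the single reference point used as the baseline are hypothetical and are not taken from the study; SHAP libraries rely on faster, model-specific algorithms rather than this brute-force enumeration.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance via enumeration of feature orderings.

    Features not yet "revealed" are held at their baseline value (an assumption
    of this sketch; other background distributions are possible).
    """
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]              # reveal feature i
            updated = predict(current)
            phi[i] += updated - previous   # marginal contribution of feature i
            previous = updated
    orderings = factorial(n)
    return [p / orderings for p in phi]


def toy_rent(features):
    """Hypothetical rent model with one interaction term (Sqmt x AC)."""
    sqmt, num_edus, has_ac = features
    return 50.0 * sqmt + 10.0 * num_edus + 300.0 * has_ac + 2.0 * sqmt * has_ac


x = [80.0, 60.0, 1.0]         # listing to explain (hypothetical)
baseline = [70.0, 50.0, 0.0]  # reference listing (hypothetical)
phi = shapley_values(toy_rent, x, baseline)

# The attributions add up to the gap between the prediction and the baseline.
gap = toy_rent(x) - toy_rent(baseline)
print(dict(zip(["Sqmt", "Num_edus", "AC"], phi)), "sum:", sum(phi), "gap:", gap)
```

The interaction term is split between Sqmt and AC, which is the kind of joint effect that interaction plots are meant to expose.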
Table 1. Model comparison for Beijing dataset.

| Model | RMSE (Train) | RMSE (Test) | R2 (Train) | R2 (Test) | MAE (Train) | MAE (Test) |
|---|---|---|---|---|---|---|
| Multi-linear Reg. | 3208.5 | 3086.2 | 62% | 63% | 1838.3 | 1871.2 |
| ANN | 2702.7 | 2915.2 | 73% | 67% | 1764.8 | 1817.0 |
| SVM | 2951.3 | 2813.9 | 68% | 69% | 1788.3 | 1812.7 |
| RF | 2291.2 | 2047.9 | 81% | 84% | 1315.0 | 1247.8 |
| XGBoost | 2331.0 | 2214.2 | 80% | 81% | 1370.8 | 1402.7 |
Table 2. Model comparison for Shanghai dataset.

| Model | RMSE (Train) | RMSE (Test) | R2 (Train) | R2 (Test) | MAE (Train) | MAE (Test) |
|---|---|---|---|---|---|---|
| Multi-linear Reg. | 3195.3 | 2894.5 | 73% | 77% | 1650.5 | 1560.1 |
| ANN | 2609.2 | 2559.0 | 82% | 82% | 1521.7 | 1488.1 |
| SVM | 2716.5 | 2482.0 | 80% | 83% | 1403.2 | 1331.2 |
| RF | 2384.5 | 2358.1 | 85% | 85% | 1257.6 | 1258.6 |
| XGBoost | 2443.1 | 2422.2 | 84% | 84% | 1339.1 | 1343.6 |
Table 3. Model comparison for Shenzhen dataset.

| Model | RMSE (Train) | RMSE (Test) | R2 (Train) | R2 (Test) | MAE (Train) | MAE (Test) |
|---|---|---|---|---|---|---|
| Multi-linear Reg. | 2144.7 | 2481.6 | 70% | 70% | 1341.3 | 1446.8 |
| ANN | 1803.1 | 1866.4 | 79% | 83% | 1018.5 | 1038.6 |
| SVM | 1615.2 | 1854.7 | 83% | 83% | 876.9 | 927.6 |
| RF | 1431.9 | 1658.1 | 87% | 86% | 745.1 | 799.3 |
| XGBoost | 1523.2 | 1807.2 | 85% | 84% | 865.7 | 967.2 |
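The three error measures in the model comparison tables (RMSE, R2, and MAE) can be computed from observed and predicted rents. The following is a minimal Python sketch assuming the standard textbook definitions of these metrics; the function name `regression_metrics` and the sample rents are illustrative and do not come from the study's data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R2 under their standard definitions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)                # residual sum of squares
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)    # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical observed and predicted monthly rents.
y_true = [4000.0, 6000.0, 8000.0, 10000.0]
y_pred = [4500.0, 5500.0, 8500.0, 9500.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

RMSE and MAE are in the same units as rent, so lower values indicate a better fit, while R2 is unitless and higher values indicate a better fit. Comparing the train and test columns of each metric gives a rough indication of overfitting.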
Table 4. Results of multiple linear regression model.

Response Variable: Log (Rent)

| Independent Variables | Beijing | Shanghai | Shenzhen |
|---|---|---|---|
| District | −0.0041 *** | −0.0002 | 0.0058 *** |
| Orientation | −0.0016 | 0.0282 *** | −0.0053 *** |
| Longitude | −0.0234 | 0.1138 *** | −0.1413 *** |
| Latitude | 0.2171 *** | 0.2768 *** | −1.6852 *** |
| Room | 0.0226 *** | 0.0282 *** | 0.0234 *** |
| Bathroom | 0.0288 *** | 0.0370 *** | −0.0051 |
| Common_area | 0.0381 *** | 0.0672 *** | 0.0971 *** |
| Balcony | 0.0076 | 0.0059 | −0.0154 |
| Sqmt | 0.0042 *** | 0.0032 *** | 0.0039 *** |
| Level | 0.0051 | −0.0027 | 0.0002 |
| Floors | 0.0032 *** | 0.0022 *** | 0.0061 *** |
| Agency | −0.0029 | 0.0563 *** | −0.0031 |
| Bed | 0.0584 *** | 0.0119 | −0.0001 |
| Wardrobe | −0.0146 | 0.0678 *** | 0.0304 |
| Sofa | 0.0025 | 0.0301 *** | 0.0314 ** |
| TV | 0.0120 | 0.0149 ** | 0.0214 ** |
| Fridge | 0.0400 ** | −0.0163 | 0.0312 *** |
| Laundry | −0.0189 | 0.0037 | 0.0205 * |
| AC | 0.0578 *** | 0.0897 *** | 0.0368 *** |
| Hot water | −0.0014 | −0.0264 | 0.0253 ** |
| Broadband | 0.0139 | 0.0192 *** | 0.0254 *** |
| Gas | −0.0165 | −0.0065 | −0.0053 |
| Heating | 0.0302 ** | 0.0034 | 0.0111 ** |
| Dis_edu | 0.0229 ** | −0.0058 | −0.0111 * |
| Log (Num_edus) | 0.0029 *** | 0.0015 *** | 0.0007 *** |
| Dis_med | 0.0337 *** | 0.0163 ** | 0.0206 ** |
| Log (Num_meds) | 0.0009 *** | 0.0008 *** | −0.0001 |
| Intercept | −3.0502 | −19.3471 *** | 57.2376 *** |

***: 1% significance level; **: 5% significance level; *: 10% significance level.
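Because the response variable in Table 4 is Log (Rent), each coefficient describes a proportional rather than an absolute change in rent. As a hedged illustration, and assuming the logarithm is natural (the table does not state the base), a coefficient β on a regressor in levels corresponds to a change of 100·(exp(β) − 1) percent in rent for a one-unit increase, holding the other variables fixed. The helper name `pct_effect` is illustrative; the coefficients are copied from Table 4.

```python
import math


def pct_effect(beta):
    """Percent change in rent implied by a log-linear coefficient.

    Assumes the response is the natural log of rent and the regressor
    enters in levels (for example, a 0/1 amenity indicator).
    """
    return 100.0 * (math.exp(beta) - 1.0)


# Selected coefficients from Table 4 (city, variable) -> beta.
coefs = {
    ("Beijing", "Bed"): 0.0584,
    ("Shanghai", "AC"): 0.0897,
    ("Shenzhen", "Common_area"): 0.0971,
}

effects = {key: pct_effect(beta) for key, beta in coefs.items()}
for (city, variable), effect in effects.items():
    print(f"{city} {variable}: {effect:.2f}% change in rent per unit")
```

Under this assumption, the Shanghai AC coefficient of 0.0897 implies rents roughly 9.4% higher for listings with air conditioning. For small coefficients the simpler approximation 100·β is close, but it understates the effect as β grows. The rows for Log (Num_edus) and Log (Num_meds) are log-log terms and would instead be read as elasticities.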
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Kou, R.; Long, Y.; Zhou, Y.; Liu, W.; He, X.; Peng, Q. Investigating the Impact of Public Services on Rental Prices in Chinese Super Cities Based on Interpretable Machine Learning. Sustainability 2024, 16, 7861. https://doi.org/10.3390/su16177861