Evaluation and Optimization of Traditional Mountain Village Spatial Environment Performance Using Genetic and XGBoost Algorithms in the Early Design Stage—A Case Study in the Cold Regions of China

Xu, Zhixin; Li, Xiaoming; Sun, Bo; Wen, Yueming; Tang, Peipei

doi:10.3390/buildings14092796


by Zhixin Xu ¹, Xiaoming Li ¹,*, Bo Sun ¹, Yueming Wen ² and Peipei Tang ³

¹ School of Architecture, Southeast University, Nanjing 210096, China
² School of Architecture, Zhengzhou University, Zhengzhou 450001, China
³ College of Architecture and Urban Planning, Tongji University, Shanghai 200092, China
* Author to whom correspondence should be addressed.

Buildings 2024, 14(9), 2796; https://doi.org/10.3390/buildings14092796

Submission received: 22 July 2024 / Revised: 16 August 2024 / Accepted: 3 September 2024 / Published: 5 September 2024

(This article belongs to the Section Architectural Design, Urban Science, and Real Estate)


Abstract:

As urbanization advances, rural construction and resource development in China encounter significant challenges, leading to the widespread adoption of standardized planning and design methods to manage increasing population pressure. These uniform approaches often prioritize economic benefits over climate adaptability and energy efficiency. This paper addresses this issue by focusing on traditional mountain villages in northern regions, particularly examining the wind and thermal environments of courtyards and street networks. This study integrates energy consumption and comfort performance analysis early in the planning and design process, utilizing Genetic and XGBoost algorithms to enhance efficiency. This study began by selecting a benchmark model based on simulations of courtyard PET (Physiological Equivalent Temperature) and MRT (mean radiant temperature). It then employed the Wallacei_X plugin, which uses the NSGA-II algorithm for multi-objective genetic optimization (MOGO) to optimize five energy consumption and comfort objectives. The resulting solutions were trained in the Scikit-learn machine learning platform. After comparing machine learning models like RandomForest and XGBoost, the highest-performing XGBoost model was selected for further training. Validation shows that the XGBoost model achieves an average accuracy of over 80% in predicting courtyard performance. In the project’s validation phase, the overall street network framework of the block was first adjusted based on street performance prediction models and related design strategies. The optimized model prototype was then integrated into the planning scheme according to functional requirements. After repeated validation and adjustments, the performance prediction of the village planning scheme was conducted. The calculations indicate that the optimized planning scheme improves overall performance by 36% compared with the original baseline. 
In conclusion, this study aimed to integrate performance assessment and machine learning algorithms into the decision-making process for optimizing traditional village environments, offering new approaches for sustainable rural development.

Keywords:

traditional mountain village spatial environment; wind and thermal environment; genetic design; XGBoost algorithms

1. Introduction

With the rapid urbanization in China, the country’s swift economic growth has brought unprecedented challenges to the development of traditional villages. On the one hand, urban expansion is continually eroding and disrupting the original spatial patterns of surrounding traditional villages. The uniform planning and design methods, which result in a monotonous urban landscape, undermine the ecological layouts of village courtyards that adapt to nature and neglect the consideration of local characteristics. On the other hand, the increasingly deteriorating climate environment and the goals of energy efficiency and carbon reduction in buildings have prompted planners and designers to consider balancing energy consumption with comfort during the planning and design phases. Against this backdrop, optimizing and evaluating the spatial environmental performance of villages with a focus on climate adaptability becomes especially important in the early stages of design. Unlike traditional designs, where performance evaluation is often conducted after the project is completed (post-evaluation paradigm) [1], this approach integrates performance decision-making into the initial design phase, better combining spatial form with performance optimization and minimizing the environmental impact.

Over the past few decades, performance simulation based on multi-objective genetic algorithms has gradually gained popularity [2,3,4,5]. In most studies, building energy consumption has been the primary optimization target. Al-Homoud was among the first to propose balancing thermal comfort and energy consumption through proper design and selection of building components during the early stages of office building design [6]. Coley and Schukat attempted to combine genetic algorithms with dynamic thermal models and validated this method’s feasibility in designing a community hall [7]. Hauglustaine and Azar conducted genetic algorithm-based optimization of building envelopes, focusing on energy consumption and costs [8]. Dong and Sun conducted experiments on typical buildings in severe cold regions. They proposed a decision-making method based on the NSGA-II optimization algorithm to improve the accuracy of energy-efficient design [9]. Tian explored the role of passive design strategies, including opaque envelopes, windows, shading, and natural ventilation, in building energy consumption simulation and optimization [10]. From the above review, it is evident that most current multi-objective studies focused on resolving conflicts between building energy consumption and the indoor environment without considering how to balance energy consumption and indoor–outdoor comfort through controlling building spatial elements during the design concept stage. Additionally, the high learning and time costs of applying these methods have made it difficult for most planning and design professionals to quickly implement them in real projects.

In recent years, machine learning models have emerged to address these shortcomings of performance simulation based on multi-objective genetic algorithms. Current machine learning research mainly focuses on three aspects. The first is an imitative approach based on deep learning, which mimics an original reference model by drawing on large amounts of raw data from specific databases [11,12]. For example, Mostafavi and Sun used the pix2pix predictive model to conduct design studies on residential space layouts [13]. Such research often targets specific design objects, with the core being image semantics learning and some degree of intervention through a style guide, without analyzing or processing the underlying database. The second approach involves collecting environmental information using data sensors, obtaining tree-based model data, and conducting statistical predictions based on this data [14,15,16]. For instance, Ahmad compared and validated the performance of machine learning models against actual energy consumption using artificial neural network models, utilizing supervised machine learning models to predict energy use in different building environments [17]. Liu compared the complexity and prediction accuracy of artificial neural networks (ANNs) and support vector machines (SVMs) in building energy consumption prediction [18]. The third approach focuses on using machine learning to enhance the contribution of specific building components to indoor comfort. For example, Lin and Tsay predicted the daylighting performance of different building facades using a daylighting model based on artificial neural networks [19]. Mo used the XGBoost algorithm to develop window behavior models from collected data to improve building energy efficiency [20].

In conclusion, most studies to date have focused only on predicting and optimizing indoor or outdoor building performance [21,22,23], rarely combining indoor–outdoor comfort with energy consumption for simulation and optimization over a larger time scale. Moreover, machine learning-related research is relatively limited for the specific cluster type of traditional villages. Most of them are limited to a single particular courtyard type [24,25], with few studies summarizing and generalizing regional typological elements from a broader perspective. Additionally, there is a lack of project-based validation of the proposed models’ feasibility. Based on this, this paper proposes a method for predicting the performance of traditional village courtyards using machine learning algorithms, enabling the assessment of indoor–outdoor performance and energy consumption during the design phase. This study focused on traditional mountain villages in northern China. From the wind and thermal environment performance optimization perspective, it analyzed the performance of courtyards and streets and their relationships with related spatial elements. Specifically, in the analysis of courtyards, after statistical analysis and the extraction of courtyard types, the initial dataset was obtained through multi-objective genetic optimization (MOGO). Based on this, the raw courtyard dataset was preprocessed through Algorithm Selection, Model Setting, and Validation of Model Training Accuracy to finally obtain the predictive model. In the analysis of street networks, the street prediction model and spatial element design guidelines are obtained through performance simulation. Finally, the accuracy of this workflow is validated through an actual project, combining the street and courtyard prediction models.

The innovation of this research lies, first, in the comprehensive evaluation of indoor and outdoor light and thermal performance of courtyards and their internal buildings and, second, in the development and validation of a performance-based generative architectural design workflow that integrates machine learning algorithms by comparing the predictive performance of the XGBoost algorithm with several other commonly used algorithms. This research provides more specialized design recommendations for planning and design professionals, enabling more efficient completion of planning and design tasks.

2. Methods

2.1. Overview Workflow

This paper introduces a design workflow that combines multi-objective genetic optimization (MOGO) and machine learning (Figure 1). The workflow is structured into the following steps:

(1): Simulation site and analysis objects settings: A benchmark courtyard model was selected by simulating and ranking PET and MRT values across various courtyard types using Honeybee 0.0.66 software.
(2): Multi-objective genetic optimization (MOGO) for courtyards. The benchmark courtyard model underwent refinement through MOGO using the Wallacei_X plugin, generating a dataset for subsequent machine learning.
(3): Wind and thermal environment analysis for street networks. Performance simulations of the wind environment within street networks were conducted, and their correlation with spatial elements was analyzed. Additionally, thermal environment simulations at different analysis radii were performed to examine the relationship between comfort and spatial elements, leading to the establishment of stepwise regression predictive equations.
(4): Predictive model construction based on machine learning: The courtyard dataset was processed in Scikit-learn 1.3 and trained using the XGBoost algorithm to create a predictive model for courtyard performance.
(5): Program evaluation: Various design plans were proposed based on different target objectives. The data parameters of each solution were input into the algorithm model for performance evaluation, allowing the best-performing design to be selected.

2.2. Simulation Site and Analysis Objects Settings

2.2.1. Site Selection for This Study

This study centered on the environmental simulation and analysis of Village A, a mountainous settlement in the cold regions of Shandong Province, northern China. The selection of this study subject is based on the following considerations:

(1): Mountainous Location: Village A is situated in a mountainous area. According to ArcGIS 10.8 statistical analysis of village distribution in Shandong Province, most traditional villages are in mountainous regions. Using a cluster statistical analysis method, this study overlaid the distribution of 519 villages in Shandong on a 30-m resolution digital elevation model (DEM) of the province, resulting in a topographic map of village distribution (Figure 2). The statistical results reveal that approximately 57% of the villages are situated at elevations exceeding 100 m (Table 1), primarily concentrated in the central mountainous region of Shandong.
(2): Distance from Urban Areas: Village A is far from urban centers. To ensure the independence of the study subject and minimize the influence of urban heat island effects, the selected site is over 40 km away from several city centers (Figure 3). The mountainous terrain further reduces the impact of urban heat island effects.

Given these factors, Village A’s geographical characteristics serve as a representative example of traditional mountain villages in cold regions. It should be noted that the model construction relies on Google imagery and DEM elevation data, which are cross-referenced with on-site photographs. To maintain the comparability of analysis results, the model simplifies roof forms and certain building shapes (Figure 4).

2.2.2. Microclimate Measurements and Validation

Before conducting the subsequent performance simulation analysis, it is essential to verify the accuracy of the simulation data. The simulated data were generated using Honeybee 0.0.66 software, with measured data obtained using Kestrel NK-5400 instruments (Kestrel Instruments, Delaware, PA, USA). This study employed the Mean Bias Error (MBE) and the Coefficient of Variation of the Root Mean Square Error (CV(RMSE)) to compare the measured data with the simulation data. According to standards such as ASHRAE Guideline 14, IPMVP, and FEMP, the MBE (%) and CV(RMSE) (%) must be less than 5% and 20%, respectively, to meet the experimental requirements.
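The two validation indicators can be illustrated with a minimal sketch. It assumes one common formulation (MBE as the summed error divided by the summed measured values, and CV(RMSE) as the RMSE divided by the measured mean, with n rather than n − p in the denominator); the function names are hypothetical, and the 5% and 20% limits are the ones quoted above.

```python
import math

def mbe_percent(measured, simulated):
    """Mean Bias Error (%): summed (measured - simulated) over summed measured."""
    return 100.0 * sum(m - s for m, s in zip(measured, simulated)) / sum(measured)

def cv_rmse_percent(measured, simulated):
    """CV(RMSE) (%): root mean square error divided by the measured mean."""
    n = len(measured)
    rmse = math.sqrt(sum((m - s) ** 2 for m, s in zip(measured, simulated)) / n)
    return 100.0 * rmse / (sum(measured) / n)

def meets_thresholds(measured, simulated, mbe_limit=5.0, cv_limit=20.0):
    """Check both indicators against the limits quoted in the text."""
    return (abs(mbe_percent(measured, simulated)) < mbe_limit
            and cv_rmse_percent(measured, simulated) < cv_limit)

# Illustrative air temperatures in degrees C (made-up values, not measured data)
measured = [20.0, 22.0, 24.0]
simulated = [21.0, 21.0, 24.0]
print(mbe_percent(measured, simulated), cv_rmse_percent(measured, simulated))
```

Because positive and negative errors cancel in the MBE, the two indicators are normally read together: a near-zero MBE alone does not imply a close hourly match.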

The main performance indicators in the subsequent analysis involve both indoor and outdoor conditions, so the relevant meteorological elements had to be validated. Key factors affecting comfort include air temperature, black globe temperature, wind speed, and wind direction. Because wind direction fluctuates significantly over short intervals and is difficult to control by adjusting meteorological data parameters, the measured wind direction was used directly in the simulation analysis. The results indicate that the accuracy of black globe temperature, outdoor air temperature, and wind speed falls within the standard range, allowing their use in the subsequent performance analysis (Figure 5, Figure 6 and Figure 7). The building thermal parameters are referenced from local empirical data (Table 2). In addition, the trees in the courtyard were modeled: in Honeybee 0.0.66, a porosity schedule was set to reflect seasonal changes in the trees between summer and winter (Figure 8). In the subsequent simulations, the plants were hidden to speed up computation.

It is important to note that this study only verified the accuracy of outdoor meteorological elements and did not assess the accuracy of indoor conditions and building energy consumption for the following reasons:

(1): Honeybee’s simulation of indoor comfort is based on outdoor meteorological elements and does not offer a separate input interface for indoor data;
(2): This study focused on performance comparisons between multiple schemes, emphasizing the relative improvement or degradation in performance rather than the building performance of a single scheme;
(3): Since various factors influence buildings during operation, the load settings for building energy units can vary significantly, with no unified standard. Therefore, this study only explored the energy consumption of buildings under ideal conditions, examining the differences in energy consumption between various schemes and the correlation with building parameters.

2.2.3. Baseline Courtyards Model and Environment Setup

This study conducted a statistical analysis of over 100 courtyards in the site area, ultimately identifying 17 courtyard types after sorting and analysis (Table 3). These courtyards were categorized based on their degree of enclosure into one-sided, two-sided, three-sided, and four-sided, with areas ranging from 70 to 130 m² (Figure 9). For the analysis period of performance simulation, typical weeks from summer and winter (20–26 July, 23–29 December) were selected.

This stage explores the relationship between courtyard geometric elements (such as shape and form) and outdoor comfort (PET, MRT). Previous studies established that PET and MRT are widely recognized as effective indicators for evaluating outdoor thermal comfort in cold regions. Chen et al. conducted a quantitative analysis of outdoor thermal comfort in Harbin and determined that the acceptable PET range is 2.5–30.9 °C [26]. Yuan et al. further investigated urban–rural differences in outdoor comfort in cold regions, using PET as a benchmark for thermal evaluation [27]. Du et al. demonstrated that MRT significantly impacts outdoor thermal comfort among similar meteorological factors in severely cold areas [28]. Through empirical measurements, Krüger et al. validated the correlation between MRT and outdoor thermal comfort [29]. The combination of PET and MRT provides comprehensive and accurate information for outdoor comfort evaluation. The entire process is linked to the energy model, considering the impact of airflow and long-wave and short-wave radiation on outdoor comfort.

Additionally, the macro-scale environment was configured to closely match the actual site conditions for the environmental setup. The site is elevated to an altitude of 150 m, with mountains of 300 m on the north and south sides, accurately reflecting the real environment (Figure 10). While investigating mountainous villages, it was noted that some villages have rivers running through them. However, this study placed less emphasis on water bodies for the following reasons: First, the width of rivers in northern mountainous villages typically ranges from 2 to 4 m, providing limited cooling capacity in summer and generally drying up in winter. Second, due to software limitations, Honeybee’s ability to simulate the heat storage properties of water bodies is restricted, as it lacks a dedicated water system material library. At the meso-level of specific performance calculations, the courtyard sizes and the widths of surrounding streets are based on empirical data collected during the investigation. Courtyard boundaries were set at 20 by 20 m, with north-south streets at 4 m and east-west streets at 3 m (Figure 11).

2.3. Multi-Objective Genetic Optimization (MOGO) Based on Courtyards

This section employs the Wallacei_X plugin for algorithmic optimization using the baseline courtyard model. By setting performance targets for both indoor and outdoor environments, the plugin automatically identifies building parameter combinations that meet the specified criteria.

2.3.1. Design Parameters and Performance Objectives Selection

In contrast to the previous stage, which focused solely on outdoor comfort (PET, MRT) and courtyard geometric parameters (such as area and layout), this stage incorporates additional building parameter indicators alongside spatial forms, such as standard floor height and window-to-wall ratio (WWR), as outlined in Table 4. All parameter ranges are based on local courtyard construction practices. For example, the “Secondary-house Floor Control” parameter determines the floor height of the two secondary houses within the courtyard. A value of “−1” indicates that both secondary houses are one story high; “0” indicates that the left secondary house is one story high while the right is two stories; “1” indicates that the left secondary house is two stories high while the right is one story. The main house on the north side is set as two stories high by default, following traditional principles where the primary functions are located on the north side, with the east and west sides serving auxiliary functions. The WWR refers to the window-to-wall ratio of each building in the courtyard. W_S and W_N represent the window-to-wall ratios on the south and north sides of the main building, respectively; W_E1 and W_W1 represent the window-to-wall ratios on the east and west sides of the left secondary house, respectively; W_E2 and W_W2 represent those for the right secondary house. Based on local design traditions, no windows are included on other sides.
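The encoding of the “Secondary-house Floor Control” parameter described above can be written as a small lookup. This is only an illustrative sketch: the names `SECONDARY_FLOORS`, `MAIN_HOUSE_FLOORS`, and `decode_floor_control` are hypothetical and do not come from the Grasshopper definition used in the study.

```python
# Assumed mapping of the control value to (left, right) secondary-house stories,
# following the description of "Secondary-house Floor Control" in the text.
SECONDARY_FLOORS = {-1: (1, 1), 0: (1, 2), 1: (2, 1)}
MAIN_HOUSE_FLOORS = 2  # the north main house is two stories by default

def decode_floor_control(value):
    """Translate the control value into story counts per building."""
    if value not in SECONDARY_FLOORS:
        raise ValueError(f"floor control must be -1, 0, or 1, got {value!r}")
    left, right = SECONDARY_FLOORS[value]
    return {"main": MAIN_HOUSE_FLOORS, "left": left, "right": right}

print(decode_floor_control(0))
```

Keeping the mapping in one table makes the valid range of the gene explicit, which matters because the optimizer only sees the integer value.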

This paper identified performance objectives that encompass building energy consumption and both indoor and outdoor comfort (Table 5), with the specifics outlined as follows:

(1): Outdoor Comfort: This criterion is based on Honeybee’s outdoor comfort autonomy modules, using PET (Physiological Equivalent Temperature) as the evaluation index. The goal is to maximize the proportion of time within the 5–31 °C threshold, a range deemed acceptable for cold regions according to related studies [27,30]. The analysis period aligns with the benchmark model from the previous stage, focusing on typical weeks in summer and winter (20–26 July and 23–29 December, respectively). The results are denoted as “OTCA_C” for winter and “OTCA_H” for summer.
(2): Indoor Comfort: Similar to the outdoor comfort model, the indoor comfort autonomy modules assess the proportion of comfortable time under natural ventilation, utilizing the adaptive comfort model proposed by De Dear and Brager [31]. The analysis concentrates on the typical summer week (20–26 July). In northern China, where heating is provided from November to March, indoor comfort generally remains within comfort ranges under steady-state conditions, rendering further analysis of a typical winter week unnecessary. The results are represented by “ITCA_H”.
(3): Indoor Illuminance: This criterion evaluates indoor lighting conditions over a year using spatial Daylight Autonomy (sDA) as the index. Here, sDA measures the proportion of time in a year that achieves an illuminance level of 300 lux, in accordance with the “Standard for Lighting Design of Buildings” [32]. The results are indicated by “sDA”.
(4): Building Energy Consumption: Energy Use Intensity (EUI) was chosen as the evaluation standard, referring to the energy consumption per unit building area (kWh/m²) over one year. Since energy consumption values tend to be larger than other indicators, EUI was normalized to a range of 0–1 for easier comparison.

This analysis incorporated five target values: OTCA_C, OTCA_H, ITCA_H, sDA, and EUI. It is important to note that all comfort indices were calculated under natural ventilation conditions, excluding active energy-saving measures, due to the complexity of real-world factors. Each of these values was normalized to a 0–1 range. Since this study focused on the early design stage, the weights of these five indicators were initially set to 1, though practitioners can adjust them as needed for specific applications. Additionally, as Wallacei_X defaults to minimizing values for optimization, all target values except for EUI were set as negative values. For clarity in the presentation of results, all data in the Results section are displayed in absolute values.
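The comfort autonomy index and the sign convention for the optimizer can be sketched as follows. The sketch assumes that the 5–31 °C PET bounds are inclusive and that the weights simply multiply the normalized values; the function names are hypothetical, and the real values in the study come from Honeybee components rather than from code like this.

```python
def comfort_autonomy(pet_hourly, low=5.0, high=31.0):
    """Fraction of hours whose PET lies within [low, high] degrees C (bounds assumed inclusive)."""
    return sum(low <= t <= high for t in pet_hourly) / len(pet_hourly)

def fitness_vector(otca_c, otca_h, itca_h, sda, eui_norm, weights=(1, 1, 1, 1, 1)):
    """Express the five normalized objectives in minimization form.

    The four comfort and daylight indices are to be maximized, so they are
    negated; the normalized EUI is to be minimized and keeps its sign.
    """
    maximized = (otca_c, otca_h, itca_h, sda)
    vector = [-w * v for w, v in zip(weights, maximized)]
    vector.append(weights[4] * eui_norm)
    return vector

# Illustrative hourly PET values in degrees C (made-up numbers)
print(comfort_autonomy([4.0, 5.0, 20.0, 31.0, 35.0]))
print(fitness_vector(0.5, 0.6, 0.7, 0.8, 0.3))
```

Taking absolute values of the first four entries recovers the figures as they are reported in the Results section.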

2.3.2. Simulation Generation Setup

Based on the above discussion, building energy consumption and comfort encompass multiple performance evaluation metrics, making multi-objective optimization unavoidable. At this stage, multi-objective genetic optimization (MOGO) was configured. In recent years, MOGO has been widely applied in building form and envelope design [33,34,35,36]. In simple terms, MOGO uses genetic algorithms (GAs) to optimize multiple performance objectives and find Pareto solutions. A brief explanation of the Pareto optimal solution is necessary here: a solution is considered Pareto optimal if no other solution is strictly better in at least one objective variable while being equal to or better in all other objective variables [37]. Research has shown that Pareto front solutions offer significant advantages in addressing multi-objective problems. Wright et al. used the NSGA-II algorithm to optimize the window-to-wall ratio and window geometry, aiming for Pareto optimal solutions that balance building energy consumption with economic efficiency [38]. Similarly, Wang et al. conducted multi-objective optimization during the early design stages of green buildings, focusing on variables such as building orientation, floor plan shape, window type, window-to-wall ratio, and wall and roof materials. Their identified Pareto optimal solutions significantly reduced the building’s life cycle costs and environmental impact [39]. This study used Pareto optimal solutions to balance the five target variables, OTCA_C, OTCA_H, ITCA_H, sDA, and EUI, which are interconnected and sometimes conflicting in terms of their light and thermal mechanisms. The MOGO process was carried out using the Wallacei_X plugin for Grasshopper. The parameter settings are listed in Table 6 and are based on the relevant literature and several experimental trials [40,41].
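The Pareto optimality definition can be made concrete with a brute-force filter. This is a sketch of the dominance test only, assuming all objectives are expressed in minimization form; it is not the fast non-dominated sorting used inside NSGA-II or Wallacei_X, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Return the points that no other point dominates (quadratic-time filter)."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

# Two objectives, both minimized (made-up values)
candidates = [(1, 5), (2, 2), (3, 3), (5, 1)]
print(pareto_front(candidates))
```

In this toy set, (3, 3) is dropped because (2, 2) is better in both objectives, while the other three points each win on at least one objective and therefore remain on the front.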

2.4. Simulation and Correlation Analysis of Wind and Thermal Environment Performance Based on Street Networks

This section presents the street performance simulation based on the site, focusing on thermal and wind environments. For the thermal environment analysis, this study investigated the impact of various morphological indicators on PET at different analysis radii (10 m, 20 m, 50 m, 100 m) (Figure 12). To visually represent this impact, this analysis centered on the difference between the average PET within each analysis radius and the average PET of the entire site, referred to as ΔPET. The influencing factors considered include the street greenery ratio (G), floor area ratio (P), building density (D), average building height (H), total wall area (W), street average height-to-width ratio (R), weighted street radius (A), and the number of road intersections (I) for each analysis radius.

Multiple sets of meteorological data from northern cold regions were included in the simulation to ensure a robust sample size. The key evaluation parameters used were the significance test p-value and the Pearson correlation coefficient. Indicators not significantly associated with ΔPET were excluded based on these analysis results. After identifying the analysis radius where each indicator has the greatest impact on ΔPET and excluding weakly correlated indicators, a stepwise regression method was employed to establish predictive equations. These equations incorporate scales and morphological indicator types to maximize the dependent variable’s explanatory power. Figure 13 illustrates the analysis results for the 50 m radius, with red areas highlighting the simulation analysis zone. Given the wide range of certain design parameters like total wall area (W), normalization was applied according to Equation (1). Here, x' represents the normalized value, x is the original value, and \min(X) and \max(X) are the minimum and maximum values within the dataset X. This method scales the data to a [0, 1] range.

x' = \frac{x - \min(X)}{\max(X) - \min(X)} \qquad (1)
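The screening quantities used in this step can be sketched in a few lines: ΔPET, the min-max normalization of Equation (1), and the Pearson correlation coefficient. The function names and sample numbers are hypothetical; the significance test (p-value) and the stepwise regression itself are left out, since they would normally be run in a statistics package.

```python
import math

def delta_pet(radius_mean_pet, site_mean_pet):
    """Difference between the mean PET inside an analysis radius and the site-wide mean."""
    return radius_mean_pet - site_mean_pet

def min_max_normalize(values):
    """Equation (1): scale values to [0, 1]; a constant series maps to zeros (assumed convention)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Made-up total wall areas (m2) and delta-PET values for three sample zones
wall_area = min_max_normalize([200.0, 500.0, 800.0])
d_pet = [delta_pet(29.0, 30.0), delta_pet(30.0, 30.0), delta_pet(31.5, 30.0)]
print(wall_area, pearson_r(wall_area, d_pet))
```

Normalizing before correlation does not change the Pearson coefficient, but it keeps the later regression coefficients on comparable scales across indicators.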

For the wind environment analysis, street spaces were simulated using the Butterfly plugin for Ladybug Tools. The impact of the street space parameters on wind environment efficiency was summarized through data analysis. The evaluation criterion used was the comfortable wind speed ratio, which is defined as the ratio of the area with comfortable wind speeds to the total study area. The independent variables considered were street width and the street height-to-width ratio. The comfortable wind speed settings referenced the comfort range proposed by Professor Shuzo Murakami of the Architectural Institute of Japan, with summer comfort wind speeds ranging from 0.7 to 1.7 m/s and winter comfort wind speeds from 0.5 to 1.3 m/s [42]. The wind tunnel size settings followed the “Green Performance Calculation Standard for Civil Buildings” [43], which specifies that the vertical height from the top of the target building(s) to the upper boundary of the calculation domain should be greater than 5H; the distance from the outer edge of the target building(s) to the horizontal boundary of the calculation domain should also be greater than 5H; the horizontal distance from the inflow boundary to the outer edge of the target building(s) should be greater than 5H, and the horizontal distance from the outflow boundary to the outer edge of the target building(s) should be greater than 10H (Figure 14).
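The comfortable wind speed ratio and the minimum domain clearances can be expressed as a short sketch. It assumes the simulation result is available as a list of (cell area, wind speed) pairs and that the comfort bounds are inclusive; the names are hypothetical and nothing here calls the Butterfly plugin.

```python
# Comfort ranges in m/s as quoted in the text
COMFORT_RANGE = {"summer": (0.7, 1.7), "winter": (0.5, 1.3)}

def comfortable_wind_ratio(cells, season):
    """Area with comfortable wind speed divided by the total study area.

    cells: list of (area_m2, speed_m_s) pairs, one per analysis cell (assumed format).
    """
    low, high = COMFORT_RANGE[season]
    total_area = sum(area for area, _ in cells)
    comfortable_area = sum(area for area, speed in cells if low <= speed <= high)
    return comfortable_area / total_area

def min_domain_clearances(building_height):
    """Lower bounds on the clearances around the target building(s), in the same unit as H."""
    h = building_height
    return {"top": 5 * h, "lateral": 5 * h, "inflow": 5 * h, "outflow": 10 * h}

cells = [(10.0, 0.4), (10.0, 1.0), (20.0, 1.5)]  # made-up cells
print(comfortable_wind_ratio(cells, "summer"), comfortable_wind_ratio(cells, "winter"))
print(min_domain_clearances(6.0))
```

The standard requires each clearance to be greater than the returned value, so these figures are minimums rather than recommended domain sizes.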

2.5. Machine Learning Settings

This section outlines a comprehensive workflow for developing a machine learning algorithm to rapidly predict and assess the performance of courtyard building designs. The process encompasses data collection and preprocessing, model selection, hyperparameter tuning, and model evaluation. This study leveraged the Scikit-learn library within the Python platform, a widely recognized tool for machine learning. Detailed steps are provided in the following subsections.

2.5.1. Data Preprocessing

This study filtered the 3000 samples generated by MOGO, flagging the 135 samples with a window-to-courtyard-side ratio of 0 as noise and removing them. Performance labels were then assigned to the solution sets so that the machine learning models could learn from labeled parameter combinations. Two primary methods of data categorization were considered using Wallacei_X: the first selects solutions from the Pareto front, and the second uses unsupervised machine learning to produce cluster labels. Relevant studies have demonstrated that using Pareto front-based generational division can effectively evaluate the performance of datasets generated by MOGO [3]. Figure 15 illustrates the spatial distribution of the datasets from Generations 49 to 59 in a three-dimensional coordinate system, obtained through K-means clustering (an unsupervised machine learning method), where different colors represent different clusters. Table 7 presents randomly selected sample points from different clusters and their performance scores. The results show high similarity in the performance targets Outdoor_C and Indoor_sDA among Clusters 4, 5, and 6, while Clusters 1, 2, and 3 exhibit better differentiation. This can also be seen in Figure 15, where multiple clusters overlap in several areas of the three-dimensional coordinate system. Because the clusters are poorly separated, the Pareto front was chosen as the standard for data classification at this stage; the accuracy of Pareto front optimization is discussed in the Results section.

The initial dataset consisted of the Pareto front solution set and the dominated (non-Pareto) solution set obtained from optimization using Wallacei_X and was divided into six categories according to specific design requirements (Table 8). Pareto front solutions whose indoor illuminance indicator sDA is greater than or equal to 70% were categorized as A1, B2, and C3, in order of decreasing comprehensive performance. Other solutions in the Pareto front set that did not meet the 70% sDA requirement were categorized as D4. Additionally, data not in the Pareto solution set with an sDA greater than or equal to 70% were categorized as E5, while the rest were categorized as F6.
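The six-category rule can be sketched as a single labeling function. The 70% sDA threshold comes from the text, but the cut-offs that separate A1, B2, and C3 are placeholders: the actual criteria are given in Table 8 and are not reproduced here, so `cuts` and the `score` argument are purely hypothetical.

```python
def label_solution(on_pareto_front, sda, score, cuts=(0.66, 0.33)):
    """Assign one of the six categories to a solution.

    on_pareto_front: whether the solution belongs to the Pareto front set
    sda:   spatial Daylight Autonomy as a fraction (0.70 means 70%)
    score: comprehensive performance in [0, 1] (hypothetical aggregate)
    cuts:  hypothetical score cut-offs separating A1 / B2 / C3
    """
    if on_pareto_front:
        if sda < 0.70:
            return "D4"
        if score >= cuts[0]:
            return "A1"
        if score >= cuts[1]:
            return "B2"
        return "C3"
    return "E5" if sda >= 0.70 else "F6"

print(label_solution(True, 0.82, 0.71), label_solution(False, 0.55, 0.90))
```

Note that the Pareto membership is checked first, so a dominated solution can never receive an A1 to D4 label regardless of its score.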

2.5.2. Data Splitting

The dataset was split into an 80% training set and a 20% test set to evaluate prediction performance. The training set was further divided using five-fold cross-validation to limit bias, improve generalization, and obtain the average accuracy of the algorithm model. Model generalization was optimized using a grid search method, focusing on parameters such as the learning rate (learning_rate), the maximum tree depth (max_depth), and the number of decision trees (n_estimators). The specific settings were as follows: n_estimators = 150, max_depth = 6, gamma = 0.01, subsample = 0.6, and learning_rate = 0.2.
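A minimal sketch of this splitting scheme, written with the Python standard library only, is given below. It assumes that all 2865 samples remaining after noise removal enter the split; the function names, the random seed, and the way folds are formed are illustrative choices rather than the authors' implementation. The dictionary simply records the reported settings under the parameter names used by the XGBoost scikit-learn interface.

```python
import random

# Settings reported in the text; the key names follow the XGBoost
# scikit-learn interface (an assumption about how they were passed).
XGB_PARAMS = {
    "n_estimators": 150,
    "max_depth": 6,
    "gamma": 0.01,
    "subsample": 0.6,
    "learning_rate": 0.2,
}


def train_test_split_indices(n_samples, test_ratio=0.2, seed=0):
    """Shuffle sample indices and split them into (train, test) lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_ratio)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(train_indices, k=5):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    folds = [train_indices[i::k] for i in range(k)]
    for i in range(k):
        validate = folds[i]
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validate


# 3000 MOGO samples minus the 135 removed as noise (assumed dataset size).
train_idx, test_idx = train_test_split_indices(3000 - 135)
splits = list(k_fold_indices(train_idx, k=5))
```

Each of the five `(fit, validate)` pairs would be used to train and score one candidate parameter combination during the grid search, and the mean validation score would decide which combination is kept.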

2.5.3. Algorithm Selection, Model Setting, and Validation of Model Training Accuracy

In the following steps, appropriate machine learning models were chosen to train on the preprocessed data. For most structured data problems, such as predicting energy consumption or comfort ratings, tree-based models like Random Forest and XGBoost generally achieve higher accuracy than neural network-based models and deep learning approaches. Figure 16 compares the learning curves of various mainstream machine learning algorithms as a function of the number of training examples. The results show that, compared with AdaBoost and Random Forest, the XGBoost algorithm reaches a higher score of 0.875 and begins to converge gradually after 800 samples.

The XGBoost (eXtreme Gradient Boosting) algorithm, which evolved from the Gradient Boosting Machine method and was proposed in 2016, has been widely adopted for building performance prediction. Its distinguishing features include a gradient-boosting strategy and regularization terms, which enhance the model’s generalization ability (Figure 17). Additionally, XGBoost supports parallel processing, accelerating the training process. Based on these advantages, this study selects XGBoost as the algorithm model.

The XGBoost objective function at iteration t includes both a training loss, approximated by a second-order Taylor expansion, and a regularization term, and is expressed as follows:

L^{(t)} \approx \sum_{i = 1}^{n} [l (y_{i}, {\hat{y}}_{i}^{(t - 1)}) + g_{i} f_{t} (x_{i}) + \frac{1}{2} h_{i} f_{t}^{2} (x_{i})] + \Omega (f_{t})

(2)

Here,

g_{i} = \frac{\partial l (y_{i}, {\hat{y}}_{i}^{(t - 1)})}{\partial {\hat{y}}_{i}^{(t - 1)}}

represents the first-order derivative of the loss with respect to the prediction of the previous iteration, and

h_{i} = \frac{\partial^{2} l (y_{i}, {\hat{y}}_{i}^{(t - 1)})}{{(\partial {\hat{y}}_{i}^{(t - 1)})}^{2}}

represents the second-order derivative. Both are known values that can be calculated from the predictions of the previous iteration.
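As a worked illustration of how g_{i} and h_{i} are obtained, the two derivatives can be written out for two common loss functions. The choice of losses is an assumption made here for illustration only; the text does not state which loss was used in training.

```latex
% Squared-error loss: l = \frac{1}{2} (y_{i} - \hat{y}_{i}^{(t-1)})^{2}
g_{i} = \hat{y}_{i}^{(t-1)} - y_{i}, \qquad h_{i} = 1

% Binary logistic loss on the raw score, with p_{i} = \sigma(\hat{y}_{i}^{(t-1)})
g_{i} = p_{i} - y_{i}, \qquad h_{i} = p_{i} (1 - p_{i})
```

In both cases g_{i} and h_{i} depend only on the label and the previous prediction, which is why they can be treated as constants when the new tree f_{t} is fitted.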

After selecting the machine learning algorithm and training the model, it is necessary to validate its accuracy. Common evaluation metrics include accuracy, precision, recall, and F₁ score. The F₁ score is the harmonic mean of precision and recall and therefore balances the two, as shown in Formula (3). Its multi-class extension, Macro-F₁, averages the per-class F₁ scores and can evaluate the accuracy of multi-class classification problems. In this paper, Macro-F₁ was used to assess the accuracy of performance target predictions.

F_{1} = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}

(3)
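Formula (3) and its Macro-F₁ extension can be sketched in a few lines of plain Python. The function name is illustrative, and the convention of scoring a class as 0 when precision and recall are both undefined is an assumption, not something specified in the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # Formula (3); a class with no true positives scores 0 by convention here.
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)
```

Because every class contributes equally to the mean, a rating with few samples weighs as much as a frequent one, which suits a six-label problem where the rarer ratings matter as much as the common ones.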

2.6. Solution Validation

This section validates the above results through a real-world project. Figure 18 and Figure 19 illustrate the current state of a site within the research area. Due to site boundary constraints, rapid population growth has resulted in high building density within the village. The dispersed building mass has a high shape factor, which negatively impacts building energy efficiency. The high height-to-width ratio in both streets and courtyards also impairs winter sunlight exposure. To address these issues, planning must focus on integrating building masses by consolidating scattered small volumes into larger ones, thereby improving energy efficiency and comfort through spatial design. This study explored two intervention strategies: first, it employed street design strategies to selectively widen and merge the existing road network; second, it mapped the previously obtained optimization prototypes onto the design scheme based on actual functional requirements. The trained machine learning model then evaluated the performance of the proposed planning scheme, and the improvement in performance relative to the original model was calculated to select the optimal scheme. This workflow, which integrates performance evaluation of streets and courtyards, aims to enhance design efficiency and provide a scientific approach to renovating and designing traditional villages.

3. Results

3.1. Baseline Model Selection

Figure 20 shows the distribution of mean radiant temperature (MRT) of various types of traditional courtyards in a typical winter week (23–29 December). Overall, the enclosed courtyard has a specific shielding effect on the average radiation compared with the outdoors. For four-sided courtyards, the MRT value of 4c is the lowest between 10:00 and 15:00. It is speculated that due to the low sun altitude angle in winter, the two-story building on the southeast side severely blocks the sun. The overall MRT value of 1c was higher. It is speculated that the enclosed courtyard area is the largest and receives the most solar radiation in winter, so the average value was also the highest. For three-sided courtyards, the overall difference was not significant. The MRT value of 9c was slightly higher. It is speculated that the reason for this is related to the less blockage on the south side and the north-south direction of the courtyard. For two-sided and one-sided courtyards, 11c had the highest MRT value. The reason for this is similar to that of 9c, which is related to the mutual blocking relationship between courtyard buildings. A horizontal comparison of the MRT distribution of courtyards in each period shows that the MRTs of 1c and all three-sided courtyards are generally higher than others. The reason for this is speculated to be that under the premise that the boundaries of the homestead are limited, the courtyard area obtained by three-sided was the largest, and the average MRT was the highest.

An hourly analysis of the winter MRT of each group of courtyards shows that the primary factor affecting the MRT value is the courtyard area: the larger the area, the higher the average MRT value. Second, in terms of courtyard type, since the areas of the investigated courtyards are relatively close under the same conditions, the courtyard area of three-sided courtyards accounts for a higher proportion than that of quadrangles, while the area of two-sided and one-sided courtyards is smaller due to their relatively simple functions. Overall, the average MRT value of three-sided courtyards is higher than that of other courtyard types. Third, in terms of plan aspect ratio, the MRT value of courtyards extending north-south is significantly higher than that of square courtyards and courtyards extending east-west, assuming the areas are similar. For instance, in Figure 20, the MRT value of 9c is higher overall than that of 7c and 8c. Fourth, regarding the number of building floors, the height of the buildings on the east and west sides affects the MRT value of the courtyard to varying degrees. Taking three-sided courtyards as an example, a comparison of 9c and 10c shows that the MRT values of the two are close before 13:00, after which the MRT value of 10c drops markedly. Similar results appear for 2c and 3c. Finally, in terms of orientation, the MRT distribution trends of east-west and north-south orientations differ across periods. The MRT distribution of courtyards oriented east-west is slightly better than that of courtyards oriented north-south. For example, the overall MRT value of 6c is slightly better than that of 7c.

Figure 21 shows the distribution of MRT of various types of traditional courtyards in a typical summer week (20–26 July). Overall, the enclosed courtyards had lower MRT values than the outdoors. For four-sided courtyards, the overall MRT value of 4h was relatively low. It is speculated that the courtyard height-to-width ratio had a greater impact than other factors: 4h had the largest aspect ratio among all courtyards, so its MRT value was also lower. Similar to the results obtained in winter, the overall MRT value of 1h was higher due to the relatively large courtyard area. For three-sided courtyards, the overall difference was not significant. The MRT values of 6h and 8h were relatively low, presumably because east-west-oriented buildings provide better shading than north-south-oriented ones. For one-sided and two-sided courtyards, the MRT values of 14h and 17h were relatively low in specific periods, presumably because of the building orientation and height-to-width ratio. A horizontal comparison of the MRT distribution in courtyards across periods shows that the MRTs of two-sided and one-sided courtyards are generally lower than others. First, it is speculated that this is related to the height-to-width ratio of the courtyards, which is close to 1 to 1.5 and significantly higher than that of other courtyard types. Second, in terms of courtyard type, the average MRT value of two-sided courtyards was relatively low compared with other types, indicating that under conditions of limited land use, this small-scale courtyard type is more suitable for summer heat insulation. Finally, in terms of orientation, the average MRT value of east-west-oriented courtyards in each period is lower than that of north-south-oriented courtyards.

By simulating the MRT of each courtyard in summer and winter, the results show that factors such as courtyard type, building height-to-width ratio, and courtyard area affect the distribution of MRT in summer and winter to varying degrees. The courtyard type has an essential impact on MRT in both summer and winter. That is, three-sided is suitable for increasing MRT values in winter, and two-sided is suitable for reducing MRT values in summer. This also verifies from the side that most of the courtyards surveyed in Shandong Province are three-sided, followed by two-sided, indicating that the actual demand for cold protection in winter is higher than heat insulation in summer.

Figure 22 shows the statistical results of the PET simulation analysis. Due to space limitations, this paper only presents the distribution of the average PET levels for each type of courtyard. The data indicates that the courtyard type with the lowest PET value in summer is the one-sided courtyard, likely due to its larger scale and the shading provided by buildings, creating a relatively comfortable outdoor environment. In winter, the four-sided courtyard and three-sided courtyard have the highest PET values, probably because of their larger areas, ample solar radiation, and a higher proportion of two-story buildings on the north side, which better resist the cold winter wind. Additionally, the low winter sun angle results in the one-sided courtyard having the lowest PET value due to self-shading by the buildings.

Considering the combined judgment of PET and MRT, along with economic costs, the three-sided courtyard was selected as the basic model for further research. To facilitate the comparison and optimization of improvement ratios in later stages, this study selected the most frequently occurring three-sided courtyard within the study area as the baseline model and calculated its performance scores. The results are shown in Figure 23.

3.2. Simulation of Courtyard Performance

After multi-objective genetic optimization (MOGO) for courtyards, 3000 solution sets were generated. The most recent ten generations of Pareto frontier solutions and non-frontier solutions, totaling 500 solution sets, were filtered using Wallacei_X and were distributed in the 3D coordinate system, as shown in Figure 24. The Pareto front, formed by the optimal solutions, is represented by the 3D mesh surface in the figure. These solution sets represent their respective performance attributes in different spatial positions, with the X and Y axes representing the percentages of outdoor comfort time in winter and summer (OTCA_C and OTCA_H), respectively, and the Z axis representing the percentage of indoor comfort time in summer (ITCA_H). The color gradient from green to red indicates indoor illuminance comfort (sDA) from suitable to unsuitable, and the size of the points reflects building energy use intensity (EUI) from high to low. Figure 25 shows the planar layout of the latest generation (59th generation) of Pareto frontier solution sets. Figure 26 shows the distribution of these solution sets in a Parallel Coordinate Plot, with five vertical axes from left to right representing OTCA_C, OTCA_H, ITCA_H, sDA, and EUI in sequence. The closer to the lower end of the vertical axis, the better the optimization. As shown in the figure, these data sets are well distributed in the plot, and most are near the lowest value.
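The separation of Pareto frontier solutions from non-frontier solutions rests on the notion of dominance, which can be sketched as follows. This is a generic illustration, not Wallacei_X's internal code; it assumes every objective is expressed so that lower values are better, as the parallel coordinate plot suggests, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the non-dominated subset of a list of objective vectors."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

With five objectives per solution (OTCA_C, OTCA_H, ITCA_H, sDA, and EUI), each vector would have five entries; the pairwise comparison is quadratic in the number of solutions, which is acceptable for a few thousand samples.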

Wallacei_X offers several common optimization selection methods, such as the Average of Fitness Ranks and the Relative Difference Between Fitness Ranks. The Average of Fitness Ranks calculates the average value after ranking all individuals in the population, while the Relative Difference Between Fitness Ranks assesses the diversity and competitive pressure of the population by ranking the relative difference values between two individuals. Wallacei_X selects the optimal solution based on the best performance across these two selection methods. The right halves of Figure 27 and Figure 28 display the data distribution of these two indicators using Parallel Coordinate Plots. The left halves show the plan views of the schemes corresponding to the evaluation indicators and their specific design parameter values. The two schemes share common characteristics: the courtyards extend along the east-west axis and face southwest, with two-story side houses located on the west side.
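One plausible reading of the Average of Fitness Ranks is sketched below: each solution is ranked separately on every objective, and the ranks are then averaged. This is an interpretation of the description above, not Wallacei_X's actual implementation; the function names are hypothetical, objectives are assumed to be minimized, and ties are broken by position in the list.

```python
def fitness_ranks(values):
    """Rank values in ascending order (0 = lowest, i.e. best when minimizing)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return ranks


def average_fitness_rank(population):
    """Average each solution's rank across all objectives."""
    n_objectives = len(population[0])
    per_objective = [
        fitness_ranks([solution[k] for solution in population])
        for k in range(n_objectives)
    ]
    return [
        sum(ranks[i] for ranks in per_objective) / n_objectives
        for i in range(len(population))
    ]
```

The solution with the lowest average rank is the one that performs well on all objectives at once rather than excelling on a single objective.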

Additionally, it can be observed that there is a trade-off between the improvement of building energy efficiency (EUI) and indoor comfort parameters (sDA and ITCA_H). As shown in Figure 27, a 17% improvement in building energy efficiency leads to a 9.3% reduction in sDA. Similar trends can be seen in Figure 28. The characteristics of this data distribution can provide architects with valuable reference points during the early design stages.

3.3. Simulation and Prediction Equation of Street Space Performance

3.3.1. Analysis of Street Thermal Environment Performance

During the calculation of the correlation of alley space indicators, a p-value of less than 0.05 was considered significant. After filtering, the indicators strongly correlated with ΔPET_C included G_20, D_50, H_50, W_100, R_100, and A_50, and those strongly correlated with ΔPET_H included G_50, P_50, H_50, W_100, and R_100 (Figure 29). Based on this, stepwise regression calculations were carried out to establish predictive equations that include different scales and types of form indicators to ensure the highest explanatory power for the dependent variable.

For stepwise regression analysis, G_20, D_50, H_50, W_100, R_100, and A_50 were used as independent variables, and ΔPET_C was used as the dependent variable. After automatic model identification, three variables, D_50, R_100, and A_50, remained in the model, with an R-squared value of 0.353, meaning that they can explain 35.3% of the variation in ΔPET_C. The model passed the F-test (F = 23.437, p < 0.001), indicating that the model is effective. All VIF values in the model were less than 5, indicating no multicollinearity problems, and the D-W value was close to 2, indicating no autocorrelation; the sample data were therefore not interrelated, and the model is robust. The regression coefficient of D_50 was 0.015 (t = 2.342, p = 0.001 < 0.01), that of R_100 was 0.048 (t = 4.241, p = 0.000 < 0.01), and that of A_50 was 0.031 (t = 3.362, p = 0.001 < 0.01). In summary, D_50, R_100, and A_50 all have significant positive effects on ΔPET_C. The predictive model equation is as follows:

ΔPET_C = 0.003 + 0.015 × D_50 + 0.048 × R_100 + 0.031 × A_50

(4)

For stepwise regression analysis, G_50, P_50, H_50, W_100, and R_100 were used as independent variables, and ΔPET_H was used as the dependent variable. After automatic model identification, four variables, G_50, H_50, W_100, and R_100, remained in the model, with an R-squared value of 0.493, meaning that they can explain 49.3% of the variation in ΔPET_H. The model passed the F-test (F = 20.425, p < 0.001), indicating that the model is effective. All VIF values in the model were less than 5, indicating no multicollinearity problems, and the D-W value was close to 2, indicating no autocorrelation; the sample data were therefore not interrelated, and the model is robust. The regression coefficient of G_50 was 0.158 (t = 4.347, p = 0.000 < 0.01), that of H_50 was 0.018 (t = 2.552, p = 0.013 < 0.05), that of W_100 was 0.002 (t = 2.535, p = 0.013 < 0.05), and that of R_100 was 0.022 (t = 3.861, p = 0.000 < 0.01). In summary, G_50, H_50, W_100, and R_100 all have significant positive effects on ΔPET_H. The predictive model equation is as follows:

ΔPET_H = 1.006 + 0.158 × G_50 + 0.018 × H_50 + 0.002 × W_100 + 0.022 × R_100

(5)
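Equations (4) and (5) translate directly into two small prediction functions, shown here as a sketch. The function and argument names are illustrative; the inputs are assumed to be in the same units as the form indicators used to fit the regressions, which this section does not restate.

```python
def delta_pet_c(d_50, r_100, a_50):
    """Equation (4): predicted ΔPET_C from D_50, R_100 and A_50."""
    return 0.003 + 0.015 * d_50 + 0.048 * r_100 + 0.031 * a_50


def delta_pet_h(g_50, h_50, w_100, r_100):
    """Equation (5): predicted ΔPET_H from G_50, H_50, W_100 and R_100."""
    return 1.006 + 0.158 * g_50 + 0.018 * h_50 + 0.002 * w_100 + 0.022 * r_100
```

Given the R-squared values of 0.353 and 0.493, the outputs are better read as indications of direction and relative magnitude when comparing street layouts than as precise forecasts.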

3.3.2. Analysis of Street Wind Environment Performance

Figure 30 and Figure 31 illustrate the relationship between alley wind efficiency and spatial scale. The results in Figure 30 show that, overall, the comfortable wind speed ratio in winter is slightly higher than in summer. When the alley width is fixed, the comfortable wind speed ratio shows a significant positive correlation with building height. Conversely, when building height is fixed, the comfortable wind speed ratio initially increases and then decreases as the alley width increases. Figure 31 shows that within a certain range of height-to-width ratios, the comfortable wind speed ratio in both summer and winter initially increases and then decreases with an increase in the alley height-to-width ratio. Additionally, an increase in the height of buildings on both sides of the alley significantly improves the comfortable wind speed ratio. In winter, when the building height reaches 6 m, the comfortable wind speed ratio continues to increase steadily after reaching a high value, unlike in summer, where this does not occur. Based on the current conditions, it can be concluded that an alley width of 1.5 m has the highest comfortable wind speed ratio among all spatial scales. Regarding the height-to-width ratio of the alleys, in summer, a building height-to-width ratio of 1.5 results in the maximum comfortable wind speed ratio. In winter, for buildings with a height of 6 m or more, a height-to-width ratio of 3 is needed to achieve the maximum comfortable wind speed ratio. For buildings below 6 m, a height-to-width ratio of 1.5 results in the maximum comfortable wind speed ratio.
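The height-to-width findings above can be condensed into a simple lookup, sketched here for illustration. The function name is hypothetical, and the rule only encodes the optima reported in this paragraph for the studied site; it is not a general design rule.

```python
def recommended_height_to_width_ratio(season, building_height_m):
    """Alley height-to-width ratio reported to maximize the comfortable wind speed ratio."""
    if season not in ("summer", "winter"):
        raise ValueError("season must be 'summer' or 'winter'")
    # In winter, buildings of 6 m or more reach their maximum at a ratio of 3.
    if season == "winter" and building_height_m >= 6.0:
        return 3.0
    # Summer, and winter with buildings below 6 m, peak at a ratio of 1.5.
    return 1.5
```

For example, a 7 m building calls for a ratio of 3 in winter but 1.5 in summer, so a designer has to decide which season takes priority for that alley.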

Overall, combining existing studies on wind comfort ratios with alley space factors shows that different combinations of building heights, alley widths, and height-to-width ratios produce different comfortable wind speed ratios. In practice, these analysis charts can be used to select the best-performing alley dimensions for a given set of design constraints. In actual project implementation, a 4 m fire lane should be considered for new buildings, whereas this constraint does not apply to renovated buildings. Therefore, the design needs to account for each situation accordingly.

3.4. Machine Learning Validation

3.4.1. Analysis of the Influence of Courtyard Design Parameters

Figure 32 shows the sensitivity analysis of various design variables using XGBoost, where higher values indicate that changes in those variables have a greater impact on building performance. In addition to the previously discussed design parameters, H/L and H/W ratios have been included in this analysis (where H represents the height of the building on the north side of the courtyard, and L and W represent the length and width of the courtyard, respectively) to better understand how changes in courtyard spatial relationships affect overall performance. The performance here specifically refers to the impact on performance ratings ranging from A to F, as discussed earlier. As illustrated in the figure, the courtyard’s width, the height of the building on the east side of the courtyard, and the angle of the courtyard have the highest impact on performance ratings. Regarding the window-to-wall ratio, W_W1 has the highest influence compared with other locations. Regarding courtyard spatial relationships, the H/W ratio has a greater impact than the H/L ratio, indicating that the horizontal expansion of the courtyard has a more significant effect on building performance ratings than vertical expansion.

3.4.2. Verification Results of Machine Learning Accuracy

In this study, the design parameters were input into the XGBoost model to verify the accuracy of the machine learning-based ratings. Design parameters corresponding to different performance levels were entered into the machine learning algorithm, with 20 samples taken for verification for each performance label. As shown in Table 9, the rightmost column, “Prediction accuracy”, displays the average accuracy of the machine learning model, which ranges from 75% to 89%, with an overall average accuracy of 83.83%. Additionally, to demonstrate the performance improvement relative to the baseline model, one sample’s performance score was selected from every 20 samples to represent that performance level. For example, the scheme with Index 1 showed a 27% and 1% reduction in Outdoor_C and Outdoor_H, respectively, but a 12% and 11% increase in Indoor_H and sDA, respectively. In terms of building energy consumption, energy consumption increased by 24%. By observing the parameter values corresponding to each performance rating, it can be seen that the XGBoost algorithm, after considering multiple performance objectives, places relatively more emphasis on improving Indoor Comfort. This is closely related to the data labeling process, where sDA was set as the second step rating standard based on the Pareto front screening. Overall, except for the energy consumption of the C-level label, which is relatively low, other performance labels generally show an increase in energy consumption. At the same time, the C-level label is the performance rating closest to the baseline model. This indicates, to some extent, that the spatial design of traditional courtyards is relatively energy-efficient but has room for improvement in terms of comfort.

3.5. Application and Performance Analysis Statistics

As shown in Figure 33, the scheme performance evaluation process is depicted from left to right, representing different design stages. The first image shows the “Status of the Base”, indicating the site’s current state. The site can be divided into three parts: Retail and dining area, Master’s studio, and Cultural exhibition area, according to its functions for subsequent scheme comparisons. Building parameter information within each functional area was input into the machine learning model to calculate the average performance rating for each location. Ratings from F to A were assigned values of 0.15, 0.30, 0.45, 0.60, 0.75, and 1.00, respectively. Based on the area proportion of each functional zone, the Overall Performance Score was calculated, with the final score ranging between 0 and 1. As illustrated, the Overall Performance Score for the initial base state was 0.55. The second and third images, representing Plan 1 and Plan 2, indicate proposals for preservation and new construction. The preservation-focused design respects the original village texture but results in limited overall performance improvement. The new construction design is more integrated in its overall layout, achieving a higher comprehensive performance score but showing less respect for the traditional context. The final plan integrates elements from both middle proposals, adjusting the design to include functional spaces honoring the original village texture while incorporating large spaces to meet modern functional demands. The Overall Performance Score aligns with that of Plan 2. The calculations show that the final plan improves overall performance by 36% compared with the status of the base. It is important to note that for large spaces (e.g., the Master’s Studio), no suitable training model is available for performance evaluation. Instead, the performance values were assessed directly by simulating energy consumption and comfort. 
These values were then compared with the corresponding performance scores for each rating category to identify the closest rating. Regarding alley space control, the north-south streets marked in red represent the main pedestrian pathways, which were appropriately widened in the design. The internal alleys marked in blue were delineated and designed according to the specific needs of each scheme.
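The scoring arithmetic described above can be sketched as an area-weighted average of rating values. The rating values and the scores 0.55 and 0.75 come from the text; the function names, the per-building rating lists, and the example areas are hypothetical.

```python
# Rating values assigned in the text, from F (lowest) to A (highest).
RATING_VALUES = {"F": 0.15, "E": 0.30, "D": 0.45, "C": 0.60, "B": 0.75, "A": 1.00}


def zone_score(ratings):
    """Mean rating value of the buildings in one functional zone."""
    return sum(RATING_VALUES[r] for r in ratings) / len(ratings)


def overall_performance_score(zones):
    """Area-weighted mean of zone scores; zones is a list of (ratings, area) pairs."""
    total_area = sum(area for _, area in zones)
    return sum(zone_score(ratings) * area for ratings, area in zones) / total_area


def relative_improvement(baseline, optimized):
    """Improvement of the optimized score relative to the baseline score."""
    return (optimized - baseline) / baseline


# Hypothetical example: two zones of equal area.
example = overall_performance_score([(["A", "C"], 100.0), (["F"], 100.0)])

# Scores reported in the paper: 0.55 for the base state, 0.75 for the final plan.
improvement = relative_improvement(0.55, 0.75)
```

With the reported scores, `improvement` is about 0.364, which matches the 36% improvement stated for the final plan.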

In summary, the validation of the above projects demonstrates that this workflow has effectively achieved the goal of integrating performance evaluation with design in the early stages. The process comprehensively considers aspects such as road boundary optimization, functional zoning, and performance evaluation, and it provides real-time feedback on performance assessments.

4. Discussion

Research on the performance simulation of traditional villages often focuses on the thermal insulation of walls and windows, energy-saving strategies for HVAC systems, and other micro-level aspects [44,45,46,47,48]. However, there is rarely an emphasis on design-oriented strategies, such as the spatial relationship between courtyards and streets, to reduce building energy consumption and enhance comfort. This study analyzed the microclimate environment of villages from the two dimensions of courtyards and streets using the XGBoost algorithm. By training a dataset generated from design, a prediction model was obtained. This prediction model allows for rapid performance assessment of planning schemes, significantly reducing time and economic costs. It changes the usual design process where performance prediction can only be conducted after the completion of planning and design, allowing for relatively accurate forecasts at the early stages of planning and design. However, this design process still has certain limitations. Firstly, the current analysis results of street performance only apply to optimizing the original scheme, with limited correlation analysis between streets and courtyards. Including street data in machine learning would greatly increase time costs, and the presence of many different dimensional evaluation indicators would also reduce the accuracy of performance predictions. Secondly, in the stage of generative design based on multi-objective optimization, the involvement of multiple goals and the time cost of thousands of iterations create difficulties for subsequent applications. Lastly, regarding the machine learning algorithm, this study selected the XGBoost model, which is widely used in the professional field. However, the generalizability of its classification prediction model requires further exploration. If design variables are changed or the prediction target is modified, the prediction accuracy may vary.

5. Conclusions

At the early planning and design stage, this paper proposes a design workflow that combines performance-based generative design with machine learning. During the performance simulation stage, the study conducted performance prediction analysis for courtyards and street spaces. In the analysis of street performance, regression analysis of thermal comfort indicated that D_50, R_100, and A_50 can explain 35.3% of the variance in ΔPET_C, while G_50, H_50, W_100, and R_100 can explain 49.3% of the variance in ΔPET_H. Analysis of the wind environment shows that a street width of 1.5 m has the highest comfortable wind speed ratio in various spatial scales. For street height-to-width ratios, in summer, a building height-to-width ratio of 1.5 achieves the highest comfortable wind speed ratio. In winter, except for buildings 6 m or higher, where the ratio needs to be 3, buildings lower than 6 m achieve the highest comfortable wind speed ratio at a height-to-width ratio of 1.5. In the performance analysis of courtyards, a comprehensive analysis of various types of courtyards’ PET and MRT shows that three-sided courtyards perform best overall. In the courtyard performance simulation stage, the models selected and optimized through MOGO, based on the Average of Fitness Ranks and the Relative Difference between Fitness Ranks, showed improvements in building comfort and energy consumption. Notably, both sDA and ITCA_H decreased to varying degrees during the optimization process. In the machine learning stage, this study verified the accuracy of performance labels, showing 89% accuracy for A-rated performance and 88% for F-rated performance, with a relatively lower accuracy of 75% for D-rated performance. 
Observing the performance prediction analysis results, the baseline model, although classified as C under existing performance rating rules, had the lowest energy use intensity (EUI), indicating that traditional village courtyard designs perform well in passive energy saving and have significant potential for improving comfort. Models rated A showed significant improvements in indoor comfort, with Indoor_H and Indoor_sDA increasing by 12% and 11%, respectively, while outdoor comfort and building energy consumption decreased, especially with a 42% reduction in EUI efficiency. This result relates to the selection of performance labels, where the paper prioritizes indoor illuminance sDA following the Pareto frontier, biasing subsequent machine learning training results towards improved indoor comfort. Results would vary if the selection criteria were changed to outdoor comfort indicators like PET or building energy consumption EUI. Practitioners can adjust criteria as needed during actual operations. Finally, in the scheme validation phase, this study selected a real site within the study area for validation. After assessing the performance of the original site project, two rounds of scheme design were conducted based on different development objectives. Compared with the original performance score of 0.55, the optimized planning scheme achieved a 36% improvement, reaching a score of 0.75. Overall, the workflow based on generative design and machine learning can be applied at the early stage of planning and design, providing relatively professional advice and reference for practitioners in optimizing building and outdoor environment performance. The optimized models at each design stage showed performance improvements over the baseline scheme. Street prediction and machine learning algorithm models can also offer specific design suggestions and timely performance evaluation feedback for planning schemes.

However, this study has some limitations. First, the research focuses on traditional villages in northern China, most of which are located in mid-to-high-altitude mountainous areas. Unlike villages on plains, which often have multiple courtyards, these mountain villages are constrained by topography and typically have single courtyards. As a result, the predictive model based on the three-sided courtyard has limited applicability to courtyards on flat terrain. Future research could build on this workflow to analyze courtyard combinations in plain villages. In specific application scenarios, planners can incorporate meteorological data for different climate zones into the model calculations and develop appropriate machine learning models by following the workflow. Second, given the complexity of real-world conditions, the performance simulation of village courtyards in this study primarily compares schemes against each other rather than analyzing a specific scheme in detail. This approach inevitably simplifies real conditions, such as the thermal performance of water systems and the effect of plant transpiration on the atmospheric environment. Lastly, this study focuses on energy consumption and thermal comfort, but other factors that affect user experience, such as noise pollution and indoor glare, were also observed during field research. Future studies should incorporate a comprehensive analysis of the acoustic, light, and thermal environments to simulate the real microclimate of villages more accurately and to provide better-grounded design guidance for planning practitioners.

Author Contributions

Conceptualization, X.L.; Methodology, Z.X. and X.L.; Software, Z.X., B.S., Y.W. and P.T.; Validation, Z.X., B.S. and P.T.; Formal analysis, Z.X.; Investigation, B.S. and Y.W.; Resources, Z.X., X.L. and P.T.; Data curation, Z.X.; Writing—original draft, Z.X.; Writing—review & editing, Z.X.; Project administration, X.L.; Funding acquisition, Z.X. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded by the Postgraduate Research & Practice Innovation Program of Jiangsu Province (Grant No. KYCX19_0090).

Data Availability Statement

Restrictions apply to the datasets: The datasets presented in this article are not readily available because the data are part of an ongoing study.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Overall workflow.

Figure 2. Distribution map of traditional villages in Shandong province.

Figure 3. Location map of the study subject.

Figure 4. Site modeling.

Figure 5. Comparison of simulated and measured outdoor air temperature.

Figure 6. Comparison of simulated and measured globe temperature.

Figure 7. Comparison of simulated and measured wind speed.

Figure 8. The schedule for controlling tree porosity (The porosity changes are reflected by constructing a schedule in CSV format. The numbers in the upper red box represent the changes in porosity within 12 months, and the red box below represents the time of each month).

Figure 9. Classification of courtyard types.

Figure 10. Macro-scale site environment modeling settings.

Figure 11. Meso-scale baseline model modeling settings.

Figure 12. The corresponding simulation calculation areas for different analysis radii (left for a 20 m analysis radius, right for a 50 m analysis radius, with the calculation area within the red circle in the diagram).

Figure 13. Results of design variables and target values within the area under the 50 m analysis radius.

Figure 14. Parameter settings for wind environment performance simulation using the Butterfly plugin.

Figure 15. The spatial distribution of solutions based on K-means clustering from Generation 49 to 59.

Figure 16. Learning curves of different algorithm models.

Figure 17. Diagram of the XGBoost regression tree model.

Figure 18. Solution validation area.

Figure 19. Site aerial view.

Figure 20. Hourly mean radiant temperature during a typical winter week.

Figure 21. Hourly mean radiant temperature during a typical summer week.

Figure 22. The relationship between courtyard categories and PET.

Figure 23. Chart of baseline model design parameters and performance scores.

Figure 24. Distribution of Pareto frontier solutions (right) and non-frontier solutions (left) in the 3D coordinate system from Generation 49 to 59.

Figure 25. Layout of the 59th generation Pareto front solution set.

Figure 26. Distribution of the 59th generation Pareto front in a Parallel Coordinate Plot.

Figure 27. Selection results based on Average of Fitness Ranks.

Figure 28. Selection results based on Relative Difference between Fitness Ranks.

Figure 29. Statistical correlation coefficients of alley space indicators with ΔPET_H and ΔPET_C under different calculation radii. G represents the street greenery ratio, P represents the floor area ratio, D represents building density, H represents the average building height, W represents the total wall area, R represents the average street height-to-width ratio, A represents the weighted street radius, and I represents the number of road intersections. “*” indicates that the p value is less than 0.05, meaning the correlation between the variables is statistically significant. “**” indicates that the p value is less than 0.01, meaning the correlation is significant at a stricter level.
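The star notation in the caption of Figure 29 corresponds to a simple threshold rule on the p value. A minimal sketch, with a hypothetical function name, could look like this:

```python
def significance_marker(p_value: float) -> str:
    """Map a p value to the star notation used in the Figure 29 caption."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p value must lie in [0, 1]")
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


# Hypothetical p values for three indicators, for illustration only.
markers = [significance_marker(p) for p in (0.005, 0.03, 0.2)]
```

Here the three example values map to "**", "*", and no marker, respectively.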

Figure 30. Variation of comfort wind speed ratio with street width for different building heights ((left) shows a typical winter week, (right) shows a typical summer week).

Figure 31. Variation of comfort wind speed ratio with street height-to-width ratio for different building heights ((left) shows a typical winter week, (right) shows a typical summer week).

Figure 32. Sensitivity analysis of all design variables.

Figure 33. Flowchart of design schemes.

Table 1. Distribution statistics of village elevation.

Elevation (m)	<50	50–100	100–200	200–500	>500
Number of Villages	159	64	134	140	22
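As a quick reading aid for Table 1, the sketch below converts the village counts into shares per elevation band. The counts are copied from the table; the dictionary keys and variable names are illustrative.

```python
# Village counts per elevation band (m), copied from Table 1.
village_counts = {
    "<50": 159,
    "50-100": 64,
    "100-200": 134,
    "200-500": 140,
    ">500": 22,
}

total_villages = sum(village_counts.values())
shares = {band: count / total_villages for band, count in village_counts.items()}

# Villages located at 100 m or higher (the three upper bands).
at_or_above_100 = (
    village_counts["100-200"] + village_counts["200-500"] + village_counts[">500"]
)

for band, share in shares.items():
    print(f"{band:>8} m: {share:.1%}")
```

With these counts, 296 of the 519 villages lie at 100 m or higher.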

Table 2. Energy model settings.

Index	Energy Model Parameters
Wall	5 mm Cement mortar + 370 mm Fired clay brick; U-value = 1.72 W·m⁻²·K⁻¹
Floor	10 mm Ceramic tile + 100 mm Concrete + 1500 mm Plain soil compaction; U-value = 0.47 W·m⁻²·K⁻¹
Roof	Color steel plate + 10 mm Asphalt felt + 10 mm Grass clay + 200 mm Cement mortar; U-value = 1.94 W·m⁻²·K⁻¹
Glazing	Wooden Glass; U-value = 5.03 W·m⁻²·K⁻¹; SHGC = 0.6
Shading	Not applied
Equipment loads per area	3.7 W·m⁻²
Infiltration rate per area	0.4 cfm/sf facade @ 75 Pa
Lighting density per area	11.8 W·m⁻²
Num. of people per area	0.03 people·m⁻²
Schedules	Default Honeybee residential schedules / schedule for controlling tree porosity
HVAC	Ideal mechanical system

Table 3. Classification process of courtyards.

Classification	Type	No.	Orientation	Location *	No. of 2-Storey Buildings	Category Code **	Total Categories
Traditional Courtyards (105)	4-sided	13	North–South	-	0	5c/5h	17
			North–South	-	1	2c/2h
			North–South	-	2	1c/1h
			East–West	-	1	3c/3h
			East–West	-	3	4c/4h
	3-sided	45	North–South	-	0	7c/7h
			North–South	-	1	9c/9h
			North–South	-	2	10c/10h
			East–West	-	0	6c/6h
			East–West	-	1	8c/8h
	2-sided	33	North–South	Central	0	13c/13h
			North–South	Central	1	12c/12h
			North–South	East	1	11c/11h
			East–West	Central	1	15c/15h
			East–West	East	1	14c/14h
	1-sided	14	North–South	-	0	17c/17h
			East–West	-	1	16c/16h

* The courtyards are further classified according to their location. ** “c” stands for typical winter week, and “h” stands for typical summer week.

Table 4. Design variable parameter range.

Design Variables	Unit	Scope
Courtyard width	m	[8, 13]
Standard floor height	m	[2.6, 4.1]
Orientation	°	[−45, 45]
Secondary-house Floor Control	-	[−1, 1]
Window-to-wall ratio	-	[0, 0.35]
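The ranges in Table 4 define the search space for the genetic optimization. The sketch below shows one way to draw an initial population uniformly within those bounds. It is not the Wallacei_X implementation: treating every variable as continuous (including the secondary-house floor control) is an assumption, the variable names are hypothetical, and the population size of 50 and random seed of 1 are borrowed from the genetic algorithm settings reported in Table 6.

```python
import random
from typing import Dict, List, Tuple

# Design variable bounds copied from Table 4 (names are illustrative).
BOUNDS: Dict[str, Tuple[float, float]] = {
    "courtyard_width_m": (8.0, 13.0),
    "standard_floor_height_m": (2.6, 4.1),
    "orientation_deg": (-45.0, 45.0),
    "secondary_house_floor_control": (-1.0, 1.0),  # assumed continuous here
    "window_to_wall_ratio": (0.0, 0.35),
}


def sample_design(rng: random.Random) -> Dict[str, float]:
    """Draw one design uniformly at random within the Table 4 bounds."""
    return {name: rng.uniform(low, high) for name, (low, high) in BOUNDS.items()}


def is_within_bounds(design: Dict[str, float]) -> bool:
    """Check that every variable of a design respects its bounds."""
    return all(BOUNDS[name][0] <= value <= BOUNDS[name][1]
               for name, value in design.items())


rng = random.Random(1)  # seed value as listed in Table 6
initial_population: List[Dict[str, float]] = [sample_design(rng) for _ in range(50)]
```

A real parametric model would map each sampled dictionary onto geometry before any simulation is run; that step is outside the scope of this sketch.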

Table 5. Performance objectives parameter range.

Performance Objectives		Analysis Period	The Acceptable Range	Scope
Outdoor Comfort	OTCA_C	Typical Week in Winter	>5 °C	[0, 1]
Outdoor Comfort	OTCA_H	Typical Week in Summer	<32 °C	[0, 1]
Indoor Comfort	ITCA_H	Typical Week in Summer	<32 °C	[0, 1]
Indoor Comfort	sDA	1 year	>300 lx	[0, 1]
Building energy consumption	EUI	1 year	-	[0, 1]
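Each comfort objective in Table 5 pairs an acceptable range with a score in [0, 1]. One plausible reading, assumed here rather than stated in this section, is that the score is the fraction of hours in the analysis period whose value falls inside the acceptable range. The hourly values and function name below are hypothetical.

```python
from typing import Callable, Sequence


def comfort_fraction(hourly_values: Sequence[float],
                     is_acceptable: Callable[[float], bool]) -> float:
    """Fraction of hours that satisfy the acceptability criterion."""
    if not hourly_values:
        raise ValueError("at least one hourly value is required")
    acceptable_hours = sum(1 for value in hourly_values if is_acceptable(value))
    return acceptable_hours / len(hourly_values)


# Hypothetical hourly temperature index values (°C), for illustration only.
winter_values = [2.0, 4.5, 6.0, 7.5, 5.0, 3.0]
summer_values = [29.0, 31.5, 33.0, 35.0]

# Thresholds from Table 5: above 5 °C in winter, below 32 °C in summer.
otca_c = comfort_fraction(winter_values, lambda t: t > 5.0)
otca_h = comfort_fraction(summer_values, lambda t: t < 32.0)
```

Under this reading, a score of 1 means every hour of the typical week is within the acceptable range, and a score of 0 means none is.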

Table 6. Genetic algorithm parameter settings.

Population
Generation size	50
Generation count	60
Algorithm parameters
Crossover probability	0.9
Mutation probability	1/n
Crossover distribution index	20
Mutation distribution index	20
Random seed	1
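The settings in Table 6 can be gathered into a small configuration object, which also makes the implied simulation budget explicit. This is a sketch, not the Wallacei_X API: the class and field names are hypothetical, and reading n in the mutation probability 1/n as the number of design variables (five in Table 4) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GASettings:
    """Genetic algorithm settings as listed in Table 6."""
    generation_size: int = 50
    generation_count: int = 60
    crossover_probability: float = 0.9
    crossover_distribution_index: int = 20
    mutation_distribution_index: int = 20
    random_seed: int = 1
    n_design_variables: int = 5  # assumption: n equals the Table 4 variable count

    @property
    def mutation_probability(self) -> float:
        """Mutation probability 1/n."""
        return 1.0 / self.n_design_variables

    @property
    def total_evaluations(self) -> int:
        """Number of simulated solutions over the whole run."""
        return self.generation_size * self.generation_count


settings = GASettings()
print(settings.total_evaluations, settings.mutation_probability)
```

With 50 individuals over 60 generations, the run evaluates 3000 solutions in total.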

Table 7. The performance scores based on K-means clustering.

	Outdoor_C	Outdoor_H	Indoor_H	Indoor_sDA	EUI
Cluster 1/Individual. 1	0.052194	0.353481	0.892496	0.893248	0.372975
Cluster 2/Individual. 6	0.060127	0.368541	0.793157	0.972145	0.265634
Cluster 3/Individual. 3	0.057044	0.361921	0.739375	0.792245	0.393248
Cluster 4/Individual. 5	0.069542	0.349952	0.897124	0.521406	0.391205
Cluster 5/Individual. 11	0.068231	0.365781	0.926355	0.575702	0.376893
Cluster 6/Individual. 8	0.067198	0.354203	0.885853	0.529257	0.389754

Table 8. Pareto front screening criteria.

	Generations	sDA Value	Performance Label
Pareto Front Solutions	40–59	>0.7	A1
Pareto Front Solutions	20–39	>0.7	B2
Pareto Front Solutions	0–19	>0.7	C3
Pareto Front Solutions	0–59	<0.7	D4
Non-Front Solutions	0–59	>0.7	E5
Non-Front Solutions	0–59	<0.7	F6
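The screening rules in Table 8 amount to a small decision procedure. The sketch below returns the letter part of each label (A to F). The table does not say how a solution with an sDA value of exactly 0.7 is treated, so grouping it with the lower class is an assumption, and the function name is hypothetical.

```python
def performance_label(on_pareto_front: bool, generation: int, sda: float,
                      sda_threshold: float = 0.7) -> str:
    """Assign a performance label following the criteria in Table 8."""
    if not 0 <= generation <= 59:
        raise ValueError("generation must lie between 0 and 59")
    # Assumption: sDA exactly equal to the threshold counts as the lower class.
    high_sda = sda > sda_threshold
    if not on_pareto_front:
        return "E" if high_sda else "F"
    if not high_sda:
        return "D"
    if generation >= 40:
        return "A"
    if generation >= 20:
        return "B"
    return "C"


# Hypothetical solutions: (on Pareto front, generation, sDA score).
examples = [(True, 45, 0.9), (True, 25, 0.8), (True, 5, 0.75),
            (True, 50, 0.5), (False, 10, 0.8), (False, 10, 0.3)]
labels = [performance_label(*example) for example in examples]
```

Labels produced this way could serve as classification targets for a supervised model, which matches the role the performance labels play in the machine learning stage.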

Table 9. Performance improvement of each optimization target.

Index/Performance Rating	Outdoor Comfort: Outdoor_C Simulation	Outdoor Comfort: Outdoor_H Simulation	Indoor Comfort: Indoor_H Simulation	Indoor Comfort: Indoor_sDA Simulation	Building Energy Consumption/EUI	Prediction Accuracy
1/A	0.051136	0.358759	0.892034	0.89998	0.372426	89%
Optimization percentage	−27%	−1%	12%	11%	42%	89%
2/B	0.069129	0.365641	0.780046	0.79998	0.317511	82%
Optimization percentage	15%	1%	−2%	−2%	19%	82%
3/C	0.060938	0.362153	0.791221	0.98	0.262619	81%
Optimization percentage	-	-	-	12%	-	81%
4/D	0.065476	0.35846	0.91218	0.53334	0.385004	75%
Optimization percentage	8%	−1%	15%	−33%	46%	75%
5/E	0.057044	0.361921	0.739375	0.79998	0.390826	82%
Optimization percentage	−5%	-	−8%	−2%	50%	82%
6/F	0.056551	0.351798	0.717885	0.78877	0.415123	88%
Optimization percentage	−5%	−3%	−11%	−2%	57%	88%

Blue arrows represent performance improvement, while red arrows represent performance decline.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

