Evaluation of Reclamation Soil Quality in Coal Mining Subsidence Area Based on CA-CDA-PCA-MF

Liu, Shiliang; Zheng, Yusheng; Lv, Xueqiang; An, Bochao; Huo, Zhichao; Guo, Fangru; Chao, Chen; Mao, Deqiang

doi:10.3390/su17062561


by Shiliang Liu ¹, Yusheng Zheng ¹, Xueqiang Lv ²,*, Bochao An ², Zhichao Huo ², Fangru Guo ², Chen Chao ¹ and Deqiang Mao ¹

¹ School of Civil Engineering, Shandong University, Jinan 250061, China

² Shandong Ding’an Testing Co., Ltd., Jinan 250032, China

* Author to whom correspondence should be addressed.

Sustainability 2025, 17(6), 2561; https://doi.org/10.3390/su17062561

Submission received: 3 January 2025 / Revised: 26 February 2025 / Accepted: 10 March 2025 / Published: 14 March 2025


Abstract

Soil reclamation is essential for restoring the ecological environment in coal mining subsidence areas, with reclaimed soil quality serving as a key indicator of success. Traditional evaluation methods often rely on subjective judgment, leading to potential biases. This study proposes an approach combining cluster analysis (CA), correlation degree analysis (CDA), principal component analysis (PCA), and membership function (MF) to evaluate soil reclamation quality in the Ezhuang subsidence area, Shandong Province, China. A minimum dataset (MDS) was established, including seven indicators: exchangeable magnesium, total nitrogen, available copper, available manganese, available zinc, free iron, and available silicon. Soil quality indices (SQIs) were calculated using membership functions, revealing moderate soil quality across the reclamation area, with significant spatial variations. The northeastern section exhibited relatively good soil quality, while the northwestern and southeastern sections were poorer. Key factors influencing soil quality included variations in organic matter, exchangeable magnesium, and available copper. The accuracy of the CA-CDA-PCA-MF method was validated, with a coefficient of determination (R²) of 0.877 and a coefficient of deviation (CV) of 0.053, demonstrating its reliability. This method provides a robust tool for evaluating and improving soil restoration in mining areas, with potential applications in similar reclamation projects.

Keywords:

reclaimed soil; soil indicators; minimal dataset; soil quality index; soil quality evaluation

1. Introduction

Soil reclamation in coal mining subsidence areas is a critical component of ecological restoration, aiming to mitigate the adverse environmental impacts of mining activities and restore degraded land to productive use. In China, the cumulative area of coal mining subsidence has reached 13.5 million hectares, with an annual increase of 70,000 hectares, leading to severe soil degradation and significant socio-economic and ecological challenges. Soil reclamation is, therefore, an urgent priority. A variety of treatment methods have been proposed to manage soil quality in reclamation areas, such as in situ soil improvement, natural recovery [1], foreign soil landfill recovery, coal gangue landfill remediation, and chemical remediation [2]. However, given the pollution risk, resource consumption, and cost of the alternatives, foreign soil remediation has become the mainstream method for reclaiming coal mining subsidence land.

The quality of reclaimed soil is an important index for evaluating the effect of soil reclamation. However, the diversity and complexity of soil quality assessments show significant differences across reclamation objectives. Reclamation to restore soils to agricultural use usually focuses on soil fertility and productivity, emphasizing chemical and physical indicators [3,4], while reclamation for ecological restoration is more concerned with biodiversity restoration and relies on bioindicators [5]. Pollution remediation prioritizes pollutant toxicity and degradation potential, combining chemical and biological toxicity tests [6,7]. In recent years, research on soil quality assessment has gradually developed in a multidimensional and comprehensive direction. For example, in reclamation to restore soils to agricultural use, researchers have not only focused on traditional chemical indicators but also introduced soil microbiomics and metabolomics analyses [3,8]. However, balancing the synergies and trade-offs between indicators under different reclamation objectives remains a core challenge of current research [9].

Traditional methods for selecting indicators rely on literature reviews and expert consultations. Chen et al. characterized the quality of reclamation in the Xuzhou mining area using a soil quality evaluation index, which includes two categories of indicators, soil productivity and environmental quality, employing an ordinary weighted ball method [10]. Zhao et al. utilized frequency analysis, theoretical analysis methods, and expert scoring to select evaluation indicators and assessed soil fertility in the reclamation area using a nonlinear membership degree function method and an improved Nemerow index [11]. However, such methods depend on the experience of the selectors and are therefore subjective. Various scholars have attempted to apply mathematical methods to select evaluation indicators and reduce the interference of subjective factors. For example, Yang et al. comprehensively evaluated the soil fertility level and degree of heavy metal pollution in the subsidence area of the Huaihe mining area using the single-factor index method and the Nemerow index method based on fuzzy mathematics principles [12]. In recent years, with the development of computer technology, researchers have increasingly applied computational methods to soil quality assessment, such as PCA for dimensionality reduction and deep learning evaluation models trained on big data. However, deep learning requires large amounts of data, which are generally difficult to obtain. PCA performs relatively well and has relatively low data requirements, but when the indicator dimensions are high and the sample size is limited, PCA may produce distorted results due to multicollinearity, a phenomenon demonstrated by Johnstone [13]. Especially in high-dimensional small-sample data, PCA's assumption of linear relationships makes it difficult to extract data features accurately, and there is still a need for improvement.

Methods of soil quality assessment have developed through interdisciplinary work, and many new methods have emerged. For example, technologies based on microbiomics and metabolomics enable in-depth analyses of the biological activity and biochemical status of soils [14], remote sensing and geographic information systems (GIS) enable large-scale, dynamic soil monitoring [15], and deep learning and artificial intelligence provide high-precision predictive models of soil quality through the integration of data from multiple sources [16]. In addition, classical evaluation methods, such as the soil quality index (SQI), soil degradation index (SDI), and soil productivity index (SPI) [17,18], have demonstrated new application value in soil quality assessment by integrating with emerging technologies. Some of these techniques have been applied to assess the quality of reclaimed soil in coal mining subsidence areas. Examples include collecting soil data from different years and using geographic information technology to study the spatial and temporal changes in the quality of reclaimed soils on a large scale [19], using complex network theory to establish a computer model to analyse the relationship between indicators and soil quality [20], using unmanned aerial vehicle spectral scanning to rapidly monitor the quality of reclaimed soils on a large scale in coal mining subsidence zones [21], and calculating soil quality indices by principal component analysis to evaluate the quality of reclaimed soils [22]. However, some of these methods have demanding prerequisites: geographic information technology requires a large amount of data collected over different years, and drone spectral scanning requires suitable drone equipment, so neither can be widely implemented. Given the current widespread distribution of coal mining subsidence areas, an affordable and feasible evaluation method that can be widely disseminated is particularly important. Among the existing methods for reclamation soil quality assessment, SQI has been widely used for soil quality assessment in most soil environments due to its simplicity of calculation and flexibility of quantification [23]. For example, Wendyam et al. employed this method to assess soil quality in a watershed area [24], while Muhammad and Roderick applied it to evaluate soil quality in a semi-arid region and to explore soil pollution, respectively [25,26]. The accuracy of soil quality assessments using the SQI method largely depends on the selection of indicators and scoring methods [27,28].

Traditional methods such as expert review systems rely heavily on subjective judgement and are susceptible to personal bias. In contrast, modern computational techniques such as deep learning tend to require large datasets and are costly, making them impractical or suboptimal in many cases. This study addresses the limitations of both approaches and proposes a statistically robust and feasible framework designed for small-sample scenarios. To improve the accuracy of soil quality evaluation in reclaimed areas with small-sample data, we used cluster analysis and correlation analysis to screen the full dataset of soil indicators and remove redundant indicators, and then used PCA for in-depth screening to establish the minimum dataset. Reducing the number of indicators before PCA improves the accuracy and interpretability of the PCA results, reduces subjectivity in the component selection process, and improves the reliability of the minimum dataset. Finally, the SQI method and membership functions were used to evaluate the quality of the reclaimed soil, and the accuracy of the CA-CDA-PCA-MF method was validated.

2. Materials and Methods

2.1. Overview of the Study Area

The study area is located at the Ezhuang Coal Mine in the Laiwu District of Jinan City, Shandong Province, China (longitude 117°37′00″ E–117°40′42″ E, latitude 36°10′10″ N–36°12′51″ N), as shown in Figure 1. Situated in the central mountainous region of Shandong, it experiences a temperate continental semi-humid monsoon climate with abundant sunshine.

Before coal mining subsidence reclamation, the terrain in the study area was relatively flat, characterized by brown soil with thick layers, high organic matter content, moderate porosity, and suitability for crop cultivation. The subsidence in the area was due to the impact of coal seams No. 2, 4, and 7, resulting in a comprehensive subsidence depth of 0.5–1.1 m. Initial coal mining reclamation, completed in 2019, involved removing topsoil, backfilling with fill material, and leveling the land to reclaim the soil. In addition, a previously waterlogged area within the study area underwent treatment through deep excavation and shallow cushioning. Soil from deeper sections was transferred to shallower areas, merging earthworks with pond excavation to transform deep water areas into ponds for comprehensive restoration. The overall thickness of the backfilled soil in the reclamation area is approximately 0.7–1.0 m, with the backfilled soil type resembling that of the adjacent soil.

2.2. Overall Approach

This study takes the measured physicochemical indicators of the soil as the total dataset. Cluster analysis, correlation analysis, and principal component analysis are then used to identify the minimal dataset, and membership functions are used to calculate the soil quality index and assess the quality of soil reclamation. The equal-interval method was used to classify soil quality: the SQI was divided into five classes at four nodes, 0.2, 0.4, 0.6 and 0.8 [29], corresponding to poor, fairly poor, moderate, fairly good and good, respectively. The technical route map is shown in Figure 2.
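The five-class grading at the nodes 0.2, 0.4, 0.6 and 0.8 can be sketched in a few lines of Python. This is only an illustration: the function name `sqi_grade` is hypothetical, and the text does not say which class a value lying exactly on a node belongs to, so the sketch assumes it falls into the higher class.

```python
from bisect import bisect_right

# Class boundaries and labels as stated in the text.
NODES = [0.2, 0.4, 0.6, 0.8]
GRADES = ["poor", "fairly poor", "moderate", "fairly good", "good"]


def sqi_grade(sqi: float) -> str:
    """Map an SQI in [0, 1] to one of the five grades.

    Assumption (not stated in the text): a value exactly on a node
    is assigned to the higher class.
    """
    if not 0.0 <= sqi <= 1.0:
        raise ValueError("SQI is expected to lie in [0, 1]")
    # bisect_right counts how many nodes are <= sqi, which is the class index.
    return GRADES[bisect_right(NODES, sqi)]


if __name__ == "__main__":
    for value in (0.06, 0.4575, 0.88):
        print(value, sqi_grade(value))
```

Under this convention, the mean SQI of 0.4575 reported for the minimum dataset falls in the moderate class.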

2.3. Soil Sample Collection and Laboratory Testing

During the investigation, a land area of 16,000 m² was selected for sampling and research in the Ezhuang Coal Mine. This area underwent soil reclamation in 2019 following coal mining subsidence, with nearby excavated soil covering the original soil layer. The sampling points, depicted in Figure 1c, included five selected locations. Soil samples were collected from depths of 0~20 cm and 40~60 cm, with an additional sample collected at a depth of 40~60 cm at sampling point 2, resulting in a total of 11 samples.

Taking into account the foreign soil reclamation method, the relatively uniform soil texture and structure in the study area, and the soil quality evaluation standards for reclaimed farmland in China, ten soil indicators were measured: exchangeable magnesium, exchangeable calcium, organic matter, total nitrogen, available silicon, available iron, available manganese, available copper, available zinc, and free iron. Exchangeable magnesium and exchangeable calcium were extracted using hydrochloric or sulfuric acid, followed by titration or spectrophotometry for measurement [30]. Organic matter was determined using the acid-base neutralization method [31]. Total nitrogen was quantified using the semi-micro Kjeldahl method [32]. Available silicon was obtained using a sulfuric acid-hydrofluoric acid digestion method, followed by spectrophotometric measurement [33]. Available iron, manganese, copper, and zinc were extracted using the diethylene triamine pentaacetic acid (DTPA) method and then measured using atomic absorption spectrometry [34]. Free iron was extracted with a sulfuric acid-hydrofluoric acid digestion method and then measured using spectrophotometry.

3. Minimum Dataset Establishment

3.1. Cluster Analysis

Cluster analysis is a statistical method that groups samples in a dataset into categories based on similarities. By grouping soil indicators with similar features, it can preliminarily reveal the data structure and reduce the redundancy of highly correlated indicators. This study selected suitable indicators for soil quality evaluation using the R-type clustering method within systematic clustering (implemented in SPSS 27). The variables, rather than the samples, were clustered, and the data were standardized using z-scores. The distance between variables was defined as the squared Euclidean distance; the two closest variables were merged, and the distances between the merged variables were then recalculated. This process continued until a dendrogram illustrating the relationships between variables was created to establish the minimal dataset [35]. The squared Euclidean distance is calculated as follows [36]:

d^{2} (x, y) = {(x - y)}^{T} (x - y) = \sum_{i = 1}^{n} {(x_{i} - y_{i})}^{2} = {(x_{1} - y_{1})}^{2} + {(x_{2} - y_{2})}^{2} + \dots + {(x_{n} - y_{n})}^{2}

(1)
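As a minimal sketch of Equation (1) applied to z-scored variables, the Python below uses only the standard library. The indicator names and values are hypothetical, and the use of the sample standard deviation (n − 1) for standardization is an assumption; the study itself ran the clustering in SPSS 27.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize one indicator across sampling points.

    Assumption: the sample standard deviation (n - 1) is used.
    """
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def squared_euclidean(x, y):
    """Equation (1): sum of squared coordinate differences."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


# Hypothetical measurements of three indicators at four sampling points.
indicators = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],  # proportional to A: same z-scores as A
    "C": [4.0, 3.0, 2.0, 1.0],  # reversed trend: z-scores are the negative of A's
}
z = {name: z_scores(vals) for name, vals in indicators.items()}

d_ab = squared_euclidean(z["A"], z["B"])  # close to 0, so A and B merge first
d_ac = squared_euclidean(z["A"], z["C"])  # close to 12, the largest distance here

if __name__ == "__main__":
    print(round(d_ab, 6), round(d_ac, 6))
```

Because standardization removes units and scale, two indicators that vary together across sampling points end up close, which is why clustering variables in this way exposes redundancy.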

3.2. Correlation Analysis

Correlation analysis is a statistical method used to measure the relationship between two or more variables. Correlation analysis further quantifies the correlation between indicators and identifies groups of variables with high covariance through correlation coefficient matrices. If two or more indicators are highly correlated during the minimum dataset selection process, correlation analysis can be used to screen out redundant variables and retain only the core variables that have a strong influence on soil quality. This step not only reduces the number of variables but also reduces the computational burden during principal component analysis. After conducting normality tests for all evaluation indicators, researchers select the correlation coefficient matrix based on the results. If all factors follow a normal distribution, the Pearson correlation coefficient matrix is used for analysis. Otherwise, the Spearman correlation coefficient matrix is applied.

After selecting the appropriate correlation matrix, the analysis proceeds based on the cluster analysis results. Within different groups classified by cluster analysis, the correlation between indicators is analyzed. A correlation coefficient greater than 0.5 indicates a strong relationship, and such indicators are prioritized for inclusion in the minimal dataset based on practical considerations and established research. Indicators with a correlation coefficient less than 0.5 are considered to have no significant relationship and are treated as backup options for the minimal dataset.
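The choice between the two coefficient types and the 0.5 screening threshold can be sketched as follows. This is a standard-library illustration, not the authors' procedure: the normality test is not implemented, so `all_normal` is a hypothetical flag supplied by the caller. Comparing the absolute value of the coefficient against 0.5 is also an assumption, since the text only states "greater than 0.5".

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def correlation(x, y, all_normal):
    """Pearson if every indicator passed the normality test, else Spearman."""
    return pearson(x, y) if all_normal else spearman(x, y)


def is_strong(r, threshold=0.5):
    """Assumption: the threshold is applied to the absolute value of r."""
    return abs(r) > threshold


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [1.0, 4.0, 9.0, 16.0, 25.0]  # monotonic but not linear in x
    print(round(pearson(x, y), 3), round(spearman(x, y), 3))
```

The example pair is monotonic but nonlinear, so the rank-based coefficient reaches 1 while the Pearson coefficient stays below it, which illustrates why the rank-based matrix is the safer choice when normality does not hold.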

3.3. Principal Component Analysis

Principal component analysis (PCA) is used to transform the original variables into a set of independent principal components through linear transformation. This process reduces data dimensions, minimizes redundant information, and extracts the main variability in the data.

Firstly, KMO (Kaiser–Meyer–Olkin) and Bartlett’s tests were performed on the indicators for which PCA was performed to determine the suitability of the data for PCA. For soil quality evaluation, principal components with eigenvalues of 1 or higher are retained. Indicators with loadings of 0.5 or greater on the same principal component are considered alternative indicators for inclusion in the minimal dataset. If an indicator has loadings of 0.5 or greater in two or more principal components, it is analyzed in the principal component with lower correlations with other indicators.

The vector norm (Norm) calculation is introduced as a reference basis for selecting indicators into the minimal dataset to avoid relying solely on indicator loadings as the criterion and potentially overlooking some indicator information [37]. The larger the Norm value of an indicator, the stronger its ability to explain comprehensive information. The formula for calculating the Norm value is as follows [38]:

N_{i k} = \sqrt{\sum_{j = 1}^{k} (u_{i j}^{2} e_{j})}

(2)

where N_ik is the Norm value of the ith indicator over the top k principal components with eigenvalues greater than 1, u_ij is the loading of the ith indicator on the jth principal component, and e_j is the eigenvalue of the jth principal component.
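A short sketch of the Norm calculation, summing the squared loading times the eigenvalue over the retained principal components. The loadings and eigenvalues below are hypothetical and are not taken from Table 1; the function name `norm_value` is likewise illustrative.

```python
from math import sqrt


def norm_value(loadings, eigenvalues):
    """Norm of one indicator over the retained principal components.

    loadings[j] is the indicator's loading on component j and
    eigenvalues[j] is that component's eigenvalue.
    """
    if len(loadings) != len(eigenvalues):
        raise ValueError("one loading is needed per retained component")
    return sqrt(sum(u * u * e for u, e in zip(loadings, eigenvalues)))


# Hypothetical example: two retained components and two indicators.
eigenvalues = [4.0, 1.0]
loadings = {"X1": [0.9, 0.3], "X2": [0.6, 0.8]}
norms = {name: norm_value(u, eigenvalues) for name, u in loadings.items()}

if __name__ == "__main__":
    for name, value in norms.items():
        print(name, round(value, 3))
```

In this example X1 has the larger Norm even though X2 loads more heavily on the second component, because the first component carries a much larger eigenvalue. This is the sense in which the Norm reflects an indicator's ability to explain comprehensive information rather than its loading on a single component.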

3.4. Soil Quality Evaluation

The SQI is a multidimensional concept that relies on indicators to comprehensively assess soil quality. This assessment is more representative when it includes multiple indicators rather than focusing on individual ones alone. The SQI numerically represents soil quality by establishing membership functions between the evaluation indicators and soil quality based on their positive and negative effects, and then integrating the weights of the indicators in each dataset. The membership function standardizes soil quality indicators of different units and scales to a membership value between 0 and 1, making them comparable. The way the membership function is defined affects the accuracy of the soil quality index. Different types of membership functions reflect the way in which each indicator affects soil quality: S-type functions are used for positive indicators, inverse S-type functions are used for negative indicators, and parabolic functions are used for indicators for which an optimal range exists [39,40].

The commonly used membership functions are categorized into three types as follows: S-shaped membership function:

u (x) = \begin{cases} 1, & x \geq b \\ \frac{x - a}{b - a}, & a < x < b \\ 0, & x \leq a \end{cases}

(3)

Inverted S-shaped membership function:

u (x) = \begin{cases} 1, & x \leq a \\ \frac{x - b}{a - b}, & a < x < b \\ 0, & x \geq b \end{cases}

(4)

Parabolic membership function:

u (x) = \begin{cases} 1, & b_{1} \leq x \leq b_{2} \\ \frac{x - a_{1}}{b_{1} - a_{1}}, & a_{1} < x < b_{1} \\ \frac{x - a_{2}}{b_{2} - a_{2}}, & b_{2} < x < a_{2} \\ 0, & x \leq a_{1} \text{ or } x \geq a_{2} \end{cases}

(5)

where x is the actual measured value of the evaluation indicator; a and b are the lower and upper critical values of the indicator, taken as the minimum and maximum measured values; a₁ and a₂ are the lower and upper critical values of an indicator with an optimal range, likewise taken as the minimum and maximum measured values; and b₁ and b₂ are the lower and upper limits of the optimal range.
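The three membership functions in Equations (3)–(5) translate directly into code. The sketch below follows the piecewise definitions as given; the function names and the example critical values are hypothetical.

```python
def s_type(x, a, b):
    """Equation (3): rises linearly from 0 at a to 1 at b (positive indicator)."""
    if x >= b:
        return 1.0
    if x <= a:
        return 0.0
    return (x - a) / (b - a)


def inverse_s_type(x, a, b):
    """Equation (4): falls linearly from 1 at a to 0 at b (negative indicator)."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (x - b) / (a - b)


def parabolic(x, a1, b1, b2, a2):
    """Equation (5): 1 inside the optimal range [b1, b2], 0 outside (a1, a2)."""
    if b1 <= x <= b2:
        return 1.0
    if x <= a1 or x >= a2:
        return 0.0
    if x < b1:
        return (x - a1) / (b1 - a1)  # rising branch, a1 < x < b1
    return (x - a2) / (b2 - a2)      # falling branch, b2 < x < a2


if __name__ == "__main__":
    # Hypothetical critical values: a = 0, b = 10; optimal range 2 to 4 within 0 to 6.
    print(s_type(5.0, 0.0, 10.0))
    print(inverse_s_type(5.0, 0.0, 10.0))
    print(parabolic(1.0, 0.0, 2.0, 4.0, 6.0), parabolic(5.0, 0.0, 2.0, 4.0, 6.0))
```

All three functions return a value in [0, 1], so indicators measured in different units become directly comparable before weighting.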

Principal component analysis was performed on the whole indicator data set and the minimum indicator data set to calculate the weights of each evaluation indicator in the soil quality index calculation. The SQI for different datasets is calculated using the following formula by combining the membership degree:

S Q I = \sum_{i = 1}^{n} W_{i} S_{i}

(6)

where S_i is the indicator’s score, n is the number of indicators, and W_i is the indicator’s weight. A higher SQI value shows better soil quality and greater suitability for plant growth.
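Equation (6) is a weighted sum of membership scores. The sketch below assumes the weights have already been derived and sum to 1; the indicator names, weights, and scores are hypothetical and are not the values in Table 2.

```python
def soil_quality_index(scores, weights):
    """Equation (6): SQI as the weighted sum of membership scores.

    Assumption: the weights are already normalized so that they sum to 1.
    """
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same indicators")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights are expected to sum to 1")
    return sum(weights[name] * scores[name] for name in scores)


# Hypothetical weights and membership scores for three indicators.
weights = {"A": 0.5, "B": 0.3, "C": 0.2}
scores = {"A": 0.8, "B": 0.5, "C": 0.0}
sqi = soil_quality_index(scores, weights)

if __name__ == "__main__":
    print(round(sqi, 4))
```

Because each score lies between 0 and 1 and the weights sum to 1, the resulting SQI also lies between 0 and 1.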

3.5. Validation for the CA-CDA-PCA-MF Method

The core of validating the accuracy of the CA-CDA-PCA-MF method is to compare the soil quality index calculated from the minimum data set obtained with the CA-CDA-PCA-MF method with the soil quality index calculated from the full data set using PCA. Comparisons are made by calculating the coefficient of determination R² and the coefficient of deviation CV between the two groups of soil quality indices using the following formulae [41]:

R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}

(7)

C V = \frac{\frac{1}{n} \sum_{i = 1}^{n} ({\hat{y}}_{i} - y_{i})}{\bar{y}}

(8)

where y_i is the soil quality index calculated for the full data set for the ith sample, ŷ_i is the soil quality index calculated for the minimum data set for the ith sample, ȳ is the mean value of the soil quality index calculated for the full data set, and n is the number of samples.

The closer the coefficient of determination R² is to 1, the closer the soil quality index calculated from the minimum data set is to that calculated from the full data set, and the more accurate the model is. The closer the coefficient of deviation CV is to 0, the smaller the model deviation is.
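The two validation statistics can be sketched as below. The code uses the conventional form of the coefficient of determination, 1 minus the ratio of the residual to the total sum of squares, and the mean signed deviation relative to the full-dataset mean for CV. The SQI values are hypothetical, not those of Tables 3 and 4.

```python
def r_squared(y_full, y_mds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(y_full) / len(y_full)
    ss_res = sum((y - yh) ** 2 for y, yh in zip(y_full, y_mds))
    ss_tot = sum((y - y_bar) ** 2 for y in y_full)
    return 1.0 - ss_res / ss_tot


def deviation_coefficient(y_full, y_mds):
    """Equation (8): mean signed deviation divided by the full-dataset mean."""
    n = len(y_full)
    y_bar = sum(y_full) / n
    mean_dev = sum(yh - y for y, yh in zip(y_full, y_mds)) / n
    return mean_dev / y_bar


# Hypothetical SQI values for four samples.
y_full = [0.2, 0.4, 0.6, 0.8]   # full dataset
y_mds = [0.25, 0.35, 0.6, 0.7]  # minimum dataset

r2 = r_squared(y_full, y_mds)
cv = deviation_coefficient(y_full, y_mds)

if __name__ == "__main__":
    print(round(r2, 3), round(cv, 3))
```

A negative CV, as in this example, means the minimum-dataset SQI is on average lower than the full-dataset SQI.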

4. Results and Discussion

4.1. Cluster Analysis Results

Cluster analysis was conducted on a dataset comprising ten factors, with the results displayed in Figure 3. According to these results, the dataset was divided into three classes when the clustering level was between 10 and 15. The first class includes exchangeable magnesium (EMg), exchangeable calcium (ECa), available iron (Fe(avail.)), organic matter (OM), total nitrogen (TN), and available copper (Cu(avail.)). The second class includes available manganese (Mn(avail.)) and available zinc (Zn(avail.)), and the third class includes available silicon (Si(avail.)) and free iron (Fe(free)).

4.2. Correlation Analysis Results

Preliminary screening of the results of cluster analysis is conducted by using correlation analysis. Initially, a normality test was performed on all factors in the dataset, indicating that Mn(avail.), Zn(avail.), Cu(avail.), TN, and Si(avail.) met the criteria for normal distribution. However, OM, Fe(free), Fe(avail.), ECa, and EMg did not follow a normal distribution. Hence, this study utilized the Spearman correlation coefficient matrix for further analysis. Figure 4 shows the correlation coefficients for the plots.

The correlation coefficient between EMg and ECa in Class 1 indicators is 0.927 **. EMg enhances the aggregation and cementation of soil particles, promoting soil structure stability and permeability. It also supports plant growth by regulating physiological metabolic processes within plant cells and affecting soil pH, which influences the effectiveness of other ions in soil and nutrient absorption by plants. Exchangeable calcium promotes soil particle aggregation and cementation, benefiting soil structure stability and aeration. It influences soil biological activity, microbial growth, and organic matter decomposition and plays a significant role in plant cell wall synthesis and cell division, profoundly impacting plant root growth and development [42,43]. Although their functions overlap to some extent, both are indispensable and are thus included in the alternative minimum dataset.

The correlation coefficient between Fe(avail.) and Cu(avail.) is 0.924 **. Fe(avail.) is vital for chlorophyll synthesis and nitrogen metabolism in plants, necessary for photosynthesis, and supports the healthy development of plant roots and leaves. It also facilitates oxidation-reduction reactions in the soil, maintaining soil redox balance. Cu(avail.) is crucial for plants’ photosynthesis, respiration, nitrogen metabolism, protein and enzyme synthesis, root growth, and nutrient absorption and utilization. Their levels directly impact plant growth, and their deficiency or excess significantly affects soil quality [42,43]. Hence, both are included in the alternative minimum dataset.

The correlation coefficient between OM and TN is 0.936 **. TN is a critical nutrient for plant growth and is involved in synthesizing organic compounds such as proteins and nucleic acids. It promotes plant growth, improves yield and quality, and reflects soil fertility. As an essential indicator, it evaluates soil fertility effectively. OM provides vital nutrients for plant growth, including carbon, nitrogen, and phosphorus, playing a crucial role in enhancing plant development, soil structure, water and nutrient retention, soil permeability, aeration, microbial growth, and activity. It also maintains the soil ecosystem’s balance and stability [44,45]. Their relationship is not directly subordinate, so both are included in the alternative minimum dataset.

In Class 2 indicators, Mn(avail.) participates in plant photosynthesis and respiration, promotes chlorophyll synthesis and aids in oxidation–reduction reactions. It also assists plants in absorbing and utilizing nutrients such as nitrogen, phosphorus, and potassium. Zn(avail.) is involved in the synthesis of plant growth hormones and enzyme activity, enhancing the plant’s resistance to diseases and pests [46]. Both indicators show no significant correlation and are thus included in the alternative minimum dataset.

In Class 3 indicators, the correlation coefficient between Si(avail.) and Fe(free) is 0.753 **. Si(avail.) enhances plants’ resistance to diseases and pests and improves their tolerance to adverse environmental conditions. Fe(free) affects the absorption capacity of plant roots and the nutrient supply [44]. The correlation between the two remains unclear; therefore, to prevent errors in evaluating indicators based on correlation coefficients, both are included in the alternative minimum dataset.

4.3. Principal Component Analysis Results

After cluster analysis and correlation analysis, Mn(avail.), Zn(avail.), Fe(free), and Si(avail.) were selected for inclusion in the minimum dataset for soil quality evaluation. However, data redundancy remains in the alternative minimum dataset of the Class 1 indicators. Hence, PCA was used to select the Class 1 indicators that ultimately enter the minimum dataset.

Firstly, the indicators undergoing PCA are subjected to Kaiser–Meyer–Olkin (KMO) and Bartlett’s tests to determine if all indicators are suitable for PCA [47]. The results of the tests showed a KMO value of 0.712, meeting the requirements for conducting PCA, and Bartlett’s test result was p < 0.01, indicating a significant correlation, thus making the indicators suitable for PCA.

Further utilizing SPSS 27 software, PCA was conducted on the alternative dataset comprising Fe(avail.), Cu(avail.), OM, TN, EMg, and ECa. Components with eigenvalues exceeding 0.8 were chosen to ensure that the principal components achieved a sufficient cumulative contribution rate. This selection process identified two principal components, together achieving a cumulative contribution rate of 92.95%. These components were utilized to filter the factors. The specific contributions of the different principal components and the loadings of the different factors in the principal components are shown in Table 1.
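The relationship between the reported contribution rates and the eigenvalue threshold can be checked with simple arithmetic. The sketch assumes the PCA was run on the correlation matrix of the six standardized indicators, in which case the eigenvalues sum to the number of indicators and each eigenvalue equals its contribution rate times six. This assumption is not stated in the text, and the function name is hypothetical.

```python
def eigenvalue_from_contribution(rate_percent, n_indicators):
    """Eigenvalue implied by a contribution rate.

    Assumption: PCA on a correlation matrix, so eigenvalues sum to n_indicators.
    """
    return rate_percent / 100.0 * n_indicators


N_INDICATORS = 6  # Fe(avail.), Cu(avail.), OM, TN, EMg, ECa
rates = {"PC1": 79.25, "PC2": 13.70}  # contribution rates reported in the text

implied = {pc: eigenvalue_from_contribution(r, N_INDICATORS) for pc, r in rates.items()}
cumulative = sum(rates.values())

if __name__ == "__main__":
    for pc, value in implied.items():
        print(pc, round(value, 3))
    print("cumulative contribution (%):", round(cumulative, 2))
```

Under this assumption the second component has an implied eigenvalue of roughly 0.82, which would pass a 0.8 cut-off but not a cut-off of 1, and the two rates add up to the reported cumulative contribution of 92.95%.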

The contribution rate of the first principal component was 79.25%. Factors exhibiting high loadings in this component comprised Fe(avail.), Cu(avail.), OM, TN, EMg, and ECa. The Norm value of each factor was computed, and factors whose Norm value reached at least 90% of the highest Norm value, that of EMg, were considered. Cu(avail.), TN, EMg, and ECa satisfied this criterion. The correlation coefficients of EMg with ECa, Cu(avail.), and TN were 0.96 **, 0.706, and 0.585, respectively. Thus, EMg, which had the highest Norm value, and TN, which had the lowest correlation with EMg, were included in the minimum dataset.

The contribution rate of the eigenvalue of the second principal component registered at 13.70%, with EMg and Cu(avail.) demonstrating higher loadings. Since EMg was previously selected for the minimum dataset from the first component, the analysis primarily addressed the inclusion of Cu(avail.). Cu(avail.) was incorporated into the minimum dataset, and no other factors showed high loadings in the second component.

Accordingly, three indicators, including EMg, TN, and Cu(avail.), were added to the minimum dataset through principal component analysis. Ultimately, cluster analysis, correlation analysis, and PCA selected seven indicators, including EMg, TN, Cu(avail.), Mn(avail.), Zn(avail.), Fe(free), and Si(avail.), as the minimum dataset for soil quality evaluation.

4.4. Soil Quality Index Calculation and Soil Quality Assessment

The weights of factors in the minimum dataset and the types of membership functions are presented in Table 2. In this study area, EMg, TN, Cu(avail.), Mn(avail.), Zn(avail.), and Si(avail.) all exhibit positive correlations with soil quality, defined using S-type functions. In contrast, the concentration of Fe(free) is negatively correlated with soil quality, characterized as an inverse S-type function. The soil quality index for the study area is listed in Table 3 by integrating the weights of each indicator and the calculations from the membership functions. This index varies from 0.06 to 0.88, with an average of 0.4575, indicating a moderate level of overall soil quality. A map illustrating the soil quality grade distribution across the study area was generated based on the index values from sampling points (Figure 5), providing a visual representation of varying soil quality grades throughout the study area. Predominantly, the area shows moderate soil quality grade, with the northeastern section exhibiting a fairly good grade while the northwestern and southeastern sections display fairly poor grades.

The soil quality index and the corresponding map of soil quality grades reveal substantial disparities in the quality of reclaimed soil throughout the study area. Soil quality is lowest at sampling point 1 and highest at sampling point 5, demonstrating a clear north-to-south trend of declining quality. Critical contributors to these differences include variations in OM, Cu(avail.), and EMg levels, which result in relatively poor soil aeration and fertility at sampling point 1, adversely affecting plant growth and respiration [48].

4.5. Validation Results for the CA-CDA-PCA-MF Method

The accuracy of the SQI values for the minimum dataset was tested by comparing them with the SQI values calculated for the entire dataset using PCA alone. The SQI for the whole dataset at each sampling point is shown in Table 4. The SQI based on the whole dataset ranges from 0.02 to 0.91, with a mean value of 0.488, whereas the SQI calculated with the minimum dataset ranges from 0.06 to 0.88, with a mean value of 0.4575. The difference between the two soil quality indices was small, at 6.2%. Regression analysis of the two sets of soil quality indices was carried out, and the resulting regression plot is shown in Figure 6. The regression equation was y = 0.78x + 0.09, and the coefficient of determination, R², was 0.882. The slope of 0.78 indicates that the trends of the two datasets are highly similar (the slope is 1 when they are identical), the intercept of 0.09 indicates that the bias between the two datasets is small (0 when unbiased), and the R² indicates a good fit between them (1 for a perfect fit). The deviation coefficient CV was calculated as −0.053, which indicates that the deviation between the soil quality index of the minimum dataset and that of the whole dataset was small, and that the accuracy of the soil quality index calculated from the minimum dataset met the requirements. In conclusion, the accuracy of the SQI calculated by the CA-CDA-PCA-MF method was verified.
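
The regression check can be sketched in a few lines of Python using the per-sample values transcribed from Tables 3 and 4. This is only an illustrative reconstruction: it assumes that x is the full-dataset SQI and y is the minimum-dataset SQI, and it works from the two-decimal values printed in the tables, so the results may differ slightly from those obtained with the unrounded data. Under these assumptions the fit reproduces the reported slope and intercept to two decimals and gives an R² of roughly 0.88.

```python
from statistics import mean

# Per-sample SQI values transcribed from Table 4 (full dataset) and Table 3 (minimum dataset).
sqi_full = [0.06, 0.02, 0.47, 0.41, 0.39, 0.38, 0.49, 0.76, 0.72, 0.91, 0.68]
sqi_mds = [0.15, 0.06, 0.50, 0.48, 0.49, 0.32, 0.37, 0.54, 0.61, 0.88, 0.66]


def linear_fit(x, y):
    """Ordinary least squares fit y = slope * x + intercept, plus R^2."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # For a one-predictor linear fit, R^2 equals the squared Pearson correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = linear_fit(sqi_full, sqi_mds)
print(f"y = {slope:.2f}x + {intercept:.2f}, R^2 = {r_squared:.3f}")
```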

4.6. Discussion

The observed differences in soil quality can be attributed to several key factors. At sampling point 1, where the lowest SQI (0.105) was recorded, the relatively low concentrations of exchangeable magnesium (EMg), total nitrogen (TN), and available copper (Cu) likely contributed to poor soil aeration and fertility. EMg and TN are essential for maintaining soil structure and nutrient availability, while Cu plays a critical role in plant photosynthesis and respiration. The deficiency of these elements at sampling point 1 may have limited plant growth and overall soil health. In contrast, sampling point 5, with the highest SQI (0.77), exhibited more favorable conditions, likely due to higher concentrations of these key nutrients. Additionally, the high content of free iron (Fe) across the study area was found to negatively impact soil quality, as excessive Fe can lead to nutrient imbalances and reduced plant uptake of essential elements.

The accuracy test showed that the SQI calculated from the minimum dataset was reliable enough to describe soil quality in the region. This is consistent with the study by Johnstone and Lu, in which PCA on a selected subset of variables remained consistent with PCA on the full dataset [13]. With high-dimensional small samples, PCA suffers from multicollinearity among the indicators, which obscures the ecological significance of the principal components [49]. As shown in Table 1, the first principal component (with a contribution of 79.25%) contains six highly correlated indicators (correlation coefficients greater than 0.7), such as EMg, ECa, and Fe(avail.), and it is difficult to differentiate their actual contributions to soil quality. In contrast, CA-CDA-PCA-MF pre-screened the indicators by cluster analysis (CA) and correlation analysis (CDA), merged redundant variables (e.g., ECa and EMg, with a correlation coefficient as high as 0.96), and ultimately retained seven relatively independent indicators as the minimum dataset (MDS). This strategy greatly reduced the information overlap between principal components and made the ecological significance of the PCA loading matrix clearer, i.e., it enhanced the interpretability of the data. In addition, the computational complexity of PCA grows cubically with the number of indicators, whereas CA-CDA-PCA-MF reduces the number of indicators through pre-screening and thus reduces the computational cost of PCA. Despite the small sample size of this study, the stepwise process of CA-CDA-PCA-MF (CA → CDA → PCA) provides a scalable framework for large-scale data scenarios, and the pre-screening process avoids repeated analyses of redundant indicators (e.g., ECa), which reduces the consumption of computational resources. The method is particularly useful for areas with limited data, as it does not require large datasets for accurate soil quality assessment.

In addition, the flexibility of the method allows it to be adapted to different geographical and geological conditions, providing a versatile tool for assessing the effectiveness of reclamation in different environments. For example, the method can be applied to other mining areas or even to non-mining reclamation projects, such as agricultural land rehabilitation or urban greenfield development.
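
The idea of screening out redundant indicators before PCA can be illustrated with a small Python sketch. It is not the procedure used in this study: the data are made up, the 0.9 threshold is an arbitrary assumption, and the rule of keeping whichever indicator comes first is a simplification. The helper names `pearson` and `drop_redundant` are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def drop_redundant(data, threshold=0.9):
    """Keep indicators in order, skipping any whose |r| with an already kept one reaches the threshold."""
    kept = []
    for name in data:
        if all(abs(pearson(data[name], data[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Made-up measurements: "B" is almost a multiple of "A", while "C" is only weakly related.
data = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [2.1, 3.9, 6.2, 8.0, 9.9],
    "C": [3.0, 1.0, 4.0, 1.0, 5.0],
}

screened = drop_redundant(data)
print(screened)
```

Because fewer indicators reach the PCA step, the correlation matrix that has to be decomposed is smaller, which is the source of the computational saving mentioned above.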

Our study also has some limitations. First, the research site is a coal mining subsidence area reclaimed with guest soil, so the overall soil texture and structure are fairly uniform; that is, the geological conditions are relatively simple. The results therefore have a certain degree of generality for sites reclaimed with guest soil, but the experimental program needs to be adjusted for more complex land conditions. For example, Raiesi and Kabiri explored soil quality indicators in a semi-arid area and found that tillage also has a great impact on soil quality, with enzyme and microbial activities being the most important soil quality indicators [50]. Our study site has not yet been disturbed by cultivation, so the soil quality indicators and sampling sites would need to be redesigned to meet the evaluation requirements of more complex site conditions. Another limitation is that the reclamation was completed in a relatively short period of time, and the reclaimed site was not yet covered by vegetation, as shown in Figure 1c. Therefore, the study focused on physicochemical indicators of soil quality, and biological factors such as microbial activity and vegetation restoration were not included.

To address these limitations, future work will first conduct experimental studies in more complex areas with different reclamation methods to explore the effectiveness of the method at different sites. Second, more ecological indicators, such as microbial activity and vegetation restoration, should be incorporated to provide a more comprehensive assessment of soil quality. Finally, the weighting and classification of the indicators should be optimized and, once sufficient data have accumulated, machine learning techniques can be integrated to further improve the accuracy and applicability of the model in predicting soil quality and reclamation outcomes.

5. Conclusions

Based on cluster analysis, correlation analysis, and principal component analysis, a minimum dataset for assessing soil quality in the reclamation area of the Ezhuang Coal Mine, Laiwu District, is established. The soil quality index of collected soil samples is calculated to evaluate the reclamation quality in the study area using a membership function approach. The research findings are summarized as follows:

(1) Cluster analysis, correlation analysis, and PCA determine the minimum dataset for soil quality evaluation in Laiwu’s reclamation area, which includes exchangeable magnesium, total nitrogen, available copper, available manganese, available zinc, free iron, and available silicon;

(2) The soil quality of the Ezhuang coal mine reclamation area was assessed. The soil quality index (SQI) of the study area ranged from 0.06 to 0.88, with a mean value of 0.4575. Soil quality across the reclamation area was mainly moderate, with large spatial variations: sampling site 1 had the worst soil quality (SQI = 0.105) and sampling site 5 had the best (SQI = 0.77);

(3) The accuracy of the CA-CDA-PCA-MF method was verified. The minimum dataset created by the method was checked using the coefficient of determination and the coefficient of deviation, and the method was found to be suitable for soil quality assessment in topsoil reclamation projects;

(4) Factors affecting soil quality were investigated: The apparent differences in soil quality at the study site were attributed primarily to variations in EMg, TN, and Cu concentrations that affect soil structure, fertility, and plant growth. In addition, high levels of free iron (Fe) negatively affected soil quality across the region. These findings contribute to targeted interventions to increase soil fertility and improve soil structure to support sustainable land restoration;

(5) Future directions for expansion of the study are explored. The completed research is for coal mining subsidence areas reclaimed by guest soil, and it is effective and feasible to use the method for soil quality evaluation in areas reclaimed by the same method, which is of great significance for extension. However, for soils reclaimed by other methods, such as chemical reclamation, the soil conditions are affected by the reclamation method, and the evaluation method of this study needs to be re-designed for the sampling program and the selection of indicators.

Author Contributions

Conceptualization, X.L. and S.L.; methodology, D.M.; software, Y.Z.; investigation, F.G., Z.H., C.C. and Y.Z.; resources, B.A.; data curation, Y.Z. and S.L.; writing—original draft preparation, Y.Z.; writing—review and editing, S.L.; supervision, D.M. and X.L.; project administration, S.L.; funding acquisition, S.L. All authors have read and agreed to the published version of the manuscript.

Funding

Financial support for this work is provided by the Jinan Science and Technology Bureau Social Livelihood Special Project (No. 202317001) and the Future Plan for Young Talent in Shandong University (31410082064103).

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

Authors Xueqiang Lv, Bochao An, Zhichao Huo and Fangru Guo were employed by the company Shandong Ding’an Testing Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Wang, L.; Lei, S.G.; Bian, Z.F. Review on Study of Ecological Damage and Natural Recovery in the Coal Mining Subsidence Area in Western China. Resour. Dev. Market 2017, 33, 1188–1192.
2. Zhou, J.; Zhao, C.S.; Zhang, L.J.; Wang, L.W.; Wang, L.L. Research progress of land rehabilitation and soil remediation in mining area. J. Northeast. Norm. Univ. (Nat. Sci. Ed.) 2023, 55, 151–156.
3. Lehmann, J.; Bossio, D.A.; Kögel-Knabner, I.; Rillig, M.C. The concept and future prospects of soil health. Nat. Rev. Earth Environ. 2020, 1, 544–553.
4. Bünemann, E.K.; Bongiorno, G.; Bai, Z.; Creamer, R.E.; De Deyn, G.; de Goede, R.; Fleskens, L.; Geissen, V.; Kuyper, T.W.; Mäder, P.; et al. Soil quality—A critical review. Soil Biol. Biochem. 2018, 120, 105–125.
5. Jansson, J.K.; Hofmockel, K.S. Soil microbiomes and climate change. Nat. Rev. Microbiol. 2020, 18, 35–46.
6. Blasco, C.; Picó, Y. Prospects for combining chemical and biological methods for integrated environmental assessment. TrAC Trends Anal. Chem. 2009, 28, 745–757.
7. Hou, D.; O’Connor, D.; Igalavithana, A.D.; Alessi, D.S.; Luo, J.; Tsang, D.C.W.; Sparks, D.L.; Yamauchi, Y.; Rinklebe, J.; Ok, Y.S. Metal contamination and bioremediation of agricultural soils for food safety and sustainability. Nat. Rev. Earth Environ. 2020, 1, 366–381.
8. Cheng, H.; Yuan, M.; Tang, L.; Shen, Y.; Yu, Q.; Li, S. Integrated microbiology and metabolomics analysis reveal responses of soil microorganisms and metabolic functions to phosphorus fertilizer on semiarid farm. Sci. Total Environ. 2022, 817, 152878.
9. Kopittke, P.M.; Menzies, N.W.; Wang, P.; McKenna, B.A.; Lombi, E. Soil and the intensification of agriculture for global food security. Environ. Int. 2019, 132, 105078.
10. Chen, L.Q.; Deng, K.Z.; Xu, L.H. Method of Quantitative Evaluation of Quality of Reclaimed Soil. J. China Univ. Min. Technol. 1999, 5, 38–41.
11. Zhao, H.B.; Li, Y.; Zhang, H.W. Evaluation of reclamation in Hequ open-pit coal mine dump: Soil fertility restoration model. Int. J. Min. Sci. Technol. 2024, 9, 77–87.
12. Yang, Y. Quality Evaluation of Reclaimed Soil in Subsidence Area of Huainan and Huaibei Mining Area. Master’s Thesis, Anhui University of Science and Technology, Hefei, China, 2019.
13. Johnstone, I.M.; Lu, A.Y.C. On consistency and sparsity for principal components analysis in high dimensions. J. Am. Stat. Assoc. 2009, 104, 682–693.
14. Fierer, N.; Jackson, R.B. The diversity and biogeography of soil bacterial communities. Proc. Natl. Acad. Sci. USA 2006, 103, 626–631.
15. Mulder, V.L.; de Bruin, S.; Schaepman, M.E.; Mayr, T.R. The use of remote sensing in soil and terrain mapping—A review. Geoderma 2011, 162, 1–19.
16. Padarian, J.; Minasny, B.; McBratney, A.B. Using deep learning to predict soil properties from regional spectral data. Geoderma Reg. 2019, 16, e00198.
17. Nascimento, C.M.; Mendes, W.S.; Quiñonez Silvero, N.E.; Poppiel, R.R.; Sayão, V.M.; Dotto, A.C.; dos Santos, N.V.; Amorim, M.T.A.; Demattê, J.A.M. Soil degradation index developed by multitemporal remote sensing images, climate variables, terrain and soil attributes. J. Environ. Manag. 2021, 277, 111316.
18. Pierce, F.J.; Larson, W.E.; Dowdy, R.H. Productivity of soils: Assessing long-term changes due to erosion. J. Soil Water Conserv. 1983, 38, 39–44.
19. Li, X.-R. Study on Quality Change of Reclaimed Soil in Coal Mining Subsidence Area in Zoucheng City. J. Anhui Agric. Sci. 2008, 36, 14206–14209.
20. Zhang, X.; Li, F.; Li, X. Evolution of soil quality on a subsidence slope in a coal mining area: A complex network approach. Arab J. Geosci. 2022, 15, 549.
21. Zhao, Y.; Zheng, W.; Xiao, W.; Zhang, H.; Lv, X.; Zhang, J. Rapid monitoring of reclaimed farmland effects in coal mining subsidence area using a multi-spectral UAV platform. Environ. Monit. Assess. 2020, 192, 474.
22. Wang, A.; Liu, G.; Xu, X.; Li, X.; Li, Y. Evaluation of soil quality in iron tailing ore wastelands of various reclamation periods. J. Beijing For. Univ. 2020, 42, 104–113.
23. Yu, P.J.; Han, D.L.; Liu, S.W.; Wen, X.; Huang, Y.X.; Jia, H.T. Soil quality assessment under different land uses in an alpine grassland. Catena 2018, 171, 280–287.
24. Wendyam, A.F.D.; John, M.G.; James, M.R.; Patrick, G.H. Soil quality index (SQI) for evaluating the sustainability status of Kakia-Esamburmbur catchment under three different land use types in Narok County, Kenya. Heliyon 2024, 10, e25611.
25. Muhammad, J.N.; Muhammad, F.H.; Zeeshan, A.; Waqar, A.; Said, A. Evaluation of soil quality through simple additive soil quality index (SQI) of Tehsil Charsadda, Khyber Pakhtunkhwa, Pakistan. J. Saudi Soc. Agric. Sci. 2024, 23, 42–54.
26. Roderick, A.M.W.; Andrea, O.; Kiri, R.; Steven, K.; Roslyn, M.; Andrew, H.; Fiona, L.H. Estimating soil health in urban allotments: Integrated two-way soil quality index and free-living amoebae in nitrogen recycling. Soil Environ. Health 2023, 1, 100046.
27. Guo, L.L.; Sun, Z.G.; Ouyang, Z.; Han, D.R.; Li, F.D. A comparison of soil quality evaluation methods for Fluvisol along the lower Yellow River. Catena 2017, 152, 135–143.
28. Wang, D.W.; Bai, J.H.; Wang, W.; Zhang, G.L.; Cui, B.S.; Liu, X.H.; Li, X.W. Comprehensive assessment of soil quality for different wetlands in a Chinese delta. Land Degrad. Develop. 2018, 29, 3783–3794.
29. Zhang, F.R.; An, P.L.; Wang, J.Y.; Zhang, J.L.; Liu, L.M.; Chen, H.W. Soil Quality Criteria and Methodologies of Farmland Grading. Resour. Sci. 2002, 02, 71–75.
30. NY/T1121.15; Soil Testing. Part 15: Method for Determination of Soil Available Silicon. Industry Standards-Agriculture. Ministry of Agriculture of the PRC: Beijing, China, 2006.
31. NY/T85; Method for Determination of Soil Organic Matter. Industry Standards-Agriculture. Ministry of Agriculture of the PRC: Beijing, China, 1988.
32. NY/T53; Method for the Determination of Soil Total Nitrogen (Semi-Micro Kjeldahl Method). Industry Standards-Agriculture. Ministry of Agriculture of the PRC: Beijing, China, 1987.
33. NY/T1121.13; Soil Testing. Part 13: Method for Determination of Soil Exchangeable Calcium and Magnesium. Industry Standards-Agriculture. Ministry of Agriculture of the PRC: Beijing, China, 2006.
34. NY/T 890; Determination of Available Zinc, Manganese, Iron, Copper in Soil-Extraction with Buffered DTPA Solution. Industry Standards-Agriculture. Ministry of Agriculture of the PRC: Beijing, China, 2004.
35. Jin, H.F.; Shi, D.M.; Chen, Z.F.; Liu, Y.J.; Lou, Y.B.; Yang, X. Evaluation indicators of cultivated layer soil quality for red soil slope farmland based on cluster and PCA analysis. Trans. CSAE 2018, 34, 155–164.
36. Han, W.; Zhai, P.M. Three Cluster Methods in Regionalization of Temperature Zones in China. Clim. Environ. Res. 2015, 20, 111–118.
37. Jiang, D.S. Studies on the Evolution Law of the Fertility Quality of Arable Land and Its Influencing Factors in Red Soil Hilly Areas. Ph.D. Thesis, Hunan Agricultural University, Changsha, China, 2009.
38. Askari, M.S.; Holden, N.M. Quantitative soil quality indexing of temperate arable management systems. Soil Tillage Res. 2015, 150, 57–67.
39. Chen, Z.F.; Shi, D.M.; Jin, H.F.; Lou, Y.B.; He, W.; Xia, J.R. Evaluation on cultivated-layer soil quality of sloping farmland in Yunnan based on soil management assessment framework (SMAF). Trans. CSAE 2019, 35, 256–267.
40. Chen, W.Y.; Yang, X.H.; Yang, H.C.; Zhang, F.H. Soil quality assessment of residual salinization in oasis agricultural area of Manasi river basin. Soils Fertil. Sci. China 2023, 4, 1–7.
41. Nash, J.E.; Sutcliffe, J.V. River flow forecasting through conceptual models part I: A discussion of principles. J. Hydrol. 1970, 10, 282–290.
42. Brady, N.C.; Weil, R.R. The Nature and Properties of Soils, 15th ed.; Pearson Education: London, UK, 2017.
43. Marschner, P. Marschner’s Mineral Nutrition of Higher Plants, 3rd ed.; Academic Press: New York, NY, USA, 2012.
44. Lubomír, B.; Kateřina, K.; Martina, K.; Jakub, B.; Jakub, Š.; Emilie, P. SOC content—An appropriate tool for evaluating the soil quality in a reclaimed post-mining landscape. Ecol. Eng. 2012, 43, 53–59.
45. Song, W.; Li, J.Y.; Li, X.J.; Xu, D.Y.; Min, X.Y. Effects of land reclamation on soil organic carbon and its components in reclaimed coal mining subsidence areas. Sci. Total Environ. 2024, 908, 168523.
46. Zhou, F.Z.; Wang, M.Q.; Ran, L. The available content and spatial distribution of copper, zinc, iron and manganese in Lichuan plowland. Hubei Agric. Sci. 2019, 58, 64–71+101.
47. Wang, Z.L.; Teng, H.H.; Jiang, Q.X.; Liu, C.X.; Shan, J.X.; Wang, K. Effects of Snow Cover Change on the Content of Base Ions and Available Silicon and Aluminum of Black Soil in Northeast China. J. Soil Water Conserv. 2024, 38, 147–156+164.
48. Shen, J.; Yuan, L.; Zhang, J.; Li, H.; Bai, Z.; Chen, X.; Zhang, W.; Zhang, F. Phosphorus dynamics: From soil to plant. Plant Physiol. 2011, 156, 997–1005.
49. Jolliffe, I.T.; Cadima, J. Principal component analysis: A review and recent developments. Phil. Trans. R. Soc. A 2016, 374, 20150202.
50. Raiesi, F.; Kabiri, V. Identification of soil quality indicators for assessing the effect of different tillage practices through a soil quality index in a semi-arid environment. Ecol. Indic. 2016, 71, 198–207.

Figure 1. Location of study area and distribution of sampling points. (a) Location of Laiwu district in China. (b) Elevation map of Laiwu district. (c) Satellite map of the study area and specific location of sampling sites.

Figure 2. Technical route map.

Figure 3. Cluster analysis results.

Figure 4. Correlation heat map.

Figure 5. Distribution chart of soil quality grades.

Figure 6. Regression analysis of SQI between CA-CDA-PCA-MF and full dataset PCA.

Table 1. Indicator loadings and norm values of minimum dataset alternative samples. The table contains the loadings of each indicator on the principal components and the contribution of each principal component.

Evaluation Indicators	Principal Component (PC)		Norm
Evaluation Indicators	PC1	PC2	Norm
EMg (cmol/kg)	0.857	0.473	2.130
ECa (cmol/kg)	0.922	0.293	2.028
Fe(avail.) (mg/kg)	0.858	0.308	1.891
Cu(avail.) (mg/kg)	0.941	−0.19	2.059
OM (g/kg)	0.853	−0.488	1.912
TN (g/kg)	0.907	−0.379	2.007
Eigenvalue	4.755	0.822
Contribution rate	79.253	13.702
Cumulative contribution	79.253	92.955

Table 2. Factor weights and membership function types of minimum dataset.

Minimum Dataset
Indicators	Type of Membership Function	Common Factor Variance	Factor Weight
EMg (cmol/kg)	S	0.916	0.246
Cu(avail.) (mg/kg)	S	0.960	0.334
TN (g/kg)	S	0.875	0.304
Si(avail.) (mg/kg)	S	0.950	0.0902
Fe(free) (g/kg)	inverted S	0.976	−0.225
Mn(avail.) (mg/kg)	S	0.962	0.204
Zn(avail.) (mg/kg)	S	0.951	0.046

Note: Common factor variance: Indicates the ability of the indicator to explain the principal components; Factor weight: Calculated based on the standardization of principal component analysis, the sum is 1.

Table 3. Soil quality index table. The soil quality index was calculated for all samples at each sampling site. The average of multiple samples from the same location at the same depth was taken as the soil quality index for that depth, and the soil quality index for a site was the average of the soil quality indices for its different depths.

Sampling Location	Sampling Point 1		Sampling Point 2			Sampling Point 3		Sampling Point 4		Sampling Point 5
Sampling Depth	0–20 cm	40–60 cm	0–20 cm	40–60 cm	40–60 cm	0–20 cm	40–60 cm	0–20 cm	40–60 cm	0–20 cm	40–60 cm
Sample soil quality index	0.15	0.06	0.50	0.48	0.49	0.32	0.37	0.54	0.61	0.88	0.66
Soil quality index at a sampling location	0.105		0.4925			0.345		0.575		0.77
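
The averaging described in the caption of Table 3 can be checked with a short Python sketch. The values are transcribed from the table; the dictionary layout and the name `site_sqi` are illustrative choices rather than part of the original analysis. Samples are first averaged within each depth, then the depth means are averaged for each sampling point, and the overall mean is taken over the five sampling points.

```python
from statistics import mean

# Per-sample SQI values from Table 3, grouped by sampling point and depth.
samples = {
    1: {"0-20 cm": [0.15], "40-60 cm": [0.06]},
    2: {"0-20 cm": [0.50], "40-60 cm": [0.48, 0.49]},
    3: {"0-20 cm": [0.32], "40-60 cm": [0.37]},
    4: {"0-20 cm": [0.54], "40-60 cm": [0.61]},
    5: {"0-20 cm": [0.88], "40-60 cm": [0.66]},
}


def site_sqi(depths):
    """Average the samples within each depth, then average the depth means."""
    return mean(mean(values) for values in depths.values())


site_values = {point: site_sqi(depths) for point, depths in samples.items()}
overall = mean(site_values.values())

for point, value in site_values.items():
    print(f"Sampling point {point}: {value:.4f}")
print(f"Mean over sampling points: {overall:.4f}")
```

With these inputs the site-level values match the last row of Table 3, and their mean matches the reported average of 0.4575.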

Table 4. Soil quality index for all data sets using only the PCA method.

Sampling Location	Sampling Point 1		Sampling Point 2			Sampling Point 3		Sampling Point 4		Sampling Point 5
Sampling Depth	0–20 cm	40–60 cm	0–20 cm	40–60 cm	40–60 cm	0–20 cm	40–60 cm	0–20 cm	40–60 cm	0–20 cm	40–60 cm
Sample soil quality index	0.06	0.02	0.47	0.41	0.39	0.38	0.49	0.76	0.72	0.91	0.68
Soil quality index at a sampling location	0.04		0.435			0.435		0.74		0.795

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
