Informal Settlements Extraction and Fuzzy Comprehensive Evaluation of Habitat Environment Quality Based on Multi-Source Data

Yang, Zanxian; Yang, Fei; Xiang, Yuanjing; Yang, Haiyi; Deng, Chunnuan; Hong, Liang; Sun, Zhongchang

doi:10.3390/land14030556


by Zanxian Yang ¹, Fei Yang ¹,²,³,*, Yuanjing Xiang ³, Haiyi Yang ²,⁴, Chunnuan Deng ¹,*, Liang Hong ¹ and Zhongchang Sun ⁵,⁶

¹ Faculty of Geography, Yunnan Normal University, Kunming 650500, China
² State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
³ Yunnan Yuanjing Planning and Research Institute (Group) Co., Ltd., Kunming 650051, China
⁴ College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100049, China
⁵ International Research Center of Big Data for Sustainable Development Goals, Beijing 100094, China
⁶ Key Laboratory of Digital Earth Science, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
* Authors to whom correspondence should be addressed.

Land 2025, 14(3), 556; https://doi.org/10.3390/land14030556

Submission received: 9 February 2025 / Revised: 2 March 2025 / Accepted: 4 March 2025 / Published: 6 March 2025


Abstract:

The United Nations Sustainable Development Goal (SDG) 11.1 emphasizes improving well-being, ensuring housing security, and promoting social equity. Informal settlements, one of the most vulnerable groups, require significant attention due to their dynamic changes and habitat quality. These areas limit the ability to comprehensively capture spatial heterogeneity and dynamic shifts in regional sustainable development. This study proposes an integrated approach using multi-source remote sensing data to extract the spatial distribution of informal settlements in Mumbai and assess their habitat environment quality. Specifically, seasonal spectral indices and texture features were constructed using Sentinel and SAR data, combined with the mean decrease impurity (MDI) indicator and hierarchical clustering to optimize feature selection, ultimately using a random forest (RF) model to extract the spatial distribution of informal settlements in Mumbai. Additionally, an innovative habitat environment index was developed through a Gaussian fuzzy evaluation model based on entropy weighting, providing a more robust assessment of habitat quality for informal settlements. 
The study demonstrates that: (1) texture features from the gray level co-occurrence matrix (GLCM) significantly improved the classification of informal settlements, with the random forest classification model achieving a kappa coefficient above 0.77, an overall accuracy exceeding 0.89, and F1 scores above 0.90; (2) informal settlements exhibited two primary development patterns: gradual expansion near formal residential areas and dependence on natural resources such as farmland, forests, and water bodies; (3) economic vitality emerged as a critical factor in improving the living environment, while social, natural, and residential conditions remained relatively stable; (4) the proportion of highly suitable and moderately suitable areas increased from 65.62% to 65.92%, although the overall improvement in informal settlements remained slow. This study highlights the novel integration of multi-source remote sensing data with machine learning for precise spatial extraction and comprehensive habitat quality assessment, providing valuable insights into urban planning and sustainable development strategies.

Keywords: informal settlements; MDI; entropy-weighted; fuzzy comprehensive evaluation; habitat environment quality

1. Introduction

Urbanization has led to massive population migration, contributing to the rapid expansion of informal settlements in many developing countries. These settlements are often characterized by inadequate living conditions, heightened socio-economic vulnerabilities, and a lack of essential services like housing, sanitation, and waste management. Around 55% of the global population now resides in urban areas, with a substantial portion living in informal settlements that exacerbate existing social inequalities and environmental challenges [1,2]. While the United Nations Sustainable Development Goal 11 (SDG 11) calls for ensuring access to adequate housing, safe services, and slum upgrading [3], there remains a critical gap in obtaining accurate baseline data for informal settlements. Resource limitations, poor accessibility, and inadequate urban planning prevent effective intervention. The expansion of these settlements further deepens social inequalities and environmental degradation, hindering progress towards sustainable development. Despite extensive studies on informal settlements, many lack comprehensive, spatially explicit data needed for effective planning and policy making. This study, therefore, seeks to fill this gap by offering an integrated approach to assess both the spatial distribution and habitat quality of informal settlements in Mumbai, aligning with SDG 11 to improve quality of life and promote ecological sustainability.

Informal settlements are characterized by complex and dispersed distributions, complicating efforts to standardize measurement and identification across different types of settlements. These challenges significantly impact the extraction process of informal settlements from remote sensing data [4]. The United Nations defines informal settlements as unplanned urban areas lacking basic infrastructure, with shelters constructed on unauthorized land [5]. While this definition is widely accepted, informal settlements also face additional challenges such as limited access to transportation, inadequate waste management, and severe environmental pollution, all of which worsen social disparities and health conditions [6,7,8,9]. Furthermore, they are also vulnerable to natural disasters like extreme rainfall, floods, earthquakes, and epidemics [10]. Although previous studies have categorized informal settlements using key dimensions like housing quality, living space, and sanitation, these studies often lack comprehensive methodologies that integrate spatial data and environmental quality. This study builds upon existing frameworks by applying a more integrated approach that combines spatial data extraction with habitat quality assessment. We systematically categorize informal settlements using five key dimensions: insecure tenure, poor housing structure, limited space, insufficient access to water, and poor sanitation infrastructure [11], and propose targeted interventions to improve these aspects in the context of SDG 11.

Due to the unique distribution and characteristics of informal settlements, traditional data collection methods, such as manual sampling proposed by the Slum Health Improvement Coalition [12] and periodic housing surveys conducted by international organizations like the United Nations [13], are insufficient for assessing SDG 11. Recent advancements in remote sensing technology, coupled with machine learning, have greatly improved the precision of classifying slums [14,15]. The effectiveness of these technologies has been validated through various studies. For example, Pelizari et al. [16] utilized feature subset selection and random forest classifiers to enhance the accuracy of detecting refugee camp buildings; Prabhu and Alagu Raja [17] developed a method combining statistical and spectral techniques for accurate identification of urban slums; Williams et al. [18] introduced an object-based hierarchical machine learning approach that integrates high-resolution imagery and boundary data to map slums. Additionally, Leonita et al. [19] demonstrated the effectiveness of support vector machines and random forest algorithms in improving the accuracy of slum mapping, particularly in Bandung, Indonesia. While these methods have shown promise, the balance between classification accuracy and geographic granularity remains a key research challenge [20]. This study seeks to address this gap by integrating scene-based methods that incorporate spatial context, improving the precision of informal settlement mapping without sacrificing geographic resolution.

While deep learning (DL) has shown significant promise in slum mapping and SDG 11.1.1 monitoring, it faces notable limitations compared to traditional machine learning techniques. High-resolution (HR) and very high-resolution (VHR) imagery, while offering detailed texture features, are resource-intensive, requiring substantial time, labor, and computational power, which limits their scalability for large-scale urban analyses [21]. Additionally, DL models tend to be less interpretable than traditional machine learning models, making them less reliable for real-world slum mapping tasks. In contrast, machine learning algorithms, by integrating feature engineering with expert knowledge, are better suited to understanding the characteristics of informal settlements in high-density urban areas [22]. Moreover, DL performs poorly when relying on limited samples from informal settlements. The development history of informal settlements varies across regions, leading to significant differences in their types and characteristics, making it difficult to generate large-scale, high-quality training samples [23]. Therefore, machine learning, especially small-scale models based on a small number of high-quality samples, is more feasible and reliable for SDG 11.1.1 measurement tasks.

The rapid growth of informal settlements is driven by socio-economic and demographic factors, leading to significant challenges not only for residents’ quality of life but also for the overall livability of cities [24]. The concept of human settlement suitability, introduced in the 1950s by Greek planner Doxiadis [25], has evolved into a comprehensive method for assessing livability, emphasizing the scientific and quantitative evaluation of living conditions. While human settlement suitability has been widely studied, many existing studies rely on static surveys and methods that fail to capture the dynamic, multi-dimensional nature of informal settlements. In 1961, the World Health Organization proposed “suitability” as one of the basic requirements for human settlement construction and development, urging governments to focus on improving public service facilities and enhancing the quality of life for residents of informal settlements. For a long time, the human settlement environment has been widely studied by experts in geography, ecology, and sociology, incorporating various regions, methods, and indicators. Most studies combine natural and human environmental factors to construct human settlement suitability indices [26]. For example, Maimaiti et al. [27] defined the suitability of human settlements in arid and semi-arid regions using remote sensing imagery and socio-economic data, while Huang et al. [28] integrated remote sensing data with social perception data to establish frameworks for evaluating urban residential land suitability. Some studies have also developed multi-level, multi-indicator systems [29] to assess the quality of urban living environments, while methods like ImPACT and trade-off analysis [30] have been applied to quantitatively identify spatial differentiation in rural living environments. In addition to suitability analysis, many studies have focused on issues related to the quality of life and vulnerability in informal settlements. 
For instance, Shahraki et al. [31] surveyed 400 households in informal areas of Kabul, Afghanistan, to assess local residents’ quality of life, finding widespread dissatisfaction with transportation, leisure, and governance, as well as material deprivation in basic services like water and energy. Chitekwe-Biti et al. [32] used survey data from 328 households in informal areas of Zhanjan City, finding that somatic factors (with a coefficient of 5.61) had the greatest impact on livability conditions. Giri et al. [33] randomly selected 300 slum households from the hilly areas of Kathmandu, Nepal, to assess residents’ vulnerability, revealing that those in flat areas were more vulnerable than those in hilly areas. Patri et al. [34] collected structured questionnaire data from 200 slum households to construct a vulnerability index, showing that slum populations are more susceptible to natural disasters due to economic, material, and awareness-related vulnerabilities. Most studies on human settlement evaluation in informal settlements focus on factors such as quality of life, comfort, and habitability, typically relying on survey methods such as questionnaires and interviews. However, these methods are limited in their ability to capture the broader and multi-dimensional environmental dynamics of informal settlements.

Based on the above analysis, it is clear that informal settlements not only lack accurate baseline data but also exhibit complex spatial variations in their living environments, which are influenced by a range of natural, socio-economic, and policy factors. While existing studies have addressed some of these factors, they often rely on traditional data collection methods or static surveys, which fail to capture the dynamic and multi-dimensional nature of informal settlements. This study seeks to fill these gaps by integrating spatial data extraction with habitat quality assessment, providing a more comprehensive approach to analyzing informal settlements. By combining advanced spatial data extraction techniques with seasonal feature engineering and a Gaussian fuzzy evaluation model, we aim to assess the suitability of human living environments in informal settlements more accurately and holistically. Specifically, we employ RF models to classify slum areas in Mumbai, selecting 18 secondary indicators across four dimensions: economy, society, environment, and living conditions. Using multi-source remote sensing data, we develop a Gaussian fuzzy evaluation model with the entropy-weighted method (EWM) to assess habitat quality in informal settlements. This integrated approach addresses the limitations of previous studies by incorporating dynamic spatial data and improving the precision of environmental quality assessments. The findings from this study contribute valuable insights for urban planning, particularly for optimizing and improving the living conditions in informal settlements, in line with SDG 11.

2. Research Scope and Data Sources

2.1. Overview of the Study Area

Mumbai, one of the fastest-urbanizing cities in the world, is located in the state of Maharashtra, India, on the Arabian Sea coast. As one of India’s largest cities and financial centers, Mumbai has an urban population of approximately 12.4 million, and the population of the greater metropolitan area exceeds 20 million. Approximately 55% of Mumbai’s urban population lives in informal settlements, which are predominantly concentrated in areas affected by rapid urban expansion [21]. The study area covers the city of Mumbai and its suburban districts, totaling around 512 square kilometers (Figure 1). Built-up areas occupy about 40% of the study area, while forest land covers approximately 36%. However, the rapid urbanization of Mumbai has created significant challenges that disproportionately affect informal settlements. Issues such as urban sprawl, inadequate infrastructure, extreme weather events, flooding, and environmental degradation are particularly severe in these informal areas [35,36]. The lack of proper urban planning, coupled with high population density and poverty, exacerbates the vulnerability of residents in informal settlements. These challenges are not just a result of the city’s overall urbanization but are tightly linked to the spatial and socio-economic dynamics of informal settlements themselves, where residents often face substandard living conditions and lack access to essential services. Therefore, studying informal settlements in Mumbai and evaluating the quality of their habitat environment are critical to understanding and addressing the unique challenges they face in the context of the city’s rapid urbanization.

2.2. Data Sources

2.2.1. Sample Data of Informal Settlements

The Mumbai Metropolitan Region Development Authority (MMRDA) is an urban development agency in Mumbai responsible for mapping informal settlements and supporting the government in slum rehabilitation programs (https://sra.gov.in, accessed on 12 December 2024). The study obtained slum demarcation maps from MMRDA and generated random points for informal settlements within these areas. Basic building data from OpenStreetMap (OSM) for 2017 and 2022 were filtered to identify formal residential areas. For 2017, 23,830 points for informal settlements and 41,540 points for formal residential areas were identified; for 2022, the corresponding counts were 23,850 and 41,590.

2.2.2. Variable Data

The variable data used for extracting informal settlements and evaluating habitat environment quality include Sentinel-1 and Sentinel-2 data, which provide spectral and texture features for identifying the spatial distribution of informal settlements. Additionally, other publicly available data sources are integrated for habitat environment quality assessment, including OSM data, land use data, population density data, PM2.5 concentration data, climatic data, and topographic data, as shown in Table 1. These data sources vary significantly in resolution, accuracy, and update frequency, which may lead to inconsistencies during data integration.

To address these issues and ensure consistency across the data sources, several preprocessing steps were implemented. First, all datasets were resampled to a unified spatial resolution. For categorical data, the nearest neighbor resampling method was applied to avoid introducing significant errors, while continuous data (such as temperature or PM2.5 concentration data) were resampled using bilinear interpolation to better preserve spatial features. Next, spatial registration was performed to align all datasets to the same coordinate system, ensuring geometric consistency and correcting any distortions. For datasets with different update frequencies, particularly time-sensitive data such as PM2.5 and population density data, temporal alignment was applied to ensure that the data accurately reflect the same time period for analysis.
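The difference between the two resampling rules can be illustrated with a minimal sketch in plain Python. The grids and the helper names `nearest` and `bilinear` are hypothetical and are not taken from the study, which performed resampling in GEE and ArcGIS; the sketch only shows why class codes should not be interpolated.

```python
import math

def nearest(grid, r, c):
    """Nearest-neighbor sample at fractional (row, col): returns an existing cell value."""
    rows, cols = len(grid), len(grid[0])
    ri = min(max(int(math.floor(r + 0.5)), 0), rows - 1)
    ci = min(max(int(math.floor(c + 0.5)), 0), cols - 1)
    return grid[ri][ci]

def bilinear(grid, r, c):
    """Bilinear sample at fractional (row, col): distance-weighted mean of 4 neighbors."""
    rows, cols = len(grid), len(grid[0])
    r = min(max(r, 0.0), rows - 1.0)
    c = min(max(c, 0.0), cols - 1.0)
    r0, c0 = int(math.floor(r)), int(math.floor(c))
    r1, c1 = min(r0 + 1, rows - 1), min(c0 + 1, cols - 1)
    fr, fc = r - r0, c - c0
    top = grid[r0][c0] * (1 - fc) + grid[r0][c1] * fc
    bottom = grid[r1][c0] * (1 - fc) + grid[r1][c1] * fc
    return top * (1 - fr) + bottom * fr

land_use = [[1, 1], [3, 3]]          # hypothetical categorical class codes
pm25 = [[20.0, 30.0], [40.0, 50.0]]  # hypothetical continuous concentrations

print(nearest(land_use, 0.4, 0.5))   # stays a valid class code
print(bilinear(land_use, 0.5, 0.0))  # yields 2.0, a class that is not in the grid
print(bilinear(pm25, 0.5, 0.5))      # smooth value between the four neighbors
```

Interpolating the categorical grid produces a code that belongs to no class, which is the kind of error nearest-neighbor resampling avoids.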

This process enhanced the robustness and scientific rigor of the data integration. Following the work of Yu et al. [37], platforms such as GEE and ArcGIS were combined with multi-source remote sensing data to construct 16 indicators across four main aspects: economic conditions, social conditions, natural conditions, and residential conditions, as shown in Figure 2. This was achieved through spatial correction, vector-to-raster conversion, projection processing, and image cropping. The indicators included annual average precipitation, annual average temperature, PM2.5 concentration, and distances to POIs such as schools, parks, stations, and hospitals, as well as to rivers and railways.

3. Methodology Overview

This study follows a research process consisting of four key steps, designed to accurately extract informal settlements in Mumbai and thoroughly assess their habitat quality (Figure 3). First, by combining prior knowledge and references, we extract the main factors influencing informal settlements from multi-source remote sensing data (including Sentinel-1 and Sentinel-2 data) and preprocess the data using GEE and ArcGIS 10.4 software. The data preprocessing steps include image registration, denoising, and image fusion, ensuring spatial consistency across data from different sources and providing high-quality inputs for subsequent analysis. Next, we apply interpolation methods to construct a complete dataset for identifying informal settlements and assessing habitat quality, combining spectral indices, texture features, and other key geographic data. During this process, to reduce feature redundancy and avoid overfitting, we combine hierarchical clustering with the random forest model to optimize feature selection. This ensures the representativeness of features and enhances the model’s stability and accuracy. Based on the optimized features, we use the random forest model to map the spatial distribution of informal settlements at a 10 m resolution. An entropy-weighted fuzzy evaluation method is then used to determine the weights of each influencing factor, and a weighted overlay of grid maps for each factor is performed to obtain a comprehensive habitat suitability assessment. Finally, based on spatial overlay analysis, we combine the geographical points of informal settlements with the habitat suitability assessment results and perform a grading analysis to determine the habitat characteristics of different areas, assessing Mumbai’s progress in achieving SDG 11.1. 
This technical approach ensures the accuracy of informal settlement extraction and the scientific rigor of habitat evaluation, providing a reliable basis for subsequent urban planning and sustainable development goals.

3.1. Calculation of Spectral Indices and Texture Features

Sentinel-1 radar backscatter data are sensitive to buildings, surface roughness, and urban structures. Combining VV and VH polarization data yields richer information on mixed surface features, which facilitates detecting building density and layout in informal settlements. Sentinel-2 provides medium-resolution spectral data that can identify vegetation, bare land, and buildings at different scales, enhancing the spectral differentiation between informal settlements and other land covers and thereby improving classification accuracy. All data preprocessing in this section was completed on the GEE cloud platform. Using GEE’s median composite algorithm, gaps left by cloud removal were filled with the median value of each pixel over the seasonal period, resulting in high-quality images. A total of 404 scenes were obtained. From these, base index data for four seasons (Spring: Spr, Summer: Sum, Autumn: Aut, Winter: Win) in 2017 and 2022 were derived, as shown in Table 2 and Table 3.
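The idea behind a seasonal median composite can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the three tiny scenes are hypothetical, `None` stands in for a cloud-masked pixel, and `seasonal_median` is an illustrative helper rather than the GEE routine used in the study.

```python
from statistics import median

def seasonal_median(stack):
    """Per-pixel median over a list of 2-D scenes; None marks a cloud-masked pixel."""
    rows, cols = len(stack[0]), len(stack[0][0])
    out = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Only observations that survived cloud masking contribute.
            valid = [scene[i][j] for scene in stack if scene[i][j] is not None]
            if valid:
                out[i][j] = median(valid)
    return out

# Three hypothetical scenes from one season (reflectance-like values).
scenes = [
    [[0.10, None], [0.30, 0.40]],
    [[0.12, 0.22], [None, 0.44]],
    [[0.50, 0.20], [0.34, None]],
]
composite = seasonal_median(scenes)
print(composite)
```

Every pixel in the composite has a value as long as at least one scene of the season observed it, and a single bright outlier (0.50 in the first pixel) does not shift the result.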

In this study, texture features were extracted from the Sentinel-2 imagery using the GLCM method in GEE. We selected the near-infrared (B8), red (B4), and green (B3) bands to calculate a series of texture indices, including contrast, energy, correlation, homogeneity, and entropy, using the built-in ee.Image.glcmTexture() function with a 3 × 3 neighborhood window. Extracting these indices seasonally captures the seasonal variations in the texture features of informal settlements, providing a more accurate reflection of how different seasons affect their spatial structure and environmental conditions. This seasonal extraction enables more detailed urban environmental monitoring and change analysis.
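To make the texture measures concrete, the following minimal sketch builds a GLCM for a tiny quantized image and derives a few Haralick-style statistics. It is an assumption-laden illustration: the images are hypothetical, a single horizontal offset over the whole image is used instead of GEE's moving window, and the helper names `glcm` and `texture` are not from the study. The keys `diss` and `idm` follow the feature naming used later in the results (Spr-diss, Spr-idm).

```python
import math

def glcm(img, levels, dr=0, dc=1):
    """Symmetric, normalized gray-level co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(img), len(img[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = img[r][c], img[r2][c2]
                counts[a][b] += 1.0
                counts[b][a] += 1.0  # count both directions (symmetric GLCM)
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def texture(p):
    """Texture statistics computed from a normalized GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "diss": sum(v * abs(i - j) for i, j, v in cells),      # dissimilarity
        "idm": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),  # homogeneity
        "asm": sum(v * v for i, j, v in cells),                # energy
        "ent": -sum(v * math.log(v) for i, j, v in cells if v > 0),
    }

smooth = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]  # hypothetical uniform roof surface
rough = [[0, 3, 0], [3, 0, 3], [0, 3, 0]]   # hypothetical dense, irregular fabric

print(texture(glcm(smooth, levels=4)))
print(texture(glcm(rough, levels=4)))
```

The uniform patch has zero contrast and dissimilarity and maximal homogeneity, while the checkerboard-like patch shows the opposite, which is the kind of contrast the classifier can exploit.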

3.2. Informal Settlement Extraction Based on Random Forest Algorithm

The RF model is an ensemble learning method based on decision trees, widely used for classification and regression tasks. It improves robustness and accuracy by combining the results of multiple decision trees. Its core mechanism is bagging: several decision trees are each trained on a randomly selected subset of the training data, which reduces the risk of overfitting associated with a single decision tree. As a result, RF is more robust and accurate than many traditional classifiers [47]. When training a random forest model, commonly adjusted parameters include the number of trees and the minimum leaf population. Increasing the number of trees can slightly improve the model’s accuracy but also increases computational cost. Therefore, based on previous studies, the number of trees was set to 100 to balance accuracy and efficiency, and the minimum leaf population, which limits how far each tree can grow, was set to 10 to prevent overfitting [48]. Four other random forest parameters were left at their default values: variablesPerSplit (the number of variables per split; the default is the square root of the number of features), bagFraction (the fraction of the data used for each tree; the default is 0.5), outOfBagMode (whether the classifier should operate in out-of-bag mode), and seed (the random seed). To ensure the model’s robustness and accuracy, the dataset was split into training and testing sets at a ratio of 70% to 30%. For 2017, the number of samples used to build the random forest model is 45,766, and the number of samples used for accuracy assessment is 19,614. For 2022, the number of samples used to build the RF model is 45,794, and the number of samples used for accuracy assessment is 19,626.
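The 70/30 split can be sketched as follows. This is an illustrative sketch, not the authors' code: the helper `split_samples`, the seed, and the integer sample IDs are hypothetical, and `rf_settings` is just a plain dictionary that collects the parameter values stated in the text rather than a call to any classifier API.

```python
import random

# Parameter values as stated in the text; the dictionary itself is illustrative.
rf_settings = {"numberOfTrees": 100, "minLeafPopulation": 10, "bagFraction": 0.5}

def split_samples(samples, train_fraction=0.7, seed=0):
    """Shuffle once with a fixed seed, then cut into training and testing subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical sample IDs; 65,380 is the sum of the 2017 training and testing counts.
train, test = split_samples(range(65380))
print(len(train), len(test))
```

With 65,380 samples the cut reproduces the 45,766 training and 19,614 testing samples reported for 2017; fixing the seed keeps the split reproducible between runs.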

3.3. Accuracy Evaluation Method for Informal Settlement Extraction

The confusion matrix clearly shows the differences between predicted classifications and actual classifications, allowing the calculation of performance metrics like overall accuracy, recall, F1 score, and kappa coefficient. A confusion matrix is used to verify the accuracy of the random forest model in extracting informal settlements and validating the results. After the random forest model completes the classification of informal settlements, it generates a set of predictions. The confusion matrix compares these predictions with the actual classification labels in geospatial data, dividing the results into four categories: true positive (TP), false positive (FP), true negative (TN), and false negative (FN).

\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}

(1)

In the classification of informal settlements, overall accuracy reflects the model’s general accuracy in predicting all types of areas, including both residential and non-residential zones.

\mathrm{Kappa} = \frac{p_{0} - p_{c}}{1 - p_{c}}

(2)

In the formula, p_{0} represents the observed accuracy and p_{c} is the expected random accuracy. The kappa coefficient is particularly useful for evaluating the classification of binary datasets, with informal settlements being one example.

F1 = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}

(3)

Here, F1 is the harmonic mean of Precision and Recall; Precision represents the proportion of correctly classified samples among those predicted as informal settlements, and Recall refers to the proportion of actual informal settlement samples that were correctly predicted.
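The metrics above can be computed directly from the four confusion-matrix counts. The following is a minimal sketch with hypothetical counts (they are not results from this study); for the two-class case, the expected random accuracy p_{c} is obtained from the row and column totals of the matrix.

```python
def classification_metrics(tp, fp, tn, fn):
    """Overall accuracy, kappa, precision, recall, and F1 from a 2x2 confusion matrix."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total  # observed accuracy p_0
    # Expected random accuracy p_c from the marginal totals of both classes.
    p_c = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (accuracy - p_c) / (1 - p_c)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "kappa": kappa,
            "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts, with informal settlements as the positive class.
m = classification_metrics(tp=80, fp=10, tn=100, fn=10)
print(m)
```

Because kappa discounts agreement expected by chance, it is lower than the overall accuracy for the same matrix.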

3.4. Fuzzy Comprehensive Evaluation Based on Entropy-Weighted Method (EWM)

When constructing the above evaluation index system, we encountered the issue of spatial heterogeneity in raster data. Due to varying data resolutions, certain indicators, such as distance-based metrics, exhibited significant gradient changes, while other data, such as PM2.5 and surface temperature, showed notable local fluctuations. This uneven spatial distribution caused by varying resolutions resulted in small differences in the calculated weights when using the EWM, failing to accurately reflect the relative importance of each indicator. Particularly for indicators with similar distributions, such as distances to primary roads, major roads, schools, and stations, the weight differences were not significant, making it difficult to distinguish their different contributions to habitat quality. To address the issue, we introduced Gaussian fuzzification processing on the basis of the EWM. Gaussian fuzzification helps smooth data fluctuations, reduce the impact of extreme values, and preserve the spatial gradient characteristics of distance-based metrics. Specifically, Gaussian fuzzification applies fuzzification to each raster dataset, making spatial data changes more continuous and smooth, thus reducing the impact of local outliers on weight calculations while enhancing the spatial continuity of each indicator. This method effectively improves the shortcomings of the EWM when handling data with spatial gradient variations, enhancing the differentiation of indicator weights.

The EWM itself is an objective allocation method based on the principles of information entropy, which eliminates subjectivity and improves the reliability of weight calculation [6,7,8,9]. Gaussian fuzzification is applied to the standardized data to smooth fluctuations and reduce the influence of extreme values, further enhancing the stability and reliability of the final weights [49]. The combination of these two methods ensures that the weight distribution for each indicator is more scientific and objective, while fully considering the spatial heterogeneity of the data, making the final composite score more aligned with the actual habitat environment. Therefore, the entropy values and weights calculated by combining the EWM with Gaussian fuzzification were used to compute the overall habitat environment score for informal settlements in Mumbai based on the index weights.

Following Guo et al. [50] and Bole et al. [51], the indicators were classified into positive and negative categories. Distances to POIs such as schools, hospitals, parks, or roads are treated as negative indicators: the greater the distance, the poorer the accessibility and the lower the residents’ quality of life. In contrast, indicators such as nighttime lights and green space density, which represent economic and environmental conditions, are treated as positive indicators because higher values have a more positive impact on residents’ quality of life.

Since the evaluation indicators have different dimensions, standardization of each indicator should be performed before constructing the comprehensive index. The methods were as follows. Positive indicator standardization:

x_{i}' = \frac{x_{i} - \min(x_{i})}{\max(x_{i}) - \min(x_{i})}

(4)

Negative indicator standardization:

x_{i}' = \frac{\max(x_{i}) - x_{i}}{\max(x_{i}) - \min(x_{i})}

(5)

where, in Equations (4) and (5), x_{i}' represents the standardized value of a positive or negative indicator in the habitat environment evaluation of Mumbai, and \min(x_{i}) and \max(x_{i}) denote the minimum and maximum values of the corresponding indicator over the study region.
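Equations (4) and (5) can be sketched in a few lines. The indicator values below are hypothetical and `standardize` is an illustrative helper; a guard for constant indicators is added as an assumption, since the denominator would otherwise be zero.

```python
def standardize(values, positive=True):
    """Min-max scale to [0, 1]; negative indicators are inverted so that 1 is always best."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("indicator is constant; min-max scaling is undefined")
    if positive:
        return [(v - lo) / (hi - lo) for v in values]  # Equation (4)
    return [(hi - v) / (hi - lo) for v in values]      # Equation (5)

night_lights = [2.0, 6.0, 10.0]         # hypothetical positive indicator
dist_to_school = [100.0, 300.0, 500.0]  # hypothetical negative indicator (meters)

print(standardize(night_lights))                     # [0.0, 0.5, 1.0]
print(standardize(dist_to_school, positive=False))   # [1.0, 0.5, 0.0]
```

After standardization both indicators point in the same direction, so the nearest school and the brightest cell both score 1.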

Calculate the share of each sample in the total of the corresponding indicator:

D_{ij} = \frac{x_{ij}'}{\sum_{j = 1}^{m} x_{ij}'}

(6)

Calculate the entropy value of each indicator:

E_{i} = - \frac{1}{\ln (m)} \sum_{j = 1}^{m} D_{ij} \times \ln (D_{ij})

(7)

where x_{ij}' is the standardized value of indicator i at the j-th sample (grid cell) and m is the number of samples. Calculate the weight of each indicator:

w_{i} = \frac{1 - E_{i}}{\sum_{k = 1}^{n} (1 - E_{k})}

(8)

where n is the number of indicators.
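The entropy weighting in Equations (6)–(8) can be sketched as follows. The two indicator columns are hypothetical, `entropy_weights` is an illustrative helper, and the convention that 0 × ln(0) contributes nothing is an assumption commonly made in the entropy-weighted method rather than something stated in the text.

```python
import math

def entropy_weights(columns):
    """columns[i] holds the standardized values of indicator i over all samples."""
    entropies = []
    for col in columns:
        total = sum(col)
        shares = [v / total for v in col]  # share of each sample in the indicator total
        # Terms with a zero share are skipped (0 * ln 0 is taken as 0).
        e = -sum(d * math.log(d) for d in shares if d > 0) / math.log(len(col))
        entropies.append(e)
    redundancy = [1 - e for e in entropies]  # information carried by each indicator
    s = sum(redundancy)
    return entropies, [r / s for r in redundancy]

near_uniform = [0.4, 0.5, 0.6, 0.5]  # hypothetical indicator with little variation
varied = [0.0, 0.1, 0.2, 1.0]        # hypothetical indicator with strong variation

entropies, weights = entropy_weights([near_uniform, varied])
print(entropies, weights)
```

The nearly uniform indicator has entropy close to 1 and therefore receives a small weight, which mirrors the problem described above for indicators with similar distributions.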

Then apply the Gaussian membership function to each preliminary weight to adjust it:

w_{i}' = G (w_{i}) = e^{- \frac{(w_{i} - \mu)^{2}}{2 \sigma^{2}}}

(9)

where G (w_{i}) represents the Gaussian membership of the preliminary weight, μ represents the mean of the preliminary weights, and σ represents the standard deviation of the preliminary weights.

For the weights of positive and negative indicators, the mean and standard deviation are as follows:

\mu = \frac{1}{n} \sum_{i = 1}^{n} w_{i}

(10)

\sigma = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} (w_{i} - \mu)^{2}}

(11)

The final weight values after Gaussian fuzzification are:

w_{i}'' = \frac{w_{i}'}{\sum_{j = 1}^{n} w_{j}'}

(12)

At this point, the corresponding weights are multiplied by the evaluation indicators and then summed, resulting in a single column of data, which represents the “comprehensive score”—the comprehensive habitat environment index (HEI) for Mumbai.

H = \sum_{i = 1}^{n} w_{i}^{″} \times x_{i}^{'}

(13)
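Equations (9) to (13) can be traced with a short Python sketch. The function names, the preliminary weights, and the fallback for identical weights are assumptions for illustration only.

```python
import math


def gaussian_adjust(weights):
    """Gaussian fuzzification of preliminary weights, Equations (9) to (12)."""
    n = len(weights)
    mu = sum(weights) / n                                        # Equation (10)
    sigma = math.sqrt(sum((w - mu) ** 2 for w in weights) / n)   # Equation (11)
    if sigma == 0:
        # Assumption: identical preliminary weights are left as equal weights.
        return [1.0 / n] * n
    membership = [math.exp(-((w - mu) ** 2) / (2 * sigma ** 2)) for w in weights]  # Equation (9)
    total = sum(membership)
    return [m / total for m in membership]                       # Equation (12)


def hei(final_weights, standardized_row):
    """Composite score of one sample, Equation (13)."""
    return sum(w * x for w, x in zip(final_weights, standardized_row))


# Hypothetical preliminary weights and one standardized sample.
preliminary = [0.5, 0.3, 0.2]
final = gaussian_adjust(preliminary)
score = hei(final, [1.0, 0.5, 0.0])
```

Read literally, Equation (9) assigns the highest membership to the preliminary weight closest to the mean, so in this example the middle weight (0.3) becomes the largest final weight. Because the final weights sum to 1 and the indicators lie in [0, 1], the score also lies in [0, 1].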

4. Results

4.1. Analysis of the Results of Informal Settlement Indicator Selection

In this study, a series of features were extracted from Sentinel-1/2 data to identify the spatial distribution of informal settlements. These features include spectral indices (such as EVI, BSI, BUI, NBI, BAEI, NDBI, UI, and NBAI) as well as 18 band indicators from the GLCM. In addition, polarization information from radar data (VV, VH, and VVH from ascending and descending orbits) and slope information extracted from DEM data were used. These initial features were filtered using random forest feature importance, hierarchical clustering, and Spearman correlation analysis to retain the most representative indicators for classifying informal settlements.

In the 2017 feature importance chart (Figure 4a), the GLCM texture features Spr-diss and Spr-idm stood out, indicating that informal settlements exhibited significant spatial heterogeneity compared to formal residential areas. Texture features, particularly gray-level dissimilarity and homogeneity, were key in distinguishing informal settlements. This finding is consistent with Kuffer et al. [52], whose investigation of global slum spatial characteristics found texture-based methods to be robust in urban images.
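To make the texture measures concrete, the sketch below computes dissimilarity, inverse difference moment (IDM, a homogeneity measure), and contrast from a gray-level co-occurrence matrix using their standard definitions. The horizontal one-pixel offset, the symmetric counting, the four gray levels, and the two toy images are assumptions; this section does not state the window size or quantization used in the study.

```python
def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized gray-level co-occurrence matrix for one offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1
                counts[b][a] += 1  # count each pair in both directions
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def texture_stats(p):
    """Dissimilarity, IDM and contrast of a normalized GLCM."""
    idx = [(i, j) for i in range(len(p)) for j in range(len(p))]
    return {
        "dissimilarity": sum(p[i][j] * abs(i - j) for i, j in idx),
        "idm": sum(p[i][j] / (1 + (i - j) ** 2) for i, j in idx),
        "contrast": sum(p[i][j] * (i - j) ** 2 for i, j in idx),
    }


# Toy images with gray levels 0..3, for illustration only.
smooth = [[1, 1, 1, 1] for _ in range(4)]
patchy = [[0, 3, 0, 3], [3, 0, 3, 0], [0, 3, 0, 3], [3, 0, 3, 0]]

smooth_stats = texture_stats(glcm(smooth, levels=4))
patchy_stats = texture_stats(glcm(patchy, levels=4))
```

The uniform image gives zero dissimilarity and an IDM of 1, while the alternating image gives high dissimilarity and low IDM, which is the kind of contrast between homogeneous and heterogeneous surfaces that these features capture.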

Changes in feature importance over time may also reflect the spatial and structural evolution of informal settlements under urbanization and infrastructure development. For example, spring (Spr-diss and Spr-idm) and winter (Win-BAEI, Win-NDBI) features showed higher importance in 2017, possibly reflecting seasonal variations in vegetation cover and construction activities. These temporal changes reflect evolving land use patterns and dynamic human activities in informal settlements, which contribute to the prominence of certain features at different times.

In contrast, the feature importance rankings for 2022 (Figure 4b) changed significantly. Sum-inertia and Spr-contrast became the most important features, indicating that energy and contrast in the gray-level matrix played a more significant role during this period. This change reflected the evolution of the morphology and structure of informal settlements in the study area over the years. The 2022 data showed a different seasonal trend compared to the 2017 data. During this period, the importance of autumn features (Aut-inertia and Aut-savg) increased, which was related to vegetation degradation and increased surface exposure in autumn. Spring and winter features still maintained high importance, but summer features (Sum-inertia and Sum-BAEI) also had a greater influence on classification during this period, as surface coverage and object complexity, influenced by climate change, had a stronger impact on classification results.

Additionally, texture features and indices related to building extraction (Win-BAEI, Win-NDBI, and Win-BSI) retained high importance, with surface coverage features also playing a crucial role and providing strong support in distinguishing informal settlements.

We used the Spearman correlation heatmap (Figure 5) to show the correlations between different features. The 2017 heatmap (Figure 5a) showed strong positive correlations between several spectral features, exemplified by Win-BSI and Win-NDBI, indicating that these features reflected similar surface information when distinguishing informal settlements. Radar polarization features such as Spr-VVH and Aut-VH also showed some correlation, suggesting that different polarization combinations of radar data are consistent in reflecting surface characteristics. The 2022 heatmap (Figure 5b) showed a changed correlation structure, with higher correlations between texture features such as Sum-inertia and Spr-contrast and lower correlations between polarization and spectral features. This change indicated that, over time, texture features became more important in reflecting the complexity of informal settlement structures, while the contribution of spectral and radar data to classification results varied with the expansion or transformation of informal settlements.
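The Spearman coefficient behind each heatmap cell is the Pearson correlation of the ranked values. The sketch below is a minimal pure-Python version with average ranks for ties; the feature values are hypothetical and only the feature names come from the text.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation; assumes neither input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical per-sample feature values, for illustration only.
win_bsi = [0.10, 0.20, 0.35, 0.50, 0.80]
win_ndbi = [0.05, 0.07, 0.30, 0.31, 0.90]  # rises with win_bsi
other = [5.0, 3.0, 4.0, 1.0, 2.0]          # mostly falls as win_bsi rises

rho_similar = spearman(win_bsi, win_ndbi)
rho_opposed = spearman(win_bsi, other)
```

Two features that rise together, as Win-BSI and Win-NDBI do in this toy data, give a coefficient close to 1 and are candidates for being treated as redundant during feature selection.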

By comparing the feature importance and correlation analysis results from different years, the following conclusions can be drawn: first, texture features from the GLCM consistently showed significant advantages across different years and seasons, especially in describing the complexity and structural changes of surface objects. Second, spectral features, namely, Win-BAEI, Win-NDBI, and Win-BSI, had high feature importance in both years, reflecting the close relationship between informal settlements and bare soil and building coverage. Finally, the importance of features changed over time, with texture features becoming more critical in 2022, while the role of polarization features diminished.

4.2. Results and Analysis of Informal Settlement Extraction

The RF model performed exceptionally well in handling high-dimensional data and preventing overfitting, yielding better results than rule-based OBIA methods, making it suitable for recognizing complex geographic information in urban informal settlements. Based on the feature selection and importance analysis mentioned above, we constructed an efficient classification model for informal settlements and trained and validated it using the RF algorithm. The model was trained with positive samples (informal settlements) and negative samples (formal settlements), and the dataset was split into training and test sets in a 7:3 ratio. Model performance was validated using kappa coefficients and overall accuracy (OA) results, as shown in Table 4, with kappa coefficients above 0.77 and overall accuracy exceeding 0.89.
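For reference, overall accuracy, the kappa coefficient, and the F1 score can all be derived from a binary confusion matrix using their standard definitions. The counts below are hypothetical and are not the values reported in Table 4.

```python
def binary_metrics(tp, fp, fn, tn):
    """Overall accuracy, Cohen's kappa and F1 for a binary classification."""
    n = tp + fp + fn + tn
    oa = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (oa - pe) / (1 - pe)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"oa": oa, "kappa": kappa, "f1": f1}


# Hypothetical test-set counts (informal settlement = positive class).
metrics = binary_metrics(tp=90, fp=10, fn=10, tn=90)
```

With these counts the overall accuracy is 0.90, kappa is 0.80, and F1 is 0.90. Kappa is lower than overall accuracy because it discounts the agreement expected by chance.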

This study extracted informal settlements by using medium- and high-resolution remote sensing data. Kuffer et al. [53] achieved an overall accuracy of 87% for Mumbai by integrating high-resolution imagery (WorldView). Similarly, Fallatah et al. [54] combined single-day VHR imagery from the GeoEye-1 sensor with medium-resolution Landsat data, using the RF algorithm to classify six land cover types, including informal settlements, with an accuracy exceeding 90%. Peng et al. [55] constructed a composite slum spectral index using Sentinel data and achieved a 54.45% intersection-over-union score through machine learning. Matarira et al. [56] also used machine learning classifiers on Quickbird imagery, achieving relatively high accuracy (>80%). In contrast, Najmi et al. [57] combined street view and 0.4 m VHR imagery to extract informal settlements, obtaining a lower overall accuracy of 74.5–80.2%. In this article, only medium-resolution Sentinel data were used, with an overall accuracy of over 89% and an F1 score exceeding 90%.

The binary classification results obtained using the random forest algorithm are shown in Figure 6. In the 2017 map, informal settlements (marked in purple) accounted for 20.6% of the built-up area, mainly concentrated in the northern and central parts of the city. By 2022, informal settlements further expanded southward, with more purple areas visible along the southern edge of the city, indicating the expansion of informal settlements in that region. The total area of informal settlements expanded from 45.37 km² in 2017 to 50.64 km² in 2022, accounting for 23.1% of the built-up area. Compared to 2017, the distribution of informal settlements in 2022 became more widespread, particularly in the peripheral areas, reflecting a trend of urban expansion along with the growth of informal settlements.
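As a quick check on the reported figures, the growth of the informal settlement area can be worked out directly from the two totals. The annualized rate assumes steady compound growth over the five years, which is an assumption of this illustration rather than a claim of the study, and all results are rounded.

```latex
\frac{50.64 - 45.37}{45.37} \approx 0.116 \quad (\text{about } 11.6\% \text{ growth from 2017 to 2022})

\left( \frac{50.64}{45.37} \right)^{1/5} - 1 \approx 0.022 \quad (\text{about } 2.2\% \text{ per year})

\frac{45.37}{0.206} \approx 220.2\ \mathrm{km}^2, \qquad \frac{50.64}{0.231} \approx 219.2\ \mathrm{km}^2
```

The last line shows that the stated shares (20.6% and 23.1%) imply a built-up area of roughly 220 km² in both years, so the rise in share is driven almost entirely by the growth of the informal settlement area.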

We utilized Google Earth and Sentinel satellite imagery to monitor the spatial distribution changes of informal settlements (ISs) in Mumbai between 2017 and 2022. Google Earth provides high-resolution localized views, while Sentinel offers medium-resolution wide-area imagery. By comparing the imagery and extraction results from these two data sources, it was demonstrated that free Sentinel data are effective and accurate in capturing the area and spatial distribution of large-scale informal settlements. Figure 7 and Figure 8 show the 2017 and 2022 Google Earth and Sentinel imagery, along with the corresponding informal settlement extraction results. Although Sentinel imagery has lower resolution and cannot provide the same level of detail as Google Earth, its spatial distribution trend is generally consistent with the high-resolution imagery and effectively reflects the overall distribution of informal settlements. Particularly in large-scale areas, Sentinel imagery exhibits high extraction accuracy, demonstrating its effectiveness in large-scale monitoring.

The expansion trend of informal settlements in 2022 is evident, especially at the urban edge and near transportation and commercial zones, consistent with the findings of Kohli et al. [58]. Comparing the 2017 and 2022 imagery, the expansion speed of informal settlements significantly accelerated around the urban periphery, transportation routes, and commercial zones, reflecting the driving force of urbanization in the expansion of informal settlements. The regions shown in Figure 7a,g,h,i and Figure 8a,g,h,i indicate that informal settlements expanded rapidly, particularly near natural resources like farmland and forests. These settlements are often located in suburban or remote areas, where agriculture and settlement develop alternately [59]. Furthermore, areas lacking government oversight typically form unplanned, dense settlements, especially along rivers and natural resource zones. Informal settlements are often located in hazardous areas, such as floodplains and marshlands, posing significant risks to residents [24]. Regions in Figure 8i,j,k show further expansion of informal settlements along rivers and transportation hubs, highlighting the significant impact of natural resources and infrastructure on the development of informal settlements. The comparison from 2017 to 2022 reveals a close connection between the spatial distribution of informal settlements and natural resources, infrastructure, and the urbanization process. Using Google Earth and Sentinel imagery, especially Sentinel’s medium-resolution imagery, has shown good accuracy in large-scale monitoring of informal settlements, particularly in capturing spatial distribution trends. Future research can focus on how to combine high-resolution and medium-resolution imagery to improve extraction accuracy and explore more efficient monitoring methods, providing theoretical support for urban planning and the improvement of informal settlements.

The findings of this study highlight that urbanization is not a linear process, with different countries following distinct urbanization pathways. In developing countries, urbanization often leads to overurbanization, where land becomes urbanized at a faster pace than the population, accelerating the growth of informal settlements. This pattern is particularly evident in the rapid expansion of informal settlements at the urban edge, near transportation routes and commercial zones. The spatial distribution of informal settlements, especially around these zones, shows a clear reflection of the driving forces of urbanization, where land is rapidly transformed but the population’s settlement patterns struggle to keep up. The dynamic of overurbanization shows that, while land is urbanized quickly, population movements may lag behind, leading to the formation of informal settlements in areas that have been urbanized but are lacking in infrastructure and services. This pattern is especially prominent along transportation corridors and areas adjacent to commercial developments, where informal settlements continue to emerge despite the presence of infrastructure and services intended for formal residents.

4.3. Evaluation Result of the HEI in Informal Settlements

The weights of the habitat environment index evaluation indicators, constructed from basic geographic data and remote sensing data, were determined by integrating the EWM with the Gaussian fuzzification evaluation model. This method improves the differentiation of weight calculations and more accurately reflects the spatial heterogeneity of raster data. The weight results of the habitat environment evaluation indicators for 2017 and 2022 (Table 5) revealed the impact of various indicators on the habitat environment and their changes over time.

Overall, economic conditions had the highest weight in the habitat environment evaluation, with significant fluctuations in nighttime light data and housing price data, indicating the importance of economic vitality in evaluating the habitat environment quality. This is because economic development is closely related to urban infrastructure construction, and the improvement of infrastructure usually directly enhances residents’ quality of life. Nighttime light data reflect the intensity of urban economic activity. Highly concentrated commercial areas are often associated with stronger nighttime lighting, which directly impacts residents’ living and working environments. Housing price fluctuations also reflect the level of regional economic activity and residents’ living standards. High housing prices are often associated with better urban facilities and quality of life. Therefore, the high weight of economic condition indicators can also be seen as a reflection of the close connection between urban economic vitality and residents’ quality of life. Another important indicator closely related to economic conditions is urban infrastructure, including transportation convenience, distance to hospitals, and commercial areas in the index system. By comparing the data from 2017 and 2022, we found that the weight changes of transportation infrastructure and healthcare services were minimal. Regardless of urban development, the accessibility of basic social services (such as hospitals and commercial facilities) is crucial for residents’ convenience. Transportation facilities directly affect residents’ mobility and work efficiency, while the accessibility of healthcare facilities such as hospitals is directly related to residents’ health levels. Therefore, the stability and criticality of infrastructure have made these indicators’ weights remain relatively consistent between the two periods.

Despite efforts in recent years to improve air quality, air pollution remains a key factor affecting residents’ health and quality of life. The weight of PM2.5 concentration decreased slightly but remained high, indicating that air quality is still an important long-term factor in habitat environment quality. As urbanization progresses, air pollution may continue to affect residents’ health and living environments, especially in some informal settlements. Because the effects of extreme weather and climate change may also persist in the long term, PM2.5 concentration remains one of the key indicators for evaluating habitat environment quality. The weight of green space density dropped from 0.143 to 0.080, indicating that the relative reduction in green spaces during urban expansion diminished its influence on the habitat environment index evaluation.

As urbanization progresses, the reduction in green spaces might be offset by improvements in other indicators, particularly the economy and infrastructure. However, this also means that future urban development must pay more attention to the protection of green spaces and public areas to improve residents’ quality of life. The weight of urban transportation infrastructure remained stable. Improvements in infrastructure can directly enhance residents’ mobility and improve the quality of life, especially in areas with urban expansion and high-density development.

In summary, some indicators have higher weights in the habitat environment index mainly because they are directly related to residents’ basic living conditions and quality of life. Economic activities (such as nighttime light and housing prices) and infrastructure (such as transportation and healthcare) are usually closely related to the level of urban development, and improvements in these factors directly raise residents’ living standards. Although environmental quality (such as PM2.5 concentration and green space density) also plays an important role in habitat environments, its impact has changed during urbanization. Therefore, future urban planning should focus more on the coordinated development of the economy and infrastructure, while protecting and restoring environmental resources to ensure the long-term well-being of residents.

The above weight results were combined to obtain the HEI, which was classified using the natural break method into four habitat environment categories (unsuitable, lowly suitable, moderately suitable, and highly suitable) and overlaid with the spatial distribution of informal settlements. As shown in Figure 9, the overall suitability of the habitat environment improved, although most informal settlements were located in lowly suitable or unsuitable areas with relatively poor living conditions. In the spatial distribution map, highly suitable areas (green regions) expanded most noticeably in the city center and southern parts, with the suitability of the urban core improving significantly by 2022.
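The natural break classification can be illustrated with a small Python sketch. It assumes the Jenks criterion (choose the class boundaries that minimize the within-class sum of squared deviations) and searches all boundary placements by brute force, which is only practical for a handful of values; the study's actual implementation is not described in this section, and the HEI scores below are hypothetical.

```python
from itertools import combinations


def within_class_ssd(values):
    """Sum of squared deviations from the class mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def natural_breaks(values, k):
    """Upper bounds of the first k - 1 classes under the Jenks criterion."""
    data = sorted(values)
    n = len(data)
    best_cost, best_cuts = None, None
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0,) + cuts + (n,)
        cost = sum(within_class_ssd(data[a:b]) for a, b in zip(bounds, bounds[1:]))
        if best_cost is None or cost < best_cost:
            best_cost, best_cuts = cost, cuts
    return [data[c - 1] for c in best_cuts]


def classify(value, breaks, labels):
    """Assign a value to the first class whose upper bound it does not exceed."""
    for upper, label in zip(breaks, labels):
        if value <= upper:
            return label
    return labels[-1]


# Hypothetical HEI scores, for illustration only.
hei_scores = [0.10, 0.12, 0.15, 0.35, 0.38, 0.60, 0.62, 0.65, 0.88, 0.90]
labels = ["unsuitable", "lowly suitable", "moderately suitable", "highly suitable"]

breaks = natural_breaks(hei_scores, k=4)
category = classify(0.37, breaks, labels)
```

Because the breaks follow gaps in the data rather than fixed intervals, the class boundaries differ between years, which matters when comparing category areas for 2017 and 2022.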

This improvement was reflected not only in spatial distribution but also in the statistical data of habitat environment categories and the area proportions (as shown in Figure 10 and Table 6), confirming the enhancement of the habitat environment quality. Figure 10 shows a significant improvement in Mumbai’s overall habitat environment between 2017 and 2022, particularly with the increase in highly suitable areas and the decrease in lowly suitable areas, reflecting the upgrading of the city center and some secondary centers. However, the improvement in informal settlements was relatively slow, with the proportion of unsuitable areas increasing (from 6.52% to 8.27%), indicating that residents in the city’s periphery and resource-poor areas still face significant challenges.

Despite significant improvements in some areas, unsuitable regions (marked in red) still existed and expanded in the city’s outskirts and some informal settlements. The proportion of highly suitable areas in Mumbai’s urban region increased from 33.25% to 43.42%. Although the total area slightly decreased (from 76.46 km² to 73.04 km²), the more suitable areas became more concentrated, likely due to improvements in infrastructure and the expansion of social services. A similar trend was observed in informal settlements, where the area of highly suitable regions increased from 2.52 km² to 3.47 km², indicating that some of these areas benefited from policy interventions and redevelopment projects, improving the habitat environment quality.

The unsuitable areas in the northern and southern outskirts increased in 2022 (Figure 10 and Table 6). The area of unsuitable regions in Mumbai’s urban area increased from 7.04 km² to 10.57 km², indicating that these areas still face significant challenges, especially weak infrastructure, poor air quality, or inconvenient transportation. Similarly, the unsuitable areas in informal settlements increased from 2.52 km² to 3.47 km², with their proportion rising from 6.52% to 8.27%. These areas may have failed to improve due to poor economic conditions, high population density, or insufficient resource allocation. Moderately suitable areas also decreased overall in 2022, with Mumbai’s urban area dropping from 92.33 km² to 73.67 km². This shift is shown in Figure 8, where some moderately suitable areas are replaced by highly suitable ones, reflecting overall improvements in the city center or secondary centers. However, some moderately suitable areas were downgraded to lowly suitable or unsuitable, particularly at the edges of informal settlements. The area of lowly suitable regions decreased from 54.11 km² to 37.04 km², indicating that some areas were upgraded to more suitable living conditions. This is also reflected in Figure 10, where the orange areas gradually shrink, and some peripheral regions are improved and transformed into green areas.

5. Discussion

5.1. Validity and Limitations of the Methodology

The study extracted informal settlements using medium-resolution data combined with seasonal indicators and found that the spatial heterogeneity of informal settlements showed significant differences in textural features compared to spectral features. These texture features consistently exhibited stable and significant advantages across different years and seasons, demonstrating their robustness over time. The study utilized the MDI indicator to evaluate feature importance [60], combined with hierarchical clustering to obtain optimal indicators, effectively leveraging the RF model’s feature importance evaluation capability to filter out the best feature subsets, thereby improving classification accuracy and efficiency. However, while the random forest model demonstrated high performance, the potential for overfitting during feature selection was not fully addressed. Overfitting can occur if the model becomes overly complex by retaining too many features that do not generalize well to unseen data. In this study, we utilized hierarchical clustering and the MDI indicator to optimize the feature selection process. However, the model’s reliance on the most informative features could still lead to overfitting, especially when there is high redundancy or correlation between features. To address this issue, future research could explore regularization techniques, such as feature pruning or cross-validation methods, to mitigate overfitting. Moreover, incorporating additional feature selection methods, such as the Boruta algorithm or ReliefF-RFE, could improve the robustness of the model by further reducing dimensionality and retaining only the most relevant features. These approaches would help prevent the model from becoming too sensitive to specific features and improve its generalizability to different spatial and temporal contexts. 
Choosing the right feature selection algorithm was crucial for the performance of machine learning classifiers in terms of both accuracy and simplicity [61]. The study applied an entropy-weighted Gaussian fuzzy evaluation model to assess the habitat quality in informal settlements and dynamically adjusted the weight of different factors [62] based on the data’s discreteness, adaptively reflecting the impact of different indicators on the HEI at specific time points. This method overcomes the limitations of traditional fixed-weight methods, which may not adapt well to dynamic socio-economic and environmental changes. While this analysis primarily focused on short-term spatial patterns, it provides valuable insights into the habitat quality of informal settlements. Future research could leverage multi-temporal data and long-term monitoring to track changes in habitat suitability over time. This will offer deeper insights into the evolving conditions of informal settlements, providing urban planners and policymakers with the necessary data to implement more effective, adaptive interventions. Previous studies on urban habitat environment quality have typically focused on multi-dimensional indicators to evaluate aspects such as quality of life, environmental suitability, and resilience [63] but often lack a comprehensive approach to account for dynamic changes. In contrast, our approach, by considering both objective environmental indicators and spatial gradient characteristics, offers a more adaptable framework for assessing the evolving nature of informal settlements.

5.2. Reliability and Suitability of Data Sources

Acquiring high-resolution imagery for remote sensing is often limited by high costs, restricted coverage and difficulties in data acquisition, particularly in large cities like Mumbai, where the timeliness and availability of such imagery failed to meet the requirements for long-term and continuous monitoring. However, the accessibility, low cost, and open availability of Sentinel data provided a reliable and universal solution for this study. We used multi-spectral and multi-polarized Sentinel satellite data as the remote sensing data source. The combination of spectral and texture features provided rich information for extracting informal settlements, and the use of multi-seasonal data enhanced the model’s ability to identify complex surface objects. Nevertheless, while Sentinel-1 and Sentinel-2 data are effective for large-scale monitoring, they have limitations in capturing fine urban details, particularly in densely built-up areas with complex urban morphology. Sentinel-1, based on radar imaging, has difficulty distinguishing closely spaced structures, which may lead to misclassification, especially in areas with informal settlements characterized by irregular land use. Similarly, Sentinel-2’s 10 m resolution may not accurately delineate the boundaries of informal settlements, particularly in regions with mixed land use, dense vegetation, or significant shadowing. These challenges highlight the difficulty in accurately identifying informal settlements in cities with high urban complexity. To improve this, future research could combine medium-resolution data from Sentinel-1 and Sentinel-2 with higher-resolution commercial satellite imagery, such as WorldView or Pleiades, to refine the spatial boundaries of informal settlements and improve classification accuracy. Additionally, other methods such as OBIA or DL model-based approaches might provide better solutions to address the complexity of urban settlements.

5.3. Socioeconomic Drivers Behind the Expansion of Informal Settlements

The expansion of informal settlements from 2017 to 2022 is closely tied to socio-economic factors such as migration, economic disparity, and limited access to formal housing. Rapid urbanization and migration from rural areas have contributed to the growth of informal settlements, as many low-income individuals and families seek affordable housing. These settlements typically emerge in areas where land is cheaper, often at the urban periphery or near natural resources. Economic disparity plays a significant role, as rising land prices in city centers push low-income populations to the outskirts. The lack of affordable housing options forces many to settle in informal areas, where access to basic services and infrastructure is limited. Moreover, the absence of effective governance in these regions exacerbates the problem. Informal settlements are often left unchecked due to insufficient urban planning and government intervention. This study highlights that the growth of informal settlements is not only a consequence of economic pressures but also the result of broader governance challenges, including the lack of policies to manage urban sprawl and provide adequate infrastructure. Addressing these issues requires integrated urban planning and policies that balance urban growth with social equity and environmental sustainability. In conclusion, the expansion of informal settlements between 2017 and 2022 reflects the complex interplay of socio-economic and governance factors. Future urban planning strategies must focus on improving infrastructure and providing affordable housing to reduce the pressure on informal settlements and promote sustainable urban growth.

5.4. Relevance and Policy Implications of Evaluation Results

In the habitat environment quality evaluation indicators, the weight of economic conditions (especially nighttime light data and population density) has gained significant importance in recent years, reflecting the critical influence of urban economic vitality and population concentration on the overall habitat environment. However, the weight of social and natural conditions has changed little, indicating the continued influence of infrastructure and environmental quality on living standards. The decline in green space density highlights the balance issue between urban expansion and environmental resources. Given these findings, policymakers should adopt a more integrated approach to urban development, ensuring that economic growth, environmental sustainability, and social equity are all prioritized. The findings suggest that informal settlements in areas lacking infrastructure require urgent attention. A balanced approach is needed, with a focus on providing infrastructure in informal settlements while simultaneously protecting green spaces and promoting socio-economic equality. Moreover, inclusive governance should ensure that residents of informal settlements are involved in the planning process. This will enable urban development that is both equitable and sustainable. While the HEI in Mumbai has improved, many informal settlement residents still live in low-suitability conditions, indicating that their living conditions have not kept pace with overall urban development. Informal settlements near urban centers tend to have better access to economic opportunities and infrastructure support, while those relying on natural resources face greater risks due to inadequate infrastructure and lack of resilience to environmental shocks such as floods and pollution. Long-term adaptive interventions are necessary to track changes and guide policy adaptation, ensuring that informal settlements are gradually improved over time.

6. Conclusions

This study provides valuable insights into informal settlements in Mumbai by integrating multi-source remote sensing data and applying an entropy-weighted Gaussian fuzzy evaluation model. The research aimed to address critical gaps in understanding informal settlement dynamics and habitat quality, particularly by providing accurate baseline data for spatial distribution and environmental quality assessment. The following conclusions are drawn:

(1) The RF model, combined with MDI indicator extraction results, achieved kappa coefficients above 0.77, overall accuracy exceeding 89%, and F1 scores above 90%, demonstrating its high reliability and stability in extracting informal settlements. The study found that informal settlements expanded from 45.37 km² to 50.64 km² between 2017 and 2022, particularly in peripheral areas, reflecting ongoing urban expansion. These settlements typically expanded in two main patterns: one gradually near formal residential areas and the other around natural resources such as farmland, forests, and water bodies.

The findings highlight the growing significance of informal settlements in urban development, especially in developing countries where they emerge due to the limitations of state capacity to provide affordable housing. Informality is not a marginal phenomenon but an integral part of urbanization, shaped by both market forces and state policies. Understanding informal settlements requires considering them within broader urban governance processes and addressing underlying issues of social inequality. This study also emphasizes that informal settlements should be recognized as a norm in urban development, particularly in the context of rapid urbanization in the global south.

(2) This study underscores the importance of balanced development between urban and rural areas as a critical factor in addressing the growth of informal settlements. Viewing urban and rural areas as binary opposites exacerbates disparities and contributes to the formation of informal settlements, particularly at the urban periphery. To mitigate these challenges, both urban and rural areas require simultaneous development strategies that promote spatial equity, ensure access to basic services, and reduce the pressures on urban infrastructure and housing. Policymakers must recognize the interconnectedness of urban and rural development and implement strategies that do not favor one at the expense of the other. Urban development must be integrated with rural policies to ensure that investments in infrastructure and housing are distributed across both domains. By fostering synergy between urban and rural development, policymakers can better manage the expansion of informal settlements and promote sustainable development across regions. The findings also highlight that urbanization is not a linear process and that different countries follow distinct urbanization pathways. In developing countries, urbanization often leads to overurbanization, where land undergoes urbanization at a faster pace than the population, leading to further informal settlement growth. Addressing these challenges requires a deeper understanding of the underlying differences in urbanization trajectories and the dynamics of population movements, which often take much longer to align with land development. Additionally, urbanization’s impact on informal settlement growth requires adaptive and inclusive policies that integrate urban planning with rural development and address the long-term dynamics of population shifts and land use change.

(3) The HEI indicator weights for 2017 and 2022 show that economic conditions had the greatest impact on habitat quality, with nighttime light data and housing prices reflecting economic vitality. In contrast, the weights of social, natural, and residential conditions remained relatively stable. Over time, the overall suitability of the habitat environment improved: highly suitable areas increased from 33.53% to 43.42%, and low-suitability areas decreased from 3.06% to 2.49%. However, resource-poor areas improved more slowly, with unsuitable areas rising from 6.52% to 8.27%, which highlights the ongoing challenges for residents on the outskirts. Although economic vitality plays a critical role in improving habitat quality, addressing the challenges of informal settlements requires more than economic development. Urban governance must be inclusive as well as effective, ensuring that informal settlement residents are involved in decision making and have access to essential resources and opportunities. This research provides evidence for integrating informal settlement areas into urban planning strategies and urban renewal policies, so that resources are directed to the areas in most critical need. Urban renewal should be part of a broader policy agenda that combines social and economic inclusion and addresses the complex needs of marginalized communities. The findings also highlight the importance of recognizing informal settlements not as isolated phenomena but as integral parts of the urbanization process, requiring targeted interventions to ensure sustainable urban development.

This study enhances the understanding of the environmental quality of informal settlements by integrating multi-source remote sensing data and applying an entropy-weighted Gaussian fuzzy evaluation model. By providing detailed spatial and environmental data, it establishes a solid foundation for sustainable urban planning strategies, ensuring that interventions are tailored to the unique challenges of informal settlements while promoting equitable development. The findings are particularly valuable for policymakers and urban planners aiming to implement targeted interventions that address the specific needs of informal settlement areas. By identifying key regions of expansion, the research offers actionable recommendations for directing resources and infrastructure investments to the most vulnerable areas. Additionally, the study underscores the importance of integrating informal settlements into broader urban renewal and governance frameworks, contributing to the achievement of Sustainable Development Goal 11.1, which focuses on improving the living conditions of urban slums. These insights provide a pathway for improving the habitat quality within these settlements, guiding policy decisions that balance economic growth, social equity, and environmental sustainability.

Author Contributions

Conceptualization, Z.Y. and F.Y.; methodology, Z.Y. and H.Y.; data curation, Z.Y. and H.Y.; writing—original draft preparation, Z.Y., F.Y., and H.Y.; writing—review and editing, Z.Y., F.Y., and Y.X.; visualization, Z.Y. and C.D.; supervision, Y.X., L.H., and Z.S.; funding acquisition, Z.Y., F.Y., and Z.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was financially supported by the National Natural Science Foundation of China (No. 42171079), the Yunnan Youth Talent Project “The delimitation of county three lines and three regions and their platform study”, the International Research Center of Big Data for Sustainable Development Goals (CBAS) (CBASYX0906), the Construction Project of the China Knowledge Center for Engineering Sciences and Technology (No. CKCEST-2023-1-5), and the Graduate Research and Innovation Fund of Yunnan Normal University (YJSJJ23-B98).

Data Availability Statement

The code used to categorize the dataset can be accessed on GitHub (https://github.com/EngineYzx/Sentinel-data-processing (accessed on 3 December 2024)).

Acknowledgments

We appreciate the significant time and effort of the reviewers and their assistance in improving the quality of the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. United Nations. Department of Economic and Social Affairs; United Nations: New York, NY, USA, 2015.
2. Aboulnaga, M.M.; Badran, M.F.; Barakat, M.M. Global informal settlements and urban slums in cities and the coverage. In Resilience of Informal Areas in Megacities—Magnitude, Challenges, and Policies; Springer: Cham, Switzerland, 2021; pp. 1–51.
3. United Nations. Transforming Our World: The 2030 Agenda for Sustainable Development; United Nations: New York, NY, USA, 2015.
4. Adewunmi, Y.; Chigbu, U.E.; Mwando, S.; Kahireke, U. Entrepreneurship role in the co-production of public services in informal settlements—A scoping review. Land Use Policy 2023, 125, 106479.
5. Hardoy, J.E.; Satterthwaite, D. Shelter, infrastructure and services in third world cities. Habitat Int. 1986, 10, 245–284.
6. Parikh, P.; Bisaga, I.; Loggia, C.; Georgiadou, M.C.; Ojo-Aromokudu, J. Barriers and opportunities for participatory environmental upgrading: Case study of Havelock informal settlement, Durban. City Environ. Interact. 2020, 5, 100041.
7. Vajjarapu, H.; Verma, A.; Allirani, H. Evaluating climate change adaptation policies for urban transportation in India. Int. J. Disaster Risk Reduct. 2020, 47, 101528.
8. Brotherhood, L.; Cavalcanti, T.; Da Mata, D.; Santos, C. Slums and pandemics. J. Dev. Econ. 2022, 157, 102882.
9. Yin, H.; Islam, M.S.; Ju, M. Urban river pollution in the densely populated city of Dhaka, Bangladesh: Big picture and rehabilitation experience from other developing countries. J. Clean. Prod. 2021, 321, 129040.
10. Parthasarathy, D. Decentralization, pluralization, balkanization? Challenges for disaster mitigation and governance in Mumbai. Habitat Int. 2016, 52, 26–34.
11. United Nations Human Settlements Programme. State of the World’s Cities 2010/2011: Bridging the Urban Divide; Earthscan: Oxford, UK, 2010.
12. Improving Health in Slums Collaborative. A protocol for a multi-site, spatially-referenced household survey in slum settings: Methods for access, sampling frame construction, sampling, and field data collection. BMC Med. Res. Methodol. 2019, 19, 109.
13. Economic Commission for Europe. Measuring and Monitoring Progress Towards the Sustainable Development Goals; United Nations: New York, NY, USA, 2021.
14. Mahabir, R.; Croitoru, A.; Crooks, A.T.; Agouris, P.; Stefanidis, A. A critical review of high and very high-resolution remote sensing approaches for detecting and mapping slums: Trends, challenges and emerging opportunities. Urban Sci. 2018, 2, 8.
15. Fallatah, A.; Jones, S.; Mitchell, D.; Kohli, D. Mapping informal settlement indicators using object-oriented analysis in the Middle East. Int. J. Digit. Earth 2019, 12, 802–824.
16. Pelizari, P.A.; Spröhnle, K.; Geiß, C.; Schoepfer, E.; Plank, S.; Taubenböck, H. Multi-sensor feature fusion for very high spatial resolution built-up area extraction in temporary settlements. Remote Sens. Environ. 2018, 209, 793–807.
17. Prabhu, R.; Alagu Raja, R. Urban slum detection approaches from high-resolution satellite data using statistical and spectral based approaches. J. Indian Soc. Remote Sens. 2018, 46, 2033–2044.
18. Williams, T.K.; Wei, T.; Zhu, X. Mapping urban slum settlements using very high-resolution imagery and land boundary data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 13, 166–177.
19. Leonita, G.; Kuffer, M.; Sliuzas, R.; Persello, C. Machine learning-based slum mapping in support of slum upgrading programs: The case of Bandung City, Indonesia. Remote Sens. 2018, 10, 1522.
20. Su, Y.; Zhong, Y.; Zhu, Q.; Zhao, J. Urban scene understanding based on semantic and socioeconomic features: From high-resolution remote sensing imagery to multi-source geographic datasets. ISPRS J. Photogramm. Remote Sens. 2021, 179, 50–65.
21. Taubenböck, H.; Kraff, N.J.; Wurm, M. The morphology of the Arrival City—A global categorization based on literature surveys and remotely sensed data. Appl. Geogr. 2018, 92, 150–167.
22. Brigato, L.; Iocchi, L. A close look at deep learning with small data. In Proceedings of the 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 10–15 January 2021; pp. 2490–2497.
23. Friesen, J.; Taubenböck, H.; Wurm, M.; Pelz, P.F. The similar size of slums. Habitat Int. 2018, 73, 79–88.
24. Baye, F.; Wegayehu, F.; Mulugeta, S. Drivers of informal settlements at the peri-urban areas of Woldia: Assessment on the demographic and socio-economic trigger factors. Land Use Policy 2020, 95, 104573.
25. Zhu, B.; Zhang, X.; Yin, X. Evaluation of rural human settlements quality and its spatial pattern in Jiangsu Province. Econ. Geogr. 2015, 35, 138–144.
26. Liu, H.; Li, X.; Guan, Y.; Li, S.; Sun, H. Comprehensive Evaluation and Analysis of Human Settlements’ Suitability in the Yangtze River Delta Based on Multi-Source Data. Int. J. Environ. Res. Public Health 2023, 20, 1354.
27. Maimaiti, A.; Wang, L.; Zhang, J.; Song, Z. Environmental suitability evaluation for human settlements in Bosten Lake Basin. In Proceedings of the IOP Conference Series: Earth and Environmental Science, Beijing, China, 16–17 May 2016; p. 012008.
28. Huang, H.; Li, Q.; Zhang, Y. Urban residential land suitability analysis combining remote sensing and social sensing data: A case study in Beijing, China. Sustainability 2019, 11, 2255.
29. Xue-ming, L.; Pei-yu, J.I. Characteristics and spatial-temporal differences of urban human settlement environment in China. Sci. Geogr. Sin. 2012, 32, 521–529.
30. Dou, H.; Ma, L.; Li, H.; Bo, J.; Fang, F. Impact evaluation and driving type identification of human factors on rural human settlement environment: Taking Gansu Province, China as an example. Open Geosci. 2020, 12, 1324–1337.
31. Shahraki, S.Z.; Hosseini, A.; Sauri, D.; Hussaini, F. Fringe more than context: Perceived quality of life in informal settlements in a developing country: The case of Kabul, Afghanistan. Sustain. Cities Soc. 2020, 63, 102494.
32. Chitekwe-Biti, B.; Mudimu, P.; Nyama, G.M.; Jera, T. Developing an informal settlement upgrading protocol in Zimbabwe—The Epworth story. Environ. Urban. 2012, 24, 131–148.
33. Giri, M.; Bista, G.; Singh, P.K.; Pandey, R. Climate change vulnerability assessment of urban informal settlers in Nepal, a least developed country. J. Clean. Prod. 2021, 307, 127213.
34. Patri, P.; Sharma, P.; Patra, S.K. A multidimensional model for cyclone vulnerability assessment of urban slum dwellers in India: A case study of Bhubaneswar city. Int. J. Disaster Risk Reduct. 2022, 83, 103439.
35. Mendiratta, P.; Gedam, S. Assessment of urban growth dynamics in Mumbai Metropolitan Region, India using object-based image analysis for medium-resolution data. Appl. Geogr. 2018, 98, 110–120.
36. Bardhan, R.; Pan, J.; Chen, S.; Cho, T.Y. Breathing space in a compact city: Impacts of urban re-densification on Mumbai’s low-income housing environment. Habitat Int. 2024, 149, 103098.
37. Yu, L.; Xie, D.; Xu, X. Environmental Suitability Evaluation for Human Settlements of Rural Residential Areas in Hengshui, Hebei Province. Land 2022, 11, 2112.
38. Huete, A.; Didan, K.; Miura, T.; Rodriguez, E.P.; Gao, X.; Ferreira, L.G. Overview of the radiometric and biophysical performance of the MODIS vegetation indices. Remote Sens. Environ. 2002, 83, 195–213.
39. Diek, S.; Fornallaz, F.; Schaepman, M.E.; De Jong, R. Barest pixel composite for agricultural areas using Landsat time series. Remote Sens. 2017, 9, 1245.
40. He, C.; Shi, P.; Xie, D.; Zhao, Y. Improving the normalized difference built-up index to map urban built-up areas using a semiautomatic segmentation approach. Remote Sens. Lett. 2010, 1, 213–221.
41. Jieli, C.; Manchun, L.; Yongxue, L.; Chenglei, S.; Wei, H. Extract residential areas automatically by new built-up index. In Proceedings of the 2010 18th International Conference on Geoinformatics, Beijing, China, 18–20 June 2010; pp. 1–5.
42. Bouzekri, S.; Lasbet, A.A.; Lachehab, A. A new spectral index for extraction of built-up area using Landsat-8 data. J. Indian Soc. Remote Sens. 2015, 43, 867–873.
43. Zha, Y.; Gao, J.; Ni, S. Use of normalized difference built-up index in automatically mapping urban areas from TM imagery. Int. J. Remote Sens. 2003, 24, 583–594.
44. Kawamura, M. Relation between social and environmental conditions in Colombo, Sri Lanka and the urban index estimated by satellite remote sensing data. Int. Arch. Photogramm. Remote Sens. 1996, 7, 321–326.
45. Waqar, M.M.; Mirza, J.F.; Mumtaz, R.; Hussain, E. Development of new indices for extraction of built-up area & bare soil from Landsat data. Open Access Sci. Rep. 2012, 1, 4.
46. Li, X.; Zhou, Y.; Gong, P.; Seto, K.C.; Clinton, N. Developing a method to estimate building height from Sentinel-1 data. Remote Sens. Environ. 2020, 240, 111705.
47. Azzari, G.; Lobell, D.B. Landsat-based classification in the cloud: An opportunity for a paradigm shift in land cover monitoring. Remote Sens. Environ. 2017, 202, 64–74.
48. Pelletier, C.; Valero, S.; Inglada, J.; Champion, N.; Dedieu, G. Assessing the robustness of Random Forests to map land cover with high resolution satellite image time series over large areas. Remote Sens. Environ. 2016, 187, 156–168.
49. Yanar, T.; Akyurek, Z. The enhancement of the cell-based GIS analyses with fuzzy processing capabilities. Inf. Sci. 2006, 176, 1067–1085.
50. Guo, Y.; Chen, P.; Zhu, Y.; Zhang, H. Study on comprehensive evaluation of human settlements quality in Qinghai Province, China. Ecol. Indic. 2023, 154, 110520.
51. Bole, Y.; Rina, S.; Guga, S.; Na, M.; Fan, S.; Zhang, J. Evaluation of resources, environment, and ecological carrying capacity from the perspective of “production-living-ecology” spaces: A case study of western Jilin Province, China. J. Clean. Prod. 2025, 491, 144770.
52. Kuffer, M.; Pfeffer, K.; Sliuzas, R. Slums from Space—15 Years of Slum Mapping Using Remote Sensing. Remote Sens. 2016, 8, 455.
53. Kuffer, M.; Pfeffer, K.; Sliuzas, R.; Baud, I. Extraction of Slum Areas From VHR Imagery Using GLCM Variance. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2016, 9, 1830–1840.
54. Fallatah, A.; Jones, S.; Wallace, L.; Mitchell, D. Combining Object-Based Machine Learning with Long-Term Time-Series Analysis for Informal Settlement Identification. Remote Sens. 2022, 14, 1226.
55. Peng, F.; Lu, W.; Hu, Y.; Jiang, L. Mapping Slums in Mumbai, India, Using Sentinel-2 Imagery: Evaluating Composite Slum Spectral Indices (CSSIs). Remote Sens. 2023, 15, 4671.
56. Matarira, D.; Mutanga, O.; Naidu, M. Texture analysis approaches in modelling informal settlements: A review. Geocarto Int. 2022, 37, 13451–13478.
57. Najmi, A.; Gevaert, C.M.; Kohli, D.; Kuffer, M.; Pratomo, J. Integrating Remote Sensing and Street View Imagery for Mapping Slums. ISPRS Int. J. Geo-Inf. 2022, 11, 631.
58. Kohli, D.; Sliuzas, R.; Kerle, N.; Stein, A. An ontology of slums for image-based classification. Comput. Environ. Urban Syst. 2012, 36, 154–163.
59. Isendahl, C.; Smith, M.E. Sustainable agrarian urbanism: The low-density cities of the Mayas and Aztecs. Cities 2013, 31, 132–143.
60. You, N.; Dong, J.; Huang, J.; Du, G.; Zhang, G.; He, Y.; Yang, T.; Di, Y.; Xiao, X. The 10-m crop type maps in Northeast China during 2017–2019. Sci. Data 2021, 8, 41.
61. Georganos, S.; Grippa, T.; Vanhuysse, S.; Lennert, M.; Shimoni, M.; Kalogirou, S.; Wolff, E. Less is more: Optimizing classification performance through feature selection in a very-high-resolution remote sensing object-based urban application. GIScience Remote Sens. 2017, 55, 221–242.
62. Halder, S.; Roy, M.B.; Roy, P.K.; Sedighi, M. Groundwater vulnerability assessment for drinking water suitability using Fuzzy Shannon Entropy model in a semi-arid river basin. Catena 2023, 229, 107206.
63. Cao, Y.; Li, F.; Xi, X.; van Bilsen, D.J.C.; Xu, L. Urban livability: Agent-based simulation, assessment, and interpretation for the case of Futian District, Shenzhen. J. Clean. Prod. 2021, 320, 128662.

Figure 1. Location of the study area ((a) The location of Mumbai and Maharashtra within India. (b) Land use and land cover (LUCC) map of Mumbai for 2022, with land use data from the ESA WorldCover 10 m data for 2022).

Figure 2. Habitat environment quality evaluation indicators. Data for all indicators are from 2022 as an example. (a) Population density; (b) Night light intensity; (c) House price distribution; (d) Distance to schools; (e) Distance to hospitals; (f) Distance to commercial areas; (g) Distance to parks; (h) PM2.5 concentration levels; (i) Building density; (j) Green area density; (k) Distance to transport infrastructure; (l) Distance to railway stations; (m) Annual temperature; (n) Annual precipitation; (o) RDLS values; (p) Distance to rivers.

Figure 3. Research flow chart.

Figure 4. Feature importance ranking for informal settlement extraction with the random forest model. Feature sources include spectral indices, polarization data, and texture features for the four seasons. The x-axis shows importance values, and the y-axis lists the final feature variables selected by hierarchical clustering. (a) Results for 2017; (b) results for 2022.

Figure 5. Heatmap of Spearman correlations among the finally selected features. Red indicates strong positive correlations (values close to 1), blue indicates strong negative correlations (values close to −1), and white indicates correlations close to 0. To avoid redundant information, only the lower-left triangle of the correlation matrix (the diagonal and below) is shown. (a) Results for 2017; (b) results for 2022.

Figure 6. Spatial Distribution of Informal Settlement Extraction Results. Panels (a,b) represent the extraction results for 2017 and 2022, respectively.

Figure 7. Detailed Images of Informal Settlement Extraction Results. (The rows (a-1–l-1) represent different locations in the Mumbai region with Google Earth imagery for 2017. (a-2–l-2) show the corresponding Sentinel true color imagery for the same year. (a-3–l-3) display the extraction results of IS for 2017 in each area using the RF model.)

Figure 8. Detailed Images of Informal Settlement Extraction Results. (The rows (a-1–l-1) represent different locations in the Mumbai region with Google Earth imagery for 2022. (a-2–l-2) show the corresponding Sentinel true color imagery for the same year. (a-3–l-3) display the extraction results of IS for 2022 in each area using the RF model.)

Figure 9. Spatial distribution result of HEI, panels (a,b) represent the results for 2017 and 2022, respectively.

Figure 10. Area proportions of HEI classification results: horizontal coordinates represent area size, vertical coordinates represent suitability classification, (a) Mumbai urban area; (b) Mumbai informal settlements.

Table 1. Data source and descriptions.

Products	Variables	Data Descriptions	Data Source
Sentinel-1	VV, VH, VVH	10 m 2017, 2022	COPERNICUS/S1_GRD (Collection Snippet in GEE)
Sentinel-2	EVI, BSI, BUI, NBI, BAEI, NDBI, UI, NBAI, etc.		COPERNICUS/S2_SR (Collection Snippet in GEE)
Sentinel-2	Gray-level co-occurrence matrix		COPERNICUS/S2_SR (Collection Snippet in GEE)
Slope	DEM	30 m	COPERNICUS/DEM/GLO30 (Collection Snippet in GEE)
OSM	Buildings and roads	Vector data	OpenStreetMap
Population	Population density	30 m 2017, 2022	LandScan Global
Landsat 8	Land surface temperature	30 m 2017, 2022	LANDSAT/LC08/C02/T1_L2 (Collection Snippet in GEE)
CHIRPS Pentad	Precipitation data	1000 m 2017, 2022	UCSB-CHG/CHIRPS/PENTAD (Collection Snippet in GEE)
DMSP-OLS Nighttime Lights	Night light data	1000 m 2017, 2022	NOAA/DMSPOLS/NIGHTTIME_LIGHTS (Collection Snippet in GEE)
Global and regional PM2.5 concentrations	PM2.5	1000 m 2017, 2022	Washington University— Atmospheric Composition Analysis Group
Buildings Polygons	Building density	Vector data	Google Research—Open Buildings
House price data	House price	CSV	CommonFloor SmartGuard

Table 2. Spectral Index Characterization.

Index Name	Index ID	Calculation Formula	Reference
Enhanced Vegetation Index	EVI	$\mathrm{EVI} = 2.5 \times \frac{\mathrm{NIR} - \mathrm{RED}}{\mathrm{NIR} + C_{1} \times \mathrm{RED} - C_{2} \times \mathrm{BLUE} + L}$	Huete et al. [38]
Bare Soil Index	BSI	$\mathrm{BSI} = \frac{(\mathrm{SWIR1} + \mathrm{RED}) - (\mathrm{NIR} + \mathrm{BLUE})}{(\mathrm{SWIR1} + \mathrm{RED}) + (\mathrm{NIR} + \mathrm{BLUE})}$	Diek et al. [39]
Built-Up Index	BUI	$\mathrm{BUI} = \frac{\mathrm{SWIR1} - \mathrm{NIR}}{\mathrm{SWIR1} + \mathrm{NIR}} - \frac{\mathrm{NIR} - \mathrm{RED}}{\mathrm{NIR} + \mathrm{RED}}$	He et al. [40]
New Built-up Index	NBI	$\mathrm{NBI} = \frac{\mathrm{RED} \times \mathrm{SWIR2}}{\mathrm{NIR}}$	Jieli et al. [41]
Built-up Area Extraction Index	BAEI	$\mathrm{BAEI} = \frac{0.3 + \mathrm{RED}}{\mathrm{SWIR1} + \mathrm{GREEN}}$	Bouzekri et al. [42]
Normalized Difference Built-up Index	NDBI	$\mathrm{NDBI} = \frac{\mathrm{SWIR2} - \mathrm{NIR}}{\mathrm{SWIR2} + \mathrm{NIR}}$	Zha et al. [43]
Urban Index	UI	$\mathrm{UI} = \left(\frac{\mathrm{SWIR1} - \mathrm{NIR}}{\mathrm{SWIR1} + \mathrm{NIR}} + 1.0\right) \times 100$	Kawamura [44]
Normalized Built-up Area Index	NBAI	$\mathrm{NBAI} = \frac{\mathrm{SWIR2} - \mathrm{SWIR1} / \mathrm{GREEN}}{\mathrm{SWIR2} + \mathrm{SWIR1} / \mathrm{GREEN}}$	Waqar et al. [45]
Polarization Data	--	$\mathrm{VVH} = \mathrm{VV} \times \gamma^{\mathrm{VH}}$	Li et al. [46]
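
To make the formulas in Table 2 concrete, the following is a minimal sketch of how a subset of the indices (EVI, BUI, NDBI, BAEI, NBAI) could be computed from reflectance arrays with NumPy. It is not the authors' processing code. The function name, the example pixel values, and the EVI coefficients (C1 = 6, C2 = 7.5, L = 1, the values commonly used with this index) are assumptions made for illustration; inputs are assumed to be surface reflectance scaled to 0–1.

```python
import numpy as np


def spectral_indices(blue, green, red, nir, swir1, swir2,
                     c1=6.0, c2=7.5, l=1.0):
    """Compute a subset of the Table 2 indices from reflectance arrays.

    All inputs are arrays (or scalars) of surface reflectance in [0, 1].
    c1, c2 and l are the commonly used EVI coefficients (an assumption here).
    """
    blue, green, red, nir, swir1, swir2 = (
        np.asarray(b, dtype=float)
        for b in (blue, green, red, nir, swir1, swir2)
    )
    evi = 2.5 * (nir - red) / (nir + c1 * red - c2 * blue + l)
    # BUI is the SWIR1-based built-up term minus the vegetation term.
    bui = (swir1 - nir) / (swir1 + nir) - (nir - red) / (nir + red)
    ndbi = (swir2 - nir) / (swir2 + nir)
    baei = (0.3 + red) / (swir1 + green)
    ratio = swir1 / green
    nbai = (swir2 - ratio) / (swir2 + ratio)
    return {"EVI": evi, "BUI": bui, "NDBI": ndbi, "BAEI": baei, "NBAI": nbai}


# Hypothetical single-pixel reflectances, for illustration only.
pixel = spectral_indices(blue=0.05, green=0.08, red=0.10,
                         nir=0.30, swir1=0.25, swir2=0.20)
for name, value in pixel.items():
    print(f"{name}: {float(value):.4f}")
```

Because the function operates on arrays, the same call works on whole image bands, provided zero denominators (for example, masked pixels) are handled beforehand.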

Table 3. Texture Index Characterization Abbreviation.

GLCM Indices	Abbreviation
Angular Second Moment	asm
Contrast	contrast
Correlation	corr
Variance	var
Inverse Difference Moment	idm
Sum Average	savg
Sum Variance	svar
Sum Entropy	sent
Entropy	ent
Difference Variance	dvar
Difference Entropy	dent
Information Measure of Correlation 1	imcorr1
Information Measure of Correlation 2	imcorr2
Maximum Correlation Coefficient	maxcorr
Dissimilarity	diss
Inertia	inertia
Cluster Shade	shade
Cluster Prominence	prom

Table 4. Random Forest Classification Model Accuracy Validation Results.

Year	Overall Accuracy	Recall Score	Precision Score	F1 Score	Kappa
2017	89.7%	89.7%	89.6%	90.1%	77.2
2022	89.5%	89.5%	89.5%	90.9%	77.3
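
For readers less familiar with the metrics in Table 4, the sketch below shows how overall accuracy, precision, recall, F1 score, and Cohen's kappa are conventionally derived from a binary confusion matrix. The confusion matrix counts and the function name are hypothetical, and the averaging scheme the authors used for their reported scores is not specified here, so this illustrates the standard definitions rather than reproducing Table 4.

```python
import numpy as np


def binary_metrics(cm):
    """Standard accuracy metrics from a 2x2 confusion matrix.

    cm[i, j] is the number of samples of true class i predicted as class j,
    with class 1 taken as "informal settlement" (an assumption of this sketch).
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    tn, fp, fn, tp = cm[0, 0], cm[0, 1], cm[1, 0], cm[1, 1]

    overall_accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)

    # Expected agreement by chance, from the row and column marginals.
    expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total ** 2
    kappa = (overall_accuracy - expected) / (1 - expected)

    return {
        "overall_accuracy": overall_accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical validation counts, for illustration only.
metrics = binary_metrics([[50, 10],
                          [5, 35]])
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that for a single class the F1 score is the harmonic mean of precision and recall, so it always lies between those two values.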

Table 5. Weight results of HEI evaluation indicators.

Primary Indicators	Second-Level Indicators (Positive and Negative Indicators)	Weight (2017)	Weight (2022)
Economic conditions	Night light data/+	0.123	0.216
Economic conditions	House price data/+	0.243	0.221
Social conditions	School distance/−	0.100	0.010
Social conditions	Hospital distance/−	0.098	0.098
Social conditions	Commercial service distance/−	0.099	0.099
Social conditions	Park distance/−	0.098	0.098
Natural conditions	PM2.5 concentration/−	0.105	0.103
Natural conditions	Average annual temperature/+	0.179	0.131
Natural conditions	Average annual precipitation/+	0.199	0.147
Natural conditions	RDLS/−	0.103	0.103
Natural conditions	River distance/−	0.096	0.097
Residential conditions	Building density/−	0.001	0.001
Residential conditions	Green area density/+	0.143	0.080
Residential conditions	Transport distance/−	0.098	0.098
Residential conditions	Railway distance/−	0.099	0.099
Residential conditions	Population density/+	0.112	0.205
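
As a rough illustration of how entropy-based indicator weights of the kind listed in Table 5 are commonly obtained, the sketch below implements the textbook entropy weight method with min-max normalization that respects positive (+) and negative (−) indicators. It is a sketch under stated assumptions, not the authors' implementation: the function name, the example matrix, and the normalization details are hypothetical, and the weights in this sketch are normalized to sum to one.

```python
import numpy as np


def entropy_weights(x, positive):
    """Textbook entropy weight method (illustrative sketch).

    x        : (n_samples, n_indicators) matrix of raw indicator values.
    positive : boolean sequence, True where larger values are better (+),
               False where smaller values are better (-).
    Assumes at least one indicator varies across samples.
    """
    x = np.asarray(x, dtype=float)
    positive = np.asarray(positive, dtype=bool)
    n = x.shape[0]

    # Direction-aware min-max normalization to [0, 1].
    lo, hi = x.min(axis=0), x.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    norm = np.where(positive, (x - lo) / span, (hi - x) / span)

    # Share of each sample within an indicator; a constant indicator is
    # given a uniform share so that it carries no information.
    col_sum = norm.sum(axis=0)
    p = np.where(col_sum > 0, norm / np.where(col_sum > 0, col_sum, 1.0), 1.0 / n)

    # Shannon entropy per indicator, with 0 * ln(0) treated as 0.
    plogp = np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)
    entropy = -plogp.sum(axis=0) / np.log(n)

    # Indicators with lower entropy (more variation) receive larger weights.
    divergence = 1.0 - entropy
    return divergence / divergence.sum()


# Hypothetical values for four grid cells and three indicators:
# night light (+), distance to school (-), PM2.5 (-).
demo = [[12.0, 800.0, 41.0],
        [30.0, 300.0, 44.0],
        [55.0, 150.0, 43.0],
        [ 8.0, 950.0, 42.0]]
weights = entropy_weights(demo, positive=[True, False, False])
print(weights.round(3))
```

In a fuzzy comprehensive evaluation, weights of this kind would then be combined with membership degrees for each suitability class; that step depends on the membership function parameters and is not sketched here.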

Table 6. Statistical features of the HEI.

Year	Region	Highly Suitable Zone (km²)	Moderately Suitable Zone (km²)	Low-Suitability Zone (km²)	Unsuitable Zone (km²)
2017	Mumbai city	76.46	92.33	54.11	7.04
2017	Informal settlements	11.65	13.68	10.81	2.52
2022	Mumbai city	73.04	73.67	37.04	10.57
2022	Informal settlements	13.15	14.53	10.84	3.47
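
The areas in Table 6 can be converted to shares of each region's total classified area with a few lines of arithmetic. The sketch below does this for the values listed in the table; the variable and function names are hypothetical, and the shares are relative to the sum of the four zones in each row, which is an assumption about the appropriate denominator.

```python
# Areas in km² copied from Table 6, ordered as:
# highly suitable, moderately suitable, low suitability, unsuitable.
ZONES = ("highly", "moderately", "low", "unsuitable")
areas_km2 = {
    (2017, "Mumbai city"): [76.46, 92.33, 54.11, 7.04],
    (2017, "Informal settlements"): [11.65, 13.68, 10.81, 2.52],
    (2022, "Mumbai city"): [73.04, 73.67, 37.04, 10.57],
    (2022, "Informal settlements"): [13.15, 14.53, 10.84, 3.47],
}


def zone_shares(areas):
    """Return each zone's percentage share of the row total."""
    total = sum(areas)
    return [100.0 * a / total for a in areas]


shares = {key: zone_shares(values) for key, values in areas_km2.items()}
for (year, region), row in shares.items():
    cells = ", ".join(f"{z}: {s:.2f}%" for z, s in zip(ZONES, row))
    print(f"{year} {region}: {cells}")
```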

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
