1. Introduction
Bandung, one of Indonesia’s most populous cities, is the dynamic urban centre and capital of West Java Province, located in the western part of Java Island. The city is renowned for its natural beauty, vibrant culture, and economic potential. However, the stark contrast between rapid urbanization and the persistence of slums in this metropolitan city points to a significant socio-economic disparity. A significant proportion of Bandung’s population lives in slums and squatter settlements, characterized by substandard living conditions and constrained access to basic infrastructure and utilities.
The expansion of housing and slums in Bandung City is a consequence of the city’s rapid population growth. As of 2023, Bandung City had a population of 2,569,107 and a population density of 15,355/km² [1]. Consequently, the need for adequate and secure housing and settlements is also rising, while the capacity of local and central governments to deliver infrastructure and utilities has not kept pace. The government’s capacity to prevent and alleviate slum housing and settlements is also constrained by the division of authority. According to Regulation of the Minister of Public Works and Housing Number 14/PRT/M/2018 of 2018 on Prevention and Quality Improvement of Slum Housing and Slum Settlements, the city government is responsible for slum areas of less than 10 hectares, the provincial government for those between 10 and 15 hectares, and the central government for those of more than 15 hectares [2].
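The division of authority described above can be expressed as a simple lookup. The following is a minimal sketch only: the function name is hypothetical, and treating exactly 10 ha and exactly 15 ha as provincial is an assumption, since the text does not state how the boundary values are assigned.

```python
def responsible_authority(area_ha: float) -> str:
    """Return the government level responsible for a slum area of a given size.

    Thresholds follow the text: < 10 ha city, 10 to 15 ha provincial,
    > 15 ha central. Inclusive handling of 10 and 15 ha is an assumption.
    """
    if area_ha <= 0:
        raise ValueError("area_ha must be positive")
    if area_ha < 10:
        return "city"
    if area_ha <= 15:
        return "provincial"
    return "central"


# Hypothetical slum areas (in hectares), for illustration only.
for area in (4.2, 12.0, 23.5):
    print(area, "ha ->", responsible_authority(area))
```

Under these assumptions, a 12 ha slum area would fall to the provincial government, which is why area size matters when selecting study locations.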
The assessment of slum areas in Indonesia, particularly in Bandung City, is predominantly the responsibility of local governments, which categorize these areas into “light slum” and “medium slum.” This classification is crucial for understanding the varying degrees of challenges faced by these communities. According to the Regulation of the Minister of Public Works and Housing Number 14/PRT/M/2018, light slums are identified by their substandard housing and infrastructure, which require basic improvements. In contrast, medium slums face more severe issues such as overcrowding and limited access to essential services, necessitating more comprehensive interventions [2].
A comprehensive understanding of slums is imperative for policymakers, researchers, and residents alike, as it facilitates the identification and resolution of the numerous challenges associated with slums. Slums share common features: a growing population that is not matched by an adequate supply of housing, land, or infrastructure. In many cases, residents have no access to basic infrastructure and services. The classification of slum areas is set out in Regulation Number 14/PRT/M/2018 on Prevention and Quality Improvement of Slum Housing and Slum Settlements, issued by the Minister of Public Works and Housing. This classification comprises three categories: light, medium, and heavy. The assessment is based on seven scored indicators: buildings, environmental roads, drinking water supply, environmental drainage, wastewater management, waste management, and availability of fire protection. The categorization of slum areas as light, medium, or heavy depends on infrastructural deficiencies, the gravity of the problem, and whether the infrastructure is in a critical or vulnerable state. A light slum classification denotes the need to repair basic infrastructure that already exists but is not adequately maintained. Medium slums require the revitalization of settlements. Finally, heavy slums require either relocation or reorganization through government-sponsored programs.
The challenges confronting slum dwellers are manifold, encompassing housing and infrastructure that are vulnerable to flooding and other environmental hazards, resulting in unstable living conditions. Moreover, there is a paucity of clean water and proper and safe sanitation facilities. In addition, there are gaps in health services, limited educational opportunities, and economic disparities.
Slum mapping methodologies can be classified into three primary categories: survey-based, participatory, and remote sensing (RS)-based approaches [3]. While both survey-based and participatory methods demand significant human resources for effective implementation, RS-based techniques are less resource-intensive in terms of fieldwork; however, they necessitate specialized expertise in remote sensing for accurate image interpretation [4]. The advantages of utilizing RS data include the ability to conduct regular updates and the capacity to gather information from aerial perspectives, which are particularly beneficial for areas that are otherwise difficult to access on the ground [5].
In recent years, high-resolution remote sensing technology has been widely applied to mapping and analyzing changes in land cover, as evidenced by numerous research studies [6]. Land cover change has been identified as a key indicator in remote sensing slum mapping techniques, and previous researchers have extensively explored its potential [3]. A conventional approach is visual image interpretation, wherein experts manually delineate slums. While this method has been shown to produce highly accurate slum maps [7], it is inherently time-consuming and its accuracy depends on the subjective perspectives of the experts involved. Consequently, disparities in boundary identification may emerge among different experts [8]. Where official delineations of slum areas are deemed obsolete or incompatible with contemporary realities, visual interpretation is frequently used to validate boundaries defined by alternative methods.
An alternative to manual delineation is automated machine learning (ML), which offers a systematic approach to defining slum boundaries using remote sensing imagery (RSI). Within this framework, Object-Based Image Analysis (OBIA) represents a specific subset of techniques designed for slum mapping [9]. The process begins with image segmentation, wherein adjacent pixels are clustered into ‘objects’ that correspond to significant features within the imagery, such as buildings and vegetation. Each object is characterized by attributes such as size and shape, which are instrumental in identifying slum areas through a predefined set of rules [10,11]. However, formulating these rule sets is challenging, as the morphological characteristics of slums can vary significantly both between and within urban areas [12].
Contemporary deep learning (DL) methodologies have become a prevailing trend in image classification, including slum mapping. These techniques use artificial neural networks to learn features from data automatically, with the potential to improve the accuracy and efficiency of slum identification compared with traditional methods. Integrating DL with existing ML approaches, such as Random Forest (RF), enables researchers to improve the classification of image objects in complex urban landscapes. This combination has been demonstrated to enhance the accuracy of slum mapping while circumventing the limitations frequently associated with rule-based OBIA techniques. Consequently, it provides a more robust solution for urban analysis [9,13].
Remote sensing imagery offers a broad perspective of a region but falls short in conveying intricate details that are essential for a comprehensive understanding of slum environments. It lacks the capacity to reveal specific characteristics such as the materials used in the wall construction, the number of stories in buildings, or the existence of open drainage systems within the community. To acquire this nuanced information, it is essential to conduct ground-level investigations through methods such as field surveys, interviews, focus groups, or by utilizing street view images (SVIs), which capture photographs from the ground perspective. Despite their potential, the application of SVIs in slum identification remains limited, as noted by [14].
Furthermore, the inherent limitations of SVIs must be acknowledged, as they primarily provide localized data along well-traveled streets and therefore fail to generate comprehensive spatial representations of slum areas. For instance, where access is restricted to pedestrian pathways, relying solely on SVIs becomes impractical, given that these images are typically captured from vehicles. This constraint underscores the necessity for a multifaceted approach to data collection that combines various methodologies to map and understand the complexities of slum regions accurately.
Meanwhile, Bandung City, characterized by a confluence of population pressure, economic inequality, weak governance of land and housing legality, and its basin geography, is highly vulnerable to the annual expansion of slum areas. The Bandung City Government has hitherto determined slum areas every five years using a gradual field survey method. Nevertheless, given the potential for slum areas to expand annually, there is a need for more inclusive urban governance. Consequently, this research recommends the periodic and inclusive compilation of spatial data inventories to facilitate the effective targeting of slum upgrading initiatives. This inventory can be compiled regularly using an integrated approach based on multi-resolution texture analysis, a method developed to identify slums in remote sensing images [15]. Furthermore, the integration of remote sensing imagery and street view imagery (SVI) for slum mapping in Jakarta City has been enhanced by employing deep learning networks, including VGG [16]. VGG is a convolutional neural network (CNN) known for its deep architecture, which consists of multiple layers. It is particularly effective in image classification tasks, owing to its capacity to capture intricate features through multiple convolutional layers [16,17]. The efficacy of remote sensing and street view imagery as tools for mapping slums in Bandung City will be assessed through a comparative analysis with field survey results from several areas of the city in 2024.
The conditions described above were the catalyst for the author’s research on the spatial mapping inventory of potential slum areas in Bandung City with remote sensing and street view imagery and a comparative analysis of the effectiveness of this method as opposed to field surveys.
2. Study Area
As a city that has recently transformed into a metropolitan city, Bandung City covers an area of 167.31 km² and is bordered by Bandung Regency and West Bandung Regency to the north, Bandung Regency to the east and south, and West Bandung Regency and Cimahi City to the west [18]. Bandung lies at an altitude of about 700 m above sea level. The highest point of the city is in Ledeng Village, Cidadap Sub-district, at 892 m above sea level, while the lowest point is in Rancanumpang Village, Gedebage Sub-district, at 666 m above sea level. As can be seen in Figure 1, the city is divided into 30 sub-districts, which consist of 151 villages. Among them, Gedebage is the largest sub-district, covering an area of 9.58 km², while Astanaanyar is the smallest sub-district with an area of only 2.89 km² [1].
In addition, Bandung City has a unique geographical shape, located in a basin surrounded by mountains and hills. This uniqueness has significant impacts on its climate, urban planning, environment, and development issues. While Bandung has a mild climate and fertile soil, the basin also causes some serious problems. Air pollution is common as it is difficult for the air to escape the basin, and river blockages cause flooding and lead to poor drainage. The city is also exposed to the effects of urban heat and is experiencing an expansion of hillside slums. Therefore, integrated urban planning is urgently needed to prevent the growth of slums in Bandung City, which is fueled by the rapid increase in population density each year. One comprehensive urban planning effort involves mapping slum areas and taking appropriate preventive measures.
4. Results
The performance metrics for the four slum classification networks used in this research are detailed in Table 3. The models were evaluated using Equations (1)–(3). Among the models using remote sensing imagery (RSI) alone, the Fully Convolutional Network (FCN) attained an intersection over union (IoU) of 57.28, while the FCN with Dilated Kernels (FCN-DK) recorded an IoU of 54.47. Among the networks combining RSI and street view imagery (SVI), the FCN-DK attained an IoU of 86.01 and the standard FCN an IoU of 78.89. Notably, the FCN-DK model integrating both types of imagery outperformed the FCN with the same inputs, achieving the highest F1 score and IoU. This improvement is associated with a 1.83% increase in recall from the FCN to the FCN-DK.
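To make the reported metrics concrete, the sketch below computes precision, recall, F1 score, and IoU from true positive, false positive, and false negative counts. Equations (1)–(3) are not reproduced in this section, so the standard definitions are assumed here; the function name and the example counts are hypothetical and are not taken from Table 3.

```python
def segmentation_metrics(tp: int, fp: int, fn: int) -> dict:
    """Standard metrics for one class from pixel (or sample) counts.

    Assumed definitions (not copied from the paper's equations):
      precision = TP / (TP + FP)
      recall    = TP / (TP + FN)
      F1        = 2 * precision * recall / (precision + recall)
      IoU       = TP / (TP + FP + FN)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    union = tp + fp + fn
    iou = tp / union if union else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# Hypothetical counts for illustration only.
metrics = segmentation_metrics(tp=80, fp=10, fn=10)
print({name: round(value, 4) for name, value in metrics.items()})
```

With these definitions IoU and F1 are tied by IoU = F1 / (2 − F1), so a model that ranks highest on one also ranks highest on the other, consistent with the FCN-DK leading on both.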
Concurrently, the field survey results were subjected to testing using decision tree modeling. The results of the model evaluation on the 2024 survey data demonstrated outstanding classification performance, attaining perfect accuracy. As demonstrated by the confusion matrix, the 435 samples classified as ‘light slum’ and the 27 samples designated as ‘medium slum’ were accurately identified, with no errors observed in the prediction process. This finding underscores the survey’s high reliability, even when evaluated on unseen data.
A comparison of the confusion matrices of the FCN-DK (RSI + SVI) model and the field survey in Figure 5 shows that the former processed a substantially larger dataset, 2323 samples compared with the 462 samples in the field survey. For light slum classification, the FCN-DK model achieved a 99.5% accuracy rate, correctly categorising 2283 light slum areas and misclassifying only 11 as medium slum areas. In contrast, the field survey classified all 435 light slum areas correctly, attaining 100% accuracy. For medium slum classification, the FCN-DK model correctly identified 29 medium slum areas with no misclassification, while the field survey identified 27 medium slum areas, also with no misclassification.
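The arithmetic behind these figures can be checked directly from the reported counts. The sketch below is illustrative only: it assumes the confusion matrix is read as (reference class, predicted class), and the helper names are hypothetical.

```python
# Counts reported for the FCN-DK (RSI + SVI) model.
# Keys are (reference class, predicted class); this orientation is an assumption.
confusion = {
    ("light", "light"): 2283,
    ("light", "medium"): 11,
    ("medium", "light"): 0,
    ("medium", "medium"): 29,
}


def overall_accuracy(cm: dict) -> float:
    """Share of all samples whose predicted class matches the reference class."""
    correct = sum(n for (ref, pred), n in cm.items() if ref == pred)
    return correct / sum(cm.values())


def class_recall(cm: dict, label: str) -> float:
    """Share of reference samples of `label` that were predicted as `label`."""
    row_total = sum(n for (ref, _), n in cm.items() if ref == label)
    return cm[(label, label)] / row_total


def class_precision(cm: dict, label: str) -> float:
    """Share of samples predicted as `label` that truly belong to `label`."""
    col_total = sum(n for (_, pred), n in cm.items() if pred == label)
    return cm[(label, label)] / col_total


print("samples:", sum(confusion.values()))
print("overall accuracy (%):", round(100 * overall_accuracy(confusion), 1))
print("light recall (%):", round(100 * class_recall(confusion, "light"), 1))
print("medium recall (%):", round(100 * class_recall(confusion, "medium"), 1))
print("medium precision (%):", round(100 * class_precision(confusion, "medium"), 1))
```

Under this reading, the 11 light slum areas predicted as medium lower the precision of the medium class while leaving its recall at 100%.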
Both methodologies accurately identify medium slum areas with zero false negatives, meaning that no medium slum area was incorrectly classified as light slum in either approach. This indicates high sensitivity to the more severe class. Notably, the FCN-DK model shows a slight tendency to classify light slum areas as medium slum, which can be interpreted as a conservative approach: the model is more likely to assign an area to a higher severity category than to overlook a genuinely more severe one. Such a strategy may prioritize the identification of at-risk populations, ensuring that no significant slum areas are overlooked, even if it leads to some over-classification.
This effective classification performance is further supported by the training and validation loss trends observed in the Fully Convolutional Network (FCN) and Fully Convolutional Network with Dilated Kernels (FCN-DK) models, as illustrated in Figure 6 and Figure 7. In Figure 6, the downward trajectory of both training and validation loss over time signifies that the model is not only learning effectively but also generalizing well to new, unseen data. The initial high training loss, which progressively declines, reflects the model’s capacity to refine its predictions, while the closely aligned validation loss indicates that there are no signs of over-fitting. This strong performance and effective generalization reinforce the model’s capability to accurately classify slum areas, aligning with the earlier assertion of its precision and conservative classification tendencies. As the training process concludes, the stabilization of both losses suggests that the model has converged, ensuring that significant slum areas are reliably identified.
Meanwhile, Figure 7 depicts the training loss and validation loss associated with a Fully Convolutional Network with Dilated Kernels (FCN-DK) model utilizing remote sensing imagery over the course of 25 training epochs. At the outset, both loss metrics are elevated; however, they decline significantly during the initial epochs, signifying successful learning. Although the losses continue to diminish, the rate of decrease slows over time, ultimately stabilizing around the tenth epoch. Notably, throughout the training process, the validation loss consistently remains lower than the training loss, indicating effective generalization and the absence of over-fitting. In the later stages of training, both loss values stabilize, suggesting that the model has converged and shows limited capacity for further improvement. This observation underscores the model’s effective learning capabilities and its strong generalization, potentially influenced by techniques such as regularization or dropout.
Given the high accuracy achieved by each methodology, the recognized slum areas were plotted on the revised slum reference map, which included only the classifications of ‘light slum’ and ‘medium slum’ as defined by the local government’s slum area mapping framework. The findings for both the FCN and FCN-DK approaches are detailed in Table 3, while Figure 5 and Figure 8 provide a visual representation of the outcomes produced by these two methods.
Additional analyses were performed to investigate how different combinations of input datasets influence the ability of deep learning models to classify the various slum categories. The FCN and FCN-DK models demonstrated comparable performance levels. Subsequently, the prediction outcomes derived from the FCN and FCN-DK were categorized into multiple slum classifications, as illustrated in Figure 9.
The comparative performance chart (Figure 9) illustrates the effectiveness of the various Fully Convolutional Network (FCN) configurations in detecting slum areas against field survey results. The FCN with remote sensing imagery (RSI) identified 1647 light slum and 55 medium slum areas, achieving a recall rate of 64.13%. The FCN-DK with RSI performed better on recall, detecting 1545 light slum and 37 medium slum areas with a recall of 72.83%. Incorporating street view imagery (SVI) alongside RSI substantially improved the performance of the FCN model, which identified 569 light slum and 12 medium slum areas with a recall rate of 97.93%. The FCN-DK with both RSI and SVI achieved the highest detection results, identifying 2283 light slum and 40 medium slum areas with a recall of 99.74%, closely aligning with the field survey results, which documented 435 light slum and 27 medium slum areas with 100% recall. This progression underscores the efficacy of the FCN-DK architecture in conjunction with both RSI and SVI, which achieved the most accurate automated slum detection and could process larger geographical areas. However, while the integration of SVI and RSI enhances the precision of slum segmentation, it encounters significant challenges in slum classification due to limitations inherent in the dataset. The 2020 Slum Area Map inadequately delineated severity categories, and although the FCN-DK achieved higher classification accuracy, the FCN showed comparable results for light slum categories.
The FCN with RSI demonstrated the poorest performance, due to its reliance on a solitary, restricted data source (RSI) and a rudimentary model architecture, which collectively proved inadequate for the accurate and comprehensive detection of slum areas. The incorporation of more sophisticated features (e.g., SVI) and the utilization of enhanced architectures (e.g., FCN-DK) has been demonstrated to result in a substantial enhancement in detection performance, as evidenced by the elevated recall rates exhibited by the alternative models.
Nonetheless, the survey results presented in Figure 9 demonstrate that, irrespective of the efficacy of the FCN-DK model in identifying slums, the most reliable approach remains the field survey method. However, because the field survey method requires a greater investment of human resources and time to classify the slum level of a large area, it naturally generates less data than the other methods.
5. Discussion
5.1. Overview of Methodologies
The results of this research imply that the addition of information from SVI can improve the accuracy in slum classification when compared to the exclusive use of RSI. However, the effectiveness of this improvement is highly dependent on the method of integration of SVI into the FCN-DK network as well as the quality and characteristics of the dataset used. A simple combination of RSI and SVI in FCN-DK showed significant improvement compared to using RSI as the only input.
Conversely, slum mapping using the field survey method has been demonstrated to be the most reliable and effective approach for classifying the slum level of an area, compared with using RSI and SVI with FCN-DK or FCN modeling alone.
The following conclusion is drawn from comparing the results obtained using FCN and FCN-DK, focusing on evaluating the combination of RSI and SVI compared to the use of RSI alone. The findings show that incorporating SVI increases the accuracy of slum mapping in the absence of direct field surveys. Figure 8 compares the slum visualization outputs generated by the FCN and FCN-DK models separately.
5.2. Comparative Analysis of Slum Classification
Street view imagery and features from remote sensing provide important information regarding land surface, water table, and building density, all of which contribute to a better understanding of slums in urban areas and improve the prediction accuracy of FCN-DK. As illustrated in Figure 10, the locations indicated by yellow circles denote areas in which FCN-DK exhibits superior performance compared to FCN due to the availability of Google Street View locations.
Meanwhile, as demonstrated in Figure 11, the areas highlighted in blue illustrate the regions where the FCN and FCN-DK models exhibit substandard performance, attributed to their restricted access to Google Street View (GSV) locations. Conversely, the areas highlighted in yellow show the regions where FCN-DK demonstrates superior performance compared to FCN, due to its enhanced access to GSV locations.
Figure 11 reveals that the integration of SVI has the potential to improve the identification of various criteria that signify the presence of slum areas. In Figure 11 column c, which is the result of slum mapping using RSI alone, there is a light slum classification in one area, while in columns a and b, it can be seen that the addition of SVI provides more detailed information that causes the model to classify several points as medium slums in the same area. This indicates that the integration of SVI can capture characteristics of slums that may not be detected by RSI, such as the conformity of buildings to prescribed technical standards, the condition of neighbourhood drainage, and the quality of nearby roads. In this research, the approach used is a Convolutional Neural Network (CNN), which was chosen for its architectural simplicity in accordance with the characteristics of the available dataset, which consists of only two categories, namely light slums and medium slums. An interview with the local government confirmed that only these categories have been designated in Bandung City since 2020, owing to various efforts to deal with slums through the KOTAKU (Kota Tanpa Kumuh) Programme from 2018 to 2022.
Compared with the FCN-DK results, the classification of slum areas provided by field survey mapping (Figure 12) is conservative and subjective, with areas described as ‘light slums’ not further distinguished by severity. Conversely, the FCN-DK model provides a more detailed and automated classification, frequently identifying ‘medium slum’ conditions in areas that field surveys consider to have light slum characteristics. This discrepancy may be attributed to the model’s use of high-resolution imagery and street-level views, which facilitate the detection of features such as building density and road conditions that may be overlooked by field surveys due to constraints such as accessibility and subjectivity.
5.3. Performance Evaluation
This experiment shows that the complexity of the dataset strongly affects modeling performance, network selection, and the training process for slum classification. The Places365 VGG16 network is a convolutional neural network (CNN) architecture that has been demonstrated to be highly effective in image classification, particularly in the recognition of various environments. However, the complexity of its architecture may not be compatible with simpler datasets, such as the 2020 Bandung City Cumulative Map dataset, which is considered too simple for the capabilities of the VGG16 model. This mismatch can result in suboptimal classification performance [16]. In addition, the dataset from the Bandung City Government shows that the distribution of slum areas is uneven, varies little within each area, and does not cover all slum categories (light, medium, and heavy), causing an imbalance when using more complex models such as those of the Visual Geometry Group (VGG). VGG is better suited to learning varied data patterns, so it could improve the recognition of slum category variations given a more diverse and balanced dataset [16].
5.4. Future Research Directions
A key area for future research is optimizing the incorporation of a more diverse and pertinent set of variables or features. Such features may include the socio-economic data of the population, the availability of basic infrastructure, and environmental conditions surrounding residential areas. It is hypothesized that these variables could contribute more to distinguishing the level of slum-ness in a more in-depth and comprehensive manner.
Furthermore, more sophisticated classification algorithms, such as XGBoost, LightGBM, or ensemble stacking, could also be explored, as they are adept at handling imbalanced data and high feature complexity with competitive performance. It is recommended that future research explore the application of data balancing methods, such as SMOTE or ADASYN, to address imbalances between classes [27]. For instance, this could help to mitigate the dominance of ‘light slum’ over ‘medium slum’.
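As an illustration of the idea behind SMOTE-style balancing, the sketch below generates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. It is a simplified, standard-library-only approximation written for this explanation, not the reference SMOTE or ADASYN implementation and not a method used in this study; the function name, the parameter `k`, and the feature values are hypothetical.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points from a list of minority feature tuples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the base sample, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + t * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Hypothetical two-feature 'medium slum' samples, for illustration only.
medium = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
extra = smote_like_oversample(medium, n_new=5)
print(len(medium) + len(extra), "medium samples after oversampling")
```

In practice, a maintained library implementation would normally be preferred, and oversampling should be applied to the training split only so that evaluation data remain untouched.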
A further concern pertains to the paucity of data from the minority category, designated here as the ‘medium slum’. This deficiency may have ramifications for long-term prediction, specifically the potential alteration of its stability, should there be any modification in the future class distribution. Consequently, it is imperative that efforts are made to collect more balanced and representative data in future surveys. It is anticipated that these developments will enhance the reliability and precision of the classification model, thereby ensuring its efficacy in providing decision support for the dynamic conditions of residential areas in Bandung City.
Although merging SVI and RSI shows significant potential in slum classification, further research is needed to optimize this integration. This is particularly due to the lack of direct representation of more detailed slum criteria, such as safe access to drinking water, domestic wastewater management, solid waste management, and the availability of fire protection.
5.5. Limitations and Considerations
SVI coverage does not always include all slum areas. This limitation arises because SVI can only capture information along main roads [14], while small streets or alleys are often omitted or unavailable in Google SVI. This research uses inverse distance weighting (IDW) interpolation to generate SVI feature maps, so that a single image can represent multiple locations. Slum areas are also detected in areas with alleys or small roads whose images are not available in Google SVI; this gap is addressed by using the 2021 Bandung City neighborhood road database. Further research could explore more accurate and advanced methods to address this issue of inconsistent coverage, including techniques that specifically handle missing data or that use additional data, such as the street networks and city blocks available on OpenStreetMap, to spatially improve SVI interpolation [28].
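The IDW interpolation mentioned above estimates a value at an unsampled location as a distance-weighted average of nearby observations. The following is a minimal sketch of the general technique, not the configuration used in this study: the power parameter of 2, the function name, and the sample coordinates and feature values are all assumptions for illustration.

```python
import math


def idw_interpolate(samples, query, power=2.0):
    """Inverse distance weighting at a query point.

    samples: list of ((x, y), value) pairs, e.g. an SVI-derived feature score
             at the location where each image was taken (hypothetical).
    query:   (x, y) location to estimate.
    power:   distance exponent; 2.0 is a common default, assumed here.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in samples:
        distance = math.hypot(query[0] - x, query[1] - y)
        if distance == 0.0:
            return value  # the query coincides with a sample location
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical SVI feature scores along a street, for illustration only.
svi_points = [((0.0, 0.0), 0.0), ((2.0, 0.0), 1.0)]
print(idw_interpolate(svi_points, (1.0, 0.0)))  # midway between the two samples
```

Because weights fall off with distance, locations far from any street view image are estimated mostly from their closest sampled streets, which is why sparse coverage in alleys limits the reliability of the resulting feature map.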
The SVI feature map was prepared by collecting imagery through the Google Application Programming Interface (API), focusing on 15 urban villages with an area of more than 10 ha. The area selection was based on the authority of the provincial and central governments in accordance with the applicable regulation, namely the Regulation of the Minister of Public Works and Housing Number 14/PRT/M/2018 of 2018 on Prevention and Quality Improvement of Slum Housing and Slum Settlements. Limitations of the Google API meant that not all SVIs in Bandung City could be accessed. Therefore, programmed data collection focused on priority areas, resulting in 2904 images, which were then combined with neighborhood road imagery from the Bandung City Government in 2021 as additional data.
The obtained images and components covered only 33% of the total required SVIs, and the existing SVIs were integrated as layers in FCN-DK. Future research is expected to explore methods to generate feature maps that are able to capture more information from SVIs without being hampered by Google API access limitations. In addition, it is also expected to produce SVIs that are easier to integrate into FCN-DK without requiring many spatial layers [16].
5.6. Implications for Policy and Practice
A comparison with the results of the 2024 field survey, which indicated slum areas in several locations, suggests that the accuracy achieved through field survey methods is on par with, if not superior to, that of remote sensing imagery and street view imagery alone. This finding indicates that the most recent survey results also provide substantial support for incorporating more inclusive mapping results before the city government formulates a slum upgrading policy. Satellite and street view imagery have proved advantageous in reducing analysis time and extending the area covered. By contrast, the field survey method requires a greater investment of time and covers a more limited area when undertaken over a period of one year.
The findings of this study are expected to encourage district and city administrations to adopt remote sensing imagery (RSI) and street view imagery (SVI) for delineating slum areas, optimizing field survey data collection while saving time and costs. Future research may explore the applicability of FCN and FCN-DK models with RSI and SVI in diverse contexts. By developing a comprehensive overview of identifiable slum characteristics, this study offers insights that can be adapted to local contexts, aiding in the identification and resolution of slum conditions. Engaging stakeholders can enhance tailored interventions, while the application of RSI and SVI in mapping and monitoring can support urban planners and policymakers in improving slum areas, ultimately fostering community resilience and sustainability.
The conceptualization of slum classification cannot be overlooked, as it must be adapted to the local context of each region [28]. In Indonesia, slum classification is based on seven criteria: building condition, neighborhood road accessibility, drainage system, drinking water supply, domestic wastewater management, solid waste management, and availability of a fire protection system. In addition, the social complexity of slums, which are considered to be inhabited by people of lower–middle socio-economic status, is difficult to measure from satellite imagery alone. Further research is expected to combine such imagery with population density or poverty level maps to produce a more representative mapping for policymakers. Therefore, developing a model that accords with the conceptualization of slums in the local Indonesian context is crucial to the process of making slum maps.