1. Introduction
Since the 1950s, the world’s population has grown from 2.6 billion to 7.7 billion, and is expected to reach approximately 9.7 billion by 2050 [
1]. A larger population increases the demand for energy, food, housing, water, transportation, and healthcare. To meet these demands, humans have exploited natural resources and consequently changed the surface of the Earth. Land use and land cover (LULC) change is the most obvious indicator of the Earth’s surface changes [
2,
3]. Because of communities’ social and physical characteristics, the distribution of LULC changes varies in space and time. According to recent studies, LULC change can show the adverse effects of different environmental and socioeconomic factors on the Earth’s surface, e.g., climate, water balance, biodiversity, and terrestrial ecosystems [
4,
5,
6]. Among these factors, the impact of LULC on the ecosystem has attracted the most attention from researchers [
3,
7,
8,
9,
10]. Over the past decades, remote sensing (RS) systems have collected valuable information about the Earth’s surface. Although RS provides valuable data for researchers, standard software packages do not deliver a comprehensive solution for analyzing and managing Earth data [
10,
11]. In recent decades, a variety of RS sensors, such as Moderate Resolution Imaging Spectroradiometer (MODIS), Satellite Pour l’Observation de la Terre (SPOT), Landsat, and Sentinel-2, have been launched [
12,
13]. Different methods for extracting information from these remotely sensed data have been introduced and developed. Among these data sources, Landsat is the only one that has provided data for more than four decades and is still active. The spatial resolution of this dataset makes it possible to study Earth’s surface dynamics, which is necessary for policy development, management, and scientific inquiry [
1]. The Landsat dataset provides relatively high spatial resolution (30 m, and 15 m in the panchromatic (PAN) band of Landsat 7 and 8) and temporal resolution (16 days with a single satellite, or 8 days when data from both satellites are combined) over large extents [
3,
13]. Other sensors, such as Sentinel-2, either cannot provide a continuous data source over such a long period (four decades) or suffer from other problems such as lower spatial, spectral, or temporal resolution.
There are many classification algorithms for LULC mapping, including parallelepiped [
12,
14], minimum distance (MD) [
14], maximum likelihood (ML) [
15], fuzzy classification (FC) [
16], Artificial Neural Network (ANN) [
17], Support Vector Machine (SVM) [
14], Random Forest (RF) [
3,
18,
19], deep learning (DL) [
20,
21], and deep transfer learning (DTL) [
21,
22]. Among these classifiers, RF has gained great popularity and become one of the best candidates for LULC mapping because of its efficiency, high accuracy, relatively low computational cost, and small number of parameters [
17,
21]. Although DL and DTL methods have become effective computational approaches in machine learning in recent years, they require extensive training data, and their complex computations are constrained on cloud platforms [
20].
In recent years, to overcome the challenge of big data analytics in RS, Google developed the Google Earth Engine (GEE) platform to process huge volumes of data spanning long time periods [
23]. The GEE project was launched in 2010 and has become the most popular platform for analyzing big Earth observation (EO) data [
10,
24]. GEE hosts many datasets, including raw and preprocessed imagery and elevation models, covering regional, national, and global extents [
10]. Among the available datasets in GEE, the Landsat imagery archive (e.g., Landsat 1–3 (1972–1983), Landsat 4 (1982–1993), Landsat 5 (1984–2012), Landsat 7 (1999–present), and Landsat 8 (2013–present)) is one of the most commonly used products for various agro-environmental applications such as multi-temporal image classifications [
25], multi-temporal cloud masking [
26], and multi-temporal settlement and population mapping [
27]. The GEE platform also reduces the time and effort spent on preprocessing the satellite imagery available in this environment [
1].
Various approaches for LULC change analysis are used [
27,
28,
29]. Researchers usually use time-series indices derived from multispectral satellite data for LULC change analysis. Change analysis based on the spatiotemporal variation of land surface biophysical properties is one of the most commonly used methods. Noi Phan et al. investigated the role of image composition in LULC classification using RF in the GEE platform [
17]. This study showed that input features are critical and directly influence final classification accuracy. Mugiraneza et al. examined gray-level co-occurrence matrix (GLCM) texture features with standard spectral features to monitor urban LULC change with Landsat time-series imagery [
29,
30]. The features in LULC classification are generally categorized into spectral, textural, and geometrical features. The type of dataset, area conditions, and auxiliary data availability are the main factors behind selecting a type or a combination of these features in the studies [
3,
31,
32]. Different studies have shown the relation between land surface temperature (LST) and LULC [
33,
34,
35], and it is worth checking the magnitude of this feature’s impact on Landsat LULC classification using the variable importance (VI) calculated in RF. Gilbertson et al. [
34] applied a pan-sharpening process to multi-temporal Landsat 8 (L8) imagery to differentiate between crop types with different classification techniques and showed that it makes the classification of agricultural fields more accurate.
This study differs from previous efforts to map LULC in three aspects. First, we used a pan-sharpening method to increase the spatial resolution of the RGB bands in top-of-atmosphere (TOA) data from Landsat 7 and 8. Researchers usually use surface reflectance (SR) instead of TOA Landsat data [
1]. Because the panchromatic band is not available in SR Landsat data, researchers who use SR cannot apply pan-sharpening. In the SR product of Landsat, a higher level of radiometric correction therefore comes at the expense of the higher spatial resolution. Second, we used VI in RF to determine the essential features in the LULC mapping process. The results showed that LST and the digital elevation model (DEM) are valuable features in LULC mapping. Cloud platforms make more features available for classification, which allowed us to determine suitable features for classification and change detection applications. Selecting the essential variables is necessary for datasets with many variables. Third, we compared the results of our proposed method with the Copernicus Global Land Cover Layers (CGLCL) map to demonstrate the efficiency of our proposed method. The CGLCL is a global discrete land cover map at 100 m resolution produced by processing Earth observation (EO) data. This map is generated by applying several proven individual methodologies to high-quality external data (e.g., PROBA-V, Sentinel, Landsat, DEM, and land-sea mask) [
35]. This framework increases the classification accuracy for seven to nine generations of the Landsat series, enabling continuous and long-term monitoring and analysis of LULC change. The rest of the paper is organized as follows. First, the dataset used in this study is described. Then, the methodologies of the RF algorithm for LULC classification in the GEE platform are presented in
Section 2. In
Section 3, the results and analyses are presented. A discussion is presented in
Section 4, and finally, the paper’s findings are summarized in
Section 5.
3. Results
The present study aimed to design a method for reliable LULC classification. As mentioned in
Section 2, our methodology consists of four main steps. First, we introduced the Landsat images and collected training and test data separately. Next, we introduced standard or pan-sharpened spectral bands and additional variables for classification. Then, RF classification was applied to classify the images. Finally, we performed an accuracy assessment using the test data to evaluate the resulting LULC maps and determined the importance of each input variable using VI for the Isfahan region, home to a major city in the Middle East.
In this study, we used two approaches, map-to-map comparison and independent accuracy assessment, to evaluate the performance of the proposed methodology. To this end, we used CGLCL data and Landsat SR data.
Pan-sharpening is one of the steps used for Landsat 7 and Landsat 8 data. It was applied only to data acquired by the sensors mounted on Landsat 7 (ETM+) and Landsat 8 (OLI) because the pan band is not available in the sensor used on Landsat 5 (TM) or in the earlier generation of Landsat instruments (MSS). In this step, we used the pan band to increase the spatial resolution of the RGB bands.
Figure 3 compares a sample of the 15 m pan-sharpened image with the standard 30 m RGB bands for 2019. In
Figure 3, roads and buildings are sharper and easier to discriminate in the pan-sharpened image, which should help the classifier extract classes more effectively.
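The pan-sharpening step can be illustrated with a minimal sketch. This section does not state which pan-sharpening algorithm was applied, so the sketch assumes the HSV substitution approach that is commonly used in GEE (via `ee.Image.rgbToHsv` and `ee.Image.hsvToRgb`): the RGB pixel is converted to HSV, its value channel is replaced by the panchromatic value, and the result is converted back to RGB. The function names, the nearest-neighbour upsampling, and the sample values are illustrative assumptions, not the authors' implementation.

```python
import colorsys


def pan_sharpen_pixel(red, green, blue, pan):
    """Replace the HSV value channel of one RGB pixel with the pan value.

    All inputs are assumed to be reflectances scaled to the range [0, 1].
    """
    hue, saturation, _ = colorsys.rgb_to_hsv(red, green, blue)
    return colorsys.hsv_to_rgb(hue, saturation, pan)


def pan_sharpen(rgb_30m, pan_15m):
    """Sharpen a 30 m RGB grid with a 15 m panchromatic grid.

    rgb_30m: rows of (r, g, b) tuples at 30 m resolution.
    pan_15m: rows of pan values at 15 m (twice as many rows and columns).
    Each 30 m pixel is reused for the 2 x 2 block of 15 m pixels it covers
    (nearest-neighbour upsampling, an assumption of this sketch).
    """
    sharpened = []
    for i, pan_row in enumerate(pan_15m):
        out_row = []
        for j, pan in enumerate(pan_row):
            r, g, b = rgb_30m[i // 2][j // 2]
            out_row.append(pan_sharpen_pixel(r, g, b, pan))
        sharpened.append(out_row)
    return sharpened


# Hypothetical values: one 30 m pixel covered by four 15 m pan pixels.
rgb = [[(0.2, 0.4, 0.1)]]
pan = [[0.3, 0.5], [0.4, 0.6]]
sharp = pan_sharpen(rgb, pan)
```

Each output pixel keeps the hue and saturation of the 30 m colour while its brightness follows the 15 m pan band, which is why edges such as roads and buildings become sharper.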
This study investigated the relative importance of input features, including Landsat data bands, several known vegetation and building indices, and other auxiliary data for mapping the LULC pattern. For this purpose, we used an RF classifier based on the GEE platform. According to the recommendations of similar studies [
18,
58,
59] and the out-of-bag (OOB) error [
18] computed from our data (
Figure 4), we selected 50 trees (ntree = 50). The number of features tried at each split (mtry) was left at its default value, the square root of the total number of features.
Figure 4 suggests that the classifier’s performance remains nearly identical with 50 and 100 decision trees.
Table 5 provides more details about the parameter values set in the RF algorithm. We applied the resulting RF classifier to six periods.
Figure 5 presents the results of the LULC map for six considered dates from 1985 to 2019.
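The parameter choices described above can be sketched as follows. The OOB error values are hypothetical placeholders, not the values plotted in Figure 4, and the helper names are illustrative. The sketch also assumes that the square root for mtry is rounded down, which this section does not specify.

```python
import math


def default_mtry(n_features):
    """Features tried per split: square root of the feature count,
    rounded down (the rounding rule is an assumption)."""
    return math.isqrt(n_features)


def smallest_stable_ntree(oob_error_by_ntree, tolerance):
    """Return the smallest tree count whose OOB error is within
    `tolerance` of the lowest OOB error observed."""
    best = min(oob_error_by_ntree.values())
    return min(
        ntree
        for ntree, error in oob_error_by_ntree.items()
        if error - best <= tolerance
    )


# Hypothetical OOB errors for several forest sizes.
oob = {10: 0.080, 25: 0.062, 50: 0.051, 100: 0.050, 200: 0.050}
chosen_ntree = smallest_stable_ntree(oob, tolerance=0.002)

# Feature counts reported for Landsat 5, 7, and 8 (32, 33, and 34).
mtry_values = [default_mtry(n) for n in (32, 33, 34)]
```

With these placeholder errors, the rule selects 50 trees because larger forests improve the OOB error by less than the tolerance, which mirrors the reasoning given for ntree = 50.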
After performing RF classification, VI for each input feature was calculated.
Figure 6 presents the average feature importance for the six periods from 1985 to 2019. We considered 32 features for the Landsat 5 time-series data (1985, 1993, and 2008), 33 features for the Landsat 7 time-series data (2000), and 34 features for the Landsat 8 time-series data (2013–2019). For Landsat 5, DEM, LST, slope, and B6 were the four most important variables. For Landsat 7, they were DEM, LST, NBR2, and B1, and for Landsat 8, DEM, B11, LST, and B10. In contrast, RVI, NDVI, and VrNIR_BI contributed only marginally to the classification of all datasets. Overall, DEM, LST, B10, B11, and NBR2 were the most important variables, while RVI, VrNIR_BI, NDVI, and MSAVI were the least influential in the classification process.
3.1. Independent Accuracy Assessment
Producer’s accuracy (PA), user’s accuracy (UA), overall accuracy (OA), Kappa coefficient, accuracy, F1-score, misclassification rate, precision, recall, and specificity for all six periods are listed in
Figure 7 and
Table 6. In general, all datasets produced high accuracies (OA and Kappa range from 94.176% to 97.554% and 0.908 to 0.962,
Table 6). On average, urban, bare land, shrub, and forest not matching area were classified with the highest accuracy, followed by water, urban, and cultivated. The lowest accuracies were observed for the forest mixed and wetland classes, mainly because of the very low UA and PA in the 2000 and 2008 periods.
Figure 7 and
Table 6 present the details of LULC classification statistics on six considered dates.
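For readers who want to reproduce statistics of this kind, the sketch below computes OA, Kappa, and per-class PA, UA, and F1-score from a confusion matrix using the standard definitions. The two-class matrix is a hypothetical example, not data from Table 6, and the orientation (rows as reference classes, columns as mapped classes) is an assumption of the sketch.

```python
def accuracy_metrics(matrix):
    """Compute OA, Kappa, and per-class PA, UA, F1 from a confusion matrix.

    matrix[i][j] counts test samples whose reference class is i and whose
    mapped class is j (rows = reference, columns = classification).
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [
        sum(matrix[i][j] for i in range(n_classes)) for j in range(n_classes)
    ]
    correct = sum(matrix[k][k] for k in range(n_classes))

    overall_accuracy = correct / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall_accuracy - expected) / (1 - expected)

    per_class = []
    for k in range(n_classes):
        pa = matrix[k][k] / row_totals[k] if row_totals[k] else 0.0  # recall
        ua = matrix[k][k] / col_totals[k] if col_totals[k] else 0.0  # precision
        f1 = 2 * pa * ua / (pa + ua) if pa + ua else 0.0
        per_class.append({"PA": pa, "UA": ua, "F1": f1})
    return overall_accuracy, kappa, per_class


# Hypothetical two-class confusion matrix with 100 test samples.
oa, kappa, per_class = accuracy_metrics([[45, 5], [10, 40]])
```

For this example, OA is 0.85 and Kappa is 0.70; PA corresponds to recall and UA to precision for each class.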
3.2. Map-to-Map Comparison
To investigate the performance of the proposed classification approach, we also compared the classification results to the global CGLCL map (
Figure 8). Visual interpretation showed that, in the CGLCL map, the permanent water bodies class was frequently assigned to other classes (e.g., bare/sparse vegetation and cultivated), and the urban/built-up class could not be extracted successfully, especially along its borders. In addition to the visual comparison, a statistical comparison was made between the proposed classifier and the global CGLCL map. In terms of OA and Kappa, the proposed classifier outperforms the CGLCL map by 10.01% and 0.14, respectively.
3.3. LULC Change
Figure 9 displays the binary change detection maps, in which both unchanged and changed zones are shown. Between 1985–1993, 1993–2000, 2000–2008, 2008–2013, and 2013–2019, 97,725.78, 105,689.43, 110,185.38, 111,393, and 103,546.98 ha of the study area changed, respectively. These changed areas are equivalent to 15.57, 16.84, 17.59, 17.75, and 16.5% of the study area for the corresponding periods. At the class level, the results indicate that from 1985 to 2019, the urban, wetland, forest mixed, water, and shrub classes increased by 167.97, 93.23, 151.68, 13.43, and 333.19%, respectively. In contrast, the bare land and cultivated classes decreased by 5.22 and 23.29%, respectively. Bare land and cultivated areas were mainly converted to the urban class. The synergy between the change maps and GEE-derived indices/bands allowed the reconstruction of the annual land cover change maps from 1985 to 2019.
Table 7 shows that, since 1985, extensive land cover conversion, mainly from bare land and cultivated land, has driven urban expansion across the Isfahan region. The study results illustrate extensive urban growth between 1985 and 2019, with a 167.97% growth (approximately 4.94% per year).
The other classes (Shrub, Cult, Bare, Wat, Wet, For_Mix, and Op_For_NM) increased in some periods and decreased in others, which might be due to the impacts of severe climate change and drought, rapid industrial growth, and sprawl-encouraging planning policies [
35].
Table 7 and
Figure 10 show the results. For example, the vegetation classes have fluctuated during the past four decades, while the general tendency has been a decline, given worldwide climate change, which has caused severe drought conditions and led to environmental degradation in this region.
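The change statistics reported in this subsection can be derived from two classified maps as sketched below. This is a minimal illustration that assumes each map is a grid of class labels on the same pixel grid; the tiny 2 x 2 grids, the class labels, and the function names are hypothetical and are not taken from the study.

```python
def changed_area(labels_t1, labels_t2, pixel_size_m):
    """Return (binary change map, changed hectares, changed percent).

    labels_t1, labels_t2: equally sized grids of class labels for two dates.
    pixel_size_m: pixel edge length in metres (e.g. 30 for Landsat 5).
    """
    change_map = [
        [int(a != b) for a, b in zip(row1, row2)]
        for row1, row2 in zip(labels_t1, labels_t2)
    ]
    changed_pixels = sum(sum(row) for row in change_map)
    total_pixels = sum(len(row) for row in change_map)
    hectares_per_pixel = pixel_size_m ** 2 / 10_000  # 1 ha = 10,000 m^2
    changed_ha = changed_pixels * hectares_per_pixel
    changed_pct = 100 * changed_pixels / total_pixels
    return change_map, changed_ha, changed_pct


def class_change_percent(area_start, area_end):
    """Relative change of one class area between two dates, in percent."""
    return 100 * (area_end - area_start) / area_start


# Hypothetical 2 x 2 maps for two dates at 30 m resolution.
map_1985 = [["bare", "bare"], ["cult", "urban"]]
map_1993 = [["urban", "bare"], ["urban", "urban"]]
change_map, changed_ha, changed_pct = changed_area(map_1985, map_1993, 30)
```

In this toy example, two of the four pixels change class, so 50% of the area (0.18 ha) is flagged as changed in the binary map.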
4. Discussion
A continuous increase in urbanization characterizes countries in the Middle East. LULC mapping is an effective tool for environmental monitoring and management. LULC change occurs worldwide, but it poses great difficulties mainly in metropolitan areas. In the study area, these changes, especially urban growth and fluctuation in vegetation classes, are affected by economic (e.g., land price and land speculation), political (e.g., housing and tax policies, and civil war), demographic (e.g., population growth), environmental (e.g., enforced immigration and suburban environment), spatial planning (e.g., urban transport system), and climate change (e.g., severe drought) factors [
36]. The case study presents urban environments with a rapid spread of informal settlements over the past four decades, since 1985. We combined several spectral indices with other auxiliary data, which proved valuable for a reliable land use classification in the study area. The combination of classifications in the six periods allowed continuous annual land cover classification from 1985 to 2019. Note that these datasets and the auxiliary variables have the same spatial resolution: 30 m for Landsat 5 and 15 m for Landsat 7 and 8. In recent years, researchers have used Sentinel-2 data to extract LULC [
75]. Although Sentinel-2 data provides images with higher spatial resolution, the Landsat data might be a better source to investigate LULC change because of its capability to monitor Earth for more than five decades.
The results also show that 50 is a reasonable number of trees, one of the parameters of the RF classifier. Using a larger number of trees would increase computation without improving the results or accuracy. Comparing SR Landsat products with a pan-sharpened version of the TOA Landsat product shows that the gain in spatial resolution has more effect than more accurate radiometric calibration. According to the results, DEM and LST are compelling features for LULC mapping and can affect the attained accuracy. This study showed that LST, computed from the thermal bands of Landsat images, can be an effective feature for discriminating between the LULC classes in the study area. The effectiveness of the LST feature might be due to the different spectral signatures of each class in the thermal bands. In addition, classes might be related to spatial metrics such as height patterns, which helped increase the accuracy of the final classification. The results also show that global LULC maps are unsuitable for accurate mapping of LULC change, and that introducing new algorithms and training data over local areas is essential. The inter-annual change analysis revealed an abrupt change to urban structures from 1985 to 1993 (coinciding with the baby boom and postwar economic growth), with an average annual rate of 15.42%. During that period, the built-up area extended from 18,595.7 to 41,538.94 hectares. From 1995 to 2019, gradual urban growth, with an average annual rate ranging between 0.51% and 0.86%, shows stability in urban policies and development plans. This gradual increase is related to Iran’s economic, environmental, and spatial planning conditions [
36]. According to [
36], the growth rate of the core city of Isfahan is expected to decrease, while other sub-cities would face higher growth rates.
This research shows that, on average, 918.67 ha per year changed to the urban class over the study area between 1985 and 2019. Many factors, such as economic ones, have accelerated urban growth, which puts noticeable pressure on human welfare and the natural environment. This urban growth also increases the rate of rural-urban migration, which causes environmental degradation, especially in metropolitan areas such as Isfahan. The results also show that the vegetation classes fluctuated during this time interval, with a general downward trend. Several factors contribute to this phenomenon, such as climate change, which has caused severe drought conditions. To reduce the long-term consequences of this LULC change, it is necessary to provide an accurate LULC map and build a practical scenario.
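The annual growth rates quoted in this discussion are consistent with a simple (non-compounded) average, which the following sketch reproduces from the reported built-up areas. The function names are illustrative, and the compound rate is included only for comparison; it is not a figure reported in the study.

```python
def average_annual_rate(area_start, area_end, years):
    """Simple (non-compounded) average annual growth rate in percent."""
    total_growth_pct = 100 * (area_end - area_start) / area_start
    return total_growth_pct / years


def compound_annual_rate(area_start, area_end, years):
    """Compound annual growth rate in percent, shown for comparison."""
    return 100 * ((area_end / area_start) ** (1 / years) - 1)


# Built-up area reported for 1985 and 1993, in hectares.
rate_1985_1993 = average_annual_rate(18_595.7, 41_538.94, 1993 - 1985)
print(round(rate_1985_1993, 2))  # 15.42, matching the reported rate

# Total urban growth of 167.97% spread evenly over 1985-2019.
print(round(167.97 / (2019 - 1985), 2))  # 4.94

# Compound rate for the same 1985-1993 interval (not reported in the study).
compound_1985_1993 = compound_annual_rate(18_595.7, 41_538.94, 1993 - 1985)
```

Because a simple average ignores compounding, the compound rate for the same interval is lower than 15.42%; the two measures should not be mixed when comparing periods of different length.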
5. Conclusions
This study presents a method to map LULC with high accuracy at a scale of 30 m for Landsat 5 and 15 m for Landsat 7 and 8 based on GEE’s cloud-based platform. For this purpose, we used the RF algorithm, which is one of the most robust classifiers. We tested different numbers of trees and other RF input parameters, and the results show that 50 trees is a suitable value. The results also showed that, although using SR products instead of TOA Landsat products would normally result in a more accurate classification map, this no longer holds when a pan-sharpening method is applied to the TOA product. The proposed method’s OA, Kappa, and F1-score were 0.422%, 0.006, and 0.015 higher, respectively, than those of the same approach applied to the SR version of Landsat data in 2019. Two groups of spectral-temporal features (7, 8, and 9 raw bands for Landsat 5, 7, and 8, respectively, and 25 indices and auxiliary data for all Landsat collections) characterize the LULC. In addition, we used DEM and data extracted from DEM (slope and aspect). According to the results, the LST and DEM features are essential in Landsat image classification. Although these features are crucial in LULC mapping, similar research uses neither of them or only one. We used a CGLCL map and high-resolution Google Earth images to provide reliable training class labels. The OA, Kappa values, and F1-score are 93.64–97.55%, 0.91–0.96, and 0.86–0.95, respectively. Independent and map-to-map comparison assessment approaches verified the final classification results. The results showed clearly that the proposed method outperforms the CGLCL map. The proposed method’s OA, Kappa, and F1-score were 10%, 0.13, and 0.5 higher, respectively, than those of the CGLCL map in 2019.
This paper proposed a method based on the GEE cloud-based platform to map LULC change accurately and in a timely manner from historical Landsat data. This method can be quickly and easily applied to other regions of interest for LULC mapping. As the code is publicly available, future improvements to this methodology may be implemented based on user feedback. We recommend checking the importance of the LST and DEM features in other available satellite image data for LULC mapping. In addition, the LULC map would provide a clear understanding of the spatial, environmental, and socioeconomic pattern of changes and allow governments to minimize the cost and negative impacts of these changes. The result of this study can also be used for land-use planning and predicting urban expansion.
In this study, the impact of different pan-sharpening methods was not investigated. When we conducted this study, 10 m land cover datasets such as Esri 10-Meter Land Cover (10-class) and 10 m Annual Land Use Land Cover (9-class) had not been released. Therefore, we did not compare our results with these two datasets. Moreover, we implemented our methodology in the GEE platform, and these two global LULC datasets are not available in GEE. The only global land cover map at 10 m resolution in GEE is ESA WorldCover 10 m v100. However, this dataset is only available for 2020 and later. For future study, the effectiveness of datasets such as Landsat-9 and the combination of Landsat and Sentinel for LULC change can be investigated. More features (e.g., textural features such as GLCM) can also be added to improve the accuracy of LULC change analysis. The results can also be compared to the global LULC maps mentioned earlier.