Article

Influence of Image Compositing and Multisource Data Fusion on Multitemporal Land Cover Mapping of Two Philippine Watersheds

by Nico R. Almarines 1,2,*, Shizuka Hashimoto 2, Juan M. Pulhin 3,4, Cristino L. Tiburan, Jr. 1, Angelica T. Magpantay 4 and Osamu Saito 5
1 Institute of Renewable Natural Resources, College of Forestry and Natural Resources, University of the Philippines Los Baños, Laguna 4031, Philippines
2 Department of Ecosystem Studies, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo 113-8654, Japan
3 Department of Social Forestry and Forest Governance, College of Forestry and Natural Resources, University of the Philippines Los Baños, Laguna 4031, Philippines
4 Interdisciplinary Studies Center for Integrated Natural Resources and Environment Management, University of the Philippines Los Baños, Laguna 4031, Philippines
5 Institute for Global Environmental Strategies, Kanagawa 240-0115, Japan
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(12), 2167; https://doi.org/10.3390/rs16122167
Submission received: 4 April 2024 / Revised: 31 May 2024 / Accepted: 12 June 2024 / Published: 14 June 2024

Abstract:
Cloud-based remote sensing has spurred the use of techniques to improve mapping accuracy where individual images may have lower quality, especially in areas with complex terrain or high cloud cover. This study investigates the influence of image compositing and multisource data fusion on the multitemporal land cover mapping of the Pagsanjan-Lumban and Baroro Watersheds in the Philippines. Ten random forest models for each study site were used, each with a unique combination of more than 100 different input features. These features fall under three general categories. First, optical features were derived from reflectance bands and ten spectral indices, which were further subdivided into annual percentile and seasonal median composites; second, radar features were derived from ALOS PALSAR by computing textural indices and a simple band ratio; and third, topographic features were computed from the ALOS GDSM. Then, accuracy metrics and McNemar’s test were used to assess and compare the significance of about 90 pairwise model outputs. Data fusion significantly improved the accuracy of multitemporal land cover mapping in most cases. However, image compositing had varied impacts across the two sites. This suggests that local characteristics and feature inputs are potential determinants of the ideal compositing method. Hence, the iterative screening or optimization of both input features and composites is recommended to improve multitemporal mapping accuracy.

1. Introduction

Remote sensing is a crucial tool for mapping and monitoring land cover change since it facilitates the collection of data over wide expanses quickly and consistently, allowing for the creation of detailed land cover maps over time [1,2]. This not only allows for the detection of changes in land cover but also helps determine shifts in their characteristics which may not be evident in the visible spectrum, such as vegetation health and soil attributes [3,4,5,6,7].
Remote sensing can also provide a more comprehensive understanding of land cover change by combining multiple datasets from a range of sources—like earth observation satellites, ground surveys, and GIS data—or even multitemporal datasets [8,9]. Likewise, fusing data from multiple sensor types can help overcome the limitations of individual sensors, resulting in more accurate and detailed outputs [10,11]. Combining optical and radar data, for example, can help to improve the mapping accuracy by combining the spectral resolution of optical sensors with the penetration capabilities of radar [12,13]. However, processing large amounts of remote sensing data tends to be expensive and time-consuming when using traditional on-premise computing methods.
The adoption of cloud-based remote sensing has helped overcome the challenges of traditional remote sensing by providing a more efficient, cost-effective, and scalable method of processing, analyzing, and managing remote sensing data [14,15]. This advancement has accelerated the use of big data in remotely sensed land cover mapping and made hardware-intensive tools and techniques more accessible to users [16,17]. Since their introduction, cloud-based remote sensing tools such as Google Earth Engine (GEE) have seen significant use in research [18]. GEE has enabled researchers to use petabytes of open data to address computationally intensive issues such as global forest cover and climate change [19].
Cloud-based remote sensing also makes it easier to create cloudless composites. Cloud cover has been a persistent challenge in obtaining high-quality satellite imagery, as clouds can obstruct the view of the Earth’s surface and impede accurate mapping [20]. This is especially problematic in Southeast Asia and other tropical regions, particularly during rainy months, as some data indicate historical increases in cloud cover over time [21,22,23]. Cloud-based remote sensing addresses this issue by allowing for faster and easier multitemporal aggregation and multisource fusion-based gap-filling through the use of high-image-volume time series [24,25,26].
Similarly, the advancement of machine learning algorithms allows land cover mapping to become more accurate [27]. The ability of machine learning to manage large and complex datasets and automatically learn from them without the need for the explicit programming of rules or assumptions lends itself well to remote sensing, big data, and cloud computing [16,19]. It is also well suited for mapping land cover in areas with high variability and complexity [28,29]. As a result, the use of machine learning algorithms such as random forest, Support Vector Machines, and Neural Networks in remote sensing and land cover mapping has grown in popularity as research demonstrates their high accuracy and relative versatility in image-classifying applications [30,31,32].
However, there is a notable gap in the understanding of how compositing methods impact the precision and accuracy of remotely sensed maps. To date, only a limited number of studies have directly compared changes in image compositing and their effects on mapping accuracy, including studies by Nasiri et al. (2022), Phan et al. (2020), Praticò et al. (2021), and Sellami and Rhinane (2023). These studies generally show that seasonal median composites perform better than annual median composites because they account for seasonal variations in reflectance, but their comparisons do not include annual percentiles, or combinations thereof, which may likewise incorporate seasonality [33,34,35]. Most recently, though, it has been shown that the mean reduction algorithm generates better composites when a sole composite is used for image classification [35,36]. Nevertheless, existing research predominantly evaluates the effects of compositing in isolated geographical areas, necessitating further investigation to determine the most suitable compositing techniques for varying landscapes with distinct land cover types and environmental conditions. Furthermore, numerous studies have examined the impact of multi-sensor inputs [8,9,11] on the performance of machine learning-based land cover mapping, but few delve into the combined influences of compositing methods, feature combinations, and sensor types. Closing these knowledge gaps is imperative for advancing the field of remote sensing and enhancing the accuracy of land cover mapping techniques.
Hence, this study aims to assess the impact of two different compositing techniques, namely annual percentile and seasonal median composites, along with multisource data fusion which includes optical spectral–temporal features, radar spectral features, and topographic features, on the accuracy of a random forest (RF) algorithm for land cover classification. The comparison will be conducted across two distinct landscapes. The goal is to enhance the understanding of common compositing methods so that current approaches to multitemporal mapping using RF models can be improved. The research also aims to contribute to filling the gaps in the availability of relevant and temporally consistent long-term land cover maps in the Philippines.

2. Materials and Methods

This study mapped the 2000, 2005, 2010, 2015, and 2020 land cover of two Philippine watersheds using a methodology that selects the best-suited classifier from a set of 10 RF models. The land cover classes were based on those used by the Philippine National Mapping and Resource Information Authority (NAMRIA). These are 12 aggregated classes adopted from the guidelines used in the FAO Global Forest Resources Assessment (FRA), a standardized and widely recognized framework for land cover classification that captures a broad range of land cover types and ensures consistency and comparability with other studies and national land cover datasets [37,38]. Figure 1 shows the generalized flow of the methods used in this study, and the subsequent subsections detail portions of this method.

2.1. Study Area

This research was conducted in two study areas in the Philippines—the Pagsanjan-Lumban Watershed (PLW) and the Baroro Watershed (BW) (Figure 2). The PLW lies between 14°03′ and 14°22′ N and between 121°25′ and 121°37′ E. It encompasses 41,600 ha in the provinces of Laguna and Quezon. It is an important source of natural resources in the Laguna de Bay Basin but experiences issues with flooding and siltation [39]. The BW is in the northeastern part of La Union, from 16°35′ to 16°44′ N and from 120°20′ to 120°32′ E, covering 19,400 ha in five municipalities. It serves as a primary water source in the area but faces issues with water scarcity during the summer.
These watersheds were selected due to their relative differences in climate and a diversity in vegetation and land cover which facilitated the comparison of results in localities of varying conditions (Figure 3).
The PLW receives about 3800 mm of annual rainfall without strongly pronounced seasons, though it is relatively dry from November to April and wet during the rest of the year [40]. The terrain ranges from flat areas near Laguna de Bay to mountainous regions in the south and northeast, with elevations from 10 to 2158 masl. The PLW has large swaths of agricultural land, especially in the lowlands, while relatively intact forests are still present in the Mounts Banahaw–San Cristobal Protected Landscape to the south [41]. The watershed has had relatively high population growth, with its population increasing from 196,000 in 2010 to over 218,000 in 2020 [42]. Likewise, while it has been a significant area for agricultural production, it has been steadily urbanizing and industrializing [43].
The BW has a very distinct wet and dry season, receiving around 2250 mm of rainfall annually [40]. Its terrain ranges from flat to rolling hills and steep mountains, with elevations from 0 to 1415 masl. It is primarily an agricultural watershed, with commercial agriculture covering the lowland areas, while more traditional agricultural practices are seen in upland areas, especially in the ancestral domains of local indigenous people to the east of the BW [44]. Hence, the agriculture sector remains the main driver of the local economy; however, tourism is increasingly being seen as a potential growth area [45]. While the watershed population has been increasing, growing from 71,000 in 2010 to about 80,000 in 2020, the population growth rate is relatively modest compared to that of the PLW [42].

2.2. Remote Sensing Data and Preprocessing

GEE was used to process a combination of optical and radar images for each reference year (Table 1). Landsat imagery was used for the optical images since it provides coverage for the needed temporal range to generate 30 m resolution land cover maps. These images underwent temporal and spatial filtering so that only images within the study area and within the specified time periods were used; cloud masking was then applied to all filtered images using the C Function of Mask (CFMask) algorithm. This reduced the influence of clouds and their shadows in generating multitemporal composites [46,47]. Furthermore, radar data from two Advanced Land Observing Satellite (ALOS) Phased Array type L-band Synthetic Aperture Radar (PALSAR) datasets were also utilized.
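In GEE, this filtering and masking pipeline is expressed with ee.ImageCollection filters and per-image mask updates; the bitmask logic itself can be sketched in numpy as follows (a minimal sketch assuming Landsat Collection 2 QA_PIXEL conventions, where bit 3 flags cloud and bit 4 flags cloud shadow; the function name and the NaN-fill choice are ours):

```python
import numpy as np

# Assumed QA_PIXEL bit positions (Landsat Collection 2): bit 3 = cloud,
# bit 4 = cloud shadow.
CLOUD_BIT, SHADOW_BIT = 3, 4

def mask_clouds(reflectance, qa_pixel):
    """Set cloud- and shadow-flagged pixels to NaN so composites ignore them."""
    cloudy = (qa_pixel >> CLOUD_BIT) & 1
    shadowed = (qa_pixel >> SHADOW_BIT) & 1
    masked = reflectance.astype(float).copy()
    masked[(cloudy | shadowed).astype(bool)] = np.nan
    return masked
```

In GEE the equivalent step uses ee.Image.updateMask over the filtered collection, so the percentile and median reducers downstream only see clear-sky observations.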

2.3. Feature Generation and Classification

A review of land cover mapping studies indicates the superior performance of RF in multicategory classification compared to other machine learning algorithms like decision trees or gradient boosting [48,49,50,51]. Hence, RF classifiers were used and trained for land cover classification. A combination of optical composites, radar composites, and terrain features was used to generate 101 input features for classification.
This dual compositing approach, paired with an extensive feature set, facilitates a more nuanced understanding of the impact of different compositing methods on classification accuracy. Furthermore, the study’s application across two distinct landscapes with varying climatic and environmental conditions offers valuable insights into the adaptability and robustness of the proposed methodology. The comprehensive analysis and incorporation of iterative composite and feature optimization, as presented in this research, are not extensively documented in the existing literature, thereby contributing novel findings to the field of remote sensing and land cover mapping [33,34].

2.3.1. Optical Features and Composites

Ten spectral indices were computed for each filtered Landsat image to account for the diversity of vegetation in both sites (Equations (1)–(10)). First, surface albedo (ALB) measures reflectivity to solar radiation and is sensitive to changes in land use and land cover [52,53]. The Enhanced Vegetation Index (EVI) and the Two-Band Enhanced Vegetation Index (EVI2) are similar vegetation indices that were included because they tend to perform better in areas with high biomass [54,55]. The Green Chlorophyll Index (GCI) and Global Vegetation Moisture Index (GVMI) were selected due to their sensitivity to plant health, because the former measures chlorophyll content and the latter measures the moisture content of vegetation [56,57]. The Modified Bare Soil Index (MBI), Normalized Difference Built-up Index (NDBI), and Normalized Difference Water Index (NDWI) help differentiate between bare soil, urban areas, and open water, respectively [58,59,60]. Lastly, the Normalized Difference Vegetation Index (NDVI) and Soil Adjusted Vegetation Index (SAVI) were included due to their responsiveness to vegetation and their widespread use [61,62].
ALB = (0.356ρ_blue + 0.130ρ_red + 0.373ρ_NIR + 0.085ρ_SWIR1 + 0.072ρ_SWIR2 − 0.0018) / 1.016 (1)
EVI = 2.5 (ρ_NIR − ρ_red) / (ρ_NIR + 6ρ_red − 7.5ρ_blue + 1) (2)
EVI2 = 2.5 (ρ_NIR − ρ_red) / (ρ_NIR + 2.4ρ_red + 1) (3)
GCI = (ρ_NIR / ρ_green) − 1 (4)
GVMI = [(ρ_NIR + 0.1) − (ρ_SWIR + 0.02)] / [(ρ_NIR + 0.1) + (ρ_SWIR + 0.02)] (5)
MBI = (ρ_SWIR1 − ρ_SWIR2 − ρ_NIR) / (ρ_SWIR1 + ρ_SWIR2 + ρ_NIR) + 0.5 (6)
NDBI = (ρ_SWIR − ρ_NIR) / (ρ_SWIR + ρ_NIR) (7)
NDWI = (ρ_green − ρ_NIR) / (ρ_green + ρ_NIR) (8)
NDVI = (ρ_NIR − ρ_red) / (ρ_NIR + ρ_red) (9)
SAVI = [(ρ_NIR − ρ_red) / (ρ_NIR + ρ_red + L)] (1 + L) (10)
In Equations (1)–(10), ρ_green, ρ_blue, ρ_red, ρ_NIR, ρ_SWIR1, and ρ_SWIR2 refer to the values of the Green, Blue, Red, NIR, SWIR1, and SWIR2 reflectance bands, respectively. In addition, L is a soil-brightness correction factor, usually set to 0.5 to encompass a wide range of land cover or vegetation [61].
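As an illustration, three of the indices above can be computed directly from reflectance values (a minimal sketch; function names are ours, and inputs are assumed to be cloud-masked surface reflectance):

```python
def ndvi(nir, red):
    # Equation (9): (rho_NIR - rho_red) / (rho_NIR + rho_red)
    return (nir - red) / (nir + red)

def savi(nir, red, L=0.5):
    # Equation (10), with the soil-brightness correction L = 0.5
    return (nir - red) / (nir + red + L) * (1 + L)

def evi(nir, red, blue):
    # Equation (2)
    return 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1)
```

The same expressions apply per pixel when the arguments are whole band arrays, which is how they are evaluated over each filtered Landsat scene.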
Likewise, to consider seasonal variability, the annual percentile and seasonal median compositing methods were used to generate cloud-free optical composite images from Landsat [11,24,25]. The 20th, 50th, and 80th percentile values were computed for the Green, Blue, Red, NIR, SWIR1, and SWIR2 reflectance bands and spectral indices of each Landsat collection. Median values were also computed for images captured within the rainy months (i.e., May–October) and images taken during the dry months (i.e., November–April). A total of 80 metrics were generated in GEE from these spectral–temporal data (Table 2).
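Both compositing strategies reduce a per-pixel time series to summary values; a minimal numpy sketch over a (time, y, x) stack follows (array layout and function names are ours; the season boundaries mirror the rainy May–October and dry November–April split above):

```python
import numpy as np

def annual_percentiles(stack, pcts=(20, 50, 80)):
    """Per-pixel percentiles across the time axis, ignoring masked (NaN) scenes."""
    return {p: np.nanpercentile(stack, p, axis=0) for p in pcts}

def seasonal_medians(stack, months):
    """Per-pixel medians for rainy (May-Oct) and dry (Nov-Apr) scenes."""
    months = np.asarray(months)
    rainy = np.isin(months, range(5, 11))
    return {
        "rainy": np.nanmedian(stack[rainy], axis=0),
        "dry": np.nanmedian(stack[~rainy], axis=0),
    }
```

In GEE the same reductions are performed with percentile and median reducers over the filtered, cloud-masked collections.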

2.3.2. Radar and Topographic Features

Radar features were extracted from the ALOS PALSAR dataset by computing a simple band ratio (RAT) of its two polarized bands and eight gray-level co-occurrence matrix (GLCM) textures, namely Angular Second Moment (ASM), Contrast (CON), Correlation (CORR), Dissimilarity (DISS), Entropy (ENT), Inverse Difference Moment (IDM), Sum Average (SAVG), and Variance (VAR) [63,64] (Equations (11)–(19)). Both the GLCM textures and the RAT used the horizontal transmit and vertical receive (HV) and the horizontal transmit and horizontal receive (HH) polarization backscattering coefficients of ALOS PALSAR. The GLCM textures were computed from a symmetrical, normalized GLCM in which the number of rows (i) equals the number of columns (j), and a probability value p(i,j) is computed for each matrix location (i,j). Furthermore, some of the GLCM textures also required the number of paired data (n), the number of gray levels (N_g), the GLCM mean (μ), and the GLCM variance (σ) in their computation.
RAT = HH / HV (11)
ASM = Σ_i Σ_j p(i,j)² (12)
CON = Σ_{n=0}^{N_g−1} n² [Σ_{i=1}^{N_g} Σ_{j=1}^{N_g} p(i,j)], |i − j| = n (13)
CORR = Σ_i Σ_j [(i·j) p(i,j) − μ_x μ_y] / (σ_x σ_y) (14)
DISS = Σ_{n=1}^{N_g−1} n [Σ_{i=1}^{N_g} Σ_{j=1}^{N_g} p(i,j)], |i − j| = n (15)
ENT = −Σ_i Σ_j p(i,j) log p(i,j) (16)
IDM = Σ_i Σ_j p(i,j) / (1 + (i − j)²) (17)
SAVG = Σ_{i=2}^{2N_g} i · p_{x+y}(i) (18)
VAR = Σ_i Σ_j (i − μ)² p(i,j) (19)
Hence, the computations resulted in seventeen radar features for classifier input: one band ratio, eight HV metrics, and eight HH metrics (Table 3). Additional compositing was not needed for these features since the image collection used already consisted of annualized composites. In addition, two topographic features were also used, slope and elevation, both computed from the ALOS GDSM dataset. These topographic features were treated as static variables and were the same for all years.
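To make the texture computation concrete, here is a minimal numpy sketch of a symmetric, normalized GLCM and three of the metrics above (Equations (12), (13), and (16)); the single pixel offset and function names are our simplification of the windowed GEE glcmTexture computation:

```python
import numpy as np

def glcm(img, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for one pixel offset."""
    m = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            i, j = img[y, x], img[y + dy, x + dx]
            m[i, j] += 1
            m[j, i] += 1  # count each pair both ways for symmetry
    return m / m.sum()

def asm(p):
    # Equation (12): sum of squared co-occurrence probabilities
    return float(np.sum(p ** 2))

def contrast(p):
    # Equation (13): probabilities weighted by squared gray-level difference
    i, j = np.indices(p.shape)
    return float(np.sum((i - j) ** 2 * p))

def entropy(p):
    # Equation (16); 0 * log(0) terms are treated as 0
    nz = p[p > 0]
    return float(-np.sum(nz * np.log(nz)))
```

In practice the backscatter bands are first quantized to N_g gray levels, and the metrics are evaluated in a moving window around each pixel.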

2.3.3. Feature Normalization and Standardization

All input features were normalized and standardized before their use in RF model training and validation. Although not strictly required for machine learning and random forest algorithms in general, these steps have been shown to improve prediction results in multiple applications [28,65,66]. Normalization and standardization were applied to the combined data collection in GEE with Equations (20) and (21), respectively. These equations use the minimum (x_min) and maximum (x_max) values to transform a feature value (x) into a normalized value (x_norm), and use the feature’s mean (μ) and standard deviation (σ) to generate a standardized value (x_stand).
x_norm = (x − x_min) / (x_max − x_min) (20)
x_stand = (x − μ) / σ (21)
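Equations (20) and (21) correspond to the following one-liners (a sketch; in practice the minimum, maximum, mean, and standard deviation are computed per band over each study area in GEE):

```python
import numpy as np

def normalize(x):
    # Equation (20): min-max scaling to [0, 1]
    return (x - x.min()) / (x.max() - x.min())

def standardize(x):
    # Equation (21): zero mean, unit standard deviation
    return (x - x.mean()) / x.std()
```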

2.4. RF Model Evaluation

The evaluation undertaken to measure and compare RF model performance was two-fold. First, accuracy metrics were computed for each RF model output for the two sites. The second step was to determine whether the observed differences in accuracies were statistically significant through McNemar’s test. This second part is particularly important since accuracy metrics are subject to variability, and thus, this method lent more statistical rigor to the comparison of map outputs [67,68].

2.4.1. Accuracy Metrics

Reference points for model training and evaluation were collected by sampling readily available secondary maps—including national land cover maps and municipal land use maps—and were cross-verified using high-resolution satellite and aerial images in Google Earth Pro. Furthermore, participatory mapping through key informant interviews and focus group discussions with municipal agricultural, environmental, and planning representatives, as well as representatives from regional government agencies, was conducted to facilitate the collection of high-quality reference data, ensuring robust model training and evaluation. However, since participatory mapping was only used for recent land cover data (i.e., year 2020), only the reference year 2020 was used for the comparison of RF model performance.
Using the reference data collected, the assessment of the models was undertaken using hold-out validation. Around 8000 reference points were divided using a 0.8 split ratio, where 80% of the points (i.e., 6400) were used to train the RF model, and the remaining 20% (i.e., 1600) were employed to validate its output independently. Error matrices were generated to compute the test overall accuracy (OA) by comparing the number of true positive (TP) and true negative (TN) predictions to the total number of positive (P) and negative (N) predictions.
OA = (TP + TN) / (P + N) (22)
The kappa (κ) coefficient was also computed since it accounts for the influence of random chance (P_e) in addition to the relative observed agreement (P_o) and thus provides a better metric of classifier reliability than OA by itself [69,70,71].
κ = (P_o − P_e) / (1 − P_e) (23)
In addition, F1-scores were computed since some studies point out the limitations of relying solely on the kappa coefficient to determine model performance [72,73,74]. The F1-score balances precision and recall by incorporating false positive (FP) and false negative (FN) predictions and is less sensitive to imbalanced class distributions [75,76].
F1 = 2TP / (2TP + FP + FN) (24)
Furthermore, out-of-bag error (OOBE) estimates were used to assess the internal model estimates of residual variance [77,78]. While the OOBE is not necessarily a direct indicator of output accuracy, it lends considerable insight into model performance.
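Given a multiclass error matrix, the three map-accuracy metrics above can be computed as follows (a sketch; macro-averaging of per-class F1-scores is our assumption, as the averaging scheme is not stated in the text):

```python
import numpy as np

def overall_accuracy(cm):
    # Equation (22): correctly classified points over all points
    return np.trace(cm) / cm.sum()

def kappa(cm):
    # Equation (23): observed agreement corrected for chance agreement
    n = cm.sum()
    po = np.trace(cm) / n
    pe = np.sum(cm.sum(axis=0) * cm.sum(axis=1)) / n ** 2
    return (po - pe) / (1 - pe)

def f1_macro(cm):
    # Equation (24) applied per class, then averaged across classes
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    return np.mean(2 * tp / (2 * tp + fp + fn))
```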

2.4.2. Statistical Analysis

To evaluate the statistical significance of the difference between the overall accuracies of classification results, McNemar’s test was used (Equation (25)). McNemar’s test compares the number of discordant pairs between two model results [34,68], where b is the number of validation points classified correctly by the first model only and c is the number classified correctly by the second model only. The test determines whether the difference between the two models is statistically significant or whether it can be explained by chance [79,80]. Hence, McNemar’s test provides a more objective basis for comparison than simply comparing the accuracy metrics of two maps [67].
χ² = (b − c)² / (b + c) (25)
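The statistic reduces to a short function of the two discordant counts; the Yates continuity-corrected form, which matches the p-values reported in Section 3, is included as an option (a sketch; variable names are ours):

```python
def mcnemar_chi2(b, c, yates=False):
    """McNemar's chi-square statistic from the two discordant counts.

    With yates=True the continuity-corrected form is used. The statistic is
    compared against the chi-square distribution with 1 degree of freedom
    (critical value 3.841 at the 0.05 level).
    """
    if yates:
        return (abs(b - c) - 1) ** 2 / (b + c)
    return (b - c) ** 2 / (b + c)  # Equation (25)
```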

2.5. Optimization and Postprocessing

Once the best model was identified for each site, the classifier was further improved by tuning the number of trees (NoT) and the number of variables per split (mTry) hyperparameters. This determines optimal hyperparameter values which may not be established during model training [81,82]. Hyperparameter tuning iteratively screened NoT from 20 to 500 at intervals of 20 and mTry from 5 to 80 at intervals of 5.
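The screening grid above contains 25 × 16 = 400 NoT/mTry combinations; a sketch of the iterative search follows (the evaluate callback stands in for training and validating an RF model at each setting and is purely illustrative):

```python
from itertools import product

def tuning_grid():
    # NoT: 20 to 500 at intervals of 20; mTry: 5 to 80 at intervals of 5
    return list(product(range(20, 501, 20), range(5, 81, 5)))

def best_hyperparameters(evaluate):
    """Return the (NoT, mTry) pair with the highest validation score."""
    return max(tuning_grid(), key=lambda pair: evaluate(*pair))
```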
A postprocessing majority filter was also applied to reduce salt-and-pepper noise caused by misclassified pixels. This reduces the impact of random noise and outliers in the data, resulting in a smoother, more reliable representation of features in the image [83,84]. The majority filter is widely used in image analysis applications, including land cover mapping, land use change detection, and vegetation health analysis [85,86].
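A minimal sketch of the majority filter on a label map follows (the 3 × 3 window size is our assumption; border pixels are left unchanged here):

```python
import numpy as np

def majority_filter(labels):
    """Replace each interior pixel with the most frequent label in its 3x3 window."""
    out = labels.copy()
    h, w = labels.shape
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = labels[y - 1:y + 2, x - 1:x + 2].ravel()
            values, counts = np.unique(window, return_counts=True)
            out[y, x] = values[np.argmax(counts)]
    return out
```

An isolated misclassified pixel surrounded by a uniform class is thus replaced by the surrounding label, which is exactly the salt-and-pepper case the filter targets.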

2.6. Land Cover Mapping

The optimized RF model was used to generate land cover maps in the BW and the PLW for the years 2000, 2005, 2010, 2015, and 2020. As in the RF model evaluation, accuracy metrics were also computed for each land cover map so that they could be compared to other readily available maps of the study sites. These maps were then used to calculate the areas occupied by each land cover classification for each year, as well as the changes in those areas. Thus, the post-classification comparison technique was used to determine changes in land cover for both sites. This is the most widely used method of land cover change detection [87] and has achieved high accuracies even when using different inputs and when applied to various regions across the globe [1,30,88].

3. Results

3.1. Comparison of Model Performance

RF Model 10 had the highest predictive value for the PLW, with a test OA = 0.9224, κ = 0.9010, and OOBE = 0.1960 (Figure 4a). This was functionally the most complex model and had the highest dimensionality since it incorporates annual percentile composite features, seasonal spectral–temporal features, radiometric features, and topographic features. It was followed by RF Model 9 and RF Model 8, with κ equal to 0.8992 and 0.8914, respectively, both of which incorporate three feature sets.
Conversely, RF Model 8 had the highest accuracy values for the BW, with test OA = 0.9252, κ = 0.8954, and OOBE = 0.0828 (Figure 4b). This model does not use annual percentile composite features but instead incorporates seasonal spectral–temporal features, radiometric features, and topographic features. RF Model 10, RF Model 7, and RF Model 9 also performed well; all three had test OA > 0.91 and κ > 0.88.
McNemar’s tests showed that all but two pairwise model comparisons had statistically significant differences in outputs for the PLW; the exceptions were the RF Model 4–7 and RF Model 9–10 pairs (Figure 5). Meanwhile, the BW had six pairs with Yates’ p-values greater than 0.05: the RF Model 1–2, RF Model 4–5, RF Model 4–9, RF Model 5–9, RF Model 7–10, and RF Model 8–10 pairs. This means that each of these model pairs produced maps of comparable accuracy.
Furthermore, all single-feature-set models (i.e., RF Models 1, 2, and 3) performed significantly worse in both sites, and models with two or more feature sets had statistically significant improvements in classification accuracy compared to their single-feature-set counterparts. This was the case when the topographic feature set was added to the models (i.e., RF Models 4, 5, and 6) or when both optical and radar feature sets were used (i.e., RF Models 7, 8, and 10). However, while RF Model 6, which utilizes both radar and topographic features, was statistically more accurate than using radar alone, it was still significantly less accurate than RF Model 1 and RF Model 2.
There were several viable alternatives for the best-performing model in each respective site. RF Model 9 and RF Model 10 were potential choices in the PLW since there were no statistically discernible differences between the two. Likewise, RF Model 7, RF Model 8, and RF Model 10 also had statistically comparable performances as top classifiers in the BW, followed by RF Model 9, which had a slightly lower performance.

3.2. Land Cover Maps

Due to the limitations of radar data availability in the sites for 2010 and earlier, Model 9 was selected and optimized for multitemporal land cover mapping in both sites. The optimized RF models generated land cover maps (Figure 6) with OA > 0.92, κ > 0.90, F1 > 0.88, and OOBE < 0.19 for the PLW and OA > 0.94, κ > 0.92, F1 > 0.85, and OOBE < 0.08 for the BW (Table 4).
The results showed that the PLW is predominantly covered by perennial crops. In 2020, the watershed was composed of 44% perennial crops, 21% shrubland, 13% annual crops, and 12% open forests (Figure 7). From 2000 to 2020, land cover trends show that the most significant land cover change was the decrease in perennial crops, losing 2600 ha or 13% of its total land area in the past 20 years. Conversely, significant increases in shrubland were observed, amounting to a net increase of 1900 ha or 28%.
In contrast, the BW is predominantly covered with shrublands, accounting for about 63% of its total land area as of 2020. Meanwhile, the rest is covered with annual crops (33%) and 2% each for open forests and built areas. Land cover change analysis has shown that there has been a significant net increase in shrublands amounting to 790 ha or 7% and a net increase in built areas equivalent to 170 ha or an 89% expansion. On the contrary, open forest areas have contracted by 550 ha or 61% of their baseline extent, and annual crops have also shrunk by 240 ha or 4% of their original land area.

4. Discussion

This study examines, for the first time, the combined effects of compositing and multisource input features on multitemporal mapping accuracy. It is likely one of the few studies that statistically compares these results across two study sites with disparate local characteristics. The subsequent subsections describe the implications of the results, first for data fusion and then for compositing.

4.1. Impacts of Multisource Data Integration

This study supports previous research on fusing terrain and optical data [8,9,10]. It shows that incorporating terrain or topographic features significantly improved classification accuracy for both the PLW and the BW.
Likewise, adding textural features also improved accuracy, except for two pairs in the PLW where the addition did not result in any significant difference (i.e., the RF Model 4–7 and RF Model 9–10 pairs). These results corroborate the finding that while most studies show increased accuracies when optical and radar data are paired using multiple techniques [89,90,91,92,93], there are instances when this is not the case. This is supported by a comprehensive study on optimizing the use of optical and radar images for mapping, which showed that the level of fusion (i.e., pixel, feature, or decision level), data distribution, spatial resolution, and method used (i.e., RF, SVM, etc.) affected the results and that the concurrent use of both optical and radar data does not ensure better outputs [94].
However, the underlying cause for the lack of a significant difference between the two RF model pairs in the PLW differs from that of [95]. This is because all aforementioned factors were the same for all RF models used in this study, with their only difference being their input features. Instead, it is more likely that there are instances when topographic features would have similar contributions to accuracy as radar-derived metrics since topography was also an input feature in both RF model pairs. Gini importance values for topographic features consistently ranked high for all RF models.

4.2. Impacts of Compositing

The results of this work may be the first indication of local site influence and input feature influence on optimal compositing. This is because the statistical analysis had mixed results for the two study sites and the different sets of feature combinations.
Hence, the impacts of compositing methods on accuracy did not exhibit a general trend. In both the PLW and the BW, the RF models that combined optical, radar, and topographic features were more accurate when seasonal median composites were used. This is analogous to the results of a study in China, where an RF model with similar inputs also had more accurate class predictions when monthly median composites were used compared to annual percentiles [46].
Similarly, seasonal median composites were more accurate in RF models with integrated optical and topographic features but only in the PLW. Studies in Iran, Italy, and Mongolia with comparable RF models mirror these results [33,34,35]. However, the converse is true for RF models with solely optical features, where annual percentile composites resulted in more accurate maps in the PLW but had no statistically discernible difference in the BW.
Likewise, the impact of using both seasonal and annual composites differed between the two sites. This combination yielded the most accurate RF model in the PLW, consistent with the notion that increasing the number of input composites (e.g., the number of percentiles or of seasonal or monthly medians) improves accuracy [46,95]. On the contrary, seasonal composites with textural features performed statistically best in the BW, even when compared against models that used both seasonal and annual composites.
Nonetheless, the differing accuracies between the two compositing techniques under different feature combinations and at different sites suggest that there may be no single 'best' compositing method for all instances. Instead, the choice of compositing method could depend on specific site conditions and the features used. For instance, seasonal composites might perform better in the BW because of its distinct rainy and dry seasons, while the PLW could benefit from using both annual and seasonal composites given its more evenly distributed rainfall.
If so, the compositing method should be tailored to the local conditions of each site. A practical way to address this challenge is the iterative testing and comparison of composites to ensure that the best one is used for classification [35].
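The two compositing strategies compared above can be illustrated on a toy per-pixel reflectance stack; in GEE these would be ImageCollection reducers over filtered date ranges, but NumPy makes the per-pixel arithmetic explicit. The dry/wet month split below is an assumption for illustration only.

```python
# Toy illustration of annual percentile vs. seasonal median compositing,
# applied to a synthetic per-pixel reflectance time series.
import numpy as np

rng = np.random.default_rng(0)
# 12 monthly observations of one band for a 4x4-pixel tile (synthetic values)
stack = rng.uniform(0.0, 0.4, size=(12, 4, 4))

# Annual percentile composite: e.g., the 25th/50th/75th percentiles over the
# whole year, each percentile becoming one input feature per band.
annual_percentiles = np.percentile(stack, [25, 50, 75], axis=0)  # (3, 4, 4)

# Seasonal median composite: split the year into seasons (here, an assumed
# simple dry/wet split) and take the per-pixel median of each season.
dry = np.median(stack[0:6], axis=0)    # months 1-6
wet = np.median(stack[6:12], axis=0)   # months 7-12
seasonal_medians = np.stack([dry, wet])  # (2, 4, 4)

print(annual_percentiles.shape, seasonal_medians.shape)
```

Each composite layer is then stacked with the other feature bands as input to the RF classifier, so the choice of reducer and temporal window directly determines the feature space the model sees.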

4.3. Generated Land Cover

This study also demonstrated a means of generating consistent land cover maps for the past two decades. The optimized RF output maps all had an OA greater than 90%, and all accuracy metrics were well above the acceptable 80% threshold. However, the OOBE was at the high end of the acceptable range, particularly for the models in the PLW. This may be due to the large number of features and the characteristically large, imbalanced dataset used for training and testing; both contribute to overestimation of the OOBE [96,97].
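How the OOBE is obtained, and the imbalanced conditions under which it tends to overestimate error, can be sketched with scikit-learn on synthetic data; the class weights and feature counts below are illustrative assumptions, not the study's dataset.

```python
# Sketch of extracting the out-of-bag error (OOBE) from a random forest
# trained on a synthetic, imbalanced, high-dimensional dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Imbalanced classes and many features: conditions reported to inflate
# the OOB estimate of the generalization error.
X, y = make_classification(n_samples=2000, n_features=50, n_informative=10,
                           n_classes=3, weights=[0.8, 0.15, 0.05],
                           random_state=42)

rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=42)
rf.fit(X, y)

# Each tree is scored only on the samples it never saw during bagging.
print(f"OOB error = {1.0 - rf.oob_score_:.3f}")
```

Comparing this OOB error against a held-out test accuracy on the same data is one quick way to gauge how optimistic or pessimistic the internal estimate is for a given class balance.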
However, specific anomalous trends were observed in the PLW, such as decreases in built-up areas and a pendulum-like pattern in open forest areas, which appear counterintuitive. These patterns seem to be localized to the PLW, as the BW does not exhibit the same trends. Several factors likely contribute to these observed trends. A significant real estate development project in the PLW, planned since the late 1990s and covering approximately 300–400 hectares, was subject to legal proceedings and unresolved issues until 2023. These delays and complications likely periodically halted construction activities, resulting in a perceived decrease in the built-up area during the study period. The maps corroborate this, showing the “loss” of built-up areas in the same vicinity as the said development. Additionally, the PLW, along with several other regions in the Philippines, experienced several severe outbreaks of the coconut scale insect (Aspidiotus rigidus), leading to the cutting and abandonment of large areas of coconut plantations—a major perennial crop—primarily from 2010 to 2016 [98]. Some of these abandoned perennial crops may have resembled open forests in satellite imagery, contributing to the unusual pattern where forest areas appear to increase and decrease periodically. Given these localized factors, the anomalies are specific to the PLW and not indicative of broader trends. While these real-world complexities may continue to influence the observed data trends, the overall model performance remains robust.
Hence, the generated land cover maps offer advantages over those provided by NAMRIA: no official accuracy data have been published for the NAMRIA maps, and NAMRIA land cover data earlier than 2010 used differing methodologies, making change analysis less reliable. Nevertheless, the 2010–2015 and 2015–2020 land cover changes follow trends similar to NAMRIA's. Likewise, the generated land cover had better accuracies than the latest or most commonly used global land cover datasets: CGLS_LC100m V3.0 had an OA of 83.7% in Asia [99]; GLC_FCS30 had an OA of 82.5% and κ = 0.78 [100]; GlobeLand30 had an OA of 80.3% and κ = 0.75; and ESA WorldCover 10 m 2020 v200 had an OA of 76.7% [101].

4.4. Limitations and Further Studies

This study exclusively examined the impact of integrating additional data sources on the predictive value of the RF model, focusing solely on the feature sets (i.e., optical, radar, terrain) relevant to this inquiry. Hence, it does not determine the contributions of individual features to accuracy. Although Gini importance measures of the input features were available for the RF models, there are limitations to their application [102], which points to a potential area for further analysis. Research needs to expand to determine how best to combine multisource data and to identify which specific combinations of input features yield better predictions. This will entail assessing and comparing a much larger set of feature combinations than was conducted in this study. Integrating other data sources, such as airborne LiDAR, aerial UAV images, or MODIS GPP, is also a potential area of exploration.
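One workaround for the cited limitations of Gini importance is permutation importance on held-out data, which can be contrasted with the impurity-based values in a few lines; the dataset below is synthetic and the feature labels are placeholders.

```python
# Sketch contrasting Gini (impurity-based) importances with permutation
# importance, a common remedy for Gini's known biases toward
# high-cardinality and correlated features.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=8, n_informative=4,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)

# Gini importance: computed during training, fast, but potentially biased.
gini = rf.feature_importances_

# Permutation importance: measured as the accuracy drop when one feature's
# values are shuffled on held-out data; slower but less biased.
perm = permutation_importance(rf, X_test, y_test, n_repeats=10, random_state=0)

for i, (g, p) in enumerate(zip(gini, perm.importances_mean)):
    print(f"feature_{i}: gini={g:.3f}, permutation={p:.3f}")
```

Features that rank high under Gini but near zero under permutation are candidates for the bias the cited literature warns about, which is one way such rankings could be screened before interpretation.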
Similarly, this study only examined how annual percentile composites and seasonal median composites affect accuracy. The mixed results in the two study sites seem to indicate local influences that may affect the optimal compositing method. Thus, more in-depth assessments should be conducted to supplement and verify these results. Further research could determine which local characteristics (e.g., climatic variability, phenology) affect optimal compositing and measure the magnitude of their influence. The impacts of finer temporal scales for compositing (e.g., quarterly, monthly), especially when using data with more frequent revisit times such as MODIS, should also be explored.

5. Conclusions

This study used GEE to compare ten RF models for multitemporal land cover mapping, each utilizing a unique combination of optical, radar, and topographic features. The final optimized RF model generated high-accuracy (OA > 0.90) maps in both the PLW and the BW, adding to the availability of temporally consistent long-term land cover maps in the Philippines.
Statistical analysis highlighted that fusing optical and topographic data significantly improved map accuracy and that fusing radar data tended to do the same, though not in all instances. However, the varied influence of compositing suggests, potentially for the first time, that the optimal composite for land cover mapping depends on site characteristics and the features used in classification. Since comparable studies on compositing are scarce, a consensus on its influence or underlying factors is unlikely until sufficient data are available. It is therefore recommended to adopt iterative composite and input feature optimization as standard procedures for improving accuracy in multitemporal mapping.
However, further investigation is needed to better understand the factors that affect compositing accuracy so that current methods can be improved. Research also needs to expand the selection of input features to help determine which combinations or scales would yield better predictions. Additional comparable studies are also needed, especially in other bioclimatic regions, to validate these findings.

Author Contributions

Conceptualization, N.R.A.; Formal analysis, N.R.A.; Funding acquisition, J.M.P. and O.S.; Investigation, N.R.A. and A.T.M.; Methodology, N.R.A. and S.H.; Project administration, J.M.P. and O.S.; Supervision, S.H., J.M.P. and C.L.T.J.; Validation, S.H.; Visualization, N.R.A.; Writing—original draft, N.R.A.; Writing—review and editing, S.H., J.M.P., C.L.T.J., A.T.M. and O.S. All authors have read and agreed to the published version of the manuscript.

Funding

This study was conducted as a part of the project “Integration of Traditional and Modern Bioproduction System for a Sustainable and Resilient Future under Climate and Ecosystem Changes (ITMoB)” which was funded by the Japan Science and Technology Agency (JST) East Asia Science and Innovation Area Joint Research Program (e-ASIA JRP), grant number JPMJSC20E6. Additionally, the Philippine component of this research was funded by the Department of Science and Technology-Philippine Council for Agriculture, Aquatic and Natural Resources Research and Development (DOST-PCARRD), fund number N925522. Furthermore, this study was supported by the Environment Research and Technology Development Fund (JPMEERF23S12140) of the Environmental Restoration and Conservation Agency (ERCA) of Japan.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy.

Acknowledgments

We thank all the ITMoB project members for their guidance and contributions.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Mohan Rajan, S.N.; Loganathan, A.; Manoharan, P. Survey on Land Use/Land Cover (LU/LC) Change Analysis in Remote Sensing and GIS Environment: Techniques and Challenges. Environ. Sci. Pollut. Res. 2020, 27, 29900–29926. [Google Scholar] [CrossRef]
  2. Pandey, P.C.; Koutsias, N.; Petropoulos, G.P.; Srivastava, P.K.; Ben Dor, E. Land Use/Land Cover in View of Earth Observation: Data Sources, Input Dimensions, and Classifiers—A Review of the State of the Art. Geocarto Int. 2021, 36, 957–988. [Google Scholar] [CrossRef]
  3. Angelopoulou, T.; Tziolas, N.; Balafoutis, A.; Zalidis, G.; Bochtis, D. Remote Sensing Techniques for Soil Organic Carbon Estimation: A Review. Remote Sens. 2019, 11, 676. [Google Scholar] [CrossRef]
  4. Corwin, D.L.; Scudiero, E. Review of Soil Salinity Assessment for Agriculture across Multiple Scales Using Proximal and/or Remote Sensors. In Advances in Agronomy; Elsevier: Amsterdam, The Netherlands, 2019; Volume 158, pp. 1–130. ISBN 978-0-12-817412-8. [Google Scholar]
  5. Edokossi, K.; Calabia, A.; Jin, S.; Molina, I. GNSS-Reflectometry and Remote Sensing of Soil Moisture: A Review of Measurement Techniques, Methods, and Applications. Remote Sens. 2020, 12, 614. [Google Scholar] [CrossRef]
  6. Lechner, A.M.; Foody, G.M.; Boyd, D.S. Applications in Remote Sensing to Forest Ecology and Management. One Earth 2020, 2, 405–412. [Google Scholar] [CrossRef]
  7. Shanmugapriya, P.; Rathika, S.; Ramesh, T.; Janaki, P. Applications of Remote Sensing in Agriculture—A Review. Int. J. Curr. Microbiol. Appl. Sci. 2019, 8, 2270–2283. [Google Scholar] [CrossRef]
  8. Belgiu, M.; Stein, A. Spatiotemporal Image Fusion in Remote Sensing. Remote Sens. 2019, 11, 818. [Google Scholar] [CrossRef]
  9. Li, J.; Li, Y.; He, L.; Chen, J.; Plaza, A. Spatio-Temporal Fusion for Remote Sensing Data: An Overview and New Benchmark. Sci. China Inf. Sci. 2020, 63, 140301. [Google Scholar] [CrossRef]
  10. Ghamisi, P.; Gloaguen, R.; Atkinson, P.M.; Benediktsson, J.A.; Rasti, B.; Yokoya, N.; Wang, Q.; Hofle, B.; Bruzzone, L.; Bovolo, F.; et al. Multisource and Multitemporal Data Fusion in Remote Sensing: A Comprehensive Review of the State of the Art. IEEE Geosci. Remote Sens. Mag. 2019, 7, 6–39. [Google Scholar] [CrossRef]
  11. Schmitt, M.; Zhu, X.X. Data Fusion and Remote Sensing: An Ever-Growing Relationship. IEEE Geosci. Remote Sens. Mag. 2016, 4, 6–23. [Google Scholar] [CrossRef]
  12. Mahyoub, S.; Fadil, A.; Mansour, E.M.; Rhinane, H.; Al-Nahmi, F. Fusing of optical and Synthetic Aperture Radar (SAR) remote sensing data: A systematic literature review (SLR). Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2019, 42, 127–138. [Google Scholar] [CrossRef]
  13. Orynbaikyzy, A.; Gessner, U.; Conrad, C. Crop Type Classification Using a Combination of Optical and Radar Remote Sensing Data: A Review. Int. J. Remote Sens. 2019, 40, 6553–6595. [Google Scholar] [CrossRef]
  14. Amani, M.; Ghorbanian, A.; Ahmadi, S.A.; Kakooei, M.; Moghimi, A.; Mirmazloumi, S.M.; Moghaddam, S.H.A.; Mahdavi, S.; Ghahremanloo, M.; Parsian, S.; et al. Google Earth Engine Cloud Computing Platform for Remote Sensing Big Data Applications: A Comprehensive Review. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 5326–5350. [Google Scholar] [CrossRef]
  15. Tamiminia, H.; Salehi, B.; Mahdianpari, M.; Quackenbush, L.; Adeli, S.; Brisco, B. Google Earth Engine for Geo-Big Data Applications: A Meta-Analysis and Systematic Review. ISPRS J. Photogramm. Remote Sens. 2020, 164, 152–170. [Google Scholar] [CrossRef]
  16. Wu, X. Big Data Classification of Remote Sensing Image Based on Cloud Computing and Convolutional Neural Network. Soft Comput. 2022. [Google Scholar] [CrossRef]
  17. Xu, C.; Du, X.; Fan, X.; Giuliani, G.; Hu, Z.; Wang, W.; Liu, J.; Wang, T.; Yan, Z.; Zhu, J.; et al. Cloud-Based Storage and Computing for Remote Sensing Big Data: A Technical Review. Int. J. Digit. Earth 2022, 15, 1417–1445. [Google Scholar] [CrossRef]
  18. Zhao, Q.; Yu, L.; Li, X.; Peng, D.; Zhang, Y.; Gong, P. Progress and Trends in the Application of Google Earth and Google Earth Engine. Remote Sens. 2021, 13, 3778. [Google Scholar] [CrossRef]
  19. Yang, L.; Driscol, J.; Sarigai, S.; Wu, Q.; Chen, H.; Lippitt, C.D. Google Earth Engine and Artificial Intelligence (AI): A Comprehensive Review. Remote Sens. 2022, 14, 3253. [Google Scholar] [CrossRef]
  20. Whitcraft, A.K.; Vermote, E.F.; Becker-Reshef, I.; Justice, C.O. Cloud Cover throughout the Agricultural Growing Season: Impacts on Passive Optical Earth Observations. Remote Sens. Environ. 2015, 156, 438–447. [Google Scholar] [CrossRef]
  21. Jiang, R.; Sanchez-Azofeifa, A.; Laakso, K.; Xu, Y.; Zhou, Z.; Luo, X.; Huang, J.; Chen, X.; Zang, Y. Cloud Cover throughout All the Paddy Rice Fields in Guangdong, China: Impacts on Sentinel 2 MSI and Landsat 8 OLI Optical Observations. Remote Sens. 2021, 13, 2961. [Google Scholar] [CrossRef]
  22. Laborde, H.; Douzal, V.; Ruiz Piña, H.A.; Morand, S.; Cornu, J.-F. Landsat-8 Cloud-Free Observations in Wet Tropical Areas: A Case Study in South East Asia. Remote Sens. Lett. 2017, 8, 537–546. [Google Scholar] [CrossRef]
  23. Mao, K.; Yuan, Z.; Zuo, Z.; Xu, T.; Shen, X.; Gao, C. Changes in Global Cloud Cover Based on Remote Sensing Data from 2003 to 2012. Chin. Geogr. Sci. 2019, 29, 306–315. [Google Scholar] [CrossRef]
  24. Mo, Y.; Xu, Y.; Chen, H.; Zhu, S. A Review of Reconstructing Remotely Sensed Land Surface Temperature under Cloudy Conditions. Remote Sens. 2021, 13, 2838. [Google Scholar] [CrossRef]
  25. Pu, D.C.; Sun, J.Y.; Ding, Q.; Zheng, Q.; Li, T.T.; Niu, X.F. Mapping Urban Areas Using Dense Time Series of Landsat Images and Google Earth Engine. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2020, 42, 403–409. [Google Scholar] [CrossRef]
  26. Schmitt, M.; Hughes, L.H.; Qiu, C.; Zhu, X.X. Aggregating Cloud-Free Sentinel-2 Images with Google Earth Engine. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2019, 4, 145–152. [Google Scholar] [CrossRef]
  27. Salwa Thasveen, M.; Suresh, S. Land—Use and Land—Cover Classification Methods: A Review. In Proceedings of the 2021 Fourth International Conference on Microelectronics, Signals & Systems (ICMSS), IEEE, Kollam, India, 18 November 2021; pp. 1–6. [Google Scholar]
  28. Kumar, N.; Manhas, J.; Sharma, V. A Comparative Analysis to Visualize the Behavior of Different Machine Learning Algorithms for Normalized and Un-Normalized Data in Predicting Alzheimer’s Disease. J. Comput. Theor. Nanosci. 2019, 16, 3840–3848. [Google Scholar] [CrossRef]
  29. Maurya, K.; Mahajan, S.; Chaube, N. Remote Sensing Techniques: Mapping and Monitoring of Mangrove Ecosystem—A Review. Complex Intell. Syst. 2021, 7, 2797–2818. [Google Scholar] [CrossRef]
  30. Piao, Y.; Jeong, S.; Park, S.; Lee, D. Analysis of Land Use and Land Cover Change Using Time-Series Data and Random Forest in North Korea. Remote Sens. 2021, 13, 3501. [Google Scholar] [CrossRef]
  31. Talukdar, S.; Singha, P.; Mahato, S.; Shahfahad; Pal, S.; Liou, Y.-A.; Rahman, A. Land-Use Land-Cover Classification by Machine Learning Classifiers for Satellite Observations—A Review. Remote Sens. 2020, 12, 1135. [Google Scholar] [CrossRef]
  32. Thanh Noi, P.; Kappas, M. Comparison of Random Forest, k-Nearest Neighbor, and Support Vector Machine Classifiers for Land Cover Classification Using Sentinel-2 Imagery. Sensors 2017, 18, 18. [Google Scholar] [CrossRef]
  33. Nasiri, V.; Deljouei, A.; Moradi, F.; Sadeghi, S.M.M.; Borz, S.A. Land Use and Land Cover Mapping Using Sentinel-2, Landsat-8 Satellite Images, and Google Earth Engine: A Comparison of Two Composition Methods. Remote Sens. 2022, 14, 1977. [Google Scholar] [CrossRef]
  34. Phan, T.N.; Kuch, V.; Lehnert, L.W. Land Cover Classification Using Google Earth Engine and Random Forest Classifier—The Role of Image Composition. Remote Sens. 2020, 12, 2411. [Google Scholar] [CrossRef]
  35. Praticò, S.; Solano, F.; Di Fazio, S.; Modica, G. Machine Learning Classification of Mediterranean Forest Habitats in Google Earth Engine Based on Seasonal Sentinel-2 Time-Series and Input Image Composition Optimisation. Remote Sens. 2021, 13, 586. [Google Scholar] [CrossRef]
  36. Sellami, E.M.; Rhinane, H. A new approach for mapping land use/land cover using google earth engine: A comparison of composition images. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2023, 48, 343–349. [Google Scholar] [CrossRef]
  37. FAO. Terms and Definitions: FRA 2020; Forest Resources Assessment Working Paper 188; FAO: Rome, Italy, 2018. [Google Scholar]
  38. FAO; UNEP. The State of the World’s Forests 2020; FAO: Rome, Italy; UNEP: Nairobi, Kenya, 2020; ISBN 978-92-5-132419-6. [Google Scholar]
  39. Ecosystems and People. The Philippine Millennium Ecosystem Assessment; Ecosystems and People: Abingdon, UK, 2005. [Google Scholar]
  40. Jalil, A.A.; Caguiat, L.S.; Khaing, K.T.; Alos, B.M.; Moreno, N.A. Augmentation of Agrometeorological Stations Network in Southern Luzon, Philippines. EPRA Int. J. Multidiscip. Res. 2021, 7, 187–197. [Google Scholar] [CrossRef]
  41. Cruz, R.V.O.; Pillas, M.; Castillo, H.C.; Hernandez, E.C. Pagsanjan-Lumban Catchment, Philippines: Summary of Biophysical Characteristics of the Catchment, Background to Site Selection and Instrumentation. Agric. Water Manag. 2012, 106, 3–7. [Google Scholar] [CrossRef]
  42. Philippine Statistics Authority. 2020 Census of Population and Housing (2020 CPH) Population Counts; Philippine Statistics Authority: Quezon City, Philippines, 2021. [Google Scholar]
  43. National Economic and Development Authority of the Philippines. CALABARZON Regional Development Plan 2023–2028; National Economic and Development Authority of the Philippines: Mandaluyong City, Philippines, 2023. [Google Scholar]
  44. Encisa-Garcia, J.; Pulhin, J.; Cruz, R.V.; Simondac-Peria, A.; Ramirez, M.A.; De Luna, C. Land Use/Land Cover Changes Assessment and Forest Fragmentation Analysis in the Baroro River Watershed, La Union, Philippines. J. Environ. Sci. Manag. 2020, 2, 14–27. [Google Scholar] [CrossRef]
  45. National Economic and Development Authority of the Philippines. Region 1 Regional Development Plan 2023–2028; National Economic and Development Authority of the Philippines: Mandaluyong City, Philippines, 2023. [Google Scholar]
  46. Xie, S.; Liu, L.; Zhang, X.; Yang, J.; Chen, X.; Gao, Y. Automatic Land-Cover Mapping Using Landsat Time-Series Data Based on Google Earth Engine. Remote Sens. 2019, 11, 3023. [Google Scholar] [CrossRef]
  47. Zhang, H.K.; Roy, D.P. Using the 500 m MODIS Land Cover Product to Derive a Consistent Continental Scale 30 m Landsat Land Cover Classification. Remote Sens. Environ. 2017, 197, 15–34. [Google Scholar] [CrossRef]
  48. Freeman, E.A.; Moisen, G.G.; Coulston, J.W.; Wilson, B.T. Random Forests and Stochastic Gradient Boosting for Predicting Tree Canopy Cover: Comparing Tuning Processes and Model Performance. Can. J. For. Res. 2016, 46, 323–339. [Google Scholar] [CrossRef]
  49. Jun, M.-J. A Comparison of a Gradient Boosting Decision Tree, Random Forests, and Artificial Neural Networks to Model Urban Land Use Changes: The Case of the Seoul Metropolitan Area. Int. J. Geogr. Inf. Sci. 2021, 35, 2149–2167. [Google Scholar] [CrossRef]
  50. Nawar, S.; Mouazen, A. Comparison between Random Forests, Artificial Neural Networks and Gradient Boosted Machines Methods of On-Line Vis-NIR Spectroscopy Measurements of Soil Total Nitrogen and Total Carbon. Sensors 2017, 17, 2428. [Google Scholar] [CrossRef] [PubMed]
  51. Sahin, E.K. Assessing the Predictive Capability of Ensemble Tree Methods for Landslide Susceptibility Mapping Using XGBoost, Gradient Boosting Machine, and Random Forest. SN Appl. Sci. 2020, 2, 1308. [Google Scholar] [CrossRef]
  52. Liang, S. Narrowband to Broadband Conversions of Land Surface Albedo I. Remote Sens. Environ. 2001, 76, 213–238. [Google Scholar] [CrossRef]
  53. Smith, R.B. The Heat Budget of the Earth’s Surface Deduced from Space; Yale: New Haven, CT, USA, 2010. [Google Scholar]
  54. Huete, A.R.; Didan, K.; Miura, T.; Rodriguez, E.P.; Gao, X.; Ferreira, L.G. Overview of the Radiometric and Biophysical Performance of the MODIS Vegetation Indices. Remote Sens. Environ. 2002, 83, 195–213. [Google Scholar] [CrossRef]
  55. Jiang, Z.; Huete, A.; Didan, K.; Miura, T. Development of a Two-Band Enhanced Vegetation Index without a Blue Band. Remote Sens. Environ. 2008, 112, 3833–3845. [Google Scholar] [CrossRef]
  56. Ceccato, P.; Gobron, N.; Flasse, S.; Pinty, B.; Tarantola, S. Designing a Spectral Index to Estimate Vegetation Water Content from Remote Sensing Data: Part 1. Remote Sens. Environ. 2002, 82, 188–197. [Google Scholar] [CrossRef]
  57. Gitelson, A.A.; Viña, A.; Arkebauer, T.J.; Rundquist, D.C.; Keydan, G.; Leavitt, B. Remote Estimation of Leaf Area Index and Green Leaf Biomass in Maize Canopies. Geophys. Res. Lett. 2003, 30, 1248. [Google Scholar] [CrossRef]
  58. McFeeters, S.K. The Use of the Normalized Difference Water Index (NDWI) in the Delineation of Open Water Features. Int. J. Remote Sens. 1996, 17, 1425–1432. [Google Scholar] [CrossRef]
  59. Nguyen, C.T.; Chidthaisong, A.; Kieu Diem, P.; Huo, L.-Z. A Modified Bare Soil Index to Identify Bare Land Features during Agricultural Fallow-Period in Southeast Asia Using Landsat 8. Land 2021, 10, 231. [Google Scholar] [CrossRef]
  60. Zha, Y.; Gao, J.; Ni, S. Use of Normalized Difference Built-up Index in Automatically Mapping Urban Areas from TM Imagery. Int. J. Remote Sens. 2003, 24, 583–594. [Google Scholar] [CrossRef]
  61. Huete, A.R. A Soil-Adjusted Vegetation Index (SAVI). Remote Sens. Environ. 1988, 25, 295–309. [Google Scholar] [CrossRef]
  62. Rouse, J.W., Jr.; Haas, R.H.; Schell, J.A.; Deering, D.W. Monitoring Vegetation Systems in the Great Plains with Erts. NASA Spec. Publ. 1974, 351, 309. [Google Scholar]
  63. De Alban, J.; Connette, G.; Oswald, P.; Webb, E. Combined Landsat and L-Band SAR Data Improves Land Cover Classification and Change Detection in Dynamic Tropical Landscapes. Remote Sens. 2018, 10, 306. [Google Scholar] [CrossRef]
  64. Haralick, R.M.; Shanmugam, K.; Dinstein, I. Textural Features for Image Classification. IEEE Trans. Syst. Man Cybern. 1973, 6, 610–621. [Google Scholar] [CrossRef]
  65. Kim, D. Prediction Performance of Support Vector Machines on Input Vector Normalization Methods. Int. J. Comput. Math. 2004, 81, 547–554. [Google Scholar] [CrossRef]
  66. Saboor, A.; Usman, M.; Ali, S.; Samad, A.; Abrar, M.F.; Ullah, N. A Method for Improving Prediction of Human Heart Disease Using Machine Learning Algorithms. Mob. Inf. Syst. 2022, 2022, 1410169. [Google Scholar] [CrossRef]
  67. De Leeuw, J.; Jia, H.; Yang, L.; Liu, X.; Schmidt, K.; Skidmore, A.K. Comparing Accuracy Assessments to Infer Superiority of Image Classification Methods. Int. J. Remote Sens. 2006, 27, 223–232. [Google Scholar] [CrossRef]
  68. Foody, G.M. Thematic Map Comparison: Evaluating the Statistical Significance of Differences in Classification Accuracy. Photogramm. Eng. Remote Sens. 2004, 70, 627–634. [Google Scholar] [CrossRef]
  69. Lee, M.R.; Sankar, V.; Hammer, A.; Kennedy, W.G.; Barb, J.J.; McQueen, P.G.; Leggio, L. Using Machine Learning to Classify Individuals with Alcohol Use Disorder Based on Treatment Seeking Status. eClinicalMedicine 2019, 12, 70–78. [Google Scholar] [CrossRef]
  70. Li, H.; Calder, C.A.; Cressie, N. Beyond Moran’s I: Testing for Spatial Dependence Based on the Spatial Autoregressive Model. Geogr. Anal. 2007, 39, 357–375. [Google Scholar] [CrossRef]
  71. Verma, P.; Raghubanshi, A.; Srivastava, P.K.; Raghubanshi, A.S. Appraisal of Kappa-Based Metrics and Disagreement Indices of Accuracy Assessment for Parametric and Nonparametric Techniques Used in LULC Classification and Change Detection. Model. Earth Syst. Environ. 2020, 6, 1045–1059. [Google Scholar] [CrossRef]
  72. Chicco, D.; Warrens, M.J.; Jurman, G. The Matthews Correlation Coefficient (MCC) Is More Informative Than Cohen’s Kappa and Brier Score in Binary Classification Assessment. IEEE Access 2021, 9, 78368–78381. [Google Scholar] [CrossRef]
  73. Foody, G.M. Explaining the Unsuitability of the Kappa Coefficient in the Assessment and Comparison of the Accuracy of Thematic Maps Obtained by Image Classification. Remote Sens. Environ. 2020, 239, 111630. [Google Scholar] [CrossRef]
  74. Kerr, G.H.G.; Fischer, C.; Reulke, R. Reliability Assessment for Remote Sensing Data: Beyond Cohen’s Kappa. In Proceedings of the 2015 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Milan, Italy, 26–31 July 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 4995–4998. [Google Scholar]
  75. Goutte, C.; Gaussier, E. A Probabilistic Interpretation of Precision, Recall and F-Score, with Implication for Evaluation. In Advances in Information Retrieval; Losada, D.E., Fernández-Luna, J.M., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2005; Volume 3408, pp. 345–359. ISBN 978-3-540-25295-5. [Google Scholar]
  76. Sokolova, M.; Japkowicz, N.; Szpakowicz, S. Beyond Accuracy, F-Score and ROC: A Family of Discriminant Measures for Performance Evaluation. In AI 2006: Advances in Artificial Intelligence; Sattar, A., Kang, B., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2006; Volume 4304, pp. 1015–1021. ISBN 978-3-540-49787-5. [Google Scholar]
  77. Khan, Z.; Gul, N.; Faiz, N.; Gul, A.; Adler, W.; Lausen, B. Optimal Trees Selection for Classification via Out-of-Bag Assessment and Sub-Bagging. IEEE Access 2021, 9, 28591–28607. [Google Scholar] [CrossRef]
  78. Ramosaj, B.; Pauly, M. Consistent Estimation of Residual Variance with Random Forest Out-Of-Bag Errors. Stat. Probab. Lett. 2019, 151, 49–57. [Google Scholar] [CrossRef]
  79. Fay, M.P.; Proschan, M.A.; Brittain, E. Combining One-Sample Confidence Procedures for Inference in the Two-Sample Case: Combining One-Sample Confidence Procedures. Biometrics 2015, 71, 146–156. [Google Scholar] [CrossRef] [PubMed]
  80. Fay, M.P.; Hunsberger, S.A. Practical Valid Inferences for the Two-Sample Binomial Problem. Statist. Surv. 2021, 15, 72–110. [Google Scholar] [CrossRef]
  81. Witten, I.H.; Frank, E.; Hall, M.A. Data Mining: Practical Machine Learning Tools and Techniques; Elsevier: Amsterdam, The Netherlands, 2011; ISBN 978-0-12-374856-0. [Google Scholar]
  82. Yang, L.; Cervone, G. Analysis of Remote Sensing Imagery for Disaster Assessment Using Deep Learning: A Case Study of Flooding Event. Soft Comput. 2019, 23, 13393–13408. [Google Scholar] [CrossRef]
  83. Cui, G.; Lv, Z.; Li, G.; Atli Benediktsson, J.; Lu, Y. Refining Land Cover Classification Maps Based on Dual-Adaptive Majority Voting Strategy for Very High Resolution Remote Sensing Images. Remote Sens. 2018, 10, 1238. [Google Scholar] [CrossRef]
  84. Phiri, D.; Morgenroth, J. Developments in Landsat Land Cover Classification Methods: A Review. Remote Sens. 2017, 9, 967. [Google Scholar] [CrossRef]
  85. He, L.; Li, J.; Liu, C.; Li, S. Recent Advances on Spectral–Spatial Hyperspectral Image Classification: An Overview and New Guidelines. IEEE Trans. Geosci. Remote Sens. 2018, 56, 1579–1597. [Google Scholar] [CrossRef]
  86. Xin, H.; Qikai, L.; Liangpei, Z.; Plaza, A. New Postprocessing Methods for Remote Sensing Image Classification: A Systematic Study. IEEE Trans. Geosci. Remote Sens. 2014, 52, 7140–7159. [Google Scholar] [CrossRef]
  87. Chughtai, A.H.; Abbasi, H.; Karas, I.R. A Review on Change Detection Method and Accuracy Assessment for Land Use Land Cover. Remote Sens. Appl. Soc. Environ. 2021, 22, 100482. [Google Scholar] [CrossRef]
  88. Perera, S.; Allali, M.; Linstead, E.; El-Askary, H. Landuse Landcover Change Detection in the Mediterranean Region Using a Siamese Neural Network and Image Processing. In Proceedings of the 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, IEEE, Brussels, Belgium, 11 July 2021; pp. 4368–4371. [Google Scholar]
  89. Abdikan, S. Exploring Image Fusion of ALOS/PALSAR Data and LANDSAT Data to Differentiate Forest Area. Geocarto Int. 2018, 33, 21–37. [Google Scholar] [CrossRef]
  90. Cass, A.; Petropoulos, G.P.; Ferentinos, K.P.; Pavlides, A.; Srivastava, P.K. Exploring the Synergy between Landsat and ASAR towards Improving Thematic Mapping Accuracy of Optical EO Data. Appl. Geomat. 2019, 11, 277–288. [Google Scholar] [CrossRef]
  91. Ding, Q.; Shao, Z.; Huang, X.; Altan, O.; Fan, Y. Improving Urban Land Cover Mapping with the Fusion of Optical and SAR Data Based on Feature Selection Strategy. Photogramm. Eng. Remote Sens. 2022, 88, 17–28. [Google Scholar] [CrossRef]
Figure 1. Methodological flowchart of this study.
Figure 2. A location map of the study sites.
Figure 3. Images of the land cover present in the study sites. (a) Inland water (foreground) and residential build-up (background) in the PLW, (b) lowland annual crops in the PLW, (c) open forest in the northeast of the PLW, (d) lowland annual crops in the BW, (e) grassland in the rolling hills of the PLW, and (f) grassland (foreground) and a mosaic of cropland, brushland, and open forest in the uplands of the BW (background).
Figure 4. The model performance of various feature sets in (a) the PLW and (b) the BW, ordered by accuracy.
Figure 5. Yates's corrected p-values of McNemar's test for pairwise RF model comparisons in the (a) PLW and (b) BW.
Figure 6. Generated land cover maps of (a) the PLW and (b) the BW from 2000 to 2020.
Figure 7. Net land cover change from the 2000 baseline in (a) the PLW and (b) the BW.
Table 1. Sources, temporal range, and number of remotely sensed data used in this study.

| Image Collection [Spatial Resolution] | Reference Year | Image Dates (Number of Images) |
|---|---|---|
| Landsat 5 Collection 2 Level 2 Tier 1 [30 m] | 2000 | January 1998–December 2000 (83 images) |
| | 2005 | January 2004–December 2006 (63 images) |
| | 2010 | January 2009–December 2011 (53 images) |
| Landsat 8 Collection 2 Level 2 Tier 1 [30 m] | 2015 | January 2014–December 2015 (113 images) |
| | 2020 | January 2019–December 2020 (123 images) |
| ALOS PALSAR-1/PALSAR-2 Yearly Mosaic [25 m] | 2015 | January 2015–December 2015 (1 raster grid) |
| | 2020 | January 2020–December 2020 (1 raster grid) |
| ALOS GDSM (AW3D30) v3.2 [25 m] | All years (static) | January 2021 update (1 raster grid) |
Table 2. Composite optical features used for classifier comparison.

| General Variable | Annual 20th Percentile | Annual 50th Percentile | Annual 80th Percentile | Rainy Season Median | Dry Season Median |
|---|---|---|---|---|---|
| Green | Green_p20 | Green_p50 | Green_p80 | Green_rain | Green_dry |
| Blue | Blue_p20 | Blue_p50 | Blue_p80 | Blue_rain | Blue_dry |
| Red | Red_p20 | Red_p50 | Red_p80 | Red_rain | Red_dry |
| NIR | NIR_p20 | NIR_p50 | NIR_p80 | NIR_rain | NIR_dry |
| SWIR1 | SWIR1_p20 | SWIR1_p50 | SWIR1_p80 | SWIR1_rain | SWIR1_dry |
| SWIR2 | SWIR2_p20 | SWIR2_p50 | SWIR2_p80 | SWIR2_rain | SWIR2_dry |
| ALB | ALB_p20 | ALB_p50 | ALB_p80 | ALB_rain | ALB_dry |
| EVI | EVI_p20 | EVI_p50 | EVI_p80 | EVI_rain | EVI_dry |
| EVI2 | EVI2_p20 | EVI2_p50 | EVI2_p80 | EVI2_rain | EVI2_dry |
| GCI | GCI_p20 | GCI_p50 | GCI_p80 | GCI_rain | GCI_dry |
| GVMI | GVMI_p20 | GVMI_p50 | GVMI_p80 | GVMI_rain | GVMI_dry |
| MBI | MBI_p20 | MBI_p50 | MBI_p80 | MBI_rain | MBI_dry |
| NDBI | NDBI_p20 | NDBI_p50 | NDBI_p80 | NDBI_rain | NDBI_dry |
| NDVI | NDVI_p20 | NDVI_p50 | NDVI_p80 | NDVI_rain | NDVI_dry |
| NDWI | NDWI_p20 | NDWI_p50 | NDWI_p80 | NDWI_rain | NDWI_dry |
| SAVI | SAVI_p20 | SAVI_p50 | SAVI_p80 | SAVI_rain | SAVI_dry |
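The composite scheme in Table 2 can be illustrated with a small NumPy sketch that collapses a per-pixel time stack of one band or index (NDVI here) into the three annual percentile composites and the two seasonal medians. This is a hypothetical reconstruction, not the study's cloud-based workflow, and the rainy-season month split used below is assumed for illustration:

```python
import numpy as np

def composite_features(stack, months, rainy=(6, 7, 8, 9, 10, 11)):
    """Collapse a (time, rows, cols) stack of one band or spectral index
    into the five composite features listed per variable in Table 2."""
    months = np.asarray(months)
    wet = np.isin(months, rainy)  # assumed rainy-season months
    return {
        "p20": np.nanpercentile(stack, 20, axis=0),  # annual 20th percentile
        "p50": np.nanpercentile(stack, 50, axis=0),  # annual median
        "p80": np.nanpercentile(stack, 80, axis=0),  # annual 80th percentile
        "rain": np.nanmedian(stack[wet], axis=0),    # rainy-season median
        "dry": np.nanmedian(stack[~wet], axis=0),    # dry-season median
    }

# Example: NDVI from four acquisition dates over a single pixel
nir = np.array([[[0.5]], [[0.6]], [[0.7]], [[0.8]]])
red = np.array([[[0.1]], [[0.2]], [[0.1]], [[0.2]]])
ndvi = (nir - red) / (nir + red)
feats = composite_features(ndvi, months=[1, 4, 7, 10])
```

The NaN-aware reducers mirror the way per-pixel compositing tolerates cloud-masked observations on some dates.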
Table 3. Textural radar features used for classifier training.

| General Variable | Horizontal Polarization (HH) | Vertical Polarization (HV) |
|---|---|---|
| Single Band | HH | HV |
| Simple Band Ratio | RAT | – |
| ASM | HH_ASM | HV_ASM |
| CON | HH_CON | HV_CON |
| CORR | HH_CORR | HV_CORR |
| DISS | HH_DISS | HV_DISS |
| ENT | HH_ENT | HV_ENT |
| IDM | HH_IDM | HV_IDM |
| SAVG | HH_SAVG | HV_SAVG |
| VAR | HH_VAR | HV_VAR |
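Two of the grey-level co-occurrence (GLCM) texture measures above, ASM and CON (contrast), can be sketched in pure NumPy on a quantized backscatter patch. The study's window size, quantization levels, and pixel offsets are not reproduced here, so the example values are purely illustrative, and the simple band ratio (RAT) is assumed to be an HH/HV quotient:

```python
import numpy as np

def glcm(img, levels, dx=1, dy=0):
    """Symmetric, normalised grey-level co-occurrence matrix for one offset."""
    P = np.zeros((levels, levels))
    rows, cols = img.shape
    for r in range(rows - dy):
        for c in range(cols - dx):
            i, j = img[r, c], img[r + dy, c + dx]
            P[i, j] += 1
            P[j, i] += 1  # count each pair in both directions
    return P / P.sum()

def asm(P):
    """Angular second moment: sum of squared co-occurrence probabilities."""
    return float((P ** 2).sum())

def contrast(P):
    """Contrast: squared grey-level differences weighted by co-occurrence."""
    i, j = np.indices(P.shape)
    return float((P * (i - j) ** 2).sum())

patch = np.array([[0, 0, 1],
                  [0, 0, 1],
                  [0, 2, 2]])  # hypothetical HH backscatter, 3 grey levels
P = glcm(patch, levels=3)
```

Uniform patches push ASM toward 1 and contrast toward 0; heterogeneous patches do the opposite, which is what makes these measures useful for separating built-up and vegetated radar textures.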
Table 4. Accuracy metrics of land cover maps generated by the optimized RF model.

| Metric | PLW 2010 | PLW 2015 | PLW 2020 | BW 2010 | BW 2015 | BW 2020 |
|---|---|---|---|---|---|---|
| OA | 0.9284 | 0.9243 | 0.9276 | 0.9517 | 0.9485 | 0.9504 |
| κ | 0.9092 | 0.9033 | 0.9083 | 0.9334 | 0.9300 | 0.9322 |
| F1 | 0.8845 | 0.9107 | 0.9306 | 0.8586 | 0.8805 | 0.8734 |
| OOBE | 0.1869 | 0.1792 | 0.1784 | 0.0806 | 0.0776 | 0.0781 |
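The map-level metrics in Table 4 follow from each map's confusion matrix. A minimal sketch of overall accuracy (OA), Cohen's kappa (κ), and F1 is given below; averaging F1 across classes (macro F1) is an assumption here, and OOBE comes from the random forest's out-of-bag samples rather than from the matrix:

```python
import numpy as np

def accuracy_metrics(cm):
    """OA, Cohen's kappa, and macro F1 from a confusion matrix
    (rows = reference classes, columns = predicted classes)."""
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    oa = np.trace(cm) / n
    # Chance agreement from the product of row and column marginals
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2
    kappa = (oa - pe) / (1 - pe)
    tp = np.diag(cm)
    precision = tp / cm.sum(axis=0)
    recall = tp / cm.sum(axis=1)
    f1 = float(np.mean(2 * precision * recall / (precision + recall)))
    return float(oa), float(kappa), f1

oa, kappa, f1 = accuracy_metrics([[50, 10],
                                  [5, 35]])  # hypothetical two-class matrix
```

Kappa discounts the agreement expected by chance, which is why it sits slightly below OA for every map in Table 4.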
Share and Cite

Almarines, N.R.; Hashimoto, S.; Pulhin, J.M.; Tiburan, C.L., Jr.; Magpantay, A.T.; Saito, O. Influence of Image Compositing and Multisource Data Fusion on Multitemporal Land Cover Mapping of Two Philippine Watersheds. Remote Sens. 2024, 16, 2167. https://doi.org/10.3390/rs16122167