Combination of Landsat 8 OLI and Sentinel-1 SAR Time-Series Data for Mapping Paddy Fields in Parts of West and Central Java Provinces, Indonesia

Arjasakusuma, Sanjiwana; Swahyu Kusuma, Sandiaga; Rafif, Raihan; Saringatin, Siti; Wicaksono, Pramaditya

doi:10.3390/ijgi9110663

Open AccessArticle

Combination of Landsat 8 OLI and Sentinel-1 SAR Time-Series Data for Mapping Paddy Fields in Parts of West and Central Java Provinces, Indonesia

Department of Geographic Information Science, Faculty of Geography, Gadjah Mada University, Bulaksumur, Yogyakarta 55281, Indonesia

^*

Author to whom correspondence should be addressed.

ISPRS Int. J. Geo-Inf. 2020, 9(11), 663; https://doi.org/10.3390/ijgi9110663

Submission received: 4 October 2020 / Revised: 25 October 2020 / Accepted: 2 November 2020 / Published: 4 November 2020

(This article belongs to the Special Issue Earth Observation and GIScience for Agricultural Applications)

Download

Browse Figures

Versions Notes

Abstract

:

The rise of Google Earth Engine, a cloud computing platform for spatial data, has unlocked seamless integration for multi-sensor and multi-temporal analysis, which is useful for the identification of land-cover classes based on their temporal characteristics. Our study aims to employ temporal patterns from monthly-median Sentinel-1 (S1) C-band synthetic aperture radar data and cloud-filled monthly spectral indices, i.e., Normalized Difference Vegetation Index (NDVI), Modified Normalized Difference Water Index (MNDWI), and Normalized Difference Built-up Index (NDBI), from Landsat 8 (L8) OLI for mapping rice cropland areas in the northern part of Central Java Province, Indonesia. The harmonic function was used to fill the cloud and cloud-masked values in the spectral indices from Landsat 8 data, and smile Random Forests (RF) and Classification And Regression Trees (CART) algorithms were used to map rice cropland areas using a combination of monthly S1 and monthly harmonic L8 spectral indices. An additional terrain variable, Terrain Roughness Index (TRI) from the SRTM dataset, was also included in the analysis. Our results demonstrated that RF models with 50 (RF50) and 80 (RF80) trees yielded better accuracy for mapping the extent of paddy fields, with user accuracies of 85.65% (RF50) and 85.75% (RF80), and producer accuracies of 91.63% (RF80) and 93.48% (RF50) (overall accuracies of 92.10% (RF80) and 92.47% (RF50)), respectively, while CART yielded a user accuracy of only 84.83% and a producer accuracy of 80.86%. The model variable importance in both RF50 and RF80 models showed that vertical transmit and horizontal receive (VH) polarization and harmonic-fitted NDVI were identified as the top five important variables, and the variables representing February, April, June, and December contributed more to the RF model. The detection of VH and NDVI as the top variables which contributed up to 51% of the Random Forest model indicated the importance of the multi-sensor combination for the identification of paddy fields.

Keywords:

harmonics; random forest; classification and regression trees; variable importance

Graphical Abstract

1. Introduction

The effort to monitor the extent of paddy fields has been a focus of attention for many years. Many countries consume rice as the main staple, especially large, densely populated countries such as Indonesia. However, the growing population in Indonesia put stress on the extent of paddy field areas, and brought about the conversion of agricultural areas to built-up environments, especially in the lowland areas of Java Island [1]. The shrinkage in the paddy field extent in Indonesia has largely been attributed to the expansion of settlement areas [2]. For instance, in Yogyakarta, the growth of settlement areas absorbed 16% of the paddy field areas between 1972 and 1990, which also facilitated the growing road network [3]. Protecting agricultural lands remains a challenge due to weak law enforcement [4]. Therefore, the accurate mapping and monitoring of paddy fields are needed, especially to mitigate the over-conversion of productive paddy fields to other land-use types.

Various remote sensing data and methods have been used in various studies for paddy field mapping. These studies have employed spectral, textural, thermal, and temporal properties, using each variable alone or as a combination of spectral, textural, thermal, and temporal properties. Studies have also employed medium-to-coarse spatial resolution and passive and active sensors. Because the agricultural cropping pattern is largely dependent on climatic factors [5], paddy fields can be characterized by their temporal properties. Early efforts to map paddy fields based on their temporal properties used a MODIS dataset due to its dense observation time, which offers a better chance of obtaining cloud-free observation [6,7,8,9,10]. Although Xiao, Boles, Frolking, Li, Babu, Salas and Moore III [7] emphasized discrepancies in the detected size of paddy field areas due to the inability of a 500 m spatial resolution to detect small patches of field area, and due to the contamination of clouds. In addition, the combination with other data, such as land surface temperature [11], and higher spatial resolution data such as Landsat [12,13,14], has increased the accuracy of paddy field identification, which demonstrated the benefit of multi-sensor integration for mapping paddy fields.

Multiple studies have benefited from the characteristics of radar that can penetrate clouds and operate both in the day and at night. Moreover, a radar signal is able to describe the phenology of rice fields at different paddy growth stages due to their distinct difference from the radar interaction with the plant water content, background water and plant structure such as plant height, leaf size, stalk diameter and crop yield [15]. For instance, during the flooding stage, a darker tone can be found due to the specular reflection of the radar pulses when interacting with water, while during the growing stages, increasing backscatter values are found due to the multiple reflection and volume scattering from the plant [15]. Thus, this triggered the multi-temporal analysis using radar sensors, such as ALOS/PalSar [16,17], which has been used with relatively high accuracy, although misclassification between dryland crops and orchard due to their similar backscatter amplitude has been observed [16]. Currently, paddy field mapping has benefited from freely available c-band dual-polarization Sentinel-1 SAR data with 10 m spatial resolution and 12-day temporal resolution, [18,19,20,21,22,23]. In addition to the use of Sentinel-1 data, combinations with other data, such as Landsat 8 [24,25,26,27] and Sentinel-2 [26,28,29], have also been employed in order to better exploit spectral, textural, and temporal information for the identification of paddy fields.

The usage of multi-temporal and multi-sensor datasets places a demand on computational time and processing, especially when large-area mapping is needed. Therefore, there has been a growing use of the cloud computing platform of Google Earth Engine (GEE) [30], developed in 2010, which aims to dissipate the hardware limitations for multi-scale computation, from local to the global scale. From 2011 to 2017, there were approximately 300 scientific studies employed the GEE services for various earth observation studies [31]. The studies focused on crop mapping, water resources, and land-use and land-cover change mapping and were conducted mostly by using optical medium resolution optical data, such as Landsat, followed by the c-band Sentinel-1 data [32]. Overall, the GEE platform has offered the capability for multi-scale geospatial computation by using multi-sensor, and multi-temporal remote sensing data.

Several studies to map paddy fields have benefited from the multi-temporal and multi-sensor data archive in the GEE platform [20,23,28,33]. Nevertheless, a problem remains with the inclusion of optical imagery in the analysis which requires additional processing to compensate for important missing data during the rainy season, which is the start of the rice planting season or the flooding stage [25]. The way to avoid cloudy images is to select relatively cloud-free images for use in the classification analysis, as conducted by Onojeghuo, Blackburn, Wang, Atkinson, Kindred and Miao [27]. However, this would mean that temporal details would be missing from the optical data, and additional processing would be required to fill the gap due to the cloud-and cloud shadow.

Addressing the gap or missing values due to the clouds or cloud shadows in the optical data is important for paddy field mapping. The missing values could altered the phenology and resulted to the misclassification of the cropping pattern of the paddy field [34]. Reconstructing the data missing due to atmospheric disturbances can be performed by using various gap-filling techniques from spatial, temporal, and spatial-temporal interpolation [35]. The use of temporal interpolations such as Savitzky–Golay or Whitaker, and the singular spectrum analysis filter is widely accepted, although performance is degraded if the gap is more than 20% [36]. Another approach uses harmonic curve functions for filling in missing values [37,38,39], which were beneficial be used as the input for the classification and regression of multi-temporal data [40,41,42] and also to map paddy field types [43].

The efforts to integrate the temporal information or growing stages of paddy fields by using Landsat data were usually carried out by identifying the inundation/flooding stage by using statistical approaches employing the spectral indices such as water and vegetation index to compensate for the missing data due to clouds and cloud shadows [33,34]. Our study aims to explore the mapping applicability and accuracy from combining the 2019 monthly data from the spectral indices from gap-filled Landsat 8 OLI, and Sentinel-1 time-series data for mapping paddy fields with the testing areas of Tegal and Cirebon districts, West and Central Java Provinces, Indonesia. A harmonic function was used to fill in the data that were missing due to clouds and cloud shadows in Landsat 8 OLI, and was simultaneously used with Sentinel-1 monthly-median data as the input for Random Forest (RF) and Classification and Regression Tree (CART) classification.

2. Materials and Methods

2.1. Study Area

The study took place in the Tegal and Cirebon districts, West and Central Java Provinces, Indonesia, at the location 108.6° E to 109.4° E, and 6.7° N to 7.1° N (Figure 1). The study area is a lowland coastal plain dominated by urban, aquaculture, cropland, and paddy field land cover. These areas supply the demand for rice, where the Cirebon district provides approximately 37.2% of rice for Jakarta, the capital city of Indonesia [44]. These areas are in the top ten areas with the highest rate of conversion from agricultural to non-agricultural use, with a loss percentage of 8.76% and 5.88% for Cirebon and Tegal districts, respectively, based on the data from 2009 to 2013 [45].

2.2. Harmonic Regression for Landsat 8 OLI

The 2019 Landsat 8 OLI Tier 1 (LANDSAT/LC08/C01/T1_SR) data archive in GEE was used in this study to generate the temporal spectral indices used as the input for classification. We performed cloud and cloud shadow masking using quality bands from the Landsat dataset, and topography correction to the Landsat 8 OLI data to remove the effect of terrain and terrain shadow from the pixel values. The topography correction followed Sun–Canopy–Sensor correction with the moderator C (SCS+C) method in GEE using the code from Poortinga, et al. [46].

The SCS+C method, developed by Soenen, et al. [47] was a development from the original SCS method [48], which used the parameter of the terrain slope (

α

), solar zenith angle (

θ

), and incidence angle (i). However, the SCS method tends to overestimate at a certain terrain configuration so that the combination with a C-parameter from C-correction method [49] was introduced. The C-parameter was derived by the ratio of the regression slope (b) and intercept (a) from the linear relationship (Equation (1)) from the uncorrected reflectance (L) with the cosine of local incidence angle (i) (Equation (2)). The C-parameter was then introduced to the original SCS approach as the additive component (Equation (3)). The details of SCS+C correction to produce topographic corrected reflectance (Ln) can be written, as follows:

L = a + b \cos i

(1)

C = \frac{a}{b}

(2)

L n = L \frac{\cos α \cos θ + C}{\cos i + C}

(3)

In GEE, the terrain slope was derived by using the digital elevation model (DEM) from the Shuttle Radar Topographic Mission (SRTM; USGS/SRTMGL1_003), with the solar zenith angle was taken from the Landsat-8 metadata, while the incidence angle was calculated by subtracting the solar azimuth angle with the aspect variable from the DEM data.

Following the topographic correction, several spectral indices such as the Normalized Vegetation Index (NDVI) (Equation (4)), Modified Normalized Difference Water Index (MNDWI) (Equation (5)) [50] and Normalized Difference Built Up Index (NDBI) (Equation (6)) [51], were calculated to obtain the temporal information used for identifying irrigated areas [52]. The following spectral bands of green, red, Near Infrared (NIR), and Shortwave Infrared (SWIR1) from Landsat 8 OLI were used in the calculation:

N D V I = \frac{N I R - R e d}{N I R + R e d}

(4)

M N D W I = \frac{G r e e n - S W I R 1}{G r e e n + S W I R 1}

(5)

N D B I = \frac{S W I R 1 - N I R}{S W I R 1 + N I R}

(6)

The spectral indices NDVI, MNDWI, and NDBI from Landsat 8 OLI contained missing values due to cloud and cloud-shadow masking by using the quality assessment (QA) layer, so additional gap-filling was conducted by using harmonic regression. The harmonic regression function is a frequency domain time-series analysis that reconstructs the periodic/seasonality waves from the sine and cosine functions of the original time-series data [39,53]. The harmonics models to calculate the fitted time-series data (Y(t)), based on Clinton [54] and Adams, et al. [55], can be expressed in the Equation (7) as follows:

Y (t) = β_{0} + β_{1} t + \sum_{j \geq 1} (β_{2} \cos (2 π j t) + β_{3} \sin (2 π j t)) + e_{t}

(7)

Whereas t represents the time bands since 1 January 1970, j represent the order of harmonics,

β_{0}

and

β_{1} t

represent the intercept and slope (trend) coefficients,

β_{2}

and

β_{3}

represent the independent fitting coefficients, and

e_{t}

represents the residual error term.

Seasonal values obtained from fitting harmonic functions have been used in various mapping applications that use remote sensing data, especially for land-cover classification [56,57] and the detection of land-use changes [41]. Harmonic regression implementation in GEE was adapted from Nicholas Clinton’s code [54] for fitting one-cycle seasonality per year with the NDVI, MNDWI, and NDBI values. The selection of the first harmonic was used to avoid over-fitting and to represent the overall productivity of the study area [39,42]. Following the harmonic regression analysis, median composites from the values of each of the harmonic-fitted spectral indices were derived to generate the monthly data for 2019. Median values were chosen due to the ability to better represent the vegetation biophysical properties [58] with the median value’s statistical nature, which was not being affected by the outliers. Some examples of harmonic fitted NDVI values (taken from a pixel of the study area) can be found in Figure 2.

2.3. Sentinel-1 Time Series Process

The 2019 archive of 10 m spatial resolution C-band Synthetic Aperture Radar (SAR) Sentinel-1 Ground Range Detected (GRD) data in GEE in the data collection COPERNICUS/S1_GRD was used in this study. The archive data of Sentinel-1 GRD in GEE undergo several processes, such as thermal noise removal, radiometric calibration, terrain correction by using the DEM data from SRTM or ASTER GDEM for higher latitude data (>60° North and South), and log scaling conversion from backscatter coefficient

(σ °)

to decibels (db) by using the following equation (Equation (8)).

d b = 10 \times \log σ °

(8)

In addition, we aggregated the values by taking the mean values to change the pixel size from 10 m to 30 m to follow the spatial resolution of Landsat 8 OLI. We employed the dual vertical transmit and vertical receive (VV) polarization and vertical transmit and horizontal receive (VH) polarization. The monthly median composites of the ascending orbit of Sentinel-1 from both VV and VH polarizations in 2019 from January to December were generated and used as the input for classification. The median value was chosen and used for the same reason as mentioned previously on the Landsat-8 OLI data processing.

2.4. Ancillary Data

Ancillary data in the form of terrain configuration data were used for the classification. The inclusion of terrain configuration in the classification process is thought to increase the accuracy of land cover classification [58,59]. The Shuttle Radar Topographic Mission (SRTM) data from the GEE database (USGS/SRTMGL1_003) with a pixel size of 30 m over the study area was used to generate the Terrain Ruggedness Index (TRI) data. The TRI calculation (Equation (9)) was developed by Riley, et al. [60], and represents terrain roughness from the elevation difference between the center pixel (Xoo) and the surrounding pixels (Xij) within a specified moving window on the original elevation raster image.

T R I = \sqrt[2]{\sum {(X o o - X i j)}^{2}}

(9)

2.5. Classification and Accuracy Assessment

This study employed RF and CART algorithms, available on the GEE system. RF is an algorithm developed by Breiman [61], which can be used for classification and regression analysis. On the classification task, the RF algorithm works by creating a set of tree classifiers from random subsets of the original data, and the majority vote is used as the basis for the classification decision. The tuning parameter for the number of trees in RF was determined by using the grid search method with values of 10 to 100 and an interval of 10. The Out-of-Bag (OOB) error for each iteration was compared, and the model with the lowest OOB error was selected. The variable importance result was then extracted. The OOB is a byproduct of a bagging algorithm, including the RF algorithm, which employs the unused samples in the bootstrap process and is beneficial for evaluating the error on the generated model [62].

In addition, the CART algorithm, originally developed by Breiman, et al. [63] in 1984, was used for the comparison. CART is a classifier that works by creating rules (which can be displayed in the form of a decision tree) for classification or regression by minimizing the heterogeneity measured from the heterogeneity criterion, i.e., the Gini Index or Entropy [64]. By looking at the RF and CART methods, RF can be considered as a development of CART that uses multiple trees in the decision-making process. However, CART is relatively simple, and its computational burden is less heavy than RF, which can be useful in avoiding the 1 million points computation limitation in GEE, which would undermine our comparison analysis.

Both RF and CART classifications used the same input variables. The list of input variables and their codenames are listed in Table 1. The total accumulated number of variables used from the monthly Landsat 8 and Sentinel 1, including the TRI from SRTM, was 61, with the pixel dimension of 867 × 480 pixels for each layer. Therefore, a tileScale parameter was set during the classification process to 8 to avoid 100 mb memory limit in GEE. To train classification algorithm, 174 points of training data were collected which consisted of 62 points for paddy fields, 80 points for non-paddy fields, and 32 points for water body.

An accuracy assessment was conducted by constructing the error matrix using randomly sampled points. A total of 3000 points were distributed, and the land cover information for each point was added by performing a visual check using the Google Map high-resolution satellite view. Following the visual inspection, several accuracy components, namely, producer (PA), user (UA), and overall accuracies (OA), were calculated. Before conducting the accuracy assessment, the classification maps were filtered using a majority filter in a 3 × 3 window to remove small, isolated classes from the final maps.

The workflow of our analysis is shown in Figure 3.

3. Results

3.1. Out-of-Bag Error (OOB) and Variable Importance

A grid search and the corresponding OOB error values for each iteration were calculated (Figure 4). From the grid search, we found that the error ranged from 8.046% to 15.517%. The lowest errors were obtained by setting the number of trees to 50 and 80 trees with an OOB value of 8.046%. Therefore, the two classifications were performed by using 50 and 80 trees as the parameter setting, and the variable importance values from these two classification analyses were derived.

Figure 5A,B showed the top 15 important variables with the highest importance values on the RF50 and RF80 models. The graphs showed that the VH backscatter in August and in October was listed as the most important variable in the RF50 and RF80 models. The VH backscatter was then followed by the FNDVI in April and January as the second greatest contributing variable in the RF50 and RF80 models. Both Figure 5A,B show that the VH and FNDVI variables for different months were included in the top 15 variables. In the RF50 model, the FNDBI in March was identified as one of the top 15 variables.

Previous figures showed the top variables contributing to the RF50 and RF80 models. Because every variable used in the RF analysis possessed importance values, they can be summarized based on the date (the month the data represents), and the type of dataset itself (Figure 6A,B). The TRI variable was not included in this analysis because it did not represent the date, and only consisted of one layer. In addition, the TRI showed relatively poor importance values both for the RF50 and RF80 models, and ranked 53 and 59 out of 61 variables, correspondingly.

Figure 6A shows that the important variables usually were from the months of February, April, June, and December for the RF50 and RF80 models. Most of the months, except for June, represented the rainy season in Indonesia, which is the growing season for croplands, especially paddy fields. In addition to the date, Figure 5B shows that the summary of VH polarization and FNDVI performed better than other datasets, such as FMNDWI, VV polarization, and FNDBI.

Figure 6B shows the superiority of the cross-polarization of VH from Sentinel-1C as the better dataset for classifying paddy fields followed by FNDVI and FMNDWI. Based on the summary in Figure 6B, VH polarization contributed from 27.9 to 28.2% of the model importance, while FNDVI contributed from 22.8 to 22/9% of the model importance in RF50 and RF80, respectively. Figure 7 shows the temporal variability of VH backscatter in the study area to better understand the phenology of paddy fields over a year. Here, the higher backscatter values can be seen from February to July 2019, which can be attributed to the increasing double bounce from the interaction between the paddy plants and the water background. From the figure, it is evident that not all paddy fields have similar patterns, which can also be attributed to the different cropping patterns and cropping intensities in this area.

3.2. Classification Results and Accuracy Assessment

Figure 8B–D display the classification results from CART and two RF models (RF50 and RF80 models) with the background from VH polarization taken in October 2019. VH polarization was used as it was identified as the best contributing variable in the RF80 model. The paddy field distribution had approximately −20 decibels of VH backscatter, which was more than aquaculture and seawater backscatter and lower than the urban class. The October backscatter values of paddy fields in this month are influenced by an interaction of radar pulse with open bare soil. The lowest backscatter was represented by aquaculture and seawater due to the low radar pulse that was returned to the sensor due to the specular reflection from the water content, while urban features possessed the highest backscatter values due to the double-bounce interaction between the building and the radar signal.

Figure 8B–D display the classification results (majority filtered) from CART and two RF models (the RF50 and RF80 models). All maps show a relatively similar spatial distribution, although CART detected fewer paddy fields than did other maps of the RF50 and RF80 models, and inconsistencies between maps were found. One inconsistency was in the separation between dryland farming areas (green box in Figure 7) as detected from the Google Earth imagery, which CART detected as paddy fields, whereas the RF models consistently detected fewer paddy fields for that particular area. Such misclassification resulted in low accuracy for the CART map as compared to the RF maps for PA, UA, and OA (Table 2). The low PA percentage for the CART map indicated a larger omission error for paddy fields as evidenced by the lesser number of paddy fields detected. The complete error matrix can be found in Appendix A.

4. Discussion

GEE provided a platform for processing multi-sensor and multi-temporal remote sensing data easily. We conducted the analysis in GEE by using a combination of Sentinel-1 and Landsat 8 monthly time-series data for mapping paddy fields in parts of West and Central Java Provinces, Indonesia. Several variables, such as the VV and VH backscatters, harmonics-fitted NDVI, NDBI, and MNDWI time-series, and TRI from the Sentinel-1, Landsat 8, and SRTM data were used as the input for the RF and CART classification algorithms. From the analysis, the variable importance was identified and the classification accuracies were determined.

The variable importance recapitulation (Figure 6) showed importance based on the seasonality and the type of dataset. Based on seasonality, the data collected in February, April, June, and December were identified as the important variables for the paddy planting season of the area. The seasonal patterns of the paddy field, including rewetting and transplanting, were distinct. It was useful to separate this class from other classes that showed seasonal flooding, such as the wetlands class [13]. In addition, Lasko, Vadrevu, Tran and Justice [21] and Kontgis, Warren, Skillman, Chartrand and Moody [19] also found that the backscatter values during the flooding and harvesting stage were important for separating paddy fields from other classes. The unique characteristics of paddy fields during different planting stages, especially the radar backscatter, were defined by the interaction of the radar pulse between plants and water [65], which helped to define paddy field areas and other ancillary information, such as crop intensity and crop rotation.

Each dataset of radar backscatters and spectral indices used in this study showed different importance levels for identifying paddy fields. Figure 5 shows VH polarization as the best variable for identifying paddy fields, with FNDVI as the second most important variable. Our finding of the best variables was similar to that in the study by Onojeghuo, Blackburn, Wang, Atkinson, Kindred and Miao [27], which identified VH polarization and NDVI time-series data as the best combination for identifying paddy fields, while Lasko, Vadrevu, Tran and Justice [21] suggested that the ratio data between time-series VV and VH performed better for identifying paddy fields, although the accuracy was only 0.4 to 1.1% better than when using VH polarization alone. The identification of VH polarization as the best predictor can be attributed to lower sensitivity to the water background than VV polarization, so that the values of VH polarization reflect the phenology of the paddy growth better [27]. Nonetheless, the identification of VH and harmonic-fitted NDVI (FNDVI) variables as important variables marked the potential of the synergy between time-series and multi-sensor data for paddy field mapping and the ability of the gap-filled method of harmonic function to reconstruct important temporal characteristics in mapping paddy fields.

Apart from the variable derived from Landsat 8 and Sentinel-1 data, our study also identified the low contribution of the TRI variable to the classification model. This may have resulted from the characteristics of the study area, which is relatively plain, so there is no distinct difference in the terrain configuration between the paddy field and non-paddy field classes. However, TRI’s importance would be greater if the study area had a complex terrain, as this would undermine the distribution of land-use types.

From an accuracy perspective, the accuracy of maps produced by the RF models was relatively better than the results of the CART algorithm, although CART had a faster and relatively lower computational burden in GEE than did RF. From the RF model results (Table 2), it is evident that the UA for the paddy fields is relatively lower than the PA, which indicates the misclassification in the form of omission of the non-paddy field classes that were included as paddy fields. The misclassification in RF50 and RF80 were identical with the 144 and 140 points of omission error (Appendix A) and were contributed by the omission of 2 land cover types: (1.) dryland farming areas (Figure 9A,B), and (2.) riparian areas (Figure 9C,D). The omission of dryland farming as paddy field mapping was also occurred at the study using time-series ALOS/PALSAR data due to the relatively similar backscatter values [16] Although, there was also a possibility that inter-cropping of paddy during the rainy season and other crops occurred in the dryland farming area, which resulted in misclassification. On the other hand, the omission of the riparian areas as paddy fields can be caused by the repeated inundation during the rainy season, which similar to the characteristics of paddy fields.

In our study, GEE provided a friendly interface for generating monthly variables from Landsat-8 and Sentinel-1 data and integrating those into the classification process. The benefits of using GEE can be stressed on the easy implementation to access and pre-process the original time-series dataset from Landsat-8 and Sentinel-1 data to create monthly composites, thus resulting in the less noisy data in the temporal composited data and the less computational time on data access and data storage compared to the conventional process. However, the usage of GEE was not always without a problem. In our study, the main problem was the memory limit when training the RF classification by using 61 variables and 174 training points, as mentioned by Gorelick, Hancher, Dixon, Ilyushchenko, Thau and Moore [30] as the scaling challenges and computational model mismatch when the process is not parallelizable and large dataset was used [32]. The memory limit problem can be solved by increasing the tile size, reducing the training points, and reducing input layers by selecting only the important variables. Therefore, the variable importance result of our study can be used as the basis for variable selection for larger-scale application of paddy field mapping.

In addition, Tamiminia, Salehi, Mahdianpari, Quackenbush, Adeli and Brisco [32] also emphasized the limited option of the classification algorithm in GEE, although future improvement can be initiated with the GEE collaboration with Tensorflow. So that, in the future, extended studies by using GEE may benefit from the development of deep learning methods for time-series satellite imagery data, as applied in convolutional neural networks [66], recurrent neural networks [67], long short-term memory [68], and temporal convolutional neural networks [69].

5. Conclusions

Our study highlighted the applicability of Landsat 8 and Sentinel-1 monthly time-series data for mapping paddy fields in parts of Central and West Java Provinces, Indonesia, by using GEE. The results of variable importance showed VH polarization time-series data as the best variable for detecting paddy fields due to the better sensitivity of the VH-backscatter in the planting stage of the paddy with the harmonic fitted NDVI data were also identified as the second most important variable, and followed by FMNDWI and VV Polarization as the third and fourth-best datasets. The result justifies the importance of multi-sensor mapping for paddy fields. Together, VH polarization and FNDVI contributed from 50.9 to 51.1% of the model importance in RF, while a combination of VH, FNDVI, FNDWI, and VV polarization summed up to 88.3% of the RF model. The identification of FNDVI as the second important variable also showed the ability of the harmonic function to retain important temporal information, which is needed for mapping paddy fields. In addition, although GEE provided good computational capability using big remote sensing data, the lingering memory limit could be one problem that should be taken into account, especially for mapping by using larger datasets for large areas. Therefore, future studies wanting to employ time-series variables for mapping paddy fields by using GEE could focus on the identified important variables in our study. In terms of accuracy, the RF models with 50 (RF50) and 80 (RF80) trees showed a better ability for classifying paddy fields than the CART models, with a PA of 93.48% and 91.63%, and a UA of 85.66% and 85.76%, correspondingly. The lower UA indicated more omission error for the misclassification of dryland farming and periodically flooded riparian areas to be identified as a paddy field in some parts of the study area, although it is possible that some of the dryland farming areas could have been planted with paddy, especially during the rainy season. In the future, this study could benefit from exploration with deep learning analysis for time-series data to better exploit the temporal dimension in identifying paddy fields.

Author Contributions

Conceptualization: Sanjiwana Arjasakusuma; Methodology, Sanjiwana Arjasakusuma, Sandiaga Swahyu Kusuma, Raihan Rafif; Data Curation, Sandiaga Swahyu Kusuma, Raihan Rafif, Siti Saringatin; Writing-Original Draft Sanjiwana Arjasakusuma; Writing-Review and Editing, Sanjiwana Arjasakusuma; Visualization, Sandiaga Swahyu Kusuma; Supervision, Pramaditya Wicaksono. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Directorate of Research and Community Service, Directorate General of Research and Development Strengthening Ministry of Research, Technology and Higher Education of the Republic of Indonesia (Direktorat Riset dan Pengabdian Masyarakat, Direktorat Jenderal Penguatan Riset dan Pengembangan Kementerian Riset, Teknologi dan Pendidikan Tinggi Republik Indonesia), under the Basic Research University Grant with the contract number 1678/UN1/DITLIT/DIT-LIT/PT/2020 for fiscal year 2020/2021.

Acknowledgments

We thank anonymous reviewers for constructive comments on the manuscript, Earth Engine Developers Forum for the insight and code sharing, Google for providing access to the Google Earth Engine, and the data providers (USGS, and European Space Agency) that supplied the remote sensing data in GEE.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. Complete error matrix for the RF and CART classification.

RF_50
Class	Reference				UA
Class	Paddy Field	Non-Paddy Field	Water Body	Total	UA
Paddy Field	860	144	0	1004	85.66
Non-Paddy Field	60	932	4	996	93.57
Water Body	0	18	982	1000	98.20
Total	920	1094	986	3000
PA	93.48	85.19	99.59
OA	92.47
RF 80
Paddy Field	843	140	0	1004	83.96
Non-Paddy Field	77	937	3	1017	92.13
Water Body	0	17	983	1000	98.30
Total	920	1094	986	3000
PA	91.63	85.65	1.00
OA	92.10
CART
Paddy Field	744	133	0	1004	74.10
Non-Paddy Field	176	957	60	1193	80.22
Water Body	0	8	926	1000	92.60
Total	920	1098	986	3000
PA	80.87	87.16	0.94
OA	87.57

References

Verburg, P.H.; Bouma, J. Land use change under conditions of high population pressure: The case of Java. Glob. Environ. Chang. 1999, 9, 303–312. [Google Scholar] [CrossRef]
Firman, T. Major issues in Indonesia’s urban land development. Land Use Policy 2004, 21, 347–355. [Google Scholar] [CrossRef]
Dimyati, M.; Mizuno, K.; Kobayashi, S.; Kitamura, T. An analysis of land use/cover change in Indonesia. Int. J. Remote Sens. 1996, 17, 931–944. [Google Scholar] [CrossRef]
Dirman, E.; Saleng, A.; Sapiddin, A. Food agricultural land legal protection to improve food security in Indonesia. EES 2018, 196, 012047. [Google Scholar] [CrossRef]
Wikkramatileke, R. Problems of Land-Use Mapping in the Tropics. An Example from Ceylon. Geography 1959, 44, 79–95. [Google Scholar]
Peng, D.; Huete, A.R.; Huang, J.; Wang, F.; Sun, H. Detection and estimation of mixed paddy rice cropping patterns with MODIS data. Int. J. Appl. Earth Obs. Geoinf. 2011, 13, 13–23. [Google Scholar] [CrossRef]
Xiao, X.; Boles, S.; Frolking, S.; Li, C.; Babu, J.Y.; Salas, W.; Moore III, B. Mapping paddy rice agriculture in South and Southeast Asia using multi-temporal MODIS images. Remote Sens. Environ. 2006, 100, 95–113. [Google Scholar] [CrossRef]
Xiao, X.; Boles, S.; Liu, J.; Zhuang, D.; Frolking, S.; Li, C.; Salas, W.; Moore III, B. Mapping paddy rice agriculture in southern China using multi-temporal MODIS images. Remote Sens. Environ. 2005, 95, 480–492. [Google Scholar] [CrossRef]
Sari, D.K.; Ismullah, I.H.; Sulasdi, W.N.; Harto, A.B. Detecting rice phenology in paddy fields with complex cropping pattern using time series MODIS data. J. Math. Fundam. Sci. 2010, 42, 91–106. [Google Scholar]
Sakamoto, T.; Yokozawa, M.; Toritani, H.; Shibayama, M.; Ishitsuka, N.; Ohno, H. A crop phenology detection method using time-series MODIS data. Remote Sens. Environ. 2005, 96, 366–374. [Google Scholar] [CrossRef]
Zhang, G.; Xiao, X.; Dong, J.; Kou, W.; Jin, C.; Qin, Y.; Zhou, Y.; Wang, J.; Menarguez, M.A.; Biradar, C. Mapping paddy rice planting areas through time series analysis of MODIS land surface temperature and vegetation index data. ISPRS J. Photogramm. Remote Sens. 2015, 106, 157–171. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Qin, Y.; Xiao, X.; Dong, J.; Zhou, Y.; Zhu, Z.; Zhang, G.; Du, G.; Jin, C.; Kou, W.; Wang, J.; et al. Mapping paddy rice planting area in cold temperate climate region through analysis of time series Landsat 8 (OLI), Landsat 7 (ETM+) and MODIS imagery. ISPRS J. Photogramm. Remote Sens. 2015, 105, 220–233. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Zhou, Y.; Xiao, X.; Qin, Y.; Dong, J.; Zhang, G.; Kou, W.; Jin, C.; Wang, J.; Li, X. Mapping paddy rice planting area in rice-wetland coexistent areas through analysis of Landsat 8 OLI and MODIS images. Int. J. Appl. Earth Obs. Geoinf. 2016, 46, 1–12. [Google Scholar] [CrossRef] [Green Version]
Wang, J.; Xiao, X.; Qin, Y.; Dong, J.; Zhang, G.; Kou, W.; Jin, C.; Zhou, Y.; Zhang, Y. Mapping paddy rice planting area in wheat-rice double-cropped areas through integration of Landsat-8 OLI, MODIS, and PALSAR images. Sci. Rep. 2015, 5, 10088. [Google Scholar] [CrossRef] [Green Version]
Aschbacher, J.; Pongsrihadulchai, A.; Karnchanasutham, S.; Rodprom, C.; Paudyal, D.; Le Toan, T. Assessment of ERS-1 SAR data for rice crop mapping and monitoring. In Proceedings of the International Geoscience and Remote Sensing Symposium, IGARSS’95: Quantitative Remote Sensing for Science and Applications, Congress Center, Firenze, Italy, 10–14 July 1995; pp. 2183–2185. [Google Scholar]
Zhang, Y.; Wang, C.; Wu, J.; Qi, J.; Salas, W.A. Mapping paddy rice with multitemporal ALOS/PALSAR imagery in southeast China. Int. J. Remote Sens. 2009, 30, 6301–6315. [Google Scholar] [CrossRef]
Miyaoka, K.; Maki, M.; Susaki, J.; Homma, K.; Noda, K.; Oki, K. Rice-planted area mapping using small sets of multi-temporal SAR data. IEEE Geosci. Remote Sens. Lett. 2013, 10, 1507–1511. [Google Scholar] [CrossRef]
Bazzi, H.; Baghdadi, N.; El Hajj, M.; Zribi, M.; Minh, D.H.T.; Ndikumana, E.; Courault, D.; Belhouchette, H. Mapping paddy rice using Sentinel-1 SAR time series in Camargue, France. Remote Sens. 2019, 11, 887. [Google Scholar] [CrossRef] [Green Version]
Kontgis, C.; Warren, M.S.; Skillman, S.W.; Chartrand, R.; Moody, D.I. Leveraging Sentinel-1 time-series data for mapping agricultural land cover and land use in the tropics. In Proceedings of the 9th International Workshop on the Analysis of Multitemporal Remote Sensing Images (MultiTemp), Bruges, Belgium, 27–29 June 2017; pp. 1–4. [Google Scholar]
Singha, M.; Dong, J.; Zhang, G.; Xiao, X. High resolution paddy rice maps in cloud-prone Bangladesh and Northeast India using Sentinel-1 data. Sci. Data 2019, 6, 1–10. [Google Scholar] [CrossRef]
Lasko, K.; Vadrevu, K.P.; Tran, V.T.; Justice, C. Mapping double and single crop paddy rice with Sentinel-1A at varying spatial scales and polarizations in Hanoi, Vietnam. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 498–512. [Google Scholar] [CrossRef]
Nguyen, D.B.; Wagner, W. European rice cropland mapping with Sentinel-1 data: The Mediterranean region case study. Water 2017, 9, 392. [Google Scholar] [CrossRef]
Minasny, B.; Shah, R.M.; Che Soh, N.; Arif, C.; Indra Setiawan, B. Automated Near-Real-Time Mapping and Monitoring of Rice Extent, Cropping Patterns, and Growth Stages in Southeast Asia Using Sentinel-1 Time Series on a Google Earth Engine Platform. Remote Sens. 2019, 11, 1666. [Google Scholar]
Mansaray, L.R.; Huang, W.; Zhang, D.; Huang, J.; Li, J. Mapping rice fields in urban Shanghai, southeast China, using Sentinel-1A and Landsat 8 datasets. Remote Sens. 2017, 9, 257. [Google Scholar] [CrossRef] [Green Version]
Torbick, N.; Chowdhury, D.; Salas, W.; Qi, J. Monitoring rice agriculture across myanmar using time series Sentinel-1 assisted by Landsat-8 and PALSAR-2. Remote Sens. 2017, 9, 119. [Google Scholar] [CrossRef] [Green Version]
Mansaray, L.R.; Wang, F.; Huang, J.; Yang, L.; Kanu, A.S. Accuracies of support vector machine and random forest in rice mapping with Sentinel-1A, Landsat-8 and Sentinel-2A datasets. Geocarto Int. 2020, 35, 1088–1108. [Google Scholar] [CrossRef]
Onojeghuo, A.O.; Blackburn, G.A.; Wang, Q.; Atkinson, P.M.; Kindred, D.; Miao, Y. Mapping paddy rice fields by applying machine learning algorithms to multi-temporal Sentinel-1A and Landsat data. Int. J. Remote Sens. 2018, 39, 1042–1067. [Google Scholar] [CrossRef] [Green Version]
Inoue, S.; Ito, A.; Yonezawa, C. Mapping Paddy Fields in Japan by Using a Sentinel-1 SAR Time Series Supplemented by Sentinel-2 Images on Google Earth Engine. Remote Sens. 2020, 12, 1622. [Google Scholar] [CrossRef]
Cai, Y.; Lin, H.; Zhang, M. Mapping paddy rice by the object-based random forest method using time series Sentinel-1/Sentinel-2 data. Adv. Space Res. 2019, 64, 2233–2244. [Google Scholar] [CrossRef]
Gorelick, N.; Hancher, M.; Dixon, M.; Ilyushchenko, S.; Thau, D.; Moore, R. Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sens. Environ. 2017, 202, 18–27. [Google Scholar] [CrossRef]
Kumar, L.; Mutanga, O. Google Earth Engine applications since inception: Usage, trends, and potential. Remote Sens. 2018, 10, 1509. [Google Scholar] [CrossRef] [Green Version]
Tamiminia, H.; Salehi, B.; Mahdianpari, M.; Quackenbush, L.; Adeli, S.; Brisco, B. Google Earth Engine for geo-big data applications: A meta-analysis and systematic review. ISPRS J. Photogramm. Remote Sens. 2020, 164, 152–170. [Google Scholar] [CrossRef]
Dong, J.; Xiao, X.; Menarguez, M.A.; Zhang, G.; Qin, Y.; Thau, D.; Biradar, C.; Moore III, B. Mapping paddy rice planting area in northeastern Asia with Landsat 8 images, phenology-based algorithm and Google Earth Engine. Remote Sens. Environ. 2016, 185, 142–154. [Google Scholar] [CrossRef] [Green Version]
Kontgis, C.; Schneider, A.; Ozdogan, M. Mapping rice paddy extent and intensification in the Vietnamese Mekong River Delta with dense time stacks of Landsat data. Remote Sens. Environ. 2015, 169, 255–269. [Google Scholar] [CrossRef]
Weiss, D.J.; Atkinson, P.M.; Bhatt, S.; Mappin, B.; Hay, S.I.; Gething, P.W. An effective approach for gap-filling continental scale remotely sensed time-series. ISPRS J. Photogramm. Remote Sens. 2014, 98, 106–118. [Google Scholar] [CrossRef] [Green Version]
Kandasamy, S.; Baret, F.; Verger, A.; Neveux, P.; Weiss, M. A comparison of methods for smoothing and gap filling time series of remote sensing observations-application to MODIS LAI products. Biogeosciences 2013, 10, 4055–4071. [Google Scholar] [CrossRef] [Green Version]
Roy, D.P.; Yan, L. Robust Landsat-based crop time series modelling. Remote Sens. Environ. 2020, 238. [Google Scholar] [CrossRef]
Roerink, G.; Menenti, M.; Verhoef, W. Reconstructing cloudfree NDVI composites using Fourier analysis of time series. Int. J. Remote Sens. 2000, 21, 1911–1917. [Google Scholar] [CrossRef]
Brooks, E.B.; Thomas, V.A.; Wynne, R.H.; Coulston, J.W. Fitting the multitemporal curve: A Fourier series approach to the missing data problem in remote sensing analysis. IEEE Trans. Geosci. Remote Sens. 2012, 50, 3340–3353. [Google Scholar] [CrossRef] [Green Version]
Liu, J.; Heiskanen, J.; Maeda, E.E.; Pellikka, P.K. Burned area detection based on Landsat time series in savannas of southern Burkina Faso. Int. J. Appl. Earth Obs. Geoinf. 2018, 64, 210–220. [Google Scholar] [CrossRef] [Green Version]
Venkatappa, M.; Sasaki, N.; Shrestha, R.P.; Tripathi, N.K.; Ma, H.-O. Determination of vegetation thresholds for assessing land use and land use changes in Cambodia using the Google Earth Engine cloud-computing platform. Remote Sens. 2019, 11, 1514. [Google Scholar] [CrossRef] [Green Version]
Derwin, J.M.; Thomas, V.A.; Wynne, R.H.; Coulston, J.W.; Liknes, G.C.; Bender, S.; Blinn, C.E.; Brooks, E.B.; Ruefenacht, B.; Benton, R. Estimating tree canopy cover using harmonic regression coefficients derived from multitemporal Landsat data. Int. J. Appl. Earth Obs. Geoinf. 2020, 86, 101985. [Google Scholar] [CrossRef]
Landmann, T.; Eidmann, D.; Cornish, N.; Franke, J.; Siebert, S. Optimizing harmonics from Landsat time series data: The case of mapping rainfed and irrigated agriculture in Zimbabwe. Remote Sens. Lett. 2019, 10, 1038–1046. [Google Scholar] [CrossRef]
Surjasa, D.; Gumbira-Sa’id, E.; Arifin, B.; Jie, F. Indonesian rice supply chain analysis and supplier selection model. Int. J. Inf. Bus. Manag. 2013, 5, 198. [Google Scholar]
Febrina, W.D. Determinants of Paddy Field Conversion in Java 1995-2013. J. Home Aff. Gov. 2017, 9, 1–13. [Google Scholar] [CrossRef] [Green Version]
Poortinga, A.; Tenneson, K.; Shapiro, A.; Nquyen, Q.; San Aung, K.; Chishtie, F.; Saah, D. Mapping plantations in Myanmar by fusing landsat-8, sentinel-2 and sentinel-1 data along with systematic error quantification. Remote Sens. 2019, 11, 831. [Google Scholar] [CrossRef] [Green Version]
Soenen, S.A.; Peddle, D.R.; Coburn, C.A. SCS+ C: A modified sun-canopy-sensor topographic correction in forested terrain. IEEE Trans. Geosci. Remote Sens. 2005, 43, 2148–2159. [Google Scholar] [CrossRef]
Gu, D.; Gillespie, A. Topographic normalization of Landsat TM images of forest based on subpixel sun–canopy–sensor geometry. Remote Sens. Environ. 1998, 64, 166–175. [Google Scholar] [CrossRef]
Teillet, P.; Guindon, B.; Goodenough, D. On the slope-aspect correction of multispectral scanner data. Can. J. Remote Sens. 1982, 8, 84–106. [Google Scholar] [CrossRef] [Green Version]
Han-Qiu, X. A study on information extraction of water body with the modified normalized difference water index (MNDWI). J. Remote Sens. 2005, 5, 589–595. [Google Scholar]
Zha, Y.; Gao, J.; Ni, S. Use of normalized difference built-up index in automatically mapping urban areas from TM imagery. Int. J. Remote Sens. 2003, 24, 583–594. [Google Scholar] [CrossRef]
Benbahria, Z.; Sebari, I.; Hajji, H.; Smiej, M.F. Automatic Mapping of Irrigated Areas in Mediteranean Context Using Landsat 8 Time Series Images and Random Forest Algorithm. In Proceedings of the IGARSS IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 22–27 July 2018; pp. 7986–7989. [Google Scholar]
Ramanathan, K.; Thenmozhi, M.; George, S.; Anandan, S.; Veeraraghavan, B.; Naumova, E.N. Assessing Seasonality Variation with Harmonic Regression: Accommodations for Sharp Peaks. Int. J. Environ. Res. Public Health 2020, 17, 1318. [Google Scholar] [CrossRef] [Green Version]
Clinton, N. Lab 6: Time series analysis. In Introductory Remote Sensing Code Labs; Geospatial Analysis Lab, University of San Francisco: San Francisco, CA, USA, 2019. [Google Scholar]
Adams, B.; Iverson, L.; Matthews, S.; Peters, M.; Prasad, A.; Hix, D. Mapping Forest Composition with Landsat Time Series: An Evaluation of Seasonal Composites and Harmonic Regression. Remote Sens. 2020, 12, 610. [Google Scholar] [CrossRef] [Green Version]
Wilson, B.T.; Knight, J.F.; McRoberts, R.E. Harmonic regression of Landsat time series for modeling attributes from national forest inventory data. ISPRS J. Photogramm. Remote Sens. 2018, 137, 29–46. [Google Scholar] [CrossRef]
Nguyen, M.D.; Baez-Villanueva, O.M.; Bui, D.D.; Nguyen, P.T.; Ribbe, L. Harmonization of Landsat and Sentinel 2 for Crop Monitoring in Drought Prone Areas: Case Studies of Ninh Thuan (Vietnam) and Bekaa (Lebanon). Remote Sens. 2020, 12, 281. [Google Scholar] [CrossRef] [Green Version]
Fahsi, A.; Tsegaye, T.; Tadesse, W.; Coleman, T. Incorporation of digital elevation models with Landsat-TM data to improve land cover classification accuracy. For. Ecol. Manag. 2000, 128, 57–64. [Google Scholar] [CrossRef]
Saha, A.; Arora, M.; Csaplovics, E.; Gupta, R. Land cover classification using IRS LISS III image and DEM in a rugged terrain: A case study in Himalayas. Geocarto Int. 2005, 20, 33–40. [Google Scholar] [CrossRef]
Riley, S.J.; DeGloria, S.D.; Elliot, R. Index that quantifies topographic heterogeneity. Intermt. J. Sci. 1999, 5, 23–27. [Google Scholar]
Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef] [Green Version]
Breiman, L. Out-of-bag Estimation. 1996. Available online: https://www.stat.berkeley.edu/~breiman/OOBestimation.pdf (accessed on 25 September 2020).
Breiman, L.; Friedman, J.; Olshen, R.; Stone, C. Classification and Regression Trees; Chapman and Hall/CRC: Boca Raton, FL, USA, 1984. [Google Scholar]
Bel, L.; Allard, D.; Laurent, J.; Cheddadi, R.; Bar-Hen, A. CART algorithm for spatial data: Application to environmental and ecological data. Comput. Stat. Data Anal. 2009, 53, 3082–3093. [Google Scholar] [CrossRef]
Kuenzer, C.; Knauer, K. Remote sensing of rice crop areas. Int. J. Remote Sens. 2013, 34, 2101–2139. [Google Scholar] [CrossRef]
Zhao, S.; Liu, X.; Ding, C.; Liu, S.; Wu, C.; Wu, L.J.G.; Sensing, R. Mapping rice paddies in complex landscapes with convolutional neural networks and phenological metrics. Gisci. Remote Sens. 2020, 57, 37–48. [Google Scholar] [CrossRef]
Ndikumana, E.; Ho Tong Minh, D.; Baghdadi, N.; Courault, D.; Hossard, L. Deep recurrent neural network for agricultural classification using multitemporal SAR Sentinel-1 for Camargue, France. Remote Sens. 2018, 10, 1217. [Google Scholar] [CrossRef] [Green Version]
Crisóstomo de Castro Filho, H.; Abílio de Carvalho Júnior, O.; Ferreira de Carvalho, O.L.; Pozzobon de Bem, P.; dos Santos de Moura, R.; Olino de Albuquerque, A.; Rosa Silva, C.; Guimarães Ferreira, P.H.; Fontes Guimarães, R.; Trancoso Gomes, R.A. Rice Crop Detection Using LSTM, Bi-LSTM, and Machine Learning Models from Sentinel-1 Time Series. Remote Sens. 2020, 12, 2655. [Google Scholar] [CrossRef]
Pelletier, C.; Webb, G.I.; Petitjean, F. Temporal convolutional neural network for the classification of satellite image time series. Remote Sens. 2019, 11, 523. [Google Scholar] [CrossRef] [Green Version]

Figure 1. Study area, comprised of parts of Cirebon and Tegal districts, West and Central Java Provinces, Indonesia.

Figure 2. Example of harmonic fitted analysis results of Normalized Difference Vegetation Index (NDVI) for a pixel.

Figure 3. Workflow of the analysis. Most of the analyses were conducted on the Google Earth Engine (GEE) platform.

Figure 4. Out-of-bag errors from the grid search when performing parameter tuning.

Figure 5. Variable importance for the two best models, i.e., (A) Random Forests (RF) model with 50 trees (RF50), and (B) RF model with 80 trees (RF80).

Figure 6. Summary of variable importance based on (A) timing (months) of the temporal variable, and (B) type of data.

Figure 7. Monthly temporal progression of paddy field in the year 2019 from vertical transmit and horizontal receive (VH) polarization on Sentinel-1 monthly median composites data.

Figure 8. Paddy classification maps (yellow areas) from (B) Classification And Regression Trees (CART) model, (C) RF50 model, and (D) RF80 model. The background image was represented by the VH polarization for October data (A). The green box shows the area of dryland farming where most inconsistencies between maps were found.

Figure 9. Examples of the omission errors in the RF models, which represents dryland farming (top row figures; (A,B)), and riparian areas (bottom row figures; (C,D)) with the red dots represent the validation points, and background shows Google Earth imageries.

Table 1. List of variables used in the classification process.

Data	Variables	Codename *	Number Variables
Landsat 8	Harmonic-fitted NDVI	FNDVI1 to FNDVI12	12
	Harmonic-fitted MNDWI	FMNDWI1 to FMNDWI12	12
	Harmonic-fitted NDBI	FNDBI1 to FNDBI12	12
Sentinel-1	VV Polarization	VV1 to VV12	12
Sentinel-1	VH Polarization	VH1 to VH12	12
SRTM	Terrain Ruggedness Index	TRI	1
Pixel Dimension: 867 × 480 pixels		Total Layer	61

* the number on the codenames represents the month.

Table 2. Accuracies of CART and RF maps for the paddy field class.

Model	PA (%)	UA (%)	OA (%)
CART	80.87	84.83	87.57
RF50	93.48	85.66	92.47
RF80	91.63	85.76	92.10

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Arjasakusuma, S.; Swahyu Kusuma, S.; Rafif, R.; Saringatin, S.; Wicaksono, P. Combination of Landsat 8 OLI and Sentinel-1 SAR Time-Series Data for Mapping Paddy Fields in Parts of West and Central Java Provinces, Indonesia. ISPRS Int. J. Geo-Inf. 2020, 9, 663. https://doi.org/10.3390/ijgi9110663

AMA Style

Arjasakusuma S, Swahyu Kusuma S, Rafif R, Saringatin S, Wicaksono P. Combination of Landsat 8 OLI and Sentinel-1 SAR Time-Series Data for Mapping Paddy Fields in Parts of West and Central Java Provinces, Indonesia. ISPRS International Journal of Geo-Information. 2020; 9(11):663. https://doi.org/10.3390/ijgi9110663

Chicago/Turabian Style

Arjasakusuma, Sanjiwana, Sandiaga Swahyu Kusuma, Raihan Rafif, Siti Saringatin, and Pramaditya Wicaksono. 2020. "Combination of Landsat 8 OLI and Sentinel-1 SAR Time-Series Data for Mapping Paddy Fields in Parts of West and Central Java Provinces, Indonesia" ISPRS International Journal of Geo-Information 9, no. 11: 663. https://doi.org/10.3390/ijgi9110663

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Combination of Landsat 8 OLI and Sentinel-1 SAR Time-Series Data for Mapping Paddy Fields in Parts of West and Central Java Provinces, Indonesia

Abstract

1. Introduction

2. Materials and Methods

2.1. Study Area

2.2. Harmonic Regression for Landsat 8 OLI

2.3. Sentinel-1 Time Series Process

2.4. Ancillary Data

2.5. Classification and Accuracy Assessment

3. Results

3.1. Out-of-Bag Error (OOB) and Variable Importance

3.2. Classification Results and Accuracy Assessment

4. Discussion

5. Conclusions

Author Contributions

Funding

Acknowledgments

Conflicts of Interest

Appendix A

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI