1. Introduction
Monitoring the spatial-temporal dynamics of land cover and use in riparian zones is essential to understand the numerous surface processes that can occur in these areas [
1]. Deforestation and inadequate use are some of the most notorious problems in many environmentally fragile riparian zones. These regions have an important role in environmental conservation, providing multiple ecosystem services [
2]. With the increasing loss worldwide of wetlands and riparian areas [
2], an accurate mapping of forest vegetation is required to define strategies for both monitoring and conservation. Fine-scale mapping of forest vegetation in riparian zones may provide information to support different tasks, such as maintaining the quality of water resources. Riparian zones offer an ecological function essential to wildlife and human communities that are gathered around its proximities. In this regard, investigating methods that provide an accurate description of these forest fragments is a relevant scientific task.
Satellite imagery consists of a substantial source of information to map forests since they regularly register a wide geographic area and are particularly suited to support the changing detection tasks [
3]. Moreover, satellite imagery is a potential solution for mapping land use at several cartographic scales [
4]. An important orbital platform for monitoring these areas is the Sentinel-2. Offering multispectral images with 10–60 m spatial, 13 spectral bands, and 5-day temporal resolutions, Sentinel-2 images opened many opportunities to investigate the fine-scale mapping of vegetation, including inside the riparian zones. However, digital image classification is a task whose accuracy strongly depends on the availability of data and on the applied method used to perform it [
5]. Image classification tasks were initially performed with conventional supervised methods like maximum likelihood, minimum distance, Mahalanobis distance, and others, as well as with unsupervised methods like K-means and isodatdo a [
6]. Nonetheless, new approaches are required to aid this issue, especially because of the technological advances that permitted the construction of sensors able to acquire images with high spatial-spectral-temporal resolutions, thus, producing a large amount of data to be analyzed.
Machine learning (ML) techniques are a current and promising alternative to process remote sensing data [
7] and are applied in many data processing and analysis tasks [
8]. These learners can be used to model different sets-of-data using a robust approach [
7,
9]. Moreover, ML methods allow establishing non-parametric and nonlinear relationships between the independent variables and dependent variables (usually the target), resulting in overall better performance when compared to the conventional linear models [
10]. Regardless, as no universal learner exists, multiple tests are needed for different types of applications.
Several approaches have been developed with ML and multispectral imagery for mapping the spatial distribution of vegetation areas. A study [
11] investigated the performance of the random forest (RF) and support vector machine (SVM) algorithms for high-resolution multispectral imagery classification. These images were acquired with embedded sensors in an Unmanned Aerial Vehicle (UAV). The reliability of ML learners was also verified in the mapping task of invasive trees in riparian zones, also with UAV-imagery [
12]. Adopting visible and near-infrared data, spectral and texture features were computed at various scales (10, 30, 45, 60), and the most relevant variable (or combination of variables) was identified with a supervised classification model based on the RF algorithm.
A recent study [
13] pointed out that ML algorithms can be used for mapping vegetation areas, and they are especially applicable when training data consist of a large number of observations and covariates. More recently, another research [
14] evaluated several multi-temporal Sentinel-2 images, making combinations of spectral bands and applying principal component analysis and tasseled cap transformations. They used it as input to four ML techniques (SVM, nearest neighbor—KNN, RF, and classification trees—CT), aiming to separate vegetation species. The best results were returned by SVM, but the authors pointed out that further work is needed to determine whether these results are replicable in other vegetation types and regions. Combinations of spectral bands from Sentinel-2 data were also tested [
15] to evaluate the performance of the RF for mapping tree species on different dates, and obtained an averaged accuracy of 80%.
Although remote sensing images have proved their potential to support mapping tasks into different contexts, like land use and land cover, its application for forest vegetation mapping in the riparian zones exclusively is still limited. Up to our knowledge, few studies were conducted [
11,
12] in this manner, and they were mainly with UAV-imagery. UAV-imagery provides an important data source for many applications, like high-detailed vegetation maps, but it is worth mentioning that these platforms may not be attractive when wide geographic areas require to be mapped such as riparian zones in many tropical countries.
The use of machine learning methods with orbital data, like the free-available satellite images from Sentinel-2, can be a promissory strategy to map-wide areas using a low-cost approach. Sentinel-2 data has been evaluated with ML algorithms for mapping different types of vegetation [
16,
17], as other targets [
18], and it was considered efficient in these studies. But, the knowledge about the capacity of current ML models to identify the spatial distribution of forest vegetation in riparian zones based on medium spatial resolution imagery is limited yet. A recent study [
14] used Sentinel 2 imagery to map vegetation, but it is unclear whether these results can be replicated into other riparian zones.
To fulfill the aforementioned gap, we propose an easily reproducible framework to map forest vegetation in riparian zones, based on Sentinel-2 (MSI) multispectral images and processed by ML algorithms. We hypothesize that some machine learning algorithms may be more appropriate than others to potentially map forest-type in those areas with interference from seasonal changes. We then verified the -generalization capability of all trained models using images from different dates and geographic areas. The traditional classification and segmentation methods may not present a consistent accuracy or even provide the same robustness as a machine learning evaluation. In this sense, when monitoring these areas alongside different dates, governmental technical-agencies, and entities responsible for the forest management may adopt the proposed approach.
2. Materials and Methods
Our method was divided into four main stages (
Figure 1). Initially, we performed the organization of a database composed of 14 multispectral Sentinel-2 imagery, acquired alongside a one-year-period, for the area of interest. These images were pre-processed to convert their original value into surface reflectance values [
14].
We defined as a study area the riparian zone within 1 km distance from the Paraná river margin (
Figure 2), and for the last step the riparian zone selected was within 1 km of another river, known as the Paranapanema. We manually labeled the features (forest and non-forest) in a geographical information system (GIS) environment and separated the data in both training and testing sets. We performed the detection of forest vegetation adopting different algorithms. The performance of the learners was compared against one another using classification evaluation metrics.
2.1. Study Area
Our study area was the riparian zone of the Paraná River (
Figure 2) located in the state of São Paulo, Brazil. This region is known for the presence of one of the last original fragments of the Atlantic Biome in Brazil. This riparian zone has an area of 152.91 km² and 468.46 km of the perimeter, and the Paraná River is considered the most important river from this geographic region, dividing the states of São Paulo and Mato Grosso do Sul. The riparian zone is formed by both Cerrado (Brazilian Savanna) and Atlantic Forest biomes. This area is representative of most large rivers in our ecosystem, as it still possesses natural vegetation alongside deforestation, agricultural and urban environments within its area.
2.2. Image Preprocessing and Labeled Features
Our experiment considered a total of 14 Sentinel-2 images (
Table 1). All scenes were available with little or without cloud interference alongside the riparian zone. Also, most of the cloud-cover was formed by thin clouds, which although may impact the algorithm’s performance, served as an additional challenge for the algorithm. A total of 13 Sentinel-2 images were acquired during June 2018 and June 2019 (one per month), and a scene from June 2020 was used to test the performance of the algorithms in a different period. Therefore, we worked with images representing all seasons of the year (summer, winter, autumn, and spring). Each scene was downloaded from the United States Geological Survey (USGS) EarthExplorer Platform. Sentinel-2 images are collected by its MSI sensor and available in Digital Number (DN), in 12-bit radiometric resolution, in both 10, 20, and 60 m resolutions. They were projected to the WGS-84 UTM 22 S zone system.
For all images, we performed the radiometric correction using the SNAP 7.1.0 software with the Sen2Cor Toolbox. It was necessary to minimize the atmospheric influences, and, for that, we adopted the recommendations listed in the Sentinel-2 User Handbook [
19]. The SNAP tool finds the parameters for both radiometric and atmospheric corrections automatically. These values are calculated by default when the software reads the metadata file of each scene. In this regard, aerosol values corresponding to rural areas were adopted; and atmospheric conditions were defined to coincide with the time that the image was recorded. Ozone content in the atmosphere was also automatically defined. A correction using the Cirrus band (Band 10) was not performed since is not available for the 10 m bands at the moment [
19]. We used the DSM (Digital Surface Model) option input for terrain correction. The remaining parameters were left at their respective default values.
For our experimental setup, we labeled the forest-type data as training and testing samples using a GIS tool. The collection of the labeled information was performed with help of a specialist in the area, alongside additional high-resolution imagery from other datasets and imaging (both orbital and aerial) performed within the riparian zone in the last years. Regarding different types of forest vegetation present in the riparian zone, we considered only the fragments formed by forest physiognomies from the Atlantic Biome and, in fewer proportions, Brazillian Savanna, commonly encountered in the area, and associated with wetlands. In this aspect, since this type of forest offers more protection to these fragile ecosystems than the arbustive or grassland-types, it was identified during the labeling process and incorporated in the analysis.
A total of 855 features (polygons) of forest-type vegetation and 855 features of non-forest vegetation (e.g., water, soil, grass, and other land covers) were annotated on Sentinel-2 images, resulting in a total of 1710 polygons with different sizes, occupying almost 3,322.3 ha. Details regarding these samples and their spatial distribution are presented in
Table 2 and
Figure 3 below. Although, by the proximity between polygons, it may seem that some of which are present in both subsets. However, this occurred only during the representation of small polygons in the figure scale, as both training and testing sets were composed of entirely different features. To investigate the performance of the machine learning algorithms in detecting forest vegetation, we used 50% of the dataset (polygon features) for training and 50% for testing the algorithms.
The sample size and quality of training data have generally had a large impact on the classification accuracy [
20]. In this regard, we divided the dataset while ensuring that both training and testing sets contained similar sampling patterns, being representatives of all conditions observed in the area during labeling. This division was applied with the assistance of a widget incorporated in the Orange open-source software, and we integrated it with the sampling extraction method in the QGIS open-source software environment.
2.3. Machine Learning Algorithms
The open-source software Orfeo Toolbox 7.1.0 was used to apply and evaluate the performance of different ML models in classifying forest vegetation in the riparian zones using Sentinel-2 multispectral images. As stated, our dataset was composed of two classes: forest vegetation and non-forest vegetation, and it was used both training and testing the algorithms to ensure an adequate comparison among algorithms. We conducted a comparative study using five machine learning algorithms, including random forest [
21], decision tree [
22], support vector machine [
23], and normal-gaussian Bayes [
9].
The training and testing datasets were divided in the proportion of 50:50. The training set was then used to train and set up the hyperparameters of the chosen algorithms. In this sense, we divided the training set with the holdout method, using 10% of its data to validate it. After defining the best parameters for each model, we used the testing-set containing 50% of the original data, with samples not used during the training and validation process, to evaluate the real performance of our models. As explained by [
24], the even division between samples is necessary to have a good balance between both sets. This ensures a reliable estimation of the models’ performance, as the imbalance between datasets may harm the predictions.
The fine-tuning process of all the algorithms was performed until no improvements in the F1-measure value were identified. The same dataset (training and testing) was adopted for all algorithms. The final configuration of the algorithms is in
Table 3. Once the hyperparameters of each algorithm were defined, the testing dataset was used to verify its real performance. Metrics like global accuracy, F1-measure, precision, and recall were then adopted to evaluate them. These metrics were calculated considering the classification results of all of the labeled pixels in the testing-set. They also represent the average classification values between both classes.
Our experiment was set up to compare the performance of the machine learning models and determine the overall best classifier. This comparison was performed regarding only the spectral bands (blue, green, red, and near-infrared) from the 10 m spatial resolution images, as it provided more detailed samples. Additionally, we conducted tests to evaluate the generalization capability of all trained models using images from dates and geographic areas. The different proposed scenarios were related to a) evaluating different dates acquired during a one-year time interval (each image was evaluated individually by all of the machine learning models), and; b) applying the models in another riparian zone in an image from a different year of a different area; although from the same biome. Each image was then evaluated individually by all of the machine learning models.
3. Results
Table 4 shows the performance of the machine learning algorithms in the proposed task regarding the multiple dates of analysis for the riparian zone of the Paraná River (
Figure 2). As previously explained, all models were trained on 50% of the sampling data and tested with the remaining 50% for a total of 14 Sentinel-2 imagery (
Table 1) with the spectral bands’ number 2,3,4, and 8 (blue, green, red, and, near-infrared bands, respectively), with 10 m of spatial resolution.
To help illustrate the performance of ML algorithms in multiple dates (
Table 4), we organized a box-plot indicating the F1-measure and other evaluation metrics of the applied models (
Figure 4). Here, the overall results indicated that the performance of the algorithms, when considering all of the dates, were similar. However, DT outperformed slightly other ML models, since it returned the highest F1-measure value. The DT then is the most recommended approach in this regard. The NB model, however, may be useful since it is a simple model and does not require substantial processing costs.
Figure 5 presents the qualitative results obtained with each tested ML algorithm using the image of 15 June 2019, since it returned the highest accuracy (F1-measure upper to 95%) compared to other dates (
Table 4).
Although the five evaluated ML algorithms (
Figure 5) can be assumed with similar performance visually, the quantitative analysis (
Table 4) has demonstrated that the best learners in terms of kappa value were DT, NB, SVM, and RF, respectively. We noted that DT and NB techniques presented the same kappa value (95.20; see
Table 4), but the NB, in general, had a slight increase in the Recall value compared to the DT (see
Figure 4), which means that NB classified some non-forest areas as forest more than the DT algorithm. In this regard, the DT learner, while returning similar evaluation metrics as the other algorithms, returned better quali-quantitative results. Considering a forest management perspective, the false-positives (i.e., non-vegetation classified as vegetation) are more harmful than false-negatives.
We verified that for some areas, the DT and RF algorithms were not affected by sparse vegetation characteristics (
Figure 6) like the SVM and NB were (
Figure 7). These areas present a higher contribution from soil brightness pixels and other types of vegetation covers that offer some potential challenges for the classification.
The algorithms also faced additional challenges in mapping forest vegetation in riparian zones, and negative examples are presented in
Figure 8. This classification error possibly is due to the cloud cover interference that this part of the Sentinel-2 scene presented. The algorithms classified part of the clouded area over the water-body as vegetation. Nonetheless, the DT learner was still highly accurate in the proposed task when considering the land area contained within the riparian zone, as our training samples only considered these areas.
Figure 9 shows a positive example in which the DT algorithm was successful in identifying forest vegetation in a complex environment.
Based on these observations (
Figure 8 and
Figure 9) and the results of the quantitative approach (
Table 4), the DT was defined as the overall best technique among the evaluated models. To verify the generalization capability of the ML techniques, we performed additional tests with different dates in another riparian zone, but from the same biome. The generalization ability of a model refers to training it with images from a geographic area and testing its performance in other areas. This additional test was made with four algorithms (RF, SVM, DT, and NB). Therefore, the models, which were trained using images from the riparian zone of the Paraná River, were tested in the riparian zone located in the Paranapanema River. This river is located near one of the last largest fragments of primary vegetation from the Atlantic Biome, known as the Devil’s Mount. It is worth mentioning that we applied the four algorithms on two different dates in this area, considering the rainy summer and dry winter seasons. For that, we labeled a total of 147 features (polygons) representative of forest vegetation and 147 features of non-forest vegetation as data for testing the models, following the same criteria as the previous labeling.
Table 5 shows the results, which returned high accuracies on all dates. Two representative dates of each year were considered in this analysis.
Considering the scene from December 2018, we verified that all algorithms present a decrease in performance in terms of kappa (
Table 5). This is probably related to the cloud cover influence on this image, since, concerning June 2019, it presented 3,57 more times cloud cover (
Table 1). The DT was the best algorithm among the evaluated models in terms of F1-measure, proving its generalization capability. To help ascertain the relationship between the spectral bands and the predictions, we present in
Appendix A our decision tree models’ structure.
4. Discussion
The approach presented here is appropriated to map forest vegetation using satellite multispectral imagery of medium-spatial resolution. Our particular interest was to investigate the potential of machine learning algorithms and measure their variation in performance when applied to Sentinel-2 images for this task in riparian zones. The main advantage of this procedure is that free-available orbital images are used as the dataset, consisting of a low-cost method to support different environmental tasks, like monitoring of landscapes and forest management. In this regard, environmental practices could benefit from it, and future researches could be guided properly in terms of data and temporal analysis.
Studies [
4,
15] have shown that ML techniques are an efficient approach to map different land use and land cover classes, including forest vegetation, and our trials demonstrated that some algorithms can perform this task with higher accuracy than the others. The DT algorithm has been characterized as a simple and fast model for many applications in the remote sensing domain [
25,
26]. It was already characterized as more accurate, has less error rate, and is easier to apply when compared to other known methods in similar research [
27]. In our investigation, the DT algorithm achieved satisfactory results both at different dates and locations. Visually, this algorithm returned better results than others, moreover in areas that were not sampled. It also returned some variation in its accuracy when regarding different dates, although fewer than most of the others (
Figure 4).
The performance of the machine learning algorithms in our research was similar to those encountered in other studies [
14,
28,
29]. In subtropical forest areas, a study [
30] was able to obtain a kappa coefficient (k) of 0.74 for the RF algorithm using WorldView-2 imagery. The authors also improved their classification by adopting LiDAR data, resulting in metrics similar to ours. However, it should be highlighted that both WorldView-2 images and LiDAR surveying are expensive approaches, differently from our proposed method herein that is free. Others studies also presented high performances in separating forest areas from other types of land covers, like detecting natural forest in Sentinel-2 images with the RF algorithm [
31], and separating forest healthy vegetation from damaged vegetation using, in this approach, deep neural networks, and returning 92% accuracy [
32]. We can observe that, in our case of study, the obtained results are similar in terms of accuracy value, even though when a deep learning strategy is adopted, like in [
32].
The strategy of adopting images from different dates was also encountered in related studies [
14,
15]. The temporal resolution is important and has been regarded as a more relevant feature than the spectral and spatial resolution when considering vegetation classification tasks [
33]. In a recent study [
14], the overall best accuracies were obtained during the winter season, something also observed in our approach. The winter period in the riparian region investigated presents less atmospheric interference than the other periods. However, the spectral behavior from the tree fragments may vary, as some species reduce the number of leaves during this season. Regardless, the main aspect remains the presence of cloud interference, and that may explain the differences in accuracy of other seasons (
Table 4), such as summer, in which most of the images possess some sort of atmospheric interference.
Regarding vegetation phenology, few studies, to the best of our knowledge, evaluated different subclasses of forest-type with multispectral medium-spatial resolution remote sensing imagery and machine learning algorithms. Recently, in [
14], the authors used multi-temporal Sentinel-2 images to capture the phenological differences between vegetation classes. This study implemented a 60:40 division between training and testing samples and determined that the classification of different phenologies was better with the SVM (74% accuracy) and NN (72% accuracy) models, returning superior results when compared against other algorithms, like RF (65% accuracy). Nonetheless, it is still difficult to evaluate different phenologies in medium-spatial resolution imagery, and future research could investigate the performance of these methods to map subtypes of forest in riparian zones by implementing other types of data to feed their models.
Another important observation to be made is related to the generalization capability of the machine learning algorithms. These algorithms, unlike other types of traditional classification models, such as Maximum Likelihood, can improve their performance and learning capability when considering more and different informative data for training [
7,
9]. In this aspect, it is possible to adopt the models for different scenarios, providing that enough characteristics are available for learning during its training phase. In our study, by implementing a 50:50 division between both training and testing samples, we demonstrated that most of the algorithms returned satisfactory performances and that most of them can be applied for different seasons throughout the year. The results obtained in a different riparian zone (
Table 5) help to demonstrate the applicability of these models.
In short, our model was trained with data from one riparian zone, related to the Paraná river, and was able to map the forest vegetation considering different conditions (dates and areas) in the riparian zone of another river, known as the Paranapanema. This last experiment showed that the DT model, as well as the others to some extent, can be applied to different sites. As shown in
Table 5, the worst results obtained in this area occurred during the rainy season (which happens from December to January in this region). As previously explained, the reduction in these dates could be explained mainly because the rainy season impacts the spectral response of vegetation [
5], as well as promotes different atmospheric conditions that could also affect it such as the increase in cloud cover.
In a general sense, the machine learning algorithms investigated in this study can be considered a robust approach to classify forest-areas in multispectral imagery across seasonal periods. As we only implemented images from the Sentinel-2 sensor, this approach is suitable for low-cost classification models that intend to monitor areas like the ones adopted here. Nonetheless, other types of data may help in improving the accuracy in dates that did not return similar accuracies (
Table 4) as the remaining pattern from the rest of the year. One study [
31] indicated that a combination of texture metrics from Sentinel-2, seasonality metrics from Landsat time-series, and topography metrics from the SRTM digital elevation model are important features to be incorporated and fed to these models, helping to improve their overall performances in closed canopy natural forest classification.