1. Introduction
Cork oak woodlands characterize forested Mediterranean landscapes, and their ecological and economic values are widely recognized, although not adequately valorized [1]. This ecosystem covers about 2.2 million hectares across the globe, but the most extensive groves are located on the Atlantic coast of the Iberian Peninsula, to the extent that Portugal and Spain produce 75% of the world’s cork [2]. Cork is the sixth-most important non-wood forest product globally, and it is used to produce wine stoppers and a wide variety of other products, including flooring components, insulation and industrial materials, and traditional crafts. The most frequent landscape in the Mediterranean region is characterized by open woodland systems in which scattered mature trees coexist with an understory composed of grassland for livestock or cereal crops and shrubs. The ecosystem has high conservation value and provides a broad range of goods and services besides cork [2]. It is important to increase the knowledge and skills that enable the use of the most innovative technologies to monitor these forest ecosystems in support of forest management.
The use of unmanned aerial vehicles (UAVs) or UA systems (UASs) in forest and environmental monitoring [3,4,5,6,7,8,9,10,11,12,13,14,15] is currently in an expansion phase, encouraged by the constant development of new models and sensors [3,16,17]. Detailed reviews focusing on the use of UAVs in agro-forestry monitoring, inventorying, and mapping are provided by Tang and Shao [16], Pádua et al. [3], and Torresan et al. [18]. Additionally, the short time required for launching a UAV mission makes it possible to perform high-intensity surveys in a timely manner [16,19,20], thus giving forest practices more precision and efficiency while opening up new perspectives in forest parameter evaluation. Therefore, UAVs are particularly suitable for multi-temporal analysis of small- and medium-sized land parcels and complex ecosystems, for which frequent surveys are required [17].
There is a large diversity of UAV applications, mainly as a result of the variety of available sensors [21]. The most common sensors assembled on UAVs are passive sensors, which include classic RGB (red, green, blue) cameras, NIR (near-infrared), SWIR (short-wave infrared), MIR (mid-infrared), and TIR (thermal infrared) sensors, and their combinations in multispectral (MS) and hyperspectral (HS) cameras, as well as active sensors such as RADAR (radio detection and ranging) and LiDAR (laser imaging detection and ranging) [3,16]. An increasingly common agro-forestry application that uses multispectral images acquired from sensors assembled on UAVs is land cover classification [18,22].
Until a decade ago, most of the classification methods for agro-forestry environments were based on the statistical analysis of each separate pixel (i.e., pixel-based method), and performed well when applied to satellite imagery covering large areas [23,24,25]. The recent emergence of very high resolution (VHR) images, which are becoming increasingly available and cheaper to acquire, has introduced a new set of possibilities for classifying land cover types at finer levels of spatial detail. However, VHR images frequently show high intra-class spectral variability [19,20,26]. On the other hand, the higher spatial resolution of these images enhances the ability to focus on an image’s structure and background information, which describe the association between the values of adjacent pixels. With this approach, spectral information and image classification are largely improved [27].
Recently, object-based image analysis (OBIA) has emerged as a new paradigm for managing spectral variability, and it has replaced the pixel-based approach [27,28]. OBIA works with groups of homogeneous and contiguous pixels [19] (i.e., geographical objects, also known as segments) as base units to perform a classification, so it differs from the classic pixel-oriented methods that classify each pixel separately; thus, the segmentation approach reduces the intra-class spectral variability caused by crown textures, gaps, and shadows [19,27,28]. OBIA includes two main steps: (i) identification, grouping, and extraction of significant homogeneous objects from the input imagery (i.e., segmentation); (ii) labeling and assigning each segment to the target cover class (i.e., classification) [19,20,28].
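The difference between per-pixel and per-object labeling can be illustrated with a minimal sketch. All values below (the reflectance row, the segment ids, and the class reference means) are hypothetical and chosen only for illustration; the nearest-mean rule is a stand-in for a real classifier and is not the method used in this study.

```python
from statistics import mean

# Hypothetical NIR reflectance values along one image row (assumed data).
# The third pixel is a dark gap inside a tree crown.
pixels = [0.62, 0.58, 0.31, 0.60, 0.64, 0.28, 0.30, 0.33, 0.29]
# Segment id of each pixel, as if produced by a prior segmentation step.
segments = [0, 0, 0, 0, 0, 1, 1, 1, 1]
# Hypothetical class reference values (assumed, not taken from the paper).
class_means = {"tree": 0.60, "grass": 0.30}


def nearest_class(value):
    """Return the class whose reference mean is closest to value."""
    return min(class_means, key=lambda c: abs(class_means[c] - value))


def classify_pixels(values):
    """Pixel-based approach: every pixel is labeled on its own."""
    return [nearest_class(v) for v in values]


def classify_objects(values, segment_ids):
    """Object-based approach: label each segment from its mean value,
    then assign that label to all pixels of the segment."""
    by_segment = {}
    for v, s in zip(values, segment_ids):
        by_segment.setdefault(s, []).append(v)
    labels = {s: nearest_class(mean(vals)) for s, vals in by_segment.items()}
    return [labels[s] for s in segment_ids]


pixel_labels = classify_pixels(pixels)
object_labels = classify_objects(pixels, segments)
```

In this toy row, the crown gap is labeled as grass by the pixel-based rule, whereas the object-based rule assigns the whole first segment to the tree class because the segment mean absorbs the outlier.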
More recently, considering that OBIA techniques are applied in many research fields, the term GEOBIA (geographic OBIA) [27,29,30] has become preferred when referring to a geospatial context. Moreover, GEOBIA should be considered as a sub-discipline of GIScience, which is devoted to obtaining geographic information from remote sensing (RS) imagery analysis [27,29,30]. GEOBIA has proven to be successful and often superior to the traditional pixel-based method for the analysis of very high resolution UAV data, which exhibits a large amount of shadow, low spectral information, and a low signal-to-noise ratio [6,19,20,26].
The availability of hundreds of spectral, spatial, and contextual features for each image object can make the determination of optimal features a time-consuming and subjective process [20]. Therefore, high radiometric and geometric resolution data require the simultaneous use of more advanced and improved computer software and hardware solutions [26]. Several works dealing with the classification of forest ecosystems or forest tree species by coupling GEOBIA and UAV imagery can be found in the literature [4,15,16,22,31]. Franklin and Ahmed [31] reported the difficulty in performing an accurate tree crown reconstruction, characterizing fine-scale heterogeneity or texture, and achieving operational species-level classification accuracy with low or limited radiometric control. Deciduous tree species are typically more difficult to segment into coherent tree crowns than evergreens [31] and tend to be classified with lower accuracies [32].
In terms of using GEOBIA algorithms, most studies are implemented by means of proprietary software (e.g., [4,6,11,15,19,31]). Considering their cost, especially in an operational context, it can be argued that only a relatively restricted group of operators can use them. Nowadays, several free and open-source software packages for geospatial (FOSS4G) analysis with GEOBIA algorithms are available, thus making their use accessible to a larger number of end users [20]. Among others, a very promising FOSS4G package equipped with GEOBIA algorithms is Orfeo ToolBox (OTB), developed by the French Centre National d’Etudes Spatiales (CNES). OTB can be operated either autonomously or through a second open-source software package (i.e., QGIS), used as a graphical interface that enables a graphical analysis of data processing in real time [33,34].
This work was conducted in the framework of a wider project aiming to derive a methodology for monitoring the gross primary productivity (GPP) of Mediterranean cork oak (Quercus suber L.) woodland by integrating spectral data and biophysical modeling. The spatial partition of the ecosystem into different coexisting vegetation layers (trees, shrubs, and grass), which is the task encompassed by this study, is essential to the fulfillment of the objective of the overarching project. The main objective of this work was to develop a supervised classification procedure of the three vegetation layers in a Mediterranean cork oak woodland by applying a GEOBIA workflow to VHR UAV imagery. In order to optimize the procedure, two different algorithms were compared: support vector machine (SVM) and random forest (RF). In addition, data from two contrasting surveying periods—spring and summer—were also compared in order to assess the effect of the composition of different layers (i.e., no living grass in the summer period) on the classification accuracy. In order to increase the processing quality, we tested a methodology based on the combination of spectral and semantic information to improve the classification procedure through the combined use of three informative layers: R-G-NIR mosaics, NDVI (normalized difference vegetation index), and DSM (digital surface model).
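As a brief illustration of how the three informative layers could be combined, the sketch below computes the NDVI with its standard definition, (NIR − R) / (NIR + R), and stacks it with the R, G, NIR, and DSM values of each pixel. The function names and the sample values are hypothetical, and the sketch does not reproduce the actual preprocessing chain used in this work.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel.
    Returns 0.0 when both bands are zero to avoid division by zero."""
    denominator = nir + red
    if denominator == 0:
        return 0.0
    return (nir - red) / denominator


def stack_layers(red, green, nir, dsm):
    """Build one feature tuple (R, G, NIR, NDVI, DSM) per pixel from
    equally long lists of band values and surface elevations."""
    return [
        (r, g, n, ndvi(n, r), h)
        for r, g, n, h in zip(red, green, nir, dsm)
    ]


# Hypothetical reflectances and DSM elevations (m) for three pixels.
features = stack_layers(
    red=[0.10, 0.25, 0.30],
    green=[0.20, 0.22, 0.28],
    nir=[0.50, 0.30, 0.32],
    dsm=[212.4, 205.1, 204.9],
)
```

Each tuple in `features` then carries both spectral information (R, G, NIR, NDVI) and height-related information (DSM) for the same location.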
The paper is organized as follows. Section 2 provides details about the study area and the dataset, and gives a general description of OTB. Section 3 deals with methodological issues by explaining preprocessing, segmentation, classification methods and tools, and the performed accuracy analyses. In Section 4, the obtained results are shown, and they are discussed in Section 5. Section 6 summarizes the results and highlights the limitations of the present research, open questions, and suggested future research directions.
5. Discussion
5.1. Segmentation
The smoothing step of the standard segmentation workflow provided in OTB was omitted because it reduces the contrast necessary to discriminate between the cork oak and shrub classes and was therefore expected to worsen the results. The optimization of the range radius and minsize values led to a consistent separation between cork oak, shrubs, and grass classes.
To cope with the high structural and compositional heterogeneity of the site in the segmentation step, we adopted a low scale factor for the objects, and this was effective. In the LSMS-segmentation algorithm, the scale factor can only be managed through the values adopted for the range radius and minimum region size parameters. As reported by several scholars [56,57,58,59], visual interpretation remains the recommended method to assess the quality of segmentation results.
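The way in which a range radius and a minimum region size jointly control the scale of the objects can be sketched on a single row of pixel values. The code below is a deliberately simplified one-dimensional analogue and is not the LSMS algorithm implemented in OTB: the function names, the grouping rule, and the sample row are all assumptions made for illustration.

```python
from statistics import mean


def segment_row(values, range_radius):
    """Group adjacent pixels while each new value stays within
    range_radius of the mean of the segment being grown."""
    segments = [[values[0]]]
    for v in values[1:]:
        if abs(v - mean(segments[-1])) <= range_radius:
            segments[-1].append(v)
        else:
            segments.append([v])
    return segments


def merge_small(segments, min_size):
    """Merge every segment smaller than min_size into the adjacent
    segment whose mean value is closest, until none remains."""
    segs = [list(s) for s in segments]
    changed = True
    while changed and len(segs) > 1:
        changed = False
        for i, s in enumerate(segs):
            if len(s) >= min_size:
                continue
            neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(segs)]
            j = min(neighbours, key=lambda k: abs(mean(segs[k]) - mean(s)))
            lo, hi = min(i, j), max(i, j)
            segs[lo:hi + 1] = [segs[lo] + segs[hi]]
            changed = True
            break
    return segs


# Hypothetical digital numbers along one image row.
row = [10, 11, 10, 30, 31, 12, 11, 50, 52, 51]
fine = segment_row(row, range_radius=3)     # small radius: many segments
coarse = segment_row(row, range_radius=25)  # large radius: few segments
merged = merge_small(fine, min_size=3)      # remove segments under 3 pixels
```

With this row, the small radius yields four segments and the large radius only two, while the merging step reduces the four fine segments to three by absorbing one two-pixel segment into its most similar neighbour.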
Taking into account the high geometrical resolution of our imagery (i.e., GSD = 10 cm), a low scale of segmentation was preferable; thus, relatively low values of the range radius and minimum region size were used. Although producing very small segments leads, in some cases, to over-segmentation (i.e., a single semantic feature is split into several segments, which then need to be merged), a low scale of objects resulted in a high degree of accuracy. It allowed the identification of polygons that represent every small radiometric difference in tree crowns, as well as small patches of grass and shrubs (Figure 3). As highlighted in several studies [29,60,61,62], a certain degree of over-segmentation is preferable to under-segmentation to improve the classification accuracy, as can be clearly inferred from the results in Table 3 and Table 4.
It is important to underline a very significant difference in the number of polygons in the segmentation results of the two different mosaics, despite our application of region merging using the same threshold size (250 pixels). As shown in the results section, the summer image presents almost twice the number of polygons relative to the spring image. This might be explained by a higher chromatic difference caused by a high contrast of vegetation suffering from drought (especially Cistus and the herbaceous layer) and by an increased presence of shadows (an increase of more than 4% of the study area compared with the spring image). Indeed, the summer flight was performed earlier in the day (11:00) than the spring one (13:00), so the flights were subjected to different sun elevation angles (Table A1). Nevertheless, Figure 3 shows that the obtained segmentation correctly outlines the boundaries of the tree crowns, as well as the shrub crowns and the surrounding grass spots, discriminating one from the other. The segmentation results also clearly differentiate between the bare soil and the vegetation, even in the presence of small patches. As reported in recent studies that performed forest tree classification [18,59], the additional information provided by the DSM and NDVI significantly improves the delimitation and discrimination of segments, which was true in our case as well. This is more relevant in conditions in which the three vegetation layers are spatially close, because discrimination based only on spectral information has proven very difficult to achieve in such environments. An additional detail to mention is that the algorithm was able to accurately recognize the shadows, an important “disturbing factor” for image processing, since they are usually hard to distinguish from other objects. This is the reason why shadows were specifically considered as a separate class category.
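The link between sun elevation and shadowed area mentioned above follows from simple geometry: on flat ground, an object of height h casts a shadow of length h / tan(elevation). The sketch below applies this relation to hypothetical values; the tree height and the two elevation angles are assumptions and are not the values reported in Table A1.

```python
import math


def shadow_length(object_height_m, sun_elevation_deg):
    """Length (m) of the shadow cast on flat ground by an object of the
    given height when the sun is at the given elevation angle."""
    return object_height_m / math.tan(math.radians(sun_elevation_deg))


# Hypothetical 8 m tall tree under two hypothetical sun elevations.
lower_sun = shadow_length(8.0, 55.0)   # sun lower above the horizon
higher_sun = shadow_length(8.0, 65.0)  # sun higher above the horizon
```

The lower the sun, the longer the shadow, which is consistent with a larger shadowed area when a flight takes place further from solar noon.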
5.2. Classification and Accuracy Assessment
The GEOBIA paradigm coupled with the use of machine learning classification algorithms is currently considered an excellent “first-choice approach” for the classification of forest tree species and the general derivation of forest information [28,31,41,55,63]. As reported by Trisasongko et al. [55] and Immitzer et al. [41] and confirmed in this study, the default values of the OTB parameters for training and classification processes provide optimal results, and they were maintained in the final test. Moreover, both classifiers were very fast, requiring only about 10 seconds for their execution.
Generally, the quality of the classification can be considered good, given the errors that are intrinsic to the image. This is confirmed by the kappa coefficient, whose values ranged from 0.928 to 0.973 for RF and from 0.847 to 0.935 for SVM. For both spring and summer imagery, the comparison between the SVM and RF algorithms does not reveal large differences in the classification maps, and any differences detected are in very small polygons. However, both the kappa coefficient and the overall accuracy show that the RF algorithm is more accurate than SVM for both flights, and this is also reflected in the confusion matrices. The most frequent error, particularly in the images classified with SVM, is confusion between the shrub class and the shadow class, limited to the edges of the shadows. In contrast, shrubs are clearly distinguished from tree crowns in the whole scene.
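For reference, the overall accuracy and Cohen's kappa coefficient discussed above can both be derived from a confusion matrix. The sketch below uses a small hypothetical two-class matrix, not the matrices obtained in this study; kappa is computed as (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e is the agreement expected by chance.

```python
def overall_accuracy(cm):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohen_kappa(cm):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_totals = [sum(cm[i]) for i in range(n)]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    p_observed = overall_accuracy(cm)
    p_expected = sum(row_totals[k] * col_totals[k] for k in range(n)) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical confusion matrix (rows: reference, columns: classified).
example_cm = [
    [45, 5],
    [10, 40],
]
oa = overall_accuracy(example_cm)
kappa = cohen_kappa(example_cm)
```

Because kappa discounts chance agreement, it is always lower than or equal to the overall accuracy for the same matrix, which is why the two measures are usually reported together.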
The classes, if taken independently, can be analyzed through the user’s and the producer’s accuracy, which indicate commission and omission errors, respectively. For both flights and both algorithms, the cork oak class obtains the best classification accuracy, with values that are always higher than 95%. On the other hand, the lowest accuracy is obtained for the shrubs class with the SVM algorithm and the summer flight.
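The per-class measures can be sketched in the same way. The example assumes a confusion matrix with reference classes in rows and classified classes in columns, so that the producer's accuracy is the diagonal value divided by the row total (its complement is the omission error) and the user's accuracy is the diagonal value divided by the column total (its complement is the commission error). The class names and counts are hypothetical.

```python
def producers_accuracy(cm, k):
    """Correctly classified samples of class k over all reference
    samples of class k (1 - omission error)."""
    return cm[k][k] / sum(cm[k])


def users_accuracy(cm, k):
    """Correctly classified samples of class k over all samples that
    were assigned to class k (1 - commission error)."""
    return cm[k][k] / sum(row[k] for row in cm)


# Hypothetical matrix (rows: reference, columns: classified).
classes = ["tree", "shrub"]
example_cm = [
    [45, 5],
    [10, 40],
]
per_class = {
    name: (producers_accuracy(example_cm, k), users_accuracy(example_cm, k))
    for k, name in enumerate(classes)
}
```

In this toy matrix, the tree class has a higher producer's accuracy than user's accuracy, meaning that few reference trees are missed but some shrubs are wrongly assigned to the tree class.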
The confusion matrix shows that shrubs are more often confused with shadows or herbaceous vegetation than with trees. A preliminary visual analysis is sufficient to notice a common error in all of the maps: some polygons representing the edges of the shadows (at which there is a chromatic transition from the shadow to the adjacent cover) are confused with shrubs, whereas the confusion of shrubs with grass can be justified. It is hard to formulate a reliable assessment of the results for the bare soil class because of the low number of samples, even if the result is representative of the real distribution and extent of the class in the whole region (<1.7% of the total surface). The bare soil class is mainly represented by the dirt road present at the site.
Trees and shrubs have similar shapes and spectral features from a nadir view, and the largest distinction is the height, as well as the epigeal structural form. Thus, the DSM, which has a GSD of 10 cm and underlines the differences in height, plays an important role in discriminating between the two vegetation layers. Moreover, the NDVI is important for the distinction between herbaceous vegetation and shrubs, as well as between the herbaceous layer and bare soil; the NDVI also has a key role in the identification of shrubs. The class separability and thus discrimination provided by these two bands brings additional information relative to the original mosaic bands (R-G-NIR), which is relevant for both the segmentation and the classification processes.
Of the two flight seasons, the spring imagery yielded the best classification in terms of overall accuracy and kappa coefficient according to both classification algorithms. This is mainly because of the features of shrubs, particularly Cistus, which has a tendency to dry out during the summer and can potentially be mistaken for shadows and dry grass. This factor increases the chromatic and the NDVI contrasts among the plants. The opposite happens for herbaceous vegetation, which, during the spring, is in the midst of its photosynthetic activity, while it tends to dry out during the summer; this implies that herbaceous vegetation is classified with lower accuracy in the spring, probably as a result of a lower NDVI contrast with other photosynthetically active vegetation. Moreover, analysis of the spatial distribution of the classes reveals that the two classification algorithms yield almost the same results for the same season, with only small differences. Differences between the seasons with the same classification algorithm can also be considered small, and the classifier does not appear to be affected by the cyclical and biological behavior of the species. In fact, with the use of the NDVI information layer, the algorithms similarly distinguished herbaceous vegetation from shrubs, even if they were clearly affected by drought. The larger problem to be faced is related to the large area of shadows in the summer mosaic.
6. Conclusions
Technological development for accurately monitoring vegetation cover plays an important role in studying ecosystem functions, especially in long-term studies. In the present work, we investigated the reliability of coupling the use of multispectral VHR UAV imagery with GEOBIA classification implemented in OTB free and open-source software. Supervised classification was implemented by testing two object-based classification algorithms, random forest (RF) and support vector machine (SVM).
The classification accuracy was high, with overall accuracy values that never fell below 89% and kappa coefficient values of at least 0.847. Both classifiers obtained good levels of accuracy, although the RF algorithm provided better results than SVM for both images. This process does not seem to have been examined in previous studies; as far as we are aware, this study is the first that uses multispectral R-G-NIR images with DSM and NDVI for the classification of forest vegetation from UAV images using open-source software (OTB).
An open question worth investigating relates to the application of our findings to other forest ecosystems. The OTB suite is free and open-source software that is regularly updated and has proven to provide free processing tools with a steep learning curve coupled with robust results in forest ecosystems that have a complex vertical and horizontal structure. Moreover, as our findings show, OTB can effectively process UAV imagery when implemented on low-power hardware. Therefore, the proposed workflow may be highly valuable in an operational context.
However, some open questions and future research directions remain worthy of investigation. On the one hand, the potential usefulness of the smoothing step should be further investigated under different operational conditions. On the other hand, regarding tree shadows, our experience in this study reveals that flight schedules can significantly affect the quality of the obtained imagery, especially in open forest ecosystems such as cork oak woodlands. The summer imagery is affected by about 30% more shadowed area than the spring one.
Our findings confirm the reliability of UAV remote sensing applied to forest monitoring and management. Moreover, although a massive amount of UAV hyperspectral and LiDAR data can be obtained with ultra-high spatial and radiometric resolutions, these sensors are still heavy, too expensive, and in most cases require two flights to obtain radiometric and topographic information. Our sensor, which is based on standard R-G-NIR imagery, has the significant advantage of being a very affordable solution in both scientific and operational contexts. Additionally, mounting this sensor on a fixed-wing UAV enables the surveying of surfaces spread over hundreds of hectares in a short time and has thus proven to be a very cost-effective and reliable solution for forest monitoring.