1. Introduction
Land-use change and land degradation represent key contributors to the increase in greenhouse gas emissions, both globally [1] and in the tropics [2,3], while also having an important impact on the habitat quality for many native species [
4]. In the period 2000–2018, an increase in the number and extent of large-scale land acquisitions (from both foreign and domestic investments) in the tropics has shown a predominance of permits granted for land-use changes through deforestation in favor of fast cash crops (e.g., oil palm and rubber plantations) [
5], in addition to illegal land clearance by smallholders [
6]. Moreover, among tropical countries, Indonesia has seen a rapid rate of primary forest loss (0.84 Mha/yr) due to land clearance for agricultural purposes [
3] in the period 2000–2012. After the introduction of the moratorium policy in 2011, the rate of land conversion to plantations appears to have slowed down [
7]. However, the impacts of these land-use changes on the vegetation structure of entire regions have not been investigated.
Vegetation structural complexity is considered a reliable proxy for ecosystem biodiversity and habitat quality, as it can provide insights into several ecosystem functions as well as overall ecosystem health [8,9,10,11]. Recently recognized as one of six Essential Biodiversity Variables (EBVs) [
12], vegetation structural complexity depends on the dimensional, architectural and spatial patterns of plant individuals and increases with increasing heterogeneity of biomass distribution in the 3D space [
13,
14]. Vegetation structure is composed of different elements related to both the vertical and horizontal spatial arrangement of biomass [
15], influencing and maintaining favorable microclimatic conditions for different biota [
16,
17,
18]. It is commonly assumed that structurally complex vegetation provides high quality habitat for the dependent organisms [
19,
20], while less complex vegetation can have a high impact on the habitat selection of several mammal and bird species [
21,
22,
23]. Tropical forests represent some of the most structurally complex, biodiverse and carbon-rich ecosystems on the planet [
24,
25], while also displaying a highly variable structural complexity across small spatial scales [
26,
27,
28]. Transformed land-use systems—depending on their management intensity—might also hold important vegetation structural complexity and thus ecosystem functions. Therefore, it is of paramount importance to acquire detailed knowledge on the vegetation structural complexity of different land uses in a landscape for the effective management of highly modified tropical landscapes [
29]. Measurements of structural complexity through traditional field surveys can be extremely costly and time consuming, particularly in remote areas in the tropics. Airborne or spaceborne remote sensing technologies can offer viable alternatives to traditional fieldwork, being more repeatable and less affected by human measurement errors [
30].
Over the last few decades, the use of remote sensing has gained much popularity in the fields of forestry and ecology, often playing a key role in providing a monitoring baseline for spatiotemporal changes in vegetation health and forest cover [30,31,32,33]. Among remote sensing technologies, light detection and ranging (LiDAR, also referred to as airborne laser scanning (ALS) when mounted on an airplane) is likely the most suitable system for studying vegetation structure (and structural complexity), because it is an active sensor whose signal can penetrate the canopy layer and provide insight into the elements below it. LiDAR-derived information can allow the precise estimation of several key vegetation structural attributes [
30]: quantification of plot-level gap fraction, gap sizes and distribution [
34,
35]; maximum tree height and aboveground biomass (AGB) [
36]; and leaf-area index (LAI) for any given vertical foliage profile contained within a plot [
37]. Furthermore, over the last decade, several studies have tried to estimate vegetation structure through the use of LiDAR technology at different scales [
38,
39,
40]. Using discrete return ALS data, authors were able to map regional variations in vegetation structural complexity over a large part of the Canadian province of Alberta [
39]. In their study, Guo et al. [
41], using six variables derived from LiDAR (i.e., standard deviation of heights, canopy cover and four measures of canopy height density), were able to significantly differentiate between eight different vegetation structural classes distributed across nine natural subregions. Moreover, Marselis et al. [
42] were recently able to characterize five different vegetation types in a tropical forest–savanna mosaic across a large portion of Gabon, Africa. Recently, through the use of LiDAR data, Davies et al. [
21] have found that on the island of Borneo, orangutan individuals were more likely to move through forests characterized by an increased canopy closure and by a tall, uniform vegetation layer, while they preferred avoiding canopy gaps. Using remote sensing measurements of forest structure, another study on Bornean forests showed that several mammal species were particularly sensitive to structural simplification after forest disturbance (e.g., repeated logging—[
43]). Other studies have found that the presence of agricultural or urbanized land covers generally reduces the abundance and diversity of stingless bees in Thailand [
44]. Barnes et al. [
45] found that species richness of invertebrate detritivores in Sumatra was significantly lower in highly modified cash-crop monocultures (e.g., rubber and oil palm plantations), rather than in forested ecosystems. Indeed, vegetation structure can also affect arthropod species richness and their relative abundance, depending on the species composition and related leaf quality and distribution for herbivorous species [
46]. However, to date, few studies have focused on the vegetation structural characterization of different land uses using LiDAR data in highly fragmented and modified tropical regions, such as the Indonesian island of Sumatra.
Here we aim to study the potential of ALS-derived measures to highlight differences in tropical vegetation structure from plots sampled across a gradient of land-use management intensity on the island of Sumatra, Indonesia. For this purpose, we analyzed the vegetation structure of five land uses of relevance in the region: tropical rainforests, jungle rubber, rubber plantations, oil palm plantations and transitional lands. Specifically, we (i) investigated the possible directionalities of the derived LiDAR metrics using a principal component analysis (PCA); (ii) focused on the behavior of key structural metrics across the different land-use/land-cover classes (LULCs), by carrying out analysis of variance (ANOVA) and post-hoc pairwise comparisons; and (iii) trained a random forest (RF) classification algorithm to further characterize the target LULCs. Since we anticipated that the inclusion of both tropical rainforest and jungle rubber plots would result in a lower classification accuracy (due to their similar vegetation structural properties [
47]), two separate RF characterizations were carried out, one including all five LULCs (hereafter referred to as the “five land uses” (5LU) model) and one using only four LULCs (hereafter referred to as the “four land uses” (4LU) model), with the forest class being represented by a combination of plots belonging to both forest and jungle rubber classes.
4. Discussions
The results of both the PCA and individual metrics analysis of variance helped in characterizing the five land uses under study, according to differences in the behavior of the LiDAR-derived metrics. The first component identified by the PCA (i.e., PC1 in
Figure 4) was strongly associated (loadings > 0.85) with metrics related to tall vegetation occurrence (i.e.,
pzabove2,
zq75,
zq50), entropy of height points (i.e.,
zentropy), vegetation layering (i.e.,
fhd_shan_div,
lai and
fhd_shan_eve) and horizontal spatial occupation of elements (i.e.,
veg_cover,
sum_gapArea and
max_gapArea). Plots were ordered along this dimension from transitional land through oil palm, rubber and jungle rubber to forest, moving from negative to positive values of the first component. This ordering is consistent with the post-hoc pairwise comparisons of most metrics, in which the land uses follow a gradient of decreasing land-use management intensity: from highly degraded (i.e., transitional lands) and strongly managed land uses (i.e., oil palm and rubber plantations) to less managed (i.e., jungle rubber) and relatively unmanaged (i.e., secondary rainforest) land uses.
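The strength of association between a metric and a principal component is commonly expressed as a loading, i.e., the correlation between the standardized metric and the component scores. The following minimal sketch illustrates that idea only; it is not the analysis used in this study. The helper names, the power-iteration shortcut (which recovers the first component only) and the five made-up plot values are all assumptions, and the toy variables are deliberately collinear so that the result is easy to check by hand.

```python
import math

def standardize(columns):
    """Centre each variable and scale it to unit (population) variance."""
    scaled = []
    for col in columns:
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col))
        scaled.append([(v - mean) / sd for v in col])
    return scaled

def correlation_matrix(columns):
    """Pearson correlations between every pair of variables."""
    z = standardize(columns)
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]

def pc1_loadings(columns, iterations=200):
    """Loadings on the first component, found by power iteration."""
    corr = correlation_matrix(columns)
    k = len(corr)
    vec = [1.0 / math.sqrt(k)] * k
    eigenvalue = 0.0
    for _ in range(iterations):
        prod = [sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)]
        eigenvalue = math.sqrt(sum(p * p for p in prod))
        vec = [p / eigenvalue for p in prod]
    # The sign of a component is arbitrary: orient it so the first variable is positive.
    if vec[0] < 0:
        vec = [-v for v in vec]
    # Loading = eigenvector element scaled by the square root of the eigenvalue.
    loadings = [v * math.sqrt(eigenvalue) for v in vec]
    return loadings, eigenvalue

# Hypothetical values for five plots (not data from this study).
canopy_height = [5.0, 10.0, 20.0, 30.0, 35.0]
veg_cover = [2.0 * h + 10.0 for h in canopy_height]   # rises with height
gap_area = [100.0 - c for c in veg_cover]             # falls as cover rises

loadings, eigenvalue = pc1_loadings([canopy_height, veg_cover, gap_area])
print(loadings, eigenvalue)
```

In this toy case the two variables that increase together load positively on the first component, while gap area loads negatively with the same magnitude, mirroring the opposition between tall, closed vegetation and open plots along PC1.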
Rainforests and jungle rubber plots tended to have taller and more complex vegetation, followed by rubber and the remaining land uses (
Figure 5a,d,i–k,v). This trend agrees with the plots distributed along the positive axis of the first component, while plots mainly belonging to the transitional land and oil palm classes (and a few rubber plantation plots) were distributed along the negative axis. Moreover, measures of height skewness and kurtosis, as well as horizontal structure tended to be significantly higher for plots of transitional land (
Figure 5c,m–p,s,t), which is possibly due to the higher variability in vegetation conditions displayed by these plots and typical of this vegetation type [
91]. These trends are consistent with those identified in a recent study estimating carbon storage on a subset of these plots, where rainforest had the highest values of aboveground carbon storage (of which stand canopy height is a proxy), followed by rubber and oil palm monoculture [
92].
When focusing on the second component identified by the PCA (i.e., PC2 in
Figure 4), the distribution of plots becomes less clearly defined. This component was strongly associated with the
num_gaps metric (
Table 2, PC2), with several oil palm plots located towards its most negative values. This suggests that a higher number of gaps in the vegetation canopy is characteristic of oil palm plots, which follow a highly regular planting pattern [
93,
94]. This planting pattern creates gaps around each oil palm, at least until the plantation reaches maturity, by which time the canopy will be mostly closed. This trend was further confirmed by the post-hoc pairwise analysis of the
num_gaps metric, as the number of gaps in oil palm plots was significantly higher than that in forest and transitional land plots, and double the values obtained for jungle rubber and rubber plantation plots (
Figure 5m). Interestingly, when looking at the results obtained from the 5LU RF model, the
num_gaps metric was actually one of the two metrics dropped from the model, as it had a negative MDA value in the preliminary model. This was not the case for the 4LU RF model; nevertheless,
num_gaps was the fourth least important metric in that model (
Figure 6b).
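To make the distinction between the number of gaps and the gap area concrete, the sketch below counts gaps on a rasterized canopy height model. It is an illustration only: the 2 m height threshold, the 4-neighbour connectivity rule, the `gap_metrics` helper and both toy grids are assumptions rather than the procedure used in this study; only the metric names follow the text.

```python
def gap_metrics(chm, height_threshold=2.0, cell_area=1.0):
    """Summarize canopy gaps in a canopy height model (list of rows, heights in m).

    A gap is taken to be a 4-connected group of cells lower than
    `height_threshold` (an assumed definition).
    """
    rows, cols = len(chm), len(chm[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or chm[r][c] >= height_threshold:
                continue
            # Flood fill from this cell to collect one whole gap.
            stack = [(r, c)]
            seen[r][c] = True
            cells = 0
            while stack:
                y, x = stack.pop()
                cells += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and chm[ny][nx] < height_threshold:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            areas.append(cells * cell_area)
    return {
        "num_gaps": len(areas),
        "sum_gapArea": sum(areas),
        "max_gapArea": max(areas, default=0.0),
        "min_gapArea": min(areas, default=0.0),
    }

# Hypothetical regularly planted stand: small openings at fixed intervals.
regular_planting = [
    [8, 0, 8, 0, 8],
    [8, 8, 8, 8, 8],
    [8, 0, 8, 0, 8],
    [8, 8, 8, 8, 8],
    [8, 0, 8, 0, 8],
]
# Hypothetical closed canopy with a single clearing of the same total area.
single_clearing = [
    [25, 25, 25, 25, 25],
    [25, 0, 0, 0, 25],
    [25, 0, 0, 0, 25],
    [25, 25, 25, 25, 25],
    [25, 25, 25, 25, 25],
]

planted = gap_metrics(regular_planting)
clearing = gap_metrics(single_clearing)
print(planted)
print(clearing)
```

Both grids contain the same summed gap area (six cells), yet the regularly spaced layout yields six gaps and the clearing only one. This is why a gap count can single out regularly planted stands even where the total gap area is similar.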
Results from the post-hoc pairwise analysis of variance showed that no metric differed significantly among all the land uses considered, with forest and jungle rubber plots never showing significant differences between them. While forest and oil palm plots differed significantly for most metrics tested (with the exclusion of
zkurt and
min_gapArea), rubber plantation and transitional land plots behaved differently depending on the LiDAR-derived metric considered (
Figure 5). Rubber plantation plots did not show significant differences from forest and jungle rubber plots for
zkurt,
zq25,
zq50,
cr and for all metrics of horizontal structure, and they were similar to oil palm plots in the case of
rumple,
min_gapArea and
num_gaps. This could be due to the regular planting patterns employed in both oil palm and rubber plantations, which might have resulted in similar values for these metrics. Indeed, as the rumple index depends mostly on the surface complexity of the top layer of the vegetation canopy [
66], regularly alternating crowns and canopy gaps, as well as the complex crown shape of rubber trees and complex architecture of oil palm fronds, could ultimately provide rougher surfaces when compared to the other land uses (
Figure 5v). Lastly,
Db, a holistic structural measure derived from the application of fractal analysis to point clouds of vegetation stands [
9], was also highly correlated to the first component identified by the PCA (
Table 2), behaving similarly to
enl,
lai and
zmax, but shifted towards the negative axis of the second component (
Figure 4). From the post-hoc pairwise analysis of variance, jungle rubber and rubber plantation plots showed significant differences in
Db, while rubber plantation plots did not differ significantly from forest plots (
Figure 5q), suggesting that the vegetation structures of the three land uses are distinct yet partly overlapping. Recent studies showed that
Db is a suitable measure to classify different types of agroforests [
95] and forest types [
13] if point clouds of high resolution are available.
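The fractal idea behind a box dimension can be illustrated with a generic box-counting sketch: the point cloud is covered with boxes of decreasing size, and the dimension is the slope of log(number of occupied boxes) against log(1/box size). This is not necessarily the exact procedure of [9]. The `box_dimension` helper, the chosen box sizes and the synthetic point sets are assumptions made purely for illustration.

```python
import math

def box_dimension(points, box_sizes):
    """Least-squares slope of log(occupied boxes) against log(1 / box size)."""
    xs, ys = [], []
    for size in box_sizes:
        # A box is identified by the integer index of each coordinate.
        occupied = {tuple(math.floor(coord / size) for coord in p) for p in points}
        xs.append(math.log(1.0 / size))
        ys.append(math.log(len(occupied)))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance

sizes = [0.5, 0.25, 0.125, 0.0625]

# Synthetic point sets inside a unit cube (not LiDAR data).
line = [((i + 0.5) / 32, 0.5, 0.5) for i in range(32)]
plane = [((i + 0.5) / 32, (j + 0.5) / 32, 0.5)
         for i in range(32) for j in range(32)]

line_dimension = box_dimension(line, sizes)
plane_dimension = box_dimension(plane, sizes)
print(line_dimension, plane_dimension)
```

A row of points behaves as a one-dimensional object and a flat sheet of points as a two-dimensional one. Point clouds of real vegetation are expected to fall between such idealized cases, with values increasing as the points fill the three-dimensional space more completely. The estimate depends on the range of box sizes, which is one reason why point clouds of high resolution matter.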
Both random forest models were trained on a relatively small number of samples of each LULC (
n = 19), but, in the case of the 5LU model, the inclusion of two classes (i.e., forest and jungle rubber) with virtually indistinguishable LiDAR-derived metrics lowered the classification accuracy. Indeed, half of the plots belonging to the forest class were erroneously attributed to jungle rubber and rubber plantation, while more than 50% of jungle rubber plots were erroneously classified as forest or rubber (
Table 3). On the other hand, in the case of the 4LU model, the overall classification accuracy was much higher (OA = 72.2%). The same can be said for Cohen’s kappa and for the producer’s and user’s accuracies, with the exception of the producer’s accuracy result for the rubber plantation class. Similar studies reported comparable results: classification accuracies ranging between 64% and 100%, when using LiDAR-derived metrics and RF to classify six different successional classes of temperate forests in the western United States [
96]; 84% overall accuracy and a value of Cohen’s kappa coefficient of 0.62 when classifying eucalypt and rainforest stands in temperate Australian forests using the RF algorithm [
97]; and an overall accuracy of 81% when using RF to classify between four tropical forest types and savanna in tropical central Africa [
42]. In the present case, rubber plantation plots proved to be the most difficult land-use type to classify by the RF model (PA = 44.4% and UA = 60.0%—4LU model), with two plots erroneously attributed to oil palm and three to transitional land (
Table 4). To a certain extent, this can be explained by looking back at the ANOVA results (
Figure 5), where rubber plantation plots were similar to oil palm plantations (i.e.,
zkurt,
rumple,
min_gapArea and
num_gaps) or to transitional lands (i.e.,
lai,
fhd_shan_eve and
num_gaps). A similar trend, although less marked, occurred when classifying oil palm plots, two of which were erroneously classified as either rubber or transitional land (
Table 4). Interestingly, after randomly merging half of the forest and jungle rubber plots, the new RF model (i.e., the 4LU model) classified the merged forest plots with 100% producer’s and user’s accuracy (
Table 4).
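For readers less familiar with the accuracy measures quoted above, the following sketch shows how overall accuracy (OA), Cohen's kappa, producer's accuracy (PA) and user's accuracy (UA) follow from a confusion matrix. The matrix values are invented for illustration and are not those of Table 3 or Table 4; the `accuracy_report` helper and the convention that rows hold reference classes and columns hold predicted classes are likewise assumptions.

```python
def accuracy_report(matrix, labels):
    """Accuracy measures from a square confusion matrix.

    matrix[i][j] is the number of plots of reference class i that were
    predicted as class j. Every class is assumed to occur at least once
    in both the reference and the predicted labels.
    """
    k = len(labels)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(k))
    row_totals = [sum(matrix[i]) for i in range(k)]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]

    overall = correct / total
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - expected) / (1.0 - expected)
    # PA: share of reference plots of a class that were recovered.
    producers = {labels[i]: matrix[i][i] / row_totals[i] for i in range(k)}
    # UA: share of plots predicted as a class that truly belong to it.
    users = {labels[j]: matrix[j][j] / col_totals[j] for j in range(k)}
    return overall, kappa, producers, users

labels = ["forest", "rubber", "oil palm"]
# Hypothetical counts (rows = reference, columns = predicted).
matrix = [
    [10, 0, 0],
    [1, 6, 3],
    [0, 2, 8],
]

overall, kappa, producers, users = accuracy_report(matrix, labels)
print(overall, kappa)
print(producers)
print(users)
```

In this made-up example the rubber class has a producer's accuracy of 0.60 but a user's accuracy of 0.75, showing how the two measures can diverge for the same class: the former is lowered by rubber plots assigned elsewhere, the latter by other plots assigned to rubber.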
When looking at the contribution (i.e., importance or mean decrease in accuracy) of individual metrics to the overall model accuracy, stand summary statistics, metrics of vertical structure and complexity/heterogeneity metrics followed one another without a particular order, with only the metrics of horizontal structure consistently showing lower importance values compared to the rest (
Figure 6). The 50th percentile of height points within a point cloud (i.e.,
zq50), standard deviation of height points (i.e.,
zsd), maximum height (i.e.,
zmax) and the effective number of layers (i.e.,
enl) were among the most important variables for both RF models (
Figure 6), which is in partial agreement with the findings of a recent study [
39]. Interestingly, Guo et al. [
39] also reported the major contribution of canopy cover in differentiating structural classes, while in the present case,
veg_cover was the fifth least important metric in the 5LU RF model (
Figure 6a) and the 13th most important metric for the 4LU model (
Figure 6b). This difference in classification importance, in relation to vegetation cover, might be related to the fact that our study was located in the tropics, a bio-climatic region where forests are characterized by higher vegetation structural complexity, which is linked to higher light absorption [
98], while the study by Guo et al. [
39] was located in the Canadian province of Alberta, characterized by more homogeneous and less structurally complex coniferous forests.
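The mean decrease in accuracy (MDA) used above is a permutation-based importance: the values of one metric are shuffled across plots and the resulting loss of classification accuracy is recorded. The sketch below illustrates the principle only. To stay self-contained it replaces the random forest with a nearest-centroid classifier and scores the same plots used for fitting, whereas random forest implementations typically use out-of-bag samples. The helper names and the toy data are assumptions.

```python
import random

def fit_centroids(rows, labels):
    """Stand-in classifier: one mean vector per class."""
    centroids = {}
    for label in sorted(set(labels)):
        members = [r for r, l in zip(rows, labels) if l == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def predict(centroids, row):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(row, centroids[lab])),
    )

def accuracy(centroids, rows, labels):
    hits = sum(predict(centroids, r) == l for r, l in zip(rows, labels))
    return hits / len(rows)

def mean_decrease_accuracy(centroids, rows, labels, feature, repeats=20, seed=0):
    """Average accuracy lost when one feature is shuffled across plots."""
    rng = random.Random(seed)
    baseline = accuracy(centroids, rows, labels)
    drops = []
    for _ in range(repeats):
        shuffled = [r[feature] for r in rows]
        rng.shuffle(shuffled)
        permuted = [r[:feature] + [v] + r[feature + 1:]
                    for r, v in zip(rows, shuffled)]
        drops.append(baseline - accuracy(centroids, permuted, labels))
    return sum(drops) / repeats

# Hypothetical plots: feature 0 separates the classes, feature 1 is constant.
forest = [[30.0 + i, 5.0] for i in range(10)]
oil_palm = [[8.0 + i, 5.0] for i in range(10)]
rows = forest + oil_palm
labels = ["forest"] * 10 + ["oil palm"] * 10

centroids = fit_centroids(rows, labels)
mda_informative = mean_decrease_accuracy(centroids, rows, labels, feature=0)
mda_constant = mean_decrease_accuracy(centroids, rows, labels, feature=1)
print(mda_informative, mda_constant)
```

Shuffling the informative feature removes much of the accuracy, whereas shuffling the uninformative one changes nothing. In a real model, a metric that carries little signal can by chance yield slightly better accuracy after shuffling, which is how MDA values at or below zero arise.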
Indeed, when looking at the variable importance for the classification of each land-use type in the 5LU model we can find some interesting trends (
Table 5). In agreement with the PCA results, stand summary statistics, metrics of vertical structure and metrics of complexity/heterogeneity were the most important variables in the classification of forest and jungle rubber plots, indicating that these land-use types are indeed characterized by the presence of taller and more structurally complex vegetation. Additionally, for both land-use types, metrics of horizontal structure tended to be among the least informative, and in some cases even had negative MDA values (
Table 5). On the other hand, metrics importance for the classification of rubber plantation plots showed more variation in the type of metrics, with measures of vertical structure (i.e.,
zq50 and
fhd_shan_div), complexity/heterogeneity (i.e.,
zskew and
zsd) and
veg_cover being among the most important, while
fhd_shan_eve,
enl,
max_gapArea and
num_gaps actually showed negative importance, which is consistent with the highly regular plantation system typical of managed production forestry plantings [
99]. The classification of oil palm plots was mainly driven by stand summary statistics (i.e.,
zmax and
lai), metrics of vertical structure (i.e.,
zq75 and
fhd_shan_div) and complexity/heterogeneity metrics (i.e.,
zsd). The ANOVA results show that oil palm plots, although having the lowest values for most of these traits, also differed significantly from most other land uses (
Figure 5a,d,h,j,u). Lastly, the variable importance for the classification of transitional land plots reflected the high variability identified in this class (
Figure 5). Indeed, lower measures of
rumple,
zsd,
zq75,
zq50,
zmax and
veg_cover, as well as higher values of
zskew,
zkurt and larger gaps in the vegetation canopy, clearly represented the nature of this highly heterogeneous land-use class. Moreover, in several cases, plots belonging to transitional land had little to no tall vegetation (>2.5 m), while other plots strongly resembled rubber and forest plots, which was likely linked to the time since the land clearing occurred [
91], and to an extent reflected the opportunistic nature of the plots used here (see
Section 2.1).
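Two of the vertical-structure metrics that recur in these rankings, fhd_shan_div and fhd_shan_eve, express how returns are spread among height layers. The sketch below shows the underlying Shannon diversity and evenness calculation under stated assumptions: the 5 m layer depth, the choice to compute evenness over occupied layers only, the `foliage_height_diversity` helper and the toy height values are illustrative and may differ from the definitions used in this study.

```python
import math

def foliage_height_diversity(heights, layer_depth=5.0):
    """Shannon diversity and evenness of returns across height layers.

    `heights` are return heights in metres; each return is assigned to a
    layer of `layer_depth` metres (an assumed bin width).
    """
    counts = {}
    for h in heights:
        layer = int(h // layer_depth)
        counts[layer] = counts.get(layer, 0) + 1
    total = len(heights)
    proportions = [c / total for c in counts.values()]
    diversity = sum(-p * math.log(p) for p in proportions)
    # Evenness is relative to the maximum diversity for the occupied layers;
    # it is set to 0 when only one layer is occupied.
    occupied = len(proportions)
    evenness = diversity / math.log(occupied) if occupied > 1 else 0.0
    return diversity, evenness

# Hypothetical plots (heights in metres).
multi_layered = [2.0, 7.0, 12.0, 17.0] * 5      # four equally filled layers
single_layered = [6.0, 7.0, 8.0, 9.0] * 5       # all returns in one layer
top_heavy = [17.0, 17.5, 18.0, 2.0] * 5         # most returns in the top layer

div_multi, eve_multi = foliage_height_diversity(multi_layered)
div_single, eve_single = foliage_height_diversity(single_layered)
div_top, eve_top = foliage_height_diversity(top_heavy)
print(div_multi, eve_multi)
print(div_single, eve_single)
print(div_top, eve_top)
```

Diversity is highest when returns are spread equally over many layers and falls to zero when a single layer holds all returns, while evenness separates plots with the same number of occupied layers but a lopsided distribution of returns among them.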
Individual LULC variable importance from the 4LU RF model followed patterns similar to those highlighted for the 5LU model. Indeed, the novel forest class (obtained from merging rainforest and jungle rubber plots) was mainly characterized by metrics of vertical structure as well as complexity/heterogeneity and stand summary statistics (
Table 6). Interestingly, rubber plantation plots experienced a complete change in the importance level of
enl, moving from a negative MDA value for the 5LU model to being the most important metric in the 4LU model. Oil palm plots were mainly characterized by metrics of vertical structure and complexity/heterogeneity, while a very similar pattern to that of the 5LU model was found for transitional lands (
Table 6).
In summary, the different land uses can be characterized as follows:
- Secondary rainforest and jungle rubber plots share the same structural properties. These properties consist of a high (≈100%) vegetation cover, with a top of the canopy layer reaching heights of 20–40 m,
lai values ranging between 2 and 4, a relatively low number of gaps in the canopy (<30 per plot), high values of box dimension,
enl, canopy surface roughness (i.e.,
rumple), entropy and standard deviation of height points and the lowest values in height kurtosis and skewness. Indeed, these characteristics could indicate the presence of higher vegetation structural complexity (compared to the other land uses), which translates to a greater volume of potential available habitat for different species to occupy, likely leading to higher levels of species richness [
21,
46].
- Rubber plantation plots in many cases share similar trait values with rainforest and jungle rubber, with a high vegetation cover, low number and size of canopy gaps and low values of height kurtosis and skewness, while displaying intermediate values between forests and oil palm plantations in terms of canopy height (i.e., 10–20 m),
lai (i.e., 1–3.5),
enl (i.e., 5–20), box dimension and entropy of height points. These characteristics result in an intermediate vegetation structural complexity between forest plots and oil palm and transitional land plots, thus providing reduced habitat availability for the local native species [
48].
- Plots of oil palm plantation and transitional land are very similar to each other in most traits considered, with a lower extent and higher variation in vegetation cover and canopy height (i.e., 5–20 m), higher number of gaps (and their size) and lower values of structural complexity (box dimension) and entropy of height points compared to the other land uses. Between the two land uses, oil palm plots had lower
lai values and higher values of
rumple,
cr and number of gaps. This is likely connected with the emblematic shape of the oil palm fronds and with their regular plantation design. Again, the lower values of ALS-derived vegetation metrics identified for oil palm plantations are indicative of this extremely simplified land-use system, which is connected to low habitat availability and reduced ecosystem functioning [
94]. Additionally, the higher canopy gap area found in oil palm plots could explain the higher within-canopy temperature observed in a previous study on a subset of the same plots [
18]. Although the results obtained for the transitional land class might be due to the highly variable (and in some cases degraded) nature of this land use, rather than to intensive management, its overall structural complexity remains quite low, potentially also resulting in low habitat provision.