Article

Using a Machine Learning Approach to Classify the Degree of Forest Management

1
Department of Animal Ecology and Tropical Biology, Biocenter, University of Würzburg, Hans-Martin-Weg 5, D-97074 Würzburg, Germany
2
Department of Bioinformatics, Biocenter, University of Würzburg, Am Hubland, D-97074 Würzburg, Germany
*
Author to whom correspondence should be addressed.
Sustainability 2023, 15(16), 12282; https://doi.org/10.3390/su151612282
Submission received: 26 June 2023 / Revised: 7 August 2023 / Accepted: 8 August 2023 / Published: 11 August 2023
(This article belongs to the Section Sustainability, Biodiversity and Conservation)

Abstract

A prerequisite for sustainable forest management is knowing the total diversity and how management affects forests. Both are poorly studied and relate to canopy diversity and comparison with primary forests. From 2001 to 2004, we fogged beetles from oaks in primary and disturbed, managed sites in Białowieża (Eastern Poland) and also in distant age-class forests. Using a machine learning (ML) method (elastic net), we identified a beetle signature based on the species abundance distribution to distinguish these forest types. The beetle communities from 2001 served as training data, with 21 signature species correctly assigning the oaks to primary and different managed forests. However, the predictive quality of the signature decreased with each year due to high spatio-temporal heterogeneity and beta diversity. To improve the power of the signature, we combined the data from all years to calculate a more general model. Due to its greater complexity, this model identified 60 species that correctly classified both the studied forests and foreign forests in Central Europe, increasing the possibility of a general classification. Further research is needed to determine whether it is possible to establish a general signature-based index on a large number of samples from different years and forest types.

1. Introduction

Predicting the impacts of forest management on biodiversity and ecosystem function and services is becoming increasingly urgent as human use pressures increase [1,2,3,4]. In forests, the greatest biodiversity is found in the soil, deadwood and canopy. The arthropod communities in these habitats differ significantly from each other in terms of taxonomic composition and functional significance [5,6,7,8]. In contrast to the soil, however, the tree crowns are much less studied, and little is known about the importance of arboreal biodiversity on ecosystem processes. This is surprising, as the high biodiversity and abundance of arboreal arthropods can be seen as a surrogate of their functional importance [9]. Many of their functions are also obvious, such as pollination; decomposition of living material; or control of species populations, including pest species [10,11]. In addition, the last few decades have seen the development of methods that allow for access to the canopy, making it easier to study the arboreal fauna [12]. Access to the canopy is thus no longer a significant barrier to intensifying research activities.
In the cultural landscapes of temperate Central Europe, forests have been shaped for centuries by forestry, which has a direct impact on biodiversity by influencing the forest structure and tree species composition. However, there are few primary forests or old-growth forests left to use as a reference for assessing the impact of management in temperate Central Europe [13], and thus, any assessment is based on comparisons with unmanaged forests that have not been used for more than a few decades. It is therefore largely unknown what ecological differences exist between the two baselines. An important feature of forests is the availability of deadwood in all stages of decomposition. Deadwood, especially standing, large-sized deadwood, is a scarce resource in commercial forests. The diverse group of xylobiont beetles are therefore predestined as indicators [14]. Consequently, one focus of forest research is how the accumulation of deadwood in commercial forests affects biodiversity [15]. In addition, other anthropogenic stressors exacerbate the effects of forest management. Prominent examples include climate change and the significant decline in the abundance of insects [9,16,17,18,19]. Given these scenarios, it is difficult to understand why the highly diverse canopy fauna is not considered when discussing how to manage forests sustainably, which, in turn, implies that the canopy fauna is functionally negligible.
How all stressors together affect biodiversity and ecosystem functioning is documented and assessed through monitoring programs; for the case of Germany, see [20]. Canopy arthropods are usually collected using flight interception traps (FITs), e.g., [4], or insecticidal knockdown (fogging), although the latter is much less common. However, it is important to note that different collection characteristics may lead to different results (Floren et al. in prep). FITs record only a small part of the canopy fauna, mainly Diptera and especially xylobiontic Coleoptera, while other taxa are sampled in low quantities. In contrast, fogging allows for the tree-specific collection of free-moving arthropods of all taxa at approximately the frequency at which they occur in the trees [21]. Fogging thus fills an important knowledge gap by providing a good overview of the quantitative distribution of all arthropod taxa in trees. This provides an opportunity to analyze the impact of forest management at the community level, taking into account such important guilds as phytophages and little specialized zoophages, mycetophages or saprophages.
The aim of this study was to investigate the importance of arboreal beetle communities for diversity and their suitability as indicators of anthropogenic stressors, such as forest management. We used primary forests as a baseline and compared how they differed from age-class forests. We fogged beetle communities from Quercus robur L. trees in the Białowieża forest (eastern Poland) and in distant isolated oak plantations over four years (2001 to 2004). (1) We characterized all forests in terms of their beetle alpha and beta diversity. (2) We asked whether it is possible to use an elastic net approach to predict whether a tree belongs to a primary or a disturbed, commercial forest based only on species abundance data. In the case of commercial forests, we distinguished between young and old matrix forests in the Białowieża Forest District and old isolated forests at a greater distance. To do this, we trained an elastic net model [22] on the arboreal beetle data from the year 2001 and calculated a beetle species signature that was used to distinguish forest types. (3) We investigated how the prediction quality changed on unknown data from other years and forest types. (4) We asked how well an overall model that integrates all years and forests performs at distinguishing forest types and whether a general species signature provides ecologically novel insights into how management affects forest ecosystems and how sustainability can be improved.

2. Materials and Methods

2.1. Study Area and Sampling Method

Most research was carried out in the Polish Białowieża forest, which covers 61,000 ha. Connected to this Polish forest is the 89,000 ha Belarusian part of the Białowieża forest [23]. The Polish part is managed by the four superintendencies of Białowieża, Browsk, Hajnówka and Białowieski National Park. Most of the forest sites where fogging was carried out were in the Hajnówka district. The forests are multi-level, of different ages, and with a high proportion of standing and lying deadwood. Many forest stands were over 200 years old, with individual trees over 400 years old. The Białowieża forest also includes the last lowland primeval forest in Central Europe that is dominated by deciduous trees. The height above sea level is 170 m. Along the rivers, there are lowland fens and alder marsh forests. The typical forest community is a mixed forest of Quercus, Carpinus and Tilia (Tilio carpinetum), which includes areas with high proportions of Picea and Pinus [24].
Canopy arthropods were sampled via insecticidal fogging from a total of 138 Q. robur trees in four consecutive years (2001–2004). Pristine nature reserves within the forest matrix served as the basis for assessing changes in the commercial forests. In addition, several planted oak forests of different ages were studied (see Figure S1). These comprised 34 young oaks (50 to 80 years old) and 13 trees from 170-year-old oak age-class forests. Finally, 35 adult oaks from five other forests that were not connected to a primary forest were studied, namely, Borecka (Poland, 2002), Nurzec and Kampinoski (Poland, 2003), and Assiniki and Berezina (Belarus, 2004). Fogging is a highly effective method for quantitative and tree-specific recording of free-moving arthropods with minimal disturbance [25]. Minimal disturbance was ensured by using only natural pyrethrum in a concentration of less than 1%. The white fog is blown away by light air currents and quickly dilutes in the air. Natural pyrethrins are also not photostable and are destroyed by sunlight within a few hours [26]. All beetles were sorted from the fogging samples, identified to species and assigned to their feeding guilds by specialists.

2.2. Statistics

All analyses were conducted in the statistical environment R [27] using the packages “vegan” [28], “lme4” [29] and packages of the Bioconductor project [30]. Heterogeneity in the species abundance distribution between forest types was tested using the Kruskal–Wallis rank sum test, together with Nemenyi’s non-parametric post hoc test. The mean numbers of species and individuals per tree were visualized as box plots. The Fligner–Killeen test was performed to test the null hypothesis that the variances in each of the groups are the same. Where necessary, p-values were adjusted according to the method of Benjamini and Hochberg. In order to compare species diversity across different sample sizes and to test for the completeness of sampling, we calculated sample-based rarefaction curves. Rarefaction and its extrapolation were performed using the R package iNEXT [31]. Fisher’s exact test was used to test for differences in guild composition.
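The Benjamini and Hochberg adjustment mentioned above can be illustrated with a minimal sketch. The study itself was carried out in R; the Python function below, its name and the example p-values are assumptions made purely for illustration and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values, not results from the study.
example = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(example)
print(adjusted)
```

Each raw p-value is multiplied by the number of tests divided by its rank, and the running minimum keeps the adjusted values monotone.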

2.3. Beta Diversity

The Morisita–Horn beta diversity was calculated using the function “vegdist” as implemented in the vegan package. The distances were visualized using nonmetric multidimensional scaling (NMDS) in k = 2 dimensions. To better illustrate the differences between forest types, we drew polygons representing the hulls of predefined samples. To make the visualization more robust, we show the hull of the samples that remains after removing the outer hull of the respective sample group. To test the association of the Morisita–Horn beta diversity with the PMI factor, we performed a permutational multivariate analysis of variance (PERMANOVA) as implemented in the “adonis2” function in vegan. The PMI factor is a qualitative factor with levels “Primary”, “Managed” and “Isolated”.
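As a rough illustration of the dissimilarity measure used here, the following sketch computes the Morisita–Horn dissimilarity between two abundance vectors. It is a plain Python stand-in for what “vegdist” computes in R, not the code used in the study; the function name and the toy abundance vectors are assumptions.

```python
def morisita_horn_dissimilarity(x, y):
    """Morisita-Horn dissimilarity between two species abundance vectors.

    x and y hold the abundances of the same species, in the same order,
    on two trees. 0 means identical relative composition, 1 means no
    shared species.
    """
    total_x, total_y = sum(x), sum(y)
    if total_x == 0 or total_y == 0:
        raise ValueError("each sample must contain at least one individual")
    # Simpson-type concentration terms for each sample.
    lambda_x = sum(a * a for a in x) / total_x ** 2
    lambda_y = sum(b * b for b in y) / total_y ** 2
    cross = sum(a * b for a, b in zip(x, y))
    similarity = 2.0 * cross / ((lambda_x + lambda_y) * total_x * total_y)
    return 1.0 - similarity


# Hypothetical abundances of three species on two trees.
tree_a = [10, 0, 5]
tree_b = [0, 7, 0]
print(morisita_horn_dissimilarity(tree_a, tree_b))  # no shared species
```

Because the index works on relative abundances, two trees with proportional counts (for example one tree with exactly twice the individuals of every species) have a dissimilarity of zero.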
The “Managed forests” were sites within the Białowieża forest matrix and the “Isolated forests” were distant oak plantations that were not connected to a primary forest. We used permutational multivariate analysis of variance (PERMANOVA) to model the beta diversity as Beta Diversity ~ Gbh + WestEast + PMI, where Gbh stands for the girth at breast height of the fogged oak trees and WestEast stands for the west–east coordinates. The north–south factor was not significant in the models, mainly due to the small north–south extent of the sampling area. The significance level of the PMI factor is shown in the associated NMDS plot for the respective year. To assess, visualize and test the differences in beta diversity, we applied the “betadisper” function of the vegan package with bias reduction and with type “centroid”. In addition, we measured the univariate beta diversity between the tree species using the Horn index, where Euclidean distances between group members and the group centroid on the basis of a principal coordinate analysis were calculated [32]. Differences in the beta diversity were visualized as box plots and tested for significance using the Kruskal–Wallis rank sum test.

2.4. Elastic Net, ROC Curves and Importance of Species

Logistic regression is a classification method in statistics and machine learning. The elastic net is a penalized generalization of logistic regression that can handle many correlated features. It combines L1 (Lasso) and L2 (Ridge) penalties, allowing for both regularization and feature selection, so that only the most important features are retained. This is particularly beneficial when dealing with high-dimensional data, multicollinearity between predictors or many irrelevant features [33,34].
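For orientation, the penalized objective for a two-class elastic net can be written as follows. This is the general form given in the glmnet documentation, not a formula stated in the article; the symbols are assumptions introduced here: $y_i \in \{0, 1\}$ codes the forest class of tree $i$, $x_i$ is its vector of species abundances, $N$ is the number of trees, and $\alpha$ and $\lambda$ are the mixing and regularization parameters.

```latex
\min_{\beta_0,\,\beta}\;
  -\frac{1}{N}\sum_{i=1}^{N}
    \Bigl[\, y_i\,(\beta_0 + x_i^{\top}\beta)
          - \log\bigl(1 + e^{\beta_0 + x_i^{\top}\beta}\bigr) \Bigr]
  \;+\; \lambda \Bigl[ \tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^{2}
          + \alpha\,\lVert\beta\rVert_1 \Bigr]
```

With $\alpha = 1$ the penalty reduces to the Lasso and with $\alpha = 0$ to Ridge. Species whose coefficients are shrunk exactly to zero by the L1 term drop out of the model, which is what makes the remaining species usable as a signature.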
To distinguish beetle communities collected from primary forest sanctuaries from those of managed age-class forests, we trained a penalized multiple logistic regression model (elastic net) on beetle communities from 2001 using the functions implemented in the “glmnet” R package [33]. Only beetle species with at least six individuals in the dataset were considered. For all analyses, the parameter alpha, which represents the mixture of Lasso (alpha = 1) and Ridge (alpha = 0) penalties in the elastic net model, was set a priori to alpha = 0.5 to balance the benefits of both methods. As a performance measure, the classification error rate was used (type.measure = “class”), which is simply the proportion of misclassified instances. The goal was to minimize this error rate and to find the best hyperparameter lambda for the elastic net model. The regularization parameter lambda was derived using cross-validation based on the function “cv.glmnet”. The optimal model yielded a beetle species signature (the selected features) and a model object for further prediction purposes. Either class assignments or class assignment probabilities can be predicted, which allows for quantifying the degree of disturbance of the investigated tree. The optimal signature for the year 2001 consisted of 21 beetle species and was further used to predict the disturbance status of trees in other forests in later years. The prediction quality was visualized using a receiver operating characteristic (ROC) curve, which reflected the prediction quality on new (unseen) fogging data. Additionally, the area under the curve (AUC), together with its 95% confidence interval, was calculated using the R package pROC [35]. An optimal classification performance results in an AUC equal to 1, while a random classification has an AUC value of around 0.5.
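The two performance measures named in this paragraph, the classification error rate and the AUC, can be sketched in a few lines. The study computed them in R with glmnet and pROC; the Python functions below, their names, the 0/1 label coding and the example scores are illustrative assumptions only. The AUC is computed here in its pairwise form, as the probability that a randomly chosen positive tree receives a higher score than a randomly chosen negative one.

```python
def classification_error(true_labels, predicted_labels):
    """Proportion of misclassified instances."""
    wrong = sum(1 for t, p in zip(true_labels, predicted_labels) if t != p)
    return wrong / len(true_labels)


def roc_auc(labels, scores):
    """Area under the ROC curve from 0/1 labels and predicted scores."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    # Count positive-negative pairs that are ranked correctly;
    # ties count as half a correct ranking.
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = managed, 0 = primary) and predicted probabilities.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
predicted = [1 if s >= 0.5 else 0 for s in scores]
print(classification_error(labels, predicted), roc_auc(labels, scores))
```

In this toy example, two of four trees are misclassified at a 0.5 threshold, while three of the four positive-negative pairs are ranked correctly.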
We used the caret package to calculate the classification measure F1 [36], which is a metric to evaluate the performance of the classification model. A perfect F1 score of 1.0 indicates that the model achieved both perfect precision and recall, meaning it made no false positive or false negative predictions. An F1 score close to 1.0 suggests that the model has a good balance between precision and recall, while an F1 score closer to 0 indicates poor performance. Since the elastic net feature selection is based on a sampling procedure, it is inherently random. Therefore, to evaluate the importance of a particular feature (beetle species), we repeated the entire procedure 9999 times. We interpreted the percentage of repetitions in which a feature was included in the model as an expression of the importance of this feature. In addition, we used the described approach to calculate an optimal species signature for the entire dataset (2001 to 2004) to assess the overall importance of species in distinguishing forest types. The signature was finally visualized with a correspondence analysis using the “cca” function of the vegan package to demonstrate the classification power on managed and unmanaged trees in other years based on the identified beetle signature.
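The F1 score and the selection-frequency notion of importance described above can be sketched as follows. This is not the caret or glmnet code used in the study; the function names, the 0/1 label coding and the placeholder species names are assumptions for illustration.

```python
from collections import Counter


def f1_score(true_labels, predicted_labels, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def selection_frequency(selected_sets):
    """Percentage of repeated model fits in which each feature was selected."""
    counts = Counter(name for selected in selected_sets for name in set(selected))
    n_runs = len(selected_sets)
    return {name: 100.0 * count / n_runs for name, count in counts.items()}


# Hypothetical outcome of four repeated fits (placeholder species names).
runs = [{"species_a", "species_b"}, {"species_a"},
        {"species_a", "species_c"}, {"species_b"}]
importance = selection_frequency(runs)
print(importance)
print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

In the study the procedure was repeated 9999 times; the four runs here only show how the percentage is formed.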

3. Results

The 138 foggings sampled 51,579 individuals in 555 beetle species. The tree-specific beetle communities collected from all forests showed significant differences in the number of beetle individuals and species per tree (Kruskal–Wallis test, p < 0.001) and the highest variance in the primary forests (Fligner–Killeen test, p < 0.001; Figure 1A). Post hoc tests showed that the young and the old forest matrix sites differed only weakly significantly in mean species and individuals per tree (Nemenyi’s test, p < 0.05) and no differences were found between the young matrix forest and the isolated oak age-class forest. The old matrix forest did not differ significantly from the primary forest in this respect. The sample-based rarefaction curves (Figure 1B) show that the diversity was highest in the primary forest in 2001 and 2002, but the reverse result was found in 2003 and 2004, as reflected in the actual number of species, which is shown for each accumulation curve. In 2002, old oak stands were also sampled in the Białowieża forest matrix. They were intermediate in diversity between the primary sites and the young oak plantations. The years 2003 and 2004 showed a different distribution of species diversity. In both years, the estimated diversity values in the primary forests were significantly lower than in the isolated oak plantations.
The NMDS shows large differences in beta diversity between beetle communities of the forest types for each year (Figure 2). In 2002, only three oaks in the isolated Borecka forest were available for comparison. None was clustered with the primary forest. The NMDS plots for the isolated forests in 2003 show that they were completely separated from the primary forest, while in 2004, there was only partial overlap. PERMANOVA confirmed significant marginal effects for the primary, managed and isolated forest sites, as well as the year of the study (factors PMI and year, p < 0.0001; see Table S1). The associated interaction term is also highly significant as determined by PERMANOVA. The differences between the forest types corresponded to the Anderson beta diversity (Figure S2). They also show a tendency toward higher, though not significantly higher, beta diversity in the primary sites for 2003 and 2004.
ROC curves show how accurately the elastic net has learned to discriminate between primary and managed forests. The 21 beetle species signature (Figure S3) determined from the 2001 training data allowed for perfect separation of primary and managed forests (Figure 3A, red curve, training error = 0, AUC = 1). Even with unknown data from other years and forests, the trees were assigned to the correct forest type with a high degree of accuracy. Due to the increasing spatio-temporal variability, the prediction probability decreased in the following years and the AUC became smaller. The confidence intervals of the AUCs indicate that all ROC curves were significantly different from chance.
Alternatively, we quantified the classification performance with the F1 score, which showed a similar pattern (F1 scores ranged from 0.80 and 0.53 in 2001 to 0.74 in 2004). In addition, the prediction power is demonstrated in the box plots, which visualize the distribution of the predicted trees (Figure 3B). The young and old matrix forests were also correctly predicted for the year 2002, although this difference was not trained for. In contrast, the similarity of the 170-year-old matrix forest to the primary forest is obvious.
The species signature calculated for the whole dataset, covering all years and forest types, comprised 60 species and was about three times larger than the 2001 signature (Figure S4) because it needed to incorporate the spatio-temporal heterogeneity contained in all four years. Resampling the model resulted in 94 species that were important for distinguishing all forest types (Figure 4A). The number of species with an importance greater than 44.9% included all 60 species of the signature. Of all the important species, 38 species represented primary forests and 22 species represented commercial forests. Twenty species with importances greater than 50% were identified from the 2001 signature. The quality of the separation of primary and commercial forests based on this feature selection is shown in the correspondence analysis (Figure 4B), where species of lesser importance also allow for separation. The separation of primary and commercial forests was only weakly associated with a feeding guild (Fisher’s test, p = 0.03) and this was due to the different frequency distributions between 2003 and 2004.
The signature beetles were selected from the elastic net model based on the species abundance distribution to provide an optimal separation between the primary and managed forests, but according to what criteria? Assigning all species to their feeding guilds shows that some guilds were more important than others (Figure 5). All samples were dominated by phytophagous beetles, which represented 67% of the whole dataset and even 84% of the primary forest signature, while the weighting was reversed for the managed forests. The opposite was true for zoophages and mycetophages, with zoophages being more downgraded in the primary forest signature. Xylophagous beetles were always represented in similar proportions, both in the signature and in the data. The overall low number of saprophages was weighted down in the primary forest signature, but up in the managed forests. The xylobiont beetles accounted for between 14% and 15% of the beetles in the entire dataset, with little change in the forest-type signatures.

4. Discussion

4.1. Beetle Communities on Oaks in Primary and Commercial Forests

For the first time, insecticidal knockdown makes it possible to collect whole communities of arboreal taxa and use them to distinguish primary from managed forests based on an elastic net model. This was clearly confirmed by the results on canopy beetles associated with Q. robur trees. The top-down approach has many advantages. In particular, it provides the best compositional overview of the diversity of a large number of canopy taxa representing a broad faunistic and functional spectrum [6]. This approach is not limited to classical indicator groups, such as commonly used ground beetles, e.g., [37], or xylobiont beetles, e.g., [38]. Fogging data can therefore provide a more comprehensive basis for a better understanding of interactions between species in different guilds or of different taxa. The results also allow for producing a hypothesis regarding how the canopy fauna interacts with other components of the system, such as the soil fauna, the rhizosphere or the atmosphere [39,40,41]. This shows that there are no longer major methodological obstacles to the inclusion of tree canopy fauna in ecosystem research. This is long overdue and, in our view, fundamental to understanding ecosystem processes. Our research in the Białowieża forest should be seen in this context. The primary forest found here is considered to be the last in Central Europe [42]. There are also several protected areas within the surrounding forest matrix that have retained their original primary forest character. We used these to assess the true impact of forest management on biodiversity. There is no such possibility in central European countries, where only a few lowland primeval forests exist. As a result, the effects of forest management are assessed on unmanaged areas [43], often without realizing what has actually been lost.
Primary forests were characterized by great heterogeneity in the distribution of canopy species between trees (Figure 1). Species richness was, therefore, not always the highest in primary forests, even when the overall diversity was the highest. What particularly distinguished the primary forests from the disturbed, commercial forests was the high beta diversity (Figure 2 and Figure S2). This was also evident, although not significant, in 2003 and 2004 when the beetle diversity in the primary forest trees was unusually low. The high heterogeneity of primary forests made analysis difficult, as long-term studies and large sample sizes were required to statistically validate differences between different forests. The importance of remnants of primary forest was demonstrated by the high diversity of oak plantations which grew within the Białowieża forest matrix. Even the 50-year-old sites had far greater diversity per tree than the isolated 170-year-old oak plantations, which was a consequence of habitat connectivity and continuous colonization, demonstrating why habitat connectivity is crucial for conservation [44].

4.2. Machine Learning and System Analysis

Our research confirmed that the use of artificial-intelligence-based analysis could lead to significant knowledge gains [45]. Applying the elastic net approach to 33 oak trees sampled in 2001, we identified a species signature comprising 21 beetle species (Figure S3). This signature could be used not only to classify different forest types but also to predict, for newly fogged trees, whether an individual oak was growing in a primary or a managed forest based on its beetle community. It is worth noting that the calculated species signature was also able to quantify the degree of forest disturbance, even though the model had only been trained on the 2001 data to distinguish primary from disturbed forests. The class assignment probabilities quantified the effects of forest management and allowed for distinguishing differently aged gradient forests from isolated forests. This shows how pronounced the differences in arboreal communities are. Even the 170-year-old matrix forest stands differed significantly from primary sites in terms of beta diversity and model probability, which suggests that it was mainly the high habitat complexity of primary forests that provided a much wider range of niches supporting complex communities [46]. This was even more remarkable as the gradient forests were directly connected with primary sites. The tree class assignment probability could therefore be regarded as a score to characterize forest sites of different management. Predicting whether a beetle community belonged to a particular forest type became increasingly difficult when the spatio-temporal heterogeneity increased. This was illustrated by the 2001 species signature, which showed decreasing species overlap over the years, reducing the power of the predictions, as demonstrated by the ROC curves, confidence intervals of the AUCs and the F1 scores (Figure 3).
But does this signature improve our understanding of how management changes primary forests? Ultimately, it is the ecological interpretation of the signature species that determined the quality of the model prediction. First, the low number of tourist species in the signature, as well as in the original data, shows that fogging could clearly separate the canopy fauna from that of the herb layer and the soil. Therefore, the canopy communities could be described as distinct biocoenoses. Second, the heterogeneity in the distribution of beetle species abundances (beta diversity) in primary forests not only indicates an important feature that distinguishes them from commercial forests but shows the ability of beetle communities to detect it. However, the calculated species signature had very little in common with typical indicator species [47]. Signature species with high information (Figure 4, Figure S3 and Figure S4) were collected in large numbers and with high consistency from many trees. As discussed for abundance-based species distribution models, this could indicate that these signature species represent functionally important species [48]. A characteristic of most signature species is that they are not usually selected for monitoring because each species contributes only partial information to the overall result.
The faunistic analysis results show that 9 of the 38 primary forest signature species could be potential primeval indicators. The frequent phytophagous species of humid, undrained forests were striking, like Polydrusus flavipes and Phyllobius arborator (both Curculionidae), although the latter species is common, especially in the humid mountain forests [49]. They are representative of habitat conditions that are rarely found in commercial forests. In contrast, zoophages and mycetophages were represented in lower abundance in the signature (Figure 5). Remarkably, xylobiont beetles, many of which are only found in and are indicative of old-growth forests [14], played a minor role. This is not a methodological artifact, as demonstrated by studies of 150 trees in Germany, where xylobionts accounted for an average of 65% of all beetles [21]. The low abundance of xylobiont beetles suggests that the supply of suitable deadwood habitats was similar in the primary and commercial forests studied. This is also indicated by the two rare primary forest indicator species, Phryganophilus auritus (Melandryidae) and Anthonomus pinivorax (Curculionidae), which were identified by the elastic net model to distinguish the old isolated oak age-class forests from the primary sites in 2003 and 2004. There were also 21 typical forest species (55%) that did not require an old forest for their survival. More faunistic details will be published elsewhere; see also [50].

4.3. Is It Possible to Generalize the Proposed Model to Oaks in Central Europe?

Our results raise the question of whether it is possible to use survey data from many years and forest types to identify a general, optimal species signature that has the power to consistently distinguish primary from managed forests and to assess the level of disturbance in oak forests. To test this, we calculated a signature of 60 beetle species from all 138 oak trees included in this study based on all four years and forest types and with the same parameter settings (Figure S4). The first applications on oak forests in Central Europe we have studied over the last 25 years [6] show promising results. Despite the enormous heterogeneity within these data, we found 40 species of the total 60 signature species that show a disturbance gradient ranging from relatively species-poor oak plantations to old, semi-natural oak woodlands to oaks in floodplain hardwood forests with a primary forest-like character (Floren’s data). These results suggest that it is possible to identify a unique species signature for primary forests.
However, there was a clear limitation, as the proposed beetle signature was based on only 60 oak beetle species. The higher the beta diversity, the lower the expected intersection of the signature species with the beetles of the considered new tree and the higher the variability of the score. It seems that a good prediction was possible even if only a few characteristic species occurred on an oak tree. Due to the lack of other primary forests in Central Europe, the predictive power of the elastic net model could not be verified for other primary forests. Refugial forests, such as those in the Caucasus in Georgia or Sochi in Russia, would be suitable for study. However, this must be shown with further research.
Our results raise the general question of how much disturbance forests can tolerate without suffering major losses in biodiversity, and thus jeopardizing ecosystem functions and services. This is particularly true of the neglected, highly diverse canopy fauna and has taken on new importance in view of climate change, which has the potential to exacerbate the effects of forest management by altering temperature and humidity, changing the distribution patterns of flora and fauna, and risking serious disruption of biotic interactions [9,16,17,18,19]. Knowledge of the actual impacts of forest management on biodiversity is a prerequisite for biodiversity conservation, especially in light of the expected changes. This has not been considered due to the lack of primary forests and highlights the current gaps in sustainable forest management.

5. Conclusions

The aim of this study was to predict the impact of forest management on beetle communities in trees and to assess the importance of primary forests for ecosystem diversity. These analyses require the most complete coverage of canopy arthropods possible, which was achievable via fogging. By applying a machine learning approach, it was possible to identify a specific signature of beetle species, which was derived from only the species abundance distribution. This signature was able to discriminate primary and commercial forests with high accuracy. Further studies will show whether it is possible to identify a general signature to classify unknown forests despite large spatial and temporal heterogeneity and beta diversity. The ecological interpretation of the species signature provides a new perspective on how ecosystems respond to anthropogenic stresses. Consideration of canopy biodiversity and its function is a prerequisite for sustainable forest management, especially in light of the impending reorientation of forestry due to climate change.

Supplementary Materials

The following supporting information can be downloaded from https://www.mdpi.com/article/10.3390/su151612282/s1. Figure S1: Number of foggings carried out in all forest types and years. The distances of the isolated forests from the Białowieża forest are also shown. Table S1: Results of the overall PERMANOVA model. Figure S2: Box plots for each year show the distribution of the Anderson beta diversity based on the Horn index between the primary and managed forests. Figure S3: Signature of the abundance distribution of the 21 beetle species calculated using the elastic net model for the 2001 data. Figure S4: Beetle signature calculated for the complete data set using the elastic net model. The calculation was based on all species with a constancy larger than five trees. Species that were identified as potential primary forest indicators are highlighted in color. Only species with an importance larger than 45% are shown. Abbreviations after species names: P = Primary, D = Disturbed, followed by the importance of the species. S = Species also identified in the 2001 signature (see text).

Author Contributions

A.F. designed the study and performed the fieldwork. T.M. and A.F. performed the statistical analyses. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the VW Foundation, I/77048, and by the Institute of Animal Ecology and Tropical Biology of the University of Würzburg.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

This study would have been impossible without the help of J. Ługowoj from the forestry department of Hajnówka. Thanks to W. B. Jedryczkowski for his help in establishing the project and M. Mackiewicz who helped with choosing the study plots. For help during fieldwork, we particularly thank P. Sprick, S. Otto and A. Q. Aleman. P. Sprick also identified most of the beetle species, gave much insight into the ecology of many species and helped to interpret the distribution of the species. He also provided much valuable information for the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Houlahan, J.E.; McKinney, S.T.; Anderson, T.M.; McGill, B.J. The priority of prediction in ecological understanding. Oikos 2017, 126, 1–7.
  2. Civantos-Gómez, I.; García-Algarra, J.; García-Callejas, D.; Galeano, J.; Godoy, O.; Bartomeus, I. Fine scale prediction of ecological community composition using a two-step sequential Machine Learning ensemble. PLoS Comput. Biol. 2021, 17, e1008906.
  3. Nahrung, H.F.; Liebhold, A.M.; Brockerhoff, E.G.; Rassati, D. Forest Insect Biosecurity: Processes, Patterns, Predictions, Pitfalls. Annu. Rev. Entomol. 2023, 68, 211–229.
  4. Staab, M.; Gossner, M.M.; Simons, N.K.; Achury, R.; Ambarlı, D.; Bae, S.; Schall, P.; Weisser, W.W.; Blüthgen, N. Insect decline in forests depends on species’ traits and may be mitigated by management. Commun. Biol. 2023, 6, 338.
  5. Coleman, D.C.; Callaham, M.A.; Crossley, D.A., Jr. Fundamentals of Soil Ecology, 3rd ed.; Academic Press: Cambridge, MA, USA, 2017.
  6. Floren, A.; Linsenmair, K.E.; Müller, T. Diversity and Functional Relevance of Canopy Arthropods in Central Europe. Diversity 2022, 14, 660.
  7. Löfroth, T.; Birkemoe, T.; Shorohova, E.; Dynesius, M.; Fenton, N.J.; Drapeau, P.; Tremblay, J.A. Deadwood Biodiversity. In Boreal Forests in the Face of Climate Change: Sustainable Management; Girona, M.M., Morin, H., Gauthier, S., Bergeron, Y., Eds.; Springer International Publishing: Cham, Switzerland, 2023; pp. 167–189.
  8. Dajoz, R. Écologie et Biologie des Coléoptères Xylophages de la Hêtraie; Vie et Milieu, Observatoire Océanologique—Laboratoire Arago, hal-02947236; Masson: Paris, France, 1966; pp. 637–764.
  9. Suárez-Castro, A.F.; Raymundo, M.; Bimler, M.; Mayfield, M.M. Using multi-scale spatially explicit frameworks to understand the relationship between functional diversity and species richness. Ecography 2022, 2022, e05844.
  10. Kenis, M.; Hurley, B.P.; Hajek, A.E.; Cock, M.J.W. Classical biological control of insect pests of trees: Facts and figures. Biol. Invasions 2017, 19, 3401–3417.
  11. Freeman, B.E. Ecological and Economic Entomology: A Global Synthesis; CABI: Boston, MA, USA, 2021; p. 695.
  12. Lowman, M.D.; Schowalter, T.; Franklin, J. Methods in Forest Canopy Research; University of California Press: Berkeley, CA, USA, 2012.
  13. O’Brien, L.; Schuck, A.; Fraccaroli, C.; Pötzelsberger, E.; Winkel, G.; Lindner, M. Protecting Old-Growth Forests in Europe—A Review of Scientific Evidence to Inform Policy Implementation; European Forest Institute: Joensuu, Finland, 2021; pp. 1–104.
  14. Eckelt, A.; Müller, J.; Bense, U.; Brustel, H.; Bußler, H.; Chittaro, Y.; Cizek, L.; Frei, A.; Holzer, E.; Kadej, M.; et al. “Primeval forest relict beetles” of Central Europe: A set of 168 umbrella species for the protection of primeval forest remnants. J. Insect Conserv. 2018, 22, 15–28.
  15. Parajuli, R.; Markwith, S.H. Quantity is foremost but quality matters: A global meta-analysis of correlations of dead wood volume and biodiversity in forest ecosystems. Biol. Conserv. 2023, 283, 110100.
  16. Inouye, D.W. Climate change and phenology. WIREs Clim. Chang. 2022, 13, e764.
  17. Seidl, R.; Thom, D.; Kautz, M.; Martin-Benito, D.; Peltoniemi, M.; Vacchiano, G.; Wild, J.; Ascoli, D.; Petr, M.; Honkaniemi, J.; et al. Forest disturbances under climate change. Nat. Clim. Chang. 2017, 7, 395–402.
  18. Sallé, A.; Cours, J.; Le Souchu, E.; Lopez-Vaamonde, C.; Pincebourde, S.; Bouget, C. Climate Change Alters Temperate Forest Canopies and Indirectly Reshapes Arthropod Communities. Front. For. Glob. Chang. 2021, 4, 710854.
  19. Harvey, J.A.; Tougeron, K.; Gols, R.; Heinen, R.; Abarca, M.; Abram, P.K.; Basset, Y.; Berg, M.; Boggs, C.; Brodeur, J.; et al. Scientists’ warning on climate change and insects. Ecol. Monogr. 2023, 93, e1553.
  20. Bolte, A.; Ammer, C.; Kleinschmit, C.; Kroiher, F.; Krüger, I.; Meyer, P.; Michler, B.; Müller-Kroehling, S.; Sanders, T.; Sukopp, U. National forest biodiversity monitoring. Nat. Landsch. 2022, 8, 398–401.
  21. Floren, A.; Horchler, P.J.; Müller, T. The Impact of the Neophyte Tree Fraxinus pennsylvanica [Marshall] on Beetle Diversity under Climate Change. Sustainability 2022, 14, 1914.
  22. Zou, H.; Hastie, T. Regularization and Variable Selection Via the Elastic Net. J. R. Stat. Soc. Ser. B Stat. Methodol. 2005, 67, 301–320.
  23. Jedrzejewska, B.; Jedrzejewski, W. Predation in Vertebrate Communities: The Białowieża Primeval Forest as a Case Study; Springer: Berlin/Heidelberg, Germany, 1998.
  24. Faliński, J.B. Vegetation dynamics in temperate lowland primeval forests. In Ecological Studies in Białowieża Forest; Geobotany 8; Springer: Dordrecht, The Netherlands, 1986; p. 537.
  25. Floren, A. Sampling arthropods from the canopy by insecticidal knockdown. In Manual on Field Recording Techniques and Protocols for All Taxa Biodiversity Inventories ABC Taxa; Eymann, J., Degreef, J., Häuser, C., Eds.; Belgian Development Cooperation: Brussels, Belgium, 2010; Volume Part 1, pp. 158–172.
  26. Oguh, C.E.; Okpaka, C.O.; Ubani, C.S.; Okekeaji, U.; Joseph, P.; Amadi, E. Natural Pesticides (Biopesticides) and Uses in Pest Management—A Critical Review. Asian J. Biotechnol. Genet. Eng. 2019, 2, 1–18.
  27. R Core Team. R: A Language and Environment for Statistical Computing, 4.2.2; R Foundation for Statistical Computing: Vienna, Austria, 2022.
  28. Oksanen, J.; Simpson, G.; Blanchet, F.G.; Kindt, R.; Legendre, P.; Minchin, P.R.; O’Hara, R.B.; Solymos, P.; Stevens, M.; Szoecs, E.; et al. Vegan: Community Ecology; R Package Version 2.6-4; 2022. Available online: https://rdrr.io/cran/vegan/ (accessed on 25 June 2023).
  29. Bates, D.; Maechler, M.; Bolker, B.; Walker, S. Fitting Linear Mixed-Effects Models Using lme4. J. Stat. Softw. 2015, 67, 1–48.
  30. Huber, W.; Carey, V.J.; Gentleman, R.; Anders, S.; Carlson, M.; Carvalho, B.S.; Bravo, H.C.; Davis, S.; Gatto, L.; Girke, T.; et al. Orchestrating high-throughput genomic analysis with Bioconductor. Nat. Methods 2015, 12, 115–121.
  31. Chao, A.; Gotelli, N.J.; Hsieh, T.C.; Sander, E.L.; Ma, K.H.; Colwell, R.K.; Ellison, A.M. Rarefaction and extrapolation with Hill numbers: A framework for sampling and estimation in species diversity studies. Ecol. Monogr. 2014, 84, 45–67.
  32. Anderson, M.J.; Ellingsen, K.E.; McArdle, B.H. Multivariate dispersion as a measure of beta diversity. Ecol. Lett. 2006, 9, 683–693.
  33. Friedman, J.; Hastie, T.; Tibshirani, R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J. Stat. Softw. 2010, 33, 1–22.
  34. Tay, J.K.; Narasimhan, B.; Hastie, T. Elastic Net Regularization Paths for All Generalized Linear Models. J. Stat. Softw. 2023, 1, 1–31.
  35. Robin, X.; Turck, N.; Hainard, A.; Tiberti, N.; Lisacek, F.; Sanchez, J.-C.; Müller, M. pROC: An open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinform. 2011, 12, 77.
  36. Kuhn, M. Caret: Classification and Regression Training; R Package Version 6.0-93; 2022. Available online: https://ui.adsabs.harvard.edu/abs/2015ascl.soft05003K/abstract (accessed on 25 June 2023).
  37. Magura, T.; Tóthmérész, B.; Bordán, Z. Effects of nature management practice on carabid assemblages (Coleoptera: Carabidae) in a non-native plantation. Biol. Conserv. 2000, 93, 95–102.
  38. Burner, R.C.; Birkemoe, T.; Stephan, J.G.; Drag, L.; Muller, J.; Ovaskainen, O.; Potterf, M.; Skarpaas, O.; Snall, T.; Sverdrup-Thygeson, A. Choosy beetles: How host trees and southern boreal forest naturalness may determine dead wood beetle communities. For. Ecol. Manag. 2021, 487, 119023.
  39. Vlot, A.C.; Rosenkranz, M. Volatile compounds—The language of all kingdoms? J. Exp. Bot. 2022, 73, 445–448.
  40. Castaño, C.; Camarero, J.J.; Zas, R.; Sampedro, L.; Bonet, J.A.; Alday, J.G.; Oliva, J. Insect defoliation is linked to a decrease in soil ectomycorrhizal biomass and shifts in needle endophytic communities. Tree Physiol. 2020, 40, 1712–1725.
  41. Brosset, A.; Blande, J.D. Volatile-mediated plant–plant interactions: Volatile organic compounds as modulators of receiver plant defence, growth, and reproduction. J. Exp. Bot. 2021, 73, 511–528.
  42. Jaroszewicz, B.; Cholewińska, O.; Gutowski, J.M.; Samojlik, T.; Zimny, M.; Latałowa, M. Białowieża Forest—A Relic of the High Naturalness of European Forests. Forests 2019, 10, 849.
  43. Fischer, M.; Bossdorf, O.; Gockel, S.; Hänsel, F.; Hemp, A.; Hessenmöller, D.; Korte, G.; Nieschulze, J.; Pfeiffer, S.; Prati, D.; et al. Implementing large-scale and long-term functional biodiversity research: The Biodiversity Exploratories. Basic Appl. Ecol. 2010, 11, 473–485.
  44. Ranius, T.; Widenfalk, L.A.; Seedre, M.; Lindman, L.; Felton, A.; Hämäläinen, A.; Filyushkina, A.; Öckinger, E. Protected area designation and management in a world of climate change: A review of recommendations. Ambio 2023, 52, 68–80.
  45. Pichler, M.; Hartig, F. Machine learning and deep learning—A review for ecologists. Methods Ecol. Evolut. 2023, 14, 994–1016.
  46. Storch, F.; Boch, S.; Gossner, M.M.; Feldhaar, H.; Ammer, C.; Schall, P.; Polle, A.; Kroiher, F.; Müller, J.; Bauhus, J. Linking structure and species richness to support forest biodiversity monitoring at large scales. Ann. For. Sci. 2023, 80, 3.
  47. Ette, J.-S.; Sallmannshofer, M.; Geburek, T. Assessing Forest Biodiversity: A Novel Index to Consider Ecosystem, Species, and Genetic Diversity. Forests 2023, 14, 709.
  48. Waldock, C.; Stuart-Smith, R.D.; Albouy, C.; Cheung, W.W.L.; Edgar, G.J.; Mouillot, D.; Tjiputra, J.; Pellissier, L. A quantitative review of abundance-based species distribution models. Ecography 2022, 2022, 1–18.
  49. Rheinheimer, J.; Hasseler, M. Die Rüsselkäfer Baden-Württembergs; LUBW Landesanstalt für Umwelt, Messungen und Naturschutz Baden-Württemberg: Karlsruhe, Germany, 2010; p. 944.
  50. Sprick, P.; Floren, A. Canopy leaf beetles and weevils in the Białowieża and Borecka forests in Poland (Col., Chrysomeloidea, Curculionoidea). Pol. Pismo Entomol. 2007, 76, 75–100.
Figure 1. (A) Box plots of beetle species and individuals per tree collected from Quercus show significant differences between the forest types (Kruskal test p < 0.001). The variance was highest in the primary forests and lowest in the young matrix forests and isolated forests (Fligner’s test p < 0.001). Each year contains different forest sites. (B) The sample-based rarefaction curves differ in species diversity between the primary and managed forests in all years. The dashed lines show the extrapolated curve and the numbers at the endpoint of the curves refer to the actual species richness. The considered number of trees is given in brackets in the figure legend. Inserted tables show the species numbers estimated based on the same subsample size.
Figure 2. The non-metric multidimensional scaling (NMDS) based on the Horn beta diversity of beetles sampled from Quercus trees in four years shows that forest types differ between primary and managed forests for each year. The p-values in the top-right of each figure indicate the significance level of the factor “Management” in PERMANOVA. The legend distinguishes primary forests (blue), managed forests in the Białowieża matrix (red) and managed isolated oak forests (green).
Figure 3. (A) ROC curves derived for all years are visualized, where the red line shows an optimal training classification. Additionally, the corresponding AUCs with their 95% confidence intervals are given in the bottom right corner. (B) The box plots visualize the class membership probabilities, which can be interpreted as an index measuring the degree of disturbance of a tree. Probabilities close to one classify trees as disturbed, while probabilities close to zero indicate typical primary trees. Prim—primary, Y:Man/Old:Man—young/old managed forests in the Białowieża forest matrix, Old:Iso—old isolated forests.
Figure 4. (A) The bar chart shows the total importance of all beetle species from the overall model fitted using all years (2001–2004) and forest types that were found on more than five trees. All species with an importance greater than 90% are shown (see Figure S4 for the detailed signature). The coloring indicates whether the species is representative of primary or disturbed forests. (B) The correspondence analysis (CA) results for all years show the distinctiveness of the selected signature species. Their importance is also shown. The extreme species in the CA are plotted; the legend shows the number of foggings carried out in primary and disturbed forests.
Figure 5. The bar charts compare the relative proportions of beetle individuals per guild of the beetle data and of the signature. Blue—primary forest, red—disturbed forest. Light color—collected data (all), dark color—selected in the signature. Feeding guilds: The dominance of phytophages (phy) was striking and even higher in the signature. Zoo—zoophages, myc—mycetophages, xyl—xylophages, sap—saprophages. Xylobiont beetles: Bar charts for the same strata that compare the proportion of xylobiont beetles with the proportion of other beetles. Xylobionts played only a minor role in distinguishing between primary and managed forests. The tables summarize absolute counts and relative proportions for all strata. Coloring refers to the absolute number of beetles.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Floren, A.; Müller, T. Using a Machine Learning Approach to Classify the Degree of Forest Management. Sustainability 2023, 15, 12282. https://doi.org/10.3390/su151612282
