1. Introduction
Ecologists and foresters have sought to understand what factors influence the growth variation of plants and trees, which is essential to explaining forest productivity and dynamics [
1]. Among these factors, the most important internal factors are physiology, species, age, and genetic characteristics [
2,
3], and the most important external factors are climatic conditions, soil-slope, type of competition, and nearby trees [
4], besides natural disturbances and silvicultural cutting practices [
5].
The basal area increment (BAI) has been modeled based on individual tree size, stand development, and other variables of site and competition to analyze the influence of competition and aridity on tree productivity [
6]. Individual-based modeling is one of the most comprehensive and detailed approaches to predicting individual trees’ growth. It has been applied to simulate future forest management scenarios [
7], predict/explain wood quality [
8], predict habitat quality [
9], and plan forest management activities [
10].
Competition is a key process in regulating tree and stand dynamics. In mixed forests, the effect of species interactions can be assessed by quantifying the influence of intra- and inter-specific competition on tree growth. Over the years, studies [
11,
12,
13] have reported on the competition between angiosperms and conifers primarily because angiosperms reportedly replace conifers in most forest types in the tropics [
13]. In mixed conifer-angiosperm forests in the Southern Hemisphere, the long-term dominance relationship between conifers and angiosperms is known as the “temporal stand replacement” or Lozenge model and has been reported in several studies [
13,
14,
15].
Mixed Ombrophilous Forests (MOF) consist of a mixture of tropical and temperate floras formed by hundreds of tree species. The Brazilian Pine (
Araucaria angustifolia (Bertol.) Kuntze, Araucariaceae) is characteristic of and exclusively native to the MOF and is considered the most important coniferous tree of Brazil because of its high-quality, medium-density wood and its valued edible seeds [
16,
17,
18]. Intensive and often indiscriminate harvesting has significantly reduced the area of the forest.
A. angustifolia is protected by law (Law RS 9519-92) and is included in the red lists of endangered species of the Brazilian government and the International Union for Conservation of Nature [
19]. Logging of this species is therefore prohibited [
17].
Several studies have applied improved statistical techniques and mathematical methods to model forest increment and to determine the relationship between growth rates and various internal and external independent variables. These methods and techniques include linear and non-linear regression, fuzzy logic, Mixed Models (MM), and, more recently, Artificial Neural Networks (ANNs) [
20,
21,
22,
23]. Because an ANN synthesizes information in a single network, it can handle large amounts of data from different locations, genotypes, climatic conditions, sites, and silvicultural interventions, among other site characteristics that influence tree growth. Continuous and categorical variables can thus be used simultaneously in a single trained network to reach accurate estimates [
4,
24].
ANNs are a subset of artificial intelligence (AI) techniques and an efficient alternative for estimating tree growth [
25,
26,
27], the prognosis of tree diameter, height, and volume [
28,
29,
30], survival and mortality [
31], biomass and carbon [
32,
33]—applied with remote sensing data [
34,
35]—as well as species richness and composition mapping [
36]. ANNs are used to improve estimates in mixed forests since modeling in this type of forest is complex and must consider species interactions, long dynamics of spatial or temporal gradients in resource availability, and climatic conditions. In estimating the volume increment of the uneven-aged mixed Hyrcanian forest in Iran, ANN and the support vector machine were more accurate than other machine learning methods and traditional least squares regression [
28]. In Brazil, ANNs were used to estimate the biomass and volume of different species of Cerrado (Brazilian savanna), obtaining better results than the non-linear mixed effects (NLME) and Random Forest (RF) models [
37]. ANN was also applied in MOF to estimate the bark thickness of
Araucaria angustifolia [
38], but the application of AI techniques to improve estimates of species growth in this type of forest must be further investigated. This study, therefore, aims to model BAI for
Araucaria angustifolia (Bertol.) Kuntze in a mixed ombrophilous forest in Southern Brazil. Our specific objectives are to: i. separate trees into groups according to their Importance Value Index (IVI); ii. characterize the effect of competition between groups; and iii. develop models using artificial neural networks (ANNs).
2. Materials and Methods
2.1. Study Area
This research was developed at the Sustainable Use Conservation Unit in the São Francisco de Paula National Forest (FLONA-SFP) [29°25′ S and 50°23′ W]. The MOF study area occupies 902 ha (≈56%) of a total area of 1606.7 ha.
The FLONA-SFP is located about 930 m above sea level in the northeastern region of the state of Rio Grande do Sul, in the municipality of São Francisco de Paula. The climate is medium mesothermal (Cfb), a temperate climate with annual rainfall above 2000 mm evenly distributed throughout the year and a mean annual temperature below 15 °C [
39].
Table 1 lists the definitions used in this study.
2.2. Characteristics of the Forest
Mixed Ombrophilous Forests (MOF) are subtropical mixed conifer-hardwood forests that form part of the Atlantic Forest floristic domain in South America. They are characterized by the presence of
Araucaria angustifolia (Bertol.) Kuntze (
Figure 1) [
14], which occupies the upper canopy of the forest and dominates the vegetation [
40]. The MOF is considered one of the most threatened phytophysiognomies in Brazil [
41] since intensive and often indiscriminate harvesting in past decades has significantly reduced the original area occupied by this forest. Current legislation thus restricts forest management by prohibiting the harvest of the most important timber tree species found in this forest, including
Araucaria angustifolia [
42].
The study site has low floristic diversity with a Shannon diversity index of 1.58 and ecological dominance of a few species with a Pielou equability index of 0.93 [
43].
A. angustifolia had an Importance Value Index (IVI) of 41.60% and accounted for 79.29% of the total basal area of the study site. The most frequent species found were
Araucaria angustifolia,
Casearia decandra Jacq.,
Blepharocalyx salicifolius (Kunth) O.Berg.,
Ilex brevicuspis Reissek, and
Ilex paraguariensis A.St.-Hil (
Table 2).
2.3. Data Collection
The data were collected from a Long-Term Ecological Research (LTER) plot installed in 2002 and re-measured annually over eight years. This plot was selected because it had the largest number of trees, the largest number of A. angustifolia, and an adequate conservation stage. Models were developed only for A. angustifolia. Variables of size, site, and competition were considered for 331 A. angustifolia trees. Measurements were conducted in 25 square sample plots with 20 m sides, totaling one hectare (ha). Twenty plots (80%) were used for training (fitting) the BAI models and five plots (20%) for validation. This partition was designed spatially so that the selected trees covered the entire variability of the study area.
First, the circumference measured at breast height (hereafter, c) was converted to diameter at breast height (d = c/π). The total height (h) of individual trees was measured using a Vertex IV hypsometer (Haglöf, Sweden). From these measurements, the Assmann dominant height (h100), the basal area per hectare (G), the number of trees per hectare (N), and the average diameter (d_average) were obtained.
The competition effect of
A. angustifolia trees was assessed using competition indices proposed by Lorimer [
44] (Equations (1)–(4)). In addition, the distance-dependent index described by Hegyi [
45] was also considered (Equations (5)–(8)). Finally, the total competition of the target tree was classified according to the groups described in
Table 2.
where Lorimer: competition index of Lorimer; the numerical sub-indices denote (1) intraspecific competition with A. angustifolia, (2) the first group of species that cause interspecific competition with A. angustifolia, and (3) the second group of species that cause interspecific competition with A. angustifolia (see Table 2); d_i and d_j: diameter at 1.30 m above ground level (d) of target tree i and competitor j (cm).
where Hegyi: competition index of Hegyi; the numerical sub-indices denote (1) intraspecific competition with A. angustifolia, (2) the first group of species that cause interspecific competition with A. angustifolia, and (3) the second group of species that cause interspecific competition with A. angustifolia (see Table 2); d_i and d_j: diameter at 1.30 m above ground level (d) of target tree i and competitor j (cm); dist_ij: distance between target tree i and competitor j (m).
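As an illustration of how the two indices can be computed per competitor group, the following is a minimal sketch that assumes the standard forms of the indices: Lorimer as the sum of d_j/d_i over competitors, and Hegyi as the sum of (d_j/d_i)/dist_ij. The function name, the dictionary keys, and the fixed search radius of 10 m are hypothetical and are not taken from the study.

```python
import math


def competition_indices(target, neighbors, radius=10.0):
    """Return (lorimer, hegyi) dictionaries keyed by competitor group.

    target:    dict with 'x', 'y' (m) and 'd' (cm) of the target tree i
    neighbors: iterable of dicts with 'x', 'y', 'd' and 'group' (1, 2 or 3)
    radius:    hypothetical fixed search radius in metres (an assumption)
    """
    lorimer, hegyi = {}, {}
    for nb in neighbors:
        dist = math.hypot(nb["x"] - target["x"], nb["y"] - target["y"])
        if dist == 0 or dist > radius:
            continue  # skip the target itself and trees outside the radius
        ratio = nb["d"] / target["d"]  # d_j / d_i
        group = nb["group"]
        lorimer[group] = lorimer.get(group, 0.0) + ratio         # distance-independent
        hegyi[group] = hegyi.get(group, 0.0) + ratio / dist      # distance-dependent
    return lorimer, hegyi


if __name__ == "__main__":
    tree = {"x": 0.0, "y": 0.0, "d": 40.0}
    others = [
        {"x": 3.0, "y": 4.0, "d": 20.0, "group": 1},
        {"x": 0.0, "y": 2.0, "d": 10.0, "group": 2},
        {"x": 20.0, "y": 0.0, "d": 40.0, "group": 1},  # outside the radius
    ]
    print(competition_indices(tree, others))
```

Keeping one accumulator per group mirrors the idea of treating intraspecific and interspecific competition as separate input variables.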
Growth rates were assessed using the periodic annual basal area increment (BAI), calculated from successive measurements of the diameter of
A. angustifolia.
where BAI: periodic annual increment in basal area (cm² year⁻¹); d_t: diameter at breast height at the end of the period (cm); d_{t−2}: diameter at breast height at the beginning of the period (cm); and t: period in years. * Measured in intervals of two years.
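The increment calculation can be sketched as follows, assuming the usual definition in which the basal area of a tree is π/4 · d² and the BAI is the difference between the basal areas at the end and at the beginning of the period divided by the period length. The function names are hypothetical.

```python
import math


def diameter_from_circumference(c_cm):
    """Convert circumference at breast height (cm) to diameter (cm): d = c / pi."""
    return c_cm / math.pi


def periodic_annual_bai(d_start_cm, d_end_cm, years):
    """Periodic annual basal area increment in cm^2 per year.

    Assumes basal area g = pi / 4 * d^2 for a single tree.
    """
    if years <= 0:
        raise ValueError("years must be positive")
    g_start = math.pi / 4.0 * d_start_cm ** 2
    g_end = math.pi / 4.0 * d_end_cm ** 2
    return (g_end - g_start) / years


if __name__ == "__main__":
    # A tree growing from 30 cm to 31 cm in diameter over a two-year interval
    print(round(periodic_annual_bai(30.0, 31.0, 2), 4))
```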
2.4. Correlation Analysis
Correlation analysis determines the degree of relationship between two variables; the coefficient varies between −1 and 1, and values close to 1 in absolute terms indicate a strong correlation. Pearson’s correlation analysis (Equation (10)) was used to describe the level of association between BAI and the variables of size, site, and competition, considering a 5% level of significance:
where r: Pearson’s correlation coefficient; x_i: observed value of x; x̄: mean of the observed values of x; y_i: observed value of y; ȳ: mean of the observed values of y; n: number of observations.
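For reference, the standard form of Pearson’s coefficient that matches the symbol definitions above can be written as below. This is the textbook expression, given here as an assumption about the form of Equation (10); the symbol r for the coefficient is likewise assumed.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}
```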
2.5. Modeling Using Artificial Neural Networks (ANNs)
Multi-layer Perceptron (MLP) ANNs with only one hidden layer were used for data training (Haykin [
46]) starting from Data Normalization (DN) according to two types of intervals [0; 1] and [−1; 1], given by Equation (11):
where X_i: value to be normalized; X_minimum: lowest value of the data set; X_maximum: highest value of the data set; UL: upper limit; and IL: lower limit.
This normalization prevents variables of greater magnitude from disproportionately influencing the result [
46].
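The normalization step can be illustrated with a minimal sketch, assuming Equation (11) is the usual min-max rescaling of each variable to the interval [IL, UL]. The function name and argument names are hypothetical.

```python
def normalize(values, lower=0.0, upper=1.0):
    """Rescale values linearly so that min -> lower (IL) and max -> upper (UL)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("cannot normalize a constant variable")
    scale = (upper - lower) / (x_max - x_min)
    return [lower + (x - x_min) * scale for x in values]


if __name__ == "__main__":
    diameters = [10.0, 20.0, 30.0]
    print(normalize(diameters))              # interval [0; 1]
    print(normalize(diameters, -1.0, 1.0))   # interval [-1; 1]
```

In practice the minimum and maximum would be taken from the training data and reused for the validation data, so that both sets share the same scale.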
Table 3 shows the number of neurons and activation functions.
We used hyperbolic tangent and logistic sigmoid activation functions in the hidden layer and the identity function in the output layer. In training, the ideal number of neurons was found by the Fletcher-Goss method [
47], given by Equation (12):
where n: number of network inputs; n_1: number of neurons in the hidden layer; and n_2: number of neurons in the output layer.
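A small sketch of how the bounds on the hidden layer size could be obtained is given below. It assumes the commonly cited form of the Fletcher and Goss rule, (2√n + n_2) ≤ n_1 ≤ (2n + 1); this form is an assumption and should be checked against Equation (12). The function name is hypothetical.

```python
import math


def hidden_neuron_bounds(n_inputs, n_outputs=1):
    """Return (lower, upper) integer bounds for the number of hidden neurons n_1.

    Assumed rule: 2 * sqrt(n) + n_2 <= n_1 <= 2 * n + 1.
    """
    lower = math.ceil(2.0 * math.sqrt(n_inputs) + n_outputs)
    upper = 2 * n_inputs + 1
    return lower, upper


if __name__ == "__main__":
    # Example: four input variables and a single output (BAI)
    print(hidden_neuron_bounds(4, 1))
```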
The ANN prediction uses the mathematical equation described for the MLP [
3], as follows:
where Y: estimate of the value of the dependent variable; X_i: input value of the i-th independent variable; w_ij: connection weight between the i-th input neuron and the j-th neuron of the hidden layer; β_j: bias value of the j-th neuron of the hidden layer; v_j: connection weight between the j-th neuron of the hidden layer and the output neuron; θ: bias value of the output neuron; f(.): hidden layer activation function; g(.): output activation function.
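The prediction of a single-hidden-layer MLP can be sketched directly from the symbols defined above, assuming the usual form Y = g(θ + Σ_j v_j · f(β_j + Σ_i w_ij · X_i)). The function names and the example weights are hypothetical and do not correspond to any fitted network from the study.

```python
import math


def logistic(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def identity(z):
    """Identity activation for the output neuron."""
    return z


def mlp_predict(x, w, beta, v, theta, f=math.tanh, g=identity):
    """Forward pass of an MLP with one hidden layer and one output neuron.

    x:     list of (normalized) inputs X_i
    w:     w[i][j], weight from input i to hidden neuron j
    beta:  beta[j], bias of hidden neuron j
    v:     v[j], weight from hidden neuron j to the output neuron
    theta: bias of the output neuron
    """
    hidden = [
        f(beta[j] + sum(w[i][j] * x[i] for i in range(len(x))))
        for j in range(len(beta))
    ]
    return g(theta + sum(v[j] * hidden[j] for j in range(len(hidden))))


if __name__ == "__main__":
    x = [1.0, 2.0]
    w = [[0.5], [0.25]]  # two inputs, one hidden neuron
    print(mlp_predict(x, w, beta=[0.0], v=[2.0], theta=1.0))
    print(mlp_predict(x, w, beta=[0.0], v=[2.0], theta=1.0, f=logistic))
```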
ANNs were trained under each combination of DN interval, activation function (AF) type, and number of neurons in the hidden layer (NHL). The maximum NHL defined by the method in Equation (12) sought to avoid memorizing the input data (over-fitting) or extracting insufficient information in training (under-fitting).
2.6. Data Analyses and Statistical Criteria
All statistical analyses were performed using the neuralnet package in R version 3.4.4. The goodness-of-fit criteria used to assess model performance were the coefficient of determination (Equation (14)), root mean square error (Equation (15)), mean absolute error (Equation (16)), and mean absolute percentage error (Equation (17)). In addition, graphical analysis of the residuals was used as a complementary criterion.
i. Coefficient of determination (R²)
ii. Root mean square error (RMSE)
iii. Mean absolute error (MAE)
iv. Mean absolute percentage error (MAPE)
v. Graphical analysis of residuals.
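The four numerical criteria listed above can be sketched as follows. This is a minimal illustration that assumes the common definitions: R² as one minus the ratio of the residual to the total sum of squares, and MAPE expressed in percent. The function name and the example values are hypothetical and are not data from the study.

```python
import math


def fit_statistics(observed, predicted):
    """Return R2, RMSE, MAE and MAPE (%) for paired observed and predicted values."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    sse = sum(e ** 2 for e in residuals)              # residual sum of squares
    sst = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    return {
        "R2": 1.0 - sse / sst,
        "RMSE": math.sqrt(sse / n),
        "MAE": sum(abs(e) for e in residuals) / n,
        "MAPE": 100.0 / n * sum(abs(e) / abs(o) for e, o in zip(residuals, observed)),
    }


if __name__ == "__main__":
    obs = [10.0, 20.0, 30.0]
    pred = [12.0, 18.0, 33.0]
    print(fit_statistics(obs, pred))
```

MAPE is undefined when an observed value is zero, so trees with no measurable increment would need separate handling.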
Figure 2 shows the workflow used in this study to develop the BAI model using species groups and ANN.
4. Discussion
This study presents different models of BAI for
A. angustifolia, with and without competition indices (CI) as independent variables, estimated for groups of species classified according to IVI and by intra- and interspecific competition. The Importance Value Index (IVI) identifies the most important and most abundant species [
48], that is, those most successful in exploiting the resources of their habitat (from a horizontal perspective), by summing the relative density, frequency, and dominance of each species in the plant association. Mixed Ombrophilous Forests (MOFs) in Southern Brazil are thus marked by the high IVI values of
A. angustifolia; that is,
A. angustifolia and a few other more expressive species are dominant in these forests (
Figure 1) [
16,
49].
Though intraspecific competition is expected to be more associated with the growth of
A. angustifolia [
50,
51], in this study, the inclusion of interspecific competition based on Group 3 considerably improved the BAI model (
Table 6, Id23 and Id31). Furthermore, despite having the lowest mean CI value (Hegyi and Lorimer;
Table 4), Group 3 has greater species diversity. We therefore hypothesize that interspecific competition is more strongly associated with growth when several species are considered. We encourage future studies to test this hypothesis to clarify how the number and size of species relate to CI values and tree growth.
Moreover, increment modeling should treat intraspecific and interspecific competition as separate variables to obtain better estimates, as verified with the Id25 and Id33 models. This effect is likely due to the weight assigned to each variable in the model, which helps capture the variation in the data. Therefore, one strategy for assessing species increment is to include the size variables (d and h) together with variables of vigor, competition, and location (site, climate) [
52].
In the literature, some researchers have described species competition in mixed forests using different methodologies for growth modeling [
42,
51,
53]. In the research by Orellana et al. [
13], for example, the competition between angiosperms and conifers in MOF was assessed based on the characterization of ecological groups according to shade-tolerant and light-demanding species classification. Their results on diameter increment indicated high intraspecific competition among
A. angustifolia trees and moderate competition among light-demanding species, both intraspecific and with
A. angustifolia. The methodology and objective used to group the species can indicate different results and interpretations of growth and dynamics within the forest.
Selecting neighboring competing trees is also a complex part of assessing competition which can influence the choice of the competition index [
54]. Since our study considered the same competition area for all target trees, the superiority of the Hegyi index over the Lorimer index shows that accounting for the spatial arrangement of trees is important when assessing competition among species grouped by IVI.
The studied MOF is stagnant and overstocked since Brazilian legislation prohibited the exploitation of the forest’s native species to preserve its remnants [
55,
56]. Indications show that the trees in these forests are in high competition, and the permanence of unmanaged old trees will likely depreciate the forest’s diametric structure since species such as
A. angustifolia depend on light to grow and establish themselves in the forest [
55].
Artificial Neural Networks (ANNs) thus proved to be a feasible technique for modeling the BAI of
A. angustifolia because they allow different variables to be included in the model and more complex relationships to be represented. This is possible because the ANN technique allows new variables to be included [
57] based on biological theory and dynamic processes according to the ecological reality, and not on accidental or random correlations [
58]. Furthermore, the good performance of the generated models in both training and validation, based on an appropriate structure (number of neurons, type of activation function, and input variables), indicates the stability of these models and their ability to generalize. Future studies of the species using the ANN approach can reinforce these findings and extend their applicability through additional investigations with different datasets and over larger areas of its natural distribution, improving the understanding of its dynamics.
This possibility of improving the description of forest inventory parameters from machine learning techniques, namely ANN, is relevant for sustainable forest management based on the planning of species-specific actions and aligned with the reality of the forests [
28,
31,
59], especially mixed uneven-aged forests, in which accurate increment predictions are essential to maintaining species composition and the structures that characterize the forest [
60,
61]. For MOFs, this possibility helps ensure the maintenance of this typology by favoring its regeneration and development. Furthermore, correct strategies for interventions based on reliable data can guarantee the possibility of economic returns to landowners while avoiding conversion to other uses [
55,
62].