A Statistical Dependence Framework Based on a Multivariate Normal Copula Function and Stochastic Differential Equations for Multivariate Data in Forestry

Krikštolaitis, Ričardas; Mozgeris, Gintautas; Petrauskas, Edmundas; Rupšys, Petras

doi:10.3390/axioms12050457

¹

Faculty of Informatics, Vytautas Magnus University, Universiteto 10, 53361 Akademija, Lithuania

²

Faculty of Forest Sciences and Ecology, Agriculture Academy, Vytautas Magnus University, Universiteto 10, 53361 Akademija, Lithuania

^*

Author to whom correspondence should be addressed.

Axioms 2023, 12(5), 457; https://doi.org/10.3390/axioms12050457

Submission received: 22 March 2023 / Revised: 24 April 2023 / Accepted: 5 May 2023 / Published: 8 May 2023

(This article belongs to the Special Issue Stochastic Modeling and Analysis for Applications and Technologies)

Abstract

Stochastic differential equations and copula theory are important topics with applications in almost every discipline. Many studies in forestry collect longitudinal, multi-dimensional, and discrete data for which the numbers of measurements of the individual variables do not match. For example, during sampling experiments, the diameters of all trees, the heights of approximately 10% of the trees, and the crown base height and crown width of a significantly smaller number of trees are measured. In this study, to estimate five-dimensional dependencies, we used a normal copula approach, where the dynamics of individual tree variables (diameter, potentially available area, height, crown base height, and crown width) are described by stochastic differential equations with mixed-effect parameters. The approximate maximum likelihood method was used to obtain parameter estimates of the presented stochastic differential equations, and the normal copula dependence parameters were estimated using the pseudo-maximum likelihood method. This study introduced the normalized multi-dimensional interaction information index based on differential entropy to capture dependencies between state variables. Using conditional copula-type probability density functions, the exact-form equations defining the links among the diameter, potentially available area, height, crown base height, and crown width were derived. All results were implemented in the symbolic algebra system MAPLE.

Keywords:

normal copula; stochastic differential equation; marginal distributions; conditional distributions; entropy; normalized interaction information

1. Introduction

Modeling the growth of an appropriate tree size variable over age is the main problem foresters face in predicting the future state of a stand and in choosing stand management strategies and fellings. When creating scientific models of tree growth, scientists are influenced by their perception of the “measure” of tree size. Among the many potential measures of tree size, foresters seem to focus on only a few, such as the tree diameter at breast height, height, crown base height, crown width, and potentially available area (stand density). There is no doubt that changes in each of these size measures affect the other size measures. Modeling the links among these five tree size measures enables us to predict changes in any tree size measure as a function of time and the remaining tree size measures and to perform a detailed tree size analysis by exploring the increments, inflection points, and more.

In order to model the rate of change of tree size measure over time, two growth models defined by ordinary differential equations (these are equations that do not take random events into account) are usually used in practice [1]. The first of these models, called exponential growth, describes populations that increase in size without ever reaching a limit. The second model, called logistic growth, is characterized by the saturation level of reproductive growth, which affects further population dynamics [2,3,4,5]. Forestry statisticians have used these general principles in their models and experiments since the first growth models of tree size measures [6]. Deterministic growth models postulate the output variable of tree size in the structure of a random variable whose deviations from the expected trend function must obey a normal distribution with constant variance. However, the actual values of the output variable of tree size are governed by the corresponding diffusion process, which possesses a probability density function typically dependent on the input variables and noise process parameters. Additionally, the solution of the stochastic differential equation describing the variable growth dynamics of a given tree size component is expressed as a function of the transition probability density. As a result, we can describe the changes in the modeled quantity not only using the mean, but it is also possible to define any moment. Unfortunately, deterministic models allow us to define only the mean growth curve. Forest science researchers often underestimate the utility of the stochastic model by stating that stochastic models are intellectually preferable to and aesthetically superior to deterministic models [7]. Stochastic differential equations are distinguished by their wide applicability in various scientific areas, such as economics, forestry, and medicine [8,9,10,11,12,13]. 
Data analysis is a multidisciplinary topic that has come through a period of intense research in which new concepts have led to exciting changes and applications for solving problems in various areas of science. Thus, for most of the previously mentioned deterministic exponential or sigmoidal growth processes described by ordinary differential equations [14], complex analyses can be performed by constructing stochastic differential equation analogues. New synergies are expected to arise from the application of stochastic differential equations to dynamical growth processes in forestry [15]. In forestry, stochastic differential equation analogues began to be used in the mid-twentieth century [16]. Dynamic problems of tree and stand size variables and the links among the size variables were analyzed using systems of stochastic differential equations. Traditionally, state variables used in forestry are described by two-dimensional, three-dimensional, and higher-dimensional stochastic differential equation systems, where the interaction of state variables is expressed by the corresponding covariance matrix. For example, univariate growth processes described by a one-dimensional stochastic differential equation of tree growth are given in [17,18]; the two-dimensional growth process described by a system of stochastic differential equations of tree diameter and height is given in [19]; and the three-dimensional growth processes described by a system of stochastic differential equations of tree diameter, height, and number of trees per hectare or diameter, potentially available area, and height, as well as the higher-dimensional growth processes described by a system of stochastic differential equations of tree diameter, height, crown base height, and crown width, are given in [20,21].
It is important to note that in some of the growth processes mentioned above, the densities of the solutions of the diffusion processes follow a symmetrical one-dimensional or multi-dimensional normal distribution or an asymmetrical one-dimensional or multi-dimensional lognormal distribution. These growth processes are formalized by the symmetrical Vasicek-type diffusion process [22,23], the asymmetrical Gompertz-type [24] and Bertalanffy-type [25] diffusion processes, and a hybrid mode diffusion process [26]. In order to extend the applicability of growth processes to a larger forest region, random effects are additionally included, which are expressed as normally distributed random variables with zero mean and constant variance. However, the introduction of random effects increases the number of unknown parameters to be estimated, and the parameter estimation procedure becomes much more complicated. Because estimating the parameters of a mixed-effect system of multivariate stochastic differential equations requires a large amount of computation and time, a methodology is needed that splits the parameter estimation procedure so that each equation is treated separately. Since the solution of each one-dimensional stochastic differential equation is associated with a probability density function, it is appropriate to use the copula technique [27] to link them together. Copulas are functions that join one-dimensional distribution functions to their joint multi-dimensional distribution function. There are several copula functions, which differ in the function that generates the copula. The normal copula function is defined using a multi-dimensional standard normal probability density function and the marginal probability density functions to obtain the joint distribution. It can be used to model multi-dimensional joint distributions and is suitable for forest growth analysis studies [28,29,30].

In this paper, we analyze five one-dimensional random processes, each of which is described by a separate differential equation (

X_{1} (t)

,

X_{2} (t), X_{3} (t), X_{4} (t), X_{5} (t)

) (the tree diameter,

X_{1} (t)

, potentially available area,

X_{2} (t)

, height,

X_{3} (t)

, crown base height,

X_{4} (t)

, and crown width,

X_{5} (t)

), and our main interest is to determine the dependencies and functional relationships among them. The dependence structure of

(X_{1} (t), X_{2} (t), X_{3} (t), X_{4} (t), X_{5} (t)

) can be formalized using the normal conditional copula function.

The rest of this paper is structured as follows. Section 2 presents the stochastic differential equation and conditional normal copula models that are later used to formalize growth processes in forestry. In Section 3, the parameter estimators and the joint and conditional probability density functions of the corresponding variables are defined. Section 4 examines the goodness of fit of the newly derived functional relationships using newly defined normalized multi-dimensional interaction information indices based on Shannon’s differential entropy [31]. Section 5 presents the main findings and future challenges.

2. Methods

2.1. Stochastic Differential Equations

Stochastic differential equations are used to model observable variables that evolve continuously and randomly over time. Multi-dimensional stochastic differential equation models are commonly used in many scientific areas, such as biology, economics, mathematical finance, and forestry. Stochastic processes that are described by a system of stochastic differential equations include a large number of unknown parameters that must be estimated. Unfortunately, firstly, a large number of estimated parameters increases the time it takes for computer calculations, and secondly, it reduces the scope of acceptable model-fitting databases because measurements must be made for all variables at a given discrete moment in time [13,20]. As an alternative to the multi-dimensional distribution, recently, studies have adopted a “copula” approach to analyze multi-dimensional measurements. Although the copula models are promising, there are only a few studies on copula-based stochastic differential equation analysis [32] in forestry. Therefore, the main objective of this study was to comprehensively develop the methodologies and theories of a copula-based five-dimensional stochastic process, specifically for the analysis of tree diameter, potentially available area, height, crown base height, and crown width measurements in forestry. Next, we define five stochastic processes,

X_{1} (t), X_{2} (t), X_{3} (t), X_{4} (t), X_{5} (t)

, using a stochastic differential equation framework that formalizes the time evolution of the tree diameter, potentially available area, height, crown base height, and crown width. In this section, all theoretical derivations and applications involve one-dimensional stochastic differential equations. Let us consider the tree diameter diffusion process

\{X_{1}^{i} (t) | t \in [t_{0}, T]\}

, the tree potentially available area diffusion process

\{X_{2}^{i} (t) | t \in [t_{0}, T]\}

, and the tree height diffusion process

\{X_{3}^{i} (t) | t \in [t_{0}, T]\}

(i = 1, …, M, where M is the number of individuals,

t_{0}

is the initial time, and T is any finite number) as described by the four-parameter Gompertz-type stochastic differential equation

d X_{j}^{i} (t) = ((α_{j} + ϕ_{j}^{i}) - β_{j} \ln (X_{j}^{i} (t) - γ_{j})) (X_{j}^{i} (t) - γ_{j}) d t + \sqrt{σ_{j}} (X_{j}^{i} (t) - γ_{j}) \cdot d W_{j}^{i} (t), j = 1, 2, 3, X_{1}^{i} (t_{0}) = x_{10}, X_{2}^{i} (t_{0}) = x_{20} = δ, X_{3}^{i} (t_{0}) = x_{30},

(1)

and the tree crown base height diffusion process

\{X_{4}^{i} (t) | t \in [t_{0}, T]\}

and the tree crown width diffusion process

\{X_{5}^{i} (t) | t \in [t_{0}, T]\}

are described by the three-parameter Vasicek-type stochastic differential equation

d X_{j}^{i} (t) = β_{j} (α_{j} + ϕ_{j}^{i} - X_{j}^{i} (t)) d t + \sqrt{σ_{j}} \cdot d W_{j}^{i} (t), j = 4, 5, X_{4}^{i} (t_{0}) = x_{40}, X_{5}^{i} (t_{0}) = x_{50},

(2)

where δ is an unknown fixed-effect parameter to be estimated,

W_{j}^{i} (t)

, j = 1, …, 5 are independent one-dimensional Brownian motions; the random effects

ϕ_{j}^{i}

, j = 1, …, 5 are independent and normally distributed random variables with zero mean and constant variances (

ϕ_{j}^{i} ~ N (0; σ_{j ϕ}^{2})

); the unknown fixed effect parameters

θ_{0} = \{α_{j}, β_{j}, σ_{j}, σ_{j ϕ}, j = 1, \dots, 5, γ_{j}, j = 1, \dots, 3, δ\}

must be estimated.
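As a rough illustration of how Equations (1) and (2) generate trajectories, the following Python sketch simulates one path of each process with the Euler–Maruyama scheme. The discretization scheme, the function names, and all parameter values are assumptions made for illustration only; they are not the estimates reported in this study, nor the MAPLE implementation.

```python
import math
import random


def simulate_gompertz(x0, alpha, beta, gamma, sigma, phi=0.0,
                      t_end=100.0, dt=0.1, rng=None):
    """Euler-Maruyama path of the Gompertz-type Eq. (1) for one process."""
    rng = rng or random.Random(0)
    n_steps = int(round(t_end / dt))
    x = x0
    path = [x]
    for _ in range(n_steps):
        y = x - gamma                        # shifted state, assumed positive
        dw = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment over dt
        drift = ((alpha + phi) - beta * math.log(y)) * y
        diffusion = math.sqrt(sigma) * y
        x = x + drift * dt + diffusion * dw
        path.append(x)
    return path


def simulate_vasicek(x0, alpha, beta, sigma, phi=0.0,
                     t_end=100.0, dt=0.1, rng=None):
    """Euler-Maruyama path of the Vasicek-type Eq. (2) for one process."""
    rng = rng or random.Random(0)
    n_steps = int(round(t_end / dt))
    x = x0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        x = x + beta * (alpha + phi - x) * dt + math.sqrt(sigma) * dw
        path.append(x)
    return path


# Hypothetical parameter values, chosen only to produce plausible curves.
diameter_path = simulate_gompertz(x0=1.5, alpha=0.3, beta=0.1,
                                  gamma=0.5, sigma=0.01)
crown_width_path = simulate_vasicek(x0=0.5, alpha=4.0, beta=0.05, sigma=0.02)
```

With the noise switched off (sigma = 0), the Gompertz path approaches γ + exp(α/β) and the Vasicek path approaches α, which is a quick sanity check on the drift terms.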

The drift and diffusion functions for both used stochastic differential Equations (1) and (2) fulfill the regularity and smoothness conditions of the so-called global Lipschitz and growth bound [33]. Therefore, the stochastic differential Equations (1) and (2) have unique strong solutions [33]. Using state transformations

Y_{j}^{i} = e^{β_{j} t} \ln (X_{j}^{i} (t) - γ_{j})

, j = 1, 2, 3,

Y_{j}^{i} = e^{β_{j} t} X_{j}^{i} (t)

, j = 4, 5, and Itô’s formula [33], the stochastic differential Equations (1) and (2) can be transformed into the Ornstein–Uhlenbeck process [34], which is Gaussian. Finally, by integrating and taking antilogarithms, we obtained that the conditional stochastic processes

(X_{j}^{i} (t) | X_{j}^{i} (t_{0}) = x_{j 0})

, j = 1, 2, 3 have lognormal distributions

L N_{1} (μ_{j}^{i} (t); v_{j} (t))

and the conditional stochastic processes

(X_{j}^{i} (t) | X_{j}^{i} (t_{0}) = x_{j 0})

, j = 4, 5 have normal distributions

N_{1} (μ_{j}^{i} (t); v_{j} (t))

with mean

μ_{j}^{i} (t)

, variance

v_{j} (t),

and probability density function

f_{j}^{i} (x_{j}, t |θ_{j}, ϕ_{j}^{i})

defined as follows:

μ_{j}^{i} (t) = \ln (x_{j 0} - γ_{j}) e^{- β_{j} (t - t_{0})} + \frac{1}{β_{j}} (1 - e^{- β_{j} (t - t_{0})}) (α_{j} + ϕ_{j}^{i} - \frac{σ_{j}}{2}), j = 1, 2, 3,

(3)

μ_{j}^{i} (t) = α_{j} + ϕ_{j}^{i} + (x_{j 0} - (α_{j} + ϕ_{j}^{i})) e^{- β_{j} (t - t_{0})}, j = 4, 5,

(4)

v_{j} (t) = \frac{1 - e^{- 2 β_{j} (t - t_{0})}}{2 β_{j}} σ_{j}, j = 1, \dots, 5,

(5)

f_{1}^{i} (x_{1}, t |θ_{1}, ϕ_{1}^{i}) = \frac{1}{\sqrt{2 π v_{1} (t)} (x_{1} - γ_{1})} \exp (- \frac{{(\ln (x_{1} - γ_{1}) - μ_{1}^{i} (t))}^{2}}{2 v_{1} (t)}), θ_{1} = (α_{1}, β_{1}, σ_{1}, γ_{1}),

(6)

f_{2}^{i} (x_{2}, t |θ_{2}, ϕ_{2}^{i}) = \frac{1}{\sqrt{2 π v_{2} (t)} (x_{2} - γ_{2})} \exp (- \frac{{(\ln (x_{2} - γ_{2}) - μ_{2}^{i} (t))}^{2}}{2 v_{2} (t)}), θ_{2} = (α_{2}, β_{2}, σ_{2}, γ_{2}, δ),

(7)

f_{3}^{i} (x_{3}, t |θ_{3}, ϕ_{3}^{i}) = \frac{1}{\sqrt{2 π v_{3} (t)} (x_{3} - γ_{3})} \exp (- \frac{{(\ln (x_{3} - γ_{3}) - μ_{3}^{i} (t))}^{2}}{2 v_{3} (t)}), θ_{3} = (α_{3}, β_{3}, σ_{3}, γ_{3}),

(8)

f_{4}^{i} (x_{4}, t |θ_{4}, ϕ_{4}^{i}) = \frac{1}{\sqrt{2 π v_{4} (t)}} \exp (- \frac{{(x_{4} - μ_{4}^{i} (t))}^{2}}{2 v_{4} (t)}), θ_{4} = (α_{4}, β_{4}, σ_{4}),

(9)

f_{5}^{i} (x_{5}, t |θ_{5}, ϕ_{5}^{i}) = \frac{1}{\sqrt{2 π v_{5} (t)}} \exp (- \frac{{(x_{5} - μ_{5}^{i} (t))}^{2}}{2 v_{5} (t)}), θ_{5} = (α_{5}, β_{5}, σ_{5}) .

(10)
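Equations (3)–(10) translate directly into code. The sketch below is a minimal Python transcription of the transition moments and densities; the function names and the numerical values in the usage lines are illustrative assumptions, not quantities from this study.

```python
import math


def transition_variance(t, t0, beta, sigma):
    """Eq. (5): variance v_j(t), shared by both process types."""
    return sigma * (1.0 - math.exp(-2.0 * beta * (t - t0))) / (2.0 * beta)


def gompertz_log_mean(t, t0, x0, alpha, beta, gamma, sigma, phi=0.0):
    """Eq. (3): mean of ln(X - gamma) for the Gompertz-type process."""
    decay = math.exp(-beta * (t - t0))
    return (math.log(x0 - gamma) * decay
            + (1.0 - decay) / beta * (alpha + phi - sigma / 2.0))


def vasicek_mean(t, t0, x0, alpha, beta, phi=0.0):
    """Eq. (4): mean of X for the Vasicek-type process."""
    decay = math.exp(-beta * (t - t0))
    return alpha + phi + (x0 - (alpha + phi)) * decay


def gompertz_pdf(x, t, t0, x0, alpha, beta, gamma, sigma, phi=0.0):
    """Eqs. (6)-(8): shifted lognormal transition density."""
    if x <= gamma:
        return 0.0
    mu = gompertz_log_mean(t, t0, x0, alpha, beta, gamma, sigma, phi)
    v = transition_variance(t, t0, beta, sigma)
    z = math.log(x - gamma) - mu
    return math.exp(-z * z / (2.0 * v)) / (math.sqrt(2.0 * math.pi * v) * (x - gamma))


def vasicek_pdf(x, t, t0, x0, alpha, beta, sigma, phi=0.0):
    """Eqs. (9)-(10): normal transition density."""
    mu = vasicek_mean(t, t0, x0, alpha, beta, phi)
    v = transition_variance(t, t0, beta, sigma)
    return math.exp(-(x - mu) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)


# Hypothetical values: density of the diameter at age 20 given x0 at age 0.
density_at_13 = gompertz_pdf(13.0, 20.0, 0.0, 1.5, 0.3, 0.1, 0.5, 0.01)
```

Note that for the lognormal case, μ and v are the mean and variance of ln(X − γ), not of X itself.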

In articles [21,29], we studied the problem of the estimation of the fixed-effect and mixed-effect parameters of discretely observed multi-dimensional processes by the maximum likelihood method. There are several alternative methods for obtaining the parameter estimates of stochastic differential equations using discrete measurements, such as the least squares method [35], the numerical approximation approach [36], and others [37].

The conditional density functions of the separate stochastic processes derived above significantly simplify the procedure for estimating unknown parameters because the joint multi-dimensional stochastic process is divided into five separate stochastic processes that have a number of estimated parameters five times smaller, and the parameters of each separate process are estimated using independent observed datasets.

Assume that

\{x_{j 1}^{i}, x_{j 2}^{i}, \dots, x_{j n_{i j}}^{i}\}

are the directly observed values of the jth separate stochastic process at discrete times

\{t_{1}^{i}, t_{2}^{i}, \dots, t_{n_{i j}}^{i}\}

(n_ij is the number of observed trees of the ith stand, i = 1, …, M, for the jth stochastic process, j = 1, …, 5).

The associated likelihood function for the fixed-effect parameters of the separate stochastic process

\{X_{j}^{i} (t) | t \in [t_{0}, T]\}

, j = 1, …, 5, i = 1, …, M, (in this case, the random-effect parameters

ϕ_{j}^{i}

are taken to be equal to the mean values Ε(

ϕ_{j}^{i}

) = 0) has the following form:

L_{j}^{1} (θ_{j}) = \prod_{i = 1}^{M} \prod_{k = 1}^{n_{i j}} f_{j}^{i} (x_{j k}, t_{k}^{i} |θ_{j}, 0),

(11)

and the corresponding log-likelihood function is

L L_{j}^{1} (θ_{j}) = \sum_{i = 1}^{M} \sum_{k = 1}^{n_{i j}} \ln (f_{j}^{i} (x_{j k}, t_{k}^{i} |θ_{j}, 0)),

(12)

where the probability density function

f_{j}^{i} (x_{j k}, t_{k}^{i} |θ_{j}, 0)

takes the form defined by Equations (6)–(10).
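The fixed-effect log-likelihood (12) is a double sum of log-densities over stands and observations. The following Python sketch evaluates it for a Vasicek-type margin (Equations (4), (5), and (9)); the data layout (a list of stands, each a list of (time, value) pairs), the function names, and the numbers are assumptions made for this example.

```python
import math


def vasicek_logpdf(x, t, t0, x0, alpha, beta, sigma, phi=0.0):
    """Log of the normal transition density, Eqs. (4), (5), (9)."""
    decay = math.exp(-beta * (t - t0))
    mean = alpha + phi + (x0 - (alpha + phi)) * decay
    var = sigma * (1.0 - decay * decay) / (2.0 * beta)
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def fixed_effect_loglik(stands, theta, t0, x0):
    """Eq. (12): sum over stands i and observations k, random effects set to 0."""
    alpha, beta, sigma = theta
    return sum(vasicek_logpdf(x, t, t0, x0, alpha, beta, sigma)
               for observations in stands
               for (t, x) in observations)


# Two hypothetical stands with (age, crown width) observations.
stands = [
    [(10.0, 1.9), (25.0, 3.0)],
    [(15.0, 2.4), (30.0, 3.3), (45.0, 3.7)],
]
ll = fixed_effect_loglik(stands, theta=(4.0, 0.05, 0.02), t0=0.0, x0=0.5)
```

A numerical optimizer would then be applied to this function over θ; the choice of optimizer is left open here.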

The associated likelihood function for the mixed-effect parameters of the separate stochastic process

\{X_{j}^{i} (t) | t \in [t_{0}, T]\}

, j = 1, …, 5, i = 1, …, M has the following form:

L_{j}^{2} (θ_{j}^{'}, ϕ_{j}) = \prod_{i = 1}^{M} \int_{R} \prod_{k = 1}^{n_{i j}} f_{j}^{i} (x_{j k}, t_{k}^{i} |θ_{j}, ϕ_{j}^{i}) p (ϕ_{j}^{i} |σ_{j ϕ}^{2}) d ϕ_{j}^{i}, θ_{j}^{'} = (θ_{j}, σ_{j ϕ}), ϕ_{j} = (ϕ_{j}^{1}, \dots, ϕ_{j}^{M})

(13)

where

p (ϕ_{j}^{i} |σ_{j ϕ}^{2})

is the normal probability density function with zero mean and constant variance,

σ_{j ϕ}^{2}

, and the corresponding log-likelihood function is

L L_{j}^{2} (θ_{j}^{'}, ϕ_{j}) = \sum_{i = 1}^{M} \ln \int_{R} \exp (\sum_{k = 1}^{n_{i j}} \ln (f_{j}^{i} (x_{j k}, t_{k}^{i} |θ_{j}, ϕ_{j}^{i})) + \ln (p (ϕ_{j}^{i} |σ_{j ϕ}^{2}))) d ϕ_{j}^{i} .

(14)

Although the integral in (14) does not have a closed-form solution, the analytical expression of the integrand (14) is known. Therefore, the Laplace method [38] can be used. Let us define a function

g_{j}^{i} : R \to R

, j = 1, …, 5, i = 1, …, M by

g_{j}^{i} (ϕ_{j}^{i} |\hat{θ_{j}^{'}}) \equiv \sum_{k = 1}^{n_{i j}} \ln (f_{j}^{i} (x_{j k}, t_{k}^{i} |\hat{θ_{j}}, ϕ_{j}^{i})) + \ln (p (ϕ_{j}^{i} |\hat{σ_{j ϕ}^{2}})),

(15)

where the cap indicates the point estimate of the parameter.

Then, by a second-order Taylor series expansion [39]

L L_{j}^{2} (θ_{j}^{'}, \hat{ϕ_{j}}) \approx \sum_{i = 1}^{M} (g_{j}^{i} (\hat{ϕ_{j}^{i}} |θ_{j}^{'}) + \frac{1}{2} \ln (2 π) - \frac{1}{2} \ln (\det {([- \frac{\partial^{2} g_{j}^{i} (ϕ_{j}^{i} |θ_{j}, σ_{j ϕ})}{\partial {(ϕ_{j}^{i})}^{2}}])}_{ϕ_{j}^{i} = \hat{ϕ_{j}^{i}}})),

(16)

where

\hat{ϕ_{j}^{i}} = \underset{ϕ_{j}^{i}}{\arg\max} (g_{j}^{i} (ϕ_{j}^{i} |\hat{θ_{j}^{'}})) .

(17)

Maximizing

L L_{j}^{2} (θ_{j}^{'}, ϕ_{j})

is a two-step optimization problem. The inner optimization step estimates the random effects

\hat{ϕ_{j}^{i}}

for each j = 1, …, 5, i = 1, …, M for the given values of the fixed effects

\hat{θ_{j}^{'}}

with Equation (17). The outer optimization step maximizes

L L_{j}^{2} (θ_{j}^{'}, \hat{ϕ_{j}})

defined by (16) after we insert random effects

\hat{ϕ_{j}}

into Equation (16). In the first iteration, the values of the random effects

\hat{ϕ_{j}}

are set to zero, as the mean values Ε(

ϕ_{j}^{i}

) = 0, j = 1, …, 5, i = 1, …, M. These two steps are iterated until convergence. Parameter estimation using the approximated maximum likelihood procedure is costly in terms of time and computational power because many iterations are required.
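The inner step (17) and the Laplace term in (16) can be sketched for a single stand and a Vasicek-type margin. In that case the mean (4) is linear in the random effect, so the function g in (15) is quadratic in ϕ, the inner maximizer has a closed form, and the Laplace approximation coincides with the exact log-integral. This closed form is a simplification derived for the example, not a description of the MAPLE implementation; all names and numbers below are hypothetical.

```python
import math


def vasicek_terms(t, t0, x0, alpha, beta, sigma):
    """Split the mean (4) as m + w * phi and return (m, w, v) with v from (5)."""
    e = math.exp(-beta * (t - t0))
    m = alpha * (1.0 - e) + x0 * e          # mean at phi = 0
    w = 1.0 - e                             # sensitivity of the mean to phi
    v = sigma * (1.0 - e * e) / (2.0 * beta)
    return m, w, v


def g(phi, obs, t0, x0, alpha, beta, sigma, sigma_phi):
    """Eq. (15): log-density of one stand's data plus log-prior of phi."""
    total = (-0.5 * math.log(2.0 * math.pi * sigma_phi ** 2)
             - phi ** 2 / (2.0 * sigma_phi ** 2))
    for t, x in obs:
        m, w, v = vasicek_terms(t, t0, x0, alpha, beta, sigma)
        r = x - m - w * phi
        total += -0.5 * math.log(2.0 * math.pi * v) - r * r / (2.0 * v)
    return total


def inner_step(obs, t0, x0, alpha, beta, sigma, sigma_phi):
    """Eq. (17) in closed form; returns (phi_hat, -g'') for the quadratic g."""
    num = 0.0
    neg_hess = 1.0 / sigma_phi ** 2
    for t, x in obs:
        m, w, v = vasicek_terms(t, t0, x0, alpha, beta, sigma)
        num += w * (x - m) / v
        neg_hess += w * w / v
    return num / neg_hess, neg_hess


def laplace_log_marginal(obs, t0, x0, alpha, beta, sigma, sigma_phi):
    """One summand of Eq. (16) for a scalar random effect."""
    phi_hat, neg_hess = inner_step(obs, t0, x0, alpha, beta, sigma, sigma_phi)
    return (g(phi_hat, obs, t0, x0, alpha, beta, sigma, sigma_phi)
            + 0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(neg_hess))


# Hypothetical stand: (age, crown width) pairs and trial fixed effects.
obs = [(10.0, 1.9), (20.0, 3.1), (30.0, 3.6), (40.0, 4.2)]
args = (0.0, 0.5, 4.0, 0.05, 0.02, 0.5)  # t0, x0, alpha, beta, sigma, sigma_phi
phi_hat, neg_hess = inner_step(obs, *args)
stand_contribution = laplace_log_marginal(obs, *args)
```

The outer step would sum `stand_contribution` over all stands and maximize the result over the fixed effects. For the Gompertz-type margins, the log-scale mean (3) is also linear in ϕ, so an analogous closed form should be available on the log scale.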

Based on estimation theory, the Cramér–Rao inequality [40] states that the covariance matrix of any unbiased estimator

\hat{θ}

of the vector θ ∈ Θ, where Θ =

\{θ_{j}, θ_{j}^{'}, j = 1, \dots, 5\}

is the parameter space, is not less than the inverse of the Fisher information matrix [31]; namely, each main diagonal element of

\operatorname{Cov} (\hat{θ})

is greater than or equal to its corresponding element in

I^{- 1} (θ)

:

\operatorname{Cov} (\hat{θ}) \geq I^{- 1} (θ),

(18)

where θ = (θ₁, …, θ_m) ∈ Θ and

\hat{θ}

= (

\hat{θ}

₁, …,

\hat{θ}

_m), where m is a positive integer, and both the covariance matrix

\operatorname{Cov} (\hat{θ})

and the inverse Fisher information matrix

I^{- 1} (θ)

are m × m matrices. In order to calculate the observed Fisher information matrix

I (θ)

, the maximum log-likelihood function for θ is needed. The approximate asymptotic standard errors

S E (\hat{θ})

of the fixed-effect parameters estimates

\hat{θ}

for both the fixed- and mixed-effect scenarios (if

\hat{θ} = \hat{θ_{j}}

, then

L L_{j} (\hat{θ}) = L L_{j}^{1} (\hat{θ_{j}})

, and if

\hat{θ} = \hat{θ_{j}^{'}}

, then

L L_{j} (\hat{θ}) = L L_{j}^{2} (θ_{j}^{'}, ϕ_{j})

, j = 1, …, 5) are defined by the diagonal elements of the inverse observed Fisher information matrix

F I^{- 1} (\hat{θ})

. By defining the matrix

[\frac{\partial^{2} L L_{j} (θ)}{\partial θ_{k} \partial θ_{l}}], j = 1, \dots, 5

, the observed Fisher information matrix takes the following form

F I (\hat{θ}) = {[- \frac{\partial^{2} L L_{j} (θ)}{\partial θ_{k} \partial θ_{l}}]}^{T} |_{θ = \hat{θ}} .

(19)

The approximate asymptotic standard errors of the fixed effects parameters are defined by

S E (\hat{θ}) = \sqrt{\operatorname{diag} (F I^{- 1} (\hat{θ}))} .

(20)
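For intuition about Equations (19) and (20), the sketch below handles the simplest case of a single parameter, where the observed information is minus the second derivative of the log-likelihood at the estimate and the standard error is the square root of its reciprocal. The finite-difference derivative, the toy normal-mean likelihood, and all names are assumptions chosen for illustration; the multi-parameter case additionally requires inverting the information matrix.

```python
import math


def second_derivative(f, x, h=1e-3):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def standard_error_1d(loglik, theta_hat, h=1e-3):
    """Eqs. (19)-(20) for one parameter: sqrt of inverse observed information."""
    observed_information = -second_derivative(loglik, theta_hat, h)
    return math.sqrt(1.0 / observed_information)


# Toy example: normal sample with known variance, unknown mean.
sample = [2.1, 1.7, 2.4, 1.9, 2.2, 2.0]
known_variance = 0.09


def loglik(mean):
    return sum(-0.5 * math.log(2.0 * math.pi * known_variance)
               - (x - mean) ** 2 / (2.0 * known_variance) for x in sample)


mean_hat = sum(sample) / len(sample)   # maximum likelihood estimate of the mean
se = standard_error_1d(loglik, mean_hat)
```

In this toy case the result can be checked against the textbook value sqrt(variance / n).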

The random effects for the newly derived probability density functions (6)–(10) can be calibrated for a new individual (for example, a plot of trees) if we have appropriate measurements for this new individual. The random effect

ϕ_{j}

, j = 1, …, 5 can be calibrated using the newly observed dataset

\{x_{j 1}, x_{j 2}, \dots, x_{j m}\}

at discrete times

\{t_{1}, t_{2}, \dots, t_{m}\}

(where m is the number of observed objects of the new individual) as

\hat{ϕ_{j}} = \underset{ϕ_{j}}{\arg\max} (\sum_{k = 1}^{m} \ln (f_{j} (x_{j k}, t_{k} |\hat{θ_{j}}, ϕ_{j})) + \ln (p (ϕ_{j} |\hat{σ_{j ϕ}^{2}}))) .

(21)

2.2. Copulas and Dependence

The interest in copulas comes from several perspectives. First, studies often have more information about the separate distributions of the variables involved and richer databases for estimating one-dimensional distributions than for their multi-dimensional distributions. Second, multi-dimensional copulas can be used for determining the dependence of pairs of random variables. In summary, copulas are functions that connect multi-dimensional distributions to their one-dimensional marginals. According to Sklar’s theorem [27], a copula is a multi-dimensional distribution function connecting two or more marginal distributions to accurately form the joint distribution.

Thus [27], if F is an m-dimensional distribution function with margins F₁, …, F_m, then there exists an m-dimensional copula

C_{1, \dots, m},

such that for all x_k ∊ R, k = 1, …, m,

F (x_{1}, \dots, x_{m}) = C_{1, \dots, m} (F_{1} (x_{1}), \dots, F_{m} (x_{m}))

(22)

Additionally, if F₁, …, F_m are continuous, then

C_{1, \dots, m}

is unique, and conversely, if

C_{1, \dots, m}

is an m-dimensional copula and F₁, …, F_m are distribution functions, then the function F in (22) is an m-dimensional distribution function with marginals F₁, …, F_m.

Let f, c, and f_k, k = 1, …, m be the densities of the distribution functions F,

C_{1, \dots, m},

and F_k, respectively; then,

f (x_{1}, \dots, x_{m}) = \frac{\partial^{m} F (x_{1}, \dots, x_{m})}{\partial x_{1} \dots \partial x_{m}} = \frac{\partial^{m} C_{1, \dots, m} (F_{1} (x_{1}), \dots, F_{m} (x_{m}))}{\partial x_{1} \dots \partial x_{m}} = {\frac{\partial^{m} C_{1, \dots, m} (u_{1}, \dots, u_{m})}{\partial u_{1} \dots \partial u_{m}}|}_{u_{k} = F_{k} (x_{k})} \prod_{k = 1}^{m} f_{k} (x_{k}) = c_{1, \dots, m} (F_{1} (x_{1}), \dots, F_{m} (x_{m}) | P) \prod_{k = 1}^{m} f_{k} (x_{k}) .

(23)

Recently, many different copula-type functions have been analyzed. In this article, we limit ourselves to the multi-dimensional normal copula function up to five dimensions, since five different distributions were derived, which are defined by Equations (6)–(10). A normal copula is elliptical and symmetric, which gives it excellent analytical properties that are entirely determined by its correlation matrix [41]. For the specified correlation matrix P, the m-dimensional (m = 1, …, 5) normal copula

C_{1, \dots, m} : {[0, 1]}^{m} \to R

can be written as

C_{1, \dots, m} (u_{1}, \dots, u_{m} | P) = Φ_{m} (Φ^{- 1} (u_{1}), \dots, Φ^{- 1} (u_{m})),

(24)

where

Φ^{- 1}

is the inverse cumulative distribution function of the standard normal distribution, and

Φ_{m}

is the m-dimensional normal cumulative distribution function with a mean vector μ and covariance matrix Σ. According to the basic principles of invariance properties (see [42]), the N_m (μ, Σ) copula is the same as N_m (0, P), where P is the correlation matrix corresponding to the covariance matrix Σ. So, the normal copula is determined using only the correlation matrix P. Although the copula function has no simple analytical formula and can only be defined using integration, its density can be written as

c_{1, \dots, m} (u_{1}, \dots, u_{m} | P) = \frac{1}{\sqrt{|P|}} \exp (- \frac{1}{2} {(\begin{matrix} Φ^{- 1} (u_{1}) \\ ⋮ \\ Φ^{- 1} (u_{m}) \end{matrix})}^{T} \cdot (P^{- 1} - I) \cdot (\begin{matrix} Φ^{- 1} (u_{1}) \\ ⋮ \\ Φ^{- 1} (u_{m}) \end{matrix})),

(25)

where I is the identity matrix and P takes the form

P = (\begin{matrix} 1 & ρ_{12} & \dots & ρ_{1 m} \\ ρ_{12} & 1 & \dots & ρ_{2 m} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ ρ_{1 m} & ρ_{2 m} & \dots & 1 \end{matrix}) .

(26)

The classical maximum likelihood method can be used to estimate the parameters of the joint density function (23). This procedure estimates all parameters simultaneously and can be computationally demanding, as numerical optimization with a large number of parameters is time-consuming in practice. Alternatively, a two-step procedure first estimates the parameters of the marginal densities, treating the variables as independent, and then, with the marginal estimates held fixed, obtains the copula density parameter estimates by maximizing

l (P) = \sum_{i = 1}^{M} \sum_{j = 1}^{n_{i j}} \ln (c_{1, \dots, m} (Φ^{- 1} (F_{1} (x_{1 j}^{i}, t_{j}^{i})), \dots, Φ^{- 1} (F_{m} (x_{m j}^{i}, t_{j}^{i})) |P)),

(27)

where

F_{k} (x_{k}, t) = \int_{z}^{x_{k}} f_{k} (y, t) d y

,

f_{k} (y, t)

, k = 1, …, 5 are defined by Equations (6)–(10); z =

\hat{γ_{k}}

if k = 1, 2, 3; z = −∞ if k = 4, 5.
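The second estimation step can be sketched for two variables. Assuming the first step has already produced normal scores Φ⁻¹(F_k(x)) for each tree, the copula log-likelihood depends on the data only through two sums, and ρ can be found by a simple grid search. The grid search, the synthetic scores that stand in for real pseudo-observations, and all names are assumptions for illustration.

```python
import math
import random


def copula_loglik(rho, n, s_sq, s_cross):
    """Bivariate normal copula log-likelihood written with sufficient sums."""
    det = 1.0 - rho * rho
    return (-0.5 * n * math.log(det)
            - (rho * rho * s_sq - 2.0 * rho * s_cross) / (2.0 * det))


def fit_rho(scores, grid_step=0.001):
    """Maximize the log-likelihood over a grid of rho in (-1, 1)."""
    n = len(scores)
    s_sq = sum(a * a + b * b for a, b in scores)   # sum of z1^2 + z2^2
    s_cross = sum(a * b for a, b in scores)        # sum of z1 * z2
    best_rho = 0.0
    best_ll = copula_loglik(0.0, n, s_sq, s_cross)
    k_max = int(round(0.999 / grid_step))
    for k in range(-k_max, k_max + 1):
        rho = k * grid_step
        ll = copula_loglik(rho, n, s_sq, s_cross)
        if ll > best_ll:
            best_rho, best_ll = rho, ll
    return best_rho


# Synthetic normal scores with a known dependence parameter.
rng = random.Random(1)
true_rho = 0.6
scores = []
for _ in range(2000):
    g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    scores.append((g1, true_rho * g1 + math.sqrt(1.0 - true_rho ** 2) * g2))

rho_hat = fit_rho(scores)
```

For more than two variables the same idea applies, but the search runs over all off-diagonal entries of P and needs a proper numerical optimizer together with a positive-definiteness constraint.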

This estimate still remains asymptotically normal [43]. The asymptotic standard errors of the normal copula density parameters are defined by the diagonal elements of the inverse observed Fisher information matrix

{[F I (\hat{P})]}^{- 1}

. By defining the matrix

[\frac{\partial^{2} l (P)}{\partial ρ_{i} \partial ρ_{j}}]

, the observed Fisher information matrix takes the following form

F I (\hat{P}) = {[- \frac{\partial^{2} l (P)}{\partial ρ_{i} \partial ρ_{j}}]}^{T} |_{P = \hat{P}}

(28)

The asymptotic standard errors of the normal copula density parameters are defined by

S E (\hat{P}) = \sqrt{\operatorname{diag} ({[F I (\hat{P})]}^{- 1})}

(29)

3. Results

3.1. Parameter Estimates

We used the MAPLE symbolic algebra system [44] to implement maximum likelihood procedures as well as assessments of the derived probability density functions, the normalized conditional entropies, and stand attributes models. It should be noted that standard statistical packages cannot be used to implement our proposed SDE models. The joint copula-type representation of the probability density function (23) implies the two-step maximum log-likelihood decomposition, namely, separately estimating the parameters of the marginal distributions defined by Equations (1) and (2) and the parameters of the copula defined by Equation (25). The first step estimates the approximated maximum log-likelihood contribution defined by Equations (16) and (17) from each margin (under the assumption of independence) using a random sample of discrete measurements of tree diameter (x₁), potentially available area (x₂), height (x₃), crown base height (x₄), and crown width (x₅),

\{x_{j 1}^{i}, x_{j 2}^{i}, \dots, x_{j n_{i j}}^{i}\}

, at discrete times

\{t_{1}^{i}, t_{2}^{i}, \dots, t_{n_{i j}}^{i}\}

, i = 1, …, M, j = 1, …, 5. The second step estimates the contribution of the maximum log-likelihood defined by Equation (27) from the copula using a random sample of discrete measurements

\{(x_{11}^{i}, x_{21}^{i}, x_{31}^{i}, x_{41}^{i}, x_{51}^{i}), (x_{12}^{i}, x_{22}^{i}, x_{32}^{i}, x_{42}^{i}, x_{52}^{i}), \dots, (x_{1 n_{i 1}}^{i}, x_{2 n_{i 2}}^{i}, x_{3 n_{i 3}}^{i}, x_{4 n_{i 4}}^{i}, x_{5 n_{i 5}}^{i})\}

at discrete times

\{t_{1}^{i}, t_{2}^{i}, \dots, t_{n_{i j}}^{i}\}

, i = 1, …, M, j = 1, …, 5. The measurements of all observed trees were used to estimate the parameters of the stochastic differential Equation (1) that determine the growth of the tree diameter and potentially available area. Another smaller set of measurements was used to estimate the parameters of the tree height growth Equation (1). Finally, the smallest set of measurements was used to estimate the parameters of the tree crown base height and crown width growth Equation (2) and to estimate the dependence parameters matrix P. The parameter estimates of the stochastic differential Equations (1) and (2), and the parameters matrix P of the dependencies defined by Equation (26) for the copula-type probability density function using observed datasets from mixed-species, uneven stands in Lithuania (see Table A1) are presented in Table 1 and Table 2, respectively. All parameter estimates are significant (p < 0.05).

3.2. Joint and Conditional Densities

First, in this section, we review some of the most useful conditional probability density functions defined using five-, four-, three-, and two-dimensional copulas and the probability density functions defined by Equations (6)–(10) that can be successfully applied to relate the response variable to several remaining explanatory variables. We formalize the conditional density functions using the well-known Bayes’ rule. Second, we visually compare several two-dimensional probability density functions and assess which dependence is stronger.

The joint copula-type probability density function for two diffusion processes

\{X_{j}^{i} (t) | t \in [t_{0}, T]\}

and

\{X_{k}^{i} (t) | t \in [t_{0}, T]\}

for list

B_{2} = \{j, k| j, k \in \{1, 2, 3, 4, 5\}, j \neq k\}

, i = 1, …, M is defined as

f_{j, k}^{i} (x_{j}^{i}, x_{k}^{i}, t | \hat{ρ_{j k}}) = c_{j, k}^{i} (G_{j}^{i} (x_{j}^{i}, t), G_{k}^{i} (x_{k}^{i}, t) | \hat{ρ_{j k}}) \prod_{l \in B_{2}} \hat{f_{l}^{i}} (x_{l}^{i}, t) .

(30)

where

\hat{f_{j}^{i}} (x_{j}^{i}, t) = f_{j}^{i} (x_{j}^{i}, t |\hat{θ_{j}}, \hat{ϕ_{j}^{i}})

,

F_{j}^{i} (x_{j}^{i}, t) = \int_{z}^{x_{j}^{i}} \hat{f_{j}^{i}} (y, t) d y

,

z = \hat{ɣ_{j}}

, if j = 1, 2, 3, z = −∞, if j = 4, 5 and

G_{j}^{i} (x_{j}^{i}, t) = Φ^{- 1} (F_{j}^{i} (x_{j}^{i}, t))

.
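As a minimal sketch of Equation (30), assuming normal marginals (as for j = 4, 5) and hypothetical parameter values, the joint density can be assembled from the bivariate normal copula density and the two marginal densities. The function names and numbers below are illustrative assumptions and are not taken from the MAPLE implementation used in this study.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal: provides Phi (cdf) and Phi^{-1} (inv_cdf)


def normal_copula_density(a: float, b: float, rho: float) -> float:
    """Bivariate normal copula density at the normal scores
    a = Phi^{-1}(F_j(x_j)) and b = Phi^{-1}(F_k(x_k))."""
    q = (rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * (1.0 - rho * rho))
    return math.exp(-q) / math.sqrt(1.0 - rho * rho)


def joint_density(xj, xk, marg_j: NormalDist, marg_k: NormalDist, rho):
    """Equation (30): copula density times the product of the marginal densities."""
    a = STD.inv_cdf(marg_j.cdf(xj))  # G_j(x_j, t)
    b = STD.inv_cdf(marg_k.cdf(xk))  # G_k(x_k, t)
    return normal_copula_density(a, b, rho) * marg_j.pdf(xj) * marg_k.pdf(xk)


# Hypothetical marginals and correlation, for illustration only.
crown_base = NormalDist(mu=14.0, sigma=6.0)
crown_width = NormalDist(mu=4.0, sigma=1.5)
print(joint_density(12.0, 3.5, crown_base, crown_width, rho=0.4))
```

With normal marginals, this construction reproduces the ordinary bivariate normal density, which gives a convenient check of the copula term.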

The joint copula-type probability density function for three diffusion processes

\{X_{j}^{i} (t) | t \in [t_{0}, T]\}

,

\{X_{k}^{i} (t) | t \in [t_{0}, T]\}

, and

\{X_{m}^{i} (t) | t \in [t_{0}, T]\}

for list

B_{3} = \{j, k, m| j, k, m \in \{1, 2, 3, 4, 5\}, j \neq k \neq m\}

, i = 1, …, M is defined as

f_{j, k, m}^{i} (x_{j}^{i}, x_{k}^{i}, x_{m}^{i}, t | \hat{P}) = c_{j, k, m}^{i} (G_{j}^{i} (x_{j}^{i}, t), G_{k}^{i} (x_{k}^{i}, t), G_{m}^{i} (x_{m}^{i}, t) | \hat{P}) \prod_{l \in B_{3}} \hat{f_{l}^{i}} (x_{l}^{i}, t) .

(31)

The joint copula-type probability density function for four diffusion processes

\{X_{j}^{i} (t) | t \in [t_{0}, T]\}

,

\{X_{k}^{i} (t) | t \in [t_{0}, T]\}

,

\{X_{m}^{i} (t) | t \in [t_{0}, T]\}

, and

\{X_{n}^{i} (t) | t \in [t_{0}, T]\}

for list

B_{4} = \{j, k, m, n| j, k, m, n \in \{1, 2, 3, 4, 5\}, j \neq k \neq m \neq n\}

, i = 1, …, M is defined as

f_{j, k, m, n}^{i} (x_{j}^{i}, x_{k}^{i}, x_{m}^{i}, x_{n}^{i}, t | \hat{P}) = c_{j, k, m, n}^{i} (G_{j}^{i} (x_{j}^{i}, t), G_{k}^{i} (x_{k}^{i}, t), G_{m}^{i} (x_{m}^{i}, t), G_{n}^{i} (x_{n}^{i}, t) | \hat{P}) \prod_{l \in B_{4}} \hat{f_{l}^{i}} (x_{l}^{i}, t)

(32)

The joint copula-type probability density function for all five diffusion processes is defined as

f_{1, 2, 3, 4, 5}^{i} (x_{1}^{i}, x_{2}^{i}, x_{3}^{i}, x_{4}^{i}, x_{5}^{i}, t | \hat{P}) = c_{1, 2, 3, 4, 5}^{i} (G_{1}^{i} (x_{1}^{i}, t), G_{2}^{i} (x_{2}^{i}, t), G_{3}^{i} (x_{3}^{i}, t), G_{4}^{i} (x_{4}^{i}, t), G_{5}^{i} (x_{5}^{i}, t) | \hat{P}) \prod_{j = 1}^{5} \hat{f_{j}^{i}} (x_{j}^{i}, t) .

(33)

The conditional probability density function for diffusion process

\{X_{j}^{i} (t) | t \in [t_{0}, T]\}

, j = 1, …, 5, i = 1, …, M given

\{X_{k}^{i} (t) = x_{k}^{i} | t \in [t_{0}, T]\}

,

k \in \{1, 2, 3, 4, 5\} \ j

is defined as

f_{j | k}^{i} (x_{j}^{i}, t | \hat{ρ_{j k}}, x_{k}^{i}) = c_{j, k}^{i} (G_{j}^{i} (x_{j}^{i}, t), G_{k}^{i} (x_{k}^{i}, t) | \hat{ρ_{j k}}) \hat{f_{j}^{i}} (x_{j}^{i}, t) .

(34)
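The conditional density in Equation (34) can be sketched in the same way. The example below assumes normal marginals and hypothetical parameter values, and checks numerically that the conditional density integrates to approximately one for several values of the conditioning variable. The integration limits are a truncation chosen for these illustrative marginals.

```python
import math
from statistics import NormalDist

STD = NormalDist()


def copula_density(a, b, rho):
    """Bivariate normal copula density at normal scores a and b."""
    q = (rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * (1.0 - rho * rho))
    return math.exp(-q) / math.sqrt(1.0 - rho * rho)


def conditional_density(xj, xk, marg_j, marg_k, rho):
    """Equation (34): f_{j|k} = c_{j,k}(G_j, G_k | rho) * f_j."""
    a = STD.inv_cdf(marg_j.cdf(xj))
    b = STD.inv_cdf(marg_k.cdf(xk))
    return copula_density(a, b, rho) * marg_j.pdf(xj)


def integrate(func, lo, hi, steps=4000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, steps):
        total += func(lo + i * h)
    return total * h


# Hypothetical marginals and correlation, for illustration only.
marg_j = NormalDist(mu=14.0, sigma=6.0)
marg_k = NormalDist(mu=4.0, sigma=1.5)
for xk in (2.5, 4.0, 5.5):
    area = integrate(lambda x: conditional_density(x, xk, marg_j, marg_k, 0.6), -22.0, 50.0)
    print(xk, round(area, 6))
```

Setting the correlation to zero makes the copula density equal to one, so the conditional density reduces to the marginal density, as expected for independent variables.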

The conditional probability density function for diffusion process

\{X_{j}^{i} (t) | t \in [t_{0}, T]\}

, j = 1, …, 5, i = 1, …, M, given

\{X_{k}^{i} (t) = x_{k}^{i}, X_{m}^{i} (t) = x_{m}^{i} | t \in [t_{0}, T]\}

,

k, m \in \{1, 2, 3, 4, 5\} \ j

is defined as

f_{j | k, m}^{i} (x_{j}^{i}, t | \hat{P}, x_{k}^{i}, x_{m}^{i}) = \frac{c_{j, k, m}^{i} (G_{j}^{i} (x_{j}^{i}, t), G_{k}^{i} (x_{k}^{i}, t), G_{m}^{i} (x_{m}^{i}, t) | \hat{P}) \hat{f_{j}^{i}} (x_{j}^{i}, t)}{c_{k, m}^{i} (G_{k}^{i} (x_{k}^{i}, t), G_{m}^{i} (x_{m}^{i}, t) | \hat{ρ_{k m}})} .

(35)

The conditional probability density function for diffusion process

\{X_{j}^{i} (t) | t \in [t_{0}, T]\}

, j = 1, …, 5, i = 1, …, M, given

\{X_{k}^{i} (t) = x_{k}^{i}, X_{m}^{i} (t) = x_{m}^{i}, X_{n}^{i} (t) = x_{n}^{i} | t \in [t_{0}, T]\}

,

k, m, n \in \{1, 2, 3, 4, 5\} \ j

is defined as

f_{j | k, m, n}^{i} (x_{j}^{i}, t | \hat{P}, x_{k}^{i}, x_{m}^{i}, x_{n}^{i}) = \frac{c_{j, k, m, n}^{i} (G_{j}^{i} (x_{j}^{i}, t), G_{k}^{i} (x_{k}^{i}, t), G_{m}^{i} (x_{m}^{i}, t), G_{n}^{i} (x_{n}^{i}, t) | \hat{P}) \hat{f_{j}^{i}} (x_{j}^{i}, t)}{c_{k, m, n}^{i} (G_{k}^{i} (x_{k}^{i}, t), G_{m}^{i} (x_{m}^{i}, t), G_{n}^{i} (x_{n}^{i}, t) | \hat{P})}

(36)

The conditional probability density function for diffusion process

\{X_{j}^{i} (t) | t \in [t_{0}, T]\}

, j = 1, …, 5, i = 1, …, M, given

\{X_{k}^{i} (t) = x_{k}^{i}, X_{m}^{i} (t) = x_{m}^{i}, X_{n}^{i} (t) = x_{n}^{i}, X_{s}^{i} (t) = x_{s}^{i} | t \in [t_{0}, T]\}

,

k, m, n, s \in \{1, 2, 3, 4, 5\} \ j

is defined as

f_{j | k, m, n, s}^{i} (x_{j}^{i}, t | \hat{P}, x_{k}^{i}, x_{m}^{i}, x_{n}^{i}, x_{s}^{i}) = \frac{c_{j, k, m, n, s}^{i} (G_{j}^{i} (x_{j}^{i}, t), G_{k}^{i} (x_{k}^{i}, t), G_{m}^{i} (x_{m}^{i}, t), G_{n}^{i} (x_{n}^{i}, t), G_{s}^{i} (x_{s}^{i}, t) | \hat{P}) \hat{f_{j}^{i}} (x_{j}^{i}, t)}{c_{k, m, n, s}^{i} (G_{k}^{i} (x_{k}^{i}, t), G_{m}^{i} (x_{m}^{i}, t), G_{n}^{i} (x_{n}^{i}, t), G_{s}^{i} (x_{s}^{i}, t) | \hat{P})} .

(37)

The statistical dependence of the observed multivariate dataset is the relationship between any two or more measured random variables. The dependency structure is important to determine the adequacy of a particular mathematical model to represent the observed dataset. The simplest way to assess visually whether two random variables are independent is to check whether a change in the value of one variable affects the distribution of the other. To do so, we plotted conditional distributions for different values of the explanatory variable. Figure 1a–d shows the conditional probability density functions of the tree diameter defined by Equation (34), conditioned on the tree potentially available area, height, crown base height, and crown width, respectively.

The dependence of the tree diameter on the potentially available area, height, crown base height, and crown area is shown in Figure 1a–d by plotting the marginal diameter probability density function and the conditional density functions at three different values of an explanatory variable (potentially available area, height, crown base height, and crown area). The three conditional densities clearly differ, which confirms that the tree diameter behaves differently when conditioned on different values of an explanatory variable. This is another way to verify that the tree diameter and the explanatory variables under analysis are dependent. As can be seen in Figure 1, the potentially available area has the smallest influence on the development of the tree diameter (see Figure 1a), while the height has the greatest influence, because not only the location of the distribution but also its shape changes significantly. Moreover, the probability density function of the tree diameter becomes flatter as the values of any explanatory variable increase, indicating an increase in variance.

Similar conclusions can be drawn from the values of the Pearson linear correlation presented in the first row of Table 2. We can see that the strongest linear relationship is between the tree diameter and height, and the weakest relationship is between the tree diameter and the area it occupies.

4. Discussion

The comparisons among the conditional probability density functions for the tree diameter shown in Figure 1 indicate that all explanatory variables significantly change the conditional probability density functions of the tree diameter. Visually, the largest differences between the conditional and marginal probability density functions of the diameter arise when height is the explanatory variable, which reflects the greatest influence of height on the evolution of tree diameter.

Assessing the fit of the conditional probability density functions defined by Equations (34)–(37) could involve graphical tools, such as histograms overlaid with the estimated probability density functions and Q–Q plots. However, since the conditional probability density functions defined by Equations (34)–(37) depend on the random effects of the individual population units (plots), they are difficult to use for comparing distribution models, even in the case of normal random effects. The aim of this comparative analysis of densities was to determine which set of explanatory variables most improves the fit of the response variable to the observed data.

Entropy is the most appropriate concept for comparing the uncertainty of multi-dimensional probability density functions [45]. In addition, the conditional probability densities of the response variable may face the redundancy of the explanatory variables. The Shannon differential entropy [46] of the probability density function

f (x, y)

for two random variables X and Y, whose support is a set

X \times Y

, is written as

H (X, Y) = - \int_{X \times Y} f (x, y) \log_{2} f (x, y) d x d y = H (X / Y) + H (Y)

(38)

For two random variables X and Y,

H (X / Y) \leq H (X)

with equality if and only if X and Y are independent. In this study, the parameters of the marginals and copula densities were estimated by a two-stage approximated maximum likelihood procedure (see Table 1 and Table 2). The conditional and multi-dimensional differential entropies were computed for the estimated probability density functions defined by Equations (34)–(37) and (30)–(33), respectively. Here, the entropies were estimated using

H (X, Y) = - E (\log_{2} f (x, y))

(39)

which takes the following forms for the two-, three-, four-, and five-dimensional probability density functions (30)–(33):

H (X_{j}, X_{k}) = - \sum_{i = 1}^{M} \sum_{v = 1}^{n_{i}} \log_{2} (f_{j, k}^{i} (x_{j v}^{i}, x_{k v}^{i}, t_{v}^{i} | \hat{ρ_{j k}})), j, k \in \{1, 2, 3, 4, 5\}

(40)

H (X_{j}, X_{k}, X_{m}) = - \sum_{i = 1}^{M} \sum_{v = 1}^{n_{i}} \log_{2} (f_{j, k, m}^{i} (x_{j v}^{i}, x_{k v}^{i}, x_{m v}^{i}, t_{v}^{i} | \hat{P})), j, k, m \in \{1, 2, 3, 4, 5\}

(41)

H (X_{j}, X_{k}, X_{m}, X_{n}) = - \sum_{i = 1}^{M} \sum_{v = 1}^{n_{i}} \log_{2} (f_{j, k, m, n}^{i} (x_{j v}^{i}, x_{k v}^{i}, x_{m v}^{i}, x_{n v}^{i}, t_{v}^{i} | \hat{P})), j, k, m, n \in \{1, 2, 3, 4, 5\}

(42)

H (X_{1}, X_{2}, X_{3}, X_{4}, X_{5}) = - \sum_{i = 1}^{M} \sum_{v = 1}^{n_{i}} \log_{2} (f_{1, 2, 3, 4, 5}^{i} (x_{1 v}^{i}, x_{2 v}^{i}, x_{3 v}^{i}, x_{4 v}^{i}, x_{5 v}^{i}, t_{v}^{i} | \hat{P}))

(43)

For conditional probability density functions (34)–(37), j = 1, …, 5, we have conditional entropies

H (X_{j} / X_{k}) = - \sum_{i = 1}^{M} \sum_{v = 1}^{n_{i}} \log_{2} (f_{j | k}^{i} (x_{j v}^{i}, t_{v}^{i} | \hat{ρ_{j k}}, x_{k v}^{i})), k \in \{1, 2, 3, 4, 5\} \ j

(44)

H (X_{j} / X_{k}, X_{m}) = - \sum_{i = 1}^{M} \sum_{v = 1}^{n_{i}} \log_{2} (f_{j | k, m}^{i} (x_{j v}^{i}, t_{v}^{i} | \hat{P}, x_{k v}^{i}, x_{m v}^{i})), k, m \in \{1, 2, 3, 4, 5\} \ j

(45)

H (X_{j} / X_{k}, X_{m}, X_{n}) = - \sum_{i = 1}^{M} \sum_{v = 1}^{n_{i}} \log_{2} (f_{j | k, m, n}^{i} (x_{j v}^{i}, t_{v}^{i} | \hat{P}, x_{k v}^{i}, x_{m v}^{i}, x_{n v}^{i})), k, m, n \in \{1, 2, 3, 4, 5\} \ j

(46)

H (X_{j} / X_{k}, X_{m}, X_{n}, X_{s}) = - \sum_{i = 1}^{M} \sum_{v = 1}^{n_{i}} \log_{2} (f_{j | k, m, n, s}^{i} (x_{j v}^{i}, t_{v}^{i} | \hat{P}, x_{k v}^{i}, x_{m v}^{i}, x_{n v}^{i}, x_{s v}^{i})), k, m, n, s \in \{1, 2, 3, 4, 5\} \ j

(47)
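The entropy estimates in Equations (40)–(47) can be illustrated with a small simulation. The sketch below is an assumption-laden example rather than the procedure used in the study: it uses simulated standard normal scores in place of the observed data, and it divides the sum of log-densities by the sample size so that the estimate matches the expectation form of Equation (39). Because the joint density factorizes into the conditional and marginal densities, the estimated entropies satisfy the chain rule of Equation (38) sample by sample.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()


def copula_density(a, b, rho):
    """Bivariate normal copula density at normal scores a and b."""
    q = (rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * (1.0 - rho * rho))
    return math.exp(-q) / math.sqrt(1.0 - rho * rho)


def entropy_estimate(log2_density_values):
    """Sample-mean version of Equation (39): -E[log2 f]."""
    return -sum(log2_density_values) / len(log2_density_values)


# Simulated correlated standard normal pairs stand in for real observations.
random.seed(1)
rho = 0.7
sample = []
for _ in range(5000):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    sample.append((z1, rho * z1 + math.sqrt(1 - rho * rho) * z2))

# Joint density (30), conditional density (34), and marginal density of Y.
joint = [math.log2(copula_density(a, b, rho) * STD.pdf(a) * STD.pdf(b)) for a, b in sample]
cond = [math.log2(copula_density(a, b, rho) * STD.pdf(a)) for a, b in sample]
marg = [math.log2(STD.pdf(b)) for a, b in sample]

h_joint, h_cond, h_marg = map(entropy_estimate, (joint, cond, marg))
print(h_joint, h_cond + h_marg)  # the two values agree up to rounding
```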

The purpose of selecting explanatory variables is to correctly select variables from a set of candidate variables containing all processes as input variables. Since we can combine the four remaining random processes for modeling the j-th stochastic process, it is possible to formalize mathematically a total of 2⁴ − 1 = 15 different nonlinear relationships. The relationships among the two-dimensional, marginal, and conditional entropies are shown in Figure 2.
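The count of 2⁴ − 1 = 15 candidate models can be verified by enumerating all non-empty subsets of the four remaining variables. The following sketch does this for the tree diameter (j = 1); the dictionary of variable names is only a labeling convenience.

```python
from itertools import combinations

NAMES = {1: "diameter", 2: "potentially available area", 3: "height",
         4: "crown base height", 5: "crown width"}


def explanatory_subsets(response: int):
    """All non-empty subsets of the four variables other than the response."""
    others = [j for j in NAMES if j != response]
    return [subset for size in range(1, len(others) + 1)
            for subset in combinations(others, size)]


subsets = explanatory_subsets(response=1)
print(len(subsets))  # 15
for subset in subsets:
    print([NAMES[j] for j in subset])
```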

From a theoretical interaction information measurement perspective, the theoretical values of entropy can take normalized forms [45]. In this paper, we introduce a new conditional entropy normalization method that uses multi-dimensional entropy as the denominator to adjust the conditional entropy value within [0, 1] (see Figure 2). Unlike the correlation metric, the conditional entropies defined by Equations (44)–(47) can be normalized, for each stochastic process j = 1, …, 5, as

R_{j / k} = 1 - \frac{H (X_{j} / X_{k})}{H (X_{j}, X_{k})}, k \in \{1, 2, 3, 4, 5\} \ j

(48)

R_{j / k, m} = 1 - \frac{H (X_{j} / X_{k}, X_{m})}{H (X_{j}, X_{k}, X_{m})}, k, m \in \{1, 2, 3, 4, 5\} \ j

(49)

R_{j / k, m, n} = 1 - \frac{H (X_{j} / X_{k}, X_{m}, X_{n})}{H (X_{j}, X_{k}, X_{m}, X_{n})}, k, m, n \in \{1, 2, 3, 4, 5\} \ j

(50)

R_{j / k, m, n, s} = 1 - \frac{H (X_{j} / X_{k}, X_{m}, X_{n}, X_{s})}{H (X_{j}, X_{k}, X_{m}, X_{n}, X_{s})}, k, m, n, s \in \{1, 2, 3, 4, 5\} \ j

(51)

It is noteworthy that the normalized interaction information indices defined by Equations (48)–(51) can serve as informational measures of the dependency between the response variable and one, two, three, and four explanatory variables, respectively, but they are not sensitive to the entropy of the entire multi-dimensional process or the relations among them. From the application point of view, it is reasonable to change all normalizing denominators in Equations (48)–(51) to the entropy of all five random processes

X_{1} (t), X_{2} (t), X_{3} (t), X_{4} (t), X_{5} (t)

in the following forms:

{\bar{R}}_{j / k} = 1 - \frac{H (X_{j} / X_{k})}{H (X_{1}, X_{2}, X_{3}, X_{4}, X_{5})}, k \in \{1, 2, 3, 4, 5\} \ j

(52)

{\bar{R}}_{j / k, m} = 1 - \frac{H (X_{j} / X_{k}, X_{m})}{H (X_{1}, X_{2}, X_{3}, X_{4}, X_{5})}, k, m \in \{1, 2, 3, 4, 5\} \ j

(53)

{\bar{R}}_{j / k, m, n} = 1 - \frac{H (X_{j} / X_{k}, X_{m}, X_{n})}{H (X_{1}, X_{2}, X_{3}, X_{4}, X_{5})}, k, m, n \in \{1, 2, 3, 4, 5\} \ j

(54)

{\bar{R}}_{j / k, m, n, s} = 1 - \frac{H (X_{j} / X_{k}, X_{m}, X_{n}, X_{s})}{H (X_{1}, X_{2}, X_{3}, X_{4}, X_{5})}, k, m, n, s \in \{1, 2, 3, 4, 5\} \ j

(55)
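The two normalizations differ only in the denominator, as the following sketch shows. The entropy values are hypothetical numbers chosen for illustration and are not the estimates reported in Table 3.

```python
def normalized_index(h_conditional: float, h_reference: float) -> float:
    """1 - H(conditional) / H(reference), as in Equations (48)-(55)."""
    if h_reference == 0:
        raise ValueError("reference entropy must be non-zero")
    return 1.0 - h_conditional / h_reference


# Hypothetical entropies in bits, for illustration only.
h_cond = 2.1   # H(X_1 / X_3)
h_pair = 6.0   # H(X_1, X_3), denominator of Equation (48)
h_all = 15.0   # H(X_1, ..., X_5), denominator of Equation (52)

print(normalized_index(h_cond, h_pair))  # R_{1/3}
print(normalized_index(h_cond, h_all))   # bar R_{1/3}
```

Because the denominator in Equations (52)–(55) is the same for every model, the ranking of models by the index with a bar depends only on the conditional entropy, which makes models with different numbers of explanatory variables directly comparable.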

In forestry sciences, the ability to detect highly linked stochastic growth processes and their interactions might be very beneficial in understanding how trees in a stand interact and affect each other [47,48]. The normalized information indices calculated by Equations (52)–(55) for the stochastic growth process of tree diameter are presented in Table 3. The comparison of the normalized information index values

R_{1 / k}

,

R_{1 / k, m}

, and

R_{1 / k, m, n}

(see Table 3) supports the conclusion that the stochastic tree height growth process is the best fit among all models with one explanatory variable; the stochastic tree height and crown width growth processes are the best fit among all models with two explanatory variables; and the stochastic tree potentially available area, height, and crown width growth processes are the best fit among all models with three explanatory variables. Similarly, comparing the values of the normalized information index

{\bar{R}}_{1 / k}

,

{\bar{R}}_{1 / k, m}

, and

{\bar{R}}_{1 / k, m, n}

showed that the stochastic growth processes of the tree potentially available area, height, and crown width are best suited for analyzing all possible combinations of explanatory variables.

The relationships among the growth processes of individual tree size components (for example, diameter) and other tree size components (for example, potentially available area, height, etc.) are very important tools for modeling stand productivity and increments because these models are very useful in formulating forest management plans [49,50]. The newly developed methodology of linking stochastic differential equations and the normal copula function makes it possible to formalize the evolution of individual components of tree size with respect to other components of tree size by using nonlinear relationships. Using the probability density functions (6)–(10), we define the dynamics of the mean,

m_{j}^{i} (t)

, i = 1, …, M, of the stochastic process

\{X_{j}^{i} (t) | t \in [t_{0}, T]\}

as follows:

\begin{matrix} m_{j}^{i} (t) = E (X_{j}^{i} (t)) = \exp (μ_{j}^{i} (t) + \frac{1}{2} v_{j} (t)), j = 1, 2, 3 \\ m_{j}^{i} (t) = E (X_{j}^{i} (t)) = μ_{j}^{i} (t), j = 4, 5 \end{matrix}

(56)

Using the conditional probability density function (34) and the integration operation, we define the dynamics of the mean of the stochastic process

\{X_{j}^{i} (t) | t \in [t_{0}, T]\}

, j = 1, …, 5, i = 1, …, M, given

\{X_{k}^{i} (t) = x_{k}^{i} | t \in [t_{0}, T]\}

,

k \in \{1, 2, 3, 4, 5\} \ j,

as follows:

m_{j}^{i} (x_{k}^{i}, t) = E (X_{j}^{i} (t) | X_{k}^{i} (t) = x_{k}^{i}) = \int_{Υ_{j}} x_{j}^{i} \cdot f_{j | k}^{i} (x_{j}^{i}, t | \hat{ρ_{j k}}, x_{k}^{i}) d x_{j}^{i}

(57)

where

Υ_{j}

is a support of the density function,

Υ_{j} = [{\hat{γ}}_{j}, + \infty)

, j = 1, 2, 3, and

Υ_{j} = (- \infty, + \infty)

, j = 4, 5.
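The conditional mean in Equation (57) can be approximated by numerical integration. The sketch below assumes normal marginals (the case j = 4, 5) with hypothetical parameters and truncates the infinite support to a finite interval. For this special case the normal copula with normal marginals is a bivariate normal distribution, so the result can be compared with the familiar closed-form conditional mean. For j = 1, 2, 3 the lower limit would instead be the estimated threshold of the support.

```python
import math
from statistics import NormalDist

STD = NormalDist()


def copula_density(a, b, rho):
    """Bivariate normal copula density at normal scores a and b."""
    q = (rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * (1.0 - rho * rho))
    return math.exp(-q) / math.sqrt(1.0 - rho * rho)


def conditional_mean(xk, marg_j, marg_k, rho, lo, hi, steps=4000):
    """Equation (57) evaluated with the composite trapezoidal rule on [lo, hi]."""
    b = STD.inv_cdf(marg_k.cdf(xk))

    def integrand(x):
        a = STD.inv_cdf(marg_j.cdf(x))
        return x * copula_density(a, b, rho) * marg_j.pdf(x)

    h = (hi - lo) / steps
    total = 0.5 * (integrand(lo) + integrand(hi))
    for i in range(1, steps):
        total += integrand(lo + i * h)
    return total * h


# Hypothetical marginals and correlation, for illustration only.
marg_j = NormalDist(mu=14.0, sigma=6.0)
marg_k = NormalDist(mu=4.0, sigma=1.5)
rho = 0.6
numeric = conditional_mean(5.5, marg_j, marg_k, rho, lo=-22.0, hi=50.0)
closed_form = marg_j.mean + rho * marg_j.stdev * (5.5 - marg_k.mean) / marg_k.stdev
print(numeric, closed_form)
```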

Using the conditional probability density function (35) and the integration operation, we define the dynamics of the mean of the stochastic process

\{X_{j}^{i} (t) | t \in [t_{0}, T]\}

, j = 1, …, 5, i = 1, …, M, given

\{X_{k}^{i} (t) = x_{k}^{i}, X_{m}^{i} (t) = x_{m}^{i} | t \in [t_{0}, T]\}

,

k, m \in \{1, 2, 3, 4, 5\} \ j,

as follows:

m_{j}^{i} (x_{k}^{i}, x_{m}^{i}, t) = E (X_{j}^{i} (t) | X_{k}^{i} (t) = x_{k}^{i}, X_{m}^{i} (t) = x_{m}^{i}) = \int_{Υ_{j}} x_{j}^{i} \cdot f_{j | k, m}^{i} (x_{j}^{i}, t | \hat{P}, x_{k}^{i}, x_{m}^{i}) d x_{j}^{i}

(58)

Using the conditional probability density function (36) and the integration operation, we define the dynamics of the mean of the stochastic process

\{X_{j}^{i} (t) | t \in [t_{0}, T]\}

, j = 1, …, 5, i = 1, …, M, given

\{X_{k}^{i} (t) = x_{k}^{i}, X_{m}^{i} (t) = x_{m}^{i}, X_{n}^{i} (t) = x_{n}^{i} | t \in [t_{0}, T]\}

,

k, m, n \in \{1, 2, 3, 4, 5\} \ j,

as follows:

m_{j}^{i} (x_{k}^{i}, x_{m}^{i}, x_{n}^{i}, t) = E (X_{j}^{i} (t) | X_{k}^{i} (t) = x_{k}^{i}, X_{m}^{i} (t) = x_{m}^{i}, X_{n}^{i} (t) = x_{n}^{i}) = \int_{Υ_{j}} x_{j}^{i} \cdot f_{j | k, m, n}^{i} (x_{j}^{i}, t | \hat{P}, x_{k}^{i}, x_{m}^{i}, x_{n}^{i}) d x_{j}^{i}

(59)

Using the conditional probability density function (37) and the integration operation, we define the dynamics of the mean of the stochastic process

\{X_{j}^{i} (t) | t \in [t_{0}, T]\}

, j = 1, …, 5, i = 1, …, M, given

\{X_{k}^{i} (t) = x_{k}^{i}, X_{m}^{i} (t) = x_{m}^{i}, X_{n}^{i} (t) = x_{n}^{i}, X_{s}^{i} (t) = x_{s}^{i} | t \in [t_{0}, T]\}

,

k, m, n, s \in \{1, 2, 3, 4, 5\} \ j,

as follows:

\begin{matrix} m_{j}^{i} (x_{k}^{i}, x_{m}^{i}, x_{n}^{i}, x_{s}^{i}, t) & = E (X_{j}^{i} (t) | X_{k}^{i} (t) = x_{k}^{i}, X_{m}^{i} (t) = x_{m}^{i}, X_{n}^{i} (t) = x_{n}^{i}, X_{s}^{i} (t) = x_{s}^{i}) \\ & = \int_{Υ_{j}} x_{j}^{i} \cdot f_{j | k, m, n, s}^{i} (x_{j}^{i}, t | \hat{P}, x_{k}^{i}, x_{m}^{i}, x_{n}^{i}, x_{s}^{i}) d x_{j}^{i} \end{matrix}

(60)

The statistical indices, namely the coefficient of determination and the root mean square error, calculated for all 15 tree diameter (j = 1) models defined by Equations (57)–(60) are summarized in Table 4. Ranking these 15 nonlinear models by the coefficient of determination and the root mean square error gives the same order as the above-discussed comparison of model adequacy based on the newly defined normalized interaction information index

{\bar{R}}_{j / \cdot}

, j = 1, …, 5. Finally, we can conclude that to define the evolution of the mean tree diameter with three explanatory variables, it is appropriate to use the potentially available area, height, and crown width of the tree; with two explanatory variables, it is appropriate to use tree height and crown width; and with one explanatory variable, it is appropriate to use tree height.
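For reference, the two fit statistics can be computed as in the sketch below. It assumes the usual definitions (root mean square error as the square root of the mean squared residual, and the coefficient of determination as one minus the ratio of the residual to the total sum of squares); the exact variants used for Table 4 are not restated here, and the data are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical diameters (cm) and model predictions, for illustration only.
observed = [12.0, 18.5, 24.1, 30.2, 35.8]
predicted = [13.1, 17.9, 25.0, 29.0, 36.5]
print(rmse(observed, predicted), r_squared(observed, predicted))
```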

Using Equations (39)–(60), analogous conclusions can be drawn for other components of tree size, such as potentially available area, height, crown base height, and crown area.

5. Conclusions

We emphasize that previously used multi-dimensional stochastic differential equation models do not completely describe the dependence structure, and their practical implementation requires a rather large number of calculations. The current work describes an alternative copula-based methodology for prediction models in growth-process research. Newly derived copula-type multi-dimensional and conditional probability density functions, defined by Equations (30)–(37), make it possible to predict not only the mean dynamics of individual stochastic processes X_j, j = 1, …, 5, but also the corresponding means of other attributes, for example, in forestry, tree or stand basal area, tree or stand volume, tree slenderness, number of trees per hectare, and others. This study introduced the normalized multi-dimensional interaction information index based on differential entropy to capture dependencies between state variables. Using conditional copula-type probability density functions, the exact form equations defining the links among the diameter, potentially available area, height, crown base height, and crown width were derived. The main statistical advantage of copulas is the ability to model different datasets with any modified type of marginal distribution.

Future research could be related to the use of an asymmetric multivariate log-normal copula and covering a wider range of diffusion processes.

Author Contributions

Conceptualization, P.R. and E.P.; methodology, P.R.; software, P.R.; validation, R.K., G.M., E.P. and P.R.; formal analysis, R.K.; investigation, G.M.; data curation, E.P.; writing—original draft preparation, P.R.; writing—review and editing, R.K., G.M., E.P. and P.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Original data presented in the study are included in the main text, and further inquiries can be directed to the corresponding author.

Acknowledgments

We thank the Edmundas Petrauskas research group for their help with sampling in the field. The authors would like to express their appreciation for the support from the Lithuanian Association of Impartial Timber Scalers. Thanks are also due to anonymous reviewers for their constructive criticism.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Data

We focus on the modeling of uneven-aged, mixed-species (pine (Pinus sylvestris), spruce (Picea abies), and silver birch (Betula pendula Roth and Betula pubescens Ehrh.)) tree datasets. In terms of regeneration regime, all plots varied between naturally regenerated and artificially regenerated and were located in pure or mixed stands. Plots were measured between one and seven times every five or more years. At the establishment of 48 permanent experimental plots, the following data were recorded for the sample trees: age, diameter at breast height, tree position, height, crown base height, and crown width. The age of the ith tree (ranging from all trees to the 10th) in the first measurement cycle was recorded by counting its growth rings in the growth core (for even-aged stands, from the records in the documents), and the age of the remaining trees was obtained from the arithmetic mean. The position accuracy of the plane coordinates was 1 dm, and the diameter measurements were made to the nearest 1 mm. The height, crown base height, and crown width measurements were made with an accuracy of approximately 1 dm. Given the three-level measurements, three different datasets were used to obtain parameter estimates: the first (48 plots; 39,437 mixed-species trees) was used to estimate the fixed effect parameters for the tree diameter and potentially available area Equation (1), the second (48 plots; 8604 mixed-species trees) was used to estimate the fixed effect parameters for the tree height Equation (1), and the third (31 plots; 1378 mixed-species trees; see Table A1) was used to estimate the fixed effect parameters for the tree crown base height and crown width Equation (2) and to estimate the correlation matrix of the five-dimensional normal copula using the pseudo-maximum likelihood procedure defined by Equation (30). The summary of measurements is presented in Table A1.

Table A1. Tree age, diameter, potentially available area, height, crown base height, and crown width summary statistics for model estimation.

Data Level	Data	Number of Trees	Min	Max	Mean	St. Dev.
First (48 plots)	t * (year)	39,437	12.0	211.0	59.25	26.36
	d (cm)	39,437	0.1	72.2	16.95	10.22
	p (m²)	39,437	0.09	173.82	10.37	8.95
Second (48 plots)	t (year)	8804	12.0	211.0	52.88	29.93
	h (m)	8604	1.30	38.00	15.62	9.16
Third (31 plots)	t (year)	1378	46.0	211.0	83.98	24.39
	d (cm)	1378	3.20	62.40	24.12	10.38
	p (m²)	1378	1.37	161.51	15.27	11.57
	h (m)	1378	1.30	37.80	22.73	6.84
	hc (m)	1378	0.90	29.70	14.85	6.39
	wc (m)	1378	0.58	115.84	12.00	9.94

* t—tree age, d—tree diameter, p—tree potentially available area (the area of the polygon formed by the Voronoi diagram [26]), h—tree height, hc—tree crown base height, wc—tree crown width.

References

1. Ginzburg, L.R. The theory of population dynamics: I. Back to first principles. J. Theor. Biol. 1986, 122, 385–399.
2. Turchin, P. Does population ecology have general laws? Oikos 2001, 94, 17–26.
3. Hara, T. A stochastic model and the moment dynamics of the growth and size distribution in plant populations. J. Theor. Biol. 1984, 109, 173–190.
4. Gompertz, B. On the nature of the function expressive of the law of human mortality, and on a new mode of determining the value of life contingencies. Philos. Trans. R. Soc. A 1825, 115, 513–585.
5. Verhulst, P.F. Deuxième mémoire sur la loi d’accroissement de la population. Mém. Acad. R. Sci. Lett. B Arts Belg. 1847, 20, 142–173.
6. Zeide, B. Analysis of Growth Equations. For. Sci. 1993, 39, 594–616.
7. Vanclay, J.K. Modelling Forest Growth and Yield: Applications to Mixed Tropical Forests; CAB International: Wallingford, UK, 1994; 312p.
8. Yanishevskyi, V.S.; Baranovska, S.P. Path integral method for stochastic equations of financial engineering. Math. Model. Comput. 2022, 9, 166–177.
9. Shao, Y. Dynamics of an Impulsive Stochastic Predator–Prey System with the Beddington–DeAngelis Functional Response. Axioms 2021, 10, 323.
10. Petrauskas, E.; Bartkevičius, E.; Rupšys, P.; Memgaudas, R. The use of stochastic differential equations to describe stem taper and volume. Baltic For. 2013, 19, 43–151.
11. Madheswaran, M.; Lingaraja, K.; Duraisamy, P. Econometric and stochastic analysis of stock price before and during COVID-19 in India. Environ. Dev. Sustain. 2023, 1–16.
12. Ali, I.; Khan, S.U. A Dynamic Competition Analysis of Stochastic Fractional Differential Equation Arising in Finance via Pseudospectral Method. Mathematics 2023, 11, 1328.
13. Bonaccorsi, S.; Ottaviano, S. A stochastic differential equation SIS model on network under Markovian switching. Stoch. Anal. Appl. 2022, 1–29.
14. García, O. New class of growth models for even-aged stands: Pinus radiata in Golden Downs Forest. N. Z. J. For. Sci. 1984, 14, 65–88.
15. Sloboda, B. Kolmogorow–Suzuki und die stochastische Differentialgleichung als Beschreibungsmittel der Bestandesevolution. Mitt. Forstl. Bundes Vers. Wien 1977, 120, 71–82.
16. Suzuki, T. Forest transition as a stochastic process (I). J. Jpn. For. Sci. 1966, 48, 436–439.
17. Narmontas, M.; Rupšys, P.; Petrauskas, E. Models for Tree Taper Form: The Gompertz and Vasicek Diffusion Processes Framework. Symmetry 2020, 12, 80.
18. Garcia, O. A parsimonious dynamic stand model for interior spruce in British Columbia. For. Sci. 2011, 57, 265–280.
19. Rupšys, P. Generalized fixed-effects and mixed-effects parameters height–diameter models with diffusion processes. Int. J. Biomath. 2015, 8(5), 1550060.
20. Rupšys, P. Modeling Dynamics of Structural Components of Forest Stands Based on Trivariate Stochastic Differential Equation. Forests 2019, 10, 506.
21. Rupšys, P.; Petrauskas, E. Analysis of Longitudinal Forest Data on Individual-Tree and Whole-Stand Attributes Using a Stochastic Differential Equation Model. Forests 2022, 13, 425.
22. Gutiérrez, R.; Gutiérrez-Sánchez, R.; Nafidi, A.; Pascual, A. Detection, modelling and estimation of non-linear trends by using a non-homogeneous Vasicek stochastic diffusion. Application to CO₂ emissions in Morocco. Stoch. Environ. Res. Risk Assess. 2011, 26, 533–543.
23. Vasicek, O.A. The distribution of loan portfolio value. Risk 2002, 15, 160–162.
24. Román-Román, P.; Serrano-Pérez, J.J.; Torres-Ruiz, F. A Note on Estimation of Multi-Sigmoidal Gompertz Functions with Random Noise. Mathematics 2019, 7, 541.
25. Barrera, A.; Román-Román, P.; Torres-Ruiz, F. Hyperbolastic Models from a Stochastic Differential Equation Point of View. Mathematics 2021, 9, 1835.
26. Rupšys, P.; Petrauskas, E. Symmetric and Asymmetric Diffusions through Age-Varying Mixed-Species Stand Parameters. Symmetry 2021, 13, 1457.
27. Sklar, M. Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Stat. Univ. Paris 1959, 8, 229–231.
28. Bhatti, M.I.; Do, H.Q. Development in Copula Applications in Forestry and Environmental Sciences. In Statistical Methods and Applications in Forestry and Environmental Sciences; Springer Nature: Singapore, 2020; Volume 13386, pp. 213–231.
29. Rupšys, P.; Petrauskas, E. On the Construction of Growth Models via Symmetric Copulas and Stochastic Differential Equations. Symmetry 2022, 14, 2127.
30. Liu, J.; Wan, Y.; Qu, S.; Qing, R.; Sriboonchitta, S. Dynamic Correlation between the Chinese and the US Financial Markets: From Global Financial Crisis to COVID-19 Pandemic. Axioms 2023, 12, 14.
31. Fisher, R.A. On the mathematical foundations of theoretical statistics. Philos. Trans. R. Soc. A 1922, 222, 309–368.
32. Rupšys, P.; Petrauskas, E. Modeling Number of Trees per Hectare Dynamics for Uneven-Aged, Mixed-Species Stands Using the Copula Approach. Forests 2023, 14, 12.
33. Mackevičius, V. Introduction to Stochastic Analysis: Integrals and Differential Equations; ISTE: Washington, DC, USA; Wiley: London, UK, 2011.
34. Uhlenbeck, G.E.; Ornstein, L.S. On the Theory of Brownian Motion. Phys. Rev. 1930, 36, 823–841.
35. García, O. Estimating reducible stochastic differential equations by conversion to a least-squares problem. Comput. Stat. 2019, 34, 23–46.
36. Shoji, I.; Ozaki, T. A statistical method of estimation and simulation for systems of stochastic differential equations. Biometrika 1998, 85, 240–243.
37. Han, Y.; Yin, Z.; Zhang, D. Parameter Estimation of Linear Stochastic Differential Equations with Sparse Observations. Symmetry 2022, 14, 2500.
38. Picchini, U.; Ditlevsen, S. Practical estimation of high dimensional stochastic differential mixed-effects models. Comput. Stat. Data Anal. 2011, 55, 1426–1444.
39. Joe, H. Accuracy of Laplace approximation for discrete response mixed models. Comput. Stat. Data Anal. 2008, 52, 5066–5074.
40. Rao, C.R. Linear Statistical Inference and Its Applications; Wiley: New York, NY, USA, 1965.
41. Ashrafi, M.; Soltanian-Zadeh, H. Multivariate Gaussian Copula Mutual Information to Estimate Functional Connectivity with Less Random Architecture. Entropy 2022, 24, 631.
42. McNeil, J.; Frey, R.; Embrechts, P. Quantitative Risk Management; Princeton Series in Finance; Princeton University Press: Princeton, NJ, USA, 2005.
43. Joe, H. Asymptotic efficiency of the two-stage estimation method for copula based models. J. Multivar. Anal. 2005, 94, 401–419.
44. Monagan, M.B.; Geddes, K.O.; Heal, K.M.; Labahn, G.; Vorkoetter, S.M.; Mccarron, J. Maple Advanced Programming Guide; Maplesoft: Waterloo, ON, Canada, 2007.
45. Rupšys, P. Understanding the Evolution of Tree Size Diversity within the Multivariate nonsymmetrical Diffusion Process and Information Measures. Mathematics 2019, 7, 761.
46. Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 623–656.
47. DSouza, A.M.; Abidin, A.Z.; Leistritz, L.; Wismüller, A. Exploring connectivity with large-scale Granger causality on resting-state functional MRI. J. Neurosci. Methods 2017, 287, 68–79.
48. Islam, M.R.; Ahmed, B.; Hossain, M.A.; Uddin, M.P. Mutual Information-Driven Feature Reduction for Hyperspectral Image Classification. Sensors 2023, 23, 657.
49. Gavrikov, V.L.; Fertikov, A.I.; Vidus, V.E.; Sharafutdinov, R.A.; Vaganov, E.A. Elemental Variability in Stems of Pinus sylvestris L.: Whether a Single Core Can Represent All the Stem. Diversity 2023, 15, 281.
50. Seo, Y.; Lee, D.; Choi, J. Developing and Comparing Individual Tree Growth Models of Major Coniferous Species in South Korea Based on Stem Analysis Data. Forests 2023, 14, 115.

Figure 1. Conditional and marginal (in black) probability density functions of the tree diameter at time t = 80 years for three different values of the conditioning variable: (a) tree potentially available area: 6 m²—red, 26 m²—blue, 46 m²—gold; (b) tree height: 15 m—red, 25 m—blue, 35 m—gold; (c) tree crown base height: 10 m—red, 20 m—blue, 30 m—gold; (d) tree crown area: 5 m²—red, 25 m²—blue, 45 m²—gold.

Figure 2. Relationships among the two-dimensional, marginal, and conditional entropies.

Table 1. Parameter estimates (standard errors) for the mixed-effect models of the stochastic differential Equations (1) and (2).

Equations	α	β	γ	σ	δ	σ_ϕ
Diameter	0.0850 (0.0007)	0.0226 (0.0002)	−7.1108 (0.0954)	0.0042 (6.8 × 10⁻⁵)	-	0.0069 (0.0010)
Potentially available area	0.0617 (0.0006)	0.0182 (0.0002)	−1.3259 (0.0358)	0.0102 (0.0001)	1.6151 (0.0237)	0.0094 (0.0014)
Height	0.0827 (0.0011)	0.0213 (0.0003)	−13.1583 (0.3489)	0.0013 (4.8 × 10⁻⁵)	-	0.0044 (0.0006)
Crown base height	18.1688 (0.4625)	0.0226 (0.0020)	-	1.3386 (0.1148)	-	4.4846 (0.5687)
Crown width	156.869 (15.8329)	0.0010 (0.0001)	-	1.0703 (0.0413)	-	33.9665 (3.6940)

Table 2. Parameter estimates (standard errors), ρ_ij, for the normal copula probability density function.

	ρ_i1	ρ_i2	ρ_i3	ρ_i4	ρ_i5
1	1	0.2913 (0.0270)	0.8916 (0.0041)	0.6272 (0.0132)	0.6268 (0.0143)
2	0.2913 (0.0270)	1	0.2336 (0.0276)	0.0635 (0.0279)	0.3105 (0.0266)
3	0.8916 (0.0041)	0.2336 (0.0276)	1	0.7359 (0.0101)	0.4612 (0.0187)
4	0.6272 (0.0132)	0.0635 (0.0279)	0.7359 (0.0101)	1	0.1618 (0.0237)
5	0.6268 (0.0143)	0.3105 (0.0266)	0.4612 (0.0187)	0.1618 (0.0237)	1
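
As a minimal sketch (not the authors' MAPLE implementation), the point estimates in Table 2 can be loaded into a correlation matrix, checked for positive definiteness, and used to evaluate a normal copula log-density. The sketch assumes the standard normal copula density, c(u) = |R|^(−1/2) exp(−½ zᵀ(R⁻¹ − I)z) with z_i = Φ⁻¹(u_i), and assumes that indices 1–5 follow the variable order of Table 1 (diameter, potentially available area, height, crown base height, crown width). The names `RHO`, `cholesky`, and `normal_copula_logpdf` are hypothetical.

```python
import math
from statistics import NormalDist

# Point estimates from Table 2 (standard errors omitted).
# Assumed order: diameter, potentially available area, height,
# crown base height, crown width.
RHO = [
    [1.0,    0.2913, 0.8916, 0.6272, 0.6268],
    [0.2913, 1.0,    0.2336, 0.0635, 0.3105],
    [0.8916, 0.2336, 1.0,    0.7359, 0.4612],
    [0.6272, 0.0635, 0.7359, 1.0,    0.1618],
    [0.6268, 0.3105, 0.4612, 0.1618, 1.0],
]


def cholesky(a):
    """Return lower-triangular L with L L^T = a.

    Raises ValueError if a is not positive definite, so a successful
    call doubles as a validity check for a correlation matrix.
    """
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def normal_copula_logpdf(u, rho):
    """Log-density of the normal copula at u in (0, 1)^d."""
    z = [NormalDist().inv_cdf(ui) for ui in u]
    low = cholesky(rho)
    # Forward substitution solves L y = z, so z^T rho^{-1} z = y^T y.
    y = []
    for i, zi in enumerate(z):
        y.append((zi - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(len(z)))
    quad = sum(v * v for v in y) - sum(v * v for v in z)
    return -0.5 * log_det - 0.5 * quad
```

Calling `cholesky(RHO)` without an exception indicates that the rounded estimates still form a valid correlation matrix. The Cholesky route is used here only to avoid an explicit matrix inverse; any linear algebra library would serve equally well.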

Table 3. Normalized information indices $R_{1 / \cdot}$ (Equations (48)–(51)) and ${\bar{R}}_{1 / \cdot}$ (Equations (52)–(55)).

k	2	3	4	5
$R_{1 / k}$	0.48694	0.52886	0.47929	0.52043
${\bar{R}}_{1 / k}$	0.66816	0.73356	0.68590	0.68535
k, m	2, 3	2, 4	2, 5	3, 4	3, 5	4, 5
$R_{1 / k, m}$	0.69803	0.66093	0.67514	0.67547	0.71624	0.68761
${\bar{R}}_{1 / k, m}$	0.81637	0.78519	0.78232	0.81561	0.82636	0.80151
k, m, n	2, 3, 4	2, 3, 5	2, 4, 5	3, 4, 5
$R_{1 / k, m, n}$	0.76585	0.79030	0.76747	0.77910
${\bar{R}}_{1 / k, m, n}$	0.81653	0.82653	0.80227	0.82658
k, m, n, s	2, 3, 4, 5
$R_{1 / k, m, n, s}$	0.79949
${\bar{R}}_{1 / k, m, n, s}$	0.79949

Table 4. Two statistical indices, the coefficient of determination (R²) and the root mean square error (RMSE), for fifteen different nonlinear models of the tree diameter (j = 1), defined by Equations (56)–(60).

Equation	Statistical Indices
(56)	R²	0.3678
(56)	RMSE	8.2050
	k	2	3	4	5
(57)	R²	0.3931	0.8260	0.5367	0.5280
(57)	RMSE	8.0855	4.3294	7.0631	7.1321
	k, m	2, 3	2, 4	2, 5	3, 4	3, 5	4, 5
(58)	R²	0.8312	0.5752	0.5309	0.8283	0.8360	0.6703
(58)	RMSE	4.2645	6.7633	7.1102	4.3002	4.1997	5.9608
	k, m, n	2, 3, 4	2, 3, 5	2, 4, 5	3, 4, 5
(59)	R²	0.8323	0.8738 *	0.6687	0.8687
(59)	RMSE	4.2497	3.6863 *	5.9758	3.7624
	k, m, n, s	2, 3, 4, 5
(60)	R²	0.8666
(60)	RMSE	3.7898

* The best value of the statistical index.
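
The two indices reported in Table 4 can be computed from paired observed and predicted diameters as sketched below. This is an illustrative sketch under stated assumptions: it uses the common definitions R² = 1 − SS_res/SS_tot and RMSE = sqrt(SS_res/n), which may differ from the paper's exact formulas (for example, in the RMSE divisor), and the function names `r_squared` and `rmse` are hypothetical.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error, assuming division by the sample size n."""
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(ss_res / len(observed))
```

Both functions take two equal-length sequences, so the same call pattern applies to each of the models in Table 4 once its predictions are available.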


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Krikštolaitis, R.; Mozgeris, G.; Petrauskas, E.; Rupšys, P. A Statistical Dependence Framework Based on a Multivariate Normal Copula Function and Stochastic Differential Equations for Multivariate Data in Forestry. Axioms 2023, 12, 457. https://doi.org/10.3390/axioms12050457

