Article

A New Bivariate Family Based on Archimedean Copulas: Simulation, Regression Model and Application

by Gabriela M. Rodrigues 1, Edwin M. M. Ortega 1,*, Roberto Vila 2 and Gauss M. Cordeiro 3
1 Department of Exact Sciences, University of São Paulo, Piracicaba 13418-900, Brazil
2 Department of Statistics, University of Brasilia, Brasilia 70910-900, Brazil
3 Department of Statistics, Federal University of Pernambuco, Recife 50670-901, Brazil
* Author to whom correspondence should be addressed.
Symmetry 2023, 15(9), 1778; https://doi.org/10.3390/sym15091778
Submission received: 18 August 2023 / Revised: 12 September 2023 / Accepted: 13 September 2023 / Published: 18 September 2023

Abstract

We use the Clayton and Frank copulas and the exponentiated odd log-logistic family to define a new flexible bivariate model for fitting bimodal and asymmetric data. The copulas allow a different marginal distribution for each response variable, which makes the analysis more flexible. We present some structural properties of the new model and describe a simulation study to show the consistency of the estimators. We construct a bivariate regression model based on the new family to fit oak lettuce plant data for different concentrations of silicon dioxide and organosilicon compounds. We analyze the response variables fresh weight and plant height jointly in order to account for the correlation between them. These variables exhibit a bimodal form, and the family used is able to model this behavior. Different marginal distributions are selected, which is an interesting point of the copula methodology. The variables have strong positive dependence, and the experiment is carried out comparing the control treatment with the others, leading to the following results: (i) the treatment 1-ethoxysilatrane (with concentrations 5 × 10⁻⁴ mL·L⁻¹ and 10⁻³ mL·L⁻¹) is not significant for the response variables; (ii) the treatment amorphous silicon dioxide (with concentrations 50 mg·L⁻¹ and 100 mg·L⁻¹) and the treatment 1-ethoxysilatrane (with concentrations 5 × 10⁻³ mL·L⁻¹ and 10⁻² mL·L⁻¹) are significant and have positive effects on both responses; (iii) the treatment amorphous silicon dioxide (with concentrations 200 mg·L⁻¹ and 300 mg·L⁻¹) is significant and has negative effects on the response variables. Overall, the proposed bivariate model is suitable for the current data and can be useful in other applications.

1. Introduction

Various areas of knowledge can use modeling involving one or more response variables of interest, i.e., two or more attributes to be modeled. If these attributes are not independent and a practical explanation exists for this situation, multivariate statistical models should be used with the objective of explaining and capturing the correlation of these variables.
For this purpose, joint continuous distributions are commonly used. The bivariate normal and t-Student distributions are most often used, although they can be highly restrictive. Besides this, employing them implies a linear relationship and elliptical structure of the variables. These premises do not always hold.
The modeling of multivariate data based on copula functions is an interesting alternative to overcome these drawbacks. According to Nelsen (1986) [1], copulas provide a way to relate functions of multivariate distributions based on their marginal distribution functions. The main advantages from a statistical standpoint using this method are:
  • In constructing joint distributions, copulas allow each variable to be modeled individually by its own marginal distribution, thus enabling more flexible associations by combining different marginal distributions;
  • Employment of copulas is a useful approach to model and understand the phenomenon of dependence, i.e., the dependence between these variables can assume diverse structures, including nonlinear ones, according to the type of copula utilized;
  • The marginal distributions should not depend on the association of the variables under analysis;
  • A copula is invariant under strictly increasing continuous transformations of the marginals.
Various bivariate distributions have been proposed based on copula functions, such as the bivariate Weibull derived from the Farlie–Gumbel–Morgenstern (FGM), Ali–Mikhail–Haq (AMH), Gumbel–Hougaard, and Gumbel–Barnett copula functions [2], the generalized bivariate Rayleigh using the Clayton copula (El-Sherpieny and Almetwally [3]), the bivariate Fréchet based on the FGM or AMH copulas [4], and the generalized inverted Kumaraswamy using the Marshall–Olkin method [5]. Samanthi and Sepanski [6] defined families from four bivariate copulas using the Kumaraswamy distribution. Other examples are the bivariate exponentiated half logistic based on the Marshall–Olkin class [7] and the bivariate Lindley distribution via the FGM copula [8]. Here, we focus on Archimedean copulas, which have closed-form expressions as an important characteristic, since they can be easily constructed starting from specific generator functions. Further, they are highly flexible, permitting the modeling of various forms of dependence, including asymmetry and extreme dependence. Due to the ease of their construction, these copulas have a large number of applications in various areas of knowledge, such as finance ([3,9,10]), health ([11,12]), hydrology ([13,14,15]), and survival analysis ([16,17,18]). Detailed studies of these families and their applications can be found in Nelsen (2007).
In many practical situations, the response variable presents an asymmetric and/or bimodal behavior. The motivation for this work comes from an experiment carried out at the Plekhanov Russian University of Economics that evaluated the growth of oak leaf lettuce (Lactuca sativa var. crispa). The histograms of the response variables fresh weight (grams) ($y_1$) and plant height (cm) ($y_2$) measured in the experiment are reported in Figure 1a,b. They present bimodal behavior and positive asymmetries. In addition, the scatter plot between them (Figure 1c) indicates that they present a strong and positive correlation.
Although a wide range of flexible bivariate distributions can be found in the literature, we note a scarcity of bimodal bivariate distributions. To fill this gap, we use the exponentiated odd log-logistic-G (EOLL-G) family (Alizadeh et al., 2020 [19]), whose densities have great flexibility in modeling data with bimodality and/or positive or negative asymmetry. The Clayton and Frank copulas are adopted, so the new models are called the BCEOLL-G and BFEOLL-G families, respectively. We consider these copulas because they are suitable to model data with positive correlation (Naifar, 2011 [9]), as observed for the dataset in Figure 1c.
This paper is structured as follows: Section 2 provides a summary of copula functions. Section 3 proposes the bivariate Clayton and Frank copulas generated from the EOLL-G family. Section 4 introduces the Frank and Clayton copulas generated from the EOLL Normal distribution. Section 5 formulates the bivariate bimodal regression with copulas and presents inferential issues. Section 6 performs some simulations for different scenarios. Section 7 illustrates the new methodology for lettuce plants from an experimental trial. Section 8 concludes the article.

2. Archimedean Copulas

According to [20], copulas can be described as functions that link univariate marginal distributions to their joint distribution or, alternatively, as multivariate distribution functions whose marginals are uniform on the interval $[0, 1]$. The name and theory of copulas are based on the theorem of Sklar [21], which guarantees these relations and is stated below. The proof can be found in [22].
Sklar’s Theorem: Let $Y = (Y_1, \ldots, Y_d)$ be a random vector with marginal cumulative distribution functions (cdfs) $F_1(y_1), \ldots, F_d(y_d)$, and let $F(y_1, \ldots, y_d)$ be its joint cdf. Define $u_i = F_i(y_i) = P(Y_i \le y_i)$, $i = 1, \ldots, d$. Then, there is a copula function $C(\cdot)$ such that
$$C(u_1, \ldots, u_d) = P(Y_1 \le y_1, \ldots, Y_d \le y_d) = F(y_1, \ldots, y_d). \tag{1}$$
By differentiating (1), the joint probability density function (pdf) follows as
$$f(y_1, \ldots, y_d) = c(F_1(y_1), \ldots, F_d(y_d)) \prod_{i=1}^{d} f_i(y_i), \tag{2}$$
where $f_i(y_i)$ is the marginal pdf of $Y_i$ and $c(F_1(y_1), \ldots, F_d(y_d))$ is the copula density obtained by differentiating the copula function. For independent marginals, $C(u_1, \ldots, u_d) = \prod_{i=1}^{d} u_i$ and $c(F_1(y_1), \ldots, F_d(y_d)) = 1$, and then $f(y_1, \ldots, y_d) = \prod_{i=1}^{d} f_i(y_i)$, which means the independence of the random variables.
Henceforth, we consider the bivariate case for the data introduced in Section 1.
Archimedean copulas are constructed from a strictly decreasing continuous generating function $\varphi : [0, 1] \to [0, \infty]$, where $\varphi(0) = \infty$ and $\varphi(1) = 0$. Thus, the distribution function of a two-dimensional Archimedean copula can be expressed as
$$C(u, \lambda) = \varphi^{-1}(\varphi(u_1) + \varphi(u_2); \lambda), \quad u = (u_1, u_2),$$
where $\lambda$ is the parameter that controls the degree of dependence. The lower and upper tail dependence measures for a bivariate copula $C$ are defined in Joe (1993) [23] (if the limits exist) as
$$\lambda_L = \lim_{u \to 0^+} \frac{C(u, u)}{u}, \qquad \lambda_U = \lim_{u \to 1^-} \frac{1 - 2u + C(u, u)}{1 - u}.$$
For an Archimedean copula, these limits become
$$\lambda_L = \begin{cases} \displaystyle\lim_{u \to \infty} \frac{\varphi^{-1}(2u)}{\varphi^{-1}(u)}, & \text{if } \varphi(0) = \infty, \\ 0, & \text{otherwise}, \end{cases} \qquad \lambda_U = 2 - \lim_{u \to 0^+} \frac{1 - \varphi^{-1}(2u)}{1 - \varphi^{-1}(u)}. \tag{3}$$
Here, we employ the most cited bivariate Archimedean copulas (Clayton, 1978 [24]; Frank, 1979 [25]) for constructing new bivariate response regression models.

2.1. Clayton Copula

The Clayton copula [24] has generating function $\varphi(t) = (t^{-\lambda} - 1)/\lambda$ and distribution and density functions given by
$$C(u_1, u_2) = (u_1^{-\lambda} + u_2^{-\lambda} - 1)^{-1/\lambda} \tag{4}$$
and
$$c(u_1, u_2) = (\lambda + 1)(u_1^{-\lambda} + u_2^{-\lambda} - 1)^{-(2\lambda + 1)/\lambda} (u_1 u_2)^{-(\lambda + 1)}, \tag{5}$$
respectively, where $\lambda > 0$. In this case, the two variables become independent as $\lambda$ tends to zero. As $\varphi(0) = \infty$ and $\varphi^{-1}(t) = (1 + \lambda t)^{-1/\lambda}$, it follows from Equation (3) that
$$\lambda_L = \lim_{u \to \infty} \left( \frac{1 + 2\lambda u}{1 + \lambda u} \right)^{-1/\lambda} = 2^{-1/\lambda}$$
and
$$\lambda_U = 2 - \lim_{u \to 0^+} \frac{1 - (1 + 2\lambda u)^{-1/\lambda}}{1 - (1 + \lambda u)^{-1/\lambda}} = 0.$$
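As a numerical check of the Clayton formulas, the following Python sketch evaluates the distribution and density functions and approximates the lower tail dependence at a small value of $u$. It is not part of the original article; the function names and the choice $u = 10^{-8}$ are illustrative assumptions.

```python
def clayton_cdf(u1, u2, lam):
    """Clayton copula C(u1, u2) for lam > 0."""
    return (u1 ** -lam + u2 ** -lam - 1.0) ** (-1.0 / lam)


def clayton_density(u1, u2, lam):
    """Clayton copula density c(u1, u2) for lam > 0."""
    base = u1 ** -lam + u2 ** -lam - 1.0
    return ((lam + 1.0) * base ** (-(2.0 * lam + 1.0) / lam)
            * (u1 * u2) ** (-(lam + 1.0)))


def lower_tail_ratio(lam, u=1e-8):
    """Finite-u approximation of lambda_L = lim_{u -> 0+} C(u, u) / u."""
    return clayton_cdf(u, u, lam) / u


if __name__ == "__main__":
    lam = 2.0
    # The ratio should be close to the closed form 2 ** (-1 / lam).
    print(lower_tail_ratio(lam), 2.0 ** (-1.0 / lam))
```

Larger values of $\lambda$ push $2^{-1/\lambda}$ toward one, which is one way to see that the Clayton copula concentrates dependence in the lower tail.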

2.2. Frank Copula

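The Frank copula [25] has generating function $\varphi(t) = -\log[(e^{-\lambda t} - 1)/(e^{-\lambda} - 1)]$ and distribution and density functions given by
$$C(u_1, u_2) = -\frac{1}{\lambda} \log\left[ 1 + \frac{(e^{-\lambda u_1} - 1)(e^{-\lambda u_2} - 1)}{e^{-\lambda} - 1} \right] \tag{6}$$
and
$$c(u_1, u_2) = \frac{\lambda\, e^{-\lambda u_1 - \lambda u_2} (1 - e^{-\lambda})}{(e^{-\lambda} - e^{-\lambda u_1} - e^{-\lambda u_2} + e^{-\lambda u_1 - \lambda u_2})^{2}}, \tag{7}$$
respectively, where $\lambda \in \mathbb{R} \setminus \{0\}$. As $\lambda$ tends to zero, the two variables become independent, and as $\lambda$ tends to infinity, the positive dependence between them becomes stronger. As $\varphi(0) = \infty$ and $\varphi^{-1}(t) = -\log\{1 + \exp(-t)[\exp(-\lambda) - 1]\}/\lambda$, it follows from Equation (3), using L’Hôpital’s rule, that $\lambda_L = 0$ and
$$\lambda_U = 2 - \lim_{u \to 0^+} \frac{\lambda + \log\{1 + e^{-2u}[e^{-\lambda} - 1]\}}{\lambda + \log\{1 + e^{-u}[e^{-\lambda} - 1]\}} = 0.$$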
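The Frank formulas can be checked in the same way. The Python sketch below is an illustration only (the function names are assumptions, not code from the article): it evaluates the distribution and density functions and shows that the upper tail ratio $[1 - 2u + C(u, u)]/(1 - u)$ is already small for $u$ close to one, in line with $\lambda_U = 0$.

```python
import math


def frank_cdf(u1, u2, lam):
    """Frank copula C(u1, u2) for lam != 0."""
    num = math.expm1(-lam * u1) * math.expm1(-lam * u2)
    return -math.log1p(num / math.expm1(-lam)) / lam


def frank_density(u1, u2, lam):
    """Frank copula density c(u1, u2) for lam != 0."""
    e = math.exp
    denom = e(-lam) - e(-lam * u1) - e(-lam * u2) + e(-lam * (u1 + u2))
    return lam * (1.0 - e(-lam)) * e(-lam * (u1 + u2)) / denom ** 2


def upper_tail_ratio(lam, u=0.9999):
    """Finite-u approximation of lambda_U = lim_{u -> 1-} [1 - 2u + C(u, u)] / (1 - u)."""
    return (1.0 - 2.0 * u + frank_cdf(u, u, lam)) / (1.0 - u)


if __name__ == "__main__":
    print(frank_cdf(0.3, 0.7, 5.0), frank_density(0.3, 0.7, 5.0))
    print(upper_tail_ratio(5.0))  # small, consistent with no upper tail dependence
```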

3. A Bivariate EOLL-G Family Based on Clayton and Frank Copulas

For any baseline cdf $G(y) = G(y; \eta)$ depending on a parameter vector $\eta$, Alizadeh et al. (2020) [19] defined the cdf of the exponentiated odd log-logistic (“EOLL-G”) family (for $y \in \mathbb{R}$) by
$$F(y) = F(y; \nu, \tau, \eta) = \frac{G^{\nu\tau}(y)}{\{G^{\nu}(y) + [\bar{G}(y)]^{\nu}\}^{\tau}}, \tag{8}$$
where $\bar{G}(y) = 1 - G(y)$, and $\nu > 0$ and $\tau > 0$ are two extra shape parameters.
The pdf associated with Equation (8) becomes
$$f(y) = f(y; \nu, \tau, \eta) = \frac{\nu \tau\, G^{\nu\tau - 1}(y) [\bar{G}(y)]^{\nu - 1} g(y)}{\{G^{\nu}(y) + [\bar{G}(y)]^{\nu}\}^{\tau + 1}}, \tag{9}$$
where $g(y) = dG(y)/dy$.
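To see how the two extra shape parameters act, the cdf and pdf above can be coded directly. The sketch below is a minimal Python illustration that assumes a standard normal baseline by default (the article treats a generic baseline $G$ at this point); the function names and parameter values are hypothetical.

```python
from statistics import NormalDist


def eoll_cdf(y, nu, tau, base=NormalDist()):
    """EOLL-G cdf: G^(nu*tau) / (G^nu + (1 - G)^nu)^tau."""
    g_cdf = base.cdf(y)
    h = g_cdf ** nu + (1.0 - g_cdf) ** nu
    return g_cdf ** (nu * tau) / h ** tau


def eoll_pdf(y, nu, tau, base=NormalDist()):
    """EOLL-G pdf, i.e. the derivative of eoll_cdf with respect to y."""
    g_cdf = base.cdf(y)
    h = g_cdf ** nu + (1.0 - g_cdf) ** nu
    return (nu * tau * base.pdf(y) * g_cdf ** (nu * tau - 1.0)
            * (1.0 - g_cdf) ** (nu - 1.0) / h ** (tau + 1.0))


if __name__ == "__main__":
    # With nu = 0.2 and tau = 1 the density is lower at the center than at y = 2,
    # which illustrates the bimodal shapes the family can produce.
    print(eoll_pdf(0.0, 0.2, 1.0), eoll_pdf(2.0, 0.2, 1.0))
```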
From now on, let $Y \sim \text{EOLL-G}(\nu, \tau, \eta)$ be a random variable with pdf (9). Let $X$ have cdf $G(\cdot; \eta)$ such that the moments $E(X^p) = \int x^p g(x)\, dx$ (for $p > 0$) are finite. Some properties of $Y$ are reported below:
(a) The positive moments of $Y$, i.e., $E(Y^p) = \int y^p f(y)\, dy$, $p > 0$, are finite when $\nu > 1$ (see Appendix A.1).
(b) The variable $Y$ admits the stochastic representation $Y = G^{-1}(S/(S + 1))$, where $S$ has the Dagum distribution with shape parameters $\nu$ and $\tau$, and unit scale, and $G^{-1}(\cdot)$ denotes the inverse function of $G(\cdot)$ (see Appendix A.2).
(c) The positive moments of the standardized version of $Y$, say $[Y - E(Y)]/\sqrt{\text{Var}(Y)}$, are finite if $\nu > 1$ (see Appendix A.3).
Definition 1. Let $Y'_1, \ldots, Y'_d$ be an independent copy of $Y_1, \ldots, Y_d$, i.e., $Y' = (Y'_1, \ldots, Y'_d)$ is independent of $Y = (Y_1, \ldots, Y_d)$ and both have the same joint cdf $F = F_Y$. Following Koshevoy (1997) [26], the distance-Gini mean difference for $F$ is
$$M_D(F) \equiv \frac{1}{2d} E(\|Y - Y'\|) = \frac{1}{2d} \int_{(0, \infty)^d} \int_{(0, \infty)^d} \|y - y'\|\, dF(y)\, dF(y'), \tag{10}$$
where $\|\cdot\|$ denotes the Euclidean distance in $\mathbb{R}^2$.
(d) The distance-Gini mean difference $M_D(F)$ corresponding to the joint cdf $F$ in (1), where $Y_i \sim \text{EOLL-G}(\nu_i, \tau_i, \eta_i)$, $i = 1, \ldots, d$, is finite when $\nu > 1$ and $X$ has moments of order greater than one (see Appendix A.4).
(e) If $U \sim U(0, 1)$, then
$$Q_G\left( \frac{U^{1/(\nu\tau)}}{U^{1/(\nu\tau)} + (1 - U^{1/\tau})^{1/\nu}} \right) \sim \text{EOLL-G}(\nu, \tau, \eta), \tag{11}$$
where $Q_G(u) = G^{-1}(u)$ is the quantile function (qf) of $G$.
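Property (e) gives a direct way to generate EOLL-G variates by the inverse transformation method. The following Python sketch is an illustration under stated assumptions: a standard normal baseline by default, hypothetical function names, and a clipping constant of $10^{-12}$ that only protects the baseline quantile function from arguments rounded to 0 or 1.

```python
import random
from statistics import NormalDist


def eoll_quantile(u, nu, tau, base=NormalDist()):
    """EOLL-G quantile: Q_G(a / (a + b)), a = u^(1/(nu*tau)), b = (1 - u^(1/tau))^(1/nu)."""
    a = u ** (1.0 / (nu * tau))
    b = (1.0 - u ** (1.0 / tau)) ** (1.0 / nu)
    # Guard (assumption of this sketch): keep the argument strictly inside (0, 1).
    t = min(max(a / (a + b), 1e-12), 1.0 - 1e-12)
    return base.inv_cdf(t)


def eoll_sample(n, nu, tau, base=NormalDist(), seed=0):
    """Draw n EOLL-G variates by plugging uniforms into the quantile function."""
    rng = random.Random(seed)
    return [eoll_quantile(1.0 - rng.random(), nu, tau, base) for _ in range(n)]


if __name__ == "__main__":
    print(eoll_sample(5, 0.5, 2.0))
```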
The EOLL-G family includes as special cases the OLL-G class for $\tau = 1$ (Gleaton and Lynch, 2006 [27]) and the exponentiated (Exp-G) class (Mudholkar et al., 1996 [28]) for $\nu = 1$. Clearly, Equation (9) reduces to the baseline density $g$ when $\nu = \tau = 1$.
Thus, we consider the following marginal distributions
$$Y_1 \sim \text{EOLL-G}(\nu_1, \tau_1, \eta_1) \quad \text{and} \quad Y_2 \sim \text{EOLL-G}(\nu_2, \tau_2, \eta_2), \tag{12}$$
where $\eta_1$ and $\eta_2$ are parameter vectors of the baseline $G$, and $\nu_1$, $\tau_1$, $\nu_2$ and $\tau_2$ are positive shape parameters.

3.1. BCEOLL-G Model

By inserting (8) and (9) in Equations (1), (2), (4) and (5), the joint cdf of the bivariate BCEOLL-G model reduces to
$$F_{\text{BCEOLL-G}}(y_1, y_2) = \left[ W_1^{-\lambda}(y_1; \eta_1) + W_2^{-\lambda}(y_2; \eta_2) - 1 \right]^{-1/\lambda}. \tag{13}$$
The corresponding joint pdf has the form
$$\begin{aligned} f_{\text{BCEOLL-G}}(y_1, y_2) = {} & (\lambda + 1) \left[ W_1^{-\lambda}(y_1; \eta_1) + W_2^{-\lambda}(y_2; \eta_2) - 1 \right]^{-(2\lambda + 1)/\lambda} \\ & \times \left[ W_1(y_1; \eta_1)\, W_2(y_2; \eta_2) \right]^{-(\lambda + 1)} \\ & \times \frac{\nu_1 \tau_1\, g_1(y_1; \eta_1)\, G_1^{\nu_1 \tau_1 - 1}(y_1; \eta_1)\, \bar{G}_1^{\nu_1 - 1}(y_1; \eta_1)}{H_1^{\tau_1 + 1}(y_1; \eta_1)} \\ & \times \frac{\nu_2 \tau_2\, g_2(y_2; \eta_2)\, G_2^{\nu_2 \tau_2 - 1}(y_2; \eta_2)\, \bar{G}_2^{\nu_2 - 1}(y_2; \eta_2)}{H_2^{\tau_2 + 1}(y_2; \eta_2)}, \end{aligned} \tag{14}$$
where (for $k = 1, 2$)
$$H_k(y_k; \eta_k) = G_k^{\nu_k}(y_k; \eta_k) + [1 - G_k(y_k; \eta_k)]^{\nu_k}, \qquad W_k(y_k; \eta_k) = \frac{G_k^{\nu_k \tau_k}(y_k; \eta_k)}{H_k^{\tau_k}(y_k; \eta_k)}.$$

3.2. BFEOLL-G Model

Similarly, the joint cdf of the bivariate BFEOLL-G model can be expressed as
$$F_{\text{BFEOLL-G}}(y_1, y_2) = -\frac{1}{\lambda} \log\left\{ 1 + \frac{\left( \exp[-\lambda W_1(y_1; \eta_1)] - 1 \right) \left( \exp[-\lambda W_2(y_2; \eta_2)] - 1 \right)}{\exp(-\lambda) - 1} \right\}. \tag{15}$$
The corresponding joint BFEOLL-G pdf becomes
$$\begin{aligned} f_{\text{BFEOLL-G}}(y_1, y_2) = {} & \lambda [1 - \exp(-\lambda)] \exp\left\{ -\lambda \left[ W_1(y_1; \eta_1) + W_2(y_2; \eta_2) \right] \right\} \\ & \times \left\{ \exp(-\lambda) - \exp[-\lambda W_1(y_1; \eta_1)] - \exp[-\lambda W_2(y_2; \eta_2)] + \exp[-\lambda (W_1(y_1; \eta_1) + W_2(y_2; \eta_2))] \right\}^{-2} \\ & \times \frac{\nu_1 \tau_1\, g_1(y_1; \eta_1)\, G_1^{\nu_1 \tau_1 - 1}(y_1; \eta_1)\, \bar{G}_1^{\nu_1 - 1}(y_1; \eta_1)}{H_1^{\tau_1 + 1}(y_1; \eta_1)} \\ & \times \frac{\nu_2 \tau_2\, g_2(y_2; \eta_2)\, G_2^{\nu_2 \tau_2 - 1}(y_2; \eta_2)\, \bar{G}_2^{\nu_2 - 1}(y_2; \eta_2)}{H_2^{\tau_2 + 1}(y_2; \eta_2)}. \end{aligned} \tag{16}$$
The BCEOLL-G and BFEOLL-G models include three special cases:
  • The bivariate Clayton Exponentiated (Exp)-G and Frank Exp-G classes when $\nu = 1$;
  • The bivariate Clayton odd log-logistic (OLL)-G and Frank OLL-G classes when $\tau = 1$;
  • The bivariate Clayton and Frank baseline models when $\nu = \tau = 1$.
We can generate many bivariate models from Equations (14) and (16) by choosing different parent distributions.

3.3. Copula Dependence Measures

Pearson correlation is one of the most widely used dependence measures, but it is not adequate when the joint distribution is not normal, and it cannot capture nonlinear relations between the variables.
We study the association of the variables by means of copulas using nonparametric concordance measures for the ranks of the variables, which cope with data that are not normally distributed and allow nonlinear relations between the variables. A concordance measure can be defined as follows: let $(y_{11}, y_{21})$ and $(y_{12}, y_{22})$ be two observations of the bivariate random variable $(Y_1, Y_2)$. The two observations are concordant when $(y_{11} - y_{12})(y_{21} - y_{22}) > 0$ and discordant when $(y_{11} - y_{12})(y_{21} - y_{22}) < 0$.
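To make the notion of concordance concrete, the sketch below counts concordant and discordant pairs and returns the usual empirical Kendall coefficient. It compares every two observations coordinate by coordinate, i.e., it uses the sign of the product of the difference in the first response and the difference in the second response. The function name and the toy data are illustrative assumptions, not taken from the article.

```python
def kendall_tau(y1, y2):
    """Empirical Kendall's tau; tied pairs count as neither concordant nor discordant."""
    n = len(y1)
    concordant = discordant = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            s = (y1[i] - y1[j]) * (y2[i] - y2[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


if __name__ == "__main__":
    # One discordant pair out of six, so tau = (5 - 1) / 6.
    print(kendall_tau([1, 2, 3, 4], [1, 3, 2, 4]))
```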
The Kendall ($\tau_k$) and Spearman ($\rho_s$) correlations are the concordance measures adopted in this methodology, and their expressions in terms of the copula dependence parameter are given below:
  • Clayton copula: $\tau_k = \dfrac{\lambda}{\lambda + 2}$. The expression for $\rho_s$ is very complicated.
  • Frank copula: $\tau_k = 1 + \dfrac{4}{\lambda} [D_1(\lambda) - 1]$ and $\rho_s = 1 + \dfrac{12}{\lambda} [D_2(\lambda) - D_1(\lambda)]$, where $D_k(\alpha) = \dfrac{k}{\alpha^k} \displaystyle\int_0^{\alpha} \frac{t^k}{e^t - 1}\, dt$ is the $k$th-order Debye function (for $k = 1, 2$).
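These expressions are easy to evaluate numerically. The Python sketch below is an illustration only: it approximates the Debye function with a midpoint rule (the number of steps is an arbitrary choice of this sketch) and then applies the formulas for the Clayton and Frank copulas. The function names are hypothetical.

```python
import math


def debye(k, alpha, steps=20000):
    """Debye function D_k(alpha) = (k / alpha^k) * int_0^alpha t^k / (e^t - 1) dt."""
    h = alpha / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h  # midpoints avoid the removable singularity at t = 0
        total += t ** k / math.expm1(t)
    return k / alpha ** k * total * h


def clayton_tau(lam):
    """Kendall's tau for the Clayton copula, lam > 0."""
    return lam / (lam + 2.0)


def frank_tau(lam):
    """Kendall's tau for the Frank copula, lam != 0."""
    return 1.0 + 4.0 / lam * (debye(1, lam) - 1.0)


def frank_rho(lam):
    """Spearman's rho for the Frank copula, lam != 0."""
    return 1.0 + 12.0 / lam * (debye(2, lam) - debye(1, lam))


if __name__ == "__main__":
    print(clayton_tau(5.0), frank_tau(5.0), frank_rho(5.0))
```

Because $\tau_k$ is monotone in $\lambda$ for both copulas, the same functions can be inverted numerically to translate an empirical Kendall coefficient into a starting value for $\lambda$.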

4. BCEOLL Normal and BFEOLL Normal Models

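Here, we discuss special cases of the BCEOLL-G and BFEOLL-G models. The density functions (14) and (16) are most tractable when the cdf $G_k(y_k; \eta_k)$ has a simple analytic expression.
It is known that the data of many experiments follow a normal distribution. So, we present special models considering the normal baseline ($y_k \in \mathbb{R}$)
$$G_k(y_k; \eta_k) = \Phi\left( \frac{y_k - \mu_k}{\sigma_k} \right) \quad \text{and} \quad g_k(y_k; \eta_k) = \frac{1}{\sigma_k}\, \phi\left( \frac{y_k - \mu_k}{\sigma_k} \right), \tag{17}$$
where $\Phi(\cdot)$ and $\phi(\cdot)$ are the cdf and pdf of the standard normal, respectively, $\eta_k = (\mu_k, \sigma_k)$, $\mu_k \in \mathbb{R}$ is a location and $\sigma_k > 0$ is a scale (for $k = 1, 2$). Hereafter, let $Y_k \sim \text{EOLLN}(\nu_k, \tau_k, \mu_k, \sigma_k)$ be a random variable with the EOLL-G distribution under this baseline.
If the marginals follow the EOLLN distribution, the joint pdf is determined by inserting (17) in Equations (14) and (16). So, the joint pdfs of the BCEOLL Normal (BCEOLLN) and BFEOLL Normal (BFEOLLN) models are
$$\begin{aligned} f_{\text{BCEOLLN}}(y_1, y_2) = {} & (\lambda + 1) \left\{ \left[ \frac{\Phi^{\nu_1 \tau_1}(z_1)}{H^{\tau_1}(z_1)} \right]^{-\lambda} + \left[ \frac{\Phi^{\nu_2 \tau_2}(z_2)}{H^{\tau_2}(z_2)} \right]^{-\lambda} - 1 \right\}^{-(2\lambda + 1)/\lambda} \left[ \frac{\Phi^{\nu_1 \tau_1}(z_1)}{H^{\tau_1}(z_1)} \times \frac{\Phi^{\nu_2 \tau_2}(z_2)}{H^{\tau_2}(z_2)} \right]^{-(\lambda + 1)} \\ & \times \frac{\nu_1 \tau_1\, \phi(z_1)\, \Phi^{\nu_1 \tau_1 - 1}(z_1) [1 - \Phi(z_1)]^{\nu_1 - 1}}{\sigma_1 H^{\tau_1 + 1}(z_1)} \cdot \frac{\nu_2 \tau_2\, \phi(z_2)\, \Phi^{\nu_2 \tau_2 - 1}(z_2) [1 - \Phi(z_2)]^{\nu_2 - 1}}{\sigma_2 H^{\tau_2 + 1}(z_2)} \end{aligned} \tag{18}$$
and
$$\begin{aligned} f_{\text{BFEOLLN}}(y_1, y_2) = {} & \lambda [1 - \exp(-\lambda)] \exp\left\{ -\lambda \left[ \frac{\Phi^{\nu_1 \tau_1}(z_1)}{H^{\tau_1}(z_1)} + \frac{\Phi^{\nu_2 \tau_2}(z_2)}{H^{\tau_2}(z_2)} \right] \right\} \\ & \times \left\{ \exp(-\lambda) - \exp\left[ -\lambda \frac{\Phi^{\nu_1 \tau_1}(z_1)}{H^{\tau_1}(z_1)} \right] - \exp\left[ -\lambda \frac{\Phi^{\nu_2 \tau_2}(z_2)}{H^{\tau_2}(z_2)} \right] + \exp\left[ -\lambda \left( \frac{\Phi^{\nu_1 \tau_1}(z_1)}{H^{\tau_1}(z_1)} + \frac{\Phi^{\nu_2 \tau_2}(z_2)}{H^{\tau_2}(z_2)} \right) \right] \right\}^{-2} \\ & \times \frac{\nu_1 \tau_1\, \phi(z_1)\, \Phi^{\nu_1 \tau_1 - 1}(z_1) [1 - \Phi(z_1)]^{\nu_1 - 1}}{\sigma_1 H^{\tau_1 + 1}(z_1)} \cdot \frac{\nu_2 \tau_2\, \phi(z_2)\, \Phi^{\nu_2 \tau_2 - 1}(z_2) [1 - \Phi(z_2)]^{\nu_2 - 1}}{\sigma_2 H^{\tau_2 + 1}(z_2)}, \end{aligned} \tag{19}$$
respectively, where (for $k = 1, 2$)
$$\Phi(z_k) = \Phi\left( \frac{y_k - \mu_k}{\sigma_k} \right), \quad \phi(z_k) = \phi\left( \frac{y_k - \mu_k}{\sigma_k} \right) \quad \text{and} \quad H(z_k) = \Phi^{\nu_k}(z_k) + [1 - \Phi(z_k)]^{\nu_k}.$$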
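The two joint densities share the same marginal factors and differ only in the copula density, which the following Python sketch makes explicit. It is a minimal illustration rather than the authors' implementation: the function names are hypothetical, each margin is passed as a tuple (mu, sigma, nu, tau), and the marginal pdf carries the exponent $\tau_k + 1$ on $H$, as in Equation (9).

```python
import math
from statistics import NormalDist

STD = NormalDist()


def eolln_parts(y, mu, sigma, nu, tau):
    """Return (W, f): the EOLLN marginal cdf W and marginal pdf f at y."""
    z = (y - mu) / sigma
    p = STD.cdf(z)
    h = p ** nu + (1.0 - p) ** nu
    w = p ** (nu * tau) / h ** tau
    f = (nu * tau * STD.pdf(z) * p ** (nu * tau - 1.0)
         * (1.0 - p) ** (nu - 1.0) / (sigma * h ** (tau + 1.0)))
    return w, f


def bceolln_pdf(y1, y2, lam, m1, m2):
    """Clayton-based joint pdf (lam > 0): Clayton density at (W1, W2) times marginal pdfs."""
    w1, f1 = eolln_parts(y1, *m1)
    w2, f2 = eolln_parts(y2, *m2)
    base = w1 ** -lam + w2 ** -lam - 1.0
    cop = ((lam + 1.0) * base ** (-(2.0 * lam + 1.0) / lam)
           * (w1 * w2) ** (-(lam + 1.0)))
    return cop * f1 * f2


def bfeolln_pdf(y1, y2, lam, m1, m2):
    """Frank-based joint pdf (lam != 0): Frank density at (W1, W2) times marginal pdfs."""
    w1, f1 = eolln_parts(y1, *m1)
    w2, f2 = eolln_parts(y2, *m2)
    e = math.exp
    denom = e(-lam) - e(-lam * w1) - e(-lam * w2) + e(-lam * (w1 + w2))
    cop = lam * (1.0 - e(-lam)) * e(-lam * (w1 + w2)) / denom ** 2
    return cop * f1 * f2


if __name__ == "__main__":
    m1, m2 = (0.0, 1.0, 0.5, 1.5), (1.0, 2.0, 0.8, 0.7)  # hypothetical parameter values
    print(bceolln_pdf(0.3, 1.2, 2.0, m1, m2), bfeolln_pdf(0.3, 1.2, 3.0, m1, m2))
```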
Figure 2 and Figure 3 report the joint densities, their contour plots, and  bivariate cdfs for the Clayton and Frank copulas. The presence of bimodality is noted by the joint density. In the contour plots, we clearly find different association structures.

5. Bivariate Regression Models

Let $(Y_{11}, Y_{12}), \ldots, (Y_{n1}, Y_{n2})$ be a bivariate random sample from the BCEOLLN or BFEOLLN model, where the marginal distributions of $Y_{i1}$ and $Y_{i2}$ have parameter vectors $\theta_{i1} = (\mu_{i1}, \sigma_1, \nu_1, \tau_1)^\top$ and $\theta_{i2} = (\mu_{i2}, \sigma_2, \nu_2, \tau_2)^\top$, respectively (for $i = 1, \ldots, n$).
Considering a sample $(y_{1k}, x_{1k}), \ldots, (y_{nk}, x_{nk})$, a systematic component can be defined as
$$\mu_{ik} = x_{ik}^\top \beta_k, \tag{20}$$
where $x_{ik} = (1, x_{i1k}, \ldots, x_{ipk})^\top$ is the explanatory variable vector of dimension $p + 1$ (for $k = 1, 2$ and $i = 1, \ldots, n$), and $\beta_k = (\beta_{0k}, \beta_{1k}, \ldots, \beta_{pk})^\top$ is the vector of unknown parameters.
Consider $n$ independent observations $(y_{1k}, x_{1k}), \ldots, (y_{nk}, x_{nk})$ (for $k = 1, 2$), the model defined by (20), and the joint pdfs given in Equations (18) and (19). Further, let $z_{ik} = (y_{ik} - \mu_{ik})/\sigma_k$, $H(z_{ik}) = \Phi^{\nu_k}(z_{ik}) + [1 - \Phi(z_{ik})]^{\nu_k}$, and $W(z_{ik}) = \Phi^{\nu_k \tau_k}(z_{ik}) / H^{\tau_k}(z_{ik})$.
If $\theta = (\lambda, \theta_1^\top, \theta_2^\top)^\top$, $\theta_k = (\beta_k^\top, \sigma_k, \tau_k, \nu_k)^\top$, $\beta_k = (\beta_{0k}, \beta_{1k}, \ldots, \beta_{pk})^\top$ (for $k = 1, 2$), the total log-likelihood functions for $\theta$ have the forms below:
BCEOLLN regression model
$$\begin{aligned} l(\theta) = {} & n \log\left[ \frac{(\lambda + 1)\, \nu_1 \tau_1 \nu_2 \tau_2}{\sigma_1 \sigma_2} \right] - \frac{2\lambda + 1}{\lambda} \sum_{i=1}^{n} \log\left[ W^{-\lambda}(z_{i1}) + W^{-\lambda}(z_{i2}) - 1 \right] - (\lambda + 1) \sum_{i=1}^{n} \log[W(z_{i1}) W(z_{i2})] \\ & + \sum_{i=1}^{n} \log\left\{ \frac{\phi(z_{i1})\, \Phi^{\nu_1 \tau_1 - 1}(z_{i1}) [1 - \Phi(z_{i1})]^{\nu_1 - 1}}{H^{\tau_1 + 1}(z_{i1})} \right\} + \sum_{i=1}^{n} \log\left\{ \frac{\phi(z_{i2})\, \Phi^{\nu_2 \tau_2 - 1}(z_{i2}) [1 - \Phi(z_{i2})]^{\nu_2 - 1}}{H^{\tau_2 + 1}(z_{i2})} \right\}. \end{aligned} \tag{21}$$
BFEOLLN regression model
$$\begin{aligned} l(\theta) = {} & n \log\left\{ \frac{\lambda [1 - \exp(-\lambda)]\, \nu_1 \tau_1 \nu_2 \tau_2}{\sigma_1 \sigma_2} \right\} - \lambda \sum_{i=1}^{n} [W(z_{i1}) + W(z_{i2})] \\ & - \sum_{i=1}^{n} \log\left\{ \left[ \exp(-\lambda) - \exp[-\lambda W(z_{i1})] - \exp[-\lambda W(z_{i2})] + \exp[-\lambda (W(z_{i1}) + W(z_{i2}))] \right]^{2} \right\} \\ & + \sum_{i=1}^{n} \log\left\{ \frac{\phi(z_{i1})\, \Phi^{\nu_1 \tau_1 - 1}(z_{i1}) [1 - \Phi(z_{i1})]^{\nu_1 - 1}}{H^{\tau_1 + 1}(z_{i1})} \right\} + \sum_{i=1}^{n} \log\left\{ \frac{\phi(z_{i2})\, \Phi^{\nu_2 \tau_2 - 1}(z_{i2}) [1 - \Phi(z_{i2})]^{\nu_2 - 1}}{H^{\tau_2 + 1}(z_{i2})} \right\}. \end{aligned} \tag{22}$$
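As an illustration of how the Clayton-based log-likelihood can be assembled from the joint density, the Python sketch below sums the log of the copula density and the two marginal log-densities over the observations, with $\mu_{ik}$ given by a linear predictor. It is a sketch under assumptions, not the authors' code: the function names, the argument layout (shape tuples ordered as sigma, nu, tau) and the toy data are hypothetical, and the maximization step is not shown.

```python
import math
from statistics import NormalDist

STD = NormalDist()


def log_marginal(y, mu, sigma, nu, tau):
    """Return (log W, log f) for one EOLLN margin at y."""
    z = (y - mu) / sigma
    p = STD.cdf(z)
    log_h = math.log(p ** nu + (1.0 - p) ** nu)
    log_w = nu * tau * math.log(p) - tau * log_h
    log_f = (math.log(nu * tau / sigma) + math.log(STD.pdf(z))
             + (nu * tau - 1.0) * math.log(p)
             + (nu - 1.0) * math.log(1.0 - p) - (tau + 1.0) * log_h)
    return log_w, log_f


def bceolln_loglik(lam, beta1, beta2, shape1, shape2, x, y1, y2):
    """Clayton-based log-likelihood; mu_ik = x_i . beta_k, shape_k = (sigma_k, nu_k, tau_k)."""
    total = 0.0
    for xi, a, b in zip(x, y1, y2):
        mu1 = sum(c * v for c, v in zip(beta1, xi))
        mu2 = sum(c * v for c, v in zip(beta2, xi))
        lw1, lf1 = log_marginal(a, mu1, *shape1)
        lw2, lf2 = log_marginal(b, mu2, *shape2)
        base = math.exp(-lam * lw1) + math.exp(-lam * lw2) - 1.0
        total += (math.log(lam + 1.0)
                  - (2.0 * lam + 1.0) / lam * math.log(base)
                  - (lam + 1.0) * (lw1 + lw2) + lf1 + lf2)
    return total


if __name__ == "__main__":
    x = [(1.0, 0.2), (1.0, 1.0), (1.0, -0.5)]  # rows: intercept and one covariate
    print(bceolln_loglik(2.0, (0.0, 1.0), (1.5, 1.2),
                         (1.0, 0.6, 1.3), (0.8, 1.4, 0.9),
                         x, [0.1, 1.4, -0.3], [2.0, 3.1, 0.9]))
```

A derivative-free optimizer can then be applied to the negative of this function; working with log W and log f directly, as above, avoids underflow when the marginal cdf is very small.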
For copulas, the estimation of the parameters is usually conducted in two stages to create bivariate response models. In the first stage, the responses are considered independent and the parameters are estimated marginally. The estimates are then used in the second stage, where the association parameter $\lambda$ is estimated. This approach is coherent when the focus of the study is to estimate the association parameter $\lambda$. If the coefficients of the regression are the focus, this marginal two-stage approach does not add any information compared to the use of independent models. In that case, it is best to conduct a joint estimation of the regression coefficients and $\lambda$.
The maximum likelihood estimate (MLE) $\hat{\theta}$ can be calculated by maximizing (21) or (22). We use the simplex method of Nelder and Mead (1965) [29] implemented in the optim function in the R software [30]. This is a robust direct search method that uses only function values, i.e., it does not use gradient information. It compares the function values at the vertices of a general simplex and then replaces the vertex with the highest value by another point. The simplex adapts itself to the local landscape and contracts onto the final minimum. Initial values for $\beta$, $\sigma_1$ and $\sigma_2$ are taken from the fits of the sub-models with $\tau_1 = \nu_1 = 0.5$ and $\tau_2 = \nu_2 = 0.5$. The method proves to be effective and computationally compact, and optim can also return a numerically differentiated Hessian matrix, which is necessary for inference. See [29] for details. The asymptotic normal distribution of $\hat{\theta}$ can be considered for inference, tests, and confidence intervals.

6. Simulation Study

A Monte Carlo simulation study examines the accuracy of the estimates in the bivariate EOLLN regression model. We use the Multivariate Copula Description (MvCD) from the copula package [31] in R. The MvCD function generates the required univariate marginals according to the supplied association parameter using the inverse transformation method.
We consider the sample sizes $n = 50, 100, 200, 500$, $r = 1000$ replications and a covariate $x_1 \sim \text{Normal}(1, 0.5)$ related to $y_1$ and $y_2$ by the identity link functions $\mu_{i1} = \beta_{101} + \beta_{111} x_{i1}$ and $\mu_{i2} = \beta_{102} + \beta_{112} x_{i1}$, respectively. Further, we take $\sigma_k = \exp(\beta_{20k})$, $\nu_k = \exp(\beta_{30k})$ and $\tau_k = \exp(\beta_{40k})$. For $Y_k \sim \text{EOLLN}(\mu_{ik}, \sigma_k, \nu_k, \tau_k)$ (for $k = 1, 2$), the quantile function (qf) has the form
$$y_k = Q(u) = Q_N\left( \frac{u^{1/(\nu_k \tau_k)}}{u^{1/(\nu_k \tau_k)} + (1 - u^{1/\tau_k})^{1/\nu_k}} \right), \quad 0 < u < 1,$$
where $Q_N(p) = \Phi^{-1}(p; \mu_k, \sigma_k)$, $p \in (0, 1)$, is the normal qf, namely
$$\Phi^{-1}(p; \mu_k, \sigma_k) = \mu_k + \sqrt{2}\, \sigma_k\, \text{erf}^{-1}(2p - 1)$$
and $\text{erf}^{-1}(\cdot)$ is the inverse error function.
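The data-generating step can be sketched in Python as follows. This is an illustration under stated assumptions, not the R code used in the study: the Clayton pair is drawn by conditional inversion (a standard alternative to the copula package), the second argument of the covariate distribution is read as a standard deviation, and all function names and the parameter values in the usage lines are hypothetical.

```python
import random
from statistics import NormalDist

STD = NormalDist()


def eolln_quantile(u, mu, sigma, nu, tau):
    """EOLLN quantile: mu + sigma * Phi^{-1}(a / (a + b))."""
    a = u ** (1.0 / (nu * tau))
    b = (1.0 - u ** (1.0 / tau)) ** (1.0 / nu)
    t = min(max(a / (a + b), 1e-12), 1.0 - 1e-12)  # keep the argument inside (0, 1)
    return mu + sigma * STD.inv_cdf(t)


def clayton_pair(rng, lam):
    """Draw (u1, u2) from a Clayton copula by inverting the conditional cdf of u2 given u1."""
    u1 = 1.0 - rng.random()  # uniform on (0, 1]
    w = 1.0 - rng.random()
    u2 = (u1 ** -lam * (w ** (-lam / (lam + 1.0)) - 1.0) + 1.0) ** (-1.0 / lam)
    return u1, u2


def simulate_bceolln(n, lam, beta1, beta2, shape1, shape2, seed=1):
    """Return n rows (x, y1, y2); shape_k = (sigma_k, nu_k, tau_k), beta_k = (intercept, slope)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(1.0, 0.5)  # assumption: 0.5 is a standard deviation
        u1, u2 = clayton_pair(rng, lam)
        y1 = eolln_quantile(u1, beta1[0] + beta1[1] * x, *shape1)
        y2 = eolln_quantile(u2, beta2[0] + beta2[1] * x, *shape2)
        rows.append((x, y1, y2))
    return rows


if __name__ == "__main__":
    for row in simulate_bceolln(3, 5.0, (1.0, 0.5), (2.0, 1.0),
                                (1.0, 0.8, 1.2), (1.0, 1.5, 0.7)):
        print(row)
```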
The true parameter values are: $\mu_{i1} = (1.5 + 1.32\, x_{i1})$, $\sigma_1 = \exp(1.5)$, $\nu_1 = \exp(1.1)$, $\tau_1 = \exp(1.4)$, $\mu_{i2} = (1.6 + 2.3\, x_{i1})$, $\sigma_2 = \exp(0.6)$, $\nu_2 = \exp(1.8)$, $\tau_2 = \exp(1.9)$ and $\lambda = 5$.
The average estimates (AEs), biases, and mean squared errors (MSEs) are given by
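$$\text{AE}(\hat{\eta}) = \frac{1}{r} \sum_{i=1}^{r} \hat{\eta}_i, \quad \text{Bias}(\hat{\eta}) = \frac{1}{r} \sum_{i=1}^{r} (\hat{\eta}_i - \eta_i), \quad \text{MSE}(\hat{\eta}) = \frac{1}{r} \sum_{i=1}^{r} (\hat{\eta}_i - \eta_i)^2,$$
where $\hat{\eta} = (\hat{\beta}_{101}, \hat{\beta}_{111}, \hat{\beta}_{201}, \hat{\beta}_{301}, \hat{\beta}_{401}, \hat{\beta}_{102}, \hat{\beta}_{112}, \hat{\beta}_{202}, \hat{\beta}_{302}, \hat{\beta}_{402}, \hat{\lambda})$.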
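For a single parameter, these three summaries reduce to a few lines. The Python sketch below is illustrative only; the function name and the replicate values in the usage line are hypothetical, and the true value is taken to be the same in every replication.

```python
def summarize(estimates, truth):
    """Average estimate, bias and MSE of one parameter over r replications."""
    r = len(estimates)
    ae = sum(estimates) / r
    bias = sum(e - truth for e in estimates) / r
    mse = sum((e - truth) ** 2 for e in estimates) / r
    return ae, bias, mse


if __name__ == "__main__":
    # Three made-up replicate estimates of a parameter whose true value is 5.
    print(summarize([4.0, 5.0, 6.0], 5.0))
```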
For each of the Clayton and Frank copulas and each sample size, the calculations follow Algorithm 1.
Algorithm 1: (presented as an image in the original article; not reproduced in this text version)
For both copulas, the biases and MSEs in Table 1 and Table 2 decrease when n grows. So, the consistency of the estimators holds. The empirical coverage probabilities (CPs) in Table 3 show that their values converge to the 95% nominal level.

7. Application to Lettuce Leaf Data

This dataset refers to the effect of the foliar application of different concentrations of silicon dioxide and organosilicon compounds on the growth and biochemical contents of Lactuca sativa var. crispa grown in phytotron conditions. The experiment was carried out on 12 March 2020, in the Department of Goods Commodity and Expertise of Goods Products in Moscow, Russian Federation. More details of the experiment are described in [32]. Here, we analyze the response variables fresh weight (grams) ($y_1$) and plant height (cm) ($y_2$) jointly in order to account for the correlation between them. We also present a regression model relating these response variables to the following treatments:
The levels of gradual concentrations of amorphous silicon dioxide (AS) and 1-ethoxysilatrane (ES) solutions are: AS1 (50 mg·L⁻¹), AS2 (100 mg·L⁻¹), AS3 (200 mg·L⁻¹), AS4 (300 mg·L⁻¹), and ES1 (5 × 10⁻⁴ mL·L⁻¹), ES2 (10⁻³ mL·L⁻¹), ES3 (5 × 10⁻³ mL·L⁻¹) and ES4 (10⁻² mL·L⁻¹). The treatments are coded by dummy variables, and the variables are defined as follows:
  • $y_{i1}$: fresh weight (grams);
  • $y_{i2}$: plant height (cm);
  • $x_{ijk}$: dummy variables for the AS and ES solutions (for $j = 1, \ldots, 8$).
Hence, the systematic component has the form (for $k = 1, 2$ and $i = 1, \ldots, 54$):
$$\mu_{ik} = \beta_{0k} + \beta_{1k} x_{i1k} + \beta_{2k} x_{i2k} + \beta_{3k} x_{i3k} + \beta_{4k} x_{i4k} + \beta_{5k} x_{i5k} + \beta_{6k} x_{i6k} + \beta_{7k} x_{i7k} + \beta_{8k} x_{i8k}.$$
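The dummy coding behind this systematic component can be sketched as follows. The ordering of the eight indicators (AS1 to AS4 followed by ES1 to ES4), the label used for the control, and the coefficient values are assumptions made for illustration; they are not taken from the fitted model.

```python
TREATMENTS = ["AS1", "AS2", "AS3", "AS4", "ES1", "ES2", "ES3", "ES4"]


def design_row(treatment):
    """Intercept plus eight 0/1 indicators; the control has all indicators equal to 0."""
    return [1.0] + [1.0 if treatment == t else 0.0 for t in TREATMENTS]


def linear_predictor(treatment, beta):
    """mu = beta_0 + sum_j beta_j * x_j for one plant and one response."""
    return sum(b * x for b, x in zip(beta, design_row(treatment)))


if __name__ == "__main__":
    beta = [10.0, 1.0, 2.0, -3.0, -4.0, 0.0, 0.0, 5.0, 6.0]  # hypothetical coefficients
    print(linear_predictor("control", beta), linear_predictor("AS3", beta))
```

With this coding, each coefficient $\beta_{jk}$ is the difference between the location of treatment $j$ and that of the control for response $k$, which is how the treatment effects are read in the application.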

7.1. Descriptive Analysis

First, we present a descriptive analysis of the dataset. Table 4 indicates positive asymmetry and kurtosis for both response variables. Figure 1a,b displays the histograms of these variables, which indicate bimodal behavior for both. Pearson’s coefficient is 0.8497, which indicates a strong positive correlation, as can be noted in Figure 1c. Figure 4 provides the boxplots for the different experimental treatments.

7.2. Univariate Marginal Analysis

Univariate analysis of the response variables is conducted under the EOLLN distribution and its special cases: OLLN, Exp-N, and Normal. The Akaike information criterion (AIC) and Global Deviance (GD) in Table 5 reveal that the OLLN model is more adequate for y 1 (fresh weight) and the EOLLN model for y 2 (plant height).
Figure 5a and Figure 6a report the histograms with estimated densities, and Figure 5b and Figure 6b their empirical and estimated cumulative distributions, thus supporting the findings in Table 5.

7.3. Bivariate Analysis and Regression Model

Next, we carry out a joint analysis of the response variables from the current sixteen models for the Clayton and Frank copulas. Table 6 reveals that the bivariate (OLLN × EOLLN) regression model with Clayton and Frank copulas is the most suitable model to explain the response variables ( Y 1 × Y 2 ), thus agreeing with the univariate analysis.
Additionally, we provide the values of the copula dependence parameter ( λ ), Kendall’s correlation ( τ k ), and Spearman correlation ( ρ s ). For both models and the two copulas, these values indicate that the variables have strong positive dependence, thus confirming the required dependence structure. Table 7 provides the estimated quantities for the best bivariate regression model.
We can conclude the following facts:
  • Interpretations for $\mu$:
    - Comparison of the treatments with the control indicates that treatments ES1 and ES2 are not significant for fresh mass and plant height. The other treatments are significant for both variables.
    - Comparison of the treatments with the control indicates that treatments AS1, AS2, ES3, and ES4 have positive effects on the fresh mass, i.e., they increase the fresh mass. On the other hand, the effects of treatments AS3 and AS4 on the fresh mass are negative.
    - For the plant height, the treatments AS1, AS2, ES3, and ES4 have positive effects, meaning greater height in relation to the control. In contrast, the treatments AS3 and AS4 have negative effects on plant height.
    - Table 8 compares all treatments with the corresponding control, from which other interpretations can be found.
    - All the results are consistent with the descriptive analysis.
  • Interpretations for $\lambda$, $\tau_k$ and $\rho_s$:
    - The values of $\lambda$, $\tau_k$, and $\rho_s$ in the best regression model reveal that the variables are correlated with moderate dependence. The dependence structure is necessary, as confirmed by the 95% CI for the dependence parameter of the copula, which does not include zero.
Finally, Figure 7 displays the plots of the observed and predicted values and corresponding CIs, thus supporting that the best bivariate regression model is a good predictor for the current data.

8. Conclusions

This article proposed a new bivariate family based on Archimedean copulas. Motivated by an experiment that evaluated the growth of oak lettuce plants, which presented the variables fresh mass and height with bimodal behavior, we used the exponentiated odd log-logistic family, whose densities can model different types of data, including bimodality. Furthermore, these variables showed a strong positive correlation and the Clayton and Frank copulas were suitable for studying data with a positive correlation.
We presented some mathematical properties of the new family and a simulation study showed the consistency of the maximum likelihood estimators. We considered two categorical explanatory variables (each one with four treatments) to explain two response variables: fresh weight and plant height. Some important results were obtained from the application: (i) For both variables, the same treatments were significant or not; (ii) Two treatments had negative effects on the two variables and two other treatments had the greatest effects on the variables, i.e., with these treatments greater fresh mass and greater plant height were obtained. The bivariate regression model proved to be adequate for predicting fresh weight and plant height values of oak lettuce under the effect of different treatments. The choice of these copulas proved to be adequate due to the positive dependence structure between the variables. Other baseline distributions can be used as well as applications in other areas of knowledge.

Author Contributions

Conceptualization, G.M.R., E.M.M.O., G.M.C. and R.V.; methodology, G.M.R., E.M.M.O., G.M.C. and R.V.; software, G.M.R., E.M.M.O., G.M.C. and R.V.; validation, G.M.R., E.M.M.O., G.M.C. and R.V.; formal analysis, G.M.R., E.M.M.O., G.M.C. and R.V.; investigation, G.M.R., E.M.M.O., G.M.C. and R.V.; data curation, G.M.R., E.M.M.O., G.M.C. and R.V.; writing—original draft preparation, G.M.R., E.M.M.O., G.M.C. and R.V.; writing—review and editing, G.M.R., E.M.M.O., G.M.C. and R.V.; visualization, G.M.R., E.M.M.O., G.M.C. and R.V.; supervision, G.M.R., E.M.M.O., G.M.C. and R.V. All authors have read and agreed to the current version of the manuscript.

Funding

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior-Brasil (CAPES) and Conselho Nacional de Desenvolvimento Cientifico e Tecnológico (CNPq).

Informed Consent Statement

Not applicable.

Data Availability Statement

The data are available at https://www.sciencedirect.com/science/article/pii/S2352340921006120 (accessed on 10 July 2023).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Some Properties Related to the EOLL-G Model

Appendix A.1. Real Moments

Let $Y \sim \text{EOLL-G}(\nu, \tau, \eta)$. We obtain sufficient conditions that guarantee the existence of the real moments of $Y$. By using the well-known formula (for the moments of positive random variables)
$$E(W^p) = p \int_0^\infty w^{p-1} P(W > w)\, dw, \quad W > 0, \ p > 0, \tag{A1}$$
it is clear that $E(Y^p) < \infty$ if and only if
$$I \equiv \int_c^\infty y^{p-1} P(Y > y)\, dy < \infty,$$
for some $c \in (0, \infty)$ so that $0 < G(c; \eta) < \infty$. We now verify the finiteness of $I$. In fact, let $S$ be a Dagum random variable with shapes $\nu$ and $\tau$, and unit scale, say $S \sim D(\nu, 1, \tau)$. Since
$$P(Y \leq y) \stackrel{(8)}{=} P\left(S \leq \frac{G(y; \eta)}{1 - G(y; \eta)}\right), \tag{A2}$$
the integral $I$ becomes
$$I = \int_c^\infty y^{p-1} P\left(S > \frac{G(y; \eta)}{1 - G(y; \eta)}\right) dy.$$
Applying Markov's inequality gives
$$I \leq E(S) \int_c^\infty y^{p-1}\, \frac{1 - G(y; \eta)}{G(y; \eta)}\, dy. \tag{A3}$$
Since $y \mapsto G(y; \eta)$ is a cdf, and hence non-decreasing, $G(y; \eta) \geq G(c; \eta)$ for $y > c$. Let $X$ have cdf $G(\cdot; \eta)$. So, the right-hand side of (A3) is bounded above by
$$\frac{E(S)}{G(c; \eta)} \int_c^\infty y^{p-1} [1 - G(y; \eta)]\, dy \leq \frac{E(S)}{G(c; \eta)} \int_0^\infty y^{p-1} [1 - G(y; \eta)]\, dy = \frac{E(S)\, E(X^p)}{p\, G(c; \eta)},$$
where Equation (A1) is used in the last step. We obtain
$$I \leq \frac{E(S)\, E(X^p)}{p\, G(c; \eta)}.$$
As $E(S) < \infty$ for $\nu > 1$, the condition $E(X^p) < \infty$ leads to $I < \infty$.
Hence, under conditions $\nu > 1$ and $E(X^p) < \infty$, we have $E(Y^p) < \infty$.
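
The two ingredients of this argument can be checked numerically. The sketch below is illustrative only: it assumes the Dagum cdf with unit scale is $(1 + s^{-\nu})^{-\tau}$ and that the EOLL-G cdf has the form $[G^\nu / (G^\nu + (1 - G)^\nu)]^\tau$, which is how Equation (8) is read here. The function names and the exponential test case are hypothetical choices, not part of the paper.

```python
import math

def tail_moment(survival, p, upper=60.0, steps=200_000):
    """Approximate p * integral_0^upper w^(p-1) * survival(w) dw (midpoint rule)."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        w = (k + 0.5) * h
        total += w ** (p - 1.0) * survival(w)
    return p * h * total

def dagum_cdf(s, nu, tau):
    """Assumed Dagum cdf with shapes nu, tau and unit scale, for s > 0."""
    return (1.0 + s ** (-nu)) ** (-tau)

def eoll_g_cdf(g, nu, tau):
    """Assumed EOLL-G cdf as a function of the baseline value g = G(y; eta)."""
    return (g ** nu / (g ** nu + (1.0 - g) ** nu)) ** tau

# Formula (A1) for a unit exponential W, where E(W^p) = Gamma(p + 1).
p = 2.5
approx_moment = tail_moment(lambda w: math.exp(-w), p)
exact_moment = math.gamma(p + 1.0)
print(approx_moment, exact_moment)

# Identity (A2): the Dagum cdf evaluated at the odds G / (1 - G)
# coincides with the EOLL-G cdf evaluated at G.
nu, tau = 1.7, 0.6
gaps = []
for g in (0.1, 0.5, 0.9):
    odds = g / (1.0 - g)
    gaps.append(abs(dagum_cdf(odds, nu, tau) - eoll_g_cdf(g, nu, tau)))
print(max(gaps))
```

The integral is truncated at `upper`, so the check is only meaningful for survival functions that are negligible beyond that point.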

Appendix A.2. Stochastic Representation

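We can write from (A2),
$$P(Y \leq y) = P\left(G^{-1}\left(\frac{S}{S + 1}; \eta\right) \leq y\right) \quad \text{for all } y.$$
Finally, it is evident that
$$Y = G^{-1}\left(\frac{S}{S + 1}; \eta\right)$$
provides a stochastic representation for $Y \sim \text{EOLL-G}(\nu, \tau, \eta)$.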
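
This representation suggests a simple way to simulate from the family. The following is a minimal sketch, assuming a normal baseline $G$ (the EOLLN case), the Dagum cdf $(1 + s^{-\nu})^{-\tau}$, and the EOLL-G cdf $[G^\nu / (G^\nu + (1 - G)^\nu)]^\tau$. The function names, the parameter values and the clamping constant are illustrative assumptions rather than the authors' implementation.

```python
import math
import random
from statistics import NormalDist

def sample_eoll_g(nu, tau, baseline, rng):
    """Draw Y = G^{-1}(S / (S + 1)) with S ~ Dagum(nu, 1, tau)."""
    u = 1.0 - rng.random()  # uniform on (0, 1]
    # 1 / S, obtained by solving (1 + s^(-nu))^(-tau) = u for s.
    inv_s = math.expm1(-math.log(u) / tau) ** (1.0 / nu)
    prob = 1.0 / (1.0 + inv_s)  # equals S / (S + 1)
    # Guard: inv_cdf is undefined at 0 and 1 (slightly truncates extreme tails).
    prob = min(max(prob, 1e-12), 1.0 - 1e-12)
    return baseline.inv_cdf(prob)

def eoll_g_cdf(y, nu, tau, baseline):
    """Assumed EOLL-G cdf with baseline cdf G = baseline.cdf."""
    g = baseline.cdf(y)
    return (g ** nu / (g ** nu + (1.0 - g) ** nu)) ** tau

rng = random.Random(0)
baseline = NormalDist(0.0, 1.0)
nu, tau = 2.0, 1.5
draws = [sample_eoll_g(nu, tau, baseline, rng) for _ in range(20_000)]

# Compare the empirical cdf of the draws with the assumed EOLL-G cdf.
max_gap = 0.0
for y in (-1.0, -0.5, 0.0, 0.5, 1.0):
    empirical = sum(d <= y for d in draws) / len(draws)
    max_gap = max(max_gap, abs(empirical - eoll_g_cdf(y, nu, tau, baseline)))
print(max_gap)
```

For very small values of $\nu$ the ratio $S/(S+1)$ rounds to 0 or 1 in floating point with noticeable probability, so the clamp would then distort the tails and a log-scale implementation would be preferable.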

Appendix A.3. Standardized Moments

Under the previous conditions, Appendix A.1 guarantees the existence of moments of positive order of $Y$, and then the existence of mean and variance. Let $\mu_Y = E(Y)$ and $\sigma_Y^2 = \text{Var}(Y)$, and let $Z = (Y - \mu_Y)/\sigma_Y$ be the standardized version of $Y$.
By using the triangle inequality and the $C_p$ inequality $(x + y)^p \leq C_p (x^p + y^p)$, $x, y \geq 0$, where $p > 0$ and $C_p \equiv \max\{1, 2^{p-1}\}$, we have
$$E(|Z|^p) \leq \frac{C_p}{\sigma_Y^p} \left[E(|Y|^p) + |\mu_Y|^p\right] = \frac{C_p}{\sigma_Y^p} \left[E(Y^p) + \mu_Y^p\right],$$
since $Y > 0$. As the function $x \mapsto \varphi(x) = x^{1/p}$ is increasing for all $x > 0$ and $p > 0$, we have from the previous inequality
$$\|Z\|_p = \varphi(E(|Z|^p)) \leq \varphi\left(\frac{C_p}{\sigma_Y^p} \left[E(Y^p) + \mu_Y^p\right]\right) = \frac{C_p^{1/p}}{\sigma_Y} \left[E(Y^p) + \mu_Y^p\right]^{1/p} \leq \frac{C_p^{1/p}\, C_{1/p}}{\sigma_Y} \left(\|Y\|_p + \mu_Y\right),$$
where in the last inequality again we use the $C_p$ inequality, and $\|Z\|_p$ and $\|Y\|_p$ are the norms in $L^p$ of $Z$ and $Y$, respectively. Appendix A.1 leads to $\|Y\|_p < \infty$, $\mu_Y < \infty$ and $0 < \sigma_Y < \infty$, and then $\|Z\|_p < \infty$ from the above inequality.
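
The chain of inequalities holds for any positive random variable with finite moments, so it can be illustrated on the empirical distribution of a positive sample. The sketch below is only a numerical illustration: the lognormal sample and the helper names are hypothetical, and all moments are empirical (divisor $n$).

```python
import random

def c_const(p):
    """C_p = max{1, 2^(p-1)} from the C_p inequality."""
    return max(1.0, 2.0 ** (p - 1.0))

def lp_norm(values, p):
    """(E|V|^p)^(1/p) under the empirical distribution of `values`."""
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1.0 / p)

def standardized_bound(sample, p):
    """Return (||Z||_p, upper bound) for a positive sample."""
    n = len(sample)
    mu = sum(sample) / n
    sigma = (sum((y - mu) ** 2 for y in sample) / n) ** 0.5
    z = [(y - mu) / sigma for y in sample]
    lhs = lp_norm(z, p)
    rhs = c_const(p) ** (1.0 / p) * c_const(1.0 / p) / sigma * (lp_norm(sample, p) + mu)
    return lhs, rhs

rng = random.Random(42)
sample = [rng.lognormvariate(0.0, 0.75) for _ in range(5_000)]  # positive draws
results = {p: standardized_bound(sample, p) for p in (0.5, 1.0, 2.0, 3.0)}
for p, (lhs, rhs) in results.items():
    print(p, lhs, rhs)
```

The bound is far from tight; its role in the appendix is only to show that $\|Z\|_p$ is finite.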

Appendix A.4. The Multivariate Distance-Gini Mean Difference

Let $Y_1', \ldots, Y_d'$ be an independent copy of $Y_1, \ldots, Y_d$, where $Y = (Y_1, \ldots, Y_d)$ follows the joint cdf $F = F_Y$ as given in (1), and $Y_i \sim \text{EOLL-G}(\nu_i, \tau_i, \eta_i)$, $i = 1, \ldots, d$. The distance-Gini mean difference for $F$ is defined as in (10) (see Koshevoy 1997 [26]).
Since $\|y\| \leq \|y\|_1 = \sum_{i=1}^d |y_i|$, where $\|\cdot\|_1$ is the Manhattan norm, it is clear that
$$M_D(F) \leq \frac{1}{2d} \sum_{i=1}^d E(|Y_i - Y_i'|) = \frac{1}{2d} \sum_{i=1}^d E(\max\{Y_i, Y_i'\} - \min\{Y_i, Y_i'\}). \tag{A4}$$
As $Y_1', \ldots, Y_d'$ is an independent copy of $Y_1, \ldots, Y_d$, we have that $Y_i'$ is independent of $Y_i$ and both have the same univariate distribution $\text{EOLL-G}(\nu, \tau, \eta)$, with variance denoted by $\sigma^2$. By using the following inequality (see Item (5) of reference Vila et al., 2023 [33]), if $Z$ is the standardized version of $Y_1$, then
$$E(\max\{Y_i, Y_i'\} - \min\{Y_i, Y_i'\}) \leq 2 \sigma \|Z\|_p \left(\frac{p - 1}{2p - 1}\right)^{(p-1)/p}, \quad p > 1, \ i = 1, \ldots, d,$$
we have from (A4)
$$M_D(F) \leq \sigma \|Z\|_p \left(\frac{p - 1}{2p - 1}\right)^{(p-1)/p}, \quad p > 1.$$
By Appendices A.1 and A.3, we have $0 < \sigma < \infty$ and $\|Z\|_p < \infty$ if $\nu > 1$ and $E(X^p) < \infty$.
Hence, under conditions $\nu > 1$, $p > 1$ and $E(X^p) < \infty$, we obtain $M_D(F) < \infty$.
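
As a numerical illustration, the bounds can be evaluated on the empirical distribution of a bivariate sample. The sketch assumes that the distance-Gini mean difference in (10) is $M_D(F) = \frac{1}{2d} E\|Y - Y'\|$ with the Euclidean norm, and it uses a per-margin form of the final bound, $\frac{1}{d} \sum_i \sigma_i \|Z_i\|_p \left(\frac{p-1}{2p-1}\right)^{(p-1)/p}$, which reduces to the expression above when the margins are identical. The sample and the function names are hypothetical.

```python
import math
import random

def holder_factor(p):
    """((p - 1) / (2p - 1))^((p - 1) / p), for p > 1."""
    return ((p - 1.0) / (2.0 * p - 1.0)) ** ((p - 1.0) / p)

def distance_gini(points):
    """(1 / (2d)) * E||Y - Y'|| under the empirical distribution (assumed form of (10))."""
    n, d = len(points), len(points[0])
    total = sum(math.dist(a, b) for a in points for b in points)
    return total / (n * n * 2.0 * d)

def l1_bound(points):
    """Right-hand side of (A4): (1 / (2d)) * sum_i E|Y_i - Y_i'|."""
    n, d = len(points), len(points[0])
    total = sum(abs(a[i] - b[i]) for a in points for b in points for i in range(d))
    return total / (n * n * 2.0 * d)

def moment_bound(points, p):
    """Per-margin version of the final bound, averaged over the d margins."""
    n, d = len(points), len(points[0])
    acc = 0.0
    for i in range(d):
        col = [pt[i] for pt in points]
        mu = sum(col) / n
        sigma = math.sqrt(sum((y - mu) ** 2 for y in col) / n)
        z_norm = (sum(abs((y - mu) / sigma) ** p for y in col) / n) ** (1.0 / p)
        acc += sigma * z_norm * holder_factor(p)
    return acc / d

rng = random.Random(7)
points = []
for _ in range(200):
    first = rng.lognormvariate(0.0, 0.5)
    points.append((first, first * rng.uniform(0.5, 1.5)))  # positively dependent pair

md = distance_gini(points)
a4 = l1_bound(points)
bounds = {p: moment_bound(points, p) for p in (1.5, 2.0, 3.0)}
print(md, a4, bounds)
```

For $p = 2$ the factor equals $1/\sqrt{3}$ and $\|Z\|_2 = 1$, so each margin contributes the classical bound $\sigma/\sqrt{3}$.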

References

  1. Nelsen, R.B. Properties of a one-parameter family of bivariate distributions with specified marginals. Commun. Stat.-Theory Methods 1986, 15, 3277–3285.
  2. Quiroz-Flores, A. Testing copula functions as a method to derive bivariate Weibull distributions. In Proceedings of the APSA Annual Meeting & Exhibition, Toronto, ON, Canada, 3–6 September 2009.
  3. El-Sherpieny, E.S.; Almetwally, E.M. Bivariate generalized rayleigh distribution based on Clayton Copula. In Proceedings of the 54rd Annual Conference on Statistics, Computer Science and Operation Research, Cairo, Egypt, 9–11 December 2019; pp. 1–19.
  4. Almetwally, E.M.; Muhammed, H.Z. On a Bivariate Frechet Distribution. J. Stat. Appl. Probab. 2020, 9, 1–21.
  5. Muhammed, H.Z. On a bivariate generalized inverted Kumaraswamy distribution. Phys. A Stat. Mech. Its Appl. 2020, 553, 124281.
  6. Samanthi, R.G.M.; Sepanski, J. On bivariate Kumaraswamy-distorted copulas. Commun. Stat.-Theory Methods 2022, 51, 2477–2495.
  7. Alotaibi, R.M.; Rezk, H.R.; Ghosh, I.; Dey, S. Bivariate exponentiated half logistic distribution: Properties and application. Commun. Stat.-Theory Methods 2021, 50, 6099–6121.
  8. Vaidyanathan, V.S.; Sharon Varghese, A. Morgenstern type bivariate Lindley distribution. Stat. Optim. Inf. Comput. 2016, 4, 132–146.
  9. Naifar, N. Modelling dependence structure with Archimedean copulas and applications to the iTraxx CDS index. J. Comput. Appl. Math. 2011, 235, 2459–2466.
  10. Yang, L.; Cai, X.J.; Li, M.; Hamori, S. Modeling dependence structures among international stock markets: Evidence from hierarchical Archimedean copulas. Econ. Model. 2015, 51, 308–314.
  11. Novianti, P.; Kartiko, S.H.; Rosadi, D. Application of Clayton Copula to identify dependency structure of COVID-19 outbreak and average temperature in Jakarta Indonesia. J. Phys. Conf. Ser. 2021, 1, 012154.
  12. Li, H.; Lu, Y. Modeling cause-of-death mortality using hierarchical Archimedean copula. Scand. Actuar. J. 2019, 3, 247–272.
  13. Janga Reddy, M.; Ganguli, P. Risk assessment of hydroclimatic variability on groundwater levels in the Manjara basin aquifer in India using Archimedean copulas. J. Hydrol. Eng. 2012, 17, 1345–1357.
  14. Zhang, L.; Singh, V.P. Bivariate rainfall frequency distributions using Archimedean copulas. J. Hydrol. 2007, 332, 93–109.
  15. Tsakiris, G.; Kordalis, N.; Tigkas, D.; Tsakiris, V.; Vangelis, H. Analysing drought severity and areal extent by 2D Archimedean copulas. Water Resour. Manag. 2016, 30, 5723–5735.
  16. He, W.; Lawless, J.F. Bivariate location–scale models for regression analysis, with applications to lifetime data. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 2005, 67, 63–78.
  17. Wienke, A.; Locatelli, I.; Yashin, A.I. The modelling of a cure fraction in bivariate time-to-event data. Austrian J. Stat. 2006, 35, 67–76.
  18. Fachini, J.B.; Ortega, E.M.; Cordeiro, G.M. A bivariate regression model with cure fraction. J. Stat. Comput. Simul. 2014, 84, 1580–1595.
  19. Alizadeh, M.; Tahmasebi, S.; Haghbin, H. The exponentiated odd log-logistic family of distributions: Properties and applications. J. Stat. Model. Theory Appl. 2020, 1, 29–52.
  20. Nelsen, R.B. An Introduction to Copulas; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2007.
  21. Sklar, M. Fonctions de répartition à n dimensions et leurs marges. Ann. de L’ISUP 1959, 8, 229–231.
  22. Sklar, A. Random variables, joint distribution functions, and copulas. Kybernetika 1973, 9, 449–460.
  23. Joe, H. Multivariate dependence measures and data analysis. Comput. Statist. Data Anal. 1993, 16, 279–297.
  24. Clayton, D.G. A model for association in bivariate life tables and its applications in epidemiological studies of familial tendency in chronic disease incidence. Biometrika 1978, 65, 141–151.
  25. Frank, M.J. On the simultaneous associativity of F(x,y) and x + y − F(x,y). Aequationes Math. 1979, 19, 194–226.
  26. Koshevoy, G.A. Multivariate Gini Indices. J. Multivar. Anal. 1997, 60, 252–276.
  27. Gleaton, J.U.; Lynch, J.D. Properties of generalized log-logistic families of lifetime distributions. J. Probab. Stat. Sci. 2006, 4, 51–64.
  28. Mudholkar, G.S.; Srivastava, D.K.; Kollia, G.D. A generalization of the Weibull distribution with application to the analysis of survival data. J. Am. Stat. Assoc. 1996, 91, 1575–1583.
  29. Nelder, J.A.; Mead, R. A simplex method for function minimization. Comput. J. 1965, 7, 308–313.
  30. R Development Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2022.
  31. Kojadinovic, I.; Yan, J. Modeling multivariate distributions with continuous margins using the copula R package. J. Stat. Softw. 2010, 34, 1–20.
  32. Othman, A.J.; Eliseeva, L.G.; Ibragimova, N.A.; Zelenkov, V.N.; Latushkin, V.V.; Nicheva, D.V. Dataset on the effect of foliar application of different concentrations of silicon dioxide and organosilicon compounds on the growth and biochemical contents of oak leaf lettuce (Lactuca sativa var. crispa) grown in phytotron conditions. Data Brief 2021, 38, 107328.
  33. Vila, R.; Balakrishnan, N.; Saulo, H. An Upper Bound and a Characterization for Gini’s Mean Difference Based on Correlated Random Variables. Preprint. 2023. Available online: https://arxiv.org/pdf/2301.07229.pdf (accessed on 15 July 2023).
Figure 1. (a) Histogram of fresh weight, (b) Histogram of plant height, and (c) Scatter plot between fresh weight and plant height.
Figure 2. BCEOLLN copula for $Y_1 \sim \text{EOLLN}(\mu_1 = 0, \sigma_1 = 1, \nu_1 = 0.1, \tau_1 = 1)$, $Y_2 \sim \text{EOLLN}(\mu_2 = 0, \sigma_2 = 1, \nu_2 = 0.1, \tau_2 = 1.5)$ and $\lambda = 3$: (a) Bivariate pdf, (b) Contour plots of the pdf and (c) Bivariate cdf.
Figure 3. BFEOLLN copula for $Y_1 \sim \text{EOLLN}(\mu_1 = 0, \sigma_1 = 1, \nu_1 = 0.1, \tau_1 = 1)$, $Y_2 \sim \text{EOLLN}(\mu_2 = 0, \sigma_2 = 1, \nu_2 = 0.1, \tau_2 = 1.5)$ and $\lambda = 3$: (a) Bivariate pdf, (b) Contour plots of the pdf and (c) Bivariate cdf.
Figure 4. (a) Boxplots of fresh weight ($y_1$) and (b) Boxplots of plant height ($y_2$).
Figure 5. (a) Estimated densities and (b) Estimated cumulative functions and empirical cdf for the fresh weight.
Figure 6. (a) Estimated densities and (b) Estimated cumulative functions and empirical cdf for the plant height.
Figure 7. (a) Observed values (black) and predicted values (red) with CIs of the Clayton OLLN EOLLN regression model for fresh weight and (b) Observed values (black) and predicted values (red) with CIs of the Frank OLLN EOLLN regression for plant height.
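Table 1. Simulation findings from the fitted bivariate Clayton EOLLN regression model.

| $\eta$ | True value | AE (n = 50) | Bias (n = 50) | MSE (n = 50) | AE (n = 100) | Bias (n = 100) | MSE (n = 100) |
|---|---|---|---|---|---|---|---|
| $\beta_{101}$ | 1.50 | 1.4840 | −0.0160 | 0.1436 | 1.4949 | −0.0051 | 0.0787 |
| $\beta_{111}$ | 1.32 | 1.3206 | 0.0006 | 0.0523 | 1.3190 | −0.0010 | 0.0244 |
| $\beta_{201}$ | 1.50 | 1.5448 | 0.0448 | 0.0755 | 1.5383 | 0.0383 | 0.0405 |
| $\beta_{301}$ | 1.10 | 1.1799 | 0.0799 | 0.0816 | 1.1577 | 0.0577 | 0.0450 |
| $\beta_{401}$ | 1.40 | 1.4555 | 0.0555 | 0.0892 | 1.4233 | 0.0233 | 0.0453 |
| $\beta_{102}$ | 1.60 | 1.5966 | −0.0034 | 0.0056 | 1.5986 | −0.0014 | 0.0024 |
| $\beta_{112}$ | 2.30 | 2.3003 | 0.0003 | 0.0017 | 2.2991 | −0.0009 | 0.0008 |
| $\beta_{202}$ | 0.60 | 0.6135 | 0.0135 | 0.0484 | 0.6132 | 0.0132 | 0.0224 |
| $\beta_{302}$ | 1.80 | 1.8481 | 0.0481 | 0.0487 | 1.8314 | 0.0314 | 0.0216 |
| $\beta_{402}$ | 1.90 | 1.9731 | 0.0731 | 0.0956 | 1.9396 | 0.0396 | 0.0408 |
| $\lambda$ | 5.00 | 5.2268 | 0.2268 | 0.1417 | 5.1446 | 0.1446 | 0.0601 |

| $\eta$ | True value | AE (n = 200) | Bias (n = 200) | MSE (n = 200) | AE (n = 500) | Bias (n = 500) | MSE (n = 500) |
|---|---|---|---|---|---|---|---|
| $\beta_{101}$ | 1.50 | 1.4949 | −0.0051 | 0.0437 | 1.5049 | 0.0049 | 0.0164 |
| $\beta_{111}$ | 1.32 | 1.3172 | −0.0028 | 0.0122 | 1.3219 | 0.0019 | 0.0049 |
| $\beta_{201}$ | 1.50 | 1.5336 | 0.0336 | 0.0247 | 1.5185 | 0.0185 | 0.0108 |
| $\beta_{301}$ | 1.10 | 1.1440 | 0.0440 | 0.0272 | 1.1253 | 0.0253 | 0.0112 |
| $\beta_{401}$ | 1.40 | 1.4147 | 0.0147 | 0.0214 | 1.3992 | −0.0008 | 0.0074 |
| $\beta_{102}$ | 1.60 | 1.5985 | −0.0015 | 0.0010 | 1.5993 | −0.0007 | 0.0003 |
| $\beta_{112}$ | 2.30 | 2.2996 | −0.0004 | 0.0004 | 2.3003 | 0.0003 | 0.0002 |
| $\beta_{202}$ | 0.60 | 0.6149 | 0.0149 | 0.0100 | 0.6089 | 0.0089 | 0.0048 |
| $\beta_{302}$ | 1.80 | 1.8244 | 0.0244 | 0.0093 | 1.8135 | 0.0135 | 0.0045 |
| $\beta_{402}$ | 1.90 | 1.9215 | 0.0215 | 0.0174 | 1.9078 | 0.0078 | 0.0057 |
| $\lambda$ | 5.00 | 5.0736 | 0.0736 | 0.0223 | 5.0178 | 0.0178 | 0.0060 |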
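Table 2. Simulation findings from the fitted bivariate Frank EOLLN regression model.

| $\eta$ | True value | AE (n = 50) | Bias (n = 50) | MSE (n = 50) | AE (n = 100) | Bias (n = 100) | MSE (n = 100) |
|---|---|---|---|---|---|---|---|
| $\beta_{101}$ | 1.50 | 1.3278 | −0.1722 | 0.5729 | 1.3552 | −0.1448 | 0.3244 |
| $\beta_{111}$ | 1.32 | 1.3110 | −0.0090 | 0.0905 | 1.3250 | 0.0050 | 0.0432 |
| $\beta_{201}$ | 1.50 | 1.5802 | 0.0802 | 0.0891 | 1.5753 | 0.0753 | 0.0686 |
| $\beta_{301}$ | 1.10 | 1.2140 | 0.1140 | 0.1023 | 1.1875 | 0.0875 | 0.0734 |
| $\beta_{401}$ | 1.40 | 1.6102 | 0.2102 | 0.4140 | 1.5419 | 0.1419 | 0.2325 |
| $\beta_{102}$ | 1.60 | 1.5604 | −0.0396 | 0.0195 | 1.5730 | −0.0270 | 0.0114 |
| $\beta_{112}$ | 2.30 | 2.2977 | −0.0023 | 0.0037 | 2.3015 | 0.0015 | 0.0017 |
| $\beta_{202}$ | 0.60 | 0.6307 | 0.0307 | 0.0566 | 0.6419 | 0.0419 | 0.0446 |
| $\beta_{302}$ | 1.80 | 1.8510 | 0.0510 | 0.0667 | 1.8546 | 0.0546 | 0.0483 |
| $\beta_{402}$ | 1.90 | 2.1430 | 0.2430 | 0.3737 | 2.0495 | 0.1495 | 0.2159 |
| $\lambda$ | 5.00 | 5.1239 | 0.1239 | 0.3300 | 5.0954 | 0.0954 | 0.2026 |

| $\eta$ | True value | AE (n = 200) | Bias (n = 200) | MSE (n = 200) | AE (n = 500) | Bias (n = 500) | MSE (n = 500) |
|---|---|---|---|---|---|---|---|
| $\beta_{101}$ | 1.50 | 1.4714 | −0.0286 | 0.1427 | 1.4853 | −0.0147 | 0.0666 |
| $\beta_{111}$ | 1.32 | 1.3145 | −0.0055 | 0.0225 | 1.3264 | 0.0064 | 0.0086 |
| $\beta_{201}$ | 1.50 | 1.5681 | 0.0681 | 0.0390 | 1.5407 | 0.0407 | 0.0189 |
| $\beta_{301}$ | 1.10 | 1.1789 | 0.0789 | 0.0414 | 1.1461 | 0.0461 | 0.0210 |
| $\beta_{401}$ | 1.40 | 1.4391 | 0.0391 | 0.0999 | 1.4102 | 0.0102 | 0.0421 |
| $\beta_{102}$ | 1.60 | 1.5801 | −0.0199 | 0.0052 | 1.5862 | −0.0138 | 0.0024 |
| $\beta_{112}$ | 2.30 | 2.2990 | −0.0010 | 0.0008 | 2.3008 | 0.0008 | 0.0003 |
| $\beta_{202}$ | 0.60 | 0.6271 | 0.0271 | 0.0217 | 0.6278 | 0.0278 | 0.0100 |
| $\beta_{302}$ | 1.80 | 1.8334 | 0.0334 | 0.0225 | 1.8289 | 0.0289 | 0.0094 |
| $\beta_{402}$ | 1.90 | 2.0108 | 0.1108 | 0.0961 | 1.9652 | 0.0652 | 0.0465 |
| $\lambda$ | 5.00 | 5.0431 | 0.0431 | 0.0856 | 5.0227 | 0.0227 | 0.0428 |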
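
The summary columns in Tables 1 and 2 can be computed from Monte Carlo replicates as sketched below. This is a generic illustration, not the authors' simulation code: the estimator used here is a hypothetical stand-in (a sample mean of normal draws), and AE, bias and MSE are taken to mean the average estimate, the average estimate minus the true value, and the mean squared deviation from the true value.

```python
import random

def summarize(estimates, true_value):
    """Return (AE, bias, MSE) of a list of replicate estimates."""
    n = len(estimates)
    ae = sum(estimates) / n
    bias = ae - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return ae, bias, mse

# Toy stand-in for one parameter: estimate a normal mean from n = 50 draws,
# repeated over 1000 Monte Carlo replicates (hypothetical settings).
rng = random.Random(1)
true_mean = 1.5
replicates = []
for _ in range(1000):
    sample = [rng.gauss(true_mean, 1.0) for _ in range(50)]
    replicates.append(sum(sample) / len(sample))

ae, bias, mse = summarize(replicates, true_mean)
variance = sum((e - ae) ** 2 for e in replicates) / len(replicates)
print(ae, bias, mse)
print(mse - (variance + bias ** 2))  # MSE = variance + bias^2, up to rounding

# Consistency check on a reported row (Table 1, beta_101, n = 50): bias = AE - true value.
table_bias = round(1.4840 - 1.50, 4)
print(table_bias)
```

The decomposition of the MSE into variance plus squared bias explains why the MSEs in the tables shrink with $n$ even when the bias is already close to zero.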
Table 3. CPs for the fitted bivariate Clayton and Frank EOLLN regression models.

| Copula | n | $\beta_{101}$ | $\beta_{111}$ | $\beta_{201}$ | $\beta_{301}$ | $\beta_{401}$ | $\beta_{102}$ | $\beta_{112}$ | $\beta_{202}$ | $\beta_{302}$ | $\beta_{402}$ | $\lambda$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Clayton | 50 | 0.995 | 0.971 | 0.999 | 0.996 | 0.997 | 0.999 | 0.972 | 1.000 | 1.000 | 0.999 | 0.961 |
| Clayton | 100 | 0.996 | 0.972 | 0.998 | 0.997 | 0.998 | 1.000 | 0.974 | 1.000 | 1.000 | 1.000 | 0.962 |
| Clayton | 200 | 0.999 | 0.958 | 0.999 | 0.999 | 1.000 | 0.999 | 0.965 | 1.000 | 0.999 | 1.000 | 0.968 |
| Clayton | 500 | 0.999 | 0.962 | 0.999 | 0.999 | 1.000 | 1.000 | 0.964 | 1.000 | 1.000 | 1.000 | 0.948 |
| Frank | 50 | 0.983 | 0.956 | 1.000 | 0.998 | 0.986 | 0.988 | 0.937 | 1.000 | 1.000 | 0.988 | 0.950 |
| Frank | 100 | 0.993 | 0.961 | 0.999 | 1.000 | 0.992 | 0.989 | 0.944 | 1.000 | 0.999 | 0.990 | 0.951 |
| Frank | 200 | 0.990 | 0.956 | 0.999 | 0.998 | 0.991 | 0.996 | 0.956 | 1.000 | 1.000 | 0.997 | 0.962 |
| Frank | 500 | 0.995 | 0.955 | 0.999 | 0.999 | 0.995 | 0.998 | 0.946 | 1.000 | 1.000 | 0.997 | 0.953 |
Table 4. Descriptive analysis of fresh weight ($y_1$) and plant height ($y_2$) variables.

| Variable | Mean | Median | s.d. | Min. | Max. | Skewness | VC (%) | Kurtosis |
|---|---|---|---|---|---|---|---|---|
| $y_1$ | 39.511 | 38.925 | 8.406 | 26.500 | 57.610 | 0.268 | 21.274 | 2.208 |
| $y_2$ | 18.485 | 17.250 | 4.197 | 11.200 | 26.200 | 0.476 | 22.707 | 2.174 |
Table 5. Adequacy measures of the univariate models for each response variable.

| Model | AIC (fresh weight) | GD (fresh weight) | AIC (plant height) | GD (plant height) |
|---|---|---|---|---|
| EOLLN | 385.98 | 377.98 | 300.80 | 292.80 |
| OLLN | 384.05 | 378.05 | 305.51 | 299.51 |
| Exp-N | 387.40 | 381.40 | 309.57 | 303.57 |
| Normal | 386.16 | 382.16 | 311.16 | 307.16 |
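
The two adequacy measures in Table 5 appear to be linked by the usual relation AIC = GD + 2k, where k is the number of estimated parameters and GD is taken to be the global deviance (minus twice the maximized log-likelihood). Under that assumption, which is an interpretation of the table rather than a statement from the text, the number of parameters implied by each row can be recovered as follows.

```python
# (AIC, GD) pairs copied from Table 5.
table5 = {
    "EOLLN": {"fresh weight": (385.98, 377.98), "plant height": (300.80, 292.80)},
    "OLLN": {"fresh weight": (384.05, 378.05), "plant height": (305.51, 299.51)},
    "Exp-N": {"fresh weight": (387.40, 381.40), "plant height": (309.57, 303.57)},
    "Normal": {"fresh weight": (386.16, 382.16), "plant height": (311.16, 307.16)},
}

def implied_parameters(aic, gd):
    """Number of parameters k implied by AIC = GD + 2k (assumed relation)."""
    return round((aic - gd) / 2.0)

implied = {
    model: {resp: implied_parameters(*pair) for resp, pair in rows.items()}
    for model, rows in table5.items()
}
for model, counts in implied.items():
    print(model, counts)
```

The implied counts (four, three, three and two) match the number of parameters one would expect for the EOLLN, OLLN, Exp-N and normal models, respectively.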
Table 6. AIC, GD, copula dependence parameter ($\lambda$), Kendall correlation ($\tau_k$) and Spearman correlation ($\rho_s$) for sixteen bivariate models fitted to lettuce data.

| Model for ($Y_1 \times Y_2$) | AIC (Clayton) | GD (Clayton) | $\lambda$ (Clayton) | $\tau_k$ (Clayton) | $\rho_s$ (Clayton) | AIC (Frank) | GD (Frank) | $\lambda$ (Frank) | $\tau_k$ (Frank) | $\rho_s$ (Frank) |
|---|---|---|---|---|---|---|---|---|---|---|
| EOLLN × EOLLN | 623.18 | 605.18 | 2.81 | 0.58 | 0.77 | 615.95 | 597.95 | 10.16 | 0.67 | 0.86 |
| EOLLN × OLLN | 634.02 | 618.02 | 2.27 | 0.53 | 0.72 | 627.92 | 611.92 | 9.67 | 0.66 | 0.85 |
| EOLLN × Exp-N | 629.52 | 613.52 | 3.38 | 0.63 | 0.81 | 628.37 | 612.37 | 8.60 | 0.62 | 0.82 |
| EOLLN × Normal | 625.02 | 611.02 | 3.58 | 0.64 | 0.83 | 630.43 | 616.43 | 10.23 | 0.67 | 0.87 |
| OLLN × EOLLN | 620.25 | 604.25 | 2.93 | 0.59 | 0.78 | 612.78 | 596.78 | 10.79 | 0.69 | 0.88 |
| OLLN × OLLN | 632.00 | 618.00 | 2.31 | 0.54 | 0.72 | 626.71 | 612.71 | 9.47 | 0.65 | 0.85 |
| OLLN × Exp-N | 628.19 | 614.19 | 3.36 | 0.63 | 0.81 | 623.88 | 609.88 | 10.25 | 0.67 | 0.87 |
| OLLN × Normal | 624.47 | 612.47 | 3.82 | 0.66 | 0.84 | 628.43 | 616.43 | 10.36 | 0.68 | 0.87 |
| Exp-N × EOLLN | 622.61 | 606.61 | 2.92 | 0.59 | 0.78 | 616.98 | 600.98 | 10.03 | 0.67 | 0.86 |
| Exp-N × OLLN | 631.90 | 617.90 | 2.87 | 0.59 | 0.78 | 632.15 | 618.15 | 9.11 | 0.64 | 0.84 |
| Exp-N × Exp-N | 632.43 | 618.43 | 3.47 | 0.63 | 0.82 | 636.22 | 622.22 | 10.85 | 0.69 | 0.88 |
| Exp-N × Normal | 626.75 | 614.75 | 3.33 | 0.62 | 0.81 | 632.04 | 620.04 | 9.80 | 0.66 | 0.86 |
| Normal × EOLLN | 620.62 | 606.62 | 3.00 | 0.60 | 0.79 | 615.77 | 601.77 | 10.30 | 0.67 | 0.87 |
| Normal × OLLN | 633.50 | 621.50 | 2.29 | 0.53 | 0.72 | 630.58 | 618.58 | 9.15 | 0.64 | 0.84 |
| Normal × Exp-N | 629.29 | 617.29 | 3.05 | 0.60 | 0.79 | 624.74 | 612.74 | 10.16 | 0.67 | 0.86 |
| Normal × Normal | 624.54 | 614.54 | 3.35 | 0.63 | 0.81 | 630.37 | 620.37 | 9.99 | 0.67 | 0.86 |
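
The Kendall correlations in Table 6 are consistent with the standard closed forms for these two copulas under their usual parametrizations, namely $\tau_k = \lambda / (\lambda + 2)$ for the Clayton copula and $\tau_k = 1 - \frac{4}{\lambda}\,[1 - D_1(\lambda)]$ for the Frank copula, where $D_1$ is the first Debye function. The sketch below assumes those parametrizations and evaluates $D_1$ by a simple midpoint rule; the function names are illustrative.

```python
import math

def clayton_tau(lam):
    """Kendall's tau of the Clayton copula with parameter lam > 0."""
    return lam / (lam + 2.0)

def debye1(x, steps=20_000):
    """First Debye function D_1(x) = (1/x) * integral_0^x t / (e^t - 1) dt, x > 0."""
    h = x / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h  # midpoints avoid the removable singularity at t = 0
        total += t / math.expm1(t)
    return h * total / x

def frank_tau(lam):
    """Kendall's tau of the Frank copula with parameter lam > 0."""
    return 1.0 - 4.0 / lam * (1.0 - debye1(lam))

# First row of Table 6 (EOLLN x EOLLN): lambda = 2.81 (Clayton), 10.16 (Frank).
tau_clayton = clayton_tau(2.81)
tau_frank = frank_tau(10.16)
print(round(tau_clayton, 2), round(tau_frank, 2))
```

Spearman's correlation has no comparably simple closed form for the Clayton copula, so it is not reproduced here.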
Table 7. Estimated quantities for the best bivariate regression model fitted to lettuce data.

| $\theta$ | MLE | CI | SE | p-value |
|---|---|---|---|---|
| $\beta_{01}$ | 35.8036 | [33.7814, 37.8257] | 1.0086 | <0.001 |
| $\beta_{11}$ | 10.8022 | [8.1460, 13.4583] | 1.3248 | <0.001 |
| $\beta_{21}$ | 9.9009 | [7.1183, 12.6833] | 1.3879 | <0.001 |
| $\beta_{31}$ | −7.6892 | [−10.3286, −5.0497] | 1.3165 | <0.001 |
| $\beta_{41}$ | −6.7391 | [−9.2985, −4.1795] | 1.2766 | <0.001 |
| $\beta_{51}$ | 0.6059 | [−2.1540, 3.3658] | 1.3766 | 0.3308 |
| $\beta_{61}$ | −0.1776 | [−3.0838, 2.7287] | 1.4496 | 0.4515 |
| $\beta_{71}$ | 8.4060 | [5.3443, 11.4675] | 1.5271 | <0.001 |
| $\beta_{81}$ | 18.6657 | [15.5123, 21.8191] | 1.5729 | <0.001 |
| $\log(\sigma_1)$ | 1.8421 | [−4.0207, 7.7049] | 2.9243 | |
| $\log(\nu_1)$ | 1.0892 | [−4.9764, 7.1548] | 3.0255 | |
| $\beta_{02}$ | 17.0122 | [15.8189, 18.2056] | 0.5952 | <0.001 |
| $\beta_{12}$ | 9.4337 | [8.1786, 10.6887] | 0.6259 | <0.001 |
| $\beta_{22}$ | 1.6790 | [0.0778, 3.2802] | 0.7986 | 0.0201 |
| $\beta_{32}$ | −1.6034 | [−2.7452, −0.4617] | 0.5694 | 0.0034 |
| $\beta_{42}$ | −3.2964 | [−4.6713, −1.9214] | 0.6857 | <0.001 |
| $\beta_{52}$ | 0.9021 | [−0.4202, 2.2245] | 0.6595 | 0.0885 |
| $\beta_{62}$ | 1.0147 | [−0.2440, 2.2736] | 0.6279 | 0.0559 |
| $\beta_{72}$ | 6.3580 | [4.5401, 8.1759] | 0.9067 | <0.001 |
| $\beta_{82}$ | 8.3849 | [7.0828, 9.6870] | 0.6494 | <0.001 |
| $\log(\sigma_2)$ | 0.7811 | [−1.6658, 3.2282] | 1.2205 | |
| $\log(\nu_2)$ | 1.1064 | [−1.0946, 3.3074] | 1.0978 | |
| $\log(\tau)$ | −1.0721 | [−2.6434, 0.4992] | 0.7837 | |
| $\lambda$ | 3.3776 | [1.1464, 5.6086] | 1.1129 | |

$\tau_k = 0.339$, $\rho_s = 0.492$, AIC: 456.7843, GD: 408.7843
Table 8. Comparisons between treatments according to the best bivariate regression model.

| Hypotheses $H_0$ | MLE (fresh weight) | SE (fresh weight) | p-value (fresh weight) | MLE (plant height) | SE (plant height) | p-value (plant height) |
|---|---|---|---|---|---|---|
| AS1 - Control | 10.802 | 1.325 | 0.000 | 9.434 | 0.626 | 0.000 |
| AS2 - Control | 9.901 | 1.388 | 0.000 | 1.679 | 0.799 | 0.020 |
| AS3 - Control | −7.689 | 1.317 | 0.000 | −1.603 | 0.569 | 0.003 |
| AS4 - Control | −6.739 | 1.277 | 0.000 | −3.296 | 0.686 | 0.000 |
| ES1 - Control | 0.606 | 1.377 | 0.331 | 0.902 | 0.660 | 0.088 |
| ES2 - Control | −0.178 | 1.450 | 0.452 | 1.015 | 0.628 | 0.056 |
| ES3 - Control | 8.406 | 1.527 | 0.000 | 6.358 | 0.907 | 0.000 |
| ES4 - Control | 18.666 | 1.573 | 0.000 | 8.385 | 0.649 | 0.000 |
| AS1 - ES4 | −8.160 | 2.453 | 0.001 | 0.959 | 0.745 | 0.102 |
| AS2 - ES4 | −9.017 | 2.097 | 0.000 | −6.829 | 0.902 | 0.000 |
| AS3 - ES4 | −26.352 | 2.667 | 0.000 | −9.975 | 0.754 | 0.000 |
| AS4 - ES4 | −25.410 | 2.495 | 0.000 | −11.658 | 0.855 | 0.000 |
| ES1 - ES4 | −18.376 | 2.642 | 0.000 | −7.505 | 0.843 | 0.000 |
| ES2 - ES4 | −18.741 | 2.638 | 0.000 | −7.312 | 0.810 | 0.000 |
| ES3 - ES4 | −10.192 | 2.853 | 0.000 | −1.956 | 1.061 | 0.035 |
| AS1 - ES3 | 2.663 | 1.581 | 0.049 | 3.120 | 1.159 | 0.005 |
| AS2 - ES3 | 1.935 | 1.436 | 0.092 | −4.597 | 0.859 | 0.000 |
| AS3 - ES3 | −15.784 | 1.599 | 0.000 | −7.802 | 1.088 | 0.000 |
| AS4 - ES3 | −14.686 | 1.417 | 0.000 | −9.550 | 0.922 | 0.000 |
| ES1 - ES3 | −7.468 | 1.504 | 0.000 | −5.323 | 0.892 | 0.000 |
| ES2 - ES3 | −8.035 | 1.667 | 0.000 | −5.143 | 1.033 | 0.000 |
| AS1 - ES2 | 10.396 | 1.446 | 0.000 | 8.377 | 0.678 | 0.000 |
| AS2 - ES2 | 9.707 | 1.464 | 0.000 | 0.689 | 0.841 | 0.208 |
| AS3 - ES2 | −7.959 | 1.399 | 0.000 | −2.592 | 0.636 | 0.000 |
| AS4 - ES2 | −6.875 | 1.367 | 0.000 | −4.295 | 0.713 | 0.000 |
| ES1 - ES2 | 0.255 | 1.492 | 0.432 | −0.120 | 0.701 | 0.432 |
| AS1 - ES1 | 10.414 | 1.358 | 0.000 | 8.562 | 0.788 | 0.000 |
| AS2 - ES1 | 9.640 | 1.339 | 0.000 | 0.918 | 0.764 | 0.117 |
| AS3 - ES1 | −7.931 | 1.363 | 0.000 | −2.409 | 0.718 | 0.001 |
| AS4 - ES1 | −7.025 | 1.243 | 0.000 | −4.117 | 0.694 | 0.000 |
| AS1 - AS4 | 17.472 | 1.218 | 0.000 | 12.768 | 0.794 | 0.000 |
| AS2 - AS4 | 16.701 | 1.129 | 0.000 | 4.995 | 0.781 | 0.000 |
| AS3 - AS4 | −1.036 | 1.191 | 0.194 | 1.715 | 0.715 | 0.010 |
| AS1 - AS3 | 18.338 | 1.252 | 0.000 | 10.951 | 0.591 | 0.000 |
| AS2 - AS3 | 17.712 | 1.303 | 0.000 | 3.355 | 0.875 | 0.000 |
| AS1 - AS2 | 1.090 | 1.315 | 0.205 | 7.782 | 0.977 | 0.000 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Rodrigues, G.M.; Ortega, E.M.M.; Vila, R.; Cordeiro, G.M. A New Bivariate Family Based on Archimedean Copulas: Simulation, Regression Model and Application. Symmetry 2023, 15, 1778. https://doi.org/10.3390/sym15091778
