Article

Derivative Formulas and Gradient of Functions with Non-Independent Variables

by Matieyendou Lamboni 1,2
1 Department DFR-ST, University of Guyane, 97346 Cayenne, France
2 228-UMR Espace-Dev, University of Guyane, University of Réunion, IRD, University of Montpellier, 34090 Montpellier, France
Axioms 2023, 12(9), 845; https://doi.org/10.3390/axioms12090845
Submission received: 18 June 2023 / Revised: 17 August 2023 / Accepted: 22 August 2023 / Published: 30 August 2023

Abstract: Stochastic characterizations of functions subject to constraints result in treating them as functions with non-independent variables. By using the distribution function or copula of the input variables that comply with such constraints, we derive two types of partial derivatives of functions with non-independent variables (i.e., actual and dependent derivatives) and argue in favor of the latter. Dependent partial derivatives of functions with non-independent variables rely on the dependent Jacobian matrix of non-independent variables, which is also used to define a tensor metric. The differential geometric framework allows us to derive the gradient, Hessian, and Taylor-type expansions of functions with non-independent variables.
MSC:
26A24; 53Axx; 62H05

1. Introduction

We often work with models defined through functions that include non-independent variables, such as correlated input variables. This is also the case for models defined via functions with independent input variables and equations or inequations connecting such inputs, and for functions subject to constraints involving the input variables and/or the model output. Knowing the key role of partial derivatives at a given point in (i) the mathematical analysis of functions and convergence, (ii) Poincaré inequalities ([1,2]) and equalities ([3,4]), (iii) optimization and active subspaces ([5,6]), (iv) implicit functions ([7,8]) and (v) differential geometry (see, e.g., [9,10,11]), it is interesting and relevant to have formulas that enable the calculation of partial derivatives of functions in the presence of non-independent variables, including the gradients. Of course, such formulas must account for the dependency structures among the model inputs, including the constraints imposed on such inputs.
Actual partial derivatives aim to calculate the partial derivatives of functions, taking into account the relationships between the input variables ([12]). For instance, let us consider the function $f : \mathbb{R}^3 \to \mathbb{R}$ given by $f(x, y, z) = 2xy^2z^3$ under the constraint equation $h(x, y, z) = 0$, where $h$ is any smooth function. Using $f_x$ to represent the formal partial derivative of $f$ with respect to $x$, which is the partial derivative of $f$ when considering the other inputs as constant or independent, the chain rule yields the following partial derivative:
$$ \frac{\partial f}{\partial x}(x, y, z) = f_x(x, y, z) + f_y(x, y, z)\,\frac{\partial y}{\partial x} + f_z(x, y, z)\,\frac{\partial z}{\partial x} = 2y^2z^3 + 4xyz^3\,\frac{\partial y}{\partial x} + 6xy^2z^2\,\frac{\partial z}{\partial x}\,. $$
In probability and statistics, the point $(x, y, z)$ represents a realization or a sample value of the implicit random vector $(X, Y, Z)$. The quantities $\frac{\partial y}{\partial x}$, $\frac{\partial z}{\partial x}$, and $\frac{\partial f}{\partial x}$ do not have a definite meaning when the variables $(X, Y, Z)$ are non-independent. At this point, determining $\frac{\partial y}{\partial x}$, $\frac{\partial z}{\partial x}$ and $\frac{\partial f}{\partial x}$ is often challenging without supplementary assumptions such as the choice of directions or paths. When $(X, Z)$ or $(X, Y)$ are independent and using the equation $h(x, y, z) = x + 2y - z = 0$, we can write ([12]):
$$
\frac{\partial f}{\partial x}(x, y, z) =
\begin{cases}
2y^2z^3 - 2xyz^3 & \text{if } X \text{ and } Z \text{ are independent or } z \text{ is being held fixed},\\[4pt]
2y^2z^3 + 6xy^2z^2 & \text{if } X \text{ and } Y \text{ are independent or } y \text{ is being held fixed}.
\end{cases}
$$
It is clear that the actual partial derivatives are not unique, as such derivatives rely on two different paths or assumptions. While each supplementary assumption can make sense in some cases, it cannot always be guaranteed by the constrained equation h ( x , y , z ) = 0 in general. Indeed, when all the initial input variables are dependent or correlated, the above partial derivatives are no longer valid, even for a linear function h, and it is worth finding the correct relationship between the input variables and the partial derivatives, including the gradient.
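As an illustrative check of the two case-specific derivatives above (a sketch assuming SymPy is available; it is not part of the original development), one can substitute the constraint along each path and differentiate:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f = 2*x*y**2*z**3

# Case 1: z held fixed; the constraint x + 2y - z = 0 gives y = (z - x)/2
d1 = sp.diff(f.subs(y, (z - x)/2), x)
# Case 2: y held fixed; the constraint gives z = x + 2y
d2 = sp.diff(f.subs(z, x + 2*y), x)

# Both differences simplify to 0, confirming the two case-specific derivatives above
print(sp.simplify(d1 - (2*y**2*z**3 - 2*x*y*z**3).subs(y, (z - x)/2)))
print(sp.simplify(d2 - (2*y**2*z**3 + 6*x*y**2*z**2).subs(z, x + 2*y)))
```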
In differential geometry (see, e.g., [9,10,11]), using the differential of the function f, that is,
$$ df = 2y^2z^3\, dx + 4xyz^3\, dy + 6xy^2z^2\, dz, $$
the gradient of $f$ is defined as the dual of $df$ with respect to a given tensor metric. Obviously, different tensor metrics will yield different gradients of the same function. While the Euclidean metric, given by $\gamma := dx^2 + dy^2 + dz^2$, is more appropriate for independent inputs, finding the appropriate metrics is challenging in general. Indeed, the first fundamental form (in differential geometry) requires the Jacobian matrix of the inputs to define the associated tensor metric.
For non-independent variables with $F$ as the joint cumulative distribution function (CDF), the bivariate dependency models ([13]) and multivariate dependency models ([3,14,15]), including the conditional and inverse Rosenblatt transformation ([16,17]), establish formal and analytical relationships among such variables using either CDFs or the corresponding copulas or new distributions that look like and behave like a copula ([18]). A dependency function characterizes the probabilistic dependency structures among these variables. For a $d$-dimensional random vector of non-independent variables, the dependency models express a subset of $d-1$ variables as a function of independent variables, consisting of the remaining input and new independent variables.
In this paper, we propose a new approach for calculating the partial derivatives of functions, considering the dependency structures among the input variables. Our approach relies on dependency models. By providing known relationships between the dependent inputs (including constraints imposed on inputs or outputs), dependency models can be regarded as global and probabilistic implicit functions (see Section 2.2). Such dependency models are used to determine the dependent Jacobian and the tensor metric for non-independent variables. The contributions of this paper are threefold:
  • To provide a generalization of the actual partial derivatives of functions with non-independent variables and establish its limits;
  • To introduce the general derivative formulas of functions with non-independent variables (known as dependent partial derivatives) without any additional assumptions;
  • To provide the gradient, Hessian, and Taylor-type expansions of functions with non-independent variables that comply with the dependency structures among the input variables.
In Section 2, we first review the dependency models of dependent input variables, including correlated variables. Next, we derive interesting properties of these models regarding the calculus of partial derivatives and probabilistic implicit functions. By coupling dependency functions with the function of interest, we extend the actual partial derivatives of functions with non-independent variables in Section 3. To avoid their drawbacks, the dependent partial derivatives of functions with non-independent variables are provided in Section 4. The gradient and Hessian matrix of these functions are derived in Section 5 using the framework of differential geometry. We provide an application in Section 6 and conclude this work in Section 7.

General Notations

For an integer $d > 0$, let $\mathbf{X} := (X_1, \ldots, X_d)$ be a random vector of continuous variables with $F$ as the joint cumulative distribution function (CDF) (i.e., $\mathbf{X} \sim F$). For any $j \in \{1, \ldots, d\}$, we use $F_{x_j}$ or $F_j$ for the marginal CDF of $X_j$ and $F_j^{-1}$ for its inverse. Also, we use $(\sim j) := (w_1, \ldots, w_{d-1})$ for an arbitrary permutation of $\{1, \ldots, d\} \setminus \{j\}$ and $\mathbf{X}_{\sim j} := (X_{w_1}, \ldots, X_{w_{d-1}})$.
For a function $f$ that includes $\mathbf{X}$ as inputs, we use $f_{x_j}$ for the formal partial derivative of $f$ with respect to $X_j$, considering the other inputs as constant or independent of $X_j$, and $\nabla f := f_{\mathbf{x}} := \left(f_{x_1}, \ldots, f_{x_d}\right)^T$. We use $\frac{\partial f}{\partial x_j}$ for the partial derivative of $f$ with respect to $X_j$, which takes the dependencies among inputs into account. We also use $\frac{\partial f}{\partial \mathbf{x}} := \left(\frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_d}\right)^T \in \mathbb{R}^d$. Of course, $\frac{\partial f}{\partial \mathbf{x}} = \nabla f$ for independent inputs.

2. Probabilistic Characterization of Functions with Non-Independent Variables

In probability theory, it is common to treat input variables as random vectors with associated CDFs. For instance, for inputs that take their values within a known domain or space, the Bayesian framework allows for assigning a joint distribution, known as a prior distribution, to such inputs. Without additional information about the inputs, it is common to use non-informative prior distributions, such as uniform distributions or Gaussian distributions with large variances (see, e.g., [19]).
Functions with non-independent variables include many types of models encountered in practice. An example is the models defined via a given function and equations or inequations connecting its inputs. The resulting inputs that comply with such constraints are often dependent or correlated. In the subsequent discussion, we use probability theory to characterize non-independent variables (see Definition 1).
Definition 1.
Consider $n \in \mathbb{N} \setminus \{0\}$ and a function $f : \mathbb{R}^d \to \mathbb{R}^n$ that includes $\mathbf{X} \sim F$ as inputs.
Then, $f$ is said to be a function with non-independent variables whenever there exists at least one pair $j_1, j_2 \in \{1, \ldots, d\}$ with $j_1 \neq j_2$, such that
$$ F_{j_1, j_2}\left(x_{j_1}, x_{j_2}\right) \neq F_{j_1}\left(x_{j_1}\right) F_{j_2}\left(x_{j_2}\right). $$
Using $\mathcal{N}\left(0, \Sigma\right)$ for the multivariate normal distribution, we can check that a function that includes $\mathbf{X} \sim \mathcal{N}\left(0, \Sigma\right)$ as inputs, with $\Sigma$ being a non-diagonal covariance matrix, is a member of the class of functions defined by
$$ \mathcal{D}_{d,n} = \left\{ f : \mathbb{R}^d \to \mathbb{R}^n : \mathbf{X} \sim F, \; F(\mathbf{x}) \neq \prod_{j=1}^d F_j(x_j); \; \mathbf{x} \in \mathbb{R}^d \right\}. $$

2.1. New Insight into Dependency Functions

In this section, we recall useful results about generic dependency models of non-independent variables (see [3,13,14,15,18]). For a $d$-dimensional random vector of non-independent variables (i.e., $\mathbf{X} \sim F$), a dependency model of $\mathbf{X}$ consists of expressing a subset of $d-1$ variables (i.e., $\mathbf{X}_{\sim j}$) as a function of independent variables, including $X_j$.
Formally, if $\mathbf{X} \sim F$ with $F(\mathbf{x}) \neq \prod_{j=1}^d F_j(x_j)$, then there exist ([3,13,14,15,18]):
(i)
New independent variables $\mathbf{Z} := \left(Z_{w_1}, \ldots, Z_{w_{d-1}}\right)$, which are independent of $X_j$;
(ii)
A dependency function $r_j : \mathbb{R}^d \to \mathbb{R}^{d-1}$,
such that
$$ \mathbf{X}_{\sim j} \stackrel{d}{=} r_j\left(X_j, \mathbf{Z}\right); \quad \text{and} \quad \left(X_j, \mathbf{X}_{\sim j}\right) \stackrel{d}{=} \left(X_j, r_j\left(X_j, \mathbf{Z}\right)\right), $$
where $\mathbf{X}_{\sim j} =: \left(X_{w_1}, \ldots, X_{w_{d-1}}\right)$, and $A \stackrel{d}{=} B$ means that the random variables $A$ and $B$ have the same CDF.
It is worth noting that the dependency model is not unique in general. Uniqueness can be obtained under additional conditions provided in Proposition 1, which enable the inversion of the dependency function r j .
Proposition 1.
Consider a dependency model of the continuous random vector $\mathbf{X} \sim F$, given by $\mathbf{X}_{\sim j} = r_j\left(X_j, \mathbf{Z}\right)$, with a prescribed order $w_1, \ldots, w_{d-1}$.
If $X_j$ is the explanatory variable and the distribution of $\mathbf{Z}$ is prescribed, then:
(i) 
The dependency model is uniquely defined;
(ii) 
The dependency model is invertible, and the unique inverse is given by
$$ \mathbf{Z} = r_j^{-1}\left(\mathbf{X}_{\sim j} \,\middle|\, X_j\right). $$
Proof. 
See Appendix A. □
It should be noted that the dependency models (DMs) are vector-valued functions of independent input variables. Thus, DMs facilitate the calculus of partial derivatives, such as the partial derivatives of $\mathbf{X}$ with respect to $X_j$. Moreover, the inverse of a DM avoids working with $\mathbf{Z}$. A natural choice of the order $w_1, \ldots, w_{d-1}$ is $(1, \ldots, j-1, j+1, \ldots, d)$.

2.2. Enhanced Implicit Functions: Dependency Functions

In this section, we provide a probabilistic version of the implicit function using DMs.
Consider $\mathbf{X} \sim F$, a sample value of $\mathbf{X}$ given by $\mathbf{x} \in \mathbb{R}^d$, and a function $h : \mathbb{R}^d \to \mathbb{R}^p$ with $p \leq d$ an integer. When connecting the input variables by $p$ compatible equations, that is,
$$ h(\mathbf{X}) = 0, $$
the well-known implicit function theorem (see Theorem 1 below) states that for each sample value $\mathbf{x}^*$ satisfying $h(\mathbf{x}^*) = 0$, a subset of $\mathbf{X}$ can be expressed as a function of the others in the neighborhood of $\mathbf{x}^*$. To recall this theorem, we use $u \subseteq \{1, \ldots, d\}$ with $|u| := \mathrm{card}(u) = d - p$, $\mathbf{X}_u := \left(X_j,\, j \in u\right)$, $\mathbf{X}_{\sim u} := \left(X_j,\, j \in \{1, \ldots, d\} \setminus u\right)$, and $B\left(\mathbf{x}_u^*, r_1\right) \subseteq \mathbb{R}^{d-p}$ (resp. $B\left(\mathbf{x}_{\sim u}^*, r_2\right)$) for an open ball centered on $\mathbf{x}_u^*$ (resp. $\mathbf{x}_{\sim u}^*$) with a radius of $r_1$ (resp. $r_2$). Again, $h_{\mathbf{x}_u}(\mathbf{x}^*)$ (resp. $h_{\mathbf{x}_{\sim u}}$) is the formal Jacobian of $h$ with respect to $\mathbf{x}_u$ (resp. $\mathbf{x}_{\sim u}$).
Theorem 1
(implicit function). Assume that $h(\mathbf{x}^*) = 0$ and that $h_{\mathbf{x}_{\sim u}}(\mathbf{x}^*)$ is invertible. Then, there exists a function $g : B\left(\mathbf{x}_u^*, r_1\right) \to B\left(\mathbf{x}_{\sim u}^*, r_2\right)$, such that
$$ \mathbf{x}_{\sim u} = g\left(\mathbf{x}_u\right); \qquad \mathbf{x}_{\sim u}^* = g\left(\mathbf{x}_u^*\right); \qquad \frac{\partial \mathbf{x}_{\sim u}}{\partial \mathbf{x}_u}\left(\mathbf{x}_u^*\right) = -\left[h_{\mathbf{x}_{\sim u}}(\mathbf{x}^*)\right]^{-1} h_{\mathbf{x}_u}(\mathbf{x}^*). $$
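As a concrete illustration of Theorem 1 on the linear constraint from the introduction (a hedged SymPy sketch; the choice $u = \{x, z\}$ and the variable names are assumptions made only for illustration), the derivative $\frac{\partial \mathbf{x}_{\sim u}}{\partial \mathbf{x}_u}$ can be evaluated as follows:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
h = x + 2*y - z                                  # constraint from the introduction, h(x, y, z) = 0

# Solve for y (the set ~u) in terms of the explanatory inputs u = (x, z):
# dx_{~u}/dx_u = -(h_{x_{~u}})^{-1} h_{x_u}
dy_dx = -sp.diff(h, y)**-1 * sp.diff(h, x)       # -1/2
dy_dz = -sp.diff(h, y)**-1 * sp.diff(h, z)       #  1/2
print(dy_dx, dy_dz)
```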
While Theorem 1 is useful, it turns out that the implicit function theorem provides only a local relationship among the variables. It should be noted that the DMs derived in Section 2.1 provide global relationships once the CDFs of the input variables are known. The distribution function of the variables that satisfy the constraints given by $h(\mathbf{X}) = 0$ is needed to construct the global implicit function. To derive this distribution function, we assume that:
(A1) All the constraints $h(\mathbf{X}) = 0$ are compatible.
Under (A1), the constraints $h(\mathbf{X}) = 0$ introduce new dependency structures on the initial CDF $F$, which matter for our analysis. Probability theory ensures the existence of a distribution function that captures these dependencies.
Proposition 2.
Let $\mathbf{X} \sim F$ and let $\mathbf{X}^c := \left\{\mathbf{X} \sim F : h(\mathbf{X}) = 0\right\} \sim F^c$ be the constrained variables. If (A1) holds, we have
$$ \left\{\mathbf{X} \sim F \;\; \mathrm{s.t.} \;\; h(\mathbf{X}) = 0\right\} \stackrel{d}{=} \mathbf{X}^c, $$
where $\stackrel{d}{=}$ denotes equality in distribution.
Introducing constraints on the initial variables leads us to work with constrained variables that follow a new CDF, that is, $\mathbf{X}^c \sim F^c$. Some examples of generic and constrained variables $\mathbf{X}^c$ and their corresponding distribution functions can be found in [14,15,20]. When analytical derivations of the CDF of $\mathbf{X}^c$ are challenging or not possible, a common practice is to fit a distribution function to observations of $\mathbf{X}^c$ obtained by numerical simulation (see [14,21,22,23,24,25] for examples of distribution and density estimations). By using the new distributions of the input variables, Corollary 1 provides the probabilistic version of the implicit function.
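When $F^c$ has to be approximated numerically, one simple illustrative route for inequality-type constraints is rejection sampling from the unconstrained distribution; the sketch below (assuming NumPy, and using the constraint of Section 6, $X_1^2 + X_2^2 + X_3^2 \leq c$, purely as an example) produces observations of $\mathbf{X}^c$ to which a distribution can then be fitted:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_constrained(n, c=3.0):
    """Illustrative rejection sampler: draw X ~ N(0, I_3) and keep the samples
    satisfying x1^2 + x2^2 + x3^2 <= c (inequality constraints only)."""
    out = []
    while len(out) < n:
        x = rng.standard_normal((n, 3))
        keep = (x**2).sum(axis=1) <= c
        out.extend(x[keep])
    return np.asarray(out[:n])

xc = sample_constrained(10_000)
# xc provides observations of X^c to which F^c can be fitted (e.g., kernel density estimation).
```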
Definition 2.
A distribution $G$ is considered to be a degenerate CDF when $G$ is the CDF of the Dirac measure with $\delta_a$ as the probability mass function, where $a \in \mathbb{R}$.
Corollary 1
([14,15]). Consider a random vector $\mathbf{X}^c := \left\{\mathbf{X} \sim F : h(\mathbf{X}) = 0\right\}$ that follows $F^c$ as a CDF. Assume that (A1) holds and that $F^c$ is a nondegenerate CDF.
Then, there exists a function $r_j : \mathbb{R}^d \to \mathbb{R}^{d-1}$ and $d-1$ independent variables $\mathbf{Z}$, such that $\mathbf{Z}$ is independent of $X_j^c$ and
$$ \mathbf{X}_{\sim j}^c = r_j\left(X_j^c, \mathbf{Z}\right). $$
Proof. 
This result is the DM for the distribution $F^c$ (see Section 2.1). □
While Corollary 1 gives the explicit function that links $\mathbf{X}_{\sim j}$ to $X_j$, we can sometimes extend this result as follows:
$$ \mathbf{X}_{\sim u}^c = r_u\left(\mathbf{X}_u^c, \mathbf{Z}_u\right), $$
where $\mathbf{Z}_u$ is a vector of $d - |u|$ independent variables, $\mathbf{Z}_u$ is independent of $\mathbf{X}_u^c$, and $r_u : \mathbb{R}^d \to \mathbb{R}^{d-|u|}$ (see Section 2.1 and [14]).
Remark 1.
We can easily generalize the above process to handle (i) constrained inequations such as $h(\mathbf{X}) < 0$ or $h(\mathbf{X}) > 0$ (see Section 6 and [15]), and (ii) a mixture of constrained equations and inequalities involving different variables.
Remark 2.
For a continuous random vector $\mathbf{X}$ with $C$ as the copula and $F_j$, $j = 1, \ldots, d$ as the marginal distributions, an expression of its DM is given by ([3,14])
$$
\begin{aligned}
X_1 &= F_1^{-1}\left(C_{1|j}^{-1}\left(Z_1 \,\middle|\, F_j\left(X_j\right)\right)\right) =: r_{j,1}\left(X_j, Z_1\right), \\
X_2 &= F_2^{-1}\left(C_{2|j,1}^{-1}\left(Z_2 \,\middle|\, F_j\left(X_j\right), F_1\left(r_{j,1}\left(X_j, Z_1\right)\right)\right)\right) =: r_{j,2}\left(X_j, Z_1, Z_2\right),
\end{aligned}
$$
where $C_{1|j}$ is the conditional copula, $C_{1|j}^{-1}$ is the inverse of $C_{1|j}$, and $r_{j,1}$ and $r_{j,2}$ are real-valued functions.
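As a hedged illustration of Remark 2, the following sketch (assuming SciPy; the bivariate Gaussian copula with standard normal margins is chosen only as an example) generates one component of a DM via the inverse conditional copula $C_{2|1}^{-1}\left(z \mid u_1\right) = \Phi\left(\rho\,\Phi^{-1}(u_1) + \sqrt{1 - \rho^2}\,\Phi^{-1}(z)\right)$:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
rho, n = 0.7, 100_000

# One component of a DM: X2 = r_{1,2}(X1, Z2) for a bivariate Gaussian copula with
# standard normal margins, using the inverse conditional copula written above.
x1 = rng.standard_normal(n)                   # explanatory input X1 ~ N(0, 1)
z2 = rng.uniform(size=n)                      # innovation variable Z2 ~ U(0, 1), independent of X1
u2 = norm.cdf(rho * x1 + np.sqrt(1 - rho**2) * norm.ppf(z2))   # C_{2|1}^{-1}(z2 | F_1(x1))
x2 = norm.ppf(u2)                             # X2 = F_2^{-1}(u2), here F_2 = Phi

print(np.corrcoef(x1, x2)[0, 1])              # close to rho, as expected
```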

2.3. Representation of Functions with Non-Independent Variables

In general, a function may include a group of independent variables, as well as groups of non-independent variables such as correlated variables and/or dependent variables. We can then organize these input variables as follows:
(O): the random vector $\mathbf{X} := (X_1, \ldots, X_d)$ consists of $K$ independent random vector(s) given by $\mathbf{X} = \left(\mathbf{X}_{\pi_1}, \ldots, \mathbf{X}_{\pi_K}\right)$, where the sets $\pi_1, \ldots, \pi_K$ form a partition of $\{1, \ldots, d\}$. The random vector $\mathbf{X}_{\pi_{k_1}} := \left(X_j,\, j \in \pi_{k_1}\right)$ is independent of $\mathbf{X}_{\pi_{k_2}} := \left(X_j,\, j \in \pi_{k_2}\right)$ for every pair $k_1, k_2 \in \{1, \ldots, K\}$ with $k_1 \neq k_2$. Without loss of generality, we use $\mathbf{X}_{\pi_1}$ for a random vector of $d_1 \geq 0$ independent variable(s) and $\mathbf{X}_{\pi_k}$ with $k \geq 2$ for a random vector of $d_k \geq 2$ dependent variables.
We use $\left(\pi_{1,k}, \ldots, \pi_{d_k,k}\right)$ to denote the ordered permutation of $\pi_k$ (i.e., $\pi_{1,k} < \pi_{2,k} < \cdots < \pi_{d_k,k}$). For any $\pi_{j,k} \in \pi_k$, we use $X_{\pi_{j,k}}$ to refer to an element of $\mathbf{X}_{\pi_k}$; $\left(\sim \pi_{j,k}\right) := \left(\pi_{1,k}, \ldots, \pi_{j-1,k}, \pi_{j+1,k}, \ldots, \pi_{d_k,k}\right)$ and $\mathbf{Z}_k := \left(Z_{\pi_{\ell,k}},\, \pi_{\ell,k} \in (\sim \pi_{j,k})\right)$. Keeping in mind the DMs (see Section 2.1), we can represent $\mathbf{X}_{\pi_k}$ by
$$ \mathbf{X}_{\sim \pi_{j,k}} \stackrel{d}{=} r_{\pi_{j,k}}\left(X_{\pi_{j,k}}, \mathbf{Z}_k\right), \qquad k \in \{2, \ldots, K\}, \qquad (5) $$
where $r_{\pi_{j,k}}(\cdot) = \left(r_{\pi_{1,k}}(\cdot), \ldots, r_{\pi_{j-1,k}}(\cdot), r_{\pi_{j+1,k}}(\cdot), \ldots, r_{\pi_{d_k,k}}(\cdot)\right)$; $\mathbf{Z}_k$ is a random vector of $d_k - 1$ independent variables; and $X_{\pi_{j,k}}$ is independent of $\mathbf{Z}_k$. Based on the above DM of $\mathbf{X}_{\pi_k}$ with $k = 2, \ldots, K$, let us introduce new functions, that is, $c_{\pi_{j,k}} : \mathbb{R}^{d_k} \to \mathbb{R}^{d_k}$, given by
$$ c_{\pi_{j,k}}(\cdot) = \left(r_{\pi_{1,k}}(\cdot), \ldots, r_{\pi_{j-1,k}}(\cdot),\; r_{\pi_{j,k}}(\cdot) = X_{\pi_{j,k}},\; r_{\pi_{j+1,k}}(\cdot), \ldots, r_{\pi_{d_k,k}}(\cdot)\right) \quad \text{and} $$
$$ \mathbf{X}_{\pi_k} = c_{\pi_{j,k}}\left(X_{\pi_{j,k}}, \mathbf{Z}_k\right) =: \left(X_{\pi_{j,k}},\, r_{\pi_{j,k}}\left(X_{\pi_{j,k}}, \mathbf{Z}_k\right)\right). $$
The function $c_{\pi_{j,k}}$ maps the independent variables $\left(X_{\pi_{j,k}}, \mathbf{Z}_k\right)$ onto $\mathbf{X}_{\pi_k}$, and the chart
$$
\mathbb{R}^d \stackrel{c}{\longrightarrow} \mathbb{R}^d \stackrel{f}{\longrightarrow} \mathbb{R}^n, \qquad
\begin{pmatrix} \mathbf{X}_{\pi_1} \\ X_{\pi_{j,2}} \\ \mathbf{Z}_2 \\ \vdots \\ X_{\pi_{j,K}} \\ \mathbf{Z}_K \end{pmatrix}
\longmapsto
\begin{pmatrix} \mathbf{X}_{\pi_1} \\ \mathbf{X}_{\pi_2} \\ \vdots \\ \mathbf{X}_{\pi_K} \end{pmatrix} = \mathbf{X}
\longmapsto
\begin{pmatrix} f_1(\mathbf{X}) \\ \vdots \\ f_n(\mathbf{X}) \end{pmatrix},
$$
leads to a new representation of functions with non-independent variables. Indeed, composing $f$ with $c$ yields
$$ f\left(\mathbf{X}_{\pi_1}, \mathbf{X}_{\pi_2}, \ldots, \mathbf{X}_{\pi_K}\right) \stackrel{d}{=} f \circ c\left(\mathbf{X}_{\pi_1}, X_{\pi_{j,2}}, \mathbf{Z}_2, \ldots, X_{\pi_{j,K}}, \mathbf{Z}_K\right), \qquad (6) $$
where $c\left(\mathbf{X}_{\pi_1}, X_{\pi_{j,2}}, \mathbf{Z}_2, \ldots, X_{\pi_{j,K}}, \mathbf{Z}_K\right) := \left(\mathbf{X}_{\pi_1}, c_{\pi_{j,2}}\left(X_{\pi_{j,2}}, \mathbf{Z}_2\right), \ldots, c_{\pi_{j,K}}\left(X_{\pi_{j,K}}, \mathbf{Z}_K\right)\right)$.
The equivalent representation of $f$, given by (6), relies on the innovation variables $\mathbf{Z} := \left(\mathbf{Z}_2, \ldots, \mathbf{Z}_K\right)$. Recall that for the continuous random vector $\mathbf{X}_{\pi_k}$, the DM $r_{\pi_{j,k}}$, given by (5), is always invertible (see Proposition 1), and, therefore, $c_{\pi_{j,k}}$ is also invertible. These inversions are helpful for working with $\mathbf{X}$ only.

3. Actual Partial Derivatives

This section discusses the calculus of partial derivatives of functions with non-independent variables using only one relationship among the inputs, such as the DM given by Equation (5). The usual assumptions made are:
(A2) The joint (resp. marginal) CDF is continuous and has a density function ρ > 0 on its open support;
(A3) Each component of the dependency function $r_{\pi_{j,k}}$ is differentiable with respect to $X_{\pi_{j,k}}$;
(A4) Each component of the function $f$, that is, $f_\ell$ with $\ell \in \{1, \ldots, n\}$, is differentiable with respect to each input.
Without loss of generality and for the sake of simplicity, we suppose here that $n = 1$. Namely, we use $I_{d \times d} \in \mathbb{R}^{d \times d}$ for the identity matrix and $O_{d \times d_1} \in \mathbb{R}^{d \times d_1}$ for the null matrix. It is common to use $f_{\mathbf{x}_{\pi_k}}$ for the formal partial derivatives of $f$ with respect to each input of $\mathbf{X}_{\pi_k}$ (i.e., the derivatives obtained by considering the inputs as independent) with $k = 1, \ldots, K$. Thus, the formal gradient of $f$ (i.e., the gradient with respect to the Euclidean metric) is given by
$$ \nabla f := \left(f_{\mathbf{x}_{\pi_1}}^T, f_{\mathbf{x}_{\pi_2}}^T, \ldots, f_{\mathbf{x}_{\pi_K}}^T\right)^T. \qquad (7) $$
Keeping in mind the function $c_{\pi_{j,k}}(\cdot)$, the partial derivatives of each component of $\mathbf{X}_{\pi_k}$ with respect to $X_{\pi_{j,k}}$ are given by
$$ J^{(\pi_{j,k})} := \frac{\partial c_{\pi_{j,k}}}{\partial x_{\pi_{j,k}}} = \left(\frac{\partial X_{\pi_{1,k}}}{\partial x_{\pi_{j,k}}}, \ldots, \frac{\partial X_{\pi_{d_k,k}}}{\partial x_{\pi_{j,k}}}\right)^T = \left(\frac{\partial r_{\pi_{1,k}}}{\partial x_{\pi_{j,k}}}, \ldots, \underbrace{1}_{j\text{th position}}, \ldots, \frac{\partial r_{\pi_{d_k,k}}}{\partial x_{\pi_{j,k}}}\right)^T. \qquad (8) $$
We use $J_i^{(\pi_{j,k})}$ for the $i$th element of $J^{(\pi_{j,k})}$. For instance, $J_j^{(\pi_{j,k})} = 1$, and $J_{d_k}^{(\pi_{j,k})}$ represents the partial derivative of $X_{\pi_{d_k,k}}$ with respect to $X_{\pi_{j,k}}$. It is worth recalling that $J^{(\pi_{j,k})}$ is a vector-valued function of $\left(X_{\pi_{j,k}}, \mathbf{Z}_k\right)$, and Lemma 1 expresses $J^{(\pi_{j,k})}$ as a function of $\mathbf{x}_{\pi_k}$ only.
Lemma 1.
Let $\mathbf{x}_{\pi_k}$ be a sample value of $\mathbf{X}_{\pi_k}$. If Assumptions (A2)–(A4) hold, the partial derivatives of $\mathbf{X}_{\pi_k}$ with respect to $X_{\pi_{j,k}}$ evaluated at $\mathbf{x}_{\pi_k}$ are given by
$$ J^{(\pi_{j,k})}\left(\mathbf{x}_{\pi_k}\right) := \left(\frac{\partial r_{\pi_{1,k}}}{\partial x_{\pi_{j,k}}}, \ldots, \underbrace{1}_{j\text{th position}}, \ldots, \frac{\partial r_{\pi_{d_k,k}}}{\partial x_{\pi_{j,k}}}\right)^T \left(x_{\pi_{j,k}},\, r_{\pi_{j,k}}^{-1}\left(\mathbf{x}_{\sim \pi_{j,k}} \,\middle|\, x_{\pi_{j,k}}\right)\right). \qquad (9) $$
Proof. 
See Appendix B. □
Again, $J_\ell^{(\pi_{j,k})}\left(\mathbf{x}_{\pi_k}\right)$ with $\ell \in \{1, \ldots, d_k\}$ represents the $\ell$th component of $J^{(\pi_{j,k})}\left(\mathbf{x}_{\pi_k}\right)$, as provided in Lemma 1. Using these components and the chain rule, Theorem 2 provides the actual partial derivatives of functions with non-independent variables (i.e., $\frac{\partial^a f}{\partial \mathbf{x}}$), which are the derivatives obtained by utilizing only one dependency function, given by Equation (5).
Theorem 2.
Let $\mathbf{x} \in \mathbb{R}^d$ be a sample value of $\mathbf{X}$ and $\pi_{j,k} \in \pi_k$ with $k = 2, \ldots, K$. If Assumptions (A2)–(A4) hold, then:
(i) 
The actual Jacobian matrix of $c_{\pi_{j,k}}$ is given by
$$ J_{\pi_k}^a\left(\mathbf{x}_{\pi_k}\right) := \left[\frac{J^{(\pi_{j,k})}\left(\mathbf{x}_{\pi_k}\right)}{J_1^{(\pi_{j,k})}\left(\mathbf{x}_{\pi_k}\right)} \;\; \cdots \;\; \frac{J^{(\pi_{j,k})}\left(\mathbf{x}_{\pi_k}\right)}{J_{d_k}^{(\pi_{j,k})}\left(\mathbf{x}_{\pi_k}\right)}\right], \qquad k \in \{2, \ldots, K\}; \qquad (10) $$
(ii) 
The actual Jacobian matrix of $c$ or $\frac{\partial^a \mathbf{X}}{\partial \mathbf{x}}$ is given by
$$ J^a(\mathbf{x}) := \begin{pmatrix} I_{d_1 \times d_1} & O_{d_1 \times d_2} & \cdots & O_{d_1 \times d_K} \\ O_{d_2 \times d_1} & J_{\pi_2}^a\left(\mathbf{x}_{\pi_2}\right) & \cdots & O_{d_2 \times d_K} \\ \vdots & \vdots & \ddots & \vdots \\ O_{d_K \times d_1} & O_{d_K \times d_2} & \cdots & J_{\pi_K}^a\left(\mathbf{x}_{\pi_K}\right) \end{pmatrix}; \qquad (11) $$
(iii) 
The actual partial derivatives of f are given by
$$ \frac{\partial^a f}{\partial \mathbf{x}}(\mathbf{x}) := J^a(\mathbf{x})^T\, \nabla f(\mathbf{x}). \qquad (12) $$
Proof. 
See Appendix C. □
The results from Theorem 2 are based on only one dependency function, which uses $X_{\pi_{j,k}}$ as the explanatory input. Thus, the actual Jacobian $J^a(\mathbf{x})$ and the actual partial derivatives of $f$ provided in (10)–(12) will change with the choice of the explanatory input $X_{\pi_{j,k}}$ for every $j \in \{1, \ldots, d_k\}$. All these possibilities are not surprising. Indeed, while no additional explicit assumption is necessary for calculating the partial derivatives of $\mathbf{X}_{\pi_k}$ with respect to $X_{\pi_{j,k}}$ (i.e., $J^{(\pi_{j,k})}$), we implicitly keep the other variables fixed when calculating the partial derivative of $\mathbf{X}_{\pi_k}$ with respect to $X_{\pi_{i,k}}$, that is, $J^{(\pi_{j,k})} / J_i^{(\pi_{j,k})}$ for each $i \neq j$. Such an implicit assumption is due to the reciprocal rule used to derive the results (see Appendix C). In general, the components of $\mathbf{X}_{\sim \pi_{j,k}}$, such as $X_{\pi_{i_1,k}}$ and $X_{\pi_{i_2,k}}$, are both functions of $X_{\pi_{j,k}}$ and $Z_{\pi_{1,k}}$ at the very least. Thus, different possibilities of the actual Jacobians are based on different implicit assumptions, making it challenging to use the actual partial derivatives. Further drawbacks of the actual partial derivatives of $f$ are illustrated in Example 1 below.

Example 1

We consider the function $f(X_1, X_2) = X_1 + X_2 + X_1 X_2$, which includes two correlated inputs $\mathbf{X} \sim \mathcal{N}_2\left(0, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right)$. We see that $\nabla f(\mathbf{X}) = \begin{pmatrix} 1 + X_2 \\ 1 + X_1 \end{pmatrix}$. Using the DM of $\mathbf{X}$ given by (see [3,14,15])
$$ X_2 = \rho X_1 + \sqrt{1 - \rho^2}\, Z_2 \quad \Longleftrightarrow \quad Z_2 = \left(X_2 - \rho X_1\right)/\sqrt{1 - \rho^2}, $$
the actual Jacobian matrix of c and the actual partial derivatives of f are given by
$$ J^a(\mathbf{X}) = \begin{pmatrix} 1 & \frac{1}{\rho} \\ \rho & 1 \end{pmatrix}; \qquad \frac{\partial^a f}{\partial \mathbf{x}}(\mathbf{X}) = \begin{pmatrix} 1 + X_2 + \rho\left(1 + X_1\right) \\ \frac{1 + X_2}{\rho} + 1 + X_1 \end{pmatrix}. $$
When $\rho = 1$, both inputs are perfectly correlated, and we have $X_1 = X_2$, which also implies that $f(X_1, X_2) = f(X_1) = 2X_1 + X_1^2 = f(X_2) = 2X_2 + X_2^2$. We can check that $\frac{\partial^a f}{\partial \mathbf{x}}(\mathbf{X}) = \begin{pmatrix} 2 + 2X_1 \\ 2 + 2X_2 \end{pmatrix}$. However, when $\rho = 0$, both inputs are independent, and we should expect the actual partial derivatives to be equal to the formal gradient $\nabla f$, but this is not the case. Moreover, using the second DM, which is given by
$$ X_1 = \rho X_2 + \sqrt{1 - \rho^2}\, Z_1 \quad \Longleftrightarrow \quad Z_1 = \left(X_1 - \rho X_2\right)/\sqrt{1 - \rho^2}, $$
it becomes apparent that $J^a(\mathbf{X}) = \begin{pmatrix} 1 & \rho \\ \frac{1}{\rho} & 1 \end{pmatrix}$; $\frac{\partial^a f}{\partial \mathbf{X}}(\mathbf{X}) = \begin{pmatrix} 1 + X_2 + \frac{1 + X_1}{\rho} \\ \rho\left(1 + X_2\right) + 1 + X_1 \end{pmatrix}$, which differs from the previous results. All these drawbacks are due to the implicit assumptions made (e.g., keeping some variables fixed), which can be avoided (see Section 4).
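The dependence of the actual derivatives on the chosen DM can be checked symbolically; the sketch below (assuming SymPy; illustrative only) builds both actual Jacobians of Example 1 and compares the resulting derivatives:

```python
import sympy as sp

x1, x2, rho = sp.symbols('x1 x2 rho')
grad_f = sp.Matrix([1 + x2, 1 + x1])          # formal gradient of f = x1 + x2 + x1*x2

# Actual Jacobians obtained from the two dependency models
Ja_1 = sp.Matrix([[1, 1/rho], [rho, 1]])      # DM: X2 = rho*X1 + sqrt(1 - rho^2)*Z2
Ja_2 = sp.Matrix([[1, rho], [1/rho, 1]])      # DM: X1 = rho*X2 + sqrt(1 - rho^2)*Z1

d1 = Ja_1.T * grad_f                          # actual partial derivatives, first DM
d2 = Ja_2.T * grad_f                          # actual partial derivatives, second DM

print((d1 - d2).applyfunc(sp.simplify))       # nonzero in general: the result depends on the DM
print((d1 - d2).subs(rho, 1))                 # zero vector: both agree when rho = 1
```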

4. Dependent Jacobian and Partial Derivatives

This section aims to derive the first- and second-order partial derivatives of functions with non-independent variables, without relying on any additional assumption, whether explicit or implicit. Essentially, we calculate or compute the partial derivatives of $\mathbf{X}_{\pi_k}$ with respect to $X_{\pi_{i,k}}$ using only the dependency function that includes $X_{\pi_{i,k}}$ as an explanatory input, which can be expressed as follows:
$$ \mathbf{X}_{\sim \pi_{i,k}} = r_{\pi_{i,k}}\left(X_{\pi_{i,k}}, \mathbf{Z}_{\sim \pi_{i,k}}\right), \qquad i = 1, \ldots, d_k; \quad k = 2, \ldots, K. $$
By using the above dependency function, the partial derivatives of $\mathbf{X}_{\pi_k}$ with respect to $X_{\pi_{i,k}}$ are given as follows (see (9)):
$$ J^{(\pi_{i,k})}\left(\mathbf{x}_{\pi_k}\right) := \frac{\partial \mathbf{X}_{\pi_k}}{\partial x_{\pi_{i,k}}} = \left(\frac{\partial r_{\pi_{1,k}}}{\partial x_{\pi_{i,k}}}, \ldots, \underbrace{1}_{i\text{th position}}, \ldots, \frac{\partial r_{\pi_{d_k,k}}}{\partial x_{\pi_{i,k}}}\right)^T \left(x_{\pi_{i,k}},\, r_{\pi_{i,k}}^{-1}\left(\mathbf{x}_{\sim \pi_{i,k}} \,\middle|\, x_{\pi_{i,k}}\right)\right). $$
It should be noted that $J^{(\pi_{i,k})}$ does not require any supplementary assumption, as $X_{\pi_{i,k}}$ and $\mathbf{Z}_{\sim \pi_{i,k}}$ are independent. Thus, $d_k$ different DMs are necessary to derive the dependent Jacobian and partial derivatives of $f$ (see Theorem 3).
Theorem 3.
Let $\mathbf{x} \in \mathbb{R}^d$ be a sample value of $\mathbf{X}$, and assume that (A2)–(A4) hold:
(i) 
For all $k \geq 2$, the dependent Jacobian matrix $\frac{\partial \mathbf{X}_{\pi_k}}{\partial \mathbf{x}_{\pi_k}}$ is given by
$$ J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right) := \left[J^{(\pi_{1,k})}\left(\mathbf{x}_{\pi_k}\right) \;\; \cdots \;\; J^{(\pi_{d_k,k})}\left(\mathbf{x}_{\pi_k}\right)\right]; \qquad (13) $$
(ii) 
The dependent Jacobian matrix $\frac{\partial \mathbf{X}}{\partial \mathbf{x}}$ is given by
$$ J^d(\mathbf{x}) := \begin{pmatrix} I_{d_1 \times d_1} & O_{d_1 \times d_2} & \cdots & O_{d_1 \times d_K} \\ O_{d_2 \times d_1} & J_{\pi_2}^d\left(\mathbf{x}_{\pi_2}\right) & \cdots & O_{d_2 \times d_K} \\ \vdots & \vdots & \ddots & \vdots \\ O_{d_K \times d_1} & O_{d_K \times d_2} & \cdots & J_{\pi_K}^d\left(\mathbf{x}_{\pi_K}\right) \end{pmatrix}; \qquad (14) $$
(iii) 
The partial derivatives of f are given by
$$ \frac{\partial f}{\partial \mathbf{x}}(\mathbf{x}) := J^d(\mathbf{x})^T\, \nabla f(\mathbf{x}). \qquad (15) $$
Proof. 
See Appendix D. □
Although the results from Theorem 3 require different DMs, these results are more comprehensive than the actual partial derivatives because no supplementary assumption is made about each non-independent variable.
To derive the second-order partial derivatives of $f$, we use $f_{x_i x_j}$ to denote the formal cross-partial derivative of $f$ with respect to $x_i$ and $x_j$, and $H_{\pi_k} := \left[f_{x_{j_1} x_{j_2}}\right]_{j_1, j_2 \in \pi_k}$ to denote the formal or ordinary Hessian matrix of $f$ restricted to $\mathbf{X}_{\pi_k}$ for $k = 1, \ldots, K$. In the same sense, we use $H_{\pi_{k_1}, \pi_{k_2}} := \left[f_{x_{j_1} x_{j_2}}\right]_{j_1 \in \pi_{k_1},\, j_2 \in \pi_{k_2}}$ to denote the formal cross-Hessian matrix of $f$ restricted to $\left(\mathbf{X}_{\pi_{k_1}}, \mathbf{X}_{\pi_{k_2}}\right)$ for every pair $k_1, k_2 \in \{1, \ldots, K\}$ with $k_1 \neq k_2$. To ensure the existence of the second-order partial derivatives, we assume that:
(A5) The function $f$ is twice (formally) differentiable with respect to each input;
(A6) Every dependency function $r_{\pi_{j,k}}$ is twice differentiable with respect to $X_{\pi_{j,k}}$.
By considering the $d_k$ DMs of $\mathbf{X}_{\pi_k}$ (i.e., $\mathbf{X}_{\sim \pi_{i,k}} = r_{\pi_{i,k}}\left(X_{\pi_{i,k}}, \mathbf{Z}_{\sim \pi_{i,k}}\right)$ with $i = 1, \ldots, d_k$) used to derive the dependent Jacobian, we can write
$$ \frac{\partial J^{(\pi_{i,k})}}{\partial x_{\pi_{i,k}}}\left(\mathbf{x}_{\pi_k}\right) := \frac{\partial^2 \mathbf{X}_{\pi_k}}{\partial^2 x_{\pi_{i,k}}} = \left(\frac{\partial^2 r_{\pi_{1,k}}}{\partial^2 x_{\pi_{i,k}}}, \ldots, \underbrace{0}_{i\text{th position}}, \ldots, \frac{\partial^2 r_{\pi_{d_k,k}}}{\partial^2 x_{\pi_{i,k}}}\right)^T \left(x_{\pi_{i,k}},\, r_{\pi_{i,k}}^{-1}\left(\mathbf{x}_{\sim \pi_{i,k}} \,\middle|\, x_{\pi_{i,k}}\right)\right) $$
for the second partial derivatives of $\mathbf{X}_{\pi_k}$ with respect to $X_{\pi_{i,k}}$. By using $\mathrm{diag}(\mathbf{x}) \in \mathbb{R}^{d \times d}$ to represent a diagonal matrix with $\mathbf{x}$ as its diagonal elements and
$$ D\!\left(J_{\pi_k}^d\right)(\mathbf{x}) := \mathrm{diag}\left(\left[\frac{\partial J^{(\pi_{1,k})}}{\partial x_{\pi_{1,k}}}\left(\mathbf{x}_{\pi_k}\right) \;\; \cdots \;\; \frac{\partial J^{(\pi_{d_k,k})}}{\partial x_{\pi_{d_k,k}}}\left(\mathbf{x}_{\pi_k}\right)\right]^T f_{\mathbf{x}_{\pi_k}}(\mathbf{x})\right) J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right) $$
for all $k \in \{2, \ldots, K\}$, Theorem 4 provides the dependent second-order partial derivatives (i.e., $\frac{\partial^2 f}{\partial^2 \mathbf{x}}$).
Theorem 4.
Let x be a sample value of X . If (A2), (A5), and (A6) hold, then
$$ \frac{\partial^2 f}{\partial^2 \mathbf{x}}(\mathbf{x}) := \begin{pmatrix}
H_{\pi_1}(\mathbf{x}) & H_{\pi_1, \pi_2}(\mathbf{x})\, J_{\pi_2}^d\left(\mathbf{x}_{\pi_2}\right) & \cdots & H_{\pi_1, \pi_K}(\mathbf{x})\, J_{\pi_K}^d\left(\mathbf{x}_{\pi_K}\right) \\
J_{\pi_2}^d\left(\mathbf{x}_{\pi_2}\right)^T H_{\pi_2, \pi_1}(\mathbf{x}) & J_{\pi_2}^d\left(\mathbf{x}_{\pi_2}\right)^T H_{\pi_2}(\mathbf{x})\, J_{\pi_2}^d\left(\mathbf{x}_{\pi_2}\right) + D\!\left(J_{\pi_2}^d\right)(\mathbf{x}) & \cdots & J_{\pi_2}^d\left(\mathbf{x}_{\pi_2}\right)^T H_{\pi_2, \pi_K}(\mathbf{x})\, J_{\pi_K}^d\left(\mathbf{x}_{\pi_K}\right) \\
\vdots & \vdots & \ddots & \vdots \\
J_{\pi_K}^d\left(\mathbf{x}_{\pi_K}\right)^T H_{\pi_K, \pi_1}(\mathbf{x}) & J_{\pi_K}^d\left(\mathbf{x}_{\pi_K}\right)^T H_{\pi_K, \pi_2}(\mathbf{x})\, J_{\pi_2}^d\left(\mathbf{x}_{\pi_2}\right) & \cdots & J_{\pi_K}^d\left(\mathbf{x}_{\pi_K}\right)^T H_{\pi_K}(\mathbf{x})\, J_{\pi_K}^d\left(\mathbf{x}_{\pi_K}\right) + D\!\left(J_{\pi_K}^d\right)(\mathbf{x})
\end{pmatrix}. $$
Proof. 
See Appendix E. □

Example 1 (Revisited)

Since $f(X_1, X_2) = X_1 + X_2 + X_1 X_2$ and the DMs of $\mathbf{X}$ are given by
$$ X_2 = \rho X_1 + \sqrt{1 - \rho^2}\, Z_2 \quad \Longleftrightarrow \quad Z_2 = \left(X_2 - \rho X_1\right)/\sqrt{1 - \rho^2}, $$
$$ X_1 = \rho X_2 + \sqrt{1 - \rho^2}\, Z_1 \quad \Longleftrightarrow \quad Z_1 = \left(X_1 - \rho X_2\right)/\sqrt{1 - \rho^2}, $$
we can check that
$$ J^d = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}; \qquad \frac{\partial f}{\partial \mathbf{x}}(\mathbf{X}) = \begin{pmatrix} 1 + X_2 + \rho\left(1 + X_1\right) \\ \rho\left(1 + X_2\right) + 1 + X_1 \end{pmatrix}; \qquad \frac{\partial^2 f}{\partial^2 \mathbf{x}}(\mathbf{X}) = \begin{pmatrix} 2\rho & 1 + \rho^2 \\ 1 + \rho^2 & 2\rho \end{pmatrix}. $$
For instance, when $\rho = 1$, we have $\frac{\partial f}{\partial \mathbf{x}}(\mathbf{X}) = \frac{\partial^a f}{\partial \mathbf{x}}(\mathbf{X})$, and when $\rho = 0$, we have $\frac{\partial f}{\partial \mathbf{X}}(\mathbf{X}) = \nabla f(\mathbf{X})$ and $\frac{\partial^2 f}{\partial^2 \mathbf{X}}(\mathbf{X}) = H(\mathbf{X})$. Thus, the dependent partial derivatives of $f$ align with the formal gradient and Hessian matrix when the inputs are independent.
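These dependent quantities can be reproduced directly from the dependent Jacobian; the following SymPy sketch (illustrative only) also confirms the reduction to the formal gradient and Hessian at $\rho = 0$:

```python
import sympy as sp

x1, x2, rho = sp.symbols('x1 x2 rho')
grad_f = sp.Matrix([1 + x2, 1 + x1])          # formal gradient of f = x1 + x2 + x1*x2
H = sp.Matrix([[0, 1], [1, 0]])               # formal Hessian of f

Jd = sp.Matrix([[1, rho], [rho, 1]])          # dependent Jacobian built from the two DMs
dep_grad = Jd.T * grad_f                      # dependent partial derivatives of f
dep_hess = Jd.T * H * Jd                      # the correction term D(J^d) vanishes here
                                              # because both DMs are linear in the inputs

print(dep_grad.subs(rho, 0))                  # formal gradient recovered when rho = 0
print(dep_hess)                               # Matrix([[2*rho, rho**2 + 1], [rho**2 + 1, 2*rho]])
```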

5. Expansion of Functions with Non-Independent Variables

Although Section 4 provided the partial derivatives and cross-partial derivatives of $f$, it is misleading to think that the infinitesimal increment of $f$, given by $f\left(\mathbf{X}_{\pi_k} + \epsilon\, e_{\pi_{j,k}}\right) - f\left(\mathbf{X}_{\pi_k}\right)$, should result in the individual effect quantified by $\frac{\partial f\left(\mathbf{X}_{\pi_k}\right)}{\partial x_{\pi_{j,k}}}\, \epsilon$, with $e_{\pi_{j,k}} := \left(0, \ldots, 0, \underbrace{1}_{\pi_{j,k}\text{th position}}, 0, \ldots, 0\right)^T$ and $\epsilon > 0$. Indeed, moving $X_{\pi_{j,k}}$ leads to partial movements of the other variables, and the effects we observe (i.e., $f_{\mathbf{x}_{\pi_k}}^T J^{(\pi_{j,k})}$) can also be attributed to other variables. The dependency structures of these effects are described by the dependent Jacobian matrix $J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right)$ (see Equation (13)). Therefore, the definition of the gradient and Hessian of $f$ with non-independent variables requires the introduction of a tensor metric or a Riemannian tensor.
In differential geometry, the function of the form
$$ c_{\pi_{j,k}} : \mathbb{R}^{d_k} \to \mathbb{R}^{d_k}, \qquad \left(X_{\pi_{j,k}}, \mathbf{Z}_{\sim \pi_{j,k}}\right) \longmapsto \mathbf{X} := \left(X_{\pi_{1,k}}, \ldots, X_{\pi_{j,k}}, \ldots, X_{\pi_{d_k,k}}\right); \quad \mathbf{X}_{\sim \pi_{j,k}} = r_{\pi_{j,k}}\left(X_{\pi_{j,k}}, \mathbf{Z}_{\sim \pi_{j,k}}\right), $$
for every $\pi_{j,k} \in \pi_k$ can be seen as a parametrization of a manifold $\mathcal{M}_k$ in $\mathbb{R}^{d_k}$. The $d_k$ column entries of the dependent Jacobian matrix $J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right) \in \mathbb{R}^{d_k \times d_k}$ span a local $m_k$-dimensional vector space, also known as the tangent space at $\mathbf{x}_{\pi_k}$, where $m_k$ is the rank of $J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right)$, indicating the number of linearly independent columns in $J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right)$.
By considering all the $K$ groups of inputs and the corresponding dependent Jacobian matrix $J^d(\mathbf{x})$, we can see that the support of the random vector $\mathbf{X}$ forms an $m$-dimensional manifold $\mathcal{M}$ in $\mathbb{R}^d$, where $m$ is the rank of $J^d(\mathbf{x})$. When $m \leq d$, we work within the tangent space $T\mathbb{R}^m$ (or local coordinate system) spanned by the $m$ column entries of $J^d(\mathbf{x})$ that are linearly independent. Working in $T\mathbb{R}^m$ rather than $T\mathbb{R}^d$ ensures that the Riemannian tensor induced by $\frac{\partial \mathbf{X}}{\partial \mathbf{x}}$ using the dot product is invertible. Since the Riemannian tensor metric is often symmetric, the Moore–Penrose generalized inverse of symmetric matrices ([26,27,28]) allows us to keep working in $T\mathbb{R}^d$ in the subsequent discussion. Using the first fundamental form (see, e.g., [9,10,11]), the induced tensor metric is defined as the inner product between the column entries of the dependent Jacobian matrix of the dependency functions, that is,
$$ G(\mathbf{x}) := J^d(\mathbf{x})^T J^d(\mathbf{x}). \qquad (17) $$
Based on these elements, the gradient and Hessian matrix are provided in Corollary 2. To that end, we use $G^{-1}$ to represent the inverse of the metric $G$, as given by Equation (17), when $m = d$, or the generalized inverse of $G$ whenever $m < d$ ([26,27,28]). For any $k \in \{1, \ldots, d\}$, the Christoffel symbols are defined by ([9,11,29,30])
$$ \Gamma_{ij}^k := \frac{1}{2} \sum_{\ell = 1}^{m = d} G_{k\ell}^{-1}(\mathbf{x}) \left(G_{\ell i, x_j}(\mathbf{x}) + G_{\ell j, x_i}(\mathbf{x}) - G_{ij, x_\ell}(\mathbf{x})\right); \qquad i, j = 1, \ldots, d, $$
where $G_{\ell i, x_j}$ is the formal partial derivative of $G_{\ell i}$ with respect to $x_j$.
Corollary 2.
Let $\mathbf{x}$ be a sample value of $\mathbf{X}$, and assume that (A2) and (A5)–(A6) hold:
(i) The gradient of $f$ is given by
$$ \mathrm{grad}(f)(\mathbf{x}) := G^{-1}(\mathbf{x})\, \nabla f(\mathbf{x}); $$
(ii) The Hessian matrix of $f$ is given by
$$ \mathrm{Hess}_{ij}(f)(\mathbf{x}) := f_{x_i x_j}(\mathbf{x}) - \sum_{k=1}^{m=d} \Gamma_{ij}^k(\mathbf{x})\, f_{x_k}(\mathbf{x}). $$
Proof. 
Points (i)–(ii) result from the definition of the gradient and the Hessian matrix within a Riemannian geometric context equipped with the metric G (see [9,10,11,31]). □
Taylor’s expansion is widely used to approximate functions with independent variables. In the subsequent discussion, we are concerned with the approximation of a function with non-independent variables. The Taylor-type expansion of a function with non-independent variables is provided in Corollary 3 using the gradient and Hessian matrix.
Corollary 3.
Let $\mathbf{x}, \mathbf{x}_0$ be two sample values of $\mathbf{X}$, and assume that (A2) and (A5)–(A6) hold. Then, we have
$$ f(\mathbf{x}) \approx f(\mathbf{x}_0) + \left(\mathbf{x} - \mathbf{x}_0\right)^T \mathrm{grad}(f)(\mathbf{x}_0) + \frac{1}{2} \left(\mathbf{x} - \mathbf{x}_0\right)^T \mathrm{Hess}(f)(\mathbf{x}_0) \left(\mathbf{x} - \mathbf{x}_0\right), $$
provided that $\mathbf{x}$ is close to $\mathbf{x}_0$.
Proof. 
The proof is straightforward using the dot product induced by the tensor metric G within the tangent space and considering the Taylor expansion provided in [11]. □

Example 1 (Revisited)

For the function in Example 1, we can check that the tensor metric is
$$ G(\mathbf{X}) = \begin{pmatrix} 1 + \rho^2 & 2\rho \\ 2\rho & 1 + \rho^2 \end{pmatrix}; \qquad G^{-1}(\mathbf{X}) = \frac{1}{\left(\rho^2 - 1\right)^2} \begin{pmatrix} 1 + \rho^2 & -2\rho \\ -2\rho & 1 + \rho^2 \end{pmatrix}; $$
and the gradient is
$$ \mathrm{grad}(f)(\mathbf{X}) = \frac{1}{\left(\rho^2 - 1\right)^2} \begin{pmatrix} (1 - \rho)^2 + X_2\left(1 + \rho^2\right) - 2\rho X_1 \\ (1 - \rho)^2 + X_1\left(1 + \rho^2\right) - 2\rho X_2 \end{pmatrix}, $$
which reduces to $\nabla f(\mathbf{X})$ when the variables are independent (i.e., $\rho = 0$).
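The metric, its inverse, and the gradient of Example 1 can be recomputed from the dependent Jacobian as follows (a SymPy sketch, assuming $\rho \neq \pm 1$ so that $G$ is invertible):

```python
import sympy as sp

x1, x2, rho = sp.symbols('x1 x2 rho')
grad_f = sp.Matrix([1 + x2, 1 + x1])          # formal gradient of f = x1 + x2 + x1*x2
Jd = sp.Matrix([[1, rho], [rho, 1]])          # dependent Jacobian of Example 1

G = Jd.T * Jd                                 # induced tensor metric (first fundamental form)
riem_grad = (G.inv() * grad_f).applyfunc(sp.simplify)   # grad(f) = G^{-1} * formal gradient

print(G)                                      # Matrix([[rho**2 + 1, 2*rho], [2*rho, rho**2 + 1]])
print(riem_grad.subs(rho, 0))                 # reduces to the formal gradient when rho = 0
```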

6. Application

In this section, we consider three independent input factors $X_j \sim \mathcal{N}(0, 1)$ with $j = 1, 2, 3$, $\mathbf{X} := (X_1, X_2, X_3)$, a constant $c \in \mathbb{R}_+$, and the function
$$ f(\mathbf{X}) = X_1^2 + X_2^2 + X_3^2. $$
Also, we consider the constraint $f(\mathbf{X}) \leq c$. It is known from [15] (Corollary 4) that the DM of
$$ \mathbf{X}^w := \left(X_1^w, X_2^w, X_3^w\right) := \left\{X_j \sim \mathcal{N}(0, 1),\, j = 1, 2, 3 : X_1^2 + X_2^2 + X_3^2 \leq c\right\} $$
is given by
$$ X_2^w = R_2 Z_2 \sqrt{c - \left(X_1^w\right)^2}; \qquad X_3^w = R_3 Z_3 \sqrt{c - \left(X_1^w\right)^2}\, \sqrt{1 - Z_2^2}. $$
We can then write
$$ \frac{\partial X_2^w}{\partial X_1^w} = -X_1^w R_2 Z_2\, \frac{1}{\sqrt{c - \left(X_1^w\right)^2}} = -\frac{X_1^w X_2^w}{c - \left(X_1^w\right)^2}, $$
$$ \frac{\partial X_3^w}{\partial X_1^w} = -X_1^w R_3 Z_3\, \frac{\sqrt{1 - Z_2^2}}{\sqrt{c - \left(X_1^w\right)^2}} = -\frac{X_1^w X_3^w}{c - \left(X_1^w\right)^2}. $$
Using the above derivatives and symmetry among the inputs, the dependent Jacobian and the tensor metric are given by
$$ J^d\left(\mathbf{X}^w\right) = \begin{pmatrix} 1 & -\dfrac{X_1^w X_2^w}{c - \left(X_2^w\right)^2} & -\dfrac{X_1^w X_3^w}{c - \left(X_3^w\right)^2} \\ -\dfrac{X_1^w X_2^w}{c - \left(X_1^w\right)^2} & 1 & -\dfrac{X_2^w X_3^w}{c - \left(X_3^w\right)^2} \\ -\dfrac{X_1^w X_3^w}{c - \left(X_1^w\right)^2} & -\dfrac{X_2^w X_3^w}{c - \left(X_2^w\right)^2} & 1 \end{pmatrix}, \qquad G\left(\mathbf{X}^w\right) = J^d\left(\mathbf{X}^w\right)^T J^d\left(\mathbf{X}^w\right). $$
The following partial derivatives of f can be deduced:
$$ \frac{\partial f}{\partial \mathbf{x}^w} = \begin{pmatrix} 1 & -\dfrac{X_1^w X_2^w}{c - \left(X_1^w\right)^2} & -\dfrac{X_1^w X_3^w}{c - \left(X_1^w\right)^2} \\ -\dfrac{X_1^w X_2^w}{c - \left(X_2^w\right)^2} & 1 & -\dfrac{X_2^w X_3^w}{c - \left(X_2^w\right)^2} \\ -\dfrac{X_1^w X_3^w}{c - \left(X_3^w\right)^2} & -\dfrac{X_2^w X_3^w}{c - \left(X_3^w\right)^2} & 1 \end{pmatrix} \begin{pmatrix} 2X_1^w \\ 2X_2^w \\ 2X_3^w \end{pmatrix} = \begin{pmatrix} 2X_1^w \left(1 - \dfrac{\left(X_2^w\right)^2}{c - \left(X_1^w\right)^2} - \dfrac{\left(X_3^w\right)^2}{c - \left(X_1^w\right)^2}\right) \\ 2X_2^w \left(1 - \dfrac{\left(X_1^w\right)^2}{c - \left(X_2^w\right)^2} - \dfrac{\left(X_3^w\right)^2}{c - \left(X_2^w\right)^2}\right) \\ 2X_3^w \left(1 - \dfrac{\left(X_1^w\right)^2}{c - \left(X_3^w\right)^2} - \dfrac{\left(X_2^w\right)^2}{c - \left(X_3^w\right)^2}\right) \end{pmatrix}. $$
For given values of $X_1^w$, $X_2^w$, and $X_3^w$, and as $c \to \infty$, we can see that $\frac{\partial f}{\partial x_1^w} \to 2X_1^w$, which is exactly the partial derivative of $f$ when the inputs are independent. Note that as $c \to \infty$, the inputs become independent, as the constraint imposed on $\mathbf{X}$ is always satisfied.
Keeping in mind Equation (6), it is worth noting that the partial derivatives of $f$ can be directly derived by making use of an equivalent DM of $\mathbf{X}^w$, that is, $\left(X_2^w\right)^2 = Z_2\left(c - \left(X_1^w\right)^2\right)$, $\left(X_3^w\right)^2 = Z_3\left(c - \left(X_1^w\right)^2\right)\left(1 - Z_2\right)$, where $\left(X_1^w\right)^2 \sim B_1(c, 1/2, 2)$, $Z_2 \sim Beta(1/2, 3/2)$ and $Z_3 \sim Beta(1/2, 1)$ are independent, with $B_1$ representing the beta distribution of the first kind (see [15], Corollary 2). Indeed, we have
$$ f\left(\mathbf{X}^w\right) = \left(X_1^w\right)^2\left(1 - Z_2 - Z_3\left(1 - Z_2\right)\right) + cZ_2 + cZ_3\left(1 - Z_2\right) = \left(X_1^w\right)^2\left(1 - Z_2\right)\left(1 - Z_3\right) + cZ_2 + cZ_3\left(1 - Z_2\right); $$
and
$$ \frac{\partial f}{\partial x_1^w} = 2X_1^w\left(1 - Z_2\right)\left(1 - Z_3\right) = 2X_1^w\, \frac{c - \left(X_1^w\right)^2 - \left(X_2^w\right)^2 - \left(X_3^w\right)^2}{c - \left(X_1^w\right)^2}, $$
because
$$ Z_2 = \frac{\left(X_2^w\right)^2}{c - \left(X_1^w\right)^2}; \qquad 1 - Z_2 = \frac{c - \left(X_1^w\right)^2 - \left(X_2^w\right)^2}{c - \left(X_1^w\right)^2}; \qquad Z_3 = \frac{\left(X_3^w\right)^2}{c - \left(X_1^w\right)^2 - \left(X_2^w\right)^2}. $$
As a matter of fact, we obtain the same partial derivatives of f, keeping in mind the symmetry among the inputs.
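As a numerical cross-check (a sketch assuming NumPy; the sample point is obtained by simple rejection under the constraint, and all names are illustrative), the dependent Jacobian and the first dependent partial derivative above can be evaluated and compared with the closed-form expression:

```python
import numpy as np

rng = np.random.default_rng(2)
c = 3.0

# One sample of X^w: a standard normal vector accepted under x1^2 + x2^2 + x3^2 <= c
x = rng.standard_normal(3)
while (x**2).sum() > c:
    x = rng.standard_normal(3)

# Dependent Jacobian with off-diagonal entries J^d[i, j] = -x_i * x_j / (c - x_j^2)
Jd = np.eye(3)
for i in range(3):
    for j in range(3):
        if i != j:
            Jd[i, j] = -x[i] * x[j] / (c - x[j]**2)

grad_formal = 2 * x                            # formal gradient of f = x1^2 + x2^2 + x3^2
dep_grad = Jd.T @ grad_formal                  # dependent partial derivatives of f

# First component versus its closed form 2*x1*(c - x1^2 - x2^2 - x3^2)/(c - x1^2): both agree
print(dep_grad[0], 2 * x[0] * (c - (x**2).sum()) / (c - x[0]**2))
```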

7. Conclusions

A new approach for calculating and computing the partial derivatives, gradient, and Hessian of functions with non-independent variables is proposed and studied in this paper. It relies on (i) dependency functions that model the dependency structures among dependent variables, including correlated variables; (ii) the dependent Jacobian of the dependency functions; and (iii) the tensor metric built from the dependent Jacobian. Based on the unique tensor metric due to the first fundamental form, the unique gradient of a function with non-independent variables is provided. Since the so-called dependent partial derivatives and the dependent Jacobian do not require any additional assumption, and therefore always apply, such derivatives (including the gradient) should be used.
The results obtained depend on the parameters of the distribution function or the density function of non-independent variables. For the values of such parameters that lead to independent variables, the proposed gradient and partial derivatives reduce to the formal gradient or the gradient with respect to the Euclidean metric. In the same sense, the proposed tensor metric reduces to the Euclidean metric using the above values of the parameters of the distribution function.
Using the proposed gradient and Hessian matrix, the Taylor-type expansion of a function with non-independent variables is provided. Although the generalized inverse of a symmetric matrix is used in some cases, more investigation is needed for the gradient calculus when the tensor metric is not invertible. The proposed gradient will be used for (i) the development of the active subspaces of functions with non-independent variables, and (ii) enhancing the optimization of functions subject to constraints.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Acknowledgments

The author would like to thank the six reviewers and the associate editor for their comments and suggestions that have helped improve this paper.

Conflicts of Interest

The author declares no competing financial interests or personal relationships that could have influenced the work reported in this paper.

Appendix A. Proof of Proposition 1

For continuous random variables and prescribed $w_1, \ldots, w_{d-1}$, the Rosenblatt transform of $\mathbf{X}_{\sim j} \,|\, X_j$ is unique and strictly increasing ([16]). Therefore, the inverse of the Rosenblatt transform of $\mathbf{X}_{\sim j} \,|\, X_j$ is also unique ([17]), and we can write
$$ \mathbf{X}_{\sim j} \stackrel{d}{=} r_j\left(X_j, \mathbf{U}\right), $$
where $\mathbf{U} \sim \mathcal{U}(0, 1)^{d-1}$. For the prescribed distribution of the $d-1$ innovation variables $\mathbf{Z} = \left(Z_{w_i} \sim F_{Z_{w_i}},\, i = 1, \ldots, d-1\right)$, the above model becomes
$$ \mathbf{X}_{\sim j} \stackrel{d}{=} r_j\left(X_j, U_1, \ldots, U_{d-1}\right) \stackrel{d}{=} r_j\left(X_j, F_{Z_{w_1}}\left(Z_{w_1}\right), \ldots, F_{Z_{w_{d-1}}}\left(Z_{w_{d-1}}\right)\right) = r_j\left(X_j, \mathbf{Z}\right), $$
because $Z_{w_i} \stackrel{d}{=} F_{Z_{w_i}}^{-1}\left(U_i\right) \Longleftrightarrow U_i = F_{Z_{w_i}}\left(Z_{w_i}\right)$ for continuous variables. Thus, Point (i) holds.
For Point (ii), since $\mathbf{X}_{\sim j} \stackrel{d}{=} r_j\left(X_j, \mathbf{U}\right)$ is the inverse of the Rosenblatt transform of $\mathbf{X}_{\sim j} \,|\, X_j$, we then have the unique inverse
$$ \mathbf{U} = r_j^{-1}\left(\mathbf{X}_{\sim j} \,\middle|\, X_j\right), $$
which yields the unique inverse of the DM. Indeed,
$$ \left(F_{Z_{w_1}}\left(Z_{w_1}\right), \ldots, F_{Z_{w_{d-1}}}\left(Z_{w_{d-1}}\right)\right) = r_j^{-1}\left(\mathbf{X}_{\sim j} \,\middle|\, X_j\right) \quad \Longleftrightarrow \quad \mathbf{Z} = r_j^{-1}\left(\mathbf{X}_{\sim j} \,\middle|\, X_j\right). $$

Appendix B. Proof of Lemma 1

Using the partial derivatives $\frac{\partial r_{\pi_{i,k}}}{\partial x_{\pi_{j,k}}}\left(X_{\pi_{j,k}}, \mathbf{Z}\right)$ with $i = 1, \ldots, d_k$ given by Equation (8), and the unique inverse of $\mathbf{X}_{\sim \pi_{j,k}} = r_{\pi_{j,k}}\left(X_{\pi_{j,k}}, \mathbf{Z}_k\right)$ for any distribution of $\mathbf{Z}_k$, given by $\mathbf{Z}_k = r_{\pi_{j,k}}^{-1}\left(\mathbf{X}_{\sim \pi_{j,k}} \,\middle|\, X_{\pi_{j,k}}\right)$ (see Proposition 1), the result is immediate.

Appendix C. Proof of Theorem 2

Firstly, using the partial derivatives of $\mathbf{X}_{\pi_k}$ with respect to $X_{\pi_{j,k}}$ in (9), that is, $J^{(\pi_{j,k})} = \left(\frac{\partial X_{\pi_{1,k}}}{\partial x_{\pi_{j,k}}}, \ldots, \frac{\partial X_{\pi_{d_k,k}}}{\partial x_{\pi_{j,k}}}\right)^T$ with $\frac{\partial X_{\pi_{i,k}}}{\partial x_{\pi_{j,k}}} = \frac{\partial r_{\pi_{i,k}}}{\partial x_{\pi_{j,k}}} = J_i^{(\pi_{j,k})}$ for any $\pi_{i,k} \in \pi_k$, the reciprocal rule allows us to write
$$ \frac{\partial X_{\pi_{j,k}}}{\partial X_{\pi_{i,k}}} = \frac{1}{\dfrac{\partial X_{\pi_{i,k}}}{\partial X_{\pi_{j,k}}}} = \frac{1}{J_i^{(\pi_{j,k})}}. $$
Applying the chain rule yields
$$ \frac{\partial X_{\pi_{i_1,k}}}{\partial X_{\pi_{i_2,k}}} = \frac{\partial X_{\pi_{i_1,k}}}{\partial X_{\pi_{j,k}}}\, \frac{\partial X_{\pi_{j,k}}}{\partial X_{\pi_{i_2,k}}} = \frac{J_{i_1}^{(\pi_{j,k})}}{J_{i_2}^{(\pi_{j,k})}}. $$
Thus, the partial derivatives of $\mathbf{X}_{\pi_k}$ with respect to $X_{\pi_{i,k}}$ are given by
$$ \frac{\partial \mathbf{X}_{\pi_k}}{\partial X_{\pi_{i,k}}} := \left(\frac{J_1^{(\pi_{j,k})}}{J_i^{(\pi_{j,k})}}, \ldots, \frac{J_{d_k}^{(\pi_{j,k})}}{J_i^{(\pi_{j,k})}}\right)^T = \frac{J^{(\pi_{j,k})}}{J_i^{(\pi_{j,k})}}, $$
and the actual Jacobian matrix of $\mathbf{X}_{\pi_k}$ (i.e., $\frac{\partial^a \mathbf{X}_{\pi_k}}{\partial \mathbf{x}_{\pi_k}}$) is given by
$$ J_{\pi_k}^a := \left[\frac{J^{(\pi_{j,k})}}{J_1^{(\pi_{j,k})}} \;\; \cdots \;\; \frac{J^{(\pi_{j,k})}}{J_{d_k}^{(\pi_{j,k})}}\right]. $$
Secondly, under the organization of the input variables (O), and for every pair $k_1, k_2 \in \{1, \ldots, K\}$ with $k_1 \neq k_2$, we have the following cross-Jacobian matrices:
$$ \frac{\partial \mathbf{X}_{\pi_{k_1}}}{\partial \mathbf{x}_{\pi_{k_2}}} = O_{d_{k_1} \times d_{k_2}}; \qquad J_{\pi_1} := \frac{\partial \mathbf{X}_{\pi_1}}{\partial \mathbf{x}_{\pi_1}} = I_{d_1 \times d_1}. $$
Therefore, the actual Jacobian matrix of $\mathbf{X}$ is given by
$$ J^a := \begin{pmatrix} \dfrac{\partial \mathbf{X}_{\pi_1}}{\partial \mathbf{x}_{\pi_1}} & \dfrac{\partial \mathbf{X}_{\pi_1}}{\partial \mathbf{x}_{\pi_2}} & \cdots & \dfrac{\partial \mathbf{X}_{\pi_1}}{\partial \mathbf{x}_{\pi_K}} \\ \dfrac{\partial \mathbf{X}_{\pi_2}}{\partial \mathbf{x}_{\pi_1}} & \dfrac{\partial \mathbf{X}_{\pi_2}}{\partial \mathbf{x}_{\pi_2}} & \cdots & \dfrac{\partial \mathbf{X}_{\pi_2}}{\partial \mathbf{x}_{\pi_K}} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial \mathbf{X}_{\pi_K}}{\partial \mathbf{x}_{\pi_1}} & \dfrac{\partial \mathbf{X}_{\pi_K}}{\partial \mathbf{x}_{\pi_2}} & \cdots & \dfrac{\partial \mathbf{X}_{\pi_K}}{\partial \mathbf{x}_{\pi_K}} \end{pmatrix} = \begin{pmatrix} I_{d_1 \times d_1} & O & \cdots & O \\ O & J_{\pi_2}^a & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & J_{\pi_K}^a \end{pmatrix}. $$
Finally, using the formal gradient of $f$ in (7), that is, $\nabla f := \left(f_{\mathbf{x}_{\pi_1}}^T, f_{\mathbf{x}_{\pi_2}}^T, \ldots, f_{\mathbf{x}_{\pi_K}}^T\right)^T$, and bearing in mind the cyclic rule, we can write
$$ \frac{\partial f}{\partial \mathbf{x}_{\pi_1}} = \left(\nabla f^T \frac{\partial \mathbf{X}}{\partial \mathbf{x}_{\pi_1}}\right)^T = f_{\mathbf{x}_{\pi_1}}, \qquad \frac{\partial f}{\partial \mathbf{x}_{\pi_k}} = \left(\nabla f^T \frac{\partial \mathbf{X}}{\partial \mathbf{x}_{\pi_k}}\right)^T = \left(f_{\mathbf{x}_{\pi_k}}^T J_{\pi_k}^a\right)^T, $$
and the actual partial derivatives of $f$ are then given by $\frac{\partial^a f}{\partial \mathbf{x}} = J^a(\mathbf{x})^T\, \nabla f(\mathbf{x})$.

Appendix D. Proof of Theorem 3

For Point (i), building the $d_k$ dependency functions for every explanatory input $X_{\pi_{i,k}}$ with $i = 1, \ldots, d_k$, the partial derivatives of $\mathbf{X}_{\pi_k}$ with respect to $X_{\pi_{i,k}}$ evaluated at $\mathbf{x}_{\pi_k}$ are given by (see Equation (9))
$$ J^{(\pi_{i,k})}\left(\mathbf{x}_{\pi_k}\right) = J^{(\pi_{i,k})}\left(x_{\pi_{i,k}},\, r_{\pi_{i,k}}^{-1}\left(\mathbf{x}_{\sim \pi_{i,k}} \,\middle|\, x_{\pi_{i,k}}\right)\right); \qquad i = 1, \ldots, d_k. $$
Points (ii)–(iii) are similar to those of Theorem 2 using the dependent Jacobian matrix given by Point (i).

Appendix E. Proof of Theorem 4

First, using Equation (15), given by $\frac{\partial f}{\partial \mathbf{x}}(\mathbf{x}) := J^d(\mathbf{x})^T \nabla f(\mathbf{x})$, we can extract
$$ \frac{\partial f}{\partial \mathbf{x}_{\pi_1}}(\mathbf{x}) = f_{\mathbf{x}_{\pi_1}}(\mathbf{x}); \qquad \frac{\partial f}{\partial \mathbf{x}_{\pi_k}}(\mathbf{x}) = J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right)^T f_{\mathbf{x}_{\pi_k}}(\mathbf{x}). $$
By applying the vector-by-vector derivatives of $\frac{\partial f}{\partial \mathbf{x}_{\pi_1}}(\mathbf{x})$ with respect to $\mathbf{x}_{\pi_1}$ and $\mathbf{x}_{\pi_k}$, we have
$$ \frac{\partial^2 f}{\partial^2 \mathbf{x}_{\pi_1}}(\mathbf{x}) = \frac{\partial f_{\mathbf{x}_{\pi_1}}}{\partial \mathbf{x}_{\pi_1}}(\mathbf{x}) = H_{\pi_1}(\mathbf{x}); \qquad \frac{\partial^2 f(\mathbf{x})}{\partial \mathbf{x}_{\pi_1} \partial \mathbf{x}_{\pi_k}} = \frac{\partial f_{\mathbf{x}_{\pi_1}}(\mathbf{x})}{\partial \mathbf{x}}\, \frac{\partial \mathbf{X}}{\partial \mathbf{x}_{\pi_k}} = H_{\pi_1, \pi_k}(\mathbf{x})\, J_{\pi_k}^d, $$
as $\mathbf{X}_{\pi_1}$ is a vector of independent variables and $\frac{\partial \mathbf{X}}{\partial \mathbf{x}_{\pi_k}} = \left(O_{d_1 \times d_k}^T, \ldots, (J_{\pi_k}^d)^T, \ldots, O_{d_K \times d_k}^T\right)^T$, considering the dependent Jacobian matrix provided in Equation (14).
In the same sense, the derivatives of $\frac{\partial f}{\partial \mathbf{x}_{\pi_k}}(\mathbf{x})$ with respect to $\mathbf{x}_{\pi_1}$ and $\mathbf{x}_{\pi_\ell}$ with $\ell \neq k$ are
$$ \frac{\partial^2 f}{\partial \mathbf{x}_{\pi_k} \partial \mathbf{x}_{\pi_1}} = \frac{\partial}{\partial \mathbf{x}_{\pi_1}}\left(J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right)^T f_{\mathbf{x}_{\pi_k}}(\mathbf{x})\right) = J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right)^T \frac{\partial f_{\mathbf{x}_{\pi_k}}(\mathbf{x})}{\partial \mathbf{x}}\, \frac{\partial \mathbf{X}}{\partial \mathbf{x}_{\pi_1}} = J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right)^T H_{\pi_k, \pi_1}(\mathbf{x}), $$
$$ \frac{\partial^2 f}{\partial \mathbf{x}_{\pi_k} \partial \mathbf{x}_{\pi_\ell}} = \frac{\partial}{\partial \mathbf{x}_{\pi_\ell}}\left(J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right)^T f_{\mathbf{x}_{\pi_k}}(\mathbf{x})\right) = J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right)^T \frac{\partial f_{\mathbf{x}_{\pi_k}}(\mathbf{x})}{\partial \mathbf{x}}\, \frac{\partial \mathbf{X}}{\partial \mathbf{x}_{\pi_\ell}} = J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right)^T H_{\pi_k, \pi_\ell}(\mathbf{x})\, J_{\pi_\ell}^d\left(\mathbf{x}_{\pi_\ell}\right). $$
Finally, we have to derive the quantity $\frac{\partial^2 f}{\partial^2 \mathbf{x}_{\pi_k}} = \frac{\partial}{\partial \mathbf{x}_{\pi_k}}\left(J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right)^T f_{\mathbf{x}_{\pi_k}}(\mathbf{x})\right)$. For each $\pi_{\ell,k} \in \pi_k$, we can write
$$
\begin{aligned}
\frac{\partial^2 f}{\partial \mathbf{x}_{\pi_k} \partial x_{\pi_{\ell,k}}}
&= \frac{\partial}{\partial x_{\pi_{\ell,k}}}\left(J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right)^T f_{\mathbf{x}_{\pi_k}}(\mathbf{x})\right)
= \frac{\partial J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right)^T}{\partial x_{\pi_{\ell,k}}}\, f_{\mathbf{x}_{\pi_k}}(\mathbf{x}) + J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right)^T \frac{\partial f_{\mathbf{x}_{\pi_k}}(\mathbf{x})}{\partial x_{\pi_{\ell,k}}} \\
&= \left[\frac{\partial J^{(\pi_{1,k})}}{\partial x_{\pi_{\ell,k}}} \;\cdots\; \frac{\partial J^{(\pi_{d_k,k})}}{\partial x_{\pi_{\ell,k}}}\right]^T f_{\mathbf{x}_{\pi_k}}(\mathbf{x}) + J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right)^T \frac{\partial f_{\mathbf{x}_{\pi_k}}(\mathbf{x})}{\partial \mathbf{x}}\, \frac{\partial \mathbf{X}}{\partial x_{\pi_{\ell,k}}} \\
&= \left[J_1^{(\pi_{\ell,k})} \frac{\partial J^{(\pi_{1,k})}}{\partial x_{\pi_{1,k}}} \;\cdots\; J_{d_k}^{(\pi_{\ell,k})} \frac{\partial J^{(\pi_{d_k,k})}}{\partial x_{\pi_{d_k,k}}}\right]^T f_{\mathbf{x}_{\pi_k}}(\mathbf{x}) + J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right)^T H_{\pi_k}(\mathbf{x})\, J^{(\pi_{\ell,k})}\left(\mathbf{x}_{\pi_k}\right),
\end{aligned}
$$
because for all $i \in \{1, \ldots, d_k\}$, we can write (thanks to the chain rule)
$$ \frac{\partial J^{(\pi_{i,k})}}{\partial x_{\pi_{\ell,k}}} := \left(\frac{\partial^2 X_{\pi_{1,k}}}{\partial^2 x_{\pi_{i,k}}}\, \frac{\partial x_{\pi_{i,k}}}{\partial x_{\pi_{\ell,k}}}, \ldots, \frac{\partial^2 X_{\pi_{d_k,k}}}{\partial^2 x_{\pi_{i,k}}}\, \frac{\partial x_{\pi_{i,k}}}{\partial x_{\pi_{\ell,k}}}\right)^T = \frac{\partial x_{\pi_{i,k}}}{\partial x_{\pi_{\ell,k}}}\, \frac{\partial J^{(\pi_{i,k})}}{\partial x_{\pi_{i,k}}} = J_i^{(\pi_{\ell,k})}\, \frac{\partial J^{(\pi_{i,k})}}{\partial x_{\pi_{i,k}}}. $$
Re-organizing the first element of the right-hand terms in the above equation yields
$$ \frac{\partial^2 f}{\partial \mathbf{x}_{\pi_k} \partial x_{\pi_{\ell,k}}} = \mathrm{diag}\left(\left[\frac{\partial J^{(\pi_{1,k})}}{\partial x_{\pi_{1,k}}} \;\cdots\; \frac{\partial J^{(\pi_{d_k,k})}}{\partial x_{\pi_{d_k,k}}}\right]^T f_{\mathbf{x}_{\pi_k}}(\mathbf{x})\right) J^{(\pi_{\ell,k})} + J_{\pi_k}^d\left(\mathbf{x}_{\pi_k}\right)^T H_{\pi_k}(\mathbf{x})\, J^{(\pi_{\ell,k})}. $$
By running $\ell = 1, \ldots, d_k$, we obtain the result.

References

  1. Bobkov, S. Isoperimetric and Analytic Inequalities for Log-Concave Probability Measures. Ann. Probab. 1999, 27, 1903–1921. [Google Scholar] [CrossRef]
  2. Roustant, O.; Barthe, F.; Iooss, B. Poincaré inequalities on intervals-application to sensitivity analysis. Electron. J. Statist. 2017, 11, 3081–3119. [Google Scholar] [CrossRef]
  3. Lamboni, M.; Kucherenko, S. Multivariate sensitivity analysis and derivative-based global sensitivity measures with dependent variables. Reliab. Eng. Syst. Saf. 2021, 212, 107519. [Google Scholar] [CrossRef]
  4. Lamboni, M. Weak derivative-based expansion of functions: ANOVA and some inequalities. Math. Comput. Simul. 2022, 194, 691–718. [Google Scholar] [CrossRef]
  5. Russi, T.M. Uncertainty Quantification with Experimental Data and Complex System Models; University of California: Berkeley, CA, USA, 2010. [Google Scholar]
  6. Constantine, P.; Dow, E.; Wang, S. Active subspace methods in theory and practice: Applications to kriging surfaces. SIAM J. Sci. Comput. 2014, 36, 1500–1524. [Google Scholar] [CrossRef]
  7. Zhang, W.; Ge, S.S. A global Implicit Function Theorem without initial point and its applications to control of non-affine systems of high dimensions. J. Math. Anal. Appl. 2006, 313, 251–261. [Google Scholar] [CrossRef]
  8. Cristea, M. On global implicit function theorem. J. Math. Anal. Appl. 2017, 456, 1290–1302. [Google Scholar] [CrossRef]
  9. Jost, J.J. Riemannian Geometry and Geometric Analysis; Springer: Berlin, Heidelberg, Germany, 2011; Volume 6, pp. 1–611. [Google Scholar]
  10. Petersen, P. Riemannian Geometry; Springer International Publishing AG: Berlin, Heidelberg, Germany, 2016. [Google Scholar]
  11. Sommer, S.; Fletcher, T.; Pennec, X. Introduction to differential and Riemannian geometry. In Riemannian Geometric Statistics in Medical Image Analysis; Elsevier: Amsterdam, The Netherlands, 2020; pp. 3–37. [Google Scholar]
  12. MITOpenCourseWare. Non-Independent Variables; Open Course; MIT Institute: Cambridge, MA, USA, 2007. [Google Scholar]
  13. Skorohod, A.V. On a representation of random variables. Theory Probab. Appl. 1976, 21, 645–648. [Google Scholar]
  14. Lamboni, M. On dependency models and dependent generalized sensitivity indices. arXiv 2021, arXiv:2104.12938. [Google Scholar]
  15. Lamboni, M. Efficient dependency models: Simulating dependent random variables. Math. Comput. Simul. MATCOM 2022, 200, 199–217. [Google Scholar] [CrossRef]
  16. Rosenblatt, M. Remarks on a Multivariate Transformation. Ann. Math. Statist. 1952, 23, 470–472. [Google Scholar] [CrossRef]
  17. O’Brien, G.L. The Comparison Method for Stochastic Processes. Ann. Probab. 1975, 3, 80–88. [Google Scholar] [CrossRef]
  18. Lamboni, M. On Exact Distribution for Multivariate Weighted Distributions and Classification. Methodol. Comput. Appl. Probab. 2023, 25, 41. [Google Scholar] [CrossRef]
  19. Robert, C.P. From Prior Information to Prior Distributions. In The Bayesian Choice: From Decision-Theoretic Foundations to Computational Implementation; Springer: New York, NY, USA, 2007; pp. 105–163. [Google Scholar]
  20. Durante, F.; Ignazzi, C.; Jaworski, P. On the class of truncation invariant bivariate copulas under constraints. J. Math. Anal. Appl. 2022, 509, 125898. [Google Scholar] [CrossRef]
  21. Rosenblatt, M. Remarks on some nonparametric estimates of a density function. Ann. Math. Stat. 1956, 27, 832–837. [Google Scholar] [CrossRef]
  22. Parzen, E. On estimation of a probability density function and mode. Ann. Math. Stat. 1962, 33, 1065–1076. [Google Scholar] [CrossRef]
  23. Epanechnikov, V. Nonparametric estimation of a multidimensional probability density. Theory Probab. Appl. 1969, 14, 153–158. [Google Scholar] [CrossRef]
  24. McNeil, A.J.; Frey, R.; Embrechts, P. Quantitative Risk Management; Princeton University Press: Princeton, NJ, USA, 2015. [Google Scholar]
  25. Durante, F.; Sempi, C. Principles of copula theory; CRC/Chapman & Hall: London, UK, 2015. [Google Scholar]
  26. Moore, E. On the Reciprocal of the General Algebraic Matrix. Bull. Am. Math. Soc. 1920, 26, 394–395. [Google Scholar]
  27. Moore, E. General analysis, Part 1. Mem. Amer. Phil. Soc. 1935, 1, 97–209. [Google Scholar]
  28. Penrose, R. A generalized inverse for matrices. Proc. Cambrid. Phil. Soc. 1955, 51, 406–413. [Google Scholar] [CrossRef]
  29. Rund, H. Differential-geometric and variational background of classical gauge field theories. Aequationes Math. 1982, 24, 121–174. [Google Scholar] [CrossRef]
  30. Vincze, C. On the extremal compatible linear connection of a generalized Berwald manifold. Aequationes Math. 2022, 96, 53–70. [Google Scholar] [CrossRef]
  31. YiHua, D. Rigid properties for gradient generalized m-quasi-Einstein manifolds and gradient shrinking Ricci solitons. J. Math. Anal. Appl. 2023, 518, 126702. [Google Scholar] [CrossRef]