3.1. The First-Order Comprehensive Adjoint Sensitivity Analysis Methodology for Nonlinear Systems (1st-CASAM-N)
The model and boundary parameters
are considered to be uncertain quantities, having unknown true values. The nominal (or mean) parameter values
are considered to be known, and these will differ from the true values by quantities denoted as
, where
. Since the forward state functions
are related to the model and boundary parameters
through Equations (1) and (2), it follows that the variations
in the model and boundary parameters will cause corresponding variations
around the nominal solution
in the forward state functions. In turn, the variations
and
will induce variations in the system’s response. Cacuci [1,2] has shown that the most general definition of the sensitivity of an operator-valued model response
, where
, to variations
in the model parameters and state functions in a neighborhood around the nominal functions and parameter values
, is given by the first-order Gateaux (G) variation, which will be denoted as
and is defined as follows:
for a scalar
and for all (i.e., arbitrary) vectors
in a neighborhood
around
. The G variation
is an operator defined on the same domain as
and has the same range as
. The G variation
satisfies the relation:
, with
. The existence of the G variation
does not guarantee its numerical computability. Numerical methods most often require that
is linear in
in a neighborhood
around
. Formally, the necessary and sufficient conditions for the G variation
of a nonlinear operator
to be linear and continuous in
in a neighborhood
around
and therefore admit a total first-order G derivative, are as follows:
- (i)
satisfies a weak Lipschitz condition at
, namely:
- (ii)
satisfies the following condition
In practice, the relations provided in Equations (11) and (12) are seldom used directly since the computation of the expression on the right side of Equation (10) reveals whether the respective expression is linear (or not) in and, hence, in . Numerical methods (e.g., Newton’s method, and variants thereof) for solving Equations (1) and (2) also require the existence of the first-order G derivatives of the original model, in which case the components of the operators which appear in these equations must also satisfy the conditions provided in Equations (11) and (12). Therefore, the conditions provided in Equations (11) and (12) are henceforth considered to be satisfied by the operators which underlie the physical system modeled by Equations (1) and (2), as well as by the model responses.
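The distinction between the existence of the G variation and its linearity in the variations can be illustrated numerically. The following is a minimal sketch, assuming two hypothetical scalar responses of a two-component argument (neither is taken from this work): a smooth one, whose G variation is additive in the variations, and a degree-one homogeneous one, whose G variation exists at the origin for every direction but is not additive. The function names, the step size, and the central-difference estimate of the derivative with respect to the scalar are all illustrative assumptions.

```python
import numpy as np

def g_variation(response, e0, h, eps=1e-6):
    """Central-difference estimate of d/d(eps) response(e0 + eps*h) at eps = 0."""
    e0, h = np.asarray(e0, float), np.asarray(h, float)
    return (response(e0 + eps * h) - response(e0 - eps * h)) / (2.0 * eps)

def smooth_response(e):
    """Hypothetical smooth response; its G variation is linear in h."""
    return e[0] ** 2 * e[1] + np.sin(e[1])

def homogeneous_response(e):
    """Hypothetical response with a G variation at the origin that is not linear in h."""
    r2 = e[0] ** 2 + e[1] ** 2
    return 0.0 if r2 == 0.0 else e[0] ** 3 / r2

h1, h2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

# Smooth case: the variation along h1 + h2 equals the sum of the variations.
e_nom = np.array([1.0, 2.0])
smooth_sum = g_variation(smooth_response, e_nom, h1) + g_variation(smooth_response, e_nom, h2)
smooth_joint = g_variation(smooth_response, e_nom, h1 + h2)

# Homogeneous case at the origin: every directional variation exists,
# but additivity fails, so no total G derivative exists there.
origin = np.zeros(2)
homog_sum = (g_variation(homogeneous_response, origin, h1)
             + g_variation(homogeneous_response, origin, h2))
homog_joint = g_variation(homogeneous_response, origin, h1 + h2)

print("smooth:      joint - sum =", smooth_joint - smooth_sum)
print("homogeneous: joint - sum =", homog_joint - homog_sum)
```

In this sketch, the additivity check plays the role of evaluating the right side of the defining expression and inspecting whether the result is linear in the variations.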
When the first-order G differential
satisfies the conditions provided in Equations (11) and (12), it can be written in the following form:
where
Thus, the quantities and in Equation (13) denote the partial G derivatives of with respect to and , evaluated at the nominal parameter values (and hence also nominal values of the state functions). The notation will be used in this work to indicate that the quantity enclosed within the bracket is to be evaluated at the respective nominal parameter and state-function values. The quantity is called the “direct-effect term” because it arises directly from parameter variations . The direct-effect term can be computed once the nominal values are available. The quantity is called the “indirect-effect term” because it arises indirectly, through the variations in the state functions (which are caused through the model by parameter variations). The indirect-effect term can be quantified only after having determined the variations in terms of the variations .
In particular, the first-order Gateaux differential
of the generic response
defined in Equation (9) has the following expression:
where the “indirect-effect” term
comprises all dependencies on
and is defined as follows:
and where the “direct-effect” term
comprises all dependencies on
and is defined as follows:
with
The first-order relationship between the vectors
and
is determined by taking the G differentials of Equations (1) and (2). Thus, applying the definition of the G differential to Equations (1) and (2) yields the following:
Carrying out the differentiations with respect to
in Equations (20) and (21), and setting
in the resulting expressions yields the following:
In Equations (22) and (23), the superscript “(1)” indicates “1st-Level” and the various quantities which appear in these are defined as follows:
The system comprising Equations (22) and (23) is called the “1st-Level Variational Sensitivity System” (1st-LVSS). In order to determine the solutions of the 1st-LVSS that would correspond to every parameter variation
,
, the 1st-LVSS would need to be solved
times, with distinct right sides for each
, as follows:
Subsequently, the solutions
could be used, in turn, in Equation (17) to compute the indirect-effect term corresponding to each parameter variation
,
, to obtain the following contribution from the indirect-effect term to the respective partial sensitivity of the response with respect to the parameter
:
Adding the contribution from the indirect-effect term obtained in Equation (31) to the contribution from the direct-effect term provided in Equation (18) yields the following expression for the sensitivity (i.e., partial G derivative)
of the response
to the parameter
,
:
The quantities are independent of the parameter variations and represent the first-order partial sensitivities (first-order partial G derivatives) of the response with respect to each of the model parameters , , evaluated at the nominal values . Computing the response sensitivities by using the solutions , , of the 1st-LVSS requires large-scale forward computations in order to determine the functions , . Since most problems of practical interest are characterized by many parameters (i.e., has many components) and comparatively few responses, it becomes prohibitively expensive to solve repeatedly the 1st-LVSS in order to determine the functions , . Even though the 1st-LVSS contains first-order parameter and state-functions variations, it is called “first-level” (rather than “first-order”) in anticipation of determining second-order sensitivities, which will use “second-level” forward and adjoint systems. These “second-level” systems will not be called “second-order” because they will not contain second-order parameter and/or state-function variations, but will also contain only first-order variations, even though they will be used for determining second-order sensitivities. Similar terminology, i.e., “third-level” (as opposed to “third-order”) forward/adjoint systems, will be used for determining the third-order sensitivities, and so on.
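The cost of the forward (variational) route can be seen on a small discrete analogue. The following is a minimal sketch, assuming a hypothetical five-component algebraic model that is nonlinear in the state and depends on three parameters, together with a hypothetical scalar response; none of these definitions comes from this work. The linearized system is solved once per parameter, which mirrors the repeated solution of the 1st-LVSS with a distinct right side for each parameter variation.

```python
import numpy as np

n = 5
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # fixed coupling (assumed)
s = np.linspace(1.0, 2.0, n)                             # fixed source shape (assumed)

def residual(u, alpha):
    """Hypothetical discrete model N(u, alpha) = 0, nonlinear in u."""
    return alpha[0] * u + alpha[1] * u**3 + K @ u - alpha[2] * s

def jac_state(u, alpha):
    """dN/du: operator of the linearized (variational) system."""
    return alpha[0] * np.eye(n) + 3.0 * alpha[1] * np.diag(u**2) + K

def jac_params(u, alpha):
    """dN/dalpha: one column per parameter."""
    return np.column_stack([u, u**3, -s])

def solve_model(alpha, tol=1e-13, max_iter=50):
    """Newton iteration for the nominal state."""
    u = np.zeros(n)
    for _ in range(max_iter):
        r = residual(u, alpha)
        if np.linalg.norm(r) < tol:
            break
        u = u - np.linalg.solve(jac_state(u, alpha), r)
    return u

def response(u, alpha):
    """Hypothetical scalar response with explicit parameter dependence."""
    return alpha[2] * np.sum(u**2)

def forward_sensitivities(alpha):
    u = solve_model(alpha)
    J = jac_state(u, alpha)
    dN_da = jac_params(u, alpha)
    dR_du = 2.0 * alpha[2] * u
    dR_da = np.array([0.0, 0.0, np.sum(u**2)])   # direct-effect contribution
    sens = np.empty(alpha.size)
    for j in range(alpha.size):                   # one linear solve per parameter
        v_j = np.linalg.solve(J, -dN_da[:, j])    # state variation for parameter j
        sens[j] = dR_da[j] + dR_du @ v_j          # direct + indirect effect
    return sens

alpha_nom = np.array([1.0, 0.5, 2.0])
sens_forward = forward_sensitivities(alpha_nom)
print(sens_forward)
```

With three parameters the loop is cheap, but the number of linear solves grows in proportion to the number of parameters, which is the scaling that motivates the adjoint alternative.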
In most practical situations, the number of model parameters significantly exceeds the number of functional responses of interest, i.e.,
, so it would be advantageous to perform just
(rather than
) computations. The goal of the “1st-order comprehensive adjoint sensitivity analysis methodology for nonlinear systems (1st-CASAM-N)” is to compute exactly and efficiently the “indirect effect term” defined in Equation (17) without needing to compute explicitly the vectors
,
. The qualifier “comprehensive” is meant to indicate that the 1st-CASAM-N considers that the internal and external boundaries
of the phase-space domain depend on the uncertain model parameters
and are thereby imprecisely known, subject to uncertainties. Thus, the 1st-CASAM-N represents a generalization of the pioneering works by Cacuci [1,2] that conceived the “adjoint sensitivity analysis methodology”, in which the domain boundary was considered to be perfectly well known, free of uncertainties. The fundamental ideas underlying the 1st-CASAM-N are as conceived by Cacuci [1,2], aiming at eliminating the appearance of the vectors
from the expression of the indirect-effect term defined in Equation (17). This elimination is achieved by expressing the right side of Equation (17) in terms of the solutions of the “1st-Level Adjoint Sensitivity System (1st-LASS)”, the construction of which requires the introduction of adjoint operators. Adjoint operators can be defined in Banach spaces but are most useful in Hilbert spaces. Since real Hilbert spaces provide the natural mathematical setting for computational purposes, the derivations presented in this section are set in real (as opposed to complex) Hilbert spaces, without affecting the generality of the concepts presented herein. Thus, the spaces
and
are henceforth considered to be self-dual Hilbert spaces and will be denoted as
. The inner product of two vectors
and
will be denoted as
, and is defined as follows:
where the dot indicates the “scalar product of two vectors” defined as follows:
. It is important to note that the inner product defined in Equation (33) is continuous in
, i.e., it holds at any particular value of
, including at the nominal parameter values
.
The construction of the 1st-LASS commences by noting that the vector
itself is independent of the index
, since
The next step is to form the inner product of Equation (22) with a vector
, where the superscript “(1)” indicates “1st-Level”:
Using the definition of the adjoint operator in
, the left side of Equation (35) is transformed as follows:
where
denotes the associated bilinear concomitant evaluated on the space/time domain’s boundary
and where
is the operator adjoint to
, i.e.,
. The symbol
will be used in this work to indicate “adjoint” operator. In certain situations, it might be computationally advantageous to include certain boundary components of
into the components of
.
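The relation between an operator, its formal adjoint, and the bilinear concomitant can be checked on a one-dimensional example. The following is a minimal sketch, assuming the hypothetical operator d/dx acting on polynomials over the interval [0, 1], with the inner product taken as the integral of the product of two functions; the polynomials and the interval are illustrative choices, not quantities from this work. Integration by parts gives the formal adjoint as -d/dx, and the concomitant as the boundary values of the product of the two functions.

```python
from numpy.polynomial import Polynomial

# Hypothetical "state variation" v(x) and "adjoint function" a(x) on [0, 1].
v = Polynomial([1.0, -2.0, 0.5, 3.0])
a = Polynomial([0.5, 1.0, -1.5])

def inner(f, g, lo=0.0, hi=1.0):
    """Inner product <f, g> = integral of f(x) g(x) over [lo, hi], exact for polynomials."""
    antiderivative = (f * g).integ()
    return antiderivative(hi) - antiderivative(lo)

lhs = inner(a, v.deriv())                        # <a, L v> with L = d/dx
adjoint_term = inner(-a.deriv(), v)              # <L* a, v> with L* = -d/dx
concomitant = a(1.0) * v(1.0) - a(0.0) * v(0.0)  # boundary term [a v] from 0 to 1

print(lhs, adjoint_term + concomitant)
```

In this example the concomitant contains the unknown boundary value of v at x = 1 unless a(1) is set to zero, which illustrates why the adjoint boundary conditions are chosen to remove unknown boundary values of the state variations.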
The domain of
is determined by selecting appropriate adjoint boundary and/or initial conditions, which will be denoted in operator form as:
The above boundary conditions for are usually inhomogeneous, i.e., , and are obtained by imposing the following requirements:
1. They must be independent of unknown values of and ;
2. The substitution of the boundary and/or initial conditions represented by Equations (23) and (37) into the expression of must cause all terms containing unknown values of to vanish.
Constructing the adjoint initial and/or boundary conditions for as described above and implementing them together with the variational boundary and/or initial conditions represented by Equation (23) into Equation (35) reduces the bilinear concomitant to a quantity that will contain boundary terms involving only known values of , , , and ; this quantity will be denoted by . In general, the boundary terms represented by do not vanish automatically. In certain cases, however, may vanish automatically or it may be forced to vanish by considering appropriately constructed extensions of the adjoint operator ; however, such extensions are seldom needed in practice. Since is linear in , it can be expressed in the following form:
Implementing the forward and adjoint boundary and/or initial conditions given in Equations (23) and (37) into Equation (36) transforms the latter into the following relation:
Replacing the quantity
in the first term on the right side of Equation (38) by the right side of Equation (22) yields the following relation:
The definition of the function
will now be completed by requiring that the left side of Equation (39) is the same as the indirect-effect term defined in Equation (17), which is achieved by imposing the following relationship:
while satisfying the adjoint boundary conditions represented by Equation (37). The subscript “A” attached to the source term on the right side of Equation (40) indicates “adjoint”. Since the source
may contain distributions (e.g., Dirac delta functions and derivatives thereof), the equality in Equation (40) is considered to hold in the weak sense. The well-known Riesz representation theorem ensures that the relationship in Equation (40) holds uniquely.
The results obtained in Equations (38)–(40) are now replaced in Equation (17) to obtain the following expression of the indirect-effect term as a function of
:
where, for each
, the contribution of the indirect-effect term to the sensitivity of the response with respect to the parameter
is given by
As the identity on the right side of Equation (41) indicates, the desired elimination of all unknown values of from the expression of the indirect-effect term is accomplished. Instead of depending on , the indirect-effect term now depends on the adjoint function .
Replacing in Equation (16) the result obtained in Equation (41) together with the expression provided in Equation (18) for the direct-effect term yields the following expression:
The expressions of the first-order response sensitivities of the response with respect to the parameters are obtained by identifying in Equation (43) the quantities that multiply the respective parameter variations , . This identification yields the following expressions for the first-order response sensitivities , , computed at the model’s nominal parameter and state function values:
As indicated by Equation (44), each of the first-order sensitivities of the response with respect to the model parameters (including boundary and initial conditions) can be computed inexpensively after having obtained the function , using just quadrature formulas to evaluate the various inner products involving in the expression of the indirect-effect term obtained in Equation (42). The function is obtained by solving numerically Equations (37) and (40), which is the only large-scale computation needed for obtaining all of the first-order sensitivities. Equations (37) and (40) will be called the first-level adjoint sensitivity system (1st-LASS), and its solution, , will be called the first-level adjoint function. It is very important to note that the 1st-LASS is independent of parameter variation , , and therefore needs to be solved only once, regardless of the number of model parameters under consideration. Furthermore, since Equation (40) is linear in , solving it requires less computational effort than solving the original Equation (1), which is nonlinear in .
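The single-solve property of the adjoint route can be illustrated on a small discrete analogue. The following is a minimal sketch, assuming a hypothetical five-component algebraic model that is nonlinear in the state and depends on three parameters, with a hypothetical scalar response; the model, the response, and all names are illustrative assumptions rather than the systems treated in this work. One linear solve with the transposed Jacobian yields the discrete counterpart of the first-level adjoint function, after which every sensitivity follows from an inner product.

```python
import numpy as np

n = 5
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # fixed coupling (assumed)
s = np.linspace(1.0, 2.0, n)                             # fixed source shape (assumed)

def residual(u, alpha):
    """Hypothetical discrete model N(u, alpha) = 0, nonlinear in u."""
    return alpha[0] * u + alpha[1] * u**3 + K @ u - alpha[2] * s

def jac_state(u, alpha):
    """dN/du, evaluated at the nominal state."""
    return alpha[0] * np.eye(n) + 3.0 * alpha[1] * np.diag(u**2) + K

def jac_params(u, alpha):
    """dN/dalpha: one column per parameter."""
    return np.column_stack([u, u**3, -s])

def solve_model(alpha, tol=1e-13, max_iter=50):
    """Newton iteration for the nominal state (nonlinear solve)."""
    u = np.zeros(n)
    for _ in range(max_iter):
        r = residual(u, alpha)
        if np.linalg.norm(r) < tol:
            break
        u = u - np.linalg.solve(jac_state(u, alpha), r)
    return u

def response(u, alpha):
    """Hypothetical scalar response with explicit parameter dependence."""
    return alpha[2] * np.sum(u**2)

def adjoint_sensitivities(alpha):
    u = solve_model(alpha)
    dR_du = 2.0 * alpha[2] * u
    dR_da = np.array([0.0, 0.0, np.sum(u**2)])        # direct-effect contribution
    # Single linear adjoint solve, independent of the number of parameters.
    a1 = np.linalg.solve(jac_state(u, alpha).T, dR_du)
    # Indirect effect for all parameters at once: -(dN/dalpha)^T a1.
    return dR_da - jac_params(u, alpha).T @ a1

alpha_nom = np.array([1.0, 0.5, 2.0])
sens_adjoint = adjoint_sensitivities(alpha_nom)
print(sens_adjoint)
```

The adjoint solve is linear even though the forward model is nonlinear, and its cost does not depend on how many parameters are differentiated.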
3.2. The Second-Order Comprehensive Adjoint Sensitivity Analysis Methodology for Nonlinear Systems (2nd-CASAM-N)
The 2nd-CASAM-N relies on the same fundamental concepts as introduced in [14], but the 2nd-CASAM-N also enables the computation of response sensitivities with respect to imprecisely known domain boundaries, thus including all possible types of uncertain parameters. Fundamentally, the second-order sensitivities are defined as the “1st-order sensitivities of the 1st-order sensitivities”. This definition stems from the inductive definition of the second-order total G differential of a correspondingly differentiable function, namely “the total 1st-order differential of the 1st-order total differential” of that function.
The computation of the second-order sensitivities
requires the prior computation of the original forward function
and the first-level adjoint function
. For each
, the first-order sensitivities
will be assumed to satisfy the conditions stated in Equations (11) and (12). Under these conditions, the first-order total G differential
of
will exist in a neighborhood around the nominal values of the parameters and state functions and will have, by definition, the following expression, for each
:
where
and
In Equation (45), the quantities
and
denote, respectively, the indirect-effect term and the direct-effect term. The direct-effect term
comprises all dependencies on the vector
of parameter variations and, in view of Equation (44), has the following expression obtained by using Equation (44):
The indirect-effect term
comprises all dependencies on the vectors
and
of variations in the state functions
and
, respectively. The components of the indirect-effect term have the following expressions obtained by using Equation (44):
and
The direct-effect term
can be computed immediately while the indirect-effect term
can be computed only after having determined the vectors
and
. The vector
is the solution of the 1st-LVSS defined by Equations (22) and (23). On the other hand, the vector
is the solution of the G differentiated 1st-LASS. Thus, taking the G differential of Equations (37) and (40) yields the following system of equations for
:
where
Although Equations (51) and (52) are coupled to the 1st-LVSS, they can be solved sequentially, after having obtained the solution
of the 1st-LVSS. Formally, the functions
and
are obtained by solving the following second-level variational sensitivity system (2nd-LVSS), which is obtained by concatenating the 1st-LVSS to Equations (51) and (52):
The argument “2” which appears in the list of arguments of the vector
and the “variational vector”
in Equations (57) and (58) indicates that each of these vectors is a two-block column vector, with each block comprising a column vector of dimension
, defined as follows:
Thus, the (column) block vector has a total of components; evidently, the (column) block vector also has a total of components. In the relatively simple case regarding the components of either the vector or the vector , the numbers “1” and “2” could also be used as subscripts, but such a subscript notation would become unwieldy for the higher-level (adjoint) functions, which will be introduced in the sections to follow below. The superscript “(2)” which appears in the notation of the vectors and indicates “2nd-level”. Henceforth, such “higher-level” (i.e., level higher than first) variational and adjoint functions/vectors will be denoted using bold capital letters. The argument “2” in the expression indicates that the quantity is a two-block column vector comprising two vectors, each of which has components, all of which are zero-valued, as defined in Equation (3). Thus, the column vector has a total of components, all of which are identically zero.
To distinguish block vectors from block matrices, two capital bold letters are used (and will henceforth be used) to denote block matrices, as in the case of “the second-level” “variational matrix”
. The “2nd-level” is indicated by the superscript “(2)”. Subsequently in this work, levels higher than second will also be indicated by a corresponding superscript attached to the appropriate block vectors and/or block matrices. The argument “
”, which appears in the list of arguments of
, indicates that this matrix is a
-dimensional block matrix comprising four matrices, each of dimensions
, having the following structure:
Thus, the matrix
has a total of
components (or elements). The other quantities which appear in Equations (57) and (58) are also two-block vectors, with the same structure as
, and are defined as follows:
Solving the 2nd-LVSS requires
large-scale computations, which is unrealistic to perform for large-scale systems comprising many parameters. The 2nd-CASAM-N circumvents the need for solving the 2nd-LVSS by deriving an alternative expression for the indirect-effect term defined in Equation (47), in which the function
is replaced by a second-level adjoint function which is independent of variations in the model parameter and state functions. This second-level adjoint function will satisfy a second-level adjoint sensitivity system (2nd-LASS), which will be constructed by using the 2nd-LVSS as the starting point, following the same principles outlined in
Section 3.1. The 2nd-LASS will be constructed in a Hilbert space which will be denoted as
and which comprises as elements block vectors of the same form as
. Thus, a generic vector in
, denoted as
, comprises two components of the form
and
, each of which is a
-dimensional column vectors; hence,
is a
-dimensional column vector.
The inner product of two vectors
and
in the Hilbert space
will be denoted as
and defined as follows:
The inner product defined in Equation (63) is continuous in
in a neighborhood of
. Using the definition of the inner product defined in Equation (63), construct the inner product of Equation (57) with a vector
to obtain the following relation:
The inner product on the left side of Equation (64) is now further transformed by using the definition of the adjoint operator to obtain the following relation:
where the adjoint matrix-valued operator
is defined as follows:
The matrix comprises block matrices, each with dimensions , thus comprising a total of components (or elements).
In Equation (66), the quantity
denotes the corresponding bilinear concomitant on the domain’s boundary, evaluated at the nominal values for the parameters and respective state functions. The definition domain of the adjoint (matrix-valued) operator
is specified by requiring the function
to satisfy adjoint boundary/initial conditions denoted as follows:
The second-level adjoint boundary/initial conditions represented by Equation (67) are determined by requiring that: (a) they must be independent of unknown values of ; and (b) the substitution of the boundary and/or initial conditions represented by Equations (58) and (67) into the expression of must cause all terms containing unknown values of to vanish.
Implementing the second-level (forward and adjoint) boundary/initial conditions, namely Equations (58) and (67), into Equation (65) will transform the latter into the following form:
where
denotes residual boundary terms which do not depend on
but may not have vanished automatically. The right side of Equation (64) is now used to replace the vector
in the first term on the right side of Equation (68), thereby obtaining the following relation:
The definition of the second-level adjoint function
is now completed by requiring that both the left side of Equation (69) and the right side of Equation (47) represent the “indirect-effect term”
. As shown in Equation (45), there are a total of
indirect-effect terms, each one corresponding to a first-order sensitivity
,
. Hence, there will be a total of
second-level adjoint functions of the form
,
, with each such adjoint function corresponding to a specific
-dependent indirect-effect term. The left side of Equation (69) will be identical to the right side of Equation (47) by requiring that the following relation is satisfied for each
by the second-level adjoint functions (block vectors)
:
where
The boundary conditions to be satisfied by each of the second-level adjoint functions
are as represented by Equation (67), namely:
Since the source may contain distributions (e.g., Dirac delta functions and derivatives thereof), the equality in Equation (70) is considered to hold in the weak sense. The well-known Riesz representation theorem ensures that the relationship in Equation (70) holds uniquely.
The system of equations represented by Equations (70)–(72) will be called the second-level adjoint sensitivity system (2nd-LASS) and its solution, , will be called the second-level adjoint function. The 2nd-LASS is independent of parameter variations and variations in the respective state functions. It is also important to note that the -dimensional matrix is independent of the index . Only the source term depends on the index . Therefore, the same solver can be used to invert the and numerically solve the 2nd-LASS for each -dependent source in order to obtain the corresponding -dependent -dimensional second-level adjoint function (column vector) . Computationally, it would be efficient to store, if possible, the inverse matrix , in order to multiply the inverse matrix directly with the corresponding source term , for each index , in order to obtain the corresponding -dependent -dimensional second-level adjoint function . The two components and of the second-level adjoint function are distinguished from each other by the use of the numbers “1” and, respectively, “2” in the respective list of arguments. In this particularly simple case, the numbers “1” and “2” could also be used as subscripts, in the customary notation for vector components, but such a use would not lend itself to generalizations because the subscript notation would become unwieldy for the higher-level adjoint functions, which will be introduced in the sections that follow below.
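Because only the source term changes with the index of the first-order sensitivity, the same operator can serve all sources. The following is a minimal sketch, assuming a small hypothetical block matrix as a stand-in for the discretized second-level adjoint operator and random vectors as stand-ins for the index-dependent sources; the sizes, the block structure, and the names are illustrative assumptions. It compares solving all sources in one call, applying a stored inverse, and solving source by source.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_sources = 3, 5

# Hypothetical discretized operator A and a two-by-two block stand-in
# for the second-level adjoint matrix (same for every source index).
A = 3.0 * np.eye(n) + 0.2 * rng.normal(size=(n, n))
zero = np.zeros((n, n))
adjoint_matrix = np.block([[A.T, zero], [zero, A]])

# One column per index-dependent source term (random stand-ins).
sources = rng.normal(size=(2 * n, n_sources))

# (a) All sources in a single call: the matrix is factorized once.
all_at_once = np.linalg.solve(adjoint_matrix, sources)

# (b) Stored inverse applied to every source, feasible only for small systems.
stored_inverse = np.linalg.inv(adjoint_matrix)
via_inverse = stored_inverse @ sources

# (c) Source-by-source solves, which repeat the factorization each time.
one_by_one = np.column_stack(
    [np.linalg.solve(adjoint_matrix, sources[:, j]) for j in range(n_sources)]
)

print(np.allclose(all_at_once, via_inverse), np.allclose(all_at_once, one_by_one))
```

For large discretized systems an explicit inverse is rarely stored; reusing a factorization or a preconditioner across the sources is the usual way to obtain the same saving.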
Using the equations underlying the 2nd-LASS and Equation (68) in Equation (47) yields the following expression for the indirect-effect term, for each
:
As the identity in Equation (73) indicates, the dependence of the indirect-effect term on the functions and is replaced by the dependence on the second-level adjoint function , for each .
Replacing the expression obtained in Equation (73) for the indirect-effect term together with the expression for the direct-effect term provided in Equation (46) yields the following expression for the total differential defined by Equation (45):
Using Equations (25), (46), and (55) in Equation (74) yields the following component form for the total differential expressed by Equation (74):
where the quantity
denotes the second-order sensitivity of the generic scalar-valued response
with respect to the parameters
and
computed at the nominal values of the parameters and respective state functions. The expression of the second-order sensitivity
of the response
with respect to two model parameters
and
has the following expression:
As Equations (70) and (72) indicate, solving the 2nd-LASS once provides the second-level adjoint function , for each index , which enables the exact and efficient computation of the G differential for each index . Notably, the G differential comprises one complete row (running on the index ) of mixed second-order partial sensitivities . Thus, the exact computation of all of the partial second-order sensitivities requires at most large-scale (adjoint) computations using the 2nd-LASS, rather than at least large-scale computations, which would be required by forward methods.
Since the adjoint matrix is block diagonal, solving the 2nd-LASS is equivalent to solving two 1st-LASS, with two different source terms. Thus, the “solvers” and the computer program used for solving the 1st-LASS can also be used for solving the 2nd-LASS. The 2nd-LASS was designated as the “second-level” rather than the “second-order” adjoint sensitivity system, since the 2nd-LASS does not involve any explicit second-order G derivatives of the operators underlying the original system but involves the inversion of the same operators that needed to be inverted for solving the 1st-LASS.
It is important to note that if the 2nd-LASS is solved -times, the second-order mixed sensitivities will be computed twice, in two different ways, in terms of two distinct second-level adjoint functions. Consequently, the symmetry property enjoyed by the second-order sensitivities provides an intrinsic (numerical) verification that the components of the second-level adjoint function , as well as the first-level adjoint function are computed accurately.
The structure of the 2nd-LASS enables full flexibility for prioritizing the computation of the second-order sensitivities. The computation of the second-order sensitivities would logically be prioritized based on the relative magnitudes of the first-order sensitivities: the largest relative first-order response sensitivity should have the highest priority for computing the corresponding second-order mixed sensitivities; then, the second largest relative first-order response sensitivity should be considered next, and so on. The unimportant second-order sensitivities can be deliberately neglected while knowing the error incurred by neglecting them. Computing second-order sensitivities that correspond to vanishing first-order sensitivities may also be of interest, since vanishing first-order sensitivities may indicate critical points of the response in the phase space of model parameters.
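The row-by-row computation of second-order sensitivities and the symmetry check can be illustrated on a small discrete analogue. The following is a minimal sketch under a simplifying assumption that departs from the nonlinear setting of this work: the hypothetical model is linear in the state, A(alpha) u = b, with A depending linearly on three parameters, and the response is the linear functional c·u. All matrices, vectors, and names are illustrative. For each index j, two linear solves (one with the transposed operator, one with the operator itself) give a two-component second-level adjoint vector, from which one complete row of mixed second-order sensitivities follows by inner products.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 4, 3
A0 = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A_parts = 0.3 * rng.normal(size=(p, n, n))   # dA/d(alpha_k), constant (assumed)
b = np.ones(n)                               # source (assumed)
c = np.arange(1.0, n + 1.0)                  # response weights (assumed)

def operator(alpha):
    """A(alpha) = A0 + sum_k alpha_k * A_parts[k]."""
    return A0 + np.tensordot(alpha, A_parts, axes=1)

def first_order(alpha):
    """Forward state, first-level adjoint, and first-order sensitivities."""
    A = operator(alpha)
    u = np.linalg.solve(A, b)                # forward solve
    a1 = np.linalg.solve(A.T, c)             # first-level adjoint solve
    grad = np.array([-a1 @ (A_parts[j] @ u) for j in range(p)])
    return A, u, a1, grad

def second_order(alpha):
    """All second-order sensitivities, one row per index j."""
    A, u, a1, _ = first_order(alpha)
    hessian = np.empty((p, p))
    for j in range(p):
        # Two-component second-level adjoint vector for index j:
        # sources are the derivatives of the j-th first-order sensitivity
        # with respect to u and with respect to a1, respectively.
        a2_first = np.linalg.solve(A.T, -A_parts[j].T @ a1)
        a2_second = np.linalg.solve(A, -A_parts[j] @ u)
        for k in range(p):
            # The direct-effect term vanishes here because A is linear in alpha.
            hessian[j, k] = (-(a2_first @ (A_parts[k] @ u))
                             - (a2_second @ (A_parts[k].T @ a1)))
    return hessian

alpha_nom = np.array([0.5, -0.2, 0.1])
hessian = second_order(alpha_nom)
print(np.max(np.abs(hessian - hessian.T)))   # symmetry check on mixed sensitivities
```

Each off-diagonal entry is obtained twice, once from row j and once from row k, using different second-level adjoint vectors, so the symmetry residual printed at the end acts as an internal consistency check of the kind described above.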
3.3. The Third-Order Comprehensive Adjoint Sensitivity Analysis Methodology for Nonlinear Systems (3rd-CASAM-N)
The second-order sensitivities will be assumed to satisfy the conditions stated in Equations (11) and (12) for each , so that the first-order total G differential of will exist and will be linear in the variations and in a neighborhood around the nominal values of the parameters and the respective state functions. By definition, the first-order total G differential of , which will be denoted as , is given by the following expression:
In Equation (77), the quantity
denotes the “direct-effect term”, which comprises all of the dependencies on the vector
of parameter variations and has the following expression:
In Equation (77), the quantity
denotes the “indirect-effect term” which comprises all of the dependencies on the vectors
and
of variations in the state functions
and
; this indirect-effect term is defined as follows:
where
and
The direct-effect term
can be computed immediately; however,
can be computed only after having determined the vectors
and
. The vector
is the solution of the 2nd-LVSS defined by Equations (57) and (58). The vector
is the solution of the G-differentiated 2nd-LASS. Taking the G differential of Equations (70) and (72) yields the following system of equations for
:
The quantities which appear in Equation (83) are evaluated at the nominal values of the parameters and respective state functions, but the notation , which indicates this evaluation, is omitted in order to simplify the notation.
Concatenating Equations (82) and (83) with the 2nd-LVSS represented by Equations (57) and (58) yields the following system of equations, which will be called the “3rd-level variational sensitivity system” (3rd-LVSS), for determining the vectors
and
:
where
For subsequent reference, it is important to note that the quantities
are linear in the parameter variations
,
, and can therefore be written in the following form:
The variational matrix comprises block matrices, each comprising components/elements; thus, the matrix comprises a total of components/elements. Each of the vectors , and comprises four -dimensional vectors, as shown in their respective definitions; thus, each of the vectors , , and comprises components/elements.
Solving the 3rd-LVSS would require
large-scale computations, which is unrealistic for large-scale systems comprising many parameters. The 3rd-CASAM-N circumvents the need for solving the 3rd-LVSS by deriving an alternative expression for the indirect-effect term defined in Equation (81), in which the function
is replaced by a third-level adjoint function which is independent of parameter variations. This third-level adjoint function will be the solution of a third-level adjoint sensitivity system (3rd-LASS) which will be constructed by applying the same principles as those used for constructing the 1st-LASS and the 2nd-LASS. The Hilbert space appropriate for constructing the 3rd-LASS will be denoted as
and comprises as elements block vectors of the same form as
. Thus, a generic block vector in
, denoted as
, comprises four
-dimensional vector components of the form
,
, where each of these four components is a
-dimensional column vector. The inner product of two vectors
and
in the Hilbert space
will be denoted as
and defined as follows:
The inner product defined in Equation (94) is continuous in
, in a neighborhood around
. Using the definition of the inner product defined in Equation (94), the inner product of Equation (84) is constructed with a vector
to obtain the following relation:
The inner product on the left side of Equation (95) is further transformed by using the definition of the adjoint operator to obtain the following relation:
where
and where
denotes the corresponding bilinear concomitant on the domain’s boundary, evaluated at the nominal values for the parameters and respective state functions. The adjoint matrix
comprises
components/elements, while the adjoint function
comprises
components/elements.
The domain of the adjoint matrix operator
is specified by requiring that the function
satisfies adjoint boundary/initial conditions denoted as follows:
The third-level adjoint boundary/initial conditions represented by Equation (98) are determined by requiring that: (a) they must be independent of unknown values of ; and (b) the substitution of the boundary and/or initial conditions represented by Equations (85) and (98) into the expression of must cause all terms containing unknown values of to vanish.
Implementing the boundary/initial conditions represented by Equations (85) and (98) into Equation (96) will transform the latter into the following form:
where
denotes residual boundary terms which may not vanish automatically. The right side of Equation (95) is now used in the first term on the right side of Equation (99) to obtain the following relation:
The definition of the third-level adjoint function is now completed by requiring that the left side of Equation (100) and the right side of Equation (81) represent the “indirect-effect term” for each of the indices . Hence, there will be distinct third-level adjoint functions , corresponding to the indices . Each of these distinct third-level adjoint functions will correspond to a specific -dependent indirect-effect term.
The left side of Equation (100) will be identical to the right side of Equation (81) by requiring that the following relation be satisfied by the third-level adjoint functions
:
The boundary conditions to be satisfied by each of the third-level adjoint functions
are those represented by Equation (98), namely
Since the source may contain distributions, the equality in Equation (101) is considered to hold in the weak sense. The Riesz representation theorem ensures that the weak equality in Equation (101) holds uniquely.
The system of equations represented by Equations (101) and (107) will be called the third-level adjoint sensitivity system (3rd-LASS); its solution, , will be called the third-level adjoint function. Using the equations underlying the 3rd-LASS and Equation (100) in Equation (81) yields the following expression for the indirect-effect term:
As the identity in Equation (108) indicates, the dependence of the indirect-effect term on the function is replaced by the dependence on the adjoint function , for each .
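The replacement of the variational function by an adjoint function can be illustrated with a minimal finite-dimensional sketch. It assumes a 2×2 matrix `V` standing in for the variational operator, a vector `s` standing in for the derivative of the response with respect to the state functions, and a matrix `Q` whose columns are the sources per unit variation in each parameter. All names and numbers are hypothetical and are not taken from the equations above; the sketch only shows that a single adjoint solve reproduces the indirect-effect term that the forward route obtains with one solve per parameter.

```python
# Toy analogue only: V, s, Q and the helper names are hypothetical.

def solve2(M, b):
    """Solve the 2x2 linear system M x = b by Cramer's rule."""
    (m00, m01), (m10, m11) = M
    det = m00 * m11 - m01 * m10
    return [(b[0] * m11 - m01 * b[1]) / det,
            (m00 * b[1] - m10 * b[0]) / det]

def transpose(M):
    return [list(row) for row in zip(*M)]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

V = [[4.0, 1.0], [2.0, 3.0]]              # stand-in for the variational operator
s = [1.0, -2.0]                           # stand-in for dR/d(state)
Q = [[1.0, 0.0, 2.0], [0.5, 1.0, -1.0]]   # column j: source for parameter j

# Forward route: one solve V v_j = q_j per parameter, then <s, v_j>.
forward = []
for j in range(3):
    q_j = [row[j] for row in Q]
    v_j = solve2(V, q_j)
    forward.append(dot(s, v_j))

# Adjoint route: a single solve V^T a = s, then <a, q_j> for every parameter.
a = solve2(transpose(V), s)
adjoint = [dot(a, [row[j] for row in Q]) for j in range(3)]
```

In this toy setting the adjoint vector `a` does not depend on which parameter is varied, which is the property that the third-level adjoint function is constructed to have.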
Adding the expression obtained in Equation (108) for the indirect-effect term to the expression for the direct-effect term provided in Equation (78) yields the following expression for the total differential defined by Equation (77):
In component form, the total differential expressed by Equation (109) has the following expression:
where the quantity
denotes the third-order sensitivity of the generic scalar-valued response
with respect to any three model parameters
,
,
, and has the following expression:
As Equations (101)–(107) indicate, solving the 3rd-LASS once provides the third-level adjoint function , for each of the indices . In turn, the availability of enables the exact and efficient computation of the G differential . Thus, the exact computation of all of the partial third-order sensitivities requires at most large-scale (adjoint) computations using the 3rd-LASS, rather than at least large-scale computations, which would be required by forward methods.
The matrix is block diagonal; therefore, solving the 3rd-LASS is equivalent to solving three 1st-LASS, with different source terms. The 3rd-LASS was designated as “the third-level” rather than “third-order” adjoint sensitivity system since the 3rd-LASS does not involve any explicit second-order and/or third-order G derivatives of the operators underlying the original system, but involves only the inversion of the same operators that needed to be inverted for solving the 1st-LASS.
By solving the 3rd-LASS times, the third-order mixed sensitivities will be computed three times, in three different ways. Consequently, the multiple symmetries intrinsic to the third-order sensitivities provide an intrinsic numerical verification that the components of the first-, second-, and third-level adjoint functions are computed accurately.
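The symmetry-based verification described above can be sketched as a simple consistency check on a computed array of third-order sensitivities. The sketch assumes the sensitivities are stored as a nested list `S[i][j][k]`; the test data are the analytic third derivatives of a hypothetical response `exp(c·α)` at `α = 0`, chosen only because they are exactly symmetric. The function name and the data are illustrative assumptions, not part of the methodology.

```python
from itertools import permutations, product

def symmetry_defect(S):
    """Largest absolute difference between S[i][j][k] and any index permutation of it."""
    n = len(S)
    worst = 0.0
    for i, j, k in product(range(n), repeat=3):
        for p, q, r in permutations((i, j, k)):
            worst = max(worst, abs(S[i][j][k] - S[p][q][r]))
    return worst

# Hypothetical response R = exp(c . alpha) at alpha = 0, so S_ijk = c_i * c_j * c_k.
c = [0.5, -1.0, 2.0]
S = [[[ci * cj * ck for ck in c] for cj in c] for ci in c]
clean = symmetry_defect(S)

# Corrupt one mixed entry to mimic an inaccurately computed adjoint function.
S[0][1][2] += 1e-3
corrupted = symmetry_defect(S)
```

A defect that is not negligible relative to the magnitude of the sensitivities would indicate that at least one of the independently computed values is inaccurate.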
The structure of the 3rd-LASS enables full flexibility for prioritizing the computation of the third-order sensitivities. The computation of the third-order sensitivities would logically be prioritized based on the relative magnitudes of the second-order sensitivities, so that the unimportant third-order sensitivities can be deliberately neglected while knowing the error incurred by neglecting them.
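One possible way to organize such a prioritization is sketched below. The selection rule (keep a parameter if any of its relative second-order sensitivities reaches a cutoff), the cutoff value, and the sensitivity values are all hypothetical; the text above does not prescribe a specific rule, and the sketch does not estimate the error incurred by the neglected terms.

```python
from itertools import combinations_with_replacement

# Hypothetical relative second-order sensitivities, keyed by parameter-index pair (j, k).
second_order = {
    (0, 0): 0.90, (0, 1): 0.40, (0, 2): 0.01,
    (1, 1): 0.30, (1, 2): 0.02, (2, 2): 0.005,
}
threshold = 0.1  # hypothetical cutoff on the relative magnitude

def important_parameters(sens, cutoff):
    """Indices of parameters involved in at least one sensitivity at or above the cutoff."""
    keep = set()
    for (j, k), value in sens.items():
        if abs(value) >= cutoff:
            keep.update((j, k))
    return sorted(keep)

selected = important_parameters(second_order, threshold)

# Third-order index triples to compute first; symmetry means only
# non-decreasing triples need to be listed.
triples = list(combinations_with_replacement(selected, 3))
```

With these illustrative numbers, parameter 2 is dropped and only four of the ten distinct third-order index triples remain to be computed.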
3.4. The Fourth-Order Comprehensive Adjoint Sensitivity Analysis Methodology for Nonlinear Systems (4th-CASAM-N)
The third-order sensitivities will be assumed to satisfy the conditions stated in Equations (11) and (12) for each , so that the first-order total G differential of will exist and will be linear in the variations and in a neighborhood around the nominal values of the parameters and the respective state functions. By definition, the first-order total G differential of , which will be denoted as , is given by the following expression:
In Equation (112), the quantity denotes the “direct-effect term” which comprises all of the dependencies on the vector of parameter variations and has the following expression:
In Equation (112), the quantity
denotes the “indirect-effect term” which comprises all of the dependencies on the vectors
and
of variations in the state functions
and
; this indirect-effect term is defined as follows:
where
and
The direct-effect term
can be computed immediately; however, the indirect-effect term
can be computed only after having determined the vectors
and
. The vector
is the solution of the 3rd-LVSS defined by Equations (84) and (85). The vector
is the solution of the G-differentiated 3rd-LASS. Taking the G differential of Equations (101) and (107) yields the following system of equations for
and for
:
The quantities which appear in Equation (117) are evaluated at the nominal values of the parameters and respective state functions, but the notation , which indicates this evaluation, is omitted in order to simplify the notation.
Concatenating Equations (117) and (118) with the 3rd-LVSS represented by Equations (84) and (85) yields the following system of equations, which will be called the “fourth-level variational sensitivity system” (4th-LVSS), for determining the vectors
and
, for
:
where
, and
For subsequent reference, it is noted that the quantities
are linear in the parameter variations
,
, and can therefore be written in the following form:
The variational matrix comprises block matrices, each comprising components/elements; thus, the matrix comprises a total of components/elements. Each of the vectors , , and comprises eight -dimensional vectors, as shown in their respective definitions; thus, each of the vectors , , comprises components/elements.
Solving the 4th-LVSS would require
large-scale computations, which is unrealistic for large-scale systems comprising many parameters. The 4th-CASAM-N circumvents the need for solving the 4th-LVSS by deriving an alternative expression for the indirect-effect term defined in Equation (114), in which the function
is replaced by a fourth-level adjoint function which is independent of parameter variations. This fourth-level adjoint function will be the solution of a fourth-level adjoint sensitivity system (4th-LASS) which will be constructed by applying the same principles as those used for constructing the 1st-LASS, the 2nd-LASS, and the 3rd-LASS. The Hilbert space appropriate for constructing the 4th-LASS will be denoted as
and comprises block-vector elements of the same form as
. Thus, a generic block vector in
will have the structure
, comprising eight
-dimensional vectors of the form
,
. The inner product of two vectors
and
in the Hilbert space
will be denoted as
and defined as follows:
The inner product defined in Equation (130) is continuous in
, in a neighborhood around
. Using the definition of the inner product defined in Equation (130), the inner product of Equation (119) is constructed with a vector
to obtain the following relation:
The inner product on the left side of Equation (131) is further transformed by using the definition of the adjoint operator to obtain the following relation:
where
and where
denotes the corresponding bilinear concomitant on the domain’s boundary, evaluated at the nominal values for the parameters and respective state functions. The adjoint matrix
comprises
components/elements, while the adjoint function
comprises
components/elements.
The domain of the adjoint matrix operator
is specified by requiring that the function
satisfies adjoint boundary/initial conditions denoted as follows:
The fourth-level adjoint boundary/initial conditions represented by Equation (134) are determined by requiring that: (a) they must be independent of unknown values of ; and (b) the substitution of the boundary and/or initial conditions represented by Equations (128) and (134) into the expression of must cause all terms containing unknown values of to vanish.
Implementing the boundary/initial conditions represented by Equations (128) and (134) into Equation (132) will transform the latter relation into the following form:
where
denotes residual boundary terms which may not vanish automatically. The right side of Equation (131) is now used in the first term on the right side of Equation (135) to obtain the following relation:
The definition of the fourth-level adjoint function is now completed by requiring that the left side of Equation (136) and the right side of Equation (114) represent the “indirect-effect term” , for each of the indices ; ; . Hence, there will be distinct fourth-level adjoint functions , each corresponding to one combination of the indices ; ; . Each of these distinct fourth-level adjoint functions will correspond to a specific -dependent indirect-effect term.
The left side of Equation (136) will be identical to the right side of Equation (114) by requiring that the following relation be satisfied by the fourth-level adjoint functions
, for each value of the indices
;
;
:
where
The boundary conditions to be satisfied by each of the fourth-level adjoint functions
are those represented by Equation (134), namely
Since the source may contain distributions, the equality in Equation (137) is considered to hold in the weak sense, and the Riesz representation theorem ensures that this weak equality holds uniquely.
The system of equations represented by Equations (137) and (143) will be called the fourth-level adjoint sensitivity system (4th-LASS); its solution,
, will be called the fourth-level adjoint function. Using the equations underlying the 4th-LASS and Equation (136) in Equation (114) yields the following expression for the indirect-effect term:
As the identity in Equation (144) indicates, the dependence of the indirect-effect term on the function is replaced by the dependence on the adjoint function for each ; .
Adding the expression obtained in Equation (144) for the indirect-effect term to the expression for the direct-effect term provided in Equation (113) yields the following expression for the total differential defined by Equation (112):
In component form, the total differential expressed by Equation (145) can be written in the following form, for each
;
:
where the quantity
denotes the fourth-order sensitivity of the generic scalar-valued response
with respect to any four model parameters
,
,
,
, and has the following expression:
As Equations (137) and (143) indicate, solving the 4th-LASS once provides the fourth-level adjoint function for each of the indices , and . In turn, the availability of enables the exact and efficient computation of the G differential . Thus, the exact computation of all of the partial fourth-order sensitivities requires at most large-scale (adjoint) computations using the 4th-LASS, rather than at least large-scale computations, which would be required by forward methods.
The adjoint matrix is block diagonal; therefore, solving the 4th-LASS is equivalent to solving four 1st-LASS, with different source terms. The 4th-LASS was designated as the “fourth-level” rather than “fourth-order” adjoint sensitivity system since the 4th-LASS does not involve any explicit second-order, third-order, and/or fourth-order G derivatives of the operators underlying the original system but involves only the inversion of operators similar to those that needed to be inverted for solving the 1st-LASS.
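The consequence of the block-diagonal structure can be illustrated with a small sketch, under simplifying assumptions: the same 2×2 matrix `A_star` is placed in each of four diagonal blocks (a stand-in for the adjoint operators, which the text describes only as similar to those of the 1st-LASS), and the four source vectors are arbitrary. All names and numbers are hypothetical. The sketch checks that solving the assembled block-diagonal system gives the same result as four separate small solves with different sources.

```python
def gauss_solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    A = [list(row) + [bi] for row, bi in zip(M, b)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(A[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (A[r][n] - tail) / A[r][r]
    return x

A_star = [[4.0, 2.0], [1.0, 3.0]]                     # hypothetical adjoint block
sources = [[1.0, -2.0], [0.0, 1.0], [3.0, 0.5], [-1.0, 2.0]]

# Assemble the block-diagonal matrix and the stacked source vector.
n, m = len(A_star), len(sources)
big = [[0.0] * (n * m) for _ in range(n * m)]
for blk in range(m):
    for i in range(n):
        for j in range(n):
            big[blk * n + i][blk * n + j] = A_star[i][j]
rhs = [value for src in sources for value in src]

assembled = gauss_solve(big, rhs)                     # one large solve
separate = [value for src in sources                  # four small solves
            for value in gauss_solve(A_star, src)]
```

In practice the separate route would reuse whatever solver is already available for the first-level adjoint system, changing only the source term.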
By solving the 4th-LASS times, the fourth-order mixed sensitivities will be computed four times, in four different ways using distinct adjoint functions. Consequently, the multiple symmetries intrinsic to the fourth-order sensitivities provide an intrinsic numerical verification that the components of the first-, second-, third-, and fourth-level adjoint functions are computed accurately.
The structure of the 4th-LASS enables full flexibility for prioritizing the computation of the fourth-order sensitivities. The computation of the fourth-order sensitivities would be prioritized based on the relative magnitudes of the third-order sensitivities, so that the unimportant fourth-order sensitivities can be deliberately neglected while knowing the error incurred by neglecting them.