Article

A Theory of Functional Connections-Based hp-Adaptive Mesh Refinement Algorithm for Solving Hypersensitive Two-Point Boundary-Value Problems

Systems & Industrial Engineering, University of Arizona, Tucson, AZ 85721, USA
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(9), 1360; https://doi.org/10.3390/math12091360
Submission received: 15 March 2024 / Revised: 20 April 2024 / Accepted: 22 April 2024 / Published: 29 April 2024
(This article belongs to the Special Issue Dynamics and Control Using Functional Interpolation)

Abstract
This manuscript introduces the first hp-adaptive mesh refinement algorithm for the Theory of Functional Connections (TFC) to solve hypersensitive two-point boundary-value problems (TPBVPs). The TFC is a mathematical framework that analytically satisfies linear constraints using an approximation method called a constrained expression. The constrained expression utilized in this work is composed of two parts. The first part consists of Chebyshev orthogonal polynomials, which approximate the solution of the differentiation variables. The second part is a summation of products between switching functions and projection functionals, which satisfy the boundary constraints. The mesh refinement algorithm relies on the truncation error of the constrained expressions to determine the ideal number of basis functions within a segment's polynomials. Whether to increase the number of basis functions in a segment or divide it is determined by the decay rate of the truncation error. The results show that the proposed algorithm solves hypersensitive TPBVPs more accurately than MATLAB R2021b's bvp4c routine and far more accurately than the standard TFC method that uses global constrained expressions. The proposed algorithm's main drawback is its long runtime, due to the numerical approximation of the Jacobians.

1. Introduction

In recent years, the Theory of Functional Connections (TFC) [1] has garnered much interest from researchers for obtaining numerical solutions to scientific and engineering problems. A few examples of the types of problems the TFC has been used to solve include those from transport theory [2], solid mechanics [3,4], and orbit transfer [5,6,7]. The increase in the TFC’s popularity is due to its successful use of functional interpolation to analytically satisfy linear constraints exactly. Functional interpolation is performed by approximating a solution via a constrained expression, which comprises an arbitrary free function and a summation of products between switching functions and projection functionals. Some constraints that could be handled were immediately apparent during the TFC’s inception by Daniele Mortari [8]. Examples include point, derivative, and relative constraints. The ability of the TFC to analytically satisfy these constraints exactly allowed it to be used to solve one-dimensional nonlinear ordinary differential equations (ODEs) in milliseconds and with machine-level accuracy [9]. In this work, we are concerned with solving linear and nonlinear hypersensitive two-point boundary-value problems (TPBVPs) stemming from optimal control problems (OCPs) involving point and derivative constraints. These problems are challenging because of steep gradients in their solutions, which are difficult to approximate with global constrained expressions. We overcome this challenge by developing the first hp-adaptive mesh refinement algorithm for the TFC. The algorithm splits the constrained expressions into multiple segments and adaptively determines the length, location, and hyperparameters for each segment to accurately approximate a solution.
Three families of optimal control methods exist: dynamic programming (DP) [10], direct [11,12], and indirect methods [13,14]. DP solves an OCP by transforming it into the Hamilton–Jacobi–Bellman (HJB) partial differential equation, which gives the necessary and sufficient conditions for an optimal solution when solved and enables closed-loop control effort. Direct methods convert the OCP into a nonlinear programming (NLP) problem that can be solved by any of the available NLP solvers in the literature but are inherently open-loop. Indirect methods are also open-loop, but they analytically construct the necessary conditions for optimality via Pontryagin’s Minimum Principle (PMP) and solve a resulting dual TPBVP to obtain the OCP solution. Indirect methods tend to solve OCPs more accurately than direct methods but are less likely to converge because they are more sensitive to an initial guess. Whether one class of methods should be used over another is problem-dependent. Hence, the TFC has already been used to solve OCPs via all three families, but some more than others. Schiassi et al. used a subset of the TFC, called the Extreme Theory of Functional Connections (X-TFC), to solve the HJB for finite and infinite horizon OCPs with linear and nonlinear dynamics [15]. For the direct method, Zhang et al. solved several coplanar orbit low-thrust multi-target visit problems, governed by aspherical gravity perturbation, with the TFC to analytically satisfy the orbital multi-target visit constraints in an NLP [16]. The TFC has been used to solve OCPs via the indirect method much more extensively in the literature. For example, by forming a TPBVP via PMP, the TFC has been used to solve many spacecraft energy optimal rendezvous problems in order to avoid keep-out-zones [17], planetary landing problems [18], and the famous Riccati problem of optimal control [19].
Even though the TFC has been shown to solve many OCPs very well via the indirect method, the method fails to converge on solutions that are not smooth (e.g., bang-bang problems). This occurs because the TFC initially utilizes global orthogonal polynomials for the free function, which only converge exponentially on smooth continuous functions [20] (p. 29). This issue was rectified by Johnston et al. by splitting the domain of a fuel-optimal planetary landing OCP into several intervals connected by knot points [18]. The knot points were placed where the discontinuity in the solution occurred, making each interval smooth. The total solution could then be converged upon by using constrained expressions for each segment and tying them together. The locations of the knot points were found via an optimization algorithm, which was feasible because the number of switching points in a fuel-optimal planetary landing OCP is known when the gravity is assumed to be constant [21]. A technique where the domains of TPBVPs could be split into segments and solved with the TFC was eventually formalized by Johnston and Mortari [22] for solving hybrid problems. Nonetheless, Johnston and Mortari provided no insight into how to determine the number of orthogonal polynomial basis terms in an interval or how to determine the number of intervals when they are unknown a priori. This gap motivated us to develop a TFC-based adaptive mesh refinement algorithm for determining these hyperparameters.
Researchers have heavily investigated the use of adaptive mesh refinement algorithms in conjunction with many OCP techniques. One of the more well-known applications in optimal control theory is the coupling with pseudospectral methods for solving OCPs directly [23,24,25,26,27], where the state and control are approximated using Lagrange orthogonal polynomials with Gaussian quadrature collocation points. Most adaptive mesh refinement methods for pseudospectral methods are referred to as hp methods because they achieve better convergence toward the solution by varying the number of intervals on the domain (the h method) and the degree of the orthogonal polynomial (the p method). A more in-depth explanation of the h, p, and hp methods can be found by studying finite-element methods [28,29,30], where they are also widely used. For the mesh refinement algorithm to be adaptive, an a posteriori estimate of the error between the actual and approximate solutions must be quantified to determine the ideal number of intervals, the degree of the approximating polynomials, and the location of the knot points for each interval. For example, Darby et al. approximated the error by computing the residuals on and between collocation points [24]. When the error was above some tolerance, they increased the polynomial degree in intervals with low curvature or uniform error across the collocation points. An interval was divided into subintervals instead when the curvature was high or the error across the collocation points in an interval was not uniform. Unfortunately, the error estimates from Darby et al. [24] created much noise, making them computationally intractable when a highly accurate solution is desired. Patterson et al. developed a more accurate error estimate by using the difference between an interpolated value of the state and a quadrature approximation to the integral of the state dynamics [25]. They performed adaptive mesh refinement based on an error estimate ratio: when a set number of polynomial degrees was used to approximate the state, the degree number was increased by one for interpolation. The spectral convergence properties of pseudospectral methods could then be used to determine how many degrees to increase the polynomial by or to divide the interval into smaller subintervals.
The mesh refinement methods proposed in [24,25] only increased the mesh, which sometimes increased the computational complexity. Liu et al. [26] introduced a technique that both increased and decreased the mesh based on the error estimate from Patterson et al. [25] and used a ratio between the second derivative of the approximation on the current mesh and the previous one to determine the smoothness of an interval. Their algorithm used the power-series representation of the Lagrange polynomial to reduce the degree of the approximating polynomial within a mesh interval for smooth intervals when the estimated error was lower than the tolerance. Smooth neighboring mesh intervals were also combined when the estimated error of the larger interval, which extended into the smaller one, was similar. More recently, Liu et al. [27] discovered a new way to estimate the smoothness of mesh intervals according to the decay rate of a Legendre polynomial approximation and increased the mesh size based on the decay rate. The work performed in this manuscript is heavily inspired by [27].
This paper proposes an hp-adaptive mesh refinement algorithm that the TFC, with a truncated Chebyshev orthogonal polynomial as the free function, can use to solve hypersensitive TPBVPs. The error between the actual solution and the constrained expression approximation is estimated using an exponential least-squares fit of the Chebyshev coefficients, the orthogonality property of Chebyshev polynomials, and a rule of thumb that correlates the error associated with Chebyshev polynomials when approximating functions with their truncation error. The decay rate of the exponential least-squares fit of the Chebyshev coefficients can be used to determine the smoothness of the solution on an interval. When the error estimate is greater than some tolerance and the interval is deemed smooth, the super-geometric convergence of the truncation error for Chebyshev polynomials can be used to relate the error to the ideal number of basis functions in the Chebyshev polynomial needed to satisfy the tolerance. If the interval is not smooth, it is divided into subintervals so that the solution in each subinterval is easier to approximate with a constrained expression. Smooth intervals can have the number of basis functions in the Chebyshev polynomial decreased when they already satisfy the error estimate. Furthermore, smooth intervals can also be combined if they are exceptionally smooth or if the number of basis functions in all their constrained expression approximations is some minimum number.
As the name of this manuscript suggests, we only demonstrate how the proposed hp-adaptive mesh refinement algorithm can be used to solve hypersensitive TPBVPs, even though it can also be used for more well-behaved problems. Hypersensitive problems have a very long time domain with respect to the rates of expansion and contraction of the dynamics in specific directions within a neighborhood of the optimal solution [31]. Thus, the TPBVPs solved in this work have sharp increases in the gradients near the boundary conditions and a smooth segment between them. These sharp increases, relative to the length of the entire domain, make hypersensitive TPBVPs challenging to solve with global constrained expression approximations. Although hypersensitive TPBVPs can stem from the indirect method of optimal control, the ones we solve in this work do not contain inequality constraints on the control or states. We do not solve these types of problems in this work because they require smoothing techniques along with a continuation procedure, drastically decreasing computational efficiency. The computation time of our hp-adaptive mesh refinement algorithm is already long (see Section 6 for more details). Therefore, we hope to speed up the computational runtime before coupling the proposed hp-adaptive mesh refinement algorithm with continuation.
This manuscript is outlined as follows. In the next section, we introduce how to solve a general TPBVP with the TFC, as well as how to split the domain into segments and solve over each with the TFC. The derivation of an error estimate for constrained expressions with orthogonal polynomial free-function approximations is presented in Section 3, along with an error analysis for smooth and nonsmooth functions. The TFC-based hp-adaptive mesh refinement algorithm that uses our derived error estimate is then proposed in Section 4. Numerical results for solving three hypersensitive TPBVPs are provided and discussed in Section 5, along with a computational time study in Section 6. Lastly, conclusions and an outlook on future work are presented in Section 7.

2. Theory of Functional Connections

Here, the TFC is first presented for solving a TPBVP with a single ODE and with a global constrained expression. We then show how global constrained expressions can be used to solve a TPBVP composed of a system of ODEs. Lastly, we demonstrate how global constrained expressions can be replaced with piecewise constrained expressions, where each only approximates a domain segment.

2.1. General TPBVP Outline

Solving a TPBVP via the TFC can be accomplished by following a simple set of procedures. Indeed, a “cookbook” guide can be summarized in three steps:
1. Define the loss equation(s), which are represented by the residuals of the differential equations (DEs).
2. Derive the constrained expression(s).
3. Map the problem domain to that of the free function (if necessary) or simply normalize it for numerical stability.
Assume a general TPBVP,
$$\ddot{y}(t) = f\left(t, y(t), \dot{y}(t)\right) \qquad \text{subject to:} \qquad y(t_0) = y_0, \quad y(t_f) = y_f,$$
where $f$ is some function in terms of time $t$ and the differentiation variable $y$. As its name suggests, the TPBVP is constrained at the beginning and end of the time domain $t \in [t_0, t_f]$. With the TFC, we approximate the solution with a constrained expression $Y$,
$$y(t) \approx Y(t, g(t)) = g(t) + \sum_{l=1}^{N_{\text{const}}} \Omega_l(t)\,\varrho_l\left(t_l, g(t_l)\right),$$
where $g(t)$ is a free function, $\Omega_l(t)$ are switching functions, $\varrho_l(t_l, g(t_l))$ are projection functionals, and $N_{\text{const}}$ is the number of linear constraints.
Defining the loss equation for any set of DEs, including the ODE in Equation (1), is easy. Simply bring all terms to one side of the equals sign, like so,
$$\ddot{y}(t) - f\left(t, y(t), \dot{y}(t)\right) = \hat{f}\left(t, y(t), \dot{y}(t), \ddot{y}(t)\right) = 0,$$
where $\hat{f}$ is the constrained DE residual. To turn the TPBVP into an unconstrained problem (i.e., $\hat{f}$ is no longer subject to $y_0$ and $y_f$), the constrained expressions for $y$ and $\dot{y}$ must be derived.
We start the process of figuring out the constrained expressions by rewriting Equation (2) as
$$Y(t, g(t)) = g(t) + \Omega_1(t)\,\varrho_1\left(t_0, g(t_0)\right) + \Omega_2(t)\,\varrho_2\left(t_f, g(t_f)\right).$$
The projection functionals (i.e., $\varrho_1$ and $\varrho_2$) are simply the difference between the constraint value and the free function evaluated at the point where the constraint is defined:
$$\varrho_1 = y_0 - g(t_0), \qquad \varrho_2 = y_f - g(t_f).$$
The most cumbersome part of deriving the constrained expressions is the derivation of the switching functions, which are of the general forms
$$\Omega_1(t) = \sum_{i=1}^{N_{\text{const}}=2} s_i(t)\,\alpha_{i1} \qquad \text{and} \qquad \Omega_2(t) = \sum_{i=1}^{N_{\text{const}}=2} s_i(t)\,\alpha_{i2}.$$
Only two switching functions exist because the ODE has two boundary constraints. Evaluating the switching functions at the points where the constraints occur and collecting the results into a compact linear system yields
$$\begin{bmatrix} s_1(t_0) & s_2(t_0) \\ s_1(t_f) & s_2(t_f) \end{bmatrix}\begin{bmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{bmatrix} = \begin{bmatrix} \Omega_1(t_0) & \Omega_2(t_0) \\ \Omega_1(t_f) & \Omega_2(t_f) \end{bmatrix}.$$
To determine the switching functions, compute the unknown coefficients $\alpha_{ij}$. The support functions $s_i$ do not have to be derived but can merely be chosen. However, to ensure the support matrix is invertible, its columns must be linearly independent. For point constraints, like in our case, the selection $s_1(t) = 1$ and $s_2(t) = t$ satisfies this condition. A switching function is defined to be equal to 1 when evaluated at the constraint it references and equal to 0 at all other constraints. Therefore, the system can be solved as
$$\begin{bmatrix} 1 & t_0 \\ 1 & t_f \end{bmatrix}\begin{bmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad\Longrightarrow\quad \begin{bmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{bmatrix} = \begin{bmatrix} 1 & t_0 \\ 1 & t_f \end{bmatrix}^{-1}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \frac{1}{t_f - t_0}\begin{bmatrix} t_f & -t_0 \\ -1 & 1 \end{bmatrix}.$$
Substituting the coefficients back into the switching functions and simplifying gives
$$\Omega_1(t) = \frac{t_f\,s_1(t) - s_2(t)}{t_f - t_0} = \frac{t_f - t}{t_f - t_0}$$
and
$$\Omega_2(t) = \frac{s_2(t) - t_0\,s_1(t)}{t_f - t_0} = \frac{t - t_0}{t_f - t_0}.$$
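As a quick sanity check, the same switching functions can be obtained numerically by solving the support-matrix system above. The following is a minimal sketch, not part of the authors' implementation; the interval $[t_0, t_f] = [0, 2]$ is an arbitrary choice for illustration:

```python
import numpy as np

t0, tf = 0.0, 2.0                      # assumed example interval
S = np.array([[1.0, t0],               # support functions s1 = 1, s2 = t
              [1.0, tf]])              # evaluated at t0 (row 1) and tf (row 2)
alpha = np.linalg.solve(S, np.eye(2))  # support matrix times alpha = identity

def Omega(t):
    """Returns [Omega1(t), Omega2(t)] as the linear combination s_i(t) * alpha_ij."""
    return np.array([1.0, t]) @ alpha

print(Omega(t0))  # [1. 0.] -> Omega1 = 1 and Omega2 = 0 at the first constraint
print(Omega(tf))  # [0. 1.] -> Omega2 = 1 and Omega1 = 0 at the second constraint
```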
With the switching functions and projection functionals defined, all that is left to finish deriving the constrained expression is to decide what the free function should be. One commonly selected free function among researchers who utilize the TFC method for solving ODEs is a Chebyshev polynomial expansion of the first kind,
$$g(t) = \sum_{j=N_{\text{const}}}^{N_{\text{const}}+L-1} \xi_j\,T_j(\tau) = \xi^\top T(\tau),$$
where $\xi_j$ are the coefficients, $T_j(\tau)$ are the Chebyshev basis functions, and $L$ is the number of chosen basis functions. When one expresses a Chebyshev series, one often starts the summation from the zeroth order (i.e., $j = 0$). The TFC process requires the series summation to begin at the order equal to the number of constraints (i.e., $j = N_{\text{const}}$) because the Chebyshev series terms of order less than $N_{\text{const}}$ are linearly dependent on the switching functions. The algebraic system of equations formed via the loss functions must be linearly independent to ensure that the Jacobian products are invertible in the Gauss–Newton algorithm and that linear least squares gives a unique solution. The loss function eventually derived is guaranteed to be linearly independent when the Chebyshev series begins at $j = N_{\text{const}}$.
The domain of Chebyshev polynomials, $\tau \in [-1, 1]$, does not coincide with the time domain $t \in [t_0, t_f]$ unless $t_0 = -1$ and $t_f = 1$. Thus, to ensure all terms within the constrained expressions are on the same domain, the switching functions can be linearly mapped to the $\tau$ domain using the equations
$$\tau = \tau_0 + \frac{\tau_f - \tau_0}{t_f - t_0}(t - t_0) \qquad\Longleftrightarrow\qquad t = t_0 + \frac{t_f - t_0}{\tau_f - \tau_0}(\tau - \tau_0).$$
The $w$-th derivative of the free function can then be defined as
$$\frac{d^w g}{dt^w} = \xi^\top\,\frac{d^w T(\tau)}{d\tau^w}\,c^w,$$
where the mapping constant $c$ is
$$c = \frac{2}{t_f - t_0}.$$
With the free function and subsequent mapping defined, the constrained expressions for the $y$, $\dot{y}$, and $\ddot{y}$ variables that make up the loss equation can be defined as
$$Y(t) = Y(\tau) = \xi^\top T(\tau) + \Omega_1(\tau)\left[y_0 - \xi^\top T(\tau_0)\right] + \Omega_2(\tau)\left[y_f - \xi^\top T(\tau_f)\right] = \left[T(\tau) - \Omega_1(\tau)T(\tau_0) - \Omega_2(\tau)T(\tau_f)\right]^\top \xi + \Omega_1(\tau)\,y_0 + \Omega_2(\tau)\,y_f,$$
$$\frac{dY(t)}{dt} = c\,\frac{dY(\tau)}{d\tau} = \frac{2}{t_f - t_0}\left\{\left[\frac{dT(\tau)}{d\tau} - \frac{d\Omega_1(\tau)}{d\tau}T(\tau_0) - \frac{d\Omega_2(\tau)}{d\tau}T(\tau_f)\right]^\top \xi + \frac{d\Omega_1(\tau)}{d\tau}\,y_0 + \frac{d\Omega_2(\tau)}{d\tau}\,y_f\right\},$$
and
$$\frac{d^2Y(t)}{dt^2} = c^2\,\frac{d^2Y(\tau)}{d\tau^2} = \left(\frac{2}{t_f - t_0}\right)^2\left\{\left[\frac{d^2T(\tau)}{d\tau^2} - \frac{d^2\Omega_1(\tau)}{d\tau^2}T(\tau_0) - \frac{d^2\Omega_2(\tau)}{d\tau^2}T(\tau_f)\right]^\top \xi + \frac{d^2\Omega_1(\tau)}{d\tau^2}\,y_0 + \frac{d^2\Omega_2(\tau)}{d\tau^2}\,y_f\right\}.$$
The constrained expressions can now be plugged into Equation (3), which transforms the constrained ODE $\hat{f}$ into an unconstrained ODE $\check{f}$ consisting of the basis domain $\tau$, the unknown Chebyshev coefficients $\xi$, the endpoints of the time domain ($t_0$ and $t_f$), and the boundary conditions of the constrained ODE, $\beta$:
$$\check{f}\left(\tau, \xi, \beta, t_f, t_0\right) = c^2\,\frac{d^2Y(\tau)}{d\tau^2} - f\left(\tau, Y(\tau), c\,\frac{dY(\tau)}{d\tau}\right) = 0,$$
where $\beta = \{y_0, y_f\}$.
With the constrained expressions plugged into the DE residual, an algebraic system of equations can be formed after the domain is discretized with Chebyshev–Gauss–Lobatto nodes,
$$\tau_d = -\cos\left(\frac{d\,\pi}{N-1}\right) \qquad \text{for all} \qquad d = (0, \dots, N-1),$$
where $N$ is the number of discretization points, also called collocation points. Note that having $N \geq L$ ensures that the algebraic system of equations formed via the loss equations is not underdetermined. Chebyshev–Gauss–Lobatto nodes are better than uniformly spaced points for discretizing the domain because they avoid the Runge phenomenon. Once the domain is discretized, a vector of loss functions at each discretization point can be expressed as
$$L(\xi) = \begin{bmatrix} \check{f}\left(\tau_0, \xi, \beta, t_f, t_0\right) \\ \vdots \\ \check{f}\left(\tau_d, \xi, \beta, t_f, t_0\right) \\ \vdots \\ \check{f}\left(\tau_{N-1}, \xi, \beta, t_f, t_0\right) \end{bmatrix},$$
where $\xi$ is the only unknown variable, which is why we write the loss only in terms of it here. If $f$ is linear, then $\xi$ within $\check{f}$ will show up linearly. Equation (6) then yields the form
$$A\,\xi + b = 0,$$
where the matrix $A$ collects the terms that multiply $\xi$, while the vector $b$ is the loss vector evaluated at $\xi = 0$. Linear least squares can then be used to solve Equation (7) for $\xi$.
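To make the linear case concrete, the sketch below assembles and solves $A\xi = -b$ for an assumed test problem, $\ddot{y} - y = 0$ with $y(0) = 1$ and $y(1) = e$ (exact solution $y = e^t$), using NumPy's Chebyshev utilities. It is a minimal illustration of the procedure above, not the authors' implementation; the problem, $L$, and $N$ are arbitrary choices.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Assumed test problem: y'' - y = 0, y(0) = 1, y(1) = e; exact solution exp(t).
t0, tf, y0, yf = 0.0, 1.0, 1.0, np.e
L, N = 18, 21                        # L basis functions, N >= L collocation points
c = 2.0 / (tf - t0)                  # mapping constant, Equation (5)
tau = -np.cos(np.arange(N) * np.pi / (N - 1))   # Chebyshev-Gauss-Lobatto nodes

def cheb(x, der=0):
    """Matrix whose columns are T_j (or its der-th derivative), j = 2..L+1."""
    x = np.atleast_1d(x)
    V = np.zeros((x.size, L))
    for k in range(L):
        coef = np.zeros(k + 3); coef[k + 2] = 1.0        # selects T_{k+2}
        V[:, k] = C.chebval(x, C.chebder(coef, der) if der else coef)
    return V

Om1, Om2 = (1.0 - tau) / 2.0, (1.0 + tau) / 2.0          # switching functions in tau
T, ddT = cheb(tau), cheb(tau, 2)
T0, Tf = cheb(-1.0), cheb(1.0)

# Constrained expression: Y = [T - Om1 T(-1) - Om2 T(+1)] xi + Om1 y0 + Om2 yf
Y_xi = T - np.outer(Om1, T0) - np.outer(Om2, Tf)
Y_b = Om1 * y0 + Om2 * yf
# Loss c^2 Y'' - Y = 0 is linear in xi (the Omegas are linear, so Omega'' = 0):
A = c**2 * ddT - Y_xi                # terms multiplying xi
b = -Y_b                             # loss evaluated at xi = 0
xi = np.linalg.lstsq(A, -b, rcond=None)[0]

t = t0 + (tau + 1.0) / c             # map the nodes back to the time domain
print(np.max(np.abs(Y_xi @ xi + Y_b - np.exp(t))))       # error near machine level
```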
In the case of nonlinear DEs, Equation (6) will be nonlinear in the unknown $\xi_j$ coefficients. The Gauss–Newton method, also known as iterative least squares, can then be used to solve for the change in $\xi$ at each iteration step $p$, which is denoted by
$$\Delta\xi_p = \xi_{p+1} - \xi_p.$$
The Gauss–Newton technique is expressed as
$$\Delta\xi_p = -\left[J(\xi_p)^\top J(\xi_p)\right]^{-1} J(\xi_p)^\top L(\xi_p),$$
where $J(\xi_p)$ is the Jacobian of the loss vector with respect to the $\xi$ variables, computed at $\xi_p$. Iterations are repeated until some stopping criterion is met, which in this work is
$$\left\|L(\xi_p)\right\|_\infty < \varepsilon_{\text{GN}}, \qquad \left\|L(\xi_{p+1}) - L(\xi_p)\right\|_\infty < \varepsilon_{\text{GN}}, \qquad \text{or} \qquad p = p_{\max},$$
where $\varepsilon_{\text{GN}}$ is some user-prescribed tolerance, $\|\cdot\|_\infty$ refers to the $L_\infty$-norm, and $p_{\max}$ is the maximum number of iterations. When any stopping criterion is met, the solution is obtained by substituting $\xi$ into the constrained expression and mapping back to the problem domain, which in this example is time. If either of the last two stopping criteria is met before the first, then the Gauss–Newton algorithm failed to converge toward an accurate solution.
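The following sketch shows the shape of such a Gauss–Newton loop. It uses a forward finite-difference Jacobian (mirroring the numerical Jacobian approximation mentioned in the abstract) and solves each step with a least-squares solve, which is numerically equivalent to the normal-equations form of Equation (8); the tolerance and step size defaults are arbitrary choices, not the authors' settings.

```python
import numpy as np

def gauss_newton(loss, xi0, eps_gn=1e-12, p_max=50, h_fd=1e-7):
    """Iterate xi until one of the stopping criteria in Equation (9) is met."""
    xi = np.array(xi0, dtype=float)
    L_old = loss(xi)
    for p in range(p_max):
        J = np.empty((L_old.size, xi.size))
        for j in range(xi.size):             # column-by-column FD Jacobian
            e = np.zeros_like(xi); e[j] = h_fd
            J[:, j] = (loss(xi + e) - L_old) / h_fd
        xi += np.linalg.lstsq(J, -L_old, rcond=None)[0]   # solve J dxi = -L
        L_new = loss(xi)
        if (np.max(np.abs(L_new)) < eps_gn or
                np.max(np.abs(L_new - L_old)) < eps_gn):
            break
        L_old = L_new
    return xi
```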

2.2. Solving Systems of ODEs

The example of the general TPBVP solved via the TFC method in the previous walkthrough only contained a single ODE. TPBVPs composed of $n$ ODEs can also be solved via the TFC. Consider the following TPBVP:
$$\ddot{y}_i(t) = f_i\left(t, \mathbf{y}(t), \dot{\mathbf{y}}(t)\right) \qquad \text{subject to} \qquad y_i(t_{i_l}) = \beta_{i_l},$$
where $i \in (1, \dots, n)$, $l \in (1, \dots, N_{\text{const}_i})$, and $\mathbf{y} \in \mathbb{R}^n$. The $\beta_{i_l}$ constant in Equation (10) is the $l$-th point constraint on the $i$-th differentiation variable. Furthermore, $t_{i_l}$ is the time at which the $l$-th constraint occurs on the $i$-th differentiation variable. To solve this problem with the TFC, a constrained expression may be written for each $i$-th differentiation variable,
$$y_i(t) \approx Y_i(t) = g_i(t) + \sum_{l=1}^{N_{\text{const}_i}} \Omega_{i_l}(t)\,\varrho_{i_l}\left(t_{i_l}, g_i(t_{i_l})\right),$$
where $N_{\text{const}_i}$ represents the number of constraints on the $i$-th differentiation variable.
The free function, switching functions, and projection functionals can be derived/chosen following the same procedure as in the previous subsection. Likewise, each constrained expression can be discretized to form a system of algebraic equations,
$$L_i\left(\Xi, B, t_0, t_f\right) = \begin{bmatrix} \check{f}_i\left(\tau_0, \Xi, B, t_0, t_f\right) \\ \vdots \\ \check{f}_i\left(\tau_{N-1}, \Xi, B, t_0, t_f\right) \end{bmatrix} = \begin{bmatrix} c^2\dfrac{d^2 Y_i(\tau_0)}{d\tau^2} - f_i\left(\tau_0, \mathbf{Y}(\tau_0), c\dfrac{d\mathbf{Y}(\tau_0)}{d\tau}\right) \\ \vdots \\ c^2\dfrac{d^2 Y_i(\tau_{N-1})}{d\tau^2} - f_i\left(\tau_{N-1}, \mathbf{Y}(\tau_{N-1}), c\dfrac{d\mathbf{Y}(\tau_{N-1})}{d\tau}\right) \end{bmatrix}.$$
Consider that the $\xi_i$ and $\beta_i$ variables refer to the vector of Chebyshev coefficients and the vector of point constraints associated with the $i$-th differentiation variable, respectively. Then, the $\Xi$ and $B$ variables in Equation (11) represent the vectors of all Chebyshev coefficient vectors and point constraint vectors appended together as follows:
$$\Xi = \left[\xi_1, \dots, \xi_i, \dots, \xi_n\right]$$
and
$$B = \left[\beta_1, \dots, \beta_i, \dots, \beta_n\right].$$
Note that the lengths of $\xi_i$ and $\beta_i$ do not have to be identical. Furthermore, $\mathbf{Y} \in \mathbb{R}^n$ in Equation (11) is a vector of constrained expressions, where the $i$-th element approximates the $i$-th differentiation variable. An augmented algebraic system of equations can be formed as
$$L(\Xi) = \begin{bmatrix} L_1(\Xi) \\ \vdots \\ L_i(\Xi) \\ \vdots \\ L_n(\Xi) \end{bmatrix}.$$
We wrote the final form of the loss only in terms of the unknowns, as in the previous subsection. Substituting $\xi$ with $\Xi$ and $L(\xi)$ with $L(\Xi)$ in Equations (8) and (9) then allows $\Xi$ to be computed via the Gauss–Newton method. If all of the $f_i$ functions are linear, then linear least squares can be used.
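Since the $\xi_i$ vectors may have different lengths, a little bookkeeping is needed to pack them into the single unknown vector $\Xi$ and unpack them after the solve. A small sketch (the lengths are arbitrary examples):

```python
import numpy as np

lengths = (14, 18, 11)                  # assumed L_i per differentiation variable
xis = [np.zeros(Li) for Li in lengths]  # xi_1, ..., xi_n
Xi = np.concatenate(xis)                # Xi = [xi_1, ..., xi_n]
splits = np.cumsum(lengths)[:-1]
xi_1, xi_2, xi_3 = np.split(Xi, splits) # recover the pieces after the solve
```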

2.3. Domain Decomposition

The domain decomposition technique we employ is the same as that presented by Johnston and Mortari [22]. In their work, the authors developed a TFC algorithm that decomposes the domain for solving TPBVPs related to hybrid systems (i.e., systems where a time-sequence of DEs governs the dynamics), along with those that are not but have dynamics that contain transient or stiff behavior. One example is the fuel-optimal planetary landing problem whose control action is bang-bang [18] (i.e., switches between a maximum and minimum, and vice versa). However, the numerical examples provided only involved splitting the domain into two or three segments. This work uses the same procedure when the domain is split into many more segments. How the domain is decomposed is described here for the reader’s convenience. If the reader has any lingering questions, we recommend they read Refs. [18,22].
Assume a TPBVP with one ODE and two boundary conditions, as in Equation (1). To solve this problem with the domain decomposed, separate constrained expressions must be derived for each segment $S_k$ for $k = (1, \dots, K)$, where $K$ is the total number of segments. Between each segment, $C^1$ continuity must be enforced and computed together with the Chebyshev coefficients. Figure 1 depicts a solution to the TPBVP where the domain is decomposed into multiple segments. Be aware that $Y_t$ in the figure refers to the derivative of $Y$ with respect to $t$. First, let us discuss the segments at the ends of the domain, i.e., $S_1$ and $S_K$. The initial segment needs a constrained expression that analytically satisfies three constraints: $y(t_0) = y_0$, $y(t_1) = y_1$, and $\dot{y}(t_1) = \dot{y}_1$. The boundary condition gives the first constraint, $y_0$, while the last two, $y_1$ and $\dot{y}_1$, are continuity conditions with the next segment. Similar to the first segment, the last segment's constrained expression analytically satisfies three constraints as well: $y(t_{K-1}) = y_{K-1}$, $y(t_f) = y_f$, and $\dot{y}(t_{K-1}) = \dot{y}_{K-1}$. Following the TFC process already introduced, the constrained expressions for the first and last segments can be written as
$${}^{(1)}Y\left(t, {}^{(1)}g(t)\right) = {}^{(1)}g(t) + {}^{(1)}\Omega_1(t)\left[y_0 - {}^{(1)}g(t_0)\right] + {}^{(1)}\Omega_2(t)\left[y_1 - {}^{(1)}g(t_1)\right] + {}^{(1)}\Omega_3(t)\left[\dot{y}_1 - {}^{(1)}\dot{g}(t_1)\right]$$
and
$${}^{(K)}Y\left(t, {}^{(K)}g(t)\right) = {}^{(K)}g(t) + {}^{(K)}\Omega_1(t)\left[y_{K-1} - {}^{(K)}g(t_{K-1})\right] + {}^{(K)}\Omega_2(t)\left[y_f - {}^{(K)}g(t_f)\right] + {}^{(K)}\Omega_3(t)\left[\dot{y}_{K-1} - {}^{(K)}\dot{g}(t_{K-1})\right],$$
where the switching functions are
$$\begin{aligned}
{}^{(1)}\Omega_1(t) &= \frac{1}{(t_1 - t_0)^2}\left(t_1^2 - 2t_1 t + t^2\right) \\
{}^{(1)}\Omega_2(t) &= \frac{1}{(t_1 - t_0)^2}\left(t_0(t_0 - 2t_1) + 2t_1 t - t^2\right) \\
{}^{(1)}\Omega_3(t) &= \frac{1}{t_1 - t_0}\left(t_0 t_1 - (t_0 + t_1)t + t^2\right) \\
{}^{(K)}\Omega_1(t) &= \frac{1}{(t_f - t_{K-1})^2}\left(t_f(t_f - 2t_{K-1}) + 2t_{K-1}t - t^2\right) \\
{}^{(K)}\Omega_2(t) &= \frac{1}{(t_f - t_{K-1})^2}\left(t_{K-1}^2 - 2t_{K-1}t + t^2\right) \\
{}^{(K)}\Omega_3(t) &= \frac{1}{t_f - t_{K-1}}\left(-t_f t_{K-1} + (t_f + t_{K-1})t - t^2\right).
\end{aligned}$$
The switching functions for the first and last segments are different because the continuity conditions are at the end of the first segment and the beginning of the last segment. On the other hand, the constrained expressions for the interior segments, $S_k$ for all $k = (2, \dots, K-1)$, have identical switching functions because they always analytically satisfy four continuity constraints, two at each end of their domain: $y(t_{k-1}) = y_{k-1}$, $y(t_k) = y_k$, $\dot{y}(t_{k-1}) = \dot{y}_{k-1}$, and $\dot{y}(t_k) = \dot{y}_k$. Unlike the end segments, none of the constraints for the interior segments are associated with the boundary conditions. Thus, the interior segments only have to deal with enforcing continuity. The constrained expressions for these segments can be written as
$${}^{(k)}Y\left(t, {}^{(k)}g(t)\right) = {}^{(k)}g(t) + {}^{(k)}\Omega_1(t)\left[y_{k-1} - {}^{(k)}g(t_{k-1})\right] + {}^{(k)}\Omega_2(t)\left[y_k - {}^{(k)}g(t_k)\right] + {}^{(k)}\Omega_3(t)\left[\dot{y}_{k-1} - {}^{(k)}\dot{g}(t_{k-1})\right] + {}^{(k)}\Omega_4(t)\left[\dot{y}_k - {}^{(k)}\dot{g}(t_k)\right],$$
where the switching functions are
$$\begin{aligned}
{}^{(k)}\Omega_1(t) &= \frac{1}{(t_k - t_{k-1})^3}\left[t_k^2(t_k - 3t_{k-1}) + 6t_{k-1}t_k t - 3(t_{k-1} + t_k)t^2 + 2t^3\right] \\
{}^{(k)}\Omega_2(t) &= \frac{1}{(t_k - t_{k-1})^3}\left[-t_{k-1}^2(t_{k-1} - 3t_k) - 6t_{k-1}t_k t + 3(t_{k-1} + t_k)t^2 - 2t^3\right] \\
{}^{(k)}\Omega_3(t) &= \frac{1}{(t_k - t_{k-1})^2}\left[-t_{k-1}t_k^2 + t_k(2t_{k-1} + t_k)t - (t_{k-1} + 2t_k)t^2 + t^3\right] \\
{}^{(k)}\Omega_4(t) &= \frac{1}{(t_k - t_{k-1})^2}\left[-t_{k-1}^2 t_k + t_{k-1}(t_{k-1} + 2t_k)t - (2t_{k-1} + t_k)t^2 + t^3\right].
\end{aligned}$$
Now, assume the TPBVP is composed of $n$ ODEs. The free functions within the various constrained expressions can be expressed as truncated Chebyshev polynomials on the domain $\tau \in [-1, 1]$. Using Equations (4) and (5), we have
$${}^{(k)}g_i(t) = {}^{(k)}g_i(\tau), \qquad \frac{d\,{}^{(k)}g_i(\tau)}{dt} = c_k\,\frac{d\,{}^{(k)}g_i(\tau)}{d\tau}, \qquad \text{and} \qquad c_k = \frac{2}{t_k - t_{k-1}}.$$
Then, by discretizing the domains, the augmented loss vector that combines the loss vectors for each differentiation variable in each segment can be written as
$${}^{(k)}L\left({}^{(k)}\Xi, {}^{(k)}B, t_{k-1}, t_k\right) = \begin{bmatrix} {}^{(k)}L_1\left({}^{(k)}\Xi, {}^{(k)}B, t_{k-1}, t_k\right) \\ \vdots \\ {}^{(k)}L_i\left({}^{(k)}\Xi, {}^{(k)}B, t_{k-1}, t_k\right) \\ \vdots \\ {}^{(k)}L_n\left({}^{(k)}\Xi, {}^{(k)}B, t_{k-1}, t_k\right) \end{bmatrix},$$
where
$${}^{(k)}L_i\left({}^{(k)}\Xi, {}^{(k)}B, t_{k-1}, t_k\right) = \begin{bmatrix} \check{f}_i\left(\tau_0, {}^{(k)}\Xi, {}^{(k)}B, t_{k-1}, t_k\right) \\ \vdots \\ \check{f}_i\left(\tau_{N-1}, {}^{(k)}\Xi, {}^{(k)}B, t_{k-1}, t_k\right) \end{bmatrix} = \begin{bmatrix} c_k^2\dfrac{d^2\,{}^{(k)}Y_i(\tau_0)}{d\tau^2} - f_i\left(\tau_0, {}^{(k)}\mathbf{Y}(\tau_0), c_k\dfrac{d\,{}^{(k)}\mathbf{Y}(\tau_0)}{d\tau}\right) \\ \vdots \\ c_k^2\dfrac{d^2\,{}^{(k)}Y_i(\tau_{N-1})}{d\tau^2} - f_i\left(\tau_{N-1}, {}^{(k)}\mathbf{Y}(\tau_{N-1}), c_k\dfrac{d\,{}^{(k)}\mathbf{Y}(\tau_{N-1})}{d\tau}\right) \end{bmatrix}.$$
The unknown variable ${}^{(k)}\Xi$ represents the $k$-th segment's appended unknown Chebyshev coefficient vector,
$${}^{(k)}\Xi = \left[{}^{(k)}\xi_1, \dots, {}^{(k)}\xi_i, \dots, {}^{(k)}\xi_n\right].$$
Likewise, ${}^{(k)}B$ refers to the augmented vector of point constraints for the $k$-th segment and can be written as
$${}^{(k)}B = \left[{}^{(k)}\beta_1, \dots, {}^{(k)}\beta_i, \dots, {}^{(k)}\beta_n\right].$$
Not all point constraints are known in constrained expressions that approximate only a segment of the domain. Only the initial conditions on $S_1$ and final conditions on $S_K$ are known. All continuity conditions within a $k$-th segment and for an $i$-th differentiation variable are unknown and are represented by ${}^{(k)}\hat{\beta}_i$. Grouping the continuity conditions in a segment gives
$${}^{(k)}\hat{B} = \left[{}^{(k)}\hat{\beta}_1, \dots, {}^{(k)}\hat{\beta}_i, \dots, {}^{(k)}\hat{\beta}_n\right].$$
Likewise, grouping the continuity conditions in all segments gives
$$\hat{B} = \text{unique}\left({}^{(1)}\hat{B}, \dots, {}^{(k)}\hat{B}, \dots, {}^{(K)}\hat{B}\right),$$
where $\text{unique}(\cdot)$ means only the unique elements of the argument are retained. All unknowns can then be grouped to form
$$Z = \left[{}^{(1)}\Xi, \dots, {}^{(k)}\Xi, \dots, {}^{(K)}\Xi, \hat{B}\right].$$
Exactly how $\hat{B}$ can be built for a specific TPBVP is shown in Appendix A.
The complete augmented loss vector that contains all segments and forms the system of algebraic equations to be solved is
$$L(Z) = \begin{bmatrix} {}^{(1)}L(Z) \\ \vdots \\ {}^{(k)}L(Z) \\ \vdots \\ {}^{(K)}L(Z) \end{bmatrix}.$$
Once again, $L$ is only explicitly written in terms of $Z$ because the boundary conditions within ${}^{(1)}B$ and ${}^{(K)}B$ are known, and so too are $t_k$ for all $k = (1, \dots, K)$. The unknown $Z$ values can be computed using linear least squares when every $f_i$ is linear or the Gauss–Newton algorithm when any $f_i$ is nonlinear. The algebraic system formed from $L$ takes a block-diagonal form due to its segment-wise construction, allowing efficient sparse-matrix algorithms to be exploited.

3. Error Estimation

The most critical aspect of an hp-adaptive mesh refinement algorithm is how it relates the approximation error to when the order of the approximating polynomial should be increased (p-adaptive mesh refinement) or when an approximating polynomial should be divided into more segments (h-adaptive mesh refinement). For the TFC, increasing the approximating polynomial order corresponds to adding more basis functions to the constrained expression's free function (a truncated expansion of Chebyshev polynomials). In the previous section, it was shown that the TFC minimizes $\|L\|_\infty$ to obtain an accurate solution. However, a criterion for when to increase the number of basis functions in a constrained expression or split the constrained expression into multiple segments can be established by examining the constrained expression's truncation error. Therefore, that criterion is presented in this section, along with an error analysis of a few problems to demonstrate that a low truncation error corresponds with a small $\|L\|_\infty$.

3.1. Truncation Error of the Constrained Expression

Let $y(\tau)$ be a piecewise smooth bounded function on the interval $\tau \in [-1, +1]$ that has $N_{\text{const}}$ point constraints at $\tau_l$ with $l = 1, \dots, N_{\text{const}}$. Then, $y(\tau)$ can be approximated with a constrained expression $Y(\tau)$, where the free function is a linear combination of $L$ Chebyshev polynomials, $T_j$ for all $j = (N_{\text{const}}, \dots, L + N_{\text{const}} - 1)$. Assuming point constraints, the projection functionals within the constrained expression take the form
$$\varrho_l(\tau_l) = y(\tau_l) - \sum_{j=N_{\text{const}}}^{L+N_{\text{const}}-1} \xi_j\,T_j(\tau_l).$$
Hence, the approximation (constrained expression) takes the form
$$y(\tau) \approx Y(\tau) = \sum_{j=N_{\text{const}}}^{L+N_{\text{const}}-1} \xi_j\,T_j(\tau) + \sum_{l=1}^{N_{\text{const}}} \Omega_l(\tau)\left[y(\tau_l) - \sum_{j=N_{\text{const}}}^{L+N_{\text{const}}-1} \xi_j\,T_j(\tau_l)\right],$$
where $\Omega_l$ is the switching function for the $l$-th constraint. Since $y(\tau)$ is piecewise smooth and bounded, it can be represented exactly by an infinite Chebyshev polynomial series,
$$y(\tau) = \sum_{j=0}^{\infty} a_j\,T_j(\tau) = \sum_{j=0}^{N_{\text{const}}-1} a_j\,T_j(\tau) + \sum_{j=N_{\text{const}}}^{L+N_{\text{const}}-1} a_j\,T_j(\tau) + \sum_{j=L+N_{\text{const}}}^{\infty} a_j\,T_j(\tau),$$
where $a_j$ for all $j = (0, \dots, \infty)$ are the Chebyshev coefficients of the infinite series. The Euclidean norm, otherwise called the $L_2$-norm, of the error between $y(\tau)$ and $Y(\tau)$ satisfies the following inequality:
$$e = \left\|y(\tau) - Y(\tau)\right\|_2 = \left\|\sum_{j=0}^{\infty} a_j\,T_j(\tau) - \sum_{j=N_{\text{const}}}^{L+N_{\text{const}}-1} \xi_j\,T_j(\tau) - \sum_{l=1}^{N_{\text{const}}} \Omega_l(\tau)\left[y(\tau_l) - \sum_{j=N_{\text{const}}}^{L+N_{\text{const}}-1} \xi_j\,T_j(\tau_l)\right]\right\|_2 \leq e_T + e_A + e_C,$$
where $e_T$ is the truncation error, $e_A$ is the aliasing error, and $e_C$ is what we call the constraint error. They are given as
$$e_T = \left\|\sum_{j=L+N_{\text{const}}}^{\infty} a_j\,T_j(\tau)\right\|_2,$$
$$e_A = \left\|\sum_{j=N_{\text{const}}}^{L+N_{\text{const}}-1} \left(a_j - \xi_j\right)T_j(\tau)\right\|_2,$$
and
$$e_C = \left\|\sum_{j=0}^{N_{\text{const}}-1} a_j\,T_j(\tau) - \sum_{l=1}^{N_{\text{const}}} \Omega_l(\tau)\left[y(\tau_l) - \sum_{j=N_{\text{const}}}^{L+N_{\text{const}}-1} \xi_j\,T_j(\tau_l)\right]\right\|_2.$$
The aliasing and constraint errors can only be calculated if the function being approximated is known. A critical conclusion about the magnitude of the truncation error can be made from Rule of Thumb 1, as presented by Boyd [32] (p. 32):
Rule of Thumb 1. 
$e_A$ will be of the same order of magnitude as $e_T$.
Furthermore, through testing, we found that the constraint error is of a similar magnitude to the aliasing error. Thus, if the truncation error is low, so too should be the aliasing and constraint errors.
Note that $\|f\|_2 = \langle f, f\rangle^{1/2}$ is the norm induced by the inner product
$$\langle f, g\rangle = \int_{-1}^{1} f(\tau)\,g(\tau)\,d\tau.$$
Using the definition of the inner product in Equation (15), the orthogonality property of Chebyshev polynomials is
$$\langle T_m, T_n\rangle = \int_{-1}^{1} T_m(\tau)\,T_n(\tau)\left(1 - \tau^2\right)^{-1/2} d\tau = \frac{\pi}{2}\,\delta_{mn},$$
where $\delta_{mn}$ is the Kronecker delta. Multiplying Equation (14) by $\left(1 - \tau^2\right)^{-1/4}$ and applying the absolute homogeneity of norms gives
$$\left(1 - \tau^2\right)^{-1/4} e_T = \left(1 - \tau^2\right)^{-1/4}\left\|\sum_{j=L+N_{\text{const}}}^{\infty} a_j\,T_j(\tau)\right\|_2 = \left\|\sum_{j=L+N_{\text{const}}}^{\infty} a_j\,T_j(\tau)\left(1 - \tau^2\right)^{-1/4}\right\|_2.$$
Expanding the norm in Equation (17), substituting in Equation (15), and applying Equation (16) gives a new expression for $e_T$,
$$e_T = \left(1 - \tau^2\right)^{1/4}\left\|\sum_{j=L+N_{\text{const}}}^{\infty} a_j\,T_j(\tau)\left(1 - \tau^2\right)^{-1/4}\right\|_2 = \left(1 - \tau^2\right)^{1/4}\left[\int_{-1}^{1}\sum_{j=L+N_{\text{const}}}^{\infty} a_j^2\,T_j^2(\tau)\left(1 - \tau^2\right)^{-1/2} d\tau\right]^{1/2} = \left(1 - \tau^2\right)^{1/4}\left[\sum_{j=L+N_{\text{const}}}^{\infty} a_j^2 \int_{-1}^{1} T_j^2(\tau)\left(1 - \tau^2\right)^{-1/2} d\tau\right]^{1/2} = \left(\frac{\pi}{2}\right)^{1/2}\left(1 - \tau^2\right)^{1/4}\left(\sum_{j=L+N_{\text{const}}}^{\infty} a_j^2\right)^{1/2}.$$
It has been shown that the Chebyshev series has super-geometric convergence when $y(\tau)$ is analytic in a neighborhood of $\tau \in [-1, +1]$ [32,33]. Hence, the Chebyshev coefficients of the infinite expansion, $a_j$ for all $j = (0, \dots, \infty)$, decay as
$$|a_j| \sim \mathcal{O}\left(s\,e^{-\tilde{q}\,j^r}\right),$$
with constants $s, \tilde{q} > 0$ and $r > 1$. Following in the spirit of Liu et al. [27], we note that a slightly smaller value $q > 0$ and a translation $\delta$ exist such that
$$s\,e^{-\tilde{q}\,j^r} \leq s\,e^{-\delta}e^{-qj}, \qquad |a_j| \leq s\,e^{-\delta}e^{-qj}, \qquad \delta = \min_j\left(\ln s - qj - \ln|a_j|\right).$$
The exponential upper bound is more convenient because it allows the Chebyshev coefficients to be approximated with an exponential least-squares fit. The minimum in Equation (19) represents a translation that ensures every value of $a_j$ is below the bound. Since the solution of the problem being solved is often not known, the $|a_j|$ values cannot be computed to perform the regression, but the truncated coefficients from the constrained expression, $|\xi_j|$, can. Based on Rule of Thumb 1, $a_j \approx \xi_j$ when the truncation error is low. Thus, simply replace $a_j$ with $\xi_j$ in Equation (19) to obtain an upper bound estimate for $a_j$ for all $j = (N_{\text{const}}, \dots, L + N_{\text{const}} - 1)$. Extrapolation can then be used to obtain an upper bound for $j = (0, \dots, \infty)$. As a result, in this work, the upper bound for $|a_j|$ is approximated by
$$\ln|a_j| \approx \ln|\xi_j| \leq \ln s - \delta - qj,$$
with $s, q > 0$, $j = (N_{\text{const}}, \dots, L + N_{\text{const}} - 1)$, and
$$\delta = \min_j\left(\ln s - qj - \ln|\xi_j|\right).$$
The $s$ and $q$ coefficients are obtained by performing an exponential least-squares fit on $|\xi_j|$,
$$\ln|\xi_j| = \ln s - qj.$$
Note that $q$ represents the exponential decay rate of the Chebyshev coefficients. Furthermore, an upper bound of $e_T$ can be derived by plugging Equation (20) into Equation (18),
$$e_T \leq \left(\frac{\pi}{2}\right)^{1/2}\left(1 - \tau^2\right)^{1/4}\left[\sum_{j=L+N_{\text{const}}}^{\infty} e^{2\left(\ln s - \delta - qj\right)}\right]^{1/2} = \left(\frac{\pi}{2}\right)^{1/2}\left(1 - \tau^2\right)^{1/4}\left[\sum_{j=L+N_{\text{const}}}^{\infty} s^2\,e^{-2\delta}\,e^{-2qj}\right]^{1/2}.$$
The summation in Equation (22) is a geometric series with common ratio $e^{-2q}$,
$$\sum_{j=L+N_{\text{const}}}^{\infty} s^2\,e^{-2\delta}\,e^{-2qj} = \frac{s^2\,e^{-2\delta}\,e^{-2q\left(L+N_{\text{const}}\right)}}{1 - e^{-2q}}.$$
Thus, Equation (22) becomes
$$e_T \leq \left(\frac{\pi}{2}\right)^{1/2}\left(1 - \tau^2\right)^{1/4}\left[\frac{s^2\,e^{-2\delta}\,e^{-2q\left(L+N_{\text{const}}\right)}}{1 - e^{-2q}}\right]^{1/2} = \pi^{1/2}\left(1 - \tau^2\right)^{1/4}\frac{s\,e^{-\delta}\,e^{-q\left(L+N_{\text{const}}\right)}}{\left[2\left(1 - e^{-2q}\right)\right]^{1/2}} \leq \frac{\pi^{1/2}\,s\,e^{-\delta - q\left(L+N_{\text{const}}\right)}}{\left[2\left(1 - e^{-2q}\right)\right]^{1/2}} = \hat{e}_T.$$
From Equations (20) and (23), one can see that the coefficients $|\xi_j|$ and the approximate truncation error upper bound $\hat{e}_T$ decrease at the same exponential rate $q$, a result similar to that seen in Liu et al. [27]. Consequently, the Chebyshev coefficients $\xi_j$ for all $j = (N_{\text{const}}, \dots, L + N_{\text{const}} - 1)$ can be used to estimate the decay rate, $q$, of the truncation error as a function of the free function's Chebyshev polynomial degree given in Equation (13).
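A compact sketch of this estimate is shown below: fit $\ln|\xi_j|$ with a line (Equation (21)), translate it by $\delta$ (Equation (20)), and evaluate the closed-form bound (Equation (23)). It assumes the coefficients passed in have already been filtered as in Section 4.1, since coefficients at machine zero would corrupt the logarithms.

```python
import numpy as np

def truncation_error_bound(xi, n_const):
    """Decay rate q and bound e_T_hat from the truncated coefficients xi."""
    L = xi.size
    j = np.arange(n_const, n_const + L)
    ln_xi = np.log(np.abs(xi))
    slope, ln_s = np.polyfit(j, ln_xi, 1)        # ln|xi_j| ~ ln(s) - q*j
    q = -slope
    delta = np.min(ln_s - q * j - ln_xi)         # translate so the line bounds all |xi_j|
    e_T_hat = (np.sqrt(np.pi) * np.exp(ln_s - delta - q * (L + n_const))
               / np.sqrt(2.0 * (1.0 - np.exp(-2.0 * q))))
    return q, e_T_hat
```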
Before moving on, the reader may have noticed that the $L_2$-norm is used to derive the analytical expression for the truncation error (see Equation (14)). The $L_2$-norm had to be used to derive the expression using the orthogonality property of Chebyshev polynomials (Equation (16)). On the other hand, the accuracy of the TFC is quantified by taking the $L_\infty$-norm of the losses (see Equation (9)). In general, the $L_2$-norm of the loss can also be used to quantify the accuracy of the TFC, but we chose the $L_\infty$-norm to align with the only textbook on the TFC [1]. Although the $L_2$-norm of the truncation error and the $L_\infty$-norm of the loss are two separate measurements, for one to be used for hp-adaptive mesh refinement and the other as a stopping criterion, both must correlate: when one decreases, so must the other, and vice versa. This correlation is demonstrated in Section 3.3 and Section 3.4.

3.2. Determining Function Smoothness

How to determine when to add basis functions to a constrained expression within a segment or split it up is a critical aspect of our mesh refinement algorithm. In the same vein as Liu et al. [27], this decision can be made based on the decay rate of the $\xi_j$ coefficients and the truncation error, i.e., the $q$ value from Equations (21) and (23). The main distinction between our work and [27] is that our decay rate comes from an exponential fit on the truncated Chebyshev expansion coefficients within an approximating constrained expression, whereas the decay rate from [27] comes from an exponential fit on the truncated Legendre expansion coefficients computed from an approximating Lagrange polynomial. Similar to [27], the function is smooth enough to be approximated with a single polynomial if $q$ is large. In other words, only basis functions need to be added to bring the error down to some desired tolerance. If $q$ is sufficiently small, then the function is not smooth enough to be approximated with a single polynomial, and a piecewise polynomial should be employed to approximate it. The cutoff decay rate value $\bar{q}$, which decides when to add basis functions or split a segment, is a design choice determined by the engineer. When $q \geq \bar{q}$, basis functions should be added to the free function of the constrained expression to improve the approximation. On the other hand, when $q < \bar{q}$, the segment the constrained expression approximates should be divided to improve the approximation. In the next two subsections, error analysis studies are carried out on a normal TPBVP with a smooth solution and a hybrid TPBVP with a nonsmooth solution to demonstrate agreement between the decay rate of the $\xi_j$ coefficients and the truncation error's maximum upper bound $\hat{e}_T$, as well as to determine how they correspond to the aliasing error, constraint error, actual error, and $L_\infty$-norm of the loss.
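The resulting decision rule is simple enough to state in a few lines (a sketch; the cutoff $\bar{q}$ is the user's design choice):

```python
def refine_action(q, q_bar):
    """Section 3.2 rule: p-refine smooth segments, h-refine nonsmooth ones."""
    return "add_basis_functions" if q >= q_bar else "split_segment"
```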

3.3. Case 1 Error Analysis: TPBVP with a Smooth Solution

Consider the following TPBVP:
$$2\ddot{y}_1(t) - y_1(t) = 4\sin(3t), \qquad \text{subject to:} \qquad y_1(0) = 0, \quad y_1(1) = 0,$$
which has the analytical solution
$$y_1(t) = c_1\,e^{t/\sqrt{2}} + c_2\,e^{-t/\sqrt{2}} - \frac{4}{19}\sin(3t),$$
where
$$c_1 = \frac{4\,e^{1/\sqrt{2}}\sin(3)}{19\left(e^{\sqrt{2}} - 1\right)}$$
and
$$c_2 = -\frac{4\,e^{1/\sqrt{2}}\sin(3)}{19\left(e^{\sqrt{2}} - 1\right)}.$$
It can be seen that $y_1(t)$ is a smooth function. Using the TFC procedure outlined in Section 2, Equation (24) can be solved by approximating $y_1$ as a constrained expression with the truncated Chebyshev coefficients $\xi_j$. Using the $\xi_j$ values computed via the TFC, a least-squares exponential fit of the form shown in Equation (21) can be constructed. Equation (20) can then be used to build an upper bound on the $\xi_j$ values. Subsequently, an upper bound on the truncation error, $\hat{e}_T$, can be computed, as shown in Equation (23). The aliasing error $e_A$ and the constraint error $e_C$ can also be computed because the true solution is known. Only the infinite Chebyshev expansion coefficients $a_j$ have to be computed, which we performed using the open-source MATLAB toolbox Chebfun [34]. Figure 2 shows how the actual error; two variants of $\hat{e}_T$; $e_A$; $e_C$; two estimated total error bounds ($e = \hat{e}_T + e_A + e_C$); the $L_\infty$-norm of the loss shown in Equation (6); and the final Chebyshev coefficient $\xi_{L+N_{\text{const}}-1}$ changed as the number of Chebyshev basis functions $L$ increased. The number of collocation points $N$ was set equal to $L + 3$. Note that the results in Figure 2 correspond to Rule of Thumb 2 from Boyd [32] (p. 51), which states the following:
Rule of Thumb 2. 
The truncation error is the same order of magnitude as the last coefficient retained in the truncation for a series with geometric convergence.
For $L \geq 15$, the $\xi_{L+N_{\text{const}}-1}$ coefficients are so low in value that they are negligible. Hence, the accuracy of the solution will not improve by adding more than 15 basis functions to the Chebyshev polynomial. This can be observed by following the $\|L\|_\infty$ and $e$ curves in Figure 2. All coefficients with $j > L_R$, where $L_R = 15$, are said to lie on the roundoff plateau [32] (pp. 30–31) because their values are less than the machine-level error, which is $1 \times 10^{-16}$. Since the inclusion of Chebyshev series terms with coefficients on the roundoff plateau has no effect on the accuracy of the solution, they do not have to be included in the exponential regression. Not including them results in an improved upper bound (IUB) for the non-negligible $\xi_j$ coefficients, i.e., those that are off the roundoff plateau. Including the $\xi_j$ coefficients that are on the roundoff plateau in the exponential regression results in a conservative upper bound (CUB). The resulting truncation errors from using a CUB or IUB on the $\xi_j$ coefficients are referred to as the CUB $\hat{e}_T$ and IUB $\hat{e}_T$ in Figure 2. The CUB and IUB $\hat{e}_T$ curves are practically identical until $L > L_R$. The CUB causes $\hat{e}_T$ to level out, while the IUB causes $\hat{e}_T$ to keep decreasing for larger values of $L$. One of the main takeaways from Figure 2 is that it validates the truncation error bound: the actual error is always below the estimated total error bounds with both the CUB and IUB $\hat{e}_T$. Figure 2 also shows that when $\hat{e}_T$ is low, so too is $\|L\|_\infty$. Lastly, both truncation errors follow $\xi_{L+N_{\text{const}}-1}$, demonstrating that the decay rate of the $\xi_j$ coefficients is the same as that of the truncation error for a smooth analytic function.
Figure 3 aids in the analysis of the exponential regression performed on the $\xi_j$ values computed via the TFC for solving Equation (24). Subplot (a) shows the $\xi_j$ coefficients for $j = (2, \dots, L)$ when $L = 30$, the exponential fit performed on all coefficients (Exp. Fit 1), the exponential fit performed on only the coefficients off the roundoff plateau (Exp. Fit 2), the CUB, and the IUB. From the subplot, one can clearly see that the IUB bounds the $\xi_j$ coefficients off the roundoff plateau more tightly than the CUB. Furthermore, it is clear that geometric convergence is maintained until the roundoff plateau is reached. Subplot (b) shows the goodness of the exponential fit according to the coefficient of determination $R^2$ as $L$ increases. $R^2$ can be expressed as
$$R^2 = 1 - \frac{\sum_{j=N_{\text{const}}}^{L+N_{\text{const}}}\left(|\xi_j| - f_j\right)^2}{\sum_{j=N_{\text{const}}}^{L+N_{\text{const}}}\left(|\xi_j| - \text{mean}(\xi)\right)^2},$$
where $f_j$ is the value of the fit for the $j$-th coefficient and
$$\text{mean}(\xi) = \frac{1}{L+1}\sum_{j=N_{\text{const}}}^{L+N_{\text{const}}} |\xi_j|.$$
From subplot (a), one can see that the exponential fit is not very good when all $\xi_j$ coefficients are included in the regression. This is confirmed by subplot (b): $R^2$ for Exp. Fit 1 moves further from 1 as more $\xi_j$ coefficients with $j > L_R$ are included in the regression. However, when only the $\xi_j$ coefficients off the roundoff plateau are fitted, $R^2 \to 1$ as $j$ increases. Subplot (c) shows the decay rates $q$ computed via Exp. Fit 1 and Exp. Fit 2 as $L$ increases. One can see that all decay rates are greater than 1.5, which is emblematic of the $y_1$ function's smoothness. The different decay rates are identical until $L = L_R = 15$. The Exp. Fit 2 decay rates for $L > L_R$ are constant because the $\xi_j$ coefficients included in the fit are only those for $j = 2, \dots, L_R$. One can see that the Exp. Fit 1 decay rates are greater than those of Exp. Fit 2 directly after $L_R$ is reached but then start to decrease quickly as $L$ is further increased. Hence, when $L = 50$, the decay rate for Exp. Fit 1 could be close to 0. This means that a smooth function could have a very low decay rate if the roundoff plateau is included in the exponential regression of the $\xi_j$ coefficients. Therefore, Exp. Fit 2 is preferred over Exp. Fit 1.
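For reference, a sketch of the $R^2$ computation used to judge the exponential fits, following the formula above (the fit itself is performed in log space and the fitted values are mapped back to $|\xi_j|$ before $R^2$ is evaluated, which matches how the formula is stated):

```python
import numpy as np

def exp_fit_r2(xi, n_const):
    """Coefficient of determination of the exponential fit on |xi_j|."""
    j = np.arange(n_const, n_const + xi.size)
    slope, intercept = np.polyfit(j, np.log(np.abs(xi)), 1)
    f = np.exp(intercept + slope * j)    # fitted values mapped back from logs
    y = np.abs(xi)
    return 1.0 - np.sum((y - f) ** 2) / np.sum((y - y.mean()) ** 2)
```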

3.4. Case 2 Error Analysis: Hybrid TPBVP with a Nonsmooth Solution

Consider the hybrid TPBVP
$$\ddot{y}(t) = t^2 + a, \qquad \text{subject to:} \qquad y(0) = 0, \quad y(1) = 0, \qquad \text{where:} \quad a = \begin{cases} 0 & \text{for } t \leq 0.5 \\ 1 & \text{for } t > 0.5, \end{cases}$$
which has a discontinuous solution. We instead find a smoothed solution by transforming Equation (25) into
$$\ddot{y}_2(t) = t^2 + 1 - \frac{1}{2} - \frac{1}{2}\tanh\left(\frac{\gamma - t}{\alpha}\right), \qquad \text{subject to:} \qquad y_2(0) = 0, \quad y_2(1) = 0,$$
which has the analytical solution
$$y_2(t) = \frac{1}{12}\left[-3\alpha^2\,\text{Li}_2\left(-e^{-\frac{2(\gamma - t)}{\alpha}}\right) + 6\alpha\ln(2)\,(\gamma - t) - 3\gamma^2 + 6\gamma t + t^4\right] + c_1 t + c_2,$$
where
$$c_1 = \frac{1}{4}\alpha^2\,\text{Li}_2\left(-e^{-\frac{2(\gamma - 1)}{\alpha}}\right) - \frac{1}{4}\alpha^2\,\text{Li}_2\left(-e^{-\frac{2\gamma}{\alpha}}\right) - \frac{\gamma}{2} + \frac{\alpha\ln(2)}{2} - \frac{1}{12},$$
$$c_2 = \frac{1}{4}\alpha^2\,\text{Li}_2\left(-e^{-\frac{2\gamma}{\alpha}}\right) - \frac{\gamma\,\alpha\ln(2)}{2} + \frac{\gamma^2}{4},$$
and
$$\text{Li}_2(x) = -\int_0^x \frac{\ln(1 - \tau)}{\tau}\,d\tau.$$
The $\gamma$ variable represents the discontinuity point, which is set to $\gamma = 0.5$. The $\alpha$ variable determines how smooth the region around $t = \gamma$ is. The closer $\alpha$ is to 0, the sharper the gradient at $t = \gamma$, and the closer the solution of Equation (26) is to that of Equation (25). Thus, we set $\alpha = 1.5 \times 10^{-3}$. An error analysis study, similar to that shown in Section 3.3, was then performed. The main difference from the previous study is that $y_2$ is nearly nonsmooth. This results in the $\xi_j$ coefficients, computed via the TFC, oscillating severely between local maxima and local minima. Hence, in this study, Exp. Fit 2 refers to performing an exponential regression on only the local maxima of the $\xi_j$ coefficients. Exp. Fit 1 refers to performing an exponential regression on all $\xi_j$ coefficients, just like in Section 3.3. The resulting CUB and IUB come from using Equation (20) with Exp. Fit 1 and Exp. Fit 2, respectively.
Error curves of the TFC solutions to Equation (26) as $L$ increases are shown in Figure 4. Compared with Figure 2, many more basis functions are required for the actual error to converge. Furthermore, the $L_\infty$-norm of the loss is still decreasing and greater than the actual error. The reason for this is that the sharp gradient that replaces the discontinuity in Equation (25) appears in $\ddot{y}_2$, which contributes to the loss equation, while the actual error is only for the constrained expression of $y_2$. Another important observation from Figure 4 is that both truncation error bounds are well above the actual error, meaning they do not accurately represent the total error. Rule of Thumb 1, as described in Section 3.1, does not seem to apply. However, this makes sense because our selection of $\alpha = 1.5 \times 10^{-3}$ creates such a sharp gradient that the solution is difficult to approximate with a global polynomial, i.e., it is nearly nonsmooth. Nevertheless, the IUB $\hat{e}_T$ is much closer to the actual error and the $L_\infty$-norm of the loss curve than the CUB $\hat{e}_T$. This is because the IUB $\hat{e}_T$ is, as its name suggests, an improved upper bound, considering only the local maxima of the $\xi_j$ coefficients during the exponential regression, as shown in Figure 5.
Subplot (a) in Figure 5 shows the $\xi_j$ coefficients for $j = (2, \dots, L)$ belonging to the TFC solution when $L = 4000$, the exponential fit performed on all coefficients (Exp. Fit 1), the exponential fit performed on only the local maxima (Exp. Fit 2), the CUB, and the IUB. As can clearly be seen, the coefficients fluctuate significantly. This results from the Gibbs phenomenon, which is the oscillatory behavior of an approximation around a jump discontinuity, or in our case, a sharp gradient nearly identical to a jump discontinuity. The oscillatory behavior of the coefficients causes their exponential fit to be poor, as seen in subplot (b), which shows the $R^2$ value as $L$ increases. When an exponential fit of only the local maximum $\xi_j$ coefficients is performed, the $R^2$ values are much closer to 1, and the resulting upper bound is much less conservative while still bounding all coefficients. Comparing Figure 3a with Figure 5a, one can see that the local maximum $\xi_j$ coefficients for approximating $y_2$ yield algebraic convergence instead of geometric convergence. This is a sign that $y_2$ is very close to not being smooth, if not already nonsmooth. Indeed, geometric convergence is only guaranteed for a truncated series of Chebyshev polynomials if the function they approximate is continuous and smooth.
Since the local maxima of the $\xi_j$ coefficients for approximating $y_2$ yield algebraic convergence, Rule of Thumb 2 does not apply (which can be seen in Figure 4). Also, the IUB in subplot (a) leaves a lot of room between itself and the local maximum $\xi_j$ coefficients as $j$ increases. Hence, the translated exponential fit shown in Equation (20) is not the best representation for nonsmooth functions. Nonetheless, the translated exponential fit is still a sufficient upper bound, meaning the resulting IUB $\hat{e}_T$ is still valid. Other than the global polynomial not approximating the solution well, another reason why Rule of Thumb 1 is not valid in this case is that the IUB $\hat{e}_T$ is still a conservative upper bound, although not as conservative as the CUB $\hat{e}_T$. Therefore, Exp. Fit 2 is better than Exp. Fit 1 because it leads to the IUB $\hat{e}_T$. Figure 4 shows that the IUB $\hat{e}_T$ is closer to the $L_\infty$-norm of the loss and, as such, correlates better with how the TFC method determines the accuracy of a solution.
An indication that the approximated function is not smooth, other than the local maxima of the $\xi_j$ coefficients yielding algebraic convergence, is the decay rate of the exponential fits, as shown in subplot (c). The decay rates of the exponential fits as $L$ increases are all very close to 0. Thus, since the solution to Equation (26) is very close to not being smooth (i.e., $\alpha = 1.5 \times 10^{-3}$), the decay rate of the $\xi_j$ coefficients alone can be used to determine that the actual solution is nonsmooth, or very close to it. When this is the case, the segment of the domain the constrained expression approximates should be divided.
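A sketch of Exp. Fit 2 for this oscillatory case, regressing only on the local maxima of $|\xi_j|$ (the simple neighbor test and the inclusion of the endpoints are implementation assumptions, not details given in the text):

```python
import numpy as np

def local_max_fit(xi, n_const):
    """Decay rate q and scale s fit to the local maxima of |xi_j| only."""
    a = np.abs(xi)
    j = np.arange(n_const, n_const + a.size)
    i = np.arange(1, a.size - 1)
    is_max = (a[i] > a[i - 1]) & (a[i] > a[i + 1])
    idx = np.concatenate(([0], i[is_max], [a.size - 1]))  # keep endpoints too
    slope, ln_s = np.polyfit(j[idx], np.log(a[idx]), 1)
    return -slope, np.exp(ln_s)
```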

4. hp-Adaptive Mesh Refinement Algorithm

The proposed hp-adaptive algorithm for solving hypersensitive TPBVPs begins by constructing an initial mesh $M_{h=0}$ and formulating the unconstrained system of ODEs via the TFC. All meshes $M_h$ are composed of mesh segments $S_k = [t_k, t_{k+1}]$ for all $k = (1, \dots, K)$, the number of basis functions for each constrained expression on each segment, ${}^{(h,k)}L_i$, and the number of collocation points on each segment of each mesh, ${}^{(h,k)}N$. Note that the subscript $h$ refers to an iteration of the hp-adaptive algorithm. Hence, each iteration of the algorithm has its own mesh. Furthermore, the number of basis functions does not have to be the same among the constrained expressions on $S_k$ belonging to $M_h$. However, the number of collocation points for all constrained expressions on the same segment must be identical. Therefore, we express ${}^{(h,k)}N$ as
$${}^{(h,k)}N = \max_i\left({}^{(h,k)}L_i\right) + N^+,$$
where $N^+$ is a user-defined parameter.
With $M_0$ and the unconstrained system of ODEs formed via the TFC, an iteration of the hp-adaptive mesh refinement algorithm begins by solving the algebraic system of equations representing the unconstrained system of ODEs on $M_0$ with the Gauss–Newton method or linear least squares. Afterward, the truncated Chebyshev series terms/coefficients on the roundoff plateau or at the local minima of significant oscillation are eliminated from the decay rate and bounded truncation error computations of the constrained expressions. The reason for this was explained in Section 3.3 and Section 3.4, and the method for accomplishing it is explained in Section 4.1. The estimated truncation error bound and decay rates, as described in Section 3, for each constrained expression in each interval can then be computed. If a $k$-th segment's maximum absolute loss on $M_h$, ${}^{(h,k)}\|L\|_\infty$, is greater than some prescribed tolerance, $\epsilon_{\text{MR}}$, then the ${}^{(h+1,k)}L_i$ for all $i = (1, \dots, n)$ are increased or decreased, or the mesh interval is divided into subintervals for the next mesh iteration (see Section 4.2 and Section 4.3). The computational efficiency of subsequent mesh iterations can be improved by combining neighboring pairwise segments that are "super smooth" or by combining all neighboring segments that contain the minimum number of basis functions, $L_{\min}$, in their constrained expressions (see Section 4.4). With the next mesh defined as $M_{h+1}$, the process can be repeated until a maximum number of mesh iterations is reached, $h = H$, or ${}^{(h,k)}\|L\|_\infty \leq \epsilon_{\text{MR}}$ for all $k = (1, \dots, K)$. An outline of the process is given in Section 4.5.

4.1. Disregarding Unnecessary Truncated Chebyshev Series Terms for Bounded Truncation Error Computation

In Section 3.1, we showed that a bounded exponential least-squares approximation of the truncated Chebyshev coefficients is necessary to obtain an expression for the maximum bound of a constrained expression's truncation error. If the bounded exponential least-squares approximation of the Chebyshev coefficients is bad, then so too is the computed truncation error bound of the constrained expression. By "bad", we do not only mean that the bounds are invalid. We also mean cases where the Chebyshev coefficients and the actual truncation error sit well below their respective bounds (i.e., the bounds are too conservative). In the next subsection, it is mathematically shown that conservative bounds lead to overly large mesh sizes. Hence, conservative bounds are undesirable. Obviously, bounds that are not valid are also undesirable.
In Section 3.3 and Section 3.4, it was shown that performing an exponential least-squares fit on all $\xi_j$ coefficients can lead to an incorrect or highly conservative bound for $\xi_j$. However, it was also shown that performing an exponential least-squares fit without the $\xi_j$ values that are on the roundoff plateau and without the local minima of significant oscillation leads to much better bounds. Representing the set of indices corresponding to the coefficients not on the roundoff plateau and not at local minima of significant oscillation as $\varsigma$, the exponential least-squares fit equation for computing q and s can then be given as
$$ \ln|\xi_{\varsigma}| = \ln s - q\,\varsigma. $$
To determine the j indices not included in $\varsigma$, we take the coefficients on the roundoff plateau to be those that are smaller than twice machine epsilon. Furthermore, coefficients are deemed local minima of significant oscillation when they are more than 3 orders of magnitude smaller than their neighboring coefficients.
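A sketch of this screening and of the fit above is given below; `xi` is the coefficient vector of one constrained expression, the helper name is hypothetical, and the thresholds follow the rules just stated.

```matlab
% Screen the truncated Chebyshev coefficients, then fit ln|xi| = ln(s) - q*j
% by linear least squares on the surviving indices (the set varsigma).
function [q, s, varsigma] = fitDecayRate(xi)
    a = abs(xi(:));
    j = (0:numel(a)-1)';                  % coefficient indices
    keep = a >= 2*eps;                    % drop the roundoff plateau (< 2*machine eps)
    for m = 2:numel(a)-1                  % drop local minima of significant oscillation:
        if a(m) < 1e-3*a(m-1) && a(m) < 1e-3*a(m+1)  % > 3 orders below both neighbors
            keep(m) = false;
        end
    end
    varsigma = j(keep);
    A = [ones(nnz(keep), 1), -varsigma];  % model: ln|xi_j| = ln(s) - q*j
    c = A \ log(a(keep));                 % linear least squares
    s = exp(c(1));
    q = c(2);
end
```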

4.2. Increasing or Decreasing the Number of Basis Functions

Suppose ${}^{(h,k)}L_\infty > \epsilon_{MR}$ and $q \ge \bar{q}$. Then, all constrained expressions in $S_k$ are considered to be smooth. Every ${}^{(h+1,k)}L_i$ can then be increased to reduce ${}^{(h+1,k)}L_\infty$. Let ${}^{(h,k)}\hat{e}_{T_i}$ denote the constrained expression's estimate of the upper truncation error bound over interval $S_k$, on mesh h, and for the i-th differentiation variable. Then, we have the relationship
$$ {}^{(h,k)}\hat{e}_{T_i} = \pi^{1/2}\, s\, \frac{\exp\!\left(\delta - q\left({}^{(h,k)}L_i + {}^{(h,k)}N_{\mathrm{const}_i} + \tfrac{1}{2}\right)\right)}{\left(1 - \exp(-2q)\right)^{1/2}}, \tag{27} $$
where ${}^{(h,k)}N_{\mathrm{const}_i}$ is the number of constraints on the i-th differentiation variable in $S_k$ belonging to $M_h$. On the next mesh, $M_{h+1}$, it is desired to achieve ${}^{(h+1,k)}\hat{e}_{T_i} = \varepsilon_T \approx 0$ because, as Section 3.3 shows, a small ${}^{(h+1,k)}\hat{e}_{T_i}$ corresponds to a small ${}^{(h+1,k)}L_{\infty_i}$. There is a specific value of ${}^{(h+1,k)}L_i$ that yields ${}^{(h+1,k)}\hat{e}_{T_i} = \varepsilon_T$,
$$ \varepsilon_T = \pi^{1/2}\, s\, \frac{\exp\!\left(\delta - q\left({}^{(h+1,k)}L_i + {}^{(h+1,k)}N_{\mathrm{const}_i} + \tfrac{1}{2}\right)\right)}{\left(1 - \exp(-2q)\right)^{1/2}}. \tag{28} $$
Dividing Equation (27) by Equation (28), recognizing that ${}^{(h,k)}N_{\mathrm{const}_i} = {}^{(h+1,k)}N_{\mathrm{const}_i}$ when the intervals are not changed, and solving for ${}^{(h+1,k)}L_i$ gives
$$ {}^{(h+1,k)}L_i = {}^{(h,k)}L_i + \left\lceil \frac{1}{q} \ln\!\left(\frac{{}^{(h,k)}\hat{e}_{T_i}}{\varepsilon_T}\right) \right\rceil, \tag{29} $$
where q is calculated as shown in Section 4.1. Note that $\lceil \cdot \rceil$ rounds its argument up to the next integer, which is necessary because the number of basis functions must be an integer that strictly increases to obtain a lower ${}^{(h+1,k)}\hat{e}_{T_i}$. In Equation (29), one can see that if ${}^{(h,k)}\hat{e}_{T_i} < \varepsilon_T$, then the number of basis functions decreases (i.e., ${}^{(h+1,k)}L_i < {}^{(h,k)}L_i$). This is ideal because it allows the number of basis functions for specific constrained expressions in a specific segment to decrease while still having ${}^{(h+1,k)}L_\infty \le \epsilon_{MR}$. Allowing smoother constrained expressions to decrease their ${}^{(h,k)}L_i$ values reduces the computational complexity of obtaining the Jacobians and optimizing for the Chebyshev coefficients. Lastly, note that $\varepsilon_T$ is a user-defined value but should be as low as possible.
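A sketch of this p-refinement update is given below; `Lold`, `eT`, and `q` are assumed to hold the current segment's basis count, truncation error bound estimate, and fitted decay rate for one differentiation variable.

```matlab
% Equation (29): new basis count from the ratio of the current truncation
% error bound estimate eT to the desired truncation error epsT.
epsT = 1e-15;                               % user-defined (Table 1)
Lnew = Lold + ceil((1/q) * log(eT / epsT)); % shrinks when eT < epsT
```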

4.3. Dividing a Mesh Interval

Let ${}^{(h,k)}L_\infty > \epsilon_{MR}$ and $q < \bar{q}$. Then, the value of ${}^{(h+1,k)}L_i$ computed with Equation (29) will be very large for some small $\varepsilon_T$. Large ${}^{(h+1,k)}L_i$ values produce large matrices to invert in the Gauss–Newton and linear least-squares algorithms, which raises the condition number. This can cause both algorithms to become numerically unstable and fail. Instead of increasing the number of basis functions in a mesh interval, the interval can be divided. The number of subintervals, $V_k$, into which $S_k$ is divided is based on the number of basis functions that Equation (29) would predict for the next mesh if $\bar{q}$ were used,
$$ {}^{(h,k)}\bar{L}_i = {}^{(h,k)}L_i + \left\lceil \frac{1}{\bar{q}} \ln\!\left(\frac{{}^{(h,k)}\hat{e}_{T_i}}{\varepsilon_T}\right) \right\rceil. $$
Thus, we have
$$ V_k = \max_i \left\lceil \frac{{}^{(h,k)}\bar{L}_i}{{}^{(h,k)}L_i} \right\rceil. $$
The locations of the knot points for all V k intervals within S k are equidistant.
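A sketch of this h-refinement decision, under the reconstruction of the two equations above, could look as follows; `Lold` and `eT` are vectors over the segment's differentiation variables, and `tk`, `tkp1` are the segment's endpoints.

```matlab
% Divide segment S_k into Vk equidistant subintervals when its decay
% rate falls below the smoothness cutoff qbar.
qbar  = 0.8;                                     % cutoff decay rate (Table 1)
Lbar  = Lold + ceil((1/qbar) * log(eT ./ epsT)); % predicted basis counts
Vk    = max(ceil(Lbar ./ Lold));                 % number of subintervals
knots = linspace(tk, tkp1, Vk + 1);              % equidistant knot points
```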

4.4. Combining Mesh Intervals

The prescribed decay rate $\bar{q}$ indicates whether a function is smooth ($q \ge \bar{q}$) or nonsmooth ($q < \bar{q}$). Here, a new prescribed decay rate $\grave{q} > \bar{q}$ is introduced to determine whether a function is super-smooth ($q \ge \grave{q}$). If two and only two adjacent mesh intervals have $q > \grave{q}$ in all of their constrained expressions, then the intervals are combined. The number of basis functions on the new interval is equal to the sum of the basis functions in the intervals being combined. Only two neighboring mesh intervals are combined in a mesh iteration at a time to reduce the chance that multiple super-smooth segments are combined into one nonsmooth segment. Furthermore, neighboring pairwise super-smooth segments can only be combined in a mesh iteration if they were not divided in the same iteration.
Another way in which segments are combined is if adjacent segments have
$$ \max_i\, {}^{(h,k)}L_i = L_{\min}, \tag{30} $$
which means every constrained expression in the segment has the minimum number of basis functions. Hypersensitive TPBVPs have solutions that typically consist of constant values for long periods of time, preceded or followed by sharp changes in their gradients. When the equality in Equation (30) holds true, the solution on the k-th segment is often constant. Thus, all neighboring segments where Equation (30) holds true are combined. We do not have to worry about combining a collection of super-smooth segments into a nonsmooth segment because the solution on the combined segments consists of constant values. Segments can only be combined if they were not divided in the same mesh iteration as well. A sketch of how the $L_{\min}$-based merge could be screened is shown below.
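In the sketch, `mesh`, `wasDivided`, and the field `L` are illustrative names rather than the authors' implementation.

```matlab
% Flag segments whose every constrained expression sits at the minimum
% basis count (Equation (30)); consecutive flagged segments that were
% not divided this iteration are then collapsed into single segments.
Lmin      = 3;                                     % minimum basis count (Table 1)
atFloor   = arrayfun(@(S) max(S.L) == Lmin, mesh); % Equation (30) per segment
mergeable = atFloor & ~wasDivided;                 % respect the no-divide rule
```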

4.5. hp-Adaptive Mesh Refinement Algorithm Outline

For the reader’s convenience, a brief step-by-step procedure of the hp-adaptive mesh refinement algorithm is given as follows:
Step 1:
Set h = 0 and provide an initial mesh $M_0$ that is composed of initial segments ${}^{(0,k)}S$ and an initial number of basis functions for each constrained expression within each segment, ${}^{(0,k)}L_i$. Furthermore, make sure to set the hyperparameters, which are listed in Table 1.
Step 2:
Formulate the unconstrained system of ODEs from the TPBVP by building the constrained expressions for the differentiation variables on M h .
Step 3:
Solve the unconstrained system of ODEs on mesh M h , as described in Section 2.
Step 4:
If ${}^{(h,k)}L_\infty \le \epsilon_{MR}$ for all $k = (1, \dots, K)$, then the TPBVP is solved with the user's desired accuracy, and the mesh refinement is complete. If h = H, then the mesh refinement is also complete, even though the TPBVP may not be solved at the desired accuracy. Otherwise, for every $S_k$ where $k = (1, \dots, K)$ on $M_h$:
  • Proceed to the next segment if ${}^{(h,k)}L_\infty \le \epsilon_{MR}$.
  • Calculate $q_i$ and $\hat{e}_{T_i}$ for every $i = 1, \dots, n$ in the segment, as shown in Section 3.1. Make sure to disregard the terms of the Chebyshev polynomial within the constrained expressions whose coefficients are negligible because they lie on the roundoff plateau or are local minima of significant oscillation, as described in Section 4.1.
  • If $\max_i q_i \ge \bar{q}$, modify the segment for the next mesh iteration by increasing/decreasing the number of basis functions in the segment's constrained expressions (Section 4.2); otherwise, divide the segment (Section 4.3).
Step 5:
Combine qualifying neighboring segments that were not divided for the next mesh iteration, as shown in Section 4.4.
Step 6:
Set h = h + 1 and return to Step 2.
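For orientation, the outline above condenses into a loop like the following sketch, where `initialMesh`, `solveTFC`, `refineSegments`, and `combineSegments` are hypothetical helpers standing in for Section 2, Sections 4.2 and 4.3, and Section 4.4; it is not the authors' implementation.

```matlab
% A condensed sketch of Steps 1-6 of the hp-adaptive mesh refinement loop.
h = 0;  H = 20;  epsMR = 1e-13;                  % hyperparameters from Table 1
mesh = initialMesh();                            % Step 1
while true
    sol  = solveTFC(mesh);                       % Steps 2-3: Gauss-Newton/least squares
    loss = arrayfun(@(S) S.lossInf, sol);        % per-segment L_inf loss
    if all(loss <= epsMR) || h == H, break; end  % Step 4: stopping conditions
    mesh = refineSegments(mesh, sol, epsMR);     % Step 4: p- or h-refinement
    mesh = combineSegments(mesh);                % Step 5: merge qualifying neighbors
    h = h + 1;                                   % Step 6
end
```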
Table 1. Hyperparameters.

| Symbol | Value | Description |
| --- | --- | --- |
| $\epsilon_{GN}$ | $1 \times 10^{-15}$ | Gauss–Newton error tolerance. |
| $\epsilon_{MR}$ | $1 \times 10^{-13}$ | Mesh refinement error tolerance for an acceptable solution. |
| $\varepsilon_T$ | $1 \times 10^{-15}$ | Desired truncation error for a constrained expression. |
| $p_{\max}$ | 40 | Maximum number of Gauss–Newton iterations. |
| $H$ | 20 | Maximum number of mesh refinement iterations. |
| $N^{+}$ | 3 | Number of collocation points in excess of the number of basis functions. |
| $\bar{q}$ | 0.8 | Cutoff decay rate that determines whether a function is smooth or not. |
| $\grave{q}$ | 1.2 | Cutoff decay rate that determines whether a function is super-smooth or just smooth. |
| $L_{\min}$ | 3 | Minimum number of basis functions in a constrained expression. |

5. Results

We solved three hypersensitive TPBVPs with our proposed TFC hp-adaptive mesh refinement algorithm. Each TPBVP was derived by applying PMP to its respective OCP. A brief procedure for how to use PMP to formulate TPBVPs from OCPs is given in Appendix B. All programming was performed in MATLAB® R2021b and run on an Intel Core i7-9700 CPU PC with 64 GB of RAM. The hyperparameters used for the algorithm are given in Table 1, except that $\epsilon_{MR}$ equaled $1 \times 10^{-15}$ for the first test problem.

5.1. Linear Hypersensitive Problem

The first OCP we attempted to solve was
$$ \min_u\; J = \frac{1}{2}\int_{t_0}^{t_f} \left(x^2 + u^2\right) dt \quad \mathrm{s.t.} \quad \dot{x} = -x + u $$
with the boundary constraints
$$ x(t_0) = 1.5, \qquad x(t_f) = 1. $$
The initial and final times were t 0 = 0 and t f = 10,000, respectively. After applying PMP, the dual TPBVP ODEs were
$$ \dot{x} = -x - \lambda, \qquad \dot{\lambda} = -x + \lambda, $$
where λ is the costate variable that can be used to compute the control,
$$ u = -\lambda. $$
As its name suggests, the TPBVP is subject to the same boundary conditions as the OCP (Equation (31)). The main reason we decided to solve this hypersensitive TPBVP was because its linearity allows a true analytical solution to be found quite easily:
$$ x^*(t) = c_1 \exp\!\left(\sqrt{2}\,t\right) + c_2 \exp\!\left(-\sqrt{2}\,t\right) $$
and
$$ u^*(t) = \dot{x}^*(t) + x^*(t) $$
with
$$ c_1 = \frac{1.5\exp\!\left(-\sqrt{2}\,t_f\right) - 1}{\exp\!\left(-\sqrt{2}\,t_f\right) - \exp\!\left(\sqrt{2}\,t_f\right)} $$
and
$$ c_2 = \frac{1 - 1.5\exp\!\left(\sqrt{2}\,t_f\right)}{\exp\!\left(-\sqrt{2}\,t_f\right) - \exp\!\left(\sqrt{2}\,t_f\right)}. $$
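For reference, the analytical solution can be evaluated numerically as in the sketch below. Because $\sqrt{2}\,t_f \approx 1.4 \times 10^4$, evaluating $\exp(\sqrt{2}\,t_f)$ directly overflows double precision, so the sketch uses an equivalent boundary-layer form of the constants above that is accurate up to terms of order $\exp(-\sqrt{2}\,t_f)$.

```matlab
% Analytical solution of the linear hypersensitive TPBVP in a numerically
% stable boundary-layer form: c1*exp(sqrt(2)*t) ~ exp(-sqrt(2)*(tf - t))
% and c2 ~ 1.5 when tf is large.
tf = 1e4;  t = linspace(0, tf, 2001);
x  = 1.5*exp(-sqrt(2)*t) + exp(-sqrt(2)*(tf - t));             % x*(t)
dx = sqrt(2)*(-1.5*exp(-sqrt(2)*t) + exp(-sqrt(2)*(tf - t)));  % xdot*(t)
u  = dx + x;                                                   % u*(t) = xdot* + x*
```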
To quantify the benefits of using our TFC-based hp-adaptive mesh refinement algorithm, we first solved this problem with the standard TFC procedure, which uses global constrained expressions to approximate x and u with a hard-coded number of basis functions and collocation points. Multiple solutions were calculated, with the constrained expressions approximating x and u sharing the same number of basis functions within each run, and that number increasing by one from run to run. The number of collocation points for each solution was fixed at 1050. Figure 6 shows the $L_\infty$-norm of the loss and actual error for the solutions. As can be seen, the TFC with global constrained expressions was very capable of solving this problem, but around 600 basis functions were required for the minimum loss and minimum actual error to be reached. With the hp-adaptive mesh refinement algorithm, this problem should be solvable with far fewer total basis functions and with even better accuracy.
The TFC-based hp-adaptive mesh refinement algorithm began with an initial mesh of only two knot points: one at t 0 and another at t f , i.e., global constrained expressions. The number of basis functions in the x and λ constrained expressions on the initial mesh was set to 50. The algorithm did not find a solution that satisfied L ϵ MR = 1 × 10 15 , meaning it ran for the maximum number of iterations, but the solution it did find was still very accurate. Figure 7 shows the final solutions for x and u, each overlapping with the truth. The subplots in Figure 8 show how the loss decreased and the mesh knots changed over the mesh iterations. The final mesh consisted of five segments, i.e., six knots at t 0 = 0 , t 1 = 19.5 , t 2 = 29.3 , t 3 = 9970.7 , t 4 = 9980.5 , and t 5 = t f = 10,000. The  L of the final solution was 9.3 × 10 15 . Summing the number of basis functions for all segments of the constrained expressions for x and λ yielded 100 and 103, respectively.

5.2. Nonlinear Hypersensitive Problem

The nonlinear hypersensitive OCP we attempted to solve was
$$ \min_u\; J = \frac{1}{2}\int_{t_0}^{t_f} \left(x^2 + u^2\right) dt \quad \mathrm{s.t.} \quad \dot{x} = -x^3 + u $$
with the boundary constraints
$$ x(t_0) = 1.5, \qquad x(t_f) = 1. $$
As before, the initial and final times were t 0 = 0 and t f = 10,000. After applying PMP, the dual TPBVP ODEs were
$$ \dot{x} = -x^3 - \lambda, \qquad \dot{\lambda} = -x + 3x^2\lambda, $$
where the control is related to the costate by
$$ u = -\lambda. $$
The TPBVP given by Equations (33) and (34) does not have a true analytical solution due to its nonlinearity. Because of this nonlinearity, an initial guess needed to be provided for the Gauss–Newton step of the TFC procedure. The initial guess we provided for x and λ was simply zero over the entire time domain. Unlike for the linear hypersensitive problem, we do not provide a plot showing how $L_\infty$ decreases as the number of basis functions increases when global constrained expressions are used because $L_\infty$ remained greater than $1 \times 10^{-4}$ even when 1000 basis functions were used.
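For concreteness, the residuals that the Gauss–Newton step drives toward zero could be evaluated as below; `x`, `lam`, `dx`, and `dlam` are assumed to come from the constrained expressions and their derivatives at the collocation points.

```matlab
% Residuals of the dual ODEs (Equation (33)) at the collocation points.
rx   = dx   + x.^3 + lam;         % from xdot   = -x^3 - lambda
rlam = dlam + x - 3*x.^2 .* lam;  % from lamdot = -x + 3*x^2*lambda
loss = max(abs([rx; rlam]));      % L_inf-norm of the loss
```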
Similar to the approach used to solve the previous problem, the TFC-based hp-adaptive mesh refinement algorithm began with an initial mesh of only one segment and with the number of basis functions in the x and λ constrained expressions set to 50. The algorithm ran for the maximum number of mesh iterations, i.e., h = H = 20. Thus, $L_\infty \le \epsilon_{MR} = 1 \times 10^{-13}$ was not satisfied by the final loss, even though the loss was still exceptionally low. Table 2 gives the $L_\infty$-norm of the loss for the TFC solution at the final mesh iteration (TFC Split), for the TFC when only global constrained expressions were used with L = 300 (TFC Global), and for MATLAB's bvp4c routine. Note that the bvp4c routine is a finite-difference implementation of the three-stage Lobatto IIIa formula [35,36] and can be regarded as an h method. The same initial guess was used for each method. Figure 9 shows the final solutions for x and u near the boundaries of the time domain. The curves from the proposed mesh refinement algorithm and bvp4c overlap, but the proposed approach's solutions are clearly better, as its $L_\infty$-norm of the loss is nine orders of magnitude lower. The subplots in Figure 10 show how the loss decreased in the proposed algorithm over its mesh iterations and how the mesh knots changed. The final mesh consisted of nine segments: one long segment in the center of the time domain and several much smaller segments clustered near the boundaries to handle the steep gradients.

5.3. Mass Spring Problem

The last problem solved was a common mass spring system OCP,
$$ \min_u\; J = \frac{1}{2}\int_{t_0}^{t_f} \left(x^2 + u^2\right) dt \quad \mathrm{s.t.} \quad \dot{x}_1 = x_2, \quad \dot{x}_2 = -x_1 - x_1^3 - u, $$
with the boundary constraints
$$ x_1(t_0) = 1, \quad x_2(t_0) = 0.75, \quad x_1(t_f) = 0, \quad x_2(t_f) = 0. $$
The initial and final times were t 0 = 0 and t f = 10,000, respectively. After applying PMP, the dual TPBVP ODEs were
$$ \dot{x}_1 = x_2, \quad \dot{x}_2 = -x_1 - x_1^3 - u, \quad \dot{\lambda}_1 = -x_1 + \lambda_2\left(1 + 3x_1^2\right), \quad \dot{\lambda}_2 = -x_2 - \lambda_1, $$
where the control is related to the costate by
$$ u = \lambda_2. $$
Like the previous TPBVP, the TPBVP given by Equations (35) and (36) does not have a true analytical solution due to its nonlinearity. The initial mesh and initial guess of the solution were identical to those used to solve the nonlinear hypersensitive problem. Table 3 shows the $L_\infty$-norm of the loss for the proposed approach and the other approaches shown in Table 2. Once again, the proposed approach performed the best. Figure 11 shows the trajectories of the various solutions. The subplots in Figure 12 show how the loss decreased in the proposed algorithm over its mesh iterations and portray how the mesh knots changed. The final mesh consisted of 20 segments, most of them near the boundaries, while one long segment was used to approximate the constant part of the solution, as shown in Figure 12b.

6. A Discussion on Computation Time

The main drawback of the proposed algorithm is its computation time when the problem is nonlinear. Figure 13 shows how the size of the mesh affects the most computationally intensive aspects of the proposed algorithm for the nonlinear hypersensitive problem (a) and the mass spring problem (b). The steps that took the longest to complete were the numerical approximations of the Jacobian and the least-squares computations performed within the Gauss–Newton algorithm. Hence, Figure 13 shows the total accumulated runtime for performing these calculations at each mesh iteration. The size of the Jacobian is related to the number of basis functions (columns) and the number of collocation points (rows) being solved for. As can clearly be seen, a larger mesh correlates with a longer runtime. The accumulated least-squares computation times were all small because of MATLAB's efficient storage of sparse matrices. The numerical approximation of the Jacobian was time-consuming because it required a number of loss evaluations equal to the number of columns in the Jacobian, i.e., the derivatives of the loss with respect to all Chebyshev coefficients in all constrained expressions in all segments were needed. Hence, the proposed algorithm has a long runtime because the Jacobian must be numerically approximated at each iteration of the Gauss–Newton algorithm within each mesh iteration.
The best way to reduce the time needed to numerically compute the Jacobian would be to break apart the partial derivatives of the loss with respect to the unknown coefficients into two parts using the chain rule: the partial derivatives of the loss with respect to the differentiation variables and the partial derivatives of the constrained expressions with respect to the unknown coefficients. The partial derivatives of the constrained expressions with respect to the unknown coefficients could be computed by hand a priori. Thus, only the partial derivatives of the loss with respect to the differentiation variables would need to be computed numerically, which would drastically reduce the number of loss evaluations needed. Patterson and Rao [37] demonstrated a similar approach for the pseudospectral method when Lagrange polynomials are used to approximate the differentiation variables. We hope to do the same but for the TFC’s constrained expressions in future work.
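A sketch of this split for a single constrained expression is given below; `B` and `dB` would hold the precomputed partials of the constrained expression and its time derivative with respect to the unknown coefficients, while `dLdy` and `dLddy` are the numerically computed partials of the loss with respect to the differentiation variable and its derivative. All names are illustrative.

```matlab
% Chain rule: with y = B*xi and ydot = dB*xi, the Jacobian of the loss
% with respect to xi is J = diag(dLdy)*B + diag(dLddy)*dB. Only two
% partials per collocation point are computed numerically, instead of
% one loss evaluation per column of J.
J = dLdy .* B + dLddy .* dB;   % implicit expansion across the columns of B
```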

7. Conclusions

An hp-adaptive mesh refinement algorithm has been developed for the TFC method that utilizes Chebyshev polynomials as the free functions. The algorithm relies on the correlation between the $L_\infty$-norm of the loss and the truncation error to determine the ideal number of basis functions in a constrained expression. Whether to add basis functions to constrained expressions within a segment or to divide the segment is determined by the decay rates of the Chebyshev coefficients in the segment's constrained expressions. Smooth segments have large decay rates, while nonsmooth segments have small decay rates. The error is decreased by adding basis functions to the smooth segments and dividing the segments that are not smooth. To improve computational performance, the number of basis functions is decreased in constrained expressions on smooth segments that already satisfy the $L_\infty$-norm of the loss tolerance well. Lastly, segments that are very smooth are combined to further improve computational performance.
The only problems solved in this work were TPBVPs generated by applying PMP to hypersensitive OCPs without inequality constraints. All problems were solved using the proposed approach with a maximum loss on the order of $1 \times 10^{-12}$ or better. In contrast, MATLAB's bvp4c solved each nonlinear problem with a maximum loss on the order of $1 \times 10^{-4}$. Hypersensitive OCPs with inequality constraints will be solved in future work, which will require continuation on regularization or smoothing parameters. Since the continuation process will be iterative, increased computational performance will be required. As the algorithm is currently constructed, it is too slow to be coupled with continuation due to the cost of numerically approximating the Jacobian. Hence, a major challenge of future work will be discovering new ways to approximate the Jacobian with fewer calculations, similar to what has already been accomplished with other collocation procedures (e.g., the pseudospectral method for optimal control).

Author Contributions

Conceptualization, K.D.; methodology, K.D.; software, K.D.; validation, K.D.; formal analysis, K.D.; investigation, K.D.; resources, K.D. and R.F.; writing—original draft preparation, K.D. and A.D.; writing—review and editing, K.D., A.D. and R.F.; visualization, K.D. and A.D.; supervision, R.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data will be made available by the authors on request.

Acknowledgments

The authors would like to thank Enrico Schiassi and Mario De Florio for sharing problems used to test the proposed algorithm during the beginning stages of this research.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Constraint Vector Generalization

The constraint vectors for the end segments (i.e., $S_1$ and $S_K$) depend on the boundary conditions of the TPBVP being solved. For example, suppose we are solving the following TPBVP:
$$ \begin{aligned} \dot{x}_1 &= x_2, & \dot{x}_2 &= 2x_1, \\ \dot{\lambda}_1 &= 3\lambda_2, & \dot{\lambda}_2 &= x_1 + x_2 + 3\lambda_2 - \lambda_1, \end{aligned} \qquad \text{subject to:} \quad x_1(t_0) = x_{1_0}, \quad x_2(t_0) = x_{2_0}, \quad x_1(t_f) = x_{1_f}, \quad \lambda_2(t_f) = 0. \tag{A1} $$
The constraint vectors for the initial and final segments when solving the TPBVP are
$$ {}^{(1)}B = \begin{bmatrix} x_1(t_0) \\ x_1(t_1) \\ \frac{dx_1(t_1)}{dt} \\ x_2(t_0) \\ x_2(t_1) \\ \frac{dx_2(t_1)}{dt} \\ \lambda_1(t_1) \\ \frac{d\lambda_1(t_1)}{dt} \\ \lambda_2(t_1) \\ \frac{d\lambda_2(t_1)}{dt} \end{bmatrix} \qquad \text{and} \qquad {}^{(K)}B = \begin{bmatrix} x_1(t_{K-1}) \\ x_1(t_f) \\ \frac{dx_1(t_{K-1})}{dt} \\ x_2(t_{K-1}) \\ \frac{dx_2(t_{K-1})}{dt} \\ \lambda_1(t_{K-1}) \\ \frac{d\lambda_1(t_{K-1})}{dt} \\ \lambda_2(t_{K-1}) \\ \lambda_2(t_f) \\ \frac{d\lambda_2(t_{K-1})}{dt} \end{bmatrix}, $$
respectively. The continuity conditions in the first and last segments, ${}^{(1)}\hat{B}$ and ${}^{(K)}\hat{B}$, are identical to ${}^{(1)}B$ and ${}^{(K)}B$ without $x_1(t_0)$, $x_2(t_0)$, $x_1(t_f)$, and $\lambda_2(t_f)$. Unlike the constraint vectors for the end segments, the constraint vectors for the interior segments are not unique. Each differentiation variable has four constraints, two on each end of the segment, to enforce continuity. Hence, the constraint vectors for the interior segments when solving Equation (A1) are
$$ {}^{(k)}B = \begin{bmatrix} x_1(t_{k-1}) \\ x_1(t_k) \\ \frac{dx_1(t_{k-1})}{dt} \\ \frac{dx_1(t_k)}{dt} \\ x_2(t_{k-1}) \\ x_2(t_k) \\ \frac{dx_2(t_{k-1})}{dt} \\ \frac{dx_2(t_k)}{dt} \\ \lambda_1(t_{k-1}) \\ \lambda_1(t_k) \\ \frac{d\lambda_1(t_{k-1})}{dt} \\ \frac{d\lambda_1(t_k)}{dt} \\ \lambda_2(t_{k-1}) \\ \lambda_2(t_k) \\ \frac{d\lambda_2(t_{k-1})}{dt} \\ \frac{d\lambda_2(t_k)}{dt} \end{bmatrix}, \qquad \text{for } k = (2, \dots, K-1). $$
Since the constraint vectors in the interior segments are made up of only continuity conditions, we have ${}^{(k)}B = {}^{(k)}\hat{B}$ when $k = (2, \dots, K-1)$.

Appendix B. Deriving a Dual Two-Point Boundary-Value Problem from an Optimal Control Problem

The goal of an optimal control problem is to minimize a cost functional $J(x(t), u(t), t_f)$ through the selection of the state vector $x(t) \in \mathbb{R}^n$ and control vector $u(t) \in \mathbb{R}^m$, along with the initial time $t_0$ and the final time $t_f$ when both are free. The general form of such an OCP can be given as
$$ \begin{aligned} \min_{x, u, t_0, t_f}\; & J(t_0, t_f, x(t), u(t)) = \Phi(t_0, t_f, x(t_0), x(t_f)) + \int_{t_0}^{t_f} L(t, x(t), u(t))\, dt && \text{(A2a)} \\ \mathrm{s.t.}\;\; & \dot{x}(t) = f(t, x(t), u(t)) && \text{(A2b)} \\ & u(t) \in U && \text{(A2c)} \\ & \phi(t_0, x(t_0), t_f, x(t_f)) = 0, && \text{(A2d)} \end{aligned} $$
where J is made up of a scalar penalty term $\Phi(t_0, t_f, x(t_0), x(t_f))$, called the Mayer cost, and a scalar integral term, called the running or Lagrangian cost, with the integrand $L(t, x(t), u(t))$. Furthermore, the problem is subject to the equations of motion $f(t, x(t), u(t))$, an admissible set of controls U, and the boundary conditions $\phi(t_0, x(t_0), t_f, x(t_f))$. Note that inequality constraints on the state can exist but were not considered in the above OCP. The same is true for the control. Hence, the admissible control and its derivative are continuous, $U \subset C^1[t_0, t_f]$.
Following the indirect method, the OCP can be transformed into a TPBVP whose solution is the solution to the OCP. Formulating the TPBVP first requires the Hamiltonian $H(t, x(t), u(t), \lambda(t))$,
$$ H(t, x(t), u(t), \lambda(t)) = L(t, x(t), u(t)) + \lambda^{\top}(t)\, f(t, x(t), u(t)), \tag{A3} $$
where $\lambda(t) \in \mathbb{R}^n$ is the costate vector. Minimizing the Hamiltonian with respect to the control, $H_u = 0$, allows an equation for the optimal control in terms of the costate to be derived, $u^*(t, \lambda(t))$. When u does not appear linearly in the Hamiltonian and is not transcendental, substituting u with $u^*$ in the Hamiltonian allows it to be written in terms of the costates instead of the controls, $H(t, x(t), u^*(t), \lambda(t)) = H(t, x(t), \lambda(t))$. When u does appear linearly in the Hamiltonian, the equation for $u^*$ must be added to the final TPBVP. By Pontryagin's minimum principle, the first-order necessary conditions that minimize the Hamiltonian and solve the OCP are given by
$$ \dot{x} = \frac{\partial H}{\partial \lambda} \qquad \text{and} \qquad \dot{\lambda} = -\frac{\partial H}{\partial x}. \tag{A4} $$
The states are still constrained by the boundary condition shown in Equation (A2d). Likewise, the costates have boundary conditions given by
$$ \lambda(t_0) = -\frac{\partial \Phi}{\partial x(t_0)} + \nu^{\top} \frac{\partial \phi}{\partial x(t_0)} \tag{A5} $$
and
$$ \lambda(t_f) = \frac{\partial \Phi}{\partial x(t_f)} + \nu^{\top} \frac{\partial \phi}{\partial x(t_f)}, \tag{A6} $$
where $\nu \in \mathbb{R}^q$ is the Lagrange multiplier associated with the state boundary condition $\phi$. If $t_f$ is unspecified, then the transversality condition must also hold,
$$ H(t_f, x(t_f), \lambda(t_f)) + \frac{\partial \Phi}{\partial t_f} = 0. \tag{A7} $$
If t 0 is unspecified, then the following transversality condition must also hold:
$$ H(t_0, x(t_0), \lambda(t_0)) - \frac{\partial \Phi}{\partial t_0} = 0. \tag{A8} $$
Equations (A2d) and (A3)–(A8) then form the dual TPBVP.
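As a concrete illustration (a worked example added here, not part of the original appendix), applying Equations (A3) and (A4) to the linear hypersensitive OCP of Section 5.1 recovers the dual TPBVP solved there:

$$ H = \frac{1}{2}\left(x^2 + u^2\right) + \lambda\left(-x + u\right), \qquad \frac{\partial H}{\partial u} = u + \lambda = 0 \;\Rightarrow\; u^* = -\lambda, $$

$$ \dot{x} = \frac{\partial H}{\partial \lambda} = -x - \lambda, \qquad \dot{\lambda} = -\frac{\partial H}{\partial x} = -x + \lambda, $$

which are exactly the dual ODEs of Section 5.1.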

References

  1. Leake, C.; Johnston, H.; Mortari, D. The Theory of Functional Connections: A Functional Interpolation Framework with Applications; Lulu: Morrisville, NC, USA, 2022. [Google Scholar]
  2. De Florio, M.; Schiassi, E.; Furfaro, R.; Ganapol, B.D.; Mostacci, D. Solutions of Chandrasekhar’s basic problem in radiative transfer via theory of functional connections. J. Quant. Spectrosc. Radiat. Transf. 2021, 259, 107384. [Google Scholar] [CrossRef]
  3. Yassopoulos, C.; Leake, C.; Reddy, J.; Mortari, D. Analysis of Timoshenko–Ehrenfest beam problems using the theory of functional connections. Eng. Anal. Bound. Elem. 2021, 132, 271–280. [Google Scholar] [CrossRef]
  4. Yassopoulos, C.; Reddy, J.N.; Mortari, D. Analysis of nonlinear Timoshenko–Ehrenfest beam problems with von Kármán nonlinearity using the Theory of Functional Connections. Math. Comput. Sim. 2023, 205, 709–744. [Google Scholar] [CrossRef]
  5. Schiassi, E.; D’Ambrosio, A.; Drozd, K.; Curti, F.; Furfaro, R. Physics-informed neural networks for optimal planar orbit transfers. J. Spacecr. Rockets 2022, 59, 834–849. [Google Scholar] [CrossRef]
  6. De Almeida, A.K., Jr.; Johnston, H.; Leake, C.; Mortari, D. Fast 2-impulse non-Keplerian orbit transfer using the theory of functional connections. Eur. Phys. J. Plus 2021, 136, 223. [Google Scholar] [CrossRef]
  7. de Almeida Junior, A.K.; Prado, A.F.; Mortari, D. Using the theory of functional connections to create periodic orbits with a linear variable thrust. New Astron. 2023, 104, 102068. [Google Scholar] [CrossRef]
  8. Mortari, D. The theory of connections: Connecting points. Mathematics 2017, 5, 57. [Google Scholar] [CrossRef]
  9. Mortari, D.; Johnston, H.; Smith, L. High accuracy least-squares solutions of nonlinear differential equations. J. Comput. Appl. Math. 2019, 352, 293–307. [Google Scholar] [CrossRef]
  10. Bertsekas, D. Dynamic Programming and Optimal Control, 4th ed.; Athena Scientific: Nashua, NH, USA, 2012; Volume 2. [Google Scholar]
  11. Enright, P.J.; Conway, B.A. Discrete approximations to optimal trajectories using direct transcription and nonlinear programming. J. Guid. Control Dyn. 1992, 15, 994–1002. [Google Scholar] [CrossRef]
  12. Kelly, M. An introduction to trajectory optimization: How to do your own direct collocation. SIAM Rev. 2017, 59, 849–904. [Google Scholar] [CrossRef]
  13. Bryson, A.E.; Ho, Y.C. Applied Optimal Control: Optimization, Estimation, and Control, rev. printing ed.; Taylor & Francis Group, LLC: New York, NY, USA, 1975. [Google Scholar]
  14. Longuski, J.M.; Guzmán, J.J.; Prussing, J.E. Optimal Control with Aerospace Applications; Springer: New York, NY, USA, 2014. [Google Scholar] [CrossRef]
  15. Schiassi, E.; D’Ambrosio, A.; Furfaro, R. Bellman neural networks for the class of optimal control problems with integral quadratic cost. IEEE Trans. Artif. Intell. 2022, 5, 1016–1025. [Google Scholar] [CrossRef]
  16. Zhang, H.; Zhou, S.; Zhang, G. Shaping low-thrust multi-target visit trajectories via theory of functional connections. Adv. Space Res. 2023, 72, 257–269. [Google Scholar] [CrossRef]
  17. Drozd, K.; Furfaro, R.; Mortari, D. Rapidly Exploring Random Trees with Physics-Informed Neural Networks for Constrained Energy-Optimal Rendezvous Problems. J. Astronaut. Sci. 2024, 71, 361–382. [Google Scholar] [CrossRef]
  18. Johnston, H.; Schiassi, E.; Furfaro, R.; Mortari, D. Fuel-efficient powered descent guidance on large planetary bodies via theory of functional connections. J. Astronaut. Sci. 2020, 67, 1521–1552. [Google Scholar] [CrossRef]
  19. Drozd, K.; Furfaro, R.; Schiassi, E.; D’Ambrosio, A. Physics-informed neural networks and functional interpolation for solving the matrix differential riccati equation. Mathematics 2023, 11, 3635. [Google Scholar] [CrossRef]
  20. Trefethen, L.N. Spectral Methods in MATLAB; SIAM: Philadelphia, PA, USA, 2000; pp. 29–39. [Google Scholar] [CrossRef]
  21. Lu, P. Propellant-optimal powered descent guidance. J. Guid. Control Dyn. 2018, 41, 813–826. [Google Scholar] [CrossRef]
  22. Johnston, H.; Mortari, D. Least-squares solutions of boundary-value problems in hybrid systems. J. Comput. Appl. Math. 2021, 393, 113524. [Google Scholar] [CrossRef]
  23. Darby, C.L.; Hager, W.W.; Rao, A.V. An hp-adaptive pseudospectral method for solving optimal control problems. Optim. Control Appl. Methods 2011, 32, 476–502. [Google Scholar] [CrossRef]
  24. Darby, C.L.; Hager, W.W.; Rao, A.V. Direct trajectory optimization using a variable low-order adaptive pseudospectral method. J. Spacecr. Rockets 2011, 48, 433–445. [Google Scholar] [CrossRef]
  25. Patterson, M.A.; Hager, W.W.; Rao, A.V. A ph mesh refinement method for optimal control. Optim. Control Appl. Methods 2015, 36, 398–421. [Google Scholar] [CrossRef]
  26. Liu, F.; Hager, W.W.; Rao, A.V. Adaptive mesh refinement method for optimal control using nonsmoothness detection and mesh size reduction. J. Frankl. Inst. 2015, 352, 4081–4106. [Google Scholar] [CrossRef]
  27. Liu, F.; Hager, W.W.; Rao, A.V. Adaptive mesh refinement method for optimal control using decay rates of Legendre polynomial coefficients. IEEE Trans. Control Syst. Technol. 2017, 26, 1475–1483. [Google Scholar] [CrossRef]
  28. Gui, W.; Babuška, I. The h, p, and h-p versions of the finite element method in 1 dimension. I. the error analysis of the p-version. Numer. Math. 1986, 49, 577–612. [Google Scholar] [CrossRef]
  29. Gui, W.; Babuška, I. The h, p, and h-p versions of the finite element method in 1 dimension. II. The error analysis of the h- and h-p versions. Numer. Math. 1986, 49, 613–657. [Google Scholar] [CrossRef]
  30. Gui, W.; Babuška, I. The h, p, and h-p versions of the finite element method in 1 dimension. III. The adaptive h-p version. Numer. Math. 1986, 49, 659–683. [Google Scholar] [CrossRef]
  31. Pan, B.; Wang, Y.; Tian, S. A high-precision single shooting method for solving hypersensitive optimal control problems. Math. Probl. Eng. 2018, 2018, 7908378. [Google Scholar] [CrossRef]
  32. Boyd, J.P. Chebyshev and Fourier Spectral Methods, 2nd ed.; Dover Publishing: New York, NY, USA, 2001; pp. 19–57. [Google Scholar]
  33. Trefethen, L.N. Is Gauss quadrature better than Clenshaw–Curtis? SIAM Rev. 2008, 50, 67–87. [Google Scholar] [CrossRef]
  34. Driscoll, T.A.; Hale, N.; Trefethen, L.N. Chebfun Guide; Pafnuty Publications: Oxford, UK, 2014; Available online: http://www.chebfun.org/docs/guide/ (accessed on 21 April 2024).
  35. Kierzenka, J.; Shampine, L.F. A BVP solver based on residual control and the MATLAB PSE. ACM Trans. Math. Softw. 2001, 27, 299–316. [Google Scholar] [CrossRef]
  36. Shampine, L.F.; Kierzenka, J.; Reichelt, M.W. Solving boundary value problems for ordinary differential equations in MATLAB with bvp4c. Tutor. Notes 2000, 2000, 1–27. [Google Scholar]
  37. Patterson, M.A.; Rao, A.V. Exploiting sparsity in direct collocation pseudospectral methods for solving optimal control problems. J. Spacecr. Rockets 2012, 49, 354–377. [Google Scholar] [CrossRef]
Figure 1. K-segment constrained expressions with $C^1$ continuity enforced. Red dots indicate the boundary/continuity conditions for each constrained expression.
Figure 2. Error analysis of the TFC solution to Equation (24).
Figure 3. Three plots that aid in the analysis of the exponential fits performed on the truncated Chebyshev coefficients computed with TFC for the TPBVP shown in Equation (24). Subplot (a) shows Exp. Fit 1 for L = 30, where all truncated Chebyshev coefficients are included in the regression, and Exp. Fit 2 for L = 30, where only the truncated Chebyshev coefficients off the roundoff plateau (RP) are included in the regression. The resulting conservative upper bound (CUB) and improved upper bound (IUB) when Exp. Fit 1 and Exp. Fit 2 are used, respectively, are also shown. Subplots (b,c) show the goodness and decay rates of each fit for increasing values of L.
Figure 4. Error analysis of the TFC solution to Equation (26).
Figure 5. Three plots that aid in the analysis of the exponential fits performed on the truncated Chebyshev coefficients computed with TFC for the TPBVP shown in Equation (26). Subplot (a) shows Exp. Fit 1 for L = 4000, where all truncated Chebyshev coefficients are included in the regression, and Exp. Fit 2 for L = 4000, where only the local maxima of the truncated Chebyshev coefficients are included in the regression. The resulting conservative upper bound (CUB) and improved upper bound (IUB) when Exp. Fit 1 and Exp. Fit 2 are used, respectively, are also shown. Subplots (b,c) show the goodness and decay rates of each fit for increasing values of L.
Figure 6. $L_\infty$-norm curves for the error and loss of the TFC's solution to the TPBVP represented by Equations (31) and (32) when global constrained expressions are used as the number of basis functions increases.
Figure 7. Solutions to the linear hypersensitive problem.
Figure 8. Evolution of the mesh by the proposed hp-adaptive mesh refinement algorithm for solving the linear hypersensitive problem. Subplot (a) shows the $L_\infty$-norm of the loss for each mesh iteration. Subplot (b) shows the knot points for each mesh iteration.
Figure 9. Solutions to the nonlinear hypersensitive problem.
Figure 10. Evolution of the mesh by the proposed hp-adaptive mesh refinement algorithm for solving the nonlinear hypersensitive problem. Subplot (a) shows the $L_\infty$-norm of the loss for each mesh iteration. Subplot (b) shows the knot points for each mesh iteration.
Figure 11. Solutions to the mass spring problem.
Figure 12. Evolution of the mesh by the proposed hp-adaptive mesh refinement algorithm for solving the mass spring problem. Subplot (a) shows the $L_\infty$-norm of the loss for each mesh iteration. Subplot (b) shows the knot points for each mesh iteration.
Figure 13. The computation time of a mesh iteration versus the size of the Jacobian pertaining to the same mesh iteration for the nonlinear hypersensitive problem (a) and the mass spring problem (b).
Table 2. $L_\infty$-norm of the loss for different methods for the nonlinear hypersensitive problem.

| TFC Split | TFC Global | bvp4c |
| --- | --- | --- |
| $1.285 \times 10^{-13}$ | $4.969 \times 10^{-1}$ | $4.128 \times 10^{-4}$ |
Table 3. $L_\infty$-norm of the loss for different methods for the mass spring problem.

| TFC Split | TFC Global | bvp4c |
| --- | --- | --- |
| $2.345 \times 10^{-12}$ | $5.455 \times 10^{-1}$ | $9.878 \times 10^{-4}$ |