1. Introduction
We begin this review by briefly considering various strategies for approximating continuous functions. In numerical analysis such functions are usually approximated by a least squares approach or by interpolation. In either case a basis is chosen from an underlying approximation space; depending on the problem this may be polynomial, rational or sinusoidal. Interpolation in particular plays a key role in approximating the unknown solution of differential equations. In that case a polynomial basis is chosen, which serves as a trial function for the solution. For efficiency and stability a low order polynomial basis is used on a discretized domain. This approach is called the collocation method, and the trial function is referred to as the collocation polynomial, or the interpolating polynomial. We review the background of collocation by looking at the criterion of the method. We then review different methods that are used to discretize both ODEs and PDEs. To do this, we go into detail about various collocation methods, as well as how they have been used over the years.
In particular we review the method of orthogonal collocation on finite elements (OCFE), where the domain is split into elements, each element is mapped to the standard interval $[0, 1]$, and the Gauss points in $[0, 1]$ are then used as collocation points. This approach is proved to yield optimal error bounds, whereas spline approximation yields only sub-optimal bounds. For example, quintic spline approximation for third order ODEs gives only third order convergence, whereas OCFE produces order six convergence. We solved the resulting DAEs using two approaches. Firstly, we used DASSL to solve the DAEs resulting from OCFE in their original form. Secondly, we converted the DAEs to ODEs and used a stiff integrator to solve the resulting system of IVPs. These approaches are clearly superior to previous ones, in which quasilinearization together with the second order Crank–Nicolson method is used for the time integration. In Section 2 we show how the idea of collocation arises naturally from an approximation to the Galerkin method. Section 3 contains a literature review and the history of collocation. In Section 4 we derive the quintic Hermite basis functions and discuss their properties. Section 5 describes OCFE with respect to a quintic Hermite basis, and derives a simplified form of the trial functions using the continuity conditions. In Section 6 we present a detailed error analysis for the linear ODE case using a quintic Hermite basis; in particular we show that the error is optimal when the collocation points are chosen as the Gauss points, and present nodal and global error bounds. Finally, in Section 7 we present numerical results for several linear and non-linear differential equations, and validate the theorem of Section 6 by examining the global and nodal convergence orders.
2. Mathematical Setting
Here we review how collocation arises. We consider a function $y \in C(X)$, where $C(X)$ is the vector space of continuous functions on the domain X. Let L be a bounded linear operator on $C(X)$, and let $\{\phi_j\}_{j=1}^{N}$ be a basis for a subspace $S_N$ of $C(X)$ of dimension N. Consider solving the linear equation
$$L y = f.$$
If $\tilde{y}$ is an approximation to y, then it follows that
$$\tilde{y} = \sum_{j=1}^{N} a_j \phi_j,$$
where the $a_j$ are real coefficients. We also assume that $L^{-1}$ exists and is bounded. The quantity
$$r = L\tilde{y} - f$$
is defined as the residual, and it can be shown that
$$\frac{\| y - \tilde{y} \|}{\| y \|} \le \kappa(L)\, \frac{\| r \|}{\| f \|},$$
where $\kappa(L)$ is the condition number of L and $\| \cdot \|$ denotes compatible operator and function norms on $C(X)$, with $\kappa(L) = \| L \|\, \| L^{-1} \|$. For well conditioned problems a small residual norm $\| r \|$ implies that $\tilde{y}$ is close to y. Galerkin's method attempts to minimize the norm of the residual by requiring that r be orthogonal to the subspace $S_N$, which is equivalent to
$$\int_X r\, \phi_j \, dx = 0, \qquad j = 1, \dots, N. \qquad (1)$$
This geometry in the underlying Hilbert space is illustrated in Figure 1.
But these integrals may be expensive to evaluate. The collocation method attempts to avoid this expensive calculation and is a simplification of the Galerkin method.
Integrals such as those mentioned in (1) can be evaluated using numerical quadrature, for instance,
$$\int_X r\, \phi_j \, dx \approx \sum_{k=1}^{N} w_k\, r(x_k)\, \phi_j(x_k),$$
where the $x_k$ are the quadrature points and the $w_k$ are the corresponding weights. If $\{\phi_j\}$ is a set of orthogonal polynomials, with $\phi_j$ of degree $j - 1$, then we may choose the $x_k$ as the zeros of $\phi_{N+1}$. This ensures that the quadrature rule is the associated Gauss rule. We note that the zeros of an orthogonal set of polynomials are real, simple and confined to the interval of orthogonality. Now the sum $\sum_{k} w_k\, r(x_k)\, \phi_j(x_k)$ would not, in general, approximate zero, since $\phi_j(x_k) \neq 0$ for $j = 1, \dots, N$. Hence, in order that this integral is approximately zero, we set
$$r(x_k) = 0, \qquad k = 1, \dots, N.$$
This is the criterion for the collocation method.
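As a concrete illustration of this criterion, the following Julia sketch (a toy example of our own, not taken from the sections above) applies collocation to the two-point boundary value problem $-y'' = \pi^2 \sin(\pi x)$, $y(0) = y(1) = 0$, using a small polynomial basis that satisfies the boundary conditions and equally spaced collocation points.

```julia
# A minimal sketch (not from the paper) of the collocation criterion:
# choose a trial function from a finite-dimensional space and force the
# residual L*ytilde - f to vanish at N chosen points.
# Here L*y = -y'' with y(0) = y(1) = 0, exact solution y = sin(pi*x),
# so f = pi^2 * sin(pi*x).  Basis: phi_j(x) = x^j * (1 - x), j = 1..N,
# which satisfies the boundary conditions by construction.

using LinearAlgebra

f(x) = pi^2 * sin(pi * x)

# phi_j and its second derivative, for phi_j(x) = x^j - x^(j+1)
phi(j, x)   = x^j - x^(j + 1)
d2phi(j, x) = j*(j - 1)*x^(j - 2) - (j + 1)*j*x^(j - 1)

N  = 6
xc = [(k - 0.5) / N for k in 1:N]              # collocation points (equally spaced here)

A = [-d2phi(j, xc[k]) for k in 1:N, j in 1:N]  # L applied to phi_j, evaluated at x_k
b = [f(xc[k]) for k in 1:N]
a = A \ b                                      # collocation coefficients

ytilde(x) = sum(a[j] * phi(j, x) for j in 1:N)
err = maximum(abs(ytilde(x) - sin(pi * x)) for x in 0:0.01:1)
println("max error ≈ ", err)
```

The residual is exactly zero at the chosen points and, as the printed error shows, small everywhere else.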
3. Literature Review
There are three broad classes of methods for discretizing ODEs and PDEs in space.
- (a)
Finite difference methods (FDM) are local methods which yield the numerical solution at mesh points in the spatial domain. They are easy to implement and yield sparse matrices that are computationally cheap to work with. Their main drawback is that they are not very accurate. The Crank–Nicolson method is the most popular FDM for solving PDEs. We refer to the books [1,2], and references therein, for a further exposition of these methods.
- (b)
Finite element methods are derived using a variational/integral formulation and are global methods, which yield the numerical solution on intervals/sub-intervals of the spatial domain. Variants include the method of weighted residuals, the Galerkin and Petrov–Galerkin methods, the Rayleigh–Ritz method, the Tau method, etc. These methods are mathematically robust, but from a numerical point of view they are not very practical, due to the need to evaluate integrals exactly. We refer to the classical text [3] for a more detailed description of these methods; many other texts can be found in the literature.
- (c)
Collocation methods have been developed to circumvent this shortcoming. The collocation method is used to solve ordinary and partial differential equations, as well as integral equations. The approximate solution is determined by requiring the equation to be satisfied at specific points, referred to as collocation points. These are chosen points lying within the domain of the equation, at which the residual of a candidate solution is required to be zero.
Frazer et al. [4] were the first to use this method, in 1937. It was then used in 1941 by Bickley [5] to solve an unsteady heat conduction problem. A low-order collocation method was used for a number of boundary-layer problems by Schetz [6], in 1963.
There are many variants of collocation methods, the choice depending mainly on the application. Spectral/pseudo-spectral methods [7] are highly accurate and are mainly used for problems with very smooth solutions. Fourier spectral methods, whose basis functions are sinusoidal, are limited to problems posed on a periodic domain. If the basis functions are polynomials, the method is referred to as a polynomial collocation method. Chebyshev spectral methods have been developed to solve problems posed on a bounded domain. These methods yield very fast convergence, known as spectral or exponential convergence. In most cases the matrices generated by these methods are dense, and one has to resort to the Fast Fourier Transform (FFT) to handle them efficiently.
Sinc collocation methods [8] are another example of collocation methods. By definition, the basis functions used in these methods are the Sinc functions, which are naturally posed on an unbounded domain; this is one of the drawbacks of the method. B-splines are by far the most popular choice of basis used in collocation methods for problems on a bounded domain. They have been successfully applied to many important problems, including the Burgers, KdV and Schrödinger equations [9,10]. The Hermite collocation method closely resembles the B-spline collocation method, but is computationally more efficient and, when used in conjunction with orthogonal collocation on finite elements, yields superconvergence properties similar to spectral convergence.
Orthogonal collocation arises from the polynomial collocation method, the collocation points being chosen as the roots of orthogonal polynomials. It can also be viewed as an extension of the method of weighted residuals.
Villadsen and Stewart were the first to use the orthogonal collocation method, in 1967, after discovering that it yielded very good results, owing to the interesting properties of the roots of orthogonal polynomials [11]. Finlayson then used it for various applications in chemical engineering, in 1972 [12], and again for non-linear problems [13], in 1980. It was used by Fan, Chen and Erikson to solve problems arising from chemical reactions [14]. Orthogonal collocation has been widely used for problems in chemical engineering [15,16,17,18,19,20] over the years. More recently, the method of orthogonal collocation using a cubic Hermite basis has been developed and applied successfully to parabolic partial differential equations and to models appearing in chemical engineering [21,22]. Traditionally a monomial basis $\{1, x, x^2, \dots\}$ was used to represent the trial functions or, more efficiently, the Lagrange basis $\{L_k(x)\}$, where
$$L_k(x) = \prod_{j \ne k} \frac{x - x_j}{x_k - x_j}.$$
The main focus in this review is to illustrate the use of orthogonal collocation using a quintic Hermite basis.
4. Quintic Hermite Basis Functions
We seek a basis for the vector space of polynomials of degree at most five on the interval $[x_i, x_{i+1}]$. There are six such basis functions and we denote them by $H_1, \dots, H_6$. We further stipulate their function, first derivative and second derivative values at the end points $x_i$ and $x_{i+1}$ by the interpolatory conditions (3), in which $\delta_{ij}$ denotes the well-known Kronecker delta symbol.
It is convenient to transform to the variable z defined by
$$z = \frac{x - x_i}{h},$$
where h is the uniform interval length. As x varies from $x_i$ to $x_{i+1}$, z varies from 0 to 1. The interpolatory conditions in (3) transform naturally into corresponding conditions at $z = 0$ and $z = 1$.
These conditions enable the unique derivation of the six basis functions; however, it is only necessary to derive a few of them directly, as the others may be obtained by symmetry/anti-symmetry. These polynomials are displayed in (5)–(10).
In terms of the variable z, each basis function associated with the right end point is related to its counterpart at the left end point in a simple way: the function-value basis function at the right end point is a reflection of the left end point one about the vertical axis together with a shift of one unit to the right, and the second-derivative basis function is related to its counterpart in a similar manner. The first-derivative basis function at the right end point may be interpreted as its counterpart rotated by 180° anticlockwise and then shifted one unit to the right.
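The construction of the basis can also be carried out numerically. The following Julia sketch solves the six interpolatory conditions for the monomial coefficients of each basis function; the ordering of the conditions below is our own convention and need not match that of (3), but the resulting polynomials are, up to ordering, those displayed in (5)–(10).

```julia
# A sketch of how the six quintic Hermite basis functions on [0, 1] can be
# generated from the interpolatory conditions of Section 4: each H_j has
# exactly one of {value, first derivative, second derivative} equal to 1 at
# one end point (z = 0 or z = 1), and all other such quantities equal to 0.
# The monomial coefficients are obtained by solving a 6x6 linear system.

using LinearAlgebra

# Row k of M applies the k-th condition to the monomials 1, z, ..., z^5.
val(z)  = [z^p for p in 0:5]
der1(z) = [p == 0 ? 0.0 : p * z^(p - 1) for p in 0:5]
der2(z) = [p <= 1 ? 0.0 : p * (p - 1) * z^(p - 2) for p in 0:5]

M = vcat(val(0.0)', der1(0.0)', der2(0.0)', val(1.0)', der1(1.0)', der2(1.0)')

# Column j of C holds the monomial coefficients of H_j.
C = M \ Matrix{Float64}(I, 6, 6)

H(j, z) = sum(C[p + 1, j] * z^p for p in 0:5)

# Quick check of the interpolatory conditions, e.g. H_1(0) = 1, H_1(1) = 0:
println(H(1, 0.0), "  ", H(1, 1.0))
```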
5. Orthogonal Collocation on Finite Elements
Consider solving an ordinary differential equation in one spatial variable, x, on a domain $[a, b]$. Firstly the domain is divided into N sub-intervals or elements of spacing h, by placing the dividing points or nodes $x_i$, as illustrated in Figure 2. We shall refer to this discretization as the mesh Δ. Here the first and last nodes coincide with the left and right hand boundaries, respectively. This differs from global orthogonal collocation, where the domain is not subdivided and higher order polynomials are instead used to achieve greater accuracy.
The ith element $[x_i, x_{i+1}]$ is mapped to $[0, 1]$ by a transformation of the form (4). We assume that the approximate solution in the ith element is given by
$$\tilde{y}^{(i)}(z) = \sum_{j=1}^{6} a_j^{(i)} H_j(z),$$
with a corresponding representation, with its own coefficients, in the $(i+1)$th element. The basis functions are plotted across both the ith and $(i+1)$th elements and illustrated in Figure 3. We observe that the basis functions are continuous at the shared node $x_{i+1}$, as are their first and second derivatives; however, the third derivative is discontinuous there. This continuity has some interesting consequences for the coefficients of the solution in successive elements.
In order to obtain a smooth solution that is $C^2$ continuous, we enforce continuity of the trial solution at the node $x_{i+1}$ shared by elements i and i+1, which is equivalent, in the variable z, to equating the value of the ith representation at $z = 1$ with that of the $(i+1)$th representation at $z = 0$. By the interpolatory conditions, this equates a single coefficient of element i with a single coefficient of element i+1. The continuity of the first derivative at $x_{i+1}$ yields a second such identity, and similarly the continuity of the second derivative yields a third. Hence the first three coefficients in the $(i+1)$th interval coincide with the last three coefficients in the ith interval. This repetitive pattern continues as we proceed through successive elements. Thus, we may represent the solution in the ith element by (12), in which the coefficients carry a single global index, bearing in mind that this index is a function of i and that the superscript i has been dropped. With this labelling of the coefficients we automatically ensure that the solution and its first and second derivatives are continuous at the nodes. There are $3N + 3$ unknowns resulting from (12). Had we used a quintic Lagrange basis, we would have $6N$ unknowns, as it is not possible to build the continuity conditions into a simple form like (12).
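The following short Julia sketch (using our own global indexing convention, which may differ in detail from that of (12)) makes the coefficient-sharing pattern explicit.

```julia
# A sketch of the coefficient sharing implied by Equation (12): with a quintic
# Hermite basis and C^2 continuity built in, element i uses six consecutive
# global coefficients, and neighbouring elements overlap in three of them,
# giving 3N + 3 unknowns in total.

coeff_indices(i) = (3i - 2):(3i + 3)    # global coefficients used by element i

N = 4                                   # number of elements (illustrative)
for i in 1:N
    println("element ", i, " uses coefficients ", collect(coeff_indices(i)))
end
println("total unknowns: ", 3N + 3)     # versus 6N for a quintic Lagrange basis
```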
6. Error Analysis
The following is an adaptation of the work done by de Boor and Swartz [
23] to quintic Hermite collocation. A third order differential equation, defined on
, can be written in the form
where the operator
and
D denotes the derivative operator. Then, the error is given by
where
is the residual. Hence,
where
is the Green’s function, or integral kernel, associated with the linear problem.
Now,
where
and
Consider one of the intervals
of
. Denote the collocation points by
Approximate the residual
on
by a quadratic interpolant
such that
where
Since
vanishes at the collocation points, we have
and hence (
16) can be written as
where
and
denotes the third divided difference of
Hence the integrand in (
13) can be written in the form
where
.
Interpolate
by a polynomial
of degree
, in the variable
at any points
in
to obtain
where
Equation (
17) then becomes
Substituting (
19) into (
15) results in
If the collocation points are chosen so that
is orthogonal to
that is
for all polynomials
of degree
, then
Equation (
20) follows since
and we have assumed that
can be bounded independently of
If we now sum
in (14), we obtain
If
, we forgo the interpolation of
and assume that
can be bounded on
such that
We follow this idea to obtain higher order convergence in the case that
cannot be bounded independent of
. In order to achieve this we prove the following theorem.
Theorem 1. Let be constants independent of Δ and assume that there exists a constant such that where we consider to be an element of If the coefficients of L are smooth enough, specifically and then there exists a constant such that for each j, where Proof. Since
, it follows that
on
is a combination of the derivatives of the coefficients of
L, up to the order
, with the derivatives of
up to the order
(recall
L is third order). It is therefore sufficient to prove the existence of a constant
such that
The following result, namely.
is proved by de Boor and Swartz [
23].
We now consider the case
. Expand
y at some point
in a Taylor series
of order
to obtain
where
from which it is simple to show that
From (
25), we deduce that
If X is a non-negative discrete random variable and t is a positive real number then, by Markov's inequality,
$$P(X \ge t) \le \frac{E(X)}{t},$$
where $E(X)$ denotes the expectation of X and P the probability. With
and
, we obtain
and with
and
we obtain
which allows us to write
for
where
We also have that
where we have used both (
24) and (
26). Substituting (
28) into (
27) yields
Hence, for
we have that
where we have used (
26) and (
29).
For
so that
Hence (
24), (
30) and (
31) proves (
23). □
The following theorem establishes the order of convergence.
Theorem 2. Let be constants independent of Δ. Assume that the coefficients of L satisfy for all i, that Equation (21) holds and that If the collocation points are chosen such that is orthogonal to for every then there exists a constant such that and a constant such that Proof. We write
with
the Taylor series for
, with terms of order
(see (
17) and (
18)). Choosing the collocation points so that
for every polynomial
We have
Now
We let
and
From
we infer that
Hence
We now consider three cases:
- (i)
If
then from (
21)
while
from (
22). Hence from (
38), we have
and (37) gives
- (ii)
If
then
Now consider say
. Equation (
34) is then rewritten as
where
is the Taylor series for
of order one. Once again, we choose the collocation points so that
Hence the equivalent equation to (
36) is
The equivalent equation to (
38) is then
and
Hence (
40) reduces to
Similarly, by considering
we may show that
and
- (iii)
If
so that
for all
j then, from summing (
39) (recall (14)) we obtain
for
x coinciding with the nodes
Otherwise
for exactly one of the sub-intervals of
. So, by (
39) and (
41) and summing over
j we obtain
□
The following theorem establishes the optimal choice for the collocation points.
Theorem 3. The optimal choice of collocation points is the set of Gauss points.
Proof. The orthogonality of
and
from (
35) is equivalent to
Change the integral in (42) to the interval $[-1, 1]$ by using a linear transformation, to yield Equation (43). Equation (43) is equivalent to the following three equations, (44)–(46).
Integrating (44)–(46), we obtain Equations (47)–(49). Equations (47) and (49) imply that both the sum and the product of the three points vanish; hence we choose the middle point to be zero, without loss of generality. This implies that the two outer points are symmetric about the origin, and (48) then fixes their product, so that the collocation points are $-\sqrt{3/5}$, $0$ and $\sqrt{3/5}$. These are the Gauss points, or zeros, of the third degree Legendre polynomial. □
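The following Julia sketch verifies this result numerically and shows how the reference points are mapped into an element; the element end points used are purely illustrative.

```julia
# A sketch verifying Theorem 3: the three collocation points per element are
# the zeros of the degree-3 Legendre polynomial P3(t) = (5t^3 - 3t)/2 on
# [-1, 1], i.e. t = -sqrt(3/5), 0, sqrt(3/5), mapped into each element.

P3(t) = (5t^3 - 3t) / 2

gauss_ref = [-sqrt(3/5), 0.0, sqrt(3/5)]
@assert all(abs(P3(t)) < 1e-14 for t in gauss_ref)

# Map the reference points from [-1, 1] to an element [xl, xr].
to_element(t, xl, xr) = xl + (t + 1) * (xr - xl) / 2

xl, xr = 0.25, 0.50                      # an illustrative element
println(to_element.(gauss_ref, xl, xr))
```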
7. Numerical Examples
Example 1. We now consider a third order linear ODE with three boundary conditions. Substituting (12) in the given ODE results in the system of equations (50). The boundary conditions, including the derivative condition, each yield one further equation for the coefficients. There are $3N + 3$ unknowns. Given that we have three boundary conditions, we thus require a further $3N$ conditions in order to solve the problem uniquely. We therefore choose three collocation points in each element. These are chosen as the zeros of the third degree Legendre polynomial, shifted to the element in question; these were shown above to be the optimal choice of collocation points. The collocation points are then substituted into Equation (50) to give the $3N$ linear equations. The resulting matrix–vector system (51), of size $(3N + 3) \times (3N + 3)$, has the form $A\mathbf{a} = \mathbf{b}$, where A has the block form shown (illustrated for a small number of elements). The non-zero blocks of matrix A are shifted three places to the right from one element to the next, which accounts for the repetition of the coefficients, and the position of the ones accounts for the boundary conditions.
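The following Julia sketch reproduces the sparsity pattern just described. It shows structure only: the actual entries depend on the ODE, on the Hermite basis values at the Gauss points, and on the particular boundary conditions, and the placement of the boundary rows below is our own illustrative choice.

```julia
# Sparsity structure of the collocation matrix A: each element contributes
# 3 collocation rows acting on 6 consecutive coefficients, and successive
# 3x6 blocks are shifted 3 columns to the right.

using SparseArrays

function collocation_structure(N)
    n = 3N + 3
    A = spzeros(Bool, n, n)
    for i in 1:N, r in 1:3, c in 1:6
        A[3(i - 1) + r, 3(i - 1) + c] = true   # 3x6 block for element i
    end
    # Three boundary-condition rows (placed last here; placement is a choice).
    A[3N + 1, 1] = true
    A[3N + 2, 2] = true
    A[3N + 3, n] = true
    return A
end

println(collocation_structure(3))
```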
After solving (51), the solution is constructed on each sub-interval using the appropriate coefficients and can then be plotted. Since there is very good agreement between the approximate solution and the exact solution, we show instead the error plot in Figure 4. If the discrete error at the nodes is computed on two successive meshes, as in (52) and (53), then taking the ratio of (52) to (53) yields the observed order of convergence. The conditions of Theorem 2 are satisfied. We have solved the problem with two different mesh sizes and summarise in Table 1 the convergence orders at the common nodes, to be compared with Equation (32). The orders agree remarkably well with Equation (32), the difference being attributed to numerical error. The global order is shown in Table 2 and illustrates the validity of Equation (33).
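For reference, the convergence orders in Tables 1 and 2 are obtained in the usual way from errors on successively halved meshes. The following Julia sketch uses hypothetical error values, not those of the paper, purely to illustrate the calculation.

```julia
# Observed order of convergence: if e_N is the error on a mesh with N elements
# and the mesh is repeatedly halved, the observed order is log2(e_N / e_{2N}).
# The error values below are placeholders, not values from the paper.

errors = Dict(8 => 1.2e-6, 16 => 1.9e-8, 32 => 3.0e-10)   # hypothetical
for N in (8, 16)
    p = log2(errors[N] / errors[2N])
    println("observed order from N = ", N, " to ", 2N, ": ", round(p, digits = 2))
end
```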
Example 2. Consider a boundary value problem with a known exact solution. As there are only two boundary conditions, we choose an additional collocation point in the first element. Hence the collocation points in the first element are the zeros of the fourth degree Legendre polynomial, shifted to that element, whilst in the remaining elements they are the zeros of the third degree Legendre polynomial. The solution is smooth, but one of the coefficients of the differential equation is unbounded at an end point of the domain. This, however, is not a problem, since the collocation points are chosen in the interior of each element and thus avoid the end points. The errors at the common nodes, as well as the convergence order, are displayed in Table 3; the observed convergence order is evident from the table. Theorem 2 does not apply here, as there is a coefficient with a singularity at the end point. In the next example we consider the application of the method to partial differential equations.
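The extra collocation points can be checked directly. The following Julia sketch computes the zeros of the degree-4 Legendre polynomial and maps them to $[0, 1]$; the particular shift used is our own choice of mapping for illustration.

```julia
# Zeros of the degree-4 Legendre polynomial P4(t) = (35t^4 - 30t^2 + 3)/8,
# shifted from [-1, 1] to [0, 1] via z = (t + 1)/2.

P4(t) = (35t^4 - 30t^2 + 3) / 8

roots = [-sqrt((15 + 2sqrt(30)) / 35), -sqrt((15 - 2sqrt(30)) / 35),
          sqrt((15 - 2sqrt(30)) / 35),  sqrt((15 + 2sqrt(30)) / 35)]
@assert all(abs(P4(t)) < 1e-12 for t in roots)

println((roots .+ 1) ./ 2)              # the four points in [0, 1]
```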
Example 3. We illustrate the full spatial discretization process involved in the collocation procedure on a PDE (54), with boundary conditions, an initial condition and a known exact solution. In the variable z, Equation (54) becomes Equation (55). Analogous to Equation (12), the trial solution in the ith element is written as (56); here the time dependence is carried by the coefficients a. Substituting (56) into (55), we obtain (57). We then substitute three collocation points from each sub-interval into Equation (57) to obtain $3N$ conditions. Using these, along with the boundary conditions, we get a linear differential algebraic system; in particular, the boundary conditions supply the algebraic equations of the system. Differential algebraic systems are solved using the classical DASSL solver [24]. DASSL is designed for the numerical solution of implicit systems of differential/algebraic equations written in the form $F(t, y, y') = 0$, where y and $y'$ are vectors, and initial values for y and $y'$ are given. It is a classic DAE solver, released as open source in 1982, and it has recently been ported to Julia. Julia (https://github.com/JuliaLang/julia, accessed on 1 November 2022) was released in 2012 at MIT. It is an open-source, multi-paradigm, high-level, high-performance dynamic language. For scientific computing it resembles Matlab in syntax, while coming close to C in speed. It allows interaction with other scientific languages such as Fortran, Python and R, and there are over 2000 packages that can be added to it. One can interface with many plotting packages, but we have elected to use pyplot from matplotlib.
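As an illustration of this solution strategy, the following Julia sketch sets up and solves a toy implicit DAE in the form $F(t, y, y') = 0$; it is not the system (57), and it uses the DifferentialEquations.jl DAEProblem interface with the Sundials IDA solver as a stand-in for DASSL.

```julia
# A minimal DAE sketch: one differential equation plus one algebraic equation,
# the latter playing the role of a boundary-condition row in the collocation
# system.  Toy system: y1' = -y1 with the constraint y1 + y2 = 1.

using DifferentialEquations, Sundials

function residual!(res, dy, y, p, t)
    res[1] = dy[1] + y[1]          # differential equation
    res[2] = y[1] + y[2] - 1.0     # algebraic equation
end

y0   = [1.0, 0.0]
dy0  = [-1.0, 1.0]                 # consistent initial derivatives
prob = DAEProblem(residual!, dy0, y0, (0.0, 2.0), differential_vars = [true, false])
sol  = solve(prob, IDA())
println(sol.u[end])                # ≈ [exp(-2), 1 - exp(-2)]
```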
The graphs of the solution and the error are depicted in Figure 5 and Figure 6, respectively. The numerical results indicate that the present method is successful in capturing the oscillatory nature of the solution in both space and time with reasonable accuracy. The drop in accuracy compared with the ODE case can be attributed to the error in the time integration process. We now solve the Korteweg–de Vries (KdV) equation [10]. The semi-discretization in space results in a system of DAEs, which can be converted into a system of IVPs and solved along similar lines to the ODE case discussed in Section 5. For the time integration we use a suitable high order stiff integrator. In contrast, most methods developed in previous studies used quasi-linearization together with the second order Crank–Nicolson method for the time integration.
Example 4. We consider a form of the KdV equation (58), with boundary conditions, an initial condition and a known exact solution (see [10]), and illustrate the spatial discretization process involved in the collocation procedure. We use exactly the same method as before, switching to the variable z so that (58) becomes (59). We then substitute (56) into (59) to get (60). We substitute three collocation points per element into Equation (60) to obtain $3N$ conditions. Together with the boundary conditions, we obtain a non-linear differential algebraic system; in particular, the boundary conditions supply the algebraic equations of the system. In order to obtain the initial conditions for the coefficients, we require the trial solution at the initial time to match the initial condition, as in (61). Substituting the collocation points into (61) gives a matrix–vector system whose solution yields the values of the coefficients at the initial time. Equations (60) and (61) are easily converted to a system of IVPs, which can be efficiently solved using the Julia QNDF solver (the analogue of ode15s in Matlab). Thereafter the PDE solution over the domain is obtained from Equation (56).
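This last step can be sketched in Julia as follows; the right-hand side below is a hypothetical stiff system standing in for the actual collocation ODEs obtained from (60), not the KdV discretization itself.

```julia
# Once the algebraic boundary relations have been eliminated, the remaining
# coefficient equations da/dt = g(a, t) form a stiff IVP system, solved here
# with Julia's QNDF (the analogue of MATLAB's ode15s).

using DifferentialEquations

function g!(da, a, p, t)
    # Hypothetical stiff linear system standing in for the collocation ODEs.
    da[1] = -1000.0 * a[1] + a[2]
    da[2] = a[1] - 2.0 * a[2]
end

a0   = [1.0, 0.5]
prob = ODEProblem(g!, a0, (0.0, 1.0))
sol  = solve(prob, QNDF(); abstol = 1e-8, reltol = 1e-8)
println(sol.u[end])
```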
Figure 7 depicts the solution profiles for various t values. This example describes a wave propagating in the positive x direction. In this figure the solid line represents the exact solution and the circles the approximate solution. It is clear that the approximate solution closely matches the exact solution. The 3D solution and error are shown in Figure 8 and Figure 9, respectively. We see that applying the quintic OCFE method in space, combined with a powerful stiff ODE solver in time, is capable of producing a very smooth and accurate solution.
8. Conclusions
We have reviewed the numerical approximation of the solution of differential equations using collocation methods, with a bias towards a quintic Hermite basis. In Lagrange collocation, the continuity conditions have to be imposed explicitly, in addition to setting the residual to zero at the collocation points; this results in a much larger system of equations to solve. With a Hermite basis the continuity conditions are automatically satisfied, as reflected in the coefficients of the trial solution, so that only the collocation equations have to be solved. This is a huge computational advantage. In addition we have provided a useful error analysis, which is also applicable to the non-linear case. Collocation on finite elements is essentially equivalent to minimizing the residual over the domain of the problem. It is a powerful and versatile tool that rivals finite difference approximations, especially when the solution is oscillatory in nature.