Article

Mathematical Modeling on a Physics-Informed Radial Basis Function Network

by Dmitry Stenkin and Vladimir Gorbachenko *
Department of Computer Technologies, Penza State University, Penza 440026, Russia
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(2), 241; https://doi.org/10.3390/math12020241
Submission received: 3 December 2023 / Revised: 3 January 2024 / Accepted: 9 January 2024 / Published: 11 January 2024
(This article belongs to the Special Issue Application of Neural Network Algorithm on Mathematical Modeling)

Abstract: The article is devoted to approximate methods for solving differential equations. An approach based on neural networks with radial basis functions is presented. Neural network training algorithms adapted to radial basis function networks are proposed, in particular adaptations of the Nesterov and Levenberg-Marquardt algorithms. The effectiveness of the proposed algorithms is demonstrated for solving model problems of function approximation, differential equations, direct and inverse boundary value problems, and modeling processes in piecewise homogeneous media.

1. Introduction

Partial differential equations (PDEs) describe many physical, biological, and economic processes. The study of models of such processes requires solving boundary value problems. The finite difference, finite element, and finite volume methods traditionally used for such problems require the construction of meshes, which for real problems is a complex and time-consuming task, and the solution of ill-conditioned systems of high-dimensional mesh equations. Particularly difficult is the solution of inverse boundary value problems, in which the missing parameters of the mathematical description of an object or process must be calculated from measurements of certain characteristics at a limited number of points. Such problems are mathematically ill-posed.
It is well known that boundary value problems for PDEs can be successfully solved using neural networks [1,2]. The theoretical basis for solving PDEs in neural networks consists of two propositions:
-
Using neural networks as universal function approximators [3,4,5]. A neural network approximates an unknown solution to a problem.
-
Using a variational approach to solving boundary value problems. This means that the solution to the boundary value problem is found by minimizing the error functional of the neural network. The neural network’s error functional (often called a loss function) uses the residuals of the approximate solution at a set of test points inside the solution domain, on its boundary, and possibly at points where additional conditions are specified. As a rule, the test points are placed arbitrarily (although a grid can be used). Therefore, the solution of a PDE on a neural network is a meshless approximate analytical solution of the PDE, since the solution of the problem is a function (which is almost impossible to represent analytically because of its complexity) determined by the architecture and parameters of the neural network. Because such networks include a mathematical model, the solution results are interpretable.
Increased interest in the application of neural networks to solving PDEs began with the publication of works [6,7,8]. A new class of PDE solvers has been formed: physics-informed neural networks (PINNs), which include a mathematical model in the structure of the neural network. Training of such networks is based on minimizing residuals calculated by substituting an approximate solution, at a limited set of test points, into the PDE, the boundary conditions, and, possibly, additional conditions. That is, the network is trained not on known examples but on the requirement that the solution it generates satisfy physical laws. PINNs can be used to solve both direct and inverse boundary value problems. When solving direct problems, no training data are used to train the neural network, so there is no problem of overfitting [9], which is a major difficulty for machine learning models. Interest in PINNs is largely due to the popularity of freely available machine learning libraries, such as TensorFlow and PyTorch, which implement automatic differentiation [10], a very important component for training PINNs.
PINNs are a new direction and are still in their infancy. Many problems in the theory and practical application of PINNs have not yet been resolved. One important problem hindering the use of neural networks for solving boundary value problems is the long training time, which is due to the use of first-order gradient algorithms in modern deep neural networks. For traditional problems solved with neural networks, such as pattern recognition, the network is trained once, and inference with the trained network is fast. Solving each boundary value problem, by contrast, requires training the network anew. Therefore, when solving PDEs with neural networks, training time becomes a critical factor. The solution is to adapt fast second-order algorithms for training neural networks to solve PDEs.
Neural networks can approximate continuous functions as accurately as desired. However, when modeling processes in inhomogeneous multicomponent media, the solution loses smoothness at the interfaces between homogeneous subregions: its derivative is discontinuous there. The need to model processes in heterogeneous media often arises in important applied problems, for example, modeling processes in layered composite materials, modeling oil reservoirs, and modeling groundwater filtration. Neural networks have rarely been used to solve problems involving heterogeneous media.
Algorithms for training neural networks when solving inverse boundary value problems, including inverse coefficient problems, especially for piecewise homogeneous media, have been poorly studied. Such problems arise, for example, when detecting anomalies in a medium, as in medical applications.
Most modern PINNs are fully connected deep networks. The use of radial basis function networks as PINNs has great promise [2,11,12,13,14]. Radial basis function networks allow for solving not only direct but also inverse problems [15,16,17]. Such networks are simpler than fully connected ones since they contain only two layers and are easier to train. Second-order gradient learning algorithms have been developed for such networks [13,18,19,20]. The results of comparing fully connected networks and radial basis function networks [21] when solving PDEs showed the advantage of radial basis function networks in terms of training time. Therefore, radial basis function networks designed to solve PDEs are becoming popular. A new type of neural network is being formed—physics-informed radial basis networks (PIRBNs) [12,22,23,24].
Radial basis function networks can be seen as an extension of E.J. Kansa’s method [25,26], a collocation method that uses radial basis functions as basis functions [27]. The Kansa method is meshless, does not require mesh generation as finite difference, finite element, and finite volume methods do, and is as accurate as these methods. The Kansa method is inferior in accuracy to spectral methods but, unlike them, imposes no restrictions on the shape of the solution domain. It is superior in accuracy to pseudospectral methods [28], although it should be noted that pseudospectral methods can also use radial basis functions as basis functions [29]. Radial basis functions depend only on the distance between the input value and the center of the function, so Kansa’s method is applicable to high-dimensional problems.
Radial basis function networks differ from Kansa’s method in that the number of basis functions need not equal the number of collocation points and in that the expansion coefficients (the weights of the neural network) are determined during network training rather than by solving a system of linear algebraic equations. The authors’ approach to training radial basis function networks is distinguished by tuning not only the weights but also the parameters of the radial basis functions during training. This approach reduces training time and increases solution accuracy.
The authors of this work adapted second-order gradient algorithms for training PIRBNs [13,18], proposed an approach to solving PDEs on PIRBNs for piecewise homogeneous media [30], and solved coefficient inverse problems on PIRBNs, including problems for piecewise homogeneous media [24]. In those works, most of the experiments were performed in MATLAB, with analytical expressions for the components of the gradient of the error functional with respect to the network parameters and for the Jacobian matrix derived in advance. Analytical calculation of the derivatives in these expressions is not theoretically difficult but is labor-intensive and error-prone. The authors have now developed programs implementing PIRBNs in Python using the automatic differentiation functions of the TensorFlow library.
The purpose of this work is to summarize the authors’ experience in the field of PIRBNs and present the results of using automatic differentiation in PIRBN training.

2. Materials and Methods

The output of a PIRBN is described by the following expression:
$$u(\mathbf{x}) = \sum_{k=1}^{n_{RBF}} w_k \varphi_k(\mathbf{x}),$$
where $n_{RBF}$ is the number of radial basis functions, $w_k$ is the weight of the $k$-th neuron, and $\varphi_k(\mathbf{x})$ is the value of the $k$-th radial basis function at the point $\mathbf{x}$.
In this work, the Gaussian function was used:
$$\varphi(\mathbf{x}) = \exp\left( -\frac{\left\| \mathbf{x} - \mathbf{c} \right\|^2}{2a^2} \right),$$
where $\mathbf{c}$ is the center of the function and $a$ is the shape parameter (width).
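To make the construction concrete, the following is a minimal NumPy sketch of the network output with Gaussian basis functions; all variable names and values are illustrative, not taken from the paper.

import numpy as np

def gaussian_rbf(x, centers, widths):
    """Values of the Gaussian basis functions at the points x.
    x: (n_points, dim), centers: (n_RBF, dim), widths: (n_RBF,)."""
    # squared distances ||x - c_k||^2 between every point and every center
    sq_dist = np.sum((x[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * widths[None, :] ** 2))

def rbf_output(x, weights, centers, widths):
    """Network output u(x) = sum_k w_k * phi_k(x)."""
    return gaussian_rbf(x, centers, widths) @ weights

# illustrative usage: 16 basis functions on the unit square
rng = np.random.default_rng(0)
centers = rng.uniform(0.0, 1.0, size=(16, 2))
widths = np.full(16, 0.2)
weights = rng.normal(size=16)
u = rbf_output(rng.uniform(0.0, 1.0, size=(100, 2)), weights, centers, widths)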
For radial basis function networks, it is possible to analytically calculate the components of the gradient vector of the error functional and the elements of the Jacobian matrix. Using MATLAB, a preliminary analytical calculation of derivatives was carried out.
We represent the PIRBN parameters as a single vector of network parameters. For a network whose input vectors have dimension two, we have
$$\boldsymbol{\theta} = \left[ w_1, w_2, \ldots, w_{n_{RBF}},\ c_{11}, c_{21}, \ldots, c_{n_{RBF}1},\ c_{12}, c_{22}, \ldots, c_{n_{RBF}2},\ a_1, a_2, \ldots, a_{n_{RBF}} \right]^T,$$
where $w_j$ is the weight, $j = 1, 2, \ldots, n_{RBF}$, $n_{RBF}$ is the number of radial basis functions, $c_{j1}$ and $c_{j2}$ are the coordinates of the center, and $a_j$ is the width.
By representing the network parameters as a single vector, it is possible to adjust both the network weights, which are linearly included in the error functional, and the parameters of the radial basis functions, which are nonlinearly included in the error functional.
Let us consider the problem of approximating a function of two variables. To train the network, the Nesterov and Levenberg-Marquardt algorithms adapted by the authors were used. The error functional for this problem has the form
$$L = \frac{1}{2} \sum_{j=1}^{n} e_j^2 = \frac{1}{2} \sum_{j=1}^{n} \left( u(\mathbf{x}_j) - T_j \right)^2,$$
where $e_j$ is the residual at the $j$-th test point, $n$ is the number of test points, $\mathbf{x}_j$ is the coordinate vector of the $j$-th test point, $u(\mathbf{x}_j)$ is the network output at the $j$-th test point, and $T_j$ is the known value of the function at the $j$-th test point.
In Nesterov’s method [31], the vector of network parameters at each iteration is adjusted according to the following formulas:
$$\boldsymbol{\theta}_{k+1} = \boldsymbol{\theta}_k + \Delta\boldsymbol{\theta}_{k+1}, \qquad \Delta\boldsymbol{\theta}_{k+1} = \alpha\, \Delta\boldsymbol{\theta}_k - \eta\, \nabla L\!\left( \boldsymbol{\theta}_k + \alpha\, \Delta\boldsymbol{\theta}_k \right),$$
where $\alpha$ and $\eta$ are selected coefficients and $\nabla L\!\left( \boldsymbol{\theta}_k + \alpha\, \Delta\boldsymbol{\theta}_k \right)$ is the gradient vector of the error functional.
Nesterov’s method is a development of gradient descent with momentum, but the gradient is evaluated at a pre-corrected (look-ahead) parameter vector, which provides a higher rate of convergence.
The gradient components of the functional have the following form:
$$\frac{\partial L}{\partial w_i} = \frac{\partial}{\partial w_i}\, \frac{1}{2} \sum_{j=1}^{n} \left( u(\mathbf{x}_j) - T_j \right)^2 = \sum_{j=1}^{n} \left( u(\mathbf{x}_j) - T_j \right) \varphi_i(\mathbf{x}_j),$$
$$\frac{\partial L}{\partial c_{i1}} = w_i \sum_{j=1}^{n} \left( u(\mathbf{x}_j) - T_j \right) \varphi_i(\mathbf{x}_j)\, \frac{x_{j1} - c_{i1}}{a_i^2},$$
$$\frac{\partial L}{\partial c_{i2}} = w_i \sum_{j=1}^{n} \left( u(\mathbf{x}_j) - T_j \right) \varphi_i(\mathbf{x}_j)\, \frac{x_{j2} - c_{i2}}{a_i^2},$$
$$\frac{\partial L}{\partial a_i} = \frac{\partial}{\partial a_i}\, \frac{1}{2} \sum_{j=1}^{n} \left( u(\mathbf{x}_j) - T_j \right)^2 = w_i \sum_{j=1}^{n} \left( u(\mathbf{x}_j) - T_j \right) \varphi_i(\mathbf{x}_j)\, \frac{\left\| \mathbf{x}_j - \mathbf{c}_i \right\|^2}{a_i^3}.$$
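As an illustration, the following self-contained NumPy sketch evaluates these gradient components and performs Nesterov updates for the approximation problem; the coefficients alpha and eta, the target function, and all names are illustrative assumptions rather than values from the paper.

import numpy as np

def loss_and_grads(x, targets, w, c, a):
    """Error functional L and its analytic gradient with respect to the
    weights w, centers c, and widths a of a Gaussian RBF network."""
    diff = x[:, None, :] - c[None, :, :]              # (n, n_RBF, 2)
    sq = np.sum(diff ** 2, axis=-1)                   # ||x_j - c_i||^2
    phi = np.exp(-sq / (2.0 * a[None, :] ** 2))       # (n, n_RBF)
    r = phi @ w - targets                             # residuals e_j
    L = 0.5 * np.sum(r ** 2)
    grad_w = phi.T @ r                                                   # dL/dw_i
    common = (r[:, None] * phi) * w[None, :]                             # (n, n_RBF)
    grad_c = np.einsum('jk,jkd->kd', common, diff) / a[:, None] ** 2     # dL/dc_i
    grad_a = np.sum(common * sq, axis=0) / a ** 3                        # dL/da_i
    return L, (grad_w, grad_c, grad_a)

def nesterov_step(params, velocity, grad_fn, alpha=0.9, eta=1e-3):
    """One Nesterov update: the gradient is evaluated at the look-ahead point."""
    lookahead = [p + alpha * v for p, v in zip(params, velocity)]
    grads = grad_fn(lookahead)
    velocity = [alpha * v - eta * g for v, g in zip(velocity, grads)]
    params = [p + v for p, v in zip(params, velocity)]
    return params, velocity

# illustrative training loop on a smooth target function
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(100, 2))
targets = np.sin(np.pi * x[:, 0]) * np.sin(np.pi * x[:, 1])
params = [rng.normal(size=16), rng.uniform(0, 1, size=(16, 2)), np.full(16, 0.2)]
velocity = [np.zeros_like(p) for p in params]
for _ in range(200):
    params, velocity = nesterov_step(
        params, velocity, lambda p: loss_and_grads(x, targets, *p)[1])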
The Levenberg-Marquardt algorithm used here is an adaptation of the well-known second-order gradient optimization algorithm [32] for training PIRBNs. Second-order gradient algorithms have a significantly higher convergence rate than first-order algorithms but are more labor-intensive per iteration, and they have not become widespread in deep neural networks. The simple architecture of the PIRBN makes it feasible to use this algorithm for network training. In this algorithm, the correction $\Delta\boldsymbol{\theta}_k$ of the vector of network parameters is obtained by solving a system of linear algebraic equations:
$$\left( J_{k-1}^T J_{k-1} + \mu_k E \right) \Delta\boldsymbol{\theta}_k = -\nabla L_{k-1}, \tag{1}$$
where $J_{k-1}$ is the Jacobian matrix computed at the network parameter values of iteration $k-1$, $E$ is the identity matrix, $\mu_k$ is the regularization parameter, and $\nabla L_{k-1}$ is the gradient of the error functional at iteration $k-1$.
To approximate a function of two variables, we present the Jacobian matrix in block form $J = \left[ J_w \mid J_{c_1} \mid J_{c_2} \mid J_a \right]$, where
$$J_w = \begin{bmatrix} \dfrac{\partial e_1}{\partial w_1} & \dfrac{\partial e_1}{\partial w_2} & \cdots & \dfrac{\partial e_1}{\partial w_{n_{RBF}}} \\ \dfrac{\partial e_2}{\partial w_1} & \dfrac{\partial e_2}{\partial w_2} & \cdots & \dfrac{\partial e_2}{\partial w_{n_{RBF}}} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial e_n}{\partial w_1} & \dfrac{\partial e_n}{\partial w_2} & \cdots & \dfrac{\partial e_n}{\partial w_{n_{RBF}}} \end{bmatrix}, \quad
J_{c_1} = \begin{bmatrix} \dfrac{\partial e_1}{\partial c_{11}} & \cdots & \dfrac{\partial e_1}{\partial c_{n_{RBF}1}} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial e_n}{\partial c_{11}} & \cdots & \dfrac{\partial e_n}{\partial c_{n_{RBF}1}} \end{bmatrix},$$
$$J_{c_2} = \begin{bmatrix} \dfrac{\partial e_1}{\partial c_{12}} & \cdots & \dfrac{\partial e_1}{\partial c_{n_{RBF}2}} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial e_n}{\partial c_{12}} & \cdots & \dfrac{\partial e_n}{\partial c_{n_{RBF}2}} \end{bmatrix}, \quad
J_a = \begin{bmatrix} \dfrac{\partial e_1}{\partial a_1} & \cdots & \dfrac{\partial e_1}{\partial a_{n_{RBF}}} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial e_n}{\partial a_1} & \cdots & \dfrac{\partial e_n}{\partial a_{n_{RBF}}} \end{bmatrix}.$$
The analytically calculated elements of the Jacobian matrix have the following form:
$$\frac{\partial e_i}{\partial w_j} = \frac{\partial}{\partial w_j}\left( u(\mathbf{x}_i) - T_i \right) = \varphi_j(\mathbf{x}_i), \qquad
\frac{\partial e_i}{\partial c_{j1}} = w_j \varphi_j(\mathbf{x}_i)\, \frac{x_{i1} - c_{j1}}{a_j^2}, \qquad
\frac{\partial e_i}{\partial c_{j2}} = w_j \varphi_j(\mathbf{x}_i)\, \frac{x_{i2} - c_{j2}}{a_j^2},$$
$$\frac{\partial e_i}{\partial a_j} = \frac{\partial}{\partial a_j}\left( u(\mathbf{x}_i) - T_i \right) = w_j \varphi_j(\mathbf{x}_i)\, \frac{\left\| \mathbf{x}_i - \mathbf{c}_j \right\|^2}{a_j^3}.$$
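For illustration, the following minimal NumPy sketch assembles one Levenberg-Marquardt correction from these analytic Jacobian elements; the parameter layout follows the vector θ introduced above, while the fixed value of μ and all names are illustrative assumptions, not the authors' implementation.

import numpy as np

def jacobian_and_residual(x, targets, w, c, a):
    """Residual vector e and Jacobian J = [J_w | J_c1 | J_c2 | J_a]
    assembled from the analytic element formulas."""
    diff = x[:, None, :] - c[None, :, :]              # (n, n_RBF, 2)
    sq = np.sum(diff ** 2, axis=-1)                   # ||x_i - c_j||^2
    phi = np.exp(-sq / (2.0 * a[None, :] ** 2))
    e = phi @ w - targets
    J_w = phi
    J_c1 = w[None, :] * phi * diff[:, :, 0] / a[None, :] ** 2
    J_c2 = w[None, :] * phi * diff[:, :, 1] / a[None, :] ** 2
    J_a = w[None, :] * phi * sq / a[None, :] ** 3
    return e, np.hstack([J_w, J_c1, J_c2, J_a])

def lm_step(x, targets, theta, n_rbf, mu=1e-3):
    """One Levenberg-Marquardt correction of the stacked parameter vector theta."""
    w, c1, c2, a = np.split(theta, [n_rbf, 2 * n_rbf, 3 * n_rbf])
    e, J = jacobian_and_residual(x, targets, w, np.stack([c1, c2], axis=1), a)
    grad = J.T @ e                                    # gradient of the error functional
    A = J.T @ J + mu * np.eye(theta.size)
    # solve (J^T J + mu E) dtheta = -grad and apply the correction
    return theta + np.linalg.solve(A, -grad), 0.5 * np.sum(e ** 2)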
Let us consider the solution of the PDE in the PIRBN using the example of solving the two-dimensional Poisson equation.
$$\frac{\partial^2 u}{\partial x_1^2} + \frac{\partial^2 u}{\partial x_2^2} = f(x_1, x_2), \quad (x_1, x_2) \in \Omega, \tag{2}$$
$$u = p(x_1, x_2), \quad (x_1, x_2) \in \partial\Omega, \tag{3}$$
where $\partial\Omega$ is the boundary of the solution domain and $f$ and $p$ are known functions:
$$f(x_1, x_2) = \sin(\pi x_1) \sin(\pi x_2), \qquad p(x_1, x_2) = 0. \tag{4}$$
PIRBN training consists of minimizing the error functional, which is the sum of squared residuals at the internal and boundary test points.
$$L = \frac{1}{2} \sum_{i=1}^{N} \left( \frac{\partial^2 u_i}{\partial x_1^2} + \frac{\partial^2 u_i}{\partial x_2^2} - f_i \right)^2 + \lambda \sum_{j=1}^{K} \left( u_j - p_j \right)^2,$$
where N is the number of test points located inside the solution area, K is the number of test points on the boundary of the solution area, and λ is the penalty multiplier.
To train a PIRBN, similarly to the function approximation problem, the components of the gradient vector of the error functional and the elements of the Jacobian matrix can be calculated analytically.
At each iteration of the Levenberg-Marquardt method, it is necessary to solve system (1). The matrix $J_{k-1}^T J_{k-1} + \mu_k E$ of this system is symmetric, positive-definite, and ill-conditioned. Various methods can be used to solve the system; for example, the direct Cholesky method can be applied. For high-dimensional systems, it is advisable to use iterative methods such as the conjugate gradient method. The complexity of solving a system of linear algebraic equations with direct methods is of the order of $O(n^3)$, where $n$ is the number of equations, whereas the complexity of one iteration of an iterative algorithm is $O(n^2)$. Therefore, fast iterative methods have an advantage over direct methods when large systems need to be solved only to low accuracy. Our experiments have shown that in the Levenberg-Marquardt method it is sufficient to solve system (1) to a relative residual norm of $10^{-2}$–$10^{-3}$, which requires 3–4 iterations of the conjugate gradient method. In addition, Krylov subspace methods, which include the conjugate gradient method, are numerically stable [33], which weakens the influence of poor matrix conditioning. The biconjugate gradient stabilized algorithm has even greater numerical stability [33].
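The following is a sketch of a conjugate gradient solver with a loose relative-residual stopping criterion of this kind; it is a standard textbook formulation given for illustration, not the authors' exact implementation.

import numpy as np

def conjugate_gradient(A, b, rel_tol=1e-2, max_iter=50):
    """Solve A x = b for symmetric positive-definite A, stopping when
    ||b - A x|| / ||b|| drops below rel_tol (a few iterations usually
    suffice for the Levenberg-Marquardt system (1))."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    b_norm = np.linalg.norm(b)
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) / b_norm < rel_tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x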
Let us consider solving problems for a piecewise homogeneous medium using the example of a model problem with two regions of the medium.
$$\frac{\partial}{\partial x}\!\left( \sigma_i(x, y) \frac{\partial u}{\partial x} \right) + \frac{\partial}{\partial y}\!\left( \sigma_i(x, y) \frac{\partial u}{\partial y} \right) = f(x, y), \quad (x, y) \in \Omega, \quad i = 1, 2, \tag{5}$$
$$u(x, y) = p(x, y), \quad (x, y) \in \partial\Omega, \tag{6}$$
where $f = \sin(2\pi x)\sin(\pi y)$, $p = 0$, and the properties of the media are described by the functions $\sigma_i$.
The solution region has dimensions $1 \times 1$ and is divided at $x = 0.5$ into two subregions. At the interface $S$ between the media, the conjugation conditions must be satisfied:
$$\left. u_1 \right|_S = \left. u_2 \right|_S, \qquad \sigma_1 \left. \frac{d u_1}{d x} \right|_S = \sigma_2 \left. \frac{d u_2}{d x} \right|_S. \tag{7}$$
In [30], it was proposed to solve problems (5)–(7) on a PIRBN iteratively. Each iteration consists of several steps, their number determined by the number of subdomains. At each step, the problem is solved for the corresponding subdomain with the conjugation conditions on its boundary taken from the approximation of the solution at the previous iteration. The error functional for each subdomain is the sum of squared residuals within the subdomain, in the boundary conditions, and in the conjugation conditions. The iterative process terminates when the norms of the residuals within each subdomain, in the boundary conditions, and in the conjugation conditions are small.
The inverse coefficient problem of determining the properties of a piecewise homogeneous medium can be solved on radial basis function networks only approximately: meshless methods cannot accurately locate the boundaries of subregions with different properties of the medium. Let us therefore solve the inverse problem approximately, using a continuous function $k(\mathbf{x})$ that describes the medium, as follows:
$$\frac{\partial}{\partial x_1}\!\left( k(\mathbf{x}) \frac{\partial u}{\partial x_1} \right) + \frac{\partial}{\partial x_2}\!\left( k(\mathbf{x}) \frac{\partial u}{\partial x_2} \right) = f(\mathbf{x}), \quad \mathbf{x} \in \Omega, \tag{8}$$
with boundary conditions $B u(\mathbf{x}) = p(\mathbf{x})$, $\mathbf{x} \in \partial\Omega$.
At a set of points $\mathbf{z} \in Z$, $Z \subset \Omega \cup \partial\Omega$, the solution $u(\mathbf{z}) = \psi(\mathbf{z})$ of the problem for a piecewise homogeneous medium is known, measured with a known error.
Each iteration of the solution of (8) consists of two steps. In the first step, one training iteration is performed for the radial basis function network approximating the function $k(\mathbf{x})$:
$$k_{RBF}(\mathbf{x}) = \sum_{m=1}^{M_k} w_m^k \varphi_m^k(\mathbf{x}). \tag{9}$$
In the second step, one training iteration is performed for the network approximating the solution $u$ of problem (8), in which the function $k(\mathbf{x})$ is approximated by the first network:
$$u_{RBF}(\mathbf{x}) = \sum_{m=1}^{M_u} w_m^u \varphi_m^u(\mathbf{x}). \tag{10}$$
The error functional of this problem has the following form:
$$L = \frac{1}{2} \sum_{i=1}^{N} \left( A\!\left[ u(\mathbf{x}_i), k(\mathbf{x}_i) \right] - f(\mathbf{x}_i) \right)^2 + \frac{\lambda_B}{2} \sum_{j=1}^{K} \left( B u(\mathbf{x}_j) - p(\mathbf{x}_j) \right)^2 + \frac{\lambda_D}{2} \sum_{m=1}^{S} \left( u(\mathbf{x}_m) - \psi(\mathbf{x}_m) \right)^2, \tag{11}$$
where $A\!\left[ u(\mathbf{x}_i), k(\mathbf{x}_i) \right]$ is the differential operator of problem (8), $N$, $K$, and $S$ are the numbers of test points inside the domain, on its boundary, and at the points of additional conditions, respectively, and $\lambda_B$, $\lambda_D$ are penalty multipliers.
Substituting expressions (9) and (10) into the error functional (11), the components of the gradient vectors and elements of the Jacobian matrices for training networks (9) and (10) are calculated.
Regularization of the ill-posed inverse problem is carried out by the iterative regularization method [34]. To do this, training of the networks continues as long as $\sum_{m=1}^{S} \left( u(\mathbf{x}_m) - \psi(\mathbf{x}_m) \right)^2 > S\delta^2$, where $\delta$ is the known absolute error of the solution at the points of additional conditions.
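As a minimal sketch, this stopping rule (the discrepancy principle) can be checked as follows; the function and argument names are illustrative assumptions.

import numpy as np

def keep_training(u_at_z, psi, delta):
    """Iterative regularization: continue training while the squared discrepancy
    at the S additional-condition points still exceeds S * delta^2."""
    return np.sum((u_at_z - psi) ** 2) > u_at_z.size * delta ** 2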
Analytical calculation of the derivatives of the error functional presents no fundamental difficulties but is cumbersome and error-prone. Therefore, programs implementing PIRBNs were developed in Python using the automatic differentiation functions of the TensorFlow library. No graphics accelerator was used. In the future, it is planned to develop custom TensorFlow extensions that implement PIRBNs.
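To illustrate this approach, the sketch below assembles the loss for the Poisson problem (2)–(3) for a Gaussian RBF expansion, using tf.GradientTape to obtain the second derivatives by automatic differentiation. The penalty multiplier value and all names are assumptions made for illustration; this is not a reproduction of the authors' code.

import tensorflow as tf

def poisson_loss(x_in, x_bc, f_in, p_bc, w, c, a, lam=100.0):
    """Residual functional for the Poisson problem: second derivatives of the
    RBF network output are obtained by automatic differentiation.
    w, c, a are tf.Variable network parameters; x_in, x_bc are test points."""
    def u(x):
        sq = tf.reduce_sum((x[:, None, :] - c[None, :, :]) ** 2, axis=-1)
        phi = tf.exp(-sq / (2.0 * a[None, :] ** 2))
        return tf.linalg.matvec(phi, w)

    with tf.GradientTape() as outer:
        outer.watch(x_in)
        with tf.GradientTape() as inner:
            inner.watch(x_in)
            u_in = u(x_in)
        du = inner.gradient(u_in, x_in)            # first derivatives du/dx1, du/dx2
    d2u = outer.batch_jacobian(du, x_in)           # (N, 2, 2) second derivatives
    laplacian = d2u[:, 0, 0] + d2u[:, 1, 1]
    interior = 0.5 * tf.reduce_sum((laplacian - f_in) ** 2)
    boundary = lam * tf.reduce_sum((u(x_bc) - p_bc) ** 2)
    return interior + boundary

For training, this loss would itself be differentiated with respect to w, c, and a under a further GradientTape (or via a Jacobian computation for a Levenberg-Marquardt step).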

3. Results

The experiments were carried out on a computer running Windows 11 with an Intel(R) Core(TM) i5-12400F 2.50 GHz processor and 16.0 GB of RAM.
Function approximation using a PIRBN was tested in MATLAB using the Franke function [35], which is widely used for testing approximators, as an example (Figure 1a).
$$\begin{aligned} f(x, y) = {} & 0.75 \exp\left( -\frac{(9x - 2)^2}{4} - \frac{(9y - 2)^2}{4} \right) + 0.75 \exp\left( -\frac{(9x + 1)^2}{49} - \frac{9y + 1}{10} \right) \\ & + 0.5 \exp\left( -\frac{(9x - 7)^2}{4} - \frac{(9y - 3)^2}{4} \right) - 0.2 \exp\left( -(9x - 4)^2 - (9y - 7)^2 \right). \end{aligned}$$
The problem was solved in the region $x \in [0, 1]$, $y \in [0, 1]$ using 100 randomly located sample points and 16 radial basis functions. The network was trained using the Levenberg-Marquardt method with analytical calculation of the derivatives.
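For reference, the Franke function can be coded directly; the following NumPy transcription of the formula above is given for illustration (the random seed and sampling are illustrative).

import numpy as np

def franke(x, y):
    """Franke test function used for the approximation experiment."""
    term1 = 0.75 * np.exp(-((9 * x - 2) ** 2) / 4 - ((9 * y - 2) ** 2) / 4)
    term2 = 0.75 * np.exp(-((9 * x + 1) ** 2) / 49 - (9 * y + 1) / 10)
    term3 = 0.5 * np.exp(-((9 * x - 7) ** 2) / 4 - ((9 * y - 3) ** 2) / 4)
    term4 = -0.2 * np.exp(-((9 * x - 4) ** 2) - ((9 * y - 7) ** 2))
    return term1 + term2 + term3 + term4

# 100 random sample points in the unit square, as in the experiment
rng = np.random.default_rng(0)
xs, ys = rng.uniform(0, 1, 100), rng.uniform(0, 1, 100)
targets = franke(xs, ys)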
The network was trained to a root mean square residual of $10^{-6}$ in 15 iterations. Figure 1b shows the approximated Franke function. Figure 2 shows the radial basis functions at network initialization and after training. The diameters of the circles indicate the shape parameters, and the fill colors represent the weights associated with the radial basis functions.
First-order gradient algorithms, including Nesterov’s method, did not allow the network to be trained to approximate the Franke function: with Nesterov’s algorithm, it was not possible to achieve a root mean square error below $10^{-1}$. At the same time, Nesterov’s method approximates smooth functions of one variable well; for example, when approximating a sinusoid, a root mean square error of $10^{-6}$ was achieved in 70 iterations.
Model problems (2)–(4) were solved on a PIRBN trained by the Nesterov and Levenberg-Marquardt methods using automatic differentiation. The problem was solved in the unit square using 100 test points located inside the solution domain and 40 test points located on its boundary. The network contained 64 Gaussian functions.
With the network trained by the Nesterov method, it was not possible in 100 iterations to achieve an RMSE residual value of even $10^{-3}$ (Figure 3a).
With the network trained by the Levenberg-Marquardt method, an RMSE residual value of $10^{-6}$ was achieved in 100 iterations (Figure 4a). The plot of the network solution (Figure 4b) is almost identical to the analytical solution of the PDE.
Thus, the adapted Levenberg-Marquardt learning algorithm, with both analytical calculation of the derivatives and automatic differentiation, allows training radial basis function networks for solving approximation problems and PDEs.
Before solving the inverse problem for a piecewise homogeneous medium, the direct problem for such a medium was first solved on a network trained in MATLAB using the Levenberg-Marquardt algorithm. The problem was solved for constant values $k_1 = 2$ and $k_2 = 5$ of the functions describing the two media. In each of the two subdomains, 64 Gaussian functions were used, with 60 interior and boundary test points per subdomain and 20 test points along the media interface. The network was trained for 1500 iterations to an RMSE residual value of $10^{-12}$.
From the solution of the direct problem, values were taken at 40 points located on a grid; these values served as the additional conditions. To complicate the problem, no additional conditions were taken at the interface between the media. The radial basis function networks for solving the inverse problem contained the same number of basis functions and were trained on the same numbers of test points as the networks for solving the direct problem.
The results of solving the inverse problem are shown in Figure 5.
As can be seen in Figure 5a, the solution of the direct problem was restored well. The computed values of the medium function reflect the nature of the changes in the properties of the medium.

4. Discussion

Although the developed algorithms have been studied only on model problems, they can be expected to work well at least for problems of the same classes as the model problems considered here.
A theoretical justification of the convergence of the developed algorithms is more difficult. These issues are poorly developed for neural networks and are of interest for future research. Studies of other learning algorithms are also planned.
On the practical side, it is planned to develop custom extensions of neural network libraries that implement the proposed and newly developed algorithms. This will expand the ability to solve real-world problems and give other researchers access to the proposed algorithms for testing.

Author Contributions

Conceptualization, V.G.; Investigation, D.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Lagaris, I.E.; Likas, A.; Fotiadis, D.I. Artificial neural networks for solving ordinary and partial differential equations. IEEE Trans. Neural Netw. 1998, 9, 987–1000. [Google Scholar] [CrossRef] [PubMed]
  2. Yadav, N.; Yadav, A.; Kumar, M. An Introduction to Neural Network Methods for Differential Equations; Springer: Dordrecht, The Netherlands, 2015. [Google Scholar]
  3. Cybenko, G. Approximation by Superposition of a Sigmoidal Function. Math. Control Signals Syst. 1989, 2, 303–314. [Google Scholar] [CrossRef]
  4. Hornik, K. Approximation capabilities of multilayer feedforward networks. Neural Netw. 1991, 4, 251–257. [Google Scholar] [CrossRef]
  5. Hanin, B. Universal Function Approximation by Deep Neural Nets with Bounded Width and ReLU Activations. Mathematics 2019, 7, 992. [Google Scholar] [CrossRef]
  6. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics Informed Deep Learning (Part I): Data-Driven Solutions of Nonlinear Partial Differential Equations. arXiv 2017, arXiv:1711.10561. [Google Scholar] [CrossRef]
  7. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics Informed Deep Learning (Part II): Data-Driven Discovery of Nonlinear Partial Differential Equations. arXiv 2017, arXiv:1711.10566. [Google Scholar] [CrossRef]
  8. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019, 378, 686–707. [Google Scholar] [CrossRef]
  9. Lakshmanan, V.; Robinson, S.; Munn, M. Machine Learning Design Patterns: Solutions to Common Challenges in Data Preparation, Model Building, and MLOps; O’Reilly Media: Sebastopol, CA, USA, 2020. [Google Scholar]
  10. Baydin, A.G.; Pearlmutter, B.F.; Radul, A.A.; Siskind, J.M. Automatic Differentiation in Machine Learning: A Survey. J. Mach. Learn. Res. 2018, 18, 1–43. [Google Scholar]
  11. Tarkhov, D.; Vasilyev, A. Semi-Empirical Neural Network Modeling and Digital Twins Development; Academic Press: Cambridge, MA, USA, 2019. [Google Scholar]
  12. Ramabathiran, A.A.; Ramachandran, P. SPINN: Sparse, Physics-based, and partially Interpretable Neural Networks for PDEs. J. Comput. Phys. 2021, 445, 110600. [Google Scholar] [CrossRef]
  13. Gorbachenko, V.I.; Zhukov, M.V. Solving Boundary Value Problems of Mathematical Physics Using Radial Basis Function Networks. Comp. Math. Math. Phys. 2017, 57, 145–155. [Google Scholar] [CrossRef]
  14. Hryniowski, A.; Wong, A. DeepLABNet: End-to-end Learning of Deep Radial Basis Networks with Fully Learnable Basis Functions. arXiv 2019, arXiv:1911.09257. [Google Scholar] [CrossRef]
  15. Mostajeran, F.; Hosseini, S.M. Radial basis function neural network (RBFNN) approximation of Cauchy inverse problems of the Laplace equation. Comput. Math. Appl. 2023, 141, 129–144. [Google Scholar] [CrossRef]
  16. Xiao, J.-E.; Ku, C.-Y.; Liu, C.-Y. Solving Inverse Problems of Stationary Convection-Diffusion Equation Using the Radial Basis Function Method with Polyharmonic Polynomials. Appl. Sci. 2022, 12, 4294. [Google Scholar] [CrossRef]
  17. Liu, Z.; Chen, Y.; Song, G.; Song, W.; Xu, J. Combination of Physics-Informed Neural Networks and Single-Relaxation-Time Lattice Boltzmann Method for Solving Inverse Problems in Fluid Mechanics. Mathematics 2023, 11, 4147. [Google Scholar] [CrossRef]
  18. Alqezweeni, M.; Gorbachenko, V. Solution of Partial Differential Equations on Radial Basis Functions Networks. In Proceedings of the International Scientific Conference on Telecommunications, Computing and Control, St. Petersburg, Russia, 18 November 2019. [Google Scholar] [CrossRef]
  19. Liu, C.-Y.; Ku, C.-Y. A Novel ANN-Based Radial Basis Function Collocation Method for Solving Elliptic Boundary Value Problems. Mathematics 2023, 11, 3935. [Google Scholar] [CrossRef]
  20. Miaoli, M.; Xiaolong, W.; Honggui, H. Accelerated Levenberg–Marquardt Algorithm for Radial Basis Function Neural Network. In Proceedings of the Chinese Automation Congress, Shanghai, China, 6–8 November 2020. [Google Scholar] [CrossRef]
  21. Alqezweeni, M.M.; Glumskov, R.A.; Gorbachenko, V.I.; Stenkin, D.A. Solving Partial Differential Equations on Radial Basis Functions Networks and on Fully Connected Deep Neural Networks. In Proceedings of the International Conference on Intelligent Vision and Computing, Sur, Oman, 3–4 October 2021. [Google Scholar] [CrossRef]
  22. Bai, J.; Liu, G.-R.; Gupta, A.; Alzubaidi, L.; Feng, X.Q.; Gu, Y.T. Physics-informed radial basis network (PIRBN): A local approximating neural network for solving nonlinear partial differential equations. Comput. Methods Appl. Mech. Eng. 2023, 415, 116290. [Google Scholar] [CrossRef]
  23. Gorbachenko, V.I.; Stenkin, D.A. Physics-Informed Radial Basis Function Networks. Tech. Phys. 2023, 1–7. [Google Scholar] [CrossRef]
  24. Gorbachenko, V.I.; Stenkin, D.A. Physics-Informed Radial Basis Function Networks: Solving Inverse Problems for Partial Differential Equations. In Proceedings of the 2nd International Conference Cyber-Physical Systems and Control, St. Petersburg, Russia, 29 June–2 July 2021. [Google Scholar] [CrossRef]
  25. Kansa, E.J. Multiquadrics—A scattered data approximation scheme with applications to computational fluid-dynamics—I surface approximations and partial derivative estimates. Comput. Math. Appl. 1990, 19, 127–145. [Google Scholar] [CrossRef]
  26. Kansa, E.J. Multiquadrics—A scattered data approximation scheme with applications to computational fluid-dynamics—II solutions to parabolic, hyperbolic and elliptic partial differential equations. Comput. Math. Appl. 1990, 19, 147–161. [Google Scholar] [CrossRef]
  27. Buhmann, M.D. Radial Basis Functions: Theory and Implementations; Cambridge University Press: Cambridge, UK, 2004. [Google Scholar]
  28. Larsson, E.; Fornberg, B. A numerical study of some radial basis function solution methods for elliptic PDEs. Comput. Math. Appl. 2003, 46, 891–902. [Google Scholar] [CrossRef]
  29. Arora, G.; Bhatia, G.S. A Meshfree Numerical Technique Based on Radial Basis Function Pseudospectral Method for Fisher’s Equation. Int. J. Nonlinear Sci. Numer. Simul. 2020, 21, 37–49. [Google Scholar] [CrossRef]
  30. Stenkin, D.A.; Gorbachenko, V.I. Solving Equations Describing Processes in a Piecewise Homogeneous Medium on Radial Basis Functions Networks. In Proceedings of the International Conference on Neuroinformatics 2020, Moscow, Russia, 2–16 October 2020. [Google Scholar] [CrossRef]
  31. Sutskever, I.; Martens, J.; Dahl, G. On the importance of initialization and momentum in deep learning. In Proceedings of the 30th International Conference on Machine Learning, Atlanta, GA, USA, 16–21 June 2013. [Google Scholar]
  32. Marquardt, D.W. An algorithm for least-squares estimation of nonlinear parameters. J. Soc. Ind. Appl. Math. 1963, 11, 431–441. [Google Scholar] [CrossRef]
  33. Liesen, J. Krylov Subspace Methods: Principles and Analysis; Oxford University Press: Oxford, UK, 2015. [Google Scholar]
  34. Morozov, V.A. Methods for Solving Incorrectly Posed Problems; Springer: New York, NY, USA, 1984. [Google Scholar]
  35. Franke, R. Scattered data Interpolation: Tests of some Methods. Math. Comput. 1982, 38, 181–200. [Google Scholar] [CrossRef]
Figure 1. Franke function of the (a) analytically calculated function and the (b) approximated function.
Figure 2. Radial basis functions when approximating the Franke function (a) during network initialization and (b) after network training.
Figure 3. Results of solving the Poisson equation in a network trained by the Nesterov method. (a) Dependence of the root mean squared error (RMSE) of the residual on the iteration number; (b) solution obtained in the network.
Figure 4. Results of solving the Poisson equation in a network trained by the Levenberg-Marquardt method. (a) Dependence of the root mean squared error (RMSE) of the residual on the iteration number; (b) solution obtained in the network.
Figure 5. Solution results for a piecewise homogeneous medium. (a) Restored solution of the direct problem; (b) computed values of the medium function.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
