Article

Multi-Step Physics-Informed Deep Operator Neural Network for Directly Solving Partial Differential Equations

1 School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang 621010, China
2 Hypervelocity Aerodynamics Institute, China Aerodynamics Research and Development Center, Mianyang 621000, China
3 National Key Laboratory of Aerospace Physics in Fluids, Mianyang 621000, China
4 China Academy of Launch Vehicle Technology, Beijing 100076, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Appl. Sci. 2024, 14(13), 5490; https://doi.org/10.3390/app14135490
Submission received: 12 May 2024 / Revised: 11 June 2024 / Accepted: 20 June 2024 / Published: 25 June 2024
(This article belongs to the Section Applied Physics General)

Abstract

This paper establishes a method for solving partial differential equations using a multi-step physics-informed deep operator neural network. The network is trained by embedding physics-informed constraints. Different from traditional neural networks for solving partial differential equations, the proposed method uses a deep neural operator network to indirectly construct the mapping relationship between the variable functions and solution functions. This approach makes full use of the hidden information between the variable functions and independent variables. The process whereby the model captures incredibly complex and highly nonlinear relationships is simplified, thereby making network learning easier and enhancing the extraction of information about the independent variables in partial differential systems. In terms of solving partial differential equations, we verify that the multi-step physics-informed deep operator neural network markedly improves the solution accuracy compared with a traditional physics-informed deep neural operator network, especially when the problem involves complex physical phenomena with large gradient changes.

1. Introduction

Machine learning, especially deep learning, has received increasing attention due to its breakthrough performance in various cognitive applications. In recent years, partial differential equations have been modeled using neural networks (NNs), which can be regarded as universal approximators of nonlinear functions. On this basis, Raissi et al. [1] proposed physics-informed neural networks (PINNs), which changed the conventional method of modeling physical systems. When dealing with physical systems, PINNs utilize available measurement data to identify a parameterized partial differential equation (PDE), which provides additional information for NN training [1,2,3,4,5].
PINNs can solve difficult problems in which local boundary conditions are missing, such as the thermal boundary conditions in heat transfer problems [6], or problems that rely on small displacement measurements [7] to detect holes and defects in materials when performing inference tasks. Recently, many researchers have used PINNs in the field of fluid dynamics. In applied research, Raissi et al. [8] developed hidden fluid mechanics using PINNs to directly extract the velocity and pressure fields of a fluid flow from images. Jin et al. [9] used PINNs to precisely encode the Navier–Stokes governing equations into deep NNs, overcoming the challenges faced by traditional numerical discretization methods for incompressible laminar and turbulent flows, such as strong grid dependence, missing boundary conditions, and the high computational cost of inverse problems. Moreover, using the velocity–pressure and vorticity–velocity formulations, they developed Navier–Stokes flow networks. Zhu et al. [10] applied PINNs to the modeling of three-dimensional metal processing and required only a small amount of labeled data to learn the dynamic changes in temperature and melt distribution. Song et al. [11] presented a PINN framework for identifying constitutive parameters in soft materials, achieving under 5% error in complex geometries and noisy conditions; their approach leverages multi-modal synthetic datasets to ensure robust performance across various testing scenarios. For more research findings, please refer to reference [12]. Jagtap et al. [13] used adaptive activation functions to approximate the solutions of linear and nonlinear partial differential equations. This method has high solution accuracy and fast convergence speed, especially in the early training stage. Xiang et al. [14] observed that the loss function, composed of a weighted combination of multiple loss sub-functions, plays an important role in training PINNs. Therefore, they proposed an adaptive loss function method based on maximum likelihood estimation, which automatically assigns the loss weights by updating the noise parameters at each iteration step. Building on this foundation, Li et al. [15] proposed using a minimax algorithm to adaptively adjust the weights of the loss function. Yu et al. [16] proposed a gradient-enhanced PINN that embeds the gradient information of the PDE residuals into the loss function when training PINN models and proved the effectiveness of this approach for forward and inverse PDE problems. To improve the modeling efficiency, Jagtap et al. [17] proposed a conservative PINN over discrete domains based on nonlinear conservation laws. This enables the solution process of PINNs to be parallelized, thus enhancing the solution efficiency. Jagtap et al. [18] then generalized this method to problems with various complex computational domains. Although PINNs have achieved remarkable effectiveness, they rely on specific boundary conditions, initial conditions, and special source terms during the training stage. Consequently, inference requires extensive expertise and incurs significant costs. The application of transfer learning under specific conditions can alleviate this limitation [19,20], although this often leads to strong specialization toward the new target domain and a loss of generalized reasoning capability. Thus, PINNs cannot produce inferences in real time under different boundary conditions, initial conditions, and loads.
If a cross-scenario prediction method with generalized reasoning capabilities could be established, the time cost would be almost negligible compared with traditional numerical solvers (such as CFD or solid mechanics simulators).
To establish the generalization reasoning capability of neural networks, Lu et al. [21] proposed the deep operator network (DeepONet). This network is based on the universal approximation theorem of operators [21,22], which has been theoretically proven. DeepONet predefines an input space with input generalization ability before training, such that the trained model can generalize predictions over the input space and achieve generalization reasoning capability. Additionally, DeepONet is unaffected by the curse of dimensionality in the input space [23], which has great application value in engineering problems. Jin et al. [24] employed a machine learning framework with deep neural operators (DeepONet) to map microstructures to mechanical responses in metamaterials, achieving prediction accuracies within 5–10% from sparse data. This approach marks a significant advance in the inverse design of materials with complex, nonlinear properties. The cost of DeepONet is that it requires numerous expensive experiments or high-fidelity simulation datasets to represent the input space, making it difficult for DeepONet to solve PDEs containing variable functions. Moreover, the fitted output function in DeepONet is not guaranteed to satisfy the underlying PDEs and only provides a rough approximation of the target solution operator. Thus, DeepONet lacks interpretability at the physical level.
DeepONet and PINNs are complementary approaches. Therefore, Wang et al. [25] combined DeepONet with a PINN to give PIDeepONet, which makes use of NNs and solution function spaces to directly establish mapping relationships. This requires the network to capture very complex and highly nonlinear relationships [26]. Hence, the network learning process can be difficult and false model convergence may occur, meaning that accurate prediction results are not guaranteed.
This article establishes a new method for directly solving arbitrary PDEs. The proposed method does not rely on data and has strong predictive generalization capabilities and high accuracy. Inspired by the latent variable model [27], our approach breaks the direct mapping relationship between PIDeepONet and the solution function space and establishes an intermediate variable model. The model uses latent variables to encode the information bottleneck between the inputs and output and incorporates a multi-step deep network structure. Essentially, our proposed multi-step physics-informed deep operator neural network (MulSPIDeepONet) for solving PDEs combines the advantages of PINNs and DeepONet.
This paper describes the novel NN structure of MulSPIDeepONet, which simplifies the mapping process between variable functions and solution functions to achieve enhanced interpretability and generalization capabilities. The architecture of MulSPIDeepONet can be described as follows. First, the independent variables of the variable function and the solution function are represented by two subnetworks, and we obtain the transition operator of the solution function through the dot product operation. Second, the transition operator of the solution function provides the input to the transition subnetwork. Third, to enhance the information fusion between the network output and the trunk network, we take the dot product of the transition subnetwork and the trunk network to obtain the final solution function. Finally, physics-informed constraints are applied as additional penalty terms in the loss function. MulSPIDeepONet can be easily extended to solve any PDE.
To verify the effectiveness of the MulSPIDeepONet model, we perform data-independent modeling to solve three PDEs. Numerical simulation results demonstrate that the proposed MulSPIDeepONet achieves significantly improved model accuracy and generalization capability compared with the baseline PIDeepONet in solving these PDEs. The effect is especially significant when the model is applied to solve PDEs characterized by large gradient changes and discontinuities.

1.1. Our Contributions

Overall, we make the following three contributions:
  • We outline approximation methods for nonlinear functions and nonlinear operators, along with the PIDeepONet, which we consider as the baseline method in this paper.
  • We introduce a new structure named MulSPIDeepONet. In this architecture, the outputs from the trunk and branch networks are first merged via a dot product operation to form a transitional operator. This operator serves as an intermediate variable for the solution operator mapping and is used as the input to a transitional subnetwork. Following this, the input function and its associated trunk network are linked with the transitional subnetwork, establishing a connection through the dot product operation. Finally, the output from the transitional subnetwork is combined with that of the trunk network through a dot product operation to produce the final network output. The proposed MulSPIDeepONet framework is designed to be easily adaptable for solving arbitrary PDEs.
  • The numerical results from three example PDEs indicate that the proposed MulSPIDeepONet achieves approximately 2–3× improvement in accuracy over the baseline PIDeepONet, particularly in areas with significant gradient changes or discontinuities in the equations.

1.2. Roadmap

The remainder of this paper is organized as follows. In Section 2, we introduce approximation methods for nonlinear functions and nonlinear operators and describe the DeepONet neural operator and physics-informed DeepONet. Finally, the proposed MulSPIDeepONet architecture is presented. We solve three PDEs to verify the effectiveness of the proposed method in Section 3. In Section 4, we summarize the results of this study and provide some ideas for future work.

2. Method

2.1. General Nonlinear Functions and Nonlinear Operator Approximations

In this section, we briefly review nonlinear functions and operators based on NNs and emphasize several key principles of general approximation. Consider the continuous function $f(u)$ defined on $U$, where $U \subset C[a, b]$, and a bounded function $\sigma$ representing the sigmoid function or some other nonlinear function. For $u \in U$, $f(u)$ can be approximated as follows [28]:
$$\left| f(u) - \sum_{i=1}^{N} c_i \, \sigma\!\left( \sum_{j=1}^{m} \varepsilon_{i,j}\, u(x_j) + \theta_i \right) \right| < \delta,$$
where $\delta$ is a given error tolerance, $c_i$, $\varepsilon_{i,j}$, and $\theta_i$ are real numbers, $u(x_j)$ is the value of the input function $u(x)$ at the $j$-th of the $m$ points $x_1, x_2, \ldots, x_m$, and $N$ is the number of neurons in the hidden layer, which is a hyperparameter of the NN. Equation (1) is known as the universal approximation theorem for general linear and nonlinear functions.
Chen et al. [22] further extended the universal approximation theorem for general linear and nonlinear functions to dynamical systems described by PDEs, whose input–output mappings are called operators. Specifically, the operator $G$ mapping the function $u(x)$ to the function $G(u)(y)$ is approximated by the following expression [29]:
$$\left| G(u)(y) - \sum_{k=1}^{N} \sum_{i=1}^{M} c_i^{k}\, g\!\left( \sum_{j=1}^{m} \varepsilon_{i,j}^{k}\, u(x_j) + \theta_i^{k} \right) g\!\left( \sum_{l=1}^{n} w_{k,l}\, y_l + \zeta_k \right) \right| < \delta,$$
where $c_i^{k}$, $\varepsilon_{i,j}^{k}$, $\theta_i^{k}$, $w_{k,l}$, and $\zeta_k$ are constants, with $i = 1, 2, \ldots, M$, $k = 1, 2, \ldots, N$, and $l = 1, 2, \ldots, n$; $x_j$ is the $j$-th of the $m$ points $x_1, x_2, \ldots, x_m$ at which the input function is evaluated, and $y_l$ is the $l$-th component of the independent variable $y = (y_1, y_2, \ldots, y_n)$ of the output function. $G$ is a nonlinear continuous operator and $g$ is a nonlinear activation function, such as the sigmoid function or ReLU. $N$ and $M$ are hyperparameters of the NN.

2.2. DeepONet and PIDeepONet Architectures

Inspired by the nonlinear operator approximation theorem, Lu et al. [21] recently proposed the DeepONet architecture. In this architecture, they named the network structures related to the input function u and the output variable y as the branch and trunk networks, respectively. Therefore, the general nonlinear operator approximate expression can be written as follows:
$$\left| G(u)(y) - \sum_{k=1}^{N} \underbrace{\sum_{i=1}^{M} c_i^{k}\, g\!\left( \sum_{j=1}^{m} \varepsilon_{i,j}^{k}\, u(x_j) + \theta_i^{k} \right)}_{\text{Branch Net}} \underbrace{g\!\left( \sum_{l=1}^{n} w_{k,l}\, y_l + \zeta_k \right)}_{\text{Trunk Net}} \right| < \delta.$$
The subexpressions related to the trunk and branch networks are explicitly divided into stacked and unstacked components according to whether the branch network has multiple subnetworks. Essentially, the unstacked type merges multiple subbranches of the network into one branch (as demonstrated in Figure 1).
Lu et al. also proposed a generalized approximation theorem for operators based on the unstacked DeepONet architecture [21], which is expressed as follows:
$$\left| G(u)(y) - \Big\langle \underbrace{\boldsymbol{g}\big( u(x_1), u(x_2), \ldots, u(x_m) \big)}_{\text{Branch Net}},\; \underbrace{\boldsymbol{f}(y)}_{\text{Trunk Net}} \Big\rangle \right| < \delta.$$
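For concreteness, this dot-product structure can be sketched in a few lines of JAX (the framework used for the experiments in Section 3). The `mlp` helper, the parameter layout, and the function names below are illustrative assumptions, not the authors' implementation.

```python
import jax.numpy as jnp

def mlp(params, x):
    # Plain fully connected network with tanh activations;
    # `params` is a list of (W, b) pairs (an assumed layout).
    for W, b in params[:-1]:
        x = jnp.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

def deeponet_forward(branch_params, trunk_params, u_sensors, y):
    # u_sensors: the values u(x_1), ..., u(x_m) of the input function.
    # y: the coordinate(s) at which the output function is evaluated.
    b = mlp(branch_params, u_sensors)   # branch features
    t = mlp(trunk_params, y)            # trunk features
    return jnp.sum(b * t)               # dot product <branch, trunk>
```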
Wang et al. [25] were inspired by PINNs and the DeepONet deep neural operator. They combined the advantages of DeepONet with those of PINNs to produce PIDeepONet. This architecture embeds the explicit formula of the PDE system into the network loss function as an additional physical constraint through automatic differentiation technology, thereby greatly reducing the need for training data. The network architecture of PIDeepONet is shown in Figure 2. The biggest difference between PIDeepONet and DeepONet is that the network output uses automatic differentiation to formulate an appropriate regularization mechanism, so that the target output function satisfies any given differential constraint. Therefore, the loss function of PIDeepONet is written as follows:
$$\mathcal{L}(\theta) = \mathcal{L}_{\mathrm{operator}}(\theta) + \mathcal{L}_{\mathrm{physics}}(\theta),$$
where
$$\mathcal{L}_{\mathrm{physics}}(\theta) = \frac{1}{NQ} \sum_{i=1}^{N} \sum_{j=1}^{Q} \left| \mathcal{N}\!\left( u^{i},\, G_{\theta}(u^{i})(y_{u,j}^{i}) \right) \right|^{2},$$
and
$$\mathcal{L}_{\mathrm{operator}}(\theta) = \frac{1}{NP} \sum_{i=1}^{N} \sum_{j=1}^{P} \left| G_{\theta}(u^{i})(y_{u,j}^{i}) - G(u^{i})(y_{u,j}^{i}) \right|^{2},$$
in which $\theta$ represents the set of all trainable weights and bias parameters in the trunk and branch networks, $\{u^{i}\}_{i=1}^{N}$ denotes the $N$ independently sampled input functions, and $\{y_{u,j}^{i}\}_{j=1}^{P}$ are $P$ positions determined by data observations, initial conditions, or boundary conditions. Additionally, $\{y_{u,j}^{i}\}_{j=1}^{Q}$ is a set of collocation points that can be randomly sampled in the domain. Therefore, $\mathcal{L}_{\mathrm{operator}}(\theta)$ is used to fit the available solution measurements, while $\mathcal{L}_{\mathrm{physics}}(\theta)$ imposes the basic PDE constraints.
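A minimal sketch of how this composite loss can be assembled, assuming a forward function `apply_fn` and an autodiff-based residual function `residual_fn` are available (both names are placeholders, not the authors' API):

```python
import jax.numpy as jnp

def pideeponet_loss(params, apply_fn, residual_fn,
                    u_batch, y_data, s_data, y_colloc):
    # apply_fn(params, u, y)    -> predicted solution G_theta(u)(y)
    # residual_fn(params, u, y) -> PDE residual at y, computed with jax.grad
    pred = apply_fn(params, u_batch, y_data)
    loss_operator = jnp.mean((pred - s_data) ** 2)   # data / IC / BC term
    loss_physics = jnp.mean(residual_fn(params, u_batch, y_colloc) ** 2)
    return loss_operator + loss_physics
```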
We propose the MulSPIDeepONet architecture, which is based on the general nonlinear operator approximation theorem and incorporates the PIDeepONet structure.

2.3. MulSPIDeepONet Architecture

We now describe the MulSPIDeepONet structure, which considers the indirect mapping between the input function and the output function. Through physical constraints, this architecture achieves a better ability to solve arbitrary PDEs with relatively little training data.
MulSPIDeepONet is based on the advanced network architecture of DeepONet, which consists of a branch network, trunk network, and independent transition subnetwork. An independent transition subnetwork is established to learn the transition operator, thus enhancing the information utilization between the variable function and the independent variable of the solution function. This subnetwork also improves the feature recognition of the network with regard to the independent variable.
Our model uses the DeepONet architecture [21] and considers the form of general parametric PDEs. MulSPIDeepONet can be expressed as follows:
$$\left| F(u)(y)(G) - \sum_{s=1}^{U} \underbrace{g\!\left( \sum_{l=1}^{n} w_{s,l}\, y_l + \zeta_s \right)}_{\text{Trunk Net}} \underbrace{g\!\left( \sum_{q=1}^{d} p_{q,s}\, G(u, y)_q + \varsigma_s \right)}_{\text{Transition SubNet}} \right| < \delta,$$
where G ( u , y ) is the output of DeepONet:
$$G(u, y) = \sum_{k=1}^{N} \underbrace{\sum_{i=1}^{M} c_i^{k}\, g\!\left( \sum_{j=1}^{m} \varepsilon_{i,j}^{k}\, u(x_j) + \theta_i^{k} \right)}_{\text{Branch Net}} \underbrace{g\!\left( \sum_{l=1}^{n} w_{k,l}\, y_l + \zeta_k \right)}_{\text{Trunk Net}},$$
in which $c_i^{k}$, $\varepsilon_{i,j}^{k}$, $\theta_i^{k}$, $w_{k,l}$, $\zeta_k$, $p_{q,s}$, and $\varsigma_s$ are the weights and biases of the corresponding networks, which are constants. $N$, $M$, and $U$ are network hyperparameters. Therefore, MulSPIDeepONet can be expressed as follows:
$$\left| F(u)(y)(G) - \Big\langle F_{NN}\!\Big( \big\langle \underbrace{\boldsymbol{g}\big( u(x_1), u(x_2), \ldots, u(x_m) \big)}_{\text{branch}},\; \underbrace{\boldsymbol{f}(y)}_{\text{trunk}} \big\rangle \Big),\; \underbrace{\boldsymbol{f}(y)}_{\text{trunk}} \Big\rangle \right| < \delta,$$
where $F(u)(y)(G)$ is the final network output, and $F_{NN}$ represents the transition subnetwork. The main purposes are to simplify the mapping of the network to the solution function and reduce the difficulty of capturing the complex and highly nonlinear relationship. The MulSPIDeepONet network architecture is depicted in Figure 3.
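The corresponding forward pass can be sketched as follows, reusing the `mlp` helper from the DeepONet sketch in Section 2.2. The single-component transition operator corresponds to the transition-subnetwork input size of 1 used in Tables 1 and 6 (for scalar solution fields); all function names are illustrative assumptions rather than the authors' implementation.

```python
import jax.numpy as jnp

def mulspideeponet_forward(branch_params, trunk_params, trans_params,
                           u_sensors, y):
    b = mlp(branch_params, u_sensors)     # branch features
    t = mlp(trunk_params, y)              # trunk features
    g = jnp.sum(b * t, keepdims=True)     # transition operator G(u, y), shape (1,)
    h = mlp(trans_params, g)              # transition subnetwork features
    return jnp.sum(h * t)                 # final output: <transition, trunk>
```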
MulSPIDeepONet adopts an unstacked DeepONet structure. First, the outputs from the trunk and branch networks are merged through the dot product operation to obtain the transition operator, which is used as the intermediate variable for mapping the solution operator and forms the input to the transition subnetwork. Through the transition subnetwork, the characteristics of the branch and trunk networks are fused to enhance the utilization of hidden information between the two networks, thereby establishing an indirect mapping relationship with the solution function.
The accuracy of the PINN model is closely related to the physical constraints between the independent variables of the solution function. The key is to enhance the feature fusion of the solution function and the independent variables, as this has a great impact on improving the accuracy of the network model. To this end, the proposed architecture connects the input function and the trunk network with the transition subnetwork, and forms an association through the dot product operation. Finally, the outputs of the transition subnetwork and the trunk network are combined through the dot product operation to obtain the final network output. This improves the physics-informed fusion of the independent variables of the solution function and expands the recognition ability of the physical laws. In this way, MulSPIDeepONet captures regional information with complex physical changes more accurately. Furthermore, the network output embeds physics-informed constraints through automatic differentiation technology, with these physical constraints forming part of the loss function. Therefore, the proposed method not only achieves data-independent training but also explains the output results at the physical level.
MulSPIDeepONet uses automatic differentiation to formulate an appropriate regularization mechanism for the output of the network. This guarantees that the target output function satisfies any given differential constraint. Thus, the loss function is defined as follows:
$$\mathcal{L}(\theta) = \mathcal{L}_{\mathrm{operator}}(\theta) + \mathcal{L}_{\mathrm{physics}}(\theta),$$
where
$$\mathcal{L}_{\mathrm{physics}}(\theta) = \frac{1}{NQ} \sum_{i=1}^{N} \sum_{j=1}^{Q} \left| \mathcal{N}\!\left( u^{i},\, F_{\theta}(u^{i})(y_{u,j}^{i})(G_{u,y,j}^{i}) \right) \right|^{2},$$
and
$$\mathcal{L}_{\mathrm{operator}}(\theta) = \frac{1}{NP} \sum_{i=1}^{N} \sum_{j=1}^{P} \left| F_{\theta}(u^{i})(y_{u,j}^{i})(G_{u,y,j}^{i}) - F(u^{i})(y_{u,j}^{i})(G_{u,y,j}^{i}) \right|^{2},$$
in which $\theta$ represents the set of all trainable weights and bias parameters in the trunk network, branch network, and transition subnetwork. The variables $\{u^{i}\}_{i=1}^{N}$, $\{y_{u,j}^{i}\}_{j=1}^{P}$, and $\{y_{u,j}^{i}\}_{j=1}^{Q}$ are the same as in Equations (6) and (7). $\mathcal{L}_{\mathrm{operator}}(\theta)$ is used to fit the available solution measurements, while $\mathcal{L}_{\mathrm{physics}}(\theta)$ imposes the basic PDE constraints.

3. Experimental Verification

This section presents the modeling results of MulSPIDeepONet. We verify the effectiveness of the proposed method by solving three different PDE systems and compare the results with those from the baseline NN model. To ensure fairness, all networks in the following examples have a similar number of trainable parameters or weights, facilitating a more accurate comparison of the generalization performance of the model. Additionally, fixed random seeds are used to minimize the effects of randomness.
The baseline model in this paper is the classic PIDeepONet architecture. All test results are obtained using an NVIDIA 3060 Ti GPU with the open-source Jax deep learning framework, and the ground-truth results of the PDEs are obtained from analytical solutions and numerical simulations. To comprehensively evaluate the fitting ability of the constructed model in solving PDEs, we select several performance indicators: the mean absolute error (MAE), relative mean absolute error (rMAE), relative root mean square error (rRMSE), and $L_2$ norm error:
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i^{\mathrm{true}} - y_i^{\mathrm{predict}} \right|,$$
$$\mathrm{rMAE} = \frac{1}{n} \sum_{i=1}^{n} \frac{\left| y_i^{\mathrm{true}} - y_i^{\mathrm{predict}} \right|}{\left| y_i^{\mathrm{true}} \right|},$$
$$\mathrm{rRMSE} = \frac{\sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i^{\mathrm{true}} - y_i^{\mathrm{predict}} \right)^{2}}}{y^{\mathrm{average}}},$$
$$L_2\ \mathrm{norm\ error} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i^{\mathrm{true}} - y_i^{\mathrm{predict}} \right)^{2}},$$
where $n$ is the number of cells in the computational grid and $y^{\mathrm{average}}$ denotes the mean of the reference values $y_i^{\mathrm{true}}$.
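These metrics map directly onto code; the small helper below is an illustrative implementation, where normalizing rRMSE by the mean of the reference values is our reading of $y^{\mathrm{average}}$.

```python
import jax.numpy as jnp

def error_metrics(y_true, y_pred):
    err = y_true - y_pred
    mae = jnp.mean(jnp.abs(err))                              # MAE
    rmae = jnp.mean(jnp.abs(err) / jnp.abs(y_true))           # rMAE
    rrmse = jnp.sqrt(jnp.mean(err ** 2)) / jnp.mean(y_true)   # rRMSE
    l2_error = jnp.sqrt(jnp.mean(err ** 2))                   # L2 norm error
    return mae, rmae, rrmse, l2_error
```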

3.1. Advection Equation

We first study the ability of PIDeepONet and MulSPIDeepONet to process advection equations. Such PDEs describe transport in fast-flowing media (such as a gas or liquid), and their numerical solution can be difficult; in particular, sharp gradients or non-physical oscillations may occur [30]. Traditional reduced-order models also face significant challenges in obtaining a solution [31]. Therefore, this paper considers a linear advection partial differential system with simple hyperbolic variable coefficients:
$$\frac{\partial s}{\partial t} + u(x)\, \frac{\partial s}{\partial x} = 0, \qquad (x, t) \in (0, 1) \times (0, 1).$$
The initial and boundary conditions are as follows:
$$s(x, 0) = \sin(\pi x),$$
$$s(0, t) = \sin\!\left( \frac{\pi}{2} t \right),$$
where $u(x)$ is a variable function. To make the input function $u(x)$ strictly positive, we let $u(x) = \nu(x) - \min_x \nu(x) + 1$, where $\nu(x)$ is sampled from a Gaussian random field. The data used to train the MulSPIDeepONet model are obtained by sampling the random function $\nu(x)$ generated by the Gaussian random field [25]. The generalization performance of the model is verified by evaluating the prediction accuracy for variable functions $u(x)$ that are not included in the training process. The training goal is to learn the solution operator that maps the variable coefficient (variable function) $u(x)$ to the solution $s(x, t)$.
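The PDE residual for this advection system can be evaluated with automatic differentiation. The sketch below assumes a scalar prediction function `s_fn(x, t)` wrapping the trained network and a value `u_at_x` of the variable coefficient at the collocation point; both names are placeholders.

```python
import jax

def advection_residual(s_fn, u_at_x, x, t):
    # s_fn(x, t) -> scalar network prediction s(x, t).
    s_t = jax.grad(s_fn, argnums=1)(x, t)   # ds/dt
    s_x = jax.grad(s_fn, argnums=0)(x, t)   # ds/dx
    return s_t + u_at_x * s_x               # vanishes for the true solution
```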
The model learning rate is set as 0.001, and the Adam optimizer is used. The training set includes 1000 groups of training data. The specific network size settings are listed in Table 1, and the network settings determine the depth of each network and the number of neurons in each layer. The results for the residual loss (Loss-Res) and boundary loss (Loss-Bcs) during the training process are shown in Figure 4.
It can be seen from Figure 5 that PIDeepONet trained with physics-informed constraints is in good agreement with the reference PDE solution derived under the conditions specified in Figure 6, but its accuracy needs to be improved when dealing with PDE systems with stiff, turbulent, or chaotic dynamics [25]. For example, in this test case, the PIDeepONet model produces obvious errors in areas with large gradient changes (such as the area in the red triangle in Figure 5), where the average absolute error is 0.0218. Using the proposed method, the average absolute error of the test results is 0.0083 in the area where the gradient changes significantly. More comparative results are given in Table 2, and additional experimental results are shown in Figure 7 and Figure 8, which correspond to the solutions under the input functions presented in Figure 9 and Figure 10, respectively.
In Figure 7, the average absolute error between the prediction results of PIDeepONet and the analytical solution in the region with large gradient changes is 0.0337, while the average absolute error between the prediction results of MulSPIDeepONet and the analytical solution in this region is 0.0108. In Figure 8, the average absolute error between the prediction results of PIDeepONet and the analytical solution in the region with large gradient changes is 0.0307, while the average absolute error between MulSPIDeepONet and the analytical solution in this area is 0.0070. More comparative indicators are presented in Table 3.
The MulSPIDeepONet architecture represents an improvement over the PIDeepONet method. First, by sharing the independent variable information in the variable function and the solution function, and then applying feature fusion between these two, an indirect mapping of the solution function is efficiently achieved. This method simplifies the task of the network in capturing extremely complex and highly nonlinear relationships, which not only simplifies the learning process of the network, but also improves the ability to identify and analyze complex physical phenomena. In dealing with large gradient changes, as are often encountered in physical problems, the experimental results show that the proposed method effectively captures and reveals the complex physical processes behind the phenomenon.
Through comparisons and analysis of the solution results, we find that the MulSPIDeepONet architecture proposed in this paper has an obvious effect in reducing the error in large-gradient regions. Compared with the baseline model, the error is reduced by a factor of three, which reflects the superior generalization performance of MulSPIDeepONet compared with the baseline model. This improvement provides a new perspective and solution for dealing with complex physical problems in regions with large gradient changes. In general, the MulSPIDeepONet architecture has unique advantages and higher-level analysis capabilities in the face of complex physical scenes with large gradient changes. Thus, it provides a new approach for deep learning in the field of physical problems.

3.2. One-Dimensional Parametric Euler Equation

The conservation of mass, momentum, and energy in a compressible, inviscid flow can be described by the Euler equation. The conservative form of the Euler equation can be written as follows [32]:
$$\frac{\partial U}{\partial t} + \nabla \cdot f(U) = 0, \qquad x \in \Omega \subset \mathbb{R}^{d},\ d = 1, 2, \qquad t \in (0, T].$$
The one-dimensional Euler equation has the following form:
$$U = \begin{pmatrix} \rho \\ \rho u \\ \rho E \end{pmatrix}, \qquad f(U) = \begin{pmatrix} \rho u \\ \rho u^{2} + p \\ u\,(\rho E + p) \end{pmatrix},$$
where ρ is the density, p is the pressure, u is the velocity, and E is the total energy. To close the equation, a state equation describing the relationship between pressure and energy is introduced. In this paper, we consider the state equation for an ideal gas:
$$p = (\gamma - 1) \left( \rho E - \frac{1}{2} \rho \| u \|^{2} \right),$$
where γ = 1.4 is the adiabatic coefficient.
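This ideal-gas closure translates directly into code; a one-line helper (illustrative only, with assumed argument names):

```python
def pressure(rho, u, E, gamma=1.4):
    # p = (gamma - 1) * (rho * E - 0.5 * rho * |u|^2), ideal gas with gamma = 1.4
    return (gamma - 1.0) * (rho * E - 0.5 * rho * u ** 2)
```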
In this class of PDE, even if the initial conditions are smooth, the solutions will be discontinuous at some finite time [33]. In such a case, it is difficult to obtain analytical solutions. Thus, numerical methods are widely used to find approximate solutions to such equations. The goal is to perform non-oscillatory reconstruction around the discontinuities. Therefore, this example aims to verify the ability of PIDeepONet and MulSPIDeepONet to deal with one-dimensional Euler equations with moving contact discontinuities [34].
In this example, the range of the computational domain x is [0, 1], the initial discontinuity is at x = 0.5 , and the left and right sides of the discontinuity can be described by
$$(\rho_L, u_L, p_L) = (a, 0.1, 1.0), \qquad (\rho_R, u_R, p_R) = (1.0, 0.1, 1.0),$$
where $a$ is a variable parameter. To reduce the computational overhead, the value range for the training process is $a \in [1.2, 5.4]$. We use Dirichlet boundary conditions and specify the exact solution as follows:
$$\rho(x, t) = \begin{cases} a, & x < 0.5 + 0.1\,t, \\ 1.0, & x > 0.5 + 0.1\,t, \end{cases} \qquad u(x, t) = 0.1, \qquad p(x, t) = 1.0.$$
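A small helper that evaluates this exact solution, e.g. for generating reference fields when testing the trained operators (names are illustrative assumptions):

```python
import jax.numpy as jnp

def exact_contact_solution(a, x, t):
    # Contact discontinuity located at x = 0.5 + 0.1 t, advected at u = 0.1.
    rho = jnp.where(x < 0.5 + 0.1 * t, a, 1.0)
    u = jnp.full_like(x, 0.1)
    p = jnp.full_like(x, 1.0)
    return rho, u, p
```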
Sampling different points on a hypercube within the range of a, we obtain sufficient training data. The prediction accuracy is evaluated using values of the variable parameter that are not used during model training. The training goal is to learn the solution operator that maps the variable parameter a to the solution of Equation (19). Sixty randomly distributed boundary points are used for training. The number of initial training points is 60 and the number of function training points is 1000. The distribution of training points under each set of training parameters is shown in Figure 11.
In this example, the learning rate is set to 0.005, and the Adam optimizer is used to train the model for 60,000 steps. An exponential decay learning-rate strategy is adopted: the learning rate decays every 3000 steps with a decay rate of 0.95. The training set includes 50 sets of training data randomly sampled in the range $a \in [1.2, 5.4]$. The specific network size settings are listed in Table 4, which displays the depth of each network and the number of neurons in each layer. The training loss curves are shown in Figure 12.
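These optimizer settings map naturally onto an optax schedule; the sketch below is one way to set this up and is not necessarily the authors' exact configuration (whether the decay is applied step-wise is an assumption).

```python
import optax

# Initial learning rate 0.005, decayed by a factor of 0.95 every 3000 steps.
schedule = optax.exponential_decay(
    init_value=5e-3,
    transition_steps=3000,
    decay_rate=0.95,
    staircase=True,
)
optimizer = optax.adam(learning_rate=schedule)
```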
Figure 13 shows the density and velocity curves of Equation (19) solved by PIDeepONet and MulSPIDeepONet at $t = 2.0$ for $a = 4.4$, a parameter value not seen during training. The experimental results show that the velocity curves of PIDeepONet and MulSPIDeepONet are almost consistent with the analytical solution, but the density curve given by PIDeepONet at the discontinuity has obvious errors. In contrast, MulSPIDeepONet makes full use of the information fusion between the trunk network and the branch network and enhances the feature fusion between the final output of the network and the independent variables of the solution function. Thus, it captures the occurrence of discontinuities more accurately, and the density curve in the discontinuous area exhibits better fitting performance.
Figure 14 shows the changes in the density field with time when a = 4.4 . Panels (c) and (d) indicate that the proposed method is closer to the analytical solution at the discontinuity, demonstrating that MulSPIDeepONet has better processing ability in the face of strong discontinuities.
Figure 15 shows the density fields predicted by PIDeepONet and MulSPIDeepONet for $a = 5.0$, a parameter value not seen during training. The average absolute error between the PIDeepONet results and the analytical solution at the discontinuity (the area framed by the red rectangle) is 0.7726; PIDeepONet clearly dissipates the discontinuity. In contrast, the average absolute error between the MulSPIDeepONet results and the analytical solution in this region is just 0.1918, a substantial improvement. For more test error results, see Table 5.
The experimental results show that PIDeepONet produces obvious errors at the discontinuity. MulSPIDeepONet provides better compensation for the dissipation at the discontinuity, resulting in predictions that are about three times better than PIDeepONet and improved generalization ability. This experimental result confirms that MulSPIDeepONet can effectively capture and reveal the processes behind complex physical phenomena with moving contact discontinuities. This is an important achievement for theoretical research.

3.3. Eikonal Equation

This example aims to highlight the ability of PIDeepONet and MulSPIDeepONet to handle different input functions in two dimensions. We consider the following two-dimensional eikonal equation:
$$\| \nabla s(\mathbf{x}) \|_{2} = 1, \qquad s(\mathbf{x}) = 0, \ \mathbf{x} \in \partial\Omega,$$
where $\mathbf{x} = (x, y) \in \mathbb{R}^{2}$ denotes the two-dimensional spatial coordinates, and $\Omega$ is an open domain with a piecewise smooth boundary $\partial\Omega$. The solution to Equation (24) is a signed distance function measuring the distance from a point in $\Omega$ to the nearest point on the boundary $\partial\Omega$. For example,
$$s(\mathbf{x}) = \begin{cases} d(\mathbf{x}, \partial\Omega), & \mathbf{x} \in \Omega, \\ -d(\mathbf{x}, \partial\Omega), & \mathbf{x} \in \Omega^{C}, \end{cases}$$
where $d(\cdot, \cdot)$ is the distance function. The purpose of this example is to explore an effective method for mapping a closed boundary curve to its corresponding signed distance function while satisfying the eikonal equation [25]. In this paper, circular boundaries of different radii centered on the origin are considered to fully explore the characteristics of the mapping process. The training data are a random selection of 1000 circular boundaries with radii drawn from a uniform distribution, and the test data are circular boundaries with radii that are not involved in the training process. In this example, the learning rate is set to 0.001, and the Adam optimizer is used. The specific network sizes are listed in Table 6. The results for the residual loss (Loss-Res) and boundary loss (Loss-Bcs) during the training process are shown in Figure 16.
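For the circular boundaries used here, the reference signed distance function and the eikonal residual can be written compactly; the sketch below (with illustrative names) uses jax.grad for the gradient norm.

```python
import jax
import jax.numpy as jnp

def signed_distance_circle(xy, r):
    # Positive inside the disc of radius r centered at the origin,
    # negative outside, zero on the boundary.
    return r - jnp.linalg.norm(xy)

def eikonal_residual(s_fn, xy):
    # ||grad s(x)|| - 1, which vanishes for a true signed distance function.
    grad_s = jax.grad(s_fn)(xy)
    return jnp.linalg.norm(grad_s) - 1.0

# Example: residual of the exact signed distance for r = 0.5 at a test point.
res = eikonal_residual(lambda p: signed_distance_circle(p, 0.5),
                       jnp.array([0.3, 0.1]))
```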
The test results are displayed in Figure 17, indicating that PIDeepONet achieves good consistency between the exact value and predicted value of the signed distance function. The L2 norm error of the test result given by the proposed method is 0.001904, whereas that of the baseline model is 0.002975. The error is significantly reduced by using MulSPIDeepONet. Additional test results and error metrics are presented in Figure 18 and Table 7, illustrating that the model proposed in this study outperforms the baseline model.
This experimental result again proves that, through its network enhancement strategy, the proposed method has an improved ability to identify the independent variable characteristics of the solution function by integrating the characteristics of the trunk network and the branch network. This unique network architecture and feature processing strategy give MulSPIDeepONet better generalization ability and solving effect in dealing with PDEs, which verifies the effectiveness of the network structure design and feature fusion strategy proposed in this work.
In this section, three different PDE systems have been solved and analyzed. The proposed MulSPIDeepONet indirectly constructs the mapping relationship between the variable function and the solution function and enhances the feature recognition of the solution function with respect to the independent variables. Thus, it has a stronger ability to solve PDEs, especially systems with large gradient changes, moving contact discontinuities, and large changes in physical characteristics.
The reasons are as follows. First, the baseline model completely separates the trunk network and the branch network, such that the information between the independent variables of the variable function and the solution function cannot be fully utilized. This results in the loss of available information. In contrast, the proposed method accurately captures the trend of changes in physical characteristics. Second, MulSPIDeepONet enhances the identification of the independent variable information in the solution function, which helps prevent the loss of information as the network layers deepen during training. Third, latent variables are used to encode the information bottleneck between the input and output. Therefore, this network structure exhibits unique advantages in capturing complex physical phenomena, dealing with regions with drastic gradient changes, and accurately reflecting discontinuities in the system.

4. Conclusions and Prospects

This paper has proposed the MulSPIDeepONet architecture. This structure changes the direct mapping method between the variable function and the solution function in classical network models. Instead, our method uses an indirect mapping to find the solution function. This simplifies the solution process when faced with complex, highly nonlinear relationships, and accelerates the network learning process. By enhancing the close relationship between the independent variables and the solution function, the feature recognition ability of the model is dramatically improved at the physical level, which enhances the ability to solve PDE systems.
The results of three example PDEs show that MulSPIDeepONet has stronger generalization ability than the classical PIDeepONet model. In the solution of PDEs, the prediction accuracy of MulSPIDeepONet is nearly three times greater than that of PIDeepONet, especially in the presence of large gradients and strong discontinuities. Therefore, the network structure proposed in this paper offers unique advantages, not only capturing complex physical phenomena but also handling regions with dramatic gradient changes. Moreover, it can accurately reflect the discontinuities within a system.
Future work will focus on the following aspects. First, we will introduce additional indirect explicit constraint information into the transition subnetwork architecture, such as the explicit constraint information in the Green’s function. Second, we will apply this model to more complex PDE scenarios, such as turbulent flows and ultra-high-speed flow field predictions, to verify the advantages of the proposed approach.

Author Contributions

Conceptualization, J.W.; investigation, J.W.; methodology, J.W., J.H., Q.W. and F.L.; formal analysis, Z.C. and J.H.; supervision, Y.L. and F.L.; visualization, A.W.; writing—original draft, J.W.; writing—review and editing, J.W., Y.L., J.H. and F.L. All authors have read and agreed to the published version of the manuscript.

Funding

This study did not receive any external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

All authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019, 378, 686–707. [Google Scholar] [CrossRef]
  2. Karniadakis, G.E.; Kevrekidis, I.G.; Lu, L.; Perdikaris, P.; Wang, S.; Yang, L. Physics-informed machine learning. Nat. Rev. Phys. 2021, 3, 422–440. [Google Scholar] [CrossRef]
  3. Samaniego, E.; Anitescu, C.; Goswami, S.; Nguyen-Thanh, V.M.; Guo, H.; Hamdia, K.; Zhuang, X.; Rabczuk, T. An energy approach to the solution of partial differential equations in computational mechanics via machine learning: Concepts, implementation and applications. Comput. Methods Appl. Mech. Eng. 2020, 362, 112790. [Google Scholar] [CrossRef]
  4. Guo, Y.; Cao, X.; Song, J.; Leng, H.; Peng, K. An efficient framework for solving forward and inverse problems of nonlinear partial differential equations via enhanced physics-informed neural network based on adaptive learning. Phys. Fluids 2023, 35, 106603. [Google Scholar] [CrossRef]
  5. Steinfurth, B.; Weiss, J. Assimilating experimental data of a mean three-dimensional separated flow using physics-informed neural networks. Phys. Fluids 2024, 36, 015131. [Google Scholar] [CrossRef]
  6. Cai, S.; Wang, Z.; Wang, S.; Perdikaris, P.; Karniadakis, G.E. Physics-Informed Neural Networks for Heat Transfer Problems. J. Heat Transf. 2021, 143, 060801. [Google Scholar] [CrossRef]
  7. Zhang, E.; Dao, M.; Karniadakis, G.E.; Suresh, S. Analyses of internal structures and defects in materials using physics-informed neural networks. Sci. Adv. 2022, 8, eabk0644. [Google Scholar] [CrossRef] [PubMed]
  8. Raissi, M.; Yazdani, A.; Karniadakis, G.E. Hidden fluid mechanics: Learning velocity and pressure fields from flow visualizations. Science 2020, 367, 1026–1030. [Google Scholar] [CrossRef] [PubMed]
  9. Jin, X.; Cai, S.; Li, H.; Karniadakis, G.E. NSFnets (Navier–Stokes flow nets): Physics-informed neural networks for the incompressible Navier–Stokes equations. J. Comput. Phys. 2021, 426, 109951. [Google Scholar] [CrossRef]
  10. Zhu, Q.; Liu, Z.; Yan, J. Machine learning for metal additive manufacturing: Predicting temperature and melt pool fluid dynamics using physics-informed neural networks. Comput. Mech. 2021, 67, 619–635. [Google Scholar] [CrossRef]
  11. Song, S.; Jin, H. Identifying constitutive parameters for complex hyperelastic materials using Physics-Informed Neural Networks. arXiv 2024, arXiv:2308.15640. [Google Scholar] [CrossRef]
  12. Cai, S.; Mao, Z.; Wang, Z.; Yin, M.; Karniadakis, G.E. Physics-informed neural networks (PINNs) for fluid mechanics: A review. arXiv 2021, arXiv:2105.09506. [Google Scholar] [CrossRef]
  13. Jagtap, A.D.; Kawaguchi, K.; Karniadakis, G.E. Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. J. Comput. Phys. 2020, 404, 109136. [Google Scholar] [CrossRef]
  14. Xiang, Z.; Peng, W.; Zheng, X.; Zhao, X.; Yao, W. Self-adaptive loss balanced physics-informed neural networks for the incompressible Navier–Stokes equations. arXiv 2021, arXiv:2104.06217. [Google Scholar] [CrossRef]
  15. Li, S.; Feng, X. Dynamic Weight Strategy of Physics-Informed Neural Networks for the 2D Navier–Stokes Equations. Entropy 2022, 24, 1254. [Google Scholar] [CrossRef]
  16. Yu, J.; Lu, L.; Meng, X.; Karniadakis, G.E. Gradient-enhanced physics-informed neural networks for forward and inverse PDE problems. Comput. Methods Appl. Mech. Eng. 2022, 393, 114823. [Google Scholar] [CrossRef]
  17. Jagtap, A.D.; Kharazmi, E.; Karniadakis, G.E. Conservative physics-informed neural networks on discrete domains for conservation laws: Applications to forward and inverse problems. Comput. Methods Appl. Mech. Eng. 2020, 365, 113028. [Google Scholar] [CrossRef]
  18. Jagtap, A.D.; Karniadakis, G.E. Extended Physics-Informed Neural Networks (XPINNs): A Generalized Space-Time Domain Decomposition Based Deep Learning Framework for Nonlinear Partial Differential Equations. Commun. Comput. Phys. 2020, 28, 2002–2041. [Google Scholar] [CrossRef]
  19. Goswami, S.; Anitescu, C.; Chakraborty, S.; Rabczuk, T. Transfer learning enhanced physics informed neural network for phase-field modeling of fracture. Theor. Appl. Fract. Mech. 2020, 106, 102447. [Google Scholar] [CrossRef]
  20. Goswami, S.; Kontolati, K.; Shields, M.D.; Karniadakis, G.E. Deep transfer operator learning for partial differential equations under conditional shift. Nat. Mach. Intell. 2022, 4, 1155–1164. [Google Scholar] [CrossRef]
  21. Lu, L.; Jin, P.; Pang, G.; Zhang, Z.; Karniadakis, G.E. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nat. Mach. Intell. 2021, 3, 218–229. [Google Scholar] [CrossRef]
  22. Chen, T.; Chen, H. Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems. IEEE Trans. Neural Netw. 1995, 6, 911–917. [Google Scholar] [CrossRef] [PubMed]
  23. Lanthaler, S.; Mishra, S.; Karniadakis, G.E. Error estimates for deeponets: A deep learning framework in infinite dimensions. Trans. Math. Appl. 2022, 6, tnac001. [Google Scholar] [CrossRef]
  24. Jin, H.; Zhang, E.; Zhang, B.; Krishnaswamy, S.; Karniadakis, G.E.; Espinosa, H.D. Mechanical characterization and inverse design of stochastic architected metamaterials using neural operators. arXiv 2023, arXiv:2311.13812. [Google Scholar] [CrossRef]
  25. Wang, S.; Wang, H.; Perdikaris, P. Learning the solution operator of parametric partial differential equations with physics-informed DeepONets. Sci. Adv. 2021, 7, eabi8605. [Google Scholar] [CrossRef] [PubMed]
  26. Fuks, O.; Tchelepi, H.A. Limitations of physics informed machine learning for nonlinear two-phase transport in porous media. J. Mach. Learn. Model. Comput. 2020, 1, 19–37. [Google Scholar] [CrossRef]
  27. Grigo, C.; Koutsourelakis, P.-S. Bayesian Model and Dimension Reduction for Uncertainty Propagation: Applications in Random Media. SIAM/ASA J. Uncertain. Quantif. 2019, 7, 292–323. [Google Scholar] [CrossRef]
  28. Chen, T.; Chen, H. Approximations of continuous functionals by neural networks with application to dynamic systems. IEEE Trans. Neural Netw. 1993, 4, 910–918. [Google Scholar] [CrossRef] [PubMed]
  29. Tan, L.; Chen, L. Enhanced DeepONet for Modeling Partial Differential Operators Considering Multiple Input Functions. arXiv 2022, arXiv:2202.08942. [Google Scholar] [CrossRef]
  30. Cao, Z.-W.; Liu, Z.-F.; Liu, Z.-F.; Wang, X.-H. A self-adaptive numerical method to solve convection-dominated diffusion problems. Math. Probl. Eng. 2017, 2017, 8379609. [Google Scholar] [CrossRef]
  31. Quarteroni, A.; Rozza, G. (Eds.) Reduced Order Methods for Modeling and Computational Reduction; Springer: Berlin/Heidelberg, Germany, 2014; Volume 9. [Google Scholar]
  32. Courant, R.; Friedrichs, K.O. Supersonic Flow and Shock Waves; Springer Science & Business Media: Berlin/Heidelberg, Germany, 1999; Volume 21. [Google Scholar]
  33. Dafermos, C.M. Hyperbolic Conservation Laws in Continuum Physics; Springer: Berlin/Heidelberg, Germany, 2005; Volume 3. [Google Scholar]
  34. Mao, Z.; Jagtap, A.D.; Karniadakis, G.E. Physics-informed neural networks for high-speed flows. Comput. Methods Appl. Mech. Eng. 2020, 360, 112789. [Google Scholar] [CrossRef]
Figure 1. Architecture diagram of unstacked DeepONet.
Figure 2. PIDeepONet architecture. The asterisk (*) represents the set of trainable parameters in the network.
Figure 3. MulSPIDeepONet architecture. Different colors represent different parts of the network, and the asterisk (*) represents the set of trainable parameters in the network.
Figure 4. Solving an advection equation: boundary and residual losses of the physics-informed DeepONet and multi-step physics-informed DeepONet.
Figure 5. Comparison of PIDeepONet, MulSPIDeepONet, and analytical solution based on Figure 6. (a,d) are the real solutions $s(t, x)$ of the random sampling test variable function $u(x)$, (b,e) are the prediction results of PIDeepONet and MulSPIDeepONet, respectively. (c,f) are the absolute errors between the predicted results and the numerical solutions given by PIDeepONet and MulSPIDeepONet, respectively.
Figure 6. Input sample of $u(x)$ for Figure 5.
Figure 7. Comparison of PIDeepONet, MulSPIDeepONet, and analytical solution based on Figure 9. (a,d) are the real solutions $s(t, x)$ of the random sampling test variable function $u(x)$, (b,e) are the prediction results of PIDeepONet and MulSPIDeepONet, respectively. (c,f) are the absolute errors between the predicted results and the numerical solutions of PIDeepONet and MulSPIDeepONet, respectively.
Figure 8. Comparison of PIDeepONet, MulSPIDeepONet, and analytical solution based on Figure 10. (a,d) are the real solutions $s(t, x)$ of the random sampling test variable function $u(x)$, (b,e) are the prediction results of PIDeepONet and MulSPIDeepONet, respectively. (c,f) are the absolute errors between the predicted results and the numerical solutions of PIDeepONet and MulSPIDeepONet, respectively.
Figure 9. Input sample of $u(x)$ for Figure 7.
Figure 10. Input sample of $u(x)$ for Figure 8.
Figure 11. Training data points. The blue points are the initial condition training points, orange and green points are the boundary training points, and red points are the equation constraint training points.
Figure 12. Solving a one-dimensional parametric Euler equation: training loss of a physics-informed DeepONet and multi-step physics-informed DeepONet.
Figure 13. Prediction results of velocity (u) and density (rho) when $a = 4.4$ and $t = 2$: (a) baseline model, (b) proposed method.
Figure 14. Density field prediction results when $a = 4.4$. (a,c) Prediction results and absolute error of the baseline model (PIDeepONet), respectively; (b,d) prediction results and absolute error of the proposed method (MulSPIDeepONet), respectively.
Figure 15. Comparison of density field prediction when $a = 5.0$. (a) Prediction of baseline model (PIDeepONet); (b) prediction of the proposed method (MulSPIDeepONet).
Figure 16. Solving an eikonal equation: boundary and residual losses of the physics-informed DeepONet and multi-step physics-informed DeepONet.
Figure 17. PIDeepONet and MulSPIDeepONet compared with analytical solutions. (a,d) are the real solutions $s(\mathbf{x})$ of numerical simulations for an arbitrary test variable function (circular boundary centered on the origin, with different radii), (b,e) are the prediction results of the baseline model (PIDeepONet) and proposed method (MulSPIDeepONet), respectively, and (c,f) are the absolute errors between the prediction results and the numerical solution.
Figure 18. Prediction results of PIDeepONet and MulSPIDeepONet under two different radii compared with the analytical solutions. (a,d) are the real solutions $s(\mathbf{x})$ of numerical simulations for an arbitrary test variable function (circular boundary centered on the origin, with different radii), (b,e) are the prediction results given by the baseline model (PIDeepONet) and proposed method (MulSPIDeepONet), respectively, and (c,f) are the absolute errors between the prediction results and the numerical solution.
Table 1. Network settings.

Network Name     Subnetwork Categories   Network Setting
PIDeepONet       Trunk Net               [2,100,100,100,100,100,100]
                 Branch Net              [100,100,100,100,100,100,100]
MulSPIDeepONet   Trunk Net               [2,100,100,100,100]
                 Branch Net              [100,100,100,100,100]
                 Transition SubNet       [1,100,100,100,100,100]
Table 2. Test results for solving linear advection equations. MulSPIDeepONet is superior to the baseline model in all indicators.

Test    Model            MAE      rMAE    rRMSE
Test1   PIDeepONet       0.0218   0.118   0.091
        MulSPIDeepONet   0.0083   0.004   0.039
Table 3. Test results for solving linear advection equations. MulSPIDeepONet is superior to the baseline model in all indicators.

Test    Model            MAE      rMAE    rRMSE
Test2   PIDeepONet       0.0337   0.117   0.111
        MulSPIDeepONet   0.0108   0.038   0.033
Test3   PIDeepONet       0.0307   0.177   0.167
        MulSPIDeepONet   0.0070   0.040   0.039
Table 4. Network settings.

Network Name     Subnetwork Categories   Network Setting
PIDeepONet       Trunk Net               [2,200,200,200,200,200,100]
                 Branch Net              [1,200,200,200,200,200,300]
MulSPIDeepONet   Trunk Net               [2,200,200,200,200,100]
                 Branch Net              [1,200,200,200,200,200,300]
                 Transition SubNet       [3,100,100,300]
Table 5. Test results for solving the one-dimensional parametric Euler equation. MulSPIDeepONet is superior to the baseline model in all indicators.

a         Model            MAE      rMAE     rRMSE
a = 4.4   PIDeepONet       0.6567   0.2432   0.3177
          MulSPIDeepONet   0.1704   0.0631   0.1583
a = 5.0   PIDeepONet       0.7726   0.2575   0.3308
          MulSPIDeepONet   0.1918   0.0639   0.1650
Table 6. Network settings.

Network Name     Subnetwork Categories   Network Setting
PIDeepONet       Trunk Net               [2,50,50,50,50,50,50]
                 Branch Net              [200,50,50,50,50,50,50]
MulSPIDeepONet   Trunk Net               [2,50,50,50,50,50]
                 Branch Net              [200,50,50,50,50,50]
                 Transition SubNet       [1,50,50,50]
Table 7. Test results for solving eikonal equations. MulSPIDeepONet is superior to the baseline model in all indicators.

Test       Model            L2 Norm Error   MAE
Radius 1   PIDeepONet       0.00313         0.00125
           MulSPIDeepONet   0.00153         0.00064
Radius 2   PIDeepONet       0.00523         0.00122
           MulSPIDeepONet   0.00378         0.00083