Article

Meshfree Variational-Physics-Informed Neural Networks (MF-VPINN): An Adaptive Training Strategy

1 Dipartimento di Scienze Matematiche, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Torino, Italy
2 MEGAVOLT Team, Inria, 48 Rue Barrault, 75013 Paris, France
3 Laboratoire Jacques-Louis Lions, Sorbonne Center for Artificial Intelligence, Sorbonne Université, 4 Place Jussieu, 75005 Paris, France
* Author to whom correspondence should be addressed.
Algorithms 2024, 17(9), 415; https://doi.org/10.3390/a17090415
Submission received: 22 July 2024 / Revised: 8 September 2024 / Accepted: 11 September 2024 / Published: 19 September 2024
(This article belongs to the Special Issue Numerical Optimization and Algorithms: 2nd Edition)

Abstract: In this paper, we introduce a Meshfree Variational-Physics-Informed Neural Network. It is a Variational-Physics-Informed Neural Network that does not require the generation of a triangulation of the entire domain and that can be trained with an adaptive set of test functions. In order to generate the test space, we exploit an a posteriori error indicator and add test functions only where the error is higher. Four training strategies are proposed and compared. Numerical results show that the accuracy is higher than that of a Variational-Physics-Informed Neural Network trained with the same number of test functions but defined on a quasi-uniform mesh.
MSC:
65N12; 65N15; 65N50; 68T05; 92B20

1. Introduction

Physics-Informed Neural Networks (PINNs) are a rapidly emerging numerical technique used to solve partial differential equations (PDEs) by means of a deep neural network. The first idea can be traced back to the works of Lagaris et al. [1,2,3] but, thanks to hardware advancements and the availability of deep learning packages like TensorFlow [4], PyTorch [5] and JAX [6], PINNs have recently become popular starting from the works of Raissi et al. [7,8], later published in [9]. In the original formulation, the approximate solution is computed as the output of a neural network trained to minimize the PDE residual on a set of collocation points inside the domain and on its boundary.
The growing interest in PINNs is strictly related to their flexibility. In fact, with minor changes to the implementation, it is possible to solve a huge variety of problems. For example, exploiting the nonlinear nature of the involved neural network, nonlinear [10,11] and high-dimensional PDEs [12] can be solved without the need for globalization methods or additional nonlinear solvers. Moreover, by changing the neural network’s input dimensions or suitably adapting the loss function, it is possible to solve parametric [13,14] or inverse [15,16] problems. When external data are available, they can also be used to guide the optimization phase and improve the PINN accuracy [17].
In order to improve the original PINN proposed in [9] and to adapt it to solve specific problems, several generalizations have been proposed. For example, the deep Ritz method (DRM) [18,19,20] looks for a minimizer of the PDE energy functional and, in the deep Galerkin method (DGM) [21,22,23], an approximation of the $L^2$ norm of the PDE residual is minimized. It is also possible to exploit domain decomposition strategies [24,25] as in the conservative PINN (CPINN) [26], in the parallel PINN [27], in the extended PINN (XPINN) [28], or in the Finite Basis PINN (FBPINN) [29]. Moreover, it is even possible to change the neural network architecture or the training strategy as in [14,30,31,32,33,34,35]; among the methods based on different architectures, we highlight some works based on the novel Kolmogorov–Arnold Network (KAN) [36] architecture [37,38] and on a Large Language Model (LLM) [39]. More extensive overviews of the existing approaches can be found in [40,41,42,43]. In the context of the current work, an important extension is the Variational-Physics-Informed Neural Network (VPINN) [44,45], where the weak formulation of the problem is used to construct the loss function.
In this work, we focus on VPINNs. As discussed in [44,45,46,47], in order to train a VPINN, one needs to choose a suitable space of test functions, compute the variational residuals against all the test functions of a basis of such a space, and minimize a linear combination of these residuals. Since a spatial mesh is required to define the test functions, the VPINN cannot be considered a meshfree method, even though it is an extension of the PINN, which is meshfree. In this work, we present an adaptive Meshfree VPINN (MF-VPINN) that does not require a global triangulation of the domain but is trained with the same loss function and neural network architecture as a standard VPINN. Note that the MF-VPINN and the original VPINN can solve the same differential problems because the neural network is trained with the same loss functions. We also highlight that, thanks to the weak formulation of the PDE, both can solve problems whose solution has low regularity and that cannot be solved with standard PINNs (for example, in the presence of singular forcing terms) without introducing further approximations or regularizations. However, one of the VPINN's limitations is that a triangulation of the entire domain is required to define the test functions. Generating it may be very expensive or even impractical for very complex geometries (like, for example, the ones in [48]) and in moderate- or high-dimensional problems, for which automatic mesh-generation algorithms do not exist. For such domains, it is therefore highly advisable or computationally necessary to use a meshfree method such as the original PINN or the proposed MF-VPINN. Moreover, when dealing with complex geometries for which a mesh can hardly be generated, refining the mesh for adaptive methods can be very difficult. In this paper, we describe an algorithm that overcomes this problem and provides a reliable solution.
The paper is organized as follows. In Section 2, we introduce the problem we are interested in. In particular, we focus on the problem discretization in Section 2.1 and on the MF-VPINN loss function in Section 2.2. Then, an a posteriori error estimator is presented in Section 2.3 and used in Section 2.4 to iteratively generate the required test functions. Numerical results are presented in Section 3. In Section 3.1, we describe the model implementation and some strategies to improve the model efficiency; in Section 3.2, we compare different approaches to generate the test functions and analyze their performance; and, in Section 3.3, we analyze the role of the error estimator introduced in Section 2.3. Similar tests are performed on a different problem in Section 3.4 to describe possible extensions to more complex domains. Finally, we conclude the paper in Section 4 and discuss future perspectives and ideas.

2. Problem Formulation

Let us consider the following second-order elliptic problem, defined on a polygonal or polyhedral domain $\Omega \subset \mathbb{R}^n$ with a Lipschitz boundary $\Gamma = \partial\Omega$:
$$\mathcal{L}u := -\nabla \cdot (\mu \nabla u) + \beta \cdot \nabla u + \sigma u = f \quad \text{in } \Omega, \qquad u = g \quad \text{on } \Gamma, \tag{1}$$
where $\mu, \sigma \in L^\infty(\Omega)$ and $\beta \in (W^{1,\infty}(\Omega))^n$ satisfy $\mu \ge \mu_0$ and $\sigma - \frac{1}{2} \nabla \cdot \beta \ge 0$ in $\Omega$ for some constant $\mu_0 > 0$, whereas $f \in L^2(\Omega)$ and $g = \bar{u}|_{\Gamma}$ for some $\bar{u} \in H^1(\Omega)$.
In order to derive the corresponding variational formulation, we define the bilinear form a and the linear form F as
$$a: V \times V \to \mathbb{R}, \qquad a(w, v) = \int_\Omega \mu \nabla w \cdot \nabla v + \beta \cdot \nabla w \, v + \sigma w v, \tag{2}$$
$$F: V \to \mathbb{R}, \qquad F(v) = \int_\Omega f v; \tag{3}$$
where $V$ is the function space $V = H^1_0(\Omega)$. We denote by $\alpha \ge \mu_0$ the coercivity constant of $a$ and by $\|a\|$ and $\|F\|$ the continuity constants of $a$ and $F$. Then, the variational formulation of Problem (1) reads as follows: Find $u \in \bar{u} + V$ such that
$$a(u, v) = F(v) \qquad \forall v \in V. \tag{4}$$

2.1. Problem Discretization

In order to numerically solve Problem (4), one needs to choose suitable finite-dimensional approximations of the trial space $\bar{u} + V$ and of the test space $V$. A Galerkin formulation is obtained when the trial space is approximated by $\bar{u} + V_h^{\mathrm{trial}}$ and the test space by a finite-dimensional space $V_h^{\mathrm{test}}$ with $V_h^{\mathrm{trial}} = V_h^{\mathrm{test}}$; a Petrov–Galerkin formulation is obtained otherwise. In this work, we consider a Petrov–Galerkin formulation in which the trial space is approximated by a set of functions $V^{NN}$ of the form $V^{NN} = \bar{u} + V_h^{\mathrm{trial}}$, with $V_h^{\mathrm{trial}}$ represented by a neural network suitably modified to enforce the Dirichlet boundary conditions, and the test space is a space $V_h$ of piecewise linear functions.
The neural network considered in the following is a standard fully connected feed-forward neural network. Given the number $L$ of layers and a set of matrices $A^\ell \in \mathbb{R}^{N_\ell \times N_{\ell-1}}$ and vectors $b^\ell \in \mathbb{R}^{N_\ell}$, $\ell = 1, \dots, L$, containing the neural network's trainable weights, the function $w: \mathbb{R}^n \to \mathbb{R}$ associated with the considered neural network architecture is:
$$x^0 = x, \qquad x^\ell = \rho\!\left(A^\ell x^{\ell-1} + b^\ell\right), \quad \ell = 1, \dots, L-1, \qquad w(x) = A^L x^{L-1} + b^L, \tag{5}$$
where $\rho: \mathbb{R} \to \mathbb{R}$ is a nonlinear function applied element-wise to the vector $A^\ell x^{\ell-1} + b^\ell$. In this section, we use $\rho(x) = \tanh(x)$; other common choices include, but are not limited to, $\rho(x) = \mathrm{ReLU}(x) = \max\{0, x\}$, $\rho(x) = \mathrm{RePU}(x) = \max\{0, x^p\}$ for $1 < p \in \mathbb{N}$, $\rho(x) = 1/(1 + e^{-x})$ and $\rho(x) = \log(1 + e^{x})$. Note that, in order to represent a function $w: \mathbb{R}^n \to \mathbb{R}$, the widths of the first and last layers are chosen as $N_0 = n$ and $N_L = 1$. We denote by $W^{NN}$ the set of functions that can be represented as in (5) for any combination of the neural network weights and by $w_{NN}$ the vector containing all the trainable weights of the neural network.
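As a concrete illustration, the following Python/TensorFlow sketch builds the architecture of (5) with the hyperparameters later used in Section 3.1 ($L = 5$, 50 neurons per hidden layer, hyperbolic tangent activations, Glorot normal initialization); the function and argument names are illustrative.

```python
import tensorflow as tf

def build_network(n_input=2, n_hidden=50, n_hidden_layers=4):
    """Fully connected feed-forward network w: R^n -> R as in (5)."""
    inputs = tf.keras.Input(shape=(n_input,))
    x = inputs
    for _ in range(n_hidden_layers):                  # layers 1, ..., L-1
        x = tf.keras.layers.Dense(
            n_hidden, activation="tanh",
            kernel_initializer="glorot_normal")(x)
    w = tf.keras.layers.Dense(1)(x)                   # last layer: affine, no activation
    return tf.keras.Model(inputs, w)
```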
The function $w$ defined in (5) is independent of the differential problem that has to be solved and is, in most papers on PINNs or related models, trained to minimize both the residual of the equation and a term penalizing the discrepancy between $w|_\Gamma$ and $g$. Instead, we add a non-trainable layer $B$ to the neural network architecture in order to automatically enforce the required boundary conditions without the need to learn them during the training. As described in [49], the operator $B$ acts on the neural network output as
$$Bw = \phi \, w + \bar{g}, \tag{6}$$
where $\phi: \Omega \to \mathbb{R}$ is a function vanishing on $\Gamma$ and strictly positive inside $\Omega$, and $\bar{g}: \Omega \to \mathbb{R}$ is a suitable extension of $g: \Gamma \to \mathbb{R}$. The advantages of such an approach are also described in [50]. Then, the discrete trial space approximating $\bar{u} + V$ can be defined as
$$V^{NN} = \{ v^{NN} \in \bar{u} + V : v^{NN} = Bw \text{ for some } w \in W^{NN} \}.$$
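The sketch below shows how the non-trainable layer $B$ of (6) can be applied to the raw network output; the specific choices of $\phi$ (a polynomial bubble on the unit square) and $\bar{g}$ (a zero lifting, i.e., homogeneous data) are illustrative, whereas in the first test of Section 3.1 $\bar{g}$ is itself a small network trained to interpolate the boundary data.

```python
import tensorflow as tf

def phi(x):
    # polynomial bubble vanishing on the boundary of (0,1)^2, positive inside
    return x[:, 0:1] * (1.0 - x[:, 0:1]) * x[:, 1:2] * (1.0 - x[:, 1:2])

def g_bar(x):
    # lifting of the Dirichlet datum; here g = 0 for simplicity
    return tf.zeros_like(x[:, 0:1])

def apply_bc(network, x):
    # u^NN = B w = phi * w + g_bar, so u^NN = g on Gamma by construction
    return phi(x) * network(x) + g_bar(x)
```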
On the other hand, the discrete test space $V_h$ is not associated with the neural network and only contains known test functions. In standard VPINNs, one generates a triangulation $\mathcal{T}$ of the domain $\Omega$ and then defines $V_h$ as the space of functions that coincide with a polynomial of order $p \in \mathbb{N}$ inside each element of $\mathcal{T}$. Instead, we want to construct a discrete space $V_h$ of functions independent of a global triangulation $\mathcal{T}$. Moreover, since it has been proven in [47] that the VPINN convergence rate with respect to mesh refinement decreases when the order of the test functions is increased, we are interested in a space $V_h$ that only contains piecewise linear functions. For the sake of simplicity, we only consider the case $n = 2$; the discussion can be directly generalized to the general case $n \in \mathbb{N}$.
Let $\hat{P} \subset \mathbb{R}^n$ be a reference patch. In the following discussion, $\hat{P}$ can be any arbitrary star-shaped polygon with $N_{\hat{P}}$ vertices and a kernel of dimension strictly greater than zero. Nevertheless, in the numerical experiments, we only consider the reference patch $\hat{P} = [0, 1]^2$ to avoid any unnecessary computational overhead. Let $\mathcal{M} = \{M_i\}_{i=1}^{n_{\mathrm{patches}}}$ be a set of affine mappings such that $M_i: \hat{P} \to P_i \subset \Omega$, where we denote by $P_i$ the patch obtained by transforming the reference patch $\hat{P}$ through the map $M_i$. We assume that $\mathcal{P} = \{P_i\}_{i=1}^{n_{\mathrm{patches}}}$ is a cover of $\Omega$, i.e., $\bigcup_{i=1}^{n_{\mathrm{patches}}} P_i = \Omega$, and we admit overlapping patches.
Let us consider the triangulation $\hat{\mathcal{T}} = \{\hat{T}_j : 1 \le j \le N_{\hat{P}}\}$ of $\hat{P}$ obtained by connecting each vertex with a single point $c_{\hat{P}}$ in its kernel. It is then possible to define a piecewise linear function $\hat{\varphi}$ vanishing on the border of $\hat{P}$ such that $\hat{\varphi}(c_{\hat{P}}) = 1$ and $\hat{\varphi}|_{\hat{T}_j} \in \mathbb{P}_1(\hat{T}_j)$ for any $j = 1, \dots, N_{\hat{P}}$. Then, we define the discrete test space $V_h$ as $V_h = \operatorname{span}\{\varphi_i : i = 1, \dots, n_{\mathrm{patches}}\}$, where $\varphi_i \in V$ is the piecewise linear function:
$$\varphi_i(x) = \begin{cases} \hat{\varphi}\!\left(M_i^{-1}(x)\right), & x \in P_i, \\ 0, & x \notin P_i. \end{cases} \tag{7}$$
We remark that the only required triangulation is $\hat{\mathcal{T}}$, which contains only $N_{\hat{P}}$ triangles (in the numerical tests in this paper, $N_{\hat{P}} = 4$). Instead, there exists no mesh on $\Omega$, and the test functions $\varphi_i$ and their supports $P_i$ are all independent. Therefore, the proposed method is said to be meshfree. A simple example of a set of patches $\mathcal{P}$ with $n_{\mathrm{patches}} = 7$ on the domain $\Omega = [0, 1]^2$ is shown in Figure 1. For the sake of simplicity, in this work, we consider a square reference patch $\hat{P}$ with $c_{\hat{P}}$ coinciding with its center, and we let each mapping $M_i$ represent a combination of scalings and translations.
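Under these simplifying assumptions (square patches obtained from $\hat{P} = [0, 1]^2$ by scaling and translation), the test function $\varphi_i$ of (7) is the "pyramid" function sketched below; the implementation is illustrative and only meant to make the construction concrete.

```python
import numpy as np

def phi_hat(xi):
    # piecewise linear pyramid on [0,1]^2: equal to 1 at the center,
    # linear on each of the 4 triangles of T_hat, and 0 on the boundary
    return np.maximum(0.0, 1.0 - 2.0 * np.maximum(np.abs(xi[..., 0] - 0.5),
                                                  np.abs(xi[..., 1] - 0.5)))

def phi_i(x, center, edge):
    # phi_i(x) = phi_hat(M_i^{-1}(x)) inside P_i and 0 outside, see (7);
    # M_i^{-1}(x) = (x - c_i)/h_i + (1/2, 1/2) for a square patch of edge h_i
    xi = (np.asarray(x) - np.asarray(center)) / edge + 0.5
    inside = np.all((xi >= 0.0) & (xi <= 1.0), axis=-1)
    return np.where(inside, phi_hat(xi), 0.0)
```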
Using the introduced finite-dimensional set of functions $V^{NN}$ and space $V_h$, it is possible to discretize Problem (4) as follows: Find $u^{NN} \in V^{NN}$ such that
$$a(u^{NN}, v) = F(v) \qquad \forall v \in V_h. \tag{8}$$

2.2. Loss Function

In this section, we derive the loss function used to train the neural network. It has to be computable, and its minimizer has to be an approximate solution of Problem (1). We highlight that, when a standard PINN is used, the loss function can be seen as a discrete cost penalizing the residual of (1) directly. In this context, instead, the loss function penalizes the variational residuals of (4), as in standard VPINNs. This is the key feature that differentiates VPINNs (and the extension proposed in this manuscript) from the other generalizations of the original PINN introduced in Section 1.
Let us consider a quadrature rule of order $q \ge 2$ on each triangle $\hat{T}_j \in \hat{\mathcal{T}}$, $j = 1, \dots, N_{\hat{P}}$, uniquely identified by a set of nodes and weights $\{(\tilde{\xi}_j^{\iota}, \tilde{\omega}_j^{\iota}) : \iota \in I_{\hat{T}_j}\}$. The nodes and weights of a composite quadrature formula of order $q$ on $\hat{P}$ can be obtained as
$$\{(\hat{\xi}^{\iota}, \hat{\omega}^{\iota}) : \iota \in I_{\hat{P}}\} = \bigcup_{j=1}^{N_{\hat{P}}} \{(\tilde{\xi}_j^{\iota}, \tilde{\omega}_j^{\iota}) : \iota \in I_{\hat{T}_j}\}.$$
Then, the corresponding quadrature rule of order $q$ on an arbitrary patch $P_i$ is defined as
$$\left\{ (\xi_i^{\iota}, \omega_i^{\iota}) : \iota \in I_{\hat{P}} \ \middle|\ \xi_i^{\iota} = M_i(\hat{\xi}^{\iota}), \ \omega_i^{\iota} = \hat{\omega}^{\iota} \, \frac{\operatorname{area}(P_i)}{\operatorname{area}(\hat{P})} \right\}. \tag{9}$$
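For the square patches used in the experiments, the mapping of the reference nodes and weights in (9) reduces to a scaling and translation, as in the following illustrative sketch.

```python
import numpy as np

def patch_quadrature(xi_hat, w_hat, center, edge):
    """Map a reference rule {(xi_hat, w_hat)} on [0,1]^2 to a square patch P_i,
    rescaling the weights by area(P_i)/area(P_hat) as in (9)."""
    xi_hat = np.asarray(xi_hat)                      # shape (n_q, 2)
    center = np.asarray(center)
    xi_i = center + edge * (xi_hat - 0.5)            # M_i(xi_hat)
    w_i = np.asarray(w_hat) * edge**2                # area(P_i) / area(P_hat) = edge^2
    return xi_i, w_i
```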
Using the quadrature rule in (9), it is possible to define an approximate restriction to each patch of the forms $a$ and $F$ as follows:
$$a_h^i(w, v) = \sum_{\iota \in I_{\hat{P}}} \left[\mu \nabla w \cdot \nabla v + \beta \cdot \nabla w \, v + \sigma w v\right]\!(\xi_i^{\iota}) \, \omega_i^{\iota} \approx a_{P_i}(w, v), \tag{10}$$
$$F_h^i(v) = \sum_{\iota \in I_{\hat{P}}} \left[f v\right]\!(\xi_i^{\iota}) \, \omega_i^{\iota} \approx F_{P_i}(v), \tag{11}$$
where $a_{P_i}(w, v)$ and $F_{P_i}(v)$ are defined as in (2) and (3) but restricting the supports of the integrals to $P_i$. We remark that, since it is not possible to compute integrals involving a neural network exactly, we can only use the forms $a_h^i$ and $F_h^i$ in the loss function. Exploiting the linearity of $a(w, v)$ and $F(v)$ with respect to $v$ to consider only the basis $\{\varphi_i\}_{i=1}^{n_{\mathrm{patches}}}$ of $V_h$ as the set of test functions, we approximate Problem (8) as follows: Find $u^{NN} \in V^{NN}$ such that
$$a_h^i(u^{NN}, \varphi_i) = F_h^i(\varphi_i) \qquad i = 1, \dots, n_{\mathrm{patches}}. \tag{12}$$
Then, in order to cast Problem (12) into an optimization problem, we define the residuals
$$r_{h,i}(w) = F_h^i(\varphi_i) - a_h^i(w, \varphi_i), \qquad i = 1, \dots, n_{\mathrm{patches}}, \tag{13}$$
and the loss function
$$R_h^2(w; \mathcal{P}) = \frac{1}{n_{\mathrm{patches}}} \sum_{i=1}^{n_{\mathrm{patches}}} \gamma_i \, r_{h,i}^2(w), \tag{14}$$
where the $\gamma_i$ are suitable positive scaling coefficients. In this work, we use $\gamma_i = \operatorname{area}(P_i)^{-1}$ to give the same importance to each patch. Note that this is equivalent to normalizing the quadrature rules involved in (10) and (11); this way, each residual $r_{h,i}$ can be regarded as a linear combination of the MF-VPINN values and derivatives that is independent of the size of the support of the patch $P_i$. We also highlight that the loss function depends on the choice of $\mathcal{M}$, since all the used test functions are generated starting from the corresponding mappings $M_i \in \mathcal{M}$. We are now interested in a practical procedure to obtain a set $\tilde{\mathcal{P}}$ such that the approximate solution computed by minimizing $R_h^2(\,\cdot\,; \tilde{\mathcal{P}})$ is as accurate as possible, with $\tilde{\mathcal{P}}$ being as small as possible.
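A minimal sketch of how (13) and (14) can be assembled is reported below for the Poisson special case $-\Delta u = f$ used in Section 3.2; tensor names and shapes are illustrative, and the actual implementation combines precomputed sparse and dense tensors as described in Section 3.1.

```python
import tensorflow as tf

def mf_vpinn_loss(grad_u, quad_w, phi, grad_phi, f, gamma):
    """Loss (14) for -Laplace(u) = f.
    grad_u, grad_phi: (n_patches, n_q, 2); quad_w, phi, f: (n_patches, n_q);
    gamma: (n_patches,) scaling coefficients gamma_i."""
    # a_h^i(u, phi_i) = sum_q w_q grad u . grad phi_i
    a_terms = tf.reduce_sum(quad_w * tf.reduce_sum(grad_u * grad_phi, axis=-1), axis=1)
    # F_h^i(phi_i) = sum_q w_q f phi_i
    F_terms = tf.reduce_sum(quad_w * f * phi, axis=1)
    residuals = F_terms - a_terms                    # r_{h,i}(u^NN), see (13)
    return tf.reduce_mean(gamma * residuals**2)      # R_h^2(u^NN; P), see (14)
```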

2.3. The a Posteriori Error Estimator

The goal of this section is to derive an error estimator associated with an arbitrary patch $P_i$, with $i \in \{1, \dots, n_{\mathrm{patches}}\}$. To do so, we rely on the a posteriori error estimator proposed in [46]. It has been proven to be efficient and reliable; therefore, such an estimator allows us to know where the error is larger without knowing the exact solution of the PDE. Let us consider the patch $P_i$, formed by the triangles $T_{i,1}, \dots, T_{i,N_{\hat{P}}}$, and a triangulation $\mathcal{T}_i$ of $\Omega$ such that $T_{i,j} \in \mathcal{T}_i$ for every $j = 1, \dots, N_{\hat{P}}$. We remark that the triangulation $\mathcal{T}_i$ does not have to be explicitly generated; it is only used to properly define all the quantities introduced in [46] that are required to derive the proposed error estimator.
Let $V_h^i = \operatorname{span}\{\psi_j^i : 1 \le j \le \dim V_h^i\}$ be the space of piecewise linear functions defined on $\mathcal{T}_i$, where $\{\psi_j^i : 1 \le j \le \dim V_h^i\}$ is a Lagrange basis of $V_h^i$. It is then possible to define two constants $c_h^i$ and $C_h^i$, with $0 < c_h^i < C_h^i$, such that
$$c_h^i \, |v|_{1,\Omega} \le \|\mathbf{v}\|_2 \le C_h^i \, |v|_{1,\Omega} \qquad \forall v \in V_h^i, \tag{15}$$
where $v = \sum_{j=1}^{\dim V_h^i} v_j \psi_j^i$ is an arbitrary element of $V_h^i$ associated with the expansion coefficients $\mathbf{v} = \left(v_1, \dots, v_{\dim V_h^i}\right)$ and $\|\mathbf{v}\|_2 = \left( \sum_{j=1}^{\dim V_h^i} v_j^2 \right)^{1/2}$.
Then, given an integer $k \ge 0$, for any element $E \in \mathcal{T}_i$, we define the projection operator $\Pi_{E,k}: L^2(E) \to \mathbb{P}_k(E)$ such that
$$\int_E \Pi_{E,k}\phi = \int_E \phi \qquad \forall \phi \in L^2(E). \tag{16}$$
We also denote by $\{(\xi_E^{\iota}, \omega_E^{\iota}) : \iota \in I_E\}$ a quadrature formula of order $q$ on $E$ and define the quadrature-based discrete seminorm:
$$\|v\|_{0,E,\omega} = \left( \sum_{\iota \in I_E} v^2(\xi_E^{\iota}) \, \omega_E^{\iota} \right)^{1/2}. \tag{17}$$
We require the weights and nodes of this quadrature rule to coincide with the ones introduced in (9) when $E$ is a triangle included in $P_i$ (i.e., when $E \in \{T_{i,1}, \dots, T_{i,N_{\hat{P}}}\}$). We can now introduce all the terms involved in the a posteriori error estimator.
Let $\eta_{\mathrm{rhs},1}(E)$ and $\eta_{\mathrm{rhs},2}(E)$ be the quantities:
$$\eta_{\mathrm{rhs},1}(E) = h_E \left\| f - \Pi_{E,q-1} f \right\|_{0,E}, \qquad \eta_{\mathrm{rhs},2}(E) = h_E \left\| f - \Pi_{E,q-1} f \right\|_{0,E,\omega} + \left\| f - \Pi_{E,q} f \right\|_{0,E,\omega}. \tag{18}$$
They measure the oscillations of the forcing term with respect to its polynomial projections in various norms. Similar oscillations of the diffusion, convection and reaction terms are measured by the terms $\eta_{\mathrm{coef},i}(E)$, $i = 1, \dots, 6$:
$$\begin{aligned}
\eta_{\mathrm{coef},1}(E) &= \left\| \mu \nabla u^{NN} - \Pi_{E,q}(\mu \nabla u^{NN}) \right\|_{0,E}, \\
\eta_{\mathrm{coef},2}(E) &= h_E \left\| \beta \cdot \nabla u^{NN} - \Pi_{E,q-1}(\beta \cdot \nabla u^{NN}) \right\|_{0,E}, \\
\eta_{\mathrm{coef},3}(E) &= h_E \left\| \sigma u^{NN} - \Pi_{E,q-1}(\sigma u^{NN}) \right\|_{0,E}, \\
\eta_{\mathrm{coef},4}(E) &= \left\| \mu \nabla u^{NN} - \Pi_{E,q}(\mu \nabla u^{NN}) \right\|_{0,E,\omega}, \\
\eta_{\mathrm{coef},5}(E) &= h_E \left\| \beta \cdot \nabla u^{NN} - \Pi_{E,q-1}(\beta \cdot \nabla u^{NN}) \right\|_{0,E,\omega} + \left\| \beta \cdot \nabla u^{NN} - \Pi_{E,q}(\beta \cdot \nabla u^{NN}) \right\|_{0,E,\omega}, \\
\eta_{\mathrm{coef},6}(E) &= h_E \left\| \sigma u^{NN} - \Pi_{E,q-1}(\sigma u^{NN}) \right\|_{0,E,\omega} + \left\| \sigma u^{NN} - \Pi_{E,q}(\sigma u^{NN}) \right\|_{0,E,\omega},
\end{aligned} \tag{19}$$
where $u^{NN}$ is the output of the neural network after the enforcement of the Dirichlet boundary conditions through the operator $B$ and $h_E$ is the diameter of $E$. Then, let us define the term $\eta_{\mathrm{res}}(E)$, which measures how well the equation is satisfied, as
$$\eta_{\mathrm{res}}(E) = h_E \left\| \mathrm{bulk}_E(u^{NN}) \right\|_{0,E} + h_E^{1/2} \sum_{e \subset \partial E} \left\| \mathrm{jump}_e(u^{NN}) \right\|_{0,e}, \tag{20}$$
where
$$\mathrm{bulk}_E(u^{NN}) = \Pi_{E,q-1} f + \nabla \cdot \Pi_{E,q}(\mu \nabla u^{NN}) - \Pi_{E,q-1}(\beta \cdot \nabla u^{NN} + \sigma u^{NN}),$$
$$\mathrm{jump}_e(u^{NN}) = \Pi_{E_1,q}(\mu \nabla u^{NN}) \cdot n - \Pi_{E_2,q}(\mu \nabla u^{NN}) \cdot n.$$
Note that $\mathrm{jump}_e(u^{NN})$ measures the interelemental jump of $\Pi_{E,q}(\mu \nabla u^{NN})$ across the edge $e$, with unit normal vector $n$, shared by the elements $E_1$ and $E_2$.
Finally, we introduce the approximate elemental forms:
$$a_h^{i,E}(w, v) = \sum_{\iota \in I_E} \left[\mu \nabla w \cdot \nabla v + \beta \cdot \nabla w \, v + \sigma w v\right]\!(\xi_E^{\iota}) \, \omega_E^{\iota},$$
$$F_h^{i,E}(v) = \sum_{\iota \in I_E} \left[f v\right]\!(\xi_E^{\iota}) \, \omega_E^{\iota},$$
where $\xi_E^{\iota}$ and $\omega_E^{\iota}$, $\iota \in I_E$, are the nodes and weights used in Equation (17). With such forms, it is possible to define the residuals
$$r_{h,i,j}(w) = \sum_{E \in \mathcal{T}_i} \left[ F_h^{i,E}(\psi_j^i) - a_h^{i,E}(w, \psi_j^i) \right], \qquad j = 1, \dots, \dim V_h^i,$$
and the quantity $\eta_{\mathrm{loss}}(E)$ as
$$\eta_{\mathrm{loss}}(E) = C_h \left( \sum_{j \in I_h^E} r_{h,i,j}^2(u^{NN}) \right)^{1/2}. \tag{23}$$
Here, denoting the support of the function $\psi_j^i \in V_h^i$ by $\operatorname{supp} \psi_j^i$, the elemental index set
$$I_h^E = \{ j \in I_h : E \subset \operatorname{supp} \psi_j^i \}$$
is the set containing the indices of the functions whose support contains $E$. It is then possible to estimate the error between the unknown exact solution $u$ and its MF-VPINN approximation $u^{NN}$ by means of the computable quantities in Equations (18)–(20) and (23) as
$$|u - u^{NN}|_{1,E} \lesssim \left( \eta_{\mathrm{res}}^2(E) + \eta_{\mathrm{loss}}^2(E) + \sum_{i=1}^{6} \eta_{\mathrm{coef},i}^2(E) + \sum_{i=1}^{2} \eta_{\mathrm{rhs},i}^2(E) \right)^{1/2}. \tag{24}$$
Once more, we refer to [46] for the proof of such a statement.
We recall that our goal is to obtain a computable error estimator associated with a single patch $P_i$. When evaluated on an element $E \subset P_i$, the quantity on the right-hand side of Equation (24) implicitly depends on several elements of $V_h^i$ that do not belong to $P_i$, because of the presence of $\eta_{\mathrm{res}}^2(E)$ and $\eta_{\mathrm{loss}}^2(E)$. Therefore, such an estimator is not computable without generating the triangulation $\mathcal{T}_i$ and the corresponding space $V_h^i$. Instead, we look for an error estimator that does not control the error on the entire patch but only in a neighborhood $N_i$ of its center $c_{P_i} = M_i(c_{\hat{P}})$. This can be carried out by considering only the terms whose computation involves geometric elements containing $c_{P_i}$ and the only function $\psi_j^i$ that does not vanish at $c_{P_i}$. Note that such a function is the function $\varphi_i$ defined in (7). Therefore, the error estimator $\eta_i$ that controls the error in $N_i$ can be computed as
$$\eta_i = \left( \eta_{\mathrm{res},i}^2 + C_h^2 \, r_{h,i}^2(u^{NN}) + \sum_{j=1}^{N_{\hat{P}}} \left( \sum_{k=1}^{6} \eta_{\mathrm{coef},k}^2(T_{i,j}) + \sum_{k=1}^{2} \eta_{\mathrm{rhs},k}^2(T_{i,j}) \right) \right)^{1/2}, \tag{25}$$
where $\eta_{\mathrm{res},i}$ is defined as
$$\eta_{\mathrm{res},i} = \sum_{j=1}^{N_{\hat{P}}} \left( h_{T_{i,j}} \left\| \mathrm{bulk}_{T_{i,j}}(u^{NN}) \right\|_{0,T_{i,j}} + h_{P_i}^{1/2} \left\| \mathrm{jump}_{e_{i,j}}(u^{NN}) \right\|_{0,e_{i,j}} \right). \tag{26}$$
In (26), we denote by $h_{P_i}$ the diameter of the patch $P_i$ and by $e_{i,j}$, $j = 1, \dots, N_{\hat{P}}$, the edges connecting its vertices with $c_{P_i}$.
Since $\eta_i$ can be seen as an approximation of the right-hand side of (24), we use it as an indicator of the error $|u - u^{NN}|_{1,N_i}$. It is important to remark that $\eta_i$ can be computed without generating $\mathcal{T}_i$ and $V_h^i$. In fact, its computation involves only the function $\varphi_i$, the triangles partitioning $P_i$, and the edges connecting its vertices with its center.
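The following illustrative sketch shows how $\eta_i$ in (25) can be assembled once its individual contributions have been evaluated on the $N_{\hat{P}}$ triangles of the patch; the function and argument names are assumptions of this example, not part of the actual code.

```python
import numpy as np

def patch_indicator(eta_res_i, C_h, r_hi, eta_coef, eta_rhs):
    """eta_i of (25).
    eta_res_i: scalar (26); C_h: constant of (15); r_hi: residual r_{h,i}(u^NN);
    eta_coef: array (N_Phat, 6); eta_rhs: array (N_Phat, 2)."""
    return np.sqrt(eta_res_i**2 + (C_h * r_hi)**2
                   + np.sum(np.asarray(eta_coef)**2)
                   + np.sum(np.asarray(eta_rhs)**2))
```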

2.4. The Choice of M and P

In this section, we describe the procedure adopted to generate the set of test functions used to train the MF-VPINN. We propose an iterative approach in which the MF-VPINN is initially trained with very few test functions, and other test functions are then added in the regions of the domain in which the $H^1$ norm of the error is larger. We anticipate that, as shown in Section 3.3, generating test functions in regions where $r_{h,i}^2$ is large may not lead to accurate solutions, because $r_{h,i}^2$ is not proportional to the $H^1$ error. Therefore, such a choice may increase the density of test functions where they are not required while maintaining only a few test functions in regions in which the error is large. Instead, we use the error indicator $\eta_i$ defined in (25).
Let us initially consider a cover $\mathcal{P}_0 = \{P_i\}_{i=1}^{n_{\mathrm{patches}}}$ of $\Omega$ comprising a few patches (i.e., $n_{\mathrm{patches}}$ is a small integer) and the corresponding set of mappings $\mathcal{M}_0 = \{M_i\}_{i=1}^{n_{\mathrm{patches}}}$ and test functions $\{\varphi_i\}_{i=1}^{n_{\mathrm{patches}}}$. These sets induce a loss function $R_h^2(w; \mathcal{P}_0)$ as defined in (14), which is used to train an MF-VPINN. After this initial training, one computes $\eta_i^\gamma = \gamma_i \eta_i$ for each patch $P_i \in \mathcal{P}_0$ and stores the result in the array $\boldsymbol{\eta} = \left[\eta_1^\gamma, \dots, \eta_{n_{\mathrm{patches}}}^\gamma\right]$. Note that $\eta_i^\gamma$ is a suitable rescaling of $\eta_i$ that removes the dependence on the size of $P_i$. Let us choose a threshold $1 \le \tau_0 \le n_{\mathrm{patches}}$, sort $\boldsymbol{\eta}$ in descending order obtaining $\boldsymbol{\eta}^{\mathrm{sort}} = \left[\eta_{s_1}^\gamma, \dots, \eta_{s_{n_{\mathrm{patches}}}}^\gamma\right]$ (where we denote by $[s_1, \dots, s_{n_{\mathrm{patches}}}]$ the index set corresponding to a suitable permutation of $[1, \dots, n_{\mathrm{patches}}]$), and consider the vector $\bar{\boldsymbol{\eta}}_0 = \left[\eta_{s_1}^\gamma, \dots, \eta_{s_{\tau_0}}^\gamma\right]$. Note that $\bar{\boldsymbol{\eta}}_0$ contains only the $\tau_0$ worst values of the indicator; it thus allows us to understand where the error is higher and where additional test functions are required to increase the model accuracy.
It is then possible to move on to the second iteration of the iterative training. For each patch $P_i$ such that $\eta_i^\gamma \in \bar{\boldsymbol{\eta}}_0$, we generate $k_{\mathrm{new}}$ new patches $P_i^k$, $k = 1, \dots, k_{\mathrm{new}}$, with centers inside $P_i$ and areas such that $\operatorname{area}(P_i) < \sum_{k=1}^{k_{\mathrm{new}}} \operatorname{area}(P_i^k) < c \cdot \operatorname{area}(P_i)$, where $c > 1$ is a tunable parameter. In the numerical experiments, we use $c = 1.25$. There exist different strategies to choose the number, the size, and the positions of the centers of the new patches. Such strategies are described in Section 3, with particular attention to the effects of these choices on the MF-VPINN accuracy.
Let us denote by $\mathcal{P}_1$ the set $\mathcal{P}_1 = \mathcal{P}_0 \cup \{P_{s_1}^k\}_{k=1}^{k_{\mathrm{new}}} \cup \dots \cup \{P_{s_{\tau_0}}^k\}_{k=1}^{k_{\mathrm{new}}}$ and by $\mathcal{M}_1$ the corresponding set of mappings. Then, it is possible to define the loss function $R_h^2(w; \mathcal{P}_1)$, continue the training of the previously trained MF-VPINN, compute the error indicator $\eta_i^\gamma$ for each patch $P_i \in \mathcal{P}_1$, and obtain the vector $\bar{\boldsymbol{\eta}}_1$ used to decide where to insert the new patches to generate $\mathcal{P}_2$. In general, iterating this procedure, it is possible to compute a set of patches $\mathcal{P}_m$ and of mappings $\mathcal{M}_m$ from the previously obtained sets $\mathcal{P}_{m-1}$ and $\mathcal{M}_{m-1}$. Technical optimization details are discussed in Section 3.1.

3. Numerical Results

In this section, we provide several numerical results to show the performance of the training strategy described in Section 2.4. In Section 3.1, we describe the structure of the MF-VPINN implementation and highlight some details that have to be taken into account in order to increase the efficiency of the training phase. Different strategies to choose the positions of the new patches are discussed in Section 3.2. The importance of the error indicator is highlighted in Section 3.3 with additional numerical examples. An example on a more complex domain is shown in Section 3.4, together with some ideas to adapt the proposed strategies to such geometries.

3.1. Implementation Details

The computer code used to perform the experiments is implemented in Python, using the package TensorFlow [4] to generate the neural network architecture and train the MF-VPINN. Using the notation introduced in Section 2.1, the neural network consists of $L = 5$ layers with $N_\ell = 50$ neurons in each hidden layer (i.e., for $\ell = 1, \dots, L - 1$); the activation function is the hyperbolic tangent in each hidden layer. For the first iteration of the iterative training, the neural network weights in the $\ell$-th layer are initialized with a Glorot normal distribution, i.e., a truncated normal distribution with mean 0 and standard deviation equal to $\sqrt{2 / (N_{\ell-1} + N_\ell)}$. Then, for the subsequent iterations, they are initialized with the weights obtained at the end of the previous one.
During the first iteration of the training (i.e., during the minimization of $R_h^2(\,\cdot\,; \mathcal{P}_0)$), the optimization is carried out by exploiting the ADAM optimizer [51], with a learning rate decaying exponentially from $10^{-2}$ to $10^{-4}$, and with the second-order L-BFGS optimizer [52]. Then, from the second training iteration, we only use the L-BFGS optimizer. We remark that L-BFGS converges very quickly, but only if the starting point is close enough to the problem's solution. Therefore, in the first training iteration, we use ADAM to obtain a first approximation of the solution, which is then improved via L-BFGS. Then, since the $m$-th training iteration starts from the solution computed during the $(m-1)$-th one, we assume that the starting point is close enough to the solution of the new optimization problem (associated with a different loss function with more patches), and we only use L-BFGS to increase the training efficiency.
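A possible first-iteration optimizer setup is sketched below; the number of ADAM steps is an illustrative assumption, and a second-order L-BFGS stage can be implemented, for example, with tfp.optimizer.lbfgs_minimize from TensorFlow Probability.

```python
import tensorflow as tf

n_adam_steps = 5000          # illustrative value, not taken from the paper
# learning rate decaying exponentially from 1e-2 to 1e-4 over n_adam_steps steps
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-2,
    decay_steps=n_adam_steps,
    decay_rate=1e-4 / 1e-2)  # lr(n_adam_steps) = 1e-2 * 1e-2 = 1e-4
adam = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
```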
During the m-th iteration of the training, the training set consists of all the quadrature nodes ξ i , for any I P ^ and for any patch P i P m as defined in (9). The order of the chosen quadrature rule is q = 3 inside each triangle. The Dirichlet boundary conditions are imposed by means of the operator B defined in (6). In this operator, for our first numerical test, the function ϕ is a polynomial bubble vanishing on Γ and g ¯ is the output of a neural network trained to interpolate the boundary data. For the numerical test in Section 3.4, instead, ϕ is computed as in [50] and g = 0 . To decrease the training time, the functions ϕ , ϕ , g ¯ and g ¯ are evaluated only once at the beginning of the m-th training iteration and they are then combined to evaluate B u N N and its gradient (where u N N is the output of the last layer of the neural network). The derivatives of u N N and g ¯ are computed via automatic differentiation [53] due to the complexity of their analytical expressions.
The output of the model is the value of the function B u N N and its gradient evaluated at the input points. Such values are then suitably combined using sparse and dense tensors to compute the quantity R h 2 ( B u N N ; P m ) . The sparse tensors contain the evaluation of φ i and φ i at each input point, whereas the dense ones store the quadrature weights, the vector γ = { γ i } i = 1 n patches and the evaluation of μ , β , σ and f at the input points. We highlight that all these tensors have to be computed once at the beginning of the m-th training iteration (updating the ones of the ( m 1 ) -th iteration) to significantly decrease the training computational cost.
As discussed in Section 2.1, we assume that all the patches and test functions can be generated from a reference patch P ^ . For each patch P i P m , one has to generate all the data structures required to assemble the loss function and the error indicator η i . To do so, it is possible to explicitly construct all the tensors required to assemble the term a ^ h ( w , φ ^ ) and all the terms involved in the computation of the reference error indicator η ^ only once, at the beginning of the first iteration of the training. Then, all these tensors can be suitably rescaled to obtain the ones corresponding to the patches and test functions involved in the loss function and error indicators computations.
To stabilize the MF-VPINN, we introduce the $L^2$ regularization term
$$L_{\mathrm{reg}}(w_{NN}) = \lambda_{\mathrm{reg}} \|w_{NN}\|_2^2, \tag{27}$$
where $w_{NN}$ is the vector of trainable weights of the neural network introduced in Section 2.1. In our numerical experiments, we use $\lambda_{\mathrm{reg}} = 10^{-5}$. During the $m$-th iteration of the training, such a quantity is added to $R_h^2(B u^{NN}; \mathcal{P}_m)$ to obtain the training loss function
$$\mathcal{L}^m(w_{NN}) = R_h^2(B u^{NN}; \mathcal{P}_m) + L_{\mathrm{reg}}(w_{NN}),$$
which has to be minimized accurately enough. Indeed, if $\mathcal{L}^m$ is minimized poorly, the new patches in $\mathcal{P}_{m+1} \setminus \mathcal{P}_m$ may be added in regions where they are not necessary, because the accuracy of $B u^{NN}$ may still improve during the training, and may not be inserted in areas where they are required. Note that, in order to compute the numerical solution, the MF-VPINN has to be trained multiple times with different sets of patches $\mathcal{P}_m$ to minimize the losses $\{\mathcal{L}^m\}$. Since such an iterative training may be expensive, we propose an early stopping strategy [54] based on the discussed error indicator to reduce its computational cost. In its basic version, early stopping consists of evaluating a chosen metric on a validation set in order to know when the neural network accuracy on data that are not present in the training set starts worsening. Interrupting the training at that point prevents overfitting and improves generalization. In our context, instead, we can directly track the behavior of the MF-VPINN $H^1$ error on each patch through the corresponding error indicator to understand when it stops decreasing. Therefore, given the set of patches $\mathcal{P}_m$, the chosen metric is the linear combination $ES_m = \sum_{i=1}^{\dim(\mathcal{P}_m)} \eta_i^\gamma$. Numerical results showing the performance of this strategy are presented in Section 3.2 and Section 3.4.

3.2. Adaptive Training Strategies

Let us consider the Poisson problem:
$$-\Delta u = f \quad \text{in } \Omega, \qquad u = g \quad \text{on } \Gamma, \tag{28}$$
defined on the unit square $\Omega = (0, 1)^2$. The forcing term $f$ and the boundary condition $g$ are chosen such that the exact solution is, in polar coordinates,
$$u(r, \theta) = r^{2/3} \sin\!\left( \frac{2}{3}\left( \theta + \frac{\pi}{2} \right) \right). \tag{29}$$
We use this function, represented in Figure 2, because the solution $u$ is such that $u \in H^{5/3 - \varepsilon}(\Omega)$ but $u \in C^\infty(\Omega \setminus N_0)$, where we denote by $N_0$ a neighborhood of the origin. Therefore, we know that an efficient distribution of patches has to be characterized by a high density only near the origin.
Below, we propose, in order of complexity, three alternatives for constructing the new patches after having marked the ones with the highest error indicators. The first strategy is the simplest and most intuitive one: the new patches are randomly generated with centers inside the marked patches. The second and third strategies, instead, place the new centers on a small local Cartesian grid to ensure a more regular distribution. The difference between the second and the third strategies is that, in the third one, the marked patches are removed to increase the efficiency and a constraint is added to the marking procedure to ensure a more regular distribution of the new patches.
  • Strategy #1: Random patch centers with uniform distribution
To solve Problem (28), as a first strategy, we consider the reference patch $\hat{P} = (0, 1)^2$ and generate a sequence of sets of patches. During the first training iteration, we use $\mathcal{P}_0 = \{\hat{P}\}$, since this is already a cover of $\Omega$. During the second iteration, we enrich the set of patches as $\mathcal{P}_1 = \mathcal{P}_0 \cup \{P_1, P_2, P_3, P_4\}$, where $P_1$, $P_2$, $P_3$ and $P_4$ are square patches with edge $h_i = 0.6$, $i = 1, \dots, 4$, and centers
$$c_{P_1} = (0.3, 0.3), \quad c_{P_2} = (0.7, 0.3), \quad c_{P_3} = (0.3, 0.7), \quad c_{P_4} = (0.7, 0.7).$$
This allows us to start from a homogeneous distribution of patches before using the error indicator to choose the location of the new patches. Then, to decide how many patches have to be added to $\mathcal{P}_{m-1}$ to generate $\mathcal{P}_m$, we choose $\tilde{\tau}_m$ such that
$$\tilde{\tau}_m = \dim\left\{ \tilde{\tau} \in \{1, \dots, \dim(\mathcal{P}_{m-1})\} : \frac{\sum_{i=1}^{\tilde{\tau}} \eta_{s_i}^\gamma}{\sum_{i=1}^{\dim(\mathcal{P}_{m-1})} \eta_i^\gamma} < 0.75 \right\} + 1 \tag{30}$$
and fix
$$\tau_m = \min\!\left( 0.3 \cdot \dim(\mathcal{P}_{m-1}), \, \tilde{\tau}_m \right). \tag{31}$$
Note that (30) allows us to consider the smallest set of patches whose error indicators contribute at least $75\%$ of the global error indicator $ES_{m-1}$, whereas (31) limits, for efficiency reasons, the maximum number of patches that can be added.
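A minimal sketch of (30) and (31) is reported below; the rounding of $0.3 \cdot \dim(\mathcal{P}_{m-1})$ to an integer is an assumption of this example.

```python
import numpy as np

def number_to_refine(eta_gamma):
    """Compute tau_m from the scaled indicators eta_i^gamma, see (30)-(31)."""
    eta_sorted = np.sort(np.asarray(eta_gamma))[::-1]          # descending order
    fractions = np.cumsum(eta_sorted) / np.sum(eta_sorted)
    tau_tilde = int(np.searchsorted(fractions, 0.75)) + 1      # smallest set reaching 75%
    return min(int(0.3 * len(eta_sorted)), tau_tilde)
```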
Then, to generate the generic set of patches $\mathcal{P}_m$, we fix a multiplication factor $C_M$ that decides how many new patches have to be inserted inside each patch $P_i$ such that $\eta_i^\gamma \in \bar{\boldsymbol{\eta}}_{m-1}$. Inside each chosen patch $P_i$, $C_M$ centers $\tilde{c}_{P_i^k} = (\tilde{x}_i^k, \tilde{y}_i^k)$, $k = 1, \dots, C_M$, are randomly generated with a uniform distribution, and the edge lengths of the new patches are chosen as $h_i^k = \lambda \sqrt{A_{\mathrm{ratio}} / C_M} \, h_i$. Here, $\lambda$ is a random real value drawn from the uniform distribution $\mathcal{U}\!\left(\frac{9}{10}, \frac{10}{9}\right)$, and the scaling coefficient $\sqrt{A_{\mathrm{ratio}} / C_M}$ is chosen such that the sum of the areas of the new patches is $A_{\mathrm{ratio}}$ times the area of the original patch $P_i$. In the numerical experiments, we use $A_{\mathrm{ratio}} = 1.25$. This way, it is possible to allow the new patches to overlap and keep the area of the region $P_i \setminus \bigcup_{k=1}^{C_M} P_i^k$ reasonably small.
We remark that, with this strategy, some patches may partially lie outside $\Omega$. In order to avoid this risk, we move the centers $\tilde{c}_{P_i^k}$ to obtain the actual patch centers $c_{P_i^k}$ as follows:
$$c_{P_i^k} = (x_i^k, y_i^k) = \left( \max\!\left( \min\!\left( \tilde{x}_i^k, \, 1 - \frac{h_i^k}{2} \right), \frac{h_i^k}{2} \right), \ \max\!\left( \min\!\left( \tilde{y}_i^k, \, 1 - \frac{h_i^k}{2} \right), \frac{h_i^k}{2} \right) \right). \tag{32}$$
We remark that, when the patch $P_i$ is very close to a vertex of the domain, multiple original centers $\tilde{c}_{P_i^k}$ may be such that the distances of both $\tilde{x}_i^k$ and $\tilde{y}_i^k$ from the $x$ and $y$ coordinates of the domain vertex are smaller than $h_i^k / 2$. In this case, it is important to include the random coefficient $\lambda$ in the definition of $h_i^k$ to avoid updating all these centers to the same point; otherwise, multiple new patches would coincide (because they would share the same center and size).
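A sketch of the refinement step of Strategy #1 for a single marked patch follows; it assumes square patches inside the unit square, and the edge-length formula reflects the reconstruction $h_i^k = \lambda \sqrt{A_{\mathrm{ratio}} / C_M} \, h_i$ discussed above.

```python
import numpy as np

def refine_random(center, edge, C_M=4, A_ratio=1.25, rng=None):
    """Generate C_M new square patches with random centers inside a marked patch."""
    rng = np.random.default_rng() if rng is None else rng
    center = np.asarray(center, dtype=float)
    lo, hi = center - edge / 2, center + edge / 2
    new_patches = []
    for _ in range(C_M):
        c_tilde = rng.uniform(lo, hi)                         # random center in P_i
        lam = rng.uniform(9 / 10, 10 / 9)
        h_new = lam * np.sqrt(A_ratio / C_M) * edge           # new edge length h_i^k
        c_new = np.clip(c_tilde, h_new / 2, 1.0 - h_new / 2)  # clamping (32): stay in Omega
        new_patches.append((c_new, h_new))
    return new_patches
```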
For the numerical test, we consider $C_M = 4$ and $C_M = 9$. Using significantly more accurate quadrature rules, we compare the approximate solution with the exact one defined in (29) and compute the relative $H^1$ error $\|u - u^{NN}\|_1 / \|u\|_1$ at the end of each training iteration. The obtained errors are shown as blue circles ($C_M = 4$) and red triangles ($C_M = 9$) in Figure 3. It can be noted that, for both values of $C_M$, the error decreases as more patches are used, even though the convergence rate is limited by the low regularity of the solution. It is also interesting to observe the positions and sizes of the used patches; such information is summarized in Figure 4 and Figure 5. In these figures, each dot is located at the center of a patch $P_i$, and its size and color represent the size $h_i^2$ and the scaled indicator $\eta_i^\gamma$ associated with $P_i$. It can be noted that, even if the new centers are chosen randomly in the few selected patches, the final distribution is the expected one. In fact, most of the patches cluster around the origin, whereas the rest of the domain is covered by fewer patches. Nevertheless, we highlight that, when $C_M = 9$, there are more small and medium patches far from the origin, yielding a more uniform covering of the regions far from the singular point and a slightly better accuracy.
  • Strategy #2: Fixed patch centers
From the results discussed in Strategy #1, it can be observed that choosing the positions of the new centers randomly may lead to a non-uniform patch distribution in regions far from the singular point. In order to obtain better distributions, let us fix a priori the positions of the new centers. Let us consider the reference patch $\hat{P} = (0, 1)^2$ and the points
$$\hat{c}_1 = (0.25, 0.25), \quad \hat{c}_2 = (0.75, 0.25), \quad \hat{c}_3 = (0.25, 0.75), \quad \hat{c}_4 = (0.75, 0.75), \tag{33}$$
when $C_M = 4$ and
$$\begin{aligned}
&\hat{c}_1 = (0.2, 0.2), \quad \hat{c}_2 = (0.2, 0.5), \quad \hat{c}_3 = (0.2, 0.8), \\
&\hat{c}_4 = (0.5, 0.2), \quad \hat{c}_5 = (0.5, 0.5), \quad \hat{c}_6 = (0.5, 0.8), \\
&\hat{c}_7 = (0.8, 0.2), \quad \hat{c}_8 = (0.8, 0.5), \quad \hat{c}_9 = (0.8, 0.8),
\end{aligned} \tag{34}$$
when $C_M = 9$. At the end of the $(m-1)$-th training iteration, if $\eta_i^\gamma \in \bar{\boldsymbol{\eta}}_{m-1}$, the $C_M$ centers inside $P_i$ are chosen as $c_{P_i^k} = M_i(\hat{c}_k)$, $k = 1, \dots, C_M$. Once more, to avoid patches partially outside $\Omega$, we update such centers as in (32). We highlight that, defining the new centers as in (33) and (34) and the edge lengths $h_i^k$ of the new patches as in Strategy #1, the new patches with centers inside $P_i$ form a cover of $P_i$, i.e., $P_i \subset \bigcup_{k=1}^{C_M} P_i^k$. Such a property does not hold if the new centers are randomly chosen.
Training an MF-VPINN with such a strategy leads to more accurate results. The error decays are shown in Figure 6, whereas a comparison with the previous ones is presented in Section 3.3. The patch distributions for $C_M = 4$ and $C_M = 9$ are shown in Figure 7 and Figure 8, respectively. Analyzing such distributions, it can be noted that the patches still accumulate near the origin, as expected. However, it is possible to observe that there are regions that are only covered by the largest patches. This phenomenon is more evident when $C_M = 4$. To avoid it, we aim at inserting more patches far from the origin in order to train the MF-VPINN on the entire domain with a more balanced set of patches.
  • Strategy #3: Fixed patch centers and small level gap strategy
In order to ensure better patch distributions, let us consider a new criterion to choose the position and the size of the new patches. We name this strategy the small-level gap strategy because it penalizes patch distributions with large differences between the levels of the smallest patches and the ones of the largest patches.
We call $k$-th level patch any patch $P_i$ such that $P_i \in \mathcal{P}_k$ and $P_i \notin \mathcal{P}_{k'}$ for any $k' < k$. With this notation, it is possible to group all the patches according to their level. To do so, we denote by $\mathcal{L}_\ell$ the set of $k$-th level patches with $k \le \ell$. Let us consider the $m$-th training iteration. We define $\boldsymbol{\eta}^{\mathrm{sort},\ell}$ as the array containing the elements $\eta_i^\gamma$ of $\boldsymbol{\eta}^{\mathrm{sort}}$ (maintaining the same ordering) such that $P_i \in \mathcal{L}_\ell$. We also denote by $\bar{\boldsymbol{\eta}}_{m,\ell}$ the array containing the first $\tau_m^\ell = \min\{\tau_m, \dim(\mathcal{L}_\ell)\}$ elements of $\boldsymbol{\eta}^{\mathrm{sort},\ell}$. Note that $\bar{\boldsymbol{\eta}}_{m,\ell}$ is the equivalent of $\bar{\boldsymbol{\eta}}_m$ for patches in $\mathcal{L}_\ell$.
In order to generate the new patches in $\mathcal{P}_{m+1} \setminus \mathcal{P}_m$, let us add $C_M$ new patches in any patch $P_i$ such that $\eta_i^\gamma \in \bar{\boldsymbol{\eta}}_m \cup \bar{\boldsymbol{\eta}}_{m,\ell}$. The centers and sizes of the new patches are chosen as in Strategy #2. This allows us to exploit the fact that $P_i \subset \bigcup_{k=1}^{C_M} P_i^k$ to remove the patches $P_i$ such that $\eta_i^\gamma \in \bar{\boldsymbol{\eta}}_m \cup \bar{\boldsymbol{\eta}}_{m,\ell}$ from the new set of patches $\mathcal{P}_{m+1}$. We remark that such patches cannot be removed when the centers are randomly chosen as in Strategy #1 because, in that case, $\mathcal{P}_{m+1}$ would no longer be a cover of $\Omega$.
We also highlight that, removing the patches $P_i$ such that $\eta_i^\gamma \in \bar{\boldsymbol{\eta}}_m \cup \bar{\boldsymbol{\eta}}_{m,\ell}$ and choosing $A_{\mathrm{ratio}} = 1$, it is possible to satisfy the inequality
$$\sum_{P_i \in \mathcal{P}_{m+1}} |P_i| \le C \, |\Omega|,$$
for any $m \in \mathbb{N}$ and with $C > 0$ independent of $m$. Such a bound on the sum of the areas of the patches is useful to ensure that there exists a number $N_{\mathrm{patch\_per\_point}}$ such that any point inside $\Omega$ belongs to at most $N_{\mathrm{patch\_per\_point}}$ patches. This property is useful to derive global error indicators. We choose to maintain $A_{\mathrm{ratio}} = 1.25$ to compare the numerical results with the ones obtained using the previous strategies and to consider overlapping patches.
We train an MF-VPINN with $C_M = 4$ and $C_M = 9$ as in the previous tests. The corresponding error decays are shown in Figure 9. It can be observed that the error decreases more smoothly and that, as in the previous tests, choosing $C_M = 4$ or $C_M = 9$ does not lead to significant differences in the error behavior. The patches used during the training are represented in Figure 10 and Figure 11. We highlight that, compared with the patch distributions of Strategy #2, there are many more patches far from the origin and, most importantly, the closer the center of a patch is to the origin, the smaller its size. Even though the error decays with $C_M = 4$ and $C_M = 9$ are qualitatively similar, it should be noted that the patch distribution with $C_M = 9$ is more skewed. In fact, its patches can be clustered into two subgroups: the first one contains larger patches and covers most of the domain; the second one contains only small patches with centers very close to the origin. A similar distribution is obtained with $C_M = 4$, even though it is characterized by a smoother transition between large and small patches.
In both cases, it can be observed that there are no large patches very close to small ones. This is in contrast with the distributions obtained in Strategy #2 and leads to more stable solvers. Indeed, even though the test functions are not related to a global triangulation on the entire domain Ω , the current loss function is very similar to the one used in a standard VPINN with a good-quality mesh, i.e., a mesh in which neighboring elements are similar in size and shape. On the other hand, in Strategy #2, there exist large patches that are very close to small ones; this is equivalent to training a VPINN on a very poor-quality mesh. Such meshes, in the context of FEM, are strictly related to convergence and accuracy issues.

3.3. The Importance of the Error Indicator

As discussed in the previous sections, we use the error indicator described in Section 2.3 to interrupt the training and to decide where the new patches have to be inserted to maximize the accuracy. In this section, the advantages of such a choice are described.
Since each set $\mathcal{P}_m$ is a cover of $\Omega$, the quantity $ES_m = \sum_{i=1}^{\dim(\mathcal{P}_m)} \eta_i$ is an indicator of the global $H^1$ error $\|u - u^{NN}\|_1$ on the entire domain $\Omega$. Therefore, tracking its behavior during the training is equivalent to tracking that of the unknown $H^1$ error. Such information is used to implement an early stopping strategy to reduce the computational cost of the iterative training. At the beginning of the $m$-th training iteration, all the vectors and sparse matrices required to compute $ES_m$ are computed in a preprocessing phase. When such data structures are available, the error indicator can be assembled by suitably combining basic algebraic operations.
We assemble $ES_m$ every $N_{\mathrm{check}}$ epochs and store the best value obtained during the training, together with the corresponding neural network trainable parameters. Then, if no improvement is obtained in $p \cdot N_{\mathrm{check}}$ epochs, the training is interrupted and the neural network parameters associated with the best value of $ES_m$ are restored. Here, $p$ is a tunable parameter named patience. The first $N_{\mathrm{negl}}^m$ epochs are neglected because they are often characterized by strong oscillations due to the optimizer initialization and the different loss functions. In the numerical experiments, we use $N_{\mathrm{check}} = 10$, $p = 10$, $N_{\mathrm{negl}}^m = 100(m + 1)$.
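A minimal sketch of this indicator-based early stopping is reported below; it only assumes that the training loop exposes the current epoch, the value of $ES_m$ and the current network weights (the class and attribute names are illustrative).

```python
import numpy as np

class IndicatorEarlyStopping:
    """Stop the m-th training iteration when ES_m has not improved for
    `patience` consecutive checks performed every `n_check` epochs."""
    def __init__(self, n_check=10, patience=10, n_neglect=100):
        self.n_check, self.patience, self.n_neglect = n_check, patience, n_neglect
        self.best = np.inf
        self.best_weights = None
        self.stalled_checks = 0

    def update(self, epoch, es_value, weights):
        # returns True when the training should be interrupted
        if epoch < self.n_neglect or epoch % self.n_check != 0:
            return False
        if es_value < self.best:
            self.best, self.best_weights = es_value, weights
            self.stalled_checks = 0
        else:
            self.stalled_checks += 1
        return self.stalled_checks >= self.patience
```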
Two typical scenarios are shown in Figure 12. In the top row, the behaviors of $ES_m$ and of $c\|u - u^{NN}\|_1$ are shown. Here, $c$ is a scaling parameter used for visualization purposes, chosen such that $ES_m$ and $c\|u - u^{NN}\|_1$ coincide at the beginning of the training. Indeed, $\|u - u^{NN}\|_1$ is about two orders of magnitude smaller than $ES_m$. Nevertheless, it can be noted that these two quantities display very similar behaviors during the training. In the bottom row, instead, we represent the corresponding loss function decay. The left column is associated with the training performed using the patches in $\mathcal{P}_6$ shown in Figure 5f, and the right column with the one performed using the patches in $\mathcal{P}_2$ in Figure 11a. We remark that the loss function, $ES_m$ and $c\|u - u^{NN}\|_1$ are evaluated at the same epochs and that, in real applications, it is not possible to explicitly compute $c\|u - u^{NN}\|_1$ since $u$ is not known. Moreover, since we use the L-BFGS optimizer, the neural network is evaluated multiple times on the entire training set in each epoch. Therefore, on the x-axis of Figure 12, we show the number of neural network evaluations instead of the number of epochs.
It can be noted that the behavior of the quantities shown in the left column is qualitatively different from that of the ones in the right column. In fact, when the MF-VPINN is trained with the $\mathcal{P}_6$ of Figure 5f, the error, the error indicator, and the loss function decrease in similar ways. Therefore, there is no need to interrupt the training early, since the accuracy keeps improving while the loss function is minimized. On the other hand, when the MF-VPINN is trained with the $\mathcal{P}_2$ of Figure 11a, the loss decreases even when the error and the error indicator increase or remain constant. In this case, it is convenient to interrupt the training, since minimizing the loss function further would lead to more severe overfitting phenomena and a loss in accuracy and efficiency. At the end of the training, the neural network's trainable parameters corresponding to the best value of $ES_2$ are restored. Such a phenomenon, also observed in [46], highlights the fact that the minimization of the loss function generates spurious oscillations that cannot be controlled and that ruin the model accuracy. The issue can be partially alleviated with the adopted regularization or completely removed using inf-sup stable models as in [47].
  • Strategy #4: Adaptive strategy without the error indicator
Let us now analyze the consequences of choosing the positions of the new patches without using the error indicator. To do so, we consider Strategy #1 but, instead of placing the new centers inside the patches $P_i$ with the highest values of $\eta_i^\gamma$, we add them inside the patches with the highest values of $r_{h,i}^2(u^{NN})$. Using the equation residuals is a common choice in PINN adaptivity because the residuals describe how accurately the neural network satisfies the PDE at a given point. The obtained error decay is shown in Figure 13. It can be seen that the accuracy is worse than the ones obtained with the other strategies and that the convergence rate with respect to the number of patches is lower. In such a figure, we also compare the MF-VPINN with a standard VPINN trained with test functions defined on Delaunay meshes. Note that, when Strategy #2 or Strategy #3 is adopted, the MF-VPINN is more accurate than a standard VPINN, even though its main advantage resides in being a meshfree method.
We highlight that, due to the low regularity of the solution, the expected convergence rate with respect to the number of test functions of an FEM solution computed on uniform refinements is −1/3. Note that the convergence rate of the proposed MF-VPINN method is still close to −1/3, even though it is a meshfree method (see Table 1). For completeness, we also remark that, if an adaptive FEM is used, the rate of convergence depends on the FEM order.
Coherently with Figure 13, the best strategies are Strategy #2 and Strategy #3, whereas the worst one is Strategy #4, which does not exploit the error indicator. The poor performance of Strategy #4 can also be explained by analyzing the corresponding patch distributions, shown in Figure 14 for $C_M = 4$ and in Figure 15 for $C_M = 9$. These plots highlight that the patches do not accumulate near the origin because the residuals of the patches closer to it are not significantly higher than the other ones. Note, for example, the different colors in Figure 4 and Figure 14, even though in both cases we randomly choose the positions of $C_M = 4$ centers inside the selected patches. Such a behavior is explained by the fact that, in order to minimize the loss function, the optimizer does not focus on specific regions of the domain. Therefore, the orders of magnitude of the residuals associated with patches of similar size are very close to each other, regardless of the positions of the corresponding patches. As discussed regarding Figure 12, we can conclude that the value of the residuals is not a good indicator of the actual error.

3.4. Extension to a More Complex Domain

In this section, we present some ideas that can be used to apply the method to more complex domains.
Let us consider a domain $\Omega_2$ with some internal holes and boundary $\partial\Omega_2 = \Gamma_2$. In particular, $\Omega_2 = (0, 1)^2 \setminus \bigcup_{i=1}^4 H_i$, where $H_i$, $i = 1, 2, 3, 4$, are rectangular holes with centers $c_{H_i}$ defined as
$$c_{H_1} = \left( \tfrac{9}{26}, \tfrac{9}{34} \right), \quad c_{H_2} = \left( \tfrac{17}{26}, \tfrac{9}{34} \right), \quad c_{H_3} = \left( \tfrac{9}{26}, \tfrac{25}{34} \right), \quad c_{H_4} = \left( \tfrac{17}{26}, \tfrac{25}{34} \right),$$
and base and height equal to $\tfrac{1}{26}$ and $\tfrac{1}{34}$, respectively.
In this domain, we consider the Poisson problem:
$$-\Delta u = f \quad \text{in } \Omega_2, \qquad u = g \quad \text{on } \Gamma_2,$$
with $f$ and $g$ such that the exact solution is
$$u(x, y) = \frac{1}{C_u} \, x (x - 1) \left( x - \tfrac{4}{13} \right) \left( x - \tfrac{5}{13} \right) \left( x - \tfrac{8}{13} \right) \left( x - \tfrac{9}{13} \right) \cdot y (y - 1) \left( y - \tfrac{4}{17} \right) \left( y - \tfrac{5}{17} \right) \left( y - \tfrac{12}{17} \right) \left( y - \tfrac{13}{17} \right),$$
normalized through the constant $\frac{1}{C_u}$ so that it assumes the value 1 at $\left( \tfrac{2}{13}, \tfrac{2}{17} \right)$. This function is represented in Figure 16.
We extend the approaches proposed in Section 3.2 by adding a cutting procedure after the generation of the new patches. Note that, in particular, all the patches are already completely inside the square $[0, 1]^2$ when we apply the cutting procedure, and we can thus focus only on the holes. When a patch intersects more than one hole, we recursively remove it from $\mathcal{P}_m$, subdivide the corresponding region into 4 overlapping patches, and add them to $\mathcal{P}_m$, until all the generated patches intersect at most one hole. Moreover, we observe that the region $P_i \setminus H_j$ inside the patch $P_i \in \mathcal{P}_m$ and outside the hole $H_j$, $j = 1, 2, 3, 4$, can always be covered by the union of at most four rectangles. When a generated patch intersects a hole, we thus remove the patch and generate the minimum number of patches (at most four) that are as large as possible and whose union is the region $P_i \setminus H_j$.
To avoid numerical instabilities, when this cutting procedure generates a patch with an aspect ratio larger than 100 or with an area more than 100 times smaller than that of the original uncut patch, the corresponding new patches are removed from $\mathcal{P}_m$. This implies that it is not possible to remove the patches associated with the highest error indicators as in Strategy #3 because, otherwise, the union of all the patches would not cover the entire domain. We thus present numerical results only for Strategy #1 and Strategy #2.
The obtained error decays are shown in Figure 17 for Strategy #1 and Strategy #2, with $C_M = 4$ and $C_M = 9$. The first and second errors are computed with the patches generated by cutting the patches in $\mathcal{P}_0$ and $\mathcal{P}_1$, respectively, whereas the third and fourth errors are obtained by refining the previous patches with the error indicator as previously described. Note that the first and second errors are very close for all the curves, since the strategy and the value of $C_M$ do not influence this initial part of the training, and that both strategies converge better with $C_M = 4$. The final patch distributions are displayed in Figure 18. Here, we can see that the inner part of the domain is covered by a few large patches, whereas the distribution is denser close to the external boundary of $\Omega_2$, where the solution oscillates more.

4. Conclusions and Discussion

In this work, we presented a Meshfree Variational-Physics-Informed Neural Network (MF-VPINN). It is a PINN trained using the PDE variational formulation that does not require the generation of a global triangulation of the entire domain. In order to generate the test functions involved in the loss computation, we use an a posteriori error estimator based on the one discussed in [46]. Using such an error estimator, it is possible to add test functions only in regions in which the error is higher, thus increasing the efficiency of the method.
We highlight that the main advantages of the method are that it is meshfree, as it requires only a covering of the domain with patches that can be of different shapes, and that it automatically improves the solution by adding local patches without requiring any global mesh manipulation. It can therefore be used in domains where it is expensive or impossible to generate a mesh. On the other hand, if a mesh suitable to describe the solution can be generated, a standard VPINN is preferable, since its implementation is simpler and the convergence rate with respect to the number of test functions is higher.
We discuss several strategies to generate the set of test functions. We observe that adding a few test functions inside the patches associated with higher errors while ensuring a smooth transition between regions with large patches and regions with small patches is the best way to obtain accurate solutions. We also show that, if the a posteriori error indicator is not used, the model’s accuracy decreases and the training is slower.
In this paper, we only focus on second-order elliptic problems, even though VPINNs can be used to solve more complex problems. In a forthcoming paper, we will adapt the a posteriori error estimator and analyze the MF-VPINN performance on other PDEs. Moreover, we are interested in the analysis of the approach in more complex domains (in which the patches have to be suitably deformed) and in high-dimensional problems, where using a standard VPINN is not practical.

Author Contributions

Conceptualization, S.B. and M.P.; methodology, S.B. and M.P.; software, M.P.; validation, M.P.; formal analysis, S.B. and M.P.; investigation, S.B. and M.P.; resources, S.B.; data curation, M.P.; writing—original draft preparation, M.P.; visualization, M.P.; supervision, S.B.; project administration, S.B.; funding acquisition, S.B. All authors have read and agreed to the published version of the manuscript.

Funding

The author S.B. kindly acknowledges partial financial support provided by PRIN project “Advanced polyhedral discretisations of heterogeneous PDEs for multiphysics problems” (No. 20204LN5N5_003) and by PNRR M4C2 project of CN00000013 National center for HPC, Big Data and Quantum Computing (HPC) (CUP: E13C22000990001). The author M.P. kindly acknowledges the financial support provided by the Politecnico di Torino where the research was carried out.

Data Availability Statement

The data are available upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Lagaris, I.; Likas, A.; Fotiadis, D. Artificial neural network methods in quantum mechanics. Comput. Phys. Commun. 1997, 104, 1–14.
2. Lagaris, I.; Likas, A.; Fotiadis, D. Artificial neural networks for solving ordinary and partial differential equations. IEEE Trans. Neural Netw. 1998, 9, 987–1000.
3. Lagaris, I.; Likas, A.; Papageorgiou, D. Neural-network methods for boundary value problems with irregular boundaries. IEEE Trans. Neural Netw. 2000, 11, 1041–1049.
4. Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. 2015. Available online: https://www.tensorflow.org (accessed on 15 September 2024).
5. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32; Curran Associates, Inc.: Red Hook, NY, USA, 2019; pp. 8024–8035.
6. Bradbury, J.; Frostig, R.; Hawkins, P.; Johnson, M.J.; Leary, C.; Maclaurin, D.; Necula, G.; Paszke, A.; VanderPlas, J.; Wanderman-Milne, S.; et al. JAX: Composable Transformations of Python+NumPy Programs. 2018. Available online: http://github.com/google/jax (accessed on 15 September 2024).
7. Raissi, M.; Perdikaris, P.; Karniadakis, G. Physics informed deep learning (part i): Data-driven solutions of nonlinear partial differential equations. arXiv 2017, arXiv:1711.10561.
8. Raissi, M.; Perdikaris, P.; Karniadakis, G. Physics informed deep learning (part ii): Data-driven discovery of nonlinear partial differential equations. arXiv 2017, arXiv:1711.10566.
9. Raissi, M.; Perdikaris, P.; Karniadakis, G. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019, 378, 686–707.
10. Pu, J.; Li, J.; Chen, Y. Solving localized wave solutions of the derivative nonlinear Schrödinger equation using an improved PINN method. Nonlinear Dyn. 2021, 105, 1723–1739.
11. Yuan, L.; Ni, Y.; Deng, X.; Hao, S. A-PINN: Auxiliary physics informed neural networks for forward and inverse problems of nonlinear integro-differential equations. J. Comput. Phys. 2022, 462, 111260.
12. Guo, Q.; Zhao, Y.; Lu, C.; Luo, J. High-dimensional inverse modeling of hydraulic tomography by physics informed neural network (HT-PINN). J. Hydrol. 2023, 616, 128828.
13. Demo, N.; Strazzullo, M.; Rozza, G. An extended physics informed neural network for preliminary analysis of parametric optimal control problems. Comput. Math. Appl. 2023, 143, 383–396.
14. Gao, H.; Sun, L.; Wang, J. PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. J. Comput. Phys. 2021, 428, 110079.
15. Yuyao, C.; Lu, L.; Karniadakis, G.; Dal Negro, L. Physics-informed neural networks for inverse problems in nano-optics and metamaterials. Opt. Express 2020, 28, 11618–11633.
16. Tartakovsky, A.; Marrero, C.; Perdikaris, P.; Tartakovsky, G.; Barajas-Solano, D. Learning parameters and constitutive relationships with physics informed deep neural networks. arXiv 2018, arXiv:1808.03398.
17. Chen, Z.; Liu, Y.; Sun, H. Physics-informed learning of governing equations from scarce data. Nat. Commun. 2021, 12, 6136.
18. Weinan, E.; Yu, B. The Deep Ritz method: A deep learning-based numerical algorithm for solving variational problems. Commun. Math. Stat. 2018, 6, 1–12.
19. Müller, J.; Zeinhofer, M. Error estimates for the deep Ritz method with boundary penalty. In Proceedings of the Mathematical and Scientific Machine Learning, PMLR, Beijing, China, 15–17 August 2022; pp. 215–230.
20. Lu, Y.; Lu, J.; Wang, M. A priori generalization analysis of the deep Ritz method for solving high dimensional elliptic partial differential equations. In Proceedings of the Conference on Learning Theory, PMLR, Boulder, CO, USA, 15–19 August 2021; pp. 3196–3241.
21. Sirignano, J.; Spiliopoulos, K. DGM: A deep learning algorithm for solving partial differential equations. J. Comput. Phys. 2018, 375, 1339–1364.
22. Al-Aradi, A.; Correia, A.; Jardim, G.; de Freitas Naiff, D.; Saporito, Y. Extensions of the deep Galerkin method. Appl. Math. Comput. 2022, 430, 127287.
23. Li, J.; Zhang, W.; Yue, J. A deep learning Galerkin method for the second-order linear elliptic equations. Int. J. Numer. Anal. Model. 2021, 18, 427–441.
24. Smith, B.F. Domain decomposition methods for partial differential equations. In Parallel Numerical Algorithms; Springer: Berlin/Heidelberg, Germany, 1997; pp. 225–243.
25. Toselli, A.; Widlund, O. Domain Decomposition Methods-Algorithms and Theory; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2006; Volume 34.
26. Jagtap, A.; Kharazmi, E.; Karniadakis, G. Conservative physics-informed neural networks on discrete domains for conservation laws: Applications to forward and inverse problems. Comput. Methods Appl. Mech. Eng. 2020, 365, 113028.
27. Shukla, K.; Jagtap, A.D.; Karniadakis, G.E. Parallel physics-informed neural networks via domain decomposition. J. Comput. Phys. 2021, 447, 110683.
28. Jagtap, A.; Karniadakis, G. Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. Commun. Comput. Phys. 2020, 28, 2002–2041.
29. Moseley, B.; Markham, A.; Nissen-Meyer, T. Finite Basis Physics-Informed Neural Networks (FBPINNs): A scalable domain decomposition approach for solving differential equations. Adv. Comput. Math. 2023, 49, 62.
30. Viana, F.; Nascimento, R.; Dourado, A.; Yucesan, Y. Estimating model inadequacy in ordinary differential equations with physics-informed neural networks. Comput. Struct. 2021, 245, 106458.
31. Yang, L.; Meng, X.; Karniadakis, G. B-PINNs: Bayesian physics-informed neural networks for forward and inverse PDE problems with noisy data. J. Comput. Phys. 2021, 425, 109913.
32. Yang, L.; Zhang, D.; Karniadakis, G. Physics-Informed Generative Adversarial Networks for Stochastic Differential Equations. SIAM J. Sci. Comput. 2020, 42, A292–A317.
33. Yucesan, Y.; Viana, F. Hybrid physics-informed neural networks for main bearing fatigue prognosis with visual grease inspection. Comput. Ind. 2021, 125, 103386.
34. Zhu, Y.; Zabaras, N.; Koutsourelakis, P.; Perdikaris, P. Physics-constrained deep learning for high-dimensional surrogate modeling and uncertainty quantification without labeled data. J. Comput. Phys. 2019, 394, 56–81.
35. Pang, G.; Lu, L.; Karniadakis, G.E. fPINNs: Fractional physics-informed neural networks. SIAM J. Sci. Comput. 2019, 41, A2603–A2626.
36. Liu, Z.; Wang, Y.; Vaidya, S.; Ruehle, F.; Halverson, J.; Soljačić, M.; Hou, T.Y.; Tegmark, M. KAN: Kolmogorov-Arnold networks. arXiv 2024, arXiv:2404.19756.
37. Koenig, B.C.; Kim, S.; Deng, S. KAN-ODEs: Kolmogorov-Arnold Network Ordinary Differential Equations for Learning Dynamical Systems and Hidden Physics. arXiv 2024, arXiv:2407.04192.
38. Qian, K.; Kheir, M. Investigating KAN-Based Physics-Informed Neural Networks for EMI/EMC Simulations. arXiv 2024, arXiv:2405.11383.
39. Kumar, V.; Gleyzer, L.; Kahana, A.; Shukla, K.; Karniadakis, G.E. MyCrunchGPT: A LLM assisted framework for scientific machine learning. J. Mach. Learn. Model. Comput. 2023, 4, 41–72.
40. Beck, C.; Hutzenthaler, M.; Jentzen, A.; Kuckuck, B. An overview on deep learning-based approximation methods for partial differential equations. Discret. Contin. Dyn. Syst. B 2022, 28, 3697–3746.
41. Cuomo, S.; Di Cola, V.S.; Giampaolo, F.; Rozza, G.; Raissi, M.; Piccialli, F. Scientific Machine Learning Through Physics-Informed Neural Networks: Where we are and What’s Next. J. Sci. Comput. 2022, 92, 88.
42. Lawal, Z.; Yassin, H.; Lai, D.; Che Idris, A. Physics-Informed Neural Network (PINN) Evolution and Beyond: A Systematic Literature Review and Bibliometric Analysis. Big Data Cogn. Comput. 2022, 6, 140.
43. Viana, F.A.; Subramaniyan, A.K. A survey of Bayesian calibration and physics-informed neural networks in scientific modeling. Arch. Comput. Methods Eng. 2021, 28, 3801–3830.
44. Kharazmi, E.; Zhang, Z.; Karniadakis, G. VPINNs: Variational physics-informed neural networks for solving partial differential equations. arXiv 2019, arXiv:1912.00873.
45. Kharazmi, E.; Zhang, Z.; Karniadakis, G. hp-VPINNs: Variational physics-informed neural networks with domain decomposition. Comput. Methods Appl. Mech. Eng. 2021, 374, 113547.
46. Berrone, S.; Canuto, C.; Pintore, M. Solving PDEs by variational physics-informed neural networks: An a posteriori error analysis. Ann. Univ. Ferrara 2022, 68, 575–595.
47. Berrone, S.; Canuto, C.; Pintore, M. Variational-Physics-Informed Neural Networks: The role of quadratures and test functions. J. Sci. Comput. 2022, 92, 100.
48. Berrone, S.; Pieraccini, S.; Scialò, S. Towards effective flow simulations in realistic discrete fracture networks. J. Comput. Phys. 2016, 310, 181–201.
49. Sukumar, N.; Srivastava, A. Exact imposition of boundary conditions with distance functions in physics-informed deep neural networks. Comput. Methods Appl. Mech. Eng. 2022, 389, 114333.
50. Berrone, S.; Canuto, C.; Pintore, M.; Sukumar, N. Enforcing Dirichlet boundary conditions in physics-informed neural networks and variational physics-informed neural networks. Heliyon 2023, 9, e18820.
51. Kingma, D.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
52. Wright, S.; Nocedal, J. Numerical Optimization; Springer: Berlin/Heidelberg, Germany, 1999; Volume 35, p. 7.
53. Baydin, A.; Pearlmutter, B.; Radul, A.; Siskind, J. Automatic differentiation in machine learning: A survey. J. Mach. Learn. Res. 2018, 18, 5595–5637.
54. Prechelt, L. Early stopping-but when? In Neural Networks: Tricks of the Trade; Springer: Berlin/Heidelberg, Germany, 1998; pp. 55–69.
Figure 1. Graphical representation of a set $\{P_i\}_{i=1}^{n}$ of patches, obtained from a square reference patch $\hat{P}$ with $c_{\hat{P}}$ in its center, covering the domain $\Omega = (0,1)^2$.
Figure 2. Graphical representation of the solution $u$ in (29).
Figure 3. Strategy #1: Relative $H^1$ errors obtained at the end of each training iteration for $C_M = 4$ (blue circles) and $C_M = 9$ (red triangles).
Figure 4. Strategy #1: Patches used to train the MF-VPINN with $C_M = 4$. Each dot represents a patch $P_i$: its position is the center $c_{P_i}$ of the patch, its size is proportional to the patch size $h_i^2$, and its color is associated with the quantity $\eta_i^\gamma$. (a) Representation of $P_2$; (b) Representation of $P_3$; (c) Representation of $P_4$; (d) Representation of $P_6$; (e) Representation of $P_8$; (f) Representation of $P_9$.
Figure 5. Strategy #1: Patches used to train the MF-VPINN with $C_M = 9$. Each dot represents a patch $P_i$: its position is the center $c_{P_i}$ of the patch, its size is proportional to the patch size $h_i^2$, and its color is associated with the quantity $\eta_i^\gamma$. (a) Representation of $P_1$; (b) Representation of $P_2$; (c) Representation of $P_3$; (d) Representation of $P_4$; (e) Representation of $P_5$; (f) Representation of $P_6$.
Figure 6. Strategy #2: Relative $H^1$ errors obtained at the end of each training iteration for $C_M = 4$ (blue circles) and $C_M = 9$ (red triangles).
Figure 7. Strategy #2: Patches used to train the MF-VPINN with $C_M = 4$. Each dot represents a patch $P_i$: its position is the center $c_{P_i}$ of the patch, its size is proportional to the patch size $h_i^2$, and its color is associated with the quantity $\eta_i^\gamma$. (a) Representation of $P_3$; (b) Representation of $P_4$; (c) Representation of $P_5$; (d) Representation of $P_6$; (e) Representation of $P_7$; (f) Representation of $P_8$.
Figure 8. Strategy #2: Patches used to train the MF-VPINN with $C_M = 9$. Each dot represents a patch $P_i$: its position is the center $c_{P_i}$ of the patch, its size is proportional to the patch size $h_i^2$, and its color is associated with the quantity $\eta_i^\gamma$. (a) Representation of $P_2$; (b) Representation of $P_3$; (c) Representation of $P_4$; (d) Representation of $P_5$; (e) Representation of $P_6$; (f) Representation of $P_7$.
Figure 9. Strategy #3: Relative $H^1$ errors obtained at the end of each training iteration for $C_M = 4$ (blue circles) and $C_M = 9$ (red triangles).
Figure 10. Strategy #3: Patches used to train the MF-VPINN with $C_M = 4$. Each dot represents a patch $P_i$: its position is the center $c_{P_i}$ of the patch, its size is proportional to the patch size $h_i^2$, and its color is associated with the quantity $\eta_i^\gamma$. (a) Representation of $P_3$; (b) Representation of $P_5$; (c) Representation of $P_6$; (d) Representation of $P_7$; (e) Representation of $P_8$; (f) Representation of $P_9$.
Figure 11. Strategy #3: Patches used to train the MF-VPINN with $C_M = 9$. Each dot represents a patch $P_i$: its position is the center $c_{P_i}$ of the patch, its size is proportional to the patch size $h_i^2$, and its color is associated with the quantity $\eta_i^\gamma$. (a) Representation of $P_2$; (b) Representation of $P_3$; (c) Representation of $P_4$; (d) Representation of $P_5$; (e) Representation of $P_6$; (f) Representation of $P_7$.
Figure 12. Top row: error indicator $E_{S_m}$ and rescaled $H^1$ error $c\,\|u - u_{NN}\|$. Bottom row: loss function. Left column: curves for the training with patches in $P_6$ shown in Figure 5f. Right column: curves for the training with patches in $P_2$ in Figure 11a. (a) $E_{S_6}$ and $c\,\|u - u_{NN}\|$ for the patches in Figure 5f; (b) $E_{S_2}$ and $c\,\|u - u_{NN}\|$ for the patches in Figure 11a; (c) Loss function for the patches in Figure 5f; (d) Loss function for the patches in Figure 11a.
Figure 13. Comparison between the relative $H^1$ errors obtained at the end of each training iteration with different strategies to choose the position of the new patches. (a) $C_M = 4$; (b) $C_M = 9$.
Figure 14. Strategy #4: Patches used to train the MF-VPINN with $C_M = 4$. Each dot represents a patch $P_i$: its position is the center $c_{P_i}$ of the patch, its size is proportional to the patch size $h_i^2$, and its color is associated with the quantity $r_{h,i}^2(u_{NN})$. (a) Representation of $P_1$; (b) Representation of $P_2$; (c) Representation of $P_3$; (d) Representation of $P_4$; (e) Representation of $P_5$; (f) Representation of $P_6$.
Figure 15. Strategy #4: Patches used to train the MF-VPINN with $C_M = 9$. Each dot represents a patch $P_i$: its position is the center $c_{P_i}$ of the patch, its size is proportional to the patch size $h_i^2$, and its color is associated with the quantity $r_{h,i}^2(u_{NN})$. (a) Representation of $P_1$; (b) Representation of $P_2$; (c) Representation of $P_3$; (d) Representation of $P_4$; (e) Representation of $P_5$; (f) Representation of $P_6$.
Figure 16. Graphical representation of the solution $u$ in (36).
Figure 17. Relative $H^1$ errors obtained by solving problem (35).
Figure 18. Problem (35): Representation of the last set of patches obtained with the different strategies. Each dot represents a patch $P_i$: its position is the center $c_{P_i}$ of the patch, its size is proportional to the patch size $h_i^2$, and its color is associated with the quantity $r_{h,i}^2(u_{NN})$. The black rectangles represent the holes $H_i$, $i = 1, 2, 3, 4$. (a) Strategy #1, $C_M = 4$; (b) Strategy #2, $C_M = 4$; (c) Strategy #1, $C_M = 9$; (d) Strategy #2, $C_M = 9$.
Table 1. Rates of convergence with respect to the number of test functions.

$C_M$    Strategy #1    Strategy #2    Strategy #3    Strategy #4    Reference VPINN
4        −0.213         −0.295         −0.283         −0.105         −0.232
9        −0.294         −0.376         −0.287         −0.182         −0.232