Proceeding Paper

Equivariant Neural Networks and Differential Invariants Theory for Solving Partial Differential Equations †

by Pierre-Yves Lagrave 1,*,‡ and Eliot Tron 2,‡,§
1 Thales Research and Technology, 91767 Palaiseau, France
2 Ecole Normale Supérieure, 69342 Lyon, France
* Author to whom correspondence should be addressed.
† Presented at the 41st International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, Paris, France, 18–22 July 2022.
‡ These authors contributed equally to this work.
§ The author contributed to this work during an internship at Thales Research and Technology in 2021.
Phys. Sci. Forum 2022, 5(1), 13; https://doi.org/10.3390/psf2022005013
Published: 7 November 2022

Abstract: This paper discusses the use of Equivariant Neural Networks (ENN) for solving Partial Differential Equations (PDEs) by exploiting their underlying symmetry groups. We first show that Group-Convolutional Neural Networks can be used to generalize Physics-Informed Neural Networks, and we then consider the use of ENN to approximate the differential invariants of a given symmetry group, hence allowing symmetry-preserving Finite Difference methods to be built without the need to formally derive the corresponding numerical invariantizations. The benefit of our approach is illustrated on the 2D heat equation through the instantiation of an SE(2) symmetry-preserving discretization.

1. Introduction

Numerically solving Partial Differential Equations (PDEs) is of paramount importance for a wide range of applications such as physics, crowd theory, epidemiology and quantitative finance. Conventional methods such as Finite Element or Finite Difference methods have the main advantage of being easy to implement but are highly time consuming. With the rise of Deep Learning over the past decade, new approximate methods based on Physics-Informed Neural Networks (PINN) have been developed [1,2,3] and significantly improve simulation capabilities [4,5].
Nonetheless, usual PDEs typically exhibit symmetries [6,7], and it is therefore natural to expect numerical solving schemes to comply with them. For Hamiltonian systems, symplectic integrators [8,9,10,11] have been introduced and have recently been combined with machine learning techniques for the sake of efficiency [12]. For more general PDEs, symmetry-preserving Finite Difference schemes have been proposed [13,14], with the underlying theory consolidated in [15]. Practical applications showing improvements with respect to conventional approaches have been presented in [16,17]. However, the formal derivation of the required numerical invariantizations of the differential operators becomes increasingly challenging as the number of variables grows, hence limiting the applicability of these methods and motivating the need for alternative approaches.
There are mainly two ways to imprint Deep Learning algorithms with symmetries. The first one, recently explored in [18] for PDE solving, generalizes the data augmentation techniques widely used for image processing tasks and aims at learning symmetries directly from the data. The second one aims at directly encoding the symmetries within the learning algorithms by leveraging the emerging field of Geometric Deep Learning [19,20]. In this context, Equivariant Neural Networks (ENN), initially introduced in [21], have been shown to be very efficient, leveraging generalized convolution operators such as steerable convolution or G-convolution [22,23,24,25] and thereby providing equivariance to a wide range of symmetry groups. These equivariance mechanisms are very appealing, as they provide theoretical guarantees on the algorithms' response to input variations, and they have been shown to be more efficient than data augmentation techniques in several contexts, from both theoretical [26] and empirical [27] standpoints. Yet, these architectures cannot be applied directly to PDE solving as one would apply a conventional PINN and, at the time of writing, only [28] proposes to use steerable convolution to solve PDEs, and it is limited to special cases of symmetries.

Contributions

In this paper, we present two innovative ways of using ENN to solve PDEs while exploiting the associated symmetries. Building on [29], we first show that Group-Convolutional Neural Networks can be used to generalize the PINN architecture so as to encode generic symmetries. By leveraging differential invariant theory [6], we then propose using ENN to approximate the differential invariants of a given symmetry group, hence allowing symmetry-preserving Finite Difference methods to be built without the need to formally derive the corresponding numerical invariantizations. A key advantage of this approach is that it allows solving any other PDE with the same symmetry group without any retraining. Finally, we illustrate the interest of our approach on the 2D heat equation and show in particular that a set of fundamental differential invariants of the roto-translation group SE(2) can be efficiently approximated by ENN for arbitrary functions by training on simple bivariate polynomial evaluations, allowing SE(2) symmetry-preserving discretization schemes to be easily built.

2. PDEs and Symmetries

2.1. Systems of PDEs

We are interested in the following in solving systems of PDEs involving one time variable $t$, $p$ independent space variables $x_1, \ldots, x_p$ gathered in $x \in X$ and $q$ dependent variables $u_1, \ldots, u_q$ gathered in $u \in U$, for which a solution is of the form $u = f(t, x)$, with $u_j = f_j(t, x)$ for $j = 1, \ldots, q$ in terms of components. In the following, we denote by $X = \mathbb{R}^p$, with coordinates $x_1, \ldots, x_p$, the space of the independent variables, and by $U = \mathbb{R}^q$, with coordinates $u$, that of the dependent variables.
We call the $n$-th order jet space $J^{(n)}$ the Cartesian product between the space of the independent variables $X$ and enough copies of the space of the dependent variables $U$ to include coordinates for each partial derivative of order less than or equal to $n$:
$$J^{(n)} = X \times \underbrace{U \times \cdots \times U}_{\binom{p+n}{n}}$$
In the above definition, the binomial coefficient $\binom{p+n}{n}$ corresponds to the number of partial derivatives (assumed to be smooth enough) of order less than or equal to $n$. A function $f: X \to U$ represented as $u = f(x)$ can naturally be prolonged to a function $u^{(n)} = f^{(n)}(x)$ from $X$ to $J^{(n)}$ by evaluating $f$ and the corresponding partial derivatives, so that $u^{(n)} = \left(\partial_x^{\alpha} u\right)_{|\alpha| \leq n}$, where $\partial_x^{\alpha} u$ is the spatial cross-derivative corresponding to the multi-index $\alpha = \left(i_1, \ldots, i_p\right) \in \mathbb{N}^p$. According to this formalism, a PDE system can then be written as
$$\Delta\left(t, x, u^{(n)}\right) = 0 \qquad (1)$$
where $\Delta$ is an operator from $\mathbb{R}^+ \times J^{(n)}$ to $\mathbb{R}^q$.
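As a concrete illustration of this formalism, consider the setting of the 2D heat equation studied in Section 4.3, with $p = 2$, $q = 1$ and $n = 2$: a function $u = f(x, y)$ is prolonged to
$$u^{(2)} = \left(u,\ u_x,\ u_y,\ u_{xx},\ u_{xy},\ u_{yy}\right),$$
which indeed gathers $\binom{p+n}{n} = \binom{4}{2} = 6$ partial derivatives of order at most 2, and the spatial operator $\Delta\left(t, x, u^{(2)}\right) = u_{xx} + u_{yy}$ appearing in the heat equation is a function on $\mathbb{R}^+ \times J^{(2)}$.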

2.2. Symmetry Group and Differential Invariants

We consider a Lie group $G$ of dimension $m$ acting as $g \cdot (x, u)$ on a sub-manifold $M \subset X \times U$, with its Lie algebra $\mathfrak{g}$ generated by the vector fields $\zeta_1, \ldots, \zeta_m$. We can define the transform of a function $u = f(t, x)$ under the action of $G$ by identifying $f$ with its graph $\Gamma_f$ and by defining $g \cdot f = f^g$, where $f^g$ is the function associated with the transformed graph $g \cdot \Gamma_f = \Gamma_{f^g}$.
A symmetry group of a PDE system is a group $G$ such that if $f$ is a solution, then its transform $f^g$ under the group action is also a solution. We then denote by $\mathrm{pr}^{(n)} G$ the prolongation of the group action of $G$ to $J^{(n)}$, for which a prolonged transform $g^{(n)}$, for $g \in G$, sends the graph $\Gamma_{f^{(n)}}$ onto $\Gamma_{(g \cdot f)^{(n)}}$, and by $\mathrm{pr}^{(n)} \zeta_1, \ldots, \mathrm{pr}^{(n)} \zeta_m$ the corresponding prolonged vector fields. The algebraic invariants $I_G$ of the prolonged group action $\mathrm{pr}^{(n)} G$ are called the differential invariants of order $n$ of the group $G$ and can be obtained by leveraging the infinitesimal invariance criterion $\mathrm{pr}^{(n)} \zeta_i \left(I_G\right) = 0$. A complete set of independent differential invariants of order $n$ in the sense of Theorem 2.17 of [6] is generically denoted by $\phi_{u,n}^{G} = \left(\phi_{u,n}^{G,1}, \ldots, \phi_{u,n}^{G,k}\right)$ in the sequel and is related to the symmetry group of PDE systems as illustrated in Section 4.2.
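As an elementary example of this infinitesimal criterion, consider the rotation generator $\zeta = -y\, \partial_x + x\, \partial_y$ of $SE(2)$ acting on $X = \mathbb{R}^2$ with a scalar dependent variable $u$. Its first prolongation is
$$\mathrm{pr}^{(1)} \zeta = -y\, \partial_x + x\, \partial_y - u_y\, \partial_{u_x} + u_x\, \partial_{u_y},$$
so that $\mathrm{pr}^{(1)} \zeta \left(u_x^2 + u_y^2\right) = -2\, u_x u_y + 2\, u_y u_x = 0$: the squared gradient norm $u_x^2 + u_y^2$ is a first-order differential invariant of the rotation group, and it indeed appears in the generating set of $SE(2)$ invariants used in Section 4.3.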

3. Equivariant Neural Networks

To incorporate the symmetry information of PDEs into the neural network solver, it is essential to introduce equivariance into the neural networks. Multiple approaches have been studied in the past few years, and they can be separated into two categories: G-CNN and steerable CNN. The first one is the approach we chose to work with and to generalize; the second one is explored in [23,28,30,31,32].

3.1. G-CNN

The idea behind a G-CNN is to perform the convolution over the group G with which one wants equivariance. This kind of convolution layer was first introduced by Cohen and Welling in [33] for discrete groups, and important work has since been carried out to generalize the approach to other groups [29,34,35,36]. Let us first start with some reminders about the group-based convolution operator and its properties.
Definition 1 (Group Convolution). Let $G$ be a compact group and $V_1, V_2$ two vector spaces. Let $K: G \to \mathcal{L}(V_1, V_2)$ be a kernel (with $\mathcal{L}(V_1, V_2)$ the space of linear maps from $V_1$ to $V_2$), $f: G \to V_1$ a feature function and $\mu$ the Haar measure on $G$. We define the group convolution for any $s \in G$ by
$$(K \star f)(s) = \int_G K\left(r^{-1} s\right) f(r)\, d\mu(r).$$
Proposition 1. If the action of $G$ on $V_1^G$ and $V_2^G$ is given by regular representations, then the group convolution of Definition 1 is $G$-equivariant.
As illustrated in Figure 1b, regular representations only allow limited group actions to be described. Indeed, this group convolution does not lift the constraint on the kernel (see [36]) if one wants equivariance to all kinds of actions and not only to those with regular representations.

3.2. A New Convolution

Definition 2 (Representative Group Convolution). Let $G$ be a compact group and $V_1, V_2$ two vector spaces. Let $K: G \to \mathcal{L}(V_1, V_2)$ be a kernel, $f: G \to V_1$ a feature function and $\mu$ the Haar measure on $G$. If $\rho_1: G \to GL(V_1)$ and $\rho_2: G \to GL(V_2)$ are the linear representations of the action of $G$ on $V_1$ and $V_2$, respectively, we define the representative group convolution for any $s \in G$ by
$$(K \star f)(s) = \int_G \rho_2(r)\, K\left(r^{-1} s\right)\, \rho_1(r)^{-1} f(r)\, d\mu(r).$$
Remark 1. In what follows, we keep the same definitions for $G$, $K$, $V$, $\rho$, $\mu$ and $f$.
Theorem 1. Under the hypotheses of Definition 2, let $V$ denote either $V_1$ or $V_2$ and $\rho$ either $\rho_1$ or $\rho_2$. If $G$ acts on $V^G$ by
$$(g \cdot f)(r) = \rho(g)\, f\left(g^{-1} r\right), \quad \forall g, r \in G \text{ and } f: G \to V,$$
then the representative group convolution is $G$-equivariant.
This new convolution layer is thus well suited to approximating an equivariant function, because it is itself equivariant by construction. However, it cannot be composed with non-equivariant operations without breaking the equivariance of the whole network. A single convolution layer alone is therefore of limited interest, but a chain of multiple convolution layers is much more powerful.
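To make Definition 2 and Theorem 1 concrete, the following minimal sketch (our own illustration, not the paper's implementation) discretizes the representative group convolution over the cyclic rotation group $C_4$, with $V_1 = V_2 = \mathbb{R}^2$ and $\rho_1 = \rho_2$ the rotation representation; the Haar integral then reduces to a sum over the four group elements, and the final assertion checks the equivariance property numerically on random data.

```python
# Representative group convolution over C4 with rho(g) = rotation by g * 90 degrees.
import numpy as np

G = [0, 1, 2, 3]                      # C4, composition = addition mod 4

def rho(g):
    """Rotation matrix by g * 90 degrees (representation of C4 on R^2)."""
    a = g * np.pi / 2
    return np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])

def rep_group_conv(K, f):
    """(K * f)(s) = sum_r rho(r) K(r^{-1} s) rho(r)^{-1} f(r), Haar = counting measure."""
    out = {}
    for s in G:
        acc = np.zeros(2)
        for r in G:
            r_inv_s = (s - r) % 4
            acc += rho(r) @ K[r_inv_s] @ rho(r).T @ f[r]   # rho(r)^{-1} = rho(r)^T here
        out[s] = acc
    return out

def act(g, f):
    """(g . f)(r) = rho(g) f(g^{-1} r), the action assumed in Theorem 1."""
    return {r: rho(g) @ f[(r - g) % 4] for r in G}

rng = np.random.default_rng(0)
K = {g: rng.normal(size=(2, 2)) for g in G}   # kernel K : G -> L(R^2, R^2)
f = {g: rng.normal(size=2) for g in G}        # feature  f : G -> R^2

for g in G:
    lhs = rep_group_conv(K, act(g, f))        # convolve the transformed feature
    rhs = act(g, rep_group_conv(K, f))        # transform the convolved feature
    assert all(np.allclose(lhs[s], rhs[s]) for s in G)
print("Representative group convolution is C4-equivariant on this example.")
```

Replacing $\rho$ by the identity in this sketch recovers the plain group convolution of Definition 1 and the regular-representation case of Proposition 1.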
Lemma 1. Any composition of $G$-equivariant functions is still $G$-equivariant.
Multiple representative group convolution layers can thus be composed to obtain a $G$-equivariant network. Note that the action of $G$ on the output of the $i$-th layer must match the action of $G$ on the input of the $(i+1)$-th layer.
In Table 1, there is a chain of representations $\rho_0, \ldots, \rho_L$, but what really matters is only the first one ($\rho_0$) and the last one ($\rho_L$). Indeed, by Lemma 1, the whole network is equivariant to $G$'s action with $\rho_0$ on the input and $\rho_L$ on the output. Thus, we have full choice over the other representations $\rho_\ell$ for $1 \leq \ell \leq L - 1$.
Remark 2. One can still use non-equivariant functions between two hidden layers of the network, as long as these functions are point-wise and the representations chosen for these hidden layers are regular. This covers the main usual architectures for convolutional neural networks.

Lifting the Coordinate Space

The representative group convolution cannot be used right away, since the convolution is constructed to be performed on $G$ and not on the input data space (denoted $X$ in the sequel). The problem is circumvented by lifting the coordinates from $X$ to $G$. More details on this method can be found in [29].
Definition 3 (Lifting). Let $Q = X/G$ be the set of orbits of $G$. If $u$ is a mapping from $X$ to $V$, we define its lifted version $\bar{u}: G \times Q \to V$ by
$$\bar{u}: (r, q) \mapsto u\left(r \cdot o_q\right),$$
where $o_q$ denotes a chosen origin of the orbit $q$. An element $x \in X$ is then lifted to a tuple $\left(r_x, q\right)$ such that $x = r_x \cdot o_q$.
Definition 4 (Lifted Action). If $G$ acts on $X \times V$, then it has an extended action on the lifted space $G \times X/G \times V$. If $\left(r, q, \bar{u}(r, q)\right)$ is a lifted element, then $g \in G$ acts by
$$g \cdot \left(r, q, \bar{u}(r, q)\right) = \left(g \cdot r \cdot o_q,\ \rho(g)\, u(r \cdot o_q)\right) = \left(g r \cdot o_q,\ \rho(g)\, u\big(g^{-1} \cdot (g r) \cdot o_q\big)\right) = \left(g r,\ q,\ \rho(g)\, \bar{u}\big(g^{-1} \cdot (g r), q\big)\right),$$
so that the lifted action on functions reads $(g \cdot \bar{u})(r, q) = \rho(g)\, \bar{u}\left(g^{-1} r, q\right)$.
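For instance, with $X = \mathbb{R}^2$ and $G = SE(2)$ as in Section 4.3, the action is transitive, so $Q = X/G$ reduces to a single orbit with origin $o = (0, 0)$: a point $x \in \mathbb{R}^2$ is lifted to any $r_x = \left(R_\theta, x\right) \in SE(2)$ satisfying $r_x \cdot o = x$, and a scalar field $u$ on $\mathbb{R}^2$ becomes the function $\bar{u}\left(R_\theta, t\right) = u(t)$ on the group, constant along the rotation fibre; in practice, the lifted feature map is typically sampled over a finite set of rotation angles, as in [29].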

4. Solving PDEs with ENN

We discuss in this section two ways of using ENN for solving PDEs: first, using G-CNN to generalize the PINN concept, and then building symmetry-preserving Finite Difference schemes in which ENN serve as differential invariant approximators.

4.1. Equivariant PINN

The idea behind PINNs is somewhat straightforward. Let us consider the PDE defined in Section 2.1 but with added boundary conditions on a set $B$, which gives:
$$(E): \begin{cases} \Delta\left(t, x, u^{(n)}\right) = 0, & (t, x) \in \mathbb{R}^+ \times X \\ u(x) = u_b(x), & x \in B \end{cases}$$
Now, we directly estimate the solution $u$ of $(E)$ at time $t_0 + dt$ with an Equivariant Neural Network (ENN) $\mathcal{N}_\theta$ parameterized by $\theta$, taking as input the initial profile of the solution, i.e., $u$ at time $t_0$. This ENN is equivariant to the symmetry group of $(E)$.
In order to train the ENN $\mathcal{N}_\theta$ to approximate the solution of the PDE, we introduce the following optimization problem $(P)$:
$$(P): \quad \theta^* = \arg\min_\theta\, \mathcal{L}(\theta, T)$$
with the loss function $\mathcal{L}$ defined by
$$\mathcal{L}(\theta, T) = w_f\, \mathcal{L}_f\left(\theta, T_f\right) + w_b\, \mathcal{L}_b\left(\theta, T_b\right)$$
where
$$\mathcal{L}_f\left(\theta, T_f\right) = \frac{1}{\left|T_f\right|} \sum_{x \in T_f} \left\| \Delta\left(t, x, \hat{u}^{(n)}\right) \right\|_2^2$$
$$\mathcal{L}_b\left(\theta, T_b\right) = \frac{1}{\left|T_b\right|} \sum_{x \in T_b} \left\| B\left(\hat{u}_\theta, x\right) \right\|_2^2$$
with boundary conditions expressed as $B\left(u_b, x\right) = 0$ on $B$. The sets $T_f$ and $T_b$ are two training sets of randomly distributed points, with $T_b \subset B$ and $T_f \subset X$.
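As an illustration of this training objective, the following sketch (a simplified stand-in, not the paper's code) implements the loss $\mathcal{L} = w_f \mathcal{L}_f + w_b \mathcal{L}_b$ in PyTorch for the 2D heat equation residual $u_t - u_{xx} - u_{yy}$, with a plain multilayer perceptron used as a placeholder for the equivariant network $\mathcal{N}_\theta$; the collocation sets $T_f$ and $T_b$ are random toy samples.

```python
import torch

net = torch.nn.Sequential(            # placeholder for the equivariant network N_theta
    torch.nn.Linear(3, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1)
)

def residual(txy):
    """PDE residual Delta(t, x, u^(n)) = u_t - u_xx - u_yy at collocation points."""
    txy = txy.requires_grad_(True)
    u = net(txy)
    grads = torch.autograd.grad(u.sum(), txy, create_graph=True)[0]
    u_t, u_x, u_y = grads[:, 0], grads[:, 1], grads[:, 2]
    u_xx = torch.autograd.grad(u_x.sum(), txy, create_graph=True)[0][:, 1]
    u_yy = torch.autograd.grad(u_y.sum(), txy, create_graph=True)[0][:, 2]
    return u_t - u_xx - u_yy

def loss(T_f, T_b, u_b, w_f=1.0, w_b=1.0):
    L_f = residual(T_f).pow(2).mean()              # interior (collocation) term
    L_b = (net(T_b) - u_b).pow(2).mean()           # boundary term B(u_hat, x)
    return w_f * L_f + w_b * L_b

# toy usage: random collocation points and boundary points on the edge x = 0
T_f = torch.rand(256, 3)                           # (t, x, y) in the interior
T_b = torch.rand(64, 3); T_b[:, 1] = 0.0
print(loss(T_f, T_b, u_b=torch.zeros(64, 1)).item())
```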
Remark 3. A similar approach has been used by Wang et al. in [28], but with steerable neural networks. They used U-Net and ResNet architectures, common for these types of tasks [37], which they made equivariant in order to predict the PDE solutions. To tackle the constraints on the kernels, they manually design specific transformations making their neural networks equivariant to four different actions. Our result is meant to be more general than this case-by-case design: we bypass the kernel constraints and obtain a network that is fully equivariant to any given group.

4.2. Symmetry-Preserving Finite Difference

We restrict ourselves in the following to PDE systems of order $n$ with a linear dependency with respect to the time derivatives. According to the introduced formalism, we only consider systems of the form
$$\sum_{i=1}^{k_t} a_i\, \partial_t^i u = \Delta\left(t, x, u^{(n)}\right) \qquad (2)$$
where $k_t \in \mathbb{N}$ and $\Delta$ is an operator from $\mathbb{R}^+ \times J^{(n)}$ to $\mathbb{R}^q$. The above form covers most of the PDEs encountered in physics, ranging from the heat and wave equations to the Navier–Stokes, Schrödinger and Maxwell equations. Assuming that the above PDE system is regular enough, it admits $G$ as a symmetry group if and only if the operator $\Delta$ can be expressed as a function of a complete set of differential invariants, i.e., if and only if (2) can be rewritten as
$$\sum_{i=1}^{k_t} a_i\, \partial_t^i u = F\left(\phi_{u,n}^{G,1}, \ldots, \phi_{u,n}^{G,k}\right)$$
with $F: \mathbb{R}^{qk} \to \mathbb{R}^q$.
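For instance, the 2D heat equation $u_t = u_{xx} + u_{yy}$ considered in Section 4.3 is of the form (2) with $k_t = 1$, $a_1 = 1$, $p = 2$ and $q = 1$, and it admits $SE(2)$ as a symmetry group since its right-hand side is itself one of the generating differential invariants listed in Section 4.3.1:
$$\partial_t u = F\left(\phi_{u,2}^{SE(2)}\right) = u_{xx} + u_{yy},$$
with $F$ simply selecting the Laplacian component $u_{xx} + u_{yy}$ of the generating set.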
We propose in the following to approximate the differential invariants by neural networks that are equivariant to the corresponding group action. More precisely, let us consider a discretization $x^{(1)}, \ldots, x^{(n_x)}$ (resp. $t^{(1)}, \ldots, t^{(n_t)}$) of the input space $X$ (resp. of the time interval $\mathbb{R}^+$) and denote $f^{(i,j)} = f\left(t^{(i)}, x^{(j)}\right)$ for any $f: \mathbb{R}^+ \times \mathbb{R}^p \to \mathbb{R}^q$. For a given differential invariant $\phi_f^G$, we are then interested in approximating the vector $\left(\phi_f^{G,(i,j)}\right)_{j=1}^{n_x}$ by the output of an equivariant neural network $\mathcal{N}^G$ taking as input the tensor $\left(f^{(i,\ell)}\right)_{\ell=1}^{n_x}$, so that we have for the $j$-th component
$$\left[\mathcal{N}^G\left(\left(f^{(i,\ell)}\right)_{\ell=1}^{n_x}\right)\right]_j \approx \phi_f^{G,(i,j)} = \phi_f^G\left(t^{(i)}, x^{(j)}\right)$$
In the following, we propose to train $\mathcal{N}^G$ on multivariate polynomial functions in the space variables $x$ of degree $d$, but other choices could be envisioned depending on the considered problem.
The use of equivariant neural networks is motivated here by the fact that the operator $f \mapsto \phi_f^G$ that we are approximating is itself equivariant. Indeed, for a function $f: \mathbb{R}^+ \times \mathbb{R}^p \to \mathbb{R}^q$, $\phi_f^G$ is a function from $\mathbb{R}^+ \times \mathbb{R}^p$ to $\mathbb{R}^q$, and we can therefore consider its transform $g \cdot \phi_f^G$ under the action of $G$ through its transformed graph, as defined in Section 2.2. As the differential invariant is an algebraic invariant of the prolonged group action $\mathrm{pr}^{(n)} G$, it is possible to write
$$g \cdot \phi_f^G = \phi_{g \cdot f}^G$$
meaning that differential invariant operators are equivariant with respect to the associated group action, as illustrated in Figure 2 for the case of $SE(2)$.
We now come back to the PDE system (2) and detail the numerical scheme that we propose for its integration. The idea is to first train an ENN $\mathcal{N}_i^G$ to approximate each of the $k$ differential invariants $\phi_{\cdot,n}^{G,i}$ involved, and then to integrate with an explicit scheme in which the differential invariants are replaced by their ENN approximations, leading to
$$\sum_{m=1}^{k_t} a_m\, \partial_t^m u\left(t^{(i)}, x^{(j)}\right) \approx F\left(\left[\mathcal{N}_1^G\left(\left(u^{(i,\ell)}\right)_{\ell=1}^{n_x}\right)\right]_j, \ldots, \left[\mathcal{N}_k^G\left(\left(u^{(i,\ell)}\right)_{\ell=1}^{n_x}\right)\right]_j\right)$$

4.3. Numerical Experiments

4.3.1. Approximating SE(2) Differential Invariants

Here, we considered the case of $SE(2)$, for which a generating set of second-order differential invariants is given by
$$\phi_{u,2}^{SE(2)} = \left(u,\ u_x^2 + u_y^2,\ u_{xx} + u_{yy},\ u_x^2 u_{xx} + 2 u_x u_y u_{xy} + u_y^2 u_{yy},\ u_{xx}^2 + 2 u_{xy}^2 + u_{yy}^2\right)$$
We trained two neural networks, namely one conventional Convolutional Neural Network $\mathcal{N}^{\mathbb{R}^2}$ with $\mathbb{R}^2$-equivariant layers and one $SE(2)$ ENN $\mathcal{N}^{SE(2)}$, both built to have roughly the same number of parameters ($\approx 2.2 \times 10^6$). The training set consists of $29 \times 29$ evaluations of 2D polynomials in $\mathbb{R}[X, Y]$ of degree up to 10, generated from random coefficients drawn uniformly in $[-1, 1]$. The polynomial evaluations were performed on the discrete grid $\left(i/29, j/29\right)_{i,j = -14, \ldots, 14}$. An example of prediction with the trained $\mathcal{N}^{SE(2)}$, together with the corresponding theoretical value, is given in Figure 3.
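The following sketch shows one possible way to generate such training pairs (our assumption of the exact setup, interpreting "degree up to 10" as total degree and spelling out only the invariant $u_x^2 + u_y^2$): random polynomials and their analytical derivatives are evaluated on the $29 \times 29$ grid, and the other generators of $\phi_{u,2}^{SE(2)}$ would be obtained in the same way from higher-order derivatives.

```python
import numpy as np

rng = np.random.default_rng(0)
deg = 10
grid = np.arange(-14, 15) / 29.0                     # grid i/29, i = -14..14
X, Y = np.meshgrid(grid, grid, indexing="ij")        # 29 x 29 evaluation points

def random_poly_sample():
    """Evaluate a random 2D polynomial and the target invariant u_x^2 + u_y^2."""
    c = rng.uniform(-1.0, 1.0, size=(deg + 1, deg + 1))   # coefficients of x^i y^j
    u = np.zeros_like(X); u_x = np.zeros_like(X); u_y = np.zeros_like(X)
    for i in range(deg + 1):
        for j in range(deg + 1):
            if i + j > deg:
                continue                                  # keep total degree <= 10
            u += c[i, j] * X**i * Y**j
            if i > 0:
                u_x += c[i, j] * i * X**(i - 1) * Y**j    # analytical d/dx
            if j > 0:
                u_y += c[i, j] * j * X**i * Y**(j - 1)    # analytical d/dy
    return u, u_x**2 + u_y**2                             # (input, target invariant)

inputs, targets = zip(*(random_poly_sample() for _ in range(8)))
print(np.stack(inputs).shape, np.stack(targets).shape)    # (8, 29, 29) twice
```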

4.3.2. Solving the 2D Heat Equation

Here, we consider the 2D heat equation $u_t = u_{xx} + u_{yy}$ defined on a square domain $[-a, a] \times [-b, b]$, with the boundary condition $u = 0$ on the boundary except on the top edge $y = b$, where $u(t, x, b) = 100$ (the "top 100" condition of Figure 4), and the initial condition $u_{t=0} = f$, for $f: \mathbb{R}^2 \to \mathbb{R}$ an arbitrary function. Below, we give the results obtained by using an FD scheme relying on the approximation of the 2D Laplacian, as described in Section 4.2, i.e., by computing the solution according to the following update rule:
$$u_{n+1} = u_n + \delta t \times \mathcal{N}^{SE(2)}_{\Delta}\left(u_n\right)$$
where $u_n = u(n \times \delta t, \cdot)$. We ran $10^5$ steps of the simulation with $\delta t = 10^{-7}$ using the two trained architectures considered in Section 4.3.1, namely $\mathcal{N}^{\mathbb{R}^2}$ and $\mathcal{N}^{SE(2)}$, and compared the obtained heat profiles with the ground truth. The boundary condition was taken into account by overriding the predicted outputs with the conventional second-order derivative approximation for the corner cases. The obtained results are depicted in Figure 4, where we can in particular observe the strong benefit of preserving the SE(2) symmetry during the numerical integration.
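The following minimal sketch outlines the resulting explicit scheme; the trained $SE(2)$-equivariant approximator of the Laplacian invariant is replaced here by a classical 5-point stencil placeholder so that the snippet is self-contained, and the grid size, time step and boundary handling are illustrative choices rather than the paper's exact settings.

```python
import numpy as np

n, dx, dt, steps = 29, 1.0 / 29, 1e-7, 1000

def laplacian_net(u):
    """Placeholder for the ENN approximator of u_xx + u_yy (5-point stencil here)."""
    lap = np.zeros_like(u)
    lap[1:-1, 1:-1] = (u[2:, 1:-1] + u[:-2, 1:-1] + u[1:-1, 2:] + u[1:-1, :-2]
                       - 4.0 * u[1:-1, 1:-1]) / dx**2
    return lap

u = np.zeros((n, n))
u[0, :] = 100.0                      # "top 100" boundary condition
for _ in range(steps):
    u = u + dt * laplacian_net(u)    # explicit update u_{n+1} = u_n + dt * N(u_n)
    u[0, :], u[-1, :], u[:, 0], u[:, -1] = 100.0, 0.0, 0.0, 0.0   # re-impose boundaries
print(u[1:4, n // 2])                # temperature just below the hot edge
```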

5. Conclusions and Further Work

We presented two innovative ways of using ENN to solve PDEs while exploiting the associated symmetries. We first showed that G-CNN can be used to generalize the PINN architecture to encode generic symmetries, and we then proposed using ENN to approximate the differential invariants of a given symmetry group, hence allowing symmetry-preserving Finite Difference methods to be built. Our approach was illustrated on the 2D heat equation, for which we showed in particular that a set of fundamental differential invariants of SE(2) can be efficiently approximated by ENN for arbitrary functions by training on simple bivariate polynomial evaluations, allowing SE(2) symmetry-preserving discretization schemes to be easily built.
Additional work will include proper benchmarking of the two approaches against more conventional numerical schemes for PDE integration. More complex PDEs with richer symmetry groups, such as the Maxwell equations, could be considered in this context.

Author Contributions

Both authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

Funding

Eliot Tron contributed to this work during an internship at Thales Research and Technology in 2021.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019, 378, 686–707. [Google Scholar] [CrossRef]
  2. Lu, L.; Meng, X.; Mao, Z.; Karniadakis, G.E. DeepXDE: A deep learning library for solving differential equations. arXiv 2020, arXiv:cs.LG/1907.04502. [Google Scholar] [CrossRef]
  3. Sirignano, J.; Spiliopoulos, K. DGM: A Deep Learning Algorithm for Solving Partial Differential Equations. J. Comput. Phys. 2018, 375, 1339–1364. [Google Scholar] [CrossRef] [Green Version]
  4. Raissi, M.; Yazdani, A.; Karniadakis, G.E. Hidden fluid mechanics: Learning velocity and pressure fields from flow visualizations. Science 2020, 367, 1026–1030. [Google Scholar] [CrossRef] [PubMed]
  5. Wang, R.; Kashinath, K.; Mustafa, M.; Albert, A.; Yu, R. Towards Physics-Informed Deep Learning for Turbulent Flow Prediction. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD ’20, Virtual Event, 6–10 July 2020; Association for Computing Machinery: New York, NY, USA, 2020; pp. 1457–1466. [Google Scholar] [CrossRef]
  6. Olver, P.J. Applications of Lie Groups to Differential Equations; Graduate Texts in Mathematics; Springer: New York, NY, USA, 1993. [Google Scholar]
  7. Fushchich, W.; Nikitin, A. Symmetries of Maxwell’s Equations; Mathematics and Its Applications; Springer: Dordrecht, The Netherlands, 2013. [Google Scholar]
  8. Morrison, P. Structure and structure-preserving algorithms for plasma physics. Phys. Plasmas 2016, 24, 055502. [Google Scholar] [CrossRef] [Green Version]
  9. Kraus, M. Metriplectic Integrators for Dissipative Fluids. In Geometric Science of Information; Nielsen, F., Barbaresco, F., Eds.; Springer International Publishing: Cham, Switzerland, 2021; pp. 292–301. [Google Scholar]
  10. Coquinot, B.; Morrison, P.J. A general metriplectic framework with application to dissipative extended magnetohydrodynamics. J. Plasma Phys. 2020, 86, 835860302. [Google Scholar] [CrossRef]
  11. Luesink, E.; Ephrati, S.; Cifani, P.; Geurts, B. Casimir preserving stochastic Lie-Poisson integrators. arXiv 2021, arXiv:2111.13143. [Google Scholar]
  12. Zhu, A.; Jin, P.; Tang, Y. Deep Hamiltonian networks based on symplectic integrators. arXiv 2020, arXiv:2004.13830. [Google Scholar]
  13. Dorodnitsyn, V. Finite Difference Models Entirely Inheriting Symmetry of Original Differential Equations. Int. J. Mod. Phys. C 1994, 5, 723–734. [Google Scholar] [CrossRef]
  14. Shokin, Y.I. The Method of Differential Approximation; Translated by Roesner, K.; Computational Physics Series; Springer: Berlin/Heidelberg, Germany, 1983. [Google Scholar]
  15. Olver, P.J. Geometric Foundations of Numerical Algorithms and Symmetry. Appl. Algebra Eng. Commun. Comput. 2001, 11, 417–436. [Google Scholar] [CrossRef] [Green Version]
  16. Chhay, M.; Hamdouni, A. Lie Symmetry Preservation by Finite Difference Schemes for the Burgers Equation. Symmetry 2010, 2, 868. [Google Scholar] [CrossRef]
  17. Razafindralandy, D.; Hamdouni, A. Subgrid models preserving the symmetry group of the Navier–Stokes equations. C. R. Méc. 2005, 333, 481–486. [Google Scholar] [CrossRef]
  18. Brandstetter, J.; Welling, M.; Worrall, D.E. Lie Point Symmetry Data Augmentation for Neural PDE Solvers. arXiv 2022, arXiv:2202.07643. [Google Scholar]
  19. Bronstein, M.M.; Bruna, J.; Cohen, T.; Veličković, P. Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges. arXiv 2021, arXiv:2104.13478. [Google Scholar]
  20. Gerken, J.E.; Aronsson, J.; Carlsson, O.; Linander, H.; Ohlsson, F.; Petersson, C.; Persson, D. Geometric Deep Learning and Equivariant Neural Networks. arXiv 2021, arXiv:2105.13926. [Google Scholar]
  21. Cohen, T.; Welling, M. Group Equivariant Convolutional Networks. In Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 20–22 June 2016; Balcan, M.F., Weinberger, K.Q., Eds.; PMLR: New York, NY, USA, 2016; Volume 48, pp. 2990–2999. [Google Scholar]
  22. Cohen, T.S.; Geiger, M.; Weiler, M. A General Theory of Equivariant CNNs on Homogeneous Spaces. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Wallach, H., Larochelle, H., Beygelzimer, A., Alché-Buc, F., Fox, E., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2019; Volume 32, pp. 9145–9156. [Google Scholar]
  23. Weiler, M.; Cesa, G. General E(2)-Equivariant Steerable CNNs. arXiv 2019, arXiv:1911.08251. [Google Scholar]
  24. Worrall, D.; Welling, M. Deep Scale-spaces: Equivariance Over Scale. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2019; Volume 32. [Google Scholar]
  25. Kondor, R.; Trivedi, S. On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; Dy, J., Krause, A., Eds.; PMLR: Stockholm, Sweden, 2018; Volume 80, pp. 2747–2755. [Google Scholar]
  26. Elesedy, B.; Zaidi, S. Provably Strict Generalisation Benefit for Equivariant Models. arXiv 2021, arXiv:2102.10333. [Google Scholar]
  27. Gerken, J.E.; Carlsson, O.; Linander, H.; Ohlsson, F.; Petersson, C.; Persson, D. Equivariance versus Augmentation for Spherical Images. arXiv 2022, arXiv:2202.03990. [Google Scholar]
  28. Wang, R.; Walters, R.; Yu, R. Incorporating Symmetry into Deep Dynamics Models for Improved Generalization. arXiv 2020, arXiv:2002.03061. [Google Scholar]
  29. Finzi, M.; Stanton, S.; Izmailov, P.; Wilson, A.G. Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data. arXiv 2020, arXiv:2002.12880. [Google Scholar]
  30. Cohen, T.S.; Welling, M. Steerable CNNs. arXiv 2016, arXiv:1612.08498. [Google Scholar]
  31. Lang, L.; Weiler, M. A Wigner-Eckart Theorem for Group Equivariant Convolution Kernels. arXiv 2020, arXiv:2010.10952. [Google Scholar]
  32. Cohen, T.S.; Weiler, M.; Kicanaoglu, B.; Welling, M. Gauge Equivariant Convolutional Networks and the Icosahedral CNN. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 10–15 June 2019. [Google Scholar]
  33. Cohen, T.S.; Welling, M. Group Equivariant Convolutional Networks. arXiv 2016, arXiv:1602.07576. [Google Scholar]
  34. Cohen, T.S.; Geiger, M.; Weiler, M. Intertwiners between Induced Representations (with Applications to the Theory of Equivariant Neural Networks). arXiv 2018, arXiv:1803.10743. [Google Scholar]
  35. Kondor, R.; Trivedi, S. On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups. arXiv 2018, arXiv:1802.03690. [Google Scholar]
  36. Cohen, T.; Geiger, M.; Weiler, M. A General Theory of Equivariant CNNs on Homogeneous Spaces. arXiv 2018, arXiv:1811.02017. [Google Scholar]
  37. Wang, R.; Kashinath, K.; Mustafa, M.; Albert, A.; Yu, R. Towards Physics-informed Deep Learning for Turbulent Flow Prediction. arXiv 2019, arXiv:1911.08655. [Google Scholar]
Figure 1. Action of $G$ with various representations. (a) Regular representation $\rho(g) = \mathrm{id}$ on scalars. (b) Regular representation $\rho(g) = \mathrm{id}$ on vectors. (c) Non-regular representation on vectors.
Figure 2. From left to right and top to bottom: the initial function $u$, its rotated version $\tilde{u}$, the rotated version of the $SE(2)$ differential invariant $u_x^2 + u_y^2$ (see Section 4.3) and the differential invariant of the rotated function. As expected, computing the differential invariant from $u$ and applying the rotation (bottom left) gives the same results as computing the differential invariant from the rotated function (bottom right).
Figure 3. The $SE(2)$ differential invariant $u_x^2 + u_y^2$ computed for the function $u$ depicted in Figure 2 with an SE(2)-CNN (left) and its theoretical value (right).
Figure 4. Comparison of the theoretical heat profile of the 2D heat equation with a top 100 boundary condition (see Section 4.3.2) with those obtained through simulation with two symmetry-preserving FD schemes (see Section 4.2) by leveraging $\mathbb{R}^2$ (middle) and SE(2) (right) equivariant neural networks.
Table 1. Example of an Equivariant Neural Network.
Input layer: $N_0 = f \in V_0^G$
Convolution layers: $N_\ell = K_\ell \star N_{\ell-1} \in V_\ell^G$, with representations $\left(\rho_{\ell-1}, \rho_\ell\right)$, for $1 \leq \ell \leq L$.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
