**2. PINN Architecture**

Knowing the characteristics of the solution to the differential equation under consideration is very helpful when designing the PINN architecture, including its structure, number of hidden layers, activation function, etc. For this reason, the PINN developed here has one input node *x* (the independent variable representing the spatial coordinate), one hidden layer consisting of *N* nodes and one output node *y* (the dependent variable representing pressure). Figure 1 depicts a graphical illustration of the present architecture, which, when trained, solves both the IVP example and the Reynolds BVP considered here.

**Figure 1.** Architecture of the PINN employed to solve the IVP and BVP considered here.

The Sigmoid function, i.e.,

$$\phi(\xi) = \frac{1}{1 + e^{-\xi}},\tag{1}$$

which maps $\mathbb{R}$ to $(0, 1)$ and exhibits the property

$$
\phi'(\xi) = \phi(\xi)(1 - \phi(\xi)),\tag{2}
$$

is employed as the activation function for the hidden layer. This means that the neural network has $3N + 1$ trainable parameters: the weights $w\_i^{(0)}$ and biases $b\_i^{(0)}$ of the nodes in the hidden layer, the weights $w\_i^{(1)}$, $i = 1 \ldots N$, of the synapses connecting them with the output node, plus the bias $b^{(1)}$ applied at the output node.
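The sigmoid and the identity (2) are easy to verify numerically. The following short Python sketch (using NumPy) implements (1) and (2) and checks the derivative property against a central finite difference:

```python
import numpy as np

def phi(xi):
    """Sigmoid activation, Eq. (1): maps R into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-xi))

def phi_prime(xi):
    """Derivative of the sigmoid via the property (2)."""
    s = phi(xi)
    return s * (1.0 - s)

# Numerical check of (2) against a central finite difference.
xi, h = 0.7, 1e-6
fd = (phi(xi + h) - phi(xi - h)) / (2.0 * h)
assert abs(fd - phi_prime(xi)) < 1e-8
```

The property (2) is what makes the symbolic differentiation in the next paragraphs convenient: every derivative of the network can be written in terms of $\phi$ itself.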

Based on this particular architecture, the output $z\_i$ of each node in the hidden layer is

$$z\_i(\mathbf{x}) = \phi\left(w\_i^{(0)}\mathbf{x} + b\_i^{(0)}\right). \tag{3}$$

The output value is then given by the weighted sum of the hidden-layer outputs, using the weights of the synapses connecting them to the output node, plus the output bias, and yields

$$y(\mathbf{x}) = b^{(1)} + \sum\_{i=1}^{N} w\_i^{(1)} z\_i(\mathbf{x}) = b^{(1)} + \sum\_{i=1}^{N} w\_i^{(1)} \phi \left( w\_i^{(0)} \mathbf{x} + b\_i^{(0)} \right). \tag{4}$$
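As an illustration of (3) and (4), the forward pass can be sketched in a few lines of Python (NumPy assumed; the value of $N$ and the random initialisation are purely illustrative, not the trained parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10  # number of hidden nodes (illustrative choice)

# Trainable parameters: 3N + 1 in total.
w0 = rng.normal(size=N)   # hidden-layer weights w_i^(0)
b0 = rng.normal(size=N)   # hidden-layer biases  b_i^(0)
w1 = rng.normal(size=N)   # output weights       w_i^(1)
b1 = 0.0                  # output bias          b^(1)

def phi(xi):
    """Sigmoid activation, Eq. (1)."""
    return 1.0 / (1.0 + np.exp(-xi))

def y(x):
    """Network output, Eq. (4), for a scalar input x."""
    z = phi(w0 * x + b0)       # hidden-layer outputs, Eq. (3)
    return b1 + np.dot(w1, z)  # weighted sum plus output bias
```

Note that the output node is linear: the sigmoid acts only in the hidden layer, so $y$ is an affine combination of the $z\_i$.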

Let us now construct the cost function which the network will be trained to minimise. While the cost function appearing in a typical machine learning procedure is just the quadratic difference between the predicted and the target values, it will here be defined by means of the operators $\mathcal{L}$ and $\mathcal{B}$. The cost function applied here reads

$$C = \left\langle \left(\mathcal{L}y - f\right)^2 \right\rangle + \left(\left(\mathcal{B}y - \mathbf{b}\right) \cdot \mathbf{e}\_1\right)^2 + \left(\left(\mathcal{B}y - \mathbf{b}\right) \cdot \mathbf{e}\_2\right)^2,\tag{5}$$

where $\langle f \rangle$ denotes the average value of $f$. Defining the cost in terms of the operators $\mathcal{L}$ and $\mathcal{B}$ is exactly the feature that makes an ANN "physics informed", i.e., a PINN.
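As a sketch (not the implementation used here) of how (5) is evaluated in practice: the average $\langle \cdot \rangle$ becomes a mean of the squared ODE residual $\mathcal{L}y - f$ over a set of collocation points, and the two boundary terms are added as squared residuals. All names below are illustrative:

```python
import numpy as np

def cost(residual, x_col, bc_residuals):
    """Cost function (5): mean squared ODE residual over the
    collocation points plus the squared boundary residuals."""
    interior = np.mean(residual(x_col) ** 2)      # <(Ly - f)^2>
    boundary = sum(r ** 2 for r in bc_residuals)  # ((By - b) . e_k)^2 terms
    return interior + boundary

# Toy usage: a candidate that satisfies the ODE exactly has zero
# interior residual; one boundary residual of 0.5 remains.
x_col = np.linspace(0.0, 1.0, 50)
c = cost(lambda x: np.zeros_like(x), x_col, [0.5, 0.0])
assert abs(c - 0.25) < 1e-12
```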

Since $\mathcal{L}$ is a differential operator, the cost function contains derivatives of the network output (4). In order to obtain an expression for the cost function in terms of the input $x$, the weights $w$ and biases $b$, the network output (4) must be differentiated twice with respect to (w.r.t.) $x$. This can be accomplished by some kind of automatic differentiation (AD) (also referred to as algorithmic differentiation, computer differentiation, auto-differentiation or simply autodiff), a computerised methodology based on the chain rule which can be applied to efficiently and accurately evaluate derivatives of numeric functions, see e.g., [10,11]. The present work instead applies symbolic differentiation in order to clearly explain all the essential details of the PINN. Indeed, differentiating (4) once yields

$$\begin{split} y'(\mathbf{x}) &= \frac{\partial}{\partial \mathbf{x}} \left( \left( \sum\_{i=1}^{N} w\_i^{(1)} z\_i(\mathbf{x}) \right) + b^{(1)} \right) = \frac{\partial}{\partial \mathbf{x}} \left( \left( \sum\_{i=1}^{N} w\_i^{(1)} \phi \left( w\_i^{(0)} \mathbf{x} + b\_i^{(0)} \right) \right) + b^{(1)} \right) = \\ &= \sum\_{i=1}^{N} w\_i^{(1)} w\_i^{(0)} \phi' \left( w\_i^{(0)} \mathbf{x} + b\_i^{(0)} \right) = \sum\_{i=1}^{N} w\_i^{(1)} w\_i^{(0)} \phi \left( w\_i^{(0)} \mathbf{x} + b\_i^{(0)} \right) \left( 1 - \phi \left( w\_i^{(0)} \mathbf{x} + b\_i^{(0)} \right) \right), \end{split} \tag{6}$$

and, because of (2), a consecutive differentiation then yields

$$\begin{split} y''(\mathbf{x}) &= \frac{\partial}{\partial \mathbf{x}} \sum\_{i=1}^{N} w\_i^{(1)} w\_i^{(0)} \phi' \left( w\_i^{(0)} \mathbf{x} + b\_i^{(0)} \right) = \sum\_{i=1}^{N} w\_i^{(1)} \left( w\_i^{(0)} \right)^2 \phi'' \left( w\_i^{(0)} \mathbf{x} + b\_i^{(0)} \right) = \\ &= \sum\_{i=1}^{N} w\_i^{(1)} \left( w\_i^{(0)} \right)^2 \phi' \left( w\_i^{(0)} \mathbf{x} + b\_i^{(0)} \right) \left( 1 - 2\phi \left( w\_i^{(0)} \mathbf{x} + b\_i^{(0)} \right) \right) = \\ &= \sum\_{i=1}^{N} w\_i^{(1)} \left( w\_i^{(0)} \right)^2 \phi \left( w\_i^{(0)} \mathbf{x} + b\_i^{(0)} \right) \left( 1 - \phi \left( w\_i^{(0)} \mathbf{x} + b\_i^{(0)} \right) \right) \left( 1 - 2\phi \left( w\_i^{(0)} \mathbf{x} + b\_i^{(0)} \right) \right). \end{split} \tag{7}$$
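The closed-form derivatives (6) and (7) can be verified against finite differences. The following Python sketch (NumPy assumed, random illustrative parameters) implements both expressions and checks them numerically:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 8
w0, b0, w1 = rng.normal(size=(3, N))  # illustrative parameters
b1 = 0.2

def phi(xi):
    return 1.0 / (1.0 + np.exp(-xi))

def y(x):
    """Network output, Eq. (4)."""
    return b1 + np.dot(w1, phi(w0 * x + b0))

def dy(x):
    """First derivative, Eq. (6): sum of w1*w0*phi*(1-phi)."""
    s = phi(w0 * x + b0)
    return np.dot(w1 * w0, s * (1.0 - s))

def d2y(x):
    """Second derivative, Eq. (7): sum of w1*w0^2*phi'*(1-2*phi)."""
    s = phi(w0 * x + b0)
    return np.dot(w1 * w0**2, s * (1.0 - s) * (1.0 - 2.0 * s))

# Finite-difference verification of (6) and (7).
x, h = 0.4, 1e-5
assert abs((y(x + h) - y(x - h)) / (2 * h) - dy(x)) < 1e-6
assert abs((y(x + h) - 2 * y(x) + y(x - h)) / h**2 - d2y(x)) < 1e-3
```

In an AD-based implementation these closed forms would be generated automatically; writing them out explicitly, as done here, exposes how each derivative reduces to evaluations of $\phi$ via the property (2).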

Moreover, finding the set of weights and biases minimising the cost function requires its partial derivatives w.r.t. each weight and bias defining the PINN. In the subsections below, we present how to achieve this by first considering a first-order differential equation with an analytical solution, and thereafter the classical Reynolds equation, which is a second-order (linear) ODE describing laminar flow of incompressible and iso-viscous fluids in narrow interfaces.
