compute the mean of analysis ensemble
ua[:,k+1] = np.mean(uai,1)
km = km+1
✝ ✆
```
Figure 5 shows the EnKF results for the Lorenz 63 system. We see that the analysis trajectory is close to the true one and more accurate than the background. Readers are encouraged to play with the codes to explore the effect of increasing or decreasing the ensemble size with different levels of noise. Furthermore, different observation operators can be defined (for instance, observe only 1 or 2 variables, or assume some nonlinear function *h*(·)).

**Figure 5.** EnKF results for the Lorenz 63 system with virtual observations.

For the sake of completeness, we also sketch the DEnKF predictions in Figure 6. The same implementation in Listing 19 can be adopted, but calling the DEnKF function framed in Listing 18 instead of EnKF. Although different approaches might give slightly dissimilar results, we are not trying to benchmark them in this introductory presentation since we are only showing very simple implementation, with idealized twin experiments.

**Figure 6.** DEnKF results for the Lorenz 63 system without virtual observations.

#### **9. Applications**

In this section, we gradually increase the complexity of the test cases using fluid dynamics applications. In Section 9.1, we slightly increase the dimensionality of the system from 3 (as in Lorenz 63 system) to 36 using the Lorenz 96 system and demonstrate the capability of DA algorithms to treat uncertainty in initial conditions. This is further extended in Section 9.2, where we show that DA can recover the hidden underlying processes and provide closure effects using the two-level variant of the Lorenz 96 model. We also introduce the utilization of an inflation factor and its impact to mitigate ensemble collapse and account for a slight under-representation of covariance due to the use of a small ensemble in EnKF and DEnKF. In Section 9.3, we illustrate the application of DA on systems governed by partial differential equations (PDEs) using the Kuramoto–Sivashinsky equation. We highlight that for each application, we only show results for a few selected algorithms and extensions to other approaches covered in this tutorial are left to readers as computer projects. We also emphasize that we are demonstrating the implementation and capabilities of the presented DA algorithms, not assessing their performance nor benchmarking different approaches against each other.

#### *9.1. Lorenz 96 System*

The Lorenz 96 model [33] is a system of ordinary differential equations that describes an arbitrary atmospheric quantity as it evolves on a circular array of sites, undergoing forcing, dissipation, and rotation invariant advection [34]. The Lorenz 96 dynamical model can be written as

$$\frac{d\mathbf{X}\_i}{dt} = (\mathbf{X}\_{i+1} - \mathbf{X}\_{i-2})\mathbf{X}\_{i-1} - \mathbf{X}\_i + \mathbf{F}\_i \quad i = 1, 2, \dots, n\_\prime \tag{61}$$

where *Xi* is the state of the system at the *i th* location and *F* represents a forcing constant. Periodicity is enforced by assuming that *X*−<sup>1</sup> = *Xn*−1, *X*<sup>0</sup> = *Xn*, and *Xn*+<sup>1</sup> = *X*1. In the present study, we use *n* = 36, and *F* = 8 defining a forcing term. In order to obtain a valid initial condition, we begin at *t* = −5 using equilibrium conditions defined as (*Xi* = *F* for *i* = 1, 2, ... , *n*) and adding a small perturbation to the 20th state variable as *X*<sup>20</sup> = *F* + 0.01. Then, ODE integrator is run up to *t* = 0 and solution at *t* = 0 is treated at the true initial conditions for our twin experiment. We assume a background erroneous initial condition by contaminating the true one with Gaussian noise with zero mean and standard deviation of 1.

A total time window of 20 time units is considered, with a time step of Δ*t* = 0.01 and the RK4 schemes is adopted for time integration. Synthetic measurements are collected at every 0.2 time unit (i.e., each 20 time integration steps) sampled at 9 equidistant locations (i.e., at *i* ∈ {4, 8, 12, ... , 36}) from true trajectory assuming that sensors add a white noise with zero mean and a standard deviation of 0.1. We also assume a process noise drawn from a multivariate Gaussian distribution with zero mean and covariance matrix **Q** defined as **Q** = 0.12**I**36. We first apply the EKF approach to correct the solution trajectory, which yields very good results as shown in Figure 7. For visualization, we only plot the time evolution of *X*9, *X*18, and *X*36. We see that the Lorenz 96 is sensitive to the initial conditions and small perturbation is sufficient to produce a very different trajectory (e.g., background solution).

**Figure 7.** EKF results for the Lorenz 96 system. The trajectories of *X*9, *X*18, and *X*<sup>36</sup> are shown.

The second approach to test is the stochastic version of EnKF. We create an ensemble of 50 members to approximate the covariance matrices. Results are depicted in Figure 8 for *X*9, *X*18, and *X*36. We highlight that observations appear only in the *X*<sup>36</sup> plot because observations are collected at *i* = 4, 8, 12, . . . , 36 (neither *X*<sup>9</sup> nor *X*<sup>18</sup> are measured). We highlight that, generally speaking, increasing the size of ensemble improves the predictions. However, this comes on the

expense of the solution of the forward nonlinear model for each added member. Thus, a compromise between the accuracy and computational burden is in place.

**Figure 8.** EnKF results for the Lorenz 96 system using an ensemble of 50 members.

#### *9.2. Two-Level Lorenz 96 System*

In this section, we describe the two-level variant of the Lorenz 96 model proposed by Lorenz [33]. The two-level Lorenz 96 model can be written as

$$\frac{dX\_i}{dt} = -X\_{i-1}(X\_{i-2} - X\_{i+1}) - X\_i - \frac{hc}{b} \sum\_{j=1}^{J} Y\_{j,i} + F\_{\prime} \tag{62}$$

$$\frac{dY\_{\bar{j},i}}{dt} = -cbY\_{\bar{j}+1,i}(Y\_{\bar{j}+2,i} - Y\_{\bar{j}-1,i}) - cY\_{\bar{j},i} + \frac{hc}{b}X\_{i\nu} \tag{63}$$

where Equation (62) represents the evolution of slow, high-amplitude variables *Xi* (*i* = 1, ... , *n*), and Equation (63) provides the evolution of a coupled fast, low-amplitude variable *Yj*,*<sup>i</sup>* (*j* = 1, ... , *J*). We use *n* = 36 and *J* = 10 in our computational experiments. We utilize *c* = 10 and *b* = 10, which implies that the small scales fluctuate 10 times faster than the larger scales. Furthermore, the coupling coefficient *h* between two scales is equal to 1 and the forcing is set at *F* = 10 to make both variables exhibit the chaotic behavior.

We utilize the fourth-order Runge–Kutta numerical scheme with a time step Δ*t* = 0.001 for temporal integration of the Lorenz 96 model. We apply the periodic boundary condition for the slow variables, i.e., *Xi*−*<sup>n</sup>* = *Xi*+*<sup>n</sup>* = *Xi*. The fast variables are extended by letting *Yj*,*i*−*<sup>n</sup>* = *Yj*,*i*+*<sup>n</sup>* = *Yj*,*i*, *Yj*−*J*,*<sup>i</sup>* = *Yj*,*i*−1, and *Yj*+*J*,*<sup>i</sup>* = *Yj*,*i*<sup>+</sup>1. The physical initial condition is computed by starting with an equilibrium condition at time *t* = −5 for slow variables. The equilibrium condition for slow variables is *Xi* = *F* for *i* ∈ 1, 2, . . . , *n*. We perturb the equilibrium solution for the 18th state variable as

*X*<sup>18</sup> = *F* + 0.01. At the time *t* = −5, the fast variables are assigned with random numbers between −*F*/10 to *F*/10. We integrate a two-level Lorenz 96 model by solving both Equations (62) and (63) in a coupled manner up to time *t* = 0. The solution at time *t* = 0 represent the true initial condition. For our twin experiment, we obtain observations by adding noise drawn from the Gaussian distribution with zero mean and *σ*<sup>2</sup> *<sup>o</sup>* = 1.0. We assume that observations are sparse in space and are collected at every 10th time step.

The motivation behind this example is to demonstrate how covariance inflation can be utilized to account for the model error. Usually the imperfections in the forecast model is taken into account by adding a Gaussian noise to the forecast model. Another method to account for model error is covarinace inflation. It also helps in alleviating the effect finite number of ensemble members in practical data assimilation and addresses the problem of covariance underestimation in the EnKF algorithm. We use the multiplicative inflation [35] where the ensemble members are pushed away from the ensemble mean by a given inflation factor and mathematically it can be expressed as

$$\mathbf{u}\_{b}^{(i)}(t\_{k+1}) \leftarrow \mathbf{u}\_{a}(t\_{k+1}) + \lambda \cdot (\mathbf{u}\_{b}^{(i)}(t\_{k+1}) - \mathbf{u}\_{a}^{(i)}(t\_{k+1})),\tag{64}$$

where *λ* is the inflation factor. The inflation factor can be a constant scalar over the entire domain at all time step or it can space and time dependent.

In this example, we discard parameterizations of fast variables in the forecast model. The forecast model for two-level Lorenz system with no parameterizations is equivalent to setting the coupling coefficient *h* = 0 in Equation (62) and it reduces to one-level Lorenz 96 model as presented in Section 9.1. We note here that the observations used for data assimilation are obtained by solving a two-level Lorenz 96 model in a coupled manner (i.e., without discarding fast-variables). Therefore, the effect of unresolved scales is embedded in observations. The parameterization of fast variables (i.e., *hc <sup>b</sup>* <sup>∑</sup>*<sup>J</sup> <sup>j</sup>*=<sup>1</sup> *Yj*,*<sup>i</sup>* term in Equation (62)) can be considered as an added noise to the true state of the system for a one-level Lorenz 96 model. Figure 9 displays the RMSE for a two-level Lorenz system when 18 observations are used for DA with different number of ensemble members and inflation factors for EnKF and DEnKF algorithms. Figure 10 shows the full state trajectory of two-level Lorenz system corresponding to minimum RMSE, which is obtained with 50 ensemble members and inflation factor *λ* = 1.04 for the EnKF algorithm. The parameters corresponding to minimum RMSE for the DEnKF algorithm are 45 ensemble members and *λ* = 1.05.

**Figure 9.** RMSE for a two-level Lorenz model for different combinations of number of ensembles and inflation factor.

**Figure 10.** Full state trajectory of the multiscale Lorenz 96 model with no parameterizations in the forecast model. The EnKF algorithm uses the inflation factor *λ* = 1.04 and *N* = 50 and the DEnKF uses the inflation factor *λ* = 1.05 and *N* = 45. The observation data for both EnKF an DEnKF algorithm is obtained by adding measurement noise to the exact solution of the two-level Lorenz 96 system.

#### *9.3. Kuramato Sivashinsky*

In this section, we describe the Kuramoto–Sivashinsky (K-S) equation derived by Kuramoto [36], which is used as a turbulence model for different flows. The one-dimensional K-S equation can be written as

$$\frac{\partial u}{\partial t} = -\nu \frac{\partial^4 u}{\partial x^4} - \frac{\partial^2 u}{\partial x^2} - \nu \frac{\partial u}{\partial x},\tag{65}$$

where *ν* is the viscosity coefficient. The K-S equation is characterized by the second-order unstable diffusion term responsible for an instability at large scales, the fourth-order stabilizing viscous term that provides damping at small scales, and a quadratic nonlinear term which transfers energy between large and small scales. We use the computational domain extending from 0 to *L*, i.e., *x* ∈ [0, *L*] and time *t* ∈ [0, ∞]. We impose the Dirichlet and Neumann boundary conditions as given below

$$
\mu(0, t) = \mu(L, t) = 0,\tag{66}
$$

$$
\left.\frac{\partial u}{\partial \mathbf{x}}\right|\_{\mathbf{x}=0} = \left.\frac{\partial u}{\partial \mathbf{x}}\right|\_{\mathbf{x}=L} = 0.\tag{67}
$$

We spatially discretize the domain with the grid size Δ*x* = *L*/(*n* − 1), where *n* is the degrees of freedom. We set *L* = 50 and *n* = 129 for our numerical experiments. The state of the system at discretized grid is denoted as *ui* = *u*((*i* − 1)Δ*x*) for *i* = 1, ... , *n*. Using the second-order finite difference discretization, the discretized K-S equation can be written as

$$\frac{du\_i}{dt} = -\nu \frac{u\_{i+2} - 4u\_{i+1} + 6u\_i - 4u\_{i-1} + u\_{i-2}}{\Delta x^4} - \frac{u\_{i+1} - 2u\_i + u\_{i-1}}{\Delta x^2} - \frac{1}{2} \frac{u\_{i+1}^2 - u\_{i-1}^2}{2\Delta x}.\tag{68}$$

The first term on the right hand side is computed by utilizing ghost nodes and the Neumann boundary condition is assigned for ghost points. We impose *u*<sup>0</sup> = *u*<sup>2</sup> and *un*+<sup>1</sup> = *un*−<sup>1</sup> using the second-order discretization for Equation (67) at boundary points *u*<sup>1</sup> and *un*, respectively.

We use the fourth-order Runge–Kutta scheme for time integration with a time step Δ*t* = 0.25. To generate an initial condition for the forward run we start with an equilibrium condition at time *t* = −50 and integrate up to time *t* = 0. The equilibrium condition for the model is *ui* = 0.1 for *j* ∈ {1, ... , *n*}. Once the true initial condition is generated, we run the forward solver up to time *t* = 50. We test the prediction capability of sequential data assimilation algorithms for forecast up to *t* = 50. The K-S equation exhibits different levels of chaotic behavior depending on the value of viscosity coefficient *ν*. The chaos depends upon the bifurcation parameter *L*˜ = *L*/2*π* <sup>√</sup>*ν*. We utilize *<sup>ν</sup>* <sup>=</sup> 1/2 which represent the less chaotic behaviour. The observations for twin experiments are obtained by adding some noise to the true state of the system to account for experimental uncertainties and measurement errors. The observations are also sparse in time, meaning that the time interval between two observations can be different from the time step of the forecast model. For our twin experiments, we assume that observations are recorded at every 10th time step of the model for *ν* = 1/2. Therefore, the time difference between two observations is *δt* = 2.5 for the K-S equation. We present results for the EnKF algorithm with three sets of observations. The first set of observations is very sparse with only 12.5% of the full state of the system. The first set utilizes observations for states [*u*8, *<sup>u</sup>*16, ... , *<sup>u</sup>*128] <sup>∈</sup> *<sup>R</sup>*16. In a second set of observations we employ observations at [*u*4, *<sup>u</sup>*8, ... , *<sup>u</sup>*128] <sup>∈</sup> *<sup>R</sup>*<sup>32</sup> for the assimilation. The third set of observations consists of 50% of the full state of the system, i.e., observations at states [*u*2, *<sup>u</sup>*4, ... , *<sup>u</sup>*128] <sup>∈</sup> *<sup>R</sup>*<sup>64</sup> for the assimilation. We apply *<sup>σ</sup>*<sup>2</sup> *<sup>o</sup>* <sup>=</sup> 1.0 <sup>×</sup> <sup>10</sup>−<sup>2</sup> and *<sup>σ</sup>*<sup>2</sup> *<sup>i</sup>* <sup>=</sup> 1.0 <sup>×</sup> <sup>10</sup>−<sup>2</sup> as the variance of observation noise and initial condition uncertainty, respectively.

In Figure 11, we present the time evolution of selected states for three different number of observations included in the assimilation of the EnKF algorithm. There is an excellent agreement between true and assimilated states *u*<sup>51</sup> and *u*101, for which observations are not present. We also provide the full state trajectory of the K-S equation in Figure 12. The results obtained clearly indicate that the EnKF algorithm is able to determine the correct state trajectory even when the observation data are very sparse, i.e., *m* = 16. With an increase in the number of observations, the prediction of the full state trajectory gets smoother, and almost the exact state is recovered with 50% observations.

**Figure 11.** Selected trajectories of the Kuramoto–Sivashinsky model (*ν* = 1/2) with the analysis performed by the ensemble Kalman filter (EnKF) using observations from *m* = 16 (**left**), *m* = 32 (**middle**), and *m* = 64 (**right**) state variables at every 10 time steps.

**Figure 12.** Full state trajectory of the Kuramoto–Sivashinsky model (*ν* = 1/2) with the analysis performed by the ensemble Kalman filter (EnKF) using observations from *m* = 16 (**left**), *m* = 32 (**middle**), and *m* = 64 (**right**) state variables at every 10 time steps.

#### *9.4. Quasi-Geostrophic (QG) Ocean Circulation Model*

We consider a simple single-layer QG model to illustrate the application of sequential data assimilation for two-dimensional flows. Specifically, we use the deterministic ensemble Kalman filter (DenKF) algorithm discussed in Section 8.1 to improve the prediction of the single-layer QG model. The wind-driven oceanic flows exhibit a vast range of spatio-temporal scales and modeling of these scales with all the relevant physics has always been challenging. The barotropic vorticity equation (BVE) with various dissipative and forcing terms is one of the most commonly used models for geostrophic flows [37,38]. The dimensionless vorticity-streamfunction formulation for the BVE [39] with forcing and dissipative terms can be written as

$$\frac{\partial\omega}{\partial t} + J(\omega, \psi) - \frac{1}{\text{Ro}}\frac{\partial\psi}{\partial x} = \frac{1}{\text{Re}}\nabla^2\omega + \frac{1}{\text{Ro}}\sin(\pi y),\tag{69}$$

where *<sup>ω</sup>* is the vorticity, *<sup>ψ</sup>* is the streamfunction, <sup>∇</sup><sup>2</sup> is the standard two-dimensional Laplacian operator, Re is the Reynolds number, and Ro is the Rossby number. The kinematic relation between vorticity and streamfunction is given by the following Poisson equation

$$
\nabla^2 \psi = -\omega.\tag{70}
$$

The nonlinear convection term is given by the Jacobian as follows

$$J(\omega, \psi) = \frac{\partial \psi}{\partial y} \frac{\partial \omega}{\partial x} - \frac{\partial \psi}{\partial x} \frac{\partial \omega}{\partial y}. \tag{71}$$

The computational domain for the QG model is (*x*, *y*) ∈ [0, 1] × [−1, 1] and is discretized using <sup>128</sup> <sup>×</sup> 256 grid resolution. Therefore, the QG model has the dimension of about 3.2 <sup>×</sup> 104. We utilize the homogeneous Dirichlet boundary condition for the vorticity and streamfunction at all boundaries. The vorticity and streamfunction is initialized from quiescent state, i.e., *ωt*=<sup>0</sup> = *ψ*|*t*=<sup>0</sup> = 0. The QG model is numerically solved by discretizing Equation (69) using second-order finite difference scheme. The nonlinear Jacobian term is discretized with the energy-conserving Arakawa [40] numerical scheme. A third-order total-variation-diminishing Runge–Kutta scheme is used for the temporal integration and a fast sine transform

Poisson solver is utilized to update streamfunction from the vorticity [41]. For the physical parameters, we use values of Re <sup>=</sup> 100 and Ro <sup>=</sup> 1.75 <sup>×</sup> <sup>10</sup>−3.

The QG model is integrated with a constant time step of 5 <sup>×</sup> <sup>10</sup>−<sup>5</sup> from time *<sup>t</sup>* <sup>=</sup> 0 to *<sup>t</sup>* <sup>=</sup> 0.25 to generate the true initial condition at final time *t* = 0.25. Then the data assimilation is conducted from time *t* = 0.25 to *t* = 0.4 with observations getting assimilated at every tenth time step. The synthetic observations are generated by sampling vorticity field on 16 equidistant points in *x* and *y* directions respectively and then adding the Gaussian noise, i.e., **<sup>v</sup>***<sup>k</sup>* ∼ N (0, **<sup>R</sup>***k*), where **<sup>R</sup>***<sup>k</sup>* <sup>=</sup> *<sup>σ</sup>*<sup>2</sup> *<sup>b</sup>* **I**. We set the observation noise variance at *σ*2 *<sup>b</sup>* = 5. The typical vorticity and streamfunction field along with the locations of measurements are shown in Figure 13.

**Figure 13.** A Typical vorticity (left) and streamfunction (right) field for the single-layer QG model. The dots shows the locations of observations.

We employ 20 ensemble members for the DEnKF algorithm. The initialization of the ensemble members is an important step to get accurate prediction with any type of the EnKF algorithm. We initialize different ensemble members by randomly selecting the vorticity field snapshots between time *t* = 0.24 to *t* = 0.25. The other methods such as adding a random perturbation from the Gaussian distribution to the true initial condition can also be adopted. Figure 14 displays the vorticity field and the predicted vorticity field at three different time instances along with the difference (error) between the two. We can see that the true and analysis field are similar at all time instances and the magnitude of error is also small. We recall that we observe only around 2% of the system (i.e., observations at 16 × 32 locations). With more observations, the quality of the results can be further improved.

**Figure 14.** Snapshots of the true vorticity field (**left**), analysis estimate of the DEnKF algorithm (**middle**), and the difference (error) between the two fields (**right**) obtained for a particular run of the single-layer QG model. The snapshots of vorticity field are plotted at *t* = 0.3, 0.35, 0.4 (from top to bottom).

#### **10. Concluding Remarks**

In this tutorial paper, we provided a 101 introduction to common data assimilation techniques. In particular, we briefly covered the relevant mathematical foundation and the algorithmic steps for three dimensional variational (3DVAR), four dimensional variational (4DVAR), forward sensitivity method (FSM), and Kalman filtering approaches. Since it is considered as a first exposure to DA, we focused on the simplest implementations that anybody can easily follow. For example, to treat nonlinearity (e.g., in 3DVAR, 4DVAR, FSM and EKF), we only presented the first order Taylor expansions. We demonstrated the execution of the covered approaches with a series of Python modules that can be linked to each other easily. Again, we preferred to keep our codes as concise and simple as possible, even if it comes on the expense of computational efficiency. The Python codes used to generate this tutorial are publicly available through our GitHub repository https://github.com/Shady-Ahmed/PyDA.

Since it is introductory exploration, we should admit that we have bypassed a few important analyses and shortcut some key derivations. Interested readers are referred to well-established textbooks that offer in-depth discussions about various DA techniques [14,27,29,42–47]. Likewise, more advanced topics such as the particle filters [48], maximum likelihood ensemble filters [49,50], optimal sensor placement [51,52], higher-order analysis of variational methods [53], or hybrid methods [54–59] are omitted in our current presentation.

**Author Contributions:** Data curation, S.E.A., S.P., and O.S.; Supervision, O.S.; Writing—original draft, S.E.A., and S.P.; and Writing—review and editing, S.E.A., S.P., and O.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research under Award Number DE-SC0019290. Omer San gratefully acknowledges their support. Disclaimer: This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.

**Acknowledgments:** We thank Sivaramakrishnan Lakshmivarahan for his insightful comments as well as his archival NPTEL lectures [60] on dynamic data assimilation that greatly helped us in developing the PyDA module. Special thanks go to Ionel Michael Navon for providing his lecture notes on data assimilation.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

**Changhong Mou 1, Zhu Wang 2, David R. Wells 3, Xuping Xie <sup>4</sup> and Traian Iliescu 1,\***


**Abstract:** Reduced order models (ROMs) are computational models whose dimension is significantly lower than those obtained through classical numerical discretizations (e.g., finite element, finite difference, finite volume, or spectral methods). Thus, ROMs have been used to accelerate numerical simulations of many query problems, e.g., uncertainty quantification, control, and shape optimization. Projection-based ROMs have been particularly successful in the numerical simulation of fluid flows. In this brief survey, we summarize some recent ROM developments for the quasi-geostrophic equations (QGE) (also known as the barotropic vorticity equations), which are a simplified model for geophysical flows in which rotation plays a central role, such as wind-driven ocean circulation in mid-latitude ocean basins. Since the QGE represent a practical compromise between efficient numerical simulations of ocean flows and accurate representations of large scale ocean dynamics, these equations have often been used in the testing of new numerical methods for ocean flows. ROMs have also been tested on the QGE for various settings in order to understand their potential in efficient numerical simulations of ocean flows. In this paper, we survey the ROMs developed for the QGE in order to understand their potential in efficient numerical simulations of more complex ocean flows: We explain how classical numerical methods for the QGE are used to generate the ROM basis functions, we outline the main steps in the construction of projection-based ROMs (with a particular focus on the under-resolved regime, when the closure problem needs to be addressed), we illustrate the ROMs in the numerical simulation of the QGE for various settings, and we present several potential future research avenues in the ROM exploration of the QGE and more complex models of geophysical flows.

**Keywords:** reduced order models; quasi-geostrophic equations; closure models

#### **1. Introduction**

*1.1. Reduced Order Models (ROMs)*

Reduced order modeling aims at answering the following question:

For a given system, what is the model with the minimum number of degrees of freedom? (1)

The resulting models, called reduced order models (ROMs), can decrease the computational cost of traditional full order models (FOMs) (i.e., models obtained through classical numerical discretizations, such as finite element, finite difference, finite volume, or spectral methods) by orders of magnitude without a significant decrease in numerical accuracy. Thus, ROMs can be used in the efficient numerical simulation of problems that require numerous runs, e.g., uncertainty quantification, control, and shape optimization.

ROMs come in different flavors. Projection ROMs have been used in the numerical simulation of both nonlinear [1–3] and linear [4] systems. In particular, projection ROMs

**Citation:** Mou, C.; Wang, Z.; Wells, D.R.; Xie, X.; Iliescu, T. Reduced Order Models for the Quasi-Geostrophic Equations: A Brief Survey. *Fluids* **2021**, *6*, 16. https:// doi.org/10.3390/fluids6010016

Received: 01 December 2020 Accepted: 24 December 2020 Published: 31 December 2020

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

have been successful in the numerical simulation of complex fluid flows [2,5–7]. In this survey, we exclusively consider projection ROMs that answer question (1) as follows:

To construct the ROM, use numerical or experimental data to find the "best" basis. (2)

Once the "best" basis is found, the ROM is constructed by using projection methods. In Galerkin projection ROMs, the trial and test spaces are the same; in Petrov-Galerkin projection ROMs, the trial and test spaces are different. In this paper, we focus on Galerkin projection ROMs.

Specifically, to approximate the dynamics of a flow variable *u* of a given system

$$\stackrel{\bullet}{u} = f(u)\,,\tag{3}$$

the ROM strategy proceeds as follows:

#### **Algorithm 1** ROM Strategy

1: Use numerical or experimental data to choose modes {*ϕ*1, ... ,*ϕR*}, which represent the recurrent spatial structures in the flow.


$$
\stackrel{\bullet}{a} = F(a),
\tag{4}
$$

where *a*(*t*)=(*ai*(*t*))*i*=1,..., *<sup>r</sup>* is the vector of coefficients in the Galerkin truncation in step (3) and *F* comprises the ROM operators.


At this point, several remarks are in place.

**ROMs are Galerkin methods with a data-driven basis**: First, we note that the general form of projection ROMs (outlined in Algorithm 1) is strikingly similar to the general form of classical Galerkin methods used in a finite element, spectral, or spectral element context. Conceptually, the main difference between ROMs and classical Galerkin discretizations is the way the basis is constructed: In classical Galerkin methods, the basis is universal, i.e., it is the same for all the problems. For example, for finite elements, the basis functions are piecewise polynomial functions on a given mesh. In projection ROMs, however, the basis is a *data-driven basis*, i.e., a basis constructed from problem data. Thus, the ROM basis is adapted to the specific problem (see steps (1)–(2) in Algorithm 1): Once the problem changes, the ROM basis changes accordingly.

While the choice of basis is the main conceptual difference between ROMs and classical Galerkin methods, this choice can make a tremendous difference in the computational cost: For example, for a two-dimensional flow past a circular cylinder at a Reynolds number Re <sup>=</sup> 1000, a finite element discretization requires <sup>O</sup>(105) degrees of freedom, whereas a

ROM requires O(10) degrees of freedom [8,9]. Thus, for this particular test case, *the ROM dimension is four orders of magnitude lower than the FOM dimension*.

**Recurrent, Dominant, Coherent Spatial Structures**: ROMs do not work well for all problems. ROMs are numerical methods and, like any other numerical method, ROMs work well for certain classes of problems and not so well for other classes of problems. One class of problems for which ROMs have been particularly successful is flows that display *recurrent, dominant, coherent structures*. A classical example in this class is the twodimensional flow past a circular cylinder, which has become the workhorse of ROMs for fluid flows [5,6,9]. The flow past a circular cylinder displays coherent spatial structures (the von Karman vortex street) that continuously recur in time. One can show that a few such structures have significantly higher kinetic energy content than the remaining structures, and, therefore, are expected to dominate the dynamics of the underlying system. Indeed, as mentioned above, for the two-dimensional flow past a circular cylinder, the dimension of the ROM constructed with these dominating structures can be four orders of magnitude lower than the FOM dimension. Thus, for this test problem, ROMs work extremely well. Other problems that display recurrent, dominant, coherent structures, for which ROMs work well, include: (i) lid driven cavity flow [10]; (ii) flow past a backward facing step [11]; (iii) flow in a constrained channel [12,13]; and (iv) flow in the boundary layer of a pipe [2].

We emphasize, however, that there are classes of problems for which ROMs do not work well. Homogeneous flows are one such example. Indeed, for homogeneous flows, it was proved in [2,14] that one of the most popular ROM techniques yields a ROM basis that is identical to the Fourier basis. Thus, in this case, the resulting ROM is nothing but a spectral method, which does not reduce the FOM dimension.

The take-home message is that *ROMs are appropriate for problems that display recurrent, coherent, dominant spatial structures. However, for problems that do not display these types of spatial structures (e.g., homogeneous flows), ROMs are not appropriate since they cannot reduce the FOM computational cost*.

#### *1.2. ROMs for the Quasi-Geostrophic Equations*

*ROMs are an excellent fit for the numerical investigation of ocean flows.* Indeed, largescale ocean circulation includes large-scale coherent structures (gyres) that recur in time and permanent gyres (e.g., the Sargasso Sea) that have a relatively high kinetic energy content. Thus, as pointed out above, ROMs could enable an efficient and relatively accurate numerical simulation of large scale ocean circulation, decreasing the FOM computational cost by orders of magnitude and making possible efficient ensemble calculation and uncertainty quantification for climate modeling and weather prediction.

However, generating FOM data to build the ROM basis can be a daunting task. Specifically, using an accurate mathematical model (e.g., the Boussinesq equations), including all the relevant flow variables, and using realistic parameters, could require enormous computational resources on state-of-the-art computational platforms, both in terms of CPU time and memory. Thus, various *simplified mathematical models* for the large scale ocean circulation have been proposed over the years [15–17]. These simplified models are constructed by using *asymptotic expansions* with respect to both the time scales and the length scales. The *rotation* and *stratification* are the two main effects that are used to construct simplified models for geophysical flows.

One of the most popular simplified models for large scale ocean circulation is the *quasi-geostrophic equations (QGE)* (also known as the barotropic vorticity equations), which were proposed in the late 1940s by Jule Charney [18]. The QGE are a simplified model for geophysical flows in which rotation plays a central role, such as wind-driven ocean circulation in mid-latitude ocean basins. Specifically, the QGE ensure a near-geostrophic balance, i.e., the pressure gradient almost balances the Coriolis force (which is due to rotation). The one-layer QGE do not include stratification effects, but the *N*-layer or continuously stratified QGE model stratification.

The computational cost of numerical simulations of large scale ocean flows is significantly lower for the QGE than for the full-fledged Boussinesq equations. Since the QGE represent a practical compromise between the efficient numerical simulations of ocean flows and the accurate representation of large scale ocean dynamics, these equations have been often used in the testing of new numerical methods for ocean flows. Thus, to understand the potential of using ROMs for the efficient numerical simulation of ocean flows, ROMs have been tested on the QGE for various parameter settings. Of course, once the ROMs are calibrated for the simplified (yet relevant) setting of the QGE, they should be extended to more realistic mathematical models, such as the Boussinesq equations. In this brief survey, we summarize some of the ROM developments for the QGE.

The rest of the paper is organized as follows: In Section 2, we present and discuss the QGE. In Section 3, we summarize the main types of numerical discretizations used to generate the FOM data for the ROM construction. In Section 4, we present the Galerkin ROM approach for the QGE. In Section 5, we illustrate numerically the QGE reduced order modeling for one test case. Finally, in Section 6, we present conclusions and outline open problems in the reduced order modeling of the QGE.

#### **2. Quasi-Geostrophic Equations (QGE)**

The QGE describe the motion of stratified, rotating flows, and have been used extensively for modeling mid-latitude oceanic and atmospheric circulations. In 1950, a singlelayer quasi-geostrophic model was used for modeling the atmospheric dynamics in the first successful numerical weather prediction performed on the ENIAC digital computer [18], which led to "enormous scientific advance", in Richardson's words [15,19,20]. Since then, the QGE have been widely investigated and applied in weather prediction and climate modeling.

The QGE can be derived from the primitive equations, that is, the incompressible Navier–Stokes equations under the Boussinesq approximation in a rotating framework [21–24]. The equations in Cartesian coordinates on a plane Ω tangent to the sphere read:

$$\frac{Du}{Dt} - f\_c v = -\frac{1}{\rho} \frac{\partial p}{\partial x} + \frac{\partial}{\partial x} \left( \mathcal{A} \frac{\partial u}{\partial x} \right) + \frac{\partial}{\partial y} \left( \mathcal{A} \frac{\partial u}{\partial y} \right) + \frac{\partial}{\partial z} \left( \nu\_E \frac{\partial u}{\partial z} \right), \tag{5}$$

$$\frac{D\upsilon}{Dt} + f\_{\varepsilon}u = -\frac{1}{\rho}\frac{\partial p}{\partial y} + \frac{\partial}{\partial x}\left(\mathcal{A}\frac{\partial \upsilon}{\partial x}\right) + \frac{\partial}{\partial y}\left(\mathcal{A}\frac{\partial \upsilon}{\partial y}\right) + \frac{\partial}{\partial z}\left(\upsilon\_{\mathbb{E}}\frac{\partial \upsilon}{\partial z}\right),\tag{6}$$

$$0 = -\frac{1}{\rho} \frac{\partial p}{\partial z} - \mathbf{g},\tag{7}$$

$$0 = \frac{\partial u}{\partial \mathbf{x}} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z'},\tag{8}$$

$$\frac{D\rho}{Dt} = \frac{\partial}{\partial x}\left(\mathcal{A}\frac{\partial\rho}{\partial x}\right) + \frac{\partial}{\partial y}\left(\mathcal{A}\frac{\partial\rho}{\partial y}\right) + \frac{\partial}{\partial z}\left(\kappa\_E \frac{\partial\rho}{\partial z}\right),\tag{9}$$

where *u*, *v*, and *w* are velocity components in the *x*, *y*, and *z* directions, *<sup>D</sup> Dt* <sup>=</sup> *<sup>∂</sup> <sup>∂</sup><sup>t</sup>* <sup>+</sup> *<sup>u</sup> <sup>∂</sup> <sup>∂</sup><sup>x</sup>* + *v ∂ <sup>∂</sup><sup>y</sup>* <sup>+</sup> *<sup>w</sup> <sup>∂</sup> <sup>∂</sup><sup>z</sup>* is the material derivative, *ρ* is density, *p* is the pressure, *fc* is the Coriolis force, and eddy viscosity and diffusivity coefficients A, *νE*, and *κ<sup>E</sup>* are either constant or functions of flow variables and grid parameters. The dimensionless Rossby number Ro is defined as Ro = *<sup>U</sup> fcL* , in which *U* and *L* represent the velocity and length scale of the geophysical flows. The Rossby number essentially characterizes the strength of inertia compared to the Coriolis and pressure forces. Another dimensionless number is the Ekman number, which is defined as Ek = *<sup>ν</sup><sup>E</sup>* <sup>Ω</sup>*H*<sup>2</sup> , with *H* the vertical extent of the flow. Since Ro is the ratio of the respective scales *<sup>U</sup>*<sup>2</sup> *<sup>L</sup>* and *fcU* of the first two terms in (5) and (6), and since Ek measures the ratio of viscous forces to Coriolis forces, when both *Ro* and *Ek* are much smaller than 1 (e.g.,

*Ro* = 0.0036 in Section 5), the Coriolis term dominates the left hand sides of momentum Equations (5) and (6), and the equations can be simplified, yielding the geostrophic balance:

$$-f\_{\varepsilon}v = -\frac{1}{\rho} \frac{\partial p}{\partial \mathbf{x}'}\tag{10}$$

$$f\_c u = -\frac{1}{\rho} \frac{\partial p}{\partial y}.\tag{11}$$

The resulting system reaches an equilibrium state in which the pressure gradient balances perfectly with the Coriolis force. When the Rossby and Ekman numbers are still small, but not nearly zero, the flow only achieves a near-geostrophic balance. Considering the beta-plane approximation *fc* = *f*<sup>0</sup> + *βy* and ignoring the stratification effect, one can obtain the single layer QGE by regular perturbation analysis [17,25,26]. The resulting equations are usually put in the following streamfunction-potential vorticity two-dimensional formulation:

$$\frac{\partial q}{\partial t} + J(q, \psi) = \text{Re}^{-1} \Delta q + F\_{\text{e}} \tag{12a}$$

$$q = -\text{Ro}\,\Delta\psi + y\_\prime \tag{12b}$$

where Ro = *<sup>U</sup> <sup>β</sup>L*<sup>2</sup> is the redefined Rossby number [27–29], Re <sup>=</sup> *UL <sup>A</sup>* is the Reynolds number, *ψ* is the streamfunction, *q* is the potential vorticity, *J*(*q*, *ψ*) = *<sup>∂</sup><sup>q</sup> ∂x ∂ψ <sup>∂</sup><sup>y</sup>* <sup>−</sup> *<sup>∂</sup><sup>q</sup> ∂y ∂ψ <sup>∂</sup><sup>x</sup>* is the Jacobian, *Fe* is the external forcing, and *βy* measures the beta-plane effect from the Coriolis force due to rotation. By eliminating the potential vorticity, we can obtain the pure streamfunction formulation:

$$-\frac{\partial}{\partial t}(\Delta \psi) + \text{Re}^{-1}\Delta^2 \psi - J(\Delta \psi, \psi) - \text{Re}^{-1}\frac{\partial \psi}{\partial x} = \text{Re}^{-1}F\_{\varepsilon}.\tag{13}$$

Equations (12) and (13) are supplemented by boundary conditions, such as *ψ* = *∂ψ <sup>∂</sup><sup>n</sup>* = 0 on *∂*Ω. More details regarding the parameters and nondimensionalization of the QGE are given in, e.g., [8,30–33]. Note that the velocity can be recovered from the streamfunction according to the following formula:

$$
\sigma = \left(\frac{\partial \psi}{\partial y}, -\frac{\partial \psi}{\partial x}\right). \tag{14}
$$

One can also introduce the vorticity *<sup>ω</sup>* <sup>=</sup> Ro−1(*<sup>q</sup>* <sup>−</sup> *<sup>y</sup>*) and recast the QGE (12) in the following streamfunction-vorticity formulation:

$$\frac{\partial \omega}{\partial t} + J(\omega, \psi) - \text{Ro}^{-1} \frac{\partial \psi}{\partial \mathbf{x}} = \text{Re}^{-1} \Delta \omega + \text{Ro}^{-1} F\_{\mathbf{c}} \tag{15a}$$

$$
\omega = -\Delta \psi \,. \tag{15b}
$$

This form is close to the streamfunction-vorticity formulation of the two-dimensional Navier–Stokes equations, but it has an additional convection term Ro−<sup>1</sup> *∂ψ <sup>∂</sup>x*<sup>1</sup> and the forcing term is scaled by Ro−<sup>1</sup> due to the rotation effect of the Earth. Such rotation effect can significantly change the behavior of QGE and yields a strong boundary layer in the solution, as shown in Figure 1: When Ro is unphysically large (i.e., close to 1) we have larger, circular gyres with lower kinetic energy but when Ro is decreased the gyres both increase in energy (due to increased forcing), which can be seen in the higher vorticity magnitudes, and move westward (due to the convection term).

**Figure 1.** Solutions (vorticities) of the QGE subject to different Rossby numbers on the rectangular domain [0, 1] × [0, 2] when Re = 100 and *Fe* = sin(*π*(*y* − 1)) at *t* = 0.1. From left to right: Ro = 1, 0.1, 0.01, and 0.001. It is seen that decreasing the Rossby number yields a sharper western boundary layer.

When the fluid of interest is homogeneous, that is, no stratification is considered, we have the single layer QGE model. This is what we mainly focus on in this paper. However, to better approximate a continuously stratified fluid, a multi-layer model can be developed that assumes that the fluid consists of stacked isopyncal layers, the variation in the thickness of each layer is small compared to its mean thickness, and adjacent layer equations are coupled through the quasi-geostrophic potential vorticity [17]. Numerical investigations of multi-layer QGE have been made, for instance, in [34–36].

#### **3. Full Order Model (FOM)**

To generate FOM numerical data to construct the ROM basis, the QGE (12) need to be discretized both in space and in time. Popular QGE spatial discretizations include finite difference (FDM), finite volume (FVM), finite element (FEM), and pseudospectral methods. Essentially all discretizations use the method of lines (e.g., Runge-Kutta methods or other standard ODE solvers) to discretize in time. In this section, we survey each of these spatial discretizations for the QGE and, where available, comment on the existing numerical analysis results.

The primary intent of this paper is to survey the state-of-the-art for reduced order modeling of the QGE. While FOMs are required by ROMs, this section is not intended to be an exhaustive survey of the literature on the subject and instead only highlights the major trends.

#### *3.1. Finite Difference Methods for the QGE*

It is straightforward to apply the FDM to a geophysical flow model on rectangular grids. These were the first methods used [18,37] to simulate geophysical flows. In particular, the Arakawa grids were introduced by Arakawa and Lamb [38] to conserve energy and enstrophy at the grid level by effectively locating state variables across the mesh (i.e., a staggered-grid representation instead of nodal or cell-centered). See, e.g., [15,39] for detailed discussions. Among this class of grids, the C-grid places scalar quantities at the cell centers, while specifying the normal velocity components at the cell edges (which is essentially the classic MAC scheme [40]). Because of its excellent representation of the inertial-gravity waves, it has been widely used in geophysical flow simulations, for instance, for solving QGE in [33] and is the standard solver in the Modular Ocean Model version 6 [41]. Staggered-grid grid approximations like the C-grid can be thought of as either finite difference or finite volume schemes since the various velocity fluxes are explicitly solved for at cell faces rather than being reconstructed first from cell-centered values—this is the fundamental property that gives, e.g., the MAC scheme exactly zero divergence at cell centers (when calculated with standard second-order difference operators). Such schemes can also be extended to work with various turbulence modelling strategies [33,42–44].

#### *3.2. Finite Volume Methods for the QGE*

Like the staggered-grid finite difference schemes, the principal advantage of the FVM is preservation of the essential conservative quantities for the governing equations of geophysical fluid flows while additionally dealing with unstructured grids (i.e., complex geometries) more easily. This avoids the need for discretizing boundaries with staircasing, which results in inaccurate modelling of coastal phenomena like Kelvin waves [45]). This combination of properties makes the FVM the most common method for large-scale ocean simulations, such as those performed with [41] or [46].

Methods that use C-grid like discretizations (i.e., storing normal velocities on cell faces and mass or pressure in cell centers) on arbitrarily structured meshes must additionally introduce corrective measures to deal with the reconstruction of the tangential velocity (which is required by the discretization of the Coriolis force) [47–49]. These generalized C-grid methods are applicable to a wide class of meshes including latitude-longitude grids, Delaunay triangulations, Centroidal Voronoi tessellation (CVT), and spherical CVT. A different approach to overcome this issue was considered in [50], where the non-staggered Z-grid scheme [51] was used for the QGE model.

#### *3.3. Pseudospectral and Spectral Methods for the QGE*

Like finite difference methods, pseudospectral methods (due to their immediate applicability to hypercube geometries) have been used in a variety of different ways in QGE solvers. Some QGE solvers, like the one used in [52], use a pseudospectral discretization to compute turbulence statistics. Alternatively, some finite difference methods use a pseudospectral interpretation of solution grid values to do fast Laplace solves with a multidimensional discrete sine transformation [32,53] or resolve stability problems from nonlinearities via dealiasing [54,55].

Furthermore, pseudospectral methods have been used for the spatial discretization of the QGE [56,57]. The FOM results used in this paper to construct the ROM basis in Section 5 were also generated with a pseudospectral method. By *pseudospectral* we mean that spatial derivatives in (12) are evaluated by performing a discrete sine transform, wave number multiplication, and an another discrete sine transform in which dealiasing (the 3/2 s rule from [58]) is used in the nonlinear term of (12) for stability. The FOM solver exploits the homogeneous boundary conditions to ignore even-numbered Fourier modes (i.e., the RODFT00 transformation in [59]). Since this method permits very fast evaluation of spatial derivatives we essentially treat them as a black box operation as part of an explicit ODE solver for evolving the Fourier coefficients in time. The largest stable timestep is found by using the power method for computing the principal eigenvalue of the linearized discretization of (12). Numerical experiments imply that setting a strict error tolerance on the error caused by the ODE solver requires smaller timesteps than the one required for ODE stability, which validates the choice of explicit methods for the relevant range of Reynolds numbers and grid resolutions. See Section 5.4 for additional details on the numerical experiments used in this manuscript.

#### *3.4. Finite Element Methods for the QGE*

The FEM is particularly appealing because it combines advantages of multiple methods. It can easily handle adaptive mesh refinement and complex geometries (like the FVMs), but also can create higher-order schemes (like pseudospectral methods) at the same time, like the discretization used in [30,60], which is shown in Figure 2 (see also [61–63]). The first FE approximation of the QGE, to the best of our knowledge, was a scheme based on the mixed formulation developed in [64]. The conservation properties and stability of the FE discretization were proved as well as the suboptimal convergence of the FE method. The performance of FEM on simulating a multilayer QGE of ocean circulations has been compared to the FDM in [65]. Since the vorticity-streamfunction formulation (15) of the QGE results in a second-order PDE, for a conforming finite element discretization, a *C*<sup>0</sup> element can be utilized. Considering the finite element spaces *<sup>W</sup><sup>ψ</sup>* <sup>⊂</sup> *<sup>H</sup>*<sup>1</sup> <sup>0</sup> (Ω) # *W*1,4(Ω), *<sup>W</sup><sup>ω</sup>* <sup>⊂</sup> *<sup>H</sup>*1(Ω) (see, e.g., [66] for the definition of these finite element spaces), the finite element discretization reads: Find *ψ<sup>h</sup>* ∈ *W<sup>ψ</sup>* and *ω<sup>h</sup>* ∈ *W<sup>ω</sup>* satisfying

$$\begin{cases} \begin{array}{l} \left(\frac{\partial\omega\_{\mathrm{h}}}{\partial t},\phi\_{\mathrm{h}}\right) + \left(I(\omega\_{\mathrm{h}},\psi\_{\mathrm{h}}),\phi\_{\mathrm{h}}\right) = -\mathrm{Re}^{-1}\left(\nabla\omega\_{\mathrm{h}},\nabla\phi\_{\mathrm{h}}\right) + \mathrm{Re}^{-1}\left(\frac{\partial\phi\_{\mathrm{h}}}{\partial x},\phi\_{\mathrm{h}}\right) + \mathrm{Re}^{-1}\left(F\_{\mathrm{e}},\phi\_{\mathrm{h}}\right) & \forall\phi\_{\mathrm{h}}\in\mathcal{W}\_{\mathsf{F}}\\ \left(\omega\_{\mathrm{h}},v\_{\mathrm{h}}\right) = \left(\nabla\psi\_{\mathrm{h}},\nabla v\_{\mathrm{h}}\right) & \forall\ v\_{\mathrm{h}}\in\mathcal{W}\_{\omega} \end{array} \end{cases} \end{cases}$$

**Figure 2.** Triangulation of the Mediterranean Sea suitable for simulations with finite element methods which was used in [30,60] (see also [61–63]).

In [34,67], Medjo considered this formulation and proved bounds for the time discretization error. Cascon et al. [68] proved both a priori and a posteriori error estimates for the FE discretization of the linear Stommel-Munk model, which is a simplified version of the QGE obtained by dropping the nonlinear term.

The streamfunction formulation (13) of QGE is a fourth-order PDE, which naturally necessitates *C*<sup>1</sup> elements for a conforming finite element discretization. Considering the finite element space *<sup>W</sup>* <sup>⊂</sup> *<sup>H</sup>*<sup>2</sup> <sup>0</sup> (Ω), the finite element discretization reads: Find *ψ<sup>h</sup>* ∈ *W* [66] satisfying

$$\left(\frac{\partial}{\partial t}\nabla\boldsymbol{\Phi}\boldsymbol{\eta}, \nabla\boldsymbol{\Phi}\boldsymbol{\eta}\right) + \mathrm{Re}^{-1}\left(\Delta\boldsymbol{\varphi}\_{\hbar}, \Delta\boldsymbol{\varphi}\_{\hbar}\right) + \left(f(\boldsymbol{\varphi}\_{\hbar}, \Delta\boldsymbol{\varphi}\_{\hbar}), \boldsymbol{\phi}\_{\hbar}\right) - \mathrm{Re}^{-1}\left(\frac{\partial\boldsymbol{\varphi}\_{\hbar}}{\partial\mathbf{x}}, \boldsymbol{\phi}\_{\hbar}\right) = \mathrm{Re}^{-1}\left(\boldsymbol{F}\_{\hbar}, \boldsymbol{\phi}\_{\hbar}\right), \forall \boldsymbol{\phi}\_{\hbar} \in \mathcal{W}. \tag{17}$$

To our knowledge, the first optimal error convergence results for the finite element approximation of the QGE (12) were proved for the streamfunction formulation (17) using Argyris elements in [30]. Several numerical tests, commonly employed in the geophysical literature, showed the accuracy of the finite element discretization and illustrated the theoretical estimates. Other recent developments of FEM for QGE include discontinuous Galerkin formulation using *C*<sup>0</sup> elements [62] and B-splines [69–73]. In particular, an adaptive refinement algorithm for B-splines finite element approximation was presented in [71] for the streamfunction formulation.

#### **4. Reduced Order Models (ROMs)**

ROMs for the QGE have been developed for decades (see, e.g., [8,32,61,74–83]). Most ROMs have been constructed by using a classical Galerkin projection framework, but datadriven modeling (e.g., machine learning) has also been recently used [84,85]. In Section 4.1, we outline the standard Galerkin ROM construction. In Section 4.2, we explain the importance of considering under-resolved regimes when developing ROMs for realistic, chaotic flows. Furthermore, we present several ROM closure strategies, which are generally needed when ROMs are used in an under-resolved regime. The flowchart of the ROMs presented in this section is illustrated in Figure 3.

**Figure 3.** Framework of the ROMs presented in Section 4.

#### *4.1. Galerkin Reduced Order Model (G-ROM)*

To construct the standard Galerkin ROM, we start by generating the ROM basis. To this end, we use the proper orthogonal decomposition (POD) [2,6], which is also known as empirical orthogonal functions (EOF) and principal component analysis (PCA). We emphasize, however, that other ROM bases could be used, such as principal interaction patterns (PIPs) and optimal persistence patterns (OPPs) [74] (see also [1,3,5,7,75,86,87] for alternative strategies).

The POD starts by collecting the snapshots {*ω*<sup>1</sup> *<sup>h</sup>*, ... , *<sup>ω</sup><sup>M</sup> <sup>h</sup>* }, which are numerical approximations of the vorticity in the QGE (15) at *M* different time instances. We consider relatively accurate snapshots. If inaccurate snapshots are used to construct the POD basis, the resulting ROM can be inaccurate (see, e.g., [88]). For clarity of presentation, in this paper we use the finite element discretization, but other numerical discretizations could be used. The POD seeks a low-dimensional basis that approximates the snapshots optimally with respect to a certain norm. In this presentation, we use the *L*<sup>2</sup> norm and the *L*<sup>2</sup> inner product:

$$
\int \left(\omega\_1, \omega\_2\right) = \int\_{\Omega} \omega\_1(\mathbf{x}) \, \omega\_2(\mathbf{x}) \, d\mathbf{x} \,. \tag{18}
$$

We note that, although the *L*<sup>2</sup> norm and the *L*<sup>2</sup> inner product are the most popular choices in reduced order modeling, other norms and inner products could also be used (see, e.g., [89]). To construct POD basis functions that approximate the snapshots optimally with respect to the *L*<sup>2</sup> norm, we solve the following minimization problem [89]:

$$\min\_{\substack{\tilde{\boldsymbol{\varphi}}\_{l},\cdots,\tilde{\boldsymbol{\varphi}}\_{N}}} \sum\_{j=1}^{M} \left\| \boldsymbol{\omega}\_{l}^{j} - \sum\_{i=1}^{N} \left( \boldsymbol{\omega}\_{h'}^{j} \, \tilde{\boldsymbol{\varphi}}\_{i} \right) \, \tilde{\boldsymbol{\varphi}}\_{i} \right\|\_{L^{2}}^{2} \tag{19}$$
 
$$\text{s.t. } \left( \tilde{\boldsymbol{\varphi}}\_{l} \, \tilde{\boldsymbol{\varphi}}\_{m} \right) = \delta\_{lm} \quad \text{for} \ 1 \le l, m\_{\prime} \le N\_{\prime}$$

where *δlm* is the Kronecker delta. The solution of the minimization problem (19) is equivalent to the solution of the eigenvalue problem

$$Y^{\dot{T}}M\_{\hbar}Y\tilde{\varphi}\_{\dot{j}} = \lambda\_{\dot{j}}\tilde{\varphi}\_{\dot{j}}, \quad \dot{j} = 1, \ldots, N,\tag{20}$$

where *Y* denotes the snapshot matrix, whose columns correspond to the finite element coefficients of the snapshots, *Mh* denotes the finite element mass matrix, and *N* is the dimension of the finite element space. The eigenvalues are real and non-negative, so they can be ordered as follows: *λ*<sup>1</sup> ≥ *λ*<sup>2</sup> ≥ ... ≥ *λ<sup>R</sup>* ≥ *λR*+<sup>1</sup> = ... = *λ<sup>N</sup>* = 0, where *R* is the rank of the snapshot matrix. It can be shown [89] that these eigenvalues determine how well the corresponding POD modes represent the given vorticity snapshots: the lower the eigenvalue index, the more important the corresponding POD mode. Thus, we choose the POD vorticity basis functions {*ϕj*}*<sup>r</sup> <sup>j</sup>*=<sup>1</sup> from the eigenfunctions in (20) that correspond to the first *r* ≤ *R* largest eigenvalues and define the ROM vorticity space as *<sup>X</sup><sup>r</sup>* :<sup>=</sup> span{*ϕ*1,..., *<sup>ϕ</sup>r*}.

To determine the POD streamfunction basis functions, we use the POD vorticity basis functions and follow the approach in [8,32]. Specifically, we define the POD streamfunction basis functions as the normalized functions {*φj*}*<sup>r</sup> <sup>j</sup>*=1, which are chosen such that the satisfy the following Poisson problem with homogeneous Dirichlet boundary conditions:

$$-\Delta \phi\_{\vec{\jmath}} = \varphi\_{\vec{\jmath}}, \quad \vec{\jmath} = 1, \ldots, r. \tag{21}$$

Next, we define the ROM approximations of the vorticity and streamfunction as follows:

$$
\omega\_{\mathcal{T}}(\mathbf{x}, t) \quad = \sum\_{j=1}^{r} a\_{j}(t) \,\,\varrho\_{j}(\mathbf{x})\,,\tag{22}
$$

$$\left(\psi\_{l}(\mathbf{x},t)\right)\_{l} = \sum\_{j=1}^{r} a\_{j}(t)\,\phi\_{j}(\mathbf{x})\,. \tag{2.3}$$

where {*aj*(*t*)}*<sup>r</sup> <sup>j</sup>*=<sup>1</sup> are the sought time-varying ROM coefficients. We note that we made two important choices in our approach: (i) We enforced the coupling between the POD vorticity and streamfunction basis functions in (21); and (ii) We used the same ROM coefficients in the ROM vorticity approximation (22) and in the ROM streamfunction approximation (23). The motivation for making these two choices is efficiency. Indeed, we only need to construct a ROM for the vorticity; once the coefficients *aj* are determined from (15a), Equation (15b) is automatically satisfied. (Of course, one could use a different approach and construct two different ROM bases and two different ROM approximations for the vorticity and streamfunction, but that would increase the ROM computational cost.) To construct a ROM for the vorticity, we replace the vorticity *ω* by *ω<sup>r</sup>* in the QGE (15a), and then we use a Galerkin projection onto *X<sup>r</sup>* . Thus, we obtain the Galerkin ROM (G-ROM) for the QGE: ∀ *i* = 1, . . . ,*r*,

$$\left(\frac{\partial\omega\_{\boldsymbol{r}}}{\partial t},\boldsymbol{\varphi}\_{\boldsymbol{i}}\right) + \left(\boldsymbol{I}\left(\omega\_{\boldsymbol{r}},\boldsymbol{\varphi}\_{\boldsymbol{r}}\right),\boldsymbol{\varphi}\_{\boldsymbol{i}}\right) - \mathrm{Re}^{-1}\left(\frac{\partial\boldsymbol{\varphi}\_{\boldsymbol{r}}}{\partial\mathbf{x}},\boldsymbol{\varphi}\_{\boldsymbol{i}}\right) + \mathrm{Re}^{-1}\left(\nabla\omega\_{\boldsymbol{r}},\nabla\boldsymbol{\varphi}\_{\boldsymbol{i}}\right) = \mathrm{Re}^{-1}\left(\boldsymbol{F}\_{\boldsymbol{\ell}},\boldsymbol{\varphi}\_{\boldsymbol{i}}\right). \tag{24}$$

The G-ROM (24) yields the following autonomous dynamical system for the vector of time coefficients, **a**(*t*)=(*ai*(*t*))*i*=1,...,*r*:

$$\stackrel{\bullet}{\mathbf{a}} = \mathbf{b} + \mathbf{A}\,\mathbf{a} + \mathbf{a}^{\top}\,\mathbf{B}\,\mathbf{a},\tag{25}$$

where **b**, **A**, and **B** are an *r* × 1 vector, an *r* × *r* matrix, and an *r* × *r* × *r* tensor, which correspond to the constant, linear, and quadratic terms in the numerical discretization of the QGE (15), respectively. The *r*-dimensional system (25) can be written componentwise as follows: For all *i* = 1, . . . ,*r*,

$$\overset{\bullet}{a}\_i(t) = b\_i + \sum\_{m=1}^r A\_{im} a\_m(t) + \sum\_{m=1}^r \sum\_{n=1}^r B\_{imn} \, a\_m(t) \, a\_n(t), \tag{26}$$

where

$$b\_{\bar{i}} = \text{Ro}^{-1}\left(F\_{\epsilon\prime} \,\, \varphi\_{\bar{i}}\right),$$

$$A\_{\rm im} = \mathrm{Ro}^{-1}\left(\frac{\partial \phi\_m}{\partial x}, \varphi\_i\right) - \mathrm{Re}^{-1}\left(\nabla \varphi\_{m\prime}, \nabla \varphi\_i\right),\tag{28}$$

$$B\_{\rm imu} = -\left(J(\varphi\_{\rm m}, \phi\_{\rm n})\_{\prime} \,\mathrm{q}\_{i}\right). \tag{29}$$

The G-ROM (25) has been investigated in the numerical simulation of the QGE (15) (see, e.g., [8,32,80,83]), where it was shown that it can decrease the FOM computational cost

by orders of magnitude. However, the numerical simulations in [8,32] have also shown that a low-dimensional G-ROM is not able to produce accurate approximations of the streamfunction and the velocity fields. The G-ROM's numerical inaccuracy in [8,32] is due to the lack of a closure model [8,9,90], which we discuss in Section 4.2.

#### *4.2. ROM Closure Models*

In this section, we survey the *ROM closure models* developed for the QGE (12). First, we define closure modeling and we explain why it is needed when ROMs are used in the under-resolved regime (Section 4.2.1). Then, we present the two main types of ROM closure modeling for the QGE that are in current use: large eddy simulation (LES) ROM closure models (Section 4.2.2) and machine learning (ML) ROM closure models (Section 4.2.3). While LES and ML ROM closures are both data-driven modeling approaches, they are different in the way they use data to develop a closure model: The LES approach is based on ROM spatial filtering and least squares methods, whereas the ML approach is based on machine learning techniques.

#### 4.2.1. Under-Resolved ROMs Require Closure Models

The concept of under-resolved simulations is central in classical CFD. *Under-resolved simulations are those simulations in which the number of degrees of freedom* (e.g., the number of mesh points or basis functions) *is not enough to capture the dynamics of the underlying system.* For example, in turbulent flow simulations the available number of mesh points in a finite element or finite volume discretization, or the number of basis functions in a spectral discretization are not enough to resolve all the lengthscales in the turbulent flow, down to the Kolmogorov scale [91–93]. The numerical simulations at these inherently coarse resolutions are called under-resolved simulations.

In under-resolved simulations of turbulent flows, standard discretizations yield inaccurate results, which are not acceptable in practical engineering settings, e.g., large relative errors, inaccurate quantities of interest (e.g., lift and drag), and inaccurate flow features (e.g., vortex shedding frequency for the flow past a cylinder). In these cases, the classical computational models (e.g., the Navier–Stokes equations) are generally supplemented with correction terms that model the effect of the neglected scales (e.g., the scales smaller than the given coarse mesh size). These correction terms are generally called *closure models* [91–93].

The concept of under-resolved simulations is also relevant to reduced order modeling: *Under-resolved ROM simulations are those simulations in which the ROM dimension is not enough to capture the dynamics of the underlying system.* But how exactly do we determine whether a ROM simulation is resolved or under-resolved? Next, we present several potential answers to this question. Some of these answers are *a priori* criteria (i.e., can be used before the ROM simulation), some are *a posteriori* criteria (i.e., can be used only after the ROM simulation).

*Kolmogorov n-width*: The Kolmogorov n-width is an *a priori* criterion to determine whether the ROM simulation is resolved or under-resolved. Given the solution manifold M of the underlying system's dynamics, the Kolmogorov n-width [94] provides a way to quantify the best *<sup>n</sup>*-dimensional trial subspace <sup>X</sup> *<sup>n</sup>*:

$$d\_n(\mathcal{M}) := \inf\_{\mathcal{X}^n} \sup\_{\omega \in \mathcal{M}} \inf\_{\mathcal{S}' \in \mathcal{X}^n} \|\omega - \mathcal{g}\|.$$

Of course, calculating the Kolmogorov n-width for general systems can be challenging. There are, however, cases when the relative size of the Kolmogorov n-width is known. For example, it is known that, for computational problems dominated by diffusion, the Kolmogorov n-width decays fast, while for those dominated by convection, it decays slowly [95]. As a result, in order to obtain an accurate approximation of the solution manifold, the dimension of the ROM trial space is expected to be much higher in the convection-dominated case than in the diffusion-dominated case. Thus, for convectiondominated systems, if we use a very high-dimensional (i.e., of the same order as the Kolmogorov n-width) ROM, we obtain a resolved ROM simulation. If, however, we use

a low-dimensional (i.e., much lower than the Kolmogorov n-width) ROM, we obtain an under-resolved ROM simulation.

*Eigenvalue decay rate*: The eigenvalue decay rate is an *a priori* criterion to determine whether the ROM simulation is resolved or under-resolved. The eigenvalues *λ*1, ... , *λ<sup>R</sup>* in the eigenvalue problem (20) (used to construct the ROM basis) represent the energy content of the corresponding ROM modes [2,89]. Thus, the ratio

$$\frac{\sum\_{i=1}^{r} \lambda\_i}{\sum\_{i=1}^{R} \lambda\_i} \tag{30}$$

defines the relative energy content of the first *r* ROM basis functions with respect to the total energy of the system (see, e.g., page 16 in [89]). We emphasize that the concept of "energy" in this context is used in a generic sense. For example, when the snapshots are FOM approximations of a the velocity field in a fluid flow, the energy in (30) is the kinetic energy; when the snapshots are FOM approximations of the vorticity field in the QGE, the energy in (30) is the enstrophy. We can define the resolved regime as the regime in which the ROM dimension *r* is large enough to ensure that the relative energy ratio (30) is larger than a certain threshold (e.g., 90%). Thus, we expect a low-dimensional ROM to be in the resolved regime when the eigenvalues have a fast decay, and in the under-resolved regime when the eigenvalues have a slow decay.

To illustrate this point, in Figure 4 we plot the scaled eigenvalues *λk*/*λ*1, *k* = 1, ... , 150 for two flow settings: the 2D flow past a cylinder at Re = 1000 and the QGE with Re = 450 and Ro = 0.0036 (the latter will be used in the numerical investigation in Section 5). This plot shows that the eigenvalues decay much faster for the flow past a cylinder case than for the QGE case.

**Figure 4.** Scaled eigenvalues *<sup>λ</sup><sup>k</sup> <sup>λ</sup>*<sup>1</sup> for the 2D flow past a circular cylinder with Re = 1000 and the QGE with Re = 450 and Ro = 0.0036 (see Section 5 for details).

Indeed, the results in Table 1 show that, in order to achieve a 90% relative energy ratio in (30), we need to use only 2 ROM modes for the flow past a cylinder, and 77 ROM modes for the QGE case. Thus, if we use only a handful of ROM modes to ensure a low computational cost, we expect the resulting low-dimensional ROM to accurately capture the dynamics of the flow past a cylinder, but not the dynamics of the QGE. In this case, we perform a resolved ROM simulation of the flow past a cylinder, and an under-resolved ROM simulation for the QGE.

**Table 1.** Number of ROM modes needed to achieve a given relative energy content (30) for the 2D flow past a circular cylinder with Re = 1000 and the QGE with Re = 450 and Ro = 0.0036 (see Section 5 for details).


*ROM Lengthscale*: The ROM lengthscale is an *a priori* criterion to determine whether the ROM simulation is resolved or under-resolved. In principle, the ROM lengthscale criterion follows the same algorithm as the standard CFD lengthscale criterion: Start with a lengthscale that is large enough to capture the relevant dynamics, and then choose the input discretization parameters such that phenomena occurring at the chosen lengthscale can be approximated. Choosing the discretization parameters is, however, fundamentally different in classical CFD and ROMs: In classical CFD, the *spatial meshsize* (e.g., for finite difference or finite element methods) or the cutoff wavenumber in a Fourier truncation (e.g., for spectral methods) clearly determines what lengthscale can be approximated. For ROMs, however, *there is no straightforward definition of a lengthscale based on the ROM discretization parameters*, i.e., the ROM dimension (*r*), the ROM basis ({*ϕ*1, ... , *ϕr*}), and the ROM eigenvalues ({*λ*1, ... , *λr*}). To our knowledge, only very few ROM lengthscale definitions based on the ROM discretization parameters have been proposed. In [96], a ROM lengthscale was defined for the 3D flow past a circular cylinder at *Re* = 1000 (see also [2] for related work). This lengthscale was then used in [96] to build ROM closure models.

*Trial and error*: The trial and error approach is an *a posteriori* criterion to determine whether the ROM simulation is resolved or under-resolved. Specifically, a few ROM simulations are run in the offline stage in order to determine the ROM discretization parameters that yield accurate results, which are acceptable in practical engineering settings, e.g., small relative errors, accurate quantities of interest (e.g., lift and drag), and accurate flow features (e.g., vortex shedding frequency for the flow past a cylinder).

In Section 5, we show that under-resolved ROM simulations of the QGE can yield inaccurate results. To increase the accuracy of these under-resolved ROM simulations, the standard G-ROM (25) is generally supplemented with a closure model:

$$\mathbf{\tilde{a}} = \mathbf{b} + \mathbf{A}\,\mathbf{a} + \mathbf{a}^\top \,\mathbf{B}\,\mathbf{a} + \tau^{ROM},\tag{31}$$

where *τROM* is the closure model that needs to be determined.

There are two main types of ROM closure modeling approaches, i.e., approaches to modeling the term *τROM* in (31) in the offline stage:


In this paper, we do not survey the ROM closure models. Instead, we only focus on ROM closure modeling for the QGE. Specifically, in Sections 4.2.2 and 4.2.3, we present two different ROM closure modeling strategies for the QGE. We note, however, that there are alternative ROM closure modeling strategies for the QGE, e.g., the stochastic mode reduction strategy developed by Majda and his collaborators (see [76] and references therein).

#### 4.2.2. Large Eddy Simulation ROM Closure Models

The large eddy simulation (LES) ROM closure modeling is inspired from classical LES of turbulent flows [91–93]. The LES-ROM closure models come in two flavors: black box and mathematical.

The black box LES-ROM closure models developed for the QGE use physical insight to postulate a model form for the closure term. Specifically, they postulate that the ROM closure term has to be dissipative. In [80], a linear damping term (i.e., a third-order Laplace operator) is used as a ROM closure model. In [32], a nonlinear damping term (i.e., a simplified Smagorinsky model [97]) is used as a ROM closure model. A significant improvement to the Smagorinsky ROM closure model used in [32] is the dynamic subgridscale ROM closure model which was first proposed in [96] and later adapted to the QGE in [79].

The mathematical LES-ROM closure models developed for the QGE are an elegant approach to closure modeling. These ROM closure models are built in three steps: In the first step, the QGE are *spatially filtered* to obtain the large structures which can be approximated at the given coarse resolution. In this first step, an exact formula for the ROM closure term *τFOM* is also obtained. In the second step, a specific model form is postulated for the LES-ROM closure model, i.e., *<sup>τ</sup>FOM* <sup>≈</sup> *<sup>τ</sup>ROM* in (31). Finally, in the third step, FOM data is used to find the parameters in the general form *τROM* that yield the closest (in a least squares sense) approximation to true ROM closure term *τFOM*. One example of mathematical LES-ROMs is the recently developed *data-driven variational multiscale* ROM (DD-VMS-ROM) [8,9,90], which is centered around the variational multiscale framework. There are two versions of the DD-VMS-ROM: a two-scale model [8,9] and an improved three-scale model [90]. For clarity of presentation, we present the two-scale DD-VMS-ROM. To construct the DD-VMS-ROM, the ROM projection from the ROM space *X<sup>R</sup>* = span{*ϕ*1, ... , *<sup>ϕ</sup>R*} to the subspace *<sup>X</sup><sup>r</sup>* <sup>=</sup> span{*ϕ*1, ... , *<sup>ϕ</sup>r*}, *<sup>r</sup>* <sup>≤</sup> *<sup>R</sup>* is used as a spatial filter. The filtered QGE yield an exact formula for the ROM closure term, *τFOM*. Next, a specific model form is prescribed for the exact closure term

$$
\pi^{FOM} \approx \pi^{ROM} := \check{A}\,a + a^\top \,\check{B}\,a \tag{32}
$$

and the entries of the LES-ROM closure operators *<sup>A</sup>*\$ and *<sup>B</sup>*\$ are found by solving the following *least squares problem*:

$$\min\_{\bar{A}, \bar{B}} \sum\_{j=1}^{M} \left\| \mathbf{r}^{\text{FOM}}(t\_j) - \left( \tilde{\mathbf{A}} \, a^{\text{FOM}}(t\_j) + (\mathbf{a}^{\text{FOM}}(t\_j))^\top \, \tilde{\mathbf{B}} \, \mathbf{a}^{\text{FOM}}(t\_j) \right) \right\|^2,\tag{33}$$

where *aFOM* are computed from the FOM data. Finally, the LES-ROM closure operators obtained in (33) are used to build the DD-VMS-ROM:

$$\mathbf{\dot{a}} = \mathbf{b} + \left(\mathbf{A} + \tilde{A}\right)\mathbf{a} + \mathbf{a}^\top \left(\mathbf{B} + \tilde{B}\right)\mathbf{a}.\tag{34}$$

In Section 5, we investigate the DD-VMS-ROM in the under-resolved simulation of the QGE.

#### 4.2.3. Machine Learning ROM Closure Models

Machine learning (ML) methods have recently started to make an impact in reduced order modeling of fluid flows. The ML methods most frequently used to build ROMs include multilayer perceptron (MLP) [85,98,99], convolution neural networks (CNN) [100], recurrent neural networks (RNN) [84,101,102], and variational autoencoder (VAE) [100,103]. Some of these ML-ROMs are *nonintrusive*, i.e., they use the FOM codes as black boxes, only to generate output data from different inputs. For these nonintrusive ML-ROMs, no prior information about the underlying governing equations is required to construct the model. These models fully rely on data combined with ML methods to discover the ROM dynamics and can be written as follows:

$$
\stackrel{\bullet}{a} = \hat{\mathcal{F}}(a, \theta),
\tag{35}
$$

where *a* is the vector of ROM coefficients and *θ* is the vector of learnable parameters in the ML model *<sup>F</sup>*. The nonintrusive ML-ROMs are fundamentally different from classical intrusive modeling strategies, such as the Galerkin method used to generate the G-ROM (25), which need access to the underlying governing equations in order to construct the ROM.

Only few ML-ROMs have been developed for the QGE. For example, an extreme learning machine concept with neural networks was introduced for the ROM closure of the QGE in [85]. Furthermore, a nonintrusive reduced order modeling framework embedded with a long short-term memory (LSTM) network was developed for quasi-geostrophic turbulence to improve the time series prediction of ROMs in [84]. The LSTM-ROM for QGE [84] was constructed (trained) in two steps:


The resulting model was then used in the testing stage to predict the ROM coefficients at new time instances.

Recently, *hybrid* ROMs that combine classical Galerkin modeling with machine learning have started to become popular. For example, a hybrid ROM closure was proposed in [104] for the QGE. This hybrid ROM combined classical Galerkin projection methods with neural network closures to perform near real-time prediction of mesoscale ocean flows. The numerical investigation in [104] showed that the hybrid ROM was more accurate than both the classical G-ROM and a pure ML-ROM (i.e., a ROM built entirely from data by using machine learning).

#### **5. Numerical Results**

In this section, we present an illustration of the projection ROMs constructed in Section 4 in the numerical simulation of the QGE described in Section 2. First, we describe the details of the computational setting that we use in our ROM numerical investigation: the regimes (Section 5.1), the test problem (Section 5.2), the criteria (Section 5.3), and the generation of the FOM data used to construct the ROM basis (Section 5.4). After we clarify these details, we perform a numerical investigation of the ROM accuracy and ROM efficiency (Section 5.5).

#### *5.1. Regimes*

In our numerical illustration, we use four regimes: (i) a *reconstructive* regime, which is an easier test case, in which the ROM is validated on the same time interval as the time interval used to train the ROM; (ii) a *predictive* regime, which is a harder test case, in which the ROM is trained on a short time interval and validated on a longer time interval; (iii) a *resolved* regime, in which the number of ROM basis functions is enough to represent the system's dynamics; and (iv) an *under-resolved* regime, in which the number of ROM basis functions is not enough to represent the system's dynamics. These four regimes illustrate different features of the ROMs.

The reconstructive regime is the first step in a ROM investigation. At the very least, the proposed ROM needs to provide an efficient and accurate approximation of the FOM data used to train it (i.e., the FOM results used to construct the ROM basis). The predictive

regime is a harder test in the ROM investigation. In order to be practical, the proposed ROM needs to be able to approximate the FOM results on time intervals and parameter ranges that are wider than those used to train the ROM. (For clarity, in this section, we consider only a longer time interval, but wider parameter ranges could also be considered.) Of course, the proposed ROM generally has a harder time approximating data that it has not seen in the training process, but the ROM needs to perform well in the predictive regime in order to be deemed successful in practice.

The resolved regime is an easier test in the ROM investigation. Since the ROM uses a relatively large number of ROM basis functions, which is enough to capture the underlying system's dynamics, a straightforward, standard G-ROM is expected to perform well in the resolved regime. The under-resolved regime is a much harder test in the ROM investigation. In the under-resolved regime, the proposed ROM needs to use a relatively small (i.e., not enough to capture the system's dynamics) number of ROM basis functions and somehow still be able to approximate the FOM data. In classical computational fluid dynamics (CFD), the under-resolved regime is one of the most important tests for the practicality of the proposed numerical method. Indeed, many realistic CFD applications are turbulent and chaotic, and standard resolved discretizations (e.g., direct numerical simulation (DNS)) are simply not possible, since they require an unrealistic number of degrees of freedom. The under-resolved regime is relatively much less investigated in the ROM world. We believe, however, that to develop ROMs that can be used in the numerical simulation of *realistic, chaotic* geophysical flows, the proposed ROMs need to be investigated in the under-resolved regime.

Since the reconstructive and predictive regimes, on the one hand, and the resolved and under-resolved regimes, on the other hand, serve different purposes, we consider four regime pairs in the ROM numerical investigation in Section 5.5: First, we consider the resolved and reconstructive, and the resolved and predictive regimes. The goal here is to investigate the reconstructive and, more importantly, the predictive capabilities of the standard G-ROM in the relatively simple resolved regime. Second, we consider the under-resolved and reconstructive, and the under-resolved and predictive regimes. The goal here is different. We want to investigate the reconstructive and predictive capabilities of the standard G-ROM in the challenging under-resolved regime. We expect that, when only a few ROM basis functions are used to build it, the standard G-ROM will perform poorly in the under-resolved regime. Thus, to address the G-ROM's potential inaccuracies, we also consider the LES-ROM proposed in Section 4.2.2, i.e., the DD-VMS-ROM. In the under-resolved regime, we expect the LES-ROM to be more accurate than the standard G-ROM.

#### *5.2. Test Problem Setup*

In our ROM numerical investigation in a QGE setting, we need to make several choices. Specifically, in the QGE (15), we need to choose the spatial domain, the time interval, the forcing (*Fe*), the Reynolds number (Re), and the Rossby number (Ro). We emphasize that these choices are important: Some choices yield a relatively easy test problem, i.e., a problem in which a standard ROM built with relatively few ROM basis functions can generate an accurate and efficient approximation. Other choices, however, yield a challenging test problem, in which standard low-dimensional ROMs produce inaccurate results.

In our numerical investigation, we choose parameters that yield a *challenging test problem*, which has been used in numerous studies (see, e.g., [8,27–29,31–33,79,85,105]) as a simplified model for more realistic ocean dynamics. Specifically, we choose the simple spatial domain [0, 1] × [0, 2], the relatively long time interval [0, 100], and a symmetric double-gyre wind forcing given by *Fe* = sin *π* (*y* − 1) , which yields a four-gyre circulation in the time mean. We also choose the same Reynolds number and Rossby number as those used in [8,28,32,33], i.e., Re = 450 and Ro = 0.0036.

We emphasize that this four-gyre QGE test problem represents a significant challenge for FOM simulations with standard numerical methods. Indeed, as shown in [27], although a double-gyre wind forcing is used, the long term time-average yields a *four-gyre* pattern (see Figure 5). On realistic coarse meshes, classical numerical methods (e.g., finite element and finite volume methods) generally produce inaccurate approximations to this test problem. In particular, standard numerical discretizations fail to recover the correct fourgyre pattern (see, e.g., [32,33]). One of the main reasons for the challenging character of the four-gyre test problem is the relatively low Rossby number used (i.e., *Ro* = 0.0036). Indeed, as shown in Figure 1, a relatively small Rossby number yields a sharp western boundary layer, which makes the test problem challenging for FOM simulations (see, e.g., [32,33]). While the Reynolds number used (i.e., *Re* = 450) is not large by turbulence modeling standards, it turns out that it yields a convection-dominated regime that is challenging for FOM simulations. Overall, these parameter choices together with the chosen spatial domain, time interval, and forcing function, yield a challenging FOM test problem. This is clearly illustrated in the plot of the FOM kinetic energy in Figure 6, which suggests that this is a *chaotic* system, with non-periodic time evolution.

Given the non-periodic, chaotic evolution of the this four-gyre test problem, we expect it to represent a challenging test not only for FOM simulations, but also for ROM simulations. This expectation is supported by projecting the FOM data on the ROM basis functions to obtain the true ROM coefficients, which the proposed ROMs need to approximate. These true ROM coefficients, which are plotted in Figure 7, have a non-periodic, chaotic evolution, which is challenging to capture by standard ROMs. In Section 5.5, we will show that this four-gyre test problem does indeed represent a challenging test for ROMs.

**Figure 5.** FOM streamfunction contour plots at *t* = 40 (**left**), *t* = 60 (**middle**), and time-averaged (**right**).

**Figure 6.** Time evolution of the kinetic energy of the FOM.

**Figure 7.** Time evolution of *aFOM* <sup>1</sup> (*t*), *<sup>a</sup>FOM* <sup>10</sup> (*t*), *<sup>a</sup>FOM* <sup>100</sup> (*t*).

**Remark** (QGE vs. 2D Flow Past a Cylinder)**.** *The 2D flow past a cylinder at low Reynolds numbers has become one of the most popular test problems in the ROM world. The reason is that the time evolution of the true ROM coefficients is periodic and a few ROM modes are required to capture the system's dynamics. By comparison, the QGE test problem used in our numerical investigation is a significantly harder test problem: Its true ROM coefficients display a non-periodic, chaotic time evolution and relatively many ROM modes are required to capture the system's dynamics. This statement is supported by the plot in Figure 4 and the results in Table 1: The plot shows that the eigenvalues decay much faster for the flow past a cylinder test case than for the QGE test case. The results in Table 1 show that, in order to achieve a* 90% *relative energy content (which is defined in* (30)*), the flow past a cylinder test case requires only* 2 *ROM modes, whereas the QGE test case requires* 77 *ROM modes.*

#### *5.3. Criteria*

To investigate the ROMs, we use the following three criteria: (i) the relative *L*<sup>2</sup> norm of the time-averaged streamfunction errors between *ψFOM* and *ψROM*:

$$\left\| \frac{1}{M} \sum\_{j=1}^{M} \psi^{FOM}(t\_j) - \frac{1}{M} \sum\_{j=1}^{M} \psi^{FOM}(t\_j) \right\|\_{L^2}^2 \Big/ \left\| \frac{1}{M} \sum\_{j=1}^{M} \psi^{FOM}(t\_j) \right\|\_{L^2}^2. \tag{36}$$

(ii) The ROM's ability to recover the four-gyre pattern of the time-average of the FOM streamfunction in Figure 5. (iii) The ROM computational cost. The first two criteria quantify the ROM numerical accuracy, whereas the third criterion quantifies the ROM efficiency. We note that the first two criteria utilize time-averages. The reason for using time-averages is that, in the numerical investigation of chaotic systems (such as the four-gyre test problem), pointwise in time quantities are less robust (e.g., prone to phase errors) and can yield deceiving results.

To define the resolved and under-resolved ROM regimes, we use two of the four criteria outlined in Section 4.2.1: (i) the trial and error criterion; and (ii) the eigenvalue decay rate criterion. Specifically, when we use the trial and error criterion in our numerical investigation, we call the ROM regime resolved if its relative *L*<sup>2</sup> norm of the error (36) is <sup>O</sup>(10−1), and under-resolved otherwise. We note that an <sup>O</sup>(10−1) relative error is large by engineering standards. However, our numerical investigation will show that even this large threshold requires high-dimensional ROMs. When we use the eigenvalue decay rate criterion in our numerical investigation, we call the ROM regime resolved if its relative kinetic energy content (which was defined in (30)) is above 90%, and underresolved otherwise.

#### *5.4. FOM Snapshot Generation*

To generate the FOM data (i.e., snapshots) that is used to construct the ROM basis functions, we utilize fine resolution spatial and temporal discretizations. Specifically, for the FOM spatial discretization, we use a pseudospectral method with a 257 × 513 spatial resolution [8]. For the FOM time discretization, we use an explicit Runge-Kutta method (Tanaka-Yamashita, an order 7 method with an embedded order 6 method for error control), and an error tolerance of 10−<sup>8</sup> in time with adaptive time refinement and coarsening [8] in addition to an eigenvalue-based time step restriction for ensuring numerical stability. These spatial and temporal discretizations yield numerical results that are similar to the fine resolution numerical results obtained in [32,33].

To collect FOM snapshots, we first need to decide what time interval we utilize. To this end, in Figure 6, we plot the time evolution of the kinetic energy, *E*(*t*). Figure 6 (see also Figure 1 in [32]) shows that the flow starts with a short transient interval (approximately [0, 10]), after which it converges to a *statistically steady state*. We emphasize that, although the flow is statistically steady, it still displays a complex, chaotic behavior. To illustrate this, in Figure 5, we display the instantaneous contour plot for the streamfunction field at *t* = 40 and *t* = 60. While *t* = 40 and *t* = 60 are well within the statistically steady state regime, the flow displays a non-periodic, complex time evolution, with a high degree of variability. Furthermore, in Figure 7, we plot the time evolution of the true ROM coefficients *aFOM* <sup>1</sup> (*t*), *<sup>a</sup>FOM* <sup>10</sup> (*t*), and *<sup>a</sup>FOM* <sup>100</sup> (*t*), which are obtained by projecting the FOM vorticity data onto the ROM bases, *ϕ*1, *ϕ*10, and *ϕ*100, respectively:

$$a\_i^{FOM}(t) = \left(\omega^{FOM}(t), \varphi\_i\right)\_{\prime} \tag{37}$$

where *ωFOM*(*t*) is the FOM vorticity at time *t*. The true ROM coefficients display a nonperiodic, chaotic behavior within the time interval [10, 80]. Thus, the numerical approximation of this statistically steady regime remains challenging for the ROMs that we investigate in this section.

In our numerical investigation, we follow [8,32,33] and collect 701 FOM snapshots in the time interval [*Tmin*, *Tmax*]=[10, 80] at equidistant time intervals. Collecting a large number of snapshots ensures that the FOM data used to train the ROM is rich enough to capture the relevant dynamics. Next, we use the algorithm outlined in Section 4 and the FOM snapshots to construct the ROM basis. In Figure 8, we plot selected ROM streamfunction basis functions. We observe that, as the ROM basis index increases, the spatial structures displayed by the ROM basis functions become smaller and smaller. This is consistent with the idea that the ROM modes are arranged in decreasing importance

(dominance) order: The first ROM mode is the most dominant, the second ROM mode is the second most dominant, and so on.

**Figure 8.** Streamfunction basis functions: *φ*1, *φ*10, and *φ*80.

#### *5.5. ROM Numerical Investigation*

In this section, we perform a numerical investigation of the ROM accuracy and efficiency.

To investigate the ROM accuracy, we consider the four regimes discussed in Section 5.1. First, we consider the resolved regime, both in the reconstructive (Section 5.5.1) and predictive (Section 5.5.2) settings. In these two regimes, we investigate only the standard G-ROM (25), since in the resolved case there is no need for ROM closure. The goal of these two sections is to use the two criteria presented in Section 5.3 (i.e., the relative *L*<sup>2</sup> error and the relative energy content) to determine the minimum ROM dimension (*r*) that is necessary in the resolved regime. Next, we consider the under-resolved regime, both in the reconstructive (Section 5.5.3) and predictive (Section 5.5.4) settings. In these two regimes, we investigate the standard G-ROM (25) and one LES-ROM, i.e., the DD-VMS-ROM presented in Section 4.2.2. The goal of these two sections is to determine whether the LES-ROM can significantly increase the standard G-ROM accuracy in the under-resolved regime.

To investigate the ROM efficiency, in Section 5.5.5 we discuss the computational cost of the standard G-ROM and the LES-ROM.

For both the G-ROM and the LES-ROM, we use the same time discretization on the time interval [10, 80]: the RK4 method with a uniform step size Δ*t* = 10−3.

#### 5.5.1. Resolved, Reconstructive Regime

In this section, we consider the resolved, reconstructive regime.

In Table 2, we list the relative *L*<sup>2</sup> errors (36) of the time-averaged streamfunction and the relative energy content (30) for G-ROM with several *r* values: *r* = 10, 20, 40, and 80. As expected, as the G-ROM dimension (*r*) increases, the relative errors converge to 0 and the relative energy content increases. We emphasize, however, that

one needs a relatively large *r* value to attain what we defined as a resolved regime: *To attain an* <sup>O</sup>(10−1) *relative error and* 90% *relative energy content, one needs to take <sup>r</sup>* <sup>=</sup> <sup>O</sup>(102).

**Table 2.** Resolved, reconstructive regime. Relative *L*<sup>2</sup> errors (36) of the time-averaged streamfunction and relative energy content (30) for G-ROM with different *r* values.


In Figure 9, for *r* = 10, 40, and 120, we plot the time-average of the streamfunction *ψ* over the time interval [10, 80] for the FOM and G-ROM. We note that we use the same scale for the FOM and the G-ROM with large *r* values (i.e., *r* = 40 and *r* = 120). However, for the G-ROM with a low *r* value (i.e., *r* = 10), we use a different scale, since the magnitude of these G-ROM results is much larger than the rest. The plots in Figure 9 show that the G-ROM with low *r* values (i.e., *r* = 10 and *r* = 40) fails to recover the FOM four-gyre pattern. The G-ROM with *r* = 120 captures the FOM four-gyre pattern, but even in this case the magnitude of the time-averaged streamfunction is only marginally accurate. Thus, the plots in Figure 9 support the results in Table 2: *To recover the FOM four-gyre pattern, one needs to take r* <sup>=</sup> <sup>O</sup>(102).

**Figure 9.** Resolved, reconstructive regime. Time-averaged streamfunction, *ψ*, for FOM and G-ROM with *r* = 10, 40, and 120.

5.5.2. Resolved, Predictive Regime

In this section, we consider the resolved, predictive regime. To construct the G-ROM basis functions, we use data (snapshots) from the time interval [10, 45] and test the G-ROM on a longer time interval (i.e., [10, 80]) to test the predictive capabilities of the G-ROM.

In Table 3, we list the relative *L*<sup>2</sup> errors (36) of the time-averaged streamfunction and the relative energy content (30) for G-ROM with several *r* values: *r* = 10, 20, 40, and 80. We note that, as the G-ROM dimension (*r*) increases, the errors converge to 0 and the relative energy content increases. As expected, the errors in the predictive regime are worse than the errors in the reconstructive regime in Section 5.5.1. Furthermore, as in the reconstructive regime, one needs a relatively large *r* value to attain what we defined as a resolved regime: *To attain an* <sup>O</sup>(10−1) *error and* 90% *relative energy content, one needs to take r* <sup>=</sup> <sup>O</sup>(102).


**Table 3.** Resolved, predictive regime. Relative *L*<sup>2</sup> errors (36) of the time-averaged streamfunction and relative energy content (30) for Galerkin ROM (G-ROM) with different *r* values.

In Figure 10, for *r* = 10, 40, and 120, we plot the time-average of the streamfunction *ψ* over the time interval [10, 80] for the FOM and G-ROM. We note that we use the same scale for the FOM and the G-ROM with large *r* values (i.e., *r* = 40 and *r* = 120). For the G-ROM with a low *r* value (i.e., *r* = 10), we use a different scale, since the magnitude of these G-ROM results is much larger than the rest. The plots in Figure 10 show that, as in the reconstructive regime in Section 5.5.1, the G-ROM with low *r* values (i.e., *r* = 10 and *r* = 40) fails to recover the FOM four-gyre pattern. The G-ROM with *r* = 120 captures the FOM four-gyre pattern, but even in this case the magnitude of the time-averaged streamfunction is only marginally accurate. Thus, the plots in Figure 10 support the results in Table 3: *To recover the FOM four-gyre pattern, one needs to take r* <sup>=</sup> <sup>O</sup>(102).

**Figure 10.** Resolved, predictive regime. Time-averaged streamfunction, *ψ*, for FOM and G-ROM with *r* = 10, 40, and 120.

#### 5.5.3. Under-Resolved, Reconstructive Regime

In this section, we consider the under-resolved, reconstructive regime. Since we use the under-resolved regime, we investigate the standard G-ROM and an LES-ROM. Specifically, we investigate the improved, three-scale version [90] of the DD-VMS-ROM (34).

In Table 4, we list the relative *L*<sup>2</sup> errors (36) of the time-averaged streamfunction of the G-ROM and LES-ROM for several *r* values: *r* = 10, 15, and 20. For all the *r* values considered, *the LES-ROM is orders of magnitude more accurate than the G-ROM*. More importantly, for *r* = 20, *the LES-ROM is almost one order of magnitude more accurate than the G-ROM with r* = 120, which was used in Table 2.

In Figure 11, for *r* = 10, we plot the time-average of the streamfunction *ψ* over the time interval [10, 80] for the FOM, G-ROM, and LES-ROM. We note that we use the same scale for the FOM and the LES-ROM. For the G-ROM, however, we use a different scale, since the magnitude of the G-ROM results is much larger than the rest. The plots in Figure 11 show that the G-ROM fails to recover the FOM four-gyre pattern. On the other hand, the LES-ROM successfully captures the four-gyre pattern and its correct magnitude. In fact, the LES-ROM with *r* = 10 is even more accurate than the resolved G-ROM with *r* = 120

in Figure 9. Thus, the plots in Figure 11 support the results in Table 4: *The LES-ROM is dramatically more accurate than the G-ROM.*

**Figure 11.** Under-resolved, reconstructive regime. Time-averaged streamfunction, *ψ*, for FOM, G-ROM, and large eddy simulation (LES)-ROM with *r* = 10.


**Table 4.** Under-resolved, reconstructive regime. Relative *L*<sup>2</sup> errors (36) of the time-averaged streamfunction for G-ROM and LES-ROM for different *r* values.

5.5.4. Under-Resolved, Predictive Regime

In this section, we consider the under-resolved, predictive regime for the G-ROM and LES-ROM. To construct the G-ROM and LES-ROM basis functions, we use data (snapshots) from the time interval [10, 45] and test the G-ROM and LES-ROM on a longer time interval (i.e., [10, 80]) to test the predictive capabilities of the G-ROM and LES-ROM.

In Table 5, we list the relative *L*<sup>2</sup> errors (36) of the time-averaged streamfunction of the G-ROM and LES-ROM for several *r* values: *r* = 10, 15, and 20. For all the *r* values considered, *the LES-ROM is orders of magnitude more accurate than the G-ROM*. Most importantly, for *r* = 20, *the LES-ROM is more accurate than the G-ROM with r* = 120, which was used in Table 3.

In Figure 12, for *r* = 10, we plot the time-average of the streamfunction *ψ* over the time interval [10, 80] for the FOM, G-ROM, and LES-ROM. We note that we use the same scale for the FOM and the LES-ROM. For the G-ROM, however, we use a different scale, since the magnitude of the G-ROM results is much larger than the rest. The plots in Figure 12 show that the G-ROM fails to recover the FOM four-gyre pattern. On the other hand, the LES-ROM successfully captures the four-gyre pattern and its correct magnitude. Thus, the plots in Figure 12 support the results in Table 5: *The LES-ROM is significantly more accurate than the G-ROM.*

**Figure 12.** Under-resolved, predictive regime. Time-averaged streamfunction, *ψ*, for FOM, G-ROM, and LES-ROM with *r* = 10.

**Table 5.** Under-resolved, predictive regime. Relative *L*<sup>2</sup> errors (36) of the time-averaged streamfunction for G-ROM and LES-ROM for different *r* values.


#### 5.5.5. Computational Cost

The ROM computational cost has two components: (i) the computational cost of the *offline stage*, i.e., when the ROM operators are assembled; and (ii) the computational cost of the *online stage*, i.e., when the ROM is actually used in practical computations. While the offline computational cost can be high, it is often offset in the online stage, when the ROM is used for numerous runs.

In Table 6, we list the CPU time for the FOM, G-ROM, and LES-ROM in the online stage. We note that the CPU time of the G-ROM is similar to the CPU time of the LES-ROM. We emphasize that *both the G-ROM and the LES-ROM CPU times are orders of magnitude lower than the FOM CPU time.* Furthermore, the G-ROM CPU time increases significantly as *r* increases.

**Table 6.** CPU time for FOM, G-ROM, and LES-ROM in the online stage.


#### 5.5.6. Summary

The results in our numerical investigation yield the following conclusions:


#### **6. Conclusions and Outlook**

The quasi-geostrophic equations (QGE) (also known as the barotropic vorticity equations) are a simplified mathematical model for large scale wind-driven ocean circulation. Since the QGE computational cost is significantly lower than the computational cost of full fledged mathematical models of ocean flows, the QGE have often been used to test new numerical methods for geophysical flows, such as reduced order models (ROMs).

In this brief survey, we summarized projection-based ROMs developed for the QGE in order to understand ROMs' potential in efficient numerical simulations of ocean flows. Specifically, in Section 2, we briefly explained how the QGE are derived from the primitive equations by using simple scaling arguments. We also outlined the various QGE formulations currently used, and we illustrated the importance of the Rossby number, which quantifies the rotation effects in the QGE. In Section 3, we surveyed the main numerical methods used in the spatial discretization of the QGE: finite difference, finite volume, pseudospectral and spectral, and finite element methods. In Section 4, we presented the main steps in the construction of the standard Galerkin ROM (G-ROM). Specifically, we showed how the full order model (FOM) simulations generate data (snapshots) that is used to build the ROM basis, which is then utilized in a Galerkin projection framework to construct the G-ROM. We also emphasized the importance of appropriate treatment of the under-resolved regime, i.e., when the number of ROM modes is not enough to capture the relevant QGE dynamics. The ROM under-resolved regime is often encountered in realistic geophysical settings dominated by convection, when the Kolmogorov n-width is large. One of the main approaches for tacking the ROM under-resolved regime is ROM closure modeling, i.e., modeling the effect of the discarded ROM modes. We reviewed two types of ROM closure models for the QGE: large eddy simulation (LES) ROM closure models (which are based on spatial filtering and data driven modeling), and machine learning (ML) ROM closure models. Finally, in Section 5, we showed how ROMs are used in the numerical simulation of the QGE. To this end, we considered a QGE test problem in which long-term time averaging yields a four-gyre pattern. We showed that, if enough ROM modes were used (i.e., in the resolved regime), the standard G-ROM yielded accurate results at a low computational cost. If, however, only a few ROM modes were used (i.e., in the under-resolved regime), the standard G-ROM yielded inaccurate results, whereas the LES-ROM yielded accurate results at a low computational cost.

ROMs have a significant potential in efficient and relatively accurate numerical simulations of geophysical flows that display recurrent dominant spatial structures. This brief survey aimed at showcasing the ROMs' potential in simplified settings, i.e., for QGE simulations. We emphasize, however, that the ultimate goal is to use ROMs in realistic many query atmospheric and oceanic applications, e.g., uncertainty quantification and data assimilation. While the first steps have been made (see, e.g., [106–112]), there are significant challenges that still need to be addressed. Next, we present several potential future research avenues in the ROM exploration of the QGE and more complex models of geophysical flows.

To develop ROMs for geophysical flows, *realistic* computational settings need to be considered. For example, realistic parameters (e.g., the Reynolds number, *Re*), and realistic complex geometries need to be investigated. Since realistic oceanic and atmospheric flows display an enormous range of spatial and temporal scales, new ROMs need to be constructed for under-resolved regimes in which the ROM closure problem becomes central, just as in FOM. Thus, novel robust, stable, accurate, and efficient ROM closure models for realistic geophysical flows need to be built. But how should these ROM closure models be developed? By using physical insight (as in classical FOMs), data (as currently done in many research areas), or both? Furthermore, in addition to the rotation effects modeled by the QGE, *stratification* should also be investigated. In the simplified QGE setting, stratification could be included by considering the multilayer QGE or the continuously stratified QGE. More realistic western boundary layers should also be investigated. Of course, all these problems are compounded when mathematical models that are more accurate than the QGE are considered, such as the Boussinesq equations. Finally, mathematical support for these new ROMs needs to be provided. The first steps in this direction have been made (see, e.g., [61,75]), but much more remains to be done.

**Author Contributions:** Conceptualization, C.M., Z.W., D.R.W., X.X. and T.I.; methodology, C.M., Z.W., D.R.W., X.X. and T.I.; software, C.M., Z.W. and D.R.W.; validation, C.M., Z.W. and D.R.W.; formal analysis, C.M., Z.W. and D.R.W.; investigation, C.M., Z.W., D.R.W., X.X. and T.I.; resources, C.M., Z.W., D.R.W., X.X. and T.I.; data curation, C.M., Z.W. and D.R.W.; writing—original draft preparation, C.M., Z.W., D.R.W., X.X. and T.I.; writing—review and editing, C.M., Z.W., D.R.W., X.X. and T.I.; visualization, C.M., Z.W. and D.R.W.; supervision, T.I.; project administration, T.I.; funding acquisition, Z.W., D.R.W. and T.I. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by National Science Foundation, DMS-2012253, DMS-1953113, DMS-1913073, OAC-1450327, and by U.S. Department of Energy, DE-SC0020270.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **A Swing of Beauty: Pendulums, Fluids, Forces, and Computers**

#### **Michael Mongelli <sup>1</sup> and Nicholas A. Battista 2,\***


Received: 27 January 2020; Accepted: 5 April 2020; Published: 12 April 2020

**Abstract:** While pendulums have been around for millennia and have even managed to swing their way into undergraduate curricula, they still offer a breadth of complex dynamics to which some has still yet to have been untapped. To probe into the dynamics, we developed a computational fluid dynamics (CFD) model of a pendulum using the open-source fluid-structure interaction (FSI) software, *IB2d*. Beyond analyzing the angular displacements, speeds, and forces attained from the FSI model alone, we compared its dynamics to the canonical damped pendulum ordinary differential equation (ODE) model that is familiar to students. We only observed qualitative agreement after a few oscillation cycles, suggesting that there is enhanced fluid drag during our setup's initial swing, not captured by the ODE's linearly-proportional-velocity damping term, which arises from the Stokes Drag Law. Moreover, we were also able to investigate what otherwise could not have been explored using the ODE model, that is, the fluid's response to a swinging pendulum—the system's underlying fluid dynamics.

**Keywords:** fluid dynamics education; damped pendulums; fluid drag; fluid-structure interaction; computational fluid dynamics

#### **1. Introduction**

Historically, pendulums have been used for a multitude of purposes. From seismometers used almost two thousand years ago [1,2], to devices that increase efficiency for societal capacity, such as reciprocating saws and pumps [3,4], to keeping time [5,6], to even medieval torture devices [7], applications of pendulums are far and wide. Edgar Allan Poe even wrote a short story about one, *The Pit and the Pendulum* [8]. The esteemed polymath Galileo Galilei dreamt of the first pendulum clock in 1637, but it wasn't constructed until 1656 when Dutch physicist Christiaan Huygens drew out the plans, thus designing it. He enlisted clockmaker Salomon Coster to build it. The introduction of a pendulum clock increased time keeping accuracy from 15 minutes to a quarter of minute [5] pendulums changed history!

It is no surprise that the study of pendulums swings its way into many foundational courses in science, mathematics, and engineering. Students are introduced to them in courses such as mechanics, dynamics, or differential equations, where they are first exposed to the idea of a *simple* gravity pendulum. A simple gravity pendulum is an idealized pendulum—a weight (*bob*) is attached to a massless string, which is tethered to a fixed pivot point, and is allowed to swing freely, without friction [9]. It will continue to swing forever. Realistic? Not unless one lives in a vacuum, but ultimately a good place to begin a student's exploration of simple harmonic motion.

If *θ*(*t*) represents the angular displacement (in radians) from the vertical at time *t* (see Figure 1a), the ordinary differential equation (ODE) describing such a simple pendulum system takes the form:

$$I\frac{d^2\theta}{dt^2} + mgL\sin\theta = 0,\tag{1}$$

where *I*, *m*, and *L* are the pendulum bob's moment of inertia and overall mass and length, respectively, and *g* is the gravitational acceleration. Since the only external force acting on this pendulum is gravity, it will swing forever, with no loss in oscillatory amplitude, see Figure 1b for an example. Figure 1b shows simulated results for different pendulum cases, each with a circular bob of a different radius. Note that the ODE was solved numerically, as no small angle approximation was used [9]. For these cases of circular bobs, of radius *R*, attached to a massless string of length *L*, the moment of inertia is calculated to be:

$$I = m\frac{1}{2}R^2 + mL^2.\tag{2}$$

There are two things to note from Figure 1b. The first is that over time the oscillation amplitudes do not decay. The second is that although amplitudes of oscillation are not affected, the period of oscillation is affected by bobs of different radii. Larger bobs have larger periods, due to their moment of inertia being greater [9].

**Figure 1.** (**a**) A pendulum of length *L* with circular bob of radius, *r*, and mass, *m*. (**b**) Angular displacement (in radians) over time for various gravity pendulums of differing radii. The non-dimensional time is given in terms of the number of periods of the case with radius, *r*.

In a world (or classroom) like ours which does not exist in a vacuum, a table-sized pendulum demonstration would eventually lead to its angular oscillations reaching zero, i.e., standing still. This is due to the pendulum swinging in air—a fluid. The air resists the pendulum's motion, effectively pushing back on the pendulum. This is known as *fluid drag*.

The concept of fluid drag is probably familiar to you already. It is the reason why parachutes work. The equation governing a pendulum swinging in a fluid environment is given by

$$I\frac{d^2\theta}{dt^2} + b\frac{d\theta}{dt} + mgL\sin\theta = 0,\tag{3}$$

where the parameter *b* is deemed a *damping* parameter. This is a reduced order model of the system, as the contribution of the fluid onto the pendulum is entirely contained within the parameter, *b*. That is, this equation models how the fluid is believed to affect the swinging motion of the pendulum, while providing no mechanism to understand the underlying fluid's dynamics. Numerical simulation results from solving Equation (3) are presented in Figure 2.

**Figure 2.** Angular displacements against non-dimensional time for damped physical pendulums in the case of (**a**) constant radius and varied damping, *b*, and (**b**) constant *b* and varied radii.

Figure 2a holds the radius constant at *r*, the same *r* from Figure 1b, while varying the damping coefficient, *b*. Compared to Figure 1b, angular oscillations decay in all cases when *b* > 0. The damping induced from *b* > 0 will eventually lead to its equilibrium—a stationary pendulum hanging straight down. However, the decay rate is dependent on *b*; larger values of *b* lead to a quicker decay of oscillation. Note that *<sup>b</sup>* has units of kg·m<sup>2</sup> <sup>s</sup><sup>2</sup> and in realistic situations, *<sup>b</sup>* > 0. Moreover, depending on the value of *b*, the pendulum system will exhibit one of three behaviorial cases:


The simulations shown in Figure 2b held the damping parameter fixed (*b* = 150 from Figure 2a) and varied the radius of the bob. Note that changing the radius *r* will vary the moment of inertia (see Equation (2)). This data suggests that as *r* increases for a given *b*, this would lead to more oscillatory behavior. That is, smaller *r* tends to make the pendulum system more damped. Intuitively this doesn't make much sense as is—a larger pendulum bob should feel more drag since it has a larger surface area. It would be like jumping out of an airplane with a parachute with a surface area of 10 m2 and floating down slower (and more softly) than a parachute of 40 m2. How could this be?

For the simulations in Figure 2b, we fixed the damping parameter *b* and then varied *r*. We did not consider that the damping parameter may depend on the radius (among a variety of other parameters), that is, the geometry of the pendulum bob. Furthermore, we have yet to motivate where the damping term in Equation (3) comes from. Let's settle that.

It comes from the seminal work of prolific physicist and mathematician Sir George Stokes, who even studied drag force using pendulums [10]! In particular, he derived a drag force equation, now known as *Stokes Law*, by investigating spheres moving through a fluid at low Reynolds numbers, i.e., situations in which either the fluid is moving extremely slowly or the fluid's viscosity is very high [11,12]. Stokes Law (for a sphere at low *Re*) is presented as the following:

$$F\_D = 6\pi\mu r v\_\prime \tag{4}$$

where *FD* is the fluid drag, *μ* is the fluid's dynamic viscosity and *r* and *v* are radius and speed of the sphere that is moving through the fluid. Note that the Reynolds number (*Re*) depends on two fluid parameters, i.e., its density, *ρ*, and dynamic viscosity, *μ*, as well as two parameters based on the physical system being investigated, i.e., a characteristic length and velocity scale, *L* and *V*, respectively. The *Re* is defined to be

$$Re = \frac{\rho LV}{\mu}.\tag{5}$$

Thus *Stokes Drag* describes that this damping frictional force acting on the sphere is proportional to its size, *FD* ∼ *r*, and speed, *FD* ∼ *v*. Careful to remember that this may only be true in low Reynolds number situations, where either *v*, *r*, or both may be small. Notice that the form that the damping term took in Equation (3) was similar, but used angular displacement (*θ*(*t*)) and angular velocity ( *<sup>d</sup><sup>θ</sup> dt* ). As suggested by numerical simulations presented above, this damping equation gives rise to exponential decay in angular displacement (Figure 2).

At higher Reynolds numbers, i.e., situations in which fluid viscosity is low or the speed or size of an object is large, the drag force takes a different form. For these situations, the drag law was discovered by none other than the infamous Lord Rayleigh (John William Strutt) using dimensional analysis [13]. For high Reynolds numbers settings, the fluid drag force takes the following form [14,15]:

$$F\_D = \rho r^2 K v^2,\tag{6}$$

where *ρ* is the fluid density, *r* is the sphere's radius, *v* is the sphere's velocity, and *K* is a non-dimensional number that is based on the fluid flow's speed and direction as well as the object's shape, size, and orientation in respect to the flow, and the fluid's density and viscosity. In a nutshell, for a specific object, this constant *K* may significantly change if one or more of these parameters are varied.

This drag force is traditionally represented in the following generalized manner:

*FD* <sup>=</sup> <sup>1</sup> 2 *ρACDv*2, (7)

where *A* is a cross-sectional area of an object in flow and *CD* is a dimensionless number called the *drag coefficient*. In this representation *CD* is analogous to *K* above.

Moreover, work in the latter half of the 20th century and early 21st century has shown that in particular situations correction terms must be included into the drag force equations [16,17]. Furthermore there are still unknown dynamics of pendulums involving small amplitude oscillations [18]. Although physical pendulums have been used for thousands of years and studied by students in foundational courses for over a century, they remain an active research area [19].

With the advent of new technologies, e.g., experimental measurement and flow visualization tools, researchers have further probed into the complex interactions of pendulums and the fluid environments they are immersed within [20–25]. In particular, Mathai et al. [25] investigated how fluid drag on pendulums may be enhanced due to dynamic interactions with their own vortex wake as they swing—something not quantified previously!

Mathai et al. [25] went on to note *Even with the wake history force included, the current model is still quite basic. In reality, the dynamics <of a pendulum> is highly nonlinear, with changes in direction, curvilinear trajectories and wide variations in instantaneous Re <Reynolds number>. . . Fully resolved direct numerical simulations. . . can provide better insights into the flow-induced forces*. That is exactly where our work on pendulums began, although there have been two previous studies using CFD models of pendulums [26,27].

In this paper we implemented a fluid-structure interaction (FSI) computational model of a swinging pendulum containing a spherical bob (a circular bob in two dimensions). In our FSI model, we varied the size of the circular pendulum bob, i.e., its radius, and its mass. We then analyzed the resulting data in terms of angular displacement, speed of the pendulum bob, and fluid forces acting on it, as well as compared the dynamics between our FSI model and the canonical reduced ODE model for a damped physical pendulum, Equation (3). Furthermore, we visualized (and qualitatively analyzed) the fluids motion in response to a swinging pendulum.

In addition, we provide instructor resources, such as slides and movies, in the Supplemental Materials (the items are listed in Appendix A), with the goal to streamline use of this work in educational settings. Moreover, we also offer the science community the first open-source pendulum models in a fluid-structure interaction framework. The models can be found at github.com/ nickabattista/IB2d in the sub-directory:

IB2d <sup>→</sup> matIB2d <sup>→</sup> Examples <sup>→</sup> Examples\_Education<sup>→</sup> Pendulum.

Note that each example is of a point mass model bob and was selected for its computational speed in comparison to circular bobs (those of non-zero radius). Moreover, three versions are presented, each corresponding to a different grid resolution. The least spatially resolved case, 256 × 256, will be the fastest to run, but also the least accurate of the 3, while the 1024 × 1024 case has the highest spatial resolution, but will run the slowest. The default setting is to only save the pendulum (Lagrangian) data. To store the fluid (Eulerian) data, flags corresponding to the desired fluid quantities may be selected within the input2d file. We also provided the scripts used to solve Equations (1) and (3) that produce Figures 1 and 2.

#### **2. Methods**

To investigate the swinging motion of a two-dimensional pendulum bob immersed in a viscous, incompressible fluid, we used computational fluid dynamics (CFD). In our model, the bob starts at rest and begins to swing under gravitational acceleration acting on the mass of the pendulum. An immersed boundary (IB) framework was used to couple the pendulum's motion and the fluid it is immersed within. Scientists and engineers can use IB to study the interactions of an object and the fluid it is contained within, i.e., you can explore how the fluid affects the object and vice versa.

The immersed boundary method was developed by Charles Peskin, a mathematical physiologist at the Courant Institute of Mathematics [28–30]. Even though IB was invented in the 1970s, it is still extensively used for investigating fluid-structure interaction problems today. Many mathematicians, engineers, and scientists have since improved the original algorithm in attempts to increase its accuracy without significantly increasing the computational expense and time required [31–38]. IB is still a leading numerical framework for studying problems in FSI due to its robustness [39,40].

It has previously been applied to study problems ranging from cardiac fluid dynamics [41–44] to aquatic locomotion [45–49] to insect flight [50–52] to dating and relationships [39]. Additional details on the IB method can be found in Appendix B.

In the remainder of this section we will introduce our FSI pendulum's implementation into the *IB2d* framework, i.e., the computational geometry, geometrical and fluid parameters, and model assumptions.

#### *2.1. Model Geometry*

Figure 3 presents our pendulum model's computational geometry. In particular, Figure 3a illustrates the modeling idea—a bob (of radius *r*) is composed of a central point mass (of mass *m*) and outer neutrally-buoyant shell layer. It is tethered to a particular fixed location, a distance *L* away. The pendulum is immersed in a viscous, incompressible fluid of uniform density *ρ*, and dynamic viscosity, *μ*. Note that the fluid within and outside the pendulum bob is uniform in its properties. We define the pendulum's angular displacement, *θ*, to be the angle from the vertical dotted line. Gravity acts on the central mass point; as the rest of the pendulum bob's geometry is neutrally-buoyant, all acceleration of the bob is due to this single gravitational interaction. However, due to the structure properties of the pendulum bob, the neutrally-buoyant shell will undergo fluid drag due to the fluid's resistance in letting the pendulum bob move through it.

**Figure 3.** (**a**) Model of an immersed pendulum with circular bob of radius *r* in a viscous, incompressible fluid. The fluid has density and viscosity of *ρ* and *μ*, respectively. The pendulum has length *L* and the bob has mass *m*, concentrated at its center. (**b**) The computational geometry illustrating the fiber model construction of the discretized Lagrangian mesh.

As we wished to vary the pendulum bob's radius and mass of its central point, we considered the parameters listed in Table 1 for our FSI pendulum model. The explicit radii, *r*, and masses, *m* studied are *r* ∈ {0.001, 0.0025, 0.005, 0.0075, 0.01, 0.0125, 0.015, 0.0175, 0.02, 0.0225, 0.025} m and *m* ∈ {<sup>2</sup> <sup>×</sup> <sup>10</sup>2, 5 <sup>×</sup> <sup>10</sup>2, 1 <sup>×</sup> <sup>10</sup>3, 2 <sup>×</sup> <sup>10</sup>3, 5 <sup>×</sup> <sup>10</sup>3, 1 <sup>×</sup> <sup>10</sup>4} kg, respectively. The initial angular displacement, *<sup>θ</sup>*0, was <sup>−</sup>*<sup>π</sup>* <sup>2</sup> <sup>+</sup> *<sup>π</sup>* <sup>5</sup> <sup>=</sup> <sup>−</sup>3*<sup>π</sup>* <sup>10</sup> radians. We did not vary properties of the fluid, neither its density nor viscosity. Note that the kinematic viscosity in our simulations was *ν* = *<sup>μ</sup> <sup>ρ</sup>* = <sup>10</sup>−<sup>5</sup> m2/s. Some common liquids with kinematic viscosities around 10−<sup>5</sup> m2/s are sulphuric acid at room temperature or a variety of oils (coconut, SAE Motor Oils, peanut, whale, etc.) at ∼100–130 degrees Fahrenheit [53]. Kinematic viscosity, *ν*, measures the fluid's internal resistance to flow under gravity.


**Table 1.** Table of geometric and fluid parameters used in our pendulum study.

#### *2.2. Model Construction*

Figure 3b provides a more detailed overview of the computational geometry. In particular it provides details regarding how the structure is modeled using *IB* fiber models, which are used to mimic desired material properties between discretized points, i.e., Lagrangian points, that compose the geometry [39]. The single mass point is tethered to the fixed point, a distance *L* away, via a *virtual spring*. The static hinge point is tethered in place using the *target point* model. Target points can be used to hold Lagrangian points nearly rigid. The individual mass point uses a *massive point* model that tethers the individual Lagrangian point to a *mass*, which dampens its movement [54]. Note that the target point and massive point models use spring-like mathematical formulations to achieve their desired effects, see [39].

The neutrally-buoyant shell is composed of equally-spaced Lagrangian points, to which are tethered to their neighboring points via virtual spring connections. Furthermore, each of these Lagrangian points are further tethered to the massive point in the center of the pendulum bob by a virtual spring. All of the virtual spring connections in the model use *stiff* springs with a particular spring resting length in order to keep the geometry from changing shape, i.e., trying to ensure that the Lagrangian points maintain a specific distance from other points. The number of Lagrangian points in a circular shell varies by the radius, see Table 2. Note that due to the coupled nature of the Lagrangian structure and fluid in the standard immersed boundary framework, it was more straightforward for us to approximately model rigid structures in this manner. Additional steps would have been necessary to solve the problem of each Lagrangian point only being allowed to move in a constrained way, due to the imposed rigidity, under forces from the fluid and other external forces (like gravity) pushing on it. Please see [46,55] for further information regarding immersed boundary formulations with rigid bodies.

**Table 2.** Table providing number of Lagrangian Points in the circular shell for a particular radius, *r*.


Fiber models use a variety of different deformation force laws to model material properties. To model virtual (Hookean) springs, deformation forces were calculated as follows,

$$\mathbf{F}\_{spr} = k\_{spr} \left( 1 - \frac{R\_L(t)}{||\mathbf{X}\_A(t) - \mathbf{X}\_B(t)||} \right) \cdot \begin{pmatrix} x\_A(t) - x\_B(t) \\ y\_A(t) - y\_B(t) \end{pmatrix} \tag{8}$$

where *kspr* is the spring stiffness, *RL*(*t*) is the spring's resting lengths, and **X***<sup>A</sup>* = *xA*, *yA* and **X***<sup>B</sup>* = *xB*, *yB* are the Lagrangian nodes tethered by the spring. Note that in our model the resting lengths are time-independent, hence *RL*(*t*) = *RL*. As mentioned previously, the spring stiffnesses are large to ensure minimal stretching or compression of the computational geometry. The spring stiffness used to tether the massive point to the fixed hinge point and the pendulum bob points to both one another and the massive point are denoted by *ksprL* and *ksprB* , respectively.

In the simple case where a preferred position is enforced, boundary points are tethered to target points via springs. Its corresponding deformation force equation, which describes the force applied to the fluid by the boundary in Lagrangian coordinates is given by **F***target* and is explicitly written as,

$$\mathbf{F}\_{\text{target}} = k\_{\text{target}} \left( \mathbf{Y}\_A(t) - \mathbf{X}\_A(t) \right), \tag{9}$$

where *ktarget* is the stiffness coefficient, and **Y***A*(*t*) is the prescribed Lagrangian position of the target point. In all simulations the hinge point was held nearly rigid by applying a force proportional to the distance between the location of the actual Lagrangian point and its preferred target position. Using a large value of *ktarget* helps mitigate a small deviation between the actual and preferred position.

Artificial mass is modeled using the massive point approach of Kim et al. [54]. It is similar to target points. **Z***<sup>A</sup>* gives the Cartesian coordinates of the *massive* points, with associated mass density *MA*. Note that such points do not interact with the fluid directly, similar to target points. **X***A*(*t*) give the Cartesian coordinates of a neutrally-buoyant Lagrangian boundary point, which do interact with the fluid. Similar to target points, if a Lagrangian point deviates from its massive point, a restoring force drives them back together, as shown in its mathematical description below

$$\mathbf{F}\_{\rm Mars} = k\_{\rm mass} (\mathbf{Z}\_A(t) - \mathbf{X}\_A(t)) \tag{10}$$

$$M\_A \frac{\partial^2 \mathbf{Z}\_A(t)}{\partial t^2} = -\mathbf{F}\_{\text{Mass}} - M\_A \mathbf{g} \mathbf{\hat{y}},\tag{11}$$

where *kmass* is a stiffness coefficient with *kM* 1, *MA* is the mass of the massive point, *g* is the acceleration due to gravity in vertical direction, *y*ˆ, and **Z***A*(*t*) is the position of the massive point to which the Lagrangian point, **X***A*(*t*), is tethered to at time *t*.

All numerical stiffness parameters are given in Table 3. The stiffnesses were selected to be as high as possible while also maintaining stability and fidelity of our numerical solver. Each pendulum simulated was of length 0.2 m and was immersed in a square computational domain of size (*Lx*, *Ly*)=(1 m, 1 m), with a grid resolution of 1024 <sup>×</sup> 1024, i.e., *dx* <sup>=</sup> *dy* <sup>=</sup> *Lx Nx* <sup>=</sup> *Ly Ny* = 0.0009765625 m. Points that compose the circular pendulum bob were evenly spaced apart at a distance of *ds* = *dx* <sup>2</sup> . Note that this is the standard convention in the immersed boundary literature when choosing the Lagrangian point spacing, *ds*. It is used to avoid leaky boundaries [30]. Thus, fluid will not be allowed to flow in or out of the pendulum bob, unless due to numerical error. Moreover, note that the adjacent nodes along the circle were a distance *r* from the massive point at the center of the pendulum bob, which was tethered a distance of *L* from the fixed hinge target point. Each of the spring connections between specific points used a spring resting length equal to such corresponding distances in an effort to maintain the geometry while the pendulum was swinging. A time-step of *dt* <sup>=</sup> 2.5 <sup>×</sup> <sup>10</sup>−<sup>5</sup> s was used to march forward in time.



While running the simulations, we stored the following data every 0.005 s of simulation time:


We then used the open-source software VisIt [56], created and maintained by Lawrence Livermore National Laboratory for visualization, see Figure 4, and the data analysis package software within *IB2d* [39] for data analysis. Figure 4 provides a visualization of some of the data produced for a pendulum of mass and radius of *<sup>m</sup>* <sup>=</sup> <sup>5</sup> <sup>×</sup> 102 kg and *<sup>r</sup>* <sup>=</sup> 0.0175 m, respectively, at one snapshot in time. Section 3.4 explores the underlying fluid dynamics in further detail, including the time evolution over a pendulum's first swing, see Figure 19.

**Figure 4.** Snapshots of a single moment in time for a pendulum with mass, *<sup>m</sup>* <sup>=</sup> <sup>5</sup> <sup>×</sup> <sup>10</sup><sup>2</sup> kg, and radius, *r* = 0.0175 m, providing some of the data stored during the time-step in which the simulation time reached *t* = 0.70 s, i.e., positions of Lagrangian points (pendulum), the velocity vector field, magnitude of velocity, and vorticity. Note that data from giving the force spread from the Lagrangian grid (pendulum) onto the Eulerian (fluid) grid is not shown. Lagrangian Coherent Structures (LCS) via finite time Lyapunov exponents (FTLE) are also illustrated, although they were computed during the post-processing stage, after the data was collected.

#### **3. Results**

Using an open source implementation of the immersed boundary method, IB2d, we modeled the motion of a pendulum with a circular bob immersed within a fluid undergoing gravity's influence. For this education focused paper, we explored angular displacement and speed of the pendulum bob as well as forces acting upon the pendulum bob to impede its motion. Upon doing so we quantified the decay in oscillation amplitude and speed damping. This was done for a variety of pendulum bob masses as well as radii (size). We also explored the effect that the motion of the pendulum bob has onto the fluid it was immersed. Lastly, we compared the reduced ODE model of a damped physical pendulum and our FSI model. We organized our results into the following five subsections:


#### *3.1. Angular Displacement of the Pendulum Bob*

As suggested from Section 1, since the pendulum is immersed within a fluid environment, its oscillation amplitude will decay over time. Figure 5 provides snapshots of multiple pendulums' angular displacement over time for pendulum bobs of differing radius and *<sup>m</sup>* <sup>=</sup> <sup>1</sup> <sup>×</sup> 103 kg. Note that all simulations were run independently of each other and Lagrangian position data is being overlaid during post-processing.

**Figure 5.** Snapshots of multiple pendulums' (of differing radius) angular displacement over time in the case of *<sup>m</sup>* <sup>=</sup> <sup>1</sup> <sup>×</sup> <sup>10</sup><sup>3</sup> kg.

Moreover, both the size of the bob and the mass of the bob will affect its dynamics. Figure 6 illustrates that pendulums with the same size and shape bob may experience significantly different oscillation patterns due to different mass. In particular, depending on the mass, the pendulum could fall into any of the 3 damping cases (under-, over-, or critically-damped). See Figure A1 for the counterpart case where a specific mass is tested for a variety of radii. Consistent dynamics are observed.

**Figure 6.** Depicting the angular displacement (radians) vs. time (s) for pendulums with the same radius but different masses. (**a**–**c**) give data for a specific radius, either *r* = 0.005 m, 0.015 m, or 0.025 m, respectively, for 4 orders of magnitudes in mass in each.

Next we calculated the maximum amplitude during each oscillation cycle for a variety of masses. The amplitude decays exponentially, see Figure 7. Figure 7 presents the displacement amplitude against peak number (number of half swings) for a variety of masses for *r* = 0.005 m. The data shows a linear relationship between the logarithm of the amplitude and the peak number, suggesting exponential decay, although lines seem a better fit starting at the first peak, rather than the initial displacement. Note that Figure 8 shows a steady increase in time between successive peaks for three

different masses (*<sup>m</sup>* <sup>=</sup> <sup>2</sup> <sup>×</sup> 102, 1 <sup>×</sup> 103, and 5 <sup>×</sup> <sup>10</sup><sup>3</sup> kg) for a variety of radii. As mass increases, the time between peaks decreases. Moreover, generally as more swings occur the time become peaks becomes more consistent. We also note that the time between the start of each simulation and their first peak generally does not fit the data. Sections 3.2, 3.3 and 3.5 will also explore this observation in more detail.

**Figure 7.** (**a**) Plot illustrating the decay of the height (m) that the pendulum bob reaches as the pendulum continues to swing for the case of *r* = 0.005 m for a spectrum of masses. The peak amplitude decays exponentially as illustrated by the linear relationship between the logarithm of the amplitude against peak number, as shown in (**b**).

**Figure 8.** Plots illustrating the time of the peak in angular displacement against the pendulum bob's radius for its 1st through 6th peak (**a**–**c**) and the time difference between the peaks (**d**–**f**) against the pendulum bob's radius for three different masses: (**a**,**d**) *<sup>m</sup>* <sup>=</sup> <sup>2</sup> <sup>×</sup> 102 kg, (**b**,**e**) *<sup>m</sup>* <sup>=</sup> <sup>1</sup> <sup>×</sup> <sup>10</sup><sup>3</sup> kg, and (**c**,**f**) *<sup>m</sup>* <sup>=</sup> <sup>5</sup> <sup>×</sup> <sup>10</sup><sup>3</sup> kg.

From this data, we computed the approximate damped period of oscillation, see Figure 9a,b. Figure 9b provides a colormap with contour lines of the period data from Figure 9a. As mass increases, the approximate period decreases, as suggested previously. Moreover, as radius increases, the period also increases. Note that when mass is high enough (e.g., the case of *<sup>m</sup>* <sup>=</sup> <sup>1</sup> <sup>×</sup> 104 kg), the period

does not significantly change between different radii; however, there is still exponential decay in the oscillatory amplitude (see Figure A2). Furthermore, for very small radii there appears to be a non-monotonic trend in period as a function of mass. The period was computed by averaging the time between every other peak over the first 5 full oscillations.

Next we explore how the speed of the pendulum bob is affected by variations in its mass and radius in a fluid environment.

**Figure 9.** (**a**) The period given as a function of a pendulum bob's radius for a variety of masses. (**b**) A contour map showing the period as a function of both the pendulum bob's radius and mass. The highest periods occur for small masses and large pendulum bobs.

#### *3.2. Speed of the Pendulum Bob*

Recall that in Section 3.1 we observed that peaks in angular displacement decayed exponentially over time. This suggests that the pendulum bob's speed is inherently slowing down as well. Figure 10 details the pendulum bob's speed vs. the number of swings (half a complete oscillatory cycle). The data shown is for the case of *r* = 0.015 m for a variety of masses. During each swing the pendulum accelerates to a maximal speed before decelerating. The maximum occurs at roughly halfway through each swing, when the pendulum passes the point of 0 displacement from the vertical. Note that physically the pendulum must hit a speed of zero when it swings from one direction to the other; our data does not reflect this due to the time resolution of the sampled time-points.

Furthermore, from Figure 10b, it does not appear that the speed peaks decay exponentially over time from the beginning of the simulation. After a few swings, the peaks in speed appear to satisfy a linear relationship between the logarithm of speed versus time; however, the peak speed significantly decreases from the first to second swing, see Figure 11. Figure 11 illustrates that for most cases from the second swing on, the peak speeds approximately demonstrate exponential decay; however, there is significantly more decay between the first and second swing than successive peaks thereafter. Figure A3 presents the counterpart data of how the speed of pendulum bobs of the same mass decays over time for a variety of radii. Similarly trends are observed in the data.

**Figure 10.** (**a**) Plot depicting the linear speed of the pendulum bob against non-dimensional time given as the # of swings (half a full displacement cycle) for the case of *r* = 0.015 m and a variety of masses. Speed peaks near the center of each swing. This corresponds to when the pendulum has approximately zero angular displacement from the vertical. The peak speed appears to begin decaying exponentially, starting on the second or third swing in most cases. This can be seen from linear relationships between peak speed and swings in (**b**).

**Figure 11.** Plot illustrating that exponential decay appears in peak speed starting with the second swing. There is significantly more decay in peak speed between the first and second swings, than successive swings thereafter.

Next we wished to quantify the amount of damping due to the pendulums immersion in a fluid of kinematic viscosity *ν* = *μ*/*ρ* = 10−<sup>5</sup> m2/s. To do this we computed the theoretical speed of a pendulum bob void of a fluid environment by energy conservation. We set the original potential energy when the pendulum bob was at time zero and computed the kinematic energy when the pendulum was passing 0 displacement, i.e.,

$$\frac{1}{2}m\upsilon\_{NF}^2 = mgh\_0 \implies \upsilon\_{NF} = \sqrt{2gh\_0}\upsilon$$

where *vNF* is the velocity of the pendulum bob outside of a fluid environment and *h*<sup>0</sup> is the initial height of the pendulum bob before it begins swinging. For our initial setup, *h*<sup>0</sup> = *L*(1 − cos(*π*/2 − *π*/5)), since the pendulum is released from an initial angle of *π*/5 radians from the horizontal.

We then defined the speed ratio to be: *SR* = *v*/*vNF* and the speed damping ratio to be: *SDR* = 1 − *SR* = 1 − *v*/*vNF*. If *v* = *vNF*, *SDR* ≈ 0 suggesting only small damping effects. Figure 12 gives the speed damping ratio as a percentage. For this particular fluid environment, even cases of high mass and small radius result in *SDR*'s of ∼88%, i.e., the fluid immersed pendulum bob is moving ∼88% slower than its fluid void counterpart. As the radius increases, the *SDR* increases. Note that smaller masses have less significant increases in *SR*. As mass increases, *SDR* decreases for a given radius.

**Figure 12.** The percentage decrease in speed when comparing pendulum bob speed once it reaches 0 degree angular displacement on the first swing compared between simulated cases in fluid and theoretical value outside of a viscous fluid environment.

Lastly we explored the phase space between pendulum bob speed and its angular displacement. Figure 13a presents the data for the case of *r* = 0.001 m for a spectrum of masses of over 3 orders of magnitude. As suggested by all data previously, the trajectories eventually converge to zero displacement and speed. Interesting, all the trajectories collapse onto an approximate parabolically-capped cone. The last cycle is given for all cases in Figure 13b. Similar topologies are seen in cases of other radii (see Figure A5) over a variety of masses. Furthermore, similar topologies emerge when fixing the mass and exploring trajectories for a variety of radii (see Figure A4).

**Figure 13.** (**a**) Phase space of linear speed of the pendulum bob vs. angular displacement (radians) for a variety of masses in the case of *r* = 0.001 m. (**b**) A closer look at the last simulated cycle's phase space in each case.

#### *3.3. Forces on the Pendulum Bob*

We have observed that a fluid-immersed pendulum experiences exponential decay in angular displacement and speed. As discussed in Section 1, this is due to fluid drag on the pendulum bob. In this section we will explore this drag force. We wish to emphasize that our numerical experiments did not prescribe any specific form of the drag forces *a priori*, or any forces for that matter, beyond gravity acting on the pendulum bob.

First we selected three masses, *<sup>m</sup>* <sup>=</sup> <sup>5</sup> <sup>×</sup> <sup>10</sup>2, 1 <sup>×</sup> 103, and 2 <sup>×</sup> 103 kg, and investigated the drag force acting on the pendulum versus time for a variety of radii. This data is shown in Figure 14. The drag force was calculated by computing the forces perpendicular to the direction of the pendulum arm as the bob swung. A normal unit vector was computed at each sampled time-point and the drag force was computed using a vector projection in the direction opposite to the swinging motion of the pendulum bob.

Figure 14a–c suggest that the drag force exponentially decays as time progressed. This was further confirmed by the linear relationship between the logarithm of drag force vs time in Figure 14d–f. As the radius increases, the surface (circumference) of the circle increases, and thus the amount of drag on the pendulum bob's body increased for a particular mass as well. Hence the drag on the bob is dependent on its geometry (shape and size), as discussed in Section 1. Furthermore, as mass increases, so does the overall drag on the bob. This can also be seen in the counterpart case, where 3 radii are selected (*r* = 0.005, 0.015, and 0.025 m) and mass was varied, see Figure A6.

**Figure 14.** Drag forces (N) over time in seconds for multiple radii for cases with (**a**,**d**) *<sup>m</sup>* <sup>=</sup> <sup>5</sup> <sup>×</sup> <sup>10</sup><sup>2</sup> kg, (**b**,**e**) *<sup>m</sup>* <sup>=</sup> <sup>1</sup> <sup>×</sup> <sup>10</sup><sup>3</sup> kg, and (**c**,**f**) *<sup>m</sup>* <sup>=</sup> <sup>5</sup> <sup>×</sup> <sup>10</sup><sup>3</sup> kg. The semi-log data is provided in (**d**–**f**) to highlight a linear relationship between the logarithm of the drag force and time. This linear relationship suggests an exponential decay in drag force over time.

Next, we explored the phase space of drag force on the pendulum bob versus angular displacement of the bob. This data is given in Figure 15. Similar to the phase plots of speed versus displacement, the force-displacement trajectories all collapse onto a unique exponentially decaying envelope. Smaller radii (Figure 15a, *r* = 0.005 m) correspond to larger angular displacements and larger spectrum of initial drag forces corresponding to a variety of masses over 3 orders of magnitude. In the case of a larger radii (Figure 15c, *r* = 0.025 m), there are smaller angular displacements, comparatively, and the range of initial drag forces is smaller for the same variety of masses.

*Fluids* **2020**, *5*, 48

**Figure 15.** Phase space of drag force (N) versus angular displacement (radians) for a variety of masses in cases of (**a**) *r* = 0.005 m, (**b**) *r* = 0.015 m, and (**c**) *r* = 0.025 m. The data for each case of a specific radius appears to overlap as well as suggesting that as the peaks in angular displacement decay exponentially (see Figure 7), the drag forces also decay exponentially as well.

Finally, we wish to compare force information across all cases of radii and masses considered. The metric we chose to compare for each is the drag coefficient; recall the *CD* term from Equation (7). To justify our use of Equation (7), we verified that the pendulum simulations fell into the appropriate Reynolds number range, i.e., *Re* > 1. Recall the Reynolds number, *Re*, is defined to be

$$\text{Re} = \frac{\rho LV}{\mu},$$

where *ρ* and *μ* are the fluid's density and dynamic viscosity, and *L* and *V* are a characteristic length and velocity scale, respectively. We chose *L* to be the length the pendulum's arm, but rather than select *V* to be a constant, we used the time-dependent speed of the pendulum bob for each case. Note that this speed is inherently a function of the radius of the pendulum bob itself, see Figure A3 in Appendix C. Figure 16a illustrates that for *<sup>m</sup>* <sup>=</sup> <sup>2</sup> <sup>×</sup> <sup>10</sup><sup>3</sup> kg that the peak in Reynolds number is greater than one. Moreover, where *Re* drops down, i.e., at the beginning and end of each swing, is where speed is near zero. Thus we assume the Drag Force Law derived by Lord Rayleigh will suffice for our purposes here (*FD* <sup>∼</sup> *<sup>V</sup>*2, see Equation (7)). Note that as the pendulums continue to swing, *Re*(*t*) will decrease in every case. Eventually there will be a shift into the regime where Lord Rayleigh's Drag Force Law begins to fail and one must consider Stokes Drag Law (*FD* ∼ *V*, see Equation (4)) when *Re* < 1. Furthermore, we computed the time-averaged Reynolds number over the first swing of the pendulum for all masses and radii considered, see Figure 16b. Generally, as the mass of the bob increases, the average *Re* increases. On the other hand, as the radius increases, average *Re* decreases.

**Figure 16.** (**a**) Re vs. Time for *<sup>m</sup>* <sup>=</sup> <sup>2</sup> <sup>×</sup> <sup>10</sup><sup>3</sup> kg and (**b**) a colormap depicting the temporally-averaged Reynolds number during the first swing for different masses and radii. Note that over time as the pendulum slows down, the average Reynolds number will decrease.

Upon calculating Equation (7), we also need to describe *A*, the cross-sectional area of the circular pendulum, and *FD*, the drag force on the pendulum. We define *A* to be the circumference of the circle, i.e., *A* = 2*πr*, for each given radius, *r*. Having computed the speed of the pendulum bob and drag force on the body previously, we could solve for the time-dependent drag coefficient *CD* at each sample time-point using Equation (7). Figure 17a,b gives *CD*(*t*) over the first swing (half an oscillation cycle) and 4 swings (2 full oscillation cycles), respectively, for the case of *<sup>m</sup>* <sup>=</sup> <sup>1</sup> <sup>×</sup> <sup>10</sup><sup>3</sup> and a variety of radii. Note that the *CD* peaks correspond to when the pendulum bob changes direction and thus reach speeds near zero. Moreover, the time-dependent drag coefficients will increase over time due to the pendulum continually slowing down.

**Figure 17.** The drag coefficient, *CD*, during the first pendulum's first swing (**a**) and first 4-swings (**b**) for a variety of radius in the case of *m* = 1*e*3 kg. Note that the drag coefficient maximizes when the pendulum reaches near zero speed at the end of a swing. (**c**) The temporally-averaged drag coefficients across the first swing for all mass and radius cases considered. (**d**) A contour map of the temporally-averaged drag coefficients over the first swing from (**c**) as a function of both the pendulum bob's mass and radius. Generally higher drag coefficients are seen for larger mass and size pendulum bobs.

In order to compare all simulations of differing mass and radii, we averaged the time-dependent drag coefficient, *CD*(*t*), over the first swing, as shown in Figure 17c,d. Figure 17d provides a colormap with contour lines of the time-averaged drag coefficient using the data from Figure 17c. Larger radii pendulums tend to have larger drag coefficients and higher mass pendulum bobs also have larger drag coefficients. Notice that a pendulum bob with same shape (circle) and size (radius) could elicit different drag coefficients based on variations in mass. From Section 3.2 we have already observed that variations in mass give rise to variations in speed, which is also required to compute *CD* in the first place. The system is highly coupled in its many dynamical features!

Finally, we highlight the relationship between drag coefficient, *CD*, and Reynolds number, *Re*, in Figure 18. For a given mass (mass is denoted by a particular shape in the figure), as average *Re* increases, *CD* decreases. As the system is highly coupled, for a given mass, the average *Re* only increases as the pendulum bob's radius decreases (radius is denoted by the colormap). On the other hand, for a given radius, as the mass increases, the average *CD* and *Re* also generally increase. The overall trend of decreasing *CD* with increasing *Re* is common in many fluid dynamics phenomena, not only physical experiments, such as flow past rigid objects [57–59], but also in biology, such as tiny insect flight [50,51], or even sports such as baseball [60], American football [61], or football (soccer) [62].

**Figure 18.** The average drag coefficient, *CD*, vs. average Reynolds number for a variety of masses and radii. The averages were computing over the first swing of the pendulum bob.

#### *3.4. Effect the Pendulum Bob Has onto the Fluid*

In addition to the data analysis performed in Sections 3.1–3.3, which focused primarily on the Lagrangian structure itself—the pendulum, CFD (FSI) simulations grant us the opportunity to analyze how the underlying fluid reacts to a pendulum swinging through it. Also, we are able to visualize the fluid dynamics and observe the evolution of the fluid's velocity field, **u**(**x**, *t*), magnitude of velocity, |**u**(**x**, *t*)|, and vorticity (∇ × **u**(**x**, *t*)), see Figure 19, for qualitative analysis.

Figure 19 shows the resulting fluid dynamics due to the swinging motion of the pendulum bob of mass, *<sup>m</sup>* <sup>=</sup> <sup>5</sup> <sup>×</sup> <sup>10</sup><sup>2</sup> kg, and radius, *<sup>r</sup>* <sup>=</sup> 0.0175 m, during its first swing. As the pendulum swings, there is a pocket of fast moving fluid directly behind the bob until it passes zero angular displacement, as shown by the Velocity Field and Magnitude of Velocity plots. Note that streamlines are presented on the velocity field's plots and contours are given on the magnitude of velocity plot, as well. Streamlines illustrate the path of massless tracer particles in the flow at an instantaneous point in time, while the contours give a line in which the quantity (here, magnitude of velocity) has constant value. Note that the direction of the pocket of fast moving fluid is towards the pendulum bob; hence objects directly behind the moving bob receive an fluid dynamic benefit. This phenomenon is commonly called *drafting* and has been studied in the context of many sports, such as ice skating [63], running [64,65], swimming [66], or cycling [67], as well as biological locomotion [68–72].

Within the region of fast moving fluid there are two interacting, oppositely spinning vortices behind the bob, as illustrated by the Vorticity plots. When the vortices are shed off the bob entirely, i.e., once the bob swings past zero displacement, the vortex pair continues to move vertically downwards, rather than upwards and to the right with the bob. These visualizations were produced using the raw .*vtk*-data produced during the FSI simulations using the open-source software VisIt [56]. We wish to

emphasize that one cannot gain knowledge of fluid dynamics from the reduced-order ODE model, Equation (3) alone.

Moreover, because of the FSI simulations we are able to analyze the fluid data further and determine regions that have varying levels of fluid mixing. During data post-processing we could compute the finite-time Lyapunov exponents (FTLE), which can be used characterize the rate of separation in the trajectories of two infinitesimally close fluid blobs. Maxima in the FTLE (called *ridges*) have been used to determine Lagrangian Coherent Structures (LCSs), which are used to determine distinct flow structures in the fluid [73–76]. LCSs are a tool to divide the fluid's complex dynamics into distinct regions to better understand transport properties of flow [77–79]. In this paper, we computed forward-time FTLE field, whose maximal ridges give LCSs corresponding to regions of repelling fluid trajectories and low values give rise to regions of attraction [76]. This data is presented in Figure 19 as well. Our desire here is not to emphasize fluid mixing metrics, but merely point out that through CFD one is able to investigate deeper dynamics of a system, even one as well studied as a damped pendulum.

**Figure 19.** Colormaps (and its contours) illustrating the time evolution of the fluid's vorticity, magnitude of velocity, and finite-time Lyapunov Exponent (FTLE), as well as the velocity field (and its streamlines) resulting from the pendulum bob's first swing in the case of *<sup>m</sup>* <sup>=</sup> <sup>5</sup> <sup>×</sup> <sup>10</sup><sup>2</sup> kg and *r* = 0.0175 m.

Furthermore, we wish to point out that the resulting fluid dynamics are diverse. Not every pendulum bob sheds vortices in the same way as the case of (*m*,*r*)=(<sup>5</sup> <sup>×</sup> 102 kg, 0.0175 m) (as illustrated in Figure 19). Figure 20 illustrates differences in vortex formation and shedding for the case of *<sup>m</sup>* <sup>=</sup> <sup>5</sup> <sup>×</sup> <sup>10</sup><sup>2</sup> kg for a variety of radii during the first swing. In particular the overall size and magnitude of vortices formed is less in the smaller radius cases; however, once shed, the vortex dynamics are different. This would give rise to different dynamics in drafting behind the bob. In the larger radius cases (*r* > 0.015 m), the vortices move vertically downwards upon being shed, while they are significantly different in the smaller cases: in the *r* = 0.010 m case, the vortex-pair travels along with the pendulum bob, and for *r* = 0.005 m two sets of vortex-pairs move on either side of the pendulum bob. Thus, modifying the size of the pendulum results in different dynamics of the underlying fluid, even though all pendulums swing along the same circular arc; however, they do so at different speeds.

**Figure 20.** Comparing vortex dynamics among pendulum bob of different radii for a mass of *<sup>m</sup>* <sup>=</sup> <sup>5</sup> <sup>×</sup> <sup>10</sup><sup>2</sup> kg.

Lastly, taking a further look at the *<sup>r</sup>* <sup>=</sup> 0.010 m case (for *<sup>m</sup>* <sup>=</sup> <sup>5</sup> <sup>×</sup> 102 kg) reveals that as the pendulum swings, it swings back through its vortex wake, see Figure 21. The act of swinging through its vortex wake has been suggested as a possible mechanism for increased fluid drag on the pendulum bob [25]. However, a vortex could also enhance the speed of the bob if an appropriately spinning vortex interacts with the bob at the right moment in time, see *t* = 1.0 s in Figure 21. The vortex in red is spinning counter-clockwise and may give the bob a boost in speed, as it is moving in that same direction. There are complex interwoven dynamics within the system. Figure 22 provides an additional sequence of snapshots depicting these complex interactions. It shows a pendulum bob (*<sup>m</sup>* <sup>=</sup> <sup>1</sup> <sup>×</sup> <sup>10</sup><sup>4</sup> kg, *<sup>r</sup>* <sup>=</sup> 0.005 m) swinging through its own vortex wake during the return swing of the first oscillatory cycle. Such complex interaction mechanisms have not been fully explored and warrant further attention from the scientific community.

**Figure 21.** The vortex dynamics of the case (*m*,*r*) = (5 <sup>×</sup> 102 kg, 0.0175 m) within the first 2 s of oscillation.

**Figure 22.** The vortex dynamics of the case (*m*,*r*) = (1 <sup>×</sup> <sup>10</sup><sup>4</sup> kg, 0.005 m) on the return swing during its first oscillatory cycle.

#### *3.5. Numerical Comparison & Validation*

Lastly, in this section we will compare and validate the canonical damped physical pendulum equation against our fluid-structure interaction (FSI) model. Recall the damped physical pendulum (Equation (3)) is given by

$$\frac{d^2\theta}{dt^2} + \frac{b}{I}\frac{d\theta}{dt} + \frac{mgL}{I}\sin\theta = 0.\tag{13}$$

Our FSI model did not assume any knowledge of the existence of this reduced ordinary differential equations (ODE) model. Instead it placed a spherical (circular) pendulum bob into a fluid environment, tethered it to a fixed location, and under the influence of gravity alone, it swung. This was to mimic a physical experiment, but performed *in silica*, rather than in a laboratory setting.

To compare Equation (13) and our FSI model, we first matched the parameter values. For each radius and mass considered in our FSI model, we were able to compute the exponential decay of the peaks in angular displacement amplitude using a linear least squares framework to fit a line through the logarithm of the peak values in angular displacement over time (linear regression), see Figure 23a as an illustrative example. The slope of each line was *<sup>γ</sup>* <sup>=</sup> <sup>−</sup> *<sup>b</sup>* <sup>2</sup>*<sup>I</sup>* for that particular mass and radius. Hence for the parameter *b*/*I* in Equation (13), we multiplied each slope *γ* by −2, i.e.,

$$b/I = -2\gamma.\tag{14}$$

Note that the term *mgL <sup>I</sup>* is the approximate natural (undamped) angular frequency squared of the pendulum bob, i.e.,

$$
\omega\_N^2 = \frac{mgL}{I}.\tag{15}
$$

Note that this is not the true natural, undamped angular frequency, as we did not invoke the small angle approximation, i.e., sin *θ* ≈ *θ* for small *θ* [9]. Moreover, due to the presence of the fluid, the pendulum bob was not in an undamped setting, so we could not directly calculate *ω<sup>N</sup>* from our numerical experiments. However, we used the pendulum's period, as previously computed in Section 3.1, and *γ* to compute *ω*<sup>2</sup> *<sup>N</sup>*. For clarity with the undamped case, we define *TD* and *ω<sup>D</sup>* to the damped period and angular frequency of our pendulum bob from the FSI experiments. Recall the relationship between *ω<sup>D</sup>* and *TD*,

$$
\omega\_{\rm D} = \frac{2\pi}{T\_{\rm D}}.\tag{16}
$$

and the relationship between *ωN*, *ωD*, and *γ*,

$$
\omega\_D^2 = \omega\_N^2 - \gamma^2. \tag{17}
$$

Hence using Equations (15)–(17), we can compute the natural (undamped) angular frequency,

$$
\omega\_N^2 = \frac{mgL}{I} = \frac{4\pi^2}{T\_D^2} + \gamma^2. \tag{18}
$$

Using Equations (14) and (18) we found the appropriate parameter values for the reduced ODE model, as computed from the FSI simulations. Figure 23b–f provides comparison of the FSI and ODE models' angular displacement over time for a variety of pendulum bob radii and masses. For comparative purposes, the ODE model was initialized at the 5th peak of the FSI model and its solution was computed by propagating both forwards and backwards in time. Moreover we also plotted the exponential decay, using the 5th peak amplitude as the coefficient, and plotted it in a similar manner both forwards and backwards in time from the 5th peak.

**Figure 23.** (**a**) Slopes of the least squares (linear regression) fits through the peaks of angular displacement over time to compute the exponential decay, *<sup>γ</sup>* <sup>=</sup> <sup>−</sup> *<sup>b</sup>* <sup>2</sup>*<sup>I</sup>* , for a variety of radii in the *<sup>m</sup>* <sup>=</sup> <sup>5</sup> <sup>×</sup> <sup>10</sup><sup>3</sup> kg case. (**b**–**f**) Comparison of the FSI and ODE models' angular displacement over time for a variety of masses and radii.

We note that the ODE model only agrees with our FSI model after a few oscillations. If we propagated the ODE model forward in time from the original position of the FSI pendulum, angular displacements were not consistent, see Figure 24. Furthermore, if we used the first peak amplitude (initial angular displacement) as the coefficient on the exponential decay, the FSI model appeared to not agree with its own decay rate. However, the decay rates are consistent, as seen in Figure 23, but the FSI model does not start obeying such decay until after a few oscillations. Thus, the decay rates, *<sup>γ</sup>* <sup>=</sup> <sup>−</sup> *<sup>b</sup>* <sup>2</sup>*<sup>I</sup>* , were calculated starting with the third peak rather than the initial displacement. We chose the third peak rather than the second as the linear relationship was more prevalent from that point on. This is the same phenomenon from Figure 11 in Section 3.2, where the exponential decay in peak speeds did not start until what appeared to be the second swing.

**Figure 24.** Depicting the dynamics if the ODE model started from the original angular displacement of the FSI pendulum rather than the the 5th peak for the case (*m*,*r*)=(<sup>5</sup> <sup>×</sup> <sup>10</sup><sup>2</sup> kg, 0.005 m) and (*m*,*r*)=(<sup>5</sup> <sup>×</sup> <sup>10</sup><sup>3</sup> kg, 0.015 m) for (**a**,**b**), respectively. A visualization of the exponential decay is also provided with the coefficient either being *A*0, the original angular angular displacement, or *A*5, the displacement of the 5th peak.

Overall the reduced ODE model agrees with the FSI model after a few oscillations for different size pendulum bob radii as well as over a spectrum of masses; however, the dynamics during the first swing are substantially different (see Figure 24). This is possibly due to a different fluid drag law on the pendulum, before it settles into the regime where the drag can be modeled as linearly proportional to its velocity.

Lastly, we computed the damping parameter, *b*, by itself as a function of the mass and radius of the pendulum bob. To compute this, we first found the effective moment of inertia of each case using Equation (18), e.g.,

$$I = \frac{mgL}{\frac{4\pi^2}{T\_D^2} + \gamma^2},\tag{19}$$

and then calculated

$$b = -2\gamma I.\tag{20}$$

Note that we chose to calculate the effective moment of inertia, *I*, for each simulation here. Our FSI pendulum bob geometry was not a solid structure, but rather, had a singular mass source at its center, a shell composed of neutrally-buoyant points tethered together, and from the IB formulation, its shell enclosed fluid within. This fluid had the same properties as the backward fluid environment in which the pendulum is immersed. Moreover, as the *principle of added mass* states that inertia is added to a fluid system when an object is accelerating (or decelerating) through it [21]. Thus, the appropriate moment of inertia, *I*, to use in the ODE model to match the FSI model is non-trivial.

The damping parameter, *b*, increased as mass increased for a particular radius, see Figure 25. Moreover, as the radius increased for a given mass, *b* increased as well. While the system appears to be more sensitive to changes in mass, recall that the mass was varied over two orders of magnitudes in value, while the radius varied roughly over one.

**Figure 25.** Values of the damping parameter, *b*, as a function of the mass and radius of the pendulum bob.

#### **4. Discussion and Conclusions**

Two-dimensional immersed boundary simulations were used to model the swinging motion of a circular pendulum bob under the influence of gravity that was contained within a viscous, incompressible fluid environment. In addition, to the authors knowledge, this is the first fluid-structure interaction (FSI) simulation that explores the motion of an ordinary pendulum system which also offers an open-source complement. The angular displacement data collected from the motion of the pendulum bob was directly compared against the reduced-order damping ODE model that is familiar to most STEM students. In general, the oscillatory dynamics agree between the ODE model and the FSI model (see Section 3.5). However, there were discrepancies in the decay rates between the first few swing's maximal angular displacements and speeds with those following thereafter. The mechanisms underlying these observations are not fully understood. There appear to be interesting dynamics to probe further involving the pendulum bob's mass and radius, the vortex wake it creates, the interactions of those vortices with each other and the bob itself, and the resulting drag on the bob during large amplitude oscillations.

Moreover, the ODE model's linearly-proportional-velocity damping term, i.e., Stokes Drag Law, was appropriate once the pendulum has swung a few times after a large initial angular displacement. During the first initial swing there was an enhancement of drag on the pendulum bob, potentially obeying Rayleigh's Drag Law and/or from the principle of added mass [24] and/or other complex drag mechanisms involving vortex shedding [23].

Furthermore, we were able to determine the approximate damping parameter, *b*, that fit the ODE model from our FSI model. The damping parameter, *b*, was found to be dependent on both mass and radius of the pendulum (see Section 3.5). Also, through the FSI simulations, we were able to quantify how the drag coefficient, *CD*, increased with both increasing mass and radius of the pendulum bob (see Section 3.3). On that note, we also illustrated how the pendulum's period of oscillation was a function of both the mass and radius of the pendulum bob (see Section 3.1).

Beyond the dynamics of the pendulum structure itself, using a FSI model allowed us to peek into the resulting dynamics of the underlying fluid (see Section 3.4), which we would not have otherwise been able to do using the reduced-ODE model alone. While the purpose of this study was to explore the dynamics of a FSI pendulum model, which inherently did not assume any particular drag laws *a priori*, and compare the results to the ODE model, this framework can be used to further probe into new scientific frontiers that could unravel the enhanced contributions to fluid drag, whether from traveling through your own vortex wake, as suggested by Mathai et al., 2019 [25], vortex shedding as suggested by Bolster et al., 2010 [23], or added mass, as suggested by Bandi et al., 2013 [24]. There are still complex relationships to decrypt between oscillatory amplitude, geometry and mass of the pendulum bob, and fluid scale. The aforementioned studies performed physical experiments with pendulums and used sophisticated visualization techniques, either the Baker electrolytic technique [80] or *particle image velocimetry* (PIV) [81,82] to visualize the underlying fluid dynamics.

Furthermore, using this FSI framework it is possible to couple multiple pendulum together and study the resulting complex dynamics and/or investigate the motion of a variety of geometric objects being swung as pendulum bobs. We have seen that the size of the pendulum bob affects the underlying fluid dynamics, as observed through different vortex dynamics in Section 3.4; however, one has yet to explore how shape affects the fluid dynamics or motion of the bob itself.

While a student's first brief foray into fluids may have been through the concept of damped simple harmonic motion involving a pendulum, we hope that this manuscript provides the following context for students in an introductory fluid mechanics course:


To conclude, the pendulum may be an old, historic device that has been studied for millennia; however, under the hood, there are a lot of hidden, complex dynamics left to discover.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2311-5521/5/2/48/s1.

**Author Contributions:** Conceptualization, M.M. and N.A.B.; Methodology and Software, N.A.B.; Validation, N.A.B.; Formal Analysis, Investigation, and Data Curation, M.M. and N.A.B.; Writing—Original Draft Preparation, N.A.B.; Writing—Review & Editing, M.M. and N.A.B.; Visualization, M.M. and N.A.B.; Funding Acquisition, N.A.B. All authors have read and agreed to the published version of the manuscript.

**Funding:** Computational resources were provided by the NSF OAC #1826915 and the NSF OAC #1828163. Support for N.A.B. was provided by the TCNJ Support of Scholarly Activity (SOSA) Grant, the TCNJ Department of Mathematics and Statistics, and the TCNJ School of Science.

**Acknowledgments:** The authors would like to thank Christina Battista, Robert Booth, Karen Clark, Jana Gevertz, Christina Hamlet, Alexander Hoover, Laura Miller, Matthew Mizuhara, Arvind Santhanakrishnan, Emily Slesinger, Edward Voskanian, and Lindsay Waldrop for comments and discussion.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **Appendix A. Instructor Resources**

#### **Teaching Resources:**

Associated supplemental files contain slides, movies, and open-source codes pertaining to the paper. It encompasses the following:


$$\text{\textbf{1B2d}} \rightarrow \texttt{\textbf{nat}} \\ \texttt{TB2d} \rightarrow \texttt{\textbf{Examples}} \rightarrow \texttt{\textbf{Examples}} \\ \texttt{\textbf{Educat:}} \text{on} \rightarrow \texttt{\textbf{Pendu}} \\ \texttt{\textbf{1um}} \rightarrow \texttt{\textbf{1}}$$

4. Visualization software used: VisIt (https://visit.llnl.gov/) (v. 2.12.3)

#### **Appendix B. Immersed Boundary Method**

The immersed boundary method [30] was used to model the motion of a pendulum under gravitational acceleration, see Section 2. Although, IB is capable of solving fully coupled fluid-structure interaction systems involving flexible or squishy structures, here we use it to model the stiff boundaries of a pendulum bob immersed within an incompressible, viscous fluid. The fluid motion is governed by the Navier-Stokes equations, given as

$$
\rho \left( \frac{\partial \mathbf{u}(\mathbf{x},t)}{\partial t} + \mathbf{u}(\mathbf{x},t) \cdot \nabla \mathbf{u}(\mathbf{x},t) \right) = -\nabla p(\mathbf{x},t) + \mu \Delta \mathbf{u}(\mathbf{x},t) + \mathbf{f}(\mathbf{x},t) \tag{A1}
$$

$$\nabla \cdot \mathbf{u}(\mathbf{x}, t) = 0,\tag{A2}$$

where **u**(**x**, *t*)=(*u*(**x**, *t*), *v*(**x**, *t*)) is the fluid velocity, *p*(**x**, **t**) is the pressure, **f**(**x**, *t*) is the force per unit volume (area in 2*D*) applied to the fluid by the immersed boundary, i.e., the pendulum. The independent variables are the position, **x** = (*x*, *y*), and time, *t*. Equations (A1) and (A2) are conservation laws for the fluid, i.e., the conservation of momentum and mass, respectively. Note that Equation (A2) is known as the *incompressibility* condition.

The interaction equations between the fluid and the immersed structure are given by

$$\mathbf{f}(\mathbf{x},t) = \int \mathbf{F}(r,t)\delta(\mathbf{x}-\mathbf{X}(r,t))dr\tag{A3}$$

$$\mathbf{U}(\mathbf{X}(r,t),t) = \frac{\partial \mathbf{X}(r,t)}{\partial t} = \int \mathbf{u}(\mathbf{x},t) \delta(\mathbf{x} - \mathbf{X}(r,t)) d\mathbf{x},\tag{A4}$$

where **X**(*r*, *t*) gives the Cartesian coordinates at time *t* of the material point labeled by Lagrangian parameter *r*, **F**(*r*, *t*) is the force per unit area imposed onto the fluid by elastic deformations in the boundary, as a function of the Lagrangian position, *r*, and time, *t*. Equation (A3) applies a force from the immersed boundary to the fluid grid through a delta-kernel integral transformation. Equation (A4) sets the velocity of the boundary equal to the local fluid velocity.

As suggested in Section 2.2, the deformation force equation, **F**(r,t), is specific to the system being explored. For this pendulum model, it takes the following form

$$\mathbf{F}(r,t) = \mathbf{F}\_{spr} + \mathbf{F}\_{target} + \mathbf{F}\_{Mass} \tag{A5}$$

that is the summation of forces arising from spring, target point, and massive point deformations pertaining over each Lagrangian point being modeled with one or more of this model features.

#### *IB Algorithm*

In our pendulum model, we imposed periodic boundary conditions on a square domain. To solve Equations (A1)–(A4) we need to update the velocity, pressure, and both the position of the boundary and forces acting on it from the previous time-step data, time *n*. IB traditionally does this in the following steps [30,39]:

**Step 1:** Calculate the force density, **F***<sup>n</sup>* on the immersed boundary, from its current boundary configuration at time *n*, **X***n*.

**Step 2:** Use Equation (A3) to spread the force from the Lagrangian boundary to the Eulerian (fluid) mesh to compute **f***<sup>n</sup>*

**Step 3:** Solve the Navier-Stokes equations, A1 and A2, on the Eulerian grid, thus updating **u***n*+<sup>1</sup> and *pn*+<sup>1</sup> from **u***n*, *pn*, and **f***n*.

**Step 4:** Update the Lagrangian point positions, **X***n*<sup>+</sup>1, using the local fluid velocities, **U***n*<sup>+</sup>1, computed from **u***n*+<sup>1</sup> and (A4).

We quickly note that to approximate the integrals in Equations (A3) and (A4), discretized (and regularized) delta functions were used. We chose to use the delta functions described in [30], i.e., *δh*(**x**),

$$
\delta\_h(\mathbf{x}) = \frac{1}{h^3} \phi\left(\frac{\mathbf{x}}{h}\right) \phi\left(\frac{y}{h}\right) \phi\left(\frac{z}{h}\right),
\tag{A6}
$$

where *φ*(*r*) is defined as

$$\phi(r) = \begin{cases} \frac{1}{8}(3 - 2|r| + \sqrt{1 + 4|r| - 4r^2}), & 0 \le |r| < 1\\ \frac{1}{8}(5 - 2|r| + \sqrt{-7 + 12|r| - 4r^2}), & 1 \le |r| < 2\\ 0 & 2 \le |r|. \end{cases} \tag{A7}$$

#### **Appendix C. Additional Pendulum Data**

In this appendix we provide complementary data to the data presented in Section 3, e.g., if we provided a figure that contained a variety of masses for a particular radius (as in Figure 6), here we will provide the opposite—a variety of radii for particular cases of mass (as in Figure A1 below). These figures are provided for additional clarity in regards to the comparisons being discussed and analyzed.

First we provide the angular displacement (in radians) over time (in seconds) for cases of pendulums with the same mass, but different radii in Figure A1. This data is to illustrate clearly that pendulum bobs with the same mass can experience different oscillatory patterns for different radii. Moreover, it appears that by increasing mass orders of magnitude, from 2 <sup>×</sup> <sup>10</sup><sup>2</sup> kg to 2 <sup>×</sup> <sup>10</sup><sup>3</sup> kg to 1 <sup>×</sup> <sup>10</sup><sup>4</sup> kg could result in the pendulum bob undergoing different regimes of oscillation—either underdamped to overdamped.

**Figure A1.** Depicting the angular displacement (radians) vs. time (s) for pendulums with the same mass but different radii. (**a**–**c**) give data for a specific mass, either *<sup>m</sup>* <sup>=</sup> <sup>1</sup> <sup>×</sup> <sup>10</sup><sup>4</sup> kg, 2 <sup>×</sup> <sup>10</sup><sup>3</sup> kg, or <sup>2</sup> <sup>×</sup> 102 kg, respectively, and a variety of radii in each.

Next we provide a plot of the height the pendulum reaches (in meters) as a function of the peak number in angular displacement for *<sup>m</sup>* <sup>=</sup> <sup>1</sup> <sup>×</sup> <sup>10</sup><sup>4</sup> kg and a variety of radii in Figure A2. This illustrates that as the radius increases, the height decreases. Not only does the height decreases as the radius increases, the linear speed of the pendulum also decreases as well, as given in Figure A3. Our simulations suggest that smaller pendulum bobs generally move faster than larger ones for a given mass. In both of these figures, among all cases, both the speed and height decay exponentially as illustrated by the semi-logarithmic plots in Figures A2b and A3b. These data are provided to suggest that as the size of the pendulum increases, there must be more drag force acting on the bob to decelerate their speed and thus not allow them to reach as great of heights (angular displacements) as other smaller bobs.

**Figure A2.** (**a**) Plot illustrating the decay of the height (m) that the pendulum bob reaches as the pendulum continues to swing for the case of *<sup>m</sup>* <sup>=</sup> <sup>1</sup> <sup>×</sup> 104 kg for a variety of radii. The peak amplitude decays exponentially as illustrated by the linear relationship between the logarithm of the amplitude against peak number, as shown in (**b**).

**Figure A3.** (**a**) Plot depicting the linear speed of the pendulum bob against non-dimensional time given as the # of swings (half a full displacement cycle) for the case of *<sup>m</sup>* <sup>=</sup> <sup>1</sup> <sup>×</sup> 103 kg for a variety of radii. Speed peaks in the middle of a swing corresponding to when the pendulum has zero angular displacement from the vertical and the peak speed appears to decay exponentially, given by the linear relationship in (**b**).

Furthermore we also provide more detailed phase space explorations of linear speed (m/s) versus angular displacement (radians) in Figures A4 and A5. Figure A4 provides the phase space of linear speed versus angular displacement for the case of *<sup>m</sup>* <sup>=</sup> <sup>5</sup> <sup>×</sup> 103 kg and a variety of radii, while Figure A5 selects four radii (*r* = 0.001 m,*r* = 0.005 m,*r* = 0.0015 m, and *r* = 0.025 m) and varies mass. Similar topological structures are observed, where the data collapses onto a parabolically-capped cone. This is intuitive as both the peaks in angular displacement and speed decrease over time; however, what is particularly interesting is that the cone angle looks to be approximately conserved among all cases.

**Figure A4.** (**a**) Phase space of linear speed of the pendulum bob vs. angular displacement (radians) for a variety of radii in the case of *<sup>m</sup>* <sup>=</sup> <sup>5</sup> <sup>×</sup> 103 kg. (**b**) A closer look at the last simulated cycle's phase space for each case.

**Figure A5.** Phase space of linear speed of the pendulum bob vs. angular displacement (radians) for a variety of masses for cases: (**a**) *r* = 0.001 m, (**b**) *r* = 0.005 m, (**c**) *r* = 0.015 m, and (**d**) *r* = 0.025 m.

Finally we provide data depicting the drag force (*N*) over time for 3 different radii (*r* = 0.015 m, *r* = 0.020 m, and *r* = 0.025 m) over a variety of masses. Compared to Figure 14, we notice that for the same radius but different masses, the drag forces begin to overlap with time. Specifically, the drag forces corresponding to larger masses decay more rapidly. This could also be surmised from Figure 17c,d, which show an increased average drag coefficient during the higher mass cases for a specific radius.

**Figure A6.** Drag forces (N) over time in seconds for a variety of masses for cases with (**a**,**d**) *r* = 0.015 m, (**b**,**e**) *r* = 0.020 m, and (**c**,**f**) *r* = 0.025 m. The semi-log data is provided in (**d**–**f**) to highlight a linear relationship between the logarithm of the drag force and time. This linear relationship suggests an exponential decay in drag force over time.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
