**Citation:** Hao, H.; Jiang, J.; Wang, T.; Liu, H.; Lin, P.; Zhang, Z.; Niu, B. Deep Parallel Optimizations on an LASG/IAP Climate System Ocean Model and Its Large-Scale Parallelization. *Appl. Sci.* **2023**, *13*, 2690. https://doi.org/10.3390/app13042690

Academic Editors: Antonio J. Nebro and Juan A. Gómez-Pulido

Received: 31 January 2023 Revised: 13 February 2023 Accepted: 16 February 2023 Published: 19 February 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

At present, changes in the global climate and the ecological environment are among the most important scientific problems. The ocean, as a vital part of the global climate system, has become an active research topic. Ocean models have developed and improved considerably since they were first developed in 1969 [1], and various global ocean models now exist. HYCOM [2], NEMO [3], MOM [4], and LICOM [5] are global ocean models used for research and production. Among them, MOM, HYCOM, and NEMO were developed in Europe and the USA, whereas LICOM was developed by scientists from the Institute of Atmospheric Physics, Chinese Academy of Sciences (IAP-CAS) [6]. LICOM is widely used for climate simulation and prediction, as well as for the numerical simulation and prediction of air–sea coupling. Different editions of LICOM served as the ocean components of three coupled air–sea models in phase 6 of the Coupled Model Intercomparison Project (CMIP6) [7,8]: the Flexible Global Ocean–Atmosphere–Land System model version 3 [9] with a finite-volume atmospheric model (FGOALS-f3) [10], the Flexible Global Ocean–Atmosphere–Land System model version 3 with a grid-point atmospheric model (CAS FGOALS-g3) [11], and the Chinese Academy of Sciences Earth System Model (CAS-ESM) [12].

LICOM is an important tool for investigating and forecasting ocean circulation and its mechanisms. Meanwhile, as a vital part of climate system models (CSMs) and Earth system models (ESMs) [13], its performance may considerably affect climate change simulations [14,15]. As model resolution increases, the enormous computation, communication, and input/output (I/O) demands become significant scientific and engineering challenges [16]: running a model takes either more time or more computing resources.

High-performance computing is widely used in many areas, including materials science [17], electrochemistry [18], and meteorology, and has become a very powerful tool in scientific research [19]. Because meteorological simulations must finish within tight time limits, parallel optimization on high-performance computers is very important for such models. Most meteorological systems suffer from low simulation speed and an inability to utilize large-scale machines, and researchers have carried out optimization work to tackle these issues. Optimizations of NEMO improved its overall performance by 31% [20]. The Princeton ocean model (POM) [21] was ported to Sunway TaihuLight [22], a very powerful supercomputer; the new edition, swPOM, was 2.8 times faster than on conventional supercomputers and scaled up to over 250,000 cores [23]. In addition to oceanic models, researchers have optimized atmospheric models such as IAP-AGCM [24]. The optimized code scaled up to 196,608 CPU cores, attaining a speed of 11.1 simulation years per day (SYPD) at a high resolution of 25 km [25].

In this study, in order to improve the simulation speed of LICOM, we implemented a series of optimizations and achieved a considerable speedup over the original version: the fully optimized edition of LICOM was twice as fast as the original edition. The performance experiments were conducted on the "Era", "Tianhe II", and "Tianhe III" supercomputers. Although we have also performed optimizations on GPUs with OpenACC [26], CUDA [27], and HIP [28], the CPU version is still widely used. For instance, many machines, such as "Tianhe III", are still pure CPU machines. Moreover, since the CPU version of LICOM is used as the ocean component of various coupled Earth system models, it is necessary to deeply optimize the CPU version.
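The SYPD metric quoted above is the ratio of simulated time (in years) to wall-clock time (in days). The following minimal sketch shows the arithmetic; the function name `sypd`, the 365-day model calendar, and the sample timing are illustrative assumptions, not values from this study.

```python
SECONDS_PER_DAY = 86_400.0
DAYS_PER_YEAR = 365.0  # assumption: a 365-day model calendar


def sypd(simulated_days: float, wall_clock_seconds: float) -> float:
    """Return simulated years per wall-clock day (SYPD)."""
    if wall_clock_seconds <= 0:
        raise ValueError("wall_clock_seconds must be positive")
    simulated_years = simulated_days / DAYS_PER_YEAR
    wall_clock_days = wall_clock_seconds / SECONDS_PER_DAY
    return simulated_years / wall_clock_days


# Hypothetical timing: one model year simulated in two wall-clock hours.
example = sypd(simulated_days=365.0, wall_clock_seconds=2 * 3600.0)
print(f"{example:.1f} SYPD")  # prints "12.0 SYPD"
```

Under this definition, halving the wall-clock time of a run doubles its SYPD, which is how a twofold speedup would show up in the metric.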

The rest of this paper is organized as follows. Section 2 introduces LICOM and its control flow, along with its parallel and communication algorithms. Section 3 details the optimizations that we implemented in LICOM. Section 4 describes the experimental setup and the performance of the optimized model. Finally, Section 5 draws conclusions concerning our optimization work on LICOM.

#### **2. The LICOM Model**

LICOM solves the Navier–Stokes (N-S) equations. An ocean circulation model can simulate ocean temperature, salinity, velocity, and sea surface height under given initial and boundary conditions. The simulation results can be used as the lower boundary conditions of atmospheric models and sea ice models, and they can provide boundary conditions for regional ocean models. In addition, the results provide marine environment variables that are evenly distributed in space and continuous in time, which helps fill the gaps in the currently uneven observation data. This is beneficial for understanding the physical mechanisms of oceanographic processes.

Starting from the original N-S equations, LICOM uses a finite-difference discretization of the model equations that conserves the energy and volume transferred during the discrete process. Since marine stratification and mixing vary considerably in the vertical direction, the vertical is divided into layers. LICOM uses a free surface, so the sea surface fluctuates, and the solution includes both high-speed surface gravity waves and low-speed Rossby waves. In order to reduce the amount of calculation, the model splits off the surface wave mode and integrates it with smaller time steps, while it uses larger time steps for the modes that describe the vertical structure. During the integration, the interactions between the two time steps are kept. This method is called "the decomposition and interaction of models" [29]. The cost of the barotropic integration is considerably reduced through this decomposition.

Some processes, such as turbulence, cannot be resolved by LICOM or by other ocean models. Parameterizations are needed to describe these unresolved processes and to represent their impacts. A low-resolution ocean model contains a mesoscale eddy parameterization. However, a high-resolution ocean model (less than 10 km) can resolve mesoscale eddies, so no such parameterization is needed. Currently, LICOM has reached resolutions of 10 km and even higher. It is capable of resolving mesoscale eddies and of simulating them and their impacts well. High-resolution ocean models use a corrected barotropic and baroclinic decomposition algorithm. Meanwhile, an improved double-adjustable and sticky disjunction diffusion scheme can be employed in the horizontal direction in the momentum equation and thermohaline equation, so that mesoscale eddies can be better simulated.

The main control flow of LICOM is shown in Figure 1. The major processes in the integral loop include barotropic, baroclinic, and thermohaline processes, among others, and either the Euler forward or the leapfrog scheme is used.
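The idea of integrating the fast surface wave mode with small time steps inside each large time step can be illustrated with a toy system. The sketch below is not LICOM's equations: the two scalar variables, the decay rates `k_fast` and `k_slow`, the `coupling` constant, and the function names are all hypothetical, chosen only so that the fast variable is unstable under forward Euler at the large step but stable when sub-cycled.

```python
def split_step(fast, slow, dt_slow, n_sub,
               k_fast=100.0, k_slow=1.0, coupling=0.1):
    """Advance a toy fast/slow pair by one large time step dt_slow."""
    dt_fast = dt_slow / n_sub
    fast_sum = 0.0
    for _ in range(n_sub):
        # Fast mode: small forward-Euler steps; the slow state is held fixed.
        fast += dt_fast * (-k_fast * fast + coupling * slow)
        fast_sum += fast
    fast_mean = fast_sum / n_sub
    # Slow mode: one large step, forced by the time-averaged fast state,
    # which keeps the interaction between the two time steps.
    slow += dt_slow * (-k_slow * slow + coupling * fast_mean)
    return fast, slow


def run(n_steps, dt_slow, n_sub):
    """Integrate the toy system from (1, 1) and return the final state."""
    fast, slow = 1.0, 1.0
    for _ in range(n_steps):
        fast, slow = split_step(fast, slow, dt_slow, n_sub)
    return fast, slow


unsplit = run(n_steps=50, dt_slow=0.1, n_sub=1)   # fast mode blows up
split = run(n_steps=50, dt_slow=0.1, n_sub=20)    # fast mode stays bounded
print("no sub-cycling:", unsplit)
print("20 sub-steps:  ", split)
```

In this toy setting, the large step would have to shrink twentyfold for both variables without sub-cycling; with it, only the cheap fast update is repeated, while the slow update runs once per large step.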

**Figure 1.** The control flow of LICOM.
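The two time-stepping schemes named for the integral loop, Euler forward and leapfrog, differ in how many time levels they use. The following sketch compares them on a toy oscillator rather than on LICOM's equations; the test equation, the function names, and the choice of starting the leapfrog scheme with one Euler forward step are assumptions made for illustration.

```python
def rhs(y: complex) -> complex:
    """Toy oscillator dy/dt = i*y; the exact solution keeps |y| = 1."""
    return 1j * y


def euler_forward_step(y_curr: complex, dt: float) -> complex:
    """Two-level scheme: y(n+1) = y(n) + dt * f(y(n))."""
    return y_curr + dt * rhs(y_curr)


def leapfrog_step(y_prev: complex, y_curr: complex, dt: float) -> complex:
    """Three-level scheme: y(n+1) = y(n-1) + 2 * dt * f(y(n))."""
    return y_prev + 2.0 * dt * rhs(y_curr)


def integrate(scheme: str, dt: float, n_steps: int) -> complex:
    """Integrate from y = 1 for n_steps steps with the chosen scheme."""
    y_prev = 1.0 + 0.0j
    # Leapfrog needs two time levels, so the first step uses Euler forward.
    y_curr = euler_forward_step(y_prev, dt)
    for _ in range(n_steps - 1):
        if scheme == "leapfrog":
            y_next = leapfrog_step(y_prev, y_curr, dt)
        elif scheme == "euler":
            y_next = euler_forward_step(y_curr, dt)
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        y_prev, y_curr = y_curr, y_next
    return y_curr


amp_euler = abs(integrate("euler", dt=0.1, n_steps=1000))
amp_leapfrog = abs(integrate("leapfrog", dt=0.1, n_steps=1000))
print(f"Euler forward amplitude: {amp_euler:.3f}")
print(f"Leapfrog amplitude:      {amp_leapfrog:.3f}")
```

For this purely oscillatory problem, the Euler forward amplitude grows steadily, whereas the leapfrog amplitude stays close to 1.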

LICOM supports several output formats, including binary files and NetCDF. Binary files can be converted into NetCDF files, so the results can be easily processed and analyzed with various types of professional software.
