*Article* **A Barotropic Solver for High-Resolution Ocean General Circulation Models**

**Xiaodan Yang 1,2,3, Shan Zhou 4, Shengchang Zhou 1,2,3,5, Zhenya Song 1,2,3,\* and Weiguo Liu 2,5**



**Abstract:** High-resolution global ocean general circulation models (OGCMs) play a key role in accurate ocean forecasting. However, the models used in operational forecasting systems still do not run at high resolution, because high resolution demands a large amount of computation and because parallel efficiency is low. Good scalability is an important measure of parallel efficiency and remains a challenge for OGCMs. We found that the communication cost of the barotropic solver, namely the preconditioned conjugate gradient (PCG) method, is the key scalability bottleneck because of its frequent global reductions. In this work, we developed a new algorithm, a communication-avoiding Krylov subspace method based on PCG (CA-PCG), to improve scalability, and applied it to the Nucleus for European Modelling of the Ocean (NEMO) as an example. PCG requires inner product operations with global communication in every iteration, whereas CA-PCG requires them only once every eight iterations. As a result, the global communication cost decreased from more than 94.5% of the total execution time with PCG to less than 63.4% with CA-PCG. The execution time of the barotropic modes decreased from more than 17,000 s with PCG to less than 6000 s with CA-PCG, and the total execution time decreased from more than 18,000 s to less than 6200 s. In addition, the speedup ratio increased from 3.7 to 4.6. In summary, CA-PCG effectively improved scalability at high process counts compared with PCG, providing a highly effective solution for accurate ocean simulation.

**Keywords:** barotropic solver; PCG; CA-PCG; ocean general circulation model; NEMO
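
To make the synchronization argument in the abstract concrete, the following is a minimal sketch, not the solver used in NEMO and not an implementation of CA-PCG. It runs a textbook Jacobi-preconditioned CG on a small, hypothetical tridiagonal test system and counts the inner products, each of which would be a global reduction in a distributed run. It then computes how many synchronization rounds would remain if reductions were needed only once every eight iterations, as the abstract states for CA-PCG. The test operator, the tolerance, and the names `make_diag`, `matvec`, `pcg`, and `S_STEP` are assumptions made for illustration.

```python
import math

S_STEP = 8  # from the abstract: CA-PCG synchronizes once every eight iterations


def make_diag(n):
    # Hypothetical SPD test operator: strictly diagonally dominant tridiagonal matrix
    # with this diagonal and -1 on both off-diagonals.
    return [2.5 + 0.1 * (i % 5) for i in range(n)]


def matvec(diag, x):
    # y = A x for the tridiagonal operator described above.
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        y[i] = diag[i] * x[i]
        if i > 0:
            y[i] -= x[i - 1]
        if i < n - 1:
            y[i] -= x[i + 1]
    return y


def pcg(diag, b, tol=1e-10, max_iter=1000):
    """Textbook PCG with a Jacobi (diagonal) preconditioner.

    Returns the solution, the iteration count, and the number of inner
    products, each standing in for one global reduction.
    """
    syncs = 0

    def dot(u, v):
        nonlocal syncs
        syncs += 1  # in a distributed run this would be a global reduction
        return sum(ui * vi for ui, vi in zip(u, v))

    x = [0.0] * len(b)
    r = list(b)
    z = [ri / di for ri, di in zip(r, diag)]
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    iters = 0
    while iters < max_iter:
        Ap = matvec(diag, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        iters += 1
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            break
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, iters, syncs


n = 200
diag = make_diag(n)
b = [1.0] * n
x, iters, syncs = pcg(diag, b)

# Synchronization rounds if reductions were batched once per S_STEP iterations.
ca_rounds = math.ceil(iters / S_STEP)
residual = max(abs(bi - yi) for bi, yi in zip(b, matvec(diag, x)))

print(f"PCG iterations: {iters}, inner products (reductions): {syncs}")
print(f"Synchronization rounds at one per {S_STEP} iterations: {ca_rounds}")
print(f"Max residual: {residual:.2e}")
```

In this sketch the unfused PCG loop issues three inner products per iteration, so the number of synchronization points grows linearly with the iteration count, while batching at one round per eight iterations cuts the number of rounds by roughly a factor of eight. How many values each batched round carries depends on the specific communication-avoiding formulation, which this sketch does not model.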
