**1. Introduction**

In 2016, Canada launched the CAD 1.5 billion Ocean Protection Plan (OPP) [1] to protect the world's longest coastline and support cleaner, healthier and safer waters. Under the oceanography sub-initiative of the OPP, Fisheries and Oceans Canada (DFO) was tasked with developing high-resolution operational nearshore ocean models for enhanced marine safety and emergency response, specifically electronic navigation and the prediction of oil spill drift trajectories. The nearshore models will eventually fit into the multi-scale, multi-level nested operational ocean-forecasting systems of the Government of Canada, through collaborative development by Environment and Climate Change Canada (ECCC) and DFO under the Canadian Operational Network of Coupled Environmental Prediction Systems (CONCEPTS) [2] Memorandum of Understanding. The current phase of this OPP sub-initiative focuses on six pilot ports/waterways: Kitimat, Port Metro-Vancouver, Fraser River Port, the St. Lawrence River, Port Hawkesbury, and the Port of Saint John, with plans to extend modelling to other ports in the future. In 2017, during the first year of OPP, a significant effort was made to develop configurations for the Port of Saint John using two widely used, open-source ocean models and to evaluate their suitability for OPP applications. The two models, NEMO (Nucleus for European Modelling of the Ocean) [3,4] and FVCOM (Finite Volume Coastal Ocean Model) [5,6], were selected due to their existing applications in Canada.

NEMO is a finite difference model that runs on structured horizontal grids. It was first developed for global and basin-scale applications, and subsequently for coastal applications. Prior to this study, NEMO had not been used for near-shore, port-scale applications. Under CONCEPTS, DFO and ECCC developed a series of operational ocean and sea-ice prediction systems using NEMO, covering the global ocean (with a horizontal resolution of 1/4-degree in longitude/latitude [7]), regional ocean basins (North Atlantic, Arctic and North Pacific, with 1/12-degree resolution [8]), and the Great Lakes (with 2 km resolution [9]). The development of prediction systems for the shelf and coastal oceans off the western and eastern coasts of Canada (with 1/36-degree resolution) is ongoing. These systems run operationally with 24/7 support at ECCC's Canadian Centre for Meteorological and Environmental Prediction (CCMEP).

FVCOM uses finite volume numerics on unstructured horizontal grids. In Canada, and particularly within DFO, there are extensive applications of FVCOM for coastal, near-shore and lake waters. FVCOM is used for simulating both barotropic tides (without including density variations) (e.g., [10,11]), and full baroclinic dynamics (e.g., [12–15]). The use of unstructured grids enables very high horizontal resolution, reaching a couple of metres in nearshore waters in some cases (e.g., [16,17]) while maintaining coarser resolution for offshore areas with larger scale dynamics. Many applications make use of FVCOM's wetting/drying scheme to simulate processes in the intertidal zone (e.g., [17,18]). Under a previous DFO project, FVCOM was used to develop port scale models for five of the six OPP pilot ports, including a baroclinic configuration for the Port of Saint John (without atmospheric forcing).

Several studies have compared different configurations of the same root model, but few have compared fully baroclinic, structured and unstructured models with similar resolution over the same domain. Huang et al. [19] compared FVCOM with a structured grid model, ROMS (the Regional Ocean Modelling System) [20], but the study focused on idealized test cases and used barotropic configurations only. Trotta et al. [21] examined the use of NEMO and the Shallow Water Hydrodynamic Finite Element Model (SHYFEM) [22] in a downscaling context for a relocatable ocean platform for forecasting, but they did not directly compare the two models. More recently, and most relevant to our study, Biastoch et al. [23] performed a comprehensive comparison of a nested configuration of NEMO and the unstructured Finite Element Sea Ice-Ocean Model (FESOM) [24], but this was for global configurations and focused on large scale ocean circulation.

The models developed for OPP applications will eventually be run with 24/7 operational support, but amongst the Canadian government agencies, the capacity to do so exists only within ECCC. ECCC currently uses NEMO for operational ocean forecasting. Adding the operation of FVCOM at ECCC would entail additional resources (and cost) compared to utilizing the existing NEMO infrastructure. Hence, ECCC required an assessment of the performance of both NEMO and FVCOM to support its decision making. To achieve this, DFO and ECCC jointly developed an evaluation process, summarized here, to objectively compare NEMO and FVCOM in terms of both the predictive accuracy for the key parameters required for OPP applications and the efficiency in terms of computational cost.

This paper describes the principles and factors that were considered in designing the evaluation process (Section 2), and application to the Port of Saint John including the metrics and sample results of the evaluation (Section 3). The selected examples do not cover the full results of the NEMO/FVCOM evaluation, and do not demonstrate the full strength of either model. The evaluation guided on-going research in the development and improvement of both models. More comprehensive descriptions of the configurations and results of both models are or will be documented elsewhere, e.g., the NEMO configuration by Paquin et al. [25]. Finally, the proposed process and metrics can be generalized and modified to evaluate the configurations developed for other regions, and with models other than NEMO and FVCOM, for research and operations.

#### **2. Factors Considered for Evaluation**

The evaluation process includes the selection of the study area, the requirements for the model setup, and the metrics for evaluating the models. The evaluation was designed to objectively (quantitatively) assess the accuracy and efficiency of NEMO and FVCOM for operational forecasting at port scales for parameters of interest to the OPP applications. Here, the term *accuracy* pertains to the models' ability to reproduce the observations, and *efficiency* refers to the computer resources that are required to run the models in an operational context. The objectives of OPP are to improve electronic navigation and to predict oil spill drift; thus, the parameters of interest for the evaluation were water level, currents, water temperature and salinity (density), and surface drifter trajectories. As with any experimental design, it is important to consider the ability of the evaluation to detect contrasts in the models and to force the strongest contrast possible. Thus, the evaluation was designed to challenge the models with respect to accuracy and efficiency. Consequently, the evaluation helped gain a better understanding of the strengths and weaknesses of the models, and although the evaluation focused on one port, the results can be reasonably expected to extend to other ports.

Of the six OPP pilot ports, one was selected for the evaluation process. The selection of the study area was based on (1) the regional oceanography being sufficiently complex to include key dynamic processes; (2) the availability of forcing data (from atmosphere, rivers, and open ocean) to drive the port models; and (3) the availability of sufficient observational data to assess the predictive accuracy of the models with respect to the parameters of interest. The parameters of interest at Canadian ports are typically influenced by complicated coastlines, bathymetry, tides, river runoff, the open ocean, and, in some cases, the presence of sea-ice. Due to the urgent timeline of the OPP (i.e., the evaluation was to be completed within the first year of OPP), the explicit inclusion of sea-ice and the simulation of the intertidal zone (with a wetting/drying scheme) were not required, but to ensure that FVCOM was used to its full potential, the use of the wetting/drying scheme was encouraged. At the time of the evaluation, a wetting/drying scheme was not available in NEMO. Forecasting the surface waves at port scales was not considered for the current phase of the OPP.

Various aspects of the model setup were built into the process of evaluation, including spatial resolution, the inclusion of dynamic processes, forcing fields, the duration of the simulation, computational cost, and the variables, frequencies and format of the model output. The required horizontal grid spacing of the models was 100 m or less to resolve the horizontal gradients of currents and the presence of eddies due to nonlinear processes. The required vertical grid spacing was 1 m or less near the surface, to resolve the currents in the upper layer that are important for navigation and oil spill drift. The models included the full baroclinic dynamics to simulate the variations due to surface momentum and buoyancy fluxes, river runoff, and open ocean forcing into the port. The models were subjected to the same forcing, which included atmospheric forcing from the operational weather forecasting system, large-scale oceanic forcing from the operational regional ice-ocean forecasting system, and available tidal and river runoff. At the current stage of the OPP, the port models did not include any data assimilation, partly due to the lack of sufficient real time observational data. A common time frame for the simulation was determined by the available observations. The duration of the simulation was 15 months to allow for a spin-up of the models and the evaluation of a full annual cycle. For the proper evaluation of model efficiency, both models were run on the same high-performance computer facility of the Government of Canada. For operational applications, the required run-time of the models was 0.5 h (or less) for a 48 h simulation. The frequency of the model output was minimally 0.5 h for the proper evaluation of tidal variations.
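The numerical requirements listed above can be collected into a simple check. The following Python sketch is illustrative only: the `RunSummary` class, its field names, and the `check_requirements` function are hypothetical and are not part of NEMO, FVCOM, or the evaluation software; only the thresholds (100 m, 1 m, 0.5 h for a 48 h simulation, 0.5 h output interval) are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """Hypothetical summary of one model configuration (illustrative names)."""
    horizontal_spacing_m: float        # horizontal grid spacing
    surface_vertical_spacing_m: float  # vertical grid spacing near the surface
    wallclock_h_per_48h: float         # wall-clock time for a 48 h simulation
    output_interval_h: float           # time between model outputs


def check_requirements(run: RunSummary) -> dict:
    """Compare a configuration against the thresholds stated in the text."""
    return {
        "horizontal_spacing": run.horizontal_spacing_m <= 100.0,
        "vertical_spacing": run.surface_vertical_spacing_m <= 1.0,
        "run_time": run.wallclock_h_per_48h <= 0.5,
        "output_frequency": run.output_interval_h <= 0.5,
    }


def speed_ratio(run: RunSummary) -> float:
    """Simulated hours per wall-clock hour; the run-time limit implies >= 96."""
    return 48.0 / run.wallclock_h_per_48h
```

Expressed this way, the run-time requirement of 0.5 h for a 48 h simulation means that a model must run at least 48/0.5 = 96 times faster than real time.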

The metrics for the model-data comparison were defined based on the existing expertise of the team, and through expert consultation and a literature review. The metrics were defined for the tidal and non-tidal components of sea level and currents, vertical profiles of water temperature and salinity, time series of sea surface temperature (SST) at fixed stations, and the trajectories of surface drift. The quantitative metrics were chosen to be statistically robust measures of the discrepancy between the model solution and the observational data. Prior to performing the statistical comparison, the model results were extracted from the grid node nearest to the location of each observation. Time series analysis also included a comparison of the energy spectrum. Because the domains of both models cover the area beyond the port, the evaluation was carried out for the "inner harbour" (port) and "outer harbour" separately, with an emphasis put on the inner harbour. In addition to the quantitative evaluation, the models were evaluated qualitatively based on known features of the regional oceanography, e.g., the presence of the river plume, tidal fronts, eddies, etc.
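The nearest-node extraction and a discrepancy statistic can be sketched as follows. This is a minimal illustration under stated assumptions, not the evaluation code itself: the function names are hypothetical, node positions are assumed to be given as flat lists of longitude and latitude (which covers both a flattened structured grid and an unstructured mesh), and bias and root-mean-square error are used only as generic examples of discrepancy measures rather than as the specific metrics adopted in the evaluation.

```python
import math


def nearest_node(node_lons, node_lats, obs_lon, obs_lat):
    """Return the index of the model node closest to an observation.

    Uses the haversine term, which increases monotonically with
    great-circle distance, so no Earth radius is needed for ranking.
    """
    def hav(lon, lat):
        dlat = math.radians(lat - obs_lat)
        dlon = math.radians(lon - obs_lon)
        return (math.sin(dlat / 2) ** 2
                + math.cos(math.radians(lat)) * math.cos(math.radians(obs_lat))
                * math.sin(dlon / 2) ** 2)

    return min(range(len(node_lons)),
               key=lambda i: hav(node_lons[i], node_lats[i]))


def bias_and_rmse(model, obs):
    """Mean difference and root-mean-square difference (model minus obs)."""
    diffs = [m - o for m, o in zip(model, obs)]
    bias = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmse
```

In use, the index returned by `nearest_node` would select the model time series at that node, which is then passed to `bias_and_rmse` together with the observed series sampled at the same times.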

Finally, when possible, the models were compared to existing operational products that are used in Canada. These include the Scotia-Fundy-Maine WebTide (hereafter referred to as WebTide) solutions for tidal elevation and currents [26], and the regional ice-ocean prediction system (RIOPS) that covers the North Atlantic, Arctic and North Pacific with 1/12-degree resolution [8]. Note that neither WebTide nor RIOPS was developed for near-shore operational applications, and neither is a high-resolution port solution, but they were the existing operational products for the area at the time of this evaluation.
