**2. System Description**

#### *2.1. Ocean Numerical Model*

BSFS is based on a free surface implementation of the NEMO hydrodynamical model (v3.4, [7]) for the Black Sea region. The horizontal resolution is approximately 3 km (1/27◦ in zonal and 1/36◦ in meridional directions), which conforms to the mesoscale eddy-resolving scale (Rossby radius of deformation in the Black Sea is ~20 km, [8]). On the other hand, 31 unevenly distributed z-levels are used along the vertical direction, with an initial layer at about 2.5 m in depth. BSFS bathymetry is derived from GEBCO at 1 resolution [9]; a local refinement of the coastline, using a high-resolution NOAA dataset [10], is performed to represent the main coastal features of the basin.

The model is forced by momentum, water and heat fluxes interactively computed by bulk formulae implemented first for the Mediterranean Sea [11] and adapted for the Black Sea, as described in Appendix A. The analysis and forecast atmospheric fields are provided by the European Centre for Medium Range Weather Forecast (ECMWF) at 0.125◦ horizontal resolution, at 3–6 h frequency.

The momentum equation is written in flux form and is solved with a leapfrog time stepping scheme. The free surface equation uses a linearized form (i.e., the barotropic velocity field is defined with an integral from the flat surface to the bottom) and is integrated implicitly [12] with the time step of the total velocity field equal to 150 s. The advection scheme for the temperature and salinity tracers is the total variance dissipation (TVD) scheme. Horizontal eddy diffusivity is applied for tracers using a Laplacian operator, with a coefficient of 50 m2/s, while a bi-Laplacian viscosity is applied for momentum, with a coefficient of 1.0 × 10<sup>8</sup> m4/s. The vertical diffusivity and viscosity are computed from a turbulent kinetic energy (TKE) closure model, with the parameters set as in [7]. Vertical eddy viscosity and diffusivity coefficients are selected with values of 1.2 × 10−<sup>5</sup> m2/s and 1.2 × 10−<sup>6</sup> m2/s, respectively. The model also implements free-slip lateral boundary conditions and a classical quadratic bottom friction scheme, with a drag coefficient of 1.0 × 10−<sup>3</sup> m2/s2.

The Bosporus Strait connects the Black Sea with the Sea of Marmara: to model it, a twolayer, narrow strait water mass exchange which maintains a relatively steady state water and salt balance in the Black Sea [13–18] is used, with a closed boundary condition. The excess precipitation and river runoff over evaporation is removed using the outflow from the strait, thus leading to a zero balance in the Black Sea [19]. Considering the horizontal average of the free surface equation [20], the net transport at the Bosporus Strait is given by:

$$\frac{1}{A}Tr\_B = -\frac{\partial}{\partial t} \langle \eta \rangle - \langle E - P - R\delta(\mathbf{x} - \mathbf{x}\_i) \rangle \tag{1}$$

where *TrB* is the net transport at the Bosporus Strait, and *A* is the basin surface area, *η* is the sea surface height, *E*, *P*, *R* are, respectively, evaporation, precipitation and runoff, the Dirac *δ* is different from zero at 72 river mouths and the triangular brackets mean horizonal basin average. First, we redefine the transport at the Bosporus Strait in terms of a "discharge" *RB* :

$$Tr\_B = L\_{i,B} H\_{i,B} \ R\_B \tag{2}$$

Thus, the discharge at the Bosporus Strait is given by:

$$R\_B = -\frac{A}{L\_{i,B}H\_{i,B}} \left[ \frac{\partial}{\partial t} \langle \eta \rangle + \langle E - P - R\delta(\mathbf{x} - \mathbf{x}\_i) \rangle \right] \tag{3}$$

where *Hi*,*<sup>B</sup>* and *Li*,*<sup>B</sup>* are depth and width in one-grid-sea-point, respectively.

We calculated the Bosporus discharge in (3) from a 10-year simulation by computing the mean free surface tendency and the mean water flux. The values of the Bosporus discharge are stored as monthly mean values and set as vertical velocity boundary conditions as done for the rivers, except with the negative sign, indicating a discharge out of the basin or a "negative river". This parametrization is quite robust for decadal long simulations that do not consider climate change trends in sea level and water fluxes.

With regard to the 72 real river runoff contributions, we use monthly mean discharge data from the SESAME dataset [21] for all rivers, including the Danube, the Dniepr, the Dniester, the Rioni, the Kizil Irmak, and the Sakarya.

#### *2.2. Data Assimilation Scheme*

The ocean model is coupled with a data assimilation system in order to produce analyses for optimal initial conditions of the forecasts. The DA ocean state vector *x* contains the temperature *T*, salinity *S* and sea level values at each model grid location. For a model setup with *n* vertical levels, this vector is defined as:

$$\mathbf{x} = (T\_{0\prime}, \dots, T\_{n\prime}S\_{0\prime}, \dots, S\_{n\prime}\,\eta) \tag{4}$$

Given an observation vector *y*0 and the background model state *xb* it is possible to define the innovation, i.e., the difference between the observations and their respective model predictions, denoted *d*:

$$d = H(\mathbf{x}\_b) - \mathbf{y}\_0 \tag{5}$$

with *<sup>H</sup>*(*xb*) being the observation operator that projects the model state onto the observation space. The aim of the DA system is to find a correction to the model state:

$$
\delta \mathbf{x} = \mathbf{x} - \mathbf{x}\_b \tag{6}
$$

This correction needs to minimize the analysis error while taking into account the error covariances of the model state vector and the observations. There are several methods to calculate such corrections; however, they can be broadly categorized as variational methods or Kalman filters. In BSFS, a 3D-variational method is used [22,23]. In this method, the corrections are derived by iteratively minimizing a cost function *J*, defined as:

$$J = \frac{1}{2} \delta \mathbf{x}^T \mathbf{B}^{-1} \delta \mathbf{x} + \frac{1}{2} (\mathbf{H} \delta \mathbf{x} - d)^T \mathbf{R}^{-1} (\mathbf{H} \delta \mathbf{x} - d) \tag{7}$$

Here the matrices **B** and **R** are the error covariance matrices of, respectively, the background state and the observations. For a number of state variables *nb*, the model covariance matrix **B** is of size *n*<sup>2</sup> *b*. In addition, as the cost function contains **<sup>B</sup>**−1, the covariance matrix needs to be inverted. It is clear that for large numbers of state variables, this is a computationally costly calculation, which can be avoided if additional constraints are imposed on **B**, in particular if **B** is of the form:

$$\mathbf{B} = \mathbf{V}\mathbf{V}^T\tag{8}$$

with **V** an arbitrary matrix. In this case, Equation (7) can be written as:

$$J = \frac{1}{2}v^T v + \frac{1}{2}(\mathbf{H}\mathbf{V}v - d)^T \mathbf{R}^{-1} (\mathbf{H}\mathbf{V}v - d) \tag{9}$$

introducing a coordinate transformation:

$$
\delta \mathbf{x} = \mathbf{V} \mathbf{v} \tag{10}
$$

$$
v = \mathbf{V}^{-1} \delta \mathbf{x} \tag{11}$$

The minimization of the cost function can now be performed in terms of the new control variable *v*, without the need to calculate **B**−1. It is sufficient to perform the minimization in terms of *v* before transforming back to the model state space increment *δx* with (10).

As the covariance is by definition a positive definite matrix, the matrix **V** exists and could be found by performing a Cholesky decomposition on **B**. However, since the model error covariance **B** is usually derived from model state variable anomalies in a long model run, **V** is simply the anomaly matrix and **B** is already calculated according to (8).

One of the main features of OceanVar is the covariance decomposition. The matrix **V** of (8) is expressed as a product of different components:

$$\mathbf{V} = \mathbf{V\_H}\mathbf{V\_{\eta}}\mathbf{V\_{\vee}}\tag{12}$$

where **V** H is the horizontal and **V**V is the vertical component of the covariance. The horizontal component provides the correlation between neighboring grid points at each model level. This is implemented by means of a recursive filter with a radius of approximately 25 km in open sea and a gradual falloff near the coast. The vertical component is calculated from a long model simulation and decomposed using empirical orthogonal functions (EOF). This EOF decomposition significantly reduces the number of variables, *nv nx*, and reduces spurious correlations due to the finite dataset used to estimate the covariance. *Vη* is the dynamic height operator, which uses the local hydrostatic adjustment scheme [23] to transform the sea level anomaly innovations into increments for T and S in regions deeper than 1000 m.

For the BSFS system with 31 vertical levels, the vertical covariance of *T* and *S* is represented by 15 EOF. The EOF are calculated separately for each month and each grid location to adequately capture the variability of the covariance in time and in space. The DA system is used to assimilate in-situ observations of temperature and salinity from available ARGO profilers in the Black Sea, and satellite observations of sea level anomaly (SLA) and of sea surface temperature (SST). For the latter, the foundation temperature is used and assimilated only at nighttime (as in e.g., [24]). This approach limits possible biases due to the diurnal cycle of SST.

## *2.3. Operational Chain*

BSFS version here considered is EAS3 (European Analysis and forecasting System, version 3, referred also in the next sections) for the Black Sea. It implements technical interfaces with upstream data, used for model runs—including ECMWF atmospheric forcing, CMEMS in-situ and satellite observations from corresponding TACs—as well as with the CMEMS Dissemination Unit for the delivery of analysis and forecast products to users. Observations used by BSFS for assimilation and validation are summarized in Table 1.

**Table 1.** List of observations used by BSFS in the assimilation and verification steps.


The atmospheric forcing (previous day analyses and 10 days forecast starting at 00:00 UTC) is downloaded from ECMWF through Aeronautica Militare Italiana. As soon as it is available, the BSFS processing system starts (at around 7:00 UTC). The atmospheric forcing availability is a major source of delay for the oceanic forecast.

Every day the BSFS (Figure 1) produces ten days of forecast (J to J+9), one day of simulation (J−), and three days of analysis (J−4 to J−1). Once a week (on Tuesday), 14 days of analysis are produced, from J−15 to J−1, with the assimilation of all available satellite and in-situ data collected over the past weeks. Analysis produced at the weekly cycle represents the best estimation of the Black Sea state because all in-situ and satellite altimetry high quality processing data is used up to J−3. Analysis and simulation runs are forced by ECMWF atmospheric analysis fields at 6 h frequency; the forecast cycle is forced with ECMWF atmospheric forecast fields at 3 h frequency for the first 3 days and 6 h frequency for the remaining 7 days.

At the end of the forecasting cycle, post-processing involves the preparation of all the BSFS files in a format that is compliant with the CMEMS and CF standards, delivering to the CMEMS DU, and archiving of the BSFS native products at CMCC's supercomputing facilities. Every day, the forecast is released to CMEMS within less than 3 h (target delivery time: 12 UTC).

The BSFS product catalogue includes daily and hourly mean datasets, centered at midday of each J, for the Black Sea essential variables: 3D temperature ( *T*), salinity (*S*), zonal and meridional velocity components ( *U* and *V*, respectively) and 2D sea surface height (*SSH*), bottom temperature (*BottomT*) and mixed layer depth (*MLD*).

The BSFS operational chain is very similar to that of the Mediterranean Forecasting System (MFS, [25]) in terms of general setup (technical interface, analysis-simulationforecast chain). Unlike MFS, it implements a more frequent analysis cycle to provide higher quality Black Sea products using daily observations.

#### **3. Quality Assessment of the Operational System**

The aim of the product quality assessment is to monitor the analysis quality and forecast accuracy of BSFS products using quasi-independent validation assessment. Operationally, a regional website provides daily bulletins (http://bsfs.cmcc.it/, accessed on 29 December 2021) and skill scores (http://oceanlab.cmcc.it/bsfs-evaluation/, accessed on 29 December 2021). Daily bulletins consist of a collection of interactive 2D maps for visualizing the essential variables of the Black Sea (as provided in the CMEMS catalogue). Weekly skill scores are provided through the evaluation section: during the assimilation weekly cycle, as described in Section 2.3, the difference between the model analysis and the observations at the time and location of the observations (i.e., misfits) is stored to compute statistics. Two main statistical metrics for the assessment of the analysis quality are computed using the misfits: the first is the bias, given by:

$$\text{bias}(\phi) = \frac{1}{N} \sum \left( \phi\_i^M - \phi\_i^O \right) \tag{13}$$

and the Root Mean Square Error (RMSE) given by:

$$\text{RMSE}(\phi) = \sqrt{\frac{1}{N} \sum\_{i=1}^{N} \left(\phi\_i^M - \phi\_i^O\right)^2} \tag{14}$$

where *N* is the number of data used in the evaluation, *φ<sup>M</sup>* is the model analysis field, and *φO* the observation.

On the other hand, to assess the quality of the forecast, we evaluated the differences of the forecast fields with respect to the analysis considered to be the best estimate of the truth. Murphy (1993, [26]), revisited by [2], defined the "forecast goodness" methodology and here we use his "quality measure" methodology that compares the forecast with the analysis and the "persistence" or the best estimate at initial time. If the difference between forecast and analysis is better than the difference between forecast and persistence, then the forecast is valuable. For a modern oceanographic forecasting system, [27] defined a metrics that we will partially follow here by computing:

• The difference between the analysis and the forecast (*AF*):

$$AF(t) = \sqrt{\frac{\sum\_{1}^{T} \left(\phi^{AN}(t) - \phi^{FC}(t)\right)^{2}}{T}} \tag{15}$$

where *φAN* and *φFC* is the temperature (salinity) daily mean forecast and analysis, respectively, at each forecast day *t* = *day*1, *day*2, ... , *day*10 (ID day as in Figure 2); *T* corresponds to the time period covered by the evaluation (here we consider a time period of 1 month). The metrics is normalized horizontally by averaging over the area and at a specific selected depth.

**Figure2.**BSFSprocessing

• The difference between the forecast and the persistence (*PF*):

 chain.

$$PF(t) = \left\langle \sqrt{\frac{\sum\_{1}^{N} \left(\phi^{FC}(t) - \phi^{AN}(t = day0)\right)^{2}}{N}} \right\rangle \tag{16}$$

Here, *φAN*(*t* = *day*0) is the persistence (ID day as in Figure 2), e.g., the initial condition for the forecast. The latter is the last time step of a simulation started from an assimilated initial condition at day J−2.

In the following subsections, we present BSFS analysis statistics for the period 2018–2019 for SST and SLA, as well as temperature and salinity at given layers: 2–5 m, 5–10 m, 10–20 m, 20–30 m, 30–50 m, 50–75 m, 75–100 m, 100–200 m, 200–500 m, and 500–1000 m. Forecast assessment is performed for 2 significant months—February and August 2020: metrics have been computed at selected depth, e.g., 2.5, 30, 150 m.

#### *3.1. Sea Surface Temperature*

The BSFS analysis sea surface temperature is assessed by comparing analysis model fields against SST satellite data remapped over the Black Sea basin at 1/16◦ spatial resolution and representative of night SST values and delivered by CMEMS (Table 1).

Figure 3 shows the time series from 2018 to 2019 of the difference between BSFS and satellite SST (BIAS, top) and RMSE (bottom). The numerical SST is slightly warmer than the measured one (+0.1 ◦C), and the error is about 0.5 ◦C.

#### *3.2. Sea Surface Height*

Figure 4 shows the RMSE for SLA in the operational period (green dotted line) and the number of observational data used, given by the sum of the available along track sea level data provided by Altika, Cryosat-2, Jason-2 and 3, Sentinel 3A and 3B. The mean value of RMSE misfits for SLA is around 2.3 cm. A decrease in the error is also shown once the number of observations increases (June 2018–December 2018; March 2019– October 2019 thanks to inclusion of Sentinel-3B data). The accuracy is comparable to the Mediterranean Sea [25], where the average error over the same period is about 3.35 cm (http://oceanlab.cmcc.it/mfs-evaluation/ accessed on 29 December 2021).

**Figure 3.** SST bias (top) and RMSE (bottom) timeseries, expressed in ◦C, at daily (thinner line) and monthly (solid line) frequency in 2018 (**a**) and 2019 (**b**).

**Figure 4.** RMSE misfit of SLA, expressed in cm, computed using available observations from Altika, Cryosat-2, Jason-2 and 3, Sentinel-3A and 3B during 2018–2019.

#### *3.3. Temperature and Salinity*

## 3.3.1. Analysis Quality

Water column properties given by BSFS are assessed after 3D temperature and salinity are validated against all available observations.

Table 2 shows the basin scale RMSE misfits at specific layers (m) for 2018–2019. In the 2–5 m surface layer, the average RMSE for temperature at the whole basin is about 0.4 ◦C (Figure 5(a1)) and 0.2 PSU (Figure 5(b1)). However, the error in temperature starts to increase in the sub-surface, from 5 m to 30 m, from 0.5 ◦C to 2 ◦C in the summer and below 0.5 ◦C in the winter (Figure 5(a2–a4)).

**Table 2.** Temperature and salinity RMSE misfits over the period 2018–2019.


The maximum error for temperature occurs in the 10–20 m layer with a value up to 2.5 ◦C computed for August–October in both 2018 and 2019 (Figure 5(a3)).

Regarding salinity, the error in the subsurface, on average, is around 0.2 PSU (Figure 5(b2–b4)), which increases from 30 m up to 100 m up to a maximum value of 0.4 PSU (Figure 5(b5–b7)). In the intermediate layers up to 1000 m, the error decreases to quasi-zero for both temperature (Figure 5(a8)) and salinity (Figure 5(b8)).

The increased error in the thermocline (10–30 m layer) and in the halocline (50–100 m layer) in the summer is likely due to a deficiency in the vertical discretization (we use only 31 z-levels) and in the vertical mixing parameterization.

**Figure 5.** *Cont*.

**Figure 5.** RMSE timeseries between BSFS analysis fields and ARGO observations for temperature (**a1**–**a8**) (expressed in ◦C) and salinity (expressed in PSU) (**b1**–**b8**). The reference layer as considered for the evaluation is written on the top of each panel.

The errors are due to (a) the Bosporus Strait being represented as closed boundary, (b) mixing processes that were not completely resolved since effects induced by waves and tides are not accounted for, and (c) simplified representation of the heat, momentum and water fluxes through bulk formulation, whose effects are particularly important especially during the summer. These are significant challenges to be considered in the development of the next generation of the Black Sea physical system.

## 3.3.2. Forecast Assessment

In Figures 6 and 7 we present the BSFS forecast quality assessment for temperature and salinity considering 2 reference months, February 2020 and August 2020, respectively. Metrics are given at different depths—2.5, 30 and 150 m. If the forecast is skillful, *AF* score should be better than *PF* score.

**Figure 6.** RMSE of *AF* (Analysis-Forecast, in blue), *PF* (Forecast-Persistence, in orange) for temperature (on the left, (**a1**–**a3**), expressed in ◦C) and salinity (on the right, (**b1**–**b3**), expressed in PSU) at 2.5 m (1), 30.0 m (2) and 150 m (3), computed for February 2020.

**Figure 7.** RMS of *AF* (Analysis-Forecast, in blue), *PF* (Forecast-Persistence, in orange) for temperature (on the left, (**a1**–**a3**), expressed in ◦C) and salinity (on the right, (**b1**–**b3**), expressed in PSU) at 2.5 m (1), 30.0 m (2) and 150 m (3), computed for August 2020.

In February 2020, *PF* is bigger than *AF* after *day*2 (Figures 6 and 7): this means that after the first day the advantage of doing the forecast is clear over persisting the initial condition. It is interesting to note that for salinity this advantage grows more rapidly, i.e., for salinity the forecast overcomes the persistence even before the first day. Over the water column, temperature and salinity analysis and forecast exhibit a similar variability: higher error reaching a maximum around 0.6◦ Celsius and 0.5 PSU at *day*10 is shown for temperature (Figure 6(a1,a2)) and salinity (Figure 6(b1,b2)) in the subsurface up to 30 m; at 150 m, both variables exhibit a lower error (never higher than 0.1◦ Celsius and 0.3 PSU over the reference 10-day period).

In August 2020, errors at 30 m are higher than the ones computed in the previous analyzed month: starting from *day*2, in particular for temperature (Figure 7(a1,a2)), *PF* shows higher values than *AF*—almost 50% higher at 30 m, with maximum error of about 2.0◦ Celsius (Figure 7(a2)). This means that during the summer the forecast adds to persistence more than during winter: the dynamical model is clearly responsible for the proper changes in water column stratification even for few days. At 30 m and in August (Figure 7(a1)) the difference between *PF* and *AF* is the largest: we assume that this is an indication of the complex dynamics involved in the mixing and the CIL dynamics. Regarding salinity (Figure 7(b1–b3)), *PF* and *AF* errors are quite similar comparing to winter and of the order of 0.3–0.4 PSU.

#### **4. Black Sea Diagnostics and Circulation Consistency**

#### *4.1. Temporal Evolution of the CIL and Stratification*

The BSFS is able to represent the formation and evolution of the cold intermediate layer (CIL), which is a typical water mass property of the Black Sea basin, and well documented in the literature ([16] for a very comprehensive literature review and updated evaluation). It is identified by the 8.0 ◦C isotherm, arising at the surface during winter, penetrating the subsurface (typically at a depth range of 50–100 m), intruding the warmer zone, and persisting over time. The time versus depth diagrams at the observation location are shown for temperature and for salinity in Figure 8, between 2015–2019. Figure 8a shows the cold event of 2017, which is also captured by observations as demonstrated by the quite low difference between the model and observations (Figure 8b). It also shows the reduction in the CIL width because of the warming period from 2018 and its perforation and partial disappearance in 2019 [16]. The major differences of about 2 ◦C occur in the upper layer and are shown in Figure 8b. Salinity stratification is well represented in Figure 9a with small differences with respect to observations, as shown in Figure 9b. It is characterized by a 2-layer structure with low salt in the CIL zone down to 100 m in depth and saltier values of about 22 PSU or more in the intermediate to deepest levels.

**Figure 8.** Time versus depth diagram for BSFS temperature in the period 2015–2019 (**a**) at the observation location and (**b**) for the computed model minus observation value.

**Figure 9.** Time versus depth diagram for BSFS salinity (PSU) in the period 2015–2019 (**a**) at the observation location and (**b**) for the computed model minus observation value.
