*f*(DIC, TA, pH) = (U10, DD, wind stress, transfer velocity, depth, SSS, chlorophyll-a, SST, Rrs 412, Rrs 443/555, Rrs 531/555, Rrs 443/48, PIC, POC, MLD) (1)

Multi-source, geo-located EO and model reanalysis datasets (Level 4, SMI format) covering the period of 16–24 April 2016 were derived (Figure 3) from the co-located and co-temporal values (corresponding to BD 1–7 and PD 1–10) at the points shown in red (i.e., Spring 2016: ANN training set) in Figure 1. Choosing the right number of hidden neurons is usually performed through trial and error [77]. The ANN optimal topology hinges on the complexity of the relations between inputs and outputs. In this study, two sets of hidden neurons were tested: *n* = 5 and *n* = 10, where the assumption was that the greater the number of nodes, the smaller the error on the training set. However, at a certain point, the generalization began to increase, and the first structure (i.e., *n* = 5) was chosen on the basis of the smallest value for RMSE that was achieved during the training phase. The best topology found in this study consisted of an input layer with 17 neurons, 5 neurons in the hidden layer, and an output layer consisting of 3 neurons whose output gave the scaled DIC, TA, and pH (Figure 4). The training algorithm adjusted the bias and weighting factors according to the negative gradients of the error cost function [58] for the final training pattern.

The ANN training process algorithm for DIC, TA, and pH is shown in Figure 4 as follows:


The training of the BPN algorithm started by using a small, random weight. It propagated each input pattern to the output layer, compared the pattern in the output layer with the correct one, and adjusted the weights according to the back propagation learning algorithm. After the presentation of around 10,000 patterns, the weights converged, i.e., the network picked up the correct pattern, and the error-correction learning stopped. In so doing, the network systematically reduced and/or reinforced the weights of the connection architecture and all of the 'knowledge' in the BPN was then contained in the weights. Naturally, the magnitude of this error depended on the choice, relation, quality, and accuracy of the inputs (predictors).

The predictive power of the BPN algorithm was maximized by means of the following steps:


salinity, winds, wind stress, temperature, and mixed layer depths) and biological (chlorophyll-a, dissolved organic carbon, and surface-leaving reflectance) parameters, in order to model the highest possible scenario for appropriate learning under a wide range of variability. This training procedure can be further improved by including input and output variables with a greater degree of variability, such as measurements covering other regional areas and time periods;

3. An optimal number of hidden units (*n* = 5) was found with the sigmoid activation function and a liner output unit to derive an optimal 'expressive' power of the network. The present training set presented a 'smooth' function and therefore the number of hidden units needed was kept to a minimum (*n* = 5). For strongly fluctuating functions, more hidden units are generally needed, which does not seem to be a requirement for our study.

**Figure 4.** The network architecture of the BPN model (left) showing 5 neurons in the hidden layer and the respective weights (in red and blue) of each connection. The values in each of the neurons is a scaled down value (1 decimal place) of the input, hidden, and output neurons, corresponding to one possible solution between the proxy environmental drivers (predictors) and the values for DIC, TA, and pH (predictands). Layer 1 (input): 17 neurons (see Table 1 for a list of input neurons). Layer 2 (hidden): 5 neurons. Layer 3 (output): 3 neurons: DIC, TA, and pH. Steps involved in the development of the BPN model (right).

#### 2.3.2. Performance of the BPN Algorithm

Entire M/V Equinox cruise transect datasets were reserved and used as independent datasets to validate the performance of the BPN training method. This is a common practice that ensures that the model can produce reliable estimates outside the range of the learning data (generalization capabilities) [78]. Thus, by assigning the trained BPN algorithm with the values of the fixed set of co-located and co-temporal input neurons as the physicochemical and biological drivers, the resulting ANN-output-modeled DIC, TA, and pH were validated against the assigned datasets (Table 1).

The precision of the machine learning approach was evaluated, which was based on the trained BPN model, through a comparison with the M/V Equinox dataset using the mean bias (MB; Equation (2)) and the root mean square error (RMSE; Equation (3)), as well as the slope of the linear regression between the ANN-retrieved values and the corresponding in situ measured values, as follows:

$$\mathbf{MB} = \Sigma\_{\mathbf{n}, \mathbf{i} = 1} (\mathbf{x\_i} - \mathbf{y\_i}) / \mathbf{n} \tag{2}$$

$$\text{RMSE}\_{\text{fo}} = \left[ \frac{\sum\_{i=1}^{N} \left( z\_{f\_i} - z\_{oi} \right)^2}{N} \right]^{1/2} \tag{3}$$

where the mean bias is how far the model is from the ground truth data and RMSE determines the error on the test set (or generalization error). The objective of the best BPN model topology was based on the lowest possible metrics for the entire test data.

#### *2.4. Construction of Gridded DIC, TA, and pH Gridded Data for 30 October 2016*

Finally, the ability of the trained BPN algorithm to process and generate a huge number of DIC, TA, and pH data points was applied to a 1.1 million km2 subset area located in the mid-North Atlantic Ocean, represented by a total number of 63,360 gridded data points (each encompassing the full set of 17 environmental drivers when available). In view of the extensive retrieval and processing requirements, these data points were based on the validated ANN algorithm and initiated by the physico-chemical and biological drivers that were retrieved on 30 October 2016.

The geographical extent of this area was west −61.00◦; east −50.04◦; west–east 10.96◦; south 24.99◦; north 34.96◦; and south–north 9.96◦. This area was chosen on the basis of its interesting hydrodynamics, as well as on its inter-annual trends in CO2 concentrations. The large temporal and spatial gradients of *p*CO2, as well as its variability driven by a diversity of physical and biological processes, make the analysis of the carbonate chemistry over the region both interesting and challenging [79]. The study's region of interest is influenced by the North Atlantic gyre and has a seasonal surface temperature variation of about 8 to 10 ◦C, occurring alongside a fluctuation in the MLD between the Northern Hemisphere's winter and summer seasons. On average, the MLD deepens to 200 m in winter up to about 10 m in summer. Generally, nutrients remain below the euphotic zone for most of the year, resulting in low primary production. During winter convective mixing, nutrients penetrate the euphotic zone, causing a short-lived phytoplankton bloom in the spring. All of these seasonal changes ultimately influence the total amount of CO2 in the seawater.

All of the grid-point predictor variables were inserted in the BPN algorithm and the values of DIC, TA, and pH were modeled for that day for the entire area, with a native grid size of 0.04167◦. On 30 October, there was a total of 7897 empty grid cells in this area that were attributed to cloud cover and, therefore, the lack of optically retrieved remotely sensed predictors (i.e., chlorophyll-a, Rrs, PIC, and POC).

#### **3. Results and Discussion**

#### *3.1. Validation between Remotely Sensed- and Cruise-Derived SST and SSS Data*

Table 4 shows a strong correlation between SST and SSS derived from the full cruisesegmented datasets (see Figure 1) and the remotely sensed PD1 and PD2.

#### *3.2. Performance of the BPN Algorithm*

By means of the independent validation datasets, we evaluated the performance of the algorithm by comparing the BPN-retrieved values of DIC, TA, and pH with the measurements that were taken by M/V Equinox (NCEI Accession 0154382) elsewhere, during the different time periods.


**Table 4.** Correlation between same variables obtained remotely and by M/V Equinox [ID: MLCE; 7 March 2015 to 6 November 2016] cruise-segmented datasets. Their *p*-value is <0.00001 and all of the correlations are significant at *p* < 0.05.

#### 3.2.1. M/V Equinox—7–8 March 2015

Figure 5 shows the distribution and the statistical significance of the data points within the range that is shown by both in situ and ANN-estimated values, as well as the existence of outliers. The co-located, ANN-estimated DIC, TA, and pH values were in very good agreement with the surface underway measurements given that the BPN algorithm was trained on the data that were collected during 28 April–6 May 2015 along the entire North Atlantic width. The results show that the mean biases for DIC, TA, and pH are −2.5 <sup>μ</sup>mol.kg<sup>−</sup>1, −3.2 <sup>μ</sup>mol.g−1, and 0.0048, respectively. Compared to the range of DIC, TA, and pH that is shown by the surface underway measurements along all of the transects (Table 1), the values for the mean bias show low variations and a good ANN algorithm performance. Importantly, apart from the fact that no outliers were detected, the overall dispersion of the ANN-estimated values is well within the range of those shown by the M/V data. Some skewness is shown by the ANN-estimated pH and, to a lesser extent, for DIC. The similarity between these three sets of data is statistically significant at the 99% confidence level.

These results point to an effective BPN algorithm that is able to capture the information provided by the chosen environmental drivers. It is important to note that for oceanic and coastal regions with a different matrix of environmental drivers (such as for areas with high chlorophyll-a, where the net productivity is likely to perturb the carbonate system more, or in areas where there are river inputs), further learning of the BPN algorithm is therefore recommended.

**Figure 5.** Median and variability of ANN-estimated values fall within those shown by discrete underway measurements (Winter 2015 cruise transect). The means of the two datasets are similar at the 99% C.L.

3.2.2. M/V Equinox—30 October to 6 November 2016, North Atlantic Ocean (20◦ N to 40◦ N; −80◦ W to −10◦ W)

Similarly, Figure 6 shows the resultant statistical evaluation when the ANN-estimated values were compared against the corresponding in situ data. As for the previous validation set, the predictions for the October–November 2016 dataset were in good agreement with the co-located and co-temporal M/V Equinox data. Overall, the ANN-estimated data show less dispersion than the in situ values and that the spread of the former is well within that

shown by the data from M/V Equinox. The few ANN-estimated outliers are well within the interquartile range of the M/V Equinox data.

**Figure 6.** Median and variability of ANN-estimated values fall within those shown by discrete underway measurements (Autumn 2016 cruise transect). The means of the two datasets are similar at the 99% C.L.

The performance indicators between the modeled and the validation dataset 1 (i.e., the 7–8 March 2015 in situ dataset) point to a stronger estimation than in the case of the second validation dataset. This is most likely because dataset 1 is based on the same seasonal variations of the carbonate chemistry when compared to the second validation sample that was collected during the Autumn of 2016. The mean bias values generally show a non-Gaussian distribution and spread, with the exception of TA for both of the validation datasets, and pH for the Spring 2015 dataset (Figure 7). In the latter case, the residuals are skewed toward lower modeled values.

The uncertainties that were inherent in the in situ measurements were not included in the metadata information within NCEI Accession 0154382, and therefore this element of uncertainty attributed to the surface underway observation could not be evaluated. Overall, however, the results' metrics are very comparable to the validation metrics that were obtained by Fourrier et al., for their neural network estimation of pH and total alkalinity in the Mediterranean [80]. It is rather complex to identify the main sources of the observed metric errors in view of (1) the procedure that was used by this study and (2) the uncertainty embedded in the in situ data that were used for both the BNP algorithm training and its validation; however, this bias could be expected to decrease if the following steps are taken:


(**b**) M/V Equinox: 30 October–6 November 2016, North Atlantic Ocean.

**Figure 7.** Histogram reporting the distribution of the mean bias values for DIC, TA, and pH.

#### *3.3. Model Applications: ANN-Derived Ocean Variability of DIC, TA, and pH over the Mid-North Atlantic Ocean*

Based on the previous two validation studies that span different time periods and geographical areas (where each area manifests its own variability in terms of the magnitude of the environmental drivers), we were able to apply the validated ANN topology to model DIC, TA, and pH within the ROI described in Section 2.4 at a resolution of 0.04167◦. The final product was a set of gridded, time-specific geophysical maps of these predictands (i.e., surface DIC, TA, and pH). The resolution of these maps took on the native resolution of the input (i.e., predictor) datasets (i.e., 17 environmental drivers). If needed, these raster outputs can be subsequently re-gridded to coarser resolutions in order to (1) further understand the spatiotemporal variability of the carbonate system over specific oceanic regions, (2) comprehensively map the carbonate system components in support of the cruise data, and (3) input the predicted values into numerical modeling systems (such as ocean forecasting models).

Figure 8a–d represents the gridded output of DIC, TA, and pH maps for the area of interest that were produced by the ANN algorithm. The data gaps represent that no ocean surface data are available whenever clouds obstruct part of the field of view of the optical satellite sensors, at which points the ANN algorithm nullifies the predictions. These high-resolution data representing the carbonate system of the area can be exploited by other modeling activities, including data assimilation for general circulation models [81] and improved model reanalyses [82], as well as the identification of daily trends over sensitive marine areas [83].

**Figure 8.** *Cont*.

**Figure 8.** Modeled, gridded (**a**) DIC, (**b**) TA, and (**c**) pH maps for the ROI produced by the ANN algorithm. The spatial resolution is 0.04◦ × 0.04◦, which corresponds to the native spatial resolution of some of the predictands. The (**d**) *p*CO2 map valid for 30 September until 31 October 2016 has been inserted for reference [78]).

Figure 8d shows how the co-temporal spatial distribution of *p*CO2 that has been derived by the Landschützer et al., dataset [84] and grid-resampled over our exact area of study is similar to the way that the ANN-estimated pH is distributed. It clearly shows higher *p*CO2 levels over areas with a lower pH estimate (Figure 8c). This relationship corresponds with the results that were obtained by Sutton et al., (2014) and by Bates et al., (2012) when they studied the variability between *p*CO2 and pH over the Pacific Ocean and the Atlantic Ocean surface, respectively [19,85]. In our study, the subtle gradient in *p*CO2 from east to west at around 27◦ N in Figure 8d is well captured by the modeled spatial variation of the pH high resolution field over the same area (Figure 8c, including the relatively lower pH values corresponding to the northerly *p*CO2 'tongue' originating from around −58◦ W, 26◦ N (Figure 8d).

#### 3.3.1. Validation of the Modeled Data over the Mid-North Atlantic Ocean

#### **In Situ Cruise Data**

The results of the data validation against the in situ datasets available over the same area using validation dataset 3 are shown in Table 5. The in situ cruise transect (comprising StationIDs 1120000–1160000) did not include any pH measurements along the way. The linear regression analysis shows that the correlation between the TA datasets is statistically significant (*p* < 0.05; 95% C.L.). Moreover, the regressed observations and ANN-estimated DIC and TA values fall within the predicted 95% confidence level of the regression line.

**Table 5.** Corresponding ship-based and ANN-estimated values for DIC, TA, and pH. In situ pH measurements were not collected by M/V Equinox during part of the transect of 28 April–6 May 2015 (Validation dataset 3). (n/a: not available). The location of the individual StationIDs is as follows: 1120000: (31.1390◦ N, −60.5765◦ W); 1130000: (31.3795◦ N, −59.3347◦ W); 1140000: (31.7085◦ N, −57.6472◦ W); 1150000: (32.1818◦ N, −55.2020◦ W); 1160000: (32.7458◦ N, −52.2730◦ W); 1200000: (34.0460◦ N, −45.4433◦ W); and 1330000: (27.5105◦ N, −78.8207◦ W).

