**2. Materials and Methods**

### *2.1. Study Area and Geological Settings*

The study area, Bagnoli, is located within the Bay of Pozzuoli, in the western sector of the urban territory of Naples. Close to the Bagnoli site, the already mentioned Phlegraean fields is a large "super volcano" situated to the west of Naples, Italy. Declared a regional park in 2003, the area of the caldera consists of 24 craters and volcanic edifices, most of them submerged under the bay with very high hydrothermal activity and effusive gaseous manifestations [19] (Figure 2). From a geological point of view, this large caldera is in a state of quiescence. The Phlegraean fields experienced extreme volcanic activity in the last 39,000 years. The two main events occurred 35,000 years ago (the 'Campanian ignimbrite eruption'), followed by caldera collapse, and 15,000 years ago (the 'Neapolitan Yellow Tuff', NYT eruption). The NYT eruption formed the central and eastern part of the Bagnoli–Fuorigrotta plain, which is constituted by a sequence of pyroclastic and volcanic material that thickens along the Agnano crater. Agnano volcanic products dominates the western part of the plain. The last eruptive

event occurred in A.D. 1538 and gave rise to the Mt. Nuovo cone. From then on, only bradyseismic and hydrothermal activity (in the Solfatara and Pisciarelli area) are present in the area. The ground deformation in this area was recognized as caused mainly by the actions of long periods of uplift, eruptive activity, and possible deep magma movements [36–38]. According to some studies [39], the Phlegraean fields activity has been predominantly explosive and characterized by an interaction between magma and water. There is evidence of both interaction of the magma with sea water as well as intra-caldera lake water and deep-seated aquifers in some Plinian volcanic events. High concentrations of metals and metalloids, such as As, Mn and Fe are found in groundwater and in the sea sediments near the main tectonic and hydrothermal activity [3,18,31,40,41].

**Figure 2.** Location of the study area, Bagnoli bay, Naples, Italy.

Since the early 1900's Bagnoli has been affected by the presence of various industries. The most important industrial activity was through ILVA steelworks, formerly Italsider. The ILVA steel plant was located along the Bagnoli coastline and was characterized by two long piers (still present now) which served as a berth for large ships that carried raw materials such as coal, iron ore and limestone to the steelworks. In the early 1960's, in order to enlarge the plant, the area between the two piers was filled with industrial waste coming from the steelworks. A new coastline was designed, giving more space for the industrial activities, but causing consistent damage to the marine ecosystems. Furthermore, the presence of nine discharge points along the coastline, combining both industrial and civil sewage, may have contributed to the deterioration of the quality of the environment. Some of these collectors are no longer in use since the steelworks stopped its production in 1990.

The Bagnoli Bay is also characterized by the presence of some submarine thermal springs, mainly located close to the former industrial area. Some offshore thermal springs are also present in the GoP.

#### *2.2. Data Availability: Sampling, Sewage Discharge, and Wave Information*

In this study, a total of 126 sampling points were considered as shown in Figure 3 (black dot points). These samples were part of an explorative campaign carried out in 2017 in the GoP by the Stazione Zoologica Anton Dohrn [42,43]. On the sea bottom, 94 sediment samples were collected using drilling (0-50 cm) and 32 sediment samples were collected by bucket (surface layer). In each point, the concentrations of heavy metals and PAHs were measured. A total of 11 heavy metals (i.e., Al, As, Cd, Cr, Cu, Fe, Hg, Ni, Pb, V, and Zn) and 18 PAHs (i.e., naphthalene, anthracene, phenanthrene, acenaphthylene, acenaphthene, fluorene, fluoranthene, pyrene, benzo(a)anthracene, chrysene, benzo(b)fluoranthene, benzo(a)pyrene, benzo(k)fluoranthene, indeno(1,2,3-cd)pyrene, benzo(ghi)perylene, dibenz(ah)anthracene, benz(j)fluoranthene, benz(e)pyrene) indicated by Environmental Protection Agency (EPA) as important toxicological contaminants were considered. All the samples and the analytical determination derive from the series of studies conducted under the framework of the ABBaCo project [4,33]. Contaminants general statistics are reported in Table 1.


**Table 1.** General statistics of the sampled contaminants in Bagnoli sea sediments. Heavy metals (in mg/kg) and polycyclic aromatic hydrocarbons (PAHs, in μg/kg) are divided by bold line.

A total of 14 sewage discharge points were mapped and selected in the study area (from 1 to 9C, in Figure 3). Some of these discharge points were supposed to be inactive (e.g., Points 4, 7 and 8). However, it is well recognized that due to poor maintenance and some illegal discharges in sewage disposal these points, especially when heavy rains occur, release water carrying waste and sewage to the coastline. For this reason, all the discharge points shown in the figure have been considered as active. It should also be noted that the discharge points labeled as "C" in Figure 3 are control points chosen as the nearest to discharge point (i.e., 9C is the control point of discharge point 9).

**Figure 3.** Location and typology of sewage discharges located along the Bagnoli coastline. The capital letter "C" indicates the discharges which control the nearest discharge point (i.e., 9C is the control point of Discharge Point 9).

Representing natural contaminant sources spread along the Bagnoli bay seabed, 13 thermal springs were identified [18] (Figure 4).

**Figure 4.** Location of natural sources inside the Bagnoli bay [12,18].

In addition to the presence of sewage discharges and thermal springs, concentration patterns can be influenced by marine dynamics that remobilize sediments and accumulate them in places where there are steady-state conditions. In order to include the analysis, the main effects of the wave-induced hydrodynamics inside the GoP area in the analysis, we incorporated wave properties generated by a numerical model of the gulf.

#### *2.3. Numerical Wave Modelling*

The wave climate analysis was carried out using the hindcast data from the European Centre for Medium-Range Forecasts (ECMWF). Significant wave height (*Hs*), mean period (*Tm*), and mean direction (Θ*m*) for the time period ranging from January 1979 to December 2018 were extracted from the wave model (WAM) of the ERA-interim archive (ERA: ECMWF Atmospheric Reanalysis), available for download online [44]. Both the offshore energetic patterns and the nearshore water conditions have been studied by means of the MIKE 21 SW coastal propagation model [45]. The model has been previously calibrated and validated applying a multi-collocation-based estimation approach as described in [46]. Such studies perform an optimization procedure for:


It is worth noting that spreading and dispersion studies and studies of water quality or ecological systems in marine areas are generally carried out by means of coastal hydrodynamic models coupled with sediment transport and particle tracking models. Generally, such modelling is computationally demanding. Therefore, in the perspective to undertake multi-year wave hindcast studies more quickly, a spectral modeling of wave propagation has been used. In order to assess the hypothesis that the bay's hydrodynamics is a factor of influence on pollution patterns, a single wave scenario able to represent the yearly average condition in the bay in terms of wave energy flux, had to be selected. Such a sea state, here defined as "energy equivalent", was obtained considering the wave power content of each wave from the whole dataset of 40-year wave records. Operatively, from the 6-h data of *Hs*, *Tm*, θ*<sup>m</sup>* provided by the fictious WAM model, a 6-h wave power and a wave energy dataset were obtained. For a specific sea state, the average wave energy per unit area is proportional to the square of the significant wave height, *Hs*, according to the known relationship [48]:

$$E\_{\rm new\nu} = \frac{1}{16} \rho g H\_s^2 \left[\frac{f}{s}\right] \tag{1}$$

where ρ is the sea water density, assumed equal to 1025 kg/m3, *g* is the gravity acceleration (equal to 9.81 m/s2). For a given spectrum, the significant wave height computation is based on zero-order moment of the spectral function and readily estimated as follows:

$$H\_s = H\_{m0} = 4\sqrt{m\_0} \tag{2}$$

and the wave characteristic/statistic periods can be defined as:

$$T\_{\varepsilon} = 2\pi \frac{m\_{-1}}{m\_0} \tag{3}$$

$$T\_m = T\_{01} = 2\pi \frac{m\_0}{m\_1} \tag{4}$$

where *Te*, *T*<sup>01</sup> are the energy period and the spectral mean period, respectively, with the spectral moment being defined as:

$$m\_n = \int\_0^\infty S(f) f^n df \tag{5}$$

where *n* = −2, −1, 0, 1, 2, ...

The wave power in irregular waves can be computed as:

$$P = \frac{\rho \times g^2 \times H\_s^2 \times T\_c}{64 \times \pi} \tag{6}$$

Following a conservative approach, according to [49,50], the energy period in the present study has been assumed as 1.14 *Tm*. Equation (6) can be written in the approximate deep water expression:

$$P = 0.459 \times H\_s^2 \times T\_\varepsilon \tag{7}$$

For each *j*-th year, an average wave power (*PJ*) has been computed. Considering that the energy period can be computed directly using the approximate formula:

$$T\_c = 4.5 \times \sqrt{H\_s} \tag{8}$$

then, an "energetic" yearly significant wave height, *He*, can be calculated as:

$$H\_{\mathfrak{e},j} = \sqrt[2.5]{\frac{P\_j}{0.459 \times 4.5}}\tag{9}$$

for *j* = 1, ... , 40, and a correspondent energy period can be estimated as follows:

$$T\_{\rm av,j} = 4.5 \times \sqrt{H\_{\rm c,j}} \tag{10}$$

The 40-year average for these parameters gives *He* = 0.93 m and *Tee* = 4.3 s. Regarding the representative direction, *De*, a vectoral analysis about the energy flows (carried out comparing records of ADCP installed in the bay and correspondent offshore waves) indicate as the offshore wave direction can be considered 217◦ N. Finally, the "energy equivalent scenario" with *He*, *Tee* and *De* has been propagated applying the MIKE 21 SW model. The seabed was performed by interpolating at the grid nodes the information provided by the General Bathymetric Chart of the Oceans (GEBCO) database [51]. A gross verification of the 10-m isobath, as representative of the nearshore region in a wave propagation model, has been applied by comparison with a recent bathymetric campaign [19].

#### *2.4. Statistical Analysis: Multivariate Analysis and Bivariate Correlation*

All the statistical computations were implemented using the statistical software IBM SPSS Statistics® (version 26.0), while spatial queries, distance calculation and data mapping were managed through the open source environmental software QGIS (version 3.8.3, https://qgis.org/en/site/).

PCA [52] is a powerful technique to investigate patterns of correlations among a set of variables. PCA in fact allows to extract a set of new variables, uncorrelated, called principal components (herein after PCs), each aggregating the variables more strongly correlated with the PC and among themselves. PCA seeks in fact a linear combination of input variables, which extracts the maximum variance of the bivariate correlation matrix. Following this, with a second linear combination, orthogonal from the first one, PCA extracts the maximum variance of the remaining variance, and so on. PCs represent all the linear combinations of the original variables weighted by their contribution in explaining the variance, in a specific orthogonal dimension. PCA was turned into a factor analysis (FA) to reduce the contribution of the less significant parameters within each component, by extracting a new set of vari-factors through rotating the axes defined by the PCA extraction. The Varimax rotation criterion was used to rotate the PCA axes allowing us to maintain the axes, orthogonality. The number of factors to be retained was chosen based on the "eigenvalue higher than 1" criterion (i.e., all the factors that explained less than the variance of one of the original variables were discarded).

To assess the variability within the whole set of contaminants, PCA/FA was performed considering all 11 metals, and all 18 PAHs (Table 1). For each of the *N* sampling points, PCA enabled to calculate principal component scores aggregating the information of the different parameters' concentration values. Moreover, in order to assess correlations between the sampled contaminants concentrations and the position of sewage discharges and thermal springs, a distance matrix was created through the QGIS vector toolbox, containing the pairwise distance between each sampling point and each discharge point. The measurements-discharges distance matrix was then associated to the component loadings and a bivariate Pearson's linear correlation analysis was performed. Finally, in order to assess the possible effects of the waves' dynamics on the concentration patterns found with PCA, a Kruskal Wallis test was performed to test the significance of hydrodynamics' effect on the contaminant concentrations, likewise similar studies [53–56]. Five wave characteristics were considered, such as the direction, the height, the mean peak period, the energy and the resulting power of the wave motion inside the bay. These characteristics were subdivided into classes, identified based on a hierarchical cluster analysis (HCA) [52] and tested as factors on the vari-factors assumed as dependent variable in the Kruskal Wallis test. Therefore, the KW null hypothesis was the equality of the vari-factor medians across wave characteristics classes. The level of significance assumed was *P*-value lower than 0.05.

## **3. Results**
