**1. Introduction**

The contamination of marine environments is an increasing global concern because of the potential risks to both human health and marine ecosystems, which are heavily affected along the coast. The Mediterranean Sea, due to reduced circulation and the presence of multiple industrial inputs along the coastline, is particularly vulnerable to environmental impacts and risks. Moreover, there is evidence that contamination may persist long after the end of industrial activities [1]. The historical industrial district and the metallurgical production at the Bagnoli steel factory (ILVA), active for roughly a century, have exposed the marine sediments of the Gulf of Pozzuoli (GoP) to pollution by heavy metals and polycyclic aromatic hydrocarbons (PAHs). This area became a key site for twentieth-century Italian economic growth through industrial plants that produced steel and cement using iron ore and coal transported from other locations and processed on site. These activities, now recognized as detrimental to the environment and economically impractical, prompted the dismantling of the industrial area starting in the mid-1990s [2]. The impact of heavy industry was never completely remediated, however, and this negligence has resulted in high concentrations of PAHs and trace metals in marine sediments [3]. Sediment contamination is still evident in the vicinity of the industrial sites and is also widespread in neighboring areas due to sediment resuspension and water currents [4–7]. Furthermore, sewage losses from wastewater treatment plants or accommodation facilities along the western Bagnoli coastline have increased the magnitude of marine contamination [8,9]. Bagnoli has been the subject of numerous scientific studies aimed at verifying the current condition of the water and marine sediments of the GoP. Recent work has focused on bioaccumulation and biomarker investigations to better understand the toxic effects and mechanisms of action of contaminants [10–12]. Recently, an integrated assessment called the sediment quality triad (SQT) was used to consider chemical analyses and biological effects as different lines of evidence (LOEs) to describe the environmental quality of marine sediments [2,13]. A multidisciplinary approach (the weight of evidence approach, WoE) allows researchers to interpret environmental conditions more comprehensively than a univariate analysis, such as a purely chemical approach, and to avoid excessive and costly management decisions [11]. The WoE approach, integrating five LOEs (sediment chemistry, bioavailability of chemicals, subcellular effects, toxicity at the organism level, and toxicity at the community level), was successfully applied to the Bagnoli case study and revealed a clearly polluted area, but with less critical levels of pollution than indicated by sediment chemistry alone [2].

This study, developed in the framework of the ABBaCo project, which started in 2017, is designed to (a) update and improve the characterization of the environmental quality of the Bagnoli industrial site, (b) identify contamination sources, and (c) propose suitable remediation strategies. In the Bagnoli area, the determination of pollution sources is particularly challenging due to the coexistence of natural processes and anthropic activities, which makes source apportionment more difficult. Moreover, some contaminants are now almost ubiquitous (e.g., PAH contamination is present in urban areas [14], rural areas [15], and coastal areas [5]).

The GoP is an area characterized by intense volcanic activity due to the presence of a large caldera, the Phlegraean Fields, which is one of the most densely populated active volcanoes on Earth. It has a strong record of historical unrest and eruption events that date back to 2.2 ka BP. Since the 1950s, the Phlegraean Fields area has undergone four episodes of caldera-wide uplift and seismicity, which have raised the coastal town of Pozzuoli, near the center of unrest, by 4.5 m and triggered the repeated evacuation of some 40,000 people. After 20 years of subsidence following the uplift peak reached in 1984, the caldera started a new, low-rate uplift episode accompanied by low-magnitude seismicity and marked geochemical changes in fumaroles [16]. For this reason, some elements, such as arsenic (As), are naturally enriched in the area. In some Italian areas, such as the Po plain, it is recognized that arsenic originates from the reductive dissolution of Fe oxides [17]. Therefore, in the present study, the presence of arsenic can be regarded as a characteristic natural background level of the study area [6]. This natural presence of heavy metals in the marine sediments of the gulf is attributed to an active system of submarine thermal springs near the Bagnoli coastline that constantly release volcanic gases [18,19]. PAHs, however, also originate from percolation through soils or landfills contaminated by industrial activities [7].

For source apportionment in a highly polluted area, the use of statistical tools is well recognized. Several studies report the advantages of using multivariate statistical analysis [20–27] to interpret contaminant distributions and pollutant patterns. For the marine environment, studies have used principal component analysis (PCA) [7,15,28–30], bivariate correlation analysis (Pearson coefficient) [31,32], nonparametric multivariate multiple regression analyses [33,34], canonical analysis of principal coordinates [33], multivariate linkage tree analysis [34], and permutational multivariate analysis of variance (PERMANOVA) [8,33,34] to investigate the co-occurrence of a suite of pollutants in sediments and to assess the related response of the biological assemblages that inhabit the seabed. Some correlation analyses were carried out between pollutant concentrations and sediment granulometry [31]. Other studies examined the distribution patterns of the meiofauna and the diversity and abundance of microorganisms inhabiting the sediments of the GoP in relation to environmental variation and chemical pollution [8,33,34]. Most PAH inputs to the environment are linked to anthropogenic activity (e.g., wastes from industrialized and urbanized areas, offshore petroleum hydrocarbon production, or petroleum transportation) [35]. One of the main issues is the connection between pollutant concentrations and their possible source (natural or anthropic), combined with the influence of wave climate on their concentration patterns. Within this project, the present research aims to apply a robust statistical approach (principal component analysis/factor analysis, PCA/FA) to demonstrate its practical application for assessing the main contamination sources, distinguishing between natural and anthropic/industrial contributions, and finding correlations between contaminant concentrations and wave hydrodynamics in the area. The workflow of the study is shown in Figure 1.

**Figure 1.** Workflow scheme of the study. The work generates three main result boxes: the first and second boxes contain the bivariate correlation analyses between principal components/factors (PC/Fs) and distances from the sewage discharge and between PC/Fs and distances from the thermal spring, respectively, while the third box contains the hierarchical cluster analysis (HCA) and the Kruskal–Wallis test.
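The first two result boxes of the workflow (correlating PC/Fs with the distance from a candidate source) can be illustrated with a minimal sketch. It is not the procedure used in this study: it assumes plain PCA on standardized concentrations (no factor rotation), a Pearson coefficient for the bivariate correlation, and entirely synthetic data. The function names `pca_scores` and `pearson_r`, the variable `distance_km`, and the exponential decay of concentration with distance are hypothetical choices made only for illustration.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """PCA on standardized data (equivalent to PCA on the correlation matrix).

    Returns the component scores, the loadings (principal axes) and the
    fraction of total variance explained by each retained component.
    """
    X = np.asarray(X, dtype=float)
    # z-score each variable so that units and magnitudes do not dominate
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized matrix: rows of Vt are the principal axes
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]


def pearson_r(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])


# --- Synthetic illustration (NOT data from the study) ---
rng = np.random.default_rng(0)
n_sites = 40
# Hypothetical distance of each sampling site from a candidate source
distance_km = rng.uniform(0.1, 5.0, n_sites)
# Assumed behavior: source-related contaminants decay with distance
source_signal = np.exp(-distance_km)

X = np.column_stack([
    2.0 * source_signal + rng.normal(0.0, 0.05, n_sites),  # contaminant A
    1.5 * source_signal + rng.normal(0.0, 0.05, n_sites),  # contaminant B
    1.0 * source_signal + rng.normal(0.0, 0.05, n_sites),  # contaminant C
    rng.normal(0.0, 1.0, n_sites),                         # unrelated variable
])

scores, loadings, explained = pca_scores(X, n_components=2)
r = pearson_r(scores[:, 0], distance_km)

print("Explained variance ratio:", np.round(explained, 3))
print("PC1 loadings:", np.round(loadings[:, 0], 3))
print("Pearson r between PC1 scores and distance:", round(r, 3))
```

Because the sign of a principal component is arbitrary, only the magnitude of the correlation with distance is meaningful here. A component whose scores correlate strongly with the distance from one source and whose loadings are dominated by a group of contaminants would, under these assumptions, point to that source as a plausible contributor.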
