*Article* **Water-Rock Interaction Processes: A Local Scale Study on Arsenic Sources and Release Mechanisms from a Volcanic Rock Matrix**

**Daniele Parrone \*, Stefano Ghergo, Elisabetta Preziosi and Barbara Casentini**

Water Research Institute—National Research Council, IRSA-CNR, Via Salaria km 29.300, PB 10, 00015 Rome, Italy; stefano.ghergo@irsa.cnr.it (S.G.); elisabetta.preziosi@irsa.cnr.it (E.P.); barbara.casentini@irsa.cnr.it (B.C.) **\*** Correspondence: daniele.parrone@irsa.cnr.it

**Abstract:** Arsenic is a potentially toxic element (PTE) that is widely present in groundwater, with concentrations often exceeding the WHO drinking water guideline value (10.0 µg/L), entailing a prominent risk to human health due to long-term exposure. We investigated its origin in groundwater in a study area located north of Rome (Italy) in a volcanic-sedimentary aquifer. Some possible mineralogical sources and main mechanisms governing As mobilization from a representative volcanic tuff have been investigated via laboratory experiments, such as selective sequential extraction and dissolution tests mimicking different release conditions. Arsenic in groundwater ranges from 0.2 to 50.6 µg/L. It does not exhibit a defined spatial distribution, and it shows positive correlations with other PTEs typical of a volcanic environment, such as F, U, and V. Various potential As-bearing phases, such as zeolites, iron oxyhydroxides, calcite, and pyrite are present in the tuff samples. Arsenic in the rocks shows concentrations in the range of 17–41 mg/kg and is mostly associated with a minor fraction of the rock constituted by FeOOH, in particular, low crystalline, containing up to 70% of total As. Secondary fractions include specifically adsorbed As, As-coprecipitated or bound to calcite and linked to sulfides. Results show that As in groundwater mainly originates from water-rock interaction processes. The release of As into groundwater most likely occurs through desorption phenomena in the presence of specific exchangers and, although locally, via the reductive dissolution of Fe oxy-hydroxides.

**Keywords:** potentially toxic elements; aquifer; volcanic tuff; sequential extraction; Fe oxyhydroxides; sorption

#### **1. Introduction**

Arsenic is a natural element easily found throughout the environment. Its presence in drinking water represents a hazard to human health [1]. Long-term exposure to As-rich waters can cause serious diseases, from skin lesions, cardiovascular diseases, and type II diabetes to bladder, lung, and skin cancers [1–3]. In groundwater, this element frequently shows concentrations exceeding 10.0 µg/L, which is the guideline value suggested by WHO [4] for drinking water and was implemented in Europe as standard by 98/83/EC [5], entailing a prominent risk to human health.

The presence of this potentially toxic element (PTE) in groundwater depends on the hydrogeological setting as well as on various natural processes, including climate, biological activity, and volcanic emissions. Since the 1990s, wide occurrences of As in well water in Bangladesh have attracted the attention of the scientific community, which seeks to deepen its knowledge of the origin and fate of arsenic in groundwater [6–9]. Water-rock interactions under favorable biogeochemical conditions are considered to be the most important mechanism for the release of As in aquifers by far [10]. Arsenic concentrations in natural waters may range from 0.5 to > 5000 µg/L [9]. Arsenic-enriched groundwater has been reported in many parts of the world, mainly in Asia (Bangladesh, China, India,

**Citation:** Parrone, D.; Ghergo, S.; Preziosi, E.; Casentini, B. Water-Rock Interaction Processes: A Local Scale Study on Arsenic Sources and Release Mechanisms from a Volcanic Rock Matrix. *Toxics* **2022**, *10*, 288. https://doi.org/10.3390/ toxics10060288

Academic Editor: Ilaria Guagliardi

Received: 29 April 2022 Accepted: 25 May 2022 Published: 27 May 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

Vietnam, Nepal), the Americas (USA, Mexico, Argentina, Chile, etc.), and Europe (Italy, Greece, the Pannonian Basin, etc.) [11]. Arsenic release in groundwater has been mainly attributed to three possible processes: the reductive dissolution of Fe/Mn hydroxides, ion exchange with different competitive exchangers (e.g., phosphate, bicarbonate, and silicate), and the oxidation of arsenic-bearing minerals [9,12,13]. Precipitation/dissolution, sorption/desorption, biotic and abiotic oxidation/reduction [14,15], and complex formation are the main reactions controlling the distribution of arsenic between the aqueous and solid phases.

The scientific literature is abundant regarding the presence of As and the main processes leading to its release in groundwater in alluvial environments [16–20], though less so in volcanic contexts. Alarcón-Herrera et al. [21] found a co-occurrence of As and F in oxidized and alkaline environments due to interactions with volcanic glass and, to a lesser extent, with hydrothermal minerals. Further, they identified Fe, Mn, and Al (hydr-)oxides as secondary sources due to their great adsorbing properties. In Argentina, Francisca and Carro Perez [22] reported the presence of As, F, and V in groundwater due to volcanic glass leaching. In a volcanic aquifer in Mexico, under oxidative conditions and high temperature, Morales et al. [23] observed an As mobilization from metallic sulfides, which was favored by thermal water rising through faults and fractures. Ren et al. [24] identify Tertiary volcanic tuffs as the main source of the high As content in groundwater, hypothesizing an origin linked to the leaching of Arsenogoyazite-like arsenic minerals. In a geothermally active area of northern Greece, Winkel et al. [25] analyzed different travertines containing high levels of arsenic (up to 913 mg/kg) and found that it mainly coprecipitated with calcite. In this regard, Di Benedetto et al. [26] claimed that As incorporation in calcite may be an effective limit of As mobility under conditions where immobilization through sorption by Fe and/or Mn oxy-hydroxides is not acting.

In the volcanic aquifers of northern Latium (central Italy), the presence of As and other PTEs typical of the volcanic environment is well known [27,28]. However, the main origin of the As contamination, as well as whether a common process associates arsenic and the other PTEs, is still unclear. Vivona et al. [29] claimed volcanic rock leaching in cold groundwater, while Angelone et al. [30] pointed to a major influence of localized thermal fluids rising along fractures. In the same region, Armiento et al. [31] and Cinti et al. [32] ascribed the diffused As concentration in groundwater to water-rock interaction processes, locally enhanced by thermalism and volcanic gas emissions. Casentini et al. [33] performed batch leaching tests to investigate the release of As induced by interactions of volcanic rocks (lava and tuffs) with inorganic anions and found a positive influence of F and HCO3, likely due to anion exchange processes.

The main laboratory experiments aiming to understand the mechanisms of arsenic release include column tests [34–36] and batch tests [37–39]. Several laboratory studies showed that As is released from sediments under anaerobic conditions [40–42], particularly following the reductive dissolution of Fe and Mn oxyhydroxides [43]. Batch tests are often used to study the effects of oxidation-reduction potential (ORP) variations [19], which also affect pH [44] and could naturally occur as a consequence of the mixing of waters with different chemical compositions. Frohne et al. [38] found that dissolved As concentration decreases significantly with increasing ORP and concluded that low ORP levels promote As mobility. Batch tests have also been used to investigate As sorption and dissolution kinetics linked to various mineral phases [45,46].

Arsenic is relatively scarce in the Earth's continental crust (1.5–2 mg/kg) and can be associated with different As-bearing phases [47]. In this regard, Selective Sequential Extraction (SSE) is a valuable procedure to quantify the distribution of As between different solid fractions that constitute the rocks [48–51], potentially providing important information about the adsorption-bearing phases and related processes occurring at the soil/water interface [52]. The literature does not lack works on sequential extraction methods aimed at arsenic fractionation [53–59], and additional procedures have been proposed to improve the extraction of specific As-bearing phases [60,61]. SSE has been used to characterize As distribution and mobility both in contaminated areas [62–65] and in pristine aquifers.

This work aims to study the main mechanisms that govern As behavior in the waterrock interaction processes in a volcanic context where aquifer rocks are dominated by tuffs. The research was organized in the following three phases:


Lastly, field and laboratory data have been analyzed in order to evaluate the significant connections between aquifer mineralogy and groundwater geochemistry, with particular reference to As, and also to deepen its relationships with other PTEs in groundwater.

#### **2. Materials and Methods**

*2.1. Study Area*

The study area (Figure 1) lies on the eastern flank of the Sabatini volcanic apparatus.

**Figure 1.** Hydrogeological scheme of Latium volcanic domain, from Parrone et al. [66]. The yellow box indicates the investigated area.

Its activity developed from 0.6 to 0.04 M.a. b.p. through five main phases, during which ultrapotassic volcanic rocks ranging from trachybasalts to trachytes to phonolites were erupted. During the Sabatini volcanic district formation, the Bracciano lake volcanotectonic depression and several calderas developed [67–69]. The volcanic activity developed

through several vents distributed over a large area, the main one being the Sacrofano system in the east and the Bracciano system in the west.

Pleistocene volcanic products, which are usually characterized by an overall medium permeability [70], overlap, in angular unconformity, Pliocene and Pleistocene sedimentary deposits of marine and continental origin along with clays and silts that form the low permeability bottom of the groundwater body.

The volcanic deposits outcropping in the study area include mainly pyroclastic deposits: "Via Tiberina Yellow Tuff" (VTYT), "Sacrofano Tuff", and "La Storta Tuff" (Figure 2). Among these, the VTYT is the most important formation in the investigated aquifer. Cineritic levels and paleosoils determine local decreases of the vertical permeability resulting in a multilayer aquifer with semiconfined horizons [71].

**Figure 2.** Simplified geological scheme with the location of the rock sampling points of the "Via Tiberina Yellow Tuff" (VTYT) and the groundwater sampling points.

The VTYT is a pyroclastic flow (ignimbrite) produced by the activity of the Sacrofano center, located about 15 km west of the Tiber Valley, which has been set in place directly on the Plio-Pleistocene sedimentary basement. The VTYT has several overlapping units with different lithological characters. Nappi et al. [72] identified a stratigraphically upper lithology with components of smaller dimensions and, between the abundant zeolites, chabazite prevailing over phillipsite; the other, which can be found in the lower levels of the formation, has coarse grain size, has more abundant phillipsite, and shows alterations (clays) due to the action of groundwater. De Rita et al. [73] distinguish three units, spaced by pyroclastite layers of different nature, while Campobasso et al. [74] recognize at least seven depositional units.

The formation has been extensively described from the petrographic and chemical points of view by Lombardi and Marra [75]. The VTYT appears as a massive aggregate, mainly lithoid, of pumices of centimetric sizes linked by a very fine matrix, with colors

ranging from yellow to light gray. The average porosity stands at 48% [75]. As reported by Jackson et al. [76], glassy matrix represents 42–54% of the rock, lithic fragments and crystals 6–14%, and secondary minerals such as zeolites and calcite 37–48%. Zeolitization processes of the vitreous mass are evident, with the formation of chabazite and phillipsite aggregates that are primarily responsible for the high degree of lithification of the rock. Another widespread secondary mineral is calcite, resulting from precipitation of the water circulating in diagenetic phases and present not only in the common carbonate fragments but also in the cementing matrix [75].

#### *2.2. Groundwater Sampling and Analysis*

The first phase of the work consisted of sampling and analysis of groundwater samples from the investigated area. This phase is aimed at the geochemical characterization of circulating groundwater, with a particular focus on the As content and distribution in the two study sectors.

A total of 69 groundwater samples (Figure 2) were collected from wells (64) and springs (5), of which 32 were in the northern sector and 37 were in the southern one. These are mainly private wells for domestic/irrigation use.

At each site, pH, EC, Eh, DO, and water T were measured with probes (Hach-Lange). The sampling was carried out after a purging of the wells until the physical-chemical parameters stabilized (usually 20–30 min). All samples were filtered in the field with 0.45 µm membrane filters under N<sup>2</sup> and stored in HNO<sup>3</sup> 1% rinsed polyethylene bottles. One fraction was immediately acidified by HNO<sup>3</sup> (Suprapure, Merck) for major cations and trace metal determination. Bicarbonates were determined in laboratory by HCl (Suprapure, Merck) titration on 50 mL of sample within 24 h from sampling. Field blanks with ultrapure water were periodically collected, checking for possible environmental contaminations. The samples were analyzed for anions by ion chromatography (IC, Dionex DX-120), for major cations by ICP-OES (Perkin Elmer P400) and trace elements by inductively coupled plasma mass spectrometry (ICP-MS, Agilent 7500c); certified materials (NIST 1640a, trace elements in natural waters) were used to check accuracy of the laboratory measurements. The electro-neutrality (EN%) evaluated as the percent difference for major cations and anions, including F and NO3, ranges between −3.96% and +5.10%. For statistical analysis, the concentrations below detection limit (LOD) were assumed to be equal to half of LOD itself.

#### *2.3. Statistical Analysis of Groundwater Chemical Data*

A univariate statistical analysis of groundwater chemical data was carried out. Groundwater analytical data were processed by means of descriptive basic statistics, and the concentrations of some parameters were represented via scatterplots and distribution maps.

Normality or lognormality of As distribution was verified using the Shapiro-Wilk goodness of fit test [77]. In addition, Rosner's parametric test [78] and Huber's nonparametric test [79] for detection of data outliers were applied.

Finally, a bivariate correlation analysis in the two study sectors was carried out using both Pearson's linear correlation index, which is useful in case of normal data distributions, and Spearman's rank correlation index, which is more robust and suitable for skewed distributions or in presence of data outliers.

Data processing was performed through different software: ProUCL 5.1 [80] for the normality/lognormality and outlier detection statistical tests, Past 4.05 [81] for the bivariate correlation analysis, and ArcGIS 10.2.2 for the As concentration mapping.

#### *2.4. Outcropping Rock Sampling and Mineralogical Analysis*

Seven rock samples from the "Via Tiberina Yellow Tuff" (VTYT) were collected for mineralogical analysis (Figure 2). Four of these samples (PAR04, PAR10, PAR11, and PAR22) were extracted from excavation fronts of VTYF tuff quarries. The superficial portion of the samples was removed in order to eliminate the part of the rock that has undergone

exogenous alteration processes. The samples were then transported to the laboratory for the analysis.

Tuff samples were analyzed by X-Ray Powder Diffraction (XRPD) and Scanning Electron Microscopy (SEM), in order to have a first mineralogical characterization of the aquifer matrix. The XRPD patterns were collected with a Siemens D5000 diffractometer operating with Bragg-Brentano θ/2θ geometry, CuKα = 1.518 Å, 40 kV, and 40 mA. Each XRPD pattern was collected from 4 to 80◦ of 2θ, with a step scan of 0.02◦ . Identified Bragg reflections were assigned to the corresponding crystalline standards contained in the inorganic crystal structure database (ICSD). Powdered samples were prepared by hand grinding using an agate mortar and pestle, then sieved with a sieve of 0.5 mm mesh. Powders were further ground prior to the analysis to achieve a particle size < 50 µm.

SEM observations were performed with the Philips XL30 Analytical Scanning Electron Micro-scope equipped with secondary (2 nm imaging resolution), backscattered (0.1 AZ elemental resolution) electron detectors, and probe for Energy Dispersive X-ray Analysis (EDAX 134 eV) for the execution of punctual elemental analysis of the mineralogical phases and spectrum representation. SEM investigations were carried out on ordinary 30 µm thin sections after graphite sputter-coating of the samples.

Preliminary observations of the thin sections at the optical polarizing microscope (Nikon Eclipse E400 Pol) were also performed.

#### *2.5. Selective Sequential Extraction*

For three selected VTYT samples (PAR10, PAR11, and PAR22) we performed a Selective Sequential Extraction (SSE) in order to determine the distribution of As between the different solid bearing phases forming the rocks and to better understand the possible processes governing As mobility. Several protocols are available in literature to perform SSE analysis [82,83]. For the extraction procedure, we chose the methodology proposed by Wenzel [57] and modified it in order to consider a specific step for calcite from Costagliola et al. [61] and one for sulfides from Torres and Auleda [84].

The final procedure is able to distinguish seven fractions that can potentially bind As: (1) easily exchangeable As, (2) specifically adsorbed As, (3) As bound with calcite, (4) low crystalline Fe(III)-oxyhydroxides, (5) crystalline Fe(III)-oxides, (6) sulfides, and (7) residual fraction. The seven different steps that constitute the procedure are presented in Table 1.


**Table 1.** Selected sequential extraction procedure for As fractioning.

For the SSE, we started from 1 g of tuff powder. After each step, the suspension was centrifuged at 5000 rpm for 10 min, supernatant separated, diluted with MilliQ (18.2 MΩ cm at 25 ◦C), acidified with 1% HNO3, and stored at 4 ◦C until analysis conducted by ICP-MS (Agilent 7500c). Two replicates were performed for each rock sample.

#### *2.6. Batch Tests*

In order to investigate the water-rock interaction processes and evaluate the release of arsenic and other elements under different conditions, two batch tests were performed on a previously selected Yellow Tuff sample (PAR11). In polyethylene bottles, two different extracting solutions (40 mL) were added to 2 g of pulverized rock (solid/liquid ratio 1/20) using:


The chemical composition of the two extracting solutions is reported in Table S1.

The bottles were placed in a horizontal shaker at 400 rpm at room temperature, and aliquots (20 mL) were taken at fixed intervals (3, 6, 12, 24, 48, 120, 720 h). Samples were centrifuged at 5000 rpm for 10 min and supernatant separated, diluted with MilliQ (18.2 MΩ cm at 25 ◦C), acidified 1% with HNO3, and stored at 4 ◦C until analysis. Before each sampling step, pH value of the samples was measured with probes. After each sampling, 20 mL of fresh solution was added to the bottles, inducing new dissolution at each step.

A second test was carried out with an experimental apparatus consisting of a 500 mL polyethylene bottle filled with the solid matrix (PAR11) and the synthetic rainwater. Oxygen, redox, and pH variations in the system were continuously monitored by DO, Eh, and pH electrodes. The system was also equipped to be N<sup>2</sup> fluxed, and water samples were collected at fixed intervals. In addition, a solution of sulfide (25 mg/L S2−) was injected into the batch solution to simulate sulfide-rich hypoxic conditions. A total of 20 g of the pulverized tuff were placed in 400 mL of synthetic solution and stirred for the whole duration of the test using a magnetic stirrer. The test was divided into different steps:


For each sample, 15 mL of water was collected with a syringe, filtered with 0.4 µm (polycarbonate filters), diluted with MilliQ (18.2 MΩ cm at 25 ◦C), acidified by 1% with HNO3, and stored at 4 ◦C until analysis. Samples of the two batch tests were analyzed for anions (only filtered aliquots) by IC (Dionex DX-120), for Fe by UV/Vis spectrophotometry (Hach Lange DR2800), and for trace elements by ICP-MS (Agilent 7500c).

#### **3. Results and Discussion**

#### *3.1. Groundwater Geochemistry*

As observed by Parrone et al. [66], piezometric levels decrease according to the main flow direction from NNW-SSE towards the Tiber River. Groundwater shows the dominance of a bicarbonate-earth-alkaline facies, with samples ranging from Ca-HCO<sup>3</sup> types to Na+K-HCO<sup>3</sup> types.

Water samples clearly show different pH values, passing from acidic (median = 6.1) to alkaline (median = 7.7) conditions from the northern to the southern sector (Figure 3). Low pH values in the north are probably related to a widespread circulation of CO<sup>2</sup> that can be hypothesized to be in the area of the hydrogeological watershed near the ancient Sacrofano caldera, as evidenced by the bubbling observed during the sampling of numerous wells. Some parameters typical of volcanic environments (e.g., PO4, V), together with Al, show higher concentrations in the northern area, while an enrichment in HCO<sup>3</sup> and Na can be observed in the south (Table S2). This suggests a greater influence of the sedimentary

deposits on the groundwater chemistry in the southern sector. As shown in Figure 3, in support of this hypothesis, the Na/K ratio increases from the north (median = 1.05) to the south (median = 1.55). The enrichment in K represents a typical feature of the waters circulating in the potassium-alkaline volcanites of Central Italy [29,85].

**Figure 3.** Latitude trend of the pH values (left ordinate scale) and the Na/K ratio (right ordinate scale).

Arsenic does not present a clear spatial distribution (Figure S1), showing slightly higher values on average in the southern area and some localized peaks. In most of the samples (91.3%), As exceeds the drinking water standard (10.0 µg/L), without particular differences between the northern (maximum value 50.6 µg/L and 90.6% of exceedances) and the southern sector (maximum value 50.2 µg/L and 91.9% of exceedances).

In both sectors, As shows lognormal distributions (Shapiro-Wilk test, α = 0.05) and no data outliers (Rosner's parametric test, Huber's non-parametric test).

In Table 2, As' correlation with the chemical and physical-chemical parameters for the two sectors is shown. In the northern area, As shows direct correlations, both parametric and non-parametric, with F, U, and V. In the southern sector, the strong positive correlation with F remains along with Na, K, Li, and B, while a significant inverse correlation with oxygen can be observed.

**Table 2.** Pearson's (*r*) and Spearman's (*rs*) correlation coefficients among As and other chemical and physical-chemical parameters. Positive correlations are highlighted on a green scale, negative correlations on a red scale. Statistically significant values in bold (α = 0.05).



**Table 2.** *Cont.*

#### *3.2. Mineralogical Characterization of the Rocks*

The VTYT looks generally massive and coherent, with a yellow-gray color. The formation is usually affected by significant subvertical fractures and is often mineralized. Macroscopically, the tuff appears as an aggregate of pumice embedded in an ashy matrix. At the microscopic level, the VTYT samples show different fragments of rocks and minerals, such as sanidine, plagioclase, pyroxene, leucite (often analcimized), biotite, garnet, and apatite. The tuff skeleton also presents fragments of carbonate rocks, presumably ripped from the basement during the eruption.

XRPD highlighted the constant and important presence of zeolites (chabazite/hershcelite), a result consistent with data previously reported in the literature about this type of tuffs [75]. Other minerals identified within the rock with this technique include calcite, quartz, K-feldspar, mica (biotite), plagioclase, kaolinite, and chlorite.

SEM-EDX elemental analysis allowed us to identify other minerals within the rock: pyrite (Figure 4), magnetite, secondary calcite, titanite, rutile, zircon, monazite, other Fe, and rare earth oxides.

One of the objectives was the research of possible mineral phases containing particularly relevant As concentration; however, the qualitative analysis of the investigated crystalline forms showed no traces of elevated As, suggesting its possible presence as homogeneously dispersed arsenic at a level below the LOD of this technique.

#### *3.3. As Fractioning within the Selected Tuffs*

We performed a Selective Sequential Extraction on three selected VTYT samples (PAR10, PAR11, PAR22), with the purpose of quantifying the total As content and its distribution among the different solid phases that potentially constitute the aquifer matrix. The results allowed us to hypothesize the As-bearing phases that are more easily mobile from circulating waters and, accordingly, are the most likely geochemical mechanisms that can cause arsenic release from the investigated volcanic matrix.

The total arsenic in the sampled rocks is in the range of 17.5–40.9 mg/kg, which is data that is consistent with the findings of Armiento et al. [31] regarding the deposits of the Cimino-Vicano volcanic district. The sequential extraction procedure (Figure 5) shows As distribution among different solid fractions.

The As found in easily exchangeable fraction is negligible (0.22–0.41 mg/kg, corresponding to 0.9–1.4% of the total As). Conversely, the arsenic specifically adsorbed onto mineral surfaces ranges from 1.31 to 4.53 mg/kg (6.9–16.7%). Competitive ions in solution (e.g., phosphate) can affect the mobilization of this fraction, which can thus play an important role in the As contamination of groundwater in the study area.

**Figure 4.** SEM images: zeolites and pyrite (right) within the Via Tiberina Yellow Tuff. For pyrite the elemental analysis is also shown.

**Figure 5.** As fractioning resulting from the sequential extraction (two replicates for PAR10 and PAR22). The table (right) shows the weight percentage of interesting oxides in the rocks (mean ± std. dev. for PAR10 and PAR22).

Arsenic association with calcite can be significant (0.52–2.97 mg/kg, 3.4–7.8%), and the calcimetry measurements showed a total content of CaCO<sup>3</sup> between 5.5% and 12.0%. As stated by Alexandratos et al. [86], the mechanisms of arsenate sorption on the calcite surface and of its incorporation in the calcite lattice are very similar to the point where surface sorption can be considered to be a precursor of incorporation in the crystal lattice; therefore, it may be difficult to distinguish between adsorbed and incorporated As(V) complexes at the calcite surface. As a result, at least part of the specifically adsorbed As extracted in step 2 is likely derived from As incorporated in the lattice at the calcite surface.

In this step, the acetic buffer is used to selectively dissolve the carbonate, but Feoxyhydroxides may re-adsorb the As oxyanions liberated by the calcite dissolution. At pH 5 (acetate buffer), the capacity of Fe-oxyhydroxides to adsorb As-oxyanions is high, thus leading us to underestimate the amount of As bound to calcite, and conversely, to overestimate As bound to Fe-oxyhydroxides in the subsequent step. This effect cannot be quantitatively assessed, and therefore, the amount of As bound to calcite recovered by the specific step must be considered as a minimum estimation [61]. A release of As linked to calcite in groundwater of this area cannot be excluded, particularly where favorable conditions for dissolution exist (e.g., acidic conditions in the northern sector near the hydrogeological watershed).

Most of the As (11.69–29.51 mg/kg, 63.8–72.1%) that was extracted during the procedure is linked to Fe oxy-hydroxides, especially low crystalline (50.2–63.2%), which nevertheless constitutes only about 4% of the solid matrix. Commonly, amorphous phases present a larger specific surface area [87] than crystalline structures and tend to also adsorb As within their loose and hydrated structures, not only on their outer surfaces [88]. Furthermore, the conversion of amorphous ferrihydrite to crystalline Fe-oxide phases, which may gradually occur over time [89], can reduce the density of As sorption sites [90–92], leading to the desorption of adsorbed As [93]. These considerations are relevant in the study of As release for the dissolution of amorphous and crystalline phases in presence of reducing conditions of the aquifers [94]. Although oxidizing conditions are largely dominant in the study area, local conditions potentially favorable to the reductive dissolution of iron oxyhydroxides can also occur, with the consequent possible release of associated As.

Arsenic's association with sulfide fractions (3.17–4.64 mg/kg, 10.7–18.4%) confirms the important presence of sulfide minerals, such as pyrite, which was already observed during the SEM observations. Smedley and Kinniburgh [9] reported this association, emphasizing the chemical closeness of As and S and the occurrence of the highest As concentrations in sulfide minerals, among which pyrite is generally abundant in volcanic and geothermal contexts. It is, however, an As fraction hardly mobilized in the normal conditions of groundwater of the study area, which are always strongly undersaturated with respect to As sulfides.

#### *3.4. Batch Experiments*

Two batch tests were realized on one of the samples (PAR11) characterized with SSE. The first experiment, simulated rainwater-rock and groundwater-rock interaction processes to evaluate the As release from the tuff in contact with solutions of different ionic strength. The second experiment provided information about the influence that pH and redox potential can exert on the As-release dynamics and allowed us to investigate processes that may naturally occur in the aquifer.

In Figure 6a, the pH monitoring during the first batch test on the three selected tuffs is shown. The pH values are always higher (8.7–9.3) for the synthetic water than the As-free groundwater, which always shows values closer to neutrality (7.4–7.8) due to buffer capacity. As and V are strongly affected by high pH values and, to a lesser extent, by the presence of other exchanging anions, as observed for U (Figure 6b). Arsenic values are always higher for the synthetic water, reaching a concentration of 202 µg/L (against 77 µg/L for the As-free groundwater), corresponding to 14.9% of As released. The greater

As and V release at high pH, compared to near-neutral conditions, is promoted by the alkaline desorption processes.

**Figure 6.** Batch test results. (**a**)—pH monitoring (3 rock samples, mean of two replicates); (**b**)—As, V and U released during the test (sample PAR-11, mean of two replicates).

A batch desorption study by Kim et al. [18] reported that hydrolysis of Fe (hydr-) oxides released As and F under reducing conditions; however, the same study also found that under oxidizing conditions, an increase in pH was the main mode of As and F mobilization. At pH above 8, a significant displacement of arsenate is induced by OH- ions through an anion exchange mechanism.

The second test was divided into different steps, simulating the water-rock interaction processes in oxygenated conditions (STEP 1) under anaerobic conditions and weakly positive redox potential (STEP 2a) and in anoxic and strongly reducing conditions (STEP 2b), with the latter being locally observed in the study area.

Figure 7a shows the trend of pH and Eh values during the second batch test. DO showed stable values around 7.8 mg/L during the aerobic phase, while it suddenly reduces during the second step (complete anaerobic conditions are established 15 min after fluxing with N2).

**Figure 7.** Behavior of different parameters during the second batch test. (**a**)—Eh and pH values; (**b**)—As, V, Fe, U and F concentrations.

Eh showed a slight and progressive increase in STEP 1, followed by a decrease to slightly positive values in STEP 2a (about +70 mV), while a sharp decrease due to the addition of a reducing species (S2−) can be observed in STEP 2b before it stabilized at slightly negative values (about −60 mV). The pH after the water-rock interaction always remains at rather high values (8.4–9.6) and tends to progressively decrease during the aerobic phase and increase in the anaerobic ones, due to N<sup>2</sup> blowing.

Arsenic and vanadium (Figure 7b) follow a similar trend during the test. The two elements show an increase in the first 24 h during the aerobic phase and then a slight decrease towards the end of this step, probably due to the re-sorption of the oxyanions. The desorption/resorption processes from zeolites cannot be excluded. In the anaerobic step, the concentration of As and V gradually increased and may also have been affected by the dissolution of the iron oxides in an anoxic environment, as shown by the increase in Fe content, which was up to 150 ug/L.

The uranium trend appears to be strongly influenced by pH variations, with an increase in the aerobic phase and then a decrease to 1.1 µg/L at the end of the anaerobic step, suggesting a probable re-adsorption, given the affinity of uranyl for Fe oxide phases [95]. In contrast, F shows a gradual increase of the concentration both in the aerobic and anaerobic steps (only 3 samples in STEP1), thus appearing to be linked to dissolution processes and not influenced by the variable pH-Eh conditions affecting the test.

Regarding STEP2b (reducing conditions), As and V again show a very similar trend, with an increase of concentrations for a large part of the step and a final slight decrease, probably due to the partial precipitation of sulfides. Here, Fe, undetectable for a large part of the test, shows appreciable concentrations and apparently similar behavior. Uranium still appears constrained by the pH increase and tends to the gradual reduction of the concentration towards the end of the test. Fluoride, on the other hand, shows a distinct trend almost opposite to that of As and V, with a slight concentration decrease in the first 24 h and then a marked increase in the following 48. Therefore, the element is strongly mobilized under alkaline and reducing conditions.

#### *3.5. Arsenic and Other PTEs: Linking Solid Matrix Analysis with Groundwater Geochemistry*

As observed in Table 2, arsenic in groundwater in the northern sector (more representative of the volcanic domain) is well correlated with other PTEs, suggesting common processes. The link between these elements and the release from the volcanic tuff under different experimental conditions was therefore analyzed, investigating possible connections with groundwater data.

Uranium in the two batch tests shows a different behavior in regard to As, appearing strongly linked to pH variations in the frankly alkaline environment. The lack of correspondence between field and laboratory data could therefore be attributed to the different existing conditions, with lower pH values in the study area (especially in the northern sector) than those measured during the tests. However, it should be emphasized that the correlation with As is observable at the low concentrations (median = 0.8 µg/L; max = 13.9 µg/L) found in the northern sector. In the southern area, U, although showing more important concentrations (median = 6.6 µg/L; max = 34.6 µg/L), poorly correlates with As.

The strong As-F correlation is constant in groundwater in both sectors. A correlation is also partially observable in the second batch test, but only in the aerobic/anaerobic steps (1 and 2a). Beyond the common origin of the two elements due to water-rock interaction processes, in the study area, As and F remain strongly linked in aqueous solution, and processes capable of breaking this common mobility do not intervene. On a regional scale, Parrone et al. [96] attributed this stable co-presence to the widespread geochemical background of cold groundwater circulating volcanic formations, also highlighting areas where the good correlation is lost due to localized peculiarities (e.g., in the proximity of fractures/faults systems or mineral deposits).

In contrast, during the batch test, As and F showed almost opposite behaviors in the reducing step (2b). The F concentration shows a noticeable increase, probably due to an exchange with OH at high pH, while As is probably released by reductive processes and partially re-precipitated as sulfide. Reducing conditions are uncommon in the study area but can locally affect the As-F correlation. For example, the well N32, one of the few points having strongly reducing conditions (Eh = −323 mV), shows unusually low concentrations of As and V (0.2 and 2.2 µg/L) and high values of F (1.9 mg/L). This could be due to the mixing of groundwater with reducing deep fluids, leading to the subsequent precipitation of As and V as sulfides and the loss of the As-F correlation. Once eliminated, these local geochemical anomalies, usually characterized by physical-chemical parameters' abnormal values (negative Eh, low DO, high EC), cause the As-F correlation to further increase, up to a correlation of 0.8. Additionally, Kim et al. [18] identify a good correlation between As and F under oxidizing conditions and increasing pH, associated with the desorption from Fe (hydr)oxides, while a poorer correlation between the two elements verifies in reducing aquifers due to sulfate reduction and consequent As precipitation as sulfide, a process not affecting F concentration.

The good As-V correlation observed in groundwater of the northern sector is confirmed by the results of the two batch tests performed on the solid matrix. The two elements show completely overlapping trends in all conditions and therefore can be considered representative of the water-VTYT interaction process. The good correlation in groundwater is lost in the southern sector, where the importance of sedimentary deposits increases. The presence of clays could affect the As-V correlation through sorption/desorption processes. Zhu et al. [97] observed the consistent pH-dependent sorption of V onto kaolinite and montmorillonite, with relatively high sorption capacity occurring at the pH range of 4–10. However, the electrostatic repulsion between the V anions and the negatively charged surface determines the very low maximum adsorption capacities of the two clay minerals (<1.0 mg/g) compared to the result of V adsorption onto soil [98] and soil colloids [99]. This is due to the high surface area of soil colloids and the presence of Fe/Al (hydr)oxides, which have a high tendency for V adsorption [100]. Recently Gonzalez-Rodriguez and Fernandez-Marcos [101] observed high sorption of vanadate and arsenate onto non-crystalline iron and aluminum species. Furthermore, they found that phosphate can induce the desorption of arsenate, while it can hardly displace adsorbed vanadate. This different tendency to displace As and V may be a plausible explanation for the loss of positive correlation in the southern sector (Figure 8a), where PO<sup>4</sup> indeed shows very low concentrations and may have undergone sorption processes (Figure 8b). This could explain also why As in the southern area remains in solution and maintains its strong correlation with F.

**Figure 8.** (**a**) scatterplot As-V for the northern and southern sector; (**b**) Latitude trend of the PO<sup>4</sup> concentrations.

#### **4. Conclusions**

The analysis of the distribution of PTEs in the different matrices (groundwater, rocks) and the study of the geochemical processes responsible of their mobilization can provide a comprehensive framework on the origin and evolution of geogenic contaminants in groundwater.

The joint analysis of field data and laboratory observations suggests the existence of a diffused As geochemical background due to the water-rock interactions. In many hydrogeological settings, as alluvial aquifers, the release of As has been proved to be triggered by reductive dissolution, while in geothermal contexts, it is often explained by upraising geothermal fluids, mobilization from volcanic rocks due to acidity or high temperature, and local processes as a release from carbonate trapping. Our study pointed out that in volcanic areas, although the reductive dissolution of Fe oxyhydroxides could cause an enhanced localized release of As into groundwater, the anion exchange induced by specific exchangers (e.g., phosphates) with As adsorbed on the surface seems to be the most widely diffused process, which is often not taken into consideration in other studies.

The strong and stable As-F correlation in groundwater already observed at the regional scale has been confirmed, while the As-V correlation is lost during the transition towards the sedimentary deposits, thus representing a good geochemical marker of the volcanic hydrogeological domain.

Investigating PTEs adsorption/desorption behavior at variable pH and Eh conditions could give useful hints for the comprehension of the phenomena that release arsenic in groundwater exploited for human consumption, hence suggesting suitable technology and management options for the distribution of As-free waters in affected areas.

**Supplementary Materials:** The following supporting information can be downloaded at: https:// www.mdpi.com/article/10.3390/toxics10060288/s1, Table S1: Chemical analysis of the two solutions used for the first experiment; Table S2: Comparison of the main statistics for all the analyzed parameters, distinguished between northern and southern sectors; Figure S1: Distribution of As concentrations in groundwater of the study area.

**Author Contributions:** Conceptualization, D.P., S.G., B.C. and E.P.; methodology, D.P., B.C. and S.G.; validation, D.P., B.C. and S.G.; formal analysis, D.P.; investigation, D.P., B.C. and S.G.; resources, D.P., B.C. and S.G.; writing—original draft preparation, D.P.; writing—review and editing, D.P., S.G., B.C. and E.P.; visualization, D.P.; supervision, S.G. and E.P.; project administration, S.G. and E.P. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author.

**Acknowledgments:** We wish to thank Paola Tuccimei for the coordination and supervision of the activities carried out at the Roma Tre University, Fabio Bellatreccia and Sergio Lo Mastro of the Roma Tre University for the mineralogical observations at the optical microscope and SEM analysis, Gianluca Iezzi of the Chieti University for the XRPD analysis, Jasper Griffioen of the Utrecht University for his suggestions on the laboratory tests, David Rossi of the CNR-IRSA for his support in the groundwater sampling activities and Domenico Mastroianni of the CNR-IRSA for the chemical analysis of water samples.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Ilaria Guagliardi 1,\*, Aleksander Maria Astel <sup>2</sup> and Domenico Cicchella <sup>3</sup>**


**Abstract:** The geochemical composition of bedrock is the key feature determining elemental concentrations in soil, followed by anthropogenic factors that have less impact. Concerning the latter, harmful effects on the trophic chain are increasingly affecting people living in and around urban areas. In the study area of the present survey, the municipalities of Cosenza and Rende (Calabria, southern Italy), topsoil were collected and analysed for 25 elements by inductively coupled plasma mass spectrometry (ICP-MS) in order to discriminate the different possible sources of elemental concentrations and define soil quality status. Statistical and geostatistical methods were applied to monitoring the concentrations of major oxides and minor elements, while the Self-Organizing Maps (SOM) algorithm was used for unsupervised grouping. Results show that seven clusters were identified—(I) Cr, Co, Fe, V, Ti, Al; (II) Ni, Na; (III) Y, Zr, Rb; (IV) Si, Mg, Ba; (V) Nb, Ce, La; (VI) Sr, P, Ca; (VII) As, Zn, Pb—according to soil elemental associations, which are controlled by chemical and mineralogical factors of the study area parent material and by soil-forming processes, but with some exceptions linked to anthropogenic input.

**Keywords:** soil; potentially harmful elements; contamination; multidimensional spatial analysis; Calabria

#### **1. Introduction**

Soil is a dynamic natural resource which, being the basic constituent of the trophic system, has a variety of vital functions for human and environmental life [1–3]. These functions are the result of the soil's ability to control and maintain the materials and energy cycles between the atmosphere, groundwater and plant cover.

Many factors are responsible for the content, distribution and the behaviour of the chemical elements in soil, the first of which is the mineralogical and geochemical composition of the bedrock [4–6], followed by weathering [7,8] and soil formation processes (physical, chemical and biological), In addition, soils can be affected by the influence of phenomena such as the anthropogenic pollution [9–14] and the ratio and chemical composition of atmospheric depositions [15,16]. These latter sources of pollutants are widely distributed in urban soil, which is a repository of rainfall and wastewater discharge as well as atmospheric pollutants accumulated via deposition, Hence, the soil is an indicator of environmental contamination [17,18].

Differing to natural soils, which have a profile consisting of degrading vertical horizons, urban soils do not have a profile, and present great variability, both vertical and horizontal, because during their formation there are no pedogenetic processes, but instead the layering of debris, landfill, construction, and the remains of excavations of foundations [19,20]. Therefore, soils in the urban environment are the result of anthropogenic activities. Rapid industrialization and urbanization have occurred in most parts of the world during the last decades, and have stressed the soil with a growing pool of pollutants

**Citation:** Guagliardi, I.; Astel, A.M.; Cicchella, D. Exploring Soil Pollution Patterns Using Self-Organizing Maps. *Toxics* **2022**, *10*, 416. https://doi.org/ 10.3390/toxics10080416

Academic Editor: Fayuan Wang

Received: 6 July 2022 Accepted: 22 July 2022 Published: 25 July 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

from different sources, posing a significant risk to humans and ecosystem [21,22]. The difference between soil pollution and air and water pollution lies in the fact that, in the first case, the pollutants remain for a long time in direct contact with the soil. Thus, the soil is continuously subject to pollution by toxic materials and dangerous micro-organisms which enter the air, water and the food chain [23,24]. Contact with contaminated soil may be also direct (inadvertent hand to mouth administration by children from using soil of parks and schools) or indirect (by inhaling soil contaminants which have vaporized) [25–28]. Longer contact with pollutants causes their accumulation in bones and organs. During this exposure, organ activities are disturbed, the nervous system is affected, and tumour diseases mature [29–32]. There are different types of environmental pollutants, and their potentially harmful elements (PHEs) are those that are particularly dangerous due to their ubiquity, toxicity, and persistence [33,34].

Therefore, evaluating soil pollution is of great concern. Due to urban soil spatial heterogeneity, a valuable approach to assess its quality is the application of multivariate statistics, since the environment is considered multivariate. According to many available recommendations [35], among the most effective data mining tools, those which enable unsupervised grouping based on mutual relationships between features of the analysed matrix, both of linear and nonlinear nature, are the most desired. One of the most powerful techniques for this purpose is use of the Self-Organizing Maps of Kohonen [36] because they enable display of the pattern present in multidimensional data sets on two-dimensional surface plots, are resistant against missing data and outliers, and their results are easily interpretable by decision-makers. In light of these issues, the aims of this study are (i) to individuate different types of pollution fonts controlling for a structure of monitoring data sets in a southern Italy area; (ii) to visualize geographical distribution of potentially harmful elements, and (iii) to identify high-risk areas that can be targeted for environmental risks and public health. These outcomes could be used by decision-makers working in the field of sustainable development implementation.

#### **2. Materials and Methods**

#### *2.1. Study Area*

The study area is located in the NW sector of the Calabria region (southern Italy) inside the Crati graben and covers the Cosenza and Rende municipalities territory (Figure 1). Geologically, the study area represents a tectonic depression extending over 92 km<sup>2</sup> bordered by NS, SW-NE and NW-SE-trending faults [37–39] associated with the horst-graben system of the Sila-Coastal Chain [40,41]. A tick succession of Pliocenic sediments made up of light brown and red sands and gravels, blue grey silty clays and silt interlayers, Pleistocene to Holocene alluvial sands and gravels and very small outcrops of Miocene carbonate rocks characterize the study area [42]. Sediments overlap a Palaeozoic intrusive-metamorphic complex formed by paragneiss, biotite schists, grey-phyllitic schists with quartz, chlorite and muscovite which, in some cases, are in a weathering process [43].

The soil map of the Calabria region at 1:250,000 scale [44], for the study area reports the presence of Fluvisols, Luvisols, Cambisols, Vertisols, Calcisols, Arenosols, Leptosols, Umbrisolsand Phaeozems. Properties, dynamics and functions of the studied soils are highly variable. For these, the average values are 17.59% for clay content, 56.50% for sand content, 6.84 for pH, 2.86% for organic matter, 0.25 µScm−<sup>1</sup> for electrical conductivity, 16.14 meq 100 g−<sup>1</sup> for CEC and 1.24 gcm−<sup>3</sup> for bulk density.

Geomorphologically, a flat part including the urban area surrounded by hills, characterizes the study area. Falling inside the Mediterranean Sea, the Calabrian climate is typically Mediterranean, but the orography of the region affects it [45] with African warm air currents from its Ionian side and a western humid air current from the Tyrrhenian side.

**Figure 1.** Study area with indication of sampling points and urban areas**. Figure 1.** Study area with indication of sampling points and urban areas.

The soil map of the Calabria region at 1:250,000 scale [44], for the study area reports the presence of Fluvisols, Luvisols, Cambisols, Vertisols, Calcisols, Arenosols, Leptosols, Umbrisolsand Phaeozems. Properties, dynamics and functions of the studied soils are highly variable. For these, the average values are 17.59% for clay content, 56.50% for sand The Cosenza-Rende area has a population of approximately 100,000 inhabitants and typical urban land use, such as housing and intense automobile traffic, with limited presence of industries, commercial activities, parks and gardens. For these characteristics, different potential sources of pollution can be recognized.

#### content, 6.84 for pH, 2.86% for organic matter, 0.25 µScm−1 for electrical conductivity, 16.14 meq 100 g−1 for CEC and 1.24 gcm−3 for bulk density. *2.2. Soil Sampling and Analytical Methods*

Geomorphologically, a flat part including the urban area surrounded by hills, characterizes the study area. Falling inside the Mediterranean Sea, the Calabrian climate is typically Mediterranean, but the orography of the region affects it [45] with African warm air currents from its Ionian side and a western humid air current from the Tyrrhenian side. The Cosenza-Rende area has a population of approximately 100,000 inhabitants and typical urban land use, such as housing and intense automobile traffic, with limited presence of industries, commercial activities, parks and gardens. For these characteristics, different potential sources of pollution can be recognized. *2.2. Soil Sampling and Analytical Methods*  In this study, 149 soil samples were collected from residual and non-residual topsoil in gardens, parks, flowerbeds and agricultural fields (Figure 1) in the study area. In addition, two duplicate pairs were collected from every 10 sites and split in the laboratory to In this study, 149 soil samples were collected from residual and non-residual topsoil in gardens, parks, flowerbeds and agricultural fields (Figure 1) in the study area. In addition, two duplicate pairs were collected from every 10 sites and split in the laboratory to produce replicates. Before collecting samples, removal of the surface litter at the sampling spot was carried out. At each site, topsoil samples (0–10 cm depth from the surface) were collected from five locations at the corners and at the centre of a 20 × 20 m square with a hand auger and combined to form a bulked sample. Mixing of the samples thoroughly, and removal of foreign materials such as roots, stones, pebbles and gravel, were carried out. The final sample volume was 1–1.5 kg of material, reduced to about half by the following step of quartering. Sample preparation was started in laboratory by drying soil at 40 ◦C prior to analysis in order to obtain a water-free reference for elemental contents. Prior to further sample processing, the soil was adequately homogenized and then sieved to fine soil of ≤2 mm. Successive soil analyses were performed on fine soil, and analyte contents were based on fine soil as common reference for interstudy comparisons.

produce replicates. Before collecting samples, removal of the surface litter at the sampling spot was carried out. At each site, topsoil samples (0–10 cm depth from the surface) were collected from five locations at the corners and at the centre of a 20 × 20 m square with a hand auger and combined to form a bulked sample. Mixing of the samples thoroughly, and removal of foreign materials such as roots, stones, pebbles and gravel, were carried After appropriate preparation procedures, each soil sample was analysed by X-ray fluorescence spectrometry (XRF) for aluminium (Al), calcium (Ca), iron (Fe), potassium (K), magnesium (Mg), manganese (Mn), sodium (Na), phosphorous (P), silica (Si) and titanium (Ti), and by inductively coupled plasma mass spectrometry (ICP-MS) for arsenic (As), barium (Ba), cerium (Ce), cobalt (Co), chromium (Cr), lanthanum (La), niobium (Nb),

out. The final sample volume was 1-1.5 kg of material, reduced to about half by the

nickel (Ni), lead (Pb), rubidium (Rb), strontium (Sr), vanadium (V), yttrium (Y), zinc (Zn) and zirconium (Zr).

Quality of the analysis was monitored by the simultaneous analysis of certified international reference materials AGV-1, BCR-1, BR, DR-N, GA, GSP-1, NIM-G, and analysis duplicates included in analytical procedure in the range of one in twenty in each batch. Errors of the estimate for the measured elements were determined by relative standard deviation (<5%) based on three replicates of one sample randomly chosen.

#### *2.3. Data Processing Methods*

Evaluation of the spatial distribution of pollutants is important to assess the anthropogenic burden on the environment. Numerous different chemometric approaches are available for multidimensional data mining; however, methods which can be used for unsupervised exploratory analysis and pattern recognition, as well as able to handle non-linear problems, are the most desired.

Among the different statistical tools applied, an increasing number of studies have used artificial neural networks to probe complex data sets, since the visual output of the SOM analysis provides a rapid and intuitive means to examine covariance between explanatory variables, especially when the relationships among them and phenomena under analysis are unknown, and possibly nonlinear. SOMs, while extensively used in many areas, have only recently been used in ecological applications [46]. Applications can be found in ecological community ordination and gradient analysis [47], and in characterization and prediction of water quality in rivers [48] and coastal areas [49]. Applications of SOMs in oceanography are quite recent, too, and consider mostly feature extractions from univariate data sets [50].

Self-organizing maps (SOMs), in particular, are a kind of unsupervised Artificial Neural Network (ANN) that have been becoming increasingly popular for the analysis of large multivariate data sets, since they provide a topology preserving nonlinear projection of the data set in a regular two-dimensional space, and therefore constitute a methodology for nonlinear ordination analysis.

The SOM technique, known as self-organizing maps of Kohonen, is able to deal with big data sets with the possibility of visually exploring the outcomes of the model in versatile 2D maps in which similar samples are mapped close together on a grid [51]. SOM is often used in association with other algorithms, such as K-means, Principal Component Analysis and Hierarchical Cluster Analysis, for further elaborating its outcomes. However, the majority of those associations are mainly methodological studies aimed at comparing outputs of various data mining strategies. Since the current research is a case-study, it was decided to use only the self-organizing map (SOM) algorithm, considering it one of the most current neural network architectures for exploratory data analysis, clustering, and data visualization.

Among the different statistical tools applied, an increasing number of studies have used artificial neural networks to probe complex data sets, since the visual output of the SOM analysis provides a rapid and intuitive means to examine covariance between variables.

SOM is a kind of artificial neural network performing a non-linear projection of the original data space onto a two-dimensional space of neurons. It consists of two layers: the first represents input nodes (one per variable) connected to the samples, while the second one (an output layer) is a set of neurons organized on an array. A preliminary number of neurons can be determined according to one of the most accepted recommendations where *n* = (number of samples)ˆ(−1/2) [52], while the final map dimension ratio is usually slightly modified based on analysis of topographic and quantization errors (TE and QE, respectively). In general, a matrix of input vectors representing the variability and relationships of the experimental data is initialized by a series of parameters (i.e., shape of the map, shape of the map units, number of map neurons, map initialization matrix, distance function, neighbourhood function, number of epochs, etc.) retaining the number of variables of the experimental data. This input matrix becomes "the map", usually

represented in a two-dimensional plot where the map vectors are called prototypes (or neurons). Then, each vector of the experimental data is presented to the algorithm, and it finds the prototype most similar to the experimental vector and adjusts it together with all surrounding prototypes to be even more similar to the experimental vector. When all the experimental vectors are presented to the algorithm, a single iteration is finished; usually, several iterations are needed to convergence. In the current study, the following initializing parameters were used: rectangular shape of the map, hexagonal shape of the map unit, 66 map units, random initialization, Euclidean distance to find the best prototype and adjust the surrounding neurons, and Gaussian neighbourhood function to establish how the neurons around the best prototype are updated during the training process. Once the SOM has converged, the weight vectors of the elements are fed into a non-hierarchical K-means algorithm to extract the neurons of the best similarity. Separating by K-means requires the user to decide the final number of k clusters the algorithm is converged into. Diverse values of k (predefined number of clusters) were tested and the sum of square for each run was calculated. Lastly, the best classification with the lowest Davies-Bouldin index (D-B) was selected. D-B index is a function of the ratio of the sum of within-cluster scatter and between-cluster separation [53]. The non-parametric Kruskal-Wallis test was performed to evaluate the significance of the cluster pattern.

All calculations in this study were performed by applying Matlab 2020 (Mathworks, Inc., Natick, MA, USA) and TIBCO Statistica 13.0 (TIBCO Software, Palo Alto, CA, USA) running on a Windows 10 platform.

#### **3. Results and Discussion**

Table 1 presents the descriptive statistics for the soil data. Except for Na and Si, a positive skewness is observed for all elements (Table 1), and a kurtosis which ranges from slight (0.04) to high (46.74).


**Table 1.** Basic statistics for soil samples.

To analyse the spatial variations of elemental concentrations, the data set, consisting of analytical results from urban and peri-urban soil samples, was arranged in a two-way array of 25 variables, and the SOM algorithm was deployed. Apart from the methodological information presented in the section above, the detailed theoretical background of the SOM approach can be found elsewhere [54–57]; however, it is worth mentioning that here the SOM was successfully applied in assessment of soil pollution with PHEs [58], and heavy metals [59–63] as well as PCDD and PCDFs [64]. In Tao et al. [58] the distribution of PHEs in surface soil was examined. Yotova et al. [59] focused on toxic elements present in soil and their phytoavailability in an industrial area with copper mining factories and a smelter. In Yang et al. [60], soil samples were collected in several sites in a vast Chinese region and analysed for toxic elements presence. Kosiba et al. [61] compared the use of SOM with three other statistical techniques for assessing soil quality in a Polish area and its impact on the diffusion of a pathogen on a specific plant species. Dai et al. [64] evaluated the dioxin content in soil at different depths and in different years in a river floodplain, while in Nadal et al. [62] the use of the SOM allowed identification sites differently impacted by heavy metal pollutants in a petrochemical industrial area. Cheng et al. [63] proposed a SOM model built from a dataset composed of toxic metal content of soil and sediment samples collected at different depths from cascading reservoir catchments of a Chinese river. Having in mind the facts mentioned above, in the present study, exploratory data analysis, clustering and data imagining were approached by the self-organizing map (SOM) algorithm, which represents a powerful neural network architecture for these topics.

According to one of the most accepted recommendations [52], the total number of Kohonen's map neurons was estimated as *n* = 5 \* (149)ˆ(−1/2) ≈ 61. Since there was more than one possible combination of the final dimension which was close to the dimension obtained by Vesanto's formula (i.e., 10 × 6, 8 × 7, 9 × 7, 11 × 6), quantization (QE) and topographic errors (TE) were calculated in all cases. Finally, the chosen dimensionality of the 11 × 6 had the lowest values of errors (QE = 0.231, TE = 0.011). Once the SOM's grid has been optimized, the U-matrix and the individual variable planes based on hexagonal lattice were visualized (Figure 2).

A component plane, scaled to represent the range of changeability of a parameter, is associated to each variable while the corresponding hexagon (i.e., top-left one of coordinates row × column = 1 × 1) of the consecutive plane represents the changeability of the given parameters for the same set of samples. Based on this, the component planes can be used to visualize possible correlation among the variables, while the U-matrix can be used to identify the possible presence of different clusters of data. By the analysis of planes, high concentration values of Al, Ti, Fe, Y, Rb, Cr, V, La and Ce, which are generally located in the top of the planes, and the highest concentration values of Pb and Zn in the bottom-left part of the planes, were observed.

Since PHE concentration in soils depends both on the nature of bedrock, on abiotic and biotic factors, and human activities, accurately extracting key features and characteristic patterns of variability from an elemental large data set is essential to correctly determining the sources. For this, the relationship between elements in the soil matrix gives information on PHE sources and pathways in the geo-environment. In fact, positive correlations between elements, inspected by comparing component planes, suggest that pairs in the soil samples are from the same source. Conversely, negative correlations suggest different origins between the element's pairs which, therefore, can be considered unrelated to their geochemical dynamics.

Scaling the weights vectors of each plane in the range between 0 (the least positive) and 1 (the most positive), the set of variables could be separated into several groups of similarity representing their mutual directly or inversely proportional correlations.

I. Cr, Co, Fe, V, Ti, and Al with clear consistent patterns of the highest weights in the top-left part of the planes and the lowest weights in the middle-bottom section of the planes. These variables are all positively correlated and probably not associated with anthropogenic sources, but supposedly related to the predominant rock-forming elements constituting the soil parental materials. Indeed, the higher values of this group's element concentrations were located mostly in the NW and SE sectors of the study area where a very low road network density occurs and where there is the occurrence of ultrabasic rocks, found below the Pliocene deposits, in which these elements are predominant. The igneous-metamorphic complex can be ascribed to the pile of tectonic nappes forming the mountain chain of the northern Calabrian Arc, described in [65], which includes an intermediate structural element made up of ophiolite-bearing units [66] that mostly extend along the Tyrrhenian side of the arc to form a westward convex arc-shaped belt separated from the southern Apennines by the roughly E-W trending left-lateral strike-slip fault zone. This unit is represented by a tectonic mélange constituted by a monotonous sequence of phyllites, quartzites, and calcschists, including metric to kilometric lens-shaped blocks of ophiolitic rocks. These rocks are mainly constituted by serpentinized ultramafics, and by glaucophane-bearing meta-basites, with remnants of their sedimentary cover and rare meta-gabbros [65]. In particular, the geochemical behaviour of V resembles that of Fe which can substitute in Fe-Mg silicates (amphiboles, pyroxenes, micas). This elemental association confirms that the soils are controlled by the same typically lithogenic elements associated with silicate minerals.


common source of elemental association. This may be due to Sr geochemical affinity with Ca [68]. Sr is a relatively common element that substitutes for Ca in crystal lattices of rock-forming minerals, including feldspars and plagioclase, as in the study area. *Toxics* **2022**, *10*, x FOR PEER REVIEW 7 of 17

**Figure 2.** Component planes for all sampling sites and parameters. U-matrix visualizes distances between neighbouring map units and helps to identify the cluster structure of the map. High values of the U-matrix indicate a cluster border, uniform areas of low values indicate clusters themselves; each component plane shows the values of one variable in each map unit. Both grey-tone pattern and grey-tone bar labelled as "d" deliver information regarding compounds/element abundance calculated through the SOM learning process. **Figure 2.** Component planes for all sampling sites and parameters. U-matrix visualizes distances between neighbouring map units and helps to identify the cluster structure of the map. High values of the U-matrix indicate a cluster border, uniform areas of low values indicate clusters themselves; each component plane shows the values of one variable in each map unit. Both grey-tone pattern and grey-tone bar labelled as "d" deliver information regarding compounds/element abundance calculated through the SOM learning process.

A component plane, scaled to represent the range of changeability of a parameter, is associated to each variable while the corresponding hexagon (i.e., top-left one of coordinates row × column = 1 × 1) of the consecutive plane represents the changeability of the given parameters for the same set of samples. Based on this, the component planes can be used to visualize possible correlation among the variables, while the U-matrix can be used to identify the possible presence of different clusters of data. By the analysis of planes, VII. As, Zn, Pb, with compatibly the highest weights. occur in only a few hexagons located in the bottom-left triangle of the planes. The consistent colour indicates that As, Zn and Pb have a strong positive correlation, and their concentrations are higher in soil next to roads than in the soils away from them. This indicates that larger concentrations of these elements are related to road traffic. Consequently, their positive correlation allows us to draw conclusions about their common source

high concentration values of Al, Ti, Fe, Y, Rb, Cr, V, La and Ce, which are generally located

linked to anthropogenic activities conducted in urban environments. These elements are, indeed, present in vehicle fuel, being used for increasing gasoline antiknock.

The set of component planes, with weights scaled in the range 0–1 grouped according to their correlations, is presented in Figure 3. *Toxics* **2022**, *10*, x FOR PEER REVIEW 10 of 17

**Figure 3.** Soil quality parameter similarity pattern obtained by self-organizing mapping. An analysis of the distance between variables on the map connected with an assessment of the colour-tone patterns provides semi-quantitative information about the nature of correlations between them. **Figure 3.** Soil quality parameter similarity pattern obtained by self-organizing mapping. An analysis of the distance between variables on the map connected with an assessment of the colour-tone patterns provides semi-quantitative information about the nature of correlations between them.

Among the seven clusters, C\_I includes 20 samples (13.4%) with the highest concentration of Mg, Al, Ti, Ni, Fe, Y, Cr, V, Co and Ba. Most of the samples included in C\_I was collected in peri-urban soils and in areas in which Paleozoic paragneiss and biotite schists occur. More precisely, the observed association clustered in C\_I can be clarified considering the presence in the study area of ultrabasic rocks in which these elements are principal. Highest baseline concentrations of these elements seem to be highly associated with the igneous-metamorphic complex found below the Pliocene deposits that outcrop mostly in The significant information deriving from the SOM theory, that each node of the SOM map could be consecutively referred to one or more samples, leads to the conclusion that the differentiated structure of PHEs abundance (reflected in different colour scales in the planes) revealed the presence of numerous similarity clusters in the set of samples. Consequently, weight vectors of the converged map were clustered based on a K-means clustering mode. Some predefined numbers of clusters were tested, and the sum of squares for each run was calculated. The best partition was gained for a seven-cluster configuration having the lowest Davies-Bouldin index value (Figure 4).

the NW and SE sectors of the territory. This structure represents the pile of tectonic nappes forming the mountain chain of the northern Calabrian Arc and contains an intermediate structural element made up of ophiolite-bearing units. C\_II consists of only 11 (7.4%) samples collected in peri-urban soils. These samples were characterized by the lowest concentration of Mg and Ca, with the highest abundance of REEs such as La, Ce, Nb and Zr. Such a phenomenon indicates that REE content is associated with alkaline igneous rocks and carbonatites, which are igneous rocks derived from carbonate-rich magma rather than silica-rich magma [69]. According to SOM theory, the node (map neuron) with a weight vector closest to the input sample vector is identified as the best matching unit, and the number of tagging is summarized. Lastly, the distribution of the sample vectors along a Kohonen map can be analysed by decoding the best matching unit selection events. Clusters I-VII (consecutively named as C\_I-C\_VII) include numerous numbers of 149 soil samples (C\_I-20, C\_II-11, C\_III-22, C\_IV-26, C\_V-28, C\_VI-19, C\_VII-23). Cluster distribution of investigated soil samples in the study area according to the local geological setting is presented in Figure 5. Comparison of initially determined PHEs concentrations in soil samples with the clustering results allowed for the assignment of clustering patterns to factors impacting soil quality. Comparison of analyte concentration values according to clustering pattern is presented in

Figure 6 (concentration at % level) and Figure 7 (concentration at mg kg−<sup>1</sup> level) together with a statistical assessment from the non-parametric Kruskal-Wallis test.

*Toxics* **2022**, *10*, x FOR PEER REVIEW 11 of 17

**Figure 4.** Clustering patterns according to the Davies-Bouldin index minimum value. **Figure 4.** Clustering patterns according to the Davies-Bouldin index minimum value.

Among the seven clusters, C\_I includes 20 samples (13.4%) with the highest concentration of Mg, Al, Ti, Ni, Fe, Y, Cr, V, Co and Ba. Most of the samples included in C\_I was collected in peri-urban soils and in areas in which Paleozoic paragneiss and biotite schists occur. More precisely, the observed association clustered in C\_I can be clarified considering the presence in the study area of ultrabasic rocks in which these elements are principal. Highest baseline concentrations of these elements seem to be highly associated with the igneous-metamorphic complex found below the Pliocene deposits that outcrop mostly in the NW and SE sectors of the territory. This structure represents the pile of tectonic nappes forming the mountain chain of the northern Calabrian Arc and contains an intermediate structural element made up of ophiolite-bearing units.

C\_II consists of only 11 (7.4%) samples collected in peri-urban soils. These samples were characterized by the lowest concentration of Mg and Ca, with the highest abundance of REEs such as La, Ce, Nb and Zr. Such a phenomenon indicates that REE content is associated with alkaline igneous rocks and carbonatites, which are igneous rocks derived from carbonate-rich magma rather than silica-rich magma [69].

C\_III includes 22 samples (14.8%) with the highest concentration of K and relatively low abundance of Zn, Mn, Ni, Ba, Pb, As. The majority of these samples were collected in soils along the part of the Crati river falling in the study area, and their composition indicates the presence of organic matter, suggesting that this might play a role in increasing K adsorption rate. As can be seen, samples clustered in C\_I-C-III as a set, in comparison to the rest of clusters, were characterized by higher concentrations of Al, K, Ti, and Fe, with lower concentration of P and Sr. Moreover, samples from C\_I and C\_II were characterized by the highest concentration range for REEs. The content of REE in soil, without other inputs, is influenced by the parent material and on geochemical processes such as mineral weathering, which is an important input of elements into the soils [70]. Twenty-six samples clustered in C\_IV (17.4%) were, in general, grouped together based on the lowest content of

Al and Si, relative to minimal concentrations among of the other samples, and the highest concentration of P, Ca, Sr. According to their location, this suggests that the underlying rocks are the major source of P. C\_V, consisting of 28 (18.8%) samples, represents soils with moderate concentrations of the majority of investigated elements. It seems they are clustered separately due to a relatively large range of determined concentrations for Mg and Mn. C\_VI includes 19 (12.7%) soil samples in which their chemical composition is dominated by relatively high concentration of P, Zn, and Sr. These samples were additionally characterized by the highest concentration and range of values for Pb and As, and lowest the abundance of Y. The soil samples characterized by these elemental contents are distributed in the urban area, where road networks and vehicular traffic are intense, and, consequently, higher Pb contents occur. Particularly, soils close to high traffic roads of the study area showed the highest Pb and Zn baseline values. These elements are included in vehicle fuel for increasing gasoline antiknock. The last C\_VII includes 23 samples characterized by the lowest concentration of the majority of elements, such as Ti, Mn, Rb, Ni, Fe, Zr, Y, Cr, V, La, Ce and Co. In general samples clustered in C\_V-C\_VII show consistent chemical composition with the exception of some elements, determining their separation in a single cluster. As can be seen in Figure 6, a monotonic increasing trend of determined concentration values from C\_I to C\_VII is observed for Si, Na, and Sr, while much more frequently observed was a decreasing trend for Al, Ti, Rb, Ni, Fe, Y, Cr, V and Co. **Figure 4.** Clustering patterns according to the Davies-Bouldin index minimum value.

*Toxics* **2022**, *10*, x FOR PEER REVIEW 11 of 17

**Figure 5.** Geological setting of the study area and localization of sampling according to cluster classification. The seven identified clusters—(I) Cr, Co, Fe, V, Ti, Al; (II) Ni, Na; (III) Y, Zr, Rb; (IV) Si, Mg, Ba; (V) Nb, Ce, La; (VI) Sr, P, Ca; (VII) As, Zn, Pb—representing the seven groups in which the elements are associated according to local lithologies, are indicated.

elements are associated according to local lithologies, are indicated.

**Figure 6.** Compound concentration values according to clustering patterns (central line: median, box: 25–75% percentile, whiskers: minimum-maximum) with statistical assessment of differences between clusters based on the Kruskall-Wallis non-parametric test (K-W). **Figure 6.** Compound concentration values according to clustering patterns (central line: median, box: 25–75% percentile, whiskers: minimum-maximum) with statistical assessment of differences between clusters based on the Kruskall-Wallis non-parametric test (K-W).

**Figure 5.** Geological setting of the study area and localization of sampling according to cluster classification. The seven identified clusters—(I) Cr, Co, Fe, V, Ti, Al; (II) Ni, Na; (III) Y, Zr, Rb; (IV) Si, Mg, Ba; (V) Nb, Ce, La; (VI) Sr, P, Ca; (VII) As, Zn, Pb—representing the seven groups in which the

**Figure 7.** PHE concentration values according to clustering patterns (central line: median, box: 25– 75% percentile, whiskers: minimum-maximum) with statistical assessment of differences between clusters based on the Kruskall-Wallis non-parametric test (K-W). **Figure 7.** PHE concentration values according to clustering patterns (central line: median, box: 25–75% percentile, whiskers: minimum-maximum) with statistical assessment of differences between clusters based on the Kruskall-Wallis non-parametric test (K-W).

#### **4. Conclusions**

Correct monitoring and management of potentially harmful elements are key issues for urban and peri-urban soil knowledge, linking PHE concentrations at sites in which geogenic or anthropogenic input occur. In this study, evaluation of the usefulness of a powerful approach, such the SOM algorithm, for multidimensional geochemical data analysis and modelling problems of environmental pollution, was performed using data sets obtained by comprehensive monitoring of PHE content in the municipalities of the Cosenza-Rende area (Calabria, southern Italy). In the study area, a total of 149 soil samples, collected in residual and non-residual areas, parks, flowerbeds and agricultural fields, were investigated for 25 elements in order to better understand influences on soil geochemistry.

A self-organizing map (SOM) was selected as a powerful approach in soil science application for spatial distribution and geochemical mapping. A combination of the analysis of major metals, minor metals and PHEs, with the statistical treatment of SOMs, showed the geolithological formations and anthropogenic pressure on the territory. The association between the neurons and variables achieved by an unsupervised procedure performed by the SOM technique, allows recognition of high-risk areas which can represent environmental hazards and public health risks. By using the SOM method, the occurrence of anomalies ascribable to anthropogenic input in urban soils, referring to elements such as Pb and Zn, and of some geogenic anomalous high values of As, Cr, and V mainly identified in peri-urban areas, was recognized. The SOM was employed to cluster the data, and results presented a classification in seven clusters—(I) Cr, Co, Fe, V, Ti, Al; (II) Ni, Na; (III) Y, Zr, Rb; (IV) Si, Mg, Ba; (V) Nb, Ce, La; (VI) Sr, P, Ca; (VII) As, Zn, Pb—mainly determined by the chemical and mineralogical factors typical of the geological setting of the study area, and by soil forming and weathering processes. Among them, C\_II and C\_VII can be linked to anthropogenic input. However, in general, more contamination was identified in urban soils than in peri-urban ones.

In summary, the main outcomes of the study are as follows:


The paper contains an important methodological novelty. In fact, it proposes the application of an existing methodology for data analysis to a new class of problems. Its results can have a valuable role in identifying polluted areas and proposing remedial action aimed at reducing health risks to people. Further development of this tool should also help soil scientists to identify novel relationships about already studied phenomena, and act as a hypothesis generator for traditional research, as well as supplying clear and intuitive visualization of the environmental phenomena studied.

**Author Contributions:** Conceptualization, I.G., A.M.A. and D.C.; methodology, I.G. and D.C.; software, A.M.A.; validation, I.G., A.M.A. and D.C.; formal analysis, A.M.A.; investigation, I.G. A.M.A. and D.C.; resources, I.G. and D.C.; data curation, I.G.; writing—original draft preparation, I.G. and A.M.A.; writing—review and editing, I.G., A.M.A. and D.C.; visualization, I.G. and A.M.A.; supervision, I.G.; project administration, I.G. and D.C.; funding acquisition, I.G. and D.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**

