1. Introduction
Within the vast field of archaeometry, provenance studies stand out [1]. These studies involve the analysis of artifacts, raw materials, and geological sources to determine the origin and movement of ancient objects [2]. Provenance studies help unravel the mysteries of human history, trade, and cultural exchange. This type of research has been developed on a number of materials (see [2] and references therein), including pottery (and the corresponding clays) [3,4], stones [5,6,7], metals (e.g., [8,9]), mortars/plasters (e.g., [10,11]), glasses [12], glazes (e.g., [13,14]) and pigments (e.g., [15,16]). The strategy to trace the source of the corresponding raw materials is specific to every type of material. The focus can be placed on specific mineral inclusions, relevant chemical compositions, isotopic ratios, etc. The most convenient and useful characterization techniques vary depending on the material under study and the tracing strategy.
In the particular field of stone provenance studies, there is also a large diversity of approaches that simply reflects the variety of rock types [17,18,19,20,21,22]. Marbles and volcanic rocks are probably among the materials whose provenance has been most frequently investigated, for both archaeological and analytical reasons. Obsidians (volcanic glasses) aside, volcanic rocks had specific uses. They were often used as grinding tools like millstones and mortars [23,24,25] or as building materials such as vault stones [26,27]. The number of plausible volcanic sources can be delimited by using geological, geographical and historical data. In some cases, the analyzed materials bear a particular chemical and mineral content that makes it possible to trace them back to a specific origin among the possible sources [28,29,30,31]. These provenance studies are of capital importance to infer trading routes and to assess the status of a given archaeological site.
In the present paper, we petrographically and geochemically characterize volcanic stones retrieved from two Roman archaeological sites. We explore and compare conventional and new strategies to infer their corresponding volcanic sources. The novelty lies (i) in the exhaustive quantification of petrographic data as opposed to the common non- or poorly quantified petrographic descriptions and, (ii) more importantly, in the innovative use of supervised machine learning algorithms instead of the common unsupervised exploratory geochemical data analysis to tackle the provenance of the characterized materials. The studied materials were retrieved from Iulia Libica, a Roman municipium (2nd–3rd century CE) located in the Eastern Pyrenees (between Spain and France) and Sidi Zahruni, a non-excavated late Roman large pottery center (5th–7th century CE) located in northern Tunisia (Figure 1a). The two sites have been selected for their role as trading centers [32,33] and their proximity to a known geological source of volcanic stone (Olot Volcanic Field and Pantelleria island, respectively).
4. Discussion
Regardless of the properties or techniques used, the attribution of provenance to archaeological volcanic samples requires accurate quantification of a set of parameters and a dataset containing the same quantification for a representative collection of reference geological samples.
In this paper, a quantification effort has been undertaken for both petrographic and compositional properties. Some of the quantifiable petrographic properties are arguably not representative of a given provenance, e.g., the porosity or the shape of the pores can vary enormously even within the rocks of a single volcanic eruption. However, petrographic characterization has the potential to combine a set of discriminating elements, including the mineralogy of the phenocrysts, their relative ratios, size and morphology, the presence of alterations, the texture/mineralogy of the matrix, etc. The problem with petrography is that, at present, large reference datasets comprising a standard set of quantified parameters are lacking. Therefore, the petrographic approach is not yet ready to be applied as a routine technique for provenance determination, although it has the potential to be developed as such. In any case, for well-delimited cases such as a binary classification problem, the petrographic approach can be applied because producing the required reference dataset oneself is feasible. Besides this, petrographic characterization is also very helpful to determine whether the set of sampled materials from a given archaeological site is homogeneous (likely from a single provenance) or not. It is worth mentioning that a set of volcanic samples that are mineralogically and petrographically different could nevertheless exhibit chemical homogeneity.
In the present investigation, the samples from Sidi Zahruni (Z) appear petrographically homogeneous, indicating a single geological supply of volcanic materials, and the same can be said for the three millstones from Iulia Libica (L). The minerals that have been identified are those common in basalts, such as plagioclase (absent as phenocryst in L samples), clinopyroxenes and olivine. The absence of plagioclase phenocrysts (like the absence of leucite in both L and Z samples) could help to constrain the provenance. Other particular petrographic features that could be helpful are the alteration rims of olivine (L samples) or the occasional occurrence of green aegirine–augite (Z samples). Alteration rims in olivine are actually a common feature in many olivine basalts worldwide and are therefore compatible with many potential provenances, including Sardinia, Agde (to the south of Massif Central), Olot (Catalonia), Ustica (Italy) and Middle Atlas (Morocco) basalts [56]. The presence of aegirine–augite crystals has been described in volcanic rocks from Pantelleria and is typical of more felsic rocks from this island [57], with which the basalts are intimately associated [58].
The chemical approach is currently far more mature than the petrographic approach for determining the provenance of archaeological volcanic materials. The chemical composition is routinely quantified by many researchers who publish their data, and some initiatives have been undertaken to systematically collect these data to create huge geochemical reference databases like GEOROC. However, some difficulties arise from the use of different equipment and sample preparation methods by the contributors to the reference database. Part of the solution is to build ever larger datasets, which contributes to reducing random errors (improving precision), while averaging over the different systematic errors of different laboratories could also improve accuracy. An extra complication of large datasets is that the geochemical data can be expressed in different ways (as oxides or elements, in wt% or ppm, including volatiles or not, assuming a given valence state for elements like Fe or distinguishing different states, etc.). All these points could imply additional transformation steps before using the data, but they should not represent a real issue for well-managed databases.
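As an illustration of the kind of transformation step mentioned above, the following minimal sketch converts an oxide concentration in wt% into the corresponding element concentration in ppm. It is not the procedure used to build the reference databases of this study: the function name `oxide_wt_to_element_ppm` and the small oxide table are hypothetical, the atomic masses are rounded standard values, and stoichiometric oxides are assumed.

```python
# Rounded standard atomic masses in g/mol (assumed values for this sketch).
ATOMIC_MASS = {"O": 15.999, "Ti": 47.867, "P": 30.974, "K": 39.098, "Fe": 55.845}

# Hypothetical lookup: oxide formula -> (cation, cations per formula, oxygens per formula).
OXIDES = {
    "TiO2": ("Ti", 1, 2),
    "P2O5": ("P", 2, 5),
    "K2O": ("K", 2, 1),
    "FeO": ("Fe", 1, 1),
    "Fe2O3": ("Fe", 2, 3),
}


def oxide_wt_to_element_ppm(oxide, wt_percent):
    """Convert an oxide content (wt%) into the content of its cation (ppm by mass)."""
    cation, n_cation, n_oxygen = OXIDES[oxide]
    cation_mass = n_cation * ATOMIC_MASS[cation]
    oxide_mass = cation_mass + n_oxygen * ATOMIC_MASS["O"]
    # Mass fraction of the cation in the oxide, then 1 wt% = 10,000 ppm.
    return wt_percent * (cation_mass / oxide_mass) * 10_000.0


# Usage with an arbitrary illustrative value (2.0 wt% TiO2, not a measured datum).
ti_ppm = oxide_wt_to_element_ppm("TiO2", 2.0)
```

The same table-driven approach extends to any oxide, but the choice of valence state for Fe (FeO versus Fe2O3) has to be made explicitly before the conversion.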
Even after carefully selecting and standardizing the relevant sets of data, the limitations of the common geochemical classification tools have been illustrated. On the one hand, biplots representing only two chemical elements (or two elemental ratios) can be misleading because they disregard data that could be relevant. On the other hand, multivariate data analysis methods like PCA, despite considering all the features (i.e., the analyzed chemical elements), show that the more predefined classes are considered, the more likely they are to form overlapping clusters. For the studied archaeological samples retrieved from Iulia Libica (L), the provenances suggested by certain biplots are the Hyblean Plateau, Middle Atlas, Southern France and Spain. For those retrieved in Sidi Zahruni (Z), the suggested provenances are Pantelleria (recurrently), Sardinia and also the Hyblean Plateau. PCA can also be used to discard some of the multiple considered provenances. Provenances that can be discarded both for the samples from Iulia Libica and Sidi Zahruni include the basalts from the Aeolian, Aegean, Gibraltar and Tyrrhenian arcs and those from Campania, Lazio and possibly Algeria, Alps–Sardinia–Corsica and Etna. The fact that the suggested provenances correspond to those of intraplate basalts (an intraplate nature was also geochemically deduced for L and Z samples) and that the discarded reference sites are essentially those linked to volcanic arcs (i.e., linked to a subduction zone) attests to the robustness of both the GEOROC database and the PCA method. However, PCA is not really useful in deciphering which, among the suggested intraplate provenances, is the most probable.
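To make the PCA step more concrete, the sketch below standardizes a small reference table, extracts the first principal component by power iteration, and projects an unlabeled sample onto it. This is a simplified illustration and not the implementation used in this work: the function names are hypothetical, the numbers are invented toy values rather than GEOROC data, and only the first component is computed.

```python
import math


def standardize(rows):
    """Scale each column to zero mean and unit (population) standard deviation."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(p)]
    scaled = [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in rows]
    return scaled, means, stds


def first_component(scaled, iterations=200):
    """Leading eigenvector of the correlation matrix, found by power iteration."""
    n, p = len(scaled), len(scaled[0])
    corr = [[sum(r[i] * r[j] for r in scaled) / n for j in range(p)] for i in range(p)]
    vec = [float(j + 1) for j in range(p)]  # arbitrary non-uniform starting vector
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    eigenvalue = sum(vec[i] * sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p))
    # The trace of a correlation matrix equals p, so this is the explained fraction.
    return vec, eigenvalue / p


def score(sample, means, stds, vec):
    """Coordinate of one sample along the first principal component."""
    return sum((sample[j] - means[j]) / stds[j] * vec[j] for j in range(len(vec)))


# Invented toy reference table: rows are samples, columns are three chemical variables.
reference = [[1.0, 2.0, 5.0], [2.0, 4.1, 4.0], [3.0, 5.9, 6.5], [4.0, 8.2, 3.0], [5.0, 9.8, 5.5]]
scaled, means, stds = standardize(reference)
pc1, explained = first_component(scaled)
unknown_score = score([2.5, 5.0, 4.8], means, stds, pc1)  # hypothetical unlabeled sample
```

Because the projection ignores the class labels, reference classes with similar compositions end up at similar scores, which is the overlap problem described above.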
Finally, the supervised machine learning models are also multivariate methods that take into consideration a high number of compositional variables, but they have the additional advantage of being designed to discriminate the different predefined reference classes (i.e., the provenances). The capacity to discriminate between different classes can be measured using different statistical metrics computed on the test sets. Along this line, as previously stated in a methodologically similar study [43], when using supervised models it is very important to check for convergence between the results obtained with different classification models and with models trained several times on different splits. Reliable provenance determination requires significant agreement between the results of these different strategies. In the present research, besides different models and splits, three different reference databases have been used (G90, G60 and G30) with accuracies ranging from 0.55–0.80 (using G90) to 0.66–0.90 (using G30).
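The convergence check described above can be sketched as a simple vote count over all (model, split) runs for one archaeological sample. This is a hedged illustration rather than the workflow of this study: the function name `agreement`, the run labels and the predicted classes in the example are hypothetical.

```python
from collections import Counter


def agreement(predictions):
    """Return the most frequent predicted class and the fraction of runs voting for it.

    `predictions` maps a (model name, split index) pair to the predicted class label.
    """
    counts = Counter(predictions.values())
    label, votes = counts.most_common(1)[0]
    return label, votes / len(predictions)


# Hypothetical predictions for one sample: two models, three train-test splits each.
runs = {
    ("LSVM", 1): "X21", ("LSVM", 2): "X21", ("LSVM", 3): "X21",
    ("kkNN", 1): "X21", ("kkNN", 2): "X5", ("kkNN", 3): "X21",
}
best_class, fraction = agreement(runs)
```

A high fraction for a single class across models and splits supports a robust attribution, whereas votes scattered over many classes signal that the prediction should not be trusted. Any cut-off on this fraction would be an additional assumption.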
In the case of the Sidi Zahruni (Z) samples, the results point very consistently to the Sicily Channel Rift (X21 class) as the most probable provenance. This means that the Z basalts were extracted from the volcanic islands of Pantelleria or Linosa, which are made of ocean island basalts known to have been exploited in antiquity for the manufacture of lava grinding tools like millstones [25,59]. This origin (especially Pantelleria) had appeared recurrently among the provenances suggested by the common geochemical classification tools (biplots and PCA). Moreover, the petrographic properties of the Z samples also agree with those exhibited by the Pantelleria basalts, e.g., weakly porphyritic texture, mineralogy and relative abundance of the phenocrysts with pl > ol > cpx [59,60,61]. Of the two basalt types described in Pantelleria [59], those with a higher affinity with the samples retrieved from Sidi Zahruni are the younger basalts with relatively low TiO2 and P2O5 contents [60].
In contrast, for the samples from Iulia Libica, there is no agreement between the prediction models, and the predicted provenance usually appears statistically divided among several classes, each holding a fraction of the provenance probability. Up to nine different classes (i.e., provenances) are suggested by the models, and, in general, the average probability percentages combining all the predictions are very scattered (the highest value is 50.4% for class X5 (Anatolia) using database G90). According to [43], this heterogeneity indicates that the provenance attribution is not robust. Indeed, in some cases, even the results from a given classification algorithm vary significantly when using different trained models (i.e., different train–test sets). Some classes (among others, incidentally, X5, Anatolia) are overrepresented within the reference databases and may spread out, covering most of the point cloud in the corresponding PCA (e.g., yellow dots in Figure 11 and Figure 12). Therefore, it is not surprising that such classes could capture substantial fractions of the provenance probability. Using geographical proximity as a criterion, classes X17 (Massif Central) and especially X8 (Catalonia) are the nearest locations to Iulia Libica. The models assign rather low probabilities to these classes, and both classes are represented by very few data within the reference databases. Even when training is possible, a small number of data may not be statistically representative of the class. In fact, class X17 has only been considered by the models using database G90 because the number of X17-class data in the other datasets was too low to train the models to discriminate this class. The same would have occurred for class X8 if we had not supplemented the original database (produced using GEOROC) with additional data for this class. Ideally, all classes with a low number of reference samples should be supplemented. In particular, it would have been interesting to increase the reference samples for class X17, as the Massif Central (in particular Cap d’Agde) is the presumed origin for the basaltic millstones produced in Lattes [36], typologically similar to those found in Iulia Libica. Such a trading route would be consistent with the presence of pottery from the southern Gaulish area in Iulia Libica [32]. With the available data, and comparing with the results obtained for Sidi Zahruni, the conclusion is that the provenance of the Iulia Libica samples has not been successfully determined, which suggests that it corresponds to a class imperfectly represented or absent from the reference databases used.
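The averaging of provenance probabilities over all predictions can be sketched as follows. This is only an illustrative reading of that step, not the code used in this study: the function name `mean_class_probabilities` is hypothetical, and the probability values in the example are invented and do not reproduce any result reported here.

```python
def mean_class_probabilities(runs):
    """Average the predicted class probabilities over all runs.

    `runs` is a list of dicts, one per (model, split) run, mapping a class label
    to its predicted probability. A class missing from a run counts as 0.
    """
    classes = sorted({label for run in runs for label in run})
    return {c: sum(run.get(c, 0.0) for run in runs) / len(runs) for c in classes}


# Invented probabilities for one sample over three runs (illustration only).
runs = [
    {"X5": 0.6, "X8": 0.4},
    {"X5": 0.2, "X8": 0.3, "X17": 0.5},
    {"X5": 0.4, "X8": 0.2, "X17": 0.4},
]
averaged = mean_class_probabilities(runs)
top_class = max(averaged, key=averaged.get)
```

When the largest averaged probability is modest and the remainder is spread over several classes, as in this toy case, the attribution is heterogeneous in the sense discussed above.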
To summarize, supervised machine learning methods have a number of advantages over other statistical methods. Conceptually, they are designed to learn the best way to discriminate the different classes. However, they are not foolproof, and they are subject to limitations linked to incomplete reference databases (unbalanced, underrepresented or missing classes). Despite the good indicators of overall performance (see accuracies in Table 4, Table 5 and Table 6) obtained after testing the trained models, a model will obviously never be able to predict a class not described in the training database, and its performance in predicting underrepresented classes can be far from adequate. One of the statistical metrics used to check how well a model performs for a particular class is sensitivity, which is defined as the ratio between the correctly predicted samples and the total number of samples within the class. For instance, the confusion matrices obtained during the test step show that classes X8 (Catalonia) and X17 (Massif Central) exhibit low average sensitivities per model (usually below 0.5 and sometimes even 0). Specifically, for class X8, the highest average sensitivity values are 0.78 and 0.69 using the LSVM and kkNN models, respectively, and incidentally, these two models are the ones that statistically attribute higher provenance probabilities to class X8 compared to the other models (see Table 4, Table 5 and Table 6). As for class X17 (only considered in runs using database G90), the highest average sensitivity is only 0.23, obtained using the LSVM model, and this is actually the only model that assigns a certain probability to this provenance.
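The contrast between overall accuracy and per-class sensitivity can be illustrated with a small confusion matrix. The sketch below follows the definition of sensitivity given above. The function names are hypothetical and the counts are invented for illustration, so they do not correspond to the confusion matrices of this study.

```python
def sensitivity_per_class(confusion, labels):
    """Sensitivity of each class: correctly predicted samples / all samples of that class.

    confusion[i][j] is the number of test samples whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    result = {}
    for i, label in enumerate(labels):
        total = sum(confusion[i])
        result[label] = confusion[i][i] / total if total else float("nan")
    return result


def accuracy(confusion):
    """Overall fraction of correctly predicted test samples."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / sum(sum(row) for row in confusion)


# Invented counts: X8 is underrepresented (8 test samples) compared with X5 and X21.
labels = ["X5", "X8", "X21"]
confusion = [
    [40, 1, 1],   # true X5
    [6, 2, 0],    # true X8
    [1, 0, 19],   # true X21
]
overall = accuracy(confusion)
per_class = sensitivity_per_class(confusion, labels)
```

In this toy case the overall accuracy is about 0.87 while the sensitivity for the underrepresented class is only 0.25, which shows how a good global indicator can hide poor performance for a small class.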
5. Conclusions
Statistical approaches to establish the provenance of volcanic rock tools require quantification of a set of properties and a large reference database. The quantified properties could be geochemical, mineralogical, or petrographic. However, currently, only large reference geochemical datasets are available. Implementing the statistical approach to petrographic properties would first require an agreement on the parameters to be quantified and a collective effort to produce the corresponding datasets.
The limitations of the common geochemical and statistical classification tools have been illustrated. On the one hand, biplots using a few variables can be misleading because relevant data could be omitted. As a result, different biplots can suggest different provenances. On the other hand, multivariate data analysis methods, like PCA, often yield a single point cloud rather than separate clusters, which makes it difficult to assign a given provenance to the unlabeled archaeological data.
The supervised machine learning approach applied to geochemical data has proven to be a powerful tool for discerning different reference clusters, and this enables provenance prediction for unlabeled samples. Among the 18 different considered provenances, the models systematically assign all the samples from Sidi Zahruni a very high provenance probability for the class representing the basalts from the Sicily Channel Rift. The provenance of the Sidi Zahruni samples has been successfully established, not only because of the strong agreement between the different supervised models but also because of the high petrographic consistency between the studied samples and the known characteristics of the Pantelleria basalts. Additionally, this provenance was among those suggested using common geochemical classification tools, and it lies only ~115 km from the archaeological site of Sidi Zahruni.
In contrast, the models assign different provenances to the samples from Iulia Libica, and this suggests that the corresponding provenance was missing in the considered reference datasets or that it was incompletely described within them. This highlights the limitations of the supervised approach and the need for a statistical approach making use of different sister samples, different trained models and different classification algorithms to be able to identify a homogeneous and, therefore, robust provenance prediction.