*Article* **How a Paleogenomic Approach Can Provide Details on Bioarchaeological Reconstruction: A Case Study from the Globular Amphorae Culture**

**Stefania Vai 1,\*, Maria Angela Diroma <sup>1</sup> , Costanza Cannariato <sup>1</sup> , Alicja Budnik <sup>2</sup> , Martina Lari <sup>1</sup> , David Caramelli <sup>1</sup> and Elena Pilli <sup>1</sup>**


**Abstract:** Ancient human remains have the potential to explain a great deal about the prehistory of humankind. Due to recent technological and bioinformatics advances, their study, at the palaeogenomic level, can provide important information about population dynamics, culture changes, and the lifestyles of our ancestors. In this study, mitochondrial and nuclear genome data obtained from human bone remains associated with the Neolithic Globular Amphorae culture, which were recovered in the Megalithic barrow of Kierzkowo (Poland), were reanalysed to gain insight into the social organisation and use of the archaeological site and to provide information at the individual level. We were able to successfully estimate the minimum number of individuals, sex, kin relationships, and phenotypic traits of the buried individuals, despite the low level of preservation of the bone samples and the intricate taphonomic conditions. In addition, the evaluation of damage patterns allowed us to highlight the presence of "intruders"—that is, of more recent skeletal remains that did not belong to the original burial. Due to its characteristics, the study of the Kierzkowo barrow represented a challenge for the reconstruction of the biological profile of the human community who exploited it and an excellent example of the contribution that ancient genomic analysis can provide to archaeological reconstruction.

**Keywords:** ancient DNA; palaeogenomics; Neolithic; multiple burial; kinship; phenotypic traits

#### **1. Introduction**

Over the years, the study of ancient DNA study has proven valuable and essential for tracing migrations of historic and prehistoric individuals and groups. Advances in the sequencing and analysis of the genomes of both modern and ancient peoples have facilitated a number of breakthroughs in the understanding of human evolutionary history [1]. In this context, the archaeological excavations and retrieval of skeletal human remains represent a unique opportunity to study a population history and shed light on cultural changes and the lifestyles of our ancestors.

Archaeological excavations in the Znin district (northwestern Poland), led by Professor ˙ Tadeusz Wi´sla ´nski from the Institute of History and Material Culture of the Polish Academy of Sciences in Pozna ´n, uncovered a Megalithic barrow tomb associated with the Globular Amphora culture (GAC); it contained the remains of several individuals. Through the integration of archaeological, anthropological, and molecular (mtDNA and nuDNA) data, and theories about the development of the GAC, our previous results [2,3] add nuance to the model of Late Neolithic gene flow from the Pontic steppes into Central Europe, showing that the eastern affinities of the GAC in the archaeological record reflect cultural influences

**Citation:** Vai, S.; Diroma, M.A.; Cannariato, C.; Budnik, A.; Lari, M.; Caramelli, D.; Pilli, E. How a Paleogenomic Approach Can Provide Details on Bioarchaeological Reconstruction: A Case Study from the Globular Amphorae Culture. *Genes* **2021**, *12*, 910. https://doi.org/ 10.3390/genes12060910

Academic Editor: Jennifer A. Leonard

Received: 28 April 2021 Accepted: 8 June 2021 Published: 11 June 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

rather than movements of people. During the study, archaeogenomic research permitted us to obtain information on past population dynamics, whereas biological traits (such as phenotypic traits) of each individual and relationships within the group in the burial site have not been investigated due to the fact that samples with presumed kinship relations were removed from our previous studies. However, beyond population genetics purposes, as recently proposed, for instance, by [4–7], obtaining more information on the individuals (such as sex estimation and phenotype descriptions of physical appearance and functional traits), and investigating the relationships among them could be important—in general and in particular for the Megalithic barrow of Kierzkowo—for better understanding the social structure and funerary rituals of ancient cultures and preserving and enhancing a specific archaeological site. In fact, the molecular characterisation of human remains can represent an important information source to describe past populations, also for educational purposes: genomic data can be integrated in archaeological and anthropological reconstruction in order to reach a wider public in contexts devoted to science and culture communication, such as museum exhibitions, archaeological parks, and documentaries.

Methods based solely on archaeological and anthropological assessments can sometimes be limited in establishing biological kinship and providing data on biological traits, above all, in contexts such as this Megalithic barrow tomb, in which, over time, anthropic action and post-depositional natural events had a strong impact on the ease of data analysis and interpretation.

Therefore, we decided, here, to reanalyse the molecular data of GAC from our previous studies, attempting to recover information, such as sex, phenotypic, and functional traits to provide individual characterisation, concentrating our attention, in particular, on closely related individuals (previously eliminated from the analysis) and their relations, with archaeological context, to obtain more information useful to reconstruct GAC funerary rituals and social organisation. The complexity of this archaeological site, characterised by a secondary multiple burial of fragmented and mixed human remains belonging to several individuals, represented a challenge for the reconstruction of the biological profile of this community and an excellent example of the contribution that aDNA analysis can provide to archaeological reconstruction. In this context, a molecular investigation turns out to be fundamental to integrate and confirm anthropological and archaeological data/hypotheses, and suggests further insight.

#### **2. Materials and Methods**

#### *2.1. The Archaeological Site of Kierzkowo*

The archaeological site of Kierzkowo is located in the Znin district (Kuyavian-Pomeranian ˙ voivodeship, northwestern Poland, Figure 1) on top of a hill on the shore of Kierzkowski Lake. Pottery fragments belonging to characteristic decorated globular vessels [8–11], and a radiocarbon date on animal bone found in the site confirmed its attribution to the GAC period (4270 ± 40 BP calibrated to 2896 ± 19 cal BCE) [12].

The original barrow embankment was about 35 m long, 16–18 m wide, and 2 to 4 m high, overgrown by trees and bushes (Figure 1A). After removing the earth embankment, a stone structure approximately 22 m long, W–E, and 3 to 6 m wide, N–S, was revealed (Figure 1B). In the western part of this Megalithic tomb, there was a chamber about 10 m long, about 1.5 m wide, and of the same depth. The chamber was built of stone slabs with a height of about 1 m and a thickness of 0.5 to 0.8 m. Inside, a large boulder divided the chamber into two unequal parts. On the south side, a corridor lined with stones led to the chamber [11,13,14] (Figure 1C).

**Figure 1.** Geographic location and details of the site. Location of Kierzkowo site (Poland). (**A**) Megalithic barrow in Kierzkowo. (**B**) Megalithic barrow after removing the earth embankment. (**C**) Detail of the barrow: burial chamber. (**D**) Cluster of human Neolithic bones within the chamber. (**A**–**D**): photographed by Professor T. Wi´sla ´nski; (**A**,**D**): fragments of photographs, adapted with permission from book [11]; (**B**,**C**): from the Archives of the Institute of Archaeology and Ethnology of the Polish Academy of Sciences.

Within the chamber, Neolithic human bones gathered into two large clusters and a smaller one (at metres 3 and 4 and 5, Figure 1D). Some bones were also located under the large stone dividing the chamber, and much fewer bones were found outside of the chamber. In the two large bone gatherings, remains were stratified into seven layers, while in other locations, bones were in superficial layers. In many instances, human bones were mixed with animal bones with signs of dismemberment [11]. The presence of domesticated animal remains testify that animal husbandry was the dominant form of

economy. Individuals buried in Kierzkowo, indeed, being people of the GAC, belonged to the set of agriculture-based economies with a dominance of pastoralism. They were nomadic people with unstable settlement patterns. The megalithic tombs may have been important central points or boundaries of the territory used by the travelling human group [8,15–17]. Consumption of milk and animal meat is attested by the analysis of lipid residues in ceramic vessels from the barrow. Carbon (δ <sup>13</sup>C) and nitrogen (δ <sup>15</sup>N) isotopes were used to reconstruct the diet of the individuals from Kierzkowo. The δ <sup>15</sup>N values in bone collagen were slightly higher in adult men than in adult women. This could indicate a slightly better access of men to valuable animal products, but this differentiation is very small and requires further research [11].

Most of the human remains found in Kierzkowo were highly fragmented and mixed, without anatomical order, placed in the barrow as secondary interment (Figures 1D and 2). Altogether, 428 fragments of various parts of the skeleton were recovered, 94.2% of them from the burial chamber. These fragments were studied for taphonomy and morphological traits. Type of bone, skeletal area, side of the body, size of a fragment, and its robustness surface markings were taken into account. Furthermore, the degree of bone deterioration, traces of plant roots and eventual activity of invertebrates, bone colour and its alterations, types of breaks and soil in which they were preserved were also considered. Wherever possible, sex and age at the death of individuals were estimated. Preliminary descriptions and anthropological analyses were made many years ago [18]. At present, all bone fragments were inspected again and revisions of diagnoses (such as estimations of sex and age) were performed using newer diagnostic methods.

**Figure 2.** Fragments of bones of two individuals partly arranged in anatomical order. Bone fragments were tentatively attributed to individuals considering sex and age estimations, side, size, robustness, surface relief, colour, and state of preservation. Photo by Dr Andrzej Długo ´nski.

Evaluating sex and age estimations, the origin from the left or right side of the skeleton, the number of paired bones in the skeleton, size, robustness, surface relief, and also colour and state of preservation, bone fragments were assembled into individual skeletons (Figure 2). It needs to be stressed that the application of only morphological criteria does not produce complete certainty about bone allocation to the same individual. The Ockham's razor principle was used for allocating bone fragments to individuals in order to not multiply numbers of individuals beyond the necessary minimum. Finally, we estimated that in the barrow there were at least 27 persons buried [11]. Both sexes and ages ranging from newborns to elders were represented.

#### *2.2. The Analysed Samples*

Seventeen bone remains from Kierzkowo were chosen for genetic analysis as they were attributed to the same number of putative different individuals (Table 1).

The primary reports of the molecular data obtained from these samples were previously published by [2,3], where population genetics methods were applied to these and other samples to investigate population dynamics during the Neolithic period in Europe. Complete or almost complete mtDNA consensus sequences were available in GenBank with accession numbers MF114211-MF114224. Sequence alignments of nuclear DNA (nuDNA) reads in BAM (Binary Sequence Alignment/Map) format were retrieved for eight individuals from the European Nucleotide Archive (ENA) under accession numbers ERS2040835-ERS2040840, ERS2040844, and ERS2040845.

In this paper, both mtDNA and nuDNA data were object to a new kind of analysis with a specific focus on the single community buried in the Kierzkowo site.

#### *2.3. Mitochondrial Haplotype and Haplogroup Analysis*

The available mtDNA consensus sequences were reanalysed to estimate haplogroups and phylogenetic relationships between haplotypes.

Different from our previous study [3], HaploGrep 2.0 [19] was used for haplogroup assignment based on PhyloTree Build 17 [20]. Polymorphisms according to the revised Cambridge reference sequence (rCRS) were annotated and showed in a graphical phylogenetic tree obtained by HaploGrep 2.0.

Misincorporation rates at read termini and average fragment length were calculated by MapDamage2.0 [21] to describe the damage pattern of the mtDNA molecules. Modern human contamination at mitochondrial level was estimated by ContamMix [22]. Default parameters were used for these analyses.

#### *2.4. Sex Identification*

Sex identification of all samples for which nuDNA data were available was performed using the guidelines suggested by Skoglund et al. [23] (v0.4). A python script was applied to the read alignments in BAM format to infer the molecular sex by comparing the number of alignments to the Y chromosome to the total number of alignments to X- and Y-chromosomes.

#### *2.5. Y-Chromosome Haplogroup Determination*

Y-chromosome haplogroup inference was attempted for male Neolithic samples using the software Yleaf v2.2 [24], which accepted read alignments in BAM format as input files**.** The prediction, performed with a minimum number of reads for each base (-r) set to 2 and a minimum quality for each read (-q) set to 30, was based on over 41,000 markers present in the International Society of Genetic Genealogy (ISOGG) Y-DNA Haplogroup Tree defining 5353 haplogroups [24].

#### *2.6. Kinship Prediction*

We first genotyped our read alignments in BAM format, after trimming 3 bases at read ends, by using bamUtil "trimBam" function (v1.0.14) [25], and SAMtools [26] "mpileup" function (v1.7), followed by pileupCaller from sequenceTools (v1.4.0.5, https://github. com/stschiff/sequenceTools, accessed on 1 August 2020) by randomly calling one allele per position considering the human genome as pseudo-haploid genome. Base alignment quality computation was disabled (-B) and minimum base (-q) and mapping quality (-Q) were set to 30 in the mpileup process, as recommended by the authors. We used GRCh37 as reference genome assembly and called the SNPs according to the 1240K panel [27–29]. Genotypes were obtained in Eigenstrat output format, subsequently converted in plink format using EIGENSOFT convertf (v7.2.1, https://github.com/DReichLab/EIG, accessed on 1 August 2020) and plink [30] (v1.9). Finally, kinship prediction was performed by READ [31], whose approach can successfully infer up to second degree relationships even with 0.1× shotgun coverage per genome for pairs of individuals. The analysis was carried out with default parameters, using the median across all pairwise differences for normalization.

In addition to READ, we also applied NgsRelate [32] (v2) to infer relatedness using genotype likelihoods instead of called genotypes. In this fashion, we compared kinship predictions obtained by two different approaches, useful for uncertain classifications due to low coverage. ANGSD [33] (v0.933) was used to calculate genotype likelihoods while doing SNP calling from the mapped reads with the GATK model [34], after trimming 3 bases from both read ends and requiring minimum base and mapping quality scores = 30. NgsRelate was run using European population allele frequencies, previously collected from the 1000 Genomes Project, phase 3 [35]. The tool calculated 11 summary statistics, including the coefficient of kinship theta [36], the KING-robust kinship statistics by Manichaikul et al. [37], and the "R0" and "R1" ratios described in Waples et al. [38], based on a two-dimensional site-frequency spectrum (2D-SFS) approach.

#### *2.7. Analysis of Phenotypically Informative Markers*

Variant calling of 41 SNPs from the HIrisPlex-S panel [39] related to eye, hair, and skin pigmentation was performed using two different approaches, allowing to get both pseudohaploid and diploid calls on the same targeted sites. To the first aim, SAMtools mpileup and pileupCaller were used as described in the Section 2.6, while diploid calling was enabled by ATLAS [40] maximum likelihood genotype caller (task = call method = MLE) by specifying post-mortem damage and recalibration input files. These latter were previously obtained by ATLAS ("PMD" and "recal" functions, respectively) considering only bases with a minimum quality = 20 and using highly conserved positions among species defined by high GERP scores [41] as a training set to infer the recalibration parameters applied to our data. ATLAS allowed retrieving potential heterozygous variations, which could not be detected by the pseudo-haploid calling. We used the HIrisPlex-S [39,42–44] online web tool to predict eye, hair, and skin colour from DNA genotypes called by ATLAS. Moreover, 7 additional SNPs in *EDAR*, *MCM6*, *ABO*, and *RHCE* genes were also detected, being phenotypically informative markers. Annotations for each SNP identified in our samples were retrieved from SNPedia [45].

**Table 1.** The analysed specimens and summary of the results. Sample ID corresponds to the code attributed to the bone remains during the anthropological study, while Lab code corresponds to the laboratory label assigned during genotyping of nuclear genome. Note that Lab code is the same for sample 7.1 and 8.4, because the mitochondrial (mtDNA) and nuclear DNA (nuDNA) profiles indicated they belong to the same individual. Anatomical element, sex, and age determination obtained through anthropological methods [11,18] are indicated. Descriptive results for mtDNA and nuDNA analyses are reported. Direct radiocarbon dates are indicated where available. Samples 6.2, 6.3, 7.5, 8.8, and 8.9 were discarded from further analyses because of their low coverage or recent radiocarbon dating.



**Table1.***Cont.*

#### **3. Results and Discussion**

#### *3.1. Mitochondrial Haplotype and Haplogroup Analysis*

Our molecular study of the human skeletal remains from Kierzkowo first focused on the analysis of mitochondrial DNA (mtDNA) genomes obtained through a capture enrichment approach associated with next generation sequencing [3]. This strategy allowed us to obtain 15 complete or almost complete mtDNA profiles out of 17 samples. Polymorphisms according to rCRS are annotated in Table S1 and showed in a phylogenetic tree including information about haplogroup attribution (Figure S1). The C > T misincorporation percentages at 5′ termini and average fragment length as well as estimates of modern human contamination calculated for the mtDNA molecules are indicated in Table 1.

As expected, the reanalysis of haplogroup attributions based on a newer version of PhyloTree and HaploGrep agrees with previously published data [3], except for some samples now attributed to specific sub-lineages generally with a higher resolution (H28a instead of H28, U5b2b1a1 instead of U5b2b1, U5b1d1a instead of U5b1d1, U3b3 instead of U3b, and H1 instead of H1b for sample 3.1). Haplogroup H was the most frequently predicted and represented more than half of the mtDNA variation in our dataset with eight samples carrying three different lineages (H, H1 and H28a). Haplogroup U characterises four samples, three belonging to U5 lineages and one to U3b. The remaining three samples belong to haplogroups J, K, and W. At the haplotype level, despite some missing positions, four samples (corresponding to sample IDs 7.1, 8.4, 6.1, and 5.1) share the same mutation motif, which is attributed to the H28a haplogroup. This haplotype sharing could be explained (i) by a possible maternal relationship, more or less close in time, or (ii) by the presence of two or more samples belonging to the same individual (as shown below, further analyses at the nuclear genome level were necessary to support one of the two possible hypotheses).

The evaluation of damage patterns described through average fragment length and the C > T misincorporation percentage at 5′ termini showed a certain degree of molecular degradation for all the samples. Even if fragment length cannot be used as an antiquity marker (degradation of the DNA to an average size lower than 100 bp occurs rapidly after the death of the organism, probably due to autolytic processes [46,47]), the average length of our samples was estimated to range from 50.6 to 67.7 bp, in line with degradation processes.

Moreover, a different level of cytosine deamination at read ends was observed for the samples. Most of them showed percentages of C > T at 5′ ranging from 23% to 39% (values compatible with the Neolithic chronology of the site [46]), while three samples (6.2, 8.8, and 8.9) were characterised by lower percentages of molecules affected by misincorporation ranging from 11% to 12%.

As is known, opposite to fragment length, the frequency of misincorporations at the ends of the molecules tends to increase over time and can be used to distinguish modern from ancient DNA (aDNA) [46]. Therefore, the presence of a lower percentage of misincorporation for samples 6.2, 8.8, and 8.9 could be a signal of possible modern contamination of the samples or indicate that the three individuals did not belong to the Neolithic period.

The estimates obtained by the evaluation of ContamMix [22] allowed us to exclude a relevant contamination by modern human DNA for these samples. Referring to archaeological data, these three samples turned out to belong to a skeletal assemblage located outside the principal burial chamber of the tomb.

For these reasons, as described below, further investigation was required to explain the different damage patterns that characterise these samples.

#### *3.2. Analysis of Nuclear Genome Data*

Nuclear genome data were obtained through a target enrichment approach based on using 1240 K informative SNPs as a target [2]. The population genetics analysis showed the three individuals characterised by lower misincorporation percentages as diverging from the overall genetic variability associated with the other samples from Kierzkowo, providing

further clues about a possible different provenance of these samples [2]. Radiocarbon dating was, therefore, applied to some of the bones (both with lower and higher values of misincorporation percentages) in order to directly check their chronology. Dating results confirmed the bones from the tomb chamber as belonging to the Globular Amphorae culture (GAC) period: 2870–2575 cal BCE (β-430,712, sample 3.4), 3260–3095 cal BCE (β-430,713, sample 8.2), and 3015 cal BCE (β-430,714, sample 8.4) [2]. Results obtained for the bones found outside the burial chamber proved, instead, that 6.2, 8.8, and 8.9 samples represented a recent intrusive burial (8.8 and 8.9 dated to 210 ± 30 BP and 130 ± 30 BP, respectively, β-430,716 and β-430,715 [2]) of an adult woman and two children (one infant and a 2–3 year-old child), who were not maternally related, according to the mtDNA profiles. As a result, archaeologists suggested that they may have been people who—for some reason—could not be buried in the "consecrated ground" or that they were victims of one of the epidemics that ravaged these lands, such as the plague in the early 18th century or the cholera epidemic in the 19th century. In fact, at some distance from Kierzkowo, a mass grave of the victims of the pestilence from the mid-19th century was discovered. In addition, several victims of the cholera epidemic were also buried in the 19th century in the megalithic tomb of the GAC in Złotowo, 10 km from Kierzkowo [11].

If data obtained from the study of mtDNA genomes first provided important information about possible relations among samples and their chronology, only with the analysis of nuclear DNA (nuDNA) data was it possible to clarify the clues arising from the mtDNA analysis and to obtain thorough attributions at the individual level. Nuclear genome analysis, indeed, allowed us to clarify the relationships among the four samples sharing the same mitochondrial H28a haplotype. By evaluating the mitochondrial and nuclear DNA data, it was possible to provide a proper estimate of the minimum number of individuals (MNI), which otherwise would not have been allowed due to the complex taphonomic situation of bone remains in the burial.

In addition, the comparison of the obtained nuDNA genotypes allowed us to identify and verify that samples 7.1 and 8.4 belonged to the same individual and aid the anthropological study in the correct site reconstruction since the two samples (a maxillary and femur fragment) had been previously tentatively attributed to different individuals because of surface appearance and state of preservation.

#### 3.2.1. Sex Identification

Nuclear genome data were obtained for eight individuals, of which six were identified as male and two as female (Table 2). The comparison of genetic and anthropological data highlighted a perfect match of results for all samples analysed, except sample 6.1 for which the genetic determination gave a result different from the anthropological estimation. Furthermore, it generally allowed us to obtain a determination in case of absence of diagnostic morphological markers, i.e., for highly fragmented remains and infant and juvenile individuals. As a result, genetic sex determination allowed us to improve the attributions previously made through the anthropological approach. Even in this case, the fragmentation and preservation level of skeletal material, as well as the presence of infant and juvenile individuals, represented a strong limit for proper sex determination via the anthropological approach.

#### 3.2.2. Y-Chromosome Haplogroup Determination

Three out of six male individuals yielded enough valid markers to allow a haplogroup determination with the approach used by Y-leaf 2.2 [24]. All three samples (3.4, 7.6, 8.5) belonged to haplogroup I2, with different levels of definition at the sub-haplogroup level (Table 3). As expected, the more valid markers were available per sample, the deeper attribution at the sub-haplogroup level was possible. Due to the limited number of shared covered positions between samples and their low coverage, it was not possible to perform a detailed comparison between individuals at the haplotype level. For this reason, we have been able to highlight the presence of the same haplogroup in more individuals but not go

deeper into the estimate of haplotype sharing and possible kin relationships based on this uniparental marker.

**Table 2.** Sex reconstruction for the Neolithic bones. Nseq: total number of sequences; NchrY + NchrX: total number of alignments to both sex chromosomes; NchrY: number of alignments to the Y chromosome; R\_y: ratio between NchrY and NchrY + NchrX; SE: R\_y standard error; 95% CI: 95% confidence interval for R\_y.


#### 3.2.3. Kinship Analysis

Kinship predictions by READ software [31] allowed us to identify the couples 6.1–7.6, 7.1/8.4–5.1, and 5.1–6.1 as possible parent–offspring or full-siblings, while 7.1/8.4–6.1 and 7.1/8.4–7.6 were classified as second-degree relatives (grandparent–grandchild, avuncular or half-siblings). However, the standard error of the average proportion of non-matching alleles (P0 values) calculated by READ [31] expressed very high uncertainty of the classification for the couple 7.1/8.4–6.1 (Figure 3), probably due to a reduced number of comparable genomic positions. To a lesser extent, the categorisation of most couples showed uncertainty, frequently spanning more than one class; thus, only the first-degree relationships involving sample 5.1 may be considered unambiguously predicted (Figure 3). We also performed a genotype likelihood-based analysis of kinship using NgsRelate [32] to compare kinship predictions obtained using different approaches (see Methods). Both methods led to quite similar outcomes and the full results are shown in Supplementary Table S2. First degree relationships involving 6.1 were confirmed and further distinguished using the R0 ratios described in Waples et al. [38] into parent–offspring or sibling relationships [38], resulting in a family unit composed of 5.1 as mother, 7.6 as father and 6.1 as son. A maternal relationship between mother and son was also confirmed by the same mtDNA profile belonging to haplogroup H28a. Theta values suggested a first degree of kinship for the couple 7.1/8.4–6.1; however, the prediction remained unreliable since only 189 genomic sites were considered in the analysis. A first-degree relationship was also estimated for the couple 7.1/8.4–7.6, differing from the READ prediction (second-degree). Low coverage, low degree of preservation of aDNA in these samples and possible inbreeding leading to a discrepancy from expected values of identity-by-descent (IBD), as shown for samples 7.1/8.4 and 6.1 (Table S2), could explain the difficulties in inferring kinship. Individual 7.1/8.4 could be, indeed, brother or uncle of 6.1, as well as brother of 5.1 as indicated also by mtDNA analysis; furthermore, he could be involved in a possible second-degree relationship with 7.6. Both biological and methodological reasons could complicate pedigrees, e.g., the previous generation showing some degree of relatedness or inaccurate population allele frequencies due to population structure in the data. The different kinship degree predicted by the two software programs could be explained by another possible reconstruction where individual 7.1/8.4 could be the half-brother of 6.1, (both sons of 5.1) with a possible kinship relation between their fathers.

**Table 3.** Y-chromosome haplogroup determination. Hg: final haplogroup predicted using ISOGG nomenclature. Hg\_Marker: final haplogroup predicted using marker nomenclature. Total\_Reads: total number of reads. Valid\_Markers: number of markers used for the haplogroups prediction. QC\_score: overall quality score that is the factor of the three scores QC-1, QC-2, and QC-3; QC-1: score that indicates whether the predicted haplogroup follows the expected backbone of the haplogroup tree structure; QC-2: score that indicates whether equivalent markers to the final haplogroup prediction were found in the ancestral state; QC-3: score that indicates whether the predicted haplogroup follows the expected within haplogroup tree structure.


82

**Figure 3.** Kinship predictions by READ and NgsRelate software. (**A**) The average proportion of non-matching alleles (P0 values) with two standard errors calculated by READ is shown for each pair of samples. The solid horizontal line indicates the median value used for normalization. Dashed lines show the cut-offs calculated by READ to classify individuals as identical twins (MZ), first-degree, or second-degree related. The grey areas around dashed lines indicate 95% confidence intervals for the cut-offs. (**B**) Theta values for each pair of samples were calculated by NgsRelate. Dashed lines define ranges of expected theta values for each kinship category. We considered predictions of kinship by NgsRelate up to the third degree.

Additional second-degree relationships could be inferred from the theta values involving 5.3 and 8.2 (although they were not confirmed by READ), as well as potential third-degree relationships for nine couples of samples. Moreover, the relationship for all the remaining pairs seemed to be more distant than that of third-degree, but it could be overestimated by NgsRelate, as shown for the old version of the tool [48]; thus, we considered these couples as unrelated. The KING statistic is frequently the most discordant among all the estimated predictors: it probably underestimates kinships, as already demonstrated [49].

As a result of kinship analysis, we were able to construct a reasonable possible pedigree tree including all four abovementioned samples by admitting the absence of some members of a hypothetical extended family among the remains (Figure 4), with the nuclear family composed of 5.1 (mother), 7.6 (father), and 6.1 (son), and considering 7.1/8.4–6.1 as halfsiblings and 7.1/8.4–7.6 as nephew/uncle.

**Figure 4.** Pedigree reconstructed for the Kierzkowo family. A possible pedigree was inferred from genome-wide estimates of pairwise identity-by-descent (IBD). The nuclear family in the grey area was confirmed by both READ and NgsRelate tools. For each individual, age (yo: years old) estimated through morphological analysis is indicated. Y chromosome (YHg) haplogroup is described where available. Mitochondrial haplogroups (mtHg) are highlighted in different colours. Mitochondrial haplotype sharing supports the estimates of kinship relations between mother and sons based on genomic data (see Figure S1 and Table S1). The pedigree was plotted using the kinship2 R package [50].

In an attempt to combine molecular with anthropological data, information about age classes estimated through morphological analysis could help to discriminate between different hypotheses about kinship relations, but we have to consider the uncertainty of the age estimates due to high fragmentation and degradation of skeletal remains and the possibility that individuals could have been dead and buried in different epochs.

Even if the obtained data did not allow a sound detailed reconstruction of the possible pedigrees, we can suppose the presence of at least a nuclear family and of different degrees of biological relationships between several couples of individuals buried in the Kierzkowo barrow. Even in this case, mtDNA data provided a first indication about possible relationships among individuals, but nuDNA information was necessary to discriminate between different possible reconstructions.

#### 3.2.4. Phenotypes

Because of low or no coverage, we were able to partially detect the selected phenotypically informative markers in our samples. Indeed, 32 of 41 HIrisPlex-S SNPs were not totally covered in some samples, presumably due to DNA fragmentation and because the remaining nine SNPs were not part of the 1240 K capture array. Generally, diploid genotyping (see Methods) called 35 out of the 48 analysed genomic positions (41 HIrisPlex-S + 7 additional markers), including three variants in a heterozygous state and 28 homozygous alleles (either reference or variant) confirmed by pseudo-haploid calling (Table S3). Samples 5.1, 6.1, and 8.5 probably had blue eyes, as predicted by the HIrisPlex-S tool [39], and confirmed by data reported in SNPedia 6 [45]. The same eye colour could be supposed for 3.4, 5.3, and 7.6 (SNPedia, Table S3). Hair colour was predicted for 5.1 (dark brown) and 8.5 (blonde), while 6.1 was probably light-haired. Red hair could be excluded as a phenotypic trait in more than half of the samples (3.4, 8.2, 5.3, in addition to 5.1 and 8.5). In terms of skin colour, pale skin was predicted for 8.5 and 5.1, which was also confirmed by SNPedia. The absence of freckles in 8.5 and light skin colour in 5.3 can be additionally supposed. Annotations by SNPedia are shown in Supplementary Table S3; probabilities calculated by the HIrisPlex-S tool are reported in Table 4 and Supplementary Table S4.

We also investigated *AB0* and *RHD* genes. Blood type 0 was inferred for 8.5, a classic OO homozygote like the 42% of Caucasians [45], while we could just exclude type B as the blood group for 3.4, 5.1, and 7.6, whose specific blood group could not be further determined due to a lack of coverage of the genomic position related to rs8176719 in the *ABO* gene. Similarly, Rhesus blood groups could not be properly inferred in our data since rs676785 and rs609320 in the *RHCE* gene (which determine c and e antigens, respectively) are not included in the 1240K SNPs panel. However, the *RHD* gene was partially covered in all the samples. Thus, its full deletion, which is mainly associated with the Rh-phenotype in Europeans [51], can be excluded.

Another trait we investigated is lactase persistence, since consumption of milk is attested in Kierzkowo by the presence of milk fats in ceramic vessels found in the burial and reconstruction of the diet through isotope analysis of human bone remains. Lactase is, indeed, the enzyme responsible for the digestion of the milk sugar lactose. Its production usually decreases after the weaning phase, except for some individuals who continue to produce lactase throughout adulthood. A mutation at the SNP rs4988235 (−13,910\*T) in the *MCM6* gene, supporting the lactase persistence haplotype, explains the phenotype in European populations [52]. We were able to describe this SNP only in four individuals (5.1, 7.6, 8.2, and 8.5) from Kierzkowo. Interestingly, they all were homozygous for the ancestral allele, suggesting they could have become lactose intolerant in adulthood, probably with no effect on 8.5, who was a newborn. It is a common idea that lactase persistence coevolved with the cultural adaptation of dairying. Interestingly, the date estimates for the rise in frequency of the −13,910\*T mutation (2188 and 20,650 years ago [53]) bracket archaeological dates for the spread of domestic animals and dairying into Europe, pointing at a probably strong positive selection linked to the cultural traits of animal domestication and adult milk consumption [53–56]. However, aDNA studies have shown that the −13,910\*T allele was

very rare or absent in early Neolithic central Europeans, which lets us suppose that lactase persistence and dairying coevolution began around 7500 years ago, probably among people of the Linearbandkeramik culture between the central Balkans and central Europe [57]. According to our data, people from Kierzkowo were lactose intolerant in adulthood, even though their culture was characterised by the importance of dairying. We have to consider that an inter-individual variation of the amount of lactose tolerated by lactase non-persistent people (probably as a result of variation in the composition of the gut flora) is known and that some intolerant individuals can consume lactose-containing products without any obvious ill effects. In particular, fermented dairy products (i.e., yoghurt or cheese) are usually well tolerated by non-persistent individuals since they contain less lactose [58].

#### *3.3. Comparison between GAC Sites*

A comparison of our results can be made with information available from another burial site attributed to the GAC in Poland, for which genome analysis was performed on human bone remains: the mass grave of Koszyce (Małopolska province) [59]. In this case, 15 individuals including men, women and children, were buried at the same time, next and on top of each other, after a violent death. According to the genomic data, the buried individuals belonged to a large extended family connected by several first- and seconddegree relationships. In particular, four nuclear families were represented. The presence of unrelated females and related males would indicate a dominant form of patrilineal social organisation. The two burial contexts are quite different (a mass grave in Koszyce and a typical GAC burial ritual in Kierzkowo), as well as the experimental strategy applied due to a different level of sample preservation (whole genome sequencing for Koszyce samples and target enrichment of selected SNPs for Kierzkowo). However, similarly genetic and reproductive relationships were probably the basis of social relationships in GAC communities. In Kierzkowo as well as in Koszyce, the presence of different mtDNA lineages can support the hypothesis of a community not based on the matrilocal residence system. Unlike the Koszyce samples, the resolution of our data on the Y-chromosome did not allow us to confirm the presence of a unique patrilineal lineage, although we cannot exclude it either, since data for Kierzkowo show the same haplogroup for all the available samples. In particular, Y-chromosome variability is represented in both sites only by the I2 haplogroup, indicating a low genetic diversity in the GAC population according to this marker, which could support the hypothesis of a society based on patriarchy.

This hypothesis could find support from archaeological data: the economic base of the GAC group from Kierzkowo was agriculture and animal husbandry. In particular, the latter played a special role, as evidenced by the numerous skeletal remains of farm animals and the milk fat residues retrieved in ceramic vessels found in the burial, which suggest consumption of milk and meat. Preliminary isotope analysis on the human skeletal remains from Kierzkowo confirms the presence of these foods in the diet and seems to indicate better access to valuable animal products for men than for women [11]. Furthermore, most pastoralist societies are patriarchal due to the male role in animal husbandry. A particular role of men in GAC society is also sometimes demonstrated by the equipment of the graves, which is differentiated by gender [8,15,60–63].


**Table 4.** HIrisPlex-S annotations and probabilities. Prediction probabilities calculated by the HIrisPlex-S software are shown only for the three samples with available data. Complete results are provided in Supplementary Table S4. In terms of eye colour, the predicted phenotype (blue) was defined by the highest p-values. In terms of hair colour, the highest p-value approach is used in combination with a stepwise model as described in [43]. In terms of skin colour, the highest p-value approach is used in combination with the second highest probability value [39].


#### **4. Conclusions**

Due to the (quite recent) introduction of next generation sequencing methodologies in the aDNA field, particular emphasis has been placed on archaeobiological research (archaeobotany, zooarchaeology, anthropology) as a way to gain insight into the environmental conditions and the social organisation of past populations and provide information at the individual level.

The study of the human bone remains found in the Kierzkowo Megalithic barrow allowed us to add important details to the description of this Neolithic community associated with the Globular Amphorae culture, and to the reconstruction of the use of the site through time. Indeed, the evaluation of nucleotide misincorporation patterns of aDNA (molecular degradation) suggested the presence—confirmed through radiocarbon C14 dating—of a recent intrusion into the burial that happened in the last century, when the remains of a woman and two little children were deposed inside the perimeter of the Neolithic burial. Together with the more recent remains, two female and six male Neolithic individuals were found in the tomb and identified. Despite the taphonomic condition of the site, characterised by a secondary deposition of fragmented and mixed bone remains, kin relationships and phenotypic traits of the buried individuals were also successfully estimated via aDNA analysis. Most of them were related by first-, second-, and third-degrees of kinship, and in particular, we located a nuclear family, one of the oldest attested by genetic data.

Considering the nuDNA data in general, the methodological approach used presented some limitations in terms of information recovered, as results are restricted to a relatively small and pre-selected set of genomic markers sequenced after a target enrichment. A whole-genome sequencing with high coverage would represent, indeed, the best strategy to obtain accurate individual profiling. However, producing whole-genome data is suitable for samples with levels of endogenous DNA content higher than that in Kierzkowo bone fragments (average 0.19%, with a minimum value of 0.007% and a maximum value of 0.66% for Neolithic individuals). More suitable samples than those used would have been represented by the petrous part of the temporal bone or by teeth, but they were not available for this study. Thus, to the best of our knowledge, the methodology chosen was the most suitable.

In conclusion, thanks to the development of the new sequencing technology and the choice of the best experimental strategy according to the level of preservation and the type of sample available, this study highlighted a new opportunity for ancient genomic studies.

Despite its limitations, even a target enrichment approach originally conceived for investigating large-scale populations' structure and relationships can provide interesting information at the individual level that can be exploited for a fine reconstruction of singular archaeological contexts. Through a multidisciplinary approach that involves archaeologists, archaeobotanists, zooarchaeologists, and biological and molecular anthropologists, a detailed bioarchaeological reconstruction of ancient communities and their members can be supplied. From this perspective, molecular data represent an important informative source to support biological anthropological activity, answering specific questions and reconstructing individual profiles as well as biological and social relationships. The results presented in this study, together with possible further genetic data obtained for other GAC sites, can provide an important contribution that is useful for supporting archaeological and anthropological research on the social organisation and funerary rituals that characterised this culture.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/article/ 10.3390/genes12060910/s1, Table S1: Polymorphisms according to rCRS and Mitochondrial Haplogroup determination, Table S2: Kinship analysis by READ and NgsRelate, Table S3: Annotation of phenotypically informative SNPs detected by ATLAS in the dataset, Table S4: Eye, hair, and skin colour prediction according to HIrisPlex-S on SNPs detected by ATLAS, Figure S1: Phylogenetic tree and haplogroup determination for fifteen mitogenomes from Kierzkowo. Polymorphisms are

indicated according to rCRS reference sequence. Haplogroup assignment is based on PhyloTree Build 17. For the sake of simplicity, sample IDs were used to indicate samples.

**Author Contributions:** Conceptualization, S.V. and E.P.; methodology, S.V., M.A.D., E.P.; formal analysis, S.V., M.A.D., C.C.; investigation, S.V., M.A.D., C.C., A.B., E.P.; resources, S.V., M.L., D.C.; data curation: S.V., M.A.D., E.P.; visualization, S.V., M.A.D., C.C., E.P.; writing—original draft preparation, S.V., M.A.D., A.B., E.P.; writing—review and editing, S.V., M.A.D., C.C., A.B., M.L., D.C., E.P. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Italian Ministry of Education, University and Research (PRIN2017 grant no. 20177PJ9XF) and by the University of Florence—Bando di Ateneo per i progetti competitivi per Ricercatori a Tempo Determinato (RTD) 2019-2020 (Prot. n. 197432).

**Institutional Review Board Statement:** Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data analysed in this study are openly available in the European Nucleotide Archive (ENA) under accession numbers ERS2040835-ERS2040840, ERS2040844, and ERS2040845.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


*Article*
