The Delta Variant, which was first detected in India in October 2020 [36] and spread to more than 170 countries globally (GISAID), was introduced into our region at the beginning of May 2021 [37]. In 2021, Palestine witnessed two COVID-19 peaks: the first was the Alpha Variant peak during March–April, and the second was the Delta Variant peak during August–October [10, 38, 39, 40]. In mid-September 2021, the number of active COVID-19 cases increased to 32,533, with a mortality rate of 19 per day (3850 deaths), owing to the spread of the Delta Variant. The entire country was placed on the list of international high-risk areas [
38]. The situation was in line with reports that SARS-CoV-2 genomes from Israeli, Lebanese, and Egyptian patients were dominated by the Delta Variant during July to December 2021 [
41,
42]. Our study detected 45 cases of the Delta Variant among Palestinian travelers arriving mainly from Turkey through the border bridge with Jordan, who were tested for COVID-19 directly upon arrival. However, the purely spatial distribution of COVID-19 cases in Palestine showed eight significant geographical clusters (
Figure 1). Phylogenetic analysis (
Figure 2) showed the absence of any clear pattern of geographic and genetic clustering (
Figure 4). Conversely, the haplotype network displayed three haplogroups, though still without any geographical signal. By contrast, global-level studies have revealed an association of SARS-CoV-2 haplotypes with geographic origin and case fatality rates among COVID-19 patients [
43]. Compared to a vast geography like India, both analyses showed a high number of haplotypes and lineages, suggesting that the Delta Variant has been imported into and exported from Palestine multiple times [
44]. Nearest neighbor analyses suggest that large numbers of Palestinians cross the Palestinian–Israeli borders as close relatives, laborers, or even regular workers in the Israeli industrial zones in Palestine, and travel through Turkey, which acts as a regional transport hub. The Palestinian and Israeli peaks in 2021 followed a clear sequential pattern in which the Palestinian peaks (maximum number of cases in the highest peak was around 8k) were always preceded by the Israeli ones (maximum number of cases in the highest peak was over 75k), with a lag of a couple of weeks to one month in between [
45] (
Supplementary Figure S1). On the other hand, the Jordanian peak in spring was identical to that in Palestine, and the second peak was in December 2021, one month after the second Palestinian peak, ruling out any effect of the Jordanian Delta peak on Palestine, whereas the opposite could have happened [
46]. Based on the timing of the peaks and the fact that Palestinians and Israelis are adjacent communities, the transfer of SARS-CoV-2, including the Delta Variant, from the Israelis to the Palestinians is highly probable, especially given that the first Israeli Delta Variant case was detected at the beginning of May and the first Palestinian Delta Variant case was reported two months later, at the end of June 2021 [
37,
47]. This lag was long enough for the Delta Variant to spread effectively in the Palestinian community. Furthermore, the nature of the Delta Variant, which is 63–167% more transmissible and emerges 1.4–2.6 times faster than the Alpha Variant, could have contributed to the spread of this lethal COVID-19 variant [
48]. Yet recombination between the Alpha and Delta SARS-CoV-2 variants is extremely rare; thus, recombination cannot be used to explain the Delta peak that followed the Alpha one [
49]. Additionally, in a relatively small geography like the West Bank, the absence of any geographical clustering with a high number of haplotypes and lineages may further indicate active endogenous circulation of the Delta Variant (
Figure 3) due to noncompliance with the official preventive regulations, such as the lockdown, restrictions on mass gatherings like weddings, social distancing, and mask wearing. In addition, the transmission network showed that none of the districts acted as a source of COVID-19; rather, they were hubs (SHR ≈ 0.5) with equal weights, reflected by equal node sizes, except for Ramallah, which appeared as a sink due to the limited sample size. Although SHR does not indicate which node is the most important in the spread of SARS-CoV-2, Al-Khalil district has the thickest arrows originating from its node (outdegree), indicating a higher frequency of transitions.
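The source/hub/sink reading of the transmission network can be illustrated with a minimal sketch. It assumes that SHR is the fraction of transitions leaving a node out of all transitions touching that node, so that values near 1 mark a source, values near 0 a sink, and values near 0.5 a hub; this definition, the function name `source_hub_ratio`, and the node labels X, Y, and Z are assumptions made for illustration and are not taken from the study's data or software.

```python
from collections import defaultdict


def source_hub_ratio(transitions):
    """Return {node: SHR} for an iterable of (source, destination, count).

    Assumed definition: SHR = outgoing / (outgoing + incoming) transitions.
    """
    out_deg = defaultdict(int)
    in_deg = defaultdict(int)
    for src, dst, count in transitions:
        out_deg[src] += count   # transitions leaving src
        in_deg[dst] += count    # transitions arriving at dst

    ratios = {}
    for node in set(out_deg) | set(in_deg):
        total = out_deg[node] + in_deg[node]
        if total > 0:           # skip nodes with no recorded transitions
            ratios[node] = out_deg[node] / total
    return ratios


# Hypothetical toy network; the labels and counts are invented.
toy_transitions = [("X", "Y", 3), ("Y", "X", 3), ("X", "Z", 2)]
toy_shr = source_hub_ratio(toy_transitions)
```

In this toy network, node Z only receives transitions and therefore behaves as a sink, which mirrors how a sparsely sampled district can appear as a sink simply because few outgoing transitions were observed.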
In this study, the total nucleotide diversity (π = 0.0009 ± 0.000,
Table 3) across the SARS-CoV-2 spike region was relatively high compared to the global nucleotide diversity (π = 0.00044 ± 0.00001) as well as that of the global regions across the whole genome [
50]. However, the neutrality tests, Tajima’s D, Fu-Li’s D, and Fu-Li’s F (
Table 3), significantly deviated from neutrality to the negative side, indicating a departure from equilibrium. The negative neutrality test values can be explained by a recent viral population expansion event, by a recent introduction of a new mutation or group of mutations that established themselves in the population and became fixed (selective sweep), or by natural (negative) selection, in which deleterious alleles are selectively removed, leading to reduced genetic diversity. However, our results suggest the recent population expansion as the most plausible explanation for the low neutrality test values and low nucleotide diversity, along with the excess of rare mutations (negative neutrality tests,
Table 3). Both forces, population expansion and the excess of rare mutations, resulted in low genetic diversity among the viral populations. Moreover, other minor forces, such as co-circulation and population genetic admixture, are expected to increase diversity. These results are consistent with other studies that reported significantly negative values for neutrality tests [
51].
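To make the link between nucleotide diversity, rare mutations, and a negative Tajima's D concrete, the following is a minimal sketch of both statistics for a gap-free alignment of equal-length sequences. It uses the standard Tajima (1989) constants; the function names and the toy alignment are illustrative assumptions and do not reproduce the software or the data used in this study.

```python
from itertools import combinations
from math import sqrt


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all sequence pairs (k)."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """Nucleotide diversity (pi): mean pairwise differences per site."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def tajimas_d(seqs):
    """Tajima's D; requires at least 4 sequences and 1 segregating site."""
    n = len(seqs)
    seg_sites = sum(len(set(col)) > 1 for col in zip(*seqs))
    if n < 4 or seg_sites == 0:
        raise ValueError("need n >= 4 and at least one segregating site")

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    k = mean_pairwise_differences(seqs)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k - seg_sites / a1) / sqrt(variance)


# Invented toy alignment in which every variant is a singleton.
toy_alignment = ["AAAAAA", "TAAAAA", "ATAAAA", "AATAAA", "AAATAA"]
toy_pi = nucleotide_diversity(toy_alignment)
toy_d = tajimas_d(toy_alignment)
```

Because every variant in the toy alignment is carried by a single sequence, the mean pairwise difference falls below the value expected from the number of segregating sites, and D is negative. This is the same signature that an excess of rare mutations after a population expansion is expected to leave.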
The genetic distance between the three SARS-CoV-2 clusters (Fst > 0.25) and the low to intermediate gene flow (Nm) support the clustering of the genetic diversity, with signs of gene flow between clusters A and B (Nm = 0.57) and between clusters A and C (Nm = 0.36), but with low transfer of genetic diversity (Nm = 0.21) between clusters B and C due to extensive movement of hosts. A study in South America showed statistically significant values, indicating slight genetic differentiation [51]. Unlike Fst, the other genetic differentiation estimators, Gst and Da, were underestimated due to the high mutation rate [
33,
52]. Furthermore, the low values of Dxy (0.0009–0.0011) combined with high Fst values can be explained by a selective sweep, in which a mutation increases in frequency and becomes fixed in a population, ultimately reducing genetic variation over time. The transmission network (
Figure 3) partially supports the potential exchange of genes between clusters due to host movement between districts, especially movement from Al-Khalil, the district with the highest COVID-19 prevalence, which forms the node with the thickest outdegree arrows in the transmission network. More evidently, the value of Rg, the recombination parameter for the entire gene, is high (27–10⁵), suggesting that exchange within the gene may be occurring to varying degrees in the three clusters but more so in cluster C.
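The inverse relationship between Fst and Nm discussed above can be illustrated with Wright's island-model approximation. The sketch below is only an illustration: the estimator actually used for the reported Nm values is not stated in this section, so both the choice of formula and the `haploid` switch are assumptions.

```python
def nm_from_fst(fst, haploid=True):
    """Approximate gene flow (Nm) from Fst under Wright's island model.

    Assumed formulas: Nm = (1/Fst - 1) / 2 for haploid genomes and
    Nm = (1/Fst - 1) / 4 for diploid nuclear loci.
    """
    if not 0 < fst <= 1:
        raise ValueError("Fst must be in the interval (0, 1]")
    divisor = 2 if haploid else 4
    return (1 / fst - 1) / divisor


# At the Fst threshold of 0.25 quoted in the text:
nm_haploid = nm_from_fst(0.25)                  # haploid approximation
nm_diploid = nm_from_fst(0.25, haploid=False)   # diploid approximation
```

Under either assumption, higher Fst values map to lower Nm values, which is why strong differentiation between clusters goes together with low estimated gene flow.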
The study suggests that the three clusters have not completely evolved into isolated, distinct units, most probably because of the recent population expansion, after which Fst is expected to increase and Nm to decrease over time. The low estimates of HKA suggest a very small departure from neutral theory, indicating an almost constant ratio between within-population polymorphism and between-population divergence, explained by neutral selection of mutations and a low recombination rate [
35,
53]. In addition to the mutation rate, the recombination rate is a major contributor to genetic diversity in viral genomes. Although a low recombination rate is classically explained by genetic hitchhiking and background selection, the spike gene alone might not be sufficient to detect recombination compared with whole-genome sequencing (WGS), as previously suggested [
49,
54,
55]. Further, the high frequency of consecutive SARS-CoV-2 pandemic waves and the genetic similarity of the viral lineages add to the difficulty of accurately estimating the recombination rate [
49,
54]. Given the conflicting reports on recombination events in SARS-CoV-2, longer time scales are needed for recombination to become more pronounced and to affect the long-term evolution of the virus [
54,
56].