Article

A Genomic Surveillance Circuit for Emerging Viral Pathogens

by Carlos S. Casimiro-Soriguer 1,2; Maria Lara 1; Andrea Aguado 1; Carlos Loucera 1,2; Francisco M. Ortuño 1,3; Nicola Lorusso 4; Jose M. Navarro-Marí 5,6; Sara Sanbonmatsu-Gámez 5,6; Pedro Camacho-Martinez 7; Laura Merino-Diaz 7; Adolfo de Salazar 6,8,9; Ana Fuentes 6,8,9; The Andalusian COVID-19 Sequencing Initiative; Jose A. Lepe 2,7,9,‡; Federico García 6,8,9,‡; Joaquín Dopazo 1,2,* and Javier Perez-Florido 1,2,*,‡
1 Platform of Computational Medicine, Andalusian Public Foundation Progress and Health-FPS, 41013 Sevilla, Spain
2 Institute of Biomedicine of Seville (IBiS), University Hospital Virgen del Rocío/CSIC/University of Seville, 41013 Sevilla, Spain
3 Department of Computer Engineering, Automatics and Robotics, University of Granada, 18071 Granada, Spain
4 Dirección General de Salud Pública, Consejería de Salud y Consumo, Junta de Andalucía, 41020 Sevilla, Spain
5 Servicio de Microbiología, Hospital Virgen de las Nieves, 18014 Granada, Spain
6 Instituto de Investigación Biosanitaria ibs.GRANADA, 18012 Granada, Spain
7 Servicio de Microbiología, Unidad Clínica Enfermedades Infecciosas, Microbiología y Medicina Preventiva, Hospital Universitario Virgen del Rocío, 41013 Sevilla, Spain
8 Servicio de Microbiología, Hospital Universitario San Cecilio, 18016 Granada, Spain
9 Centro de Investigación Biomédica en Red en Enfermedades Infecciosas (CIBERINFEC), Instituto de Salud Carlos III (ISCIII), 28029 Madrid, Spain
* Authors to whom correspondence should be addressed.
The members of the network are listed in the Supplementary Material.
‡ These authors contributed equally to this work.
Microorganisms 2025, 13(4), 912; https://doi.org/10.3390/microorganisms13040912
Submission received: 4 March 2025 / Revised: 7 April 2025 / Accepted: 14 April 2025 / Published: 16 April 2025

Abstract

Genomic surveillance has been crucial in monitoring the evolution and spread of SARS-CoV-2. In Andalusia (Spain), a coordinated genomic surveillance circuit was established to systematically sequence and analyze viral genomes across the region. This initiative organizes sample collection through 27 hospitals, which act as regional hubs within their respective health districts. Sequencing is performed at three reference laboratories, with downstream data analysis and reporting centralized at a bioinformatics platform. From 2021 to 2025, over 42,500 SARS-CoV-2 genomes were sequenced, enabling the identification of major variants and their evolutionary dynamics. The circuit tracked the transition from Alpha and Delta to successive Omicron waves, including both recombinant and non-recombinant clades. The integration of genomic and epidemiological data facilitated rapid variant detection, outbreak investigation, and public health decision making. This surveillance framework at a regional granularity demonstrates the feasibility of large-scale sequencing within a decentralized healthcare system and has expanded to monitor other pathogens, reinforcing its value for epidemic preparedness. Continued investment in genomic surveillance is critical for tracking viral evolution, guiding interventions, and mitigating future public health threats.

1. Introduction

With more than 17 million sequences submitted to GISAID [1] and other databases at the time of writing, SARS-CoV-2 is probably the most widely sequenced pathogen in the world. Successive waves of infection have resulted in a constant selection of SARS-CoV-2 variants with new mutations in their viral genomes [2,3,4]. Sometimes, these novel variants carry specific mutations that have been linked to higher transmissibility [5,6,7] and/or immune evasion [8,9], making them relevant from a public health perspective [10] and leading to their classification as variants of interest (VOI) or variants of concern (VOC) [11].
While public health interventions, quarantine measures, and vaccination programs have been integral to the management of both past and present pandemics, the COVID-19 pandemic represents the first instance in which genomic sequencing has been deployed on an unprecedented global scale. This genomic surveillance has provided a critical advantage in pandemic response, enabling near-real-time insights into the transmission dynamics and evolutionary trajectory of SARS-CoV-2 [12].
Spain has a decentralized health system in which healthcare competencies are transferred to the autonomous regions. Andalusia, the largest region of Spain and the third largest in Europe, has a population of 8.5 million, comparable to that of a medium-sized European country such as Austria or Switzerland, and has implemented a thoroughly digitalized health system over recent decades. During the first wave of the pandemic, Andalusia established an early pilot project for SARS-CoV-2 sequencing [13], which later became the genomic surveillance circuit of Andalusia [14], in close coordination with the Spanish Health Authority [15]. This circuit was an integral component of the strategy for personalized medicine during the COVID-19 pandemic [16]. In addition, the Andalusian Public Health System has systematically stored the electronic health record (EHR) data of all Andalusian patients in the Population Health Base (BPS, from its Spanish name “Base Poblacional de Salud”) since 2001, making this database one of the largest repositories of highly detailed clinical data in the world (containing detailed longitudinal clinical information on over 15 million patients) [17].
The genomic surveillance circuit is a consortium that includes the 27 main hospitals across the eight provinces of Andalusia (Table S1), the Platform of Computational Medicine, the General Directorate of Public Health of the Ministry of Health and Consumer Affairs, and the Technical Subdirectorate of Information Management of the Andalusian Health Service. Figure 1 sketches the general operating layout of the circuit.
This study aims to describe the conception, implementation, and outcomes of the SARS-CoV-2 genomic surveillance circuit in Andalusia, highlighting its role in monitoring viral variants and informing public health interventions.

2. Materials and Methods

2.1. Design and Patient Selection

The genomic surveillance circuit for SARS-CoV-2 in Andalusia includes 42,552 SARS-CoV-2 genomes that were systematically sequenced from RT-PCR-positive individuals, following the recommendations of the Spanish Ministry of Health [18], from January 2021 to the present.
According to these recommendations, the circuit employed both random and targeted sampling strategies. Random sampling was conducted on RT-PCR positive cases to estimate the frequency and monitor the distribution of variants of public health interest in the general population. To minimize sampling bias, samples from travelers or epidemiologically linked cases (e.g., from the same outbreak) were excluded, and selection was made independently of specific RT-PCR results targeting known mutations. The proportion of sequenced samples was adapted to the epidemiological context, ranging from a minimum of 5% during periods of high incidence (7-day incidence greater than 250 cases per 100,000 inhabitants) to nearly 100% during periods of very low incidence (7-day incidence below 10 cases per 100,000 inhabitants).
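The incidence-dependent sampling proportion described above can be illustrated with a short sketch. Only the two endpoints (5% above 250 cases per 100,000 and nearly 100% below 10 cases per 100,000) come from the text; the log-linear interpolation between them, the treatment of the exact boundary values, and the function name `sampling_fraction` are assumptions made purely for illustration and are not part of the national recommendations.

```python
import math

# Endpoints taken from the text (7-day incidence per 100,000 inhabitants).
HIGH_INCIDENCE = 250.0   # above this, the minimum fraction applies
LOW_INCIDENCE = 10.0     # below this, nearly all positives are sequenced
MIN_FRACTION = 0.05
MAX_FRACTION = 1.0


def sampling_fraction(incidence_7d: float) -> float:
    """Return the fraction of RT-PCR positives to sequence.

    ASSUMPTION: values between the two documented endpoints are
    interpolated log-linearly; the source does not specify this rule.
    """
    if incidence_7d < 0:
        raise ValueError("incidence cannot be negative")
    if incidence_7d >= HIGH_INCIDENCE:
        return MIN_FRACTION
    if incidence_7d <= LOW_INCIDENCE:
        return MAX_FRACTION
    # Position of the incidence between the endpoints on a log scale (0..1).
    t = (math.log(incidence_7d) - math.log(LOW_INCIDENCE)) / (
        math.log(HIGH_INCIDENCE) - math.log(LOW_INCIDENCE)
    )
    return MAX_FRACTION + t * (MIN_FRACTION - MAX_FRACTION)


if __name__ == "__main__":
    for incidence in (5, 10, 50, 250, 400):
        print(incidence, round(sampling_fraction(incidence), 3))
```

Any monotonically decreasing rule between the two endpoints would serve the same purpose; the point is that sequencing capacity is spread thinly during epidemic peaks and concentrated when few cases are available.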
In parallel, targeted sampling was applied in specific epidemiological or clinical scenarios to enhance early detection and characterization of emerging variants. This included: (i) cases linked to outbreaks or settings associated with high incidence of VOI and VOC not yet widespread locally; (ii) clinically atypical cases, such as those with unusually severe disease, prolonged infection in immunocompromised individuals, or poor response to SARS-CoV-2-specific treatments; (iii) outbreaks exhibiting exceptionally high transmissibility or virulence; and (iv) suspected diagnostic anomalies, such as discordant results between nucleic acid amplification tests (NAATs) and antigen tests.
This dual approach enabled both population-level representativeness and rapid response to potential signals of concern in the viral genomic landscape, based on samples systematically collected by hospitals from their own inpatients as well as from other facilities within their health districts, including primary care centers and care homes.

2.2. SARS-CoV-2 Genome Sequencing

SARS-CoV-2 RNA-positive samples were subjected to whole-genome sequencing at the sequencing facilities of Hospital Universitario San Cecilio (Granada, Spain), Hospital Universitario Virgen del Rocío (Sevilla, Spain), and Hospital Universitario Virgen de Las Nieves/Andalusian Virus Reference Laboratory (Granada, Spain).
The sequencing strategy primarily involved short-read sequencing, although long-read sequencing was applied to a limited subset of samples.
For short-read sequencing, RNA extraction and amplification were performed following the ARTIC network protocols [19], using ARTIC primer set versions V3, V4, V4.1, and V5.3.2 [20,21,22,23] (Integrated DNA Technologies, Coralville, IA, USA), which were adopted successively over the study period for Illumina sequencing. SARS-CoV-2-positive samples with RT-PCR Ct values below 29 were selected for sequencing, since Ct values are inversely correlated with viral RNA concentration. Following nucleic acid extraction, overlapping amplicons spanning the SARS-CoV-2 genome were generated after cDNA synthesis using SuperScript IV Reverse Transcriptase (ThermoFisher Scientific, Waltham, MA, USA), 1 µL of random hexamer primers, and 11 µL of RNA.
Libraries were prepared according to the COVID-19 ARTIC protocol (V3, V4, V4.1, or V5.3.2, depending on the primer set version) and the Illumina DNA Prep Kit (Illumina, San Diego, CA, USA). Library quality was assessed using the Bioanalyzer 2100 system (Agilent Technologies, Santa Clara, CA, USA), and libraries were subsequently quantified using the Qubit DNA BR assay (ThermoFisher Scientific, Waltham, MA, USA). Normalized libraries were pooled and sequenced on various Illumina platforms with their corresponding sequencing reagent kits: MiSeq v2 (2 × 150 cycles), MiniSeq (2 × 150 cycles), iSeq (2 × 150 cycles), NextSeq 500/550 Mid Output v2.5 (2 × 150 cycles), and NextSeq 1000 (2 × 150 cycles).
For long-read sequencing, SARS-CoV-2 samples were sequenced on a MinION Mk1C platform (Oxford Nanopore, Oxford, UK) using a FLO-MIN106D flow cell. Library preparation followed the “PCR tiling of SARS-CoV-2 virus with rapid barcoding and Midnight RT PCR Expansion” protocol (SQK-RBK110.96 and EXP-MRT001), which generates 1200 bp amplicons [24].
While short-read platforms formed the backbone of the sequencing strategy due to their high throughput and accuracy, long-read technologies such as Nanopore served as a valuable complementary tool in SARS-CoV-2 genomic surveillance and public health response, thanks to their portability, affordability, and rapid turnaround. Notably, studies have shown that long-read platforms like Nanopore can generate consensus-level sequences with quality comparable to short-read technologies such as Illumina for SARS-CoV-2 variant detection, supporting their integration into surveillance workflows where rapid or decentralized sequencing is required [25].

2.3. Illumina Sequencing Data Processing Workflow

Illumina sequencing data were analyzed using in-house scripts and the nf-core/viralrecon pipeline (v.2.6.0) [26]. Briefly, after read quality filtering, the reads of each sample were aligned to the SARS-CoV-2 isolate Wuhan-Hu-1 reference genome (GenBank accession: MN908947.3) [27] using bowtie2 (v.2.4.4) [28], followed by primer sequence removal with iVar (v.1.4) [29] and duplicate read marking with Picard tools (v.3.0.0) [30]. Genomic variants were called with iVar (v.1.4) using a minimum allele frequency threshold of 0.25, and were then filtered to keep only variants with an allele frequency of at least 0.75. Using this set of high-confidence variants and the MN908947.3 genome, a consensus genome per sample was finally built using bcftools (v.1.16) [31].
Lineages and clades were assigned to each consensus genome using Pangolin (v.4.3.1, pangolin-data v.1.32) [32] and Nextclade (v.3.9.1) [33], respectively.
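The two allele frequency thresholds used in the Illumina workflow (0.25 for calling and 0.75 for the variants that enter the consensus) can be illustrated with a minimal sketch. This is not the iVar or nf-core/viralrecon implementation: the `Variant` record, the function name `split_by_frequency`, the example variants, and the choice to treat both thresholds as inclusive are assumptions for illustration only.

```python
from dataclasses import dataclass

CALL_MIN_AF = 0.25       # minimum allele frequency to call a variant (from the text)
CONSENSUS_MIN_AF = 0.75  # minimum allele frequency to keep it for the consensus


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record (not an iVar data structure)."""
    pos: int
    ref: str
    alt: str
    alt_freq: float


def split_by_frequency(variants):
    """Return (called, high_confidence) lists of variants.

    ASSUMPTION: both thresholds are applied inclusively (>=).
    """
    called = [v for v in variants if v.alt_freq >= CALL_MIN_AF]
    high_confidence = [v for v in called if v.alt_freq >= CONSENSUS_MIN_AF]
    return called, high_confidence


if __name__ == "__main__":
    # Invented example values, for demonstration only.
    example = [
        Variant(1000, "C", "T", 0.98),  # called and used in the consensus
        Variant(2000, "A", "G", 0.40),  # called, but excluded from the consensus
        Variant(3000, "G", "T", 0.10),  # below the calling threshold
    ]
    called, high_confidence = split_by_frequency(example)
    print(len(called), len(high_confidence))
```

Variants that fall between the two thresholds are reported but do not alter the consensus, which keeps minority alleles (for example from mixed infections or low-level contamination) out of the lineage assignment.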

2.4. Nanopore Sequencing Data Processing Workflow

For Nanopore data, base calling was performed on a graphics processing unit (GPU) cluster with four Tesla V100 GPUs using Guppy (v.5.0.16) [34] with the dna_r9.4.1_450bps_hac model. High-accuracy FASTQ files produced by Guppy were then processed with the nf-core/viralrecon pipeline (v.2.6.0), which utilizes the ARTIC Network pipeline [35] for read alignment to the SARS-CoV-2 isolate (MN908947.3), variant calling, and consensus sequence generation. This pipeline employs Nanopolish for variant calling and consensus generation, which corrects base-calling errors that are characteristic of Nanopore reads as part of the consensus-building process [36]. Lineage and clade assignment were performed in the same manner as in the Illumina workflow.

2.5. Phylogenetic Analysis

A phylogenetic analysis was performed using the Augur toolkit (v.28.0.1) [37] on a representative set of consensus genomes obtained from the Andalusian surveillance circuit. From the entire dataset spanning January 2021 to January 2025, 50 genomes per Pango lineage were randomly selected. Augur functionality relies on the IQ-Tree (v.2.2.0.3) software [38]. The MAFFT program (v.7.515) [39,40] was used for the multiple alignment, with the strain MN908947.3 as reference. The phylogenetic tree was inferred by maximum likelihood, using a general time reversible model with unequal rates and unequal base frequencies [41]. Branching date estimation was carried out with the least-squares dating (LSD2) method [42] using TreeTime (v.0.9.4) [43]. Branching point reliabilities were estimated by UFBoot, an ultrafast bootstrap approximation to assess branch support [44].
The results can be viewed on the Nextstrain [45] local server with detailed sampling information, including the collection date, host’s primary care center and its location (town and province), the hospital that recruited the sample, sequencing technology (Illumina or Nanopore) and the sequencing laboratory facility.
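The per-lineage random selection used to build the representative set in this section can be sketched as follows. The function name `subsample_per_lineage`, the fixed random seed, and the decision to keep all genomes of lineages with fewer than 50 members are assumptions; the text only states that 50 genomes per Pango lineage were randomly selected.

```python
import random
from collections import defaultdict


def subsample_per_lineage(lineage_by_genome, max_per_lineage=50, seed=0):
    """Randomly keep at most `max_per_lineage` genome IDs per lineage.

    ASSUMPTION: lineages with fewer genomes than the cap are kept whole.
    """
    groups = defaultdict(list)
    for genome_id, lineage in lineage_by_genome.items():
        groups[lineage].append(genome_id)

    rng = random.Random(seed)  # seeded so the selection is reproducible
    selected = []
    for lineage in sorted(groups):
        ids = sorted(groups[lineage])
        if len(ids) > max_per_lineage:
            ids = rng.sample(ids, max_per_lineage)
        selected.extend(ids)
    return selected


if __name__ == "__main__":
    # Invented genome identifiers, for demonstration only.
    toy = {f"genome_{i}": "BA.2" for i in range(120)}
    toy.update({f"rare_{i}": "XEC" for i in range(3)})
    print(len(subsample_per_lineage(toy)))  # 50 BA.2 genomes + 3 XEC genomes
```

Capping each lineage prevents heavily sequenced lineages from dominating the tree while still representing rare lineages.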

2.6. Resolution and Performance of the Andalusian Surveillance Circuit

To evaluate the resolution and performance of the Andalusian genomic surveillance circuit, two approaches were followed, one for each objective.
First, all consensus SARS-CoV-2 sequences and their associated metadata available in the GISAID database [46] were downloaded. To ensure consistency and data quality, sequences were filtered to exclude those with more than 5% undetermined bases (Ns), a length shorter than 29,000 nucleotides, or incomplete collection dates. From this curated dataset, sequences corresponding to samples collected in Spain between January 2021 and January 2025 were extracted (a total of 257,402 sequences). To evaluate the contribution of the Andalusian circuit, sequences originating from Andalusia were removed, resulting in a final comparative dataset of 222,906 complete genomes (EPI_SET_250402xc) [47]. All genomes were classified into lineages and clades using Pangolin (v.4.3.1, pangolin-data v.1.32) and Nextclade (v.3.9.1) tools, respectively, using the same criteria applied to the circuit dataset. By contrasting the relative frequencies of SARS-CoV-2 clades observed in Andalusia with those from the rest of Spain, the regional resolution provided by the Andalusian circuit in terms of variant distribution can be assessed. This highlights how a regionally focused approach can uncover local patterns that might not be fully captured by aggregated national or supranational data. Such a localized perspective supports more timely and targeted public health interventions.
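The three exclusion criteria applied to the downloaded sequences (more than 5% undetermined bases, length below 29,000 nucleotides, or an incomplete collection date) can be expressed as a small filter. This is a sketch under stated assumptions rather than the in-house script: the function name `passes_qc` is hypothetical, and a complete date is assumed to be written as YYYY-MM-DD, so that partial dates such as "2021" or "2021-03" are rejected.

```python
import re

MAX_N_FRACTION = 0.05   # exclude sequences with more than 5% Ns (from the text)
MIN_LENGTH = 29_000     # exclude sequences shorter than 29,000 nt (from the text)
# ASSUMPTION: a complete collection date has the form YYYY-MM-DD.
COMPLETE_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def passes_qc(sequence: str, collection_date: str) -> bool:
    """Return True if a consensus sequence meets all three criteria."""
    length = len(sequence)
    if length < MIN_LENGTH:
        return False
    n_fraction = sequence.upper().count("N") / length
    if n_fraction > MAX_N_FRACTION:
        return False
    return COMPLETE_DATE.fullmatch(collection_date) is not None


if __name__ == "__main__":
    good = "A" * 29_500
    print(passes_qc(good, "2022-01-15"))   # complete date, no Ns
    print(passes_qc(good, "2022-01"))      # incomplete date
    print(passes_qc("A" * 28_000 + "N" * 2_000, "2022-01-15"))  # too many Ns
```

Applying the same filter to both the national and the regional data keeps the clade frequency comparison from being distorted by low-quality or undated genomes.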
For the second approach, the overall performance of the Andalusian surveillance circuit—including both sequencing and subsequent bioinformatic analysis, following methods described in Section 2.2 and Section 2.3—was evaluated through two quality control assessments (QCAs) for SARS-CoV-2 sequencing coordinated by the Spanish Health Authority [15]. Inactivated and lyophilized SARS-CoV-2 samples were prepared by the Reference Laboratory—Respiratory Virus and Influenza Unit of the National Centre for Microbiology (CNM), Carlos III Health Institute—prior to distribution to the participating laboratories of the Andalusian surveillance circuit, where sequencing was performed. These assessments aimed to evaluate the accuracy of clade and lineage assignments for a total of 15 samples: 9 in 2021, with participation from two reference sequencing facilities, and 6 in 2024, involving three reference sequencing facilities of the Andalusian circuit.

3. Results

3.1. Sequencing Effort over the 2021–2025 Period

Since the beginning of the circuit, more than 42,500 SARS-CoV-2 genomes have been obtained (Figure 2). Sequencing intensity was not homogeneous across the monitored period, reflecting fluctuating epidemic waves, seasonal incidence and, in some cases, occasional resource limitations in the circuit. While the vast majority of sequences were generated using Illumina technology, a small proportion (approximately 0.7%) were obtained using Nanopore sequencing, typically in situations requiring faster turnaround or operational flexibility. The main SARS-CoV-2 VOI, VOC, and variants under monitoring (VUM) were detected in Andalusia. Notable VOCs such as Alpha (20I/B.1.1.7), Beta (20H/B.1.351), Gamma (20J/P.1), and Delta (21A/B.1.617.2, 21I, 21J) were identified in early 2021, followed by the emergence of multiple Omicron subvariants (21K/BA.1, 21L/BA.2, 22A/BA.4, or 22B/BA.5) as well as recombinant forms like 23A/XBB.1.5, 23D/XBB.1.9, or 23B/XBB.1.16 throughout 2022 and 2023. The detection of recent variants during 2023 and 2024, including 23I/BA.2.86, 24A/JN.1, 24C/KP.3, and the 24F/XEC recombinant, underscores the continued evolution of SARS-CoV-2 and the necessity of sustained genomic monitoring.
Over the study period and focusing on high-quality SARS-CoV-2 genomes with at least 95% genome coverage, the Omicron variant and its descendants accounted for the largest proportion of detected cases (59.3%, Figure 3). Delta variants formed the second-largest group (22.8%), represented by three primary clades (21A, 21I, and 21J) with distinct distributions. The Alpha/20I variant followed, making up 15.6% of cases. Other variants circulated at lower frequencies, including Beta/20H (0.2%), Gamma/20J (0.5%), and previously classified VOI and VOC (0.4%), which include Eta/21D, Iota/21F, Lambda/21G, and Mu/21H. Finally, 1.2% of cases were grouped into “other clades”, encompassing 20E/B.1.177, an early prevalent lineage in Spain that later spread across Europe [6], as well as other early SARS-CoV-2 lineages (20A/B.1, 20C/B.1.575, among others). This distribution reflects the shifting dynamics of SARS-CoV-2 variants, with Omicron emerging as the dominant variant, likely driven by its substantial immune escape capabilities, which enabled widespread infections even in highly immunized populations [48].
When compared with data from the rest of Spain (Figure S1; see Section 2.6 for details), characteristic profiles emerge in the distribution of SARS-CoV-2 clades, likely reflecting distinct founder effects, introduction timings, and heterogeneous mobility patterns. A particularly illustrative example is the detection of the Alpha variant (20I/B.1.1.7) near the Gibraltar border in December 2020, during the early stages of the Andalusian genomic surveillance circuit. This variant was first identified in border towns such as La Línea de la Concepción and Algeciras, key entry points due to their proximity to Gibraltar, which has direct travel links with the United Kingdom and high volumes of daily cross-border commuting. Its early introduction, combined with strict inter-regional mobility restrictions in early 2021, contributed to its accelerated local spread and higher prevalence in Andalusia (15.6% vs. 11.1% of the total dataset).
Similarly, clade 20E (B.1.177), initially detected among agricultural workers in Aragón and Catalonia [6], disseminated along regional corridors but followed a different trajectory in Andalusia, likely due to reduced inter-regional mobility (59.31% vs. 66.95% of the “Other clades” subset). In contrast, the Beta variant (20H), probably introduced via major international airports in Madrid and Barcelona, exhibited limited circulation in Andalusia (0.2% vs. 0.7% of the total dataset). Differences were also observed among Delta sub-clades (21J, 21I, 21A), likely shaped by separate introduction events and localized superspreading, further emphasizing the role of founder effects. In particular, sub-lineage 21I was overrepresented in Andalusia when compared to the rest of Spain (13.41% vs. 5.37% within the Delta subset). These observations underscore the importance of regionally coordinated genomic surveillance systems—such as the Andalusian circuit—in capturing localized patterns of viral evolution and transmission. Such granularity supports more targeted public health interventions than national-level monitoring alone might enable (see Section 3.3).
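The comparisons above are relative frequencies computed either over the total dataset or within a subset of clades (for example, 21I within the Delta subset). A minimal sketch of that calculation follows; the function name `relative_frequencies` and the toy clade list are hypothetical and do not reproduce the study data.

```python
from collections import Counter


def relative_frequencies(clades, subset=None):
    """Return the percentage of each clade, optionally within a subset.

    `clades` is one clade label per genome; `subset` restricts the
    denominator to the given clades (e.g. the three Delta clades).
    """
    if subset is not None:
        clades = [c for c in clades if c in subset]
    counts = Counter(clades)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {clade: 100.0 * n / total for clade, n in counts.items()}


if __name__ == "__main__":
    # Invented toy data, for demonstration only.
    region = ["21J"] * 6 + ["21I"] * 2 + ["21A"] * 2 + ["20I"] * 10
    delta = {"21A", "21I", "21J"}
    print(relative_frequencies(region)["20I"])         # share of the total dataset
    print(relative_frequencies(region, delta)["21I"])  # share within the Delta subset
```

Running the same function on the regional and national clade lists gives directly comparable percentages, provided both were classified with the same tool versions.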
Figure 4 illustrates the evolution of SARS-CoV-2 variants in Andalusia from 2021 to 2025, showing distinct patterns of clade dominance and coexistence. Early 2021 was characterized by the coexistence of several clades, including 19B, 20A, 20E, 20I (Alpha), and Delta (21J, 21A, and 21I), among other less frequent clades. By mid-2021, Delta (21J) became the dominant clade, maintaining its prevalence into late 2021. This prolonged dominance underscored its high transmissibility and global impact during that phase of the pandemic.
The transition from Delta to Omicron began in late 2021, with Omicron rapidly replacing Delta by early 2022. Among Omicron sublineages, 21K (BA.1), 21L (BA.2), and 22B (BA.5) emerged as the most prevalent, driven by key mutations that enhanced immune evasion. For instance, BA.1 contained key mutations in the spike protein such as S371L, S373P, and S375F, which reduced antibody neutralization, affecting the efficacy of monoclonal antibodies and immune responses from prior infections or vaccinations [48]. BA.2, while sharing many mutations with BA.1, contained the unique S371F substitution in the spike protein, further improving immune escape [48]. Meanwhile, BA.5’s (22B) dominance was primarily attributed to spike protein mutations such as L452R and F486V, which significantly improved immune evasion [49]. BA.5 remained dominant until November 2022, after which its descendant BQ.1 (22E) emerged as the dominant variant, maintaining dominance until March 2023.
In 2023, the evolutionary landscape of SARS-CoV-2 shifted with the emergence and dominance of Omicron recombinant clades. By late 2023, the landscape exhibited the highest clade diversity, characterized by the coexistence of recombinant clades such as 23A (XBB.1.5), 23D (XBB.1.9), and 23F (EG.5.1), among others, reflecting a complex viral ecosystem. These recombinant clades, arising from BA.2-derived variants, represented a significant proportion of the Omicron landscape. Notably, 23A (XBB.1.5), also known as “Kraken”, became the most prevalent recombinant clade owing to its superior immune evasion, which is largely attributed to spike protein mutations such as S486P [50]. Other recombinant clades, including 23D (XBB.1.9) and 23F (EG.5.1), further highlighted the diversity and adaptive capacity of the virus. During this period, no single clade maintained clear dominance, indicating a transitional phase driven by the emergence and competition of multiple variants.
By early 2024, the non-recombinant clade 24A (JN.1) emerged as the dominant clade, reaching high prevalence. However, as the year progressed, 24E (KP.3.1.1) gained dominance, demonstrating increased transmissibility and immune escape potential. Along with 24F (XEC), it carries spike protein mutations such as F456L, which enhance transmissibility, receptor binding affinity, and immune evasion, highlighting its potential to influence future transmission dynamics [51].
The overall trends reveal alternating periods of clade dominance and high diversity. From 2021 to late 2022 and again from late 2023 to late 2024, specific clades predominated. Conversely, 2023 stood out for the coexistence of recombinant clades. These patterns reflect the ongoing interplay between transmissibility, immune evasion, and recombination, reinforcing the need for continuous genomic surveillance.
While Figure 4 provides insights into the temporal evolution and dominance of SARS-CoV-2 clades in Andalusia, Figure 5 complements this by presenting a proportional overview of Omicron clades based on sample representation in the surveillance circuit, rather than temporal trends. Additionally, it distinguishes between recombinant and non-recombinant clades. As a result, the most abundant clades, 21L/BA.2, 21K/BA.1, and 22B/BA.5, together account for a significant portion of the dataset. This high representation likely reflects their widespread circulation and epidemiological dominance during the early phases of the Omicron wave, coinciding with intensified sequencing efforts during their emergence (Figure 2). Recombinant clades such as 23A/XBB.1.5, 23D/XBB.1.9, and 23F/EG.5.1 are also well-represented, highlighting their growing significance in the later stages of the pandemic. Meanwhile, clades like 24E/KP.3.1.1 and 24F/XEC, though less prominent, reflect the ongoing diversification of the virus and its ability to adapt to selective pressures.

3.2. Nextstrain Local Server

The Nextstrain local server for the circuit, available at [52], allows epidemiologists and regional public health institutions to conduct near-real-time genomic surveillance of SARS-CoV-2 evolution. Figure 6 shows the map of Andalusia generated by the Auspice (v.2.62.0) software for the Nextstrain local server of the circuit, using a representative set of approximately 9000 genomes (see Section 2.5). The entire region is well represented, with a higher concentration of samples in more densely populated areas.
The corresponding phylogeny is also available (Figure 7), providing information on individual samples, including details such as the primary care center. These data support epidemiologists in enhancing outbreak response, strengthening surveillance, and improving public health decision making.

3.3. Impact of Genomic Surveillance on Public Health Interventions in Andalusia

Genomic surveillance has been pivotal in informing public health strategies worldwide. The World Health Organization underscores the importance of integrating genomic sequencing into public health systems to enable timely detection of emerging threats and support effective responses [53,54]. For example, in Taiwan, genomic surveillance has played a critical role in monitoring SARS-CoV-2 variants, shaping public health policies, and guiding decisions on travel restrictions and quarantine protocols, highlighting the value of incorporating real-time genomic data into health policy decision-making [55].
In Andalusia, the regional genomic surveillance circuit played a decisive role in enabling rapid public health responses. One key example was the early detection of the Alpha variant (20I/B.1.1.7) in the area near Gibraltar in December 2020 during the early stages of the surveillance circuit (see Section 3.1). The timely identification of this variant prompted the regional health authorities to implement targeted mass screening campaigns and reinforce control measures on mobility between Gibraltar and neighboring municipalities [56].
Another significant case was the rapid response to the emergence of the Omicron variant (21K/BA.1), which was detected by the circuit in Andalusia in late 2021. In response, regional health authorities reinforced public health measures, including stricter mask mandates, updated isolation protocols, intensified vaccination campaigns, and restrictions on hospitality venues and public gatherings. These decisions, informed by real-time genomic data, exemplify the rapid translation of surveillance findings into effective, localized interventions.

3.4. Performance Evaluation Through Quality Control Assessments (QCAs)

The QCAs of SARS-CoV-2 sequencing and analysis demonstrated the high reliability of the Andalusian circuit’s variant identification procedures under standardized benchmarking conditions. Clade assignment accuracy reached 100% in 2021 and 88.9% in 2024, resulting in an overall accuracy of 94.4% (34 out of 36). Lineage assignment was also 100% accurate in 2021 and 83.3% in 2024, yielding an overall accuracy of 91.7% (33 out of 36). Detailed results are presented in Table 1.
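The denominator of 36 and the reported percentages can be reproduced with a short worked calculation. It assumes that every participating facility sequenced every sample of its round (9 samples × 2 facilities in 2021 and 6 samples × 3 facilities in 2024, as described in Section 2.6), which is inferred from the totals rather than stated explicitly. The per-round counts of correct assignments are back-calculated from the reported percentages and are likewise an assumption; the function names are hypothetical.

```python
# (samples, facilities) per QCA round, as described in Section 2.6.
ROUNDS = {2021: (9, 2), 2024: (6, 3)}

# ASSUMPTION: correct assignments back-calculated from the reported
# percentages (e.g. 16/18 = 88.9% and 15/18 = 83.3% in 2024).
CORRECT = {
    "clade": {2021: 18, 2024: 16},
    "lineage": {2021: 18, 2024: 15},
}


def assignments_per_round(rounds):
    """Number of sample-by-facility assignments evaluated in each round."""
    return {year: samples * facilities for year, (samples, facilities) in rounds.items()}


def accuracy_percent(correct, total):
    """Accuracy as a percentage rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


def summarize(level):
    """Return (per-round accuracy, overall accuracy) for 'clade' or 'lineage'."""
    totals = assignments_per_round(ROUNDS)
    per_year = {year: accuracy_percent(CORRECT[level][year], totals[year]) for year in totals}
    overall = accuracy_percent(sum(CORRECT[level].values()), sum(totals.values()))
    return per_year, overall


if __name__ == "__main__":
    for level in ("clade", "lineage"):
        print(level, summarize(level))
```

Under these assumptions each round contributes 18 assignments, so the overall figures are simple pooled proportions: 34 of 36 for clades and 33 of 36 for lineages.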

3.5. Use Cases

During the 2021–2025 period, the Andalusian surveillance circuit has supported several retrospective studies by facilitating the systematic storage of SARS-CoV-2 genomes within the BPS database [17]. This integration enables the direct linkage of viral genomic data with the clinical record of infected patients, providing an unprecedented environment for large-scale real-world evidence (RWE) studies. Through this unique data ecosystem, researchers can explore the interplay between viral evolution, patient characteristics, disease progression, treatment responses, and long-term health outcomes. Such a robust framework fosters novel epidemiological insights and supports precision medicine approaches, strengthening public health decision making in the face of emerging infectious threats. Indeed, the availability of SARS-CoV-2 genomes alongside the clinical data of the infected patients made possible a study demonstrating that variants can differ in mortality (regardless of patient status, age, sex, comorbidities, and any other characteristic) and, in particular, that the Alpha variant was deadlier than the previous Wuhan variant [57].
Additionally, a series of studies evaluated the protective effect of some drugs on COVID-19 prognosis and patient mortality. Notably, vitamin D [58] and the antipsychotic aripiprazole [59] showed significant protective effects. Furthermore, a broader study identified 21 drugs that were associated with reduced COVID-19 mortality [60].
The evolution of the virus and the constant replacement of variants has also been studied, with a particular focus on the role of recombination in viral adaptation. Studies have documented the occurrence of viral co-infections, which provided the conditions for the emergence of novel recombinant variants. These findings underscore the importance of recombination as a key mechanism driving SARS-CoV-2 diversity and evolution [61].
Moreover, the circuit has actively contributed to technical advancements in genomic surveillance. Efforts have been directed toward improving experimental procedures for virus detection, enhancing the sensitivity and specificity of diagnostic testing [62]. In addition, significant progress has been made in genomic data management, including the development of methodologies for reconstructing complete viral genomes from partial or low-quality sequencing data [63]. These innovations have improved the accuracy and reliability of genomic analyses, ensuring high-quality data for epidemiological and public health decision making.

4. Discussion

The implementation of a standardized genomic surveillance circuit for SARS-CoV-2 in Andalusia has provided an unprecedented opportunity to monitor the evolution of the virus and inform public health decisions in near real time. The integration of a common sequencing protocol and a unified bioinformatics analysis procedure across the whole region has ensured consistency in data quality and interpretation, providing uniformity in sequence processing, clade/lineage assignment, and data quality control. This approach has reduced variability and enhanced comparability across participating centers, and its reliability has been further confirmed through external benchmarking, with clade and lineage assignment accuracies exceeding 90% under standardized conditions. Covering a population of 8.5 million people, comparable to that of Austria or Switzerland, this coordinated effort represents one of the largest regional efforts in genomic surveillance within a decentralized healthcare system. By unifying sequencing workflows across hospitals and integrating genomic data into a centralized platform, the circuit has facilitated rapid variant detection, epidemiological tracking, and clinical outcome assessment.
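As an illustration of what an assignment accuracy of this kind measures, the following minimal Python sketch computes the fraction of samples whose lineage call agrees with a reference call. The function name, sample identifiers, and lineage labels are hypothetical and are not taken from the benchmarking exercise itself.

```python
def assignment_accuracy(assigned, reference):
    """Fraction of shared samples whose assigned label equals the reference label."""
    shared = set(assigned) & set(reference)
    if not shared:
        raise ValueError("no samples in common between the two call sets")
    hits = sum(1 for sample in shared if assigned[sample] == reference[sample])
    return hits / len(shared)


# Hypothetical lineage calls, for illustration only.
circuit_calls = {"s1": "BA.1", "s2": "BA.2", "s3": "BA.5", "s4": "XBB.1.5"}
benchmark_calls = {"s1": "BA.1", "s2": "BA.2", "s3": "BA.5", "s4": "XBB.1.16"}

accuracy = assignment_accuracy(circuit_calls, benchmark_calls)
print(f"Lineage assignment accuracy: {accuracy:.2%}")
```

Only samples present in both call sets are compared, so a missing call lowers the number of evaluated samples rather than counting as an error; a stricter benchmark could choose to penalize missing calls instead.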
To ensure both representativeness and early detection capability, the circuit applied two complementary sampling strategies—random and targeted sampling—following national guidelines. The use of random sampling enabled unbiased monitoring of variant prevalence across the population, while targeted sampling supported focused investigation of specific clinical or epidemiological scenarios. This combined approach strengthened the reliability and responsiveness of the surveillance system.
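The reason the two sampling strategies are kept apart can be sketched in a few lines of Python: prevalence estimates should draw on randomly sampled genomes only, since targeted samples are selected for a specific reason and would bias the proportions. The record fields (`date`, `clade`, `sampling`) and the example values are assumptions made for this sketch, not the circuit's actual data model.

```python
from collections import Counter, defaultdict


def monthly_prevalence(records):
    """Per-month clade proportions, using randomly sampled genomes only."""
    counts = defaultdict(Counter)
    for rec in records:
        if rec["sampling"] != "random":
            continue  # targeted samples are excluded from prevalence estimates
        month = rec["date"][:7]  # "YYYY-MM" prefix of an ISO date string
        counts[month][rec["clade"]] += 1
    return {
        month: {clade: n / sum(c.values()) for clade, n in c.items()}
        for month, c in counts.items()
    }


# Hypothetical records, for illustration only.
records = [
    {"date": "2022-01-03", "clade": "21K", "sampling": "random"},
    {"date": "2022-01-10", "clade": "21K", "sampling": "random"},
    {"date": "2022-01-12", "clade": "21J", "sampling": "random"},
    {"date": "2022-01-15", "clade": "21L", "sampling": "targeted"},
    {"date": "2022-02-02", "clade": "21L", "sampling": "random"},
]

prevalence = monthly_prevalence(records)
for month in sorted(prevalence):
    print(month, prevalence[month])
```

In this toy input, the targeted 21L genome from January does not enter the January proportions, although it would still be available for the focused investigation that motivated its sequencing.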
Another key strength of the Andalusian circuit is its ability to provide near real-time genomic monitoring, which has enabled health authorities to quickly adapt containment measures in response to emerging threats, such as the Alpha and Omicron variants. The circuit has identified and tracked the introduction and expansion of major SARS-CoV-2 variants in Andalusia, reflecting both global and region-specific transmission dynamics. The transition from early variants like Alpha and Delta to the successive waves of Omicron subvariants (21K/BA.1, 21L/BA.2, 22B/BA.5) and recombinant forms (23A/XBB.1.5, 23B/XBB.1.16, 23D/XBB.1.9), including novel recombinants [61], demonstrates the rapid adaptability of SARS-CoV-2 to immune pressure and transmissibility advantages. This underscores the importance of sustained genomic surveillance in identifying new evolutionary pathways and reinforces the need for continuous monitoring.
Several SARS-CoV-2 genomic surveillance initiatives across Europe have adopted diverse models in terms of institutional organization, scale, and integration with healthcare systems. The United Kingdom’s COVID-19 Genomics UK (COG-UK) consortium is one of the most expansive efforts, sequencing over 2 million SARS-CoV-2 genomes through a collaboration involving the health agencies of the four countries of the UK, the Wellcome Sanger Institute, and more than 16 academic institutions [64]. In Denmark, over 800,000 genomes have been sequenced via a distributed network of hospitals and academic centers [65]. France’s EMERGEN consortium, a nationally coordinated network of more than 50 laboratories, has contributed over 650,000 sequences and is actively expanding its scope to include other respiratory pathogens such as influenza and RSV [66]. In Switzerland, more than 143,000 genomes have been sequenced through 15 diagnostic laboratories and three high-throughput platforms [67]. Portugal’s national program, led by the Portuguese National Institute of Health and supported by over 60 laboratories and several research institutions, has generated over 50,500 genomes [68]. Finally, Austria’s Datenplattform COVID-19, while extensive, focused only on partial S-gene sequencing for more than 220,000 samples, limiting its capacity for full genomic resolution [69].
In comparison, the Andalusian genomic surveillance circuit stands out as a regionally coordinated initiative within a decentralized healthcare system. It integrates 27 hospitals and three sequencing centers with a centralized bioinformatics platform and connects with the Population Health Base (BPS), a comprehensive clinical data repository. In addition to variant identification, the genomic surveillance circuit has provided crucial insights into epidemiological trends. By linking genomic data with clinical records from the BPS, the circuit has enabled studies assessing the impact of specific mutations on disease severity, patient outcomes, and treatment effectiveness.
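The kind of linkage described here can be pictured with a short Python sketch that joins genome records to clinical records on a shared patient identifier and tabulates crude outcomes per lineage. All field names (`patient_id`, `lineage`, `died`) and values are hypothetical; the BPS schema is not described in this article, and the published analyses [57] adjusted for patient covariates, which this unadjusted tabulation does not attempt.

```python
from collections import defaultdict


def link_and_tabulate(genomes, clinical):
    """Join genomes to clinical outcomes by patient ID and count deaths per lineage."""
    outcome_by_patient = {row["patient_id"]: row["died"] for row in clinical}
    table = defaultdict(lambda: {"patients": 0, "deaths": 0})
    for genome in genomes:
        died = outcome_by_patient.get(genome["patient_id"])
        if died is None:
            continue  # genome without a matching clinical record is skipped
        table[genome["lineage"]]["patients"] += 1
        table[genome["lineage"]]["deaths"] += int(died)
    return dict(table)


# Hypothetical records, for illustration only.
genomes = [
    {"patient_id": "p1", "lineage": "B.1.1.7"},
    {"patient_id": "p2", "lineage": "B.1.1.7"},
    {"patient_id": "p3", "lineage": "BA.1"},
    {"patient_id": "p4", "lineage": "BA.1"},  # no clinical record below
]
clinical = [
    {"patient_id": "p1", "died": True},
    {"patient_id": "p2", "died": False},
    {"patient_id": "p3", "died": False},
]

summary = link_and_tabulate(genomes, clinical)
for lineage, row in summary.items():
    print(lineage, row["deaths"], "of", row["patients"])
```

A crude table like this is only a starting point: any comparison of mortality between variants would need to account for age, sex, comorbidities, and other patient characteristics, as the studies cited above did.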
Although smaller in sequencing volume (~42,500 genomes) compared to some national programs, analyses against broader datasets have shown how a regional system like the Andalusian circuit can detect localized variant patterns and introduction events that may be overlooked at larger surveillance scales. The circuit has also set the stage for a broader whole-genome sequencing surveillance initiative [70]. Building upon the infrastructure and expertise developed during the COVID-19 pandemic, this circuit has expanded its scope to other emerging and endemic viral threats, including West Nile virus [71], monkeypox virus [72], influenza virus and respiratory syncytial virus. This expansion reinforces the long-term value of investing in genomic surveillance as a fundamental tool for epidemic preparedness and response. The ability to quickly adapt sequencing pipelines to new pathogens ensures that Andalusia remains at the forefront of public health genomics, providing a model that can be replicated in other regions.
Despite its achievements, the genomic surveillance circuit also presents challenges and areas for improvement. One limitation is the logistical complexity of maintaining a high-throughput sequencing infrastructure at the regional level. Coordinating sample collection, sequencing workflows, and data integration in a decentralized healthcare system requires continuous optimization of resources and standardization efforts. Additionally, while the reliance on centralized sequencing hubs has facilitated high-quality data generation, it may also introduce delays during periods of very high demand. Future actions for improving the efficiency of the circuit include further automation of laboratory processes and exploring decentralized sequencing capabilities, such as portable platforms (e.g., Oxford Nanopore), to reduce transport logistics and turnaround time. In addition, the imminent use of the SIEGA [73] application in the circuit will provide automatic data quality control and processing, and will allow data to be entered into the central repository directly from the sequencing facilities.
Ensuring the sustainability of genomic surveillance beyond pandemic crises will require long-term investment, interdisciplinary collaboration, and integration with other epidemiological monitoring systems to maintain a robust and adaptable genomic surveillance network [74].

5. Conclusions

The establishment of a regional genomic surveillance network for SARS-CoV-2 in Andalusia has demonstrated the power of whole-genome sequencing (WGS) in tracking viral evolution, guiding public health interventions, and integrating clinical and epidemiological data. Expanding this initiative to other pathogens strengthens infectious disease monitoring and pandemic preparedness.
Centralized WGS-based surveillance circuits, such as this one, provide an efficient approach for real-time outbreak detection, transmission tracking, and infection control. Their integration into public health systems enhances epidemiological investigations and response strategies.
Moving forward, maintaining a continuous genomic monitoring infrastructure will be critical for early threat detection and effective outbreak response. By leveraging centralized WGS surveillance, Andalusia demonstrates the power of regional networks to contribute to global infectious disease monitoring and response within the One Health framework.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/microorganisms13040912/s1, Table S1: List of hospitals that participate in the SARS-CoV-2 genomic surveillance circuit of Andalusia. Figure S1: Distribution of main SARS-CoV-2 variants in Spain, excluding data from Andalusia (2021–2025).

Author Contributions

Conceptualization, J.D., J.A.L. and F.G.; methodology, J.P.-F., C.S.C.-S., M.L., A.A., C.L., F.M.O., P.C.-M., L.M.-D., A.d.S., A.F. and S.S.-G.; software, J.P.-F., C.S.C.-S., F.M.O. and M.L.; formal analysis, J.P.-F., C.S.C.-S., C.L. and F.M.O.; resources, J.D., J.A.L., F.G., J.M.N.-M. and N.L.; The Andalusian COVID-19 Sequencing Initiative; data curation, J.P.-F.; writing—original draft preparation, J.P.-F. and J.D.; writing—review and editing, J.P.-F., J.D., J.A.L., F.G., C.L., S.S.-G., C.S.C.-S. and J.M.N.-M.; supervision, J.D.; funding acquisition, J.D., J.A.L., F.G. and J.M.N.-M. All authors have read and agreed to the published version of the manuscript.

Funding

This study has been funded by the European Union’s EU4Health programme (EU4HEALTH-101113109), and the HERA incubator plan (ECDC/HERA/2021/024 ECD.12241). J.D. has been supported by grant PT17/0009/0006 from the ISCIII, co-funded by the European Regional Development Fund (ERDF), as well as the European Commission’s H2020 ELIXIR-EXCELERATE (Grant Agreement No. 676559) and by the Consejería de Salud y Familias-Junta de Andalucía (COVID-0012-2020).

Institutional Review Board Statement

Ethical review and approval were waived for this study as the data were obtained in routine surveillance.

Informed Consent Statement

Not applicable.

Data Availability Statement

The SARS-CoV-2 whole-genome sequences described in this study are available in the European Nucleotide Archive (ENA) under the identifier PRJEB44396 and in GISAID.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
VOI: Variants of interest
VOC: Variants of concern
VUM: Variants under monitoring
EHR: Electronic health record
BPS: Base Poblacional de Salud
GPU: Graphics processing unit
RWE: Real-world evidence
WGS: Whole-genome sequencing

References

  1. Shu, Y.; McCauley, J. GISAID: Global initiative on sharing all influenza data–from vision to reality. Eurosurveillance 2017, 22, 30494.
  2. Faria, N.R.; Mellan, T.A.; Whittaker, C.; Claro, I.M.; Candido, D.D.S.; Mishra, S.; Crispim, M.A.; Sales, F.C.; Hawryluk, I.; McCrone, J.T. Genomics and epidemiology of the P.1 SARS-CoV-2 lineage in Manaus, Brazil. Science 2021, 372, 815–821.
  3. Tang, J.W.; Tambyah, P.A.; Hui, D.S. Emergence of a new SARS-CoV-2 variant in the UK. J. Infect. 2021, 82, e27–e28.
  4. Tegally, H.; Wilkinson, E.; Giovanetti, M.; Iranzadeh, A.; Fonseca, V.; Giandhari, J.; Doolabh, D.; Pillay, S.; San, E.J.; Msomi, N. Detection of a SARS-CoV-2 variant of concern in South Africa. Nature 2021, 592, 438–443.
  5. Volz, E.; Mishra, S.; Chand, M.; Barrett, J.C.; Johnson, R.; Geidelberg, L.; Hinsley, W.R.; Laydon, D.J.; Dabrera, G.; O’Toole, Á. Assessing transmissibility of SARS-CoV-2 lineage B.1.1.7 in England. Nature 2021, 593, 266–269.
  6. Hodcroft, E.B.; Zuber, M.; Nadeau, S.; Vaughan, T.G.; Crawford, K.H.D.; Althaus, C.L.; Reichmuth, M.L.; Bowen, J.E.; Walls, A.C.; Corti, D.; et al. Spread of a SARS-CoV-2 variant through Europe in the summer of 2020. Nature 2021, 595, 707–712.
  7. Araf, Y.; Akter, F.; Tang, Y.D.; Fatemi, R.; Parvez, M.S.A.; Zheng, C.; Hossain, M.G. Omicron variant of SARS-CoV-2: Genomics, transmissibility, and responses to current COVID-19 vaccines. J. Med. Virol. 2022, 94, 1825–1832.
  8. Chen, R.E.; Zhang, X.; Case, J.B.; Winkler, E.S.; Liu, Y.; VanBlargan, L.A.; Liu, J.; Errico, J.M.; Xie, X.; Suryadevara, N. Resistance of SARS-CoV-2 variants to neutralization by monoclonal and serum-derived polyclonal antibodies. Nat. Med. 2021, 27, 717–726.
  9. Beyer, D.K.; Forero, A. Mechanisms of Antiviral Immune Evasion of SARS-CoV-2. J. Mol. Biol. 2022, 434, 167265.
  10. Cyranoski, D. Alarming COVID variants show vital role of genomic surveillance. Nature 2021, 589, 337–338.
  11. WHO. SARS-CoV-2 Variants of Concern and Variants of Interest. Available online: https://www.who.int/en/activities/tracking-SARS-CoV-2-variants/ (accessed on 12 January 2025).
  12. Porter, A.F.; Sherry, N.; Andersson, P.; Johnson, S.A.; Duchene, S.; Howden, B.P. New rules for genomics-informed COVID-19 responses-Lessons learned from the first waves of the Omicron variant in Australia. PLoS Genet. 2022, 18, e1010415.
  13. Sequencing of the SARS-CoV-2 Virus Genome for the Monitoring and Management of the COVID-19 Epidemic in Andalusia and the Rapid Generation of Prognostic and Response to Treatment Biomarkers. Available online: https://www.clinbioinfosspa.es/projects/covseq/indexEng.html (accessed on 12 January 2025).
  14. SARS-CoV-2 Whole Genome Sequencing Circuit in Andalusia. Available online: https://www.clinbioinfosspa.es/COVID_circuit/ (accessed on 12 January 2025).
  15. Vázquez-Morón, S.; Iglesias-Caballero, M.; Lepe, J.A.; Garcia, F.; Melón, S.; Marimon, J.M.; de Viedma, D.G.; Folgueira, M.D.; Galán, J.C.; López-Causapé, C.; et al. Enhancing SARS-CoV-2 Surveillance through Regular Genomic Sequencing in Spain: The RELECOV Network. Int. J. Mol. Sci. 2023, 24, 8573.
  16. Dopazo, J.; Maya-Miles, D.; García, F.; Lorusso, N.; Calleja, M.Á.; Pareja, M.J.; López-Miranda, J.; Rodríguez-Baño, J.; Padillo, J.; Túnez, I. Implementing Personalized Medicine in COVID-19 in Andalusia: An Opportunity to Transform the Healthcare System. J. Pers. Med. 2021, 11, 475.
  17. Muñoyerro-Muñiz, D.; Goicoechea-Salazar, J.; García-León, F.; Laguna-Tellez, A.; Larrocha-Mata, D.; Cardero-Rivas, M. Health record linkage: Andalusian health population database. Gac. Sanit. 2019, 34, 105–113.
  18. ISCIII. Integration of Genome Sequencing in the SARS-CoV-2 Surveillance. Available online: https://www.mscbs.gob.es/profesionales/saludPublica/ccayes/alertasActual/nCov/documentos/Integracion_de_la_secuenciacion_genomica-en_la_vigilancia_del_SARS-CoV-2.pdf (accessed on 13 February 2025).
  19. ARTIC Network. Available online: https://community.artic.network/ (accessed on 13 February 2025).
  20. ARTIC Network. V3 Primer Availability. Available online: https://community.artic.network/t/v3-primer-availability/123 (accessed on 13 February 2025).
  21. ARTIC Network. SARS-CoV-2 Version 4 Scheme Release. Available online: https://community.artic.network/t/sars-cov-2-version-4-scheme-release/312 (accessed on 13 February 2025).
  22. ARTIC Network. SARS-CoV-2 V4.1 Update for Omicron Variant. Available online: https://community.artic.network/t/sars-cov-2-v4-1-update-for-omicron-variant/342 (accessed on 13 February 2025).
  23. ARTIC Network. SARS-CoV-2 Version 5.3.2 Scheme Release. Available online: https://community.artic.network/t/sars-cov-2-version-5-3-2-scheme-release/462 (accessed on 13 February 2025).
  24. Freed, N.E.; Vlková, M.; Faisal, M.B.; Silander, O.K. Rapid and inexpensive whole-genome sequencing of SARS-CoV-2 using 1200 bp tiled amplicons and Oxford Nanopore Rapid Barcoding. Biol. Methods Protoc. 2020, 5, bpaa014.
  25. Bull, R.A.; Adikari, T.N.; Ferguson, J.M.; Hammond, J.M.; Stevanovski, I.; Beukers, A.G.; Naing, Z.; Yeang, M.; Verich, A.; Gamaarachchi, H.; et al. Analytical validity of nanopore sequencing for rapid SARS-CoV-2 genome analysis. Nat. Commun. 2020, 11, 6272.
  26. Patel, H.; Monzón, S.; Varona, S.; Espinosa-Carrasco, J.; Garcia, M.U.; Nf-Core Bot; Ewels, P. nf-core/viralrecon: Nf-core/viralrecon v2.6.0—Rhodium Raccoon. Zenodo 2023, 7764938. Available online: https://zenodo.org/records/7764938 (accessed on 13 February 2025).
  27. Zhou, P.; Yang, X.-L.; Wang, X.-G.; Hu, B.; Zhang, L.; Zhang, W.; Si, H.-R.; Li, B.; Huang, H.-L.; Chen, H.-D.; et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature 2020, 579, 270–273.
  28. Langmead, B.; Salzberg, S.L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 2012, 9, 357–359.
  29. Grubaugh, N.D.; Gangavarapu, K.; Quick, J.; Matteson, N.L.; Goes De Jesus, J.; Main, B.J.; Tan, A.L.; Paul, L.M.; Brackney, D.E.; Grewal, S.; et al. An amplicon-based sequencing framework for accurately measuring intrahost virus diversity using PrimalSeq and iVar. Genome Biol. 2019, 20, 8.
  30. Broad Institute. Picard Toolkit; Broad Institute, GitHub Repository. 2018. Available online: http://broadinstitute.github.io/picard/ (accessed on 13 February 2025).
  31. Danecek, P.; Bonfield, J.K.; Liddle, J.; Marshall, J.; Ohan, V.; Pollard, M.O.; Whitwham, A.; Keane, T.; McCarthy, S.A.; Davies, R.M.; et al. Twelve years of SAMtools and BCFtools. GigaScience 2021, 10, giab008.
  32. O’Toole, Á.; Scher, E.; Underwood, A.; Jackson, B.; Hill, V.; McCrone, J.T.; Colquhoun, R.; Ruis, C.; Abu-Dahab, K.; Taylor, B.; et al. Assignment of epidemiological lineages in an emerging pandemic using the pangolin tool. Virus Evol. 2021, 7, veab064.
  33. Aksamentov, I.; Roemer, C.; Hodcroft, E.B.; Neher, R.A. Nextclade: Clade assignment, mutation calling and quality control for viral genomes. J. Open Source Softw. 2021, 6, 3773.
  34. Wick, R.R.; Judd, L.M.; Holt, K.E. Performance of Neural Network Basecalling Tools for Oxford Nanopore Sequencing. Genome Biol. 2019, 20, 129.
  35. ARTIC Network. The ARTIC Field Bioinformatics Pipeline. Available online: https://github.com/artic-network/fieldbioinformatics (accessed on 21 February 2025).
  36. Nanopolish. A Software Package for Signal-Level Analysis of Oxford Nanopore Sequencing Data. Available online: https://nanopolish.readthedocs.io/en/latest/index.html (accessed on 5 April 2025).
  37. Huddleston, J.; Hadfield, J.; Sibley, T.R.; Lee, J.; Fay, K.; Ilcisin, M.; Harkins, E.; Bedford, T.; Neher, R.A.; Hodcroft, E.B. Augur: A bioinformatics toolkit for phylogenetic analyses of human pathogens. J. Open Source Softw. 2021, 6, 2906.
  38. Minh, B.Q.; Schmidt, H.A.; Chernomor, O.; Schrempf, D.; Woodhams, M.D.; Von Haeseler, A.; Lanfear, R. IQ-TREE 2: New models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 2020, 37, 1530–1534.
  39. Katoh, K.; Misawa, K.; Kuma, K.; Miyata, T. MAFFT: A novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002, 30, 3059–3066.
  40. Katoh, K.; Standley, D.M. A simple method to control over-alignment in the MAFFT multiple sequence alignment program. Bioinformatics 2016, 32, 1933–1942.
  41. Tavaré, S. Some Probabilistic and Statistical Problems in the Analysis of DNA Sequences; University of Utah: Salt Lake City, UT, USA, 1986.
  42. To, T.-H.; Jung, M.; Lycett, S.; Gascuel, O. Fast Dating Using Least-Squares Criteria and Algorithms. Syst. Biol. 2016, 65, 82–97.
  43. Sagulenko, P.; Puller, V.; Neher, R.A. TreeTime: Maximum-likelihood phylodynamic analysis. Virus Evol. 2018, 4, vex042.
  44. Hoang, D.T.; Chernomor, O.; von Haeseler, A.; Minh, B.Q.; Vinh, L.S. UFBoot2: Improving the Ultrafast Bootstrap Approximation. Mol. Biol. Evol. 2018, 35, 518–522.
  45. Hadfield, J.; Megill, C.; Bell, S.M.; Huddleston, J.; Potter, B.; Callender, C.; Sagulenko, P.; Bedford, T.; Neher, R.A. Nextstrain: Real-time tracking of pathogen evolution. Bioinformatics 2018, 34, 4121–4123.
  46. Khare, S.; Gurry, C.; Freitas, L.; Schultz, M.B.; Bach, G.; Diallo, A.; Akite, N.; Ho, J.; Lee, R.T.C.; Yeo, W.; et al. GISAID’s Role in Pandemic Response. China CDC Wkly. 2021, 3, 1049–1051.
  47. EPI_SET_250402xc. High-Quality SARS-CoV-2 Genomes Deposited in GISAID from Spain, Excluding Andalusia Region. Available online: https://doi.org/10.55876/gis8.250402xc (accessed on 5 April 2025).
  48. Willett, B.J.; Grove, J.; MacLean, O.A.; Wilkie, C.; De Lorenzo, G.; Furnon, W.; Cantoni, D.; Scott, S.; Logan, N.; Ashraf, S.; et al. SARS-CoV-2 Omicron is an immune escape variant with an altered cell entry pathway. Nat. Microbiol. 2022, 7, 1161–1179.
  49. Wang, Q.; Guo, Y.; Iketani, S.; Nair, M.S.; Li, Z.; Mohri, H.; Wang, M.; Yu, J.; Bowen, A.D.; Chang, J.Y.; et al. Antibody evasion by SARS-CoV-2 Omicron subvariants BA.2.12.1, BA.4, and BA.5. Nature 2022, 608, 603–608.
  50. Tamura, T.; Irie, T.; Deguchi, S.; Yajima, H.; Tsuda, M.; Nasser, H.; Mizuma, K.; Plianchaisuk, A.; Suzuki, S.; Uriu, K.; et al. Virological characteristics of the SARS-CoV-2 Omicron XBB.1.5 variant. Nat. Commun. 2024, 15, 1176.
  51. Wang, Q.; Mellis, I.A.; Ho, J.; Bowen, A.; Kowalski-Dobson, T.; Valdez, R.; Katsamba, P.S.; Wu, M.; Lee, C.; Shapiro, L.; et al. Recurrent SARS-CoV-2 spike mutations confer growth advantages to select JN.1 sublineages. Emerg. Microbes Infect. 2024, 13, 2402880.
  52. The SARS-CoV-2 Genomic Surveillance Circuit of Andalusia, Nextstrain Server. Available online: https://nextstrain.clinbioinfosspa.es/SARS-COV-2-2021-2025 (accessed on 14 February 2025).
  53. Genomic Sequencing of SARS-CoV-2: A Guide to Implementation for Maximum Impact on Public Health; WHO: Geneva, Switzerland. 2021, pp. 1–64. Available online: https://www.who.int/publications/i/item/9789240018440 (accessed on 5 April 2025).
  54. Global Genomic Surveillance Strategy for Pathogens with Pandemic and Epidemic Potential, 2022–2032. Available online: https://www.who.int/initiatives/genomic-surveillance-strategy (accessed on 5 April 2025).
  55. Gong, Y.N.; Kuo, N.Y.; Yeh, T.S.; Shih, S.R.; Chen, G.W. Genomic Surveillance of SARS-CoV-2 in Taiwan: A Perspective on Evolutionary Data Interpretation and Sequencing Issues. Biomed. J. 2024. Online ahead of print.
  56. Salud Organiza un Cribado en La Línea y Algeciras y Reclama al Gobierno Mayor Control en el Tránsito con Gibraltar. Available online: https://www.juntadeandalucia.es/presidencia/portavoz/salud/157252/Covid/coronavirus/pandemia/cribado/CampodeGibraltrar/Algeciras/LaLineadelaConcepcion/Brexit/Gibraltar/Cadiz/Andalucia/JuntadeAndalucia/GobiernodeAndalucia (accessed on 5 April 2025).
  57. Loucera, C.; Perez-Florido, J.; Casimiro-Soriguer, C.S.; Ortuño, F.M.; Carmona, R.; Bostelmann, G.; Martínez-González, L.J.; Muñoyerro-Muñiz, D.; Villegas, R.; Rodriguez-Baño, J.; et al. Assessing the impact of SARS-CoV-2 lineages and mutations on patient survival. Viruses 2022, 14, 1893.
  58. Loucera, C.; Peña-Chilet, M.; Esteban-Medina, M.; Muñoyerro-Muñiz, D.; Villegas, R.; Lopez-Miranda, J.; Rodriguez-Baño, J.; Túnez, I.; Bouillon, R.; Dopazo, J. Real world evidence of calcifediol or vitamin D prescription and mortality rate of COVID-19 in a retrospective cohort of hospitalized Andalusian patients. Sci. Rep. 2021, 11, 23380.
  59. Loucera-Muñecas, C.; Canal-Rivero, M.; Ruiz-Veguilla, M.; Carmona, R.; Bostelmann, G.; Garrido-Torres, N.; Dopazo, J.; Crespo-Facorro, B. Aripiprazole as protector against COVID-19 mortality. Sci. Rep. 2024, 14, 12362.
  60. Loucera, C.; Carmona, R.; Esteban-Medina, M.; Bostelmann, G.; Muñoyerro-Muñiz, D.; Villegas, R.; Peña-Chilet, M.; Dopazo, J. Real-world evidence with a retrospective cohort of 15,968 COVID-19 hospitalized patients suggests 21 new effective treatments. Virol. J. 2023, 20, 226.
  61. Perez-Florido, J.; Casimiro-Soriguer, C.S.; Ortuño, F.; Fernandez-Rueda, J.L.; Aguado, A.; Lara, M.; Riazzo, C.; Rodriguez-Iglesias, M.A.; Camacho-Martinez, P.; Merino-Diaz, L.; et al. Detection of high levels of co-infection and the emergence of novel SARS-CoV-2 Delta-Omicron and Omicron-Omicron recombinants in the epidemiological surveillance of Andalusia. Int. J. Mol. Sci. 2023, 24, 2419.
  62. Chaves-Blanco, L.; de Salazar, A.; Fuentes, A.; Viñuela, L.; Perez-Florido, J.; Dopazo, J.; García, F. Evaluation of a combined detection of SARS-CoV-2 and its variants using real-time allele-specific PCR strategy: An advantage for clinical practice. Epidemiol. Infect. 2023, 151, e201.
  63. Ortuño, F.M.; Loucera, C.; Casimiro-Soriguer, C.S.; Lepe, J.A.; Camacho-Martinez, P.; Merino-Diaz, L.; de Salazar, A.; Chueca, N.; García, F.; Perez-Florido, J.; et al. Highly accurate whole-genome imputation of SARS-CoV-2 from partial or low-quality sequences. GigaScience 2021, 10, giab078.
  64. COVID-19 Genomics UK (COG-UK) Consortium. Available online: https://webarchive.nationalarchives.gov.uk/ukgwa/20230505083137/https://www.cogconsortium.uk/ (accessed on 5 April 2025).
  65. COVID-19 Genomics Consortium Denmark. Available online: https://www.covid19genomics.dk/home (accessed on 5 April 2025).
  66. Consortium EMERGEN. Available online: https://www.santepubliquefrance.fr/dossiers/coronavirus-covid-19/consortium-emergen (accessed on 5 April 2025).
  67. Wegner, F.; Cabrera-Gil, B.; Tanguy, A.; Beckmann, C.; Beerenwinkel, N.; Bertelli, C.; Carrara, M.; Cerutti, L.; Chen, C.; Cordey, S.; et al. How Much Should We Sequence? An Analysis of the Swiss SARS-CoV-2 Surveillance Effort. Microbiol. Spectr. 2024, 12, e0362823.
  68. Genetic Diversity of the Novel Coronavirus SARS-CoV-2 (COVID-19) in Portugal. Available online: https://insaflu.insa.pt/covid19/ (accessed on 5 April 2025).
  69. Frank, O.; Balboa, D.A.; Novatchkova, M.; Özkan, E.; Strobl, M.M.; Yelagandula, R.; Albanese, T.G.; Endler, L.; Amman, F.; Felsenstein, V.; et al. Genomic Surveillance of SARS-CoV-2 Evolution by a Centralised Pipeline and Weekly Focused Sequencing, Austria, January 2021 to March 2023. Eurosurveillance 2024, 29, 2300542.
  70. The Whole Genome Sequencing Surveillance Circuit of Andalusia. Available online: https://www.clinbioinfosspa.es/surveillance_circuit/ (accessed on 14 February 2025).
  71. Casimiro-Soriguer, C.S.; Perez-Florido, J.; Fernandez-Rueda, J.L.; Pedrosa-Corral, I.; Guillot-Sulay, V.; Lorusso, N.; Martinez-Gonzalez, L.J.; Navarro-Marí, J.M.; Dopazo, J.; Sanbonmatsu-Gámez, S. Phylogenetic Analysis of the 2020 West Nile Virus (WNV) Outbreak in Andalusia (Spain). Viruses 2021, 13, 836.
  72. Casimiro-Soriguer, C.S.; Perez-Florido, J.; Lara, M.; Camacho-Martinez, P.; Merino-Diaz, L.; Pupo-Ledo, I.; de Salazar, A.; Fuentes, A.; Viñuela, L.; Chueca, N.; et al. Molecular and phylogenetic characterization of the monkeypox outbreak in the South of Spain. Health Sci. Rep. 2024, 7, e1965.
  73. Casimiro-Soriguer, C.S.; Pérez-Florido, J.; Robles, E.A.; Lara, M.; Aguado, A.; Rodríguez Iglesias, M.A.; Lepe, J.A.; García, F.; Pérez-Alegre, M.; Andújar, E.; et al. The Integrated Genomic Surveillance System of Andalusia (SIEGA) Provides a One Health Regional Resource Connected with the Clinic. Sci. Rep. 2024, 14, 19200.
  74. Neves, A.; Willassen, N.P.; Hjerde, E.; Cuesta, I.; Martin, C.S.; Inno, H.; Pilvar, D.; Ng, K.; Salgado, D.; van Helden, J.; et al. ELIXIR CONVERGE WP9 community. A survey into the contribution of regional/national pathogen data platforms and on the resources needed to develop and maintain them. F1000Research 2024, 12, 1590.
Figure 1. The genomic surveillance circuit for SARS-CoV-2 in Andalusia. A total of 27 hospitals distributed across the eight provinces of Andalusia act as SARS-CoV-2 sample collection points. These hospitals gather specimens not only from their own patients but also from other healthcare facilities within their respective health districts, including primary care centers and care homes. RNA extraction is performed on site at each hospital before samples are forwarded to designated reference sequencing facilities: Hospital Universitario Virgen del Rocío (Sevilla) for Western Andalusia and Hospital Universitario San Cecilio (Granada) for Eastern Andalusia, with additional support from the Andalusian Virus Reference Laboratory (Hospital Universitario Virgen de las Nieves, Granada). Genomic data is uploaded to the Platform of Computational Medicine, where a unified data analysis procedure is applied. Reports are sent to health authorities, and genomic sequences are deposited in the GISAID and ENA databases and stored in the BPS alongside the corresponding EHR record. Genomes can also be viewed on a local Nextstrain server: https://nextstrain.clinbioinfosspa.es/SARS-COV-2-2021-2025 (accessed on 5 April 2025). The map of Andalusia was generated using mapSpain (v.0.10.0) software from: https://ropenspain.github.io/mapSpain/ (accessed on 5 April 2025) and icons were downloaded from https://bioicons.com/ (accessed on 5 April 2025) and https://www.iconfinder.com/ (accessed on 5 April 2025). The figure was generated with PowerPoint.
Figure 2. SARS-CoV-2 genomic surveillance in Andalusia: sequencing efforts and variant detection (2021–2025). Monthly SARS-CoV-2 sequencing efforts and cumulative totals in Andalusia from 2021 to 2025. Yellow bars represent the number of sequenced samples per month and the blue line shows the cumulative total of sequenced samples across the study period. Red dashed lines mark the points of initial detection of some key SARS-CoV-2 VOC, VOI, and VUM within Andalusia. The figure was generated using the Pandas (v.1.5.0) and Seaborn (v.0.11.2) packages in Python.
Figure 3. Distribution of the main SARS-CoV-2 variants in the Andalusian surveillance circuit (2021–2025). The figure displays the relative proportions of the main SARS-CoV-2 variants detected over the study period. The pie plot was generated using the Pandas and Seaborn packages in Python.
Figure 4. Temporal dynamics of SARS-CoV-2 clades in the Andalusian surveillance circuit (2021–2025). Each point represents the proportion of a specific clade in the sequenced dataset for a given month. The size and color of the points indicate prevalence, with larger points representing higher prevalence and the color scale transitioning from blue (lower prevalence) to red (higher prevalence). The relational plot (relplot) was generated using the Pandas and Seaborn packages in Python.
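The per-month clade proportion that each point encodes can be computed as in the sketch below. It uses only the Python standard library rather than the Pandas/Seaborn workflow named in the caption; the function name `clade_proportions_by_month` and the example records are hypothetical and do not come from the study.

```python
from collections import Counter, defaultdict


def clade_proportions_by_month(samples):
    """Map each month to {clade: share of that month's sequenced samples}.

    `samples` is an iterable of (month, clade) pairs, one per genome.
    """
    counts = defaultdict(Counter)
    for month, clade in samples:
        counts[month][clade] += 1

    proportions = {}
    for month, clade_counts in counts.items():
        total = sum(clade_counts.values())
        proportions[month] = {c: n / total for c, n in clade_counts.items()}
    return proportions


# Hypothetical (month, clade) pairs, for illustration only.
example = [
    ("2022-05", "21L"), ("2022-05", "21L"), ("2022-05", "21L"), ("2022-05", "21K"),
    ("2022-06", "22B"), ("2022-06", "21L"),
]
props = clade_proportions_by_month(example)
print(props)
```

Each (month, clade, proportion) triple would then map to one point, with the proportion driving both point size and color.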
Figure 5. Distribution of the main SARS-CoV-2 Omicron clades in the Andalusian surveillance circuit (2021–2025). The figure illustrates the relative proportions of SARS-CoV-2 Omicron clades detected during the study period, highlighting the distinction between recombinant and non-recombinant clades. The bar plot was generated using the Pandas and Seaborn packages in Python.
Figure 6. Nextstrain map of the Andalusian surveillance circuit (2021–2025). This figure illustrates the geographical distribution of a representative set of SARS-CoV-2 sequencing data across the region. Extracted from https://nextstrain.clinbioinfosspa.es/SARS-COV-2-2021-2025 (accessed on 5 April 2025).
Figure 7. Phylogeny of a representative set of SARS-CoV-2 genomes from the Andalusian surveillance circuit (2021–2025). The figure illustrates the evolutionary relationships among SARS-CoV-2 genomes and provides an example of the metadata available for a given sample. For instance, sample AND24607 was collected on 5 May 2022 at the Casarabonela primary care center, located in the town of Casarabonela in Málaga province, and sent to Hospital Universitario Virgen de la Victoria for RNA extraction. The extracted genetic material was then sent to Hospital Universitario San Cecilio (HUSC) for sequencing using Illumina technology. The genome belongs to the 21L clade/BA.2.3 lineage. Figure extracted from https://nextstrain.clinbioinfosspa.es/SARS-COV-2-2021-2025 (accessed on 5 April 2025).
Table 1. Results of the quality control assessments (QCAs) for SARS-CoV-2 sequencing conducted across the Andalusian genomic surveillance circuit. The table summarizes the number and percentage of correct clade and lineage assignments for each QCA sample, based on centralized reference designations provided by the National Centre for Microbiology (CNM, Instituto de Salud Carlos III). Two reference sequencing laboratories participated in 2021, and three participated in 2024.
| Sample ID | Reference Clade | Reference Lineage | Clade: Number of Correct Results (%) | Lineage: Number of Correct Results (%) |
|---|---|---|---|---|
| QCA-01-2021 | 20I | B.1.1.7 | 2 (100) | 2 (100) |
| QCA-02-2021 | 20H | B.1.351 | 2 (100) | 2 (100) |
| QCA-03-2021 | 19B | A.28 | 2 (100) | 2 (100) |
| QCA-04-2021 | 21H | B.1.621 | 2 (100) | 2 (100) |
| QCA-05-2021 | 20J | P.1 | 2 (100) | 2 (100) |
| QCA-06-2021 | 21I | AY.9.2 | 2 (100) | 2 (100) |
| QCA-08-2021 | 21J | AY.94 | 2 (100) | 2 (100) |
| QCA-09-2021 | 21J | AY.94 | 2 (100) | 2 (100) |
| QCA-10-2021 | 21J | AY.43 | 2 (100) | 2 (100) |
| QCA-02-2024 | 23B | XBB.1.16 | 3 (100) | 3 (100) |
| QCA-03-2024 | 23A | XBB.1.5 | 2 (66.66) | 2 (66.66) |
| QCA-04-2024 | 24A | JN.1.59 | 3 (100) | 3 (100) |
| QCA-05-2024 | 23A | XBB.1.5 | 3 (100) | 3 (100) |
| QCA-07-2024 | 24A | JN.1 | 3 (100) | 3 (100) |
| QCA-10-2024 | 24A | JN.1 | 2 (66.66) | 1 (33.33) |
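
The percentages in Table 1 correspond to the number of correct results divided by the number of participating laboratories (two in 2021, three in 2024). The sketch below reproduces that arithmetic. The values 66.66 and 33.33 suggest truncation to two decimals rather than rounding; this is an inference from the table, not a method stated in the source, and the function name `percent_correct` is hypothetical.

```python
def percent_correct(correct, participants):
    """Percentage of correct results, truncated (not rounded) to two decimals.

    Truncation is an assumption chosen to match the 66.66 and 33.33
    values shown in Table 1.
    """
    if participants <= 0:
        raise ValueError("participants must be positive")
    if not 0 <= correct <= participants:
        raise ValueError("correct must be between 0 and participants")
    # Integer arithmetic avoids floating-point drift before truncating.
    return (correct * 10000 // participants) / 100


# Rows taken from Table 1: (sample ID, correct lineage calls, laboratories).
rows = [
    ("QCA-01-2021", 2, 2),
    ("QCA-03-2024", 2, 3),
    ("QCA-10-2024", 1, 3),
]
for sample_id, correct, labs in rows:
    print(sample_id, f"{correct} ({percent_correct(correct, labs)})")
```

With only two or three participating laboratories, a single discordant result moves the percentage by 33 to 50 points, so the counts are more informative than the percentages alone.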
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Casimiro-Soriguer, C.S.; Lara, M.; Aguado, A.; Loucera, C.; Ortuño, F.M.; Lorusso, N.; Navarro-Marí, J.M.; Sanbonmatsu-Gámez, S.; Camacho-Martinez, P.; Merino-Diaz, L.; et al. A Genomic Surveillance Circuit for Emerging Viral Pathogens. Microorganisms 2025, 13, 912. https://doi.org/10.3390/microorganisms13040912