*2.2. Sanger and Next-Generation Sequencing and Analysis*

Sequencing is a crucial step performed after the SELEX-based selection of aptamer candidates and before aptamer–target characterization. Here, we review the two main methods used for sequencing aptamers: Sanger sequencing and next-generation sequencing (NGS). Sanger sequencing is a DNA sequencing technology based on the chain termination method introduced by Frederick Sanger and his colleagues [48] in 1977. The main workflow involves blocking the polymerase-mediated elongation of DNA through the incorporation of fluorophore-labeled dideoxynucleotides (dideoxyadenosine triphosphate, dideoxyguanosine triphosphate, dideoxycytidine triphosphate, and dideoxythymidine triphosphate) at the 3′ ends of the growing DNA strands, resulting in DNA fragments of various lengths that are separated by size and detected by fluorescence. This method has the advantages of high precision, high efficiency, and low radioactivity, and the disadvantages of being expensive and yielding low-quality sequence in the first 15 to 40 bases, where the primer binds [49]. Subsequently, high-throughput second- and third-generation sequencing technologies were developed and have largely superseded Sanger sequencing [50]. In 1998, Balasubramanian and Klenerman co-invented Solexa sequencing [51], a widely used form of NGS. This method differs from Sanger sequencing mainly in its mode of chain termination: modified deoxynucleoside triphosphates (dNTPs) carrying a reversible terminator are used to pause polymerization, and the terminator is then removed to allow incorporation of the next modified dNTP [52]. NGS offers some advantages over Sanger sequencing, including massively parallel sequencing in a short period and a lower cost per base pair.
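The chain termination principle described above can be illustrated with a toy simulation. The following is a minimal sketch only, not a model of any real instrument: the function names, the termination probability, and the example strand are all hypothetical choices made for illustration, and the newly synthesized strand is given directly rather than derived from a template.

```python
import random


def sanger_fragments(synthesized_strand, n_molecules=5000, p_terminate=0.05, seed=0):
    """Simulate chain-terminated fragments of a newly synthesized strand.

    Each fragment is recorded as (length, terminal_base); the terminal base
    stands in for the fluorophore colour of the incorporated ddNTP.
    The probability p_terminate is an illustrative assumption.
    """
    rng = random.Random(seed)
    fragments = []
    for _ in range(n_molecules):
        for position, base in enumerate(synthesized_strand, start=1):
            # A labeled ddNTP is incorporated with a small probability,
            # which blocks any further elongation of this molecule.
            if rng.random() < p_terminate:
                fragments.append((position, base))
                break
    return fragments


def read_sequence(fragments):
    """Sort fragments by size and read one terminal label per length."""
    label_by_length = {}
    for length, base in fragments:
        label_by_length[length] = base
    return "".join(label_by_length[n] for n in sorted(label_by_length))


strand = "ACGTTGCAAGCTTACGGATC"  # hypothetical 20-nt example
recovered = read_sequence(sanger_fragments(strand))
```

Because every fragment of a given length ends in the same labeled base, ordering the fragments by size reproduces the sequence, which is the idea behind size separation followed by fluorescence detection.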

To analyze sequencing data, in silico techniques are applied alongside SELEX to facilitate aptamer selection [53–55]. The initial step is primary sequence analysis using bioinformatic tools such as ClustalW [56] and Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/ (accessed on 12 October 2022)) [57]. Multiple sequence alignment is performed to divide sequences into families or clusters, and the conserved regions of the sequences are analyzed using Gblocks software. Furthermore, Multiple Expectation Maximization for Motif Elicitation (MEME; https://meme-suite.org/meme/tools/meme (accessed on 12 October 2022)) [58] is used to identify sequence motifs, which provide the basis for downstream analysis. Aptamers can form diverse secondary structures, such as stem-loop, triplex, G-quadruplex, and pseudoknot structures; thus, enriched sequences can also be selected based on their predicted secondary structure and Gibbs free energy (ΔG). Several nucleic acid structure-prediction web servers are available for this purpose, including MFOLD (http://www.unafold.org/mfold/applications/dna-folding-form.php (accessed on 12 October 2022)) [59], KineFold (http://kinefold.curie.fr/ (accessed on 12 October 2022)) [60], and RNAstructure (https://rna.urmc.rochester.edu/RNAstructure.html (accessed on 12 October 2022)) [61]. The predicted secondary structures are then converted into three-dimensional (3D) structures using online web servers. These include the fragment-based methods RNAComposer (https://rnacomposer.cs.put.poznan.pl/ (accessed on 12 October 2022)), 3dRNA (http://biophy.hust.edu.cn/3dRNA (accessed on 12 October 2022)), and Vfold3D (http://rna.physics.missouri.edu/vfold3D/ (accessed on 12 October 2022)), and the energy-based method SimRNA (https://genesilico.pl/SimRNAweb (accessed on 12 October 2022)) [62].
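The family-grouping and motif-finding steps can be sketched in a few lines of code. The example below is a deliberately simplified stand-in, assuming a greedy edit-distance grouping and a shared k-mer search; it does not reproduce the algorithms used by ClustalW, Clustal Omega, Gblocks, or MEME, and the function names, thresholds, and example reads are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution or match
            ))
        previous = current
    return previous[-1]


def group_into_families(sequences, max_distance=2):
    """Greedy grouping: a sequence joins the first family whose founder
    (first member) lies within max_distance edits; otherwise it founds a
    new family. The threshold is an illustrative assumption."""
    families = []
    for seq in sequences:
        for family in families:
            if edit_distance(seq, family[0]) <= max_distance:
                family.append(seq)
                break
        else:
            families.append([seq])
    return families


def shared_kmers(family, k=6):
    """Return the k-mers present in every member of a family
    (a crude proxy for a conserved motif)."""
    kmer_sets = [{seq[i:i + k] for i in range(len(seq) - k + 1)} for seq in family]
    return set.intersection(*kmer_sets)


# Hypothetical reads, invented purely for illustration.
reads = [
    "GGTTGGTGTGGTTGG",
    "GGTTGGTGTGGTTGA",
    "GGTTGGAGTGGTTGG",
    "ACACACACACACACA",
    "ACACACACACACACT",
]
families = group_into_families(reads)
motif_candidates = shared_kmers(families[0])
```

In practice, dedicated alignment and motif-discovery tools handle gaps, sequence weighting, and statistical significance, none of which this sketch attempts.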
Molecular docking tools, including AutoDock, AutoDock Vina, and DOCK, are then used to predict the predominant binding modes and binding regions on the target molecule based on the binding scores generated for specific sequences [63–65]. After these prediction steps, the aptamer candidates are shortlisted and subjected to binding tests and characterization.
