**1. Introduction**

Proteins of the ABC (ATP-binding cassette) family are found in every group of living organisms, from bacteria to primates, and are generally known for their ability to translocate a wide range of substrates across extracellular as well as intracellular biomembranes [1,2]. Typically, ABC transport proteins contain two nucleotide-binding domains and two transmembrane domains. In eukaryotes, ABC proteins are organized as full- or half-transporters; the products of half-transporter genes have to homodimerize or heterodimerize to create a functional transporter. The forty-eight ABC protein-coding genes described in the human genome are divided into seven subfamilies according to the similarity of their amino acid (aa) sequences and the organization of their protein domains [3]. The ABCA subfamily is represented by 12 full transporters, which are among the largest ABC proteins, with a median length of 1925 aa. They have been reported to play important roles in the transport of cholesterol and its derivatives, as well as some vitamins and xenobiotics [4,5]. Several members of the ABCA subfamily have been causatively linked to a diverse set of human inborn diseases, such as familial high-density lipoprotein (HDL) deficiency (*ABCA1*), neonatal surfactant deficiency (*ABCA3*), degenerative retinopathies (*ABCA4*), and congenital keratinization disorders (*ABCA12*) [6]. Gene expression studies conducted in our laboratories have demonstrated associations of intra-tumoral transcript levels of several ABCA genes with the response of patients to oncological therapy or with disease-free survival [7–11]. Their roles in cancer progression and metastasis, attributed mainly to lipid trafficking, are a matter of intensive research [4,12]. Phylogenetic analyses suggest that the current ABCA genes evolved from a common ancestor gene through many duplication and loss events [6,13,14]. The *ABCA5*-related genes (*ABCA5*/*6*/*8*/*9*/*10*), which evolved from the *ABCA5* gene by duplications, form a cluster on the q-arm of human chromosome 17 (17q24). The remaining ABCA genes are dispersed on six other human chromosomes [5,15].

Protein levels do not perfectly reflect mRNA levels in normal human tissues, and this is not specific to ABC genes [4,16]. The reason for this difference is believed to lie in post-transcriptional regulation, and regulation of the initiation of translation has been implicated as the major mechanism in this complex process. Several features of the 5′ untranslated regions (5′UTR, also called the leader sequence) of genes act as *cis*-acting regulatory factors: the length of the 5′UTR, upstream ATG start codons (uATG), upstream open-reading frames (uORF), introns, RNA G-quadruplex-forming sequences (RG4), diverse secondary structures such as stem loops, and the Kozak consensus motif in the vicinity of start codons (Figure 1) [17,18]. RNA structures such as stem loops and RG4s, as well as uORFs and uATGs, mainly inhibit translation. RNA modifications, RNA-binding proteins (RBP) and long non-coding RNAs (lncRNA) that interact with RNA binding sites, and the Kozak motif can, in addition, stimulate translation initiation. It is still not clear how these elements interact when several of them are present together, or whether some of them play a dominant role. Our recent bioinformatics study focusing on the 5′UTRs of the human *ABCA1* gene and its vertebrate orthologs revealed several highly conserved elements [19]. The 5′UTRs of the other ABCA subfamily genes have not yet been studied in detail. The current work maps, across the whole subfamily, the 5′UTR features that are known to have the potential to regulate translation. Our findings can be taken into consideration by those interpreting the significance of new mutations and polymorphisms.
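To make the sequence-level features listed above more concrete, the following minimal Python sketch shows how uATGs, uORFs, a canonical RG4 motif, and the Kozak context could be located in a 5′UTR given as a DNA-alphabet string. It is an illustration only and not the pipeline used in this work: the function names, the toy sequence, the RG4 pattern (four runs of at least three G separated by loops of 1 to 7 nt), and the three-level Kozak classification based on positions −3 and +4 are all assumptions made for the example.

```python
import re

# Stop codons in the DNA alphabet (T instead of U).
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Assumed canonical G-quadruplex motif: four G-runs (>= 3 G) with 1-7 nt loops.
# Other RG4 definitions exist; this pattern is only illustrative.
RG4_PATTERN = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")


def find_uatgs(utr):
    """Return 0-based positions of every ATG in the 5'UTR (overlaps allowed)."""
    return [m.start() for m in re.finditer(r"(?=ATG)", utr)]


def find_uorfs(utr):
    """Return (start, end) for each uATG followed by an in-frame stop codon
    that lies entirely within the 5'UTR; end is exclusive."""
    uorfs = []
    for start in find_uatgs(utr):
        for pos in range(start + 3, len(utr) - 2, 3):
            if utr[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3))
                break
    return uorfs


def find_rg4(utr):
    """Return start positions of non-overlapping matches of the assumed RG4 motif."""
    return [m.start() for m in RG4_PATTERN.finditer(utr)]


def kozak_strength(seq, atg_pos):
    """Classify the Kozak context of the ATG at atg_pos using two key positions:
    a purine at -3 and a G at +4 (hypothetical three-level scheme)."""
    minus3 = seq[atg_pos - 3] if atg_pos >= 3 else None
    plus4 = seq[atg_pos + 3] if atg_pos + 3 < len(seq) else None
    hits = int(minus3 in ("A", "G")) + int(plus4 == "G")
    return ("weak", "adequate", "strong")[hits]


if __name__ == "__main__":
    # Toy sequence invented for this example, not a real ABCA 5'UTR.
    toy_utr = "CCACCATGGCTTGAGGGAGGGAGGGAGGGCC"
    print("uATG positions:", find_uatgs(toy_utr))
    print("uORFs:", find_uorfs(toy_utr))
    print("RG4 starts:", find_rg4(toy_utr))
    print("Kozak context of first uATG:", kozak_strength(toy_utr, 5))
```

A scan of this kind only flags candidate elements. Secondary structures such as stem loops require dedicated RNA folding tools, and whether a flagged element actually affects translation has to be established experimentally.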

**Figure 1.** Secondary structure of the 5′UTR of the human *ABCA1* gene. The positions of the features influencing the initiation of translation are depicted; intron spl. s., intron splice-sites; RG4, RNA G-quadruplex-forming sequence; uATG, upstream ATG codon; uORF, upstream open-reading frame.
