*2.2. ABCA3*

The ABCA3-201 transcript, which codes for the main functional isoform, has 33 exons and is 6602 bp long. Its 5′UTR spans 694 bp, the second longest among the ABCA genes, and is divided into four parts by three introns: Intron 1–2 (10718 bp), Intron 2–3 (890 bp) and Intron 3–4 (1960 bp). The most conserved region was localized to −519 to −503 relative to the start ATG codon of the main ORF (sATG). Four uATGs were described, at positions (1) −525, (2) −498, (3) −329 and (4) −262. The third and fourth uATGs are conserved in primates (cat. 1), the first uATG in placental mammals (cat. 3), and the second uATG in placental mammals as well as reptiles and birds (cat. 4). The second and fourth uATGs showed weak contexts, and the first and third adequate contexts. High TIS scores were calculated for the first and second uATGs, and middle scores for the third and fourth. Two probable uORFs were predicted: one is 150 nt long (starting at −525), and the other overlaps with the main ORF (starting at −262). One RG4-forming sequence was predicted at −608 to −576. Five stem loops were predicted in this region. The ABCA3 protein was found to be expressed in many human tissues (42 out of 45 tissues tested). The mode expression score was Medium, with enhanced expression in, e.g., brain, glands, lung, testis and spleen. Table 1 gives an overview of the 5′UTR features for the 12 human ABCA genes. A more detailed overview, including the positions of all features studied, can be found in Table S2.
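The position and uORF bookkeeping used above can be illustrated with a minimal sketch. It is not the pipeline used in this study. It assumes that positions are counted from the A of the sATG (+1, with no position 0), that a uORF's length includes its stop codon, and that the context is graded by the common Kozak rule (purine at −3 and G at +4: strong if both match, adequate if one matches, weak if neither). The function names `kozak_context` and `scan_uatgs` and the toy sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def kozak_context(seq, i):
    """Grade the context of the ATG at index i (assumed rule:
    purine at -3 and G at +4; both = strong, one = adequate, none = weak)."""
    minus3 = seq[i - 3] if i >= 3 else ""
    plus4 = seq[i + 3] if i + 3 < len(seq) else ""
    hits = int(minus3 in ("A", "G")) + int(plus4 == "G")
    return ("weak", "adequate", "strong")[hits]


def scan_uatgs(utr, cds):
    """List every ATG that starts inside the 5'UTR.

    utr: 5'UTR sequence (DNA alphabet), ending just before the sATG.
    cds: main ORF, starting with the sATG.
    Positions are relative to the A of the sATG (last UTR base = -1).
    """
    utr, cds = utr.upper(), cds.upper()
    seq = utr + cds
    n = len(utr)
    found = []
    for i in range(n):
        if seq[i:i + 3] != "ATG":
            continue
        # Walk codon by codon from the uATG until the first in-frame stop.
        end = None
        for j in range(i + 3, len(seq) - 2, 3):
            if seq[j:j + 3] in STOP_CODONS:
                end = j + 3
                break
        found.append({
            "position": i - n,
            "context": kozak_context(seq, i),
            # Length includes the stop codon; None if no stop was found.
            "length_nt": None if end is None else end - i,
            # The uORF overlaps the main ORF if it ends past the UTR
            # (or never terminates within the supplied sequence).
            "overlaps_main_orf": end is None or end > n,
        })
    return found


if __name__ == "__main__":
    # Toy sequences, not the ABCA3 5'UTR.
    for hit in scan_uatgs("GCCATGGAATAGCCTATGCA", "ATGCTGAAATAA"):
        print(hit)
```

Under the same position convention, the uATG at −262 lies 262 nt upstream of the sATG; since 262 is not a multiple of 3, a reading frame opened there is out of frame with the main ORF, which is consistent with its description as an overlapping uORF rather than an N-terminal extension.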


**Table 1.** Overview of the 5′UTR features in human ABCA genes sorted according to their phylogenetic relationships.

Abbreviations: Distrib., distribution; Expres., expression; n.a., not available; No., number; seq., sequence; uATG, upstream ATG; uORF, upstream ORF. Protein tissue distribution (in normal human tissues): 1-One/2-Some (less than a half)/3-Many/4-All. Protein expression score: 1-Not detected/2-Low/3-Medium/4-High.
