*2.7. Other Conservation Patterns*

Though (ABCG1, ABCG4), (ABCG2), (ABCG5), (ABCG8) is the conservation pattern most frequently observed in functionally divergent columns, several other patterns are also well represented (their frequencies are given in Supplementary Table S2, and some of them are mapped onto the structure of ABCG2 in Supplementary Figure S5). Given the evolution of the subfamily, it is instructive to examine the conservation pattern (ABCG1, ABCG4), (ABCG2), (ABCG5, ABCG8), which highlights another 13 residues, shown on ABCG2 in Figure 3b. Notably, ten of these are found in either the NBD:NBD interface or the NBD:TMD interface. The remainder are found in the TMD, and two of these (C438 and I573 in ABCG2, P431/460 and F567/595 in ABCG5/G8) form pairs in the structures of ABCG2 and ABCG5/G8.

An interpretation of these patterns based on their likely evolution is that both sets of residues diverged when the ancestor of ABCG1 and ABCG4, the ancestor of ABCG2, and the ancestor of ABCG5 and ABCG8 specialised to transport different substrates. Later, ABCG5 and ABCG8 could take on different parts of the function of a transporter by forming an obligate heterodimer, and the residues corresponding to columns conserved as (ABCG1, ABCG4), (ABCG2), (ABCG5), (ABCG8) represent this further functional divergence. Thus, both sets would be responsible for differences in substrate specificity, with 13 residues requiring conservation across ABCG5 and ABCG8. Taken together, these patterns reiterate the likely importance of allostery to differences in the function of the ABCGs.

The three patterns found most frequently other than (ABCG1, ABCG4), (ABCG2), (ABCG5, ABCG8), having 24, 23, and 23 members respectively, are (ABCG1, ABCG4), (ABCG2), (ABCG5); (ABCG1, ABCG4), (ABCG2), (ABCG8); and (ABCG1, ABCG4), (ABCG2). In total, these four patterns make up 103/533 of the functionally divergent columns. In all of these, ABCG1 and ABCG4 conserve the same amino acid, and ABCG2 conserves another, but conservation within ABCG5 and ABCG8 differs. In the conservation patterns not examined more closely in the sections above, ABCG5, ABCG8, or both do not conserve the column. This indicates positions under decreased evolutionary pressure in ABCG5 and ABCG8, perhaps because, as a heterodimer, they split between them some of the functions that both halves of a homodimer must normally maintain. That so many of these positions are also sources of functional divergence between ABCG1 and ABCG4 on the one hand and ABCG2 on the other is intriguing.

Another well-represented set of conservation patterns consists of columns with type II divergence between one member and all the other members. (ABCG1, ABCG4, ABCG2, ABCG8), (ABCG5) has 16 members. Two interesting residues with this pattern are F439 in ABCG2 (a tyrosine in ABCG5), which serves as a "clamp" for substrates [36], and E451 in ABCG2 (a leucine in ABCG5), which is a key residue in coupling ATPase activity to transport [34]. (ABCG1, ABCG4, ABCG5, ABCG8), (ABCG2) has 12 members, and (ABCG1, ABCG4, ABCG2, ABCG5), (ABCG8) has 9 members. Probably because ABCG1 and ABCG4 are so closely related, there are fewer columns (0 and 5 respectively) with type II divergence between each of these and the rest of the subfamily. These are tantalising groups, as they show positions at which each member specialises in a way distinct from the ABCG subfamily as a whole. However, a molecular interpretation is much more difficult.
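
To make the pattern notation used in this section concrete, the following minimal sketch shows one way a single alignment column could be labelled. It is an illustration only and not the procedure used to generate Supplementary Table S2: the function name `conservation_pattern`, the input format (one conserved residue per member, or `None` where a member does not conserve the column), and the placeholder residue `"?"` are all assumptions introduced here.

```python
from typing import Dict, List, Optional

# Order in which members are listed in the pattern labels of this section.
ORDER = ["ABCG1", "ABCG4", "ABCG2", "ABCG5", "ABCG8"]


def conservation_pattern(column: Dict[str, Optional[str]]) -> str:
    """Label one alignment column (hypothetical helper, illustration only).

    `column` maps each member to the residue it conserves at this position,
    or to None if that member does not conserve the column. Members sharing
    a residue are grouped; non-conserving members are omitted from the label.
    """
    groups: Dict[str, List[str]] = {}
    for member in ORDER:
        residue = column.get(member)
        if residue is None:
            continue  # member does not conserve this column
        groups.setdefault(residue, []).append(member)
    # dicts preserve insertion order, so groups appear by first member in ORDER
    return ", ".join("(" + ", ".join(members) + ")" for members in groups.values())


# F439 in ABCG2 is a tyrosine in ABCG5; the other members share the ABCG2 residue.
clamp = {"ABCG1": "F", "ABCG4": "F", "ABCG2": "F", "ABCG5": "Y", "ABCG8": "F"}
print(conservation_pattern(clamp))

# C438 in ABCG2 aligns with P431/P460 in ABCG5/G8; "?" is a placeholder for the
# (unspecified here) residue shared by ABCG1 and ABCG4.
paired = {"ABCG1": "?", "ABCG4": "?", "ABCG2": "C", "ABCG5": "P", "ABCG8": "P"}
print(conservation_pattern(paired))

# A column that ABCG8 does not conserve (all residues here are placeholders).
relaxed = {"ABCG1": "?", "ABCG4": "?", "ABCG2": "C", "ABCG5": "P", "ABCG8": None}
print(conservation_pattern(relaxed))
```

Under these assumptions, the three example columns are labelled (ABCG1, ABCG4, ABCG2, ABCG8), (ABCG5); (ABCG1, ABCG4), (ABCG2), (ABCG5, ABCG8); and (ABCG1, ABCG4), (ABCG2), (ABCG5), respectively. The sketch deliberately ignores how conservation within each member and the type of divergence are established, which are determined upstream of this labelling step.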
