*2.2. Phylogenetically Significant Type II Divergence in ABCGs*

To understand differences in substrate specificity within the G subfamily, it is necessary to examine the alignment columns that show functional divergence. Residues essential to maintaining the overall ABCG fold will be either identical or highly conserved across proteins. Residues underlying other behaviours, such as force transmission and substrate recognition, are likely to be conserved within each protein but to differ across the family. The approach adopted here, which classifies columns by their conservation pattern, was deliberately chosen to allow interpretation of differences between multiple groups within the alignment. Though it does not provide a score, this classification discriminates functional divergence at different levels and exploits existing knowledge of the proteins under investigation, so that estimates of divergence can be interpreted in light of what is known about the proteins involved.
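
As an illustration of classifying a column by its conservation pattern, the following is a minimal sketch rather than the procedure used in this work. The function name `classify_column`, the four category labels, and the toy sequence names and residues are all assumptions introduced for the example. The sketch also requires strict identity within a group, whereas a real analysis might allow conservative substitutions.

```python
from typing import Dict, List

GAP = "-"  # assumed gap character


def classify_column(column: Dict[str, str], groups: Dict[str, List[str]]) -> str:
    """Classify one alignment column by its conservation pattern.

    column maps a sequence name to the residue it has in this column.
    groups maps a group label to the sequence names belonging to it.
    The category labels returned here are hypothetical.
    """
    residues = {column[name] for members in groups.values() for name in members}
    if GAP in residues:
        return "gapped"
    if len(residues) == 1:
        # Same residue in every sequence: candidate for fold maintenance.
        return "identical"
    for members in groups.values():
        if len({column[name] for name in members}) > 1:
            # At least one group is not internally conserved.
            return "variable"
    # Every group is internally conserved, but the groups do not all share
    # one residue: the pattern expected for Type II-like divergence.
    return "group-specific"


# Toy data only: two hypothetical groups of two sequences each.
toy_groups = {"group_a": ["a1", "a2"], "group_b": ["b1", "b2"]}
toy_column = {"a1": "L", "a2": "L", "b1": "F", "b2": "F"}
print(classify_column(toy_column, toy_groups))
```

Changing how sequences are assigned to groups changes which columns are reported as group-specific, which is how prior knowledge of the proteins enters a classification of this kind.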

The conservation patterns most likely to yield insight into differences in substrate binding are those that separate proteins by their substrates. ABCG1 and ABCG4 share high sequence identity, being identical in 434 of 595 columns (excluding gapped columns). They also overlap in function and substrates [3,21], so grouping them together to establish functional divergence is sensible from both an evolutionary and a functional perspective. Though ABCG5 and ABCG8 by definition transport the same substrates, their interactions with those substrates may differ if the shape and chirality of the substrate are reflected in the substrate-binding site. Furthermore, they are less similar in sequence (being conserved in the same way in only 138 columns) and must, to some extent, carry out different functions owing to the asymmetry of their nucleotide-binding sites [6].
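
A pairwise count of identical columns, excluding gapped columns, can be sketched as follows. This assumes two rows taken from the same alignment and a `-` gap character. The function name `count_identical_columns` and the short toy sequences are hypothetical and are not ABCG sequences.

```python
from typing import Tuple


def count_identical_columns(seq_a: str, seq_b: str, gap: str = "-") -> Tuple[int, int]:
    """Return (identical columns, ungapped columns) for two aligned rows."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only the columns in which neither sequence has a gap.
    ungapped = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    identical = sum(1 for a, b in ungapped if a == b)
    return identical, len(ungapped)


# Toy aligned rows, for illustration only.
identical, total = count_identical_columns("MKT-LA", "MRTGLA")
print(f"{identical} of {total} ungapped columns identical")
```

Returning both numbers keeps the denominator explicit, so the resulting fraction is always relative to ungapped columns rather than to the full alignment length.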
