*2.4. Conservation of the Polar Relay*

Another feature of ABCG5/G8 identified from the ABCG family fold [9], and one that is also likely to play a role in conformational changes, is the "polar relay". This comprises 11 residues from ABCG5 and 9 residues from ABCG8. In the multiple sequence alignment, five of these positions overlap, leaving a total of 15 columns in the alignment corresponding to the polar relay (Figure 4). Notably, one of the columns in the polar relay of both ABCG5 and ABCG8 (column 1011) aligns with R482 in transmembrane helix 3 of ABCG2, mutation of which has long been known to alter substrate specificity [11,22,23].

**Figure 4.** Conservation patterns in columns corresponding to the polar relay in ABCG5/G8. Columns coloured as in Figure 2. Orange dashed boxes indicate the polar relay for ABCG5 and ABCG8. Columns with the conservation pattern (ABCG1, ABCG4), (ABCG2), (ABCG5), (ABCG8) have that pattern underlined in blue at the bottom.

The conservation patterns of these 15 columns are shown in Figure 4. Although 12 of these positions show type II divergence for all ABCGs, two columns (915: R389 in ABCG5 and H420 in ABCG8; and 963: N437 in ABCG5 and D466 in ABCG8) are not conserved in ABCG8, and one (1006: V471 in ABCG5 and E500 in ABCG8) is not conserved in ABCG5. Remarkably, 40% of the columns in the polar relay (6/15) have the conservation pattern (ABCG1, ABCG4), (ABCG2), (ABCG5), (ABCG8). This pattern is therefore far more common in the polar relay than across the whole protein, where it is found in only 5.5% (33/595) of the aligned columns (Supplementary Table S3).
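The size of this enrichment can be checked with a short calculation. The sketch below is illustrative only and is not an analysis reported in this work: it uses the counts quoted above, assumes that the 33 pattern columns include the 6 found in the polar relay, and treats the 15 polar relay columns as a random draw from the 595 aligned columns (a simplification, since neighbouring alignment columns are unlikely to be independent). The function and variable names are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(total, tagged, drawn, observed):
    """P(X >= observed) when `drawn` columns are sampled without replacement
    from `total` columns, of which `tagged` carry the conservation pattern."""
    denom = comb(total, drawn)
    upper = min(tagged, drawn)
    hits = sum(
        comb(tagged, k) * comb(total - tagged, drawn - k)
        for k in range(observed, upper + 1)
    )
    return hits / denom


# Counts taken from the text; the independence assumption is ours.
ALIGNED_COLUMNS = 595        # all aligned columns
PATTERN_COLUMNS = 33         # columns with the four-group conservation pattern
RELAY_COLUMNS = 15           # columns corresponding to the polar relay
RELAY_PATTERN_COLUMNS = 6    # polar relay columns with that pattern

relay_fraction = RELAY_PATTERN_COLUMNS / RELAY_COLUMNS
background_fraction = PATTERN_COLUMNS / ALIGNED_COLUMNS
fold_enrichment = relay_fraction / background_fraction
p_value = hypergeom_upper_tail(
    ALIGNED_COLUMNS, PATTERN_COLUMNS, RELAY_COLUMNS, RELAY_PATTERN_COLUMNS
)

print(f"polar relay: {relay_fraction:.1%}, whole alignment: {background_fraction:.1%}")
print(f"fold enrichment: {fold_enrichment:.1f}")
print(f"hypergeometric upper-tail probability: {p_value:.2e}")
```

Under these assumptions the fold enrichment is simply the ratio of the two percentages given in the text, and the tail probability gives a rough sense of how unlikely six or more such columns would be by chance alone.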

The type II divergence pattern (ABCG1, ABCG4), (ABCG2), (ABCG5), (ABCG8) coincides with a large share of the polar relay (Figures 4 and 5a), to which allosteric significance in ABCG5/G8 has previously been attributed. This suggests that the entire corkscrew of residues contributes to the allosteric divergence of the ABCG family.

**Figure 5.** Comparison of the polar relay with functionally divergent residues. (**a**) Distribution on the structure of ABCG2. Residues found in the polar relay are shown as spheres. Those with the conservation pattern (ABCG1, ABCG4), (ABCG2), (ABCG5), (ABCG8) are coloured green; the others are coloured red. Residues outside the polar relay with the same conservation pattern are coloured violet within the cartoon representation. (**b**) Identity of residues with the conservation pattern (ABCG1, ABCG4), (ABCG2), (ABCG5), (ABCG8). Bars are coloured by protein, and the height of each bar represents the number of times that residue occurs, for that group, among the 33 positions with this conservation pattern. For each residue, bars are in the order ABCG1 and ABCG4; ABCG2; ABCG5; and ABCG8.
