*2.4. Missense Mutations Cause Conformational Changes at the ATP-Binding Site*

To examine the relationship between the structural changes caused by missense mutations and their impact on ATPase activity, we performed molecular dynamics (MD) simulations for the WT and the three mutants in this study. We then analyzed the MD structures to understand how the mutations could lead to different conformations of the hypothetical surrounding residues at the nucleotide-binding sites (NBS). These residues were obtained through a structural comparison between the crystal structure of ABCG5/G8 (Protein Data Bank (PDB) identifier (ID): 5DO7) and a cryo-EM structure of ABCG2 (PDB ID: 6HBU), in which two ATP molecules are bound in the homodimer [21,38].

To identify which residues are important for ATP binding, we conducted MD simulations for the ABCG2 system. We calculated the ligand–residue MM-GBSA (Molecular Mechanics-Generalized Born Surface Area) free energies (∆G<sub>lig-res</sub>) for the 32 surrounding residues and identified eight hotspot residues with ∆G<sub>lig-res</sub> more favorable (more negative) than −7.0 kcal/mol (Table S1). Although those hotspots were identified for ABCG2, it is reasonable to assume they are also hotspots for ABCG5/G8 given the apparent structural and sequence similarity (only one hotspot differs in amino-acid type between the two proteins). The root-mean-square deviation (RMSD) for the main-chain atoms was 2.60 Å, and the corresponding amino-acid types of both proteins are listed in Table S1. The detailed interactions between ATP and ABCG2 revealed by a representative MD structure are shown in Figure S3. In this study, we focused on the active nucleotide-binding site (known as NBS2) in ABCG5/G8 [21] and analyzed residues 88–103 and 246–251 of ABCG5, and residues 210–220 and 237–245 of ABCG8. These residues were taken as the surrounding residues of NBS2 in ABCG5/G8.
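The hotspot selection described above amounts to a simple threshold on the per-residue free energies. The following is a minimal sketch of that step, not the procedure actually used; the function name `select_hotspots`, the residue labels, and the energy values are hypothetical placeholders and are not taken from Table S1.

```python
from typing import Dict, List


def select_hotspots(dg_lig_res: Dict[str, float], cutoff: float = -7.0) -> List[str]:
    """Return residues whose ligand-residue free energy (kcal/mol) is
    strictly more negative than the cutoff, strongest interaction first."""
    hits = [res for res, dg in dg_lig_res.items() if dg < cutoff]
    return sorted(hits, key=lambda res: dg_lig_res[res])


# Placeholder energies for illustration only (NOT the values in Table S1).
example = {"RES_A": -12.4, "RES_B": -7.0, "RES_C": -3.1, "RES_D": -8.2}
hotspots = select_hotspots(example)
print(hotspots)
```

A residue exactly at the cutoff (here `RES_B`) is excluded, because the criterion is read as strictly more favorable than −7.0 kcal/mol.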

As shown in Figure 7, the mutations at the three sites could lead to global changes in the overall ABCG5/G8 structure, with RMSD values larger than 2.0 Å. The difference between the RMSDs of the secondary structures was smaller, probably because more pronounced changes need a longer simulation time to manifest. We were especially interested in the mutational effect on the ATP-binding site and generated RMSD vs. simulation time curves for the hypothetical surrounding residues (Figure S4). The RMSDs with and without least-squares (LS) fitting were very stable for the WT, whereas, for G5-E146Q and G5-A540F, both the LS-fitting and no-fitting RMSDs were significantly larger. However, the G8-R543S mutation did not lead to a significantly larger RMSD, probably because the mutation site is farther from the ATP-binding site and much longer MD simulations are required for the perturbation to propagate. Indeed, the RMSD showed an increasing trend over the simulation time for G8-R543S (Figure S4D). We then conducted a correlation analysis using an in-house program to identify possible interaction pathways between the two sites. As shown in Figure S5, the shortest path contained R543, E474, N155, V205, and L213. L213 is linked to four key residues for ATP binding. It is understandable that a perturbation at R543 needs a long simulation time to reach the ATP-binding site, given that the shortest interaction path contains six residues including two ends. Overall, we observed a significant perturbation of the conformations of the putative surrounding residues due to the G5-E146Q and G5-A540F mutations. We anticipate that the G8-R543S mutation could lead to a significant conformational change at NBS2 in much longer MD simulations.
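The in-house correlation-analysis program is not described here, so the following is only a sketch of one common way to extract a shortest interaction path from pairwise residue correlations: build a graph whose edge weights are −ln|C<sub>ij</sub>| and run Dijkstra's algorithm. The weighting scheme, the function names, the correlation values, and the detour node `X1` are all assumptions for illustration; only the residue labels on the main path come from the text.

```python
import heapq
import math
from typing import Dict, List, Tuple

Graph = Dict[str, Dict[str, float]]


def build_graph(correlations: Dict[Tuple[str, str], float]) -> Graph:
    """Undirected graph with edge weight -ln|C_ij| (assumed scheme):
    strongly correlated residue pairs are 'close'."""
    graph: Graph = {}
    for (a, b), c in correlations.items():
        weight = -math.log(abs(c))
        graph.setdefault(a, {})[b] = weight
        graph.setdefault(b, {})[a] = weight
    return graph


def shortest_path(graph: Graph, source: str, target: str) -> List[str]:
    """Dijkstra's algorithm; returns the node sequence, or [] if unreachable."""
    dist = {source: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for nbr, w in graph.get(node, {}).items():
            nd = d + w
            if nd < dist.get(nbr, math.inf):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if target not in dist:
        return []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical correlation values; "X1" is a made-up weakly coupled detour.
toy_corr = {
    ("R543", "E474"): 0.80, ("E474", "N155"): 0.70,
    ("N155", "V205"): 0.75, ("V205", "L213"): 0.85,
    ("R543", "X1"): 0.30, ("X1", "L213"): 0.20,
}
path = shortest_path(build_graph(toy_corr), "R543", "L213")
print(" -> ".join(path))
```

With this weighting, a path through several strongly correlated pairs can be shorter than a path with fewer but weakly correlated hops, which is why the toy example prefers the four-edge route over the two-edge detour.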

**Figure 7.** Fluctuation of root-mean-square deviations (RMSDs) along molecular dynamics (MD) simulation time course. RMSDs were calculated using the main-chain atoms of all residues (black lines) or secondary structures only (red lines): (**A**) wild type; (**B**) E146Q mutant in ABCG5; (**C**) A540F mutant in ABCG5; (**D**) R543S mutant in ABCG8. G5G8: ABCG5/G8; SS: secondary structure.
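To make the distinction between RMSDs with and without least-squares fitting concrete, the sketch below computes both for two coordinate sets using NumPy. It uses the standard Kabsch superposition as the LS fit; this is an illustrative assumption, not a description of the software used to produce the curves, and the function names are hypothetical.

```python
import numpy as np


def rmsd_no_fit(ref: np.ndarray, mob: np.ndarray) -> float:
    """RMSD between two (N, 3) coordinate sets without superposition."""
    diff = mob - ref
    return float(np.sqrt((diff ** 2).sum() / len(ref)))


def rmsd_ls_fit(ref: np.ndarray, mob: np.ndarray) -> float:
    """RMSD after optimal rigid-body superposition (Kabsch algorithm)."""
    # Remove translation by centering both sets on their centroids.
    ref_c = ref - ref.mean(axis=0)
    mob_c = mob - mob.mean(axis=0)
    # Covariance matrix and its singular value decomposition.
    u, _, vt = np.linalg.svd(mob_c.T @ ref_c)
    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(u @ vt))
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    # Rotate the mobile set onto the reference and measure the residual.
    return rmsd_no_fit(ref_c, mob_c @ rot)


# Synthetic check: a rigidly rotated and shifted copy of random coordinates.
rng = np.random.default_rng(0)
ref = rng.normal(size=(20, 3))
theta = np.pi / 6
rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta), np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
mob = ref @ rz.T + np.array([1.0, -2.0, 0.5])
print(rmsd_no_fit(ref, mob), rmsd_ls_fit(ref, mob))
```

The fitted RMSD reports only internal conformational change, whereas the no-fit RMSD also includes any rigid-body displacement of the selected atoms relative to the reference frame.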

Next, we identified representative MD conformations for all four ABCG5/G8 protein systems for comparison (Figure 8). The hotspot residues overlaid very well between the crystal and MD structures for the WT (Figure 8E) and the R543S mutant (Figure 8H), except for R211, whereas, for the other two mutants, the RMSDs were significantly larger (Figure 8F,G). This observation was expected, for the reasons explained above. Interestingly, the side chain of R211 underwent dramatic changes in all four protein systems during the MD simulations. When R211 was omitted, the main-chain RMSDs became much smaller. In summary, the conformational changes from our molecular modeling could qualitatively explain why the three mutations can lead to impaired ATPase activity. Of particular note, G5-K92, the hotspot residue that has the strongest interaction with ATP, is part of the Walker A motif at the active nucleotide-binding site and is required for ABCG5/G8 function [18,33].

**Figure 8.** Representative structures of the WT ABCG5/G8 and its three missense mutants. The representative structures (shown as blue cartoons and bluish sticks) were aligned to the crystal structure (green cartoons and greenish lines). The three mutated residues, E146Q, A540F, and R543S, are shown as spheres. The hypothetical surrounding residues of ATP are indicated by dashed rectangles. (**A**,**E**) Wild type; (**B**,**F**) E146Q; (**C**,**G**) A540F; (**D**,**H**) R543S. G5: ABCG5; G8: ABCG8. Residues in G5 and G8 are colored black and red, respectively. Root-mean-square deviations (RMSDs) for the main-chain atoms (rmsdMC) and all heavy atoms (rmsdHEV) are shown in the lower panels. If R211 is omitted from the RMSD calculations, the RMSDs of the main-chain atoms are 0.69, 1.30, 0.88, and 0.78 Å for WT, E146Q, A540F, and R543S, respectively; the corresponding RMSDs of heavy atoms are 0.85, 1.42, 1.13, and 0.96 Å.
