*3.1. The Main Channel*

A funnel-shaped channel appears in the apo form 1TQN of CYP3A4 (Figure 2a). It is located in the deformable area of channel 2 (as named by Cojocaru et al. [64]) and leaves a wide opening in the neighborhood of the active site. It lies between the following secondary structures: the *β*<sup>1</sup> sheet, the A-anchor, the B-C block, the *β*<sup>4</sup> sheet (in 3A4, this is the C-terminal loop, as it is called in the rest of the text; its size depends slightly on the conformation of a given isoform), and the F-G block (helix-loop-helix), which is very flexible owing to channel opening (Figure 2b). The B-C block is defined to extend from the beginning of the B helix to the end of the C helix, and the F-G block to extend from the beginning of the F helix to the end of the G helix. The F-G block region shows highly variable amino acid sequences and structures across CYPs, and can thus be important for substrate specificity through its contacts with the substrate. These regions are bordered by putative substrate recognition sites (SRSs; see [70]). The flexibility of the F-F' loop lets us define three conformations of CYP3A4 [60]: the closed one, labeled C, and the open conformations, labeled O1 or O2 depending on whether block 1 or block 2 opens, respectively. Molecular dynamics simulations of cytochrome P450*cam* showed that substrates and products could egress from the active site via pathways in the vicinity of the three routes identified by TMP (thermal motion pathway) analysis, with pathway 2 being energetically favored over pathways 1 and 3 [36,37]. Analysis of the simulations of cytochromes P450*cam*, P450-BM3 and P450*eryF* showed that egress trajectories in the region of channel 2 could be clustered into subclasses, named 2a, 2b, etc., according to the secondary structure elements lining the ligand pathway as it emerges from the protein surface.
Although the overall fold of the CYPs is well conserved, the length and secondary structure of the B-C and F-G blocks vary considerably within the P450 family. The flexibility of these two blocks is the main source of conformational change for a given isoform [32].

The facial graph (see Section 2.1) of this main channel is connected, i.e., there is only one channel there (see Figure 2a,b). From the volumes of the tetrahedra issued from the triangulation, we computed that about 88% of the void consists of surface pockets and only 12% of the void is located at the distal face of the heme (Figure 2a,b). Some ligands lie in surface binding pockets, such as progesterone [71] (PDB code 1W0F; see also Figure 4 in [57]). Surface pockets could be regarded as concavities that are not part of the protein domain, but there is no consensus in the literature on the definition of the protein domain and of the pockets. The binding cavity has a large volume, reported to range from 1173 to 1332 Å<sup>3</sup>, increasing up to 2000 Å<sup>3</sup> when binding large substrates such as ketoconazole and erythromycin [72]. The whole channel 2 computed by CCCPP for CV = 6 Å (see the definition of CV in Section 2.1), which includes the binding cavity, the path giving access to it and the mouth of the channel, has a total volume of 42,400 Å<sup>3</sup> and a bounding surface of 41,800 Å<sup>2</sup>. These values are large because they include the contribution of the mouth of the channel, which is a large cavity lying at the protein surface. The status of such a surface cavity is unclear: although it lies inside the convex hull of the protein, it is difficult to decide whether it is part of the protein domain or a void region exterior to the non-convex protein.
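The partition of the void into surface pockets and the distal region follows from summing tetrahedron volumes. The sketch below is only an illustration of that arithmetic, not the CCCPP code: the function names, the labels and the toy coordinates are hypothetical, and it assumes that each void tetrahedron of the triangulation has already been assigned to a region.

```python
def tetra_volume(a, b, c, d):
    """Volume of a tetrahedron from its four vertices: |u . (v x w)| / 6."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    cross = (
        v[1] * w[2] - v[2] * w[1],
        v[2] * w[0] - v[0] * w[2],
        v[0] * w[1] - v[1] * w[0],
    )
    return abs(sum(u[i] * cross[i] for i in range(3))) / 6.0


def void_fractions(tetrahedra, labels):
    """Sum tetrahedron volumes per region label; return each region's share of the total."""
    totals = {}
    for vertices, label in zip(tetrahedra, labels):
        totals[label] = totals.get(label, 0.0) + tetra_volume(*vertices)
    grand_total = sum(totals.values())
    return {label: volume / grand_total for label, volume in totals.items()}


# Toy input (hypothetical coordinates in Å, not taken from 1TQN or 2V0M).
toy_tetrahedra = [
    ((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)),  # volume 1/6
    ((0, 0, 0), (2, 0, 0), (0, 1, 0), (0, 0, 1)),  # volume 2/6
]
toy_labels = ["distal", "surface"]
fractions = void_fractions(toy_tetrahedra, toy_labels)
```

With real data, the same sum over all void tetrahedra would yield the kind of surface/distal split quoted above; the result depends on how tetrahedra are assigned to regions, which is the unresolved definitional question discussed in the text.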

The holo form 2V0M of CYP3A4 contains two ketoconazole molecules. This antifungal molecule is bulky (van der Waals volume: 450 Å<sup>3</sup>). The funnel appears for a maximal ligand size CV*lim* = 7 Å (see the definition of CV*lim* in Section 2.1) and reveals two pathways, each corresponding to a subchannel and containing one ketoconazole. These two subchannels are in block 1 and block 2 [60]. In the apo form, only block 1 appears, letting the first ketoconazole pass (Figure 2a). In the holo form, block 2 opens only at CV = 5.75 Å because its access is obstructed by a bottleneck of CV < 6 Å (see Figure 2c). At this CV, block 2 appears (Figure 2a).
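The notion of a limiting size imposed by a bottleneck can be illustrated as a widest-path problem on a graph. The sketch below is not the CCCPP algorithm and does not reproduce the definitions of Section 2.1: it assumes a hypothetical graph in which each edge carries the largest probe size able to cross it, and it returns the largest size that can travel from a target node to an exit. Node names and widths are toy values, not measurements from 1TQN or 2V0M.

```python
import heapq


def bottleneck_width(edges, source, exits):
    """Largest probe size able to go from source to any node in exits.

    edges: iterable of (node_u, node_v, width); the graph is undirected.
    Returns 0.0 when no exit is reachable.
    """
    adjacency = {}
    for u, v, width in edges:
        adjacency.setdefault(u, []).append((v, width))
        adjacency.setdefault(v, []).append((u, width))

    best = {source: float("inf")}      # best known path width per node
    heap = [(-float("inf"), source)]   # max-heap via negated widths
    while heap:
        neg_width, node = heapq.heappop(heap)
        width = -neg_width
        if width < best.get(node, 0.0):
            continue                   # stale heap entry
        if node in exits:
            return width
        for neighbor, edge_width in adjacency.get(node, []):
            candidate = min(width, edge_width)  # a path is as wide as its narrowest edge
            if candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                heapq.heappush(heap, (-candidate, neighbor))
    return 0.0


# Toy graph (hypothetical widths in Å): one shared segment, then two exits.
toy_edges = [
    ("target", "cavity", 9.0),
    ("cavity", "exit_a", 7.0),
    ("cavity", "exit_b", 5.5),
]
width_a = bottleneck_width(toy_edges, "target", {"exit_a"})
width_b = bottleneck_width(toy_edges, "target", {"exit_b"})
```

In this toy graph the two exits share a wide common segment but differ in their narrowest edge, so a probe larger than the narrower limit sees only one of the two pathways, which is the qualitative behavior described for the two subchannels.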

Several crystallographic structures of CYP3A4 exist, in several conformations corresponding to different states: bound to one ligand, to two ligands, or to none. These differences are related to conformational changes, which can be correlated with the differences between the channels as a function of the ligand. To accommodate two molecules or a large molecule, the protein undergoes a significant conformational change, especially in the F-G region and around the Phe cluster, which contains eight phenylalanine residues (57, 108, 213, 215, 219, 220, 241 and 304). The positions of the I-helix and of the C-terminal loop are also altered. The apo form 1TQN is flagged as closed subtype 1 [64,73]. In 2V0M, the F-F' loop binds the ligand (Figure 3c), and this holo form is flagged as open subtype 2 [60,64,73]. It is bound to two molecules of ketoconazole, a known inhibitor of CYP3A4. Ketoconazole has a highly polar imidazole group and nine rotatable bonds, and is thus highly flexible [59]. The two co-crystallized ketoconazoles in 2V0M have respective thicknesses of 5.19 Å and 6.52 Å (measured as indicated in [57]).

The first ketoconazole has its imidazole group near the heme: the N3 atom of the imidazole ring binds the iron atom of the heme [74]. The dichlorinated aromatic ring is in the active site and the main skeleton is in the part of channel 2a shared with channel 2f (channels 2a and 2f separate at Thr224). The bottleneck in channel 2a is at Phe108 and Thr224. Compared to the apo structure, channel 2a must widen by 1 Å to accept a large ligand (Figures 3c and 4a,b).

The second ketoconazole is inside channel 2f with an orientation opposite to that of the first ketoconazole: its acetyl group is near the active site and its dichlorinated aromatic ring (more hydrophobic) is near the entry of the channel (Figure 5a,b). There is a bottleneck in channel 2f, surrounded by four residues (Tyr53, His54, Phe215, and Leu216), three of them being aromatic and acting as gating residues (see Figure 3b). Phe215 and Leu216 are in the F-F' loop, after a conformational change from the apo structure, in which Phe215 obstructed the opening of channel 2f. Tyr53 and His54 are on helix A, which is anchored in the membrane, and show no significant change from the apo structure; they are thus assumed to have no role in building the bottleneck. The bottleneck would therefore be due only to a movement of the F-F' loop. Phe215 moves in the holo structure toward the mouth of channel 2f and catches the hydrophobic ligand at the membrane/water interface: this differs from the first molecule, which enters channel 2a through the membrane, and thus explains the opposite orientations of the two molecules. The movement of the F-F' loop lets Arg212 move to the interior of the protein, to a location opposite the active site: Arg212 is no longer able to interact with the ligand.

**Figure 2.** Four representations of the main channel through its facial graph. The target atom is the iron of the heme group (in red). (**a**) Facial graph of the main channel to the active site of CYP3A4 computed from 2V0M (in brown, for CV*lim* = 7 Å), superposed on the facial graph of this channel computed from 1TQN (in green, for CV*lim* = 6 Å). The two facial graphs show the path in channel 2a; that of 2V0M shows the beginning of channel 2f, with a bottleneck at its entry (not connected to the exterior); the two facial graphs also show the beginning of channels S. (**b**) The funnel-shaped channel leading to the active site, computed from 2V0M, appearing at CV*lim* = 7 Å (in brown), superposed on the one computed at CV = 7.25 Å (in yellow); the part of the facial graph of channels 2 lies within the *β*<sup>1</sup> sheet, the C-terminal loop, and the B-C and F-G loops; only surface pockets were visible for CV > CV*lim* (in yellow), which is evidence of a bottleneck at the entry of the two pathways. (**c**) Zoom of (**b**); bottleneck toward 2a: Phe108 (in the B-C loop) and Thr224 (in helix F'); bottleneck toward 2f: Tyr53 and His54 (both in helix A), Phe215 and Leu216 (both in the F-F' loop); the motions of these residues induce the opening/closing of the bottlenecks; the residues constituting the bottlenecks are shown as green and blue sticks; the dashed line separates blocks 1 and 2. (**d**) Zoom of (**b**), superposed on channel 2f computed at CV = 5.75 Å; six residues border the bottlenecks of channel 2f: four are located at the entry (in helix A and the F-F' loop), Phe108 is in the common part of 2a and 2f, and Thr224 is where 2a and 2f separate.

**Figure 3.** Apo forms are in light colors, holo forms are in dark colors. Each holo form refers to 2V0M, with two ketoconazoles (see Figure 5). (**a**) Channel 2a in 1TQN with some of its bounding residues in the F-F' and B-C loops; the mouth of the channel is lined by hydrophobic residues of the G' helix; the gating residues Phe213 and Phe215 of the F-F' loop and Phe108 of the B-C loop border channel 2a in the closed form. (**b**) Channel 2f of 2V0M, superposed on 1TQN; shows the steric obstruction that the lining residues of the F-F' and B-C loops (Arg212 and gating residues Phe213, Phe215 and Phe108) and helix F' oppose to the opening of channel 2f. (**c**) Channels 2a and 2f in 2V0M with some bounding residues; the common part of channels 2a and 2f is lined by the F-F' and B-C loops and by the C-terminal loop; the mouth of 2a is lined by helix G' and the mouth of 2f by the F-F' loop; 2a and 2f are separated at the exit by the hydroxy group of Thr224 in helix F'; the common part is in orange and the separated parts are in light rose; the gating residues Phe213 and Phe215 in the F-F' loop and Phe108 in the B-C loop border 2a and 2f in the open form of CYP3A4. (**d**) 1TQN superposed on 2V0M; the secondary structures are shown as cartoons and the residues as sticks (gating ones: Leu216, Arg212); 1TQN is transparent, showing the 3A4 conformation before entry of the two ketoconazoles; this indicates the possible movements of the B-C and F-G blocks and of the C-terminal loop involved in the opening of access channels 2a and 2f.

**Figure 4.** Superposition of CYP3A4 1TQN and 2V0M at blocks F-G and B-C, which bound channel 2a. For clarity, a piece of the F-F' loop was removed. Apo forms are in light colors, holo forms are in dark colors. Each holo form refers to 2V0M, with two ketoconazoles (see Figure 5). (**a**) Channel 2a of 1TQN (CV = 6 Å) superposed on 2V0M. (**b**) Channel 2a of 2V0M (CV = 7 Å) superposed on 1TQN.

**Figure 5.** (**a**) The two ketoconazoles of 2V0M inside channels 2a and 2f, within the secondary structures bounding these channels (the channels themselves are not delineated). The first ketoconazole, in yellow, enters through channel 2a. The second ketoconazole, in purple, enters through channel 2f. (**b**) Same view as (**a**), with channels 2a and 2f delineated by the boundaries of the trajectories (residues are not displayed).
