#### *3.2. Access Channels and Narrow Channels*

#### 3.2.1. Channels 2a and 2f

We know that access to the 3A4 active site is through channel 2a, computed from the apo structure with CV = 6 Å. This channel is open in the apo form, and it enlarges to accept a first ketoconazole ligand at CV = 7 Å between the F-G and B-C loops and the *β*<sup>1</sup> sheet. Channel 2a is the first channel found by CCCPP, and it is the largest one for this conformation of CYP3A4. It contains the first ketoconazole, bound to the heme, and is assumed to be the first pathway found by the ketoconazole. Pathway 2a, in which the ligand passes between the F-G loop, the B-C loop and the *β*<sup>1</sup> sheet, appeared to be the most likely route for substrate access and product egress in a previous study on bacterial proteins [75]. In the bacterial P450 structures, 2a is open in most of the structures that have at least one open channel. Simulations of product egress from CYP101 indicate that this route would be used by the product as well as by the substrate [36]. The MCP (see Section 2.2) corresponding to channel 2a has a volume of 613 Å<sup>3</sup> and a bounding surface of 273 Å<sup>2</sup> in the apo form, and a volume of 774 Å<sup>3</sup> and a bounding surface of 896 Å<sup>2</sup> in the holo form. This volume is small because it corresponds to the path leading to the binding site rather than to the full binding cavity. It should be compared to the volume of the convex hull of ketoconazole, which is estimated to lie in the interval 300–340 Å<sup>3</sup>, depending on the conformation. The bounding surface is small because the path is not a closed one: it is bounded only at some places, and elsewhere it opens onto the rest of the channel. Such quantitative results, obtained with CCCPP, cannot be obtained with other channel calculation programs.
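The comparison between the MCP volume and the ligand hull volume can be made explicit with a short calculation. The sketch below is only an illustration: the variable and function names are ours (hypothetical), the numbers are those quoted above, and the ratio is a crude size comparison, not a quantity computed by CCCPP.

```python
# Illustrative only: names are hypothetical, values are those quoted in the text.
MCP_VOLUME_A3 = {"apo": 613.0, "holo": 774.0}  # MCP of channel 2a, in Å^3
KETOCONAZOLE_HULL_A3 = (300.0, 340.0)  # convex hull volume, conformation dependent


def volume_ratio_range(mcp_volume, hull_range):
    """Return (min, max) of MCP volume / ligand hull volume over the hull range."""
    smallest_hull, largest_hull = hull_range
    return mcp_volume / largest_hull, mcp_volume / smallest_hull


for form, volume in MCP_VOLUME_A3.items():
    low, high = volume_ratio_range(volume, KETOCONAZOLE_HULL_A3)
    print(f"{form}: MCP volume is {low:.2f} to {high:.2f} times the ligand hull volume")
```

Under these numbers the MCP is roughly 1.8 to 2.0 times the ligand hull volume in the apo form and roughly 2.3 to 2.6 times in the holo form, which is consistent with a path that is only slightly wider than the ligand it carries.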

Other pathways are then needed to accept more ligands. In the holo form, a second channel is found: channel 2f, in block 2, located between the F-G block and the C-terminal loop, and containing the second ketoconazole. It is identified as a distinct channel, and it lies between channel 2a and the solvent channel S. In our study of the 16 crystallographic structures, we observe that channel 2f opens for CV in the range 5.75–8.75 Å, depending on the ligand size. In 2V0M, channels 2a and 2f have a common part and separate at the exit, in helix F', at the hydroxy group of Thr224. We found that the second ketoconazole of structure 2V0M is oriented from the heme in the direction of channel 2f. At CV*lim* = 7 Å, channel 2f appears as a dead end, not connected to the exterior (Figure 2a). It has a total volume of 49,200 Å<sup>3</sup> and a bounding surface of 48,100 Å<sup>2</sup>. The MCP in channel 2f has a volume of 851 Å<sup>3</sup> and a bounding surface of 360 Å<sup>2</sup> in the holo form. A bottleneck at CV < 6 Å obstructs the connection to the exterior of the protein. An opening at CV = 5.75 Å and a slight channel flexibility should permit the second ligand (CV = 6.52 Å) to enter. Channels 2a and 2f are bordered by SRS-2, lying above the active site cavity.
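The last step of this reasoning is a comparison between the critical thickness of the ligand and the bottleneck of the channel. A minimal sketch in Python, assuming (as our own simplification, not a CCCPP output) that the flexibility needed is simply the positive difference between the two CV values; the function and constant names are hypothetical:

```python
def required_flexibility(ligand_cv, channel_cv):
    """Widening (in Å) the bottleneck needs for the ligand to pass.

    Assumption: the need is the positive part of ligand_cv - channel_cv;
    zero means the ligand fits through the bottleneck as computed.
    """
    return max(0.0, ligand_cv - channel_cv)


# Values quoted in the text for channel 2f of 2V0M and the second ketoconazole.
CHANNEL_2F_OPENING_CV = 5.75  # Å
SECOND_LIGAND_CV = 6.52       # Å

gap = required_flexibility(SECOND_LIGAND_CV, CHANNEL_2F_OPENING_CV)
print(f"channel 2f must widen by about {gap:.2f} Å")
```

With the quoted values the gap is about 0.77 Å, which is the order of magnitude meant by "a slight channel flexibility".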

Channels 2a and 2f are hydrophobic. They have a common part, then separate near the hydroxy group of Thr224, on helix F' (see Figure 2c). Channel 2a is bordered by SRS-1, SRS-2 and SRS-3, and channel 2f is bordered by SRS-2 and SRS-6. Both channels 2a and 2f open at the C-terminal region of helix F and in the F-F' loop, forming SRS-2 (near the active site cavity). The other part of the entry of channel 2a lies at the N-terminal region of helix G and in the G-G' loop, forming SRS-3 (see Figure 5a). The B-C loop/B' helix (bordering channel 2a) was identified as a putative SRS-1 [70] involved in substrate binding. We found with CCCPP that the opening of channel 2a is apolar for 1TQN, located at His28, Phe228, Ile232 and Val235. It is more polar for 2V0M, located at Lys35, Phe46, Phe228, Val235, Asp380, and Lys390. For both 1TQN and 2V0M, the entry of 2a is bordered on one side by the hydrophobic helix G' (residues 230–237; see Figure 6a,b), which is anchored in the membrane (see Figures 3 and 6a,c). The following residues of channel 2a have the same orientation in the apo form 1TQN and in the holo form 2V0M: Pro227, Phe228, Leu229, Ile230, Pro231, Ile232, Leu233, Glu234, Val235, and Leu236. Channel 2a enlarges by ΔCV ~ 1 Å and channel 2f opens to accept two ketoconazoles in 2V0M. The superposition of 1TQN and 2V0M shows that the two B-C loops are superposed due to a move of the B-C loop in 2V0M at Phe108, Gly109, and Pro110, at the exterior of the funnel area, which permits the entrance of the ligand into 2a and 2f. These two channels are bordered by the B-C loop (see Figure 3a,b). In the apo form, Phe108 obstructs the channel. Its move enlarges the common space admitting the two ketoconazoles. In 2V0M, Arg106 (in the B-C loop) and Glu374 line channel 2a. Tyr53 and His54 are on the anchored helix A, without change between the apo form 1TQN and the holo form 2V0M.
These two poorly flexible residues are not involved in the move of the bottleneck of channel 2f. This move is thus due to the move of the F-F' loop. The two bottleneck residues, Phe215 and Leu216, are at the mouth of 2f after the conformational change of the F-F' loop; Phe215 obstructed the opening of 2f in the apo form 1TQN (Figure 2b).
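The apolar versus more polar character of the two mouths of channel 2a can be checked by counting residues. The sketch below uses the residue lists given above. The polar/apolar classification (His, Lys and Asp counted as polar; Phe, Ile and Val as apolar) is a common textbook convention that we assume here, not the criterion used by CCCPP, and all names are hypothetical.

```python
# Assumed classification by three-letter residue type (textbook convention).
POLAR_TYPES = {"His", "Lys", "Asp"}

# Residues at the opening of channel 2a, as listed in the text.
MOUTH_2A = {
    "1TQN": ["His28", "Phe228", "Ile232", "Val235"],
    "2V0M": ["Lys35", "Phe46", "Phe228", "Val235", "Asp380", "Lys390"],
}


def polar_fraction(residues):
    """Fraction of residues whose three-letter type is in POLAR_TYPES."""
    polar = [name for name in residues if name[:3] in POLAR_TYPES]
    return len(polar) / len(residues)


for structure, residues in MOUTH_2A.items():
    print(structure, f"polar fraction = {polar_fraction(residues):.2f}")
```

With this convention the mouth goes from one polar residue out of four in 1TQN to three out of six in 2V0M. A different classification (for instance, treating His as apolar) would change the numbers but not the direction of the trend.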

**Figure 6.** (**a**) Channels 2a and 2e within lining secondary structures of 1TQN (*β*<sup>1</sup> sheet, F-G and B-C blocks): the two mouths are neighboring, then the channels separate at Pro107 and Phe108 in B-C loop. (**b**) Channels 2a, 2f and 2e of 2V0M enclosed by *β*<sup>1</sup> sheet, F-G and B-C blocks, and C-terminal loop; channels 2a and 2e are separated at Pro107 and Phe108 in B-C loop. Channels 2a and 2f are separated by Thr224 in helix F'. (**c**) Same as (**a**), rotated: the bottom part of this figure is a part of Figure 1 in [27] (©American Chemical Society, https://pubs.acs.org/doi/abs/10.1021%2Fja4003525). Membrane-bound form of CYP3A4 shows the location of the mouths of 2a and 2e in 1TQN and highlights residues inserting into the membrane and interacting with it via the hydrophobic side chains of helices F' and G' and of A-anchor. A-anchor has a fixed position on A'A" loop, and is formed by residues His28, Leu44, Pro45, Phe46, Leu47 and Gln78: it is labeled in dark blue and boxed. Channel 2a leads to the membrane; channel 2e leads to cytosol; the membrane is viewed on the side of the globular domain; the transmembrane helix is visible; and the location of the organic phase and the short-tailed lipids are represented. (**d**) Channel 2a represented by three MCPs, respectively, in purple, yellow and cyan, surrounded by secondary structures. The three MCPs have a common part in channel 2a.

The F-F' loop is hydrophobic and has three charged residues: Arg212, Asp214 and Asp217. Its move may act as a closing/opening valve, at Leu211, Arg212, Phe213, and Phe215 (Figure 2a,c). The F-F' loop in a closed conformation is exposed to block 1. Its residues Arg212, Phe213 and Phe215 border the body of the hydrophobic channel 2a in the apo form 1TQN. In the latter, only one polar residue, Arg212 (oriented toward the active site: Figure 2a), borders channel 2a. Arg212 was suggested to regulate solvation of the active site [61], and was reported to assist the binding of ligands to the active site [76,77]. Phe215 is in the F-F' loop, bordering channel 2a in conformation C (Figure 2a). Phe215 is involved in the orientation of substrates within CYP active sites and in the catalytic mechanisms of these substrates [78]. In 2V0M, the F-F' loop is distorted to open channel 2f. The F-F' loop residues are oriented either in the same direction (Lys208, Lys209, Leu210, and Asp214), in the opposite direction (Leu211, Arg212, Phe215, and Leu216), or in the same sense but shifted (valves: Phe213, Asp217, and Pro218). The opening of channel 2f is at Leu51, His54, Lys55, Phe215, Pro218 and Lys486. Helix F' borders channel 2f with apolar residues Leu221, Phe220 and Thr224. In the holo form, an opening occurs via the move of the F-F' loop such that Phe215 (bordering 2a in the apo form) is pushed toward block 2. This push of Phe215 toward the surface is useful in two ways: (i) it removes the steric hindrance, freeing space to open channel 2f and to enlarge the cavity near the part common to 2a and 2f; and (ii) it borders the entry of channel 2f, as Leu216 goes to the mouth of 2f at the water interface of the membrane/cytosol to catch the second hydrophobic molecule (see Figure 2c). The same phenomenon is observed in the structure 4K9U, which hosts two molecules of GS5 (a ritonavir analog [79]), with Log(P) = 3.06.
This phenomenon is observed neither for 4K9T, which hosts one GS4 molecule (another ritonavir analog [79]), nor for 2J0D, which hosts the bulky erythromycin molecule. For these two more hydrophilic molecules (Log(P) = 2.22 and 2.60, respectively), Phe215 is displaced without being located at the mouth of the computed channel 2f. Pro218 remains at the surface and borders the mouth of channel 2f: this makes the mouth of channel 2f more hydrophobic and thus helps to catch the hydrophobic ligand. Asp217 is at the entry of block 2 but is not exposed at the surface, i.e., it does not border the mouth of 2f, unlike Pro218, which is hydrophobic and borders the mouth of 2f. The residue rearrangements occur with the move of Leu211 (inside the protein in the apo form), which becomes oriented toward the active site in the 3A4-ketoconazole complex, bordering the common part of channels 2a and 2f near the active site, opposite Arg212, which is pushed to the other side of channels 2a and 2f (Figures 2c and 3a). This orientation can be correlated with the fact that substituting Leu211 by a phenylalanine residue affects the homotropic cooperativity in the binding of testosterone [80]. The presence of Leu211 compensates for the hydrophobicity lost from channels 2a and 2f after the departure of Phe215. Other significant changes are a shift of 4 Å of the C-terminal loop and a deformation of the crevice of helix I around Tyr307 (the C<sup>*α*</sup> is shifted by more than 2 Å) [81]. The C-terminal loop becomes closer to the common body of the two channels 2a and 2f, delineated by Gly481 and Leu482 near the active site (Figure 2a,c).

#### 3.2.2. Detection of Narrow Channels 2e and S

The narrow channel 2e appears at CV = 5.75 Å in 1TQN and at 5.5 Å in 2V0M. It appears with a total volume in the apo form of 44,500 Å<sup>3</sup> (bounding surface: 44,100 Å<sup>2</sup>) and a volume in the holo form of 51,000 Å<sup>3</sup> (bounding surface: 48,200 Å<sup>2</sup>). It is a rather common channel, observed in twelve different crystallized P450 isoforms [64]. Channel 2e was observed in the 16 crystallized 3A4 structures (available in the Protein Data Bank) considered in this study (see Table 1 in Section 3.3). We observed that channel 2e opens for CV between 5.5 Å and 6 Å, whether or not the protein is bound to a ligand. Channel 2e appears for all conformations of CYP3A4 (C, O1, O2). It threads through the B-C loop (Figure 6), and its opening could depend on the length and the flexibility of this loop. At the stage of the reaction at which these structures of CYP3A4 were crystallized, channel 2e is open within this range (ΔCV = 0.5 Å) and contains no ligand. Channel 2e is detected as a secondary egress route for substrates or products in molecular dynamics simulations of P450s [64]. It is also observed as a secondary exit pathway in simulations of P450eryF [75] and CYP2C5 [40]. It is difficult to conclude about the role of channel 2e as an egress route as long as there is no known structure of complexes with oxidation products. MD (Molecular Dynamics) studies give information on the dynamics of ligand tunnels (opening/closure), but do not involve simulation of the process of ligand egress. MD and SMD (Steered Molecular Dynamics) studies have mainly focused on the preferred exit tunnel of the ligand. It was found that channel 2e is an exit channel for the hydroxylated product of diazepam in 1TQN, and that channel S is an exit channel for 6-hydroxytestosterone [44,45]. This suggests that the small open channel 2e enlarges for ligand exit.
As shown in Section 1, a controlled hydration of the substrate-bound P450 active site is extremely important for catalysis. A solvent-filled channel from the bulk solvent to the heme lets water circulate, acting as a water pump. It is likely that its function is related to active site hydration, although it may also have a role in proton transfer (furthermore, channel 2e opens on the cytosol). It was suggested that a water channel exists between the B-C loop and the C-terminal part of helix B [72]. Channel 2e is displayed with channels 2a and 2f in Figure 7.

In the apo form, channel 2e exits at Lys115, Asp123, Glu124, Pro231, Val235 and Lys390, and channel S exits at Phe22, Pro23, Val235, Asp380 and Lys390. Channels 2a and 2e have a common part near the active site, then separate (Figure 6a,b). Channels 2e and S exit into the cytosol. The residues of the mouth of channel 2e are rather polar, while those of 2a and 2f are apolar (Figure 6c). For the 16 complexes considered in this study, the ligands were located in channels 2a, 2f and S, but none was found in channel 2e. This suggests that channel 2e could be an exit channel, either for oxidized products or for water or protons. Channel 2e may also serve to expel water from the access channels, freeing space for ligands.
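The exit residue lists of channels 2e and S in the apo form can be compared directly. A small sketch with Python sets follows; the names are ours (hypothetical) and the lists are those given above.

```python
# Exit residues in the apo form, as listed in the text.
EXIT_RESIDUES_APO = {
    "2e": {"Lys115", "Asp123", "Glu124", "Pro231", "Val235", "Lys390"},
    "S": {"Phe22", "Pro23", "Val235", "Asp380", "Lys390"},
}


def residue_number(name):
    """Sequence number of a residue written as three letters plus digits."""
    return int(name[3:])


shared = EXIT_RESIDUES_APO["2e"] & EXIT_RESIDUES_APO["S"]
only_2e = EXIT_RESIDUES_APO["2e"] - EXIT_RESIDUES_APO["S"]

# Sort by residue number so the printed order is stable.
print("shared:", sorted(shared, key=residue_number))
print("only in 2e:", sorted(only_2e, key=residue_number))
```

Val235 and Lys390 appear in both lists, so the two exits are neighbors on the protein surface, while the residues found only at the exit of 2e include the charged Lys115, Asp123 and Glu124.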

In the apo and holo forms, channel 2e is open simultaneously with channels 2a, 2f and S, suggesting that the access channels may alternatively open by an F-G move away from the B-C loop, without affecting channel 2e. This is in agreement with a study on the structure of CYP2C5 [82].

A contiguous water channel from the bulk solvent to the active site is possible [83]. Channel S, i.e., the solvent channel, is detected at CV = 5 Å in the apo form 1TQN. It was then computed at CV = 5.75 Å (slightly larger), as a host channel for one of the two GS5 molecules (a ritonavir analog) in the 4K9U complex [79]. It is flagged as important in several isoforms in [64,84,85]. It faces the cytosol and contains a charged gating residue which could lead the product out of the CYP3A4 active site [44]. MD simulations of the expulsion of temazepam and 6*β*-hydroxytestosterone out of CYP3A4 were performed: channel S has the largest opening for these two products, so that it may be an exit route [44]. Channel S was proposed as a route controlling water access to and egress from the active site, based on its observation in P450-BM3 [84]. It may also be used for substrate egress [84,86]. It was proposed as the main gateway to the active site of the human 2D6 [86].

**Figure 7.** Superposition of the channels of 2V0M. The dashed lines separate blocks 1 and 2. (**a**) Channels 2a and 2f computed at CV = 5.75 Å. (**b**) Channels 2a, 2f and 2e, computed at CV = 5.50 Å.

#### 3.2.3. Channels Opening/Closing and Interactions with the Lipid Bilayer

In a recent paper [87], it is stated that access channels to the buried active site control substrate specificity in CYP1A P450 enzymes. In a recent review [33], it is further suggested that the network of channels is involved in the control of substrate specificity for the whole P450 family. The diversity and the deformability of the channels could explain the diversity of their substrates. We hypothesize that the specificity of the CYPs relies on what happens at the entry of the channels and that, due to steric constraints, the orientation of the ligand does not change until it reaches the active site. It was suggested that the channels are often gated by aromatic residues all along them [73]. It was also suggested from the analysis of crystal structures that aromatic residues can form a network of gates, which cooperatively regulates the opening and closing of different tunnels [88]. Except Tyr53 for channel 2f, the following gating residues are phenylalanines:


Mammalian CYPs are generally membrane-bound proteins [89,90]. The mechanisms of substrate access and product egress in the mammalian membrane-bound P450s may differ from those in the soluble bacterial P450s studied before [53]. The microsomal CYPs are anchored in the membrane by an N-terminal transmembrane alpha-helix, and there is evidence that their globular domain dips into the membrane [91]. A membrane-bound model of human CYP3A4 provides a structure of the protein-membrane complex consistent with most experimental data [51]. Membrane binding of the globular domain of CYP3A4 significantly reshapes the protein at the membrane interface, where most channels open, inducing conformational changes relevant to access tunnels [27].

CYP substrates are rather hydrophobic and, in the case of membrane-bound P450s, they are expected to come from the lipid bilayer. As the products of the P450-catalyzed reactions are more hydrophilic, they may be released into the aqueous environment or the polar headgroup region rather than back into the lipid bilayer. The multiplicity of channels suggests possibilities for ligand channelling to and from the P450s sitting in or on the membrane. The P450 protein topology favors channel formation on the distal side of the heme. The proximal side is the likely reductase binding site and corresponds to the smallest channels found by CCCPP.

Helices F' and G' do not completely insert into the membrane, with helix G' establishing a closer contact with the membrane than helix F'. CYP3A4 is anchored into the membrane by helix G', which partitions mainly within the headgroup region [27]. The mouth of channel 2f is bordered by helix F' at the water/membrane interface. It was assumed from modeling studies that channel 2f opens upon arrival of the molecule [92,93]. Channel opening was observed as a consequence of ligand-induced conformational changes [94]. The mouths of channel 2f that we computed have hydrophobic residues in the F-F' loop: Phe213, Phe215, Leu216 and Pro218; thus, the interaction with the channel mouth is facilitated for hydrophobic molecules.

Given the dynamic nature of membrane-anchored CYPs, the precise positions of channels may change dynamically over time [93]. We retained for our calculations the positioning used in [27].

The dynamic motions of the protein can cause the opening of channels not seen in the crystal structures, as well as changes in the relative dimensions of the channels [36,40,75]. Even though these motions cannot be seen in the crystal structures, the location of the channels in these structures, supported by their capacity to hold ligands, provides a useful basis for exploring ligand access and egress routes, particularly when the snapshots from different crystal structures are considered together.

The mammalian CYPs are characterized by a subdivision of their larger F-G region into F' and G' helices [51]. Insertion of the F'-G' helix-loop of CYP3A4 in the membrane moves the *β* domain towards the heme plane, allowing channel 2a to open, whereas this opening does not occur in soluble bacterial P450s. In the latter, the *β* domain plane is farther from the heme plane: the opening occurs between the F-G loop and the B helix [26], at the level of the opening between the F-F' loop and the C-terminal loop in 3A4, corresponding to channel 2f. The opening of block 2 (due to the move of the F-F' loop), which characterizes 3A4, offers a more diverse exterior environment for the compound than the prokaryotic CYPs, which offer only the solvent at the exterior of channel 2a as environment [36,75].

#### *3.3. Characteristics of the Four Major Channels*

For convenience, we summarize these characteristics in Table 2. The four major channels contain substrate recognition sites (SRS). We also summarize in Table 1 the characteristics of the channels computed by CCCPP for the crystallographic structures of CYP3A4 considered in the present study. We recall that CV is the value of the channel bottleneck (see Section 2). When the critical thickness CV of a ligand exceeds CV*lim*, the bulky part of the ligand is not in the bottleneck: the passage of the ligand requires flexibility. We also recall that the topology of the channels may consist of several MCPs having one or several common parts. This was observed for 2J0D and 4K9T, but for clarity in these cases we provide only the CV values of the main paths in Table 1. We show the four channels in Figure 8 and we summarize the location of the four channels as follows:
