#### **2. Results**

The results are presented in the following order. First, the major principles of spectral tuning that make the proposed models possible are described. Then, the models are introduced and applied, as an example, to mutants of bovine rhodopsin and sodium pumping rhodopsin. Finally, the limits of applicability of these models are discussed.

#### *2.1. Steric and Electrostatic Factors in Rhodopsin Spectral Tuning*

The tuning of the rhodopsins' first spectral band has been widely investigated [13,16,17,19,20]. Two factors are found to be the most important: the steric and electrostatic interactions of PSBs with surrounding amino acids of the opsin. Several other factors, such as the polarization of the retinal environment or inter-residual charge transfer within the binding pocket, have also been studied but were found to be less significant [21,22].

A substantial modification of the steric interaction between the protein pocket and the chromophore can lead to the distortion of the chromophore and, consequently, to a change in *λmax*. If this is the case, Δ*λmax* evaluation requires detailed information about the chromophore geometrical parameters. Generally, the resolution of available X-ray structures is not good enough to obtain these parameters with the required precision. To date, geometrical parameters of sufficient quality can be obtained only from high-level computational models, and simpler approaches are hardly possible. To cause a prominent change in the steric interaction of PSB with opsin, one has to either introduce/remove a bulky residue in the protein binding pocket or introduce/remove a distant residue that causes a substantial deformation of the binding pocket. Although modification of steric interactions for *λmax* tuning occurs in nature [23], rational design of rhodopsins with specific *λmax* by adjustment of steric interactions is not straightforward.

In contrast, the electrostatic tuning mechanism is not only quite common in nature but can also be more easily utilized for rational design. Due to the charge transfer character of the PSB *S*0 → *S*1 transition [24,25], *λmax* is very sensitive to the electrostatic field generated by the amino acids constituting an opsin. The contribution of any residue to this electrostatic field is primarily determined by its charge or dipole moment and its position relative to the chromophore. The contributions of quadrupoles and higher-order multipoles can be neglected [26]. Generally, an amino acid substitution can change *λmax* in two ways: either directly or indirectly. In the first case, Δ*λmax* arises from the substitution of the original amino acid by an amino acid with a different charge/dipole moment. In the second case, Δ*λmax* is caused by a substitution that induces reorganization of the whole protein, including charged and polar residues, and thereby changes the electrostatic field in the chromophore region.

Unlike the steric part of Δ*λmax*, the electrostatic part can be treated by a simple, practical model. To make this possible, the following assumptions must be made:


The first point is a widely used approximation in rhodopsin *λmax* modeling. Although more molecular dynamics studies are necessary to understand the limits of applicability of this assumption, inhomogeneous broadening of the absorption band has not yet been reported for rhodopsins, in contrast to some other photosensitive proteins [27,28]. Moreover, static QM/MM models have already been tested intensively and proven able to reproduce experimental *λmax* for dozens of rhodopsins [10,12,13,16,29,30]. It is worth mentioning that a system of interest can consist of a mixture of two or more stable forms, such as the 13-cis and all-trans retinal-containing forms in the ground state of bacteriorhodopsin or the different protonation states of titratable residues in Anabaena sensory rhodopsin [31]. If this is the case, several representative snapshots should be taken into account. Points 2 and 3 are common coarse-grained approximations with well-known limitations [32,33]. For the last three points, additional justification and discussion are given in the section "Limitations of the proposed models" and in the Supplementary Materials, so as not to distract readers from the main subject.

If the statements above are assumed, a two-dimensional grid for charges and a three-dimensional grid for dipoles need to be calculated only once with a robust quantum mechanical method; one can then use graphical representations, tables, or fitting by suitable functions for fast data acquisition. An approximate evaluation of Δ*λmax* using these tabulated data requires only the charges/dipoles of the altered residues and their positions relative to the chromophore. In other words, all required information can be obtained from an X-ray structure or a structure predicted by comparative modeling.

**Figure 1.** (**a**) Illustration of the "cylindrical" symmetry assumption for the protonated Schiff base (PSB). The impact of a charged/polar residue on *λmax* depends only on its charge/dipole moment and its distance to/orientation along the chromophore axis. (**b**) Spectral shift values caused by a unit negative charge (denoted as '−1') located at 5 Å from different reference atoms of all-trans PSB. Negative spectral shift values (reference atoms C15, C13, C11, C9) are presented in blue; positive spectral shift value (reference atom C7) is presented in red.

#### *2.2. Models to Evaluate the Direct Electrostatic Effect of Charged Residues*

For each chromophore and negative/positive charges, we derived the numerical function that relates the position of a charged residue to the chromophore and its impact on *λmax* (Figure 2).

The geometries of the chromophores were kept as calculated in the gas phase, i.e., without external charges. We constructed a grid for the placement of unit charges as follows: We plotted thirteen grid lines from thirteen PSB reference atoms perpendicular to the chromophore axis (see Figure 2). Along each grid line, we placed unit charges at fixed distances from the retinal chromophore, from 3 Å to 18 Å with a 1 Å interval (16 points along each grid line). In total, we performed 13 × 16 = 208 calculations for each charge and for each chromophore.
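The grid bookkeeping described above can be sketched in a few lines of Python. This is only an illustration of the counting: the reference-atom labels are placeholders (the text states that there are thirteen reference atoms but this sketch does not assume which ones), and `charge_grid` is a hypothetical helper name.

```python
from itertools import product

# Placeholder labels for the thirteen PSB reference atoms (assumption:
# the actual atom names are those shown in Figure 2, not these labels).
REFERENCE_ATOMS = [f"ref{i}" for i in range(1, 14)]

# Distances from the chromophore: 3 Å to 18 Å inclusive, 1 Å interval.
DISTANCES_ANGSTROM = list(range(3, 19))


def charge_grid(atoms=REFERENCE_ATOMS, distances=DISTANCES_ANGSTROM):
    """Return every (reference atom, distance) grid point for one unit charge."""
    return list(product(atoms, distances))


grid = charge_grid()
print(len(REFERENCE_ATOMS), len(DISTANCES_ANGSTROM), len(grid))
```

Each grid point corresponds to one calculation per charge sign and per chromophore, so the same enumeration would be repeated for the four charge/chromophore combinations.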

**Figure 2.** Grid representing the positions of a unit charge relative to the all-trans PSB. At each grid point we performed an ab initio calculation of Δ*λmax* values for the positive and negative unit charges.

The calculated reference absorption maxima for the 11-cis and all-trans PSB without external charges *λref max* were found to be 595 nm and 596 nm, respectively. For each chromophore and charge type and position, we performed the SORCI+Q calculation of the absorption maximum value (*λimax*). Then, we derived the effect of the charged residue placed at a certain position relative to the retinal chromophore as Δ*λimax* = *λimax* − *λref max*. The results of these calculations are illustrated as 2D functions Δ*λimax* (reference atom, distance) in Figure 3, Figure S1 and Tables S1–S4.
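As a minimal sketch of the definition above, the shift can be computed relative to the gas-phase reference of the corresponding chromophore. The reference values (595 nm and 596 nm) are taken from the text; the function names and the example input of 635 nm are hypothetical and serve only to illustrate the sign convention (positive shift = red shift, negative shift = blue shift).

```python
# Reference absorption maxima without external charges (values from the text).
REFERENCE_LAMBDA_MAX_NM = {"11-cis": 595.0, "all-trans": 596.0}


def delta_lambda_max(lambda_i_nm, isomer):
    """Shift of the absorption maximum relative to the charge-free chromophore."""
    return lambda_i_nm - REFERENCE_LAMBDA_MAX_NM[isomer]


def shift_label(delta_nm):
    """Name the direction of a spectral shift from its sign."""
    if delta_nm > 0:
        return "red shift"
    if delta_nm < 0:
        return "blue shift"
    return "no shift"


# Hypothetical input: a calculated maximum of 635 nm for the 11-cis PSB.
d = delta_lambda_max(635.0, "11-cis")
print(d, shift_label(d))
```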

The four panels in Figure 3 correspond to the four considered systems: (a) a negative charge, 11-cis chromophore; (b) a positive charge, 11-cis chromophore; (c) a negative charge, all-trans chromophore; (d) a positive charge, all-trans chromophore. These 2D functions allow us to estimate Δ*λmax* caused by a charged residue placed at a certain position in the rhodopsin. For example, a positively charged residue (lysine, arginine, or protonated histidine) placed at 7 Å from the C14 PSB atom would cause a red shift of approximately +40 nm for the 11-cis chromophore and +36 nm for the all-trans chromophore. On the other hand, a negatively charged residue (glutamic or aspartic acid) at the same position would cause a blue shift of approximately −40 nm for the 11-cis chromophore and −35 nm for the all-trans chromophore.

Several well-known rules/patterns can also be clearly seen from the plots in Figure 3:


The effect of charged residues on *λmax* is slightly larger for 11-cis PSB than for all-trans PSB (Figure 3). To rationalize this fact, we calculated the charge distributions in the ground and the first excited states of 11-cis and all-trans PSB. The calculations were performed at the CASSCF/6-31G\* level of theory. We found that the portion of positive charge, which is translocated from the NH region to the beta-ionone ring region upon photoexcitation, is larger for 11-cis PSB (0.30) than for all-trans PSB (0.21).

**Figure 3.** The impact of a unit charge on PSB *λmax* as a 2D function of its position relative to the chromophore. (**a**) 11-cis PSB, negative charge; (**b**) 11-cis PSB, positive charge; (**c**) all-trans PSB, negative charge; (**d**) all-trans PSB, positive charge.

#### *2.3. Models to Evaluate the Direct Electrostatic Effect of Polar Residues*

To derive models for polar residues, we followed a strategy similar to the approach used in the previous section. Δ*λimax* caused by a polar residue depends not only on the distance of this residue to the chromophore but also on the orientation of its polar group (see Figure 4). Therefore, we calculated a numerical function that relates Δ*λimax* to both the distance to the chromophore and the angle between the dipole and the chromophore axis. Each dipole was represented by two charges of opposite sign (−0.43 and +0.43) situated 1.0 Å from each other (see Figure 4). The magnitudes of the charges and the distance between them were chosen to represent the Amber parameters for the -OH group of polar residues, such as Ser or Thr. The geometries of 11-cis PSB and all-trans PSB were kept as calculated in the gas phase, i.e., without an external electrostatic field. Four atoms of the 11-cis and all-trans chromophore (N16, C12, C8, C4) were taken as the reference atoms. We plotted four grid lines from these four reference atoms perpendicular to the chromophore (see Figure 4). Along each grid line, we placed the center of the dipole at fixed distances from the retinal chromophore, from 3.5 Å to 6.5 Å with a 1 Å interval (4 points along each grid line in total). The orientation of the dipole was varied by changing the angle *γ*, defined as the angle between the chromophore axis and the line connecting the oxygen and hydrogen atoms. *γ* was varied from 0° to 360° in steps of 30°. The results of our calculations are presented in Figures 5 and 6 and Tables S5 and S6. We also performed spline interpolation of the calculated data to plot the largest possible negative and positive contributions of polar residues to *λmax* as functions of the dipole moment position along the chromophore axis (Figure 7).
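The two-charge dipole described above can be placed on the grid with a short Python sketch. The charge magnitude, separation, distances, and angle step come from the text. The coordinate frame is an assumption (x along the chromophore axis, y along the grid line), as is the direction convention for *γ* (pointing from the negative to the positive charge); `dipole_charges` is a hypothetical helper name.

```python
import math

CHARGE_E = 0.43                 # magnitude of each point charge, in e (from the text)
SEPARATION_A = 1.0              # distance between the two charges, in Å (from the text)
E_ANGSTROM_TO_DEBYE = 4.80320   # 1 e·Å expressed in debye

DISTANCES_A = [3.5, 4.5, 5.5, 6.5]       # dipole-center distances along a grid line
ANGLES_DEG = list(range(0, 361, 30))     # gamma values; 0° and 360° coincide


def dipole_charges(x_ref, distance, gamma_deg):
    """Return [(charge, x, y), (charge, x, y)] for the model dipole.

    Assumed frame: x runs along the chromophore axis and y along the grid
    line; x_ref is the axial coordinate of the reference atom. gamma_deg is
    measured from +x toward +y, from the negative to the positive charge.
    """
    gamma = math.radians(gamma_deg)
    half = SEPARATION_A / 2.0
    dx, dy = half * math.cos(gamma), half * math.sin(gamma)
    return [(-CHARGE_E, x_ref - dx, distance - dy),
            (+CHARGE_E, x_ref + dx, distance + dy)]


# Dipole moment of the model -OH group implied by these parameters.
dipole_moment_debye = CHARGE_E * SEPARATION_A * E_ANGSTROM_TO_DEBYE
print(round(dipole_moment_debye, 2))
```

With these parameters the model dipole is roughly 2 D, and every combination of reference atom, distance in `DISTANCES_A`, and angle in `ANGLES_DEG` defines one dipole placement.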

**Figure 4.** Grid representing the positions of a dipole moment relative to the all-trans PSB. At each grid point, we performed the ab initio calculation of Δ*λmax* values for different orientations of the dipole moment relative to the chromophore axis.

**Figure 5.** Impact of the dipole moment on *λmax* of the 11-cis PSB as a function of the angle between the dipole and the chromophore axis. Functions were calculated at different positions of the dipole relative to the chromophore. Dipoles were located at the distances 3.5 Å, 4.5 Å, 5.5 Å, 6.5 Å from the C4 (**a**), C8 (**b**), C12 (**c**), and N16 (**d**) chromophore atoms along the grid line perpendicular to the chromophore axis (see Figure 4). Dots represent the ab initio calculated values. Functions were derived as the spline interpolation of the calculated data.

**Figure 6.** Impact of the dipole moment on *λmax* of the all-trans PSB as a function of the angle between the dipole and the chromophore axis. Functions were calculated for different positions of the dipole relative to the chromophore. Dipoles were located at the distances 3.5 Å, 4.5 Å, 5.5 Å, 6.5 Å from the C4 (**a**), C8 (**b**), C12 (**c**), and N16 (**d**) chromophore atoms along the grid line perpendicular to the chromophore axis (see Figure 4). Dots represent the ab initio calculated values. Functions were derived as the spline interpolation of the calculated data.

**Figure 7.** Largest possible negative and positive contributions of dipole moments to *λmax* of 11-cis PSB (**a**) and all-trans PSB (**b**) as functions of a dipole moment position along the chromophore axis. Functions were calculated at different distances of the dipole moment from the chromophore. Dots represent the ab initio calculated values. Functions were derived as the spline interpolation of the calculated data.

Several rules/patterns can be derived from the plots presented in Figures 5 and 6.


this dipole). Therefore, to estimate the effect of a polar residue on the rhodopsin absorption maximum, accurate structural information is required.

The effect of polar residues on *λmax* is slightly larger for 11-cis PSB than for all-trans PSB. This fact can be rationalized by the difference in the portion of positive charge translocated from the NH region to the beta-ionone ring region upon photoexcitation, as discussed in Section 2.2.

Only the rotation of a dipole moment in the grid plane (Figure 4) was considered. Due to the symmetry of the system, the dipole moment component that is perpendicular to the grid surface should have a negligible effect on *λmax*. To verify this, we located the dipole moment at 3.5 Å from the N16 PSB atom and rotated it perpendicular to the grid surface; Δ*λmax* values were calculated with a step of 30°. The calculated spectral shift values did not exceed 1.5 nm, which is within the limits of QM calculation error.

#### *2.4. Application of the Proposed Models to Evaluate the Direct Effect of Amino Acid Substitutions*

Using the protocol described in the Methods section, we applied the proposed models to evaluate the direct effect of twelve amino acid substitutions in bovine rhodopsin (Rh) and sodium pumping rhodopsin (KR2). For charged residues, the data presented in Figure 3 were used to obtain the correspondence between their position and the possible spectral shift. For polar residues, the data presented in Figures 5–7 were used to evaluate the range of possible spectral shift values due to the lack of information about the orientation of a polar group relative to PSB. The exact orientation of a polar group could be defined only by constructing the corresponding protein QM/MM model. The estimated direct effect values were compared with the experimentally observed and QM/MM calculated spectral shift values.

According to the proposed models, for seven mutants, the direct effect of amino acid substitution completely explains the experimentally observed spectral shift (Figure 8 and green color-coding in Table 1). These estimations were also confirmed by the analysis of the corresponding QM/MM models. For five mutants, the direct effect of amino acid substitution cannot completely explain the experimentally observed spectral shift, and the indirect effect has to be taken into account (Figure 9 and brown color-coding in Table 1). The analysis of the corresponding QM/MM models confirmed that these five substitutions cause structural reorganization of the protein (Figure 10). Reorganization can include three components: (1) reorientation of charged/polar residues in the protein due to the substitution; (2) addition/deletion of water molecules, which possess a dipole moment and can therefore affect *λmax*; (3) a change in the protonation state of titratable residues. Below, we describe the indirect effect of the considered amino acid substitutions in more detail.


(5) The structural reorganization caused by W265F replacement in Rh (4.9 Å from C4 PSB atom) involves the reorientation of the polar Y191 residue and the addition of three water molecules in the increased cavity at the substitution site. According to our QM/MM model, W265F replacement has a non-negligible effect on retinal geometry. The spectral shift related to retinal geometry modification is −8 nm, while the experimentally observed spectral shift is −18 nm.

Overall, the proposed models can be applied not only to estimate the direct effect of amino acid substitution but also to determine if the indirect effect of amino acid substitution occurs.
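The comparison logic described in this section can be sketched as follows. This is an illustrative reading, not the authors' procedure: the direct effect is taken as the contribution of the new residue minus that of the wild-type residue (as suggested by Figures 8 and 9), and a substitution is flagged as involving an indirect effect when the experimental shift falls outside the estimated direct range. The tolerance value and all numbers in the usage lines are hypothetical.

```python
def direct_shift(contribution_new_nm, contribution_wt_nm=0.0):
    """Direct effect: contribution of the new residue minus the wild-type one."""
    return contribution_new_nm - contribution_wt_nm


def classify_substitution(exp_shift_nm, direct_min_nm, direct_max_nm,
                          tolerance_nm=5.0):
    """Return 'direct' if the experimental shift lies within the estimated
    direct range (widened by a hypothetical tolerance), else 'indirect'.

    For charged residues the range collapses to a single value; for polar
    residues it spans the possible orientations of the polar group.
    """
    low, high = sorted((direct_min_nm, direct_max_nm))
    if low - tolerance_nm <= exp_shift_nm <= high + tolerance_nm:
        return "direct"
    return "indirect"


# Hypothetical numbers, for illustration only.
print(direct_shift(0.0, -20.0))                  # removing a residue that contributed -20 nm
print(classify_substitution(10.0, 8.0, 14.0))    # within the estimated range
print(classify_substitution(-18.0, -3.0, 2.0))   # outside the estimated range
```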

**Figure 8.** Bovine rhodopsin (Rh) and sodium pumping rhodopsin (KR2) mutants, for which the direct effect of amino acid substitution completely explains the experimentally observed spectral shift. Experimental spectral shift values are shown in black. The distances from the wild-type residues to the PSB and corresponding contributions to *λmax* are shown in blue. The evaluated contributions of new residues to *λmax* are shown in green. The distances are given in Å.

**Figure 9.** Bovine rhodopsin (Rh) and sodium pumping rhodopsin (KR2) mutants, for which the direct effect of amino acid substitution cannot completely explain the experimentally observed spectral shift. Experimental spectral shifts are shown in black. The distances from the wild-type residues to the PSB and corresponding contributions to *λmax* are shown in blue. The evaluated contributions of new residues to *λmax* are shown in green. The distances are given in Å.

**Figure 10.** Structural reorganization caused by amino acid replacements. (**a**) E122A substitution in bovine rhodopsin (Rh); (**b**) P219T substitution in sodium pumping rhodopsin (KR2); (**c**) S254A substitution in sodium pumping rhodopsin (KR2).

**Table 1.** Spectral shifts in bovine rhodopsin (Rh) and sodium pumping rhodopsin (KR2) caused by amino acid substitutions. Δ*λexp max*—experimental spectral shift. Δ*λdirect max* —the magnitude of the spectral shift estimated by the proposed models (Figures 3, 5–7 and Tables S1–S6). Δ*λQM*/*MM max* —the spectral shift calculated with the quantum mechanics/molecular mechanics models of rhodopsin mutants. Color coding: green—substitutions that do not induce structural reorganization (direct spectral tuning), brown—substitutions that cause a substantial structural reorganization (indirect spectral tuning).


#### *2.5. Limitations of the Proposed Models*

As demonstrated in a number of experimental and computational studies [24,39–42], the positive charge located at the NH moiety of the chromophore after excitation partially relocates to the *β*-ionone ring moiety, making the NH part less positive and, accordingly, the *β*-ionone ring part more positive. This difference in the charge distribution between *S*0 and *S*1 states leads to different electrostatic interactions of the chromophore with external charges (see Figure 11).

The interaction between the charge density of the chromophore and the negative charge located in the NH region stabilizes the ground state more than the first excited state and, therefore, leads to a blue shift in the *S*0 to *S*1 band. On the other hand, a negative charge in the *β*-ionone region stabilizes the excited state more than the ground state, leading to a red shift. A positive charge in the NH region destabilizes the ground state more than the excited state, leading to a red shift. Finally, a positive charge in the *β*-ionone region destabilizes the excited state more than the ground state, leading to a blue shift (Figure 11).
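The four cases listed above reduce to a simple sign rule, which the following sketch encodes. The region labels and the function name are illustrative choices; the rule itself only restates the qualitative picture of the preceding paragraph and says nothing about the magnitude of the shift.

```python
def shift_direction(charge_sign, region):
    """Qualitative direction of the S0 -> S1 band shift.

    charge_sign: -1 or +1 for the external charge.
    region: 'NH' (Schiff base end) or 'beta-ionone' (ring end).
    """
    if charge_sign not in (-1, 1):
        raise ValueError("charge_sign must be -1 or +1")
    # +1 for the region that loses positive charge upon excitation (NH),
    # -1 for the region that gains it (beta-ionone ring).
    region_sign = {"NH": 1, "beta-ionone": -1}[region]
    return "blue shift" if charge_sign * region_sign < 0 else "red shift"


for sign in (-1, 1):
    for region in ("NH", "beta-ionone"):
        print(sign, region, shift_direction(sign, region))
```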

If Δ*λmax* were determined only by this "charge transfer" factor, and both the *S*0-to-*S*1 charge redistribution and the geometry of the chromophore did not depend on an external electrostatic field, the impact of each residue on Δ*λmax* would be independent of the rest of the residues; i.e., the Δ*λmax* contributions of individual residues would be additive. In fact, this additivity is broken by the polarization effect caused by any charged or polar residue. The additional electrostatic field modifies both the magnitude of the *S*0 → *S*1 charge transfer, due to different polarization of the ground and the excited states, and the ground state geometry of the chromophore, by changing the so-called bond length alternation (BLA), i.e., the averaged difference between single- and double-bond lengths of the chromophore (Figure 12). However, as demonstrated, for example, for *N. pharaonis* halorhodopsin [10], the contribution of these polarization effects to Δ*λmax* is much smaller than the contribution of the "charge transfer" effect, and, in the absence of reorganization of other protein residues, the Δ*λmax* additivity can be considered a good approximation.
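For readers unfamiliar with BLA, a minimal sketch of its evaluation is given below. It assumes one common reading of the definition above, namely the mean single-bond length minus the mean double-bond length along the conjugated chain; the bond lengths in the usage line are hypothetical and do not come from the calculations reported here.

```python
def bond_length_alternation(single_bonds_angstrom, double_bonds_angstrom):
    """BLA as mean single-bond length minus mean double-bond length (in Å).

    Assumption: the caller has already assigned each C-C / C-N bond of the
    conjugated chain to the 'single' or 'double' list.
    """
    if not single_bonds_angstrom or not double_bonds_angstrom:
        raise ValueError("both bond lists must be non-empty")
    mean_single = sum(single_bonds_angstrom) / len(single_bonds_angstrom)
    mean_double = sum(double_bonds_angstrom) / len(double_bonds_angstrom)
    return mean_single - mean_double


# Hypothetical bond lengths, for illustration only.
print(round(bond_length_alternation([1.46, 1.44], [1.36, 1.38]), 3))
```

A smaller BLA indicates more equalized bond lengths along the chain, which is the geometric change the external field is described as inducing.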

The proposed models assume that the impact of a charged/polar residue on *λmax* depends only on its charge/dipole moment and its distance to/orientation along the chromophore axis but not on the radial angle. To confirm this "cylindrical symmetry" assumption, we performed an additional set of calculations rotating negative unit charges at 4 Å around the 11-cis chromophore axis. The results confirm that the calculated Δ*λmax* depends only slightly on the radial angle (see Supplementary Table S7 for details).
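The "cylindrical symmetry" assumption can be made concrete by decomposing a residue position into an axial coordinate, a radial distance, and a radial angle with respect to the chromophore axis; under the assumption, only the first two matter. The sketch below is a generic geometric decomposition. How the chromophore axis and the zero of the radial angle are chosen is left to the caller and is an assumption of this example, as is the function name.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _scale(a, s):
    return tuple(x * s for x in a)


def _unit(a):
    n = math.sqrt(_dot(a, a))
    return _scale(a, 1.0 / n)


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def cylindrical_coordinates(point, axis_origin, axis_direction, reference_perp):
    """Return (axial position, radial distance, radial angle in degrees).

    reference_perp fixes the zero of the radial angle; only its component
    perpendicular to the axis is used. The point must not lie on the axis
    if a meaningful angle is needed.
    """
    u = _unit(axis_direction)
    rel = _sub(point, axis_origin)
    axial = _dot(rel, u)
    radial_vec = _sub(rel, _scale(u, axial))
    radial = math.sqrt(_dot(radial_vec, radial_vec))
    e1 = _unit(_sub(reference_perp, _scale(u, _dot(reference_perp, u))))
    e2 = _cross(u, e1)
    angle = math.degrees(math.atan2(_dot(radial_vec, e2), _dot(radial_vec, e1)))
    return axial, radial, angle


print(cylindrical_coordinates((2.0, 3.0, 4.0), (0.0, 0.0, 0.0),
                              (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```

Rotating a charge around the axis, as in the test calculations described above, changes only the third returned value.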

Finally, the proposed models do not take into account the possible distortion of the chromophore due to the steric interactions caused by amino acid substitution [23]. This effect can be accurately taken into account only by QM/MM models.

**Figure 11.** Difference in the charge distribution between *S*0 and *S*1 states of PSB leads to different electrostatic interactions of the chromophore with external charges.

**Figure 12.** The electrostatic spectral tuning mechanism in rhodopsins involves three main factors: 1. The "charge transfer" factor related to difference in charge distributions of ground and excited states of the PSB. The positive charge partially translocates from the NH region to the *β*-ionone ring region upon photoexcitation. Therefore, ground and excited states of the chromophore possess different interactions with external charges. 2. The modification of the bond length alternation (BLA) of the chromophore by the external electrostatic field. 3. Differences in polarization of the ground and excited states of the PSB by the external electrostatic field.

#### **3. Materials and Methods**

#### *3.1. Ab Initio-Based Models*

Geometries of 11-cis PSB (protonated Schiff base) and all-trans PSB were optimized at the B3LYP/6-31G\* level of theory. Absorption maxima values were calculated at the SORCI(6,6)+Q/6-31G\* level of theory. An electrostatic embedding scheme was used to include the effect of external charges. The calculations were performed with the ORCA program, version 3.0.3 [43].

#### *3.2. Evaluation of Spectral Shifts Caused by Amino Acid Replacements*

To evaluate Δ*λmax* values caused by amino acid substitutions, the corresponding three-dimensional structures of the wild-type proteins were used. For bovine rhodopsin (Rh), the 2.2 Å X-ray structure was used, RCSB code 1U19 [44]; for sodium pumping *Krokinobacter eikastus* rhodopsin 2 (KR2), the 1.8 Å structure was used, RCSB code 6RF6 [45]. The distance from the substituted amino acid to the closest atom of the retinal chromophore was measured using visualization software (VMD program, v.1.9.3) [46]. The pdb file of the X-ray structure was used without any preliminary modifications. Once the position of the residue was defined, we used the figures and tables presented in the Results section and the Supporting Information to determine the correspondence between the position of the residue and the possible spectral shift.
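The distance measurement was done in VMD; for readers who prefer a scripted check, an equivalent measurement can be sketched in plain Python. This example makes several assumptions: it reads the fixed-column ATOM/HETATM records of the PDB format, takes the minimum over all atom pairs between the chosen residue and the chromophore (the text does not specify which residue atom was used), and assumes the retinal residue name is `RET`, which should be checked against the actual file. Function names are hypothetical.

```python
import math


def parse_atoms(pdb_text):
    """Extract residue name, chain, residue number and coordinates
    from ATOM/HETATM records (fixed-column PDB format)."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            atoms.append({
                "resname": line[17:20].strip(),
                "chain": line[21],
                "resseq": int(line[22:26]),
                "xyz": (float(line[30:38]),
                        float(line[38:46]),
                        float(line[46:54])),
            })
    return atoms


def min_distance_to_retinal(pdb_text, resseq, chain="A", retinal_resname="RET"):
    """Smallest atom-atom distance (Å) between a residue and the chromophore.

    retinal_resname='RET' is an assumption; some files use another name.
    """
    atoms = parse_atoms(pdb_text)
    residue = [a["xyz"] for a in atoms
               if a["chain"] == chain and a["resseq"] == resseq
               and a["resname"] != retinal_resname]
    retinal = [a["xyz"] for a in atoms if a["resname"] == retinal_resname]
    if not residue or not retinal:
        raise ValueError("residue or chromophore atoms not found")
    return min(math.dist(p, q) for p in residue for q in retinal)
```

A typical call would read the file contents with `open(path).read()` and pass the residue number of the substituted position, for example `min_distance_to_retinal(text, 265)`.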

#### *3.3. QM/MM Models Construction*

To generate QM/MM models of rhodopsin mutants, we started from the corresponding wild-type X-ray structures. The amino acid substitutions were inserted into the wild-type X-ray structures using the Mutate Model algorithm implemented in the Modeller v.9.15 program package [47]. The algorithm replaces the indicated amino acid in the protein X-ray structure and optimizes its position, leaving other protein residues intact. The retinal chromophore was inserted into the models and bound to the proper lysine residue (11-cis PSB, Lys296 for Rh; all-trans PSB, Lys255 for KR2). Afterward, models were hydrated with the Dowser++ algorithm [48]; the configuration parameters for running Dowser++ and the parameter set for the PSB were described in our previous work [29]. The PROPKA program, version 3.1 [49], was used to calculate the pKa values of titratable residues (pH = 7.0) and assign their protonation states; hydrogen atoms were added with the pdb2pqr program, version 2.1.1 [50]. The obtained models were optimized gradually, first at the MM level (Amber96 force field [51], TIP3P for water) and then at the QM/MM level utilizing the hybrid two-layer ONIOM (QM:MM-EE) scheme (QM = B3LYP/6-31G\*; MM = AMBER96 for amino acids and ions, TIP3P for water, EE = electronic embedding). The ONIOM calculations were performed with Gaussian09 [52]. Fifty atoms of the retinal chromophore were included in the QM part; the link atom was placed at the N*Z*-C bond of Lys296. The SORCI+Q/6-31G\* method was used to calculate the PSB absorption maxima values in the opsin environment represented as Amber96 point charges. The absorption maxima calculations were performed with the ORCA program, version 3.0.3 [43]. The reliability of the applied methodology for rhodopsins was tested in several previous studies [10,12,13,29,30].
