**Modular Diversity of the BLUF Proteins and Their Potential for the Development of Diverse Optogenetic Tools**

#### **Manish Singh Kaushik 1, Ramandeep Sharma 1, Sindhu Kandoth Veetil 1, Sandeep Kumar Srivastava 2,\* and Suneel Kateriya 1,\***


Received: 7 April 2019; Accepted: 6 May 2019; Published: 19 September 2019

**Abstract:** Organisms can respond to varying light conditions using a wide range of sensory photoreceptors. These photoreceptors can be standalone proteins or represent a module in multidomain proteins, where one or more modules sense light as an input signal which is converted into an output response via structural rearrangements in these receptors. The output signals are utilized downstream by effector proteins or multiprotein clusters to modulate their activity, which could further affect specific interactions, gene regulation or enzymatic catalysis. The blue-light using flavin (BLUF) photosensory module is an autonomous unit that is naturally distributed among functionally distinct proteins. In this study, we identified 34 BLUF photoreceptors of prokaryotic and eukaryotic origin from available bioinformatics sequence databases. Interestingly, our analysis shows diverse BLUF-effector arrangements with a functional association that was previously unknown or thought to be rare among the BLUF class of sensory proteins, such as endonucleases, *tet* repressor family (tetR), regulators of G-protein signaling, GAL4 transcription family and several other previously unidentified effectors, such as RhoGEF, Phosphatidyl-Ethanolamine Binding protein (PBP), ankyrin and leucine-rich repeats. Interaction studies and the indexing of BLUF domains further show the diversity of BLUF-effector combinations. These diverse modular architectures highlight how the organism's behaviour, cellular processes, and distinct cellular outputs are regulated by integrating BLUF sensing modules in combination with a plethora of diverse signatures. Our analysis highlights the modular diversity of BLUF containing proteins and opens the possibility of creating a rational design of novel functional chimeras using a BLUF architecture with relevant cellular effectors. 
Thus, the BLUF domain could be a potential candidate for the development of powerful novel optogenetic tools for its application in modulating diverse cell signaling.

**Keywords:** photoreceptor; BLUF; modular domain; optogenetics

#### **1. Introduction**

Microorganisms respond to changing light conditions using an evolved repertoire of photoreceptors that perceive light and exert light-dependent control over regulatory 'output' domains [1]. Blue-light using flavin (BLUF) photoreceptors respond to blue light and are often coupled with different effector domains (enzymes or transcriptional regulators), generating a wide range of combinations that regulate photo-adaptive responses [2–5]. However, proteins consisting only of a BLUF domain with an extended C-terminus have also been reported, and their responses are controlled by light-dependent protein-protein interactions [6–9]. Upon illumination, the isoalloxazine moiety of the BLUF domain associated flavin chromophore (flavin adenine dinucleotide, FAD; flavin mononucleotide, FMN; or riboflavin, RF) absorbs blue light [10], and the domain undergoes structural rearrangements that modulate the communication between the BLUF and effector domains [11]. Unlike the complex mechanisms of photo-transformation in other photoreceptors, the BLUF domain, upon illumination, mainly shows a hydrogen bond rearrangement around the flavin cofactor, which causes a 10–15 nm red shift in the BLUF absorbance peak [11]. The photo-activation of the BLUF domain involves conserved glutamine and tyrosine residues [12–14]. The hydrogen bond rearrangement relies on a distinctive property of BLUF domains, namely photo-induced proton-coupled electron transfer (PCET), which enables them to switch between receptor and signaling states [15–17]. The photo-activation and structural rearrangements around the chromophore of the BLUF domain are transmitted as a signal for the activation of an associated effector domain. BLUF domains are considered an attractive model for investigating new paradigms of photo-induced signaling. BLUF domains have a modular architecture; hence, they may be functionally fused to different effector domains, as observed for the modular light oxygen voltage (LOV) photoreceptors [18–25]. Barends and coworkers have characterized a full-length active photoreceptor, BlrP1, from *Klebsiella pneumoniae*, which is composed of BLUF and EAL (Glutamate-Alanine-Leucine) as the sensor and output domains, respectively [4]. The EAL domain carries a conserved signature motif, hydrolyses cyclic dimeric GMP (c-di-GMP) and is involved in the regulation of motility, biofilm formation, virulence and antibiotic resistance in bacteria [26–29]. When exposed to light, the BLUF domain of BlrP1 activates the EAL domain via allosteric communication relayed through conserved domain-domain interfaces [4].
In *Escherichia coli*, YcgF is another photoactivated BLUF-EAL domain protein that has been reported [30]. However, unlike other EAL domain proteins, YcgF acts as a transcriptional regulator and controls the YcgF/YcgE pathway, which regulates the synthesis of small regulatory proteins. These small regulatory proteins modulate biofilm functions via the Rcs two-component pathway, necessary for *E. coli* to withstand adverse environments [30]. In many bacteria, proteins with tandem GGDEF (diguanylate cyclase; DGC)/EAL (phosphodiesterase) domains have also been reported to be involved in c-di-GMP turnover, which modulates a variety of functions, including the functional modification of cell surface components, the expression of extracellular signaling molecules, virulence and motility [31,32]. Photoactivated BLUF-associated adenylyl cyclase homology domains (CHD) have also been investigated and well characterized in several microorganisms, where they catalyze the conversion of ATP to cyclic AMP (cAMP), which regulates downstream signal transduction [33–38]. PAS (Per/ARNT/Sim) domains are among the most widely distributed domains involved in sensing variations in light, oxygen and redox potential and in binding small ligands [39]. The roles of PAS domains are diverse, and a few reports have demonstrated their involvement in chromophore attachment [40], light-regulated protein-protein interactions [41] and complementary chromatic adaptation (CCA) [42].

Optogenetics is a recently developed approach that combines genetic and optical methods and enables the light-controlled modulation of specific functions in any cell or tissue [43]. Microbial and algal opsins and natural light-regulated ion channels have been reported to be versatile and effective actuators for optogenetic applications [44]. However, BLUF domains, owing to their small size, solubility, reversibility, temporal precision and association with a wide variety of effectors, could be engineered for the light-dependent modulation of a wide range of cellular signaling processes [44]. A prominent example is the BLUF-containing photoactivated adenylyl cyclase from *Beggiatoa* sp. (bPAC), which efficiently activates cyclic-nucleotide-gated ion channels in neurons. A mutant variant of bPAC, BlaG, yielded a higher level of light-induced production of cGMP than of cAMP [11,45]. In the present study, we have characterized the modular diversity of BLUF domain-coupled proteins, which could be valuable for the development of novel synthetic photoswitches and could expand the scope of optogenetic modulation of cellular signaling upon functional expression in an appropriate living system.

#### **2. Materials and Methods**

#### *2.1. Database of Sequences used in this Analysis*

The BLUF domain encoding protein sequences were retrieved from the National Center for Biotechnology Information (NCBI; https://www.ncbi.nlm.nih.gov/), and each of them was subjected to a conserved domain search using the Conserved Domain Architecture Retrieval Tool (CDART; https://www.ncbi.nlm.nih.gov/Structure/lexington/lexington.cgi) [46]. Thirty-four uncharacterized BLUF domain-containing proteins were selected for further analysis. For each protein, the sequence encoding the BLUF domain was selected and aligned for the homology analysis using the BioEdit tool [47]. The Multiple EM for Motif Elicitation (MEME) suite (http://meme-suite.org) was employed to scan for conserved motifs throughout the sequences [48]. The interacting partners for each of the output domains were predicted using STRING version 11 [49].

#### *2.2. Phylogenetic Analysis*

A phylogenetic analysis involving 34 BLUF sequences from different organisms was performed by employing the Maximum Likelihood method, based on the JTT matrix-based model [50]. All positions containing gaps were eliminated from the sequences. The tree with the highest log likelihood (−955.3175) is shown. The values shown next to the branches represent the percentage of trees in which the associated taxa clustered together. The initial tree(s) were obtained by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the JTT model, and the topology with the higher log likelihood value was then selected. Phylogenetic analyses were performed using MEGA6 [51].
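The gap-elimination step described above (complete deletion) can be sketched in a few lines. This is a minimal illustration only, not the MEGA6 implementation; the function name `complete_deletion` and the toy sequences are hypothetical and are not real BLUF sequences.

```python
def complete_deletion(alignment, missing="-?"):
    """Drop every alignment column in which any sequence has a gap or missing-data symbol.

    `alignment` maps a sequence name to its aligned sequence (all of equal length).
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    names = list(alignment)
    # Walk the alignment column by column and keep only fully populated columns.
    columns = zip(*(alignment[name] for name in names))
    kept = [col for col in columns if not any(ch in missing for ch in col)]
    return {name: "".join(col[i] for col in kept) for i, name in enumerate(names)}


# Hypothetical toy alignment (assumption: illustrative residues only).
toy = {
    "seqA": "MY-LVYRSQ",
    "seqB": "MYRLIY-SQ",
    "seqC": "MYRLVYKSQ",
}
trimmed = complete_deletion(toy)
for name, seq in trimmed.items():
    print(name, seq, len(seq))
```

Because a single gapped sequence removes the whole column, complete deletion can shrink a divergent alignment considerably, which is consistent with the small number of positions reported for the final dataset.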

#### *2.3. Analysis and Homology Modeling of the BLUF Domain*

The annotated sequences were further analyzed for putative secondary structure and function using PredictProtein (https://www.predictprotein.org/). Based on the secondary structure analysis, a two-dimensional topology was generated using Protter (http://wlab.ethz.ch/protter/#) [52]. Three-dimensional models of the predicted BLUF sequences and associated effector domains were created using the Phyre2 modeling tool [53], employing an integrated combinatorial approach comprising comparative modeling, threading, and *ab initio* modeling [54]. All the energy-minimized models of the annotated BLUF domains, and of BLUF in combination with the effector domains, were further evaluated for structural errors and stereochemical quality, and were manually curated. The models were validated in terms of bond angles, bond distances and stereochemistry using the UCLA Structure Analysis and Verification Server (SAVES) with the PROCHECK and ERRAT programs [55,56]. Finally, the most acceptable models were selected based on Ramachandran plot analysis and structural fit for each annotated sequence. All of the predicted BLUF sequences were aligned using the Clustal Omega program to analyze the conservation and variation of residues.
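A Ramachandran plot is built from the backbone dihedral angles φ (C<sub>i−1</sub>, N<sub>i</sub>, Cα<sub>i</sub>, C<sub>i</sub>) and ψ (N<sub>i</sub>, Cα<sub>i</sub>, C<sub>i</sub>, N<sub>i+1</sub>) of each residue. The following is a minimal sketch of how such a dihedral angle can be computed from four atomic positions. It is not the PROCHECK implementation, the helper names are our own, and the coordinates in the usage example are hypothetical rather than taken from any BLUF model.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees (range -180 to 180) defined by four points.

    For phi, pass C(i-1), N(i), CA(i), C(i); for psi, pass N(i), CA(i), C(i), N(i+1).
    """
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates (in angstroms) for four consecutive backbone atoms.
angle = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.5, 0.0), (-1.0, 1.5, 1.0))
print(round(angle, 1))
```

Computing φ and ψ for every residue of a model and plotting one against the other gives the distribution that is then compared with the sterically allowed regions.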

#### **3. Results and Discussion**

#### *3.1. BLUF Sequences, Modular Domains and Phylogenetic Analysis*

We assessed the residues that are conserved and important for the substrate specificity in the respective orthologs. The details for each protein sequence and domain architecture are given in Table 1 and Figure 1.


#### **Table 1.** Blue light using flavin (BLUF) modular domains from different organisms.



BLUF- Blue light using flavin; EAL- Glutamate/Alanine/Leucine; PRK- Phosphoribulokinase; PAS- Per/Arnt/Sim; PBP- Phosphatidylethanolamine-Binding Protein; CHD- Cyclase homology domain; LRR- Leucine-rich repeats; Med26- Mediator of RNA polymerase II transcription subunit 26; B12- Vitamin B12; DNA pol- DNA polymerase; REC- cheY-homologous receiver domain; HTH- Helix turn helix; Endo3- Endonuclease 3; ANK- Ankyrin repeats; RGS- Regulator of G protein signaling; DUF- Domain of unknown function; RhoGEF- Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; GAL4- GAL4-like Zn(II)2 Cys6 (or C6 zinc) binuclear cluster DNA-binding domain; MHR- Middle homology region; PDZ- PSD95/Dlg1/zo-1; SRPBCC- START/RHO\_alpha\_C/PITP/Bet\_v1/CoxG/CalC ligand-binding; BTB- Broad-Complex, Tramtrack and Bric a brac; FH2- Formin Homology 2; Drf- Diaphanous related formins; SMC\_N- N terminus of structural maintenance of chromosomes. Sequences WP\_051596720.1 from *Curtobacterium* sp. UNCCL17; Q8S9F2.1 from *Euglena gracilis;* and XP\_013758351.1 from *Thecamonas trahens* ATCC 50062 are strongly predicted to contain two BLUF domains on either side of their respective effector domains. NP\*: The probable modulated function cannot be predicted and requires a detailed study.

**Figure 1.** Schematic representation of the different blue light using flavin (BLUF) modular domain containing proteins. The accession numbers were taken from the National Center for Biotechnology Information (NCBI). "AA" indicates the length, in amino acids, of the particular BLUF modular protein.

The analysis of the 34 BLUF sequences revealed that the residues forming the BLUF catalytic core, which interacts with the flavin chromophore, are conserved throughout (Figure 2). The amino acids tyrosine (Y), asparagine (N), glutamine (Q), and tryptophan (W) or methionine (M), which are crucial for the photodynamics and photocycle of the BLUF domain, are highly conserved (Figure 2). Our analysis further confirms that tyrosine, glutamine, and tryptophan (or methionine) are critical for the substrate specificity of the BLUF sequences, as has been reported earlier [6,11,33,57]. The photo-activation of the BLUF domains involves conserved glutamine and tyrosine residues, where glutamine contributes to hydrogen bond formation in the dark state [11–14]. In light-adapted conditions, however, a hydrogen bond rearrangement (tautomerization) occurs to form a new hydrogen bond with tyrosine [11]. A motif analysis revealed three different conserved motifs (Figure 2) among the BLUF sequences. Each motif is important because it contains an essential amino acid residue involved in the regulation of the BLUF photocycle and photodynamics [6,11].

**Figure 2.** Multiple sequence alignment of the different BLUF modular domains depicting conserved amino acids. The black arrow indicates conserved amino acids crucial for regulating the flavin binding pocket, photocycle and photodynamics of the BLUF domain containing proteins [6,11]. The sequences under the solid boxes represent the conserved motifs of the BLUF domain. The conserved motifs were predicted using the Multiple EM for the Motif Elicitation (MEME) suite.
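The idea of scoring residue conservation across an alignment can be illustrated with a short sketch. It assumes a pre-computed alignment and uses a simple majority-fraction score, which is not the method used by MEME or BioEdit. The function names and the toy sequences are hypothetical and do not correspond to the BLUF sequences in Figure 2.

```python
from collections import Counter


def column_conservation(aligned_seqs, gap="-"):
    """Return (most common residue, fraction of sequences carrying it) per column."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for column in zip(*aligned_seqs):
        residues = [r for r in column if r != gap]
        if not residues:
            result.append((gap, 0.0))
            continue
        residue, count = Counter(residues).most_common(1)[0]
        # Gaps count against conservation because the denominator is all sequences.
        result.append((residue, count / len(column)))
    return result


def conserved_positions(aligned_seqs, threshold=1.0):
    """List 1-based alignment columns whose conservation reaches the threshold."""
    scores = column_conservation(aligned_seqs)
    return [(i + 1, res) for i, (res, frac) in enumerate(scores) if frac >= threshold]


# Hypothetical toy alignment (assumption: illustrative residues only).
toy_alignment = ["AYSLNQTW", "SYTLNQ-W", "AYELNQSM"]
print(conserved_positions(toy_alignment))
```

Lowering `threshold` (for example to 0.6) would also report columns in which one residue dominates but alternatives occur, such as a position occupied by either tryptophan or methionine.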

The phylogenetic analysis of the selected sequences of 34 BLUF domains divides them into four distinct clusters (I–IV), which seem to differ from each other mainly by the presence of eukaryotic and prokaryotic BLUF proteins in them (Figure 3). As observed from the phylogenetic analysis, the BLUF domain sequences of eukaryotic and prokaryotic origins are evolutionarily intermixed. Each cluster (except cluster IV) was further subdivided into sub-clusters, which represent closely related BLUF domain sequences broadly associated with similar kinds of effector domains (Figure 2).

**Figure 3.** Phylogenetic analysis by the Maximum Likelihood (ML) method. The analysis was done using 34 amino acid sequences of the modular BLUF domain containing proteins. All positions containing gaps and missing data were eliminated. There were a total of 17 positions in the final dataset. A solid red circle represents protein sequences of prokaryotic origin, and a solid red square represents protein sequences of eukaryotic origin. Evolutionary analyses were conducted using MEGA6 [51].

#### *3.2. Modular Diversity of BLUF Domains*

In the present communication, we have selected 34 different modular architectures in which the BLUF domain is combined with various effector domains (Table 1; Figure 1). The annotated BLUF domains in our database sequences range from 80 to 100 amino acids and are situated mostly N-terminal to the effector domains. The predicted BLUF sequences in our database show high sequence similarity across the BLUF region, with E-values in the range of e<sup>−40</sup> to e<sup>−120</sup>. The residues show a high degree of conservation around the flavin binding region (Figure 4). The secondary structural analysis indicates that the BLUF sequences are composed mostly of five β strands and two α helices. The E-value, sequence length and sequence conservation details are summarized in Table 1. Three-dimensional models of the BLUF domains were constructed using the Phyre2 modeling software [53]. The BLUF domains reveal a conserved βαββαββ fold in which two α-helices surround a five-stranded antiparallel β-sheet platform (Figure 4).

**Figure 4.** Models of the BLUF domains and details of its flavin binding pocket. (**A**) Superposition of the modelled BLUF domains using the Phyre server with a crystal structure of *Rhodobacter sphaeroides* BlrB (PDB: 2BYC) (magenta). The homology models include the annotated BLUF domains from WP\_045444510.1 (*Psychrobacter sp.*) (Green tone); WP\_014148160.1 (*Methylomicrobium alcaliphilum*) (Brown tone); XP\_025342216.1 (*Pseudomicrostroma glucosiphilum*) (Blue tone); AFL74487.1 (*Thiocystis violascens*) (Violet tone); EHQ08139.1 (*Leptonema illini*) (Yellow tone); ABP71929.1 *Rhodobacter sphaeroides* ATCC 17025 (Pink tone); WP\_0229622806.1 (*Pseudomonas pelagia*) (Skyblue tone); and ORY86082.1 (*Protomyces inouyei*) (Smudge tone). The domain boundaries of the modelled BLUFs are mentioned in Table 1. (**B**) The BLUF photocycle scheme shows the protein/FAD interactions through the hydrogen bonding pattern of the flavin moiety with conserved glutamine upon illumination. (**C**) Superposition of the flavin binding pocket in the BLUF models in comparison to *Rhodobacter sphaeroides* BlrB (PDB: 2BYC) and the BLUF domain of AppA (PDB: 1YRX). The side chains of residues with potentially important roles in catalysis and/or substrate binding are shown as stick models and are labelled. The selected regions of the same are shown in a reduced multiple sequence alignment.

We modeled the binding modes of the flavin chromophore based on the observations in the known BLUF domain structures [33,57]. The isoalloxazine ring of flavin can be readily accommodated in the pocket in the models, sandwiched between two α helices, suggesting that the active site is appropriately formed in the BLUF models (Figures 4 and 5). Spectroscopic analyses and photochemistry of BLUF proteins have shown that the flavin chromophore facilitates blue light-induced electron transfer through a hydrogen bond rearrangement between flavin N5 and O4 and the conserved tyrosine, glutamine, and tryptophan or methionine of the BLUF domain [14,17], leading to a spectral shift from blue to red and the transmission of the signal further downstream. Sequence alignment studies of the 34 BLUF domain containing sequences, in this study and elsewhere, show the conservation of tyrosine and glutamine residues in the β1 and β3 strands of the BLUF fold (Figure 4), and these residues have been reported to be critical for the photochemical reaction, as mutations in them result in the loss of the domain's ability to perceive light [12–14,17,58]. In our sequence database, BLUF photosensory domains are fused at the N-terminus of a wide array of different effector domain modules, viz. kinases, phosphatases, phosphodiesterases, anti-sigma factors, DNA binding domains, a transcriptional repressor (TetR), PAS, endonuclease and PBP proteins. Some of these have been modeled with >90% confidence using the Phyre2 modeling program and are presented in Figure 4. These domains are part of various photoreceptors [1,59,60]. Our structural and domain mapping analysis also shows that sequences like TetR\_C\_6, SRPBCC1 superfamily, GGDEF1 COG5001 superfamily, COG5001 super family PBP1\_NHase PAS superfamily, and CHD4 Med26\_M super family comprise two BLUF domains, one at each end, with an effector domain sandwiched in between (Table 1).
LOV photoreceptors have also been reported to be present in two copies, along with effector sequences, but both are situated at the N-terminus of the effector domains [61]. It remains to be seen whether the role of two BLUF domains on either side of the effector is to diversify the signal further or to relay the signal further downstream with the help of the auxiliary linker helix found toward the C-terminus of the conserved BLUF domain in our database. The function of these short auxiliary helical stretches, found in association with several BLUF and LOV photoreceptors, is not known precisely, but it seems likely that they mediate the signal progression between the photosensor and the effector domain, as has been predicted in earlier reports [62].

**Figure 5.** Representation of 3D structural models of BLUF (Red) and effector domain combinations (Blue) using the Phyre server [54]. (**A**) WP\_0229622806.1 (*Pseudomonas pelagia*) (206 aa) encodes a combination of the BLUF (5–101) and DUF domain (99–181). The model was generated with 100% confidence covering 1–186 residues using the *R. sphaeroides* AppA (PDB: 4HH0) as the template. (**B**) WP\_051596720.1 (*Curtobacterium sp.* UNCCL17) (470 aa) encodes BLUF in combination with the transcriptional repressor (224–327). The model was generated with >90% confidence covering 1–350 residues using 12 different templates. (**C**) Q8S9F2.1 (*Euglena gracilis*) (1019 aa) encodes BLUF in combination with cyclase homology domains (CHDs), which are part of the class III nucleotidylcyclases (20–379). The model was generated with >90% confidence covering 1–800 residues using 13 different templates. (**D**) ARH96915.1 (*Escherichia coli*) (403 aa) encodes BLUF (2–93) in combination with the EAL signaling domain (150–389). The model was generated with 100% confidence covering 1–389 residues using *K. pneumoniae* BlrP1 (PDB: 3GFZ) as the template. The structures are represented as interactive coloured ribbons. The model images were generated using PyMol (http://www.pymol.org) [63].

#### *3.3. BLUF Modules in Association with the Effector Domains*

The occurrence of the BLUF domain in different associations reveals the diversity and abundance of this modular domain in a wide range of organisms. Our sequence and fold analyses confirm various types of effector domains fused with the BLUF domains (Figure 5).

#### 3.3.1. EAL and GGDEF Domain

The EAL and GGDEF domain-containing proteins are widely distributed among bacteria and regulate the cellular level of the universal signaling molecule bis-(3',5')-cyclic-guanosine monophosphate (c-di-GMP); GGDEF domains act as diguanylate cyclases (DGCs), while EAL domains act as phosphodiesterases (PDEs) [64]. c-di-GMP controls a variety of signaling pathways associated with cell differentiation, bacterial adhesion and biofilm formation, bacterial motility, the colonization of host tissues and virulence [65,66]. In the EAL-GGDEF domain-containing proteins, the GGDEF and EAL motifs in the active sites are crucial for the DGC and PDE enzyme activities, respectively [67–69]. Although most of the proteins involved in c-di-GMP signaling contain the GGDEF and EAL domains in a single polypeptide possessing both the DGC and PDE enzyme activities, in some cases the GGDEF or EAL domain is also found alone [68]. In some EAL-GGDEF domain-containing proteins, one [70,71] or both [72,73] of these domains have been reported to be catalytically inactive because they lack the respective GGDEF and EAL motifs that are crucial for enzymatic function. In these proteins, the inactive domains act either as regulators [74] or as c-di-GMP effectors [72,73]. Yang and coworkers have reported a FimX-like protein (Flip) with a degenerate EAL-GGDEF domain, which interacts with a PilZ-domain protein to control virulence in *Xanthomonas oryzae* pv. *oryzae* [75]. YhdA is another protein with a degenerate EAL-GGDEF domain; it promotes the turnover of CsrB and CsrC (small RNAs), which reduce the expression of flhDC (flagellar master regulator) by sequestering CsrA (RNA-binding protein) [76]. Several EAL-GGDEF domain-containing proteins are also reported to have N-terminal sensory domains that can regulate the GGDEF and/or EAL domain functions.
In the present study, we have selected EAL-only and EAL-GGDEF domain-containing proteins with the blue light using flavin (BLUF) domain as a sensory domain from *E. coli* and *Thiocystis violascens*, respectively (Table 1).

Barends and coworkers have characterized the *Klebsiella pneumoniae* BlrP1 photoreceptor protein biochemically, structurally and mechanistically, and have elucidated the mechanism of the light-induced regulation of the EAL domain (PDE) via the BLUF sensor domain [4]. Upon illumination, the structural change in the flavin binding pocket (Trp replaces Met) of the BLUF domain cross-activates the EAL domain via allosteric communication and increases the PDE activity [4]. The photo-dependent alteration of the BLUF–EAL interactions influences the quaternary structure, the EAL–EAL interface at the dimerization helix, the compound helix α5EAL and the loop connecting it to β5EAL, which are implicated in the EAL activation [4,77]. In the BlrP1 photoreceptor, the BLUF domain shares the architecture shown by other BLUF domains from different organisms, in which the central BLUF domain (N-terminus) is surrounded by two helices (helical cap; C-terminus) [78–80]. However, an additional EAL output domain has been identified in the BlrP1 protein, which is connected to the BLUF domain via a 50 Å long linker peptide (triosephosphate isomerase (TIM)-barrel fold) [4]. The EAL active site in the BlrP1 photoreceptor involves Glu188, Asn239, Glu272, Asp302, Asp303, Lys323 and Glu359 [4]. We used this well-characterized BlrP1 as a template and aligned it pairwise with EAL only (*E. coli*), EAL-GGDEF (*Thiocystis violascens* DSM 198), and EAL-PRK15043 domains (*Trichuris trichiura*), and observed that most of the amino acids constituting the EAL active site in BlrP1 are highly conserved in all of the EAL domain-containing proteins (Figure S1), which suggests that these EAL domain-containing proteins share a similar mechanism for PDE activity.
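Checking whether template active-site residues are conserved in a query amounts to mapping template residue numbers through a pairwise alignment. The sketch below shows one way this could be done. It assumes the pairwise alignment has already been computed, the function name `map_template_sites` is our own, and the toy sequences and site numbers are hypothetical rather than those of BlrP1.

```python
def map_template_sites(template_aln, query_aln, template_sites, template_start=1, gap="-"):
    """Map template residue numbers onto the residues aligned to them in the query.

    `template_aln` and `query_aln` are the two rows of a pairwise alignment.
    Returns {template residue number: (template residue, aligned query residue)}.
    """
    if len(template_aln) != len(query_aln):
        raise ValueError("aligned sequences must have the same length")
    wanted = set(template_sites)
    mapping = {}
    position = template_start - 1
    for t_res, q_res in zip(template_aln, query_aln):
        if t_res == gap:
            continue  # an insertion in the query does not advance template numbering
        position += 1
        if position in wanted:
            mapping[position] = (t_res, q_res)
    return mapping


# Hypothetical toy pairwise alignment and site numbers (assumption).
template_row = "MKE-LNDAE"
query_row = "MRELLN-AD"
sites = map_template_sites(template_row, query_row, [3, 5, 6, 8])
conserved = sorted(pos for pos, (t_res, q_res) in sites.items() if t_res == q_res)
print(sites)
print(conserved)
```

Sites that map to a gap or to a different residue in the query are the ones that would merit closer inspection before assuming a shared catalytic mechanism.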

The query protein sequence (ARH96915.1) containing the EAL output domain was subjected to a protein-protein interaction analysis, which revealed several interacting partners involved in the regulation of different signaling pathways (Figure S7a; Table S1). The protein-protein interaction analysis showed that most of the interacting partners for this photoreceptor are either EAL domain-containing PDEs (JD73\_03740) or GGDEF domain-containing DGCs (JD73\_23675, JD73\_25605, JD73\_23680, YeaP, and YdaM). However, interactions with YcgZ (two-component connector protein) and YcgE (Mer-like repressor protein/transcriptional regulator) were also revealed, indicating their involvement in the regulation of bacterial biofilm formation [30]. Tschowri et al. [30] have characterized the previously unknown function of the BLUF-EAL domain-containing protein, YcgF, from *E. coli*, and suggested that upon blue light irradiation, this protein acts like an antirepressor. The antirepressor YcgF removes YcgE (Mer-like repressor) from the promoter, and resumes the expression of different small regulatory proteins (YmgA and YmgB). These small regulatory proteins (YmgA and YmgB) utilize the RcsC/RcsD/RcsB two-component phosphorelay system to activate the production of colanic acid, a biofilm matrix component, and to decrease adhesive curli fimbriae [30]. The query protein may also interact with AriR/YmgB (regulator of acid resistance influenced by indole), another biofilm-related protein involved in the regulation of acid resistance in *E. coli* [81]. The STRING analysis also revealed a possible interaction between the query protein and the regulatory protein LuxR, which regulates quorum sensing in the bacterial system [82,83].

#### 3.3.2. PsiE Domain

In *E. coli*, the phosphate-starvation-inducible (*psiE*) gene is positively and negatively regulated by PhoB and cAMP-CRP (cAMP receptor protein), respectively, which are involved in phosphate and carbon metabolism [84]. The regulation of the *psiE* gene by phosphate and carbon sources was studied using *lacZ* and chloramphenicol acetyltransferase gene (*cat*) fusions, respectively [84]. Although the function of PsiE has not yet been determined, it has been predicted to have features of DNA-binding protein inhibitor-related proteins, putative transcriptional regulators or hypothetical DNA binding proteins (IPR020948).

#### 3.3.3. Cyclase Homology Domain (CHD)

The cyclase homology domains (CHDs) are the catalytic domains of eukaryotic and prokaryotic nucleotidylcyclases, i.e., adenylyl cyclases (ACs) and guanylyl cyclases (GCs), which belong to the evolutionarily diverse class III nucleotidylcyclases. CHDs are reported in three different structural forms, i.e., heterodimers (mammalian CHDs), pseudoheterodimers (metazoan CHDs) and homodimers (bacterial and protozoan CHDs) [85]. Heterodimeric and pseudoheterodimeric CHDs have a single catalytic pocket sharing catalytic amino acid residues at the dimer interface, while homodimeric CHDs have two separate catalytic pockets, each of which contributes the CHD determinant [85]. In spite of having two potential catalytic pockets, several enzymes with homodimeric CHDs (for example, eukaryotic class III nucleotidylcyclases) may have only one catalytically competent site [86]. Although CHDs are structurally diverse, all of them have a conserved structural component (i.e., a helical region), mutation of which compromises the stability and active dimeric conformation of the protein [87]. All CHDs have a common catalytic mechanism in which they require two magnesium or manganese ions to bind the polyphosphate group of the nucleotide, followed by nucleophile activation.

Most CHDs, with a few exceptions (e.g., Rv1359 from *M. tuberculosis*), are reported to exist in combination with different regulatory modules, which makes it possible to perceive a variety of signals and to regulate intracellular cAMP generation [85]. In the present communication, we have selected three such proteins, with the BLUF domain in combination with the CHD domain, having the accession numbers EHQ08139.1 (from *Leptonema illini* DSM 21528), Q8S9F2.1 (from *Euglena gracilis*) and XP\_013758351.1 (from *Thecamonas trahens* ATCC 50062) (Table 1). In *Euglena gracilis*, CHDs (adenylyl cyclases) occur in combination with the BLUF domain, with an overall domain arrangement of BLUF1CHD1BLUF2CHD2, where the flavin chromophore senses blue light and stimulates the adenylyl cyclase activity [35]. Furthermore, we have performed an alignment of the sequences representing the CHD domain from the selected proteins against the sequence of the well-characterized template CHD domain of bPAC (Figure S2). The multiple sequence alignment revealed that the active site residues involved in nucleotide binding (i.e., Asn257-His266, Lys263-Met264 (forming the β4AC- β5AC tongue), Gly259-Asn178 (forming the α2AC helix) and Lys263–Thr196, Asp265–Phe198, and His266–Lys197 (forming the β2AC- β3AC hairpin) [33]) are highly conserved among all three selected BLUF-CHD domain-containing proteins (Figure S2). In PACs, under dark conditions, the orientation and arrangement of individual amino acids in the active site (i.e., Thr267 and Lys197 (bound to Phe198)) render the conformation of AC inactive. On the other hand, upon illumination, the change in the interaction between Asn257 and His266 results in the correct orientation of Thr267 required for the communication with the adenine base; additionally, Lys197 is detached from Phe198, thus providing space for the adenine base to enter the active site more deeply [33]. We also performed a protein-protein interaction analysis, which revealed the possible interacting partners for CHD domain-containing proteins, which range from phosphodiesterases, the RNA polymerase subunit β, and the DNA helicase to another adenylate cyclase/guanylate cyclase associated with the GAF and PAS/PAC sensor (Figure S7b and Table S1).

#### 3.3.4. PAS Domain

PAS (Per-Arnt-Sim) domain-containing proteins are widely distributed among all domains of life. The PAS domain acts as a sensor, is generally found at the N-terminus of sensory and signal transduction-related proteins, and detects a variety of stimuli to regulate the functions of a diverse array of effector domains [88,89]. Members of the PAS domain family can bind a diverse range of small-molecule metabolites [22], which could either directly act as a signal and be involved in initiating a cellular signaling response [90], or which could serve as a cofactor and respond to subsequent messages like gas molecules, redox potential, or photons [39]. Although PAS domain-containing proteins are chemically and functionally diverse, almost all PAS domains have a conserved core comprised of a five-stranded antiparallel β-sheet and several α-helices, which are responsible for the generation and propagation of a signal to the adjoining effector domain. In the present study, we have selected four proteins, three of them from different *Legionella* strains and one from *Allochromatium vinosum* DSM 180, having combinations of the BLUF and PAS sensor domains (Table 1). We performed the alignment of sequences representing the PAS domain in the four proteins against the well-characterized photoactive yellow protein (PYP; 1NWZ) from *Halorhodospira halophila* [91] (Figure S3). As discussed earlier, PAS domains are structurally diverse; the same is revealed from the multiple sequence alignment analysis. Although the PAS core (5' NAAEGDIT 3') in PYP is not conserved among the other PAS domain-containing proteins, the sequences representing the PAS core are, interestingly, conserved in the three selected PAS domain-containing proteins from the *Legionella* genus (Figure S3). From the above observation, we could suggest that, although PAS cores are diverse in different organisms, they might be conserved in organisms belonging to the same genus.
We also performed a protein-protein interaction analysis, which revealed the possible interacting partners for PAS domain-containing proteins, including multisensor histidine kinase, CheA signal transduction histidine kinase, CheW protein, CheB methylesterase, 4-coumarate-CoA ligase, phenylalanine/histidine ammonia-lyase and Hpt sensor hybrid histidine kinase (Figure S7c and Table S1).

#### 3.3.5. B12 Binding Domain

Many prokaryotes synthesize vitamin B12 (cobalamin) having a tetrapyrrole-like structure composed of a bound cobalt atom (Co) with two axial ligands. The lower ligand is known to be involved in vitamin B12 binding, while the upper one is used as a cofactor for different groups of enzymes/proteins, such as methyltransferases, reductases and isomerases [92]. The vitamin B12 binds to a specific domain, i.e., Asp/Glu-X-His-X-X-Gly-(41)-Ser/Thr-X-Leu-(26–28)-Gly-Gly, which is highly conserved in almost all B12-dependent enzymes/proteins and characterized as a Rossmann fold, typically composed of five parallel β-strands surrounded by 4–5 α-helices [93,94]. Generally, vitamin B12 and its derivatives are known for their role in fatty acid and folate metabolism; however, recently they have been characterized as photoreceptors with a novel and unanticipated biological function as a light-dependent transcriptional regulator [95]. Ortiz-Guerrero et al. [96] reported, for the first time, a light-induced excitation of adenosylcobalamin (AdoB12) bound to CarH (a MerR-like transcriptional factor/repressor), which inhibited the formation of a stable CarH-AdoB12 tetramer, thus allowing the gene expression for the carotenoid biosynthesis in *Myxobacteria*. In the dark, the stable CarH-AdoB12 tetramer binds at the promoter region and shuts down the carotenoid biosynthesis [95–97]. Illumination of the corrin ring associated with AdoB12 promotes the re-orientation of a helix bundle, forming a covalent linkage between H132 and Co, and causes the CarH dissociation from the promoter region, which ultimately leads to the carotenoid gene expression [95,96,98]. Cheng et al. [92] also reported a small stand-alone B12-binding domain protein, AerR in *Rhodobacter capsulatus*, which controls the light-dependent regulation of the biosynthesis of the photosystem via interacting with CrtJ, a repressor of the photosystem gene expression. As in CarH, light illumination also leads to a covalent association between His10 and the Co ligand, which suggested that the light-dependent covalent linkage between the Co ligand and a His residue might be the common mechanism in B12-dependent photoreceptors. Furthermore, photoreceptors have also been naturally endowed with multiple photosensory domains, which could relay signals to output domains to control specific light-dependent functions [95]. In the present communication, we have selected a modular protein (accession number ABP71929.1) with the BLUF and vitamin B12 binding domain from *Rhodobacter sphaeroides* ATCC 17025 (Table 1).
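
The B12-binding signature quoted above can be expressed as a regular expression for a quick scan of candidate sequences. The following is a minimal sketch under two stated assumptions: "(41)" is read as a spacer of exactly 41 residues and "(26–28)" as a spacer of 26 to 28 residues. The function name and the synthetic sequences are hypothetical and only serve to exercise the pattern.

```python
import re

# Assumption: "(41)" is a spacer of exactly 41 residues and "(26-28)" a spacer
# of 26 to 28 residues; X is any amino acid.
B12_MOTIF = re.compile(r"[DE].H..G.{41}[ST].L.{26,28}GG")


def find_b12_motif(sequence):
    """Return the 1-based start of the first motif match, or None if absent."""
    match = B12_MOTIF.search(sequence.upper())
    return match.start() + 1 if match else None


# Synthetic sequences (hypothetical, not real proteins).
with_motif = "MM" + "DAHAAG" + "A" * 41 + "SAL" + "A" * 27 + "GG" + "KK"
without_motif = with_motif.replace("H", "A")  # removes the conserved histidine

print(find_b12_motif(with_motif))
print(find_b12_motif(without_motif))
```

A pattern scan of this kind only flags the presence or absence of the linear signature; it does not replace the alignment against characterized B12-binding domains.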

We performed the alignment of the selected protein with the well-characterized B12 binding domain containing proteins (CarH and AerR) and, surprisingly, observed that the crucial amino acids (Trp131, Val138, Glu141 and His142) involved in forming the binding pocket for AdoB12 [95,99] were not observed in the selected protein sequence (Figure S4). Moreover, there are many prokaryotes which do not necessarily require the B12 cofactor; hence, they can acquire alternative B12-independent metabolic pathways for the same reaction [100]. The association of the B12 domains with the BLUF domains was also reported by Cheng et al. [92], where the role of the B12 domain is to sense light that is out of the absorption range of flavin. In several proteins, B12 binding domains are also found in combination with heme/oxygen sensing globin domains; however, none of these proteins have been characterized [92]. In proteins with combinations of histidine kinases or serine/threonine kinases with the B12 domain, the kinases are responsible for regulating the response to light absorption by the B12 domain [93].

The selected protein sequence (ABP71929.1) containing the B12 domain was analyzed for protein-protein interactions using STRING (version 11). The analysis showed several interacting partners involved in the regulation of different signaling pathways (Figure S7d). It revealed the interaction of the B12 domain with the prephenate dehydratase enzyme (Rsph17025\_0524; Figure S7d), which catalyzes the conversion of prephenate to phenylpyruvate, water and carbon dioxide [101], and is generally involved in phenylalanine, tyrosine and tryptophan biosynthesis [102,103]. The selected protein also interacted with tyrosyl-tRNA synthetase (TyrS), an enzyme that catalyzes the attachment of tyrosine to tRNA in a two-step reaction. The interaction of the B12 domain-containing protein with a transmembrane protein PA-phosphatase-like phosphodiesterase (Rsph17025\_2732), a SARP family transcriptional regulator and a DNA mismatch repair protein, MutL, was also revealed through the protein-protein interaction analysis (Figure S7d). The BLUF-B12 binding domain containing protein also interacted with the TonB protein, which communicates with outer membrane receptor proteins and drives the energy-dependent uptake of various substrates (such as iron citrate, enterochelin, aerobactin, etc.) into the periplasmic space [104].
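
An interaction-partner lookup of this kind can also be scripted. The sketch below is illustrative only: it assumes the public STRING REST endpoint `interaction_partners` with the `identifiers`, `species` and `limit` parameters, and tab-separated output containing `preferredName_B` and `score` columns. It targets the current STRING release rather than version 11, so results may differ from Figure S7d. The helper names, the 0.4 score cut-off and the toy response are hypothetical.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: base URL of the current STRING release (not necessarily version 11).
STRING_API = "https://string-db.org/api"


def partners_url(identifier, species=None, limit=10):
    """Build the request URL for the interaction partners of one protein."""
    params = {"identifiers": identifier, "limit": limit}
    if species is not None:
        params["species"] = species  # NCBI taxonomy identifier
    return f"{STRING_API}/tsv/interaction_partners?{urlencode(params)}"


def parse_partners(tsv_text, min_score=0.4):
    """Extract (partner name, combined score) pairs at or above min_score."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        (row["preferredName_B"], float(row["score"]))
        for row in reader
        if float(row["score"]) >= min_score
    ]


def fetch_partners(identifier, species=None, limit=10):
    """Query STRING over the network and return the parsed partner list."""
    with urlopen(partners_url(identifier, species, limit)) as response:
        return parse_partners(response.read().decode("utf-8"))


# Offline demonstration with a made-up response (scores are hypothetical).
toy_response = (
    "preferredName_A\tpreferredName_B\tscore\n"
    "query\ttyrS\t0.71\n"
    "query\tmutL\t0.35\n"
)
print(partners_url("Rsph17025_0524", limit=5))
print(parse_partners(toy_response))
```

Separating URL construction and parsing from the network call keeps the logic testable offline; `fetch_partners` is the only function that contacts the server.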

#### 3.3.6. PRK Superfamily

The PRK family represents a group of three types of P-loop containing kinases, i.e., phosphoribulokinase [105], uridine kinases [106], and pantothenate kinases (CoaA) [107]. This family is named after one of its members, i.e., phosphoribulokinase (PRK), which drives the phosphoryl transfer from Mg-ATP to ribulose 5-phosphate to form ribulose 1,5-bisphosphate (RuBP) in the reductive pentose phosphate pathway (Calvin cycle) [105]. In *E. coli*, the *udk* gene encodes the pyrimidine salvage enzyme uridine kinase, causing the phosphorylation of uridine/cytidine into UMP/CMP using GTP as the phosphate donor [106]. Pantothenate kinase controls the rate-limiting step in the coenzyme A (CoA) biosynthesis pathway [107]. The *coaA* gene is transcribed to produce a 1.1 kb transcript, which is further translated into two protein products of 36.4 and 35.4 kDa, respectively [107]. The resultant proteins encoded by the *coaA* gene showed a difference of eight amino acids at the N-terminus. The *E. coli* strains bearing multiple copies of the *coaA* gene showed a higher activity of the pantothenate kinase [107].

#### 3.3.7. DNA pol 3 gamma3 Superfamily

In *E. coli*, DNA polymerase III (Pol III) is a complex holoenzyme comprised of three functionally distinct subassemblies, i.e., the core polymerase (α, ε, and θ subunit), the sliding clamp (β subunit) and the clamp loader complex (τ2γδδ'χψ subunit) [108]. The clamp loader is responsible for the DNA-dependent hydrolysis of ATP to load β2 clamps onto DNA for the interaction with core polymerases [109]. The gene *dnaX* encodes the ATP motor subunits of the clamp loader, i.e., one γ and two τ subunits, where the γ subunits are considered as a truncated product of the τ subunits [110,111]. The gamma (γ) subunit shares domains I-III with the tau (τ) subunit, while the domain IV and the entire alpha-interacting domain V subunit are only observed in the τ-subunit. The bacterial DNA pol III γ III domain and its homolog, the eukaryotic replication factor C (RFC), belong to the AAA-ATPase superfamily and are primarily involved in the breaking or restructuring of the supramolecular assembly of proteins [110]. In this communication, we have selected two BLUF modular domains in association with the DNA pol III γ III domain only (accession number AIQ92835.1) from *Methylobacterium oryzae* CBMB20, and the PRK11633-DNA pol III γ III domain (WP\_048452447.1) from *Methylobacterium tarhaniae* (Table 1).

Furthermore, we have performed the alignment of the sequences representing the DNA pol III γ III domain against the well-characterized truncated sequence (1-373 amino acids) of the DNA polymerase III subunit gamma/tau (WP\_113440333.1) from *E. coli* (Figure S5). The sequences corresponding to the DNA pol III γ III domains aligned against the domain II of the γ III subunit of DNA pol III from *E. coli*; however, most of the critical amino acids are not conserved amongst the DNA pol III γ III domain. In the *E. coli* DNA pol III γ III domain, Thr157, responsible for the hydrogen bond formation with the terminal phosphate of AMP-PNP, is only conserved in the DNA pol III γ III domain from *Methylobacterium tarhaniae* but not in *Methylobacterium oryzae* CBMB20 (Figure S5) [110]. Moreover, the C-terminal SARC motif (Ser168, Arg169, and Cys170) located in the α7 helix in the sensor1 region, which is highly conserved in the γ subunit of almost all organisms [112,113], is found missing in both of the DNA pol III γ III domains. Arg169 has dual roles, where on the one hand it acts as an "arginine finger" for the SARC motif, while on the other hand it is responsible for electrostatic and hydrophobic interactions which hold the δ subunit onto the γ III subunit. Arg215 is another critical amino acid and is a part of the conserved motif G/PxΦRXΦ (where Φ is any hydrophobic residue) located in the DNA pol III γ III domains, among prokaryotes as well as among eukaryotes [112]. The correct alignment of Arg215 is very crucial for its proper interaction with the phosphate group of ADP/ATP [110]. From the multiple sequence alignment analysis, it was observed that this particular amino acid is conserved in the BLUF-DNA pol III γ III domains but replaced by another polar amino acid (tyrosine:Q) in the BLUF- PRK11633-DNA pol III γ III domain (Figure S5).
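
The G/PxΦRXΦ motif mentioned above can be translated into a regular expression to test whether the arginine-containing signature is present in a given sequence. This is a minimal sketch: the set of residues treated as hydrophobic (Φ) is an assumption, since several definitions are in use, and the helper name and test peptides are hypothetical.

```python
import re

# Assumption: residues counted as hydrophobic (Φ); other definitions exist.
HYDROPHOBIC = "AVLIMFWC"


def motif_to_regex(tokens):
    """Translate motif tokens (e.g. 'G/P', 'x', 'Φ', 'R') into a compiled regex."""
    parts = []
    for token in tokens:
        if token in ("x", "X"):
            parts.append(".")                      # any residue
        elif token == "Φ":
            parts.append(f"[{HYDROPHOBIC}]")       # hydrophobic residue
        elif "/" in token:
            parts.append("[" + token.replace("/", "") + "]")  # alternatives
        else:
            parts.append(token)                    # fixed residue
    return re.compile("".join(parts))


ARG_MOTIF = motif_to_regex(["G/P", "x", "Φ", "R", "X", "Φ"])

# Hypothetical peptides: the second one carries Q in place of the arginine.
print(ARG_MOTIF.search("MKGSLRDVT"))
print(ARG_MOTIF.search("MKGSLQDVT"))
```

A substitution at the arginine position, as reported for the BLUF-PRK11633-DNA pol III γ III domain, makes the pattern fail, which mirrors the loss of the conserved residue seen in the alignment.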

The protein-protein interaction analysis revealed that most of the interacting partners of DNA pol III γ III (dnaX) are the components involved in the regulation of DNA replication, such as DNA pol I (Pol A), replicative DNA helicase (dnaB), DNA mismatch repair protein (recR), DNA pol III subunit α (dnaE), δ (holA and holB), ε (dnaQ), chi (holC), psi (holD) and β sliding clamp (dnaN) (Figure S7e and Table S1).

#### 3.3.8. Cytochrome p450/ p450 Superfamily

Cytochrome p450s (CYPs) are a diverse group of heme-containing monooxygenases responsible for the oxidative degradation of steroids, fatty acids and xenobiotics [114]. These heme-thiolate proteins are named after their spectral absorbance peak at 450 nm, due to linkage with the cysteine thiolate of the protein [114]. In spite of a low sequence conservation, the structures are highly conserved. The cytochrome p450 core is made up of a four-helix bundle, helices J and K, two sets of beta-sheets, and it possesses a heme-binding loop, a proton-transfer groove and the conserved EXXR motif in helix K [115]. The hormone synthesis, cholesterol, and vitamin metabolism are some other pathways that are regulated by cytochrome p450s. In this study, we have selected a protein containing the BLUF domain associated with the p450 domain (accession number WP\_045444510.1) from *Psychrobacter sp.* P11F6 (Table 1). A multiple sequence alignment of the sequence representing the p450 domain against the well-characterized CYP from *Bacillus subtilis* has been performed (Figure S6). The multiple sequence alignment revealed that, of the two amino acids (i.e., arginine (Arg242) and proline (Pro243)) essential for substrate binding in the peroxygenase enzyme [116], arginine (Arg108) is also conserved in the BLUF regulated p450 domain (Figure S6). However, in the BLUF controlled p450 domain, hydrophobic proline is replaced by a polar amino acid residue, i.e., serine (Ser109). In several p450 enzymes, an adjacent acidic-polar amino acid pair was reported in the substrate binding site. In *Pseudomonas putida* camphor hydroxylase (CYP101A1), Asp251 and Thr252 were observed in the substrate binding site and used to relay protons onto iron-oxo species to activate the catalytic cycle [114]. The above observations indicate the different evolutionary routes adopted by these enzymes for the H2O2-driven catalysis. We also performed a protein-protein interaction analysis, which showed interacting partners for the selected protein (Figure S7f and Table S1). Most of the interacting proteins belonged to the fatty acid metabolism (CypB, CypC, CypD, and YitS) and to the secondary metabolism (PksJ and PksM).

#### 3.3.9. REC Domain

Signal receiver (REC)/CheY-like phosphoacceptor domains are widely distributed regulatory domains in bacteria (CheY, OmpR, NtrC, and PhoB); however, they are now also reported in eukaryotes, for example, ETR1 from *Arabidopsis thaliana*. In the bacterial two-component regulatory system, the response regulator typically consists of a receiver domain that is covalently linked to an effector domain (DNA binding or catalytic units), and which is controlled by sensor kinase-catalyzed aspartyl phosphorylation [117]. The role of the REC domain is to receive the input signal perceived and transmitted from the sensor partner in the two-component systems. The REC domain interacts with different proteins to regulate processes like bacterial chemotaxis and some other regulatory pathways [118].

#### 3.3.10. TetR and AcrR Domain

The TetR protein family is a group of transcriptional regulators with an HTH DNA-binding motif, which is widely distributed among bacteria [119]. TetR family proteins control efflux pumps and transporters having a role in antibiotic resistance and tolerance to toxic chemicals, synthesis of osmoprotectants, quorum sensing, drug resistance, virulence and sporulation [120,121]. The TetR family is named after its best-characterized member, TetR, which has a role in the regulation of the expression of *tet* genes, involved in conferring tetracycline resistance in the bacterial system [120]. Proteins with the TetR domain maintain their optimal cellular level by feedback control. TetR (dimer) binds to two adjacent DNA major grooves (6 bp each) located in the promoter region of the target gene on both of the strands, where helix α3 (Gln38 to His44) is involved in a sequence-specific recognition [119,120]. The Arg28 in helix α2 strengthens the specific contact with the complementary strand [119,120]. The hydrophobic core, developed from the contributing residues from the α1, α2, and α3 bundle, stabilizes the TetR DNA binding domain [119]. A highly conserved Lys48, located in α4, also has an essential role in the TetR-DNA complex formation [119].

The TetR family protein also works in a complex circuit with other proteins, including AcrR, a transcriptional repressor of the *acrAB* operon responsible for encoding a multidrug efflux pump which removes a wide range of antibiotics and confers antibiotic resistance in *E. coli* [121]. The AcrR protein (215 amino acids; dimer) crystal structure showed that it is composed of a three-helix DNA-binding domain and a unique C-terminal domain (large internal cavity) for ligand binding, which is structurally similar to members of the TetR family of transcriptional repressors [120,122]. It was predicted that ligand (rhodamine 6G, ethidium and proflavin) binding at the C-terminal ligand binding site leads to an alteration in the conformation of the N-terminal DNA binding region and thereby initiates transcription at the corresponding promoter of the target gene [123].

#### 3.3.11. Endonuclease 3c and Endonuclease-NS Domain

The endonuclease 3c domain is widely distributed among the family of DNA repair proteins such as endonuclease III and DNA glycosylase (MutY or MBD4). The members of this family possess a conserved helix-hairpin-helix (HhH), a Gly/Pro-rich loop, as well as a conserved aspartate residue [124,125]. On the other hand, the endonuclease-NS domain has been explicitly reported in the DNA/RNA non-specific endonucleases and found both in prokaryotes and eukaryotes. Endonucleases containing the endonuclease-NS domain show an Mg<sup>2+</sup>-dependent cleavage of double-stranded as well as single-stranded nucleic acids. The extracellular *Serratia marcescens* nuclease is a well-characterized example of an endonuclease with an endonuclease-NS domain having a conserved histidine residue. The *Serratia marcescens* nuclease requires magnesium ion, three acidic (Asp107, Glu148 and Glu232) amino acid residues, as well as a few basic amino acid residues (Arg108, Arg152) for the endonuclease activity [126,127]. Proteins with the endonuclease-NS domain are broadly involved in hydrolase activity, nucleic acid binding and metal ion binding [126].

#### 3.3.12. AraC Domain

The AraC protein (inducer/activator) regulates the *ara*BAD operon in *E. coli*, which is responsible for encoding structural components for the arabinose metabolism [128]. X-ray crystallography and NMR studies demonstrated that AraC is a dimeric protein containing two helix-turn-helix DNA-binding motifs [129]. AraC binds arabinose, and the induction of the *ara*BAD operon depends on the concentration of extracellular arabinose (>10<sup>−7</sup> M) as well as on the rate of the arabinose uptake and catabolism [128].

#### 3.3.13. Abhydrolase (α/β Hydrolase) Superfamily

The α/β hydrolase superfamily is a diverse group of hydrolytic enzymes that may differ in their catalytic function but that share a common fold with a conserved loop bearing the catalytic triad [130]. The catalytic triad involves serine, glutamate/aspartate, and a histidine amino acid residue, and participates in the nucleophilic attack on a carbonyl carbon atom. The α/β hydrolase fold includes proteases, lipases, peroxidases, esterases, epoxide hydrolases and dehalogenases [131]. The core of the proteins belonging to this superfamily is an α/β sheet composed of 8 β-strands linked by 6 α-helices [130,132].

#### 3.3.14. Domain of Unknown Function (DUF)

Generally, every protein domain has a distinct structure and function. However, there are several domains which have no known role, and these are referred to as domains of unknown function (DUFs). These domains were long ignored as having little relevance, but there are now reports which show that many DUFs are essential, as they are crucial for protein function. Based on bioinformatic analyses of several uncharacterized DUFs, Goodacre et al. speculated about probable functions, which may be related to ATP binding or transcription [133].

#### 3.3.15. ANK Repeats

Ankyrin (ANK)-like repeats, which mediate protein-protein interactions between diverse groups of proteins [134,135], have been reported in almost all species [136]. The ANK proteins exhibit domain shuffling via horizontal gene transfer [137]. A single protein may contain several ANK repeats [134,138]. Davis et al. [138] demonstrated the association of a specific 33 residue erythrocyte ankyrin repeat with an anion exchanger. A stack of ANK repeats has a superhelical arrangement with four consecutive repeats, and each unit contains two antiparallel helices and a beta-hairpin. ANK repeats may also occur in combinations with other types of domains [134].

#### 3.3.16. RhoGEF Domain

The RhoGEF protein is a guanine nucleotide exchange factor (GEF) responsible for the activation of Rho family GTPases (Rho, Rac, and Cdc42) [139–141]; it controls a diverse array of cellular processes, including cellular differentiation [142], cell morphology [143], cell motility and adhesion [144], phagocytosis [145], cytokinesis [146], smooth muscle contraction [147], and the etiology of human diseases such as hypertension [148] and cancer [149]. The Rho family proteins are generally found in two different conformational states, i.e., active GTP-bound and inactive GDP-bound [150]. The GEFs of the Rho family GTPases have a conserved domain of ~200 amino acid residues known as the RhoGEF domain or Dbl homology (DH) domain, which confers GEF activity specific to different Rho family members [151]. In addition to the RhoGEF domain, these GEFs also have another functionally independent conserved domain (~100 amino acid residues), i.e., the pleckstrin homology (PH) domain, located C-terminal to the RhoGEF domain [152]. The C-terminal PH domain is generally involved in intracellular targeting and regulates the function of the RhoGEF domain. The RhoGEF domain has an α-helix bundle-like structure with three conserved regions, i.e., conserved region 1 (CR1), conserved region 2 (CR2) and conserved region 3 (CR3). Among these three conserved regions, CR1 and CR3 interact partly with α-6 and the DH/PH junction site, forming the Rho GTPase binding pocket.

#### 3.3.17. PDZ Domain

PDZ domains or discs-large homologous regions (DHR) are found in a wide range of membrane-bound signaling proteins from bacteria, yeasts, plants, insects and vertebrates [153,154]. The PDZ domain is present either as a single copy or as multiple copies and interacts either with the C-terminus of proteins or with internal peptide sequences [154]. Proteins with the PDZ domain are generally located at the plasma membrane, where they can directly interact with phosphatidylinositol 4,5-bisphosphate (PIP2), as observed with the class II PDZ domain in syntenin [155]. PDZ domains (80–90 amino acids) are composed of six compactly arranged β-strands (βA-βF) and two α-helices (αA and αB) in a globular structure. PDZ domains interact with shaker-type K<sup>+</sup> channels in several MAGUKs or bind similar ligands of other transmembrane receptors [154].

#### 3.3.18. GAL4-Fungal TF MHR Domain

Gal4 is a fungal-specific positive regulator for the expression of galactose-induced genes [156]. This domain is generally located at the N-terminus of several fungal-specific transcriptional regulators and contains a binuclear Zn cluster bound by six Cys residues; additionally, it is involved in the zinc-dependent binding of DNA [157,158]. The transcriptional regulators or proteins with the GAL4-fungal TF-MHR domain are generally involved in the arginine, proline, pyrimidine, quinate, maltose and galactose metabolisms, amide and GABA catabolism and leucine biosynthesis [159].

#### *3.4. BLUF Proteins for Optogenetic Tools*

Small photoreceptors, such as sensors of blue light using flavin (BLUF), light oxygen voltage (LOV) based receptors and cryptochromes, have been identified in the genomes of different organisms, from prokaryotes to higher eukaryotes, as has been shown in the present study and in several other studies [160,161]. As evident from the present study and other in vitro experiments [161–165], one of the peculiar features of these light-sensitive motifs is their association with various effector domains, thus hinting at the wide range of novel mechanistic and functional diversity controlled by light. This aspect is yet to be studied in detail. Investigations into PixJ1 (blue and green light), RcaE (red and green), changes in light-dependent *E. gracilis* PAC-α activity for *Drosophila*, behavioral modulation and neural responses in marine gastropod *Aplysia* and *Caenorhabditis elegans* [166–171] suggest sensitivities of each receptor toward a particular light bandwidth, which can be one of the important tools used to engineer a novel system for a broad range of physiological outputs. This study further expands the possibilities for the BLUF domain to be used as a powerful optogenetic tool for the development of novel optogenetic technologies. A vast variety of domain combinations of BLUF photoreceptors in different genomes (Figure 1) represents a promising and valuable tool to design novel photo-regulated enzymes, messengers, photo-modulation of gene expression patterns, photo-control of the virulence in pathogenic bacteria through the recombinant expression of such systems, photobehavioural responses in photobacteria, modulation of neural systems and dynamic molecular switches to regulate biological activities. BLUF domains, in combination with the EAL, GGDEF or CHD domains, could be utilized for the photo-dependent regulation of c-di-GMP and cAMP associated signaling in bacteria [64,85]. The BLUF domain associated with a B12 binding domain has also been analyzed; however, it does not show the important amino acids that are required in order to form the binding pocket for AdoB12. Cheng et al. [92] also reported a similar BLUF module and suggested that the associated B12 binding domain also has a photosensory function which can regulate activity in response to light. The B12 binding domain broadens the absorption range for the BLUF photosensor, which could be critical to several regulatory pathways [96,172]. We could also use this modular combination for the photo-dependent regulation of pathways like carotenoid synthesis or photosystem biosynthesis [92]. Another combination which could be engineered and used for an optogenetic application is the association of the BLUF domain with the DNA pol III γ III domain. Using this modular architecture, we could regulate the actions of different components involved in DNA replication in a light-dependent manner.
Furthermore, the BLUF domain is also found in association with the p450 (cytochrome p450) domain, which could be used as an optogenetic tool for the light-dependent regulation of several pathways like fatty acid metabolism and secondary metabolism. The optogenetic potential of the BLUF domain could also be acquired in the two-component regulatory system for both prokaryotes and eukaryotes. A modular architecture in which the BLUF sensor domain is associated with the Rec (receiver) domain could be exploited as a two-component regulatory system for the photo-dependent regulation of processes like bacterial chemotaxis and other regulatory pathways [118]. The optogenetic potential of the BLUF domain could also be extended to the regulation of the efflux pump and transporter involved in antibiotic resistance, tolerance to a toxic chemical, synthesis of an osmoprotectant, quorum sensing, drug resistance and sporulation [120–122]. The BLUF domain associated with the TetR or AcrR domain may be exploited for the light-dependent regulation of pathways (as mentioned earlier) in bacteria. The BLUF domain was also analyzed with the endonuclease 3c and endonuclease\_NS domains for their optogenetic potential. The BLUF domain endonuclease 3c could be engineered and used as an optogenetic tool for the light-dependent regulation of the DNA repair process. On the other hand, the BLUF domain associated with the endonuclease\_NS domain could be exploited for the light-dependent regulation of processes like hydrolase activity, nucleic acid binding and metal ion binding [126]. The modular architectures comprised of BLUF associated with the AraC and abhydrolase domains could also be harnessed for the light-dependent regulation of the arabinose metabolism, as well as the diverse group of hydrolytic enzymes that include proteases, lipases, peroxidases, esterases, epoxide hydrolases, and dehalogenases, respectively [131]. 
BLUF in combination with the RhoGEF domain is also considered an important modular architecture, which could be used as an optogenetic tool for the light-dependent regulation of a diverse array of cellular processes [142–146,148,149].

#### **4. Conclusions**

Applications of photoreceptors to quickly control molecular machines and, in turn, biological systems and processes, present the scientific community with an exciting opportunity offering several possibilities and approaches, along with their limitations. Recently, several such approaches have been reported, including photoswitches, UV photo-reactivation and deactivation, spectral tuning, and photocaging [25]. However, a detailed understanding of electron transfer mechanisms, transient state intermediates, and amino acid patterns will allow the development of more precise recombinant techniques, fusion proteins and complexes for improving such systems and providing an alternate route to design broadly reactive light-sensitive probes. In the case of LOV domains, base catalysis, pH, proton inventory and structural studies indicate that the deprotonation of flavin N(5) is rate-determining for the recovery [173]. It will also be interesting to study mechanisms to convert these small spectral shifts into more significant jumps leading to many fold increases in the range of 400–500 nm in the activities of such receptors and downstream signals. Theoretically, the fold increase in the signal response could be improved to that extent [173,174]. Biocatalytic reactions using photoactivated enzymes, produced either through recombinant methods or through directed evolution, can reduce the complexity of the system by controlling the system remotely to deliver high-value materials and compounds in biotechnology or the pharma industry. Photo-controlled receptors like BLUF can play a vital role in biotransformation cascades in the same way as photosensitive chemical groups like O-nitrobenzyl, 3-nitrophenyl and benzyloxycarbonylphenyl in organic reaction steps, where the reaction's product acts as the substrate for the next reaction in a multistep pathway [175]. We have identified and analysed 34 such proteins containing the BLUF domain in association with different effector domains, such as kinases, phosphatases, phosphodiesterases, a transcriptional repressor (TetR), PAS, endonuclease, PBP proteins, etc., involved in regulating a wide range of cellular processes. All of the selected proteins have a conserved catalytic core, including tyrosine, glutamine, and tryptophan or methionine, which are essential for the BLUF photo-activation and photocycle.
Until now, several photoreceptors, such as channelrhodopsin2 (ChR2) [176], phytochrome [177], cryptochromes [178], LOV [19], as well as BLUF [4,30], have been adopted for their optogenetic potentials. However, considering the modularities in the BLUF domain architecture and their unexplored nature, these combinations have a great potential to be further utilized for the development of novel optogenetic technologies. To conclude, these photoactivated/controlled systems could be the way forward in synthetic biology, as subtle differences in the light sensitivities of these vast arrays of receptors can be harnessed to regulate a reaction cascade in an engineered organism by choosing a particular photoreactivity and control.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2076-3417/9/18/3924/s1, Figure S1: Multiple sequence alignment of the BLUF coupled EAL domain. Sequence representing the EAL domain from the BlrP1 protein was used as the template for the sequence alignment analysis. Amino acid residues in solid box are the conserved residues involved in the formation of the EAL active site, Figure S2: Multiple sequence alignment of the BLUF coupled CHD domain. Sequence representing the CHD domain from the bPAC protein was used as template for the sequence alignment analysis. Amino acid residues in solid box are the conserved residues involved in the formation of the nucleotide binding site, Figure S3: Multiple sequence alignment of the BLUF coupled PAS domain. Sequence representing the PAS domain from the photoactivated yellow protein (PYP) from *Halorhodospira halophila* was used as template for the sequence alignment analysis. Amino acid residues in solid box are representing the PAS core motif responsible for the generation and propagation of the signal to the adjoining effector domain, Figure S4: Multiple sequence alignment of the BLUF coupled vitamin B12 binding domain. Sequence representing the B12 binding domain from the CarH and AerR proteins was used as template for the sequence alignment analysis. The conserved amino acids (Trp 131, Val138, Glu141 and His142) essential for forming the binding pocket for the substrate, i.e., AdoB12, is not found in the aligned portion of the BLUF coupled vitamin B12 binding domain, Figure S5: Multiple sequence alignment of the BLUF coupled DNA pol III γ III domain. The sequence of the well characterized truncated (1-373 amino acids) DNA polymerase III subunit gamma/tau (WP\_113440333.1) from *E. coli* was used as template for the sequence alignment analysis. 
Amino acid residues in solid box represent the important residues crucial for the enzyme activity, Figure S6: Multiple sequence alignment of the BLUF coupled p450 and the well characterized CYP protein from *Bacillus subtilis* (used as template). Amino acid residues in solid box representing the conserved (Arg) and altered (Pro to Ser) amino acid residues essential for the substrate binding, Figure S7: Protein-Protein interaction network depicting interacting partners of the selected effector domain (EAL, CHD, PAS, B12, DNA POL III γ III and p450) of the BLUF modular proteins. Protein highlighted in yellow is the query protein. Protein-protein interaction analysis was performed by using String version 11 (https://string-db.org/). Details of the query proteins, domains, along with the annotations are given in Table S1, Table S1: Output showing the details of query proteins, domains, interacting proteins and annotated functions.

**Author Contributions:** M.S.K. analysed all the sequences in detail and wrote the manuscript. R.S. performed the preliminary analysis of the BLUF coupled proteins. S.K.V. conceptualized the protein-protein interaction analysis of the BLUF coupled domains and wrote the manuscript. S.K.S. performed the detailed structure-function analysis of the BLUF and its modular domains and wrote the relevant part of the paper. S.K. conceptualized, outlined and wrote the manuscript.

**Funding:** M.S.K.'s research associateship was supported by SR/NM/NB-1087/2017(G)-JNU. S.K. was supported by SR/NM/NB-1087/2017(G)-JNU. S.K.S. was supported by the research endowment fund from MUJ.

**Conflicts of Interest:** The authors have no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Ion Channel Properties of a Cation Channelrhodopsin,** *Gt***\_CCR4**

#### **Shunta Shigemura 1, Shoko Hososhima 1, Hideki Kandori 1,2 and Satoshi P. Tsunoda 1,3,\***


Received: 23 July 2019; Accepted: 18 August 2019; Published: 21 August 2019

**Abstract:** We previously reported a cation channelrhodopsin, *Gt*\_CCR4, which is one of the 44 types of microbial rhodopsins from a cryptophyte flagellate, *Guillardia theta*. Due to the modest homology of amino acid sequences with a chlorophyte channelrhodopsin such as *Cr*\_ChR2 from *Chlamydomonas reinhardtii*, it has been proposed that a family of cryptophyte channelrhodopsins, including *Gt*\_CCR4, has a distinct molecular mechanism for channel gating and ion permeation. In this study, we compared the photocurrent properties, cation selectivity and kinetics between well-known *Cr*\_ChR2 and *Gt*\_CCR4 by a conventional patch clamp method. Large and stable light-induced cation conduction by *Gt*\_CCR4 at the maximum absorbing wavelength (530 nm) was observed with only small inactivation (15%), whereas the photocurrent of *Cr*\_ChR2 exhibited significant inactivation (50%) and desensitization. The light sensitivity of *Gt*\_CCR4 was higher (EC50 = 0.13 mW/mm<sup>2</sup>) than that of *Cr*\_ChR2 (EC50 = 0.80 mW/mm<sup>2</sup>) while the channel open life time (photocycle speed) was in the same range as that of *Cr*\_ChR2 (25~30 ms for *Gt*\_CCR4 and 10~15 ms for *Cr*\_ChR2). This observation implies that *Gt*\_CCR4 enables optical neuronal spiking with weak light in high temporal resolution when applied in neuroscience. Furthermore, we demonstrated high Na<sup>+</sup> selectivity of *Gt*\_CCR4 in which the selectivity ratio for Na<sup>+</sup> was 37-fold larger than that for *Cr*\_ChR2, which primarily conducts H<sup>+</sup>. On the other hand, *Gt*\_CCR4 conducted almost no H<sup>+</sup> and no Ca<sup>2+</sup> under physiological conditions. These results suggest that ion selectivity in *Gt*\_CCR4 is distinct from that in *Cr*\_ChR2. In addition, a unique red-absorbing and stable intermediate in the photocycle was observed, indicating a photochromic property of *Gt*\_CCR4.

**Keywords:** microbial rhodopsin; channelrhodopsin; electrophysiology; optogenetics

#### **1. Introduction**

Microbial-type rhodopsins are made up of seven or eight transmembrane helices with a covalently bound all-*trans* retinal as the chromophore [1]. They are found in archaea, bacteria, eukaryota (such as fungi and algae) and viruses, and are physiologically responsible for energy production and the phototaxis reaction. Molecular functions of microbial rhodopsins include ion transporters, sensors and light-regulated enzymes. Ion-transporting rhodopsins are divided into ion pumps and channels. Bacteriorhodopsin (BR) was the first identified outward-directed proton-pumping rhodopsin [2]. Discoveries of a Cl<sup>−</sup> pump, an Na<sup>+</sup> pump and inward-directed proton pumps have continued until recently [3–6]. Structure-based and spectroscopic studies, when combined with electrophysiology and molecular dynamics studies, revealed the detailed molecular mechanism of bacteriorhodopsin and other pumps.

Channelrhodopsin-1 and -2 (*Cr*\_ChR1 and *Cr*\_ChR2) from *Chlamydomonas reinhardtii* were the first light-gated ion channels to be discovered [7,8]. These homologous proteins conduct cations, with a permeability ratio for H<sup>+</sup>, Na<sup>+</sup>, and K<sup>+</sup> of 10<sup>6</sup>, 1, and 0.5, respectively. High-resolution X-ray structures revealed details of their molecular architecture and provided insight into their photoactivation and ion conduction [9,10].

Successful expression of *Cr*\_ChR2 in neurons allowed the action potential to be manipulated by light, which opened up a new field of research, optogenetics [11,12]. A number of variant molecules have been engineered to improve the functionality of ChR, and homologous ChRs were then reported [13]. Color-tuning variants cover almost the entire visible range. *Cr*\_ChR2 displays an action spectrum maximum at 470 nm [8]. ChR variants such as C1V1, which is the chimeric version of ChR1 from *Chlamydomonas reinhardtii* and *Volvox carteri*, or C1C2 (a green receiver) absorb light at around 530~545 nm [14–16]. Another red-shifted ChR, Chrimson from *Chlamydomonas noctigama*, exhibits an absorption maximum at 590 nm, which allows reliable neuronal stimulation by light exceeding 600 nm [17]. On the other hand, *Ts*ChR or *Ps*ChR absorb at shorter wavelengths, making it possible to excite neurons at 440 nm [18].

The lifetime of an open channel can be extended by mutations at C128 and D156 (DC pair) which form a hydrogen bond bridge in *Cr*\_ChR2. Mutations at C128 to Thr, Ala and Ser slowed the kinetics of channel closing 200, 5000 and 10,000-fold respectively [19]. *Cr*\_ChR2 D156C displayed an even stronger effect, namely higher light sensitivity and prolonged lifetime of the open channel, by as much as 30 min [20].

Converting ion selectivity is challenging. The *Cr*\_ChR2 L132C mutant showed improved Ca<sup>2+</sup> permeability [21]. The permeability ratio between H<sup>+</sup> and Na<sup>+</sup> could be modified by a replacement at E143 to A in Chrimson [22]. Anion channelrhodopsins were engineered or discovered from nature and they have been applied as neuronal silencing tools [23–25]. The crystal structures of anion channelrhodopsins revealed their unique features related to the channel gating mechanism [26–28].

A novel cation channelrhodopsin family was reported in 2016 and 2017 from *Guillardia theta*, namely *Gt*\_CCR1-4 [29,30]. These cation channelrhodopsins (CCRs) from cryptophyte algae are more homologous to haloarchaeal rhodopsins, such as proton pumping bacteriorhodopsin, than to chlorophyte CCRs, including *Cr*\_ChR2. Actually, *Gt*\_CCRs conserve the characteristic amino-acid residues involved in unidirectional proton transfer, including the proton acceptor D85 and the proton donor D96 in bacteriorhodopsin (Table S1).

On the other hand, a characteristic glutamic acid in TM2 (E90 in *Cr*\_ChR2), which is crucial for channel gating and ion selectivity, is not conserved in *Gt*\_CCRs [23,31]. *Cr*\_ChR2 possesses a so-called DC pair (C128 and D156 in *Cr*\_ChR2), which is responsible for the channel life time [19,32,33]. This is not found in *Gt*\_CCRs. Thus, overall sequence patterns separate these cryptophyte CCRs from chlorophyte channels. Molecular mechanisms such as channel gating and ion selectivity could therefore be distinct from those of chlorophyte CCRs. Sineshchekov and coworkers already revealed that the retinal Schiff-base (SB) in *Gt*\_CCR2 rapidly deprotonates to the D85 homolog, as in BR, upon photoisomerization [34]. Channel-opening requires deprotonation of the D96 homolog. We independently identified photocycle intermediates during the channel function of *Gt*\_CCR4 from electrophysiological and flash photolysis experiments. The M-decay corresponds to channel-closing, implicating tight coupling between retinal dynamics and channel function. However, reprotonation of SB for channel closing was achieved by the direct return of a proton from the D85 homolog. Such proton transfers do not occur in *Cr*\_ChR2, where D156 in TM4 provides the proton [35]. We demonstrated, using an FTIR study, that the secondary structural change in the primary reaction was much smaller than in *Cr*\_ChR2 [30]. These differences in the molecular mechanism place the cryptophyte CCR in a new family of channelrhodopsins, which we described as "DTD channelrhodopsins" or "BR-like cation channelrhodopsins" [29,30]. To further reveal the characteristics of these DTD channelrhodopsins, in this study we performed electrophysiological measurements in parallel with *Cr*\_ChR2.

#### **2. Materials and Methods**

#### *2.1. Expression Plasmids*

A mammalian expression plasmid peGFP-*Gt*\_CCR4 was described previously [30]. pVenus-N1-Chop2-315 was a kind gift from Prof. Yawo (The University of Tokyo) [12].

#### *2.2. Cell Culture*

The electrophysiological assays of *Gt*\_CCR4 and *Cr*\_ChR2 were performed on ND7/23 cells, a hybrid cell line derived from neonatal rat dorsal root ganglia neurons fused with mouse neuroblastoma [36]. ND7/23 cells were grown on a collagen-coated coverslip in Dulbecco's modified Eagle's medium (Wako, Osaka, Japan) supplemented with 2.0 μM all-*trans* retinal and 5% fetal bovine serum, under a 5% CO2 atmosphere at 37 °C. The expression plasmids were transiently transfected by using the FuGENE HD transfection Reagent (Promega, Fitchburg, WI, USA) according to the manufacturer's instructions. Electrophysiological recordings were then conducted 24–36 h after transfection. Successfully transfected cells were identified by eGFP or Venus fluorescence under a microscope prior to the measurements.

#### *2.3. Electrophysiology*

All experiments were carried out at room temperature (22 ± 2 °C). Photocurrents were recorded as previously described using an Axopatch 200B amplifier (Molecular Devices, Sunnyvale, CA, USA) under a whole-cell patch clamp configuration [12]. Data were filtered at 5 kHz, sampled at 20 kHz (Digidata 1550, Molecular Devices, Sunnyvale, CA, USA) and stored on a computer (pClamp 10.6, Molecular Devices). Pipette resistance was 3–6 MΩ. The standard internal pipette solution for the whole-cell voltage clamp contained (in mM) 120 KOH, 100 glutamate, 2.5 MgCl2, 2.5 MgATP, 0.01 Alexa568, 50 HEPES, and 5 EGTA, adjusted to pH 7.2. The standard extracellular solution contained (in mM) 140 NaCl, 2 KCl, 2 MgCl2, 2 CaCl2, and 10 HEPES, adjusted to pH 7.2. The internal pipette solution for ion selectivity measurements contained (in mM) 1 NaCl, 1 KCl, 2 CaCl2, 2 MgCl2, 110 N-methyl D-glucamine, 10 CHES, and 10 EGTA, adjusted to pH 9.0. The extracellular solutions for ion selectivity measurements contained (in mM):

- ExNMG, 9.0: 1 NaCl, 1 KCl, 2 CaCl2, 2 MgCl2, 140 N-methyl D-glucamine, and 10 CHES, adjusted to pH 9.0.
- ExNMG, 6.85: 1 NaCl, 1 KCl, 2 CaCl2, 2 MgCl2, 140 N-methyl D-glucamine, and 10 MES, adjusted to pH 6.85.
- ExNaCl, 9.0: 140 NaCl, 1 KCl, 2 CaCl2, 2 MgCl2, and 10 CHES, adjusted to pH 9.0.
- ExKCl, 9.0: 1 NaCl, 140 KCl, 2 CaCl2, 2 MgCl2, and 10 CHES, adjusted to pH 9.0.
- ExCsCl, 9.0: 1 NaCl, 1 KCl, 140 CsCl, 2 CaCl2, 2 MgCl2, and 10 CHES, adjusted to pH 9.0.
- ExCaCl2, 9.0: 1 NaCl, 1 KCl, 70 CaCl2, 2 MgCl2, and 10 CHES, adjusted to pH 9.0.
- ExMgCl2, 9.0: 1 NaCl, 1 KCl, 2 CaCl2, 70 MgCl2, and 10 CHES, adjusted to pH 9.0.

The pH of all solutions was adjusted with N-methyl D-glucamine or HCl. The liquid junction potential was calculated and compensated by pClamp 10.6 software. Time constants were determined by a single exponential fit unless noted.
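The single exponential fit mentioned above can be illustrated with a minimal sketch. The fits in this study were performed in pClamp; the code below is not that procedure but a simple log-linear least-squares alternative, assuming a baseline-subtracted decay of the form I(t) = I0·exp(−t/τ) with no noise-induced sign changes. The function name and the synthetic trace are hypothetical.

```python
import math

def fit_single_exponential(times_ms, currents):
    """Estimate (amplitude, tau) for I(t) = I0 * exp(-t / tau).

    Assumption: the trace is baseline-subtracted and never crosses zero,
    so ln|I| is linear in t and ordinary least squares can be used.
    The returned amplitude is the magnitude |I0|.
    """
    logs = [math.log(abs(i)) for i in currents]
    n = len(times_ms)
    mean_t = sum(times_ms) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_ms)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_ms, logs))
    slope = sxy / sxx                  # equals -1 / tau
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -1.0 / slope

# Hypothetical, noise-free decay: 2000 pA peak magnitude, tau = 25 ms.
times = [float(t) for t in range(0, 100, 5)]
trace = [-2000.0 * math.exp(-t / 25.0) for t in times]
amp, tau = fit_single_exponential(times, trace)
print(f"amplitude = {amp:.1f} pA, tau = {tau:.2f} ms")
```

On real recordings a nonlinear fit with an offset term is usually more robust, because the logarithm amplifies noise near the baseline.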

#### *2.4. Optics*

For whole-cell voltage clamp, irradiation at 470, 530 or 590 nm was carried out using WheeLED and collimated LEDs (part Nos. WLS-LED-0530-03 or LCS-0530-03-22, WLS-LED-0590-03; Mightex, Toronto, ON, Canada) or an Nd:YAG flash laser (Minilite, 532 nm; Continuum, San Jose, CA, USA) controlled by computer software (pCLAMP10.6, Molecular Devices). Light power was measured directly through the objective lens of the microscope with a power meter (LP1, Sanwa Electric Instruments Co., Ltd., Tokyo, Japan).

#### *2.5. Confocal Images*

Cell images shown in Figure 1A,B were observed by a Nikon A1 LFOV through an objective lens, Apo 60x Oil λS DIC N2.

**Figure 1.** Basic properties of *Gt*\_CCR4 and *Cr*\_ChR2. (**A**,**B**) Each cation channelrhodopsin expressed in ND7/23 cells was stimulated by green (530 nm) or blue (470 nm) LED light (6.8 mW/mm2). Standard solutions were used. Membrane potentials were clamped from −60 mV to +60 mV in +20 mV steps. Fluorescence images were taken using a confocal microscope. (**C**) Current-voltage relationship (I-V plot) of *Gt*\_CCR4 (filled symbol) and *Cr*\_ChR2 (empty symbol). Current peak component (square) and steady state amplitude (circle) of two channels are depicted. (**D**) Current-decay kinetics of *Gt*\_CCR4 and *Cr*\_ChR2. τoff is plotted as a function of membrane voltage. Filled circle; *Gt*\_CCR4, empty circle; *Cr*\_ChR2. (**E**,**F**) Light power dependency of photocurrents from *Gt*\_CCR4 and *Cr*\_ChR2 at −60 mV. Each channel was stimulated by 530 nm (*Gt*\_CCR4) and 470 nm (*Cr*\_ChR2). Photocurrent values are normalized. Current peak component (filled circle) and steady state amplitude (empty circle) are depicted. (n = 4–8 cells).

#### *2.6. Statistical Analysis*

All data in the text and figures are expressed as mean ± SEM.

#### **3. Results**

#### *3.1. Basic Characterization of Photocurrent*

We transiently expressed *Gt*\_CCR4 and *Cr*\_ChR2 in ND7/23 cells by a conventional transfection method (FuGENE). Expression of these channels was visualized by tagged-GFP or Venus fluorescence. Strong membrane expression was confirmed for both channels, although cytoplasmic GFP was observed in *Gt*\_CCR4-expressing cells (Figure 1A microscope images). We applied 530 nm light to induce a photocurrent in *Gt*\_CCR4 and 470 nm light in *Cr*\_ChR2 at the same light intensity (6.8 mW/mm<sup>2</sup>). A large photocurrent was recorded from the *Gt*\_CCR4-expressing cells, reproducing previous studies (Figure 1A, left) [30]. Current amplitude reached −2 nA at −60 mV. The current showed an initial peak component (Ip) which decayed slightly into a steady state level (Iss). However, the steady state amplitude still retained about 80% of the transient peak component. The photocurrent from *Cr*\_ChR2 showed a large peak component which reached about −2 nA at −60 mV (Figure 1A, right). The current decayed by 50% of the initial peak, suggesting that *Cr*\_ChR2 exhibits a markedly larger inactivation than *Gt*\_CCR4. Figure 1C depicts the current-voltage relationship of the photocurrent from *Gt*\_CCR4 and *Cr*\_ChR2. Both the peak component (Ip) and the steady state current (Iss) are plotted. The shape of the I-V plot from *Gt*\_CCR4 indicates strong inward rectification. Iss of *Cr*\_ChR2 displayed similar inward rectification, while Ip was only weakly rectified, with a marked outward-directed current at positive membrane voltages. For both Ip and Iss, *Gt*\_CCR4 showed a larger current density (pA/pF) than *Cr*\_ChR2. For example, the photocurrent (Iss) of *Gt*\_CCR4 at −100 mV exceeded −80 pA/pF, while that of *Cr*\_ChR2 was only about −40 pA/pF.

The kinetics of the photocurrent decay after shutting off the light are shown in Figure 1D. The time constant of *Gt*\_CCR4 is about 25–35 ms at membrane voltages between −100 and 80 mV, while *Cr*\_ChR2 showed faster kinetics, with a time constant of about 10–20 ms. Next, we compared the light sensitivity of the photocurrent of the two channels (Figure 1E,F). The photocurrent amplitude from *Cr*\_ChR2 grows as a typical sigmoidal curve for both the initial peak and the steady state components (Figure 1F). EC50 was determined as 0.8 mW/mm<sup>2</sup> for Ip and 0.35 mW/mm<sup>2</sup> for Iss under the conditions tested. On the other hand, the *Gt*\_CCR4 current showed unique growth in terms of power dependency with two apparent phases, in which the current first saturated at 0.1 mW/mm<sup>2</sup> at about 50% of full activation, followed by a second phase of growth from 1 to 10 mW/mm<sup>2</sup> (Figure 1E). The EC50 was determined as 0.13 mW/mm<sup>2</sup> for Ip and 0.18 mW/mm<sup>2</sup> for Iss. These results indicate that *Gt*\_CCR4 is more sensitive to light with respect to channel activation.
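To make the meaning of these EC50 values concrete, the following sketch evaluates a Hill-type activation curve at a given light power. This is an illustration only: the Hill form and the coefficient of 1 are assumptions that are not stated in the text, and a single sigmoid cannot capture the two-phase growth described for *Gt*\_CCR4. Only the EC50 values for Ip are taken from the text; the function names are hypothetical.

```python
def hill_response(power, ec50, hill_n=1.0):
    """Fraction of maximal photocurrent at a given light power (mW/mm^2).

    Assumed model: response = P^n / (P^n + EC50^n).
    """
    if power < 0 or ec50 <= 0:
        raise ValueError("power must be >= 0 and ec50 must be > 0")
    return power ** hill_n / (power ** hill_n + ec50 ** hill_n)

def power_for_fraction(fraction, ec50, hill_n=1.0):
    """Light power needed to reach a given activation fraction (0 < f < 1)."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return ec50 * (fraction / (1.0 - fraction)) ** (1.0 / hill_n)

# EC50 values for the peak component (Ip) as reported in the text.
EC50_IP = {"Gt_CCR4": 0.13, "Cr_ChR2": 0.80}

for name, ec50 in EC50_IP.items():
    activation = hill_response(0.13, ec50)
    print(f"{name}: {activation:.2f} of maximal Ip at 0.13 mW/mm^2")
```

Under this simplified model, a light power that half-activates *Gt*\_CCR4 activates only a small fraction of the *Cr*\_ChR2 peak current, which is the practical sense in which the lower EC50 matters for weak-light stimulation.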

#### *3.2. Ion Selectivity*

It was already reported that *Gt*\_CCRs are H<sup>+</sup>- and Na<sup>+</sup>-permeable cation channels [29,30]. Here we investigated the cation selectivity of *Gt*\_CCR4 in more detail relative to *Cr*\_ChR2. Ionic conditions of the extracellular solution were systematically exchanged with various cations including Na<sup>+</sup>, K<sup>+</sup>, Cs<sup>+</sup>, Ca<sup>2+</sup>, Mg<sup>2+</sup> and NMG. In the presence of NMG, one can assume H<sup>+</sup> as the permeated ion. Figure 2A,B show the I-V plots of *Gt*\_CCR4 and *Cr*\_ChR2 under several ionic conditions. Clearly, the reversal potential shift in Na<sup>+</sup> solution is larger in *Gt*\_CCR4 than in *Cr*\_ChR2, suggesting that *Gt*\_CCR4 is more permeable to Na<sup>+</sup> than *Cr*\_ChR2. Notably, I-V plots of *Gt*\_CCR4 at pH 6.85 and 9.0 in NMG are almost identical (Figure 2A) whereas a large shift of reversal potential was observed in *Cr*\_ChR2 under the same conditions (Figure 2B). These results indicate that *Gt*\_CCR4 has less H<sup>+</sup> selectivity than *Cr*\_ChR2. The photocurrent amplitude of *Gt*\_CCR4 and *Cr*\_ChR2 at −60 mV under each condition is summarized in Figure 2C,D. The current amplitude of *Gt*\_CCR4 was significantly larger in the presence of Na<sup>+</sup> and K<sup>+</sup>, close to −100 pA/pF, and in the presence of Cs<sup>+</sup>, at about −40 pA/pF. In contrast, only a negligible current was observed at low pH, or in the presence of Ca<sup>2+</sup> or Mg<sup>2+</sup>. This supports the notion that *Gt*\_CCR4 is more of a monovalent metal cation selective channel. In addition, we also tested measurements under a competitive environment in which both Na<sup>+</sup> and Ca<sup>2+</sup> were added to the bath solutions (Figure S1). Interestingly, photocurrents by *Gt*\_CCR4 were suppressed at a higher Ca<sup>2+</sup> concentration (40 mM), suggesting that Na<sup>+</sup> flow is blocked by Ca<sup>2+</sup>. In contrast, such a large difference in current amplitude was not observed under various conditions in the photocurrent from *Cr*\_ChR2 (Figure 2D). This is due to low ion selectivity in *Cr*\_ChR2, as was reported previously. To estimate the selectivity ratio, the reversal potential shift from the condition with NMG at pH 9.0 is depicted in Figure 2E. The shifts (ΔErev) in *Gt*\_CCR4 are larger than in *Cr*\_ChR2 for Na<sup>+</sup>, K<sup>+</sup>, and Cs<sup>+</sup>, indicating that *Gt*\_CCR4 is selective for monovalent cations but less selective for H<sup>+</sup>. In the initial study of *Cr*\_ChR2, the permeability ratio was calculated based on the current amplitude [8]. The table in Figure 2F summarizes the ratio for both *Gt*\_CCR4 and *Cr*\_ChR2. H<sup>+</sup> permeability for *Cr*\_ChR2 is 0.77 × 10<sup>5</sup>, close to the reported value (1 × 10<sup>6</sup>), while that for *Gt*\_CCR4 is 2.1 × 10<sup>4</sup>, indicating about 37-fold less permeability for H<sup>+</sup> in *Gt*\_CCR4.
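One simple way to picture a permeability ratio based on current amplitude and ionic concentration is sketched below. This is not the calculation used for Figure 2F, whose exact procedure is not given in the text; it assumes that, at a fixed holding potential, the inward current carried by an ion scales with its permeability times its extracellular concentration. The function names and the current values in the example are hypothetical.

```python
def proton_concentration(ph):
    """Convert pH to H+ concentration in mol/L."""
    return 10.0 ** (-ph)

def relative_permeability(i_x, conc_x, i_ref, conc_ref):
    """Permeability of ion X relative to a reference ion.

    Assumption: current is proportional to permeability * concentration
    at the same holding potential, so P_X / P_ref is the ratio of
    concentration-normalized currents. Both currents must share units,
    and both concentrations must share units.
    """
    if conc_x <= 0 or conc_ref <= 0:
        raise ValueError("concentrations must be positive")
    if i_ref == 0:
        raise ValueError("reference current must be non-zero")
    return (i_x / conc_x) / (i_ref / conc_ref)

# Hypothetical example values (pA/pF), not data from this study:
i_h = -10.0    # current attributed to H+ in NMG solution at pH 6.85
i_na = -100.0  # current attributed to Na+ in 140 mM NaCl
ratio = relative_permeability(i_h, proton_concentration(6.85), i_na, 0.140)
print(f"P_H / P_Na = {ratio:.2e}")
```

Because the H<sup>+</sup> concentration is several orders of magnitude lower than that of Na<sup>+</sup>, even a small proton current translates into a large relative permeability under this assumption.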

**Figure 2.** Ion selectivity of *Gt*\_CCR4 and *Cr*\_ChR2. Each channelrhodopsin expressed in ND7/23 cells was stimulated by green and blue LED light. I-V plot of *Gt*\_CCR4 (**A**) and *Cr*\_ChR2 (**B**) are depicted. Steady-state current density (pA/pF) in 20 mV steps from −100 mV to +80 mV was plotted. The present liquid junction potential was considered. The pipette solution contained 110 mM NMG-Cl at pH 9.0, and the bath solution varied: black, NMG-Cl at pH 9.0; red, NMG-Cl at pH 6.85; green, NaCl at pH 9.0; blue, KCl at pH 9.0; grey, CsCl at pH 9.0. See "Materials and Methods" for details about the solutions. (**C**,**D**) Comparison of current density of *Gt*\_CCR4 (**C**) and *Cr*\_ChR2 (**D**) in the presence of various cations at −60 mV. (**E**) Reversal potential shift (ΔErev) for each condition for *Gt*\_CCR4 and *Cr*\_ChR2. Erev was determined from the I-V plot shown in **A**,**B**. Each Erev value was subtracted from the Erev at NMG-Cl at pH 9.0. (**F**) Permeability ratio for H<sup>+</sup> and Na<sup>+</sup> in *Gt*\_CCR4 and *Cr*\_ChR2 as estimated from current value and ionic concentration. (n = 6–9 cells).

#### *3.3. Flash Laser Electrophysiology*

We then measured the photocurrent under a single-turnover condition with a flash laser as the light source (Nd:YAG). As shown in Figure 3A,B, a 5 ns light pulse evoked a large inward-directed peak current at −60 mV for both *Gt*\_CCR4- and *Cr*\_ChR2-expressing cells. The current amplitude and direction are voltage-dependent for both channels, as was expected from recordings with LED. Current growth was fitted with a single exponential function (Figure 3C). The time constant of *Gt*\_CCR4 seems to be independent of membrane voltage, while that of *Cr*\_ChR2 slowed down slightly as voltage increased. The time constant of current decay was about 20 ms for *Gt*\_CCR4 and about 5–10 ms for *Cr*\_ChR2, both of which are smaller than the values obtained from the measurement by LED light (Figure 1D), suggesting two distinct open states with different kinetics.

**Figure 3.** Flash laser stimulation. Standard solutions were used. Each channelrhodopsin in ND7/23 cells was stimulated by 5 ns by a green flash laser. Representative trace generated by *Gt*\_CCR4 (**A**) and *Cr*\_ChR2 (**B**) in 20 mV steps from −100 mV to +80 mV. (**C**) τon-voltage relationship from *Gt*\_CCR4 (filled circle) and *Cr*\_ChR2 (empty circle). Current rise was fitted by a single exponential function. The time constant was plotted. (**D**) τoff-voltage relationship from *Gt*\_CCR4 (filled circle) and *Cr*\_ChR2 (empty circle). Current decay was fitted by a single exponential function. (n = 5–7 cells).

#### *3.4. High Frequency Stimulation*

For optogenetics applications, reliable neuronal activation would require large and stable activity of the light-gated ion channel. In addition, rapid channel-closing is desired for optical stimulation at a high frequency. Thus, we compared the photocurrent from *Gt*\_CCR4- and *Cr*\_ChR2-expressing cells at three different light frequencies. As shown in Figure 4A, activation of *Gt*\_CCR4 with 10 Hz light at 9.67 mW/mm<sup>2</sup> generated temporally well-resolved peak currents which almost fully decayed before the next illumination. Peak amplitude remained unchanged because of the small inactivation, as shown in Figure 1A. On the other hand, the peak current amplitude decreased immediately after the initial stimulation in *Cr*\_ChR2-expressing cells (Figure 4B). Stimulation of *Gt*\_CCR4 at 20 and 50 Hz still retained a high level of peak amplitudes, although each peak did not decay completely (Figure 4C,E). Current inactivation was observed in *Cr*\_ChR2-expressing cells at each frequency (20 and 50 Hz) (Figure 4D,F). Figure 4G summarizes the residual current level before the next stimulation at each light frequency. At 10 Hz, both *Gt*\_CCR4 and *Cr*\_ChR2 showed a very low residual current level, indicating that the channels were almost shut off. As the frequency increased to 20 Hz and 50 Hz, significant residual currents were observed, which exceeded 50% in *Gt*\_CCR4 at 50 Hz, whereas *Cr*\_ChR2 showed a slightly lower residual current, probably because of faster channel kinetics.
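The link between channel-closing kinetics and residual current can be sketched with a single exponential decay model. This is a rough illustration, not the analysis behind Figure 4G: it assumes that the current decays as exp(−t/τoff) during the dark interval between pulses, and that the pulse duration (not stated in the text) is a free parameter, set to zero by default. The function name and the τoff values chosen from within the ranges reported above are illustrative.

```python
import math

def residual_fraction(freq_hz, tau_off_ms, pulse_ms=0.0):
    """Fraction of peak current remaining just before the next light pulse.

    Assumptions: single exponential decay with time constant tau_off_ms,
    starting at the end of each pulse; pulse_ms is a hypothetical pulse
    duration because the actual value is not given in the text.
    """
    dark_ms = 1000.0 / freq_hz - pulse_ms
    if dark_ms <= 0:
        raise ValueError("pulse duration must be shorter than the period")
    return math.exp(-dark_ms / tau_off_ms)

# Illustrative tau_off values within the reported ranges (ms).
TAU_OFF = {"Gt_CCR4": 30.0, "Cr_ChR2": 12.5}

for freq in (10, 20, 50):
    for name, tau in TAU_OFF.items():
        frac = residual_fraction(freq, tau)
        print(f"{freq:>2} Hz, {name}: {100 * frac:.0f}% residual current")
```

In this model the residual current rises steeply once the stimulation period approaches τoff, which is consistent with the qualitative trend described above; measured values will also depend on pulse width and on any slower decay components.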

**Figure 4.** High frequency stimulation of *Gt*\_CCR4 and *Cr*\_ChR2. (**A**,**C**,**E**) *Gt*\_CCR4 in ND7/23 cells was stimulated by green LED light with a frequency of 10, 20 or 50 Hz as indicated by colored dots or a line under each trace. Membrane voltage was clamped at −60 mV. (**B**,**D**,**F**) Similarly, *Cr*\_ChR2 was stimulated by blue LED light with a frequency of 10, 20 or 50 Hz as indicated by colored dots or a line under each trace. (**G**) Residual tail current at three different light frequencies is summarized. Residual tail current is the current amplitude right before the next photo stimulation indicated by an arrowhead in **A**,**B**. The current value was normalized to the peak amplitude as 100%. A standard solution was used. (n = 3 cells).

#### *3.5. Gt\_CCR4 Is Inactivated by 590 nm Light But Fully Reactivated by Green Light*

As already shown in Figure 1A, one of the obvious characteristics of *Gt*\_CCR4 is the small inactivation upon 530 nm illumination, which is close to λmax (525 nm). Here we compared the current shape under illumination with 530 nm and 590 nm light (Figure 5A–C). Three repetitive illuminations with 530 nm light gave almost the same current shape, which has a high steady state level (Figure 5A). Upon illumination with 590 nm light, the current slowly decayed to a small steady state level, which was further reduced after a second illumination with the 590 nm LED following a dark period of several seconds (Figure 5B,D). Even after 30 s or 2 min, the inactivated current did not recover (Figure S2A,B). These observations suggest that *Gt*\_CCR4 possesses a long-lived inactivated state which could accumulate in 590 nm light. Only illumination with 530 nm light allowed for a full recovery to the original steady state level (Figure 5C,E).

**Figure 5.** *Gt*\_CCR4 is inactivated by light with a longer wavelength. Standard solutions were used. Membrane voltage was clamped at −60 mV. (**A**) Photocurrent by 530 nm light (7.44 mW/mm<sup>2</sup>) reached a steady state level after a transient peak current. Repetitive stimulations gave almost identical current shapes. (**B**) Slow inactivation of the photocurrent was observed by illumination with 590 nm light (7.44 mW/mm<sup>2</sup>). The current was further reduced when illuminated twice with 590 nm light. (**C**) After 590 nm inactivation, 530 nm light fully reactivated the photocurrent to the original steady state level. (**D**) Photocurrent density from the measurement shown in (**B**). The photocurrent density upon exposure to the first 530 nm light (shown in G), and the second and third 590 nm light (Y), is shown. Ip; transient peak component. Iss; steady state component. (**E**) Photocurrent density from the measurement shown in (**C**). The photocurrent density upon exposure to the first 530 nm light (shown in G), the second 590 nm light (Y), and the third 530 nm light (shown in G), is shown. (n = 4 cells).

#### **4. Discussion**

In this study, we aimed to elucidate the ion-channel properties of a recently discovered light-gated cation channel *Gt*\_CCR4 from a cryptophyte and compare it to well-known *Cr*\_ChR2 from *Chlamydomonas reinhardtii* by using an electrophysiological method. In ND7/23 cells, *Gt*\_CCR4 showed a large current density (Figures 1C and 2A,B). Inactivation of the photocurrent obtained from *Gt*\_CCR4 was smaller than in *Cr*\_ChR2. In other words, a large current was observed under constant light (Figure 1A,B). This was also obvious in the current trace after illumination at a high frequency (Figure 4). These characteristics promise stable and reproducible stimulation of neuronal excitability by *Gt*\_CCR4.

The light sensitivity of *Gt*\_CCR4 is higher than that of *Cr*\_ChR2 with a particular steady state component (Iss) (Figure 1E,F). ChR variants with high light sensitivity have already been developed [19,20], but their open channel life times are longer by at least two orders of magnitude. Therefore, these are inappropriate for high-frequency light stimulation. In contrast, *Gt*\_CCR4 has a short open life time of 25–30 ms, which is in about the same range as that of *Cr*\_ChR2, i.e., 10–15 ms (Figure 1D). Together, *Gt*\_CCR4 is light sensitive and useful as an optogenetics tool with high time resolution. Optical irradiation causes heat and elevates temperature by 0.2–2 °C, especially in cranial nerve experiments [37]. Moreover, it has been demonstrated that the rise in temperature suppressed neuronal spiking in multiple brain regions, which serves as a warning against the use of strong light for neuronal stimulation. Such an undesirable artefact has to be avoided by lowering light intensity, while effective depolarization has to be stably maintained. *Gt*\_CCR4 has the potential for overcoming this problem.

H<sup>+</sup> permeability is high for *Cr*\_ChR2 [8]. Permeability for Ca<sup>2+</sup> has been reported, and not only for monovalent cations such as Na<sup>+</sup> and K<sup>+</sup> [8,21]. On the other hand, *Gt*\_CCR4 showed high selectivity for monovalent metal cations and low H<sup>+</sup> permeability. The permeability of a divalent cation such as Ca<sup>2+</sup> seems to be very low or negligible. The positions that are important for ion selectivity have been studied in *Cr*\_ChR2. E90 in the central gate is crucial for cation/anion selection [23]. L132 in TM3 influences Ca<sup>2+</sup> permeability [21]. Duan and coworkers recently demonstrated that the D156H and D156C mutations increase permeability for Na<sup>+</sup> and K<sup>+</sup> [38]. The outer gate in Chrimson (E139) on the extracellular side is important for Na<sup>+</sup> extrusion [22]. These key residues are not conserved in *Gt*\_CCR4, implying that a different ion selection property resides in DTD channels. It would be necessary to study selectivity based on variant analysis and structural information in the future. Considering its application in optogenetics, *Gt*\_CCR4 would not cause a significant change to pH in the cell membrane because of its very low H<sup>+</sup> permeability, which could be advantageous when an unknown effect of pH needs to be prevented. To enable optical stimulation without improper calcium signaling, *Gt*\_CCR4 might work better than *Cr*\_ChR2.

A single-turnover photocurrent of *Gt*\_CCR4 elicited by laser irradiation gave a time constant (τoff) of 15–20 ms, which is shorter than that obtained with constant light (25–30 ms). This suggests two processes for channel opening and closing. A dual photocycle model was indeed proposed for *Cr*\_ChR2 [39]. A similar reaction is expected to occur in *Gt*\_CCR4, but more experiments are needed to prove this.

We found a characteristic inactivation of channel activity by long wavelength absorption in *Gt*\_CCR4 (Figure 5). Since inactivation lasted at least a few minutes, formation of a stable intermediate with long wavelength absorption is anticipated. Alternatively, *Gt*\_CCR4 exhibits a photochromic property that is seen in the photocycle of *Anabaena* sensory rhodopsin [40]. Such photochromism or desensitization was also observed in chlorophyte channelrhodopsins [41,42]. We are now focusing on understanding the reaction mechanism in greater depth via a spectroscopic experiment. In conclusion, we here elucidated the cation channel properties of *Gt*\_CCR4. Its high conductance and cation selectivity without significant inactivation would be an appropriate set of features for optogenetics applications. We are currently assessing the feasibility of *Gt*\_CCR4 as an optical stimulator in cultured neurons.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2076-3417/9/17/3440/s1, Table S1: Amino acid alignments of bacteriorhodopsin (BR), *Cr*\_ChR2 and *Gt*\_CCR4. Figure S1: Ca<sup>2+</sup> photocurrents of *Gt*\_CCR4 and *Cr*\_ChR2. Figure S2: *Gt*\_CCR4 has a long-lived and long wavelength-absorbing inactivated state.

**Author Contributions:** Conceptualization, All authors; Data Curation, S.S. and S.H.; Validation, S.S., S.H. and S.P.T.; Writing—Original Draft Preparation, S.P.T.; Writing—Review & Editing, S.P.T.; Supervision, H.K. and S.P.T.

**Funding:** This work was funded by the Japanese Ministry of Education, Culture, Sports, Science and Technology (25104009, 15H02391 to H.K and 18K06109 to S.P.T.), a JST CREST grant (JPMJCR1753 to H.K.), and a JST PRESTO grant (JPMJPR1688 to S.P.T). S.H. is a Research Fellow of the Japan Society for the Promotion of Science (JSPS Research Fellow).

**Acknowledgments:** We thank Ryoko Nakamura for her excellent technical support.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Mutated Channelrhodopsins with Increased Sodium and Calcium Permeability**

#### **Xiaodong Duan, Georg Nagel \* and Shiqiang Gao \***

Botanik I, Julius-Maximilians-Universität Würzburg, Biozentrum, Julius-von-Sachs-Platz 2, D-97082 Würzburg, Germany; xiaodong.duan@stud-mail.uni-wuerzburg.de

**\*** Correspondence: nagel@uni-wuerzburg.de (G.N.); gao.shiqiang@uni-wuerzburg.de (S.G.)

Received: 14 January 2019; Accepted: 12 February 2019; Published: 15 February 2019

#### **Featured Application: This study provides optogenetic tools with superior photocurrent amplitudes and high Na<sup>+</sup> and Ca<sup>2+</sup> conductance.**

**Abstract:** (1) Background: After the discovery and application of *Chlamydomonas reinhardtii* channelrhodopsins, the optogenetic toolbox has been greatly expanded with engineered and newly discovered natural channelrhodopsins. However, channelrhodopsins with higher Ca<sup>2+</sup> conductance or more specific ion permeability are in demand. (2) Methods: In this study, we mutated the conserved aspartate of the transmembrane helix 4 (TM4) within Chronos and *Ps*ChR and compared them with published ChR2 aspartate mutants. (3) Results: We found that the ChR2 D156H mutant (XXM) showed enhanced Na<sup>+</sup> and Ca<sup>2+</sup> conductance, which was not noticed before, while the D156C mutation (XXL) influenced the Na<sup>+</sup> and Ca<sup>2+</sup> conductance only slightly. The aspartate to histidine and cysteine mutations of Chronos and *Ps*ChR also influenced their photocurrent, ion permeability, kinetics, and light sensitivity. Most interestingly, *Ps*ChR D139H showed a much-improved photocurrent compared to wild type, and an even higher Na<sup>+</sup> selectivity over H<sup>+</sup> than XXM. *Ps*ChR D139H also showed a strongly enhanced Ca<sup>2+</sup> conductance, more than two-fold that of CatCh. (4) Conclusions: We found that mutating the aspartate of TM4 influences the ion selectivity of channelrhodopsins. With their large photocurrent, enhanced Na<sup>+</sup> selectivity, and Ca<sup>2+</sup> conductance, XXM and *Ps*ChR D139H are promising, powerful optogenetic tools, especially for Ca<sup>2+</sup> manipulation.

**Keywords:** optogenetics; channelrhodopsins; sodium; calcium; DC gate

#### **1. Introduction**

Channelrhodopsins were first discovered and characterized in *C. reinhardtii* [1,2]. After Nagel et al. demonstrated a large light-switched passive cation conductance in HEK293 and BHK cells, ChR2 (*C. reinhardtii* channelrhodopsin-2) was immediately applied in neuroscience by several independent groups, in studies of hippocampal neurons [3,4], *Caenorhabditis elegans* [5], inner retinal neurons [6], and PC12 cells [7]. H134R (histidine to arginine mutation at position 134) was the first ChR2 gain-of-function mutant; it showed enhanced plasma membrane expression and larger stationary photocurrents than wild type ChR2 [5].

Other variants followed in rapid succession, either of natural origin or mutated and engineered. The calcium translocating channelrhodopsin CatCh (a ChR2 leucine to cysteine mutation at position 132, L132C) showed improved Ca<sup>2+</sup> conductivity together with a larger photocurrent and higher light sensitivity [8]. The newly discovered Chronos (*Stigeoclonium helveticum* channelrhodopsin = *Sh*ChR) and Chrimson (*Cn*ChR1 from *Chlamydomonas noctigama*) showed faster channel closing and a red-shifted action spectrum, respectively [9]. An E90R (glutamate to arginine mutation at position 90) point mutation could extend the cation conductance of ChR2 to additional anion conductance [10]. Naturally occurring, highly specific anion-conducting channelrhodopsins, *Gt*ACR1 and *Gt*ACR2 (*Guillardia theta* anion channelrhodopsin 1 and 2), were discovered afterwards [11].

Mutation of ChR2 C128 (cysteine at position 128) to threonine (T), alanine (A), or serine (S) slowed the closing kinetics dramatically [12–14]. Mutation of ChR2 D156 (aspartate at position 156) to alanine also slowed the closing kinetics [14]. Spectral studies suggested that a putative hydrogen bond between C128 and D156 could be an important structural determinant of the channel's closing reaction [14] or might represent the valve of the channel [15]. Thus, the hydrogen bond-linked D156 and C128 pair was proposed as the putative gate buried in the membrane ("DC gate") [14,15]. However, the first channelrhodopsin structure was that of a chimaera (C1C2) of truncated ChR1 and ChR2, in which the distance between C167 (corresponding to C128 in ChR2) and D195 (corresponding to D156 in ChR2) is too large for a direct hydrogen bond [16]. The recently solved structure of wild type ChR2, in turn, revealed a water molecule between C128 and D156 that does indeed bridge them by hydrogen bonds [17].

Mutation of the DC gate has a strong effect on the open channel lifetime. The ChR2 D156C (aspartate to cysteine mutation at position 156) mutant (XXL) generated very large photocurrents and is 1000-fold more light-sensitive than wild type ChR2 in *Drosophila* larvae [18]. The ChR2 D156H (aspartate to histidine mutation at position 156) mutant (XXM) also showed a superior photostimulation efficiency with faster kinetics than XXL, which made it an ideal optogenetic tool for *Drosophila* neurobiological studies [19].

The aspartate D156 in ChR2 is located close to the protonated retinal Schiff base (RSBH<sup>+</sup>). Thus, mutations of D156 can be expected to have strong effects on the open channel lifetime by influencing the protonation state of the retinal Schiff base. However, the water-bridged C128 and D156 are not in the putative ion pore proposed by Volkov et al. [17], and no attention had been paid to potential changes of ion selectivity caused by DC gate mutations.

In this study, we compared the ion selectivity of our previously published XXL and XXM and found that XXM showed a four-fold increased Na<sup>+</sup> selectivity over H<sup>+</sup> together with a two-fold increased K<sup>+</sup> selectivity over H<sup>+</sup>, compared to wild type ChR2. Based on this finding, we made further aspartate to histidine and cysteine mutations of *Ps*ChR (*Platymonas subcordiformis* channelrhodopsin) [20] and Chronos [9]. *Ps*ChR wild type was already reported to be highly Na<sup>+</sup> conductive and indeed showed a six-fold increased Na<sup>+</sup> and K<sup>+</sup> selectivity over H<sup>+</sup> compared to wild type ChR2 in our measurements. The D139H mutation of *Ps*ChR increased the Na<sup>+</sup> selectivity over H<sup>+</sup> a further five-fold. Furthermore, *Ps*ChR D139H showed a five-fold larger photocurrent than wild type *Ps*ChR.

We further compared the Ca<sup>2+</sup> permeability of these mutants. XXM showed an increased Ca<sup>2+</sup> current compared to CatCh [8]. *Ps*ChR wild type already showed a good Ca<sup>2+</sup> current, but the D139H mutation further increased the Ca<sup>2+</sup> current. We concluded that the mutant *Ps*ChR D139H would be a powerful tool for optogenetic Ca<sup>2+</sup> manipulation.

#### **2. Materials and Methods**

#### *2.1. Plasmids and RNA Generation for Xenopus Laevis Oocyte Expression*

ChR2, XXM, and XXL in the pGEMHE vector were described in previous studies [18,19,21]. *Ps*ChR (from *Platymonas subcordiformis*, Accession No.: JX983143) and Chronos (from *Stigeoclonium helveticum*, Accession No.: KF992040) were synthesized as GeneArt Strings DNA Fragments (Life Technologies, Thermo Fisher Scientific), according to the published amino acid sequences, with the codon usage optimized for *Mus musculus*. The synthesized DNA segments were inserted into the pGEMHE vector using N-terminal BamHI and C-terminal XhoI restriction sites. Yellow fluorescent protein (YFP), preceded by a plasma membrane trafficking signal (KSRITSEGEYIPLDQIDINV) [22] and followed by an ER export signal (FCYENEV) [22], was attached to the C-terminal end. Mutations were made by QuikChange Site-Directed Mutagenesis. The sequences were confirmed by DNA sequencing. Plasmids were linearized by NheI digestion and used for in vitro generation of cRNA with the AmpliCap-MaxT7 High Yield Message Maker Kit (Epicentre Biotechnologies).

#### *2.2. Two-Electrode Voltage-Clamp Recordings of Xenopus Laevis Oocytes*

cRNA-injected oocytes were incubated in ND96 solution (96 mM NaCl, 5 mM KCl, 1 mM MgCl2, 1 mM CaCl2, 5 mM HEPES, pH 7.4) containing 1 μM all-trans-retinal at 16 °C. Two-electrode voltage-clamp (TEVC) recordings were performed at room temperature with the solutions indicated in the figures. For experiments with external Ca<sup>2+</sup>, we blocked activation of the Ca<sup>2+</sup>-activated endogenous chloride channels of oocytes by injecting 1,2-bis(o-aminophenoxy)ethane-N,N,N′,N′-tetraacetic acid (BAPTA). We injected 50 nL of a 200 mM solution of the fast Ca<sup>2+</sup> chelator BAPTA (potassium salt) into each oocyte (~10 mM final concentration in the oocyte), incubated the oocytes for 90 min at 16 °C, and then performed the TEVC measurements at room temperature. Twenty nanograms of cRNA were injected into each *Xenopus* oocyte for all constructs. Photocurrents were measured two days after injection. For Figures 1 and 2, measurements were performed in standard solution with BaCl2 instead of CaCl2 (110 mM NaCl, 5 mM KCl, 2 mM BaCl2, 1 mM MgCl2, 5 mM HEPES, pH 7.6).
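The stated final BAPTA concentration can be checked with simple dilution arithmetic. The sketch below assumes an oocyte volume of about 1 μL, a value that is not given in the text and is only implied by the quoted ~10 mM; the function name is hypothetical.

```python
def final_concentration_mM(injected_nl, stock_mM, oocyte_volume_nl=1000.0):
    """Final concentration (mM) after injecting a stock solution into an oocyte.

    oocyte_volume_nl is an ASSUMED value (about 1 uL); the text only states
    the injected volume, the stock concentration, and the approximate result.
    """
    # nL * mM gives picomoles; picomoles per nL is again mM.
    injected_pmol = injected_nl * stock_mM
    total_volume_nl = oocyte_volume_nl + injected_nl
    return injected_pmol / total_volume_nl


# 50 nL of 200 mM BAPTA, as described in the text.
print(round(final_concentration_mM(50.0, 200.0), 2), "mM")
```

Under this assumption the result lies slightly below 10 mM, in line with the approximate value given above.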

#### *2.3. Light Stimulation*

Illumination conditions differed according to the published action spectra of ChR2, Chronos, and *Ps*ChR: 473 nm for ChR2, XXM, and XXL; 445 nm for *Ps*ChR, *Ps*ChR D139H, and *Ps*ChR D139C; 532 nm for Chronos, Chronos D173H, and Chronos D173C. Lasers were from Changchun New Industries Optoelectronics Technology. Light power density was set to 5 mW/mm<sup>2</sup>, except for the light sensitivity measurements. The light intensities were measured with a PLUS 2 Power & Energy Meter (LaserPoint s.r.l). For light sensitivity measurements, the applied light intensities ranged from 1.7 to 5000 μW/mm<sup>2</sup>.

#### *2.4. Protein Quantification by Fluorescence*

All expression levels of channelrhodopsin variants in oocytes were quantified by the fluorescence emission values of the YFP-tagged protein. Fluorescence emission was measured at 538 nm by a Fluoroskan Ascent microplate fluorometer (Thermo Scientific) with 485 nm excitation.

#### *2.5. Fluorescence Imaging*

Fluorescence pictures of *Xenopus* oocytes were taken with a 5× objective on a Leica DM6000 confocal microscope after two days of expression. Oocytes were placed in a 35 × 10 mm petri dish (Greiner GBO) containing ND96 for imaging. Excitation was done using 496 nm laser light. Fluorescence emission was detected from 520 nm to 585 nm.

#### *2.6. Data Processing*

pClamp 7.0 was used to read out the photocurrent. Figures 1a, 2, 3, 4, 5a and A1a were made with OriginPro 2017. Figures 1b, 5b and A1b were made with GraphPad Prism. Tables 1 and 2 were made with Microsoft Excel. The sequence alignment in Appendix B was performed with BioEdit. Closing times were determined by a biexponential fit. Light sensitivity curves were fitted with the Hill equation. All values are plotted or presented as means, and error bars represent the standard deviation (SD) or standard error of the mean (SEM), as indicated in each figure. Statistical analysis was done by *t* test within GraphPad Prism. Differences were considered significant at *p* < 0.05. \*\*\* = *p* < 0.001, \*\* = *p* < 0.01, \* = *p* < 0.05.
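For readers unfamiliar with the two fit models named above, the following minimal sketch writes them out. It is not the authors' OriginPro or GraphPad procedure: the function names are hypothetical, and the Hill parameters are estimated here by a simple linearized least-squares fit (a stand-in for a nonlinear fit) on synthetic, noise-free data.

```python
import math


def hill(light, epd50, n=1.0):
    """Normalized photocurrent at a given light power density (Hill equation)."""
    return light ** n / (epd50 ** n + light ** n)


def fit_hill_linearized(lights, responses):
    """Estimate (epd50, n) by least squares on the linearized Hill plot.

    Uses log(y / (1 - y)) = n * log(L) - n * log(EPD50). Only points with
    0 < y < 1 can be used, so saturated or zero responses are skipped.
    """
    xs, ys = [], []
    for light, y in zip(lights, responses):
        if 0.0 < y < 1.0:
            xs.append(math.log(light))
            ys.append(math.log(y / (1.0 - y)))
    if len(xs) < 2:
        raise ValueError("need at least two points with 0 < response < 1")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    n = sxy / sxx                       # slope is the Hill coefficient
    intercept = mean_y - n * mean_x     # equals -n * log(EPD50)
    return math.exp(-intercept / n), n


def biexponential(t, a_fast, tau_fast, a_slow, tau_slow):
    """Photocurrent decay after light-off as a sum of two exponentials."""
    return a_fast * math.exp(-t / tau_fast) + a_slow * math.exp(-t / tau_slow)


def fast_fraction(a_fast, a_slow):
    """Share of the total amplitude carried by the fast component."""
    return a_fast / (a_fast + a_slow)


# Synthetic example: light power densities in uW/mm^2 within the range used
# for the sensitivity measurements; the EPD50 of 5.4 is illustrative only.
lights = [1.7, 5.0, 17.0, 50.0, 170.0, 500.0, 1700.0, 5000.0]
responses = [hill(x, 5.4) for x in lights]
print(fit_hill_linearized(lights, responses))
```

With real, noisy recordings a nonlinear least-squares fit of the untransformed data is usually preferable, because the linearization amplifies errors near 0 and 1.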

**Figure 1.** Comparison of ChR2, *Ps*ChR, and Chronos variants. (**a**) Representative photocurrent traces of ChR2, *Ps*ChR, Chronos, and their mutants. (**b**) Comparison of stationary photocurrents of the above channelrhodopsin variants. All data points are plotted in the figure and the mean ± standard error of the mean (SEM) (*n* = 7–8) is indicated.

**Figure 2.** Comparison of the light sensitivities of ChR2, *Ps*ChR, and Chronos variants. (**a**) Continuous 473 nm blue light illumination for 5 s, 30 s, and 100 s was applied to ChR2, XXM, and XXL, respectively. The photocurrent at maximum light power density of each oocyte was normalized to 1. (**b**) Continuous 532 nm green light illumination for 5 s was applied to Chronos, Chronos D173H, and Chronos D173C. The photocurrent at maximum light power density of each oocyte was normalized to 1. (**c**) Continuous 445 nm blue light illumination for 5 s, 30 s, and 100 s was applied to *Ps*ChR, *Ps*ChR D139H, and *Ps*ChR D139C, respectively. The photocurrent at maximum light power density of each oocyte was normalized to 1. Photocurrents of XXM, XXL, *Ps*ChR D139H, and *Ps*ChR D139C were measured at −60 mV because of the larger current and slower kinetics; other constructs were measured at −100 mV. Data points are presented as mean ± SD, *n* = 3–4.

#### **3. Results**

#### *3.1. Mutating the Conserved Aspartate of TM Helix 4 Influences the Expression, Photocurrent, and Kinetics*

Similar to previously published results [18,19], fluorescence measurements of whole oocyte membranes with YFP-tagged XXM and XXL showed a ~three-fold increased expression level compared to ChR2 (Figure A1). The steady-state photocurrents were increased ~30- and ~48-fold for XXM (D156H) and XXL (D156C), respectively, compared to ChR2 (Figure 1 and Table 1). The enhanced photocurrent might be the combined result of a higher plasma membrane expression level, higher light sensitivity, or increased single channel conductance. Both XXM and XXL showed much-prolonged closing kinetics, leading to higher light sensitivity (Figure 1a and Table 1).

**Figure 3.** Comparison of Na<sup>+</sup> and H<sup>+</sup> permeabilities of ChR2, Chronos, *Ps*ChR, and their mutants. Reversal potentials (Vr) were determined after photocurrent measurements from −90 mV to +10 mV. The reversal potential shift for Na<sup>+</sup> was calculated from the reversal potential difference between two outside buffers containing 120 mM NaCl pH 7.6 and 1 mM NaCl pH 7.6. The reversal potential shift for H<sup>+</sup> was determined from the reversal potential difference between two outside buffers of pH 7.6 and 9.6 containing 120 mM NaCl. Data points are presented as mean ± SD, *n* = 4–6.

**Figure 4.** Comparison of K<sup>+</sup> and H<sup>+</sup> permeabilities of ChR2, Chronos, *Ps*ChR, and their mutants. Reversal potentials were determined after photocurrent measurements from −90 mV to +10 mV. The reversal potential shift for K<sup>+</sup> was calculated from the reversal potential difference between two outside buffers containing 120 mM KCl pH 7.6 and 1 mM KCl pH 7.6. The reversal potential shift for H<sup>+</sup> was determined from the reversal potential difference between two outside buffers of pH 7.6 and 9.6 containing 120 mM KCl. Data points are presented as mean ± SD, *n* = 4–6.

We further synthesized Chronos [9] and *Ps*ChR [20] and characterized the corresponding aspartate to cysteine and aspartate to histidine mutations, as the aspartate in transmembrane helix 4 (TM4) is conserved in all three channelrhodopsins (Appendix B). Chronos D173C and D173H, as well as *Ps*ChR D139H and D139C, all showed an increased expression level compared to their wild types (Figure A1). All mutants also showed increased light sensitivities along with prolonged off kinetics (Figure 2 and Table 1). Chronos D173C, *Ps*ChR D139H and *Ps*ChR D139C showed dramatically increased photocurrents, while the photocurrent of Chronos D173H was similar to that of wild type Chronos (Figure 1 and Table 1). Among these variants, *Ps*ChR D139C was the most light-sensitive, with an effective light power density (LPD) for 50% photocurrent (EPD50) of ~3.2 μW/mm<sup>2</sup>, which was ~250 times more sensitive than *Ps*ChR (Figure 2 and Table 1). The EPD50 for XXL was ~5.4 μW/mm<sup>2</sup>, which was ~130 times more sensitive than ChR2 (Figure 2 and Table 1).

**Figure 5.** Calcium permeabilities of selected channelrhodopsin variants. (**a**) Photocurrent traces of different channelrhodopsins before (grey) and after (black) 1,2-bis(o-aminophenoxy)ethane-N,N,N′,N′-tetraacetic acid (BAPTA) injection. Measurements were done in 80 mM CaCl2 pH 9.0 at −100 mV. Blue bars indicate light illumination. (**b**) Comparison of calcium permeability of ChR2 (light grey), Chronos (dark grey), *Ps*ChR (black), CatCh (light grey), XXM (dark grey), and *Ps*ChR D139H (black). All data points are plotted in the figure and the mean ± SEM is indicated.


**Table 1.** Basic properties of ChR2, *Ps*ChR, and Chronos variants.

†, expression or photocurrent of the corresponding wild type was normalized as 1×. Is, stationary (plateau) current. ‡, as most ChRs exhibit biphasic off-kinetics comprising a fast and a slow component, the % indicates the amplitude of the fast (τ1) or slow (τ2) component as a percentage of the whole photocurrent. Data are shown as mean ± SEM, *n* = 4–5. Values are approximate. \*, expression level was calculated from the fluorescence values in Appendix A, Figure A1.

**Table 2.** Ion selectivity of ChR2, *Ps*ChR, and Chronos variants.


Reversal potentials and permeability ratios were determined from stationary currents in the indicated solutions. Values represent mean ± SD, *n* = 4–6. Values without SD are approximate. †, in the presence of 120 mM Na<sup>+</sup>. ‡, in the presence of 120 mM K<sup>+</sup>.

#### *3.2. Mutation of the Aspartate in TM4 Influences the Na<sup>+</sup> Permeability*

The potential influence of mutating ChR2 D156 on ion selectivity has not been reported. To investigate this, we measured the photocurrent at different potentials and calculated the reversal potential shift of these mutants when systematically changing bath solutions with different pH and Na<sup>+</sup> or K<sup>+</sup> concentrations.

ChR2 is a non-selective cation channel which is mostly permeable to H<sup>+</sup> [2]. Both changing the extracellular pH from 7.6 to 9.6 and changing the extracellular Na<sup>+</sup> concentration from 120 mM to 1 mM altered the ChR2 reversal potential (Figure 3). The ChR2 permeability ratio of Na<sup>+</sup> to H<sup>+</sup> was determined as P<sub>Na+</sub>/P<sub>H+</sub> = 3.1 × 10<sup>−7</sup> (Table 2). Interestingly, D156H (XXM) influenced the Na<sup>+</sup> permeability and increased P<sub>Na+</sub>/P<sub>H+</sub> four-fold, while D156C (XXL) changed P<sub>Na+</sub>/P<sub>H+</sub> only slightly (Table 2).

Chronos is more permeable to H<sup>+</sup> than ChR2, with P<sub>Na+</sub>/P<sub>H+</sub> = 2.3 × 10<sup>−7</sup> (Table 2). The D173H mutation did not obviously change this, and D173C increased P<sub>Na+</sub>/P<sub>H+</sub> slightly to 5 × 10<sup>−7</sup> (Figure 3 and Table 2). *Ps*ChR was reported to be highly Na<sup>+</sup> conductive [20]. Changing the outside Na<sup>+</sup> concentration from 120 mM to 1 mM greatly influenced its photocurrent. The inward photocurrent was nearly abolished at 1 mM Na<sup>+</sup> pH 7.6, and we determined the *Ps*ChR P<sub>Na+</sub>/P<sub>H+</sub> to be 18 × 10<sup>−7</sup>, which was even higher than that of XXM (Figure 3 and Table 2). The D139H mutation increased P<sub>Na+</sub>/P<sub>H+</sub> a further five-fold, to 90 × 10<sup>−7</sup>, while the D139C mutation decreased P<sub>Na+</sub>/P<sub>H+</sub> slightly (Figure 3 and Table 2).
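The text does not state how the permeability ratios were derived from the reversal potential shifts; a common approach is the Goldman-Hodgkin-Katz (GHK) voltage equation. The sketch below therefore rests on several assumptions that are not taken from the text: GHK with only Na<sup>+</sup> and H<sup>+</sup> as permeant ions, an impermeant substitute for the removed NaCl, an unchanged cytosol, and a room temperature of 22 °C. The function name is hypothetical. Under these assumptions the intracellular terms cancel, so the predicted shift depends only on the ratio and the two bath solutions.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def reversal_shift_mV(p_na_over_p_h, na_high_mM, na_low_mM, pH, temp_c=22.0):
    """Predicted reversal potential shift (mV) when external Na+ is lowered.

    GHK voltage equation with only Na+ and H+ permeant (an ASSUMPTION).
    The shift is E_rev(high Na+) - E_rev(low Na+); intracellular terms
    cancel as long as the cytosol does not change between solutions.
    """
    rt_over_f_mV = 1000.0 * R * (temp_c + 273.15) / F
    h_out = 10.0 ** (-pH)                                   # mol/L
    high = p_na_over_p_h * na_high_mM * 1e-3 + h_out
    low = p_na_over_p_h * na_low_mM * 1e-3 + h_out
    return rt_over_f_mV * math.log(high / low)


# Permeability ratios quoted in the text, 120 mM -> 1 mM Na+ at pH 7.6.
ratios = {"ChR2": 3.1e-7, "Chronos": 2.3e-7,
          "PsChR": 18e-7, "PsChR D139H": 90e-7}
for name, ratio in ratios.items():
    print(name, round(reversal_shift_mV(ratio, 120.0, 1.0, 7.6), 1), "mV")
```

The predicted shift grows with P<sub>Na+</sub>/P<sub>H+</sub> and approaches the Nernstian limit for a purely Na<sup>+</sup>-selective channel, which is the qualitative pattern the comparison above relies on.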

Among the tested constructs, *Ps*ChR D139H was the most Na<sup>+</sup>-permeable channelrhodopsin while retaining a large photocurrent; to our knowledge, it is the most Na<sup>+</sup>-permeable channelrhodopsin reported to date.

#### *3.3. Mutation of the Aspartate in TM4 Influences the K<sup>+</sup> Permeability*

As tools for light-induced depolarization, ideal cation-permeable channelrhodopsins should be more Na<sup>+</sup> conductive and less K<sup>+</sup> conductive, because K<sup>+</sup> efflux across the plasma membrane would lead to a more hyperpolarized membrane potential. To test the potential influences on K+ permeability of different mutations, we measured photocurrents and calculated the reversal potential shift of these mutants when systematically changing bath solutions from 120 mM K<sup>+</sup> to 1 mM K<sup>+</sup> in comparison to changing pH from 7.6 to 9.6.

ChR2 had a slightly weaker K<sup>+</sup> conductance in comparison to Na<sup>+</sup>, with P<sub>Na+</sub>/P<sub>K+</sub> = 1.2. XXL increased the Na<sup>+</sup> and K<sup>+</sup> permeability slightly and equally. XXM increased the Na<sup>+</sup> permeability more than that for K<sup>+</sup>, and the P<sub>Na+</sub>/P<sub>K+</sub> of XXM reached 2.2 (Figure 4 and Table 2). *Ps*ChR showed a higher Na<sup>+</sup> permeability, together with an enhanced K<sup>+</sup> permeability in comparison to that for H<sup>+</sup>, with a P<sub>Na+</sub>/P<sub>K+</sub> similar to that of ChR2 (Figure 4 and Table 2). Interestingly, the D139H mutation increased the Na<sup>+</sup> permeability five-fold while changing the K<sup>+</sup> permeability only 1.6-fold; thus, the P<sub>Na+</sub>/P<sub>K+</sub> of *Ps*ChR D139H increased to 3.5 (Figure 4 and Table 2). The increased P<sub>Na+</sub>/P<sub>K+</sub> makes *Ps*ChR D139H even more suitable as a depolarization tool.
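The relation between the fold changes and the resulting selectivity ratio quoted above can be written out as a short consistency check. The function names are hypothetical, and because the wild type *Ps*ChR ratio is only described as similar to that of ChR2, the value 1.2 used here is a stand-in rather than a measured number.

```python
def ratio_after_mutation(wild_type_ratio, fold_na, fold_k):
    """P_Na/P_K after a mutation that scales P_Na and P_K (both relative to
    P_H) by the given fold changes."""
    return wild_type_ratio * fold_na / fold_k


def implied_wild_type_ratio(mutant_ratio, fold_na, fold_k):
    """Wild type P_Na/P_K implied by a mutant ratio and the two fold changes."""
    return mutant_ratio * fold_k / fold_na


# Five-fold (Na+) and 1.6-fold (K+) changes, as stated for PsChR D139H.
# 1.2 is the ChR2 ratio, used as a STAND-IN for wild type PsChR.
print(ratio_after_mutation(1.2, 5.0, 1.6))
print(implied_wild_type_ratio(3.5, 5.0, 1.6))
```

Both numbers land close to the values given in the text, which suggests that the reported fold changes and the final ratio of 3.5 are mutually consistent within rounding.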

Chronos, Chronos D173H, and Chronos D173C had much lower K<sup>+</sup> permeability and the highest P<sub>Na+</sub>/P<sub>K+</sub> values among the tested constructs (Figure 4 and Table 2). However, the H<sup>+</sup> permeability was the highest for all Chronos variants (Table 2).

#### *3.4. Mutation of the Aspartate in TM4 Influences the Ca<sup>2+</sup> Permeability*

As obvious impacts of mutation of the conserved aspartate in TM4 on ion selectivity were observed, we further compared the Ca<sup>2+</sup> permeability of these mutants, considering the importance of Ca<sup>2+</sup> in biological systems. Due to the existence of Ca<sup>2+</sup>-activated chloride channels in *Xenopus* oocytes [23], BAPTA was injected into the oocyte to a final concentration of ~10 mM to block the Ca<sup>2+</sup>-induced chloride current (Figure 5). The photocurrents at −100 mV were then measured in an outside solution containing 80 mM CaCl2 at pH 9.0. At −100 mV and pH 9.0, no net H<sup>+</sup> current could be observed, and the inward photocurrent was then only from the Ca<sup>2+</sup> influx.
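One way to see why a negligible net H<sup>+</sup> current at −100 mV and pH 9.0 is plausible is to estimate the H<sup>+</sup> equilibrium potential with the Nernst equation. The sketch below assumes an intracellular pH of 7.3 and a room temperature of 22 °C; neither value is given in the text, and the function name is hypothetical.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def proton_equilibrium_mV(pH_out, pH_in, temp_c=22.0):
    """Nernst potential for H+ (mV) from the pH on each side of the membrane.

    E_H = (RT/F) * ln([H+]_out / [H+]_in) = -(RT/F) * ln(10) * (pH_out - pH_in)
    """
    rt_over_f_mV = 1000.0 * R * (temp_c + 273.15) / F
    return -rt_over_f_mV * math.log(10.0) * (pH_out - pH_in)


# pH_in = 7.3 is an ASSUMED cytosolic pH, not a value from the text.
e_h = proton_equilibrium_mV(pH_out=9.0, pH_in=7.3)
holding_mV = -100.0
print("E_H:", round(e_h, 1), "mV; driving force:", round(holding_mV - e_h, 1), "mV")
```

With these assumed values the H<sup>+</sup> equilibrium potential falls close to the holding potential, so the driving force for H<sup>+</sup> is near zero. The estimate shifts by roughly 6 mV for every 0.1 unit change in the assumed intracellular pH.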

ChR2 showed a robust composite photocurrent with 80 mM CaCl2 at pH 9.0 and −100 mV, which was dramatically reduced to the pure Ca<sup>2+</sup> current after injection of 10 mM BAPTA (Figure 5a), as reported previously [2]. Both Chronos and ChR2 showed small Ca<sup>2+</sup> photocurrents (Figure 5a). ChR2 L132C (CatCh) showed an increased Ca<sup>2+</sup> photocurrent compared to ChR2 (Figure 5), as previously reported [8]. Astonishingly, XXM also showed an increased Ca<sup>2+</sup> photocurrent, even higher than that of CatCh (Figure 5). *Ps*ChR D139H showed the highest Ca<sup>2+</sup> photocurrent, which on average was more than two times higher than that of CatCh (Figure 5b).

#### **4. Discussion**

Channelrhodopsins, originating from different organisms, show quite different properties with respect to kinetics, action spectrum, and ion selectivity. Such changes can also be engineered by point mutations. In this study we compared the properties of ChR2, Chronos [9], *Ps*ChR [20], and their corresponding mutants of the aspartate in TM4 (DC gate aspartate).

Generally, the aspartate to histidine or cysteine mutations of the three channelrhodopsins increased the expression level (probably because the mutant became more stable against degradation [21]) and slowed the closing kinetics. Nearly all mutants showed a much-increased photocurrent, probably because of a much-prolonged open state or enhanced single channel conductance, with only Chronos D173H as an exception.

Tools with slowed kinetics are unfavorable for ultra-fast repetitive stimulation but are preferred for experiments that require low light and long-term stimulation. The prolonged open times were accompanied by elevated light sensitivities. Among the tested constructs, *Ps*ChR D139C and XXL became ~220 times and ~130 times more sensitive than ChR2. If slow closing is not a problem, or is even desired, the more light-sensitive channelrhodopsins would be ideal for efficient deep brain stimulation with infrared light via upconversion nanoparticles (UCNPs) [24]. These tools need to be further tested in mammalian systems for broader application.

Furthermore, we investigated the influence of mutation of the aspartate in TM4 on ion selectivity. We found that the aspartate to histidine mutation of ChR2 and *Ps*ChR increased the Na<sup>+</sup> and Ca<sup>2+</sup> permeability dramatically. To test the Ca<sup>2+</sup> current, we used BAPTA to block activation of the Ca<sup>2+</sup>-activated endogenous chloride channels of oocytes. The fast Ca<sup>2+</sup> chelator BAPTA may alter the ion currents in other ways as well [25]. However, the kinetics in Figure 5a show that the Cl<sup>−</sup> current (which has slower off kinetics) was well blocked, so we could reliably compare the photocurrents of the channelrhodopsins alone.

With its large photocurrent, increased Na<sup>+</sup> permeability, and larger Ca<sup>2+</sup> current, *Ps*ChR D139H is a novel, powerful optogenetic tool for depolarization and Ca<sup>2+</sup> manipulation. Channelrhodopsins with higher Ca<sup>2+</sup> currents have the advantage of being "direct" light-gated Ca<sup>2+</sup> channels, in contrast to the highly Ca<sup>2+</sup>-conductive CNG (cyclic nucleotide-gated) channels, which become light-gated channels when fused with bPAC (photoactivated adenylyl cyclase) [26].

In summary, we found that mutating the conserved aspartate in TM4 influenced not only the expression level and kinetics of channel closing but also the ion selectivity; with appropriate mutations, we provided novel optogenetic tools with superior photocurrent amplitudes and high Na<sup>+</sup> and Ca<sup>2+</sup> conductance.

**Author Contributions:** Conceptualization, S.G. and G.N.; methodology, X.D., S.G. and G.N.; software, X.D. and S.G.; validation, X.D., S.G. and G.N.; formal analysis, X.D. and S.G.; investigation, X.D. and S.G.; resources, X.D., S.G. and G.N.; data curation, X.D. and S.G.; writing—original draft preparation, S.G.; writing—review and editing, X.D., S.G. and G.N.; visualization, X.D. and S.G.; supervision, S.G. and G.N.; project administration, G.N.; funding acquisition, G.N.

**Funding:** This research was funded by grants from the German Research Foundation to GN (TRR 166/A03 and TR 240/A04). GN acknowledges support provided by the Prix-Louis-Jeantet.

**Acknowledgments:** We are grateful to Shang Yang for help with some of the cloning work. This publication was funded by the German Research Foundation (DFG) and the University of Wuerzburg in the funding program Open Access Publishing.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

**Figure A1.** Expression level of ChR2, *Ps*ChR, and Chronos variants in *Xenopus* oocytes. (**a**) Representative confocal images of all the constructs, scale bar = 500 μm. (**b**) Yellow fluorescent protein (YFP) fluorescence emission values from oocytes expressing different channelrhodopsins. Data are shown as mean ± SEM, *n* = 5–6. Pictures and fluorescence emission values were taken and measured 2 days after injection of 20 ng cRNA.

#### **Appendix B**


**Figure A2.** Sequence alignment of ChR2, Chronos, and *Ps*ChR. Conserved cysteine and aspartate of the DC gate were marked in the red box.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

#### *Article*
