Article

ELIHKSIR Web Server: Evolutionary Links Inferred for Histidine Kinase Sensors Interacting with Response Regulators

1 Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080, USA
2 Center for Systems Biology, University of Texas at Dallas, Richardson, TX 75080, USA
3 Department of Bioengineering, University of Texas at Dallas, Richardson, TX 75080, USA
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Entropy 2021, 23(2), 170; https://doi.org/10.3390/e23020170
Submission received: 17 December 2020 / Revised: 21 January 2021 / Accepted: 26 January 2021 / Published: 30 January 2021

Abstract

Two-component systems (TCS) are signaling machinery that consist of a histidine kinase (HK) and a response regulator (RR). When an environmental change is detected, the HK phosphorylates its cognate RR. While cognate interactions were considered orthogonal, experimental evidence shows the prevalence of crosstalk interactions between non-cognate HK–RR pairs. Currently, crosstalk interactions have been demonstrated for TCS proteins in a limited number of organisms. By providing specificity predictions across entire TCS networks for a large variety of organisms, the ELIHKSIR web server assists users in identifying interactions for TCS proteins and their mutants. To generate specificity scores, a global probabilistic model was used to identify interfacial couplings and local fields from sequence information. These couplings and local fields were then used to construct Hamiltonian scores for positions with encoded specificity, resulting in the specificity score. These methods were applied to 6676 organisms available on the ELIHKSIR web server. Because proteins can be mutated and the resulting network changes displayed, there are nearly endless combinations of TCS networks to analyze using ELIHKSIR. The functionality of ELIHKSIR allows users to perform a variety of TCS network analyses and visualizations to support TCS research efforts.

1. Introduction

Two-component systems (TCSs) are ubiquitous in bacteria and archaea and are the key signal transduction machinery for sensing and responding to the environment. TCSs consist of sets of interacting signaling partners: histidine kinases (HKs) that phosphorylate their cognate response regulators (RRs). Interactions, however, are often not one-to-one; multiple HKs can interact with multiple RRs. Identifying relevant interactions among TCSs is an important task that has been addressed experimentally only for a limited number of organisms.
We advanced the study of interaction specificity in TCS by creating a model based on amino acid coevolution at the interface of HKs and RRs. Our Direct Coupling Analysis (DCA) [1] based interaction model not only confirms known cognate partners [2] but also reveals novel interactions in multiple organisms. We uncovered a TCS network in Synechococcus elongatus that regulates the cyanobacterial circadian clock and confirmed important master regulators [3]. Our model is also able to predict functional mutations that modulate binding specificity between partners, such as PhoQ and PhoP [4], or even to design new interactions between non-cognate, interspecies TCS proteins, such as EnvZ from Escherichia coli and Spo0F from Bacillus subtilis [5]. Another application of this model is the identification of crosstalk across signaling networks and of the influence of mutations on the topology of the network. Figure 1 illustrates a section of statistical couplings in a protein sequence and highlights two of the most common applications: the identification of physical contacts in a protein [6,7] and the identification and quantification of interactions between multiple proteins [8,9].
We decided to make this model and tools available to the scientific community in an interactive web server that facilitates the analysis and prediction of TCS networks as well as the exploration of the effects of mutation in these proteins prior to experimental work. We named the service Evolutionary Links Inferred for Histidine Kinase Sensors Interacting with Response regulators (ELIHKSIR) and it can be accessed at https://elihksir.org.
In recent years, online repositories of sequence data have seen a large influx of sequences and are painting a more refined picture of protein families. Using these data, one can construct global probabilistic models that reproduce the observed statistics and relate them to inter-residue couplings. Cheng et al. [2] have used these probabilistic models to introduce an objective function $H_{\mathrm{TCS}}^{\mathrm{specific}}(\sigma)$ that describes the specificity (fitness) of the interaction between a response regulator and a histidine kinase partner by a scalar score, using a sequence $\sigma$ from a linked multiple sequence alignment (MSA). For completeness, we reproduce the introduction of $H_{\mathrm{TCS}}^{\mathrm{specific}}(\sigma)$ here.
Using the set of sequences $\{\sigma\}$, we can create a global probabilistic model $P(\sigma)$ for finding a given amino acid sequence $\sigma$ in a protein family:
$$P(\sigma) = \frac{1}{Z} \cdot \exp\left(-H(\sigma)\right) \tag{1}$$
with a general Hamiltonian $H(\sigma)$ and the partition function $Z$, which ensures normalization of the probabilities. A sufficient form for $H(\sigma)$ [10] is given by the large-q Potts Model [11]:
$$H(\sigma) \propto -\sum_{i<j} e_{ij}(a_i, a_j) - \sum_{i} h_i(a_i) \tag{2}$$
with the coupling matrix $e_{ij}(a_i, a_j)$ between two sequence sites $a_i$, $a_j$ at sequence positions $i$ and $j$, and the local field $h_i(a_i)$ at the site $a_i$ at sequence position $i$. The sites $a$ can have $q = 21$ different states, covering the 20 amino acids and the sequence gap. The entries of the coupling matrix $e_{ij}(a_i, a_j)$ and the local fields $h_i(a_i)$ encode preferences for sequence compositions at positions $i$ and $j$. The inference of the coupling matrix $e_{ij}(a_i, a_j)$ and the local fields $h_i(a_i)$ is a non-trivial task. Several methods exist to do so [1,12,13]. We inferred the couplings using mean field DCA (mfDCA), which is fast and accurate at predicting interaction specificity in TCS.
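As a minimal sketch of how a Potts-type Hamiltonian assigns a value to one aligned sequence, the snippet below sums couplings and local fields for a toy sequence. It assumes the common sign convention (a minus sign in front of both sums, so that favorable sequences receive lower values) and a sum over position pairs with i < j. The dictionary layout, the function name `potts_hamiltonian`, and all numerical values are hypothetical; they are not the server's data format or inferred parameters.

```python
from typing import Dict, Tuple

# Hypothetical containers (illustration only):
#   couplings[(i, j)][(a, b)] holds e_ij(a, b) for positions i < j (0-based)
#   fields[i][a] holds h_i(a)
Couplings = Dict[Tuple[int, int], Dict[Tuple[str, str], float]]
Fields = Dict[int, Dict[str, float]]


def potts_hamiltonian(seq: str, couplings: Couplings, fields: Fields) -> float:
    """Return -sum_{i<j} e_ij(a_i, a_j) - sum_i h_i(a_i); missing entries count as 0."""
    energy = 0.0
    for (i, j), table in couplings.items():
        energy -= table.get((seq[i], seq[j]), 0.0)
    for i, table in fields.items():
        energy -= table.get(seq[i], 0.0)
    return energy


# Toy parameters with made-up values; "-" is the gap state.
toy_couplings = {(0, 2): {("A", "L"): 0.5}}
toy_fields = {0: {"A": 0.25}, 1: {"-": 0.125}}
toy_energy = potts_hamiltonian("A-L", toy_couplings, toy_fields)
```

With these toy values, replacing the first residue removes both the coupling and the local-field contribution at that position, so the sequence receives a less negative value.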
From these coupling parameters, we can construct objective functions that measure different effects. In Materials and Methods, we introduce an objective function $H_{\mathrm{TCS}}^{\mathrm{specific}}(\sigma)$ that is sensitive to sequence mutations and linked to protein interaction specificity. For the calculation of $H_{\mathrm{TCS}}^{\mathrm{specific}}(\sigma)$, we need full access to the couplings $e_{ij}(a_i, a_j)$ and local fields $h_i(a_i)$. Throughout the process, we treat these as constant, and we created a database that our server uses internally to calculate new values of the $H_{\mathrm{TCS}}^{\mathrm{specific}}(\sigma)$ score in a mutation event.
Figure 2 gives an overview of the entire process of the ELIHKSIR web server. The MSA for our system is created by concatenating the HisKA domain section of the Pfam [14] Histidine Kinase (HK) family (Pfam:PF00512) [15] and the REC domain of the Response Regulator (RR) family (PF00072) [16], which contains information for thousands of organisms. Furthermore, we collect metadata for each organism and sequence pair through the UniProt database [17]. From this, we calculate the coupling matrices $e_{ij}(a_i, a_j)$ and the local fields $h_i(a_i)$. These parameters allow us to calculate a score for the interaction specificity, $H_{\mathrm{TCS}}$. The data are visualized in a web interface with interactive heatmaps.
ELIHKSIR is a user-friendly and accessible tool that displays TCS signaling networks. The breadth of the web server allows for analysis of TCS networks in both common and uncommon species and strains. Table 1 summarizes the number of organisms and interaction partners available. Users can easily search for their organism of interest, view TCS specificity networks for the whole organism, and view all possible interactions for an HK or RR of interest. This capability allows researchers with restricted computational resources to analyze signaling networks. Some common use cases of ELIHKSIR’s features include identifying cross-talk interactions between non-cognate HKs and RRs, comparing specificity of different HK–RR pairs, and comparing differences in signaling networks between species and/or strains. In addition to browsing and exporting wild-type TCS networks, mutations may be introduced into HKs and/or RRs, for which all interaction specificity scores are recalculated and displayed. This allows users to predict network-wide changes in specificity after introducing a mutation. Further applications include testing mutants for desired change(s) in specificity, guiding engineering of TCS proteins with interaction or insulation requirements, and viewing changes in specificity for new or uncommon clinical and environmental variants. With these capabilities, ELIHKSIR is an effective tool for a variety of researchers who interface with TCS proteins and signaling.

2. Results

2.1. Validation

Validation of the ELIHKSIR web server was performed through detailed investigation using three model organisms: Escherichia coli, Synechococcus elongatus, and Enterococcus faecalis. True positive specificity predictions were defined by positive selection and/or negative selection for a cognate pair. Positive selection is defined as an HK having its highest specificity with a single RR. Negative selection is defined as an RR having poor specificity across all HKs but having its relative highest specificity with an HK. False negatives were defined as selection towards a noncognate partner that is greater than that of the cognate partner, in which both positive and negative selection fail to identify the cognate pair. Only cognate pairs in which the HK contains a HisKA domain were evaluated. For E. coli, there were fourteen true positives and three false negatives for seventeen cognate pairs, shown in Figure A1. For S. elongatus, there were five true positives and one false negative for the six cognate pairs, shown in Figure A2. For E. faecalis, there were seven true positives and one false negative for the eight cognate pairs, shown in Figure A3. Across the three organisms, 26 of 31 cognate pairs were correctly identified, so the resulting sensitivity and accuracy are both 0.84.
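The pooled figure can be checked directly from the per-organism counts given above. The short sketch below only restates that arithmetic; the variable names are illustrative.

```python
# (true positives, false negatives) per organism, as reported in the text above
validation_counts = {
    "E. coli": (14, 3),
    "S. elongatus": (5, 1),
    "E. faecalis": (7, 1),
}

true_positives = sum(tp for tp, _ in validation_counts.values())
false_negatives = sum(fn for _, fn in validation_counts.values())

# Sensitivity = TP / (TP + FN), pooled over all evaluated cognate pairs
sensitivity = true_positives / (true_positives + false_negatives)
print(f"{true_positives} of {true_positives + false_negatives} cognate pairs -> {sensitivity:.2f}")
```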
DCA identifies coevolving residues at the HK–RR interface for HisKA and REC domains that have been used to accurately predict the structure of the HK–RR complex [18]. Out of the top 20 DCA-identified interfacial couplings, 10 are present in the 3DGE structure, as shown in Figure A4b. Information about all 3DGE interfacial contacts is present in the DCA-generated couplings and local fields (Figure A4a). Couplings are scored by their direct information (DI) value as defined by DCA (Table A2). Thus, higher DI values indicate that these couplings are more important for HK–RR interactions. When utilizing DCA couplings for the calculation of Hamiltonian values, only couplings present on the structurally verified HK–RR interface are used. This ensures auxiliary information obtained through DCA does not impact the Hamiltonian values, and thus, does not impact the resulting specificity score.
The interface is aligned for each TCS pair during the construction of the MSA, which was performed using a hidden Markov model. The sequences displayed in ELIHKSIR are the aligned residues and gaps. Predictions made based on HK and RR sequences only consider residues which align with their respective protein family. Insertions and deletions are not considered in the alignment of the interface and may result in deviations in the three dimensional structure of the resulting signaling complex. The model assumes no changes in the three dimensional structure of the HK–RR interface during evaluation of different TCS pairs.

2.2. Mutations

A key functionality of the ELIHKSIR server is the ability to interactively perform in silico mutations on an HK–RR pair. In the mutation screen, as shown in Figure 3b, the full MSA of a pair is shown with a visual cue marking the histidine kinase region and the response regulator region. Any part of the MSA can be edited, and changes to an HK or RR are applied globally. The heatmap is updated accordingly. Gaps can be introduced as '-' characters. Because the mutated sequence is scored against a tabulated database indexed by position and amino acid type, the total length of the MSA has to remain at 179 positions (67 for the HisKA domain plus 112 for the REC domain). Insertions are not possible in the model unless they occur in gap regions.
Only a subset of the positions in the sequence corresponds to interfacial residues of the protein interface between Thermotoga maritima class I HK853 and the response regulator RR468 (PDB ID: 3DGE). Because of this, not every change in the sequence performed by a user will translate into a change in the specificity score. Furthermore, some types of amino acids can play similar roles at a specific residue position. The model accounts for this and reflects only minor or no changes in the total score in such cases.
An interesting application of the mutation user interface, the rewiring of specificity, is shown in Figure 4. By transferring portions of a sequence from one cognate pair to another cognate pair, interaction properties can be gained or lost. In this specific example, transferring amino acid positions 70 to 80 from ntrC into the same positions in the cusR response regulator creates cross-talk with a new interacting partner, qseC, while maintaining the interaction with the initial cognate partner, cusS. Alternatively, introducing the same sequence positions from the response regulator qseB into cusR is sufficient to rewire the entire interaction and create exclusively positive selection towards qseC.
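The two editing operations described in this section, a point substitution and the transfer of a segment between aligned sequences, can be sketched as length-preserving string operations. This is an illustration only, not the server's implementation: the helper names `substitute` and `graft` are hypothetical, positions are assumed to be 1-based and inclusive, and the example sequences are toy strings rather than real TCS proteins.

```python
# The 20 standard amino acids plus the gap character, i.e., the 21 states of the model
ALLOWED = set("ACDEFGHIKLMNPQRSTVWY-")


def substitute(aligned: str, position: int, residue: str) -> str:
    """Replace one aligned position (1-based); the alignment length is unchanged."""
    if residue not in ALLOWED:
        raise ValueError(f"unsupported state: {residue!r}")
    if not 1 <= position <= len(aligned):
        raise ValueError("position outside the alignment")
    return aligned[: position - 1] + residue + aligned[position:]


def graft(recipient: str, donor: str, start: int, end: int) -> str:
    """Copy aligned positions start..end (1-based, inclusive) from donor into recipient."""
    if len(recipient) != len(donor):
        raise ValueError("sequences must come from the same alignment")
    if not 1 <= start <= end <= len(recipient):
        raise ValueError("segment outside the alignment")
    return recipient[: start - 1] + donor[start - 1 : end] + recipient[end:]


mutant = substitute("ACD-F", 4, "E")        # fill a gap position with a residue
chimera = graft("AAAAAA", "CCCCCC", 2, 4)   # transfer positions 2..4 from the donor
```

Because both helpers return a string of the same length as their input, the result can still be scored position by position against tabulated parameters.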

2.3. Data Export

The user has three options to export data from ELIHKSIR. First, the user may export the entire heatmap as a PNG image, as shown in Figure 3a, by clicking on the Export to PNG button on the left panel once a heatmap has been displayed. This generates a PNG image of the heatmap on a transparent background, including the labels and legend, and downloads it onto the user's machine. When selecting an n × m subselection in a heatmap, the user is presented with the choice to display the subselection as a new heatmap. Second, the user may export a PNG image of a histogram, as shown in Figure 3c, for the row of response regulators paired with a desired histidine kinase by clicking on the Export to PNG button located inside the opened histogram. The histogram export also includes the name of each response regulator. Finally, the user may export a CSV representation of an arbitrary selection of heatmap cells. After selecting cells on the heatmap, the user can click the Export to CSV button on the right panel to download a file that contains a comma-delimited list of the selections. All of these export methods take into account the mutated Hamiltonian values, if any, of the response regulator and histidine kinase pairs.

2.4. Negative Selection

An important concept highlighted by the server is that of negative selection. Not only are interaction partners indicated by strong couplings and a highly negative score for a TCS pair, but equally by high interaction scores with each partner except one. In this case, the interaction with a marginal advantage will be the strongest interaction and may facilitate signal transduction. Hence, we differentiate by either positive selection and/or negative selection for the cognate pair, where positive selection is defined as an HK having the highest specificity for its cognate RR and where negative selection is defined as the cognate HK having the highest specificity out of all HKs for a given RR. Figure 5 highlights this for two different cases in E. coli (ECOLI). Besides the heatmap, a good indicator for the interactions is a look at the histograms (Figure 5b) of interaction strengths, which are, for this purpose, available through the server. In cusR, a single interaction between cusR and the histidine kinase cusS is dominant (Figure 5b top). In rcsB, the majority of interactions are reported as less specific. Even though the interaction between rcsB and the histidine kinase rcsC is not reported as very specific, it will be the dominant interaction for rcsB.
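The two selection criteria can be expressed as simple minimum searches over a matrix of specificity scores, where more negative means more specific. The sketch below is illustrative only: the function name `classify_pair` is hypothetical, the nested-dictionary layout is an assumption, and the scores are invented toy numbers chosen to mimic the qualitative cusR and rcsB cases, not values from the server.

```python
from typing import Dict, Tuple

# scores[hk][rr]: specificity score of an HK-RR pair (toy values, lower = more specific)
Scores = Dict[str, Dict[str, float]]


def classify_pair(scores: Scores, hk: str, rr: str) -> Tuple[bool, bool]:
    """Return (positive_selection, negative_selection) for the pair (hk, rr)."""
    # Positive selection: rr is the most specific partner among all RRs for this HK.
    best_rr_for_hk = min(scores[hk], key=scores[hk].get)
    # Negative selection: hk is the most specific partner among all HKs for this RR.
    best_hk_for_rr = min(scores, key=lambda name: scores[name][rr])
    return best_rr_for_hk == rr, best_hk_for_rr == hk


toy_scores = {
    "cusS": {"cusR": -4.0, "rcsB": 1.5},
    "rcsC": {"cusR": -1.0, "rcsB": 0.5},
}
cus_result = classify_pair(toy_scores, "cusS", "cusR")
rcs_result = classify_pair(toy_scores, "rcsC", "rcsB")
```

In this toy matrix the rcsC and rcsB pair has an unremarkable score, yet rcsC is still the best available HK for rcsB, which is the situation described as negative selection.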

3. Discussion

3.1. Characterization of Cognate Specificity

Through both mutational and computational analyses, the interface between the HisKA domain and the REC response regulator domain has been shown to control specificity of TCS interactions [19]. In Figure 6, this finding is confirmed for 14 out of 17 cognate pairs shown for E. coli. In Figure 7, this finding is confirmed for all eight cognate pairs shown for M. tuberculosis. While predictions of interaction specificity have been previously demonstrated, ELIHKSIR presents specificity scores for all HisKA HK and RR pairs in thousands of organisms, defining specificity landscapes. These specificity landscapes can then be used to determine favorable interactions through identification of pairs exhibiting positive and/or negative selection. When assessing cognate pairs, the prevalence of interactions either partially or solely characterized by negative selection becomes apparent. In the validation process, 54.8% of detected cognate pairs exhibited both positive and negative selection and 19.4% of detected cognate pairs were characterized by negative selection only. Negative selection is important for preventing cross-talk and ensuring orthogonality [20], but results indicate that it may be a main or contributing determinant of many cognate interactions. It is unclear if other attributes or domains contribute to reinforcement of specificity for cognate pairs detected by negative selection only.
By identifying whether cognate interactions are maintained by positive and/or negative selection, users can explore how deletion of TCS proteins may affect gene expression. Experimental deletion of the cognate RR in a pair regulated by negative selection may result in a noncognate RR being phosphorylated by the HK. In deletion experiments, it may therefore be useful to understand how removal of TCS proteins affects overall expression. Furthermore, some TCS proteins are encoded on plasmids. Understanding how the presence or absence of plasmid-encoded TCS proteins affects an organism's gene expression may be important for the study of antibiotic resistance and plant cell transformation by bacteria [21].
It is important to note that, in many proteins, HisKA domains are accompanied by an HATPase_c domain, which is responsible for binding ATP and transferring its γ -phosphate to the HisKA domain. Aside from its ATPase activity, the HATPase_c domain alone can act as a histidine kinase [22]. It is unknown whether the HATPase_c domain itself encodes specificity or is partially responsible for specificity in certain cognate TCS pairings. Further analysis of the HATPase_c domain as well as other histidine kinase domains could reveal additional residues and mechanisms controlling TCS orthogonality.

3.2. Exploration of Non-Cognate Interactions

The ELIHKSIR web server allows for exploration and visualization of signaling networks. Using the displayed heatmap, users may identify crosstalk interactions in signaling networks. Non-cognate, crosstalk interactions are common in signaling networks and may influence the expression patterns in organisms. $H_{\mathrm{TCS}}$ scores can be used to identify non-cognate, crosstalk interactions. Non-cognate interactions may be predicted by high specificity for a non-cognate partner as shown in Figure 7b–d and Figure 6b,d. Any negative score indicates some level of encoded specificity. While scores near zero indicate no encoded specificity, TCS non-cognate partners with scores near zero may still interact due to shared attributes present in all TCS proteins, shown in Figure 6c and Figure 7b. TCS non-cognate pairs in which shared TCS attributes are partially removed have positive specificity scores, indicating low specificity. These methods of identifying possible interactions may be used across all available organisms, allowing users to investigate crosstalk interactions within specific, and possibly uncommon, species or strains.
TCS pairs in which the RR has a cognate HK of a family other than HisKA have low specificity but may still interact, as shown in Figure 7b,d,f and Figure 6b,d,f. The ability to interact despite very low specificity indicates that HATPase_c may contribute to the phosphorylation of non-cognate RRs whose cognates belong to other HK families, since HATPase_c is present in both HisKA and HisKA3 family HKs.
In Figure 6g, we observe an orphan RR that exhibits low specificity for many HKs and has been phosphorylated by HKs with low predicted specificity. Aside from the possibility of HATPase_c domain contributions, it is possible that low specificity for orphan RRs is favorable as it promotes promiscuity. In the case of rssB in E. coli, phosphorylation is important for function [25,26]. Therefore, promiscuity of rssB could ensure maintenance of function throughout the E. coli life cycle. Using similar reasoning, one can identify potential interactions with orphan HKs and RRs. Information yielded from analysis of orphan TCS proteins may assist in describing their role in organisms’ life cycles, environmental stress responses, and expression patterns. Utilizing predicted orphan TCS protein interactions could be useful in the study of antibiotic resistance in bacteria, response to environmental metals and compounds in archaea, or plant response to drought.

3.3. Revealing Interaction Specificity for Mutation and Variation

After mutating a protein residue, specificity scores are recalculated and the heatmap is updated. This reveals how mutation(s) change interaction specificity with all possible TCS partners, a feature that becomes important when scientists would like to assess the network effect of mutations as opposed to single pairwise interactions. The ELIHKSIR web server also separates organisms by strain, allowing interaction specificities to be compared between different strains of the same organism. Accessibility of specificity predictions for different mutants and strains may reveal differences in TCS signaling of clinical and environmental variants and may assist in the engineering of sensory kinases and response regulators, as has been shown in previous studies [5].

4. Materials and Methods

4.1. MSA Construction

Raw HMM profiles for HisKA and REC were obtained from Pfam's hidden Markov models (HMMs) [27,28]. Each profile was then searched against the TrEMBL database using HMMER's hmmsearch. HKs with a sequence gap of 5 residues or larger were excluded from the MSA. The resulting HisKA domain MSA was 67 residues in length and contained 111,032 sequences utilized in the ELIHKSIR web server. RRs with a sequence gap of 6 residues or larger were excluded from the MSA. The resulting REC domain MSA was 112 residues in length and contained 225,616 sequences utilized in the ELIHKSIR web server. Cognate HK–RR pairs, where cognate is defined by having adjacent loci [29], were concatenated and used for the generation of couplings and local fields using mfDCA. The resulting cognate MSA was 179 residues in length and contained 10,091 sequences. Twenty-five iterations of random concatenation of each HK to a random RR were used to generate a scrambled MSA. The resulting MSA was 179 residues in length and contained 16,363,100 sequences.
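The filtering and pairing steps can be sketched on aligned strings as follows. This is a simplified illustration under stated assumptions, not the pipeline used for the server: a "sequence gap of N residues" is interpreted here as a run of N or more consecutive gap characters, the function names are hypothetical, and the random pairing uses a fixed seed only to make the toy example repeatable.

```python
import random
from typing import List


def passes_gap_filter(aligned: str, max_gap_run: int) -> bool:
    """Keep a sequence only if it has no run of max_gap_run or more consecutive gaps."""
    return "-" * max_gap_run not in aligned


def concatenate_cognates(hk_aligned: str, rr_aligned: str) -> str:
    """Join an aligned HK domain and its cognate aligned RR domain into one MSA row."""
    return hk_aligned + rr_aligned


def scramble(hks: List[str], rrs: List[str], rounds: int = 25, seed: int = 0) -> List[str]:
    """Pair every HK with a randomly chosen RR, repeated for the given number of rounds."""
    rng = random.Random(seed)
    return [hk + rng.choice(rrs) for _ in range(rounds) for hk in hks]


# Toy aligned sequences (not real domains)
kept = passes_gap_filter("AC--DE", 5)
dropped = not passes_gap_filter("A-----E", 5)
scrambled_rows = scramble(["AA", "CC"], ["DDD", "EEE"])
```

Every scrambled row has the same length as a cognate row, so both alignments can be passed to the same inference step.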

4.2. mfDCA Evolutionary Couplings and Hamiltonian Scores

Mean field DCA (mfDCA) [1] was used to infer the coevolutionary parameters from concatenated multiple sequence alignments (MSAs) of cognate HK–RR sequences and scrambled HK–RR sequences. The resulting coupling parameters and local field parameters were utilized in the calculation of Hamiltonian scores. To quantify changes in the Hamiltonian $H(\sigma)$, Cheng et al. introduced a score $H_{\mathrm{TCS}}$ as follows:
$$H_{\mathrm{TCS}}(\mathrm{HK}_A + \mathrm{RR}_A) = -\sum_{i=1}^{L_{\mathrm{HK}_A}} \sum_{j=L_{\mathrm{HK}_A}+1}^{L_{\mathrm{HK}_A}+L_{\mathrm{RR}_A}} e_{ij}(A_i, A_j) \times \Theta(c - r_{ij}) - \sum_{i=1}^{L_{\mathrm{HK}_A}+L_{\mathrm{RR}_A}} h_i(A_i) \tag{3}$$
for a specific pair between a sequence $\mathrm{HK}_A$ and $\mathrm{RR}_A$ of sequence lengths $L_{\mathrm{HK}_A}$ and $L_{\mathrm{RR}_A}$ with the coupling matrix $e_{ij}(A_i, A_j)$ between two sequence sites $A_i$, $A_j$ at sequence positions $i$ and $j$, and the local field $h_i(A_i)$ at the site $A_i$ at sequence position $i$. $L_{\mathrm{HK}_A}$ is 67 for the HisKA domain and $L_{\mathrm{RR}_A}$ is 112 for the REC domain. The couplings are only taken within a pair distance $r_{ij} < c = 12$ Å of a native contact, expressed by a function $\Theta(x) = 1$ for all $x > 0$ and $\Theta(x) = 0$ for $x \le 0$. The contact map of the native interfacial pairs is given by the 3D resolved structure of the protein interface of Thermotoga maritima class I HK853 with its cognate, RR468 (PDB ID: 3DGE). This interface is used as a template for the spatial complex. Equation (3) is used to calculate energies $H_{\mathrm{TCS}}$ and $H_{\mathrm{TCS}}^{0}$ at interface positions, where $H_{\mathrm{TCS}}$ is calculated using cognate couplings and local fields and $H_{\mathrm{TCS}}^{0}$ is calculated using scrambled couplings and local fields. $H_{\mathrm{TCS}}^{0}$ is generated using the large-q Potts Hamiltonian model on the scrambled MSA, which is constructed by completing 25 rounds of concatenation of any of $m$ HKs in the dataset with any of $n$ RRs in the dataset:
$$H_{\mathrm{TCS}}^{0}(\{\mathrm{HK}, \mathrm{RR}\}) = \left\langle H_{\mathrm{TCS}}\left(\mathrm{HK}_{X \mid X \in \{1, \dots, m\}} + \mathrm{RR}_{Y \mid Y \in \{1, \dots, n\}}\right) \right\rangle_{25} \tag{4}$$
To find $H_{\mathrm{TCS}}^{\mathrm{specific}}$, Hamiltonian energies calculated from shared attributes present in all HK–RR pairs must be removed from the specific HK–RR pair being evaluated:
$$H_{\mathrm{TCS}}^{\mathrm{specific}}(\mathrm{HK}_A + \mathrm{RR}_A) = H_{\mathrm{TCS}}(\mathrm{HK}_A + \mathrm{RR}_A) - H_{\mathrm{TCS}}^{0}(\{\mathrm{HK}, \mathrm{RR}\}) \tag{5}$$
where the resulting $H_{\mathrm{TCS}}^{\mathrm{specific}}$ represents the interaction specificity strength between the HK and RR. Therefore, this energy function can be used to predict the interaction preference between any HK and RR. Additionally, an updated $H_{\mathrm{TCS}}^{\mathrm{specific}}$ score, after incorporating a mutation in the MSA, serves as a reference for the effect of the mutation on binding specificity strength. The updated $H_{\mathrm{TCS}}^{\mathrm{specific}}$ is generated by performing the same calculations presented in Equations (3) and (5). Ranges of $H_{\mathrm{TCS}}^{\mathrm{specific}}$ values vary between organisms and strains; a positive score indicates a loss of shared encoded TCS attributes, a negative score indicates encoded specificity, and a score of zero indicates the presence of all shared TCS attributes but diminished encoded specificity. When qualifying potential interactions, users should compare $H_{\mathrm{TCS}}^{\mathrm{specific}}$ for different TCS pairs belonging to the same organism. One should consider more negative values to have increased encoded specificity, zero values to be capable of interacting with other TCS proteins without encoded specificity in the HisKA domain, and positive values to exhibit insulation of HisKA and REC domains.
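The scoring steps described in this section can be sketched in a few lines. The example below is a toy illustration under explicit assumptions rather than the server's Python source: it uses the sign convention in which more negative means more specific, evaluates the same contact-filtered sum once with parameters standing in for the cognate MSA and once with parameters standing in for the scrambled MSA, and takes the difference. The function names, the dictionary layout, the 0-based indexing, and all numbers are hypothetical.

```python
from typing import Dict, Tuple

Couplings = Dict[Tuple[int, int], Dict[Tuple[str, str], float]]
Fields = Dict[int, Dict[str, float]]
Distances = Dict[Tuple[int, int], float]  # template distances in angstroms


def h_tcs(seq: str, l_hk: int, couplings: Couplings, fields: Fields,
          distances: Distances, cutoff: float = 12.0) -> float:
    """Contact-filtered inter-protein couplings plus local fields, with a leading minus sign."""
    energy = 0.0
    for (i, j), table in couplings.items():
        inter_protein = i < l_hk <= j  # i in the HK part, j in the RR part (0-based)
        in_contact = distances.get((i, j), float("inf")) < cutoff
        if inter_protein and in_contact:
            energy -= table.get((seq[i], seq[j]), 0.0)
    for i, table in fields.items():
        energy -= table.get(seq[i], 0.0)
    return energy


def h_tcs_specific(seq: str, l_hk: int, cognate: Tuple[Couplings, Fields],
                   scrambled: Tuple[Couplings, Fields], distances: Distances) -> float:
    """Score with cognate parameters minus score with scrambled parameters."""
    return (h_tcs(seq, l_hk, cognate[0], cognate[1], distances)
            - h_tcs(seq, l_hk, scrambled[0], scrambled[1], distances))


# Toy pair: a 2-residue "HK" joined to a 2-residue "RR"
pair = "AKDE"
toy_distances = {(0, 2): 6.0, (1, 3): 15.0}  # only (0, 2) lies within the cutoff
cognate_params = ({(0, 2): {("A", "D"): 1.0},
                   (1, 3): {("K", "E"): 2.0},   # excluded: beyond the cutoff
                   (0, 1): {("A", "K"): 5.0}},  # excluded: both positions in the HK
                  {0: {"A": 0.5}})
scrambled_params = ({(0, 2): {("A", "D"): 0.25}}, {0: {"A": 0.5}})
toy_specific = h_tcs_specific(pair, 2, cognate_params, scrambled_params, toy_distances)
```

In the toy case the local-field term is identical in both parameter sets and cancels, so the remaining negative value comes entirely from the interfacial coupling that is stronger in the cognate parameters.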

4.3. Software

The web server has a custom-built front end running React [30] for enhanced user experience with custom components. The back end serves data through REST [31] endpoints. Upon mutation, the scores are looked up from a pre-computed table. The Python source code for the calculation of $H_{\mathrm{TCS}}$ is accessible via the web server. Details on public endpoints can be found in Appendix A.

5. Conclusions

The ELIHKSIR web server is a valuable tool for analyzing TCS specificity landscapes in a growing list of 6412 species and strains of bacteria, 65 species and strains of archaea, and 188 species and strains of eukaryotes. This allows users to find potential cross-talk interactions and characterize existing orthogonality for many organisms across different kingdoms. For each organism, heatmaps and histograms of TCS networks are easily accessed, displayed, and exported. Furthermore, the ability to compute, display, and export changes in specificity for mutated HK or RR proteins allows users to explore potential interactions and visualize changes in specificity over an entire signaling network. This ability can assist in the analysis of engineered mutants, clinical and environmental variants, and cross-talk behavior. While ELIHKSIR is useful for interactions between HisKA family HKs and the REC domain of RRs, there are other HK families that the ELIHKSIR model does not evaluate. Building and validating models to predict specificity for other families of HK would further assist TCS research. Even though ELIHKSIR only displays specificity scores for HisKA and REC domains, these domains are critical in determining specificity for many TCS interactions, as demonstrated by the 6,272,607 HK–RR pairs evaluated. Because each protein can be mutated and network-wide specificity scores recalculated, there are nearly endless possibilities of HK–RR pairs to evaluate using ELIHKSIR. The accessibility, breadth, and functionality of ELIHKSIR allow a variety of researchers (both computational and experimental) to harness TCS specificity predictions, supporting research efforts through a tool that did not previously exist.

Author Contributions

Conceptualization, F.M.; methodology, F.M., X.J. and C.S.; software, C.S. and Y.H.J.; validation, C.Z. and X.J.; formal analysis, C.Z., X.J., C.S.; investigation, C.Z., X.J., C.S.; resources, C.Z.; data curation, C.S. and C.Z.; writing—original draft preparation, C.S., C.Z.; writing—review and editing, F.M., C.Z., C.S.; visualization, C.S., C.Z.; supervision, F.M.; project supervision, F.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the University of Texas at Dallas (X.J. and F.M.); NIH grant number R35GM133631 (to C.S., X.J. and F.M.); and NSF grant number MCB-1943442 (to C.Z. and F.M.).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data presented in this work is available at the ELIHKSIR web server at https://elihksir.org/.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ELIHKSIR: Evolutionary Links Inferred for Histidine Kinase Sensors Interacting with Response regulators
TCS: Two-Component System
DCA: Direct-Coupling Analysis
mfDCA: couplings generated by mean-field method as outlined in Morcos, 2011 [1]
DI: Direct Information
HK: Histidine Kinase, Histidine Kinase family (Pfam:PF00512) [15]
RR: Response Regulator, Response Regulator family (Pfam:PF00072) [16]
TP: True Positive
FN: False Negative
PS: Positive Selection
NS: Negative Selection

Appendix A

Server data can be accessed programmatically through two REST endpoints, as described in Table A1. The all organisms endpoint api/list returns a list of all the organisms currently accessible through ELIHKSIR. The return value contains the name (ORGANISM_NAMES::STRING), UniProt ID (ORGANISM_UNIPROT_ID::STRING), and numeric identifier/primary key (ORGANISM_ID::INT) for each organism. By using the numeric identifiers obtained from the list endpoint, further metadata and information, along with the scores for each interacting pair, can be obtained through the api/pairs endpoint.
Table A1. List of the available endpoints for the REST API.
Endpoint | HTTP Method | URL
All Organisms | GET | api/list
Pairs for heatmap | GET | api/pairs/{ORGANISM_ID::INT}
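A client for the two endpoints in Table A1 might look like the sketch below. Several points are assumptions rather than documented behavior: that the endpoint paths are served directly under https://elihksir.org/, that responses are JSON, and that no authentication is required. The helper names and the organism identifier in the usage note are hypothetical, and the structure of the returned data should be inspected before relying on specific field names.

```python
import json
from urllib.request import urlopen

# Site address from the article; the path layout below it is an assumption.
BASE_URL = "https://elihksir.org/"


def list_url(base: str = BASE_URL) -> str:
    """URL of the 'All Organisms' endpoint (api/list)."""
    return base + "api/list"


def pairs_url(organism_id: int, base: str = BASE_URL) -> str:
    """URL of the 'Pairs for heatmap' endpoint for one numeric organism identifier."""
    return f"{base}api/pairs/{int(organism_id)}"


def fetch_json(url: str, timeout: float = 30.0):
    """GET a URL and decode the body as JSON (assumes the server returns JSON)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

A typical session would call `fetch_json(list_url())` once, pick an ORGANISM_ID from the result, and then call `fetch_json(pairs_url(organism_id))` to retrieve the pairs and scores for that organism.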
Figure A1. True positives are correct predictions of cognate pairs through positive and/or negative selection. False negatives occur when the cognate pairing is not the most favorable interaction.
Figure A2. True positives are correct predictions of cognate pairs through positive and/or negative selection. False negatives occur when the cognate pairing is not the most favorable interaction.
Figure A3. True positives are correct predictions of cognate pairs through positive and/or negative selection. False negatives occur when the cognate pairing is not the most favorable interaction.
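The true positive/false negative criterion in these captions can be sketched in a few lines of Python. This is a minimal illustration, assuming that lower (more negative) scores are more favorable, as in the heatmap legend; the function name and the example scores are hypothetical, and the sketch does not distinguish positive from negative selection.

```python
def classify_cognate(scores, cognate):
    """Label one protein's cognate prediction as 'TP' or 'FN'.

    scores  -- dict mapping candidate partner name -> specificity score
               (lower is more favorable)
    cognate -- name of the known cognate partner
    """
    most_favorable = min(scores, key=scores.get)
    return "TP" if most_favorable == cognate else "FN"


# Hypothetical scores for one HK against three RRs.
example = {"RR_A": -6.1, "RR_B": 0.4, "RR_C": 2.9}
print(classify_cognate(example, "RR_A"))  # cognate has the lowest score
print(classify_cognate(example, "RR_B"))  # cognate is not the lowest score
```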
Table A2. Couplings used in specificity model sorted by descending DI value.
HKRRDIHKRRDIHKRRDIHKRRDI
18770.10285371470.00806367141700.0057227116770.00460402
22800.072283330830.00801695191690.0057149642770.00460128
111670.070523233830.00781679141470.00570246261760.004455
26840.0515243221710.00767719221700.0056994142760.00444185
23800.049259419790.0076557416760.00569788171690.00442802
141460.04276231720.0076366215760.0056873327800.00440244
46760.0398581181710.0075978838800.0056818151670.00439432
19760.0392644191470.00753089101470.0056706919810.00420284
251700.035177918810.00752615221720.0056210538790.00418686
251710.0303173121480.00734758131470.0055972441790.00416339
111680.0285807101500.0073232734870.00559491151700.0041264
151460.027004845760.00716933840.00556154241730.00403743
29870.026571122760.0071208934840.00552466181460.00382329
19770.0259669211720.00705718251690.00545037161470.00378105
30870.021565330800.00702085151180.00544512181720.00377972
23760.019361615740.00697282251740.00543429201680.00375797
19800.0189693181470.0069165618740.00541696161690.00366294
22770.018835545790.00690392141490.0054039125810.00362641
23790.018087422780.0068039830860.00538702131690.00359782
19740.0176283191700.0067932733870.00538074101480.00359015
81470.0172729231700.00679065171470.0053763220770.00355346
181690.017160618780.0067636333860.00534217111460.00345648
291710.017073626810.0067506271490.0053042621810.00344168
161680.016867431870.0067342141690.0053039442800.00338203
15770.015240421770.0067038238830.00526882281730.00334077
251720.014978427840.0066746526820.00525117221740.0032904
39830.01490122810.0066624717770.00524036201700.00327778
291720.014618746770.0065862542830.00521168141680.0032269
211700.01446926790.0065767734830.00520958221760.00319442
26800.0141692181680.0065145734800.00518935241720.00310293
26830.013957945800.0064824520800.00514101171700.00307046
23830.0130196191720.0063998546800.00512939191680.00298315
121680.012826918760.00633046181700.0051037726850.00290492
151470.0123301281720.0062695923820.00509604201710.0027667
29840.0121863251750.0062303525800.00502442151690.00275228
81480.011902116740.0061512349770.00502117201690.00269252
23840.011838630880.0061455945770.00501361151680.00266174
22840.011396419750.00610985241710.00496918211680.00254422
231710.0108972101490.0060853117760.00494564261740.00238278
32870.010863724800.00607124141670.00492586271730.00238187
261710.0107505231690.00605014151480.00492344241690.00237797
25840.010497833880.00604033231730.00489615201720.00233119
141480.010405571500.00599047241700.00488941181450.00222389
30840.010280248760.0059651844760.00484498131680.00190674
26870.010277721800.0059512211730.00483877101690.00186603
23780.01007251730.00592837291730.00481891181670.00162171
23770.00997545281710.00591566101680.00480323121690.00147153
221690.0099616542790.00586154171710.00478502151450.00135628
43800.00972282221730.00585409221680.00478308111490.00105631
22830.00952424261720.0058532121470.0047804111690.000969333
18800.009360549760.005838261730.00476983111500.000862052
71480.00897782261750.0058162622820.00476307111480.000856593
29830.00882006301720.0057796145750.00475623111180.000790599
211710.0086542522790.0057726539800.00473036111470.000413171
261700.0085448715730.00576882271720.00472585
191710.0083150441800.0057601841760.00472158
171680.008266301730.0057439827830.0046522
19780.0081893623810.0057378920760.00465154
211690.0080992221750.0057310716770.00460402
Figure A4. Gray structures show the HK residues lying outside of the HisKA domain. Black structures show the RR residues lying outside the REC domain. The blue structure represents the HisKA domain, and the yellow structure represents the REC domain. Green pseudobonds show contacts within 12 Angstroms Cα to Cα. Red pseudobonds show the top 20 DCA couplings. The distribution of DCA couplings indicates that the model does not show biases towards subregions of the interface. (a) All contacts within 12 Angstroms as found in the structure viewed from two different positions, left and right faces; (b) Top 20 interfacial DI contacts as viewed from left and right faces.
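The 12 Angstrom Cα to Cα contact criterion used in Figure A4 can be expressed as a short distance check. The sketch below is illustrative only: it assumes the Cα coordinates of each chain have already been extracted as (x, y, z) tuples in Angstroms, and the function name is hypothetical.

```python
import math


def ca_contacts(hk_ca, rr_ca, cutoff=12.0):
    """Return index pairs (i, j) whose C-alpha atoms lie within `cutoff` Angstroms.

    hk_ca, rr_ca -- lists of (x, y, z) C-alpha coordinates for the HK and RR chains
    """
    return [
        (i, j)
        for i, a in enumerate(hk_ca)
        for j, b in enumerate(rr_ca)
        if math.dist(a, b) <= cutoff
    ]


# Toy coordinates, not taken from any structure.
hk = [(0.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
rr = [(5.0, 0.0, 0.0), (0.0, 12.0, 0.0)]
print(ca_contacts(hk, rr))
```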

References

  1. Morcos, F.; Pagnani, A.; Lunt, B.; Bertolino, A.; Marks, D.S.; Sander, C.; Zecchina, R.; Onuchic, J.N.; Hwa, T.; Weigt, M. Direct-coupling analysis of residue coevolution captures native contacts across many protein families. Proc. Natl. Acad. Sci. USA 2011, 108.
  2. Cheng, R.R.; Morcos, F.; Levine, H.; Onuchic, J.N. Toward rationally redesigning bacterial two-component signaling systems using coevolutionary information. Proc. Natl. Acad. Sci. USA 2014, 111.
  3. Boyd, J.S.; Cheng, R.R.; Paddock, M.L.; Sancar, C.; Morcos, F.; Golden, S.S. A combined computational and genetic approach uncovers network interactions of the cyanobacterial circadian clock. J. Bacteriol. 2016, 198, 2439–2447.
  4. Cheng, R.R.; Nordesjö, O.; Hayes, R.L.; Levine, H.; Flores, S.C.; Onuchic, J.N.; Morcos, F. Connecting the sequence-space of bacterial signaling proteins to phenotypes using coevolutionary landscapes. Mol. Biol. Evol. 2016, 33, 3054–3064.
  5. Cheng, R.R.; Haglund, E.; Tiee, N.S.; Morcos, F.; Levine, H.; Adams, J.A.; Jennings, P.A.; Onuchic, J.N. Designing bacterial signaling interactions with coevolutionary landscapes. PLoS ONE 2018, 13, e0201734.
  6. Morcos, F.; Hwa, T.; Onuchic, J.N.; Weigt, M. Direct Coupling Analysis for Protein Contact Prediction. In Protein Structure Prediction; Springer: New York, NY, USA, 2014; pp. 55–70.
  7. Muscat, M.; Croce, G.; Sarti, E.; Weigt, M. FilterDCA: Interpretable supervised contact prediction using inter-domain coevolution. bioRxiv 2019. Available online: https://www.biorxiv.org/content/early/2019/12/24/2019.12.24.887877.full.pdf (accessed on 11 December 2020).
  8. Szurmant, H.; Weigt, M. Inter-residue, inter-protein and inter-family coevolution: Bridging the scales. Curr. Opin. Struct. Biol. 2018, 50, 26–32.
  9. Jiang, X.L.; Martinez-Ledesma, E.; Morcos, F. Revealing protein networks and gene-drug connectivity in cancer from direct information. Sci. Rep. 2017, 7, 3739.
  10. Jacquin, H.; Gilson, A.; Shakhnovich, E.; Cocco, S.; Monasson, R. Benchmarking Inverse Statistical Approaches for Protein Structure and Design with Exactly Solvable Models. PLoS Comput. Biol. 2016, 12, e1004889.
  11. Levy, R.M.; Haldane, A.; Flynn, W.F. Potts Hamiltonian models of protein co-variation, free energy landscapes, and evolutionary fitness. Curr. Opin. Struct. Biol. 2017, 43, 55–62.
  12. Figliuzzi, M.; Jacquier, H.; Schug, A.; Tenaillon, O.; Weigt, M. Coevolutionary landscape inference and the context-dependence of mutations in beta-lactamase tem-1. Mol. Biol. Evol. 2016, 33, 268–280.
  13. Ekeberg, M.; Hartonen, T.; Aurell, E. Fast pseudolikelihood maximization for direct-coupling analysis of protein structure from many homologous. J. Comput. Phys. 2014, 276, 341–356.
  14. El-Gebali, S.; Mistry, J.; Bateman, A.; Eddy, S.R.; Luciani, A.; Potter, S.C.; Qureshi, M.; Richardson, L.J.; Salazar, G.A.; Smart, A.; et al. The Pfam protein families database in 2019. Nucleic Acids Res. 2018, 47, D427–D432. Available online: https://academic.oup.com/nar/article-pdf/47/D1/D427/27436497/gky995.pdf (accessed on 11 December 2020).
  15. Pfam. Family: HisKA (PF00512) - His Kinase A (Phospho-Acceptor) Domain; Pfam: Hinxton, UK, 2020.
  16. Pfam. Family: Response_reg (PF00072) Response Regulator Receiver Domain; Pfam: Hinxton, UK, 2020.
  17. Consortium, T.U. UniProt: A worldwide hub of protein knowledge. Nucleic Acids Res. 2018, 47, D506–D515.
  18. Schug, A.; Weigt, M.; Onuchic, J.N.; Hwa, T.; Szurmant, H. High-resolution protein complexes from integrating genomic information with molecular simulation. Proc. Natl. Acad. Sci. USA 2009, 106, 22124–22129.
  19. Szurmant, H.; Hoch, J.A. Interaction fidelity in two-component signaling. Curr. Opin. Microbiol. 2010, 13, 190–197.
  20. Capra, E.J.; Laub, M.T. The Evolution of Two-Component. Annu. Rev. Microbiol. 2012, 66, 325–347.
  21. Heath, J.D.; Charles, T.C.; Nester, E.W. Ti Plasmid and Chromosomally Encoded Two-Component Systems Important in Plant Cell Transformation by Agrobacterium Species. In Two-Component Signal Transduction; John Wiley & Sons, Ltd.: Hoboken, NJ, USA, 1995; Chapter 23; pp. 367–385.
  22. Stewart, R.C.; Jahreis, K.; Parkinson, J.S. Rapid phosphotransfer to CheY from a CheA protein lacking the CheY-binding domain. Biochemistry 2000, 39, 13157–13165.
  23. Yamamoto, K.; Hirao, K.; Oshima, T.; Aiba, H.; Utsumi, R.; Ishihama, A. Functional characterization in vitro of all two-component signal transduction systems from Escherichia coli. J. Biol. Chem. 2005, 280, 1448–1456.
  24. Agrawal, R.; Pandey, A.; Rajankar, M.P.; Dixit, N.M.; Saini, D.K. The two-component signalling networks of Mycobacterium tuberculosis display extensive cross-talk in vitro. Biochem. J. 2015, 469, 121–134.
  25. Becker, G.; Klauck, E.; Hengge-Aronis, R. Regulation of RpoS proteolysis in Escherichia coli: The response regulator RssB is a recognition factor that interacts with the turnover element in RpoS. Proc. Natl. Acad. Sci. USA 1999, 96, 6439–6444.
  26. Klauck, E.; Lingnau, M.; Hengge-Aronis, R. Role of the response regulator RssB in σS recognition and initiation of σS proteolysis in Escherichia coli. Mol. Microbiol. 2001, 40, 1381–1390.
  27. Eddy, S.R. Profile hidden Markov models. Bioinformatics 1998, 14, 755–763.
  28. Finn, R.D.; Mistry, J.; Tate, J.; Coggill, P.; Heger, A.; Pollington, J.E.; Gavin, O.L.; Gunasekaran, P.; Ceric, G.; Forslund, K.; et al. The Pfam protein families database. Nucleic Acids Res. 2009, 38, 211–222.
  29. Williams, R.H.; Whitworth, D.E. The genetic organisation of prokaryotic two-component system signalling pathways. BMC Genom. 2010, 11, 720.
  30. Facebook Inc. React: A JavaScript Library for Building User Interfaces; Facebook Inc.: Menlo Park, CA, USA, 2020.
  31. Fielding, R.T.; Taylor, R.N. Architectural Styles and the Design of Network-Based Software Architectures. Ph.D. Thesis, University of California, Irvine, CA, USA, 2000.
Figure 1. Statistical couplings for sequence position and residue type are inferred from the MSA for the protein family using the DCA method. High couplings indicate significant interactions between sequence positions. These couplings can be used to infer physical contacts within a single protein structure, or to infer the interaction interface and strength between two proteins.
Figure 2. (a) A concatenated MSA is generated for Pfam [14] Histidine Kinase (HK) family (Pfam:PF00512) [15] and Response Regulator (RR) family (PF00072) [16]. (b) From this MSA coupling matrices are generated with mfDCA [1]. From these couplings, we are able to calculate a numeric score using the equation shown. This equation formally describes how Hamiltonian scores are generated for each HK–RR pair and is equivalent to Equation (3). The data are displayed in a web interface with interactive heatmaps. The user has an elaborate menu available to explore the data by creating mutations to sequence positions. The default heatmap legend is more sensitive towards the outer extremes of the values, coloring strongly negative (favorable) or positive values (unfavorable).
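To make the scoring step in Figure 2 more concrete, the following Python sketch computes a Potts-style score for one HK–RR pair from couplings and local fields. It is a minimal illustration under stated assumptions, not the server's implementation: Equation (3) is not reproduced here, so the sign convention (score equals the negative sum of couplings and fields, so that negative values are favorable), the dictionary layout, and all names and toy values are hypothetical.

```python
def hamiltonian_score(hk_seq, rr_seq, couplings, fields, pairs):
    """Potts-style score for one HK-RR pair; more negative means more favorable.

    hk_seq, rr_seq -- aligned sequences (strings), indexed from 0
    couplings      -- dict: (i, j, hk_residue, rr_residue) -> coupling value
    fields         -- dict: (chain, position, residue) -> local field value,
                      with chain equal to "HK" or "RR"
    pairs          -- list of (i, j) interfacial position pairs to include
    """
    score = 0.0
    # Pairwise coupling terms over the selected interfacial pairs.
    for i, j in pairs:
        score -= couplings.get((i, j, hk_seq[i], rr_seq[j]), 0.0)
    # Local field terms, counted once per position that appears in `pairs`.
    for i in sorted({i for i, _ in pairs}):
        score -= fields.get(("HK", i, hk_seq[i]), 0.0)
    for j in sorted({j for _, j in pairs}):
        score -= fields.get(("RR", j, rr_seq[j]), 0.0)
    return score


# Toy parameters, not taken from the ELIHKSIR model.
toy_couplings = {(0, 0, "A", "K"): 1.5, (1, 0, "L", "K"): -0.5}
toy_fields = {("HK", 0, "A"): 0.2, ("RR", 0, "K"): 0.1}
print(hamiltonian_score("AL", "K", toy_couplings, toy_fields, [(0, 0), (1, 0)]))
```

Mutating a sequence position and recomputing the score with the same parameters corresponds to the in silico mutation workflow described in the caption.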
Figure 3. (a) Heatmap for Synechococcus elongatus as displayed on ELIHKSIR and when exported as an image. (b) Mutation screen as displayed on ELIHKSIR. (c) Histogram depicting all selectivity scores for a given HK or RR.
Figure 4. (a) One of the many use cases for the web server is the exploration and in silico change of specificity. In this example, we identify the response regulator cusR as the interaction partner of the histidine kinase cusS indicated by the lowest value in our Hamiltonian. (b) The transfer of a significant sequence portion of the response regulator ntrC does not disrupt the initial interaction and introduces cross-talk through a second interaction partner. (c) Alternatively, the introduction of a sequence portion of the response regulator qseB into cusR disrupts the initial interaction and rewires the interaction towards qseC.
Figure 5. Negative selection in Escherichia coli strain K12 (ECOLI). (a) Heatmap view for the response regulators cusR and rcsB. In cusR, a single interaction between cusR and the histidine kinase cusS is dominant. This is a case of positive selection between two interacting partners. In rcsB, the majority of interactions are reported as having a low specificity. Even though the interaction between rcsB and the histidine kinase rcsC is not reported as having a high specificity, it will be the dominant interaction for rcsB as there is no stronger interaction partner for signal transduction. This is an example of negative selection. (b) Histogram view for the response regulators cusR and rcsB. From these histograms, it becomes clear that cusR-cusS (top) and rcsB-rcsC (bottom) are the dominant interactions.
Figure 6. (a) Cognate interactions and observed in vitro crosstalk interactions overlaid onto the specificity score heatmap for E. coli [23]. Noncognate interactions are assessed. (b) BarA phosphorylates cusR, narL, and narP, in which the scores are −5.723, 2.390, and 3.491 respectively. The score for barA-cusR indicates that phosphorylation occurs due to high specificity for its noncognate partner. Phosphorylation of narL and narP are characterized in (f). (c) PhoR phosphorylates cpxR, in which the score is −0.037. A score near zero indicates diminished specificity, while still retaining attributes shared among all TCS pairs. (d) BaeS phosphorylates glrR, rssB, and cheY, in which the scores are −1.264, 3.998, and 4.605. The score for baeS-glrR indicates that phosphorylation occurs due to increased specificity for a noncognate partner. Phosphorylation of rssB is characterized in (g). Phosphorylation of cheY can be described similarly to (f), as its cognate HK utilizes a different family of HK than HisKA. (e) Cognate, crosstalk, and average non-cognate scores are shown for each HK. (f) HKs narQ and narX are not shown as they utilize a HisKA3 family HK, rather than HisKA. Their RRs, narL and narP, have low specificity for all HKs utilizing the HisKA domain. This leads narL and narP to be nonspecific for HisKA family HKs. Despite a lack of specificity, crosstalk is observed. (g) RssB is an orphan RR that can be phosphorylated by multiple HKs.
Figure 7. (a) Cognate interactions and observed in vitro crosstalk interactions overlaid onto the specificity score heatmap for M. tuberculosis [24]. Noncognate interactions are assessed. (b) MtrB phosphorylates kdpE, phoP, tcrX, tcrA, and narL, in which the scores are −4.895, −5.826, 0.391, −1.093, and 2.813, respectively. Scores for kdpE, phoP, and tcrA indicate that phosphorylation by mtrB occurs due to high specificity for these noncognate partners. TcrX has a score near zero, indicating diminished specificity but a presence of attributes shared among all TCS pairs. Phosphorylation of narL is characterized in (f). (c) PrrB phosphorylates mprA, in which the score is −11.263. This score indicates that phosphorylation of mprA by prrB occurs due to high specificity. (d) PhoR phosphorylates tcrX, tcrA, and devR, in which the scores are −5.744, −5.176, and 6.856, respectively. Scores for tcrX and tcrA indicate that phosphorylation by phoR occurs due to high specificity for these noncognate partners. Phosphorylation of devR is characterized in (f). (e) Cognate, crosstalk, and average noncognate scores are shown for each HK. (f) HKs devS and narS are not shown as they utilize a HisKA3 family HK, rather than HisKA. Their response regulators, narL and devR, have low specificity for all HKs utilizing the HisKA domain.
Table 1. Attributes of the ELIHKSIR web server.
Total Organisms                    6676
  Bacteria                         6412
  Archaea                          65
  Eukaryotes                       188
  Unknown Organisms/Metagenomes    11
Total Interactions Evaluated       6,272,607
  Number of HKs                    111,032
  Number of RRs                    225,616
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Sinner, C.; Ziegler, C.; Jung, Y.H.; Jiang, X.; Morcos, F. ELIHKSIR Web Server: Evolutionary Links Inferred for Histidine Kinase Sensors Interacting with Response Regulators. Entropy 2021, 23, 170. https://doi.org/10.3390/e23020170

Chicago/Turabian Style

Sinner, Claude, Cheyenne Ziegler, Yun Ho Jung, Xianli Jiang, and Faruck Morcos. 2021. "ELIHKSIR Web Server: Evolutionary Links Inferred for Histidine Kinase Sensors Interacting with Response Regulators" Entropy 23, no. 2: 170. https://doi.org/10.3390/e23020170
