Special Issue "Computational Analysis for Protein Structure and Interaction"

A special issue of Molecules (ISSN 1420-3049). This special issue belongs to the section "Bioorganic Chemistry".

Deadline for manuscript submissions: closed (1 November 2017)

Special Issue Editor

Guest Editor
Prof. Dr. Quan Zou

School of Computer Science and Technology, Tianjin University, Tianjin 300350, China
Interests: bioinformatics; protein structure prediction; protein-protein interaction; special protein identification; machine learning; noncoding RNA

Special Issue Information

Dear Colleagues,

Protein structure analysis is a hot topic and a key issue in organic chemistry and molecular biology research. Several essential protein molecules have been reconstructed with cryo-electron microscopy (Cryo-EM), and their structures have been published in Nature and Science. Computational structure analysis and prediction is a key step in 3D structure reconstruction. Machine learning techniques have long been employed for protein secondary and tertiary structure prediction, and they seemed to have reached a bottleneck. However, the development of the Cryo-EM technique brings new challenges and requirements to computer science. Additionally, deep learning, a branch of machine learning, appears to be a powerful tool. Therefore, there is considerable and increasing interest in developing computational methods for protein structure analysis and prediction. Moreover, new structural techniques could also facilitate protein–protein interaction research.

The Guest Editor looks forward to collecting recent advances in these topics, providing a platform for researchers, and bridging the gap between computer science researchers and structural chemistry researchers.

Prof. Dr. Quan Zou
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Molecules is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • protein structure prediction
  • protein–protein interaction network
  • Cryo-EM particle boxing
  • Cryo-EM image processing
  • machine learning
  • protein disordered regions
  • docking
  • protein inter-residue contacts prediction

Published Papers (17 papers)

Research

Open Access Article: Designability of Aromatic Interaction Networks at E. coli Bacterioferritin B-Type Channels
Molecules 2017, 22(12), 2184; doi:10.3390/molecules22122184
Received: 31 October 2017 / Revised: 1 December 2017 / Accepted: 6 December 2017 / Published: 8 December 2017
Abstract
The bacterioferritin from E. coli (BFR), a maxi-ferritin made of 24 subunits, has been utilized as a model to study the fundamentals of protein folding and self-assembly. Through structural and computational analyses, two amino acid residues at the B-site interface of BFR were chosen to investigate their role in nano-cage self-assembly and the possibility of building aromatic interaction networks at B-type protein–protein interfaces. Three mutants were designed, expressed, purified, and characterized using transmission electron microscopy, size exclusion chromatography, native gel electrophoresis, and temperature-dependent circular dichroism spectroscopy. All of the mutants fold into α-helical structures and possess lowered thermostability. The double mutant D132W/N34W was 12 °C less stable than the wild type, and was also the only mutant for which cage-like nanostructures could not be detected in the dried, surface-immobilized conditions of transmission electron microscopy. Two mutants, N34W and D132W/N34W, only formed dimers in solution, while mutant D132W favored the 24-mer even more robustly than the wild type, suggesting that we were successful in designing proteins with enhanced assembly properties. This investigation into the structure of this important class of proteins could help to understand the self-assembly of proteins in general.
(This article belongs to the Special Issue Computational Analysis for Protein Structure and Interaction)

Open Access Article: Identification of DNA–protein Binding Sites through Multi-Scale Local Average Blocks on Sequence Information
Molecules 2017, 22(12), 2079; doi:10.3390/molecules22122079
Received: 31 October 2017 / Revised: 22 November 2017 / Accepted: 24 November 2017 / Published: 28 November 2017
Abstract
DNA–protein interactions play pivotal roles in diverse biological processes and are paramount for cell metabolism, and identifying them computationally is a prudent way to reduce the cost of in vitro and in vivo experiments. A variety of state-of-the-art approaches have been developed to improve the accuracy of DNA–protein binding site prediction. Nevertheless, structure-based approaches are limited when 3D information is unavailable, and predictive performance can still be improved. In this paper, we present a competitive method, the Multi-scale Local Average Blocks (MLAB) algorithm, to address this issue. Unlike structure-based approaches, MLAB not only extracts local evolutionary information from primary sequences but also uses predicted solvent accessibility. Moreover, the predictor of DNA–protein binding sites is built on an ensemble weighted sparse representation model with random under-sampling. To evaluate the performance of MLAB, we conduct comprehensive experiments on DNA–protein binding site prediction. MLAB achieves MCC values of 0.392, 0.315, 0.439 and 0.245 on the PDNA-543, PDNA-41, PDNA-316 and PDNA-52 datasets, respectively, which compares favorably with other leading methods. The MCC of our method is higher by at least 0.053, 0.015 and 0.064 on the PDNA-543, PDNA-41 and PDNA-316 datasets, respectively.

Open Access Article: Drug-Target Interaction Prediction through Label Propagation with Linear Neighborhood Information
Molecules 2017, 22(12), 2056; doi:10.3390/molecules22122056
Received: 12 October 2017 / Revised: 19 November 2017 / Accepted: 20 November 2017 / Published: 25 November 2017
Abstract
Interactions between drugs and target proteins provide important information for drug discovery. To date, experiments have identified only a small number of drug-target interactions. Therefore, the development of computational methods for drug-target interaction prediction is an urgent task of theoretical interest and practical significance. In this paper, we propose a label propagation method with linear neighborhood information (LPLNI) for predicting unobserved drug-target interactions. Firstly, we calculate drug-drug linear neighborhood similarity in the feature spaces by considering how to reconstruct data points from their neighbors. Then, we take the similarities as the manifold of drugs and assume that the manifold is unchanged in the interaction space. Finally, we predict unobserved interactions between known drugs and targets by using the drug-drug linear neighborhood similarity and the known drug-target interactions. The experiments show that LPLNI can utilize only known drug-target interactions to make high-accuracy predictions on four benchmark datasets. Furthermore, we consider incorporating chemical structures into LPLNI models. Experimental results demonstrate that the model with integrated information (LPLNI-II) produces improved performance, better than other state-of-the-art methods. The known drug-target interactions are an important information source for computational predictions. The usefulness of the proposed method is demonstrated by cross validation and a case study.

Open Access Article: Predict the Relationship between Gene and Large Yellow Croaker’s Economic Traits
Molecules 2017, 22(11), 1978; doi:10.3390/molecules22111978
Received: 21 October 2017 / Revised: 5 November 2017 / Accepted: 6 November 2017 / Published: 16 November 2017
Abstract
The importance of a gene’s impact on traits is well appreciated. Gene expression affects the growth, immunity, reproduction and environmental resistance of some fish, and in turn the economic performance of fish-related businesses. Studying the connection between genes and traits can help elucidate the growth of fishes. Thus far, a curated database of large yellow croaker (Larimichthys crocea) genes does not exist. Genes related to the growth efficiency of fish are of great importance to research. For example, the protein encoded by the IFIH1 gene is associated with the immune system's response to viral infection, which affects the survival rate of large yellow croakers. Thus, we collected data from the published literature and combined them with a biological genetic database related to the large yellow croaker. Based on these data, we can predict new gene–trait associations that have not yet been discovered. This work will contribute to research on the growth of large yellow croakers.

Open Access Article: Glypre: In Silico Prediction of Protein Glycation Sites by Fusing Multiple Features and Support Vector Machine
Molecules 2017, 22(11), 1891; doi:10.3390/molecules22111891
Received: 20 September 2017 / Accepted: 26 October 2017 / Published: 3 November 2017
Abstract
Glycation is a non-enzymatic process occurring inside or outside the host body by attaching a sugar molecule to a protein or lipid molecule. It is an important form of post-translational modification (PTM) that impairs the function and changes the characteristics of proteins, so identifying glycation sites may provide useful guidelines for understanding various biological functions of proteins. In this study, we proposed an accurate prediction tool, named Glypre, for lysine glycation. Firstly, we used multiple informative features to encode the peptides. These features included the position scoring function, secondary structure, AAindex, and the composition of k-spaced amino acid pairs. Secondly, the distribution of distinctive features of the residues surrounding the glycation and non-glycation sites was statistically analysed. Thirdly, based on the distribution of these features, we developed a new predictor by using different optimal window sizes for different properties and a two-step feature selection method, which utilized the maximum relevance minimum redundancy method followed by a greedy feature selection procedure. In 10-fold cross-validation, Glypre achieved a sensitivity of 57.47%, a specificity of 90.78%, an accuracy of 79.68%, an area under the receiver-operating characteristic (ROC) curve (AUC) of 0.86, and a Matthews correlation coefficient (MCC) of 0.52. The detailed analysis results showed that our predictor may play a complementary role to other existing methods for identifying protein lysine glycation. The source code and datasets of Glypre are available in the Supplementary File.
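Several abstracts in this listing report sensitivity, specificity, accuracy and MCC. As a reading aid, the following is a minimal sketch of the standard definitions of these four measures in terms of a binary confusion matrix. It is not taken from the Glypre source code; the function name `binary_metrics` and the example counts are hypothetical and are not data from any of the listed papers.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp, fp, tn, fn: true positives, false positives, true negatives, false negatives.
    Ratios with an empty denominator are reported as 0.0 (a convention chosen here).
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0   # recall on the positive class
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # recall on the negative class
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient: ranges from -1 to +1, 0 means chance level.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (an imbalanced test set).
example = binary_metrics(tp=50, fp=10, tn=90, fn=50)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because MCC uses all four cells of the confusion matrix, it stays informative when positive and negative sites are imbalanced, which is one common reason for reporting it alongside accuracy.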

Open Access Article: ProLanGO: Protein Function Prediction Using Neural Machine Translation Based on a Recurrent Neural Network
Molecules 2017, 22(10), 1732; doi:10.3390/molecules22101732
Received: 30 August 2017 / Revised: 11 October 2017 / Accepted: 11 October 2017 / Published: 17 October 2017
Abstract
With the development of next generation sequencing techniques, it is fast and cheap to determine protein sequences but relatively slow and expensive to extract useful information from them because of the limitations of traditional biological experimental techniques. Protein function prediction has been a long-standing challenge in filling the gap between the huge number of protein sequences and the known functions. In this paper, we propose a novel method that converts the protein function prediction problem into a language translation problem, from the newly proposed protein sequence language “ProLan” to the protein function language “GOLan”, and build a neural machine translation model based on recurrent neural networks to translate “ProLan” into “GOLan”. We blindly tested our method by participating in the third Critical Assessment of Function Annotation (CAFA 3) in 2016, and also evaluated its performance on selected proteins whose functions were released after the CAFA competition. The good performance on the training and testing datasets demonstrates that our newly proposed method is a promising direction for protein function prediction. In summary, we propose, for the first time, a method that converts the protein function prediction problem into a language translation problem and applies a neural machine translation model to protein function prediction.

Open Access Article: Systematic Identification of Machine-Learning Models Aimed to Classify Critical Residues for Protein Function from Protein Structure
Molecules 2017, 22(10), 1673; doi:10.3390/molecules22101673
Received: 14 August 2017 / Revised: 24 September 2017 / Accepted: 24 September 2017 / Published: 9 October 2017
Abstract
Protein structure and protein function should be related, yet the nature of this relationship remains unsolved. Mapping the critical residues for protein function with protein structure features represents an opportunity to explore this relationship, yet two important limitations have precluded a proper analysis of the structure-function relationship of proteins: (i) the lack of a formal definition of what critical residues are and (ii) the lack of a systematic evaluation of methods and protein structure features. To address this problem, here we introduce an index to quantify the protein-function criticality of a residue based on experimental data, and a strategy that optimizes both the descriptors of protein structure (physicochemical and centrality descriptors) and the machine learning algorithms to minimize the error in the classification of critical residues. We observed that both physicochemical and centrality descriptors of residues effectively relate protein structure and protein function, and that physicochemical descriptors better describe critical residues. We also show that critical residues are better classified when residue criticality is considered as a binary attribute (i.e., residues are considered critical or not critical). Using this binary annotation for critical residues, eight models rendered accurate and non-overlapping classification of critical residues, confirming the multi-factorial character of the structure-function relationship of proteins.

Open Access Article: Molecular Dynamic Simulation of Space and Earth-Grown Crystal Structures of Thermostable T1 Lipase Geobacillus zalihae Revealed a Better Structure
Molecules 2017, 22(10), 1574; doi:10.3390/molecules22101574
Received: 21 August 2017 / Accepted: 16 September 2017 / Published: 25 September 2017
Abstract
Reduced sedimentation and convection make a microgravity environment well suited for growing high-quality protein crystals. Thermostable T1 lipase derived from the bacterium Geobacillus zalihae has been crystallized using the counter diffusion method under space and earth conditions. A preliminary study of both structures using the YASARA molecular modeling program showed differences in the number of hydrogen bonds and ionic interactions, and in conformation. The space-grown crystal structure contains more hydrogen bonds than the earth-grown crystal structure. A molecular dynamics simulation study was used to provide insight into the fluctuations and conformational changes of both T1 lipase structures. The analysis of root mean square deviation (RMSD), radius of gyration, and root mean square fluctuation (RMSF) showed that the space-grown structure is more stable than the earth-grown structure. The space-grown structure also showed more hydrogen bonds and ionic interactions than the earth-grown structure. Further analysis also revealed that the space-grown structure has long-lived interactions; hence, it is considered the more stable structure. This study describes the conformational dynamics of T1 lipase crystal structures grown under space and earth conditions.

Open Access Article: Identification of DNA-Binding Proteins Using Mixed Feature Representation Methods
Molecules 2017, 22(10), 1602; doi:10.3390/molecules22101602
Received: 15 August 2017 / Revised: 19 September 2017 / Accepted: 20 September 2017 / Published: 22 September 2017
Abstract
DNA-binding proteins play vital roles in cellular processes, such as DNA packaging, replication, transcription, regulation, and other DNA-associated activities. The main current prediction approach is based on machine learning, and its accuracy mainly depends on the feature extraction method. Therefore, using an efficient feature representation method is important to enhance the classification accuracy. However, existing feature representation methods cannot efficiently distinguish DNA-binding proteins from non-DNA-binding proteins. In this paper, a multi-feature representation method, which combines three feature representation methods, namely, K-Skip-N-Grams, Information theory, and Sequential and structural features (SSF), is used to represent the protein sequences and improve feature representation ability. The classifier is a support vector machine. The mixed-feature representation method is evaluated using 10-fold cross-validation and a test set. Feature vectors obtained from the combination of all three feature extraction methods show the best performance in 10-fold cross-validation, both without dimensionality reduction and with dimensionality reduction by max-relevance-max-distance. Moreover, the reduced mixed feature method performs better than the non-reduced mixed feature technique. The feature vectors that combine SSF and K-Skip-N-Grams show the best performance on the test set. Among these methods, mixed features exhibit superiority over the single features.

Open Access Article: Integrative Pathway Analysis of Genes and Metabolites Reveals Metabolism Abnormal Subpathway Regions and Modules in Esophageal Squamous Cell Carcinoma
Molecules 2017, 22(10), 1599; doi:10.3390/molecules22101599
Received: 22 August 2017 / Revised: 20 September 2017 / Accepted: 20 September 2017 / Published: 22 September 2017
Abstract
Aberrant metabolism is one of the main driving forces in the initiation and development of esophageal squamous cell carcinoma (ESCC). Both genes and metabolites play important roles in metabolic pathways. Integrative pathway analysis of both genes and metabolites will thus help to interpret the underlying biological phenomena. Here, we performed integrative pathway analysis of gene and metabolite profiles by analyzing six gene expression profiles and seven metabolite profiles of ESCC. Multiple known and novel subpathways associated with ESCC, such as ‘beta-Alanine metabolism’, were identified via the cooperative use of differential genes, differential metabolites, and their positional importance information in pathways. Furthermore, a global ESCC-Related Metabolic (ERM) network was constructed and 31 modules were identified on the basis of clustering analysis in the ERM network. We found that the three modules located at the center of the ERM network, especially the core region of Module_1, primarily consisted of aldehyde dehydrogenase (ALDH) superfamily members, which contribute to the development of ESCC. For Module_4, pyruvate and the genes and metabolites in its adjacent region were clustered together and formed a core region within the module. Several prognostic genes, including GPT, ALDH1B1, ABAT, WBSCR22 and MDH1, appeared in the three center modules of the network, suggesting that they are potential prognostic markers in ESCC.

Open Access Article: EPuL: An Enhanced Positive-Unlabeled Learning Algorithm for the Prediction of Pupylation Sites
Molecules 2017, 22(9), 1463; doi:10.3390/molecules22091463
Received: 23 July 2017 / Revised: 29 August 2017 / Accepted: 30 August 2017 / Published: 5 September 2017
Abstract
Protein pupylation is a type of post-translational modification that plays a crucial role in the cellular function of prokaryotes. To gain better insight into the mechanisms underlying pupylation, an initial but important step is to identify pupylation sites. To date, several computational methods have been established for the prediction of pupylation sites; these usually construct the negative samples artificially from the verified pupylation proteins in order to train the classifiers. However, if this process is not done properly, it can dramatically affect the performance of the final predictor. In this work, unlike previous computational methods, we propose an enhanced positive-unlabeled learning algorithm (EPuL) for the pupylation site prediction problem, which uses only positive and unlabeled samples. Firstly, we separate the training dataset into the positive dataset and the unlabeled dataset, which contains the remaining non-annotated lysine residues. Then, the EPuL algorithm is used to select a reliable initial negative dataset and to iteratively pick out the non-pupylation sites. In 10-fold cross-validation, the proposed method achieved an accuracy of 90.24%, an area under the curve (AUC) of 0.93 and an MCC of 0.81. A user-friendly web server for predicting pupylation sites was developed and made freely available at http://59.73.198.144:8080/EPuL

Open Access Article: Detection of Interactions between Proteins by Using Legendre Moments Descriptor to Extract Discriminatory Information Embedded in PSSM
Molecules 2017, 22(8), 1366; doi:10.3390/molecules22081366
Received: 24 July 2017 / Accepted: 15 August 2017 / Published: 18 August 2017
Abstract
Protein-protein interactions (PPIs) play a very large part in most cellular processes. Although a great deal of research has been devoted to detecting PPIs through high-throughput technologies, these methods are expensive and cumbersome. Compared with traditional experimental methods, computational methods have attracted much attention because of their good performance in detecting PPIs. In this work, we propose a novel computational method, named PCVM-LM, which combines the probabilistic classification vector machine (PCVM) model and Legendre moments (LMs) to predict PPIs from amino acid sequences. The improvement mainly comes from using the LMs to extract discriminatory information embedded in the position-specific scoring matrix (PSSM), combined with the PCVM classifier for prediction. The proposed method was evaluated on Yeast and Helicobacter pylori datasets with five-fold cross-validation experiments. The experimental results show that the proposed method achieves high average accuracies of 96.37% and 93.48%, respectively, which are much better than those of other well-known methods. To further evaluate the proposed method, we also compared it with the state-of-the-art support vector machine (SVM) classifier and other existing methods on the same datasets. The comparison results show that our method is better than the SVM-based method and other existing methods. The promising experimental results show the reliability and effectiveness of the proposed method, which can be a useful decision support tool for protein research.

Open Access Article: Predicting and Interpreting the Structure of Type IV Pilus of Electricigens by Molecular Dynamics Simulations
Molecules 2017, 22(8), 1342; doi:10.3390/molecules22081342
Received: 30 June 2017 / Revised: 7 August 2017 / Accepted: 10 August 2017 / Published: 12 August 2017
Abstract
Nanowires that transfer electrons to extracellular acceptors are important in organic matter degradation and nutrient cycling in the environment. Geobacter pili, which belong to the Type IV pili, are regarded as nanowire-like biological structures. However, determination of the structure of pili remains challenging due to the insolubility of monomers, the presence of surface appendages, the heterogeneity of the assembly, and the low resolution of electron microscopy techniques. Our previous study provided a method to predict structures for Type IV pili. In this work, we improved on our previous method using molecular dynamics simulations to optimize the structures of Neisseria gonorrhoeae (GC), Neisseria meningitidis and Geobacter uraniireducens pili. Comparison between the predicted structures for GC and Neisseria meningitidis pili and their native structures revealed that the proposed method could predict Type IV pilus structures successfully. According to the predicted structures, the structural basis for conductivity in G. uraniireducens pili was attributed to the three N-terminal aromatic amino acids. The aromatics were interspersed within the regions of charged amino acids, which may influence the configuration of the aromatic contacts and the rate of electron transfer. These results will supplement experimental research into the mechanism of long-range electron transport along the pili of electricigens.
(This article belongs to the Special Issue Computational Analysis for Protein Structure and Interaction)

Open Access Article Neighbor Affinity-Based Core-Attachment Method to Detect Protein Complexes in Dynamic PPI Networks
Molecules 2017, 22(7), 1223; doi:10.3390/molecules22071223
Received: 28 June 2017 / Revised: 14 July 2017 / Accepted: 18 July 2017 / Published: 24 July 2017
PDF Full-text (5444 KB) | HTML Full-text | XML Full-text
Abstract
Protein complexes play significant roles in cellular processes. Identifying protein complexes from protein-protein interaction (PPI) networks is an effective strategy to understand biological processes and cellular functions. A number of methods have recently been proposed to detect protein complexes. However, most methods predict protein complexes from static PPI networks, and usually overlook the inherent dynamics and topological properties of protein complexes. In this paper, we propose a novel method, called NABCAM (Neighbor Affinity-Based Core-Attachment Method), to identify protein complexes from dynamic PPI networks. Firstly, the centrality score of every protein is calculated. The proteins with the highest centrality scores are regarded as the seed proteins. Secondly, the seed proteins are expanded to complex cores by calculating the similarity values between the seed proteins and their neighboring proteins. Thirdly, the attachments are appended to their corresponding protein complex cores by comparing the affinity among neighbors inside the core against that outside the core. Finally, filtering processes are carried out to obtain the final clustering result. Results on the DIP database show that the NABCAM algorithm can predict protein complexes effectively in comparison with other state-of-the-art methods. Moreover, many protein complexes predicted by our method are biologically significant.
(This article belongs to the Special Issue Computational Analysis for Protein Structure and Interaction)
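
The four stages named in the abstract above (centrality scoring, core expansion, attachment, filtering) can be illustrated with a small sketch. This is not the NABCAM implementation: the abstract does not give its formulas, so degree centrality, Jaccard similarity of closed neighborhoods, a simple inside-versus-outside neighbor count, and the function name `detect_complexes` with its parameters are all assumptions chosen only to make the flow concrete on a static toy graph.

```python
from typing import Dict, FrozenSet, List, Set

Graph = Dict[str, Set[str]]  # protein -> set of interacting proteins


def jaccard(graph: Graph, u: str, v: str) -> float:
    """Similarity of two proteins via their closed neighborhoods (assumed measure)."""
    a = graph[u] | {u}
    b = graph[v] | {v}
    return len(a & b) / len(a | b)


def detect_complexes(graph: Graph, n_seeds: int = 2,
                     sim_threshold: float = 0.5,
                     min_size: int = 3) -> List[FrozenSet[str]]:
    """Generic core-attachment clustering; all scoring choices are illustrative."""
    # Step 1: rank proteins by degree centrality (assumption) and take the top seeds.
    ranked = sorted(graph, key=lambda p: (-len(graph[p]), p))
    seeds = ranked[:n_seeds]

    complexes: List[FrozenSet[str]] = []
    for seed in seeds:
        # Step 2: grow the seed into a core with sufficiently similar neighbors.
        core = {seed} | {v for v in graph[seed]
                         if jaccard(graph, seed, v) >= sim_threshold}

        # Step 3: attach a bordering protein if it has more neighbors
        # inside the core than outside it (assumed affinity rule).
        candidates = set().union(*(graph[c] for c in core)) - core
        attachments = {v for v in candidates
                       if len(graph[v] & core) > len(graph[v] - core)}

        # Step 4: filter out tiny and duplicate clusters.
        cluster = frozenset(core | attachments)
        if len(cluster) >= min_size and cluster not in complexes:
            complexes.append(cluster)
    return complexes


# Toy network: two triangles joined by a single C-D edge.
toy = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"},
    "D": {"C", "E", "F"}, "E": {"D", "F"}, "F": {"D", "E"},
}
print([sorted(c) for c in detect_complexes(toy)])
# prints [['A', 'B', 'C'], ['D', 'E', 'F']]
```

On the toy network the bridge proteins C and D become seeds, and each triangle is recovered as one cluster because the protein across the bridge has more neighbors outside the core than inside it. A dynamic PPI network, as targeted by the paper, would require running such a procedure per time point or condition, which this sketch does not attempt.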

Open Access Article Prediction of Drug–Target Interaction Networks from the Integration of Protein Sequences and Drug Chemical Structures
Molecules 2017, 22(7), 1119; doi:10.3390/molecules22071119
Received: 27 May 2017 / Revised: 27 June 2017 / Accepted: 3 July 2017 / Published: 5 July 2017
PDF Full-text (798 KB) | HTML Full-text | XML Full-text
Abstract
Knowledge of drug–target interaction (DTI) plays an important role in discovering new drug candidates. Unfortunately, experimental methods for determining DTI have unavoidable shortcomings, including being time-consuming and expensive. This motivates us to develop an effective computational method to predict DTI based on protein sequence. In this paper, we propose a novel computational approach based on protein sequence, namely PDTPS (Predicting Drug Targets with Protein Sequence), to predict DTI. The PDTPS method combines Bi-gram probabilities (BIGP), Position Specific Scoring Matrix (PSSM), and Principal Component Analysis (PCA) with Relevance Vector Machine (RVM). In order to evaluate the prediction capacity of PDTPS, experiments were carried out on enzyme, ion channel, GPCR, and nuclear receptor datasets using five-fold cross-validation tests. The proposed PDTPS method achieved average accuracies of 97.73%, 93.12%, 86.78%, and 87.78% on the enzyme, ion channel, GPCR and nuclear receptor datasets, respectively. The experimental results showed that our method has good prediction performance. Furthermore, to further evaluate the prediction performance of the proposed PDTPS method, we compared it with the state-of-the-art support vector machine (SVM) classifier on the enzyme and ion channel datasets, and with other existing methods on all four datasets. The promising comparison results further demonstrate the efficiency and robustness of the proposed PDTPS method. This makes it a useful tool for predicting DTI, as well as for other bioinformatics tasks.
(This article belongs to the Special Issue Computational Analysis for Protein Structure and Interaction)
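
To make the feature-extraction step mentioned in the abstract above more concrete, the sketch below computes a bi-gram descriptor from a PSSM using one common formulation: for every ordered pair of columns (m, n), sum the product of column m at position i and column n at position i+1 over the sequence. Whether PDTPS uses exactly this formula is an assumption, as the abstract does not state it, and the function name `bigram_features` is hypothetical. The PCA and RVM stages are not shown.

```python
from typing import List


def bigram_features(pssm: List[List[float]]) -> List[float]:
    """Flattened bi-gram matrix B, where B[m][n] = sum_i P[i][m] * P[i+1][n].

    `pssm` has one row per residue and one column per amino-acid type,
    so a real L x 20 PSSM yields a fixed-length vector of 400 values
    regardless of the sequence length L.
    """
    n_cols = len(pssm[0])
    feats = [0.0] * (n_cols * n_cols)
    # Walk over consecutive residue pairs (i, i+1).
    for row, nxt in zip(pssm, pssm[1:]):
        for m in range(n_cols):
            for n in range(n_cols):
                feats[m * n_cols + n] += row[m] * nxt[n]
    return feats


# Toy 3-residue "PSSM" with only 2 columns, to keep the arithmetic visible.
toy_pssm = [[1.0, 0.0],
            [0.0, 1.0],
            [1.0, 1.0]]
print(bigram_features(toy_pssm))
# prints [0.0, 1.0, 1.0, 1.0]
```

The point of such a descriptor is that proteins of different lengths map to vectors of the same size, which is what allows a dimensionality-reduction step and a classifier to be applied afterwards.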

Open Access Article High-Performance Prediction of Human Estrogen Receptor Agonists Based on Chemical Structures
Molecules 2017, 22(4), 675; doi:10.3390/molecules22040675
Received: 16 March 2017 / Revised: 16 April 2017 / Accepted: 19 April 2017 / Published: 23 April 2017
Cited by 2 | PDF Full-text (2839 KB) | HTML Full-text | XML Full-text
Abstract
Many agonists for the estrogen receptor are known to disrupt endocrine functioning. We have developed a computational model that predicts agonists for the estrogen receptor ligand-binding domain in an assay system. Our model was entered into the Tox21 Data Challenge 2014, a computational toxicology competition organized by the National Center for Advancing Translational Sciences. This competition aims to find high-performance predictive models for various adverse-outcome pathways, including the estrogen receptor. Our predictive model, which is based on the random forest method, delivered the best performance in its competition category. In the current study, the predictive performance of the random forest models was improved by strictly adjusting the hyperparameters to avoid overfitting. The random forest models were optimized from 4000 descriptors simultaneously applied to 10,000 activity assay results for the estrogen receptor ligand-binding domain, which have been measured and compiled by Tox21. Owing to the correlation between our model’s and the challenge’s results, we consider that our model currently possesses the highest predictive power for agonist activity of the estrogen receptor ligand-binding domain. Furthermore, analysis of the optimized model revealed some important features of the agonists, such as the number of hydroxyl groups in the molecules.
(This article belongs to the Special Issue Computational Analysis for Protein Structure and Interaction)

Review

Open Access Review Recent Advances in Conotoxin Classification by Using Machine Learning Methods
Molecules 2017, 22(7), 1057; doi:10.3390/molecules22071057
Received: 17 May 2017 / Revised: 12 June 2017 / Accepted: 19 June 2017 / Published: 25 June 2017
PDF Full-text (1485 KB) | HTML Full-text | XML Full-text
Abstract
Conotoxins are small, disulfide-rich peptides that are invaluable because they target ion channels and neuronal receptors. Conotoxins have been demonstrated to be potent pharmaceuticals in the treatment of a series of diseases, such as Alzheimer’s disease, Parkinson’s disease, and epilepsy. In addition, conotoxins are ideal molecular templates for the development of new drug lead compounds and play important roles in neurobiological research. Thus, the accurate identification of conotoxin types will provide key clues for biological research and clinical medicine. Generally, conotoxin types are confirmed when their sequence, structure, and function are experimentally validated. However, it is time-consuming and costly to acquire structure and function information through biochemical experiments. Therefore, it is important to develop computational tools for efficiently and effectively recognizing conotoxin types based on sequence information. In this work, we review the current progress in computational identification of conotoxins in the following aspects: (i) construction of benchmark datasets; (ii) strategies for extracting sequence features; (iii) feature selection techniques; (iv) machine learning methods for classifying conotoxins; (v) the results obtained by these methods and the published tools; and (vi) future perspectives on conotoxin classification. The paper provides a basis for in-depth study of conotoxins and drug therapy research.
(This article belongs to the Special Issue Computational Analysis for Protein Structure and Interaction)