Special Protein or RNA Molecules Computational Identification 2019

A special issue of International Journal of Molecular Sciences (ISSN 1422-0067). This special issue belongs to the section "Molecular Informatics".

Deadline for manuscript submissions: closed (28 October 2019) | Viewed by 44793

Special Issue Editor

Prof. Dr. Quan Zou

Guest Editor

Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
Interests: bioinformatics; parallel computing; deep learning; protein classification; genome assembly

Special Issue Information

Dear colleagues,

The discovery of new molecules remains an important and challenging task. For some special proteins or RNA molecules, detecting new members is difficult, time-consuming, and costly. These special proteins include cytokines, enzymes, cell-penetrating peptides, anticancer peptides, cancerlectins, G protein-coupled receptors, etc. Some noncoding RNAs, such as microRNA, snoRNA, snRNA, circular RNA, and tRNA, also need to be annotated in sequencing data. Researchers often use computer programs to list candidates and then validate those candidates with molecular experiments. The choice of program is therefore a key issue: a good one can cut wet-experiment costs, whereas software with a high false-positive rate leads to high costs in the validation process.

In addition to proteins, we encourage authors to pay attention to noncoding RNA molecules. Detecting microRNA and other noncoding RNAs remains an open challenge for bioinformatics researchers. A perfect predictor could remove the cost of Northern blot or RT-PCR validation. Work on RNA function and the RNA–disease relationship is also interesting and welcome. Some network methods, including random walk and matrix factorization, have been employed in RNA–disease relationship prediction. However, they are not robust: state-of-the-art methods sometimes become invalid when the datasets are updated. I hope to see more novel and robust methods and gold-standard benchmark datasets in the new Special Issue.
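
For readers unfamiliar with the random walk family of network methods mentioned above, the following is a minimal sketch of random walk with restart on a toy graph. The graph, the restart probability of 0.5, and the function name are illustrative assumptions, not taken from any paper in this Special Issue; a real RNA–disease predictor would run on a much larger heterogeneous network.

```python
def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walker restarting at `seed`.

    adj is a square adjacency matrix given as a list of lists (hypothetical toy input).
    """
    n = len(adj)
    # Column-normalise the adjacency matrix so each column holds transition probabilities.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    w = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        # One step: follow an edge with probability (1 - restart), else jump back to the seed.
        nxt = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n)) + restart * p0[i]
               for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy path graph 0 - 1 - 2 - 3; node 0 plays the role of a known association.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(adj, seed=0)
print(scores)
```

Nodes closer to the seed receive higher scores, which is how such methods rank candidate associations. The sensitivity of these scores to added or removed edges is one way to see why results can shift when datasets are updated.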

Prof. Dr. Quan Zou
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. International Journal of Molecular Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. There is an Article Processing Charge (APC) for publication in this open access journal. For details about the APC please see here. Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

Bioinformatics
Machine learning
Feature selection
Protein classification
PseAAC features
Anticancer peptides
Cell-penetrating peptides
Oncogene
DNA/RNA binding proteins
MHC binding peptide
Noncoding RNA
MicroRNA
RNA–disease relationship
Network

Benefits of Publishing in a Special Issue

Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (9 papers)

Research

19 pages, 928 KB

Open Access Article

DeepMiR2GO: Inferring Functions of Human MicroRNAs Using a Deep Multi-Label Classification Model

by Jiacheng Wang, Jingpu Zhang, Yideng Cai and Lei Deng

Int. J. Mol. Sci. 2019, 20(23), 6046; https://doi.org/10.3390/ijms20236046 - 30 Nov 2019

Cited by 16 | Viewed by 3737

Abstract

MicroRNAs (miRNAs) are a highly abundant collection of functional non-coding RNAs involved in cellular regulation and various complex human diseases. Although a large number of miRNAs have been identified, most of their physiological functions remain unknown. Computational methods play a vital role in exploring the potential functions of miRNAs. Here, we present DeepMiR2GO, a tool for integrating miRNAs, proteins and diseases, to predict the gene ontology (GO) functions based on multiple deep neuro-symbolic models. DeepMiR2GO starts by integrating the miRNA co-expression network, protein-protein interaction (PPI) network, disease phenotype similarity network, and interactions or associations among them into a global heterogeneous network. Then, it employs an efficient graph embedding strategy to learn potential network representations of the global heterogeneous network as the topological features. Finally, a deep multi-label classification network based on multiple neuro-symbolic models is built and used to annotate the GO terms of miRNAs. The predicted results demonstrate that DeepMiR2GO performs significantly better than other state-of-the-art approaches in terms of precision, recall, and maximum F-measure. Full article

(This article belongs to the Special Issue Special Protein or RNA Molecules Computational Identification 2019)

15 pages, 4746 KB

Open Access Article

LDAPred: A Method Based on Information Flow Propagation and a Convolutional Neural Network for the Prediction of Disease-Associated lncRNAs

by Ping Xuan, Lan Jia, Tiangang Zhang, Nan Sheng, Xiaokun Li and Jinbao Li

Int. J. Mol. Sci. 2019, 20(18), 4458; https://doi.org/10.3390/ijms20184458 - 10 Sep 2019

Cited by 29 | Viewed by 3726

Abstract

Long non-coding RNAs (lncRNAs) play a crucial role in the pathogenesis and development of complex diseases. Predicting potential lncRNA–disease associations can improve our understanding of the molecular mechanisms of human diseases and help identify biomarkers for disease diagnosis, treatment, and prevention. Previous research methods have mostly integrated the similarity and association information of lncRNAs and diseases, without considering the topological structure information among these nodes, which is important for predicting lncRNA–disease associations. We propose a method based on information flow propagation and convolutional neural networks, called LDAPred, to predict disease-related lncRNAs. LDAPred not only integrates the similarities, associations, and interactions among lncRNAs, diseases, and miRNAs, but also exploits the topological structures formed by them. In this study, we construct a dual convolutional neural network-based framework that comprises the left and right sides. The embedding layer on the left side is established by utilizing lncRNA, miRNA, and disease-related biological premises. On the right side of the frame, multiple types of similarity, association, and interaction relationships among lncRNAs, diseases, and miRNAs are calculated based on information flow propagation on the bi-layer networks, such as the lncRNA–disease network. They contain the network topological structure and they are learned by the right side of the framework. The experimental results based on five-fold cross-validation indicate that LDAPred performs better than several state-of-the-art methods. Case studies on breast cancer, colon cancer, and osteosarcoma further demonstrate LDAPred’s ability to discover potential lncRNA–disease associations. Full article

17 pages, 2041 KB

Open Access Article

CNNDLP: A Method Based on Convolutional Autoencoder and Convolutional Neural Network with Adjacent Edge Attention for Predicting lncRNA–Disease Associations

by Ping Xuan, Nan Sheng, Tiangang Zhang, Yong Liu and Yahong Guo

Int. J. Mol. Sci. 2019, 20(17), 4260; https://doi.org/10.3390/ijms20174260 - 30 Aug 2019

Cited by 38 | Viewed by 4445

Abstract

It is well known that the unusual expression of long non-coding RNAs (lncRNAs) is closely related to the physiological and pathological processes of diseases. Therefore, inferring the potential lncRNA–disease associations is helpful for understanding the molecular pathogenesis of diseases. Most previous methods have concentrated on the construction of shallow learning models in order to predict lncRNA–disease associations, while they have failed to deeply integrate heterogeneous multi-source data and to learn the low-dimensional feature representations from these data. We propose a method based on the convolutional neural network with the attention mechanism and convolutional autoencoder for predicting candidate disease-related lncRNAs, and refer to it as CNNDLP. CNNDLP integrates multiple kinds of data from heterogeneous sources, including the associations, interactions, and similarities related to the lncRNAs, diseases, and miRNAs. Two different embedding layers are established by combining the diverse biological premises about the cases in which lncRNAs are likely to associate with diseases. We construct a novel prediction model based on the convolutional neural network with attention mechanism and convolutional autoencoder to learn the attention and the low-dimensional network representations of the lncRNA–disease pairs from the embedding layers. The different adjacent edges among the lncRNA, miRNA, and disease nodes have different contributions for association prediction. Hence, an attention mechanism at the adjacent edge level is established, and the left side of the model learns the attention representation of a pair of lncRNA and disease. A new type of lncRNA similarity and a new type of disease similarity are calculated by incorporating the topological structures of multiple bipartite networks. The low-dimensional network representation of the lncRNA–disease pairs is further learned by the autoencoder-based convolutional neural network on the right side of the model. The cross-validation experimental results confirm that CNNDLP has superior prediction performance compared to the state-of-the-art methods. Case studies on stomach cancer, breast cancer, and prostate cancer further show the ability of CNNDLP to discover potential disease lncRNAs. Full article

14 pages, 386 KB

Open Access Article

FKRR-MVSF: A Fuzzy Kernel Ridge Regression Model for Identifying DNA-Binding Proteins by Multi-View Sequence Features via Chou’s Five-Step Rule

by Yi Zou, Yijie Ding, Jijun Tang, Fei Guo and Li Peng

Int. J. Mol. Sci. 2019, 20(17), 4175; https://doi.org/10.3390/ijms20174175 - 26 Aug 2019

Cited by 34 | Viewed by 4104

Abstract

DNA-binding proteins play an important role in cell metabolism. In biological laboratories, the detection methods for DNA-binding proteins include yeast one-hybrid methods, bacterial singles, X-ray crystallography methods, and others, but these methods involve a lot of labor, material, and time. In recent years, many computation-based approaches have been proposed to detect DNA-binding proteins. In this paper, a machine learning-based method, called the Fuzzy Kernel Ridge Regression model based on Multi-View Sequence Features (FKRR-MVSF), is proposed to identify DNA-binding proteins. First, multi-view sequence features are extracted from protein sequences. Next, a Multiple Kernel Learning (MKL) algorithm is employed to combine multiple features. Finally, a Fuzzy Kernel Ridge Regression (FKRR) model is built to detect DNA-binding proteins. Compared with other methods, our model achieves good results. Our method obtains accuracies of 83.26% and 81.72% on two benchmark datasets (PDB1075 and PDB186), respectively. Full article

12 pages, 1963 KB

Open Access Article

In Silico Prediction of Drug-Induced Liver Injury Based on Ensemble Classifier Method

by Yangyang Wang, Qingxin Xiao, Peng Chen and Bing Wang

Int. J. Mol. Sci. 2019, 20(17), 4106; https://doi.org/10.3390/ijms20174106 - 22 Aug 2019

Cited by 32 | Viewed by 5461

Abstract

Drug-induced liver injury (DILI) is a major factor in the development of drugs and the safety of drugs. If DILI cannot be effectively predicted during the development of a drug, it will cause the drug to be withdrawn from the market. Therefore, predicting DILI is crucial at the early stages of drug research. This work presents a 2-class ensemble classifier model for predicting DILI, with 2D molecular descriptors and fingerprints on a dataset of 450 compounds. The purpose of our study is to investigate which are the key molecular fingerprints that may cause DILI risk, and then to obtain a reliable ensemble model to predict DILI risk with these key factors. Experimental results suggested that 8 molecular fingerprints are very critical for predicting DILI, and also yielded the best ratio of molecular fingerprints to molecular descriptors. The 5-fold cross-validation of the ensemble vote classifier method obtained an accuracy of 77.25%, and the accuracy on the test set was 81.67%. This model could be used for drug-induced liver injury prediction. Full article
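
As an illustration of the ensemble vote idea described in this abstract, here is a minimal sketch of hard majority voting over three binary classifiers. The predictions and labels are invented toy values, and the helper names are hypothetical; the abstract does not specify the base classifiers or the exact voting rule used in the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists (all the same length) by majority vote."""
    voted = []
    for labels in zip(*predictions):
        # most_common(1) returns the label with the highest count for this sample.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy data: 1 = DILI-positive, 0 = DILI-negative (made-up values).
y_true = [1, 0, 1, 1, 0]
clf_a = [1, 0, 1, 0, 0]
clf_b = [1, 1, 0, 1, 0]
clf_c = [0, 0, 1, 1, 1]

ensemble = majority_vote([clf_a, clf_b, clf_c])
print(ensemble, accuracy(y_true, ensemble))
```

In this toy case each classifier makes at least one mistake, but the mistakes fall on different samples, so the vote recovers every label. Using an odd number of voters avoids ties in the 2-class setting.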

19 pages, 2374 KB

Open Access Article

Inferring the Disease-Associated miRNAs Based on Network Representation Learning and Convolutional Neural Networks

by Ping Xuan, Hao Sun, Xiao Wang, Tiangang Zhang and Shuxiang Pan

Int. J. Mol. Sci. 2019, 20(15), 3648; https://doi.org/10.3390/ijms20153648 - 25 Jul 2019

Cited by 46 | Viewed by 4202

Abstract

Identification of disease-associated miRNAs (disease miRNAs) is critical for understanding etiology and pathogenesis. Most previous methods focus on integrating similarity and association information contained in heterogeneous miRNA-disease networks. However, these methods establish only shallow prediction models that fail to capture complex relationships among miRNA similarities, disease similarities, and miRNA-disease associations. We propose a prediction method on the basis of network representation learning and convolutional neural networks to predict disease miRNAs, called CNNMDA. CNNMDA deeply integrates the similarity information of miRNAs and diseases, miRNA-disease associations, and representations of miRNAs and diseases in low-dimensional feature space. The new framework based on deep learning was built to learn the original and global representation of a miRNA-disease pair. First, diverse biological premises about miRNAs and diseases were combined to construct the embedding layer in the left part of the framework, from a biological perspective. Second, the various connection edges in the miRNA-disease network, such as similarity and association connections, were dependent on each other. Therefore, it was necessary to learn the low-dimensional representations of the miRNA and disease nodes based on the entire network. The right part of the framework learnt the low-dimensional representation of each miRNA and disease node based on non-negative matrix factorization, and these representations were used to establish the corresponding embedding layer. Finally, the left and right embedding layers went through convolutional modules to deeply learn the complex and non-linear relationships among the similarities and associations between miRNAs and diseases. Experimental results based on cross validation indicated that CNNMDA yields superior performance compared to several state-of-the-art methods. Furthermore, case studies on lung, breast, and pancreatic neoplasms demonstrated the powerful ability of CNNMDA to discover potential disease miRNAs. Full article

12 pages, 1272 KB

Open Access Article

An Ensemble Classifier to Predict Protein–Protein Interactions by Combining PSSM-based Evolutionary Information with Local Binary Pattern Model

by Yang Li, Li-Ping Li, Lei Wang, Chang-Qing Yu, Zheng Wang and Zhu-Hong You

Int. J. Mol. Sci. 2019, 20(14), 3511; https://doi.org/10.3390/ijms20143511 - 17 Jul 2019

Cited by 15 | Viewed by 3773

Abstract

Proteins play a critical role in the regulation of biological cell functions. Whether proteins interact with each other has become a fundamental problem, because proteins usually perform their functions by interacting with other proteins. Although a large amount of protein–protein interaction (PPI) data has been produced by high-throughput biotechnology, biological experimental techniques are time-consuming and costly. Thus, computational methods for predicting protein interactions have become a research hot spot. In this research, we propose an efficient computational method that combines a Rotation Forest (RF) classifier with the Local Binary Pattern (LBP) feature extraction method to predict PPIs from the perspective of the Position-Specific Scoring Matrix (PSSM). The proposed method has achieved superior performance in predicting Yeast, Human, and H. pylori datasets with average accuracies of 92.12%, 96.21%, and 86.59%, respectively. In addition, we also evaluated the performance of the proposed method on the four independent datasets of C. elegans, H. pylori, H. sapiens, and M. musculus. These experimental results demonstrate that our model has good feasibility and robustness in predicting PPIs. Full article
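
To make the Local Binary Pattern step in this abstract more concrete, the sketch below computes a basic 8-neighbour LBP histogram over a small numeric matrix standing in for a PSSM. This is the textbook 3x3 LBP, chosen here as an assumption; the abstract does not say which LBP variant, neighbourhood, or normalisation the paper uses, and the toy matrix is invented.

```python
def lbp_histogram(matrix):
    """Return a 256-bin histogram of basic 3x3 LBP codes over the interior cells."""
    rows, cols = len(matrix), len(matrix[0])
    # Eight neighbours visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = matrix[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as large as the centre.
                if matrix[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    return hist


# Toy 4 x 5 score matrix (a real PSSM has one row per residue and 20 columns).
toy_pssm = [
    [2, -1, 0, 3, 1],
    [0, 4, -2, 1, 0],
    [1, 0, 5, -3, 2],
    [-1, 2, 0, 1, 4],
]
hist = lbp_histogram(toy_pssm)
print(sum(hist))  # number of interior cells that received a code
```

The histogram has a fixed length of 256 regardless of protein length, which is what makes this kind of descriptor convenient as input to a classifier such as Rotation Forest.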

16 pages, 2847 KB

Open Access Article

MPLs-Pred: Predicting Membrane Protein-Ligand Binding Sites Using Hybrid Sequence-Based Features and Ligand-Specific Models

by Chang Lu, Zhe Liu, Enju Zhang, Fei He, Zhiqiang Ma and Han Wang

Int. J. Mol. Sci. 2019, 20(13), 3120; https://doi.org/10.3390/ijms20133120 - 26 Jun 2019

Cited by 18 | Viewed by 5011

Abstract

Membrane proteins (MPs) are involved in many essential biomolecule mechanisms as a pivotal factor in enabling the small molecule and signal transport between the two sides of the biological membrane; this is the reason that a large portion of modern medicinal drugs target MPs. Therefore, accurately identifying the membrane protein-ligand binding sites (MPLs) will significantly improve drug discovery. In this paper, we propose a sequence-based MPLs predictor called MPLs-Pred, where evolutionary profiles, topology structure, physicochemical properties, and primary sequence segment descriptors are combined as features applied to a random forest classifier, and an under-sampling scheme is used to enhance the classification capability with imbalanced samples. Additional ligand-specific models were taken into consideration in refining the prediction. The corresponding experimental results based on our method achieved an appreciable performance, with an overall MCC (Matthews correlation coefficient) of 0.63, and MCC values of 0.604, 0.7, and 0.692, respectively, for the three main types of ligands: drugs, metal ions, and biomacromolecules. MPLs-Pred is freely accessible at http://icdtools.nenu.edu.cn/. Full article
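
Since this abstract reports its results as MCC on imbalanced data, a short sketch of the standard Matthews correlation coefficient formula may help. The confusion-matrix counts below are invented for illustration and are not results from the paper; returning 0.0 when the denominator is zero is a common convention adopted here as an assumption.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Undefined when any marginal total is zero; report 0.0 by convention.
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    """Plain accuracy, shown for comparison on imbalanced data."""
    return (tp + tn) / (tp + tn + fp + fn)


# Toy imbalanced case: 10 binding residues among 100, half of them missed.
tp, tn, fp, fn = 5, 90, 0, 5
print(accuracy(tp, tn, fp, fn), mcc(tp, tn, fp, fn))
```

Here accuracy is high because the negative class dominates, while MCC is noticeably lower because half of the positives are missed. That gap is the usual reason binding-site predictors report MCC rather than accuracy alone.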

14 pages, 2973 KB

Open Access Article

mACPpred: A Support Vector Machine-Based Meta-Predictor for Identification of Anticancer Peptides

by Vinothini Boopathi, Sathiyamoorthy Subramaniyam, Adeel Malik, Gwang Lee, Balachandran Manavalan and Deok-Chun Yang

Int. J. Mol. Sci. 2019, 20(8), 1964; https://doi.org/10.3390/ijms20081964 - 22 Apr 2019

Cited by 176 | Viewed by 9273

Abstract

Anticancer peptides (ACPs) are promising therapeutic agents for targeting and killing cancer cells. The accurate prediction of ACPs from given peptide sequences remains an open problem in the field of immunoinformatics. Recently, machine learning algorithms have emerged as a promising tool for helping experimental scientists predict ACPs. However, the performance of existing methods still needs to be improved. In this study, we present a novel approach for the accurate prediction of ACPs, which involves the following two steps: (i) We applied a two-step feature selection protocol on seven feature encodings that cover various aspects of sequence information (composition-based, physicochemical properties and profiles) and obtained their corresponding optimal feature-based models. The resultant predicted probabilities of ACPs were further utilized as feature vectors. (ii) The predicted probability feature vectors were in turn used as input to a support vector machine to develop the final prediction model called mACPpred. Cross-validation analysis showed that the proposed predictor performs significantly better than individual feature encodings. Furthermore, mACPpred significantly outperformed the existing methods compared in this study when objectively evaluated on an independent dataset. Full article
