Algorithmic Themes in Bioinformatics and Computational Biology

A special issue of Biomolecules (ISSN 2218-273X). This special issue belongs to the section "Bioinformatics and Systems Biology".

Deadline for manuscript submissions: closed (20 October 2022) | Viewed by 19051

Special Issue Editors


Prof. Dr. Quan Zou
Guest Editor
Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
Interests: bioinformatics; parallel computing; deep learning; protein classification; genome assembly

Dr. Chunyu Wang
Guest Editor
School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150080, China
Interests: Bioinformatics; Machine Learning; Drug Discovery; Biological Sequence Analysis

Special Issue Information

Dear Colleagues,

Deep-learning methods have shown spectacular performance in fields such as computer vision and natural language processing. These approaches have made it possible to tackle complex bioinformatic problems, especially in drug discovery, computational biology, and bio-ontology representation. A rich body of recent literature documents the remarkable results that deep-learning approaches have achieved in these fields. However, these approaches also come with a new set of challenges, such as the limited interpretability of models.

This Special Issue seeks the latest fundamental advances in addressing the challenges of computational biology. Specifically, this issue will explore the latest deep-learning-based methods in biology and studies of complex biological data, drug discovery, and related areas.

We invite scientists working in this area to submit their original research or review articles for publication in this Special Issue. The topics of interest include (but are not limited to) the prediction of ADME properties, the prediction of drug–target affinity, and lead optimization. Both application and methodological research studies are welcome.

Prof. Dr. Quan Zou
Dr. Chunyu Wang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Biomolecules is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Deep learning
  • Computational biology
  • Drug discovery
  • Molecular property prediction

Published Papers (8 papers)


Research

19 pages, 14813 KiB  
Article
A Novel Capsule Network with Attention Routing to Identify Prokaryote Phosphorylation Sites
by Shixian Wang, Lina Zhang, Runtao Yang and Yujiao Zhao
Biomolecules 2022, 12(12), 1854; https://doi.org/10.3390/biom12121854 - 12 Dec 2022
Cited by 2 | Viewed by 1200
Abstract
By denaturing proteins and promoting the formation of multiprotein complexes, protein phosphorylation has important effects on the activity of protein functional molecules and cell signaling. The regulation of protein phosphorylation allows microbes to respond rapidly and reversibly to specific environmental stimuli or niches, which is closely related to the molecular mechanisms of bacterial drug resistance. Accurate prediction of phosphorylation sites (p-sites) of prokaryotes can contribute to addressing bacterial resistance and providing new perspectives for developing novel antibacterial drugs. Most existing studies focus on human phosphorylation sites, while tools targeting phosphorylation site identification of prokaryotic proteins are still relatively scarce. This study designs a capsule network-based prediction technique for p-sites in prokaryotes. To address the poor scalability and unreliability of dynamic routing processes in the output space of capsule networks, a more reliable way is introduced to learn the consistency between capsules. We incorporate a self-attention mechanism into the routing algorithm to capture the global information of the capsule, reducing the computational effort while enriching the representation capability of the capsule. To address the weak robustness of the model, the proposed method, EcapsP, improves the prediction accuracy and stability by introducing shortcuts and unconditional reconfiguration. In addition, the study compares and analyzes the prediction performance based on word vectors, physicochemical properties, and mixing characteristics in predicting serine (Ser/S), threonine (Thr/T), and tyrosine (Tyr/Y) p-sites. The comprehensive experimental results show that the accuracy of the developed technique is close to 70% for the identification of the three phosphorylation sites in prokaryotes. Importantly, in side-by-side comparisons with other state-of-the-art predictors, our method improves the Matthews correlation coefficient (MCC) by approximately 7%. The results demonstrate the superiority of EcapsP in terms of high performance and reliability.
(This article belongs to the Special Issue Algorithmic Themes in Bioinformatics and Computational Biology)
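
For readers unfamiliar with the Matthews correlation coefficient (MCC) mentioned in the abstract above, the following is a minimal sketch of the standard binary-classification definition. The function name and the choice to return 0.0 when the denominator vanishes are assumptions made for illustration; this is not code from the paper.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Compute the MCC from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention,
    assumed here for illustration).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

The MCC ranges from -1 (total disagreement) through 0 (chance level) to +1 (perfect prediction), which makes it more informative than accuracy when the positive and negative classes are imbalanced.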

15 pages, 1938 KiB  
Article
Identification of Transcriptome Biomarkers for Severe COVID-19 with Machine Learning Methods
by Xiaohong Li, Xianchao Zhou, Shijian Ding, Lei Chen, Kaiyan Feng, Hao Li, Tao Huang and Yu-Dong Cai
Biomolecules 2022, 12(12), 1735; https://doi.org/10.3390/biom12121735 - 23 Nov 2022
Cited by 3 | Viewed by 1796
Abstract
The rapid spread of COVID-19 has become a major concern for people’s lives and health all around the world. COVID-19 patients in various phases and severity require individualized treatment given that different patients may develop different symptoms. In this study, we employed machine learning methods to discover biomarkers that may accurately classify COVID-19 in various disease states and severities. The blood gene expression profiles from 50 COVID-19 patients without intensive care, 50 COVID-19 patients with intensive care, 10 non-COVID-19 individuals without intensive care, and 16 non-COVID-19 individuals with intensive care were analyzed. Boruta was first used to remove irrelevant gene features in the expression profiles, and then, the minimum redundancy maximum relevance was applied to sort the remaining features. The generated feature-ranked list was fed into the incremental feature selection method to discover the essential genes and build powerful classifiers. The molecular mechanism of some biomarker genes was addressed using recent studies, and biological functions enriched by essential genes were examined. Our findings imply that genes including UBE2C, PCLAF, CDK1, CCNB1, MND1, APOBEC3G, TRAF3IP3, CD48, and GZMA play key roles in defining the different states and severity of COVID-19. Thus, a new point of reference is provided for understanding the disease’s etiology and facilitating a precise therapy.
(This article belongs to the Special Issue Algorithmic Themes in Bioinformatics and Computational Biology)
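
The incremental feature selection step described in the abstract above can be sketched roughly as follows. This is an illustrative sketch only: it assumes the ranked feature list is already available (Boruta and mRMR are not implemented here), and the nearest-centroid classifier with leave-one-out evaluation is a stand-in chosen for brevity, since the abstract does not say which classifiers or validation scheme the authors used. All function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def loo_centroid_accuracy(
    X: List[Sequence[float]], y: List[int], cols: List[int]
) -> float:
    """Leave-one-out accuracy of a nearest-centroid classifier on `cols`."""
    correct = 0
    for i in range(len(X)):
        # Accumulate per-class sums over every sample except sample i.
        sums: Dict[int, List[float]] = {}
        counts: Dict[int, int] = {}
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue
            acc = sums.setdefault(label, [0.0] * len(cols))
            for pos, c in enumerate(cols):
                acc[pos] += row[c]
            counts[label] = counts.get(label, 0) + 1

        def sq_dist(label: int) -> float:
            # Squared Euclidean distance from sample i to the class centroid.
            return sum(
                (X[i][c] - sums[label][pos] / counts[label]) ** 2
                for pos, c in enumerate(cols)
            )

        predicted = min(sums, key=sq_dist)
        correct += predicted == y[i]
    return correct / len(X)


def incremental_feature_selection(
    X: List[Sequence[float]], y: List[int], ranked: List[int]
) -> Tuple[List[int], float]:
    """Evaluate the top-k ranked features for k = 1..N and keep the best k."""
    best_cols: List[int] = []
    best_acc = -1.0
    for k in range(1, len(ranked) + 1):
        cols = ranked[:k]
        acc = loo_centroid_accuracy(X, y, cols)
        if acc > best_acc:  # strict comparison: ties keep the smaller subset
            best_cols, best_acc = cols, acc
    return best_cols, best_acc
```

The key design point is that the feature ranking is computed once, and the classifier is then re-evaluated on nested prefixes of that ranking, so the cost grows linearly with the number of ranked features rather than exponentially with the number of possible subsets.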

14 pages, 1438 KiB  
Article
Blood Transcript Biomarkers Selected by Machine Learning Algorithm Classify Neurodegenerative Diseases including Alzheimer’s Disease
by Carol J. Huseby, Elaine Delvaux, Danielle L. Brokaw and Paul D. Coleman
Biomolecules 2022, 12(11), 1592; https://doi.org/10.3390/biom12111592 - 29 Oct 2022
Cited by 6 | Viewed by 1964
Abstract
The clinical diagnosis of neurodegenerative diseases is notoriously inaccurate, and current methods are often expensive, time-consuming, or invasive. Simple, inexpensive, and noninvasive methods of diagnosis could provide valuable support for clinicians when combined with cognitive assessment scores. Biological processes leading to neuropathology progress silently for years and are reflected in both the central nervous system and vascular peripheral system. A blood-based screen to distinguish and classify neurodegenerative diseases is especially attractive because of its low cost, minimal invasiveness, and accessibility to almost any clinic worldwide. In this study, we set out to discover a small set of blood transcripts that can be used to distinguish healthy individuals from those with Alzheimer’s disease, Parkinson’s disease, Huntington’s disease, amyotrophic lateral sclerosis, Friedreich’s ataxia, or frontotemporal dementia. Using existing public datasets, we developed a machine learning algorithm for application on transcripts present in blood and discovered small sets of transcripts that distinguish a number of neurodegenerative diseases with high sensitivity and specificity. We validated the usefulness of blood RNA transcriptomics for the classification of neurodegenerative diseases. Information about features selected for the classification can direct the development of possible treatment strategies.
(This article belongs to the Special Issue Algorithmic Themes in Bioinformatics and Computational Biology)

16 pages, 22609 KiB  
Article
Construction of Multiple Logic Circuits Based on Allosteric DNAzymes
by Xin Liu, Qiang Zhang, Xun Zhang, Yuan Liu, Yao Yao and Nikola Kasabov
Biomolecules 2022, 12(4), 495; https://doi.org/10.3390/biom12040495 - 24 Mar 2022
Cited by 1 | Viewed by 2296
Abstract
In DNA computing, the implementation of complex and stable logic operations in a universal system is a critical challenge. It is necessary to develop a system with complex logic functions based on a simple mechanism. Here, a strategy that controls the secondary structure of the assembled DNAzymes’ conserved domain is adopted to regulate the activity of DNAzymes and avoid the generation of four-way junctions, which makes it possible to implement basic logic gates and their cascade circuits in the same system. In addition, threshold control achieved through the allosteric secondary structure is used to implement a three-input DNA voter with a one-vote veto function. The scalability of the system can be remarkably improved by adjusting the threshold to implement a DNA voter with 2n + 1 inputs. The proposed strategy provides a feasible idea for constructing more complex DNA circuits and a highly integrated computing system.
(This article belongs to the Special Issue Algorithmic Themes in Bioinformatics and Computational Biology)
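
The logic of a voter with a veto, as mentioned in the abstract above, can be illustrated in software. The abstract does not spell out the veto semantics, so this sketch assumes one plausible reading: a single designated input can force rejection on its own, and otherwise a simple majority of the 2n + 1 inputs decides. The function name and the `veto_index` parameter are hypothetical, and the sketch models only the Boolean behavior, not the DNA implementation.

```python
from typing import Sequence


def majority_vote_with_veto(votes: Sequence[bool], veto_index: int = 0) -> bool:
    """Return True if the motion passes.

    Assumed semantics: the input at `veto_index` holds veto power, so a
    False vote from it rejects the motion regardless of the other votes.
    Otherwise the motion passes when more than half of all votes are True.
    """
    if len(votes) % 2 == 0:
        raise ValueError("expected an odd number of inputs (2n + 1)")
    if not votes[veto_index]:
        return False  # veto exercised
    return sum(votes) > len(votes) // 2
```

With three inputs, `majority_vote_with_veto([False, True, True])` is rejected even though two of three inputs agree, because the veto holder voted against.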

16 pages, 3866 KiB  
Article
AutoCellANLS: An Automated Analysis System for Mycobacteria-Infected Cells Based on Unstained Micrograph
by Yan Zhuang, Xinzhuo Zhao, Zhongbing Huang, Lin Han, Ke Chen and Jiangli Lin
Biomolecules 2022, 12(2), 240; https://doi.org/10.3390/biom12020240 - 01 Feb 2022
Viewed by 1690
Abstract
The detection of Mycobacterium tuberculosis (Mtb) infection plays an important role in the control of tuberculosis (TB), one of the leading infectious diseases in the world. Recent advances in artificial intelligence-aided cellular image processing and analytical techniques have shown great promise in automated Mtb detection. However, current cell imaging protocols often involve costly and time-consuming fluorescence staining, which has become a major bottleneck for procedural automation. To solve this problem, we have developed a novel automated system (AutoCellANLS) for cell detection and the recognition of morphological features in phase-contrast micrographs by using unsupervised machine learning (UML) approaches and deep convolutional neural networks (CNNs). The detection algorithm can adaptively and automatically detect single cells in the cell population by the improved level set segmentation model with the circular Hough transform (CHT). In addition, we have designed a Cell-net by using transfer learning strategies (TLS) to classify the virulence-specific cellular morphological changes that would otherwise be indistinguishable to the naked eye. The novel system can simultaneously classify and segment microscopic images of the cell populations and achieves an average accuracy of 95.13% for cell detection and 95.94% for morphological classification, with a sensitivity of 94.87% and a specificity of 96.61%. AutoCellANLS is able to detect significant morphological differences between the infected and uninfected mammalian cells throughout the infection period (2 hpi/12 hpi/24 hpi). Moreover, it has overcome the drawback of manual intervention and increased the accuracy by more than 11% compared to our previous work, which used AI-aided imaging analysis to detect mycobacterial infection in macrophages. AutoCellANLS is also efficient and versatile when tailored to datasets from different cell lines (RAW264.7 and THP-1 cells). This proof-of-concept study provides a novel avenue to investigate bacterial pathogenesis at a macroscopic level and offers great promise in the diagnosis of bacterial infections.
(This article belongs to the Special Issue Algorithmic Themes in Bioinformatics and Computational Biology)

13 pages, 1236 KiB  
Article
RFLMDA: A Novel Reinforcement Learning-Based Computational Model for Human MicroRNA-Disease Association Prediction
by Linqian Cui, You Lu, Jiacheng Sun, Qiming Fu, Xiao Xu, Hongjie Wu and Jianping Chen
Biomolecules 2021, 11(12), 1835; https://doi.org/10.3390/biom11121835 - 05 Dec 2021
Cited by 4 | Viewed by 2304
Abstract
Numerous studies have confirmed that microRNAs play a crucial role in the research of complex human diseases. Identifying the relationship between miRNAs and diseases is important for improving the treatment of complex diseases. However, traditional biological experiments are not without restrictions. Computational methods are therefore urgently needed to predict unknown miRNA-disease associations. In this work, we propose the RFLMDA model, in which three submodels (CMF, NRLMF, and LapRLS) are fused via the Q-learning algorithm of reinforcement learning to obtain the optimal weight S. The performance of RFLMDA was evaluated through five-fold cross-validation and local validation. As a result, the optimal weight is obtained as S (0.1735, 0.2913, 0.5352), and the AUC is 0.9416. Comparison experiments with other methods show that the RFLMDA model performs better. To further validate the predictive performance of RFLMDA, we use eight diseases for local verification and carry out case studies on three common human diseases. Consequently, all the top 50 miRNAs related to Colorectal Neoplasms and Breast Neoplasms have been confirmed. Among the top 50 miRNAs related to Colon Neoplasms, Gastric Neoplasms, Pancreatic Neoplasms, Kidney Neoplasms, Esophageal Neoplasms, and Lymphoma, we confirm 47, 41, 49, 46, 46, and 48 miRNAs, respectively.
(This article belongs to the Special Issue Algorithmic Themes in Bioinformatics and Computational Biology)
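
To make the fusion step in the abstract above concrete, the following sketch combines three submodel score matrices with the reported weight S. Two things are assumed rather than stated in the abstract: that the fusion is a plain weighted sum, and that the three weights correspond to CMF, NRLMF, and LapRLS in that order. The Q-learning search that produces S is not shown, and the function name is hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]

# Weight S as reported in the abstract; the mapping to
# (CMF, NRLMF, LapRLS) in this order is an assumption.
WEIGHTS = (0.1735, 0.2913, 0.5352)


def fuse_scores(
    score_matrices: Sequence[Matrix], weights: Sequence[float] = WEIGHTS
) -> Matrix:
    """Weighted element-wise sum of same-shaped miRNA-by-disease score matrices."""
    if len(score_matrices) != len(weights):
        raise ValueError("need exactly one weight per submodel")
    n_rows = len(score_matrices[0])
    n_cols = len(score_matrices[0][0])
    return [
        [
            sum(w * m[i][j] for w, m in zip(weights, score_matrices))
            for j in range(n_cols)
        ]
        for i in range(n_rows)
    ]
```

Because the three reported weights sum to 1, the fused score stays on the same scale as the submodel scores when those are themselves bounded in [0, 1].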

16 pages, 3966 KiB  
Article
DNA Matrix Operation Based on the Mechanism of the DNAzyme Binding to Auxiliary Strands to Cleave the Substrate
by Shaoxia Xu, Yuan Liu, Shihua Zhou, Qiang Zhang and Nikola K. Kasabov
Biomolecules 2021, 11(12), 1797; https://doi.org/10.3390/biom11121797 - 30 Nov 2021
Cited by 3 | Viewed by 1751
Abstract
Numerical computation is a focus of DNA computing, and matrix operations are among the most basic and frequently used operations in numerical computation. As an important computing tool, matrix operations are often used to deal with intensive computing tasks. During calculation, the speed and accuracy of matrix operations directly affect the performance of the entire computing system. Therefore, it is important to find a way to perform matrix calculations that can ensure the speed of calculations and improve the accuracy. This paper proposes a DNA matrix operation method based on the mechanism of the DNAzyme binding to auxiliary strands to cleave the substrate. In this mechanism, the DNAzyme binding substrate requires the connection of two auxiliary strands. If either of the two auxiliary strands is absent, the DNAzyme does not cleave the substrate. Based on this mechanism, the multiplication operation of two matrices is realized; the two types of auxiliary strands are used as elements of the two matrices to participate in the operation and are then combined with the DNAzyme to cut the substrate and output the result of the matrix operation. This research provides a new method of matrix operations and provides ideas for more complex computing systems.
(This article belongs to the Special Issue Algorithmic Themes in Bioinformatics and Computational Biology)
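
The mechanism in the abstract above behaves like a logical AND: the substrate is cleaved only when both auxiliary strands are present. Under the assumption (not stated in the abstract) that matrix entries are binary, with 1 meaning "strand present", the resulting operation can be modeled in software as a Boolean matrix product. The function name is hypothetical, and the sketch says nothing about how the paper encodes or reads out results.

```python
from typing import List

BoolMatrix = List[List[int]]


def boolean_matmul(a: BoolMatrix, b: BoolMatrix) -> BoolMatrix:
    """Boolean matrix product: c[i][j] = OR over k of (a[i][k] AND b[k][j]).

    Each AND models the requirement that both auxiliary strands be present
    before cleavage; the OR models "any cleavage yields an output signal".
    """
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    cols = len(b[0])
    return [
        [int(any(a[i][k] and b[k][j] for k in range(inner))) for j in range(cols)]
        for i in range(len(a))
    ]
```

For example, multiplying `[[1, 0], [1, 1]]` by `[[0, 1], [1, 0]]` yields `[[0, 1], [1, 1]]`.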

Review

13 pages, 616 KiB  
Review
Developments in Algorithms for Sequence Alignment: A Review
by Jiannan Chao, Furong Tang and Lei Xu
Biomolecules 2022, 12(4), 546; https://doi.org/10.3390/biom12040546 - 06 Apr 2022
Cited by 10 | Viewed by 4653
Abstract
The continuous development of sequencing technologies has enabled researchers to obtain large amounts of biological sequence data, and this has resulted in increasing demands for software that can perform sequence alignment quickly and accurately. A number of algorithms and tools for sequence alignment have been designed to meet the various needs of biologists. Here, the ideas that prevail in the research of sequence alignment and some quality estimation methods for multiple sequence alignment tools are summarized.
(This article belongs to the Special Issue Algorithmic Themes in Bioinformatics and Computational Biology)
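
The abstract above does not name specific algorithms. As one classic example of the dynamic-programming idea behind pairwise alignment, the following sketch computes the Needleman-Wunsch global alignment score with a linear gap penalty. The scoring values (match +1, mismatch -1, gap -1) are illustrative assumptions, and the function returns only the optimal score, not the alignment itself.

```python
def needleman_wunsch_score(
    a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1
) -> int:
    """Optimal global alignment score of `a` and `b` with a linear gap penalty."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]
```

Keeping only two rows of the table reduces memory from O(len(a) × len(b)) to O(len(b)); recovering the actual alignment would require either the full table or a divide-and-conquer scheme.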
