Biomolecules

Research

19 pages, 14813 KiB

Open Access Article

A Novel Capsule Network with Attention Routing to Identify Prokaryote Phosphorylation Sites

by Shixian Wang, Lina Zhang, Runtao Yang and Yujiao Zhao

Biomolecules 2022, 12(12), 1854; https://doi.org/10.3390/biom12121854 - 12 Dec 2022

Cited by 2 | Viewed by 1200

Abstract

By denaturing proteins and promoting the formation of multiprotein complexes, protein phosphorylation has important effects on the activity of protein functional molecules and cell signaling. The regulation of protein phosphorylation allows microbes to respond rapidly and reversibly to specific environmental stimuli or niches, which is closely related to the molecular mechanisms of bacterial drug resistance. Accurate prediction of phosphorylation sites (p-sites) in prokaryotes can help address bacterial resistance and provide new perspectives for developing novel antibacterial drugs. Most existing studies focus on human phosphorylation sites, while tools for identifying phosphorylation sites in prokaryotic proteins remain relatively scarce. This study designs a capsule network-based technique, EcapsP, for predicting p-sites in prokaryotes. To address the poor scalability and unreliability of dynamic routing in the output space of capsule networks, a more reliable way of learning the consistency between capsules is introduced. We incorporate a self-attention mechanism into the routing algorithm to capture the global information of the capsules, reducing the computational effort while enriching their representation capability. To address the weak robustness of the model, EcapsP improves prediction accuracy and stability by introducing shortcuts and unconditional reconfiguration. In addition, the study compares and analyzes prediction performance based on word vectors, physicochemical properties, and mixed features in predicting serine (Ser/S), threonine (Thr/T), and tyrosine (Tyr/Y) p-sites. The comprehensive experimental results show that the accuracy of the developed technique is close to 70% for the identification of the three types of phosphorylation sites in prokaryotes. Importantly, in side-by-side comparisons with other state-of-the-art predictors, our method improves the Matthews correlation coefficient (MCC) by approximately 7%. The results demonstrate the superiority of EcapsP in terms of performance and reliability.
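For readers unfamiliar with the two metrics quoted in this abstract, the following minimal Python sketch shows how accuracy and the Matthews correlation coefficient are computed from confusion-matrix counts. The function names and the counts are hypothetical illustrations, not values or code from the paper.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a balanced test set (not taken from the paper).
counts = dict(tp=70, tn=70, fp=30, fn=30)
print(accuracy(**counts))
print(matthews_corrcoef(**counts))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often preferred when positive and negative sites are imbalanced.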

(This article belongs to the Special Issue Algorithmic Themes in Bioinformatics and Computational Biology)

15 pages, 1938 KiB

Open Access Article

Identification of Transcriptome Biomarkers for Severe COVID-19 with Machine Learning Methods

by Xiaohong Li, Xianchao Zhou, Shijian Ding, Lei Chen, Kaiyan Feng, Hao Li, Tao Huang and Yu-Dong Cai

Biomolecules 2022, 12(12), 1735; https://doi.org/10.3390/biom12121735 - 23 Nov 2022

Cited by 3 | Viewed by 1796

Abstract

The rapid spread of COVID-19 has become a major concern for people’s lives and health all around the world. COVID-19 patients at various stages and severities require individualized treatment, given that different patients may develop different symptoms. In this study, we employed machine learning methods to discover biomarkers that may accurately classify COVID-19 in various disease states and severities. The blood gene expression profiles from 50 COVID-19 patients without intensive care, 50 COVID-19 patients with intensive care, 10 non-COVID-19 individuals without intensive care, and 16 non-COVID-19 individuals with intensive care were analyzed. Boruta was first used to remove irrelevant gene features from the expression profiles, and then the minimum redundancy maximum relevance method was applied to rank the remaining features. The resulting ranked feature list was fed into the incremental feature selection method to discover the essential genes and build powerful classifiers. The molecular mechanisms of some biomarker genes were examined in light of recent studies, and the biological functions enriched by the essential genes were analyzed. Our findings imply that genes including UBE2C, PCLAF, CDK1, CCNB1, MND1, APOBEC3G, TRAF3IP3, CD48, and GZMA play key roles in defining the different states and severity of COVID-19. Thus, a new point of reference is provided for understanding the disease’s etiology and facilitating precise therapy.
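The incremental feature selection step described above can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes the ranked feature list is already available (the Boruta and mRMR steps are not reproduced), and it uses a leave-one-out nearest-centroid classifier as a stand-in, since the abstract does not name the classifiers. All function names and the toy data are hypothetical.

```python
def nearest_centroid_loo_accuracy(rows, labels, feature_idx):
    """Leave-one-out accuracy of a nearest-centroid classifier on the chosen features."""
    correct = 0
    for i in range(len(rows)):
        # Build per-class sums and counts from every sample except sample i.
        sums, counts = {}, {}
        for j, (row, label) in enumerate(zip(rows, labels)):
            if j == i:
                continue
            acc = sums.setdefault(label, [0.0] * len(feature_idx))
            for d, f in enumerate(feature_idx):
                acc[d] += row[f]
            counts[label] = counts.get(label, 0) + 1
        query = [rows[i][f] for f in feature_idx]

        def squared_distance(label):
            return sum((s / counts[label] - q) ** 2 for s, q in zip(sums[label], query))

        predicted = min(sums, key=squared_distance)
        correct += predicted == labels[i]
    return correct / len(rows)


def incremental_feature_selection(rows, labels, ranked_features, score_fn):
    """Evaluate the top-k ranked features for k = 1..N and keep the best-scoring prefix."""
    best_k, best_score = 0, float("-inf")
    for k in range(1, len(ranked_features) + 1):
        score = score_fn(rows, labels, ranked_features[:k])
        if score > best_score:  # strict comparison prefers the smaller feature set on ties
            best_k, best_score = k, score
    return ranked_features[:best_k], best_score


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
rows = [[0.0, 5.0], [0.1, -5.0], [0.2, 4.0], [1.0, -4.0], [1.1, 5.0], [0.9, -5.0]]
labels = ["a", "a", "a", "b", "b", "b"]
selected, score = incremental_feature_selection(
    rows, labels, [0, 1], nearest_centroid_loo_accuracy
)
print(selected, score)
```

Keeping the smallest prefix on ties is one possible design choice; it favors compact biomarker sets.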

(This article belongs to the Special Issue Algorithmic Themes in Bioinformatics and Computational Biology)

14 pages, 1438 KiB

Open Access Article

Blood Transcript Biomarkers Selected by Machine Learning Algorithm Classify Neurodegenerative Diseases including Alzheimer’s Disease

by Carol J. Huseby, Elaine Delvaux, Danielle L. Brokaw and Paul D. Coleman

Biomolecules 2022, 12(11), 1592; https://doi.org/10.3390/biom12111592 - 29 Oct 2022

Cited by 6 | Viewed by 1964

Abstract

The clinical diagnosis of neurodegenerative diseases is notoriously inaccurate, and current methods are often expensive, time-consuming, or invasive. Simple, inexpensive, and noninvasive methods of diagnosis could provide valuable support for clinicians when combined with cognitive assessment scores. Biological processes leading to neuropathology progress silently for years and are reflected in both the central nervous system and the vascular peripheral system. A blood-based screen to distinguish and classify neurodegenerative diseases is especially attractive because of its low cost, minimal invasiveness, and accessibility to almost any clinic in the world. In this study, we set out to discover a small set of blood transcripts that can be used to distinguish healthy individuals from those with Alzheimer’s disease, Parkinson’s disease, Huntington’s disease, amyotrophic lateral sclerosis, Friedreich’s ataxia, or frontotemporal dementia. Using existing public datasets, we developed a machine learning algorithm for application to transcripts present in blood and discovered small sets of transcripts that distinguish a number of neurodegenerative diseases with high sensitivity and specificity. We validated the usefulness of blood RNA transcriptomics for the classification of neurodegenerative diseases. Information about the features selected for classification can direct the development of possible treatment strategies.

(This article belongs to the Special Issue Algorithmic Themes in Bioinformatics and Computational Biology)

16 pages, 22609 KiB

Open Access Article

Construction of Multiple Logic Circuits Based on Allosteric DNAzymes

by Xin Liu, Qiang Zhang, Xun Zhang, Yuan Liu, Yao Yao and Nikola Kasabov

Biomolecules 2022, 12(4), 495; https://doi.org/10.3390/biom12040495 - 24 Mar 2022

Cited by 1 | Viewed by 2296

Abstract

In DNA computing, the implementation of complex and stable logic operations in a universal system is a critical challenge. It is necessary to develop a system with complex logic functions based on a simple mechanism. Here, a strategy of controlling the secondary structure of the conserved domain of assembled DNAzymes is adopted to regulate DNAzyme activity and avoid the generation of four-way junctions, which makes it possible to implement basic logic gates and their cascade circuits in the same system. In addition, threshold control achieved through the allosteric secondary structure is used to implement a three-input DNA voter with a one-vote veto function. The scalability of the system can be remarkably improved by adjusting the threshold to implement a DNA voter with 2n + 1 inputs. The proposed strategy provides a feasible approach to constructing more complex DNA circuits and a highly integrated computing system.
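The logic of a threshold voter with a veto can be illustrated in software. The sketch below models only the Boolean behavior, not the DNA chemistry, and it assumes that "one-vote veto" means a single designated input can force a negative outcome regardless of the other votes. The function name and the `veto_index` parameter are hypothetical.

```python
def majority_voter(votes, veto_index=None):
    """Threshold voter over an odd number (2n + 1) of binary inputs.

    If veto_index is given, a 0 on that input forces the output to False
    (an assumed reading of the one-vote veto function).
    """
    if len(votes) % 2 == 0:
        raise ValueError("expected an odd number of inputs (2n + 1)")
    if veto_index is not None and not votes[veto_index]:
        return False
    threshold = len(votes) // 2 + 1  # n + 1 votes out of 2n + 1
    return sum(1 for v in votes if v) >= threshold


print(majority_voter([1, 1, 0]))                # plain three-input majority
print(majority_voter([0, 1, 1], veto_index=0))  # input 0 vetoes the majority
print(majority_voter([1, 1, 1, 0, 0]))          # five inputs, threshold of three
```

Scaling from 3 to 2n + 1 inputs changes only the threshold, which mirrors the scalability argument in the abstract.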

(This article belongs to the Special Issue Algorithmic Themes in Bioinformatics and Computational Biology)

16 pages, 3866 KiB

Open Access Article

AutoCellANLS: An Automated Analysis System for Mycobacteria-Infected Cells Based on Unstained Micrograph

by Yan Zhuang, Xinzhuo Zhao, Zhongbing Huang, Lin Han, Ke Chen and Jiangli Lin

Biomolecules 2022, 12(2), 240; https://doi.org/10.3390/biom12020240 - 01 Feb 2022

Viewed by 1690

Abstract

The detection of Mycobacterium tuberculosis (Mtb) infection plays an important role in the control of tuberculosis (TB), one of the leading infectious diseases in the world. Recent advances in artificial intelligence-aided cellular image processing and analytical techniques have shown great promise in automated Mtb detection. However, current cell imaging protocols often involve costly and time-consuming fluorescence staining, which has become a major bottleneck for procedural automation. To solve this problem, we have developed a novel automated system (AutoCellANLS) for cell detection and the recognition of morphological features in phase-contrast micrographs by using unsupervised machine learning (UML) approaches and deep convolutional neural networks (CNNs). The detection algorithm adaptively and automatically detects single cells in the cell population using an improved level set segmentation model with the circular Hough transform (CHT). In addition, we have designed a Cell-net using transfer learning strategies (TLS) to classify virulence-specific cellular morphological changes that would otherwise be indistinguishable to the naked eye. The system can simultaneously classify and segment microscopic images of cell populations, achieving an average accuracy of 95.13% for cell detection and 95.94% for morphological classification, with a sensitivity of 94.87% and a specificity of 96.61%. AutoCellANLS is able to detect significant morphological differences between infected and uninfected mammalian cells throughout the infection period (2 hpi/12 hpi/24 hpi). It also removes the need for manual intervention and increases accuracy by more than 11% compared to our previous work, which used AI-aided imaging analysis to detect mycobacterial infection in macrophages. AutoCellANLS is also efficient and versatile when tailored to datasets from different cell lines (RAW264.7 and THP-1 cells). This proof-of-concept study provides a novel avenue for investigating bacterial pathogenesis at a macroscopic level and offers great promise for the diagnosis of bacterial infections.

(This article belongs to the Special Issue Algorithmic Themes in Bioinformatics and Computational Biology)

13 pages, 1236 KiB

Open Access Article

RFLMDA: A Novel Reinforcement Learning-Based Computational Model for Human MicroRNA-Disease Association Prediction

by Linqian Cui, You Lu, Jiacheng Sun, Qiming Fu, Xiao Xu, Hongjie Wu and Jianping Chen

Biomolecules 2021, 11(12), 1835; https://doi.org/10.3390/biom11121835 - 05 Dec 2021

Cited by 4 | Viewed by 2304

Abstract

Numerous studies have confirmed that microRNAs (miRNAs) play a crucial role in complex human diseases. Identifying the relationships between miRNAs and diseases is important for improving the treatment of complex diseases. However, traditional biological experiments have their limitations, so computational methods for predicting unknown miRNA-disease associations are urgently needed. In this work, we use the Q-learning algorithm of reinforcement learning to propose the RFLMDA model, in which three submodels (CMF, NRLMF, and LapRLS) are fused via Q-learning to obtain the optimal weight S. The performance of RFLMDA was evaluated through five-fold cross-validation and local validation. The optimal weight obtained is S = (0.1735, 0.2913, 0.5352), and the AUC is 0.9416. Comparison with other methods shows that RFLMDA performs better. To further validate the predictive performance of RFLMDA, we use eight diseases for local verification and carry out case studies on three common human diseases. All of the top 50 miRNAs related to Colorectal Neoplasms and Breast Neoplasms have been confirmed. Among the top 50 miRNAs related to Colon Neoplasms, Gastric Neoplasms, Pancreatic Neoplasms, Kidney Neoplasms, Esophageal Neoplasms, and Lymphoma, we confirm 47, 41, 49, 46, 46, and 48 miRNAs, respectively.
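To make the role of the weight S concrete, the sketch below fuses three score matrices as an element-wise weighted sum. It rests on two assumptions that the abstract does not confirm: that the fusion is linear, and that the three reported weights apply to CMF, NRLMF, and LapRLS in that order. The Q-learning search for S is not reproduced, and the score matrices are hypothetical toy values.

```python
# Reported optimal weight S; the mapping to (CMF, NRLMF, LapRLS) is assumed.
WEIGHTS = (0.1735, 0.2913, 0.5352)


def fuse_scores(cmf, nrlmf, laprls, weights=WEIGHTS):
    """Element-wise weighted sum of three same-shaped miRNA-disease score matrices."""
    return [
        [weights[0] * a + weights[1] * b + weights[2] * c
         for a, b, c in zip(row_a, row_b, row_c)]
        for row_a, row_b, row_c in zip(cmf, nrlmf, laprls)
    ]


# Hypothetical 2 x 2 association scores from each submodel.
cmf = [[0.9, 0.1], [0.2, 0.8]]
nrlmf = [[0.8, 0.2], [0.1, 0.9]]
laprls = [[1.0, 0.0], [0.0, 1.0]]
fused = fuse_scores(cmf, nrlmf, laprls)
print(fused)
```

Because the three weights sum to 1, the fused scores stay in the same range as the submodel scores.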

(This article belongs to the Special Issue Algorithmic Themes in Bioinformatics and Computational Biology)

16 pages, 3966 KiB

Open Access Article

DNA Matrix Operation Based on the Mechanism of the DNAzyme Binding to Auxiliary Strands to Cleave the Substrate

by Shaoxia Xu, Yuan Liu, Shihua Zhou, Qiang Zhang and Nikola K. Kasabov

Biomolecules 2021, 11(12), 1797; https://doi.org/10.3390/biom11121797 - 30 Nov 2021

Cited by 3 | Viewed by 1751

Abstract

Numerical computation is a focus of DNA computing, and matrix operations are among the most basic and frequently used operations in numerical computation. As an important computing tool, matrix operations are often used to deal with intensive computing tasks. During calculation, the speed and accuracy of matrix operations directly affect the performance of the entire computing system. Therefore, it is important to find a way of performing matrix calculations that ensures speed and improves accuracy. This paper proposes a DNA matrix operation method based on the mechanism of the DNAzyme binding to auxiliary strands to cleave the substrate. In this mechanism, the DNAzyme requires two auxiliary strands to bind the substrate. If either of the two auxiliary strands is missing, the DNAzyme does not cleave the substrate. Based on this mechanism, the multiplication of two matrices is realized: the two types of auxiliary strands serve as the elements of the two matrices and combine with the DNAzyme to cleave the substrate and output the result of the matrix operation. This research provides a new method for matrix operations and offers ideas for more complex computing systems.
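The requirement that both auxiliary strands be present before cleavage behaves like a logical AND, which is the elementary step of multiplying 0/1 matrices. The sketch below is a software analogy only, under the assumption that matrix elements are binary (strand present or absent) and that each output entry counts the cleavage events for one row-column pair. The abstract does not specify the encoding, and the function name is hypothetical.

```python
def cleavage_count_product(a, b):
    """Multiply two 0/1 matrices, treating each term a[i][k] * b[k][j] as an AND.

    A term contributes 1 only when both "auxiliary strands" are present,
    mirroring the condition under which the DNAzyme cleaves the substrate.
    """
    if any(len(row) != len(b) for row in a):
        raise ValueError("inner dimensions must match")
    cols = len(b[0])
    return [
        [sum(1 for k in range(len(b)) if row[k] and b[k][j]) for j in range(cols)]
        for row in a
    ]


a = [[1, 0], [1, 1]]
b = [[1, 1], [0, 1]]
print(cleavage_count_product(a, b))
```

For binary inputs this is the ordinary integer matrix product, so the result can be checked against any standard implementation.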

(This article belongs to the Special Issue Algorithmic Themes in Bioinformatics and Computational Biology)

Review

13 pages, 616 KiB

Open Access Review

Developments in Algorithms for Sequence Alignment: A Review

by Jiannan Chao, Furong Tang and Lei Xu

Biomolecules 2022, 12(4), 546; https://doi.org/10.3390/biom12040546 - 06 Apr 2022

Cited by 10 | Viewed by 4653

Abstract

The continuous development of sequencing technologies has enabled researchers to obtain large amounts of biological sequence data, which has increased the demand for software that can perform sequence alignment quickly and accurately. A number of algorithms and tools for sequence alignment have been designed to meet the various needs of biologists. Here, the prevailing ideas in sequence alignment research and some quality estimation methods for multiple sequence alignment tools are summarized.
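As a concrete instance of the kind of algorithm such a review covers, the sketch below implements Needleman-Wunsch global pairwise alignment, a classic dynamic-programming method. The abstract does not say which algorithms the review discusses, so the choice is illustrative, and the scoring values (match 1, mismatch -1, linear gap -1) are arbitrary assumptions.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of strings a and b; returns (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the table: each cell takes the best of diagonal, up, and left moves.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


result = needleman_wunsch("ACGT", "AGT")
print(result)
```

The table costs time and memory proportional to the product of the two sequence lengths, which is one reason faster heuristic tools are in demand for large datasets.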

(This article belongs to the Special Issue Algorithmic Themes in Bioinformatics and Computational Biology)
