Algorithms in Bioinformatics

A special issue of Algorithms (ISSN 1999-4893). This special issue belongs to the section "Algorithms for Multidisciplinary Applications".

Deadline for manuscript submissions: closed (15 August 2020) | Viewed by 20455

Special Issue Editor


Dr. Christina Boucher
Guest Editor
Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL, USA
Interests: bioinformatics; metagenomics; antimicrobial resistance; algorithms; big data

Special Issue Information

Dear Colleagues,

We invite you to submit your latest research in bioinformatics and computational biology to the Special Issue entitled “Algorithms in Bioinformatics”. In the past several decades, there has been an explosion in the generation and distribution of biological data, including genomic, transcriptomic, proteomic, and bioimaging data. Analysis of public datasets has shown that the generation of sequence data has outpaced Moore’s law. Furthermore, in addition to an increase in the sheer size of the data, there has been significant growth in the variety of biological data types. Aside from short sequence reads, there are now long reads, ultralong reads, amplified sequence data, optical mapping data, and mass spectrometry data. The analysis of these datasets, alone or in concert with each other, requires the development of novel combinatorial algorithms, heuristics, machine learning paradigms, and data structures. The aim of this Special Issue is to present recent algorithmic and mathematical innovations in the area of bioinformatics and computational biology.

The topics include but are not limited to the following areas:

  • Genome and metagenome assembly;
  • Transcriptomics;
  • Proteomics and proteogenomics;
  • Biological network analysis;
  • Regulation and epigenomics;
  • Analysis of biological imaging data;
  • Data structures for analysis of biological data;
  • Evolution and comparative genomics;
  • Protein structure and function prediction.

Dr. Christina Boucher
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (7 papers)

Editorial

2 pages, 162 KiB  
Editorial
Special Issue: Algorithms in Bioinformatics
by Christina Boucher
Algorithms 2023, 16(1), 21; https://doi.org/10.3390/a16010021 - 30 Dec 2022
Cited by 1 | Viewed by 1203
Abstract
In the past decade, there has been an effort to sequence and compare a large number of individual genomes of a given species, resulting in a large number of (reference) genomes of various species being made publicly available [...]
(This article belongs to the Special Issue Algorithms in Bioinformatics)

Research

9 pages, 252 KiB  
Article
More Time-Space Tradeoffs for Finding a Shortest Unique Substring
by Hideo Bannai, Travis Gagie, Gary Hoppenworth, Simon J. Puglisi and Luís M. S. Russo
Algorithms 2020, 13(9), 234; https://doi.org/10.3390/a13090234 - 18 Sep 2020
Cited by 2 | Viewed by 2349
Abstract
We extend recent results regarding finding shortest unique substrings (SUSs) to obtain new time-space tradeoffs for this problem and the generalization of finding k-mismatch SUSs. Our new results include the first algorithm for finding a k-mismatch SUS in sublinear space, which we obtain by extending an algorithm by Senanayaka (2019) and combining it with a result on sketching by Gawrychowski and Starikovskaya (2019). We first describe how, given a text T of length n and m words of workspace, with high probability we can find an SUS of length L in $O(n (L/m) \log L)$ time using random access to T, or in $O(n (L/m) \log^2(L) \log\log\sigma)$ time using $O((L/m) \log^2 L)$ sequential passes over T. We then describe how, for constant k, with high probability, we can find a k-mismatch SUS in $O(n^{1+\epsilon} L/m)$ time using $O(n^{\epsilon} L/m)$ sequential passes over T, again using only m words of workspace. Finally, we also describe a deterministic algorithm that takes $O(n \tau \log\sigma \log n)$ time to find an SUS using $O(n/\tau)$ words of workspace, where $\tau$ is a parameter.
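As a point of reference for the problem itself, the following Python sketch finds a shortest unique substring by brute force. It is not one of the algorithms from the paper: it ignores the workspace bound m and the sequential-pass model entirely, and the function name is a hypothetical one chosen for illustration.

```python
from collections import Counter
from typing import Optional


def shortest_unique_substring(text: str) -> Optional[str]:
    """Return the leftmost shortest substring occurring exactly once in text.

    Brute-force illustration only (hypothetical helper, not the paper's method).
    Returns None for an empty text.
    """
    n = len(text)
    for length in range(1, n + 1):
        # Count every substring of this length, overlapping occurrences included.
        counts = Counter(text[i:i + length] for i in range(n - length + 1))
        for i in range(n - length + 1):
            if counts[text[i:i + length]] == 1:
                return text[i:i + length]
    return None
```

For example, `shortest_unique_substring("abab")` returns `"ba"`, because each single character and the substring `"ab"` occur twice.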
13 pages, 315 KiB  
Article
A Brain-Inspired Hyperdimensional Computing Approach for Classifying Massive DNA Methylation Data of Cancer
by Fabio Cumbo, Eleonora Cappelli and Emanuel Weitschek
Algorithms 2020, 13(9), 233; https://doi.org/10.3390/a13090233 - 17 Sep 2020
Cited by 7 | Viewed by 3496
Abstract
Recent advances in cancer genomics have put the spotlight on DNA methylation, a genetic modification that regulates the functioning of the genome and whose alterations play an important role in tumorigenesis and tumor suppression. Because of the high dimensionality and the enormous amount of genomic data produced by the latest advances in Next Generation Sequencing, it is very challenging to make effective use of DNA methylation data in diagnostic applications, e.g., in distinguishing healthy from diseased samples. Additionally, state-of-the-art techniques are neither fast enough to produce reliable results rapidly nor efficient in managing such massive amounts of data. For this reason, we propose HD-classifier, an in-memory cognitive-based hyperdimensional (HD) supervised machine learning algorithm for the classification of tumor vs. non-tumor samples through the analysis of their DNA methylation data. The approach takes inspiration from how the human brain is able to remember and distinguish simple and complex concepts, adopting hypervectors rather than single numerical values. As in the brain, this allows for encoding complex patterns, which makes the whole architecture robust to failures and mistakes, even with noisy data. We design and develop an algorithm and a software tool able to perform supervised classification with the HD approach. To validate our algorithm, we conduct experiments on three DNA methylation datasets of different types of cancer, namely Breast Invasive Carcinoma (BRCA), Kidney Renal Papillary Cell Carcinoma (KIRP), and Thyroid Carcinoma (THCA). We obtain outstanding results in terms of accuracy and computational time with a low amount of computational resources. Furthermore, we validate our approach by comparing it (i) to BIGBIOCL, a software tool based on Random Forest for classifying big omics datasets in distributed computing environments, (ii) to Support Vector Machine (SVM), and (iii) to Decision Tree state-of-the-art classification methods. Finally, we freely release both the datasets and the software on GitHub.
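The general idea of hyperdimensional classification can be sketched in a few lines of Python. This is a generic textbook-style illustration, not the HD-classifier tool described in the abstract: the class name `HDSketch`, the dimensionality, the two-level quantization, and the assumption that feature values lie in [0, 1] are all choices made here for illustration.

```python
import random

DIM = 2000  # hypervector dimensionality (assumed value for this sketch)


def random_hv(rng):
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def bind(a, b):
    """Bind two hypervectors by elementwise multiplication."""
    return [x * y for x, y in zip(a, b)]


def bundle(vectors):
    """Bundle hypervectors by elementwise majority vote (ties go to +1)."""
    return [1 if sum(column) >= 0 else -1 for column in zip(*vectors)]


class HDSketch:
    """Hypothetical minimal HD classifier for feature values assumed in [0, 1]."""

    def __init__(self, n_features, n_levels=2, seed=0):
        rng = random.Random(seed)
        self.n_levels = n_levels
        self.feature_hvs = [random_hv(rng) for _ in range(n_features)]
        self.level_hvs = [random_hv(rng) for _ in range(n_levels)]
        self.prototypes = {}

    def encode(self, sample):
        """Bind each feature to its quantized level, then bundle the results."""
        bound = []
        for feature_hv, value in zip(self.feature_hvs, sample):
            level = min(int(value * self.n_levels), self.n_levels - 1)
            bound.append(bind(feature_hv, self.level_hvs[level]))
        return bundle(bound)

    def fit(self, samples, labels):
        """Build one prototype hypervector per class by bundling its samples."""
        grouped = {}
        for sample, label in zip(samples, labels):
            grouped.setdefault(label, []).append(self.encode(sample))
        self.prototypes = {label: bundle(hvs) for label, hvs in grouped.items()}

    def predict(self, sample):
        """Return the label whose prototype has the largest dot product."""
        query = self.encode(sample)
        return max(
            self.prototypes,
            key=lambda label: sum(q * p for q, p in zip(query, self.prototypes[label])),
        )
```

A model trained on a handful of high-valued "tumor" samples and low-valued "normal" samples assigns a new high-valued sample to "tumor", because its encoding matches the tumor prototype while being nearly orthogonal to the normal one.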

18 pages, 384 KiB  
Article
A Linear-Time Algorithm for the Isometric Reconciliation of Unrooted Trees
by Broňa Brejová and Rastislav Královič
Algorithms 2020, 13(9), 225; https://doi.org/10.3390/a13090225 - 8 Sep 2020
Cited by 1 | Viewed by 2256
Abstract
In the reconciliation problem, we are given two phylogenetic trees. A species tree represents the evolutionary history of a group of species, and a gene tree represents the history of a family of related genes within these species. A reconciliation maps nodes of the gene tree to the corresponding points of the species tree, and thus helps to interpret the gene family history. In this paper, we study the case when both trees are unrooted and their edge lengths are known exactly. The goal is to root them and to find a reconciliation that agrees with the edge lengths. We show a linear-time algorithm for finding the set of all possible root locations, which is a significant improvement compared to the previous $O(N^3 \log N)$ algorithm.

18 pages, 404 KiB  
Article
A Survey on Shortest Unique Substring Queries
by Paniz Abedin, M. Oğuzhan Külekci and Shama V. Thankachan
Algorithms 2020, 13(9), 224; https://doi.org/10.3390/a13090224 - 6 Sep 2020
Cited by 4 | Viewed by 3217
Abstract
The shortest unique substring (SUS) problem is an active line of research in the field of string algorithms and has several applications in bioinformatics and information retrieval. The initial version of the problem was proposed by Pei et al. [ICDE’13]. Over the years, many variants and extensions have been pursued, which include positional-SUS, interval-SUS, approximate-SUS, palindromic-SUS, range-SUS, etc. In this article, we highlight some of the key results and summarize the recent developments in this area.
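To make one of these variants concrete, the Python sketch below answers a positional-SUS query by brute force: given a position p, it returns a shortest substring that occurs exactly once in the text and covers p. The function names and the 0-based inclusive interval convention are assumptions made for illustration, and the surveyed results are far more efficient than this exhaustive scan.

```python
from typing import Tuple


def count_occurrences(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, overlapping ones included."""
    return sum(
        text.startswith(pattern, i) for i in range(len(text) - len(pattern) + 1)
    )


def positional_sus(text: str, p: int) -> Tuple[int, int]:
    """Return (start, end), inclusive, of a shortest unique substring covering p.

    Brute-force illustration only (hypothetical helper). Among candidates of
    the same length, the leftmost one is returned.
    """
    n = len(text)
    if not 0 <= p < n:
        raise ValueError("p must be a valid index into text")
    for length in range(1, n + 1):
        first = max(0, p - length + 1)  # leftmost start that still covers p
        last = min(p, n - length)       # rightmost start that fits in text
        for start in range(first, last + 1):
            if count_occurrences(text, text[start:start + length]) == 1:
                return start, start + length - 1
    # The whole text always occurs exactly once, so the loop returns before here.
    raise AssertionError("unreachable")
```

For the text `"abab"`, the query at position 0 yields `(0, 2)` (the substring `"aba"`), while the query at position 2 yields `(1, 2)` (the substring `"ba"`).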
19 pages, 6802 KiB  
Article
Low-Power FPGA Implementation of Convolution Neural Network Accelerator for Pulse Waveform Classification
by Chuanglu Chen, Zhiqiang Li, Yitao Zhang, Shaolong Zhang, Jiena Hou and Haiying Zhang
Algorithms 2020, 13(9), 213; https://doi.org/10.3390/a13090213 - 31 Aug 2020
Cited by 7 | Viewed by 2987
Abstract
In pulse waveform classification, the convolution neural network (CNN) shows excellent performance. However, due to its numerous parameters and intensive computation, it is challenging to deploy a CNN model to low-power devices. To solve this problem, we implement a CNN accelerator based on a field-programmable gate array (FPGA), which can accurately and quickly infer the waveform category. By designing the structure of the CNN, we significantly reduce its parameters while maintaining high accuracy. The CNN is then implemented on the FPGA and optimized using a variety of memory access optimization methods. Experimental results show that our customized CNN has high accuracy and fewer parameters, and the accelerator consumes only 0.714 W at a working frequency of 100 MHz, which shows that our proposed solution is feasible. Furthermore, the accelerator classifies the pulse waveform in real time, which could help doctors make a diagnosis.

20 pages, 1750 KiB  
Article
Classical and Deep Learning Paradigms for Detection and Validation of Key Genes of Risky Outcomes of HCV
by Nagwan M. Abdel Samee
Algorithms 2020, 13(3), 73; https://doi.org/10.3390/a13030073 - 24 Mar 2020
Cited by 10 | Viewed by 3930
Abstract
Hepatitis C virus (HCV) is one of the most dangerous viruses worldwide. It is the foremost cause of hepatic cirrhosis and hepatocellular carcinoma (HCC). Detecting new key genes that play a role in the growth of HCC in HCV patients using machine learning techniques paves the way for producing accurate antivirals. This work has two phases: detecting the up/downregulated genes using classical univariate and multivariate feature selection methods, and validating the retrieved list of genes using in silico classifiers. However, classification algorithms in the medical domain frequently suffer from a shortage of training cases. Therefore, a deep neural network approach is proposed here to validate the significance of the retrieved genes in distinguishing the HCV-infected samples from the uninfected ones. The validation model is based on the artificial generation of new examples from the retrieved genes’ expressions using sparse autoencoders. Subsequently, the generated gene expression data are used to train conventional classifiers. In the first phase, Principal Component Analysis (PCA), a multivariate approach, yielded a better retrieval of significant genes: the list of genes retrieved using PCA contained more HCC biomarkers than the lists retrieved by the univariate methods. In the second phase, the classification accuracy can reveal the relevance of the extracted key genes in classifying the HCV-infected and uninfected samples.