Algorithms in Computational Biology

A special issue of Algorithms (ISSN 1999-4893). This special issue belongs to the section "Analysis of Algorithms and Complexity Theory".

Deadline for manuscript submissions: closed (31 March 2021) | Viewed by 19516

Special Issue Editors

Dr. Hélène Touzet

Guest Editor

CNRS, Lille, France
Interests: computational molecular biology; algorithms; genomics; metagenomics; high throughput sequencing

Dr. Aïda Ouangraoua

Guest Editor

Department of Computer Science, University of Sherbrooke, Sherbrooke, QC, Canada
Interests: computational molecular biology; RNA structure and function; evolution; algorithms; big data

Special Issue Information

Dear Colleagues,

The last decade has witnessed the generation of increasingly massive and complex -omics data by high-throughput technologies: genomes, transcriptomes, proteomes, metagenomes, and epigenomes, with important applications in the biological, environmental, and biomedical sciences. The sheer volume and heterogeneity of these data have taken computational biology into a big data era, with a shift from single-level analysis to large-scale multi-omics data integration. This has given rise to diverse problems in storing, processing, and annotating these data, which require powerful algorithmic techniques to be solved efficiently in practice. The aim of this Special Issue is to present state-of-the-art algorithmic innovations that address the computational bottlenecks of -omics data analysis. This includes a variety of methods, such as discrete algorithms on sequences, trees, and graphs; data structures and compressed data structures; parallel computing; combinatorial and sampling approaches; heuristics and parameterized algorithms; data mining; and machine learning techniques.

We invite you to submit high-quality papers to this Special Issue on “Algorithms in Computational Biology”, with subjects covering the whole range from theory to applications. Surveys are also welcome. The following is a (non-exhaustive) list of topics of interest:

Genomics and pangenomics
Transcriptomics
Metagenomics
Epigenomics
Proteomics and proteogenomics
Sequence comparison
Sequence assembly
Structural variants
RNA and protein structures
Structural and functional annotation
Evolution and comparative genomics
Biological networks

Dr. Hélène Touzet
Dr. Aïda Ouangraoua
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

Genomics and pangenomics
Transcriptomics
Metagenomics
Epigenomics
Proteomics and proteogenomics
Sequence comparison
Sequence assembly
Structural variants
RNA and protein structures
Structural and functional annotation
Evolution and comparative genomics
Biological networks

Published Papers (6 papers)

Research

23 pages, 1463 KiB

Open Access Article

Improved Duplication-Transfer-Loss Reconciliation with Extinct and Unsampled Lineages

by Samson Weiner and Mukul S. Bansal

Algorithms 2021, 14(8), 231; https://doi.org/10.3390/a14080231 - 5 Aug 2021

Cited by 4 | Viewed by 3152

Abstract

Duplication-Transfer-Loss (DTL) reconciliation is a widely used computational technique for understanding gene family evolution and inferring horizontal gene transfer (transfer for short) in microbes. However, most existing models and implementations of DTL reconciliation cannot account for the effect of unsampled or extinct species lineages on the evolution of gene families, likely affecting their accuracy. Accounting for the presence and possible impact of any unsampled species lineages, including those that are extinct, is especially important for inferring and studying horizontal transfer since many genes in the species lineages represented in the reconciliation analysis are likely to have been acquired through horizontal transfer from unsampled lineages. While models of DTL reconciliation that account for transfer from unsampled lineages have already been proposed, they use a relatively simple framework for transfer from unsampled lineages and cannot explicitly infer the location on the species tree of each unsampled or extinct lineage associated with an identified transfer event. Furthermore, there do not yet exist any systematic studies to assess the impact of accounting for unsampled lineages on the accuracy of DTL reconciliation. In this work, we address these deficiencies by (i) introducing an extended DTL reconciliation model, called the DTLx reconciliation model, that accounts for unsampled and extinct species lineages in a new, more functional manner compared to existing models, (ii) showing that optimal reconciliations under the new DTLx reconciliation model can be computed just as efficiently as under the fastest DTL reconciliation model, (iii) providing an efficient algorithm for sampling optimal DTLx reconciliations uniformly at random, (iv) performing the first systematic simulation study to assess the impact of accounting for unsampled lineages on the accuracy of DTL reconciliation, and (v) comparing the accuracies of inferring transfers from unsampled lineages under our new model and the only other previously proposed parsimony-based model for this problem.

(This article belongs to the Special Issue Algorithms in Computational Biology)

27 pages, 1377 KiB

Open Access Article

Guaranteed Diversity and Optimality in Cost Function Network Based Computational Protein Design Methods

by Manon Ruffini, Jelena Vucinic, Simon de Givry, George Katsirelos, Sophie Barbe and Thomas Schiex

Algorithms 2021, 14(6), 168; https://doi.org/10.3390/a14060168 - 28 May 2021

Cited by 10 | Viewed by 3575

Abstract

Proteins are the main active molecules of life. Although natural proteins play many roles, as enzymes or antibodies for example, there is a need to go beyond the repertoire of natural proteins to produce engineered proteins that precisely meet application requirements, in terms of function, stability, activity or other protein capacities. Computational Protein Design aims at designing new proteins from first principles, using full-atom molecular models. However, the size and complexity of proteins require approximations to make them amenable to energetic optimization queries. These approximations make the design process less reliable, and a provably optimal solution may fail. In practice, expensive libraries of solutions are therefore generated and tested. In this paper, we explore the idea of generating libraries of provably diverse low-energy solutions by extending cost function network algorithms with dedicated automaton-based diversity constraints on a large set of realistic full protein redesign problems. We observe that it is possible to generate provably diverse libraries in reasonable time and that the produced libraries do enhance the Native Sequence Recovery, a traditional measure of the reliability of design methods.

(This article belongs to the Special Issue Algorithms in Computational Biology)

13 pages, 1300 KiB

Open Access Article

Validation of Automated Chromosome Recovery in the Reconstruction of Ancestral Gene Order

by Qiaoji Xu, Lingling Jin, James H. Leebens-Mack and David Sankoff

Algorithms 2021, 14(6), 160; https://doi.org/10.3390/a14060160 - 21 May 2021

Cited by 6 | Viewed by 2298

Abstract

The RACCROCHE pipeline reconstructs ancestral gene orders and chromosomal contents of the ancestral genomes at all internal vertices of a phylogenetic tree. The strategy is to accumulate a very large number of generalized adjacencies, phylogenetically justified for each ancestor, to produce long ancestral contigs through maximum weight matching. It constructs chromosomes by counting the frequencies of ancestral contig co-occurrences on the extant genomes, clustering these for each ancestor and ordering them. The main objective of this paper is to closely simulate the evolutionary process giving rise to the gene content and order of a set of extant genomes (six distantly related monocots), and to assess to what extent an updated version of RACCROCHE can recover the artificial ancestral genome at the root of the phylogenetic tree relating the simulated genomes.

(This article belongs to the Special Issue Algorithms in Computational Biology)

25 pages, 562 KiB

Open Access Article

Disjoint Tree Mergers for Large-Scale Maximum Likelihood Tree Estimation

by Minhyuk Park, Paul Zaharias and Tandy Warnow

Algorithms 2021, 14(5), 148; https://doi.org/10.3390/a14050148 - 7 May 2021

Cited by 5 | Viewed by 3439

Abstract

The estimation of phylogenetic trees for individual genes or multi-locus datasets is a basic part of considerable biological research. In order to enable large trees to be computed, Disjoint Tree Mergers (DTMs) have been developed; these methods operate by dividing the input sequence dataset into disjoint sets, constructing trees on each subset, and then combining the subset trees (using auxiliary information) into a tree on the full dataset. DTMs have been used to advantage for multi-locus species tree estimation, enabling highly accurate species trees at reduced computational effort, compared to leading species tree estimation methods. Here, we evaluate the feasibility of using DTMs to improve the scalability of maximum likelihood (ML) gene tree estimation to large numbers of input sequences. Our study shows distinct differences between the three selected ML codes—RAxML-NG, IQ-TREE 2, and FastTree 2—and shows that good DTM pipeline design can provide advantages over these ML codes on large datasets.

(This article belongs to the Special Issue Algorithms in Computational Biology)

17 pages, 2551 KiB

Open Access Article

Multiple Loci Selection with Multi-Way Epistasis in Coalescence with Recombination

by Aritra Bose, Filippo Utro, Daniel E. Platt and Laxmi Parida

Algorithms 2021, 14(5), 136; https://doi.org/10.3390/a14050136 - 25 Apr 2021

Viewed by 2603

Abstract

As studies move into deeper characterization of the impact of selection through non-neutral mutations in whole genome population genetics, modeling for selection becomes crucial. Moreover, epistasis has long been recognized as a significant component in understanding the evolution of complex genetic systems. We present a backward coalescent model, EpiSimRA, that accommodates multiple loci selection, with multi-way (k-way) epistasis for any arbitrary k. Starting from arbitrary extant populations with epistatic sites, we trace the Ancestral Recombination Graph (ARG), sampling relevant recombination and coalescent events. Our framework allows for studying different complex evolutionary scenarios in the presence of selective sweeps, and positive and negative selection with multi-way epistasis. We also present a forward counterpart of the coalescent model based on a Wright-Fisher (WF) process, which we use as a validation framework, comparing the hallmarks of the ARG between the two. We provide the first framework that allows a nose-to-nose comparison of multi-way epistasis in a coalescent simulator with its forward counterpart with respect to the hallmarks of the ARG. We demonstrate, through extensive experiments, that EpiSimRA is consistently superior in terms of performance (seconds vs. hours) in comparison to the forward model without compromising on its accuracy.

(This article belongs to the Special Issue Algorithms in Computational Biology)

Review

23 pages, 677 KiB

Open Access Review

Predicting the Evolution of Syntenies—An Algorithmic Review

by Nadia El-Mabrouk

Algorithms 2021, 14(5), 152; https://doi.org/10.3390/a14050152 - 11 May 2021

Cited by 3 | Viewed by 2733

Abstract

Syntenies are genomic segments of consecutive genes identified by a certain conservation in gene content and order. The notion of conservation may vary from one definition to another, the more constrained ones requiring identical gene contents and gene orders, while more relaxed definitions just require a certain similarity in gene content, and not necessarily in the same order. Regardless of the way they are identified, the goal is to characterize homologous genomic regions, i.e., regions deriving from a common ancestral region, reflecting a certain gene co-evolution that can shed light on important functional properties. In addition to identifying them, it is also necessary to infer the evolutionary history that has led from the ancestral segment to the extant ones. In this field, most algorithmic studies address the problem of inferring rearrangement scenarios explaining the disruption in gene order between segments with the same gene content, some of them extending the evolutionary model to gene insertion and deletion. However, syntenies also evolve through other events modifying their gene content, such as duplications, losses or horizontal gene transfers, i.e., the movement of genes from one species to another. Although the reconciliation approach between a gene tree and a species tree addresses the problem of inferring such events for single-gene families, little effort has been dedicated to the generalization to segmental events and to syntenies. This paper reviews some of the main algorithmic methods for inferring ancestral syntenies and focuses on those integrating both gene orders and gene trees.

(This article belongs to the Special Issue Algorithms in Computational Biology)
