Genomes and Evolution: Computational Approaches

A special issue of Computation (ISSN 2079-3197). This special issue belongs to the section "Computational Biology".

Deadline for manuscript submissions: closed (31 October 2014) | Viewed by 72210

Special Issue Editors


Prof. Dr. Rainer Breitling
Guest Editor
Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester M1 7DN, UK
Interests: computational systems biology; bioinformatics; metabolomics; dynamic modelling; synthetic biology

Dr. Marnix Medema
Guest Editor
Bioinformatics Group, Wageningen University, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands
Interests: computational biology; bioinformatics; natural products; microbiomes; synthetic biology

Special Issue Information

Dear Colleagues,

The computational analysis of gene and genome sequences has become a key methodology for understanding the function and evolution of biological systems. Often, descriptions of specific computational methods that have led to exciting research results are discussed only briefly, or relegated to the supplementary information of the papers describing them. Yet, many of these methods merit a more thorough discussion of the key concepts on which they are based, and of the possible further opportunities for exploiting these methods in other contexts. This Special Issue aims to offer a platform for explaining, discussing and contextualizing important computational methods and algorithms. Such methods can assist other scientists researching the evolutionary history of gene and genome sequences and such genes’ biological functions.

Specific topics include, but are not limited to:

  • Methods for tracing the evolutionary history of genome sequences, including, for example, the dynamics of introns and transposons, as well as duplication, recombination, and horizontal transfer events
  • Methods for improving (meta)genome assembly by employing evolutionary information
  • Phylogenetic methods for evaluating evolutionary relationships between genes and genomes
  • Algorithms for studying patterns in amino acid sequences and/or protein structure evolution
  • Tools for automating the annotation of genomes or genomic regions according to function
  • Algorithms or pipelines for identifying mutations from high-throughput sequencing experiments
  • Pipelines for evaluating the outcome of next-generation sequence assemblies
  • Methods for evaluating the evolutionary similarity of genes, gene clusters, genomes, pan-genomes or metagenomes
  • Models and tools for simulating, predicting or otherwise evaluating the evolution of genome-based metabolic or regulatory networks from a systems biology perspective

Prof. Dr. Rainer Breitling
Dr. Marnix Medema
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Computation is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.


Keywords

  • bioinformatics
  • computational biology
  • evolution
  • systems biology
  • algorithms
  • comparative genomics
  • phylogeny
  • sequence analysis
  • metagenomics

Published Papers (8 papers)


Research

249 KiB  
Article
Computational Recognition of RNA Splice Sites by Exact Algorithms for the Quadratic Traveling Salesman Problem
by Anja Fischer, Frank Fischer, Gerold Jäger, Jens Keilwagen, Paul Molitor and Ivo Grosse
Computation 2015, 3(2), 285-298; https://doi.org/10.3390/computation3020285 - 03 Jun 2015
Cited by 6 | Viewed by 6120
Abstract
One fundamental problem of bioinformatics is the computational recognition of DNA and RNA binding sites. Given a set of short DNA or RNA sequences of equal length such as transcription factor binding sites or RNA splice sites, the task is to learn a pattern from this set that allows the recognition of similar sites in another set of DNA or RNA sequences. Permuted Markov (PM) models and permuted variable length Markov (PVLM) models are two powerful models for this task, but the problem of finding an optimal PM model or PVLM model is NP-hard. While the problem of finding an optimal PM model or PVLM model of order one is equivalent to the traveling salesman problem (TSP), the problem of finding an optimal PM model or PVLM model of order two is equivalent to the quadratic TSP (QTSP). Several exact algorithms exist for solving the QTSP, but it is unclear if these algorithms are capable of solving QTSP instances resulting from RNA splice sites of at least 150 base pairs in a reasonable time frame. Here, we investigate the performance of three exact algorithms for solving the QTSP for ten datasets of splice acceptor sites and splice donor sites of five different species and find that one of these algorithms is capable of solving QTSP instances of up to 200 base pairs with a running time of less than two days.
(This article belongs to the Special Issue Genomes and Evolution: Computational Approaches)

1360 KiB  
Article
A Guide to Phylogenetic Reconstruction Using Heterogeneous Models—A Case Study from the Root of the Placental Mammal Tree
by Raymond J. Moran, Claire C. Morgan and Mary J. O'Connell
Computation 2015, 3(2), 177-196; https://doi.org/10.3390/computation3020177 - 15 Apr 2015
Cited by 14 | Viewed by 21572
Abstract
There are numerous phylogenetic reconstruction methods and models available, but which should you use and why? Important considerations in phylogenetic analyses include data quality, structure, signal, alignment length and sampling. If poorly modelled, variation in rates of change across proteins and across lineages can lead to incorrect phylogeny reconstruction, which can then lead to downstream misinterpretation of the underlying data. The risk of choosing and applying an inappropriate model can be reduced with some critical yet straightforward steps outlined in this paper. We use the question of the position of the root of placental mammals as our working example to illustrate the topological impact of model misspecification. Using this case study, we focus on using models in a Bayesian framework and outline the steps involved in identifying and assessing better-fitting models for specific datasets.

4002 KiB  
Article
Evolution by Pervasive Gene Fusion in Antibiotic Resistance and Antibiotic Synthesizing Genes
by Orla Coleman, Ruth Hogan, Nicole McGoldrick, Niamh Rudden and James O. McInerney
Computation 2015, 3(2), 114-127; https://doi.org/10.3390/computation3020114 - 26 Mar 2015
Cited by 4 | Viewed by 6053
Abstract
Phylogenetic (tree-based) approaches to understanding evolutionary history are unable to incorporate convergent evolutionary events where two genes merge into one. In this study, as exemplars of what can be achieved when a tree is not assumed a priori, we have analysed the evolutionary histories of polyketide synthase genes and antibiotic resistance genes and have shown that their history is replete with convergent events as well as divergent events. We demonstrate that the overall histories of these genes more closely resemble the remodelling that might be seen with the children’s toy Lego than the standard model of the phylogenetic tree. This work demonstrates further that genes can act as public goods, available for re-use and incorporation into other genetic goods.

2430 KiB  
Article
Computational and Statistical Analyses of Insertional Polymorphic Endogenous Retroviruses in a Non-Model Organism
by Le Bao, Daniel Elleder, Raunaq Malhotra, Michael DeGiorgio, Theodora Maravegias, Lindsay Horvath, Laura Carrel, Colin Gillin, Tomáš Hron, Helena Fábryová, David R. Hunter and Mary Poss
Computation 2014, 2(4), 221-245; https://doi.org/10.3390/computation2040221 - 28 Nov 2014
Cited by 5 | Viewed by 9620
Abstract
Endogenous retroviruses (ERVs) are a class of transposable elements found in all vertebrate genomes that contribute substantially to genomic functional and structural diversity. A host species acquires an ERV when an exogenous retrovirus infects a germ cell of an individual and becomes part of the genome inherited by viable progeny. ERVs that colonized ancestral lineages are fixed in contemporary species. However, in some extant species, ERV colonization is ongoing, which results in variation in ERV frequency in the population. To study the consequences of ERV colonization of a host genome, methods are needed to assign each ERV to a location in a species’ genome and determine which individuals have acquired each ERV by descent. Because well-annotated reference genomes are not widely available for all species, de novo clustering approaches provide an alternative to reference mapping that is insensitive to differences between query and reference and is amenable to mobile element studies in both model and non-model organisms. However, there is substantial uncertainty in both identifying ERV genomic position and assigning each unique ERV integration site to individuals in a population. We present an analysis suitable for detecting ERV integration sites in species without the need for a reference genome. Our approach is based on improved de novo clustering methods and statistical models that take the uncertainty of assignment into account and yield a probability matrix of shared ERV integration sites among individuals. We demonstrate that polymorphic integrations of a recently identified endogenous retrovirus in deer reflect contemporary relationships among individuals and populations.

835 KiB  
Article
Incongruencies in Vaccinia Virus Phylogenetic Trees
by Chad Smithson, Samantha Kampman, Benjamin M. Hetman and Chris Upton
Computation 2014, 2(4), 182-198; https://doi.org/10.3390/computation2040182 - 14 Oct 2014
Cited by 11 | Viewed by 9239
Abstract
Over the years, as more complete poxvirus genomes have been sequenced, phylogenetic studies of these viruses have become more prevalent. In general, the results show similar relationships between the poxvirus species; however, some inconsistencies are notable. Previous analyses of the viral genomes contained within the vaccinia virus (VACV)-Dryvax vaccine revealed that their phylogenetic relationships were sometimes clouded by low bootstrapping confidence. To analyze the VACV-Dryvax genomes in detail, a new tool-set was developed and integrated into the Base-By-Base bioinformatics software package. Analyses showed that fewer unique positions were present in each VACV-Dryvax genome than expected. A series of patterns, each containing several single nucleotide polymorphisms (SNPs), were identified that ran counter to the results of the phylogenetic analysis. The VACV genomes were found to contain short DNA sequence blocks that matched more distantly related clades. Additionally, similar non-conforming SNP patterns were observed in (1) the variola virus clade; (2) some cowpox clades; and (3) VACV-CVA, the direct ancestor of VACV-MVA. Thus, traces of past recombination events are common in the various orthopoxvirus clades, including those associated with smallpox and cowpox viruses.

959 KiB  
Article
On Mechanistic Modeling of Gene Content Evolution: Birth-Death Models and Mechanisms of Gene Birth and Gene Retention
by Ashley I. Teufel, Jing Zhao, Malgorzata O'Reilly, Liang Liu and David A. Liberles
Computation 2014, 2(3), 112-130; https://doi.org/10.3390/computation2030112 - 28 Aug 2014
Cited by 9 | Viewed by 8865
Abstract
Characterizing the mechanisms of duplicate gene retention using phylogenetic methods requires models that are consistent with different biological processes. The interplay between complex biological processes and necessarily simpler statistical models leads to a complex modeling problem. A discussion of the relationship between biological processes, existing models for duplicate gene retention and data is presented. Existing models are then extended in deriving two new birth/death models for phylogenetic application in a gene tree/species tree reconciliation framework to enable probabilistic inference of the mechanisms from model parameterization. The goal of this work is to synthesize a detailed discussion of modeling duplicate genes to address biological questions, moving from previous work to future trajectories with the aim of generating better models and better inference.

Review

841 KiB  
Review
Evolutionary Dynamics in Gene Networks and Inference Algorithms
by Daniel Aguilar-Hidalgo, María C. Lemos and Antonio Córdoba
Computation 2015, 3(1), 99-113; https://doi.org/10.3390/computation3010099 - 13 Mar 2015
Cited by 6 | Viewed by 5580
Abstract
Dynamical interactions among sets of genes (and their products) regulate developmental processes and some dynamical diseases, like cancer. Gene regulatory networks (GRNs) are directed networks that define interactions (links) among different genes/proteins involved in such processes. Genetic regulation can be modified during the time course of the process, which may imply changes in node activity that lead the system from a specific state to a different one at a later time (dynamics). How the GRN modifies its topology to properly drive a developmental process, and how this regulation was acquired across evolution, are questions that the evolutionary dynamics of gene networks tackles. In the present work we review important methodology in the field and highlight the combination of these methods with evolutionary algorithms. In recent years, this combination has become a powerful tool to fit models to the increasingly available experimental data.

267 KiB  
Review
Computation of the Likelihood in Biallelic Diffusion Models Using Orthogonal Polynomials
by Claus Vogl
Computation 2014, 2(4), 199-220; https://doi.org/10.3390/computation2040199 - 14 Nov 2014
Cited by 3 | Viewed by 4359
Abstract
In population genetics, parameters describing forces such as mutation, migration and drift are generally inferred from molecular data. Lately, approximate methods based on simulations and summary statistics have been widely applied for such inference, even though these methods waste information. In contrast, probabilistic methods of inference can be shown to be optimal, if their assumptions are met. In genomic regions where recombination rates are high relative to mutation rates, polymorphic nucleotide sites can be assumed to evolve independently from each other. The distribution of allele frequencies at a large number of such sites has been called “allele-frequency spectrum” or “site-frequency spectrum” (SFS). Conditional on the allelic proportions, the likelihoods of such data can be modeled as binomial. A simple model representing the evolution of allelic proportions is the biallelic mutation-drift or mutation-directional selection-drift diffusion model. With series of orthogonal polynomials, specifically Jacobi and Gegenbauer polynomials, or the related spheroidal wave function, the diffusion equations can be solved efficiently. In the neutral case, the product of the binomial likelihoods with the sum of such polynomials leads to finite series of polynomials, i.e., relatively simple equations, from which the exact likelihoods can be calculated. In this article, the use of orthogonal polynomials for inferring population genetic parameters is investigated.
