Open Access Article
Biology 2013, 2(4), 1282-1295; doi:10.3390/biology2041282

Algorithms for Hidden Markov Models Restricted to Occurrences of Regular Expressions

1 Bioinformatics Research Centre, Aarhus University, C. F. Møllers Allé 8, DK-8000 Aarhus C, Denmark
2 Department of Computer Science, Aarhus University, Aabogade 34, DK-8200 Aarhus N, Denmark
* Author to whom correspondence should be addressed.
Received: 28 June 2013 / Revised: 8 October 2013 / Accepted: 5 November 2013 / Published: 8 November 2013
(This article belongs to the Special Issue Developments in Bioinformatic Algorithms)

Abstract

Hidden Markov Models (HMMs) are widely used probabilistic models, particularly for annotating sequential data with an underlying hidden structure. Patterns in the annotation are often more relevant to study than the hidden structure itself. A typical HMM analysis consists of annotating the observed data using a decoding algorithm and analyzing the annotation to study patterns of interest. For example, given an HMM modeling genes in DNA sequences, the focus is on occurrences of genes in the annotation. In this paper, we define a pattern through a regular expression and present a restriction of three classical algorithms to take the number of occurrences of the pattern in the hidden sequence into account. We present a new algorithm to compute the distribution of the number of pattern occurrences, and we extend the two most widely used existing decoding algorithms to employ information from this distribution. We show experimentally that the expectation of the distribution of the number of pattern occurrences gives a highly accurate estimate, while the typical procedure can be biased in the sense that the identified number of pattern occurrences does not correspond to the true number. We furthermore show that using this distribution in the decoding algorithms improves the predictive power of the model.
Keywords: Hidden Markov Model; decoding; Viterbi; forward; algorithm
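
The central quantity in the abstract, the distribution of the number of pattern occurrences in the hidden sequence given the observations, can be illustrated with a small forward-style recursion. The following is a minimal sketch under stated assumptions, not the algorithm from the paper: the pattern is supplied as a hand-written deterministic automaton over hidden states rather than compiled from a regular expression, an occurrence is counted each time the automaton enters an accepting state, and the function name `occurrence_distribution`, the toy model, and the automaton `DELTA` are all hypothetical.

```python
def occurrence_distribution(init, trans, emit, obs, dfa_delta, dfa_start, dfa_accept):
    """Return [P(k pattern occurrences | obs) for k = 0..len(obs)].

    init[h]         : initial probability of hidden state h
    trans[h][h2]    : transition probability h -> h2
    emit[h][x]      : probability that state h emits symbol x
    dfa_delta[q][h] : next automaton state after reading hidden state h in q
    Assumes obs is non-empty. Works in plain probabilities, so long
    sequences would need scaling or log-space arithmetic.
    """
    n_states = len(init)

    # table[(h, q, k)] = joint probability of the observed prefix, ending in
    # hidden state h, automaton state q, with k occurrences counted so far.
    table = {}
    for h in range(n_states):
        p = init[h] * emit[h][obs[0]]
        if p > 0.0:
            q = dfa_delta[dfa_start][h]
            key = (h, q, int(q in dfa_accept))
            table[key] = table.get(key, 0.0) + p

    for symbol in obs[1:]:
        nxt = {}
        for (h, q, k), p in table.items():
            for h2 in range(n_states):
                p2 = p * trans[h][h2] * emit[h2][symbol]
                if p2 > 0.0:
                    q2 = dfa_delta[q][h2]
                    # Entering an accepting state counts as one occurrence.
                    key = (h2, q2, k + int(q2 in dfa_accept))
                    nxt[key] = nxt.get(key, 0.0) + p2
        table = nxt

    # Marginalise over hidden and automaton states, then condition on obs.
    dist = [0.0] * (len(obs) + 1)
    for (_, _, k), p in table.items():
        dist[k] += p
    total = sum(dist)
    if total == 0.0:
        raise ValueError("observation sequence has zero probability under the model")
    return [p / total for p in dist]


# Hypothetical automaton for the pattern "state 0 followed by state 1".
# Automaton states: 0 = no progress, 1 = just saw hidden state 0, 2 = match.
DELTA = [[1, 0], [1, 2], [1, 0]]
ACCEPT = {2}

# Toy model in which each hidden state emits its own symbol with certainty,
# so "abab" forces the hidden path 0, 1, 0, 1 (two occurrences).
toy_dist = occurrence_distribution(
    init=[0.5, 0.5],
    trans=[[0.5, 0.5], [0.5, 0.5]],
    emit=[{"a": 1.0, "b": 0.0}, {"a": 0.0, "b": 1.0}],
    obs="abab",
    dfa_delta=DELTA,
    dfa_start=0,
    dfa_accept=ACCEPT,
)
# Expectation of the distribution, the estimate discussed in the abstract.
toy_expected = sum(k * p for k, p in enumerate(toy_dist))
```

Tracking the automaton state alongside the hidden state is what lets the count be updated inside the forward recursion; the cost grows with the number of automaton states and with the range of counts that are kept.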
This is an open access article distributed under the Creative Commons Attribution License (CC BY 3.0).


MDPI and ACS Style

Tataru, P.; Sand, A.; Hobolth, A.; Mailund, T.; Pedersen, C.N.S. Algorithms for Hidden Markov Models Restricted to Occurrences of Regular Expressions. Biology 2013, 2, 1282-1295.

Biology EISSN 2079-7737, published by MDPI AG, Basel, Switzerland