Special Issue "Selected Papers from LATA 2010"


A special issue of Algorithms (ISSN 1999-4893).

Deadline for manuscript submissions: closed (1 August 2011)

Special Issue Editors

Guest Editor
Prof. Dr. Henning Fernau

Theoretical Computer Science, FB 4 - Abteilung Informatik, Universität Trier, D-54286 Trier, Germany
Fax: +49 (0)651 201 3954
Interests: complexity theory; fixed parameter algorithms; formal languages; fractal geometry; learning algorithms (machine learning) and pattern recognition
Guest Editor
Prof. Dr. Carlos Martín-Vide

Research Group on Mathematical Linguistics, Rovira i Virgili University, Tarragona, Spain
Phone: +34 977 55 9543
Fax: +34 977 55 8386
Interests: mathematical linguistics; formal languages; molecular computing; computational linguistics

Special Issue Information

LATA is an annual conference on theoretical computer science and its applications. LATA 2010 aimed to attract contributions from both classical theory fields and application areas (bioinformatics, systems biology, language technology, artificial intelligence, etc.). This Special Issue collects extended, algorithm-oriented versions of selected LATA 2010 papers, invited specifically for this issue.

Keywords

  • data and image compression
  • document engineering
  • grammatical inference and algorithmic learning
  • pattern matching and pattern recognition
  • text algorithms
  • text retrieval

Published Papers (3 papers)


Research

Open Access Article: An Algorithm to Compute the Character Access Count Distribution for Pattern Matching Algorithms
Algorithms 2011, 4(4), 285-306; doi:10.3390/a4040285
Received: 14 October 2011 / Revised: 26 October 2011 / Accepted: 26 October 2011 / Published: 31 October 2011
Cited by 2
Abstract
We propose a framework for the exact probabilistic analysis of window-based pattern matching algorithms, such as Boyer–Moore, Horspool, Backward DAWG Matching, Backward Oracle Matching, and more. In particular, we develop an algorithm that efficiently computes the distribution of a pattern matching algorithm’s running time cost (such as the number of text character accesses) for any given pattern in a random text model. Text models range from simple uniform models to higher-order Markov models or hidden Markov models (HMMs). Furthermore, we provide an algorithm to compute the exact distribution of differences in running time cost of two pattern matching algorithms. Methodologically, we use extensions of finite automata which we call deterministic arithmetic automata (DAAs) and probabilistic arithmetic automata (PAAs) [1]. Given an algorithm, a pattern, and a text model, a PAA is constructed from which the sought distributions can be derived using dynamic programming. To our knowledge, this is the first time that substring- or suffix-based pattern matching algorithms are analyzed exactly by computing the whole distribution of running time cost. Experimentally, we compare Horspool’s algorithm, Backward DAWG Matching, and Backward Oracle Matching on prototypical patterns of short length and provide statistics on the size of minimal DAAs for these computations.
(This article belongs to the Special Issue Selected Papers from LATA 2010)
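The abstract above computes exact cost distributions efficiently via arithmetic automata and dynamic programming. As a much simpler point of comparison, the sketch below obtains the same kind of distribution by brute force: it counts Horspool's text character accesses for every text of a given length under the uniform i.i.d. text model. The function names are illustrative assumptions, and the enumeration is exponential in the text length, unlike the paper's DAA/PAA construction.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def horspool_accesses(text, pattern):
    """Run Horspool's algorithm on `text` and return the number of
    text character accesses (one per comparison)."""
    m, n = len(pattern), len(text)
    # last-character shift rule; characters absent from the table shift by m
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    cost, pos = 0, 0
    while pos <= n - m:
        j = m - 1
        while j >= 0:
            cost += 1                      # one text character access
            if text[pos + j] != pattern[j]:
                break
            j -= 1
        pos += shift.get(text[pos + m - 1], m)
    return cost

def access_distribution(pattern, n, alphabet):
    """Exact distribution of the access count over all |alphabet|**n
    texts of length n under the uniform text model, by brute-force
    enumeration (exponential in n)."""
    counts = Counter(
        horspool_accesses(''.join(t), pattern)
        for t in product(alphabet, repeat=n)
    )
    total = len(alphabet) ** n
    return {c: Fraction(k, total) for c, k in sorted(counts.items())}
```

For example, `access_distribution("ab", 4, "ab")` maps each possible access count to its exact probability as a `Fraction`; the probabilities sum to one by construction.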
Open Access Article: The Smallest Grammar Problem as Constituents Choice and Minimal Grammar Parsing
Algorithms 2011, 4(4), 262-284; doi:10.3390/a4040262
Received: 12 October 2011 / Accepted: 14 October 2011 / Published: 26 October 2011
Cited by 3
Abstract
The smallest grammar problem—namely, finding a smallest context-free grammar that generates exactly one sequence—is of practical and theoretical importance in fields such as Kolmogorov complexity, data compression and pattern discovery. We propose a new perspective on this problem by splitting it into two tasks: (1) choosing which words will be the constituents of the grammar and (2) searching for the smallest grammar given this set of constituents. We show how to solve the second task in polynomial time by parsing longer constituents with smaller ones. We propose new algorithms, based on classical practical algorithms, that use this optimization to find small grammars. Our algorithms consistently find smaller grammars on a classical benchmark, reducing the size by 10% in some cases. Moreover, our formulation allows us to define interesting bounds on the number of small grammars and to empirically compare different grammars of small size.
(This article belongs to the Special Issue Selected Papers from LATA 2010)
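Task (2) in the abstract above—finding a smallest parse once the constituents are fixed—reduces to a shortest-path computation over string positions. The sketch below shows that dynamic program for a single string; it is an illustrative assumption of the setup, not the authors' implementation, and the function name is hypothetical.

```python
def minimal_parse_length(s, constituents):
    """Smallest number of right-hand-side symbols deriving s, where
    each symbol is either one terminal of s or a nonterminal standing
    for a word in `constituents`. Shortest-path dynamic program over
    positions 0..len(s)."""
    n = len(s)
    INF = float('inf')
    dp = [0] + [INF] * n          # dp[i] = shortest parse of s[:i]
    for i in range(n):
        if dp[i] == INF:
            continue
        # emit a single terminal covering s[i]
        dp[i + 1] = min(dp[i + 1], dp[i] + 1)
        # emit a nonterminal for a constituent occurring at position i
        for c in constituents:
            if s.startswith(c, i):
                dp[i + len(c)] = min(dp[i + len(c)], dp[i] + 1)
    return dp[n]
```

In the full minimal grammar parsing setting, the body of each constituent must itself be parsed the same way (longer constituents with smaller ones), and the grammar size sums those parse lengths; the sketch covers only the per-string step.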
Open Access Article: Applying Length-Dependent Stochastic Context-Free Grammars to RNA Secondary Structure Prediction
Algorithms 2011, 4(4), 223-238; doi:10.3390/a4040223
Received: 12 October 2011 / Accepted: 20 October 2011 / Published: 21 October 2011
Cited by 3
Abstract
To capture effects of co-transcriptional folding, we extend stochastic context-free grammars so that the probability of applying a rule can depend on the length of the subword that is eventually generated from the symbols introduced by the rule, and we show that existing algorithms for training and for determining the most probable parse tree can easily be adapted to the extended model without loss of performance. Furthermore, we show that the extended model is suited to improving the quality of RNA secondary structure predictions. The extended model may also be applied to other fields where stochastic context-free grammars are used, such as natural language processing. Additionally, some interesting questions in the field of formal languages arise from it.
(This article belongs to the Special Issue Selected Papers from LATA 2010)
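The extension described above amounts to letting each rule's probability be a function of the span it ultimately derives. A minimal sketch of the idea is the inside algorithm for a CNF grammar where the probability lookup takes the span length as an extra argument; the toy grammar, probability function, and names below are illustrative assumptions, not the authors' RNA grammar.

```python
def inside_probability(w, rules_unary, rules_binary, prob, start):
    """Inside algorithm for a CNF SCFG whose rule probabilities may
    depend on the length of the generated subword.
    rules_unary:  terminal -> nonterminals A with a rule A -> terminal
    rules_binary: list of (A, B, C) for rules A -> B C
    prob(rule, span_length) -> float, e.g. prob(('S', 'a'), 1)."""
    n = len(w)
    P = {}  # (i, j, A) -> inside probability of A =>* w[i:j]
    for i, c in enumerate(w):
        for A in rules_unary.get(c, ()):
            P[(i, i + 1, A)] = P.get((i, i + 1, A), 0.0) + prob((A, c), 1)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for (A, B, C) in rules_binary:
                # sum over all split points, as in ordinary CYK/inside
                total = sum(P.get((i, k, B), 0.0) * P.get((k, j, C), 0.0)
                            for k in range(i + 1, j))
                if total:
                    P[(i, j, A)] = (P.get((i, j, A), 0.0)
                                    + prob((A, (B, C)), span) * total)
    return P.get((0, n, start), 0.0)
```

With length-independent probabilities this reduces to the ordinary inside algorithm, which is why training and parsing adapt without an asymptotic penalty: only the probability lookup changes, not the chart recursion.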

Journal Contact

MDPI AG
Algorithms Editorial Office
St. Alban-Anlage 66, 4052 Basel, Switzerland
algorithms@mdpi.com
Tel. +41 61 683 77 34
Fax: +41 61 302 89 18