1. Introduction
The Topological Hidden Markov Model (THMM) from [1] is constructed from a mixture of normal distributions, which is then extended to a mixture in locally convex topological spaces in order to comprehend probability density functions in infinite-dimensional spaces. Differently, in the present work, the methods of sheaf cohomology are used instead to construct the chain for deep-machine-learning multiple sequencing.
The present paper is aimed at proving how to construct a rectangular-matrix chain for multiple sequencing starting from a square two-dimensional-matrix Markov chain after adding Morse complexes (i.e., as from [2]).
For these purposes, the two-dimensional Markov models are considered. The more general context of rectangular chains is preferred, in which the compatibility with larger Markov models is ensured after the singular values. The corresponding probability spaces are constructed. A sheaf of constants is chosen in order to extend the two-state Markov models. The blocks to construct the (also rectangular) new chain are added as Morse complexes, whose compatibility is proven with the simplices of the corresponding graph, i.e., with the filtration(s). The deep Markov models are constructed this way. The more general construction of rectangular-matrix Markov models is presented. The applications to develop sequencing are newly analytically modelized; this way, the dependence of the probabilities of the nucleotides in the sequences is explained.
In the present paper, the Jukes–Cantor model [3] is considered, according to which the amino-acids in a sequence have equal probability; a Poisson model was proposed in [4] and applied in [5] in order to write the expression of the probability matrix of a Markov chain.
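For reference, the one-parameter Jukes–Cantor transition probabilities take the standard textbook form (our rendering, not necessarily the notation of [3]):
\[
P(\text{same state after time } t) = \tfrac{1}{4} + \tfrac{3}{4}\, e^{-4\alpha t},
\qquad
P(\text{a given different state after time } t) = \tfrac{1}{4} - \tfrac{1}{4}\, e^{-4\alpha t},
\]
with \(\alpha\) the uniform substitution rate: all the off-diagonal entries of the transition matrix are equal, which encodes the equal-probability hypothesis recalled above.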
As an application, the profile Markov model, which can be extracted from [6], for sequencing is newly analytically written; the insertion of new sequences, i.e., that of a single amino-acid and that of a sequence, is newly analytically written in order to extend the profile Markov model to a Jukes–Cantor model [3] via the interrogation proposed in [5] (where the interrogation from [5] follows the long-standing problem after [4]). After the review of several two-dimensional square-matrix Markov chain sequencing models, the limitations of the deep Markov models are outlined. The way to analytically manage insertions and deletions in the sequences is newly proven.
In more detail, the methods of sheaf cohomology are used in order to analytically add blocks to a matrix and to analytically delete blocks from a matrix, whose action corresponds to adding insertions to a sequence and to performing deletions from it. The partial matchings that are to be managed are constructed starting from a chosen sequence. In particular, the Morse simplicial complexes that are treated are those that allow us to reconduct to the wanted filtration of the probability space.
The paradigm developed at the initial stage defined in the above can be of use to shape the profile Hidden Markov Model version treated in [7]. This item of the paradigm allows one to define the probability spaces of the two-state pair MM and of the profile HMM.
The applications to computation in biology are discussed in [8].
The measure of the probability space is the one to which the properties of the Markov chain allow one to upgrade, starting from the sigma-algebra of the Borel (sub-)sets that constitute the state space of the models.
From the probability space, a metric space can be defined, therefore endowed with a metric, on which there lives the manifold on which the graphs associated with the chain are defined: it is one of the aims of the present paper to discuss and to prove the well-posedness of the concept of distance within the metric spaces, such that the distance is well-posed if two chains are obtained from the initial two-item sequences, to which the opportune sheaf (or sheaves) are applied.
The possibility to construct rectangular-matrix chains is considered as well; this construction allows one to schematize the existence of latent states [9,10]. The sheaf-cohomology techniques here selected also apply to rectangular matrices; the analytical managing of the Morse simplicial complexes allows one to reconduct to the wanted filtration.
As a further advantage, it is possible to reproduce the site-dependence of the probability of finding the items in a sequence.
Therefore, the model is apt to be applied to the analysis of sequences of amino-acids, as outlined in [11].
In more detail, in the present paper, it is possible to analytically write the probabilities of finding a certain number of items in a selected order. Accordingly, the new quality of the amino-acids of exhibiting different probabilities of being found in a sequence, as very recently found in [12], is analytically modelized.
As an application of the methods developed here, two nucleotide-substitution models are compared: the method from [13] is proven to be obtained after the addition of the blocks of the sheaf needed to reproduce the wanted probability space, while the model from [14] is proven inconsistent with the presented protocol.
Therefore, the present paper contains answers to the interrogations delivered in [15] and in [16] as far as the comparison of methods of nucleotide substitutions is concerned. An illustrative example is contained in which two methods are compared: one is found to be an extension of the two-state Jukes–Cantor model, while the other one is proven as not being obtainable from the Jukes–Cantor model. The relevance of constructing sequences containing pairwise repetitions of the same pattern as far as timely requests are concerned about the evolution of species is outlined in [17].
The prospective studies to be envisaged are the recognition of patterns in the sequences and the outlining of the patterns from the experimental errors, as requested in [18]: the interrogation can be solved as indicated in [19] and applied as in [20]. The patterns are ibidem shown to be contained in rectangular-matrix constructions.
The implementation of the deep Markov models can therefore be argued to be apt for development with the 'context-sensitive' Markov models, i.e., from [21], where the rendering of the wanted states as 'context-sensitive' can be achieved using the substitution techniques developed in the present paper.
To apply the methods of [19], the singular-value decomposition is requested in order to control the filtration of the probability spaces.
The paper is organized as follows:
In Section 2, several models of sequencing are discussed, each of which exhibits stringent peculiarities for application.
In Section 3, the ergodicity of HMMs with a 'generalized observation structure' is ensured.
In Section 4, the deep Markov models are reviewed and their limitations are outlined.
In Section 5, the probability spaces are constructed, whose features complete the limitations of the deep Markov models.
In Section 6, the filtration of the probability space is modelized according to the choice of a (constant) sheaf; furthermore, the way to add insertions is proven to be by adding Morse complexes, for which the compatibility with the simplices of the graph holds: the method to perform deletions is, in this way, also delineated.
In Section 7, the outlook is proposed.
In Appendix A, further complements of the cochain theory are studied in detail in order to illustrate the proof of the new Theorem 1.
In Appendix B, the method is demonstrated to be of use to develop the multiple sequencing of amino-acids: the chains for sequencing are analytically constructed from the profile Markov model; the metric of the corresponding manifold is discussed.
In Appendix C, two methods of amino-acid substitutions are compared: one of them is proven to be obtained from the Jukes–Cantor model after application of the Morse operator, while the other is demonstrated as consisting of different hypotheses.
2. Introductory Material
2.1. Profile-Hidden Markov Model
As described in [22], the construction of a Profile-hidden Markov Model (pHMM) starts with the consideration of a 'multiple sequence alignment'.
Insertions and deletions are modelized.
The occupancy of a position and the probability at each position of the alignment are encoded in the pHMM.
In [23], the pHMM is commented to apply to large-scale sequence analyses.
Profile Hidden Markov State models are introduced in [24]. The pHMM is constructed from an opportune HMM of an MSM. While the HMM consists of 'training sets' of unaligned sequences, the pHMM allows one to obtain the multiple alignment of all the training sequences.
In [8], the pHMM is analyzed to exhibit higher sensitivity and higher 'recall' rates compared to 'pairwise alignment methods'.
The software implementation of the pHMM is reviewed in [23].
From [7,22], a profile for an alignment is understood as a trivial HMM with one 'match state' for each column. The consecutive match states have consecutive probabilities.
The 'output probabilities' are defined as the probabilities given after the probabilities of finding one particular match state in the corresponding column.
'Insertions' are defined as 'portions of sequences that do not overlap those found in the previous constructions'. They are represented after the corresponding state space, whose states modelize the inserts of the j-th column of the alignment. The 'output probabilities' of the insert states are requested to equal the background probabilities.
The probabilities are given; such probabilities affect the pHMM such that the insertions can take place in different portions of the alignment. These probabilities can be time-dependent as well: the 'affine gap penalties' are this way defined.
From [22], the construction of a pHMM starts with the consideration of a 'multiple alignment'. Insertions and deletions can be modelized.
The occupancy of a position and the probability at each position are encoded in the 'profile'.
Given the sample, the wanted set of states is chosen: the pertinent transition probabilities are specified under the initial condition, and the 'emission probabilities' are specified accordingly; the notation employs the row–column description of the entries of the matrix. The corresponding pair HMM relates the probability distributions over certain sequences of pairs of observations.
These probabilities can be time-dependent as well and are a priori different from the probabilities found in the previous calculations: the 'affine gap penalties' are this way defined.
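As a minimal illustrative sketch (a toy example of ours; all names and parameter values are hypothetical and do not reproduce the construction of [22]), the per-column match/insert structure described above can be encoded as follows:

```python
import numpy as np

# Toy profile-HMM layer structure (hypothetical example, not from [22]).
# Per alignment column j: a match state M_j with its own emission
# distribution; insert states emit with the background distribution.
n_columns = 3          # length of the (toy) alignment profile
alphabet = "ACGT"      # nucleotide alphabet for concreteness

rng = np.random.default_rng(0)

def random_distribution(size):
    """Return a normalized random vector, i.e., a discrete distribution."""
    v = rng.random(size)
    return v / v.sum()

# Emission probabilities: one distribution per match state;
# insert states emit with the (uniform) background distribution.
match_emissions = [random_distribution(len(alphabet)) for _ in range(n_columns)]
background = np.full(len(alphabet), 1.0 / len(alphabet))

# Transition probabilities out of each match state:
# (to next match, to insert, to delete); each row sums to 1.
transitions = [random_distribution(3) for _ in range(n_columns)]

for j in range(n_columns):
    print(f"column {j}: P(M->M, M->I, M->D) = {np.round(transitions[j], 3)}")
```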
2.2. The Context-Sensitive Hidden Markov Models
The context-sensitive Markov Models (csMMs) are presented in [21]. Context-sensitive HMMs are used to model long-range dependencies in symbol sequences, a case to which the study of amino-acids can be reconducted.
csMMs are apt to describe ‘nested dependences’ between symbols.
The csMMs are developed in order to study long-range correlations after ‘rendering’ some of the states in the model as ‘context-sensitive’.
2.3. Pair Hidden Markov Models
From [25], the Pair hidden Markov Models (pair HMMs) are constructed. The sample is given, together with the wanted set of pairs of states; the transition probabilities are written, specified under the initial conditions. The 'emission probabilities' of the pair HMM are different from those obtained in the profile HMM. The pair HMM relates the probability distributions over certain sequences of pairs of observations.
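As a hedged sketch of the state space just described (the gap-open and gap-extend parameters delta and eps below are our own illustrative choices, not the parameters of [25]), the commonly used three-state pair-HMM transition structure reads:

```python
import numpy as np

# Toy pair-HMM state space: state M emits an aligned pair of symbols;
# states X and Y emit a symbol against a gap in one of the two sequences.
states = ["M", "X", "Y"]
delta, eps = 0.1, 0.3   # gap-open and gap-extend probabilities (illustrative)

# Row-stochastic transition matrix over (M, X, Y).
T = np.array([[1 - 2 * delta, delta, delta],
              [1 - eps,       eps,   0.0  ],
              [1 - eps,       0.0,   eps  ]])

assert np.allclose(T.sum(axis=1), 1.0)   # each row is a distribution

# Sample a short path of hidden states.
rng = np.random.default_rng(2)
s = 0
for _ in range(5):
    print(states[s], end=" ")
    s = rng.choice(3, p=T[s])
print()
```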
From [26], the use of conditional random fields in pair HMMs is discussed.
After the use of conditional random fields (CRFs) for time-series data, linear chains are obtained [27].
2.4. Deep Markov Models of Sequential Data
The software implementations featuring the deep Markov models of sequential data after [28] are written in [29].
From the analysis ibidem, the following developments are constructed.
The chain of latent variables is introduced: each latent variable from the chain is conditioned after the previous latent variables.
Potential highly non-linear dynamics are prospected.
Definition 1. The wanted transition probabilities, which rule the dynamics of the latent variables, and the ‘emission probabilities’, which govern the observations, should be parameterized after non-linear neural networks.
Example 1. The sequence of observations is of length 3; there exists a corresponding sequence of coherent (latent) random variables.
The implementation of the Markov properties of the model is to be studied.
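A minimal sketch of Definition 1 and Example 1, assuming small tanh multi-layer perceptrons for the transition and emission means and unit-variance Gaussian noise (all names and sizes are our own illustrative assumptions, not the architecture of [28,29]):

```python
import numpy as np

rng = np.random.default_rng(1)

def mlp(sizes):
    """Create random weights for a small MLP (illustrative only)."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """tanh MLP forward pass; the last layer is linear."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

z_dim, x_dim = 2, 3
transition_net = mlp([z_dim, 8, z_dim])   # mean of p(z_t | z_{t-1})
emission_net = mlp([z_dim, 8, x_dim])     # mean of p(x_t | z_t)

# Generate a length-3 sequence of latents and observations,
# with unit-variance Gaussian noise for both distributions.
z = rng.standard_normal(z_dim)
for t in range(3):
    x = forward(emission_net, z) + rng.standard_normal(x_dim)
    print(f"t={t}: x = {np.round(x, 3)}")
    z = forward(transition_net, z) + rng.standard_normal(z_dim)
```

The Markov property is visible in the loop: each latent state is sampled from the previous latent state only.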
3. About Ergodicity of HMMs with a ‘Generalized Observation Structure’
The conditions of ergodicity of HMMs with a 'generalized observation structure' are addressed in [30].
Under the opportune set of hypotheses, which will be tested in the following for the verification of the extent to which the work [30] is useful in the present approach, the existence of a unique invariant measure of the pair (hidden process, observation process) has to be verified.
The corresponding triple is written within the asymptotic stability of probability.
The problem of ‘incorrectly initialized filters’ will be further addressed in the present work.
The asymptotic properties of the filter and the state estimator are founded on both the observations and on the (knowledge of the) initial data.
A minimal invariant measure and a maximal invariant measure can be defined.
Let a discrete-time process on a probability space be given, together with its transition kernel.
The second component is called the 'observation process'; its transition kernel is parameterized after the current values of the hidden Markov process.
Let the set of states be a Borel subset endowed with a σ-algebra: here, it is necessary to remark that the implementation from the hidden process to the observation process is non-trivial within the analysis of the ergodicity.
The transition probabilities of the hidden process and of the observation process are written accordingly. It is crucial to remark in the present work that, from [30], for fixed arguments, the two pertinent set functions must be defined; in the case where they are defined, they are probability measures on the hidden state space and on E, respectively.
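As a hedged reading of this passage, the joint transition kernel of the pair process takes, under the usual conditional-independence assumptions, the standard form (our notation, not necessarily that of [30]):
\[
T\big((x,y),\, A \times B\big) \;=\; \int_{A} Q(x', B)\, P(x, \mathrm{d}x'),
\]
where \(P\) is the kernel of the hidden process and \(Q\) the observation kernel: for fixed \((x,y)\), the map is a probability measure, and for a fixed rectangle \(A \times B\) it is a measurable function.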
It is further crucial to remark from [30] that, for a fixed Borel subset A of the hidden state space and for a fixed Borel subset B of E, the application that acts by sending the pair of subsets to the corresponding probabilities has to be demonstrated to exist, as far as the new probabilities p admit a measure on the suitable Borel σ-fields.
The hypothesis is made in [30] that an opportune definition holds on the kernel, in which the relevant object from the probability space is the space of probability measures.
The observation structure is claimed to be admitted in [30] for the models in which the functions h are diffeomorphisms on the state space.
It is one of the aims of the present work to analyze the construction.
4. About the Proposal of Deep Markov Models
In [28], the use of Gaussian state space models in the 'generative modeling of sequential data' is recalled as a long-standing attribution.
Algorithms to learn a wide class of linear state space models and of non-linear state space models have been developed.
Developments were also implemented in which the emission distributions and the transition distributions were molded after neural networks.
The posterior distributions were obtained so as to be compared with the outcomes of a 'learning algorithm', in which both the properties of a 'compiled inference network' and those of the 'generative model' are expanded, where room is left for a parameterized variational approximation.
The scalability properties were exhibited.
The ‘structured approximation’ is shown to produce ‘posterior results’ with ‘significantly higher held-out distributions’. It is our aim to verify the statement analytically.
HMMs and Recurrent Neural Networks (RNNs) can be implemented with linear state space models or with Gaussian state space models.
Definition 2. A deep Markov model (DMM) is a ‘class of generative models’ in which the ‘classic’ linear emission and the classic transition distribution are substituted with complex multi-layer perceptrons (CMLPs).
There exist general state space models (GSSMs) that are apt to keep the Markovian structure of the HMMs; in this case, however, the 'representational power of the deep neural network to treat high-dimensional data is leveraged'.
If the DMM is 'improved' after increasing the latent states of the successive time step, the DMM is ibidem hypothesized to be interpreted as a (possibly 'more restrictive') stochastic RNN [31], as far as variational RNNs are concerned [32].
The linear algorithm proposed ibidem is valid for executing 'stochastic gradient ascent' of the variational lower bound of the 'likelihood' [33].
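For concreteness, the variational lower bound in question is, in common notation (our rendering of the standard ELBO, after [33]):
\[
\log p_\theta(x_{1:T}) \;\geq\; \mathbb{E}_{q_\phi(z_{1:T} \mid x_{1:T})}\!\big[\log p_\theta(x_{1:T} \mid z_{1:T})\big] \;-\; \mathrm{KL}\!\big(q_\phi(z_{1:T} \mid x_{1:T}) \,\big\|\, p_\theta(z_{1:T})\big),
\]
which is maximized by stochastic gradient ascent in the generative parameters \(\theta\) and the variational parameters \(\phi\).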
A new family of 'structured inference networks' is assessed in [28]; it is parameterized after 'recurrent neural networks', and the following three possibilities are comprehended:
- (i)
The ‘generative model’ is already known and assessed;
- (ii)
The functional form of the parameters estimation is known;
- (iii)
The learning deep Markov model is known.
5. The Probability Spaces
The proposed HMM is prepared from the complete model.
From the complete probability space for the state space, the probability is defined.
The HMM is defined as a probability space with its probability distribution, together with a measurable space with its own probability distribution.
The existence of the HMM is based on the verification that the map of Equation (8) is still a measurable function. In the case where the requests after Equation (8) are verified, the section (product of σ-fields) F defines a unique measure.
Not every choice allows for such a construction: this is established after the verification of the measure. From this, it follows that the two models are on different manifolds.
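For orientation, the unique measure in question extends, in the standard construction (our notation), the finite-dimensional distributions
\[
\mathbb{P}\big(x_{1:n},\, y_{1:n}\big) \;=\; \mu(x_1)\, g(y_1 \mid x_1) \prod_{t=2}^{n} p(x_t \mid x_{t-1})\, g(y_t \mid x_t),
\]
with \(\mu\) the initial distribution, \(p\) the transition probabilities, and \(g\) the emission probabilities; the measurability of the map in Equation (8) is what permits the extension to the product σ-field.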
6. Sheaf Cohomology of Models of Sequential Data
The two-dimensional matrices from [11,34,35] have an eigenvalue equal to 1, i.e., they are stochastic matrices.
The construction of the rectangular matrix \(M\) in the thin singular-value decomposition is achieved as
\[
M = U \Sigma V^{T},
\]
where the apex \(T\) indicates the transpose. In the present case, the wanted vectors are chosen for the purposes of Appendix A. For the present purposes, it is even more conformable to investigate the compact singular-value decomposition
\[
M = U_{r} \Sigma_{r} V_{r}^{T},
\]
in which only the \(r\) non-vanishing singular values are considered.
The singular values of \(\Sigma\) allow for a decomposition in which the columns of \(U\) and of \(V\) are taken from, in general, two sets of orthonormal bases.
The request that the factors be unitary would lead to the definition of one corresponding normalized orthonormal basis.
It will be interesting to verify whether the matrix of the decomposition is a partial isometry.
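As a concrete check (our own toy matrix, chosen only for illustration), the thin and compact singular-value decompositions of a rectangular row-stochastic matrix can be computed and verified as follows:

```python
import numpy as np

# Toy 2 x 3 row-stochastic matrix: each row sums to 1.
M = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.3, 0.4]])

U, s, Vt = np.linalg.svd(M, full_matrices=False)   # thin SVD: M = U diag(s) Vt
r = int(np.sum(s > 1e-12))                         # number of non-vanishing singular values

# Compact SVD keeps only the r non-zero singular values.
Ur, sr, Vtr = U[:, :r], s[:r], Vt[:r, :]
assert np.allclose(M, Ur @ np.diag(sr) @ Vtr)

# Ur has orthonormal columns, i.e., it acts as a partial isometry.
print("singular values:", np.round(sr, 4))
print("Ur^T Ur = I:", np.allclose(Ur.T @ Ur, np.eye(r)))
```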
From [2], the procedures are extended here in order to add at least one block on the diagonal of the matrix, to discuss whether it is possible to pass from the two-state Markov models to the deep Markov model of sequential data [28,29].
The wanted chain can be put on a manifold, on which a graph is constructed after the path of simplices implied by the matching. A sheaf is fixed on a simplicial complex L.
The choice of the sheaf will be specified in the following as a sheaf of constants.
The following new definition is obtained starting from Definition 5.5 of [2]:
Definition 3. For every dimension k, the vector space of k-cochains of L with coefficients in the sheaf is written as a product over the k-simplices.

Here, one newly chooses from [2] to develop the case in which the vertices of L are ordered. This way, it is possible to define the topological models (of which the 3-state model is that with the lowest dimensionality). This way, the following new definition is given:

Definition 4. For each k, the k-th coboundary map of L with coefficients in the sheaf is the linear map from the k-cochains to the (k+1)-cochains, with coboundary operator \(\delta^{k}\).

Remark 1. The arrow in Equation (15) is defined as a 'block action'.

For each pair of simplices \(\sigma \subseteq \tau\) such that \(\dim \tau = \dim \sigma + 1\), the component of \(\delta^{k}\) is written entrywise.
The sequence of cochain spaces forms a cochain complex over L.
The composition of two consecutive coboundary maps vanishes, \(\delta^{k+1} \circ \delta^{k} = 0\), for every sheaf and, therefore, in particular, also for the sheaf of constants.
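Schematically, the resulting cochain complex has the standard shape (the symbols are ours):
\[
0 \longrightarrow C^{0}(L) \xrightarrow{\;\delta^{0}\;} C^{1}(L) \xrightarrow{\;\delta^{1}\;} C^{2}(L) \longrightarrow \cdots, \qquad \delta^{k+1} \circ \delta^{k} = 0 .
\]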
The Morse chain complex is now discussed.
Let K be a simplicial complex with ordered vertices.
Let \(\sigma\) and \(\tau\) be any simplices of K.
The coefficient of \(\sigma\) in the boundary of \(\tau\) is then defined. The coefficient is non-vanishing iff \(\sigma\) is a codimension-one face of \(\tau\). Here, we are interested in this determination and define the coefficient as 1.
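For reference, with ordered vertices the standard simplicial boundary operator reads (our notation; \(\widehat{v_i}\) denotes omission of the i-th vertex):
\[
\partial_k\, [v_0, \dots, v_k] \;=\; \sum_{i=0}^{k} (-1)^{i}\, [v_0, \dots, \widehat{v_i}, \dots, v_k],
\]
so that the coefficient of a codimension-one face is \(\pm 1\), consistently with the normalization to 1 chosen above.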
In [2], page 98, an acyclic partial matching is fixed: there, an acyclic partial matching is defined as one whose G-paths are all gradient.
The definition of 'chain complex' therefore follows Theorem 8.4 ibidem.
From Definition 8.6 ibidem, which is aimed at associating an algebraic contribution to each G-path, the following new definition is given:
Definition 5. The weight of the G-path is defined as the product of the algebraic contributions of its steps.

Differently from [2], it is our aim to express the 'collected' version of the weight of the G-path, to then choose the algebraic weights all equal to 1, and to study the k-th coboundary map.
The weights are also written in the collected form of Equation (22).
We need to specify Equation (22) for the present case for the purposes of the sequencing procedures.
The constraints imply the looked-for entries of the singular-value-decomposition form of the wanted vectors. To this aim, the following new definition is given:
Definition 6. For each k, the k-th Morse boundary operator is the linear map on the critical simplices, which admits a matrix representation.

In the matrix from Definition 6, the columns are indexed by the critical k-simplices and the rows by the critical (k−1)-simplices; the entries are given by the collected weights.
We specify the choice of the orthonormal basis with all the weights of the G-path equal to 1.
The only paths that result in a non-zero contribution are those paths that can be written in the collected form, where the role of the critical simplices is outlined.
The choice of the path with weight equal to 1 is thus made.
For the sake of the present purposes of sequencing, it is necessary to use Proposition 8.8 from [2].
It is necessary to control the number of the connecting G-paths.
The definition of a chain complex allows one to analyze the outcomes of the measurements of the sequencing resulting from the chain complex.
It is proven by induction that the wanted result is obtained when G consists of a single pair of simplices in K.
Complements for the following matter are given in Appendix A.
The wanted set of critical simplices is defined as follows:
Theorem 1. The set of critical simplices for sequencing is defined as in Equation (27).
Proof. Equation (27) allows one to define the new boundary operator, which defines the chain complex. □
It is crucial to comment on the following:
Remark 2. The chain complex is associated with an ‘acyclic partial matching’ G.
The following is true:
Theorem 2. The chain complex of the partial matching has the same homology as the chain complex of the boundary operator.
Proof. From Proposition 8.10 in [2], the Morse chain complex is, in this case, chain-homotopy equivalent to the standard simplicial chain complex. □
The wanted HMMs are, therefore, this way constructed.
Here, one needs to take a sheaf of constants.
From Proposition 8.8 in [2], the Morse complex is a chain complex. From Proposition 8.10 in [2], the Morse chain complex is 'chain homotopy equivalent' to the standard simplicial chain complex.
The two chain maps are described with a pair of chain homotopies.
To construct the chain homotopies, one processes the simplex pairs in G one at a time. This means that the blocks of the matrices can be added one at a time.
For this purpose, two more arbitrary simplices can be considered, as there is only one 'path' between them.
The entries of the new blocks of the matrices to be inserted are this way found.
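As an illustration of the block-by-block procedure (a toy example of ours; the coupling value 0.05 and the block sizes are hypothetical), one block at a time can be inserted on the diagonal of a transition matrix, re-normalizing so that the result stays stochastic:

```python
import numpy as np
from scipy.linalg import block_diag

# Enlarge a two-state transition matrix by inserting one block at a time
# on the diagonal (the blocks play the role of the contributions added
# one at a time in the text).
P2 = np.array([[0.9, 0.1],
               [0.2, 0.8]])              # two-state Markov chain

B = np.array([[1.0]])                    # a new 1 x 1 block (a new state)
P3 = block_diag(P2, B)                   # 3 x 3 block matrix

# Couple the new state weakly to the old ones and re-normalize the rows
# so that the enlarged matrix is again stochastic.
P3[0, 2] = 0.05
P3 = P3 / P3.sum(axis=1, keepdims=True)

assert np.allclose(P3.sum(axis=1), 1.0)  # rows sum to 1: stochastic matrix
print(np.round(P3, 3))
```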
Construction of the Probability Space of the HMM
The chain homotopy is defined after the pertinent linear maps.
The difference between the 'new' entries and the previous ones is taken, where the apex ′ denotes the new entries.
Given a filtration of a simplicial complex, an acyclic 'partial matching' on the complex is compatible with the filtration if there exists a function b associated with the map of the filtration.
The blocks can, therefore, be inserted one at a time; nevertheless, the blocks are different from each other. It is nevertheless ensured that the effects in determining the sequencing are the wanted ones. Indeed, a nested sequence of Morse complexes is found. From Theorem 8.25 in [2], there exist isomorphisms of persistent homology groups ensuring that the inclusions of Morse complexes are equal.
7. Discussion
The present work is aimed at analyzing the DMM in deep machine learning as far as the dimensionality of the kernels is concerned for the comparison of the different models.
It is proven by induction that the topological features of the HMM eventually descending from the latent states of a two-state MSM are not comparable with those of an N-state MSM with N ≥ 3, because of the different properties of the manifold on which the corresponding graphs stay, according to the analytical expressions of the eigenvalues; accordingly, not even a concept of distance between the models is well-posed.
In the present work, the dimensions of the possible matrices of the chains originating the transition probabilities of sequencing as 'partial matchings' are achieved by adding the needed blocks one at a time as a nested sequence of Morse complexes; in more detail, the order in which the Morse complexes are added is proven irrelevant for the construction of the block matrix: indeed, not only one square matrix can be taken as originating the sequencing. In more detail, there exists an infinite set of rectangular matrices, discriminated only according to their singular values, which can serve for the scope of sequencing.
The definition of the probability space allows one to perform all of the requested measurements. In more detail, the measure of the probability spaces allows one to calculate the distances; the choice of the sheaf allows one to qualify the filtration of the partial matchings.
As a result, the new qualities of the DNA sequences [12] can be analytically modelized, according to which the probability to find an amino-acid in a sequence depends on the site; in more detail, a further characterization is possible, according to which the probability to find a partial matching of amino-acids depends on the distances among the other amino-acids in the complete sequence.
Accordingly, the construction of the complete most-general rectangular-matrix chain is possible. The construction of the chain is demonstrated from the proof of Theorem 1: the partial matching is chosen from a profile Markov model, and the further items are added, which correspond to blocks of the matrix. The analytical managing of deletions is also performed. The analytical paradigm here developed is now apt for several analytical applications.
As a first instance, it is possible to compare Markov models for sequencing that are aimed at generalizing the two-item sequence and that have the same dimensions: as a result, it is possible to prove which ones originate from the Jukes–Cantor model and which do not, the proof being based on the analysis of the corresponding probability space.
As a further result, from the probability space, the metric space on which there lives the manifold of the graph associated with the Markov models can be determined; the determination of the metric space allows one to discern about the well-posedness of distances, i.e., as interrogated in [16]: the answer to the interrogation allows one to find the structures of the sequencing.
As an additional advantage, the search for hidden structures in the sequences is therefore possible; the relevance of the comparison with the Jukes–Cantor model is a long-standing interrogation, which is answered here after defining the suitable topology paradigm to be applied to extend the several two-state Markov models to a model with a higher number of states, i.e., starting from the three states for deep Markov models.
The data simulations for the papers analyzed in Appendix C can be described in a timely manner according to the newly introduced concepts of scalability and computational complexity, as introduced in [36,37]. A definition for scalability in this regard has been pointed out as not yet complete. The notions are not used here, because all the computations performed are analytical.
The following considerations can be developed.
In [38], 'transitions in the space of data' are made to correspond to 'transitions in the space of models': in this context, the notion of 'distance distribution functions' is introduced. In the present paper, the well-posedness of distances in the metric spaces worked out of the probability spaces defined after the filtrations avoids this misconception, according to our analytical derivations. In the case where the assistance of computer-based techniques is of interest, the pioneering work of [39] can be considered for further developments.
According to all the statements, the issue of inquiring about the stochastic stability of deep Markov models can be addressed. From [40], the deep Markov models from a multi-layer deep neural network are analyzed; it is observed that there exists a map between the transition probabilities and the emission probabilities: in the present work, the problem is overcome after the analytical definitions of the metric spaces, after which the 'stability' of the system is not stochastic.