*3.1. Triplet Markov Chain*

Consider two discrete stochastic processes $X = (X_1, \cdots, X_N)$ and $U = (U_1, \cdots, U_N)$ as hidden states, where $X_n \in \Lambda = \{1, \cdots, r\}$ and $U_n \in \Gamma = \{1, \cdots, \tau\}$, $n \in \{1, \ldots, N\}$. Let $Y = (Y_1, \cdots, Y_N)$ be a real-valued process representing the observation of the model, with each $Y_n \in \mathbb{R}^w$, where $w$ is the observation dimension. In this paper, the hidden state $X$ refers to the activity to be recognized, $U$ refers to the introduced gait or leg cycle, while the observation $Y$ is the features extracted from sensor readings. The details of how to apply the model to recognize lower limb locomotion activity are described in Section 3.4. Then, the triplet $T = (V, Y)$, with $V = (X, U)$, is a TMC if $T$ is Markovian. It should be noted here that, in the classic TMC, none of the processes $X$, $U$, $Y$, $(X, U)$, $(X, Y)$, $(U, Y)$ is necessarily Markovian.

Let the realizations of $X_n$, $U_n$, and $Y_n$ be denoted by their lower cases $x_n$, $u_n$, and $y_n$, respectively, so that $v_n = (x_n, u_n)$ and $t_n = (v_n, y_n)$. In addition, for simplicity, we abbreviate probabilities such as $p(X_n = x_n, U_n = u_n | Y_1 = y_1, \cdots, Y_N = y_N)$ by $p(x_n, u_n | y_1^N)$, and $p(X_n = x_n, U_n = u_n | Y_1 = y_1, \cdots, Y_n = y_n)$ by $p(x_n, u_n | y_1^n)$, for example. In a TMC, the transition $p(t_{n+1} | t_n)$ can be factored in different forms; let us consider the following one:

$$p\left(t_{n+1}|t_n\right) = p\left(v_{n+1}|v_n, y_n\right)p\left(y_{n+1}|v_{n+1}, v_n, y_n\right).\tag{1}$$

In the application of this paper we will assume that $p(v_{n+1} | v_n, y_n) = p(v_{n+1} | v_n)$ and $p(y_{n+1} | v_{n+1}, v_n, y_n) = p(y_{n+1} | v_{n+1})$. The transition is thus simplified to

$$p\left(t_{n+1}|t_n\right) = p\left(v_{n+1}|v_n\right)p\left(y_{n+1}|v_{n+1}\right),\tag{2}$$

which provides process $T$ with the structure of a classical HMC. For simplicity, this simplified TMC is referred to as TMC in the remainder of the paper. The first term $p(v_{n+1}|v_n)$ in Equation (2) is the state transition probability; the corresponding matrix has dimension $(r \times \tau) \times (r \times \tau)$. The second term is the probability of observing $y_n$ conditionally on each state. Most of the time, this kind of density is modeled by Gaussian distributions:

$$p\left(y_n|v_n=i\right) \sim \mathcal{N}\left(\mu_i, \Sigma_i\right), \quad i \in \Lambda \times \Gamma,\tag{3}$$

where $\mu_i$ and $\Sigma_i$ are the mean vector and covariance matrix. The dependency graph of this particular TMC is shown in Figure 1a. Apart from the probabilistic links inside the nodes related to $V$, the dependency between $Y$ and $V$ is exactly of the HMC form.
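To make the composite state space concrete, the following is a minimal numpy sketch, not the authors' implementation: the sizes `r` and `tau`, the index helpers `to_index`/`to_pair`, and the randomly filled parameters are all illustrative assumptions.

```python
import numpy as np

# Hypothetical sizes (illustrative values, not from the paper).
r, tau = 3, 4          # |Lambda| activities, |Gamma| gait-cycle phases
S = r * tau            # number of composite states v = (x, u)

def to_index(x, u):
    """Map a pair (x, u) in Lambda x Gamma to a single composite index."""
    return x * tau + u

def to_pair(i):
    """Inverse map: composite index back to the pair (x, u)."""
    return divmod(i, tau)

# The state transition matrix p(v_{n+1} | v_n) has dimension
# (r*tau) x (r*tau); here it is filled with an arbitrary stochastic matrix.
rng = np.random.default_rng(0)
A = rng.random((S, S))
A /= A.sum(axis=1, keepdims=True)   # each row sums to 1

# Per-state Gaussian emission parameters (mu_i, Sigma_i) of Equation (3),
# for a w = 2 dimensional observation, initialized arbitrarily.
w = 2
mu = rng.standard_normal((S, w))
Sigma = np.stack([np.eye(w)] * S)

def emission_density(y, i):
    """Evaluate p(y_n | v_n = i) under the Gaussian model of Equation (3)."""
    diff = y - mu[i]
    inv = np.linalg.inv(Sigma[i])
    norm = np.sqrt((2 * np.pi) ** w * np.linalg.det(Sigma[i]))
    return float(np.exp(-0.5 * diff @ inv @ diff) / norm)
```

Storing $v_n = (x_n, u_n)$ as one flat index is what lets the standard HMC machinery run unchanged on the TMC of Equation (2).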

To obtain the probabilities of the individual states $x_n$ and $u_n$ conditioned on $y_1^n$ and $y_1^N$, we only need to marginalize $p(x_n, u_n | y_1^n)$ and $p(x_n, u_n | y_1^N)$. Indeed, we have

$$\begin{aligned} p(x_n|y_1^n) &= \sum_{u_n} p(x_n, u_n|y_1^n), \\ p(x_n|y_1^N) &= \sum_{u_n} p(x_n, u_n|y_1^N). \end{aligned} \tag{4}$$
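The marginalization in Equation (4) is a sum over one axis of the joint posterior; a small sketch with made-up numbers (the `joint` array is purely illustrative):

```python
import numpy as np

# Hypothetical joint posterior p(x_n, u_n | y_1^N) stored as an r x tau
# array (rows indexed by x_n, columns by u_n); entries sum to 1.
joint = np.array([[0.10, 0.20],
                  [0.30, 0.40]])   # r = tau = 2, invented values

# Equation (4): sum out u_n to get p(x_n | y_1^N);
# symmetrically, sum out x_n to get p(u_n | y_1^N).
p_x = joint.sum(axis=1)
p_u = joint.sum(axis=0)
```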

Likewise, $p(u_n|y_1^n)$ and $p(u_n|y_1^N)$ can be obtained by summing over $x_n$. The quantities $p(x_n, u_n|y_1^n)$ and $p(x_n, u_n|y_1^N)$ are the posterior probabilities of the states given the observations $y_1^n$ and $y_1^N$; they are commonly called the filtering probability and the smoothing probability, respectively. The same interpretation holds for $p(x_n|y_1^n)$, $p(x_n|y_1^N)$, $p(u_n|y_1^n)$, and $p(u_n|y_1^N)$. Then, the estimated hidden states are obtained via the MPM (Maximum Posterior Mode) criterion using the smoothing probabilities:

$$\begin{aligned} \hat{x}_n &= \arg\max_{x_n \in \Lambda} p\left(x_n|y_1^N\right), \\ \hat{u}_n &= \arg\max_{u_n \in \Gamma} p\left(u_n|y_1^N\right). \end{aligned} \tag{5}$$
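Since the simplified TMC has the structure of a classical HMC over the composite states, the filtering and smoothing probabilities can be computed by the standard forward-backward recursions, and Equation (5) then reduces to two argmax operations on the marginalized smoothing probabilities. The sketch below is one possible implementation under that assumption; the function names `forward_backward` and `mpm` are hypothetical, and the toy values of `A`, `pi`, and `B` in the usage example are invented.

```python
import numpy as np

def forward_backward(A, pi, B):
    """Filtering and smoothing for an HMC over S composite states v = (x, u).

    A:  (S, S) transition matrix p(v_{n+1} | v_n)
    pi: (S,)   initial distribution p(v_1)
    B:  (N, S) emission likelihoods p(y_n | v_n = i) per time step
    Returns (alpha, gamma): filtering p(v_n | y_1^n) and smoothing p(v_n | y_1^N).
    """
    N, S = B.shape
    alpha = np.zeros((N, S))
    alpha[0] = pi * B[0]
    alpha[0] /= alpha[0].sum()                  # normalize each step for stability
    for n in range(1, N):
        alpha[n] = (alpha[n - 1] @ A) * B[n]
        alpha[n] /= alpha[n].sum()
    beta = np.ones((N, S))
    for n in range(N - 2, -1, -1):
        beta[n] = A @ (B[n + 1] * beta[n + 1])
        beta[n] /= beta[n].sum()
    gamma = alpha * beta                        # proportional to p(v_n | y_1^N)
    gamma /= gamma.sum(axis=1, keepdims=True)
    return alpha, gamma

def mpm(gamma, r, tau):
    """MPM estimates of x_n and u_n (Equation (5)) from smoothed marginals."""
    g = gamma.reshape(-1, r, tau)               # unfold composite index into (x, u)
    x_hat = g.sum(axis=2).argmax(axis=1)        # argmax_x of p(x_n | y_1^N), Eq. (4)
    u_hat = g.sum(axis=1).argmax(axis=1)        # argmax_u of p(u_n | y_1^N)
    return x_hat, u_hat
```

A quick check with $r = \tau = 2$ (so $S = 4$), a near-diagonal transition matrix, and emissions that strongly favor the first composite state at every step should recover $\hat{x}_n = \hat{u}_n = 0$ for all $n$.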
