Article

The Waiting Time Distribution of Competing Patterns in Markov-Dependent Bernoulli Trials

Department of Industrial Engineering and Management, Ariel University, Ariel 40700, Israel
* Author to whom correspondence should be addressed.
Axioms 2025, 14(3), 221; https://doi.org/10.3390/axioms14030221
Submission received: 31 December 2024 / Revised: 23 February 2025 / Accepted: 28 February 2025 / Published: 17 March 2025

Abstract:
Competing patterns are compound patterns that compete to be the first to occur a pattern-specific number of times, known as a stopping rule. In this paper, we study a higher-order Markovian dependent Bernoulli trials model with competing patterns. The waiting time distribution refers to the distribution of the number of trials required until the stopping rule is met. Based on a finite Markov chain, a hierarchical algorithm is proposed to derive the conditional probability generating function (pgf) of the waiting time of the competing patterns model. By applying the law of total expectation, the final pgf is then obtained. Using examples, we further demonstrate that the proposed algorithm is an effective and easy-to-implement tool.
MSC:
60-08; 60J10; 90-10

1. Introduction and Literature Review

The waiting time distribution for a sequence of trials refers to the distribution of the number of trials required until a specific stopping rule is satisfied. A series of identical trials is referred to as a “run”, while a general sequence is called a “pattern”. The waiting times of a sequence of trials have been extensively studied with different stopping rules. Most of these studies employ combinatorial analysis, assuming a sequence of independent and identically distributed (i.i.d.) trials, each of which ends in two or more outcomes.
The most basic model dates back to i.i.d. Bernoulli trials (each resulting in either a success “1” or a failure “0”), where the stopping rule is the occurrence of the first success (“1”). Here, the waiting time is a geometrically distributed random variable (r.v.). When the stopping rule is extended to the k-th success, the waiting time is a Negative Binomial (NB) r.v. More complex stopping rules include specific sequences, combinations of multiple stopping rules, multi-state trials, and Markovian models (with dependent trials).
The waiting time distributions of runs and patterns, such as geometric, geometric of order k, negative binomial, and sooner and later, have been successfully applied in numerous areas of statistics and applied probability. In recent decades, the theory of waiting time distributions has become an indispensable tool for studying various applications, including DNA sequence homology (Schwager [1], Karwe and Naus [2,3]), epidemiology (Kulldorff [4]), and system reliability (Aki [5], Aki and Hirano [6], Chang and Huang [7]). We will now present some specific examples.
Studying the distribution of patterns assists in DNA sequence analysis by modeling the occurrence of specific motifs, repeats, or base combinations. Each nucleotide position is treated as a Bernoulli trial, allowing researchers to detect biologically significant patterns such as transcription factor binding sites, identify mutation hotspots, and study codon usage bias. It also aids in evaluating the statistical significance of sequence alignments and testing whether observed patterns are random or functionally important, providing insights into DNA structure, function, and evolution. In this regard, Kulldorff [4] investigated the occurrences of sudden infant death syndrome or birth defects using the Bernoulli model. In psychology, Schwager [1] explored the concept of “success breeds success” or “failure breeds failure” which is applied in achievement testing, animal learning studies, athletic competition, and study performance improvement. He further demonstrated that the behavior of groups of people forming lines and other structures can be modeled as a Markov-dependent sequence of trials in which some characteristic, such as the sex of the individual, is taken as a trial outcome. Dafnis [8] demonstrated a practical application for the waiting time distribution of binary trials in the topic of meteorology and agriculture, by considering the cultivation of raisins. The harvesting of certain varieties of raisins in Greece must occur between August and September. For the harvesting, a period of at least four consecutive dry days is required. Before this period begins, a period of at least two consecutive rainy days is required to water the raisins. Dafnis [8] used “0” to denote the occurrence of a rainy day, and “1” to denote the occurrence of a dry one. The probabilities of “0” and “1” were estimated using previous years’ statistics.
In the field of agriculture, the concept of r-weak runs distribution was applied to identify the rate of development of crops (which recognizes that plant development will occur only when the temperature exceeds a specific base temperature for a certain number of days), and to investigate the impact of critical factors on honeydew honey production (Dafnis et al. [9], Dafnis et al. [10]). Drawing our attention to reliability studies, we mention the consecutive k-out-of-n systems. In these systems, each component is either working or failed, and the overall system functions only when at least k consecutive components are working within the total n components in the sequence. Dafnis et al. [11] studied the k-out-of-n system that fails if and only if a string of k non-functioning components is interrupted by a string of at most r consecutive functioning ones. They showed that there are $(r+1)^{k-1}$ different patterns of appearance that can cause a system failure. In the financial field, Dafnis and Makri [12] showed how the concept of r-weak runs can be adapted by financial advisors and technical analysts for the determination of a personalized and effective investing strategy with controlled risk.
For the literature review, we date back to Feller [13], who studied the probability generating function (pgf) for the waiting time distribution of a sequence of i.i.d. Bernoulli trials where the stopping condition is the first occurrence of a series of k consecutive successes. Philippou et al. [14] extended Feller [13] by studying a geometric distribution of order k, defined as the waiting time distribution until k consecutive successes occur. Aki [5] generalized the model to geometric, NB, Poisson, logarithmic series, and binomial distributions of order k with dependencies. Philippou and Makri [15] studied the binomial distribution of order k in a finite number of Bernoulli trials. Philippou [16] studied multi-state trials, and Ling [17] extended the k-th ordered geometric model with parameter p to $(k_1, \dots, k_m)$-order with parameters $(p_1, \dots, p_n)$; the exact distributions and the pgfs were further derived for some special cases of $(k_1, \dots, k_m)$. Shmueli and Cohen [18] used recursive formulas to compute the exact probability functions of a model with switching rules. Koutras and Eryilmaz [19] considered a compound geometric distribution of order k determined by another random process, such as Poisson or binomial.
The above literature assumes that the experiment ends after the first k consecutive successes. A more general stopping rule is to stop the experiment with the first occurrence of a particular sequence of i.i.d. Bernoulli experiments, known as a pattern. A pattern has a specific sequence and may include different symbols. In this context, Blom and Thorburn [20] analyzed the waiting time distribution until a k-digit sequence is obtained or, more generally, until one of several k-digit sequences is obtained. For the latter case, the mean waiting time was also derived. Ebneshahrashoob and Sobel [21] studied a generalized pgf, means and variances for the waiting time until obtaining a sequence of $s$ successes or a sequence of $r$ failures, whichever comes sooner. Huang [22] introduced a generalized stopping $(k_1, k_2)$-rule as the occurrence of $k_1$ consecutive failures followed by $k_2$ consecutive successes. Dafnis et al. [8], Makri [23] and Kumar and Upadhye [24] considered different types of $(k_1, k_2)$-rules. Zhao et al. [25], Kong [26] and Chadjiconstantinidis and Eryilmaz [27] studied the distributions of $(k_1, \dots, k_m)$-runs of multi-state trials.
So far, we have focused on the distribution of the first instance of a pattern, or the $(k_1, \dots, k_m)$-type model. We next review the literature dealing with the distribution of the r-th occurrence of a pattern. Aki [28] investigated the waiting time distribution until the r-th occurrence of a pattern in a sequence of i.i.d. Bernoulli trials. Koutras [29] derived several moments of the waiting time for the non-overlapping appearance of a pair of successes separated by at most $k-2$ failures ($k > 2$). Robin and Daudin [30] obtained the distribution of distances between two successive occurrences of a specific pattern, as well as between the n-th and the $(n+m)$-th occurrences. Aki and Hirano [31] further explored a two-dimensional pattern model.
Focusing on models under the assumption of Markov dependency, Hirano and Aki [32] studied the distribution of the number of success runs of length k until the n-th trial in a two-state dependent Markov chain. Fu and Koutras [33] presented an approach for the distribution theory of runs based on a finite Markov chain embedding technique that covers identical and nonidentical Bernoulli trials. In addition, the exact distribution of the waiting time for the m-th occurrence of a specific run, and the distribution of the number of success runs of length at least k, were derived. The number of failures, successes, and the first consecutive k successes were studied in Aki and Hirano [31]. Using non-overlapping counting, Fu [34] studied the multi-state trials model. Koutras [35] developed a general technique for the waiting time distribution in a two-state dependent Markov chain. Antzoulakos [36] introduced a variation of the finite Markov chain embedding method to derive the pgf until the r-th occurrence of a pattern, considering both non-overlapping and overlapping cases. Fisher and Cui [37] introduced a mathematical framework for determining the expected time for a specific pattern to emerge in higher-order Markov chains, both with and without a predefined starting sequence. This approach was extended to the calculation of the first occurrence of any pattern from a collection, along with the probability that each individual pattern is the first to appear. Chang et al. [38] studied the dual relationship between the probability of the number of patterns and the probability of the waiting time in a sequence of multi-state trials.
We next focus on models with stopping rules that involve a few simple patterns, known as the compound patterns rule, where the stopping rule is triggered by the first appearance of one of the patterns. Applying pgf techniques, Fu and Chang [39] and Han and Hirano [40] investigated multi-state trial models with compound patterns for both i.i.d. trials and first-order Markov-dependent trials. For the r-th order Markov-dependent chain, Fu and Lou [41] examined the waiting time distributions of the first occurrence of simple and compound patterns in sequences of Bernoulli trials. Wu [42] applied a finite Markov chain embedding technique to analyze the conditional waiting time distributions. Using a matrix form, Aston and Martin [43] presented an algorithm that computes the distribution of the waiting time until the m-th occurrence of a compound pattern. For an excellent summary of various calculation methods, we refer readers to the comprehensive books by Fu and Lou [44] and Balakrishnan and Koutras [45].
Our research addresses models with a competing pattern stopping rule. Here, the experiment ends when a simple or compound pattern occurs a specific number of times (note that the compound pattern rule is a special case, where the experiment ends after the first occurrence). For a real-life example, consider again the cultivation of raisins (Dafnis [8]). To obtain raisins of fine quality, grapes need successions of short rainy and dry periods. Thus, an agriculturalist cares about the frequent occurrence of patterns with at most 2 consecutive rainy days followed by at most 2 consecutive dry days. Agriculturalists further claim that the occurrence of at least five such patterns in the three months has a significant effect on the quality of raisins.
Another real-life example is taken from Dafnis et al. [9]. They studied the effect of the number of cold days on the life cycle of Marchalina hellenica. They showed that the number of cold periods (each of which is a run of k consecutive cold days) is a more critical factor to the completion time of the insect’s biological cycle than the total number of cold days.
For models with competing patterns, Aston and Martin [43] investigated the waiting time distribution for competing patterns in m-th order dependent multi-state Markov trials. They analyzed several compound patterns, each associated with a specific required number of occurrences. Martin and Aston [3] introduced the generalized later patterns model, in which all patterns must appear multiple times. We also mention the closely related sooner and later models. The sooner waiting time distribution captures the number of trials required for the first occurrence of one of two competing patterns; conversely, the later waiting time distribution refers to the number of trials required for both patterns to occur (see, e.g., Han and Hirano [40], Balakrishnan and Koutras [45]).
For other related models dealing with waiting time distributions under the Markovian assumption, we mention the hidden Markov processes (Aston and Martin [46]), in which the states are not directly observable, or the sparse Markov process, in which the transition probability matrix includes many zero or near-zero entries (see, e.g., Martin [47,48]). Dafnis [10] studied the model with independent but not necessarily identically distributed trials. Michael [49] and Vaggelatou [50] presented a framework for a continuous time Markov chain. Dafnis et al. [9] introduced the r-weak run of length at least k in a sequence of binary trials. Their model was extended to include minimum and maximum constraints (Dafnis and Makri [12]). In a series of papers, Makri ([51,52]) investigated a sequence of binary trials with a specific length threshold.
Our paper contributes to this literature by deriving simple and closed-form expressions for the pgf of waiting time distributions associated with higher-order Markovian-dependent Bernoulli trials with competing patterns. To the best of our knowledge, the studies presented to date are based on Markov chain-embedding techniques, where the state space is large, leading to a complicated transition probability matrix and high computational complexity, which makes them difficult to implement. The suggested algorithm is based on a hierarchical approach and includes three steps. In the first step, the framework is designed by including the state space, stopping rules, and steady-state probabilities. The second step determines all the paths that terminate the experiment; each such path is divided into sub-paths (components), where the total pgf is the product of the pgfs of these sub-paths. The last step derives the pgf of each sub-path by considering whether the path is longer or shorter than the number of trials of the longest pattern. The final pgf is obtained by applying the law of total expectation.
The rest of the paper is organized as follows. Section 2 introduces the definitions and preliminaries to be used. Using a hierarchical approach, Section 3 provides a mathematical description of the model and derives the pgf of the waiting time; this derivation is demonstrated by an example. A summarizing algorithm and additional examples are provided in Section 4. Finally, Section 5 concludes the paper and suggests some future directions. Following convention, we indicate vectors by bold letters and matrices by blackboard bold letters. We let $1\{A\}$ be the indicator of an event $A$, $\mathbf{e} = (1, \dots, 1)^T$ be the column vector of all ones, and $I$ be the identity matrix, all of the appropriate dimensions. We use $|A|$ to denote the number of elements in a set $A$. Summarizing our abbreviations, we use pgf(s) for probability generating function(s), i.i.d. for independent and identically distributed, and r.v.(s) for random variable(s).

2. Definitions and Preliminaries

We use the terminology of Fu and Lou [41] and Aston and Martin [43].
Bernoulli trial. Consider a sequence of Bernoulli trials $X_1, X_2, \dots$, where each trial (symbol) $X_t$ has two possible outcomes, success and failure (1 and 0, respectively), with $p(X_t = 1) = p$ and $p(X_t = 0) = 1 - p = q$. Let $S$ be the state space of an individual result $X_t$, i.e., $S = \{0, 1\}$.
A simple pattern. A simple pattern $\Lambda$ is a specific sequence of $k$ trials (symbols) $x_{i,1}, x_{i,2}, \dots, x_{i,k}$ from $S$, i.e., $\Lambda = (x_{i,1}, \dots, x_{i,k})$. The waiting time random variable of a simple pattern, $W(\Lambda)$, is defined as
$$W(\Lambda) = \inf\{n \in \mathbb{N} : n \ge k,\ X_{n-k+1} = x_{i,1}, \dots, X_n = x_{i,k}\},$$
i.e., $W(\Lambda)$ counts the number of trials until the first occurrence of pattern $\Lambda$. In the following, we assume that we start at $t = 1$.
A compound pattern $\Lambda = \bigcup_{i=1}^{l} \Lambda_i$ is the union of $l$ simple patterns. We use $\Lambda_i \cup \Lambda_j$ to denote the occurrence of either pattern $\Lambda_i$ or pattern $\Lambda_j$. Accordingly, the waiting time $W(\Lambda)$ is defined as
$$W(\Lambda) = \inf\{n \in \mathbb{N} : \text{the occurrence of any pattern } \Lambda_1, \dots, \Lambda_l \text{ during the } n \text{ trials}\}.$$
Note that $W(\Lambda) = \inf\{W(\Lambda_1), W(\Lambda_2), \dots, W(\Lambda_l)\}$.
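As a quick illustration (a sketch of ours, not from the paper), the identity $W(\Lambda) = \min_i W(\Lambda_i)$ can be checked by scanning a fixed binary sequence; the sequence and the two simple patterns below are arbitrary examples.

```python
# Hypothetical illustration: the waiting time of a compound pattern is the
# minimum of the waiting times of its simple patterns.
def waiting_time(seq, pattern):
    """Smallest n (1-indexed) such that `pattern` ends at trial n, else None."""
    k = len(pattern)
    for n in range(k, len(seq) + 1):
        if seq[n - k:n] == pattern:
            return n
    return None

seq = "0100110111010"            # an arbitrary sequence of trials
simple = ["00", "111"]           # simple patterns forming the compound pattern
w_simple = [waiting_time(seq, pat) for pat in simple]
w_compound = min(w for w in w_simple if w is not None)
print(w_simple, w_compound)      # [4, 10] 4
```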
A (discrete) Markov chain. A discrete-time Markov chain is a sequence of random variables $X_1, X_2, X_3, \dots$ with the Markov property, namely that the probability of moving to the next state depends only on the present state and not on the previous states:
$$p(X_{t+1} = x_{t+1} \mid X_1 = x_1, \dots, X_t = x_t) = p(X_{t+1} = x_{t+1} \mid X_t = x_t)$$
if both conditional probabilities are well defined, that is, if $p(X_1 = x_1, \dots, X_t = x_t) > 0$. The possible values of $X_i$ form a countable set $S$, called the state space of the chain. Time-homogeneous Markov chains are processes where $p(X_{t+1} = y \mid X_t = x) = p(X_t = y \mid X_{t-1} = x)$ for all $t$, i.e., the transition probability is independent of $t$ (for more details, see Feller [13] and Hirano and Aki [32]).
An r-th order Markov chain. Let $\{X_t\}$ be a sequence of irreducible, aperiodic, and homogeneous $r$-th order Markov-dependent $m$-state random variables (trials) defined on the state space $S = \{b_1, b_2, \dots, b_m\}$ (when $m = 2$, we have a Bernoulli trial). For $r \ge 1$, the set of all possible $r$-tuples ($m^r$ combinations) $S^r$ is given by
$$S^r = \{\mathbf{x} = (x_1 \cdots x_r) : x_i \in S,\ i = 1, \dots, r\}.$$
The $r$-order transition probabilities of the Markov chain $\{X_t\}$ are defined by:
$$P(b \mid x_1, \dots, x_r) = p(X_t = b \mid X_{t-r} = x_1, \dots, X_{t-1} = x_r) = p_{\mathbf{x},b}, \quad \mathbf{x} \in S^r,\ b \in S,$$
which is independent of $t$. The steady-state probability vector $\boldsymbol{\pi} = (\pi_{\mathbf{x}} : \mathbf{x} \in S^r)$ exists and satisfies $\boldsymbol{\pi}\mathbb{A} = \boldsymbol{\pi}$, $\boldsymbol{\pi}\mathbf{e} = 1$, where $\mathbb{A}$ is the $(m^r \times m^r)$ transition probability matrix.
In this work, we consider Bernoulli trials with m = 2 .
Result 1. Let $\{X_t\}$ be an $r$-th order Markov chain. Fu and Lou [41] showed that there exists an embedded finite Markov chain $\{Y_t\}$ defined on a state space $\Omega = \Omega(\Lambda) \cup \{\alpha\}$, where $\alpha$ is the absorbing state. Accordingly, the transition probability matrix has the block form
$$\mathbb{M} = \begin{pmatrix} \mathbb{N} & C \\ \mathbf{0} & 1 \end{pmatrix},$$
where the first block of rows corresponds to $\Omega(\Lambda)$ and the last row to $\alpha$. (Note that $\mathbb{N}\mathbf{e} + C = \mathbf{e}$, i.e., the sum of the probabilities in each row of $\mathbb{M}$ is equal to 1.) The waiting time of a pattern (simple or compound) $W(\Lambda)$ has a general geometric distribution
$$p(W(\Lambda) = n) = \boldsymbol{\xi}\,\mathbb{N}^{n-r-1} (I - \mathbb{N})\,\mathbf{e} \quad \text{and} \quad p(W(\Lambda) \ge n) = \boldsymbol{\xi}\,\mathbb{N}^{n-r-1}\,\mathbf{e}, \quad n \ge r + 1.$$
($\boldsymbol{\xi}$ is the initial distribution, and $\mathbb{N}$ is known as the essential transition probability matrix.) For more details, see Lemma 3.1 and Theorems 3.1–4.2 of Fu and Lou [41].
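As a sketch of Result 1 (our own construction, not from the paper), consider the simple pattern $\Lambda = 11$ with i.i.d. trials, treated as the order $r = 0$ case so that the exponent $n - r - 1$ becomes $n - 1$. The embedded chain has two non-absorbing states, "no progress" and "one success seen":

```python
import numpy as np

p, q = 0.5, 0.5                       # illustrative success/failure probabilities
N = np.array([[q, p],                 # "no progress": a 0 stays, a 1 advances
              [q, 0.0]])              # "one 1 seen": a 0 resets, a 1 absorbs (11 found)
xi = np.array([1.0, 0.0])             # initial distribution: no progress
e = np.ones(2)
I = np.eye(2)

# p(W = n) = xi N^(n-1) (I - N) e, the general geometric form of Result 1
pmf = {n: xi @ np.linalg.matrix_power(N, n - 1) @ (I - N) @ e
       for n in range(1, 200)}
mean = sum(n * pr for n, pr in pmf.items())
print(round(pmf[2], 4), round(mean, 4))  # 0.25 and E[W] = 1/p + 1/p^2 = 6.0
```

The probabilities match the direct combinatorial values, e.g., $p(W = 2) = p^2$ and $p(W = 3) = qp^2$.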
Competing patterns. Let $\{\Lambda^{(1)}, \dots, \Lambda^{(c)}\}$, $c \ge 1$, be a set of $c$ compound patterns. Let $n_i$ denote the number of occurrences of the compound pattern $\Lambda^{(i)}$ that terminates the experiment. The patterns $\{\Lambda^{(1)}, \dots, \Lambda^{(c)}\}$ are called competing patterns. We assume that no two competing compound patterns are identical.
Let $C_i(n)$, $i = 1, \dots, c$, be the event that, by time $n$, the compound pattern $\Lambda^{(i)}$ has occurred $n_i$ times. Then, $p(\bigcup_{i=1}^{c} C_i(n))$ is the waiting time probability function of the competing patterns.
We further note that two distinct methods of counting patterns are considered in the literature (Inoue and Aki [53]). (1). Non-overlapping counting. In this case, when a pattern occurs, the counting restarts from that point, and any partially completed pattern cannot be finished. (2). Overlapping counting. In this case, partially completed patterns can be finished at any time, regardless of whether another pattern has been completed after the partially completed pattern starts but before it is completed.
Furthermore, we have two more definitions (Aston and Martin [43]): Ending blocks of a simple pattern $\Lambda = (b_{i_1}, \dots, b_{i_k})$ are sub-patterns of the form $b_{i_1}, \dots, b_{i_q}$, where $q \in \{1, 2, \dots, k-1\}$. Ending blocks always start at the beginning of a simple pattern but end at any point before its last symbol. Finishing blocks of the simple pattern $\Lambda = (b_{i_1}, \dots, b_{i_k})$ are sub-patterns of the form $b_{i_\zeta}, \dots, b_{i_k}$, where $\zeta \in \{1, 2, \dots, k\}$. Finishing blocks may start at any point but always end with the last symbol. The finishing blocks of a compound pattern or competing patterns are formed by taking the union of the finishing blocks of their respective components.
Result 2. Consider the competing patterns $\{\Lambda^{(1)}, \dots, \Lambda^{(c)}\}$. Aston and Martin [43] showed that the waiting time distribution for competing compound patterns has a geometric form. Specifically, the competing pattern experiment ends with compound pattern $\Lambda^{(i)}$ having occurred $n_i$ times if and only if the Markov chain $\{Y_t\}$ is absorbed in the corresponding absorbing state. Therefore, the probability function of the waiting time is given by
$$P(W(\Lambda) = n) = p\left(\bigcup_{i=1}^{c} C_i(n)\right) = \boldsymbol{\psi}_0^T \mathbb{T}_Y^{\,n}\,\mathbf{e}_\Lambda,$$
where $\boldsymbol{\psi}_0$ is the initial probability vector of $\{Y_t\}$, $\mathbb{T}_Y$ is the transition probability matrix, and $\mathbf{e}_\Lambda$ is a column vector with 1s in the positions corresponding to the absorbing states and 0s elsewhere (see Section 3.2 of Aston and Martin [43]).

Probability Generating Function

The probability generating function (pgf) is a useful technique for computing distributions (see, e.g., Feller [13], Chapter XI). As a short background, for a non-negative discrete random variable $Y$, the pgf $G_Y(z)$ is defined as
$$G_Y(z) = E(z^Y) = \sum_{j=0}^{\infty} p(Y = j)\, z^j$$
for all $z \in \mathbb{R}$ for which the sum converges. The pgf is then a power series and obeys all the rules obeyed by power series with non-negative coefficients. The probabilities $p(Y = j)$ are the coefficients of the power series, and may be recovered through series expansion or by taking derivatives of $G_Y$ with respect to $z$.
Pgfs are especially useful for computing the distributions of sums of random variables, as well as moments and factorial moments. We note that, for independent random variables $V_1, \dots, V_n$, the pgf of $Y = V_1 + \cdots + V_n$ is given by
$$G_Y(z) = \prod_{i=1}^{n} G_{V_i}(z).$$
Together with the uniqueness of the pgf, this makes it a helpful tool for determining the sampling distribution of interest.
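For instance (a small sketch with an arbitrary parameter, not from the paper), multiplying pgfs corresponds to convolving pmfs: the sum of two independent geometric r.v.s has the NB pmf $\binom{j-1}{1} p^2 q^{j-2}$.

```python
from math import comb, isclose

p = 1 / 3
q = 1 - p
M = 40                                                   # truncation point for the sketch
geo = [0.0] + [p * q ** (j - 1) for j in range(1, M)]    # geometric pmf on 1..M-1

# multiplying pgfs <-> convolving pmfs: pmf of V1 + V2
conv = [sum(geo[i] * geo[j - i] for i in range(j + 1)) for j in range(M)]

for j in range(2, 10):
    # coefficient of z^j in G_geo(z)^2 equals the negative binomial pmf
    assert isclose(conv[j], comb(j - 1, 1) * p ** 2 * q ** (j - 2))
```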

3. Competing Patterns in High-Order Markov-Dependent Bernoulli Trials

The derivation of the probability generating function includes three steps. In the first step, we build the mathematical framework: the design and settings of the experiment. The second step determines the paths that terminate the experiment. Applying tools from probability theory and Markov chains, the third step derives the pgf of the number of trials (waiting time) for each path. All steps are illustrated by a running example.

3.1. Step 1. Experimental Design and Settings

Consider the $r$-th order Markov-dependent Bernoulli trials $\{X_i\}$, $i = 1, 2, \dots$, on the state space (with $2^r$ combinations)
$$S^r = \{\mathbf{x} = (x_1 \cdots x_r) : x_i \in \{0, 1\},\ i = 1, \dots, r\}.$$
The $r$-th order transition probabilities are given by:
$$p(X_t = y \mid X_{t-r} = x_1, \dots, X_{t-1} = x_r) = p_{\mathbf{x},y}, \quad \mathbf{x} = (x_1, x_2, \dots, x_r) \in S^r,\ x_i, y \in S = \{0, 1\},$$
independent of $t$.
Clearly, $\sum_{y \in S} p_{\mathbf{x},y} = 1$ for all $\mathbf{x} \in S^r$. In matrix form, let $\mathbb{A}$ be the $|S^r| \times |S^r|$ square matrix with elements $\mathbb{A}_{\mathbf{x},\mathbf{x}'}$ given by
$$\mathbb{A}_{\mathbf{x},\mathbf{x}'} = \begin{cases} p_{\mathbf{x},y} & \mathbf{x} = (x_1, x_2, \dots, x_r),\ \mathbf{x}' = (x_2, \dots, x_r, y), \\ 0 & \text{otherwise.} \end{cases}$$
Associated with the sequence are the initial probabilities
$$\boldsymbol{\pi} = \pi(x_{-r+1}, \dots, x_0) = p(X_{-r+1} = x_{-r+1}, \dots, X_0 = x_0).$$
(Note that $\mathbb{A}\mathbf{e} = \mathbf{e}$, and the vector $\boldsymbol{\pi} = \{\pi_{\mathbf{x}},\ \mathbf{x} \in S^r\}$ satisfies the system of equations $\boldsymbol{\pi}\mathbb{A} = \boldsymbol{\pi}$, $\boldsymbol{\pi}\mathbf{e} = 1$.) Let $\Lambda^{(i)} = \{\Lambda_1^{(i)}, \dots, \Lambda_{k_i}^{(i)}\}$, $i = 1, \dots, c$, be a compound pattern that includes $k_i$ simple patterns, each of which has size $l_{i,j} = |\Lambda_j^{(i)}|$. We assume that $l_{i,j} \ge r$. Let $n_i$ denote the number of non-overlapping occurrences of $\Lambda^{(i)}$ needed for the termination of the experiment, and let $l_i = \max_{j=1,\dots,k_i} \{l_{i,j}\}$, $i = 1, \dots, c$. Let $\Lambda = \{\Lambda^{(i)}\}_{i=1}^{c}$ be the set of competing patterns, and let $W_\Lambda$ denote the waiting time random variable (number of trials) until the experiment terminates, given the steady-state environment. We assume non-overlapping counting.
Example 1.
Our base case considers second-order Markov-dependent Bernoulli trials, i.e., $r = 2$ and $x_i \in \{0, 1\}$. The transition probabilities $p_{\mathbf{x},y}$, $\mathbf{x} = (x_1, x_2)$, $x_i, y \in \{0, 1\}$, are:
$$p_{00,0} = p_1,\ p_{00,1} = p_2,\ p_{10,0} = p_3,\ p_{10,1} = p_4,\ p_{01,0} = p_5,\ p_{01,1} = p_6,\ p_{11,0} = p_7,\ p_{11,1} = p_8.$$
In matrix form (rows and columns ordered $00, 01, 10, 11$),
$$\mathbb{A} = \begin{pmatrix} p_1 & p_2 & 0 & 0 \\ 0 & 0 & p_5 & p_6 \\ p_3 & p_4 & 0 & 0 \\ 0 & 0 & p_7 & p_8 \end{pmatrix}.$$
(Note that $p_i + p_{i+1} = 1$, $i = 1, 3, 5, 7$.) Here, the steady-state probability vector $\boldsymbol{\pi} = (\pi_{00}, \pi_{01}, \pi_{10}, \pi_{11})$ satisfying $\boldsymbol{\pi}\mathbb{A} = \boldsymbol{\pi}$, $\boldsymbol{\pi}\mathbf{e} = 1$ is given by:
$$\pi_{00} = \frac{p_3 p_7}{D}, \quad \pi_{01} = \pi_{10} = \frac{(1 - p_1)\, p_7}{D}, \quad \pi_{11} = \frac{p_1 p_5 - p_1 - p_5 + 1}{D},$$
where $D = p_1 p_5 - 2 p_1 p_7 + p_3 p_7 - p_1 - p_5 + 2 p_7 + 1$.
We assume two competing patterns ($c = 2$): $\Lambda^{(1)} = \{00\}$ with $n_1 = 3$ ($l_1 = 2$), and $\Lambda^{(2)} = \{111\}$ with $n_2 = 2$ ($l_2 = 3$). That is, the experiment terminates if either three occurrences of two consecutive 0s (failures) or two occurrences of three consecutive 1s (successes) are observed.
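The steady-state vector of Example 1 can be checked numerically; the values of $p_1, p_3, p_5, p_7$ below are arbitrary illustrative choices, and any row-stochastic choice works.

```python
import numpy as np

p1, p3, p5, p7 = 0.6, 0.7, 0.4, 0.3            # illustrative values
p2, p4, p6, p8 = 1 - p1, 1 - p3, 1 - p5, 1 - p7
A = np.array([[p1, p2, 0,  0 ],                # states ordered 00, 01, 10, 11
              [0,  0,  p5, p6],
              [p3, p4, 0,  0 ],
              [0,  0,  p7, p8]])

# solve pi A = pi, pi e = 1 via the left eigenvector for eigenvalue 1
w, v = np.linalg.eig(A.T)
pi = np.real(v[:, np.argmax(np.real(w))])
pi /= pi.sum()

# closed-form solution of the balance equations
D = p1*p5 - 2*p1*p7 + p3*p7 - p1 - p5 + 2*p7 + 1
closed = np.array([p3*p7, (1 - p1)*p7, (1 - p1)*p7, p1*p5 - p1 - p5 + 1]) / D
print(np.allclose(pi, closed))  # True
```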

3.2. Step 2. Stopping Paths

We start by applying a higher hierarchical point of view, focusing on paths of patterns rather than on individual trials. Clearly, several paths can terminate the experiment. Let $V_i$, $i = 1, \dots, c$, be the set of paths that terminate the experiment due to $\Lambda^{(i)}$. Concretely, the set $V_i$ includes all paths with the following structure: the pattern $\Lambda^{(i)}$ appears $n_i$ times in total, where its last, $n_i$-th occurrence (which terminates the experiment) is the last component of the path; any other pattern $\Lambda^{(j)}$, $j \ne i$, appears no more than $n_j - 1$ times. Assume that there are $C_i$ such paths (i.e., $|V_i| = C_i$). Denote these paths by $V_{i,1}, V_{i,2}, \dots, V_{i,C_i}$, so that $V_i = \{V_{i,j}\}_{j=1,\dots,C_i}$. In the following, each path will be referred to as a “stopping vector”. Let $V = \{V_i\}_{i=1,\dots,c}$ be the set of all stopping vectors that terminate the experiment.
Corollary 1.
It is easy to verify the following:
(i) 
The number of competing patterns in each path $V_{i,j}$ satisfies:
$$\min_{i=1,\dots,c}(n_i) \le |V_{i,j}| \le \sum_{i=1}^{c}(n_i - 1) + 1, \quad j = 1, \dots, C_i,\ i = 1, \dots, c.$$
(ii) 
The number of paths in the set $V_i$ is given by (combinatorial considerations):
$$C_i = \sum_{k_1=0}^{n_1-1} \cdots \sum_{k_c=0}^{n_c-1} \frac{(k_1 + \cdots + k_c + n_i - 1)!}{k_1! \cdots k_c! \,(n_i - 1)!}, \quad k_j \text{ with } j \ne i.$$
(Note that the summation is over all patterns excluding $\Lambda^{(i)}$; the index $k_i$ does not appear.)
(iii) 
Since the $\Lambda^{(i)}$, $i = 1, \dots, c$, are distinct patterns, we have:
$$V = \bigcup_{i=1}^{c} V_i, \quad |V| = C, \quad \text{where } C = \sum_{i=1}^{c} C_i.$$
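The count in Corollary 1(ii) can be sketched directly (a helper function of ours, hypothetical naming): for the terminating pattern $\Lambda^{(i)}$, sum the multinomial coefficients over the possible occurrence counts $k_j < n_j$ of the other patterns.

```python
from math import factorial, prod
from itertools import product

def num_stopping_paths(n, i):
    """C_i of Corollary 1(ii); n maps pattern index -> required occurrences."""
    others = [j for j in n if j != i]
    total = 0
    for ks in product(*(range(n[j]) for j in others)):    # k_j = 0..n_j - 1
        m = sum(ks) + n[i] - 1                            # total patterns before the last
        total += factorial(m) // (prod(factorial(k) for k in ks)
                                  * factorial(n[i] - 1))
    return total

n = {1: 3, 2: 2}                  # the setting of Example 1
print(num_stopping_paths(n, 1), num_stopping_paths(n, 2))  # 4 6
```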
Example 2.
We have $\Lambda^{(1)} = 00$ with $n_1 = 3$, and $\Lambda^{(2)} = 111$ with $n_2 = 2$; thus, the experiment terminates if either three occurrences of two consecutive 0s or two occurrences of three consecutive 1s appear. Let $V_1$ and $V_2$ be the sets of paths that terminate the experiment due to $\Lambda^{(1)}$ and $\Lambda^{(2)}$, respectively. Figure 1 illustrates the sequences (paths) of possible patterns that terminate the experiment (denoted by $V_{i,j}$; see the red labels). We observe that $V_1$ contains four possible paths ($V_{1,1}, \dots, V_{1,4}$) and $V_2$ contains six possible paths ($V_{2,1}, \dots, V_{2,6}$). In summary, there are a total of 10 possible paths (stopping vectors) that end the experiment.
Applying Corollary 1(i), each such stopping vector includes at least two patterns (due to $n_2$) and no more than four patterns ($(n_1 - 1) + (n_2 - 1) + 1 = 4$). Applying Corollary 1(ii), the sets $V_1$ and $V_2$ include $C_1$ and $C_2$ stopping vectors, respectively, given by:
$$C_1 = \sum_{k_1=0}^{n_2-1} \frac{(k_1 + n_1 - 1)!}{k_1!\,(n_1 - 1)!} = \frac{(3-1)!}{0!\,(3-1)!} + \frac{(1+3-1)!}{1!\,(3-1)!} = 4,$$
$$C_2 = \sum_{k_1=0}^{n_1-1} \frac{(k_1 + n_2 - 1)!}{k_1!\,(n_2 - 1)!} = \frac{(2-1)!}{0!\,(2-1)!} + \frac{(1+2-1)!}{1!\,(2-1)!} + \frac{(2+2-1)!}{2!\,(2-1)!} = 6.$$
$V = V_1 \cup V_2$, and $|V| = C_1 + C_2 = 10$ vectors (see also Figure 1):
$$V_1 = \{V_{1,1} = [\Lambda^{(1)}, \Lambda^{(1)}, \Lambda^{(1)}],\ V_{1,2} = [\Lambda^{(1)}, \Lambda^{(1)}, \Lambda^{(2)}, \Lambda^{(1)}],\ V_{1,3} = [\Lambda^{(1)}, \Lambda^{(2)}, \Lambda^{(1)}, \Lambda^{(1)}],\ V_{1,4} = [\Lambda^{(2)}, \Lambda^{(1)}, \Lambda^{(1)}, \Lambda^{(1)}]\},$$
$$V_2 = \{V_{2,1} = [\Lambda^{(2)}, \Lambda^{(2)}],\ V_{2,2} = [\Lambda^{(2)}, \Lambda^{(1)}, \Lambda^{(2)}],\ V_{2,3} = [\Lambda^{(1)}, \Lambda^{(2)}, \Lambda^{(2)}],\ V_{2,4} = [\Lambda^{(2)}, \Lambda^{(1)}, \Lambda^{(1)}, \Lambda^{(2)}],\ V_{2,5} = [\Lambda^{(1)}, \Lambda^{(2)}, \Lambda^{(1)}, \Lambda^{(2)}],\ V_{2,6} = [\Lambda^{(1)}, \Lambda^{(1)}, \Lambda^{(2)}, \Lambda^{(2)}]\}.$$
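The ten stopping vectors can be recovered by brute force (a sketch of ours): enumerate label sequences and keep those whose last entry completes its quota exactly, with no quota reached earlier.

```python
from itertools import product

n = {1: 3, 2: 2}                               # required occurrences n1, n2
max_len = sum(v - 1 for v in n.values()) + 1   # upper bound of Corollary 1(i)
stopping = []
for L in range(min(n.values()), max_len + 1):
    for path in product(n, repeat=L):
        # the last pattern reaches its quota exactly at the end ...
        terminal = path.count(path[-1]) == n[path[-1]]
        # ... and no quota was reached before the last entry
        premature = any(path[:-1].count(i) >= n[i] for i in n)
        if terminal and not premature:
            stopping.append(path)

print(len(stopping))                            # 10
print(sum(p[-1] == 1 for p in stopping),        # C1 = 4
      sum(p[-1] == 2 for p in stopping))        # C2 = 6
```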
Remark 1.
Note that Figure 1 presents only the paths of competing patterns; intermediate trials that do not lead to a pattern are not presented.
We then derive the probability-generating function of the waiting time until the experiment terminates.

3.3. Step 3. The Waiting Time Distribution

Let $W_\Lambda$ denote the waiting time until the experiment terminates, with $G(z)$ its pgf. Applying the law of total expectation and noting that $\{V_{i,j}\}$, $i = 1, \dots, c$, $j = 1, \dots, C_i$, are disjoint vectors (each vector refers to a different path) leads to:
$$G(z) = E(z^{W_\Lambda}) = \sum_{i,j} E\left(z^{W_\Lambda} \cdot 1\{V_{i,j}\}\right).$$
($1\{A\}$ is the indicator function.) Each stopping vector $V_{i,j}$ is composed of a numbered sequence of consecutive patterns $V_{i,j} = [V_{i,j}(1), V_{i,j}(2), \dots]$, e.g., $V_{2,4}(1) = \Lambda^{(2)}$, $V_{2,4}(2) = \Lambda^{(1)}$. Let $W_{i,j}(k)$, $k = 2, 3, \dots$, be the number of trials from pattern $V_{i,j}(k-1)$ until the occurrence of the next pattern $V_{i,j}(k)$. Accordingly, for $k = 1$, we let $W_{i,j}(1)$ be the number of trials until the first pattern $V_{i,j}(1)$ occurs, given initial state $\mathbf{x} \in S^r$. Let $W_{i,j}$ be the waiting time due to path $V_{i,j}$. It is easy to verify that:
$$W_{i,j} = \sum_{k=1}^{|V_{i,j}|} W_{i,j}(k), \quad E\left(z^{W_{i,j}}\right) = E\left(z^{\sum_{k=1}^{|V_{i,j}|} W_{i,j}(k)}\right).$$
Substituting $W_\Lambda = \sum_{i,j} W_{i,j} \cdot 1\{V_{i,j}\}$ and (11) into (10) yields:
$$E(z^{W_\Lambda}) = \sum_{i,j} E\left(z^{W_{i,j}} \cdot 1\{V_{i,j}\}\right) = \sum_{i,j} E\left(z^{\sum_{k=1}^{|V_{i,j}|} W_{i,j}(k)} \cdot 1\{V_{i,j}\}\right) = \sum_{i,j} E\left(z^{W_{i,j}(1)} \cdot z^{W_{i,j}(2)} \cdots 1\{V_{i,j}\}\right).$$
Assume that the pattern V i , j ( k − 1 ) occurs (for k = 1 , we assume an initial state x ∈ S r ). Let G V i , j ( k − 1 ) , V i , j ( k ) be the pgf of W i , j ( k ) , i.e., the pgf of the number of trials from V i , j ( k − 1 ) until V i , j ( k ) ,
G V i , j ( k − 1 ) , V i , j ( k ) = E ( z W i , j ( k ) ∣ V i , j ( k − 1 ) ) for k = 2 , 3 , … , and E ( z W i , j ( 1 ) ∣ x ∈ S r ) for k = 1 .
Note that, due to the Markovian property, the non-overlapping feature, and conditioning on V i , j ( k − 1 ) , the waiting time W i , j ( k ) is independent of V i , j ( 1 ) , … , V i , j ( k − 2 ) . Therefore, we have:
G ( z ) = ∑ i , j ∏ k = 1 | V i , j | G V i , j ( k − 1 ) , V i , j ( k ) ( z ) .
We further note that the function G V i , j ( k − 1 ) , V i , j ( k ) is homogeneous; it depends only on the patterns V i , j ( k − 1 ) and V i , j ( k ) and is independent of k. Thus, for simplicity, and without loss of generality, we use W i , j , i , j = 1 , … , c , to denote the waiting time between two consecutive patterns Λ ( i ) and Λ ( j ) , given that the pattern Λ ( i ) occurs ( c 2 ordered pairs), and, respectively, G i , j ( z ) to denote the pgf of W i , j . Accordingly, we let G i ( z ) be the pgf of the number of trials until the first appearance of Λ ( i ) , given the initial state x ∈ S r .
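The structure of (14) can be illustrated with a short sketch: since the inter-pattern waiting times along a path are independent, the path pgf is the product of the segment pgfs, and for coefficient arrays a product of pgfs is a convolution. The two segment pgfs below are hypothetical toy distributions, not values from the paper:

```python
# Sketch of Eq. (14): the pgf of a sum of independent waiting times is the
# product of the individual pgfs.  A pgf is stored as a coefficient list,
# coeffs[n] = P(W = n); multiplying pgfs = convolving coefficient lists.

def pgf_product(a, b):
    """Convolve two pgf coefficient lists: the pgf of W_a + W_b."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            out[i + j] += pa * pb
    return out

def pgf_mean(coeffs):
    """E[W] = G'(1) = sum over n of n * P(W = n)."""
    return sum(n * p for n, p in enumerate(coeffs))

# Two hypothetical segment pgfs: G_1 (first pattern) and G_{1,1} (the gap
# between two consecutive occurrences); both are toy distributions.
g1  = [0.0, 0.0, 0.4, 0.6]   # P(W = 2) = 0.4, P(W = 3) = 0.6
g11 = [0.0, 0.0, 0.7, 0.3]

path_pgf = pgf_product(g1, g11)          # pgf along the path [Λ(1), Λ(1)]
assert abs(sum(path_pgf) - 1.0) < 1e-12  # still a proper distribution
# Means add: E[W_path] = E[W_1] + E[W_{1,1}].
assert abs(pgf_mean(path_pgf) - (pgf_mean(g1) + pgf_mean(g11))) < 1e-12
```

The same convolution, applied segment by segment along each stopping vector and summed over paths, is exactly what (14) expresses in pgf form.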
Example 3.
Two competing patterns yield four pgfs that differ in their starting and ending patterns, namely G 1 , 1 ( z ) , G 1 , 2 ( z ) , G 2 , 1 ( z ) , and G 2 , 2 ( z ) . In addition, we have two functions, G 1 ( z ) and G 2 ( z ) , corresponding to the first occurrence of Λ ( 1 ) and Λ ( 2 ) , respectively. The total pgf is the sum over the ten distinct stopping vectors in V , each term being the product of the corresponding pgfs along the path V i , j . Specifically, we have:
G ( z ) = G 1 ( z ) G 1 , 1 ( z ) 2 V 11 + G 1 ( z ) G 1 , 1 ( z ) G 1 , 2 ( z ) G 2 , 1 ( z ) V 12 + G 1 ( z ) G 1 , 2 ( z ) G 2 , 1 ( z ) G 1 , 1 ( z ) V 13 + G 2 ( z ) G 2 , 1 ( z ) G 1 , 1 ( z ) 2 V 14 + G 2 ( z ) G 2 , 2 ( z ) V 21 + G 2 ( z ) G 2 , 1 ( z ) G 1 , 2 ( z ) V 22 + G 1 ( z ) G 1 , 2 ( z ) G 2 , 2 ( z ) V 23 + G 2 ( z ) G 2 , 1 ( z ) G 1 , 1 ( z ) G 1 , 2 ( z ) V 24 + G 1 ( z ) G 1 , 2 ( z ) 2 G 2 , 1 ( z ) V 25 + G 1 ( z ) G 1 , 1 ( z ) G 1 , 2 ( z ) G 2 , 2 ( z ) V 26
Our next step is to derive the probability-generating functions G i , j ( z ) and G i ( z ) for i , j = 1 , … , c . Recall that l i is the length of the longest pattern in Λ ( i ) , and let b = max i = 1 , … , c { l i } . Our aim is to apply the law of total expectation as a function of the number of trials between two consecutive patterns (or until the first pattern), distinguishing whether this number is at most b or exceeds b. Thus, we use the decomposition:
G i , j ( z ) = G i , j ( z ) 1 z ≤ b + G i , j ( z ) 1 z > b , G i ( z ) = G i ( z ) 1 z ≤ b + G i ( z ) 1 z > b .
The first term of (16), where the number of trials is no more than b, is relatively simple to derive and consists of a finite number of paths. Conversely, the second term of (16), where more than b trials are possible, is more challenging. We start with the simpler derivation, G i , j ( z ) 1 z ≤ b and G i ( z ) 1 z ≤ b .

3.3.1. The Functions G i ( z ) 1 z ≤ b ,   G i , j ( z ) 1 z ≤ b

We first highlight the main difference between G i ( z ) 1 z ≤ b and G i , j ( z ) 1 z ≤ b . The function G i ( z ) 1 z ≤ b refers to the first b trials; here, we use the steady-state probability vector π ( x ) , x ∈ S r , multiplied by the remaining ( b − r ) transition probabilities. In contrast, the function G i , j ( z ) 1 z ≤ b assumes that Λ ( i ) occurs, and continues with the product of at most b transition probabilities leading to Λ ( j ) .
To derive G i ( z ) 1 z ≤ b , recall that l i ≥ r . Let p ( Λ ( i ) , b ) be the probability of hitting Λ ( i ) within no more than b trials (while no other pattern is hit). Let L i ( u ) = { x = x 1 ⋯ x u : x u − l i + 1 ⋯ x u = Λ ( i ) } be the set of sequences of u trials in which Λ ( i ) appears in the last l i trials and no other pattern is hit; recall that r ≤ l i ≤ u ≤ b . Denote by p ( x k ( u ) ) the ergodic probability of a path x k ( u ) ∈ L i ( u ) . Clearly, when u = l i = r , L i ( u ) includes only the pattern Λ ( i ) itself, with probability p ( L i ( u ) ) = π Λ ( i ) . When l i < u ≤ b , there may be several paths in L i ( u ) ; each path x k ( u ) starts with some x ˜ ∈ S r (with probability π ( x ˜ ) ), multiplied by a sequence of transition probabilities that leads to the ending pattern Λ ( i ) . Thus, the ergodic probability p ( Λ ( i ) , b ) has the form:
p ( Λ ( i ) , b ) = π Λ ( i ) 1 { l i = r } + ∑ u = l i + 1 b ∑ k p ( x k ( u ) ) 1 { x k ( u ) ∈ L i ( u ) } .
The function G i ( z ) 1 z ≤ b is then,
G i ( z ) 1 z ≤ b = z r π Λ ( i ) 1 { l i = r } + ∑ u = l i + 1 b ∑ k z u p ( x k ( u ) ) 1 { x k ( u ) ∈ L i ( u ) } .
To derive G i , j ( z ) 1 z ≤ b , we assume that, at time t = 0 , the state consists of the last r trials of Λ ( i ) , and consider the analogous sets of paths of u ≤ b trials leading to Λ ( j ) . Each such path contributes a component to G i , j ( z ) 1 z ≤ b : the product of the corresponding transition probabilities multiplied by z u . We next demonstrate (18) using our example.
Example 4.
Here, r = 2 and b = 3 . To derive G 1 ( z ) 1 z ≤ 3 , we consider the pattern Λ ( 1 ) = 00 with l 1 = 2 = r . Here, u = 2 or u = 3 . When u = 2 , we have L 1 ( 2 ) = { 00 } and p ( L 1 ( 2 ) ) = π Λ ( 1 ) = π 00 ; when u = 3 , we have L 1 ( 3 ) = { 100 } and p ( L 1 ( 3 ) ) = π 10 · p 3 . Next, we derive G 2 ( z ) 1 z ≤ 3 . Here, Λ ( 2 ) = 111 , l 2 = b = 3 > r ; thus, u = 3 with the only path L 2 ( 3 ) = { 111 } and p ( L 2 ( 3 ) ) = π 11 · p 8 . Summarizing, we obtain:
G 1 ( z ) 1 z ≤ 3 = z 2 π 00 + z 3 π 10 · p 3 , G 2 ( z ) 1 z ≤ 3 = z 3 π 11 · p 8 .
Next, we derive G i , j ( z ) 1 z ≤ 3 . We assume that, at time t = 0 , the state consists of the last r trials of Λ ( i ) (i.e., for Λ ( 1 ) we assume X 0 = 00 , and for Λ ( 2 ) we assume X 0 = 11 ). The function G i , j ( z ) 1 z ≤ 3 is constructed from paths of at most three transition probabilities (multiplied by the corresponding power of z) that lead to Λ ( j ) . Here, we obtain:
G 1 , 1 ( z ) 1 z ≤ 3 = z 2 p 1 2 + z 3 p 2 · p 5 · p 3 , G 1 , 2 ( z ) 1 z ≤ 3 = z 3 p 2 · p 6 · p 8 , G 2 , 1 ( z ) 1 z ≤ 3 = z 2 p 7 · p 3 + z 3 p 8 · p 7 · p 3 , G 2 , 2 ( z ) 1 z ≤ 3 = z 3 p 8 3 ,
i.e., when assuming Λ ( 1 ) (with X 0 = 00 ) , the paths 00 and 100 (w.p. p 1 2 and p 2 · p 5 · p 3 , respectively) lead to Λ ( 1 ) , and the path 111 (w.p. p 2 · p 6 · p 8 ) leads to Λ ( 2 ) . Similarly, when assuming Λ ( 2 ) (with X 0 = 11 ), the paths 00 and 100 (w.p. p 7 · p 3 and p 8 · p 7 · p 3 , respectively) lead to Λ ( 1 ) , and the path 111 (w.p. p 8 3 ) leads to Λ ( 2 ) .
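As a sanity check of the truncated pgfs above, they can be verified by brute-force enumeration of all trial sequences of length at most b = 3 following the pattern just completed. The transition probabilities p 1 , … , p 8 below are illustrative values, not estimates from the paper:

```python
from itertools import product

# Illustrative second-order transition probabilities (assumed values).
p = dict(p1=0.3, p2=0.7, p3=0.4, p4=0.6, p5=0.5, p6=0.5, p7=0.2, p8=0.8)
# P(next trial | last two trials): tuple = (prob of 0, prob of 1)
trans = {'00': (p['p1'], p['p2']), '01': (p['p5'], p['p6']),
         '10': (p['p3'], p['p4']), '11': (p['p7'], p['p8'])}
patterns = ['00', '111']             # Λ(1), Λ(2) of the running example

def first_hit(new):
    """(t, pattern) at which the first pattern completes within the new
    trials (non-overlapping counting: only new trials count), or None."""
    for t in range(1, len(new) + 1):
        for pat in patterns:
            if t >= len(pat) and new[:t].endswith(pat):
                return t, pat
    return None

def truncated_pgf(start, target, b=3):
    """coef[n] = prob. that `target` is the first pattern completed and it
    completes exactly at trial n <= b, starting from the r-state `start`."""
    coef = {}
    for n in range(1, b + 1):
        for seq in product('01', repeat=n):
            new = ''.join(seq)
            if first_hit(new) != (n, target):
                continue
            prob, state = 1.0, start
            for x in new:
                prob *= trans[state][int(x)]
                state = state[1] + x
            coef[n] = coef.get(n, 0.0) + prob
    return coef

g11 = truncated_pgf('00', '00')      # from Λ(1) = 00 back to 00
g12 = truncated_pgf('00', '111')     # from Λ(1) = 00 to Λ(2) = 111
# Matches z^2*p1^2 + z^3*p2*p5*p3 and z^3*p2*p6*p8 from Example 4.
assert abs(g11[2] - p['p1'] ** 2) < 1e-12
assert abs(g11[3] - p['p2'] * p['p5'] * p['p3']) < 1e-12
assert abs(g12[3] - p['p2'] * p['p6'] * p['p8']) < 1e-12
```

The enumeration scales exponentially in b and is intended only as a check of the closed-form coefficients.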

3.3.2. The Functions G i ( z ) 1 z > b ,   G i , j ( z ) 1 z > b

Let us consider the first b trials x = x 1 ⋯ x b ∈ S b . We group all x ∈ S b that contain the pattern Λ ( i ) into the set ξ b i . Define the state α i to be the absorbing state due to pattern Λ ( i ) , i.e., α i groups all states in ξ b i . Denote the absorbing vector by α = ( α 1 , α 2 , … , α c ) . The set Ω Y ∖ { α } ⊂ S b groups all states that do not include a pattern. Note that every state in Ω Y ∖ { α } = Ω Y ∖ { α 1 , … , α c } is a sequence of length b. Let T be the transition probability matrix among the states in Ω Y ∖ { α } . In addition, let T i 0 , i = 1 , … , c , be the absorbing probability vector into state α i , and define the absorbing matrix T 0 = ( T 1 0 , T 2 0 , … , T c 0 ) . We construct the embedded homogeneous Markov chain { Y t } t ≥ b + 1 on the state space Ω Y = { S b } . The transition probability matrix has the block form
M = ( p x y ) = ( T T 0 ; 0 I ) , with block rows and columns ordered as ( Ω Y ∖ { α } , α ) ,
where
p x y = p x , x b + 1 if x ∈ Ω Y ∖ { α } and y = ( x 2 ⋯ x b x b + 1 ) ∈ Ω Y , 1 if x = y ∈ { α i } , 0 otherwise .
Note that T e + ∑ j = 1 c T j 0 = e (each row of M sums to one). The initial probability of Y b is given by:
π Y b = ( P ( Y b = x ) : x ∈ Ω Y ) , where P ( Y b = x ) = π Y b ( x ) for x = x 1 ⋯ x b ∈ Ω Y ∖ { α } , P ( Y b = α i ) = ∑ x ∈ ξ b i π Y b ( x ) , and 0 otherwise .
Since b ≥ r , the initial probability π Y b ( x ) for x = x 1 ⋯ x b starts with the corresponding steady-state probability π x 1 ⋯ x r , multiplied by the transition probabilities along the path x r + 1 , … , x b . To complete the derivation, we need to define two row probability vectors of order 1 × | Ω Y ∖ { α } | , IP and IP i , i = 1 , … , c . Both vectors represent the probability of entering the states in Ω Y ∖ { α } after b trials with no competing pattern hit; the difference arises from their initial conditions. The vector IP = ( IP ( x ) : x ∈ Ω Y ∖ { α } ) assumes an initial state in S r (not including a pattern), and calculates the probability of entering Ω Y ∖ { α } at time t = b ; here, we use the steady-state probability vector π of order r, and a product of ( b − r ) successive transition probabilities. The vector IP i = ( IP i ( x ) : x ∈ Ω Y ∖ { α } ) assumes an initial pattern Λ ( i ) and derives the probability of entering Ω Y ∖ { α } afterward; here, we use a product of b successive transition probabilities.
Proposition 1.
The functions G i ( z ) 1 z > b and G i , j ( z ) 1 z > b have the general form:
G j ( z ) 1 z > b = z b IP · ( I − z T ) − 1 · z T j 0 , j = 1 , … , c , G i , j ( z ) 1 z > b = z b IP i · ( I − z T ) − 1 · z T j 0 , i , j = 1 , … , c .
Proof. 
The derivation of G i ( z ) 1 z > b and G i , j ( z ) 1 z > b is composed of three parts. In the first part, the experiment enters a state within the set Ω Y ∖ { α } after b trials in which no hitting occurs; thus, we multiply IP and IP i , respectively, by z b . The second part is the pgf of the number of trials spent in the set Ω Y ∖ { α } until absorption. Here, following Fu and Lou [41], we have the term ( I − z T ) − 1 . From that point, the third part is the probability of hitting Λ ( j ) in the next trial, with pgf z T j 0 .  □
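Proposition 1 can be exercised on a small toy absorbing chain. The sketch below implements z b IP ( I − z T ) − 1 z T j 0 for two transient and two absorbing states (all numbers are assumptions for illustration) and checks that, at z = 1, the tails over j sum to the probability of needing more than b trials, i.e., to the total mass of IP:

```python
# Toy check of Proposition 1 with two transient and two absorbing states.
# All numerical values below are assumptions for illustration only.

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def g_tail(z, b, ip, T, T0j):
    """z^b * IP * (I - zT)^{-1} * z*T0_j, as in Proposition 1."""
    M = inv2([[1 - z * T[0][0], -z * T[0][1]],
              [-z * T[1][0], 1 - z * T[1][1]]])
    row = [sum(ip[k] * M[k][i] for k in range(2)) for i in range(2)]
    return z ** b * z * sum(row[i] * T0j[i] for i in range(2))

b = 3
T = [[0.2, 0.3], [0.1, 0.4]]          # transitions among transient states
T0 = {1: [0.4, 0.1], 2: [0.1, 0.4]}   # absorption into alpha_1, alpha_2
IP = [0.15, 0.05]                     # prob. of being transient after b trials

# At z = 1, the two tails must sum to P(more than b trials) = sum(IP),
# because (I - T)^{-1} (T0_1 + T0_2) is the all-ones vector.
total = g_tail(1.0, b, IP, T, T0[1]) + g_tail(1.0, b, IP, T, T0[2])
assert abs(total - sum(IP)) < 1e-12
```

Evaluating g_tail on a grid of z values (or differentiating at z = 1) then yields tail probabilities and moments for chains of any transient dimension, with inv2 replaced by a general linear solver.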
Remark 2.
An easy way to derive the vectors IP and IP i , i = 1 , … , c , is as follows. Construct the transition probability matrices H r , b = ( p x , x ′ : x ∈ S r , x ′ ∈ S b ) and H b , b = ( p x , x ′ : x ∈ S b , x ′ ∈ S b ) . The matrix H r , b · H b , b b − 1 presents the probability of reaching each state in Ω Y , given an initial r-trial state. The vectors IP i , i = 1 , … , c , can be extracted from H r , b · H b , b b − 1 by taking the row corresponding to the r-trial ending block of Λ ( i ) , and the columns corresponding to Ω Y ∖ { α } , with the appropriate eliminations.
Example 5.
Since b = 3 , we have x ∈ S 3 and Ω Y = { 000 , 001 , 010 , 011 , 100 , 101 , 110 , 111 } . Here, α 1 = { 000 , 001 , 100 } , α 2 = { 111 } , α = α 1 ∪ α 2 = { 000 , 001 , 100 , 111 } , and Ω Y ∖ α = { 010 , 011 , 101 , 110 } . We construct the embedded homogeneous Markov chain { Y t } t ≥ 4 on the state space
Ω Y = { 010 , 011 , 101 , 110 , α 1 , α 2 } .
To clarify, Figure 2a–d illustrate the states and probabilities of Y 3 ; states marked in light red include a pattern and are, thus, labeled α 1 or α 2 .
Summarizing, the initial probability distribution of Y 3 is given by (note that π Y 3 e = 1 ):
π Y 3 = ( π 01 p 5 , π 01 p 6 , π 10 p 4 , π 11 p 7 , π 00 p 1 + π 00 p 2 + π 10 p 3 , π 11 p 8 ) , where the first four entries correspond to x ∈ Ω Y ∖ { α } , the fifth to x = α 1 , and the last to x = α 2 .
The transition probability matrix has the form T T 0 0 I , and is given by (Table 1):
To complete our derivation, we need to derive the vectors IP ,   IP 1 , and IP 2 . Note that the vector IP is the sub-vector of π Y 3 with the entries corresponding to Ω Y ∖ α ; i.e.,
IP = ( π 01 p 5 , π 01 p 6 , π 10 p 4 , π 11 p 7 ) , corresponding to the states ( 010 , 011 , 101 , 110 ) .
(Note that, since ( 00 ) is a competing pattern, the vector IP considers only the two-trial states {01,10,11}). In order to derive IP 1 and IP 2 , we apply Remark 2. Here, the matrices H 2 , 3 and H 3 , 3 are given by (Table 2):
And thus, H 2 , 3 · H 3 , 3 2 is given by:
The probability vectors IP 1 and IP 2 to be at state x Ω Y α in three trials without hitting, starting from Λ ( 1 ) and Λ ( 2 ) respectively, are highlighted by the red boxes; see the first and last rows in Table 3. Here, we obtain
IP 1 = ( p 1 p 2 p 5 , p 1 p 2 p 6 , p 2 p 5 p 4 , p 2 p 6 p 7 ) , IP 2 = ( p 7 p 4 p 5 , p 7 p 4 p 6 , p 8 p 7 p 4 , p 8 p 8 p 7 ) .
(Note that IP is also obtained by taking the columns corresponding to Ω Y ∖ α in the product π · H ˜ 2 , 3 ). Summarizing all, we have:
G 1 ( z ) = z 2 · π 00 + z 3 · π 10 · p 3 + z 3 · ( π 01 p 5 , π 01 p 6 , π 10 p 4 , π 11 p 7 ) · ( I − z T ) − 1 · z · ( p 3 , 0 , 0 , p 3 ) T , G 2 ( z ) = z 3 · π 11 · p 8 + z 3 · ( π 01 p 5 , π 01 p 6 , π 10 p 4 , π 11 p 7 ) · ( I − z T ) − 1 · z · ( 0 , p 8 , 0 , 0 ) T , G 1 , 1 ( z ) = z 2 · p 1 2 + z 3 · p 2 · p 5 · p 3 + z 3 · ( p 1 p 2 p 5 , p 1 p 2 p 6 , p 2 p 5 p 4 , p 2 p 6 p 7 ) · ( I − z T ) − 1 · z · ( p 3 , 0 , 0 , p 3 ) T , G 1 , 2 ( z ) = z 3 · p 2 · p 6 · p 8 + z 3 · ( p 1 p 2 p 5 , p 1 p 2 p 6 , p 2 p 5 p 4 , p 2 p 6 p 7 ) · ( I − z T ) − 1 · z · ( 0 , p 8 , 0 , 0 ) T , G 2 , 1 ( z ) = z 2 · p 7 · p 3 + z 3 · p 8 · p 7 · p 3 + z 3 · ( p 7 p 4 p 5 , p 7 p 4 p 6 , p 8 p 7 p 4 , p 8 p 8 p 7 ) · ( I − z T ) − 1 · z · ( p 3 , 0 , 0 , p 3 ) T , G 2 , 2 ( z ) = z 3 · p 8 3 + z 3 · ( p 7 p 4 p 5 , p 7 p 4 p 6 , p 8 p 7 p 4 , p 8 p 8 p 7 ) · ( I − z T ) − 1 · z · ( 0 , p 8 , 0 , 0 ) T .
Substituting (28) in (15) completes the derivation of G(z).
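As a numeric cross-check of Remark 2, the vectors IP 1 and IP 2 of Example 5 can also be obtained by enumerating all b = 3 pattern-free continuations from the end of Λ ( 1 ) = 00 and Λ ( 2 ) = 111 , respectively. The transition probabilities p 1 , … , p 8 below are illustrative values, not estimates from the paper:

```python
from itertools import product

# Illustrative second-order transition probabilities (assumed values).
p = dict(p1=0.3, p2=0.7, p3=0.4, p4=0.6, p5=0.5, p6=0.5, p7=0.2, p8=0.8)
trans = {'00': (p['p1'], p['p2']), '01': (p['p5'], p['p6']),
         '10': (p['p3'], p['p4']), '11': (p['p7'], p['p8'])}
patterns = ('00', '111')

def ip_vector(start, b=3):
    """Probability of each transient b-state after b pattern-free trials,
    starting from the r-trial state `start` (cf. Remark 2).  With
    non-overlapping counting, only the new trials can form a pattern."""
    out = {}
    for seq in product('01', repeat=b):
        new = ''.join(seq)
        if any(pat in new for pat in patterns):
            continue                      # a pattern completes: absorbed
        prob, state = 1.0, start
        for x in new:
            prob *= trans[state][int(x)]
            state = state[1] + x
        out[new] = prob
    return out

ip1 = ip_vector('00')   # after Λ(1) = 00 (last r = 2 trials are 00)
ip2 = ip_vector('11')   # after Λ(2) = 111 (last r = 2 trials are 11)
# The transient states are exactly {010, 011, 101, 110}:
assert set(ip1) == {'010', '011', '101', '110'}
# First entry of IP_1 and last entry of IP_2 from Example 5:
assert abs(ip1['010'] - p['p1'] * p['p2'] * p['p5']) < 1e-12
assert abs(ip2['110'] - p['p8'] * p['p8'] * p['p7']) < 1e-12
```

This reproduces the red-boxed rows of Table 3 entry by entry, without forming the matrices H 2 , 3 and H 3 , 3 explicitly.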

4. Algorithm and Examples

To summarize, we next present an algorithm with the main key steps; a detailed pseudocode is provided in Appendix A. The algorithm is then demonstrated by two additional examples.

4.1. The Algorithm

Step 1. Inputs and initialization
  • The parameter  r , the state space  Ω r = { S r } , the  | S r | -square transition matrix  A .
  • The competing patterns and their appearances:  ( Λ ( i ) , n i ) i = 1 , , c .
  • Calculate the steady-state probability vector  π = ( π x : x S r )  satisfying  π A = π , π e = 1 .
Step 2. Embedded Markov chain
  • Calculate the stopping vectors  { V i } i = 1 , … , c ;  use (8).
  • Set  b = max i { l i } , i = 1 , … , c .
  • Build the state space  Ω Y = { S b } ,  define the absorbing states  { α i } , i = 1 , … , c .
  • Construct the matrix  T T 0 0 I ,  use (21) and (22).
Step 3. Probability generating function
  • Derive  G i ( z ) 1 z ≤ b  and  G i , j ( z ) 1 z ≤ b ;  use (18).
  • Calculate the vector  IP ,  and the matrices  H r , b  and  H b , b .  Derive  { IP i } i = 1 , … , c (the matrix  H r , b · H b , b b − 1  can be helpful with the appropriate eliminations).
  • Apply (24) to obtain  G i ( z ) 1 z > b  and  G i , j ( z ) 1 z > b .
  • Apply (16) and (14) to derive  G ( z ) .

4.2. Example 2

Our next example is inspired by ReasonLabs. ReasonLabs Ltd. is a global pioneer in cybersecurity detection and prevention powered by machine learning (https://reasonlabs.com) (accessed on 10 January 2025). The Israeli team (called the Performance team) is responsible for marketing projects via a website. Their marketing occurs in several stages, in which a customer is expected to enter the website and choose the service that suits their needs. Basic services are free, while premium services are purchased. Usually, a customer (buyer) visits the website a few times for various projects. The Performance team’s purpose is to track customer acquisition and identify buyer patterns, especially of those who are willing to purchase the premium service. To achieve this, they use Bernoulli trials, where each outcome represents a customer choice. The result ‘1’ represents purchasing a premium service, and ‘0’ represents choosing a free service. Two stopping rules are proposed to classify customer behavior:
(1) Two consecutive purchases (1s) occurring twice (not necessarily consecutively), where we also allow at most one free service (‘0’) between the two purchases. Such a customer is considered the most serious buyer to track.
(2) Two consecutive free entries (0s) occurring twice (not necessarily consecutively). Such a customer is considered a less serious customer who can be ignored.
The above rules (1) and (2) can be modeled as a waiting time problem of competing patterns. We next demonstrate the algorithm on Example 2.
Step 1. Consider second-order Markov-dependent Bernoulli trials, with transition probabilities given in (4)–(6). Here, we have two competing patterns, Λ ( 1 ) = { 101 ,   11 } with n 1 = 2 (associated with Rule 1), and Λ ( 2 ) = { 00 } , with n 2 = 2 (associated with Rule 2). Thus, the experiment terminates when one of the following rules occurs:
(i)
Two occurrences of the set of patterns { 101 , 11 } , i.e., either 101 occurs twice, or 11 occurs twice, or 101 and then 11 , or vice versa, all occurrences are not necessarily consecutive.
(ii)
Two (not necessarily consecutive) occurrences of two consecutive 0 s .
Examples of such sequences are 010100011 and 11100101 (Rule (i)), and 0100101100 and 01011000100 (Rule (ii)).
Step 2. Accordingly, we have two sets V 1 and V 2 corresponding to Λ ( 1 ) and Λ ( 2 ) , respectively. Figure 3 demonstrates the paths of possible patterns that terminate the experiment. We observe that V 1 and V 2 include three possible paths each ( V 11 , V 12 , V 13 ), and ( V 21 , V 22 , V 23 ). Thus, we have six possible paths in total (stopping vectors) that terminate the experiment.
Corollary 1(i) yields that each such stopping vector includes at least two patterns and no more than three patterns ( ( n 1 − 1 ) + ( n 2 − 1 ) + 1 = 3 ). By Corollary 1(ii), C 1 and C 2 are given by:
C i = ∑ k 1 = 0 1 ( k 1 + n i − 1 ) ! / ( k 1 ! ( n i − 1 ) ! ) = ( 2 − 1 ) ! / ( 0 ! ( 2 − 1 ) ! ) + ( 1 + 2 − 1 ) ! / ( 1 ! ( 2 − 1 ) ! ) = 3 , for i = 1 , 2 ,
and | V | = C 1 + C 2 = 6 vectors. Specifically, the stopping vectors are (see Figure 3):
V 1 = { V 1 , 1 = [ Λ ( 1 ) , Λ ( 1 ) ] , V 1 , 2 = [ Λ ( 1 ) , Λ ( 2 ) , Λ ( 1 ) ] , V 1 , 3 = [ Λ ( 2 ) , Λ ( 1 ) , Λ ( 1 ) ] } , V 2 = { V 2 , 1 = [ Λ ( 1 ) , Λ ( 2 ) , Λ ( 2 ) ] , V 2 , 2 = [ Λ ( 2 ) , Λ ( 1 ) , Λ ( 2 ) ] , V 2 , 3 = [ Λ ( 2 ) , Λ ( 2 ) ] } , V = V 1 ∪ V 2 .
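The counts C 1 = C 2 = 3 above, and the counts C 1 = 4 , C 2 = 6 of the Section 3 example, follow from the interleaving argument of Corollary 1(ii); for two competing patterns it reduces to a short sum of binomial coefficients:

```python
from math import comb

# Two-pattern case of Corollary 1(ii): a stopping path ends with the n_i-th
# occurrence of pattern i while pattern j has occurred k = 0..n_j-1 times;
# the k occurrences of j interleave with the first n_i - 1 occurrences of i
# in comb(k + n_i - 1, k) ways.
def num_stopping_paths(n_i, n_j):
    return sum(comb(k + n_i - 1, k) for k in range(n_j))

# This example: n_1 = n_2 = 2 gives three paths per pattern.
assert num_stopping_paths(2, 2) == 3
# The Section 3 example (n_1 = 3, n_2 = 2): |V_1| = 4 and |V_2| = 6.
assert num_stopping_paths(3, 2) == 4
assert num_stopping_paths(2, 3) == 6
```

For c > 2 competing patterns, the single sum becomes the multiple sum over all k j , j ≠ i , with the multinomial coefficient of the general formula.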
Since we have two competing patterns, Λ ( 1 ) and Λ ( 2 ) , there are four pgfs, G 1 , 1 ( z ) ,   G 1 , 2 ( z ) ,   G 2 , 1 ( z ) ,   G 2 , 2 ( z ) , in addition to G 1 ( z ) ,   G 2 ( z ) (the pgfs of the first occurrence of Λ ( 1 ) and Λ ( 2 ) , respectively). Furthermore, the pattern Λ ( 1 ) includes two simple patterns { 101 , 11 } . Thus, to distinguish between them, we mark { 11 } by the sign “′”, and refer to { 101 } with no sign. Hence, we need to derive G 1 , 1 ( z ) ,   G 1 , 1 ′ ( z ) ,   G 1 ′ , 1 ( z ) ,   G 1 ′ , 1 ′ ( z ) ,   G 1 , 2 ( z ) ,   G 1 ′ , 2 ( z ) ,   G 2 , 1 ( z ) ,   G 2 , 1 ′ ( z ) ,   G 2 , 2 ( z ) ,   G 1 ( z ) ,   G 1 ′ ( z ) , and G 2 ( z ) . The final pgf will be (for brevity, the argument z is omitted):
G ( z ) = G 1 ( G 1 , 1 + G 1 , 1 ′ ) + G 1 ′ ( G 1 ′ , 1 + G 1 ′ , 1 ′ ) ⏟ V 11 + ( G 1 G 1 , 2 + G 1 ′ G 1 ′ , 2 ) ( G 2 , 1 + G 2 , 1 ′ ) ⏟ V 12 + G 2 [ G 2 , 1 ( G 1 , 1 + G 1 , 1 ′ ) + G 2 , 1 ′ ( G 1 ′ , 1 + G 1 ′ , 1 ′ ) ] ⏟ V 13 + ( G 1 G 1 , 2 + G 1 ′ G 1 ′ , 2 ) G 2 , 2 ⏟ V 21 + G 2 ( G 2 , 1 G 1 , 2 + G 2 , 1 ′ G 1 ′ , 2 ) ⏟ V 22 + G 2 G 2 , 2 ⏟ V 23
Here, the longest pattern is { 101 } , so b = 3 , and Ω Y = { 000 , 001 , 010 , 011 , 100 , 101 , 110 , 111 }. We obtain α 1 = { 101 } , α 1 ′ = { 110 , 011 , 111 } , and α 2 = { 000 , 001 , 100 } . Thus, Ω Y ∖ { α 1 , α 1 ′ , α 2 } = { 010 } , and the ( 1 × 1 ) vectors (in fact, scalars) are IP = π 01 p 5 , IP 1 = p 5 p 4 p 5 , IP 1 ′ = p 7 p 4 p 5 , and IP 2 = p 1 p 2 p 5 (see the red rectangles in Table 4). The absorbing vectors (scalars) are T 1 0 = p 4 , T 1 ′ 0 = 0 , and T 2 0 = p 3 . Also, note that the ( 1 × 1 ) transition matrix is T = 0 , and thus ( I − z T ) − 1 = I . This result is due to the fact that, when we have the sequence 010 , either trial 0 or 1 immediately yields a hit. The immediate conclusion is that the waiting time for a single hit is no more than four trials.
Step 3. We next derive G i ( z ) 1 z ≤ 3 , i = 1 , 1 ′ , 2 , and the pgfs G i , j ( z ) 1 z ≤ 3 , i , j = 1 , 1 ′ , 2 .
G 1 ( z ) 1 z ≤ 3 = z 3 · π 10 · p 4 , G 1 ′ ( z ) 1 z ≤ 3 = z 2 · π 11 + z 3 · π 01 · p 6 , G 2 ( z ) 1 z ≤ 3 = z 2 · π 00 + z 3 · π 10 · p 3 .
The function G i , j ( z ) 1 z ≤ 3 is constructed from paths of at most three transition probabilities:
G 1 , 1 ( z ) 1 z ≤ 3 = z 3 · p 8 3 , G 1 , 1 ′ ( z ) 1 z ≤ 3 = z 2 · p 6 · p 8 + z 3 · p 5 · p 4 · p 6 , G 1 ′ , 1 ( z ) 1 z ≤ 3 = z 3 · p 8 · p 7 · p 4 , G 1 ′ , 1 ′ ( z ) 1 z ≤ 3 = z 2 · p 8 2 + z 3 · p 7 · p 4 · p 6 , G 1 , 2 ( z ) 1 z ≤ 3 = z 2 · p 5 · p 3 + z 3 · p 6 · p 7 · p 3 , G 1 ′ , 2 ( z ) 1 z ≤ 3 = z 2 · p 7 · p 3 + z 3 · p 8 · p 7 · p 3 , G 2 , 1 ( z ) 1 z ≤ 3 = z 3 · p 2 · p 5 · p 4 , G 2 , 1 ′ ( z ) 1 z ≤ 3 = z 2 · p 2 · p 6 + z 3 · p 1 · p 2 · p 6 , G 2 , 2 ( z ) 1 z ≤ 3 = z 2 · p 1 2 + z 3 · p 2 · p 5 · p 3 .
Summarizing everything, we have:
G 1 ( z ) = z 3 · π 10 · p 4 + z 3 · π 01 · p 5 · z · p 4 , G 1 ′ ( z ) = z 2 · π 11 + z 3 · π 01 · p 6 , G 2 ( z ) = z 2 · π 00 + z 3 · π 10 · p 3 + z 3 · π 01 · p 5 · z · p 3 , G 1 , 1 ( z ) = z 3 · p 8 3 + z 3 · p 5 · p 4 · p 5 · z · p 4 , G 1 , 1 ′ ( z ) = z 2 · p 6 · p 8 + z 3 · p 5 · p 4 · p 6 , G 1 ′ , 1 ( z ) = z 3 · p 8 · p 7 · p 4 + z 3 · p 7 · p 4 · p 5 · z · p 4 , G 1 ′ , 1 ′ ( z ) = z 2 · p 8 2 + z 3 · p 7 · p 4 · p 6 ,
G 1 , 2 ( z ) = z 2 · p 5 · p 3 + z 3 · p 6 · p 7 · p 3 + z 3 · p 5 · p 4 · p 5 · z · p 3 , G 1 ′ , 2 ( z ) = z 2 · p 7 · p 3 + z 3 · p 8 · p 7 · p 3 + z 3 · p 7 · p 4 · p 5 · z · p 3 , G 2 , 1 ( z ) = z 3 · p 2 · p 5 · p 4 + z 3 · p 1 · p 2 · p 5 · z · p 4 , G 2 , 1 ′ ( z ) = z 2 · p 2 · p 6 + z 3 · p 1 · p 2 · p 6 , G 2 , 2 ( z ) = z 2 · p 1 2 + z 3 · p 2 · p 5 · p 3 + z 3 · p 1 · p 2 · p 5 · z · p 3 .
Substituting (34) and (35) in (31) completes the derivation of G ( z ) .
To add real data, the ReasonLabs Performance team further conducted an estimation of daily user-choice probabilities for free vs. premium services, focusing on the two most recent days of user activity,
p 1 = 0.3 , p 2 = 0.7 , p 3 = 0.4 , p 4 = 0.6 , p 5 = 0.5 , p 6 = 0.5 , p 7 = 0.2 , p 8 = 0.8 .
The above probabilities show that those who switch from free to premium have a 50% chance of reverting and a 50% chance of continuing with premium, while those who switch from premium to free have a 60% likelihood of switching back to premium. We also see that 80% of users who chose premium twice will stay premium, and 70% of free users will try premium. These probabilities suggest that users tend to move toward premium.
Based on these estimates, the probability vector π and the pgf of the waiting times are given by:
π = ( 0.1126 , 0.1972 , 0.1972 , 0.4930 ) , G ( z ) = 0.319 · z 4 + 0.203 · z 5 + 0.164 · z 6 + 0.125 · z 7 + 0.088 · z 8 + 0.059 · z 9 + 0.028 · z 10 + 0.009 · z 11 + 0.0016 · z 12 ,
with an average number of trials of 5.805 and a variance of 3.197. The information that it takes approximately 6 visits (with a minimum of 4 and a maximum of 12 visits) to classify a customer may help optimize resource allocation and prevent premature interventions. In addition, this information may contribute to mapping decision-making processes about customers and to identifying behavioral changes in users.
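These figures can be reproduced numerically. The sketch below builds the four-state chain (states 00, 01, 10, 11) from the estimated p 1 , … , p 8 , recovers π by power iteration, and recomputes the mean and variance from the printed pgf coefficients; since the coefficients are rounded, the checks use loose tolerances:

```python
# Numerical check of the reported steady-state vector and waiting-time moments.
p1, p2, p3, p4, p5, p6, p7, p8 = 0.3, 0.7, 0.4, 0.6, 0.5, 0.5, 0.2, 0.8
A = [[p1, p2, 0, 0],     # from 00: to 00, 01
     [0, 0, p5, p6],     # from 01: to 10, 11
     [p3, p4, 0, 0],     # from 10: to 00, 01
     [0, 0, p7, p8]]     # from 11: to 10, 11

pi = [0.25] * 4
for _ in range(10_000):  # power iteration: pi <- pi * A (chain is ergodic)
    pi = [sum(pi[i] * A[i][j] for i in range(4)) for j in range(4)]
s = sum(pi)
pi = [x / s for x in pi]
assert all(abs(x - y) < 1e-3
           for x, y in zip(pi, (0.1126, 0.1972, 0.1972, 0.4930)))

# pgf coefficients of z^4..z^12 as printed; renormalize to absorb rounding.
coef = {4: 0.319, 5: 0.203, 6: 0.164, 7: 0.125, 8: 0.088,
        9: 0.059, 10: 0.028, 11: 0.009, 12: 0.0016}
tot = sum(coef.values())                       # ~1 up to rounding
mean = sum(n * c for n, c in coef.items()) / tot
var = sum(n * n * c for n, c in coef.items()) / tot - mean ** 2
assert abs(mean - 5.805) < 0.05 and abs(var - 3.197) < 0.05
```

The small residual gaps (e.g., a recomputed variance of about 3.19) stem from the three-to-four digit rounding of the printed coefficients.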

4.3. Example 3

Our third example is inspired by Gamida Ltd. (https://gamida.co.il) (accessed on 1 February 2025). Gamida Ltd. provides targeted and comprehensive first-class services to the medical, science, technology, and industrial community in Israel and is part of the international group of companies Gamida For Life B.V. Gamida is the exclusive representative in Israel of companies such as Cardinal Health, Lohmann & Rauscher, Abbott, B.Braun, Flen Health, BD, Getinge, Philips, and Integra LifeSciences. The group specializes in the import, development, production, marketing, and distribution of products for research, diagnostics, medicine, and the advanced industries. Gamida specializes in developing wearable bracelets designed to monitor potential heart rate irregularities, especially in the elderly. These bracelets continuously collect heart rate data and process the accumulated information at the end of each day to produce a daily binary result indicating whether the heart rate was normal or not. A result ‘0’ represents a normal heart rate, and ‘1’ represents an abnormal rate; these results are then transmitted to the medical team for further analysis. The empirical investigation shows that, as expected, there is dependence between the results. Gamida Medical Ltd. identifies two situations whose occurrence requires further investigation:
(1)
Two occurrences (not necessarily consecutive) of an abnormal rate after a normal or abnormal rate. An abnormal rate that appears twice (even after a normal one) may indicate a cardiac problem and is worth checking. According to the company’s experience, the need to record two consecutive outcomes helps determine whether the result is part of an ongoing trend or a single event.
(2)
Two (not necessarily consecutive) occurrences of an abnormal rate followed by two normal results (i.e., the pattern 100). This situation may be caused by a malfunction of the device, or by other factors that caused a positive change in the heart rate.
Both conditions suggest an irregular heartbeat and require medical examination and attention. Next, we will demonstrate the algorithm on Example 3.
Step 1. Consider second-order Markov-dependent Bernoulli trials, with arguments as in (4)–(6), with two competing patterns, Λ ( 1 ) = { 01 ,   11 } with n 1 = 2 , and Λ ( 2 ) = { 100 } , with n 2 = 2 . Thus, the experiment terminates when one of the following rules occurs:
(i)
Two occurrences of 11, or two occurrences of 01 , or first 01 and then 11 , or vice versa—all occurrences are not necessarily consecutive.
(ii)
Two (not necessarily consecutive) occurrences of the pattern { 100 } .
Examples of sequences are 101101 and 001100001 (Rule (i)), and 10011100 and 101100100 (Rule (ii)).
Step 2. Accordingly, we have two sets V 1 and V 2 corresponding to Λ ( 1 ) and Λ ( 2 ) , respectively, which are the same paths as those of Example 2, and thus, are given by (30). Here, we also have four pgfs ( G 1 , 1 ( z ) ,   G 1 , 2 ( z ) ,   G 2 , 1 ( z ) ,   G 2 , 2 ( z ) ) in addition to G 1 ( z ) ,   G 2 ( z ) . We further distinguish between the patterns { 01 , 11 } of Λ ( 1 ) by adding the sign “′” to the terms referring to { 11 } , and using no sign for { 01 } . Hence, we need to derive 12 pgfs with the final pgf given by (31).
Here, the longest pattern is { 100 } , so b = 3 ,   Ω Y = { 000 , 001 , 010 , 011 , 100 , 101 , 110 , 111 } . The absorbing states are α 1 = { 001 , 010 , 011 , 101 } , α 1 ′ = { 110 , 111 } , and α 2 = { 100 } . Therefore, we have Ω Y ∖ { α 1 , α 1 ′ , α 2 } = { 000 } , and the ( 1 × 1 ) vectors (in fact, scalars) are IP = π 00 p 1 , IP 1 = p 5 p 3 p 1 , IP 1 ′ = p 7 p 3 p 1 , and IP 2 = p 1 3 (highlighted by the red rectangles in Table 5 and Table 6). The absorbing vectors (scalars) are T 1 0 = p 2 , T 1 ′ 0 = 0 , and T 2 0 = 0 . Also note that the ( 1 × 1 ) matrix is T = p 1 ; thus, ( I − z T ) − 1 = ( 1 − z · p 1 ) − 1 .
Step 3. We next derive G i ( z ) 1 z ≤ 3 , i = 1 , 1 ′ , 2 , and the pgfs G i , j ( z ) 1 z ≤ 3 , i , j = 1 , 1 ′ , 2 .
G 1 ( z ) 1 z ≤ 3 = z 2 · π 01 + z 3 · ( π 00 · p 2 + π 10 · p 4 ) , G 1 ′ ( z ) 1 z ≤ 3 = z 2 · π 11 + z 3 · π 01 · p 6 , G 2 ( z ) 1 z ≤ 3 = z 3 · π 10 · p 3 .
The function G i , j ( z ) 1 z ≤ 3 is constructed from paths of at most three transition probabilities:
G 1 , 1 ( z ) 1 z ≤ 3 = z 2 · p 5 · p 4 + z 3 · ( p 6 · p 7 · p 4 + p 5 · p 3 · p 2 ) , G 1 , 1 ′ ( z ) 1 z ≤ 3 = z 2 · p 6 · p 8 , G 1 ′ , 1 ( z ) 1 z ≤ 3 = z 2 · p 7 · p 4 + z 3 · ( p 8 · p 7 · p 4 + p 7 · p 3 · p 2 ) , G 1 ′ , 1 ′ ( z ) 1 z ≤ 3 = z 2 · p 8 2 , G 1 , 2 ( z ) 1 z ≤ 3 = z 3 · p 6 · p 7 · p 3 , G 1 ′ , 2 ( z ) 1 z ≤ 3 = z 3 · p 8 · p 7 · p 3 , G 2 , 1 ( z ) 1 z ≤ 3 = z 2 · p 1 · p 2 + z 3 · ( p 1 · p 1 · p 2 + p 2 · p 5 · p 4 ) , G 2 , 1 ′ ( z ) 1 z ≤ 3 = z 2 · p 2 · p 6 , G 2 , 2 ( z ) 1 z ≤ 3 = z 3 · p 2 · p 5 · p 3 .
Summarizing all, we have:
G 1 ( z ) = z 2 · π 01 + z 3 · ( π 00 · p 2 + π 10 · p 4 ) + z 3 · π 00 · p 1 · ( 1 − z · p 1 ) − 1 · z · p 2 , G 1 ′ ( z ) = z 2 · π 11 + z 3 · π 01 · p 6 + z 3 · π 00 · p 1 · ( 1 − z · p 1 ) − 1 · z · 0 = z 2 · π 11 + z 3 · π 01 · p 6 , G 2 ( z ) = z 3 · π 10 · p 3 + z 3 · π 00 · p 1 · ( 1 − z · p 1 ) − 1 · z · 0 = z 3 · π 10 · p 3 , G 1 , 1 ( z ) = z 2 · p 5 · p 4 + z 3 · ( p 6 · p 7 · p 4 + p 5 · p 3 · p 2 ) + z 3 · p 5 · p 3 · p 1 · ( 1 − z · p 1 ) − 1 · z · p 2 , G 1 , 1 ′ ( z ) = z 2 · p 6 · p 8 + z 3 · p 5 · p 3 · p 1 · ( 1 − z · p 1 ) − 1 · z · 0 = z 2 · p 6 · p 8 , G 1 ′ , 1 ( z ) = z 2 · p 7 · p 4 + z 3 · ( p 8 · p 7 · p 4 + p 7 · p 3 · p 2 ) + z 3 · p 7 · p 3 · p 1 · ( 1 − z · p 1 ) − 1 · z · p 2 , G 1 ′ , 1 ′ ( z ) = z 2 · p 8 2 + z 3 · p 7 · p 3 · p 1 · ( 1 − z · p 1 ) − 1 · z · 0 = z 2 · p 8 2 ,
G 1 , 2 ( z ) = z 3 · p 6 · p 7 · p 3 + z 3 · p 5 · p 3 · p 1 · ( 1 − z · p 1 ) − 1 · z · 0 = z 3 · p 6 · p 7 · p 3 , G 1 ′ , 2 ( z ) = z 3 · p 8 · p 7 · p 3 + z 3 · p 7 · p 3 · p 1 · ( 1 − z · p 1 ) − 1 · z · 0 = z 3 · p 8 · p 7 · p 3 , G 2 , 1 ( z ) = z 2 · p 1 · p 2 + z 3 · ( p 1 · p 1 · p 2 + p 2 · p 5 · p 4 ) + z 3 · p 1 3 · ( 1 − z · p 1 ) − 1 · z · p 2 , G 2 , 1 ′ ( z ) = z 2 · p 2 · p 6 + z 3 · p 1 3 · ( 1 − z · p 1 ) − 1 · z · 0 = z 2 · p 2 · p 6 , G 2 , 2 ( z ) = z 3 · p 2 · p 5 · p 3 + z 3 · p 1 3 · ( 1 − z · p 1 ) − 1 · z · 0 = z 3 · p 2 · p 5 · p 3 .
Substituting (38) and (39) in (31) completes the derivation of G ( z ) .
Equations (38) and (39) show some interesting results. For example, we observe that at most three trials are needed to hit 11 or 100 (see G 1 ′ ( z ) , G 2 ( z ) ), exactly two trials are needed from 11 / 01 / 100 to 11 (see G 1 ′ , 1 ′ ( z ) , G 1 , 1 ′ ( z ) , and G 2 , 1 ′ ( z ) ), and exactly three trials are needed from 11 / 01 / 100 to 100 (see G 1 , 2 ( z ) , G 1 ′ , 2 ( z ) , and G 2 , 2 ( z ) ). To explain these results, assume that the pattern 11 occurs. From that point, any trial 0 followed by 1 immediately yields the hit 01; thus, the only way for 11 to occur again (before another pattern is hit) is via the double trial 11 . The other cases can be explained similarly.
According to Gamida Ltd., the transition probabilities are estimated by:
p 1 = 0.65 , p 2 = 0.35 , p 3 = 0.6 , p 4 = 0.4 , p 5 = 0.7 , p 6 = 0.3 , p 7 = 0.2 , p 8 = 0.8 ,
with a steady-state probability vector
π = ( 0.3288 , 0.1918 , 0.1918 , 0.2877 ) .
Calculating the pgf of the waiting time yielded the following:
G ( z ) = 0.293 · z 4 + 0.189 · z 5 + 0.0529 · z 6 + 0.032 · z 7 + 0.037 · z 8 + 0.0104 · z 9 + ( 0.0614 · z 6 + 0.0312 · z 7 + 0.0122 · z 9 + 0.0068 · z 10 ) / ( 1 − 0.65 · z ) + ( 0.00671 · z 8 + 0.00123 · z 11 ) / ( 1 − 0.65 · z ) 2 ,
with an average number of trials of 6.63 and a variance of 8.99; that is, the average detection time for identifying critical heart rate patterns is about 6.6 days. These findings indicate that approximately one week of continuous monitoring is generally required to reliably detect critical cardiac rate patterns. The results can contribute to determining a reasonable detection window while minimizing false alarms. In addition, they can be useful in developing patient care protocols and setting alert thresholds, leading to a more efficient approach to cardiac monitoring.
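This pgf can be checked numerically: it should satisfy G ( 1 ) ≈ 1, and its derivative at z = 1 should reproduce the reported mean, up to the rounding of the printed coefficients. A minimal sketch:

```python
# Numerical check of the Example 3 pgf printed above (coefficients rounded).
def G(z):
    poly = (0.293 * z**4 + 0.189 * z**5 + 0.0529 * z**6 + 0.032 * z**7
            + 0.037 * z**8 + 0.0104 * z**9)
    q = 0.0614 * z**6 + 0.0312 * z**7 + 0.0122 * z**9 + 0.0068 * z**10
    r = 0.00671 * z**8 + 0.00123 * z**11
    return poly + q / (1 - 0.65 * z) + r / (1 - 0.65 * z) ** 2

assert abs(G(1.0) - 1.0) < 5e-3          # probabilities sum to ~1

h = 1e-6
mean = (G(1.0 + h) - G(1.0 - h)) / (2 * h)   # central difference for G'(1)
assert abs(mean - 6.63) < 0.05               # matches the reported mean
```

The recomputed mean is about 6.61; the gap to the quoted 6.63 reflects only the truncation of the coefficients to a few digits.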

5. Summary and Future Research

This paper studies a Markov-dependent model with Bernoulli trials and competing patterns. Competing patterns are compound patterns that compete to be the first to occur a specified number of times. Using a finite Markov chain and tools from probability theory, we develop an algorithm to derive a closed-form expression for the pgf of the waiting time distribution. It must be noted that, despite being simple to understand, the algorithm requires preparatory work for calculating path probabilities, whose extent grows with the parameter b. However, we believe that integrating both traditional computing techniques and advanced AI and machine learning approaches may be useful in developing efficient solutions even for the most complex cases. In this vein, the methodology presented in this paper serves as a foundational framework for the computational solutions discussed.
Theoretical extensions of the model can be pursued in several directions. Our model assumes non-overlapping counting. It would be an interesting extension to generalize the algorithm to overlapping counting, where a partially completed pattern can be finished at any time, regardless of whether another pattern is completed after the partial pattern starts but before it is completed. Another direction is to investigate other stopping rules, such as the sooner or later models. Here, a sooner model captures the number of trials required for the first occurrence of one of two competing patterns, and conversely, a later model refers to the number of trials required for both patterns to occur.

Author Contributions

Conceptualization, I.M.; Formal analysis, Y.B.; Investigation, Y.B.; Writing—original draft, I.M.; Writing—review and editing, Y.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data are unavailable due to privacy restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

In this appendix, we extend the algorithm of Section 4.1 and provide pseudocode for computing the probability generating function.
Step 1. Inputs and initialization
1.1.
Define S = {0, 1}; let r be the order of the Markov chain.
1.2.
Generate the set of states Ω_r = S^r, where S^r = {x = (x_1, …, x_r) : x_i ∈ S} (|Ω_r| = 2^r).
1.3.
Define the transition probabilities p_{x,y}, x, y ∈ S^r, and build the (2^r × 2^r) matrix A = [p_{x,y}].
1.4.
Define the (1 × 2^r) probability vector π = (π_x : x ∈ S^r). Obtain π by solving πA = π, πe = 1.
1.5.
For each competing pattern i, i = 1, …, c:
  • Define Λ^(i) = {Λ_1^(i), …, Λ_{k_i}^(i)}.
  • Let l_{i,j} = |Λ_j^(i)| and l_i = max_{j=1,…,k_i} l_{i,j}.
  • Define n_i, the number of occurrences of Λ^(i) needed to stop the experiment.
1.6.
Let b = max_{i=1,…,c} l_i.
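Step 1 can be sketched in a few lines of code. The following is an illustrative setup (our own, not taken from the paper) for a second-order (r = 2) binary chain; the conditional probabilities p1[x] are arbitrary placeholder values:

```python
from itertools import product

# Illustrative sketch of Step 1 for r = 2; p1[x] is the (assumed) chance
# that the next trial is 1 given the current window x.
S = (0, 1)
r = 2
states = list(product(S, repeat=r))            # Omega_r, |Omega_r| = 2**r

def next_state(x, s):
    """Sliding window: append the new trial, drop the oldest."""
    return x[1:] + (s,)

p1 = {(0, 0): 0.3, (0, 1): 0.6, (1, 0): 0.4, (1, 1): 0.7}

# A[x][y] = p_{x,y}: one-step transition probabilities between windows.
A = {x: {y: 0.0 for y in states} for x in states}
for x in states:
    A[x][next_state(x, 1)] += p1[x]
    A[x][next_state(x, 0)] += 1.0 - p1[x]

# Stationary vector pi solving pi A = pi, pi e = 1, via power iteration.
pi = {x: 1.0 / len(states) for x in states}
for _ in range(500):
    pi = {y: sum(pi[x] * A[x][y] for x in states) for y in states}
```

Power iteration suffices here because the chain on 2^r windows is irreducible and aperiodic for any p1[x] strictly between 0 and 1; for larger state spaces one would solve πA = π directly.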
Step 2. Embedded Markov chain
2.1
For i = 1, …, c:
  • Define V_i = {V_{i,j}} to be the set of all paths that terminate the experiment via Λ^(i).
  • Calculate C_i = |V_i| using
    C_i = Σ_{k_1=0}^{n_1−1} ⋯ Σ_{k_c=0}^{n_c−1} (k_1 + ⋯ + k_c + n_i − 1)! / (k_1! ⋯ k_c! (n_i − 1)!),
    where the sums run over k_j for j = 1, …, c, j ≠ i.
  • For each path V_{i,j}, generate its series of patterns V_{i,j} = [V_{i,j}(1), V_{i,j}(2), …].
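The count C_i of Step 2.1 can be evaluated directly from the multinomial formula. A minimal sketch (the function name path_count and the zero-based pattern index are our own conventions):

```python
from itertools import product
from math import factorial

def path_count(n, i):
    """C_i of Step 2.1: the number of terminating pattern sequences won by
    pattern i (0-based), where n[j] is the required count for pattern j.
    The sums run over k_j in {0, ..., n[j]-1} for every j != i."""
    c = len(n)
    others = [j for j in range(c) if j != i]
    total = 0
    for ks in product(*(range(n[j]) for j in others)):
        # Multinomial term (k_1+...+k_c+n_i-1)! / (k_1!...k_c!(n_i-1)!).
        term = factorial(sum(ks) + n[i] - 1)
        for k in ks:
            term //= factorial(k)
        term //= factorial(n[i] - 1)
        total += term
    return total

# Two patterns, each required to occur twice: three winning paths apiece
# (e.g., for pattern A: AA, ABA, BAA).
print(path_count((2, 2), 0), path_count((2, 2), 1))   # 3 3
```

The integer divisions are exact because each intermediate quotient is itself a multinomial coefficient times a factorial.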
2.2
Build Ω_Y = S^b, where S^b = {x = (x_1, …, x_b) : x_i ∈ S} (|Ω_Y| = 2^b).
2.3
Obtain distinct sets of absorbing states {α_i}, i = 1, …, c, with regard to Λ^(i).
2.4
Obtain the set of transient states, Ω_Y ∖ ∪_{i=1}^{c} α_i.
2.5
Derive T, the transition probability matrix among the states in Ω_Y ∖ ∪_{i=1}^{c} α_i.
2.6
Derive T_i^0, the matrix of absorption probabilities into the states in α_i.
2.7
Construct the Markov probability matrix as follows (Figure A1):
Figure A1. The transition probability matrix.
( I is the identity matrix, and 0 is the zero matrix, all with the appropriate dimensions).
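Steps 2.2–2.6 can be illustrated on a small hypothetical case (the patterns and probabilities below are our own choice, not the paper's example): two competing patterns "11" and "00" with window length b = 2 and i.i.d. fair trials. Scanning the sliding windows yields the transient matrix T and the absorption matrices T_i^0:

```python
from itertools import product

# Hypothetical illustration of Steps 2.2-2.6: competing patterns "11" and
# "00", window length b = 2, i.i.d. fair trials (all choices are ours).
b = 2
states = ["".join(t) for t in product("01", repeat=b)]   # Omega_Y
patterns = {"alpha1": "11", "alpha2": "00"}
p = {"0": 0.5, "1": 0.5}

def absorbed_by(w):
    """Return the absorbing class a window w falls into, if any."""
    for name, pat in patterns.items():
        if w.endswith(pat):
            return name
    return None

transient = [w for w in states if absorbed_by(w) is None]
T = {x: {y: 0.0 for y in transient} for x in transient}  # transient -> transient
T0 = {x: {a: 0.0 for a in patterns} for x in transient}  # transient -> absorbing
for x in transient:
    for s in "01":
        y = x[1:] + s                                     # slide the window
        a = absorbed_by(y)
        if a is None:
            T[x][y] += p[s]
        else:
            T0[x][a] += p[s]
```

Here the transient states are "01" and "10", each moving to the other with probability 1/2 and being absorbed otherwise, mirroring the block structure [T | T_i^0; 0 | I] of Figure A1.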
Step 3. Probability generating function
3.1
For i, j = 1, …, c:
  • Derive G_i(z)1_{z≤b} and G_{i,j}(z)1_{z≤b} using a probability product of maximum length b (a probability tree diagram may be useful).
  • Derive the 1 × |Ω_Y ∖ {α}| vector IP_i (the matrix H_{r,b} · H_{b,b}^{b−1} may be helpful).
  • Compute the 1 × |Ω_Y ∖ {α}| vector IP (a probability tree diagram may be useful).
  • Derive G_j(z)1_{z>b} and G_{i,j}(z)1_{z>b} by:
    G_j(z)1_{z>b} = z^b · IP · (I − zT)^{−1} · zT_j^0,
    G_{i,j}(z)1_{z>b} = z^b · IP_i · (I − zT)^{−1} · zT_j^0.
  • Use the law of total expectation to obtain:
    G_{i,j}(z) = G_{i,j}(z)1_{z≤b} + G_{i,j}(z)1_{z>b},
    G_i(z) = G_i(z)1_{z≤b} + G_i(z)1_{z>b}.
3.2
The final G(z) is obtained by
G(z) = Σ_{i,j} Π_{k=1}^{|V_{i,j}|} G_{V_{i,j}(k−1), V_{i,j}(k)}(z).
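As a numerical sanity check on the matrix machinery of Step 3, consider the simplest special case: a single pattern "11" in i.i.d. fair coin tosses (one pattern, one occurrence required). Differentiating the pgf at z = 1 reduces the mean waiting time to the row sums of the fundamental matrix (I − T)^{−1}, i.e., to the solution of (I − T)t = e. The sketch below, with a minimal hand-rolled solver, recovers the classical value E[W] = 6:

```python
def solve(A, rhs):
    """Gauss-Jordan elimination with partial pivoting (small systems only)."""
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda k: abs(M[k][col]))
        M[col], M[piv] = M[piv], M[col]
        for k in range(n):
            if k != col:
                f = M[k][col] / M[col][col]
                M[k] = [a - f * c for a, c in zip(M[k], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

# Transient states for the pattern "11": 0 = no progress, 1 = last toss was 1.
T = [[0.5, 0.5],
     [0.5, 0.0]]
# Expected absorption times t satisfy (I - T) t = e.
I_minus_T = [[(1.0 if i == j else 0.0) - T[i][j] for j in range(2)]
             for i in range(2)]
t = solve(I_minus_T, [1.0, 1.0])
print(t)   # [6.0, 4.0]: mean waiting time for "11" from scratch is 6 tosses
```

Higher factorial moments follow similarly from higher derivatives of (I − zT)^{−1} at z = 1, which is how the means and variances reported in the examples can be obtained.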

References

  1. Schwager, S.J. Run probabilities in sequences of Markov-dependent trials. J. Am. Stat. Assoc. 1983, 78, 168–175. [Google Scholar] [CrossRef]
  2. Karwe, V.V.; Naus, J.I. New recursive methods for scan statistic probabilities. Comput. Stat. Data Anal. 1997, 23, 389–402. [Google Scholar] [CrossRef]
  3. Martin, D.E.; Aston, J.A. Waiting time distribution of generalized later patterns. Comput. Stat. Data Anal. 2008, 52, 4879–4890. [Google Scholar] [CrossRef]
  4. Kulldorff, M. A spatial scan statistic. Commun. Stat. Theory Methods 1997, 26, 1481–1496. [Google Scholar] [CrossRef]
  5. Aki, S. Discrete distributions of order k on a binary sequence. Ann. Inst. Stat. Math. 1985, 37, 205–224. [Google Scholar] [CrossRef]
  6. Aki, S.; Hirano, K. Lifetime distribution and estimation problems of consecutive-k-out-of-n: f systems. Ann. Inst. Stat. Math. 1996, 48, 185–199. [Google Scholar] [CrossRef]
  7. Chang, Y.M.; Huang, T.H. Reliability of a 2-dimensional k-within-consecutive-r×s-out-of-m×n: f system using finite Markov chains. IEEE Trans. Reliab. 2010, 59, 725–733. [Google Scholar] [CrossRef]
  8. Dafnis, S.D.; Antzoulakos, D.L.; Philippou, A.N. Distributions related to (k1,k2) events. J. Stat. Plan. Inference 2010, 140, 1691–1700. [Google Scholar] [CrossRef]
  9. Dafnis, S.D.; Gounari, S.; Zotos, C.E.; Papadopoulos, G.K. The effect of cold periods on the biological cycle of Marchalina hellenica. Insects 2022, 13, 375. [Google Scholar] [CrossRef]
  10. Dafnis, S.D.; Makri, F.S.; Koutras, M.V. Generalizations of runs and patterns distributions for sequences of binary trials. Methodol. Comput. Appl. Probab. 2021, 23, 165–185. [Google Scholar] [CrossRef]
  11. Dafnis, S.D.; Makri, F.S.; Philippou, A.N. The reliability of a generalized consecutive system. Appl. Math. Comput. 2019, 359, 186–193. [Google Scholar] [CrossRef]
  12. Dafnis, S.D.; Makri, F.S. Distributions related to weak runs with a minimum and a maximum number of successes: A unified approach. Methodol. Comput. Appl. Probab. 2023, 25, 24. [Google Scholar] [CrossRef]
  13. Feller, W. An Introduction to Probability Theory and Its Applications, 3rd ed.; Wiley: New York, NY, USA, 1971; Volume 1. [Google Scholar]
  14. Philippou, A.N.; Georghiou, C.; Philippou, G.N. A generalized geometric distribution and some of its properties. Stat. Probab. Lett. 1983, 1, 171–175. [Google Scholar] [CrossRef]
  15. Philippou, A.N.; Makri, F.S. Successes, runs and longest runs. Stat. Probab. Lett. 1986, 4, 211–215. [Google Scholar] [CrossRef]
  16. Philippou, A.N.; Antzoulakos, D.L. Multivariate distributions of order k on a generalized sequence. Stat. Probab. Lett. 1990, 9, 453–463. [Google Scholar] [CrossRef]
  17. Ling, K. On geometric distributions of order (k1,k2,…,km). Stat. Probab. Lett. 1990, 9, 163–171. [Google Scholar] [CrossRef]
  18. Shmueli, G.; Cohen, A. Run-Related probability functions applied to sampling inspection. Technometrics 2000, 42, 188–202. [Google Scholar] [CrossRef]
  19. Koutras, M.V.; Eryilmaz, S. Compound geometric distribution of order k. Methodol. Comput. Appl. Probab. 2017, 19, 377–393. [Google Scholar] [CrossRef]
  20. Blom, G.; Thorburn, D. How many random digits are required until given sequences are obtained? J. Appl. Probab. 1982, 19, 518–531. [Google Scholar] [CrossRef]
  21. Ebneshahrashoob, M.; Sobel, M. Sooner and later waiting time problems for Bernoulli trials: Frequency and run quotas. Stat. Probab. Lett. 1990, 9, 5–11. [Google Scholar] [CrossRef]
  22. Huang, W.T.; Tsai, C.S. On a modified binomial distribution of order k. Stat. Probab. Lett. 1991, 11, 125–131. [Google Scholar] [CrossRef]
  23. Makri, F.S. On occurrences of FS strings in linearly and circularly ordered binary sequences. J. Appl. Probab. 2010, 47, 157–178. [Google Scholar] [CrossRef]
  24. Kumar, A.N.; Upadhye, N.S. Generalizations of distributions related to (k1,k2)-runs. Metrika 2019, 82, 249–268. [Google Scholar] [CrossRef]
  25. Zhao, X.; Song, Y.; Wang, X.; Lv, Z. Distributions of (k1,k2,…,kl)-runs with multi-state trials. Methodol. Comput. Appl. Probab. 2022, 24, 2689–2702. [Google Scholar] [CrossRef]
  26. Kong, Y. Multiple consecutive runs of multi-state trials: Distributions of (k1,k2,…,kl) patterns. J. Comput. Appl. Math. 2022, 403, 113846. [Google Scholar] [CrossRef]
  27. Chadjiconstantinidis, S.; Eryilmaz, S. Computing waiting time probabilities related to (k1,k2,…,kl) pattern. Stat. Pap. 2023, 64, 1373–1390. [Google Scholar] [CrossRef]
  28. Aki, S. Waiting time problems for a sequence of discrete random variables. Ann. Inst. Stat. Math. 1992, 44, 363–378. [Google Scholar] [CrossRef]
  29. Koutras, M.V. On a waiting time distribution in a sequence of Bernoulli trials. Ann. Inst. Stat. Math. 1996, 48, 789–806. [Google Scholar] [CrossRef]
  30. Robin, S.; Daudin, J.J. Exact distribution of word occurrences in a random sequence of letters. J. Appl. Probab. 1999, 36, 179–193. [Google Scholar] [CrossRef]
  31. Aki, S.; Hirano, K. Waiting time problems for a two-dimensional pattern. Ann. Inst. Stat. Math. 2004, 56, 169–182. [Google Scholar] [CrossRef]
  32. Hirano, K.; Aki, S. On number of occurrences of success runs of specified length in a two-state Markov chain. Stat. Sin. 1993, 3, 313–320. [Google Scholar]
  33. Fu, J.C.; Koutras, M.V. Distribution theory of runs: A Markov chain approach. J. Am. Stat. Assoc. 1994, 89, 1050–1058. [Google Scholar] [CrossRef]
  34. Fu, J.C. Distribution theory of runs and patterns associated with a sequence of multi-state trials. Stat. Sin. 1996, 6, 957–974. [Google Scholar]
  35. Koutras, M.V. Waiting time distributions associated with runs of fixed length in two-state Markov chains. Ann. Inst. Stat. Math. 1997, 49, 123–139. [Google Scholar] [CrossRef]
  36. Antzoulakos, D.L. Waiting times for patterns in a sequence of multistate trials. J. Appl. Probab. 2001, 38, 508–518. [Google Scholar] [CrossRef]
  37. Fisher, E.; Cui, S. Patterns generated by m-order Markov chains. Stat. Probab. Lett. 2010, 80, 1157–1166. [Google Scholar] [CrossRef]
  38. Chang, Y.M.; Fu, J.C.; Lin, H.Y. Distribution and double generating function of number of patterns in a sequence of Markov dependent multistate trials. Ann. Inst. Stat. Math. 2012, 64, 55–68. [Google Scholar] [CrossRef]
  39. Fu, J.C.; Chang, Y.M. On probability generating functions for waiting time distributions of compound patterns in a sequence of multistate trials. J. Appl. Probab. 2002, 39, 70–80. [Google Scholar] [CrossRef]
  40. Han, Q.; Hirano, K. Sooner and later waiting time problems for patterns in Markov dependent trials. J. Appl. Probab. 2003, 40, 73–86. [Google Scholar] [CrossRef]
  41. Fu, J.C.; Lou, W.Y.W. Waiting time distributions of simple and compound patterns in a sequence of r-th order Markov dependent multi-state trials. Ann. Inst. Stat. Math. 2006, 58, 291–310. [Google Scholar] [CrossRef]
  42. Wu, T.L. Conditional waiting time distributions of runs and patterns and their applications. Ann. Inst. Stat. Math. 2020, 72, 531–543. [Google Scholar] [CrossRef]
  43. Aston, J.A.; Martin, D.E. Waiting time distributions of competing patterns in higher-order Markovian sequences. J. Appl. Probab. 2005, 42, 977–988. [Google Scholar] [CrossRef]
  44. Fu, J.C.; Lou, W.Y.W. Distribution Theory of Runs and Patterns and Its Applications: A Finite Markov Chain Imbedding Approach; World Scientific Publishing Co.: Singapore, 2003. [Google Scholar]
  45. Balakrishnan, N.; Koutras, M.V. Runs and Scans with Applications; John Wiley & Sons: Hoboken, NJ, USA, 2011. [Google Scholar]
  46. Aston, J.A.; Martin, D.E. Distributions associated with general runs and patterns in hidden Markov models. Ann. Appl. Stat. 2007, 1, 585–611. [Google Scholar] [CrossRef]
  47. Martin, D.E.K. Computation of exact probabilities associated with overlapping pattern occurrences. WIREs Comput. Stat. 2019, 11, e1477. [Google Scholar] [CrossRef]
  48. Martin, D.E.K. Distributions of pattern statistics in sparse Markov models. Ann. Inst. Stat. Math. 2020, 72, 895–913. [Google Scholar] [CrossRef]
  49. Michael, B.V.; Eutichia, V. On the distribution of the number of success runs in a continuous time Markov chain. Methodol. Comput. Appl. Probab. 2020, 22, 969–993. [Google Scholar] [CrossRef]
  50. Vaggelatou, E. On the longest run and the waiting time for the first run in a continuous time multi-state Markov chain. Methodol. Comput. Appl. Probab. 2024, 26, 55. [Google Scholar] [CrossRef]
  51. Makri, F.S.; Psillakis, Z.M. Distribution of patterns of constrained length in binary sequences. Methodol. Comput. Appl. Probab. 2023, 25, 90. [Google Scholar] [CrossRef]
  52. Makri, F.S.; Psillakis, Z.M.; Dafnis, S.D. Number of runs of ones of length exceeding a threshold in a modified binary sequence with locks. Commun. Stat. Simul. Comput. 2024, 1–17. [Google Scholar] [CrossRef]
  53. Inoue, K.; Aki, S. Generalized binomial and negative binomial distributions of order k by the l-overlapping enumeration scheme. Ann. Inst. Stat. Math. 2003, 55, 153–167. [Google Scholar] [CrossRef]
Figure 1. The possible paths of Λ^(1) and Λ^(2) that terminate the experiment.
Figure 2. The states and probabilities of Y_3 (t = 1), given (a) x(t = 0) = (00), (b) x(t = 0) = (01), (c) x(t = 0) = (10), (d) x(t = 0) = (11).
Figure 3. The possible paths of Λ_1 or Λ_2 that terminate the experiment.
Table 1. The probability transition matrix of Ω_Y.

  P_{x,y} |  010   011   101   110   α1    α2
  010     |   0     0    p4     0    p3     0
  011     |   0     0     0    p7     0    p8
  101     |  p5    p6     0     0     0     0
  110     |   0     0    p4     0    p3     0
  α1      |   0     0     0     0     1     0
  α2      |   0     0     0     0     0     1
Table 2. The probability transition matrix H_{2,3}.

  H_{2,3} |  000   001   010   011   100   101   110   111
  00      |  p1    p2     0     0     0     0     0     0
  01      |   0     0    p5    p6     0     0     0     0
  10      |   0     0     0     0    p3    p4     0     0
  11      |   0     0     0     0     0     0    p7    p8
Table 3. The probability transition matrix H_{3,3}.

  H_{3,3} |  000   001   010   011   100   101   110   111
  000     |  p1    p2     0     0     0     0     0     0
  001     |   0     0    p5    p6     0     0     0     0
  010     |   0     0     0     0    p3    p4     0     0
  011     |   0     0     0     0     0     0    p7    p8
  100     |  p1    p2     0     0     0     0     0     0
  101     |   0     0    p5    p6     0     0     0     0
  110     |   0     0     0     0    p3    p4     0     0
  111     |   0     0     0     0     0     0    p7    p8
Table 4. The probability transition matrix H_{2,3}·H_{3,3}².

  H_{2,3}·H_{3,3}² |  000       001       010       011       100       101       110       111
  00               |  (p1)³     (p1)²p2   p1p2p5    p1p2p6    p2p5p3    p2p5p4    p2p6p7    p2p6p8
  01               |  p5p3p1    p2p5p3    (p5)²p4   p5p4p6    p6p7p3    p6p7p4    p6p8p7    p6(p8)²
  10               |  p3(p1)²   p3p1p2    p2p5p3    p3p2p6    p5p4p3    p5(p4)²   p6p7p4    p4p6p8
  11               |  p7p3p1    p7p3p2    p7p4p5    p7p4p6    p8p7p3    p8p7p4    (p8)²p7   (p8)³
Table 5. The highlighted probabilities used in Example 2 from the matrix H_{2,3}·H_{3,3}². (The entries coincide with those of Table 4; the highlighting marks the probabilities entering Example 2.)
Table 6. The highlighted probabilities used in Example 3 from the matrix H_{2,3}·H_{3,3}². (The entries coincide with those of Table 4; the highlighting marks the probabilities entering Example 3.)

Share and Cite

Moshkovitz, I.; Barron, Y. The Waiting Time Distribution of Competing Patterns in Markov-Dependent Bernoulli Trials. Axioms 2025, 14, 221. https://doi.org/10.3390/axioms14030221