Article

The Waiting Time Distribution of Competing Patterns in Markov-Dependent Bernoulli Trials

Department of Industrial Engineering and Management, Ariel University, Ariel 40700, Israel
* Author to whom correspondence should be addressed.
Axioms 2025, 14(3), 221; https://doi.org/10.3390/axioms14030221
Submission received: 31 December 2024 / Revised: 23 February 2025 / Accepted: 28 February 2025 / Published: 17 March 2025

Abstract:
Competing patterns are compound patterns that compete to be the first to occur a pattern-specific number of times, known as a stopping rule. In this paper, we study a higher-order Markovian dependent Bernoulli trials model with competing patterns. The waiting time distribution refers to the distribution of the number of trials required until the stopping rule is met. Based on a finite Markov chain, a hierarchical algorithm is proposed to derive the conditional probability generating function (pgf) of the waiting time of the competing patterns model. By applying the law of total expectation, the final pgf is then obtained. Using examples, we further demonstrate that the proposed algorithm is an effective and easy-to-implement tool.
MSC:
60-08; 60J10; 90-10

1. Introduction and Literature Review

The waiting time distribution for a sequence of trials refers to the distribution of the number of trials required until a specific stopping rule is satisfied. A series of identical trials is referred to as a “run”, while a general sequence is called a “pattern”. The waiting times of a sequence of trials have been extensively studied with different stopping rules. Most of these studies employ combinatorial analysis, assuming a sequence of independent and identically distributed (i.i.d.) trials, each of which ends in two or more outcomes.
The most basic model dates back to i.i.d. Bernoulli trials (each resulting in either a success “1” or a failure “0”), where the stopping rule is the occurrence of the first success (“1”). Here, the waiting time is a geometrically distributed random variable (r.v.). When the stopping rule is extended to the k-th success, the waiting time is a Negative Binomial (NB) r.v. More complex stopping rules include specific sequences, combinations of multiple stopping rules, multi-state trials, and Markovian models (with dependent trials).
The waiting time distributions of runs and patterns, such as geometric, geometric of order k, negative binomial, and sooner and later, have been successfully applied in numerous areas of statistics and applied probability. In recent decades, the theory of waiting time distributions has become an indispensable tool for studying various applications, including DNA sequence homology (Schwager [1], Karwe and Naus [2,3]), epidemiology (Kulldorff [4]), and system reliability (Aki [5], Aki and Hirano [6], Chang and Huang [7]). We will now present some specific examples.
Studying the distribution of patterns assists in DNA sequence analysis by modeling the occurrence of specific motifs, repeats, or base combinations. Each nucleotide position is treated as a Bernoulli trial, allowing researchers to detect biologically significant patterns such as transcription factor binding sites, identify mutation hotspots, and study codon usage bias. It also aids in evaluating the statistical significance of sequence alignments and testing whether observed patterns are random or functionally important, providing insights into DNA structure, function, and evolution. In this regard, Kulldorff [4] investigated the occurrences of sudden infant death syndrome or birth defects using the Bernoulli model. In psychology, Schwager [1] explored the concept of “success breeds success” or “failure breeds failure” which is applied in achievement testing, animal learning studies, athletic competition, and study performance improvement. He further demonstrated that the behavior of groups of people forming lines and other structures can be modeled as a Markov-dependent sequence of trials in which some characteristic, such as the sex of the individual, is taken as a trial outcome. Dafnis [8] demonstrated a practical application for the waiting time distribution of binary trials in the topic of meteorology and agriculture, by considering the cultivation of raisins. The harvesting of certain varieties of raisins in Greece must occur between August and September. For the harvesting, a period of at least four consecutive dry days is required. Before this period begins, a period of at least two consecutive rainy days is required to water the raisins. Dafnis [8] used “0” to denote the occurrence of a rainy day, and “1” to denote the occurrence of a dry one. The probabilities of “0” and “1” were estimated using previous years’ statistics.
In the field of agriculture, the concept of r-weak runs distribution was applied to identify the rate of development of crops (which recognizes that plant development will occur only when the temperature exceeds a specific base temperature for a certain number of days), and to investigate the impact of critical factors on honeydew honey production (Dafnis et al. [9], Dafnis et al. [10]). Drawing our attention to reliability studies, we mention the consecutive k-out-of-n systems. In these systems, each component is either working or failed, and the overall system functions only when at least k consecutive components are working within the total n components in the sequence. Dafnis et al. [11] studied the k-out-of-n system that fails if and only if a string of k non-functioning components is interrupted by a string of at most r consecutive functioning ones. They showed that there are $(r+1)^{k-1}$ different patterns of appearance that can cause a system failure. In the financial field, Dafnis and Makri [12] showed how the concept of r-weak runs can be adapted by financial advisors and technical analysts for the determination of a personalized and effective investing strategy with controlled risk.
For the literature review, we date back to Feller [13], who studied the probability generating function (pgf) for the waiting time distribution of a sequence of i.i.d. Bernoulli trials where the stopping condition is the first occurrence of a series of k consecutive successes. Philippou et al. [14] extended Feller [13] by studying a geometric distribution of order k, defined as the waiting time distribution until k consecutive successes occur. Aki [5] generalized the model to geometric, NB, Poisson, logarithmic series, and binomial distributions of order k with dependencies. Philippou and Makri [15] studied the binomial distribution of order k in a finite number of Bernoulli trials. Philippou [16] studied multi-state trials, and Ling [17] extended the k-th ordered geometric model with parameter p to $(k_1, \dots, k_m)$-order with parameters $(p_1, \dots, p_n)$; the exact distributions and the pgfs were further derived for some special cases of $(k_1, \dots, k_m)$. Shmueli and Cohen [18] used recursive formulas to compute the exact probability functions of a model with switching rules. Koutras and Eryilmaz [19] considered a compound geometric distribution of order k determined by another random process, such as Poisson or binomial.
The above literature assumes that the experiment ends after the first k consecutive successes. A more general stopping rule is to stop the experiment with the first occurrence of a particular sequence of i.i.d. Bernoulli experiments, known as a pattern. A pattern has a specific sequence and may include different symbols. In this context, Blom and Thorburn [20] analyzed the waiting time distribution until a k-digit sequence is obtained or, more generally, until one of several k-digit sequences is obtained. For the latter case, the mean waiting time was also derived. Ebneshahrashoob and Sobel [21] studied a generalized pgf, means and variances for the waiting time until obtaining a sequence of $s$ successes or a sequence of $r$ failures, whichever comes sooner. Huang [22] introduced a generalized stopping $(k_1, k_2)$-rule as the occurrence of $k_1$ consecutive failures followed by $k_2$ consecutive successes. Dafnis et al. [8], Makri [23] and Kumar and Upadhye [24] considered different types of $(k_1, k_2)$-rules. Zhao et al. [25], Kong [26] and Chadjiconstantinidis and Eryilmaz [27] studied the distributions of $(k_1, \dots, k_m)$-runs of multi-state trials.
So far, we have focused on the distribution of the first instance of a pattern, or the $(k_1, \dots, k_m)$-type model. We next review the literature dealing with the distribution of the r-th occurrence of a pattern. Aki [28] investigated the waiting time distribution until the r-th occurrence of a pattern in a sequence of i.i.d. Bernoulli trials. Koutras [29] derived several moments of the waiting time for the non-overlapping appearance of a pair of successes separated by at most $k-2$ failures ($k > 2$). Robin and Daudin [30] obtained the distribution of distances between two successive occurrences of a specific pattern, as well as between the n-th and the $(n+m)$-th occurrences. Aki and Hirano [31] further explored a two-dimensional pattern model.
Focusing on models under the assumption of Markov dependency, Hirano and Aki [32] studied the distribution of the number of success runs of length k until the n-th trial in a two-state dependent Markov chain. Fu and Koutras [33] presented an approach for the distribution theory of runs based on a finite Markov chain embedding technique that covers identical and nonidentical Bernoulli trials. In addition, the exact distribution of the waiting time for the m-th occurrence of a specific run, and the distribution of the number of success runs of length at least k, were derived. The number of failures, successes, and the first consecutive k successes were studied in Aki and Hirano [31]. Using non-overlapping counting, Fu [34] studied the multi-state trials model. Koutras [35] developed a general technique for the waiting time distribution in a two-state dependent Markov chain. Antzoulakos [36] introduced a variation of the finite Markov chain embedding method to derive the pgf until the r-th occurrence of a pattern, considering both non-overlapping and overlapping cases. Fisher and Cui [37] introduced a mathematical framework for determining the expected time for a specific pattern to emerge in higher-order Markov chains, both with and without a predefined starting sequence. This approach was extended to the calculation of the first occurrence of any pattern from a collection, along with the probability that each individual pattern is the first to appear. Chang et al. [38] studied the dual relationship between the probability of the number of patterns and the probability of the waiting time in a sequence of multi-state trials.
We next focus on models with stopping rules that involve a few simple patterns, known as the compound patterns rule, where the stopping rule is triggered by the first appearance of one of the patterns. Applying pgf techniques, Fu and Chang [39] and Han and Hirano [40] investigated multi-state trial models with compound patterns for both i.i.d. trials and first-order Markov-dependent trials. For the r-th order Markov-dependent chain, Fu and Lou [41] examined the waiting time distributions of the first occurrence of simple and compound patterns in sequences of Bernoulli trials. Wu [42] applied a finite Markov chain embedding technique to analyze the conditional waiting time distributions. Using a matrix form, Aston and Martin [43] presented an algorithm that computes the distribution of the waiting time until the m-th occurrence of a compound pattern. For an excellent summary of various calculation methods, we refer readers to the comprehensive books by Fu and Lou [44] and Balakrishnan and Koutras [45].
Our research addresses models with a competing pattern stopping rule. Here, the experiment ends when a simple or compound pattern occurs a specific number of times (note that the compound pattern rule is a special case, where the experiment ends after the first occurrence). For a real-life example, consider again the cultivation of raisins (Dafnis [8]). To obtain raisins of fine quality, grapes need successions of short rainy and dry periods. Thus, an agriculturalist cares about the frequent occurrence of patterns with at most 2 consecutive rainy days followed by at most 2 consecutive dry days. Agriculturalists further claim that the occurrence of at least five such patterns in the three months has a significant effect on the quality of raisins.
Another real-life example is taken from Dafnis et al. [9]. They studied the effect of the number of cold days on the life cycle of Marchalina hellenica. They showed that the number of cold periods (each of which is a run of k consecutive cold days) is a more critical factor to the completion time of the insect’s biological cycle than the total number of cold days.
For models with competing patterns, Aston and Martin [43] investigated the waiting time distribution for competing patterns in m-th order dependent multi-state Markov trials. They analyzed several compound patterns, each associated with a specific required number of occurrences. Martin and Aston [3] introduced the generalized later patterns model, in which all patterns must appear multiple times. We also mention the closely related sooner and later models. The sooner waiting time distribution captures the number of trials required for the first occurrence of one of two competing patterns; conversely, the later waiting time distribution refers to the number of trials required for both patterns to occur (see, e.g., Han and Hirano [40], Balakrishnan and Koutras [45]).
For other related models dealing with waiting time distributions under the Markovian assumption, we mention the hidden Markov processes (Aston and Martin [46]), in which the states are not directly observable, or the sparse Markov process, in which the transition probability matrix includes many zero or near-zero entries (see, e.g., Martin [47,48]). Dafnis [10] studied the model with independent but not necessarily identically distributed trials. Michael [49] and Vaggelatou [50] presented a framework for a continuous time Markov chain. Dafnis et al. [9] introduced the r-weak run of length at least k in a sequence of binary trials. Their model was extended to include minimum and maximum constraints (Dafnis and Makri [12]). In a series of papers, Makri ([51,52]) investigated a sequence of binary trials with a specific length threshold.
Our paper contributes to this literature by deriving simple and closed-form expressions for the pgf of waiting time distributions associated with higher-order Markovian-dependent Bernoulli trials with competing patterns. To the best of our knowledge, the studies presented to date are based on Markov chain-embedding techniques, where the state space is large, leading to a complicated transition probability matrix and high computational complexity, which makes them difficult to implement. The suggested algorithm is based on a hierarchical approach and includes three steps. In the first step, the framework is designed by including the state space, stopping rules, and steady-state probabilities. The second step determines all the paths that terminate the experiment; each such path is divided into sub-paths (components), where the total pgf is the product of the pgfs of these sub-paths. The last step derives the pgf of each sub-path by considering whether the path is longer or shorter than the number of trials of the longest pattern. The final pgf is obtained by applying the law of total expectation.
The rest of the paper is organized as follows. Section 2 introduces the definitions and preliminaries to be used. Using a hierarchical approach, Section 3 provides a mathematical description of the model and derives the pgf of the waiting time; this derivation is demonstrated by an example. A summarizing algorithm and additional examples are provided in Section 4. Finally, Section 5 concludes the paper and suggests some future directions. Following convention, we indicate vectors by bold letters and matrices by blackboard bold letters. We let $1\{A\}$ be the indicator of an event $A$, $\mathbf{e} = (1, \dots, 1)^T$ be the column vector of all ones, and $I$ be the identity matrix, all of the appropriate dimensions. We use $|A|$ to denote the number of elements in a set $A$. Summarizing our abbreviations, we use pgf(s) for probability generating function(s), i.i.d. for independent and identically distributed, and r.v.(s) for random variable(s).

2. Definitions and Preliminaries

We use the terminology of Fu and Lou [41] and Aston and Martin [43].
Bernoulli trial. Consider a sequence of Bernoulli trials $X_1, X_2, \dots$, where each trial (symbol) $X_t$ has two possible outcomes, success and failure (1 and 0, respectively), with $p(X_t = 1) = p$ and $p(X_t = 0) = 1 - p = q$. Let $S$ be the state space of an individual result $X_t$, i.e., $S = \{0, 1\}$.
A simple pattern. A simple pattern $\Lambda$ is a specific sequence of $k$ trials (symbols) $x_{i,1}, x_{i,2}, \dots, x_{i,k}$ from $S$, i.e., $\Lambda = (x_{i,1}, \dots, x_{i,k})$. The waiting time random variable of a simple pattern, $W(\Lambda)$, is defined as
$$W(\Lambda) = \inf\{n \in \mathbb{N} : n \ge k,\ X_{n-k+1} = x_{i,1}, \dots, X_n = x_{i,k}\},$$
i.e., $W(\Lambda)$ counts the number of trials until the first occurrence of pattern $\Lambda$. In the following, we assume that we start at $t = 1$.
A compound pattern $\Lambda = \bigcup_{i=1}^{l} \Lambda_i$ is the union of $l$ simple patterns. We use $\Lambda_i \cup \Lambda_j$ to denote the occurrence of either pattern $\Lambda_i$ or pattern $\Lambda_j$. Accordingly, the waiting time $W(\Lambda)$ is defined as
$$W(\Lambda) = \inf\{n \in \mathbb{N} : \text{the occurrence of any pattern } \Lambda_1, \dots, \Lambda_l \text{ during the } n \text{ trials}\}.$$
Note that $W(\Lambda) = \inf\{W(\Lambda_1), W(\Lambda_2), \dots, W(\Lambda_l)\}$.
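As a quick illustration (a sketch of ours, not from the paper), the identity $W(\Lambda) = \min_i W(\Lambda_i)$ can be checked by scanning a fixed binary sequence; the sequence and the two simple patterns below are arbitrary examples.

```python
# Hypothetical illustration: the waiting time of a compound pattern is the
# minimum of the waiting times of its simple patterns.
def waiting_time(seq, pattern):
    """Smallest n (1-indexed) such that `pattern` ends at trial n, else None."""
    k = len(pattern)
    for n in range(k, len(seq) + 1):
        if seq[n - k:n] == pattern:
            return n
    return None

seq = "0100110111010"            # an arbitrary sequence of trials
simple = ["00", "111"]           # simple patterns forming the compound pattern
w_simple = [waiting_time(seq, pat) for pat in simple]
w_compound = min(w for w in w_simple if w is not None)
print(w_simple, w_compound)      # [4, 10] 4
```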
A (discrete) Markov chain. A discrete-time Markov chain is a sequence of random variables $X_1, X_2, X_3, \dots$ with the Markov property, namely that the probability of moving to the next state depends only on the present state and not on the previous states:
$$p(X_{t+1} = x_{t+1} \mid X_1 = x_1, \dots, X_t = x_t) = p(X_{t+1} = x_{t+1} \mid X_t = x_t)$$
if both conditional probabilities are well defined, that is, if $p(X_1 = x_1, \dots, X_t = x_t) > 0$. The possible values of $X_i$ form a countable set $S$, called the state space of the chain. Time-homogeneous Markov chains are processes where $p(X_{t+1} = y \mid X_t = x) = p(X_t = y \mid X_{t-1} = x)$ for all $t$, i.e., the transition probability is independent of $t$ (for more details, see Feller [13] and Hirano and Aki [32]).
An r-th order Markov chain. Let $\{X_t\}$ be a sequence of irreducible, aperiodic, and homogeneous $r$-th order Markov-dependent $m$-state random variables (trials) defined on the state space $S = \{b_1, b_2, \dots, b_m\}$ (when $m = 2$, we have a Bernoulli trial). For $r \ge 1$, the set of all possible $r$-tuples ($m^r$ combinations) $S^r$ is given by
$$S^r = \{\mathbf{x} = (x_1 \cdots x_r) : x_i \in S,\ i = 1, \dots, r\}.$$
The $r$-order transition probabilities of the Markov chain $\{X_t\}$ are defined by:
$$P(b \mid x_1, \dots, x_r) = p(X_t = b \mid X_{t-r} = x_1, \dots, X_{t-1} = x_r) = p_{\mathbf{x},b}, \quad \mathbf{x} \in S^r,\ b \in S,$$
which is independent of $t$. The steady-state probability vector $\boldsymbol{\pi} = (\pi_{\mathbf{x}} : \mathbf{x} \in S^r)$ exists and satisfies $\boldsymbol{\pi}\mathbb{A} = \boldsymbol{\pi}$, $\boldsymbol{\pi}\mathbf{e} = 1$, where $\mathbb{A}$ is the $(m^r \times m^r)$ transition probability matrix.
In this work, we consider Bernoulli trials with m = 2 .
Result 1. Let $\{X_t\}$ be an $r$-th order Markov chain. Fu and Lou [41] showed that there exists an embedded finite Markov chain $\{Y_t\}$ defined on a state space $\Omega = \Omega(\Lambda) \cup \{\alpha\}$, where $\alpha$ is the absorbing state. Accordingly, the transition probability matrix has the block form
$$\mathbb{M} = \begin{pmatrix} \mathbb{N} & C \\ \mathbf{0} & 1 \end{pmatrix},$$
where the first block of rows corresponds to $\Omega(\Lambda)$ and the last row to $\alpha$. (Note that $\mathbb{N}\mathbf{e} + C = \mathbf{e}$, i.e., the sum of the probabilities in each row of $\mathbb{M}$ is equal to 1.) The waiting time of a pattern (simple or compound) $W(\Lambda)$ has a general geometric distribution
$$p(W(\Lambda) = n) = \boldsymbol{\xi}\,\mathbb{N}^{n-r-1} (I - \mathbb{N})\,\mathbf{e} \quad \text{and} \quad p(W(\Lambda) \ge n) = \boldsymbol{\xi}\,\mathbb{N}^{n-r-1}\,\mathbf{e}, \quad n \ge r + 1.$$
($\boldsymbol{\xi}$ is the initial distribution, and $\mathbb{N}$ is known as the essential transition probability matrix.) For more details, see Lemma 3.1 and Theorems 3.1–4.2 of Fu and Lou [41].
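As a sketch of Result 1 (our own construction, not from the paper), consider the simple pattern $\Lambda = 11$ with i.i.d. trials, treated as the order $r = 0$ case so that the exponent $n - r - 1$ becomes $n - 1$. The embedded chain has two non-absorbing states, "no progress" and "one success seen":

```python
import numpy as np

p, q = 0.5, 0.5                       # illustrative success/failure probabilities
N = np.array([[q, p],                 # "no progress": a 0 stays, a 1 advances
              [q, 0.0]])              # "one 1 seen": a 0 resets, a 1 absorbs (11 found)
xi = np.array([1.0, 0.0])             # initial distribution: no progress
e = np.ones(2)
I = np.eye(2)

# p(W = n) = xi N^(n-1) (I - N) e, the general geometric form of Result 1
pmf = {n: xi @ np.linalg.matrix_power(N, n - 1) @ (I - N) @ e
       for n in range(1, 200)}
mean = sum(n * pr for n, pr in pmf.items())
print(round(pmf[2], 4), round(mean, 4))  # 0.25 and E[W] = 1/p + 1/p^2 = 6.0
```

The probabilities match the direct combinatorial values, e.g., $p(W = 2) = p^2$ and $p(W = 3) = qp^2$.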
Competing patterns. Let $\{\Lambda^{(1)}, \dots, \Lambda^{(c)}\}$, $c \ge 1$, be a set of $c$ compound patterns. Let $n_i$ denote the number of occurrences of the compound pattern $\Lambda^{(i)}$ that terminates the experiment. The patterns $\{\Lambda^{(1)}, \dots, \Lambda^{(c)}\}$ are called competing patterns. We assume that no two competing compound patterns are identical.
Let $C_i(n)$, $i = 1, \dots, c$, be the event that, by time $n$, the compound pattern $\Lambda^{(i)}$ has occurred $n_i$ times. Then, $p(\bigcup_{i=1}^{c} C_i(n))$ is the waiting time probability function of the competing patterns.
We further note that two distinct methods of counting patterns are considered in the literature (Inoue and Aki [53]). (1). Non-overlapping counting. In this case, when a pattern occurs, the counting restarts from that point, and any partially completed pattern cannot be finished. (2). Overlapping counting. In this case, partially completed patterns can be finished at any time, regardless of whether another pattern has been completed after the partially completed pattern starts but before it is completed.
Furthermore, we have two more definitions (Aston and Martin [43]): Ending blocks of a simple pattern $\Lambda = (b_{i_1}, \dots, b_{i_k})$ are sub-patterns of the form $b_{i_1}, \dots, b_{i_q}$, where $q \in \{1, 2, \dots, k-1\}$. Ending blocks always start at the beginning of a simple pattern but end at any point before its last symbol. Finishing blocks of the simple pattern $\Lambda = (b_{i_1}, \dots, b_{i_k})$ are sub-patterns of the form $b_{i_\zeta}, \dots, b_{i_k}$, where $\zeta \in \{1, 2, \dots, k\}$. Finishing blocks may start at any point but always end with the last symbol. The finishing blocks of a compound pattern or competing patterns are formed by taking the union of the finishing blocks of their respective components.
Result 2. Consider the competing patterns $\{\Lambda^{(1)}, \dots, \Lambda^{(c)}\}$. Aston and Martin [43] showed that the waiting time distribution for competing compound patterns has a geometric form. Specifically, the competing pattern experiment ends with compound pattern $\Lambda^{(i)}$ having occurred $n_i$ times if and only if the Markov chain $\{Y_t\}$ is absorbed in the corresponding absorbing state. Therefore, the probability function of the waiting time is given by
$$P(W(\Lambda) = n) = p\left(\bigcup_{i=1}^{c} C_i(n)\right) = \boldsymbol{\psi}_0^T \mathbb{T}_Y^{\,n}\,\mathbf{e}_\Lambda,$$
where $\boldsymbol{\psi}_0$ is the initial probability vector of $\{Y_t\}$, $\mathbb{T}_Y$ is the transition probability matrix, and $\mathbf{e}_\Lambda$ is a column vector with 1s in the positions corresponding to the absorbing states and 0s elsewhere (see Section 3.2 of Aston and Martin [43]).

Probability Generating Function

The probability generating function (pgf) is a useful technique for computing distributions (see, e.g., Feller [13], Chapter XI). As a short background, for a non-negative discrete random variable $Y$, the pgf $G_Y(z)$ is defined as
$$G_Y(z) = E(z^Y) = \sum_{j=0}^{\infty} p(Y = j)\, z^j$$
for all $z \in \mathbb{R}$ for which the sum converges. The pgf is then a power series and obeys all the rules obeyed by power series with non-negative coefficients. The probabilities $p(Y = j)$ are the coefficients of the power series, and may be recovered through series expansion or by taking derivatives of $G_Y$ with respect to $z$.
Pgfs are especially useful for computing the distributions of sums of random variables, as well as moments and factorial moments. We note that, for independent random variables $V_1, \dots, V_n$, the pgf of $Y = V_1 + \cdots + V_n$ is given by
$$G_Y(z) = \prod_{i=1}^{n} G_{V_i}(z).$$
Together with the uniqueness of the pgf, this makes it a helpful tool for determining the sampling distribution of interest.
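For instance (a small sketch with an arbitrary parameter, not from the paper), multiplying pgfs corresponds to convolving pmfs: the sum of two independent geometric r.v.s has the NB pmf $\binom{j-1}{1} p^2 q^{j-2}$.

```python
from math import comb, isclose

p = 1 / 3
q = 1 - p
M = 40                                                   # truncation point for the sketch
geo = [0.0] + [p * q ** (j - 1) for j in range(1, M)]    # geometric pmf on 1..M-1

# multiplying pgfs <-> convolving pmfs: pmf of V1 + V2
conv = [sum(geo[i] * geo[j - i] for i in range(j + 1)) for j in range(M)]

for j in range(2, 10):
    # coefficient of z^j in G_geo(z)^2 equals the negative binomial pmf
    assert isclose(conv[j], comb(j - 1, 1) * p ** 2 * q ** (j - 2))
```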

3. Competing Patterns in High-Order Markov-Dependent Bernoulli Trials

The derivation of the probability generating function includes three steps. In the first step, we build the mathematical framework: the design and settings of the experiment. The second step determines the paths that terminate the experiment. Applying tools from probability theory and Markov chains, the third step derives the pgf of the number of trials (waiting time) for each path. All steps are illustrated by a running example.

3.1. Step 1. Experimental Design and Settings

Consider the $r$-th order Markov-dependent Bernoulli trials $\{X_i\}$, $i = 1, 2, \dots$, on the state space (with $2^r$ combinations)
$$S^r = \{\mathbf{x} = (x_1 \cdots x_r) : x_i \in \{0, 1\},\ i = 1, \dots, r\}.$$
The $r$-th order transition probabilities are given by:
$$p(X_t = y \mid X_{t-r} = x_1, \dots, X_{t-1} = x_r) = p_{\mathbf{x},y}, \quad \mathbf{x} = (x_1, x_2, \dots, x_r) \in S^r,\ x_i, y \in S = \{0, 1\},$$
independent of $t$.
Clearly, $\sum_{y \in S} p_{\mathbf{x},y} = 1$ for all $\mathbf{x} \in S^r$. In matrix form, let $\mathbb{A}$ be the $|S^r| \times |S^r|$ square matrix with elements $\mathbb{A}_{\mathbf{x},\mathbf{x}'}$ given by
$$\mathbb{A}_{\mathbf{x},\mathbf{x}'} = \begin{cases} p_{\mathbf{x},y} & \mathbf{x} = (x_1, x_2, \dots, x_r),\ \mathbf{x}' = (x_2, \dots, x_r, y), \\ 0 & \text{otherwise.} \end{cases}$$
Associated with the sequence are the initial probabilities
$$\boldsymbol{\pi} = \pi(x_{-r+1}, \dots, x_0) = p(X_{-r+1} = x_{-r+1}, \dots, X_0 = x_0).$$
(Note that $\mathbb{A}\mathbf{e} = \mathbf{e}$, and the vector $\boldsymbol{\pi} = \{\pi_{\mathbf{x}},\ \mathbf{x} \in S^r\}$ satisfies the system of equations $\boldsymbol{\pi}\mathbb{A} = \boldsymbol{\pi}$, $\boldsymbol{\pi}\mathbf{e} = 1$.) Let $\Lambda^{(i)} = \{\Lambda_1^{(i)}, \dots, \Lambda_{k_i}^{(i)}\}$, $i = 1, \dots, c$, be a compound pattern that includes $k_i$ simple patterns, each of which has size $l_{i,j} = |\Lambda_j^{(i)}|$. We assume that $l_{i,j} \ge r$. Let $n_i$ denote the number of non-overlapping occurrences of $\Lambda^{(i)}$ needed for the termination of the experiment, and let $l_i = \max_{j=1,\dots,k_i} \{l_{i,j}\}$, $i = 1, \dots, c$. Let $\Lambda = \{\Lambda^{(i)}\}_{i=1}^{c}$ be the set of competing patterns, and let $W_\Lambda$ denote the waiting time random variable (number of trials) until the experiment terminates, given the steady-state environment. We assume non-overlapping counting.
Example 1.
Our base case considers second-order Markov-dependent Bernoulli trials, i.e., $r = 2$ and $x_i \in \{0, 1\}$. The transition probabilities $p_{\mathbf{x},y}$, $\mathbf{x} = (x_1, x_2)$, $x_i, y \in \{0, 1\}$, are:
$$p_{00,0} = p_1,\ p_{00,1} = p_2,\ p_{10,0} = p_3,\ p_{10,1} = p_4,\ p_{01,0} = p_5,\ p_{01,1} = p_6,\ p_{11,0} = p_7,\ p_{11,1} = p_8.$$
In matrix form (rows and columns ordered $00, 01, 10, 11$),
$$\mathbb{A} = \begin{pmatrix} p_1 & p_2 & 0 & 0 \\ 0 & 0 & p_5 & p_6 \\ p_3 & p_4 & 0 & 0 \\ 0 & 0 & p_7 & p_8 \end{pmatrix}.$$
(Note that $p_i + p_{i+1} = 1$, $i = 1, 3, 5, 7$.) Here, the steady-state probability vector $\boldsymbol{\pi} = (\pi_{00}, \pi_{01}, \pi_{10}, \pi_{11})$ satisfying $\boldsymbol{\pi}\mathbb{A} = \boldsymbol{\pi}$, $\boldsymbol{\pi}\mathbf{e} = 1$ is given by:
$$\pi_{00} = \frac{p_3 p_7}{D}, \quad \pi_{01} = \pi_{10} = \frac{(1 - p_1)\, p_7}{D}, \quad \pi_{11} = \frac{p_1 p_5 - p_1 - p_5 + 1}{D},$$
where $D = p_1 p_5 - 2 p_1 p_7 + p_3 p_7 - p_1 - p_5 + 2 p_7 + 1$.
We assume two competing patterns ($c = 2$): $\Lambda^{(1)} = \{00\}$ with $n_1 = 3$ ($l_1 = 2$), and $\Lambda^{(2)} = \{111\}$ with $n_2 = 2$ ($l_2 = 3$). That is, the experiment terminates if either three occurrences of two consecutive 0s (failures) or two occurrences of three consecutive 1s (successes) are observed.
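The steady-state vector of Example 1 can be checked numerically; the values of $p_1, p_3, p_5, p_7$ below are arbitrary illustrative choices, and any row-stochastic choice works.

```python
import numpy as np

p1, p3, p5, p7 = 0.6, 0.7, 0.4, 0.3            # illustrative values
p2, p4, p6, p8 = 1 - p1, 1 - p3, 1 - p5, 1 - p7
A = np.array([[p1, p2, 0,  0 ],                # states ordered 00, 01, 10, 11
              [0,  0,  p5, p6],
              [p3, p4, 0,  0 ],
              [0,  0,  p7, p8]])

# solve pi A = pi, pi e = 1 via the left eigenvector for eigenvalue 1
w, v = np.linalg.eig(A.T)
pi = np.real(v[:, np.argmax(np.real(w))])
pi /= pi.sum()

# closed-form solution of the balance equations
D = p1*p5 - 2*p1*p7 + p3*p7 - p1 - p5 + 2*p7 + 1
closed = np.array([p3*p7, (1 - p1)*p7, (1 - p1)*p7, p1*p5 - p1 - p5 + 1]) / D
print(np.allclose(pi, closed))  # True
```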

3.2. Step 2. Stopping Paths

We start by applying a higher hierarchical point of view, focusing on paths of patterns rather than on individual trials. Clearly, several paths can terminate the experiment. Let $V_i$, $i = 1, \dots, c$, be the set of paths that terminate the experiment due to $\Lambda^{(i)}$. Concretely, the set $V_i$ includes all paths with the following structure: the pattern $\Lambda^{(i)}$ appears $n_i$ times in total, where its last, $n_i$-th occurrence (which terminates the experiment) is the last component of the path; any other pattern $\Lambda^{(j)}$, $j \ne i$, appears no more than $n_j - 1$ times. Assume that there are $C_i$ such paths (i.e., $|V_i| = C_i$). Denote these paths by $V_{i,1}, V_{i,2}, \dots, V_{i,C_i}$, so that $V_i = \{V_{i,j}\}_{j=1,\dots,C_i}$. In the following, each path will be referred to as a “stopping vector”. Let $V = \{V_i\}_{i=1,\dots,c}$ be the set of all stopping vectors that terminate the experiment.
Corollary 1.
It is easy to verify the following:
(i) 
The number of competing patterns in each path $V_{i,j}$ satisfies:
$$\min_{i=1,\dots,c}(n_i) \le |V_{i,j}| \le \sum_{i=1}^{c}(n_i - 1) + 1, \quad j = 1, \dots, C_i,\ i = 1, \dots, c.$$
(ii) 
The number of paths in the set $V_i$ is given by (combinatorial considerations):
$$C_i = \sum_{k_1=0}^{n_1-1} \cdots \sum_{k_c=0}^{n_c-1} \frac{(k_1 + \cdots + k_c + n_i - 1)!}{k_1! \cdots k_c! \,(n_i - 1)!}, \quad k_j \text{ with } j \ne i.$$
(Note that the summation is over all patterns excluding $\Lambda^{(i)}$; the index $k_i$ does not appear.)
(iii) 
Since the $\Lambda^{(i)}$, $i = 1, \dots, c$, are distinct patterns, we have:
$$V = \bigcup_{i=1}^{c} V_i, \quad |V| = C, \quad \text{where } C = \sum_{i=1}^{c} C_i.$$
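The count in Corollary 1(ii) can be sketched directly (a helper function of ours, hypothetical naming): for the terminating pattern $\Lambda^{(i)}$, sum the multinomial coefficients over the possible occurrence counts $k_j < n_j$ of the other patterns.

```python
from math import factorial, prod
from itertools import product

def num_stopping_paths(n, i):
    """C_i of Corollary 1(ii); n maps pattern index -> required occurrences."""
    others = [j for j in n if j != i]
    total = 0
    for ks in product(*(range(n[j]) for j in others)):    # k_j = 0..n_j - 1
        m = sum(ks) + n[i] - 1                            # total patterns before the last
        total += factorial(m) // (prod(factorial(k) for k in ks)
                                  * factorial(n[i] - 1))
    return total

n = {1: 3, 2: 2}                  # the setting of Example 1
print(num_stopping_paths(n, 1), num_stopping_paths(n, 2))  # 4 6
```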
Example 2.
We have $\Lambda^{(1)} = 00$ with $n_1 = 3$, and $\Lambda^{(2)} = 111$ with $n_2 = 2$; thus, the experiment terminates if either three occurrences of two consecutive 0s or two occurrences of three consecutive 1s appear. Let $V_1$ and $V_2$ be the sets of paths that terminate the experiment due to $\Lambda^{(1)}$ and $\Lambda^{(2)}$, respectively. Figure 1 illustrates the sequences (paths) of possible patterns that terminate the experiment (denoted by $V_{i,j}$; see the red labels). We observe that $V_1$ contains four possible paths ($V_{1,1}, \dots, V_{1,4}$) and $V_2$ contains six possible paths ($V_{2,1}, \dots, V_{2,6}$). In summary, there are a total of 10 possible paths (stopping vectors) that end the experiment.
Applying Corollary 1(i), each such stopping vector includes at least two patterns (due to $n_2$) and no more than four patterns ($(n_1 - 1) + (n_2 - 1) + 1 = 4$). Applying Corollary 1(ii), the sets $V_1$ and $V_2$ include $C_1$ and $C_2$ stopping vectors, respectively, given by:
$$C_1 = \sum_{k_1=0}^{n_2-1} \frac{(k_1 + n_1 - 1)!}{k_1!\,(n_1 - 1)!} = \frac{(3-1)!}{0!\,(3-1)!} + \frac{(1+3-1)!}{1!\,(3-1)!} = 4,$$
$$C_2 = \sum_{k_1=0}^{n_1-1} \frac{(k_1 + n_2 - 1)!}{k_1!\,(n_2 - 1)!} = \frac{(2-1)!}{0!\,(2-1)!} + \frac{(1+2-1)!}{1!\,(2-1)!} + \frac{(2+2-1)!}{2!\,(2-1)!} = 6.$$
$V = V_1 \cup V_2$, and $|V| = C_1 + C_2 = 10$ vectors (see also Figure 1):
$$V_1 = \{V_{1,1} = [\Lambda^{(1)}, \Lambda^{(1)}, \Lambda^{(1)}],\ V_{1,2} = [\Lambda^{(1)}, \Lambda^{(1)}, \Lambda^{(2)}, \Lambda^{(1)}],\ V_{1,3} = [\Lambda^{(1)}, \Lambda^{(2)}, \Lambda^{(1)}, \Lambda^{(1)}],\ V_{1,4} = [\Lambda^{(2)}, \Lambda^{(1)}, \Lambda^{(1)}, \Lambda^{(1)}]\},$$
$$V_2 = \{V_{2,1} = [\Lambda^{(2)}, \Lambda^{(2)}],\ V_{2,2} = [\Lambda^{(2)}, \Lambda^{(1)}, \Lambda^{(2)}],\ V_{2,3} = [\Lambda^{(1)}, \Lambda^{(2)}, \Lambda^{(2)}],\ V_{2,4} = [\Lambda^{(2)}, \Lambda^{(1)}, \Lambda^{(1)}, \Lambda^{(2)}],\ V_{2,5} = [\Lambda^{(1)}, \Lambda^{(2)}, \Lambda^{(1)}, \Lambda^{(2)}],\ V_{2,6} = [\Lambda^{(1)}, \Lambda^{(1)}, \Lambda^{(2)}, \Lambda^{(2)}]\}.$$
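The ten stopping vectors can be recovered by brute force (a sketch of ours): enumerate label sequences and keep those whose last entry completes its quota exactly, with no quota reached earlier.

```python
from itertools import product

n = {1: 3, 2: 2}                               # required occurrences n1, n2
max_len = sum(v - 1 for v in n.values()) + 1   # upper bound of Corollary 1(i)
stopping = []
for L in range(min(n.values()), max_len + 1):
    for path in product(n, repeat=L):
        # the last pattern reaches its quota exactly at the end ...
        terminal = path.count(path[-1]) == n[path[-1]]
        # ... and no quota was reached before the last entry
        premature = any(path[:-1].count(i) >= n[i] for i in n)
        if terminal and not premature:
            stopping.append(path)

print(len(stopping))                            # 10
print(sum(p[-1] == 1 for p in stopping),        # C1 = 4
      sum(p[-1] == 2 for p in stopping))        # C2 = 6
```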
Remark 1.
Note that Figure 1 presents only the paths of competing patterns; intermediate trials that do not lead to a pattern are not presented.
We then derive the probability-generating function of the waiting time until the experiment terminates.

3.3. Step 3. The Waiting Time Distribution

Let $W_\Lambda$ denote the waiting time until the experiment terminates, with $G(z)$ its pgf. Applying the law of total expectation and noting that $\{V_{i,j}\}$, $i = 1, \dots, c$, $j = 1, \dots, C_i$, are disjoint vectors (each vector refers to a different path) leads to:
$$G(z) = E(z^{W_\Lambda}) = \sum_{i,j} E\left(z^{W_\Lambda} \cdot 1\{V_{i,j}\}\right).$$
($1\{A\}$ is the indicator function.) Each stopping vector $V_{i,j}$ is composed of a numbered sequence of consecutive patterns $V_{i,j} = [V_{i,j}(1), V_{i,j}(2), \dots]$, e.g., $V_{2,4}(1) = \Lambda^{(2)}$, $V_{2,4}(2) = \Lambda^{(1)}$. Let $W_{i,j}(k)$, $k = 2, 3, \dots$, be the number of trials from pattern $V_{i,j}(k-1)$ until the occurrence of the next pattern $V_{i,j}(k)$. Accordingly, for $k = 1$, we let $W_{i,j}(1)$ be the number of trials until the first pattern $V_{i,j}(1)$ occurs, given initial state $\mathbf{x} \in S^r$. Let $W_{i,j}$ be the waiting time due to path $V_{i,j}$. It is easy to verify that:
$$W_{i,j} = \sum_{k=1}^{|V_{i,j}|} W_{i,j}(k), \quad E\left(z^{W_{i,j}}\right) = E\left(z^{\sum_{k=1}^{|V_{i,j}|} W_{i,j}(k)}\right).$$
Substituting $W_\Lambda = \sum_{i,j} W_{i,j} \cdot 1\{V_{i,j}\}$ and (11) into (10) yields:
$$E(z^{W_\Lambda}) = \sum_{i,j} E\left(z^{W_{i,j}} \cdot 1\{V_{i,j}\}\right) = \sum_{i,j} E\left(z^{\sum_{k=1}^{|V_{i,j}|} W_{i,j}(k)} \cdot 1\{V_{i,j}\}\right) = \sum_{i,j} E\left(z^{W_{i,j}(1)} \cdot z^{W_{i,j}(2)} \cdots 1\{V_{i,j}\}\right).$$
Assume that the pattern V i , j ( k − 1 ) occurs (for k = 1 , we assume an initial state x ∈ S r ). Let G V i , j ( k − 1 ) , V i , j ( k ) be the pgf of W i , j ( k ) , i.e., the pgf of the number of trials from V i , j ( k − 1 ) until V i , j ( k ) ,
G V i , j ( k − 1 ) , V i , j ( k ) = E ( z W i , j ( k ) ∣ V i , j ( k − 1 ) ) for k = 2 , 3 , … , and E ( z W i , j ( 1 ) ∣ x ∈ S r ) for k = 1 .
Note that, due to the Markovian property, the non-overlapping feature, and conditioning on V i , j ( k − 1 ) , the waiting time W i , j ( k ) is independent of V i , j ( 1 ) , … , V i , j ( k − 2 ) . Therefore, we have:
G ( z ) = ∑ i , j ∏ k = 1 | V i , j | G V i , j ( k − 1 ) , V i , j ( k ) ( z ) .
We further note that the function G V i , j ( k − 1 ) , V i , j ( k ) is homogeneous; it depends only on the patterns V i , j ( k − 1 ) and V i , j ( k ) and is independent of k. Thus, for simplicity, and without loss of generality, we use W i , j , i , j = 1 , … , c , to denote the waiting time between two consecutive patterns Λ ( i ) and Λ ( j ) , given that the pattern Λ ( i ) occurs ( c 2 ordered pairs), and, respectively, G i , j ( z ) to denote the pgf of W i , j . Accordingly, we let G i ( z ) be the pgf of the number of trials until the first appearance of Λ ( i ) , given the initial state x ∈ S r .
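The structure of (14) can be illustrated with a short sketch: since the inter-pattern waiting times along a path are independent, the path pgf is the product of the segment pgfs, and for coefficient arrays a product of pgfs is a convolution. The two segment pgfs below are hypothetical toy distributions, not values from the paper:

```python
# Sketch of Eq. (14): the pgf of a sum of independent waiting times is the
# product of the individual pgfs.  A pgf is stored as a coefficient list,
# coeffs[n] = P(W = n); multiplying pgfs = convolving coefficient lists.

def pgf_product(a, b):
    """Convolve two pgf coefficient lists: the pgf of W_a + W_b."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            out[i + j] += pa * pb
    return out

def pgf_mean(coeffs):
    """E[W] = G'(1) = sum over n of n * P(W = n)."""
    return sum(n * p for n, p in enumerate(coeffs))

# Two hypothetical segment pgfs: G_1 (first pattern) and G_{1,1} (the gap
# between two consecutive occurrences); both are toy distributions.
g1  = [0.0, 0.0, 0.4, 0.6]   # P(W = 2) = 0.4, P(W = 3) = 0.6
g11 = [0.0, 0.0, 0.7, 0.3]

path_pgf = pgf_product(g1, g11)          # pgf along the path [Λ(1), Λ(1)]
assert abs(sum(path_pgf) - 1.0) < 1e-12  # still a proper distribution
# Means add: E[W_path] = E[W_1] + E[W_{1,1}].
assert abs(pgf_mean(path_pgf) - (pgf_mean(g1) + pgf_mean(g11))) < 1e-12
```

The same convolution, applied segment by segment along each stopping vector and summed over paths, is exactly what (14) expresses in pgf form.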
Example 3.
Two competing patterns yield four pgfs that differ in their starting and ending patterns, namely G 1 , 1 ( z ) , G 1 , 2 ( z ) , G 2 , 1 ( z ) , and G 2 , 2 ( z ) . In addition, we have two functions, G 1 ( z ) and G 2 ( z ) , corresponding to the first occurrence of Λ ( 1 ) and Λ ( 2 ) , respectively. The total pgf is the sum over the ten distinct stopping vectors in V , each term being the product of the corresponding pgfs along the path V i , j . Specifically, we have:
G ( z ) = G 1 ( z ) G 1 , 1 ( z ) 2 V 11 + G 1 ( z ) G 1 , 1 ( z ) G 1 , 2 ( z ) G 2 , 1 ( z ) V 12 + G 1 ( z ) G 1 , 2 ( z ) G 2 , 1 ( z ) G 1 , 1 ( z ) V 13 + G 2 ( z ) G 2 , 1 ( z ) G 1 , 1 ( z ) 2 V 14 + G 2 ( z ) G 2 , 2 ( z ) V 21 + G 2 ( z ) G 2 , 1 ( z ) G 1 , 2 ( z ) V 22 + G 1 ( z ) G 1 , 2 ( z ) G 2 , 2 ( z ) V 23 + G 2 ( z ) G 2 , 1 ( z ) G 1 , 1 ( z ) G 1 , 2 ( z ) V 24 + G 1 ( z ) G 1 , 2 ( z ) 2 G 2 , 1 ( z ) V 25 + G 1 ( z ) G 1 , 1 ( z ) G 1 , 2 ( z ) G 2 , 2 ( z ) V 26
Our next step is to derive the probability-generating functions G i , j ( z ) and G i ( z ) for i , j = 1 , … , c . Recall that l i is the length of the longest pattern in Λ ( i ) , and let b = max i = 1 , … , c { l i } . Our aim is to apply the law of total expectation as a function of the number of trials between two consecutive patterns (or until the first pattern), distinguishing whether this number is at most b or exceeds b. Thus, we use the decomposition:
G i , j ( z ) = G i , j ( z ) 1 z ≤ b + G i , j ( z ) 1 z > b , G i ( z ) = G i ( z ) 1 z ≤ b + G i ( z ) 1 z > b .
The first term of (16), where the number of trials is no more than b, is relatively simple to derive and consists of a finite number of paths. Conversely, the second term of (16), where more than b trials are possible, is more challenging. We start with the simpler derivation, G i , j ( z ) 1 z ≤ b and G i ( z ) 1 z ≤ b .

3.3.1. The Functions G i ( z ) 1 z ≤ b ,   G i , j ( z ) 1 z ≤ b

We first highlight the main difference between G i ( z ) 1 z ≤ b and G i , j ( z ) 1 z ≤ b . The function G i ( z ) 1 z ≤ b refers to the first b trials; here, we use the steady-state probability vector π ( x ) , x ∈ S r , multiplied by the remaining ( b − r ) transition probabilities. In contrast, the function G i , j ( z ) 1 z ≤ b assumes that Λ ( i ) occurs, and continues with the product of at most b transition probabilities leading to Λ ( j ) .
To derive G i ( z ) 1 z ≤ b , recall that l i ≥ r . Let p ( Λ ( i ) , b ) be the probability of hitting Λ ( i ) within no more than b trials (while no other pattern is hit). Let L i ( u ) = { x = x 1 ⋯ x u : x u − l i + 1 ⋯ x u = Λ ( i ) } be the set of sequences of u trials in which Λ ( i ) appears in the last l i trials and no other pattern is hit; recall that r ≤ l i ≤ u ≤ b . Denote by p ( x k ( u ) ) the ergodic probability of a path x k ( u ) ∈ L i ( u ) . Clearly, when u = l i = r , L i ( u ) includes only the pattern Λ ( i ) itself, with probability p ( L i ( u ) ) = π Λ ( i ) . When l i < u ≤ b , there may be several paths in L i ( u ) ; each path x k ( u ) starts with some x ˜ ∈ S r (with probability π ( x ˜ ) ), multiplied by a sequence of transition probabilities that leads to the ending pattern Λ ( i ) . Thus, the ergodic probability p ( Λ ( i ) , b ) has the form:
p ( Λ ( i ) , b ) = π Λ ( i ) 1 { l i = r } + ∑ u = l i + 1 b ∑ k p ( x k ( u ) ) 1 { x k ( u ) ∈ L i ( u ) } .
The function G i ( z ) 1 z ≤ b is then,
G i ( z ) 1 z ≤ b = z r π Λ ( i ) 1 { l i = r } + ∑ u = l i + 1 b ∑ k z u p ( x k ( u ) ) 1 { x k ( u ) ∈ L i ( u ) } .
To derive G i , j ( z ) 1 z ≤ b , we assume that, at time t = 0 , the state consists of the last r trials of Λ ( i ) , and consider the analogous sets of paths of u ≤ b trials leading to Λ ( j ) . Each such path contributes a component to G i , j ( z ) 1 z ≤ b : the product of the corresponding transition probabilities multiplied by z u . We next demonstrate (18) using our example.
Example 4.
Here, r = 2 and b = 3 . To derive G 1 ( z ) 1 z ≤ 3 , we consider the pattern Λ ( 1 ) = 00 with l 1 = 2 = r . Here, u = 2 or u = 3 . When u = 2 , we have L 1 ( 2 ) = { 00 } and p ( L 1 ( 2 ) ) = π Λ ( 1 ) = π 00 ; when u = 3 , we have L 1 ( 3 ) = { 100 } and p ( L 1 ( 3 ) ) = π 10 · p 3 . Next, we derive G 2 ( z ) 1 z ≤ 3 . Here, Λ ( 2 ) = 111 , l 2 = b = 3 > r ; thus, u = 3 with the only path L 2 ( 3 ) = { 111 } and p ( L 2 ( 3 ) ) = π 11 · p 8 . Summarizing, we obtain:
G 1 ( z ) 1 z ≤ 3 = z 2 π 00 + z 3 π 10 · p 3 , G 2 ( z ) 1 z ≤ 3 = z 3 π 11 · p 8 .
Next, we derive G i , j ( z ) 1 z ≤ 3 . We assume that, at time t = 0 , the state consists of the last r trials of Λ ( i ) (i.e., for Λ ( 1 ) we assume X 0 = 00 , and for Λ ( 2 ) we assume X 0 = 11 ). The function G i , j ( z ) 1 z ≤ 3 is constructed from paths of at most three transition probabilities (multiplied by the corresponding power of z) that lead to Λ ( j ) . Here, we obtain:
G 1 , 1 ( z ) 1 z ≤ 3 = z 2 p 1 2 + z 3 p 2 · p 5 · p 3 , G 1 , 2 ( z ) 1 z ≤ 3 = z 3 p 2 · p 6 · p 8 , G 2 , 1 ( z ) 1 z ≤ 3 = z 2 p 7 · p 3 + z 3 p 8 · p 7 · p 3 , G 2 , 2 ( z ) 1 z ≤ 3 = z 3 p 8 3 ,
i.e., when assuming Λ ( 1 ) (with X 0 = 00 ) , the paths 00 and 100 (w.p. p 1 2 and p 2 · p 5 · p 3 , respectively) lead to Λ ( 1 ) , and the path 111 (w.p. p 2 · p 6 · p 8 ) leads to Λ ( 2 ) . Similarly, when assuming Λ ( 2 ) (with X 0 = 11 ), the paths 00 and 100 (w.p. p 7 · p 3 and p 8 · p 7 · p 3 , respectively) lead to Λ ( 1 ) , and the path 111 (w.p. p 8 3 ) leads to Λ ( 2 ) .
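As a sanity check of the truncated pgfs above, they can be verified by brute-force enumeration of all trial sequences of length at most b = 3 following the pattern just completed. The transition probabilities p 1 , … , p 8 below are illustrative values, not estimates from the paper:

```python
from itertools import product

# Illustrative second-order transition probabilities (assumed values).
p = dict(p1=0.3, p2=0.7, p3=0.4, p4=0.6, p5=0.5, p6=0.5, p7=0.2, p8=0.8)
# P(next trial | last two trials): tuple = (prob of 0, prob of 1)
trans = {'00': (p['p1'], p['p2']), '01': (p['p5'], p['p6']),
         '10': (p['p3'], p['p4']), '11': (p['p7'], p['p8'])}
patterns = ['00', '111']             # Λ(1), Λ(2) of the running example

def first_hit(new):
    """(t, pattern) at which the first pattern completes within the new
    trials (non-overlapping counting: only new trials count), or None."""
    for t in range(1, len(new) + 1):
        for pat in patterns:
            if t >= len(pat) and new[:t].endswith(pat):
                return t, pat
    return None

def truncated_pgf(start, target, b=3):
    """coef[n] = prob. that `target` is the first pattern completed and it
    completes exactly at trial n <= b, starting from the r-state `start`."""
    coef = {}
    for n in range(1, b + 1):
        for seq in product('01', repeat=n):
            new = ''.join(seq)
            if first_hit(new) != (n, target):
                continue
            prob, state = 1.0, start
            for x in new:
                prob *= trans[state][int(x)]
                state = state[1] + x
            coef[n] = coef.get(n, 0.0) + prob
    return coef

g11 = truncated_pgf('00', '00')      # from Λ(1) = 00 back to 00
g12 = truncated_pgf('00', '111')     # from Λ(1) = 00 to Λ(2) = 111
# Matches z^2*p1^2 + z^3*p2*p5*p3 and z^3*p2*p6*p8 from Example 4.
assert abs(g11[2] - p['p1'] ** 2) < 1e-12
assert abs(g11[3] - p['p2'] * p['p5'] * p['p3']) < 1e-12
assert abs(g12[3] - p['p2'] * p['p6'] * p['p8']) < 1e-12
```

The enumeration scales exponentially in b and is intended only as a check of the closed-form coefficients.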

3.3.2. The Functions G i ( z ) 1 z > b ,   G i , j ( z ) 1 z > b

Let us consider the first b trials x = x 1 ⋯ x b ∈ S b . We group all x ∈ S b that contain the pattern Λ ( i ) into the set ξ b i . Define the state α i to be the absorbing state due to pattern Λ ( i ) , i.e., α i groups all states in ξ b i . Denote the absorbing vector by α = ( α 1 , α 2 , … , α c ) . The set Ω Y ∖ { α } ⊂ S b groups all states that do not include a pattern. Note that every state in Ω Y ∖ { α } = Ω Y ∖ { α 1 , … , α c } is a sequence of length b. Let T be the transition probability matrix among the states in Ω Y ∖ { α } . In addition, let T i 0 , i = 1 , … , c , be the absorbing probability vector into state α i , and define the absorbing matrix T 0 = ( T 1 0 , T 2 0 , … , T c 0 ) . We construct the embedded homogeneous Markov chain { Y t } t ≥ b + 1 on the state space Ω Y = { S b } . The transition probability matrix has the block form
M = ( p x y ) = ( T T 0 ; 0 I ) , with block rows and columns ordered as ( Ω Y ∖ { α } , α ) ,
where
p x y = p x , x b + 1 if x ∈ Ω Y ∖ { α } and y = ( x 2 ⋯ x b x b + 1 ) ∈ Ω Y , 1 if x = y ∈ { α i } , 0 otherwise .
Note that T e + ∑ j = 1 c T j 0 = e (each row of M sums to one). The initial probability of Y b is given by:
π Y b = ( P ( Y b = x ) : x ∈ Ω Y ) , where P ( Y b = x ) = π Y b ( x ) for x = x 1 ⋯ x b ∈ Ω Y ∖ { α } , P ( Y b = α i ) = ∑ x ∈ ξ b i π Y b ( x ) , and 0 otherwise .
Since b ≥ r , the initial probability π Y b ( x ) for x = x 1 ⋯ x b starts with the corresponding steady-state probability π x 1 ⋯ x r , multiplied by the transition probabilities along the path x r + 1 , … , x b . To complete the derivation, we need to define two row probability vectors of order 1 × | Ω Y ∖ { α } | , IP and IP i , i = 1 , … , c . Both vectors represent the probability of entering the states in Ω Y ∖ { α } after b trials with no competing pattern hit; the difference arises from their initial conditions. The vector IP = ( IP ( x ) : x ∈ Ω Y ∖ { α } ) assumes an initial state in S r (not including a pattern), and calculates the probability of entering Ω Y ∖ { α } at time t = b ; here, we use the steady-state probability vector π of order r, and a product of ( b − r ) successive transition probabilities. The vector IP i = ( IP i ( x ) : x ∈ Ω Y ∖ { α } ) assumes an initial pattern Λ ( i ) and derives the probability of entering Ω Y ∖ { α } afterward; here, we use a product of b successive transition probabilities.
Proposition 1.
The functions G i ( z ) 1 z > b and G i , j ( z ) 1 z > b have the general form:
G j ( z ) 1 z > b = z b IP · ( I − z T ) − 1 · z T j 0 , j = 1 , … , c , G i , j ( z ) 1 z > b = z b IP i · ( I − z T ) − 1 · z T j 0 , i , j = 1 , … , c .
Proof. 
The derivation of G i ( z ) 1 z > b and G i , j ( z ) 1 z > b is composed of three parts. In the first part, the experiment enters a state within the set Ω Y ∖ { α } after b trials in which no hitting occurs; thus, we multiply IP and IP i , respectively, by z b . The second part is the pgf of the number of trials spent in the set Ω Y ∖ { α } until absorption. Here, following Fu and Lou [41], we have the term ( I − z T ) − 1 . From that point, the third part is the probability of hitting Λ ( j ) in the next trial, with pgf z T j 0 .  □
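Proposition 1 can be exercised on a small toy absorbing chain. The sketch below implements z b IP ( I − z T ) − 1 z T j 0 for two transient and two absorbing states (all numbers are assumptions for illustration) and checks that, at z = 1, the tails over j sum to the probability of needing more than b trials, i.e., to the total mass of IP:

```python
# Toy check of Proposition 1 with two transient and two absorbing states.
# All numerical values below are assumptions for illustration only.

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def g_tail(z, b, ip, T, T0j):
    """z^b * IP * (I - zT)^{-1} * z*T0_j, as in Proposition 1."""
    M = inv2([[1 - z * T[0][0], -z * T[0][1]],
              [-z * T[1][0], 1 - z * T[1][1]]])
    row = [sum(ip[k] * M[k][i] for k in range(2)) for i in range(2)]
    return z ** b * z * sum(row[i] * T0j[i] for i in range(2))

b = 3
T = [[0.2, 0.3], [0.1, 0.4]]          # transitions among transient states
T0 = {1: [0.4, 0.1], 2: [0.1, 0.4]}   # absorption into alpha_1, alpha_2
IP = [0.15, 0.05]                     # prob. of being transient after b trials

# At z = 1, the two tails must sum to P(more than b trials) = sum(IP),
# because (I - T)^{-1} (T0_1 + T0_2) is the all-ones vector.
total = g_tail(1.0, b, IP, T, T0[1]) + g_tail(1.0, b, IP, T, T0[2])
assert abs(total - sum(IP)) < 1e-12
```

Evaluating g_tail on a grid of z values (or differentiating at z = 1) then yields tail probabilities and moments for chains of any transient dimension, with inv2 replaced by a general linear solver.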
Remark 2.
An easy way to derive the vectors IP and IP i , i = 1 , … , c , is as follows. Construct the transition probability matrices H r , b = ( p x , x ′ : x ∈ S r , x ′ ∈ S b ) and H b , b = ( p x , x ′ : x ∈ S b , x ′ ∈ S b ) . The matrix H r , b · H b , b b − 1 presents the probability of reaching each state in Ω Y , given an initial r-trial state. The vectors IP i , i = 1 , … , c , can be extracted from H r , b · H b , b b − 1 by taking the row corresponding to the r-trial ending block of Λ ( i ) , and the columns corresponding to Ω Y ∖ { α } , with the appropriate eliminations.
Example 5.
Since b = 3 , we have x ∈ S 3 and Ω Y = { 000 , 001 , 010 , 011 , 100 , 101 , 110 , 111 } . Here, α 1 = { 000 , 001 , 100 } , α 2 = { 111 } , α = α 1 ∪ α 2 = { 000 , 001 , 100 , 111 } , and Ω Y ∖ α = { 010 , 011 , 101 , 110 } . We construct the embedded homogeneous Markov chain { Y t } t ≥ 4 on the state space
Ω Y = { 010 , 011 , 101 , 110 , α 1 , α 2 } .
To clarify, Figure 2a–d illustrate the states and probabilities of Y 3 ; states marked in light red include a pattern and are, thus, labeled α 1 or α 2 .
Summarizing, the initial probability distribution of Y 3 is given by (note that π Y 3 e = 1 ):
π Y 3 = ( π 01 p 5 , π 01 p 6 , π 10 p 4 , π 11 p 7 , π 00 p 1 + π 00 p 2 + π 10 p 3 , π 11 p 8 ) , where the first four entries correspond to x ∈ Ω Y ∖ { α } , the fifth to x = α 1 , and the last to x = α 2 .
The transition probability matrix has the form T T 0 0 I , and is given by (Table 1):
To complete our derivation, we need to derive the vectors IP ,   IP 1 , and IP 2 . Note that the vector IP is the sub-vector of π Y 3 with the entries corresponding to Ω Y ∖ α ; i.e.,
IP = ( π 01 p 5 , π 01 p 6 , π 10 p 4 , π 11 p 7 ) , corresponding to the states ( 010 , 011 , 101 , 110 ) .
(Note that, since ( 00 ) is a competing pattern, the vector IP considers only the two-trial states {01,10,11}). In order to derive IP 1 and IP 2 , we apply Remark 2. Here, the matrices H 2 , 3 and H 3 , 3 are given by (Table 2):
And thus, H 2 , 3 · H 3 , 3 2 is given by:
The probability vectors IP 1 and IP 2 to be at state x Ω Y α in three trials without hitting, starting from Λ ( 1 ) and Λ ( 2 ) respectively, are highlighted by the red boxes; see the first and last rows in Table 3. Here, we obtain
IP 1 = ( p 1 p 2 p 5 , p 1 p 2 p 6 , p 2 p 5 p 4 , p 2 p 6 p 7 ) , IP 2 = ( p 7 p 4 p 5 , p 7 p 4 p 6 , p 8 p 7 p 4 , p 8 p 8 p 7 ) .
(Note that IP is also obtained by taking the columns corresponding to Ω Y ∖ α in the product π · H ˜ 2 , 3 ). Summarizing all, we have:
G 1 ( z ) = z 2 · π 00 + z 3 · π 10 · p 3 + z 3 · ( π 01 p 5 , π 01 p 6 , π 10 p 4 , π 11 p 7 ) · ( I − z T ) − 1 · z · ( p 3 , 0 , 0 , p 3 ) T , G 2 ( z ) = z 3 · π 11 · p 8 + z 3 · ( π 01 p 5 , π 01 p 6 , π 10 p 4 , π 11 p 7 ) · ( I − z T ) − 1 · z · ( 0 , p 8 , 0 , 0 ) T , G 1 , 1 ( z ) = z 2 · p 1 2 + z 3 · p 2 · p 5 · p 3 + z 3 · ( p 1 p 2 p 5 , p 1 p 2 p 6 , p 2 p 5 p 4 , p 2 p 6 p 7 ) · ( I − z T ) − 1 · z · ( p 3 , 0 , 0 , p 3 ) T , G 1 , 2 ( z ) = z 3 · p 2 · p 6 · p 8 + z 3 · ( p 1 p 2 p 5 , p 1 p 2 p 6 , p 2 p 5 p 4 , p 2 p 6 p 7 ) · ( I − z T ) − 1 · z · ( 0 , p 8 , 0 , 0 ) T , G 2 , 1 ( z ) = z 2 · p 7 · p 3 + z 3 · p 8 · p 7 · p 3 + z 3 · ( p 7 p 4 p 5 , p 7 p 4 p 6 , p 8 p 7 p 4 , p 8 p 8 p 7 ) · ( I − z T ) − 1 · z · ( p 3 , 0 , 0 , p 3 ) T , G 2 , 2 ( z ) = z 3 · p 8 3 + z 3 · ( p 7 p 4 p 5 , p 7 p 4 p 6 , p 8 p 7 p 4 , p 8 p 8 p 7 ) · ( I − z T ) − 1 · z · ( 0 , p 8 , 0 , 0 ) T .
Substituting (28) in (15) completes the derivation of G(z).
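As a numeric cross-check of Remark 2, the vectors IP 1 and IP 2 of Example 5 can also be obtained by enumerating all b = 3 pattern-free continuations from the end of Λ ( 1 ) = 00 and Λ ( 2 ) = 111 , respectively. The transition probabilities p 1 , … , p 8 below are illustrative values, not estimates from the paper:

```python
from itertools import product

# Illustrative second-order transition probabilities (assumed values).
p = dict(p1=0.3, p2=0.7, p3=0.4, p4=0.6, p5=0.5, p6=0.5, p7=0.2, p8=0.8)
trans = {'00': (p['p1'], p['p2']), '01': (p['p5'], p['p6']),
         '10': (p['p3'], p['p4']), '11': (p['p7'], p['p8'])}
patterns = ('00', '111')

def ip_vector(start, b=3):
    """Probability of each transient b-state after b pattern-free trials,
    starting from the r-trial state `start` (cf. Remark 2).  With
    non-overlapping counting, only the new trials can form a pattern."""
    out = {}
    for seq in product('01', repeat=b):
        new = ''.join(seq)
        if any(pat in new for pat in patterns):
            continue                      # a pattern completes: absorbed
        prob, state = 1.0, start
        for x in new:
            prob *= trans[state][int(x)]
            state = state[1] + x
        out[new] = prob
    return out

ip1 = ip_vector('00')   # after Λ(1) = 00 (last r = 2 trials are 00)
ip2 = ip_vector('11')   # after Λ(2) = 111 (last r = 2 trials are 11)
# The transient states are exactly {010, 011, 101, 110}:
assert set(ip1) == {'010', '011', '101', '110'}
# First entry of IP_1 and last entry of IP_2 from Example 5:
assert abs(ip1['010'] - p['p1'] * p['p2'] * p['p5']) < 1e-12
assert abs(ip2['110'] - p['p8'] * p['p8'] * p['p7']) < 1e-12
```

This reproduces the red-boxed rows of Table 3 entry by entry, without forming the matrices H 2 , 3 and H 3 , 3 explicitly.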

4. Algorithm and Examples

To summarize, we next present an algorithm with the main key steps; a detailed pseudocode is provided in Appendix A. The algorithm is then demonstrated by two additional examples.

4.1. The Algorithm

Step 1. Inputs and initialization
  • The parameter  r , the state space  Ω r = { S r } , the  | S r | -square transition matrix  A .
  • The competing patterns and their appearances:  ( Λ ( i ) , n i ) i = 1 , , c .
  • Calculate the steady-state probability vector  π = ( π x : x S r )  satisfying  π A = π , π e = 1 .
Step 2. Embedded Markov chain
  • Calculate the stopping vectors  { V i } i = 1 , … , c ;  use (8).
  • Set  b = max i { l i } , i = 1 , … , c .
  • Build the state space  Ω Y = { S b } ,  define the absorbing states  { α i } , i = 1 , … , c .
  • Construct the matrix  T T 0 0 I ,  use (21) and (22).
Step 3. Probability generating function
  • Derive  G i ( z ) 1 z ≤ b  and  G i , j ( z ) 1 z ≤ b ;  use (18).
  • Calculate the vector  IP ,  and the matrices  H r , b  and  H b , b .  Derive  { IP i } i = 1 , … , c (the matrix  H r , b · H b , b b − 1  can be helpful with the appropriate eliminations).
  • Apply (24) to obtain  G i ( z ) 1 z > b  and  G i , j ( z ) 1 z > b .
  • Apply (16) and (14) to derive  G ( z ) .

4.2. Example 2

Our next example is inspired by ReasonLabs. ReasonLabs Ltd. is a global pioneer in cybersecurity detection and prevention powered by machine learning (https://reasonlabs.com) (accessed on 10 January 2025). The Israeli team (called the Performance team) is responsible for marketing projects via a website. Their marketing occurs in several stages, in which a customer is expected to enter the website and choose the service that suits their needs. Basic services are free, while premium services are purchased. Usually, a customer (buyer) visits the website a few times for various projects. The Performance team’s purpose is to track customer acquisition and identify buyer patterns, especially of those who are willing to purchase the premium service. To achieve this, they use Bernoulli trials, where each outcome represents a customer choice. The result ‘1’ represents purchasing a premium service, and ‘0’ represents choosing a free service. Two stopping rules are proposed to classify customer behavior:
(1) Two consecutive purchases (1s) occurring twice (not necessarily consecutively), where we also allow at most one free service (‘0’) between the two purchases. Such a customer is considered the most serious buyer to track.
(2) Two consecutive free entries (0s) occurring twice (not necessarily consecutively). Such a customer is considered a less serious customer who can be ignored.
The above rules (1) and (2) can be modeled as a waiting time problem of competing patterns. We next demonstrate the algorithm on Example 2.
Step 1. Consider second-order Markov-dependent Bernoulli trials, with transition probabilities given in (4)–(6). Here, we have two competing patterns, Λ ( 1 ) = { 101 ,   11 } with n 1 = 2 (associated with Rule 1), and Λ ( 2 ) = { 00 } , with n 2 = 2 (associated with Rule 2). Thus, the experiment terminates when one of the following rules occurs:
(i)
Two occurrences of the set of patterns { 101 , 11 } , i.e., either 101 occurs twice, or 11 occurs twice, or 101 and then 11 , or vice versa, all occurrences are not necessarily consecutive.
(ii)
Two (not necessarily consecutive) occurrences of two consecutive 0 s .
Examples of such sequences are 010100011 and 11100101 (Rule (i)), and 0100101100 and 01011000100 (Rule (ii)).
Step 2. Accordingly, we have two sets V 1 and V 2 corresponding to Λ ( 1 ) and Λ ( 2 ) , respectively. Figure 3 demonstrates the paths of possible patterns that terminate the experiment. We observe that V 1 and V 2 include three possible paths each ( V 11 , V 12 , V 13 ), and ( V 21 , V 22 , V 23 ). Thus, we have six possible paths in total (stopping vectors) that terminate the experiment.
Corollary 1(i) yields that each such stopping vector includes at least two patterns and no more than three patterns ( ( n 1 − 1 ) + ( n 2 − 1 ) + 1 = 3 ). By Corollary 1(ii), C 1 and C 2 are given by:
C i = ∑ k 1 = 0 1 ( k 1 + n i − 1 ) ! / ( k 1 ! ( n i − 1 ) ! ) = ( 2 − 1 ) ! / ( 0 ! ( 2 − 1 ) ! ) + ( 1 + 2 − 1 ) ! / ( 1 ! ( 2 − 1 ) ! ) = 3 , for i = 1 , 2 ,
and | V | = C 1 + C 2 = 6 vectors. Specifically, the stopping vectors are (see Figure 3):
V 1 = { V 1 , 1 = [ Λ ( 1 ) , Λ ( 1 ) ] , V 1 , 2 = [ Λ ( 1 ) , Λ ( 2 ) , Λ ( 1 ) ] , V 1 , 3 = [ Λ ( 2 ) , Λ ( 1 ) , Λ ( 1 ) ] } , V 2 = { V 2 , 1 = [ Λ ( 1 ) , Λ ( 2 ) , Λ ( 2 ) ] , V 2 , 2 = [ Λ ( 2 ) , Λ ( 1 ) , Λ ( 2 ) ] , V 2 , 3 = [ Λ ( 2 ) , Λ ( 2 ) ] } , V = V 1 ∪ V 2 .
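The counts C 1 = C 2 = 3 above, and the counts C 1 = 4 , C 2 = 6 of the Section 3 example, follow from the interleaving argument of Corollary 1(ii); for two competing patterns it reduces to a short sum of binomial coefficients:

```python
from math import comb

# Two-pattern case of Corollary 1(ii): a stopping path ends with the n_i-th
# occurrence of pattern i while pattern j has occurred k = 0..n_j-1 times;
# the k occurrences of j interleave with the first n_i - 1 occurrences of i
# in comb(k + n_i - 1, k) ways.
def num_stopping_paths(n_i, n_j):
    return sum(comb(k + n_i - 1, k) for k in range(n_j))

# This example: n_1 = n_2 = 2 gives three paths per pattern.
assert num_stopping_paths(2, 2) == 3
# The Section 3 example (n_1 = 3, n_2 = 2): |V_1| = 4 and |V_2| = 6.
assert num_stopping_paths(3, 2) == 4
assert num_stopping_paths(2, 3) == 6
```

For c > 2 competing patterns, the single sum becomes the multiple sum over all k j , j ≠ i , with the multinomial coefficient of the general formula.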
Since we have two competing patterns, Λ ( 1 ) and Λ ( 2 ) , there are four pgfs, G 1 , 1 ( z ) ,   G 1 , 2 ( z ) ,   G 2 , 1 ( z ) ,   G 2 , 2 ( z ) , in addition to G 1 ( z ) ,   G 2 ( z ) (the pgfs of the first occurrence of Λ ( 1 ) and Λ ( 2 ) , respectively). Furthermore, the pattern Λ ( 1 ) includes two simple patterns { 101 , 11 } . Thus, to distinguish between them, we mark { 11 } by the sign “′”, and refer to { 101 } with no sign. Hence, we need to derive G 1 , 1 ( z ) ,   G 1 , 1 ′ ( z ) ,   G 1 ′ , 1 ( z ) ,   G 1 ′ , 1 ′ ( z ) ,   G 1 , 2 ( z ) ,   G 1 ′ , 2 ( z ) ,   G 2 , 1 ( z ) ,   G 2 , 1 ′ ( z ) ,   G 2 , 2 ( z ) ,   G 1 ( z ) ,   G 1 ′ ( z ) , and G 2 ( z ) . The final pgf will be (for brevity, the argument z is omitted):
G ( z ) = G 1 ( G 1 , 1 + G 1 , 1 ′ ) + G 1 ′ ( G 1 ′ , 1 + G 1 ′ , 1 ′ ) ⏟ V 11 + ( G 1 G 1 , 2 + G 1 ′ G 1 ′ , 2 ) ( G 2 , 1 + G 2 , 1 ′ ) ⏟ V 12 + G 2 [ G 2 , 1 ( G 1 , 1 + G 1 , 1 ′ ) + G 2 , 1 ′ ( G 1 ′ , 1 + G 1 ′ , 1 ′ ) ] ⏟ V 13 + ( G 1 G 1 , 2 + G 1 ′ G 1 ′ , 2 ) G 2 , 2 ⏟ V 21 + G 2 ( G 2 , 1 G 1 , 2 + G 2 , 1 ′ G 1 ′ , 2 ) ⏟ V 22 + G 2 G 2 , 2 ⏟ V 23
Here, the longest pattern is { 101 } , so b = 3 , and Ω Y = { 000 , 001 , 010 , 011 , 100 , 101 , 110 , 111 }. We obtain α 1 = { 101 } , α 1 ′ = { 110 , 011 , 111 } , and α 2 = { 000 , 001 , 100 } . Thus, Ω Y ∖ { α 1 , α 1 ′ , α 2 } = { 010 } , and the ( 1 × 1 ) vectors (in fact, scalars) are IP = π 01 p 5 , IP 1 = p 5 p 4 p 5 , IP 1 ′ = p 7 p 4 p 5 , and IP 2 = p 1 p 2 p 5 (see the red rectangles in Table 4). The absorbing vectors (scalars) are T 1 0 = p 4 , T 1 ′ 0 = 0 , and T 2 0 = p 3 . Also, note that the ( 1 × 1 ) transition matrix is T = 0 , and thus ( I − z T ) − 1 = I . This result is due to the fact that, when we have the sequence 010 , either trial 0 or 1 immediately yields a hit. The immediate conclusion is that the waiting time for a single hit is no more than four trials.
Step 3. We next derive G i ( z ) 1 z ≤ 3 , i = 1 , 1 ′ , 2 , and the pgfs G i , j ( z ) 1 z ≤ 3 , i , j = 1 , 1 ′ , 2 .
G 1 ( z ) 1 z ≤ 3 = z 3 · π 10 · p 4 , G 1 ′ ( z ) 1 z ≤ 3 = z 2 · π 11 + z 3 · π 01 · p 6 , G 2 ( z ) 1 z ≤ 3 = z 2 · π 00 + z 3 · π 10 · p 3 .
The function G i , j ( z ) 1 z ≤ 3 is constructed from paths of at most three transition probabilities:
G 1 , 1 ( z ) 1 z ≤ 3 = z 3 · p 8 3 , G 1 , 1 ′ ( z ) 1 z ≤ 3 = z 2 · p 6 · p 8 + z 3 · p 5 · p 4 · p 6 , G 1 ′ , 1 ( z ) 1 z ≤ 3 = z 3 · p 8 · p 7 · p 4 , G 1 ′ , 1 ′ ( z ) 1 z ≤ 3 = z 2 · p 8 2 + z 3 · p 7 · p 4 · p 6 , G 1 , 2 ( z ) 1 z ≤ 3 = z 2 · p 5 · p 3 + z 3 · p 6 · p 7 · p 3 , G 1 ′ , 2 ( z ) 1 z ≤ 3 = z 2 · p 7 · p 3 + z 3 · p 8 · p 7 · p 3 , G 2 , 1 ( z ) 1 z ≤ 3 = z 3 · p 2 · p 5 · p 4 , G 2 , 1 ′ ( z ) 1 z ≤ 3 = z 2 · p 2 · p 6 + z 3 · p 1 · p 2 · p 6 , G 2 , 2 ( z ) 1 z ≤ 3 = z 2 · p 1 2 + z 3 · p 2 · p 5 · p 3 .
Summarizing everything, we have:
G 1 ( z ) = z 3 · π 10 · p 4 + z 3 · π 01 · p 5 · z · p 4 , G 1 ′ ( z ) = z 2 · π 11 + z 3 · π 01 · p 6 , G 2 ( z ) = z 2 · π 00 + z 3 · π 10 · p 3 + z 3 · π 01 · p 5 · z · p 3 , G 1 , 1 ( z ) = z 3 · p 8 3 + z 3 · p 5 · p 4 · p 5 · z · p 4 , G 1 , 1 ′ ( z ) = z 2 · p 6 · p 8 + z 3 · p 5 · p 4 · p 6 , G 1 ′ , 1 ( z ) = z 3 · p 8 · p 7 · p 4 + z 3 · p 7 · p 4 · p 5 · z · p 4 , G 1 ′ , 1 ′ ( z ) = z 2 · p 8 2 + z 3 · p 7 · p 4 · p 6 ,
G 1 , 2 ( z ) = z 2 · p 5 · p 3 + z 3 · p 6 · p 7 · p 3 + z 3 · p 5 · p 4 · p 5 · z · p 3 , G 1 ′ , 2 ( z ) = z 2 · p 7 · p 3 + z 3 · p 8 · p 7 · p 3 + z 3 · p 7 · p 4 · p 5 · z · p 3 , G 2 , 1 ( z ) = z 3 · p 2 · p 5 · p 4 + z 3 · p 1 · p 2 · p 5 · z · p 4 , G 2 , 1 ′ ( z ) = z 2 · p 2 · p 6 + z 3 · p 1 · p 2 · p 6 , G 2 , 2 ( z ) = z 2 · p 1 2 + z 3 · p 2 · p 5 · p 3 + z 3 · p 1 · p 2 · p 5 · z · p 3 .
Substituting (34) and (35) in (31) completes the derivation of G ( z ) .
To add real data, the ReasonLabs Performance team further conducted an estimation of daily user-choice probabilities for free vs. premium services, focusing on the two most recent days of user activity,
p 1 = 0.3 , p 2 = 0.7 , p 3 = 0.4 , p 4 = 0.6 , p 5 = 0.5 , p 6 = 0.5 , p 7 = 0.2 , p 8 = 0.8 .
The above probabilities show that those who switch from free to premium have a 50% chance of reverting and a 50% chance of continuing with premium, while those who switch from premium to free have a 60% likelihood of switching back to premium. We also see that 80% of users who chose premium twice will stay premium, and 70% of free users will try premium. These probabilities suggest that users tend to move toward premium.
Based on these estimates, the probability vector π and the pgf of the waiting times are given by:
π = ( 0.1126 , 0.1972 , 0.1972 , 0.4930 ) , G ( z ) = 0.319 · z 4 + 0.203 · z 5 + 0.164 · z 6 + 0.125 · z 7 + 0.088 · z 8 + 0.059 · z 9 + 0.028 · z 10 + 0.009 · z 11 + 0.0016 · z 12 ,
with an average number of trials of 5.805 and a variance of 3.197. The information that it takes approximately 6 visits (with a minimum of 4 and a maximum of 12 visits) to classify a customer may help optimize resource allocation and prevent premature interventions. In addition, this information may contribute to mapping decision-making processes about customers and to identifying behavioral changes in users.
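These figures can be reproduced numerically. The sketch below builds the four-state chain (states 00, 01, 10, 11) from the estimated p 1 , … , p 8 , recovers π by power iteration, and recomputes the mean and variance from the printed pgf coefficients; since the coefficients are rounded, the checks use loose tolerances:

```python
# Numerical check of the reported steady-state vector and waiting-time moments.
p1, p2, p3, p4, p5, p6, p7, p8 = 0.3, 0.7, 0.4, 0.6, 0.5, 0.5, 0.2, 0.8
A = [[p1, p2, 0, 0],     # from 00: to 00, 01
     [0, 0, p5, p6],     # from 01: to 10, 11
     [p3, p4, 0, 0],     # from 10: to 00, 01
     [0, 0, p7, p8]]     # from 11: to 10, 11

pi = [0.25] * 4
for _ in range(10_000):  # power iteration: pi <- pi * A (chain is ergodic)
    pi = [sum(pi[i] * A[i][j] for i in range(4)) for j in range(4)]
s = sum(pi)
pi = [x / s for x in pi]
assert all(abs(x - y) < 1e-3
           for x, y in zip(pi, (0.1126, 0.1972, 0.1972, 0.4930)))

# pgf coefficients of z^4..z^12 as printed; renormalize to absorb rounding.
coef = {4: 0.319, 5: 0.203, 6: 0.164, 7: 0.125, 8: 0.088,
        9: 0.059, 10: 0.028, 11: 0.009, 12: 0.0016}
tot = sum(coef.values())                       # ~1 up to rounding
mean = sum(n * c for n, c in coef.items()) / tot
var = sum(n * n * c for n, c in coef.items()) / tot - mean ** 2
assert abs(mean - 5.805) < 0.05 and abs(var - 3.197) < 0.05
```

The small residual gaps (e.g., a recomputed variance of about 3.19) stem from the three-to-four digit rounding of the printed coefficients.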

4.3. Example 3

Our third example is inspired by Gamida Ltd. (https://gamida.co.il) (accessed on 1 February 2025). Gamida Ltd. provides targeted and comprehensive first-class services to the medical, science, technology, and industrial community in Israel and is part of the international group of companies Gamida For Life B.V. Gamida is the exclusive representative in Israel of companies such as Cardinal Health, Lohmann & Rauscher, Abbott, B.Braun, Flen Health, BD, Getinge, Philips, and Integra LifeSciences. The group specializes in the import, development, production, marketing, and distribution of products for research, diagnostics, medicine, and the advanced industries. Gamida specializes in developing wearable bracelets designed to monitor potential heart rate irregularities, especially in the elderly. These bracelets continuously collect heart rate data and process the accumulated information at the end of each day to produce a daily binary result indicating whether the heart rate was normal or not. A result ‘0’ represents a normal heart rate, and ‘1’ represents an abnormal rate; these results are then transmitted to the medical team for further analysis. The empirical investigation shows that, as expected, there is dependence between the results. Gamida Medical Ltd. identifies two situations whose occurrence requires further investigation:
(1)
Two occurrences (not necessarily consecutive) of an abnormal rate after a normal or abnormal rate. An abnormal rate that appears twice (even after a normal one) may indicate a cardiac problem and is worth checking. According to the company’s experience, the need to record two consecutive outcomes helps determine whether the result is part of an ongoing trend or a single event.
(2)
Two (not necessarily consecutive) occurrences of an abnormal rate followed by two normal results (i.e., the pattern 100). This situation may be caused by a malfunction of the device, or by other factors that caused a positive change in the heart rate.
Both conditions suggest an irregular heartbeat and require medical examination and attention. Next, we will demonstrate the algorithm on Example 3.
Step 1. Consider second-order Markov-dependent Bernoulli trials, with arguments as in (4)–(6), with two competing patterns, Λ ( 1 ) = { 01 ,   11 } with n 1 = 2 , and Λ ( 2 ) = { 100 } , with n 2 = 2 . Thus, the experiment terminates when one of the following rules occurs:
(i)
Two occurrences of 11, or two occurrences of 01 , or first 01 and then 11 , or vice versa—all occurrences are not necessarily consecutive.
(ii)
Two (not necessarily consecutive) occurrences of the pattern { 100 } .
Examples of sequences are 101101 and 001100001 (Rule (i)), and 10011100 and 101100100 (Rule (ii)).
Step 2. Accordingly, we have two sets V 1 and V 2 corresponding to Λ ( 1 ) and Λ ( 2 ) , respectively, which are the same paths as those of Example 2, and thus, are given by (30). Here, we also have four pgfs ( G 1 , 1 ( z ) ,   G 1 , 2 ( z ) ,   G 2 , 1 ( z ) ,   G 2 , 2 ( z ) ) in addition to G 1 ( z ) ,   G 2 ( z ) . We further distinguish between the patterns { 01 , 11 } of Λ ( 1 ) by adding the sign “′” to the terms referring to { 11 } , and using no sign for { 01 } . Hence, we need to derive 12 pgfs with the final pgf given by (31).
Here, the longest pattern is { 100 } , so b = 3 ,   Ω Y = { 000 , 001 , 010 , 011 , 100 , 101 , 110 , 111 } . The absorbing states are α 1 = { 001 , 010 , 011 , 101 } , α 1 ′ = { 110 , 111 } , and α 2 = { 100 } . Therefore, we have Ω Y ∖ { α 1 , α 1 ′ , α 2 } = { 000 } , and the ( 1 × 1 ) vectors (in fact, scalars) are IP = π 00 p 1 , IP 1 = p 5 p 3 p 1 , IP 1 ′ = p 7 p 3 p 1 , and IP 2 = p 1 3 (highlighted by the red rectangles in Table 5 and Table 6). The absorbing vectors (scalars) are T 1 0 = p 2 , T 1 ′ 0 = 0 , and T 2 0 = 0 . Also note that the ( 1 × 1 ) matrix is T = p 1 ; thus, ( I − z T ) − 1 = ( 1 − z · p 1 ) − 1 .
Step 3. We next derive G i ( z ) 1 z ≤ 3 , i = 1 , 1 ′ , 2 , and the pgfs G i , j ( z ) 1 z ≤ 3 , i , j = 1 , 1 ′ , 2 .
G 1 ( z ) 1 z ≤ 3 = z 2 · π 01 + z 3 · ( π 00 · p 2 + π 10 · p 4 ) , G 1 ′ ( z ) 1 z ≤ 3 = z 2 · π 11 + z 3 · π 01 · p 6 , G 2 ( z ) 1 z ≤ 3 = z 3 · π 10 · p 3 .
The function G i , j ( z ) 1 z ≤ 3 is constructed from paths of at most three transition probabilities:
G 1 , 1 ( z ) 1 z ≤ 3 = z 2 · p 5 · p 4 + z 3 · ( p 6 · p 7 · p 4 + p 5 · p 3 · p 2 ) , G 1 , 1 ′ ( z ) 1 z ≤ 3 = z 2 · p 6 · p 8 , G 1 ′ , 1 ( z ) 1 z ≤ 3 = z 2 · p 7 · p 4 + z 3 · ( p 8 · p 7 · p 4 + p 7 · p 3 · p 2 ) , G 1 ′ , 1 ′ ( z ) 1 z ≤ 3 = z 2 · p 8 2 , G 1 , 2 ( z ) 1 z ≤ 3 = z 3 · p 6 · p 7 · p 3 , G 1 ′ , 2 ( z ) 1 z ≤ 3 = z 3 · p 8 · p 7 · p 3 , G 2 , 1 ( z ) 1 z ≤ 3 = z 2 · p 1 · p 2 + z 3 · ( p 1 · p 1 · p 2 + p 2 · p 5 · p 4 ) , G 2 , 1 ′ ( z ) 1 z ≤ 3 = z 2 · p 2 · p 6 , G 2 , 2 ( z ) 1 z ≤ 3 = z 3 · p 2 · p 5 · p 3 .
Summarizing all, we have:
G 1 ( z ) = z 2 · π 01 + z 3 · ( π 00 · p 2 + π 10 · p 4 ) + z 3 · π 00 · p 1 · ( 1 − z · p 1 ) − 1 · z · p 2 , G 1 ′ ( z ) = z 2 · π 11 + z 3 · π 01 · p 6 + z 3 · π 00 · p 1 · ( 1 − z · p 1 ) − 1 · z · 0 = z 2 · π 11 + z 3 · π 01 · p 6 , G 2 ( z ) = z 3 · π 10 · p 3 + z 3 · π 00 · p 1 · ( 1 − z · p 1 ) − 1 · z · 0 = z 3 · π 10 · p 3 , G 1 , 1 ( z ) = z 2 · p 5 · p 4 + z 3 · ( p 6 · p 7 · p 4 + p 5 · p 3 · p 2 ) + z 3 · p 5 · p 3 · p 1 · ( 1 − z · p 1 ) − 1 · z · p 2 , G 1 , 1 ′ ( z ) = z 2 · p 6 · p 8 + z 3 · p 5 · p 3 · p 1 · ( 1 − z · p 1 ) − 1 · z · 0 = z 2 · p 6 · p 8 , G 1 ′ , 1 ( z ) = z 2 · p 7 · p 4 + z 3 · ( p 8 · p 7 · p 4 + p 7 · p 3 · p 2 ) + z 3 · p 7 · p 3 · p 1 · ( 1 − z · p 1 ) − 1 · z · p 2 , G 1 ′ , 1 ′ ( z ) = z 2 · p 8 2 + z 3 · p 7 · p 3 · p 1 · ( 1 − z · p 1 ) − 1 · z · 0 = z 2 · p 8 2 ,
G 1 , 2 ( z ) = z 3 · p 6 · p 7 · p 3 + z 3 · p 5 · p 3 · p 1 · ( 1 − z · p 1 ) − 1 · z · 0 = z 3 · p 6 · p 7 · p 3 , G 1 ′ , 2 ( z ) = z 3 · p 8 · p 7 · p 3 + z 3 · p 7 · p 3 · p 1 · ( 1 − z · p 1 ) − 1 · z · 0 = z 3 · p 8 · p 7 · p 3 , G 2 , 1 ( z ) = z 2 · p 1 · p 2 + z 3 · ( p 1 · p 1 · p 2 + p 2 · p 5 · p 4 ) + z 3 · p 1 3 · ( 1 − z · p 1 ) − 1 · z · p 2 , G 2 , 1 ′ ( z ) = z 2 · p 2 · p 6 + z 3 · p 1 3 · ( 1 − z · p 1 ) − 1 · z · 0 = z 2 · p 2 · p 6 , G 2 , 2 ( z ) = z 3 · p 2 · p 5 · p 3 + z 3 · p 1 3 · ( 1 − z · p 1 ) − 1 · z · 0 = z 3 · p 2 · p 5 · p 3 .
Substituting (38) and (39) in (31) completes the derivation of G ( z ) .
Equations (38) and (39) show some interesting results. For example, we observe that at most three trials are needed to hit 11 or 100 (see G 1 ′ ( z ) , G 2 ( z ) ), exactly two trials are needed from 11 / 01 / 100 to 11 (see G 1 ′ , 1 ′ ( z ) , G 1 , 1 ′ ( z ) , and G 2 , 1 ′ ( z ) ), and exactly three trials are needed from 11 / 01 / 100 to 100 (see G 1 , 2 ( z ) , G 1 ′ , 2 ( z ) , and G 2 , 2 ( z ) ). To explain these results, assume that the pattern 11 occurs. From that point, any trial 0 followed by 1 immediately yields the hit 01; thus, the only way for 11 to occur again (before another pattern is hit) is via the double trial 11 . The other cases can be explained similarly.
According to Gamida Ltd., the transition probabilities are estimated by:
p 1 = 0.65 , p 2 = 0.35 , p 3 = 0.6 , p 4 = 0.4 , p 5 = 0.7 , p 6 = 0.3 , p 7 = 0.2 , p 8 = 0.8 ,
with a steady-state probability vector
π = ( 0.3288 , 0.1918 , 0.1918 , 0.2877 ) .
Calculating the pgf of the waiting time yielded the following:
G ( z ) = 0.293 · z 4 + 0.189 · z 5 + 0.0529 · z 6 + 0.032 · z 7 + 0.037 · z 8 + 0.0104 · z 9 + ( 0.0614 · z 6 + 0.0312 · z 7 + 0.0122 · z 9 + 0.0068 · z 10 ) / ( 1 − 0.65 · z ) + ( 0.00671 · z 8 + 0.00123 · z 11 ) / ( 1 − 0.65 · z ) 2 ,
with an average number of trials of 6.63 and a variance of 8.99; that is, the average detection time for identifying critical heart rate patterns is about 6.6 days. These findings indicate that approximately one week of continuous monitoring is generally required to reliably detect critical cardiac rate patterns. The results can contribute to determining a reasonable detection window while minimizing false alarms. In addition, they can be useful in developing patient care protocols and setting alert thresholds, leading to a more efficient approach to cardiac monitoring.
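This pgf can be checked numerically: it should satisfy G ( 1 ) ≈ 1, and its derivative at z = 1 should reproduce the reported mean, up to the rounding of the printed coefficients. A minimal sketch:

```python
# Numerical check of the Example 3 pgf printed above (coefficients rounded).
def G(z):
    poly = (0.293 * z**4 + 0.189 * z**5 + 0.0529 * z**6 + 0.032 * z**7
            + 0.037 * z**8 + 0.0104 * z**9)
    q = 0.0614 * z**6 + 0.0312 * z**7 + 0.0122 * z**9 + 0.0068 * z**10
    r = 0.00671 * z**8 + 0.00123 * z**11
    return poly + q / (1 - 0.65 * z) + r / (1 - 0.65 * z) ** 2

assert abs(G(1.0) - 1.0) < 5e-3          # probabilities sum to ~1

h = 1e-6
mean = (G(1.0 + h) - G(1.0 - h)) / (2 * h)   # central difference for G'(1)
assert abs(mean - 6.63) < 0.05               # matches the reported mean
```

The recomputed mean is about 6.61; the gap to the quoted 6.63 reflects only the truncation of the coefficients to a few digits.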

5. Summary and Future Research

This paper studies a Markov-dependent model with Bernoulli trials and competing patterns. Competing patterns are compound patterns that compete to be the first to occur a specified number of times. Using a finite Markov chain and tools from probability theory, we develop an algorithm to derive a closed-form expression for the pgf of the waiting time distribution. It must be noted that, despite being simple to understand, the algorithm requires preparatory work for calculating path probabilities, whose extent grows with the parameter b. However, we believe that integrating both traditional computing techniques and advanced AI and machine learning approaches may be useful in developing efficient solutions even for the most complex cases. In this vein, the methodology presented in this paper serves as a foundational framework for the computational solutions discussed.
Theoretical extensions of the model can be pursued in several directions. Our model assumes non-overlapping counting. It would be an interesting extension to generalize the algorithm to overlapping counting, where a partially completed pattern can be finished at any time, regardless of whether another pattern is completed after the partial pattern starts but before it is completed. Another direction is to investigate other stopping rules, such as the sooner or later models. Here, a sooner model captures the number of trials required for the first occurrence of one of two competing patterns, and conversely, a later model refers to the number of trials required for both patterns to occur.

Author Contributions

Conceptualization, I.M.; Formal analysis, Y.B.; Investigation, Y.B.; Writing—original draft, I.M.; Writing—review and editing, Y.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data are unavailable due to privacy restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

In this appendix, we extend the algorithm of Section 4.1 and provide pseudocode for computing the probability generating function.
Step 1. Inputs and initialization
1.1.
Define S = {0, 1}; let r be the order of the Markov chain.
1.2.
Generate the set of states Ω_r = S^r, where S^r = {x = (x_1, …, x_r) : x_i ∈ S} (|Ω_r| = 2^r).
1.3.
Define the transition probabilities p_{x,y}, x, y ∈ S^r, and build the (2^r × 2^r) matrix A = [p_{x,y}].
1.4.
Define the (1 × 2^r) probability vector π = (π_x : x ∈ S^r). Obtain π by solving πA = π, πe = 1.
1.5.
For each competing pattern i, i = 1, …, c:
  • Define Λ^(i) = {Λ_1^(i), …, Λ_{k_i}^(i)}.
  • Let l_{i,j} = |Λ_j^(i)| and l_i = max_{j=1,…,k_i} l_{i,j}.
  • Define n_i, the number of occurrences of Λ^(i) needed to stop the experiment.
1.6.
Let b = max_{i=1,…,c} l_i.
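Step 1 can be sketched in a few lines of code. The following is an illustrative setup (our own, not taken from the paper) for a second-order (r = 2) binary chain; the conditional probabilities p1[x] are arbitrary placeholder values:

```python
from itertools import product

# Illustrative sketch of Step 1 for r = 2; p1[x] is the (assumed) chance
# that the next trial is 1 given the current window x.
S = (0, 1)
r = 2
states = list(product(S, repeat=r))            # Omega_r, |Omega_r| = 2**r

def next_state(x, s):
    """Sliding window: append the new trial, drop the oldest."""
    return x[1:] + (s,)

p1 = {(0, 0): 0.3, (0, 1): 0.6, (1, 0): 0.4, (1, 1): 0.7}

# A[x][y] = p_{x,y}: one-step transition probabilities between windows.
A = {x: {y: 0.0 for y in states} for x in states}
for x in states:
    A[x][next_state(x, 1)] += p1[x]
    A[x][next_state(x, 0)] += 1.0 - p1[x]

# Stationary vector pi solving pi A = pi, pi e = 1, via power iteration.
pi = {x: 1.0 / len(states) for x in states}
for _ in range(500):
    pi = {y: sum(pi[x] * A[x][y] for x in states) for y in states}
```

Power iteration suffices here because the chain on 2^r windows is irreducible and aperiodic for any p1[x] strictly between 0 and 1; for larger state spaces one would solve πA = π directly.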
Step 2. Embedded Markov chain
2.1
For i = 1, …, c:
  • Define V_i = {V_{i,j}} to be the set of all paths that terminate the experiment via Λ^(i).
  • Calculate C_i = |V_i| using
    C_i = Σ_{k_1=0}^{n_1−1} ⋯ Σ_{k_c=0}^{n_c−1} (k_1 + ⋯ + k_c + n_i − 1)! / (k_1! ⋯ k_c! (n_i − 1)!),
    where the sums run over k_j for j = 1, …, c, j ≠ i.
  • For each path V_{i,j}, generate its series of patterns V_{i,j} = [V_{i,j}(1), V_{i,j}(2), …].
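The count C_i of Step 2.1 can be evaluated directly from the multinomial formula. A minimal sketch (the function name path_count and the zero-based pattern index are our own conventions):

```python
from itertools import product
from math import factorial

def path_count(n, i):
    """C_i of Step 2.1: the number of terminating pattern sequences won by
    pattern i (0-based), where n[j] is the required count for pattern j.
    The sums run over k_j in {0, ..., n[j]-1} for every j != i."""
    c = len(n)
    others = [j for j in range(c) if j != i]
    total = 0
    for ks in product(*(range(n[j]) for j in others)):
        # Multinomial term (k_1+...+k_c+n_i-1)! / (k_1!...k_c!(n_i-1)!).
        term = factorial(sum(ks) + n[i] - 1)
        for k in ks:
            term //= factorial(k)
        term //= factorial(n[i] - 1)
        total += term
    return total

# Two patterns, each required to occur twice: three winning paths apiece
# (e.g., for pattern A: AA, ABA, BAA).
print(path_count((2, 2), 0), path_count((2, 2), 1))   # 3 3
```

The integer divisions are exact because each intermediate quotient is itself a multinomial coefficient times a factorial.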
2.2
Build Ω_Y = S^b, where S^b = {x = (x_1, …, x_b) : x_i ∈ S} (|Ω_Y| = 2^b).
2.3
Obtain distinct sets of absorbing states {α_i}, i = 1, …, c, with regard to Λ^(i).
2.4
Obtain the set of transient states, Ω_Y ∖ ∪_{i=1}^{c} α_i.
2.5
Derive T, the transition probability matrix among the states in Ω_Y ∖ ∪_{i=1}^{c} α_i.
2.6
Derive T_i^0, the matrix of absorption probabilities into the states in α_i.
2.7
Construct the Markov probability matrix as follows (Figure A1):
Figure A1. The transition probability matrix.
( I is the identity matrix, and 0 is the zero matrix, all with the appropriate dimensions).
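Steps 2.2–2.6 can be illustrated on a small hypothetical case (the patterns and probabilities below are our own choice, not the paper's example): two competing patterns "11" and "00" with window length b = 2 and i.i.d. fair trials. Scanning the sliding windows yields the transient matrix T and the absorption matrices T_i^0:

```python
from itertools import product

# Hypothetical illustration of Steps 2.2-2.6: competing patterns "11" and
# "00", window length b = 2, i.i.d. fair trials (all choices are ours).
b = 2
states = ["".join(t) for t in product("01", repeat=b)]   # Omega_Y
patterns = {"alpha1": "11", "alpha2": "00"}
p = {"0": 0.5, "1": 0.5}

def absorbed_by(w):
    """Return the absorbing class a window w falls into, if any."""
    for name, pat in patterns.items():
        if w.endswith(pat):
            return name
    return None

transient = [w for w in states if absorbed_by(w) is None]
T = {x: {y: 0.0 for y in transient} for x in transient}  # transient -> transient
T0 = {x: {a: 0.0 for a in patterns} for x in transient}  # transient -> absorbing
for x in transient:
    for s in "01":
        y = x[1:] + s                                     # slide the window
        a = absorbed_by(y)
        if a is None:
            T[x][y] += p[s]
        else:
            T0[x][a] += p[s]
```

Here the transient states are "01" and "10", each moving to the other with probability 1/2 and being absorbed otherwise, mirroring the block structure [T | T_i^0; 0 | I] of Figure A1.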
Step 3. Probability generating function
3.1
For i, j = 1, …, c:
  • Derive G_i(z)1_{z≤b} and G_{i,j}(z)1_{z≤b} using a probability product of maximum length b (a probability tree diagram may be useful).
  • Derive the 1 × |Ω_Y ∖ {α}| vector IP_i (the matrix H_{r,b} · H_{b,b}^{b−1} may be helpful).
  • Compute the 1 × |Ω_Y ∖ {α}| vector IP (a probability tree diagram may be useful).
  • Derive G_j(z)1_{z>b} and G_{i,j}(z)1_{z>b} by:
    G_j(z)1_{z>b} = z^b · IP · (I − zT)^{−1} · zT_j^0,
    G_{i,j}(z)1_{z>b} = z^b · IP_i · (I − zT)^{−1} · zT_j^0.
  • Use the law of total expectation to obtain:
    G_{i,j}(z) = G_{i,j}(z)1_{z≤b} + G_{i,j}(z)1_{z>b},
    G_i(z) = G_i(z)1_{z≤b} + G_i(z)1_{z>b}.
3.2
The final G(z) is obtained by
G(z) = Σ_{i,j} Π_{k=1}^{|V_{i,j}|} G_{V_{i,j}(k−1), V_{i,j}(k)}(z).
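As a numerical sanity check on the matrix machinery of Step 3, consider the simplest special case: a single pattern "11" in i.i.d. fair coin tosses (one pattern, one occurrence required). Differentiating the pgf at z = 1 reduces the mean waiting time to the row sums of the fundamental matrix (I − T)^{−1}, i.e., to the solution of (I − T)t = e. The sketch below, with a minimal hand-rolled solver, recovers the classical value E[W] = 6:

```python
def solve(A, rhs):
    """Gauss-Jordan elimination with partial pivoting (small systems only)."""
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda k: abs(M[k][col]))
        M[col], M[piv] = M[piv], M[col]
        for k in range(n):
            if k != col:
                f = M[k][col] / M[col][col]
                M[k] = [a - f * c for a, c in zip(M[k], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

# Transient states for the pattern "11": 0 = no progress, 1 = last toss was 1.
T = [[0.5, 0.5],
     [0.5, 0.0]]
# Expected absorption times t satisfy (I - T) t = e.
I_minus_T = [[(1.0 if i == j else 0.0) - T[i][j] for j in range(2)]
             for i in range(2)]
t = solve(I_minus_T, [1.0, 1.0])
print(t)   # [6.0, 4.0]: mean waiting time for "11" from scratch is 6 tosses
```

Higher factorial moments follow similarly from higher derivatives of (I − zT)^{−1} at z = 1, which is how the means and variances reported in the examples can be obtained.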

References

  1. Schwager, S.J. Run probabilities in sequences of Markov-dependent trials. J. Am. Stat. Assoc. 1983, 78, 168–175. [Google Scholar] [CrossRef]
  2. Karwe, V.V.; Naus, J.I. New recursive methods for scan statistic probabilities. Comput. Stat. Data Anal. 1997, 23, 389–402. [Google Scholar] [CrossRef]
  3. Martin, D.E.; Aston, J.A. Waiting time distribution of generalized later patterns. Comput. Stat. Data Anal. 2008, 52, 4879–4890. [Google Scholar] [CrossRef]
  4. Kulldorff, M. A spatial scan statistic. Commun. Stat. Theory Methods 1997, 26, 1481–1496. [Google Scholar] [CrossRef]
  5. Aki, S. Discrete distributions of order k on a binary sequence. Ann. Inst. Stat. Math. 1985, 37, 205–224. [Google Scholar] [CrossRef]
  6. Aki, S.; Hirano, K. Lifetime distribution and estimation problems of consecutive-k-out-of-n: f systems. Ann. Inst. Stat. Math. 1996, 48, 185–199. [Google Scholar] [CrossRef]
  7. Chang, Y.M.; Huang, T.H. Reliability of a 2-dimensional k-within-consecutive-r×s-out-of-m×n: f system using finite Markov chains. IEEE Trans. Reliab. 2010, 59, 725–733. [Google Scholar] [CrossRef]
  8. Dafnis, S.D.; Antzoulakos, D.L.; Philippou, A.N. Distributions related to (k1,k2) events. J. Stat. Plan. Inference 2010, 140, 1691–1700. [Google Scholar] [CrossRef]
  9. Dafnis, S.D.; Gounari, S.; Zotos, C.E.; Papadopoulos, G.K. The effect of cold periods on the biological cycle of Marchalina hellenica. Insects 2022, 13, 375. [Google Scholar] [CrossRef]
  10. Dafnis, S.D.; Makri, F.S.; Koutras, M.V. Generalizations of runs and patterns distributions for sequences of binary trials. Methodol. Comput. Appl. Probab. 2021, 23, 165–185. [Google Scholar] [CrossRef]
  11. Dafnis, S.D.; Makri, F.S.; Philippou, A.N. The reliability of a generalized consecutive system. Appl. Math. Comput. 2019, 359, 186–193. [Google Scholar] [CrossRef]
  12. Dafnis, S.D.; Makri, F.S. Distributions related to weak runs with a minimum and a maximum number of successes: A unified approach. Methodol. Comput. Appl. Probab. 2023, 25, 24. [Google Scholar] [CrossRef]
  13. Feller, W. An Introduction to Probability Theory and Its Applications, 3rd ed.; Wiley: New York, NY, USA, 1971; Volume 1. [Google Scholar]
  14. Philippou, A.N.; Georghiou, C.; Philippou, G.N. A generalized geometric distribution and some of its properties. Stat. Probab. Lett. 1983, 1, 171–175. [Google Scholar] [CrossRef]
  15. Philippou, A.N.; Makri, F.S. Successes, runs and longest runs. Stat. Probab. Lett. 1986, 4, 211–215. [Google Scholar] [CrossRef]
  16. Philippou, A.N.; Antzoulakos, D.L. Multivariate distributions of order k on a generalized sequence. Stat. Probab. Lett. 1990, 9, 453–463. [Google Scholar] [CrossRef]
  17. Ling, K. On geometric distributions of order (k1,k2,…,km). Stat. Probab. Lett. 1990, 9, 163–171. [Google Scholar] [CrossRef]
  18. Shmueli, G.; Cohen, A. Run-Related probability functions applied to sampling inspection. Technometrics 2000, 42, 188–202. [Google Scholar] [CrossRef]
  19. Koutras, M.V.; Eryilmaz, S. Compound geometric distribution of order k. Methodol. Comput. Appl. Probab. 2017, 19, 377–393. [Google Scholar] [CrossRef]
  20. Blom, G.; Thorburn, D. How many random digits are required until given sequences are obtained? J. Appl. Probab. 1982, 19, 518–531. [Google Scholar] [CrossRef]
  21. Ebneshahrashoob, M.; Sobel, M. Sooner and later waiting time problems for Bernoulli trials: Frequency and run quotas. Stat. Probab. Lett. 1990, 9, 5–11. [Google Scholar] [CrossRef]
  22. Huang, W.T.; Tsai, C.S. On a modified binomial distribution of order k. Stat. Probab. Lett. 1991, 11, 125–131. [Google Scholar] [CrossRef]
  23. Makri, F.S. On occurrences of FS strings in linearly and circularly ordered binary sequences. J. Appl. Probab. 2010, 47, 157–178. [Google Scholar] [CrossRef]
  24. Kumar, A.N.; Upadhye, N.S. Generalizations of distributions related to (k1,k2)-runs. Metrika 2019, 82, 249–268. [Google Scholar] [CrossRef]
  25. Zhao, X.; Song, Y.; Wang, X.; Lv, Z. Distributions of (k1,k2,…,kl)-runs with multi-state trials. Methodol. Comput. Appl. Probab. 2022, 24, 2689–2702. [Google Scholar] [CrossRef]
  26. Kong, Y. Multiple consecutive runs of multi-state trials: Distributions of (k1,k2,…,kl) patterns. J. Comput. Appl. Math. 2022, 403, 113846. [Google Scholar] [CrossRef]
  27. Chadjiconstantinidis, S.; Eryilmaz, S. Computing waiting time probabilities related to (k1,k2,…,kl) pattern. Stat. Pap. 2023, 64, 1373–1390. [Google Scholar] [CrossRef]
  28. Aki, S. Waiting time problems for a sequence of discrete random variables. Ann. Inst. Stat. Math. 1992, 44, 363–378. [Google Scholar] [CrossRef]
  29. Koutras, M.V. On a waiting time distribution in a sequence of Bernoulli trials. Ann. Inst. Stat. Math. 1996, 48, 789–806. [Google Scholar] [CrossRef]
  30. Robin, S.; Daudin, J.J. Exact distribution of word occurrences in a random sequence of letters. J. Appl. Probab. 1999, 36, 179–193. [Google Scholar] [CrossRef]
  31. Aki, S.; Hirano, K. Waiting time problems for a two-dimensional pattern. Ann. Inst. Stat. Math. 2004, 56, 169–182. [Google Scholar] [CrossRef]
  32. Hirano, K.; Aki, S. On number of occurrences of success runs of specified length in a two-state Markov chain. Stat. Sin. 1993, 3, 313–320. [Google Scholar]
  33. Fu, J.C.; Koutras, M.V. Distribution theory of runs: A Markov chain approach. J. Am. Stat. Assoc. 1994, 89, 1050–1058. [Google Scholar] [CrossRef]
  34. Fu, J.C. Distribution theory of runs and patterns associated with a sequence of multi-state trials. Stat. Sin. 1996, 6, 957–974. [Google Scholar]
  35. Koutras, M.V. Waiting time distributions associated with runs of fixed length in two-state Markov chains. Ann. Inst. Stat. Math. 1997, 49, 123–139. [Google Scholar] [CrossRef]
  36. Antzoulakos, D.L. Waiting times for patterns in a sequence of multistate trials. J. Appl. Probab. 2001, 38, 508–518. [Google Scholar] [CrossRef]
  37. Fisher, E.; Cui, S. Patterns generated by m-order Markov chains. Stat. Probab. Lett. 2010, 80, 1157–1166. [Google Scholar] [CrossRef]
  38. Chang, Y.M.; Fu, J.C.; Lin, H.Y. Distribution and double generating function of number of patterns in a sequence of Markov dependent multistate trials. Ann. Inst. Stat. Math. 2012, 64, 55–68. [Google Scholar] [CrossRef]
  39. Fu, J.C.; Chang, Y.M. On probability generating functions for waiting time distributions of compound patterns in a sequence of multistate trials. J. Appl. Probab. 2002, 39, 70–80. [Google Scholar] [CrossRef]
  40. Han, Q.; Hirano, K. Sooner and later waiting time problems for patterns in Markov dependent trials. J. Appl. Probab. 2003, 40, 73–86. [Google Scholar] [CrossRef]
  41. Fu, J.C.; Lou, W.Y.W. Waiting time distributions of simple and compound patterns in a sequence of r-th order Markov dependent multi-state trials. Ann. Inst. Stat. Math. 2006, 58, 291–310. [Google Scholar] [CrossRef]
  42. Wu, T.L. Conditional waiting time distributions of runs and patterns and their applications. Ann. Inst. Stat. Math. 2020, 72, 531–543. [Google Scholar] [CrossRef]
  43. Aston, J.A.; Martin, D.E. Waiting time distributions of competing patterns in higher-order Markovian sequences. J. Appl. Probab. 2005, 42, 977–988. [Google Scholar] [CrossRef]
  44. Fu, J.C.; Lou, W.Y.W. Distribution Theory of Runs and Patterns and Its Applications: A Finite Markov Chain Imbedding Approach; World Scientific Publishing Co.: Singapore, 2003. [Google Scholar]
  45. Balakrishnan, N.; Koutras, M.V. Runs and Scans with Applications; John Wiley & Sons: Hoboken, NJ, USA, 2011. [Google Scholar]
  46. Aston, J.A.; Martin, D.E. Distributions associated with general runs and patterns in hidden Markov models. Ann. Appl. Stat. 2007, 1, 585–611. [Google Scholar] [CrossRef]
  47. Martin, D.E.K. Computation of exact probabilities associated with overlapping pattern occurrences. WIREs Comput. Stat. 2019, 11, e1477. [Google Scholar] [CrossRef]
  48. Martin, D.E.K. Distributions of pattern statistics in sparse Markov models. Ann. Inst. Stat. Math. 2020, 72, 895–913. [Google Scholar] [CrossRef]
  49. Michael, B.V.; Eutichia, V. On the distribution of the number of success runs in a continuous time Markov chain. Methodol. Comput. Appl. Probab. 2020, 22, 969–993. [Google Scholar] [CrossRef]
  50. Vaggelatou, E. On the longest run and the waiting time for the first run in a continuous time multi-state Markov chain. Methodol. Comput. Appl. Probab. 2024, 26, 55. [Google Scholar] [CrossRef]
  51. Makri, F.S.; Psillakis, Z.M. Distribution of patterns of constrained length in binary sequences. Methodol. Comput. Appl. Probab. 2023, 25, 90. [Google Scholar] [CrossRef]
  52. Makri, F.S.; Psillakis, Z.M.; Dafnis, S.D. Number of runs of ones of length exceeding a threshold in a modified binary sequence with locks. Commun. Stat. Simul. Comput. 2024, 1–17. [Google Scholar] [CrossRef]
  53. Inoue, K.; Aki, S. Generalized binomial and negative binomial distributions of order k by the l-overlapping enumeration scheme. Ann. Inst. Stat. Math. 2003, 55, 153–167. [Google Scholar] [CrossRef]
Figure 1. The possible paths of Λ^(1) and Λ^(2) that terminate the experiment.
Figure 2. The states and probabilities of Y_3 (t = 1), given (a) x(t = 0) = (00), (b) x(t = 0) = (01), (c) x(t = 0) = (10), (d) x(t = 0) = (11).
Figure 3. The possible paths of Λ_1 or Λ_2 that terminate the experiment.
Table 1. The probability transition matrix of Ω_Y.

  P_{x,y} |  010   011   101   110   α1    α2
  010     |   0     0    p4     0    p3     0
  011     |   0     0     0    p7     0    p8
  101     |  p5    p6     0     0     0     0
  110     |   0     0    p4     0    p3     0
  α1      |   0     0     0     0     1     0
  α2      |   0     0     0     0     0     1
Table 2. The probability transition matrix H_{2,3}.

  H_{2,3} |  000   001   010   011   100   101   110   111
  00      |  p1    p2     0     0     0     0     0     0
  01      |   0     0    p5    p6     0     0     0     0
  10      |   0     0     0     0    p3    p4     0     0
  11      |   0     0     0     0     0     0    p7    p8
Table 3. The probability transition matrix H_{3,3}.

  H_{3,3} |  000   001   010   011   100   101   110   111
  000     |  p1    p2     0     0     0     0     0     0
  001     |   0     0    p5    p6     0     0     0     0
  010     |   0     0     0     0    p3    p4     0     0
  011     |   0     0     0     0     0     0    p7    p8
  100     |  p1    p2     0     0     0     0     0     0
  101     |   0     0    p5    p6     0     0     0     0
  110     |   0     0     0     0    p3    p4     0     0
  111     |   0     0     0     0     0     0    p7    p8
Table 4. The probability transition matrix H_{2,3}·H_{3,3}².

  H_{2,3}·H_{3,3}² |  000       001       010       011       100       101       110       111
  00               |  (p1)³     (p1)²p2   p1p2p5    p1p2p6    p2p5p3    p2p5p4    p2p6p7    p2p6p8
  01               |  p5p3p1    p2p5p3    (p5)²p4   p5p4p6    p6p7p3    p6p7p4    p6p8p7    p6(p8)²
  10               |  p3(p1)²   p3p1p2    p2p5p3    p3p2p6    p5p4p3    p5(p4)²   p6p7p4    p4p6p8
  11               |  p7p3p1    p7p3p2    p7p4p5    p7p4p6    p8p7p3    p8p7p4    (p8)²p7   (p8)³
Table 5. The highlighted probabilities used in Example 2 from the matrix H_{2,3}·H_{3,3}². (The entries coincide with those of Table 4; the highlighting marks the probabilities entering Example 2.)
Table 6. The highlighted probabilities used in Example 3 from the matrix H_{2,3}·H_{3,3}². (The entries coincide with those of Table 4; the highlighting marks the probabilities entering Example 3.)

Share and Cite

Moshkovitz, I.; Barron, Y. The Waiting Time Distribution of Competing Patterns in Markov-Dependent Bernoulli Trials. Axioms 2025, 14, 221. https://doi.org/10.3390/axioms14030221