
An Erdős-Révész Type Law for the Length of the Longest Match of Two Coin-Tossing Sequences

Institute of Statistics and Mathematical Methods in Economy, TU Wien, Wiedner Hauptstraße 8-10, 1040 Vienna, Austria
Entropy 2025, 27(1), 34; https://doi.org/10.3390/e27010034
Submission received: 27 November 2024 / Revised: 30 December 2024 / Accepted: 1 January 2025 / Published: 3 January 2025
(This article belongs to the Special Issue The Random Walk Path of Pál Révész in Probability)

Abstract

Consider a coin-tossing sequence, i.e., a sequence of independent random variables taking the values 0 and 1 with probability $1/2$ each. The famous Erdős–Rényi law of large numbers implies that the longest run of ones in the first $n$ observations has a length $R_n$ that behaves like $\log(n)$ as $n$ tends to infinity (throughout this article, $\log$ denotes the logarithm with base 2). Erdős and Révész refined this result by providing a description of the Lévy upper and lower classes of the process $R_n$. In another direction, Arratia and Waterman extended the Erdős–Rényi result to the longest matching subsequence (with shifts) of two coin-tossing sequences, finding that it behaves asymptotically like $2\log(n)$. The present paper provides some Erdős–Révész type results in this situation, obtaining a complete description of the upper classes and a partial result on the lower ones.

1. Introduction

Consider a coin-tossing sequence $(X_n)$, i.e., a sequence of independent random variables satisfying $P(X_n = 0) = P(X_n = 1) = 1/2$. Let $R_n$ be the length of the longest head-run, i.e., the largest integer $r$ for which there is an $i$, $0 \le i \le n - r$, such that $X_{i+j} = 1$ for $j = 1, \dots, r$. A result of Erdős and Rényi [1] implies that
$$\lim_{n \to \infty} \frac{R_n}{\log(n)} = 1 \qquad (1)$$
(throughout this paper, $\log$ will denote the base 2 logarithm. The notation $\log_k$ will be used for its iterates: $\log_2(x) = \log(\log(x))$, $\log_{k+1}(x) = \log(\log_k(x))$. Also, $C$ and $c$, with or without an index, are used to denote generic constants that may have different values at each occurrence). The simple result (1) has seen a number of improvements. Erdős and Révész [2] provided a detailed description of the asymptotic behavior of $R_n$. In order to formulate their result, let us recall
Definition 1 
(Lévy classes). Let $(Y_n)$ be a sequence of random variables. We say that a sequence $(a_n)$ of real numbers belongs to
  • the upper-upper class of $(Y_n)$ ($UUC(Y_n)$) if, with probability 1, $Y_n \le a_n$ for all sufficiently large $n$;
  • the upper-lower class of $(Y_n)$ ($ULC(Y_n)$) if, with probability 1, $Y_n > a_n$ for infinitely many $n$;
  • the lower-upper class of $(Y_n)$ ($LUC(Y_n)$) if, with probability 1, $Y_n < a_n$ for infinitely many $n$;
  • the lower-lower class of $(Y_n)$ ($LLC(Y_n)$) if, with probability 1, $Y_n \ge a_n$ for all sufficiently large $n$.
Of course, these definitions work best if the sequence ( Y n ) obeys some zero-one law.
Their result is as follows:
Let $(a_n)$ be a nondecreasing integer sequence. Then
  • $(a_n) \in UUC(R_n)$ if $\sum_n 2^{-a_n} < \infty$,
  • $(a_n) \in ULC(R_n)$ if $\sum_n 2^{-a_n} = \infty$,
  • for any $\epsilon > 0$, $a_n = \log(n) - \log_3(n) + \log_2(e) - 1 + \epsilon \in LUC(R_n)$,
  • for any $\epsilon > 0$, $a_n = \log(n) - \log_3(n) + \log_2(e) - 2 - \epsilon \in LLC(R_n)$.
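For intuition, the Erdős–Rényi rate in (1) is easy to observe numerically. The following sketch is an illustration, not part of the original argument; the sample size and seed are arbitrary:

```python
import random
from math import log2

def longest_head_run(bits):
    """Length R_n of the longest run of ones in a 0/1 sequence."""
    best = run = 0
    for b in bits:
        run = run + 1 if b == 1 else 0
        best = max(best, run)
    return best

random.seed(1)
n = 100_000
bits = [random.randint(0, 1) for _ in range(n)]
r_n = longest_head_run(bits)
print(r_n, r_n / log2(n))  # the ratio is typically close to 1, per (1)
```

For $n = 10^5$, one expects $R_n$ near $\log(n) \approx 16.6$, with the $O(1)$ fluctuations described by the classes above.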
Arratia and Waterman [3] extend Erdős and Rényi's result in another direction: they consider two independent coin-tossing sequences, $(X_n)$ and $(Y_n)$, and look for the longest matching subsequences when shifting is allowed. Formally, let $M_n$ be the largest integer $m$ for which there are $i, j$ with $0 \le i, j \le n - m$ and $X_{i+k} = Y_{j+k}$ for all $k = 1, \dots, m$. They prove that, with probability 1,
$$\lim_{n \to \infty} \frac{M_n}{\log(n)} = 2.$$
In the present paper, we will make this more precise by providing a description for the upper classes of ( M n ) and also some results on its lower classes:
Theorem 1. 
Let $(a_n)$ be a nondecreasing integer sequence. We have
  • $(a_n) \in UUC(M_n)$ if $\sum_n n\, 2^{-a_n} < \infty$;
  • $(a_n) \in ULC(M_n)$ if $\sum_n n\, 2^{-a_n} = \infty$;
  • for some $c$, $a_n = 2\log(n) - \log_3(n) + c \in LUC(M_n)$;
  • for some $c$, $a_n = 2\log(n) - 2\log_2(n) - \log_3(n) + c \in LLC(M_n)$.

2. Discussion

We leave the proof of Theorem 1 for later and rather discuss some of the concepts that are connected to this problem. One of them is the so-called independence principle: in many, although not all, situations, one may pretend that the waiting times until a given pattern of length $l$ is observed have an exponential distribution with parameter $2^{-l}$, and that the waiting times for different patterns are independent. Móri [4] and Móri and Székely [5] provide an account of this principle and its limitations. In our case, all results but the lower-lower class one are more or less in tune with this principle.
Another closely related question is that of the number $N(n, l)$ of different length-$l$ subsequences of $(X_1, \dots, X_n)$. This question does not seem to have received much attention in the literature; there is one remarkable result by Móri [6]: in the remark following the statement of Theorem 3 in that paper, he mentions that, with probability 1 for large $n$, the largest $l$ for which all $2^l$ possible patterns occur as subsequences of $(X_1, \dots, X_n)$ lies between $\log(n) - \log_2(n) - \epsilon$ and $\log(n) - \log_2(n) + \epsilon$ for any $\epsilon > 0$. The independence principle would suggest that $N(n, \log(n))/n$ is bounded away from 0 with probability 1, and this, or even the less stringent $N(n, \log(n)) \ge n\, (\log_2(n))^{-c}$ for large $n$, would be an important step towards removing the double log term from the $LLC$ result, as we conjecture that, for some $c > 0$, $2\log(n) - c\log_3(n) \in LLC(M_n)$. Unfortunately, we are only able to obtain $N(n, \log(n)) \ge c\, n/\log(n)$, which is also implied by Móri's result.
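The quantity $N(n, l)$ is easy to examine empirically. The following sketch is an illustration, not part of the paper; $n$ is arbitrarily chosen as a power of 2 so that $\log(n)$ is an integer, and the count is compared with the bounds $n/(4\log(n))$ and $n$ of Lemma 1 below:

```python
import random
from math import log2

def distinct_subsequences(bits, l):
    """N(n, l): number of distinct length-l contiguous subsequences."""
    return len({tuple(bits[i:i + l]) for i in range(len(bits) - l + 1)})

random.seed(3)
n = 2 ** 12                # chosen so that log2(n) = 12 exactly
l = int(log2(n))
bits = [random.randint(0, 1) for _ in range(n)]
N = distinct_subsequences(bits, l)
print(N, n / (4 * l), n)   # N lies between n/(4 log n) and n
```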

3. Proofs

Proof of the upper-upper class result. 
Both upper class statements are fairly easy to prove. First, observe that, under our assumptions, the convergence of
$$\sum_{n=1}^{\infty} n\, 2^{-a_n}$$
is equivalent to that of
$$\sum_{k=1}^{\infty} n_k^2\, 2^{-a_{n_k}}$$
with $n_k = 2^k$.
Now, define events
$$A_k = [M_{n_k} \ge a_{n_{k-1}}].$$
$A_k$ occurs if, in one of the $(n_k + 1 - a_{n_{k-1}})^2$ pairs of sequences
$$\big((X_{i+1}, \dots, X_{i+a_{n_{k-1}}}),\ (Y_{j+1}, \dots, Y_{j+a_{n_{k-1}}})\big),$$
both sequences agree. That provides the trivial upper bound
$$P(A_k) \le n_k^2\, 2^{-a_{n_{k-1}}},$$
so, by our assumptions, $\sum_k P(A_k) < \infty$, and the Borel–Cantelli Lemma implies that, with probability 1, only finitely many events $A_k$ occur. Thus, for sufficiently large $k$, $M_{n_k} < a_{n_{k-1}}$, and for $n_{k-1} \le n \le n_k$, we have
$$M_n \le M_{n_k} < a_{n_{k-1}} \le a_n.$$
This shows that $(a_n) \in UUC(M_n)$, as claimed. □
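The union bound underlying this proof can be checked exactly on a toy case by enumerating all pairs of short sequences (this check is an illustration, not part of the paper; $n = 7$ and $a = 6$ are arbitrary small values): the probability of a match of length at least $a$ never exceeds $n^2\, 2^{-a}$.

```python
def longest_match(x, y):
    """Longest common contiguous substring length, by direct scanning."""
    n, best = len(x), 0
    for i in range(n):
        for j in range(n):
            k = 0
            while i + k < n and j + k < n and x[i + k] == y[j + k]:
                k += 1
            best = max(best, k)
    return best

n, a = 7, 6
hits = 0
for code in range(2 ** (2 * n)):              # enumerate all pairs (x, y)
    x = [(code >> t) & 1 for t in range(n)]
    y = [(code >> (n + t)) & 1 for t in range(n)]
    if longest_match(x, y) >= a:
        hits += 1
p_exact = hits / 2 ** (2 * n)
print(p_exact, n ** 2 * 2 ** (-a))            # exact probability vs. union bound
assert p_exact <= n ** 2 * 2 ** (-a)
```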
Proof of the upper-lower class result. 
We may assume without loss of generality that $n^2\, 2^{-a_n} \le 1/4$.
Again, let $n_k = 2^k$. We want to use the second Borel–Cantelli Lemma, so we define the events
$$A_k = [\exists\, i, j:\ n_{k-1} < i, j \le n_k:\ X_{i+s} = Y_{j+s},\ s = 0, \dots, a_{n_k} - 1],$$
which are independent for different $k$. $A_k$ is the union of the events
$$B_{ij} = [X_{i+s} = Y_{j+s},\ s = 0, \dots, a_{n_k} - 1]$$
with $n_{k-1} < i, j \le n_k$. We endow the set of pairs $(i, j)$ with the lexicographic order. For a subset $I$ of the integers, Bonferroni's inequality provides
$$P\Big(\bigcup_{(i,j) \in I \times I} B_{ij}\Big) \ \ge\ \sum_{(i,j) \in I \times I} P(B_{ij}) \ -\ \sum_{\substack{(i,j), (i',j') \in I \times I \\ (i',j') < (i,j)}} P(B_{ij} \cap B_{i'j'}). \qquad (11)$$
Let $d((i,j),(i',j')) = \max(|i - i'|, |j - j'|)$. If $d((i,j),(i',j')) \ge a_{n_k}$, then $P(B_{ij} \cap B_{i'j'}) = 2^{-2a_{n_k}}$; otherwise, $P(B_{ij} \cap B_{i'j'}) \le 2^{-(a_{n_k} + d((i,j),(i',j')))}$.
Let $I = \{i:\ n_{k-1} < i \le n_k,\ 4 \mid i\}$. Using this in (11) yields the value $|I|^2\, 2^{-a_{n_k}}$ for the first sum. In the second sum, for given $(i,j)$ and $d < a_{n_k}/4$, there are no more than $2(2d+1)$ pairs $(i',j')$ with $d((i,j),(i',j')) = 4d$. The number of pairs of pairs $((i,j),(i',j'))$ with $d((i,j),(i',j')) \ge a_{n_k}$ is trivially bounded by $|I|^4/2$. Putting these together, we arrive at the upper estimate
$$|I|^2\, 2^{-a_{n_k}} \Big( \sum_{d=1}^{\infty} (2d+1)\, 2^{1-4d} + |I|^2\, 2^{-1-a_{n_k}} \Big) < \frac{1}{2}\, |I|^2\, 2^{-a_{n_k}}$$
for the second sum. In total, keeping in mind that $|I| = n_k/8$,
$$P(A_k) \ge \frac{1}{2}\, |I|^2\, 2^{-a_{n_k}} = \frac{1}{128}\, n_k^2\, 2^{-a_{n_k}},$$
and $\sum_k P(A_k) = \infty$. Borel–Cantelli implies that, with probability 1, infinitely many events $A_k$ occur. Thus, for infinitely many $k$, $M_{n_k} \ge a_{n_k}$, so $(a_n) \in ULC(M_n)$. □
For the lower class results, we first prove some lemmas:
Lemma 1. 
With probability 1, for $n$ sufficiently large,
$$\frac{n}{4\log(n)} \le N(n, \log(n)) \le n.$$
Proof of Lemma 1. 
The upper estimate is trivial, since there are at most $n$ subsequences altogether. The lower part is a direct consequence of Móri's result: for sufficiently large $n$, $N(n, \log(n) - \log_2(n) - 1) = 2^{\log(n) - \log_2(n) - 1} = \frac{n}{2\log(n)}$, and, obviously, $N(n, \log(n)) \ge N(n, \log(n) - \log_2(n) - 1) - \log_2(n) - 2 \ge \frac{n}{4\log(n)}$, as extending two different sequences from length $\log(n) - \log_2(n) - 1$ to $\log(n)$ keeps them different; it can only happen that some of them are extended beyond index $n$, but this can affect at most $\log_2(n) + 2$ of them. □
Lemma 2. 
Let $S$ be a set of $m < 2^l$ sequences of length $l$, and let $A$ be the event that none of the sequences in $S$ occurs as a subsequence of $(X_1, \dots, X_n)$. For $l \le n' < n$, let $B$ be the event that some sequence from $S$ is a subsequence of $(X_1, \dots, X_{n'})$. Then
$$\Big( (1 - P(B))^2 - \frac{n}{n'}\, l\, m\, 2^{-l} \Big)\, (1 - P(B))^{n/n'} \ \le\ P(A) \ \le\ (1 - P(B))^{\lfloor n/n' \rfloor}.$$
Proof of Lemma 2, upper part. 
Consider the $\lfloor n/n' \rfloor$ sequences $(X_{kn'+1}, \dots, X_{(k+1)n'})$ for $k = 0, \dots, \lfloor n/n' \rfloor - 1$. Each of them fails to contain a subsequence that lies in $S$ with probability $1 - P(B)$, and by independence,
$$P(A) \le (1 - P(B))^{\lfloor n/n' \rfloor}.$$
Proof of Lemma 2, lower part. 
Assume for simplicity that $n$ is a multiple of $n'$, say $n = Nn'$. Again, we split $(X_1, \dots, X_n)$ into $N$ blocks of length $n'$; the probability that none of them contains a sequence from $S$ is
$$(1 - P(B))^N. \qquad (17)$$
It can still happen that there is a sequence from $S$ that crosses one of the boundaries between the blocks. There are $N - 1$ boundaries, and for each of them, there are $l - 1$ possible subsequences of length $l$ crossing it. The probability that such a subsequence is from $S$ but none of the $N$ blocks contains one can be estimated from above by the probability that it is from $S$ and none of the $N - 2$ blocks not adjacent to it contains one from $S$. This provides the upper bound
$$(N - 1)(l - 1)\, m\, 2^{-l}\, (1 - P(B))^{N-2}$$
for the probability that there is a subsequence from $S$ in $(X_1, \dots, X_n)$ but none inside any of the $N$ blocks. Subtracting this from (17), we obtain the lower bound
$$\big( (1 - P(B))^2 - (N - 1)(l - 1)\, m\, 2^{-l} \big)\, (1 - P(B))^{N-2},$$
and the general case is obtained by observing that the probability $P(A)$ for a given $n$ is bounded below by the one we get for $n' \lfloor n/n' \rfloor$. □
In this lemma, the lower and upper bounds are rather close. Its applicability, however, depends on the availability of good estimates for the probability $P(B)$. There is the trivial upper bound $n'\, m\, 2^{-l}$ and an almost as simple lower bound $c\, n'\, m\, l^{-1}\, 2^{-l}$ (given $n'\, m\, l^{-1}\, 2^{-l} < 1$). Bridging the gap between these requires deeper insight into the structure of $S$.
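For the simplest possible choice of $S$, the bounds of Lemma 2 can be verified exactly. In the sketch below (an illustration, not part of the paper), $S$ consists of the single pattern 1111, so $P(A)$ and $P(B)$ are avoidance probabilities for a head run of length 4; they are computed exactly by tracking the distribution of the current run length, and the upper bound of the lemma is checked:

```python
from fractions import Fraction

def p_no_run(n, r):
    """Exact probability that n fair coin tosses contain no run of r ones,
    via the distribution of the current run length (states 0..r-1)."""
    probs = [Fraction(0)] * r
    probs[0] = Fraction(1)
    for _ in range(n):
        new = [Fraction(0)] * r
        new[0] = sum(probs) / 2          # a zero resets the run
        for k in range(r - 1):
            new[k + 1] = probs[k] / 2    # a one extends the run
        probs = new                      # mass reaching run r is dropped
    return sum(probs)

l, n_prime, n = 4, 10, 50                # pattern length, block length n', total n
p_B = 1 - p_no_run(n_prime, l)           # P(B): a block of length n' contains 1111
p_A = p_no_run(n, l)                     # P(A): the whole sequence avoids 1111
assert p_A <= (1 - p_B) ** (n // n_prime)   # upper bound of Lemma 2, exactly
```

Using exact rational arithmetic makes the inequality check rigorous rather than subject to floating-point error.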
Proof of the lower-lower class result. 
In both the lower-lower and lower-upper parts, we consider the asymptotics of the longest match between $(X_1, \dots, X_n)$ and $(Y_1, \dots, Y_n)$ under the condition that the sequence $(Y_n,\ n \in \mathbb{N})$ is given. Doing so, we may assume that $Y = (Y_n,\ n \in \mathbb{N})$ is a "typical" coin-tossing sequence, in the sense that it possesses some property that holds with probability 1. In the sequel, all probabilities are understood as conditional with respect to such a typical sequence $Y$. For the lower-lower class result, we let $n = n_l = C\, 2^{l/2}\, l\, \sqrt{\log(l)}$, $n' = n'_l = l$, and $m = m_l = \frac{n}{4\log(n)}$ in Lemma 2. Clearly, as $(X_1, \dots, X_l)$ only has one length-$l$ subsequence, $P(B)$ equals $\tilde m\, 2^{-l}$, where $\tilde m$ is the number of different sequences of length $l$ in $(Y_1, \dots, Y_{n_l})$. For sufficiently large $l$, $\tilde m \ge m_l$ by Lemma 1, and we obtain the upper estimate
$$p_l = \exp\big(-m_l\, (n_l/l)\, 2^{-l}\big) = \exp\Big(-\frac{1}{2}\, C^2 \log(l)\, (1 + o(1))\Big)$$
for the probability (conditional on $Y$) that there is no match of length $l$ between $(Y_1, \dots, Y_{n_l})$ and $(X_1, \dots, X_{n_l})$. For $C > \sqrt{2/\log(e)}$, the series $\sum_l p_l$ converges, so, with probability 1, we have $M_{n_l} \ge l$ for all but finitely many $l$. Thus, for $n_{l-1} \le n < n_l$, $M_n \ge M_{n_{l-1}} \ge l - 1$. Inverting the relationship between $n_l$ and $l$ yields $l = 2\log(n_l) - 2\log_2(n_l) - \log_3(n_l) + O(1)$, so, for some constant $c$ and $l$ large enough, we obtain $M_n \ge 2\log(n) - 2\log_2(n) - \log_3(n) - c$, which proves the lower-lower class result. □
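The inversion in the last step can be checked numerically (an illustration, not from the paper; $C = 1$ and the range of $l$ are arbitrary): with $n_l = C\, 2^{l/2}\, l\, \sqrt{\log(l)}$, the residual $l - \big(2\log(n_l) - 2\log_2(n_l) - \log_3(n_l)\big)$ stays bounded as $l$ grows.

```python
from math import log2

for l in range(50, 401, 50):
    # log2(n_l) for n_l = 2**(l/2) * l * sqrt(log2(l)), i.e. C = 1
    log_n = l / 2 + log2(l) + 0.5 * log2(log2(l))
    # right-hand side of l = 2 log(n) - 2 log_2(n) - log_3(n) + O(1)
    approx = 2 * log_n - 2 * log2(log_n) - log2(log2(log_n))
    print(l, round(approx - l, 2))
    assert abs(approx - l) < 5   # the O(1) residual stays small on this range
```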
Proof of the lower-upper class result. 
This time, we need to make our choice of the parameters in Lemma 2 with a little more sophistication. We start with $l = l_k = k^2$ for $k \in \mathbb{N}$. Then, $n = n_k = C\, 2^{l_k/2}\, \sqrt{\log(l_k)}$. As the set $S$, we choose the set $S_k$ of all sequences of length $l_k$ contained in $(Y_1, \dots, Y_{n_k})$. $n' = n'_k$ is chosen in such a way that $m_k\, n'_k\, 2^{-l_k} \to 0$ and $m_k\, n_k\, l_k\, 2^{-l_k}/n'_k \to 0$; $n'_k = 2^{l_k/4}$ is a possible choice.
We define the events
A k = { There is no sequence from   S k   in ( X 1 , X n k ) } ,
A ˜ k = { There is no sequence from   S k   in ( X n k 1 + 1 , X n k ) }
B k = { There is a sequence from   S k   in ( X 1 , X n k ) }
(this last is just the event B from Lemma 2).
Lemma 2 gives us
$$P(A_k) = (1 - P(B_k))^{n_k/n'_k}\, (1 + o(1)) \qquad (24)$$
and
$$P(\tilde A_k) = (1 - P(B_k))^{(n_k - n_{k-1})/n'_k}\, (1 + o(1)). \qquad (25)$$
The trivial estimate $P(B_k) \le m_k\, n'_k\, 2^{-l_k}$, together with $m_k\, n_k\, 2^{-l_k} \le n_k^2\, 2^{-l_k}$, yields
$$P(A_k) \ge e^{-2 C^2 \log(k)\, (1 + o(1))},$$
so that $\sum_k P(A_k)$ diverges if we choose $C < 1/\sqrt{2\log(e)}$.
We are going to use the Borel-Cantelli Lemma in the usual form for dependent events:
Lemma 3 
(Borel–Cantelli II). If the sequence $(A_n)$ satisfies
$$\sum_{n \in \mathbb{N}} P(A_n) = \infty$$
and
$$\lim_{n \to \infty} \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} P(A_i \cap A_j)}{\big( \sum_{i=1}^{n} P(A_i) \big)^2} = 1,$$
then
$$P\big(\limsup_{n \to \infty} A_n\big) = 1.$$
To this end, we need an upper bound for $P(A_i \cap A_j)$ for $i < j$. We have
$$P(A_i \cap A_j) \le P(A_i \cap \tilde A_j) = P(A_i)\, P(\tilde A_j).$$
By our Equations (24) and (25) from above,
$$P(\tilde A_j) = P(A_j)\, (1 - P(B_j))^{-n_{j-1}/n'_j}\, (1 + o(1)) = P(A_j)\, e^{m_j\, n_{j-1}\, 2^{-l_j}\, (1 + o(1))} = P(A_j)\, (1 + o(1)),$$
as $n_{j-1}/n_j \le c\, 2^{(1-2j)/2}$ and $m_j\, n_j\, 2^{-l_j} \le n_j^2\, 2^{-l_j} = O(\log j)$.
This means that, for any $\epsilon > 0$, there is a number $j_0$ such that, for $j > j_0$ and $i < j$, the inequality
$$P(A_i \cap A_j) \le (1 + \epsilon)\, P(A_i)\, P(A_j)$$
holds. Plugging this into
$$\sum_{j=1}^{n} \sum_{i=1}^{n} P(A_i \cap A_j) = \sum_{i=1}^{n} P(A_i) + 2 \sum_{j=2}^{n} \sum_{i=1}^{j-1} P(A_i \cap A_j)$$
yields the estimate
$$\sum_{j=1}^{n} \sum_{i=1}^{n} P(A_i \cap A_j) \le \sum_{i=1}^{n} P(A_i) + j_0^2 + (1 + \epsilon)\, \Big( \sum_{i=1}^{n} P(A_i) \Big)^2.$$
As $\sum_{i=1}^{\infty} P(A_i) = \infty$, we get
$$\limsup_{n \to \infty} \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} P(A_i \cap A_j)}{\big( \sum_{i=1}^{n} P(A_i) \big)^2} \le 1 + \epsilon.$$
As $\epsilon > 0$ is arbitrary, the sequence $(A_k)$ satisfies the assumptions of Lemma 3, so, with probability 1, infinitely many of the events $A_k$ occur. Thus, with probability 1, $M_{n_k} < l_k$ for infinitely many $k$. Observing that $l_k = 2\log(n_k) - \log_3(n_k) + O(1)$, we obtain our lower-upper class result. □

Funding

This research was funded by TU Wien Bibliothek through its Open Access Funding Programme.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable.

Acknowledgments

The author acknowledges TU Wien Bibliothek for financial support through its Open Access Funding Programme.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Erdős, P.; Rényi, A. On a new law of large numbers. J. Anal. Math. 1970, 23, 103–111.
  2. Erdős, P.; Révész, P. On the length of the longest head-run. In Topics in Information Theory; Colloquia Mathematica Societatis János Bolyai, Vol. 16; Csiszár, I., Elias, P., Eds.; North-Holland: Amsterdam, The Netherlands, 1977; pp. 219–228.
  3. Arratia, R.; Waterman, M.S. An Erdős–Rényi law with shifts. Adv. Math. 1985, 55, 13–23.
  4. Móri, T. Large deviation results for waiting times in repeated experiments. Acta Math. Hung. 1985, 45, 213–221.
  5. Móri, T.; Székely, G. Asymptotic independence of pure head stopping times. Stat. Probab. Lett. 1984, 2, 5–8.
  6. Móri, T. On the waiting time till each of some given patterns occurs as a run. Probab. Theory Relat. Fields 1991, 87, 313–323.