Article

Order Book Dynamics with Liquidity Fluctuations: Asymptotic Analysis of Highly Competitive Regime

1 Department of Statistical Engineering, National Engineering University (UNI), Lima 15333, Peru
2 Laboratory of Probability and Mathematical Statistics, Sobolev Institute of Mathematics, Siberian Branch of the Russian Academy of Science, 630090 Novosibirsk, Russia
3 Department of Computer Science in Economics, Novosibirsk State Technical University (NSTU), 630087 Novosibirsk, Russia
4 Department of Higher Mathematics, Siberian State University of Geosystems and Technologies (SSUGT), 630108 Novosibirsk, Russia
5 Department of Statistics, Institute of Mathematics and Statistics, University of São Paulo (USP), São Paulo 05508-220, Brazil
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(20), 4235; https://doi.org/10.3390/math11204235
Submission received: 5 September 2023 / Revised: 4 October 2023 / Accepted: 8 October 2023 / Published: 10 October 2023
(This article belongs to the Special Issue Mathematical Modeling and Applications in Industrial Organization)

Abstract

We introduce a class of Markov models to describe the bid–ask price dynamics in the presence of liquidity fluctuations. In a highly competitive regime, the spread evolution belongs to a class of Markov processes known as population processes with uniform catastrophes. Our mathematical analysis focuses on establishing the law of large numbers, the central limit theorem, and large deviations for this catastrophe-based model. Large deviation theory allows us to illustrate how huge deviations in the spread and prices can occur in the model. Moreover, our research highlights how local price trends and volatility are influenced by the typical values of the bid–ask spread. We calibrated the model parameters using available high-frequency data and conducted Monte Carlo numerical simulations to demonstrate the model's ability to reasonably replicate key phenomena in the presence of liquidity fluctuations.
MSC:
60F05; 60F10; 60J27; 60J28; 60J75

1. Introduction

The “order book” (OB) refers to an electronic list used to describe the evolution of bid and ask prices and sizes in high-frequency electronic markets, such as NYSE-ARCA, LSE, or NASDAQ. The evolution of the OB results from the interaction of buy and sell orders through a rather complex dynamic process. Order book dynamics have been extensively studied in the market microstructure and econophysics literature ([1,2,3]). More recently, based on the empirical characteristics presented in these studies, several models for the evolution of the OB have been proposed, as seen in [4,5,6,7]. These models, which are Markovian queueing systems, primarily focus on the direction of the next price movement. They give a reasonably clear understanding of price dynamics under uninterrupted high liquidity, i.e., they assume an abundant availability of limit orders in the OB. In this high-liquidity context, prices are relatively stable with small temporary fluctuations, and the bid and ask sizes at the top of the OB provide valuable information on these short-term price fluctuations.
Conversely, in various markets, prices are not as stable; they exhibit significant changes and, in some cases, local downtrends, often caused by liquidity wells in the OB. Events such as the 6 May 2010 “flash crash” (see Figure 1), which was a sudden and severe drop in stock prices in a very short time, have raised concerns about the stability of the OB and its suitability as the primary mechanism for trading. Additionally, occurrences of mini flash crashes—rapid and significantly large directional movements in asset prices—have become increasingly common (see [8]). These events lead to temporary liquidity crises, resulting in larger spreads. Consequently, it is both practically and theoretically important to gain a better understanding of how price dynamics depend on the structure and fundamental parameters of the OB (see [9,10,11]).
In the present paper, we are interested in understanding how severe intermittencies in liquidity affect the order book dynamics. We use the term “liquidity fluctuations” for contexts in which there are significant and intermittent decreases in the OB’s ability to absorb market orders. We propose a simple model for price dynamics in an OB in the presence of liquidity fluctuations. The model explains in a simple way how large price fluctuations occur, such as those observed in flash crashes. Furthermore, it shows how local trends and volatility are determined by the typical values of the bid–ask spread. A model for the dynamics of the spread follows implicitly from our price model. We use it to analyze large deviations in the spread and their impact on prices, and we present these large deviations in the form of “optimal trajectories” that give relevant information about how they occur. Finally, we present Monte Carlo simulations to corroborate that our model reproduces relevant empirical characteristics observed in our data and documented in the literature, such as the well-known bid–ask bounce; see [12].
We were initially motivated by the local and seemingly patternless price trends observed in various markets. Our goal was to understand the connection between these long-term trends and the short-term micro-jumps in prices, as shown in Figure 2. Our initial conjecture motivating this work was the existence of a close relationship between the spread, price trends, volatility around these trends, and liquidity fluctuations. Consequently, we needed to jointly model spread and price dynamics. Figure 2 illustrates that the local trend is shared by both bid and ask prices. This observation suggested that our model should not only capture long-term price trends but also incorporate the asymptotically stationary behavior of the spread. Large spread and price changes are typically attributed to liquidity changes ([13]). Our curiosity about this phenomenon grew, leading us to investigate how significant fluctuations in spread and prices, such as those seen in flash crashes, are related to liquidity fluctuations.
Our paper is organized as follows. In Section 2, we elucidate key empirical characteristics within an order book exhibiting liquidity fluctuations. We also introduce a general class of continuous-time Markov processes to model bid–ask price dynamics. Section 3 outlines a comprehensive Markovian model for an order book with high liquidity fluctuations. Here, we present the fundamental mathematical statements concerning stability (invariant measure), the law of large numbers, the central limit theorem, and large deviations. In Section 4, we present our numerical results. Section 5 delves into extensions and proposes models for two additional liquidity regimes: non-competitive and low liquidity. Our conclusions are presented in Section 6. The appendix, our final section, contains auxiliary results and proofs of the main statements presented in Section 3.

2. Markov Model and Regimes in Liquidity Fluctuations

A very important empirical characteristic observed in markets with liquidity fluctuations is the low availability of orders in the OB; the queue sizes at the top of the OB are small most of the time; see, e.g., Figure 3. In this context, the queue sizes of the best bid and ask prices are no longer the determining factors in the dynamics of prices; for more details see [13]. If the liquidity intermittency is severe, even “gaps” are formed in the OB (blocks of adjacent price levels that do not contain quotes). In these cases, the distribution of price changes is mainly determined by the distribution of the gap sizes in the OB. Taking these facts into account, if our interest is to explain the observed long-term price trends, we can focus only on micro-jumps in prices and disregard the size of the queues.
In these liquidity regimes, the spread exhibits a quite flexible dynamic behavior, reaching values much larger than those observed in high liquidity conditions; see, e.g., Figure 4. Based on our empirical experiences, the OB slowly digests liquidity fluctuations and we can characterize that process in two stages. In the first stage, the spread begins to increase persistently. In the later stage, the spread is reduced; the reduction can be drastic or gradual. The closing type of the spread in the second stage depends on the intensity of the liquidity fluctuation.
Our empirical observations about the reversing process of the bid–ask spread to its typical values, before and after liquidity shocks, have been theoretically corroborated through equilibrium models; see [14]. In this paper, we consider different types of reversing processes of the spread, i.e., we consider three low-liquidity regimes: highly competitive, non-competitive, and low liquidity with gaps. These three regimes correspond to low-liquidity regimes but differ in the closing type of the spread and in the gaps present in the OB.

The Markov Model: A General View

Building upon our discussions in the preceding sections, we propose a simplified model for an order book (OB) with liquidity fluctuations. Let $P_b(t)$ represent the (best) bid price and $P_a(t)$ denote the (best) ask price. The state of the OB is characterized by a continuous-time process $X(t) = (P_b(t), P_a(t))$, taking values in the discrete state space $\tau\mathbb{Z} \times \tau\mathbb{Z}$ (a two-dimensional lattice). Here, $\tau$ represents the “tick size”, and, as usual, $\mathbb{Z}$ denotes the set of integers.
Here, we aim to elaborate on our choice of employing the set of integers $\mathbb{Z}$ as opposed to restricting ourselves to the set of positive integers. This decision is motivated by several factors. One primary reason is the inherent simplicity that arises from utilizing $\mathbb{Z}$ in terms of mathematical description and subsequent analysis. Introducing boundary conditions, such as reflections, can significantly complicate the analytical process.
Furthermore, an additional rationale behind our choice lies in our focus on analyzing systems characterized by constant parameters. In situations where the price tends towards zero, it is possible that this behavior is driven by changes in the underlying system’s parameters (due to a crisis, for example). In contrast, it might not necessarily be an outcome of the dynamics within a model governed by the same parameters. This consideration underscores our approach to employ $\mathbb{Z}$ for its greater flexibility and relevance to the specific context of our analysis.
For the sake of simplicity, we consider $\mathbb{Z} \times \mathbb{Z}$ as the state space of $X(t)$, interpreting each state as a multiple of $\tau$. The price process $X(t)$ has piecewise constant sample paths, with transitions corresponding to order book events that trigger price fluctuations (as illustrated in Figure 2). Our objective is to describe the asymptotic behavior of the price process $X(t)$ arising from numerous micro-jumps.
Based on this simplified representation, consider a continuous-time Markov chain $X(t) = (P_b(t), P_a(t))$ with state space $\mathcal{X} \subset \mathbb{Z} \times \mathbb{Z}$,
$$
\mathcal{X} = \{ (b, a) \in \mathbb{Z} \times \mathbb{Z} : b < a \}.
$$
Here, $P_b(t)$ represents the bid price, $P_a(t)$ represents the ask price, and $S(t) = P_a(t) - P_b(t)$ is the bid–ask spread. In general, the transitions of the chain $X(t)$ are defined by the following transition rates: given a state $(b, a)$, then
$$
\begin{aligned}
(b, a) &\to (b, a + \Delta) && \text{with rate } \alpha_+(\Delta),\\
(b, a) &\to (b, a - \Delta) && \text{with rate } \alpha_-(\Delta), \text{ where } 0 < \Delta < a - b,\\
(b, a) &\to (b - \Delta, a) && \text{with rate } \beta_-(\Delta),\\
(b, a) &\to (b + \Delta, a) && \text{with rate } \beta_+(\Delta), \text{ where } 0 < \Delta < a - b.
\end{aligned}
\tag{1}
$$
In all cases, the increment $\Delta$ is a positive integer. The function $\alpha_+(\cdot)$ (resp. $\beta_-(\cdot)$) is the rate at which increases (resp. decreases) in the ask (resp. bid) price occur as a result of the execution of market buy (resp. sell) orders or cancellations of limit sell (resp. buy) orders. Similarly, the function $\alpha_-(\cdot)$ (resp. $\beta_+(\cdot)$) is the rate at which decreases (resp. increases) in the ask (resp. bid) price occur as a result of a limit sell (resp. buy) order placed within the spread.
We study the asymptotic behavior of $X(t)$ as $t$ goes to infinity. To facilitate this analysis, it is convenient to consider an equivalent process, denoted as $Y(t) = (P_b(t), S(t))$, with state space $\mathcal{Y} = \mathbb{Z} \times \mathbb{N}$. Although both $X(t)$ and $Y(t)$ contain the same information, the latter representation offers better control for our asymptotic examination. The transitions of the chain $Y(t)$ are defined by the following transition rates: given a state $(b, s)$ of the Markov chain, then
$$
\begin{aligned}
(b, s) &\to (b, s + \Delta) && \text{with rate } \alpha_+(\Delta),\\
(b, s) &\to (b, s - \Delta) && \text{with rate } \alpha_-(\Delta),\\
(b, s) &\to (b - \Delta, s + \Delta) && \text{with rate } \beta_-(\Delta),\\
(b, s) &\to (b + \Delta, s - \Delta) && \text{with rate } \beta_+(\Delta).
\end{aligned}
\tag{2}
$$
Since the transition rates of $Y(t)$ depend only on the second coordinate, the spread, we see that $S(t)$ is itself a continuous-time Markov process with the following transition rates. Suppose that at some moment the spread is $k \in \mathbb{N}$, then
$$
\begin{aligned}
k &\to k + \Delta && \text{with rate } \gamma_+(\Delta) := \alpha_+(\Delta) + \beta_-(\Delta),\\
k &\to k - \Delta && \text{with rate } \gamma_-(\Delta) := \alpha_-(\Delta) + \beta_+(\Delta).
\end{aligned}
\tag{3}
$$
Based on the model (1) and its alternative representation (2) and (3), it is possible to define three low-liquidity regimes: highly competitive, non-competitive, and low liquidity with gaps. Any regime is defined by how the rates depend on the increment Δ which is usually determined by the intensity of the liquidity fluctuations.
In this paper, our focus is primarily on the first regime, namely, the highly competitive regime, while the other two regimes are outlined briefly. The findings presented here can be generalized for the other two regimes, but we believe that there will be no significant qualitative difference in the results.

3. The Markov Model: Closing the Spread Uniformly (Highly Competitive Regime)

The highly competitive regime (HC regime) is characterized by very small opening steps of the spread and a rapid decrease in it. This regime is consistent with a rapid reversing process of the spread and the absence of gaps in the order book (OB). The rapid decrease in the spread is caused by the competitive behavior of impatient agents who place quotes within the spread, prioritizing the execution of their placed limit orders. In the considered model (1), we define the rates in such a way that the spread can increase by only one unit. When the spread closes from a given length $k$, its next length is chosen uniformly from the set $I_k = \{1, \ldots, k - 1\}$.
In order to define the rates for the highly competitive regime, we make use of our notation and fix the parameters $\alpha_+$, $\alpha_-$, $\beta_+$, and $\beta_-$, which are strictly positive real numbers. From here on, the terms $\alpha_\pm$ and $\beta_\pm$ are employed solely as parameters of the model and not as functions. The transition rates for the Markov chain $X(t)$ are defined as follows: given that the chain is in a state $(b, a) \in \mathcal{X}$ at a certain moment, then
$$
\begin{aligned}
\alpha_+(\Delta) &= \begin{cases} \alpha_+, & \text{if } \Delta = 1;\\ 0, & \text{otherwise}; \end{cases}\\
\alpha_-(\Delta) &= \begin{cases} \dfrac{\alpha_-}{a - b - 1}, & \text{if } a - b > 1 \text{ for any } \Delta \in I_{a-b};\\ 0, & \text{otherwise}; \end{cases}\\
\beta_-(\Delta) &= \begin{cases} \beta_-, & \text{if } \Delta = 1;\\ 0, & \text{otherwise}; \end{cases}\\
\beta_+(\Delta) &= \begin{cases} \dfrac{\beta_+}{a - b - 1}, & \text{if } a - b > 1 \text{ for any } \Delta \in I_{a-b};\\ 0, & \text{otherwise}. \end{cases}
\end{aligned}
\tag{4}
$$
For an illustration, see Figure 5 in the case when $a - b = 3$.
In this regime, the transition rates of $S(t)$, see (3), are the following: suppose that at some moment the spread is $k \in \mathbb{N}$, and let $\gamma_+ := \beta_- + \alpha_+$ and $\gamma_- := \beta_+ + \alpha_-$, then
$$
\begin{aligned}
k &\to k + 1 && \text{with rate } \gamma_+,\\
k &\to k - \Delta && \text{with rate } \frac{\gamma_-}{k - 1} \text{ for } \Delta \in I_k.
\end{aligned}
$$
Note that $S(t)$ is an irreducible Markov chain in this regime.
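The rates of the highly competitive regime can be turned into a short event-driven (Gillespie-style) simulation. The following is a minimal sketch under our own assumptions, not the implementation used for the figures: the function name `simulate_hc`, its arguments, the default initial quotes, and the seed are hypothetical, and prices are expressed in ticks.

```python
import random


def simulate_hc(alpha_p, alpha_m, beta_p, beta_m, t_max, b0=0, a0=1, seed=0):
    """Simulate (time, bid, ask) in ticks under the highly competitive regime.

    Hypothetical helper: alpha_p, alpha_m, beta_p, beta_m stand for the
    constant parameters alpha_+, alpha_-, beta_+, beta_- of the model.
    """
    rng = random.Random(seed)
    t, b, a = 0.0, b0, a0
    path = [(t, b, a)]
    while True:
        k = a - b  # current spread
        # Spread-closing moves are impossible when the spread is one tick.
        rates = [
            alpha_p,                      # ask moves up by one tick
            beta_m,                       # bid moves down by one tick
            alpha_m if k > 1 else 0.0,    # ask moves down inside the spread
            beta_p if k > 1 else 0.0,     # bid moves up inside the spread
        ]
        t += rng.expovariate(sum(rates))  # exponential holding time
        if t > t_max:
            break
        move = rng.choices(range(4), weights=rates)[0]
        if move == 0:
            a += 1
        elif move == 1:
            b -= 1
        else:
            # Each size in I_k has rate alpha_-/(k-1) or beta_+/(k-1),
            # so the jump size is uniform on {1, ..., k-1}.
            delta = rng.randint(1, k - 1)
            if move == 2:
                a -= delta
            else:
                b += delta
        path.append((t, b, a))
    return path
```

The spread along the simulated path is `a - b`; by construction it never opens by more than one tick at a time and never falls below one tick.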

3.1. Ergodicity and Invariant Measure for $S(t)$

We begin by analyzing the stability of the spread $S(t)$. The following theorem establishes ergodicity, and it is one of the rare instances in which we are able to determine the invariant measure of the process explicitly.
Theorem 1.
In a highly competitive regime model, for any positive values of the parameters $\alpha_+$, $\alpha_-$, $\beta_+$, $\beta_-$, the spread $S(t)$ is a positive recurrent Markov process with an invariant measure $\mu = (\mu(k), k \in \mathbb{N})$ given by the following formula: let $\gamma := \gamma_+ + \gamma_-$, then
$$
\mu(k) = \frac{k!\,(\gamma_+)^{k-1} \prod_{i=1}^{k-1} (\gamma_- + i\gamma)^{-1}}{1 + \sum_{j \ge 2} j!\,(\gamma_+)^{j-1} \prod_{i=1}^{j-1} (\gamma_- + i\gamma)^{-1}}.
$$
These findings regarding the stationary asymptotic behavior of the spread process $S(t)$ are in line with the empirical observations illustrated in Figure 4.
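The invariant measure of Theorem 1 is easy to evaluate numerically, because consecutive unnormalised weights differ by the factor $k\gamma_+ / (\gamma_- + (k-1)\gamma)$. The sketch below is illustrative only: the function name `invariant_measure` and the truncation level `k_max` are our own assumptions, and truncating the series is an approximation that relies on the geometric decay of the weights.

```python
def invariant_measure(gamma_p, gamma_m, k_max=2000):
    """Return [mu(1), ..., mu(k_max)] for the spread in the HC regime.

    Hypothetical helper: gamma_p and gamma_m stand for gamma_+ and gamma_-.
    The infinite normalising sum is truncated at k_max (an approximation).
    """
    gamma = gamma_p + gamma_m
    weights = [1.0]  # unnormalised weight of k = 1
    for k in range(2, k_max + 1):
        # w(k) / w(k-1) = k * gamma_+ / (gamma_- + (k-1) * gamma)
        ratio = k * gamma_p / (gamma_m + (k - 1) * gamma)
        weights.append(weights[-1] * ratio)
    total = sum(weights)
    return [w / total for w in weights]  # index 0 holds mu(1)
```

A quick way to gain confidence in the output is to check the balance equations of the spread chain, for example that the probability flow out of state 1, $\mu(1)\gamma_+$, matches the flow into it, $\gamma_- \sum_{j \ge 2} \mu(j)/(j-1)$.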
We conclude this section with the following observation: the process $S(t)$ falls within the category of processes referred to as population processes with uniform catastrophes. An extension to processes with almost uniform catastrophes (as defined in Section 5.1) was explored in [15]. In that work, the following result was established for the maximum of the process: for any fixed $b \in (0, 1)$
$$
\mathbf{P}\left( \lim_{T \to \infty} \sup_{t \in [0, 1]} \frac{S(tT)}{T^{b}} > \varepsilon \right) = 0.
$$

3.2. Local Drift (LLN for the Prices)

The following theorem addresses the law of large numbers (LLN) as applied to prices. This theorem will illuminate the local trends (local drift) exhibited by the prices.
Theorem 2.
With probability one, the bid price scaled by time converges to a constant:
$$
\frac{P_b(t)}{t} \to D \quad \text{a.s. as } t \to \infty,
$$
where
$$
D = -\beta_- - \mu(1)\left( \frac{\gamma_- \beta_-}{\gamma \gamma_+} + \frac{\beta_+}{2} \right) + \frac{\beta_+ \gamma}{2} \sum_{k \ge 1} k\,\mu(k).
$$
This result validates our conjecture regarding the impact of the spread on the local trend of prices. From a practical standpoint, given the jump rates of the bid and ask prices, we can easily compute the price trend.
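As a numerical cross-check of the local drift, one can average the instantaneous bid drift over the invariant measure of the spread: in a state with spread $k \ge 2$ the bid moves down at rate $\beta_-$ and up by a uniform size on $I_k$ (mean $k/2$) at total rate $\beta_+$, while for $k = 1$ only the downward move is possible. The sketch below follows this ergodic-average argument, which is our own assumption and not the closed form stated in Theorem 2; the function name `stationary_bid_drift` and the truncation level `k_max` are hypothetical.

```python
def stationary_bid_drift(alpha_p, alpha_m, beta_p, beta_m, k_max=2000):
    """Approximate the long-run bid drift (ticks per unit time).

    Hypothetical helper: averages the per-state bid drift
    -beta_- + beta_+ * k / 2 (for k >= 2) over the invariant measure of
    the spread, with the normalising sum truncated at k_max.
    """
    gamma_p = alpha_p + beta_m   # gamma_+
    gamma_m = alpha_m + beta_p   # gamma_-
    gamma = gamma_p + gamma_m
    weights = [1.0]              # unnormalised mu(1)
    for k in range(2, k_max + 1):
        weights.append(weights[-1] * k * gamma_p / (gamma_m + (k - 1) * gamma))
    total = sum(weights)
    # sum over k >= 2 of k * mu(k); weights[0] corresponds to k = 1
    mean_k_open = sum(k * w for k, w in enumerate(weights, start=1) if k >= 2) / total
    return -beta_m + 0.5 * beta_p * mean_k_open
```

Under the additional assumption that the stationary mean spread is finite, the spread has zero net drift, which gives $\sum_{k \ge 2} k\mu(k) = 2\gamma_+/\gamma_-$; the value returned above should then agree with $(\alpha_+\beta_+ - \alpha_-\beta_-)/\gamma_-$ up to truncation error.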

3.3. Price Volatility (CLT for the Prices)

In this section, our focus is on examining the connection between price volatility and price jump rates. Specifically, we establish a central limit theorem (CLT) for the price process. We articulate the volatility of price fluctuations around local drift in relation to the jump rates of the ask and bid prices. In other words, the central limit theorem holds for the process depicted by (A3).
Once more, we begin by demonstrating the central limit theorem (CLT) for the embedded discrete-time dynamics $p_n$ of the price (Lemma 1), followed by establishing the CLT for the continuous-time chain $P_b(t)$ (Theorem 3). The proofs are included in Appendix A.
Lemma 1.
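Let $n \to \infty$, then
$$
\frac{p_n - n v}{\sqrt{n \operatorname{Var}_{\hat{\pi}} F(\hat{s}_n)}} \to N(0, 1)
$$
in distribution, where
$$
\operatorname{Var}_{\hat{\pi}} F(\hat{s}_n) = \frac{\beta_-}{\gamma} + \mu(1)\left( \frac{\gamma_- \beta_-}{\gamma \gamma_+} - \frac{\beta_+}{6} \right) + \frac{\beta_+}{\gamma}\,\frac{1}{6} \sum_{s=1}^{\infty} s(2s - 1)\,\pi(s) - \left( \mathbf{E}_{\hat{\pi}} F(\hat{s}_n) \right)^2.
$$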
Lemma 1 establishes a connection between the “coarse-grained” volatility of intraday returns at lower frequencies and the high-frequency jump rates of prices. In simpler terms, it asserts that prices exhibit a diffusive behavior around a local drift over time, with a diffusion coefficient of $\operatorname{Var}_{\hat{\pi}} F(\hat{s}_n)$. Consequently, price volatility, as determined by the number of micro-jumps in prices, is given by
$$
\sigma_n = \sqrt{n \operatorname{Var}_{\hat{\pi}} F(\hat{s}_n)}.
\tag{8}
$$
Here, $n$ represents the total count of high-frequency price jumps. Equation (8) presents a means to estimate price volatility without requiring long-term price observations. Optionally, the parameter $\sigma_n$ can be interpreted as the intraday realized volatility of the asset. Thus, relation (8) establishes a link between the realized volatility and the high-frequency parameters of the order book.
Based on Lemma 1, we established Theorem 3; see the proof in Appendix A.4. Note that the proof of the law of large numbers provides the following representation for the local drift $D$ in continuous time: $D = v\gamma$, where $v$ is the local drift for the embedded chain provided by Lemma A1 (see Appendix A.2).
Theorem 3.
Let $t \to \infty$, then there exists $\sigma^2 > 0$ such that
$$
\sqrt{t}\left( \frac{P_b(t)}{t} - D \right) \to N\left( 0, (\sigma^2 + v^2)\gamma \right),
$$
in distribution.

3.4. Large Deviations for the Spread $S(t)$

It is known that in the context of liquidity fluctuations, even a small order can trigger a substantial price change, thereby leading to a significant increase in the spread ([3,13]). Consequently, our interest lies in comprehending the mechanisms behind substantial spread changes without altering the model’s parameters. We believe that such analyses can contribute to evaluating the order book’s resilience against severe liquidity fluctuations.
In this section, we present an application of the large deviations theory to the Markov process that describes the dynamics of the spread. Specifically, we investigate the asymptotics of large deviations for the spread process. Our goal is to identify the most probable trajectory associated with a specific state of the spread, particularly when it becomes very large, over a given time interval.
The topic of large deviations for Poisson processes with uniform (or almost uniform) catastrophes has been explored in [15,16]. Large deviation analysis serves as a culminating step within a sequence of limit theorems for such processes. While the theory of large deviations is well-developed, the processes examined here do not satisfy the “classical” conditions. Consequently, the proof of large deviations remains quite technical.
In order to state the large deviation results, we need a scaling parameter that tends to infinity. Let $T$ be the length of the time interval $[0, T]$ over which we observe our process. We consider the following scaled process:
$$
S_T(t) = \frac{S(tT)}{T}, \quad t \in [0, 1].
$$
We say that the family of random variables $S_T(1)$ satisfies the large deviation principle (LDP) on $\mathbb{R}$ with the rate function $I = I(x) : \mathbb{R} \to [0, \infty]$ if for any $c \ge 0$ the set $\{x \in \mathbb{R} : I(x) \le c\}$ is compact and for any set $B \in \mathcal{B}(\mathbb{R})$ the following inequalities hold:
$$
\limsup_{T \to \infty} \frac{1}{T} \ln \mathbf{P}(S_T(1) \in B) \le -\inf_{x \in [B]} I(x) \quad \text{and} \quad \liminf_{T \to \infty} \frac{1}{T} \ln \mathbf{P}(S_T(1) \in B) \ge -\inf_{x \in (B)} I(x),
$$
where $\mathcal{B}(\mathbb{R})$ is the Borel $\sigma$-algebra on $\mathbb{R}$ and $[B]$, $(B)$ are the closure and the open interior of the set $B$, respectively. This principle was established in [16], in which the logarithmic asymptotic for the probability $\mathbf{P}(S_T(1) > x)$ was calculated. Note that the principle was proved for the state $x$ of the spread at the time $T$; it is not a principle on the functional space. A principle on the functional (trajectory) space makes it possible to find the (unique) optimal trajectory, that is, the trajectory which shows how such a deviation (a rare event) occurs, taking into account the evolution of the spread.
An initial approach to proving the principle in the functional space is to establish the local large deviation principle, which involves studying the asymptotic behavior of the probability that the process remains within a small neighborhood of a given continuous function. We say that the family of processes $S_T(\cdot)$ satisfies the local large deviation principle (LLDP) on the set $G \subset D[0, 1]$ with rate function $I = I(f) : D[0, 1] \to [0, \infty]$ if for any function $f \in G$ the following equalities hold:
$$
\lim_{\varepsilon \to 0} \limsup_{T \to \infty} \frac{1}{T} \ln \mathbf{P}(S_T(\cdot) \in U_\varepsilon(f)) = \lim_{\varepsilon \to 0} \liminf_{T \to \infty} \frac{1}{T} \ln \mathbf{P}(S_T(\cdot) \in U_\varepsilon(f)) = -I(f),
$$
where $D[0, 1]$ is the space of càdlàg functions, i.e., the functions that are continuous from the right and have a limit from the left, and where $U_\varepsilon(f) := \{ g \in D[0, 1] : \sup_{t \in [0, 1]} |f(t) - g(t)| \le \varepsilon \}$.
The LLDP was proved in [15] for the compound Poisson process with almost uniform catastrophes. We note only that the process $S(\cdot)$ is a special case of the processes considered in [15]. Let $G$ be a set of absolutely continuous functions that are positive on the interval $(0, 1]$. In order to write the corresponding rate function, recall that any function $f$ of finite variation can be uniquely represented as a difference of two non-decreasing functions $f^+$ and $f^-$ such that $\operatorname{Var} f_{[0,1]} = \operatorname{Var} f^+_{[0,1]} + \operatorname{Var} f^-_{[0,1]}$. The functions $f^+$ and $f^-$ are called the positive and negative variations of the function $f$, respectively. Now, the rate function for $S_T(\cdot)$ can be represented for $f \in G$ as follows:
$$
I(f) = \gamma_- + \int_0^1 \left( \dot{f}^+(t) \ln \frac{\dot{f}^+(t)}{\gamma_+} - \gamma_+ \left( \frac{\dot{f}^+(t)}{\gamma_+} - 1 \right) \right) \mathbf{I}\left( \dot{f}^+(t) > \gamma_+ \right) dt,
\tag{9}
$$
where $\dot{f}$ stands for the derivative of the function $f$ and $\mathbf{I}$ is the indicator function.
We note that the large deviation principle and the local large deviation principle have the same normalization factor for the probabilities, $1/T$. This ensures the existence of an optimal trajectory for the large deviations. The existence of the optimal trajectories for the large deviations $S_T(1) > x$ was established in [16]. If $x < \gamma_+$, then there exists a moment $t_x = 1 - \frac{x}{\gamma_+} \in (0, 1)$ such that the spread process $S_T(\cdot)$ stays near zero up to the time $t_x$, and after that $S_T(t)$, $t \ge t_x$, increases along the straight line that starts at the point $(t_x, 0)$ and reaches the point $(1, x)$ with slope $\gamma_+$; see function $f_2$ in Figure 6A. If $x \ge \gamma_+$, then the process grows along the straight line from the origin up to the point $(1, x)$, i.e., its slope is $x$; see function $f_1$ in Figure 6A. For comparison, Figure 6 shows the optimal trajectories that produce large fluctuations for the Poisson process with rate $\gamma_+$ and for the process $S_T$, that is, the Poisson process (of rate $\gamma_+$) with uniform catastrophes (of rate $\gamma_-$).
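The two cases of the optimal trajectory described above can be written as a single piecewise-linear function of the scaled time. The following sketch is illustrative: the function name `optimal_spread_path` and its signature are our own assumptions, and it only encodes the qualitative description given in the text.

```python
def optimal_spread_path(x, gamma_p, t):
    """Value at scaled time t in [0, 1] of the optimal path ending at x.

    Hypothetical helper: gamma_p stands for gamma_+, and x > 0 is the
    target value of the scaled spread S_T(1).
    """
    if x < gamma_p:
        # Stay near zero until t_x, then grow with slope gamma_+.
        t_x = 1.0 - x / gamma_p
        return max(0.0, gamma_p * (t - t_x))
    # For x >= gamma_+, grow from the origin with slope x.
    return x * t
```

For example, with `gamma_p = 2.0` and `x = 1.0` the path stays at zero until $t_x = 0.5$ and then rises with slope 2, whereas with `x = 3.0` it rises from the origin with slope 3.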

3.5. Large Deviations for the Prices $(P_b(t), P_a(t))$

The large deviation result for the spread raises the question of how prices behave under a large spread. The rate function corresponding to the large deviation is essentially the rate function of a Poisson process with rate $\gamma_+$, which consists of the rates $\alpha_+$ and $\beta_-$. Here, we describe the qualitative behavior of optimal price trajectories without proof. The qualitative picture is represented in Figure 7.
The main difference between the behavior of the optimal trajectories of Poisson processes and our process lies in the inclusion of the indicator function within the rate function, as seen in (9). This indicator function imposes a constraint on the possible values of the line slope: it cannot be lower than the rate of the Poisson process. Consequently, when the scaled spread is less than $\gamma_+$, a “bifurcation” point $t_x$ emerges. After this point, the upper line has a slope of $\alpha_+$ and the lower line has a slope of $\beta_+$. As the scaled spread surpasses $\gamma_+$, the slopes change, but the relationship between the contributions of rates $\alpha_+$ and $\beta_+$ remains constant.

4. Numerical Simulations and Applications

This section provides a detailed description of the data and the empirical facts relevant to them. We also present computational simulations used to calibrate the model parameters, using the data to validate some qualitative model outcomes. Additionally, we demonstrate a practical application that confirms our model’s short-term predictive capabilities.

4.1. HFT Data

The dataset consists of NASDAQ high-frequency trading (HFT) data for Apple Inc., collected through the Bloomberg stock trading platform. High-frequency data are collected within the day (intraday), and recorded tick by tick. The dataset covers 12 h of trading activity each day, specifically, the entire trading population for 3 and 4 March 2011. During the first 12 h of market operation, the order book quotes, measured by the frequency of price jumps, remain stable for periods ranging from 180 to 540 min after the market opens, as shown in Figure 8. Consequently, for both trading days, we will analyze data from within these time intervals. It is noteworthy that during these intervals there are approximately 305 thousand price jumps per day, a characteristic that typically persists on a daily basis.

4.2. Empirical and Qualitative Facts

In this section, based on the available data presented above, we present some empirical and qualitative characteristics that are typically observed in high-frequency trading markets. The objective of this section is to corroborate whether the assumptions underlying our model align with these recurring qualitative features commonly found in most markets.
As previously mentioned, our model’s key assumption is that in conditions of low liquidity, characterized by a high intensity of price jumps, the sizes of the bid and ask orders diminish in significance as the primary factors influencing the price dynamics. In contrast, under these conditions, the bid–ask spread becomes the determining factor in predicting price dynamics.
In order to empirically validate this assumption, we label each observed price variation as either up (an upward move) or down (a downward move). We then use these binary labels as the target variable and classify the jumps using the bid–ask spread and the sizes of the bid and ask orders as predictors.
The idea is that the greater the predictive power of a variable, the greater its influence on price dynamics. To measure the predictive power of order sizes, we consider the imbalance, defined as the bid size divided by the sum of the bid and ask sizes.
There are various dissimilarity metrics, known as divergence measures in information theory, that we can use to assess the predictive power of the bid–ask spread and the imbalance. Due to its simplicity and wide use in classification problems, we will employ the Jeffreys divergence, also known as the information value (IV); see, for example, reference [17].
Based on the information value (IV) (see Figure 9) we corroborate that the bid–ask spread has a persistent and relevant influence on the price dynamics. On the other hand, the imbalance, and consequently the size of the orders, has an influence that disappears quickly over time. Therefore, our main assumption of the model, the prevailing influence of the spread on price dynamics, is empirically validated in our data. It is worth noting that the influence of the bid–ask spread fluctuates, and our model implicitly accounts for this empirical observation.
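The information value of a binned predictor is the Jeffreys divergence between the distribution of the predictor among up jumps and among down jumps, $\sum_i (p_i - q_i) \ln(p_i / q_i)$. The sketch below illustrates this computation under our own assumptions: the function name `information_value`, the additive smoothing constant `eps` (introduced only to avoid the logarithm of zero for empty bins), and the binning itself are hypothetical and are not taken from our data processing.

```python
import math
from collections import Counter


def information_value(bins, labels, eps=0.5):
    """Jeffreys divergence (information value) of a binned predictor.

    Hypothetical helper: bins[i] is the bin of observation i (e.g. the
    spread in ticks or a quantile bucket of the imbalance), labels[i] is
    1 for an up jump and 0 for a down jump.
    """
    up = Counter(b for b, y in zip(bins, labels) if y == 1)
    down = Counter(b for b, y in zip(bins, labels) if y == 0)
    keys = set(up) | set(down)
    # Smoothed totals so that the smoothed shares still sum to one.
    n_up = sum(up.values()) + eps * len(keys)
    n_down = sum(down.values()) + eps * len(keys)
    iv = 0.0
    for key in keys:
        p = (up[key] + eps) / n_up      # share of up jumps in this bin
        q = (down[key] + eps) / n_down  # share of down jumps in this bin
        iv += (p - q) * math.log(p / q)
    return iv
```

Each term of the sum is non-negative, so the value is zero only when the two binned distributions coincide and grows as the predictor separates up jumps from down jumps more sharply.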

4.3. Parameter Estimation and Monte Carlo Experiments

In this section, we explore the steady-state properties of our proposed model using Monte Carlo simulations. We compare the empirically observed long-term behavior (unconditional properties) of the OB to simulations of the fitted model. The goal of these simulations is to indicate how well the model reproduces the average properties of the OB. The transition rates of $X(t)$ can be estimated by
$$
\hat{\alpha}_+ = \frac{N_{\alpha_+}}{T}, \quad \hat{\alpha}_- = \frac{N_{\alpha_-}}{T}, \quad \hat{\beta}_+ = \frac{N_{\beta_+}}{T}, \quad \hat{\beta}_- = \frac{N_{\beta_-}}{T},
\tag{10}
$$
where $T$ is the length of our sample (in seconds), $N_{\alpha_+}$ ($N_{\alpha_-}$) is the total number of jumps where the ask price increases (decreases), and $N_{\beta_+}$ ($N_{\beta_-}$) is the total number of jumps where the bid price increases (decreases).
From the Apple stock data, since our model (4) only allows spread openings of one tick, we selected approximately 15 continuous minutes of trading for which spread openings only occurred in one tick, which corresponds to the interval of 365 to 380 min after the market opens. In this sub-sample, using (10), we obtain the following: $T = 900$ s, $\hat{\alpha}_+ = 3.756$, $\hat{\alpha}_- = 0.765$, $\hat{\beta}_+ = 0.848$, and $\hat{\beta}_- = 4.907$. Based on the estimated parameters $(\hat{\alpha}_+, \hat{\alpha}_-, \hat{\beta}_+, \hat{\beta}_-)$, we simulate the price process $X(t)$ over a long horizon of 900 s, which corresponds to what was empirically observed, and observe the evolution of prices in two time windows. The results are displayed in Figure 10. The results of our simulations illustrate that our model reproduces realistic characteristics for both the short- and long-term price behavior, which were presented for the empirical data in Figure 2.
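The estimators in (10) amount to counting the four types of quote jumps and dividing by the sample length. A minimal sketch is given below, assuming the quotes are available as a time-ordered list of (time in seconds, bid, ask) records; this input layout, the function name `estimate_rates`, and the dictionary keys are our own hypothetical choices.

```python
def estimate_rates(quotes):
    """Estimate (alpha_+, alpha_-, beta_+, beta_-) as jump counts over T.

    Hypothetical helper: quotes is a list of (time_in_seconds, bid, ask)
    tuples sorted by time, with at least two records.
    """
    counts = {"alpha_plus": 0, "alpha_minus": 0, "beta_plus": 0, "beta_minus": 0}
    for (_, b0, a0), (_, b1, a1) in zip(quotes, quotes[1:]):
        if a1 > a0:
            counts["alpha_plus"] += 1    # ask price increased
        elif a1 < a0:
            counts["alpha_minus"] += 1   # ask price decreased
        if b1 > b0:
            counts["beta_plus"] += 1     # bid price increased
        elif b1 < b0:
            counts["beta_minus"] += 1    # bid price decreased
    sample_length = quotes[-1][0] - quotes[0][0]  # T in seconds
    return {name: n / sample_length for name, n in counts.items()}
```

For instance, a toy record in which each of the four jump types occurs once over four seconds yields an estimate of 0.25 jumps per second for every rate.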
The simulation results demonstrate that our model also accurately captures realistic characteristics of the (steady-state) average behavior of the order book (OB) profile. Notably, the model successfully replicates the negative autocorrelation of price changes at the first lag. Empirical observations indicate that there is a pronounced negative autocorrelation at the first lag in the autocorrelation function of the transaction price returns. This negative autocorrelation is significant at the first lag and rapidly diminishes thereafter, as depicted in Figure 11.
This phenomenon is commonly referred to as the bid–ask bounce [12], largely arising from having distinct trading prices for buyer-initiated and seller-initiated transactions. While this negative autocorrelation vanishes when considering aggregate returns, it is a noteworthy microstructural effect that must be considered in an order book model. Our model successfully replicates this empirical characteristic. Therefore, we conclude that we have sufficient evidence to argue that our model reproduces qualitative characteristics that are realistic and relevant.
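The bounce effect can be illustrated with a toy example in the spirit of the setup in [12]; this is an assumption made for illustration and is not the order book model of this paper. The mid-price is held fixed, and each trade occurs at the bid or at the ask with equal probability. The resulting price changes then have a theoretical lag-1 autocorrelation of $-1/2$ and zero autocorrelation at higher lags.

```python
import random


def autocorr(x, lag=1):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var


def bounce_returns(n, half_spread=0.5, seed=0):
    """Price changes when trades hit bid or ask at random around a fixed mid."""
    rng = random.Random(seed)
    prices = [half_spread if rng.random() < 0.5 else -half_spread
              for _ in range(n + 1)]
    return [prices[i + 1] - prices[i] for i in range(n)]
```

With a long sample, `autocorr(bounce_returns(20000), lag=1)` is close to $-0.5$, while the value at lag 2 is close to zero.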

4.4. Application: Prediction of Next Price Jump Direction

In this section, we present a direct application of our model: the short-term forecast of price movement. We use the proposed model to calculate the probability that the price will increase at the next jump, conditional on the observed state of the OB. This quantity is particularly important in financial trading, as it is used in the design of high-frequency trading strategies. From the transition rates of the price process $X(t)$ in (4), the probability that the price will increase at the next jump, conditional on the observed state of the OB, is given by
$$p(b,a) := P\big(\Delta P > 0 \mid X(t) = (b,a)\big) = \begin{cases} \dfrac{\alpha_+}{\alpha_+ + \beta_-}, & \text{if } a - b = 1, \\[2mm] \dfrac{\alpha_+ + \beta_+}{\gamma_+ + \gamma_-}, & \text{if } a - b > 1, \end{cases} \tag{11}$$
where $\Delta P$ is the change in the mid-price. To increase the precision of the forecasts, we suggest calculating the theoretical probabilities $p(b,a)$ taking into account the different fixed values of the imbalance and the spread, that is,
$$\text{Imbalance} = \frac{V(a)}{V(a) + V(b)} \quad \text{and} \quad \text{Spread} = a - b,$$
where $V(a)$ is the number of ask orders and $V(b)$ is the number of bid orders. In other words, we will use (11) for buckets or bins generated by combinations of bid–ask spread and imbalance values.
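Formula (11) translates directly into code. The sketch below assumes $\gamma_+ = \alpha_+ + \beta_-$ and $\gamma_- = \alpha_- + \beta_+$, as used elsewhere in the paper; the function name is hypothetical, and the binning by imbalance is not included.

```python
def prob_up(b, a, alpha_plus, alpha_minus, beta_plus, beta_minus):
    """Probability that the next jump raises the mid-price, formula (11)."""
    spread = a - b
    if spread < 1:
        raise ValueError("the ask must exceed the bid by at least one tick")
    if spread == 1:
        # Only an ask increase or a bid decrease is possible.
        return alpha_plus / (alpha_plus + beta_minus)
    gamma_plus = alpha_plus + beta_minus    # total opening rate
    gamma_minus = alpha_minus + beta_plus   # total closing rate
    # The mid-price rises when the ask goes up or the bid goes up.
    return (alpha_plus + beta_plus) / (gamma_plus + gamma_minus)
```

With the rates estimated in Section 4.3, `prob_up(0, 1, 3.756, 0.765, 0.848, 4.907)` gives the one-tick case and `prob_up(0, 3, ...)` the wider-spread case.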
As in the calculation of the IV, to compare the theoretical quantities $p(b,a)$ with the data, the observed variations in the mid-price were classified into two categories: variations in which the mid-price increased were categorized as up variations, and downward variations were categorized as down variations. Once the variations in mid-prices were dichotomized, the total set of 305 thousand observations was divided into two sub-samples.
The first sub-sample, which we call the training sample, corresponded to 70% of the total observations and was used to estimate the model parameters. The second sub-sample, called the test sample, corresponding to 30% of the total observations, was used to validate the performance of the model forecasts.
With the estimated parameters, we calculate the empirical probabilities $\hat{p}(b,a)$ using the test sample. Additionally, using (11), we calculate the theoretical predicted probabilities $p(b,a)$ for the same dataset. For comparison, Figure 12 presents both quantities. The figure confirms the good precision of the model in predicting variations in the mid-price. Furthermore, to confirm the precision of our forecasts numerically, we present in Figure 13 the ROC curve, which corroborates, with a score of 96%, the predictive power of the model for up variations in the mid-price.

5. Discussion: Other Regimes

In this section, we outline the formulations for the remaining two regimes that can be encompassed within our general model. As previously indicated, the primary findings of this article have the potential for generalization to these alternative regimes. However, our belief is that, in qualitative terms, there will be minimal disparities between the outcomes. We anticipate that these subsequent formulations will serve as incentives for future research endeavors aimed at extending and broadening the scope of our results.

5.1. Almost Uniform Catastrophes

As we mentioned before, the large deviations were proved for the so-called almost uniform catastrophes. Recall that, in order to close a spread of length $k$ (with probability $\gamma_-/\gamma$, where $\gamma = \gamma_+ + \gamma_-$), we choose the next state of the spread uniformly from the set $I_k = \{1, \dots, k-1\}$. Denote these probabilities by $Q_i(k)$, $i \in I_k$; here $Q_i(k) = \frac{1}{k-1}$. The almost uniform distribution is defined by the following condition on the probabilities $Q_i(k)$, $1 \le i \le k-1$: there exists a constant $c > 1$ such that for all $k \in \mathbb{N}$
$$\frac{1}{c(k-1)} \le Q_i(k) \le \frac{c}{k-1},$$
for all $i \in I_k$. This extends the class of models for highly competitive regimes. For example, for any length $k$ of the spread, the set of possible states can be divided into parts, say two; with probability $0.7$ the first part is chosen and the new state is drawn uniformly from it, and with probability $0.3$ the second part is chosen and the new state is drawn uniformly from it.
All the proofs above carry over to this case with slight modifications.
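The two-part example can be checked numerically. The sketch below assumes a specific split, which the text does not fix: the first part is the lower half of $I_k$ (rounded up) and the second part is the rest. Under this assumption the almost uniform bound holds with $c = 5/3$. The function names are hypothetical.

```python
def two_part_closing_law(k, w_first=0.7):
    """Probabilities Q_i(k), i = 1..k-1, for the two-part example."""
    n = k - 1
    if n < 1:
        raise ValueError("the spread must be at least 2 to close")
    if n == 1:
        return [1.0]                 # only one possible target state
    m1 = (n + 1) // 2                # size of the first part (assumed split)
    m2 = n - m1                      # size of the second part
    return [w_first / m1] * m1 + [(1 - w_first) / m2] * m2


def almost_uniform_constant(k_max, w_first=0.7):
    """Smallest c with 1/(c(k-1)) <= Q_i(k) <= c/(k-1) for 2 <= k <= k_max."""
    c = 1.0
    for k in range(2, k_max + 1):
        n = k - 1
        for q in two_part_closing_law(k, w_first):
            c = max(c, n * q, 1.0 / (n * q))
    return c
```

The constant is driven by the second part, whose states receive probability about $0.6/(k-1)$ when $k-1$ is even.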

5.2. Non-Competitive Regime

The main feature of the non-competitive regime (NC regime) is small openings of the spread, as in the HC regime. However, in the NC regime, the spread decreases slowly (following a power law). This slow decrease occurs because agents placing limit orders within the spread prioritize achieving an optimal price in their quotes; compared to the HC regime, the agents are less impatient. The spread opens by one tick at some constant rate. For the closing of the spread, let $k > 1$ be the spread size; the variation in the prices $\Delta$ is chosen from $I_k = \{1, \dots, k-1\}$ at a rate proportional to $\Delta^{-\mu}$, where $\mu$ is a fixed positive number. The parameter $\mu$ can be interpreted as a behavioral measure for agents to obtain a more favorable price in their negotiations.

Model: Closing the Spread Polynomially

In order to define the rates for the NC regime, let us again fix parameters $\alpha_+^c, \alpha_-^c, \beta_+^c, \beta_-^c$, which are strictly positive real numbers. Suppose that at some moment the chain is at some state $(b,a) \in \mathcal{X}$; then the transition rates for the Markov chain $X(t)$ in this regime are defined in the following way:
$$\begin{aligned}
\alpha_+(\Delta) &= \begin{cases} \alpha_+^c, & \text{if } \Delta = 1; \\ 0, & \text{otherwise}; \end{cases}
&\alpha_-(\Delta) &= \begin{cases} \alpha_-^c \Delta^{-\mu}, & \text{if } a - b > 1, \text{ for any } \Delta \in I_{a-b}; \\ 0, & \text{otherwise}; \end{cases} \\
\beta_-(\Delta) &= \begin{cases} \beta_-^c, & \text{if } \Delta = 1; \\ 0, & \text{otherwise}; \end{cases}
&\beta_+(\Delta) &= \begin{cases} \beta_+^c \Delta^{-\mu}, & \text{if } a - b > 1, \text{ for any } \Delta \in I_{a-b}; \\ 0, & \text{otherwise}. \end{cases}
\end{aligned}$$
For illustration, see Figure 14 for the case $a - b = 3$.
As before, we first study the Markov chain $S(t)$. Suppose that at some moment $t$ the chain is at some state $k \in \mathbb{N}$, and let $\gamma_+^c := \beta_-^c + \alpha_+^c$ and $\gamma_-^c := \beta_+^c + \alpha_-^c$; then
$$k \to k + 1 \ \text{ with rate } \gamma_+^c, \qquad k \to k - \Delta \ \text{ with rate } \gamma_-^c \Delta^{-\mu} \ \text{ for } \Delta \in I_k.$$
These transitions suggest that the spread reverts slowly to its typical values, because each liquidity provider competes with the others to close the spread.
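The spread transitions of the NC regime can be tabulated as follows. This sketch assumes that the closing rate for a move of size $\Delta$ decays as a power law with exponent $\mu > 0$, that is, the rate is the closing constant times $\Delta$ raised to the power $-\mu$, consistent with the slow power-law closing described in the text. The function names are hypothetical.

```python
def nc_spread_rates(k, gamma_plus_c, gamma_minus_c, mu):
    """Map each target spread to its transition rate from spread k."""
    rates = {k + 1: gamma_plus_c}                # opening by one tick
    for delta in range(1, k):                    # delta ranges over I_k
        # Assumed power-law decay in the size of the closing move.
        rates[k - delta] = gamma_minus_c * delta ** (-mu)
    return rates


def nc_drift(k, gamma_plus_c, gamma_minus_c, mu):
    """Expected change of the spread per unit time at spread k."""
    rates = nc_spread_rates(k, gamma_plus_c, gamma_minus_c, mu)
    return sum((target - k) * rate for target, rate in rates.items())
```

For instance, `nc_spread_rates(3, 1.0, 2.0, 1.0)` assigns rate 1.0 to the move to 4, rate 2.0 to the move to 2, and rate 1.0 to the move to 1.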

5.3. Low Liquidity with Gaps Regime

The main feature of the low liquidity with gaps regime (LLG regime) is that the spread can open by more than one tick because of gaps in the OB. The spread decreases similarly to the NC regime.

Model

Let us fix parameters $\alpha_+^l, \alpha_-^l, \beta_+^l, \beta_-^l, \kappa_a, \kappa_b$, which are strictly positive real numbers, and $\theta \in (0,1)$. Suppose that at some moment the chain is at some state $(b,a) \in \mathcal{X}$; then the transition rates for the Markov chain $X(t)$ in this regime are defined as follows:
$$\begin{aligned}
\alpha_+(\Delta) &= \begin{cases} \alpha_+^l (a-b)^{\kappa_a} \cdot \theta (1-\theta)^{\Delta-1}, & \text{for } \Delta \in \mathbb{N}; \\ 0, & \text{otherwise}; \end{cases} \\
\alpha_-(\Delta) &= \begin{cases} \alpha_-^l (a-b)^{\kappa_a} \cdot \dfrac{\theta (1-\theta)^{\Delta-1}}{1 - (1-\theta)^{a-b-1}}, & \text{if } a - b > 1, \text{ for } \Delta \in I_{a-b}; \\ 0, & \text{otherwise}; \end{cases} \\
\beta_-(\Delta) &= \begin{cases} \beta_-^l (a-b)^{\kappa_b} \cdot \theta (1-\theta)^{\Delta-1}, & \text{for } \Delta \in \mathbb{N}; \\ 0, & \text{otherwise}; \end{cases} \\
\beta_+(\Delta) &= \begin{cases} \beta_+^l (a-b)^{\kappa_b} \cdot \dfrac{\theta (1-\theta)^{\Delta-1}}{1 - (1-\theta)^{a-b-1}}, & \text{if } a - b > 1, \text{ for } \Delta \in I_{a-b}; \\ 0, & \text{otherwise}. \end{cases}
\end{aligned}$$
For illustration, see Figure 15 for the case $a - b = 3$.
Informally, this Markov chain can be described as follows. For a given state $(b,a)$:
  • with the rate $\alpha_+^l (a-b)^{\kappa_a}$, the chain decides to increase the ask price, and it chooses the increment according to the geometric distribution with parameter $\theta$;
  • with the rate $\alpha_-^l (a-b)^{\kappa_a}$, the chain decides to decrease the ask price, and it chooses the increment according to the truncated geometric distribution with parameter $\theta$ and values in $I_{a-b} = \{1, \dots, a-b-1\}$;
  • with the rate $\beta_-^l (a-b)^{\kappa_b}$, the chain decides to decrease the bid price, and it chooses the increment according to the geometric distribution with parameter $\theta$;
  • with the rate $\beta_+^l (a-b)^{\kappa_b}$, the chain decides to increase the bid price, and it chooses the increment according to the truncated geometric distribution with parameter $\theta$ and values in $I_{a-b} = \{1, \dots, a-b-1\}$.
Following the same approach as before, we begin by studying the Markov chain $S(t)$. Suppose that at time $t$ the chain is at state $k \in \mathbb{N}$; then the following transitions occur:
$$\begin{aligned}
k \to k + \Delta \ &\text{ with rate } \big(\alpha_+^l k^{\kappa_a} + \beta_-^l k^{\kappa_b}\big) \cdot \theta (1-\theta)^{\Delta-1}, && \text{for } \Delta \in \mathbb{N}, \\
k \to k - \Delta \ &\text{ with rate } \big(\alpha_-^l k^{\kappa_a} + \beta_+^l k^{\kappa_b}\big) \cdot \dfrac{\theta (1-\theta)^{\Delta-1}}{1 - (1-\theta)^{k-1}}, && \text{for } \Delta \in I_k.
\end{aligned}$$
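The closing part of these transitions can be sketched as below. The truncated geometric weights on $I_k$ sum to one, so the closing rates add up to the total closing rate. The exponents $\kappa_a$, $\kappa_b$ are applied with the sign shown in the displayed rates, which is an assumption of this sketch, and the function names are hypothetical.

```python
def truncated_geometric_pmf(k, theta):
    """Weights of the truncated geometric law on I_k = {1, ..., k-1}."""
    if k < 2:
        raise ValueError("closing requires a spread of at least 2")
    norm = 1.0 - (1.0 - theta) ** (k - 1)   # mass of the geometric law on I_k
    return [theta * (1.0 - theta) ** (d - 1) / norm for d in range(1, k)]


def llg_closing_rates(k, alpha_minus_l, beta_plus_l, kappa_a, kappa_b, theta):
    """Map each target spread k - delta to its closing rate from spread k."""
    total = alpha_minus_l * k ** kappa_a + beta_plus_l * k ** kappa_b
    weights = truncated_geometric_pmf(k, theta)
    return {k - d: total * w for d, w in enumerate(weights, start=1)}
```

Small closing moves receive the largest weight, since the weights decrease geometrically in the size of the move.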

6. Conclusions

We propose a straightforward model for price dynamics in an OB with liquidity fluctuations. Unlike [9], our model does not explicitly capture liquidity fluctuations but still offers a reasonable approximation for empirical observations. The continuous-time Markov process describing spread dynamics falls into the category of Poisson processes with uniform catastrophes, where the eliminated fraction of the population follows a uniform distribution. Large deviation results for such processes have been studied in [15,16].
When the spread closure (catastrophe) follows a uniform distribution, it accurately represents a scenario of complete uncertainty in the decisions of bidders. In such a case, any change in the spread is equally probable, indicating an exceptionally unusual situation often associated with extremely high volatility. By examining such scenarios, it is conceivable, to a significant extent, to devise a decision-making algorithm with a substantial degree of reliability. Alternatively, if one seeks to hedge against substantial fluctuations in the spread, our model may offer insights into calculating the appropriate insurance premiums needed to ensure that the risk of financial collapse remains below a predetermined threshold.
We examined the asymptotic behavior of the model, which involved determining the invariant measure (a rare case where it can be derived explicitly), and establishing results such as the law of large numbers, the central limit theorem, and large deviations. These theoretical findings were utilized in Monte Carlo simulations to validate that our model reproduces relevant empirical characteristics.
We conclude our paper by discussing potential future research directions, including the exploration of other liquidity regimes and model extensions for various applications.

Author Contributions

Conceptualization, H.R., A.L. and A.Y. All authors have read and agreed to the published version of the manuscript.

Funding

Artem Logachov is supported by the Ministry of Science and Higher Education of the Russian Federation FWNF-2022-0010. Anatoly Yambartsev thanks the support of FAPESP via grant 2017/10555-0.

Data Availability Statement

The data presented in this study may be available on reasonable request from the first or corresponding author.

Acknowledgments

We thank Sasha Stoikov for providing us with the NASDAQ high-frequency trading (HFT) data for Apple Inc., collected through the Bloomberg stock trading platform. We thank Vadim Scherbakov for fruitful discussions.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Appendix A.1. Ergodicity and Invariant Measure for S(t): Proof of Theorem 1

To establish positive recurrence, we use a Lyapunov function and the criteria for continuous-time Markov chains given in [18], Theorem 1.7. Positive recurrence follows from the existence of a non-negative function $f$ (Lyapunov function) on the set of states, a positive number $\varepsilon$, and a finite set of states $F$ such that the process generator $\Gamma$ applied to $f$ satisfies $\Gamma f(x) < -\varepsilon$ for all states $x$ outside $F$.
Recall that the generator $\Gamma$ for a discrete-state Markov process is represented by the matrix $\Gamma = (\Gamma_{xy})$, where $\Gamma_{xy}$ for $x \ne y$ is the transition rate from state $x$ to state $y$, and $\Gamma_{xx} = -\sum_{y \ne x} \Gamma_{xy}$. Applying the generator to the identity function $f(x) = x$, we obtain
$$\Gamma f(k) = \gamma_+ (k+1) + \sum_{x \in I_k} \frac{\gamma_-}{k-1}\, x - (\gamma_+ + \gamma_-) k = \gamma_+ - \frac{\gamma_- k}{2} < -1,$$
for all $k$ such that $k > 2(\gamma_+ + 1)/\gamma_-$. The last inequality provides the finite set $F = \{k \le 2(\gamma_+ + 1)/\gamma_-\}$ for the criterion above with $\varepsilon = 1$. Thus, the criterion ensures that the Markov process is positive recurrent.
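The drift computation can be verified numerically. The sketch below evaluates the generator on $f(x) = x$ by summing over the transitions of the spread process (up by one at rate $\gamma_+$, down to each state of $I_k$ at rate $\gamma_-/(k-1)$); the function name is hypothetical.

```python
def generator_on_identity(k, gamma_plus, gamma_minus):
    """Apply the generator of S(t) to f(x) = x at a state k >= 2."""
    if k < 2:
        raise ValueError("the closed form below is stated for k >= 2")
    up = gamma_plus * ((k + 1) - k)
    # Uniform catastrophe: each state x in I_k is reached at rate gamma_-/(k-1).
    down = sum(gamma_minus / (k - 1) * (x - k) for x in range(1, k))
    return up + down
```

For every $k \ge 2$ the result agrees with $\gamma_+ - \gamma_- k/2$, which drops below $-1$ once $k > 2(\gamma_+ + 1)/\gamma_-$.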
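The formula for the invariant measure can be verified directly from the global balance equations. Starting from the system of global balance equations, we derive the following:
$$\begin{cases}
\gamma_+ \mu(1) = \gamma_- \mu(2) + \frac{\gamma_-}{2} \mu(3) + \frac{\gamma_-}{3} \mu(4) + \cdots \\
(\gamma_+ + \gamma_-) \mu(2) = \gamma_+ \mu(1) + \frac{\gamma_-}{2} \mu(3) + \frac{\gamma_-}{3} \mu(4) + \cdots \\
(\gamma_+ + \gamma_-) \mu(3) = \gamma_+ \mu(2) + \frac{\gamma_-}{3} \mu(4) + \frac{\gamma_-}{4} \mu(5) + \cdots \\
\qquad \vdots \\
(\gamma_+ + \gamma_-) \mu(n) = \gamma_+ \mu(n-1) + \frac{\gamma_-}{n} \mu(n+1) + \cdots \\
\qquad \vdots
\end{cases}
\ \Longrightarrow \
\begin{cases}
2\gamma_+ \mu(1) = (\gamma_+ + 2\gamma_-) \mu(2) \\
(2\gamma_+ + \gamma_-) \mu(2) = \gamma_+ \mu(1) + \big(\gamma_+ + \gamma_- + \frac{\gamma_-}{2}\big) \mu(3) \\
(2\gamma_+ + \gamma_-) \mu(3) = \gamma_+ \mu(2) + \big(\gamma_+ + \gamma_- + \frac{\gamma_-}{3}\big) \mu(4) \\
\qquad \vdots \\
(2\gamma_+ + \gamma_-) \mu(n) = \gamma_+ \mu(n-1) + \big(\gamma_+ + \gamma_- + \frac{\gamma_-}{n}\big) \mu(n+1) \\
\qquad \vdots
\end{cases}$$
Dividing both sides by $\gamma_+ + \gamma_-$ and using the notation $p := \frac{\gamma_+}{\gamma_+ + \gamma_-}$, $q := 1 - p$, we rewrite the last system as
$$\begin{cases}
\mu(2) = \dfrac{2p}{1+q}\, \mu(1) \\[2mm]
\mu(3) = \dfrac{1+p}{1 + \frac{q}{2}}\, \mu(2) - \dfrac{p}{1 + \frac{q}{2}}\, \mu(1) \\
\qquad \vdots \\
\mu(n) = \dfrac{1+p}{1 + \frac{q}{n-1}}\, \mu(n-1) - \dfrac{p}{1 + \frac{q}{n-1}}\, \mu(n-2) \\
\qquad \vdots
\end{cases}
\ \Longrightarrow \
\begin{cases}
\mu(2) = \dfrac{2p}{1+q}\, \mu(1) \\[2mm]
\mu(3) = \dfrac{3p^2}{(1+q)\big(1 + \frac{q}{2}\big)}\, \mu(1) \\
\qquad \vdots \\
\mu(n) = \dfrac{n p^{n-1}}{\prod_{i=1}^{n-1} \big(1 + \frac{q}{i}\big)}\, \mu(1) \\
\qquad \vdots
\end{cases}$$
Normalizing the relation
$$\mu(n) = \frac{n p^{n-1}}{\prod_{i=1}^{n-1} \big(1 + \frac{q}{i}\big)}\, \mu(1)$$
using the fact that $\mu$ is a probability measure, $\sum_k \mu(k) = 1$, and returning to the notation $\gamma_\pm$, we obtain Formula (6).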
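The closed-form relation for $\mu(n)$ can be checked against the global balance equations numerically. The sketch below truncates the state space at a hypothetical level `n_max`, which is harmless when $p^{n_{\max}}$ is negligible; the function names are also hypothetical.

```python
def invariant_measure(gamma_plus, gamma_minus, n_max=400):
    """Normalised weights n p^(n-1) / prod_{i<n} (1 + q/i), n = 1..n_max."""
    p = gamma_plus / (gamma_plus + gamma_minus)
    q = 1.0 - p
    weights, prod = [], 1.0
    for n in range(1, n_max + 1):
        if n > 1:
            prod *= 1.0 + q / (n - 1)
        weights.append(n * p ** (n - 1) / prod)
    total = sum(weights)
    return [w / total for w in weights]


def balance_residual(mu, n, gamma_plus, gamma_minus):
    """Outflow minus inflow at state n (mu[n-1] stores mu(n))."""
    # Inflow from catastrophes: state k > n jumps to n at rate gamma_-/(k-1).
    inflow = sum(gamma_minus / (k - 1) * mu[k - 1]
                 for k in range(n + 1, len(mu) + 1))
    if n > 1:
        inflow += gamma_plus * mu[n - 2]       # opening from state n - 1
    out_rate = gamma_plus + (gamma_minus if n > 1 else 0.0)
    return out_rate * mu[n - 1] - inflow
```

With $\gamma_+ = 8.663$ and $\gamma_- = 1.613$ (the sums of the rates estimated in Section 4.3), the residuals vanish up to floating-point error.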

Appendix A.2. Law of Large Numbers: Proof of Theorem 2

Perhaps the most straightforward method to prove the LLN involves the ergodic theorem for discrete-time Markov chains. Let $s_n$ denote the embedded discrete-time Markov chain on $\mathbb{N}$ derived from $S(\cdot)$, with the following transition probabilities:
$$p(k,l) := P(s_{n+1} = l \mid s_n = k) = \begin{cases} 1, & \text{if } l = 2 \text{ and } k = 1, \\[1mm] \dfrac{\gamma_+}{\gamma_+ + \gamma_-}, & \text{if } l = k+1, \text{ when } k > 1, \\[2mm] \dfrac{\gamma_-}{\gamma_+ + \gamma_-} \cdot \dfrac{1}{k-1}, & \text{if } l \in I_k \text{ and } k > 1. \end{cases}$$
Consider the stationary measure $\pi = (\pi(s), s \in \mathbb{N})$ associated with the chain $s_n$. Evidently, the relation (6) for the stationary measure $\mu$ can be transformed into the following relation for the stationary measure $\pi$:
$$\pi(n) = \frac{n p^{n-2}}{\prod_{i=1}^{n-1} \big(1 + \frac{q}{i}\big)}\, \pi(1), \quad n \ge 2. \tag{A2}$$
This transformation can be verified directly using the global balance equations of the discrete-time Markov chain $s_n$.
Denote by $p_n$ the embedded discrete-time chain corresponding to the continuous-time bid-price process $P_b(t)$. The behavior of $p_n$ can be expressed as a function of the spread dynamics $s_n$ using the following formula:
$$p_n = \sum_{i=1}^{n} F(s_{i-1}, s_i, U_i),$$
where
$$U_n = (U_n^1, U_n^2)$$
is a sequence of independent and identically distributed random vectors such that
$$P\big((U_n^1, U_n^2) = (1,0)\big) = \frac{\beta_-}{\gamma_+}\Big(1 - \frac{\beta_+}{\gamma_-}\Big), \qquad P\big((U_n^1, U_n^2) = (1,1)\big) = \frac{\beta_-}{\gamma_+} \cdot \frac{\beta_+}{\gamma_-},$$
$$P\big((U_n^1, U_n^2) = (0,0)\big) = \Big(1 - \frac{\beta_-}{\gamma_+}\Big)\Big(1 - \frac{\beta_+}{\gamma_-}\Big), \qquad P\big((U_n^1, U_n^2) = (0,1)\big) = \Big(1 - \frac{\beta_-}{\gamma_+}\Big) \frac{\beta_+}{\gamma_-},$$
and where the function $F$ is
$$F(s_{n-1}, s_n, U_n) = \begin{cases} -1, & \text{if } s_n = s_{n-1} + 1 \text{ and } U_n^1 = 1; \\ s_{n-1} - s_n, & \text{if } s_n < s_{n-1} \text{ and } U_n^2 = 1; \\ 0, & \text{otherwise}. \end{cases} \tag{A4}$$
Note that
$$\hat{s}_n := (s_{n-1}, s_n, U_n) \tag{A5}$$
is a Markov chain, and let $\hat{\pi}$ be its invariant measure. Observe that the part of the invariant measure $\hat{\pi}$ corresponding to the pair $(s_{n-1}, s_n)$ is the product $\hat{\pi}(x,y) = \pi(x) p(x,y)$. By the ergodic theorem, we obtain the LLN for the embedded chain $p_n$.
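Lemma A1.
$$\frac{p_n}{n} \to v \quad \text{a.s. as } n \to \infty, \tag{A6}$$
where
$$v = -\frac{\beta_-}{\gamma_+ + \gamma_-} - \frac{\pi(1)}{\gamma_+ + \gamma_-}\Big(\frac{\beta_- \gamma_-}{\gamma_+} + \frac{\beta_+}{2}\Big) + \frac{\beta_+}{\gamma_+ + \gamma_-} \cdot \frac{1}{2} \sum_{s=1}^{\infty} s\, \pi(s).$$
Proof.
The ergodic theorem gives the convergence (A6). Thus, we only need to find $v$, which is the expectation of the increments $F(\hat{s}_n)$ with respect to the invariant measure $\hat{\pi}$:
$$\begin{aligned}
E_{\hat{\pi}}\big(F(\hat{s}_n)\big) &= -\pi(1) \frac{\beta_-}{\gamma_+} - \sum_{s=2}^{\infty} \pi(s) \frac{\gamma_+}{\gamma_+ + \gamma_-} \cdot \frac{\beta_-}{\gamma_+} + \sum_{s_2=2}^{\infty} \sum_{s_1=1}^{s_2-1} (s_2 - s_1)\, \pi(s_2) \frac{\gamma_-}{\gamma_+ + \gamma_-} \cdot \frac{1}{s_2 - 1} \cdot \frac{\beta_+}{\gamma_-} \\
&= -\frac{\beta_-}{\gamma_+ + \gamma_-} - \frac{\pi(1)}{\gamma_+ + \gamma_-}\Big(\frac{\beta_- \gamma_-}{\gamma_+} + \frac{\beta_+}{2}\Big) + \frac{\beta_+}{\gamma_+ + \gamma_-} \cdot \frac{1}{2} \sum_{s=1}^{\infty} s\, \pi(s),
\end{aligned}$$
which finishes the proof of the lemma. □
To finish the proof of Theorem 2, we observe
$$\lim_{t \to \infty} \frac{P_b(t)}{t} = \lim_{t \to \infty} \frac{p_{N_t}}{N_t} \cdot \frac{N_t}{t} = v (\gamma_+ + \gamma_-) = -\beta_- - \pi(1)\Big(\frac{\beta_- \gamma_-}{\gamma_+} + \frac{\beta_+}{2}\Big) + \frac{\beta_+}{2} \sum_{s=1}^{\infty} s\, \pi(s) =: D,$$
where $N_t$ is the Poisson process with rate $\gamma_+ + \gamma_-$.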
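The simplification of the expectation into the closed form for $v$ can be checked numerically. The sketch below builds $\pi$ from relation (A2), truncated at a hypothetical level `n_max`, and evaluates $v$ both from the double sum and from the closed form. It assumes $\gamma_+ = \alpha_+ + \beta_-$ and $\gamma_- = \alpha_- + \beta_+$; the function names are hypothetical.

```python
def embedded_stationary(gamma_plus, gamma_minus, n_max=400):
    """Normalised pi(n) from (A2): pi(n) ~ n p^(n-2) / prod_{i<n} (1 + q/i)."""
    p = gamma_plus / (gamma_plus + gamma_minus)
    q = 1.0 - p
    weights, prod = [1.0], 1.0           # unnormalised pi(1) = 1
    for n in range(2, n_max + 1):
        prod *= 1.0 + q / (n - 1)
        weights.append(n * p ** (n - 2) / prod)
    total = sum(weights)
    return [w / total for w in weights]


def v_direct(pi, beta_plus, beta_minus, gp, gm):
    """Expectation of F under the invariant measure, term by term."""
    g = gp + gm
    opening = -pi[0] * beta_minus / gp - sum(pi[1:]) * (gp / g) * (beta_minus / gp)
    closing = sum((s2 - s1) * pi[s2 - 1] * (gm / g) / (s2 - 1) * (beta_plus / gm)
                  for s2 in range(2, len(pi) + 1) for s1 in range(1, s2))
    return opening + closing


def v_closed_form(pi, beta_plus, beta_minus, gp, gm):
    """Closed form for v stated in Lemma A1."""
    g = gp + gm
    mean_spread = sum(s * w for s, w in enumerate(pi, start=1))
    return (-beta_minus / g
            - pi[0] / g * (beta_minus * gm / gp + beta_plus / 2)
            + beta_plus / g * 0.5 * mean_spread)
```

Multiplying either value by $\gamma_+ + \gamma_-$ gives the drift $D$ of Theorem 2.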

Appendix A.3. CLT for Embedding $p_n$: Proof of Lemma 1

One approach to proving the CLT is to demonstrate the geometric ergodicity of the chain $\hat{s}_n$, which means a geometric rate of convergence to the invariant measure:
$$\| P^n(x, \cdot) - \hat{\pi}(\cdot) \| \le M(x)\, q^n, \quad \text{for some } q < 1,$$
where $\| \cdot \|$ stands for the total variation norm. Subsequently, we can apply the results available for geometrically ergodic chains. Formally, this requires establishing that the chain $\hat{s}_n$, defined by (A5), is a Harris ergodic Markov chain, which indeed holds true for $\hat{s}_n$. For further details, please refer to [19].
Theorem A1 (Corollary 2, [19]). Let $X$ be a Harris ergodic Markov chain on $\mathcal{X}$ with invariant distribution $\pi$ and let $f : \mathcal{X} \to \mathbb{R}$ be a Borel function. Assume that $X$ is geometrically ergodic and $E_\pi |f(x)|^{2+\delta} < \infty$ for some $\delta > 0$. Then, for any initial distribution, as $n \to \infty$
$$\sqrt{n}\, (\bar{f}_n - E_\pi f) \to N(0, \sigma_f^2)$$
in distribution.
Let us begin by proving the geometric ergodicity of the chain $\hat{s}_n$. General results exist concerning the so-called drift conditions for establishing geometric ergodicity of chains, as detailed in [20]. However, for countable Markov chains, we can use the criterion given in [21] (see Theorem 2):
A countable Markov chain is geometrically ergodic if there exist a finite set $B \subset \mathcal{X}$ and a function $g(x) \ge 0$, $x \in \mathcal{X}$, such that $E_x e^{g(X_1) - g(X_0)} \le q < 1$ when $x \notin B$, and $E_x e^{g(X_1) - g(X_0)} < \infty$ if $x \in B$.
Proof.
To apply the criterion, let $V(\cdot) \equiv e^{g(\cdot)}$. To check the conditions, we consider the function $V(\cdot)$ for the chain $\hat{s}_n$, defined as follows:
$$V(i, j, u) = e^{\ln(i^2 + j^2)} = i^2 + j^2.$$
For brevity, denote $p_+ = \frac{\gamma_+}{\gamma_+ + \gamma_-}$ and $p_- = \frac{\gamma_-}{\gamma_+ + \gamma_-}$. Then,
$$\begin{aligned}
E\Big[ \frac{V(s_n, s_{n+1}, U_{n+1})}{V(s_{n-1}, s_n, U_n)} \,\Big|\, s_{n-1} = y, s_n = x \Big] &= \sum_{k=1}^{x-1} \frac{(x-k)^2 + x^2}{x^2 + y^2} \cdot \frac{p_-}{x-1} + \frac{(x+1)^2 + x^2}{y^2 + x^2}\, p_+ \\
&= \frac{x(2x-1)}{6(x^2 + y^2)}\, p_- + \frac{x^2}{x^2 + y^2}\, p_- + \frac{(x+1)^2 + x^2}{y^2 + x^2}\, p_+.
\end{aligned} \tag{A8}$$
Consider two cases. First, we suppose that $x < y$. Then,
$$\text{(A8)} \le \frac{x(2x-1)}{6(x^2 + (x+1)^2)}\, p_- + \frac{x^2}{x^2 + (x+1)^2}\, p_- + \frac{(x+1)^2 + x^2}{x^2 + (x+1)^2}\, p_+ < \frac{2}{3} p_- + p_+ \le q < 1.$$
Thus, the condition holds for all $(x, y)$ such that $x < y$. In the second case, $x > y$, we have $y = x - 1$:
$$\text{(A8)} = \frac{x(2x-1)}{6(x^2 + (x-1)^2)}\, p_- + \frac{x^2}{x^2 + (x-1)^2}\, p_- + \frac{(x+1)^2 + x^2}{x^2 + (x-1)^2}\, p_+ = \frac{(8x^2 - x)\, p_- + 24 x\, p_+}{6(x^2 + (x-1)^2)} + p_+.$$
There is no $q < 1$ such that (A8) $\le q$ for all $x$. But it is easy to see that there exist $q > \frac{2}{3} p_- + p_+$ and $C \equiv C(p_-, p_+, q) > 0$ such that for all $(x, y)$ under the condition $x \ge C$
$$\text{(A8)} \le q < 1.$$
It is easy to see that
$$\sup_x \Big[ \frac{(8x^2 - x)\, p_- + 24 x\, p_+}{6(x^2 + (x-1)^2)} + p_+ \Big] < \infty.$$
Thus, in this case, we can define the finite set $B$ from the condition as
$$B = \big\{ (x, y) : x \le C,\ y \le x,\ U_n \in \{(0,0), (0,1), (1,0), (1,1)\} \big\}.$$
This completes the proof of the geometric ergodicity of the chain $\hat{s}_n$. □
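The two expressions for the drift ratio (A8) in the case $y = x - 1$ can be compared numerically. The sketch below evaluates the conditional expectation by direct summation and by the closed form; the function names are hypothetical, and the case $x = 1$ is excluded because the chain then moves up with probability one.

```python
def drift_ratio(x, y, p_plus, p_minus):
    """E[V(next) / V(current)] for V(i, j, u) = i^2 + j^2, with x >= 2."""
    if x < 2:
        raise ValueError("stated for x >= 2")
    denom = x * x + y * y
    up = ((x + 1) ** 2 + x * x) / denom * p_plus
    # Uniform closing: each new spread x - k has probability p_minus / (x - 1).
    down = sum(((x - k) ** 2 + x * x) / denom
               for k in range(1, x)) * p_minus / (x - 1)
    return up + down


def closed_form_opening_case(x, p_plus, p_minus):
    """Closed form of (A8) when the previous spread was y = x - 1."""
    d = 6 * (x * x + (x - 1) ** 2)
    return ((8 * x * x - x) * p_minus + 24 * x * p_plus) / d + p_plus
```

At $x = 2$ the ratio equals $p_- + 2.6\, p_+ > 1$, which shows why a finite exceptional set $B$ is needed, while for large $x$ it approaches $\frac{2}{3} p_- + p_+ < 1$.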
For the second condition of the CLT theorem, it is necessary to verify that $E_{\hat{\pi}} |F(\hat{s}_n)|^{2+\delta} < \infty$ for a certain $\delta > 0$, where the function $F$ is defined by (A4). To achieve this, we require insights into the behavior of the invariant measure.
Proof.
As before, let $\pi$ be the invariant measure for the chain $s_n$. The condition takes the following form:
$$\begin{aligned}
E_{\hat{\pi}} |F(\hat{s}_n)|^{2+\delta} &= \sum_{x=1}^{\infty} \Big[ \sum_{k=1}^{x-1} k^{2+\delta}\, \hat{\pi}(x, x-k) \frac{\beta_+}{\gamma_-} + \hat{\pi}(x, x+1) \frac{\beta_-}{\gamma_+} \Big] \\
&= \frac{\beta_-}{\gamma_+ + \gamma_-} + \frac{\beta_+}{\gamma_+ + \gamma_-} \sum_{x=1}^{\infty} \pi(x) \sum_{k=2}^{x-1} \frac{k^{2+\delta}}{x-1} \\
&< \frac{\beta_-}{\gamma_+ + \gamma_-} + \frac{\beta_+}{\gamma_+ + \gamma_-} \cdot \frac{1}{3+\delta} \sum_{x=1}^{\infty} \pi(x) \frac{x^{3+\delta}}{x-1}.
\end{aligned} \tag{A9}$$
Thus, if we prove that $\pi(x)$ decreases sufficiently fast, then the last series in (A9) will converge. Indeed, from the relation (A2) we obtain immediately
$$\pi(n) < n p^{n-2} \pi(1),$$
which provides the convergence of the last series in (A9). This establishes the conditions for the central limit theorem (CLT). □
With that, we conclude the proof of the CLT, as stated in Lemma 1.

Appendix A.4. CLT for Continuous-Time $P_b(t)$: Proof of Theorem 3

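Based on the result above, let us proceed to establish the central limit theorem for the price process $P_b(t)$. As previously mentioned, let $N_t$ be the Poisson process with rate $\gamma = \gamma_+ + \gamma_-$, representing the count of jumps for the process $X(t) = (P_b(t), P_a(t))$. The following representation holds:
$$\sqrt{t}\, \Big( \frac{P_b(t)}{t} - D \Big) = \sqrt{N_t}\, \Big( \frac{p_{N_t}}{N_t} - v \Big) \sqrt{\frac{N_t}{t}} + \sqrt{t}\, \Big( \frac{N_t}{t}\, v - D \Big).$$
According to Lemma 1 and the CLT for the Poisson process, we expect that as $t \to \infty$
$$\begin{aligned}
&\sqrt{N_t}\, \Big( \frac{p_{N_t}}{N_t} - v \Big) \to N(0, \sigma^2) \ \text{ in distribution for some } \sigma^2, \\
&\sqrt{t}\, \Big( \frac{N_t}{t} - \gamma \Big) \to N(0, \gamma) \ \text{ in distribution}, \\
&\frac{N_t}{t} \to \gamma \ \text{ a.s.}
\end{aligned} \tag{A10}$$
The second and third convergences are well known, while the first can be proved as follows. Let $F_n(x)$ be the cumulative distribution function of the scaled embedded price Markov chain $p_n$ from Lemma 1; then for any $\delta > 0$ there exists $n_\delta$ such that for all $n > n_\delta$
$$| F_n(x) - \Phi_{\sigma^2}(x) | < \delta,$$
where $\Phi_{\sigma^2}(x)$ stands for the cumulative normal distribution with zero mean and variance $\sigma^2$. Then,
$$\begin{aligned}
P\Big( \sqrt{N_t}\, \Big( \frac{p_{N_t}}{N_t} - v \Big) \le x \Big) &= P\Big( \sqrt{N_t}\, \Big( \frac{p_{N_t}}{N_t} - v \Big) \le x \,\Big|\, N_t \le n_\delta \Big) P(N_t \le n_\delta) \\
&\quad + \sum_{n = n_\delta + 1}^{\infty} P\Big( \sqrt{N_t}\, \Big( \frac{p_{N_t}}{N_t} - v \Big) \le x \,\Big|\, N_t = n \Big) P(N_t = n) \\
&\le P(N_t \le n_\delta) + \Phi_{\sigma^2}(x) + \delta \to \Phi_{\sigma^2}(x) \ \text{ as } t \to \infty, \text{ and as } \delta \to 0, \\
P\Big( \sqrt{N_t}\, \Big( \frac{p_{N_t}}{N_t} - v \Big) \le x \Big) &\ge \big( \Phi_{\sigma^2}(x) - \delta \big) P(N_t > n_\delta) \\
&= \Phi_{\sigma^2}(x) - \delta - \big( \Phi_{\sigma^2}(x) - \delta \big) P(N_t \le n_\delta) \to \Phi_{\sigma^2}(x) \ \text{ as } t \to \infty, \text{ and as } \delta \to 0.
\end{aligned}$$
Observe that the two normal variables from (A10) are not independent; however, they are asymptotically uncorrelated and, furthermore, asymptotically independent. To demonstrate this, consider the variables
$$\sqrt{N_t}\, \Big( \frac{p_{N_t}}{N_t} - v \Big) \quad \text{and} \quad \sqrt{t}\, \Big( \frac{N_t}{t} - \gamma \Big).$$
We will show that they are asymptotically independent.
Since the second variable is a measurable function of $N_t$, it suffices to prove that for all $x \in \mathbb{R}$, all sets $A \subseteq \mathbb{N}$, and all $\varepsilon > 0$, the following inequality holds:
$$\lim_{t \to \infty} \Big| P\Big( \sqrt{N_t}\, \Big( \frac{p_{N_t}}{N_t} - v \Big) < x,\ N_t \in A \Big) - \Phi_{\sigma^2}(x) P(N_t \in A) \Big| \le \varepsilon. \tag{A11}$$
If the set $A$ is bounded from above, then the inequality holds:
$$0 \le \lim_{t \to \infty} P\Big( \sqrt{N_t}\, \Big( \frac{p_{N_t}}{N_t} - v \Big) < x,\ N_t \in A \Big) \le \lim_{t \to \infty} P(N_t \in A) = 0.$$
Suppose now that $A$ is not bounded. In this case, for any $\delta > 0$ we obtain
$$\begin{aligned}
\lim_{t \to \infty} P\Big( \sqrt{N_t}\, \Big( \frac{p_{N_t}}{N_t} - v \Big) < x,\ N_t \in A \Big) &= \lim_{t \to \infty} P\Big( \sqrt{N_t}\, \Big( \frac{p_{N_t}}{N_t} - v \Big) < x,\ N_t \in A \cap [0, n_\delta] \Big) \\
&\quad + \lim_{t \to \infty} \sum_{k = n_\delta + 1}^{\infty} P\Big( \sqrt{k}\, \Big( \frac{p_k}{k} - v \Big) < x,\ N_t \in A \cap \{k\} \Big) \\
&= \lim_{t \to \infty} \sum_{k = n_\delta + 1}^{\infty} P\Big( \sqrt{k}\, \Big( \frac{p_k}{k} - v \Big) < x \Big) P(N_t \in A \cap \{k\}).
\end{aligned}$$
Thus, for any $\delta > 0$ we have
$$\lim_{t \to \infty} \big( \Phi_{\sigma^2}(x) - \delta \big) P(N_t \in A \cap [n_\delta, \infty)) \le \lim_{t \to \infty} P\Big( \sqrt{N_t}\, \Big( \frac{p_{N_t}}{N_t} - v \Big) < x,\ N_t \in A \Big) \le \lim_{t \to \infty} \big( \Phi_{\sigma^2}(x) + \delta \big) P(N_t \in A \cap [n_\delta, \infty)),$$
and the following inequality holds:
$$\lim_{t \to \infty} \Big| P\Big( \sqrt{N_t}\, \Big( \frac{p_{N_t}}{N_t} - v \Big) < x,\ N_t \in A \Big) - \Phi_{\sigma^2}(x) P(N_t \in A) \Big| \le \delta \lim_{t \to \infty} P(N_t \in A \cap [n_\delta, \infty)) < \delta.$$
Choosing $\delta = \varepsilon$, we obtain inequality (A11).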

References

  1. Biais, B.; Hillion, P.; Spatt, C. An empirical analysis of the limit order book and the order flow in the Paris Bourse. J. Financ. 1995, 50, 1655–1689.
  2. Smith, E.; Farmer, J.D.; Gillemot, L.; Krishnamurthy, S. Statistical theory of the continuous double auction. Quant. Financ. 2003, 3, 481–514.
  3. Bouchaud, J.P.; Farmer, J.D.; Lillo, F. How markets slowly digest changes in supply and demand. In Handbook of Financial Markets: Dynamics and Evolution; Elsevier: Amsterdam, The Netherlands, 2009; pp. 57–160.
  4. Cont, R.; Stoikov, S.; Talreja, R. A stochastic model for order book dynamics. Oper. Res. 2010, 58, 549–563.
  5. Avellaneda, M.; Reed, J.; Stoikov, S. Forecasting prices from Level-I quotes in the presence of hidden liquidity. Algorithmic Financ. 2011, 1, 35–43.
  6. Cont, R.; De Larrard, A. Price dynamics in a Markovian limit order market. SIAM J. Financ. Math. 2013, 4, 1–25.
  7. Cont, R.; Mueller, M.S. A Stochastic Partial Differential Equation Model for Limit Order Book Dynamics. SIAM J. Financ. Math. 2019, 12, 744–787.
  8. Golub, A.; Keane, J.; Poon, S.H. High frequency trading and mini flash crashes. arXiv 2012, arXiv:1211.6667.
  9. Dall’Amico, L.; Fosset, A.; Bouchaud, J.P.; Benzaquen, M. How does latent liquidity get revealed in the limit order book? J. Stat. Mech. Theory Exp. 2019, 2019, 013404.
  10. Lo, D.K.; Hall, A.D. Resiliency of the limit order book. J. Econ. Dyn. Control 2015, 61, 222–244.
  11. Riccò, R.; Rindi, B.; Seppi, D.J. Information, Liquidity, and Dynamic Limit Order Markets; Bocconi University: Milano, Italy, 2022.
  12. Roll, R. A simple implicit measure of the effective bid-ask spread in an efficient market. J. Financ. 1984, 39, 1127–1139.
  13. Doyne Farmer, J.; Gillemot, L.; Lillo, F.; Mike, S.; Sen, A. What really causes large price changes? Quant. Financ. 2004, 4, 383–397.
  14. Biais, B.; Weill, P.O. Liquidity Shocks and Order Book Dynamics; Technical Report; National Bureau of Economic Research: Cambridge, MA, USA, 2009.
  15. Logachov, A.; Logachova, O.; Yambartsev, A. The local principle of large deviations for compound Poisson process with catastrophes. Braz. J. Probab. Stat. 2021, 35, 205–223.
  16. Logachov, A.; Logachova, O.; Yambartsev, A. Large deviations in a population dynamics with catastrophes. Stat. Probab. Lett. 2019, 149, 29–37.
  17. Rojas, H.; Alvarez, C.; Rojas, N. Statistical Hypothesis Testing for Information Value (IV). arXiv 2023, arXiv:2309.13183.
  18. Menshikov, M.; Petritis, D. Explosion, implosion, and moments of passage times for continuous-time Markov chains: A semimartingale approach. Stoch. Process. Their Appl. 2014, 124, 2388–2414.
  19. Jones, G.L. On the Markov chain central limit theorem. Probab. Surv. 2004, 1, 299–320.
  20. Meyn, S.P.; Tweedie, R.L. Markov Chains and Stochastic Stability; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012.
  21. Popov, N. Geometric ergodicity conditions for countable Markov chains. Dokl. Akad. Nauk. Russ. Acad. Sci. 1977, 234, 316–319.
Figure 1. A graph of the S&P500 futures on the day of the flash crash of 6 May 2010 at 2:45 p.m.
Figure 2. Intraday evolution of the ask (red) and bid (green) prices, Apple Inc. (Cupertino, CA, USA) AAPL stock, 4 March 2011. Left: short-term, 1 min. Right: long-term, 15 min. Figures created by H.Rojas.
Figure 3. Joint empirical distribution of bid and ask queue sizes at the top of the order book; Apple Inc. stock, 4 March 2011. Figure created by H.Rojas.
Figure 4. Empirical distribution of the bid–ask spread, Apple Inc. stock, 4 March 2011, corresponding to 15 min of observation (blue). The invariant distribution is calculated by Formula (6) (red). Figure created by H.Rojas.
Figure 5. The rates for the highly competitive model. An illustrative example of the case when $a - b = 3$. Figure created by A.Yambartsev.
Figure 6. The optimal trajectories for (A) spread process, which is a Poisson process (of rate $\gamma_+$) with uniform catastrophes (of rate $\gamma_-$), and (B) Poisson process with rate $\gamma_+$. If $x < \gamma_+$, then the large deviation occurs according to the functions $f_2$. If $x \ge \gamma_+$, then the large deviation trajectory is in the neighborhood of the straight line $f_1$. Figures created by A.Yambartsev.
Figure 7. The optimal trajectories for prices $(P_b(t), P_a(t))$ under a large deviation of the spread when (A) the scaled spread $x$ is less than $\gamma_+$, which consists of the rates $\beta_-$, $\alpha_+$, i.e., $\gamma_+ = \beta_- + \alpha_+$; and (B) the scaled spread $x \ge \gamma_+$. Figures created by A.Yambartsev.
Figure 8. Intensity of price jumps: average number of micro-jumps per minute. Upper graph: 3 March 2011. Lower graph: 4 March 2011. Figures created by H.Rojas.
Figure 9. The vertical axis corresponds to the information value (IV). The horizontal axis corresponds to the time lag (lags) taken into account for the calculation of the divergence, that is, contiguous periods where price jumps occur. The graph on the left corresponds to the spread bid–ask predictor. The graph on the right corresponds to the imbalance predictor. Figures created by H.Rojas.
Figure 10. Simulation of the order book with parameters α ^ + = 3.756 , α ^ = 0.765 , β ^ + = 0.848 , and β ^ = 4.907 . Upper left: Short-term evolution of bid (blue) and ask (red) prices, 1 min sample. Upper right: Long-term evolution of the prices, 15 min sample. Bottom left: short-term path of the price process X ( t ) , 1 min sample. Bottom right: long-term path of the price process, 15 min sample. Figures created by H.Rojas.
Figure 11. Autocorrelation function of price returns based on our simulations of the order book. The red dashed line represents the 95% confidence interval. Figure created by H. Rojas.
Figure 12. Empirical probabilities $\hat{p}(b, a)$ versus theoretical probabilities $p(b, a)$. Figures created by H. Rojas.
Figure 13. Accuracy in classifying upward variations; the event of interest (target variable) is an upward variation. Figure created by H. Rojas.
Figure 14. The rates for the non-competitive model. An illustrative example for the case when $a - b = 3$. Figure created by A. Yambartsev.
Figure 15. The rates for the low-liquidity model. An illustrative example for the case when $a - b = 3$. Figure created by A. Yambartsev.

Rojas, H.; Logachov, A.; Yambartsev, A. Order Book Dynamics with Liquidity Fluctuations: Asymptotic Analysis of Highly Competitive Regime. Mathematics 2023, 11, 4235. https://doi.org/10.3390/math11204235