**4. Results**

In this section, we first show that the Hawkes process introduced here successfully approximated the intensity of the limit order generation process for 104 of the 134 HFTs by applying the method described in Section 3.3 (see Section 4.1). We then categorize the order generation processes of the 104 successfully estimated HFTs into three groups according to their excitation mechanisms, and explain how each group of HFTs places their orders (see Section 4.2).

#### *4.1. DKL TS*+*TB Calculation Results for All HFTs*

Figure 7 shows a histogram of the *DKL TS*+*TB* values for the 134 HFTs. Many HFTs have values of *DKL TS*+*TB* very close to 0, whereas the values above 0.05 are sparsely scattered, so we set the threshold at 0.05. The 104 HFTs with *DKL TS*+*TB* < 0.05 were considered to be traders whose order generation processes were well modeled, while the remaining 30 HFTs with *DKL TS*+*TB* ≥ 0.05 were considered to be traders whose order generation processes were poorly modeled by the Hawkes process introduced here.

**Figure 7.** A histogram of *DKL TS*+*TB* values for all 134 HFTs. Out of 134 HFTs, 104 fell within the acceptable error threshold of 0.05, and the remaining 30 HFTs exceeded the threshold.

With the estimated parameters, the Hawkes process can be simulated using the thinning method [51]. For example, Figure 8 compares the number of orders per 10 min window in the real data of an HFT with *DKL TS*+*TB* = 0.0238 against a simulated time series. For both sell limit orders (upper panel) and buy limit orders (lower panel), the simulation successfully reproduces the behavior of the real data.

**Figure 8.** Comparison between simulations and real data of $\{t\_{TS,i}\}$ and $\{t\_{TB,i}\}$ for an HFT with *DKL TS*+*TB* = 0.0238. The horizontal axis represents the time over a 24 h period, and the vertical axis represents the number of order occurrences per 10 min window.
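As a rough illustration of the thinning simulation mentioned above, the following Python sketch simulates a univariate Hawkes process with an exponential kernel and counts the events per time window. It is a simplification under stated assumptions, not the procedure used in this study: the model here has eight event types per order side, whereas the sketch has a single self-exciting stream, and the names `simulate_hawkes`, `counts_per_window`, `mu`, `alpha`, and `beta`, as well as all parameter values, are hypothetical.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on [0, horizon] by thinning.

    Assumed intensity (hypothetical parameterization):
        lambda(t) = mu + sum over past events t_i of alpha * exp(-beta * (t - t_i))
    Requires mu > 0; alpha / beta < 1 keeps the process stable.
    """
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0  # current value of the self-exciting term
    while True:
        # The intensity only decays until the next event, so its current
        # value is a valid upper bound for proposing the next candidate.
        upper = mu + excitation
        wait = rng.expovariate(upper)
        if t + wait > horizon:
            break
        t += wait
        excitation *= math.exp(-beta * wait)  # decay to the candidate time
        # Accept the candidate with probability lambda(t) / upper.
        if rng.random() * upper <= mu + excitation:
            events.append(t)
            excitation += alpha  # each accepted event raises the intensity
    return events


def counts_per_window(events, horizon, window):
    """Count events in consecutive windows of equal length."""
    n_bins = int(math.ceil(horizon / window))
    counts = [0] * n_bins
    for t in events:
        counts[min(int(t // window), n_bins - 1)] += 1
    return counts


# Illustrative values only: 10 min of simulated time, 1 min windows.
events = simulate_hawkes(mu=0.5, alpha=2.0, beta=10.0, horizon=600.0, seed=1)
print(counts_per_window(events, horizon=600.0, window=60.0))
```

Windowed counts produced this way are the kind of series that can be plotted against real order counts, as in the comparison above.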

On the other hand, Figure 9 compares the simulated and real data for an HFT with *DKL TS*+*TB* = 0.211, which was judged not to be properly estimated by the Hawkes process. The deviations from the real data for both sell limit orders (upper panel) and buy limit orders (lower panel) are larger than in Figure 8. The Hawkes process introduced here did not adequately explain the order generation process of this trader, whose error *DKL TS*+*TB* is large. Because some traders could not be modeled by the Hawkes process, in the following we report the results of our clustering analysis of the order generation processes of the 104 HFTs that remain after excluding these 30 traders.

**Figure 9.** Comparison between simulations and real data of $\{t\_{TS,i}\}$ and $\{t\_{TB,i}\}$ for an HFT with *DKL TS*+*TB* = 0.211. The horizontal axis represents the time over a 24 h period, and the vertical axis represents the number of order occurrences per 10 min window.

#### *4.2. Results of Clustering Analysis*

Here, we categorize the 104 HFTs whose limit order generation processes were properly estimated according to the similarity of their excitation mechanisms (the branching ratios), and describe how each group of HFTs placed buy–sell limit orders and provided liquidity to the market. As defined in Equation (5) in Section 3.3, the branching ratio, $\rho\_{n,m}$, is an absolute value that represents the expected number of occurrences of event *n* caused by an occurrence of event *m*. To evaluate the excitations of $\{t\_{TS,i}\}$ and $\{t\_{TB,i}\}$ relative to each HFT, we introduced normalized branching ratios $\bar{\rho}\_{TS,m}$ and $\bar{\rho}\_{TB,m}$, which are defined by the following equations so that each set sums to 1 over $m \in \mathcal{M}$:

$$\bar{\rho}\_{TS,m} \equiv \frac{\rho\_{TS,m}}{\sum\_{i \in \mathcal{M}} \rho\_{TS,i}} \tag{11a}$$

$$\bar{\rho}\_{TB,m} \equiv \frac{\rho\_{TB,m}}{\sum\_{i \in \mathcal{M}} \rho\_{TB,i}} \tag{11b}$$
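The normalization in Equations (11a) and (11b) can be sketched in a few lines of Python. This is only an illustration: the event labels `m1` to `m8`, the function names, and the raw branching ratios are made-up placeholders, not estimates from this study.

```python
def normalize_branching_ratios(rho):
    """Divide each raw branching ratio by the sum over all source events."""
    total = sum(rho.values())
    if total <= 0:
        raise ValueError("sum of branching ratios must be positive")
    return {m: value / total for m, value in rho.items()}


def feature_vector(rho_ts, rho_tb, events):
    """Concatenate the normalized TS and TB ratios into one 16-element vector."""
    norm_ts = normalize_branching_ratios(rho_ts)
    norm_tb = normalize_branching_ratios(rho_tb)
    return [norm_ts[m] for m in events] + [norm_tb[m] for m in events]


# Hypothetical labels for the eight event types and made-up raw ratios.
EVENTS = [f"m{k}" for k in range(1, 9)]
rho_ts = dict(zip(EVENTS, [0.30, 0.05, 0.02, 0.03, 0.04, 0.01, 0.03, 0.02]))
rho_tb = dict(zip(EVENTS, [0.04, 0.28, 0.03, 0.02, 0.01, 0.05, 0.04, 0.03]))

x = feature_vector(rho_ts, rho_tb, EVENTS)
print(len(x), round(sum(x[:8]), 6), round(sum(x[8:]), 6))
```

Because each half of the vector sums to 1, HFTs with very different overall activity levels become comparable in terms of which events excite them most.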

Because both $\{t\_{TS,i}\}$ and $\{t\_{TB,i}\}$ are 8-variable Hawkes processes, 16 normalized branching ratios were defined for each HFT. Figure 10 shows a dendrogram of the hierarchical clustering of the 104 HFTs using these 16 variables. For this hierarchical clustering, we used the Ward method [52], which at each step joins the pair of clusters whose merger produces the smallest increase in the within-cluster sum of squares. The vertical axis in Figure 10 represents the distance between clusters, measured as the increase in the sum of squares when clusters *A* and *B* are joined, and is defined by the following equation:

$$\Delta(A, B) = \sum\_{i \in A \cup B} ||\vec{\mathbf{x}}\_i - \vec{m}\_{A \cup B}||^2 - \sum\_{i \in A} ||\vec{\mathbf{x}}\_i - \vec{m}\_A||^2 - \sum\_{i \in B} ||\vec{\mathbf{x}}\_i - \vec{m}\_B||^2 \tag{12}$$

where $\|\cdot\|$ denotes the Euclidean norm, $\vec{\mathbf{x}}\_i$ is the vector of the 16 normalized branching ratios of HFT *i*, and $\vec{m}\_j$ is the center of cluster *j*.
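To make Equation (12) concrete, the sketch below evaluates the increase in the sum of squares directly and compares it with the well-known equivalent closed form, |A||B|/(|A|+|B|) times the squared distance between the two cluster centers. The function names and the 2-dimensional toy points are illustrative assumptions; the actual feature vectors have 16 components.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[k] for p in points) / len(points) for k in range(dim)]


def sum_of_squares(points):
    """Sum of squared Euclidean distances from each point to the centroid."""
    c = centroid(points)
    return sum(sum((p[k] - c[k]) ** 2 for k in range(len(c))) for p in points)


def ward_increase(a, b):
    """Increase in the sum of squares when clusters a and b are joined."""
    return sum_of_squares(a + b) - sum_of_squares(a) - sum_of_squares(b)


def ward_increase_closed_form(a, b):
    """Equivalent form: |A||B| / (|A| + |B|) * ||m_A - m_B||^2."""
    ca, cb = centroid(a), centroid(b)
    d2 = sum((u - v) ** 2 for u, v in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * d2


# Toy clusters in 2 dimensions.
cluster_a = [[0.0, 0.0], [2.0, 0.0]]
cluster_b = [[4.0, 3.0]]
print(ward_increase(cluster_a, cluster_b))              # 12.0
print(ward_increase_closed_form(cluster_a, cluster_b))  # 12.0, up to rounding
```

The closed form shows that the distance grows both with the separation of the cluster centers and with the sizes of the clusters being joined.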

Based on the distance between the clusters, we found it reasonable to categorize the HFTs into three groups with a threshold distance of around 3, as shown in Figure 10, and designated them as Group A, Group B, and Group C. There were 77 HFTs in Group A, 12 in Group B, and 15 in Group C. A lower threshold distance yields more clusters; however, we confirmed that the properties of any such smaller group are quite similar to those of one of these three groups in the graphical representation of the interaction network discussed below.
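Cutting a Ward dendrogram at a threshold distance can be sketched as a naive agglomerative loop that stops once the cheapest remaining merge exceeds the threshold. This is a minimal illustration, assuming the distance of Equation (12) in its equivalent closed form. The function names and the 2-dimensional toy points are hypothetical, and the threshold of 3 used on the toy data is not comparable to the threshold applied to the 16-dimensional vectors of the 104 HFTs.

```python
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[k] for p in points) / len(points) for k in range(dim)]


def ward_increase(a, b):
    """Increase in the sum of squares when a and b are joined (closed form)."""
    ca, cb = centroid(a), centroid(b)
    d2 = sum((u - v) ** 2 for u, v in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * d2


def cluster_by_threshold(points, threshold):
    """Merge the cheapest pair repeatedly until the next merge costs more than threshold."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest Ward distance.
        (i, j), best = min(
            (((i, j), ward_increase(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best > threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Three well-separated toy blobs in 2 dimensions.
toy_points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0),
    (10.0, 0.0), (10.0, 0.1),
]
groups = cluster_by_threshold(toy_points, threshold=3.0)
print(sorted(len(g) for g in groups))  # [2, 2, 3]
```

Lowering the threshold in this sketch splits the blobs into smaller clusters, which mirrors the observation that a lower threshold distance yields more groups.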

The remainder of this section explains the order events that excited the HFTs in each group to place buy–sell limit orders based on the estimated Hawkes parameters.

**Figure 10.** Dendrogram based on the Ward method of clustering for 104 HFTs that were successfully modeled by the Hawkes process. The vertical axis represents the distance between the clusters, as defined in Equation (12), and the horizontal axis shows the labels of the HFTs according to the order frequency (red: Group A with 77 HFTs; blue: Group B with 12 HFTs; yellow: Group C with 15 HFTs).

#### 4.2.1. Group A

Group A comprises 77 HFTs. Their total number of limit orders over the 5 days was approximately 850,000, which accounted for 62.5% of the total number of limit orders in the market. Figure 11 shows the quartiles and means of the 16 normalized branching ratios for these 77 HFTs, where (a) represents $\bar{\rho}\_{TS,m}$ and (b) represents $\bar{\rho}\_{TB,m}$. Figure 11a shows that the generation of sell limit orders by the HFTs in Group A was most excited by hit sell, which greatly exceeded the excitation from other events. Likewise, Figure 11b shows that their buy limit order generation was most excited by hit buy, which also greatly exceeded the excitation from other events.

**Figure 11.** Percentile plot of normalized branching ratios (**a**) $\bar{\rho}\_{TS,m}$ and (**b**) $\bar{\rho}\_{TB,m}$ for 77 HFTs in Group A. The vertical axis represents the normalized branching ratios by event *m*, and the horizontal axis represents element *m* ∈ M in both figures (top bar: 75th percentile; X symbol: median; bottom bar: 25th percentile; symbol: the mean).

Figure 12 illustrates the network graph of the buy and sell limit orders of the HFTs in Group A, along with all types of orders, using these normalized branching ratios. The size of the directed edges of the network is proportional to the mean value of the normalized branching ratio, e.g., edges directed from HS to TS and from HB to TB represent strong excitations.

**Figure 12.** Network graph of interaction between buy–sell limit orders of HFTs in Group A and all types of orders in the order book.

In addition, because the kernel function of the Hawkes process is an exponential function, the time constant, which measures the response speed to a single excitation, is given by $\beta\_{n,m}^{-1}$. The mean values of the estimated time constants for the HFTs in Group A are summarized in Table 3. They suggest that the reaction time to an event is approximately 0.1 s, which is reasonable for HFTs who trade at very high speeds.

**Table 3.** Mean values of the estimated time constant $\hat{\beta}\_{n,m}^{-1}$ (s) for the HFTs in Group A.
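To give a feel for what a time constant of about 0.1 s means, the sketch below assumes a kernel of the form `alpha * exp(-beta * t)`, where the symbol `alpha` and the function names are assumptions introduced only for illustration. Under that assumption, the share of the expected triggered orders that arrives within time `t` of the triggering event is `1 - exp(-t / tau)`, with `tau` equal to the inverse of `beta`.

```python
import math


def fraction_triggered_within(t, time_constant):
    """Share of the total excitation released within t seconds of an event."""
    return 1.0 - math.exp(-t / time_constant)


def half_life(time_constant):
    """Time after which the kernel has decayed to half of its initial value."""
    return time_constant * math.log(2.0)


tau = 0.1  # seconds, the order of magnitude reported for Group A
print(round(fraction_triggered_within(tau, tau), 3))      # 0.632
print(round(fraction_triggered_within(3 * tau, tau), 3))  # 0.95
print(round(half_life(tau), 4))                           # 0.0693
```

With a time constant of 0.1 s, roughly 63% of the triggered activity falls within 0.1 s and about 95% within 0.3 s of the triggering event.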


#### 4.2.2. Group B

Group B comprises 12 HFTs. Their total number of limit orders over the 5 days was approximately 174,000, which accounted for 12.7% of the total number of limit orders in the market. Figure 13 shows the quartiles and means of the 16 normalized branching ratios for these 12 HFTs. Figure 13a shows that their sell limit order generation was excited by sell limit and cancel buy, and Figure 13b shows that their buy limit order generation was excited by buy limit and cancel sell. Unlike Group A, they did not react to execution events but were excited by the generation and cancellation of limit orders in the order book.

**Figure 13.** Percentile plot of normalized branching ratios (**a**) $\bar{\rho}\_{TS,m}$ and (**b**) $\bar{\rho}\_{TB,m}$ for 12 HFTs in Group B. The vertical and horizontal axes are the same as those in Figure 11 (top bar: 75th percentile; X symbol: median; bottom bar: 25th percentile; symbol: the mean).

Figure 14 shows the interaction network of the buy and sell limit orders of the HFTs in Group B, along with all types of orders, using the mean of the normalized branching ratios, as in Figure 12.

**Figure 14.** Network graph of interaction between the buy–sell limit orders of HFTs in Group B and all types of orders in the order book.

The mean values of the time constants for each event are summarized in Table 4. As in the case of HFTs in Group A, these values suggest that the reaction time to events is measured in milliseconds.


**Table 4.** Sample means of the 16 time constants, $\beta\_{n,m}^{-1}$ (s), for HFTs in Group B.

#### 4.2.3. Group C

Group C comprises 15 HFTs. Their total number of limit orders over the 5 days was approximately 95,000, which accounted for 6.9% of the total number of limit orders in the market. Figure 15a shows that the sell limit order generation of the HFTs in Group C was most strongly excited by their own buy limit, and was also excited by sell limit and cancel buy. On the other hand, Figure 15b shows that their buy limit order generation was most strongly excited by their own sell limit, but was also excited by buy limit and cancel buy.

**Figure 15.** Percentile plot of normalized branching ratios (**a**) $\bar{\rho}\_{TS,m}$ and (**b**) $\bar{\rho}\_{TB,m}$ for 15 HFTs in Group C. The vertical and horizontal axes are the same as those in Figures 11 and 13 (top bar: 75th percentile; X symbol: median; bottom bar: 25th percentile; symbol: the mean).

Figure 16 shows the interaction network of the buy–sell limit orders of the HFTs in Group C, along with all types of orders, as in Figures 12 and 14. The HFTs' sell/buy limit orders interacted with each other.

**Figure 16.** Network graph of interaction between buy–sell limit orders of HFTs in Group C and all types of orders in the order book.

The mean values of the time constants for each event are summarized in Table 5. The time constants of the excitations from TS to TB and from TB to TS were larger than 10 s, suggesting that the excitations were sustained for a very long time compared to those previously observed.


**Table 5.** Sample means of the 16 time constants, $\beta\_{n,m}^{-1}$ (s), for HFTs in Group C.
