Article

Analysis of Longitudinal Binomial Data with Positive Association between the Number of Successes and the Number of Failures: An Application to Stock Instability Study

1
Pan-Asia Business School, Yunnan Normal University, Kunming 650031, China
2
Department of Mathematics and Statistics, University of New Brunswick, Fredericton, NB E3B 5A3, Canada
*
Author to whom correspondence should be addressed.
Entropy 2022, 24(10), 1472; https://doi.org/10.3390/e24101472
Submission received: 22 September 2022 / Revised: 13 October 2022 / Accepted: 14 October 2022 / Published: 16 October 2022
(This article belongs to the Special Issue Statistical Methods for Modeling High-Dimensional and Complex Data)

Abstract

Numerous methods have been developed for longitudinal binomial data in the literature. These traditional methods are reasonable for longitudinal binomial data with a negative association between the number of successes and the number of failures over time; however, a positive association may occur between the number of successes and the number of failures over time in some behaviour, economic, disease aggregation and toxicological studies as the numbers of trials are often random. In this paper, we propose a joint Poisson mixed modelling approach to longitudinal binomial data with a positive association between longitudinal counts of successes and longitudinal counts of failures. This approach can accommodate both a random and zero number of trials. It can also accommodate overdispersion and zero inflation in the number of successes and the number of failures. An optimal estimation method for our model has been developed using the orthodox best linear unbiased predictors. Our approach not only provides robust inference against misspecified random effects distributions, but also consolidates the subject-specific and population-averaged inferences. The usefulness of our approach is illustrated with an analysis of quarterly bivariate count data of stock daily limit-ups and limit-downs.

1. Introduction

There is continuing interest in developing mixed models for longitudinal binomial data [1,2]; however, the existing methods generally assume a fixed number of trials, and thus imply a negative association between the number of successes and the number of failures. When the number of trials is fixed, an increase in the number of successes implies a decrease in the number of failures, and vice versa. In practice, however, the number of successes and the number of failures are often positively associated when the number of trials is random. That is, the number of successes and the number of failures can increase or decrease simultaneously. This is illustrated with a hypothetical example in Table 1. Clearly, the sample correlation between the two counts is 1, although the probabilities of success and failure are both 0.5. This is because the two outcomes can increase together if the totals are not fixed. On the other hand, when the totals are fixed, an increase in one outcome comes at the expense of the other; therefore, they are negatively associated. Thus, the inference methods for the case of fixed cluster sizes are no longer valid for the case of varying cluster sizes since the key assumption of negative association has been violated [3].
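The contrast between random and fixed totals can be checked numerically. The following minimal Python sketch uses made-up counts (they are illustrative assumptions, not the values of Table 1): with totals that vary and a success proportion of 0.5, successes and failures move together, whereas with a fixed total they move in opposite directions.

```python
import numpy as np

# Hypothetical totals per period; illustrative only, not the paper's Table 1.
random_totals = np.array([2, 4, 6, 8, 10])
successes_random = random_totals // 2            # success proportion 0.5
failures_random = random_totals - successes_random

# Fixed totals: every period has 10 trials and only the split varies.
fixed_total = 10
successes_fixed = np.array([3, 4, 5, 6, 7])
failures_fixed = fixed_total - successes_fixed

# Sample Pearson correlations between successes and failures.
corr_random = np.corrcoef(successes_random, failures_random)[0, 1]
corr_fixed = np.corrcoef(successes_fixed, failures_fixed)[0, 1]
print(corr_random, corr_fixed)
```

Up to floating-point error, the first correlation is 1 and the second is -1, which is the sign reversal described above.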
In the analysis of clustered binary data, the importance of accounting for randomness in the cluster sizes has long been recognized in developmental toxicity, disease aggregation and behaviour studies [4,5]; therefore, varying cluster sizes are likely to occur if such data are collected longitudinally in these areas. Hence, mixed models that can accommodate a random number of trials are also needed for longitudinal binomial data with positively associated numbers of successes and failures. In addition, traditional approaches to mixed models for longitudinal binomial data usually rely on the specification of particular random effects distributions; therefore, concerns have been raised over the validity of the assumed random effects distributions and the robustness of such inferences [6].
Our work is motivated by the daily price limit policy in the Chinese stock market. Quarterly bivariate counts of stock daily limit-ups and limit-downs were collected over 49 seasons from the second quarter of 2007 to the second quarter of 2019 for 60 randomly selected stocks. The quarterly counts of stock daily limit-ups and the quarterly counts of stock daily limit-downs were positively correlated over time within every stock; the Pearson correlation coefficient ranged from 0.11 to 0.92 with an average of 0.54 for these 60 stocks. The mixed models for longitudinal binomial data in the literature imply a negative association between the number of successes and the number of failures and are therefore inappropriate for the analysis of quarterly bivariate counts of stock daily limit-ups and limit-downs for these stocks. These binomial mixed models also imply that the number of trials is always positive; however, more than 58% of the corresponding numbers of trials for our quarterly bivariate counts of stock daily limit-ups and limit-downs were exactly zero. Research on stock price limits mainly focuses on their impact on stock volatility [7,8,9]. It is generally believed that the price limit policy strengthened the herding effect of the market and made stock prices a self-exciting process. This is indeed one of the characteristics of the Chinese stock market, that is, stocks are prone to frequent limit-ups and limit-downs. For market managers, limit-ups and limit-downs bring a lack of liquidity during extreme market fluctuations, and the market is prone to systemic risks. For investors, the extreme volatility of stocks directly affects their investment decisions and investment returns; therefore, the instability of stocks is of great interest in the analysis of extreme price fluctuations but has not been studied so far.
In this paper, we propose a three-level joint Poisson mixed model for both longitudinal counts of successes and longitudinal counts of failures in longitudinal binomial data with varying numbers of trials over time. Without loss of generality, we describe the model here in terms of quarterly data of stock daily limit-ups and limit-downs to facilitate understanding. First, we introduce stock-specific random effects shared by both counts of stock daily limit-ups and limit-downs. The higher the stock-specific distribution-free random effect, the higher both quarterly counts of stock daily limit-ups and limit-downs; therefore, the stock-specific random effect characterizes the instability of the corresponding stock. Second, conditioning on stock-specific random effects, we introduce two sequences of autocorrelated distribution-free random effects, one for quarterly counts of stock daily limit-ups and the other for quarterly counts of stock daily limit-downs. Finally, given both stock-specific and the corresponding autocorrelated random effects, we assume that both quarterly counts of stock daily limit-ups and limit-downs follow Poisson distributions with time-dependent covariates. This joint Poisson mixed model can accommodate randomness and zeros in the total counts of quarterly stock daily limit-ups and limit-downs at each time point, and it implies a positive cross association between the quarterly counts of stock daily limit-ups and the quarterly counts of stock daily limit-downs. Following Ma and Jørgensen [10] and Ma et al. [5], we develop a model estimation based on the orthodox best linear unbiased predictors (BLUP) of random effects given the data. As our approach does not require the specification of any parametric distribution for random effects, our inference is robust against misspecified random effects distributions.
To the best of our knowledge, this is the first time a mixed model is developed for longitudinal binomial data where the number of successes and the number of failures are positively associated over time.
The rest of the paper is organized as follows. After introducing quarterly count data of stock daily limit-ups and limit-downs in Section 2, we propose a joint model for bivariate longitudinal counts and discuss its implied longitudinal binomial model in Section 3. In Section 4, we discuss the orthodox best linear unbiased predictors of random effects and model estimation. We analyze the quarterly stock price limits data in Section 5 and conclude in Section 6.

2. Quarterly Data of Stock Daily Limit-Ups and Limit-Downs

A unique daily price limit policy has been in place in the Chinese stock market since 13 December 1996. The purpose of the price limit policy is to reduce the volatility of prices by setting limits on how much each stock can rise or fall on a daily basis. The price can move within 10% of the previous day’s closing price, and a quotation that is outside the range will be invalid. These daily limits on rise or fall are called stock daily limit-ups and limit-downs.
To characterize the instability of the stocks, quarterly bivariate counts of stock daily limit-ups and limit-downs were collected over 49 seasons from the second quarter of 2007 to the second quarter of 2019 for 60 selected stocks. These 60 stocks were randomly selected from the CSI Small Cap 500 Index (CSI 500) in order to make the selected sample generally representative of the market and to make the conclusions of this research widely applicable. The data are available from http://www.sse.com.cn/ (accessed on 5 January 2020) and http://www.szse.cn/ (accessed on 5 January 2020). The parallel boxplots of this pair of quarterly counts of stock daily limit-ups and limit-downs are displayed in Figure 1 below. The quarterly counts of stock daily limit-ups range between 0 and 21, whereas the quarterly counts of stock daily limit-downs range between 0 and 18. All these counts are far below their ceiling of around 60 (the number of trading days in a quarter); thus, the ceiling is not a serious concern for our use of conditional Poisson distributions for the quarterly counts [11].
To study the relationship between quarterly bivariate counts of stock daily limit-ups and limit-downs and basic characteristics of stocks that are prone to rise and fall limits, we also collected information on the following three variables quarterly: price-to-earnings ratio (PE), price-to-book ratio (PB) and price-to-sales ratio (PS). These three variables are common and important growth indicators reflecting stock fundamentals [12,13,14]. Our analysis results are expected to help managers and investors in risk management and investment decision-making.

3. Joint Model for Bivariate Longitudinal Counts

3.1. The Model

Let $Y_{i1t}$ be the quarterly count of daily limit-ups and $Y_{i2t}$ the quarterly count of daily limit-downs, where $i = 1, \ldots, m = 60$ indexes the 60 stocks and $t = 1, \ldots, T = 49$ indexes the seasons under study, with $t = 1$ corresponding to the second quarter of 2007 and $t = 49$ to the second quarter of 2019. Let $N_{it} = Y_{i1t} + Y_{i2t}$ be the total number of limit-ups and limit-downs for the $i$th stock in the $t$th season. We model the quarterly counts of daily limit-ups and limit-downs jointly through a three-level Poisson mixed model as follows.
Assumption 1.
At the top level, we introduce a stock-specific random effect $U_i$ for each stock, $i = 1, \ldots, m$. We assume that the $U_i$’s are positive, independently and identically distributed with mean 1 and variance $\sigma^2$.
Assumption 2.
At the second level, we introduce two sequences of random effects for quarterly counts of daily limit-ups and limit-downs of each stock, $V_{i1t}$ for the count of daily limit-ups $Y_{i1t}$ and $V_{i2t}$ for the count of daily limit-downs $Y_{i2t}$. Denote $U = (U_1, U_2, \ldots, U_m)^\top$. Conditioning on $U$, we assume that these stock × season random effects are positive and identically distributed with
$$E(V_{i1t} \mid U) = E(V_{i2t} \mid U) = U_i$$
and
$$\mathrm{Cov}(V_{ijt}, V_{i'j't'} \mid U) = \begin{cases} \tau_j^2 \, \rho_j(t, t') \, U_i & \text{if } i = i' \text{ and } j = j', \\ 0 & \text{if } i \neq i' \text{ or } j \neq j', \end{cases}$$
with $\rho_j(t, t') = 1$ for $t = t'$, $j, j' = 1, 2$. This formulation of correlations of the random effects is general as it encompasses various correlation structures including unstructured, m-dependence, Toeplitz, exchangeable, etc. In this paper, we focus on the first-order autoregression (AR(1)), in which $\mathrm{Cor}(V_{ijt}, V_{ijt'}) = \rho_j^{|t - t'|}$.
Assumption 3.
At the response level, we assume the quarterly counts of limit-ups and limit-downs are conditionally Poisson distributed, given the stock-specific random effects and the stock × season random effects. Denote $V = (V_1^\top, V_2^\top, \ldots, V_m^\top)^\top$, where $V_i = (V_{i1}^\top, V_{i2}^\top)^\top$, $V_{ij} = (V_{ij1}, V_{ij2}, \ldots, V_{ijT})^\top$ and $W = (U^\top, V^\top)^\top$. Specifically, we assume that counts $Y_{i1t}$’s and $Y_{i2t}$’s are conditionally independent and Poisson distributed with
$$Y_{i1t} \mid W \sim \mathrm{Poisson}\big(V_{i1t} \exp(z_{it}^\top \alpha + x_{it}^\top \beta)\big), \qquad Y_{i2t} \mid W \sim \mathrm{Poisson}\big(V_{i2t} \exp(z_{it}^\top \alpha)\big), \tag{1}$$
where $x_{it} = (x_{it1}, x_{it2}, \ldots, x_{itp})^\top$ and $z_{it} = (z_{it1}, z_{it2}, \ldots, z_{itp})^\top$ are known vectors of covariates, and $\alpha$ and $\beta$ are unknown regression coefficients. Here, we employ the same strategy as in Ma et al. [5] to incorporate covariates in the model. In general, $z$ and $x$ can be different, but they are the same in the analysis of this paper:
$$z_{it}^\top \alpha = \alpha_0 + \mathrm{PE}\,\alpha_1 + \mathrm{PB}\,\alpha_2 + \mathrm{PS}\,\alpha_3 \quad \text{and} \quad x_{it}^\top \beta = \beta_0 + \mathrm{PE}\,\beta_1 + \mathrm{PB}\,\beta_2 + \mathrm{PS}\,\beta_3 .$$
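The three levels of the model can be simulated to see the positive association it implies. The sketch below is only illustrative: the gamma distributions for the random effects, the parameter values, the constant means and the independence of the stock × season effects over time (that is, $\rho_j = 0$) are all assumptions made here for simplicity, since the model itself fixes only the first two moments.

```python
import numpy as np

rng = np.random.default_rng(0)

m, T = 2000, 10                 # hypothetical numbers of stocks and quarters
sigma2, tau2 = 0.5, 0.3         # hypothetical dispersion parameters
mu_up, mu_down = 2.0, 1.5       # hypothetical values of exp(z'a + x'b) and exp(z'a)

# Level 1: stock-specific effects with mean 1 and variance sigma2.
# The gamma law is an assumption; the model only specifies the moments.
U = rng.gamma(shape=1.0 / sigma2, scale=sigma2, size=m)

def draw_v():
    """Level 2: effects with E(V | U) = U and Var(V | U) = tau2 * U (rho_j = 0)."""
    shape = np.repeat(U[:, None], T, axis=1) / tau2
    return rng.gamma(shape=shape, scale=tau2)

V_up, V_down = draw_v(), draw_v()

# Level 3: conditionally independent Poisson counts.
Y_up = rng.poisson(V_up * mu_up)
Y_down = rng.poisson(V_down * mu_down)

cross_corr = np.corrcoef(Y_up.ravel(), Y_down.ravel())[0, 1]
share_zero_trials = np.mean((Y_up + Y_down) == 0)
print(Y_up.mean(), Y_down.mean(), cross_corr, share_zero_trials)
```

In such a simulation the sample means stay close to `mu_up` and `mu_down`, the cross correlation is positive because both counts share `U`, and some stock-quarters have zero trials.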
Remarks. (1) The proposed joint Poisson mixed model can accommodate both a random and zero number of trials. From Equation (1), given all the random effects, the total number of counts $N_{it}$ is also conditionally independent and Poisson distributed,
$$N_{it} \mid W \sim \mathrm{Poisson}\big(V_{i1t} \exp(z_{it}^\top \alpha + x_{it}^\top \beta) + V_{i2t} \exp(z_{it}^\top \alpha)\big). \tag{2}$$
Clearly, $N_{it}$ can take a value of zero with positive probability; thus, our model can handle the case in which both the quarterly counts of limit-ups and limit-downs are zero. This is an advantage over traditional logistic mixed models for the number of successes, in which the portion of data with zero trials has to be excluded from the analysis.
(2) As each of the pair of Poisson mixed models is a generalization of a negative binomial model, our model can accommodate overdispersion and zero inflation.
(3) In Assumptions 1 and 2 above, we only assume the mean and variance structures of the random effects, without specifying any parametric form for their distributions. Furthermore, our estimation method to be discussed in the next section requires only these first two moments of the random effects. Thus, our model is robust against misspecified random effects distributions.
(4) Our model consolidates subject-specific and population-averaged inferences for the number of successes and for the number of failures. Under the model setup, the marginal means of the counts are
$$E(Y_{i1t}) = \exp(z_{it}^\top \alpha + x_{it}^\top \beta) \quad \text{and} \quad E(Y_{i2t}) = \exp(z_{it}^\top \alpha). \tag{3}$$
Comparing Equations (1) and (3), it is clear that the regression parameters α and β can be interpreted marginally the same way as conditionally.
(5) The number of successes $Y_{i1t}$ is conditionally binomial, given the number of trials and the random effects:
$$Y_{i1t} \mid (N_{it}, W) \sim \mathrm{binomial}(N_{it}, p_{it}), \tag{4}$$
where
$$p_{it} = \frac{V_{i1t} \exp(x_{it}^\top \beta)}{V_{i1t} \exp(x_{it}^\top \beta) + V_{i2t}} .$$
Thus, the binomial probability $p_{it}$ is linear in the regression parameters $\beta$ under the logit link as follows:
$$\mathrm{logit}(p_{it}) = \log\frac{V_{i1t}}{V_{i2t}} + x_{it}^\top \beta . \tag{5}$$
Note that α does not appear in Equation (5); therefore, it is auxiliary in this induced binomial model.
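The cancellation of $\alpha$ in the induced binomial model can be verified with a few lines of Python. This is a minimal sketch with hypothetical values for the random effects and linear predictors; the function name `binomial_probability` is introduced here for illustration only. It computes the success probability as the limit-up rate over the total rate from Equation (1) and compares its logit with Equation (5).

```python
import math

def binomial_probability(v_up, v_down, z_alpha, x_beta):
    """Conditional success probability: limit-up rate over the total rate."""
    rate_up = v_up * math.exp(z_alpha + x_beta)
    rate_down = v_down * math.exp(z_alpha)
    return rate_up / (rate_up + rate_down)

# Hypothetical random effects and linear predictor x'beta.
v_up, v_down, x_beta = 1.4, 0.7, 0.25

# Two very different values of z'alpha give the same probability.
p_a = binomial_probability(v_up, v_down, z_alpha=-1.0, x_beta=x_beta)
p_b = binomial_probability(v_up, v_down, z_alpha=2.0, x_beta=x_beta)

# The logit of p matches log(V1 / V2) + x'beta.
logit_p = math.log(p_a / (1.0 - p_a))
logit_formula = math.log(v_up / v_down) + x_beta
print(p_a, p_b, logit_p, logit_formula)
```

Because $\exp(z_{it}^\top \alpha)$ multiplies both rates, it drops out of the ratio, which is why only $\beta$ enters the binomial proportion.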

3.2. Covariance Structure

We now give the moment structures of the random effects and responses which will be used in the model estimation; these can be obtained after some algebraic calculations by the method of conditioning on random effects. For ease of programming, we present some results in matrix forms.
In addition to the vectors introduced so far, let $Y_i = (Y_{i1}^\top, Y_{i2}^\top)^\top$ where $Y_{i1} = (Y_{i11}, Y_{i12}, \ldots, Y_{i1T})^\top$ and $Y_{i2} = (Y_{i21}, Y_{i22}, \ldots, Y_{i2T})^\top$. Similarly, let $\mu_i = (\mu_{i1}^\top, \mu_{i2}^\top)^\top$ where $\mu_{i1} = (\mu_{i11}, \mu_{i12}, \ldots, \mu_{i1T})^\top$, $\mu_{i2} = (\mu_{i21}, \mu_{i22}, \ldots, \mu_{i2T})^\top$, $\mu_{i1t} = \exp(z_{it}^\top \alpha + x_{it}^\top \beta)$ and $\mu_{i2t} = \exp(z_{it}^\top \alpha)$. The means of the random effects and the responses are
$$E(U_i) = 1, \quad E(V_i) = \mathbf{1}, \quad E(Y_i) = \mu_i, \tag{6}$$
respectively, where $\mathbf{1}$ is a vector of ones. The variance of $V_i$ is
$$\mathrm{Var}(V_i) = \sigma^2 J + \begin{pmatrix} \tau_1^2 R_1 & 0 \\ 0 & \tau_2^2 R_2 \end{pmatrix}, \tag{7}$$
where $J$ is the $2T \times 2T$ matrix of ones and $R_j$ is the correlation matrix of $V_{ij}$, $j = 1, 2$. The variance of $Y_i$ is
$$\mathrm{Var}(Y_i) = \mathrm{diag}(\mu_i) + \mathrm{diag}(\mu_i) \, \mathrm{Var}(V_i) \, \mathrm{diag}(\mu_i), \tag{8}$$
where $\mathrm{diag}(\mu_i)$ is the diagonal matrix of $\mu_i$. The covariances between the random effects and the responses are
$$\mathrm{Cov}(U_i, Y_i) = \sigma^2 \mu_i^\top, \quad \mathrm{Cov}(V_i, Y_i) = \mathrm{Var}(V_i) \, \mathrm{diag}(\mu_i), \tag{9}$$
respectively.
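Since these moments are stated in matrix form for ease of programming, they translate almost directly into NumPy. The following is a minimal sketch for a single stock under the AR(1) structure; the function names and the numerical values are hypothetical and chosen here for illustration.

```python
import numpy as np

def ar1_correlation(T, rho):
    """AR(1) correlation matrix with entries rho ** |t - t'|."""
    lags = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
    return rho ** lags

def moment_structure(mu_up, mu_down, sigma2, tau2, rho):
    """Return Var(V_i), Var(Y_i), Cov(U_i, Y_i) and Cov(V_i, Y_i) for one stock.

    mu_up, mu_down: length-T marginal means of limit-ups and limit-downs.
    tau2, rho: pairs of parameters (limit-ups, limit-downs).
    """
    T = len(mu_up)
    mu = np.concatenate([mu_up, mu_down])
    D = np.diag(mu)
    var_v = sigma2 * np.ones((2 * T, 2 * T))             # sigma^2 J
    var_v[:T, :T] += tau2[0] * ar1_correlation(T, rho[0])
    var_v[T:, T:] += tau2[1] * ar1_correlation(T, rho[1])
    var_y = D + D @ var_v @ D
    cov_uy = sigma2 * mu
    cov_vy = var_v @ D
    return var_v, var_y, cov_uy, cov_vy

# Hypothetical inputs for T = 4 quarters.
T = 4
mu_up, mu_down = np.full(T, 2.0), np.full(T, 1.5)
var_v, var_y, cov_uy, cov_vy = moment_structure(
    mu_up, mu_down, sigma2=0.5, tau2=(0.3, 0.2), rho=(0.6, 0.4))
print(var_v.shape, var_y.shape)
```

The off-diagonal blocks of `var_v` equal `sigma2`, which is the only source of cross-covariance between limit-ups and limit-downs in this construction.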

4. Model Estimation

Similar to Ma and Jørgensen [10], we adopt an iterative EM-like algorithm for the model estimation. When updating one component in an iteration, which can be a vector of random effects, the regression parameters or the dispersion parameters, we keep the other unknowns at their current values. Thus, in the subsections below, we treat random effects and/or parameters as known except the ones under discussion.

4.1. Prediction of Random Effects

Let the inverse of $\mathrm{Var}(Y_i)$ be denoted by $\mathrm{Var}^{-1}(Y_i)$, $i = 1, 2, \ldots, m$. The values of the random effects are updated by the orthodox BLUPs of the random effects as follows:
$$\hat{U}_i = E(U_i) + \mathrm{Cov}(U_i, Y_i) \, \mathrm{Var}^{-1}(Y_i) \, [Y_i - E(Y_i)], \tag{10}$$
and
$$\hat{V}_i = E(V_i) + \mathrm{Cov}(V_i, Y_i) \, \mathrm{Var}^{-1}(Y_i) \, [Y_i - E(Y_i)], \tag{11}$$
where the terms in the equations are given in Equations (6)–(9). As pointed out in [10], the orthodox BLUPs minimize the mean squared distance between the random effects and their predictors within the class of linear functions of the responses. The estimating equations for the regression and random effects parameters can then be constructed based on these predictors.
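The two predictors can be sketched in NumPy for a single stock. The code below is an illustration under assumed inputs (the means, dispersion values and the function name `orthodox_blup` are hypothetical); it builds the response variance from the moment structure of the previous section and applies the linear predictor formulas.

```python
import numpy as np

def orthodox_blup(y, mu, var_v, sigma2):
    """Predict U_i and V_i for one stock from its stacked counts y (length 2T)."""
    D = np.diag(mu)
    var_y = D + D @ var_v @ D
    weighted_resid = np.linalg.solve(var_y, y - mu)   # Var^{-1}(Y_i) [Y_i - E(Y_i)]
    u_hat = 1.0 + sigma2 * mu @ weighted_resid        # E(U_i) + Cov(U_i, Y_i) ...
    v_hat = np.ones_like(mu) + var_v @ D @ weighted_resid
    return u_hat, v_hat

# Hypothetical moments for T = 4 quarters with AR(1) correlation.
T = 4
sigma2, tau2, rho = 0.5, 0.3, 0.6
mu = np.array([2.0, 2.5, 1.8, 2.2, 1.0, 1.4, 0.9, 1.1])
lags = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
var_v = sigma2 * np.ones((2 * T, 2 * T))
var_v[:T, :T] += tau2 * rho ** lags
var_v[T:, T:] += tau2 * rho ** lags

# Counts equal to their means give predictions equal to the prior means.
u_at_mean, v_at_mean = orthodox_blup(mu, mu, var_v, sigma2)
# Counts at twice their means push the stock-specific prediction above 1.
u_busy, v_busy = orthodox_blup(2.0 * mu, mu, var_v, sigma2)
print(u_at_mean, u_busy)
```

A stock whose counts exceed their expected values in every quarter therefore receives a predicted stock-specific effect above 1, in line with its interpretation as instability.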

4.2. Estimation of Regression Parameters

As in Ma et al. [5], we may rewrite Equation (1) into a single Poisson mixed model with regression parameters $\gamma = (\alpha^\top, \beta^\top)^\top$ as follows:
$$Y_{ijt} \mid W \sim \mathrm{Poisson}\big(V_{ijt} \exp(x_{ijt}^\top \gamma)\big), \tag{12}$$
where $x_{i1t} = (z_{it}^\top, x_{it}^\top)^\top$ and $x_{i2t} = (z_{it}^\top, 0^\top)^\top$. Thus, we may adapt the orthodox BLUP approach in [10] to our model.
We first differentiate the partially observed “joint” log-likelihood for the data and the random effects with respect to the regression parameters γ to obtain the partially observed “joint” score function. We then have an unbiased estimating equation for γ below, by replacing the random effects with their corresponding orthodox BLUP predictors:
$$\psi(\gamma) = \sum_{i=1}^{m} \sum_{j=1}^{2} \sum_{t=1}^{T} x_{ijt} \big[ y_{ijt} - \hat{V}_{ijt}(\gamma) \, \mu_{ijt}(\gamma) \big] = 0 . \tag{13}$$
Following Ma and Jørgensen [10], Equation (13) can be solved iteratively using the Newton scoring algorithm [15] with $\gamma$ being updated as follows:
$$\gamma^{*} = \gamma - S^{-1}(\gamma) \, \psi(\gamma) \tag{14}$$
with the explicit expression of the sensitivity matrix given by
$$S(\gamma) = -\sum_{i=1}^{m} X_i^\top \, \mathrm{diag}(\mu_i) \, \mathrm{Var}^{-1}(Y_i) \, \mathrm{diag}(\mu_i) \, X_i ,$$
where $X_i$ is the design matrix formed by stacking the $x_{ijt}$’s, i.e., $X_i = (x_{i11}, \ldots, x_{i1T}, x_{i21}, \ldots, x_{i2T})^\top$.
The optimality results in Ma and Jørgensen [10] still hold, i.e., under mild regularity conditions, the solution to Equation (13) is consistent and asymptotically normal with asymptotic mean $\gamma$ and asymptotic variance given by the inverse of the negative sensitivity matrix, $-S^{-1}(\gamma)$.
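One step of the Newton scoring update can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the sensitivity matrix is the negative of the weighted cross-product of the design matrices, treats the predicted random effects as given inputs, and uses a hypothetical intercept-only design for one stock. The function name `scoring_step` and all numerical values are assumptions.

```python
import numpy as np

def scoring_step(gamma, X_list, y_list, v_hat_list, var_v_list):
    """One update gamma* = gamma - S^{-1}(gamma) psi(gamma), random effects held fixed."""
    psi = np.zeros_like(gamma)
    S = np.zeros((gamma.size, gamma.size))
    for X, y, v_hat, var_v in zip(X_list, y_list, v_hat_list, var_v_list):
        mu = np.exp(X @ gamma)
        D = np.diag(mu)
        var_y = D + D @ var_v @ D
        psi += X.T @ (y - v_hat * mu)                 # estimating function
        S -= X.T @ D @ np.linalg.solve(var_y, D @ X)  # sensitivity (negative definite)
    return gamma - np.linalg.solve(S, psi)

# Hypothetical intercept-only design for one stock and T = 3 quarters.
T = 3
X = np.vstack([np.tile([1.0, 1.0], (T, 1)),      # limit-up rows: (z_it, x_it)
               np.tile([1.0, 0.0], (T, 1))])     # limit-down rows: (z_it, 0)
gamma_true = np.array([np.log(1.5), np.log(4.0 / 3.0)])
y = np.exp(X @ gamma_true)                       # counts set equal to their means
var_v = 0.5 * np.ones((2 * T, 2 * T)) + 0.3 * np.eye(2 * T)

gamma_next = scoring_step(gamma_true, [X], [y], [np.ones(2 * T)], [var_v])
print(gamma_next)
```

In this toy setting the estimating function is zero at `gamma_true`, so the update leaves the parameters unchanged; in the full algorithm the predicted random effects would be recomputed between updates.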

4.3. Estimation of Random Effects Parameters

In this subsection, we present a moment approach to estimate the unknown random effects parameters $\sigma^2$, $\tau_j^2$ and $\rho_j$, $j = 1, 2$.
Following Ma and Jørgensen [10], the dispersion parameter $\sigma^2$ for the stock-specific random effects can be estimated in terms of their corresponding orthodox BLUPs $\hat{U}_i$’s with a bias correction. After some algebraic calculation, the iterative equation for updating $\sigma^2$ can be expressed as
$$\hat{\sigma}_r^2 = \frac{1}{m} \sum_{i=1}^{m} \big\{ (\hat{U}_i - 1)^2 + c_i \big\}, \tag{15}$$
where $c_i$ is a bias-correction term defined as
$$c_i = \hat{\sigma}_{r-1}^2 - \hat{\sigma}_{r-1}^4 \, \mu_i^\top \mathrm{Var}^{-1}(Y_i) \, \mu_i$$
with $\hat{\sigma}_{r-1}^2$ as the estimate from the previous iteration. Similarly, the iterative equations for estimating $\tau_1^2$ and $\tau_2^2$ are given as
$$\hat{\tau}_{j,r}^2 = \frac{1}{mT} \sum_{i=1}^{m} \sum_{t=1}^{T} \big\{ (\hat{V}_{ijt} - \hat{U}_i)^2 + d_{ijt} \big\}, \tag{16}$$
where the bias-correction term $d_{ijt}$ is expressed as
$$d_{ijt} = \hat{\tau}_{j,r-1}^2 - \sigma^4 \mu_i^\top \mathrm{Var}^{-1}(Y_i) \mu_i - \mathrm{Cov}(V_{ijt}, Y_i) \mathrm{Var}^{-1}(Y_i) \mathrm{Cov}(Y_i, V_{ijt}) + 2 \, \mathrm{Cov}(U_i, Y_i) \mathrm{Var}^{-1}(Y_i) \mathrm{Cov}(Y_i, V_{ijt}) .$$
For unstructured correlation structures, $\rho_{j,(t,t')}$ can be estimated using an adjusted Pearson estimator as
$$\hat{\rho}_{j,(t,t')} = \frac{\sum_{i=1}^{m} \big\{ (\hat{V}_{ijt} - \hat{U}_i)(\hat{V}_{ijt'} - \hat{U}_i) + b_{ij,(t,t')} \big\}}{\Big[ \sum_{i=1}^{m} \big\{ (\hat{V}_{ijt} - \hat{U}_i)^2 + d_{ijt} \big\} \sum_{i=1}^{m} \big\{ (\hat{V}_{ijt'} - \hat{U}_i)^2 + d_{ijt'} \big\} \Big]^{1/2}}, \tag{17}$$
where $b_{ij,(t,t')}$ is the correction term which can be simplified as
$$b_{ij,(t,t')} = \rho_{j,(t,t')} \tau_j^2 - \mathrm{Cov}(V_{ijt}, Y_i) \mathrm{Var}^{-1}(Y_i) \mathrm{Cov}(Y_i, V_{ijt'}) - \sigma^4 \mu_i^\top \mathrm{Var}^{-1}(Y_i) \mu_i + \sigma^2 \mu_i^\top \mathrm{Var}^{-1}(Y_i) \mathrm{Cov}(Y_i, V_{ijt}) + \sigma^2 \mu_i^\top \mathrm{Var}^{-1}(Y_i) \mathrm{Cov}(Y_i, V_{ijt'}) .$$
For various patterned correlations, we can obtain the patterned correlation matrix from Equation (17). To estimate $\rho_j$ under the AR(1) structure, it would be sufficient to estimate the lag-1 correlation ($\rho_j^1 = \rho_j$) only, which can be estimated as
$$\hat{\rho}_j = \frac{\sum_{i=1}^{m} \sum_{t=1}^{T-1} \big\{ (\hat{V}_{ijt} - \hat{U}_i)(\hat{V}_{ij(t+1)} - \hat{U}_i) + b_{ij,(t,t+1)} \big\}}{\Big[ \sum_{i=1}^{m} \sum_{t=1}^{T-1} \big\{ (\hat{V}_{ijt} - \hat{U}_i)^2 + d_{ijt} \big\} \sum_{i=1}^{m} \sum_{t=1}^{T-1} \big\{ (\hat{V}_{ij(t+1)} - \hat{U}_i)^2 + d_{ij,t+1} \big\} \Big]^{1/2}} . \tag{18}$$
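As an illustration of the bias-corrected moment updates, the update of $\sigma^2$ in Equation (15) can be sketched as below; the updates of $\tau_j^2$ and $\rho_j$ follow the same pattern with their own correction terms. The function name `update_sigma2` and the numerical inputs are hypothetical, and the predicted effects are set to 1 purely to isolate the correction term.

```python
import numpy as np

def update_sigma2(u_hat, mu_list, var_y_list, sigma2_prev):
    """Bias-corrected moment update of sigma^2 from the predicted U_i."""
    terms = []
    for u, mu, var_y in zip(u_hat, mu_list, var_y_list):
        quad = mu @ np.linalg.solve(var_y, mu)        # mu' Var^{-1}(Y_i) mu
        c_i = sigma2_prev - sigma2_prev ** 2 * quad   # bias-correction term
        terms.append((u - 1.0) ** 2 + c_i)
    return float(np.mean(terms))

# Hypothetical moments for two stocks sharing the same means (T = 4).
T = 4
sigma2_prev, tau2 = 0.5, 0.3
mu = np.array([2.0, 2.5, 1.8, 2.2, 1.0, 1.4, 0.9, 1.1])
D = np.diag(mu)
var_v = sigma2_prev * np.ones((2 * T, 2 * T)) + tau2 * np.eye(2 * T)
var_y = D + D @ var_v @ D

# With predictions exactly at 1, only the correction terms contribute.
sigma2_new = update_sigma2([1.0, 1.0], [mu, mu], [var_y, var_y], sigma2_prev)
print(sigma2_new)
```

The correction term is the prediction error variance of the linear predictor of $U_i$, so it lies strictly between 0 and the previous estimate; it compensates for the shrinkage of the predictors towards 1.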

4.4. Computational Procedures

To start the estimating algorithm, we need some sensible initial values for the unknown parameters. We obtained the initial values $\hat{\gamma}_0$ of the regression parameters by fitting a Poisson regression to Equation (12) ignoring the random effects. The initial value $\hat{\rho}_{j,0}$ of $\rho_j$ was taken to be the lag-1 sample correlation of the quarterly counts of daily limit-ups or limit-downs. Similarly, by treating appropriate averages of the counts as rough estimates of the random effects, we obtained the initial values of the dispersion parameters as the sample moments of the random effects. Specifically, the initial value of $\sigma^2$ was taken as
$$\hat{\sigma}_0^2 = \frac{1}{m} \sum_{i=1}^{m} \Big( \frac{1}{2T} \sum_{j=1}^{2} \sum_{t=1}^{T} Y_{ijt} - 1 \Big)^2$$
and the initial value of $\tau_j^2$ was
$$\hat{\tau}_{j,0}^2 = \frac{1}{T} \sum_{t=1}^{T} \Big( \frac{1}{m} \sum_{i=1}^{m} Y_{ijt} - 1 \Big)^2 .$$
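The two initial dispersion values are simple averages and can be computed directly from the array of counts. The sketch below assumes the counts are stored as an array of shape (m, 2, T), which is a layout chosen here for illustration; the function name `initial_dispersions` and the toy data are hypothetical.

```python
import numpy as np

def initial_dispersions(Y):
    """Initial sigma^2 and tau_j^2 from counts Y of shape (m, 2, T)."""
    stock_means = Y.mean(axis=(1, 2))              # (1 / 2T) sum_j sum_t Y_ijt
    sigma2_0 = np.mean((stock_means - 1.0) ** 2)
    quarter_means = Y.mean(axis=0)                 # (1 / m) sum_i Y_ijt, shape (2, T)
    tau2_0 = np.mean((quarter_means - 1.0) ** 2, axis=1)
    return sigma2_0, tau2_0

# Toy counts for m = 2 stocks and T = 2 quarters.
Y = np.array([[[0, 2], [1, 1]],
              [[2, 4], [3, 3]]])
sigma2_0, tau2_0 = initial_dispersions(Y)
print(sigma2_0, tau2_0)
```

For this toy array the stock averages are 1 and 3, giving an initial $\sigma^2$ of 2, and the quarter averages give initial $\tau_j^2$ values of 2 for limit-ups and 1 for limit-downs.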
The algorithm was iterated as follows.
  • Step 1: Initialize the parameters with $\gamma_0$, $\sigma_0^2$, $\tau_{j,0}^2$ and $\rho_{j,0}$ as described above.
  • Step 2: At the $r$th iteration,
    (1) Update regression parameters $\gamma$ by Equation (14),
    (2) Predict the random effects by their orthodox BLUPs given in Equations (10) and (11),
    (3) Update dispersion parameter $\sigma^2$ by Equation (15),
    (4) Update dispersion parameter $\tau_j^2$ by Equation (16),
    (5) Update correlation $\rho_j$ by Equation (18).
  • Step 3: Repeat Step 2 until the sum of absolute changes in the parameters is below a prespecified threshold, for example, $10^{-4}$ or $10^{-7}$.
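The stopping rule in Step 3 can be sketched as a generic loop. In the code below the function `iterate_until_converged` and the toy update are hypothetical stand-ins: in the actual procedure the `update` argument would perform the five sub-steps of Step 2 on the full parameter vector.

```python
import numpy as np

def iterate_until_converged(update, params, tol=1e-4, max_iter=500):
    """Repeat params <- update(params) until the summed absolute change is below tol."""
    params = np.asarray(params, dtype=float)
    for iteration in range(1, max_iter + 1):
        new_params = np.asarray(update(params), dtype=float)
        change = np.sum(np.abs(new_params - params))
        params = new_params
        if change < tol:
            return params, iteration
    raise RuntimeError("no convergence within max_iter iterations")

def toy_update(p):
    """Toy stand-in for Step 2: a contraction with fixed point (1, 2)."""
    return 0.5 * p + np.array([0.5, 1.0])

estimate, n_iter = iterate_until_converged(toy_update, [0.0, 0.0], tol=1e-7)
print(estimate, n_iter)
```

A cap on the number of iterations is added here as a safeguard; it is a design choice of this sketch rather than part of the procedure described above.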

5. Analysis of Quarterly Counts of Stock Daily Limit-Ups and Limit-Downs

We first present some descriptive statistics of the variables in Table 2 and Table 3. From Table 2, the variables PB and PS are highly dispersed; for computational stability, we divided all three predictors PE, PB and PS by 100 in our analysis. In Table 3, we observe that the counts of limit-ups and limit-downs are positively correlated ($r = 0.56$), and that the counts are also positively correlated with PE and PB, although these correlations are weaker.
The trace plots of all five variables, limit-ups, limit-downs, PE, PB and PS, are presented in Figure 2 for four stocks, one in each panel. There seems to be some evidence that PB and PS follow the same temporal pattern, whereas PE looks different from both PB and PS.
We fitted the proposed model to the stock instability data. The parameter estimation results are presented in Table 4.
The price-to-book ratio variable PB is defined as the stock price divided by the net asset value per share. PB represents the intrinsic value of the stock. The higher the PB is, the lower the intrinsic value of the stock is. In what follows, a starred coefficient denotes the effect on the log marginal mean of limit-ups, which under Equation (3) with $z_{it} = x_{it}$ equals $\beta_k^* = \alpha_k + \beta_k$; the reported estimates agree with this identity up to rounding. The estimated effect of PB on the quarterly count of daily limit-ups was $\beta_1^* = 0.0345$ and significant, with a p-value of $0.0200 < 0.05$, whereas the estimated effect of PB on the quarterly count of daily limit-downs was $\alpha_1 = 0.0308$ but insignificant, with a p-value of $0.2070 > 0.05$. Furthermore, the estimated effect of PB on the corresponding binomial proportion was $\beta_1 = 0.0038$, but highly insignificant, with a p-value of $0.8938$. In other words, only the quarterly count of daily limit-ups was affected by PB significantly, whereas neither the quarterly count of daily limit-downs nor the corresponding binomial proportion was affected by PB significantly.
The price-to-earnings ratio variable PE is defined as the price of a stock divided by the earnings per share. PE represents the valuation of the stock: the higher the PE, the higher the valuation of the stock. The estimated effects of PE on both quarterly counts of daily limit-ups and limit-downs were positive and highly significant. More specifically, the corresponding regression parameters for quarterly numbers of limit-ups and limit-downs were estimated as $\beta_2^* = 9.2711$ and $\alpha_2 = 5.9616$, both with p-values reported as $0.0000$ (that is, below $0.00005$). Furthermore, the estimated effect of PE on the binomial proportion was also positive and highly significant with $\beta_2 = 3.3095$ and a p-value of $0.0080$. That is, with PB and PS being held constant, as PE increased, both quarterly counts of daily limit-ups and limit-downs tended to increase significantly, and so did the corresponding binomial proportion. In other words, as PE increased, the quarterly count of daily limit-ups tended to increase faster than the quarterly count of daily limit-downs.
The price-to-sales ratio variable PS is defined as the price of a stock divided by the sales per share. None of the estimated effects of PS on the quarterly count of daily limit-ups, the quarterly count of daily limit-downs and the corresponding binomial proportion were significant; these insignificant results for PS were also in agreement with the close-to-zero sample correlations with the counts of limit-ups ($r = 0.01$) and limit-downs ($r = 0.01$) in Table 3.
The stock-specific random effects helped characterize the positive association between quarterly counts of daily limit-ups and limit-downs. The higher the stock-specific random effects, the higher the frequencies of both quarterly counts of daily limit-ups and limit-downs; therefore, the stock-specific random effects reflected stock instabilities. Thus, we term stock-specific random effects as stock-specific instabilities hereafter. We present the parallel box plots of the predicted stock-specific random effects by industry in Figure 3 to assess stock-specific instabilities. First, the mining industry tended to have much higher stock instabilities than other industries. Second, the production and supply of electricity, construction, manufacturing, service and pharmaceutical industries tended to have relatively lower stock instabilities. Third, the IT industry had a wide range of stock instabilities from very low to very high, whereas the financial industry tended to have higher than average stock instabilities with much narrower range. Finally, the logistics industry tended to have very low stock instabilities with an exception.

6. Discussion

For longitudinal binomial data, we proposed a joint Poisson mixed model for the number of successes and the number of failures when their association was positive over time. Its usefulness and advantages were demonstrated through the analysis of quarterly data of stock daily limit-ups and limit-downs. Compared with traditional logistic mixed models, the proposed joint model could still assess covariate effects on the binomial proportions through the induced binomial model (see Equation (5)), while the influence of a random number of trials was incorporated. In addition, the ability to allow for zero number of trials was also an advantage as excluding this portion of data might lead to biased inferences on the binomial proportion. Furthermore, the predicted stock-specific random effects enabled the assessment of stock instabilities by industry.
In accordance with the proposed estimation method, we did not specify a parametric form for the random effects distributions. While this formulation enjoys the property of robustness against misspecified random effects distributions, it is in principle straightforward to assume some parametric distributions for the random effects and estimate the parameters using alternative methods. For example, the hierarchical nature of the model makes it easy to implement in Bayesian computing packages such as JAGS or OpenBUGS, which only require users to specify the model structure.
Finally, the proposed joint Poisson mixed model can be easily extended to longitudinal multinomial data by directly modelling counts of each category with a Poisson mixed model.

Author Contributions

Conceptualization, X.Z., G.Y. and R.M.; methodology, G.Y., R.M. and J.L.; software, G.Y. and J.L.; formal analysis, X.Z., G.Y. and R.M.; data curation, X.Z.; writing—original draft preparation, X.Z., G.Y. and R.M.; writing—review and editing, X.Z., G.Y. and R.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Sciences and Engineering Research Council of Canada (RGPIN-2020-04751 and RGPIN-2015-06124), the Doctoral Research Initiation Project Fund (2019BSXM11) and the Yunnan Philosophy and Social Science Planning Project Fund (QN2019009).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data are available from http://www.sse.com.cn/ (accessed on 5 January 2020) and http://www.szse.cn/ (accessed on 5 January 2020).

Acknowledgments

The authors thank the guest editor and two referees for their helpful comments that greatly improved the presentation of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhao, Y.; Xu, D.; Duan, X.; Du, J. A semiparametric Bayesian approach to binomial distribution logistic mixed-effects models for longitudinal data. J. Stat. Comput. Simul. 2022, 92, 1438–1456.
  2. Azimi, S.S.; Bahrami Samani, E.; Ganjali, M. Random effects models for analyzing mixed overdispersed binomial and normal longitudinal responses with application to kidney function data of cancer patients. Stat. Biopharm. Res. 2022, 14, 114–131.
  3. Yan, G.; Ma, R.; Hasan, M.T. A joint Poisson state-space modelling approach to analysis of binomial series with random cluster sizes. Int. J. Biostat. 2019, 15, 20180090.
  4. Chen, J.J.; Ahn, H. Marginal models with multiplicative variance components for overdispersed binomial data. J. Agric. Biol. Environ. Stat. 1997, 2, 440–450.
  5. Ma, R.; Jørgensen, B.; Willms, J.D. Clustered binary data with random cluster sizes: A dual Poisson modelling approach. Stat. Model. 2009, 9, 137–150.
  6. Zhang, H.; Xia, Y.; Chen, R.; Gunzler, D.; Tang, W.; Tu, X. Modeling longitudinal binomial responses: Implications from two dueling paradigms. J. Appl. Stat. 2011, 38, 2373–2390.
  7. Lee, J.H.; Xin, S.; Jin, Y. Price limit expansion and volatility: A theoretical perspective. Asia-Pac. J. Financ. Stud. 2021, 50, 271–287.
  8. Ma, Y.; Qian, W.; Luan, Z. Could increasing price limit reduce up limit herding? Evidence from China’s capital market reform. Financ. Res. Lett. 2020, 42, 101909.
  9. Ji, J.; Wang, D.; Xu, D.; Xu, C. Combining a self-exciting point process with the truncated generalized Pareto distribution: An extreme risk analysis under price limits. J. Empir. Financ. 2020, 57, 52–70.
  10. Ma, R.; Jørgensen, B. Nested generalized linear mixed models: An orthodox best linear unbiased predictor approach. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 2007, 69, 625–641.
  11. Henderson, R.; Shimakura, S. A serially correlated gamma frailty model for longitudinal count data. Biometrika 2003, 90, 355–366.
  12. Liem, P.F.; Basana, S.R. Price earnings ratio and stock return analysis (evidence from liquidity 45 stocks listed in Indonesia Stock Exchange). J. Manaj. Dan Kewirausahaan 2012, 14, 7–12.
  13. Andarint, G.R. Profitability, solvability, price earnings ratio and price to book ratio an evidence from Indonesian Manufacturing Companies. J. Din. Akunt. Dan Bisnis 2016, 2, 114–122.
  14. Shrinivas, T.; Sanjay, K. Value investing—A way to separating winners from losers. J. Commer. Manag. Thought 2010, 1, 183–198.
  15. Jørgensen, B.; Lundbye-Christensen, S.; Song, P.X.K.; Sun, L. A longitudinal study of emergency room visits and air pollution for Prince George, British Columbia. Stat. Med. 1996, 15, 823–836.
Figure 1. Parallel box plots of quarterly counts of stock daily limit-ups (top panel) and limit-downs (bottom panel). The quarters are numbered consecutively on the horizontal axis, with “1” for the second quarter of 2007 and “49” for the second quarter of 2019.
Figure 2. Trace plots of all the variables for four randomly selected stocks. The variables PE, PB and PS were standardized within each stock by subtracting the stock's sample mean and then dividing by its sample standard deviation.
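The within-stock scaling described in the Figure 2 caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `standardize_within_stock`, the stock labels and the PE values are hypothetical and are not taken from the paper or its data.

```python
from statistics import mean, stdev


def standardize_within_stock(series_by_stock):
    """Return z-scores computed separately from each stock's own series."""
    scaled = {}
    for stock, values in series_by_stock.items():
        centre = mean(values)
        spread = stdev(values)  # sample standard deviation (n - 1 denominator)
        scaled[stock] = [(v - centre) / spread for v in values]
    return scaled


# Hypothetical quarterly PE values for two stocks (not data from the paper).
pe_by_stock = {
    "stock_A": [12.0, 15.0, 18.0, 21.0],
    "stock_B": [150.0, 90.0, 120.0, 180.0],
}
scaled_pe = standardize_within_stock(pe_by_stock)
```

After this scaling each stock's series has sample mean 0 and sample standard deviation 1, so stocks with very different PE levels can be drawn on a common vertical scale.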
Figure 3. Parallel box plots of the predicted stock-specific random effects, grouped by industry.
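Table 1. A hypothetical example.

|         | Time 1 | Time 2 | Time 3 | Time 4 | Time 5 | Time 6 | Time 7 | Time 8 |
|---------|--------|--------|--------|--------|--------|--------|--------|--------|
| Success | 10     | 20     | 30     | 40     | 50     | 60     | 70     | 80     |
| Failure | 10     | 20     | 30     | 40     | 50     | 60     | 70     | 80     |
| Total   | 20     | 40     | 60     | 80     | 100    | 120    | 140    | 160    |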
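The point of the hypothetical example in Table 1 can be checked numerically: the success proportion is the same at every time point, yet the counts of successes and failures rise together because the number of trials changes over time. The following is a minimal Python sketch using the Table 1 values; the helper `pearson` and the variable names are illustrative and not part of the paper.

```python
from math import sqrt

# Counts from Table 1 (Time 1 to Time 8).
successes = [10, 20, 30, 40, 50, 60, 70, 80]
failures = [10, 20, 30, 40, 50, 60, 70, 80]


def pearson(x, y):
    """Pearson sample correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


totals = [s + f for s, f in zip(successes, failures)]
proportions = [s / t for s, t in zip(successes, totals)]
correlation = pearson(successes, failures)

print(totals)       # matches the "Total" row of Table 1
print(proportions)  # constant success proportion across time
print(correlation)  # perfect positive association between the two counts
```

Here the success proportion stays at 0.5 throughout, while the correlation between the two count series equals 1 up to floating-point rounding, which is the kind of positive association the abstract describes for data with a varying number of trials.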
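Table 2. Summary statistics of the variables.

|      | Min.     | 1st Qu. | Median | 3rd Qu. | Max.    | Mean  | SD     |
|------|----------|---------|--------|---------|---------|-------|--------|
| Up   | 0        | 0       | 0      | 1       | 21      | 0.76  | 1.55   |
| Down | 0        | 0       | 0      | 0       | 18      | 0.51  | 1.43   |
| PE   | −1449.53 | 17.09   | 33.29  | 70.87   | 3676.97 | 93.96 | 271.06 |
| PB   | −16.34   | 1.66    | 2.73   | 4.72    | 100.64  | 3.94  | 5.22   |
| PS   | 0.05     | 1.17    | 2.84   | 6.60    | 4629.43 | 13.98 | 173.92 |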
Table 3. Pearson’s sample correlations of the variables.

|      | Up    | Down  | PE   | PB   | PS    |
|------|-------|-------|------|------|-------|
| Up   | 1.00  | 0.56  | 0.12 | 0.21 | −0.01 |
| Down | 0.56  | 1.00  | 0.06 | 0.10 | −0.01 |
| PE   | 0.12  | 0.06  | 1.00 | 0.21 | 0.17  |
| PB   | 0.21  | 0.10  | 0.21 | 1.00 | 0.04  |
| PS   | −0.01 | −0.01 | 0.17 | 0.04 | 1.00  |
Table 4. Parameter estimates in the model for the analysis of quarterly counts of stock daily limit-ups and limit-downs. (The covariates PB, PE and PS were divided by 100 in the analysis.)

| Parameter                 | Estimate | Standard Error | p-Value |
|---------------------------|----------|----------------|---------|
| Intercept ($\beta_0$)     | 0.2063   | 0.0957         | 0.0311  |
| PB ($\beta_1$)            | 0.0038   | 0.0283         | 0.8939  |
| PE ($\beta_2$)            | 3.3095   | 1.2476         | 0.0080  |
| PS ($\beta_3$)            | 0.2416   | 0.4743         | 0.6106  |
| Intercept ($\beta_0^*$)   | −0.7445  | 0.0595         | 0.0000  |
| PB ($\beta_1^*$)          | 0.0345   | 0.0148         | 0.0200  |
| PE ($\beta_2^*$)          | 9.2709   | 0.6666         | 0.0000  |
| PS ($\beta_3^*$)          | −0.1230  | 0.1226         | 0.3156  |
| Intercept ($\alpha_0$)    | −0.9508  | 0.0817         | 0.0000  |
| PB ($\alpha_1$)           | 0.0308   | 0.0244         | 0.2071  |
| PE ($\alpha_2$)           | 5.9614   | 1.0633         | 0.0000  |
| PS ($\alpha_3$)           | −0.3646  | 0.4606         | 0.4286  |
| $\sigma^2$                | 0.0295   |                |         |
| $\tau_1^2$                | 1.9986   |                |         |
| $\tau_2^2$                | 5.4142   |                |         |
| $\rho_1$                  | 0.4157   |                |         |
| $\rho_2$                  | 0.2880   |                |         |
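The p-values in Table 4 appear consistent with two-sided Wald z-tests, in which the estimate divided by its standard error is referred to a standard normal distribution. That reading is an assumption made here for illustration, since the testing procedure is not stated alongside the table. A minimal Python sketch under that assumption, with a hypothetical helper `wald_p_value` and plain-text row labels, is:

```python
from math import erfc, sqrt


def wald_p_value(estimate, standard_error):
    """Two-sided p-value for z = estimate / SE under a standard normal.

    Assumes a Wald z-test; this is an illustrative reading of Table 4,
    not a procedure confirmed by the table itself.
    """
    z = estimate / standard_error
    # P(|Z| > |z|) for standard normal Z equals erfc(|z| / sqrt(2)).
    return erfc(abs(z) / sqrt(2.0))


# (estimate, standard error) pairs copied from three rows of Table 4.
rows = {
    "Intercept (beta_0)": (0.2063, 0.0957),
    "PE (beta_2)": (3.3095, 1.2476),
    "PB (beta_1*)": (0.0345, 0.0148),
}
p_values = {name: wald_p_value(est, se) for name, (est, se) in rows.items()}

for name, p in p_values.items():
    print(f"{name}: p = {p:.4f}")
```

For these rows the computed values land close to the reported 0.0311, 0.0080 and 0.0200; small differences are expected because the tabulated estimates and standard errors are themselves rounded to four decimals.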
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Zhang, X.; Yan, G.; Ma, R.; Li, J. Analysis of Longitudinal Binomial Data with Positive Association between the Number of Successes and the Number of Failures: An Application to Stock Instability Study. Entropy 2022, 24, 1472. https://doi.org/10.3390/e24101472
