Article

Spurious OLS Estimators of Detrending Method by Adding a Linear Trend in Difference-Stationary Processes—A Mathematical Proof and Its Verification by Simulation

1 Research Center for College Moral Education, 307C Shanzhai Building, Tsinghua University, Beijing 100084, China
2 CNRS (National Center for Scientific Research)—UMR 8174 Centre d'Économie de la Sorbonne, Maison des Sciences Économiques de l'Université de Paris 1 Panthéon-Sorbonne, 106-112 boulevard de l'Hôpital, 75013 Paris, France
* Author to whom correspondence should be addressed.
Mathematics 2020, 8(11), 1931; https://doi.org/10.3390/math8111931
Submission received: 2 September 2020 / Revised: 23 September 2020 / Accepted: 25 September 2020 / Published: 2 November 2020
(This article belongs to the Section Dynamical Systems)

Abstract

Adding a linear trend to regressions is a frequent detrending method in the economic literature. The traditional literature pointed out that, if the variable considered is a difference-stationary process, this method artificially creates a pseudo-periodicity in the residuals. In this paper, we further show that the real problem might be more serious, as the Ordinary Least Squares (OLS) estimators themselves of such a detrending method are spurious. The first part provides a mathematical proof, using Chebyshev's inequality and the Sims–Stock–Watson algorithm, that the OLS estimator of the trend converges toward zero in probability while the other OLS estimator diverges as the sample size tends to infinity. The second part designs Monte Carlo simulations with a sample size of 1,000,000 as an approximation of infinity. The seed values used are true random numbers generated by a hardware random number generator, in order to avoid the pseudo-randomness of random numbers given by software. The experiment is repeated 100 times and yields results consistent with the mathematical proof. The last part provides a brief discussion of detrending strategies.

1. Introducing the Problem

Traditional time-series models focused on stationary processes. As a matter of fact, Wold's (1954) [1] famous decomposition theorem indicated that any covariance-stationary process can be formulated as an infinite weighted sum of white-noise innovations. Thanks to this property of stationary processes, autoregressive moving average (ARMA) models, applying the method proposed by Box and Jenkins (1970) [2], gradually became the main modeling approach in time-series analysis. However, what happens when the series are not stationary?
By simulating two distinct random walks and regressing one on the other, Granger and Newbold (1974) [3] revealed the "spurious regression problem." The OLS estimate of the relationship between these two independent random walks should be zero, but the Monte Carlo simulations performed by the econometricians indicated OLS estimators significantly different from zero, along with very high R². They put forward the idea that such a regression is "spurious," because it makes no sense, even when it exhibits a very high R². Other authors, such as Phillips (1986) [4] or Davidson and MacKinnon (1993) [5], revealed similar results, leading to the following conclusions: (i) If the dependent variable is integrated of order 1, that is to say, I(1), then, under the null hypothesis, the residuals of the regression are also I(1). However, as the usual statistical tests of the OLS estimators (Fisher or Student tests) rest on the hypothesis of white-noise residuals, these tests are no longer valid when that assumption fails. (ii) Some asymptotic properties are no longer valid, such as those of the ADF statistics, which no longer obey the same laws as in the case of stationary processes. (iii) As the residuals are also I(1), the forecasts are not efficient—except when a cointegration relationship between the variables exists.
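To make the phenomenon concrete, the following minimal SAS sketch (our illustration, not Granger and Newbold's original program; the seed value is arbitrary) regresses one simulated random walk on another, independent one; the t-statistic on x is typically found highly "significant" despite the absence of any true relationship:

data spurious; *Two independent Gaussian random walks;
  call streaminit(12345); *Arbitrary seed, for reproducibility only;
  do t=1 to 5000;
    x + rand("normal"); *Sum statements accumulate the shocks into random walks;
    y + rand("normal");
    output;
  end;
run;
proc reg data=spurious;
  model y=x; *Spurious regression: inflated t-statistic and R-squared;
run;
quit;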
Here, we only examine time-series nonstationarity in mean, to be distinguished from nonstationarity in variance. Since Nelson and Plosser's (1982) [6] contribution, nonstationarity in mean can itself be classified into two categories: the first is related to trend-stationary (TS) processes, which present nonstationarity because of the deterministic trends characterizing their structure; the second is linked to difference-stationary (DS) processes, which contain a stochastic trend, or unit root. The processes considered can be made stationary, in the case of TS processes, by adding a deterministic trend to the regressions, or, in the case of DS processes, through difference operators, going from ARMA to autoregressive integrated moving average (ARIMA) models.
Unit root tests are generally used to identify the nature of a nonstationary process, whether deterministic or stochastic. For DS processes, in particular, a solution is offered within ARIMA models through difference operators, or by the cointegration methods respectively proposed by Engle and Granger (1987) [7] in a univariate approach and by Johansen (1991) [8] in a multivariate approach. Meanwhile, Stock (1987) [9] has demonstrated that, within such frameworks, the OLS estimators converge toward the real values if the variables are cointegrated, and the speed of convergence is faster than in the usual case (that is, $1/T$ instead of $1/\sqrt{T}$, where $T$ is the sample size).
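In the usual order notation (a standard formulation of this super-consistency result, added here for clarity), this means:
$$\hat{\theta}_T - \theta = O_p(T^{-1}) \ \text{in the cointegrated case,} \qquad \hat{\theta}_T - \theta = O_p(T^{-1/2}) \ \text{in the standard stationary case,}$$
where $\hat{\theta}_T$ denotes the OLS estimator based on $T$ observations.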
The cointegration theory achieved great success, but it has several inconveniences. It requires that all the variables be integrated of the same order; otherwise, the cointegration models cannot be applied. However, it is difficult to make sure that all series have the same order of integration in the economic model being tested. For example, GDP growth rates are often I(0), while some price indices can be I(2). Moreover, a supplementary difficulty in using difference operators to stabilize a DS process comes from the fact that variables in various orders of difference may not match the theoretical models that are employed.
It follows that the detrending method consisting of adding a linear trend to the regression has become common in empirical studies, due to its simplicity and its compatibility with a wide range of models. Many authors have chosen to add a linear trend to their regressions when they considered their dependent variables as nonstationary. Thus, detrending methods suited to TS processes are often applied even when the type of nonstationarity is uncertain. Nevertheless, such TS detrending methods cause specific problems when the series is in fact a DS process.

2. Literature Review

Studying the implications of treating TS processes as DS processes through the application of a difference operator, Chan, Hayya and Ord (1977) [10] found that the difference operator creates an artificial disturbance in the differenced series: indeed, the autocorrelation function equals $-1/2$ at lag $= \pm 1$. Later, Nelson and Kang (1981) [11] examined the reverse case, in other words, the effects of treating DS processes as TS processes by adding a linear trend into the regression, and stated that, when such a detrending method is used, the covariance of the residuals depends on the size of the sample and on time. By simulation, they showed that adding a linear trend into the regressions for DS processes generates a strong artificial autocorrelation of the residuals at the first lags, and thus induces a pseudo-periodicity, the corresponding spectral density function exhibiting a single peak at a period equal to 0.83 of the sample size. More precisely, treating TS processes as DS processes by a difference operator artificially creates a "short-run" cyclical movement in the series, while, conversely, a "long-run" cyclical movement is artificially generated when treating DS processes as TS processes (we speak of "short-run" since the disturbance happens at lag $= \pm 1$, and of "long-run" because the problem appears at a period corresponding to 0.83 of the sample size, that is, of almost the same order of magnitude as the latter).
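This residual pseudo-periodicity can be inspected directly. Below is a minimal SAS sketch (our illustration under the same assumptions; the seed value is arbitrary) that detrends a simulated random walk and displays the residual correlogram, whose strong, slowly decaying autocorrelation at the first lags reflects the artificial long-run cycle:

data dsprocess; *A pure random walk, i.e., the simplest DS process;
  call streaminit(54321);
  do t=1 to 10000;
    y + rand("normal");
    output;
  end;
run;
proc reg data=dsprocess;
  model y=t; *Detrending by a linear time trend;
  output out=detrended r=resid; *Store the residuals;
run;
quit;
proc arima data=detrended;
  identify var=resid nlag=24; *Residual ACF: strong artificial autocorrelation at low lags;
run;
quit;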
These fundamental studies have shown the importance of distinguishing between TS and DS processes, but remained concentrated on the artificial correlations of the residuals: none of them focused on the OLS estimators themselves. In addition, the samples used were relatively small. Following Nelson and Kang's (1981) [11] research line, we shall mathematically demonstrate that the OLS estimators of the detrending method by adding a linear trend in DS processes can be considered as spurious. As we shall see, the OLS estimator of the trend tends to zero when the sample size tends to infinity, while the other OLS estimator (the intercept) is divergent in the same situation. After this, we shall design a simulation series to be experimented on with a sample of a million observations. The seed values are given by the Rand Corporation (2001) [12]. As the simulation dataset contains more than 100 million points, we present the SAS program, together with the seed value table, in Appendix A and Appendix B, so that readers will be in a position to reproduce the simulations with the same code.

3. A Mathematical Proof

We suppose that $y_t$ is a DS process; for example, the random walk:
$$y_t = y_{t-1} + v_t$$
where $v_t$ is a white noise—considering a weak form of stationarity, that is, of the second order.
Let us apply a time-detrending method by adding a linear trend into the regression; that is to say, we have the model:
$$y_t = \alpha + \beta t + \varepsilon_t$$
where $\alpha$ and $\beta$ are coefficients to be estimated, $t$ is the time variable ($t = 1, 2, 3, \ldots, T$, with $T$ the sample size, or number of observations), and $\varepsilon_t$ is the innovation.
Suppose $X_t' = (1 \;\; t)$ and $\gamma' = (\alpha \;\; \beta)$, and let $b_T$ be the OLS estimator of $\gamma$ based on a sample of size $T$.
We get:
$$b_T = \begin{pmatrix} \hat{\alpha}_T \\ \hat{\beta}_T \end{pmatrix} = \left( \sum_{t=1}^{T} X_t X_t' \right)^{-1} \left( \sum_{t=1}^{T} X_t y_t \right)$$
For the first term:
$$\sum_{t=1}^{T} X_t X_t' = \begin{pmatrix} \sum 1 & \sum t \\ \sum t & \sum t^2 \end{pmatrix} = \begin{pmatrix} T & T(T+1)/2 \\ T(T+1)/2 & T(T+1)(2T+1)/6 \end{pmatrix}$$
As
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$
then:
$$\left( \sum_{t=1}^{T} X_t X_t' \right)^{-1} = \frac{1}{\frac{T^2(T+1)(2T+1)}{6} - \frac{T^2(T+1)^2}{4}} \begin{pmatrix} \frac{T(T+1)(2T+1)}{6} & -\frac{T(T+1)}{2} \\ -\frac{T(T+1)}{2} & T \end{pmatrix} = \frac{12}{T^2(T+1)(T-1)} \begin{pmatrix} \frac{T(T+1)(2T+1)}{6} & -\frac{T(T+1)}{2} \\ -\frac{T(T+1)}{2} & T \end{pmatrix} = \frac{2}{T(T-1)} \begin{pmatrix} 2T+1 & -3 \\ -3 & \frac{6}{T+1} \end{pmatrix}$$
Additionally, for the second term:
$$\sum_{t=1}^{T} X_t y_t = \begin{pmatrix} \sum_{t=1}^{T} y_t \\ \sum_{t=1}^{T} t y_t \end{pmatrix}$$
So:
$$\begin{pmatrix} \hat{\alpha}_T \\ \hat{\beta}_T \end{pmatrix} = \frac{2}{T(T-1)} \begin{pmatrix} 2T+1 & -3 \\ -3 & \frac{6}{T+1} \end{pmatrix} \begin{pmatrix} \sum y_t \\ \sum t y_t \end{pmatrix} = \frac{2}{T(T-1)} \begin{pmatrix} (2T+1) \sum y_t - 3 \sum t y_t \\ -3 \sum y_t + \frac{6}{T+1} \sum t y_t \end{pmatrix}$$
Thus, we have, respectively:
$$\begin{cases} \hat{\alpha}_T = \dfrac{2}{T(T-1)} \left[ (2T+1) \sum_{t=1}^{T} y_t - 3 \sum_{t=1}^{T} t y_t \right] \\ \hat{\beta}_T = \dfrac{6}{T(T-1)} \left[ -\sum_{t=1}^{T} y_t + \dfrac{2}{T+1} \sum_{t=1}^{T} t y_t \right] \end{cases} \qquad (\mathrm{A})$$
However, initially we have seen that:
$$y_t = y_{t-1} + v_t$$
with
$$v_t \sim WN$$
That is:
$$y_t = y_0 + \sum_{j=1}^{t} v_j$$
Therefore:
$$\sum_{t=1}^{T} y_t = \sum_{t=1}^{T} \left( y_0 + \sum_{j=1}^{t} v_j \right) = T y_0 + T v_1 + (T-1) v_2 + \cdots + 2 v_{T-1} + v_T = T y_0 + \sum_{t=1}^{T} (T+1-t) v_t = T y_0 + (T+1) \sum_{t=1}^{T} v_t - \sum_{t=1}^{T} t v_t$$
Additionally:
$$\sum_{t=1}^{T} t y_t = \sum_{t=1}^{T} t \left( y_0 + \sum_{j=1}^{t} v_j \right) = y_0 \sum_{t=1}^{T} t + \sum_{t=1}^{T} t \left( \sum_{j=1}^{t} v_j \right) = \frac{y_0 T(T+1)}{2} + (1 + \cdots + T) v_1 + (2 + \cdots + T) v_2 + \cdots + (T-1+T) v_{T-1} + T v_T = \frac{y_0 T(T+1)}{2} + \sum_{t=1}^{T} \frac{(t+T)(T-t+1)}{2} v_t = \frac{y_0 T(T+1)}{2} + \frac{T(T+1)}{2} \sum_{t=1}^{T} v_t - \frac{1}{2} \sum_{t=1}^{T} t^2 v_t + \frac{1}{2} \sum_{t=1}^{T} t v_t$$
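As a quick sanity check of these two identities (our own verification, which is not in the original derivation), take $T = 3$:
$$\sum_{t=1}^{3} y_t = 3y_0 + 3v_1 + 2v_2 + v_3 = 3y_0 + 4(v_1 + v_2 + v_3) - (v_1 + 2v_2 + 3v_3)$$
and
$$\sum_{t=1}^{3} t y_t = 6y_0 + (1+2+3)v_1 + (2+3)v_2 + 3v_3 = 6y_0 + 6\sum_{t=1}^{3} v_t - \frac{1}{2}\sum_{t=1}^{3} t^2 v_t + \frac{1}{2}\sum_{t=1}^{3} t v_t$$
since the coefficients $6 - \frac{1}{2} + \frac{1}{2} = 6$, $6 - 2 + 1 = 5$ and $6 - \frac{9}{2} + \frac{3}{2} = 3$ reproduce $6v_1 + 5v_2 + 3v_3$.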
It becomes:
$$\hat{\alpha}_T = y_0 + \frac{(T+1)(T+2)}{(T-1)T} \sum_{t=1}^{T} v_t - \frac{4T+5}{T-1} \sum_{t=1}^{T} \frac{t}{T} v_t + \frac{3T}{T-1} \sum_{t=1}^{T} \frac{t^2}{T^2} v_t$$
and
$$\hat{\beta}_T = -\frac{6}{T-1} \cdot \frac{1}{T} \sum_{t=1}^{T} v_t - \frac{6T^2}{(T+1)(T-1)} \cdot \frac{1}{T} \sum_{t=1}^{T} \frac{t^2}{T^2} v_t + \frac{6T(T+2)}{(T+1)(T-1)} \cdot \frac{1}{T} \sum_{t=1}^{T} \frac{t}{T} v_t$$
When $T \to +\infty$, we respectively get:
$$\frac{(T+1)(T+2)}{(T-1)T} \to 1, \quad \frac{4T+5}{T-1} \to 4, \quad \frac{3T}{T-1} \to 3, \quad \frac{6}{T-1} \to 0, \quad \frac{6T^2}{(T+1)(T-1)} \to 6, \quad \text{and} \quad \frac{6T(T+2)}{(T+1)(T-1)} \to 6.$$
Thus, we need to determine the convergence of the six terms $\sum_{t=1}^{T} v_t$, $\sum_{t=1}^{T} \frac{t}{T} v_t$, $\sum_{t=1}^{T} \frac{t^2}{T^2} v_t$, $\frac{1}{T} \sum_{t=1}^{T} v_t$, $\frac{1}{T} \sum_{t=1}^{T} \frac{t}{T} v_t$ and $\frac{1}{T} \sum_{t=1}^{T} \frac{t^2}{T^2} v_t$. Intuitively, they seem to tend toward zero when $T \to +\infty$.
Since $v_t$ is a white noise with zero expectation, the term $\frac{1}{T} \sum_{t=1}^{T} v_t$ converges to zero in probability by the law of large numbers; the term $\sum_{t=1}^{T} v_t$, by contrast, has zero expectation but a variance $T\sigma^2$ that grows without bound. For the remaining terms, each $v_t$ is multiplied by a coefficient situated between 0 and 1, so the symmetry of the white noise no longer guarantees cancellation: the weighted shocks $\frac{t}{T} v_t$ (or $\frac{t^2}{T^2} v_t$) cannot simply offset each other. Moreover, as $v_t$ may be positive, negative or zero, the inequality $0 < \frac{t}{T} v_t < v_t$ (or $0 < \frac{t^2}{T^2} v_t < v_t$) does not hold, so we cannot use the squeeze theorem to prove that the limits of the remaining four terms exist and equal zero.
Consequently, we turn now to Chebyshev's inequality (see, among many others: Fischer (2010) [13], Knuth (1997) [14] and, originally, Chebyshev (1867) [15]). If $X$ is a random variable with $E(X) = \mu$ and $V(X) = \sigma^2$, then, for any $k \in \mathbb{R}$, $k > 0$:
$$\Pr(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}$$
Here, it is clear that, if we can characterize the behavior of the variances of the four remaining terms, their convergence (or divergence) in probability can be established. Let us note:
$$A = \frac{1}{T} \sum_{t=1}^{T} \frac{t}{T} v_t, \qquad B = \sum_{t=1}^{T} \frac{t}{T} v_t, \qquad C = \frac{1}{T} \sum_{t=1}^{T} \frac{t^2}{T^2} v_t, \qquad D = \sum_{t=1}^{T} \frac{t^2}{T^2} v_t$$
We first study the convergences of A and B, then, symmetrically, we shall get the conclusions for C and D.
As $v_t$ is a white noise, we have $E(v_t) = 0$, $V(v_t) = \sigma^2$ constant over time and, for $i \neq j$, $Cov(v_i, v_j) = 0$. Obviously, $E(A) = 0$ and $E(B) = 0$:
$$V(B) = \frac{1^2}{T^2} V(v_1) + \frac{2^2}{T^2} V(v_2) + \cdots + \frac{T^2}{T^2} V(v_T) + \sum_{i \neq j} Cov\left( \frac{i}{T} v_i, \frac{j}{T} v_j \right)$$
As $Cov(v_i, v_j) = 0$ for $i \neq j$, then $Cov\left( \frac{i}{T} v_i, \frac{j}{T} v_j \right) = 0$, and $V(v_t) = \sigma^2$ for any $t$. We have:
$$V(B) = \sigma_B^2 = \frac{\sigma^2}{T^2} (1^2 + 2^2 + \cdots + T^2) = \frac{\sigma^2}{T^2} \cdot \frac{T(T+1)(2T+1)}{6} = \frac{(T+1)(2T+1)}{6T} \sigma^2$$
Additionally,
$$V(A) = \sigma_A^2 = \frac{1}{T^2} V(B) = \frac{(T+1)(2T+1)}{6T^3} \sigma^2$$
For large $T$, $V(B) \approx \frac{T}{3}\sigma^2 \to +\infty$, whereas $V(A) \approx \frac{\sigma^2}{3T} \to 0$.
According to the general version of Chebyshev's inequality, we know that, for the variable $A$:
$$\Pr(|A - E(A)| \geq k\sigma_A) \leq \frac{1}{k^2} \;\Rightarrow\; \Pr(|A - 0| \geq k\sigma_A) \leq \frac{1}{k^2} \;\Rightarrow\; 1 - \Pr(|A| \geq k\sigma_A) \geq 1 - \frac{1}{k^2}$$
As $1 - \Pr(|A| \geq k\sigma_A) = \Pr(|A| < k\sigma_A)$:
$$\Pr(|A| < k\sigma_A) \geq 1 - \frac{1}{k^2}$$
When $T \to +\infty$,
$$\sigma_A = \sqrt{V(A)} = \sqrt{\frac{(T+1)(2T+1)}{6T^3} \sigma^2} \to 0$$
So:
$$\lim_{T \to +\infty} \Pr(|A| < k\sigma_A) = \Pr(|A| \leq 0) \geq 1 - \frac{1}{k^2}$$
For any $k > 0$, letting $k \to +\infty$ gives $\Pr(|A| \leq 0) \to 1$; since $|A| \geq 0$ by construction, this means $\Pr(|A| = 0) = 1$. Consequently, we can infer that, when $T \to +\infty$, $A \xrightarrow{P} 0$.
Nevertheless, regarding $B$: as its variance tends to infinity when $T \to +\infty$, $B$ is divergent.
Symmetrically, we can demonstrate that, when $T \to +\infty$, $C \xrightarrow{P} 0$ while $D$ is divergent, because, using the identity $\sum_{t=1}^{T} t^4 = T(T+1)(2T+1)(3T^2+3T-1)/30$:
$$V(D) = \frac{T(T+1)(2T+1)(3T^2+3T-1)}{30 T^4} \sigma^2$$
and
$$V(C) = \frac{T(T+1)(2T+1)(3T^2+3T-1)}{30 T^6} \sigma^2$$
When $T \to +\infty$, $V(C) \to 0$ and $V(D) \to +\infty$.
Turning back to the OLS estimator $b_T$, we see that, when $T \to +\infty$, $\hat{\alpha}_T$ is not convergent, while $\hat{\beta}_T$ converges to zero in probability. So, when the sample size grows to infinity, the coefficient of the trend tends to zero, which means that this trend is useless: we are indeed still regressing one random walk on another. The high R² of the regressions observed in the literature might just be caused by the similarity between a trend and a random walk in the short run, as seen in the simulations performed by Granger and Newbold (1974) [3]. In other words, adding a linear trend into the regressions for DS processes does not play any significant role; it even involves "new" spurious regressions in the sense of Granger and Newbold (1974) [3].
As Box and Draper (1987) [16] pertinently wrote: "Essentially, all models are wrong, but some are useful" (p. 424).

4. Verification by Simulation

In order to verify this mathematical proof, let us simulate the model in SAS through Monte Carlo simulation. Monte Carlo simulations are widely used computational methods that rely on repeated random sampling to obtain numerical results, and they are increasingly popular in economic research based on the use of randomness to solve deterministic problems (for an introductory presentation of Monte Carlo simulations, see Rubinstein and Kroese (2016) [17]). Monte Carlo simulations have the following advantages in economics (for a survey of their applications in economics, see Creal (2012) [18]): (1) Some economic models are too complicated to solve analytically in a reasonable time, or cannot be solved analytically at all; in this situation, Monte Carlo simulations are efficient methods for finding numerical solutions (for example, see Kourtellos et al. (2016) [19]). (2) For some economic models, it is difficult to find practical examples in the real world that strictly meet the conditions of the theoretical models (Lux (2018) [20]). For instance, the sample sizes of macroeconomic variables are relatively short, making it difficult to achieve statistical credibility; in this situation, Monte Carlo simulations provide the possibility of large samples with which to verify economic theories. (3) Due to the methods of data collection, endogeneity and identification problems sometimes exist in economic modeling (see the critique by Romer (2016) [21]), so that the estimated statistical relationships are no longer reliable. Monte Carlo simulations provide an effective way to explore the relationships between economic variables (for example, Reed and Zhu (2017) [22]).
Monte Carlo simulations also have their disadvantages: (1) They cannot replace a strict mathematical proof, but only provide approximate, probability-based calculations when analytic solutions cannot be provided, or cannot be provided yet; that is to say, they are non-deterministic algorithms, as opposed to deterministic ones. This is why we also provide a strict mathematical proof in the previous section. (2) Monte Carlo simulations only provide a possibility of exploring the problems, and the results of the experiments depend on the soundness of the experimental design. For instance, this paper underlines the importance of true randomness in the experimental design.
The aim of the Monte Carlo simulations in this research is to reveal what kinds of problems appear when the considered variable is a DS process but is treated as a TS process. We thus need three basic assumptions: (1) The variable is DS; to strictly guarantee this point, the experimental design chooses the simplest and most common DS process, namely a random walk. (2) An infinite sample size; the mathematical proof, based on asymptotic theory, requires an infinite sample size, and Monte Carlo simulations, as probabilistic methods, also need a large enough sample size. Thus, one million is chosen as the approximation of infinity. (3) True randomness; to avoid false conclusions caused by pseudo-random numbers, the experimental design takes a two-step strategy to ensure the true randomness of the generated random numbers: in the first step, we obtain true random numbers from a hardware random number generator to serve as seed values; in the second step, we use these true random numbers as seed values to generate the samples of one million points.
To do so, we follow four successive steps (a condensed, single-replication SAS sketch of Steps 1–3 is given right after the list; the complete program appears in Appendix A):
  • Step 1: We generate a white noise, $v_t$, with a sample size of T = 1,000,000. Here, we set the white noise as Gaussian. The seed values (see Table A1) employed for the simulations at this step are provided by the Rand Corporation (2001) [12], obtained with a hardware random number generator, to make sure that the simulations effectively use true random numbers, because random numbers generated by software are in fact "pseudorandom."
  • Step 2: We generate a random walk, $y_t$, from our original equation by setting $y_0 = 0$:
$$y_t = y_{t-1} + v_t$$
    with $y_t$ also having a million observations.
  • Step 3: We then regress the DS series, $y_t$, on a linear trend with an intercept.
  • Step 4: We repeat this experiment 100 times successively, each time using a different true random number as the seed value.
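The following condensed SAS sketch implements Steps 1–3 for a single replication (the complete 100-replication program is given in Appendix A; the seed value below is one of the true random numbers used there):

data onerun;
  call streaminit(77381); *Seed value taken from the Appendix A program;
  do t=1 to 1000000; *Step 1: Gaussian white noise with T = 1,000,000;
    v=rand("normal");
    y + v; *Step 2: random walk with y0 = 0 (the sum statement accumulates v);
    output;
  end;
run;
proc reg data=onerun;
  model y=t; *Step 3: regress the DS series on a linear trend with an intercept;
run;
quit;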
The simulation results appear to be consistent with the mathematical proof. The details of $\hat{\alpha}_T$, $\hat{\beta}_T$ and R² are summarized in Table 1 and in Table A2 of Appendix C. The SAS simulation program is provided in Appendix A, so that the reader can reproduce our work with the same code. Besides, Figure 1 and Figure 2 (presenting only the first 10 simulations, for concision) show the evolutions of $\hat{\alpha}_T$ and $\hat{\beta}_T$ when the sample size grows from 100 up to 1,000,000 points, while the simulations of $\hat{\alpha}_T$, $\hat{\beta}_T$, $t_{\hat{\alpha}_T}$ and $t_{\hat{\beta}_T}$ generated by the various true-random seed values are shown in Figure 3 and Figure 4.
At this point in the reasoning, several important results must be underlined:
(1) From Figure 1 and Figure 2, we can observe that $\hat{\alpha}_T$ is divergent, with its variance increasing as the sample size grows, while $\hat{\beta}_T$ converges to zero. The simulation results therefore confirm the mathematical proof previously provided. In addition, from Figure 2, we see that the sample size should be greater than at least 1000 for the conclusion of convergence to become clear. That is, the sizes of the samples simulated by Granger and Newbold (1974) [3] or Nelson and Kang (1981) [11] seem not to be big enough to support their conclusions, even if the latter are right, and can be confirmed and re-obtained by our own simulations mobilizing 1,000,000 observations as an approximation of infinity (the sample size was 50 for Granger and Newbold (1974) [3], and 101 for Nelson and Kang (1981) [11], who needed to calculate a sample autocorrelation function with 100 lags; this is probably because computers' calculation capacities were much less powerful in the 1970s than today. Thanks to the progress in computing science, we can reinforce the statistical credibility of their findings).
(2) From Figure 3, we observe that, as expected, when $T \to +\infty$, $\hat{\beta}_T$ converges to zero (the order of magnitude of $\hat{\beta}_T$ is 10⁻⁵ which, considering that the decimal precision of the 32-bit computer used is about 10⁻⁷, is almost indistinguishable from zero) and $\hat{\alpha}_T$ is divergent, even when the seed values are modified. For 100 different simulations, the conclusions still hold, which indicates that there is no problem of pseudo-randomness in our simulations (even if their conclusions are correct, the simulations by Granger and Newbold (1974) [3], as well as by Nelson and Kang (1981) [11], paid no attention to pseudo-randomness, nor specified how their random numbers were obtained). Moreover, as we set all $y_0$ equal to zero, if $\hat{\alpha}_T$ were convergent, it would have to converge to $y_0$, in other words, to zero. However, $\hat{\alpha}_T$ seriously deviates from its mathematical expectation of zero across the different simulations. Thus, the regressions are spurious, because the OLS estimator of the trend converges to zero and the other OLS estimator diverges when the sample size tends to infinity.
(3) From the last column of Table 1, we see that these regressions sometimes attain a very high R² (the highest being 0.97, with an average of 0.45 over the 100 experiments). This is a classic result associated with spurious regressions, already pointed out by Granger and Newbold (1974) [3].
(4) From Table 1 and Figure 4, we see that the t-statistics of the OLS estimators are very high, and that all the p-values of the tests of $H_0: \alpha = 0$ and $H_0: \beta = 0$ are zero. Thus, the OLS estimators appear definitely significant when the sample size tends to infinity. This is also a well-known result associated with spurious regressions, since the residuals are not white noises (as indicated above and studied by Nelson and Kang (1981) [11]; we did not test the correlation of the residuals here). In these conditions, we understand that the usual and fundamental Fisher or Student tests of the OLS estimators are no longer valid, precisely because they are based on the assumption that the residuals are white noises. If we use such a detrending method on DS processes, we will indeed reach wrong conclusions about the significance of the explanatory variables.
We understand that our results call for a re-examination of the robustness of classic findings in macroeconomics. To give an example: in a famous paper, Mankiw, Romer and Weil (1992) [23] identified a significant and positive contribution of education to the per capita GDP growth rate. In a theoretical framework close to a Solowian model, their approach consisted in augmenting a production function with constant returns to scale and decreasing marginal factorial returns by a variable of human capital, in order to regress, in logarithms, per capita GDP on the investment rates in physical capital and in schooling. Their conclusion is probably accurate; but, as they added a linear trend as a detrending method, whatever input variable is selected, it will be found statistically significant as long as the size of their sample is sufficiently large. Our own study has described, in an original manner, the behavior of the OLS estimators themselves when the sample size tends to infinity. By comparison, the samples used for simulation by Chan, Hayya and Ord (1977) [10], or Nelson and Kang (1981) [11], are relatively small, even if, obviously, their studies were extremely useful.

5. Concluding Remarks

The introduction of a linear trend generally aims at avoiding spurious regressions. However, Nelson and Kang (1981) [11], following Chan, Hayya and Ord (1977) [10], had shown that, in OLS estimates, assimilating a difference-stationary (DS) process—the most probable process for GDP, featuring a unit root, according to Nelson and Plosser (1982) [6]—to a trend-stationary (TS) process (as did Chow and Li (2002) [24], among others, although the log of China's GDP may present a unit root) can lead to a situation where the covariance of the residuals depends on the size of the sample. This artificially induces an autocorrelation of the residuals at the first lags and, by generating a pseudo-periodicity in the latter, introduces a cyclical movement into the series. However, their analyses mainly focused on the residuals, and their simulated sample sizes remained small. Here, following Nelson and Kang's (1981) [11] research line, and using Chebyshev's inequality, we have given a strict mathematical proof of the fact that the OLS estimators of a detrending method by adding a linear trend in DS processes are spurious: when the sample size tends to infinity, the OLS estimator of the trend converges toward zero in probability, while the other OLS estimator is divergent. The empirical verification, designed through the Monte Carlo method and performed on samples of a million observations as an approximation of infinity, with true random numbers as seed values, has provided results consistent with the mathematical proof.
Thus, in the context of what has been specified here, our main conclusion, according to which the OLS estimators themselves are spurious when the sample size increases, also implies that identifying the nature of time series becomes extremely important. For example, it is crucial to decide whether GDP series are to be treated as TS or DS processes, in a short-run context in which random walks usually look like TS processes (on the basis of many macroeconomic series, Nelson and Plosser (1982) [6] have stated that GDP series would be DS rather than TS processes; more recent studies, such as that by Darné (2009) [25], have reexamined GNP series with new unit root tests and shown that the US GNP expressed in real terms seems to contain a stochastic trend). Even if their effectiveness is questioned, especially because of the sensitivity to the choice of the truncation parameters, we recommend using unit root tests to reduce the risk of inappropriately selecting the detrending method, and regressing the variables of the models in the first differences of their logarithm forms when such tests show that they contain unit roots (such advice has been applied in a recent study on China's long-run growth using a new time-series database of capital stocks from 1952 to 2014 built through an original methodology; see Long and Herrera (2015, 2016) [26,27]). From a theoretical point of view, regressions in the first differences of the logarithm forms are acceptable in both neoclassical and Keynesian modeling, in which they can easily be interpreted in terms of growth-rate dynamics; and, from an econometric point of view, logarithms might be useful when a problem of heteroscedasticity appears, while difference operators can help to avoid spurious regressions when there are unit roots. To avoid the over-differencing problem, we finally recommend using inverse autocorrelation functions (IACFs) to determine the order of integration, along with unit root tests and correlograms (see Cleveland (1972) [28], Chatfield (1979) [29] and Priestley (1981) [30]). That is to say, we suggest the following modeling strategy: (i) if the unit root tests and the correlogram indicate that the variables are stationary in the first differences of the logarithm forms, we stay within traditional time-series regressions. (ii) If the variables still contain unit roots in the first differences of the logarithm forms, we can pass to a cointegration framework or apply a second difference operation. (iii) If the unit root tests and the correlogram both indicate that the series seems to be stationary, but the IACF indicates that the series might be over-differenced (in this case, the autocorrelation function (ACF) and the partial autocorrelation function (PACF) present the characteristics of a stationary process, or decrease hyperbolically, while the IACF presents the characteristics of a nonstationary process), this implies that an integer order of integration is not sufficient: the true order of integration might lie between 0 and 1. That is to say, we might need to pass from traditional time-series models to fractional integration (Hosking (1981) [31]), with AutoRegressive Fractionally Integrated Moving Average (ARFIMA) models or fractional cointegration. A practical illustration of the identification step is sketched below.
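As a sketch only (the dataset mylib.macrodata and the variable gdp are hypothetical placeholders), the identification step of this strategy could be written in SAS as follows, PROC ARIMA printing the ACF, PACF and IACF by default:

data growth;
  set mylib.macrodata; *Hypothetical source dataset containing a series named gdp;
  dlgdp=dif(log(gdp)); *First difference of the logarithm, i.e., approximately the growth rate;
run;
proc arima data=growth;
  identify var=dlgdp nlag=24 stationarity=(adf=(0,1,2)); *ADF unit root tests plus correlograms;
  *Compare the ACF/PACF with the IACF: an IACF decaying like a nonstationary process signals possible over-differencing;
run;
quit;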

Author Contributions

Conceptualization, Z.L.; methodology, Z.L.; software, Z.L.; validation, Z.L.; formal analysis, Z.L.; writing—original draft preparation, R.H.; writing—review and editing, Z.L.; visualization, Z.L.; supervision, R.H.; project administration, R.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Acknowledgments

The authors would like to thank Tao Zha and Jie Liao for their discussion and help. The authors would also like to thank the reviewers for their comments and suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Simulation Program by SAS, with Explanation Annotations

data simulation1;
  call streaminit(77381); *The number in parentheses is a seed value from Appendix B;
  do t=1 to 1000000; *The sample size is one million;
    v1=rand("normal"); *Set the white noise series as Gaussian;
    output;
  end;
run; *Repeat the simulation 100 times to respectively obtain the white noises v1,v2...v100, using a different seed value from Appendix B for each replication. The seed values are true random numbers generated by a hardware random number generator, in order to avoid the pseudo-randomness of random numbers given by software.;
 
data simulation; *Merge the white noises into a single dataset;
  merge simulation1-simulation100;
run;
data simulation0; *Generate 100 random walks by setting all the initial values equal to 0;
  set simulation;
  array randomwalk(*) v1-v100; *Define arrays with a do loop in order to reduce the code;
  array y(100);
  do i=1 to 100;
    y(i) + randomwalk(i); *The random walk is the accumulated sum of the white noise (sum statement);
  end;
run;
 
proc reg data=simulation0 outest=reg;
  model y1-y100=t/RSQUARE; *Get 100 regressions and store the R2;
run;
quit;
 
proc reg data=simulation0 outest=reg0 TABLEOUT;
  model y1-y100=t/ RSQUARE;
run; *Create another dataset in order to store the student statistics of OLS estimators;
quit;
 
data reg1;
  set reg0;
  if _TYPE_="T";
  rename Intercept=t_alpha t=t_beta;
  drop _MODEL_ _TYPE_ _RMSE_ y1-y100 _IN_ _P_ _EDF_ _RSQ_;
run; *Only keep the Student statistics of the OLS estimators;
 
data reg;
  merge reg1 reg;
  by _DEPVAR_;
run; *Merge the datasets;
 
data reg;
  set reg;
  rename t=beta;
  rename Intercept=Alpha;
  rename _DEPVAR_=bootstrap;
  drop _MODEL_ _TYPE_ _IN_ _P_ _EDF_ _RMSE_ y1-y100;
run; *Rename the variables from automatic SAS names to specific names and drop all information we don’t need;
 
data reg; *As the variable bootstrap is character, it would sort in the order y1 y10 y100 y11...y2 y20...y99 in the figures. We change it to numeric.;
  set reg; *This could also be corrected at the array statement step by adding leading zeros, such as y001 y002...y100.;
    bootstrap=substr(bootstrap,2,3);
run;
 
data reg;
  set reg(rename=(bootstrap=bootstrap_char));
      bootstrap = input(bootstrap_char,best.);
  drop bootstrap_char _MODEL_ _TYPE_ y1-y100;
run; *Change the variable bootstrap from character to numeric;
 
proc univariate data=reg;
  var alpha beta t_alpha t_beta;
  histogram alpha beta t_alpha t_beta / kernel normal;
run; *Calculate some elements in Table 1;
symbol1 i=join; *Join the plotted points with lines;
proc gplot data=reg;
  plot beta*bootstrap alpha*bootstrap/overlay;
run; *Represent Figure 3 and Figure 4;
 
%macro reg(size); *Define a macro program to study the behavior of the OLS estimators when the sample size increases from 100 to 1000000;
  %do i = 100 %to &size %by 10000;
  %let j= %sysevalf((&i-100)/10000);
      proc reg data=simulation0(where=(t<=&i))
      outest=out&j(keep=intercept t) noprint;
      model y1=t;
      run;
      quit;
  %end;
%mend;
 
%reg(1000000) *Invoke the macro reg(size) and let size=1000000;
 
data reg1; *Stack the 100 regression results obtained for increasing sample sizes;
  set out0-out99;
run;
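*Assumption: reg2-reg10 are obtained by re-running the %reg macro with y2,...,y10 as the dependent variable, so that the first 10 simulations can be merged below for Figures 1 and 2;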
 
%macro rename;*Define a new macro program in order to rename variables because SAS automatically creates the same names in each regression;
  %do i=1 %to 10;
        data reg&i;
      set reg&i;
      rename intercept=intercept&i t=t&i;
      run;
  %end;
%mend;
 
%rename *Invoke the macro program;
 
data order(keep=size); *Create an index variable;
  do n=0 to 99;
    size=100+10000*n;
  output;
  end;
run;
 
data reg0; *Merge the datasets;
  merge order reg1-reg10;
run;
 
symbol1 i=join;
proc gplot data=reg0; *Represent Figure 1 and Figure 2;
  plot (intercept1-intercept10)*size/ overlay;
  plot (t1-t10)*size / overlay;
run;
quit;

Appendix B

Table A1. Table of Seed Values.
1820077381594438743077462414407549649906098238129389793
1820179729865262263399540233545593037734978616827033174
1820282377535021361521230257415993560282904306625175758
1820331592309571445877037107774525269494745091603180045
1820433553072102912718634710523518289048049780045146072
1820559326459165569808330925411019637699811626556224792
1820661082835869898978927688004488296851791679278682529
1820714373760096587629319632122200257795287727482395093
1820890754767678130932874617926365910851291068498863128
1820933936116595675448332086874129931220377092833591985
Source: Rand Corporation (2001) [12], p. 365. Online: http://www.rand.org/pubs/monograph_reports/MR1418.html. Note: Here, the computer operating system used is Windows 7 Home Premium (32-bit), with version 9.3 of SAS. The results might differ slightly in another operating environment, since most programming languages use the IEEE 754 international standard: under this standard, a 32-bit computer uses 23 bits of precision when decimal numbers have no exact representation in binary, while a 64-bit computer can use 52 bits of precision.

Appendix C

Table A2. Simulation Results (using a sample size of a million points).
Bootstrap | Alpha | T_Alpha | Beta | T_Beta | _RSQ_
1 | −516.428 | −994.213 | 0.001478 | 1642.975 | 0.729684
2 | −197.89 | −454.435 | 0.001673 | 2218.098 | 0.83108
3 | −270.459 | −441.39 | −0.00137 | −1293.6 | 0.625945
4 | −319.191 | −499.053 | 0.00172 | 1552.42 | 0.706746
5 | 59.65598 | 147.5 | −0.0021 | −3001.91 | 0.900115
6 | 128.2268 | 303.2709 | 0.000895 | 1222.664 | 0.599184
7 | 340.0485 | 723.4509 | −0.00085 | −1049.39 | 0.524084
8 | 106.5433 | 173.9056 | −0.0005 | −473.332 | 0.183036
9 | 312.9737 | 707.3468 | −0.00201 | −2623.79 | 0.873166
10 | 706.1975 | 1119.959 | −0.00064 | −587.675 | 0.256706
11 | −127.036 | −339.147 | 0.000358 | 552.4023 | 0.233804
12 | 543.8749 | 1136.501 | 0.001163 | 1402.911 | 0.663091
13 | 588.8941 | 1529.704 | −0.00052 | −783.353 | 0.380284
14 | −648.32 | −813.506 | −0.00027 | −192.58 | 0.035761
15 | 656.4477 | 765.2626 | −0.00222 | −1496.84 | 0.69141
16 | 11.84133 | 27.68874 | 0.000893 | 1205.204 | 0.592256
17 | 488.6455 | 598.7078 | −4.9 × 10⁻⁵ | −34.3688 | 0.00118
18 | −465.545 | −725.822 | 0.000631 | 568.3724 | 0.244169
19 | −248.422 | −564.73 | 0.000643 | 844.1401 | 0.416084
20 | 48.52915 | 105.5969 | −0.00044 | −549.449 | 0.231889
21 | −101.445 | −287.474 | 0.000207 | 338.1793 | 0.102628
22 | 127.0612 | 299.2239 | 0.002335 | 3175.413 | 0.909774
23 | −33.161 | −99.9226 | −1.3 × 10⁻⁵ | −22.9001 | 0.000524
24 | −453.84 | −745.754 | 0.001144 | 1085.507 | 0.540932
25 | 235.6821 | 619.602 | −0.00118 | −1797.21 | 0.763592
26 | −218.546 | −584.063 | −0.00084 | −1290.12 | 0.624682
27 | −440.194 | −973.892 | 6.51 × 10⁻⁵ | 83.11338 | 0.00686
28 | −449.678 | −1043.2 | 0.000493 | 660.5926 | 0.303807
29 | 112.1348 | 289.5755 | −0.00141 | −2105.15 | 0.815894
30 | 581.1902 | 1689.905 | −0.00215 | −3604.9 | 0.928548
31 | −587.032 | −1501.88 | 0.002193 | 3239.642 | 0.913008
32 | −369.762 | −1367.25 | −0.00042 | −889.29 | 0.441602
33 | −389.217 | −555.687 | 0.000312 | 256.8726 | 0.061899
34 | 867.6485 | 1281.799 | −0.00091 | −772.52 | 0.373743
35 | 436.7767 | 1145.472 | 0.000166 | 251.6912 | 0.059575
36 | 270.2492 | 565.8382 | −0.0021 | −2537.89 | 0.865607
37 | 216.3166 | 600.8799 | −0.00052 | −837.362 | 0.412172
38 | −231.201 | −467.153 | −0.00027 | −319.307 | 0.092524
39 | 91.05972 | 183.4367 | −0.00069 | −798.691 | 0.389465
40 | −311.184 | −661.038 | 0.000234 | 287.2567 | 0.076227
41 | 88.70737 | 226.1573 | 0.000826 | 1216.426 | 0.596725
42 | −418.93 | −845.305 | 0.00045 | 523.9499 | 0.215393
43 | 139.4184 | 556.7029 | 0.000388 | 895.4936 | 0.445033
44 | 131.9699 | 299.7307 | 0.001769 | 2319.397 | 0.843251
45 | 235.8081 | 671.6881 | −8.7 × 10⁻⁵ | −143.432 | 0.020158
46 | −183.356 | −385.223 | 0.001201 | 1457.079 | 0.679804
47 | −450.714 | −849.624 | −0.00025 | −270.624 | 0.06824
48 | −102.254 | −255.019 | −0.00238 | −3433.09 | 0.92179
49 | −639.041 | −1252.14 | 0.000926 | 1047.677 | 0.523272
50 | 362.1756 | 950.0575 | −0.00026 | −398.336 | 0.136943
51 | 210.3742 | 444.9447 | −0.00122 | −1484.72 | 0.687928
52 | 421.0867 | 909.9346 | 0.000898 | 1120.044 | 0.556443
53 | −455.699 | −1037.49 | −0.00086 | −1129.1 | 0.560416
54 | 421.7283 | 649.61 | −0.00237 | −2104.8 | 0.815844
55 | 521.1628 | 1052.502 | −0.0014 | −1628.74 | 0.726237
56 | −470.648 | −899.902 | −0.00035 | −387.837 | 0.130751
57 | 30.66679 | 83.32206 | 6.79 × 10⁻⁶ | 10.65426 | 0.000114
58 | −299.44 | −621.459 | −0.00184 | −2199.16 | 0.828659
59 | 83.65093 | 193.0325 | −0.00104 | −1381.63 | 0.656228
60 | −144.012 | −384.873 | 0.002268 | 3500.093 | 0.924532
61 | 831.9241 | 1056.377 | 0.000812 | 595.4674 | 0.261765
62 | −261.051 | −540.941 | 0.001855 | 2219.531 | 0.831261
63 | −98.1186 | −153.397 | −0.00042 | −382.35 | 0.127545
64 | 858.1653 | 1707.158 | −0.00167 | −1919.83 | 0.786587
65 | −4.99706 | −16.7538 | 0.001905 | 3686.877 | 0.931474
66 | 82.03943 | 203.1585 | −0.00132 | −1889.56 | 0.781203
67 | −79.6881 | −179.748 | 0.001974 | 2570.223 | 0.868526
68 | −144.75 | −270.425 | 0.000239 | 257.2805 | 0.062084
69 | 106.8698 | 344.2261 | −0.00147 | −2741.72 | 0.882588
70 | 224.7479 | 457.0921 | 0.001897 | 2227.121 | 0.832217
71 | −531.303 | −700.132 | 0.001526 | 1160.838 | 0.574023
72 | −431.824 | −1038.13 | 0.000284 | 393.6544 | 0.134172
73 | −209.465 | −398.03 | 0.000603 | 661.8163 | 0.304591
74 | 304.19 | 458.4119 | −0.00137 | −1195.68 | 0.588419
75 | 406.7961 | 471.6883 | 0.000832 | 556.7575 | 0.236629
76 | −275.16 | −545.632 | −4.3 × 10⁻⁵ | −49.7171 | 0.002466
77 | −149.003 | −351.097 | 0.000263 | 357.8227 | 0.113505
78 | −743.237 | −1276.95 | 0.001326 | 1315.394 | 0.633735
79 | −562.356 | −967.819 | 0.002533 | 2516.59 | 0.863635
80 | 426.6258 | 808.3939 | −0.00093 | −1012.77 | 0.506345
81 | 620.8599 | 946.9761 | −0.00246 | −2161.95 | 0.823759
82 | 206.2948 | 509.1332 | 0.001018 | 1449.979 | 0.677673
83 | 658.6789 | 1399.053 | 2.77 × 10⁻⁶ | 3.396448 | 1.15 × 10⁻⁵
84 | −183.309 | −454.236 | −0.00093 | −1326.52 | 0.637638
85 | −12.3161 | −31.8307 | −2.7 × 10⁻⁵ | −40.1805 | 0.001612
86 | −218.468 | −581.44 | 3.37 × 10⁻⁵ | 51.82376 | 0.002679
87 | −183.216 | −381.86 | 0.000529 | 636.5787 | 0.288374
88 | −286.844 | −549.021 | 0.000599 | 662.0088 | 0.304714
89 | 374.423 | 746.4409 | −0.00154 | −1775.53 | 0.759183
90 | 418.4504 | 492.2846 | 0.00037 | 251.4381 | 0.059462
91 | −159.337 | −361.947 | −2.9 × 10⁻⁵ | −37.6401 | 0.001415
92 | −229.543 | −385.27 | −0.00095 | −915.922 | 0.456201
93 | 149.2659 | 379.5512 | 0.002918 | 4283.706 | 0.948321
94 | 124.2723 | 235.9754 | −0.00066 | −726.347 | 0.34537
95 | −542.8 | −1029.7 | 0.000313 | 342.646 | 0.105071
96 | 335.8 | 459.1723 | 9.31 × 10⁻⁵ | 73.50074 | 0.005373
97 | 230.6743 | 753.4047 | 0.003076 | 5799.956 | 0.971131
98 | −450.053 | −1081.94 | 0.000144 | 199.1844 | 0.038161
99 | 180.6602 | 451.0257 | −0.00037 | −535.181 | 0.222649
100 | −206.238 | −370.228 | −0.00059 | −609.264 | 0.270714

References

  1. Wold, H.O.A. A Study in the Analysis of Stationary Time Series, 2nd ed.; Almqvist and Wiksell: Uppsala, Sweden, 1954.
  2. Box, G.E.P.; Jenkins, G.M. Time Series Analysis: Forecasting and Control; Holden-Day: San Francisco, CA, USA, 1970.
  3. Granger, C.W.J.; Newbold, P. Spurious Regressions in Econometrics. J. Econom. 1974, 2, 111–120.
  4. Phillips, P.C.B. Understanding Spurious Regressions in Econometrics. J. Econom. 1986, 33, 311–340.
  5. Davidson, R.; MacKinnon, J.G. Estimation and Inference in Econometrics; Oxford University Press: New York, NY, USA, 1993.
  6. Nelson, C.R.; Plosser, C.R. Trends and Random Walks in Macroeconomic Time Series: Some Evidence and Implications. J. Monet. Econ. 1982, 10, 139–162.
  7. Engle, R.F.; Granger, C.W.J. Co-integration and Error Correction: Representation, Estimation, and Testing. Econometrica 1987, 55, 251–276.
  8. Johansen, S. Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian Vector Autoregressive Models. Econometrica 1991, 59, 1551–1580.
  9. Stock, J.H. Asymptotic Properties of Least Squares Estimators of Cointegrating Vectors. Econometrica 1987, 55, 1035–1056.
  10. Chan, K.H.; Hayya, J.C.; Ord, J.K. A Note on Trend Removal Methods: The Case of Polynomial Regression Versus Variate Differencing. Econometrica 1977, 45, 737–744.
  11. Nelson, C.R.; Kang, H. Spurious Periodicity in Inappropriately Detrended Time Series. Econometrica 1981, 49, 741–751.
  12. Rand Corporation. A Million Random Digits with 100,000 Normal Deviates; Rand: Santa Monica, CA, USA, 2001.
  13. Fischer, H. A History of the Central Limit Theorem: From Classical to Modern Probability Theory; Sources and Studies in the History of Mathematics and Physical Sciences; Springer: Eichstätt, Germany, 2010.
  14. Knuth, D.E. The Art of Computer Programming, Volume 1: Fundamental Algorithms, 3rd ed.; Addison-Wesley: Boston, MA, USA, 1997.
  15. Tchebichef, P.L. Des valeurs moyennes. J. Math. Pures Appl. 1867, 12, 177–184 (originally published in Matematicheskii Sbornik 1867, 2, 1–9).
  16. Box, G.E.P.; Draper, N.R. Empirical Model-Building and Response Surfaces; Wiley Series in Probability and Statistics; John Wiley & Sons: New York, NY, USA, 1987.
  17. Rubinstein, R.Y.; Kroese, D.P. Simulation and the Monte Carlo Method, Vol. 10; John Wiley & Sons: Hoboken, NJ, USA, 2016.
  18. Creal, D. A Survey of Sequential Monte Carlo Methods for Economics and Finance. Econom. Rev. 2012, 31, 245–296.
  19. Kourtellos, A.; Stengos, T.; Tan, C.M. Structural Threshold Regression. Econom. Theory 2016, 32, 827–860.
  20. Lux, T. Estimation of Agent-Based Models Using Sequential Monte Carlo Methods. J. Econ. Dyn. Control 2018, 91, 391–408.
  21. Romer, P. The Trouble with Macroeconomics. Am. Econ. 2016, 20, 1–20.
  22. Reed, W.R.; Zhu, M. On Estimating Long-Run Effects in Models with Lagged Dependent Variables. Econ. Model. 2017, 64, 302–311.
  23. Mankiw, N.G.; Romer, D.; Weil, D.N. A Contribution to the Empirics of Economic Growth. Q. J. Econ. 1992, 107, 407–437.
  24. Chow, G.C.; Li, K.W. China's Economic Growth: 1952–2010. Econ. Dev. Cult. Chang. 2002, 51, 247–256.
  25. Darné, O. The Uncertain Unit Root in Real GNP: A Re-Examination. J. Macroecon. 2009, 31, 153–166.
  26. Long, Z.; Herrera, R. A Contribution to Explaining Economic Growth in China: New Time Series and Econometric Tests of Various Models; Mimeo, CNRS UMR 8174—Centre d'Économie de la Sorbonne: Paris, France, 2015.
  27. Long, Z.; Herrera, R. Building Original Series of Physical Capital Stocks for China's Economy: Methodological Problems, Proposals of Solutions and a New Database. China Econ. Rev. 2016, 40, 33–53.
  28. Cleveland, W.S. The Inverse Autocorrelations of a Time Series and Their Applications. Technometrics 1972, 14, 277–298.
  29. Chatfield, C. Inverse Autocorrelations. J. R. Stat. Soc. 1979, 142, 363–377.
  30. Priestley, M.B. Spectral Analysis and Time Series, Volume 1: Univariate Series; Probability and Mathematical Statistics; Academic Press: London, UK, 1981.
  31. Hosking, J.R.M. Fractional Differencing. Biometrika 1981, 68, 165–176.
Figure 1. Evolutions of $\hat{\alpha}_T$ when the sample size increases from 100 up to 1,000,000.
Figure 2. Evolutions of $\hat{\beta}_T$ when the sample size increases from 100 up to 1,000,000.
Figure 3. Simulations of $\hat{\alpha}_T$ (in red) and $\hat{\beta}_T$ (in blue).
Figure 4. Simulations of $t_{\hat{\alpha}_T}$ (in red) and $t_{\hat{\beta}_T}$ (in blue).
Table 1. Summary of the simulation results, with a sample size of T = 1,000,000.
 | $\hat{\alpha}_T$ | $t_{\hat{\alpha}_T}$ | $\hat{\beta}_T$ | $t_{\hat{\beta}_T}$ | R²
Mean | 6.11764361 | 6.14650186 | 0.000021746 | 0.0860629 | 0.45983
Variance | 143,633.942 | 554,191.079 | 1.57763 × 10⁻⁶ | 2,744,822.52 | 0.10287
Standard Deviation | 378.990689 | 744.440111 | 0.00125604 | 1656.75059 | 0.32073
Max | 867.64848 | 1707.15789 | 0.00307578 | 5799.95595 | 0.97113
Min | −743.23667 | −1501.88113 | −0.00245505 | −3604.89913 | 1.15357 × 10⁻⁵
P-value of null test | 0 | - | 0 | - | -
Note: The location tests of $\hat{\alpha}_T$ and $\hat{\beta}_T$ cannot serve as judgments of convergence to zero, because the convergence concerns the behavior of the estimators as T tends to infinity, not their location within a given sample.
