**5. Application**

In this section, we apply the proposed estimation strategies to a financial dataset to examine the relative performance of the listed estimators. To illustrate and compare the listed estimators, we study the effect of several economic and financial variables on the performance of the "Fragile Five" countries (coined by Stanley 2013) in terms of their attraction of foreign direct investment (FDI) over the period 1983–2018. The "Fragile Five" comprise Turkey (TUR), South Africa (ZAF), Brazil (BRA), India (IND), and Indonesia (IDN). Agiomirgianakis et al. (2003), Hubert et al. (2017), and Akın (2019) used FDI as the dependent variable across countries. With five countries, we have *M* = 5 blocks in our SUR model, with *T* = 36 yearly measurements per equation. Table 2 provides information about the predictor variables, and the raw data are available from the World Bank<sup>1</sup>.


We suggest the following model:

$$\begin{array}{rcl} \text{FDI}\_{it} &=& \beta\_{0i} + \beta\_{1i} \text{GROWTH} + \beta\_{2i} \text{DEFLATOR} + \beta\_{3i} \text{EXPORTS} + \beta\_{4i} \text{IMPORTS} + \\ & & \beta\_{5i} \text{GGFCE} + \beta\_{6i} \text{RESERENS} + \beta\_{7i} \text{PREM} + \beta\_{8i} \text{BALANCE} + \varepsilon\_{it} \end{array} \tag{14}$$

where *i* denotes countries (*i* = TUR, ZAF, BRA, IND, IDN) and *t* is time (*t* = 1, 2, ..., *T*). Following Salman (2011), the errors of each equation are assumed to be normally distributed with mean zero, homoscedastic, and not serially correlated. Furthermore, there is contemporaneous correlation between corresponding errors in different equations. We test these assumptions along with the assumptions in Section 2. We first check the following assumptions for each equation:

<sup>1</sup> https://data.worldbank.org.

Nonautocorrelation of errors: There are a number of viable tests in the reviewed literature for testing autocorrelation. For example, the Ljung–Box test is widely used in applications of time series analysis, and a similar assessment may be obtained via the Breusch–Godfrey test and the Durbin–Watson test. We apply the Ljung–Box test (Ljung and Box 1978). The null hypothesis of the Ljung–Box test, H0, is that the errors are random and independent. A significant *p*-value in this test leads to a rejection of the null hypothesis of no autocorrelation. Results reported in Table 3 suggest a rejection of H0 for the equations of both TUR and IND at any conventional significance level. Thus, the estimation results would clearly be unsatisfactory for these two equations. To tackle this problem, we applied first differencing to transform the variables. After transformation, the test statistics and *p*-values for the TUR and IND equations were *χ*²(1) = 1.379, *p* = 0.240 and *χ*²(1) = 0.067, *p* = 0.794, respectively. Hence, each equation satisfied the assumption of nonautocorrelation. We confirmed this result using the Durbin–Watson test.


**Table 3.** Ljung–Box test.
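The mechanics of this diagnostic are simple: the Q statistic cumulates squared residual autocorrelations up to lag *h* and is compared against a *χ*²(*h*) distribution. The following minimal pure-Python sketch (function names are ours; the paper's tests were run with standard statistical software) also includes the first-differencing transformation used above for the TUR and IND equations:

```python
def ljung_box(resid, h=1):
    """Ljung-Box Q statistic for lags 1..h:
    Q = n(n+2) * sum_{k=1..h} rho_k^2 / (n - k),
    asymptotically chi-squared with h degrees of freedom
    under the null of no autocorrelation."""
    n = len(resid)
    mean = sum(resid) / n
    c0 = sum((e - mean) ** 2 for e in resid)  # lag-0 autocovariance (times n)
    q = 0.0
    for k in range(1, h + 1):
        ck = sum((resid[t] - mean) * (resid[t - k] - mean) for t in range(k, n))
        q += (ck / c0) ** 2 / (n - k)         # squared autocorrelation at lag k
    return n * (n + 2) * q


def first_difference(x):
    """First-differencing transformation used to remove autocorrelation."""
    return [b - a for a, b in zip(x, x[1:])]
```

A strongly alternating residual series, for instance, yields a large Q at lag 1 and hence a rejection of H0.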

Homoscedasticity of errors: To test for heteroscedasticity, we used the Breusch–Pagan test (Breusch and Pagan 1979). The results in Table 4 failed to reject the null hypothesis in each equation.


**Table 4.** Breusch–Pagan test.

The assumption of homoscedasticity was thus met in each equation.
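The logic of this test can be sketched as follows: regress the squared OLS residuals on the explanatory variables and form the LM statistic *n·R²*, which is asymptotically chi-squared under homoscedasticity. The single-regressor version below is a hypothetical minimal illustration (names ours), not the exact multi-regressor computation behind Table 4:

```python
def breusch_pagan_lm(resid, x):
    """Breusch-Pagan LM statistic with a single explanatory variable:
    regress the squared residuals on x and form LM = n * R^2,
    asymptotically chi-squared(1) under homoscedasticity."""
    n = len(resid)
    u2 = [e * e for e in resid]          # squared residuals
    mx, mu = sum(x) / n, sum(u2) / n
    sxy = sum((a - mx) * (b - mu) for a, b in zip(x, u2))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - mu) ** 2 for b in u2)
    # R^2 of the auxiliary simple regression of u2 on x
    r2 = sxy ** 2 / (sxx * syy) if sxx > 0 and syy > 0 else 0.0
    return n * r2
```

When the squared residuals show no relationship with the regressor, the statistic is zero and H0 cannot be rejected.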

Normality of errors: To test for normality, there are various tests such as Shapiro–Wilk, Anderson–Darling, Cramer–von Mises, Kolmogorov–Smirnov, and Jarque–Bera. In this study, we performed the Jarque–Bera goodness-of-fit test (Jarque and Bera 1980).



The null hypothesis for the test is that the data are normally distributed. The results reported in Table 5 suggested a rejection of H0 only for ZAF. We also performed the Kolmogorov–Smirnov test for ZAF, and the results showed that the errors were normally distributed. Thus, each equation satisfied the assumption of normality.
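The Jarque–Bera statistic combines the sample skewness and kurtosis of the residuals, both of which should be close to 0 and 3, respectively, under normality. A minimal sketch (names ours):

```python
def jarque_bera(resid):
    """Jarque-Bera statistic: JB = n/6 * (S^2 + (K - 3)^2 / 4), where
    S is the sample skewness and K the sample kurtosis; asymptotically
    chi-squared with 2 degrees of freedom under normality."""
    n = len(resid)
    mean = sum(resid) / n
    m2 = sum((e - mean) ** 2 for e in resid) / n  # second central moment
    m3 = sum((e - mean) ** 3 for e in resid) / n  # third central moment
    m4 = sum((e - mean) ** 4 for e in resid) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
```

A symmetric sample contributes nothing through the skewness term, so only excess kurtosis drives the statistic.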

Cross-sectional dependence: To test whether the estimated correlation between the sections was statistically significant, we applied the Breusch and Pagan (1980) Lagrange multiplier (LM) statistic and the Pesaran (2004) cross-section dependence (CD) test. The null hypothesis of both tests is that there is no cross-section dependence. Both tests in Table 6 rejected this null hypothesis, indicating that the residuals from the equations are significantly correlated with each other. Consequently, the SUR model is the preferred technique, since it assumes contemporaneous correlation across equations. Joint estimation of all parameters is therefore more efficient than OLS applied to each equation separately (Kleiber and Zeileis 2008).


**Table 6.** Cross-section dependence test results. LM, Lagrange multiplier; CD, cross-section dependence.
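The Pesaran CD statistic aggregates the pairwise correlations between the residual series of the *M* equations and is standard normal under the null of no cross-sectional dependence. A pure-Python sketch (names ours):

```python
import math


def _corr(x, y):
    """Sample Pearson correlation between two series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pesaran_cd(residuals):
    """Pesaran CD statistic from M residual series of length T:
    CD = sqrt(2T / (M(M-1))) * sum over pairs i<j of rho_ij,
    asymptotically N(0, 1) under no cross-sectional dependence."""
    m, t = len(residuals), len(residuals[0])
    s = sum(_corr(residuals[i], residuals[j])
            for i in range(m) for j in range(i + 1, m))
    return math.sqrt(2.0 * t / (m * (m - 1))) * s
```

Perfectly correlated residual series across equations push the statistic far into the rejection region, which is the situation the SUR framework exploits.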

Specification test: The regression equation specification error test (RESET) designed by Ramsey (1969) is a general specification test for the linear regression model. It tests the exogeneity of the independent variables; that is, the null hypothesis is *E*[*ε<sub>i</sub>* | **X**<sub>*i*</sub>] = 0. Rejecting the null hypothesis thus indicates that there is a correlation between the error term and the regressors or that nonlinearities exist in the functional form of the regression. The results reported in Table 7 suggested a rejection of H0 only for IDN.

**Table 7.** The regression equation specification error (RESET) test.


Multicollinearity: We calculated the variance inflation factor (VIF) values among the predictors. A VIF value measures how many times larger Var(*β̂<sub>j</sub>*) is for multicollinear data than for orthogonal data. Multicollinearity is usually not a problem when the VIFs are not significantly larger than one (Mansfield and Helms 1982). In the literature, VIF values that exceed 10 are often regarded as indicating multicollinearity, but in weaker models, values above 2.5 may already be a cause for concern. Another measure of multicollinearity is the condition number (CN) of **X**′<sub>*i*</sub>**X**<sub>*i*</sub>, defined as the square root of the ratio of the largest characteristic root of **X**′<sub>*i*</sub>**X**<sub>*i*</sub> to the smallest. Belsley et al. (2005) suggested that a CN greater than 15 poses a concern, a CN in excess of 20 indicates a problem, and a CN close to 30 represents a severe problem. Table 8 displays the results of a series of multicollinearity diagnostics. In general, EXPORTS, IMPORTS, and BALANCE were found to be problematic with regard to their VIF values, while the remaining variables were mildly concerning. Moreover, the CN results suggested a very serious multicollinearity concern for the equations of ZAF, BRA, and IDN. In light of these results, it is clear that multicollinearity is present in the equations. According to Greene (2019), SUR estimation is more efficient when the correlation between covariates is lower. Therefore, the ridge-type SUR estimation is a good solution to this problem.


**Table 8.** Variance inflation factor (VIF) and condition number (CN) values.
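The VIF of predictor *j* is 1/(1 − R²<sub>j</sub>), where R²<sub>j</sub> comes from regressing predictor *j* on the remaining predictors. In the special two-regressor case this auxiliary R² reduces to the squared correlation between the two predictors, which the following sketch exploits (a hypothetical illustration with names of our own, not the multi-predictor computation behind Table 8):

```python
import math


def _corr(x, y):
    """Sample Pearson correlation between two series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF of either predictor in a two-regressor model: the auxiliary
    R^2 equals the squared correlation between the predictors, so
    VIF = 1 / (1 - r^2); orthogonal predictors give VIF = 1."""
    r = _corr(x1, x2)
    return 1.0 / (1.0 - r ** 2)
```

As the correlation between predictors approaches one, the VIF diverges, which is exactly the pattern seen for EXPORTS, IMPORTS, and BALANCE.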

Structural change: To investigate the stability of the coefficients in each equation, we used the CUSUM (cumulative sum) test of Brown et al. (1975), which checks for structural changes. The null hypothesis is coefficient constancy, while the alternative suggests structural change in the model over time. The results in Table 9 supported the stability of the coefficients over time.


**Table 9.** CUSUM test.

Following Lawal et al. (2019), we selected the important variables in each equation of the SUR model by implementing stepwise forward regression based on the AIC, using the function **ols**\_**step**\_**forward**\_**aic** from the **olsrr** package in R. The statistically significant variables are shown in Table 10. The sub-models were then constructed using these variables for each equation.


**Table 10.** Important variables per equation.
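The forward-AIC procedure is greedy: starting from the empty model, it repeatedly adds the candidate variable whose inclusion yields the lowest AIC and stops once no addition improves the criterion. A minimal sketch of this loop (names ours; the model-fitting step is abstracted behind a caller-supplied `aic_of` function rather than the **olsrr** implementation):

```python
def forward_select(candidates, aic_of):
    """Greedy forward selection by AIC: start from the empty model and
    repeatedly add the candidate whose inclusion yields the lowest AIC;
    stop as soon as no addition improves on the current best AIC.
    `aic_of(subset)` is assumed to return the AIC of the model fitted
    on that tuple of variables."""
    selected = []
    best = aic_of(())          # AIC of the empty (intercept-only) model
    remaining = list(candidates)
    while remaining:
        # score every one-variable extension of the current model
        score, var = min((aic_of(tuple(selected + [v])), v) for v in remaining)
        if score >= best:      # no extension improves the AIC: stop
            break
        best = score
        selected.append(var)
        remaining.remove(var)
    return selected
```

With a toy AIC table, the loop adds variables only while each addition lowers the criterion, mirroring how the per-equation sub-models in Table 10 were obtained.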

In light of the selected variables in Table 10, we construct the matrices of restrictions as follows:


$$\begin{aligned} \mathbf{R}\_3 &= \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{bmatrix}, \quad \mathbf{R}\_4 = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \end{bmatrix}, \\ \mathbf{R}\_5 &= \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix} \quad \text{and} \quad \mathbf{r}\_3 = \mathbf{r}\_4 = \mathbf{r}\_5 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}; \end{aligned}$$

thus, the reduced models are given by:

$$\begin{aligned} \text{TUR} &: \quad \text{FDI}\_{t} = \beta\_{0} + \beta\_{3} \text{EXPORTS} + \varepsilon\_{t} & \tag{15} \\ \text{ZAF} &: \quad \text{FDI}\_{t} = \beta\_{0} + \beta\_{7} \text{PREM} + \varepsilon\_{t} & \tag{16} \\ \text{BRA} &: \quad \text{FDI}\_{t} = \beta\_{0} + \beta\_{2} \text{DEFLATOR} + \beta\_{4} \text{IMPORTS} + \beta\_{8} \text{BALANCE} + \varepsilon\_{t} & \tag{17} \\ \text{IND} &: \quad \text{FDI}\_{t} = \beta\_{0} + \beta\_{4} \text{IMPORTS} + \beta\_{7} \text{PREM} + \beta\_{8} \text{BALANCE} + \varepsilon\_{t} & \tag{18} \\ \text{IDN} &: \quad \text{FDI}\_{t} = \beta\_{0} + \beta\_{3} \text{EXPORTS} + \beta\_{4} \text{IMPORTS} + \beta\_{7} \text{PREM} + \varepsilon\_{t} & \tag{19} \end{aligned}$$

Next, we combined Model (14) and Models (15)–(19) using the shrinkage and preliminary test strategies outlined in Section 3. Before we performed our analysis, the response was centered and the predictors were standardized for each equation, so that the intercept term was omitted. We then split the data into a series of training sets and a series of testing sets using the time series cross-validation technique of Hyndman and Athanasopoulos (2018). Each test set consisted of a single observation for the models that produced one-step-ahead forecasts. In this procedure, the observations in each training set occur prior to the observation in the corresponding test set, which ensures that no future observations can be used in constructing the forecast. We used the function **createTimeSlices** from the **caret** package in R here. The listed models were applied to the data, and predictions were made based on the divided training and test sets. The process was repeated 15 times, and for each subset's prediction, the mean squared error (MSE) and the mean absolute error (MAE) were calculated. The means of the 15 MSEs and MAEs were then used to evaluate the performance of each method. We also report the relative performances (RMAE and RMSE) with respect to the full model estimator for easier comparison: if the relative value of an estimator is larger than one, it is superior to the full model estimator.
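The expanding-window splitting scheme described above can be sketched as follows (a minimal Python analogue of this kind of slicing, with names of our own; the paper itself used **createTimeSlices** from **caret**, and the initial window length of 21 in the usage note below is a hypothetical choice):

```python
def time_slices(n, initial_window, horizon=1):
    """Expanding-window time-series cross-validation splits: each
    training set holds all observations before the test set, and each
    test set is the next `horizon` observations (a single observation
    for one-step-ahead forecasts), so no future data leaks into the
    forecast."""
    return [(list(range(end)), list(range(end, end + horizon)))
            for end in range(initial_window, n - horizon + 1)]
```

With *T* = 36 observations, an initial window of 21 and a one-step horizon would yield exactly 15 train/test splits, matching the 15 repetitions used to average the MSE and MAE.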

In Table 11, we report the MSE and MAE values and their standard errors to assess the stability of the algorithm. Based on this table, as expected, the RE had the smallest error values, since its restrictions excluded the insignificant variables nearly correctly. After the RE, the PSE performed best, followed by the SE and the PTE. Moreover, the performance of the OLS was the worst, owing to the problem of multicollinearity.


**Table 11.** Comparison of forecasting performance.

The numbers in parentheses are the corresponding standard errors of the MAE and MSE.

In order to test whether two competing models had the same forecasting accuracy, we used the two-sided Diebold–Mariano (DM) test (Diebold and Mariano 1995) with the forecasting horizon extended to one year and with both squared-error and absolute-error loss functions. A significant *p*-value in this test rejects the null hypothesis that the two models have the same forecasting accuracy. The results based on the absolute-error loss in Table 12 suggested that the prediction accuracy of the FME differed from that of all methods except the RE. Additionally, the forecasting accuracy of the OLS differed from that of the listed estimators. On the other hand, the results of the DM test based on the squared-error loss suggested that the observed differences between the RE and the shrinkage estimators were significant.


**Table 12.** Diebold–Mariano test for the forecasting results.

The numbers in parentheses are the corresponding *p*-values; LS denotes the loss function used; \* *p* < 0.1, \*\* *p* < 0.05, \*\*\* *p* < 0.01.
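The DM statistic itself is straightforward to compute: form the loss differential between the two forecast-error series and scale its mean by an estimate of its variance. A minimal sketch (names ours; for a horizon of *h* > 1, autocovariances of the differential up to lag *h* − 1 enter the variance):

```python
import math


def diebold_mariano(e1, e2, loss=lambda e: e * e, h=1):
    """Diebold-Mariano statistic for equal forecast accuracy: form the
    loss differential d_t = L(e1_t) - L(e2_t) and scale its mean by an
    estimate of its variance; h = 1 reduces the variance estimate to
    the plain sample variance. Asymptotically N(0, 1) under the null
    of equal forecasting accuracy."""
    d = [loss(a) - loss(b) for a, b in zip(e1, e2)]
    t = len(d)
    dbar = sum(d) / t
    var = sum((x - dbar) ** 2 for x in d) / t
    for k in range(1, h):   # lag-k autocovariance terms for h > 1
        gk = sum((d[i] - dbar) * (d[i - k] - dbar) for i in range(k, t)) / t
        var += 2.0 * gk
    return dbar / math.sqrt(var / t)
```

Passing `loss=abs` instead of the default squared loss reproduces the absolute-error variant of the test used for Table 12.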

Finally, the estimates of coefficients of all countries are given in Table 13.


