2.1. Benchmark Model
Our study is based upon the following model:

$$\log(\mathit{acps}_i) = \beta_0 + \beta_1\,\mathit{excd}_i + \beta_2\,\mathit{prvl}_i + \beta_3 \log(\mathit{yinc}_i) + u_i, \tag{1}$$

where acps stands for the student's average exam score in Korean, English, mathematics, science, and social science, which measures the student's academic performance; and excd and prvl denote the dummy variables for weekly after-school exercise and weekly after-school private lessons for Korean, English, mathematics, science, or social science, respectively. For example, excd is equal to 1 if the student attends after-school private lessons for exercise, and 0 otherwise. This model was specified to capture the effect of after-school exercise on the student's academic performance. Therefore, the exam score for physical education was not included in acps. In addition, yinc stands for the parent's yearly income measured in 10,000 Korean won. Logarithms were taken of acps and yinc, so that we could interpret the coefficient $\beta_3$ as the income elasticity of academic performance. Finally, the subscript $i$ denotes the student index, and $u_i$ is the error term.
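As a minimal sketch of how Model (1) is set up, the following Python snippet builds the log-transformed variables and fits the equation by ordinary least squares with NumPy. The data are synthetic and the values in `true_beta` are hypothetical, chosen only for illustration; they are not estimates from this study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Synthetic stand-ins for the survey variables (assumed, not the study's data).
excd = rng.integers(0, 2, n)              # 1 if weekly after-school exercise lessons
prvl = rng.integers(0, 2, n)              # 1 if weekly private lessons in school subjects
yinc = np.exp(rng.normal(8.0, 0.5, n))    # yearly income in 10,000 KRW

# Hypothetical coefficients: intercept, excd, prvl, log(yinc).
true_beta = np.array([3.5, 0.05, 0.10, 0.20])

# Regressor matrix of Model (1): constant, two dummies, and log income.
X = np.column_stack([np.ones(n), excd, prvl, np.log(yinc)])
acps = np.exp(X @ true_beta + rng.normal(0.0, 0.1, n))

# OLS fit of log(acps) on the regressors.
beta_hat, *_ = np.linalg.lstsq(X, np.log(acps), rcond=None)
print("estimated coefficients:", np.round(beta_hat, 3))
```

Because both acps and yinc enter in logarithms, `beta_hat[3]` reads as an elasticity: in this synthetic sample, a 1% rise in income goes with a rise of about `beta_hat[3]` percent in the average score.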
This empirical model was motivated by the hypothesis that regular exercise positively affects students' academic performance. For example, Alkadhi [6], de Greeff, Bosker, Oosterlaan, Visscher, and Hartman [7], Tomporowski, McCullick, Pendleton, and Pesce [8], and Xiang et al. [9], among others, provided theoretical bases for a positive association between physical and cognitive or mental activities. Furthermore, Belcher et al. [10] reported experimental results indicating that regular exercise modified the structure and function of the brain, positively affecting brain activity. These theoretical and experimental works imply that students' regular after-school exercise helps raise their academic performance, and Model (1) was specified to capture this effect empirically. Here, a student's regular exercise is defined as any form of after-school exercise lessons and excludes casual exercise, such as irregular or regular gym classes held at school. These classes are excluded because they are common to all students in Korea, which makes it difficult to capture differences in activity among students and the effect of those differences on academic performance.
Models similar to (1) have been empirically investigated in prior literature. For example, using Korean middle school student data, Shin, Yoo, and Kim [3] showed that academic performance was positively and structurally associated with after-school exercise hours, although the relationship was negative in the prediction model. In obtaining this result, Shin, Yoo, and Kim [3] employed the OLS and TSLS estimation methods. The OLS method estimates the prediction model, which captures a negative relationship between the two variables. Meanwhile, the TSLS method estimates the structural relationship between the two variables and yields a positive estimate, affirming the structural hypothesis in the literature.
The current study deepens the empirical results in prior literature. Students' responses to after-school exercise are unlikely to be the same for every student, and this can lead to different results in terms of raising their academic performance. For example, a student with high academic performance is expected to respond to after-school exercise differently from a student with low academic performance. Specifically, if academic performance is positively associated with the student's cognitive power and regular exercise helps raise that cognitive power, attending the after-school exercise class is expected to raise the cognitive power of a low-performing student more effectively than that of a high-performing student, mainly because the latter is likely to have already reached a level that cannot easily be raised by attending the physical exercise class. We estimated these different responses by specifying the QR model. Specifically, the responses were assumed to differ among students belonging to different quantile levels of academic performance, and we estimated the different coefficients by modifying Model (1) to have different parameters at different quantile levels. That is, if we let $\tau$ denote the quantile level of the student's academic performance, the parameters in Model (1) are modified to depend on $\tau$, as follows: for each $\tau \in (0, 1)$,

$$\log(\mathit{acps}_i) = \beta_0(\tau) + \beta_1(\tau)\,\mathit{excd}_i + \beta_2(\tau)\,\mathit{prvl}_i + \beta_3(\tau) \log(\mathit{yinc}_i) + u_i(\tau). \tag{2}$$

The parameters on the right side now depend on the quantile level $\tau$, so that the top 10% of students can be associated with the right-side variables differently from the bottom 10% of students.
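To illustrate how quantile-specific coefficients arise, the sketch below fits a quantile regression at three quantile levels by solving the check-loss minimization as a linear program with SciPy. The helper name `quantile_regression`, the synthetic data, and the data-generating design are assumptions made for illustration; this is not the estimation code used in the study.

```python
import numpy as np
from scipy.optimize import linprog

def quantile_regression(y, X, tau):
    """Minimize the Koenker-Bassett check loss as a linear program."""
    n, k = X.shape
    # Unknowns: k free coefficients, then positive and negative residual parts.
    cost = np.concatenate([np.zeros(k), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])   # X b + u_plus - u_minus = y
    bounds = [(None, None)] * k + [(0, None)] * (2 * n)
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:k]

rng = np.random.default_rng(1)
n = 500
excd = rng.integers(0, 2, n)
prvl = rng.integers(0, 2, n)
log_yinc = rng.normal(8.0, 0.5, n)

# Hypothetical design: exercise shifts scores up and compresses their spread,
# so its quantile coefficient falls as tau rises.
noise = (0.2 - 0.1 * excd) * rng.normal(size=n)
log_acps = 3.5 + 0.10 * excd + 0.10 * prvl + 0.20 * log_yinc + noise

X = np.column_stack([np.ones(n), excd, prvl, log_yinc])
coefs = {tau: quantile_regression(log_acps, X, tau) for tau in (0.1, 0.5, 0.9)}
for tau, b in coefs.items():
    print(f"tau={tau}: excd coefficient = {b[1]:.3f}")
```

In this synthetic design the excd coefficient is larger at the 0.1 quantile than at the 0.9 quantile, which mirrors the kind of heterogeneity Model (2) is meant to detect. In practice a packaged routine such as the `QuantReg` class in statsmodels would also supply standard errors.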
We estimated Model (2) by the QR method. Koenker and Bassett [4] provided the estimation method for a model specified with the same motivation as that of this study, under general model assumptions, and showed that their estimator was consistent for the desired parameters and asymptotically normal around the unknown parameter values. In addition, Koenker [11] demonstrated how to test hypotheses on the unknown parameters using the asymptotic normal distribution provided by Koenker and Bassett [4]. In particular, Koenker [11] employed the robust standard error to define the $t$-test and showed that its null limit distribution was standard normal. Below, we exploit the technical advances in Koenker [11] to estimate Model (2).
Nevertheless, the QR method does not estimate the structural relationship between academic performance and after-school exercise. Note that when Model (2) is estimated by QR at different quantile levels, the weighted average of the estimates with respect to $\tau$ turns out to be identical to the estimate obtained by OLS. This implies that the QR method, like OLS, estimates a prediction model and cannot be associated with the structural relationship between the variables.
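The link between the quantile coefficients and the OLS coefficients can be written out at the population level. The following is a sketch under two assumptions: the conditional quantile of $Y = \log(\mathit{acps})$ given the regressors $X$ is linear at every $\tau \in (0, 1)$, and $Y$ has a finite conditional mean. The notation $Q_\tau$ and the coefficient vector $\beta(\tau)$ are introduced here for illustration.

```latex
\begin{aligned}
E[Y \mid X] &= \int_0^1 Q_\tau(Y \mid X)\, d\tau
  && \text{(the mean is the integral of the quantile function)} \\
            &= \int_0^1 X'\beta(\tau)\, d\tau
             = X' \int_0^1 \beta(\tau)\, d\tau .
\end{aligned}
```

Under these assumptions the conditional mean is linear in $X$ with coefficient vector $\int_0^1 \beta(\tau)\, d\tau$, which is the quantity that OLS estimates. Averaging the QR coefficients over $\tau$ therefore recovers the prediction model, not a structural one.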
We therefore estimated the structural form of Model (2) by IVQR. Chernozhukov and Hansen [5] provided a method to estimate the structural parameters in Model (2) under the condition that proper instrumental variables are available. They showed that their estimator consistently estimates the unknown structural parameters and that its limit distribution is normal under some mild regularity conditions, which enables us to construct the $t$-test as for the QR method. The model estimated by IVQR is different from that estimated by QR. We emphasize that, for each quantile level, IVQR consistently estimates the structural quantile equation rather than the quantile equation, viz., the quantile prediction model. The relationship between the QR and IVQR estimations is parallel to that between the OLS and TSLS estimations in terms of their structures. Exploiting the advances in Chernozhukov and Hansen [5], we estimated the structural quantile model and drew model implications that were different from those of the QR method. (The following URL provides the Stata code to estimate the model by the IVQR method: http://sites.google.com/site/dwkwak/dataset-and-code (accessed on 23 November 2021).)
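For readers who want to see the mechanics, the sketch below implements a simplified grid-search version of the inverse quantile regression idea for a single endogenous regressor. It is not the Stata code at the URL above, and it makes several assumptions for illustration: the data are synthetic, the function names `quantile_regression` and `ivqr_grid` are hypothetical, the three instruments are collapsed into one by projecting excd on the exogenous variables, and no standard errors are computed. For each candidate value of the excd coefficient, the snippet regresses the adjusted outcome on the exogenous regressors and the instrument, and keeps the candidate for which the instrument's coefficient is closest to zero.

```python
import numpy as np
from scipy.optimize import linprog

def quantile_regression(y, X, tau):
    """Minimize the Koenker-Bassett check loss as a linear program."""
    n, k = X.shape
    cost = np.concatenate([np.zeros(k), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])   # X b + u_plus - u_minus = y
    bounds = [(None, None)] * k + [(0, None)] * (2 * n)
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:k]

def ivqr_grid(y, d, X, Z, tau, grid):
    """Grid-search inverse quantile regression for one endogenous regressor d."""
    W = np.column_stack([X, Z])
    d_hat = W @ np.linalg.lstsq(W, d, rcond=None)[0]   # projection of d on X and Z
    phi = (d_hat - d_hat.mean()) / d_hat.std()         # standardized instrument
    best = None
    for alpha in grid:
        coef = quantile_regression(y - alpha * d, np.column_stack([X, phi]), tau)
        score = abs(coef[-1])    # the instrument should not matter at the true alpha
        if best is None or score < best[0]:
            best = (score, alpha, coef[:-1])
    return best[1], best[2]

rng = np.random.default_rng(2)
n = 600
Z = rng.normal(size=(n, 3))   # stand-ins for log height, log weight, log weekend sleep
log_yinc = rng.normal(8.0, 0.5, n)
v = rng.normal(size=n)
excd = (Z.sum(axis=1) + v > 0).astype(float)
u = -0.8 * v + 0.6 * rng.normal(size=n)   # error correlated with the exercise choice
# Hypothetical structural effect of excd at every quantile: 0.5.
log_acps = 1.0 + 0.5 * excd + 0.3 * log_yinc + 0.5 * u

X = np.column_stack([np.ones(n), log_yinc])
grid = np.linspace(-0.5, 1.5, 31)
alpha_hat, beta_hat = ivqr_grid(log_acps, excd, X, Z, 0.5, grid)
naive = quantile_regression(log_acps, np.column_stack([X, excd]), 0.5)[-1]
print(f"IVQR estimate: {alpha_hat:.3f}, plain QR estimate: {naive:.3f}")
```

The plain QR estimate printed at the end treats excd as exogenous, so comparing the two numbers shows how the structural and prediction estimates can differ when the regressor is correlated with the error.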
The instrumental variables play a critical role in estimating the structural quantile model. For the goal of the current study, we employed the logarithms of the students' height, weight, and sleeping hours on the weekend. There were two motivations for these instrumental variables. First, a student's height and weight are closely associated with outside activities [12], so they are highly correlated with after-school exercise. Second, a student attending the regular after-school exercise class tends to sleep more than other students on the weekend in order to recover from physical fatigue, because students cannot oversleep on weekdays; hence, students' sleeping hours on weekends tend to be correlated with regular after-school exercise. Nihayah et al. [13] and Zeek et al. [14] also provided case studies on the relationship between students' sleeping hours and their academic performance. Based upon these two facts, the logarithms of students' height, weight, and sleeping hours on weekends were employed as our instrumental variables for the IVQR method. The same instrumental variables were also selected by Shin, Yoo, and Kim [3] when estimating their structural model by TSLS.
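The correlation argument above can be checked informally with a first-stage regression of excd on the instruments. The sketch below computes the joint F statistic for the instruments in a linear probability regression. It rests on assumptions: the data are synthetic, the function name `first_stage_f` is hypothetical, and the section itself does not report such a statistic. A commonly used rule of thumb in linear IV settings treats values above about 10 as a sign that the instruments are not weak.

```python
import numpy as np

def first_stage_f(d, X, Z):
    """F statistic for H0: all instrument coefficients are zero in d ~ X + Z."""
    def rss(W):
        resid = d - W @ np.linalg.lstsq(W, d, rcond=None)[0]
        return resid @ resid

    W_full = np.column_stack([X, Z])
    n, q = len(d), Z.shape[1]
    rss_restricted, rss_full = rss(X), rss(W_full)
    return ((rss_restricted - rss_full) / q) / (rss_full / (n - W_full.shape[1]))

rng = np.random.default_rng(3)
n = 1000
# Synthetic stand-ins for log height, log weight, and log weekend sleeping hours.
Z = rng.normal(size=(n, 3))
excd = (Z.sum(axis=1) + rng.normal(size=n) > 0).astype(float)
X = np.ones((n, 1))   # constant only; other exogenous regressors could be added

F = first_stage_f(excd, X, Z)
print(f"first-stage F statistic: {F:.1f}")
```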
In addition to the after-school exercise, the other explanatory variables on the right side of Model (2) were included in order to explain the variation in the academic performance score. After-school private lessons for school subjects are certainly helpful in raising the academic performance score, and controlling for them allows the after-school exercise (excd) to maintain its explanatory power. In addition, the parent's income level was included on the right side, noting that a high parental income provides the student with more opportunities to take high-quality private lessons, raising the student's academic performance. If these variables were omitted from the right side, the explanatory power of the after-school exercise might be overwhelmed by the variation in the error term. We call Model (2) our benchmark model and estimated it by both QR and IVQR.
2.2. Model Extensions
We next extended the model scope by including other explanatory variables on the right side and tested the robustness of the model estimates. For this goal, we applied the strategy taken by Shin, Yoo, and Kim [3] to our QR and IVQR models. As our first extension, we specified Model (3), which adds the following variables to the right side of Model (2): gndr denotes the student's gender, such that it is 1 for male and 0 for female students; expl denotes the monthly expenditure on after-school private lessons, measured in 10,000 KRW; nsib denotes the number of siblings; and moed_x denotes the mother's education level. The suffix x indicates the education level. That is, m, h, p, u, and g denote middle school, high school, polytechnic school, university, and graduate school, respectively. For example, if moed_u is 1, the student's mother was educated up to the university level.
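One way to write Model (3) that is consistent with these definitions is given below. It is a sketch under stated assumptions: expl and nsib are taken to enter in levels (the text mentions logarithms only for acps and yinc), the education category below middle school is taken as the omitted base group, and the coefficient symbols $\beta_j(\tau)$ and $\gamma_x(\tau)$ are notation chosen here.

```latex
\begin{aligned}
\log(\mathit{acps}_i) ={}& \beta_0(\tau) + \beta_1(\tau)\,\mathit{excd}_i
   + \beta_2(\tau)\,\mathit{prvl}_i + \beta_3(\tau)\log(\mathit{yinc}_i) \\
 &+ \beta_4(\tau)\,\mathit{gndr}_i + \beta_5(\tau)\,\mathit{expl}_i
   + \beta_6(\tau)\,\mathit{nsib}_i \\
 &+ \sum_{x \in \{m,h,p,u,g\}} \gamma_x(\tau)\,\mathit{moed\_x}_i + u_i(\tau).
\end{aligned}
```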
These additional explanatory variables were included in order to examine how they are associated with the academic performance score. First, according to Alkadhi [6] and de Greeff, Bosker, Oosterlaan, Visscher, and Hartman [7], among others, male and female students have relative advantages in different disciplines, so a student's academic performance score for different disciplines can be gender-dependent. Therefore, we expected the coefficient of gndr in Model (3) to be significantly different from zero, and to further diverge depending on the quantile levels. Second, we included the monthly expenditure on private lessons (expl) and the number of siblings (nsib) in order to explain the partial effect of the parent's income on academic performance. If the parents' income is spent on the student's private lessons in order to raise academic performance, the income effect can be better explained by including the expenditure on private lessons in addition to the parent's income. Similarly, the income effect can diminish for parents with multiple children if their total income is divided among each child's private lessons. We therefore included the number of siblings on the right side to capture this split-income effect. Finally, we included the mother's education level on the right side to detect the parental-involvement effect. Bogenschneider [15], Boonk, Gijselaers, Ritzen, and Brand-Gruwel [16], and Glick and Hohmann-Marriott [17], among others, pointed out that a student's academic performance was closely related to parental involvement and, further, that parental involvement can exist in various forms. This suggests that the parent's education level can serve as a proxy for parental involvement, which is generally difficult to measure objectively. Following the suggestions in prior studies, we included the mother's education level to measure parental involvement and examined how it affected the student's academic performance at the different quantile levels [15,17,18,19].
By estimating Model (3) by both QR and IVQR, we can examine the hypotheses for the newly included explanatory variables on the right side. A given hypothesis may be relevant at all quantile levels or only at some of them, so the results can differ across quantile levels. Below, we empirically examine whether each hypothesis is valid at each quantile level, and from this we draw detailed empirical inferences about Korean middle school students.
As our final model extension, we included further explanatory variables in Model (3). As mentioned above, parental involvement is a critical variable that explains a student's academic performance, but including only the mother's education level on the right side may be insufficient to capture its effect. We therefore complemented the mother's education level with the father's education level, faed_x, and specified Model (4), where faed_x is a dummy variable indicating the father's education level, and the suffix x denotes the same education levels as for the mother.
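A form of Model (4) consistent with these definitions can be sketched as follows. The assumptions are that expl and nsib enter in levels, that the education category below middle school is the omitted base group for both parents, and that the coefficient symbols $\beta_j(\tau)$, $\gamma_x(\tau)$, and $\delta_x(\tau)$ are notation chosen here for illustration.

```latex
\begin{aligned}
\log(\mathit{acps}_i) ={}& \beta_0(\tau) + \beta_1(\tau)\,\mathit{excd}_i
   + \beta_2(\tau)\,\mathit{prvl}_i + \beta_3(\tau)\log(\mathit{yinc}_i) \\
 &+ \beta_4(\tau)\,\mathit{gndr}_i + \beta_5(\tau)\,\mathit{expl}_i
   + \beta_6(\tau)\,\mathit{nsib}_i \\
 &+ \sum_{x \in \{m,h,p,u,g\}} \gamma_x(\tau)\,\mathit{moed\_x}_i
   + \sum_{x \in \{m,h,p,u,g\}} \delta_x(\tau)\,\mathit{faed\_x}_i + u_i(\tau).
\end{aligned}
```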
We estimated the extended Models (3) and (4) in order to affirm the estimation results obtained by QR and IVQR. If the quantile prediction and structural quantile equations in Model (2) are consistently estimated by QR and IVQR, respectively, the estimates should be similar to those obtained from Models (3) and (4). By estimating these multiple models, we attempted to ensure that our model estimates were robust to model variation.