4.1. Factor Analysis of School Questionnaire
A principal components analysis (PCA) was conducted on a 77-item questionnaire administered to school principals to gather information about the school contexts for teaching and learning. The suitability of PCA was assessed before analysis. Inspection of the rotated component matrix shows that all variables have a factor-loading coefficient greater than 0.3. The overall Kaiser–Meyer–Olkin (KMO) measure is 0.771, which is acceptable [24]. Bartlett’s test of sphericity is statistically significant (p < 0.05), indicating that the data are likely factorable. Component loadings and variables of the rotated solution are presented in Table 3.
The PCA yields 14 components with eigenvalues greater than 1. Only the first five were retained, and the principal component analysis was re-run with these five; the remaining components have low reliability coefficients and few items loading on them. The five retained factors, which are used as categorical variables, explain 17.3%, 14.5%, 7.8%, 5.6%, and 4.2% of the total variance, respectively. Because these five factors together account for about half of the variability in the items, they serve as a simpler substitute for the full set of variables, and the remaining components were dropped. A visual inspection of the scree plot also indicates that five components should be retained, and a five-component solution meets the interpretability criterion [25,26]. The five factors were selected based on several criteria. The first criterion is that a component with an eigenvalue less than one explains less variance than a single variable and should not be retained (Table 4).
In this study, the components ranked 6th to 14th have eigenvalues between 1.132 and 1.928, so they pass the eigenvalue-one criterion. However, these components have low reliability coefficients and few items loading on them. Components 1 to 5 are therefore retained, and components 6 to 14 are not. The second criterion is the cumulative percentage of variance explained by a set number of components; here, the first five factors explain about 50% of the total variance. Using the lower threshold of at least 50% of total variance leads to the retention of the first five components [25,26]. The third criterion is the scree plot. The components to retain are those before the (last) inflection point of the graph (Figure 1). The inflection point marks where the graph begins to level out and subsequent components add little to the total variance [25,26].
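As a quick check of the second criterion, the cumulative percentage can be recomputed from the per-factor percentages reported later in this section (17.32%, 14.52%, 7.86%, 5.64%, and 4.23%). The sketch below only re-adds those reported figures; the variable names are illustrative.

```python
# Percentages of total variance explained by the five retained factors,
# as reported for each factor in this section.
explained = [17.32, 14.52, 7.86, 5.64, 4.23]

# Running total after each successive factor.
cumulative = []
running = 0.0
for pct in explained:
    running += pct
    cumulative.append(round(running, 2))

print(cumulative)
```

The final running total is just under 50%, which matches the description of the five factors as explaining about 50% of the total variance.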
In this study, visual inspection of the scree plot leads to the retention of five components (Figure 1). The fourth criterion is reliability. The study adopted Cronbach’s alpha (α) to measure the internal consistency of the scales created. The internal consistency of the five components ranges from high to low (0.94, 0.90, 0.91, 0.572, 0.695). One factor has a low internal reliability coefficient (0.572), but it is near 0.60 and is retained. The items within each component are closely related, which supports the retention of five components. The fifth criterion, interpretability, is arguably the most crucial. It revolves around the concept of simple structure: a readily explainable division of variables into separate components. Extracting five components allows simple structure to be attained. Given this, the principal components analysis was re-run with SPSS forced to extract (retain) five components, instead of applying the default eigenvalue-one criterion, and with all coefficients less than 0.3 suppressed [25,26].
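For readers unfamiliar with the reliability criterion, the following is a minimal sketch of the standard Cronbach's alpha formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). It is not the SPSS procedure used in the study, and the responses in `demo_items` are hypothetical values made up for illustration.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: list of item columns, each a list of respondent scores
    (all columns must have the same length).
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total scale score per respondent, summed across items.
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(variance(col) for col in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

# Hypothetical responses from five principals to three items (not study data).
demo_items = [
    [1, 2, 3, 4, 2],
    [2, 2, 3, 4, 1],
    [1, 3, 3, 4, 2],
]
alpha = cronbach_alpha(demo_items)
print(round(alpha, 3))
```

Items that move together across respondents push alpha toward 1, which is why closely related items within a component yield the high coefficients reported for the first three factors.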
The five-component solution explains approximately 50% of the total variance. A varimax orthogonal rotation is employed to aid interpretability. The rotated solution exhibits a ‘simple structure’ [
24,
26]. The interpretation of the data is consistent with what the questionnaire is designed to measure, with strong loadings of items on factor 1, named general school resources; items on factor 2, named school discipline and safety; factor 3, named parental support; factor 4, named principal experience and education; and factor 5, named library and instruction resources (
Table 5).
Table 5 shows that factor 1: general school resources is a combination of 14 variables, coded as BCBG13AE, BCBG13AE, BCBG13AE, etc. The value of Cronbach’s alpha (α) for factor 1 is 0.95, which is above 0.9 and considered excellent [25,27], indicating high internal consistency among the variables within the factor. Each component variable loads moderately to highly on the underlying factor (loadings between 0.43 and 0.86), indicating that the variables measure the underlying construct relatively well. Factor 1 accounts for 17.32% of the total variance, which is relatively high for a 14-variable scale. Taken together, these results suggest that the variables work well as a unit.
Factor 2: discipline and safety is a combination of 12 variables (e.g., BCBG15F, BCBG15G, BCBG15H). The value of Cronbach’s alpha (α) for factor 2 is 0.90, which meets the 0.9 threshold and is considered excellent [25,26,27]. Each component variable loads moderately to highly on the underlying factor (loadings between 0.59 and 0.81), indicating that the variables measure the underlying construct relatively well. Factor 2 accounts for 14.52% of the total variance, which is relatively high for a 12-variable scale. Taken together, these results suggest that the component variables work well as a unit.
Factor 3: parental support is a combination of 13 variables (e.g., BCBG14A, BCBG14D, BCBG14G). The value of Cronbach’s alpha (α) for factor 3 is 0.917, which is above 0.9 and considered excellent [24,25]. Each component variable loads moderately to highly on the underlying factor (loadings between 0.51 and 0.83), indicating that the variables measure the underlying construct relatively well. Factor 3 accounts for 7.86% of the total variance, which is moderate for a 13-variable scale. Taken together, these results suggest that the component variables work well as a unit.
Factor 4: principal experience and education is a combination of five variables (e.g., BCBG22A, BCBG22A, BCBG03A). The value of Cronbach’s alpha (α) for factor 4 is 0.57, which is low but still acceptable for exploratory factor analysis, as it is close to the reliability coefficient of 0.6 [25]. It indicates an acceptable, though modest, degree of internal consistency among the variables within the factor. Each component variable loads moderately on the underlying factor (loadings between 0.38 and 0.69), indicating that the variables measure the underlying construct relatively well [25]. Factor 4 accounts for 5.64% of the total variance, which is moderate for a five-variable scale. On this basis, the component variables are treated as a unit.
Factor 5: library and instruction resources is a combination of seven variables (e.g., BCBG17A, BCBG08A, BCBG03BRSCHOOL). The value of Cronbach’s alpha (α) for factor 5 is 0.695, which is near 0.70 and considered acceptable [25]. Each component variable loads moderately on the underlying factor (loadings between 0.30 and 0.55), indicating that the variables measure the underlying construct relatively well. Factor 5 accounts for 4.23% of the total variance, which is relatively low for a seven-variable scale. Overall, the component variables work acceptably as a unit.
The factor analysis results show that the internal consistency of factors 1, 2, and 3 is very high (0.94, 0.90, and 0.91, respectively), and the items in these factors are closely related. The internal consistency of factors 4 and 5 is moderate (0.57 and 0.70, respectively) in comparison. As a result of the factor analysis with principal component analysis, five new factors are created from the school questionnaire: factor 1: general school resources, factor 2: discipline and safety, factor 3: parental support, factor 4: principal experience and education, and factor 5: library and instruction resources. These factors are used in one-sample t-tests and regression analysis to identify the school factors that most affected students’ achievement in TIMSS 2015.
4.2. One-Sample t-Test of School Questionnaire: Factor 1—General School Resources
A one-sample t-test was performed to examine the perceptions of the school principals on items related to factor 1: general school resources. These items have four-point Likert-scale responses, from A lot (coded 4) to Not at all (coded 1), and the neutral value of 2.5 (the scale midpoint) is used as the test value. The one-sample t-test shows that the principals have an overall significant negative perception toward factor 1: general school resources (mean = 2.42, SD = 0.88, and p < 0.05), since it attains an overall mean of less than 2.5, with SD = 1.302 and p < 0.05. Their perceptions are negative toward school buildings and grounds, heating/cooling and lighting systems, and computer technology for teaching and learning (e.g., computers or tablets for student use). They also express negative perceptions toward technologically competent staff (mean = 2.34, SD = 1.064, and p < 0.05), instructional materials (e.g., textbooks) (mean = 2.37, SD = 1.199, and p < 0.05), audio–visual resources for delivery of instruction (e.g., interactive whiteboards, digital projectors) (mean = 2.34, SD = 1.063, and p < 0.05), supplies (e.g., papers, pencils, materials) (mean = 2.15, SD = 1.186, and p < 0.05), concrete objects or materials to help students understand quantities or procedures (mean = 2.41, SD = 0.936, and p < 0.05), library resources relevant to mathematics instruction (mean = 2.37, SD = 0.939, and p < 0.05), and calculators for mathematics instruction (mean = 2.26, SD = 1.140, and p < 0.05). Overall, the principals have a negative perception toward factor 1: general school resources (mean = 2.42, SD = 0.88, and p < 0.05) (Table 6).
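The test statistic behind these comparisons can be sketched as follows: the one-sample t statistic is the difference between the sample mean and the test value (2.5 here), divided by the standard error of the mean. The responses in `demo_responses` are hypothetical, since the number of responding principals is not stated in this section, and the function name is illustrative.

```python
import math
from statistics import mean, stdev

def one_sample_t(sample, test_value):
    """Return the one-sample t statistic for `sample` against `test_value`.

    The statistic has len(sample) - 1 degrees of freedom.
    """
    n = len(sample)
    standard_error = stdev(sample) / math.sqrt(n)
    return (mean(sample) - test_value) / standard_error

# Hypothetical four-point Likert responses (1 = Not at all ... 4 = A lot).
demo_responses = [1, 2, 2, 3, 2, 3, 1, 2]
t_stat = one_sample_t(demo_responses, 2.5)
print(round(t_stat, 3))
```

A negative statistic, as in this toy sample, corresponds to a mean below the test value of 2.5, which is the pattern the text describes as a negative perception.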
4.7. School Factors Multiple Regression
Multiple regression analysis was also employed to investigate the influence of school factors on eighth-grade students’ mathematics achievement in TIMSS 2015. Students’ achievement in TIMSS 2015 was set as the dependent variable, and the five factors were entered as the independent variables. Multiple regression using the enter method was deemed a suitable analysis method [28]. Before conducting the analysis, the relevant assumptions were examined. Tests indicate that the data meet the assumptions of no multicollinearity [29] and independent errors (Durbin–Watson = 1.527). Further analysis of standardized residuals identifies no outliers (std. residual min = −4.159; std. residual max = 3.360). Scatter plots demonstrate that the assumptions of linearity and homogeneity of variance are satisfied [30].
A one-way ANOVA was conducted to determine whether the five plausible values and the average of the five plausible values differ (Table 11). Student achievement was classified into six groups: the first, second, third, fourth, and fifth plausible values, and the average of the five plausible values. There are no outliers in any group. Data are normally distributed for each group, as assessed by the Shapiro–Wilk test (p > 0.05), and variances are homogeneous, as assessed by Levene’s test of homogeneity of variances. The one-way ANOVA indicates that the differences among the five plausible values and their average are not statistically significant (p = 0.876 > 0.05) (Table 11).
To investigate the effects of school factors (factor 1: general school resources, factor 2: school discipline and safety, factor 3: parental support, factor 4: principal experience and education, and factor 5: library and instruction resources) on students’ achievement in TIMSS 2015, a five-stage multiple regression using the enter method was deemed a suitable method of analysis [28]. The purpose of multiple regression is to determine how much of the variation in the dependent variable is explained as independent variables are added. Multiple regression can also be used to predict values of the dependent variable from new values of the independent variables, and to estimate how much the dependent variable changes when an independent variable changes by one unit. This study focuses on the proportion of variance in the dependent variable that is explained as independent variables are added. In interpreting and reporting the multiple regression results, we worked through three phases: (a) estimating the regression models to be compared, (b) deciding whether the multiple regression model fits the data well, and (c) interpreting the coefficients in the multiple regression model [21].
A five-stage multiple regression is conducted to investigate school factors’ effects on students’ achievement in TIMSS 2015 (Table 12). Factor 1: general school resources is entered at stage one as the main predictor, to observe its effect on students’ achievement. Factor 2: discipline and safety is entered at stage two, factor 3: parental support at stage three, factor 4: principal experience and education at stage four, and factor 5: library and instruction resources at stage five. This order seems plausible for investigating school factors’ effects on students’ achievement in TIMSS 2015 (Table 12).
Before conducting the multiple regression, the relevant assumptions of this statistical analysis are tested. Firstly, a sample size of 4838 is deemed adequate, given that five independent variables are included in the analysis. Green [31] suggests a rule of thumb for determining an appropriate number of participants: N > 50 + 8m (where m is the number of independent variables). By this formula, the sample size must exceed 90 (N > 50 + 8(5)) to detect a moderate relationship between the dependent variable and the five independent variables. An examination of correlations reveals statistically significant correlations between achievement and the school factors. However, the collinearity tests indicate that the data meet the assumption of no multicollinearity [29].
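The sample-size check from Green's rule of thumb amounts to a single comparison. A minimal sketch, with an illustrative function name:

```python
def green_minimum_n(m):
    """Green's rule of thumb: the sample size N should exceed 50 + 8m,
    where m is the number of independent variables."""
    return 50 + 8 * m

threshold = green_minimum_n(5)      # five school factors
sample_size = 4838                  # sample size reported in the text
adequate = sample_size > threshold

print(threshold, adequate)
```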
The multiple regression reveals that in model 1, factor 1: general school resources contributes significantly to the regression model (F (1, 4374) = 19.946, p < 0.01). Model 1 (R² = 0.005) accounts for approximately 0.5% of the total variance in students’ achievement in TIMSS 2015. Adding factor 2: discipline and safety to the prediction of achievement (model 2) brings a statistically significant improvement over the earlier model, raising R² to 0.029, F (2, 4373) = 64.609, p < 0.01, so that the model accounts for 2.9% of the total variance. The addition of factor 3: parental support (model 3) leads to a statistically significant increase in R² to 0.088, F (3, 4372) = 141.308, p < 0.01, accounting for 8.8% of the total variance. The addition of factor 4: principal experience and education (model 4) leads to a statistically significant increase in R² to 0.163, F (4, 4371) = 213.486, p < 0.01, accounting for 16.3% of the total variance. The fifth and final model, which comprises all five predictor factors (factor 1: general school resources, factor 2: discipline and safety, factor 3: parental support, factor 4: principal experience and education, and factor 5: library and instruction resources), leads to a statistically significant increase in R² to 0.168, F (5, 4370) = 176.022, p < 0.01, and accounts for 16.8% of the total variance (Table 12).
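The reported F values can be roughly cross-checked against the reported R² values using the standard relationship for an overall regression F test, F = (R²/k) / ((1 − R²)/df), where k is the number of predictors and df is the residual degrees of freedom. The sketch below assumes that each model's numerator degrees of freedom equal its number of predictors (so five for model 5); because R² is reported to only three decimals, the recomputed values agree only approximately with the published ones.

```python
def overall_f(r_squared, n_predictors, df_residual):
    """Overall F statistic implied by R-squared for a linear regression."""
    explained = r_squared / n_predictors
    unexplained = (1 - r_squared) / df_residual
    return explained / unexplained

# (R-squared, number of predictors, residual df) as reported for each model.
models = {
    1: (0.005, 1, 4374),
    2: (0.029, 2, 4373),
    3: (0.088, 3, 4372),
    4: (0.163, 4, 4371),
    5: (0.168, 5, 4370),
}

implied_f = {m: overall_f(*spec) for m, spec in models.items()}
for m, f in implied_f.items():
    print(m, round(f, 1))
```

Under these assumptions the implied F is largest for model 4 and slightly lower for model 5, which is the same ordering as the reported values of 213.486 and 176.022.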
The ANOVA results (Table 13) indicate that each of the five models (with one, two, three, four, and five predictors, respectively) is statistically significant (p < 0.01). In particular, the F value is largest for the model with four predictors. These F values reflect the overall predictive effect of each model, which differs from the F for the change in explained variance when a variable is added. The p-value of 0.000 < 0.01 for models 1, 2, 3, 4, and 5 implies that each regression model is statistically significant, indicating a significant linear relationship between achievement and general school resources, school discipline and safety, parental support, principal experience and education, and library and instruction resources.
The collinearity statistics are as follows: for factor 1: general school resources, tolerance = 0.922 and VIF = 1.084; for factor 2: discipline and safety, tolerance = 0.848 and VIF = 1.179; for factor 3: parental support, tolerance = 0.839 and VIF = 1.191; for factor 4: principal experience and education, tolerance = 0.942 and VIF = 1.061; and for factor 5: library and instruction resources, tolerance = 0.927 and VIF = 1.079. Furthermore, the data meet the assumption of independent errors (Durbin–Watson = 1.350). An analysis of standardized residuals indicates that the data contain no outliers (standard residual min = −3.757, standard residual max = 3.853). Residual and scatter plots indicate that the assumptions of linearity and homogeneity of variance are satisfied [30].
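Tolerance and the variance inflation factor carry the same information, since VIF = 1 / tolerance. The short sketch below recomputes the VIFs from the tolerances reported above and compares the largest one with the cutoff of five used in this section; the dictionary keys are shorthand labels for the five factors.

```python
# Tolerances as reported for the five predictors.
tolerances = {
    "general school resources": 0.922,
    "discipline and safety": 0.848,
    "parental support": 0.839,
    "principal experience and education": 0.942,
    "library and instruction resources": 0.927,
}

# VIF is the reciprocal of tolerance.
vifs = {name: 1 / tol for name, tol in tolerances.items()}
max_vif = max(vifs.values())

for name, vif in vifs.items():
    print(f"{name}: VIF = {vif:.3f}")
print("all below 5:", max_vif < 5)
```

The recomputed values match the reported VIFs to within rounding of the three-decimal tolerances.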
Table 14 presents the results of the multiple regression model with five plausible values of student achievement in mathematics, showing the coefficients for the constant and the five predictors of student achievement in TIMSS 2015. The regression slopes (B) and significance values (p) are used to discuss the final model, which combines the five factors to predict student achievement (average). These values are as follows:
- (1) Constant (average score): B = 693.615, p = 0.000: significant;
- (2) General school resources: B = −0.225, p = 0.880: not significant;
- (3) School discipline and safety: B = −9.947, p = 0.000: significant;
- (4) School parental support: B = −40.998, p = 0.000: significant;
- (5) Principal experience and education: B = −48.932, p = 0.000: significant;
- (6) School library and instruction resources: B = −9.545, p = 0.000: significant.
From the analysis above, the best-fitting model for predicting student achievement in TIMSS 2015 is the linear combination of the constant, factor 1: general school resources, factor 2: discipline and safety, factor 3: parental support, factor 4: principal experience and education, and factor 5: library and instruction resources.
Model 1:
Achievement in math = 693.615 − 0.225 (general school resources) − 9.947 (school discipline and safety) − 40.998 (school parental support) − 48.932 (school principal experience and education) − 9.545 (school library and instruction resources).
This model indicates that for every one unit increase in general school resources, the achievement declines by 0.225; for one unit increase in school discipline and safety, the achievement declines by 9.947; for one unit increase in school parental support, the achievement declines by 40.998; for one unit increase in school principal experience and education, the achievement declines by 48.932; and for one unit increase in the school library and instruction resources, the achievement declines by 9.545.
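The fitted equation can be written as a small function to make the one-unit interpretation concrete. This is only a sketch: the intercept and slopes are the reported values, but the input scores in the example are hypothetical, and the scale on which each factor is scored is an assumption not specified in this section.

```python
INTERCEPT = 693.615

# Unstandardized slopes (B) for the five factors, as reported.
COEFFICIENTS = {
    "general_school_resources": -0.225,
    "discipline_and_safety": -9.947,
    "parental_support": -40.998,
    "principal_experience_and_education": -48.932,
    "library_and_instruction_resources": -9.545,
}

def predict_achievement(scores):
    """Predicted mathematics achievement for a dict of factor scores."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * scores[name] for name in COEFFICIENTS
    )

# Hypothetical school with a score of 1 on every factor.
baseline = {name: 1.0 for name in COEFFICIENTS}
# Same school, with parental support one unit higher.
shifted = dict(baseline, parental_support=2.0)

change = predict_achievement(shifted) - predict_achievement(baseline)
print(round(change, 3))
```

Holding the other factors fixed, the difference between the two predictions equals the slope for the factor that was changed.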
In addition, general school resources is not statistically significant (p = 0.880 > 0.01), whereas discipline and safety, parental support, principal experience and education, and library and instruction resources are each statistically significant (p = 0.000 < 0.01) and therefore have a significant impact on achievement. Meanwhile, the variance inflation factors for all five predictors, from general school resources to library and instruction resources, are less than five. This indicates no multicollinearity among the explanatory variables, so the assumption of no multicollinearity is satisfied.