4. Artificial Neural Networks
Artificial neural networks (ANNs) have been helpful in contexts where the relationships between the variables are initially unclear. These domains include medicine (
Burden & Winkler, 2008), the stock market (
Sun et al., 2014), engineering (
Feng et al., 2019), and psychology (
Kayri, 2016). In the context of learning analytics, machine learning has been used to model data related to university students (see
Guillén Perales et al. (
2024) and
Luo et al. (
2024) for some recent examples). To date, however, its application to modelling the learning development of primary and secondary students has been scarce.
One recent exception utilised machine learning to predict the math achievement scores of German students (
Lavelle-Hill et al., 2024). Using a standardised mathematics test covering arithmetic, algebra, and geometry appropriate to each school year, the Rasch-scaled scores enabled direct comparisons of achievement across six grade levels (Years 5–9). Lavelle-Hill et al. found that student achievement scores from the immediately preceding year were the best predictors of the following year’s achievement scores. Over longer periods of time, predictor variables such as non-verbal intelligence, motivation and emotion, learning strategies, student- and teacher-rated classroom contexts, family contexts, and demographic contexts became increasingly important. Although Lavelle-Hill et al. pointed out that machine learning techniques in the social sciences have been criticised for their focus on prediction rather than explanation, an a priori model for testing, together with estimates of variable importance, helps to overcome these concerns.
Aghaei et al. (
2023) developed a hybrid approach to predict the academic satisfaction of university students, and
Herzog’s (
2006) comparison between ANNs and logistic regression predicted university students’ time to complete their courses. Aghaei et al. employed SEM to construct a hypothesised model that explained the relationships between 10 variables and students’ course satisfaction, removing any variables that did not fit the model. The second step developed an ANN model with the remaining variables and showed that their model could explain 99% of the variance in the training data, thereby outperforming the SEM model by 30%.
Earlier,
Herzog (
2006) attributed ANN’s stronger predictive power to its ability to handle input variables that display a high degree of collinearity. Indeed, many traditional statistical approaches, including logistic regression and SEM, assume little or no collinearity among the independent variables, and when this assumption is violated, model performance and predictive accuracy are reduced.
To date, the development of ANNs includes architectures with two or more hidden layers and with increased choices for both activation function and training function. The success of ANN modelling is based on establishing the optimal architecture of the ANN and optimising both activation and learning functions. In this section we outline in general terms the basic features of an ANN, researcher considerations, and areas of concern. For more information, please refer to
Kim (
2017),
Feng et al. (
2019), and
S. N. Phillipson et al. (
2023).
The primary unit of an ANN is the ‘neuron’ (
Figure 1), which is represented by a circle and is linked to other neurons in the adjacent layer. Usually, the input variables are transferred one set at a time to neurons in the hidden layer, with information between neurons being weighted (
w) by the model. The input to each neuron in the hidden layer is the weighted sum of the information from the preceding neurons.
Each neuron in the hidden layer then transforms the summed information using an activation function (AF) chosen by the researcher and an associated bias (b) assigned by the model. This continues throughout the hidden layer until the information is transferred to the output layer. Here, the output value is compared to the dependent variable associated with each dataset. Differences between the values in the output layer and the dependent variable are compared and provide the basis for adjustments to both the weights and biases for each neuron. These adjustments are made through training functions, which are again chosen by the researcher. The process continues iteratively until the network learns and estimates the appropriate relationships between the independent and dependent variables.
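The flow described above can be sketched in a few lines of Python. The layer sizes, random weights, and choice of tan-sigmoid below are purely illustrative and do not correspond to any model in this study.

```python
import numpy as np

def tansig(x):
    # Hyperbolic tangent ('tan-sigmoid') activation; output lies in (-1, 1)
    return np.tanh(x)

def forward(inputs, weights, biases):
    """One forward pass through a fully connected feedforward network.

    weights[k] has shape (n_out, n_in); biases[k] has shape (n_out,).
    Hidden layers use tan-sigmoid; the output layer is linear.
    """
    a = inputs
    for W, b in zip(weights[:-1], biases[:-1]):
        a = tansig(W @ a + b)          # weighted sum + bias, then activation
    W, b = weights[-1], biases[-1]
    return W @ a + b                   # linear output layer

# A toy 3-(2)-1 network: 3 inputs, one hidden layer of 2 neurons, 1 output
rng = np.random.default_rng(0)
weights = [rng.standard_normal((2, 3)), rng.standard_normal((1, 2))]
biases = [rng.standard_normal(2), rng.standard_normal(1)]
y = forward(np.array([0.5, -0.2, 0.1]), weights, biases)
print(y.shape)  # (1,)
```

During training, this output value would be compared against the dependent variable, and the differences would drive the adjustments to the weights and biases.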
The modelling is based on a randomly chosen portion of the complete dataset, termed the ‘training’ dataset. The remaining portion is used to test the model, hence the term ‘testing’ dataset, noting that with each iteration there is a re-selection of data for each portion. Comparisons between the output values and the dependent variables for the training and testing datasets are made by estimating Pearson’s correlation (R). Ideally, the Train-R and Test-R should be comparable in value and close to one, indicating that the hypothesised model explains nearly all of the variance in the data.
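The Train-R/Test-R comparison can be illustrated with synthetic data; the stand-in ‘predictions’ and the 70/30 split below are for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical dataset: 100 observations with a known relationship plus noise
x = rng.uniform(-1, 1, 100)
target = 0.8 * x + 0.1 * rng.standard_normal(100)

# Random 70/30 split into training and testing portions (re-drawn each run)
idx = rng.permutation(100)
train, test = idx[:70], idx[70:]

# Stand-in 'model' outputs; a trained ANN's predictions would replace this
pred = 0.8 * x

# Pearson's correlation between model output and the dependent variable
train_r = np.corrcoef(pred[train], target[train])[0, 1]
test_r = np.corrcoef(pred[test], target[test])[0, 1]
print(round(train_r, 2), round(test_r, 2))
```

Comparable, high values of Train-R and Test-R, as here, suggest the model generalises; a Train-R far above the Test-R would instead signal overfit.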
Researchers need to be alert to three issues: overfit, vanishing gradient, and computational load. Overfit occurs when the training function incorporates spurious patterns in the data and is evident when the Train-R is much larger than the Test-R. Overfit generally occurs when the ANN’s architecture is overly complex and contains too many hidden layers and hidden neurons.
Vanishing gradient (or the inability of the architecture to learn) refers to the failure of the backpropagation algorithm to adjust the weights and biases in the hidden layer closest to the input layer. Reducing the complexity of the ANN’s architecture also prevents vanishing gradient. Finally, computational load refers to the time required to fully train an ANN’s architecture. The computational load increases exponentially as the ANN contains more hidden neurons and layers because there are more weights and biases to be estimated (
Thomas et al., 2016).
The challenge for researchers is to find the optimal architecture for each dataset, which requires a number of network parameters to be specified by the researcher. Each of these is outlined in turn.
4.1. The Number of Neurons in the Input Layer
The number of neurons in the input layer is usually the number of input variables. For the AMG, this means that there would be 10 (or 11) neurons corresponding to the 10 (or 11) capitals plus any other variables related to demography and/or year level, age, and so on. As previously mentioned, we suggest the use of Rasch scores where possible, year level (i.e., nominal data) coded as interval-level data, and one-hot coding for gender (
S. N. Phillipson et al., 2023).
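The suggested input coding can be sketched as follows; the records and the helper function are hypothetical, with year level kept as a single interval-level input and gender one-hot coded into two indicator neurons.

```python
import numpy as np

# Hypothetical raw records: (year_level, gender); Rasch-scaled capital
# scores would be appended as further columns in practice.
records = [(5, "male"), (8, "female"), (11, "female")]

def encode(year_level, gender):
    # Year level coded as interval-level data; gender one-hot coded
    # into two indicator variables (male, female).
    male = 1.0 if gender == "male" else 0.0
    female = 1.0 if gender == "female" else 0.0
    return [float(year_level), male, female]

encoded = np.array([encode(y, g) for y, g in records])
print(encoded.shape)  # (3, 3)
```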
4.2. The Number of Hidden Layers
The guideline is that fewer hidden layers are better (
Hornik et al., 1989). In some instances, however, models with two hidden layers outperform one-hidden-layer models (
Thomas et al., 2016). Accordingly, ANN models with one and two hidden layers should be investigated.
4.3. The Number of Neurons in Each Hidden Layer
The general guideline is that the number of neurons in each hidden layer should be fewer than the preceding layer (
Huang, 2003;
Stathakis, 2009). In practice,
Thomas et al. (
2016) provided a ‘short-cut trajectory’ to identify the optimal number of neurons within each hidden layer for ANNs with two hidden layers.
4.4. The Number of Neurons in the Output Layer
The number of neurons in the output layer is the number of dependent variables. For the current research, the output layer contains one neuron, reflecting one of the six possible achievement scores. Thus, six ANNs are sought, corresponding to the three school-assessed academic achievement scores (English, mathematics, and science) and the three PAT scores (mathematics, reading, and vocabulary).
In research involving one or more hidden layers, the architecture of a neural network includes q inputs, j hidden layers, and s outputs, where, for example, 13-(5-4-3)-1 represents 13 input variables, three hidden layers (with five, four, and three neurons, respectively), and one output neuron.
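The notation also makes plain why computational load grows with architecture size: each additional layer or neuron adds weights and biases to be estimated. A small illustrative helper:

```python
def parameter_count(layers):
    """Count the weights and biases in a fully connected feedforward network.

    `layers` lists the neuron counts from input to output, e.g.
    [13, 5, 4, 3, 1] for the architecture 13-(5-4-3)-1.
    """
    # One weight per connection between adjacent layers
    weights = sum(n_in * n_out for n_in, n_out in zip(layers, layers[1:]))
    # One bias per neuron in every non-input layer
    biases = sum(layers[1:])
    return weights + biases

print(parameter_count([13, 5, 4, 3, 1]))  # 65+20+12+3 weights + 13 biases = 113
```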
4.5. Activation Function
Linear activation functions are ineffective in the hidden layers but are commonly selected for the output layer (
Kim, 2017). In contrast, nonlinear activation functions such as sigmoid, log-sigmoid, and tan-sigmoid are better suited for the hidden layers (
Beale et al., 2024;
Hagan et al., 2014). Nonlinear activation functions are also consistent with the theoretical basis of the AMG.
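For reference, the two sigmoidal activation functions can be written directly; the sample inputs below are arbitrary.

```python
import numpy as np

def logsig(x):
    # Log-sigmoid: squashes any input into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tansig(x):
    # Tan-sigmoid (hyperbolic tangent): squashes any input into (-1, 1)
    return np.tanh(x)

x = np.array([-2.0, 0.0, 2.0])
print(np.round(logsig(x), 3))   # ≈ [0.119, 0.5, 0.881]
print(np.round(tansig(x), 3))   # ≈ [-0.964, 0.0, 0.964]
```

Both functions are smooth and nonlinear, which is what allows the hidden layers to approximate the nonlinear relationships posited by the AMG.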
4.6. Training Function
The ANN ‘learns’ by adjusting the weights and biases for each iteration and comparing the new outputs with the data. The adjustments begin with the output layer and ‘propagate’ toward the first layer, hence the term backpropagation. This continues until the adjustments no longer improve the match between the output and the data, and the model converges. However, backpropagation methods increase the risk of overfit.
To reduce the potential of overfit, researchers can choose between processes that help to select (or regulate) the most probable model parameters (
Okut, 2016). The three most common regularisation techniques in ANNs are early stopping, Levenberg–Marquardt (LM), and Bayesian regularisation (BR). In early stopping, the data are split into training, validation, and testing subsets. The training data are used to construct a hypothesised model with a user-defined number of hidden layers, hidden neurons, and activation function. The validation data test the hypothesised model and select the parameter values that generate the smallest average errors over multiple runs. Finally, the testing data assess the hypothesised model’s predictability for “unseen” data. However, splitting the data into three subsets relies on having sufficient high-quality data.
The Levenberg–Marquardt technique combines two distinct algorithms for optimising nonlinear functions (
Yu & Wilamowski, 2018). Specifically, it integrates the gradient-descent algorithm and the Gauss–Newton algorithm (
Okut, 2016), enabling it to converge efficiently on complex nonlinear error surfaces. While the Levenberg–Marquardt technique has lower predictability than Bayesian regularisation, it converges faster than BR, and its results are often compared with BR in similar ANN research (
Iftikhar et al., 2021;
Kayri, 2016).
Bayesian regularised artificial neural networks (BRANNs) use Bayesian statistics to prevent overfit (
Kayri, 2016;
Okut, 2016;
Zhao et al., 2011). Bayesian regularisation also streamlines the ANN training cycle by stopping the model error reduction calculations as soon as the squared error reaches its minimum value, rather than allowing the training cycle to loop through the entire iterative and time-consuming process (
Burden & Winkler, 2008;
Sun et al., 2014).
Burden and Winkler (
2008) also provided explanations of the mathematical framework behind BRANNs.
The choice between the LM and BR training algorithms requires optimisation. Both choices are investigated in the current study.
4.7. Proportions of Training and Test Data
As mentioned previously, the training and testing data are used to train and then test the model, respectively. The training and testing data are randomly selected by the software used to generate the neural network in each iteration or run. The default proportion is 70% training and 30% testing data, but this can be adjusted if deemed necessary. Increasing the proportion of training data could increase the risk of overfit, while increasing the proportion of testing data could increase the risk of underfit.
4.8. Data Quality
In addition to having an optimal structure, the success of the ANN is dependent on the quality of the training data. If the training data suffer from sampling bias or non-random missing data, the generalisability and predictability of the machine learning model are reduced. Furthermore, sampling and measurement errors increase noise in the data, thereby reducing the capacity of the ANN to accurately estimate the mathematical relationship between variables and, as a consequence, its potential to make predictions.
To illustrate a practical application of an ANN in engineering,
Feng et al. (
2019) reported that for a dataset comprising 21 variables and 575 observations, the optimal architecture was 21-(6-5-4-3)-1, representing one input layer with 21 neurons (corresponding to 21 input variables); four hidden layers with 6, 5, 4, and 3 hidden neurons, respectively; and one output layer consisting of 1 neuron. The activation function was a hyperbolic tangent function for both the input and hidden layers, as well as a linear function for the output layer. In order to reduce the potential for overfit, the learning function was Bayesian regularisation. For this model, the values of Train-R and Test-R were 0.99 and 0.93, respectively. Importantly,
Feng et al. (
2019) used this architecture, including the associated weights and biases, to accurately predict defects in stainless steel structures based on their chemical composition and physical properties.
5. Current Study
Students, parents, and staff from one large co-educational Australian school participated in the research. The school enrols over 3600 students from Reception to Year 12 and is organised into three sub-schools: Junior School (Reception to Year 6), Middle School (Years 7–10), and Senior School (Years 11–12). At the time of data collection, the school community was adjusting to a return to school after a period of learning at home because of the COVID-19 pandemic. Further details regarding the school, together with the significant challenges facing these students and their parents when learning at home, are reported in
S. N. Phillipson et al. (
2025a,
2025b).
Since this research appears to be the first to test the utility of ANNs in the context of the AMG, the first question focuses on whether an ANN can model the interactions between the educational and learning capitals and academic achievement. Accordingly, we identify the optimal ANN for three school-assessed achievement scores (mathematics, English, and science) and three standardised tests (mathematics, reading, and vocabulary). Given the history of AMG research involving the use of SEM, we compare the ANN models to SEM models created from the same dataset in order to support the interpretability of the ANN.
Finally, we evaluate the potential of the optimal ANN to predict future academic achievement scores based on changes to capitals that could be improved by classroom teachers. This aspect of our study is based on the premise that not everything schools and teachers do to support students has the same impact (
Hattie, 2008,
2023). For example, Hattie’s meta-analyses showed that the most powerful effects on student learning include optimising the classroom’s climate, minimising the influence of disruptive students, and the positive influence of peers.
In this aspect of our study, we focus on the school-based mathematics achievement scores of Year 5 and Year 8 students and their learning capitals rather than educational capitals. The educational and learning capital profiles of two low-performing students in Year 5 are compared with the profiles of the average and highest-performing Year 8 students. We identify three types of variables: the focus variable, supplementary variables, and unchanged variables, where the focus variable refers to the capital that has a low Rasch score, which could improve through direct teacher intervention, while the supplementary variables are other capitals that could also increase due to improvements to the student’s learning environment. Reflecting the terminology, unchanged variables remain the same from Year 5 to Year 8, including the other six capitals and gender.
In terms of teacher intervention, the quality of teachers’ relationships with students, their approaches to teaching, and their high expectations, amongst other factors, can significantly impact student learning. Of these ‘effects’, some are under the direct control of teachers, whilst others are either indirect or minimal. For example, a teacher may not be able to control the specific content of the curriculum but can control their method of teaching, the expectations placed on students, and the strength of their partnership with parents, and can positively influence the aspirations of students.
6. Materials and Methods
We outline the key steps in the workflow in turn.
6.1. Participants
Following institutional, parental, and student approvals, students across Years 5 to 12 participated as fully informed and consenting volunteers. With support from the school staff, students (n = 2338) were assigned an individual code to allow student responses to remain anonymous and to be matched with their school grades and standardised test scores.
6.2. Measures
The independent variables include measures of the availability of 11 educational and learning capitals, gender, and student year level. The dependent variables include three school-assessed academic achievement scores and test scores in three standardised tests.
6.3. Capitals
Students completed a modified version of the Questionnaire of Educational and Learning Capital (QELC) (
S. Phillipson et al., 2018;
S. N. Phillipson et al., 2017). For this research, a small sample of students across each of the year levels trialled the survey items, resulting in a small number of items being reworded to ensure that all students understood the context of the items. The QELC measures students’ perceptions of resource availability and consists of 55 items, with between four and six items for each of the 11 capitals.
The details of the 11 capitals are outlined in
Table 1. In the current study, students indicated the extent to which they agreed with the items on a four-point Likert scale, where 1 = completely disagree, 2 = disagree, 3 = agree, and 4 = completely agree. Most items were written in the positive, meaning that responses of 3 and 4 corresponded with increased levels of the capital in their environment. A small number of items were stated in the negative, and accordingly, these responses were reverse coded using Stata (Version 16) before further analysis.
Thus, the responses to the QELC provided 11 predictor variables, including six educational capitals (economic, cultural, social, infrastructural, didactic, and aspirations) and five learning capitals (LCs) (organismic, actional, telic, episodic, and attentional). In addition, the instrument asked students to provide their age, gender, and year level. Responses to these items were used to check that the responses were correctly matched, while gender and year level provided two additional predictor variables. The order of all items was randomised for each student.
6.4. School-Assessed Achievement Scores
The two groups of outcome variables included school-assessed achievement scores in mathematics (SchoolMath), English (SchoolEng), and science (SchoolSci), as well as scores in three standardised tests in mathematics (PATMath), reading (PATRead), and vocabulary (PATVocab). In order to ensure anonymity, the school supplied all outcome variables using the de-identified codes.
In terms of the school-assessed achievement, the scores were aggregated from assessment tasks completed by students during the school semester. Although the methods used to arrive at these scores were consistent at each year level, they varied across the different year levels. For the reasons outlined earlier, the outcomes at one year level are not directly comparable with those at another.
All students from Years 5–10 reported achievement scores in mathematics, English, and science. For a small number of Years 9 and 10 students, the school offered additional ‘Foundation’- and ‘Extension’-level subjects; however, these scores were not included as outcome variables. Finally, students in Year 11 could choose from biology, chemistry, and physics as their science subjects, opting for none, one, two, or all three. For students undertaking more than one subject, their science achievement score was calculated as the arithmetic average.
School-assessed achievement scores were recorded as one of 15 letter grades, ranging from E− to A+. Using advice from the school, the letter grades corresponded to a range of values where, for example, E− and A+ corresponded to an achievement score between 0–16% and 95–100%, respectively. Accordingly, we converted each of the 15 grades to mid-points for each range, where, for example, E− = 13% and A+ = 97%. Of course, this means that school achievement scores were restricted to one of 15 discrete ‘numerical’ values and, strictly speaking, could not be considered interval-level data.
6.5. Standardised Achievement Scores
Approximately one month prior to completing the QELC, students from Years 5 to 10 at the school completed the Progressive Achievement Test (PAT) in mathematics (PATMath), reading (PATRead), and vocabulary (PATVocab). PAT is a large-scale standardised test designed to provide additional information for schools to support student learning (
Australian Council for Educational Research (ACER), 2025). Consistent with other large-scale international tests such as PISA and TIMSS, PAT scores are Rasch-standardised, allowing for direct comparisons of student performance across different year levels. For example, a student’s PATMath score is a measure of their current levels of skills and knowledge in number, algebra, geometry, measurement, statistics, and probability. The tests are adaptive in the sense that the difficulty of items changes according to the responses by students. Thus, students in junior years are expected to achieve lower scores than students in senior years.
6.6. Procedures
Students responded to the instrument during the final two weeks of the school year, corresponding to the completion of the tasks that contributed to school-assessed achievement scores. Together with details of the research project, the instrument was formatted for Qualtrics, an online survey platform, and sent to students and their parents using the email addresses supplied by the school. The items were randomised on the survey platform using the appropriate Qualtrics function to avoid possible order effects. With the permission of their parents, students responded to the instrument over a two-week period, with their responses automatically collated onto a spreadsheet using only their de-identified codes.
It is important to note that performance scores were not measured at the same time as capital availability. In particular, PAT scores were collected one month earlier, while school-assessed achievement scores were aggregated from assessment tasks throughout the semester. While it is unusual to attribute academic performance scores to educational and learning capitals measured at different timepoints within the same semester, our methods are consistent with other actiotope research and are based on the assumption that latent constructs are relatively stable.
The school supplied each student’s final school-achievement scores and PAT scores using de-identified codes generated by the researchers. The matching of student achievement scores with their responses to the instrument was enabled using Stata (Version 16), a programming tool used to clean, merge, and analyse data. The accuracy of the merge was checked manually.
7. Data Analysis, Modelling, and Prediction
7.1. Initial Data Analysis
The data were analysed in four broad ways. The first included descriptive statistics of the input and output variables from the responses to the project invitation. Little’s missing completely at random (MCAR) test (
Tabachnick & Fidell, 2019) was conducted on the 55 items reflecting the 11 capitals prior to Rasch standardisation to ensure that any missing values were not correlated with any variables. Although
Tabachnick and Fidell (
2019) recommended a benchmark of 5% of data missing in a random pattern (
Waterbury, 2019), the extent and type of missing data affect the fit statistics of Rasch models and so should be kept to a minimum (
Bond et al., 2020).
After MCAR, the responses to the QELC were then Rasch-standardised using Winsteps (Version 4.2.2) to check that each of the 11 subscales was able to measure variability in person ‘ability’ on each of the 11 capitals. Using Rasch modelling for polytomous data, overall fit statistics for persons, items, and categories are reported, together with corresponding estimates of reliability and separation. Based on the criteria outlined elsewhere (
Bond et al., 2020;
Linacre, 2022,
2023,
2024;
S. N. Phillipson et al., 2017;
S. Phillipson et al., 2018), misfitting persons and items were removed as necessary. Given that each capital was estimated using between four and six items, a cautious approach was adopted to avoid the removal of items where possible. In total, only one item was removed from economic capital, and two items were removed from aspirations capital.
7.2. Modelling
For both SEM and ANNs, a consideration of sample size is important. For SEM, a randomised sample size of 200 with normally distributed and non-missing data is recommended (
Kline, 2016). Each violation of this requirement increases the sample size required for statistical power and precision. For the ANN, a complete dataset of around 500 observations is considered small (
Feng et al., 2019).
7.3. Building Datasets
For all datasets used in the modelling, academic scores more than three standard deviations from the mean were considered outliers and removed. The distributions of all PAT achievement scores were also assessed for normality by overlaying a normal curve for each of the three PAT subjects. The visual examination of a normal distribution was not applied to the three school-grade datasets, as they were converted from letter grades and are not true interval-level data.
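The three-standard-deviation screen can be sketched as follows, with synthetic scores standing in for the real achievement data.

```python
import numpy as np

rng = np.random.default_rng(2)
# 500 plausible scores plus one extreme value standing in for a data error
scores = np.append(rng.normal(50, 10, 500), [999.0])

# Keep only observations within three standard deviations of the mean
z = (scores - scores.mean()) / scores.std()
kept = scores[np.abs(z) <= 3]
print(len(scores) - len(kept))  # number of outliers removed
```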
In each dataset, any observation without matched Rasch-standardised survey responses and academic achievement scores was removed from subsequent analysis. As there is a unique dataset for each academic variable, a student removed from one dataset could still remain in the other datasets if the corresponding data in those academic variables were complete. Although the observations differed across the datasets for the six measures of academic achievement, this procedure ensured that there were sufficient data to build SEM and ANN models for all six estimates of academic achievement.
The complete dataset comprising input (11 capitals, gender, and year level) and output variables (school-achievement scores and PAT scores) was modelled using SEM and ANNs. Thus, six separate SEM models and six (optimal) ANN models were created.
7.4. Structural Equation Modelling
The software SPSS AMOS (Version 26) was used to build SEM. Reflecting convention, a measurement model was constructed for each latent construct, followed by a full structural model showing the hypothesised relationships between all observed variables, latent constructs, and one measure of academic achievement scores. The variables gender and year level were evaluated for inclusion in the SEM structural models but were not included in the final SEM models because their inclusion had minimal improvements to model performance while resulting in a less parsimonious model with poorer model fit.
The input data for SEM included Rasch-standardised capital scores for each of the 11 capitals, gender, and year level. Gender was encoded as 1 = male and 2 = female, and year level as numerical values ranging from 5 to 12, corresponding to the eight possible year levels.
For each latent construct, the strength of the relationship between the latent and observed variables is indicated by the factor loadings, also referred to as validity coefficients or path coefficients. SPSS AMOS reports these factor loadings in both unstandardised and standardised forms, where the standardised factor loadings range between −1 and 1. For each observed variable, the standardised factor loadings are interpreted in the same way as regression coefficients, where the squared standardised factor loading explains the extent of variance (R²) of the latent variable (Collier, 2020).
Observed variables with a standardised factor loading greater than 0.7 are appropriate for the latent construct as they explain at least 50% of the variance in the corresponding latent construct (
Collier, 2020). The remaining unexplained variance of the latent variable is attributed to the measurement error of the observed variable and can be calculated as 1 − R². For each observed variable, the measurement error is inversely proportional to its standardised factor loading. Therefore, constructs with high validity have observed variables with high factor loadings and low measurement errors, as the observed variables accurately measure the latent construct as hypothesised (
Byrne, 2016).
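The arithmetic linking a standardised factor loading to explained variance and measurement error is straightforward; the loading of 0.8 below is illustrative only.

```python
# Variance explained and measurement error implied by a standardised loading
loading = 0.8
r_squared = loading ** 2    # variance explained by the latent construct
error = 1 - r_squared       # residual (measurement error) variance
print(round(r_squared, 2), round(error, 2))  # 0.64 0.36
```

A loading above 0.7 thus corresponds to roughly half or more of the variance being explained, consistent with the criterion stated above.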
The reliability of observed variables for each construct is assessed through composite reliability (
Collier, 2020) rather than Cronbach’s alpha because the latter is sensitive to a large number of observed variables and assumes an equal contribution from the observed variables of the same construct. The model fit of the hypothesised measurement and structural models is evaluated through the chi-squared test (χ²), the root mean square error of approximation (RMSEA), the comparative fit index (CFI), the relative fit index (RFI), and the standardised root mean squared residual (SRMR) (
Byrne, 2016;
Collier, 2020).
7.5. Artificial Neural Networks
In the current study, six model parameters are optimised for each measure of academic achievement, resulting in six distinct ANN models, reflecting the relationship between 11 educational and learning capitals, student gender, and year level on SchoolMath, SchoolEng, SchoolSci, PATMath, PATVocab, and PATRead. The fit of each ANN model with the data is determined by comparing the values of Train-R with Test-R.
The software programme MATLAB (Version R2023b) was used to generate and test the various configurations of ANNs. As required by MATLAB, the datasets were separated into an input (13 × n) and an output (or target) (1 × n) matrix, where n equals the number of observations in the dataset. The MATLAB output included Train-R values, Test-R values, weights, and biases for each iteration. The outputs were processed with Stata to calculate the mean, SD, and 95% confidence intervals.
To prevent the oversaturation of the activation functions and to facilitate the training function, both the input and target data are normalised to a range between −1 and 1 (
Beale et al., 2024;
Hagan et al., 2014). The input variable, gender, was one-hot encoded by creating two new variables, namely male and female. Thus, a male student is encoded as 1 and 0 for the male and female variables, respectively. Although one-hot encoding is a common procedure in ANN modelling (
Beale et al., 2024), its impact on the modelling may be negligible. Finally, year level was encoded as numerical values ranging from 5 to 12, despite being nominal-level data. Again, its impact on the modelling needs to be established.
To simplify the search for optimal models, the number of hidden layers is restricted to one or two. Prior studies showed that feedforward ANNs with one hidden layer are universal approximators for almost all nonlinear relationships between inputs and outputs (
Hagan et al., 2014;
Hornik et al., 1989;
Sonoda & Murata, 2017). Others have pointed out that the inclusion of an extra hidden layer could outperform ANNs with only one hidden layer (
Feng et al., 2019;
Thomas et al., 2016). Deep neural networks with more than two hidden layers are not considered due to the incremental difficulty in the training process for each additional hidden layer (
Feng et al., 2019).
Our research used two separate processes to optimise the six model parameters for ANNs with one or two hidden layers. For models with one hidden layer, four steps were employed to evaluate the interaction effects among the other five model parameters. Each of the four steps tested 20 or 40 unique ANN configurations, and each configuration was run 100 times to obtain the mean Train-R and mean Test-R values. From these four steps, the optimal model was selected as the parsimonious model with the highest mean Test-R value (
Bentler & Mooijaart, 1989).
For models with two hidden layers, the optimisation process evaluates the 13 configurations outlined in
Thomas et al. (
2016). Using these 13 configurations streamlines the optimisation process as
Thomas et al. (
2016) found that these configurations have the highest probability of being the optimal topology for ANNs with two hidden layers.
A four-step process was used to establish the optimal ANN:
Step 1: The interaction effect of the number of hidden neurons and the activation function for ANNs with one hidden layer was tested using 40 unique neural networks for each academic variable. These models had 1 to 20 hidden neurons and either tan-sigmoid or log-sigmoid as the activation function in the hidden layer. The remaining parameters were fixed: the training function was Bayesian regularisation, and the division of Train and Test data was 70% and 30%, respectively.
Step 2: The effects of training functions, activation functions, and the number of hidden neurons on ANNs with one hidden layer were tested using 20 unique neural networks for each academic variable. These models had a tan-sigmoid or log-sigmoid activation function in the hidden layer and Bayesian regularisation or the Levenberg–Marquardt method as the training function. The number of hidden neurons tested is determined by the results from Step 1 and includes five variations.
For each academic variable, the number of hidden neurons that produced the highest mean Test-R value in Step 1 and four values in close proximity were tested. For example, if the results from Step 1 indicated that seven is the optimal number of hidden neurons, then five to nine hidden neurons were tested in Step 2. Narrowing the search from 20 possible values to 5 reduced the training time while ensuring that the search range was wide enough to include the optimal number of hidden neurons. In these models, the proportion of Train and Test data was fixed at 70% and 30%, respectively.
Step 3: The effects of one-hot encoding the gender variable and of excluding the demographic variables on ANNs with one hidden layer were tested with both tan-sigmoid and log-sigmoid activation functions, resulting in 20 unique neural networks for each academic variable. The training function was fixed as Bayesian regularisation, and the division of Train and Test data was fixed at 70% and 30%, respectively.
Step 4: The effects of different Train–Test data splits on neural networks with one hidden layer were tested across five variations in the number of hidden neurons. The four splits tested were 60–40%, 70–30%, 75–25%, and 80–20%. The activation function was set as tan-sigmoid, and the training function was fixed as Bayesian regularisation. We report only the results from Step 4.
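The selection logic running through these steps can be sketched as a grid search (a Python sketch in which `train_and_score` is an invented stand-in for one MATLAB training run; its toy response surface, peaking at seven hidden neurons with tan-sigmoid, is purely illustrative):

```python
import itertools
import random
import statistics

def train_and_score(n_hidden, activation, seed):
    """Invented stand-in for one MATLAB training run. The real study trains
    an ANN and reports the test-set correlation (Test-R)."""
    rng = random.Random(f"{n_hidden}-{activation}-{seed}")
    base = 0.5 - 0.01 * abs(n_hidden - 7) + (0.02 if activation == "tansig" else 0.0)
    return base + rng.gauss(0, 0.005)

def grid_search(neuron_range, activations, n_runs=100):
    """Mean Test-R over repeated runs for every configuration, keeping the
    parsimonious model (fewest neurons) among near-best configurations."""
    mean_test_r = {}
    for n, act in itertools.product(neuron_range, activations):
        scores = [train_and_score(n, act, seed) for seed in range(n_runs)]
        mean_test_r[(n, act)] = statistics.mean(scores)
    best = max(mean_test_r.values())
    near_best = [cfg for cfg, r in mean_test_r.items() if best - r <= 0.005]
    return min(near_best, key=lambda cfg: cfg[0]), mean_test_r

best_config, results = grid_search(range(1, 21), ["tansig", "logsig"])
```

Averaging each configuration over 100 runs, as in the study, damps the run-to-run noise so that the ranking of configurations is stable.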
7.6. Distribution Equivalence
For each of the six optimal ANN models, the distribution of input variables in the Train and Test datasets was checked for equivalence. As the allocation of data into Train and Test datasets for each run is random, the input variables in these datasets should share the same underlying distribution. In particular, 20 of the 100 datasets used in calculating Train-R and Test-R were randomly selected and checked for equivalence using a two-sample Kolmogorov–Smirnov (K-S) test.
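The equivalence check can be sketched with a two-sample K-S test (a stdlib Python sketch with simulated data; the constant 1.358 is the standard asymptotic critical-value coefficient for α = 0.05):

```python
import bisect
import math
import random

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest vertical
    distance between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    points = sorted(set(a) | set(b))
    return max(abs(bisect.bisect_right(a, t) / len(a)
                   - bisect.bisect_right(b, t) / len(b)) for t in points)

def ks_critical_value(n, m, coefficient=1.358):
    """Asymptotic two-sample critical value; 1.358 corresponds to alpha = 0.05."""
    return coefficient * math.sqrt((n + m) / (n * m))

# Simulated 70%/30% Train and Test splits of one input variable.
rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(500)]
test = [rng.gauss(0.0, 1.0) for _ in range(200)]

d = ks_statistic(train, test)
same_distribution = d < ks_critical_value(len(train), len(test))

# A clearly different distribution yields a much larger statistic.
d_shifted = ks_statistic(train, [x + 2.0 for x in test])
```

In the study, this comparison was applied to each input variable in 20 randomly selected Train/Test splits.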
7.7. Variable Importance
Olden et al.’s (
2004) connection weight approach was used to assess variable importance for each optimal neural network model. The approach calculates the net connection weight for each input neuron and ranks these weights by size using the following formula:
c(X) = Σ_A [w(X→A) × w(A→Y)]
where
X is the input neuron,
Y is the output neuron,
A is any given hidden neuron in the ANN, and
w denotes a connection weight.
For each input neuron, all the input-hidden and hidden-output connection weights between itself and the output neuron(s) are multiplied and added together, enabling a net connection weight for each input variable to be calculated. A Stata programme was written to access the weights from the MATLAB outputs and to calculate these net connection weights. Finally, the input neuron with the largest connection weight is ranked “1” and is deemed the most important predictor in the model.
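A minimal Python sketch of this calculation (the weights are invented; the study computed these values in Stata from the MATLAB outputs, and ranking by absolute magnitude is assumed here):

```python
def olden_connection_weights(w_input_hidden, w_hidden_output):
    """Olden et al.'s (2004) connection weight approach: for each input
    neuron, sum over all hidden neurons the product of the input->hidden
    weight and the corresponding hidden->output weight."""
    return [sum(w_ih * w_ho for w_ih, w_ho in zip(row, w_hidden_output))
            for row in w_input_hidden]

def rank_by_magnitude(weights):
    """Rank inputs by the size of their net connection weight (1 = most
    important)."""
    order = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))
    ranks = [0] * len(weights)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks

# Invented weights for a 3-input, 2-hidden-neuron, 1-output network.
w_ih = [[0.8, -0.2],   # input 1 -> hidden 1, hidden 2
        [0.1,  0.4],   # input 2
        [-0.5, 0.3]]   # input 3
w_ho = [0.6, -0.9]     # hidden 1, hidden 2 -> output

cw = olden_connection_weights(w_ih, w_ho)  # net connection weight per input
ranks = rank_by_magnitude(cw)
```

Here input 1 contributes 0.8 × 0.6 + (−0.2) × (−0.9) = 0.66, the largest net weight, so it is ranked first.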
7.8. Predictive Modelling
Two groups of prediction models are implemented, representing a conservative and an optimistic ‘intervention’ taken by teachers and schools. The prediction spans from Year 5 to Year 8 to illustrate the possible outcomes on SchoolMath scores of an ‘intervention’ over three years of schooling that focussed on either the physical and mental health of students (organismic capital) or their aspirations (aspirations capital). Because a focus on these two capitals is likely to impact other capitals, most notably telic, episodic, and attentional capitals, the values of these supplementary capitals are also changed. In each scenario, the focus variable and supplementary variables of the lowest-performing Year 5 students are replaced with the mean Rasch-standardised capital scores of average-performing (conservative approach) or highest-performing (optimistic approach) Year 8 students, respectively. Based on the optimal ANN model, the predictions are run 100 times, and mean scores (and SDs) are reported. The four prediction models are summarised in
Table 3.
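The scenario mechanics can be sketched as follows (a Python sketch in which the predictor, the capital values, and the benchmark values are all invented stand-ins; the study re-ran its optimal MATLAB ANN 100 times per scenario):

```python
import random
import statistics

def predict_schoolmath(capitals, seed):
    """Invented stand-in for the trained ANN: a noisy weighted sum of the
    Rasch-scaled capitals mapped onto a percentage score."""
    rng = random.Random(seed)
    return 37 + 6 * statistics.mean(capitals.values()) + rng.gauss(0, 1)

def run_scenario(profile, replacements, n_runs=100):
    """Replace the focus and supplementary capitals, re-run the prediction
    n_runs times, and report the mean and SD of the predicted scores."""
    adjusted = {**profile, **replacements}
    scores = [predict_schoolmath(adjusted, seed) for seed in range(n_runs)]
    return statistics.mean(scores), statistics.stdev(scores)

# Invented Rasch logits for a low-performing Year 5 student and the
# Year 8 benchmark used as the 'conservative' replacement values.
student = {"organismic": -1.84, "telic": -0.9, "episodic": -0.5,
           "attentional": -1.1, "aspirations": -0.3}
year8_average = {"organismic": -0.02, "telic": 0.6, "episodic": 0.8,
                 "attentional": 0.3, "aspirations": 1.1}

before_mean, _ = run_scenario(student, {})
after_mean, after_sd = run_scenario(student, year8_average)
```

The design choice is that only the focus and supplementary capitals are overwritten; all other inputs are held at the student's observed values, isolating the effect of the hypothetical intervention.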
8. Results
8.1. Descriptive Statistics
A total of 778 responses from students were received (
Table 4), representing 33% of the possible 2338 participants. Of these, 417 (35%) and 361 (31%) are responses from female and male students, respectively.
The distribution of responses was skewed towards Years 5–9, with response rates ranging from 33% for Year 7 students to 74% for Year 5 students. Across the eight year levels, the response rates for female students were always higher than those for male students. The low response rate from Year 12 students is likely because these students had completed their secondary education and no longer accessed their survey links through their school emails.
Of the 778 students, 737 (95%) responded to all 55 items that operationalised the 11 educational and learning capitals. Missing value analysis using Little’s MCAR test showed that the percentage of missing data ranged between 2.3% and 3.9% for the 55 items, and that the missing data were randomly distributed and not associated with any one variable (χ² = 934.46, p = 0.717). Noting that Rasch modelling is able to compute Rasch scores (logits) and the associated fit statistics even in the presence of missing data, the dataset was then subjected to Rasch analysis.
8.2. Rasch Analysis
Taking into account any missing responses, Rasch measures for all 11 capitals were obtained for 749 (96%) of the 778 returned questionnaires. For the remaining 29 responses, Rasch measures were obtained for 6 to 10 capitals. Initial fit statistics identified several misfitting persons and items. Measures of rating scale effectiveness showed that the four categories were used by respondents and contributed to the effectiveness of the instrument. Accordingly, all responses and items were retained in order to compute Rasch measures for all 11 capitals.
For each capital, the Infit and Outfit MnSq (and SD) and standardised Z statistics are within the acceptable range of 0.6 < MnSq < 1.4 and −2.0 < Z < 2.0, respectively. Estimates of reliability are also acceptable, with values of person reliability ranging from 0.46 (telic) to 0.79 (aspirations). Estimates of item reliability are also acceptable, with values ranging from 0.98 to 1.00. Person and item separation indices range from 0.92 (telic) to 1.94 (aspirations) and from 6.16 (infrastructural) to 22.32 (attentional), respectively.
With the exception of Year 12 males, the final number of items used to estimate variability in each of the 11 capitals and the distribution of Rasch scores (Mean and SD) across all eight year levels are shown in
Table 5. For female students, the number of complete responses across each of the capitals ranges from 417 (social) to 411 (economic), and for male students, the number ranges from 361 (social) to 356 (telic). Thus, datasets comprising all 11 capitals can be obtained for 411 female students and 356 male students.
Given the focus of the current study, no attempt was made to estimate effect size differences across genders. Nevertheless, the general trend for both female and male students is that didactic capital is the highest-rated capital, with mean values of 3.68 and 3.48 logits, respectively. In contrast, attentional and organismic learning capital are generally rated lowest for female and male students, with values of 0.47 and −0.11, and 0.4 and 0.48, respectively (
Figure 2).
A close examination of the year-to-year differences for female students shows that the greatest changes in capitals occur in Years 10–12, with cultural capital falling from 3.02 in Year 9 to 2.31 in Year 10, only to rise again to 3.36 in Year 11. Of the remaining capitals, infrastructural and didactic capitals show reductions between Years 9–10, whereas actional and episodic capitals show increases from Years 10–11, reflecting the changes in cultural, infrastructural, and didactic capitals. For these students, the changes correspond with the transition from Middle School to Senior School.
The year-to-year differences for male students show similar patterns of change, albeit not to the same degree as female students. Increases in economic capital occur between Years 7–8 and again in Years 10–11. Across Years 8–9, these students report a reduction in this capital.
Of the remaining capitals, male students report increases in didactic, aspirations, telic, and attentional capitals in year levels close to the transition from Middle School to Senior School. However, male students report a reduction in actional and attentional capitals in the transition between Middle School and Senior School.
8.3. Achievement Scores
The number of students with school-based achievement scores that could be matched with the Rasch-standardised survey responses was n = 590 for mathematics, n = 676 for English, and n = 701 for science. For both mathematics and English, the scores ranged from 37% (D) to 97% (A+), with no scores corresponding to D− or less after removing three observations with outlier grades. Although the scores in science ranged from 31% (D−) to 97% after removing one observation with an outlier grade, only one student received the lowest score.
The distribution of scores across all three school scores was non-normal. For example, the percentage of students receiving a score of 55%, 73%, and 92% for SchoolMath, SchoolEng, and SchoolSci was approximately 27%, 22%, and 14%, respectively.
Progressive Achievement Test (PAT) scores in mathematics, vocabulary, and reading were made available by the school for students in Years 5–10. Matching these data with the responses to the instrument resulted in n = 470, n = 379, and n = 476 complete datasets for each PAT, respectively. Although complete, the matched datasets contained uneven distributions of PAT scores across the six year levels, with 69% of PAT mathematics scores and 65% of PAT vocabulary and PAT reading scores coming from students in Years 5 and 6. This uneven distribution of PAT scores violates the assumptions of SEM and contributes to the lack of SEM model performance, especially for PAT vocabulary and PAT reading. The uneven distribution did not affect ANN model performance, as ANNs are able to model both linear and nonlinear relationships. Although not directly tested, the scores in each dataset appear normally distributed when overlaid with a normal curve.
8.4. Structural Equation Models
The SEM and ANN modelling used complete datasets of n = 590 for SchoolMath, n = 676 for SchoolEng, n = 701 for SchoolSci, n = 470 for PATMath, n = 379 for PATVocab, and n = 476 for PATRead.
Confirmatory factor analysis was performed prior to constructing structural models. Although not shown, all analyses indicate good fit, with all p-values greater than 0.05, RMSEA less than 0.05, CFI greater than 0.95, RFI greater than 0.9, and SRMR less than 0.05. The Composite Reliabilities are greater than 0.7, indicating acceptable reliability of observed variables for each construct.
When setting up the SEM latent models, we found that including year level and gender had minimal improvements to model performance but resulted in a less parsimonious model with poorer model fit statistics. As a result, these demographic variables are not included in our SEM models for the six measures of academic achievement.
The absence of demographic variables is also observed in prior research that uses SEM to model talent development based on the AMG. For example,
Paz-Baruch (
2020) used an SEM model to evaluate the relationship between the educational capital, learning capital, general intelligence, and academic achievement of primary school students from eight co-educational schools but did not include any demographic variables.
The final SEM models for the six measures of academic achievement are presented below, beginning with school-assessed achievement scores. Consistent with previous research, the data show a high degree of collinearity between the educational and learning capitals across all six SEM models, ranging from 0.75 to 0.81 (
Figure 3 and
Figure 4). Although these values exceed critical levels (
Tabachnick & Fidell, 2019), their removal did not enhance the models. Hence, all six SEM models retained the covariance between these two constructs.
The SEM models showing the relationships between capitals and achievement scores are shown in
Figure 3 for school-assessed scores and
Figure 4 for PAT scores.
8.5. School-Assessed Mathematics Scores
The SEM describing the relationships between educational and learning capitals and SchoolMath is shown in
Figure 3a. The correlation matrix indicates that 65 of the 66 possible correlations between the capitals and SchoolMath scores are statistically significant (
p < 0.05), with values ranging from
R = 0.11 (aspirations and organismic) to
R = 0.63 (actional and episodic). The only exception is the correlation between cultural capital and SchoolMath (
R = 0.08).
The high collinearity between learning capitals and educational capitals for SchoolMath SEM (0.81) indicates the lack of independence between assessments of the educational and learning capitals within this group of participants. However, the modelling collapses if this covariance is removed.
The standardised estimate between the learning capitals and SchoolMath scores has a value of 0.70, compared to educational capitals and SchoolMath with a value of −0.37. Thus, 49% of the variability in SchoolMath scores can be explained by the learning capitals. In contrast, the negative standardised estimate between educational capitals and SchoolMath indicates that high evaluations of these resources are correlated with lower SchoolMath scores.
8.6. School-Assessed English Scores
The relationships between educational and learning capitals and SchoolEng are shown in
Figure 3b. The correlation matrix indicates that 65 of the 66 possible correlations between the capitals and SchoolEng scores are statistically significant (
p < 0.05 or
p < 0.01), with values ranging from
R = 0.09 (cultural and organismic) to
R = 0.64 (actional and episodic), including
R = 0.11 for the correlations between cultural, social, and didactic capitals and SchoolEng. The only exception is the statistically non-significant correlation between aspirations and organismic capitals (
R = 0.05).
Again, there is high collinearity between learning capitals and educational capitals (0.76). However, the modelling collapses if this covariance is removed. The standardised estimate between the learning capitals and SchoolEng scores has a value of 0.52, compared to educational capitals and SchoolEng with a value of −0.24. Thus, 27% of the variability in SchoolEng scores can be explained by the learning capitals. As for SchoolMath, the negative standardised estimate between educational capitals and SchoolEng indicates that high evaluations of these resources are correlated with lower SchoolEng scores.
8.7. School-Assessed Science Scores
The relationships between educational and learning capitals and SchoolSci are shown in
Figure 3c. The correlation matrix indicates that 65 of 66 possible correlations between the capitals and SchoolSci scores are statistically significant (
p < 0.05 or
p < 0.01), with values ranging from
R = 0.12 (economic, cultural, social and organismic capitals; and SchoolSci scores) to
R = 0.63 (actional and episodic). The only exception is the statistically non-significant correlation between aspirations and organismic capitals (
R = 0.05).
As with previous models, although there is high collinearity between learning capitals and educational capitals (0.78), the modelling collapses if this covariance is removed. The standardised estimate between the learning capitals and SchoolSci scores has a value of 0.54, compared to educational capitals and SchoolSci with a value of −0.24. Thus, 29% of the variability in SchoolSci scores can be explained by the learning capitals. As with other school-assessed scores, the negative standardised estimate between educational capitals and SchoolSci indicates that high evaluations of these resources are correlated with lower SchoolSci scores.
8.8. PAT Mathematics Scores
Confirmatory factor analysis was performed prior to constructing structural models. All analyses indicate good fit, with all p-values greater than 0.05, RMSEA less than 0.05, CFI greater than 0.95, RFI greater than 0.9, and SRMR less than 0.05. As for school-assessed scores, Composite Reliabilities are greater than 0.7, indicating acceptable reliability of observed variables for each construct.
The relationships between educational and learning capitals and PATMath are shown in
Figure 4a. The correlation matrix indicates that 57 of 66 possible correlations between the capitals and PATMath scores are statistically significant (
p < 0.05 or
p < 0.01), with values ranging from
R = 0.14 (cultural and organismic) to
R = 0.62 (actional and episodic). The nine statistically non-significant correlations are those between PATMath scores and the economic, cultural, social, infrastructural, didactic, aspirations, organismic, episodic, and attentional capitals.
As with previous models, although there is high collinearity between learning capitals and educational capitals (0.79), the modelling collapses if this covariance is removed. The standardised estimate between the learning capitals and PATMath scores has a value of 0.42, compared to educational capitals and PATMath with a value of −0.32. Thus, 18% of the variability in PATMaths scores can be explained by the learning capitals.
8.9. PAT Vocabulary Scores
The relationships between educational and learning capitals and PATVocab are shown in
Figure 4b. The correlation matrix indicates that 57 of 66 possible correlations between the capitals and PATVocab scores are statistically significant (
p < 0.05 or
p < 0.01), with values ranging from
R = 0.11 to
R = 0.61 (infrastructural and didactic). The nine statistically non-significant correlations include most of the educational and learning capitals with PATVocab. The only exceptions are actional (
R = 0.11,
p < 0.05) and telic (
R = 0.14,
p < 0.01) with PATVocab.
As with previous models, although there is high collinearity between learning capitals and educational capitals (0.79), the modelling collapses if this covariance is removed. The standardised estimate between the learning capitals and PATVocab scores has a value of 0.13, compared to educational capitals and PATVocab with a value of −0.07. Thus, only a negligible portion of the variance in PATVocab scores can be explained by this model.
8.10. PAT Reading Scores
The relationships between educational and learning capitals and PATRead are shown in
Figure 4c. The correlation matrix indicates that 59 of 66 possible correlations between the capitals and PATRead scores are significant (
p < 0.05 or
p < 0.01), with values ranging from
R = 0.16 to
R = 0.63. Again, the seven statistically non-significant correlations include most of the educational and learning capitals and PATRead. The only exceptions are aspirations (
R = 0.10,
p < 0.05), actional (
R = 0.20,
p < 0.01), and attentional (
R = 0.12,
p < 0.01).
As with previous models, although there is high collinearity between learning capitals and educational capitals (0.75), the modelling collapses if this covariance is removed. The standardised estimate between the learning capitals and PATRead scores has a value of 0.22, compared to educational capitals and PATRead with a value of −0.12. Thus, only a negligible portion of the variance in PATRead scores can be explained by this model.
8.11. Artificial Neural Networks
The optimal ANNs are organised into two groups, beginning with school-assessed scores followed by PAT scores. For each of the six ANN models, the input variables included the 11 capitals, gender (male and female), and year level (5–12), and the output (or target) variable was one of the six scores. To identify the optimal ANN for each of the six measures of academic achievement, the number of hidden layers ranged between one and two, with the number of neurons in the hidden layer varying between 1 and 20. Two activation functions (tan-sigmoid and log-sigmoid) and two training functions (Bayesian regularisation and Levenberg–Marquardt) were tested. Next, one-hot coding was tested for its capacity to encode gender, with year level coded as numbers. Finally, the division of training and testing data was tested, with splits of 60–40%, 70–30%, 75–25%, and 80–20%. For models with two hidden layers,
Thomas et al.’s (
2016) search trajectory was utilised in order to streamline the search.
To test the five parameters for ANNs with one hidden layer, we begin by testing the number of hidden neurons and the choice of the activation function in the hidden layer. The remaining parameters are then tested in turn, with outcomes from previous steps informing the parameter selections in subsequent steps. MATLAB estimated the training and testing R-values for each of the 100 runs, with Stata used to calculate means and SDs and to generate box-and-whisker plots. The optimal model is the parsimonious model with the highest mean Test-R value.
The distribution of data in the training and testing sets was evaluated using Kolmogorov–Smirnov (K-S) tests to ensure data equivalence. For this purpose, 20 of the 100 re-runs were randomly selected and tested. To identify the most important variables, the optimal model for each measure of academic achievement was re-run 100 times in order to generate values for each connection weight. These were assessed for importance using
Olden et al.’s (
2004) connection weight approach. Finally, the optimal ANN for school grades in mathematics was used for the purpose of predictive modelling.
The results indicate that ANNs with one hidden layer have higher mean Train-R and Test-R values compared to models with two hidden layers, so the optimal number of hidden layers is one, and the optimal number of hidden neurons varies between one and seven. In all six optimal models, the optimal training function is Bayesian regularisation, while the optimal division of Train and Test data varies between models. The two demographic variables are included, but one-hot encoding of the gender variable is not used as it only marginally improved the mean Test-R values for some models (
Table 6).
8.12. School-Assessed Mathematics Scores
The optimal ANN architecture is 13-(2)-1, representing all 13 input variables, one hidden layer with two hidden neurons, and one output neuron. As indicated in
Table 6, the activation function is tan-sigmoid, and the training function is Bayesian Regularisation. The best proportion of training to testing data was 75% and 25%, respectively. The mean Train-R and Test-R values are 0.476 and 0.410, respectively.
Of the 20 randomly chosen training and testing datasets, 217 of the possible 220 p-values are not statistically significant, giving the overall impression that the two datasets are drawn from the same distribution. Similarly, the distributions of gender and year levels in the two sets of data did not differ.
The analysis of input variable importance showed that in the modelling of SchoolMath scores, the six most important variables were actional capital, followed by organismic, telic, infrastructural, year level, and attentional (
Figure 5a). The least important variables were social and didactic capitals.
8.13. School-Assessed English Scores
The optimal ANN architecture is 13-(2)-1, again representing all 13 input variables, one hidden layer with two hidden neurons, and one output neuron. The activation function is tan-sigmoid, and the training function is Bayesian Regularisation. The best proportion of training to testing data was 80% and 20%, respectively. The mean Train-R and Test-R values are 0.491 and 0.450, respectively.
Of the possible 220 p-values for the 11 capitals, 215 are not statistically significant. Similarly, the distributions of gender and year levels in the two sets of data were not different, indicating that the training and testing datasets are essentially the same.
In the modelling of SchoolEng scores, the six most important variables were actional capital, followed by telic, economic, aspirations, year level, and attentional (
Figure 5c). The least important variables were social and cultural capitals.
8.14. School-Assessed Science Scores
The optimal ANN architecture is 13-(1)-1, again representing all 13 input variables, one hidden layer with one hidden neuron, and one output neuron. The activation function is log-sigmoid, and the training function is Bayesian Regularisation. The best proportion of training to testing data was 80% and 20%, respectively. The mean Train-R and Test-R values are 0.517 and 0.493, respectively.
Of the possible 220 p-values for the 11 capitals, 217 are not statistically significant. Similarly, the distributions of gender and year levels in the two sets of data were not different, indicating that the training and testing datasets are essentially the same.
In the modelling of SchoolSci scores, the six most important variables were actional capital, followed by telic, year level, organismic, attentional, and aspirations (
Figure 5e). The least important variables were social and cultural capitals.
8.15. PAT Mathematics Scores
The optimal ANN architecture is 13-(1)-1, representing all 13 input variables, one hidden layer with one hidden neuron, and one output neuron. The activation function is tan-sigmoid, and the training function is Bayesian Regularisation. The best proportion of training to testing data was 70% and 30%, respectively. The mean Train-R and Test-R values are 0.610 and 0.583, respectively (
Figure 6).
Of the possible 220 p-values, 218 are not statistically significant. Similarly, the distributions of gender and year levels in the two sets of data were not different, indicating that the training and testing datasets are essentially the same.
In the modelling of PATMath scores, the six most important variables were year level, followed by actional, organismic, telic, infrastructural, and attentional (
Figure 6a). The least important variables were economic and gender.
8.16. PAT Vocabulary Scores
The optimal ANN architecture is 13-(7)-1, representing all 13 input variables, one hidden layer with seven hidden neurons, and one output neuron. The activation function is tan-sigmoid, and the training function is Bayesian Regularisation. The best proportion of training to testing data was 75% and 25%, respectively. The mean Train-R and Test-R values are 0.588 and 0.555, respectively.
Of the possible 220 p-values, 217 are not statistically significant, and the null hypothesis is not rejected. Similarly, the distributions of gender and year levels were not different, indicating that the training and testing datasets are essentially the same.
In the modelling of PATVocab scores, the six most important variables were year level, followed by actional, telic, organismic, aspirations, and infrastructural (
Figure 6c). The least important variables were episodic and didactic.
8.17. PAT Reading Scores
The optimal ANN architecture is 13-(1)-1, representing 13 input variables, one hidden layer with one hidden neuron, and one output neuron. The activation function is tan-sigmoid, and the training function is Bayesian Regularisation. The best proportion of training to testing data was 80% and 20%, respectively. The mean Train-R and Test-R values are 0.557 and 0.524, respectively.
Of the 220 p-values, 215 p-values are not statistically significant, and the null hypothesis is not rejected. Again, the distributions of gender and year levels in the two sets of data were not different, indicating that the training and testing datasets are essentially the same.
In the modelling of PATRead scores, the six most important variables were year level, followed by actional, telic, attentional, organismic, and aspirations (
Figure 6e). The least important variables were episodic and cultural.
8.18. Predictive Modelling
The optimal ANN model for SchoolMath was used in the predictive modelling experiments. These experiments focussed on two Year 5 students with the lowest school grades in mathematics (37%) and tested the potential improvements to their school grades by changing the value of one focus capital and four supplementary capitals while holding the remaining capitals constant. The potential improvements to the focus and supplementary capitals were based on the mean Rasch-standardised capital scores of average-performing (73%; n = 12) and highest-performing (97%; n = 7) Year 8 students in SchoolMath. These represent the conservative and optimistic approaches, respectively.
The Rasch-standardised capital scores of the two Year 5 students are shown in the column labelled ‘Before’ in
Table 7. For example, the Rasch score of organismic capital for Student 1 is −1.84. In the conservative approach, this score is changed to −0.02 (After), representing an increase of 1.82 logits. In the optimistic approach, the corresponding change is from −1.84 to 0.42, representing an increase of 2.26 logits.
For both Students 1 and 2, the impact of changing the focus and supplementary capitals using the conservative approach is to increase their SchoolMath scores from 37% (D) to 65.4% (B−) and 67.5% (B−), respectively. For the optimistic approach, the impact of changing these capitals is to increase their SchoolMath scores to 74.6% (B) and 75.8% (B), respectively. However, these potential improvements are still lower than the mean SchoolMath scores of the average (73%) and highest performing (97%) Year 8 students.
9. Discussion
The focus of the current research is to determine whether the ANN provides an advantage over SEM in modelling the interactions between the 11 educational and learning capitals and academic achievement within the AMG. Students from one co-educational school responded to the Australian version of the QELC. After using Rasch modelling to confirm the utility of the instrument to measure the capitals and to convert the responses to interval-level data, six SEM models and six ANNs were created, corresponding to the six measures of academic achievement: grades in three school-assessed subjects and scores in three areas of standardised achievement tests. In discussing the results of our research, however, it is important to note that the data were collected in the wake of significant disruptions to regular school activities because of the COVID-19 pandemic. Hence, the students’ actiotopes would have been influenced by these disruptions.
The scope of the data included eight (Years 5–12) and seven (Years 5–11) year levels for female and male students, respectively. Although effect sizes were not computed, trends in the variability in the 11 capitals for females and males across the eight year levels indicate that both female and male students perceived the quality of teaching and curriculum (didactic) as the resource most available to support their learning, particularly for students in Years 5–7 (
Figure 3). However, a downward trend across these year levels was observed for this and most other capitals, before rising again over Years 10–12.
Cultural and aspirations capitals were also rated highly by these students, with aspirations capital increasing from Middle School to Senior School for both genders. For example, for Year 12 female students, the availability of aspirations capital is twice that of Year 6 female students. Similarly, the availability of aspirations capital for Year 11 male students is more than double that of their Year 7 counterparts.
At the opposite end of the spectrum, two learning capitals (attentional and organismic) were rated the least available by both female and male students across all year levels. In other words, resources that reflected their capacity to attend to their learning and their physical and mental health were not perceived as sufficiently available.
9.1. Structural Equation Models
Structural equation models have been used to establish relationships between the availability of the educational and learning capitals and academic achievement. For all school-assessed scores, the most important predictors of educational capital are the social, infrastructural, and didactic capitals (
Figure 3). Noting the post-COVID context, these capitals were of the uppermost concern for students (and parents) when returning from learning at home (
S. N. Phillipson et al., 2025a,
2025b). As shown by all models, however, the latent variable educational capital was a significant negative predictor of school-assessed academic achievement, indicating that the learning capitals were of greater importance. In addition, gender and year level did not contribute to the models.
The theoretical basis of the AMG suggests that educational and learning capitals positively support student learning. However, a negative standardised estimate is observed between educational capital and academic performance for all three school-assessed scores and all three PATs, with four of the six estimates statistically significant at p < 0.05 (t-values less than −1.96). This result suggests that higher levels of educational capitals negatively influence learning outcomes. Although unexpected, the result is plausible given the unique context of the research.
It should be borne in mind that significant efforts were made by the school and parents to ensure that, when learning at home, students had the necessary educational resources, such as equipment, quiet spaces to complete learning tasks, and access to quality teaching. Despite these efforts, the challenges facing students when learning at home included limited direct access to their teachers, distraction from their learning, and a lack of contact with their social group (
S. N. Phillipson et al., 2025b), reflecting the didactic and social educational capitals. In supporting their child, parents report similar concerns (
S. N. Phillipson et al., 2025a). Despite these challenges, lower estimates of capital availability would normally be associated with lower academic performance scores.
There are two possible explanations for this unexpected result. The first explanation focuses on the possibility that students overestimate and/or under-utilise the didactic and social capitals compared with other capitals. A cursory examination of the Rasch scores shows that students do rate these capitals amongst the highest of all capitals (
Table 5), raising the possibility that, despite their availability, students are unable to fully utilise these resources.
The second explanation focuses on changes in the interactions between the capitals immediately following a period of adjustment to learning at home and then adjusting to a return to school. In other words, data were collected when the system was in a transformative stage where allostatic processes are occurring. Moreover, the findings by
S. N. Phillipson et al. (
2025a,
2025b) suggest that these changes were stressful for many students, providing evidence that these students were experiencing allostatic load, where outcomes include changes in physical and mental health, as well as cognition (
Sandifer et al., 2022;
Saxbe et al., 2019). Although the lack of a control group makes it impossible to distinguish between these possibilities, the current research may be the first to describe the interactions between capitals during a transformative stage, where these interactions are very different from those during homeostasis. On the other hand, actional and episodic capitals were important predictors of learning capitals, with learning capitals playing an important role in the variability of school-assessed academic achievement, particularly SchoolMath scores. In the post-COVID context, all of the endogenous learning capitals played a significant role in students’ academic achievement scores.
For all three SEM models reflecting school-assessed academic achievement, there is significant and important covariance between educational and learning capitals, as well as between many of the individual capitals. This finding is consistent with all previous research involving the same methodology. Interestingly, however, as variability in learning capitals increases, there is a decrease in the educational capitals, again reflecting the increased dependence of academic achievement scores on learning capitals in the post-COVID context. It is reasonable to suggest that when learning at home, there is a greater dependence on endogenous resources to support learning.
For the SEM models involving standardised scores in mathematics (PATMath), vocabulary (PATVocab), and reading (PATRead), a similar phenomenon arises. However, the relationship between educational and learning capitals is strongest for PATMath, with actional and episodic learning capitals playing the most important role in the variability in PATMath scores. Consistent with school-assessed academic achievement, there is a negative relationship between educational capitals and standardised scores, as well as covariance between both latent variables.
9.2. Artificial Neural Networks
The relationships between the 11 individual capitals, gender, and year level, as well as both school-assessed and standardised achievement scores, can be modelled using ANNs. Across all six outcome variables, the models vary in their activation function (tan-sigmoid or log-sigmoid) and in the number of neurons in the single hidden layer. However, the mean R values for the training and testing datasets are higher for standardised achievement scores than for school-assessed grades, reflecting the higher input data quality of the standardised test scores. Nevertheless, the mean Train-R and Test-R values for the school-assessed academic achievement scores are sufficient to enable confidence that meaningful relationships are being modelled by the ANNs.
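The model-selection step described above can be sketched as a small grid search. The following is an illustrative reconstruction under stated assumptions (synthetic data, scikit-learn’s MLPRegressor in place of the original software), not the authors’ code; in scikit-learn, tan-sigmoid and log-sigmoid correspond to the "tanh" and "logistic" activations, and Train-R/Test-R are taken here as the correlation between predicted and observed scores.

```python
# Illustrative sketch (assumptions only): grid over the two sigmoid
# activations and several hidden-layer sizes, scored by the correlation (R)
# between predicted and observed scores on held-out data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 13))            # 11 capitals + gender + year level
y = X[:, 0] - 0.5 * X[:, 2] + rng.normal(scale=0.5, size=300)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

best = None
for activation in ("tanh", "logistic"):   # tan-sigmoid / log-sigmoid
    for n_hidden in (5, 10, 15, 20):
        net = MLPRegressor(hidden_layer_sizes=(n_hidden,),
                           activation=activation,
                           max_iter=3000, random_state=1).fit(X_tr, y_tr)
        train_r = np.corrcoef(net.predict(X_tr), y_tr)[0, 1]
        test_r = np.corrcoef(net.predict(X_te), y_te)[0, 1]
        if best is None or test_r > best[3]:
            best = (activation, n_hidden, train_r, test_r)

print(f"best: {best[0]}, {best[1]} neurons, "
      f"Train-R = {best[2]:.2f}, Test-R = {best[3]:.2f}")
```

Selecting on Test-R rather than Train-R guards against rewarding configurations that merely memorise the training data.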
A close examination of input variable importance shows that for all school-assessed achievement scores, actional learning capital was clearly the most important capital. Bearing in mind that a generic assessment was made rather than a context-specific measure of skills and knowledge, the finding indicates that raising student knowledge and skills is fundamental to raising their achievement scores. Although this appears self-evident, the question for teachers (and schools) remains how this can be achieved.
Paradoxically, didactic capital (quality curriculum and teaching) plays a less important role in the school-assessed academic achievement of this cohort, perhaps reflecting the unique context of the data collection and, accordingly, consistent with the SEM models. The remaining ‘important’ variables provide some indication as to how this may occur. For SchoolMath, one approach may be to focus on those capitals under the direct influence of teachers, including the organismic, telic, attentional, and aspirations capitals.
For standardised achievement scores, it is unsurprising that year level plays the most important role in the variability of scores. The remaining capitals, in order of importance, are the actional, organismic, telic, and aspirations capitals. It is interesting to confirm that the didactic, cultural, and social capitals are the least important for all three standardised achievement scores, with the low importance of didactic capital being particularly surprising.
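The paper does not state how input-variable importance was computed, so the following sketch uses permutation importance, one common model-agnostic approach, on synthetic data in which a stand-in ‘actional’ input dominates the outcome; the variable names and data are assumptions for illustration only.

```python
# Illustrative sketch (assumptions only): ranking the 13 inputs of a fitted
# network by permutation importance, i.e., the drop in model score when each
# input column is shuffled in turn.
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)
names = ["actional", "organismic", "telic", "aspirations", "didactic",
         "cultural", "social", "infrastructural", "economic", "episodic",
         "attentional", "gender", "year_level"]

X = rng.normal(size=(300, len(names)))
# Synthetic target in which the stand-in 'actional' column dominates,
# mirroring the finding that actional capital was the most important input.
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.3, size=300)

net = MLPRegressor(hidden_layer_sizes=(10,), activation="tanh",
                   max_iter=5000, random_state=2).fit(X, y)

result = permutation_importance(net, X, y, n_repeats=10, random_state=2)
ranking = sorted(zip(names, result.importances_mean),
                 key=lambda t: t[1], reverse=True)
for name, score in ranking[:3]:
    print(f"{name}: {score:.3f}")
```

Because permutation importance scores each input by its contribution to predictive accuracy, it supports exactly the kind of ordering of capitals discussed in this section.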
In supporting students who are not achieving, it is tempting to suggest that teachers at this school should focus on raising the capitals that are least available or most important for students. For example, a focus on student mental and physical health (organismic) and, to a lesser extent, attentional resources may facilitate an increase in students’ academic achievement. Of the educational capitals, aspirations capital may support students’ academic achievement. The next section outlines our attempt to use the optimal ANN for SchoolMath to predict changes in school-assessed mathematics achievement scores.
9.3. Predicted SchoolMath Scores
As previously outlined, the capacity of the optimal ANN model for SchoolMath to predict future school-assessed achievement scores is tested with two possible scenarios. Both scenarios are focussed on two Year 5 students with the lowest school-assessed score of 37%. Given that it is reasonable for the school to propose interventions to support their improvement, the scenarios are based on the capital profiles of current Year 8 students achieving average (73%) and highest (97%) school-assessed achievement scores (
Table 7).
The results indicate that enhancing these capitals has a positive impact on SchoolMath. In the conservative approach, the predicted SchoolMath score rises from 37% to 65.4% and 67.5% for Students 1 and 2, respectively. In the optimistic approach, the predicted scores rise to 74.6% and 75.8%, respectively. The results suggest that a focus on five capitals is sufficient to double the SchoolMath scores, providing an evidence-based approach for teachers to design appropriate intervention strategies. Given that the capitals are highly interdependent, raising mental and physical health (organismic) and self-regulated strategies such as goal setting (telic) and attention (attentional) may be sufficient to raise the actional and episodic capitals to the levels required.
Importantly, the predictions presume that there are no changes to the educational capitals, namely economic, cultural, social, infrastructural, and didactic. Given that these capitals lie outside the classroom teacher’s immediate range of influence, this provides additional assurance for teachers and their school regarding this approach.
As already outlined, the results are based on the actiotopes of students attending one school and a specific context. Although the AMG is based on the notion that the interactions between the capitals remain constant, the levels of the capitals will vary across different contexts. The results of the current study encourage further efforts to use ANNs to model these interactions. More importantly, the models are amenable to hypothesis testing where, for example, the impact of teacher interventions on these capitals and achievement scores can be measured and compared with predicted improvements.
Although comprehensive, the current study needs to be confirmed in a number of different contexts and with better-quality data, especially in relation to school-assessed achievement scores. In Australia, the use of data from the National Assessment Program—Literacy and Numeracy (NAPLAN) could provide policymakers with direct evidence for the direction of resources to address issues in student literacy and numeracy. Moreover, longitudinal studies are also better placed to measure changes in both capitals and achievement scores, which is consistent with the central premise of the AMG.