1. Introduction
The importance of general cognitive ability, g, as compared to specific abilities for job performance has been a subject of great debate (Kell and Lang 2018). Despite the assertion that cognitive abilities are among the best predictors of job performance, the controversy over which ability or set of abilities explains significant variance in the criterion space of job performance has never been settled. In this context, studies continue to be published advocating the importance of some particular ability over other abilities for job performance prediction. The works of Schmidt and Hunter (1998, 2004) exemplify a strong line of research asserting that g is the most crucial ability for predicting occupational performance, whereas specific abilities do not explain much variance beyond g. Hunter (1986, p. 341) took an extreme position when he stated that “it is general cognitive ability and not specific cognitive aptitudes which predict job performance.” Schmidt (2002) argued that it is “not logically possible” to have a serious debate over the importance of general cognitive ability for job performance. In the same way, the “Not Much More Than g” series of Ree and his colleagues (Ree and Earles 1991, 1996; Ree et al. 1994) reflects the same standpoint that views g as the best construct for the prediction of job performance. One implication of such a hypothesis is that selection procedures should focus, to a large extent, on applicants’ general ability (or IQ) scores and, to a much lesser extent, on their narrower ability scores.
Opposing this line of cognitive ability research, another direction has started to gain attention in recent years, emphasizing that specific abilities (e.g., verbal, quantitative, spatial) can also be significant components for predicting success in occupations, and their roles should not be ignored (e.g., Krumm et al. 2014; Lang and Kell 2020; Murphy 2017; Reeve et al. 2015; Schneider and Newman 2015; Wee et al. 2014). The idea of having one single trait, g, capable of fully capturing the individual differences in job performance might be problematic for applied industrial/organizational (I/O) psychology (Beier et al. 2019), particularly for selection and assessment purposes. Beier et al. (2019) noted that three challenges arise when relying solely on a g score: violation of legal frameworks in some organizations (e.g., not complying with job analysis), limitations of the information obtained from one single score, and the large majority–minority differences typically associated with g scores. Criticism was raised that research examining the prediction of job performance often takes g for granted, and other abilities are considered only for the sake of a little improvement (Ziegler and Peikert 2018).
Stankov (2017) argued that the overemphasized “g” has hindered the study of broad and specific cognitive abilities and led to neglecting the first- and second-stratum factors in the Cattell–Horn–Carroll (CHC) model. Similarly, Murphy (2017) noted that studies stressing g measures over measures of specific abilities fail to consider the second-stratum abilities that can sometimes be more predictive for job performance than more global measures of general cognitive ability. He cautioned that the increasing publications overstressing the predictive role of g and underestimating the incremental contribution of specific abilities might have led to a premature decline in research on the roles of specific abilities in the workplace (Murphy 2017).
In contrast to the “Not Much More Than g” hypothesis, Kell and Lang (2017) maintained that specific abilities in some workplaces could be “More Important Than g.” Supporters of this contention believe that many of the findings devaluing the significance of specific abilities in workplaces were due to limitations in the analytical procedures used to assess predictive relations. Most of these studies relied primarily on traditional regression analyses (e.g., hierarchical linear regression), which may not be ideal for drawing firm conclusions about the relative importance of predictors. Although this family of statistical techniques is powerful in maximizing the prediction of a particular set of variables, it tends to provide an “unequal” opportunity for predictors to exhibit their potential power, especially when the multicollinearity among predictors is high (Tonidandel and LeBreton 2011).
In hierarchical regression analyses, the method most frequently used in incremental validity studies, a score of g (often the first unrotated principal component or a composite score from a test battery) is entered into the model first, whereas specific abilities are added second (e.g., Ree et al. 1994). Criterion scores (e.g., flying performance) are regressed first on scores of g, with scores of specific abilities (e.g., spatial ability, perceptual speed) entered in the second step of the hierarchical regression. The shared variance in this statistical design is always attributed to the influence of g because the model prioritizes the predictors entered first, regardless of the variance specific abilities share with the criterion. Even the variance that overlaps between g and specific abilities is credited to g. The only variance credited to other predictors in the model is the percentage that does not overlap with g. Such an analytical strategy is likely to leave little remaining variance in criterion scores that can be accounted for by specific abilities (Lang et al. 2010).
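To make this variance-attribution concern concrete, the following is a minimal sketch of the two-step strategy on synthetic data; the variable names, effect sizes, and data-generating model are hypothetical and serve only to show how step order credits shared variance to g.

```python
# Minimal sketch of a two-step hierarchical regression on synthetic data.
# All variable names and effect sizes are hypothetical illustrations.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 500
g = rng.normal(size=n)
spatial = 0.6 * g + rng.normal(scale=0.8, size=n)   # correlated with g
speed = 0.5 * g + rng.normal(scale=0.9, size=n)     # correlated with g
performance = 0.4 * g + 0.3 * spatial + rng.normal(size=n)
df = pd.DataFrame({"g": g, "spatial": spatial, "speed": speed,
                   "performance": performance})

def r2(cols):
    X, y = df[cols].values, df["performance"].values
    return LinearRegression().fit(X, y).score(X, y)

step1 = r2(["g"])                      # Step 1: g alone
step2 = r2(["g", "spatial", "speed"])  # Step 2: specific abilities added
# Any predictive variance that spatial/speed share with g has already been
# credited to g in step 1, so the increment understates their role.
print(f"R2(g) = {step1:.3f}; incremental R2 of specifics = {step2 - step1:.3f}")
```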
For that reason, many researchers have called for adopting other analytical procedures when attempting to establish whether specific abilities have incremental validity above and beyond that provided by g. Relative importance analysis (RIA) is one useful analytical procedure for investigating predictor–criterion relationships. Two variants of RIA have gained popularity in recent years: relative weight analysis (Johnson 2000) and dominance analysis (Azen and Budescu 2003). The two procedures have been found to produce similar results, although they differ in their computational and analytical foundations (Johnson 2000). Both allow for a more accurate partitioning of variance in multiple regression, which leads to better judgments about the effects of predictors on outcomes. RIA exhibits the impact each predictor has on the overall model, considering both its unique contribution and its contribution in the presence of other predictors (LeBreton et al. 2007). It decomposes the total predicted variance in a criterion into the portions that should be attributed to each individual predictor, even when the predictors are strongly correlated with one another. These analyses, however, are not meant to replace regression analyses but rather to serve as an informative supplement fostering the understanding of the role played by each predictor in a regression equation (Tonidandel and LeBreton 2011). Although relative weight and dominance analyses are very useful techniques for assessing the relative importance of predictors in a model, neither is as powerful as multiple regression in maximizing the prediction of the criterion variable.
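As an illustration of how dominance analysis redistributes shared variance, here is a minimal general-dominance sketch (after Azen and Budescu 2003); it reuses the synthetic `df` from the sketch above, and the function names are our own.

```python
# General dominance: average each predictor's incremental R^2 over all
# subsets of the remaining predictors, then average across subset sizes.
from itertools import combinations
import numpy as np
from sklearn.linear_model import LinearRegression

def r_squared(df, predictors, criterion):
    if not predictors:
        return 0.0  # the empty model explains no variance
    X, y = df[list(predictors)].values, df[criterion].values
    return LinearRegression().fit(X, y).score(X, y)

def general_dominance(df, preds, criterion):
    weights = {}
    for p in preds:
        others = [q for q in preds if q != p]
        size_means = []
        for k in range(len(others) + 1):  # subset sizes 0..len(others)
            incs = [r_squared(df, list(s) + [p], criterion)
                    - r_squared(df, list(s), criterion)
                    for s in combinations(others, k)]
            size_means.append(np.mean(incs))
        weights[p] = np.mean(size_means)
    return weights  # the weights sum to the full-model R^2

print(general_dominance(df, ["g", "spatial", "speed"], "performance"))
```

On the synthetic data above, the three general dominance weights sum to the full-model R², so the variance that g shares with the specific predictors is divided among them rather than assigned wholly to g.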
A bifactor model, or nested-factor model, is another useful approach for improving our understanding of the interplay of predictors. Although this model was introduced many decades ago (Holzinger and Swineford 1937), its use as a predictive model for associations between predictors (e.g., cognitive abilities) and outcome criteria (e.g., job performance) has been revived only recently. In a bifactor model, g is modeled like the specific abilities as a lower-order factor, but differs in that it has paths to all (or the majority) of the indicators. Studies comparing bifactor models with higher-order models have shown that bifactor models tend to produce a better fit than higher-order models (e.g., Cucina and Byle 2017; Morgan et al. 2015). The bifactor model’s unique specification allows for an effective partitioning of variance among observed variables and enables a clear separation of domain-general from domain-specific effects (Reise 2012; Zhang et al. 2021). The g effect can thus be disentangled from specific-ability effects, and their contributions to a criterion can be assessed using latent multiple regression models within the SEM framework. The built-in orthogonalization feature of this model makes it appropriate for investigations that seek a complete distinction between the effects of general and specific factors (e.g., Gignac 2008).
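Schematically (in our notation, not necessarily the exact specification used in any study cited above), a bifactor predictive model combines a measurement model in which each test score $x_i$ loads on $g$ and on exactly one specific factor $s_k$, with all factors mutually orthogonal, and a structural equation regressing the criterion $y$ on the full set of factors:

\[
x_i = \lambda_{gi}\, g + \lambda_{ki}\, s_k + \varepsilon_i, \qquad \operatorname{Cov}(g, s_k) = \operatorname{Cov}(s_k, s_{k'}) = 0 \;\; (k \neq k'),
\]
\[
y = \beta_g\, g + \sum_k \beta_k\, s_k + \zeta.
\]

Because the factors are orthogonal, $\beta_g$ and each $\beta_k$ partition the predicted criterion variance without overlap, which is precisely the separation of domain-general from domain-specific effects described above.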
Alternatively, Coyle (2014) advocated an analytic approach in which relations are tested via the non-g residuals of tests produced from a higher-order factor model (i.e., a hierarchical structure involving three conceptual levels: g at the top, ability group factors at the second level, and specific abilities at the lowest level, represented by observed test scores). He argued that this is the most promising approach in the study of human intelligence (Coyle 2014). In these SEM models, the residuals of specific abilities, with the effect of g partialled out, are allowed to correlate with performance measures, thus providing a purer estimate of specific-ability effects on performance. Relations examined with the non-g residuals of tests showed that specific abilities could have equal or even greater importance than g in predicting outcomes. Contrary to the primacy-of-g hypothesis, Coyle (2018) found significant incremental validity for several specific abilities on the SAT, ACT, and Preliminary SAT tests beyond the validity of g for the prediction of different criteria, often with substantial effect sizes (βs ≈ 0.30). This method has seen increased use and has assisted in determining the relative role of specific constructs beyond the validity obtained by the g factor (e.g., Benson et al. 2016; Berkowitz and Stern 2018; Wee 2018).
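The logic of the residual approach can be sketched as follows, under simplifying assumptions that are ours rather than Coyle’s: a g proxy is extracted (here, the first principal component of the standardized battery), and each test is residualized on it before being related to the criterion.

```python
# Rough sketch of a non-g residual analysis. The PCA-based g proxy and OLS
# residualization are simplifying assumptions, not Coyle's exact SEM procedure.
import numpy as np

def non_g_residuals(scores: np.ndarray):
    """scores: (n_people, n_tests) matrix of test scores.
    Returns (g_proxy, residuals), where residuals are standardized test
    scores with the g proxy partialled out."""
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0)
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    g_proxy = z @ vt[0]                          # first principal component
    beta = z.T @ g_proxy / (g_proxy @ g_proxy)   # OLS slope of each test on g
    residuals = z - np.outer(g_proxy, beta)
    return g_proxy, residuals

# Correlating a residual column with a criterion then estimates that test's
# specific (non-g) predictive relation.
```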
In addition to the influence of statistical analyses on the conclusions drawn from predictive validity research, other factors can determine whether specific abilities emerge as important predictors of job performance. The cognitive-ability–job-performance compatibility principle (Schneider and Newman 2015) is one factor that needs to be considered in such investigations and is believed to be one possible source of bias against specific abilities. The central point is the need for a reasonable alignment between predictors and criteria, such that a general predictor is matched with a general criterion and specific predictors are matched with specific criteria (Wee 2018). More precisely, to the extent that the indicator variables for a predictor and a criterion have similar cognitive requirements and are weighted similarly in both, the strength of the predictive relationship is expected to increase (Krumm et al. 2014, citing Brunswik’s (1956) lens model).
Moreover, the job performance dimension is another aspect to take into account when designing a criterion-related validity study. Drasgow (2012) argued that expanding the criterion space to include criteria other than training performance and overall job performance (e.g., contextual job performance, counterproductive work behaviors, and attrition) enables a better understanding of the individual differences that predict behavior in the workplace. Based on an integrative synthesis of the literature, Campbell and his colleagues (e.g., Campbell et al. 2001; Campbell and Wiernik 2015) proposed an eight-factor model representing the primary dimensions of performance in a work role. A hierarchically organized structure, similar to an intelligence model, has also been suggested for job performance, in which indicators from different performance domains cluster into a few group factors of broad performance (or compound performance dimensions), with the highest-order factor of performance at the vertex of the model (Ones et al. 2017). Hence, a more thoughtful plan in the design of a validation study, particularly in the selection of criteria, can affect the results and the conclusions drawn about the true relations between predictor and outcome variables.
Another factor that can be highlighted in ability–performance research is the overuse of correction (LeBreton et al. 2014). The compelling results showing the negligible role of specific abilities relative to the predominant role of general ability for predicting job performance may be due, in part, to studies’ reliance on correlations that have undergone several corrections for range restriction, measurement error, or dichotomization. Although the correction of observed correlations is a recommended strategy to produce more accurate estimates of ability–performance relationships, it may have precluded critical evaluations and possible refinement of the interplay of general and specific cognitive abilities in predicting job performance. It might have also hindered scholarly understanding and appreciation of the possible role of specific abilities as a worthy predictor for future work outcomes. Thus, in this study, we applied uncorrected data (i.e., observed correlations) to establish more clearly the relative contribution of cognitive abilities for predicting job performance, free from the possible influence of correlation correction.
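For context, a standard version of these adjustments, the Thorndike Case II correction for direct range restriction on the predictor, rescales an observed validity $r$ by the ratio $u$ of the unrestricted to the restricted predictor standard deviation:

\[
r_c = \frac{r\,u}{\sqrt{1 + r^2\,(u^2 - 1)}}, \qquad u = \frac{S_{\text{unrestricted}}}{s_{\text{restricted}}}.
\]

Because $u > 1$ in selected samples, $r_c$ exceeds the observed $r$; the analyses reported here retain the observed correlations instead.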
The bright side of this long-lived scientific debate, however, is that it has stimulated dynamic research in both directions, which is certainly advantageous for the advancement of related sciences. Some journals have devoted special issues debating the relative value of cognitive abilities for performance outcomes. As an example, a special issue of Human Performance discussed the role of general mental ability in I/O psychology (Viswesvaran and Ones 2002). Equally, a recent special issue of Journal of Intelligence focused on this great debate in seven articles (Kell and Lang 2018) in an attempt to motivate reconsideration of specific abilities in the workplace. Some of these articles offered analytical strategies that can be used as an alternative to the traditional statistical analysis to disclose the determinants of job performance more accurately (e.g., Coyle 2018; Eid et al. 2018; Ziegler and Peikert 2018). Of interest, this debate on the relative role of general versus specific abilities has transferred from educational and workplace settings to other life domains. Some forms of this debate can now be found in studies of wages (Ganzach and Patel 2018), players of the National Football League (Lyons et al. 2009), happiness (Blasco-Belled et al. 2020), triangular love (Van Buskirk 2018), humor production ability (Christensen et al. 2018), music training (Silvia et al. 2016), and piano skill acquisition (Burgoyne et al. 2019).
The present study is situated within this context: the debate on whether general ability or specific abilities contribute most to the prediction of job performance. More specifically, we assessed the roles of five specific abilities (verbal, quantitative, spatial, perceptual speed, and aviation-related acquired knowledge), as well as general ability, in predicting performance in three military aviation-related occupations: flying, navigation, and air battle management (ABM). Given the nature of the three occupations, those selected for these jobs typically have high cognitive aptitude and achieve high scores on many selection requirements, such as scholastic, personality, physical, and medical examinations. Hence, cognitive abilities have ample opportunity to demonstrate their role and influence in individuals’ performance. In this study, we aimed to understand how influential certain cognitive abilities are in different aviation occupations and how the occupational patterns may vary.
The examination of relationships relied primarily on bifactor modeling as a suitable alternative statistical approach. In this study, we sought to examine latent relationships between cognitive abilities and job performance, which can be accomplished appropriately through SEM procedures. We were interested in capturing the latent constructs of cognitive abilities and relating them to latent (or observed) job performance. SEM is a sound method for this particular goal compared with alternative analyses that are more suitable for assessing scores at the observed, less abstract level (e.g., Oh et al. 2004; Glutting et al. 2006). Through SEM, we can also overcome the concerns raised about hierarchical regression analysis. Given the goals pursued by the current investigation, a bifactor SEM model provides an efficient tool to disentangle the effects on criteria due to the general factor from those due to the specific-ability factors, with several equations and parameters tested simultaneously. Every ability factor, including the g factor, has a path (i.e., regression) coefficient showing its effect on the performance criteria, controlling for the other abilities in the model. Thus, the unique contribution of every ability to candidates’ outcomes in the three aviation jobs can be estimated. The two main research questions investigated in the present study were as follows: (1) How do the predictive relations between cognitive abilities and job performance vary across the three occupations (flying, navigation, and air battle management (ABM))? (2) Is there any incremental validity of the specific group factors of the abilities above that obtained from the g factor in any of the three occupations (flying, navigation, ABM)?
4. Discussion
Intelligence researchers have long debated whether the general ability factor is the only factor that accounts for performance in cognitive tasks or whether other broad ability factors explain some of the common variance in test scores (e.g., Agnello et al. 2015; Reeve and Bonaccio 2011). Another version of this debate is that among industrial/organizational (I/O) psychology researchers about whether general ability or narrower abilities contribute most to the prediction of job and training performance (e.g., Hunter 1986; Kell and Lang 2017; Lang et al. 2010; Ones et al. 2017). The current study weighs in on this controversy by providing results that may be of mutual interest to intelligence and I/O psychology researchers, using data from highly cognitively demanding occupations in which individual differences in job performance are linked to differences in cognitive abilities. Evidence from three aviation occupations was provided about the predictive relations between cognitive abilities and job performance. Through the application of bifactor predictive models, the results clarify the interplay of general and specific cognitive abilities in predicting the training performance of pilots, navigators, and air battle managers.
The effect size of bifactor g was large in the navigator sample, small in the pilot sample, and negligible in the air battle manager sample. In contrast, the number of significant effects due to specific factors was zero in the navigation sample, one in the flying sample, and three in the ABM sample. In the navigator sample, when g was modeled, the effects of specific abilities either declined or faded away, compared with their significant relationships with performance criteria in the correlated-factor model. g was found to be the only noteworthy predictor of navigators’ performance, suggesting that the simple correlations of the five abilities with navigation performance were mostly due to their overlap with g. Navigation, like flying, is considered a complex class of jobs that requires high cognitive ability even to undertake the training. In the old 16-subtest AFOQT (e.g., Carretta and Ree 1996), navigator applicants had to qualify on an 11-subtest composite score (Navigator/Technical composite), compared with an 8-subtest composite score (Pilot composite) for pilot applicants. This indicates the cognitively demanding nature of the job, which may also explain the greater role of g, relative to specific abilities, in predicting trainees’ performance on navigation tasks.
The pattern in the pilot sample fell between those noted for navigators and air battle managers: aviation acquired knowledge, along with g, remained significant and effective in the predictive model. Acquired aviation knowledge became a better predictor of flight performance after the general factor variance was removed from its scale scores. The effect of this factor was estimated at 0.29, versus 0.11 for the g factor. The larger effect of the aviation-related acquired knowledge factor in the pilot sample than in the other two samples may reflect the fact that the two indicators used to extract the factor include content more related to pilot jobs than to any other jobs in the USAF. The predictive utility of tests measuring acquired knowledge for pilot performance has been documented in a number of meta-analyses (ALMamari and Traynor 2019, 2020; Hunter and Burke 1994; Martinussen 1996).
The strong relationship between the AFOQT construct of aviation acquired knowledge and pilot performance was clearly established in ALMamari’s (2021) study. Using a modeling technique similar to that presented here (i.e., bifactor) and three pilot performance criteria, effect sizes of 0.43 and 0.12 were obtained for this construct when predicting “hands-on” flying performance at the primary and advanced phases of training, respectively. For the academic performance criterion, acquired knowledge showed a weaker, although still noteworthy, role (β = 0.08). Bifactor g related more strongly to academic performance (β = 0.24) and less strongly to hands-on pilot performance at the primary phase (β = 0.26) than acquired knowledge did. Perceptual speed ability demonstrated the highest predictive validity for hands-on pilot performance at the advanced phase. The remaining cognitive abilities (verbal, quantitative, and spatial) contributed trivially to the pilot performance predictive models.
Job knowledge test scores often demonstrate strong relationships with job performance (Hunter 1986; McDaniel et al. 1988). Hence, the comparative importance of this factor in the current findings may not depart from that trend. What makes the finding different, however, is the relatively large significant effect of this factor even in the presence of g, although it is common to hypothesize that job knowledge influences job performance indirectly through its relation with g (Ree et al. 1995; Schmidt et al. 1986). Interestingly, knowledge-based tests can also be viewed as indicators of an applicant’s interest in, and motivation toward, the job they are applying for (e.g., Kanfer and Ackerman 1989; Colquitt et al. 2000); thus, it may be this interaction between the cognitive and non-cognitive aspects of the construct that makes the factor a robust predictor of pilot performance.
Compared with flying and navigation performance, cognitive abilities’ predictive relations with air battle managers’ performance showed a distinct and somewhat unexpected pattern. Because the air battle manager performance measure is an average score across multiple written tests, the expectation was that it would relate more strongly to general ability than to any specific ability, given its saturation with general academic and knowledge constructs. The influence of g on academic and achievement performance is a well-documented phenomenon (Gottfredson 2002; Gustafsson and Undheim 1996), especially when the performance is general in scope (Kahana et al. 2002), as is the case in the air battle managers’ composite measure. However, contrary to expectations, quantitative ability, aviation acquired knowledge, and verbal ability were the three strongest predictors of air battle manager performance after the general factor variance was removed from their latent scores. Thus, the current findings, which seem to contrast with the majority of research supporting a dominant role of g over specific abilities in the prediction of academic performance, remain to be explained.
One possible reason for the significance of specific abilities, and the non-significance of g, as predictors of air battle manager performance is the way air battle manager performance was modeled in this study. Because only one performance measure existed for air battle managers, performance was modeled as an observed variable indicated by a single dimension more related to academic achievement, rather than as a latent variable indicated by multiple measures of different performance dimensions. Including scores from multiple dimensions of air battle manager performance might make the construct more suitable for prediction by a general predictor such as g (e.g., Ones and Viswesvaran 1996). Additionally, the performance measures of pilots and navigators in this study relied primarily on ratings of hands-on job samples, whereas that of air battle managers was mostly academic, which may not correspond well to our operationalization of g, which includes spatial ability and perceptual speed, abilities probably not sampled in the conventional academic test items.
Moreover, according to the job complexity hypothesis, a highly complex job requires more general ability, whereas a less complex job requires only specific abilities (Gottfredson 1997; Hunter et al. 1990; Murphy 1989). Thus, air battle manager performance in this study may have been represented by a less complex dimension within the wide criterion space of the ABM job, while the performance of pilots and navigators was represented by a global score with overlapping dimensions and constructs, most of which were practical in essence. Furthermore, an air battle manager’s job is generally less complex than pilot and navigator jobs (e.g., Fowley 2016; Rhone 2008), with a lower minimum qualifying score (Carretta 2008); thus, also based on the job complexity proposition, a lesser role for g might be expected. Last, the courses taught in a technical program for training air battle managers are presumably also technical in scope and tend to target narrower knowledge and skills. According to the ability–criterion compatibility principle (Schneider and Newman 2015), such a specific-ability-oriented criterion score is best predicted by a specific-ability-oriented predictor score.
All in all, despite overwhelming pre-existing evidence for the supremacy of the general factor as the best stand-alone predictor of job and training performance (Ones et al. 2017; Ree and Earles 1992; Schmidt and Hunter 2004), the present study provides support for crucial predictive roles of some specific abilities that contribute uniquely to performance outcomes. A strong predictive role for some specific abilities (relative to g) in job and training performance has also been found in some recent investigations (e.g., Coyle 2018; Lang and Kell 2020; Lang et al. 2010; Ziegler et al. 2011), implying that this conclusion may hold across a wider range of occupations. In our view, a next step should be to synthesize this accumulating evidence to characterize more systematically which specific abilities tend to predict job performance outcomes net of g, which performance outcomes they predict, and in what types of job roles. Following Brunswik and Krumm et al. (2014), we might then ask, for each study that has found stand-alone predictive value of specific factors: what were the indicator components of g and their analytic weights or loadings, what were the indicator components of any specific factors and their weights, and to what extent were these aligned with the indicators of successful job performance?
6. Limitations and Future Research
In this study, the roles of general ability and domain-specific abilities as predictors of job performance were examined using data from three aviation-related occupations. The main focus was on the five ability factors, along with psychometric g, that can be extracted from AFOQT subtest scores. The results of the present study show that the predictive relations differed across the three professional aviation occupations. Although the cognitive testing applied to the three samples was similar, the performance measures differed to some extent, especially that of the ABM sample. The breadth of the performance measures used in each sample (e.g., general or specific), the varying modeling approach (e.g., latent or observed), and the constraints we imposed to identify the predictive bifactor model (e.g., Eid et al. 2018; Zhang et al. 2021) could have affected the results. To allow better comparison, future studies should attempt to obtain comparable performance measures across occupations, such as academic performance in the training program or actual hands-on job samples. The ability–performance relationship was investigated in this study without controlling for potential covariates that may influence the predictive relations. Future studies aiming to establish the validity of cognitive abilities for job performance could add potential moderator variables, such as gender or ethnic group, to the predictive models if such moderators are suggested by previous empirical findings. Future studies could also expand the scope and assess the predictive role of other cognitive functions obtained from different test batteries (e.g., memory, multitasking, reaction time) for professional flight occupations. Given the limitations of cross-sectional data and the likely influence of between-group sampling variability (Little 2013), future studies may also attempt a longitudinal design to track changes in predictive relations across different phases of training, with some control for previous levels of the variables.
The modeling technique applied in the present study was based primarily on a bifactor model, which has some inherent limitations (Eid et al. 2018; Reise 2012; Zhang et al. 2021). For example, the factors in a bifactor model, both the general factor and the group factors, are restricted to being uncorrelated. In addition, each indicator in a bifactor model is allowed to load on the general factor and on only one group factor. Given the known intercorrelations among cognitive data, these assumptions may be unrealistic where group factors are conceptually related or where an indicator can mark more than one construct. For example, the perceptual speed and spatial ability factors are expected to share some common variance attributable to the general factor (e.g., Barron and Rose 2013), but each was marked by a separate set of indicators. Thus, it would be useful to attempt different approaches with other analytic procedures to lend the results further credibility. Examples of approaches that have proven effective in separating the effects of predictors on a criterion include relative weight analysis (Johnson 2000), dominance analysis (Azen and Budescu 2003), and the non-g residuals of tests derived from a higher-order factor structure (Coyle 2018). Replicating the results of the current study using some of these methods would give further confidence in the findings. Finally, because this study attempted to provide a view of ability–performance relationships different from the conventional view that relies on corrected data, the findings are likely to underestimate the true effects of cognitive abilities on job performance measures. Using correlational data corrected for attenuation (e.g., range restriction) may show a different, or substantially similar, pattern of predictive relations (Aguinis et al. 2011), although those data transformation techniques also have limitations (LeBreton et al. 2014). Given the restricted samples used in this study, owing to the strict selection procedures for USAF officer candidates, especially those qualified for aviation jobs, generalization of the current findings to less restricted samples from similar occupations (e.g., civil airline pilots) should be made with caution.