1. Introduction
Fluid reasoning (Gf;
Carroll 1993) refers to a set of abilities that helps us solve new problems (
Schneider and McGrew 2012). Gf supports relational and inferential reasoning, classification of new situations and phenomena, formulation of hypotheses, generalization, application of old schemas to new events and problems, and identification of similarities and differences between concepts (
McGrew 2009). Gf is a cognitive aptitude that relies heavily on working memory (WM), or the ability to keep relevant information in mind (e.g.,
Chuderski 2013;
Conway et al. 2002,
2003;
Dehn 2017;
Engle et al. 1999;
Kaufman 2014), as well as executive functions (EFs), or the set of control processes that support goal-directed behavior (
Lehto 2004), and planning, or the ability to consider how to approach a complex problem before getting started.
Tests designed to measure Gf typically present a pattern of simple shapes; the test taker must understand the rule that governs this pattern and then either construct the answer or choose the correct answer from a bank of answer choices (
Alves 2007;
Flanagan and Harrison 2012;
Schlottfeldt and Malloy-Diniz 2018). Among the most common types of tests used to measure Gf are matrix reasoning tests, such as those found in Cattell’s Culture Fair task (
Cattell 1973), Raven’s Progressive Matrices (
Raven et al. 1998), the Wechsler Adult Intelligence Scale matrices (
Wechsler 2004), and the Wiener Matrizen-Test-2 (WMT-2;
Schlottfeldt and Malloy-Diniz 2018). These tests commonly present a 3 × 3 matrix with nine cells and a set of answer choices (see
Figure 1 for a sample WMT-2 problem). All these tests are non-verbal measures that require inferring abstract relations among simple shapes and deducing the missing item in the array based on these relations. These types of tests assess deductive and inductive reasoning (
Drozdick et al. 2012). Central to these tests is the requirement that participants identify and integrate relations across the problem array, a cognitive process known as relational thinking (
Alexander 2016) that was recently conceptualized as an EF (
Starr et al. 2022).
When completing matrix problems, a test taker uses visual cognitive strategies.
Vigneau et al. (
2006) described two strategies in matrix reasoning tasks based on the work of
Bethell-Fox et al. (
1984), and
Snow (
1978,
1980) in analogical tasks with several response alternatives. The strategies can be defined as
constructive matching, an effective but costly strategy whereby participants try to solve the problem by mentally constructing the missing piece before going to the answer choices to look for it; and
response elimination, a less precise strategy whereby participants alternate many times between the matrix and the answer choices in an effort to eliminate wrong answer choices.
These two strategies, constructive matching and response elimination, can be identified through the use of eye tracking (
Laurence and Macedo 2022). For example, the number of times a test taker’s gaze toggles between the matrix and the answer choices is proposed to indicate the strategy in use: many toggles are theorized to reflect a response elimination strategy, whereas few toggles are thought to reflect constructive matching. The same argument has been made for how long it takes a participant to look at the answer choices for the first time after studying the matrix, and for how much more time the participant spends on the matrix than on the answer choices. Specifically, a long duration of focus on the matrix prior to transitioning to the answer choices, and proportionally more time spent analyzing the matrix than the answer choices, are thought to reflect use of the constructive matching strategy (see
Laurence et al. 2018).
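To make these three gaze metrics concrete, the following minimal Python sketch computes them for a single test item. It is an illustration under stated assumptions, not the procedure used in the studies cited above: the `Fixation` record, the AOI labels `"matrix"` and `"answers"`, and the example trial are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fixation:
    aoi: str            # hypothetical AOI label: "matrix" or "answers"
    start_ms: float     # fixation onset, relative to item onset
    duration_ms: float  # fixation duration


def toggle_count(fixations: List[Fixation]) -> int:
    """Count gaze transitions between the matrix and the answer choices."""
    return sum(1 for prev, cur in zip(fixations, fixations[1:]) if prev.aoi != cur.aoi)


def latency_to_first_answer(fixations: List[Fixation]) -> Optional[float]:
    """Onset of the first fixation on the answer choices (None if never visited)."""
    for fixation in fixations:
        if fixation.aoi == "answers":
            return fixation.start_ms
    return None


def proportion_time_on_matrix(fixations: List[Fixation]) -> float:
    """Share of total fixation time spent on the matrix."""
    total = sum(f.duration_ms for f in fixations)
    if total == 0:
        return 0.0
    on_matrix = sum(f.duration_ms for f in fixations if f.aoi == "matrix")
    return on_matrix / total


# Hypothetical trial: two matrix fixations, then alternation with the answers.
trial = [
    Fixation("matrix", 0, 300),
    Fixation("matrix", 320, 400),
    Fixation("answers", 740, 200),
    Fixation("matrix", 960, 250),
    Fixation("answers", 1230, 150),
]

toggles = toggle_count(trial)
latency = latency_to_first_answer(trial)
matrix_share = proportion_time_on_matrix(trial)
```

Under the interpretation described above, few toggles, a long latency, and a high share of time on the matrix would all point toward constructive matching.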
Further, algorithms have been developed to classify the scanpaths of different participants based on their similarities (
Hayes et al. 2011;
Kucharský et al. 2020). Through this approach, it is possible to compute transition matrices over the areas of interest (AOIs), i.e., the matrix and the answer choices, and to analyze these matrices to infer which strategy a participant adopted (
Kucharský et al. 2020).
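As an illustration of what a transition matrix over AOIs looks like, the sketch below builds a row-normalized first-order transition matrix from a sequence of AOI labels. This is one simple choice and is not necessarily the representation computed in the cited studies; the function name, the AOI labels, and the example sequence are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def transition_matrix(aoi_sequence: List[str], aois: List[str]) -> Dict[Tuple[str, str], float]:
    """Row-normalized probabilities of moving from one AOI to the next."""
    # Count every consecutive (source, destination) pair in the sequence.
    counts = Counter(zip(aoi_sequence, aoi_sequence[1:]))
    probs: Dict[Tuple[str, str], float] = {}
    for src in aois:
        row_total = sum(counts[(src, dst)] for dst in aois)
        for dst in aois:
            probs[(src, dst)] = counts[(src, dst)] / row_total if row_total else 0.0
    return probs


# Hypothetical sequence of fixated AOIs within one item.
sequence = ["matrix", "matrix", "matrix", "answers", "matrix", "answers", "answers"]
probs = transition_matrix(sequence, ["matrix", "answers"])
```

A participant who studies the matrix at length before consulting the answers would show a high matrix-to-matrix probability, whereas frequent alternation would raise the off-diagonal entries.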
These strategy measures have been shown to predict performance on matrix reasoning tasks: participants with higher accuracy on the task tend to exhibit eye movement patterns consistent with the constructive matching strategy (
Laurence et al. 2018;
Laurence and Macedo 2022;
Vigneau et al. 2006). However, the eye gaze metrics in these studies were related to performance on the same task; we know of no studies relating eye-tracking measures to a Gf score obtained from a test other than the one in which strategy was measured.
Because WM is hypothesized to support the ability to reason about the relations among stimuli, several studies have investigated the relationship between self-reported strategy use and visuospatial WM. The relationship between WM and fluid reasoning is well established: several studies over the last 25 years have demonstrated that the capacity to solve new problems is directly associated with the capacity to keep relevant information in mind and manipulate it (e.g.,
Chuderski 2013;
Conway et al. 2002,
2003;
Dehn 2017;
Engle et al. 1999;
Kaufman 2014). However, research regarding the relationship between WM and cognitive strategies in the matrix reasoning task is much more recent (e.g.,
Gonthier and Roulin 2020;
Jarosz et al. 2019). To assess strategies, most studies have either used questionnaires (e.g.,
Gonthier and Roulin 2020) or verbal protocols (e.g.,
Jarosz et al. 2019). These studies have demonstrated that higher visuospatial WM is related to more frequent use of the constructive matching strategy. Participants appeared to use constructive matching for easy problems and to resort to response elimination for hard ones. Further, visuospatial WM predicted strategy use: participants with higher WM scores maintained the constructive matching strategy even for hard problems (
Gonthier and Roulin 2020). It is noteworthy that, although this relationship between the type of strategy used and visuospatial WM has been reported in several studies, no study has investigated this relationship using eye tracking to quantify strategy use objectively, and on a trial-by-trial basis.
Since WM is related to the retention and manipulation of information, it is a limiting factor for most test takers (
Gonthier and Roulin 2020). Besides visuospatial WM, which is important for keeping relevant visual features in mind, multiple other cognitive processes are thought to be required for performing the matrices test. For example, performing the test requires shifting between focusing on different elements of the matrix item, which can relate to the EF construct of cognitive flexibility (
Birney and Beckmann 2022;
Colzato et al. 2006). Further, performance likely benefits from a systematic approach to the problem (
Cormier et al. 1990), i.e., planning. Strategy use on matrix reasoning can be related to these and other cognitive abilities; however, to the best of our knowledge, no study has examined these relations, especially with the use of eye tracking.
Fixation duration is another eye-movement measure that is related to different cognitive abilities. First, individuals with high WM performance tend to exhibit longer fixations on the areas of interest of a distraction task (
Luke et al. 2018;
Meghanathan et al. 2015). Second,
Hodgson et al. (
2000) found that participants who performed well on a planning task called the Tower of London test showed shorter fixation times. This result indicates that good planners exhibit fast and efficient fixations. To the best of our knowledge, no studies have examined the relationship between eye gaze metrics and performance on a matrix reasoning task in relation to different cognitive abilities.
With these literature gaps in mind, our study aimed to explore the relationship between cognitive abilities and eye-tracking measures related to strategy use in matrix reasoning tasks. To this end, we conducted two studies. In the first study, we measured reasoning on a computerized matrix reasoning task (
Figure 1a) and a paper-and-pencil non-verbal inductive reasoning task (
Figure 1b). In the matrix reasoning task, we recorded the eye movements of the participant and calculated metrics regarding cognitive strategy use. In the second study, we measured planning, visuospatial WM, cognitive flexibility, self-reported EF, and reasoning task performance with a matrix reasoning task.
We set out to test specific hypotheses regarding the relation of the eye gaze metrics on the WMT-2 reasoning task and individual differences in planning, WM, and cognitive flexibility. These hypotheses were tested specifically in Study 2. Our three key hypotheses and predictions, and our analytic approach, were pre-registered at AsPredicted (
https://aspredicted.org/8sp6a.pdf, accessed on 14 April 2023).
First, we hypothesized that participants with better visuospatial WM would be able to retain matrix items and relations in mind more readily than others. Therefore, we predicted that visuospatial WM would correlate with several gaze metrics. In particular, we predicted that the distinguishing characteristic would be the number of gaze transitions between the matrix and the answer choices. On this view, participants with better visuospatial WM would be better able to keep in mind the features of a stimulus that would constitute the correct answer as they transition from the matrix to the answer choices. Additionally, they would be better able to keep in mind an answer choice and check whether it fits, thereby making fewer transitions between the matrix and answer choices.
Second, we hypothesized that participants with superior planning ability would be more likely to adopt a constructive approach on the matrix reasoning eye-tracking task, spending more time observing the matrix problem than the answer choices. Thus, we predicted that planning ability would be associated with a higher proportion of time spent on the matrix vs. the answer choices.
Third, we hypothesized that participants with greater cognitive flexibility would perseverate less on the matrix reasoning task. Specifically, we predicted that the more flexible individuals, the ones with a low number of perseverative errors on the test of cognitive flexibility, would make fewer fixations to the incorrect answer choices on the matrix reasoning task, that is, they would not revisit irrelevant options multiple times.
We also sought to run exploratory analyses. We computed several strategy use metrics in the matrix test based on eye movements. Since we aimed to verify which of these metrics are related to different cognitive abilities, we used a feature selection method. We employed gold-standard methods of machine learning for small samples by using train/test split (see
Vabalas et al. 2019) in order to select eye-movement predictors for reasoning task performance, planning, WM, cognitive flexibility, and self-reported EF in everyday thinking and behavior. All measures, with the exception of the self-reported EF, are lab-based and objective measures, while the self-reported EF is a real-world, subjective measure. Therefore, it is noteworthy that in this exploratory analysis we tested the possibility that objective eye gaze metrics on an abstract reasoning task would be related to a subjective measure of real-world self-regulation. Furthermore, although we used methods for small samples, the samples in Studies 1 and 2 are larger than those of most studies that investigated matrix reasoning strategies with eye tracking (see
Laurence and Macedo 2022 for a list of sample sizes).
It should be emphasized that the feature selection method selected variables algorithmically, and that this selection is conditioned on the data. Although this type of feature selection can be used in datasets with many more features than observations, the results should be interpreted with caution because the procedure can introduce bias (
Heinze et al. 2018); therefore, these exploratory analyses should be read critically.
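The logic of selecting features on a training split and evaluating them on held-out data can be sketched as follows. This is a simplified illustration and not the selection algorithm used in our analyses: the correlation filter, the single-predictor regression, the synthetic data, and all function names are assumptions introduced for the example.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either variable is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def train_test_indices(n, test_fraction, seed):
    """Shuffle row indices reproducibly and split them into train and test."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(n * test_fraction))
    return idx[n_test:], idx[:n_test]


def select_best_feature(X, y, rows):
    """Pick the feature with the largest |r|, using the training rows only."""
    ys = [y[i] for i in rows]
    return max(range(len(X[0])), key=lambda j: abs(pearson_r([X[i][j] for i in rows], ys)))


def fit_line(x, y):
    """Ordinary least squares slope and intercept for one predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot; negative when predictions are worse than the mean."""
    m = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Synthetic data (hypothetical): feature 0 drives the outcome, the rest are noise.
rng = random.Random(0)
n = 40
X = [[rng.random(), rng.random(), rng.random()] for _ in range(n)]
y = [2.0 * row[0] + rng.gauss(0.0, 0.05) for row in X]

train, test = train_test_indices(n, test_fraction=0.3, seed=1)
best = select_best_feature(X, y, train)  # selection never sees the test rows
slope, intercept = fit_line([X[i][best] for i in train], [y[i] for i in train])
pred = [slope * X[i][best] + intercept for i in test]
test_r2 = r_squared([y[i] for i in test], pred)
```

Because the test rows play no part in selection or fitting, the held-out R² is an honest check on the selected predictors; it can fall below zero when the model predicts worse than the test-set mean.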
4. General Discussion
Our study aimed to explore the relationship between cognitive abilities and eye-tracking metrics related to strategy use in matrix reasoning tasks. Our preregistered analyses revealed relationships between several eye-tracking metrics and different cognitive abilities. We predicted that the ratio of time spent on the matrix vs. the answer choices would be more strongly related to planning than to the other cognitive measures (Hy1). The p-value for this association was below 0.1, which did not reach statistical significance but points to the possibility that planning is reflected in more time spent gazing at the matrix. We also predicted that fewer gaze transitions between the matrix and the answer choices would be related to higher visuospatial WM scores (Hy2); this prediction was confirmed. Lastly, we predicted that fewer perseverative errors on the WCST would be related to fewer revisits to incorrect answer choices (Hy3); this hypothesis was not confirmed. Steiger’s test showed no statistically significant differences between the correlations for the different cognitive measures. In summary, Hy2 was supported, Hy1 received only tentative support, and Hy3 was not supported.
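For readers unfamiliar with comparing dependent correlations, the sketch below implements one common form of Steiger's (1980) Z for two correlations that share a variable (here, a gaze metric correlated with two different cognitive measures in the same sample). We present it as an illustration only: the exact variant used in the analyses is an assumption, and the input values are hypothetical, not results from our data.

```python
import math


def steiger_z(r12: float, r13: float, r23: float, n: int):
    """Compare dependent correlations r12 and r13 that share variable 1.

    r23 is the correlation between the two non-shared variables and n is the
    sample size. Returns the Z statistic and a two-sided p-value.
    """
    z12 = math.atanh(r12)  # Fisher z-transform of each correlation
    z13 = math.atanh(r13)
    r_bar = (r12 + r13) / 2  # pooled estimate of the two correlations
    psi = r23 * (1 - 2 * r_bar**2) - 0.5 * r_bar**2 * (1 - 2 * r_bar**2 - r23**2)
    cov = psi / (1 - r_bar**2) ** 2  # covariance of the two Fisher z values
    z = (z12 - z13) * math.sqrt(n - 3) / math.sqrt(2 - 2 * cov)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p


# Hypothetical inputs: gaze metric vs. planning (r12), gaze metric vs. WM (r13),
# and planning vs. WM (r23), in a sample of 50 participants.
z_stat, p_value = steiger_z(r12=0.40, r13=0.20, r23=0.30, n=50)
```

A non-significant result from such a test means that the data cannot distinguish the strength of the two associations, even if one of them is individually significant and the other is not.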
We also conducted exploratory analyses to investigate the relationship of the eye gaze metrics with the cognitive tests. The test that was most similar to the eye-tracking task (another visuospatial reasoning task) was the one best predicted by eye gaze metrics. The results in Study 1 showed a strong relationship between several eye gaze metrics and the D.70 score. These variables predicted 57% of the variation in the D.70 test. These results are consistent with previous literature (e.g.,
Hayes et al. 2011;
Laurence et al. 2018;
Vigneau et al. 2006), in which the eye gaze metrics predicted the participant score in the same reasoning task. With our results, it seems that the eye gaze metrics in a reasoning task can also predict the participant score in another reasoning task. Therefore, it is possible that the eye gaze metrics have a relationship with the reasoning ability of a participant.
Eye gaze metrics were also a moderate predictor of planning scores (16%). Higher scores on the Tower of London, a cognitive measure of planning, were related to a longer latency to the first fixation on the answer choices, a higher ratio of time spent on the matrix vs. the answer choices, and a shorter mean fixation duration on the matrix. We reasoned that participants with better planning abilities would first try to solve the problem on the matrix and only then go to the answer choices, a pattern suggestive of a constructive matching strategy (Hy1). With this in mind, the selected eye gaze metrics were all related to constructive matching: a long latency to the first fixation on the answer choices indicates that participants scanned the matrix before going to the answers, and a higher ratio of time indicates that participants spent more time fixating on the matrix than on the answer choices. Shorter fixations were also related to planning.
Hodgson et al. (
2000) demonstrated that participants who showed better performance in the Tower of London test were the ones with lower fixation times, suggesting that good planning is linked to fast and efficient fixations. However, the eye-tracking metrics were not good predictors for planning time, indicating that their time and efficiency are mediated by other variables.
Further, higher scores on the visuospatial WM task (Corsi block-tapping) were related to fewer visits to wrong answer choices and to the percent of trials classified as cluster 2 scanpath (18%). While not predicted a priori, this result suggests that participants with higher WM are better able to remember elements of the problem, and therefore visit the wrong answer choices fewer times. We predicted higher WM scores to be related to fewer matrix–answer transitions (Hy2), based on previous work demonstrating that individuals with higher WM used the constructive matching strategy more often (
Gonthier and Roulin 2020). However, the percent of trials classified as cluster 2 scanpath was a negative predictor, indicating that participants who relied more on the eye gaze strategy closely resembling constructive matching were also more likely to have high visuospatial WM scores. Additionally, we predicted that visits to wrong answer choices would be related to perseverative errors on the WCST (Hy3); however, this hypothesis was not borne out either. A possible explanation is that our sample had relatively small variation in the WCST perseverative error measure (
Figure 4). Perhaps in a more diverse sample than university students, variation in perseverative errors would be higher and it would be possible to observe a relationship.
It is interesting to note that the rate of matrix–answer transitions was the best predictor in several studies predicting the score on the same matrix reasoning task in which eye gaze was recorded (e.g.,
Hayes et al. 2011;
Laurence et al. 2018;
Vigneau et al. 2006). We found similar results when predicting the score of another reasoning task, the D.70 test. However, this variable was not selected in any model predicting the scores of the other cognitive tests, which assessed visuospatial working memory, planning, and cognitive flexibility. The lack of a relationship between the rate of matrix–answer transitions and cognitive measures other than reasoning indicates that this metric is closely tied to reasoning, but not to other cognitive abilities. It is hard to understand precisely why this metric is related to reasoning ability, but it seems to be a reliable predictor of reasoning, even when predicting reasoning in a different task.
No predictor was found for the BRIEF-A general executive composite and metacognitive index. However, we were able to find predictors for the BRIEF-A behavior regulation index (BRI). In this case, the ratio of time spent on the matrix vs. the answer choices, the number of fixations on the matrix, the mean fixation duration on the answer choices, and the average time on each test item were selected as predictors. However, in the test split, we found a negative R² (−3%), indicating that these measures do not reliably predict the BRIEF-A BRI. These results indicate that self-reported daily EFs are not related to the eye gaze metrics in the reasoning task. Therefore, these real-world, subjective measures are not predicted by the lab-based, objective measures.
Regarding the clustering of eye gaze in the matrix reasoning task, we chose, based on previous literature (e.g.,
Kucharský et al. 2020;
Vigneau et al. 2006), to use two centroids. This number of centroids was selected based on the constructive matching and response elimination strategies. In both studies, we found a cluster that is very similar to the constructive matching pattern and another cluster that had a row-and-column-wise scan, with more transitions to the answer choices. This pattern was also reported by
Kucharský et al. (
2020), indicating that this pattern can be seen across studies that cluster with two centroids. We had expected the second pattern to resemble the response elimination strategy more closely; instead, we found an eye gaze pattern that is neither pure response elimination nor constructive matching, although it shares some similarities with the latter. Previous studies demonstrated that the pure response elimination strategy is very rare, with participants performing it in less than 3% of the trials (
Jarosz et al. 2019). It was more common for test takers to execute a hybrid strategy. Therefore, our second cluster may reflect this: a strategy that has elements of the response elimination strategy (i.e., more transitions to the answer choices), but that also follows elements of the constructive matching strategy (i.e., the row-and-column-wise scan).
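To illustrate what clustering with two centroids involves, the sketch below runs a plain k-means procedure on per-trial gaze features. It is a generic illustration and not the scanpath clustering pipeline used in our studies: the two features (transition rate and proportion of time on the matrix), the data points, and the initial centroids are all hypothetical.

```python
from math import dist
from statistics import mean


def k_means(points, centroids, n_iter=50):
    """Plain k-means: assign each point to its nearest centroid, then update."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: index of the closest centroid for every point.
        labels = [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            updated.append(tuple(mean(col) for col in zip(*members)) if members else old)
        if updated == centroids:  # converged: centroids no longer move
            break
        centroids = updated
    return labels, centroids


# Hypothetical trials: (transition rate, proportion of time on the matrix).
trials = [
    (0.10, 0.80), (0.12, 0.78), (0.15, 0.82),  # few toggles, long matrix study
    (0.45, 0.55), (0.50, 0.50), (0.48, 0.52),  # many toggles, less matrix study
]
labels, centers = k_means(trials, centroids=[trials[0], trials[-1]])
```

With two centroids the algorithm always returns two groups, so the resulting clusters need not correspond to two pure strategies; a heterogeneous or hybrid pattern can be absorbed into one of them.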
Differences found in the Bayesian post hoc comparisons of the clusters indicated that, in Study 1, the constructive matching (cluster 1) scanpath had longer fixations than cluster 2, while the inverse was true in Study 2. However, it is noteworthy that the biggest difference between the studies was in the second cluster. This indicates that the first cluster varied little while the second cluster was more heterogeneous. These result patterns indicate that the second cluster can present distinct patterns since it is a more chaotic strategy, similar to the strategy described by
Jarosz et al. (
2019). It is noteworthy that differences in the cluster post hoc analysis can be due to the clustering itself. That is, because the clusters are generated from different eye gaze features, these differences can also be related to the fixations and the other eye gaze metrics. Taking this into account, these results are provisional.
The present study has broader implications for understanding the relationship between eye movements in matrix reasoning tasks and cognitive abilities. It demonstrates that the cognitive strategies in matrix reasoning tasks, measured by eye tracking, have a moderate relation with the cognitive abilities of planning and WM. This raises the question of how cognitive interventions targeting WM or planning might affect eye gaze on the matrix test. Additionally, studies teaching participants how to process the matrix efficiently may produce cognitive gains. New studies seeking to answer these questions are needed.
There are several limitations of this study to consider. First, we had two different samples with different cognitive measures instead of one sample with all the cognitive measures. Second, both samples consisted of university students, which limits the generalizability of the results. Future studies should focus on diverse samples. Furthermore, our sample size was relatively small, and our results should be interpreted with caution. Due to the sample size, both of our studies are underpowered (statistical power of 67% in Study 1 and 74% in Study 2), although both studies have larger samples than most eye-tracking studies in this type of research (see
Laurence and Macedo 2022). New studies should focus on larger samples.