1. Introduction
Research into reasoning processes has been dominated by the dual-processing approach and the various theories it has spawned over the decades. Traditionally, System 1 or Type 1 processes were considered quick, automatic, cognitively undemanding, and based on intuitions. On the other hand, System 2 or Type 2 processes were considered slow, deliberate, cognitively demanding, and based on analytical thinking (
Evans 2012). However, developments in the past two decades have led to the conclusion that responses previously ascribed to the slow System 2 are indeed available to System 1. Key findings leading to these conclusions come from research on reasoning under cognitive load and/or time pressure and research with the rethinking paradigm. When participants are forced to reason under cognitive load and time pressure, the frequency of analytical responses does decrease, but only slightly, meaning that both intuitive and analytical responses are still being generated (
Bago and De Neys 2017;
Lawson et al. 2020). Similarly, when participants give quick initial responses and are then given the opportunity to rethink their choices, the vast majority of analytical responses after rethinking are not a result of switching from an intuitive response (
Thompson et al. 2011;
Dujmović et al. 2021). Rather, analytical responses given after rethinking had typically already been given as the initial response. These and other findings have driven a continuous development of models within the dual-process approach that take into account the fact that System 1 generates various types of responses, presumably accommodating different types of quick processes. An early idea about multiple intuitive processes was introduced by
Glöckner and Witteman (
2010), while
De Neys (
2012) refers to logical intuitions, which are part of System 1 and cue analytically correct responses in reasoning tasks. Logical intuitions include mathematical, probabilistic, and formal-logical intuitions, among others not traditionally considered to be intuitions. The more traditionally studied intuitions, such as biases stemming from representativeness, matching, and other heuristics, will be referred to as heuristic intuitions.
Based on these more recent findings, a common framework is emerging, and the proposed working model is depicted in
Figure 1. Intuitive processes generate initial intuitive responses (IRs). Depending on task demands and previous experience, one or more of these responses may be generated. Most tasks in the field are designed to pit a heuristic response against an analytical response. For example, take a simplified version of the Base Rate Neglect task (
De Neys and Glumicic 2008;
Dujmović and Valerjev 2018). Participants are told that a random person is chosen from a group of 1000 people. The random person is very tall. The group consists of 10 basketball players and 990 mathematicians. The task is to pick whether it is more probable that the randomly chosen person is a basketball player or a mathematician. In this example, one intuitive response is based on height being extremely typical of basketball players compared to mathematicians. Thus, it indicates that the randomly chosen person is most likely a basketball player. The second response is based on the information indicating there are many more mathematicians in the group, and thus indicates that the randomly chosen person is a mathematician.
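For illustration, a simple Bayesian calculation shows why the base rates dominate in this example; the likelihood values below are assumed purely for the illustration and are not taken from the study materials. If P(tall | basketball player) = 0.95 and P(tall | mathematician) = 0.05, Bayes' rule gives

P(basketball player | tall) = (0.95 × 0.01) / (0.95 × 0.01 + 0.05 × 0.99) ≈ 0.16,

so even with a highly diagnostic height cue, the base rates make "mathematician" the more probable response.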
The two initial responses in this example are conflicting, but in other versions of the task the responses can congruently point towards the same option. The strength of these initial intuitive responses depends on a number of factors.
Stanovich (
2018) refers to
mindware as one determining factor. Mindware refers to knowledge and experience of rules, procedures, and strategies that help detect and override intuitive, miserly, but incorrect responses. In traditional heuristics-and-biases tasks, the mindware necessary to perform well is usually tied to probabilistic, causal, and scientific reasoning as well as numeracy. The heuristic that basketball players are very tall has been built up through experience and many examples over a lifetime. The same can be said about relying on the high proportion of mathematicians in the group. Knowledge of and experience with various quantities, ratios, and probabilities build up mindware that can use that type of information intuitively. The strength of each intuitive response will therefore depend on the instantiation of the relevant mindware. The average person might have much stronger mindware generating the heuristic response (basketball player), compared to a professional statistician who may have built up strong mindware for handling probabilities, which favors what is deemed to be the analytical response (mathematician).
De Neys (
2022) further speculates that the strength of these intuitions may vary over time. Therefore, the relative strength of intuitive responses may not simply be a function of peak magnitude but also of the point in time at which the system is prompted to give a response. In our previous work, we have shown that the relative strength of responses can be influenced via instruction (
Valerjev and Dujmović 2017) as well as a change in the modality of presentation (
Dujmović and Valerjev 2017), implying that attention and attentional resources play a mediating role between mindware and response strength. When the two responses are congruent, no further deliberation or activation of System 2 is required since there is no uncertainty, and a final response is given. If the responses are in conflict, a monitoring process may trigger System 2 activation. It is not clear how exactly conflict or the resulting uncertainty is computed, but many studies have shown that the presence of conflict increases the likelihood of System 2 activation (
Dujmović and Valerjev 2018;
Pennycook et al. 2014,
2015;
Thompson and Johnson 2014). When the level of uncertainty/conflict is high enough to be detected, System 2 is activated to resolve the situation. This may include rationalization of one of the responses, considering additional evidence for one or both responses, or attempting to generate a novel response. If the level of uncertainty does not trigger System 2, the final response is simply the stronger of the two initial responses.
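As a rough summary of this working model, the sketch below expresses the decision flow in Python. It is only an illustration of the verbal model described above and in Figure 1; the strength values, the uncertainty measure (the difference between the two strongest responses), and the detection threshold are arbitrary assumptions rather than parameters estimated in this study.

# A minimal, illustrative sketch of the working model (Figure 1).
# Strengths, the uncertainty measure, and the threshold are arbitrary
# assumptions used only to make the verbal model concrete.

def final_response(intuitive_responses, detection_threshold=0.15):
    """intuitive_responses maps each cued option to the strength (0-1)
    of the intuitive response favoring it."""
    ranked = sorted(intuitive_responses.items(), key=lambda kv: kv[1], reverse=True)

    # Congruent case: all intuitions cue the same option, so there is no
    # uncertainty and the final response is given without System 2.
    if len(ranked) == 1:
        return ranked[0][0], "System 1 only"

    (dominant, s1), (_, s2) = ranked[0], ranked[1]

    # Conflict case: uncertainty is high when the two strengths are similar.
    if s1 - s2 < detection_threshold:
        # Conflict detected: System 2 is engaged to resolve it, e.g., by
        # rationalizing one response, weighing further evidence, or
        # generating a novel response.
        return dominant, "System 2 engaged"

    # Conflict present but not detected: the stronger intuition wins.
    return dominant, "System 1 only"


# Example: a conflict version of the simplified Base Rate task in which the
# typicality-based intuition is somewhat stronger than the base-rate-based one.
print(final_response({"basketball player": 0.80, "mathematician": 0.55}))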
In recent years, more researchers have recognized the importance of metacognitive processes during reasoning. Indeed, the processes that detect conflict and/or compute uncertainty could be viewed as metacognitive monitoring. This has led to the emergence of meta-reasoning (
Ackerman and Thompson 2015,
2017), in which measures of metacognitive judgments such as perceived difficulty, solvability, and confidence accompany the usual measures of accuracy (usually defined as the proportion of analytical responses) and response time. In fact, metacognitive measures have proven to be very useful and sensitive to experimental manipulations even when accuracy and response times are not. For example, if all conditions in a task are completed at a very high level of accuracy, there will be little or no difference among them on that measure, but confidence judgments may still be affected by the experimental manipulation. Standard findings show that participants are less confident in conflict versions of reasoning tasks than in congruent versions (
Dujmović and Valerjev 2018;
Dujmović et al. 2021;
Mevel et al. 2015;
Shynkaruk and Thompson 2006). Accuracy is often not a good predictor of confidence (
Bajšanski et al. 2014;
Markovits et al. 2015;
Thompson et al. 2011), while response time as a proxy for response fluency oftentimes is (
Ackerman and Zalmanov 2012;
Dujmović and Valerjev 2018;
Thompson et al. 2013;
Thompson and Johnson 2014).
While the introduction of metacognitive measures to reasoning tasks has increased our understanding of both metacognitive and reasoning processes, it is becoming increasingly difficult to probe deeper into these processes using current research paradigms. The tasks that have been used in reasoning research trace their roots to
Tversky and Kahneman (
1982,
1983) and their conjunction fallacy (Linda problem) task. The task describes Linda, an outspoken, former philosophy student who was concerned with issues of discrimination and social injustice and participated in a number of demonstrations. Participants are then asked whether, years later, it is more probable that A—Linda is a bank teller or that B—Linda is a bank teller
and active in the feminist movement. In many different variations of this task, Tversky and Kahneman found that the vast majority of participants judged the conjunction of two statements (bank teller
and active feminist) as more likely than one of the two constituent statements (bank teller). This is a fallacy: the conjunction of two events can never be more probable than either of its constituents, since P(A ∧ B) ≤ min{P(A), P(B)}. The fallacy is committed because the description of Linda is more representative of someone who is to become active in the feminist movement, and thus the conjunction is deemed more probable than the single statement even though it is, in fact, less probable (for a thorough investigation of the conjunction fallacy from a modern meta-reasoning perspective, see
Dujmović et al. 2021). Considering how prevalent the fallacy was and how difficult it was for participants to give the correct response, it naturally served as evidence for a dual-process view of reasoning. The representativeness heuristic provided one response (the conjunction), and correct probability assessment provided the other. This type of design then became prevalent in practically all reasoning tasks of note. This includes the various versions of the Base Rate Neglect task (
De Neys and Glumicic 2008;
Dujmović and Valerjev 2018), the Cognitive Reflection Test (
Frederick 2005), the Covariation Detection task (
Stanovich et al. 2016;
Valerjev and Dujmović 2019), and syllogisms (
Shynkaruk and Thompson 2006). In all of these tasks, the design pits one type of process (e.g., representativeness or believability) against another (e.g., probability computation or formal logic) in conflict versions or has the two processes aim at the same response in congruent versions. While these tasks have produced many important findings, there are inherent limitations in how they have been used. Namely, the tasks always pit qualitatively different processes of unknown strengths against each other. One can vary the group sizes in a base rate task (
Dujmović and Valerjev 2018), but the strength of the heuristic response is almost always unknown or only roughly estimated as more or less extreme. Likewise, one can vary the believability when using syllogisms (
Shynkaruk and Thompson 2006). However, with any of these manipulations, the relative strengths of the two processes are unknown. Indeed, the fact that the mismatch is quite extreme in most cases has perpetuated what
De Neys (
2022) deems an unfounded assumption of exclusivity (that some responses are only available to System 2). The mismatch can be seen as a severe difference in mindware. In the Linda problem, representativeness as a heuristic is well developed for the vast majority of people, while properly combining probabilities or understanding the rules of formal logic is tied to much weaker mindware for most. Even if one could match (via pre-studies) the strength of one process with that of another, these processes are still qualitatively different, and it is unknown whether a match in peak estimated strength means a match when embedded in a reasoning task. Without more stringent matching, subtle manipulations are more difficult, as is testing more nuanced aspects of dual-process models.
The aims of this study were twofold. First, we aimed to develop materials for reasoning tasks that focus on only one type of heuristic process, with good measurement of relative heuristic strength. To accomplish this, we started from the simplified Base Rate task, but instead of pitting a heuristic intuitive process against a logical intuitive process, we used the same type of intuitive heuristic process to cue both responses. In comparison to the example given above, instead of stating that a randomly chosen, very tall person came from a group of 10 basketball players and 990 mathematicians, we would state that a randomly chosen person is very tall and very intelligent and was selected from a group which is half basketball players and half mathematicians. In this version, being very tall is indicative of being a basketball player, while being very intelligent is indicative of being a mathematician (for details on all conditions and materials, see the Methods section of Experiment 1). Second, we aimed to test whether standard findings concerning the influence of conflict detection and resolution hold for these tasks, in which the same type of intuitive process cues different responses. Our pre-study results (see Methods) made fine control of the level of conflict/congruence possible. This allows for comparisons of congruent and conflict conditions in which the dominance of the final response is held constant, making an evaluation of the influence of conflict easier. Given previous research, we hypothesized that conflict would manifest through longer reasoning times, lower accuracy (measured as a lower proportion of responses in accordance with the dominant heuristic), and lower confidence. It was further hypothesized that the level of congruence/conflict would influence reasoning speed, accuracy, and confidence: higher congruence should result in an increase in speed, accuracy, and confidence, while higher conflict should result in a decrease in all three dependent variables.
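To see why the even split neutralizes base-rate information, consider a normative combination of the two cues under the assumption that they are conditionally independent; this is only one possible formalization and is not necessarily the computation used to construct the materials (see Methods). With P(basketball player) = P(mathematician) = 0.5, the posterior odds reduce to a ratio of cue likelihoods:

P(player | tall, intelligent) / P(mathematician | tall, intelligent) = [P(tall | player) × P(intelligent | player)] / [P(tall | mathematician) × P(intelligent | mathematician)],

so which response dominates, and by how much, is determined entirely by the relative typicality of the two features, that is, by the same kind of heuristic information cueing both responses.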
4. General Discussion
We set out to test whether conflict detection and resolution retained a strong influence on meta-reasoning when the responses pitted against each other were cued by the same type of heuristic process rather than by two vastly different processes. Thus far, the influence of conflict in reasoning has always been tested by inducing processes relying on different types of mindware. Further, the strength of the responses cued by these processes is not strongly controlled or calibrated. Usually, the only manipulation is whether a conflict is present or not; at most, the level of conflict is lightly manipulated when it is simple to do so (e.g., ratios in the Base Rate task as in
Dujmović and Valerjev 2018). Therefore, it is difficult to disambiguate the influence of the presence and magnitude of the conflict itself from other differences between the processes cueing these responses. The key finding from Experiment 1 was that when nominal strengths of the prevailing responses were matched in congruent and conflict conditions, there was no difference in accuracy, reasoning times, or confidence. These findings are in sharp contrast with most research within the modern dual-process framework (
Dujmović and Valerjev 2018;
Dujmović et al. 2021;
Mevel et al. 2015;
Shynkaruk and Thompson 2006). Indeed, the most parsimonious explanation of the results was a single-process account in which one heuristic process simply computes all of the information in the task and cues a response with some level of certainty. However, as pointed out in the discussion of Experiment 1, the presence or absence of conflict was not the only difference between the two key conditions. The strength of the dominant heuristic response also differed, which is unavoidable given the nature of conflict versus congruent tasks if one aims to match the nominal strength of the final responses. This may have masked the impact of conflict in Experiment 1. To account for this, Experiment 2 was designed to test whether an increased nominal strength level would have more of an impact on congruent conditions than on conflict conditions. We hypothesized that an increase in heuristic response strength would, even if the two conditions remained unbalanced in that regard, unmask the effect of conflict if there was one. Results indeed revealed that increased strength had a greater impact in the congruent condition. Accuracy increased equally across strength levels irrespective of congruence, but analyses of the remaining two dependent variables revealed strong interaction effects. Participants reasoned significantly faster in the high-strength congruent condition than in the low-strength congruent condition, while the difference between the two conflict conditions was not significant. Further, confidence increased with strength level in both the congruent and conflict conditions, but the increase was significantly larger in the congruent condition. The results indicate that even when a single type of heuristic process is responsible for cueing multiple responses, there is an effect of conflict. This provides further evidence that two heuristic processes cue the responses; conflict between them can then be detected and does influence meta-reasoning.
The results revealed an interesting modulating effect of dominant response strength on the impact of conflict in our tasks. A higher strength of the dominant response can mitigate the negative impact conflict has on reasoning times and confidence levels. This may be due to two factors that are not mutually exclusive. First, conflict may be less likely to be detected when one of the responses is very strong. Second, the conflict may be easier to resolve when it is detected. It is likely that this effect is bounded and non-linear. In Experiment 2, the high-strength conflict condition had a higher final response strength (which can be seen as lower conflict), a higher heuristic strength of the dominant response, and a lower heuristic strength of the subordinate heuristic response than the low-strength conflict condition. However, these conditions did not differ in reasoning times. Additionally, the increase in confidence was much smaller than in the congruent conditions, despite both heuristic responses in the high-strength congruent condition being much weaker than the dominant response in the high-strength conflict condition.
The findings presented here provide a promising foundation, as the approach taken allows for more control and a more detailed analysis of the targeted processes. As far as we are aware, this is the first study to investigate the processing that gives rise to System 1 responses rather than just the outcomes of those processes. In the future, materials can be generated using a similar approach in order to match the strength of qualitatively different heuristic processes and explore differences due to other factors. The recent review of dual-process theory by
De Neys (
2022) poses questions for future research, many of which we feel our approach is uniquely positioned to explore. That being said, there are clear limitations to this approach. There is a disconnect between making independent typicality judgments and solving reasoning tasks built from combinations of those judgments. Indeed, additional factors that arise during reasoning led to the Experiment 1 findings fitting a misleading single-process account. Furthermore, it is entirely unknown whether computations similar to the ones used here to generate experimental conditions mirror anything happening in the brain during reasoning. If they do, they are likely an oversimplification of the actual computations. However, the behavioral results presented here match expectations based on our computations rather well (e.g., participants essentially guessing in the high-conflict condition of Experiment 1), and this approach may be an avenue to better explore the dynamics of reasoning in the future.