Limitation of the Present Study
The DIF analysis revealed gender differences that were not always statistically significant. This could be a limitation of the present study, because the results presented in this paper cannot be generalised to the entire student population. Nonetheless, it is worth noting that the Rasch model is based on the assumption that the probability of answering an item correctly depends only on students’ relative ability, i.e., their ability compared with item difficulty, and that no other variables (e.g., students’ individual features) can affect it. For this reason, test items are constructed by INVALSI to be DIF-free, even though a moderate item misfit need not be interpreted as a limitation (of the test or even of the choice of the model) but can instead be read as a potential source of information (as recently argued in [57]). Similarly, the materials we developed for the purposes of the present research were constructed to be DIF-free, with just a few exceptions aimed at testing specific hypotheses about how gender interplays with item characteristics: only three items were constructed to explore gender differences (D9, D15, and D16, aimed at exploring misconceptions or the effect of the item’s context, i.e., real or mathematical, on students’ solving strategy). The absence of a statistically significant DIF is thus an unavoidable and inherent consequence of the tests’ construction process.
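For reference, the assumption recalled above is usually written as follows for dichotomously scored items. The notation is assumed here for illustration and is not taken from the paper: X_{ni} is the scored answer of student n to item i, \theta_n is the student's ability and b_i is the item's difficulty, both in logits.

```latex
P(X_{ni} = 1 \mid \theta_n, b_i) = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)}
```

Under this formulation, an item is DIF-free when the same b_i holds for every subgroup, so that two students with equal \theta_n have the same probability of success regardless of gender.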
Moreover, even though our analysis revealed some differences in students’ answers to the item in F1 and F2, and to the item in F3 and F4, it is worth noting that the results for F1 and F2 are consistent with each other, as are those for F3 and F4, thus supporting our interpretation of the different effect of the misconception analysed in this paper on girls’ and boys’ answers.
Results presented in this paper showed that traditional psychometric tools, and in particular the graphical inspection of the ICCs and of distractor plots, are extremely valuable in exploring in-group differences, since all the graphs compare students matched on ability. Moreover, in this research, such graphs were constructed after the mathematics achievement tests had been equated, thus making students’ answers directly comparable. Working within the framework of Rasch analysis is an added value of the present study: the equating strategy performed here ensures the comparability of students’ answers across mathematics achievement tests and across sub-groups of students (however they are defined), thus offering a methodological approach that can also be used to pursue other research goals.
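The idea of comparing students matched on ability can be illustrated with a minimal sketch of an empirical ICC computed separately by group. This is not the software used in the study: the function name, the equal-width binning and the number of bins are assumptions made only for illustration.

```python
from collections import defaultdict

def empirical_icc(abilities, responses, groups, n_bins=4):
    """Proportion of correct answers per ability bin, separately for each group.

    abilities: ability estimates in logits, one per student (hypothetical input)
    responses: 1 if the student answered the item correctly, else 0
    groups:    group label per student, e.g. "F" or "M"
    Returns a dict mapping (group, bin_index) to the proportion of correct answers.
    """
    lo, hi = min(abilities), max(abilities)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all abilities are equal
    counts = defaultdict(lambda: [0, 0])  # (group, bin) -> [n_correct, n_students]
    for theta, x, g in zip(abilities, responses, groups):
        b = min(int((theta - lo) / width), n_bins - 1)  # highest ability goes in last bin
        counts[(g, b)][0] += x
        counts[(g, b)][1] += 1
    return {key: correct / total for key, (correct, total) in counts.items()}
```

Plotting the resulting proportions against the bin index, one line per group, gives the kind of matched comparison described above. Passing an indicator of whether a given distractor was chosen, instead of the correct/incorrect score, yields the corresponding distractor plot.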
Our analyses showed a different effect of a specific misconception (related to multiplication with decimal numbers) on boys’ and girls’ answering behaviour. The misconception investigated here was already studied from a qualitative point of view by D’Amore and Sbaragli [35]. Consistent with Sbaragli [15], our results showed that girls’ difficulty in multiplying decimal numbers is due to the misconception, as also confirmed by distractor D, which is strictly related to the misconception and is more attractive for girls than for boys.
Inverting the multiplication factors misleads students, with a stronger influence on girls than on boys. Moreover, compared to previous studies about students’ misconceptions in multiplying decimal numbers, the use of the Rasch model adds some advantages to the investigation of this topic. Firstly, if DIF is detected, the results can be interpreted in terms of which items are easier or harder to solve for which group [47]. This offers interesting elements to enrich the debate from a didactic point of view: previous studies carried out in Italy have shown that girls are more influenced than boys by didactic practices, classroom routines, and the teacher–student relationship, and that this makes them more prone to the (mis)leading effect of misconceptions and of the didactic contract [16,38]. Moreover, such strong differences between boys and girls at grade 8 in Italy are quite unusual: as systematically reported by INVALSI in its national annual reports, gender differences increase over time, from primary to secondary school, but at grade 8 they tend to be close to zero (e.g., [44]). Understanding such a result deserves further investigation, which is beyond the scope of this study.
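The interpretation of DIF in terms of which group finds an item easier or harder can be sketched as a simple contrast between group-specific difficulty estimates. The function below is an illustrative assumption, not the procedure reported in this paper; its inputs are hypothetical difficulty estimates (in logits) and standard errors obtained by calibrating the same item separately for each group.

```python
import math

def dif_contrast(b_girls, se_girls, b_boys, se_boys):
    """Return (contrast, t) for a single item.

    contrast > 0: the item is harder for girls than for boys of equal ability.
    contrast < 0: the item is harder for boys than for girls of equal ability.
    t is the contrast divided by the joint standard error of the two estimates.
    """
    contrast = b_girls - b_boys
    joint_se = math.sqrt(se_girls ** 2 + se_boys ** 2)
    return contrast, contrast / joint_se
```

For example, with hypothetical estimates of 0.8 logits for girls and 0.2 logits for boys, the contrast is 0.6 logits in the direction of the item being harder for girls. Whether such a contrast counts as relevant depends on the thresholds adopted by the analyst.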
Results presented in this paper help us to explain why and how the exploration of the gender gap at item level, rather than across the entire test, can contribute further information to the current debate about gender differences. In this direction, for example, Leder and Lubienski [14] (p. 35) stated that:
Item-level analyses can pinpoint the mathematics that students do and do not know, including which problems most students can and cannot solve, and which problems have the largest disparities between groups. This information can inform both textbook writers and teachers, as they strive to address curricular areas in need of additional attention. Hence, it is important for item-level analyses to be systematically conducted and reported.
In this paper, we combined traditional psychometric tools with the theoretical lens of mathematics education to test specific hypotheses about students’ problem-solving strategies. This comparison, based on a large probability sample consisting of 1647 students attending grade 8, was made on the analysis of students’ answers to four anchored mathematics tests developed for the specific purposes of the present study. A common-item non-equivalent group design was employed to collect data, and all forms were equated to make the answers of the different subgroups of students comparable: “When using this design, the difference between group level performance on the two forms is taken as a direct indication of the difference in difficulty between the forms” [39] (p. 13).
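How common items place two forms on the same scale can be illustrated with a minimal sketch of mean-mean linking, one simple option for Rasch-calibrated forms. The text does not state which equating procedure was applied, so this is an assumption chosen for illustration; the function names and item labels are hypothetical.

```python
def mean_mean_shift(anchor_ref, anchor_new):
    """Constant that moves difficulties of the new form onto the reference scale.

    anchor_ref, anchor_new: dicts mapping item label -> difficulty (logits),
    as estimated separately on the reference form and on the new form.
    Only items present in both dicts (the common items) are used.
    """
    common = anchor_ref.keys() & anchor_new.keys()
    if not common:
        raise ValueError("the two forms share no anchor items")
    return sum(anchor_ref[i] - anchor_new[i] for i in common) / len(common)

def link_form(difficulties_new, shift):
    """Apply the linking constant to every item difficulty of the new form."""
    return {item: b + shift for item, b in difficulties_new.items()}
```

Because the anchor items are identical in both forms, any difference in their average estimated difficulty is attributed to the different scale origins of the two calibrations, and the same constant is then applied to the remaining items of the new form.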
The combination of traditional psychometric tools with the theoretical lens of mathematics education, an unprecedented strategy for the exploration of gender differences, adds real value to the current debate about gender differences because it provides critical information about boys’ and girls’ performances and hence suggests research paths about their problem-solving strategies. Gender differences emerge on specific mathematics content, and these results are consistent with the current literature on gender differences in mathematics: many studies highlight that differences between boys and girls can be explained by a different use of learning and problem-solving strategies rather than by differences in cognitive abilities. If we consider problem-solving activities in mathematics, for instance, girls more frequently use routine procedures and well-known algorithms, while boys are more inclined to try new methods and non-conventional approaches [58,59,60].
The analysis of gender differences in items related to specific difficulties and constructs already studied in mathematics education research could also be fruitful for teachers. The more we investigate and understand these differences, the more opportunities teachers will have to intervene with specific didactical activities. In particular, regarding misconceptions, teachers must be aware of the distinction between avoidable and unavoidable misconceptions [15]. The former are linked to didactical practices and teachers’ choices; the latter are unavoidable because they are not due to didactical transposition but are temporary and non-exhaustive ideas arising from the necessarily gradual introduction of new mathematical knowledge. In this paper, we compared two versions of the same item with the purpose of analysing a specific misconception concerning multiplication with decimal numbers. Misconceptions related to decimal numbers are considered unavoidable misconceptions: they arise from the fact that students learn mathematical operations in the field of natural numbers. Teachers must be conscious of students’ difficulties in the transition between natural and rational numbers: they need to ensure that ideas related to mathematical operations in natural numbers do not become “parasite” models [15] when students have to face the same operations with rational numbers. In particular, this study suggests that teachers should pay special attention to girls because they are more influenced by these misconceptions. In our task, we observe that differential item functioning is related to misconceptions and intuitive models of multiplication used by students, but the influence of these factors is different for boys and girls. This is further confirmed by the variation in item functioning due to the variation in item formulation: 0.5×4 favours lower-ability girls by offering an “easier” formulation which activates a routine procedure (intuitive model of multiplication as repeated addition).
This paper contributes in the direction indicated by [61]: a theoretically driven interpretation of macrophenomena highlighted quantitatively by Large-Scale Assessments may help in clarifying solid findings in Mathematics Education.