Mathematical Modelling Abilities of Artificial Intelligence Tools: The Case of ChatGPT
Abstract
1. Introduction
2. Theoretical Background of Mathematical Modelling
2.1. Notions and Concepts
2.2. Aims of Modelling
2.3. Modelling in Mathematics Classrooms
- The students should be able to apply mathematics in everyday and professional life.
- Mathematics is supposed to help students understand their world and critically view mathematical information in the sense of active citizenship.
- The students are to develop problem-solving competencies. They are also expected to deal with situations utterly unfamiliar to them and communicate with the help of mathematics.
- Students should have insights into the usefulness of mathematics.
- Modelling will help students understand and memorise mathematical content more easily.
- Modelling tasks are supposed to help students gain a more positive attitude towards mathematics.
2.4. Modelling Competencies
2.5. Classification of Tasks
2.6. Classification Scheme for Modelling
- Which modelling activities must be carried out?
- Which data are provided?
- What is the nature of the relationship between the context and reality?
- Which type of representation is chosen?
- How open is the task?
2.7. Modelling Activity
2.8. Data
- Superfluous data: this kind of task contains more data than needed. Relevant and irrelevant data need to be distinguished by the pupils.
- Missing data: this kind of task contains less data than needed. A possible solution needs additional information or estimated variables. This kind of task might have several different solutions.
- Missing and superfluous data: this kind of task does not contain all the data needed to find a solution; simultaneously, it contains superfluous data.
- Inconsistent data: this kind of task contains data that are not relevant to the solution.
- Matching data: this kind of task contains exactly the information needed to find a solution.
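The set-based part of this distinction can be made concrete with a minimal sketch. It assumes, purely for illustration, that the quantities a solution needs and the quantities a task states can each be written as a set of labels; the function `classify_data` and all labels are hypothetical and not part of the classification scheme. Inconsistent data cannot be detected by comparing sets, so that category is left out.

```python
def classify_data(needed: set[str], provided: set[str]) -> str:
    """Return the data category of a task from two sets of quantity labels."""
    missing = needed - provided        # needed for a solution but not stated
    superfluous = provided - needed    # stated but not needed
    if missing and superfluous:
        return "missing and superfluous data"
    if missing:
        return "missing data"
    if superfluous:
        return "superfluous data"
    return "matching data"


# Hypothetical labels for a refuelling-style task.
needed = {"distance", "consumption", "price_home", "price_abroad", "tank_volume"}
provided = {"distance", "price_home", "price_abroad"}
print(classify_data(needed, provided))  # missing data
```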
2.9. Relationship to Reality
- Authentic tasks focus on context-related topics and pose relevant questions. A task can be considered to be authentic when the data themselves, the way in which the task is presented, and the question itself are authentic or if the task is simulated in a mathematics classroom situation.
- Realistic tasks: in this case, tasks are close to reality, while the data or the question are not necessarily authentic. Even if the data may have a realistic meaning, they might have been artificially constructed. Vice versa, the data of such tasks can be authentic, while the question is not.
- Embedded tasks: here, the proposed situation embeds the mathematical topics. Thus, it is not necessary to reflect on the particular context.
- Artificial tasks: the proposed situation is intentionally artificial.
- Fantasy tasks: in this case, the task context is inspired by a fantasy world; this might be highly appealing, particularly in primary or elementary classrooms.
2.10. Representation
- Text: this kind of task is presented as pure text.
- Pictures: this kind of task consists of images or photographs only.
- Text and pictures: pictures and photographs can illustrate the provided text and support the pupils in making a connection between the task and the reality.
- Materials: various artefacts, such as documents, newspaper articles, or radio broadcasts, can support the mental visualisation of a proposed problem or a situation.
- Situations: this kind of task uses real situations to be explored mathematically.
2.11. Openness
- Solved example: the proposed task has already been solved and can serve as an example.
- Ascertaining task: the initial situation is given in this task; the subsequent transformations must be performed.
- Reversal task: the end situation and the transformation are given; the initial situation is missing and needs to be found.
- Ascertaining problem: the initial situation is given; the transformation and end situation are unknown. This is a situation typical of modelling tasks.
- Reversal problem: the end situation is given, while the rest is unknown. These are modelling tasks where an aim is given.
- Finding a situation: a mathematical tool, or any transformation, is given; the situation where this tool can be used must be found.
- Open problem: in this case, the initial situation, the transformation, and the end situation are not given.
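The seven categories above can be read as combinations of which of the three components (initial situation, transformation, end situation) are given. The lookup below is a sketch of that reading, not a definition taken from the scheme: it assumes that in an ascertaining task the transformation counts as given because it only has to be carried out, and the names `OPENNESS` and `openness` are hypothetical.

```python
# Keys: (initial situation given, transformation given, end situation given).
OPENNESS = {
    (True, True, True): "solved example",
    (True, True, False): "ascertaining task",      # assumption: transformation known
    (False, True, True): "reversal task",
    (True, False, False): "ascertaining problem",
    (False, False, True): "reversal problem",
    (False, True, False): "finding a situation",
    (False, False, False): "open problem",
}


def openness(initial: bool, transformation: bool, end: bool) -> str:
    """Map the given components to a category; one combination is not in the list."""
    return OPENNESS.get((initial, transformation, end), "not covered by the list")


print(openness(True, False, False))  # ascertaining problem
```

Under this reading, the only combination without a category is a task that gives both the initial and the end situation but not the transformation.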
3. Theoretical Background of AI
3.1. Background
3.2. Mathematical Performance of GPT
3.3. Research Questions
4. Method
4.1. Selection of Tasks
4.2. Process for ChatGPT Solutions
4.3. Qualitative Analyses
5. Results
5.1. GPTs’ Solutions
5.2. Rating of Solutions
5.3. Interpretation of Solutions
5.3.1. Task 1: Rubber Ball
“To solve this problem, we need to understand that each time the rubber ball touches the ground, it reaches half of its previous height. When it first falls, it reaches a height of 18 m, then it bounces back to half of that height, which is 9 m. On the second bounce, it falls from 9 m, then bounces back to half of that, which is 4.5 m. On the third bounce, it falls from 4.5 m, then bounces back to half of that, which is 2.25 m. This process continues.” (GPT-3.5)
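The halving described in this answer is a geometric sequence. As a minimal sketch, assuming only the 18 m drop height and the ratio of one half stated in the quote (the function name `bounce_heights` is hypothetical), the first rebound heights can be reproduced as follows:

```python
def bounce_heights(drop_height_m: float, bounces: int, ratio: float = 0.5) -> list[float]:
    """Heights reached after each of the first `bounces` bounces."""
    return [drop_height_m * ratio ** n for n in range(1, bounces + 1)]


print(bounce_heights(18, 3))  # [9.0, 4.5, 2.25]
```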
5.3.2. Task 3: Refuelling
“To check whether the trip is worthwhile for Mr. Stein, we need to calculate the cost of the trip and compare it with the potential savings from the cheaper petrol price in Luxembourg.” (GPT-3.5)
“The amount of petrol required for the trip is […]. At the Luxembourg petrol price, the cost of the round trip in terms of fuel is 2.67 × €1.05 = €2.8” (GPT-MM)
“Based on this information, we can calculate the fuel consumption for the 20 km trip:
Consumption = (20 km/100 km) × 6 L/100 km = 1.2 L
Now, we can calculate the fuel costs both in Tier and Luxembourg:
In Tier: 1.1 Euro/liter × 1.2 L = 1.32 Euro
In Luxembourg: 0.85 Euro/liter × 1.2 L = 1.02 Euro
The difference between the costs in Tier and Luxembourg is:
Savings = Costs in Tier − Costs in Luxembourg = 1.32 Euro − 1.02 Euro = 0.3 Euro” (GPT-3.5)
“To determine if the trip is worth it, we need to calculate the break-even point, where the savings on petrol equal the cost of the journey. Let x be the liters of petrol needed to break even.
0.15x = 2.80” (GPT-MM)
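The break-even equation quoted from GPT-MM can be solved in a few lines. The sketch below uses only the figures that appear in the GPT-MM quotes (2.67 L of petrol for the round trip, €1.05 per litre, a saving of €0.15 per litre); the function name `break_even_litres` is hypothetical, and the sketch illustrates the arithmetic rather than the model's full answer.

```python
def break_even_litres(trip_cost_eur: float, saving_per_litre_eur: float) -> float:
    """Litres that must be bought so that the saving equals the trip cost."""
    if saving_per_litre_eur <= 0:
        raise ValueError("saving per litre must be positive")
    return trip_cost_eur / saving_per_litre_eur


trip_cost = round(2.67 * 1.05, 2)         # fuel cost of the round trip, about 2.80 euros
x = break_even_litres(trip_cost, 0.15)    # solves 0.15 * x = 2.80
print(f"{x:.1f} L")                       # 18.7 L
```

Read this way, the trip pays off only if more than roughly 18.7 L are bought.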
5.3.3. Task 4: Water Supply in Traffic Jam
“Let’s denote:
D as the distance of the traffic jam (20 km)
t as the time it takes for the truck to supply all people with water
r as the rate at which the truck supplies water (in km/h)
We can set up the equation: r × t = D” (GPT-3.5)
“Given that Christina has been stuck in the traffic jam for 6 h, we know the total time is 6 h. We can set up the equation with this information: […]
To solve for r, we divide both sides by 6: […]
So, the truck is supplying water at a rate of 3.33 km/h.
Now, to find out how long it will take for the truck to supply all people with water, we can use the same rate: […] h.
So, it will take approximately 6 h for the truck to supply all people with water.” (GPT-3.5)
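The displayed equations of this answer are not reproduced in the quotation. Assuming the answer substituted t = 6 h and D = 20 km into r × t = D, as the surrounding sentences suggest, the omitted steps would read roughly as follows; this is a reconstruction, not the model's original output.

```latex
\begin{align*}
r \times 6\,\mathrm{h} &= 20\,\mathrm{km} \\
r &= \frac{20\,\mathrm{km}}{6\,\mathrm{h}} \approx 3.33\,\mathrm{km/h} \\
t &= \frac{D}{r} = \frac{20\,\mathrm{km}}{3.33\,\mathrm{km/h}} \approx 6\,\mathrm{h}
\end{align*}
```

Written out like this, the last line only returns the 6 h that was put in at the start.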
“The average distance between cars in the jam to estimate the number of cars. How much time it takes for the truck to serve water to each car.” (GPT-4.0)
“Traffic jam length = 20 km = 20,000 m
Average distance between cars = 10 m” (GPT-4.0)
“Assuming an average car length of 4.5 m (including spacing between cars), we can estimate the number of cars in a 20 km traffic jam.” (GPT-MM)
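The two estimates quoted from GPT-4.0 and GPT-MM differ only in the road length assumed per car. The sketch below compares them; the function name `estimate_cars` and the `lanes` parameter are hypothetical, and a single lane is assumed because the quotes do not state a lane count.

```python
import math


def estimate_cars(jam_length_m: float, metres_per_car: float, lanes: int = 1) -> int:
    """Whole cars that fit into the jam, given the road length taken up per car."""
    return math.floor(jam_length_m / metres_per_car) * lanes


print(estimate_cars(20_000, 10))   # 2000  (GPT-4.0: 10 m per car)
print(estimate_cars(20_000, 4.5))  # 4444  (GPT-MM: 4.5 m per car)
```

The estimate more than doubles between the two assumptions.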
5.3.4. Tasks 2 and 5
6. Discussion
7. Limitation
8. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
Appendix A.1. Task 1: Rubber Ball
Appendix A.2. Task 2: Ferry Price
Appendix A.3. Task 3: Refuelling
Appendix A.4. Task 4: Water Supply in the Traffic Jam
Appendix A.5. Task 5: Round Up Please
Task | Data | Nature of Relationship to Reality | Type of Representation | Openness of Task |
---|---|---|---|---|
Task 1: rubber ball | Matching | Embedded, intentionally artificial | Text | Ascertaining task |
Task 2: ferry price | Missing | Embedded | Text | Ascertaining task |
Task 3: refuelling | Missing | Authentic, close to reality | Text, picture, situation | Ascertaining problem |
Task 4: water supply | Missing | Close to reality | Text, situation | Ascertaining problem |
Task 5: round up | Superfluous and missing | Authentic | Text, situation | Ascertaining problem |
Partial Competence | Description |
---|---|
Understanding | Students construct their own mental model of a given problem situation and, thus, understand the question. |
Collecting information, analysing sources (simplifying) | Students separate important and unimportant information with respect to a real-life situation. |
Mathematising | Students translate suitably simplified real-life situations into mathematical models (e.g., term, equation, figure, diagram, function). |
Using mathematics | Students work with the mathematical model. |
Interpreting | Students relate the results obtained from the model to the real situation and, thus, achieve real results. |
Validating * | Students check the real results from the situation model for appropriateness. Students compare and evaluate different mathematical models with respect to a real-life situation. |
Discussing (possibly) contradicting results | Students relate the answers obtained from the situation model to the real situation and, thus, answer the question. |
Partial Competence | Task 1: Rubber Ball | Task 2: Ferry Price | Task 3: Refuelling | Task 4: Water Supply | Task 5: Round Up |
---|---|---|---|---|---|
Understanding | 2 | 3 | 0 | 0 | 3 |
Collecting information, analysing sources (simplifying) | 4 | 4 | 4 | 2 | 2 |
Mathematising | 1 | 3 | 2 | 2 | 2 |
Using mathematics | 3 | 3 | 2 | 2 | 2 |
Interpreting | 3 | 2 | 2 | 2 | 3 |
Validating in the situation model | 0 | 2 | 0 | 2 | 3 |
Validating in the context | 4 | 0 | 0 | 0 | 2 |
Discussing (possibly) contradicting results | 1 | 1 | 1 | 1 | 3 |
Partial Competence | Task 1: Rubber Ball | Task 2: Ferry Price | Task 3: Refuelling | Task 4: Water Supply | Task 5: Round Up |
---|---|---|---|---|---|
Understanding | 3 | 3 | 3 | 3 | 3 |
Collecting information, analysing sources (simplifying) | 4 | 4 | 4 | 3 | 3 |
Mathematising | 3 | 3 | 3 | 2 | 3 |
Using mathematics | 3 | 3 | 3 | 3 | 3 |
Interpreting | 3 | 2 | 3 | 2 | 3 |
Validating in the situation model | 3 | 2 | 3 | 2 | 3 |
Validating in the context | 4 | 0 | 0 | 0 | 3 |
Discussing (possibly) contradicting results | 3 | 2 | 3 | 3 | 3 |
Partial Competence | Task 1: Rubber Ball | Task 2: Ferry Price | Task 3: Refuelling | Task 4: Water Supply | Task 5: Round Up |
---|---|---|---|---|---|
Understanding | 3 | 3 | 3 | 3 | 3 |
Collecting information, analysing sources (simplifying) | 4 | 4 | 4 | 3 | 3 |
Mathematising | 3 | 3 | 3 | 2 | 3 |
Using mathematics | 3 | 3 | 3 | 3 | 3 |
Interpreting | 3 | 3 | 3 | 3 | 3 |
Validating in the situation model | 3 | 3 | 2 | 3 | 3 |
Validating in the context | 4 | 2 | 0 | 0 | 3 |
Discussing (possibly) contradicting results | 3 | 3 | 3 | 3 | 3 |
Coefficient | Estimate | S.E. | p-Value |
---|---|---|---|
GPT-3.5 | |||
Task 1: rubber ball | |||
Gwet’s AC1 | 0.592 | 0.158 | <0.01 |
Krippendorff’s α | 0.564 | 0.163 | 0.011 |
Conger’s κ | 0.562 | 0.145 | <0.01 |
Simple agreement (%) | 0.667 | 0.126 | <0.01 |
Task 2: ferry price | |||
Gwet’s AC1 | 0.273 | 0.170 | 0.152 |
Krippendorff’s α | 0.294 | 0.182 | 0.150 |
Conger’s κ | 0.296 | 0.157 | 0.101 |
Simple agreement (%) | 0.417 | 0.137 | 0.019 |
Task 3: refuelling | |||
Gwet’s AC1 | 0.381 | 0.133 | 0.024 |
Krippendorff’s α | 0.376 | 0.154 | 0.045 |
Conger’s κ | 0.396 | 0.123 | 0.014 |
Simple agreement (%) | 0.500 | 0.109 | <0.01 |
Task 4: water supply | |||
Gwet’s AC1 | 0.241 | 0.117 | 0.079 |
Krippendorff’s α | 0.154 | 0.160 | 0.366 |
Conger’s κ | 0.200 | 0.120 | 0.139 |
Simple agreement (%) | 0.375 | 0.098 | <0.01 |
Task 5: round up | |||
Gwet’s AC1 | 0.156 | 0.054 | 0.024 |
Krippendorff’s α | −0.057 | 0.082 | 0.513 |
Conger’s κ | 0.000 | 0.040 | |
Simple agreement (%) | 0.292 | 0.042 | <0.01 |
GPT-4.0 | |||
Task 1: rubber ball | |||
Gwet’s AC1 | 0.694 | 0.161 | <0.01 |
Krippendorff’s α | 0.566 | 0.182 | 0.017 |
Conger’s κ | 0.560 | 0.167 | 0.012 |
Simple agreement (%) | 0.750 | 0.122 | <0.01 |
Task 2: ferry price | |||
Gwet’s AC1 | 0.397 | 0.123 | 0.014 |
Krippendorff’s α | 0.296 | 0.221 | 0.223 |
Conger’s κ | 0.309 | 0.182 | 0.134 |
Simple agreement (%) | 0.500 | 0.109 | <0.01 |
Task 3: refuelling | |||
Gwet’s AC1 | 0.901 | 0.099 | <0.01 |
Krippendorff’s α | 0.828 | 0.188 | <0.01 |
Conger’s κ | 0.822 | 0.184 | <0.01 |
Simple agreement (%) | 0.917 | 0.083 | <0.01 |
Task 4: water supply | |||
Gwet’s AC1 | 0.790 | 0.138 | <0.01 |
Krippendorff’s α | 0.743 | 0.178 | <0.01 |
Conger’s κ | 0.733 | 0.175 | <0.01 |
Simple agreement (%) | 0.833 | 0.109 | <0.01 |
Task 5: round up | |||
Gwet’s AC1 | 0.306 | 0.174 | 0.123 |
Krippendorff’s α | 0.143 | 0.150 | 0.372 |
Conger’s κ | 0.232 | 0.089 | 0.034 |
Simple agreement (%) | 0.500 | 0.109 | <0.01 |
GPT-MM | |||
Task 1: rubber ball | |||
Gwet’s AC1 | 0.778 | 0.163 | <0.01 |
Krippendorff’s α | 0.681 | 0.168 | <0.01 |
Conger’s κ | 0.673 | 0.157 | <0.01 |
Simple agreement (%) | 0.833 | 0.109 | <0.01 |
Task 2: ferry price | |||
Gwet’s AC1 | 0.784 | 0.155 | <0.01 |
Krippendorff’s α | 0.649 | 0.205 | 0.016 |
Conger’s κ | 0.644 | 0.188 | 0.011 |
Simple agreement (%) | 0.833 | 0.109 | <0.01 |
Task 3: refuelling | |||
Gwet’s AC1 | 0.799 | 0.142 | <0.01 |
Krippendorff’s α | 0.691 | 0.160 | <0.01 |
Conger’s κ | 0.683 | 0.151 | <0.01 |
Simple agreement (%) | 0.833 | 0.109 | <0.01 |
Task 4: water supply | |||
Gwet’s AC1 | 0.391 | 0.152 | 0.037 |
Krippendorff’s α | 0.110 | 0.166 | 0.530 |
Conger’s κ | 0.135 | 0.143 | 0.377 |
Simple agreement (%) | 0.500 | 0.109 | <0.01 |
Task 5: round up | |||
Gwet’s AC1 | 0.627 | 0.249 | 0.040 |
Krippendorff’s α | 0.274 | 0.138 | 0.087 |
Conger’s κ | 0.294 | 0.109 | 0.031 |
Simple agreement (%) | 0.750 | 0.122 | <0.01 |
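To illustrate how two of the coefficients in this table are defined, the sketch below computes simple agreement and Gwet’s AC1 for several raters. It is a plain re-implementation of the unweighted formulas for nominal categories without missing ratings, not the irrCAC routine used for the reported values, so results may differ from that package if weights or other options were applied. The example ratings, the 0 to 4 category set, and the function names are hypothetical.

```python
from collections import Counter


def simple_agreement(ratings: list[list[int]]) -> float:
    """Mean share of agreeing rater pairs per item (every rater rates every item)."""
    total = 0.0
    for item in ratings:
        r = len(item)
        counts = Counter(item)
        total += sum(c * (c - 1) for c in counts.values()) / (r * (r - 1))
    return total / len(ratings)


def gwet_ac1(ratings: list[list[int]], categories: list[int]) -> float:
    """Gwet's AC1 for several raters, nominal categories, no missing ratings."""
    n, q = len(ratings), len(categories)
    p_a = simple_agreement(ratings)
    # pi_k: average share of raters who chose category k
    pi = [sum(item.count(k) / len(item) for item in ratings) / n for k in categories]
    p_e = sum(p * (1 - p) for p in pi) / (q - 1)  # chance agreement
    return (p_a - p_e) / (1 - p_e)


# Hypothetical data: four items, each rated by three raters on a 0-4 scale.
ratings = [[3, 3, 3], [2, 2, 3], [0, 0, 0], [4, 4, 4]]
print(round(simple_agreement(ratings), 3))           # 0.833
print(round(gwet_ac1(ratings, [0, 1, 2, 3, 4]), 3))  # 0.796
```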
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Spreitzer, C.; Straser, O.; Zehetmeier, S.; Maaß, K. Mathematical Modelling Abilities of Artificial Intelligence Tools: The Case of ChatGPT. Educ. Sci. 2024, 14, 698. https://doi.org/10.3390/educsci14070698