Theory and Research in Data Science Education

A special issue of Education Sciences (ISSN 2227-7102).

Deadline for manuscript submissions: 31 July 2025 | Viewed by 4533

Special Issue Editors


Guest Editor

Guest Editor
Department of STEM Education, Mary Immaculate College, University of Limerick, South Circular Road, Limerick V94 VN26, Ireland
Interests: STEM education; statistics education; data science; initial teacher education; mathematics education

Guest Editor
Faculty of Mathematics and Computer Science, Institute GIMB, University of Münster, Johann-Krane-Weg 39, 48149 Münster, Germany
Interests: early statistical thinking; data science education; teaching and learning mathematics with digital resources

Guest Editor
Mathematics and Science Education Department, Middle East Technical University, Ankara 06800, Turkiye
Interests: teaching and learning of statistics and probability; data literacy and statistical reasoning; data science education; teacher education; technology in mathematics education

Guest Editor
Cyprus Pedagogical Institute, Nicosia 2252, Cyprus
Interests: teacher professional learning; mathematics education/statistics education; STEM/STEAM education

Guest Editor
Department of Mathematics, University of Athens, Zografou 15784, Greece
Interests: mathematics education/statistics education; teacher professional development; teaching resources

Special Issue Information

Dear Colleagues,

The unprecedented growth in the availability and accessibility of data, highly influenced by the increasing ubiquity of digital media and the Internet (Mund, 2022), has opened new possibilities to develop a greater understanding of every aspect of human existence. Data science, “the study of extracting value from data” (Wing, 2019), has become indispensable for approaching society’s most pressing problems (e.g., climate change, health, social justice) (Tanaka et al., 2022). However, while open and multisource access to information is a key value of modern democratic societies, a lack of readiness to navigate the dynamic information landscape can become a threat to the common good (Bobrowicz et al., 2022). The ways in which universal access to information can backfire without citizens’ readiness for responsible, well-reasoned choices (Bobrowicz et al., 2022) have rarely been as painfully clear as during the COVID-19 global pandemic, which required everyone to make sense of data on community spread, levels of risk, and vaccine efficacy.

The ever-increasing volume of data has raised the demand for a next generation of data scientists who can anticipate user needs and develop optimal solutions to address business, academic, and societal challenges (Seshaiyer & McNeely, 2022). More importantly, basic data science skills are increasingly valuable in any profession, from technology, science, finance, journalism and politics to art and history (Mund, 2022), as well as for active and responsible citizenship.

Data science is one of the fastest-growing fields of study at the collegiate level. Recently, there has also been a call for data science to be included in school curricula (Lee et al., 2022). Responding to this call, data science education has recently been established as a new field of educational research and practice that aims to build students’ data science literacy starting from the early years of schooling. This Special Issue aims to provide a forum for the sharing of research findings, ideas, and perspectives on this new but fast-growing field of inquiry. Recommended topics for the Special Issue include but are not restricted to the following:

  • Essential concepts and core ideas fundamental to data science literacy;
  • Teaching and learning of key data science concepts and practices (e.g., exploration of messy data, data cleaning and wrangling, use of coding) at the K-12, undergraduate, graduate, and professional levels;
  • Pedagogical models and instructional approaches underlying data science education within and across disciplines;
  • Data Science as a bridge between individual disciplines and STEM/STEAM education;
  • Building connections between data science education and data science applications in industry;
  • Data science literacy for all: equity, inclusion, accessibility, and diversity;
  • Capacity development of data-science literate educators and trainers;
  • Use of technological tools (e.g., Jupyter Notebooks, Python) to support the teaching and learning of data science;
  • Data science and data analytics as tools for enhancing the educational process;
  • Ethics in data science and the role of education;
  • Data science literacy as a tool for civic engagement and social justice;
  • Use of AI technologies in data science (school) projects, either as learning tools or as tools for teachers’ design.

The articles should report on original empirical studies that demonstrate validated practical experiences related to the teaching and learning of data science. The Special Issue will also include conceptual essays contributing to future research and theory building by presenting reflective or theoretical analyses, epistemological studies, integrative and critical literature reviews, or the forecasting of emerging trends in data science education.

Prof. Dr. Maria Meletiou-Mavrotheris
Prof. Dr. Aisling Leavy
Prof. Dr. Daniel Frischemeier
Dr. Sibel Kazak
Dr. Efi Paparistodemou
Dr. Dionysia Bakogianni
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a double-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Education Sciences is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • data science
  • data science education
  • data science literacy
  • data analytics
  • statistics
  • statistics education
  • statistics literacy
  • computational thinking
  • machine learning

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (3 papers)

Research

22 pages, 4999 KiB  
Article
Messy Data in Education: Enhancing Data Science Literacy Through Real-World Datasets in a Master’s Program
by Iraklis Varlamis
Educ. Sci. 2025, 15(4), 500; https://doi.org/10.3390/educsci15040500 - 16 Apr 2025
Viewed by 190
Abstract
The increasing importance of data science in today’s world highlights the need to prepare students for the complexities of real-world data. This paper presents insights and findings from 15 years of teaching Data Mining and Business Intelligence in a Computer Science Master’s program, where a key component of the course is a semester-long assignment involving publicly available, messy, and often incomplete datasets. These datasets include examples such as publicly accessible datasets on accidents or fines from data.gov.uk, data from data contest platforms like Kaggle, and house rental data from platforms like Airbnb. Through these assignments, students are tasked with not only applying algorithmic tools but also addressing challenges like missing information, noisy inputs, and inconsistencies. They also learn the importance of finding and integrating supplementary open data sources to enhance the value and depth of their analyses. The primary objective of this approach is to enhance students’ problem-solving abilities by engaging them in complex, real-world data scenarios where they must navigate and resolve issues related to data quality and completeness. This approach cultivates critical skills such as data wrangling, preprocessing, and the extraction of meaningful insights, along with the ability to understand and articulate the business value of the data. Working hypotheses, such as the impact of data quality on analysis outcomes, are explored, and the paper demonstrates how addressing these challenges improves students’ decision-making processes in data-driven tasks. By engaging with real-world datasets, students develop resilience, adaptability, and problem-solving abilities, which are essential for navigating the complexities of data science in professional settings. 
This paper highlights the educational benefits of using messy data to bridge the gap between theoretical knowledge and real-world application while also demonstrating how this method explicitly improves students’ problem-solving and critical thinking skills in the context of data science.
(This article belongs to the Special Issue Theory and Research in Data Science Education)

Review

30 pages, 7049 KiB  
Review
Curriculum, Pedagogy, and Teaching/Learning Strategies in Data Science Education
by Cecilia Avila-Garzon and Jorge Bacca-Acosta
Educ. Sci. 2025, 15(2), 186; https://doi.org/10.3390/educsci15020186 - 5 Feb 2025
Viewed by 1081
Abstract
Data science education is an interdisciplinary and multidisciplinary field, with curricula continually evolving to meet societal needs. This paper aims to report a bibliometric analysis focused on the pedagogical aspects and teaching/learning strategies employed in data science curriculum design, emphasizing contributions from key authors, publication sources, affiliations, content, and cited documents. The analysis draws on metadata from documents published over a 20-year period (2005–2024), encompassing a total of 1245 documents sourced from the Scopus scientific database. Additionally, a scoping review of 20 articles was conducted to identify key skills, topics, and courses in data science education. The findings reveal a growing interest in the field, with an increasingly multidisciplinary and interdisciplinary approach. Advances in artificial intelligence and related topics, such as linked data, the semantic web, ontologies, and machine learning, are shaping the development of data science curricula. The main challenges in data science education include the creation of up-to-date and competitive curricula, integrating data science training at early educational stages (K-12, secondary schools, pre-collegiate), leveraging data-driven technologies, and defining the profile of a data scientist. Furthermore, the availability of vast amounts of open, linked, and restricted data, along with advancements in data-driven technologies, is significantly influencing research in the field of data science education.
(This article belongs to the Special Issue Theory and Research in Data Science Education)

19 pages, 1235 KiB  
Review
Strengthening Data Literacy in K-12 Education: A Scoping Review
by Verena Witte, Angela Schwering and Daniel Frischemeier
Educ. Sci. 2025, 15(1), 25; https://doi.org/10.3390/educsci15010025 - 30 Dec 2024
Cited by 1 | Viewed by 1422
Abstract
Competent data handling is crucial for active and informed participation in modern society. To equip students for this challenge, data literacy must be strengthened throughout their K-12 education. This scoping review, conducted following PRISMA-ScR guidelines, aims to provide an overview of methods and approaches for enhancing specific sub-skills of data literacy and to identify research gaps. Analysis of 30 relevant papers reveals that, although various definitions and models of data literacy exist, most emphasize data analysis skills. This area is extensively covered in practical approaches, three times more than methods for planning and conducting independent data collection. This disparity highlights an imbalance in data literacy development and underscores the need to address under-represented sub-skills, in particular the actual data collection process. The review suggests a focus on project-based learning with real-world data and current issues as an effective method to balance out this disparity. Future research should explore and develop comprehensive approaches to teaching all aspects of data literacy, particularly those currently underemphasized.
(This article belongs to the Special Issue Theory and Research in Data Science Education)