Education Data Mining

A special issue of Data (ISSN 2306-5729). This special issue belongs to the section "Information Systems and Data Management".

Deadline for manuscript submissions: closed (28 February 2022) | Viewed by 61346

Special Issue Editors

Prof. Dr. Leonardo Grilli

Guest Editor

Dipartimento di Statistica, Informatica, Applicazioni (DiSIA), Università di Firenze, I-50134 Firenze, Italy
Interests: multilevel models; latent variable models; causal inference; methods for the evaluation of public services

Prof. Dr. Donatella Merlini

Guest Editor

Dipartimento di Statistica, Informatica, Applicazioni (DiSIA), Università di Firenze, I-50134 Firenze, Italy
Interests: analysis of algorithms and data structures; enumerative combinatorics; symbolic computation; databases and data mining; educational data mining

Prof. Dr. Carla Rampichini

Guest Editor

Dipartimento di Statistica, Informatica, Applicazioni (DiSIA), Università di Firenze, I-50134 Firenze, Italy
Interests: multilevel models; duration models; causal inference; evaluation of educational systems

Prof. Dr. Maria Cecilia Verri

Guest Editor

Dipartimento di Statistica, Informatica, Applicazioni (DiSIA), Università di Firenze, I-50134 Firenze, Italy
Interests: databases and algorithms; analysis of algorithms and combinatorics; educational data mining

Special Issue Information

Dear Colleagues,

The growth of data stored in computer systems affects many fields and sectors, from business, medicine and biology to public administration. It is therefore important to develop new methodologies and technologies to manage and analyse the information that can be derived from such large data sources. In education, educational data mining is a research area that uses data mining, machine learning and statistical methods to explore and analyse two kinds of data: large repositories usually stored in school and university databases for administrative purposes, and large amounts of information about teaching-learning interactions generated in e-learning or web-based educational contexts. Educational data mining considers a wide variety of data types, including but not limited to log files of interactive learning environments and intelligent tutoring systems, results of examinations and assessment tests, and student-produced artifacts. It seeks to use all this information to better understand the student learning process, and its results can be used by university or school management to improve the entire educational process. The use of data mining in the educational context mainly involves techniques such as clustering, classification, regression, text mining, association rule mining and sequential pattern analysis.

This Special Issue invites papers in the field of educational data mining that are significant and original and that clearly delineate their contributions to the literature, in terms of both data pre-processing and data organization techniques and algorithms for data analysis.

Topics of interest include, but are not limited to, the following:

New techniques for mining educational data
Evaluation of students' performance
Evaluation of curricula and university quality
Social network analysis of student and teacher interactions
Temporal patterns in student behavior
Text mining of educational documents
Students' evaluation of teaching
Publication of educational datasets that are useful to the field

Prof. Dr. Leonardo Grilli
Prof. Dr. Donatella Merlini
Prof. Dr. Carla Rampichini
Prof. Dr. Maria Cecilia Verri
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Data is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (7 papers)

Research

19 pages, 2685 KB

Open Access | Editor’s Choice | Article

A Mixture Hidden Markov Model to Mine Students’ University Curricula

by Silvia Bacci and Bruno Bertaccini

Data 2022, 7(2), 25; https://doi.org/10.3390/data7020025 - 21 Feb 2022

Cited by 3 | Viewed by 4086

Abstract

In the context of higher education, the wide availability of data gathered by universities for administrative purposes or for recording the evolution of students’ learning processes makes novel data mining techniques particularly useful to tackle critical issues. In Italy, current academic regulations allow students to customize the chronological sequence of courses they have to attend to obtain the final degree. This leads to a variety of sequences of exams, with an average time taken to obtain the degree that may significantly differ from the time established by law. In this contribution, we propose a mixture hidden Markov model to classify students into groups that are homogenous in terms of university paths, with the aim of detecting bottlenecks in the academic career and improving students’ performance. Full article

(This article belongs to the Special Issue Education Data Mining)

19 pages, 1423 KB

Open Access | Article

Development of a Web-Based Prediction System for Students’ Academic Performance

by Dabiah Alboaneen, Modhe Almelihi, Rawan Alsubaie, Raneem Alghamdi, Lama Alshehri and Renad Alharthi

Data 2022, 7(2), 21; https://doi.org/10.3390/data7020021 - 29 Jan 2022

Cited by 38 | Viewed by 12876

Abstract

Educational Data Mining (EDM) is used to extract and discover interesting patterns from educational institution datasets using Machine Learning (ML) algorithms. There is much academic information related to students available. Therefore, it is helpful to apply data mining to extract factors affecting students’ academic performance. In this paper, a web-based system for predicting academic performance and identifying students at risk of failure through academic and demographic factors is developed. The ML model is developed to predict the total score of a course at the early stages. Several ML algorithms are applied, namely: Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbors (KNN), Artificial Neural Network (ANN), and Linear Regression (LR). This model applies to the data of female students of the Computer Science Department at Imam Abdulrahman bin Faisal University (IAU). The dataset contains 842 instances for 168 students. Moreover, the results showed that the prediction’s Mean Absolute Percentage Error (MAPE) reached 6.34%, and the academic factors had a higher impact on students’ academic performance than the demographic factors, the midterm exam score in the top. The developed web-based prediction system is available on an online server and can be used by tutors. Full article

(This article belongs to the Special Issue Education Data Mining)
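
The Mean Absolute Percentage Error (MAPE) quoted in the abstract above can be illustrated with a minimal Python sketch. It uses the standard textbook definition, the mean of |actual − predicted| / |actual| expressed as a percentage. The function name and the sample scores are hypothetical and are not taken from the paper, whose exact computation is not given on this page.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Percentage Error, in percent (textbook definition)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    # Average the relative absolute errors, then scale to a percentage.
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical course totals (illustrative only, not data from the paper).
actual_scores = [80.0, 50.0, 100.0]
predicted_scores = [72.0, 55.0, 100.0]
print(f"MAPE = {mape(actual_scores, predicted_scores):.2f}%")
```

Because MAPE divides by the actual value, it is undefined for a score of zero, which the sketch rejects explicitly rather than silently skipping.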

15 pages, 2494 KB

Open Access | Article

Analysing Computer Science Courses over Time

by Renza Campagni, Donatella Merlini and Maria Cecilia Verri

Data 2022, 7(2), 14; https://doi.org/10.3390/data7020014 - 24 Jan 2022

Viewed by 3093

Abstract

In this paper we consider courses of a Computer Science degree in an Italian university from the year 2011 up to 2020. For each course, we know the number of exams taken by students during a given calendar year and the corresponding average grade; we also know the average normalized value of the result obtained in the entrance test and the distribution of students according to the gender. By using classification and clustering techniques, we analyze different data sets obtained by pre-processing the original data with information about students and their exams, and highlight which courses show a significant deviation from the typical progression of the courses of the same teaching year, as time changes. Finally, we give heat maps showing the order in which exams were taken by graduated students. The paper shows a reproducible methodology that can be applied to any degree course with a similar organization, to identify courses that present critical issues over time. A strength of the work is to consider courses over time as variables of interest, instead of the more frequently used personal and academic data concerning students. Full article

(This article belongs to the Special Issue Education Data Mining)

19 pages, 1739 KB

Open Access | Feature Paper | Article

Dealing with Randomness and Concept Drift in Large Datasets

by Kassim S. Mwitondi and Raed A. Said

Data 2021, 6(7), 77; https://doi.org/10.3390/data6070077 - 19 Jul 2021

Cited by 7 | Viewed by 5897

Abstract

Data-driven solutions to societal challenges continue to bring new dimensions to our daily lives. For example, while good-quality education is a well-acknowledged foundation of sustainable development, innovation and creativity, variations in student attainment and general performance remain commonplace. Developing data-driven solutions hinges on two fronts: technical and application. The former relates to the modelling perspective, where two of the major challenges are the impact of data randomness and general variations in definitions, typically referred to as concept drift in machine learning. The latter relates to devising data-driven solutions to address real-life challenges such as identifying potential triggers of pedagogical performance, which aligns with the Sustainable Development Goal (SDG) #4, Quality Education. A total of 3145 pedagogical data points were obtained from the central data collection platform for the United Arab Emirates (UAE) Ministry of Education (MoE). Using simple data visualisation and machine learning techniques via a generic algorithm for sampling, measuring and assessing, the paper highlights research pathways for educationists and data scientists to attain unified goals in an interdisciplinary context. Its novelty derives from embedded capacity to address data randomness and concept drift by minimising modelling variations and yielding consistent results across samples. Results show that intricate relationships among data attributes describe the invariant conditions that practitioners in the two overlapping fields of data science and education must identify. Full article

(This article belongs to the Special Issue Education Data Mining)

31 pages, 1021 KB

Open Access | Article

Performing Learning Analytics via Generalised Mixed-Effects Trees

by Luca Fontana, Chiara Masci, Francesca Ieva and Anna Maria Paganoni

Data 2021, 6(7), 74; https://doi.org/10.3390/data6070074 - 9 Jul 2021

Cited by 17 | Viewed by 5107

Abstract

Nowadays, the importance of educational data mining and learning analytics in higher education institutions is being recognised. The analysis of university careers and of student dropout prediction is one of the most studied topics in the area of learning analytics. From the perspective of estimating the likelihood of a student dropping out, we propose an innovative statistical method that is a generalisation of mixed-effects trees for a response variable in the exponential family: generalised mixed-effects trees (GMET). We performed a simulation study in order to validate the performance of our proposed method and to compare GMET to classical models. In the case study, we applied GMET to model undergraduate student dropout in different courses at Politecnico di Milano. The model was able to identify discriminating student characteristics and estimate the effect of each degree-based course on the probability of student dropout. Full article

(This article belongs to the Special Issue Education Data Mining)

Other

10 pages, 415 KB

Open Access | Data Descriptor

A Dataset of Dropout Rates and Other School-Level Variables in Louisiana Public High Schools

by Michael Stein, Michael Leitner, Jill C. Trepanier and Kory Konsoer

Data 2022, 7(4), 48; https://doi.org/10.3390/data7040048 - 12 Apr 2022

Cited by 2 | Viewed by 8959

Abstract

Students dropping out of high school is a nationwide problem in the United States, plaguing communities and often greatly reducing the prospects of a quality life for those students who do not complete their high school education. The state of Louisiana consistently has among the highest public high school dropout rates in the United States and, often, the highest. This massive dataset of school variables covering a duration of five academic years (2014–2015 to 2018–2019) was originally compiled with the intention of identifying the factors that correlate with high school dropouts in Louisiana public high schools, specifically. However, it can be useful to any researchers interested in analyzing school-level data concerning a wide range of variables beyond merely dropout rates. This dataset also contains socioeconomic demographics, financial variables, class size, and much more. The correlation analyses ultimately revealed many intriguing insights into the relationships between the tested variables and the dropout rates. Full article

(This article belongs to the Special Issue Education Data Mining)

10 pages, 714 KB

Open Access | Data Descriptor

Dataset of Students’ Performance Using Student Information System, Moodle and the Mobile Application “eDify”

by Raza Hasan, Sellappan Palaniappan, Salman Mahmood, Ali Abbas and Kamal Uddin Sarker

Data 2021, 6(11), 110; https://doi.org/10.3390/data6110110 - 22 Oct 2021

Cited by 21 | Viewed by 16738

Abstract

The data presented in this article comprise an educational dataset collected from the student information system (SIS), the learning management system (LMS) called Moodle, and video interactions from the mobile application called “eDify.” The dataset, from the higher educational institution (HEI) in Sultanate of Oman, comprises five modules of data from Spring 2017 to Spring 2021. The dataset consists of 326 student records with 40 features in total, including the students’ academic information from SIS (which has 24 features), the students’ activities performed on Moodle within and outside the campus (comprising 10 features), and the students’ video interactions collected from eDify (consisting of six features). The dataset is useful for researchers who want to explore students’ academic performance in online learning environments, and will help them to model their educational datamining models. Moreover, it can serve as an input for predicting students’ academic performance within the module for educational datamining and learning analytics. Furthermore, researchers are highly recommended to refer to the original papers for more details. Full article

(This article belongs to the Special Issue Education Data Mining)
