Article
Peer-Review Record

A Study on Dropout Prediction for University Students Using Machine Learning

Appl. Sci. 2023, 13(21), 12004; https://doi.org/10.3390/app132112004
by Choong Hee Cho 1, Yang Woo Yu 2 and Hyeon Gyu Kim 1,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 23 October 2023 / Revised: 1 November 2023 / Accepted: 2 November 2023 / Published: 3 November 2023
(This article belongs to the Section Computing and Artificial Intelligence)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The paper discusses the issue of student dropout and the use of machine learning algorithms to predict dropout rates. The authors conducted an experimental study using academic records from 20,050 students at Sahmyook University in Seoul, Republic of Korea. They compared the performance of different machine learning algorithms, including Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, and Deep Neural Network. The study also examined the impact of class imbalance in the data and applied oversampling techniques such as SMOTE to address this issue.

 

As mentioned in the article, using the SMOTE algorithm can effectively improve the model's performance. However, only the SMOTE algorithm was tested for the final results without using the other algorithms mentioned in Table 1. Can we combine other models with different sampling algorithms to obtain better results?
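
For readers unfamiliar with the technique under discussion, the core idea of SMOTE can be sketched as follows. This is a minimal standard-library illustration, not the authors' implementation and not the API of any library: the function name `smote_sketch`, its parameters, and the toy data are all assumptions made for this example. Each synthetic point is placed at a random position on the segment between a minority-class sample and one of its k nearest minority-class neighbours.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Hypothetical helper for illustration only; real studies would normally
    use a maintained library implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (itself excluded by index)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two numeric features per dropout (minority) student.
dropouts = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.5], [1.2, 2.2]]
new_points = smote_sketch(dropouts, n_new=6, k=2)
```

Other oversampling or undersampling schemes differ mainly in how the base points and neighbours are chosen, so, assuming a shared interface, a comparison across samplers and classifiers of the kind the reviewer asks about amounts to a loop over (sampler, model) pairs.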

 

The article mentions choosing several highly correlated features from 144 feature values to predict dropout rates. I want to know whether these features will change depending on the dataset and whether the number of these features will affect the final experimental results.
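
One way to probe this question is to rank features by the absolute value of their correlation with the dropout label and then vary the number k of features kept. The sketch below assumes Pearson correlation and a simple top-k rule; this record does not state the selection procedure actually used in the paper, and the feature names and values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (sx * sy)


def top_k_features(columns, labels, k):
    """Return the k feature names with the largest |correlation| to the label."""
    scores = {name: abs(pearson(vals, labels)) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: label 1 = dropped out, 0 = enrolled.
columns = {
    "gpa": [3.9, 3.5, 1.2, 1.0, 3.7, 1.5],
    "credits": [18, 15, 6, 3, 17, 9],
    "student_id_mod": [1, 2, 1, 2, 2, 1],
}
labels = [0, 0, 1, 1, 0, 1]
selected = top_k_features(columns, labels, k=2)
```

Re-running the ranking on a different dataset, or sweeping k and retraining, would show directly whether the selected features and the final scores are stable.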

 

For the selection of datasets, it is worth noting that, in reality, the dropout rate of students is often related to regional culture, regional development level, and economic environment. These factors can all lead to different feature value selections in the experiment. Can we select more datasets from different regions and economic environments for training and validation to obtain a more accurate model?

 

At the end of the article, the authors propose further establishing a model for predicting student dropout on a per-semester basis. On this basis, should a time attribute be added to the dataset, that is, should more data from different years and grades be considered?

Comments on the Quality of English Language

Except for some typos and tense errors, I cannot find other issues.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

Authors should address the following comments before this manuscript is considered for publication.

1.     The word “experiment” should not appear in the text, as this manuscript is only theoretical work and no experiments have been conducted.

2.     The ML techniques used, such as Logistic Regression, Decision Tree, Random Forest, and Support Vector Machine, are old methods; the novelty of the study is missing.

3.     The introduction requires a more extensive literature review covering the available techniques, as well as an explanation of why the authors chose these specific ML techniques. The authors may consider adding the following references on machine learning in various applications, which may be useful for in-depth analysis: https://doi.org/10.1038/s41598-023-29024-x; https://doi.org/10.1016/j.matchemphys.2023.128180

4.     The quality of the figures needs to be improved; they are somewhat unclear at the moment.

5.     The abbreviation is written as RL in some places (see Figs. 2 and 3) and LR in others. All ML techniques should use consistent terminology throughout the article.

6.     The manuscript requires careful editing to ensure that the goals and results of the study are clearly communicated to the reader.

7.     How did the authors split the data samples between the training and test sets?
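
For context on comment 7, one common answer with imbalanced labels is a stratified split, in which each class is divided in the same proportion so that the test set keeps the original dropout ratio. The following is a minimal sketch under that assumption; this record does not state how the authors actually split their data, and the function name `stratified_split`, the 20% test fraction, and the 90/10 toy labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices), splitting each class separately."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # random assignment within the class
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Hypothetical imbalanced labels: 90 enrolled (0) and 10 dropouts (1).
labels = [0] * 90 + [1] * 10
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)
```

If oversampling is used, it would normally be applied to the training indices only, after the split, so that no synthetic samples derived from test data reach the model.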

Comments on the Quality of English Language

Minor editing of English language required

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

Accepted as all comments have been addressed!
