Article
Peer-Review Record

Intelligent Model for Data Analytical Study of Coronavirus COVID-19 Databases

Electronics 2022, 11(13), 1975; https://doi.org/10.3390/electronics11131975
by Doaa Sami Khafaga 1, Faten Khalid Karim 1,*, Mohamed M. Dessouky 2,3 and Mohamed A. El-Rashidy 3
Submission received: 18 May 2022 / Revised: 21 June 2022 / Accepted: 23 June 2022 / Published: 24 June 2022
(This article belongs to the Section Computer Science & Engineering)

Round 1

Reviewer 1 Report

The rapid spread of the new coronavirus COVID-19 around the world led the World Health Organization to declare a pandemic, with the number of deaths exceeding that of SARS. Massive efforts have been made to stop the spread of the virus.

The authors present an intelligent model for analyzing data collected from the countries affected by the COVID-19 virus. It considers the total number of tests performed in each country, the number of international tourist arrivals in each country, the percentage of employment, the life expectancy at birth, the median age, the population density, the number of people aged 65 years or older (in millions), and the sex ratio. The proposed model is based on machine learning approaches, using k-Means as a clustering approach, Support Vector Machine (SVM) as a classifier, and a wrapper as a feature extraction approach. It consists of the following phases: pre-processing of the collected data, discovery of outlier cases, selection of the most effective features for each of the total infected, death, critical, and recovery cases, and construction of prediction models.

The authors highlight that: (a) experimental results show that the features extracted by the wrapper technique are more capable of fitting and predicting the data than those of the Correlation-Based Feature Selection, Correlation Attribute Evaluation, Information Gain, and Relief Attribute Evaluation techniques; (b) the SVM classifier also achieved the highest accuracy compared to other classification algorithms for predicting total infected, fatal, critical, and recovery cases.
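For readers unfamiliar with the wrapper idea summarized above, the following is a minimal sketch, not the authors' implementation. It assumes synthetic data, uses a simple nearest-centroid classifier as a stand-in for the SVM, and scores each candidate feature subset by 10-fold cross-validated accuracy. All function names (`centroid_predict`, `cv_accuracy`, `wrapper_forward_selection`) are hypothetical.

```python
import random


def centroid_predict(train_X, train_y, x, feats):
    """Predict the class whose centroid (over `feats`) is closest to x.

    Stand-in classifier: the paper uses an SVM; nearest-centroid is used
    here only to keep the sketch dependency-free.
    """
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, t in zip(train_X, train_y) if t == label]
        centroid = [sum(r[f] for r in rows) / len(rows) for f in feats]
        dist = sum((x[f] - c) ** 2 for f, c in zip(feats, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def cv_accuracy(X, y, feats, k=10):
    """k-fold cross-validated accuracy using only the features in `feats`."""
    n = len(X)
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th sample forms one fold
        train_X = [X[i] for i in range(n) if i not in test_idx]
        train_y = [y[i] for i in range(n) if i not in test_idx]
        for i in test_idx:
            correct += centroid_predict(train_X, train_y, X[i], feats) == y[i]
    return correct / n


def wrapper_forward_selection(X, y, k=10):
    """Greedy forward wrapper: add the feature that most improves CV accuracy."""
    remaining = list(range(len(X[0])))
    selected, best = [], 0.0
    while remaining:
        scores = [(cv_accuracy(X, y, selected + [f], k), f) for f in remaining]
        score, feat = max(scores, key=lambda s: s[0])
        if score <= best:  # stop when no candidate improves the score
            break
        selected.append(feat)
        remaining.remove(feat)
        best = score
    return selected, best


# Synthetic data (an assumption): feature 0 separates the classes, 1-3 are noise.
random.seed(0)
y = [i % 2 for i in range(60)]
X = [[10.0 * label + random.gauss(0, 1)] + [random.gauss(0, 1) for _ in range(3)]
     for label in y]

selected, accuracy = wrapper_forward_selection(X, y)
print("selected features:", selected, "cv accuracy:", accuracy)
```

The point of the sketch is only that a wrapper evaluates feature subsets through the classifier itself, whereas filter methods such as Information Gain rank features independently of the classifier.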

The contribution is interesting and well written.

I offer the following suggestions in a purely academic spirit.

1. The abstract must better summarize the sections. For example, delete “The rapid spread of the new coronavirus COVID-19 around the world has led the World Health Organization to announce as a pandemic that the number of deaths has exceeded that of SARS. Massive efforts have been made to stop the spread of the virus.” This is well known; add a background instead.

2. Insert a clear purpose. The part “the contribution of….” is more suitable for the conclusions.

3. Is the part “Data Analysis approaches” still the introduction? It must be rearranged, as it seems too fragmented. Some paragraphs are one sentence long.

 

4. Divide Results from discussion

5. Use discussion to compare with other studies

6. Insert limitations in the discussion


Author Response

  1. Abstract must better summarize the sections. Delete for example “The rapid spread of the new coronavirus COVID-19 around the world has led the World Health Organization to announce as a pandemic that the number of deaths has exceeded that of SARS. Massive efforts have been made to stop the spread of the virus.”…it is well known and put a background.

 

Response:

Thank you, sir, for your valuable comment. The background has been added to the first three lines of the abstract.

 

2. Insert a clear purpose. The part “the contribution of….” Is more suitable for conclusions.

 

Response:

Thank you, sir, for your valuable comment. The main purpose of the paper has been clarified in the abstract and in the introduction on page 2. The contribution has also been added to the conclusion.

 

3. The part “Data Analysis approaches” is still introduction? It must be rearranged as it seems too fragmented. Some paragraphs are one sentence long.

 

Response:

Thank you, sir, for your valuable comment. Yes, it is still part of the introduction. This whole part has been rephrased and fragmented.

4. Divide Results from discussion

Response:

Thank you, sir, for your valuable comment. The Results have been separated from the discussion.

 

5. Use discussion to compare with other studies

 

Response:

Thank you, sir, for your valuable comment. However, there are no other studies on the same topic, as this is the first paper to discuss COVID-19 from this point of view. In addition, the dataset was collected manually from the WHO website.

6. Insert limitations in the discussion

Response:

Thank you, sir, for your valuable comment. The limitations of this study have been added to the future work part of the conclusion section:

  • In the future, missing data will be considered to complete the experimental dataset without ignoring the empty field records. Deep learning will also be considered to predict overall infected cases, total deaths, critical cases, and recovery cases compared to the SVM algorithm.

Reviewer 2 Report

The proposal, “Intelligent Model for Data Analytical Study of Coronavirus COVID-19 Databases”, is attractive. To improve it, please address the following points:

-Title: I recommend adding the insight or the added value of the model, in a word or a short expression. It is not easy to find!

-Abstract: it is right.

-Theoretical framework: it is correct, but it is the part of the work with the most room for improvement. Check whether some references can be updated. To be prudent, try to add newer references if you find any.

-Methods. This paper suggests a model for analyzing data collected from the countries affected by the COVID-19 virus. The KPIs are the following: total number of tests that each country has undergone; number of international tourist arrivals in each country; percentage of employment; life expectancy at birth; median age; population density; number of people aged 65 years or older in millions; sex ratio. You should justify the indicators in depth.

-Results. Results include a lot of Figures. Review if some of them can be deleted.

-Conclusion and discussion: the weakest part of the article, especially the conclusion. You have to include a more in-depth discussion comparing your findings with the principal authors of the theoretical framework. The conclusion is too brief. There is a clear imbalance between the article in general and the conclusion.

Author Response

  • Title: I recommend adding the insight or the added value of the model, in a word or a short expression. It is not easy to find it!

 

Response:

Thank you, sir, for your valuable comment. The main purpose and the added value of the algorithm have been clarified in the abstract and in the introduction on page 2.

  • Abstract: it is right.

 

Response:

Thank you, sir. The abstract has also been improved.

  • Theoretical framework: it is correct, but it is the more improvable part of the work. Check if some references can be updated. To be prudent, try to update some new references if you find.

 

Response:

Thank you, sir, for your valuable comment. However, there are no other studies on the same topic, as this is the first paper to discuss COVID-19 from this point of view. In addition, the dataset was collected manually from the WHO website.

  • Methods. This paper suggests a model for analyzing data collected from the countries affected by the COVID-19 virus. The KPIs are the following: total number of tests that each country has undergone; the number of international tourist arrivals in each country; percentage of employment; life expectancy at birth; median age; population density; the number of people aged 65 years or older in millions; sex ratio. You should justify the indicators in depth.

 

Response:

Thank you, sir, for your valuable comment. The dataset has been described in detail in Section 5.1, Dataset Description, on page 8. In addition, Table 1 shows the statistical description of the dataset with all details of these indicators.

  • Results. Results include a lot of Figures. Review if some of them can be deleted.

 

Response:

Thank you, sir, for your valuable comment. However, each table and figure presents different information:

  • Table 2 shows the number of records, the minimum value, and the maximum value for each cluster of classified features.
  • Tables 3, 4, 5, and 6 present the number of incorrectly classified instances which are different for each classification algorithm during the training phase.
  • Tables 7, 8, 9, and 10 present the performance metric parameters of the classification algorithms after applying the 10-fold cross-validation test technique.
  • Tables 11, 12, 13, and 14 illustrate the most significant features extracted from the feature extraction algorithms for each predicted feature.
  • Tables 15, 16, 17, and 18 present the values of the performance metric parameters of the SVM classifier using different subsets of the feature extraction algorithms.
  • Figures 2, 3, 4, and 5 present the effectiveness of the outlier detection and extraction of the effective features of the proposed approach, which shows that these steps have contributed to the increase in the accuracy of the classification.

 

  • Conclusion and discussion: the weakest part of the article, especially the conclusion. You have to include a more in-depth discussion comparing your findings with the principal authors of the theoretical framework. The conclusion is too much brief. There is a clear disequilibrium between the article in general and conclusion.

 

Response:

Thank you, sir, for your valuable comment. The results part has been updated as follows:

 

  • The Results have been separated from the discussion.
  • A new discussion section has been added.
  • The conclusion section has been updated with more details.
  • I tried to compare my work with others, but there are no other studies on the same topic, as this is the first paper to discuss COVID-19 from this point of view. In addition, the dataset was collected manually from the WHO website.
  • The limitations of this study have been added to the future work part of the conclusion section:
    • In the future, missing data will be considered to complete the experimental dataset without ignoring the empty field records. Deep learning will also be considered to predict overall infected cases, total deaths, critical cases, and recovery cases compared to the SVM algorithm.

Round 2

Reviewer 2 Report

The improvements are sufficient, although there is still some room for improvement.
