Article
Peer-Review Record

Reducing Operation Costs of Thyroid Nodules Using Machine Learning Algorithms with Thyroid Nodules Scoring Systems

Appl. Sci. 2022, 12(22), 11559; https://doi.org/10.3390/app122211559
by Erdal Ayvaz 1, Kaplan Kaplan 2,*, Fatma Kuncan 3, Ednan Ayvaz 4 and Hüseyin Türkoğlu 1
Submission received: 7 October 2022 / Revised: 9 November 2022 / Accepted: 10 November 2022 / Published: 14 November 2022
(This article belongs to the Special Issue Artificial Intelligence for Health and Well-Being)

Round 1

Reviewer 1 Report

The paper is not ready to be considered for publication. The models were not developed properly, and the experiments were not performed properly. The results suggest that the models suffer from overfitting.

Line 23 - process management

Do the authors mean model development?

The paper needs to be extensively proofread.

The abstract needs major revision. The novelty statement is missing: what is the novelty of the current study? Details of the data partitioning and the optimization methods used are also missing.

Line 73 - (K-TIRADS, EU-TIRADS, ACR-TIRADS, ATA, and BTA)

Do the authors mean the dataset source here? Please explain what these systems are. What are these classification systems? Why is there a need to develop a new classification system?

Extensive use of personal pronouns such as "we".

Line 108 - Two independent experienced sonographers evaluated the US imaging of each thyroid nodule blind to pathology.

It is better to have the US images evaluated by an odd number of experts rather than by two. If the two decisions conflict, how is the conflict resolved?
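The point about an odd number of raters can be illustrated with a minimal majority-vote sketch. The function name `consensus` and the label strings are hypothetical and are not taken from the manuscript; the sketch only shows why two raters can deadlock while three raters on a binary label cannot.

```python
from collections import Counter

def consensus(labels):
    """Return the majority label, or None when the two most common labels tie."""
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie: the case needs adjudication by another expert
    return counts[0][0]

# Two raters who disagree produce a tie.
print(consensus(["benign", "malignant"]))
# A third rater breaks the tie on a binary label.
print(consensus(["benign", "malignant", "malignant"]))
```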

Line 133 - What is Db?

Line 143, Equation 3 - The subscripts need to be written properly in the equations.

Explain the variables used in each equation: what does each variable represent?

Equation 5 needs to be double-checked; please provide a reference. The equation for SVM is not correct; please explain this equation as well.

Line 171 - The text mentions that the features were extracted. How were the features extracted? The method details are missing; the feature extraction technique needs to be defined.

Line 183 - The data used for training and testing comprise 700 and 479 samples, a split of roughly 60-40. Why was this ratio used? Did the authors try other data-splitting ratios? If yes, please include the results.
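As a quick check of the ratio quoted above, the following sketch computes the training fraction implied by 700 and 479 samples and the sizes that other common ratios would give for the same total. The helper `split_sizes` and the candidate fractions are illustrative assumptions, not part of the manuscript.

```python
def split_sizes(n_samples, train_fraction):
    """Return (n_train, n_test) for a holdout split with the given training fraction."""
    n_train = round(n_samples * train_fraction)
    return n_train, n_samples - n_train

total = 700 + 479  # sample counts quoted from Line 183 of the manuscript
print(f"training fraction: {700 / total:.3f}")

# Sizes that alternative ratios would produce for the same total.
for fraction in (0.6, 0.7, 0.8):
    print(fraction, split_sizes(total, fraction))
```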

How was optimization performed? Algorithms such as SVM, RF, and AB have parameters whose values have a large impact on the performance of the algorithm.
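One reportable way to tune such parameters is an exhaustive grid search. The sketch below is generic and assumes a user-supplied `evaluate` function that returns a validation score for a parameter setting; the parameter names `C` and `gamma` and the toy scoring function are placeholders, not the authors' procedure.

```python
from itertools import product

def grid_search(evaluate, grid):
    """Score every parameter combination in `grid`; return (best_params, best_score)."""
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)  # e.g. mean cross-validated accuracy
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy stand-in for a validation score; a real study would train and validate a model here.
def toy_score(params):
    return -((params["C"] - 1) ** 2) - (params["gamma"] - 0.1) ** 2

best_params, best_score = grid_search(toy_score, {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]})
print(best_params, best_score)
```

Reporting the grid and the selection criterion alongside the results would let readers reproduce the chosen settings.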

The number of samples in the dataset given in the experimental results section differs from the number given in Line 183.

Earlier, the authors describe a holdout method for data splitting, whereas Line 218 mentions 10-fold cross-validation.
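For readers comparing the two protocols, the following sketch shows how 10-fold cross-validation partitions sample indices, in contrast to a single holdout split: every sample is used for testing exactly once. The function name `k_fold_indices`, the seed, and the sample count are assumptions made for illustration only.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

splits = list(k_fold_indices(1179, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```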

The results section of the paper is very confusing. No clear findings can be drawn from the current explanation of the results. The figures were not created properly.

The evaluation measures need to be defined in the methodology section. 

The structure of the results tables is inconsistent.

Why have deep learning models not been used?

The results of the proposed models suggest that they suffer from overfitting.

 

Author Response

Dear Reviewer 1, please find our corrections in the attachment. Thank you for your useful comments.

Author Response File: Author Response.docx

Reviewer 2 Report

In this manuscript, the authors investigate thyroid evaluation reports, obtained through ultrasound imaging techniques, adopting different machine-learning algorithms with current classification systems. However, the manuscript seems to have the following limitations:

(i) The authors should list the contributions point by point so that readers can easily understand them.

(ii) The authors should include a paragraph at the end of the "Introduction" section describing the organization of the work.

(iii) The authors should add a literature survey section to include the state-of-the-art literature of the domain.

(iv) The authors should include a comparison with others' work, especially on classification performance.

Author Response

Dear Reviewer 2,

Thank you for your useful comments. Please find our corrections in the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

The study uses several measures that are not defined in the performance evaluation, such as MAE, RMS, the kappa measure, and the ROC measure. Make sure to define all the measures. Furthermore, for measures such as accuracy, error rate, precision, and recall, only equations are given, without any description, and the variables used in the equations are not defined. The authors also need to explain why MAE and RMS are used, since they are widely used measures for regression, not classification.
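The kind of definition being requested can be sketched as follows, using the standard textbook formulas for a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical; the manuscript's own definitions and notation may differ.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute common classification measures from binary confusion-matrix counts.

    tp: true positives, fp: false positives, fn: false negatives, tn: true negatives.
    Assumes every denominator below is non-zero.
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n          # share of all cases classified correctly
    error_rate = 1 - accuracy         # share of all cases classified incorrectly
    precision = tp / (tp + fp)        # share of predicted positives that are correct
    recall = tp / (tp + fn)           # share of actual positives that are found
    # Cohen's kappa: agreement beyond what the row and column totals give by chance.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (accuracy - p_expected) / (1 - p_expected)
    return {
        "accuracy": accuracy,
        "error_rate": error_rate,
        "precision": precision,
        "recall": recall,
        "kappa": kappa,
    }

example = binary_metrics(tp=40, fp=10, fn=5, tn=45)  # made-up counts
print(example)
```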

Why are two different citation styles used in the article, as in (Stenzel & Stenzel, 2010) [29]?

The authors use the terms "smart system" and "intelligent system" interchangeably. Why?

The authors fail to describe how the optimization was performed.

The comment related to the model overfitting has not been addressed properly. 

The confusion matrices are not created properly and need to be corrected.

Author Response

Response to Reviewer Reports

 

Manuscript Number: applsci-1985421

 

Journal: Applied Sciences

 

Title: Reducing operation costs thanks to machine learning algorithms of thyroid nodules classification systems: A case example

 

 

Dear Reviewer,

Thank you for your useful comments and suggestions on our manuscript. The corrections made in response to the revisions are listed below.

 

 

Note: All revisions are shown in red in the "Corrections on manuscript" file.

 

Reviewer #1

The study uses several measures that are not defined in the performance evaluation, such as MAE, RMS, the kappa measure, and the ROC measure. Make sure to define all the measures. Furthermore, for measures such as accuracy, error rate, precision, and recall, only equations are given, without any description, and the variables used in the equations are not defined. The authors also need to explain why MAE and RMS are used, since they are widely used measures for regression, not classification.

Thank you for your useful comments and suggestions on our manuscript.

 

All definitions are now explained. As you indicated, the MAE, RMS, and kappa measures are used in regression problems. We have deleted these criteria, and all remaining criteria are defined and explained.

Why are two different citation styles used in the article, as in (Stenzel & Stenzel, 2010) [29]?

Thank you for your useful comments and suggestions on our manuscript.

 

They are all corrected.

The authors use the terms "smart system" and "intelligent system" interchangeably. Why?

Thank you for your useful comments and suggestions on our manuscript.

The phrase "smart systems" has been replaced with "intelligent systems".

The authors fail to describe how the optimization was performed.

Thank you for your useful comments and suggestions on our manuscript.

 

For all experiments, the optimization methods and hyperparameters were determined by trial and error.

The comment related to the model overfitting has not been addressed properly. 

Thank you for your useful comments and suggestions on our manuscript.

 

We apologize for our response in the major revision. We now understand what you meant.

To show that the model does not overfit, the EU-TIRADS scoring system, which achieved the best results, was selected for this experiment. In this experiment, 80% of the whole dataset (2609 of 3261) was selected randomly for training the model, and the remaining 20% (652 of 3261), which was not used in the training process, was used for testing the model. Random Forest, J48 DT, AdaBoost, and SVM models were used to perform this experiment. The results revealed that Random Forest yielded the top accuracy rate, 98.4663%, as shown in Table 17. Table 18 presents the confusion matrix generated for the Random Forest classifier, and Table 19 presents the evaluation metrics for this classifier. As can be seen from the experimental results, the external validation test results provide evidence of the generalization capability of the models and support the validity and reliability of the results obtained with the 10-fold cross-validation test. In addition, these results show that the proposed models can handle the overfitting problem.
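The split described above can be sketched as a random partition of sample indices. The helper `holdout_split` and the seed are assumptions, not the authors' code. The sketch also checks that the reported accuracy is consistent with 642 of 652 test samples being classified correctly; that count is inferred from the percentage and is not stated in the response.

```python
import random

def holdout_split(n_samples, n_test, seed=0):
    """Shuffle indices and return (train_indices, test_indices)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = holdout_split(3261, 652)
print(len(train_idx), len(test_idx))

# Number of correct test predictions implied by the reported accuracy (inferred).
n_correct = round(0.984663 * 652)
print(n_correct, round(100 * n_correct / 652, 4))
```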

Note: Tables 17, 18, and 19 have been added to the manuscript.

The confusion matrices are not created properly and need to be corrected.

Thank you for your useful comments and suggestions on our manuscript.

 

They are corrected.

 

Author Response File: Author Response.docx

Reviewer 2 Report

Thanks to the authors for resolving the raised issues.

Author Response

Thank you for your useful comments, contributions and suggestions on our manuscript. 

Best regards
