Article
Peer-Review Record

Comparison of the Effectiveness of Various Classifiers for Breast Cancer Detection Using Data Mining Methods

Appl. Sci. 2023, 13(21), 12012; https://doi.org/10.3390/app132112012
by Noor Kamal Al-Qazzaz 1,*, Iyden Kamil Mohammed 1, Halah Kamal Al-Qazzaz 2, Sawal Hamid Bin Mohd Ali 3,4 and Siti Anom Ahmad 5,6
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3:
Reviewer 4: Anonymous
Submission received: 28 June 2023 / Revised: 12 September 2023 / Accepted: 14 September 2023 / Published: 3 November 2023

Round 1

Reviewer 1 Report

The manuscript describes a classification approach for breast cancer detection. Unfortunately, the paper is affected by some major flaws that make it unpublishable in its current form.

  1. It is not clear what exactly the raw input features are, and whether the 10 derived features are 10 overall or 10 for each of the measured hormones. Most importantly, I am concerned with the collinearity within this set of derived features. For example, the mean is perfectly correlated with the summation, so keeping both is redundant.

  2. It is also not clear why dimensionality reduction would be beneficial or even needed in this situation, where the number of samples is actually larger than the number of features, whereas the mentioned “curse of dimensionality” refers to the opposite situation. The Authors also do not mention the number of dimensions in the reduced space.

  3. There is noticeable confusion on the machine learning terminology: Relief is not a supervised learning algorithm (see Introduction) but rather a feature ranking algorithm. Besides, the Authors state that “Relief integration in SVM is an effective method for identifying BC” but do not explain or mention such integration in Materials and Methods.

  4. The explanation about how hyperparameter tuning for SVM and kNN was performed is confusing and should be better clarified.

  5. 10-fold CV on this data set would leave 6 samples for validation, a number insufficient to assess the model’s effectiveness and level of generalization. Moreover, it is not stated whether the class proportions were maintained in the CV splits, which would be crucial.

  6. The analysis lacks proper validation on an independent test set. Only the performance in cross-validation is reported. It is not clear on which data the confusion matrixes were computed.

  7. Accuracy and confusion matrices alone are not enough to robustly assess the classification performance: they should be complemented by other metrics, such as precision, recall, F1-score, AUC.

 

It is not clear what the difference between Figures 2 and 3 should be (the captions are identical).

 

The statement “the blood and saliva of 60 women with histologically confirmed BC and 20 age-matched control women” is misleading since there are 60 samples overall (Table 1) and not 60+20.

 

In Eqns 1-10, all notation should be properly introduced and explained (e.g., m2, m3, m4, f1, f0, f2...). I suggest not displaying formulas for trivial statistics (e.g., mean, sum, min, max…). In case the mode returned multiple values, how did the Authors deal with them?

 

In the current manuscript, the Authors mention “brain diseases” (Section 2.4) and “the effects of dementia recognition” (same), clearly leftovers from other papers by the Authors in different areas.

 

Stochastic Neighbor Embedding (SNE) is not the proper name of the algorithm: it should be t-distributed Stochastic Neighbor Embedding (t-SNE).

There are several issues with the English language: e.g., punctuation, grammar, and informal expressions.

Author Response

We have carefully revised the manuscript following the Reviewers’ comments. We considered and addressed each one of their concerns and remarks.

Major changes are highlighted in yellow in the revised manuscript. Additionally, pieces of text that have been included in the revised manuscript to address the Reviewers’ comments appear in this response document typed in Italic font.

We really appreciate the Reviewer’s effort in revising our study. We have considered your comments thoroughly regarding the writing aspect. The English language and typos have been revised.

 

We are grateful for the feedback provided by the Editor in Chief, Associate Editor, and Reviewers. Their remarks and suggestions helped us to improve the manuscript significantly. We hope that the revised version of the study has addressed all your concerns and will be considered a contribution of interest to the readership of “Applied Science Journal.”

For your convenience, a list of responses to the Reviewers’ remarks is included below.

 

Author Response File: Author Response.pdf

Reviewer 2 Report

I had the opportunity to review this paper, which is interesting for the development of diagnostic tests for breast cancer.

Given that the target journal is not specified, the introduction may not be suitable; for a medical journal, for example, it is written in a way that is neither suitable nor correct.

In the methods section, some basic statistical formulas should be avoided; bioengineers, biotechnologists, and doctors all know how the mean and standard deviation are calculated.

Results are not clear. Table 1 is not sufficient. The mosaic plot should be better labelled.

The discussion requires a proper section to discuss the results and compare their accuracy with other AI methods or other frequentist methods.

Looking at the bibliography, I think there is too much self-citation; it would be better to compare results with other authors.

 

Minor revision; a few typos or wording issues.

Author Response

We have carefully revised the manuscript following the Reviewers’ comments. We considered and addressed each one of their concerns and remarks.

Major changes are highlighted in yellow in the revised manuscript. Additionally, pieces of text that have been included in the revised manuscript to address the Reviewers’ comments appear in this response document typed in Italic font.

We really appreciate the Reviewer’s effort in revising our study. We have considered your comments thoroughly regarding the writing aspect. The English language and typos have been revised.

 

We are grateful for the feedback provided by the Editor in Chief, Associate Editor, and Reviewers. Their remarks and suggestions helped us to improve the manuscript significantly. We hope that the revised version of the study has addressed all your concerns and will be considered a contribution of interest to the readership of “Applied Science Journal.”

For your convenience, a list of responses to the Reviewers’ remarks is included below.

 

Author Response File: Author Response.pdf

Reviewer 3 Report

the authors propose a comparative study to evaluate the blood and salivary levels of prolactin (P), testosterone (T), cortisol (C), and human chorionic gonadotropin (HCG) between breast cancer patients and healthy control subjects using ten statistical features with and without implementing dimensionality reduction techniques to discriminate the BC severity using data mining techniques.

My remarks to improve this paper:

1/ Describe the problem and your contribution in more detail in the introduction section.

2/ Improve the figure of your approach (Figure 1).

3/ A related-work section is very important for a comparative study paper.

4/ Subsections 2.3.1 and 2.3.2: in these two sections you describe the two techniques theoretically but do not show how you used them in your approach; describe in more detail how these two techniques were applied.

5/ Describe the results in more detail and, in particular, add a discussion section to explain why the results vary from 45 to 100.

6/ In the conclusion section, add some future work.

Author Response

Reviewer 3 Comments and Authors’ Response

The authors propose a comparative study to evaluate the blood and salivary levels of prolactin (P), testosterone (T), cortisol (C), and human chorionic gonadotropin (HCG) between breast cancer patients and healthy control subjects using ten statistical features with and without implementing dimensionality reduction techniques to discriminate the BC severity using data mining techniques. My remarks to improve this paper:

1.    Reviewer 3 Comments

Try to describe more your problem and your contribution in the introduction section

Ø Authors’ Response

We thank the Reviewer for this comment. Section 1 (Introduction) has been updated to describe the problem and our contribution in more detail. In the new version of the paper, major changes are highlighted in yellow.

“The use of images obtained from mammography and MRI for the purpose of diagnosing breast cancer offers a number of obstacles [5]. Although these imaging methods are essential for early diagnosis, they also come with a variety of drawbacks, such as the fact that mammography, despite its widespread application, may have limitations in terms of its sensitivity [8], particularly in women who have breast tissue that is dense. The MRI has a high sensitivity, but due to the fact that it is particularly sensitive to benign lesions, it can also lead to false positives [8]. Both mammography and MRI have the potential to provide false positive results, which can result in patients undergoing needless biopsies and experiencing emotional discomfort [9]. An overdiagnosis, in which benign tumours are identified and treated as dangerous, might result in overtreatment [9]. In addition, highly trained radiologists are required to appropriately interpret the results of mammograms and MRIs [10].

To prevent errors in diagnosis as much as possible, training and experience are absolutely necessary. In addition, Cost and Accessibility are important considerations because an MRI scan can be rather pricey and may not be readily available to all individuals [8]. Because of this, its extensive usage in routine screening is restricted. In terms of Data Variability, the Variability in image quality, location, and patient characteristics might affect the accuracy of both MRI-based detection methods and mammography-based detection methods [10-12]. In addition, image analysis tools have helped radiologists in image interpretation and BC detection, as these systems can be more successful and trustworthy in their diagnoses because they make use of artificial intelligence to increase the accuracy of BC detection [5]. However, there are computational challenges to consider, the use of artificial intelligence for the analysis of mammography and MRI images necessitates the collection of vast datasets for the purpose of training robust models; also, there is a concern regarding the models' reliability and generalization [13].

 Because blood and saliva hormones have the potential to serve as non-invasive and easily accessible biomarkers, using them in diagnostic laboratory testing can provide useful information regarding the diagnosis of breast cancer. Blood testing makes early diagnosis possible, which is helpful for people who are at higher risk. Biomarkers that can be found in saliva provide a patient-friendly way of collecting specimens, which enables the detection of cancer signals. Through the use of blood or saliva samples, laboratory tests determine hereditary risks, which in turn indicate hormone receptor status, which in turn influences treatment selections. This novel technique improves the ability to diagnose breast cancer in its earlier stages. Researchers choose drugs by monitoring changes in blood and plasma proteins, which can be found in the body. Methods for detecting breast cancer that are not intrusive and focused on the patient, such as using blood and saliva hormones, have been shown to be effective.”

“Dimensionality reduction using the t-SNE method was also used in this investigation. FA and t-SNE were used in this study for the first time to differentiate between mild and severe cases of BC.

The purpose of using t-SNE is to preserve the local associations between points, which appear intuitively to be clustering rather than unrolling [15]. This is the motivation behind the usage of t-SNE: it is a nonlinear dimensionality reduction strategy that focuses on keeping the structure of neighbour points [16]. It provides somewhat different outcomes each time on the same data set, but these results are all focused on retaining the same information.

Cancer is just one example of a disease where symptoms don’t appear until the disease has progressed to a later stage, at which point it is usually too late to treat effectively, making early diagnosis crucial. Laboratory-based methods, such as blood and saliva tests with machine learning (ML) methods, are highly suitable for the detection of BC because they circumvent the difficulties of invasive methods. Since blood tests are now often used to diagnose a variety of minor conditions, it would be great to be able to utilise them to diagnose major diseases like cancer. Therefore, proper treatment and a full recovery depend on detecting BC at an early stage. Our objective was to conduct a comparative study to evaluate the blood and salivary levels of prolactin (P), testosterone (T), cortisol (C), and human chorionic gonadotropin (HCG) between breast cancer patients and healthy control subjects using ten statistical features with and without implementing dimensionality reduction techniques to discriminate the BC severity using data mining techniques.”

2.    Reviewer 3 Comments

Improve the figure of your approach (Figure 1).

Ø Authors’ Response

We appreciate the Reviewer’s concern. Figure 1 has been improved accordingly.

3.    Reviewer 3 Comments

A section (related work) is very important for a comparative study paper.

Ø Authors’ Response

We have carefully considered this comment and we have added Section 2. Related Works. In the new version of the paper, major changes are highlighted in yellow.

4.    Reviewer 3 Comments

Subsections 2.3.1 and 2.3.2: in these two sections you theoretically describe the two techniques but you did not show how you used them in your approach, try to describe more how you used these two techniques in your approach.

Ø Authors’ Response

We really appreciate the Reviewer’s effort in revising our study. We have revised the manuscript to provide the needed information and have updated Subsections 3.3.1 (Factor Analysis, FA) and 3.3.2 (t-Distributed Stochastic Neighbor Embedding, t-SNE), respectively. In the new version of the paper, major changes are highlighted in yellow.

“When applied to the BC dataset, factor analysis will assist in identifying the underlying components that explain the apparent variance in the data. Factor analysis is a statistical method for discovering latent variables that explain the correlations between observed variables. It is also one way to reduce the dimensionality of the data, which can be helpful in evaluating which elements of the BC data are the most significant [31].”

“This is accomplished by applying the t-SNE approach to the factor scores produced by the factor analysis. t-SNE is a non-linear dimensionality reduction method used to visualise high-dimensional data in a low-dimensional space, which is feasible because t-SNE employs sparse networks to represent the data. Mapping the factor scores to a two- or three-dimensional space allows t-SNE to assist in locating patterns and clusters within the BC data.”
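For illustration, the sketch below chains factor analysis and t-SNE on a placeholder feature matrix, as the passage above describes. It is only a minimal sketch under assumed settings (scikit-learn, ten factors, a 2-D embedding, random placeholder data); it is not the authors' actual implementation.

# Sketch only: FA followed by t-SNE, assuming scikit-learn and a placeholder
# (60 x 40) feature matrix; the number of factors and perplexity are illustrative.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 40))           # placeholder for the hormone feature matrix

fa = FactorAnalysis(n_components=10, random_state=0)
factor_scores = fa.fit_transform(X)     # latent factors explaining the correlations

# t-SNE maps the factor scores to 2-D while preserving local neighbourhoods
embedding = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(factor_scores)
print(embedding.shape)                  # (60, 2)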

5.    Reviewer 3 Comments

Describe the results in more detail and, in particular, add a discussion section to explain why the results vary from 45 to 100.

Ø Authors’ Response

We appreciate the Reviewer’s comment about our study. We have added Section 5. Discussion to the manuscript. In the new version of the paper, major changes are highlighted in yellow.

“This is the first study to demonstrate that dimensionality reduction approaches using FA and t-SNE may be used to distinguish between mild and severe breast cancer subjects as well as control women. Even though blood and saliva are not yet widely used as sample sources for hormone analysis, they are a great research tool that provides a non-invasive and stress-free alternative to plasma and serum, which makes using them an appealing option. Saliva has been demonstrated to be trustworthy and, in some situations, superior to other body fluids by displaying a very close association with free testosterone levels in serum [43]; this has led to its recognition as an extremely useful diagnostic tool. Saliva also has several benefits over blood as a sampling medium, including that it may be easily collected by the participants themselves at regular intervals and that it does not require any specialised equipment for collection or storage [44,45]. Hormones such as serum testosterone have a low level of reliability in the low ranges observed in normal women [46], and their levels can vary greatly depending on genetic, metabolic, and endocrine effects [47]. The concept that measurements of free or bioavailable testosterone predict androgenic effects more precisely than total testosterone levels has recently gained widespread acceptance. To ensure the reliability of our research, the 10 generated features correspond to 10 characteristics for each of the hormones that were analysed. We used all of the available statistical features of the independent variables in order to obtain a broad and comprehensive perspective from the blood and saliva and to classify the severity of BC into normal, benign, and malignant cases, despite the collinearity that may exist within this set of derived features and the redundancy that arises from using both the mean and the summation. The discovery that two or more independent variables are correlated implies that shifts in one variable are related to changes in another. These modifications allow us to increase the size of the feature set, which is beneficial given that we want to apply methods that reduce the dimensionality of the data. As a result, the accuracy of the classifiers varied: for salivary biomarkers the results improved, especially with DT, with classification accuracy increasing from 66.67% to 93.3% and 90% when using t-SNE and FA, respectively; for blood biomarkers the results also improved, particularly with DT, with classification accuracy increasing from 60% to 80% and 93.3% when using FA and t-SNE, respectively.”

 

 

6.    Reviewer 3 Comments

In the conclusion section, add some future work.

Ø Authors’ Response

We are thankful for the Reviewer’s comment about our study. Section 5 (Conclusion) has been updated by adding some future work. In the new version of the paper, major changes are highlighted in yellow.

“Blood tests are going to be integrated with the help of tumour-associated circulating tumour cells (TACTs). This would make it feasible to detect breast cancer at an earlier stage, particularly in people at high risk of developing the disease. In addition, the datasets could be expanded, and the applicability of the recommended approach to additional datasets could be investigated.”


We appreciate the Reviewer’s effort in revising our manuscript. We believe that the Reviewer’s comments have helped us to improve and clarify our study and we hope that the revised paper has addressed all your concerns.

Author Response File: Author Response.docx

Reviewer 4 Report

This article describes a procedure for detecting breast cancer based on blood hormone levels, including prolactin, testosterone, cortisol, and human chorionic gonadotropin (HCG). A multi-classification task is performed using three machine learning classifiers, namely decision trees (DT), support vector machines (SVM), and K-nearest neighbors (KNN). The authors propose using feature extraction to reduce the dimensionality of the data and compare two techniques, factor analysis (FA) and t-stochastic neighbor embedding (t-SNE). Various metrics are used to evaluate the efficacy of the three machine learning models, and the results are reported.

The paper is generally well-written and somewhat structured. With their permission, the authors obtained the data from Elwiya Oncology Teaching Hospital and the Center for Early BC Detection.

There was an effort made to collect the data. The idea proposed by the authors is interesting but not novel, as a 2010 study proposed using prolactin as a biomarker for early detection of BC. (See for instance “Prolactin as a biomarker for early detection of breast cancer among high-risk women” by Tworoger et al, published in Cancer Prevention Research in 2010. This paper investigated the association between serum prolactin levels and breast cancer risk among women with a family history of breast cancer or BRCA1/2 mutations.)

The paper requires substantial revisions before it can be published.

- Numerous studies have investigated the early detection of breast cancer. Consequently, a section reviewing the relevant literature and identifying the research gap is required.

- As a consequence, more appropriate references need to be included.

- The authors clearly defined the collection process. However, it is not clear what the features in the data set refer to. Four kinds of hormones are considered (P, T, C and HCG) and many statistics have been gathered. The size of the collected data is 60 × 10, where 60 refers to the number of patients and 10 to the considered statistics. It is not clear to me what the features in the data set refer to, considering P, T, C and HCG.

 

- The authors used feature extraction as a method for reducing dimensionality, but did not provide a rationale for this approach. Moreover, the selection of FA and t-SNE lacks justification. The authors are expected to justify their choices. What are the reasons for not using feature selection? Interpreting and explaining the outputs of classifiers is facilitated by the use of selected features rather than extracted features.

- The authors' primary emphasis was on evaluating the classifiers using performance measures, without providing any elucidation on the influence of features on the classification process. The concept of interpretability or explainability has significant importance within the realm of these types of investigations.

- t-SNE can produce different results for the same data and parameters depending on the initial state and random factors. The authors did not explain how they dealt with this issue to obtain the optimal result.

- In accordance with academic standards, it is necessary to include a suitable reference to support the assertions made on page 2, lines 49-50 and 59-60.

- Regarding the claim in lines 59-60: t-SNE is computationally expensive compared with other feature extraction techniques such as PCA, so what motivated its use?

- On page 6, line 210, the claim made is not sufficiently clear or suitable. The statement suggests that the Decision Tree (DT) has a structure that closely resembles that of a flowchart. However, it is important to note that the DT, being an algorithm, may indeed be effectively represented using a flowchart.

- According to the information provided on page 6, line 215, it is seen that there are three distinct groups. Consequently, a leaf node in this context may be categorized as either malignant, benign, or normal, as opposed to only being classified as present or missing. 

- The meaning and description of the term "50" (DT parameter) should be provided on page 6, line 228.

- The paragraph located on page 3, namely lines 107-109, requires rephrasing in order to conform to the conventions of academic writing. The use of several biomarkers such as P, T, C, and others for the purpose of cancer detection is not a novel concept.

- It is necessary to provide the implementation environment and parameter settings for each classifier.

- It is also necessary to discuss the results given by the confusion matrices.

- A typo on line 140: "Blood" instead of "bloob".

 

 

 

Author Response

Reviewer 4 Comments and Authors’ Response

This article describes a procedure for detecting breast cancer based on blood hormone levels, including prolactin, testosterone, cortisol, and human chorionic gonadotropin (HCG). A multi-classification task is performed using three machine learning classifiers, namely decision trees (DT), support vector machines (SVM), and K-nearest neighbors (KNN). The authors propose using feature extraction to reduce the dimensionality of the data and compare two techniques, factor analysis (FA) and t-stochastic neighbor embedding (t-SNE). Various metrics are used to evaluate the efficacy of the three machine learning models, and the results are reported.

The paper is generally well-written and somewhat structured. With their permission, the authors obtained the data from Elwiya Oncology Teaching Hospital and the Center for Early BC Detection.

There was an effort made to collect the data. The idea proposed by the authors is interesting but not novel, as a 2010 study proposed using prolactin as a biomarker for early detection of BC. (See for instance “Prolactin as a biomarker for early detection of breast cancer among high-risk women” by Tworoger et al, published in Cancer Prevention Research in 2010. This paper investigated the association between serum prolactin levels and breast cancer risk among women with a family history of breast cancer or BRCA1/2 mutations.)

1.    Reviewer 4 Comments

Numerous studies have investigated the early detection of breast cancer. Consequently, a section reviewing the relevant literature and identifying the research gap is required. As a consequence, more appropriate references need to be included.

Ø Authors’ Response

We really appreciate the Reviewer’s effort in revising our study. We have carefully considered all your comments and we have updated Section 1. Introduction by identifying the research gap. We have added Section 2. Related Works, therefore, more appropriate references have been added. In the new version of the paper, major changes are highlighted in yellow.

“The use of images obtained from mammography and MRI for the purpose of diagnosing breast cancer offers a number of obstacles [5]. Although these imaging methods are essential for early diagnosis, they also come with a variety of drawbacks, such as the fact that mammography, despite its widespread application, may have limitations in terms of its sensitivity [8], particularly in women who have breast tissue that is dense. The MRI has a high sensitivity, but due to the fact that it is particularly sensitive to benign lesions, it can also lead to false positives [8]. Both mammography and MRI have the potential to provide false positive results, which can result in patients undergoing needless biopsies and experiencing emotional discomfort [9]. An overdiagnosis, in which benign tumours are identified and treated as dangerous, might result in overtreatment [9]. In addition, highly trained radiologists are required to appropriately interpret the results of mammograms and MRIs [10].

To prevent errors in diagnosis as much as possible, training and experience are absolutely necessary. In addition, Cost and Accessibility are important considerations because an MRI scan can be rather pricey and may not be readily available to all individuals [8]. Because of this, its extensive usage in routine screening is restricted. In terms of Data Variability, the Variability in image quality, location, and patient characteristics might affect the accuracy of both MRI-based detection methods and mammography-based detection methods [10-12]. In addition, image analysis tools have helped radiologists in image interpretation and BC detection, as these systems can be more successful and trustworthy in their diagnoses because they make use of artificial intelligence to increase the accuracy of BC detection [5]. However, there are computational challenges to consider, the use of artificial intelligence for the analysis of mammography and MRI images necessitates the collection of vast datasets for the purpose of training robust models; also, there is a concern regarding the models' reliability and generalization [13].”

2.    Reviewer 4 Comments

The authors clearly defined the collection process. However, it is not clear what the features in the data set refer to. Four kinds of hormones are considered (P, T, C and HCG) and many statistics have been gathered. The size of the collected data is 60 × 10, where 60 refers to the number of patients and 10 to the considered statistics. It is not clear to me what the features in the data set refer to, considering P, T, C and HCG.

Ø Authors’ Response

We really appreciate the Reviewer’s comments. The obtained data are 60 × 10 per hormone, where 60 is the number of patients and 10 is the number of considered statistics, and the features in the data set were organised for each type of hormone. The overall feature set in the data set was therefore 60 × 40 when P, T, C, and HCG were taken into account.

3.    Reviewer 4 Comments

The authors used feature extraction as a method for reducing dimensionality but did not provide a rationale for this approach. Moreover, the selection of FA and t-SNE lacks justification. The authors are expected to justify their choices. What are the reasons for not using feature selection? Interpreting and explaining the outputs of classifiers is facilitated by the use of selected features rather than extracted features.

Ø Authors’ Response

We appreciate the Reviewer’s concern. Dimensionality reduction is the process of decreasing the number of columns in a feature set. We could just as easily feed our dataset into the machine learning algorithm in its original high-dimensional form. However, the number of samples needed to ensure that all possible combinations of feature values are adequately represented grows with the number of features; as a result, the complexity of the model increases as the number of features increases, a phenomenon known as the curse of dimensionality. Overfitting is also more likely when there are more features: a model that becomes overly reliant on the training data can give subpar results when applied to new data. One of the main draws of dimensionality reduction is its ability to prevent overfitting, but there are other benefits as well: with less noise in the data, models can be more precise; fewer dimensions require less processing power; and with less information, algorithms can be trained more quickly. The number of dimensions in the reduced space was ( ).

Dimensionality reduction approaches such as factor analysis and t-SNE can reduce the number of features in a dataset without destroying its structure. Factor analysis, a linear dimensionality reduction approach, identifies fundamental factors that underlie the variation in the data; the factors are usually uncorrelated and can be used to reconstruct the original data. t-SNE reduces dimensions non-linearly while preserving the local organisation of the data, by representing the data in fewer dimensions while keeping similar distances between points.

4.     Reviewer 4 Comments

The authors' primary emphasis was on evaluating the classifiers using performance measures, without providing any elucidation on the influence of features on the classification process. The concept of interpretability or explainability has significant importance within the realm of these types of investigations.

Ø Authors’ Response

We are thankful for the Reviewer’s comment. We have illustrated the effects of the features on the classification process by evaluating the classifiers using performance metrics; interpretability and explainability are indeed crucial in machine learning and data analysis. Interpretability in machine learning means understanding and explaining how a model or classifier makes decisions, especially in difficult tasks such as categorisation. The following text has been added to Section 4 (Results). In the new version of the paper, major changes are highlighted in yellow.

“Learning patterns from data, feature extraction, and dimensionality reduction let a model or classifier make judgements, especially in complicated tasks such as BC categorisation. This study trains DT, SVM, and KNN models on labelled data; the model learns feature-label patterns during training. Model performance is assessed robustly via cross-validation. Finally, the model is evaluated on an unseen dataset to determine its real-world performance; this stage determines whether the model overfits (learns noise in the training data) or underfits (fails to capture patterns). Overfitting can be prevented by splitting the dataset into several smaller subsets (folds), keeping one fold aside for testing, and training the model on the remaining four folds. This procedure is repeated five times, and the model’s efficacy on the unseen fold is assessed using the selected metrics (precision, sensitivity, specificity, accuracy and F1-score). Finally, a single performance score for the model is calculated by averaging the performance of the chosen metric over all 5 iterations.”
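As an illustration of the evaluation loop described above (stratified folds, per-fold metrics, averaged scores), here is a minimal scikit-learn sketch on placeholder data; the classifier choice, the macro-averaged metrics, and the synthetic labels are assumptions, not the authors' exact code.

# Illustrative 5-fold evaluation loop: metrics are computed on the held-out
# fold and averaged at the end. Data and classifier are placeholders.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 40))
y = np.repeat([0, 1, 2], 20)             # normal / benign / malignant labels

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {"accuracy": [], "precision": [], "recall": [], "f1": []}
for train_idx, test_idx in skf.split(X, y):
    clf = DecisionTreeClassifier(random_state=0).fit(X[train_idx], y[train_idx])
    pred = clf.predict(X[test_idx])
    scores["accuracy"].append(accuracy_score(y[test_idx], pred))
    scores["precision"].append(precision_score(y[test_idx], pred, average="macro", zero_division=0))
    scores["recall"].append(recall_score(y[test_idx], pred, average="macro", zero_division=0))
    scores["f1"].append(f1_score(y[test_idx], pred, average="macro", zero_division=0))

print({name: np.mean(vals) for name, vals in scores.items()})   # averaged over the 5 folds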

5.    Reviewer 4 Comments

t-SNE can produce different results for the same data and parameters depending on the initial state and random factors. The authors did not explain how they dealt with this issue to get the optimal result? 

Ø Authors’ Response

We are grateful to the Reviewer for providing valuable feedback regarding our work. We have carefully considered all your comments and we have updated Section 4. Results. In the new version of the paper, major changes are highlighted in yellow.

The authors of t-SNE conceded that, depending on the starting point and other random factors, the algorithm may return different results despite using the same data and parameters. Thus, we have used a fixed random seed to guarantee that the process always begins in the same place. Thirty iterations were performed. As a result, the algorithm should eventually reach a more robust answer.

To address the reproducibility problem of t-SNE, it is recommended to run the algorithm 5 times with different random seeds and then average the results, so that classification results are not skewed by overfitting. To further analyse the data, they were split into 5 independent subsets of similar size. We used one of these subsets as our test data and the other four to train our classifier; five iterations of this process yielded five results. The accuracy of this dataset's 5-fold CV was determined by averaging these results. This lessens the effect of chance and leads to more consistent outcomes.

The features were first calculated on the training set to create the input feature vector for t-SNE. After that, dimensionality reduction was applied to both the training and testing sets.
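The reproducibility point can be illustrated with a short sketch: fixing the random seed makes repeated t-SNE runs on the same data return the same embedding. The placeholder data, perplexity, and PCA initialisation below are assumptions for demonstration only, not the study's settings.

# Sketch: pinning the random seed so t-SNE yields the same embedding on each
# run over the same data. The feature matrix here is a placeholder.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(42)
features = rng.normal(size=(60, 40))

emb_a = TSNE(n_components=2, perplexity=10, init="pca", random_state=42).fit_transform(features)
emb_b = TSNE(n_components=2, perplexity=10, init="pca", random_state=42).fit_transform(features)
print(np.allclose(emb_a, emb_b))        # same seed and data -> same embedding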

6.    Reviewer 4 Comments

In accordance with academic standards, it is necessary to include a suitable reference to support the assertion made on page 2, lines 49-50 and 59-60. Regarding the claim in lines 59-60, t-SNE is computationally expensive compared to other feature extraction techniques such as PCA. So what motivated its use?

Ø Authors’ Response

We are thankful for the Reviewer’s suggestion. We have updated the manuscript Section 1. Introduction and including suitable references. In the new version of the paper, major changes are highlighted in yellow.

The purpose of using t-SNE is to preserve the local associations between points, which appear intuitively to be clustering rather than unrolling [15]. This is the motivation behind the usage of t-SNE. t-SNE is a nonlinear dimensionality reduction strategy that focuses on keeping the structure of neighbour points [16]. It provides somewhat different outcomes each time on the same data set, but these results are all focused on retaining the same information.”

7.    Reviewer 4 Comments

According to the information provided on page 6, line 215, it is seen that there are three distinct groups. Consequently, a leaf node in this context may be categorized as either malignant, benign, or normal, as opposed to only being classified as present or missing. 

Ø Authors’ Response

We are thankful to the Reviewer for providing insightful criticism of our work. In this study, there were three leaf-node classes, malignant, benign, and normal; they were not merely classified as present or missing. We have updated Section 3.4 (Classification Stage) of the manuscript. In the new version of the paper, major changes are highlighted in yellow.

“The algorithm works its way through the tree branches in a certain order, with each branch determined by the responses to the questions that came before it, until it reaches a leaf node, at which point it indicates to which of the three leaf-node classes (malignant, benign, or normal) the individual belongs [32].”

8.    Reviewer 4 Comments

The meaning and description of the term "50" (DT parameter) should be provided on page 6, line 228.

Ø Authors’ Response

We are appreciative of the feedback provided by the Reviewer. We have updated the manuscript Section 3.4. Classification Stage. In the new version of the paper, major changes are highlighted in yellow.

DT is a supervised learning algorithm that needs labelled data to train; here, labelled data are patients with known diagnoses of normal, benign, or malignant. The DT approach is effective for classification, but to avoid overfitting the parameters must be chosen carefully, and the training set should be representative of the data the tree will classify. The Gini impurity and entropy are the main splitting criteria. The minimum number of samples a node must contain before the tree can split it is 50; this, together with the tree's maximum depth, prevents overfitting.
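A hedged sketch of how such a tree configuration might be expressed in scikit-learn follows. Interpreting the "50" as the minimum number of samples a node must contain before it may be split, and the depth cap of 5, are assumptions for illustration rather than the authors' stated settings.

# Sketch of a decision-tree configuration matching the description above.
# Reading the "50" as scikit-learn's min_samples_split is an assumption.
from sklearn.tree import DecisionTreeClassifier

dt = DecisionTreeClassifier(
    criterion="gini",        # Gini impurity; entropy is the alternative mentioned
    min_samples_split=50,    # a node needs at least 50 samples before it may split
    max_depth=5,             # illustrative cap on depth to limit overfitting
    random_state=0,
)
# dt.fit(X_train, y_train) would then be called on the labelled training folds.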

9.    Reviewer 4 Comments

The paragraph located on page 3, namely lines 107-109, requires rephrasing in order to conform to the conventions of academic writing. The use of several biomarkers such as P, T, C, and others for the purpose of cancer detection is not a novel concept.

Ø Authors’ Response

We are appreciative of the feedback provided by the Reviewer. We went through the entire text to ensure that the quality of the English language was improved.

10.                  Reviewer 4 Comments

It is necessary to provide the implementation environment and parameter settings for each classifier.

Ø Authors’ Response

We are appreciative of the feedback provided by the Reviewer. We have updated the manuscript Section 3.4. Classification Stage. In the new version of the paper, major changes are highlighted in yellow.

“DT is a supervised learning algorithm that needs labelled data to train; here, labelled data are patients with known diagnoses of normal, benign, or malignant. The DT approach is effective for classification, but to avoid overfitting the parameters must be chosen carefully, and the training set should be representative of the data the tree will classify. The Gini impurity and entropy are the main splitting criteria. The minimum number of samples a node must contain before the tree can split it is usually 50; this, together with the tree's maximum depth, prevents overfitting.

By conducting a 10-fold cross-validation to optimise C on the training set, we obtained the best possible results for the SVM classifier. More specifically, the SVMs were trained for C values in the range -4 ≤ log10(C) ≤ 4, i.e., C ∈ {0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000, 10000}. During testing, the result obtained for C = 10 was the most favourable. The RBF kernel function was used to implement the multi-class SVM classifier. During SVM training, the kernel smoothing parameter was chosen based on the lowest misclassification rate obtained on the training dataset, so that the best possible model could be produced. Its best value can only be established by methodically experimenting with different values during training; it was therefore varied between 0.1 and 1 with a step size of 0.1, and the misclassification rate was lowest at a value of 0.5.

In this research, the k value for the kNN classifier was chosen by varying k between 1 and 10 in steps of 1. Maximum classification accuracy was achieved after training the classifier to determine the optimal value of k, which was found to be k = 5. Each experiment was classified using kNN, with the Euclidean distance serving as the similarity metric.”
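The search over C, the kernel smoothing parameter, and k described above could look roughly like the following scikit-learn sketch. Treating the smoothing parameter as the RBF gamma, using GridSearchCV, and the placeholder data are assumptions, not the authors' exact procedure.

# Illustrative hyperparameter search for the SVM and kNN settings described above.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = rng.normal(size=(60, 40)), np.repeat([0, 1, 2], 20)

svm_grid = GridSearchCV(
    SVC(kernel="rbf"),
    {"C": [10.0**p for p in range(-4, 5)],             # 1e-4 ... 1e4
     "gamma": [round(0.1 * i, 1) for i in range(1, 11)]},  # 0.1 ... 1.0 in steps of 0.1
    cv=5,
)
svm_grid.fit(X, y)

knn_grid = GridSearchCV(
    KNeighborsClassifier(metric="euclidean"),
    {"n_neighbors": list(range(1, 11))},                # k = 1 ... 10
    cv=5,
)
knn_grid.fit(X, y)
print(svm_grid.best_params_, knn_grid.best_params_)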

11.                  Reviewer 4 Comments

It is also necessary to discuss the results given by the confusion matrices.

Ø Authors’ Response

We are appreciative of the feedback provided by the Reviewer. We have updated Section 4 (Results) by discussing the confusion matrices in more detail. In the new version of the paper, major changes are highlighted in yellow.

 

12.                  Reviewer 4 Comments

A typo on line 140: "Blood" instead of "bloob".

Ø Authors’ Response

We are appreciative of the feedback provided by the Reviewer. We went through the entire text to ensure that the quality of the English language was improved.


We appreciate the Reviewer’s effort in revising our manuscript. We believe that the Reviewer’s comments have helped us to improve and clarify our study and we hope that the revised paper has addressed all your concerns.

 

 

 

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Thank you for improving the manuscript. Some of the raised points, however, still remain. For example, I was probably unclear in my comment about the 10-fold CV. My point was that, given a 60-sample dataset as input to a 10-fold CV, only 6 of them (1/10) are used for the validation folds (too few to assess the model's generalization robustly). A possible solution would be to use a CV with fewer folds, i.e. 5-fold. Moreover, in your response you state that you split the full dataset into training and testing parts, thus reducing even more the number of samples in the training dataset that undergoes the CV, exacerbating the above problem. I am concerned about the sample size (60) not being large enough to allow a train/test split and properly evaluate the generalization of the machine learning models (in this case, an independent test set would be more suitable). By the way, what percentage of data was used as training set? This should also be specified in the Methods. The following points were not addressed in your response:

- "it is not stated whether the class proportions were maintained in the CV splits, which would be crucial."
- "Only the performance in cross-validation is reported. It is not clear on which data the confusion matrixes were computed."

The confusion matrixes, expressed in percentages, could be improved by also adding the absolute numbers.

Some minor edits are needed in the added/modified parts of the revised manuscript (e.g. "The kNN algorithm is widely. Considered", "It assigns labels, to samples", etc.)

Author Response

1.   Reviewer 1 Comments and Authors’ Response

1.    Reviewer 1 Comments

Thank you for improving the manuscript. Some of the raised points, however, still remain. For example, I was probably unclear in my comment about the 10-fold CV. My point was that, given a 60-sample dataset as input to a 10-fold CV, only 6 of them (1/10) are used for the validation folds (too few to assess the model's generalization robustly). A possible solution would be to use a CV with fewer folds, i.e. 5-fold. Moreover, in your response you state that you split the full dataset into training and testing parts, thus reducing even more the number of samples in the training dataset that undergoes the CV, exacerbating the above problem. I am concerned about the sample size (60) not being large enough to allow a train/test split and properly evaluate the generalization of the machine learning models (in this case, an independent test set would be more suitable). By the way, what percentage of data was used as training set? This should also be specified in the Methods. The following points were not addressed in your response:

- "it is not stated whether the class proportions were maintained in the CV splits, which would be crucial."
- "Only the performance in cross-validation is reported. It is not clear on which data the confusion matrixes were computed."

Authors’ Response

We really appreciate the Reviewer’s effort in improving the results of our study. We have adapted the manuscript by applying 5-fold cross-validation, which improved the results. A 5-fold random cross-validation was performed, the results of the 5 folds were averaged and reported, and the whole data set was divided into a (24 × 4) training set and a (6 × 4) testing set. This was done in order to improve the accuracy of our final results. A 5-fold cross-validation (CV) was also used to compute the confusion matrices for each classifier; the final confusion matrix was the average of the 5 folds. We have carefully considered all your comments and we have updated Section 2.4 (Classification Stage). In the new version of the paper, major changes are highlighted in yellow.

“The best results for the classifiers were obtained by carrying out a 5-fold cross-validation (CV); the average of the 5 folds was then reported. During this procedure, the entire data set was split into a (24 × 4) training set and a (6 × 4) testing set, so that the accuracy of the overall results could be improved. The performance of the proposed framework was evaluated using the average classification precision, sensitivity, specificity, accuracy, and F1-score, reported as percentages, and the confusion matrix, which made it possible to classify the BC severity. The confusion matrices were produced for each classifier while doing the 5-fold cross-validation (CV), and the final confusion matrix was obtained by averaging the 5 folds.”
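A minimal sketch of how per-fold confusion matrices can be accumulated and reported both as absolute counts and as percentages is given below, assuming scikit-learn and placeholder data rather than the study's dataset and classifiers.

# Sketch: accumulating a confusion matrix across the 5 CV folds so it can be
# reported as absolute counts and as row-wise percentages. Data are placeholders.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)
X, y = rng.normal(size=(60, 40)), np.repeat([0, 1, 2], 20)

total = np.zeros((3, 3), dtype=int)
for tr, te in StratifiedKFold(n_splits=5, shuffle=True, random_state=0).split(X, y):
    clf = DecisionTreeClassifier(random_state=0).fit(X[tr], y[tr])
    total += confusion_matrix(y[te], clf.predict(X[te]), labels=[0, 1, 2])

print(total)                                             # absolute counts over all folds
print(total / total.sum(axis=1, keepdims=True) * 100)    # row-wise percentages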

 

 

2.    Reviewer 1 Comments

The confusion matrixes, expressed in percentages, could be improved by also adding the absolute numbers.

Authors’ Response

We appreciate the Reviewer’s concern. We have updated the confusion matrices by adding the absolute numbers.

3.    Reviewer 1 Comments

Comments on the Quality of English Language: Some minor edits are needed in the added/modified parts of the revised manuscript (e.g. "The kNN algorithm is widely. Considered", "It assigns labels, to samples", etc.) 

Authors’ Response

We have carefully considered all your comments and we apologize for this mistake. We have improved the quality of the English language in the revised version of the manuscript.


We appreciate the Reviewer’s effort in revising our manuscript. We believe that the Reviewer’s comments have helped us to improve and clarify our study and we hope that the revised paper has addressed all your concerns.

 

Author Response File: Author Response.pdf

Reviewer 2 Report

Please update all figures with headers: they are confusion matrices, so the reader should be able to see which axis shows the true condition and which shows the model's classification, for a better interpretation of the percentages.

Author Response

Reviewer 2 Comments and Authors’ Response

1.    Reviewer 2 Comments

Please update all figures with headers: they are confusion matrices, so the reader should be able to see which axis shows the true condition and which shows the model's classification, for a better interpretation of the percentages.

Authors’ Response

We thank the Reviewer for this comment. We have carefully considered all your comments and we have updated all the figures with headers.

 

 

We appreciate the Reviewer’s effort in revising our manuscript. We believe that the Reviewer’s comments have helped us to improve and clarify our study and we hope that the revised paper has addressed all your concerns.

 

Author Response File: Author Response.pdf

Reviewer 4 Report

1- To better understand the data set before applying dimensionality reduction, you need to clarify the two following points:  How many samples are in the data set? (We assume that it is 60.) What are the features and how many are there, given that you use 4 biomarkers (P, T, C, and HCG) and 10 statistical descriptive measures (Is it 4 times 10?)?

2- The justification given for the choice of the t-SNE despite its stochastic nature is not convincing and ambiguous. "This is the motivation behind  the usage of t-SNE, it is a nonlinear dimensionality reduction strategy that focuses on  keeping the structure of neighbour points [16]. It provides somewhat different outcomes each time on the same data set, but these results are all focused on retaining the same information"

3- References 35-40 are inappropriate and need to be replaced with more appropriate ones.

4- Line 447: the figure number is missing in "the confusion matrix is shown in Figure ??".

5- Statements on data availability and ethical approval at the end of your manuscript should be included following the template instructions.

Author Response

Editorial Office and Reviewer’s Comments and Authors’ Responses

 

Noor Kamal Al-Qazzaz, Iyden Kamil Mohammed, Halah Kamal Al-Qazzaz,

Sawal Hamid Bin Mohd Ali and Siti Anom Ahmad

 

Applied Science Journal / Section: Biomedical Engineering.

 

Comparison of the Effectiveness of Various Classifiers for Breast Cancer Detection using Data Mining Methods

 

 

We have carefully revised the manuscript following the Reviewers’ comments. We considered and addressed each one of their concerns and remarks.

Major changes are highlighted in yellow in the revised manuscript. Additionally, pieces of text that have been included in the revised manuscript to address the Reviewers’ comments appear in this response document typed in Italic font.

We really appreciate the Reviewer’s effort in revising our study. We have considered your comments thoroughly regarding the writing aspect. The English language and typos were revised.

 

We are grateful for the feedback provided by the Editor in Chief, Associate Editor, and Reviewers. Their remarks and suggestions helped us to improve the manuscript significantly. We hope that the revised version of the study has addressed all your concerns and will be considered a contribution of interest to the readership of “Applied Science Journal.”

For your convenience, a list of responses to the Reviewers’ remarks is included below.

 

 

Reviewer 4 Comments and Authors’ Response

The authors propose a comparative study to evaluate the blood and salivary levels of prolactin (P), testosterone (T), cortisol (C), and human chorionic gonadotropin (HCG) between breast cancer patients and healthy control subjects using ten statistical features with and without implementing dimensionality reduction techniques to discriminate the BC severity using data mining techniques. My remarks to improve this paper:

1.    Reviewer 4 Comments

To better understand the data set before applying dimensionality reduction, you need to clarify the two following points:  How many samples are in the data set? (We assume that it is 60.) What are the features and how many are there, given that you use 4 biomarkers (P, T, C, and HCG) and 10 statistical descriptive measures (Is it 4 times 10?)?

Ø Authors’ Response

We really appreciate the Reviewer’s effort in revising our study. We have added the following paragraph to Section 3.2 (Feature Extraction Stage) to describe the dimensions of the dataset. In the new version of the paper, major changes are highlighted in yellow.

“The dimension of the employed dataset’s feature matrix was (60 × 40), where (20 control + 20 benign + 20 malignant) = 60 observations and (10 features × 4 biomarkers (P, T, C, and HCG)) = 40 attributes.”
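To make the (60 × 40) layout concrete, the sketch below assembles ten descriptive statistics per hormone per subject from placeholder readings. The particular ten statistics and the number of raw readings per subject are assumptions for illustration, since the full feature list is not reproduced in this response.

# Sketch of assembling a (60 x 40) feature matrix: ten descriptive statistics
# per hormone (P, T, C, HCG) per subject. Raw readings and the exact statistics
# are placeholders, not the study's data.
import numpy as np
from scipy import stats

def ten_features(x):
    # illustrative set of ten descriptive statistics for one hormone series
    return np.array([
        x.mean(), x.sum(), x.min(), x.max(), np.median(x),
        x.std(), x.var(), np.ptp(x), stats.skew(x), stats.kurtosis(x),
    ])

rng = np.random.default_rng(0)
hormones = ["P", "T", "C", "HCG"]
raw = {h: rng.normal(size=(60, 12)) for h in hormones}   # 12 readings per subject (assumed)

feature_matrix = np.hstack([
    np.vstack([ten_features(raw[h][i]) for i in range(60)]) for h in hormones
])
print(feature_matrix.shape)              # (60, 40)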

2.    Reviewer 4 Comments

The justification given for the choice of the t-SNE despite its stochastic nature is not convincing and ambiguous. "This is the motivation behind the usage of t-SNE, it is a nonlinear dimensionality reduction strategy that focuses on  keeping the structure of neighbour points [16]. It provides somewhat different outcomes each time on the same data set, but these results are all focused on retaining the same information"

Ø Authors’ Response

We appreciate the Reviewer’s concern. We have added the following paragraph to Section 1. Introduction to justify the choice of the t-SNE despite its stochastic nature. In the new version of the paper, major changes are highlighted in yellow.

“t-SNE is a powerful dimensionality reduction technique used in data analysis to convert the high-dimensional Euclidean distances between data points into conditional probabilities that represent similarities [15]. Despite its stochastic nature, it was chosen for several reasons. It is particularly effective at preserving local structures in high-dimensional data, as it focuses on maintaining the similarity relationships between data points, making it useful for visualising clusters and patterns that might be lost with other dimensionality reduction techniques such as Principal Component Analysis (PCA) [17]. Moreover, t-SNE produces visually appealing embeddings that can help analysts and researchers interpret complex data [18].

Furthermore, t-SNE complements other dimensionality reduction methods such as PCA and can be used to gain a more comprehensive understanding of the data [16]. Therefore, t-SNE's ability to reveal local structures encourages researchers and analysts to employ it alongside other techniques to uncover hidden patterns and gain a richer understanding of datasets.”

3.    Reviewer 4 Comments

References 35-40 are inappropriate and need to be replaced with more appropriate ones.

Ø Authors’ Response

We have carefully considered this comment and we have replaced the mentioned references with appropriate ones.

4.    Reviewer 4 Comments

Line 447: the figure number is missing in "the confusion matrix is shown in Figure ??".

Ø Authors’ Response

We really appreciate the Reviewer’s comment. We have added the figure number.

5.    Reviewer 4 Comments

Statements on data availability and ethical approval at the end of your manuscript should be included following the template instructions.

Ø Authors’ Response

We appreciate the Reviewer’s comment about our study. We have added the Statements on data availability and ethical approval at the end of our manuscript.


We appreciate the Reviewer’s effort in revising our manuscript. We believe that the Reviewer’s comments have helped us to improve and clarify our study and we hope that the revised paper has addressed all your concerns.

Author Response File: Author Response.pdf

Round 3

Reviewer 4 Report

Appropriate revisions have been made.
