Article
Peer-Review Record

Predicting Resistance to Immunotherapy in Melanoma, Glioblastoma, Renal, Stomach and Bladder Cancers by Machine Learning on Immune Profiles

Onco 2024, 4(3), 192-206; https://doi.org/10.3390/onco4030014
by Guillaume Mestrallet
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 3 June 2024 / Revised: 5 August 2024 / Accepted: 15 August 2024 / Published: 20 August 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The manuscript raises several concerns, particularly in the machine learning methodology and data analysis. The training process for the models lacks clarity, raising doubts about the validity of the approach. Here are the specific points that require attention:

  1. Methodology Clarification: The term "independent Wilcoxon t-Test with multiple sample correction" is ambiguous and potentially incorrect. Wilcoxon and t-tests are distinct statistical methods. Furthermore, the nature and application of the mentioned “correction” are unclear. Please provide a detailed explanation of the statistical approach used.
  2. Terminology Consistency: In section 3.1, there is inconsistent use of terms between the text and visual elements. The text refers to "progressors" and "non-progressors," while Table 1 and Figure 1 use "responders" and "non-responders." This discrepancy should be resolved. Additionally, there is a mismatch in patient numbers between Figure 1 and Table 1 that requires explanation. (GBM)
  3. Data Source Transparency: The origin of the immune response data analyzed in section 3.2 is not clearly stated. If this data was obtained from external sources, this should be explicitly mentioned. Alternatively, if the authors estimated these values, the methodology should be detailed.
  4. Figure Legends: Figures 2 and 3 appear to have identical legends. If this is an error, it should be corrected. If intentional, the reason for this repetition should be explained.
  5. Consistent Terminology: The issue of inconsistent terminology noted in point 2 persists throughout the manuscript and should be addressed comprehensively.
  6. Clarity of Interpretation: The conclusion drawn from the MHC molecule expression data in GBM progressors versus non-progressors requires further explanation. The link between increased MHC molecule expression and defects in antigen presentation is not immediately apparent and needs elaboration.
  7. Machine Learning Methodology: The machine learning section lacks crucial details. The composition of training and testing datasets is not specified. Given the small sample sizes in GBM, SKCM, and STAD cohorts, the method of data splitting for model training and testing should be explicitly described.

In conclusion, while the study presents interesting findings, significant revisions are necessary to address these methodological and clarity issues before the manuscript can be considered for publication.

 

Comments on the Quality of English Language

Average

Author Response

Thanks to reviewer 1 for his/her suggestions. 

 

The manuscript raises several concerns, particularly in the machine learning methodology and data analysis. The training process for the models lacks clarity, raising doubts about the validity of the approach. Here are the specific points that require attention:

  1. Methodology Clarification: The term "independent Wilcoxon t-Test with multiple sample correction" is ambiguous and potentially incorrect. Wilcoxon and t-tests are distinct statistical methods. Furthermore, the nature and application of the mentioned “correction” are unclear. Please provide a detailed explanation of the statistical approach used.

Statistical significance of the observed differences was determined using both the independent Wilcoxon (rank-sum) test and the t-test. All data are presented as mean ± SEM. The standard error of the mean (SEM) measures how far the sample mean (average) of the data is likely to be from the true population mean. A difference was considered significant when the p-value was below 0.05 (*: p < 0.05 for all tests). These tests are the only ones available on the CRI iAtlas portal. This was modified as requested.
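As a minimal sketch of this kind of two-test comparison with SciPy (the scores below are synthetic placeholders, not the study's data):

```python
import numpy as np
from scipy import stats

# Hypothetical immune scores for two groups (placeholders, not the study's data).
responders = np.array([2.1, 2.5, 1.9, 2.8, 2.4, 2.6])
non_responders = np.array([1.2, 1.5, 1.1, 1.8, 1.4, 1.6])

# Independent-samples tests: the Wilcoxon rank-sum test (non-parametric)
# and Welch's t-test (parametric), run side by side.
_, p_wilcoxon = stats.ranksums(responders, non_responders)
_, p_ttest = stats.ttest_ind(responders, non_responders, equal_var=False)

# Mean +/- SEM, the summary statistic reported in the figures.
mean_r, sem_r = responders.mean(), stats.sem(responders)

# The "*" label corresponds to p < 0.05 in both tests.
significant = p_wilcoxon < 0.05 and p_ttest < 0.05
```

Running both a non-parametric and a parametric test, as above, hedges against distributional assumptions when group sizes are small.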

  2. Terminology Consistency: In section 3.1, there is inconsistent use of terms between the text and visual elements. The text refers to "progressors" and "non-progressors," while Table 1 and Figure 1 use "responders" and "non-responders." This discrepancy should be resolved. Additionally, there is a mismatch in patient numbers between Figure 1 and Table 1 that requires explanation. (GBM)

This is because the response status is not available for GBM, the only status available for GBM is the progression in the CRI iAtlas dataset. Non-Progressors are defined as patients with mRECIST of Partial Response, Complete Response or Stable disease, whereas Progressors are those with Progressive Disease. Responders are defined as patients with mRECIST of Partial Response or Complete Response, whereas Non-Responders are those with Progressive Disease or Stable Disease. It was clarified in the methods, the table and the results as requested.
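The two labeling schemes described above can be made concrete in a short sketch (the function names are illustrative, not taken from the paper's code):

```python
# Illustrative mapping of mRECIST categories to the two labeling schemes
# described above (function names are ours, not from the paper's code).
def progression_label(mrecist: str) -> str:
    """GBM cohort: only progression status is available in CRI iAtlas."""
    return "Progressor" if mrecist == "Progressive Disease" else "Non-Progressor"

def response_label(mrecist: str) -> str:
    """Other cohorts: note that Stable Disease counts as Non-Responder here."""
    if mrecist in ("Partial Response", "Complete Response"):
        return "Responder"
    return "Non-Responder"
```

Note that a Stable Disease patient is a Non-Progressor under the first scheme but a Non-Responder under the second, which is why the two terminologies cannot be used interchangeably.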

  3. Data Source Transparency: The origin of the immune response data analyzed in section 3.2 is not clearly stated. If this data was obtained from external sources, this should be explicitly mentioned. Alternatively, if the authors estimated these values, the methodology should be detailed.

Data are available on the CRI iAtlas website (https://isb-cgc.shinyapps.io/iatlas/) and code is available on GitHub (https://github.com/gmestrallet/Cancers_2024_16). This was added in the Data Availability Statement as requested.

 

  4. Figure Legends: Figures 2 and 3 appear to have identical legends. If this is an error, it should be corrected. If intentional, the reason for this repetition should be explained.

Figure 2 shows the immune cell scores while Figure 3 shows the immune gene scores.

  5. Consistent Terminology: The issue of inconsistent terminology noted in point 2 persists throughout the manuscript and should be addressed comprehensively.

It was clarified in the methods, the table and the results as requested.

  6. Clarity of Interpretation: The conclusion drawn from the MHC molecule expression data in GBM progressors versus non-progressors requires further explanation. The link between increased MHC molecule expression and defects in antigen presentation is not immediately apparent and needs elaboration.

We agree with this point; we were also surprised that resistance was associated with elevated MHC expression. We think that the fact that progressors following Pembrolizumab express more immunosuppressive molecules such as TGFB, IL2RA and CD276 is more likely to explain the resistance in this cancer type. We changed this conclusion as suggested.

  7. Machine Learning Methodology: The machine learning section lacks crucial details. The composition of training and testing datasets is not specified. Given the small sample sizes in GBM, SKCM, and STAD cohorts, the method of data splitting for model training and testing should be explicitly described.

Each dataset was partitioned into five subsets for cross-validation and split into training (80%) and testing (20%) groups. RandomForestClassifier, GradientBoostingClassifier, support vector machine and LogisticRegression models were trained on the immune data using five-fold cross-validation to predict response to immune checkpoint blockade. This was clarified in the methods and the results as requested.
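A minimal scikit-learn sketch of this setup, using a synthetic stand-in for the immune-feature matrix (the array sizes and variable names are illustrative, not the study's data):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for an immune-feature matrix (45 patients x 10 features).
rng = np.random.default_rng(0)
X = rng.normal(size=(45, 10))
y = rng.integers(0, 2, size=45)  # 0 = non-responder, 1 = responder

# 80/20 train/test split, stratified on the response label.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

models = {
    "RandomForest": RandomForestClassifier(random_state=0),
    "GradientBoosting": GradientBoostingClassifier(random_state=0),
    "SVM": SVC(),
    "LogisticRegression": LogisticRegression(max_iter=1000),
}

# Five-fold cross-validation on the training portion for each model family.
cv_scores = {name: cross_val_score(m, X_train, y_train, cv=5).mean()
             for name, m in models.items()}

# Held-out test accuracy for one of the models.
test_acc = models["RandomForest"].fit(X_train, y_train).score(X_test, y_test)
```

With cohorts this small, stratifying both the train/test split and the folds keeps each partition from missing a class entirely.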

In conclusion, while the study presents interesting findings, significant revisions are necessary to address these methodological and clarity issues before the manuscript can be considered for publication.

Best regards

Guillaume

Reviewer 2 Report

Comments and Suggestions for Authors

 

The article entitled “Predicting resistance to immunotherapy in five cancer types by machine learning” is well-written and, from my point of view, would be of interest to the readers of Onco. In spite of this, and before its publication, I consider that the following issues should be solved:

For the bibliographical references, please make use of brackets instead of parentheses.

In the introduction please explain the aim of the paper.

In the introduction, please also introduce a paragraph that describes the structure of the manuscript.

Line 69: please explain why such cohorts were selected.

Line 111: please define the meaning of SEM before using it.

Figure 1: please give a letter and a sub caption to each one of the figures.

Figure 2: please consider the same comment as for Figure 1.

Figure 3: please consider the same comment as for Figure 1.

Figure 4: this is not really a figure, but rather many performance metrics that should be presented in a table or tables and not as a screenshot.

Author Response

Thanks to reviewer 2 for his/her suggestions. 

 

The article entitled “Predicting resistance to immunotherapy in five cancer types by machine learning” is well-written and, from my point of view, would be of interest to the readers of Onco. In spite of this, and before its publication, I consider that the following issues should be solved:

For the bibliographical references, please make use of brackets instead of parentheses.

References were modified as suggested.

In the introduction please explain the aim of the paper.

We aim to identify immune features associated with either resistance or positive response to therapy. These identified features will serve as input for training machine learning algorithms, enabling the development of personalized prediction models tailored to individual patients based on their unique immune profiles. This approach aims to refine treatment decisions, ultimately improving outcomes for cancer patients undergoing immunotherapy. This was added in the introduction.

In the introduction, please also introduce a paragraph that describes the structure of the manuscript.

This was added at the end of the introduction.

Line 69: please explain why such cohorts were selected.

These cohorts were selected because information about patient response to immunotherapy and immune profiles was available on the CRI iAtlas website. This was added at the end of the introduction.

Line 111: please define the meaning of SEM before using it.

Standard error of the mean (SEM) measures how far the sample mean (average) of the data is likely to be from the true population mean. This was added as requested. 

Figure 1: please give a letter and a sub caption to each one of the figures.

Figure 2: please consider the same comment as for Figure 1.

Figure 3: please consider the same comment as for Figure 1.

This was done as suggested.

Figure 4: this is not really a figure, but rather many performance metrics that should be presented in a table or tables and not as a screenshot.

We replaced this figure with a table as requested.

Best regards

Guillaume

Reviewer 3 Report

Comments and Suggestions for Authors

Machine learning algorithms were used to predict resistance to immunotherapy in five cancer types. However, there are too many comments for the author to consider; thus, this paper is far from publishable. Please refer to my comments as follows.
Comment 1. Refer to the journal’s template at https://www.mdpi.com/journal/onco/instructions:
(a) Full stop should be removed from the paper’s title.
(b) The paper’s title is too general. Please update “five cancer types” and “machine learning”.
(c) Affiliation and email are missing.
(d) The format of the in-text citation is [x].
(e) Before “References”, there is some more information to be reported.
(f) “References”: The format is not correct.
Comment 2. Abstract:
(a) The abstract is lengthy. Please check the requirements based on the journal’s template.
(b) What is meant by “prediction” in this paper?
(c) Regarding “…over 79% of cancer patients to ICB…”, was it the average across five cancer types?
(d) What is the percentage of improvement by the proposed method compared with the existing works?
Comment 3. Keywords:
(a) Ensure the terms fully capture the scope of the paper.
(b) The maximum number of terms is 10.
(c) “Resistance” and “machine learning” are too general.
Comment 4. Section 1 Introduction:
(a) Apart from the content in the first sentence, it is expected that the authors discuss the issue from a worldwide perspective.
(b) Why were those five cancer types considered?
(c) Justify the selections of machine learning algorithms.
(d) A literature review is missing. Please share the methodologies (i.e., using different machine learning algorithms), results, and limitations of the existing works.
(e) Add a paragraph to share the research contributions of the paper.
Comment 5. Section 2 Material and Methods:
(a) Add an introductory paragraph before Subsection 2.1.
(b) Add a table or a figure to facilitate the explanation of Subsection 2.1.
(c) Table 1: Conventionally, we use “,” instead of “.” for numerical values. In addition, what is “nb”?
(d) Subsection 2.3: It is too short. In addition, what does “* : p<0.05.” refer to?
(e) The style of writing “n=34 for Pembrolizumab and GBM. n=32 for Ipilimumab and Pembrolizumab and SKCM. n=45 for Pembrolizumab and STAD. n=298 for BLCA and Atezolizumab. n=165 for KIRC and Atezolizumab.” is informal.
(f) Methodology of the machine learning algorithms is missing.
Comment 6. Section 3 Results:
(a) Add an introductory paragraph before Subsection 3.1.
(b) Subsection 3.1: The first paragraph keeps referring to Table 1 and Figure 1 back and forth. To enhance the organization and clarity, please revise the paragraph.
(c) Enhance the resolutions of all figures. Enlarge the file to confirm that no content is blurred.
(d) Apart from Figure 4, the author is expected to provide a detailed analysis of the performance of each model in hyperparameter tuning.
Comment 7. A performance comparison between the author’s work and the existing works is missing.
Comment 8. What are the research benefits and implications?

Comments on the Quality of English Language

The organization and clarity of the paper can be enhanced.

Author Response

Thanks to reviewer 3 for his/her suggestions. 

 

Machine learning algorithms were used to predict resistance to immunotherapy in five cancer types. However, there are too many comments for the author to consider; thus, this paper is far from publishable. Please refer to my comments as follows.

Comment 1. Refer to the journal’s template at https://www.mdpi.com/journal/onco/instructions:

(a) Full stop should be removed from the paper’s title.

(b) The paper’s title is too general. Please update “five cancer types” and “machine learning”.

(c) Affiliation and email are missing.

(d) The format of the in-text citation is [x].

(e) Before “References”, there is some more information to be reported.

(f) “References”: The format is not correct.

 

These points were modified as suggested.

Comment 2. Abstract:

(a) The abstract is lengthy. Please check the requirements based on the journal’s template.

 

We reduced the length of the abstract as requested.

 

(b) What is meant by “prediction” in this paper?

 

We harnessed machine learning algorithms to construct models predicting response and resistance to ICB. It was clarified in the abstract as requested.

 

(c) Regarding “…over 79% of cancer patients to ICB…”, was it the average across five cancer types?

 

It is the score for the cancer type (KIRC) with the worst performance (Table 2).

 

(d) What is the percentage of improvement by the proposed method compared with the existing works?

 

While previous work on glioblastoma with only one type of algorithm had an accuracy of 0.82, we managed to develop 20 models that predicted response and resistance in 5 cancer types with accuracies between 0.79 and 1. It was added in the abstract as requested.

 

Comment 3. Keywords:

(a) Ensure the terms fully capture the scope of the paper.

 

This is done.

 

(b) The maximum number of terms is 10.

 

We reduced the number of keywords.

 

(c) “Resistance” and “machine learning” are too general.

 

We replaced them with RandomForestClassifier and GradientBoostingClassifier, and “resistance” was removed.

 

Comment 4. Section 1 Introduction:

(a) Apart from the content in the first sentence, it is expected that the authors discuss the issue from a worldwide perspective.

 

This is what we do after this sentence, but unfortunately few worldwide data are currently available to do so.

 

(b) Why were those five cancer types considered?

 

These cohorts were selected because information about patient response to immunotherapy and immune profiles was available on the CRI iAtlas website. This was added at the end of the introduction.

(c) Justify the selections of machine learning algorithms.

 

Machine-learning approaches have shown promising results in predicting patient outcomes in gliomas, lung and gastric cancers [16–20]. While previous work on glioblastoma with only one type of algorithm had an accuracy of 0.82, we aim to develop more models that will predict response and resistance in 5 cancer types with better accuracy. This was added as requested.

 

(d) A literature review is missing. Please share the methodologies (i.e., using different machine learning algorithms), results, and limitations of the existing works.

 

Machine-learning approaches have shown promising results in predicting patient outcomes in gliomas, lung and gastric cancers [16–20]. We aim to improve on these results and to extend the approach to other cancers such as bladder cancer, renal cancer and melanoma.

 

(e) Add a paragraph to share the research contributions of the paper.

 

We aim to identify immune features associated with either resistance or positive response to therapy. These identified features will serve as input for training machine learning algorithms, enabling the development of personalized prediction models tailored to individual patients based on their unique immune profiles. This approach aims to refine treatment decisions, ultimately improving outcomes for cancer patients undergoing immunotherapy. This was added in the introduction.

Comment 5. Section 2 Material and Methods:

(a) Add an introductory paragraph before Subsection 2.1.

 

This was added as requested.

 

(b) Add a table or a figure to facilitate the explanation of Subsection 2.1.

 

We updated Table 1 as requested.

 

(c) Table 1: Conventionally, we use “,” instead of “.” for numerical values. In addition, what is “nb”?

 

This was modified as requested, and “nb” was replaced by “number”.

 

(d) Subsection 2.3: It is too short. In addition, what does “* : p<0.05.” refer to?

 

This was modified as requested; “* : p<0.05.” refers to the fact that the label * on figures indicates that p<0.05.

 

(e) The style of writing “n=34 for Pembrolizumab and GBM. n=32 for Ipilimumab and Pembrolizumab and SKCM. n=45 for Pembrolizumab and STAD. n=298 for BLCA and Atezolizumab. n=165 for KIRC and Atezolizumab.” is informal.

 

There are 34 GBM patients treated with Pembrolizumab, 32 SKCM patients treated with Ipilimumab and Pembrolizumab, 45 STAD patients treated with Pembrolizumab, 298 BLCA patients treated with Atezolizumab and 165 KIRC patients treated with Atezolizumab. This was modified as requested.

 

(f) Methodology of the machine learning algorithms is missing.

 

This was added as requested.

 

Comment 6. Section 3 Results:

(a) Add an introductory paragraph before Subsection 3.1.

 

This was added as requested.

 

(b) Subsection 3.1: The first paragraph keeps referring to Table 1 and Figure 1 back and forth. To enhance the organization and clarity, please revise the paragraph.

 

This was modified as requested.

 

(c) Enhance the resolutions of all figures. Enlarge the file to confirm that no content is blurred.

 

Unfortunately these are the best images available.

 

(d) Apart from Figure 4, the author is expected to provide a detailed analysis of the performance of each model in hyperparameter tuning.

 

This is done in results Section 3.3 and in the Methods section. Parameter optimization was performed independently for each model using methods such as grid search or random search. Hyperparameters were fine-tuned to enhance model performance based on appropriate evaluation metrics.

 

Comment 7. A performance comparison between the author’s work and the existing works is missing.

 

While previous work on glioblastoma with only one type of algorithm had an accuracy of 0.82, we managed to develop 20 models that predicted response and resistance in 5 cancer types with accuracies between 0.79 and 1. It was added as requested in the discussion.

 

Comment 8. What are the research benefits and implications?

 

Our approach advocates for the personalization of immunotherapy in cancer patients. By harnessing patient-specific immune attributes and computational predictions, we offer a promising avenue for the enhancement of clinical outcomes following immunotherapy.

 

Best regards

Guillaume

Round 2

Reviewer 3 Report

Comments and Suggestions for Authors

Insufficient changes were made in the revised paper. In particular, some comments were claimed to have been addressed; however, no corresponding changes can be found in the revised paper. If the author prefers not to address the comments, please share your reasoning.

Follow-up Comment 1.
(b) The paper’s title is too general. Please update “five cancer types” and “machine learning”.
(f) “References”: The format is not correct.
Follow-up Comment 2. Abstract:
(b) What is meant by “prediction” in this paper? Does it refer to a probability output? Or does the model provide an estimation of future events?
Follow-up Comment 4. Section 1 Introduction:
(d) Please share the methodologies (i.e., using different machine learning algorithms), results, and limitations of the existing works.
(e) The claimed research contributions are not well justified in the rest of the sections (particularly the methodology and results).
Follow-up Comment 5. Section 2 Material and Methods:
(f) Methodology of the machine learning algorithms is missing. This refers to equations, pseudo-code, the process of hyperparameter tuning, etc.
Follow-up Comment 6. Section 3 Results:
(c) Enhance the resolutions of all figures. Please explain why high-resolution images cannot be provided.
(d) Apart from Figure 4, the author is expected to provide a detailed analysis of the performance of each model in hyperparameter tuning. If the author developed 20 models, it is expected some pages should be added to share the details.
Comment 7. A performance comparison between the author’s work and the existing works should be presented in a table for clarity.

 

 

Author Response

Thanks to reviewer 3 for his/her new suggestions. 

 

Insufficient changes were made in the revised paper. In particular, some comments were claimed to have been addressed; however, no corresponding changes can be found in the revised paper. If the author prefers not to address the comments, please share your reasoning.

Follow-up Comment 1.
(b) The paper’s title is too general. Please update “five cancer types” and “machine learning”.

The new title is ‘Predicting resistance to immunotherapy in melanoma, glioblastoma, renal, stomach and bladder cancers by machine learning on immune profiles.’

(f) “References”: The format is not correct.

We replaced the () with [] as suggested by the other reviewer. We are not sure we understand which other changes are needed, but if we missed anything specific we will fix it.

Follow-up Comment 2. Abstract:
(b) What is meant by “prediction” in this paper? Does it refer to a probability output? Or does the model provide an estimation of future events?

Yes, the model provides an estimation of future events of resistance or response to ICB by machine learning. It was clarified in the abstract.

Follow-up Comment 4. Section 1 Introduction:
(d) Please share the methodologies (i.e., using different machine learning algorithms), results, and limitations of the existing works.

This was modified as suggested in the introduction. ‘Thus, conducting a meta-analysis of cancer patient cohorts will be instrumental in characterizing the mechanisms underpinning response and resistance to ICB. These cohorts were selected because information about patient response to immunotherapy and immune profiles was available on the CRI iAtlas website. Meta-analysis of data from multiple cohorts may also facilitate the identification of optimal targets to develop combination therapies and improve patient outcomes. The development of software using machine learning approaches will enhance the precision of response and resistance prediction to ICB. Machine-learning approaches have shown promising results in predicting patient outcomes in gliomas, lung and gastric cancers [16–20]. While previous work using RandomForest in glioblastoma with only one type of algorithm had an accuracy of 0.82, we aim to develop more models that will predict response and resistance in 5 cancer types with better accuracy. This will improve the diagnosis and subsequent therapeutic strategies according to patient-specific characteristics.

We aim to identify immune features associated with either resistance or positive response to therapy. These identified features will serve as input for training machine learning algorithms (RandomForestClassifier, GradientBoostingClassifier, support vector machine and LogisticRegression algorithms), enabling the development of personalized prediction models tailored to individual patients based on their unique immune profiles. This approach aims to refine treatment decisions, ultimately improving outcomes for cancer patients undergoing immunotherapy, even if the size of the glioblastoma, melanoma and stomach cancer cohorts may be a limitation.’


(e) The claimed research contributions are not well justified in the rest of the sections (particularly the methodology and results).

While previous work on glioblastoma with only one type of algorithm had an accuracy of 0.82, we managed to develop 20 models that predicted response and resistance in 5 cancer types with accuracies between 0.79 and 1, meaning that our models managed to successfully predict the Response status of 79 to 100% of the patients. It was added in the discussion.

Follow-up Comment 5. Section 2 Material and Methods:
(f) Methodology of the machine learning algorithms is missing. This refers to equations, pseudo-code, the process of hyperparameter tuning, etc.

Parameter optimization was performed independently for each model using methods such as grid search or random search. The code and methodology for hyperparameter tuning are available on GitHub at https://github.com/gmestrallet/Cancers_2024_16/blob/main/glioblastoma.ipynb, and this was added to the manuscript in the Data Availability Statement and in the Methods.
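As a hedged illustration of the tuning step (the parameter grid below is a placeholder, not the grid used in the author's notebook):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the immune-feature matrix; the grid values below
# are illustrative only.
X, y = make_classification(n_samples=60, n_features=10, random_state=0)

param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 3, 5],
}

# Exhaustive grid search with five-fold cross-validation.
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)

best_params = search.best_params_   # hyperparameters of the best model
best_score = search.best_score_     # its mean cross-validated accuracy
```

`RandomizedSearchCV` follows the same pattern when the grid is too large to search exhaustively.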

Follow-up Comment 6. Section 3 Results:
(c) Enhance the resolutions of all figures. Please explain why high-resolution images cannot be provided.

These are the best figures available. We uploaded the figures separately to try to fix it in the new submission.

(d) Apart from Figure 4, the author is expected to provide a detailed analysis of the performance of each model in hyperparameter tuning. If the author developed 20 models, it is expected some pages should be added to share the details.

The performance of each model is presented in Table 2 and in the results after hyperparameter optimization. ‘For KIRC, the best performance was obtained with the RandomForrestClassifier and LogisticRegression algorithms. These algorithms have an overall accuracy of 79%, which means that it correctly predicted the Response status for about 79% of the patients in the test set (Table 2). For 'Non responder', the precision is around 79 and 81%, indicating that 80% of the positive predictions for this class were accurate. For ‘Responder', the precision is between 67 and 75%. For ‘Non responder', the recall is between 92 and 96%, meaning that the vast majority of actual ‘Non responder’ instances were correctly identified. For ‘Responder', the recall is between 33 and 47%, indicating that only 40% of the actual ‘Responder' instances were correctly identified. For 'Non responder', the F1-score is between 0.87 and 0.92, and for ‘Responder', it is between 0.46 and 0.53. For 'Non responder', there are 24 samples, and for ‘Responder', there are 9 samples in the test set. For GBM, the best performance was obtained with the RandomForrestClassifier algorithm. This algorithm has an overall accuracy of 82%, which means that it correctly predicted the Progression status for about 82% of the patients in the test set (Table 2). For 'Non progressor', the precision is 100%, indicating that 100% of the positive predictions for this class were accurate. For ‘Progressor', the precision is 80%. For ‘Non progressor', the recall is 33%, meaning that only 33% of actual ‘Non progressor’ instances were correctly identified. For ‘Progressor', the recall is 100%, indicating all actual ‘Progressor’ instances were correctly identified. For 'Non progressor’' the F1-score is 0.50, and for ‘Progressor', it is 0.89. For 'Non progressor' there are 3 samples, and for ‘Progressor', there are 8 samples in the test set.

For STAD, the best performance was obtained with the RandomForrestClassifier algorithm. This algorithm has an overall accuracy of 100%, which means that it correctly predicted the Response status for 100% of the patients in the test set (Table 2). For 'Non responder', the precision is 100%, indicating that 100% of the positive predictions for this class were accurate. For ‘Responder', the precision is also 100%. For ‘Non responder', the recall is 100%, meaning that all actual ‘Non responder’ instances were correctly identified. For ‘Responder', the recall is also 100%, indicating that all the actual ‘Responder' instances were correctly identified. For 'Non responder', the F1-score is 1, and for ‘Responder', it is also 1. For 'Non responder', there are 5 samples, and for ‘Responder', there are 4 samples in the test set.

For SKCM, the best performance was obtained with the RandomForrestClassifier algorithm. This algorithm has an overall accuracy of 100%, which means that it correctly predicted the Response status for 100% of the patients in the test set (Table 2). For 'Non responder', the precision is 100%, indicating that 100% of the positive predictions for this class were accurate. For ‘Responder', the precision is also 100%. For ‘Non responder', the recall is 100%, meaning that all actual ‘Non responder’ instances were correctly identified. For ‘Responder', the recall is also 100%, indicating that all the actual ‘Responder' instances were correctly identified. For 'Non responder', the F1-score is 1, and for ‘Responder', it is also 1. For 'Non responder', there are 3 samples, and for ‘Responder', there are 5 samples in the test set.

For BLCA, the best performance was obtained with the RandomForrestClassifier and LogisticRegression algorithms. These algorithms have an overall accuracy of 90%, which means that it correctly predicted the Response status for about 90% of the patients in the test set (Table 2). For 'Non responder', the precision is around 93 and 94%, indicating that 94% of the positive predictions for this class were accurate. For ‘Responder', the precision is between 62 and 67%. For ‘Non responder', the recall is between 94 and 96%, meaning that the vast majority of actual ‘Non responder’ instances were correctly identified. For ‘Responder', the recall is between 50 and 62%, indicating that only 62% of the actual ‘Responder' instances were correctly identified. For 'Non responder', the F1-score is 0.94, and for ‘Responder', it is between 0.57 and 0.62. For 'Non responder', there are 52 samples, and for ‘Responder', there are 8 samples in the test set. 

Overall, our models successfully predicted the Response status of 79% of the KIRC patients treated with Atezolizumab, 82% of the GBM patients treated with Pembrolizumab, 100% of the STAD patients treated with Pembrolizumab, 100% of the SKCM patients treated with Ipilimumab and Pembrolizumab, and 90% of the BLCA patients treated with Atezolizumab.'
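As an aside for readers unfamiliar with these metrics: the per-class precision, recall, F1-score, and support values quoted above are the standard outputs of scikit-learn's `classification_report`. The sketch below illustrates this on synthetic stand-in data (it is not the author's pipeline, and the feature matrix is hypothetical):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in for an immune-profile feature matrix (hypothetical data).
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Fit a random forest and score it on the held-out test set.
clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
y_pred = clf.predict(X_test)

print(f"Accuracy: {accuracy_score(y_test, y_pred):.2f}")
print(classification_report(
    y_test, y_pred, target_names=["Non responder", "Responder"]))
```

The report prints one row per class (precision, recall, F1-score, support), which is the layout summarized in the paragraphs above.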

Comment 7. A performance comparison between the author’s work and the existing works should be presented in a table for clarity.

Very little work has been performed using this methodology. In Table 2, we compared the results we obtained previously using RandomForest on the glioblastoma dataset (accuracy of 0.82) with the 20 models developed here, which predicted response and resistance in 5 cancer types with accuracies between 0.79 and 1, meaning that our models successfully predicted the Response status of 79 to 100% of the patients. This is clarified in the discussion.

Best

Guillaume

Round 3

Reviewer 3 Report

Comments and Suggestions for Authors

There are some important comments that remain unaddressed:
(a) Add a new paragraph to describe the research contributions of the paper precisely.
(b) A literature review is still missing. Please provide a summary of the methodologies, results, and limitations of at least five recent journal articles. Compare your work with the existing works.
(c) Subsection 2.5: Why were the step sizes of parameters uneven?
(d) The format of the table is not correct.
(e) The results of the model’s performance based on various hyperparameter values were missing.

Author Response

Thanks to reviewer 3 for his/her new suggestions. 

 

There are some important comments that remain unaddressed:

(a) Add a new paragraph to describe the research contributions of the paper precisely.

We added a new paragraph in the manuscript to address this comment: ‘Overall, to unravel the intricacies of resistance, we scrutinized the immune profiles of cancer patients experiencing ongoing disease progression and resistance post-ICB therapy. These profiles delineated multifaceted defects, including compromised macrophage, monocyte, and T cell responses, impaired antigen presentation, aberrant regulatory T cell (Tregs) responses, and elevated expression of immunosuppressive and G protein-coupled receptor molecules (TGFB1, IL2RA, IL1B, EDNRB, ADORA2A, SELP, and CD276). Building upon these insights into resistance profiles, we harnessed machine learning algorithms to construct models predicting response and resistance to ICB, together with accompanying software. While previous work on glioblastoma using a single type of algorithm achieved an accuracy of 0.82, we developed 20 models that estimated future resistance or response events in 5 cancer types with accuracies between 0.79 and 1, based on their distinct immune characteristics. In conclusion, our approach advocates for the personalized application of immunotherapy in cancer patients based on patient-specific attributes and computational models.’

(b) A literature review is still missing. Please provide a summary of the methodologies, results, and limitations of at least five recent journal articles. Compare your work with the existing works.

Very few studies have used this methodology. In Table 2, we compared the results we obtained previously using RandomForest on the glioblastoma dataset [19] (accuracy of 0.82) with the 20 models developed here, which predicted response and resistance in 5 cancer types with accuracies between 0.79 and 1, meaning that our models successfully predicted the Response status of 79 to 100% of the patients. Other machine-learning approaches showed promising results for predicting patient outcomes in gliomas and in lung and gastric cancers [16–20], but not in melanoma, bladder, or renal cancers, and none used immune features as we did here. The first, in lung cancer, was a risk prediction model combining clinical and DeepRadiomics features [16], and the second used nCounter RNA expression data [17]. Another, on gastric cancer, used bulk and single-cell RNA-seq [18]. Finally, we also previously developed algorithms using the tumor mutational profile in glioma [20]. These approaches differ substantially from the immune-feature-based approach we used here. This is added in the discussion.

(c) Subsection 2.5: Why were the step sizes of parameters uneven?

There is no specific reason for this.

(d) The format of the table is not correct.

The format of the table was changed as requested.

(e) The results of the model’s performance based on various hyperparameter values were missing.

Hyperparameter optimization requires running the algorithms over a very large number of parameter combinations (thousands of runs), as shown in the code and the Methods section, and it is not possible to print all of these results; this is why we chose to show the results only for the best algorithms.
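To make the reply concrete: tools such as scikit-learn's `GridSearchCV` fit every combination in the grid across all cross-validation folds but conventionally report only the winning configuration. The sketch below is illustrative only; the parameter grid and data are hypothetical, not the ones used in the paper:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Hypothetical stand-in data for the illustration.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Hypothetical grid: 3 x 3 = 9 combinations, each fitted on 5 CV folds
# (45 fits); real searches can easily reach thousands of fits.
param_grid = {"n_estimators": [50, 100, 200], "max_depth": [3, 5, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=5).fit(X, y)

print(search.best_params_)           # only the winning combination
print(f"{search.best_score_:.2f}")   # its mean cross-validated accuracy
```

The complete per-combination results do remain available in `search.cv_results_`, but printing them all is impractical for large grids, which is consistent with reporting only the best models.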

Best

Guillaume

Author Response File: Author Response.docx
