Article

Histological Grade of Endometrioid Endometrial Cancer and Relapse Risk Can Be Predicted with Machine Learning from Gene Expression Data

by Péter Gargya and Bálint László Bálint *
Genomic Medicine and Bioinformatics Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Egyetem tér 1, 4032 Debrecen, Hungary
* Author to whom correspondence should be addressed.
Cancers 2021, 13(17), 4348; https://doi.org/10.3390/cancers13174348
Submission received: 8 July 2021 / Revised: 24 August 2021 / Accepted: 26 August 2021 / Published: 27 August 2021
(This article belongs to the Special Issue Machine Learning Techniques in Cancer)

Simple Summary

Integrating machine learning methods into RNA-seq data analysis pipelines can improve how efficiently the data are used in clinical decision making. In this article, we show how machine learning can take the analysis of global gene expression datasets one step further, namely, by developing models that classify individual cancer samples based on well-characterized reference samples. We used the publicly available endometrial cancer RNA-seq datasets of the TCGA project to develop a model that separates G1 and G3 cancer samples with an accuracy of 85%. Our model could also further stratify G2 samples into high-risk and low-risk subgroups. Moreover, with an iterative retraining approach, we selected a subset of twelve genes that performed similarly in the stratification. Our results were validated against the survival data of the patients.

Abstract

The tumor grade of endometrioid endometrial cancer is used as an independent marker of prognosis and a key component in clinical decision making. However, the intermediate grade 2, which lies between grades 1 and 3, is reported to carry limited information; thus, patients with grade 2 tumors are at risk of both under- and overtreatment. We used RNA-sequencing data from the TCGA project and machine learning to develop a model that can correctly classify grade 1 and grade 3 samples. We then applied the trained model to grade 2 patients to subdivide them into low-risk and high-risk groups. With iterative retraining, we selected the 12 most relevant transcripts to build a simplified model without losing accuracy. Both models had a high AUC of 0.93. In both cases, there was a significant difference in relapse-free survival between the newly identified grade 2 subgroups. Both models could identify grade 2 patients who have a higher risk of relapse. Our approach overcomes the subjective components of the histological evaluation. The developed method can be automated to prescreen samples before a final decision is made by pathologists. Our translational approach based on machine learning methods could allow for better therapeutic planning for grade 2 endometrial cancer patients.
Keywords: endometrium; biomarkers; endometrial cancer; machine learning; elastic-net; relapse-free survival; TCGA; RNA-seq; tumor grade; fertility preservation
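
The workflow summarized in the abstract (train on grade 1 versus grade 3, then apply the trained model to grade 2 samples) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the feature counts, hyperparameters (`alpha`, `l1_ratio`, `lr`, `n_iter`), the 0.5 probability cutoff, and all function names are assumptions chosen for illustration, and the elastic-net logistic model is fitted here with a plain proximal gradient loop rather than with any particular library.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty (shrinks small weights to zero)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def predict_proba(x, w, b):
    """Probability that one sample is G3-like under the fitted model."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def fit_elastic_net_logistic(X, y, alpha=0.05, l1_ratio=0.5, lr=0.1, n_iter=500):
    """Elastic-net logistic regression via proximal gradient descent.

    Hyperparameter values are illustrative assumptions, not tuned values.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        # Residuals are computed once per iteration, so all weights
        # receive a simultaneous gradient step.
        errors = [predict_proba(x, w, b) - yi for x, yi in zip(X, y)]
        b -= lr * sum(errors) / n
        for j in range(p):
            grad = sum(e * x[j] for e, x in zip(errors, X)) / n
            grad += alpha * (1.0 - l1_ratio) * w[j]  # L2 part of the penalty
            w[j] = soft_threshold(w[j] - lr * grad, lr * alpha * l1_ratio)
    return w, b


def make_sample(rng, center, n_features=6, n_informative=2):
    """Synthetic 'expression' vector: only the first features carry signal."""
    return [rng.gauss(center if j < n_informative else 0.0, 0.5)
            for j in range(n_features)]


rng = random.Random(0)
# Synthetic reference samples: label 0 stands in for G1, label 1 for G3.
X_train = ([make_sample(rng, -1.0) for _ in range(40)]
           + [make_sample(rng, 1.0) for _ in range(40)])
y_train = [0] * 40 + [1] * 40
# Synthetic G2 samples, drawn as a mix of G1-like and G3-like profiles.
X_g2 = [make_sample(rng, -1.0 if i % 2 == 0 else 1.0) for i in range(20)]

w, b = fit_elastic_net_logistic(X_train, y_train)

# Training accuracy only; a real analysis would use held-out data.
predictions = [int(predict_proba(x, w, b) >= 0.5) for x in X_train]
accuracy = sum(p == t for p, t in zip(predictions, y_train)) / len(y_train)

# Stratify G2 samples: G3-like probability >= 0.5 is treated as high-risk.
high_risk = [i for i, x in enumerate(X_g2) if predict_proba(x, w, b) >= 0.5]
low_risk = [i for i in range(len(X_g2)) if i not in high_risk]
```

In this toy setting, the weights in `w` for the informative features end up larger in magnitude than those for the noise features, which is the property an iterative retraining step could exploit to shortlist a small gene panel. Whether the subgroups differ in relapse-free survival would still have to be tested separately against follow-up data.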
