Using Machine Learning to Expand the Ann Arbor Staging System for Hodgkin and Non-Hodgkin Lymphoma
Abstract
1. Introduction
2. Material and Methods
2.1. Data
2.2. EACCD
2.3. Prognostic Systems
2.4. Validation Method
2.5. Software
3. Results
3.1. Ann Arbor Staging System
3.2. EACCD Prognostic System
3.3. Validation of the EACCD Systems
4. Discussion
4.1. Comparison with Ann Arbor Staging System
4.2. Effect of Factor Levels on Survival
4.3. Limitations of Analyses
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Characteristic | Training Set, N | Training Set, % | Validation Set, N | Validation Set, % |
---|---|---|---|---|
Stage | | | | |
IA | 14,744 | 11.2 | 1706 | 10.9 |
IB | 2997 | 2.3 | 310 | 2.0 |
IEA | 16,090 | 12.2 | 2075 | 13.2 |
IEB | 2276 | 1.7 | 238 | 1.5 |
ISA | 457 | 0.3 | 70 | 0.4 |
ISB | 201 | 0.2 | 28 | 0.2 |
IIA | 12,054 | 9.2 | 1285 | 8.2 |
IIB | 5125 | 3.9 | 511 | 3.3 |
IIEA | 6142 | 4.7 | 722 | 4.6 |
IIEB | 2128 | 1.6 | 266 | 1.7 |
IISA | 453 | 0.3 | 50 | 0.3 |
IISB | 333 | 0.3 | 22 | 0.1 |
IIESA | 133 | 0.1 | 9 | 0.1 |
IIESB | 108 | 0.1 | 10 | 0.1 |
IIIA | 11,275 | 8.6 | 1460 | 9.3 |
IIIB | 5998 | 4.6 | 706 | 4.5 |
IIIEA | 1972 | 1.5 | 255 | 1.6 |
IIIEB | 1010 | 0.8 | 108 | 0.7 |
IIISA | 1634 | 1.2 | 213 | 1.4 |
IIISB | 1667 | 1.3 | 194 | 1.2 |
IIIESA | 299 | 0.2 | 34 | 0.2 |
IIIESB | 246 | 0.2 | 27 | 0.2 |
IVA | 27,309 | 20.7 | 3401 | 21.7 |
IVB | 17,074 | 13.0 | 1983 | 12.6 |
Site | | | | |
Nodal | 94,043 | 71.4 | 10,764 | 68.6 |
Extranodal | 37,682 | 28.6 | 4919 | 31.4 |
Type | | | | |
Hodgkin | 17,770 | 13.5 | 1799 | 11.5 |
Non-Hodgkin | 113,955 | 86.5 | 13,884 | 88.5 |
Age | | | | |
Young | 55,852 | 42.4 | 6020 | 38.4 |
Old | 75,873 | 57.6 | 9663 | 61.6 |
Sex | | | | |
Female | 58,663 | 44.5 | 6907 | 44.0 |
Male | 73,062 | 55.5 | 8776 | 56.0 |
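
The percentage columns in the table appear to be column shares, that is, each count divided by the total for that set and rounded to one decimal place. The following is a minimal sketch of that arithmetic using the Type rows. The helper name `column_shares` is hypothetical, and the rounding rule is an assumption inferred from the tabulated values rather than something stated in the article.

```python
def column_shares(counts):
    """Return each category's share of the column total, in percent (1 decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Counts copied from the Type rows of the table above.
training_type = {"Hodgkin": 17_770, "Non-Hodgkin": 113_955}
validation_type = {"Hodgkin": 1_799, "Non-Hodgkin": 13_884}

training_shares = column_shares(training_type)
validation_shares = column_shares(validation_type)

print(sum(training_type.values()), training_shares)
print(sum(validation_type.values()), validation_shares)
```

The same helper can be applied to the Stage, Site, Age, and Sex counts to check that each block of rows sums to the same set total.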
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, H.; Liu, Z.; Yang, J.; Sheng, L.; Chen, D. Using Machine Learning to Expand the Ann Arbor Staging System for Hodgkin and Non-Hodgkin Lymphoma. BioMedInformatics 2023, 3, 514-525. https://doi.org/10.3390/biomedinformatics3030035