Advances in Cancer Through Machine Learning Models

Rosca, Cosmina-Mihaela; Stancu, Adrian; Brezoi, Alina Gabriela

doi:10.3390/app16052226

Open AccessSystematic Review

Advances in Cancer Through Machine Learning Models

by

Cosmina-Mihaela Rosca

¹

,

Adrian Stancu

^2,*

and

Alina Gabriela Brezoi

²

¹

Department of Automatic Control, Computers, and Electronics, Faculty of Mechanical and Electrical Engineering, Petroleum-Gas University of Ploiesti, 39 Bucharest Avenue, 100680 Ploiesti, Romania

²

Department of Business Administration, Faculty of Economic Sciences, Petroleum-Gas University of Ploiesti, 39 Bucharest Avenue, 100680 Ploiesti, Romania

^*

Author to whom correspondence should be addressed.

Appl. Sci. 2026, 16(5), 2226; https://doi.org/10.3390/app16052226

Submission received: 4 February 2026 / Revised: 18 February 2026 / Accepted: 23 February 2026 / Published: 25 February 2026

(This article belongs to the Special Issue Advances in the Detection and Diagnosis of Cancer and Their Clinical Applications)

Download

Browse Figures

Versions Notes

Abstract

The integration of machine learning (ML) algorithms in oncology creates a new path for prognosis, early diagnosis, prevention, and treatment customization. However, large-scale clinical implementation is difficult due to the lack of standardized assessments and the variation in reported performance. A systematic review of the most recent research on ML applications in oncology (1 January 2020–31 December 2025) was conducted. The databases employed are Web of Science, Scopus, and PubMed. Filters applied for open-access articles that were simultaneously indexed and had numerical data in the abstract. From an initial of 13,292 articles, successive selection according to the PRISMA diagram resulted in a final set of 1364 studies. These were analyzed from four perspectives: the types of cancer investigated, the characteristics of the datasets (reproducibility and generalizability), the ML models used, and the performance achieved (accuracy, precision, recall, F1-score, and AUC). There is high interest in breast cancer (350 articles), colorectal cancer (337 articles), and lung cancer (151 articles), with frequent use of the databases The Cancer Genome Atlas (133 studies), Gene Expression Omnibus (94 studies), and Surveillance, Epidemiology, and End Results (72 studies). The Random Forest model proved to be predominant due to its tolerance for incomplete data. Reported performance varies considerably between cancer types and even within the same type. This analysis demonstrates the potential of ML methods for deciphering genomic alterations and supports the development of integrated personalized medicine approaches in oncology.

Keywords:

oncology; cancer prediction; cancer detection; cancer diagnosis; reproducibility; generalizability; TCGA; GEO; SEER; Random Forest; accuracy; precision; recall; F1-score; AUC

1. Introduction

The rapid development of artificial intelligence (AI) technologies, particularly machine learning (ML) algorithms, has created opportunities for improving early prevention, diagnosis, prognosis, and the personalization of cancer treatment. In the hierarchy of concepts, AI represents the general field that includes all methods capable of reproducing human reasoning, with an emphasis on decision-making. ML is a subfield of AI focused on developing algorithms that learn patterns directly from data. Deep Learning (DL) is also a branch of ML based on multi-layered artificial neural networks. These techniques are used particularly in the areas of medical imaging, histopathological analysis, and the recognition of complex signals, such as genomic or radiomic ones. In oncology, these algorithms assist in diagnosis. When the data volume is modest, ML algorithms are preferred over DL algorithms, which require a much larger volume of data. The specialized literature reveals differences between the performance reported in studies and its actual applicability in clinical practice. The differences are generated by data quality and availability, class imbalances, lack of external validation, and inconsistent reporting of results. These gaps justify the need for a systematic evaluation of ML applications in oncology.

The paper is addressed to several categories of readers:

Researchers in AI applied to medicine, interested in the current state of ML integration in oncology;
Clinicians and oncologists seeking objective information about models with real potential for clinical use;
Decision-makers and developers of digital medical solutions who can identify opportunities and risks in adopting ML;
The academic community to which it aims to provide a replicable methodological framework for other medical fields.

The research questions (RQs) of the study are as follows:

RQ1: What are the most frequently addressed cancer types using ML algorithms between 2020 and 2025?
RQ2: What are the datasets and ML models mainly used in the studies from the literature?
RQ3: What performance levels are reported for different types of cancer, and what factors influence these results?
RQ4: How do the reported performances correlate with the actual potential for clinical implementation?

Study objectives are outlined below:

To conduct a systematic analysis of the recent literature on ML applications in oncology;
Classification of studies based on cancer type, datasets used, ML models applied, and achieved performance;
Assessing the reproducibility and generalizability of studies based on the information reported in article abstracts by proposing novel indicators associated with reproducibility and generalizability;
Identifying gaps and future research directions for the responsible implementation of ML in oncological practice.

The main contributions of this study are mentioned next:

Developing a rigorous literature filtering methodology, which has never been used before in the review articles;
Classification of studies into four analytical directions: type of cancer, dataset characteristics, ML algorithms used, and the performance achieved;
Comparative analysis of performance indicators, such as accuracy, precision (positive predictive value), recall (sensitivity), F1-score, and Area Under the Curve (AUC).

The paper is structured into five sections. Section 2 presents the methodology used. Section 3 focuses on the results and is structured into four subsections that analyze the distribution of publications that employ ML modes by cancer type, the dataset assessment, ML models used in papers by cancer type, and the performance metrics for cancer types. The discussion and the limitations are depicted in Section 4, whereas the conclusions and future research are described in Section 5.

This review follows the PRISMA 2020 structure, and the results are reported in accordance with the selection stages described in the methodological section.

The originality of this review lies in the integration of a structured comparative framework along four axes: type of cancer, dataset characteristics, algorithm used, and reported performance. Unlike previous reviews, which focus on a single type of cancer or a single type of algorithm, this study proposes a cross-sectional analysis with its own indicators for reproducibility and generalizability. Additionally, the authors integrate innovative eligibility criteria into the PRISMA analysis, considered unique to date in the literature, aimed at filtering research that demonstrates a degree of quality in both development and content.

2. Methodology

This paper presents a systematic review with the main objective of investigating applications that record technological progress through ML algorithms in the field of cancer. The paper analyzes original studies published between 1 January 2020 and 31 December 2025. The literature selection process has the following standards: novelty, quality, analysis, comparability, and expertise of the extracted data. The methodology presents a systematic approach with a detailed description of the most important papers identified after applying an entire filtration process.

2.1. Search Strategy and Data Source

The search strategy included the following three databases, i.e., Web of Science (WOS), Scopus, and PubMed. These three databases include papers from fields such as medical sciences, life sciences, biomedical engineering, and computer science. This ensures a complete review of the ML subject in cancer. The search focused on two key concepts: cancer and ML. The preliminary search strategy was restricted to article titles. Thus, all initially selected articles included the words “cancer” and “machine learning” or “ML” in their titles. The authors wanted to ensure the relevance of the initial results in this approach. The title of an article best reflects the central subject. Limiting the search to the title reduces irrelevant results where the terms appear only in a sentence within the text or in references. This increases the probability that the extracted articles will directly address the two concepts. Although this strategy proposal may omit some articles where the two subjects appear in the abstract or full text, the advantage of this proposal is to obtain an initial qualitative set that justifies this choice for the systematic review.

The time frame 1 January 2020 to 31 December 2025 is justified by the field of ML, which has been evolving spectacularly in recent years. Limiting the review to the last six years ensures it reflects the latest discoveries, methods, implementations, trends, and evaluations. Secondly, the post-pandemic period of 2020 accelerated medical research, which included digital technological applications involving the integration of ML technologies in healthcare. Therefore, the authors expect that many innovations have emerged during this period. Thirdly, the integration of ML techniques has improved in recent years, meaning that studies from this period are more likely to use advanced models that report updated performance.

After conducting the initial searches, only open-access articles were selected. The motivation for choosing this was to be able to access the full text of the articles. This allowed for a detailed analysis of the content to understand the methods, context, results, future directions, and conclusions obtained.

Furthermore, review articles were excluded, and this exclusion supports the purpose of this research. The main objective is to review the original studies that apply ML in cancer. Including these articles would create a circular loop, distorting the analysis. The paper focuses on the direct analysis of primary information sources to extract details about the types of cancer in which they are used, as well as the methods, datasets, analyzed ML models, and numerical results, which are identified in the models’ performance metrics. It has obtained three lists of open-access articles after applying individual filters to each database. In the next step, common articles across all three databases were identified. This intersection is necessary to identify articles present in all three databases. This ensures the identification of the most highly indexed, most cited articles with high visibility within the scientific community, which represent benchmark scientific elements for the field. This way, confidence in the quality of the selected items is increased. The authors also believe that these analyzed articles are recognized due to their indexing in multiple sources, which reduces the risk of including peripheral articles or those that are not sufficiently representative of the field.

From the set of common articles, a further selection was made by keeping only those articles whose abstracts contain numerical values. With this decision, the technical nature of the review is imposed. One of the goals is to provide a technical perspective on the performance of ML models in cancer. The reason for the restriction applied to the abstract is that it is a concise source of information. By identifying articles with numerical data, analysis efforts are focused on the articles most likely to contain detailed technical information in the rest of the text. This optimization in the data extraction process for the systematic review helped filter the large volume of data existing in the literature up to this point.

The criterion of analyzing numerical values in the abstract has been maintained as a deliberate methodological contribution. The authors of this paper believe that a scientific abstract should highlight the quantifiable results of the study. Excluding articles without numerical data in the abstract is not a cognitive bias. This is considered a quality filter to prioritize studies with transparent performance reporting. This approach brings an element of novelty to this research, through a different vision from the standardized one of systematic reviews. The authors of this paper acknowledge the risk of limiting the approach by excluding important articles, but on the other hand, it is considered that a qualitative study adheres to the standard norms of developing scientific material. Therefore, the idea is emphasized that this criterion ensures the selection of studies that communicate technical results right from the synthesis. Furthermore, it strengthens the argument that this approach represents a methodological novelty for rigorous filtering of the technical literature.

Articles filtered by applying the set of limitations are analyzed from the perspective of oncological classification. In this analysis, the articles were categorized based on the type of cancer addressed. This is how research areas that prioritize the integration of ML technologies are defined. Classification by cancer type allows for a detailed thematic analysis.

The U.S. National Cancer Institute (NCI) groups cancer types by human organs in numerous categories, but the most important are the 35 depicted in Figure 1 [1]. According to the American Cancer Society’s estimation, the cancer types that are expected to record the highest cases in 2025 were prostate cancer for males (colored in blue in Figure 1), uterine cancer for females (colored in red in Figure 1), and bladder, breast, colorectal, kidney, leukemia, lung, lymphoma, pancreatic, skin, and thyroid cancer for both genders (colored in green in Figure 1) [2].

In the second analysis, the characteristics of the datasets used are evaluated. The implications of different types of datasets are also discussed, along with understanding the context and validity of the results, the concept of a well-constructed dataset, and the implications the dataset has on the ML application, which achieves good metric-level performance for practical use. Dataset analysis is particularly important due to the following characteristics:

Reproducibility ensures that the data description allows other researchers to replicate the study and potentially improve upon the findings presented in the paper;
Generalizability is studied using an indicator that reflects the model’s ability to function on populations different from those in the training cohort. Thus, the evaluation was based on the volume of the cohort, the disparity of the dataset, its diversity, and the presence of external validation.

By analyzing the two characteristics that should describe the datasets used in training and validation, the importance of analyzing these datasets used in the research that will be included in the systematic review is deduced.

The third analysis inventories the ML models mentioned in the extracted articles for mapping the algorithms used in the investigations. These provide an overview of the methodological trends of the algorithms. Inventorying these models is important because it highlights technological trends that allow for the identification of the most popular models at a specific point in time. They also provide information on which models are considered suitable for certain data types, support the identification of specific types of cancer, and offer a comparison of the performance of models applied to the same data types and of models investigating the same type of cancer.

Finally, an evaluation of the performance metrics reported in the articles is conducted to understand the degree of integration of ML models into practical applications. The most commonly used performance metrics are accuracy, precision, recall, F1-score, and AUC. Performance metrics are indicators of an application’s success in evaluating its effectiveness, quantifying how well a model performs for the specific task it was designed for, such as diagnosing a particular type of cancer. These metrics provide an objective comparison between multiple models with the same objective, and they also identify a number of limitations through their low values, which can indicate the study’s boundaries and ultimately provide contextualization within the field through the values correlated with a specific type of cancer to understand the specific difficulties associated with a particular application typology.

Figure 2 presents the synthesis of the methodology that includes the WOS, Scopus, and PubMed databases. Thus, a series of successive filters is applied for title, period, open access, and the exclusion of reviews. Subsequently, only the common articles that include numerical data in the abstract are retained, and the selected articles are analyzed in four directions: cancer type, datasets, algorithms, and performances.

This analysis methodology, based on the automatic extraction of information from abstracts and extended papers, systematizes the technical content of a large number of scientific articles. The complete search strategy used for each database, including the exact syntax of the query, Boolean operators, queried fields, and applied filters, is presented in Table S1 from the Supplementary Materials section.

2.2. Review Protocol in Accordance with the PRISMA Guidelines

The articles extracted from the three databases, WOS, Scopus, and PubMed, investigate applications that integrate ML algorithms in the oncological field. In the WOS database, the initial search filter was applied based on the expression “cancer*” and “machine learning” in the title, for the period 1 January 2020–31 December 2025. The search returned 4577 articles, as shown in Figure 3. The use of the asterisk in the word cancer includes all lexical derivatives, an example in this sense being the words cancer, cancers, cancerous, etc., practically extending the coverage area. This is a strategic choice to capture all terms in the field of cancer, without losing potential works due to semantic restrictions. Subsequently, the removal of review articles was applied, which reduced the set to 4330 articles. The additional application of the open-access filter reduces the final set to 2237 articles. This filter is motivated by the need for full access to the complete text for in-depth technical analysis.

Regarding the Scopus search, it returned an initial number of 5514 articles. This value indicates a greater coverage of recent works in which frontier research in ML and cancer is published. The removal of review articles reduced the number of results to 5235. After applying the open-access filter, a total of 2098 results were obtained, which represents a practical limitation of secondary studies.

Regarding the PubMed database, it generated 3201 articles in the initial search. This large number is justified by the fact that the database is associated with publications in biomedicine, bioinformatics, and clinical research, so it includes many works specific to the medical field. The removal of reviews led to a minor decrease, with 2965 results. After applying the open-access filter, 2234 results were obtained.

The differences between the databases reflect their complementarity as they are focused on different objectives. Filtering by title is a strategy that ensures the selected papers have the two reference concepts as central elements. The elimination of review papers avoids the inclusion of syntheses that do not present the specific metrics of ML models employed in the conducted research. The setting of open access allows for the validation of discussions regarding content transparency. Figure 3 presents the PRISMA diagram, which initially identifies 13,292 articles. After applying the review removal filter, the total number was reduced to 12,530, and finally, after excluding those that are not open access, a total of 6569 was obtained. Of these articles, 295 are common to WOS–Scopus, 298 are common to WOS–PubMed, 82 are common to Scopus–PubMed, and 1503 are common to all three databases, WOS, Scopus, and PubMed. After applying the filter of including the value of the metrics in the abstract, 1364 remained. Only these articles are analyzed from the four perspectives that address cancer types, datasets, ML models, and performance metrics.

The systematic review was developed in accordance with the recommendations of the PRISMA 2020 guideline [3]. The methodological protocol of the study was generated prior to the article selection process. This included: defining the research questions, eligibility criteria, search strategy, and selection stages.

The methodological structure follows the PRISMA principles for reporting the study selection flow. The PRISMA diagram, presented in Figure 3, illustrates the process of identification, screening, eligibility, and inclusion of studies.

The search strategy was constructed using Boolean operators and domain-specific terms, with queries adapted to each database. The queries used are presented in full in Table S1 from the Supplementary Material section. In this way, the authors ensure the transparency of the literature review process.

The search was limited to the title to outline the specificity of the results. Equally, this approach reduced the inclusion of peripheral studies where the terms appeared incidentally. This selection principle used in this research is considered a benchmark in the age of speed and AI tools, when everyone wants information delivered quickly and of high quality. By prioritizing conceptual importance over the raw volume of results, the authors extracted those studies that were handled by the authors of the papers with professionalism, adhering to the standards of conducting research in the medical field.

The selection process was carried out following the stages below: removal of duplicates, exclusion of review articles, application of the open-access filter, and application of predefined eligibility criteria.

Inclusion criteria include:

Original studies published between 1 January 2020–31 December 2025;
Articles indexed simultaneously in Web of Science, Scopus, and PubMed;
Studies that explicitly apply ML algorithms in an oncological context;
Articles that report at least one performance indicator (accuracy, precision, recall, F1-score, or AUC);
Articles are available in full, open access.

Exclusion criteria comprise:

Review articles, meta-analyses, editorials, or letters;
Studies without explicit reporting of ML model performance;
Studies in which ML is not the main methodological component;
Studies with insufficient information regarding the dataset used.

The selection process was carried out sequentially, and the justification for each criterion was established prior to data extraction to reduce the risk of selection bias.

To ensure compliance with the PRISMA 2020 guidelines, duplicate articles between the three databases, ineligible publications (review, editorial, letter, etc.), as well as articles lacking explicit application of an ML algorithm, lacking reporting performance indicators, and containing insufficient information regarding the dataset were removed in Figure 3. The PRISMA diagram illustrates these stages.

The filtering strategy used involves searching in the title, selecting open-access articles, intersecting three databases, and using information from the abstract for preliminary classification. This approach was designed to maximize the identification of important information in the final set of studies. This approach may introduce certain methodological limitations. However, the authors have proposed this framework for the research, which distinguishes the study from other similar studies. The authors’ proposals in the methodology, which still align with the PRISMA 2020 standard, bring the novelty elements of each systematic review article, contributing their own insights within the research.

3. Results

The results section exclusively presents the synthesis of the included studies after applying the previously described methodological criteria, without introducing additional search strategy elements.

ML models can perform classification or prediction tasks. In a medical context, prediction refers to risk, recurrence, survival chances, or other aspects that have implications for a binary classification form. This does not limit the possibility of expanding the number of classes for which the ML model can make predictions [4]. Regarding prediction, it refers to the risk over time, the chance of survival, the likelihood of recurrence, and the identification of the degree of organ damage, sometimes being treated as a classification. Everything that involves evolution over time at the ML level is considered a prediction task. Table 1 presents a series of papers that mention the type of cancer, the task, and the objective of the paper.

Table 1 synthesizes the clinical objectives and task typology (classification vs. prediction, survival analysis, metastasis detection), thereby contextualizing how ML is being applied in oncology from a clinical decision-making perspective.

The development of predictive models for risk diagnosis in cancer includes cervical cancer, analyzed with hrHPV genotyping, cervical cytology, and clinical data [24], as well as through the analysis of simple hematological tests for screening [25]. Lung cancer [26], breast cancer [27], gastric cancer [28], and pancreatic cancer [16] can also be included in the category of predictive diagnostic models. Survival prediction in cancer is studied using ML models in lung cancer [29], prostate cancer with bone metastases [30], breast cancer [31], and colorectal cancer [32].

The analysis of the specialized literature shows that ML models are used for predicting metastases and surgical complications. This is the case of axillary metastases in breast cancer [33], lymph node metastases [34], and post-complete mesocolic excision (colon) heart failure [35,36]. ML models are also used to assist experts in optimizing a personalized treatment plan without wasting time to the detriment of the patient [37,38].

For ML models to function with the highest possible accuracy, the quality of the data used in training these models is a fundamental step in their development phase [39]. Combining data that have a real contribution makes the models classify or predict as accurately as possible to reality [40]. Another strategy to increase accuracy is by using multiple ML models simultaneously [41,42]. A final modern strategy of ML models that helps alleviate medical effort is the techniques for explaining the decisions obtained by the model. This way, the medical professional can more easily understand the reasoning performed by the model and decide if it has suggested a correct result [9]. This approach ensures that the doctor does not miss a detail that could lead to an incorrect decision, but it also helps verify the model’s decision in case it might have mislabeled something [31].

3.1. Cancer Types and ML Models

Figure 4 shows how scientific articles are distributed according to the type of cancer analyzed. After extracting the 1364 original contributions, it was desired to classify them according to the type of cancer addressed in the paper. The representation in Figure 4 is important because it reflects researchers’ interest in applying ML techniques to each type of cancer individually. Since the articles do not explicitly mention the type of cancer in the title or abstract, it was necessary to perform normalization to categorize the articles based on the type of cancer. A concrete example of this is the different ways the same type of cancer is written. For example, breast cancer is referred to as carcinoma of the breast, mammary carcinoma, invasive ductal carcinoma (IDC), breast malignancy, neoplasm of the breast, etc. Out of the initial 1364 articles identified, those where the type of cancer was not specified were excluded. Then the remaining articles were grouped according to cancer type, and the number of articles corresponding to each type was counted. Finally, a new figure was designed to summarize the types of cancer and the number of articles that address the issue using an ML approach.

An analysis of article distribution shows a concentration of research in the ML area within the oncological context, predominantly on breast cancer. This dominates the number of research studies with 350 articles. This value demonstrates the abundance of available data, the high global incidence, the interest in improving early detection, and the need for predicting the presence of cancer cells using ML techniques. The second most studied types of cancer are colorectal cancer, with 337 articles, and lung cancer, with 151 articles. Similarly, prostate cancer is identified with 83 articles, gastric cancer with 60, ovarian cancer with 49, bladder cancer with 36, pancreatic cancer with 34, head and neck cancer with 30, cervical cancer with 27, etc. Cancer types that were less frequently investigated using ML technologies are brain and CNS cancer (3), childhood cancer (2), bone cancer (2), kidney cancer (2), laryngeal cancer (2), testicular cancer (2), lymphatic cancer (1), nasopharyngeal cancer (1), and salivary gland tumor cancer (1).

The distribution of articles in Figure 5 shows that ML applications in oncology are more focused on cancer types with high incidence, data availability, early detection possibilities, major clinical impact, and also massive financial contributions to support this research. This trend is natural, considering the equity of research based on the need to extend ML applications to less studied types of cancer. Figure 5 does not include all 35 types of cancer mentioned in Figure 1, but only the types of cancer for which the papers employed ML algorithms.

3.2. Dataset Assessment

The indicators proposed by the authors for evaluating reproducibility and generalizability are conceptually inspired by the principles of Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) and Findable, Accessible, Interoperable, Reusable (FAIR) [45,46,47,48]. These principles address the issue of transparency, accessibility, and data replicability in the context of AI-based medical research [46,49]. Reproducibility categories (low, medium, and high), named the generalizability indicator, were defined according to cohort size, following methodological conventions in recent ML medical studies. Datasets with <100 samples were considered exploratory (low), 100–1000 as moderate-scale (medium), and >1000 as large-scale (high) cohorts. Restricted-access datasets were categorized based on sample size, with an additional note regarding data availability. The thresholds defined by the authors for cohort size are related to classifications used in clinical and medical studies that employ ML methods [50,51,52,53]. The reproducibility indicators proposed in the present study represent an element of originality that the authors introduced to align conceptually with existing standards in the field of medical ML. These indicators provide a systematic way to quantify research reproducibility.

The reproducibility rate was used to measure the extent to which the research can be replicated in another study using the dataset and methodology employed in the article. To establish the degree of reproducibility, three types of labels: high, medium, and low were employed. For the high label, the dataset must be publicly available from standard sources, accessible free of charge, and described within the article. For the medium category, the dataset is private; its volume and structure must be large enough to be adapted to ML algorithms, and it must be described within the paper. Regarding the low label, either the dataset is not specified, the data lacks the described labels, the details are insufficient for replication, or details about the dataset are not mentioned. This labeling is justified by the fact that a public dataset allows anyone to replicate the study, while a private one, even if well-described, is much harder to replicate, even if it partially provides a basis for replication. At the opposite end, a dataset without details cannot be reproduced at all.

The analyzed papers were classified according to the types of cancer treated. It should be mentioned that the dataset was analyzed in the description phase of the abstract and not by directly accessing the entire article. This analysis is based on the importance of the dataset, which should be highlighted from the beginning stage, in the abstract. The reproducibility rate is computed as the ratio of the number of articles with high and medium labels to the total number of articles, according to Equation (1).

R_{j} = \frac{\sum_{i = 1}^{n} H_{i j} + \sum_{i = 1}^{n} M_{i j}}{\sum_{i = 1}^{n} H_{i j} + \sum_{i = 1}^{n} M_{i j} + \sum_{i = 1}^{n} L_{i j}} \times 100 [%]

(1)

where

R_{j}

—reproducibility rate of j cancer type;

H_{i j}

—paper i labeled high from j cancer type;

M_{i j}

—paper i labeled medium from j cancer type;

L_{i j}

—paper i labeled low from j cancer type.

Table 2 summarizes the number of articles that meet each criterion in relation to the type of cancer they treat. Out of the total analyzed papers, only 276 have details about the dataset and were labeled as high and medium. The results from Table 2 can be ranked into three groups. The first group comprises the cancers with a reproducibility rate over 50%, namely bone cancer (100%), bladder cancer (50%), and kidney cancer (50%). The second group incudes 19 cancer types that recoded a reproducibility rate between 5% and 34%, with higher values in the case of brain and CNS cancer (33.33%), CUP cancer (29.41%), endometrial cancer (25%), ovarian cancer (24.49%), pancreatic cancer (23.53%), gastric cancer (23.33%), etc. The third group consists of seven cancer types that registered a null reproducibility rate, indicating that no articles used a publicly available dataset. This applies to esophageal cancer, childhood cancer, laryngeal cancer, nasopharyngeal cancer, lymphatic cancer, and salivary gland tumor cancer.

For the generalizability study, labeling with high, medium, and low was used. For the high label, the cohort was set at over 1000 patients. The medium label was used for a cohort of 100 to 1000 patients, and the low label for a cohort of fewer than 100 patients (Figure 6).

The larger and more diverse the cohort, the greater the chances that the model can be implemented in different contexts. Small cohorts are often prone to overfitting. The generalizability study was conducted on the three patient limit intervals, with classification for each type of cancer. The results are summarized in Table 3. According to the methodology, from the total number of articles reported for each type of cancer, those that mentioned the number of patients for whom the experiment was conducted in the abstract were retained for analysis.

Analyzing Table 3, an unequal distribution of generalizability among cancer types is observed. Thus, most studies in the high category are in the field of colorectal cancer (33), breast cancer (14), lung cancer (6), bladder cancer (6), and prostate cancer (5), whereas in the case of the medium category the hierarchy is slightly different, i.e., breast cancer (191), colorectal cancer (185), lung cancer (107), prostate cancer (52), and gastric cancer (42). The significant emphasis on large cohorts is because it is leading to confidence in the applicability of the datasets. Conversely, there are no papers in the high category and very few in the medium category that researched CUP cancer, leukemia and hematologic cancer, multiple types of cancer, esophageal cancer, lymphatic cancer, laryngeal cancer, nasopharyngeal cancer, kidney cancer, bone cancer, and brain and CNS cancer. Cancers, such as endometrial, liver, thyroid, skin, and childhood, have a small number of studies. However, they are labeled high due to the use of national databases, such as The Cancer Genome Atlas (TCGA), Gene Expression Omnibus (GEO), Surveillance, Epidemiology, and End Results (SEER), etc.

An important aspect highlighted by Table 3 is that only studies explicitly mentioning the number of patients in the abstract were included in the analysis. Studies with large cohorts are likely to mention this, while those with small cohorts are prone to omit such information in the abstract. Generalizability in oncology is generally moderate, as analyzed in Table 3. The focus on cancers with high incidence stems from high generalizability values, while cancers with low incidence are underrepresented. Increasing the clinical impact of ML models requires a focus on large cohorts and the transparency of the dataset, which must be diversified regardless of the type of cancer addressed.

In this analysis, the degree of generalizability was estimated based on cohort size, as this is the most frequently reported variable in clinical ML studies. However, the generalizability of an AI model also relates to qualitative factors such as:

Data diversity is understood as multi-center origin, ethnic distributions, demographic age groups, etc. This diversity is correlated with the model’s subject, which is why it was not feasible in this general approach to all cancer types;
The existence of external validation, which confirms the model’s performance on independent datasets. This aspect is difficult to evaluate, considering that most articles do not present clinical studies, but only innovative methods validated on small sets of real subjects;
The integration of multi-modal data, which combines imaging, clinical, and molecular information, can increase the model’s adaptability to real clinical situations. Again, most articles do not have such combined approaches, either due to a lack of data or a desire to focus exclusively on an isolated issue.

These aspects were mentioned in the comparative discussions, where the information provided by the authors allowed for it, but they could not be uniformly quantified due to the heterogeneous way they were reported in the analyzed literature.

Figure 7 presents the top 10 most used datasets. At the top of this ranking are TCGA, GEO, and SEER. The TCGA dataset is reported in 133 articles and used for the identification of over 10 types of cancer [54] that employ ML techniques, including colorectal cancer [55], bladder cancer [56], endometrial cancer [57], etc. The second most used dataset, GEO, is reported in 94 articles addressing bladder cancer [58], kidney cancer [59], breast cancer [60], etc. Some studies combine datasets for training ML models [61]. The third most used dataset is the SEER dataset, encountered in 72 papers. It is used for identifying several types of cancer, such as gastric cancer [62], breast cancer [63], esophageal cancer [64], etc. Other datasets are used with lower frequency such as The Cancer Imaging Archive (TCIA) in 11 papers, IMvigor210 in 10 papers, Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) and Cancer Cell Line Encyclopedia (CCLE) in 8 papers each, Genotype-Tissue Expression (GTEx) in 7 papers, UK Biobank in 6 papers, and Medical Information Mart for Intensive Care (MIMIC) and Clinical Proteomic Tumor Analysis Consortium (CPTAC) in 4 papers each.

This analysis highlights standardized public databases, such as TCGA, GEO, and SEER, in cancer research using ML techniques. The quality of the datasets is a fundamental characteristic in obtaining results that allow the use of ML models in assisting medical professionals [65].

The synthesis of the analysis regarding reproducibility and generalizability is presented in Table 4. This table summarizes the characteristics of the main databases used in the included studies. Here, the approximate size of the cohorts, the type of data (genomic, imaging, clinical), the degree of accessibility, and the validation practices reported in the analyzed literature are presented.

The three datasets provide extensive cohorts, public access, and are the most used in studies in the literature. However, the repeated use of the same datasets can generate an internal validation bias. Thus, the performance of the models can be overestimated.

3.3. ML Models Employed in Cancer Types

To identify the ML models studied in the literature in relation to each type of cancer, the abstracts of the articles were examined, and based on this analysis, Table 5 was designed. Thus, the breast and colorectal cancers are studied using 19 ML models each, followed by lung cancer with 17 models, prostate cancer with 16, cervical and ovarian cancers with 15 each, bladder and head and neck cancers with 14 each, gastric and thyroid cancers with 13 each, pancreas cancer with 12, CUP cancer with 11, and endometrial, esophageal, liver and skin cancers all with the same value of 9. Subsequently, eight ML models are associated with leukemia, hematologic cancer, and pan-cancer; seven ML models are used in papers that focus on cell, multiple types, and nasopharyngeal cancers; and six ML models are used for brain and CNS, laryngeal, and lymphatic cancers. Finally, bone cancer is studied with five ML models, kidney cancer with four ML models, whereas childhood and salivary gland tumor cancers are studied with two models each.

Table 5 provides a methodological inventory of the specific ML algorithms used across cancer types, in order to express technological trends and model distribution.

Table 6 presents the most frequently used ML models for each type of cancer. Thus, the most used model is RF for breast, colorectal, lung, prostate, gastric, ovarian, bladder, pancreatic, head and neck, cervical, liver, thyroid, pan-cancer, CUP, endometrial, esophageal, multiple types, skin, leukemia and hematologic, cell, childhood, kidney, and laryngeal cancers. Furthermore, LogisticReg is used with the same weight as RF for bone cancer, with DT, GB, RF, SVM, and XGBoost for lymphatic cancer, and with Clustering, GB, NB, NN, RF, and XGBoost for nasopharyngeal cancer. In addition to RF, for lymphatic and salivary gland tumor cancers, DT is employed. Moreover, for brain and CNS cancers, the studies are using NN most frequently. Recent progress in glioma research integrates spatiotemporal heterogeneity with multimodal fusion strategies. These are guided by ML techniques that identify aligned and collective multicellular bundles within high-grade gliomas [66]. At the same time, the paper by Bahar et al. [67] analyzes the ML methods used in glioma grading. These radiomic models outperform clinical data in predicting the progression of patients with high-grade glioma [68]. Furthermore, Redlich et al. [69] provide a synthesis of the use of AI in histopathological imaging of gliomas. This highlights the need for model validation on data from multiple centers to improve generalizability. Finally, integrative models like Deep Orthogonal Fusion show how combining pathology imaging data, genomic data, and clinical variables improves predictive performance compared to unimodal models [70].

Regardless of the cancer types, the top five most frequently used ML models are RF, NN, LogisticReg, GB, and SVM.

RF is suitable for medical data analysis due to features that prevent overfitting, tolerate missing or incomplete data, evaluate important features, and function well on small datasets. RF is frequently used because it works with mixed data, such as clinical, genetic, imaging, and histopathological data, as it does not assume rigorous statistical distributions, unlike other models, such as classic LR. Another characteristic is its accessibility compared to DL models, which require large datasets and advanced computational resources, and the balance they offer in the medical field through performance metric reporting. In other words, RF is used in medical data analysis due to its ability to work with complex, incomplete, and varied data, as well as data that requires preprocessing before training.

3.4. Performance Metrics for Cancer Types

To perform a comparative analysis of the performance of ML models in oncology and beyond, standardized evaluation metrics are needed. In the context of cancer detection, a patient can be classified as positive or negative. Performance metrics are calculated based on the following parameters:

True positive (TP) means a patient with cancer is correctly classified as “cancer”;
A false positive (FP) is a patient without cancer incorrectly classified as “cancer” (false alarm);
True negative (TN) represents a patient without cancer correctly classified as “no cancer”;
A false negative (FN) corresponds to a patient with cancer incorrectly classified as “no cancer” (missed diagnosis).

The most important performance indicators are calculated using these parameters. The category of performance indicators includes accuracy, precision, recall, and F1-score [71,72]. In addition, when model output scores are available, the AUC is computed to measure the model’s discrimination ability across different decision thresholds.

Accuracy (overall correctness) measures the proportion of all patients correctly classified (both cancer and non-cancer). It is the measure that indicates the model gave the correct diagnosis, and it is computed with Equation (2).

A c c u r a c y = \frac{T P + T N}{T P + T N + F P + F N} \times 100 [%]

(2)

Precision (positive predictive value) represents the probability that a patient marked with cancer has cancer in reality (important to avoid unnecessary biopsies or treatment) and it is calculated with Equation (3).

P r e c i s i o n = \frac{T P}{T P + F P} \times 100 [%]

(3)

Recall (sensitivity or true positive rate) represents the indicator of real cancer patients detected by the model (critical for early detection or screening), and it is computed with Equation (4).

R e c a l l = \frac{T P}{T P + F N} \times 100 [%]

(4)

F1-score is a single-number summary useful when classes are imbalanced and both FP and FN matter, and it is calculated with Equation (5).

F_{1} - s c o r e = 2 \times \frac{P r e c i s i o n \times R e c a l l}{P r e c i s i o n + R e c a l l} \times 100 [%]

(5)

AUC is the probability that a randomly chosen patient with cancer receives a higher model score (risk score or probability) than a randomly chosen patient without cancer. It summarizes discrimination ability across all possible decision thresholds.

Consider the hypothetical case of breast cancer screening in a sample of 1010 patients. Out of the total number of patients, 80 patients had a confirmed cancer diagnosis, meaning they were correctly identified by the model as being affected (TP = 80). The model misclassified 20 cancer-free patients as having cancer (FP = 20). From the dataset, 900 healthy patients were correctly identified by the model as unaffected (TN = 900). Out of the total patients, 10 cancer patients were not detected as positive by the model (FN = 10).

The performance indicators are as follows:

A c c u r a c y = \frac{80 + 900}{80 + 900 + 20 + 10} \times 100 = 97.03 %

P r e c i s i o n = \frac{80}{80 + 20} \times 100 = 80 %

R e c a l l = \frac{80}{80 + 10} \times 100 = 88.89 %

F_{1} - s c o r e = 2 \times \frac{0.8 \times 0.8889}{0.8 + 0.8889} \times 100 = 84.21 %

Based on the performance metrics, it can be concluded that the model has a high success rate of 97.03%. This value indicates that the model has good classification capabilities, considering it is intended to assist, not replace, a human expert.

An ideal classification model must strike a balance between accuracy, sensitivity, specificity, and interpretability. In the authors’ opinion, the values should mandatorily be above 90% for accuracy, and above 85% for precision, recall, and F1-score. It is important to acknowledge that a perfect model cannot be achieved in any study. However, this recommendation should represent a reference standard for real clinical applicability.

3.4.1. Accuracy Metric for Cancer Types

Accuracy measures the total proportion of correct classifications. This can be deceiving when the datasets are imbalanced. In the case of cancers, this situation is common in ML models, as in the datasets, the number of healthy patients is much larger than that of sick patients. Table 7 presents the maximum and minimum values of accuracy by cancer types. A 100% performance was achieved for breast, colorectal, and lung cancers. These models, used to obtain the values in Table 7, are accurate. In the absence of external validation, there is a risk that these may reflect overfitting on the training data. On the opposite end, the models that achieved minimal performance reported values of 45.8% for gastric cancer, 48% for colorectal cancer, and 58% for breast cancer. These values indicate an intrinsic difficulty of the dataset or a generalization problem in the models. As previously stated in the paper, accuracy must be correlated with the other metrics to obtain a correct evaluation. The model may have high accuracy simply because the data is imbalanced, or it may indeed offer good performance in detecting positive cases. To discern which scenario it is in, the model must be evaluated through the lens of all indicators.

RF models are used in various applications, such as predicting pulmonary metastases in thyroid cancer [112], classifying tumor tissue in high-grade glioma [75], predicting lateral lymph node metastases in papillary thyroid cancer [112], predicting HER2 status in bladder cancer [74], predicting survival in resectable pancreatic cancer [91], classifying colorectal cancer [82], categorizing liver tumors using optical biopsy [97], predicting pathological risk factors in cervical cancer [81], etc. The accuracy of these works varies between 90% and 99%. The best-performing model is reported by Liu et al. [111], where the accuracy was 99%, AUC 99%, F1-score 72%, and recall 88%. In the study by Lai et al. [112], RF achieved an AUC of 80%, accuracy of 74%, F1-score of 81%, and sensitivity of 89%. Overall, all the indicators reported in these papers demonstrate the possibility of their use in clinical applications, but they cannot replace human expertise because the values are not perfect [113].

For heterogeneous data, XGBoost performs a classification that indicates the possibility of integration into medical applications through performance metrics. For example, for liver cancer prediction, in the paper by Vekariya et al. [96], XGBoost achieved an AUC of 85.2% and an accuracy of 87.5%.

Alongside RF, XGBoost, and SVM models, the LR model is frequently studied for integration into such applications due to its simplicity in terms of interpretability. The prediction of bone metastases in esophageal cancer is studied by Wan and Zhou [87], in whose research LR had an AUC of 83.1% and an accuracy of 72.1%. Also, for the progression of post-nephrectomy renal dysfunction, LR achieved an AUC of 81.5% and an accuracy of 78.7% [93]. The response to paclitaxel in advanced gastric cancer was investigated in the paper by Choi et al. [89], in which LR had an AUC of 67.9%, accuracy of 82.3%, F1-score of 46.1%, sensitivity of 63.8% and predicted a longer survival trend.

3.4.2. Precision Metric for Cancer Types

Precision measures the percentage of cases classified as positive for a certain type of cancer that are actually proven to be positive. The adapted calculation relationship is presented in Equation (3). This metric refers to cases where the diagnosis is a false positive and involves stress for the patient. This means additional investigations, increased stress for the patient, higher costs, congestion in the medical system, and many other auxiliary inconveniences. Table 8 depicts the maximum and minimum precision by cancer types. Thus, the precision is perfect for breast and colorectal cancer. These results are very useful for both doctors and patients in avoiding unjustified treatments. It is worth mentioning that this indicator must be carefully analyzed to exclude overfitting. The lowest reported precisions are 36% for breast cancer and 47.1% for lung cancer. The values indicate a high false positive rate with implications that could compromise the model’s accessibility. Depending on the purpose for which the model is used, screening or confirmatory diagnosis, the priority between precision and recall alternates.

Thyroid cancer with lung metastases is investigated by Liu et al. [111]. This proposes the RF model for predicting pulmonary metastases, using data from the SEER database. The results obtained yielded an F1-score of 72%, a precision of 61%, and a recall of 88%. The early detection of melanoma was investigated using specific image processing filters (Color Layout Filter) and a classifier with attribute selection [130]. Thus, an F1-score of 91%, precision of 91%, and recall of 91% were obtained.

The research by Jeong et al. [114] on bladder cancer integrates an electrochemical sensor combined with ML for discriminating normal cells from cancerous ones [131]. In this case, the RF model achieved an F1-score of 93.8%, with an accuracy of 91.7% and a sensitivity of 92.9%. Colorectal cancer proposes an SVM classifier based on circulating tumor cells (CTCs). The model achieved 100% accuracy, 100% specificity, and an implicit F1-score of 88.9% (calculated from 80% sensitivity and 100% precision) [119].

For ovarian cancer (progression-free survival), Arezzo et al. [126] demonstrate the possibility of predicting 12-month survival with 90% accuracy. This means that 9 out of 10 patients predicted to be progression-free actually had such an outcome. The RF model generated a precision and recall of 90%, an accuracy of 93.7%, and an AUC of 92%. The response to chemotherapy in colorectal cancer is predicted using the XGBoost model, which achieved an accuracy of 94.6% for a favorable prognosis in the Multilayer Perceptron (MLP) model. Similarly, the Gradient Boosting Decision Tree (GBDT) model had an accuracy of 86.3% for unfavorable prognosis. These values identify patients who will respond to treatment [120], making them an extremely useful tool in the healthcare system [132].

The toxicity of cisplatin in head and neck cancer is being investigated using the GLM model. It achieved 75% precision, meaning that three out of four patients predicted to have severe toxicity will actually develop it [123]. Zhu et al. [116] employed the LightGBM model to study breast cancer. The authors achieved 100% precision, meaning all cases classified as malignant were indeed cancerous. This performance makes the model ideal as a screening tool due to its ability to minimize false positives [133].

Undifferentiated early gastric cancer is being studied for non-curative resection using the XGBoost model, which achieved a precision of 92.6% [88]. In the case of post-urostomy urinary tract infections in bladder cancer, the SVM model achieved an accuracy of 58.3%, meaning that less than half of the patients predicted to have an infection will actually develop one. Although the AUC is high (83.5%), the low precision of 58.3% indicates limitations in practical application without further adjustments. The model is available online and includes a visualization of variable importance, confirming that accuracy can be improved by selecting suitable clinical features [115].

3.4.3. Recall Metric for Cancer Types

Recall measures the proportion of positive cases correctly detected. Practically, this indicator is associated with a correctly made diagnosis. The adapted calculation formula for cancers is presented in Equation (4). At the oncological level, the indicator is useful to avoid false negative situations that delay treatment, and therefore survival chances. Table 9 depicts the maximum and minimum recall by cancer types. An almost complete sensitivity is reported for breast, gastric, and lung cancers of 99.49%, 99%, and 98.9%, respectively. These values indicate that almost all real cases are detected. Low sensitivity is associated with breast cancer (50%), but also with head and neck cancer (55%), when the ML model is not suitable for the context. Models with a high recall value are preferred in the screening stages when an early evaluation is conducted. They need to be adjusted later to reduce the false positive rate by increasing precision.

RF is one of the most frequently used models in the analyzed studies. This model reported the best performance in several types of cancer. Lung cancer (post-lobectomy complications) offers the possibility of predicting cardiopulmonary complications through the RF model, with a recall of 73.8% and an AUC of 85.6% [138]. Even in the case of CUP, the RF based on the Cancer of Unknown Primary Location Resolver (CUPLR) model was able to identify the tissue of origin for 35 cancer subtypes with a recall of 90% and precision of 90%. It was trained on genomic data (6756 tumors) and resolved 58% of CUP cases [118]. The second most used ML model is XGBoost. It stands out for its performance in classifying disease severity and predicting treatment response. For lung cancer, the XGBoost model achieved a recall of 98.9%, a precision of 99%, and an accuracy of 98.9%. It was trained on clinical data from Ethiopia [124].

Breast cancer uses the BreCML model based on XGBoost to identify new key genes. It achieved a recall of 99.49%, a precision of 99.15%, and an F1-score of 99.79% [135]. Bladder cancer (RB1 mutation) also uses XGBoost to achieve an 80% recall, 84% accuracy, and an AUC of 84%. He was trained on radiomics features from computed tomography urography (CTU) [134].

3.4.4. F1-Score Metric for Cancer Types

The F1-score is a combination of precision and recall, with the calculation relationship adapted to the oncological context according to Equation (5). This indicator highlights the imbalanced scenarios between classes, a scenario that is typical in the field of oncology. These scenarios refer to the number of positive cases, which is much smaller than that of negative ones. In other words, the number of diagnosed cancer cases is much smaller than the number of negative results. Therefore, a discrepancy arises between the two classes. In Table 10, which presents the maximum and minimum F1-score by cancer types, the observed results show very good values for breast cancer, 99.79%, colorectal cancer, 98.2%, gastric cancer, 97.5%, and thyroid cancer, 96.7%. From these values, it can be deduced that for these types of cancer, the models have the ability to simultaneously maintain high values for both precision and recall.

Extremely low values were reported for thyroid cancer, 21.6%, head and neck, 30%, and breast cancer, 37%. These severe imbalances are associated with models that detect more positive cases with many false alarms. Basically, it either avoids false positives or misses real cases. The extreme values reported for this indicator do not guarantee good overall performance [155]. Thus, reporting multiple metrics becomes mandatory in this context for the comprehensive evaluation of ML models in oncology.

Nayan et al. [151] evaluated disease progression in patients on active surveillance (AS) using ML models. The study achieved an F1-score of 58.6% using the SVM model. Compared to this value, the traditional model had an F1-score of 18.2%. Another study [150] used RF and XGBoost for prostate cancer prediction based on targeted or combined biopsy. In this case, the F1-score achieved a value between 94% and 97%. Mahmud et al. [92] used computed tomography (CT) images and clinical data to classify four types of kidney cancer (ccRCC, chRCC, pRCC, oncocytoma). The model combines DL techniques with clinical data. After training, it achieved an F1-score of 84.92% for all types. For renal cell carcinoma (RCC), the F1-score increased to 90.50%.

Two other studies that addressed breast cancer were selected from the literature for their F1-score values. The first research is by Ke et al. [135] in which BreCML, an XGBoost-based model, was developed. It achieved an F1-score of 99.79% in cell subtype classification. The second research is by Nguyen et al. [140], which predicted five-year survival. The best model was the Artificial Neural Network (ANN), which achieved an F1-score of 37%, despite an AUC of 95%. The class distribution imbalances are the cause of these anomalies. Nair et al. [147] applied Short Term Fourier Transform (STFT), LASSO, and Elephant Herding Optimisation (EHO) for feature extraction from gene expression data. The classification used several ML algorithms. The best result was achieved by Flower Pollination Optimization–Gaussian Mixture Model (FPO-GMM), with an F1-score of 97.5%. Schöneck et al. [148] studied the prediction of Kirsten Rat Sarcoma viral oncogene homolog (KRAS) mutation in non-small cell lung cancer (NSCLC) using radiomics and ML models, but the performance was modest (maximum F1-score of 67% internally, with 41% externally). These values suggest difficulties in model transferability.

Jeong et al. [114] proposed a non-invasive method based on electrochemical impedance and ML. RF was the model against which the best performance metrics were reported. The results reported an F1-score of 93.8% in discriminating normal cells from cancerous ones. For papillary thyroid cancer (PTC), ML models (SVM, XGBoost, RF) outperformed American Thyroid Association (ATA) classification, with F1-scores ranging from 33.1% to 42.9%. The best model was RF [154].

Yan et al. [144] optimized the gastric cancer screening score using GBM, Distributed Random Forest (DRF), and DL. In binary classification, the models achieved an AUC higher than 99%, but in triple classification, the F1-score for high risk was only 53.34% for GBM. The value indicates difficulties in discriminating between intermediate and high risk. The study by Hsu et al. [149] predicted muscle mass loss in ovarian cancer patients. The best results were reported for RF, which achieved an F1-score of 72.6% (internal) and 74.1% (external validation).

3.4.5. AUC Metric for Cancer Types

Performance metrics in the oncological field refer to the ability of ML algorithms to identify clinical patterns. The values centralized in Table 11 show the maximum and minimum AUC by cancer types. These values reflect the quality of the dataset used, the model architecture, the degree of applicability of the model for the type of cancer, the evaluation protocol, and the usability in practice.

AUC measures the model’s ability to discriminate between sick and healthy patients [186]. In oncology, the value of this parameter indicates a higher probability that a positive patient will be correctly classified as having that specific type of cancer. Data from Table 11 outlines that the maximum values obtained by the ML models are nearly perfect, with values of 1 for colorectal cancer, gastric cancer, ovarian cancer, and pancreatic cancer. The fact that these models achieved such high values indicates a clear separation between classes. The lowest values were obtained for lung cancer at 0.45, head and neck at 0.51, and ovarian cancer at 0.56. These values indicate that the respective studies either had insufficient data, imbalanced data, low-quality data, or poor generalization of the ML model itself. In relation to the methodology of this study, the differences between the maximum and minimum values highlight the importance of selection and diversity in datasets. High AUC values are associated with cancers in databases standardized by TCGA and SEER. Low AUC values are associated with a reduced cohort.

In the specialized literature, a multitude of studies have been presented that utilized ML models for the early diagnosis of cancer in various forms. Thyroid cancer achieved the best results with the help of the RF model. In the paper by Liu et al. [111], the AUC indicator had a level of 99%, with a precision of 61%, a recall of 88%, and an accuracy of 99%. On the opposite end, the lowest result was obtained for the GBDT in diagnosing central lymph node metastases. In this case, the AUC had a performance of 73.1%, surpassing the performance of ultrasound, which had an AUC of 62.3% [185]. Practically, the weakest result obtained for the AUC performance indicator surpassed the performance of classical diagnostic methods.

For breast cancer, ML models as well as hybrid models were explored in computerized diagnosis. Shaikh and Ali [158] use a hybrid model that includes the SVM model, achieving an AUC of 99.41% on a local dataset and 99.21% on the BCDR-F03 dataset, with an overall accuracy of 99.89%. For radiation-associated breast cancer, the RF model achieves an AUC of 62% in predicting contralateral cancer, according to the study [157]. Ovarian cancer achieved the best AUC of 100%, while on the opposite end, an AUC of 56% was obtained through natural language processing (NLP) on CT reports for operators. The AUC value of 56% improved the prediction of post-operative readmission, reaching 70% when integrated with NLP [179]. In the research by Hamidi et al. [178], the Boruta-based model, combined with five other ML algorithms, identified 10 miRNAs. The models achieved an AUC of 100% and over 94% in external validation sets.

Bone metastases are studied using the XGBoost model. Ji et al. [171] achieved an AUC of 82.69% in the internal cohort and 91.23% in the external cohort. Pain associated with bone metastases is investigated using a gene-based nomogram that achieved an AUC of 99% in the study by Li et al. [182]. Pancreatic cancer is studied through the RF model, which eliminates features in order to identify a panel of biomarkers. It achieved an AUC of 100% for classifying post-operative complications [180]. Iwatate et al. [181] use radiogenomics in training the model that predicts the expression of the Integrin subunit alpha V (ITGAV) gene with an AUC of 69.7%.

Colorectal cancer is studied through NN and RF models. It achieved performance in predicting metastases with an AUC of 100%, a sensitivity of 100%, and an accuracy of 99% on the balanced set by Talebi et al. [161]. For peripheral nerve invasion, CT radiomics-based models achieved an AUC between 61.1% and 66.3% in the study by Liu et al. [162]. Lymph node metastases in endometrial cancer are studied with a model based on the apparent diffusion coefficient (ADC) and radiomic features. It achieved an AUC of 85%, demonstrating an improvement over classical criteria [163]. H₂O AUTO-ML-based GBM investigated prostate cancer; it achieved an AUC of 72% and a specificity of 84% in assisting with case selection for biopsy [183]. For bone metastases, a gene-based omogram from the Stimulator of INterferon Genes (STING) pathway achieved an AUC of 99% [182].

Bladder cancer is studied using atomic force microscopy combined with an ML model, achieving an AUC of 97% and an accuracy of 91% in a controlled system. In cases with multiple image channels, the model achieved an AUC of 99% and an accuracy of 93% in the study by Petrov et al. [73]. In another research by Petrov and Sokolo [159], RF is employed to separate precancerous cervical cells from cancerous ones. The model achieved an AUC of 93% and a sensitivity of 92%. Peritumoral radiomic models are studied for proximal esophageal cancer. The dual-region model achieves an AUC of 96.63% in the training phase and 94.71% in the validation phase. A radiomic clinical monogram outperforms the clinical model, reporting a net reclassification improvement of 34.4% [165].

The XGBoost model has been demonstrated to be effective in predicting cervical lymph node metastases in patients with early-stage supraglottic laryngeal cancer. The model proposed by Wang et al. [172] achieved an AUC of 87% in internal validation and 80% in external validation. Similar ML models, including XGBoost or LightGBM, have been used in early stage gastric cancer as well. It achieved an AUC between 73.6% and 83% in the study by Yang et al. [168]. A panel of six genes associated with cancer-associated fibroblasts (CAFs) identified an AUC of 75.4% and 100% in the validation sets [187]. For lung cancer, Wang et al. [176] employed an NN model, achieving an AUC of 99.4% and an accuracy of 99.3% using inflammatory markers. In the case of esophageal squamous cell carcinoma, Cui et al. [166] employed combined models, achieving an AUC of 85.6%. As for laryngeal cancer, Nakajo et al. [173] implemented the NB model, achieving an AUC of 84.2% in predicting progression, while Random Survival Forest (RSF), a modified version of RF, achieves a C-index of 80.8%.

The cancers studied intensively are from the categories of breast, colorectal, gastric, and lung cancers. These are associated with high values of performance indicators due to large and standardized datasets. The cancers with low incidence, such as thyroid, head and neck, and salivary, have greater variability and lower values in indicator reporting because they require additional data collection. Studies that use public and diverse databases have balanced values for performance indicators. Private or small cohort datasets have generated good performance due to overfitting or reported low values. The large differences between the maximum and minimum values for the same type of cancer indicate an incompatibility of the model used, a lack of uniform reporting and validation protocols, or difficulties concerning the datasets used, making direct comparison of studies difficult. Thus, data from Table 7, Table 8, Table 9, Table 10 and Table 11 demonstrate that certain ML models have high performance.

Within the analysis of the performance indicators reported by various studies, it was found that some models had an accuracy or AUC of 100%. These perfect values indicate that the results may stem from overfitting the models, using small datasets, including irrelevant data for the elements the model was trained on, the absence of external validation, or the correct selection of training and validation data groups. In many cases, validation was performed only on subsets of the same dataset. This approach leads to an overestimation of the model’s actual performance. Also, the lack of cohort diversity, the absence of multicenter testing, validation in feature processing and selection, and expansion on datasets never seen by the model can, in some cases, lead to seemingly perfect results that, when applied in practice, significantly reduce the model’s performance. These performances reported by the articles in the specialized literature should be viewed as indicators of the potential of machine models. These do not represent irrefutable evidence of immediate clinical applicability. Therefore, model validation should also be performed by evaluating reproducibility and generality according to the authors’ proposals in this article.

The variability of performance indicators reported between studies is explained by the following methodological factors:

The class imbalance in oncology arises from the fact that the proportion of positive patients is much smaller than that of negative patients. This aspect leads to high accuracy values, even though the performance is low for recall or F1-score.
The small sample size at the level of small cohorts increases the risk of overfitting and overestimating internal performance. This aspect is not specific to the field of cancer, as it is encountered in all fields where the datasets are small in volume.
The absence of external validation, which causes many studies to use only internal validation (cross-validation), without testing on independent cohorts.
Differences in data preprocessing generated by normalization methods, feature selection, handling of missing values, and the quality of the final data obtained modify performance indicators.
The heterogeneity of clinical objectives in binary classification, multiclass classification, or survival prediction involves different challenges.

Therefore, the direct comparison of maximum values between cancer types must be done with caution, as they reflect distinct experimental contexts. However, they provide an overview of the performance progress of ML models.

4. Discussion

This study analyzes a large volume of research that integrates ML models in the field of oncology. Thus, articles from recent oncological research, covering the period between 1 January 2020 and 31 December 2025, were analyzed. The analysis included the types of cancer investigated, the datasets used to train the models, the ML models studied in these articles, and the performance metrics obtained in relation to the type of cancer explored. This review paper aims to inventory the studies based on a critical evaluation of the reproducibility, generalizability, transferability, and clinical trends introduced by these ML models.

Unlike other existing synthesis articles, the authors’ proposal makes a distinct methodological contribution. First, the selection process was conducted simultaneously across three different scientific databases, which allowed for complete coverage and eliminated marginal duplicates. The second direct contribution focused on extracting open-access articles that contain numerical values for performance metrics in the abstract. This approach is unique in the specialized literature up to this point and has allowed for a uniform quantitative analysis of all the extracted articles. A third original contribution of the study is marked by the two indicators, reproducibility rate and generalizability rate, proposed by the authors as a new perspective on the quality of results in the field of oncological ML. Furthermore, a comparative analysis of the reported ML model performances based on cancer type, as well as algorithm type, provides a unique perspective in the specialized literature by detailing and ensuring the reproducibility of the technical aspects that are currently lacking in the previous literature.

The selection process aligns with international standards for systematic reviews. Additionally, the article proposes a novel element in the analysis: an additional criterion that ensures scientific rigor from the initial drafting stage. Although the criterion of excluding articles that do not contain numerical values in the abstract represents a potential source of bias, the proposed strategy meets contemporary requirements regarding information accessibility and the ability to quickly identify important elements. The selection criteria were designed to prioritize technical reproducibility over volume. By requesting numerical data in the abstract, we ensured a uniform quantitative synthesis, which is often lacking in narrative reviews. We have strengthened the Methodology and Limitations sections to explicitly justify these choices as a deliberate methodological contribution, aimed at reducing ambiguity in performance reporting. In this way, this research distinguishes itself from all other reviews, introducing a unique contribution.

The work addresses a real need in the field of identifying ML technologies that can be implemented in oncological practice. From the analysis of the studied materials, the pitfalls resulting from imbalanced data, insufficiently documented training sets, lack of external validations, the inappropriate choice of an ML model, or the analysis of a limited set of performance metrics that can create a false illusion of high model performance were extracted. The paper has a novelty factor that places this research at the top of ML works in the field of oncology due to the proposed methodology, which is unique in the literature up to this point. Thus, articles were extracted from the WOS, Scopus, and PubMed databases, applying the strict use of open-source articles as filters so that they could be accessed. Therefore, from an initial total of 13,292 articles, the set was reduced to 1364 common articles across all three databases. Practically, these articles are indexed in all three databases, which guarantees the quality of the analyzed article. Furthermore, within the methodology, a technical criterion was applied to the numerical data, introducing a new numerical value filter in the summary. This way, the articles for which a comparative technical analysis can be performed were extracted.

Their classification was based on four analytical axes corresponding to the type of cancer, dataset characteristics, the type of ML algorithm, and the performance metrics obtained. Regarding the characteristics of the datasets, the authors introduced, in addition to the classic data analysis, two new indicators represented by reproducibility and generalizability, in order to allow for a relevant comparison between the study results. Reproducibility analysis quantifies high, medium, and low labels for each cancer type. For example, bone cancer, bladder cancer, and kidney cancer have a reproducibility of 100%, 50%, and 50%, respectively. Conversely, esophageal cancer, childhood cancer, laryngeal cancer, nasopharyngeal cancer, lymphatic cancer, and salivary gland tumor cancer had a null reproducibility. Generalizability analysis refers to the cohort size used in studies. The high category includes cancer types such as colorectal, breast, lung, bladder, and prostate; the medium category comprises breast cancer, colorectal cancer, lung cancer, prostate cancer, and gastric cancer, and the low category contains cancer types akin to CUP, leukemia and hematologic, multiple types, esophageal, lymphatic, laryngeal, nasopharyngeal, kidney, bone, and brain and CNS. Larger cohorts of over 1000 patients offer a greater impact on performance stability.

The performance metrics analyzed, for which extreme values were reported to highlight variability, were:

Accuracy ranging from 45.8% (gastric cancer) to 100% (breast, colorectal, and lung cancers);
Precision ranges from 36% (breast cancer) to 100% (breast and colorectal cancers);
Recall ranges between 50% (breast cancer) and 99.49% (breast cancer);
F1-score between 21.6% (thyroid cancer) and 99.79% (breast cancer);
AUC between 45% (lung cancer) and 100% (colorectal, gastric, ovarian, and pancreatic cancers).

The most analyzed ML models were identified in the inventory of ML models. Thus, RF is the most applied ML model in 82.14% of the types of cancer, and it is employed with the same weight as other ML models in an additional 14.28% of the types of cancer. This can be justified due to the tolerance of this model to incomplete training datasets. Regardless of the cancer type, the top five most frequently used ML models are RF, NN, LogisticReg, GB, and SVM. Through this approach, the paper investigates studies that extract the direct relationship between data quality, algorithm type, and the resulting outcome, making this study a critical tool for selecting ML technologies with real clinical potential.

The RF algorithm is the most analyzed in these studies due to its characteristics that allow for easy implementation. Also, the model does not require a time-consuming hyperparameter configuration. The model performs well in situations where the data is incomplete or heterogeneous. Furthermore, this model provides a simple analysis of variable importance, which is extremely important in medical research. These claims are also supported by other studies where comparative experiments on hundreds of datasets show that RF performs well even with minimal hyperparameter tuning, especially when the data is tabular and the variables are numerous [188,189,190]. This is often preferred by clinical researchers due to the transparency of the decisions. Also, the results are stable, meaning that successive training sessions will yield the same results. This dominance can introduce a certain bias, as simpler and more accessible models are favored over complex ones in the category of deep neural networks or generative models. The latter requires increased computational resources, as well as extensive datasets. There are studies showing that RF allows for interpretability through variable importance analyses and local explainability, which is particularly important for clinical acceptance. At the same time, in contexts with heterogeneous or incomplete data, RF has provided exceptional results [191]. The popularity of the RF model reflects researchers’ accessibility and familiarity with this model, not necessarily its absolute performance superiority. However, the literature also points out the risk that these simple models may be unfairly favored over other models that are superior in terms of performance but difficult to implement [192].

The most studied cancer type in the literature in which ML models are employed is breast cancer. From the category of cancers with a high mortality rate, the one that affects women the most is breast cancer [6]. Studies show that RF is used by numerous researchers for risk prediction and early diagnosis. This ML model achieved up to 99.3% accuracy in breast cancer prediction based on multifactorial, genetic, biochemical, and demographic factors [5]. Alongside models like RF, the XGBoost model is also present, which predicts tumor type [6]. Other approaches, such as integrating fluorescence spectroscopy with ML, achieve 98.78% accuracy in interoperable diagnosis [27]. Interpretable models have also been developed for assessing the risk in pre-survivors [8].

The synthesis aspects for datasets identified in the literature address class imbalance, which is common in oncology (low prevalence of many cancers). Metrics behave differently, having the following recommendations:

Accuracy may be misleading (a model predicting always “no cancer” can have high accuracy if prevalence is low);
Precision depends strongly on disease prevalence and, therefore, precision should be reported alongside prevalence or the prevalence-adjusted should be computed when appropriate;
When multiple metrics are used for clinical purposes, the minimum metrics that should be reported are accuracy, recall, precision, F1-score, and AUC;
Threshold selection matters when a decision threshold needs to be chosen based on clinical priorities. A recommendation in this way is to maximize recall for screening or to maximize precision for diagnostic confirmation;
Confidence intervals and statistical tests always require external validation on independent cohorts to check generalizability;
Overfitting warning is when the accuracy, precision, or AUC record values are near 100% on the internal test. The dataset may indicate overfitting, especially when no external validation is present.

One of the limitations identified in the literature concerns class imbalance, which is frequently encountered in oncological studies where the number of positive cases is much lower than that of negative ones. This leads to an artificial overestimation of accuracy. This type of behavior affects the correct evaluation of ML models in clinical studies. For a fair assessment, it is recommended to use a balanced set of performance indicators. This balance should include both AUC and calibration methods, such as calibration curves or the Brier score [193]. In addition to this, independent external tests for validation are required in the case of clinical studies. These tools provide a realistic estimate of performance under various clinical conditions, which will reduce the risk of overfitting in the long term. Furthermore, the clinical utility of a model must be reported against context-specific performance thresholds. For example, a recall of at least 90% and a precision of over 85% are considered feasibility standards for cancer screening [194]. Therefore, model evaluation should not be based on specific accuracy values but should include an analysis of the balance between sensitivity, specificity, and external calibration. This ensures real applicability in medical practice.

The paper is addressed to researchers in AI applied to oncology. They will find in this study a mapping of the most used models and data sources. The work is also aimed at clinicians and oncology specialists interested in understanding how to use ML models in prevention, diagnosis, prognosis, and treatment personalization. This review is also aimed at decision-makers and medical solution developers who can identify the models with the greatest chance of integration into clinical systems. Finally, the paper targets the academic community, which benefits from a replicable methodological framework for systematic evaluations in other development directions.

The results of this study confirm that the literature trends are focused on cancers with high incidence and standardized databases. The article differs from other similar works by quantifying the performance and major differences between studies analyzing the same type of cancer. For example, for breast cancer, accuracy ranges between 58% and 100%, and AUC between 62% and 99.89%.

The limitations of the study refer to the constraints in the methodology used to narrow the research framework. These constraints focused on the exclusive analysis of abstracts, open-access articles, and those containing numerical values in the abstract. These three conditions applied simultaneously can lead to the exclusion of a volume of research whose contributions can be major to the field of oncology. However, the authors believe that a quality article should include the most valuable aspects of the research in its synthesis, which is how this methodology was proposed.

In recent years, generative and self-supervised learning models have shaped a new direction in ML research applied to oncology. In addition to classical approaches to ML models, a range of advanced techniques will be discussed to evaluate the state of the art in relation to potential future research directions [195]. Makhlouf et al. [196] show that GANs produce synthetic samples in medical imaging when the data is imbalanced or there is a small number of examples for a specific class. These models have a diversity of data distribution that helps stabilize predictions, leading to a reduction in overfitting in CNNs. Frid-Adar et al. [197] employ GANs to generate synthetic augmented images in a set of 182 liver lesions, adding new data that will increase the sensitivity of the CNN compared to classical augmentation. The paper by Yang et al. [198] presents a model based on diffusion models for semantic enhancement, which is reflected in superior performance in medical imaging. Dai et al. [199] outline the advantage of guided text generation when cancer is rare. He produces synthetic samples that reflect subtle clinical variations. He and McMillan [200] state that DL offers better performance than traditional models when the dataset size is large. However, traditional models remain competitive when the data regime is moderate, also offering a trade-off between interpretability and computational cost. Traditional models, such as RF, are the best classifiers in terms of the balance between performance, requirements, and computational resources [201]. This analysis reveals that generative or diffusion models require intensive processing, high-quality and balanced data, significant computing power, and advanced knowledge, as well as dedicated development time, which is extensive compared to traditional models.

This review also identifies a series of inherent limitations in the field of ML applied to oncology. Among these are the following:

Publication bias (the tendency to report high performance);
Lack of external validation in numerous studies;
Overestimation of performance in contexts with class imbalance;
Heterogeneity of training and validation protocols;
Exclusion of articles that do not mention numerical values in the abstract;
Exclusion from detailed analysis of articles that are not open-source.

Additionally, the predominant use of public datasets (TCGA, GEO, SEER) can generate a bias due to reusing the same cohorts, limiting true clinical generalizability. The interpretation of performances must be done with caution, especially in the absence of multicenter and prospective validation.

The answers to the RQs stated in Section 1 are as follows:

RQ1: The types of cancer frequently investigated in the literature using an ML approach are: breast (350 articles), colorectal (337), lung (151), prostate (83), and gastric (60);
RQ2: The datasets used in most studies are TCGA (133 papers), GEO (94), and SEER (72). The most investigated ML model is RF;
RQ3: Performance levels are investigated using the maximum and minimum values for accuracy, precision, recall, F1-score, and AUC, simultaneously. Analyzing a single performance indicator is not conclusive regarding the quality of the model;
RQ4: Correlation with clinical potential refers to the fact that models with external validation and diverse datasets have the best chance of implementation, whereas models with extreme scores without validation are at risk of overfitting.

Incorporating ML models in oncology is opening up new horizons for prevention, early diagnosis, accurate prognosis, and the personalization of treatment plans for humans. The variability in performance across studies, ranging from almost perfect results to very low values, highlights that success is directly dependent on data quality, cohort size, the quality of the ML model, and its external validation. The major contribution of this study lies in the unique methodology applied in selecting and analyzing how ML models are used to identify gaps that need to be addressed in future research.

Future research directions should include:

The development of models based on federated learning, which can train on multicentric data without transferring sensitive information, taking into account the medical context of these studies;
Multimodal integration (imaging, genomic, clinical) for predictions at a level superior to the current one;
The systematic implementation of Explainable AI (XAI) techniques, as a facilitator in clinical acceptance;
The use of prospective validation and randomized studies to confirm real-world applicability;
The standardization of performance reporting according to the TRIPOD-AI guidelines.

As in most recent review articles, future research directions are represented by managing the bias present in datasets. ML models amplify the imbalances generated by the uneven distribution of data. Standardizing the data collection and annotation process, including multicenter and multiethnic cohorts, and studying model interpretability in a clinical context represent future research directions.

5. Conclusions

Cancer is one of the most complex medical challenges of the 21st century. It is characterized by clinical diversity, biological complexity, the need for early diagnoses, and personalized treatments. In recent years, remarkable progress has been made in the field of oncology through the integration of AI techniques, specifically through ML models. These come with a unique opportunity to analyze a large volume of data that human experts often cannot, causing them to overlook certain details. ML models solve this problem and support the expert in making clinical decisions. The implementation of these technologies must be understood for a systematic evaluation where the analysis of reproducibility, generalizability, applicability, and clinical relevance in the results leads to oncological success.

This paper conducted a systematic analysis of 1364 articles published between 1 January 2020 and 31 December 2025. These were extracted simultaneously from three major databases: WOS, Scopus, and PubMed. For the analyzed articles, the filters applied were the requirement of being open access and presenting numerical data in the abstract to highlight the technical nature of the approach within these articles. The study analyzes the research from four approaches: the type of cancer, the characteristics of the datasets, the ML models used, and the performance achieved. The results showed a major focus on high-incidence cancers. The papers primarily use public databases, and the preferred algorithm is RF. The research identified different cancer types, and the analysis found that these performances are directly related to the quality and size of the datasets used. The analysis highlights the potential of ML models in oncology through clinical implementation dependent on access to large and diverse datasets, external validation protocols, standardized performance reports, and the integration of algorithm interpretability. Using all these factors in implementing ML models can lead to good performance metrics, allowing them to be integrated into real-world practice without generating risks or erroneous results.

Future research directions should focus on the creation of a standardized protocol for reporting ML performance in oncology. Furthermore, future research should expand research toward cancers with low incidence, which have low reproducibility, such as brain, bone, and salivary gland cancers. The integration of algorithm interpretability into all stages of development would increase clinical trust in ML models and, along with systematic external validation before clinical application, would lead to widespread adoption of these AI technologies. Exploring strategies for combining datasets should also be investigated much more to identify whether they improve the degree of generalization.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/app16052226/s1.

Author Contributions

Conceptualization, C.-M.R.; methodology, C.-M.R., A.S. and A.G.B.; software, C.-M.R. and A.S.; validation, C.-M.R., A.S. and A.G.B.; formal analysis, C.-M.R., A.S. and A.G.B.; investigation, C.-M.R., A.S. and A.G.B.; resources, C.-M.R., A.S. and A.G.B.; data curation, C.-M.R. and A.S.; writing—original draft preparation, C.-M.R., A.S. and A.G.B.; writing—review and editing, C.-M.R., A.S. and A.G.B.; visualization C.-M.R., A.S. and A.G.B.; supervision, C.-M.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Petroleum-Gas University of Ploiesti, Romania.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

Figure 5 was designed by brgfx—Freepik.com (accessed on 7 August 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AdaBoost	Adaptive Boosting
ADC	Apparent diffusion coefficient
AI	Artificial intelligence
ANN	Artificial Neural Network
AS	Active surveillance
ATA	American Thyroid Association
AUC	Area Under the Curve
CAF	Cancer-associated fibroblast
CatBoost	Categorical Boosting
CCLE	Cancer Cell Line Encyclopedia
CEA	Carcinoembryonic Antigen
CNN	Convolutional Neural Network
CNS	Central Nervous System
CPTAC	Clinical Proteomic Tumor Analysis Consortium
CT	Computed tomography
CTC	Circulating tumor cell
CTU	Computed tomography urography
CUP	Cancer of unknown primary
CUPLR	Cancer of Unknown Primary Location Resolver
DL	Deep learning
DRF	Distributed Random Forest
DT	Decision Tree
EHO	Elephant Herding Optimisation
FAIR	Findable, Accessible, Interoperable, Reusable
FN	False negative
FOBT	Fecal Occult Blood Test
FP	False positive
FPO	Flower Pollination Optimization
GB	Gradient Boosting
GBDT	Gradient Boosting Decision Tree
GEO	Gene Expression Omnibus
GMM	Gaussian Mixture Model
GTEx	Genotype-Tissue Expression
IDC	Invasive ductal carcinoma
ITGAV	Integrin subunit alpha V
KNN	K-Nearest Neighbor
KRAS	Kirsten Rat Sarcoma viral oncogene homolog
LASSO	Least Absolute Shrinkage and Selection Operator
LightGBM	Light Gradient-Boosting Machine
LogisticReg	Logistic regression
METABRIC	Molecular Taxonomy of Breast Cancer International Consortium
MIMIC	Medical Information Mart for Intensive Care
ML	Machine learning
MLP	Multilayer perceptron
NB	Naive Bayes
NCI	U.S. National Cancer Institute
NLP	Natural language processing
NN	Neural network
NSCLC	Non-small cell lung cancer
PTC	Papillary thyroid cancer
RCC	Renal cell carcinoma
RF	Random Forest
RNA	Ribonucleic acid
RQs	Research question
RSF	Random Survival Forest
SEER	Surveillance, Epi-demiology, and End Results
SHAP	SHapley Additive exPlanations
STFT	Short Term Fourier Transform
STING	Stimulator of INterferon Genes
SVM	Support Vector Machine
TCGA	The Cancer Genome Atlas
TCIA	The Cancer Imaging Archive
TN	True negative
TP	True positive
TRIPOD	Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis
WOS	Web of Science
XAI	Explainable artificial intelligence
XGBoost	Extreme Gradient Boosting

References

National Cancer Institute. Cancer Types. Available online: https://www.cancer.gov/types (accessed on 12 January 2026).
American Cancer Society. Cancer Facts & Figures 2025. Available online: https://www.cancer.org/content/dam/cancer-org/research/cancer-facts-and-statistics/annual-cancer-facts-and-figures/2025/2025-cancer-facts-and-figures-acs.pdf (accessed on 4 January 2026).
Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71. [Google Scholar] [CrossRef]
Rosca, C.-M.; Stancu, A. A Review of Neuro-ML Breakthroughs in Addressing Neurological Disorders. Appl. Sci. 2025, 15, 5442. [Google Scholar] [CrossRef]
Nazari, E.; Naderi, H.; Tabadkani, M.; Arefnezhad, R.; Farzin, A.H.; Dashtiahangar, M.; Khazaei, M.; Ferns, G.A.; Mehrabian, A.; Tabesh, H.; et al. Breast cancer prediction using different machine learning methods applying multi factors. J. Cancer Res. Clin. Oncol. 2023, 149, 17133–17146. [Google Scholar] [CrossRef]
Chen, H.; Wang, N.; Du, X.; Mei, K.; Zhou, Y.; Cai, G. Classification Prediction of Breast Cancer Based on Machine Learning. Comput. Intell. Neurosci. 2023, 2023, 6530719. [Google Scholar] [CrossRef] [PubMed]
Jozsa, F.; Baker, R.; Kelly, P.; Ahmed, M.; Douek, M. The Use of Machine Learning to Reduce Overtreatment of the Axilla in Breast Cancer: Retrospective Cohort Study. JMIR Perioper. Med. 2022, 5, e34600. [Google Scholar] [CrossRef] [PubMed]
Plisson, M.; Moll, A.; Sarrazin, V.; Charles, D.; Antoine, T.; Ionescu, R.; Koehren, O.; Raymond, E. Methods for Inclusive Underwriting of Breast Cancer Risk with Machine Learning and Innovative Algorithms. J. Insur. Med. 2023, 50, 36–48. [Google Scholar] [CrossRef]
Arravalli, T.; Chadaga, K.; Muralikrishna, H.; Sampathila, N.; Cenitta, D.; Chadaga, R.; Swathi, K.S. Detection of breast cancer using machine learning and explainable artificial intelligence. Sci. Rep. 2025, 15, 26931. [Google Scholar] [CrossRef]
Amraei, J.; Mirzapoor, A.; Motarjem, K.; Abdolahad, M. Enhancing breast cancer diagnosis through machine learning algorithms. Sci. Rep. 2025, 15, 23316. [Google Scholar] [CrossRef]
Shadman, H.; Gomrok, S.; Litle, C.; Cheng, Q.; Jiang, Y.; Huang, X.; Ziebarth, J.D.; Wang, Y. A machine learning-based investigation of integrin expression patterns in cancer and metastasis. Sci. Rep. 2025, 15, 5270. [Google Scholar] [CrossRef]
Rodriguez, P.J.; Heagerty, P.J.; Clark, S.; Khor, S.; Chen, Y.; Haupt, E.; Hahn, E.E.; Shankaran, V.; Bansal, A. Using Machine Learning to Leverage Biomarker Change and Predict Colorectal Cancer Recurrence. JCO Clin. Cancer Inform. 2023, 7, e2300066. [Google Scholar] [CrossRef] [PubMed]
Li, R.; Hao, X.; Diao, Y.; Yang, L.; Liu, J. Explainable Machine Learning Models for Colorectal Cancer Prediction Using Clinical Laboratory Data. Cancer Control 2025, 32, 1–14. [Google Scholar] [CrossRef]
Bifarin, O.O.; Fernández, F.M. Automated Machine Learning and Explainable AI (AutoML-XAI) for Metabolomics: Improving Cancer Diagnostics. J. Am. Soc. Mass Spectrom. 2024, 35, 1089–1100. [Google Scholar] [CrossRef]
Tokareva, A.; Iurova, M.; Starodubtseva, N.; Chagovets, V.; Novoselova, A.; Kukaev, E.; Frankevich, V.; Sukhikh, G. Machine Learning Framework for Ovarian Cancer Diagnostics Using Plasma Lipidomics and Metabolomics. Int. J. Mol. Sci. 2025, 26, 6630. [Google Scholar] [CrossRef]
Lee, D.; Lee, C.; Han, K.; Goo, T.; Kim, B.; Han, Y.; Kwon, W.; Lee, S.; Jang, J.-Y.; Park, T. Machine learning models for pancreatic cancer diagnosis based on microbiome markers from serum extracellular vesicles. Sci. Rep. 2025, 15, 10995. [Google Scholar] [CrossRef]
Wang, H.-N.; An, J.-H.; Wang, F.-Q.; Hu, W.-Q.; Zong, L. Predicting gastric cancer survival using machine learning: A systematic review. World J. Gastrointest. Oncol. 2025, 17, 103804. [Google Scholar] [CrossRef]
Yuan, J.; Zhou, D.; Yu, S. Interpretable machine learning driven biomarker identification and validation for prostate cancer. Transl. Androl. Urol. 2025, 14, 1528–1541. [Google Scholar] [CrossRef] [PubMed]
Liu, L.; Duan, W.; She, T.; Ma, S.; Wang, H.; Chen, J. Machine learning-enabled prediction of bone metastasis in esophageal cancer. Front. Med. 2025, 12, 1620687. [Google Scholar] [CrossRef]
Krishnan, A. Radiomics and machine learning for predicting metachronous liver metastasis in rectal cancer. World J. Gastrointest. Oncol. 2025, 17, 102324. [Google Scholar] [CrossRef] [PubMed]
Jopek, M.A.; Pastuszak, K.; Sieczczyński, M.; Cygert, S.; Żaczek, A.J.; Rondina, M.T.; Supernat, A. Improving platelet-RNA-based diagnostics: A comparative analysis of machine learning models for cancer detection and multiclass classification. Mol. Oncol. 2024, 18, 2743–2754. [Google Scholar] [CrossRef]
Koh, H.Y.K.; Lam, U.T.F.; Ban, K.H.-K.; Chen, E.S. Machine learning optimized DriverDetect software for high precision prediction of deleterious mutations in human cancers. Sci. Rep. 2024, 14, 22618. [Google Scholar] [CrossRef] [PubMed]
Hamano, J.; Takeuchi, A.; Keyaki, T.; Nose, H.; Hayashi, K. Optimal Machine Learning Models for Developing Prognostic Predictions in Patients with Advanced Cancer. Cureus 2024, 16, e76227. [Google Scholar] [CrossRef]
Dong, B.; Lu, Z.; Yang, T.; Wang, J.; Zhang, Y.; Tuo, X.; Wang, J.; Lin, S.; Cai, H.; Cheng, H.; et al. Development, validation, and clinical application of a machine learning model for risk stratification and management of cervical cancer screening based on full-genotyping hrHPV test (SMART-HPV): A modelling study. Lancet Reg. Health-West. Pac. 2025, 55, 101480. [Google Scholar] [CrossRef]
Su, J.; Lu, H.; Zhang, R.; Cui, N.; Chen, C.; Si, Q.; Song, B. Cervical cancer prediction using machine learning models based on routine blood analysis. Sci. Rep. 2025, 15, 22655. [Google Scholar] [CrossRef] [PubMed]
Li, H.; Dai, L.; Guo, S.; Wang, H.; Lei, L.; Yu, J.; Li, X.; Wang, J. Rapid differentiation of patients with lung cancers from benign lung nodule based on dried serum Fourier-transform infrared spectroscopy combined with machine learning algorithms. Photodiagn. Photodyn. Ther. 2025, 54, 104653. [Google Scholar] [CrossRef] [PubMed]
Amin, A.; Priya, M.; Rodrigues, J.; Biswas, S.; Chandra, S.; Mathew, S.; Ray, S.; Rao, B.S.S.; Mahato, K.K. Machine Learning Empowered a Graphical User Interface on Native Fluorescence to Predict Breast Cancer. ACS Omega 2025, 10, 20315–20325. [Google Scholar] [CrossRef]
Lopes, C.; Brandão, A.; Teixeira, M.R.; Dinis-Ribeiro, M.; Pereira, C. Saliva-derived transcriptomic signature for gastric cancer detection using machine learning and leveraging publicly available datasets. Sci. Rep. 2025, 15, 18491. [Google Scholar] [CrossRef] [PubMed]
Li, J.; Chen, A.; Liu, Z.; Wei, S.; Zhang, J.; Chen, J.; Shi, C. Machine learning driven prediction of drug efficacy in lung cancer: Based on protein biomarkers and clinical features. Life Sci. 2025, 375, 123706. [Google Scholar] [CrossRef]
Zhang, H.; Dong, B.; Han, J.; Huang, L. Interpretable machine learning models for survival prediction in prostate cancer bone metastases. Sci. Rep. 2025, 15, 24150. [Google Scholar] [CrossRef]
Tegaw, E.M.; Asfaw, B.B. Machine learning analysis of survival outcomes in breast cancer patients treated with chemotherapy, hormone therapy, surgery, and radiotherapy. Sci. Rep. 2025, 15, 24981. [Google Scholar] [CrossRef]
Montgomery, A.; Vadapalli, R.; Dinenno, F.A.; Schilling, J.; Jain, P.; Jacob, A.; Chism, D.; Shanker, A. Machine learning to evaluate the effects of non-clinical social determinant features in predicting colorectal Cancer mortality in a medically underserved Appalachian population. Sci. Rep. 2025, 15, 25781. [Google Scholar] [CrossRef]
Yao, J.; Zhou, W.; Xu, S.; Jia, X.; Zhou, J.; Chen, X.; Zhan, W. Machine Learning-Based Breast Tumor Ultrasound Radiomics for Pre-operative Prediction of Axillary Sentinel Lymph Node Metastasis Burden in Early-Stage Invasive Breast Cancer. Ultrasound Med. Biol. 2024, 50, 229–236. [Google Scholar] [CrossRef]
Zhang, D.; Shen, M.; Zhang, L.; He, X.; Huang, X. Establishment of an interpretable MRI radiomics-based machine learning model capable of predicting axillary lymph node metastasis in invasive breast cancer. Sci. Rep. 2025, 15, 26030. [Google Scholar] [CrossRef]
Liu, Y.; Liu, Y.; Zhang, Y.; Zhang, P.; Xie, J.; Zhao, N.; Xie, Y.; Cheng, C.; Zhao, S. Using machine learning algorithms to predict risk factors of heart failure after complete mesocolic excision in colorectal cancer patients. Sci. Rep. 2025, 15, 25441. [Google Scholar] [CrossRef]
Anselmi, F.; Cavigli, L.; Pagliaro, A.; Valente, S.; Valentini, F.; Cameli, M.; Focardi, M.; Mochi, N.; Dendale, P.; Hansen, D.; et al. The importance of ventilatory thresholds to define aerobic exercise intensity in cardiac patients and healthy subjects. Scand. J. Med. Sci. Sports 2021, 31, 1796–1808. [Google Scholar] [CrossRef]
Kurokawa, S.; Okamoto, H.; Nakaichi, T.; Mikasa, S.; Nakamura, S.; Iijima, K.; Chiba, T.; Nakayama, H.; Kashihara, T.; Inaba, K.; et al. Machine learning prediction for lung dose in locally advanced esophageal cancer using volumetric modulated arc therapy. Med. Dosim. 2025, 50, 229–236. [Google Scholar] [CrossRef] [PubMed]
Yuan, Z.-N.; Xue, Y.-J.; Wang, H.-J.; Qu, S.-N.; Huang, C.-L.; Wang, H.; Zhang, H.; Zhang, M.-Z.; Xing, X.-Z. A predictive model for hospital death in cancer patients with acute pulmonary embolism using XGBoost machine learning and SHAP interpretation. Sci. Rep. 2025, 15, 18268. [Google Scholar] [CrossRef] [PubMed]
Rosca, C.-M. New Algorithm to Prevent Online Test Fraud Based on Cognitive Services and Input Devices Events. In Proceedings of Third Emerging Trends and Technologies on Intelligent Systems. ETTIS 2023; Noor, A., Saroha, K., Pricop, E., Sen, A., Trivedi, G., Eds.; Lecture Notes in Networks and Systems; Springer Nature: Singapore, 2023; Volume 730, pp. 207–219. [Google Scholar] [CrossRef]
Zhang, C.; Wang, Z.; Shang, P.; Zhou, Y.; Zhu, J.; Xu, L.; Chen, Z.; Yu, M.; Zang, Y. Combining multi-parametric MRI radiomics features with tumor abnormal protein to construct a machine learning-based predictive model for prostate cancer. Sci. Rep. 2025, 15, 22816. [Google Scholar] [CrossRef] [PubMed]
S Athisayamani, T.S.; Singh, A.R.; Hwang, J.-Y.; Joshi, G.P. A novel double machine learning approach for detecting early breast cancer using advanced feature selection and dimensionality reduction techniques. Sci. Rep. 2025, 15, 22971. [Google Scholar] [CrossRef]
Yadav, K.; Daga, S.; Saini, A. Structural Analysis of Sustainable Fashion Adoption: An Integrated TISM and MICMAC Approach. J. Sustain. Mark. 2025, 6, 13–36. [Google Scholar] [CrossRef]
Freepik. Anatomical Structure Human Body–Brgfx. Available online: https://www.freepik.com/free-vector/anatomical-structure-human-body_27539420.htm (accessed on 7 August 2025).
Freepik. Male and Female Reproductive System Organs Realistic Set Isolated on White Background Vector Illustration–Macrovector. Available online: https://www.freepik.com/free-vector/male-female-reproductive-system-organs-realistic-sset-isolated-white-background-vector-illustration_26764508.htm (accessed on 7 August 2025).
Collins, G.S.; Reitsma, J.B.; Altman, D.G.; Moons, K. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): The TRIPOD Statement. BMC Med. 2015, 13, 1. [Google Scholar] [CrossRef]
Moons, K.G.M.; Altman, D.G.; Reitsma, J.B.; Ioannidis, J.P.A.; Macaskill, P.; Steyerberg, E.W.; Vickers, A.J.; Ransohoff, D.F.; Collins, G.S. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and Elaboration. Ann. Intern. Med. 2015, 162, W1–W73. [Google Scholar] [CrossRef]
National Library of Medicine. Definition of FAIR Data. Available online: https://www.nlm.nih.gov/oet/ed/cde/tutorial/02-200.html (accessed on 3 January 2026).
Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J.; Appleton, G.; Axton, M.; Baak, A.; Blomberg, N.; Boiten, J.-W.; Da Silva Santos, L.B.; Bourne, P.E.; et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 2016, 3, 160018. [Google Scholar] [CrossRef]
Rosca, C.M.; Gortoescu, I.A.; Tanase, M.R. Artificial Intelligence—Powered Video Content Generation Tools. Rom. J. Pet. Gas Technol. 2024, 5, 131–144. [Google Scholar] [CrossRef]
Zantvoort, K.; Nacke, B.; Görlich, D.; Hornstein, S.; Jacobi, C.; Funk, B. Estimation of minimal data sets sizes for machine learning predictions in digital mental health interventions. npj Digit. Med. 2024, 7, 361. [Google Scholar] [CrossRef]
Dhiman, P.; Ma, J.; Qi, C.; Bullock, G.; Sergeant, J.C.; Riley, R.D.; Collins, G.S. Sample size requirements are not being considered in studies developing prediction models for binary outcomes: A systematic review. BMC Med. Res. Methodol. 2023, 23, 188. [Google Scholar] [CrossRef]
Riley, R.D.; Ensor, J.; Snell, K.I.E.; Archer, L.; Whittle, R.; Dhiman, P.; Alderman, J.; Liu, X.; Kirton, L.; Manson-Whitton, J.; et al. Importance of sample size on the quality and utility of AI-based prediction models for healthcare. Lancet Digit. Health 2025, 7, 100857. [Google Scholar] [CrossRef] [PubMed]
Goldenholz, D.M.; Sun, H.; Ganglberger, W.; Westover, M.B. Sample Size Analysis for Machine Learning Clinical Validation Studies. Biomedicines 2023, 11, 685. [Google Scholar] [CrossRef] [PubMed]
De Velasco, M.A.; Sakai, K.; Mitani, S.; Kura, Y.; Minamoto, S.; Haeno, T.; Hayashi, H.; Nishio, K. A machine learning-based method for feature reduction of methylation data for the classification of cancer tissue origin. Int. J. Clin. Oncol. 2024, 29, 1795–1810. [Google Scholar] [CrossRef]
Khalili-Tanha, G.; Mohit, R.; Asadnia, A.; Khazaei, M.; Dashtiahangar, M.; Maftooh, M.; Nassiri, M.; Hassanian, S.M.; Ghayour-Mobarhan, M.; Kiani, M.A.; et al. Identification of ZMYND19 as a novel biomarker of colorectal cancer: RNA-sequencing and machine learning analysis. J. Cell Commun. Signal. 2023, 17, 1469–1485. [Google Scholar] [CrossRef]
Zeng, X.; Lu, Z.; Dai, C.; Su, H.; Liu, Z.; Cheng, S. Establish TIIC signature score based the machine learning fusion in bladder cancer. Discov. Oncol. 2024, 15, 368. [Google Scholar] [CrossRef]
Liu, X.; Wang, W.; Zhang, X.; Liang, J.; Feng, D.; Li, Y.; Xue, M.; Ling, B. Metabolism pathway-based subtyping in endometrial cancer: An integrated study by multi-omics analysis and machine learning algorithms. Mol. Ther.-Nucleic Acids 2024, 35, 102155. [Google Scholar] [CrossRef]
Liu, Y.; Pang, Z.Q.; Wang, J.S.; Wang, J.F.; He, J.X.; Ji, B.; Zhang, L.; Ren, M.H. Heat shock protein family A member 8 is a prognostic marker for bladder cancer: Evidences based on experiments and machine learning. J. Cell. Mol. Med. 2023, 27, 3995–4008. [Google Scholar] [CrossRef] [PubMed]
Tian, M.; Wang, T.; Wang, P. Development and Clinical Validation of a Seven-Gene Prognostic Signature Based on Multiple Machine Learning Algorithms in Kidney Cancer. Cell Transplant. 2021, 30, 1–17. [Google Scholar] [CrossRef]
Ma, H.; Shi, L.; Zheng, J.; Zeng, L.; Chen, Y.; Zhang, S.; Tang, S.; Qu, Z.; Xiong, X.; Zheng, X.; et al. Advanced machine learning unveils CD8+ T cell genetic markers enhancing prognosis and immunotherapy efficacy in breast cancer. BMC Cancer 2024, 24, 1222. [Google Scholar] [CrossRef]
Li, Q.; Xiao, X.; Feng, J.; Yan, R.; Xi, J. Machine learning-assisted analysis of epithelial mesenchymal transition pathway for prognostic stratification and immune infiltration assessment in ovarian cancer. Front. Endocrinol. 2023, 14, 1196094. [Google Scholar] [CrossRef]
Tian, H.; Liu, Z.; Liu, J.; Zong, Z.; Chen, Y.; Zhang, Z.; Li, H. Application of machine learning algorithm in predicting distant metastasis of T1 gastric cancer. Sci. Rep. 2023, 13, 5741. [Google Scholar] [CrossRef]
Manikandan, P.; Durga, U.; Ponnuraja, C. An integrative machine learning framework for classifying SEER breast cancer. Sci. Rep. 2023, 13, 5362. [Google Scholar] [CrossRef]
Wan, J.; Zeng, Y. Prediction of hepatic metastasis in esophageal cancer based on machine learning. Sci. Rep. 2024, 14, 14507. [Google Scholar] [CrossRef] [PubMed]
Rosca, C.-M.; Stancu, A. Quality assessment of GPT-3.5 and Gemini 1.0 Pro for SQL syntax. Comput. Stand. Interfaces 2026, 95, 104041. [Google Scholar] [CrossRef]
Comba, A.; Faisal, S.M.; Dunn, P.J.; Argento, A.E.; Hollon, T.C.; Al-Holou, W.N.; Varela, M.L.; Zamler, D.B.; Quass, G.L.; Apostolides, P.F.; et al. Spatiotemporal analysis of glioma heterogeneity reveals COL1A1 as an actionable target to disrupt tumor progression. Nat. Commun. 2022, 13, 3606. [Google Scholar] [CrossRef] [PubMed]
Bahar, R.C.; Merkaj, S.; Cassinelli Petersen, G.I.; Tillmanns, N.; Subramanian, H.; Brim, W.R.; Zeevi, T.; Staib, L.; Kazarian, E.; Lin, M.; et al. Machine Learning Models for Classifying High- and Low-Grade Gliomas: A Systematic Review and Quality of Reporting Analysis. Front. Oncol. 2022, 12, 856231. [Google Scholar] [CrossRef]
Pasquini, L.; Napolitano, A.; Lucignani, M.; Tagliente, E.; Dellepiane, F.; Rossi-Espagnet, M.C.; Ritrovato, M.; Vidiri, A.; Villani, V.; Ranazzi, G.; et al. AI and High-Grade Glioma for Diagnosis and Outcome Prediction: Do All Machine Learning Models Perform Equally Well? Front. Oncol. 2021, 11, 601425. [Google Scholar] [CrossRef]
Redlich, J.-P.; Feuerhake, F.; Weis, J.; Schaadt, N.S.; Teuber-Hanselmann, S.; Buck, C.; Luttmann, S.; Eberle, A.; Nikolin, S.; Appenzeller, A.; et al. Applications of artificial intelligence in the analysis of histopathology images of gliomas: A review. npj Imaging 2024, 2, 16. [Google Scholar] [CrossRef] [PubMed]
Braman, N.; Gordon, J.W.H.; Goossens, E.T.; Willis, C.; Stumpe, M.C.; Venkataraman, J. Deep Orthogonal Fusion: Multimodal Prognostic Biomarker Discovery Integrating Radiology, Pathology, Genomic, and Clinical Data. In Proceedings of the 24th International Conference on Medical Image Computing and Computer-Assisted Intervention, Strasbourg, France, 27 September–1 October 2021; pp. 667–677. [Google Scholar] [CrossRef]
Rosca, C.-M.; Stancu, A.; Tănase, M.R. A Comparative Study of Azure Custom Vision Versus Google Vision API Integrated into AI Custom Models Using Object Classification for Residential Waste. Appl. Sci. 2025, 15, 3869. [Google Scholar] [CrossRef]
Daga, S.; Yadav, K.; Singh, P.; Mishra, V. Beyond the Take-Make-Dispose Model—Unlocking the Power of Circular Economy for an Environmentally Resilient Future. In Circular Economy and Environmental Resilience; Singh, P., Daga, S., Yadav, K., Mishra, V., Eds.; Springer: Cham, Switzerland, 2025; pp. 1–11. [Google Scholar] [CrossRef]
Petrov, M.; Makarova, N.; Monemian, A.; Pham, J.; Lekka, M.; Sokolov, I. Detection of Human Bladder Epithelial Cancerous Cells with Atomic Force Microscopy and Machine Learning. Cells 2024, 14, 14. [Google Scholar] [CrossRef] [PubMed]
Wei, Z.; Bai, X.; Xv, Y.; Chen, S.-H.; Yin, S.; Li, Y.; Lv, F.; Xiao, M.; Xie, Y. A radiomics-based interpretable machine learning model to predict the HER2 status in bladder cancer: A multicenter study. Insights Imaging 2024, 15, 262. [Google Scholar] [CrossRef]
Urbanos, G.; Martín, A.; Vázquez, G.; Villanueva, M.; Villa, M.; Jimenez-Roldan, L.; Chavarrías, M.; Lagares, A.; Juárez, E.; Sanz, C. Supervised Machine Learning Methods and Hyperspectral Imaging Techniques Jointly Applied for Brain Cancer Classification. Sensors 2021, 21, 3827. [Google Scholar] [CrossRef] [PubMed]
Umer, M.; Naveed, M.; Alrowais, F.; Ishaq, A.; Hejaili, A.A.; Alsubai, S.; Eshmawi, A.A.; Mohamed, A.; Ashraf, I. Breast Cancer Detection Using Convoluted Features and Ensemble Machine Learning Algorithm. Cancers 2022, 14, 6015. [Google Scholar] [CrossRef]
Miglietta, F.; Collesei, A.; Vernieri, C.; Giarratano, T.; Giorgi, C.A.; Girardi, F.; Griguolo, G.; Cacciatore, M.; Botticelli, A.; Vingiani, A.; et al. Development of two machine learning models to predict conversion from primary HER2-0 breast cancer to HER2-low metastases: A proof-of-concept study. ESMO Open 2025, 10, 104087. [Google Scholar] [CrossRef]
Miao, Y.; Zhang, X.; Chen, S.; Zhou, W.; Xu, D.; Shi, X.; Li, J.; Tu, J.; Yuan, X.; Lv, K.; et al. Identifying cancer tissue-of-origin by a novel machine learning method based on expression quantitative trait loci. Front. Oncol. 2022, 12, 946552. [Google Scholar] [CrossRef]
Jeong, Y.; Chu, J.; Kang, J.; Baek, S.; Lee, J.-H.; Jung, D.-S.; Kim, W.-W.; Kim, Y.-R.; Kang, J.; Do, I.-G. Application of Transcriptome-Based Gene Set Featurization for Machine Learning Model to Predict the Origin of Metastatic Cancer. Curr. Issues Mol. Biol. 2024, 46, 7291–7302. [Google Scholar] [CrossRef] [PubMed]
Al Mudawi, N.; Alazeb, A. A Model for Predicting Cervical Cancer Using Machine Learning Algorithms. Sensors 2022, 22, 4132. [Google Scholar] [CrossRef] [PubMed]
Ou, Z.; Mao, W.; Tan, L.; Yang, Y.; Liu, S.; Zhang, Y.; Li, B.; Zhao, D. Prediction of Postoperative Pathologic Risk Factors in Cervical Cancer Patients Treated with Radical Hysterectomy by Machine Learning. Curr. Oncol. 2022, 29, 9613–9629. [Google Scholar] [CrossRef] [PubMed]
Maurya, N.S.; Kushwaha, S.; Chawade, A.; Mani, A. Transcriptome profiling by combined machine learning and statistical R analysis identifies TMEM236 as a potential novel diagnostic biomarker for colorectal cancer. Sci. Rep. 2021, 11, 14304. [Google Scholar] [CrossRef]
Rhanoui, M.; Mikram, M.; Amazian, K.; Ait-Abderrahim, A.; Yousfi, S.; Toughrai, I. Multimodal Machine Learning for Predicting Post-Surgery Quality of Life in Colorectal Cancer Patients. J. Imaging 2024, 10, 297. [Google Scholar] [CrossRef]
Bezzi, C.; Bergamini, A.; Mathoux, G.; Ghezzo, S.; Monaco, L.; Candotti, G.; Fallanca, F.; Gajate, A.M.S.; Rabaiotti, E.; Cioffi, R.; et al. Role of Machine Learning (ML)-Based Classification Using Conventional 18F-FDG PET Parameters in Predicting Postsurgical Features of Endometrial Cancer Aggressiveness. Cancers 2023, 15, 325. [Google Scholar] [CrossRef]
Bruno, V.; Betti, M.; D’Ambrosio, L.; Massacci, A.; Chiofalo, B.; Pietropolli, A.; Piaggio, G.; Ciliberto, G.; Nisticò, P.; Pallocca, M.; et al. Machine learning endometrial cancer risk prediction model: Integrating guidelines of European Society for Medical Oncology with the tumor immune framework. Int. J. Gynecol. Cancer 2023, 33, 1708–1714. [Google Scholar] [CrossRef]
Gong, X.; Zheng, B.; Xu, G.; Chen, H.; Chen, C. Application of machine learning approaches to predict the 5-year survival status of patients with esophageal cancer. J. Thorac. Dis. 2021, 13, 6240–6251. [Google Scholar] [CrossRef]
Wan, J.; Zhou, J. Use machine learning to predict bone metastasis of esophageal cancer: A population-based study. Digit. Health 2025, 11, 1–12. [Google Scholar] [CrossRef]
Bang, C.S.; Ahn, J.Y.; Kim, J.-H.; Kim, Y.-I.; Choi, I.J.; Shin, W.G. Establishing Machine Learning Models to Predict Curative Resection in Early Gastric Cancer with Undifferentiated Histology: Development and Usability Study. J. Med. Internet Res. 2021, 23, e25053. [Google Scholar] [CrossRef]
Choi, Y.; Lee, J.; Shin, K.; Lee, J.W.; Kim, J.W.; Lee, S.; Choi, Y.J.; Park, K.H.; Kim, J.H. Integrated clinical and genomic models using machine-learning methods to predict the efficacy of paclitaxel-based chemotherapy in patients with advanced gastric cancer. BMC Cancer 2024, 24, 502. [Google Scholar] [CrossRef]
Ishii, H.; Saitoh, M.; Sakamoto, K.; Sakamoto, K.; Saigusa, D.; Kasai, H.; Ashizawa, K.; Miyazawa, K.; Takeda, S.; Masuyama, K.; et al. Lipidome-based rapid diagnosis with machine learning for detection of TGF-β signalling activated area in head and neck cancer. Br. J. Cancer 2020, 122, 995–1004. [Google Scholar] [CrossRef] [PubMed]
Baig, Z.; Abu-Omar, N.; Khan, R.; Verdiales, C.; Frehlick, R.; Shaw, J.; Wu, F.-X.; Luo, Y. Prognosticating Outcome in Pancreatic Head Cancer With the use of a Machine Learning Algorithm. Technol. Cancer Res. Treat. 2021, 20, 1–8. [Google Scholar] [CrossRef]
Mahmud, S.; Abbas, T.O.; Mushtak, A.; Prithula, J.; Chowdhury, M.E.H. Kidney Cancer Diagnosis and Surgery Selection by Machine Learning from CT Scans Combined with Clinical Metadata. Cancers 2023, 15, 3189. [Google Scholar] [CrossRef]
Yan, Y.; Sun, Q.; Du, H.; Sun, W.; Guo, Y.; Li, B.; Wang, X. Machine learning models predict the progression of long-term renal insufficiency in patients with renal cancer after radical nephrectomy. BMC Nephrol. 2024, 25, 450. [Google Scholar] [CrossRef]
Panda, P.; Bisoy, S.K.; Panigrahi, A.; Pati, A.; Sahu, B.; Guo, Z.; Liu, H.; Jain, P. BIMSSA: Enhancing cancer prediction with salp swarm optimization and ensemble machine learning approaches. Front. Genet. 2025, 15, 1491602. [Google Scholar] [CrossRef] [PubMed]
Koowattanasuchat, S.; Ngernpimai, S.; Matulakul, P.; Thonghlueng, J.; Phanchai, W.; Chompoosor, A.; Panitanarak, U.; Wanna, Y.; Intharah, T.; Chootawiriyasakul, K.; et al. Rapid detection of cancer DNA in human blood using cysteamine-capped AuNPs and a machine learning-enabled smartphone. RSC Adv. 2023, 13, 1301–1311. [Google Scholar] [CrossRef]
Vekariya, V.; Passi, K.; Jain, C.K. Predicting liver cancer on epigenomics data using machine learning. Front. Bioinform. 2022, 2, 954529. [Google Scholar] [CrossRef]
Potapova, E.V.; Shupletsov, V.V.; Dremin, V.V.; Zherebtsov, E.A.; Mamoshin, A.V.; Dunaev, A.V. In Vivo Time-Resolved Fluorescence Detection of Liver Cancer Supported by Machine Learning. Lasers Surg. Med. 2024, 56, 836–844. [Google Scholar] [CrossRef]
Pathan, R.K.; Shorna, I.J.; Hossain, M.S.; Khandaker, M.U.; Almohammed, H.I.; Hamd, Z.Y. The efficacy of machine learning models in lung cancer risk prediction with explainability. PLoS ONE 2024, 19, e0305035. [Google Scholar] [CrossRef]
Sepehri, S.; Tankyevych, O.; Upadhaya, T.; Visvikis, D.; Hatt, M.; Cheze Le Rest, C. Comparison and Fusion of Machine Learning Algorithms for Prospective Validation of PET/CT Radiomic Features Prognostic Value in Stage II-III Non-Small Cell Lung Cancer. Diagnostics 2021, 11, 675. [Google Scholar] [CrossRef]
Li, Y.; Luo, Y. Performance-weighted-voting model: An ensemble machine learning method for cancer type classification using whole-exome sequencing mutation. Quant. Biol. 2020, 8, 347–358. [Google Scholar] [CrossRef] [PubMed]
Akcay, M.; Etiz, D.; Celik, O.; Ozen, A. Evaluation of Prognosis in Nasopharyngeal Cancer Using Machine Learning. Technol. Cancer Res. Treat. 2020, 19, 1–9. [Google Scholar] [CrossRef] [PubMed]
Alabi, R.O.; Elmusrati, M.; Leivo, I.; Almangush, A.; Mäkitie, A.A. Machine learning explainability in nasopharyngeal cancer survival using LIME and SHAP. Sci. Rep. 2023, 13, 8984. [Google Scholar] [CrossRef]
Kluz-Barłowska, M.; Kluz, T.; Paja, W.; Sarzyński, J.; Łączyńska-Madera, M.; Odrzywolski, A.; Król, P.; Cebulski, J.; Depciuch, J. FT-Raman data analyzed by multivariate and machine learning as a new methods for detection spectroscopy marker of platinum-resistant women suffering from ovarian cancer. Sci. Rep. 2023, 13, 20772. [Google Scholar] [CrossRef]
Laios, A.; Katsenou, A.; Tan, Y.S.; Johnson, R.; Otify, M.; Kaufmann, A.; Munot, S.; Thangavelu, A.; Hutson, R.; Broadhead, T.; et al. Feature Selection is Critical for 2-Year Prognosis in Advanced Stage High Grade Serous Ovarian Cancer by Using Machine Learning. Cancer Control 2021, 28, 1–12. [Google Scholar] [CrossRef]
Akmeşe, Ö.F. Data privacy-aware machine learning approach in pancreatic cancer diagnosis. BMC Med. Inform. Decis. Mak. 2024, 24, 248. [Google Scholar] [CrossRef]
Zhang, L.; Yu, Z.; Jin, R.; Yang, X.; Ying, D. Machine learning-based prediction of surgical benefit in borderline resectable and locally advanced pancreatic cancer. J. Cancer Res. Clin. Oncol. 2023, 149, 11857–11871. [Google Scholar] [CrossRef]
Rodrigues, V.C.; Soares, J.C.; Soares, A.C.; Braz, D.C.; Melendez, M.E.; Ribas, L.C.; Scabini, L.F.S.; Bruno, O.M.; Carvalho, A.L.; Reis, R.M.; et al. Electrochemical and optical detection and machine learning applied to images of genosensors for diagnosis of prostate cancer with the biomarker PCA3. Talanta 2021, 222, 121444. [Google Scholar] [CrossRef]
Qiao, X.; Gu, X.; Liu, Y.; Shu, X.; Ai, G.; Qian, S.; Liu, L.; He, X.; Zhang, J. MRI Radiomics-Based Machine Learning Models for Ki67 Expression and Gleason Grade Group Prediction in Prostate Cancer. Cancers 2023, 15, 4536. [Google Scholar] [CrossRef] [PubMed]
Shah, S.A.H.; Shah, S.T.H.; Khaled, R.A.; Buccoliero, A.; Shah, S.B.H.; Di Terlizzi, A.; Di Benedetto, G.; Deriu, M.A. Explainable AI-Based Skin Cancer Detection Using CNN, Particle Swarm Optimization and Machine Learning. J. Imaging 2024, 10, 332. [Google Scholar] [CrossRef]
Günther, M.P.; Kirchebner, J.; Schulze, J.B.; Känel, R.; Euler, S. Towards identifying cancer patients at risk to miss out on psycho-oncological treatment via machine learning. Eur. J. Cancer Care 2022, 31, e13555. [Google Scholar] [CrossRef]
Liu, W.; Wang, S.; Ye, Z.; Xu, P.; Xia, X.; Guo, M. Prediction of lung metastases in thyroid cancer using machine learning based on SEER database. Cancer Med. 2022, 11, 2503–2515. [Google Scholar] [CrossRef]
Lai, S.-W.; Fan, Y.-L.; Zhu, Y.-H.; Zhang, F.; Guo, Z.; Wang, B.; Wan, Z.; Liu, P.-L.; Yu, N.; Qin, H.-D. Machine learning-based dynamic prediction of lateral lymph node metastasis in patients with papillary thyroid cancer. Front. Endocrinol. 2022, 13, 1019037. [Google Scholar] [CrossRef] [PubMed]
Rosca, C.M.; Popescu, M.; Patrascioiu, C.; Stancu, A. Comparative Analysis of pH Level Between Pasteurized and UTH Milk Using Dedicated Developed Application. Rev. Chim. 2019, 70, 3917–3920. [Google Scholar] [CrossRef]
Jeong, H.-J.; Kim, K.; Kim, H.W.; Park, Y. Classification between Normal and Cancerous Human Urothelial Cells by Using Micro-Dimensional Electrochemical Impedance Spectroscopy Combined with Machine Learning. Sensors 2022, 22, 7969. [Google Scholar] [CrossRef] [PubMed]
Zhao, Q.; Liu, M.-Y.; Gao, K.-X.; Zhang, B.-S.; Qi, F.-Y.; Xing, T.-R.; Liu, C.-C.; Gao, J.-P. Predicting 90-day risk of urinary tract infections following urostomy in bladder cancer patients using machine learning and explainability. Sci. Rep. 2025, 15, 6807. [Google Scholar] [CrossRef] [PubMed]
Zhu, J.; Zhao, Z.; Yin, B.; Wu, C.; Yin, C.; Chen, R.; Ding, Y. An integrated approach of feature selection and machine learning for early detection of breast cancer. Sci. Rep. 2025, 15, 13015. [Google Scholar] [CrossRef]
Jongbloed, E.M.; Jansen, M.P.H.M.; De Weerd, V.; Helmijr, J.A.; Beaufort, C.M.; Reinders, M.J.T.; Van Marion, R.; Van Ijcken, W.F.J.; Sonke, G.S.; Konings, I.R.; et al. Machine learning-based somatic variant calling in cell-free DNA of metastatic breast cancer patients using large NGS panels. Sci. Rep. 2023, 13, 10424. [Google Scholar] [CrossRef]
Nguyen, L.; Van Hoeck, A.; Cuppen, E. Machine learning-based tissue of origin classification for cancer of unknown primary diagnostics using genome-wide mutation features. Nat. Commun. 2022, 13, 4013. [Google Scholar] [CrossRef]
Hatzidaki, E.; Iliopoulos, A.; Papasotiriou, I. A Novel Method for Colorectal Cancer Screening Based on Circulating Tumor Cells and Machine Learning. Entropy 2021, 23, 1248. [Google Scholar] [CrossRef]
Li, J.; Zhang, W.; Chen, L.; Mao, X.; Wang, X.; Liu, J.; Huang, Y.; Qi, H.; Chen, L.; Shi, H.; et al. SNPs and blood inflammatory marker featured machine learning for predicting the efficacy of fluorouracil-based chemotherapy in colorectal cancer. Sci. Rep. 2024, 14, 27700. [Google Scholar] [CrossRef]
Sun, X.; Zhang, L.; Luo, Q.; Zhou, Y.; Du, J.; Fu, D.; Wang, Z.; Lei, Y.; Wang, Q.; Zhao, L. Application of Machine Learning in the Diagnosis of Early Gastric Cancer Using the Kyoto Classification Score and Clinical Features Collected from Medical Consultations. Bioengineering 2024, 11, 973. [Google Scholar] [CrossRef]
Safakish, A.; Sannachi, L.; Dicenzo, D.; Kolios, C.; Pejović-Milić, A.; Czarnota, G.J. Predicting head and neck cancer treatment outcomes with pre-treatment quantitative ultrasound texture features and optimising machine learning classifiers with texture-of-texture features. Front. Oncol. 2023, 13, 1258970. [Google Scholar] [CrossRef]
Cauvin, C.; Bourguignon, L.; Carriat, L.; Mence, A.; Ghipponi, P.; Salas, S.; Ciccolini, J. Machine-Learning Exploration of Exposure-Effect Relationships of Cisplatin in Head and Neck Cancer Patients. Pharmaceutics 2022, 14, 2509. [Google Scholar] [CrossRef]
Endalie, D.; Abebe, W.T. Analysis of lung cancer risk factors from medical records in Ethiopia using machine learning. PLoS Digit. Health 2023, 2, e0000308. [Google Scholar] [CrossRef]
Huang, P.; Shang, J.; Fan, Y.; Hu, Z.; Dai, J.; Liu, Z.; Yan, H. Unsupervised machine learning model for detecting anomalous volumetric modulated arc therapy plans for lung cancer patients. Front. Big Data 2024, 7, 1462745. [Google Scholar] [CrossRef]
Arezzo, F.; Cormio, G.; La Forgia, D.; Santarsiero, C.M.; Mongelli, M.; Lombardi, C.; Cazzato, G.; Cicinelli, E.; Loizzi, V. A machine learning approach applied to gynecological ultrasound to predict progression-free survival in ovarian cancer patients. Arch. Gynecol. Obstet. 2022, 306, 2143–2154. [Google Scholar] [CrossRef]
Teng, B.; Zhang, X.; Ge, M.; Miao, M.; Li, W.; Ma, J. Personalized three-year survival prediction and prognosis forecast by interpretable machine learning for pancreatic cancer patients: A population-based study and an external validation. Front. Oncol. 2024, 14, 1488118. [Google Scholar] [CrossRef]
Zandie, F.; Salehi, M.; Maziar, A.; Bayatiani, M.R.; Paydar, R. Radiomics based Machine Learning Models for Classification of Prostate Cancer Grade Groups from Multi Parametric MRI Images. J. Med. Signals Sens. 2024, 14, 33. [Google Scholar] [CrossRef]
Mokoatle, M.; Mapiye, D.; Marivate, V.; Hayes, V.M.; Bornman, R. Discriminatory Gleason grade group signatures of prostate cancer: An application of machine learning methods. PLoS ONE 2022, 17, e0267714. [Google Scholar] [CrossRef]
Kavitha, P.; Ayyappan, G.; Jayagopal, P.; Mathivanan, S.K.; Mallik, S.; Al-Rasheed, A.; Alqahtani, M.S.; Soufiene, B.O. Detection for melanoma skin cancer through ACCF, BPPF, and CLF techniques with machine learning approach. BMC Bioinform. 2023, 24, 458. [Google Scholar] [CrossRef]
Rosca, C.-M.; Stancu, A. Integration of AI in Self-Powered IoT Sensor Systems. Appl. Sci. 2025, 15, 7008. [Google Scholar] [CrossRef]
Üstgörül, S.; Popescu, C. What Is the Mediating Role of Communication Skills and Sexual Satisfaction between Job and Life Satisfaction of Healthcare Employees? Behav. Sci. 2023, 13, 368. [Google Scholar] [CrossRef]
Melo, P.N.; Machado, C. Business Intelligence and Analytics in Small and Medium Enterprises; CRC Press: Boca Raton, FL, USA, 2019. [Google Scholar] [CrossRef]
İnce, O.; Yıldız, H.; Kisbet, T.; Ertürk, Ş.M.; Önder, H. Classification of retinoblastoma-1 gene mutation with machine learning-based models in bladder cancer. Heliyon 2022, 8, e09311. [Google Scholar] [CrossRef]
Ke, S.; Huang, Y.; Wang, D.; Jiang, Q.; Luo, Z.; Li, B.; Yan, D.; Zhou, J. BreCML: Identifying breast cancer cell state in scRNA-seq via machine learning. Front. Med. 2024, 11, 1482726. [Google Scholar] [CrossRef]
Du, J.; Yang, J.; Yang, Q.; Zhang, X.; Yuan, L.; Fu, B. Comparison of machine learning models to predict the risk of breast cancer-related lymphedema among breast cancer survivors: A cross-sectional study in China. Front. Oncol. 2024, 14, 1334082. [Google Scholar] [CrossRef]
Horio, Y.; Ikeda, J.; Matsumoto, K.; Okada, S.; Nagano, K.; Kusunoki, K.; Kuwahara, R.; Kimura, K.; Kataoka, K.; Beppu, N.; et al. Machine learning-based radiomics models accurately predict Crohn’s disease-related anorectal cancer. Oncol. Lett. 2024, 28, 421. [Google Scholar] [CrossRef]
Zhai, Y.; Lin, X.; Wei, Q.; Pu, Y.; Pang, Y. Interpretable prediction of cardiopulmonary complications after non-small cell lung cancer surgery based on machine learning and SHapley additive exPlanations. Heliyon 2023, 9, e17772. [Google Scholar] [CrossRef]
Rovera, G.; Grimaldi, S.; Oderda, M.; Finessi, M.; Giannini, V.; Passera, R.; Gontero, P.; Deandreis, D. Machine Learning CT-Based Automatic Nodal Segmentation and PET Semi-Quantification of Intraoperative 68Ga-PSMA-11 PET/CT Images in High-Risk Prostate Cancer: A Pilot Study. Diagnostics 2023, 13, 3013. [Google Scholar] [CrossRef]
Nguyen, Q.T.N.; Nguyen, P.A.; Wang, C.J.; Phuc, P.T.; Lin, R.K.; Hung, C.S.; Kuo, N.H.; Cheng, Y.W.; Lin, S.J.; Hsieh, Z.Y.; et al. Machine learning approaches for predicting 5-year breast cancer survival: A multicenter study. Cancer Sci. 2023, 114, 4063–4072. [Google Scholar] [CrossRef]
Villazon, J.; Dela Cruz, N.; Shi, L. Cancer Cell Line Classification Using Raman Spectroscopy of Cancer-Derived Exosomes and Machine Learning. Anal. Chem. 2025, 97, 7289–7298. [Google Scholar] [CrossRef]
Si, D.; Shu, Y.; Jiang, H.; Lin, X.; Yuan, Q.; Deng, S.; Luo, W.; Lin, Y.; Wang, J.; Zhan, C.; et al. Construction of diagnostic models with machine-learning algorithms for colorectal cancer based on clinical laboratory parameters. J. Gastrointest. Oncol. 2024, 15, 2145–2156. [Google Scholar] [CrossRef]
Moon, I.; Lopiccolo, J.; Baca, S.C.; Sholl, L.M.; Kehl, K.L.; Hassett, M.J.; Liu, D.; Schrag, D.; Gusev, A. Machine learning for genetics-based classification and treatment response prediction in cancer of unknown primary. Nat. Med. 2023, 29, 2057–2067. [Google Scholar] [CrossRef]
Yan, S.-Y.; Fu, X.-Y.; Tang, S.-P.; Qi, R.-B.; Liang, J.-W.; Mao, X.-L.; Ye, L.-P.; Li, S.-W. A feasibility study on utilizing machine learning technology to reduce the costs of gastric cancer screening in Taizhou, China. Digit. Health 2024, 10, 1–10. [Google Scholar] [CrossRef]
Agheli, R.; Siavashpour, Z.; Reiazi, R.; Azghandi, S.; Cheraghi, S.; Paydar, R. Predicting severe radiation-induced oral mucositis in head and neck cancer patients using integrated baseline CT radiomic, dosimetry, and clinical features: A machine learning approach. Heliyon 2024, 10, e24866. [Google Scholar] [CrossRef]
Reber, B.; Van Dijk, L.; Anderson, B.; Mohamed, A.S.R.; Fuller, C.; Lai, S.; Brock, K. Comparison of Machine-Learning and Deep-Learning Methods for the Prediction of Osteoradionecrosis Resulting From Head and Neck Cancer Radiation Therapy. Adv. Radiat. Oncol. 2023, 8, 101163. [Google Scholar] [CrossRef]
Nair, A.R.; Rajaguru, H.; Karthika, M.S.; Keerthivasan, C. Metaheuristic integrated machine learning classification of colon cancer using STFT LASSO and EHO feature extraction from microarray gene expressions. Sci. Rep. 2024, 14, 16485. [Google Scholar] [CrossRef]
Schöneck, M.; Rehbach, N.; Lotter-Becker, L.; Persigehl, T.; Lennartz, S.; Caldeira, L.L. Machine Learning-Based Radiomics Analysis for Identifying KRAS Mutations in Non-Small-Cell Lung Cancer from CT Images: Challenges, Insights and Implications. Life 2025, 15, 83. [Google Scholar] [CrossRef]
Hsu, W.H.; Ko, A.T.; Weng, C.S.; Chang, C.L.; Jan, Y.T.; Lin, J.B.; Chien, H.J.; Lin, W.C.; Sun, F.J.; Wu, K.P.; et al. Explainable machine learning model for predicting skeletal muscle loss during surgery and adjuvant chemotherapy in ovarian cancer. J. Cachexia Sarcopenia Muscle 2023, 14, 2044–2053. [Google Scholar] [CrossRef]
Arafa, M.A.; Omar, I.; Farhat, K.H.; Elshinawy, M.; Khan, F.; Alkhathami, F.A.; Mokhtar, A.; Althunayan, A.; Rabah, D.M.; Badawy, A.-H.A. A Comparison of Systematic, Targeted, and Combined Biopsy Using Machine Learning for Prediction of Prostate Cancer Risk: A Multi-Center Study. Med. Princ. Pract. 2024, 33, 491–500. [Google Scholar] [CrossRef]
Nayan, M.; Salari, K.; Bozzo, A.; Ganglberger, W.; Lu, G.; Carvalho, F.; Gusev, A.; Schneider, A.; Westover, B.M.; Feldman, A.S. A machine learning approach to predict progression on active surveillance for prostate cancer. Urol. Oncol. Semin. Orig. Investig. 2022, 40, 161.e1–161.e7. [Google Scholar] [CrossRef]
Tajerian, A.; Kazemian, M.; Tajerian, M.; Akhavan Malayeri, A. Design and validation of a new machine-learning-based diagnostic tool for the differentiation of dermatoscopic skin cancer images. PLoS ONE 2023, 18, e0284437. [Google Scholar] [CrossRef]
Firat Atay, F.; Yagin, F.H.; Colak, C.; Elkiran, E.T.; Mansuri, N.; Ahmad, F.; Ardigò, L.P. A hybrid machine learning model combining association rule mining and classification algorithms to predict differentiated thyroid cancer recurrence. Front. Med. 2024, 11, 1461372. [Google Scholar] [CrossRef]
Wang, H.; Zhang, C.; Li, Q.; Tian, T.; Huang, R.; Qiu, J.; Tian, R. Development and validation of prediction models for papillary thyroid cancer structural recurrence using machine learning approaches. BMC Cancer 2024, 24, 427. [Google Scholar] [CrossRef]
Rosca, C.M.; Ariciu, A.V. Unlocking Customer Sentiment Insights with Azure Sentiment Analysis: A Comprehensive Review and Analysis. Rom. J. Pet. Gas Technol. 2023, IV, 173–182. [Google Scholar] [CrossRef]
Chen, C.; Zhang, J.; Liu, X.; Zhuang, Q.; Lu, H.; Hou, J. Machine learning developed an intratumor heterogeneity signature for predicting clinical outcome and immunotherapy benefit in bladder cancer. Transl. Androl. Urol. 2024, 13, 1104–1117. [Google Scholar] [CrossRef]
Lee, S.; Liang, X.; Woods, M.; Reiner, A.S.; Concannon, P.; Bernstein, L.; Lynch, C.F.; Boice, J.D.; Deasy, J.O.; Bernstein, J.L.; et al. Machine learning on genome-wide association studies to predict the risk of radiation-associated contralateral breast cancer in the WECARE Study. PLoS ONE 2020, 15, e0226157. [Google Scholar] [CrossRef]
Shaikh, T.A.; Ali, R. An automated machine learning tool for breast cancer diagnosis for healthcare professionals. Health Syst. 2022, 11, 303–333. [Google Scholar] [CrossRef]
Petrov, M.; Sokolov, I. Machine Learning Allows for Distinguishing Precancerous and Cancerous Human Epithelial Cervical Cells Using High-Resolution AFM Imaging of Adhesion Maps. Cells 2023, 12, 2536. [Google Scholar] [CrossRef]
Kim, S.I.; Lee, S.; Choi, C.H.; Lee, M.; Suh, D.H.; Kim, H.S.; Kim, K.; Chung, H.H.; No, J.H.; Kim, J.-W.; et al. Machine Learning Models to Predict Survival Outcomes According to the Surgical Approach of Primary Radical Hysterectomy in Patients with Early Cervical Cancer. Cancers 2021, 13, 3709. [Google Scholar] [CrossRef]
Talebi, R.; Celis-Morales, C.A.; Akbari, A.; Talebi, A.; Borumandnia, N.; Pourhoseingholi, M.A. Machine learning-based classifiers to predict metastasis in colorectal cancer patients. Front. Artif. Intell. 2024, 7, 1285037. [Google Scholar] [CrossRef]
Liu, N.-J.; Liu, M.-S.; Tian, W.; Zhai, Y.-N.; Lv, W.-L.; Wang, T.; Guo, S.-L. The value of machine learning based on CT radiomics in the preoperative identification of peripheral nerve invasion in colorectal cancer: A two-center study. Insights Imaging 2024, 15, 101. [Google Scholar] [CrossRef]
Yang, L.-Y.; Siow, T.Y.; Lin, Y.-C.; Wu, R.-C.; Lu, H.-Y.; Chiang, H.-J.; Ho, C.-Y.; Huang, Y.-T.; Huang, Y.-L.; Pan, Y.-B.; et al. Computer-Aided Segmentation and Machine Learning of Integrated Clinical and Diffusion-Weighted Imaging Parameters for Predicting Lymph Node Metastasis in Endometrial Cancer. Cancers 2021, 13, 1406. [Google Scholar] [CrossRef]
Akazawa, M.; Hashimoto, K.; Noda, K.; Yoshida, K. The application of machine learning for predicting recurrence in patients with early-stage endometrial cancer: A pilot study. Obstet. Gynecol. Sci. 2021, 64, 266–273. [Google Scholar] [CrossRef]
Li, L.; Qin, Z.; Bo, J.; Hu, J.; Zhang, Y.; Qian, L.; Dong, J. Machine learning-based radiomics prognostic model for patients with proximal esophageal cancer after definitive chemoradiotherapy. Insights Imaging 2024, 15, 284. [Google Scholar] [CrossRef]
Cui, Y.; Li, Z.; Xiang, M.; Han, D.; Yin, Y.; Ma, C. Machine learning models predict overall survival and progression free survival of non-surgical esophageal cancer patients with chemoradiotherapy based on CT image radiomics signatures. Radiat. Oncol. 2022, 17, 212. [Google Scholar] [CrossRef] [PubMed]
Gilani, N.; Arabi Belaghi, R.; Aftabi, Y.; Faramarzi, E.; Edgünlü, T.; Somi, M.H. Identifying Potential miRNA Biomarkers for Gastric Cancer Diagnosis Using Machine Learning Variable Selection Approach. Front. Genet. 2022, 12, 779455. [Google Scholar] [CrossRef] [PubMed]
Yang, T.; Martinez-Useros, J.; Liu, J.; Alarcón, I.; Li, C.; Li, W.; Xiao, Y.; Ji, X.; Zhao, Y.; Wang, L.; et al. A retrospective analysis based on multiple machine learning models to predict lymph node metastasis in early gastric cancer. Front. Oncol. 2022, 12, 1023110. [Google Scholar] [CrossRef] [PubMed]
Gangil, T.; Sharan, K.; Rao, B.D.; Palanisamy, K.; Chakrabarti, B.; Kadavigere, R. Utility of adding Radiomics to clinical features in predicting the outcomes of radiotherapy for head and neck cancer using machine learning. PLoS ONE 2022, 17, e0277168. [Google Scholar] [CrossRef]
Varghese, A.J.; Gouthamchand, V.; Sasidharan, B.K.; Wee, L.; Sidhique, S.K.; Rao, J.P.; Dekker, A.; Hoebers, F.; Devakumar, D.; Irodi, A.; et al. Multi-centre radiomics for prediction of recurrence following radical radiotherapy for head and neck cancers: Consequences of feature selection, machine learning classifiers and batch-effect harmonization. Phys. Imaging Radiat. Oncol. 2023, 26, 100450. [Google Scholar] [CrossRef]
Ji, L.; Zhang, W.; Huang, J.; Tian, J.; Zhong, X.; Luo, J.; Zhu, S.; He, Z.; Tong, Y.; Meng, X.; et al. Bone metastasis risk and prognosis assessment models for kidney cancer based on machine learning. Front. Public Health 2022, 10, 1015952. [Google Scholar] [CrossRef]
Wang, H.; He, Z.; Xu, J.; Chen, T.; Huang, J.; Chen, L.; Yue, X. Development and validation of a machine learning model to predict the risk of lymph node metastasis in early-stage supraglottic laryngeal cancer. Front. Oncol. 2025, 15, 1525414. [Google Scholar] [CrossRef]
Nakajo, M.; Nagano, H.; Jinguji, M.; Kamimura, Y.; Masuda, K.; Takumi, K.; Tani, A.; Hirahara, D.; Kariya, K.; Yamashita, M.; et al. The usefulness of machine-learning-based evaluation of clinical and pretreatment 18F-FDG-PET/CT radiomic features for predicting prognosis in patients with laryngeal cancer. Br. J. Radiol. 2023, 96, 20220772. [Google Scholar] [CrossRef]
Sun, J.Q.; Wu, S.N.; Mou, Z.L.; Wen, J.Y.; Wei, H.; Zou, J.; Li, Q.J.; Liu, Z.L.; Xu, S.H.; Kang, M.; et al. Prediction model of ocular metastasis from primary liver cancer: Machine learning-based development and interpretation study. Cancer Med. 2023, 12, 20482–20496. [Google Scholar] [CrossRef]
Lei, L.; Wang, Y.; Xue, Q.; Tong, J.; Zhou, C.-M.; Yang, J.-J. A comparative study of machine learning algorithms for predicting acute kidney injury after liver cancer resection. PeerJ 2020, 8, e8583. [Google Scholar] [CrossRef] [PubMed]
Wang, Y.; Mei, N.; Zhou, Z.; Fang, Y.; Lin, J.; Zhao, F.; Fang, Z.; Li, Y. A novel prediction model for the prognosis of non-small cell lung cancer with clinical routine laboratory indicators: A machine learning approach. BMC Med. Inform. Decis. Mak. 2024, 24, 344. [Google Scholar] [CrossRef]
Luna, J.M.; Chao, H.-H.; Shinohara, R.T.; Ungar, L.H.; Cengel, K.A.; Pryma, D.A.; Chinniah, C.; Berman, A.T.; Katz, S.I.; Kontos, D.; et al. Machine learning highlights the deficiency of conventional dosimetric constraints for prevention of high-grade radiation esophagitis in non-small cell lung cancer treated with chemoradiation. Clin. Transl. Radiat. Oncol. 2020, 22, 69–75. [Google Scholar] [CrossRef] [PubMed]
Hamidi, F.; Gilani, N.; Arabi Belaghi, R.; Yaghoobi, H.; Babaei, E.; Sarbakhsh, P.; Malakouti, J. Identifying potential circulating miRNA biomarkers for the diagnosis and prediction of ovarian cancer using machine-learning approach: Application of Boruta. Front. Digit. Health 2023, 5, 1187578. [Google Scholar] [CrossRef] [PubMed]
Barber, E.L.; Garg, R.; Persenaire, C.; Simon, M. Natural language processing with machine learning to predict outcomes after ovarian cancer surgery. Gynecol. Oncol. 2021, 160, 182–186. [Google Scholar] [CrossRef]
Abu-Khudir, R.; Hafsa, N.; Badr, B.E. Identifying Effective Biomarkers for Accurate Pancreatic Cancer Prognosis Using Statistical Machine Learning. Diagnostics 2023, 13, 3091. [Google Scholar] [CrossRef] [PubMed]
Iwatate, Y.; Yokota, H.; Hoshino, I.; Ishige, F.; Kuwayama, N.; Itami, M.; Mori, Y.; Chiba, S.; Arimitsu, H.; Yanagibashi, H.; et al. Machine learning with imaging features to predict the expression of ITGAV, which is a poor prognostic factor derived from transcriptome analysis in pancreatic cancer. Int. J. Oncol. 2022, 60, 60. [Google Scholar] [CrossRef]
Li, G.; Zhao, R.; Xie, Z.; Qu, X.; Duan, Y.; Zhu, Y.; Liang, H.; Tang, D.; Li, Z.; He, W. Mining bone metastasis related key genes of prostate cancer from the STING pathway based on machine learning. Front. Med. 2024, 11, 1372495. [Google Scholar] [CrossRef] [PubMed]
Satır, A.; Üstündağ, Y.; Yeşil, M.R.; Huysal, K. Prediction of Prostate Cancer From Routine Laboratory Markers With Automated Machine Learning. J. Clin. Lab. Anal. 2025, 39, e25143. [Google Scholar] [CrossRef]
Yang, T.-Y.; Chien, T.-W.; Lai, F.-J. Web-Based Skin Cancer Assessment and Classification Using Machine Learning and Mobile Computerized Adaptive Testing in a Rasch Model: Development Study. JMIR Med. Inform. 2022, 10, e33006. [Google Scholar] [CrossRef]
Wu, Y.; Rao, K.; Liu, J.; Han, C.; Gong, L.; Chong, Y.; Liu, Z.; Xu, X. Machine Learning Algorithms for the Prediction of Central Lymph Node Metastasis in Patients with Papillary Thyroid Cancer. Front. Endocrinol. 2020, 11, 577537. [Google Scholar] [CrossRef]
Rosca, C.-M.; Bold, R.-A.; Gerea, A.-E. A Comprehensive Patient Triage Algorithm Incorporating ChatGPT API for Symptom-Based Healthcare Decision-Making. In Proceedings of the Emerging Trends and Technologies on Intelligent Systems. ETTIS 2024, Lecture Notes in Networks and Systems, Noida, India, 27–28 March 2024; pp. 167–178. [Google Scholar] [CrossRef]
Tian, L.; Long, F.; Hao, Y.; Li, B.; Li, Y.; Tang, Y.; Li, J.; Zhao, Q.; Chen, J.; Liu, M. A Cancer Associated Fibroblasts-Related Six-Gene Panel for Anti-PD-1 Therapy in Melanoma Driven by Weighted Correlation Network Analysis and Supervised Machine Learning. Front. Med. 2022, 9, 880326. [Google Scholar] [CrossRef]
Couronné, R.; Probst, P.; Boulesteix, A.-L. Random forest versus logistic regression: A large-scale benchmark experiment. BMC Bioinform. 2018, 19, 270. [Google Scholar] [CrossRef]
Wallace, M.L.; Mentch, L.; Wheeler, B.J.; Tapia, A.L.; Richards, M.; Zhou, S.; Yi, L.; Redline, S.; Buysse, D.J. Use and misuse of random forest variable importance metrics in medicine: Demonstrations through incident stroke prediction. BMC Med. Res. Methodol. 2023, 23, 144. [Google Scholar] [CrossRef] [PubMed]
Petkovic, D.; Altman, R.; Wong, M.; Vigil, A. Improving the explainability of Random Forest classifier—user centered approach. In Proceedings of the Pacific Symposium on Biocomputing, Kohala Coast, HI, USA, 3–7 January 2018; pp. 204–215. Available online: https://psb.stanford.edu/psb-online/proceedings/psb18/petkovic.pdf (accessed on 5 January 2026).
Yu, F.; Wei, C.; Deng, P.; Peng, T.; Hu, X. Deep exploration of random forest model boosts the interpretability of machine learning studies of complicated immune responses and lung burden of nanoparticles. Sci. Adv. 2021, 7, eabf4130. [Google Scholar] [CrossRef]
Elshawi, R.; Al-Mallah, M.H.; Sakr, S. On the interpretability of machine learning-based model for predicting hypertension. BMC Med. Inform. Decis. Mak. 2019, 19, 146. [Google Scholar] [CrossRef] [PubMed]
Saito, T.; Rehmsmeier, M. The Precision-Recall Plot Is More Informative than the ROC Plot When Evaluating Binary Classifiers on Imbalanced Datasets. PLoS ONE 2015, 10, e0118432. [Google Scholar] [CrossRef]
Collins, G.S.; Moons, K.G.M. Reporting of artificial intelligence prediction models. Lancet 2019, 393, 1577–1579. [Google Scholar] [CrossRef]
Rosca, C.-M.; Stancu, A. A Comprehensive Review of Machine Learning Models for Optimizing Wind Power Processes. Appl. Sci. 2025, 15, 3758. [Google Scholar] [CrossRef]
Makhlouf, A.; Maayah, M.; Abughanam, N.; Catal, C. The use of generative adversarial networks in medical image augmentation. Neural Comput. Appl. 2023, 35, 24055–24068. [Google Scholar] [CrossRef]
Frid-Adar, M.; Diamant, I.; Klang, E.; Amitai, M.; Goldberger, J.; Greenspan, H. GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification. Neurocomputing 2018, 321, 321–331. [Google Scholar] [CrossRef]
Yang, Y.; Fu, H.; Aviles-Rivero, A.I.; Schönlieb, C.-B.; Zhu, L. DiffMIC: Dual-Guidance Diffusion Network for Medical Image Classification. In Proceedings of the 26th International Conference on Medical Image Computing and Computer-Assisted Intervention, Vancouver, BC, Canada, 8–12 October 2023; pp. 95–105. [Google Scholar] [CrossRef]
Dai, F.; Yao, S.; Wang, M.; Zhu, Y.; Qiu, X.; Sun, P.; Qiu, C.; Yin, J.; Shen, G.; Sun, J.; et al. Improving AI models for rare thyroid cancer subtype by text guided diffusion models. Nat. Commun. 2025, 16, 4449. [Google Scholar] [CrossRef]
He, Z.; McMillan, A.B. Comparative Evaluation of Radiomics and Deep Learning Models for Disease Detection in Chest Radiography. J. Imaging Inform. Med. 2025, 1–12. [Google Scholar] [CrossRef] [PubMed]
Wan, K.W.; Wong, C.H.; Ip, H.F.; Fan, D.; Yuen, P.L.; Fong, H.Y.; Ying, M. Evaluation of the performance of traditional machine learning algorithms, convolutional neural network and AutoML Vision in ultrasound breast lesions classification: A comparative study. Quant. Imaging Med. Surg. 2021, 11, 1381–1393. [Google Scholar] [CrossRef]

Figure 1. Cancer types by the NCI [1,2]. Note: CNS—Central Nervous System; CUP—Carcinoma of Unknown Primary.

Figure 2. Workflow of the systematic review methodology for ML applications in cancer research.

Figure 3. PRISMA flow diagram of the study selection process (the use of the asterisk in the word cancer includes all lexical derivatives).

Figure 4. Distribution of ML publications by cancer type (2020–2025).

Figure 5. Distribution of publication number by cancer type (2020–2025) based on 1364 papers retrieved from the WOS, Scopus, and PubMed [43,44].

Figure 6. Workflow of the generalizability assessment of ML studies in cancer research by cancer types (the use of the asterisk in the word cancer includes all lexical derivatives).

Figure 7. Top 10 most frequently used cancer datasets in the literature.

Table 1. Summary of the ML task type cited.

Cancer Types	Reference	Task Types	Scope
Breast	[5]	Prediction (Risk)	To develop a comprehensive breast cancer risk prediction model using multifactorial features (genetic, biochemical, demographic, etc.).
	[6]	Classification	To classify breast cancer using ML models with high recall for early detection, focusing on malignant cell identification.
	[7]	Classification	To improve preoperative identification of patients with high and low axillary nodal burden to avoid unnecessary lymph node clearance.
	[8]	Prediction (Survival)	To predict long-term survival in breast cancer patients for insurance risk assessment using ML models.
	[9]	Classification	To identify breast cancer employing patient diagnostic features and interpretable ML models (XAI) for transparent clinical decision-making.
	[10]	Classification	To assess multiple ML algorithms for the diagnosis of breast cancer and improve early detection.
	[11]	Classification	To classify breast cancer and metastatic samples based on integrin expression patterns using transcriptomic data and ML.
Colorectal	[12]	Prediction (Recurrence)	To predict the individual risk of colorectal cancer recurrence using longitudinal CEA biomarker data and ML.
Colorectal	[13]	Classification	To develop ML models for CRC detection using routine lab tests, outperforming traditional biomarkers (CEA, FOBT).
Ovarian	[14]	Classification	To classify ovarian cancer patients vs. other gynecological conditions using serum metabolomics and AutoML with XAI.
Ovarian	[15]	Classification	To detect ovarian cancer using integrated lipidomics and metabolomics with ML for liquid biopsy-based diagnosis.
RCC	[14]	Classification	To differentiate RCC patients from healthy controls using urinary metabolomics and AutoML with explainability.
Pancreatic	[16]	Classification	To identify microbiome-based markers in blood extracellular vesicles for non-invasive diagnosis of pancreatic cancer.
Gastric	[17]	Prediction (Survival)	To predict the survival rate in gastric cancer patients using ML models and assess their clinical applicability.
Prostate	[18]	Classification	To identify the main genes involved in prostate cancer development and build a predictive classification model with SHAP interpretability.
Esophageal	[19]	Prediction (Metastasis)	To predict the risk of bone metastasis in esophageal cancer patients to support clinical decision-making.
Rectal	[20]	Prediction (Metastasis)	To predict metachronous liver metastasis in rectal cancer patients using radiomics and ML.
Pan-cancer	[21]	Classification and Prediction	To detect and classify 17 cancer types based on liquid biopsy and RNA.
Pan-cancer	[22]	Prediction	To identify cancer driver mutations in patients using a dedicated ML algorithm and a cancer gene dataset.
Advanced cancer (Palliative)	[23]	Prediction (Survival)	To predict 30-day survival in palliative care cancer patients using objective clinical data and ML models.

Note: CEA—Carcinoembryonic Antigen; FOBT—Fecal Occult Blood Test; RCC—Renal cell carcinoma; RNA—Ribonucleic acid; SHAP—SHapley Additive exPlanations; XAI—Explainable artificial intelligence.

Table 2. Reproducibility rate of ML cancer research by cancer types.

Cancer Types	Reproducibility (%)
Bone	100
Bladder	50
Kidney	50
Brain and CNS	33.33
CUP	29.41
Endometrial	25
Ovarian	24.49
Pancreatic	23.53
Gastric	23.33
Lung	22.52
Multiple types	22.22
Leukemia and Hematologic	22.22
Cell	20.83
Colorectal	20.47
Breast	20.29
Pan-cancer	16.67
Liver	15.79
Cervical	14.81
Prostate	14.46
Head and Neck	13.33
Skin	7.69
Thyroid	5.88
Esophageal	0
Childhood	0
Laryngeal	0
Nasopharyngeal	0
Lymphatic	0
Salivary Gland Tumor	0

Table 3. Generalizability assessment of ML studies in cancer research by cancer types.

Cancer Types	High	Medium	Low	Total
Colorectal	33	185	74	292
Breast	14	191	101	306
Lung	6	107	55	168
Bladder	6	19	10	35
Prostate	5	52	30	87
Ovarian	3	30	21	54
Cervical	3	16	11	30
Gastric	2	45	25	72
Pancreatic	2	24	13	39
Head and Neck	2	21	12	35
Pan-cancer	2	11	4	17
Cell	2	9	6	17
Endometrial	1	11	6	18
Liver	1	9	6	16
Thyroid	1	11	3	15
Skin	1	2	0	3
Childhood	1	0	0	1
CUP	0	8	5	13
Leukemia and Hematologic	0	6	4	10
Multiple types	0	3	3	6
Esophageal	0	8	3	11
Lymphatic	0	1	1	2
Laryngeal	0	2	1	3
Nasopharyngeal	0	1	1	2
Kidney	0	1	0	1
Bone	0	2	0	2
Brain and CNS	0	1	0	1

Table 4. Characteristics of the main datasets used in the analyzed studies.

Dataset	Approximate Size	Data Type	Access	Frequent External Validation	Limitations
TCGA	>11,000 patients	Genomic–Clinical	Public	Rarely multicentric	Cohort reuse bias
GEO	Variable (tens–thousands)	Transcriptomic	Public	Study-dependent	Platform heterogeneity
SEER	>18 million cases	Clinical–Epidemiologic	Public	Frequent	Lack of molecular data

Table 5. ML Models used in cancer research by cancer types.

Cancer Types	ML Model	Number of ML Models
Breast	AdaBoost, CatBoost, Clustering, CNN, DT, ElasticNet, GB, KNNs, LASSO, LightGBM, LogisticReg, LSTM, NB, NN, RF, Ridge, SVM, Transformer, and XGBoost	19
Colorectal	AdaBoost, CatBoost, Clustering, CNN, DT, ElasticNet, GB, KNNs, LASSO, LightGBM, LogisticReg, LSTM, NB, NN, RF, Ridge, SVM, Transformer, and XGBoost	19
Lung	AdaBoost, CatBoost, Clustering, CNN, DT, ElasticNet, GB, KNNs, LASSO, LightGBM, LogisticReg, NB, NN, RF, Ridge, SVM, and XGBoost	17
Prostate	AdaBoost, CatBoost, Clustering, CNN, DT, GB, KNNs, LASSO, LightGBM, LogisticReg, LSTM, NB, NN, RF, SVM, and XGBoost	16
Cervical	AdaBoost, Clustering, DT, GB, KNNs, LASSO, LightGBM, LogisticReg, LSTM, NB, NN, RF, Ridge, SVM, and XGBoost	15
Ovarian	AdaBoost, CatBoost, DT, ElasticNet, GB, KNNs, LASSO, LightGBM, LogisticReg, NB, NN, RF, Ridge, SVM, and XGBoost	15
Bladder	AdaBoost, Clustering, CNN, DT, GB, KNNs, LASSO, LightGBM, LogisticReg, NB, NN, RF, SVM, and XGBoost	14
Head and Neck	AdaBoost, Clustering, CNN, DT, ElasticNet, GB, LASSO, LogisticReg, NB, NN, RF, Ridge, SVM, and XGBoost	14
Gastric	CatBoost, CNN, DT, GB, KNNs, LASSO, LightGBM, LogisticReg, NB, NN, RF, SVM, and XGBoost	13
Thyroid	Clustering, CNN, DT, GB, KNNs, LightGBM, LogisticReg, NB, NN, RF, Ridge, SVM, and XGBoost	13
Pancreatic	CatBoost, Clustering, GB, KNNs, LASSO, LightGBM, LogisticReg, NB, NN, RF, SVM, and XGBoost	12
CUP	AdaBoost, GB, KNNs, LASSO, LightGBM, LogisticReg, NB, NN, RF, SVM, and XGBoost	11
Endometrial	AdaBoost, Clustering, GB, LASSO, LogisticReg, NN, RF, SVM, and XGBoost	9
Esophageal	DT, GB, LASSO, LogisticReg, NB, NN, RF, SVM, and XGBoost	9
Liver	Clustering, DT, GB, LASSO, LogisticReg, NN, RF, SVM, and XGBoost	9
Skin	CNN, DT, GB, KNNs, LogisticReg, NB, NN, RF, and SVM	9
Leukemia and Hematologic	DT, ElasticNet, GB, LogisticReg, NN, RF, SVM, and XGBoost	8
Pan-cancer	DT, ElasticNet, GB, LogisticReg, NB, NN, RF, and XGBoost	8
Cell	CNN, GB, LASSO, NN, RF, SVM, and XGBoost	7
Multiple types	Clustering, GB, LASSO, NN, RF, SVM, and XGBoost	7
Nasopharyngeal	Clustering, GB, LogisticReg, NB, NN, RF, and XGBoost	7
Brain and CNS	Clustering, CNN, GB, NN, RF, and SVM	6
Laryngeal	KNNs, LogisticReg, NB, NN, RF, and SVM	6
Lymphatic	DT, GB, LogisticReg, RF, SVM, and XGBoost	6
Bone	GB, LogisticReg, RF, SVM, and XGBoost	5
Kidney	LASSO, NN, RF, and SVM	4
Childhood	RF and DT	2
Salivary Gland Tumor	DT and RF	2

Note: AdaBoost—Adaptive Boosting; CatBoost—Categorical Boosting; CNN—Convolutional Neural Network; DT—Decision Tree; GB—Gradient Boosting; KNN—K-Nearest Neighbor; LASSO—Least Absolute Shrinkage and Selection Operator; LightGBM—Light Gradient-Boosting Machine; LogisticReg—Logistic regression; NB—Naive Bayes; NN—Neural network; RF—Random Forest; SVM—Support Vector Machine; XGBoost—Extreme Gradient Boosting.

Table 6. Most frequently used ML model in cancer research by cancer type.

Cancer Type	The Most Applied ML Model	Number of Papers That Employed the Most Applied ML Model
Breast	RF	255
Colorectal	RF	248
Lung	RF	121
Prostate	RF	58
Gastric	RF	46
Ovarian	RF	30
Bladder	RF	29
Pancreatic	RF	25
Head and Neck	RF	24
Cervical	RF	20
Liver	RF	14
Thyroid	RF	14
Pan-cancer	RF	11
CUP	RF	10
Endometrial	RF	9
Esophageal	RF	9
Multiple types	RF	7
Skin	RF	7
Leukemia and Hematologic	RF	6
Cell	RF	4
Bone	LogisticReg and RF	2 (each)
Brain and CNS	NN	2
Childhood	RF	2
Kidney	RF	2
Laryngeal	RF	2
Lymphatic	DT, GB, LogisticReg, RF, SVM, and XGBoost	1 (each)
Nasopharyngeal	Clustering, GB, LogisticReg, NB, NN, RF, and XGBoost	1 (each)
Salivary Gland Tumor	DT and RF	1 (each)

Table 7. Maximum and minimum accuracy by cancer types.

Cancer Types	Accuracy (%)		Reference
Bladder	Max	93	[73]
Bladder	Min	71.7	[74]
Brain and CNS	-	60	[75]
Breast	Max	100	[76]
Breast	Min	58	[77]
CUP	Max	96	[78]
CUP	Min	80	[79]
Cervical	Max	99	[80]
Cervical	Min	70.8	[81]
Colorectal	Max	100	[82]
Colorectal	Min	48	[83]
Endometrial	Max	87	[84]
Endometrial	Min	69	[85]
Esophageal	Max	87.5	[86]
Esophageal	Min	72.1	[87]
Gastric	Max	93.4	[88]
Gastric	Min	45.8	[89]
Head and Neck	Max	98.79	[90]
Head and Neck	Min	75	[91]
Kidney	Max	85.66	[92]
Kidney	Min	78.7	[93]
Leukemia and Hematologic	Max	96.7	[94]
Leukemia and Hematologic	Min	95.3	[95]
Liver	Max	99.67	[96]
Liver	Min	90	[97]
Lung	Max	100	[98]
Lung	Min	63	[99]
Multiple types	-	71.46	[100]
Nasopharyngeal	Max	88	[101]
Nasopharyngeal	Min	85.9	[102]
Ovarian	Max	95	[103]
Ovarian	Min	73	[104]
Pancreatic	Max	98.8	[105]
Pancreatic	Min	74.3	[106]
Prostate	Max	99.9	[107]
Prostate	Min	62.3	[108]
Skin	Max	98.5	[109]
Skin	Min	68.5	[110]
Thyroid	Max	99	[111]
Thyroid	Min	74	[112]

Table 8. Maximum and minimum precision by cancer types.

Cancer Types	Precision (%)		Reference
Bladder	Max	92.9	[114]
Bladder	Min	58.3	[115]
Breast	Max	100	[116]
Breast	Min	36	[117]
CUP	-	90	[118]
Colorectal	Max	100	[119]
Colorectal	Min	86.3	[120]
Gastric	Max	92.6	[88]
Gastric	Min	58.91	[121]
Head and Neck	Max	86	[122]
Head and Neck	Min	75	[123]
Kidney	-	84.18	[92]
Lung	Max	99	[124]
Lung	Min	47.1	[125]
Ovarian	Max	90	[126]
Ovarian	Min	80	[104]
Pancreatic	-	83.2	[127]
Prostate	Max	98	[128]
Prostate	Min	66	[129]
Thyroid	-	61	[111]

Table 9. Maximum and minimum recall by cancer types.

Cancer Types	Recall (%)		Reference
Bladder	Max	80	[134]
Bladder	Min	77.8	[115]
Breast	Max	99.49	[135]
Breast	Min	50	[136]
Colorectal	Max	94.6	[120]
Colorectal	Min	83	[137]
CUP	-	90	[118]
Gastric	Max	99	[88]
Gastric	Min	70.96	[121]
Head and Neck	-	55	[123]
Kidney	-	85.66	[92]
Lung	Max	98.9	[124]
Lung	Min	73.8	[138]
Ovarian	Max	90	[126]
Ovarian	Min	80	[104]
Prostate	Max	99	[129]
Prostate	Min	68	[139]
Thyroid	-	88	[111]

Table 10. Maximum and minimum F1-scores by cancer types.

Cancer Types	F1-Score (%)		Reference
Bladder	Max	93.8	[114]
Bladder	Min	66.7	[115]
Breast	Max	99.79	[135]
Breast	Min	37	[140]
Colorectal	Max	98.2	[141]
Colorectal	Min	76.3	[142]
CUP	Max	94.2	[143]
CUP	Min	70	[78]
Gastric	Max	97.5	[144]
Gastric	Min	57.9	[89]
Head and Neck	Max	91	[145]
Head and Neck	Min	30	[146]
Kidney	-	90.63	[92]
Lung	Max	97.5	[147]
Lung	Min	41	[148]
Ovarian	-	72.6	[149]
Prostate	Max	94	[150]
Prostate	Min	58.6	[151]
Skin	-	93	[152]
Thyroid	Max	96.7	[153]
Thyroid	Min	21.6	[154]

Table 11. Maximum and minimum AUC by cancer types.

Cancer Types	AUC (%)		Reference
Bladder	Max	97	[73]
Bladder	Min	74	[156]
Breast	Min	62	[157]
Breast	Max	99.89	[158]
Cervical	Max	93	[159]
Cervical	Min	68	[160]
Colorectal	Max	100	[161]
Colorectal	Min	61	[162]
Endometrial	Max	94	[163]
Endometrial	Min	53	[164]
Esophageal	Max	98	[165]
Esophageal	Min	66	[166]
Gastric	Max	100	[167]
Gastric	Min	74	[168]
Head and Neck	Max	97	[169]
Head and Neck	Min	51	[170]
Kidney	Max	83	[171]
Kidney	Min	82	[93]
Laryngeal	Max	87	[172]
Laryngeal	Min	81	[173]
Liver	Max	99	[174]
Liver	Min	77	[175]
Lung	Max	99.3	[176]
Lung	Min	45	[177]
Ovarian	Max	100	[178]
Ovarian	Min	56	[179]
Pancreatic	Max	100	[180]
Pancreatic	Min	70	[181]
Prostate	Max	99	[182]
Prostate	Min	72	[183]
Skin	Max	99	[184]
Skin	Min	75	[110]
Thyroid	Max	99	[111]
Thyroid	Min	68	[185]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Rosca, C.-M.; Stancu, A.; Brezoi, A.G. Advances in Cancer Through Machine Learning Models. Appl. Sci. 2026, 16, 2226. https://doi.org/10.3390/app16052226

AMA Style

Rosca C-M, Stancu A, Brezoi AG. Advances in Cancer Through Machine Learning Models. Applied Sciences. 2026; 16(5):2226. https://doi.org/10.3390/app16052226

Chicago/Turabian Style

Rosca, Cosmina-Mihaela, Adrian Stancu, and Alina Gabriela Brezoi. 2026. "Advances in Cancer Through Machine Learning Models" Applied Sciences 16, no. 5: 2226. https://doi.org/10.3390/app16052226

APA Style

Rosca, C.-M., Stancu, A., & Brezoi, A. G. (2026). Advances in Cancer Through Machine Learning Models. Applied Sciences, 16(5), 2226. https://doi.org/10.3390/app16052226

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Advances in Cancer Through Machine Learning Models

Abstract

1. Introduction

2. Methodology

2.1. Search Strategy and Data Source

2.2. Review Protocol in Accordance with the PRISMA Guidelines

3. Results

3.1. Cancer Types and ML Models

3.2. Dataset Assessment

3.3. ML Models Employed in Cancer Types

3.4. Performance Metrics for Cancer Types

3.4.1. Accuracy Metric for Cancer Types

3.4.2. Precision Metric for Cancer Types

3.4.3. Recall Metric for Cancer Types

3.4.4. F1-Score Metric for Cancer Types

3.4.5. AUC Metric for Cancer Types

4. Discussion

5. Conclusions

Supplementary Materials

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

Abbreviations

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI