Systematic Review

Artificial Intelligence in Ovarian Cancer: A Systematic Review and Meta-Analysis of Predictive AI Models in Genomics, Radiomics, and Immunotherapy

by Mauro Francesco Pio Maiorano 1,2,*, Gennaro Cormio 1,2, Vera Loizzi 2,3 and Brigida Anna Maiorano 4

1 Unit of Obstetrics and Gynecology, Department of Interdisciplinary Medicine (DIM), University of Bari “Aldo Moro”, Polyclinic of Bari, Piazza Giulio Cesare 11, 70124 Bari, Italy
2 Unit of Oncologic Gynecology, IRCCS “Giovanni Paolo II” Oncologic Institute, Viale Orazio Flacco 65, 70124 Bari, Italy
3 Translational Biomedicine and Neuroscience Department (DiBraiN), University of Bari “Aldo Moro”, Piazza Giulio Cesare 11, 70124 Bari, Italy
4 Department of Medical Oncology, IRCCS San Raffaele Hospital, Via Olgettina 60, 20132 Milan, Italy
* Author to whom correspondence should be addressed.
Submission received: 22 March 2025 / Revised: 14 April 2025 / Accepted: 16 April 2025 / Published: 18 April 2025

Abstract:
Background/Objectives: Artificial intelligence (AI) is increasingly influencing oncological research by enabling precision medicine in ovarian cancer through enhanced prediction of therapy response and patient stratification. This systematic review and meta-analysis was conducted to assess the performance of AI-driven models across three key domains: genomics and molecular profiling, radiomics-based imaging analysis, and prediction of immunotherapy response. Methods: Relevant studies were identified through a systematic search across multiple databases (2020–2025), adhering to PRISMA guidelines. Results: Thirteen studies met the inclusion criteria, involving over 10,000 ovarian cancer patients and encompassing diverse AI models such as machine learning classifiers and deep learning architectures. Pooled AUCs indicated strong predictive performance for genomics-based (0.78), radiomics-based (0.88), and immunotherapy-based (0.77) models. Notably, radiogenomics-based AI integrating imaging and molecular data yielded the highest accuracy (AUC = 0.975), highlighting the potential of multi-modal approaches. Heterogeneity and risk of bias were assessed, and evidence certainty was graded. Conclusions: Overall, AI demonstrated promise in predicting therapeutic outcomes in ovarian cancer, with radiomics and integrated radiogenomics emerging as leading strategies. Future efforts should prioritize explainability, prospective multi-center validation, and integration of immune and spatial transcriptomic data to support clinical implementation and individualized treatment strategies. Unlike earlier reviews, this study synthesizes a broader range of AI applications in ovarian cancer and provides pooled performance metrics across diverse models. It examines the methodological soundness of the selected studies and highlights current gaps and opportunities for clinical translation, offering a comprehensive and forward-looking perspective in the field.

1. Introduction

Ovarian cancer ranked as the eighth most prevalent cancer among women worldwide, representing about 3.7% of all cancer cases and 4.7% of cancer-related mortality in 2020 [1]. In the United States, the projection for 2025 indicates that approximately 20,890 women will be diagnosed with ovarian cancer, with an estimated 12,730 deaths from the disease [2]. The five-year relative survival rate stands at 49.7%, with around 57% of cases having metastasized at the time of diagnosis [1,2,3]. Late-stage diagnoses, tumor heterogeneity, and the development of resistance to standard therapies are the main causes of the high mortality rate [2,3,4]. The integration of artificial intelligence (AI) into healthcare—especially in oncology—is accelerating progress in diagnostic accuracy, clinical decision-making support, and individualized treatment strategies [5,6,7]. In gynecologic cancers, AI applications have shown promise in improving diagnostic accuracy, risk assessment, and treatment planning [8]. For instance, AI models have been developed to detect endometrial cancer with high accuracy by analyzing histopathological images, an approach potentially applicable to other cancers, including ovarian cancer [9]. However, challenges such as data transparency, quality, and interpretation must be addressed to fully realize AI’s potential in gynecologic oncology. While these challenges remain, growing interest in applying AI to retrospective and prospective clinical datasets has prompted efforts to evaluate its actual performance in real-world scenarios. Therefore, this systematic review and meta-analysis aims to evaluate the predictive performance of AI models in ovarian cancer across three major domains: genomics and molecular profiling, radiomics-based imaging analysis, and prediction of immunotherapy response.
By synthesizing current evidence, we seek to assess the performance of AI-driven approaches in predicting treatment response and supporting therapy selection in research and experimental settings, anticipating future real-world applications; to identify sources of heterogeneity in model performance; and to highlight areas requiring further research. Our goal is to systematically evaluate current evidence on AI applications in ovarian cancer and to provide a foundation for future research integrating AI into ovarian cancer management, thereby supporting the development of clinically translatable, evidence-based precision oncology tools.

2. Materials and Methods

This systematic review was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and aimed to evaluate the predictive performance of AI models in ovarian cancer therapy response prediction [10]. Figure 1 shows the PRISMA flowchart for the included studies.
This review focused on three major AI applications: genomic and molecular profiling, radiomics-based therapy prediction, and immunotherapy response prediction, assessing their predictive accuracy in therapy selection and patient outcomes. We registered this systematic review on PROSPERO (ID 1015093).

2.1. Search Strategy and Study Selection

Two independent reviewers, M.F.P.M. and B.A.M., conducted a comprehensive literature search across PubMed/MEDLINE, Embase, Scopus, Web of Science, IEEE Xplore, and Cochrane Library from 2020 until March 2025. To ensure a broad and thorough search, conference abstracts from ASCO, ESMO, and ESGO were also screened. The search terms included combinations of “artificial intelligence”, “machine learning”, “deep learning”, and “radiomics” with “ovarian cancer”, “therapy response”, “biomarkers”, “immunotherapy”, “PARP inhibitors”, and “platinum resistance”. No restrictions beyond this date range were applied, but the studies were limited to English-language articles. After duplicate removal, the titles and abstracts were independently screened by two reviewers. The studies deemed relevant underwent full-text review, where eligibility was assessed based on predefined inclusion and exclusion criteria. Discrepancies in selection were resolved through discussion or consultation with a third reviewer to ensure consistency and rigor in study selection.

2.2. Eligibility Criteria

The inclusion and exclusion criteria for this systematic review and meta-analysis were established using the Population, Intervention, Comparison, Outcomes, and Study design (PICOS) framework to ensure methodological rigor and relevance to the research question (Table 1) [11].
Studies were included if they evaluated AI models for therapy response prediction in ovarian cancer, specifically those applied to genomic and molecular profiling, radiomics-based approaches, or immunotherapy-based studies. The population of interest included patients diagnosed with ovarian cancer across various disease stages and histological subtypes, with a focus on those undergoing treatment with chemotherapy, PARP inhibitors (PARPis), or immune checkpoint inhibitors (ICIs). The interventions analyzed were AI-based models aimed at predicting response to these therapies, leveraging imaging data, molecular profiling, or tumor microenvironment characterization. The comparators included standard clinical or molecular predictors, such as traditional biomarker-based assays, including Homologous Recombination Deficiency (HRD) testing and BReast Cancer Associated (BRCA) gene mutation screening, as well as clinician-based radiologic assessments and conventional histopathologic scoring methods. To be included, the studies were required to report AI model performance metrics, including the area under the curve (AUC), sensitivity, specificity, or hazard ratios (HR) for overall survival (OS) or progression-free survival (PFS). Eligible studies included retrospective and prospective cohort studies, observational studies, and randomized controlled trials (RCTs) if AI was used for therapy response prediction. The datasets incorporated included real-world sources such as The Cancer Genome Atlas (TCGA), Gene Expression Omnibus (GEO), and multi-center institutional imaging registries. Studies were excluded if they lacked quantifiable AI performance metrics, did not validate AI models on ovarian cancer datasets, or focused exclusively on traditional biomarkers without AI integration. Additionally, theoretical AI models without validation, preclinical models lacking human data, review articles, case reports, editorials, and studies not written in English were excluded.

2.3. Data Extraction and Synthesis

Data extraction was conducted independently by two reviewers, M.F.P.M. and B.A.M., using a standardized extraction form. The extracted data included study characteristics (authors, year of publication, study design), AI model details (architecture, learning approach, type of input data), and performance metrics (AUC, sensitivity, specificity, HR for OS and PFS). The validation strategy (internal validation, external validation, or cross-validation) was also documented when feasible. A narrative synthesis was performed to categorize AI models into genomics-based, radiomics-based, and immunotherapy-focused approaches, allowing for a structured comparison of findings across different AI applications.

2.4. Meta-Analysis and Statistical Analysis

A meta-analysis was performed to quantitatively assess the predictive performance of AI models in therapy response prediction. Pooled AUC values were calculated using a random-effects model (DerSimonian–Laird method) for heterogeneous datasets (I² > 50%) and a fixed-effects model (Mantel–Haenszel method) for low-heterogeneity datasets [12,13]. Heterogeneity across studies was assessed using the chi-square test (χ²) and the I² statistic, with thresholds of 25%, 50%, and 75% corresponding to low, moderate, and high heterogeneity, respectively [14]. Subgroup analyses were conducted to compare different AI models based on model type (genomics, radiomics, or immunotherapy AI), therapy type predicted (platinum response, PARPis, or immunotherapy response), and validation strategy (single- vs. multi-center datasets). A sensitivity analysis was performed by sequentially excluding studies to assess the robustness of pooled estimates. All statistical analyses were conducted using R 4.2.2 (meta and metafor packages) [15].
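The analyses above were run in R with the meta and metafor packages; purely as an illustration of the DerSimonian–Laird procedure described here, the estimator can be sketched in Python. The `se_from_ci` helper and all input values below are hypothetical assumptions for demonstration, not data from the included studies.

```python
import math

def se_from_ci(lo, hi, z=1.96):
    """Approximate a standard error from a reported 95% confidence interval."""
    return (hi - lo) / (2 * z)

def dersimonian_laird(effects, ses):
    """DerSimonian-Laird random-effects pooling of study-level effects.

    Returns the pooled estimate, its 95% CI, the between-study variance
    tau^2, and the I^2 heterogeneity statistic (as a percentage)."""
    k = len(effects)
    w = [1.0 / s**2 for s in ses]                    # inverse-variance (fixed-effect) weights
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed)**2 for wi, y in zip(w, effects))  # Cochran's Q
    df = k - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                    # between-study variance estimate
    w_star = [1.0 / (s**2 + tau2) for s in ses]      # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return pooled, ci, tau2, i2
```

For example, pooling three hypothetical subgroup AUCs with their reported CIs converted to standard errors would call `dersimonian_laird([0.78, 0.88, 0.77], [se_from_ci(0.66, 0.89), se_from_ci(0.78, 0.99), se_from_ci(0.69, 0.85)])`.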

2.5. Risk of Bias Assessment

Two reviewers independently assessed the risk of bias for the selected studies, using PROBAST (Prediction Model Risk of Bias Assessment Tool) for AI-based models and QUADAS-2 (Quality Assessment of Diagnostic Accuracy Studies 2) for studies evaluating diagnostic accuracy [16,17]. Bias was examined across four domains: patient selection, model development, outcome measurement, and data transparency. The studies were classified as low, moderate, or high risk of bias, and any discrepancies in assessment were resolved through discussion (Supplementary Tables S1 and S2).

2.6. Publication Bias Assessment

A funnel plot analysis was conducted to assess potential publication bias by plotting study effect sizes (AUC values) against their standard errors. Egger’s test for funnel plot asymmetry was performed to statistically evaluate the presence of small-study effects. Given the limited number of studies per subgroup, formal publication bias testing was only applied to the overall pooled analysis. A non-significant p-value from Egger’s test (p > 0.05) was interpreted as lacking strong evidence for publication bias [18].
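Egger's test amounts to regressing each study's standardized effect (effect divided by its standard error) on its precision (the reciprocal of the standard error) and testing whether the intercept differs from zero. A minimal Python sketch using SciPy follows; the numeric inputs in the usage example are illustrative assumptions, not the review's data.

```python
import numpy as np
from scipy import stats

def egger_test(effects, ses):
    """Egger's regression test for funnel-plot asymmetry.

    Regresses the standardized effect (effect / SE) on precision (1 / SE);
    an intercept far from zero suggests small-study effects. Returns the
    intercept (bias estimate), its t statistic, and the two-sided p-value."""
    effects = np.asarray(effects, dtype=float)
    ses = np.asarray(ses, dtype=float)
    precision = 1.0 / ses
    standardized = effects / ses
    res = stats.linregress(precision, standardized)
    t_stat = res.intercept / res.intercept_stderr
    df = len(effects) - 2                       # degrees of freedom for the intercept test
    p_value = 2 * stats.t.sf(abs(t_stat), df)
    return res.intercept, t_stat, p_value
```

With hypothetical inputs, e.g. `egger_test([0.70, 0.81, 0.88, 0.86, 0.77, 0.975], [0.05, 0.03, 0.04, 0.02, 0.06, 0.025])`, a p-value above 0.05 would be read, as in the text, as lacking strong evidence of publication bias.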

2.7. Assessment of Evidence Certainty

The Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) method was applied to evaluate the certainty of the evidence. This assessment considered the risk of bias, inconsistency of effect, indirectness, imprecision, and potential publication bias. A GRADE Summary of Findings table was generated using the GRADEpro tool to summarize confidence in the pooled estimates and support the interpretation of AI models’ predictive performance in ovarian cancer therapy response prediction [19].

3. Results

3.1. Main Findings

A total of 1543 studies were initially identified through database searches and manual screening of conference proceedings. After removing duplicates and assessing titles and abstracts for relevance, 126 full-text articles were reviewed for eligibility. Among these, 113 studies were excluded for the following reasons: 42 studies did not apply artificial intelligence models, 25 lacked validation on ovarian cancer datasets, 18 were preclinical or in silico models without human data, 17 failed to report predictive performance metrics such as AUC or sensitivity, and 11 were review articles or conference abstracts without sufficient methodological details (Figure 1). Finally, 13 studies met the inclusion criteria and were analyzed [20,21,22,23,24,25,26,27,28,29,30,31,32]. The selected studies encompassed genomics-based AI models (n = 4), radiomics-based AI models (n = 4), and immunotherapy-focused AI models (n = 5). The studies collectively included data from multi-center registries, publicly available datasets such as TCGA and GEO, and institutional imaging databases, covering a total of approximately 10,000 ovarian cancer patients across different validation cohorts. AI approaches used in these studies ranged from machine-learning-based classifiers (e.g., support vector machines, random forests) to deep learning models (e.g., convolutional neural networks, transformer-based architectures). Table 2 synthesizes the main studies’ findings.
The included studies assessed therapy response prediction in ovarian cancer across three primary domains: genomics and molecular profiling, radiomics-based imaging analysis, and immunotherapy response prediction. Four studies focused on genomic and molecular profiling, utilizing AI to predict BRCA mutation status, HRD, and OS outcomes following chemotherapy or PARP inhibitor treatment [20,21,22,23]. Four studies applied radiomics-based AI models, leveraging CT, MRI, and PET/CT imaging datasets to predict recurrence risk, tumor heterogeneity, and preoperative staging accuracy [24,25,26,27]. Five studies investigated AI applications in immunotherapy prediction, analyzing tumor immune microenvironment (TME) features, ICI response, and immune cell infiltration patterns [28,29,30,31,32]. Performance was primarily reported using AUC values, with additional evaluation of sensitivity, specificity, and HRs for OS and PFS. Three studies [20,22,25] explored AI-driven multi-modal integration, combining radiomics, genomics, and immune profiling for comprehensive therapy response prediction.

3.1.1. AI in Genomic and Molecular Profiling

AI has demonstrated strong performance in genomic-based therapy prediction, particularly in evaluating HRD, BRCA mutation status, and prognostic biomarkers in ovarian cancer. Nero et al. applied weakly supervised AI to H&E-stained slides to predict BRCA status, achieving an AUC of 0.70 in the training set and 0.55 in validation, indicating the need for further model refinement [20]. Meanwhile, Bergstrom et al. developed DeepHRD, a deep learning model trained on TCGA and external cohorts. This model demonstrated an AUC of 0.81 in predicting HRD status and improved OS prediction following platinum therapy (HR = 0.46, p = 0.030) [21]. In prognostic modeling, Wang et al. employed machine learning for survival prediction, reporting multiple AUC values across different datasets (ranging from 0.542 to 0.820 for OS across different validation cohorts) [22]. Similarly, Huan et al. utilized a multi-cohort AI-driven prognostic signature, reporting 1-, 3-, and 5-year OS AUCs of 0.859, 0.812, and 0.795, respectively, suggesting strong predictive power for long-term survival outcomes [23]. These findings highlight AI’s potential to complement traditional molecular testing, enabling non-invasive prediction of HRD and survival outcomes, though further validation is required for clinical translation.

3.1.2. AI in Imaging and Radiomics for Predicting Response to Therapy

Radiomics and AI-driven imaging analysis are emerging as highly effective tools for therapy prediction in ovarian cancer. A CT-based radiomics model by Wei et al. achieved an AUC of 0.88 for recurrence prediction, demonstrating strong performance in stratifying high-risk patients [24]. Expanding upon these findings, Binas et al. introduced an MRI-based AI model, which achieved 86% accuracy in identifying tumor heterogeneity, an important factor influencing chemoresistance and progression [25]. Xu et al. further demonstrated the value of PET/CT-based AI models, achieving an AUC of 0.819 for preoperative FIGO stage prediction, indicating that radiomics can enhance staging accuracy without invasive procedures [26]. A major advancement in multi-modal AI integration was made by Zeng et al., whose radiogenomics model combining imaging and genetic data reached an AUC of 0.975, the highest predictive accuracy among all models analyzed. This study reinforces the superior performance of multi-modal AI approaches in precision oncology [27]. These results support radiomics as a superior tool for therapy response prediction, with multi-modal AI models demonstrating the highest predictive power.

3.1.3. AI for Immunotherapy and Novel Targeted Treatments in Ovarian Cancer

Artificial intelligence is also playing a transformative role in immunotherapy response prediction, guiding ICIs, tumor immune microenvironment (TME) profiling, and novel targeted therapies. Chen et al. developed a CD8+ exhausted T-cells (Tex) prognostic signature, reporting AUCs of 0.728 for 2-year survival, 0.783 for 3-year survival, and 0.773 for 4-year survival. This signature outperformed traditional prognostic models in predicting the response to ICIs [28]. Meanwhile, Wu et al. applied an AI-driven immune risk model, achieving an AUC of 0.79 in stratifying ovarian cancer patients for immunotherapy guidance [29]. Zhao et al. introduced a macrophage-related signature (MRS), which showed 1-, 3-, and 5-year OS AUCs ranging from 0.692 to 0.774, validating its predictive capability for drug sensitivity and immune microenvironment status [30]. Yang et al. developed a novel fibroblast-based biomarker for immunotherapy response. The SFRP2+ fibroblast signature achieved an AUC of 0.853 (95% CI: 0.829–0.877) in predicting ICI response and TP53 mutation status [31]. Geng et al. focused on extracellular matrix (ECM) proteins, showing that ECM-based AI stratified patients based on immunotherapy sensitivity, with an AUC of 0.810 (training) and 0.684 (validation), emphasizing the role of ECM remodeling in ICI response [32]. These findings suggest that AI-driven immunotherapy models provide moderate-to-high predictive accuracy, particularly when applied to TME profiling and ICI response prediction.
When stratified by AI model type, deep learning approaches demonstrated superior performance in radiomics (AUC = 0.975) and immunotherapy-based models (average AUC = 0.832), whereas traditional machine learning showed more consistent accuracy in genomics-based prediction (average AUC = 0.803 vs. 0.755 for deep learning).

3.1.4. Risk of Bias Assessment

The risk of bias assessment using PROBAST for AI-based models and QUADAS-2 for diagnostic accuracy studies revealed notable variability across the included studies [16,17]. Genomics-based and immunotherapy-focused AI models exhibited the highest risk of bias, with concerns related to dataset heterogeneity, lack of external validation, and suboptimal model transparency. Specifically, studies by Nero et al., Wang et al., Chen et al., and Zhao et al. were classified as high risk, primarily due to patient selection bias and insufficient external validation [20,22,28,30]. In contrast, radiomics-based AI models demonstrated a lower overall risk, benefiting from more standardized imaging protocols and feature extraction techniques. Studies such as those of Wei et al. and Zeng et al. were rated as low risk under QUADAS-2, reflecting well-defined patient cohorts and consistent validation methodologies [24,27]. However, Binas et al. and Xu et al. showed moderate to high risk, largely due to variability in reference standards and model generalizability [25,26]. Among diagnostic accuracy studies, PET/CT-based AI models exhibited a higher risk of bias, particularly in the flow and timing domain, suggesting greater variability in reference standards and dataset consistency [26]. In contrast, MRI/CT-based radiomics studies had a lower risk of bias, benefiting from better-defined imaging parameters and structured validation protocols [24,27]. Supplementary Tables S1 and S2 summarize the detailed risk of bias assessment results for all included studies.

3.2. Meta-Analyses

The pooled analysis of all included AI models demonstrated a pooled AUC of 0.81 (95% CI: 0.72–0.89), indicating moderate-to-high predictive accuracy across different AI applications in ovarian cancer therapy prediction. Heterogeneity was high (I² = 84.9%), reflecting variability across model types, particularly between genomics-based, radiomics-based, and immunotherapy-focused AI models (Figure 2).

3.2.1. Genomics-Based AI Models

The pooled AUC for genomics-based AI models was 0.78 (95% CI: 0.66–0.89), indicating moderate predictive accuracy. Heterogeneity was high (I² = 86.5%), reflecting variability among models. HRD prediction models, such as DeepHRD (AUC = 0.81, 95% CI: 0.77–0.85), achieved relatively strong performance, whereas BRCA-status models like Nero et al. (AUC = 0.70, 95% CI: 0.65–0.75) exhibited lower predictive accuracy [20,21]. Prognostic models for survival prediction reported AUC values ranging from 0.739 to 0.859 across different validation cohorts, suggesting variability in long-term survival prediction capabilities (Figure 3) [22,23].

3.2.2. Radiomics-Based AI Models

Radiomics-based AI models exhibited the highest predictive accuracy, with a pooled AUC of 0.88 (95% CI: 0.78–0.99). Heterogeneity was high (I² = 90.5%), indicating substantial variability across studies despite the standardized imaging techniques. Individual models demonstrated strong predictive performance, with Wei et al. (CT-based AI, AUC = 0.88, 95% CI: 0.84–0.92), Binas et al. (MRI-based AI, AUC = 0.86, 95% CI: 0.82–0.90), and Xu et al. (PET/CT-based AI, AUC = 0.819, 95% CI: 0.78–0.86) showing robust accuracy in recurrence risk assessment and therapy response prediction [24,25,26]. The highest-performing model was the radiogenomics-based AI by Zeng et al. (AUC = 0.975, 95% CI: 0.94–1.01), reinforcing the advantage of integrating imaging and molecular data for therapy response stratification (Figure 4) [27].

3.2.3. Immunotherapy-Focused AI Models

The pooled AUC for immunotherapy-focused AI models was 0.77 (95% CI: 0.69–0.85), with high heterogeneity (I² = 90.0%), reflecting differences in immune biomarker selection, dataset validation, and AI methodologies. Individual models demonstrated varying predictive performance, with Chen et al. (CD8+ Tex signature, AUC = 0.728, 95% CI: 0.69–0.77), Wu et al. (immune risk model, AUC = 0.79, 95% CI: 0.75–0.83), and Zhao et al. (MRS, AUC = 0.692, 95% CI: 0.65–0.73) showing moderate predictive accuracy [28,29,30]. In contrast, the SFRP2+ fibroblast signature by Yang et al. (AUC = 0.853, 95% CI: 0.81–0.89) and the ECM remodeling-based AI by Geng et al. (AUC = 0.810, 95% CI: 0.77–0.85) exhibited higher accuracy, suggesting that immune microenvironment models incorporating extracellular matrix interactions may enhance immunotherapy response prediction (Figure 5) [31,32].

3.2.4. Publication Bias Findings

The funnel plot of included studies appeared symmetrical, suggesting no evident small-study effects or selective reporting bias (Figure 6). Egger’s test for funnel plot asymmetry yielded a non-significant result (t = −0.56, p = 0.5894), indicating no statistically significant evidence of publication bias [18]. The bias estimate (−5.5422) had a large standard error (SE = 9.9691), reinforcing the lack of strong directional bias. However, moderate-to-high heterogeneity (τ² = 14.8676) was observed across studies, which may contribute to minor deviations in the funnel plot shape.

3.2.5. Evidence Certainty Assessment

The certainty of evidence for AI models in ovarian cancer therapy prediction was evaluated using the GRADE approach, considering factors such as risk of bias, inconsistency, indirectness, and imprecision (Table 3) [19]. Radiomics-based AI models demonstrated the highest certainty of evidence, with low bias and high consistency across studies. Genomics-based AI models had moderate certainty, largely due to variability in biomarker selection and dataset representativeness. Immunotherapy-focused AI models exhibited the greatest inconsistency and indirectness, leading to an overall low-to-moderate certainty rating. The overall certainty of evidence across all AI models was rated as moderate, highlighting the need for further validation yet encouraging clinical integration.

4. Discussion

Artificial intelligence has become a promising tool in ovarian cancer management, especially in predicting therapy response through genomic profiling, radiomics-based imaging analysis, and immunotherapy response prediction [33]. To our knowledge, this is the first meta-analysis to comprehensively assess and compare the performance of AI models across the three major domains of precision oncology. Unlike previous studies that focused on single AI applications, such as diagnostic radiomics, prognostic genomic models, or immune biomarker discovery [34,35], this study uniquely integrates and quantitatively compares AI models applied to each of these domains within the same analytical framework. Notably, while prior meta-analyses have evaluated AI’s diagnostic accuracy in ovarian cancer imaging, with pooled AUCs ranging from 0.85 to 0.93 depending on the imaging modality [34,36], none have included therapy response prediction or incorporated genomics or immunotherapy-focused models. Our work not only expands the scope of prior analyses but also provides a pooled AUC of 0.81 across all domains, offering a benchmark for the global predictive accuracy of AI in ovarian cancer management. Our results also support the premise that multi-modal AI integration enhances therapy response and outcome prediction, offering a pathway to more robust and generalizable models [37]. Importantly, this study introduces novel insights into the performance of multi-modal (radiogenomic) AI models, which integrate imaging with molecular data. These models achieved the highest performance (AUC up to 0.975), exceeding that of any single-modality model. While emerging research suggests radiogenomics may outperform single-modality data, no prior meta-analysis has quantified this advantage.
Radiomics-based AI models exhibited the highest performance (AUC = 0.88), while genomics-based AI models achieved an AUC of 0.78, and immunotherapy-focused AI models demonstrated the lowest predictive power (AUC = 0.77). The genomics-based estimate falls within the AUC range reported in recent studies using machine learning for survival prediction and HRD status classification, typically ranging from 0.72 to 0.82 [32,38]. Interestingly, radiomics-based AI models demonstrated superior pooled performance compared to genomics-based models focused on BRCA mutation or HRD status prediction: although these values were derived from different studies, the observed trend suggests that radiomic features may offer complementary or even superior predictive insights, particularly in modeling spatial tumor heterogeneity and phenotypic behavior. This is supported by findings in breast cancer, where radiomic analysis outperformed BRCA status in stratifying triple-negative tumors, underscoring the potential of imaging-derived data as a robust prognostic and predictive tool [39]. These results highlight the increasing role of AI in enhancing diagnostics, optimizing therapy selection, and improving patient stratification in ovarian cancer [40]. Genomics-based AI models demonstrated moderate predictive accuracy, with HRD prediction models, such as DeepHRD (AUC = 0.81), outperforming BRCA-status prediction models, such as Nero et al. (AUC = 0.70) [20,21]. This finding is consistent with the literature from other cancer types, where multi-gene AI models combining somatic mutations, transcriptomics, and clinical variables have achieved AUCs above 0.85 [41,42]. The lower performance of BRCA-focused models, in fact, suggests that single-gene approaches may not provide sufficient discriminatory power for therapy selection.
These findings reinforce the need for AI-driven multi-gene panels that integrate multiple molecular biomarkers, including not only HRD, but also tumor protein 53 (TP53), Cyclin-E1 (CCNE1), and immune-related gene signatures, to enhance predictive performance and improve treatment response stratification [43,44]. Given that CCNE1 amplification has been linked to poor response to platinum-based chemotherapy and PARP inhibitors, its inclusion in AI-driven predictive models may provide additional insights into optimal treatment selection [45]. Similarly, immune-related markers such as tumor mutational burden (TMB) and Interferon-gamma (IFN-γ) signatures have shown potential in guiding immunotherapy response, further highlighting the importance of multi-modal AI approaches [46,47]. The prognostic AI models for survival prediction, including those developed by Wang et al. and Huan et al., exhibited AUC values ranging from 0.739 to 0.859, suggesting their potential in long-term outcome prediction [22,23]. However, the high heterogeneity observed across genomics-based AI models reflects differences in biomarker selection, dataset sources, and model validation techniques. The variability in AI model performance in this domain may stem from inconsistent feature selection methods, differences in training cohort characteristics, and the absence of external validation in real-world patient populations. Comparatively, AI-based multi-gene prognostic signatures have been successfully applied in breast and lung cancer, where integrative genomic models combining somatic mutations, transcriptomics, and epigenetics achieved AUC values above 0.85 [41,42]. The findings in ovarian cancer suggest a similar opportunity for AI-driven multi-omics integration to enhance therapy selection and survival prediction. Future AI models could incorporate spatial transcriptomics and proteomics to improve tumor heterogeneity characterization and enhance biomarker-driven patient stratification.
Radiomics-based AI models demonstrated the highest predictive accuracy: the models developed by Wei et al. (CT-based, AUC = 0.88), Binas et al. (MRI-based, AUC = 0.86), and Xu et al. (PET/CT-based, AUC = 0.819) effectively predicted recurrence risk, tumor heterogeneity, and FIGO stage, respectively [24,25,26]. While the included studies relied on conventional imaging modalities, future integration of advanced imaging technologies such as multispectral and hyperspectral imaging may enhance tissue characterization and tumor margin detection, offering new opportunities for AI-driven diagnostics and intraoperative decision-making [48]. Furthermore, our analysis revealed that the relative performance of AI model types appears to be domain-dependent. Deep learning models demonstrated superior performance in imaging-based applications, achieving an average AUC of 0.975 in radiogenomics. This aligns with findings in other fields, such as lung and brain cancer, where convolutional neural networks have outperformed traditional machine learning in radiological tasks [49,50]. In contrast, traditional machine learning methods showed more stable performance in genomics-based applications, where structured tabular data with well-defined biomarkers may favor less complex algorithms. Similarly, in immunotherapy prediction, deep learning yielded higher accuracy (AUC = 0.832) than traditional ML (AUC = 0.749), particularly when models leveraged spatial or stromal features. These findings suggest that the selection of AI architecture should be tailored to the data and prediction task, a concept that is increasingly recognized in precision oncology model development [6,7]. The superior performance of radiogenomics-based AI, such as the model by Zeng et al. (AUC = 0.975), further highlights the value of integrating imaging features with molecular profiling to improve patient stratification and therapy selection [27]. 
This model combined multi-modal data from multiple centers; however, the reported performance was based on internal validation and has not yet been replicated in external datasets, limiting generalizability. The success of radiomics-based AI in ovarian cancer aligns with previous applications in other malignancies. In glioblastoma and non-small-cell lung cancer (NSCLC), radiogenomics-enhanced AI models have achieved AUC values between 0.90 and 0.97 [49,50]. The strong performance of these models suggests that AI-driven imaging analysis could serve as a key tool in precision oncology, particularly for assessing chemoresistance, identifying high-risk recurrence patients, and refining preoperative decision-making. However, despite their promising results, radiomics-based AI models still face significant challenges. Efforts to standardize radiomics workflows through AI-driven automated feature selection and harmonization of imaging databases, such as The Cancer Imaging Archive (TCIA), will be crucial for their translation into routine clinical practice [51]. On the other hand, AI models predicting immunotherapy response demonstrated the lowest pooled AUC (0.77) and the highest heterogeneity, reflecting the inherent complexity of immune biomarkers and the variability in AI performance across different datasets. While some models, such as the CD8+ Tex signature (Chen et al., AUC = 0.728) and macrophage-based prognostic signatures (Zhao et al., AUC = 0.692–0.774), showed moderate predictive power, others, such as the fibroblast-based immune modeling approach (Yang et al., AUC = 0.853) and ECM-based AI models (Geng et al., AUC = 0.810), demonstrated improved accuracy in stratifying patients for immunotherapy response [28,30,31,32]. Furthermore, the immunotherapy-related models included in this review were primarily trained on publicly available transcriptomic datasets (TCGA, GEO), typically involving a few hundred patients. 
The endpoints used to develop and validate these models also varied considerably: while some studies assessed clinically meaningful outcomes such as OS or PFS, others relied on surrogate markers such as immune infiltration scores or gene expression signatures. Only a few studies reported external validation, and none incorporated real-world clinical endpoints such as the objective response rate (ORR). These limitations highlight the need for prospective validation using clinically annotated immunotherapy datasets to ensure real-world applicability of AI-based immune prediction models. Moreover, unlike radiomics-based AI, which benefits from standardized imaging protocols, immunotherapy-focused AI models rely on heterogeneous genomic, transcriptomic, and spatial transcriptomics data, leading to greater variability in predictive performance. Future research should prioritize spatial proteomics, AI-assisted single-cell transcriptomics, and liquid biopsy-based immune monitoring to enhance predictions of ICI response. By conducting a pooled analysis of these immunotherapy models, our study offers the first quantitative evaluation of AI’s performance in this domain, revealing a clear gap that warrants further development. Unlike prior reviews on this topic, which have been narrative in nature, this work provides concrete evidence of moderate predictive power and high heterogeneity, pointing to a need for more robust, externally validated models with standardized performance metrics [37,46]. A key barrier to clinical implementation of AI in oncology remains the lack of interpretability. Explainable AI (XAI) techniques have emerged to address this challenge, offering tools to visualize or quantify how input features contribute to model predictions [52]. 
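One such attribution approach is the Shapley value, the game-theoretic quantity underlying SHAP: each feature is credited with its average marginal contribution to the model output across all feature subsets. As a hedged illustration, the sketch below computes exact Shapley values by brute force on a toy additive risk model; the biomarker names and weights are purely hypothetical, and production SHAP implementations approximate these values efficiently for complex models.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value_fn):
    """Exact Shapley attribution: each feature's weighted average marginal
    contribution to value_fn over all subsets of the remaining features."""
    n = len(features)
    phi = {f: 0.0 for f in features}
    for f in features:
        others = [g for g in features if g != f]
        for k in range(n):
            for subset in combinations(others, k):
                # Standard Shapley weight for a coalition of size k
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[f] += weight * (value_fn(set(subset) | {f}) - value_fn(set(subset)))
    return phi

# Toy additive risk model over hypothetical biomarkers (illustrative weights only)
WEIGHTS = {"HRD_score": 0.30, "CCNE1_amp": -0.15, "CD8_infiltration": 0.20}

def model_output(active_features):
    # Baseline risk 0.5 plus additive contributions of the active features
    return 0.5 + sum(WEIGHTS[f] for f in active_features)

phi = shapley_values(list(WEIGHTS), model_output)
print(phi)  # for an additive model, each Shapley value equals the feature's own weight
```

Because the toy model is additive, the attributions recover the weights exactly; for real non-additive classifiers, the same machinery apportions interaction effects among the contributing features, which is what makes SHAP useful for interpreting multi-gene AI models.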
For example, SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) are widely used in genomics to attribute prediction importance to specific genes or pathways, improving biological interpretability and hypothesis generation. In radiomics, saliency maps and Grad-CAM (Gradient-weighted Class Activation Mapping) allow for a visual interpretation of which image regions contribute most to a classification, thereby enhancing model transparency and clinician trust. Future development of AI models for ovarian cancer should prioritize the integration of these tools, particularly when models are applied to high-stakes clinical decisions such as therapy selection or prognostic risk stratification [53,54]. Furthermore, this study emphasizes how AI could be leveraged beyond prediction models to actively guide treatment adaptation. AI-driven computational approaches are already demonstrating their ability to refine drug-repurposing strategies, particularly in identifying alternative PARP inhibitor combinations for HRD-negative ovarian cancer [55]. Recent research suggests that combining ATR inhibitors, such as ceralasertib, with PARP inhibitors could improve therapeutic efficacy by exploiting synthetic lethality mechanisms in HRD-negative tumors [56]. AI-driven multi-omics analysis could help refine these strategies, accelerating the identification of novel therapeutic combinations. Additionally, AI-driven predictive models integrating circulating tumor DNA (ctDNA) and liquid biopsy data may allow for real-time adjustments to treatment regimens, marking a shift toward adaptive AI-driven personalized medicine [57]. Despite the promising results observed in this systematic review, several limitations should be considered. The heterogeneity in AI model architectures, dataset sources, and validation strategies may have influenced the pooled estimates of predictive performance. 
Additionally, differences in feature selection methods, training datasets, and external validation cohorts contribute to variations in model accuracy, particularly in genomics-based and immunotherapy-focused AI models. The high heterogeneity observed across studies (I² > 85% in most subgroups) can be attributed to multiple interrelated factors. First, differences in AI model architecture influenced performance consistency, with deep learning models showing greater variability, particularly in the genomics and immunotherapy domains. Validation strategies also contributed significantly: internally validated models tended to report higher AUCs, while externally validated models demonstrated more limited performance. In addition, data modality and acquisition protocols introduced inconsistency. Radiomics studies differed in imaging modality (CT, MRI, PET/CT), segmentation approaches, and feature extraction pipelines, all of which can impact model reproducibility. In immunotherapy-focused studies, heterogeneity in biomarker selection—ranging from CD8+ T cells to fibroblast- and ECM-based signatures—also contributed to variable accuracy. Furthermore, the use of public versus proprietary datasets introduced variability in terms of patient demographics, treatment regimens, sequencing platforms, and data preprocessing. Inconsistencies in outcome definitions and performance reporting—including variation in timepoints for survival prediction and the lack of standardized thresholds for classification—further compounded comparability across studies. Finally, sample size variability and the lack of calibration reporting in many studies may have added statistical noise to the pooled estimates. Outcome heterogeneity also contributed to the observed variability in model performance. 
While all included studies reported the AUC as a primary performance metric, the clinical outcomes predicted by AI models varied widely—ranging from binary response categories (e.g., BRCA mutation, HRD status) to continuous outcomes (e.g., survival time) and categorical endpoints such as recurrence risk or immunotherapy response. Moreover, definitions of therapy response and follow-up periods were often not standardized across studies. This variation limited our ability to compare models directly and may have affected the precision of pooled estimates. To mitigate this, we focused our quantitative synthesis on AUC values and presented other outcomes descriptively. The absence of significant publication bias, as confirmed by Egger’s test and the symmetrical funnel plot, suggests that the reported AI model performances are unlikely to be systematically overestimated due to selective reporting [18]. However, the lack of standardized performance metrics—such as the inconsistent reporting of sensitivity, specificity, and decision thresholds—hindered the direct comparability of results. Notably, none of the included studies conducted prospective validation, and only five employed external validation with independent cohorts. This highlights a critical gap in current AI development for ovarian cancer, limiting the generalizability of published models. The scarcity of prospective validation is likely due to logistical challenges, data access limitations, and the retrospective design typical of early-stage AI studies. Moving forward, the successful integration of AI into clinical practice will require prospective validation through multi-center clinical trials to ensure generalizability. AI models must also be continuously updated using real-world patient data to adapt to evolving treatment paradigms, novel biomarker discoveries, and emerging immunotherapy combinations. 
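The publication-bias assessment above relied on Egger's regression test [18], which regresses each study's standardized effect on its precision; an intercept far from zero signals funnel-plot asymmetry. The sketch below illustrates the computation with placeholder effect sizes and standard errors, not the study data.

```python
import numpy as np

def egger_test(effects, ses):
    """Egger's regression for funnel-plot asymmetry: regress the standardized
    effect (effect/SE) on precision (1/SE). A non-zero intercept suggests
    small-study effects consistent with publication bias."""
    y = np.asarray(effects, float) / np.asarray(ses, float)
    x = 1.0 / np.asarray(ses, float)
    X = np.column_stack([np.ones_like(x), x])          # columns: intercept, slope
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    n = len(y)
    resid = y - X @ beta
    s2 = resid @ resid / (n - 2)                       # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)
    se_intercept = np.sqrt(cov[0, 0])
    t_stat = beta[0] / se_intercept                    # compare to t(n-2) critical value
    return beta[0], se_intercept, t_stat

# Placeholder per-study effects and standard errors for five hypothetical studies
effects = [0.55, 0.62, 0.48, 0.70, 0.60]
ses = [0.05, 0.08, 0.04, 0.10, 0.06]
intercept, se_i, t = egger_test(effects, ses)
print(f"Egger intercept = {intercept:.3f} (SE {se_i:.3f}), t = {t:.2f}")
```

In practice the t statistic would be referred to a t distribution with n − 2 degrees of freedom to obtain the p-value; a small |t|, as found in our analysis, is consistent with a symmetric funnel plot.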
Additionally, regulatory frameworks must establish guidelines for AI transparency, bias mitigation, and clinical interpretability, ensuring that AI-driven therapy response prediction remains both reliable and reproducible. Beyond therapy response prediction, AI-driven drug discovery is an emerging field with significant potential in ovarian cancer management. For instance, AI-based computational modeling could identify alternative PARPi combinations for HRD-negative patients, an area where treatment options remain limited. AI could facilitate the identification of novel immune checkpoint inhibitor targets by analyzing large-scale transcriptomic and proteomic datasets, further expanding treatment options for patients with immunotherapy-resistant ovarian cancer. Ultimately, artificial intelligence is on the verge of transforming ovarian cancer management. While challenges remain, this study provides a critical framework for future AI advancements in precision oncology. The findings demonstrate that AI-based models offer a path toward high-precision, personalized therapy response prediction. Moving forward, prospective multi-center clinical trials, regulatory frameworks, and explainable AI methodologies will ensure that AI fulfills its potential as a powerful tool in ovarian cancer treatment and decision-making.

5. Conclusions

This review underlines the predictive potential of AI models for ovarian cancer therapy response, with radiomics-based approaches demonstrating the highest accuracy and multi-modal AI outperforming single-modality models. Our findings reveal that genomics-based AI faces challenges in biomarker standardization, while immunotherapy-focused AI models show high variability. The risk-of-bias assessment underscores the need for external validation and improved model transparency. Addressing dataset heterogeneity and explainability concerns through XAI techniques is crucial for AI’s clinical adoption. This work emphasizes the importance of developing validated, interpretable, and multi-modal AI models to enhance precision oncology and personalized treatment strategies in ovarian cancer.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ai6040084/s1, Table S1: Risk of bias assessment using PROBAST for AI-based models; Table S2: Risk of bias assessment using QUADAS-2 for diagnostic accuracy studies; PRISMA Checklist.

Author Contributions

Conceptualization, M.F.P.M. and B.A.M.; methodology, M.F.P.M.; software, M.F.P.M.; validation, M.F.P.M., B.A.M. and V.L.; formal analysis, M.F.P.M.; investigation, M.F.P.M.; resources, M.F.P.M.; data curation, M.F.P.M.; writing—original draft preparation, M.F.P.M.; writing—review and editing, B.A.M., V.L. and G.C.; visualization, M.F.P.M.; supervision, B.A.M. and G.C.; project administration, M.F.P.M.; funding acquisition, not applicable. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The authors confirm that the data supporting the findings of this study are available within this article and its Supplementary Materials.

Acknowledgments

The authors would like to thank Davide Rodelli for the article’s SEO optimization and Keyword Research.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Webb, P.M.; Jordan, S.J. Global epidemiology of epithelial ovarian cancer. Nat. Rev. Clin. Oncol. 2024, 21, 389–400. [Google Scholar] [CrossRef] [PubMed]
  2. Siegel, R.L.; Miller, K.D.; Jemal, A. Cancer statistics, 2020. CA Cancer J. Clin. 2020, 70, 7–30. [Google Scholar] [CrossRef] [PubMed]
  3. Gao, Y.; Zhou, N.; Liu, J. Ovarian Cancer Diagnosis and Prognosis Based on Cell-Free DNA Methylation. Cancer Control 2024, 31, 10732748241255548. [Google Scholar] [CrossRef]
  4. Lin, A.; Xue, F.; Pan, C.; Li, L. Integrative prognostic modeling of ovarian cancer: Incorporating genetic, clinical, and immunological markers. Discov. Oncol. 2025, 16, 115. [Google Scholar] [CrossRef]
  5. Bajwa, J.; Munir, U.; Nori, A.; Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Healthc. J. 2021, 8, e188–e194. [Google Scholar] [CrossRef] [PubMed]
  6. Fanizzi, A.; Arezzo, F.; Cormio, G.; Comes, M.C.; Cazzato, G.; Boldrini, L.; Bove, S.; Bollino, M.; Kardhashi, A.; Silvestris, E.; et al. An Explainable Machine Learning Model to Solid Adnexal Masses Diagnosis Based on Clinical Data and Qualitative Ultrasound Indicators. Cancer Med. 2024, 13, e7425. [Google Scholar] [CrossRef]
  7. Duwe, G.; Mercier, D.; Wiesmann, C.; Kauth, V.; Moench, K.; Junker, M.; Neumann, C.C.M.; Haferkamp, A.; Dengel, A.; Höfner, T. Challenges and perspectives in the use of artificial intelligence to support treatment recommendations in clinical oncology. Cancer Med. 2024, 13, e7398. [Google Scholar] [CrossRef]
  8. Mysona, D.P.; Kapp, D.S.; Rohatgi, A.; Lee, D.; Mann, A.K.; Tran, P.; Tran, L.; She, J.X.; Chan, J.K. Applying Artificial Intelligence to Gynecologic Oncology: A Review. Obstet. Gynecol. Surv. 2021, 76, 292–301. [Google Scholar] [CrossRef]
  9. Erdemoglu, E.; Serel, T.A.; Karacan, E.; Köksal, O.K.; Turan, İ.; Öztürk, V.; Bozkurt, K.K. Artificial intelligence for prediction of endometrial intraepithelial neoplasia and endometrial cancer risks in pre- and postmenopausal women. AJOG Glob. Rep. 2023, 3, 100154. [Google Scholar] [CrossRef]
  10. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71. [Google Scholar] [CrossRef]
  11. Schardt, C.; Adams, M.B.; Owens, T.; Keitz, S.; Fontelo, P. Utilization of the PICO framework to improve searching PubMed for clinical questions. BMC Med. Inform. Decis. Mak. 2007, 7, 16. [Google Scholar]
  12. DerSimonian, R.; Kacker, R. Random-effects model for meta-analysis of clinical trials: An update. Contemp. Clin. Trials 2007, 28, 105–114. [Google Scholar] [CrossRef] [PubMed]
  13. DerSimonian, R.; Laird, N. Meta-analysis in clinical trials. Control. Clin. Trials 1986, 7, 177–188. [Google Scholar] [CrossRef]
  14. Cochran, W.G. The comparison of percentages in matched samples. Biometrika 1950, 37, 256–266. [Google Scholar] [CrossRef] [PubMed]
  15. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2021; Available online: https://www.R-project.org/ (accessed on 9 March 2025).
  16. Collins, G.S.; Dhiman, P.; Andaur Navarro, C.L.; Ma, J.; Hooft, L.; Reitsma, J.B.; Logullo, P.; Beam, A.L.; Peng, L.; Van Calster, B.; et al. Protocol for development of a reporting guideline (TRIPOD-AI) and risk of bias tool (PROBAST-AI) for diagnostic and prognostic prediction model studies based on artificial intelligence. BMJ Open 2021, 11, e048008. [Google Scholar] [CrossRef]
  17. Whiting, P.F.; Rutjes, A.W.; Westwood, M.E.; Mallett, S.; Deeks, J.J.; Reitsma, J.B.; Leeflang, M.M.; Sterne, J.A.; Bossuyt, P.M. QUADAS-2: A revised tool for the quality assessment of diagnostic accuracy studies. Ann. Intern. Med. 2011, 155, 529–536. [Google Scholar] [CrossRef]
  18. Egger, M.; Davey Smith, G.; Schneider, M.; Minder, C. Bias in meta-analysis detected by a simple, graphical test. BMJ 1997, 315, 629–634. [Google Scholar] [CrossRef]
  19. Guyatt, G.H.; Ebrahim, S.; Alonso-Coello, P.; Johnston, B.C.; Mathioudakis, A.G.; Briel, M.; Mustafa, R.A.; Sun, X.; Walter, S.D.; Heels-Ansdell, D.; et al. GRADE guidelines 17: Assessing the risk of bias associated with missing participant outcome data in a body of evidence. J. Clin. Epidemiol. 2017, 87, 14–22. [Google Scholar] [CrossRef]
  20. Nero, C.; Boldrini, L.; Lenkowicz, J.; Giudice, M.T.; Piermattei, A.; Inzani, F.; Pasciuto, T.; Minucci, A.; Fagotti, A.; Zannoni, G.; et al. Deep-Learning to Predict BRCA Mutation and Survival from Digital H&E Slides of Epithelial Ovarian Cancer. Int. J. Mol. Sci. 2022, 23, 11326. [Google Scholar] [CrossRef]
  21. Bergstrom, E.N.; Abbasi, A.; Díaz-Gay, M.; Galland, L.; Ladoire, S.; Lippman, S.M.; Alexandrov, L.B. Deep Learning Artificial Intelligence Predicts Homologous Recombination Deficiency and Platinum Response From Histologic Slides. J. Clin. Oncol. 2024, 42, 3550–3560. [Google Scholar] [CrossRef]
  22. Wang, L.; Chen, X.; Song, L.; Zou, H. Machine Learning Developed a Programmed Cell Death Signature for Predicting Prognosis, Ecosystem, and Drug Sensitivity in Ovarian Cancer. Anal. Cell Pathol. 2023, 2023, 7365503. [Google Scholar] [CrossRef]
  23. Huan, Q.; Cheng, S.; Ma, H.F.; Zhao, M.; Chen, Y.; Yuan, X. Machine learning-derived identification of prognostic signature for improving prognosis and drug response in patients with ovarian cancer. J. Cell. Mol. Med. 2024, 28, e18021. [Google Scholar] [CrossRef] [PubMed]
  24. Wei, W.; Liu, Z.; Rong, Y.; Zhou, B.; Bai, Y.; Wei, W.; Wang, S.; Wang, M.; Guo, Y.; Tian, J. A Computed Tomography-Based Radiomic Prognostic Marker of Advanced High-Grade Serous Ovarian Cancer Recurrence: A Multicenter Study. Front. Oncol. 2019, 9, 255. [Google Scholar] [CrossRef]
  25. Binas, D.A.; Tzanakakis, P.; Economopoulos, T.L.; Konidari, M.; Bourgioti, C.; Moulopoulos, L.A.; Matsopoulos, G.K. A Novel Approach for Estimating Ovarian Cancer Tissue Heterogeneity through the Application of Image Processing Techniques and Artificial Intelligence. Cancers 2023, 15, 1058. [Google Scholar] [CrossRef]
  26. Xu, S.; Zhu, C.; Wu, M.; Gu, S.; Wu, Y.; Cheng, S.; Wang, C.; Zhang, Y.; Zhang, W.; Shen, W.; et al. Artificial intelligence algorithm for preoperative prediction of FIGO stage in ovarian cancer based on clinical features integrated 18F-FDG PET/CT metabolic and radiomics features. J. Cancer Res. Clin. Oncol. 2025, 151, 87. [Google Scholar] [CrossRef]
  27. Zeng, S.; Wang, X.L.; Yang, H. Radiomics and radiogenomics: Extracting more information from medical images for the diagnosis and prognostic prediction of ovarian cancer. Mil. Med. Res. 2024, 11, 77. [Google Scholar] [CrossRef] [PubMed]
  28. Chen, R.; Zheng, Y.; Fei, C.; Ye, J.; Fei, H. Machine learning developed a CD8+ exhausted T cells signature for predicting prognosis, immune infiltration and drug sensitivity in ovarian cancer. Sci. Rep. 2024, 14, 5794. [Google Scholar] [CrossRef] [PubMed]
  29. Wu, Q.; Tian, R.; He, X.; Liu, J.; Ou, C.; Li, Y.; Fu, X. Machine learning-based integration develops an immune-related risk model for predicting prognosis of high-grade serous ovarian cancer and providing therapeutic strategies. Front. Immunol. 2023, 14, 1164408. [Google Scholar] [CrossRef]
  30. Zhao, B.; Pei, L. A macrophage related signature for predicting prognosis and drug sensitivity in ovarian cancer based on integrative machine learning. BMC Med. Genom. 2023, 16, 230. [Google Scholar] [CrossRef]
  31. Yang, Z.; Zhou, D.; Huang, J. Identifying Explainable Machine Learning Models and a Novel SFRP2+ Fibroblast Signature as Predictors for Precision Medicine in Ovarian Cancer. Int. J. Mol. Sci. 2023, 24, 16942. [Google Scholar] [CrossRef]
  32. Geng, T.; Zheng, M.; Wang, Y.; Reseland, J.E.; Samara, A. An artificial intelligence prediction model based on extracellular matrix proteins for the prognostic prediction and immunotherapeutic evaluation of ovarian serous adenocarcinoma. Front. Mol. Biosci. 2023, 10, 1200354. [Google Scholar] [CrossRef]
  33. Wang, Y.; Lin, W.; Zhuang, X.; Wang, X.; He, Y.; Li, L.; Lyu, G. Advances in artificial intelligence for the diagnosis and treatment of ovarian cancer (Review). Oncol. Rep. 2024, 51, 46. [Google Scholar] [CrossRef]
  34. Xu, H.L.; Gong, T.T.; Liu, F.H.; Chen, H.Y.; Xiao, Q.; Hou, Y.; Huang, Y.; Sun, H.Z.; Shi, Y.; Gao, S.; et al. Artificial Intelligence Performance in Image-Based Ovarian Cancer Identification: A Systematic Review and Meta-Analysis. EClinicalMedicine 2022, 53, 101662. [Google Scholar] [CrossRef] [PubMed]
  35. Huang, M.L.; Ren, J.; Jin, Z.Y.; Liu, X.Y.; He, Y.L.; Li, Y.; Xue, H.D. A Systematic Review and Meta-Analysis of CT and MRI Radiomics in Ovarian Cancer: Methodological Issues and Clinical Utility. Insights Imaging 2023, 14, 117. [Google Scholar] [CrossRef] [PubMed]
  36. O’Sullivan, N.J.; Temperley, H.C.; Horan, M.T.; Kamran, W.; Corr, A.; O’Gorman, C.; Saadeh, F.; Meaney, J.M.; Kelly, M.E. Role of Radiomics as a Predictor of Disease Recurrence in Ovarian Cancer: A Systematic Review. Abdom. Radiol. 2024, 49, 3540–3547. [Google Scholar] [CrossRef]
  37. Hatamikia, S.; Nougaret, S.; Panico, C.; Avesani, G.; Nero, C.; Boldrini, L.; Sala, E.; Woitek, R. Ovarian Cancer beyond Imaging: Integration of AI and Multiomics Biomarkers. Eur. Radiol. Exp. 2023, 7, 50. [Google Scholar] [CrossRef]
  38. Asadi, F.; Rahimi, M.; Ramezanghorbani, N.; Almasi, S. Comparing the Effectiveness of Artificial Intelligence Models in Predicting Ovarian Cancer Survival: A Systematic Review. Cancer Rep. 2025, 8, e70138. [Google Scholar] [CrossRef] [PubMed] [PubMed Central]
  39. Pecchi, A.; Bozzola, C.; Beretta, C.; Besutti, G.; Toss, A.; Cortesi, L.; Balboni, E.; Nocetti, L.; Ligabue, G.; Torricelli, P. DCE-MRI Radiomic Analysis in Triple Negative Ductal Invasive Breast Cancer. Comparison between BRCA and Not BRCA Mutated Patients: Preliminary Results. Magn. Reson. Imaging 2024, 113, 110214. [Google Scholar] [CrossRef]
  40. Azadinejad, H.; Farhadi Rad, M.; Shariftabrizi, A.; Rahmim, A.; Abdollahi, H. Optimizing Cancer Treatment: Exploring the Role of AI in Radioimmunotherapy. Diagnostics 2025, 15, 397. [Google Scholar] [CrossRef]
  41. Mirza, Z.; Ansari, M.S.; Iqbal, M.S.; Ahmad, N.; Alganmi, N.; Banjar, H.; Al-Qahtani, M.H.; Karim, S. Identification of Novel Diagnostic and Prognostic Gene Signature Biomarkers for Breast Cancer Using Artificial Intelligence and Machine Learning Assisted Transcriptomics Analysis. Cancers 2023, 15, 3237. [Google Scholar] [CrossRef]
  42. Gandhi, Z.; Gurram, P.; Amgai, B.; Lekkala, S.P.; Lokhandwala, A.; Manne, S.; Mohammed, A.; Koshiya, H.; Dewaswala, N.; Desai, R.; et al. Artificial Intelligence and Lung Cancer: Impact on Improving Patient Outcomes. Cancers 2023, 15, 5236. [Google Scholar] [CrossRef] [PubMed]
  43. Pandya, P.H.; Jannu, A.J.; Bijangi-Vishehsaraei, K.; Dobrota, E.; Bailey, B.J.; Barghi, F.; Shannon, H.E.; Riyahi, N.; Damayanti, N.P.; Young, C.; et al. Integrative Multi-OMICs Identifies Therapeutic Response Biomarkers and Confirms Fidelity of Clinically Annotated, Serially Passaged Patient-Derived Xenografts Established from Primary and Metastatic Pediatric and AYA Solid Tumors. Cancers 2022, 15, 259. [Google Scholar] [CrossRef]
  44. Zhuang, S.; Chen, T.; Li, Y.; Wang, Y.; Ai, L.; Geng, Y.; Zou, M.; Liu, K.; Xu, H.; Wang, L.; et al. A transcriptional signature detects homologous recombination deficiency in pancreatic cancer at the individual level. Mol. Ther. Nucleic Acids 2021, 26, 1014–1026. [Google Scholar] [CrossRef]
  45. Gorski, J.W.; Ueland, F.R.; Kolesar, J.M. CCNE1 Amplification as a Predictive Biomarker of Chemotherapy Resistance in Epithelial Ovarian Cancer. Diagnostics 2020, 10, 279. [Google Scholar] [CrossRef] [PubMed]
  46. Jardim, D.L.; Goodman, A.; de Melo Gagliato, D.; Kurzrock, R. The Challenges of Tumor Mutational Burden as an Immunotherapy Biomarker. Cancer Cell 2021, 39, 154–173. [Google Scholar] [CrossRef]
  47. Cui, C.; Xu, C.; Yang, W.; Chi, Z.; Sheng, X.; Si, L.; Xie, Y.; Yu, J.; Wang, S.; Yu, R.; et al. Ratio of the interferon-γ signature to the immunosuppression signature predicts anti-PD-1 therapy response in melanoma. NPJ Genom. Med. 2021, 6, 7. [Google Scholar] [CrossRef]
  48. Lu, G.; Fei, B. Medical Hyperspectral Imaging: A Review. J. Biomed. Opt. 2014, 19, 010901. [Google Scholar] [CrossRef] [PubMed]
  49. Mahajan, A.; Sahu, A.; Ashtekar, R.; Kulkarni, T.; Shukla, S.; Agarwal, U.; Bhattacharya, K. Glioma radiogenomics and artificial intelligence: Road to precision cancer medicine. Clin. Radiol. 2023, 78, 137–149. [Google Scholar] [CrossRef] [PubMed]
  50. Kohan, A.; Hinzpeter, R.; Kulanthaivelu, R.; Mirshahvalad, S.A.; Avery, L.; Tsao, M.; Li, Q.; Ortega, C.; Metser, U.; Hope, A.; et al. Contrast Enhanced CT Radiogenomics in a Retrospective NSCLC Cohort: Models, Attempted Validation of a Published Model and the Relevance of the Clinical Context. Acad. Radiol. 2024, 31, 2953–2961. [Google Scholar] [CrossRef]
  51. Clark, K.; Vendt, B.; Smith, K.; Freymann, J.; Kirby, J.; Koppel, P.; Moore, S.; Phillips, S.; Maffitt, D.; Pringle, M.; et al. The Cancer Imaging Archive (TCIA): Maintaining and operating a public information repository. J. Digit. Imaging 2013, 26, 1045–1057. [Google Scholar] [CrossRef]
  52. Abgrall, G.; Holder, A.L.; Chelly Dagdia, Z.; Zeitouni, K.; Monnet, X. Should AI models be explainable to clinicians? Crit. Care 2024, 28, 301. [Google Scholar] [CrossRef] [PubMed]
  53. Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why Should I Trust You?” Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ‘16), San Francisco, CA, USA, 13–17 August 2016; pp. 1135–1144. [Google Scholar] [CrossRef]
  54. Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 618–626. [Google Scholar] [CrossRef]
  55. Frenel, J.S.; Bossard, C.; Rynkiewicz, J.; Thomas, F.; Salhi, Y.; Salhi, S.; Chetritt, J. Artificial intelligence to predict homologous recombination deficiency in ovarian cancer from whole-slide histopathological images. J. Clin. Oncol. 2024, 42, 5578. [Google Scholar] [CrossRef]
  56. Shah, P.D.; Wethington, S.L.; Pagan, C.; Latif, N.; Tanyi, J.; Martin, L.P.; Morgan, M.; Burger, R.A.; Haggerty, A.; Zarrin, H.; et al. Combination ATR and PARP Inhibitor (CAPRI): A phase 2 study of ceralasertib plus olaparib in patients with recurrent, platinum-resistant epithelial ovarian cancer. Gynecol. Oncol. 2021, 163, 246–253. [Google Scholar] [CrossRef]
  57. Ginghina, O.; Hudita, A.; Zamfir, M.; Spanu, A.; Mardare, M.; Bondoc, I.; Buburuzan, L.; Georgescu, S.E.; Costache, M.; Negrei, C.; et al. Liquid Biopsy and Artificial Intelligence as Tools to Detect Signatures of Colorectal Malignancies: A Modern Approach in Patient’s Stratification. Front. Oncol. 2022, 12, 856575. [Google Scholar]
Figure 1. PRISMA flowchart for the included studies.
Figure 2. Forest plot showing the pooled AUC values and confidence intervals for AI models predicting therapy response in ovarian cancer. The pooled AUC was 0.81 (95% CI: 0.72–0.89). The random-effects model was applied due to high heterogeneity (I² = 84.9%). Radiomics-based AI models demonstrated the highest predictive performance, while immunotherapy-focused models exhibited higher heterogeneity.
Figure 3. Forest plot of pooled AUC values for genomics-based AI models in ovarian cancer. The pooled AUC was 0.78 (95% CI: 0.66–0.89) with high heterogeneity (I² = 86.5%), indicating variability across studies. DeepHRD performed best in predicting HRD, while BRCA-status models showed lower predictive accuracy [20,21,22,23].
Figure 4. Forest plot of pooled AUC values for radiomics-based AI models in ovarian cancer. The pooled AUC was 0.88 (95% CI: 0.78–0.99), with high heterogeneity (I² = 90.5%), suggesting variability in imaging methodologies and datasets. Zeng et al.’s radiogenomics model outperformed all others, highlighting the benefit of multi-modal AI integration [24,25,26,27].
Figure 5. Forest plot of pooled AUC values for immunotherapy-focused AI models in ovarian cancer. The pooled AUC was 0.77 (95% CI: 0.69–0.85), with high heterogeneity (I² = 90.0%), suggesting substantial variability due to differences in immune biomarker selection and AI model validation. The SFRP2+ fibroblast signature by Yang et al. and the ECM remodeling-based AI by Geng et al. outperformed other models, highlighting the importance of tumor microenvironment interactions in immunotherapy prediction [28,29,30,31,32].
Figure 6. Egger’s test for funnel-plot asymmetry, assessing publication bias.
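Egger’s test, as used for Figure 6, regresses the standardized effect (effect divided by its standard error) on precision (the reciprocal of the standard error); an intercept significantly different from zero suggests funnel-plot asymmetry. The sketch below uses hypothetical per-study effects and standard errors purely to illustrate the mechanics, not to reproduce the paper’s result.

```python
import math

def egger_test(effects, standard_errors):
    """Egger's regression test: regress standardized effect (effect/SE) on
    precision (1/SE). Returns the intercept, its SE, and the t statistic
    for testing intercept == 0 (asymmetry)."""
    y = [e / se for e, se in zip(effects, standard_errors)]  # standardized effects
    x = [1.0 / se for se in standard_errors]                 # precision
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # residual variance and the standard error of the intercept
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in resid) / (n - 2)
    se_int = math.sqrt(s2 * (1.0 / n + mx ** 2 / sxx))
    t = intercept / se_int if se_int > 0 else float("inf")
    return intercept, se_int, t

# Hypothetical effects and standard errors, for illustration only
intercept, se_int, t = egger_test(
    [0.70, 0.81, 0.88, 0.86, 0.79], [0.05, 0.06, 0.04, 0.05, 0.07]
)
```

In practice the t statistic is compared against a t distribution with n − 2 degrees of freedom; statistical packages such as R's metafor (`regtest`) wrap this same regression.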
Table 1. PICOS Criteria for study inclusion/exclusion.
Population | Patients diagnosed with ovarian cancer undergoing treatment with CHT, PARPis, or ICIs
Intervention | AI-based models applied for therapy response prediction, including genomics-based, radiomics-based, and immunotherapy-focused models
Comparator | Standard clinical or molecular predictors, including traditional biomarker-based testing (HRD status, BRCA mutations), clinician-based radiologic assessments, and conventional histopathologic scoring methods
Outcomes | Predictive performance of AI models, assessed using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and hazard ratios (HR) for progression-free survival (PFS) and overall survival (OS); secondary outcomes included model generalizability, external validation, and clinical applicability
Study Design | Retrospective and prospective cohort studies, observational studies, and RCTs that employed AI for therapy response prediction
AI: artificial intelligence; AUC: area under the curve; BRCA: breast cancer gene; CHT: chemotherapy; HR: hazard ratio; HRD: homologous recombination deficiency; ICI: immune checkpoint inhibitor; OS: overall survival; PARPi: poly (ADP-ribose) polymerase inhibitor; PFS: progression-free survival; RCT: randomized controlled trial.
Table 2. Summary of the included studies.
Study | AI Model (Type) | Dataset Used | AUC | Outcome Assessed

AI in Genomic and Molecular Profiling
Nero et al. [20] | Weakly supervised AI (deep learning) | TCGA | 0.700 | BRCA status prediction
Bergstrom et al. [21] | DeepHRD (deep learning) | TCGA + external cohorts | 0.810 | HRD prediction
Wang et al. [22] | ML prognostic signature (traditional ML) | Multi-center cohorts | 0.739–0.820 (OS, 2–5 yrs) | Survival prediction, drug response
Huan et al. [23] | MLDPS prognostic AI (traditional ML) | Multi-cohort OV datasets | 0.859–0.795 (OS, 1–5 yrs) | Prognosis, drug response prediction

AI in Imaging and Radiomics for Therapy Prediction
Wei et al. [24] | CT-based radiomics (traditional ML) | Multi-center CT datasets | 0.880 | Recurrence prediction
Binas et al. [25] | MRI-based AI (traditional ML) | Multi-center MRI datasets | 0.860 | Tumor heterogeneity assessment
Xu et al. [26] | PET/CT-based AI (traditional ML) | Clinical PET/CT scans | 0.819 | FIGO stage prediction
Zeng et al. [27] | Radiomics and radiogenomics (deep learning) | Multi-center imaging and genomics | 0.975 | Diagnosis, prognosis, therapy response

AI for Immunotherapy and Novel Targeted Treatments
Chen et al. [28] | CD8+ Tex prognostic signature (traditional ML) | TCGA, GSE datasets | 0.728–0.783 | ICI response prediction
Wu et al. [29] | Immune risk model (traditional ML) | TCGA, GEO datasets | 0.790 | ICI response, TME profiling
Zhao et al. [30] | MRS macrophage AI (traditional ML) | TCGA, GEO datasets | 0.692–0.774 | Prognosis, drug sensitivity
Yang et al. [31] | SFRP2+ fibroblast signature (deep learning) | TCGA, GEO datasets | 0.853 | ICI response, TP53 mutation
Geng et al. [32] | ECM-based AI (deep learning) | TCGA Pan-Cancer | 0.810 | Immunotherapy response prediction
AI: artificial intelligence; AUC: area under the curve; BRCA: breast cancer gene; CD8+ Tex: CD8+ exhausted T cells; CHT: chemotherapy; CT: computed tomography; ECM: extracellular matrix; FIGO: International Federation of Gynecology and Obstetrics; GEO: Gene Expression Omnibus; HRD: homologous recombination deficiency; ICI: immune checkpoint inhibitor; ML: machine learning; MLDPS: machine learning-derived prognostic signature; MRI: magnetic resonance imaging; MRS: macrophage-related signature; OC: ovarian cancer; OS: overall survival; PET: positron emission tomography; TCGA: The Cancer Genome Atlas; Tex: exhausted T cells; TME: tumor microenvironment; yrs: years.
Table 3. GRADE certainty of evidence summary table. The green dot represents a low risk of bias; the yellow and red dots represent a moderate and high risk of bias, respectively.
AI Model | Risk of Bias | Inconsistency | Indirectness | Imprecision | Publication Bias | Certainty of Evidence
Genomics-Based AI (ratings shown as colored dots in the original table)
Radiomics-Based AI (ratings shown as colored dots in the original table)
Immunotherapy AI (ratings shown as colored dots in the original table)
Overall AI Models (ratings shown as colored dots in the original table)

Maiorano, M.F.P.; Cormio, G.; Loizzi, V.; Maiorano, B.A. Artificial Intelligence in Ovarian Cancer: A Systematic Review and Meta-Analysis of Predictive AI Models in Genomics, Radiomics, and Immunotherapy. AI 2025, 6, 84. https://doi.org/10.3390/ai6040084
