Systematic Review

AI-Guided Delineation of Gross Tumor Volume for Body Tumors: A Systematic Review

by
Lea Marie Pehrson
1,2,3,*,
Jens Petersen
3,4,
Nathalie Sarup Panduro
1,2,
Carsten Ammitzbøl Lauridsen
1,5,
Jonathan Frederik Carlsen
1,2,
Sune Darkner
3,
Michael Bachmann Nielsen
1,2 and
Silvia Ingala
1,6,7
1
Department of Diagnostic Radiology, Copenhagen University Hospital Rigshospitalet, 2100 Copenhagen, Denmark
2
Department of Clinical Medicine, University of Copenhagen, 2100 Copenhagen, Denmark
3
Department of Computer Science, University of Copenhagen, 2100 Copenhagen, Denmark
4
Department of Oncology, Rigshospitalet, 2100 Copenhagen, Denmark
5
Radiography Education, University College Copenhagen, 2200 Copenhagen, Denmark
6
Cerebriu A/S, 1434 Copenhagen, Denmark
7
Department of Diagnostic Radiology, Copenhagen University Hospital Herlev and Gentofte, 2730 Herlev, Denmark
*
Author to whom correspondence should be addressed.
Diagnostics 2025, 15(7), 846; https://doi.org/10.3390/diagnostics15070846
Submission received: 6 January 2025 / Revised: 14 March 2025 / Accepted: 18 March 2025 / Published: 26 March 2025
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)

Abstract
Background: Approximately 50% of all oncological patients undergo radiation therapy, where personalized treatment planning relies on gross tumor volume (GTV) delineation. Manual delineation of the GTV is time-consuming, operator-dependent, and prone to variability. An increasing number of studies apply artificial intelligence (AI) techniques to automate such delineation processes. Methods: We performed a systematic review comparing the performance of AI models for tumor delineation within the body (thoracic cavity, esophagus, abdomen and pelvis, or soft tissue and bone). A retrospective search of five electronic databases was performed, covering January 2017 to February 2025. Original research studies developing and/or validating algorithms delineating the GTV on CT, MRI, and/or PET were included. The Checklist for Artificial Intelligence in Medical Imaging (CLAIM) and the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis statement and checklist (TRIPOD) were used to assess risk of bias and reporting adherence. Results: After screening 2430 articles, 48 were included. The pooled diagnostic performance of the AI algorithms across different tumors and topological areas ranged from 0.62 to 0.92 in Dice similarity coefficient (DSC) and from 1.33 to 47.10 mm in Hausdorff distance (HD). The algorithms with the highest DSC deployed an encoder–decoder architecture. Conclusions: AI algorithms demonstrate a high level of concordance with clinicians in GTV delineation. Translation to clinical settings requires building trust, improving the performance and robustness of results, and testing in prospective studies and randomized controlled trials.

1. Introduction

Approximately 50% of all oncologic patients undergo radiation therapy, making this one of the cornerstones of cancer treatment. Personalized planning of radiation therapy relies on anatomical tumor delineation in a scan prior to treatment, referred to as Gross Tumor Volume (GTV) [1]. However, manual GTV delineation is a time-consuming and operator-dependent procedure prone to intra- and interobserver variability [2]. While the development and refinement of artificial intelligence (AI)-based applications for automated segmentation of tumors promise to automate this task, making it both faster and more objective, applications in real-life clinical scenarios are still lacking due to insufficient evidence on their generalizability [3].
The performance of algorithms is linked to several factors, including the modality employed for GTV delineation, labeling, annotation, and segmentation protocols, as well as quality assurance. It is widely acknowledged that data quality plays a pivotal role in shaping the performance of AI algorithms and the generalizability of their results. Specifically, the fitness of both the quantity and quality of the data for the specific problem at hand significantly influences algorithm performance. Additionally, the architecture and backbone of the proposed model can also widely affect the outcome [3].
To date, no comprehensive review of AI-based algorithms for delineating the GTV for radiotherapy purposes in body tumors (thorax, abdomen, and associated soft tissue and bone) has been published. Former segmentation challenges for head and neck cancers have provided extensive overviews of the applications and available literature in their respective domains. With this in mind, we aimed to conduct a systematic review of the available literature on existing algorithms by topological tumor location and to assess their performance in terms of accuracy, robustness, efficiency, generalizability, and interpretability, in order to evaluate the performance of AI-based GTV delineation against manual delineation for body tumors eligible for radiation therapy.

2. Materials and Methods

Institutional review board approval of this study was deemed unnecessary as the data utilized in this study were retrospectively sourced exclusively from publicly available databases, and no direct involvement or handling of human subjects occurred.
Search strategy:
To identify potentially relevant articles, a title/abstract/keyword search was performed in PubMed, Scopus, Cochrane Library, IEEE, and Web of Science. The search string used was “(GTV OR gross tumor volume) AND (segmentation OR delineation)”. The search was restricted to peer-reviewed original research articles in English published between 2017 and 2025, both years included. The literature search was completed on 4th February 2025.
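As an illustration only (this is not part of the authors' methodology), the PubMed leg of such a search could be reproduced programmatically via the NCBI E-utilities `esearch` endpoint; `db`, `term`, `datetype`, `mindate`, `maxdate`, and `retmax` are standard E-utilities query parameters, and the other four databases each have their own interfaces:

```python
# Sketch: build an NCBI E-utilities esearch URL for the review's search
# string, restricted to publication dates 2017-2025 (both included).
from urllib.parse import urlencode

BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

params = {
    "db": "pubmed",
    "term": "(GTV OR gross tumor volume) AND (segmentation OR delineation)",
    "datetype": "pdat",   # filter on publication date
    "mindate": "2017",
    "maxdate": "2025",
    "retmax": "5000",     # return up to 5000 PMIDs
}
url = BASE + "?" + urlencode(params)
print(url)
```

Requesting this URL returns an XML list of matching PMIDs, which would still need deduplication against the other databases and title/abstract screening.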
Inclusion and exclusion criteria: The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines for literature search and study selection were followed [4]. After the removal of duplicates, the titles and abstracts of the articles were screened independently by two authors (L.M.P., N.S.P.). Inclusion criteria were (1) original, peer-reviewed research articles; (2) focused on the automatic segmentation of the GTV of any type of cancer eligible for radiation therapy in the thorax, esophagus, abdomen, pelvis, soft tissue, and bone; and (3) utilizing one or more imaging modalities including computed tomography (CT), magnetic resonance imaging (MRI), and/or positron emission tomography (PET) for GTV segmentation. Reasons for exclusion included phantom data, animal studies, studies relying solely on publicly available datasets (excluded to ensure diverse patient cohorts), GTV obtained from modalities other than CT, PET, or MRI, studies focused on head and neck, brain, or pediatric patients, and conference papers/abstracts.
Evaluation markers and quality assessment: For each publication, the name of the first author, year of publication, modality, Dice similarity coefficient (DSC), Hausdorff distance (HD), cancer type, sample size, model, backbone, and number of specialized personnel delineating the GTV were extracted for further analysis. Risk of bias and reporting adherence were assessed using the Checklist for Artificial Intelligence in Medical Imaging (CLAIM) as well as the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis statement and checklist (TRIPOD) [5,6].
To prevent unfair comparisons in study outcomes due to differing selections of (secondary) outcome measures, only the best performance as measured by the DSC and HD for the validation cohort was assessed. The varying outcomes of the publications may be attributed to modifications of the hyperparameters or to the testing of different algorithms.
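For reference, the two pooled metrics have compact definitions; the sketch below (pure Python on voxel-coordinate sets, illustrative only and not any study's implementation) shows both:

```python
# Dice similarity coefficient (DSC) and symmetric Hausdorff distance (HD)
# for binary segmentation masks represented as sets of voxel coordinates.
import math

def dice(pred, truth):
    """DSC = 2|A∩B| / (|A|+|B|); 1.0 = perfect overlap, 0.0 = none."""
    if not pred and not truth:
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))

def hausdorff(pred, truth):
    """Largest distance from a point in one mask to its nearest
    neighbour in the other mask, taken in both directions."""
    def directed(a, b):
        return max(min(math.dist(p, q) for q in b) for p in a)
    return max(directed(pred, truth), directed(truth, pred))

pred  = {(0, 0), (0, 1), (1, 0)}
truth = {(0, 0), (0, 1), (1, 1)}
print(dice(pred, truth))       # two of three voxels overlap
print(hausdorff(pred, truth))  # worst-case boundary mismatch of 1 voxel
```

Production pipelines typically compute these on dense 3D arrays (often with a 95th-percentile HD to damp outliers), but the definitions are the same.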

3. Results

In brief, 2430 records were initially identified. Of these, 1063 were duplicates and hence removed. The remaining 1367 abstracts were screened, and 1153 did not fulfill the inclusion criteria. A full-text review was performed for 214 records, and 166 of these were then excluded based on the topological location of the tumor (e.g., brain tumors), study outcomes and/or design beyond the scope of this review, year of publication preceding 2017, or presence of duplicates. As a result, a total of 48 publications were included in this review. A flow diagram showing a schematic overview of the steps followed in this study in accordance with the PRISMA guidelines is shown in Figure 1 [4]. The publications reported in the Results section were reviewed by affected topological district, listed in Table 1, Table 2, Table 3 and Table 4, and split by location and tumor type.

3.1. Thorax

An overview of the studies addressing GTV delineation in the thoracic cavity included in this review (n = 15) is provided in Table 1 [7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]. All studies focused on lung tumors; however, only eight specified the histological subtype, which was non-small cell lung cancer (NSCLC) in all cases [7,8,10,12,14,17,20,21]. Yu X. et al. narrowed the study scope down to a specific stage (NSCLC stage III), and Kunkyab T. et al. described the baseline characteristics, including tumor type and location [7,17]. There were no studies available regarding the delineation of small cell lung cancer, mesothelioma, or mediastinal tumors (thymoma, thymic carcinoma, lymphoma, germ cell and neurogenic tumors, or primary cardiac tumors).
Kunkyab T. et al. achieved the highest overall DSC and one of the lowest HD values (DSC = 0.92, HD = 1.33). Their study utilized CT scans from two publicly available datasets as well as their internal clinical database (BC Cancer Kelowna), comprising 676 cases (train: 563, test: 113) [7]. The dataset covered adenocarcinoma, squamous cell carcinoma, NSCLC (not specified), and large cell carcinoma. The ML architecture employed is based on a dual-branch encoder: a deep 3D convolutional neural network (CNN) extracts semantic features from low-resolution data, while a shallow 3D CNN preserves positional details from high-resolution inputs. A multi-scale feature pyramid network then enhances feature extraction. A transformer module integrates deformable self-attention for efficient region focus, feed-forward networks for feature transformation, and skip connections to mitigate vanishing gradients. The decoder reconstructs the segmentation map using CNN-based up-sampling, residual blocks, and encoder–decoder skip connections to retain fine-grained details. Additionally, the authors report inference time (Co-ReTr = 12.03 s) and segmentation performance across multiple tumors (DSC = 0.89, HD = 1.66).
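The dual-branch idea can be illustrated at the shape level; the toy NumPy sketch below is an assumption-laden caricature (not the Co-ReTr code): a "deep" branch operates on a downsampled volume to capture semantics, a "shallow" branch keeps full-resolution positional detail, and the two are fused in a skip-connection style before decoding:

```python
# Shape-level sketch of a dual-branch multi-resolution encoder.
import numpy as np

def downsample2x(vol):
    """2x average pooling along each axis of a 3D volume."""
    x, y, z = (s // 2 for s in vol.shape)
    return vol[:2 * x, :2 * y, :2 * z].reshape(x, 2, y, 2, z, 2).mean(axis=(1, 3, 5))

def upsample2x(vol):
    """Nearest-neighbour 2x upsampling (mirror of the pooling above)."""
    return vol.repeat(2, 0).repeat(2, 1).repeat(2, 2)

def dual_branch_features(ct):
    deep = downsample2x(downsample2x(ct))            # low-res, 'semantic' branch
    shallow = ct                                     # full-res, positional branch
    return shallow + upsample2x(upsample2x(deep))    # skip-style fusion

ct = np.random.rand(16, 16, 16)
assert dual_branch_features(ct).shape == ct.shape
```

In the real model, each branch is a learned CNN and the fusion, attention, and decoding stages carry trainable weights; the sketch only shows why the two resolutions can be recombined without losing the full-resolution grid.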
In line with clinical standards for the assessment of tumor size [22], CT was the preferred imaging modality for GTV delineation in the thoracic cavity across all the publications analyzed, with over 73% of studies relying solely on CT [7,9,10,11,12,13,16,17,18,19,21] and approximately 27% employing it in conjunction with PET [8,14,15,20]. The mean sample size across all thorax publications was N = 216 (range: 9–871).
All 15 publications openly disclosed their data sources, suggesting a commitment to transparency. The data split method was reported in over half of the publications (60%). Five papers reported using an external testing set [7,10,11,19,21]. Notably, 93% of the publications provided information regarding the staff involved and their educational qualifications in the delineation of the ground truth [7,8,9,10,11,12,13,14,15,16,17,18,19,21].
Compared to other topographical areas, the thorax subset had the lowest adherence to the CLAIM and TRIPOD guidelines, with an average of 75.6% for CLAIM and 70.0% for TRIPOD, resulting in an overall adherence of 77.3%. Figure 2 summarizes the assessment of TRIPOD and CLAIM adherence. These results indicate that studies focusing on the thorax tend to have lower compliance with the reporting standards, signaling room for improvement in both TRIPOD and CLAIM adherence. The mean DSC outcome of the examined publications was 0.78 with an interquartile range (IQR) of 0.10 (range: 0.66–0.92), and the derived HD measurements had a mean of 10.5 mm and an IQR of 8.79 mm (range: 1.33–28.23 mm). Figure 3 and Figure 4 summarize the DSC and HD outcomes.

3.2. Esophagus

An overview of the studies addressing GTV delineation in the esophagus (n = 11) is provided in Table 2 [23,24,25,26,27,28,29,30,31,32,33]. All publications focused on esophageal cancer; however, only five specified the characteristics of the dataset in greater detail [23,24,26,27,33]. Six of the papers utilized both PET and CT [25,26,27,29,31,32], whereas the remaining five relied solely on CT [23,24,28,30,33].
Zhang S. et al. achieved the highest DSC (0.869 ± 0.006) and lowest HD (3.51 ± 0.74 mm) in a multi-cohort study with 580 patients [23]. The study provided detailed patient characteristics, including staging, tumor location, volume in cm3, and length in mm. The ground truth, generated based on consensus from two board-certified radiation oncologists and eight board-certified radiologists across four hospitals, was tested against an AI model using the 3D nnU-Net architecture.
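Multi-reader ground truths of this kind are often reduced to a voxel-wise vote; the toy sketch below shows a simple majority vote (an illustrative stand-in, not the consensus protocol actually used by Zhang S. et al.) on masks represented as voxel sets:

```python
# Majority-vote consensus over reader segmentations. Each mask is a set
# of voxel coordinates; a voxel enters the consensus GTV when at least
# `threshold` readers included it (default: strict majority).
from collections import Counter

def majority_consensus(masks, threshold=None):
    if threshold is None:
        threshold = len(masks) // 2 + 1
    votes = Counter(v for mask in masks for v in mask)
    return {v for v, n in votes.items() if n >= threshold}

readers = [
    {(0, 0), (0, 1), (1, 1)},   # reader 1
    {(0, 0), (0, 1)},           # reader 2
    {(0, 0), (1, 1), (2, 2)},   # reader 3
]
consensus = majority_consensus(readers)  # voxels chosen by >= 2 of 3 readers
```

More sophisticated schemes (e.g., probabilistic label fusion such as STAPLE) weight readers by estimated reliability rather than counting votes equally.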
Approximately 72% of the publications explicitly documented the origin of their data. Three publications declared using both internal and external validation datasets and notably also had the largest datasets [23,26,28]. Cross-validation was implemented in roughly 60% of the papers [25,26,27,29,32]. Nine out of the eleven papers disclosed the personnel responsible for delineating the ground truth data [23,24,25,26,27,29,31,32,33].
The esophagus subset showed higher adherence, averaging 82.8% for CLAIM and 83.6% for TRIPOD, leading to an overall adherence of 84.2% (Figure 2). This suggests a high level of compliance with both reporting guidelines, indicating that studies related to the esophagus generally meet transparency and accuracy standards. The mean sample size across all papers was N = 238 (range: 49–606). The mean DSC outcome of the examined publications was 0.78 with an IQR of 0.09 (range: 0.72–0.86), and the derived HD measurements had a mean of 15.70 mm and an IQR of 8.58 mm (range: 4.60–47.10 mm). Figure 3 and Figure 4 summarize the DSC and HD outcomes.
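Pooled summaries like these can be recomputed directly from the tables; the sketch below redoes the esophagus DSC summary from the best reported DSC values in Table 2 (small differences from the reported figures can arise from rounding and from the quantile convention used for the IQR):

```python
# Recompute mean, IQR, and range of the best reported DSC values from
# Table 2 (esophagus subset). Python's statistics.quantiles defaults to
# the 'exclusive' method, so the IQR may deviate from the reported 0.09.
from statistics import mean, quantiles

dsc_table2 = [0.869, 0.86, 0.84, 0.83, 0.79, 0.79, 0.76, 0.73, 0.76, 0.72, 0.72]

q1, _, q3 = quantiles(dsc_table2, n=4)
summary = {
    "mean": round(mean(dsc_table2), 2),
    "iqr": round(q3 - q1, 2),
    "range": (min(dsc_table2), max(dsc_table2)),
}
print(summary)
```

The same helper applies unchanged to the DSC and HD columns of the other three tables.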
Table 2. Esophagus publications.

| Author (Year) | Modality | DSC/HD | Cancer Type (N) | Model | Backbone | Delineation Staff |
|---|---|---|---|---|---|---|
| Zhang S. et al. [23] (2024) | CT | 0.869 ± 0.006 / 3.51 ± 0.74 | Esophageal cancer (580) | 3D nn-U-Net | Skip connections between the encoder and decoder improved the segmentation | Four oncologists and eight radiologists |
| Jin L. et al. [24] (2022) | CT | 0.86 ± 0.12 / 13.38 ± 0.12 | Esophageal cancer (215) | 3D VUMix-Net | 3D V-Net for localization, 2D U-Net for segmentation | One radiation oncologist |
| Yue Y. et al. [25] (2022) | CT, PET | 0.84 ± 0.009 / 4.60 ± 0.99 | Esophageal cancer (164) | GloD-LoATU-Net | ConV-Transformer with GloDAT and LoAT blocks | Two nuclear clinicians, one chief oncologist |
| Ye X. et al. [26] (2022) | CT, PET | 0.83 / 9.50 | Esophageal cancer (606) | Two-Stream 3D PSNN | 3D Progressive Semantically Nested Network | Two expert healthcare professionals |
| Jin D. et al. [27] (2021) | CT, PET | 0.79 ± 0.09 / 39.30 ± 56.5 | Esophageal cancer (148) | Two-Stream 3D PSNN | 3D Progressive Semantically Nested Network | Two experienced radiation oncologists |
| Youssefi S. et al. [28] (2021) | CT | 0.79 ± 0.20 / 14.7 ± 25.0 | Esophageal cancer (288) | DDAU-Net | Dilated Dense Attention U-Net | N/A |
| Jin D. et al. [29] (2019) | CT, PET | 0.76 ± 0.13 / 47.10 ± 56.0 | Esophageal cancer (110) | Two-Stream 3D PSNN | 3D Progressive Semantically Nested Network | Two experienced radiation oncologists |
| Yousefi S. et al. [30] (2018) | CT | 0.73 ± 0.20 / N/A | Esophageal cancer (49) | 3D Dense U-NET | 3D U-NET network with dense blocks | N/A |
| Yue Y. et al. [31] (2024) | CT, PET | 0.76 ± 0.13 / 9.38 ± 8.76 | Esophageal cancer (164) | TransAttPSNN | Two-stream Attention Progressive Semantically-Nested Network | Two nuclear medicine physicians |
| Yue Y. et al. [32] (2022) | CT, PET | 0.72 ± 0.02 / 11.87 ± 4.20 | Esophageal cancer (166) | Two-Stream 3D PSNN | 3D Progressive Semantically Nested Network | Two experienced nuclear medicine physicians |
| Lou X. et al. [33] (2024) | CT | 0.72 ± 19.18 / 3.98 ± 3.01 | Esophageal cancer (124) | Modified U-Net architecture | Enhanced attention and frequency-aware U-Net variant optimized for advanced feature extraction and fusion | Three radiation oncologists |

Abbreviations: Progressive Semantically Nested Network (PSNN), Global and Local Attention U-Net (GloD-LoATU-Net), Not Available (N/A).

3.3. Abdomen/Pelvis

An overview of the studies addressing GTV delineation within the abdominal and pelvic region included in this review (n = 17) is provided in Table 3 [34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50]. One publication focused on pancreatic cancer [45], one on hepatocellular carcinoma [36], four on colon carcinoma limited to the rectal/anal region [34,35,38,43], five on prostate cancer [39,40,41,42,47], and six on cervical cancer [37,44,46,48,49,50]. There were no recent publications focusing on GTV delineation in tumors of the upper gastrointestinal tract, ovaries, or the urinary tract and bladder.
The mean sample size across the abdomen/pelvis subset was N = 97 (range: 15–209). The primary imaging modality in most publications was MRI alone (53%) [34,35,41,43,44,45,46,49,50], followed by PET-CT (30%) [38,39,40,42,47]. Three papers used CT alone for GTV delineation (17%) [36,37,48].
Geng J. et al. published two top-performing papers using a DpnU-Net-based deep learning framework for GTV segmentation, both achieving a DSC of 0.87 ± 0.07 [34,35]. The HD values were 4.07 ± 1.67 mm (August) and 5.79 ± 3.00 mm (September). In the August study (141 patients), two separate DpnU-Net models were trained, using MRI for GTV segmentation and CT for clinical target volume (CTV) segmentation, respectively. The September study (88 patients) expanded this by using both MRI and CT in a three-stage process: Stage 1 predicted the GTV-MRI from MRI, Stage 2 generated an intermediate GTV-CT from CT, and Stage 3 refined the contours by registering the MRI-derived GTV with CT, highlighting the framework's ability to handle multimodal segmentation tasks effectively.
All the publications disclosed their data sources, demonstrating transparency in data acquisition. Apart from Ghezzo et al. [47], all papers reported on their data split, highlighting a commitment to methodological transparency. Seven of the publications reported using an external dataset, of which only one was published prior to 2023 (Kostyszyn D. et al.) [37,39,40,41,42,46,47]. Most of the publications (88%) reported details about the delineation staff, including the number of staff members and their educational qualifications [34,35,36,37,38,40,41,42,44,45,46,47,48,49,50].
The abdomen subset demonstrated the highest overall adherence, with an average of 81.7% for CLAIM and 86.5% for TRIPOD, resulting in an overall adherence of 84.1% (Figure 2). This highlights consistent reporting practices in studies focused on the abdominal region. The mean DSC outcome of the examined publications was 0.77 with an IQR of 0.11 (range: 0.62–0.87), and the derived HD measurements had a mean of 7.53 mm and an IQR of 3.47 mm (range: 2.77–20.44 mm). Figure 3 and Figure 4 summarize the DSC and HD outcomes.
Table 3. Abdomen/pelvis publications [43].

| Author (Year) | Modality | DSC/HD | Cancer Type (N) | Model | Architecture | Delineation Staff |
|---|---|---|---|---|---|---|
| Geng J. et al. [34] (2023, August) | MRI | 0.87 ± 0.07 / 4.07 ± 1.67 | Rectal cancer (141) | DpnU-Net | U-Net with dual-path-network modules (DPN92) | Eight radiation oncologists |
| Geng J. et al. [35] (2023, September) | MRI | 0.87 ± 0.07 / 5.79 ± 3.00 | Rectal cancer (88) | DpnU-Net | U-Net with dual-path-network modules (DPN92) | Two oncologists |
| Yang Z. et al. [36] (2023) | (4D)-CT | 0.86 ± 0.08 / 5.14 ± 3.34 | Hepatocellular carcinoma (26) | Spatial-temporal dual-path U-Net | Dual-path network with spatial-temporal features and a feature fusion module | Radiation oncologist |
| Peng H. et al. [37] (2024) | CT | 0.84 ± 0.07 / 6.58 ± 5.97 | Cervical cancer (71) | MDSSL 3D-U-Net | Multi-decoder and semi-supervised learning (MDSSL) | Radiation oncologists |
| Groendahl A. et al. [38] (2022) | CT, PET | 0.83 ± 0.08 / 7.07 ± 4.43 | Anal squamous cell carcinoma (36) | 2D U-NET | U-NET | One oncologist, one radiologist |
| Kostyszyn D. et al. [39] (2020) | CT, PET | 0.83 / 4.12 | Prostate cancer (209) | 3D U-NET | U-NET | N/A |
| Holzschuh J.C. et al. [40] (2023) | CT, PET | 0.82 ± 0.07 / 3.30 ± 1.96 | Prostate cancer (52) | 3D-U-Net | 3D-U-Net with decoder and encoder consisting of 3 layers | Two readers (radiation oncology, radiology, or nuclear medicine) |
| Rajendrang P. et al. [41] (2024) | MRI | 0.81 ± 0.10 / 9.86 ± 9.77 | Prostate cancer (133) | Medformer (w. LAVE) | Dual-channel 3D Swin Transformer backbone with visual-language attention and a CNN-based decoder | Radiation oncologist and professional trainee |
| Holzschuh J.C. et al. [42] (2024) | CT, PET | 0.76 / 1.73 | Prostate cancer (161) | nn-U-Net | Dynamic configuration based on input, without a fixed backbone | Two radiation oncologists |
| Wang J. et al. [43] (2018) | MRI | 0.74 ± 0.14 / 20.44 ± 13.35 | Rectal cancer (93) | 2D U-NET | U-NET | N/A |
| Outeiral R. et al. [44] (2023) | MRI | 0.73 / 6.80 | Cervical cancer (195) | 3D nn-U-NET | nn-U-NET | One radiation oncologist |
| Liang Y. et al. [45] (2020) | MRI | 0.73 ± 0.09 / 8.11 ± 4.09 | Pancreas cancer (56) | Square-window-based convolutional neural network | Custom CNN | One oncologist, one radiologist |
| Rouhi R. et al. [46] (2024) | MRI | 0.72 ± 0.16 / 14.6 ± 9.0 | Cervical cancer (166) | SegResNet | Asymmetrically larger encoder using ResNet blocks, strided convolutions, and a decoder with skip connections | Two radiation oncologists |
| Ghezzo S. et al. [47] (2023) | CT, PET | 0.71 ± 0.19 / N/A | Prostate cancer (85) | 3D U-NET (Kostyszyn D. et al. [39]) | U-NET | Two nuclear medicine physicians |
| Chang JH. et al. [48] (2021) | CT | 0.71 / N/A | Cervical cancer (51) | 3D U-NET + Long Short-Term Memory | 3D U-NET + Long Short-Term Memory | One radiation oncologist |
| Breto A. et al. [49] (2022) | MRI | 0.67 / 2.77 ± 1.73 | Cervical cancer (15) | Mask R-CNN | Faster R-CNN (ImageNet) + segmentation | One radiation oncologist |
| Yoganathan S. et al. [50] (2022) | MRI | 0.62 ± 0.14 / 6.83 ± 2.89 | Cervical cancer (71) | 2.5D DeepLabv3+ | ResNet50, InceptionResNetv2 | One radiation oncologist |

Abbreviations: Not Available (N/A), Neural Network U-Net (nn-U-NET), Convolutional Neural Network (CNN), Region-Based Convolutional Neural Network (R-CNN), Residual Networks 50 (ResNet50), Inception-Residual Network v2 (InceptionResNetV2).

3.4. Soft Tissue and Bone

Four papers aiming to delineate the GTV in soft tissue or bone were included, as listed in Table 4 [51,52,53,54]. The DSC and HD overview is reported in Figure 3 and Figure 4. These publications focus on sarcomas (including soft tissue sarcomas, bone sarcomas, and chordomas) and on oligo-metastases from primary NSCLC. Two papers used CT alone as the preferred imaging modality, for soft tissue and bone sarcoma and for sacral chordoma [52,53]. Of the remaining two papers, one used CT in combination with PET for NSCLC bone metastases [54], and one relied solely on MRI for soft tissue sarcoma [51].
Peeken J. C. et al. achieved the best segmentation performance for both DSC and HD using an MRI-based model (DSC = 0.88 ± 0.04, HD = 12.0 ± 4.3), with the largest cohort (244 patients; 157 training and 87 independent external test) [51]. The authors provided detailed cohort statistics, including TNM classification, grading, and AJCC staging. They proposed a modified 3D U-Net featuring an encoder–decoder structure with multi-head self-attention at the bottleneck to improve spatial awareness beyond convolutional limitations. The final architecture was validated on an independent external dataset (n = 87).
All papers disclosed their data sources and delineation references, with a mean sample size of 92 (range: 9–244). The mean DSC outcome of the examined publications in the soft tissue and bone subset was 0.80 with an IQR of 0.07 (range: 0.63–0.88), and the derived HD measurements had a mean of 14.20 mm and an IQR of 2.21 mm (range: 12.0–16.43 mm). The subset exhibited an adherence of 77.25% to CLAIM and 86.25% to TRIPOD, yielding an overall adherence of 81.75% (Figure 2). This disparity suggests variability in how well studies related to soft tissue or bone align with these reporting guidelines, with stronger adherence to TRIPOD.
Table 4. Soft tissue and bone publications.

| Author (Year) | Modality | DSC/HD | Cancer Type (N) | Model | Architecture | Delineation Staff |
|---|---|---|---|---|---|---|
| Peeken JC. et al. [51] (2024) | MRI | 0.88 ± 0.04 / 12.0 ± 4.3 | Soft tissue sarcoma (244) | DLBAS 3D-U-Net | 3D U-Net with squeeze-and-excitation blocks, residual blocks, and multi-head self-attention | Two radiation oncologists |
| Marin T. et al. [52] (2021) | CT | 0.86 ± 0.05 / 16.43 ± 13.26 | Soft tissue and bone sarcoma (68) | 2.5D U-NET | U-NET | Four radiation oncologists or radiologists |
| Boussioux L. et al. [53] (2024) | CT | 0.85 ± 6.4 / N/A | Sacral chordoma (48) | Residual 3D U-Net | Optimal ensemble of residual 3D U-Nets | One radiologist |
| Nigam R. et al. [54] (2023) | CT, PET | 0.63 ± 0.12 / N/A | NSCLC bone metastasis (9) | Auto-segmentation on SUV thresholding | Custom PET/CT segmentation pipeline | One radiation oncologist |

Abbreviations: Not Available (N/A).

4. Discussion

This systematic review aimed to outline and critically appraise the recent literature on AI-based automatic delineation of the gross tumor volume in radiological images of tumors of the thoracic, abdominal, and pelvic cavities as well as soft tissue and bone. Four main findings were highlighted. Firstly, the diagnostic performance from the use of artificial intelligence ranged from 0.62 to 0.92 in Dice similarity coefficient (DSC) and from 1.33 to 47.10 mm in Hausdorff distance (HD). Secondly, the highest DSC in each of the four topological areas of interest ranged between 0.86 and 0.92 [7,23,34,51]. Thirdly, the most used architecture across all publications was the U-NET (56.25%), which was also used by three of the top-performing algorithms [23,34,51]. Fourthly, the lowest Hausdorff distance and highest Dice similarity coefficient observed in three of the subsets (esophagus; abdomen and pelvis; soft tissue and bone) were also obtained using a U-NET [23,34,51].
For three of the top-performing proposed methods, in the thorax, esophagus, and abdomen subsets, time measurements were provided [7,23,34]. Kunkyab et al. and Geng et al. both obtained their automated GTV delineation after 12 s, compared to a reported manual annotation time of 10–15 min [7,34]. Zhang S. et al. showcased significant improvement in delineation for 6/12 radiologists and reduced the inter- and intra-reader variability by 37.4% and 55.2%, respectively. In addition, the authors reported reducing the annotation time by 77.6% (9.73 to 2.18 min) [23].
The best performance with regard to DSC and HD was obtained by Kunkyab T. et al. (DSC = 0.92, HD = 1.33), utilizing a CNN with multi-resolution input and a transformer module [7]. This method shares some similarities with a U-NET in having an encoder–decoder architecture, using skip connections to preserve spatial information, and conducting multi-scale feature extraction. Kunkyab T. et al.'s proposed method, however, differs in using a parallel dual branch (deep and shallow) to process different image resolutions. They further integrate a deformable self-attention transformer to enhance region-specific focus, deploy a multi-scale feature pyramid network to leverage hierarchical features, and integrate residual blocks in the decoder to refine the segmentation details.

4.1. CLAIM and TRIPOD Assessment

Adherence to the CLAIM and TRIPOD guidelines varied across topological areas. These differences might partly reflect the number of studies in each category. The average adherence for TRIPOD (83.83%) was higher than that for CLAIM (79.35%) (normally distributed data, t-test, p-value = 0.008). For the esophagus, abdomen, and soft tissue and bone subsets, the overall average adherence to both assessments was above 80% (83.2%, 84.1%, and 81.8%, respectively). The thorax subset differed with an average adherence of 77.3%; this difference across subsets was, however, not significant (Kruskal–Wallis chi-squared = 3, degrees of freedom = 3, p-value = 0.39).

4.2. Clinical Relevance

The diverse array of tumor delineation methodologies across abdominal and thoracic studies prompts a crucial exploration of their clinical relevance. The variability in imaging modalities and algorithmic strategies signifies the adaptability of these methodologies to the heterogeneous clinical landscape.
There are few to no publications regarding GTV segmentation of tumors of the mediastinal cavity, upper GI tract, and abdominal cavity (e.g., gastric or ovarian tumors), which we interpret in light of the segmentation difficulties that these tumors pose given their irregular and infiltrative nature as well as their association with vital cardiac and vascular structures. This, coupled with the variability of manual GTV segmentations in such anatomical districts, should prompt a discussion toward clearer and more standardized guidelines for GTV delineation. To this extent, it might be speculated that the lack of clear clinical standards makes the development of reliable algorithms in this space difficult, if not impossible.
The flexibility of GTV delineation instructions is also reflected in the imaging modalities used for algorithmic segmentation (CT, MRI, PET). It must be appreciated that no one-size-fits-all solution is possible, owing to the diverse biological nature, and hence radiological appearance, of such tumors, which may also differ depending on the imaging modality, as well as to logistical factors such as differing imaging availability across centers. Nevertheless, more research on the best imaging modality or modalities for automatic GTV delineation for each tumor type is required before translating this technology to real-life clinical scenarios.

4.3. Methodological Considerations

This systematic review has several strengths contributing to its robustness. All selected papers leveraged data from diverse imaging modalities such as CT, PET, and MRI, ensuring a comprehensive exploration of tumor delineation methods. Notably, there were no restrictions on patient numbers or algorithmic architectures, fostering inclusivity and enhancing the generalizability of findings. Assessment of bias and reporting adherence using the TRIPOD and CLAIM checklists was conducted for every paper, ensuring transparency and reliability. The review's inclusion criteria, encompassing all cancer types treatable with radiotherapy and requiring GTV delineation, guarantee relevance across varied clinical scenarios.
Although the exclusion of non-peer-reviewed papers may narrow the results reported in this review, it guarantees a rigorous revision of the methods and ethical standards of the reported results. Additionally, our evaluation focused mainly on DSC and HD, which are arguably the most used metrics in the field, making studies comparable. Nevertheless, it is possible that other interesting performance metrics were overlooked. Variability in study design introduces heterogeneity, complicating direct comparisons. Diverse algorithmic parameters and the inclusion of different cancer types and imaging modalities may challenge the identification of optimal approaches. The absence of standardization across studies poses challenges in synthesizing cohesive conclusions. Incomplete reporting in some studies adds complexity, potentially influencing the review’s overall robustness.

4.4. Future Directions for Research

An in-depth exploration of the interpretability of the algorithms used in tumor delineation is crucial for gaining clinician trust and facilitating clinical adoption. Future research should therefore address the interpretability of algorithmic outputs, which is seldom assessed in validation studies [55]. Additionally, a more extensive exploration of algorithm robustness across diverse patient populations, including variations in age, sex, comorbidities, and scanning protocols, as well as other possible sources of bias, is essential for ensuring equitable and effective clinical applications. Exploring the integration of multimodal imaging data, combining information from CT, MRI, and PET scans, is a promising avenue: investigating how algorithms can effectively leverage complementary data from various modalities may enhance the accuracy and robustness of tumor delineation, particularly in complex cases involving multiple anatomical structures.
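Two common strategies for such multimodal integration can be sketched schematically: early (input-level) fusion, where co-registered volumes are stacked as input channels of a single encoder–decoder network, and late (decision-level) fusion, where per-modality probability maps are combined. The synthetic arrays, shapes, and 0.5 threshold below are illustrative assumptions, not taken from any reviewed study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for co-registered, intensity-normalized volumes
# of one patient (shapes are illustrative assumptions).
ct = rng.random((64, 64, 32), dtype=np.float32)   # stand-in for normalized CT
pet = rng.random((64, 64, 32), dtype=np.float32)  # stand-in for SUV-normalized PET

# Early (input-level) fusion: stack modalities as channels and feed the
# multi-channel volume to a single encoder-decoder network.
early_input = np.stack([ct, pet], axis=0)  # shape (2, 64, 64, 32)

# Late (decision-level) fusion: average per-modality probability maps
# produced by separate single-modality models, then threshold.
prob_ct = rng.random((64, 64, 32))
prob_pet = rng.random((64, 64, 32))
fused_prob = (prob_ct + prob_pet) / 2.0
gtv_mask = fused_prob > 0.5  # boolean candidate GTV mask
```

Early fusion requires accurate spatial registration between modalities, whereas late fusion tolerates separately trained single-modality models; which strategy is preferable per tumor type remains an open question in the reviewed literature.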
Future research could focus on the development and validation of standardized protocols for tumor delineation across different anatomical regions and possibly different tumor types. Such consensus guidelines for standardized imaging acquisition, processing, and algorithmic implementation could enhance comparability and reproducibility across studies and ensure consistent and reliable performance across diverse clinical scenarios. Furthermore, investigating the feasibility and accuracy of real-time tumor delineation during imaging procedures could have significant implications for adaptive radiotherapy and interventions, where timely and precise delineation is critical.

Author Contributions

Conceptualization, L.M.P., J.P., C.A.L., S.D. and M.B.N.; methodology, L.M.P., J.P., C.A.L., J.F.C., S.D., M.B.N. and S.I.; formal analysis, L.M.P. and S.I.; investigation, L.M.P. and N.S.P.; resources, S.D. and M.B.N.; data curation, L.M.P.; writing—original draft preparation, L.M.P. and S.I.; writing—review and editing, L.M.P., J.P., C.A.L., J.F.C., S.D., M.B.N. and S.I.; visualization, L.M.P.; supervision, J.P., C.A.L., J.F.C., S.D., M.B.N. and S.I.; project administration, L.M.P.; funding acquisition, M.B.N. and S.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Innovation Fund Denmark (IFD) with grant no. 0176-00013B for the AI4Xray project.

Institutional Review Board Statement

Institutional review board approval of this study was deemed unnecessary as the data utilized in this study was retrospectively sourced exclusively from publicly available databases, and no direct involvement or handling of human subjects occurred.

Informed Consent Statement

Patient consent was waived because the data utilized in this study were retrospectively sourced exclusively from publicly available databases, and no direct involvement or handling of human subjects occurred.

Data Availability Statement

Not applicable.

Conflicts of Interest

Author Silvia Ingala (S.I.) is employed by the company Cerebriu A/S. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Burnet, N.G.; Thomas, S.J.; Burton, K.E.; Jefferies, S.J. Defining the tumour and target volumes for radiotherapy. Cancer Imaging 2004, 4, 153–161. [Google Scholar] [CrossRef]
  2. Jaffray, D.; Gospodarowicz, M. Disease Control Priorities Third Edition; World Bank Group: Washington, DC, USA, 2014; Volume 3. [Google Scholar] [CrossRef]
  3. Cahan, E.M.; Hernandez-Boussard, T.; Thadaney-Israni, S.; Rubin, D.L. Putting the data before the algorithm in big data addressing personalized healthcare. NPJ Digit. Med. 2019, 2, 78. [Google Scholar] [CrossRef] [PubMed]
  4. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71. [Google Scholar] [CrossRef] [PubMed]
  5. Mongan, J.; Moy, L.; Kahn, C.E. Checklist for Artificial Intelligence in Medical Imaging (CLAIM): A Guide for Authors and Reviewers. Radiol. Artif. Intell. 2020, 2, e200029. [Google Scholar] [CrossRef] [PubMed]
  6. Collins, G.S.; Reitsma, J.B.; Altman, D.G.; Moons, K.G.M. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): The TRIPOD Statement. BMC Med. 2015, 13, 1. [Google Scholar] [CrossRef]
  7. Kunkyab, T.; Bahrami, Z.; Zhang, H.; Liu, Z.; Hyde, D. A deep learning-based framework (Co-ReTr) for auto-segmentation of non-small cell-lung cancer in computed tomography images. J. Appl. Clin. Med. Phys. 2024, 25, e14297. [Google Scholar] [CrossRef]
  8. Wang, S.; Mahon, R.; Weiss, E.; Jan, N.; Taylor, R.J.; McDonagh, P.R.; Quinn, B.; Yuan, L. Automated Lung Cancer Segmentation Using a PET and CT Dual-Modality Deep Learning Neural Network. Int. J. Radiat. Oncol. Biol. Phys. 2023, 115, 529–539. [Google Scholar] [CrossRef]
  9. Xie, H.; Chen, Z.; Deng, J.; Zhang, J.; Duan, H.; Li, Q. Automatic segmentation of the gross target volume in radiotherapy for lung cancer using transresSEUnet 2.5D Network. J. Transl. Med. 2022, 20, 524. [Google Scholar] [CrossRef]
  10. Cui, Y.; Arimura, H.; Nakano, R.; Yoshitake, T.; Shioyama, Y.; Yabuuchi, H. Automated approach for segmenting gross tumor volumes for lung cancer stereotactic body radiation therapy using CT-based dense V-networks. J. Radiat. Res. 2021, 62, 346–355. [Google Scholar] [CrossRef]
  11. Zhang, G.; Yang, Z.; Jiang, S. Automatic lung tumor segmentation from CT images using improved 3D densely connected UNet. Med. Biol. Eng. Comput. 2022, 60, 3311–3323. [Google Scholar] [CrossRef]
  12. Talebi, P.; Saeedzadeh, E.; Bakhshandeh, M.; Arabi, H. A Novel Attention-based Neural Network for Automated Lung Lesion Delineation from 4DCT Images. In Proceedings of the 2022 IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), Milano, Italy, 5–12 November 2022. [Google Scholar] [CrossRef]
  13. Skett, S.; Patel, T.; Duprez, D.; Gupta, S.; Netherton, T.; Trauernicht, C.; Aldridge, S.; Eaton, D.; Cardenas, C.; Court, L.E.; et al. Autocontouring of primary lung lesions and nodal disease for radiotherapy based only on computed tomography images. Phys. Imaging Radiat. Oncol. 2024, 31, 100637. [Google Scholar] [CrossRef] [PubMed]
  14. Kawata, Y.; Arimura, H.; Ikushima, K.; Jin, Z.; Morita, K.; Tokunaga, C.; Yabu-Uchi, H.; Shioyama, Y.; Sasaki, T.; Honda, H.; et al. Impact of pixel-based machine-learning techniques on automated frameworks for delineation of gross tumor volume regions for stereotactic body radiation therapy. Phys. Med. 2017, 42, 141–149. [Google Scholar] [CrossRef] [PubMed]
  15. Ikushima, K.; Arimura, H.; Jin, Z.; Yabu-Uchi, H.; Kuwazuru, J.; Shioyama, Y.; Sasaki, T.; Honda, H.; Sasaki, M. Computer-assisted framework for machine-learning-based delineation of GTV regions on datasets of planning CT and PET/CT images. J. Radiat. Res. 2017, 58, 123–134. [Google Scholar] [CrossRef] [PubMed]
  16. Ma, Y.; Mao, J.; Liu, X.; Dai, Z.; Zhang, H.; Zhang, X.; Li, Q. Deep learning-based internal gross target volume definition in 4D CT images of lung cancer patients. Med. Phys. 2022, 50, 2303–2316. [Google Scholar] [CrossRef]
  17. Yu, X.; Jin, F.; Luo, H.L.; Lei, Q.; Wu, Y. Gross Tumor Volume Segmentation for Stage III NSCLC Radiotherapy Using 3D ResSE-Unet. Technol. Cancer Res. Treat. 2022, 21, 15330338221090847. [Google Scholar] [CrossRef]
  18. Gan, W.; Wang, H.; Gu, H.; Duan, Y.; Shao, Y.; Chen, H.; Feng, A.; Huang, Y.; Fu, X.; Ying, Y.; et al. Automatic segmentation of lung tumors on CT images based on a 2D & 3D hybrid convolutional neural network. Br. J. Radiol. 2021, 94, 20210038. [Google Scholar]
  19. Wong, J.; Huang, V.; Giambattista, J.A.; Teke, T.; Kolbeck, C.; Atrchian, S. Training and Validation of Deep Learning-Based Auto-Segmentation Models for Lung Stereotactic Ablative Radiotherapy Using Retrospective Radiotherapy Planning Contours. Front. Oncol. 2021, 11, 626499. [Google Scholar] [CrossRef]
  20. Thomas, H.M.T.; Devakumar, D.; Sasidharan, B.; Bowen, S.R.; Heck, D.K.; Samuel, E.J.J. Hybrid positron emission tomography segmentation of heterogeneous lung tumors using 3D Slicer: Improved GrowCut algorithm with threshold initialization. J. Med. Imaging 2017, 4, 011009. [Google Scholar] [CrossRef]
  21. Cheng, D.C.; Chi, J.H.; Yang, S.N.; Liu, S.H. Organ contouring for lung cancer patients with a seed generation scheme and random walks. Sensors 2020, 20, 4823. [Google Scholar] [CrossRef]
  22. Nishino, M.; Hatabu, H.; Johnson, B.E.; McLoud, T.C. State of the art: Response assessment in lung cancer in the era of genomic medicine. Radiology 2014, 271, 6–27. [Google Scholar] [CrossRef]
  23. Zhang, S.; Li, K.; Sun, Y.; Wan, Y.; Ao, Y.; Zhong, Y.; Liang, M.; Wang, L.; Chen, X.; Pei, X.; et al. Deep Learning for Automatic Gross Tumor Volumes Contouring in Esophageal Cancer Based on Contrast-Enhanced Computed Tomography Images: A Multi-Institutional Study. Int. J. Radiat. Oncol. 2024, 119, 1590–1600. [Google Scholar] [CrossRef]
  24. Jin, L.; Chen, Q.; Shi, A.; Wang, X.; Ren, R.; Zheng, A.; Song, P.; Zhang, Y.; Wang, N.; Wang, C.; et al. Deep Learning for Automated Contouring of Gross Tumor Volumes in Esophageal Cancer. Front. Oncol. 2022, 12, 892171. [Google Scholar] [CrossRef]
  25. Yue, Y.; Li, N.; Zhang, G.; Zhu, Z.; Liu, X.; Song, S.; Ta, D. Automatic segmentation of esophageal gross tumor volume in 18F-FDG PET/CT images via GloD-LoATUNet. Comput. Methods Programs Biomed. 2022, 229, 107266. [Google Scholar] [CrossRef] [PubMed]
  26. Ye, X.; Guo, D.; Tseng, C.-K.; Ge, J.; Hung, T.-M.; Pai, P.-C.; Ren, Y.; Zheng, L.; Zhu, X.; Peng, L.; et al. Multi-Institutional Validation of Two-Streamed Deep Learning Method for Automated Delineation of Esophageal Gross Tumor Volume Using Planning CT and FDG-PET/CT. Front. Oncol. 2022, 11, 785788. [Google Scholar] [CrossRef]
  27. Jin, D.; Guo, D.; Ho, T.-Y.; Harrison, A.P.; Xiao, J.; Tseng, C.-K.; Lu, L. DeepTarget: Gross tumor and clinical target volume segmentation in esophageal cancer radiotherapy. Med. Image Anal. 2021, 68, 101909. [Google Scholar] [CrossRef]
  28. Yousefi, S.; Sokooti, H.; Elmahdy, M.S.; Lips, I.M.; Shalmani, M.T.M.; Zinkstok, R.T.; Dankers, F.J.W.M.; Staring, M. Esophageal Tumor Segmentation in CT Images Using a Dilated Dense Attention Unet (DDAUnet). IEEE Access 2021, 9, 99235–99248. [Google Scholar] [CrossRef]
  29. Jin, D.; Guo, D.; Ho, T.-Y.; Harrison, A.P.; Xiao, J.; Tseng, C.-K.; Lu, L. Accurate Esophageal Gross Tumor Volume Segmentation in PET/CT using Two-Stream Chained 3D Deep Network Fusion. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2019, Shenzhen, China, 13–17 October 2019. [Google Scholar]
  30. Yousefi, S.; Sokooti, H.; Elmahdy, M.S.; Peters, F.P.; Shalmani, M.T.M.; Zinkstok, R.T.; Staring, M. Esophageal Gross Tumor Volume Segmentation Using a 3D Convolutional Neural Network. In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2018; Volume 11073, pp. 343–351. [Google Scholar] [CrossRef]
  31. Yue, Y.; Li, N.; Zhang, G.; Xing, W.; Zhu, Z.; Liu, X.; Song, S.; Ta, D. A transformer-guided cross-modality adaptive feature fusion framework for esophageal gross tumor volume segmentation. Comput. Methods Programs Biomed. 2024, 251, 108216. [Google Scholar] [CrossRef]
  32. Yue, Y.; Li, N.; Shahid, H.; Bi, D.; Liu, X.; Song, S.; Ta, D. Gross Tumor Volume Definition and Comparative Assessment for Esophageal Squamous Cell Carcinoma From 3D 18F-FDG PET/CT by Deep Learning-Based Method. Front. Oncol. 2022, 12, 799207. [Google Scholar] [CrossRef]
  33. Lou, X.; Zhu, J.; Yang, J.; Zhu, Y.; Shu, H.; Li, B. Enhanced Cross-stage-attention U-Net for esophageal target volume segmentation. BMC Med. Imaging 2024, 24, 339. [Google Scholar] [CrossRef]
  34. Geng, J.; Zhu, X.; Liu, Z.; Chen, Q.; Bai, L.; Wang, S.; Li, Y.; Wu, H.; Yue, H.; Du, Y. Towards deep-learning (DL) based fully automated target delineation for rectal cancer neoadjuvant radiotherapy using a divide-and-conquer strategy: A study with multicenter blind and randomized validation. Radiat. Oncol. 2023, 18, 164. [Google Scholar] [CrossRef]
  35. Geng, J.; Zhang, S.; Wang, R.; Bai, L.; Chen, Q.; Wang, S.; Zhu, X.; Liu, Z.; Yue, H.; Wu, H.; et al. Deep-learning based triple-stage framework for MRI-CT cross-modality gross tumor volume (GTV) segmentation for rectal cancer neoadjuvant radiotherapy. Biomed. Signal Process. Control 2023, 89, 105715. [Google Scholar] [CrossRef]
  36. Yang, Z.; Yang, X.; Cao, Y.; Shao, Q.; Tang, D.; Peng, Z.; Di, S.; Zhao, Y.; Li, S. Deep learning based automatic internal gross target volume delineation from 4D-CT of hepatocellular carcinoma patients. J. Appl. Clin. Med. Phys. 2023, 25, e14211. [Google Scholar] [CrossRef] [PubMed]
  37. Peng, H.; Liu, T.; Li, P.; Yang, F.; Luo, X.; Sun, X.; Gao, D.; Lin, F.; Jia, L.; Xu, N.; et al. Automatic delineation of cervical cancer target volumes in small samples based on multi-decoder and semi-supervised learning and clinical application. Sci. Rep. 2024, 14, 26937. [Google Scholar] [CrossRef] [PubMed]
  38. Groendahl, A.R.; Moe, Y.M.; Kaushal, C.K.; Huynh, B.N.; Rusten, E.; Tomic, O.; Hernes, E.; Hanekamp, B.; Undseth, C.; Guren, M.G.; et al. Deep learning-based automatic delineation of anal cancer gross tumour volume: A multimodality comparison of CT, PET and MRI. Acta Oncol. 2021, 61, 89–96. [Google Scholar] [CrossRef]
  39. Kostyszyn, D.; Fechter, T.; Bartl, N.; Grosu, A.L.; Gratzke, C.; Sigle, A.; Mix, M.; Ruf, J.; Fassbender, T.F.; Kiefer, S.; et al. Intraprostatic Tumor Segmentation on PSMA PET Images in Patients with Primary Prostate Cancer with a Convolutional Neural Network. J. Nucl. Med. 2021, 62, 823–828. [Google Scholar] [CrossRef]
  40. Holzschuh, J.C.; Mix, M.; Ruf, J.; Hölscher, T.; Kotzerke, J.; Vrachimis, A.; Doolan, P.; Ilhan, H.; Marinescu, I.M.; Spohn, S.K.; et al. Deep learning based automated delineation of the intraprostatic gross tumour volume in PSMA-PET for patients with primary prostate cancer. Radiother. Oncol. 2023, 188, 109774. [Google Scholar] [CrossRef]
  41. Rajendran, P.; Chen, Y.; Qiu, L.; Niedermayr, T.; Liu, W.; Buyyounouski, M.; Bagshaw, H.; Han, B.; Yang, Y.; Kovalchuk, N.; et al. Auto-delineation of treatment target volume for radiation therapy using large language model-aided multimodal learning. Int. J. Radiat. Oncol. 2024, 121, 230–240. [Google Scholar] [CrossRef]
  42. Holzschuh, J.C.; Mix, M.; Freitag, M.T.; Hölscher, T.; Braune, A.; Kotzerke, J.; Vrachimis, A.; Doolan, P.; Ilhan, H.; Marinescu, I.M.; et al. The impact of multicentric datasets for the automated tumor delineation in primary prostate cancer using convolutional neural networks on 18F-PSMA-1007 PET. Radiat. Oncol. 2024, 19, 106. [Google Scholar] [CrossRef]
  43. Wang, J.; Lu, J.; Qin, G.; Shen, L.; Sun, Y.; Ying, H.; Zhang, Z.; Hu, W. Technical Note: A deep learning-based autosegmentation of rectal tumors in MR images. Med. Phys. 2018, 45, 2560–2564. [Google Scholar] [CrossRef]
  44. Rodríguez Outeiral, R.; González, P.J.; Schaake, E.E.; van der Heide, U.A.; Simões, R. Deep learning for segmentation of the cervical cancer gross tumor volume on magnetic resonance imaging for brachytherapy. Radiat. Oncol. 2023, 18, 91. [Google Scholar] [CrossRef]
  45. Liang, Y.; Schott, D.; Zhang, Y.; Wang, Z.; Nasief, H.; Paulson, E.; Hall, W.; Knechtges, P.; Erickson, B.; Li, X.A. Auto-segmentation of pancreatic tumor in multi-parametric MRI using deep convolutional neural networks. Radiother. Oncol. 2020, 145, 193–200. [Google Scholar] [CrossRef] [PubMed]
  46. Rouhi, R.; Niyoteka, S.; Carré, A.; Achkar, S.; Laurent, P.-A.; Ba, M.B.; Veres, C.; Henry, T.; Vakalopoulou, M.; Sun, R.; et al. Automatic gross tumor volume segmentation with failure detection for safe implementation in locally advanced cervical cancer. Phys. Imaging Radiat. Oncol. 2024, 30, 100578. [Google Scholar] [CrossRef] [PubMed]
  47. Ghezzo, S.; Mongardi, S.; Bezzi, C.; Gajate, A.M.S.; Preza, E.; Gotuzzo, I.; Baldassi, F.; Jonghi-Lavarini, L.; Neri, I.; Russo, T.; et al. External validation of a convolutional neural network for the automatic segmentation of intraprostatic tumor lesions on 68Ga-PSMA PET images. Front. Med. 2023, 10, 1133269. [Google Scholar] [CrossRef] [PubMed]
  48. Chang, J.H.; Lin, K.H.; Wang, T.H.; Zhou, Y.K.; Chung, P.C. Image Segmentation in 3D Brachytherapy Using Convolutional LSTM. J. Med. Biol. Eng. 2021, 41, 636–651. [Google Scholar] [CrossRef]
  49. Breto, A.L.; Spieler, B.; Zavala-Romero, O.; Alhusseini, M.; Patel, N.V.; Asher, D.A.; Xu, I.R.; Baikovitz, J.B.; Mellon, E.A.; Ford, J.C.; et al. Deep Learning for Per-Fraction Automatic Segmentation of Gross Tumor Volume (GTV) and Organs at Risk (OARs) in Adaptive Radiotherapy of Cervical Cancer. Front. Oncol. 2022, 12, 854349. [Google Scholar] [CrossRef]
  50. Yoganathan, S.; Paul, S.N.; Paloor, S.; Torfeh, T.; Chandramouli, S.H.; Hammoud, R.; Al-Hammadi, N. Automatic segmentation of magnetic resonance images for high-dose-rate cervical cancer brachytherapy using deep learning. Med. Phys. 2022, 49, 1571–1584. [Google Scholar] [CrossRef]
  51. Peeken, J.C.; Etzel, L.; Tomov, T.; Münch, S.; Schüttrumpf, L.; Shaktour, J.H.; Kiechle, J.; Knebel, C.; Schaub, S.K.; Mayr, N.A.; et al. Development and benchmarking of a Deep Learning-based MRI-guided gross tumor segmentation algorithm for Radiomics analyses in extremity soft tissue sarcomas. Radiother. Oncol. 2024, 197, 110338. [Google Scholar] [CrossRef]
  52. Marin, T.; Zhuo, Y.; Lahoud, R.M.; Tian, F.; Ma, X.; Xing, F.; Moteabbed, M.; Liu, X.; Grogg, K.; Shusharina, N.; et al. Deep learning-based GTV contouring modeling inter- and intra-observer variability in sarcomas. Radiother. Oncol. 2022, 167, 269–276. [Google Scholar] [CrossRef]
  53. Boussioux, L.; Ma, Y.; Thomas, N.K.; Bertsimas, D.; Shusharina, N.; Pursley, J.; Chen, Y.-L.; DeLaney, T.F.; Qian, J.; Bortfeld, T. Automated Segmentation of Sacral Chordoma and Surrounding Muscles Using Deep Learning Ensemble. Int. J. Radiat. Oncol. 2023, 117, 738–749. [Google Scholar] [CrossRef]
  54. Nigam, R.; Field, M.; Harris, G.; Barton, M.; Carolan, M.; Metcalfe, P.; Holloway, L. Automated detection, delineation and quantification of whole-body bone metastasis using FDG-PET/CT images. Phys. Eng. Sci. Med. 2023, 46, 851–863. [Google Scholar] [CrossRef]
  55. Zając, H.D.; Ribeiro, J.M.N.; Ingala, S.; Gentile, S.; Wanjohi, R.; Gitau, S.N.; Carlsen, J.F.; Nielsen, M.B.; Andersen, T.O. “It depends”: Configuring AI to Improve Clinical Usefulness Across Contexts. In Proceedings of the 2024 ACM Designing Interactive Systems Conference, New York, NY, USA, 1–5 July 2024; Association for Computing Machinery, Inc.: New York, NY, USA, 2024; pp. 874–889. [Google Scholar] [CrossRef]
Figure 1. Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA).
Figure 2. Assessment of the Checklist for Artificial Intelligence in Medical Imaging (CLAIM) and Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) checklist adherence [7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54].
Figure 3. Reported dice similarity coefficient (DSC) overview for all included publications [7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54].
Figure 4. Reported Hausdorff distance (HD) overview from all publications [7,8,9,10,11,12,13,16,17,18,19,20,23,24,25,26,27,28,29,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,49,50,51,52].
Table 1. Thorax Publications.

| Author (Year) | Modality | DSC/HD (mm) | Cancer Type (N Patients) | Model | Backbone | Delineation Staff |
|---|---|---|---|---|---|---|
| Kunkyab T. et al. [7] (2024) | CT | 0.92/1.33 | Lung cancer (676) | Co-ReTr | CNN with multi-resolution input and Transformer module | Radiation oncologist |
| Wang S. et al. [8] (2022) | CT, PET | 0.85 ± 0.05/8.53 ± 3.79 | NSCLC (280) | 3D CNN Dual-Modality Network | Independent convolution for PET/CT and encoder–decoder architecture | Four radiation oncologists |
| Xie H. et al. [9] (2022) | CT | 0.84 ± 0.0/8.11 ± 3.43 | Lung cancer (127) | TransResSEU-NET 2.5D | 3D U-NET with 2D and 3D Res-SE modules | One radiation oncologist, two radiotherapists |
| Cui Y. et al. [10] (2021) | CT | 0.83 ± 0.07/4.57 ± 2.44 | NSCLC (192) | Dense V-Networks | Combination of DenseNet and V-Network structures | Two radiation oncologists |
| Zhang G. et al. [11] (2022) | CT | 0.83 ± 0.10/4.02 ± 0.15 | Lung cancer (871) | I-3D DenseU-NET | Nested dense skip connections between encoder and decoder blocks | One radiation oncologist |
| Talebi P. et al. [12] (2022) | (4D-)CT | 0.83 ± 0.13/3.73 ± 0.99 | NSCLC (20) | 3D U-Net w. attention module | 3D U-Net with an added attention module | One radiation oncologist |
| Skett S. et al. [13] (2024) | CT | 0.80 ± 0.10/10.5 ± 7.3 | Lung cancer (379) | nnU-Net | Anchor-point-based post-processing | Two oncologists |
| Kawata Y. et al. [14] (2017) | CT, PET | 0.79 ± 0.06/N/A | NSCLC (16) | Automated ML framework for GTV segmentation | Pixel-based ML techniques: FCM, ANN, SVM | Two radiation oncologists |
| Ikushima K. et al. [15] (2017) | CT, PET | 0.77/N/A | Lung cancer (14) | PET/CT and diagnostic CT registration | SVM with Gaussian kernel for classification | Two radiation oncologists |
| Ma Y. et al. [16] (2022) | CT | 0.74 ± 0.15/28.23 ± 34.87 | Lung cancer (70) | GruU-NET-add | Convolutional GRU-based 3D U-NET | One radiation oncologist |
| Yu X. et al. [17] (2022) | CT | 0.73/21.39 | Stage III NSCLC (214) | 3D ResSE-U-NET | 3D U-NET with residual and SE blocks | Radiation oncologist |
| Gan W. et al. [18] (2021) | CT | 0.72 ± 0.10/21.73 ± 13.30 | Lung cancer (260) | Hybrid 2D + 3D CNN | V-Net for 3D CNN; dense blocks for 2D CNN | Two radiation oncologists |
| Wong J. et al. [19] (2021) | CT | 0.71 ± 0.19/5.23 | Lung cancer (96) | Limbus Contour v1.0.22 | U-NET | One radiation oncologist |
| Thomas T. et al. [20] (2017) | CT, PET | 0.71/8.10 | NSCLC (9) | Improved GrowCut | GrowCut algorithm | N/A |
| Cheng D. et al. [21] (2020) | CT | 0.66/N/A | NSCLC (25) | Random walks algorithm | Graph-based algorithm | One clinical oncologist |

Abbreviations: Artificial Neural Network (ANN), Convolutional Neural Network (CNN), Fuzzy C-Means (FCM), Gross Tumor Volume (GTV), Gated Recurrent Unit-based U-NET (Gru-based U-NET), Machine Learning (ML), Not Available (N/A), Residual Squeeze-and-Excitation (Res-SE), Squeeze-and-Excitation Blocks (SE-Blocks), Support Vector Machine (SVM), Transfer Learning with Residual Networks (TransRes).