Unsupervised Domain Adaptation for Automatic Polyp Segmentation Using Synthetic Data
Abstract
1. Introduction
- In Section 2, we discuss recent advancements in Transformer architectures in computer vision, review various polyp segmentation methods, and give an overview of available UDA methodologies.
- In Section 3, we provide a detailed explanation of how we utilized the SynthColon dataset and UDA methodologies.
- In Section 5, we evaluate DAFormer under different scenarios, compare it with other architectures, and test it on different target datasets.
- In Section 6, we reflect on the insights gained through extensive experimentation.
- In Section 7, we summarize our findings and propose potential directions for future work.
2. Background and Related Work
2.1. Transformers
2.2. Polyp Segmentation
2.3. Unsupervised Domain Adaptation (UDA)
3. Methodology
3.1. Problem Formulation
3.2. Synthetic Dataset Generation
3.3. Architecture
3.4. Training Pipeline
4. Experiments
Experimental Setup
5. Experimental Results
5.1. Training on the Raw 3D-Generated Dataset
5.2. Effect of Pretraining on Source
5.3. Comparison of Backbone Depth
5.4. Comparison of Different Network Architectures
5.5. Comparison of Current Synth-Colon→Kvasir-Seg
5.6. Synth-Colon→CVC-ClinicDB
5.7. Computational Efficiency
6. Discussion
6.1. Key Observations
6.1.1. Training on the Raw 3D-Generated Dataset
6.1.2. Effect of Pretraining on Source
6.1.3. Comparison of Backbone Depth
6.1.4. Comparison of Different Network Architectures
6.1.5. Comparison of Current Synth-Colon→Kvasir-Seg
6.1.6. Synth-Colon→CVC-ClinicDB
- Src.Only: Both models perform worse when trained exclusively on SynthColon and evaluated on CVC-ClinicDB than when evaluated on Kvasir-Seg. Specifically, DAFormer (MiT-B5) decreases slightly from 54.85% to 53.37% mIoU, while EffiSegNet drops more substantially from 40.03% to 35.77%. A decrease is expected, since CVC-ClinicDB was not used during style transfer; the more pronounced decline in EffiSegNet may be attributed to the texture sensitivity inherent to CNN-based architectures.
- UDA (SynthColon→CVC-ClinicDB): Both models show reduced performance compared to their results under UDA from SynthColon to Kvasir-Seg (Table 2 and Table 3). Specifically, DAFormer and EffiSegNet drop by 8.3 and 5.31 percentage points of mIoU, respectively. However, both still outperform their Src.Only counterparts, indicating that our UDA method improves results even under a larger domain shift.
- UDA (SynthColon→Kvasir-Seg): In this setup, we take the best checkpoints from our SynthColon→Kvasir-Seg training and evaluate them on CVC-ClinicDB. Surprisingly, for DAFormer this setting results in a smaller performance drop on CVC-ClinicDB than the UDA (SynthColon→CVC-ClinicDB) setup (66.62% vs. 60.70% mIoU), whereas EffiSegNet does not follow this pattern (50.16% vs. 54.04%). The DAFormer result is unexpected, as we assumed that directly adapting to CVC-ClinicDB would yield better results on that same dataset. It can be intuitively explained by the fact that our CycleGAN-refined SynthColon dataset is visually closer to Kvasir-Seg due to the style transfer, and Kvasir-Seg in turn is naturally more similar to CVC-ClinicDB, as both are real-world datasets. SynthColon and CVC-ClinicDB, however, are not necessarily close. Thus, proceeding in two smaller steps, first adapting from synthetic data to Kvasir-Seg and then evaluating on CVC-ClinicDB, may generalize better than attempting to bridge the larger domain gap directly. In Figure 4, this improvement can be observed in two challenging samples. In the first image, multiple folds and discolorations may mislead the model, while in the second image, the boundaries are difficult to discern and the polyp texture closely resembles that of the surrounding intestinal walls. In both cases, the SynthColon→Kvasir-Seg checkpoint evaluated on CVC-ClinicDB outperforms the SynthColon→CVC-ClinicDB model, illustrating this phenomenon.
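The differences discussed in the three settings above can be re-derived from the reported mIoU values. The following is a minimal sketch, not part of the original experiments: the numbers are copied from the results tables in this article, while the dictionary layout and the helper name `summarize` are assumptions made purely for illustration.

```python
# mIoU values (%) copied from the results tables in this article.
# The data layout and function name are illustrative assumptions.
KVASIR_UDA = {"DAFormer (MiT-B5)": 69.0, "EffiSegNet (B6)": 59.35}

CVC_CLINICDB = {
    "DAFormer (MiT-B5)": {"src_only": 53.37, "uda_cvc": 60.70, "uda_kvasir": 66.62},
    "EffiSegNet (B6)": {"src_only": 35.77, "uda_cvc": 54.04, "uda_kvasir": 50.16},
}


def summarize(model: str) -> dict:
    """Return mIoU differences (percentage points) for one model."""
    row = CVC_CLINICDB[model]
    return {
        # Drop when the UDA target changes from Kvasir-Seg to CVC-ClinicDB.
        "drop_vs_kvasir": round(KVASIR_UDA[model] - row["uda_cvc"], 2),
        # Gain of direct UDA over training on the source only.
        "gain_over_src_only": round(row["uda_cvc"] - row["src_only"], 2),
        # Positive if the Kvasir-Seg-adapted checkpoint beats direct adaptation.
        "kvasir_ckpt_minus_direct": round(row["uda_kvasir"] - row["uda_cvc"], 2),
    }


if __name__ == "__main__":
    for name in CVC_CLINICDB:
        print(name, summarize(name))
```

The third quantity is positive for DAFormer and negative for EffiSegNet, so the benefit of reusing the Kvasir-Seg-adapted checkpoint is visible for the Transformer-based model only.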
6.2. Limitations
6.3. Future Work
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
UDA | Unsupervised Domain Adaptation |
AI-CAD | Artificial Intelligence-Powered Computer-Aided Detection and Diagnosis |
AI | Artificial Intelligence |
ViT | Vision Transformer |
MiT | Mix Vision Transformer |
MHSA | Multi-Head Self-Attention mechanism |
PVT | Pyramid Vision Transformer |
CNN | Convolutional Neural Network |
GAN | Generative Adversarial Network |
MSE | Mean Squared Error |
DACS | Domain Adaptation via Cross-Domain Mixed Sampling |
ASPP | Atrous Spatial Pyramid Pooling |
EMA | Exponential Moving Average |
VAE | Variational Autoencoder |
References
- American Cancer Society. Key Statistics for Colorectal Cancer. 2024. Available online: https://www.cancer.org/cancer/types/colon-rectal-cancer/about/key-statistics.html (accessed on 21 April 2025).
- Cheng, E.; Blackburn, H.N.; Ng, K.; Spiegelman, D.; Irwin, M.L.; Ma, X.; Gross, C.P.; Tabung, F.K.; Giovannucci, E.L.; Kunz, P.L.; et al. Analysis of Survival Among Adults with Early-Onset Colorectal Cancer in the National Cancer Database. JAMA Netw. Open 2021, 4, e2112539.
- Sikora, N.; Manschke, R.L.; Tang, A.M.; Dunstan, P.; Harris, D.A.; Yang, S. ColonScopeX: Leveraging Explainable Expert Systems with Multimodal Data for Improved Early Diagnosis of Colorectal Cancer. arXiv 2025, arXiv:2504.08824.
- Rex, D.K.; Boland, C.R.; Dominitz, J.A.; Giardiello, F.M.; Johnson, D.A.; Kaltenbach, T. Colorectal cancer screening: Recommendations for physicians and patients from the US Multi-Society Task Force on colorectal cancer. Gastroenterology 2017, 153, 307–323.
- Than, M.; Witherspoon, J.; Shami, J.; Patil, P.; Saklani, A. Diagnostic miss rate for colorectal cancer: An audit. Ann. Gastroenterol. 2015, 28, 94–98.
- Takeda, K.; Kudo, S.E.; Mori, Y.; Misawa, M.; Kudo, T.; Wakamura, K.; Katagiri, A.; Baba, T.; Hidaka, E.; Ishida, F.; et al. Accuracy of diagnosing invasive colorectal cancer using computer-aided endocytoscopy. Endoscopy 2017, 49, 798–802.
- Esteva, A.; Robicquet, A.; Ramsundar, B.; Kuleshov, V.; DePristo, M.; Chou, K.; Cui, C.; Corrado, G.; Thrun, S.; Dean, J. A guide to deep learning in healthcare. Nat. Med. 2019, 25, 24–29.
- Mehrabi, N.; Morstatter, F.; Saxena, N.; Lerman, K.; Galstyan, A. A survey on bias and fairness in machine learning. ACM Comput. Surv. (CSUR) 2021, 54, 1–35.
- Schäfer, R.; Nicke, T.; Höfener, H.; Lange, A.; Merhof, D.; Feuerhake, F.; Schulz, V.; Lotz, J.; Kiessling, F. Overcoming Data Scarcity in Biomedical Imaging with a Foundational Multi-Task Model. Nat. Comput. Sci. 2024, 4, 495–509.
- Rieke, N.; Hancox, J.; Li, W.; Milletarì, F.; Roth, H.R.; Albarqouni, S.; Bakas, S.; Galtier, M.N.; Landman, B.A.; Maier-Hein, K.; et al. The future of digital health with federated learning. npj Digit. Med. 2020, 3, 119.
- European Union. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data (General Data Protection Regulation). Off. J. Eur. Union 2016, L119, 1–88. Available online: https://eur-lex.europa.eu/eli/reg/2016/679/oj (accessed on 30 August 2025).
- Yao, L.; Prosky, J.; Covington, B.; Lyman, K. A Strong Baseline for Domain Adaptation and Generalization in Medical Imaging. Extended Abstract Track. In Proceedings of the Medical Imaging with Deep Learning (MIDL 2019), London, UK, 8–10 July 2019.
- Hu, S.; Liao, Z.; Xia, Y. Devil is in Channels: Contrastive Single Domain Generalization for Medical Image Segmentation. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2023: 26th International Conference, Vancouver, BC, Canada, 8–12 October 2023.
- Ren, G.; Lazarou, M.; Yuan, J.; Stathaki, T. Towards Automated Polyp Segmentation Using Weakly- and Semi-Supervised Learning and Deformable Transformers. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Vancouver, BC, Canada, 17–24 June 2023.
- Basaran, B.D.; Zhang, W.; Qiao, M.; Kainz, B.; Matthews, P.M.; Bai, W. LesionMix: A Lesion-Level Data Augmentation Method for Medical Image Segmentation. In Proceedings of the Data Augmentation, Labelling, and Imperfections: Third MICCAI Workshop, DALI 2023, Held in Conjunction with MICCAI 2023, Vancouver, BC, Canada, 12 October 2023.
- Moreu, E.; McGuinness, K.; O’Connor, N.E. Synthetic data for unsupervised polyp segmentation. In Proceedings of the 29th Irish Conference on Artificial Intelligence and Cognitive Science (AICS 2021), Dublin, Ireland, 7–8 December 2021.
- Zhu, J.Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. In Proceedings of the IEEE International Conference on Computer Vision (ICCV 2017), Venice, Italy, 22–29 October 2017.
- Jha, D.; Smedsrud, P.H.; Riegler, M.A.; Halvorsen, P.; de Lange, T.; Johansen, D.; Johansen, H.D. Kvasir-SEG: A Segmented Polyp Dataset. In Proceedings of the 26th International Conference on MultiMedia Modeling (MMM 2020), Daejeon, Korea, 5–8 January 2020.
- Hoyer, L.; Dai, D.; Gool, L.V. DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017.
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New York City, NY, USA, 23–26 June 2021; pp. 45–67.
- Wang, W.; Xie, E.; Li, X.; Fan, D.; Song, K.; Liang, D.; Lu, T.; Luo, P.; Shao, L. Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021.
- Xie, E.; Wang, W.; Yu, Z.; Anandkumar, A.; Alvarez, J.M.; Luo, P. SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers. In Proceedings of the Neural Information Processing Systems (NeurIPS), Virtual Conference, 6–14 December 2021.
- Wang, W.; Xie, E.; Li, X.; Fan, D.P.; Song, K.; Liang, D.; Lu, T.; Luo, P.; Shao, L. PVT v2: Improved baselines with pyramid vision transformer. Comput. Vis. Media 2022, 8, 415–424.
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021.
- Bernal, J.; Sánchez, F.; Fernández-Esparrach, G.; Gil, D.; Rodríguez, C.; Vilariño, F. WM-DOVA maps for accurate polyp highlighting in colonoscopy: Validation vs. saliency maps from physicians. Comput. Med. Imaging Graph. 2015, 43, 99–111.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2015), Munich, Germany, 5–9 October 2015.
- Zhou, Z.; Siddiquee, M.M.R.; Tajbakhsh, N.; Liang, J. UNet++: A Nested U-Net Architecture for Medical Image Segmentation. In Proceedings of the 4th International Workshop, DLMIA 2018, and 8th International Workshop, ML-CDS 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, 20 September 2018.
- Jha, D.; Smedsrud, P.H.; Riegler, M.A.; Johansen, D.; de Lange, T.; Halvorsen, P.; Johansen, H.D. ResUNet++: An Advanced Architecture for Medical Image Segmentation. In Proceedings of the 2019 IEEE International Symposium on Multimedia (ISM), San Diego, CA, USA, 9–11 December 2019.
- Jha, D.; Riegler, M.A.; Johansen, D.; Halvorsen, P.; Johansen, H.D. DoubleU-Net: A Deep Convolutional Neural Network for Medical Image Segmentation. In Proceedings of the 2020 IEEE 33rd International Symposium on Computer-Based Medical Systems (CBMS), Rochester, MN, USA, 28–30 July 2020.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Kim, T.; Lee, H.; Kim, D. UACANet: Uncertainty Augmented Context Attention for Polyp Segmentation. In Proceedings of the 29th ACM International Conference on Multimedia, ACM, Virtual Event, 17 October 2021; pp. 2167–2175.
- Fan, D.P.; Ji, G.P.; Zhou, T.; Chen, G.; Fu, H.; Shen, J.; Shao, L. PraNet: Parallel Reverse Attention Network for Polyp Segmentation. In Proceedings of the 23rd International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2020), Lima, Peru, 4–8 October 2020.
- Zhang, Y.; Liu, H.; Hu, Q. TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation. In Proceedings of the 24th International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2021), Strasbourg, France, 27 September–1 October 2021.
- Dong, B.; Wang, W.; Fan, D.P.; Li, J.; Fu, H.; Shao, L. Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers. CAAI Artif. Intell. Res. 2023, 2, 9150015.
- Rahman, M.M.; Marculescu, R. Medical Image Segmentation via Cascaded Attention Decoding. In Proceedings of the 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 2–7 January 2023; pp. 6211–6220.
- Shi, W.; Xu, J.; Gao, P. SSformer: A Lightweight Transformer for Semantic Segmentation. In Proceedings of the 2022 IEEE 24th International Workshop on Multimedia Signal Processing (MMSP), Shanghai, China, 26–28 September 2022.
- Fitzgerald, K.; Matuszewski, B. FCB-SwinV2 Transformer for Polyp Segmentation. arXiv 2023, arXiv:2302.01027.
- Choudhuri, A.; Gao, Z.; Zheng, M.; Planche, B.; Chen, T.; Wu, Z. PolypSegTrack: Unified Foundation Model for Colonoscopy Video Analysis. arXiv 2025, arXiv:2503.24108.
- Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.Y.; et al. Segment Anything. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 2–6 October 2023.
- Li, H.; Zhang, D.; Yao, J.; Han, L.; Li, Z.; Han, J. ASPS: Augmented Segment Anything Model for Polyp Segmentation. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI, Marrakesh, Morocco, 6–10 October 2024.
- Li, Y.; Hu, M.; Yang, X. Polyp-SAM: Transfer SAM for Polyp Segmentation. In Proceedings of the Medical Imaging 2024: Computer-Aided Diagnosis, San Diego, CA, USA, 3 April 2024.
- Rahman, M.M.; Munir, M.; Jha, D.; Bagci, U.; Marculescu, R. PP-SAM: Perturbed Prompts for Robust Adaptation of Segment Anything Model for Polyp Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Seattle, WA, USA, 17–21 June 2024.
- Mao, X.; Xing, X.; Meng, F.; Liu, J.; Bai, F.; Nie, Q.; Meng, M. One Polyp Identifies All: One-Shot Polyp Segmentation with SAM via Cascaded Priors and Iterative Prompt Evolution. arXiv 2025, arXiv:2507.16337.
- Mansoori, M.; Shahabodini, S.; Abouei, J.; Plataniotis, K.N.; Mohammadi, A. Polyp SAM 2: Advancing Zero shot Polyp Segmentation in Colorectal Cancer Detection. arXiv 2024, arXiv:2408.05892.
- Zhao, Y.; Zhou, T.; Gu, Y.; Zhou, Y.; Zhang, Y.; Wu, Y.; Fu, H. WeakPolyp-SAM: Segment Anything Model-driven weakly-supervised polyp segmentation. Knowl.-Based Syst. 2025, 322, 113701.
- Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Networks. In Proceedings of the 27th International Conference on Neural Information Processing Systems (NeurIPS 2014), Montreal, QC, Canada, 8–13 December 2014.
- Gong, R.; Li, W.; Chen, Y.; Gool, L.V. DLOW: Domain Flow for Adaptation and Generalization. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019.
- Tzeng, E.; Hoffman, J.; Saenko, K.; Darrell, T. Adversarial Discriminative Domain Adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017.
- Tsai, Y.H.; Hung, W.C.; Schulter, S.; Sohn, K.; Yang, M.H.; Chandraker, M. Learning to Adapt Structured Output Space for Semantic Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, UT, USA, 18–23 June 2018.
- Tsai, Y.H.; Sohn, K.; Schulter, S.; Chandraker, M. Domain Adaptation for Structured Output via Discriminative Patch Representations. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019.
- Lee, D.H. Pseudo-Label: The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks. In Proceedings of the ICML 2013 Workshop: Challenges in Representation Learning (WREPL), Atlanta, GA, USA, 16–21 June 2013.
- Sohn, K.; Berthelot, D.; Li, C.L.; Zhang, Z.; Carlini, N.; Cubuk, E.D.; Kurakin, A.; Zhang, H.; Raffel, C. FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence. In Proceedings of the 34th International Conference on Neural Information Processing Systems (NeurIPS 2020), Virtual Event, 6–12 December 2020.
- Zou, Y.; Yu, Z.; Kumar, B.V.K.V.; Wang, J. Domain Adaptation for Semantic Segmentation via Class-Balanced Self-Training. In Proceedings of the 15th European Conference on Computer Vision (ECCV 2018), Munich, Germany, 8–14 September 2018.
- Zou, Y.; Yu, Z.; Liu, X.; Kumar, B.V.K.V.; Wang, J. Confidence Regularized Self-Training. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV 2019), Seoul, Republic of Korea, 27 October–2 November 2019.
- Wang, Y.; Wang, H.; Shen, Y.; Fei, J.; Li, W.; Jin, G.; Wu, L.; Zhao, R.; Le, X. Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022.
- Sakaridis, C.; Dai, D.; Hecker, S.; Gool, L.V. Model Adaptation with Synthetic and Real Data for Semantic Dense Foggy Scene Understanding. In Proceedings of the 15th European Conference on Computer Vision (ECCV 2018), Munich, Germany, 8–14 September 2018.
- Yang, Y.; Soatto, S. FDA: Fourier Domain Adaptation for Semantic Segmentation. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020.
- Hinton, G.; Vinyals, O.; Dean, J. Distilling the Knowledge in a Neural Network. In Proceedings of the NIPS 2014 Deep Learning Workshop, Montreal, QC, Canada, 8–13 December 2014.
- Tarvainen, A.; Valpola, H. Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NeurIPS 2017), Long Beach, CA, USA, 4–9 December 2017.
- Olsson, V.; Tranheden, W.; Pinto, J.; Svensson, L. ClassMix: Segmentation-Based Data Augmentation for Semi-Supervised Learning. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2021), Waikoloa, HI, USA, 5–9 January 2021.
- Tranheden, W.; Olsson, V.; Pinto, J.; Svensson, L. DACS: Domain Adaptation via Cross-domain Mixed Sampling. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2021), Waikoloa, HI, USA, 5–9 January 2021.
- Zhou, Q.; Feng, Z.; Gu, Q.; Cheng, G.; Lu, X.; Shi, J.; Ma, L. Uncertainty-aware consistency regularization for cross-domain semantic segmentation. Comput. Vis. Image Underst. 2022, 221, 103448.
- Diamantis, D.E.; Gatoula, P.; Koulaouzidis, A.; Iakovidis, D.K. This Intestine Does Not Exist: Multiscale Residual Variational Autoencoder for Realistic Wireless Capsule Endoscopy Image Generation. IEEE Access 2024, 12, 25668–25683.
- Barua, H.B.; Stefanov, K.; Wong, K.; Dhall, A.; Krishnasamy, G. GTA-HDR: A Large-Scale Synthetic Dataset for HDR Image Reconstruction. arXiv 2024, arXiv:2403.17837.
- Ros, G.; Sellart, L.; Materzynska, J.; Vazquez, D.; Lopez, A.M. The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 3234–3243.
- Moreu, E.; Arazo, E.; McGuinness, K.; O’Connor, N.E. Joint one-sided synthetic unpaired image translation and segmentation for colorectal cancer prevention. Expert Syst. 2022, 40, e13137.
- Vezakis, I.A.; Georgas, K.; Fotiadis, D.; Matsopoulos, G.K. EffiSegNet: Gastrointestinal Polyp Segmentation through a Pre-Trained EfficientNet-based Network with a Simplified Decoder. In Proceedings of the 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2024), Orlando, FL, USA, 15–19 July 2024.
- Moreu, E.; Arazo, E.; McGuinness, K.; O’Connor, N.E. Self-Supervised and Semi-Supervised Polyp Segmentation using Synthetic Data. In Proceedings of the 2023 International Joint Conference on Neural Networks (IJCNN), Gold Coast, Australia, 18–23 June 2023.
- Huang, C.H.; Wu, H.Y.; Lin, Y.L. HarDNet-MSEG: A Simple Encoder-Decoder Polyp Segmentation Neural Network that Achieves over 0.9 Mean Dice and 86 FPS. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021), Toronto, ON, Canada, 6–11 June 2021.
- McDuff, D.; Curran, T.; Kadambi, A. Synthetic Data in Healthcare. arXiv 2023, arXiv:2304.03243.
- Chen, J.; Chun, D.; Patel, M.; Chiang, E.; James, J.; Capobianco, J.; Lipson, J.; Hong, C.; Natarajan, K.; Cole, C.L.; et al. The validity of synthetic clinical data: A validation study of a leading synthetic data generator (Synthea) using clinical quality measures. BMC Med. Inform. Decis. Mak. 2019, 19, 44.
- Mathew, S.; Nadeem, S.; Kumari, S.; Kaufman, A. Augmenting Colonoscopy Using Extended and Directional CycleGAN for Lossy Image Translation. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 4695–4704.
- Iacono, P.; Khan, N. Structure Preserving Cycle-GAN for Unsupervised Medical Image Domain Adaptation. arXiv 2023, arXiv:2304.09164.
Model | Encoder Pretrained on ImageNet | Segmentor Pretrained on Source Dataset | mIoU | Dice |
---|---|---|---|---|
DAFormer | ✓ | – | 0.4520 | 0.5915 |
DAFormer | ✓ | ✓ | 0.7021 | 0.8157 |
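The table above reports mIoU and Dice. For reference, the following minimal sketch shows how the two overlap metrics are commonly computed for a single pair of binary masks. It is an illustration under assumptions, not the evaluation code used in this work: the function name `iou_and_dice`, the nested-list mask format, the smoothing term `eps`, and the foreground-only convention are all hypothetical choices.

```python
def iou_and_dice(pred, target, eps=1e-7):
    """Return (IoU, Dice) for two equally sized binary masks (nested lists of 0/1).

    Assumption: only the foreground (polyp) class is scored; `eps` avoids
    division by zero when both masks are empty.
    """
    p = [int(v) for row in pred for v in row]
    t = [int(v) for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(a & b for a, b in zip(p, t))
    pred_area, target_area = sum(p), sum(t)
    union = pred_area + target_area - intersection

    iou = (intersection + eps) / (union + eps)
    dice = (2 * intersection + eps) / (pred_area + target_area + eps)
    return iou, dice


if __name__ == "__main__":
    prediction = [[1, 1, 0],
                  [0, 1, 0]]
    ground_truth = [[1, 0, 0],
                    [0, 1, 1]]
    print(iou_and_dice(prediction, ground_truth))  # IoU is about 0.5, Dice about 0.667
```

For a single image the two metrics are tied by Dice = 2·IoU / (1 + IoU). Dataset-level means need not satisfy this identity, because each metric is averaged over images separately.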
Encoder | # Params (M) | Src-Only (%) | UDA (%) | Oracle (%) | Rel. (%) |
---|---|---|---|---|---|
MiT-B2 | 27.9 | 49.28 ± 0.2 | 58.3 ± 3.0 | 82.49 ± 0.42 | 70.68 |
MiT-B3 | 47.7 | 50.23 ± 3.7 | 61.6 ± 2.6 | 84.43 ± 0.24 | 72.96 |
MiT-B5 | 85.1 | 54.85 ± 2.8 | 69.0 ± 0.9 | 85.18 ± 1.18 | 81.00 |
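The Rel. (%) column is not defined in this excerpt. The values in the table above are consistent with the ratio of UDA mIoU to Oracle mIoU, expressed as a percentage. The sketch below checks that reading against the three rows; the function name `relative_performance` and the interpretation itself are assumptions.

```python
# (UDA mIoU, Oracle mIoU, reported Rel.) copied from the table above, all in %.
ROWS = {
    "MiT-B2": (58.3, 82.49, 70.68),
    "MiT-B3": (61.6, 84.43, 72.96),
    "MiT-B5": (69.0, 85.18, 81.00),
}


def relative_performance(uda: float, oracle: float) -> float:
    """Assumed definition: UDA mIoU as a percentage of the supervised Oracle mIoU."""
    return round(100.0 * uda / oracle, 2)


if __name__ == "__main__":
    for encoder, (uda, oracle, reported) in ROWS.items():
        print(encoder, relative_performance(uda, oracle), "reported:", reported)
```

Under this reading, the deepest encoder recovers the largest share of the fully supervised performance.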
Architecture | # Params (M) | Src-Only (%) | UDA (%) | Oracle (%) | Rel. (%) |
---|---|---|---|---|---|
UNet | 3.1 | 25.84 ± 2.1 | 31.86 ± 0.3 | 74.6 [33] | 41.55 |
EffiSegNet (B6) | 40.7 | 40.03 ± 1.14 | 59.35 ± 0.67 | 90.56 [68] | 65.55 |
DAFormer (MiT-B3) | 47.7 | 50.23 ± 3.7 | 61.6 ± 2.6 | 84.43 ± 0.24 | 72.96 |
Method | Src-Only mIoU (%) | Src-Only mDice (%) | UDA mIoU (%) | UDA mDice (%) | Oracle mIoU (%) | Oracle mDice (%) |
---|---|---|---|---|---|---|
DAFormer | 54.85 ± 0.2 | 73.2 ± 0.4 | 69.0 ± 0.9 | 80.67 ± 0.01 | 85.18 ± 1.1 | 89.04 ± 0.3 |
SynthColon [16] | 52.7 | 75.9 | – | – | 85.70 | 90.4 |
CUT-Seg [67] | 62.1 | 70.2 | – | – | 85.70 | 90.4 |
PL-CUT-Seg [69] | – | – | 68.77 | 78.08 | 85.70 | 90.4 |
Method | Src-Only | UDA (SynthColon→CVC-ClinicDB) | UDA (SynthColon→Kvasir-Seg) | Oracle |
---|---|---|---|---|
SynthColon (baseline) | 45.7 * | – | – | 88.20 * |
DAFormer (MiT-B5) | 53.37 ± 1.5 | 60.70 ± 0.8 | 66.62 ± 0.2 | 85.18 ± 1.18 |
EffiSegNet (B6) | 35.77 ± 1.1 | 54.04 ± 0.9 | 50.16 ± 0.7 | 89.50 * |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Malli, I.; Vezakis, I.A.; Kakkos, I.; Kalamatianos, T.; Matsopoulos, G.K. Unsupervised Domain Adaptation for Automatic Polyp Segmentation Using Synthetic Data. Appl. Sci. 2025, 15, 9829. https://doi.org/10.3390/app15179829