Improved UNet with Attention for Medical Image Segmentation
Abstract
1. Introduction
2. Materials and Methods
2.1. Network Architecture
2.1.1. Integrating UNet++ to Transformer and Side-Outputs for Deep Supervision
2.1.2. Incorporating Attention in the Network
2.2. Datasets
2.2.1. The Synapse Multi-Organ Segmentation Dataset
2.2.2. CAMUS Dataset
2.3. Evaluation and Metrics
2.3.1. Evaluation of the Synapse Multi-Organ Segmentation Dataset
- Cross-entropy loss [56] (standard form shown below):
- Dice loss [56] (standard form shown below):
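As a minimal sketch, assuming the standard multi-class formulations surveyed in [56] (the paper's exact notation and class weighting may differ): with $p_{i,c}$ the predicted softmax probability and $g_{i,c}$ the one-hot ground truth for pixel $i$ and class $c$, over $N$ pixels and $C$ classes,

$$
\mathcal{L}_{\mathrm{CE}} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C} g_{i,c}\,\log p_{i,c}
$$

$$
\mathcal{L}_{\mathrm{Dice}} = 1 - \frac{1}{C}\sum_{c=1}^{C}\frac{2\sum_{i=1}^{N} p_{i,c}\,g_{i,c}}{\sum_{i=1}^{N} p_{i,c} + \sum_{i=1}^{N} g_{i,c} + \epsilon}
$$

where $\epsilon$ is a small smoothing constant. These two terms are commonly combined as $\mathcal{L} = \lambda\,\mathcal{L}_{\mathrm{CE}} + (1-\lambda)\,\mathcal{L}_{\mathrm{Dice}}$; the weight $\lambda$ here is illustrative rather than the paper's reported setting.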
2.3.2. Evaluation of the CAMUS Dataset
2.4. Implementation Details
3. Results and Discussion
3.1. Synapse Multi-Organ Segmentation
3.2. CAMUS Dataset
4. Ablation Study
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Gao, Q.; Almekkawy, M. ASUNet++: A nested UNet with adaptive feature extractions for liver tumor segmentation. Comput. Biol. Med. 2021, 136, 104688.
2. Conze, P.H.; Andrade-Miranda, G.; Singh, V.K.; Jaouen, V.; Visvikis, D. Current and emerging trends in medical image segmentation with deep learning. IEEE Trans. Radiat. Plasma Med. Sci. 2023, 7, 545–569.
3. Heimann, T.; Meinzer, H.P. Statistical shape models for 3D medical image segmentation: A review. Med. Image Anal. 2009, 13, 543–563.
4. Kakumani, A.K.; Sree, L.P.; Kumar, B.V.; Rao, S.K.; Garrepally, M.; Chandrakanth, M. Segmentation of Cell Nuclei in Microscopy Images using Modified ResUNet. In Proceedings of the 2022 IEEE 3rd Global Conference for Advancement in Technology (GCAT), Bangalore, India, 7–9 October 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–6.
5. Zhou, S.; Wang, J.; Zhang, S.; Liang, Y.; Gong, Y. Active contour model based on local and global intensity information for medical image segmentation. Neurocomputing 2016, 186, 107–118.
6. Gao, Q.; Almekkawy, M. Ultrasound liver tumor segmentation with nested UNet and dynamic feature extraction. J. Acoust. Soc. Am. 2021, 149, A115.
7. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of the Advances in Neural Information Processing Systems; Pereira, F., Burges, C., Bottou, L., Weinberger, K., Eds.; Curran Associates, Inc.: New York, NY, USA, 2012; Volume 25.
8. Wang, B.; Wang, F.; Dong, P.; Li, C. Multiscale transUNet++: Dense hybrid UNet with Transformer for medical image segmentation. Signal Image Video Process. 2022, 16, 1607–1614.
9. Chen, B.; Liu, Y.; Zhang, Z.; Lu, G.; Kong, A.W.K. TransattUNet: Multi-level attention-guided UNet with Transformer for medical image segmentation. arXiv 2021, arXiv:2107.05274.
10. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
11. Ronneberger, O.; Fischer, P.; Brox, T. UNet: Convolutional networks for biomedical image segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, 5–9 October 2015; Springer: Berlin/Heidelberg, Germany, 2015; pp. 234–241.
12. Huang, H.; Lin, L.; Tong, R.; Hu, H.; Zhang, Q.; Iwamoto, Y.; Han, X.; Chen, Y.W.; Wu, J. UNet 3+: A full-scale connected UNet for medical image segmentation. In Proceedings of the ICASSP 2020—2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1055–1059.
13. Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 834–848.
14. Jumutc, V.; Bļizņuks, D.; Lihachev, A. Multi-Path UNet architecture for cell and colony-forming unit image segmentation. Sensors 2022, 22, 990.
15. Mohammad, U.F.; Almekkawy, M. Automated detection of liver steatosis in ultrasound images using convolutional neural networks. In Proceedings of the 2021 IEEE International Ultrasonics Symposium (IUS), Xi’an, China, 11–16 September 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1–4.
16. Safarov, S.; Whangbo, T.K. A-DenseUNet: Adaptive densely connected UNet for polyp segmentation in colonoscopy images with atrous convolution. Sensors 2021, 21, 1441.
17. Tao, S.; Jiang, Y.; Cao, S.; Wu, C.; Ma, Z. Attention-guided network with densely connected convolution for skin lesion segmentation. Sensors 2021, 21, 3462.
18. Liu, H.; Li, Z.; Lin, S.; Cheng, L. A Residual UNet Denoising Network Based on Multi-Scale Feature Extraction and Attention-Guided Filter. Sensors 2023, 23, 7044.
19. Mohammad, U.F.; Almekkawy, M. A substitution of convolutional layers by FFT layers - a low computational cost version. In Proceedings of the 2021 IEEE International Ultrasonics Symposium (IUS), Xi’an, China, 11–16 September 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1–3.
20. Jiang, Y.; Yao, H.; Tao, S.; Liang, J. Gated skip-connection network with adaptive upsampling for retinal vessel segmentation. Sensors 2021, 21, 6177.
21. Li, S.; Sultonov, F.; Ye, Q.; Bai, Y.; Park, J.H.; Yang, C.; Song, M.; Koo, S.; Kang, J.M. TA-UNet: Integrating triplet attention module for drivable road region segmentation. Sensors 2022, 22, 4438.
22. Chen, S.; Qiu, C.; Yang, W.; Zhang, Z. Multiresolution aggregation Transformer UNet based on multiscale input and coordinate attention for medical image segmentation. Sensors 2022, 22, 3820.
23. Thirusangu, N.; Almekkawy, M. Segmentation of Breast Ultrasound Images using Densely Connected Deep Convolutional Neural Network and Attention Gates. In Proceedings of the 2021 IEEE UFFC Latin America Ultrasonics Symposium (LAUS), Gainesville, FL, USA, 4–5 October 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1–4.
24. Thirusangu, N.; Subramanian, T.; Almekkawy, M. Segmentation of induced substantia nigra from transcranial ultrasound images using deep convolutional neural network. J. Acoust. Soc. Am. 2020, 148, 2636–2637.
25. Leclerc, S.; Smistad, E.; Pedrosa, J.; Østvik, A.; Cervenansky, F.; Espinosa, F.; Espeland, T.; Berg, E.A.R.; Jodoin, P.M.; Grenier, T.; et al. Deep learning for segmentation using an open large-scale dataset in 2D echocardiography. IEEE Trans. Med. Imaging 2019, 38, 2198–2210.
26. Arsenescu, T.; Chifor, R.; Marita, T.; Santoma, A.; Lebovici, A.; Duma, D.; Vacaras, V.; Badea, A.F. 3D Ultrasound Reconstructions of the Carotid Artery and Thyroid Gland Using Artificial-Intelligence-Based Automatic Segmentation—Qualitative and Quantitative Evaluation of the Segmentation Results via Comparison with CT Angiography. Sensors 2023, 23, 2806.
27. Katakis, S.; Barotsis, N.; Kakotaritis, A.; Economou, G.; Panagiotopoulos, E.; Panayiotakis, G. Automatic Extraction of Muscle Parameters with Attention UNet in Ultrasonography. Sensors 2022, 22, 5230.
28. Han, Z.; Jian, M.; Wang, G.G. ConvUNeXt: An efficient convolution neural network for medical image segmentation. Knowl.-Based Syst. 2022, 253, 109512.
29. Zhou, Z.; Rahman Siddiquee, M.M.; Tajbakhsh, N.; Liang, J. UNet++: A nested UNet architecture for medical image segmentation. In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support; Springer: Berlin/Heidelberg, Germany, 2018; pp. 3–11.
30. Zeng, Z.; Hu, Q.; Xie, Z.; Zhou, J.; Xu, Y. Small but Mighty: Enhancing 3D Point Clouds Semantic Segmentation with U-Next Framework. arXiv 2023, arXiv:2304.00749.
31. Qin, X.; Zhang, Z.; Huang, C.; Gao, C.; Dehghan, M.; Jagersand, M. Basnet: Boundary-aware salient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 7479–7489.
32. Hou, Q.; Jiang, Z.; Yuan, L.; Cheng, M.M.; Yan, S.; Feng, J. Vision permutator: A permutable mlp-like architecture for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 1328–1334.
33. Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
34. Li, S.; Dong, M.; Du, G.; Mu, X. Attention dense-UNet for automatic breast mass segmentation in digital mammogram. IEEE Access 2019, 7, 59037–59047.
35. Oktay, O.; Schlemper, J.; Folgoc, L.L.; Lee, M.; Heinrich, M.; Misawa, K.; Mori, K.; McDonagh, S.; Hammerla, N.Y.; Kainz, B.; et al. Attention UNet: Learning where to look for the pancreas. arXiv 2018, arXiv:1804.03999.
36. Chen, Y.; Wang, K.; Liao, X.; Qian, Y.; Wang, Q.; Yuan, Z.; Heng, P.A. Channel-UNet: A spatial channelwise convolutional neural network for liver and tumors segmentation. Front. Genet. 2019, 10, 1110.
37. Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. Cbam: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
38. Zhao, P.; Zhang, J.; Fang, W.; Deng, S. SCAUNet: Spatial-channel attention UNet for gland segmentation. Front. Bioeng. Biotechnol. 2020, 8, 670.
39. Hong, Z.; Chen, M.; Hu, W.; Yan, S.; Qu, A.; Chen, L.; Chen, J. Dual encoder network with Transformer-CNN for multi-organ segmentation. Med. Biol. Eng. Comput. 2023, 61, 661–671.
40. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 1–11.
41. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16×16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
42. Azad, R.; Al-Antary, M.T.; Heidari, M.; Merhof, D. Transnorm: Transformer provides a strong spatial normalization mechanism for a deep segmentation model. IEEE Access 2022, 10, 108205–108215.
43. Wu, H.; Chen, S.; Chen, G.; Wang, W.; Lei, B.; Wen, Z. FAT-Net: Feature adaptive Transformers for automated skin lesion segmentation. Med. Image Anal. 2022, 76, 102327.
44. Zuo, S.; Xiao, Y.; Chang, X.; Wang, X. Vision Transformers for dense prediction: A survey. Knowl.-Based Syst. 2022, 253, 109552.
45. Chen, J.; Lu, Y.; Yu, Q.; Luo, X.; Adeli, E.; Wang, Y.; Lu, L.; Yuille, A.L.; Zhou, Y. TransUNet: Transformers make strong encoders for medical image segmentation. arXiv 2021, arXiv:2102.04306.
46. Cao, H.; Wang, Y.; Chen, J.; Jiang, D.; Zhang, X.; Tian, Q.; Wang, M. Swin-UNet: UNet-like pure Transformer for medical image segmentation. arXiv 2021, arXiv:2105.05537.
47. Yin, Y.; Xu, W.; Chen, L.; Wu, H. CoT-UNet++: A medical image segmentation method based on contextual Transformer and dense connection. Math. Biosci. Eng. 2023, 20, 8320–8336.
48. Balachandran, S.; Qin, X.; Jiang, C.; Blouri, E.S.; Forouzandeh, A.; Dehghan, M.; Zonoobi, D.; Kapur, J.; Jaremko, J.; Punithakumar, K. ACU2E-Net: A novel predict–refine attention network for segmentation of soft-tissue structures in ultrasound images. Comput. Biol. Med. 2023, 157, 106792.
49. Zhang, S.; Fu, H.; Yan, Y.; Zhang, Y.; Wu, Q.; Yang, M.; Tan, M.; Xu, Y. Attention guided network for retinal image segmentation. In Proceedings of the Medical Image Computing and Computer Assisted Intervention–MICCAI 2019: 22nd International Conference, Shenzhen, China, 13–17 October 2019; Proceedings, Part I 22. Springer: Berlin/Heidelberg, Germany, 2019; pp. 797–805.
50. Xie, Y.; Yang, B.; Guan, Q.; Zhang, J.; Wu, Q.; Xia, Y. Attention Mechanisms in Medical Image Segmentation: A Survey. arXiv 2023, arXiv:2305.17937.
51. Mubashar, M.; Ali, H.; Grönlund, C.; Azmat, S. R2U++: A multiscale recurrent residual UNet with dense skip connections for medical image segmentation. Neural Comput. Appl. 2022, 34, 17723–17739.
52. Hu, J.; Shen, L.; Albanie, S.; Sun, G.; Wu, E. Squeeze-and-Excitation Networks. arXiv 2019, arXiv:1709.01507.
53. Wu, X.; Yang, L.; Ma, Y.; Wu, C.; Guo, C.; Yan, H.; Qiao, Z.; Yao, S.; Fan, Y. An end-to-end multiple side-outputs fusion deep supervision network based remote sensing image change detection algorithm. Signal Process. 2023, 213, 109203.
54. Fu, S.; Lu, Y.; Wang, Y.; Zhou, Y.; Shen, W.; Fishman, E.; Yuille, A. Domain adaptive relational reasoning for 3D multi-organ segmentation. In Proceedings of the Medical Image Computing and Computer Assisted Intervention–MICCAI 2020: 23rd International Conference, Lima, Peru, 4–8 October 2020; Proceedings, Part I 23. Springer: Berlin/Heidelberg, Germany, 2020; pp. 656–666.
55. Cao, H.; Wang, Y.; Chen, J.; Jiang, D.; Zhang, X.; Tian, Q.; Wang, M. Swin-UNet: UNet-like pure Transformer for medical image segmentation. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 205–218.
56. Ma, J.; Chen, J.; Ng, M.; Huang, R.; Li, Y.; Li, C.; Yang, X.; Martel, A.L. Loss odyssey in medical image segmentation. Med. Image Anal. 2021, 71, 102035.
57. Wang, H.; Cao, P.; Wang, J.; Zaiane, O.R. Uctransnet: Rethinking the skip connections in UNet from a channelwise perspective with Transformer. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 22 February–1 March 2022; Volume 36, pp. 2441–2449.
58. Wang, H.; Xie, S.; Lin, L.; Iwamoto, Y.; Han, X.H.; Chen, Y.W.; Tong, R. Mixed Transformer UNet for medical image segmentation. In Proceedings of the ICASSP 2022—2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 23–27 May 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 2390–2394.
59. Lei, T.; Sun, R.; Wan, Y.; Xia, Y.; Du, X.; Nandi, A.K. TEC-Net: Vision Transformer Embrace Convolutional Neural Networks for Medical Image Segmentation. arXiv 2023, arXiv:2306.04086.
60. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional Transformers for language understanding. arXiv 2018, arXiv:1810.04805.
61. Loshchilov, I.; Hutter, F. Decoupled weight decay regularization. arXiv 2017, arXiv:1711.05101.
62. Roux, N.; Schmidt, M.; Bach, F. A stochastic gradient method with an exponential convergence rate for finite training sets. Adv. Neural Inf. Process. Syst. 2012, 25.
63. Sun, S.; Cao, Z.; Zhu, H.; Zhao, J. A survey of optimization methods from a machine learning perspective. IEEE Trans. Cybern. 2019, 50, 3668–3681.
| Model | DSC↑ | HD↓ | Aorta | Gallbladder | Kidney (L) | Kidney (R) | Liver | Pancreas | Spleen | Stomach |
|---|---|---|---|---|---|---|---|---|---|---|
| R50 UNet [45] | 74.68 | 36.87 | 87.74 | 63.66 | 80.6 | 78.19 | 93.74 | 56.9 | 85.87 | 74.16 |
| UNet | 76.85 | 39.70 | 89.07 | 69.72 | 77.77 | 68.6 | 93.43 | 53.98 | 86.67 | 75.5 |
| R50 ViT [45] | 71.29 | 32.87 | 73.73 | 55.13 | 75.8 | 72.2 | 91.51 | 45.99 | 81.99 | 73.95 |
| TransUNet [45,55] | 77.48 | 31.69 | 87.23 | 63.13 | 81.87 | 77.02 | 94.08 | 55.86 | 85.08 | 75.62 |
| UCTransNet [57] | 78.99 | 30.29 | − | − | − | − | − | − | − | − |
| Swin-UNet [55] | 79.13 | 21.55 | 85.47 | 66.53 | 83.28 | 79.61 | 94.29 | 56.58 | 90.66 | 76.60 |
| TransNorm [42] | 78.40 | 30.25 | 86.23 | 65.10 | 82.18 | 78.63 | 94.22 | 55.34 | 89.50 | 76.01 |
| MT-UNet [58] | 78.59 | 26.59 | 87.92 | 64.99 | 81.47 | 77.29 | 93.06 | 59.46 | 87.75 | 76.81 |
| Proposed Method | 81.92 | 20.21 | 89.01 | 70.39 | 86.04 | 82.83 | 95.09 | 62.32 | 90.02 | 78.33 |
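The DSC↑ and HD↓ columns denote the Dice similarity coefficient (higher is better) and the Hausdorff distance between predicted and ground-truth masks (lower is better). As a reference only, the sketch below shows one straightforward NumPy/SciPy way to compute both metrics for a single binary mask; the benchmark evaluation may instead use the 95th-percentile Hausdorff variant (HD95) and per-organ averaging, so the function names and details here are illustrative assumptions, not the paper's evaluation code.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-6) -> float:
    """Dice similarity coefficient (DSC) between two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def hausdorff_distance(pred: np.ndarray, target: np.ndarray) -> float:
    """Symmetric Hausdorff distance between the foreground pixel sets of two masks."""
    p = np.argwhere(pred.astype(bool))    # (N, 2) foreground coordinates, prediction
    t = np.argwhere(target.astype(bool))  # (M, 2) foreground coordinates, ground truth
    return max(directed_hausdorff(p, t)[0], directed_hausdorff(t, p)[0])

# Toy check on two overlapping squares
a = np.zeros((64, 64)); a[10:30, 10:30] = 1
b = np.zeros((64, 64)); b[12:32, 12:32] = 1
print(f"DSC = {dice_score(a, b):.3f}, HD = {hausdorff_distance(a, b):.3f}")
```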
| Model | ED DSC↑ | ED HD↓ | ES DSC↑ | ES HD↓ |
|---|---|---|---|---|
| UNet | 91.40 ± 0.9562 | 11.89 ± 1.0046 | 91.42 ± 0.5941 | 13.27 ± 0.7397 |
| UNet++ | 92.04 ± 0.7336 | 11.53 ± 0.5889 | 92.32 ± 0.6699 | 12.65 ± 0.6788 |
| TransUNet | 91.23 ± 0.4414 | 12.06 ± 0.6234 | 91.42 ± 0.6660 | 13.37 ± 0.5664 |
| Swin-UNet | 84.34 ± 0.8296 | 16.33 ± 0.7783 | 85.71 ± 1.1337 | 16.86 ± 0.8415 |
| TransNorm | 90.18 ± 0.8957 | 11.15 ± 1.7352 | 90.87 ± 0.4438 | 13.47 ± 0.6467 |
| Proposed Method | 92.52 ± 0.5068 | 11.04 ± 0.5302 | 92.64 ± 0.7081 | 12.35 ± 0.5990 |

Values are mean ± standard deviation (SD). ED: end-diastole; ES: end-systole.
| Methods | DSC↑ | HD↓ |
|---|---|---|
| Baseline | 78.40 | 30.25 |
| Baseline + “UNet++” | 79.37 | 30.46 |
| Baseline + “UNet++” + side output (deep supervision) | 81.49 | 21.59 |
| Baseline + “UNet++” + side output + TLA module | 81.80 | 26.07 |
| Baseline + “UNet++” + side output + TLA module + SE | 81.92 | 20.21 |
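The final ablation row adds squeeze-and-excitation (SE) channel attention (Hu et al.). As a rough illustration of what that component does, the PyTorch sketch below is a minimal generic SE block, not the authors' exact implementation; the module and variable names are assumptions.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation channel attention (Hu et al.)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)  # global average pool: one value per channel
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),  # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.squeeze(x).view(b, c)        # (B, C)
        w = self.excite(w).view(b, c, 1, 1)   # (B, C, 1, 1)
        return x * w                          # reweight feature maps channel-wise

# Example: recalibrate a 64-channel decoder feature map
feat = torch.randn(2, 64, 56, 56)
print(SEBlock(64)(feat).shape)  # torch.Size([2, 64, 56, 56])
```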
| Model Scale | DSC↑ | HD↓ | Aorta | Gallbladder | Kidney (L) | Kidney (R) | Liver | Pancreas | Spleen | Stomach |
|---|---|---|---|---|---|---|---|---|---|---|
| Base | 81.92 | 20.21 | 89.01 | 70.39 | 86.04 | 82.83 | 95.09 | 62.32 | 90.02 | 78.33 |
| Large | 82.69 | 16.41 | 89.80 | 73.67 | 86.09 | 82.08 | 95.01 | 65.68 | 92.81 | 76.41 |
Citation: AL Qurri, A.; Almekkawy, M. Improved UNet with Attention for Medical Image Segmentation. Sensors 2023, 23, 8589. https://doi.org/10.3390/s23208589