A Segmentation Algorithm of Colonoscopy Images Based on Multi-Scale Feature Fusion
Abstract
1. Introduction
2. Methods
2.1. Proposed Network Structure
2.2. Network Backbone Extraction
2.3. Cross Extraction Module
2.4. Multi-Proportion Fusion Module
3. Experiment
3.1. Data Preprocessing
3.2. Experimental Details
3.3. Loss Function
3.4. Experimental Metrics
3.5. Experimental Results
3.6. Ablation Experiment
3.6.1. Effect of Activation Function and Batch Normalization (BN) on Segmentation Results
3.6.2. Effect of CEM on Segmentation Results
3.6.3. Effect of MPFM on Segmentation Results
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Bray, F.; Ferlay, J.; Soerjomataram, I.; Siegel, R.L.; Torre, L.A.; Jemal, A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 2018, 68, 394–424.
2. Sung, H.; Ferlay, J.; Siegel, R.L.; Laversanne, M.; Soerjomataram, I.; Jemal, A.; Bray, F. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J. Clin. 2021, 71, 209–249.
3. Winawer, S.J.; Zauber, A.G.; Ho, M.N.; O’Brien, M.J.; Gottlieb, L.S.; Sternberg, S.S.; Waye, J.D.; Schapiro, M.; Bond, J.H.; Panish, J.F.; et al. Prevention of Colorectal Cancer by Colonoscopic Polypectomy. N. Engl. J. Med. 1993, 329, 1977–1981.
4. Leufkens, A.M.; van Oijen, M.G.H.; Vleggaar, F.P.; Siersema, P.D. Factors influencing the miss rate of polyps in a back-to-back colonoscopy study. Endoscopy 2012, 44, 470–475.
5. Dawwas, M.F. Adenoma Detection Rate and Risk of Colorectal Cancer and Death. N. Engl. J. Med. 2014, 370, 2539–2541.
6. Mamonov, A.V.; Figueiredo, I.N.; Figueiredo, P.N.; Tsai, Y.-H.R. Automated Polyp Detection in Colon Capsule Endoscopy. IEEE Trans. Med. Imaging 2014, 33, 1488–1502.
7. Akbari, M.; Mohrekesh, M.; Nasr-Esfahani, E.; Soroushmehr, S.M.R.; Karimi, N.; Samavi, S.; Najarian, K. Polyp Segmentation in Colonoscopy Images Using Fully Convolutional Network. In Proceedings of the 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Honolulu, HI, USA, 18–21 July 2018.
8. Long, J.; Shelhamer, E.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; Volume 39, pp. 640–651.
9. Zhao, H.; Shi, J.; Qi, X.; Wang, X.; Jia, J. Pyramid Scene Parsing Network. In Proceedings of the 30th IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017.
10. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Munich, Germany, 5–9 October 2015.
11. Zhou, Z.; Siddiquee, M.M.R.; Tajbakhsh, N.; Liang, J. UNet++: A Nested U-Net Architecture for Medical Image Segmentation. In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, Proceedings of the 4th International Workshop, DLMIA 2018, and 8th International Workshop, ML-CDS 2018, Granada, Spain, 20 September 2018; Springer: Cham, Switzerland, 2018; Volume 11045, pp. 3–11.
12. Jha, D.; Smedsrud, P.H.; Riegler, M.A.; Johansen, D.; de Lange, T.; Halvorsen, P.; Johansen, H.D. ResUNet++: An Advanced Architecture for Medical Image Segmentation. In Proceedings of the 21st IEEE International Symposium on Multimedia (ISM), San Diego, CA, USA, 9–11 December 2019.
13. Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2020; Volume 42, pp. 2011–2023.
14. He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1904–1916.
15. Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. In Proceedings of the 15th European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018.
16. Feng, S.; Zhao, H.; Shi, F.; Cheng, X.; Wang, M.; Ma, Y.; Xiang, D.; Zhu, W.; Chen, X. CPFNet: Context Pyramid Fusion Network for Medical Image Segmentation. IEEE Trans. Med. Imaging 2020, 39, 3008–3018.
17. Kang, J.; Gwak, J. Ensemble of Instance Segmentation Models for Polyp Segmentation in Colonoscopy Images. IEEE Access 2019, 7, 26440–26447.
18. He, K.; Gkioxari, G.; Dollar, P.; Girshick, R. Mask R-CNN. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 386–397.
19. Qadir, H.A.; Shin, Y.; Solhusvik, J.; Bergsland, J.; Aabakken, L.; Balasingham, I. Polyp Detection and Segmentation using Mask R-CNN: Does a Deeper Feature Extractor CNN Always Perform Better? In Proceedings of the 13th International Symposium on Medical Information and Communication Technology (ISMICT), Oslo, Norway, 8–10 May 2019.
20. Fan, D.-P.; Ji, G.-P.; Zhou, T.; Chen, G.; Fu, H.; Shen, J.; Shao, L. PraNet: Parallel Reverse Attention Network for Polyp Segmentation. In Proceedings of the 2020 International Conference on Medical Image Computing and Computer-Assisted Intervention, Lima, Peru, 4–8 October 2020; pp. 263–273.
21. Dong, B.; Wang, W.; Fan, D.P.; Li, J.; Fu, H.; Shao, L. Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers. arXiv 2021, arXiv:2108.06932.
22. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st Annual Conference on Neural Information Processing Systems (NIPS), Long Beach, CA, USA, 4–9 December 2017.
23. Lou, A.; Guan, S.; Ko, H.; Loew, M.H. CaraNet: Context axial reverse attention network for segmentation of small medical objects. In Proceedings of the SPIE Medical Imaging 2022: Image Processing, San Diego, CA, USA, 20 February–28 March 2022.
24. Zhang, Y.; Liu, H.; Hu, Q. TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention 2021, Strasbourg, France, 27 September–1 October 2021; pp. 14–24.
25. Srivastava, A.; Jha, D.; Chanda, S.; Pal, U.; Johansen, H.; Johansen, D.; Riegler, M.; Ali, S.; Halvorsen, P. MSRF-Net: A Multi-Scale Residual Fusion Network for Biomedical Image Segmentation. IEEE J. Biomed. Health Inform. 2021, 26, 2252–2263.
26. Srivastava, A.; Chanda, S.; Jha, D.; Pal, U.; Ali, S. GMSRF-Net: An improved generalizability with global multi-scale residual fusion network for polyp segmentation. arXiv 2021, arXiv:2111.10614.
27. Jiang, D.; Sun, B.; Su, S.; Zuo, Z.; Wu, P.; Tan, X. FASSD: A Feature Fusion and Spatial Attention-Based Single Shot Detector for Small Object Detection. Electronics 2020, 9, 1536.
28. Wang, J.; Huang, Q.; Tang, F.; Meng, J.; Su, J.; Song, S. Stepwise Feature Fusion: Local Guides Global. arXiv 2022, arXiv:2203.03635.
29. Zhang, Z.; Liu, Q.; Wang, Y. Road Extraction by Deep Residual U-Net. IEEE Geosci. Remote Sens. Lett. 2018, 15, 749–753.
30. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016.
31. Ramachandran, P.; Zoph, B.; Le, Q.V. Searching for Activation Functions. arXiv 2017, arXiv:1710.05941.
32. Jha, D.; Smedsrud, P.H.; Riegler, M.A.; Halvorsen, P.; Lange, T.D.; Johansen, D.; Johansen, H.D. Kvasir-SEG: A Segmented Polyp Dataset. In Proceedings of the 26th International Conference on MultiMedia Modeling (MMM), Daejeon, Korea, 5–8 January 2020.
33. Bernal, J.; Sánchez, F.J.; Fernández-Esparrach, G.; Gil, D.; Rodríguez, C.; Vilariño, F. WM-DOVA maps for accurate polyp highlighting in colonoscopy: Validation vs. saliency maps from physicians. Comput. Med. Imaging Graph. 2015, 43, 99–111.
34. Huang, C.H.; Wu, H.Y.; Lin, Y.L. HarDNet-MSEG: A Simple Encoder-Decoder Polyp Segmentation Neural Network that Achieves over 0.9 Mean Dice and 86 FPS. arXiv 2021, arXiv:2101.07172.
35. Qin, X.; Zhang, Z.; Huang, C.; Gao, C.; Dehghan, M.; Jagersand, M. BASNet: Boundary-Aware Salient Object Detection. In Proceedings of the 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–21 June 2019.
36. Mattyus, G.; Luo, W.; Urtasun, R. DeepRoadMapper: Extracting Road Topology from Aerial Images. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017.
37. De Boer, P.T.; Kroese, D.P.; Mannor, S.; Rubinstein, R.Y. A Tutorial on the Cross-Entropy Method. Ann. Oper. Res. 2005, 134, 19–67.
38. Wang, Z.; Simoncelli, E.P.; Bovik, A.C. Multiscale structural similarity for image quality assessment. In Proceedings of the Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, Pacific Grove, CA, USA, 9–12 November 2003.
39. Graham, S.; Vu, Q.D.; Raza, S.E.A.; Azam, A.; Tsang, Y.W.; Kwak, J.T.; Rajpoot, N. Hover-Net: Simultaneous segmentation and classification of nuclei in multi-tissue histology images. Med. Image Anal. 2019, 58, 101563.
Model | Strength | Weakness |
---|---|---|
Unet [10], Unet++ [11], ResUnet [29], ResUnet++ [12] | These four models can be trained end to end on relatively small training sets, make full use of context information, and produce good outputs. | Detailed information is lost during sampling in these four models, and the training process involves numerous repeated operations. |
PSPNet [9] | The pyramid pooling module (PPM) aggregates global context information. | Considerable detail is lost during sampling, which leaves the edges of the segmentation result imprecise. |
Mask R-CNN [18] | Segmentation builds on object detection and can achieve high accuracy. | The model must first generate regions of interest, then classify each object and regress its bounding box, which is usually slow. |
PraNet [20], CaraNet [23] | High-level features capture the rough position of polyp tissue, and the reverse attention module mines edge information to obtain accurate segmentation results. | These models focus mainly on edge information and ignore context information at different scales. |
Polyp-PVT [21], SSFormer-L [28] | A transformer encoder gives the model a receptive field that covers the whole image, so it makes full use of global context information. | The models do not capture enough local information, which affects the final segmentation result. |
Ours | The proposed model fully considers multi-scale context information and uses feature fusion modules to fuse features in different proportions. | The model includes modules that process shallow features, which may require more computing resources. |
Model | Accuracy | Recall | Precision | mIoU | mDice |
---|---|---|---|---|---|
Unet | 0.9341 | 0.9141 | 0.7709 | 0.8027 | 0.8179 |
ResUnet | 0.9074 | 0.8653 | 0.7095 | 0.7369 | 0.7877 |
Unet++ | 0.9404 | 0.9171 | 0.7998 | 0.8165 | 0.8211 |
ResUnet++ | 0.9389 | 0.9181 | 0.7976 | 0.8006 | 0.8132 |
PSPNet | 0.9346 | 0.9407 | 0.7478 | 0.8161 | 0.8091 |
Mask R-CNN | 0.9289 | 0.8794 | 0.7984 | 0.7899 | 0.7962 |
PraNet | 0.9431 | 0.9389 | 0.8583 | 0.8568 | 0.8981 |
Polyp-PVT | 0.9441 | 0.9387 | 0.8541 | 0.8621 | 0.9171 |
CaraNet | 0.9562 | 0.9326 | 0.8614 | 0.8627 | 0.9163 |
SSFormer-L | 0.9574 | 0.9384 | 0.8662 | 0.8696 | 0.9156 |
Ours | 0.9649 | 0.9401 | 0.9011 | 0.8873 | 0.9192 |
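The five columns above can be illustrated with a minimal sketch. It assumes the standard pixel-level definitions based on true/false positives and negatives, and it assumes that mIoU and mDice are per-image averages; the paper's own definitions (Section 3.4) are not reproduced in this excerpt, so treat both points as assumptions. All names (`Confusion`, `confusion`, `metrics`, `mean_metrics`) are hypothetical helpers, not the authors' code.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Sequence, Tuple


@dataclass
class Confusion:
    tp: int  # polyp pixels predicted as polyp
    fp: int  # background pixels predicted as polyp
    fn: int  # polyp pixels predicted as background
    tn: int  # background pixels predicted as background


def confusion(pred: Sequence[int], truth: Sequence[int]) -> Confusion:
    """Count pixel outcomes for two flattened binary masks (values must be 0 or 1)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = len(pred) - tp - fp - fn
    return Confusion(tp, fp, fn, tn)


def _safe_div(num: float, den: float) -> float:
    # Assumption: an undefined ratio (empty denominator) is reported as 0.0.
    return num / den if den else 0.0


def metrics(c: Confusion) -> Dict[str, float]:
    """Standard definitions; assumed, not taken from the paper."""
    return {
        "accuracy": _safe_div(c.tp + c.tn, c.tp + c.tn + c.fp + c.fn),
        "recall": _safe_div(c.tp, c.tp + c.fn),
        "precision": _safe_div(c.tp, c.tp + c.fp),
        "iou": _safe_div(c.tp, c.tp + c.fp + c.fn),
        "dice": _safe_div(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


def mean_metrics(
    pairs: Iterable[Tuple[Sequence[int], Sequence[int]]]
) -> Dict[str, float]:
    """Average each metric over (prediction, ground truth) pairs, one pair per image."""
    per_image = [metrics(confusion(p, t)) for p, t in pairs]
    if not per_image:
        raise ValueError("need at least one image")
    return {k: sum(m[k] for m in per_image) / len(per_image) for k in per_image[0]}
```

For example, `metrics(confusion([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0]))` counts two true positives, one false positive, one false negative, and two true negatives. Averaging per image rather than pooling all pixels is a design choice that gives every image equal weight regardless of polyp size.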
Model | Activation | BN | Accuracy | Recall | Precision | mIoU | mDice |
---|---|---|---|---|---|---|---|
CEM | NO | NO | 0.9606 | 0.9288 | 0.8936 | 0.8743 | 0.9003 |
CEM | NO | YES | 0.9619 | 0.9279 | 0.9225 | 0.8775 | 0.9076 |
CEM | ReLU | NO | 0.9615 | 0.9318 | 0.9014 | 0.8779 | 0.9075 |
CEM | ReLU | YES | 0.9599 | 0.9331 | 0.8951 | 0.8769 | 0.9008 |
CEM | SiLU | NO | 0.9649 | 0.9401 | 0.9011 | 0.8873 | 0.9192 |
CEM | SiLU | YES | 0.9605 | 0.9287 | 0.8963 | 0.8746 | 0.9011 |
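The two activation functions compared in this ablation have simple closed forms, sketched below for scalar inputs. ReLU is max(0, x) and SiLU is x times the logistic sigmoid of x (the function studied by Ramachandran et al. in the reference list). How the activation is wired into the CEM is not described in this excerpt, so the sketch covers only the functions themselves; the helper names are hypothetical.

```python
import math


def relu(x: float) -> float:
    """Rectified linear unit: passes positive inputs, zeroes out the rest."""
    return max(0.0, x)


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def silu(x: float) -> float:
    """SiLU: x * sigmoid(x). Smooth, and slightly negative for x < 0."""
    return x * sigmoid(x)


if __name__ == "__main__":
    for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(f"x={x:+.1f}  relu={relu(x):.4f}  silu={silu(x):.4f}")
```

Unlike ReLU, SiLU lets small negative values through and has a non-zero gradient for negative inputs.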
Model | Accuracy | Recall | Precision | mIoU | mDice |
---|---|---|---|---|---|
N/CEM | 0.9612 | 0.9308 | 0.9075 | 0.8784 | 0.9089 |
ASPP | 0.9612 | 0.9278 | 0.9119 | 0.8777 | 0.9029 |
CBAM | 0.9612 | 0.9336 | 0.8949 | 0.8785 | 0.9094 |
Ours | 0.9649 | 0.9401 | 0.9011 | 0.8873 | 0.9192 |
Model | Accuracy | Recall | Precision | mIoU | mDice |
---|---|---|---|---|---|
N/MPFM | 0.9596 | 0.9256 | 0.8966 | 0.8724 | 0.8957 |
GPG | 0.9592 | 0.9304 | 0.8964 | 0.8734 | 0.9015 |
CFM | 0.9614 | 0.9366 | 0.8935 | 0.8809 | 0.9113 |
Ours | 0.9649 | 0.9401 | 0.9011 | 0.8873 | 0.9192 |
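The ablation rows are easier to compare as differences against the configuration without the module. The sketch below copies the values from the CEM and MPFM tables and computes "Ours" minus a chosen baseline row for each of the five metric columns. It assumes that N/CEM and N/MPFM denote the network without the respective module; the dictionary and function names are hypothetical.

```python
from typing import Dict, Tuple

METRICS = ("accuracy", "recall", "precision", "mIoU", "mDice")

Row = Tuple[float, float, float, float, float]

# Values copied from the CEM ablation table.
CEM_ABLATION: Dict[str, Row] = {
    "N/CEM": (0.9612, 0.9308, 0.9075, 0.8784, 0.9089),
    "ASPP": (0.9612, 0.9278, 0.9119, 0.8777, 0.9029),
    "CBAM": (0.9612, 0.9336, 0.8949, 0.8785, 0.9094),
    "Ours": (0.9649, 0.9401, 0.9011, 0.8873, 0.9192),
}

# Values copied from the MPFM ablation table.
MPFM_ABLATION: Dict[str, Row] = {
    "N/MPFM": (0.9596, 0.9256, 0.8966, 0.8724, 0.8957),
    "GPG": (0.9592, 0.9304, 0.8964, 0.8734, 0.9015),
    "CFM": (0.9614, 0.9366, 0.8935, 0.8809, 0.9113),
    "Ours": (0.9649, 0.9401, 0.9011, 0.8873, 0.9192),
}


def gains(table: Dict[str, Row], baseline: str, target: str = "Ours") -> Dict[str, float]:
    """Return target minus baseline per metric, rounded to four decimals."""
    return {
        name: round(t - b, 4)
        for name, t, b in zip(METRICS, table[target], table[baseline])
    }


if __name__ == "__main__":
    print("CEM vs N/CEM:  ", gains(CEM_ABLATION, "N/CEM"))
    print("MPFM vs N/MPFM:", gains(MPFM_ABLATION, "N/MPFM"))
```

On these numbers, the full model improves mDice by 0.0103 over N/CEM and by 0.0235 over N/MPFM, while precision is 0.0064 lower than N/CEM.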
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yu, J.; Li, Z.; Xu, C.; Feng, B. A Segmentation Algorithm of Colonoscopy Images Based on Multi-Scale Feature Fusion. Electronics 2022, 11, 2501. https://doi.org/10.3390/electronics11162501