Deep Multi-Task Learning for an Autoencoder-Regularized Semantic Segmentation of Fundus Retina Images
Abstract
1. Introduction
- A novel encoder–decoder architecture based on the multi-task learning paradigm is proposed. The VQ-VAE module branch reconstructs input images to regularize the shared image encoder while generating and integrating hierarchical representations of the input image. This module not only alleviates the challenge caused by limited annotated data but also improves the representation ability of the model.
- An edge attention module is proposed to learn edge-focused feature representations via deep supervision, guiding the model toward the edge regions that are hardest to segment and improving its perception of edge information.
- Comprehensive experiments are conducted on three public datasets, and experimental results show that our methods can achieve state-of-the-art performance.
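The three contributions above combine into a single training objective: a segmentation loss, a reconstruction loss from the VQ-VAE branch, and an edge-supervision loss. A minimal numpy sketch of such a joint loss follows; the specific loss functions and the weights `lambda_rec` and `lambda_edge` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Pixel-wise binary cross-entropy between a probability map and a binary target."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))

def multi_task_loss(seg_pred, seg_gt, rec_pred, image, edge_pred, edge_gt,
                    lambda_rec=0.5, lambda_edge=0.5):
    """Joint objective combining the three branches: vessel segmentation,
    VQ-VAE image reconstruction (encoder regularization), and edge-map
    deep supervision. Loss choices and lambda weights are illustrative
    assumptions, not the paper's reported configuration."""
    l_seg = bce(seg_pred, seg_gt)                    # main segmentation task
    l_rec = float(np.mean((rec_pred - image) ** 2))  # reconstruction branch (MSE)
    l_edge = bce(edge_pred, edge_gt)                 # edge attention supervision
    return l_seg + lambda_rec * l_rec + lambda_edge * l_edge
```

A perfect prediction on all three tasks drives the loss toward zero, while the lambda weights trade off how strongly the auxiliary branches regularize the shared encoder.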
2. Related Work
3. Method
3.1. Network Architectures Overview
3.2. Edge Attention Module
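One common way to obtain targets for edge-focused deep supervision is to extract boundary pixels from the ground-truth vessel masks. A minimal numpy sketch under that assumption (the paper's exact edge-label procedure is not reproduced here):

```python
import numpy as np

def edge_map(mask):
    """Boundary pixels of a binary vessel mask: foreground pixels with at
    least one 4-neighbor in the background. Illustrative target for an
    edge-supervision branch; not necessarily the paper's exact procedure."""
    m = mask.astype(bool)
    padded = np.pad(m, 1, constant_values=False)
    # interior pixels have all four 4-neighbors set
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    # edge = foreground minus interior
    return (m & ~interior).astype(np.uint8)
```

Supervising an auxiliary head against such a map concentrates gradient signal on thin vessel boundaries, which are exactly the pixels most often missed by a plain segmentation loss.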
3.3. VQ-VAE Module
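At the core of any VQ-VAE is vector quantization: each encoder latent is snapped to its nearest codebook entry. A minimal numpy sketch of that lookup (codebook size and latent layout are illustrative; the straight-through gradient estimator and commitment loss of a full VQ-VAE are omitted):

```python
import numpy as np

def vector_quantize(z, codebook):
    """Nearest-neighbour codebook lookup at the heart of VQ-VAE: each
    latent vector in z (shape N x D) is replaced by its closest entry in
    the codebook (shape K x D). Returns quantized latents and indices."""
    # squared Euclidean distance between every latent and every code
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d.argmin(axis=1)
    return codebook[idx], idx
```

The discrete bottleneck forces the shared encoder to organize its features around a compact set of prototypes, which is one intuition for why the reconstruction branch regularizes the segmentation task.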
4. Experimental Results
4.1. Evaluation Datasets
- DRIVE: This dataset contains 40 fundus retina images, all acquired with a Canon CR5 non-mydriatic 3CCD camera at a 45-degree field of view (FOV) and subsequently cropped around the FOV. Seven of the images show signs of diabetic retinopathy. The dataset is split into two subsets: a training set (20 images) and a testing set (20 images).
- CHASEDB1: This dataset contains 28 images, each depicting a vascular patch, collected from the left and right eyes of fourteen children. All images were captured at a 30-degree FOV. Following [49], the first 20 images are used as the training set and the remaining 8 as the test set.
- STARE: This dataset contains 20 retinal fundus images, half of which show pathological signs. We used 20% of the images as the validation and test sets.
4.2. Evaluation Metric
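Sensitivity (SE), specificity (SP), accuracy (ACC), and area under the ROC curve (AUC) are the metrics reported in the results tables. The first three follow directly from the pixel-level confusion matrix; a minimal numpy sketch (the 0.5 threshold is illustrative):

```python
import numpy as np

def seg_metrics(pred, gt, threshold=0.5):
    """Sensitivity (SE), specificity (SP), and accuracy (ACC) computed
    from a probability map and a binary ground truth, as commonly
    reported for retinal vessel segmentation. AUC additionally requires
    ranking the raw probabilities (e.g. sklearn.metrics.roc_auc_score)."""
    p = np.asarray(pred) >= threshold
    g = np.asarray(gt).astype(bool)
    tp = np.sum(p & g)
    tn = np.sum(~p & ~g)
    fp = np.sum(p & ~g)
    fn = np.sum(~p & g)
    se = tp / (tp + fn)   # true positive rate over vessel pixels
    sp = tn / (tn + fp)   # true negative rate over background pixels
    acc = (tp + tn) / (tp + tn + fp + fn)
    return se, sp, acc
```

Because vessel pixels are a small fraction of each fundus image, SE is the metric most sensitive to missed thin vessels, while ACC and SP are dominated by the background class.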
4.3. Implementation Details
4.4. Segmentation Results
4.5. Ablation Study
4.6. Effectiveness of the VQ-VAE Module
4.7. Limitations
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Zana, F.; Klein, J.C. A multimodal registration algorithm of eye fundus images using vessels detection and Hough transform. IEEE Trans. Med. Imaging 1999, 18, 419–428.
- Sinthanayothin, C. Automated localization of the optic disc, fovea, and retinal blood vessels from digital colour fundus images. Br. J. Ophthalmol. 1999, 83, 231–238.
- Wu, H.; Wang, W.; Zhong, J.; Lei, B.; Wen, Z.; Qin, J. Scs-net: A scale and context sensitive network for retinal vessel segmentation. Med. Image Anal. 2021, 70, 102025.
- Li, D.; Rahardja, S. BSEResU-Net: An attention-based before-activation residual U-Net for retinal vessel segmentation. Comput. Methods Programs Biomed. 2021, 205, 106070.
- Mo, J.; Zhang, L. Multi-level deep supervised networks for retinal vessel segmentation. Int. J. Comput. Assist. Radiol. Surg. 2017, 12, 2181–2193.
- Nian, F.; Li, T.; Wu, X.; Gao, Q.; Li, F. Efficient near-duplicate image detection with a local-based binary representation. Multimed. Tools Appl. 2016, 75, 2435–2452.
- Li, T.; Yan, S.; Mei, T.; Hua, X.S.; Kweon, I.S. Image decomposition with multilabel context: Algorithms and applications. IEEE Trans. Image Process. 2010, 20, 2301–2314.
- Li, T.; Mei, T.; Yan, S.; Kweon, I.S.; Lee, C. Contextual decomposition of multi-label images. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 2270–2277.
- Nian, F.; Bao, B.K.; Li, T.; Xu, C. Multi-modal knowledge representation learning via webly-supervised relationships mining. In Proceedings of the 25th ACM International Conference on Multimedia, Mountain View, CA, USA, 23–27 October 2017; pp. 411–419.
- Zhang, J.; Liu, S.; Yan, H.; Li, T.; Mao, R.; Liu, J. Predicting voxel-level dose distributions for esophageal radiotherapy using densely connected network with dilated convolutions. Phys. Med. Biol. 2020, 65, 205013.
- Jiang, D.; Yan, H.; Chang, N.; Li, T.; Mao, R.; Du, C.; Guo, B.; Liu, J. Convolutional neural network-based dosimetry evaluation of esophageal radiation treatment planning. Med. Phys. 2020, 47, 4735–4742.
- Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer: Cham, Switzerland, 2015; pp. 234–241.
- Çiçek, Ö.; Abdulkadir, A.; Lienkamp, S.S.; Brox, T.; Ronneberger, O. 3D U-Net: Learning dense volumetric segmentation from sparse annotation. In International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer: Cham, Switzerland, 2016; pp. 424–432.
- Khened, M.; Kollerathu, V.A.; Krishnamurthi, G. Fully convolutional multi-scale residual DenseNets for cardiac segmentation and automated cardiac diagnosis using ensemble of classifiers. Med. Image Anal. 2019, 51, 21–45.
- Fu, H.; Xu, Y.; Lin, S.; Kee Wong, D.W.; Liu, J. Deepvessel: Retinal vessel segmentation via deep learning and conditional random field. In International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer: Cham, Switzerland, 2016; pp. 132–139.
- Alom, M.Z.; Hasan, M.; Yakopcic, C.; Taha, T.M.; Asari, V.K. Recurrent residual convolutional neural network based on u-net (r2u-net) for medical image segmentation. arXiv 2018, arXiv:1802.06955.
- Oktay, O.; Schlemper, J.; Folgoc, L.L.; Lee, M.; Heinrich, M.; Misawa, K.; Mori, K.; McDonagh, S.; Hammerla, N.Y.; Kainz, B.; et al. Attention u-net: Learning where to look for the pancreas. arXiv 2018, arXiv:1804.03999.
- Gu, Z.; Cheng, J.; Fu, H.; Zhou, K.; Hao, H.; Zhao, Y.; Zhang, T.; Gao, S.; Liu, J. Ce-net: Context encoder network for 2d medical image segmentation. IEEE Trans. Med. Imaging 2019, 38, 2281–2292.
- Zhang, J.; Zhang, Y.; Xu, X. Pyramid u-net for retinal vessel segmentation. In Proceedings of the ICASSP 2021–2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; pp. 1125–1129.
- Staal, J.; Abràmoff, M.D.; Niemeijer, M.; Viergever, M.A.; Van Ginneken, B. Ridge-based vessel segmentation in color images of the retina. IEEE Trans. Med. Imaging 2004, 23, 501–509.
- Owen, C.G.; Rudnicka, A.R.; Mullen, R.; Barman, S.A.; Monekosso, D.; Whincup, P.H.; Ng, J.; Paterson, C. Measuring retinal vessel tortuosity in 10-year-old children: Validation of the computer-assisted image analysis of the retina (CAIAR) program. Investig. Ophthalmol. Vis. Sci. 2009, 50, 2004–2010.
- Hoover, A.; Kouznetsova, V.; Goldbaum, M. Locating blood vessels in retinal images by piecewise threshold probing of a matched filter response. IEEE Trans. Med. Imaging 2000, 19, 203–210.
- Liskowski, P.; Krawiec, K. Segmenting retinal blood vessels with deep neural networks. IEEE Trans. Med. Imaging 2016, 35, 2369–2380.
- Samuel, P.M.; Veeramalai, T. VSSC Net: Vessel specific skip chain convolutional network for blood vessel segmentation. Comput. Methods Programs Biomed. 2021, 198, 105769.
- Soomro, T.A.; Afifi, A.J.; Gao, J.; Hellwich, O.; Khan, M.A.; Paul, M.; Zheng, L. Boosting sensitivity of a retinal vessel segmentation algorithm with convolutional neural network. In Proceedings of the 2017 International Conference on Digital Image Computing: Techniques and Applications (DICTA), Sydney, Australia, 29 November–1 December 2017; pp. 1–8.
- Wu, A.; Xu, Z.; Gao, M.; Buty, M.; Mollura, D.J. Deep vessel tracking: A generalized probabilistic approach via deep learning. In Proceedings of the 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI), Prague, Czech Republic, 13–16 April 2016; pp. 1363–1367.
- Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
- Atli, I.; Gedik, O.S. Sine-Net: A fully convolutional deep learning architecture for retinal blood vessel segmentation. Eng. Sci. Technol. Int. J. 2021, 24, 271–283.
- Li, W.; Zhang, M.; Chen, D. Fundus retinal blood vessel segmentation based on active learning. In Proceedings of the 2020 International Conference on Computer Information and Big Data Applications (CIBDA), Guiyang, China, 17–19 April 2020; pp. 264–268.
- Luo, Y.; Cheng, H.; Yang, L. Size-invariant fully convolutional neural network for vessel segmentation of digital retinal images. In Proceedings of the 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), Jeju, Republic of Korea, 13–15 December 2016; pp. 1–7.
- Sathananthavathi, V.; Indumathi, G. Encoder enhanced atrous (EEA) unet architecture for retinal blood vessel segmentation. Cogn. Syst. Res. 2021, 67, 84–95.
- Li, D.; Dharmawan, D.A.; Ng, B.P.; Rahardja, S. Residual u-net for retinal vessel segmentation. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 1425–1429.
- Lian, S.; Li, L.; Lian, G.; Xiao, X.; Luo, Z.; Li, S. A global and local enhanced residual u-net for accurate retinal vessel segmentation. IEEE/ACM Trans. Comput. Biol. Bioinform. 2019, 18, 852–862.
- Mishra, S.; Chen, D.Z.; Hu, X.S. A data-aware deep supervised method for retinal vessel segmentation. In Proceedings of the 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI), Iowa City, IA, USA, 3–7 April 2020; pp. 1254–1257.
- Laibacher, T.; Weyde, T.; Jalali, S. M2u-net: Effective and efficient retinal vessel segmentation for real-world applications. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–17 June 2019.
- Jin, Q.; Meng, Z.; Pham, T.D.; Chen, Q.; Wei, L.; Su, R. DUNet: A deformable network for retinal vessel segmentation. Knowl.-Based Syst. 2019, 178, 149–162.
- Li, L.; Verma, M.; Nakashima, Y.; Nagahara, H.; Kawasaki, R. Iternet: Retinal image segmentation utilizing structural redundancy in vessel networks. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Snowmass Village, CO, USA, 1–5 March 2020; pp. 3656–3665.
- Liu, W.; Yang, H.; Tian, T.; Cao, Z.; Pan, X.; Xu, W.; Jin, Y.; Gao, F. Full-Resolution Network and Dual-Threshold Iteration for Retinal Vessel and Coronary Angiograph Segmentation. IEEE J. Biomed. Health Inform. 2022, 26, 4623–4634.
- Zhou, Y.; Yu, H.; Shi, H. Study group learning: Improving retinal vessel segmentation trained with noisy labels. In International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer: Cham, Switzerland, 2021; pp. 57–67.
- Kamran, S.A.; Hossain, K.F.; Tavakkoli, A.; Zuckerbrod, S.L.; Sanders, K.M.; Baker, S.A. RV-GAN: Segmenting retinal vascular structure in fundus photographs using a novel multi-scale generative adversarial network. In International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer: Cham, Switzerland, 2021; pp. 34–44.
- Zhou, T.; Li, L.; Bredell, G.; Li, J.; Unkelbach, J.; Konukoglu, E. Volumetric memory network for interactive medical image segmentation. Med. Image Anal. 2023, 83, 102599.
- Zhou, T.; Wang, W.; Konukoglu, E.; Van Gool, L. Rethinking Semantic Segmentation: A Prototype View. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 2582–2593.
- Yang, B.; Zhang, X.; Chen, L.; Yang, H.; Gao, Z. Edge guided salient object detection. Neurocomputing 2017, 221, 60–71.
- Wu, Z.; Su, L.; Huang, Q. Stacked cross refinement network for edge-aware salient object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 7264–7273.
- Zhou, T.; Li, J.; Wang, S.; Tao, R.; Shen, J. Matnet: Motion-attentive transition network for zero-shot video object segmentation. IEEE Trans. Image Process. 2020, 29, 8326–8338.
- Myronenko, A. 3D MRI brain tumor segmentation using autoencoder regularization. In International MICCAI Brainlesion Workshop; Springer: Cham, Switzerland, 2018; pp. 311–320.
- Razavi, A.; Van den Oord, A.; Vinyals, O. Generating diverse high-fidelity images with vq-vae-2. Adv. Neural Inf. Process. Syst. 2019, 32, 14866–14876.
- Wu, Y.; Xia, Y.; Song, Y.; Zhang, Y.; Cai, W. NFN+: A novel network followed network for retinal vessel segmentation. Neural Netw. 2020, 126, 153–162.
- Li, Q.; Feng, B.; Xie, L.; Liang, P.; Zhang, H.; Wang, T. A cross-modality learning approach for vessel segmentation in retinal images. IEEE Trans. Med. Imaging 2015, 35, 109–118.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1026–1034.
DRIVE Dataset

Methods | SE (%) | SP (%) | ACC (%) | AUC (%)
---|---|---|---|---
U-Net [12] | 79.15 ± 0.23 | 98.08 ± 0.31 | 96.40 ± 0.13 | 97.64 ± 0.08
DeepVessel [15] | 78.83 ± 0.18 | 98.13 ± 0.23 | 96.09 ± 0.12 | 97.83 ± 0.05
R2U-Net [16] | 79.23 ± 0.56 | 98.03 ± 0.35 | 96.54 ± 0.07 | 98.02 ± 0.08
AttU-Net [17] | 78.82 ± 0.17 | 98.48 ± 0.35 | 96.49 ± 0.19 | 98.03 ± 0.07
CE-Net [18] | 80.15 ± 0.22 | 98.16 ± 0.19 | 96.59 ± 0.16 | 98.11 ± 0.09
IterNet [37] | 79.95 ± 0.26 | 98.26 ± 0.08 | 96.57 ± 0.17 | 98.13 ± 0.06
SGL [39] | 83.80 ± 0.09 | 98.34 ± 0.08 | 97.05 ± 0.13 | 98.86 ± 0.04
RV-GAN [40] | 79.27 ± 0.10 | 99.69 ± 0.11 | 97.90 ± 0.15 | 98.87 ± 0.05
Ours | 83.86 ± 0.13 | 98.37 ± 0.28 | 97.35 ± 0.08 | 98.82 ± 0.05
CHASEDB1 Dataset

Methods | SE (%) | SP (%) | ACC (%) | AUC (%)
---|---|---|---|---
U-Net [12] | 76.17 ± 0.86 | 98.61 ± 0.69 | 97.16 ± 0.25 | 97.92 ± 0.15
DeepVessel [15] | 75.84 ± 0.54 | 98.34 ± 0.54 | 97.18 ± 0.14 | 97.85 ± 0.13
R2U-Net [16] | 81.45 ± 0.71 | 98.40 ± 0.71 | 97.21 ± 0.13 | 98.01 ± 0.07
AttU-Net [17] | 77.21 ± 1.01 | 98.50 ± 0.98 | 97.26 ± 0.18 | 98.07 ± 0.06
CE-Net [18] | 80.42 ± 0.39 | 98.39 ± 0.33 | 97.23 ± 0.36 | 98.06 ± 0.09
IterNet [37] | 79.97 ± 1.55 | 98.47 ± 1.05 | 97.31 ± 0.24 | 98.26 ± 0.12
SGL [39] | 86.90 ± 0.24 | 98.43 ± 0.23 | 97.71 ± 0.19 | 99.20 ± 0.08
RV-GAN [40] | 81.99 ± 0.07 | 98.06 ± 0.13 | 96.97 ± 0.24 | 99.14 ± 0.03
Ours | 83.29 ± 0.64 | 98.65 ± 0.37 | 97.80 ± 0.07 | 98.98 ± 0.10
STARE Dataset

Methods | SE (%) | SP (%) | ACC (%) | AUC (%)
---|---|---|---|---
U-Net [12] | 78.39 ± 1.36 | 98.71 ± 0.96 | 96.88 ± 0.48 | 97.93 ± 0.15
DeepVessel [15] | 78.83 ± 0.94 | 98.14 ± 0.81 | 97.13 ± 0.41 | 98.14 ± 0.11
R2U-Net [16] | 78.69 ± 0.99 | 98.62 ± 0.56 | 96.97 ± 0.33 | 98.09 ± 0.09
AttU-Net [17] | 79.03 ± 1.06 | 98.56 ± 0.74 | 97.22 ± 0.45 | 98.22 ± 0.10
CE-Net [18] | 79.16 ± 0.86 | 98.53 ± 1.11 | 97.15 ± 0.25 | 98.17 ± 0.07
IterNet [37] | 80.86 ± 0.53 | 98.46 ± 0.68 | 97.23 ± 0.36 | 98.29 ± 0.07
RV-GAN [40] | 83.26 ± 0.27 | 98.64 ± 0.37 | 97.54 ± 0.15 | 98.87 ± 0.06
Ours | 81.35 ± 0.85 | 98.74 ± 0.48 | 97.54 ± 0.19 | 98.84 ± 0.05
Model | U-Net [12] | R2U-Net [16] | AttU-Net [17] | CE-Net [18] | IterNet [37] | SGL [39] | RV-GAN [40] | Ours |
---|---|---|---|---|---|---|---|---|
Parameters (M) | 7.8 | 8.3 | 7.2 | 14.4 | 8.6 | 15.5 | 14.8 | 8.8 |
Methods | SE (%) | SP (%) | ACC (%) | AUC (%)
---|---|---|---|---
Baseline | 79.35 ± 0.09 | 97.95 ± 0.11 | 96.16 ± 0.04 | 97.84 ± 0.03
Baseline + VVM | 82.31 ± 0.18 | 98.43 ± 0.25 | 97.24 ± 0.09 | 98.55 ± 0.06
Baseline + EAM | 81.14 ± 0.15 | 98.11 ± 0.15 | 96.87 ± 0.09 | 98.19 ± 0.04
Baseline + VVM + EAM | 83.86 ± 0.13 | 98.37 ± 0.28 | 97.35 ± 0.08 | 98.82 ± 0.05
Data amount = Full

Methods | SE (%) | SP (%) | ACC (%) | AUC (%)
---|---|---|---|---
Baseline | 79.35 ± 0.09 | 97.95 ± 0.11 | 96.16 ± 0.04 | 97.84 ± 0.03

Data amount = 1/2

Methods | SE (%) | SP (%) | ACC (%) | AUC (%)
---|---|---|---|---
Baseline | 74.42 ± 0.39 | 95.74 ± 0.31 | 93.78 ± 0.19 | 95.92 ± 0.08
Baseline + VVM | 78.62 ± 0.18 | 96.75 ± 0.23 | 95.28 ± 0.12 | 97.11 ± 0.07
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Jin, G.; Chen, X.; Ying, L. Deep Multi-Task Learning for an Autoencoder-Regularized Semantic Segmentation of Fundus Retina Images. Mathematics 2022, 10, 4798. https://doi.org/10.3390/math10244798