Counteracting Data Bias and Class Imbalance—Towards a Useful and Reliable Retinal Disease Recognition System
Abstract
1. Introduction
2. Materials and Methods
2.1. Models
- A feature extractor, built mostly of convolutional layers, that captures increasingly abstract image features and compresses them into a vector called the feature embedding.
- A classifier, composed mainly of dense (fully connected) layers, that assigns a class to the feature-embedding vector.
2.2. Data Augmentation
2.3. Model Training
2.4. Pre-Training
2.5. Fine Tuning
2.6. Verification of Other Resampling Methods
3. Results
3.1. Dataset
3.2. Evaluation Criteria
3.3. Performance
3.4. Comparison of Resampling Methods
3.5. Comparison to Other Recent Models
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- WHO. Blindness and Vision Impairment. 2021. Available online: https://www.who.int/news-room/fact-sheets/detail/blindness-and-visual-impairment (accessed on 13 October 2022).
- Bourne, R.R.; Flaxman, S.R.; Braithwaite, T.; Cicinelli, M.V.; Das, A.; Jonas, J.B.; Zheng, Y. Magnitude, temporal trends, and projections of the global prevalence of blindness and distance and near vision impairment: A systematic review and meta-analysis. Lancet Glob. Health 2017, 5, e888–e897. [Google Scholar] [CrossRef]
- Buchan, J.C.; Norman, P.; Shickle, D.; Cassels-Brown, A.; MacEwen, C. Failing to plan and planning to fail. Can we predict the future growth of demand on UK Eye Care Services? Eye 2019, 33, 1029–1031. [Google Scholar] [CrossRef] [PubMed]
- Lee, P.P.; Hoskins, H.D., Jr.; Parke, D.W., III. Access to Care: Eye Care Provider Workforce Considerations in 2020. Arch. Ophthalmol. 2007, 125, 406–410. [Google Scholar] [CrossRef] [PubMed]
- Lin, D.; Xiong, J.; Liu, C.; Zhao, L.; Li, Z.; Yu, S.; Wu, X.; Ge, Z.; Hu, X.; Wang, B.; et al. Application of Comprehensive Artificial intelligence Retinal Expert (CARE) system: A national real-world evidence study. Lancet Digit. Health 2021, 3, e486–e495. [Google Scholar] [CrossRef] [PubMed]
- Burlina, P.M.; Joshi, N.; Pacheco, K.D.; Liu, T.Y.A.; Bressler, N.M. Assessment of deep generative models for high-resolution synthetic retinal image generation of age-related macular degeneration. JAMA Ophthalmol. 2019, 137, 258–264. [Google Scholar] [CrossRef]
- Gulshan, V.; Rajan, R.; Widner, K.; Wu, D.; Wubbels, P.; Rhodes, T.; Whitehouse, K.; Coram, M.; Corrado, G.; Ramasamy, K.; et al. Performance of a deep-learning algorithm vs manual grading for detecting diabetic retinopathy in India. JAMA Ophthalmol. 2019, 137, 987–993. [Google Scholar] [CrossRef] [PubMed]
- Milea, D.; Najjar, R.P.; Jiang, Z.; Ting, D.; Vasseneix, C.; Xu, X.; Biousse, V. Artificial intelligence to detect papilledema from ocular fundus photographs. N. Engl. J. Med. 2020, 382, 1687–1695. [Google Scholar] [CrossRef]
- Son, J.; Shin, J.Y.; Kim, H.D.; Jung, K.H.; Park, K.H.; Park, S.J. Development and validation of deep learning models for screening multiple abnormal findings in retinal fundus images. Ophthalmology 2020, 127, 85–94. [Google Scholar] [CrossRef]
- Taylor, H.R.; Keeffe, J.E. World blindness: A 21st century perspective. Br. J. Ophthalmol. 2001, 85, 261–266. [Google Scholar] [CrossRef]
- Bulut, B.; Kalın, V.; Güneş, B.B.; Khazhin, R. Deep Learning Approach For Detection Of Retinal Abnormalities Based On Color Fundus Images. In Proceedings of the 2020 Innovations in Intelligent Systems and Applications Conference (ASYU), Istanbul, Turkey, 15–17 October 2020; pp. 1–6. [Google Scholar]
- Chellaswamy, C.; Geetha, T.S.; Ramasubramanian, B.; Abirami, R.; Archana, B.; Bharathi, A.D. Optimized Convolutional Neural Network based Multiple Eye Disease Detection and Information Sharing System. In Proceedings of the 2022 6th International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, 25–27 May 2022; pp. 1105–1113. [Google Scholar]
- Gour, N.; Khanna, P. Multi-class multi-label ophthalmological disease detection using transfer learning based convolutional neural network. Biomed. Signal. Process. Control. 2021, 66, 102329. [Google Scholar] [CrossRef]
- Han, Y.; Li, W.; Liu, M.; Wu, Z.; Zhang, F.; Liu, X.; Tao, L.; Li, X.; Guo, X. Application of an Anomaly Detection Model to Screen for Ocular Diseases Using Color Retinal Fundus Images: Design and Evaluation Study. J. Med. Internet Res. 2021, 23, e27822. [Google Scholar] [CrossRef]
- Khan, S.; Tafshir, N.; Alam, K.N.; Dhruba, A.R.; Khan, M.M.; Albraikan, A.A.; Almalki, F.A. Deep Learning for Ocular Disease Recognition: An Inner-Class Balance. Comput. Intell. Neurosci. 2022, 2022, 1–12. [Google Scholar] [CrossRef] [PubMed]
- Li, B.; Chen, H.; Zhang, B.; Yuan, M.; Jin, X.; Lei, B.; Xu, J.; Gu, W.; Wong, D.C.S.; He, X.; et al. Development and evaluation of a deep learning model for the detection of multiple fundus diseases based on colour fundus photography. Br. J. Ophthalmol. 2022, 106, 1079–1086. [Google Scholar] [CrossRef] [PubMed]
- Muthukannan, P. Optimized convolution neural network based multiple eye disease detection. Comput. Biol. Med. 2022, 146, 105648. [Google Scholar]
- Rathakrishnan, N.; Raja, D. Optimized convolutional neural network-based comprehensive early diagnosis method for multiple eye disease recognition. J. Electron. Imaging 2022, 31, 043016. [Google Scholar] [CrossRef]
- Shanggong Medical Technology Co., Ltd. ODIR-5K. Available online: https://odir2019.grand-challenge.org/dataset/ (accessed on 3 October 2022).
- Vokinger, K.N.; Feuerriegel, S.; Kesselheim, A.S. Mitigating bias in machine learning for medicine. Commun. Med. 2021, 1, 1–3. [Google Scholar] [CrossRef] [PubMed]
- Ling, C.X.; Sheng, V.S. Class Imbalance Problem. In Encyclopedia of Machine Learning; Sammut, C., Webb, G.I., Eds.; Springer: Boston, MA, USA, 2011; Volume 10, p. 978. [Google Scholar]
- Lee, H.; Park, M.; Kim, J. Plankton classification on imbalanced large scale database via convolutional neural networks with transfer learning. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 3713–3717. [Google Scholar]
- Johnson, J.M.; Khoshgoftaar, T.M. Survey on deep learning with class imbalance. J. Big Data 2019, 6, 1–54. [Google Scholar] [CrossRef]
- Brad, M.D.; Feldman, H.; Alpa, S. Cataract. Available online: https://eyewiki.aao.org/Cataract (accessed on 3 October 2022).
- Liu, Z.; Mao, H.; Wu, C.-Y.; Feichtenhofer, C.; Darrell, T.; Xie, S. A convnet for the 2020s. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–24 June 2022; pp. 11976–11986. [Google Scholar]
- Radosavovic, I.; Kosaraju, R.P.; Girshick, R.; He, K.; Dollár, P. Designing network design spaces. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 10428–10436. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778. [Google Scholar]
- Horta, A.; Joshi, N.; Pekala, M.; Pacheco, K.D.; Kong, J.; Bressler, N.; Freund, D.E.; Burlina, P. A hybrid approach for incorporating deep visual features and side channel information with applications to AMD detection. In Proceedings of the 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), Cancun, Mexico, 18–21 December 2017; pp. 716–720. [Google Scholar]
- Islam, M.T.; Imran, S.A.; Arefeen, A.; Hasan, M.; Shahnaz, C. Source and camera independent ophthalmic disease recognition from fundus image using neural network. In Proceedings of the 2019 IEEE International Conference on Signal Processing, Information, Communication & Systems (SPICSCON), Dhaka, Bangladesh, 28–30 November 2019; pp. 59–63. [Google Scholar]
- Tan, J.H.; Bhandary, S.V.; Sivaprasad, S.; Hagiwara, Y.; Bagchi, A.; Raghavendra, U.; Rao, A.K.; Raju, B.; Shetty, N.S.; Gertych, A.; et al. Age-related macular degeneration detection using deep convolutional neural network. Future Gener. Comput. Syst. 2018, 87, 127–135. [Google Scholar] [CrossRef]
- Ikechukwu, A.V.; Murali, S.; Deepu, R.; Shivamurthy, R. ResNet-50 vs VGG-19 vs training from scratch: A comparative analysis of the segmentation and classification of Pneumonia from chest X-ray images. Glob. Transit. Proc. 2021, 2, 375–381. [Google Scholar] [CrossRef]
- Reddy, S.B.; Juliet, D.S. Transfer learning with ResNet-50 for malaria cell-image classification. In Proceedings of the 2019 International Conference on Communication and Signal Processing (ICCSP), Chennai, India, 4–6 April 2019; pp. 945–949. [Google Scholar]
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Guo, B. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 10012–10022. [Google Scholar]
- Xie, S.; Girshick, R.; Dollár, P.; Tu, Z.; He, K. Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1492–1500. [Google Scholar]
- Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.-C. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 4510–4520. [Google Scholar]
- Goyal, P.; Duval, Q.; Seessel, I.; Caron, M.; Misra, I.; Sagun, L.; Bojanowski, P. Vision models are more robust and fair when pretrained on uncurated images without supervision. arXiv 2022, arXiv:2202.08360. [Google Scholar]
- Azzuni, H.; Ridzuan, M.; Xu, M.; Yaqub, M. Color Space-based HoVer-Net for Nuclei Instance Segmentation and Classification. arXiv 2022, arXiv:2203.01940. [Google Scholar]
- Lihacova, I.; Bondarenko, A.; Chizhov, Y.; Uteshev, D.; Bliznuks, D.; Kiss, N.; Lihachev, A. Multi-Class CNN for Classification of Multispectral and Autofluorescence Skin Lesion Clinical Images. J. Clin. Med. 2022, 11, 2833. [Google Scholar] [CrossRef] [PubMed]
- Touvron, H.; Bojanowski, P.; Caron, M.; Cord, M.; El-Nouby, A.; Grave, E.; Jégou, H. Resmlp: Feedforward networks for image classification with data-efficient training. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 5314–5321. [Google Scholar] [CrossRef] [PubMed]
- Touvron, H.; Cord, M.; Douze, M.; Massa, F.; Sablayrolles, A.; Jégou, H. Training data-efficient image transformers & distillation through attention. In Proceedings of the International Conference on Machine Learning, PMLR, Virtual Event, 18–24 July 2021; pp. 10347–10357. [Google Scholar]
- Marcel, S.; Rodriguez, Y. Torchvision the machine-vision package of torch. In Proceedings of the 18th ACM international conference on Multimedia, Firenze, Italy, 25–29 October 2010; pp. 1485–1488. [Google Scholar]
- Althnian, A.; AlSaeed, D.; Al-Baity, H.; Samha, A.; Bin Dris, A.; Alzakari, N.; Elwafa, A.A.; Kurdi, H. Impact of dataset size on classification performance: An empirical evaluation in the medical domain. Appl. Sci. 2021, 11, 796. [Google Scholar] [CrossRef]
- Perez, L.; Wang, J. The effectiveness of data augmentation in image classification using deep learning. arXiv 2017, arXiv:1712.04621. [Google Scholar]
- Sajjad, M.; Khan, S.; Muhammad, K.; Wu, W.; Ullah, A.; Baik, S.W. Multi-grade brain tumor classification using deep CNN with extensive data augmentation. J. Comput. Sci. 2019, 30, 174–182. [Google Scholar] [CrossRef]
- Sedigh, P.; Sadeghian, R.; Masouleh, M.T. Generating synthetic medical images by using GAN to improve CNN performance in skin cancer classification. In Proceedings of the 2019 7th International Conference on Robotics and Mechatronics (ICRoM), IEEE, Tehran, Iran, 20–21 November 2019; pp. 497–502. [Google Scholar]
- Buslaev, A.; Iglovikov, V.I.; Khvedchenya, E.; Parinov, A.; Druzhinin, M.; Kalinin, A.A. Albumentations: Fast and flexible image augmentations. Information 2020, 11, 125. [Google Scholar] [CrossRef]
- Decencière, E.; Cazuguel, G.; Zhang, X.; Thibault, G.; Klein, J.-C.; Meyer, F.; Marcotegui, B.; Quellec, G.; Lamard, M.; Danno, R.; et al. TeleOphta: Machine learning and image processing methods for teleophthalmology. IRBM 2013, 34, 196–203. [Google Scholar] [CrossRef]
- DeVries, T.; Taylor, G.W. Improved regularization of convolutional neural networks with cutout. arXiv 2017, arXiv:1708.04552. [Google Scholar]
- Biewald, L. Experiment Tracking with Weights and Biases. Software available from wandb.com. 2020. Available online: https://www.wandb.com/ (accessed on 24 May 2023).
- Liu, L.; Jiang, H.; He, P.; Chen, W.; Liu, X.; Gao, J.; Han, J. On the variance of the adaptive learning rate and beyond. arXiv 2019, arXiv:1908.03265. [Google Scholar]
- Loshchilov, I.; Hutter, F. Sgdr: Stochastic gradient descent with warm restarts. arXiv 2016, arXiv:1608.03983. [Google Scholar]
- Chawla, N.V.; Japkowicz, N.; Kotcz, A. Special issue on learning from imbalanced data sets. ACM SIGKDD Explor. Newsl. 2004, 6, 1–6. [Google Scholar] [CrossRef]
- Van Hulse, J.; Khoshgoftaar, T.M.; Napolitano, A. Experimental perspectives on learning from imbalanced data. In Proceedings of the 24th International Conference on MACHINE Learning, Corvalis, OR, USA, 20–24 June 2007; pp. 935–942. [Google Scholar]
- Diaz-Pinto, A.; Morales, S.; Naranjo, V.; Köhler, T.; Mossi, J.M.; Navea, A. CNNs for automatic glaucoma assessment using fundus images: An extensive validation. Biomed. Eng. Online 2019, 18, 1–19. [Google Scholar] [CrossRef]
- Abràmoff, M.D.; Lou, Y.; Erginay, A.; Clarida, W.; Amelon, R.; Folk, J.C.; Niemeijer, M. Improved automated detection of diabetic retinopathy on a publicly available dataset through integration of deep learning. Investig. Ophthalmol. Vis. Sci. 2016, 57, 5200–5206. [Google Scholar] [CrossRef]
- Adal, K.M.; van Etten, P.G.; Martinez, J.P.; van Vliet, L.J.; Vermeer, K.A. Accuracy assessment of intra-and intervisit fundus image registration for diabetic retinopathy screening. Investig. Ophthalmol. Vis. Sci. 2015, 56, 1805–1812. [Google Scholar] [CrossRef]
- Holm, S.; Russell, G.; Nourrit, V.; McLoughlin, N. DR HAGIS—A fundus image database for the automatic extraction of retinal surface vessels from diabetic patients. J. Med. Imaging 2017, 4, 014503. [Google Scholar] [CrossRef] [PubMed]
- Pires, R.; Jelinek, H.F.; Wainer, J.; Valle, E.; Rocha, A. Advancing bag-of-visual-words representations for lesion classification in retinal images. PLoS ONE 2014, 9, e96814. [Google Scholar] [CrossRef]
- Drive. Digital Retinal Images for Vessel Extraction. Available online: https://drive.grand-challenge.org/ (accessed on 12 December 2022).
- Mahdi, H.; El Abbadi, N. Glaucoma Diagnosis Based on Retinal Fundus Image: A Review. Iraqi J. Sci. 2022, 63, 4022–4046. [Google Scholar] [CrossRef]
- Kaggle, E. Kaggle Diabetic Retinopathy Detection. 2015. Available online: https://www.kaggle.com/c/diabetic-retinopathy-detection/data (accessed on 1 December 2022).
- Almazroa, A.A.; Alodhayb, S.; Osman, E.; Ramadan, E.; Hummadi, M.; Dlaim, M.; Alkatee, M.; Raahemifar, K.; Lakshminarayanan, V. Retinal fundus images for glaucoma analysis: The RIGA dataset. In Proceedings of the Medical Imaging 2018: Imaging Informatics for Healthcare, Research, and Applications, SPIE, Houston, TX, USA, 10–15 February 2018; pp. 55–62. [Google Scholar]
- Orlando, J.I.; Fu, H.; Breda, J.B.; van Keer, K.; Bathula, D.R.; Diaz-Pinto, A.; Fang, R.; Heng, P.-A.; Kim, J.; Lee, J.; et al. REFUGE Challenge: A unified framework for evaluating automated methods for glaucoma assessment from fundus photographs. Med. Image Anal. 2020, 59, 101570. [Google Scholar] [CrossRef]
- Takahashi, H.; Tampo, H.; Arai, Y.; Inoue, Y.; Kawashima, H. Applying artificial intelligence to disease staging: Deep learning for improved staging of diabetic retinopathy. PLoS ONE 2017, 12, e0179790. [Google Scholar] [CrossRef]
- Abràmoff, M.D.; Folk, J.; Han, D.P.; Walker, J.D.; Williams, D.F.; Russell, S.; Massin, P.; Cochener, B.; Gain, P.; Tang, L.; et al. Automated analysis of retinal images for detection of referable diabetic retinopathy. JAMA Ophthalmol. 2013, 131, 351–357. [Google Scholar] [CrossRef] [PubMed]
- Li, L.; Xu, M.; Wang, X.; Jiang, L.; Liu, H. Attention based glaucoma detection: A large-scale database and CNN model. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 10571–10580. [Google Scholar]
- Batista, F.J.F.; Diaz-Aleman, T.; Sigut, J.; Alayon, S.; Arnay, R.; Angel-Pereira, D. Rim-one dl: A unified retinal image database for assessing glaucoma using deep learning. Image Anal. Stereol. 2020, 39, 161–167. [Google Scholar] [CrossRef]
- Niemeijer, M.; van Ginneken, B.; Cree, M.J.; Mizutani, A.; Quellec, G.; Sanchez, C.I.; Zhang, B.; Hornero, R.; Lamard, M.; Muramatsu, C.; et al. Retinopathy online challenge: Automatic detection of microaneurysms in digital color fundus photographs. IEEE Trans. Med. Imaging 2009, 29, 185–195. [Google Scholar] [CrossRef]
- Hoover, A.; Kouznetsova, V.; Goldbaum, M. Locating blood vessels in retinal images by piecewise threshold probing of a matched filter response. IEEE Trans. Med. Imaging 2000, 19, 203–210. [Google Scholar] [CrossRef]
- Farnell, D.; Hatfield, F.; Knox, P.; Reakes, M.; Spencer, S.; Parry, D.; Harding, S. Enhancement of blood vessels in digital fundus photographs via the application of multiscale line operators. J. Frankl. Inst. 2008, 345, 748–765. [Google Scholar] [CrossRef]
- Pachade, S.; Porwal, P.; Thulkar, D.; Kokare, M.; Deshmukh, G.; Sahasrabuddhe, V.; Giancardo, L.; Quellec, G.; Mériaudeau, F. Retinal fundus multi-disease image dataset (RFMiD): A dataset for multi-disease detection research. Data 2021, 6, 14. [Google Scholar] [CrossRef]
- Hicks, S.A.; Strümke, I.; Thambawita, V.; Hammou, M.; Riegler, M.A.; Halvorsen, P.; Parasa, S. On evaluation metrics for medical applications of artificial intelligence. Sci. Rep. 2022, 12, 1–9. [Google Scholar] [CrossRef]
- Korotcov, A.; Tkachenko, V.; Russo, D.P.; Ekins, S. Comparison of deep learning with multiple machine learning methods and metrics using diverse drug discovery data sets. Mol. Pharm. 2017, 14, 4462–4475. [Google Scholar] [CrossRef]
- Harding, S.P.; Broadbent, D.M.; Neoh, C.; White, M.C.; Vora, J. Sensitivity and specificity of photography and direct ophthalmoscopy in screening for sight threatening eye disease: The Liverpool diabetic eye study. BMJ 1995, 311, 1131. [Google Scholar] [CrossRef]
- Santini, A.M.; Voidăzan, S. Accuracy of Diagnostic Tests. J. Crit. Care Med. 2021, 7, 241–248. [Google Scholar] [CrossRef]
- Huang, J.; Ling, C.X. Using AUC and accuracy in evaluating learning algorithms. IEEE Trans. Knowl. Data Eng. 2005, 17, 299–310. [Google Scholar] [CrossRef]
- Alberg, J.; Park, J.W.; Hager, B.W.; Brock, M.V.; Diener-West, M. The use of “overall accuracy” to evaluate the validity of screening or diagnostic tests. J. Gen. Intern. Med. 2004, 19, 460–465. [Google Scholar] [CrossRef] [PubMed]
Name | Probability | Values |
---|---|---|
Rotate | 0.8 | [−90°, 90°] |
Horizontal flip | 0.5 | - |
Vertical flip | 0.5 | - |
Random brightness contrast | 0.5 | Brightness limit: 0.1; contrast limit: 0.15 |
Cutout | 0.5 | Number of holes: 20; maximum hole height: 11 px; maximum hole width: 11 px |
Split | # of Normal | # of Glaucoma | # of AMD | # of Diabetic Retinopathy | Total # of Samples |
---|---|---|---|---|---|
Pre-training | 82,626 | 0 | 0 | 30,592 | 113,218 |
Fine-tuning | 3787 | 3787 | 632 | 3787 | 11,993 |
ROS/RUS | 86,415 | 3787 | 632 | 34,379 | 125,211 |
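Random oversampling (ROS) and random undersampling (RUS), the baselines in the ROS/RUS row above and in Section 3.4, can be sketched in a few lines of pure Python. This is a minimal illustrative version, not the paper's implementation: ROS duplicates minority-class samples up to the majority-class count, while RUS discards majority-class samples down to the minority-class count.

```python
# Minimal sketch of random oversampling (ROS) and undersampling (RUS).
import random
from collections import defaultdict

def resample(samples, labels, mode="ros", seed=0):
    """Return (samples, labels) with every class resized to a common count:
    'ros' duplicates samples up to the largest class size,
    'rus' subsamples down to the smallest class size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for s, y in zip(samples, labels):
        by_class[y].append(s)
    sizes = [len(group) for group in by_class.values()]
    target = max(sizes) if mode == "ros" else min(sizes)
    out_samples, out_labels = [], []
    for y, group in by_class.items():
        if mode == "ros":
            # keep every original sample, then draw duplicates with replacement
            chosen = group + rng.choices(group, k=target - len(group))
        else:
            # draw without replacement to shrink the class
            chosen = rng.sample(group, target)
        out_samples.extend(chosen)
        out_labels.extend([y] * target)
    return out_samples, out_labels
```

The runtime gap reported in Section 3.4 follows directly from these definitions: ROS inflates the training set toward the majority-class size (here, tens of thousands of images per class), while RUS shrinks it to the smallest class.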
Dataset | N | GL | AMD | DR | Camera Models | Annotators |
---|---|---|---|---|---|---|
ACRIMA [54] | 309 | 396 | 0 | 0 | Topcon TRC non-mydriatic | Two glaucoma experts with 8 years of experience |
APTOS 2019 Blindness Detection Dataset [55] | 3733 | 0 | 0 | 1857 | Variety of cameras | Not available |
Cataract [56] | 300 | 101 | 0 | 100 | Not available | Not available |
DR HAGIS [57] | 0 | 10 | 10 | 10 | Topcon TRC-NW6s non-mydriatic, Topcon TRC-NW8 non-mydriatic, or Canon CR DGi non-mydriatic | Expert grader |
DR1, DR2 [58] | 895 | 0 | 0 | 1118 | Topcon TRC-50X mydriatic | Three and two medical specialists, respectively |
DRIVE [59] | 33 | 0 | 0 | 5 | Canon CR5 non-mydriatic 3CCD | Ophthalmology expert |
Machine learn for glaucoma [60] | 788 | 756 | 0 | 0 | Not available | Not available |
e-ophtha [47] | 116 | 0 | 0 | 121 | Not available | Ophthalmology experts |
Kaggle: EyePACS [61] | 65,343 | 0 | 0 | 23,359 | Variety of cameras | A panel of medical specialists |
BAIDU: iChallenge-AMD [62] | 311 | 0 | 89 | 0 | Not available | Not available |
REFUGE [63] | 360 | 40 | 0 | 0 | Zeiss Visucam 500 non-mydriatic | Seven glaucoma specialists |
Davis Grading of One and Concatenated Figures [64] | 6561 | 0 | 0 | 3378 | Nidek AFC-230 non-mydriatic | Specialist grader |
Longitudinal diabetic retinopathy screening data [65] | 0 | 0 | 0 | 1120 | Topcon TRC-NW65 non-mydriatic | Two graders |
Messidor-2 [66] | 1017 | 0 | 0 | 731 | Topcon TRC NW6 | Medical expert |
ODIR-5K [19] | 3098 | 312 | 280 | 1697 | Various cameras, such as Canon, Zeiss, and Kowa | Trained human readers with quality-control management |
LAG [67] | 3147 | 1711 | 0 | 0 | Not available | Glaucoma specialists |
RIGA [62] | 0 | 289 | 0 | 0 | Topcon TRC-50DX mydriatic | Six experienced ophthalmologists |
RIM-ONE DL [68] | 313 | 172 | 0 | 0 | Kowa WX 3D stereo non-mydriatic or Nidek AFC-210 non-mydriatic with a Canon EOS 5D Mark II body | Three experts |
ROC [69] | 0 | 0 | 0 | 100 | Topcon NW 100, NW 200, or Canon CR5-45NM | Retinal experts |
STARE [70] | 36 | 0 | 61 | 92 | Topcon TRV-50 | Ophthalmology experts |
ARIA [71] | 61 | 0 | 23 | 59 | Zeiss FF450+ mydriatic | Retinal expert |
RFMiD [72] | 669 | 0 | 169 | 632 | Topcon 3D OCT-2000, Kowa VX-10alfa (mydriatic and non-mydriatic two-in-one), and Topcon TRC-NW300 non-mydriatic | Two ophthalmologists |
TOTAL | 86,415 | 3787 | 632 | 34,379 | | |
Class | Metric | ResNet50 | RegNetY3_2gf | ConvNextTiny |
---|---|---|---|---|
Normal | F1-Score | 72.61 ± 1.86 | 72.15 ± 2.32 | 72.97 ± 2.60 |
| Sensitivity | 73.75 ± 3.49 | 73.75 ± 6.64 | 74.57 ± 3.94 |
| Specificity | 86.46 ± 1.64 | 85.99 ± 2.90 | 86.27 ± 1.76 |
| AUC | 90.53 ± 0.76 | 90.19 ± 0.77 | 90.64 ± 0.56 |
| Accuracy | 82.50 ± 1.27 | 82.17 ± 0.88 | 80.01 ± 1.10 |
Glaucoma | F1-Score | 95.22 ± 0.80 | 94.42 ± 0.83 | 94.83 ± 0.96 |
| Sensitivity | 95.64 ± 1.02 | 95.11 ± 1.09 | 95.54 ± 1.22 |
| Specificity | 97.57 ± 0.64 | 97.06 ± 0.51 | 97.25 ± 0.81 |
| AUC | 99.44 ± 0.18 | 99.30 ± 0.23 | 99.32 ± 0.17 |
| Accuracy | 92.78 ± 0.41 | 96.92 ± 0.66 | 97.20 ± 0.66 |
AMD | F1-Score | 81.78 ± 4.35 | 79.25 ± 4.21 | 82.98 ± 3.50 |
| Sensitivity | 84.01 ± 8.13 | 83.23 ± 6.86 | 84.02 ± 6.37 |
| Specificity | 98.82 ± 0.43 | 98.51 ± 0.47 | 98.97 ± 0.49 |
| AUC | 99.25 ± 0.34 | 98.99 ± 0.51 | 98.79 ± 0.83 |
| Accuracy | 97.91 ± 0.29 | 98.13 ± 0.31 | 98.14 ± 0.31 |
Diabetic Retinopathy | F1-Score | 72.32 ± 1.28 | 71.60 ± 2.21 | 72.96 ± 1.78 |
| Sensitivity | 70.56 ± 2.23 | 69.11 ± 6.67 | 70.69 ± 3.33 |
| Specificity | 88.65 ± 1.65 | 89.09 ± 3.30 | 89.36 ± 1.91 |
| AUC | 91.15 ± 0.65 | 90.97 ± 0.64 | 91.65 ± 0.87 |
| Accuracy | 82.63 ± 1.09 | 83.04 ± 0.94 | 80.66 ± 1.27 |
Average | Accuracy | 89.88 ± 7.53 | 90.15 ± 7.43 | 88.99 ± 8.74 |
| F1-Score | 80.48 ± 1.51 | 79.36 ± 1.60 | 80.93 ± 1.61 |
| Sensitivity | 80.99 ± 2.13 | 80.30 ± 2.03 | 81.20 ± 2.26 |
| Specificity | 92.88 ± 0.40 | 92.66 ± 0.39 | 92.96 ± 0.55 |
| AUC | 95.09 ± 0.39 | 94.87 ± 0.41 | 95.10 ± 0.36 |
Overall | Accuracy | 79.76 ± 1.39 | 79.53 ± 1.07 | 80.46 ± 1.48 |
Class | Metric | Two-Stage Learning (Ours) | RUS | ROS |
---|---|---|---|---|
Normal | F1-Score | 72.97 ± 2.60 | 63.52 ± 3.50 | 73.84 ± 1.36 |
| Sensitivity | 74.57 ± 3.94 | 63.27 ± 8.69 | 80.34 ± 5.50 |
| Specificity | 86.27 ± 1.76 | 83.75 ± 3.76 | 82.83 ± 3.79 |
| AUC | 90.64 ± 0.56 | 85.59 ± 0.67 | 90.23 ± 0.87 |
Glaucoma | F1-Score | 94.83 ± 0.96 | 92.36 ± 1.00 | 95.34 ± 1.28 |
| Sensitivity | 95.54 ± 1.22 | 91.08 ± 3.03 | 92.45 ± 3.07 |
| Specificity | 97.25 ± 0.81 | 97.17 ± 0.91 | 99.33 ± 0.45 |
| AUC | 99.32 ± 0.17 | 98.69 ± 0.18 | 99.46 ± 0.12 |
AMD | F1-Score | 82.98 ± 3.50 | 76.16 ± 3.84 | 84.44 ± 2.55 |
| Sensitivity | 84.02 ± 6.37 | 93.01 ± 3.90 | 75.40 ± 4.18 |
| Specificity | 98.97 ± 0.49 | 97.13 ± 0.72 | 99.82 ± 0.28 |
| AUC | 98.79 ± 0.83 | 99.05 ± 0.39 | 99.28 ± 0.27 |
Diabetic Retinopathy | F1-Score | 72.96 ± 1.78 | 63.23 ± 2.87 | 72.89 ± 1.59 |
| Sensitivity | 70.69 ± 3.33 | 62.43 ± 7.74 | 70.05 ± 4.84 |
| Specificity | 89.36 ± 1.91 | 84.09 ± 4.37 | 89.82 ± 2.74 |
| AUC | 91.65 ± 0.87 | 85.52 ± 0.84 | 91.41 ± 0.71 |
Average | F1-Score | 80.93 ± 1.61 | 73.82 ± 1.43 | 81.63 ± 1.31 |
| Sensitivity | 81.20 ± 2.26 | 77.45 ± 1.29 | 79.56 ± 1.38 |
| Specificity | 92.96 ± 0.55 | 90.54 ± 0.43 | 92.95 ± 0.46 |
| AUC | 95.10 ± 0.36 | 92.21 ± 0.41 | 95.10 ± 0.40 |
Overall | Accuracy | 80.46 ± 1.48 | 73.35 ± 1.26 | 80.65 ± 1.30 |
Technical | Runtime [s] | 1196.4 | 370 | 37,194.9 |
Paper | Class | F1-Score | Sensitivity | Specificity | AUC | Accuracy |
---|---|---|---|---|---|---|
Ours (ConvNextTiny) | Normal | 72.97 | 74.57 | 86.27 | 90.64 | 80.01 |
| Glaucoma | 94.83 | 95.54 | 97.25 | 99.32 | 97.20 |
| AMD | 82.98 | 84.02 | 98.97 | 98.79 | 98.14 |
| Diabetic Retinopathy | 72.96 | 70.69 | 89.36 | 91.65 | 80.66 |
Han et al. [14] | Normal | - | - | - | - | - |
| Glaucoma | - | 83.70 | 84.00 | 91.60 | 83.89 |
| AMD | - | 77.61 | 78.75 | 86.70 | 78.37 |
| Diabetic Retinopathy | - | 80.36 | 80.50 | 89.10 | 80.39 |
Bulut et al. [11] | Normal | - | - | - | - | - |
| Glaucoma | - | - | - | 81.10 | - |
| AMD | - | - | - | 96.30 | - |
| Diabetic Retinopathy | - | - | - | 87.10 | - |
Gour et al. [13] | Normal | - | 77.00 | 21.00 | - | 40.00 |
| Glaucoma | - | 40.00 | 60.00 | - | 54.00 |
| AMD | - | 6.00 | 93.00 | - | 88.00 |
| Diabetic Retinopathy | - | 5.00 | 94.00 | - | 89.00 |
Chellaswamy et al. [12] | Normal | 96.39 | 95.99 | 91.27 | - | 95.00 |
| Glaucoma | 96.43 | 94.95 | 96.32 | - | 96.00 |
| AMD | 93.96 | 99.01 | 94.98 | - | 96.38 |
| Diabetic Retinopathy | - | - | - | - | - |
Muthukannan et al. [17] | Normal | 94.09 | 95.65 | 98.56 | - | 99.20 |
| Glaucoma | 97.04 | 97.77 | 99.28 | - | 97.80 |
| AMD | 95.49 | 94.98 | 99.01 | - | 98.40 |
| Diabetic Retinopathy | 94.98 | 94.31 | 98.92 | - | 97.90 |
Khan et al. [15] | Normal | - | - | - | - | - |
| Glaucoma | 92.00 | 97.00 | - | - | - |
| AMD | 88.00 | 92.00 | - | - | - |
| Diabetic Retinopathy | 89.00 | 92.00 | - | - | - |
Li et al. [16] | Normal | - | 94.50 | 95.70 | 98.90 | - |
| Glaucoma | - | 80.40 | 93.40 | 95.30 | - |
| AMD | - | 85.80 | 93.90 | 97.60 | - |
| Diabetic Retinopathy | - | 80.40 | 89.70 | 95.00 | - |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Chłopowiec, A.R.; Karanowski, K.; Skrzypczak, T.; Grzesiuk, M.; Chłopowiec, A.B.; Tabakov, M. Counteracting Data Bias and Class Imbalance—Towards a Useful and Reliable Retinal Disease Recognition System. Diagnostics 2023, 13, 1904. https://doi.org/10.3390/diagnostics13111904