Skin Cancer Diagnosis Using VGG16 and Transfer Learning: Analyzing the Effects of Data Quality over Quantity on Model Efficiency
Abstract
1. Introduction
2. The State of the Art
3. Methods and Materials
3.1. Dataset
3.2. Methodology
3.2.1. Data Collection
3.2.2. Image Pre-Processing
- Converting to NumPy arrays
- Normalization
- Resizing images
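The three pre-processing steps listed above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the 224 × 224 target size (the default VGG16 input), the division by 255, the nearest-neighbour resizing, and the function name `preprocess` are all assumptions made for the example.

```python
import numpy as np

def preprocess(image, size=(224, 224)):
    """Convert one image to a float32 array in [0, 1] with a fixed height and width."""
    # Step 1: convert the input (list, PIL image, uint8 array, ...) to a NumPy array.
    arr = np.asarray(image, dtype=np.float32)
    # Step 2: normalize 8-bit pixel values from [0, 255] to [0, 1].
    arr = arr / 255.0
    # Step 3: resize with nearest-neighbour sampling (assumed; any interpolation would do).
    rows = np.arange(size[0]) * arr.shape[0] // size[0]
    cols = np.arange(size[1]) * arr.shape[1] // size[1]
    return arr[rows][:, cols]

# Hypothetical 450 x 600 RGB image filled with random 8-bit values.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(450, 600, 3), dtype=np.uint8)
x = preprocess(raw)
print(x.shape, x.dtype)
```

In practice a library resize with bilinear interpolation would usually replace the index lookup; the lookup is used here only to keep the sketch dependency-free beyond NumPy.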
3.2.3. Transfer Learning for VGG16
- Deleting the Top Layer
- Adding Custom Top Layer
- Convolutional Layers
- Max-Pooling Layers
- Fully Connected Layers
- ReLU: ReLU introduces non-linearity into the network, allowing it to learn complex patterns in the data.
- Softmax: This function converts the classification scores into probabilities, providing the final output for classification.
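A possible Keras sketch of the transfer-learning setup described above: VGG16 is loaded without its top layer and a custom head is stacked on it. The 100 kernels of size 3 × 3, `valid` padding, dropout of 0.75, softmax output and three classes come from the model comparison table; the 224 × 224 input size, the frozen base, the 256-unit dense layer, the pooling size, the loss function and the name `build_model` are assumptions for illustration and may differ from the authors' implementation.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(num_classes=3, input_shape=(224, 224, 3), weights="imagenet"):
    """VGG16 feature extractor plus a small custom classification head."""
    # Delete the top layer: load VGG16 without its fully connected classifier.
    base = tf.keras.applications.VGG16(
        include_top=False, weights=weights, input_shape=input_shape
    )
    base.trainable = False  # assumption: keep the pre-trained filters frozen

    # Add the custom top layer.
    model = models.Sequential([
        base,
        layers.Conv2D(100, (3, 3), padding="valid", activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),    # assumed pool size
        layers.Flatten(),
        layers.Dense(256, activation="relu"),     # hypothetical layer width
        layers.Dropout(0.75),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",          # assumed; matches one-hot labels
        metrics=["accuracy"],
    )
    return model
```

Calling `build_model()` downloads the ImageNet weights on first use; passing `weights=None` builds the same architecture with random weights, which is convenient for checking shapes.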
3.2.4. Training and Validation Model 1, Model 2, and Model 3
- Epochs = 40: The number of times the model will iterate over the entire training dataset.
- Batch_size = 32: The number of samples that will be propagated through the network at one time. After processing this batch, the model’s weights will be updated.
- Callbacks: Functions that are called at certain points during training, for example to save the model after each epoch or to stop training early if the validation loss stops improving.
- Verbosity mode = 1: This controls the verbosity of the output, and “1” means progress updates will be shown during training.
- Optimizer = Adam: Adam is a popular optimization algorithm for training machine learning models, especially deep learning models. “Adam” stands for adaptive moment estimation. It is an extension of the stochastic gradient descent (SGD) algorithm that computes adaptive learning rates for each parameter.
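To make the adaptive learning rate of Adam concrete, the sketch below applies the standard Adam update rule to a single scalar weight. The default values for `lr`, `beta1`, `beta2` and `eps` are the commonly used ones, not settings reported in the paper, and the toy objective is made up for illustration.

```python
import math

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Return the updated (param, m, v) after one Adam step for a scalar weight."""
    m = beta1 * m + (1 - beta1) * grad          # first moment: running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # second moment: running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for the zero initialization
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Toy problem: minimize f(w) = (w - 3)^2 starting from w = 0.
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 5001):
    grad = 2 * (w - 3)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.01)
print(round(w, 2))
```

Because the step is divided by the square root of the second-moment estimate, each parameter effectively receives its own learning rate, which is the property that distinguishes Adam from plain SGD.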
3.2.5. Comparison of Validation Accuracies
3.2.6. Testing
4. Metrics
4.1. Accuracy (ACC)
4.2. Loss
- $N$ is the number of samples;
- $M$ is the number of classes;
- $y_{ij}$ is a binary indicator (0 or 1) of whether class label $j$ is the correct classification for sample $i$;
- $p_{ij}$ is the predicted probability of sample $i$ being classified as class $j$.
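The loss formula itself is not reproduced in this section. The four definitions above match the standard categorical cross-entropy, so it is presumably of the following form, where $N$ is the number of samples, $M$ the number of classes, $y_{ij}$ the binary indicator and $p_{ij}$ the predicted probability; the symbol names are assumed here.

```latex
\mathrm{Loss} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{M} y_{ij} \, \log\left(p_{ij}\right)
```

Since $y_{ij}$ is 1 only for the correct class, each sample contributes the negative log-probability that the model assigned to its true label.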
4.3. Precision
4.4. Recall
4.5. F-measure
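As an illustration of the metrics named in Sections 4.1 and 4.3 to 4.5, the sketch below computes accuracy and per-class precision, recall and F-measure using their standard definitions. The function name `per_class_metrics` is hypothetical and the labels are made up; they are not results from the paper.

```python
def per_class_metrics(y_true, y_pred, labels):
    """Return overall accuracy and a dict of (precision, recall, f_measure) per class."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    report = {}
    for c in labels:
        tp = sum(t == c and p == c for t, p in pairs)   # true positives
        fp = sum(t != c and p == c for t, p in pairs)   # false positives
        fn = sum(t == c and p != c for t, p in pairs)   # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        f_measure = 2 * precision * recall / total if total else 0.0
        report[c] = (precision, recall, f_measure)
    return accuracy, report

# Made-up labels for illustration only.
y_true = ["bcc", "bcc", "mel", "mel", "nv", "nv", "nv", "nv"]
y_pred = ["bcc", "mel", "mel", "mel", "nv", "nv", "nv", "bcc"]
acc, rep = per_class_metrics(y_true, y_pred, ["bcc", "mel", "nv"])
print(acc, rep)
```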
4.6. Learning Curves
5. Results
5.1. A Comparison of the Three Models Using Validation Accuracy
5.2. Quality Insights
5.3. Results of Testing Phase for Model 2
5.4. Comparison with Related Work
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Rogers, H.W.; Weinstock, M.A.; Feldman, S.R.; Coldiron, B.M. Incidence estimate of nonmelanoma skin cancer (keratinocyte carcinomas) in the US population, 2012. JAMA Dermatol. 2015, 151, 1081–1086.
- American Cancer Society. Cancer Facts and Figures 2024. Available online: https://www.cancer.org/research/cancer-facts-statistics/all-cancer-facts-figures/2024-cancer-facts-figures.html (accessed on 17 January 2024).
- Mansouri, B.; Housewright, C. The treatment of actinic keratoses—The rule rather than the exception. J. Am. Acad. Dermatol. 2017, 153, 1200.
- Weller, M.; van den Bent, M.; Preusser, M.; Le Rhun, E.; Tonn, J.C.; Minniti, G.; Bendszus, M.; Balana, C.; Chinot, O.; Dirven, L.; et al. EANO guidelines on the diagnosis and treatment of diffuse gliomas of adulthood. Nat. Rev. Clin. Oncol. 2021, 18, 170–186.
- Cohen, A.; Thammasitboon, S.; Singhal, G.; Epner, P. Diagnostic Error. In Patient Safety; Agrawal, A., Bhatt, J., Eds.; Springer: Cham, Switzerland, 2023; pp. 225–239.
- Ahsan, M.M.; Luna, S.A.; Siddique, Z. Machine-Learning-Based Disease Diagnosis: A Comprehensive Review. Healthcare 2022, 10, 541.
- Galić, I.; Habijan, M.; Leventić, H.; Romić, K. Machine Learning Empowering Personalized Medicine: A Comprehensive Review of Medical Image Analysis Methods. Electronics 2023, 12, 4411.
- Groh, M.; Badri, O.; Daneshjou, R.; Koochek, A.; Harris, C.; Soenksen, L.R.; Doraiswamy, P.M.; Picard, R. Deep learning-aided decision support for diagnosis of skin disease across skin tones. Nat. Med. 2024, 30, 573–583.
- Singh, B.; Malhotra, H.; Kumar, D.; Mujtaba, S.F.; Upadhyay, A.K. Understanding Cellular and Molecular Events of Skin Aging and Cancer: An Integrative Perspective. In Skin Aging & Cancer; Dwivedi, A., Agarwal, N., Ray, L., Tripathi, A., Eds.; Springer: Singapore, 2019; pp. 27–46.
- Swathi, B.; Kannan, K.; Chakravarthi, S.S.; Ruthvik, G.; Avanija, J.; Reddy, C.C.M. Skin Cancer Detection using VGG16, InceptionV3 and ResUNet. In Proceedings of the 2023 4th International Conference on Electronics and Sustainable Communication Systems (ICESC), Coimbatore, India, 6–8 July 2023; pp. 812–818.
- Aljohani, K.; Turki, T. Automatic Classification of Melanoma Skin Cancer with Deep Convolutional Neural Networks. AI 2022, 3, 512–525.
- Lomas, A.; Leonardi-Bee, J.; Bath-Hextall, F. A systematic review of worldwide incidence of nonmelanoma skin cancer. Br. J. Dermatol. 2012, 166, 1069–1080.
- Christenson, L.J.; Borrowman, T.A.; Vachon, C.M.; Tollefson, M.M.; Otley, C.C.; Weaver, A.L.; Roenigk, R.K. Incidence of basal cell and squamous cell carcinomas in a population younger than 40 years. JAMA 2005, 294, 681–690.
- Yu, Z.; Jiang, X.; Zhou, F.; Qin, J.; Ni, D.; Chen, S.; Lei, B.; Wang, T. Melanoma recognition in Dermoscopy images via aggregated deep convolutional features. IEEE Trans. Biomed. Eng. 2019, 66, 1006–1016.
- Lodde, G.; Zimmer, L.; Livingstone, E.; Schadendorf, D.; Ugurel, S. Malignant melanoma. Hautarzt 2020, 71, 63–77.
- Gandini, S.; Sera, F.; Cattaruzza, M.S.; Pasquini, P.; Abeni, D.; Boyle, P.; Melchi, C.F. Meta-analysis of risk factors for cutaneous melanoma: I. Common and atypical naevi. Eur. J. Cancer 2005, 41, 28–44.
- Ker, J.; Wang, L.; Rao, J.; Lim, T. Deep Learning Applications in Medical Image Analysis. IEEE Access 2018, 6, 9375–9389.
- Greenspan, H.; van Ginneken, B.; Summers, R.M. Guest Editorial Deep Learning in Medical Imaging: Overview and Future Promise of an Exciting New Technique. IEEE Trans. Med. Imaging 2016, 35, 1153–1159.
- Ghanem, N.M.; Attallah, O.; Anwar, F.; Ismail, M.A. Artificial Intelligence in Cancer Diagnosis and Prognosis, Volume 2: Breast and Bladder Cancer; IOP Publishing: Bristol, UK, 2022.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems; 2012; pp. 1097–1105. Available online: https://dl.acm.org/doi/10.1145/3065386 (accessed on 1 August 2024).
- Dong, Y.; Hu, Z.; Uchimura, K.; Murayama, N. Driver inattention monitoring system for intelligent vehicles: A review. IEEE Trans. Intell. Transp. Syst. 2011, 12, 596–614.
- Vincent, P.; Larochelle, H.; Lajoie, I.; Bengio, Y.; Manzagol, P.-A. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res. 2010, 11, 3371–3408.
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Bengio, Y. Generative adversarial nets. In Advances in Neural Information Processing Systems; 2014; pp. 2672–2680. Available online: https://dl.acm.org/doi/10.1145/3422622 (accessed on 1 August 2024).
- Kingma, D.P.; Welling, M. Auto-encoding variational bayes. arXiv 2013, arXiv:1312.6114.
- Dinh, L.; Sohl-Dickstein, J.; Bengio, S. Density estimation using real nvp. arXiv 2016, arXiv:1605.08803.
- Lipton, Z.C.; Berkowitz, J.; Elkan, C. A critical review of recurrent neural networks for sequence learning. arXiv 2015, arXiv:1506.00019.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Advances in Neural Information Processing Systems; 2017; pp. 5998–6008. Available online: https://arxiv.org/abs/1706.03762 (accessed on 1 August 2024).
- Chamberlain, A.J.; Fritschi, L.; Kelly, J.W. Nodular melanoma: Patients’ perceptions of presenting features and implications for earlier detection. J. Am. Acad. Dermatol. 2003, 48, 694–701.
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
- Liu, S.; Liu, S.; Cai, W.; Pujol, S.; Kikinis, R.; Feng, D. Early diagnosis of Alzheimer’s disease with deep learning. In Proceedings of the International Symposium on Biomedical Imaging, Beijing, China, 29 April–2 May 2014; pp. 1015–1018.
- Brosch, T.; Tam, R. Manifold learning of brain MRIs by deep learning. Med. Image Comput. Comput. Assist. Interv. 2013, 16, 633–640.
- Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
- Williams, R.J.; Zipser, D. A learning algorithm for continually running fully recurrent neural networks. Neural Comput. 1989, 1, 270–280.
- Smolensky, P. Information Processing in Dynamical Systems: Foundations of Harmony Theory; Colorado University at Boulder, Department of Computer Science: Boulder, CO, USA, 1986.
- Hinton, G.E.; Salakhutdinov, R.R. Reducing the dimensionality of data with neural networks. Science 2006, 313, 504–507.
- Khan, A.; Sohail, A.; Zahoora, U.; Qureshi, A.S. A Survey of the Recent Architectures of Deep Convolutional Neural Networks. Artif. Intell. Rev. 2020, 53, 5455–5516.
- Nazari, S.; Garcia, R. Automatic Skin Cancer Detection Using Clinical Images: A Comprehensive Review. Life 2023, 13, 2123.
- Dildar, M.; Akram, S.; Irfan, M.; Khan, H.U.; Ramzan, M.; Mahmood, A.R.; Alsaiari, S.A.; Saeed, A.H.M.; Alraddadi, M.O.; Mahnashi, M.H. Skin Cancer Detection: A Review Using Deep Learning Techniques. Int. J. Environ. Res. Public Health 2021, 18, 5479.
- Naqvi, M.; Gilani, S.Q.; Syed, T.; Marques, O.; Kim, H.-C. Skin Cancer Detection Using Deep Learning—A Review. Diagnostics 2023, 13, 1911.
- Abadi, M. TensorFlow: Learning functions at scale. In Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, Nara, Japan, 18–24 September 2016; Volume 51, p. 1.
- Esteva, A.; Kuprel, B.; Novoa, R.A.; Ko, J.; Swetter, S.M.; Blau, H.M.; Thrun, S. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017, 542, 115–118.
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. In Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA, 7–9 May 2015.
- Madinakhon, R.; Mukhtorov, D.; Cho, Y.-I. Integrating Principal Component Analysis and Multi-Input Convolutional Neural Networks for Advanced Skin Lesion Cancer Classification. Appl. Sci. 2024, 14, 5233.
- Saini, A.; Guleria, K.; Sharma, S. Skin Cancer Classification Using Transfer Learning-Based Pre-Trained VGG 16 Model. In Proceedings of the 2023 IEEE International Conference on Computer Communication and Information Systems (ICCCIS), Greater Noida, India, 3–4 November 2023; pp. 305–310.
- Jiang, S.; Li, H.; Jin, Z. A Visually Interpretable Deep Learning Framework for Histopathological Image-Based Skin Cancer Diagnosis. IEEE J. Biomed. Health Inform. 2021, 25, 1483–1494.
- Khamsa, D.; Pascal, L.; Zakaria, B.; Lokman, M.; Zakaria, M.Y. Skin Cancer Diagnosis and Detection Using Deep Learning. In Proceedings of the 2023 International Conference on Electrical Engineering and Advanced Technology (ICEEAT), Batna, Algeria, 5–7 November 2023; pp. 1–6.
- Tschandl, P.; Rosendahl, C.; Kittler, H. The HAM10000 dataset: A large collection of multi-source dermatoscopic images of common pigmented skin lesions. Data 2018, 5, 180161.
Comparison Factor | Model 1 | Model 2 | Model 3 |
---|---|---|---|
CNN model | VGG16 | VGG16 | VGG16 |
Number of epochs | 40 | 40 | 40 |
Batch size | 32 | 32 | 32 |
Optimizer | Adam | Adam | Adam |
Number of images in training dataset | 14,454 | 13,232 | 10,232 |
Number of images in validation dataset | 1671 | 820 | 820 |
Types of skin cancers/lesions | BCC/MEL/NV | BCC/MEL/NV | BCC/MEL/NV |
Number of kernels | 100 | 100 | 100 |
Kernel size | 3 × 3 | 3 × 3 | 3 × 3 |
Padding | valid | valid | valid |
Dropout | 0.75 | 0.75 | 0.75 |
Activation function | Softmax | Softmax | Softmax |
Comparison Metrics | Model 1 | Model 2 | Model 3 |
---|---|---|---|
Validation accuracy | 86% | 94% | 93% |
Loss | 38% | 16% | 17% |
Ref | DL Model | Type of Cancer | Dataset | Accuracy |
---|---|---|---|---|
Saini et al. [44] | VGG16 | Melanoma | Kaggle | 84% |
Jiang et al. [45] | DRANet, ResNet50, InceptionV3, VGG16, VGG19 | 11 types (BCC, EC, S, EP, D, GA, N, LP, LMDF, ACD, and PG) | 1167 images | DRANet (86.8%), ResNet50 (85.5%), InceptionV3 (86.3%), VGG16 (82.1%), VGG19 (83.8%) |
Aljohani et al. [11] | GoogleNet, DenseNet201, ResNet50V2, VGG16, VGG19 | Melanoma | 7146 images | GoogleNet (76.08%), DenseNet201 (73.96%), ResNet50V2 (73.74%), VGG16 (74.68%), VGG19 (73.42%) |
Khamsa et al. [46] | VGG16 | 2 classes (benign/malignant) | 3297 images | 83% |
Our method | VGG16 | 3 types (MEL, NV, BCC) | 13,232 images | 84.5% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Djaroudib, K.; Lorenz, P.; Belkacem Bouzida, R.; Merzougui, H. Skin Cancer Diagnosis Using VGG16 and Transfer Learning: Analyzing the Effects of Data Quality over Quantity on Model Efficiency. Appl. Sci. 2024, 14, 7447. https://doi.org/10.3390/app14177447