ColpoClassifier: A Hybrid Framework for Classification of the Cervigrams
Abstract
1. Introduction
- Proposes a framework that extracts a hybrid feature fusion vector and then classifies cervigrams;
- Evaluates each feature extraction method individually and proposes a hybrid fusion vector combining GLCM, GLRLM, and HOG features, which classifies more accurately than any individual feature set;
- Builds a dataset by downloading individual labeled images from the WHO website and augmenting them with several operations; this dataset is made available to other researchers;
- Classifies the cervigrams using several machine learning classifiers;
- Evaluates the classification performance of these classifiers using several performance measures.
2. Related Work
3. ColpoClassifier: Proposed Hybrid Framework for Classification of Colposcopy Images
3.1. Gray-Level Run Length Matrix (GLRLM)
3.2. Gray-Level Co-Occurrence Matrix (GLCM)
3.3. Histogram of Oriented Gradients (HOG)
- (a) Pre-processing.
- (b) Image gradients calculation.
- (c) Histogram of gradients.
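The three steps above can be sketched in a few lines of plain Python. This is a simplified illustration, not the framework's implementation: the cell size of 8 pixels, the 9 unsigned orientation bins, the per-cell L2 normalization (instead of overlapping block normalization), and the function name `hog_descriptor` are all assumptions made for the example.

```python
import math

def hog_descriptor(image, cell_size=8, n_bins=9):
    """Simplified HOG for a grayscale image given as a list of rows."""
    rows, cols = len(image), len(image[0])
    # (a) Pre-processing: scale intensities to [0, 1].
    peak = max(max(row) for row in image) or 1
    img = [[v / peak for v in row] for row in image]
    bin_width = 180.0 / n_bins
    features = []
    for r0 in range(0, rows - cell_size + 1, cell_size):
        for c0 in range(0, cols - cell_size + 1, cell_size):
            hist = [0.0] * n_bins
            for r in range(r0, r0 + cell_size):
                for c in range(c0, c0 + cell_size):
                    # (b) Image gradients: differences of neighbors, clamped at borders.
                    gx = img[r][min(c + 1, cols - 1)] - img[r][max(c - 1, 0)]
                    gy = img[min(r + 1, rows - 1)][c] - img[max(r - 1, 0)][c]
                    magnitude = math.hypot(gx, gy)
                    angle = math.degrees(math.atan2(gy, gx)) % 180.0
                    # (c) Histogram of gradients: magnitude-weighted orientation vote.
                    hist[int(angle // bin_width) % n_bins] += magnitude
            # Normalize each cell histogram (assumed simplification).
            norm = math.sqrt(sum(h * h for h in hist)) + 1e-6
            features.extend(h / norm for h in hist)
    return features
```

For a 16 × 16 image this sketch yields 4 cells × 9 bins = 36 values; a production pipeline would more likely rely on an established library implementation.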
3.4. Feature Fusion
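A minimal sketch of feature-level fusion, assuming that the hybrid vector is formed by simple concatenation of the GLCM, GLRLM, and HOG vectors in that order. The function name `fuse_features` and the ordering are hypothetical choices for illustration.

```python
def fuse_features(glcm_features, glrlm_features, hog_features):
    """Concatenate three per-image feature vectors into one fusion vector."""
    blocks = (glcm_features, glrlm_features, hog_features)
    return [float(value) for block in blocks for value in block]

# Example with made-up numbers: 2 GLCM values, 1 GLRLM value, 3 HOG values.
fused = fuse_features([0.1, 0.2], [3.0], [0.5, 0.5, 0.0])
print(len(fused))  # length of the fused vector
```

Because the three descriptors can live on very different numeric scales, standardizing each block before concatenation may be worth considering for scale-sensitive classifiers.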
3.5. Classification
3.5.1. Naïve Bayes
3.5.2. Bayes Net
3.5.3. Random Tree
3.5.4. Random Forest
3.5.5. Decision Table
3.5.6. Logistic Regression
4. Dataset
Data Augmentation
5. Experiments and Results
5.1. Implementation Details
5.2. Performance Measures
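- (a) Accuracy (A) is the ratio of the number of correct predictions to all predictions made by the model and is given by $A = \frac{TP + TN}{TP + TN + FP + FN}$, where $TP$, $TN$, $FP$, and $FN$ are the numbers of true positives, true negatives, false positives, and false negatives.
- (b) Sensitivity (True Positive Rate) is the proportion of actual positives that are predicted correctly. It signifies the classifier’s ability to accurately identify disease-positive patients and is given by $\text{Sensitivity} = \frac{TP}{TP + FN}$.
- (c) Specificity (True Negative Rate) is the proportion of actual negatives that are correctly predicted as negative. It signifies the classifier’s ability to identify disease-negative patients and is given by $\text{Specificity} = \frac{TN}{TN + FP}$.
- (d) Precision (P) is the number of true positives divided by the total number of positive predictions, $P = \frac{TP}{TP + FP}$.
- (e) Recall (R) is the ratio of correctly classified positives to the total number of actual positives, $R = \frac{TP}{TP + FN}$.
- (f) Mean absolute error (MAE) is the average absolute difference between paired values expressing the same event, here the predicted and actual values: $\text{MAE} = \frac{1}{n}\sum_{i=1}^{n} |y_i - \hat{y}_i|$.
- (g) F1 measure (F1) is the harmonic mean of Precision and Recall, $F1 = \frac{2PR}{P + R}$.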
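The measures above can be computed directly from confusion-matrix counts. The following is a minimal sketch for the binary case; the function names and the example counts are illustrative and are not taken from the reported experiments.

```python
def classification_metrics(tp, tn, fp, fn):
    """Binary-classification measures from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # true positive rate, identical to recall
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    recall = sensitivity
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical counts: 40 TP, 30 TN, 10 FP, 20 FN.
print(classification_metrics(40, 30, 10, 20))
```

Note that sensitivity and recall are the same quantity in the binary case, so they coincide whenever both are computed for the same positive class.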
5.3. Results and Discussions
5.3.1. Dataset-I Results
- (a) GLRLM Feature Extraction for Dataset-I
- (b) GLCM Feature Extraction for Dataset-I
- (c) HOG Feature Extraction for Dataset-I
- (d) Hybrid Feature Fusion for Dataset-I
5.3.2. Dataset-II Results
- (a) GLRLM Feature Extraction for Dataset-II
- (b) GLCM Feature Extraction for Dataset-II
- (c) HOG Feature Extraction for Dataset-II
- (d) Hybrid Feature Fusion for Dataset-II
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
GLRLM Feature | Formula |
---|---|
Long-run emphasis | |
Short-run emphasis | |
Gray-level nonuniformity | |
Run length nonuniformity | |
Run percentage | |
Short-run low-gray-level emphasis | |
Long-run low-gray-level emphasis | |
Long-run high-gray-level emphasis | |
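The run-length features named in this table can be illustrated with a short sketch. It assumes the commonly used definitions, in which each feature is normalized by the total number of runs; whether the framework uses exactly these normalizations is an assumption. Only horizontal (0°) runs are counted, and gray levels are indexed from 1 inside the emphasis terms to avoid division by zero. The function names are hypothetical.

```python
def glrlm_horizontal(image, levels):
    """Run-length matrix for 0-degree runs; entry [g][l-1] counts runs of gray level g and length l."""
    max_run = len(image[0])
    matrix = [[0] * max_run for _ in range(levels)]
    for row in image:
        run_value, run_length = row[0], 1
        for value in row[1:]:
            if value == run_value:
                run_length += 1
            else:
                matrix[run_value][run_length - 1] += 1
                run_value, run_length = value, 1
        matrix[run_value][run_length - 1] += 1  # close the last run of the row
    return matrix

def glrlm_features(matrix, n_pixels):
    """Eight run-length features under the assumed standard definitions."""
    n_runs = sum(sum(row) for row in matrix)
    sre = lre = srlgle = lrlgle = lrhgle = 0.0
    for i, row in enumerate(matrix, start=1):        # i: 1-based gray level
        for j, count in enumerate(row, start=1):     # j: run length
            sre += count / j ** 2
            lre += count * j ** 2
            srlgle += count / (i ** 2 * j ** 2)
            lrlgle += count * j ** 2 / i ** 2
            lrhgle += count * i ** 2 * j ** 2
    gray_sums = [sum(row) for row in matrix]
    run_sums = [sum(col) for col in zip(*matrix)]
    return {
        "long_run_emphasis": lre / n_runs,
        "short_run_emphasis": sre / n_runs,
        "gray_level_nonuniformity": sum(s ** 2 for s in gray_sums) / n_runs,
        "run_length_nonuniformity": sum(s ** 2 for s in run_sums) / n_runs,
        "run_percentage": n_runs / n_pixels,
        "short_run_low_gray_level_emphasis": srlgle / n_runs,
        "long_run_low_gray_level_emphasis": lrlgle / n_runs,
        "long_run_high_gray_level_emphasis": lrhgle / n_runs,
    }
```

In practice the image would first be quantized to a small number of gray levels, and matrices for several directions are often computed and averaged.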
GLCM Feature | Description |
---|---|
Maximum Probability | Strongest response of P, in the range [0, 1]: the maximum of P(i, j) over all i, j |
Correlation | Calculates the correlation between pixel i and neighboring pixel j, in the range [−1, 1] |
Contrast | Calculates the intensity contrast between neighboring pixels, in the range [0, 1] |
Energy | Sum of the squared matrix elements; equals 1 for a constant image |
Homogeneity | Measures how close the distribution of matrix elements is to the matrix diagonal, in the range [0, 1] |
Entropy | Calculates the randomness of the matrix |
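As an illustration of the co-occurrence features in this table, the sketch below builds a normalized, symmetric GLCM for a single pixel offset and derives the six features from it. The offset (one pixel to the right), the symmetric counting, and the use of the conventional unnormalized definitions (energy as the sum of squared entries, homogeneity with a 1 + |i − j| denominator) are assumptions; the function names are hypothetical.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalized symmetric co-occurrence matrix for the offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # symmetric counting
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Six texture features from a normalized GLCM p."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * v for i, _, v in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * v for _, j, v in cells))
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    return {
        "maximum_probability": max(v for _, _, v in cells),
        # Correlation is undefined for a constant image; 0.0 is returned by convention here.
        "correlation": cov / (sd_i * sd_j) if sd_i > 0 and sd_j > 0 else 0.0,
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
    }
```

With these unnormalized definitions, contrast can exceed 1 when more than two gray levels are used, so an additional normalization step would be needed to keep it in a fixed range.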
Operation | Parameter Values |
---|---|
Rotation | 15 degrees |
Width Shift | 0.2 |
Shear | 0.2 |
Height Shift | 0.2 |
Horizontal Flip | True |
Vertical Flip | True |
Fill Mode | Constant |
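A few of the operations in this table can be sketched without any imaging library. The example below covers only the flips and the width and height shifts with constant fill; rotation and shear are omitted for brevity. It assumes that a shift value of 0.2 denotes a maximum fraction of the image dimension and that flips are applied with probability 0.5, which is how such parameters are commonly interpreted; the function names are hypothetical.

```python
import random

def horizontal_flip(image):
    return [row[::-1] for row in image]

def vertical_flip(image):
    return [row[:] for row in image[::-1]]

def width_shift(image, fraction, fill=0):
    """Shift columns by a fraction of the width; vacated pixels get the constant fill value."""
    cols = len(image[0])
    offset = int(round(fraction * cols))
    if offset >= 0:
        return [[fill] * offset + row[:cols - offset] for row in image]
    return [row[-offset:] + [fill] * (-offset) for row in image]

def height_shift(image, fraction, fill=0):
    """Shift rows by a fraction of the height; vacated rows get the constant fill value."""
    rows, cols = len(image), len(image[0])
    offset = int(round(fraction * rows))
    blank = [[fill] * cols for _ in range(abs(offset))]
    if offset >= 0:
        return blank + [row[:] for row in image[:rows - offset]]
    return [row[:] for row in image[-offset:]] + blank

def augment(image, rng, max_shift=0.2, fill=0):
    """Apply random flips and shifts to one grayscale image (list of rows)."""
    out = image
    if rng.random() < 0.5:
        out = horizontal_flip(out)
    if rng.random() < 0.5:
        out = vertical_flip(out)
    out = width_shift(out, rng.uniform(-max_shift, max_shift), fill)
    out = height_shift(out, rng.uniform(-max_shift, max_shift), fill)
    return out
```

Calling `augment(image, random.Random(0))` repeatedly with different seeds would produce several variants of the same cervigram while preserving its dimensions.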
Class | Number of Images |
---|---|
Abnormal images | 229 |
Normal images | 141 |
Total images | 370 |
Class | Number of Images |
---|---|
Abnormal images | 214 |
Normal images | 166 |
Total images | 380 |
Classification Algorithm | Accuracy | Sensitivity | Specificity | Precision | Recall | MAE | F1 Measure |
---|---|---|---|---|---|---|---|
Naïve Bayes | 49.45% | 0.49 | 0.36 | 0.63 | 0.50 | 0.50 | 0.56 |
Bayes Net | 45.94% | 0.45 | 0.35 | 0.65 | 0.45 | 0.48 | 0.53 |
Random Tree | 61.62% | 0.61 | 0.43 | 0.61 | 0.61 | 0.38 | 0.61 |
Random Forest | 68.11% | 0.68 | 0.40 | 0.67 | 0.68 | 0.40 | 0.67 |
Decision Table | 61.89% | 0.61 | 0.61 | 0.67 | 0.61 | 0.47 | 0.64 |
Logistic | 61.35% | 0.61 | 0.52 | 0.58 | 0.61 | 0.43 | 0.59 |
Classification Algorithm | Accuracy | Sensitivity | Specificity | Precision | Recall | MAE | F1 Measure |
---|---|---|---|---|---|---|---|
Naïve Bayes | 61.89% | 0.61 | 0.34 | 0.66 | 0.61 | 0.39 | 0.63 |
Bayes Net | 63.24% | 0.63 | 0.38 | 0.64 | 0.63 | 0.43 | 0.63 |
Random Tree | 62.70% | 0.62 | 0.41 | 0.62 | 0.62 | 0.37 | 0.62 |
Random Forest | 65.94% | 0.65 | 0.41 | 0.65 | 0.65 | 0.39 | 0.65 |
Decision Table | 59.45% | 0.59 | 0.56 | 0.55 | 0.59 | 0.46 | 0.57 |
Logistic | 59.45% | 0.59 | 0.49 | 0.58 | 0.59 | 0.42 | 0.58 |
Classification Algorithm | Accuracy | Sensitivity | Specificity | Precision | Recall | MAE | F1 Measure |
---|---|---|---|---|---|---|---|
Naïve Bayes | 64.86% | 0.64 | 0.40 | 0.64 | 0.64 | 0.42 | 0.64 |
Bayes Net | 63.51% | 0.63 | 0.49 | 0.61 | 0.63 | 0.44 | 0.62 |
Random Tree | 59.18% | 0.59 | 0.45 | 0.59 | 0.59 | 0.40 | 0.59 |
Random Forest | 69.72% | 0.69 | 0.42 | 0.69 | 0.69 | 0.40 | 0.69 |
Decision Table | 63.51% | 0.63 | 0.49 | 0.61 | 0.63 | 0.44 | 0.62 |
Logistic | 61.89% | 0.61 | 0.47 | 0.60 | 0.61 | 0.42 | 0.60 |
Classification Algorithm | Accuracy | Sensitivity | Specificity | Precision | Recall | MAE | F1 Measure |
---|---|---|---|---|---|---|---|
Naïve Bayes | 55.40% | 0.55 | 0.32 | 0.68 | 0.55 | 0.43 | 0.61 |
Bayes Net | 53.51% | 0.53 | 0.36 | 0.63 | 0.53 | 0.45 | 0.58 |
Random Tree | 65.40% | 0.65 | 0.39 | 0.65 | 0.65 | 0.34 | 0.65 |
Random Forest | 72.43% | 0.72 | 0.37 | 0.72 | 0.72 | 0.40 | 0.72 |
Decision Table | 61.89% | 0.61 | 0.52 | 0.59 | 0.61 | 0.44 | 0.60 |
Logistic | 62.97% | 0.63 | 0.41 | 0.63 | 0.63 | 0.38 | 0.63 |
Classification Algorithm | Accuracy | Sensitivity | Specificity | Precision | Recall | MAE | F1 Measure |
---|---|---|---|---|---|---|---|
Bayes Net | 63.68% | 0.63 | 0.45 | 0.67 | 0.63 | 0.36 | 0.65 |
Naïve Bayes | 57.36% | 0.57 | 0.37 | 0.63 | 0.57 | 0.43 | 0.60 |
Random Tree | 78.15% | 0.78 | 0.22 | 0.782 | 0.78 | 0.21 | 0.78 |
Random Forest | 85.52% | 0.85 | 0.16 | 0.856 | 0.85 | 0.26 | 0.85 |
Decision Table | 68.15% | 0.68 | 0.30 | 0.693 | 0.64 | 0.38 | 0.67 |
Logistic | 80.00% | 0.80 | 0.21 | 0.80 | 0.80 | 0.28 | 0.80 |
Classification Algorithm | Accuracy | Sensitivity | Specificity | Precision | Recall | MAE | F1 Measure |
---|---|---|---|---|---|---|---|
Bayes Net | 69.21% | 0.69 | 0.30 | 0.69 | 0.69 | 0.31 | 0.69 |
Naïve Bayes | 70.26% | 0.70 | 0.28 | 0.71 | 0.70 | 0.30 | 0.70 |
Random Tree | 69.21% | 0.69 | 0.31 | 0.69 | 0.69 | 0.30 | 0.69 |
Random Forest | 77.36% | 0.77 | 0.24 | 0.77 | 0.77 | 0.30 | 0.77 |
Decision Table | 72.63% | 0.72 | 0.31 | 0.72 | 0.72 | 0.37 | 0.72 |
Logistic | 74.47% | 0.74 | 0.27 | 0.74 | 0.74 | 0.31 | 0.74 |
Classification Algorithm | Accuracy | Sensitivity | Specificity | Precision | Recall | MAE | F1 Measure |
---|---|---|---|---|---|---|---|
Bayes Net | 63.68% | 0.63 | 0.45 | 0.67 | 0.63 | 0.36 | 0.65 |
Naïve Bayes | 56.29% | 0.56 | 0.47 | 0.55 | 0.56 | 0.46 | 0.55 |
Random Tree | 58.61% | 0.58 | 0.43 | 0.58 | 0.58 | 0.41 | 0.58 |
Random Forest | 65.29% | 0.65 | 0.40 | 0.65 | 0.65 | 0.44 | 0.65 |
Decision Table | 62.21% | 0.62 | 0.46 | 0.62 | 0.62 | 0.46 | 0.62 |
Logistic | 61.18% | 0.61 | 0.42 | 0.60 | 0.61 | 0.45 | 0.60 |
Classification Algorithm | Accuracy | Sensitivity | Specificity | Precision | Recall | MAE | F1 Measure |
---|---|---|---|---|---|---|---|
Bayes Net | 75.00% | 0.75 | 0.28 | 0.74 | 0.75 | 0.26 | 0.74 |
Naïve Bayes | 71.31% | 0.71 | 0.26 | 0.26 | 0.71 | 0.29 | 0.38 |
Random Tree | 74.47% | 0.74 | 0.27 | 0.74 | 0.74 | 0.25 | 0.74 |
Random Forest | 84.47% | 0.84 | 0.18 | 0.84 | 0.84 | 0.29 | 0.84 |
Decision Table | 75.00% | 0.75 | 0.29 | 0.74 | 0.75 | 0.34 | 0.74 |
Logistic | 79.47% | 0.79 | 0.21 | 0.79 | 0.79 | 0.21 | 0.79 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kalbhor, M.; Shinde, S. ColpoClassifier: A Hybrid Framework for Classification of the Cervigrams. Diagnostics 2023, 13, 1103. https://doi.org/10.3390/diagnostics13061103