*3.2. Comparison Procedure Between Manual and Automatic Classification*

In a routine clinical procedure, the otolaryngologist evaluates a set of CE-NBI images of a patient and then identifies the patient's lesion as benign or malignant. For the manual classification in this work, the clinicians followed a similar routine, making a decision based on the set of images belonging to a patient. Since the automatic classification classifies an image rather than a patient, we made the following assumption in order to compare automatic with manual classification: if a given classifier correctly classifies more than half of the images of a patient, then the patient is counted as correctly classified by that classifier. Following this assumption, two procedures for comparing manual and automatic classification were proposed.
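The per-patient assumption above can be illustrated with a minimal sketch. The function name `patient_correct` and the string labels are hypothetical and chosen only for illustration; the rule itself (strictly more than half of the images correct) is the one stated in the text.

```python
from typing import Sequence


def patient_correct(image_predictions: Sequence[str], true_label: str) -> bool:
    """Return True if more than half of a patient's images were classified correctly.

    image_predictions: one predicted label per image of the patient.
    true_label: the patient's label from histopathology (hypothetical encoding).
    """
    if not image_predictions:
        raise ValueError("at least one image prediction is required")
    n_correct = sum(pred == true_label for pred in image_predictions)
    # Strict majority: a tie (exactly half correct) does not count as correct.
    return n_correct > len(image_predictions) / 2


# Hypothetical patient with five test images, three of them classified correctly.
predictions = ["benign", "benign", "malignant", "benign", "malignant"]
print(patient_correct(predictions, "benign"))
```

With an odd number of images per patient a tie cannot occur; for an even number, this sketch treats a tie as a misclassification, which is one reading of "more than half".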

The first comparison procedure compares both approaches based on the level of agreement or disagreement between clinicians when classifying a patient as benign or malignant. To this end, patients were divided into three categories:


The second comparison procedure aims to compare manual and automatic classifications in terms of their misclassification levels depending on the histopathologies. This evaluation was performed to analyze the histopathologies in benign and malignant groups that caused significant difficulties for otolaryngologists and then to see how the automatic approach behaves with these cases.

We divided the patients into the 15 groups presented in Table 1. For each histopathology, a misclassification percentage was computed per patient for the automatic and manual classification as follows:

• Misclassification percentage of all otolaryngologists per patient in each histopathology group:

$$\left(\frac{\text{Number of doctor(s) who misclassified the patients}}{\text{Total number of doctors} \times \text{Total number of patients}}\right) \times 100\tag{1}$$

where the total number of patients was the number of patients for the corresponding histopathology.

• Misclassification percentage of every classifier per patient in each histopathology group:

$$\left(\frac{\text{Number of misclassified patient(s)}}{\text{Total number of patients}}\right) \times 100\tag{2}$$

• Misclassification percentage of all classifiers per patient in each histopathology group:

$$\left(\frac{\text{Number of misclassified patient(s) by all classifiers}}{\text{Total number of classifiers} \times \text{Total number of patients}}\right) \times 100\tag{3}$$
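As a rough illustration, Equations (1) to (3) can be written as three small functions. The function and parameter names are hypothetical, and the sketch assumes that the first denominator term of Equation (3) is the number of classifiers, by analogy with the number of doctors in Equation (1). The numbers in the usage lines are invented and are not taken from the dataset.

```python
def manual_misclassification_pct(n_doctor_errors: int, n_doctors: int, n_patients: int) -> float:
    """Equation (1): wrong doctor decisions over all doctor-patient pairs in a group."""
    return 100.0 * n_doctor_errors / (n_doctors * n_patients)


def classifier_misclassification_pct(n_misclassified_patients: int, n_patients: int) -> float:
    """Equation (2): patients misclassified by one classifier over all patients in a group."""
    return 100.0 * n_misclassified_patients / n_patients


def all_classifiers_misclassification_pct(n_errors: int, n_classifiers: int, n_patients: int) -> float:
    """Equation (3): wrong decisions summed over all classifiers, over all classifier-patient pairs.

    Assumption: the denominator uses the number of classifiers.
    """
    return 100.0 * n_errors / (n_classifiers * n_patients)


# Invented example: a histopathology group with 3 patients, 6 doctors and 4 classifiers.
print(manual_misclassification_pct(n_doctor_errors=9, n_doctors=6, n_patients=3))
print(classifier_misclassification_pct(n_misclassified_patients=1, n_patients=3))
print(all_classifiers_misclassification_pct(n_errors=2, n_classifiers=4, n_patients=3))
```

Normalizing by the number of raters times the number of patients keeps the manual and the pooled automatic percentages on the same scale, so groups with different numbers of patients can be compared directly.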

#### **4. Results and Discussion**

Table 2 shows the overall performance of the manual and automatic classification. In the manual approach, otolaryngology specialists performed better than the otolaryngology residents. These results indicate that the interpretation of CE-NBI images based on vascular patterns is subjective and depends strongly on the otolaryngologist's experience.

For the automatic approach, RFC with a sensitivity of 0.846 and SVM with RBF kernel with a specificity of 0.981 showed better results in comparison to the other classifiers.

The overall specificity values of the otolaryngologists are low. This means that both groups had difficulties in visually distinguishing patients with benign histopathologies from those with malignant ones. This can be explained by the similarity between the vascular patterns of benign and malignant histopathologies, which cannot be distinguished easily. For instance, Papillomatosis is a benign histopathology with vascular patterns similar to those of malignant histopathologies, and this similarity leads clinicians to misclassify Papillomatosis as malignant on visual inspection. However, all four classifiers showed higher specificity than the otolaryngologists, which indicates that the automatic approach can overcome this problem.
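For reference, sensitivity and specificity can be computed from true and predicted labels as in the sketch below. It assumes that malignant is treated as the positive class, so that low specificity corresponds to benign cases being called malignant, as discussed above; the function name and the example labels are hypothetical.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str = "malignant"
) -> Tuple[float, float]:
    """Return (sensitivity, specificity), treating `positive` as the positive class."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # malignant correctly identified
            else:
                fn += 1  # malignant missed
        else:
            if pred == positive:
                fp += 1  # benign called malignant
            else:
                tn += 1  # benign correctly identified
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Invented labels: two malignant and four benign cases.
y_true = ["malignant", "malignant", "benign", "benign", "benign", "benign"]
y_pred = ["malignant", "benign", "benign", "malignant", "benign", "benign"]
print(sensitivity_specificity(y_true, y_pred))
```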


**Table 2.** General performance of manual and automatic approaches

Figure 2 shows the detailed results of the first comparison procedure, which compares both approaches based on the level of agreement or disagreement between clinicians when classifying a patient as benign or malignant. A first visual inspection shows that, in Category I, where all otolaryngologists classified the patients correctly, the individual classifiers misclassified 1 to 2 images for some patients. Nevertheless, based on the assumption made in Section 3.2, the automatic approach did not misclassify any patient in this category.

For the patients belonging to Category II, both the manual and the automatic approach showed higher misclassification levels than in Category I. In the automatic approach, several images belonging to a patient can be misclassified. However, when the automatic classification is considered per patient, a misclassification occurred for only one patient, and only two classifiers (SVM with polykernel and RFC) made it. The otolaryngologists, on the other hand, misclassified some patients at a high rate. For example, five clinicians misclassified patients p26, p34 and p72, while the classifiers classified these patients correctly. These patients were diagnosed with Papillomatosis and Hyperkeratosis, which are benign histopathologies. Figure 3a–c displays the PVC vascular patterns in the CE-NBI images of these patients. As pointed out in the introduction, the difference between PVC in benign and in malignant histopathologies is not visually evident for the otolaryngologist, which makes it considerably difficult for clinicians to distinguish benign from malignant cases based on the vascular patterns. Based on the results, the automatic approach was able to identify this difference and classify the patients correctly because it is capable of quantifying and differentiating these subtle differences. SVM with RBF did not show any misclassification per patient in this category.

For Category III, where all otolaryngologists misclassified the patients, SVM with RBF misclassified fewer images than the other three classifiers. Regarding the per-patient classification performed by the classifiers, misclassifications occurred for only two patients. In particular, three classifiers failed to classify patient p10 correctly. According to the histopathology, this patient presented Hyperkeratosis. A set of CE-NBI images of this case is presented in Figure 3d. The vascular patterns of Hyperkeratosis can vary notably from one patient to another. The CE-NBI dataset included 4 patients with this histopathology, presenting both LVC and PVC vascular patterns. This variation can complicate the classifiers' learning process with the proposed features [17,18]. SVM with RBF showed no misclassification per patient in this category.
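The grouping of patients into the three agreement categories can be sketched as follows. The text defines Category I (all otolaryngologists correct) and Category III (all otolaryngologists wrong) explicitly; the sketch assumes that Category II contains every remaining patient, that is, those on whom the clinicians disagreed. The function name and the boolean encoding are hypothetical.

```python
from typing import Sequence


def agreement_category(clinician_correct: Sequence[bool]) -> str:
    """Assign a patient to Category I, II or III from the clinicians' decisions.

    clinician_correct: one flag per clinician, True if that clinician
    classified the patient correctly.
    """
    if not clinician_correct:
        raise ValueError("at least one clinician decision is required")
    if all(clinician_correct):
        return "I"    # every clinician was correct
    if not any(clinician_correct):
        return "III"  # every clinician was wrong
    return "II"       # assumed: clinicians disagreed


# Invented example with six clinicians, five of whom were wrong.
print(agreement_category([True, False, False, False, False, False]))
```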

These results show that the complexity of a manual analysis of a laryngeal lesion can be related to the type of histopathology. We therefore performed a separate analysis based on the histopathology of the lesion. Table 3 presents the results of this second comparison procedure.

For the benign histopathologies, the otolaryngologists showed high misclassification percentages of 83%, 77%, 46%, 33% and 27% for Fibroma, Papillomatosis, Hyperkeratosis, Squamous Hyperplasia and Polyp, respectively. Except for Fibroma, the misclassification level of each classifier is lower than that of the manual classification. Notably, in the case of Papillomatosis, the misclassification is considerably reduced for each classifier. If all classifiers are considered, the misclassification decreases from 77% to 7% for this histopathology. Papillomatosis is difficult for the otolaryngologists to classify because its vascular patterns have characteristics similar to those of malignant histopathologies. SVM with RBF and kNN both reached 0% misclassification for this histopathology, which suggests that they can resolve this difficulty.

In the case of Fibroma, the misclassification percentage varied significantly among the four classifiers. This can be explained by the reduced number of images that the dataset contains for this type of histopathology (only one patient and two images).

In the malignant group, the otolaryngologists had the highest misclassification percentage of 61% for mild dysplasia. This histopathology can have PVC as well as LVC vascular patterns that usually appear in benign histopathologies. Hence, it is challenging for the otolaryngologists to classify patients with this condition as malignant visually. For this histopathology, the four classifiers performed well by classifying every patient correctly.

Overall, SVM with RBF showed no patient misclassification in any histopathology.

**Figure 2.** An overall view of the manual and automatic classification of every patient of the dataset; Green color: correct classification; Red color: misclassification. C1 to C4 represent the four classifiers; C1: Support Vector Machine (SVM) with polykernel, C2: SVM with Radial Basis Function (RBF), C3: k-Nearest Neighbor (kNN) and C4: Random Forest Classifier (RFC). I1 to I5 represent five testing images for each patient. D1 to D6 represent the six otolaryngologists.

**Figure 3.** CE-NBI images of four patients from Category II and Category III: (**a**) p26, (**b**) p34, (**c**) p72 and (**d**) p10.


**Table 3.** Misclassification percentage of every histopathology category based on patient. C1 to C4 represent the four classifiers; C1: SVM with polykernel, C2: SVM with RBF, C3: kNN and C4: RFC.
