Deep Learning-Based Hybrid Scenario for Classification of Periapical Lesions in Cone Beam Computed Tomography

Akalin, Fatma; Özkan, Yasin

doi:10.3390/sym17091392

by Fatma Akalin ¹,* and Yasin Özkan ²

¹ Department of Information Systems Engineering, Faculty of Computer and Information Sciences, Sakarya University, Serdivan 54050, Sakarya, Turkey

² Department of Computer Technologies, Zonguldak Bulent Ecevit University, Zonguldak Merkez 67100, Zonguldak, Turkey

* Author to whom correspondence should be addressed.

Symmetry 2025, 17(9), 1392; https://doi.org/10.3390/sym17091392

Submission received: 10 April 2025 / Revised: 28 July 2025 / Accepted: 6 August 2025 / Published: 26 August 2025

(This article belongs to the Special Issue Symmetry in Computational Intelligence and Applications)

Abstract

Artificial intelligence has made revolutionary advances in medical imaging in recent years. Various algorithms and techniques are used in this field to improve the accuracy and speed of medical diagnosis and classification. Accordingly, approaches for extracting meaningful features from dental images and classifying them accurately have been developed over time. In particular, high asymmetry in morphological balance plays a critical role in distinguishing pathological patterns from normal anatomy. Because manual interpretation of lesions demands both experience and time, a computer-aided system is needed. In this study, we therefore propose a scenario for the classification of periapical lesions that combines improved image processing techniques with regularization strategies integrated into the VGG16 transfer learning architecture. The study was conducted on the public UFPE dataset, and the performance of the VGG16 transfer learning architecture was improved with 18 proposed regularization methods. The results indicate optimized training in terms of avoiding overfitting, stability, generalizability, and high accuracy. This optimization has the potential to be used as a decision support system for diagnosis and treatment processes in various subfields of medicine.

Keywords:

classification of periapical lesions; VGG16 transfer learning architecture; proposed medical imaging approaches; improved regularization methods

1. Introduction

Digital images are data that carry valuable, sensitive, and critical information. Their smallest units are pixels, which play an important role in processing this information. For 8-bit grayscale images, pixel values range from 0 to 255. Values close to 0 represent darker tones, and values close to 255 represent lighter tones. Grayscale digital images consist of a single channel [1].

Physicians use digital images obtained through X-ray, CBCT, and intraoral scanning systems to evaluate the oral and maxillofacial regions [2], which supports treatment planning [3]. This allows surgical interventions to be performed more safely and post-treatment processes to be monitored more effectively [4]. Image processing is frequently preferred for making sense of these images, which consist of shades of gray [5].

Image processing is a field in which mathematical and algorithmic techniques are used to analyze, process, and interpret digital images. In this field; patterns, symmetry, color tones, shapes, and other features in images are processed. This allows images to become more meaningful or for information to be extracted for a specific purpose. The effectiveness of these techniques provides great benefits to healthcare professionals, especially in the field of medical imaging [6].

In dentistry, image processing techniques play an important role in the accurate interpretation of coronal sections, which allow the anterior and posterior surfaces of the teeth to be evaluated, and of sagittal sections, which reveal the lateral profiles of the teeth and defects in the jaw structure. This offers a useful perspective for the detailed evaluation of intraoral structures [7]. Especially in three-dimensional (3D) imaging systems, the combination of coronal and sagittal sections allows for a more detailed and clearer analysis of the teeth and surrounding tissues. This enables dentists to diagnose more precisely and to carry out more comprehensive treatment processes [8]. On the other hand, manual interpretation of digital medical images is time consuming, requires expertise, and carries a high risk of error. Computer-aided systems are needed to solve these problems.

Image processing and deep learning-based convolutional neural network (CNN) architectures are complementary approaches actively used in computer-aided systems developed to detect healthy and diseased regions in digital images [6]. Image processing, which reveals sensitive and critical information in digital images, provides a powerful input to deep learning-based CNN architectures customized for visual data. Thus, the success in the evaluation of clinical images increases [6]. In this context, image processing and CNNs stand out as two critical areas [9]. In particular, image processing improves the analysis and interpretation of images by highlighting their salient features, thereby reducing error rates [5]. CNNs run on data optimized by image processing then extract features more efficiently and effectively. This is valuable for performing classification and analysis with higher accuracy [10]. It also has great potential for optimizing the diagnosis and treatment of diseases in the health sector [11]. For this reason, many computer-aided multidisciplinary studies have been carried out in the field of dentistry [12].

In [13], a deep convolutional neural network (DCNN) was used to classify tooth types from dental cone beam CT images for forensic identification. Fifty-two CT data samples were divided into 42 training and 10 test samples. During training, data augmentation techniques were used to prevent overfitting. Classification with the AlexNet architecture yielded an accuracy of 88.8%. In [14], the accuracy of deep learning algorithms for the diagnosis and classification of dental caries was investigated. The dataset contained cone beam computed tomography (CBCT) images of 382 decayed teeth and 403 healthy teeth. The data were presented to a multi-input CNN model. The accuracy rates were 95.3% for carious teeth and 94.8% for healthy teeth. The deep learning model classified the depth and type of caries with high accuracy. These results show that deep learning is an effective tool for accurate diagnosis and treatment planning in dentistry. In [15], self-supervised learning (SSL) was used to improve the classification of dental caries. In training with unlabeled CBCT images, the hybrid use of the ResNet-18 architecture and the SimCLR technique resulted in an F1-score of 88.42%, an accuracy of 90.44%, and a sensitivity of 86.67%. These results show that SSL is an effective method for improving the accuracy and efficiency of tooth decay classification. In [16], a deep neural network using 3D CBCT images for tooth classification was proposed. Combining transformer and CNN structures, this network aims to address the high computational complexity of the transformer model. In experiments with 450 training and 104 test samples, the improved model achieved 91.3% accuracy and an AUC score of 99.7.

In [17], an automatic DCNN was applied to 11,980 dental radiographs collected from three dental hospitals for the classification of dental implant systems (DIS). The automatic DCNN achieved an AUC, sensitivity, and specificity of 0.954, 0.955, and 0.853, respectively. The study shows that this DCNN provides high classification accuracy and that further research is needed for clinical applications. In [18], a decision support system was proposed for the classification of dental periapical cysts and keratocystic odontogenic tumor (KCOT) lesions using CBCT. In the first stage, segmentation was performed on 50 CBCT 3D image datasets. Then, a vector containing 636 different features was created for each dataset. In experiments with six classifiers, the support vector machine (SVM) performed best, with 100% accuracy and F1-score. The results show that periapical cyst and KCOT lesions can be classified with high accuracy, and the study makes an important contribution to the computer-aided diagnosis of apical lesions of teeth.

In [19], a CNN architecture was proposed for the detection and diagnosis of dental caries in periapical dental images. In the proposed model, CNN and long short-term memory (LSTM) networks were combined for feature extraction. Experimental results showed that the proposed CNN-LSTM model provides higher accuracy (96%) than the pretrained AlexNet (93%) and GoogleNet (94%) models. In [20], a new approach for the automatic diagnosis of dental caries was proposed. The study used a multi-input deep convolutional neural network ensemble (MI-DCNNE) model that takes periapical images as input, with a dataset of 340 caries and 340 non-caries periapical images. The results show that the proposed model achieves 99.13% accuracy in diagnosing dental caries. In [21], the detection and diagnosis of dental caries in periapical radiographs using a CNN was evaluated. In total, 3000 images were divided into training and test datasets and classified with the Inception v3 model. The accuracy rates of the premolar, molar, and combined premolar and molar models were 89.0%, 88.0%, and 82.0%, respectively. In addition, the premolar model was reported as the category with the highest AUC value (0.917). In [22], the use of image processing and deep CNN techniques for the early detection of dental caries was investigated. Following the preprocessing steps of histogram equalization, contrast enhancement, and feature selection, edges in tooth images were detected using the Sobel method. The resulting enhanced images were given as inputs to a custom CNN model. The method was compared with Otsu's threshold segmentation and watershed segmentation. The proposed method achieved an accuracy of 96.08%, outperforming Otsu's threshold segmentation (72.3%) and watershed segmentation (80.4%).

Image processing and deep learning-based studies in the literature make significant contributions to the diagnosis and treatment processes of dentists. In particular, the analysis of CT images increases the accuracy of lesion detection and the early diagnosis of dental diseases, enabling more reliable results to be obtained. In this way, time is saved in clinical applications and treatment processes can be managed more effectively. Furthermore, these advanced methods allow the customization of treatment plans, making treatment processes more efficient. In this context, a computer-aided multidisciplinary study was conducted.

In this study, coronal and sagittal slices obtained from computed tomography (CT) images in the field of dentistry were used to measure how well images with and without lesions can be classified. In the first stage, an image processing method was proposed to extract critical and sensitive information from the images. Then, these enhanced images were given as inputs to the VGG16 transfer learning architecture. Finally, different regularization methods were integrated into the VGG16 transfer learning architecture, with the aim of obtaining a generalizable and stable model.

2. Material and Methods

Image processing is an approach used to analyze and make sense of images. Raw images may contain noise, low-resolution areas, varying lighting conditions, and insufficient contrast, all of which are significant obstacles to accurate and effective model learning [23]. Image preprocessing techniques are preferred to overcome these difficulties and to highlight important structural features in images.

Image preprocessing techniques improve the details in low-quality images and provide consistency across images obtained under different acquisition conditions [5]. In addition, models that receive preprocessed images as inputs can extract powerful features from real-world data. Image processing is therefore a preliminary stage that increases the accuracy of deep learning-based models and reduces errors in classification. The input obtained from this stage is valuable for deep learning-based CNN architectures.

Transfer learning-based CNN, which is a subset of the deep learning approach customized for visual data, has the ability to automatically learn local features in visual data. With its multi-layered structure, it can effectively extract complex patterns in images [24].

In this study, images processed with image processing techniques are classified using the VGG16 transfer learning architecture. VGG16 is a deep convolutional neural network known for its simple and uniform architecture, consisting of sequential 3 × 3 convolutional layers followed by max pooling layers and fully connected layers. Despite its relatively large number of parameters, VGG16 is widely used in medical image analyses due to its robust feature extraction capabilities and proven performance in classification tasks [25]. Its deep and orderly structure allows for effective learning of hierarchical features from medical images, which is essential for the early diagnosis of lesions. Moreover, to enhance the stability, generalization ability, and classification performance of the model, proposed regularization methods are integrated into the VGG16 transfer learning architecture. These improvements contribute significantly to minimizing overfitting, ensuring that the model maintains high accuracy across various unseen data samples and clinical scenarios.

Regularization is a technique that makes the learning process more generalizable by limiting the complexity of the model [26]. It prevents the model from focusing only on the training data and losing its ability to generalize, and it provides a flexible structure [27]. To reduce the complexity of the model, it adds a penalty term to the loss function, which lowers the model's potential for overfitting [28].

In this study, the classification of samples with and without lesions in a dataset consisting of coronal and sagittal slices is planned. For this purpose, the dataset used, the preferred transfer learning architecture, the applied image processing methods, and the proposed regularization techniques are presented in detail. The experiments were carried out in the Jupyter Notebook environment (version 7.2.2) using the Python programming language. TensorFlow libraries were used to build and train the deep learning models. All experiments were conducted on a computer with an Intel(R) Core(TM) i5-9400 CPU @ 2.90 GHz, 8 GB of RAM, and a 64-bit operating system on an x64-based architecture.

2.1. Dataset

Datasets represent an important tool used in various analysis and research processes. They play a role in the evaluation of different classification and modeling methods. In this study, the UFPE dataset was used to classify tooth sections with and without lesions. The UFPE dataset was prepared for use in health research in Brazil and approved by the Local Research Ethics Committee of the University of Pernambuco. The dataset is divided into two main categories, healthy and unhealthy tooth samples, and contains a total of 1000 CBCT tooth scans. Each sample in this dataset is organized as pairs of images in both the coronal and sagittal planes [29]. Examples of lesioned and non-lesioned tooth sections in the dataset are presented visually in Figure 1.

The images labeled a and b in Figure 1 show slices representing tooth samples with and without lesions, respectively. The images used in the study were processed and analyzed at a size of 186 × 115, both in their original and enhanced versions. These images represent the different examples used to classify the tooth lesions in the dataset. The dataset has three different categories: no lesions, large lesions, and small lesions. The dataset was split into 80% training data and 20% test data to evaluate the accuracy of the model.
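The 80%/20% split described above can be sketched as follows. This is only an illustration: the helper name `train_test_split_indices`, the random seed, and the use of a plain shuffled split (rather than, for example, a stratified one) are assumptions, since the text does not state how the split was drawn.

```python
import numpy as np

def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Return disjoint, shuffled train and test index arrays."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)          # random order of sample indices
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]       # (train, test)

# 1000 samples, as reported for the dataset.
train_idx, test_idx = train_test_split_indices(1000)
print(len(train_idx), len(test_idx))  # 800 200
```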

2.2. Image Processing

Image processing enables the extraction of valuable features from images and is especially important for the interpretation of grayscale images. Therefore, an image processing approach is proposed to classify the CBCT tooth scan data successfully. In the proposed approach, the entropy curve modification method in [31] is reconstructed by integrating a modified alpha value to provide optimal enhancement. Then, morphological and logical (bit-level) processing steps are performed to preserve details and isolate critical regions, respectively.

2.2.1. Proposed Image Processing Approach 1

In the second filter developed within the proposed image processing approach, the operations are applied in a specific hierarchical order to the images with and without lesions.

Improved Entropy Curve Modification

Entropy curve modification first obtains the entropy curve of an image from the entropy values corresponding to each of its gray levels. This curve is then modified to achieve a balanced gray-level distribution [31]. The process is explained step by step below.

1.: The image read in the RGB color space is transformed into the grayscale space.
2.: The entropy curve of the image is drawn. To draw this curve, entropy information about the gray levels of the image is used. Equation (1) gives a mathematical expression for the entropy of the ith gray level [30].

E_{i} = -\sum_{t \in b} I_{i}(t) P(t = i) \log_{2}(P(t = i))

(1)

In Equation (1), b represents the set of gray levels and I_i(t) is an indicator function: I_i(t) = 1 when t = i, and I_i(t) = 0 otherwise. With P_i = P(t = i), Equation (1) simplifies to Equation (2) [31].

E_{i} = -P_{i} \log_{2}(P_{i})

(2)

P_i in Equation (2) is the probability that the ith gray level occurs. Accordingly, E_i is the information associated with the ith gray level [30].

3.: Calculating the uniform entropy value to provide a reference point is often a necessary operation. In this process, it is assumed that the image is of size M × N. Then, the M × N pixels are evenly distributed over 256 gray levels and the mathematical expression given in Equation (3) is obtained [31].

P_{u} = \frac{1}{256}

(3)

The mathematical expression given in Equation (3) is redefined in Equation (4) for the entropy corresponding to the gray level [31].

E_{u} = -P_{u} \log_{2}(P_{u})

(4)

4.: A modified entropy curve is obtained to improve the quality of the image, with the uniform entropy serving as the guide. The mathematical expression for the entropy curve modification is given in Equation (5).

E' = \alpha E + (1 - \alpha) E_{u}

(5)

In Equation (5), E is the entropy curve of the input image and E_u is the uniform (smooth) entropy curve from Equation (4). The value of α is chosen to provide an optimal improvement.

5.: An adaptive scheme was developed for selecting the α given in Equation (5). In this scheme, the variance of the image is first calculated and then normalized. The mathematical expressions for the normalized variance are given in Equations (6)–(8).

\mathrm{Variance} = \sigma^{2} = \frac{1}{N} \sum_{i=1}^{N} (x_{i} - \mu)^{2}

(6)

\mathrm{MaxVariance} = \frac{(L - 1)^{2}}{4}

(7)

\mathrm{NormalizedVariance} = \frac{\mathrm{Variance}}{\mathrm{MaxVariance}}

(8)

Normalized variance, substituted for α in Equation (5), expresses the variation of the differences between gray levels at a given distance [32]. Therefore, it is useful in image optimization.

6.: To better understand and balance the information density corresponding to the gray levels of the image, a necessary step is to calculate the probability density function from the modified entropy curve [31]. For this, the mathematical equations given in Equations (9) and (10) are used.

S_{M} = \sum_{i=1}^{N} E'_{i}

(9)

\mathrm{PDF}(i) = \frac{E'_{i}}{S_{M}}

(10)

S_M in Equation (9) is the sum of the modified entropy values over the gray levels. The PDF value in Equation (10) is the probability density obtained by dividing the modified entropy corresponding to each gray level by this total. This is followed by a histogram equalization process based on the modified entropy curve. First, the PDF values are accumulated, which gives the cumulative sum of the density values up to each gray level [30]. Its mathematical expression is given in Equation (11).

\mathrm{CDF}(i) = \sum_{j=1}^{i} \mathrm{PDF}(j)

(11)

The CDF given in Equation (11) is the cumulative distribution function. The cumulative value is scaled to the range [0, L − 1] to homogenize the distribution of gray levels [31]. Its mathematical expression is given in Equation (12).

T = (L - 1) \times \mathrm{CDF}

(12)

L in Equation (12) is the number of gray levels, defined as 256, so the maximum gray level is L − 1 = 255. T is the converted gray level [31].

The operations in Equations (1)–(12) are the steps needed to use the proposed modified entropy curve. The image is converted into a one-dimensional vector, and each pixel value in the vector is mapped to a new gray level according to T, yielding a new image with improved contrast. This approach [31], which transforms the old gray level of each pixel into a new one, is advantageous for interpreting X-ray images, whose outputs lie in the grayscale range.
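The steps in Equations (1)–(12) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `entropy_curve_equalize` is hypothetical, the input is assumed to be an 8-bit grayscale array, the per-level entropy carries the negative sign of Equation (2), the variance is the population variance (mean squared deviation), and the sums in Equations (9)–(11) are taken over the modified entropy curve of Equation (5).

```python
import numpy as np

def entropy_curve_equalize(gray, levels=256):
    """Remap gray levels using a modified entropy curve (illustrative sketch)."""
    gray = np.asarray(gray, dtype=np.uint8)
    # Probability P_i of each gray level.
    hist = np.bincount(gray.ravel(), minlength=levels).astype(np.float64)
    p = hist / hist.sum()
    # Per-level entropy E_i = -P_i log2(P_i); unused levels contribute 0.
    e = np.zeros(levels)
    used = p > 0
    e[used] = -p[used] * np.log2(p[used])
    # Uniform reference entropy E_u with P_u = 1 / levels.
    p_u = 1.0 / levels
    e_u = -p_u * np.log2(p_u)
    # Adaptive alpha: population variance divided by its bound (L - 1)^2 / 4.
    alpha = gray.astype(np.float64).var() / ((levels - 1) ** 2 / 4.0)
    # Modified entropy curve, then PDF, CDF and the gray-level mapping T.
    e_mod = alpha * e + (1.0 - alpha) * e_u
    pdf = e_mod / e_mod.sum()
    cdf = np.cumsum(pdf)
    mapping = np.clip(np.round((levels - 1) * cdf), 0, levels - 1).astype(np.uint8)
    # Look up the new level of every pixel.
    return mapping[gray], alpha

# Synthetic low-contrast image (values 100..139), used only for demonstration.
rng = np.random.default_rng(0)
demo = rng.integers(100, 140, size=(115, 186), dtype=np.uint8)
enhanced, alpha = entropy_curve_equalize(demo)
```

Under this reading, a low-variance image yields a small α, so the modified curve stays close to the uniform reference and the mapping stays close to the identity; higher-variance images give more weight to the image's own entropy curve.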

Morphological Gradient Calculation

The image contrast-enhanced by the improved entropy curve modification is then subjected to morphological gradient processing. First, a 3 × 3 matrix whose elements are all 1 is created. Then, a morphological gradient operation, which calculates the difference between the dilation and erosion of the image, is applied with this matrix. Thus, an output image with emphasized edges and boundaries is created.
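A morphological gradient with a 3 × 3 matrix of ones can be sketched in plain NumPy as below. The function name `morphological_gradient` is hypothetical, and replicating the border pixels at the image edges is an assumption, since the text does not say how borders are handled. For a flat structuring element, dilation is the local maximum and erosion is the local minimum over the window.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def morphological_gradient(gray, size=3):
    """Dilation minus erosion with a size x size structuring element of ones."""
    gray = np.asarray(gray, dtype=np.uint8)
    pad = size // 2
    # Border handling (edge replication) is an assumption of this sketch.
    padded = np.pad(gray, pad, mode="edge")
    windows = sliding_window_view(padded, (size, size))
    dilated = windows.max(axis=(-2, -1))   # local maximum
    eroded = windows.min(axis=(-2, -1))    # local minimum
    return dilated - eroded                # never negative, so uint8 is safe

# A bright square on a dark background: only its boundary responds.
demo = np.zeros((7, 7), dtype=np.uint8)
demo[2:5, 2:5] = 200
grad = morphological_gradient(demo)
print(grad[3, 3], grad[2, 2], grad[0, 0])  # 0 200 0
```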

Logical Operation and Weighted Image Blending

The morphological gradient output, obtained from the image contrast-enhanced by the improved entropy curve modification and the original image, is subjected to the logical NOT operation. The result is blended with the contrast-enhanced image obtained in step 1 at ratios of 0.8 and 0.2, respectively. The final output is then rotated.
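The logical NOT and weighted blending step might look like the following sketch. The function name `invert_and_blend` is hypothetical. It assumes that the weight 0.8 applies to the inverted gradient image and 0.2 to the contrast-enhanced image, which is how "respectively" is read here, and it leaves out the final rotation because the text does not give the angle.

```python
import numpy as np

def invert_and_blend(gradient, enhanced, w_inverted=0.8, w_enhanced=0.2):
    """Bitwise NOT of the gradient image, then a weighted sum with the enhanced image."""
    gradient = np.asarray(gradient, dtype=np.uint8)
    enhanced = np.asarray(enhanced, dtype=np.uint8)
    inverted = np.bitwise_not(gradient)     # for uint8 this equals 255 - gradient
    blended = (w_inverted * inverted.astype(np.float64)
               + w_enhanced * enhanced.astype(np.float64))
    return np.clip(np.round(blended), 0, 255).astype(np.uint8)

# Tiny hand-made arrays, used only to show the arithmetic.
gradient_demo = np.array([[0, 255], [100, 200]], dtype=np.uint8)
enhanced_demo = np.full((2, 2), 50, dtype=np.uint8)
blend_demo = invert_and_blend(gradient_demo, enhanced_demo)
print(blend_demo.tolist())  # [[214, 10], [134, 54]]
```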

The flow diagram of the 3 basic steps of proposed image processing approach 1 for images with and without lesions is given in Figure 2 and Figure 3.

The enhanced images shown in Figure 2 and Figure 3 were converted into data where sensitive points and boundaries were highlighted.

The image processing approach successfully localized pixel values ranging from 0 to 255. This offers the potential for a powerful feature set for the classification of images with and without lesions.

2.3. VGG16 Transfer Learning Architecture

Transfer learning is a deep learning approach in which models trained on large datasets using powerful hardware are reused on smaller datasets for a specific problem space [33]. Because the model does not learn from scratch for the customized task, training time is reduced and high computational power is not required [33]. In this study, the VGG16 architecture was chosen for the classification of lesion images. VGG16 enables efficient feature extraction through its simple and consistent layer structure, making it well suited to medical image analysis tasks [25]. Therefore, in the proposed system, the VGG16 model is retrained using a transfer learning approach to adapt it to the specific characteristics of the lesion dataset, with the aim of enhancing diagnostic performance and robustness.

VGG16 is a prominent model among deep convolutional neural network architectures. This architecture uses filters with a fixed size of 3 × 3 in all convolution layers. It is also based on the principle of size reduction, with 2 × 2 max pooling operations after every two or three convolution layers [25]. The most important advantage of VGG16 is that its architecture is simple, consistent, and modular. This deep structure consisting of 16 layers is able to effectively learn low-level and high-level features in images. Moreover, reusing pretrained weights with less data offers the potential for high accuracy. This makes VGG16 a very suitable option in data-constrained domains such as medical image processing [34]. Figure 4 shows the structure of VGG16.
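To make the layer structure concrete, the short calculation below walks an input through the five convolutional blocks of the standard VGG16 design (2, 2, 3, 3, 3 convolution layers with 64, 128, 256, 512, 512 filters, each block followed by 2 × 2 max pooling) and counts the convolutional parameters. It is a worked example rather than the authors' model: reading 186 × 115 as height × width and feeding three input channels are assumptions, and the fully connected head is left out.

```python
# Convolutional blocks of VGG16: (number of 3x3 conv layers, output channels).
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

def vgg16_conv_summary(height, width, in_channels=3):
    """Return the final feature-map shape and the conv parameter count."""
    params = 0
    channels = in_channels
    for n_convs, out_channels in VGG16_BLOCKS:
        for _ in range(n_convs):
            # 3x3 kernel weights plus one bias per output channel;
            # padded 3x3 convolutions keep the spatial size unchanged.
            params += (3 * 3 * channels + 1) * out_channels
            channels = out_channels
        # 2x2 max pooling with stride 2 halves each side (rounding down).
        height, width = height // 2, width // 2
    return (height, width, channels), params

shape, params = vgg16_conv_summary(186, 115)
print(shape, params)  # (5, 3, 512) 14714688
```

The 13 convolution layers counted here, together with the three fully connected layers of the original design, give the 16 weight layers that the name refers to.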

In this study, the fine-tuned hyperparameters for the VGG16 transfer learning architecture are given in Table 1.

Table 1 presents the chosen fine-tuned hyperparameters for the transfer learning models. Each parameter was optimized to maximize the performance of the model.

2.4. Regularization Functions

Regularization is a method that improves the power of the model to represent the data. It is used to avoid overfitting problems. It aims to optimize the error value by adjusting the coefficients of the model. Minimizing the error value means obtaining a successful model. There are 3 common regularization methods. These are lasso, ridge, and ElasticNet [35].

Lasso is a regularization method that uses the L1 norm. Its mathematical expression is given in Equation (13) [35].

\mathrm{Lasso} = \mathrm{LossValue} + \lambda \sum_{j=1}^{p} |\beta_{j}|

(13)

Ridge is a regularization method that uses the L2 norm. Its mathematical expression is given in Equation (14) [35].

\mathrm{Ridge} = \mathrm{LossValue} + \lambda \sum_{j=1}^{p} \beta_{j}^{2}

(14)

ElasticNet is a regularization method that combines the advantages of the ridge and lasso methods. Its mathematical expression is given in Equation (15) [35].

\mathrm{ElasticNet} = \mathrm{LossValue} + \lambda \rho \sum_{j=1}^{p} |\beta_{j}| + \frac{\lambda (1 - \rho)}{2} \sum_{j=1}^{p} \beta_{j}^{2}

(15)

In the mathematical equations given in Equations (13)–(15), parameter λ controls the degree of minimization of the coefficients; β is a parameter that shows the effect of the independent variables on the target variable. ElasticNet, on the other hand, uses the parameter ρ. The ρ parameter determines the trade-off between the ridge and lasso regularization methods [35].
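The penalty terms in Equations (13)–(15) can be written out directly. The sketch below computes only the term that is added to the loss value; the function names and the example coefficients are illustrative assumptions.

```python
import numpy as np

def lasso_penalty(beta, lam):
    """L1 penalty: lambda * sum |beta_j|."""
    return lam * np.sum(np.abs(beta))

def ridge_penalty(beta, lam):
    """L2 penalty: lambda * sum beta_j^2."""
    return lam * np.sum(np.square(beta))

def elastic_net_penalty(beta, lam, rho):
    """Mix of L1 and L2 penalties controlled by rho, as in Equation (15)."""
    l1 = np.sum(np.abs(beta))
    l2 = np.sum(np.square(beta))
    return lam * rho * l1 + 0.5 * lam * (1.0 - rho) * l2

beta = np.array([0.5, -1.0, 2.0])   # example coefficients
lasso_value = lasso_penalty(beta, lam=0.1)                 # 0.1 * 3.5
ridge_value = ridge_penalty(beta, lam=0.1)                 # 0.1 * 5.25
elastic_value = elastic_net_penalty(beta, lam=0.1, rho=0.5)
```

With ρ = 1 the ElasticNet term reduces to the lasso term, and with ρ = 0 it reduces to half of the ridge term, which is the trade-off that ρ controls.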

In this study, the ridge, lasso, and ElasticNet regularization methods were used as the basis for 18 different proposed regularization methods. These methods, which were developed to control the complexity of the model, prevent overfitting, and increase accuracy, are presented below.

Proposed Regularization Methods

The first group of proposed regularization methods is shaped by integrating the entropy calculation into the lasso, ridge, and ElasticNet regularization methods. In this context, the entropy value calculated in Equation (16) [32] is given as the input to the lambda update function in Equation (17). Then, a normalized result is obtained by dividing by the specified normalization constant. This result is used in Equations (18)–(20) as the parameter controlling the degree of minimization of the coefficients.

H(p) = \mathrm{mean}\left(-\sum_{i} p_{i} \log(p_{i} + \epsilon)\right)

(16)

\lambda = \lambda_{0} \cdot \frac{H(p)}{H_{\max}}

(17)

\mathrm{Reg}_{1} = \mathrm{LossValue} + \lambda \sum_{j=1}^{p} |\beta_{j}|

(18)

\mathrm{Reg}_{2} = \mathrm{LossValue} + \lambda \sum_{j=1}^{p} \beta_{j}^{2}

(19)

\mathrm{Reg}_{3} = \mathrm{LossValue} + \lambda \rho \sum_{j=1}^{p} |\beta_{j}| + \frac{\lambda (1 - \rho)}{2} \sum_{j=1}^{p} \beta_{j}^{2}

(20)
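The entropy-driven lambda update of Equations (16) and (17) can be sketched as follows. Several points are assumptions, because this section does not fix them: `p` is taken to be a batch of probability vectors (one per row), the normalization constant is taken to be the maximum entropy log(n) of an n-class distribution, and the function names are hypothetical.

```python
import numpy as np

def batch_entropy(p, eps=1e-12):
    """Mean over rows of -sum_i p_i * log(p_i + eps)."""
    p = np.asarray(p, dtype=np.float64)
    return float(np.mean(-np.sum(p * np.log(p + eps), axis=-1)))

def entropy_scaled_lambda(p, lam0, h_max=None):
    """lambda = lambda_0 * H(p) / H_max."""
    p = np.asarray(p, dtype=np.float64)
    if h_max is None:
        # Assumption: normalise by the entropy of a uniform distribution.
        h_max = np.log(p.shape[-1])
    return lam0 * batch_entropy(p) / h_max

uniform = np.array([[0.5, 0.5], [0.5, 0.5]])   # maximally uncertain rows
one_hot = np.array([[1.0, 0.0], [0.0, 1.0]])   # fully confident rows
lam_uniform = entropy_scaled_lambda(uniform, lam0=0.01)
lam_one_hot = entropy_scaled_lambda(one_hot, lam0=0.01)
```

Under these assumptions the penalty weight is largest (equal to λ₀) when the distributions are uniform and shrinks toward zero as they become confident; the resulting λ then takes the place of the fixed λ in the lasso, ridge, or ElasticNet term.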

The second group of proposed regularization methods is shaped by integrating the energy calculation into the lasso, ridge, and ElasticNet regularization methods. In this context, the energy value calculated in Equation (21) [32] is given as an input to the lambda update function in Equation (22). Then, a normalized result is obtained by dividing it by the specified normalization constant. This result is used in Equations (23)–(25) as the parameter controlling the degree of minimization of the coefficients.

E(p) = \mathrm{mean}\left(\sum_{i=1}^{n} p_{i}^{2}\right)

(21)

\lambda = \lambda_{0} \cdot \frac{E(p)}{H_{\max}}

(22)

\mathrm{Reg}_{4} = \mathrm{LossValue} + \lambda \sum_{j=1}^{p} |\beta_{j}|

(23)

\mathrm{Reg}_{5} = \mathrm{LossValue} + \lambda \sum_{j=1}^{p} \beta_{j}^{2}

(24)

\mathrm{Reg}_{6} = \mathrm{LossValue} + \lambda \rho \sum_{j=1}^{p} |\beta_{j}| + \frac{\lambda (1 - \rho)}{2} \sum_{j=1}^{p} \beta_{j}^{2}

(25)

The third group of proposed regularization methods is shaped by integrating the root mean square (RMS) calculation into the lasso, ridge, and ElasticNet regularization methods. In this context, the RMS value calculated in Equation (26) [32] is given as an input to the lambda update function in Equation (27). Then, a normalized result is obtained by dividing it by the specified normalization constant. This result is used in Equations (28)–(30) as the parameter controlling the degree of minimization of the coefficients.

\mathrm{RMS}(p) = \mathrm{mean}\left(\sqrt{\frac{1}{n} \sum_{i=1}^{n} p_{i}^{2}}\right)

(26)

\lambda = \lambda_{0} \cdot \frac{\mathrm{RMS}(p)}{H_{\max}}

(27)

\mathrm{Reg}_{7} = \mathrm{LossValue} + \lambda \sum_{j=1}^{p} |\beta_{j}|

(28)

\mathrm{Reg}_{8} = \mathrm{LossValue} + \lambda \sum_{j=1}^{p} \beta_{j}^{2}

(29)

\mathrm{Reg}_{9} = \mathrm{LossValue} + \mathrm{RSS} + \lambda \rho \sum_{j=1}^{p} |\beta_{j}| + \frac{\lambda (1 - \rho)}{2} \sum_{j=1}^{p} \beta_{j}^{2}

(30)
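The energy and RMS statistics of Equations (21) and (26), and the lambda updates built on them, can be sketched in the same way. As before, `p` is assumed to be a batch of vectors with one sample per row, the function names are hypothetical, and the normalization constant is passed in explicitly because its value is not stated in this section (1.0 below is only a placeholder).

```python
import numpy as np

def batch_energy(p):
    """Mean over rows of sum_i p_i^2."""
    p = np.asarray(p, dtype=np.float64)
    return float(np.mean(np.sum(np.square(p), axis=-1)))

def batch_rms(p):
    """Mean over rows of sqrt((1/n) * sum_i p_i^2)."""
    p = np.asarray(p, dtype=np.float64)
    return float(np.mean(np.sqrt(np.mean(np.square(p), axis=-1))))

def scaled_lambda(statistic, lam0, norm_const):
    """lambda = lambda_0 * statistic / normalization constant."""
    return lam0 * statistic / norm_const

p_demo = np.array([[0.5, 0.5], [1.0, 0.0]])
energy = batch_energy(p_demo)          # rows give 0.5 and 1.0
rms = batch_rms(p_demo)                # rows give 0.5 and sqrt(0.5)
lam_energy = scaled_lambda(energy, lam0=0.01, norm_const=1.0)
lam_rms = scaled_lambda(rms, lam0=0.01, norm_const=1.0)
```

In contrast to entropy, both statistics grow as a row concentrates its mass on fewer entries, so under these assumptions the energy- and RMS-based updates strengthen the penalty for confident rows.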

The fourth group of proposed regularization methods is shaped by integrating normalized values into the entropy calculation used with the lasso, ridge, and ElasticNet regularization methods. In this context, the entropy value obtained from the normalized values calculated in Equations (31) and (32) is given as an input to the lambda update function in Equation (33). The result obtained through this function is used in Equations (34)–(36) as the parameter controlling the degree of minimization of the coefficients.

p_{ij} = \frac{p_{ij}}{\sum_{k=1}^{n} p_{ik}}

(31)

H(p) = \mathrm{mean}\left(-\sum_{j=1}^{n} p_{ij} \cdot \log(p_{ij} + \epsilon)\right)

(32)

\lambda = \lambda_{0} \cdot \frac{H(p)}{H_{\max}}

(33)

\mathrm{Reg}_{10} = \mathrm{LossValue} + \lambda \sum_{j=1}^{p} |\beta_{j}|

(34)

\mathrm{Reg}_{11} = \mathrm{LossValue} + \lambda \sum_{j=1}^{p} \beta_{j}^{2}

(35)

\mathrm{Reg}_{12} = \mathrm{LossValue} + \lambda \rho \sum_{j=1}^{p} |\beta_{j}| + \frac{\lambda (1 - \rho)}{2} \sum_{j=1}^{p} \beta_{j}^{2}

(36)

The fifth group of proposed regularization methods is shaped by integrating normalized values into the energy calculation used with the lasso, ridge, and ElasticNet regularization methods. In this context, the energy value obtained from the normalized values calculated in Equation (37) is given as an input to the lambda update function in Equation (38). The result obtained through this function is used in Equations (39)–(41) as the parameter controlling the degree of reduction in the coefficients.

E(i) = \mathrm{mean}\left(\sum_{j=1}^{n} (p_{ij})^{2}\right) \quad \text{and} \quad p_{ij} = \frac{p_{ij}}{\sum_{k=1}^{n} p_{ik}}

(37)

\lambda = \lambda_{0} \cdot \frac{E(i)}{H_{\max}}

(38)

\mathrm{Reg}_{13} = \mathrm{LossValue} + \lambda \sum_{j=1}^{p} |\beta_{j}|

(39)

\mathrm{Reg}_{14} = \mathrm{LossValue} + \lambda \sum_{j=1}^{p} \beta_{j}^{2}

(40)

\mathrm{Reg}_{15} = \mathrm{LossValue} + \lambda \rho \sum_{j=1}^{p} |\beta_{j}| + \frac{\lambda (1 - \rho)}{2} \sum_{j=1}^{p} \beta_{j}^{2}

(41)
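The energy-based update in Equations (37)–(40) can be sketched in the same spirit. The example compares a uniform row with a peaked row to show that the energy grows as the output becomes more concentrated. The function names, the example scores, and the value passed as `h_max` are hypothetical; the text does not state which normalizing constant is used for the energy.

```python
def normalize_rows(scores):
    """Divide each row by its sum, as in the second part of Equation (37)."""
    return [[v / sum(row) for v in row] for row in scores]

def mean_energy(probs):
    """Mean over rows of sum_j p_ij^2 (Equation (37))."""
    per_row = [sum(p * p for p in row) for row in probs]
    return sum(per_row) / len(per_row)

def energy_scaled_lambda(scores, lambda0, h_max):
    """Scale lambda0 by E / h_max (Equation (38)); h_max is an assumed constant."""
    return lambda0 * mean_energy(normalize_rows(scores)) / h_max

def ridge_penalty(weights, lam):
    """lambda * sum_j beta_j^2, the penalty term of Equation (40)."""
    return lam * sum(w * w for w in weights)

# Hypothetical scores: an evenly split row and a confident row.
lam_uniform = energy_scaled_lambda([[1.0, 1.0]], lambda0=0.01, h_max=1.0)
lam_peaked = energy_scaled_lambda([[9.0, 1.0]], lambda0=0.01, h_max=1.0)
```

With these inputs the peaked row yields the larger coefficient, which is the opposite of what an entropy-based scaling would give for the same rows.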

The sixth group of proposed regularization methods integrates the normalized values used in the RMS (root mean square) calculation into the lasso, ridge, and ElasticNet regularization methods. The RMS value obtained from the normalized values calculated in Equation (42) is given as the input to the lambda update function in Equation (43). The result of this function is used in Equations (44)–(46) as the parameter that controls how strongly the coefficients are shrunk.

\mathrm{RMS}(i) = \mathrm{mean}\left(\sqrt{\frac{1}{n}}\,(p_{ij})^{2}\right) \quad \text{and} \quad p_{ij} = \frac{p_{ij}}{\sum_{j=1}^{n} p_{ij}}

(42)

λ = λ_{0} \cdot \frac{\mathrm{RMS}(i)}{H_{\max}}

(43)

\mathrm{Reg16} = \mathrm{LossValue} + λ \sum_{j=1}^{p} |β_{j}|

(44)

\mathrm{Reg17} = \mathrm{LossValue} + λ \sum_{j=1}^{p} β_{j}^{2}

(45)

\mathrm{Reg18} = \mathrm{LossValue} + λρ \sum_{j=1}^{p} |β_{j}| + \frac{λ(1-ρ)}{2} \sum_{j=1}^{p} β_{j}^{2}

(46)
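A minimal sketch of the RMS-based update combined with the ElasticNet penalty of Equation (46) follows. It assumes the conventional root-mean-square definition for each row, sqrt((1/n) Σ_j p_ij²), as one possible reading of Equation (42). The function names, the example values, and `h_max` are hypothetical.

```python
import math

def mean_rms(probs):
    """Mean over rows of sqrt((1/n) * sum_j p_ij^2); conventional RMS, assumed here."""
    per_row = [math.sqrt(sum(p * p for p in row) / len(row)) for row in probs]
    return sum(per_row) / len(per_row)

def rms_scaled_lambda(probs, lambda0, h_max):
    """Scale lambda0 by RMS / h_max (Equation (43)); h_max is an assumed constant."""
    return lambda0 * mean_rms(probs) / h_max

def elastic_net_penalty(weights, lam, rho):
    """lam*rho*sum|beta_j| + lam*(1 - rho)/2 * sum beta_j^2, as in Equation (46)."""
    l1 = sum(abs(w) for w in weights)
    l2 = sum(w * w for w in weights)
    return lam * rho * l1 + lam * (1.0 - rho) / 2.0 * l2

# Hypothetical, already normalized two-class output.
lam = rms_scaled_lambda([[0.5, 0.5]], lambda0=0.01, h_max=1.0)
penalty = elastic_net_penalty([1.0, -2.0], lam=0.5, rho=0.5)
```

The mixing parameter ρ moves the penalty between pure ridge (ρ = 0) and pure lasso (ρ = 1).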

The 18 proposed regularization methods are designed to preserve the generalization ability of the model and to increase its stability.

3. Results and Discussion

In recent years, deep learning has developed significantly. The capacity of deep learning algorithms to process large datasets and draw meaningful inferences from them provides higher accuracy and generalization capability than traditional machine learning methods [36]. The success of deep learning methods increases markedly on images that have been preprocessed [37,38]. In particular, image processing reduces the effort needed to learn from complex data, and this kind of optimization is valuable for improving the performance [39] of deep learning approaches involving CNN architectures.

A CNN automatically extracts meaningful features from images and creates in-depth feature representations. This makes it possible to achieve high success rates in object detection and image recognition tasks [6]. However, the success of CNNs does not depend only on the network architecture and learning algorithms; the quality of the datasets is also important. Image processing addresses this point by improving data quality. It significantly improves the learning process of the model by providing strong inputs to CNN architectures [37,38].

The techniques used at the preprocessing stage prevent the model from making erroneous predictions that might arise from low-quality data. Color correction, denoising, normalization, histogram equalization, morphological operations, and data augmentation are examples of such image processing techniques [40]. In addition, standardizing the brightness and contrast levels of images helps the model to obtain more robust results [41]. Edge detection methods highlight important structural features in the images, allowing boundaries to be clearly defined and increasing the accuracy of the model [42]. Thus, classification performance is strengthened and more reliable results are obtained.

In this study, lesion classification was performed on preprocessed dental images using a VGG16 transfer learning architecture. The performance metrics for the classification results are given in Table 2.

According to Table 2, the VGG16 model achieved 58.64% accuracy on the processed images. The model correctly detects 72% of the positive samples (recall), but its precision is only 40.9% because of the high number of false positives among the positive predictions. The resulting F1-score of 52.17% indicates moderate overall performance.
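For reference, the metrics in Table 2 follow from the four confusion-matrix counts through the standard definitions. The sketch below recomputes them from the counts reported for the processed images; the function name and the dictionary keys are chosen for this example. Differences in the last digit relative to the table may come from rounding versus truncation.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, recall, precision, and F1 as fractions in [0, 1]."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn)          # share of actual positives that were found
    precision = tp / (tp + fp)       # share of positive predictions that were right
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}

# Counts reported in Table 2 for the processed images.
m = classification_metrics(tp=90, fp=130, fn=35, tn=144)
percentages = {name: round(100 * value, 2) for name, value in m.items()}
```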

The images enhanced with the image processing approach were given as inputs to VGG16, into which 18 different regularization techniques were integrated to produce stable and generalizable results. The performance metrics for the regularization methods integrated into the VGG16 architecture for the classification of lesioned and non-lesioned dental images are presented in Table 3.

Table 3 shows that regularization 13 achieves the highest F1-score. Without regularization, the F1-score is 52.17%; with regularization 13 integrated, it rises to 70.80%, indicating a strong improvement. Figure 5 shows the ROC curve results.

The AUC value obtained in Figure 5 leaves room for improvement. However, the model with the proposed regularization method integrated is more successful than the original model. Heat maps showing the areas the model focuses on during the classification process are given in Figure 6. Additionally, to statistically validate the performance difference between the models, Bowker’s test of symmetry was conducted on the results of the best-performing configuration, Reg13. The test yielded a chi-square value of 11.23 with 1 degree of freedom and a p-value of 0.0008, indicating statistically significant asymmetry (p < 0.001). This result supports the effectiveness of the proposed method.
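For a 2×2 table, Bowker's statistic reduces to (b − c)² / (b + c), where b and c are the two off-diagonal (discordant) counts, and it is compared against a chi-square distribution with 1 degree of freedom. The sketch below shows this calculation and converts the reported chi-square value into a p-value. The discordant counts in the demonstration call are hypothetical, since the underlying table is not given in the text.

```python
import math

def bowker_2x2(b, c):
    """Symmetry statistic for a 2x2 table from its off-diagonal counts b and c."""
    return (b - c) ** 2 / (b + c)

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))

# p-value corresponding to the chi-square value reported in the text.
p_reported = chi2_sf_df1(11.23)

# Hypothetical discordant counts, used only to show the statistic itself.
stat_demo = bowker_2x2(b=30, c=10)
p_demo = chi2_sf_df1(stat_demo)
```

The p-value obtained for 11.23 rounds to the reported 0.0008.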

Grad-CAM heat maps are valuable for decision-making. In addition, image processing and the regularization method developed for hyperparameter tuning play complementary roles in making accurate decisions: image processing enables the model to learn more accurately and effectively by extracting meaningful features from raw images, while regularization strengthens the generalization capacity of the model by preventing overfitting. To evaluate the proposed method, direct comparisons were made with widely used transfer learning-based models, namely Xception, ResNet50, MobileNetV2, MobileNetV3, EfficientNetB0, and DenseNet169, each with Reg13 integrated. The results are given in Table 4.

In this comparison, the Proposed_VGG16 model achieved 58.65% accuracy, 90.91% precision, and a 70.80% F1-score. Its high precision indicates that the model is reliable in distinguishing the positive class. Although ResNet50, MobileNetV3, and EfficientNetB0 reached 100% precision and a slightly higher F1-score (71.08%), their TN values of zero indicate that this result reflects limited generalization. Evaluating all models under the same conditions therefore supports the relative superiority of the proposed method.

The regularization methods presented in Table 3 were used to prevent overfitting and to further optimize an already working model. Lasso (L1), ridge (L2), and ElasticNet are the three basic regularization methods in wide use. During variable selection, lasso regularization reduces some coefficients to zero. Ridge regularization limits the magnitude of the coefficients without reducing them to zero. ElasticNet performs feature selection by combining the advantages of lasso and ridge. Customized versions of each method have the potential to improve performance, increase generalization, and enhance feature selection.
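The difference between lasso and ridge shrinkage can be seen on a single coefficient. The sketch below uses the closed-form minimizers of a one-dimensional quadratic loss plus each penalty: soft-thresholding for L1 and proportional scaling for L2. It is a generic textbook illustration, not the training procedure of this study; the function names, the coefficient values, and λ = 0.1 are chosen for the example.

```python
def lasso_shrink(v, lam):
    """Minimizer of 0.5*(w - v)**2 + lam*|w| (soft-thresholding)."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0  # small coefficients are set exactly to zero

def ridge_shrink(v, lam):
    """Minimizer of 0.5*(w - v)**2 + lam*w**2 (proportional shrinkage)."""
    return v / (1.0 + 2.0 * lam)

# Hypothetical unregularized coefficients.
coeffs = [0.8, -0.05, 0.02, -1.5]
lasso = [lasso_shrink(v, 0.1) for v in coeffs]
ridge = [ridge_shrink(v, 0.1) for v in coeffs]
```

Here lasso zeroes the two small coefficients, while ridge keeps all four non-zero and only reduces their magnitude.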

The 18 regularization methods developed in this study are organized into six groups of three. Each group customizes the weight that the model gives to the penalty term in the widely used lasso, ridge, and ElasticNet regularization methods.

In group 1, the weight that the model gives to the penalty term is recomputed from the entropy parameter. This avoids extreme values in the model output distribution and encourages balanced predictions with soft classification outputs. In group 2, the weight is recomputed from the energy parameter. This parameter produces a sharp response for the model decision; it provides strong discrimination between classes and behaves in the opposite way to the entropy approach. In group 3, the weight is recomputed from the RMS parameter. RMS takes the average of the energy calculated in the second group and then applies the square root to it. Because it provides an overall energy measurement for the dataset, it helps to prevent extreme errors.

In groups 4, 5, and 6, the weight is recomputed by passing the outputs normalized with the softmax function to the entropy, energy, and RMS calculations, respectively. The softmax function used in these three groups scales the data, so that a balanced and equal evaluation is applied to data mapped to the same space. This is valuable for the model to produce reliable predictions. Together, these six interrelated groups provide a theoretical basis for analyzing the success of the model and support reliable, stable, and high-performance outputs.

In the transfer learning experiments, the VGG16 model pretrained on the ImageNet dataset for 1000 categories with an input size of (224, 224) was selected. This model has 138,357,544 trainable parameters in total. In the modified model, the two consecutive fully connected layers that follow the flattening step in the original VGG16, together with the prediction layer for 1000 categories, were removed, and a prediction layer for 2 categories was integrated instead. As a result, the total number of parameters was reduced to 14,714,688. Changes in the number of parameters alter the training time and affect the choice of hardware, so the architecture should be chosen according to the goal.
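Both parameter counts can be reproduced by hand from the standard VGG16 layer configuration (thirteen 3×3 convolutions followed by three fully connected layers). The sketch below does this arithmetic in plain Python; the helper names are chosen for the example. The smaller figure corresponds to the convolutional base alone, so the parameters of any newly added classification head would come on top of it.

```python
def conv_params(c_in, c_out, k=3):
    """Weights plus biases of a k x k convolution."""
    return k * k * c_in * c_out + c_out

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

# (input channels, output channels) of the 13 convolutional layers of VGG16.
VGG16_CONV = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]

conv_total = sum(conv_params(c_in, c_out) for c_in, c_out in VGG16_CONV)

# Original head: flatten (7 x 7 x 512) -> 4096 -> 4096 -> 1000.
fc_total = (dense_params(7 * 7 * 512, 4096)
            + dense_params(4096, 4096)
            + dense_params(4096, 1000))

full_total = conv_total + fc_total
```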

In the proposed scenario, classification performance improved, with the F1-score rising from 52.17% to 70.80%. To place the contributions of the study in context, similar studies in the literature were analyzed. Previous studies using CT and CBCT datasets are compared in Table 5 to present the innovative approaches and findings of this research.

Several articles in the literature were reviewed. These studies [13,14,15,16,21,43,44,45,46,47,48] show that deep learning and image processing approaches are actively used in medical imaging. The capacity of such approaches for automatic classification, early diagnosis, and clinical decision support, and their generalizability to different diagnostic processes, have paved the way for multidisciplinary studies based on medicine and informatics. However, preprocessing and fine-tuning are key for these architectures and must be implemented successfully to reduce false positives and false negatives. Therefore, in this work, we combined an image processing filter with the proposed regularization functions. Table 1, Table 2 and Table 3 show that the proposed methodology contributes to performance improvements in lesion image classification. These approaches have the potential to be integrated into the structures presented in [13,14,15,16,21,43,44,45,46,47,48], which we plan to explore in order to obtain more effective outputs.

4. Conclusions

Artificial intelligence-based architectures play an important role in the analysis of dental images and early diagnosis of dental diseases. In recent years, the advances offered by these technologies have been transforming healthcare services. This offers great potential to improve patient satisfaction and enhance diagnosis and treatment processes.

In this study, we aimed to develop a powerful artificial intelligence-based diagnostic tool for the classification of lesioned and non-lesioned regions in CT images. Images were first enhanced with the proposed image processing approach, because enhanced images are valuable for deep learning-based architectures that perform feature extraction. They were then classified with the transfer learning-based VGG16 architecture. This research on the UFPE public dataset shows that the combination of image processing and regularization strategies can significantly improve classification performance. The findings indicate that the proposed hybrid structure offers considerable potential in terms of accuracy and stability in the classification of tooth root lesions and can serve as a reliable tool for various clinical applications. In future studies, a new optimization algorithm will be developed and integrated into the model, with the aim of improving success and stability.

Author Contributions

Conceptualization, F.A. and Y.Ö.; methodology, F.A. and Y.Ö.; software, F.A. and Y.Ö.; validation, F.A. and Y.Ö.; formal analysis, F.A. and Y.Ö.; investigation, F.A. and Y.Ö.; resources, F.A. and Y.Ö.; data curation, F.A. and Y.Ö.; writing—original draft, F.A. and Y.Ö.; writing—review and editing, F.A. and Y.Ö.; visualization, F.A. and Y.Ö. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The datasets analyzed during the current study are publicly available and can be accessed at https://github.com/felipebsferreira/periapical-lesion-dataset (accessed on: 14 March 2025) and https://doi.org/10.3390/s22176481.

Acknowledgments

We gratefully acknowledge the assistance of the ChatGPT-3.5 and ChatGPT-4o-mini language models for their contributions to this study. These tools provided valuable support in writing some parts, providing different perspectives, analyzing mathematical equations, identifying and addressing coding errors, and ensuring accurate and clear translations. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Aksoylu, M.Ü. Artificial Intelligence and Computer Vision with Projects, 1st ed.; Kodlab: İstanbul, Turkey, 2021.
2. Shah, N.; Bansal, N.; Logani, A. Recent advances in imaging technologies in dentistry. World J. Radiol. 2014, 6, 794.
3. Tetradis, S.; Anstey, P.; Graff-Radford, S. Cone beam computed tomography in the diagnosis of dental disease. J. Calif. Dent. Assoc. 2010, 38, 27–32.
4. Weiss, R.; Read-Fuller, A. Cone beam computed tomography in oral and maxillofacial surgery: An evidence-based review. Dent. J. 2019, 7, 52.
5. Gonzalez, R.C. Digital Image Processing; Pearson Education India: Chennai, India, 2009.
6. Litjens, G.; Kooi, T.; Bejnordi, B.E.; Setio, A.A.A.; Ciompi, F.; Ghafoorian, M.; van der Laak, J.A.W.M.; van Ginneken, B.; Sánchez, C.I. A survey on deep learning in medical image analysis. Med. Image Anal. 2017, 42, 60–88.
7. Scarfe, W.C.; Farman, A.G.; Sukovic, P. Clinical applications of cone-beam computed tomography in dental practice. J.-Can. Dent. Assoc. 2006, 72, 75.
8. Patel, S.; Brown, J.; Pimentel, T.; Kelly, R.D.; Abella, F.; Durack, C. Cone beam computed tomography in Endodontics–a review of the literature. Int. Endod. J. 2019, 52, 1138–1152.
9. Shin, H.C.; Roth, H.R.; Gao, M.; Lu, L.; Xu, Z.; Nogues, I.; Yao, J.; Mollura, D.; Summers, R.M. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans. Med. Imaging 2016, 35, 1285–1298.
10. LeCun, Y.; Kavukcuoglu, K.; Farabet, C. Convolutional networks and applications in vision. In Proceedings of the 2010 IEEE International Symposium on Circuits and Systems, Paris, France, 30 May–2 June 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 253–256.
11. Wang, J.; Zhu, H.; Wang, S.H.; Zhang, Y.D. A review of deep learning on medical image analysis. Mob. Netw. Appl. 2021, 26, 351–380.
12. Kumar, S.M.; Mouli, P.C.; Kailasam, S.; Raghuram, P.H.; Sateesh, S. Applications of cone-beam computed tomography in dentistry. J. Indian Acad. Oral Med. Radiol. 2011, 23, 593–597.
13. Miki, Y.; Muramatsu, C.; Hayashi, T.; Zhou, X.; Hara, T.; Katsumata, A.; Fujita, H. Classification of teeth in cone-beam CT using deep convolutional neural network. Comput. Biol. Med. 2017, 80, 24–29.
14. Esmaeilyfard, R.; Bonyadifard, H.; Paknahad, M. Dental Caries Detection and Classification in CBCT Images Using Deep Learning. Int. Dent. J. 2024, 74, 328–334.
15. Zanini, L.G.K.; Rubira-Bullen, I.R.F.; dos Santos Nunes, F.D.L. Enhancing dental caries classification in CBCT images by using image processing and self-supervised learning. Comput. Biol. Med. 2024, 183, 109221.
16. Gao, S.; Li, X.; Li, X.; Li, Z.; Deng, Y. Transformer based tooth classification from cone-beam computed tomography for dental charting. Comput. Biol. Med. 2022, 148, 105880.
17. Lee, J.H.; Kim, Y.T.; Lee, J.B.; Jeong, S.N. A performance comparison between automated deep learning and dental professionals in classification of dental implant systems from dental imaging: A multi-center study. Diagnostics 2020, 10, 910.
18. Yilmaz, E.; Kayikcioglu, T.; Kayipmaz, S. Computer-aided diagnosis of periapical cyst and keratocystic odontogenic tumor on cone beam computed tomography. Comput. Methods Programs Biomed. 2017, 146, 91–100.
19. Singh, P.; Sehgal, P. GV Black dental caries classification and preparation technique using optimal CNN-LSTM classifier. Multimed. Tools Appl. 2021, 80, 5255–5272.
20. Imak, A.; Celebi, A.; Siddique, K.; Turkoglu, M.; Sengur, A.; Salam, I. Dental Caries Detection Using Score-Based Multi-Input Deep Convolutional Neural Network. IEEE Access 2022, 10, 18320–18329.
21. Lee, J.H.; Kim, D.H.; Jeong, S.N.; Choi, S.H. Detection and diagnosis of dental caries using a deep learning-based convolutional neural network algorithm. J. Dent. 2018, 77, 106–111.
22. Lakshmi, M.M.; Chitra, P. Classification of Dental Cavities from X-ray images using Deep CNN algorithm. In Proceedings of the 2020 4th International Conference on Trends in Electronics and Informatics (ICOEI) (48184), Tirunelveli, India, 15–17 June 2020; pp. 774–779.
23. Gonzales, R.C.; Wintz, P. Digital Image Processing; Addison-Wesley Longman Publishing Co., Inc.: Boston, MA, USA, 1987.
24. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
25. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
26. Akalın, F.; Çavdaroğlu, P.D.; Orhan, M.F. Arrhythmia detection with transfer learning architecture integrating the developed optimization algorithm and regularization method. BMC Biomed. Eng. 2025, 7, 8.
27. Wager, S.; Wang, S.; Liang, P.S. Dropout training as adaptive regularization. Adv. Neural Inf. Process. Syst. 2013, 26, 351–359.
28. Ng, A.Y. Feature selection, L1 vs. L2 regularization, and rotational invariance. In Proceedings of the Twenty-First International Conference on Machine Learning, Banff, AB, Canada, 4–8 July 2004; p. 78.
29. Calazans, M.A.A.; Ferreira, F.A.B.; Alcoforado, M.D.L.M.G.; Santos, A.D.; Pontual, A.D.A.; Madeiro, F. Automatic classification system for periapical lesions in cone-beam computed tomography. Sensors 2022, 22, 6481.
30. Ferreira, F.B.S. Periapical Lesion Dataset. GitHub repository. [Online]. 2022. Available online: https://github.com/felipebsferreira/periapical-lesion-dataset (accessed on 14 March 2025).
31. Yadav, P.S.; Gupta, B.; Lamba, S.S. A new approach of contrast enhancement for Medical Images based on entropy curve. Biomed. Signal Process. Control 2024, 88, 105625.
32. Sonti, K.; Dhuli, R. Pattern Based Glaucoma Classification Approach using Statistical Texture Features. In Proceedings of the 2022 2nd International Conference on Artificial Intelligence and Signal Processing (AISP), Vijayawada, India, 12–14 February 2022.
33. Pan, S.J.; Yang, Q. A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 2009, 22, 1345–1359.
34. Chhabra, M.; Kumar, R. An advanced VGG16 architecture-based deep learning model to detect pneumonia from medical images. In Emergent Converging Technologies and Biomedical Systems: Select Proceedings of ETBS 2021; Springer: Singapore, 2022; pp. 457–471.
35. Chukwura, J.; Jecinta, I.C. A Review of Techniques for Regularization. Int. J. Res. Eng. Sci. 2023, 11, 360–367. Available online: www.ijres.org (accessed on 14 March 2025).
36. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
37. Akalın, F.; Aksoy, N.Ö.; Top, D.; Kara, E. Improved Filter Designs Using Image Processing Techniques for Color Vision Deficiency (CVD) Types. Symmetry 2025, 17, 1046.
38. Akalın, F.; Top, D. Proposed Image Processing Filters to Improve Digital Image Perception for Color Blindness Types. Concurr. Comput. Pract. Exp. 2025, 37, e70194.
39. Akalın, F. Survival Classification in Heart Failure Patients by Neural Network-Based Crocodile and Egyptian Plover (CEP) Optimization Algorithm. Arab. J. Sci. Eng. 2024, 49, 3897–3914.
40. Shorten, C.; Khoshgoftaar, T.M. A survey on image data augmentation for deep learning. J. Big Data 2019, 6, 1–48.
41. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015, Proceedings, Part III 18; Springer International Publishing: Cham, Switzerland, 2015; pp. 234–241.
42. Canny, J. A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 1986, 6, 679–698.
43. Chau, K.K.; Zhu, M.; AlHadidi, A.; Wang, C.; Hung, K.; Wohlgemuth, P.; Lam, W.Y.H.; Liu, W.; Yuan, Y.; Chen, H. A novel AI model for detecting periapical lesion on CBCT: CBCT-SAM. J. Dent. 2025, 153, 105526.
44. Kirnbauer, B.; Hadzic, A.; Jakse, N.; Bischof, H.; Stern, D. Automatic Detection of Periapical Osteolytic Lesions on Cone-beam Computed Tomography Using Deep Convolutional Neuronal Networks. J. Endod. 2022, 48, 1434–1440.
45. Akalin, F.; Yildiz, T. Detection and classification of enhanced periapical lesion images with YOLO algorithms. Connect. Sci. 2025, 37, 2522706.
46. Li, X.; Liu, W.; Tang, W.; Guo, J. Dense image-mask attention-guided transformer network for jaw lesions classification and segmentation in dental cone-beam computed tomography images. Appl. Intell. 2025, 55, 499.
47. Zheng, Q.; Ma, L.; Wu, Y.; Gao, Y.; Li, H.; Lin, J.; Qing, S.; Long, D.; Chen, X.; Zhang, W. Automatic 3-dimensional quantification of orthodontically induced root resorption in cone-beam computed tomography images based on deep learning. Am. J. Orthod. Dentofac. Orthop. 2025, 167, 188–201.
48. Zeng, X.; Ding, J.; Yuan, K.; Zhan, J.; He, C.; Wu, H.; Lin, H. Joint detection of dental diseases with panoramic imaging system via multi-task context integration network. Opt. Laser Technol. 2025, 192, 113394.

Figure 1. Examples of images with and without lesions in the UFPE dataset [30].

Figure 2. Stages of the proposed image processing approach applied to an image with a lesion [30].

Figure 3. Stages of the proposed image processing approach applied to an image without a lesion [30].

Figure 4. Diagram of the VGG16 architecture.

Figure 5. ROC curve for regularization method 13 integrated into the VGG16 architecture.

Figure 6. Heat maps obtained with the Grad-CAM technique [30].

Table 1. The fine-tuned hyperparameters for the proposed model.

Hyperparameter	Evaluated Values
Learning rate	0.001
Optimization algorithm	Adam
Batch size	16
Dropout rate	0.5
Number of dense units	128
Activation function	ReLU
Early stopping patience	5

Table 2. Performance metrics obtained for enhanced images.

Datasets	True Positive	False Positive	False Negative	True Negative	Accuracy (%)	Recall (%)	Precision (%)	F1-Score (%)
Processed images	90	130	35	144	58.64	72.00	40.90	52.17

Table 3. Performance metrics obtained with 18 different regularization techniques integrated into the VGG16 transfer learning architecture.

Regularization	True Positive	False Positive	False Negative	True Negative	Accuracy (%)	Recall (%)	Precision (%)	F1-Score (%)
Regularization 1	70	150	32	147	54.39	68.63	31.82	43.48
Regularization 2	140	80	90	89	57.39	60.87	63.64	62.22
Regularization 3	111	109	57	122	58.40	66.07	50.45	57.22
Regularization 4	91	129	44	135	56.64	67.41	41.36	51.27
Regularization 5	150	70	96	83	58.40	60.98	68.18	64.38
Regularization 6	116	104	63	116	58.15	64.80	52.73	58.15
Regularization 7	192	28	133	46	59.65	59.08	87.27	70.46
Regularization 8	156	64	103	76	58.15	60.23	70.91	65.14
Regularization 9	140	80	85	94	58.65	62.22	63.64	62.92
Regularization 10	107	113	54	125	58.15	66.46	48.64	56.17
Regularization 11	191	29	140	39	57.64	57.70	86.82	69.33
Regularization 12	138	82	86	93	57.89	61.61	62.73	62.16
Regularization 13	200	20	145	34	58.65	57.97	90.91	70.80
Regularization 14	117	103	64	115	58.15	64.64	53.18	58.35
Regularization 15	124	96	61	118	60.65	67.03	56.36	61.23
Regularization 16	136	84	88	91	56.89	60.71	61.82	61.26
Regularization 17	117	103	60	119	59.15	66.10	53.18	58.94
Regularization 18	50	170	25	154	51.13	66.67	22.73	33.90

Table 4. Classification performance obtained for different transfer learning models in scope of the Reg13 integration.

Models	True Positive	False Positive	False Negative	True Negative	Accuracy (%)	Recall (%)	Precision (%)	F1-Score (%)
Proposed models	200	20	145	34	58.65	57.97	90.91	70.80
Xception	141	79	89	90	57.89	61.30	64.09	62.67
ResNet50	220	0	179	0	55.14	55.14	100	71.08
MobileNetV2	79	141	30	149	57.14	72.48	35.91	48.02
MobileNetV3	220	0	179	0	55.14	55.14	100	71.08
EfficientNetB0	220	0	179	0	55.14	55.14	100	71.08
DenseNet169	103	117	51	128	57.89	66.88	46.82	55.08

Table 5. Methods used in previous studies in the literature and comparative evaluations.

Study	Year	Dataset	Objective	Method	Performance Metrics
Miki et al. [13]	2017	Anonymous 52 CBCT images	Classifying 7 different types of teeth	Data augmentation + DCNN	Accuracy is 88.8%.
Esmaeilyfard et al. [14]	2024	CBCT	Classifying teeth	CNN	Accuracy is 94.8%.
Zanini et al. [15]	2024	Dataset obtained by collecting unlabeled CBCT data of patients	Using the self-supervised learning (SSL) technique to improve the classification of dental caries	SSL + transfer learning	Accuracy is 90.44% for ResNet18 + SSL.
Gao et al. [16]	2022	Dataset obtained by collecting 3D CBCT data of patients	Automating the dental charting process and ensuring accurate and fast classification of teeth	CNN + transformer	Accuracy is 91.3%.
Lee et al. [21]	2018	Dental caries obtained by periapical radiographs	Classifying teeth	Inceptionv3	Accuracy is 89%.
Chau et al. [43]	2025	A total of 659 CBCT images	Identification of periapical lesions on CBCT	CBCT-SAM artificial intelligence (AI) model building on a previously improved AI model called PAL-Net	CBCT-SAM scored the highest for average segmentation accuracy, sensitivity, and DSC, which were 99.65% ± 0.66%, 72.36% ± 21.61%, and 0.70 ± 0.19, respectively.
Kirnbauer et al. [44]	2022	Dataset obtained by collecting CBCT data of patients	Automatic detection of osteolytic PALs (radiolucent periapical lesions) in CBCT datasets	Improving and validating a deep convolutional neural network based on SpatialConfiguration-Net and U-Net	The sensitivity and specificity values for lesion detection were 97.1% and 88.0%, respectively.
Akalin and Yildiz [45]	2025	Periapical X-ray public dataset provided by the Kaggle platform	Detection and classification of enhanced periapical lesion images	Improved image processing filter and YOLO approach	96% F Criterion was achieved in lesion detection.
Li et al. [46]	2025	A large internal dataset of 358 CBCT scans	Automated segmentation and classification of jaw lesions	Proposing of the dense image mask attention-guided transformer network for end-to-end jaw lesions classification and segmentation on 3D CBCT images based on a multi-task learning (MTL) architecture	A binary segmentation DICE score of 90%, an average classification accuracy of 89.23%, and a multi-class segmentation DICE score of 79.06% were obtained for five different types of jaw lesions.
Zheng et al. [47]	2025	210 CBCT scans from 105 patients	Localization of root resorption due to orthodontic treatment	Preprocessing on CBCT images, point segmentation, conversion to point clouds. Segmentation of tooth crowns and roots with a dynamic graph convolutional neural network; calculation of root volume and OIRR localization	The intraclass correlation coefficient for the mean volume measurements at each tooth position was above 0.95 and the accuracy of the different OIRR severity classifications exceeded 0.8.
Zeng et al. [48]	2025	Dental Panoramic Multiple Disease Dataset (DPMD) consisting of 2467 labeled panoramic radiographs	Detection of concurrent diseases and staging of periodontitis, multiscale progressive feature merging, detection of small lesion features	YOLO (MMC-YOLO) based on the multiscale progressive feature fusion module (MPFAM), coordinate-guided fusion module (CGFM) and edge-enhanced spatial fusion module (ESFM)	The mIoU performance was 97.4% for alveolar bone segmentation, 96.1% for CEJ, and 93.6% for tooth segmentation.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
