1. Introduction
Primary teeth are crucial for children’s overall development, providing support for speech, mastication, and aesthetics, while also maintaining space for the proper eruption of permanent teeth [1,2]. The natural physiological process involves the exfoliation of primary teeth followed by the eruption of permanent teeth [3]. However, the premature loss of primary teeth can disrupt this process, potentially resulting in malocclusion, ectopic eruption, and midline deviation [4]. The premature loss of primary teeth is most commonly attributed to dental caries; however, other significant contributing factors include congenital anomalies, ectopic eruption of permanent teeth, and dental trauma [5,6].
The etiology of malocclusion encompasses a complex process driven by the combined effects of genetic and environmental factors. Space loss resulting from the premature loss of primary teeth constitutes an environmental factor that may contribute to the development or exacerbation of malocclusion, ultimately increasing the need for orthodontic treatment [7]. The most effective way to prevent such issues is to ensure that primary teeth remain in the oral cavity until their natural exfoliation [8]. However, if this is not feasible, the use of space maintainers (SMs) becomes necessary [9]. Safeguarding space following the premature loss of primary teeth has the potential to minimize or entirely negate the requirement for orthodontic intervention [10].
The selection of an appropriate SM is influenced by several critical factors, including the stage of dental development, the extent of tooth loss, the number of primary teeth affected, and the degree of patient cooperation [9,11,12]. The use of fixed SMs is a well-established approach to preserving space in the dental arch following the premature loss of primary teeth [13]. Of the different types of fixed SMs, band and loop appliances are among the most commonly used for maintaining arch space in cases of single-tooth loss [14].
In addition to clinical considerations, the implementation of SMs in dental practice also varies depending on dentists’ clinical education, professional experience, patient acceptance of treatment, and financial considerations. Determining the necessity of SMs requires a thorough evaluation of multiple factors, with radiographs and space analysis serving as valuable tools in this process [5]. However, assessing the need for SMs using dental radiographs can be time-consuming and prone to human error.
Disease diagnosis requires clinicians to assess symptoms, interpret diagnostic test results, and consider other relevant factors [15]. However, this process can be affected by cognitive biases and reliance on memory, potentially influencing clinical judgment. Artificial intelligence (AI), trained on vast datasets, has demonstrated the ability to outperform even highly experienced specialists in certain clinical tasks [16,17,18]. Consequently, AI is progressively becoming an important component of contemporary healthcare, with applications extending to pediatric dentistry [16,19]. In healthcare, AI is categorized into two main domains: virtual and physical. The virtual domain encompasses machine learning (ML) and deep learning (DL) [20]. ML refers to a system’s ability to autonomously learn from data without explicit programming [21]. It includes four primary methodologies: supervised learning, unsupervised learning, reinforcement learning, and active learning [22]. Supervised learning involves analyzing labeled input data to uncover patterns, utilizing models such as Bayesian inference, decision trees, linear discriminants, support vector machines, logistic regression, and artificial neural networks [23]. DL, a more advanced subset of ML, employs multiple interconnected layers to extract features and optimize model performance [24].
AI technologies aim to develop systems and robots capable of performing tasks like pattern recognition, decision making, and adaptive problem solving, capabilities traditionally associated with human intelligence [25]. Advances in computational power, combined with innovations in machine learning techniques and neural networks, have accelerated progress in AI [26]. As a subset of AI, ML focuses on training computers to analyze large datasets, identify trends, and apply these insights for predictions or decisions [25]. AI has demonstrated transformative potential across fields such as natural language processing, autonomous vehicles, healthcare, and image recognition. In AD research, it excels at rapidly analyzing complex datasets, identifying patterns imperceptible to humans, and providing highly accurate predictions, thereby advancing the understanding and management of the disease [27,28]. DL is centered around advanced neural network architectures, including Convolutional Neural Networks (CNNs) [29] and artificial neural networks (ANNs) [30]. CNNs are a specialized type of ANN designed to process and analyze visual data, such as images. Unlike ANNs, CNNs leverage convolutional layers that apply filters (kernels) to extract spatial and hierarchical features like edges, textures, and shapes [31].
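To make the role of a convolutional filter concrete, the following minimal sketch slides a single 3 × 3 kernel over a tiny image. The image values and the hand-picked vertical-edge kernel are illustrative assumptions only; in a trained CNN the kernel weights are learned from data rather than chosen by hand.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists) with stride 1 and no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            # Weighted sum of the image patch under the kernel.
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical 3 x 6 grayscale image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(3)]

# Hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[0, 3, 3, 0]]
```

The feature map is non-zero only where the kernel straddles the boundary between the dark and bright regions, which is the sense in which a filter "extracts" an edge.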
AI is utilized in various areas of pediatric dentistry, including dental plaque detection, oral health assessment, supernumerary tooth identification, detection of early childhood caries, fissure sealant categorization, chronological age assessment, deciduous and young permanent tooth detection, ectopic eruption detection, and behavior management. Previous studies have compared the clinical success of traditional and 3D-printed space maintainers [32]. One study has evaluated the effectiveness of AI-based chatbots in providing reliable and high-quality information on space maintainers for pediatric patients and their parents [33]. A recent study developed a CNN model based on YOLOv3 to detect mesiodens in panoramic radiographs and identify specific tooth numbers, achieving high accuracy [34]. Another study has shown the effectiveness of CNNs in dental image analysis, particularly in detecting third molar angles using object detection models [35].
Unlike traditional deep learning models that focus solely on tooth type classification, such as the ZNet model proposed in [36], our CNN model not only predicts the necessity of a space maintainer (SM) but also identifies the specific tooth requiring intervention, providing a more clinically actionable solution for pediatric dentistry. This dual-function capability enhances clinical decision making by providing precise localization, which is crucial for pediatric dentists. By integrating both classification and localization, our model streamlines the treatment planning process, reducing reliance on time-consuming manual assessments. The CNN model has the potential to significantly reduce errors attributable to human oversight and may serve as a valuable resource for guiding future dental professionals [19]. This innovation bridges the gap between AI-driven dental diagnostics and real-world clinical applications, making it a valuable tool for digital dentistry.
This study aims to leverage the capabilities of AI to facilitate the automated assessment of the need for an SM through the detailed analysis of dental radiographic imagery. It is hypothesized that the AI model will accurately predict the need for a space maintainer and identify specific teeth requiring intervention, thus providing a reliable and efficient tool for pediatric dentists in clinical practice.
2. Methods
In this study, we developed an AI model to predict the need for dental SMs using X-ray images. The evaluation was conducted specifically for single-tooth loss, focusing only on a specific tooth in the radiographs, and the type of SM to be used was not assessed. The evaluation of space maintainer necessity was conducted using a dual-expert evaluation approach, considering factors such as the time elapsed since the extraction, the available space, the amount of bone covering the permanent tooth germ, the patient’s dental age, and the sequential eruption pattern of the teeth. The process involved multiple stages, beginning with the preprocessing of dental X-rays into numerical data using OpenCV for compatibility with machine learning models. A CNN was then designed, trained, and tested using a labeled dataset to classify whether an SM is required and, if so, to identify the specific tooth. The methodology is detailed in the following subsections, which cover data preparation, image processing, model architecture, training and testing, and evaluation metrics. While CNNs are widely used in medical imaging, our study is innovative in applying deep learning specifically to SM prediction—an underexplored area in pediatric dentistry. Unlike existing models that focus on diagnosing dental conditions, our approach not only determines the need for an SM but also identifies the specific tooth requiring intervention. Additionally, we implemented standardized image preprocessing and a rigorous evaluation framework. Our work contributes to preventive dentistry, offering an AI-driven tool to assist clinicians in objective decision making.
2.1. Data Preparation and Image Processing
The dataset used in this study consisted of 400 dental X-ray images, of which 195 were labeled as requiring an SM and 205 were labeled as not requiring one. The decision to use 400 images was influenced by several factors, including the need to have a balanced representation of both classes (SM needed and SM not needed) and to allow for robust model training and validation. These images were processed to standardize their format and prepare them for training and testing the artificial intelligence model. To ensure a reliable evaluation of the model’s performance, the dataset was divided into training and testing subsets, with 80% of the data used for training and 20% reserved for testing. The split was performed in a stratified manner to maintain an equal representation of both labels in each subset.
The original images in the dataset are generally 2000 × 1000 pixels. For CNN image processing, each image was resized to 100 × 100 pixels with three color channels (RGB format) using Python 12.0 code, ensuring uniformity in dimensions across the entire dataset. The resizing was performed using interpolation methods to maintain aspect ratio and minimize distortion. This resizing step was critical for the consistency required by the machine learning algorithms, as it ensured that all input data shared the same shape and scale. Following resizing, the images were flattened into one-dimensional numerical arrays. This transformation preserved the pixel intensity values while reducing the spatial complexity of the images, enabling efficient storage and processing. The flattened arrays served as a numerical representation of the images, capturing all the relevant features necessary for the learning process. The labels for the dataset were encoded numerically, with “0” representing cases where no SM was needed and “1” representing cases where an SM was required. The processed data were divided into training and testing files.
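The resize-then-flatten step can be illustrated with a dependency-free sketch. The study used OpenCV with interpolation; the code below instead uses plain nearest-neighbour sampling on nested lists as a stand-in, and the function names `resize_nearest` and `flatten` are hypothetical helpers introduced here for illustration.

```python
def resize_nearest(image, out_h, out_w):
    """Resize an H x W x C nested-list image by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def flatten(image):
    """Flatten an H x W x C image into one list, row by row, pixel by pixel."""
    return [value for row in image for pixel in row for value in pixel]


# Hypothetical 4 x 8 RGB image in which each pixel stores (row, column, 0).
image = [[(r, c, 0) for c in range(8)] for r in range(4)]

small = resize_nearest(image, 2, 2)
vector = flatten(small)
print(vector)  # [0, 0, 0, 0, 4, 0, 2, 0, 0, 2, 4, 0]

# Length of one flattened 100 x 100 RGB image, as used in the study.
FLATTENED_LENGTH = 100 * 100 * 3
print(FLATTENED_LENGTH)  # 30000
```

Each flattened image therefore occupies 30,000 values per row of the CSV file. Note that mapping a 2000 × 1000 image directly onto a 100 × 100 grid samples the width twice as coarsely as the height; whether the images were padded or cropped beforehand is not stated in the text.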
The training set, represented by “X_train”, consisted of 320 flattened images (80% of the dataset), while the corresponding labels, “Y_train”, comprised 165 zeros and 155 ones. Similarly, the testing set, represented by “X_test”, included 80 flattened images (20% of the dataset), with “Y_test” containing 40 zeros and 40 ones. These files were stored in CSV format to ensure compatibility with the machine learning pipeline. By employing this systematic preprocessing pipeline, the study ensured that the dataset was both standardized and appropriately structured for machine learning applications. This approach not only enhanced the model’s training efficiency but also improved the reproducibility of the methodology. For tooth-number prediction, instead of relying solely on a separate test set, the model utilized all 400 images and predicted the tooth number based on its previous classification outputs. This approach allowed us to assess the model’s consistency and accuracy in identifying the correct tooth. Among these, 86 images specifically indicated the need for a space maintainer and were used for tooth-number prediction, serving as a focused subset for evaluating prediction accuracy.
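The reported class counts can be reproduced with a short sketch. It assumes that the test set was formed by drawing a fixed number of images (40) from each class, which is one reading of "equal representation"; the function name `split_fixed_per_class` and the random seed are hypothetical, and the study's actual splitting code is not given.

```python
import random


def split_fixed_per_class(labels, n_test_per_class, seed=0):
    """Hold out `n_test_per_class` randomly chosen samples of every class."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        test_idx += idx[:n_test_per_class]
        train_idx += idx[n_test_per_class:]
    return sorted(train_idx), sorted(test_idx)


# 205 images labeled 0 (no SM needed) and 195 labeled 1 (SM needed).
labels = [0] * 205 + [1] * 195

train_idx, test_idx = split_fixed_per_class(labels, n_test_per_class=40)
y_train = [labels[i] for i in train_idx]
y_test = [labels[i] for i in test_idx]

print(len(y_train), y_train.count(0), y_train.count(1))  # 320 165 155
print(len(y_test), y_test.count(0), y_test.count(1))     # 80 40 40
```

A strictly proportional stratified split of 205/195 at 20% would instead hold out 41 and 39 images, so the 40/40 test set reported above is consistent with the fixed-per-class reading.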
2.2. AI Architecture and Performance Metrics
The predictive system for determining the need for space maintainers was built using a CNN, designed to process and analyze dental X-ray images. The architecture consisted of multiple convolutional and pooling layers, followed by fully connected layers to perform binary classification. The specific layers and their configurations were as follows:
Convolutional Layers:
- The first convolutional layer consisted of 256 filters with a kernel size of 3 × 3 and ReLU activation. This layer extracted low-level spatial features from the input images.
- Subsequent layers included 256 filters, also with a kernel size of 3 × 3 and ReLU activation. These layers progressively extracted higher-level features essential for classification.
Pooling Layers:
- Max-pooling layers with a pool size of 2 × 2 were interspersed between convolutional layers to reduce the spatial dimensions of the feature maps, thereby controlling overfitting and improving computational efficiency.
Fully Connected Layers:
- A dense layer with 256 neurons and ReLU activation was added to integrate the extracted features.
- The final dense layer consisted of a single neuron with sigmoid activation, outputting a probability score for binary classification (0: no space maintainer needed, 1: space maintainer needed).
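The layer list above can be traced numerically to see how feature-map sizes and parameter counts evolve. The sketch below is arithmetic only, not the study's implementation, and it rests on assumptions that the text does not state: three convolution-plus-pooling blocks (`N_BLOCKS`), stride 1, no padding, and a pooling layer after every convolution.

```python
def conv_pool_block(h, w, c, filters=256, k=3, pool=2):
    """One k x k convolution (stride 1, no padding) followed by max pooling."""
    params = (k * k * c + 1) * filters   # weights plus one bias per filter
    h, w = h - k + 1, w - k + 1          # convolution without padding
    h, w = h // pool, w // pool          # 2 x 2 max pooling
    return h, w, filters, params


N_BLOCKS = 3  # assumption: the number of convolutional blocks is not reported

h, w, c = 100, 100, 3  # input size given in Section 2.1
total_params = 0
shapes = []
for _ in range(N_BLOCKS):
    h, w, c, params = conv_pool_block(h, w, c)
    total_params += params
    shapes.append((h, w, c))

flat = h * w * c                       # size of the flattened feature vector
total_params += (flat + 1) * 256       # dense layer with 256 neurons
total_params += 256 + 1                # single sigmoid output neuron

print(shapes)        # [(49, 49, 256), (23, 23, 256), (10, 10, 256)]
print(flat)          # 25600
print(total_params)  # 7741441
```

Under these assumptions most of the parameters sit in the first dense layer, which is a common pattern when a flattened feature map feeds a fully connected layer.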
Figure 1 shows a graphical representation of the CNN architecture, illustrating the layers and connections that enable the model to perform image classification. The model was trained for 30 epochs with a batch size of 64 to balance computational efficiency and convergence. A binary cross-entropy loss function was employed, which is well suited for binary classification tasks, and an optimizer was utilized to minimize the loss, ensuring efficient convergence during training. The model’s parameters were fine-tuned to optimize its predictive capabilities and to generalize well to unseen data. The performance of the CNN model was evaluated using a comprehensive set of metrics to capture its predictive capabilities and reliability.
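As an illustration of the loss used during training, the sketch below computes binary cross-entropy for a few hand-made predictions and derives the number of weight updates implied by the stated batch size. The example probabilities are hypothetical, and the update count assumes that all 320 training images are used in every epoch, which the text does not confirm.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical predictions for one positive and one negative case.
confident = binary_cross_entropy([1, 0], [0.9, 0.1])  # correct and confident
unsure = binary_cross_entropy([1, 0], [0.5, 0.5])     # no information
wrong = binary_cross_entropy([1, 0], [0.1, 0.9])      # confident but wrong

# Weight updates implied by 320 training images, batch size 64, 30 epochs.
steps_per_epoch = math.ceil(320 / 64)
total_steps = steps_per_epoch * 30
print(steps_per_epoch, total_steps)  # 5 150
```

The loss grows sharply for confident mistakes, which is why it pairs naturally with a sigmoid output that produces probabilities.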
Table 1 lists the performance metrics used in this study. Accuracy was used to measure the overall correctness of the model’s predictions, while precision evaluated the proportion of true positive predictions among all positive predictions, indicating the model’s ability to avoid false positives. Recall, or sensitivity, assessed the proportion of true positive cases detected, highlighting the model’s ability to identify cases requiring space maintainers. The F1-score, the harmonic mean of precision and recall, provided a balanced assessment of the model’s performance. Additionally, the area under the Receiver Operating Characteristic curve (ROC AUC) quantified the model’s ability to distinguish between classes, with higher values indicating better discriminatory power. It is computed as the area under the ROC curve, which plots the true positive rate (TPR) against the false positive rate (FPR). The Matthews Correlation Coefficient (MCC) was also calculated to evaluate the correlation between predicted and true labels, accounting for both true and false predictions, which made it particularly useful for imbalanced datasets.
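For reference, the standard definitions of these metrics in terms of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) are given below. These are the textbook formulas; the symbols are introduced here and are assumed to match the entries of Table 1.

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall} = \text{TPR} &= \frac{TP}{TP + FN} \\
\text{FPR}       &= \frac{FP}{FP + TN} \\
F_1              &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \\
\text{MCC}       &= \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
\end{align}
```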
The comprehensive evaluation of these metrics ensured a robust assessment of the model’s performance, emphasizing its clinical relevance. High precision and recall scores were critical in this application to minimize false positives, which could lead to unnecessary interventions, and false negatives, which might result in missed cases requiring intervention. The ROC AUC and MCC further validated the model’s robustness, demonstrating its reliability in accurately predicting the need for space maintainers across diverse cases. By achieving high performance across these metrics, the system showed potential to assist dental professionals in making faster and more accurate clinical decisions.
3. Results
The results obtained from the experiments conducted in this study provide a comprehensive evaluation of the model’s ability to predict the need for space maintainers using dental X-ray images. These results are presented to illustrate the effectiveness and reliability of the CNN in addressing the classification task. Performance metrics such as accuracy, precision, recall, F1-score, ROC AUC, and MCC were employed to assess the model’s predictive capabilities. The findings highlight the model’s capacity to make accurate predictions and demonstrate its potential as a valuable tool in clinical decision making for pediatric dentistry. The following sections provide a detailed analysis of these results, supported by quantitative metrics and visualizations.
Figure 2 illustrates the training performance of the CNN used for predicting the need for space maintainers. The left panel shows the training accuracy over 30 epochs, with the training and validation accuracy steadily increasing as the model learns the patterns within the data. By the end of the training process, the accuracy approaches 1.0, indicating that the model has effectively captured the features necessary for accurate predictions. The right panel depicts the corresponding training loss, which consistently decreases over the epochs, reflecting the model’s ability to minimize errors between predicted and actual labels. Together, these trends demonstrate the model’s capacity to learn effectively from the training data, achieving high accuracy and low loss.
The ROC curve presented in Figure 3 illustrates the model’s performance in distinguishing between two classes: cases where a space maintainer is required and cases where it is not. The x-axis represents the false positive rate (1 − specificity), while the y-axis represents the true positive rate (sensitivity), depicting the balance between correctly identifying positive cases and minimizing false alarms. As the decision threshold changes, the curve demonstrates how the model’s sensitivity and specificity vary. The ROC curve closely approaches the top-left corner, which indicates a high discriminative ability of the model. The AUC is calculated as 0.94, signifying strong classification performance. AUC values range from 0 to 1, where 0.5 suggests no discriminatory power (equivalent to random guessing), and 1.0 represents a perfect classifier. An AUC of 0.94 confirms the model’s strong ability to correctly identify cases requiring a space maintainer while minimizing false positives. This high AUC value highlights the robustness and reliability of the model in making accurate predictions. Given this performance, the model has significant potential as a decision-support tool in pediatric dentistry, assisting clinicians in making informed and efficient assessments regarding the necessity of space maintainers.
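The AUC also has a simple probabilistic reading: it is the chance that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it that way on a few hypothetical scores; it illustrates the definition and is not the routine used in the study.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct ranking.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model probabilities for six radiographs.
y_true = [0, 0, 0, 1, 1, 1]
scores = [0.10, 0.40, 0.60, 0.35, 0.80, 0.90]

auc = roc_auc(y_true, scores)
print(round(auc, 3))  # 0.778
```

Seven of the nine positive-negative pairs are ordered correctly in this toy example, giving 7/9; a value of 0.94 means that roughly 94% of such pairs are ordered correctly.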
In Figure 4, the confusion matrix visualizes the performance of the model in predicting the need for a space maintainer using 80 flattened images. It provides a detailed breakdown of the true positives, true negatives, false positives, and false negatives.
The matrix contains four quadrants:
The top-left quadrant shows 38 true negatives, where the model correctly predicted that no space maintainer was needed.
The bottom-right quadrant indicates 37 true positives, representing cases where the model accurately identified the need for a space maintainer.
The top-right quadrant reflects two false positives, where the model incorrectly predicted the need for a space maintainer when it was not required.
The bottom-left quadrant has three false negatives, indicating that the model missed three cases that required a space maintainer.
This performance indicates that the model exhibits high accuracy and sensitivity: it correctly identified 37 of the 40 cases requiring intervention (sensitivity of 0.93) while producing only two false positives.
The metrics and classification report provide a comprehensive evaluation of the model’s performance in predicting the need for a space maintainer. The overall accuracy of the model is 94%, which demonstrates its strong ability to correctly classify both classes (need and no need for a space maintainer). This high accuracy reflects the model’s robustness and reliability when applied to dental X-ray data.
Table 2 shows the classification report that provides key performance metrics for the model’s ability to distinguish between the two classes: cases where a space maintainer is needed (Class 1) and cases where it is not needed (Class 0). For Class 0 (Not Needed), the precision is 0.93, indicating that 93% of the instances predicted as Class 0 were correct. The recall for this class is 0.95, meaning the model successfully identified 95% of actual Class 0 cases. The F1-score, which balances precision and recall, is 0.94, confirming strong performance in detecting cases where a space maintainer is unnecessary. For Class 1 (Needed), the precision is 0.95, meaning that 95% of the instances predicted as Class 1 were correctly classified. The recall is 0.93, showing that the model correctly identified 93% of actual Class 1 cases. The F1-score for this class is also 0.94, indicating an effective balance between precision and recall. The macro average precision, recall, and F1-score are all 0.94, representing the arithmetic mean of these metrics across both classes. Similarly, the weighted average precision, recall, and F1-score are also 0.94, reflecting the overall performance while accounting for class support (number of instances per class). These high values indicate that the model performs consistently well across both classes.
The additional metrics, including the ROC AUC score of 0.94 and the MCC of 0.875, underscore the model’s high discriminative power and reliability. The ROC AUC score reflects the model’s capability to distinguish between the two classes, while the MCC provides a balanced measure of performance, particularly useful for datasets with slight class imbalances.
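The reported values can be checked directly from the confusion matrix in Figure 4 (38 true negatives, 2 false positives, 3 false negatives, 37 true positives). The sketch below recomputes them with the standard formulas; the variable names are chosen here for illustration.

```python
import math

# Counts taken from the confusion matrix described for Figure 4.
tn, fp, fn, tp = 38, 2, 3, 37

accuracy = (tp + tn) / (tp + tn + fp + fn)

# Class 1 (SM needed)
precision_1 = tp / (tp + fp)
recall_1 = tp / (tp + fn)
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)

# Class 0 (SM not needed): the roles of positives and negatives are swapped.
precision_0 = tn / (tn + fn)
recall_0 = tn / (tn + fp)
f1_0 = 2 * precision_0 * recall_0 / (precision_0 + recall_0)

mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

print(f"accuracy={accuracy:.4f}")  # accuracy=0.9375
print(f"class 1: precision={precision_1:.4f} recall={recall_1:.4f} f1={f1_1:.4f}")
print(f"class 0: precision={precision_0:.4f} recall={recall_0:.4f} f1={f1_0:.4f}")
print(f"mcc={mcc:.4f}")            # mcc=0.8753
```

The recomputed accuracy (0.9375), per-class F1-scores (about 0.94), and MCC (about 0.875) agree with the rounded figures in Table 2 and the text; the Class 1 recall is exactly 37/40 = 0.925, reported as 0.93.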
Figure 5 demonstrates the model’s predictions for 86 X-ray images, where each image was previously annotated by a dentist to identify the teeth requiring space maintainers. The tooth number on the x-axis follows the two-digit FDI notation (e.g., 75 and 85 for the primary mandibular second molars), which is commonly used in dentistry to label teeth in a standardized manner. Each point represents a prediction made by the CNN-based AI model, correlating the tooth number with the model’s probability of requiring a space maintainer.
The y-axis displays the prediction probabilities generated by the model, ranging from 0 to 1, where higher values indicate a stronger prediction that a space maintainer is necessary. A threshold of 0.5 was applied as follows:
Predictions above 0.5 (marked in blue) indicate that the model predicts the need for a space maintainer.
Predictions below 0.5 (marked in red) indicate that the model does not predict the need for a space maintainer.
The results in this figure allow for an assessment of how well the model aligns with expert annotations. The clustering of blue points near 1.0 and red points near 0.0 suggests that the model makes confident classifications in most cases. However, a few predictions close to the 0.5 threshold may indicate uncertain cases, which could require further validation. Out of the 86 images tested, the model produced eight errors.
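The thresholding rule and the comparison with expert annotations can be sketched as follows. The five cases are hypothetical and are not taken from Figure 5; only the 0.5 threshold and the 8-out-of-86 error count come from the text. How a probability of exactly 0.5 was handled is not stated, so the sketch treats it as "not needed" by assumption.

```python
THRESHOLD = 0.5


def needs_sm(probability, threshold=THRESHOLD):
    """Return True when the model's probability exceeds the threshold."""
    return probability > threshold


# Hypothetical (tooth number, model probability, expert label) triples.
cases = [
    (75, 0.97, 1),
    (85, 0.08, 0),
    (65, 0.52, 0),
    (84, 0.46, 1),
    (55, 0.91, 1),
]

# Predictions that disagree with the expert annotation.
errors = [(tooth, p) for tooth, p, label in cases if int(needs_sm(p)) != label]

# Predictions within 0.1 of the threshold, which may merit manual review.
borderline = [(tooth, p) for tooth, p, _ in cases if abs(p - THRESHOLD) < 0.1]

# Agreement rate implied by the reported 8 errors among 86 images.
reported_agreement = (86 - 8) / 86

print(errors)                        # [(65, 0.52), (84, 0.46)]
print(borderline)                    # [(65, 0.52), (84, 0.46)]
print(round(reported_agreement, 3))  # 0.907
```

In this toy example both errors fall near the threshold, which mirrors the observation above that predictions close to 0.5 are the uncertain ones. Eight errors in 86 images correspond to an agreement rate of about 90.7%.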
Figure 6 illustrates a dental X-ray where the CNN model predicted that there is no need for an SM following the extraction of teeth 75 and 85. This prediction is generated by the trained CNN, which analyzes the features of the X-ray image, such as tooth alignment, spacing, and other structural patterns. For each case, the model evaluates these features and assigns a probability, determining whether an SM is necessary. In this instance, the prediction confidently indicates that the natural spacing and alignment are adequate, and no intervention is required. This demonstrates the model’s ability to automate and streamline the decision-making process in pediatric dentistry.
Figure 7 illustrates the prediction result generated by the proposed CNN model, highlighting the necessity of a space maintainer following the extraction of tooth 75. The panoramic radiograph displays a clear bounding box around the affected region, emphasizing the location of the missing or soon-to-be-extracted tooth. The model successfully identifies the specific tooth requiring intervention and labels it accordingly, demonstrating its ability to detect and classify cases where space maintainers are essential. This result confirms the model’s efficacy in assisting clinicians by providing automated and accurate assessments, ultimately aiding in timely and effective treatment planning for pediatric patients.
4. Discussion
The findings from this study emphasize the potential of the CNN model as a robust tool for predicting the need for dental SMs based on X-ray imagery. The model demonstrated a high overall accuracy of 94%, supported by other performance metrics such as precision, recall, F1-score, and ROC AUC. These results indicate that the model can effectively support pediatric dentists by automating the assessment of dental X-rays, minimizing diagnostic delays, and enhancing treatment planning. Below, we discuss the insights derived from these results, compare them with findings in similar studies, and identify limitations and opportunities for future research.
The results suggest that the CNN model generalizes well to the held-out test set, as evidenced by its low false positive and false negative rates. The high precision for Class 1 (0.95) indicates that few SMs would be recommended unnecessarily, which is critical in pediatric dentistry. Similarly, the recall for Class 1 (0.93) shows that the model identified most cases requiring an SM, missing three of the 40 positive test cases. Visual examples, such as Figure 6 and Figure 7, highlight the practical application of the model. The accurate identification of structural dental anomalies (e.g., spacing issues) demonstrates the model’s capability to process and analyze complex patterns in X-ray images. Furthermore, the model’s ability to pinpoint specific tooth numbers adds a layer of precision that could streamline clinical workflows and decision-making processes.
The ROC AUC score of 0.94 obtained in this study is consistent with the results reported in other dental AI applications, such as [34], where CNN models were applied to classify orthodontic issues or identify dental caries. The confusion matrix and classification report highlight the model’s ability to maintain a balance between precision and recall, similar to the results observed in studies such as [41], where CNNs were used to predict the need for orthodontic retainers. These comparisons affirm that the model’s performance is not only competitive but also consistent with the capabilities of state-of-the-art AI systems in dental diagnostics.
The focus on tooth-level predictions aligns with emerging trends in personalized dental care, as seen in [42], where AI models are tailored for patient-specific interventions. Previous studies have implemented models such as U-Net [43], SegNet [44], BiSeNet [45], and Dense-ASPP [46] for dental image segmentation, each of which involves many trainable parameters. The U-Net-based approach focuses on a hybrid loss function weighted on tooth edges rather than architectural modifications. However, that method relied on hyperparameters that may not be optimal and was validated on a limited number of edge-optimized images. In contrast, our model employs tuned hyperparameters to enhance prediction accuracy while maintaining computational efficiency. Recent research has shown that BERT-based classification of pediatric dental diseases achieved an accuracy of 77%, while a 1D-CNN reached 84%, outperforming other pretrained CNNs [47]. In contrast, our CNN model leverages direct radiographic image features rather than text-based transformations, enabling more precise tooth localization and treatment prediction. Recent advancements in AI-driven pediatric dental analysis have focused on automating primary teeth segmentation from CBCT scans, achieving expert-level accuracy (98%) and significantly reducing the processing time compared to manual methods [48]. Recent studies have demonstrated the effectiveness of deep learning models, such as YOLOv8, in improving diagnostic accuracy for interproximal caries detection, achieving a high precision of 96.03% for enamel caries and reducing false negatives [49]. Recent research has also demonstrated the effectiveness of CNN models in diagnosing dental diseases by classifying radiographic images into different categories, such as fillings, cavities, and implants [50]. While such segmentation models enhance treatment planning efficiency, they do not predict specific clinical interventions, such as the necessity of a space maintainer. Our CNN model goes beyond segmentation by not only identifying teeth but also determining the tooth number, offering a more comprehensive AI-assisted decision-making tool for pediatric dentistry. The development of a no-code AI model for detecting primary proximal surfaces from bitewing radiographs demonstrates the increasing potential of AI in pediatric dental diagnostics, achieving high accuracy and precision with a limited dataset [51]. While this model focuses on caries detection, our CNN model targets a different aspect of pediatric dentistry by predicting the need for space maintainers and localizing specific teeth for intervention. Both models highlight the value of AI in enhancing diagnosis and treatment planning, but our approach integrates both classification and localization for more comprehensive clinical decision support.
Despite the promising performance of the model, several limitations must be considered. Firstly, the dataset size was relatively small, with only 400 images for binary classification and 86 for tooth-level prediction. While the model performed well within this context, a larger, more diverse dataset would be crucial to improve its generalizability and robustness, ensuring that it can perform effectively across a broader range of cases. Additionally, the model was trained and tested on a single dataset, which may limit its applicability to other populations or imaging modalities. Variations in dental X-ray characteristics, such as differences in equipment, settings, and patient demographics, could influence the model’s performance when applied to different clinical environments. Moreover, while the dataset was balanced for the binary classification task, there were slight imbalances in the tooth-specific predictions, which could impact the model’s accuracy for underrepresented categories. Lastly, error analysis revealed higher prediction errors for tooth numbers 75 and 84, suggesting the need for further investigation to determine whether these errors are due to model limitations, dataset bias, or the inherent challenges of interpreting certain dental structures. To better understand the misclassifications, we analyzed the prediction probabilities across different tooth numbers (Figure 5). The results indicate that errors are primarily concentrated around tooth numbers 65 and 75, suggesting potential feature overlap between these teeth. Additionally, our confusion matrix (Figure 4) reveals that false positives (n = 2) and false negatives (n = 3) occur in cases with reduced contrast or anatomical similarity to adjacent teeth. Furthermore, inconsistencies in the radiographic quality, such as variations in brightness and contrast, may have contributed to the observed errors. To mitigate these issues, future improvements will focus on data augmentation strategies to enhance feature diversity, the optimization of classification thresholds, and potential architectural modifications to strengthen feature extraction in ambiguous cases.
Collaborations with multiple institutions or incorporating public datasets could improve the model’s generalizability. Extending the model to handle multi-class predictions would broaden its clinical applicability by enabling it to classify additional dental conditions beyond the need for SMs. Leveraging transfer learning techniques with pretrained models on larger datasets, such as ImageNet, could further enhance performance, particularly when dealing with limited datasets. Incorporating explainability methods like Grad-CAM could provide valuable insights into the model’s decision-making process, increasing trust and usability among clinicians.