Intelligent Healthcare Platform for Diagnosis of Scalp and Hair Disorders

Ha, Changjin; Go, Taesik; Choi, Woorak

doi:10.3390/app14051734

by Changjin Ha ¹, Taesik Go ²,³,* and Woorak Choi ⁴,*

¹ Department of Software Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea
² Division of Biomedical Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea
³ Division of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea
⁴ Department of Mechanical Engineering, Korea National University of Transportation, Chungju 27469, Republic of Korea
* Authors to whom correspondence should be addressed.

Appl. Sci. 2024, 14(5), 1734; https://doi.org/10.3390/app14051734

Submission received: 19 January 2024 / Revised: 16 February 2024 / Accepted: 19 February 2024 / Published: 21 February 2024

(This article belongs to the Section Biomedical Engineering)

Abstract

Various scalp and hair disorders distress numerous people. Severe scalp hair disorders have an adverse effect on appearance, self-confidence, and quality of life. Therefore, early and exact diagnosis of various scalp hair disorders is important for timely treatment. However, the conventional manual examination method is time-consuming, subjective, and labor-intensive. The presented study proposes an intelligent healthcare platform for identifying the severity levels of six common scalp hair disorders: dryness, oiliness, erythema, folliculitis, dandruff, and hair loss. To establish a suitable scalp image classification model, we tested three deep learning models (ResNet-152, EfficientNet-B6, and ViT-B/16). Among the three tested deep learning models, the ViT-B/16 model exhibited the best classification performance with an average accuracy of 78.31%. In addition, the attention rollout method was applied to explain the decision of the trained ViT-B/16 model and highlight approximate lesion areas with no additional annotation procedure. Finally, Scalp checker software was developed based on the trained ViT-B/16 model and the attention rollout method. Accordingly, the proposed platform facilitates objective monitoring of scalp states and early diagnosis of hairy scalp problems.

Keywords: hairy scalp disorders; diagnosis; deep learning; explainable artificial intelligence; assistant computer program

1. Introduction

Scalp and hair disorders have become prevalent because of high stress, dietary changes, and exposure to harmful environmental substances. Common scalp and hair symptoms include dryness, oiliness, erythema, folliculitis, and dandruff [1,2,3,4]. If these symptoms persist for a long time, they are commonly accompanied by inflammatory scalp disorders which may lead to severe alopecia [2,5,6]. Severe scalp and hair disorders can also affect one’s appearance, resulting in losing self-confidence, generating additional social barriers, and degrading the quality of life [7,8]. Therefore, it is crucial to diagnose various scalp and hair disorders at an early stage and receive proper care and treatment based on an accurate diagnostic result. In addition, the size of the global scalp and hair care market was valued at USD 91.60 billion in 2022 and is expected to grow annually by 5.8% with increasing demand for scalp and hair healthcare [9].

The current gold standard for diagnosing scalp hair conditions is a manual examination by a dermatologist or professional therapist. However, typical visual examination is time-consuming, inspector-dependent, and labor-intensive [1,4,10]. A patient should visit a dermatologist or professional hair therapist to find out the state of the scalp and hair. Diagnostic results may vary depending on the skill of the expert. Furthermore, long training time and high education costs are required to determine the scalp and hair condition.

Deep learning has been widely introduced in various research fields including object recognition, natural language processing, bio-medical diagnosis, and treatment [11]. For example, Zhang et al. proposed convolutional neural network (CNN)-based models to simultaneously track multiple objects [12] and precisely detect multiple targets from infrared images [13]. A number of deep learning models have also been applied for scalp and hair healthcare (Table 1) [14]. Wang et al. tested several machine learning algorithms and the ImageNet-VGG-f deep learning model to categorize microscopic scalp images into four groups: bacteria type 1, bacteria type 2, allergy, and dandruff images [15]. The same group proposed ScalpEye using a Faster R-CNN model to detect four scalp hair symptoms: dandruff, folliculitis, hair loss, and oily hair [4]. Podlodowski et al. adopted an ensemble of VGG-16 networks to mark hair follicles on microscopic images [16]. Jhong et al. used a CNN model with the convolutional block attention module (CBAM) and spinal fully connected (FC) layers to assess the severity of dandruff [1]. Benhabiles et al. compared the performance of three CNN-based models to detect hair loss severity from facial images [17]. Lee et al. modified a conventional U-net model for identifying the scalp and hair loss area [18]. Sayyad et al. suggested a VGG-SVM model to recognize alopecia [19]. Meng et al. proposed D-Net and R-Net to detect hair follicles and evaluate hair density [20]. Kim et al. detected hair follicles and predicted the number of hairs in each follicle using EfficientDet, YOLOv4, and DetectoRS, which are object detection networks [21]. Kim et al. applied the Mask R-CNN framework to sort the hair follicle conditions into three groups (healthy, normal, and severe) [10]. Kim et al. compared the performance of individual deep learning models and their ensemble models for diagnosing the severity of alopecia [22]. However, studies that simultaneously diagnose the severity and problem areas of various scalp and hair disorders have not yet been attempted.

In the present study, we developed a deep learning-based intelligent platform to predict the severity of each scalp and hair disorder and localize the lesion region. The open dataset from AI-Hub was used to train deep learning models [23]. The dataset consists of six types of scalp hair disorders and four severity levels for each disorder. Pre-trained ResNet [24], EfficientNet [25], and ViT [26] models, which have been effective for the image classification task, were fine-tuned and the performance of the three models was evaluated. In addition, eXplainable Artificial Intelligence (XAI) techniques such as Grad-CAM [27] and attention rollout [28] methods were applied to highlight the disorder region in a scalp image depending on the type of scalp disorder. Finally, the user-friendly software was also developed to easily recognize hairy scalp health states and check trends of the severity of each disorder by date. In summary, the main contributions of this work are as follows: (1) we applied three deep learning models and assessed their performances to accurately estimate severities of six common scalp and hair disorders; (2) we utilized XAI methods to visually demonstrate model explainability and approximate lesion region; (3) we developed the assistant program ‘Scalp checker’ to help users analyze and track their hairy scalp health states.

The remainder of this paper is organized as follows. Section 2 describes the dataset, deep learning models, and XAI visualization techniques used in this study. Section 3 compares the performance of three deep learning models with XAI methods and shows Scalp checker software development results. Section 4 presents the discussion and future studies. Section 5 concludes this study.

2. Materials and Methods

The overall procedure for diagnosing the severity levels of six scalp and hair disorders is described in Figure 1. The scalp image open dataset was acquired from AI-Hub [23]. The open dataset consisted of six scalp and hair disorders (Figure 1a) and each disorder was divided into four severity levels (Table 2). The original dataset was augmented, resized, and normalized to fine-tune three deep learning models: ResNet-152, EfficientNet-B6, and ViT-B/16 (Figure 1b). XAI techniques, such as Grad-CAM and attention rollout, were also applied to explain the decisions of the trained deep learning models and visualize the lesion region (Figure 1b). When the classification performance of the three deep learning models was compared, the ViT-B/16 model showed the best performance. Based on the ViT-B/16 model and attention rollout technique, ‘Scalp checker’ software was developed. When a microscopic scalp image is entered into the software, the software visually presents the severity levels of the six individual disorders and the problem regions (Figure 1c).

2.1. Dataset

The open dataset of scalp images was utilized [23]. The dataset was collected from South Koreans aged from 10 to 70. The dataset was composed of over 100,000 magnified scalp images (×60) acquired from the vertex, anterior, posterior, right, and left of the head. Each image had a pixel resolution of 640 × 480. In the dataset, a total of six scalp hair disorders (dryness, oiliness, erythema, folliculitis, dandruff, and hair loss) were included and each disorder was categorized into four severity levels (0: none, 1: mild, 2: moderate, 3: severe, Table 2). To generate the annotated scalp image dataset [23], Bundang Seoul National University Hospital dermatologists determined the type of hairy scalp problem and its severity level based on the scalp photographic index [2]. The number of original images for each disorder is summarized in Table 3.

The original data were imbalanced (Table 3). Some datasets were therefore augmented through rotation, vertical/horizontal flip, and affine transformation (Table 3). Augmentation helps to reduce overfitting and improve classification accuracy [21]. After balancing the number of images, all images were resized from 640 × 480 to 480 × 480 or 224 × 224 because the input image size required for each deep learning model is different (Table 4). Each pixel value was then divided by 255 for normalization. The data after image processing were randomly split into train (60%), validation (20%), and test (20%) datasets.
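The preprocessing steps described above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: the function names (`normalize`, `hflip`, `vflip`, `split_dataset`), the nested-list image representation, and the fixed random seed are assumptions made for illustration, and rotation, affine transformation, and resizing are omitted.

```python
import random


def normalize(pixels):
    """Scale 8-bit pixel values to [0, 1] by dividing each value by 255."""
    return [value / 255 for value in pixels]


def hflip(image):
    """Horizontal flip: reverse the pixel order within each row."""
    return [row[::-1] for row in image]


def vflip(image):
    """Vertical flip: reverse the order of the rows."""
    return image[::-1]


def split_dataset(items, seed=0):
    """Randomly split items into train (60%), validation (20%), and test (20%)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seed is a hypothetical choice
    n = len(shuffled)
    n_train = n * 60 // 100  # integer arithmetic avoids float rounding
    n_val = n * 20 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test
```

With 100 items, `split_dataset` returns three disjoint lists of 60, 20, and 20 items. In practice the flips would be applied only to under-represented severity levels until the class counts are balanced.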

2.2. Deep Learning Model Description

To accurately predict the severity level of each disorder, we applied two CNN models (ResNet-152, EfficientNet-B6) and a transformer model (ViT-B/16) that have been popular for image classification. The pre-trained models of ResNet-152 with ImageNet, EfficientNet-B6 with ImageNet, and ViT-B/16 with ImageNet-21k were fine-tuned on the scalp image dataset and the performances of the fine-tuned models were compared. The type of deep learning models and hyperparameters were carefully selected considering our GPU computing power (Table 4).

ResNet is one of the CNN models [24]. The deeper the network model, the more levels of features can be integrated, so better performance can be expected. However, the vanishing gradient problem occurs in very deep networks. To resolve the problem, shortcut connections were inserted in ResNet. The input is added to the output of the convolutional block through the shortcut connection, which makes residual learning and the stacking of more layers possible. ResNet-152 was used in this study; it is composed of 152 residual network layers and operates on about 58 million parameters.
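For reference, the shortcut connection can be written in the form used in the ResNet paper [24]. The symbols are not defined in the present text and are introduced here only for illustration: x is the block input, F is the residual mapping learned by the stacked convolutional layers with weights W_i, and y is the block output.

y = F(x, \{W_i\}) + x

Because the identity term x bypasses the convolutional layers, its gradient passes through the block unchanged, which is why deeper stacks remain trainable.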

EfficientNet is also one of the CNN models [25]. The classification accuracy is commonly improved by scaling up network depth, width, and input image resolution. However, the computational cost has dramatically increased. The compound scaling method was proposed to determine network depth, width, and input image resolution efficiently. The EfficientNet model showed better performance with less computation compared to previous CNN models. The baseline EfficientNet B0–B7 models mainly contain mobile inverted bottleneck MBConv blocks in addition to squeeze and excitation blocks. Among the baseline models, the EfficientNet-B6 model which operates on about 41 million parameters was selected in this study.
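The compound scaling method mentioned above can be summarized as in the EfficientNet paper [25]. The notation is taken from that paper rather than from this study: \phi is a user-chosen compound coefficient, and \alpha, \beta, \gamma are constants found by a small grid search on the baseline network.

d = \alpha^{\phi}, \quad w = \beta^{\phi}, \quad r = \gamma^{\phi}

\text{subject to } \alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2, \quad \alpha \geq 1, \ \beta \geq 1, \ \gamma \geq 1

Here d, w, and r scale network depth, width, and input resolution, respectively, so increasing \phi by one roughly doubles the computational cost.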

ViT is one of the transformer models with self-attention to sequences of image patches [26]. To convert 2D image data into 1D sequential data for using transformer encoder input, an image is firstly divided into fixed size (e.g., 16 × 16) patches. Each patch is flattened and linearly embedded. A class token, which is a learnable embedding and is used for classification tasks, is added at the beginning of the input sequence. Positional embeddings are added to retain positional information about individual input sequences. The sequence of the resultant vectors is fed into multiple transformer encoder blocks composed of layer normalization, multi-head self-attention, multi-layer perceptron (MLP) blocks, and residual connections. Finally, the class is predicted through MLP for the output of the last transformer encoder. Among baseline ViT models, the ViT-B/16 model which has about 86 million trainable parameters was applied in this study.
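The patch-to-sequence conversion described above can be sketched with simple arithmetic. The helper name `vit_sequence_shape` is hypothetical, and the 224 × 224 input in the usage note is an assumption (the text states only that images were resized to 480 × 480 or 224 × 224 depending on the model).

```python
def vit_sequence_shape(image_size, patch_size=16, channels=3):
    """Return patch count, token count, and flattened patch length for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side * patches_per_side
    return {
        "patches": num_patches,
        # one extra learnable class token is prepended to the patch sequence
        "tokens": num_patches + 1,
        # each patch is flattened before the linear embedding
        "patch_dim": patch_size * patch_size * channels,
    }
```

For a 224 × 224 RGB image with 16 × 16 patches, this gives 196 patches, a sequence of 197 tokens, and a flattened patch length of 768.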

All deep learning models were trained and tested on a desktop computer (NVIDIA GeForce RTX 3090 Ti, Intel i9-13900K CPU, and 128 GB RAM) with Python 3.9.13 and TensorFlow GPU 2.10.1. The model, which showed the highest classification accuracy in the validation dataset during training epochs, was saved. The details of the hyperparameter settings utilized in each network are listed in Table 4.

2.3. XAI Visualization Techniques

Grad-CAM [27] and attention rollout [28] techniques were applied to investigate which areas in a scalp image were focused on by the trained deep learning model for classifying scalp hair disorders. These visual explanation techniques provide the basis for decision-making of the trained model and help users to receive information about the problem region in the image.

Grad-CAM is one of the techniques to explain the classification decision of any CNN-based network including ResNet and EfficientNet without structural changes or retraining [27]. It provides a localization map highlighting important parts in the image to predict the class. The gradients of the classification score concerning the last convolutional feature maps are calculated. These gradient values are globally averaged over each feature map dimension to obtain weights. These weights are linearly combined with the final feature maps, and then ReLU is applied to the linear combination maps. Finally, the heatmap is generated and the high-impact areas for the classification are emphasized.
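The steps above correspond to the two Grad-CAM equations from [27]. The notation follows that paper and is not defined elsewhere in this text: y^c is the classification score for class c, A^k is the k-th feature map of the last convolutional layer, Z is the number of spatial positions in a feature map, and \alpha_k^c is the resulting weight.

\alpha_k^c = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A_{ij}^k}

L_{\text{Grad-CAM}}^c = \mathrm{ReLU}\left( \sum_{k} \alpha_k^c A^k \right)

The ReLU keeps only the regions that contribute positively to class c, and the resulting map is upsampled to the input size to form the heatmap.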

Visualizing attention weights is useful for interpreting a model’s decision. Attention rollout is one of the ways to compute an attention map when the model includes multi-head self-attention and residual connections [28]. To acquire an approximate attention map from the output token to the input image, the identities of input tokens are linearly combined through all layers based on attention weights. The attention weights of ViT-B/16 over all heads were averaged and the weight matrices in all layers were multiplied, recursively. The acquired attention map demonstrated the image parts that are semantically related for classification.
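The head averaging and recursive multiplication described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the equal 0.5/0.5 mixing of attention and identity (to account for the residual connections) follows the common formulation of attention rollout [28], and the function names and nested-list matrix representation are hypothetical.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def attention_rollout(attentions):
    """Compute attention rollout.

    attentions: list over layers (first layer first); each layer is a list of
    heads; each head is an n x n row-stochastic attention matrix.
    """
    n = len(attentions[0][0])
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for heads in attentions:
        # average the attention weights over all heads
        avg = [[sum(h[i][j] for h in heads) / len(heads) for j in range(n)]
               for i in range(n)]
        # add the identity to model the residual connection; rows still sum to 1
        mixed = [[0.5 * avg[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
                 for i in range(n)]
        # recursively multiply with the rollout of the layers below
        rollout = matmul(mixed, rollout)
    return rollout
```

For a ViT, row 0 of the result (the class token) with its first entry dropped gives one value per image patch; reshaping those values to the patch grid and upsampling yields the attention map overlaid on the scalp image.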

2.4. Software Design

In this study, the user-friendly software (i.e., ‘Scalp checker’) was also developed using the trained deep learning model. Scalp checker was designed to allow users to easily check their scalp and hair conditions and trends by date.

The flowchart illustrating Scalp checker software execution is shown in Figure 2. The Scalp checker consists of GUI-level and API-level software. The GUI-level Scalp checker was developed for observing the API-level Scalp checker processes. When the user requests the inspection at the GUI-level Scalp checker (Figure 2a), the paths of the trained deep learning model, input image, and the user-selected inspection type are allocated at the API-level Scalp checker. Subsequently, the severities of scalp hair disorders are diagnosed based on the trained model after checking GPU compatibility (Figure 2b). After completing the inspection, the results are stored in the local database and the GUI-level Scalp checker displays the results (Figure 2c). If the user navigates to the dashboard or history page at the GUI-level Scalp checker, the results are requested and received from the local database (Figure 2d). The GUI-level and API-level Scalp checker were developed by using Microsoft Visual Studio 2022 and PyCharm Professional, respectively.
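The API-level flow described above (device check, per-disorder prediction, storage, and retrieval for the history page) can be sketched as follows. Everything here is a hypothetical illustration: the text does not name the database engine, so SQLite is an assumption, as are the table layout, the function names, and the `predict` callback that stands in for the trained ViT-B/16 models.

```python
import sqlite3
from datetime import date

DISORDERS = ("dryness", "oiliness", "erythema",
             "folliculitis", "dandruff", "hair loss")


def run_inspection(conn, image_path, selected, predict, gpu_ok):
    """Predict severities for the selected disorders and store them locally."""
    device = "GPU" if gpu_ok else "CPU"  # fall back to CPU if no usable GPU
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results "
        "(day TEXT, image TEXT, disorder TEXT, severity INTEGER, device TEXT)"
    )
    results = {}
    for disorder in selected:
        severity = predict(image_path, disorder, device)
        if severity not in (0, 1, 2, 3):
            raise ValueError(f"unexpected severity level: {severity}")
        results[disorder] = severity
        conn.execute(
            "INSERT INTO results VALUES (?, ?, ?, ?, ?)",
            (date.today().isoformat(), image_path, disorder, severity, device),
        )
    conn.commit()
    return results


def history(conn):
    """Return stored results ordered by date, as a history page might request."""
    return conn.execute(
        "SELECT day, disorder, severity FROM results ORDER BY day, disorder"
    ).fetchall()
```

Passing the predictor in as a callback keeps the storage and device-selection logic testable without loading a model; a stub such as `lambda image, disorder, device: 0` is enough to exercise the flow.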

3. Results

3.1. Severity Classification of Six Hairy Scalp Disorders

Three trained models based on ResNet-152, EfficientNet-B6, and ViT-B/16 predicted the severity level (0, 1, 2, and 3) of six individual scalp and hair disorders (dryness, oiliness, erythema, folliculitis, dandruff, and hair loss). We compared the overall performance of these trained models using the test dataset (Figure 3). Four metrics commonly used for performance examination of deep learning classification were evaluated: accuracy, precision, recall, and F1-score [29].

Accuracy = \frac{TP + TN}{TP + TN + FP + FN}    (1)

Precision = \frac{TP}{TP + FP}    (2)

Recall = \frac{TP}{TP + FN}    (3)

F1\text{-}score = 2 \times \frac{Precision \times Recall}{Precision + Recall}    (4)

TP, TN, FP, and FN indicate the number of true positive, true negative, false positive, and false negative of each class, respectively.
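As a worked illustration of these metrics, the sketch below derives TP, FP, FN, and TN for one class from a multi-class confusion matrix, using the standard definitions of accuracy, precision, recall, and F1-score. The function name, the row-is-actual/column-is-predicted convention, and the example matrix are assumptions for illustration; how per-class values were aggregated into the overall scores is not specified in the text.

```python
def per_class_metrics(cm, c):
    """Compute metrics for class c from confusion matrix cm[actual][predicted]."""
    total = sum(sum(row) for row in cm)
    tp = cm[c][c]
    fn = sum(cm[c]) - tp                 # actual c, predicted as another class
    fp = sum(row[c] for row in cm) - tp  # another class, predicted as c
    tn = total - tp - fn - fp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical 3-class example (not data from this study).
example = [[5, 1, 0],
           [2, 3, 1],
           [0, 1, 7]]
metrics = per_class_metrics(example, 0)
```

For class 0 of the example matrix, TP = 5, FP = 2, FN = 1, and TN = 12, so precision is 5/7, recall is 5/6, and the F1-score is 10/13.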

As a result, the ViT-B/16 model outperformed the other two deep learning models (ResNet-152, EfficientNet-B6) for all metrics (Figure 3). The trained ViT-B/16 model exhibited the highest classification performance with the overall accuracy of 78.31%, precision of 78.81%, recall of 79.02%, and F1-score of 78.91%. Therefore, the ViT-B/16 model was selected as the best classifier for diagnosing scalp and hair disorders among the tested models.

Figure 4 presents confusion matrices showing the detailed classification results of the ViT-B/16 models trained for each scalp and hair disorder. The trained ViT model predicted the severity level of each disorder from 0 (none) to 3 (severe). Based on these confusion matrices, we calculated the accuracy, precision, recall, and F1-score for the severity classification of each scalp hair disorder (Table 5). All metrics varied across scalp hair disorders and ranged from 69.07% to 83.06%. Relatively low classification performance was shown in oiliness (Table 5) because the trained ViT-B/16 model misclassified severity level 2 (actual class) as severity level 1 or level 3 (predicted class, Figure 4b). On the other hand, the severities of the other five scalp hair disorders were classified with accuracy, precision, recall, and F1-score each over 76.88% (Table 5).

3.2. Prediction of Scalp Lesion Using XAI

We compared attention maps derived from the trained ViT-B/16 model and Grad-CAM maps derived from the trained EfficientNet-B6 model (Figure 5). Among the two tested CNN models, the EfficientNet-B6 model was selected instead of ResNet-152 for the comparison because the overall classification performance of the EfficientNet-B6 model was better (Figure 3). These XAI techniques can visualize which areas in the image are important when deep learning models predict the severity level of each scalp and hair disorder.

Figure 5a shows a scalp image having multiple scalp and hair disorders: oiliness (severity level: 2), folliculitis (severity level: 3), and dandruff (severity level: 2). Class-discriminative localization heatmaps were derived using the EfficientNet-B6 model with Grad-CAM (Figure 5b–d) and the ViT-B/16 model with the attention rollout technique (Figure 5e–g). The emphasized areas were different depending on the scalp hair disorders and XAI techniques. To classify the severity level of oiliness, both EfficientNet-B6 and ViT-B/16 models focused on shiny surface regions induced by excessive oil and greasy sebum (Figure 5b,e). To categorize the severity level of folliculitis, the EfficientNet-B6 model failed to localize the region (Figure 5c) while the ViT-B/16 model successfully highlighted pustules and highly inflamed regions around hair pores (Figure 5f). To predict the severity level of dandruff, the EfficientNet-B6 model partially focused on the dandruff region (Figure 5d) whereas the ViT-B/16 model focused on the dandruff region where there was white keratin on the scalp (Figure 5g). Overall, attention maps derived from the ViT-B/16 model with the attention rollout method were closer to human intuition and better localized the relevant image region for each disorder, compared to Grad-CAM maps with EfficientNet-B6. Accordingly, six ViT-B/16 models, one trained for each scalp hair disorder, and the attention rollout technique were used to develop the ‘Scalp checker’ software.

3.3. Scalp Checker Software

Figure 6 shows the user interface (UI) of the developed Scalp checker software. The GUI-level Scalp checker software consists of ‘Dashboard’, ‘Inspection’, and ‘History’ pages (Figure 2 and Figure 6).

The dashboard page is the first screen that a user encounters when the GUI-level Scalp checker software is run (Figure 6a). On this page, average severities and trends of six individual scalp and hair disorders are shown at the top and bottom, respectively. The inspection page allows the user to examine all six or selected scalp and hair disorders (Figure 6b). After the user loads a scalp image and clicks the ‘Start inspection’ button (Figure 6b), the GPU compatibility is checked first (Figure 2b). If the GPU is not recognized or an incompatible GPU is used, a warning message is displayed, and the inspection is performed via CPU instead of GPU (Figure 6c). If the API-level Scalp checker terminates normally (Figure 2c), the severity levels of individual scalp hair disorders and suspected problem areas acquired from the attention rollout method are displayed (Figure 6d). However, an error message is shown if this process terminates abnormally (Figure 6e). The history page enables the user to check the results by date and observe the change in the user’s scalp hair conditions (Figure 6f).

The developed Scalp checker software was tested on a laptop computer (NVIDIA GeForce RTX 3070 Laptop GPU, Intel i7-12650H CPU, and 32 GB RAM). In this condition, it took about 40 s when utilizing the GPU and 60 s when utilizing only the CPU to diagnose the severities of six scalp hair disorders from one image. In addition, the operation of the Scalp checker software was evaluated in various operating systems (OSs) and hardware environments to check the compatibility of the software (Table 6). OSs varied from Windows XP to Windows 11. In Windows XP and Vista environments, the GUI-level Scalp checker did not run due to a lack of support for the .NET Framework. In a Windows 7 environment, the API-level Scalp checker was not executed because Python 3.9 was not supported. When the GPU could not be used, the prediction time was shorter on machines with better CPU performance. When the GPU could be used, GPU-accelerated computation shortened the prediction time. Overall, Windows 8.1, 4 GB RAM, and 14 GB free disk space are the minimum requirements to install and successfully run the GUI-level and API-level Scalp checker software and to store diagnostic result log files.

4. Discussion

This paper presented an end-to-end intelligent healthcare platform that predicts the severity levels of six scalp disorders, highlights the regions of suspected lesions, and enables users to easily check their scalp conditions via the developed Scalp checker software. In this study, two CNN models (ResNet-152, EfficientNet-B6) and the transformer model (ViT-B/16) were applied to predict the severity level of each disorder. To the best of our knowledge, there are no studies that have applied the ViT model for diagnosing scalp and hair conditions. As a result, the ViT-B/16 model showed the highest accuracy, precision, recall, and F1-score among the tested deep learning models (Figure 3). ViT is a pure self-attention-based architecture without any convolution layer [26]. Dosovitskiy et al. reported that ViT models pre-trained on large datasets (ImageNet-21k or JFT-300M) showed excellent classification performance, compared to CNN models [26]. In addition, ViT models required considerably fewer computational resources for training, so ViT models generally showed better performance than CNN models with the same computational budget. Because the ViT-B/16 model (86 M) has more trainable parameters than ResNet-152 (58 M) and EfficientNet-B6 (41 M) under the same GPU capability and better focuses on the image region semantically relevant for classification via multi-head self-attention (Figure 5), it outperformed the two tested CNN models. However, the trained ViT-B/16 model showed relatively low performance for severity prediction of oiliness (Figure 4b and Table 5). In particular, test images with severity level 2 of oiliness were incorrectly classified as severity level 1 or level 3 of oiliness (Figure 4b). The severity level of oiliness was classified based on the intensity of light reflected by excessive oil in a photograph [2]. Because the amount of reflected light is sensitive to the surrounding environment and to other structures (e.g., dandruff), this may explain why classifying the severity level of oiliness is more difficult than for the other five scalp hair disorders.

The attention rollout XAI technique was adopted to visualize the lesion from the ViT-B/16 model trained for each scalp and hair disorder (Figure 5e–g). In previous studies, the locations of lesions were detected more accurately by applying other CNN models for object detection, such as Faster R-CNN [4], Mask R-CNN [10], U-net [18], and YOLO [21] models. However, those approaches required extensive pixel-level [18] or bounding box annotations [4,10,21]. In contrast, the attention map can be obtained directly from the trained ViT-B/16 model and attention rollout without any annotation or re-training processes. Furthermore, it provides not only model interpretability but also approximate lesion areas as a class-discriminative localization map (Figure 5e–g).

Several studies about the severity prediction of scalp hair disorders have been introduced (Table 5). Previous studies showed the severity classification accuracy for dandruff (85.03%) [1] or hair loss (75.73~95.75%) [10,21,22]. Although some studies achieved comparatively high accuracy by applying CNN with attention blocks [1] or an ensemble model [22], they focused on predicting the severity of only one scalp hair disorder. On the other hand, we established deep learning models for diagnosing the severity levels of six common scalp hair disorders, applied XAI techniques to represent suspected problem regions for each disorder, and also developed user-friendly Scalp checker software incorporated with the trained ViT-B/16 models. We expect the Scalp checker software to help users perform an inspection using their typical PC without a cloud-based AI server and network connection (e.g., internet, Wi-Fi, and Bluetooth), visually check their hairy scalp conditions, and easily monitor the changes in the inspection results over time.

AI technologies have been extensively applied in medical fields including diagnosis, treatment, surgery, screening, and epidemiology analysis due to their effectiveness and usefulness [14,30,31,32,33,34,35,36,37]. However, they have some potential concerns in clinical practice. AI model performance significantly depends on data quality [14,36,38]. If the dataset is too small and includes inaccurate annotation or inequities, there is a risk of producing incorrect or biased results [14,33,36]. AI models are trained and evaluated on datasets that cannot include the entire population and complex clinical environments so they might have inherent biases [37,39]. In addition, the inevitable intrinsic uncertainties of medical interpretations are not usually considered to train AI models [40]. As a result, the performance and reliability of AI models affected by observer variability can be underrated in clinical decision-making. Therefore, in the current state, AI cannot replace the role of clinicians in medical fields, and rigorous validation through real-world clinical trials is still required to ensure generalizability, robustness, reliability, and safety [33,36,37]. So far, AI can offer good support for the decision-making of experts and understanding the diagnostic results of non-experts [14,32,36,37,41].

In the future, we will apply other state-of-the-art deep learning models, their ensembles, data augmentation techniques, and methods for reducing noisy label problems (e.g., label transition matrix method, MentorMix, and DivideMix), and increase the number of datasets to enhance severity classification performance, effectiveness, and reliability [22,42,43,44]. Moreover, we also plan to develop an efficient and lightweight model that can be used on mobile devices.

5. Conclusions

We proposed a deep learning-based healthcare platform for diagnosing six common scalp hair problems (dryness, oiliness, erythema, folliculitis, dandruff, and hair loss) and their severities (level 0~level 3). After reducing data imbalance, we trained and compared three deep learning models: ResNet-152, EfficientNet-B6, and ViT-B/16. As a result, ViT-B/16 showed the highest severity classification performance with 78.31%, 78.81%, 79.02%, and 78.91% of overall accuracy, precision, recall, and F1-score, respectively. The attention maps were also acquired using the trained ViT-B/16 model and attention rollout XAI method to visualize hairy scalp lesions. Moreover, Scalp checker software that runs on a PC was developed and compatibility of this software was tested under several OS and hardware environments. The proposed platform enables users to easily inspect and check their scalp conditions by date.

Author Contributions

C.H.: methodology, software, validation, investigation, visualization, writing—original draft preparation, and writing—review and editing; T.G.: conceptualization, methodology, formal analysis, resources, funding acquisition, project administration, writing—original draft preparation, and writing—review and editing; W.C.: conceptualization, methodology, supervision, and writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by research funds for newly appointed professors of Jeonbuk National University in 2020, National Research Foundation of Korea (NRF-2021R1C1C1010063), and the Bio & Medical Technology Development program of the NRF (No. RS-2023-00236157).

Institutional Review Board Statement

Ethical review and approval were waived for this study due to the usage of a public open dataset.

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analyzed in this study. These data can be found at AI-Hub: https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=&topMenu=&aihubDataSe=data&dataSetSn=216 (accessed on 18 January 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Jhong, S.-Y.; Yang, P.-Y.; Hsia, C.-H. An expert smart scalp inspection system using deep learning. Sens. Mater. 2022, 34, 1265–1274.
2. Kim, B.R.; Won, S.H.; Kim, J.W.; Kim, M.; Jeong, J.-I.; Shin, J.-W.; Huh, C.-H.; Na, J.-I. Development of a new classification and scoring system for scalp conditions: Scalp Photographic Index (SPI). J. Dermatol. Treat. 2023, 34, 2181655.
3. Misery, L.; Rahhali, N.; Duhamel, A.; Taieb, C. Epidemiology of dandruff, scalp pruritus and associated symptoms. Acta Derm. Venereol. 2013, 93, 80–81.
4. Chang, W.-J.; Chen, L.-B.; Chen, M.-C.; Chiu, Y.-C.; Lin, J.-Y. ScalpEye: A deep learning-based scalp hair inspection and diagnosis system for scalp health. IEEE Access 2020, 8, 134826–134837.
5. Springer, K.; Brown, M.; Stulberg, D.L. Common hair loss disorders. Am. Fam. Physician 2003, 68, 93–102.
6. Chang, W.-J.; Chen, M.-C.; Chen, L.-B.; Chiu, Y.-C.; Hsu, C.-H.; Ou, Y.-K.; Chen, Q. A mobile device-based hairy scalp diagnosis system using deep learning techniques. In Proceedings of the IEEE 2nd Global Conference on Life Sciences and Technologies (LifeTech), Kyoto, Japan, 10–12 March 2020.
7. Saed, S.; Ibrahim, O.; Bergfeld, W.F. Hair camouflage: A comprehensive review. Int. J. Women’s Dermatol. 2016, 2, 122–127.
8. Ibrahim, S.; Noor Azmy, Z.; Abu Mangshor, N.; Sabri, N.; Ahmad Fadzil, A.; Ahmad, Z. Pre-trained classification of scalp conditions using image processing. Indones. J. Electr. Eng. Comput. Sci. 2020, 20, 138–144.
9. Market Research Report. Available online: https://www.fortunebusinessinsights.com/hair-care-market-102555 (accessed on 18 January 2024).
10. Kim, J.-H.; Kwon, S.; Fu, J.; Park, J.-H. Hair Follicle Classification and Hair Loss Severity Estimation Using Mask R-CNN. J. Imaging 2022, 8, 283.
11. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
12. Zhang, R.; Wu, L.; Yang, Y.; Wu, W.; Chen, Y.; Xu, M. Multi-camera multi-player tracking with deep player identification in sports video. Pattern Recognit. 2020, 102, 107260.
13. Zhang, R.; Xu, L.; Yu, Z.; Shi, Y.; Mu, C.; Xu, M. Deep-IRTarget: An automatic target detector in infrared imagery using dual-domain feature extraction and allocation. IEEE Trans. Multimed. 2021, 24, 1735–1749.
14. Gupta, A.K.; Ivanova, I.A.; Renaud, H.J. How good is artificial intelligence (AI) at solving hairy problems? A review of AI applications in hair restoration and hair disorders. Dermatol. Ther. 2021, 34, e14811.
15. Wang, W.-C.; Chen, L.-B.; Chang, W.-J. Development and experimental evaluation of machine-learning techniques for an intelligent hairy scalp detection system. Appl. Sci. 2018, 8, 853.
16. Podlodowski, L.; Roziewski, S.; Nurzynski, M. An ensemble of Deep Convolutional Neural Networks for Marking Hair Follicles on Microscopic Images. In Proceedings of the 2018 Federated Conference on Computer Science and Information Systems, Poznań, Poland, 9–12 September 2018.
17. Benhabiles, H.; Hammoudi, K.; Yang, Z.; Windal, F.; Melkemi, M.; Dornaika, F.; Arganda-Carreras, I. Deep learning based detection of hair loss levels from facial images. In Proceedings of the 2019 Ninth International Conference on Image Processing Theory, Tools and Applications (IPTA), Istanbul, Turkey, 6–9 November 2019.
18. Lee, S.; Lee, J.W.; Choe, S.J.; Yang, S.; Koh, S.B.; Ahn, Y.S.; Lee, W.-S. Clinically applicable deep learning framework for measurement of the extent of hair loss in patients with alopecia areata. JAMA Dermatol. 2020, 156, 1018–1020.
19. Sayyad, S.; Midhunchakkaravarthy, D.; Sayyad, F. An Analysis of Alopecia Areata Classification Framework for Human Hair Loss Based on VGG-SVM Approach. J. Pharm. Negat. Results 2022, 13, 9–15.
20. Meng, G.; Yue, W.; Haipeng, X.; Congcong, X.; Xianhong, Y.; Jin, N.; Zhang, Z.; Zhixuan, L.; Wei, H.; Jiang, Y. Deep Learning-based Trichoscopic Image Analysis and Quantitative Model for Predicting Basic and Specific Classification in Male Androgenetic Alopecia. Acta Derm. Venereol. 2022, 102, 564.
21. Kim, M.; Kang, S.; Lee, B.-D. Evaluation of automated measurement of hair density using deep neural networks. Sensors 2022, 22, 650.
22. Kim, M.; Gil, Y.; Kim, Y.; Kim, J. Deep-Learning-Based Scalp Image Analysis Using Limited Data. Electronics 2023, 12, 1380.
23. AI-Hub. Available online: https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=&topMenu=&aihubDataSe=data&dataSetSn=216 (accessed on 18 January 2024).
24. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016.
25. Tan, M.; Le, Q. EfficientNet: Rethinking model scaling for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019.
26. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
27. Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017.
28. Abnar, S.; Zuidema, W. Quantifying attention flow in transformers. arXiv 2020, arXiv:2005.00928.
29. Yacouby, R.; Axman, D. Probabilistic extension of precision, recall, and f1 score for more thorough evaluation of classification models. In Proceedings of the First Workshop on Evaluation and Comparison of NLP systems, Online, 20 November 2020.
30. Mintz, Y.; Brodie, R. Introduction to artificial intelligence in medicine. Minim. Invasive Ther. Allied Technol. 2019, 28, 73–81.
31. Vatiwutipong, P.; Vachmanus, S.; Noraset, T.; Tuarob, S. Artificial Intelligence in Cosmetic Dermatology: A Systematic Literature Review. IEEE Access 2023, 11, 71407–71425.
32. Bakator, M.; Radosav, D. Deep learning and medical diagnosis: A review of literature. Multimodal Technol. Interact. 2018, 2, 47.
33. Esteva, A.; Chou, K.; Yeung, S.; Naik, N.; Madani, A.; Mottaghi, A.; Liu, Y.; Topol, E.; Dean, J.; Socher, R. Deep learning-enabled medical computer vision. NPJ Digit. Med. 2021, 4, 5.
34. Wainberg, M.; Merico, D.; Delong, A.; Frey, B.J. Deep learning in biomedicine. Nat. Biotechnol. 2018, 36, 829–838.
35. Navalesi, P.; Oddo, C.M.; Chisci, G.; Frosolini, A.; Gennaro, P.; Abbate, V.; Prattichizzo, D.; Gabriele, G. The Use of Tactile Sensors in Oral and Maxillofacial Surgery: An Overview. Bioengineering 2023, 10, 765.
36. Jones, O.; Matin, R.; van der Schaar, M.; Bhayankaram, K.P.; Ranmuthu, C.; Islam, M.; Behiyat, D.; Boscott, R.; Calanzani, N.; Emery, J. Artificial intelligence and machine learning algorithms for early detection of skin cancer in community and primary care settings: A systematic review. Lancet Digit. Health 2022, 4, e466–e476.
37. Liu, Q.; Zhang, J.; Bai, Y. Mapping the landscape of artificial intelligence in skin cancer research: A bibliometric analysis. Front. Oncol. 2023, 13, 1222426.
38. Liu, Y.; Primiero, C.A.; Kulkarni, V.; Soyer, H.P.; Betz-Stablein, B. Artificial Intelligence for the Classification of Pigmented Skin Lesions in Populations with Skin of Color: A Systematic Review. Dermatology 2023, 239, 499–513.
39. Wen, D.; Khan, S.M.; Xu, A.J.; Ibrahim, H.; Smith, L.; Caballero, J.; Zepeda, L.; de Blas Perez, C.; Denniston, A.K.; Liu, X. Characteristics of publicly available skin cancer image datasets: A systematic review. Lancet Digit. Health 2022, 4, e64–e74.
40. Cabitza, F.; Rasoini, R.; Gensini, G.F. Unintended consequences of machine learning in medicine. JAMA 2017, 318, 517–518.
41. Kovarik, C.; Lee, I.; Ko, J.; Adamson, A.; Otley, C.; Kvedar, J.; Vedak, P.; Huang, S.; Fitzgerald, M.; Chaudhari, R. Commentary: Position statement on augmented intelligence (AuI). J. Am. Acad. Dermatol. 2019, 81, 998–1000.
42. Zhang, R.; Cao, Z.; Yang, S.; Si, L.; Sun, H.; Xu, L.; Sun, F. Cognition-Driven Structural Prior for Instance-Dependent Label Transition Matrix Estimation. IEEE Trans. Neural Netw. Learn. Syst. 2024; early access.
43. Jiang, L.; Huang, D.; Liu, M.; Yang, W. Beyond synthetic noise: Deep learning on controlled noisy labels. In Proceedings of the International Conference on Machine Learning, Online, 13–18 July 2020.
44. Li, J.; Socher, R.; Hoi, S.C. DivideMix: Learning with noisy labels as semi-supervised learning. arXiv 2020, arXiv:2002.07394.

Figure 1. Overall process for diagnosing the severity levels of six scalp and hair disorders. (a) The open dataset of scalp images covers six scalp and hair disorders, each graded into four severity levels. (b) After image preprocessing, three deep learning models were fine-tuned on the scalp image dataset. ViT-B/16 was selected based on a comparison of classification performance. (c) Scalp checker software was developed based on the ViT-B/16 model combined with the attention rollout technique. The software displays the severity level of each disorder and the lesion area.

Figure 2. Flow chart illustrating the execution of the Scalp checker software. (a) The user launches the GUI-level Scalp checker on a PC. (b) When the user starts an inspection, the API-level Scalp checker checks GPU compatibility and predicts the severity of each scalp and hair disorder. (c) After the inspection finishes, the API-level Scalp checker requests that the prediction results be stored in the local database, and the results are displayed on the inspection page. (d) When the user navigates to the dashboard or history page, the GUI-level Scalp checker shows the results stored in the local database.
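
The store-and-retrieve steps in Figure 2 (c) and (d) can be illustrated with a small example. The source does not state which database engine or schema Scalp checker uses, so the following is only a minimal sketch assuming SQLite; the table name `inspection`, its columns, and the helper functions are hypothetical, and the dates and severity values are made-up sample data.

```python
import sqlite3
from datetime import date

# The six disorders graded by the platform (identifiers are hypothetical).
DISORDERS = ("dryness", "oiliness", "erythema",
             "folliculitis", "dandruff", "hair_loss")


def open_db(path=":memory:"):
    """Open the local database and create the (assumed) results table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS inspection ("
        " inspected_on TEXT NOT NULL,"
        " disorder TEXT NOT NULL,"
        " severity INTEGER NOT NULL CHECK (severity BETWEEN 0 AND 3))"
    )
    return conn


def store_result(conn, inspected_on, severities):
    """Store one inspection: a severity level (0 to 3) for every disorder."""
    rows = [(inspected_on.isoformat(), d, severities[d]) for d in DISORDERS]
    with conn:  # commit on success, roll back on error
        conn.executemany("INSERT INTO inspection VALUES (?, ?, ?)", rows)


def history(conn, disorder):
    """Return (date, severity) pairs for one disorder, oldest first."""
    cur = conn.execute(
        "SELECT inspected_on, severity FROM inspection"
        " WHERE disorder = ? ORDER BY inspected_on",
        (disorder,),
    )
    return cur.fetchall()


# Made-up sample data: two inspections on different days.
conn = open_db()
store_result(conn, date(2024, 1, 18),
             dict(zip(DISORDERS, (1, 2, 0, 0, 1, 1))))
store_result(conn, date(2024, 1, 10),
             dict(zip(DISORDERS, (1, 2, 1, 0, 2, 1))))
dandruff_history = history(conn, "dandruff")
```

Storing the date as an ISO 8601 string keeps the rows sortable by date with a plain `ORDER BY`, which is what a history page that lists results by date would need.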

Figure 3. Overall classification results of test dataset according to three deep learning models: ViT-B/16, EfficientNet-B6, and ResNet-152.

Figure 4. Confusion matrix results of severity classification of six scalp hair disorders by the trained ViT-B/16 model: (a) Dryness. (b) Oiliness. (c) Erythema. (d) Folliculitis. (e) Dandruff. (f) Hair loss.

Figure 5. Comparison of XAI visualization techniques. (a) Original scalp image with multiple scalp disorders. Class-discriminative localization heatmaps were derived using (b–d) the trained EfficientNet-B6 model with Grad-CAM and (e–g) the trained ViT-B/16 model with attention rollout. Color encodes the degree of importance, from blue to red.
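
For readers unfamiliar with attention rollout, the following toy example sketches the idea as described by Abnar and Zuidema [28]: each layer's attention matrix is mixed with the identity to account for residual connections, and the mixed matrices are multiplied across layers. It is an illustration only, not the authors' implementation. It assumes head-averaged attention matrices, uses pure Python lists instead of a tensor library, and the three-token, two-layer attention values are invented.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def attention_rollout(layers):
    """Roll attention out across layers.

    layers: head-averaged attention matrices (tokens x tokens), first
    layer first. Token 0 is assumed to be the class token.
    """
    n = len(layers[0])
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in layers:
        # Mix in the identity to model the residual connection.
        mixed = [[0.5 * attn[i][j] + (0.5 if i == j else 0.0)
                  for j in range(n)] for i in range(n)]
        # Re-normalise each row so it remains a distribution over tokens.
        mixed = [[v / sum(row) for v in row] for row in mixed]
        rollout = matmul(mixed, rollout)
    return rollout


def patch_relevance(rollout):
    """Class-token attention to each image patch, scaled so the peak is 1."""
    cls_to_patches = rollout[0][1:]  # drop the class-token column
    peak = max(cls_to_patches)
    return [v / peak for v in cls_to_patches]


# Invented toy input: class token plus two patches, two layers.
layer1 = [[0.2, 0.6, 0.2], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]
layer2 = [[0.5, 0.4, 0.1], [0.2, 0.7, 0.1], [0.2, 0.2, 0.6]]
rollout = attention_rollout([layer1, layer2])
relevance = patch_relevance(rollout)  # first patch is the most relevant
```

In a real ViT-B/16 with a 224 × 224 input, the relevance vector would have 196 entries (a 14 × 14 patch grid) and would be reshaped and upsampled to the image size to produce a heatmap.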

Figure 6. GUI-level Scalp checker: (a) Dashboard page. (b) Inspection page. (c) Warning alarm when the GPU is not compatible. (d) Results after successful inspection. (e) Error message after unsuccessful inspection. (f) History page.

Table 1. Summary of the existing deep learning models for scalp and hair healthcare.

Reference	Model	Description	Limitation
[1]	DenseNet with CBAM and spinal FC	The severity of dandruff was examined from a microscopic scalp image.	It can only assess one specific scalp and hair disorder without providing information about the lesion region.
[22]	Triple ensemble model	The severity of alopecia was predicted from a microscopic scalp image.	It can only assess one specific scalp and hair disorder without providing information about the lesion region.
[4]	Faster R-CNN	The ScalpEye system detected dandruff, folliculitis, hair loss, and oily hair from a microscopic scalp image.	It cannot predict the severity of each disorder and bounding box annotation is required.
[15]	ImageNet-VGG-f model	Microscopic scalp images were classified into four disorders: bacteria type 1, bacteria type 2, allergy, and dandruff.	It can only judge whether the image is one of the four trained classes or not and cannot describe the lesion region.
[16]	Ensemble of VGG-16 networks	Hair follicles on microscopic scalp images were counted.	It cannot predict common scalp hair disorders.
[17]	Three CNN-based models	The severity level (1~4) of hair loss was predicted from facial images.	It can only inspect hair loss severity and the dataset does not contain microscopic scalp characteristics.
[18]	U-net	The percentage of hair loss and severity of alopecia were calculated.	It can only inspect hair loss severity and pixel-level annotation is necessary.
[19]	VGG-SVM	The model recognized alopecia from hair photos.	It can only determine if the photo is alopecia or healthy hair.
[10]	Mask R-CNN	Hair follicles were detected and hair density in each follicle was measured.	It can only evaluate the hair loss severity from a microscopic scalp image and bounding box annotation is required.
[20]	D-Net and R-Net	Hair follicles were detected and hair density in each follicle was measured.	It can only evaluate the hair loss severity from a microscopic scalp image and bounding box annotation is required.
[21]	EfficientDet, YOLOv4, DetectoRS	Hair follicles were detected and hair density in each follicle was measured.	It can only evaluate the hair loss severity from a microscopic scalp image and bounding box annotation is required.

Table 2. Representative microscopic scalp images of each label.

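[Image grid: one representative microscopic scalp image for each disorder (dryness, oiliness, erythema, folliculitis, dandruff, hair loss) at each severity level: none (0), mild (1), moderate (2), severe (3).]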

Table 3. The number of scalp images of each label.

	Original Data				After Data Balancing
Label	Level 0	Level 1	Level 2	Level 3	Level 0	Level 1	Level 2	Level 3
Dryness	686	5702	7054	2936	7339	10,047	7054	8054
Oiliness	686	36,079	31,481	4816	7339	9019	9444	8627
Erythema	686	38,520	16,659	5496	7339	9630	8329	9713
Folliculitis	686	2733	974	427	2676	2733	2818	2500
Dandruff	686	21,291	12,244	2900	7339	10,645	12,244	7912
Hair loss	686	17,159	4881	1075	7339	8579	8739	7536
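
The effect of data balancing in Table 3 can be summarized with a simple imbalance ratio, defined here as the largest severity-class count divided by the smallest for each disorder. This metric is our illustrative choice and is not reported in the paper; the counts are copied from Table 3.

```python
# Image counts per severity level (0, 1, 2, 3), copied from Table 3.
ORIGINAL = {
    "Dryness": (686, 5702, 7054, 2936),
    "Oiliness": (686, 36079, 31481, 4816),
    "Erythema": (686, 38520, 16659, 5496),
    "Folliculitis": (686, 2733, 974, 427),
    "Dandruff": (686, 21291, 12244, 2900),
    "Hair loss": (686, 17159, 4881, 1075),
}
BALANCED = {
    "Dryness": (7339, 10047, 7054, 8054),
    "Oiliness": (7339, 9019, 9444, 8627),
    "Erythema": (7339, 9630, 8329, 9713),
    "Folliculitis": (2676, 2733, 2818, 2500),
    "Dandruff": (7339, 10645, 12244, 7912),
    "Hair loss": (7339, 8579, 8739, 7536),
}


def imbalance_ratio(counts):
    """Largest class count divided by the smallest (1.0 means balanced)."""
    return max(counts) / min(counts)


for label in ORIGINAL:
    before = imbalance_ratio(ORIGINAL[label])
    after = imbalance_ratio(BALANCED[label])
    print(f"{label:<13} before {before:5.1f}  after {after:4.2f}")
```

By this measure every disorder ends up with a ratio below 2 after balancing, whereas the original oiliness and erythema data have ratios above 50.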

Table 4. Hyperparameters used for training the models.

Models	Input Dimension	Epochs	Batch Size	Learning Rate	Optimizer	Loss
ResNet-152	480 × 480 × 3	200	16	1 × 10⁻³	Adam	Categorical cross-entropy
EfficientNet-B6	480 × 480 × 3	200	8	1 × 10⁻³	Adam	Categorical cross-entropy
ViT-B/16	224 × 224 × 3	200	16	1 × 10⁻⁴	Adam	Categorical cross-entropy

Table 5. Severity classification results of each scalp hair disorder by the trained ViT-B/16 model and other deep learning models.

Model	Disorder	Accuracy (%)	Precision (%)	Recall (%)	F1-Score (%)
ViT-B/16	Dryness	77.66	76.98	76.88	76.93
ViT-B/16	Oiliness	69.07	69.67	70.62	70.14
ViT-B/16	Erythema	81.38	81.47	81.69	81.58
ViT-B/16	Folliculitis	82.32	82.28	82.63	82.46
ViT-B/16	Dandruff	77.09	79.34	79.25	79.29
ViT-B/16	Hair loss	82.34	83.06	83.04	83.05
CNN with attention blocks [1]	Dandruff	85.03	N/A	N/A	N/A
YOLOv4 [21]	Hair loss	75.73	80.75	80.22	80.48
Mask R-CNN [10]	Hair loss	79.29	80.62	78.85	79.73
Triple ensemble model [22]	Hair loss	95.75	N/A	N/A	87.05
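
As a consistency check, the overall ViT-B/16 figures quoted in the text (78.31%, 78.81%, 79.02%, and 78.91%) appear to be unweighted means of the six per-disorder rows in Table 5. The sketch below recomputes them from the rounded table values and also checks that each F1-score is the harmonic mean of the listed precision and recall. That the authors averaged in exactly this way is an assumption; small differences of about 0.01 are expected because the table values are already rounded.

```python
# ViT-B/16 rows of Table 5: (accuracy, precision, recall, F1) in percent.
VIT_B16 = {
    "Dryness": (77.66, 76.98, 76.88, 76.93),
    "Oiliness": (69.07, 69.67, 70.62, 70.14),
    "Erythema": (81.38, 81.47, 81.69, 81.58),
    "Folliculitis": (82.32, 82.28, 82.63, 82.46),
    "Dandruff": (77.09, 79.34, 79.25, 79.29),
    "Hair loss": (82.34, 83.06, 83.04, 83.05),
}
# Overall values as quoted in the text.
REPORTED = (78.31, 78.81, 79.02, 78.91)


def macro_average(rows):
    """Unweighted mean of each metric column across disorders."""
    columns = zip(*rows.values())
    return tuple(sum(col) / len(col) for col in columns)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


averages = macro_average(VIT_B16)
for name, mean, quoted in zip(
        ("accuracy", "precision", "recall", "F1"), averages, REPORTED):
    print(f"{name:<9} recomputed {mean:.2f}  quoted {quoted:.2f}")
```

The recomputed accuracy matches the quoted value exactly, and the other three agree to within 0.02 percentage points.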

Table 6. The operation of Scalp checker software under diverse environments.

Operating System	CPU	Architecture	RAM	VRAM	Elapsed Time	Others
Windows XP (SP3)	4 cores	x86	4 GB	4 GB	N/A	.NET Framework 4.8 is not supported
Windows Vista (SP2)	4 cores	x64	8 GB	4 GB	N/A	.NET Framework 4.8 is not supported
Windows 7 (SP1)	4 cores	x64	8 GB	4 GB	N/A	Python 3.9 is not supported
Windows 8.1 (Update 1)	4 cores	x64	4 GB	N/A	≈229 s
Windows 8.1 (Update 1)	4 cores	x64	8 GB	N/A	≈213 s
Windows 10 (22H2)	4 cores	x64	4 GB	N/A	≈227 s
Windows 10 (22H2)	20 cores	x64	32 GB	12 GB	≈40 s
Windows 11 (22H2)	20 cores	x64	32 GB	N/A	≈60 s
Windows 11 (22H2)	20 cores	x64	32 GB	12 GB	≈40 s

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Ha, C.; Go, T.; Choi, W. Intelligent Healthcare Platform for Diagnosis of Scalp and Hair Disorders. Appl. Sci. 2024, 14, 1734. https://doi.org/10.3390/app14051734
