1. Introduction
Technological breakthroughs have made clinical investigations and diagnostic tests widely feasible, and high-end diagnostic tools are readily available. Ayurveda uses certain human indices such as the pulse (Nadi), eyes, nails, and tongue to diagnose common ailments. Ancient traditional medicine, with no advanced technological tools available, used these parameters as a means of diagnosis to predict many abnormalities related to the health status of internal body organs. These traditional, non-invasive diagnostic practices can be adapted to give a preliminary prognosis by quantifying and automating the entire analysis process to remove the associated subjectivity.
The paper presents a novel method for multilabel disease classification from tongue analysis. Observation of the tongue is an important part of traditional Chinese medicine as well as Ayurveda. According to these traditions, different parts of the tongue correspond to different internal organs: the tip of the tongue reflects the heart and lungs; the middle part reflects the spleen, pancreas, and stomach; the root represents the kidneys and intestines; and the right and left sides represent the liver and gall bladder.
Over the past few decades, the use of decision-support systems in clinical practice has increased. Deep learning models enable reliable classification and object detection in medical images, with accuracy comparable to that of physicians. An automated tongue diagnosis system is one way to bridge the gap between traditional diagnostic methods and modern Western medicine. Quantitative analysis can overcome the subjectivity that stems from a skilled traditional practitioner's knowledge base and meticulous practice, and thereby establish accountability and acceptability. In this paper, we develop a multi-disease classification algorithm able to predict multiple disease classes with appreciable accuracy. According to the collected data samples, eight diseased states are considered. The contributions of this paper are as follows:
The paper presents a MobileNetV2 deep learning model to distinguish a diseased tongue from a normal, healthy tongue. The model employs five-fold cross-validation and transfer learning to achieve this binary classification. The main achievement is the promising results obtained with images taken by smartphone cameras rather than standard equipment.
A multilabel classification model for eight common categories of ailments is developed. The DenseNet-121 architecture achieves satisfactory disease classification results on the small dataset. The eight disease labels are diabetes, hypertension, acid peptic disease, pyrexia, hepatitis, cold cough, gastritis, and others.
The organisation of this paper is as follows:
Section 2 reviews related work on techniques for tongue feature extraction and disease classification.
Section 3 details the methodology, including the dataset and training details, along with the experimentation and evaluation metrics used.
Section 4 reports the performance analysis. Finally,
Section 5 concludes the research.
2. Related Work
Quantifying the tongue's diagnostic attributes is one of the main challenges in automating tongue analysis for disease diagnosis. Ref. [1] presents a study of tongue conditions based on Ayurveda, which pertain to an individual's health status. Ref. [2] presented a summary of various tongue attributes, such as colour, coating, texture, and geometric shape, used in oriental medicine to predict specific diseased conditions. According to traditional Chinese medicine (TCM), the best illumination for tongue inspection is sunshine in an open area at 9 a.m.; artificially, this can be reproduced with a light source having a colour rendering index greater than ninety and a colour temperature of around 5000 K. The automation of tongue analysis systems essentially requires an image-capturing device producing high-resolution images, so that tongue features can be extracted accurately for disease prediction in agreement with traditional medicine practitioners. Researchers have explored various imaging setups using high-end CCD cameras [3,4,5,6,7,8,9], hyperspectral cameras [10,11,12,13,14,15,16], and smartphone cameras [17,18,19,20,21,22,23,24]. Before feature extraction and classification, it is necessary to segment the tongue region from the captured images, which include teeth, lips, and skin areas. Over time, researchers have explored conventional approaches, artificial neural networks, and deep learning algorithms for tongue area segmentation and feature extraction. Probably due to the lack of a digital dataset covering all possible features for various diseases, most of the work in this area has focused on a specific disease and its associated tongue features. Some common diseases, such as diabetes, appendicitis, and gastritis, were targeted, and feature extraction and classification were performed using statistical techniques [25,26,27,28]. Hybrid models, with statistical methods for feature extraction and machine learning algorithms for classification, were also developed [29,30,31,32,33].
Table 1 summarises some published methods for tongue feature extraction and disease classification.
Ref. [47] reviewed current trends in tongue diagnosis. They also trained classification models using Random Forest and Support Vector Machine on a tongue dataset, dividing the tongue region into five parts according to the layout of the internal organs and extracting seven colour-space features from each part. Further, they trained pretrained VGG and ResNet models on tongue constitution classification; ResNet-50 showed the best performance, with 64.52% accuracy.
Over the past two decades, enormous efforts have been made to utilise the full potential of tongue diagnosis systems. Enhancing tools and techniques to achieve a holistic approach and overcome the subjective nature of diagnosis is an ongoing process, and standard protocols are needed so that such systems have uniform characteristics and are readily accepted by end users. Most of the research has focused on specific diseases and the extraction of particular tongue features. Refs. [44,45] achieved notable and accurate results by classifying 12 disease categories, including a healthy tongue, and assigning a specific single disease label to each tongue sample; they did not address multi-label classification. None of the references mentioned aimed to classify multiple disease labels.
The aim of this paper is to take a step towards a decision support system that uses deep architectures to provide a holistic tongue diagnosis rather than targeting particular tongue features or diseases. In the future, we propose to develop it further as a mobile app that allows users to consult an expert medical professional and receive preliminary treatment for immediate relief before detailed investigations are conducted.
3. Methodology
The primary goal of this study is to determine whether smartphone images can yield accurate results for tongue analysis, enabling a preliminary diagnosis whenever and wherever needed. Tongue images captured by an individual, together with the symptoms they report, can aid the medical professional in giving a preliminary line of treatment before directing further detailed investigations. We collaborated with two consulting physicians to collect data over a two-month period using smartphones with a camera resolution of at least 8 megapixels. Each image was analysed and labelled with one or more disease labels per case.
We classified seven common ailments, each with a significant number of samples, and grouped the remaining diagnosed diseases, which had few samples, under a single label, ‘others’. The doctor’s diagnosis serves as the ground truth for the classification task. The approach used for multilabel classification starts with segmentation of the tongue area from the complete image. Thereafter, a binary classification architecture distinguishes an unhealthy tongue from a healthy one, followed by final disease label prediction using a multi-label classifier model. The workflow for implementing the multi-label classification model is illustrated in
Figure 1. The following subsections briefly describe each step. The strategy for the deep learning classification models is as follows:
- Stratified 5-fold cross-validation for disease risk classification.
- An ensemble learning strategy (bagging) to reduce variance.
- Upsampling of the dataset to guarantee a minimum sample size in each class.
- Extensive real-time data augmentation during training.
- Class-weighted focal loss to tackle class imbalance (a sketch follows this list).
- Individual training for multi-disease classification and disease risk detection.
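To make the class-weighted focal loss concrete, the following is a minimal TensorFlow/Keras sketch; the per-class weights and the value γ = 2 are illustrative assumptions, not the exact AUCMEDI settings.

```python
import tensorflow as tf

def class_weighted_focal_loss(class_weights, gamma=2.0):
    """Focal loss with per-class weights for imbalanced data.

    class_weights: sequence of shape (n_classes,), e.g. inverse class
    frequencies (an assumption here); gamma down-weights easy examples.
    """
    weights = tf.constant(class_weights, dtype=tf.float32)

    def loss(y_true, y_pred):
        # Clip to avoid log(0).
        y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0 - 1e-7)
        # Standard cross-entropy term per class.
        ce = -y_true * tf.math.log(y_pred)
        # Focal modulation: (1 - p)^gamma shrinks the loss for
        # well-classified examples, focusing training on hard ones.
        focal = tf.pow(1.0 - y_pred, gamma) * ce
        # Apply per-class weights and sum over classes.
        return tf.reduce_sum(weights * focal, axis=-1)

    return loss
```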
3.1. Tongue Analysis Dataset
A total of 1095 images of subjects suffering from one or more ailments were acquired using smartphone cameras (Samsung A50, iPhone, OnePlus) with an image resolution of at least 8 megapixels. A total of 822 images of healthy individuals were also collected. All images were collected with the necessary consent of individuals willing to be part of our study. The raw images come in a variety of resolutions and sizes. For the proposed model, images are annotated with eight conditions besides the normal and disease risk categories, as listed in
Table 2. The ‘others’ label class includes some uncommon conditions such as CAD (coronary artery disease), CKD (chronic kidney disease), COPD (chronic obstructive pulmonary disease), epilepsy, and vertigo.
3.2. Preprocessing and Image Augmentation
The acquired images are of various sizes and formats, so primary processing converts all images to a 256 × 256-pixel JPEG format. These basic preprocessing steps ensure uniformity across all images in the dataset. The next step is segmentation of the tongue area of interest using the Double U-Net architecture [48]. To increase data variability, further preprocessing methods are used: image augmentation for upsampling to balance the class distribution, and real-time augmentation during training, which yields unique, novel images in each epoch and thus improves the model's performance. Upsampling ensures each label occurs at least 150 times in the dataset, which increased the total number of diseased tongue images to 2729. Rotation, flipping, and alterations of brightness, saturation, and hue are used for real-time augmentation.
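A minimal sketch of these preprocessing and real-time augmentation steps, assuming TensorFlow's tf.image utilities; the transform ranges are illustrative, and a random 90-degree rotation stands in for whatever rotation range the actual pipeline used.

```python
import tensorflow as tf

IMG_SIZE = (256, 256)

def preprocess(path):
    """Load a raw image file, decode it, and resize to the common 256x256 input."""
    img = tf.io.read_file(path)
    img = tf.io.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, IMG_SIZE)
    return img / 255.0  # scale pixel values to [0, 1]

def augment(img):
    """Random transforms applied on the fly, so each epoch sees new variants."""
    img = tf.image.random_flip_left_right(img)
    img = tf.image.random_brightness(img, max_delta=0.1)
    img = tf.image.random_saturation(img, lower=0.9, upper=1.1)
    img = tf.image.random_hue(img, max_delta=0.05)
    # Random multiple-of-90-degree rotation as a simple illustrative stand-in.
    img = tf.image.rot90(img, k=tf.random.uniform([], 0, 4, dtype=tf.int32))
    return tf.clip_by_value(img, 0.0, 1.0)
```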
3.3. Deep Learning Models
Our pipeline combines two different types of image classification: binary classification of normal versus diseased tongues and a disease label classifier for multilabel-annotated images. The AUCMEDI [49] platform is employed to develop our classification models; in both cases, pretrained models are used to reduce the time and cost of training a fresh model. We apply transfer learning with all layers frozen except the classification head; after 10 epochs of training, the layers are unfrozen so that the weights can adapt to the new task.
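The freeze-then-unfreeze schedule could be rendered in Keras roughly as follows; the dummy data, head layout, and learning rates are assumptions for illustration, not the authors' exact configuration.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Dummy stand-in data so the sketch runs end-to-end (8 random images).
x = np.random.rand(8, 256, 256, 3).astype("float32")
y = tf.keras.utils.to_categorical(np.random.randint(0, 2, 8), num_classes=2)

# Phase 1: frozen ImageNet backbone; only the new classification head trains.
base = tf.keras.applications.MobileNetV2(
    input_shape=(256, 256, 3), include_top=False, weights="imagenet")
base.trainable = False

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(2, activation="softmax"),  # binary head: normal vs diseased
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.fit(x, y, epochs=10, verbose=0)

# Phase 2: unfreeze the backbone so the pretrained weights can adapt;
# a lower learning rate (an assumption) avoids destroying them.
base.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="categorical_crossentropy")
model.fit(x, y, epochs=2, verbose=0)
```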
We chose the two architectures to be compatible with low resource requirements and feasible for deployment on the Android platform. A brief introduction to the two architectures is provided in the following paragraphs, along with the parameter settings for each model.
3.3.1. Diseased Tongue Detector
MobileNetV2 is used for binary classification to distinguish between normal and unhealthy tongue images. MobileNet [50] is a family of computer vision models open-sourced by Google and designed for training classifiers, detectors, and segmentation networks. The model's highlights are its small, low-latency, low-power variants, designed to maximise accuracy while respecting the resource constraints of on-device or embedded applications. The use of depth-wise separable convolutions significantly reduces the number of parameters compared to regular convolutional networks. MobileNetV2, as defined in [50], contains 53 convolution layers and 1 average pooling layer, as shown in
Figure 2a. It incorporates two new architectural elements within its layout: a linear bottleneck across the layers and a shortcut between the bottlenecks. The initial layer is a fully convolutional layer with 32 filters, followed by 19 residual bottleneck layers. The structure consists of two crucial components:
Each block has three layers: a point-wise 1 × 1 (expansion) convolution, a 3 × 3 depth-wise convolution, and a point-wise 1 × 1 (projection) convolution. There are stride-1 blocks and stride-2 blocks, as shown in
Figure 2b. Each bottleneck is thus an inverted residual block with either stride 1 (carrying a shortcut connection) or stride 2 (for downsizing). Bottleneck layers enhance the model's ability to move from lower-level pixel representations to the higher-level features that identify image categories.
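A sketch of one such inverted residual block in Keras, following the published MobileNetV2 design; the input shape and expansion factor below are arbitrary examples.

```python
import tensorflow as tf
from tensorflow.keras import layers

def inverted_residual(x, expansion, out_channels, stride):
    """One MobileNetV2 bottleneck block: expand -> depthwise -> project."""
    in_channels = x.shape[-1]
    # Point-wise 1x1 expansion with ReLU6.
    h = layers.Conv2D(expansion * in_channels, 1, use_bias=False)(x)
    h = layers.BatchNormalization()(h)
    h = layers.ReLU(max_value=6.0)(h)
    # 3x3 depth-wise convolution (stride-2 blocks downsample here).
    h = layers.DepthwiseConv2D(3, strides=stride, padding="same",
                               use_bias=False)(h)
    h = layers.BatchNormalization()(h)
    h = layers.ReLU(max_value=6.0)(h)
    # Linear bottleneck: 1x1 projection with no activation.
    h = layers.Conv2D(out_channels, 1, use_bias=False)(h)
    h = layers.BatchNormalization()(h)
    # Shortcut exists only in stride-1 blocks with matching channel counts.
    if stride == 1 and in_channels == out_channels:
        h = layers.Add()([x, h])
    return h

inputs = tf.keras.Input(shape=(64, 64, 24))  # example feature map
outputs = inverted_residual(inputs, expansion=6, out_channels=24, stride=1)
```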
Five-fold cross-validation is combined with bagging to form ensembles via Random Forest and Soft Majority Voting. The dataset is split into three parts: a training set, a model-test set, and an ensemble-test set. The five-fold cross-validation further divides the training set into five parts; in each fold, one part is used for validation and the other four for training. Five models are thus trained, one per fold; individually, they may not give the best results, and the best model for each fold is saved. The predictions of the individual models are then combined into an ensemble to produce a single, stable prediction. Two techniques, Random Forest (RF) and Soft Majority Voting (SMV), are used, and their results are compared.
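The stratified split could look like the following scikit-learn sketch, shown here on toy stand-in data; each fold preserves the healthy/diseased ratio of the full set.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Toy stand-ins: 20 samples with an imbalanced binary label.
X = np.arange(20).reshape(-1, 1)   # sample indices (stand-in for images)
y = np.array([0] * 6 + [1] * 14)   # 0 = healthy, 1 = diseased

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
    # Four parts train the fold's model; the held-out part validates it,
    # keeping the class ratio of the full training set in every fold.
    print(f"fold {fold}: train={len(train_idx)}, val={len(val_idx)}, "
          f"val diseased ratio={y[val_idx].mean():.2f}")
```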
Random Forest
The Random Forest algorithm is itself an ensemble method. Implementation begins with a bootstrapped dataset, drawn by sampling the original data with replacement: a sample may be drawn multiple times, producing a new set the same size as the original that does not necessarily contain every original sample. The next step is to build a decision tree from a random subset of variables. The bootstrapping step is repeated, generating multiple decision trees. The final step is bagging for the prediction of new data: the most common decision across the trees becomes the final result. For the bagging method, the Random Forest classifier is trained on the ensemble dataset.
Soft Majority Vote
For SMV, each model's predictions are summed to create a new prediction matrix. The final prediction for a new sample is then the category with the maximum value in the matrix.
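A minimal NumPy sketch of this soft majority vote over fold predictions:

```python
import numpy as np

def soft_majority_vote(fold_preds):
    """fold_preds: array of shape (n_folds, n_samples, n_classes)
    holding each fold model's class probabilities."""
    summed = np.sum(fold_preds, axis=0)   # element-wise sum across folds
    return np.argmax(summed, axis=-1)     # class with the largest total

# Example with 2 fold models, 3 samples, 2 classes:
p = np.array([[[0.8, 0.2], [0.4, 0.6], [0.3, 0.7]],
              [[0.6, 0.4], [0.7, 0.3], [0.2, 0.8]]])
print(soft_majority_vote(p))  # -> [0 0 1]
```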
Parameter settings for this model are as follows. Due to the imbalance between the two categories, class weights are computed, and categorical focal loss is used as the loss function. Model performance is tested on the model-test data, and evaluation metrics for each fold are computed and saved. The 5-fold bagging ensemble is built with the Random Forest and Soft Majority Vote methods. Each model is trained for 50 epochs, with a dynamic learning rate reduced by a factor of 0.1 (down to a minimum of 1 × 10−7) when validation loss shows no improvement for 5 epochs. Early stopping with model checkpointing is also used, with a patience of 12 epochs without improvement in validation loss. Finally, we compare the F1 scores of the individual folds with those of the ensemble models to determine which performs best.
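These settings map naturally onto standard Keras callbacks, as in the following sketch; the checkpoint file name is a hypothetical placeholder.

```python
from tensorflow.keras.callbacks import (ReduceLROnPlateau, EarlyStopping,
                                        ModelCheckpoint)

callbacks = [
    # Shrink the learning rate by a factor of 0.1 when validation loss
    # plateaus for 5 epochs, never going below 1e-7.
    ReduceLROnPlateau(monitor="val_loss", factor=0.1, patience=5,
                      min_lr=1e-7),
    # Abort training after 12 epochs without improvement.
    EarlyStopping(monitor="val_loss", patience=12),
    # Keep only the best-performing weights seen so far.
    ModelCheckpoint("fold_best.h5", monitor="val_loss",
                    save_best_only=True),
]
# model.fit(..., epochs=50, callbacks=callbacks)
```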
3.3.2. Disease Label Classifier
This is a multilabel classifier using the DenseNet-121 [51] architecture for the eight disease classes. DenseNet, as its name implies, is a densely connected network in which each layer is connected to every other layer: the input of a layer is the concatenation of the feature maps of all previous layers. Concatenation requires feature maps of the same size; this issue is resolved by dividing the network into multiple densely connected blocks, facilitating both downsampling and feature concatenation while keeping the feature map size constant within a block.
The output of the l-th layer is given by $x_l = H_l([x_0, x_1, \ldots, x_{l-1}])$, where $x_0$ represents the input, $H_l$ the non-linear transformation occurring in each block, and $x_l$ the output of the l-th layer. A transition block, consisting of a convolution and a pooling layer, follows each dense block. Lastly, the network is terminated by a classification layer that accepts the feature maps of all previous layers to perform the classification task. Bottleneck layers are used within each block: a 1 × 1 convolution reduces the number of channels in the input, followed by a 3 × 3 convolution for feature extraction. This relaxes the large computational requirements while also improving efficiency. DenseNet-121 consists of four dense blocks with [6, 12, 24, 16] layers, respectively, as shown in the summary chart in
Figure 3.
This network has the following imperative advantages: it mitigates the vanishing gradient problem, strengthens feature propagation, encourages feature reuse, and has a smaller number of parameters.
For this study, multilabel focal loss is used as the loss function, with categorical accuracy as the evaluation metric. The model is trained with the following hyperparameter settings: 100 epochs, with a dynamic learning rate reduced by a factor of 0.1 (down to a minimum of 1 × 10−7) when validation loss shows no improvement for 5 epochs. Early stopping with model checkpointing is also used, halting training after 12 epochs without improvement.
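A plausible Keras rendering of this setup, assuming a sigmoid multilabel head on a pretrained DenseNet-121 and using Keras' built-in binary focal cross-entropy (available in recent TensorFlow releases) as a stand-in for the multilabel focal loss actually used:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

N_CLASSES = 8  # eight disease labels

base = tf.keras.applications.DenseNet121(
    input_shape=(256, 256, 3), include_top=False, weights="imagenet")

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    # Sigmoid (not softmax) so each disease label is predicted
    # independently; several labels can be active at once.
    layers.Dense(N_CLASSES, activation="sigmoid"),
])

model.compile(
    optimizer="adam",
    # Stand-in for the paper's multilabel focal loss.
    loss=tf.keras.losses.BinaryFocalCrossentropy(gamma=2.0),
    # Categorical accuracy compares argmax positions, as described
    # in the evaluation metrics section.
    metrics=["categorical_accuracy"],
)
```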
The experiments were performed with the available resources: an HP Pavilion laptop with a 1.60 GHz Intel i5 8th-generation processor and 8 GB of RAM. Model training was done on Google Colab (Python 3) with a Google Compute Engine GPU backend.
3.4. Evaluation Metrics
The metrics considered for quantitative evaluation of the model on test data are precision, recall, and F1-score.
Precision is the number of true positives divided by the total number of positive predictions: $\text{Precision} = \frac{TP}{TP + FP}$.
Recall measures the model's ability to find the positives; it is the number of true positives divided by the sum of true positives and false negatives: $\text{Recall} = \frac{TP}{TP + FN}$.
The F1 score is the harmonic mean of precision and recall, given by $F1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$.
Categorical accuracy is the percentage of predicted values that match the true values. It is calculated as follows:
- First, identify the index at which the maximum value occurs, using argmax.
- If this index is the same for the predicted and the true value, the prediction is counted as accurate.
Since only the index of the maximum value is used, the predicted values can be logits or probabilities.
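A small NumPy sketch of this computation:

```python
import numpy as np

def categorical_accuracy(y_true, y_pred):
    """Fraction of samples where the argmax of the prediction matches
    the argmax of the true one-hot vector; y_pred may be logits or
    probabilities, since only the position of the maximum matters."""
    return np.mean(np.argmax(y_true, axis=-1) == np.argmax(y_pred, axis=-1))

y_true = np.array([[0, 1], [1, 0], [0, 1]])
y_pred = np.array([[0.2, 0.8], [0.3, 0.7], [0.1, 0.9]])
print(categorical_accuracy(y_true, y_pred))  # 2 of 3 match -> 0.666...
```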
The receiver operating characteristic (ROC) curve is determined by calculating the average ROC over the five folds. In practice, each fold has different false- and true-positive rates, so the mean cannot be computed directly; hence, all False-Positive Rates (FPRs) are first aggregated into one vector, which serves as the x-axis of the average ROC curve. The True-Positive Rate (TPR) is then interpolated at these x-points, providing the corresponding y-values.
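A sketch of this fold-averaging procedure, assuming scikit-learn's roc_curve and NumPy interpolation:

```python
import numpy as np
from sklearn.metrics import roc_curve

def mean_roc(fold_labels, fold_scores):
    """Average ROC over folds: pool all fold FPRs into one x-axis grid,
    then interpolate each fold's TPR onto that grid and average."""
    curves = [roc_curve(y, s)[:2] for y, s in zip(fold_labels, fold_scores)]
    # The union of all false-positive rates becomes the common x-axis.
    all_fpr = np.unique(np.concatenate([fpr for fpr, _ in curves]))
    # Interpolate each fold's TPR at those x-points, then average.
    mean_tpr = np.mean(
        [np.interp(all_fpr, fpr, tpr) for fpr, tpr in curves], axis=0)
    return all_fpr, mean_tpr
```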
4. Results and Discussions
A disease risk detection model is trained on the dataset of healthy and unhealthy tongue images. The unhealthy image data are upsampled to ensure at least 150 images in each of the 8 categories. Real-time image augmentation artificially enlarges the image set by applying small transformations to the original images, such as rotations or changes in contrast or saturation. Online image augmentation applies the transformations to each image as it is loaded by the data generator, eliminating the need for disk storage. For multilabel disease classification evaluation, a one-hot-encoded file serves as the interface defining the true labels for the model.
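Such an interface file can be built as a multi-hot table, as in the following sketch; the label spellings follow Table 2 as quoted in this paper, and the image names and file name are hypothetical placeholders.

```python
import numpy as np
import pandas as pd

LABELS = ["diabetes", "hypertension", "acid peptic disease", "pyrexia",
          "hepatitis", "cold cough", "gastritis", "others"]

def to_multi_hot(diagnoses):
    """Turn the list of diagnosed diseases for one image into a 0/1 vector."""
    vec = np.zeros(len(LABELS), dtype=int)
    for d in diagnoses:
        vec[LABELS.index(d)] = 1
    return vec

# One row per image; a sample can carry several active labels at once.
df = pd.DataFrame(
    [to_multi_hot(["diabetes", "hypertension"]), to_multi_hot(["pyrexia"])],
    columns=LABELS, index=["img_001.jpg", "img_002.jpg"])
df.to_csv("true_labels.csv")  # hypothetical interface file name
```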
4.1. Performance Analysis of Disease Risk Detector
When trained on a shared GPU from Google Colab, the sequential training of the disease risk detector with 5-fold cross-validation and the MobileNetV2 architecture took approximately 4.5 h, averaging 45 epochs per fold. The disease risk model's performance on the model-test dataset, with respect to the metrics considered, is shown in
Table 3. The test dataset consists of 485 images in total: 361 unhealthy tongue images and 124 normal, healthy tongue images. The ROC curves in
Figure 4, for the five folds and the ensemble techniques together with their mean, indicate the model's appreciable performance for binary classification.
The best model is saved for each fold of cross-validation. The predictions of each fold on the ensemble-test dataset of 485 images are combined to form an ensemble. The Random Forest ensemble is implemented by repeatedly picking random samples and features to build many decision trees on a bootstrapped dataset. Prediction of new data is accomplished by bagging, wherein the final decision is the most common decision amongst all the trees.
The Soft Majority Voting Ensemble aggregates the generated predictions from the five folds to create a new prediction matrix. The category with the maximum value is considered the final prediction for each sample. In this particular case, the ensemble of 5-fold predictions with two classes does not show enhanced performance; F1 scores for Random Forest and Soft Majority Voting are comparable to the individual fold F1 score as depicted in
Figure 5.
4.2. Performance Analysis of the Multi-Label Disease Classification Model
The DenseNet-121 model is utilised for the classification of diseases. The dataset comprises seven distinct disease labels plus an additional ‘others’ label covering all diseases with fewer than 10 subjects in each category. Model training stopped at 41 epochs; training the disease classification model with the DenseNet-121 architecture took over 18 h.
Table 4 shows the performance parameters for the eight labels considered in the test data of 474 images. The ROC curve in
Figure 6 gives an idea of true positive predictions for each category.
It is observed that each of the diseased classes achieved an average accuracy of over 90%. Some sample results, presented in
Table 5, where bold values indicate true predictions with respect to the ground truth, demonstrate the success of the DenseNet-121 model for disease classification.
Highlighted rows in the table indicate inaccurate classification for two sample images. The results for the class “others” were also observed to be slightly less accurate. This is also evident from the ROC curve and performance indices.
The proposed model for multi-disease classification shows satisfactory results. The most important outcome of this work is that images captured in different environments and on various mobile phones demonstrated appreciable results. Further improvement is possible with an ensemble of different deep learning models to achieve high accuracy for all disease labels. As it is not practically possible for any individual model to deliver uniform accuracy for all classification labels under consideration, multiple models will ultimately enhance the overall prediction accuracy. In future work, the aim is to increase the number of disease labels to encompass all potential cases that tongue analysis can diagnose, in collaboration with a medical expert to compile ground truth data for classification.
Even with the limitation of a small dataset and a limited variety of disease classes, the model's performance is appreciable. Since only a single dense model is considered, its accuracy could not be uniform across all class labels. As previously mentioned, this can be enhanced by incorporating ensemble learning.
5. Conclusions and Future Work
The main challenging aspect of automating tongue analysis for disease diagnosis is quantifying the diagnostic features of the tongue image on par with the expert practitioner's observations. Automating this non-invasive method of diagnosis can enable a sick individual to consult medical personnel over a telemedicine network, even from a remote location. In the Indian subcontinent, with its large and diverse population, automatic tongue analysis systems can aid medical personnel to some extent in giving a preliminary diagnosis and initiating immediate treatment without direct physical interaction with the subject.
In this study, we introduced a multi-disease detection pipeline for tongue image analysis that uses ensemble learning to combine the predictions of five individually trained models to improve the performance of identifying a diseased tongue. In this particular case, the ensemble method's performance is comparable to that of the individual models, showing no notable improvement. A single image of the tongue can be used to predict more than one disease by employing techniques such as transfer learning, class weighting for imbalanced classes, extensive real-time data augmentation, and focal loss within the deep learning architecture. An average accuracy of 90% is achieved with the multi-label classification model for eight disease classes. Two different deep learning models are used, primarily due to their lower resource requirements compared to their counterparts and their potential for Android platform deployment.
This study's most significant accomplishment is the successful use of mobile phone images for tongue image analysis, which represents a step towards reducing the dependence on costly, high-end image-capturing devices and the expertise they require, and strengthens the possibility of developing an Android-based app for easy and quick prognosis. Automation will also be helpful in training aspiring clinicians to use this non-invasive traditional practice and enhance their skills with experience. This opens the opportunity to explore more enhanced models with a larger dataset for performance improvement.
In our future work, we aim to build a more robust model with a judicious selection of a group of deep learning architectures for an ensemble model capable of classifying all possible disease labels achievable using tongue analysis from smartphones. This inherently includes the need to compile an exhaustive tongue image dataset for research.
Author Contributions
V.B. compiled the work after conducting a literature survey and developing an algorithm that was then trained on the dataset and rigorously analysed by P.P.B., the guide and mentor. P.P.B. critically revised and corrected the first author's draft for intellectual content, ultimately approving the submitted version. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
This is an observational study, and hence no ethical approval was required. Individual consent forms were obtained from all subjects and recorded alongside the clinician's diagnosis. The collected data are used only for research purposes, and the identity of the subjects is never disclosed.
Informed Consent Statement
All of the individual participants in the study gave their informed consent. The authors made it clear to them that they would only use the provided images for research purposes and would not disclose any personal information on any platform.
Data Availability Statement
The data presented in this study are available on request from the corresponding author.
Acknowledgments
The authors acknowledge the support of the medical professionals who contributed to the collection, annotation, and labelling of the dataset. We are also thankful to the student volunteers of the Department of Biomedical Engineering, SGSITS, Indore.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Patil, M.K. Anatomical Study of Jhiva W.R.T Sam and Niram Prakriti Pariksha. Int. Ayurvedic Med. J. 2017, 1, 151–159. [Google Scholar]
- Vocaturo, E.; Zumpano, E.; Veltri, P. On discovering relevant features for tongue colored image analysis. In Proceedings of the 23rd International Database Applications & Engineering Symposium (IDEAS ‘19), Athens, Greece, 10–12 June 2019; Association for Computing Machinery: New York, NY, USA; pp. 1–8. [Google Scholar] [CrossRef]
- Chiu, C.C. A novel approach based on computerized image analysis for traditional Chinese medical diagnosis of the tongue. Comput. Methods Programs Biomed. 2000, 61, 77–89. [Google Scholar] [CrossRef] [PubMed]
- Wang, Y.; Zhou, Y.; Yang, J.; Xu, Q. An Image Analysis System for Tongue Diagnosis in Traditional Chinese Medicine. In Proceedings of the International Computational & Information Science Conference, Shanghai, China, 16–18 December 2004; Springer: Berlin/Heidelberg, Germany, 2004; pp. 1181–1186. [Google Scholar]
- Zhang, H.; Wang, K.; Zhang, D.; Pang, B.; Huang, B. Computer aided tongue diagnosis system. In Proceedings of the 27th Annual Conference on Engineering in Medicine & Biology Society, Sydney, Australia, 17–18 January 2005; IEEE: Piscataway, NJ, USA, 2006; pp. 6754–6757. [Google Scholar]
- Jiang, L.; Xu, W.; Chen, J. Digital imaging system for physiological analysis by tongue color inspection. In Proceedings of the 3rd Innovative Engineering & Applications Conference, Washington, DC, USA, 3–5 June 2008; IEEE: Piscataway, NJ, USA, 2008; pp. 1833–1836. [Google Scholar]
- Xu, J.; Tu, L.; Ren, H.; Zhang, Z. A Diagnostic Method Based on Tongue Imaging Morphology. In Proceedings of the 2nd International Conference on Bioinformatics & Biomedical Engineering, Shanghai, China, 16–18 May 2008; IEEE: Piscataway, NJ, USA, 2008; pp. 2613–2616. [Google Scholar]
- Wang, X.; Zhang, B.; Yang, Z.; Wang, H.; Zhang, D. Statistical Analysis of Tongue Images for Feature Extraction & Diagnostics. IEEE Trans. Image Process. 2013, 22, 5336–5347. [Google Scholar] [PubMed]
- Wang, X.; Zhang, D. A High-Quality Color Imaging System for Computerized tongue Image Analysis. J. Expert Syst. Appl. 2013, 40, 5854–5866. [Google Scholar] [CrossRef]
- Liu, Z.; Zhang, D.; Yan, J.Q.; Li, Q.L.; Tang, Q.L. Classification of hyperspectral medical tongue images for tongue diagnosis. Comput. Med. Imaging Graph. 2007, 31, 672–678. [Google Scholar]
- Li, Q.; Liu, J.; Xiao, G.; Xue, Y. Hyperspectral tongue imaging system used in tongue diagnosis. In Proceedings of the 2nd International Conference on Bioinformatics & Biomedical Engineering, Shanghai, China, 16–18 May 2008; IEEE: Piscataway, NJ, USA, 2008; pp. 2579–2581. [Google Scholar]
- Li, Q.; Wang, Y.; Liu, H.; Sun, Z.; Liu, Z. Tongue fissure extraction and classification using hyperspectral imaging technology. Appl. Opt. 2010, 49, 2006–2013. [Google Scholar] [CrossRef] [PubMed]
- Li, Q.; Liu, Z. Tongue color analysis and discrimination based on hyperspectral images. Comput. Med. Imaging Graph. 2009, 33, 217–221. [Google Scholar] [CrossRef] [PubMed]
- Yamamoto, S.; Tsumura, N.; Nakaguchi, T.; Namiki, T.; Kasahara, Y.; Terasawa, K.; Miyake, Y. Early Detection of Disease Oriented State from Hyperspectral Tongue Images with Principal Component Analysis and Vector Rotation. In Proceedings of the Annual International Conference on Engineering in Medicine & Biology Society, Buenos Aires, Argentina, 31 August–4 September 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 3025–3028. [Google Scholar]
- Li, Q.; Wang, Y.; Liu, H.; Sun, Z. AOTF based Hyperspectral Tongue Imaging System and Its Applications in Computer-aided Tongue Disease Diagnosis. In Proceedings of the 3rd International Conference on Biomedical Engineering and Informatics, Yantai, China, 16–18 October 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 1424–1427. [Google Scholar]
- Liu, Z.; Wang, H.J.; Li, Q. Tongue Tumour Detection in Medical Hyperspectral Images. Sensors 2012, 12, 162–174. [Google Scholar] [CrossRef] [PubMed]
- Ryu, I.; Itiro, S. A tongue diagnosis system for personal healthcare on smartphone. In Proceedings of the 5th Augmented Human International Conference, Kobe, Japan, 7–9 March 2014; Waseda University Press: Tokyo, Japan, 2014. [Google Scholar]
- Duan, Y.; Xu, D. I Tongue: An iPhone App for Personal Health Monitoring Based on Tongue Image; Final Report; Interdisciplinary Innovations Fund (IIF) 2012/2013 Awards (MU); University of Missouri: Columbia, MO, USA, 2014. [Google Scholar]
- Hu, M.C.; Cheng, M.H.; Lan, K.C. Color Correction Parameter Estimation on the Smartphone and its Application to Automatic Tongue Diagnosis. J. Med. Syst. 2016, 40, 18. [Google Scholar] [CrossRef]
- Tania, M.H.; Lwin, K.T.; Hossain, M.A. Computational Complexity of Image Processing Algorithms for an Intelligent Mobile Enabled Tongue Diagnosis Scheme. In Proceedings of the 10th International Conference on Software, Knowledge, Information Management & Applications, Denver, CO, USA, 15–16 December 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 29–36. [Google Scholar]
- Kanawong, R.; Obafemi-Ajayi, T.; Liu, D.; Zhang, M.; Xu, D.; Duan, Y. Tongue Image Analysis and Its Mobile App Development for Health Diagnosis. Adv. Exp. Med. Biol. 2017, 1005, 99–121. [Google Scholar]
- Hu, M.C.; Lan, K.C.; Fang, W.C.; Huang, Y.C.; Ho, T.J.; Lin, C.P.; Yeh, M.H.; Raknim, P.; Lin, Y.H.; Cheng, M.H.; et al. Automated Tongue Diagnosis on the Smartphone and its Applications. Comput. Methods Programs Biomed. 2019, 174, 51–64. [Google Scholar] [CrossRef]
- Zhou, Z.; Peng, D.; Gao, F.; Leng, L. Medical Diagnosis Algorithm Based on Tongue Image on Mobile Device. J. Multimed. Inf. Syst. 2019, 6, 99–106. [Google Scholar] [CrossRef]
- Smith, Z.J.; Chu, K.; Espenson, A.R.; Rahimzadeh, M.; Gryshuk, A.; Molinaro, M.; Dwyre, D.M.; Lane, S.M.; Matthews, D.; Wachsmann-Hogiu, S. Cell-Phone-Based Platform for Biomedical Device Development and Education Applications. PLoS ONE 2011, 6, e17150. [Google Scholar] [CrossRef]
- Haralick, R.M.; Shanmugan, K.; Dinstein, I. Textural features for Image Classification. J. IEEE Trans. Syst. Man Cybern. 1973, SMC-3, 610–621. [Google Scholar] [CrossRef]
- Pang, B.; Zhang, D. Computerized tongue diagnosis based on Bayesian networks. IEEE Trans. Biomed. Eng. 2004, 51, 1803–1810. [Google Scholar] [CrossRef]
- Pang, B.; Zhang, D.; Wang, K. Tongue Image analysis for appendicitis diagnosis. Inf. Sci. 2005, 175, 160–176. [Google Scholar] [CrossRef]
- Kim, J.; Son, J.; Jang, S.; Nam, D.H.; Han, G.; Yeo, I.; Ko, S.J.; Park, J.W.; Ryu, B.; Kim, J. Availability of Tongue diagnosis system for Assessing Tongue Coating thickness in Patients with Functional Dyspepsia. Evid Based Complement. Altern. Med. 2013, 2013, 348272. [Google Scholar] [CrossRef]
- Jayanti, S.K.; Shanmugapriyanga, B. Detecting Diabetes Mellitus Gradient Vector Flow Snake Segmented Technique. Int. Res. J. Eng. Technol. 2017, 4, 1238–1244. [Google Scholar]
- Zhang, D.; Zhang, H.; Zhang, B. Detecting Diabetes Mellitus and No proliferative Diabetic Retinopathy Using CTD. In Tongue Image Analysis; Springer: Singapore, 2017; pp. 303–325. [Google Scholar]
- Sandhya, N.; Rajasekar, M. Tongue Image Analysis for Hepatitis Detection Using GA-SVM. Indian J. Comput. Sci. Eng. 2017, 8, 526–534. [Google Scholar]
- Mrilaya, D.; Pervetaneni, P.; Aleperi, G. An Approach for Tongue Diagnosing with Sequential Image Processing Method. Int. J. Comput. Theory Eng. 2012, 4, 322–328. [Google Scholar]
- Dhanalakshmi, M.; Pervetaneni, P.; Aleperi, G. Applying Wavelet Transforms and Statistical Feature Analysis for Digital Tongue Image. IOSR J. Comput. Eng. 2014, 16, 95–102. [Google Scholar] [CrossRef]
- Kawanabe, T.; Kamarudin, N.D.; Ooi, C.Y.; Kobayashi, F.; Xiaoyu, M.; Sekine, M.; Wakasugi, A.; Odaguchi, H.; Hanawa, T. Quantification of tongue colour using machine learning in Kampo medicine. Eur. J. Integr. Med. 2016, 8, 932–941. [Google Scholar] [CrossRef]
- Li, J.; Hu, X.; Tu, L.; Cui, L.; Jiang, T.; Cui, J.; Ma, X.; Yao, X.; Shi, Y.; Wang, S.; et al. Diabetes Tongue Image Classification Using Machine Learning and Deep Learning. In Proceedings of the 2008 3rd IEEE Conference on Industrial Electronics and Applications, Singapore, 3–5 June 2008. [Google Scholar] [CrossRef]
- Ma, J.; Wen, G.; Hu, Y.; Chang, T.; Zeng, H.; Jiang, L.; Qin, J. Tongue image constitution recognition based on Complexity Perception method. arXiv 2018, arXiv:1803.00219. [Google Scholar] [CrossRef]
- Zhang, X.F.; Zhang, J.; Hu, G.Q.; Wang, Y.Z. Preliminary Study of Tongue Image Classification Based on Multi-Label Learning; ICIC 2015, Part III LNAI9227; Springer: Berlin/Heidelberg, Germany, 2015. [Google Scholar] [CrossRef]
- Chen, L.; Wang, B.; Zhang, Z.; Lin, F.; Ma, Y. Research on Techniques of Multifeatures Extraction for Tongue Image and Its Application in Retrieval. Comput. Math. Methods Med. 2017, 2017, 8064743. [Google Scholar] [CrossRef] [PubMed]
- Jiang, T.; Lu, Z.; Hu, X.; Zeng, L.; Ma, X.; Huang, J.; Cui, J.; Liping, T.; Zhou, C.; Yao, X.; et al. Deep Learning Multi-label Tongue Image Analysis and Its Application in a Population Undergoing Routine Medical Checkup. Evid.-Based Complement. Altern. Med. 2022, 2022, 3384209. [Google Scholar] [CrossRef] [PubMed]
- Li, J.; Zhang, Z.; Zhu, X.; Zhao, Y.; Ma, Y.; Zang, J.; Li, B.; Cao, X.; Xue, C. Automatic Classification Framework of Tongue Feature Based on Convolutional Neural Networks. Micromachines 2022, 13, 501. [Google Scholar] [CrossRef] [PubMed]
- Wang, X.; Liu, J.; Wu, C.; Liu, J.; Li, Q.; Chen, Y.; Wang, X.; Chen, X.; Pang, X.; Chang, B.; et al. Artificial intelligence in tongue diagnosis: Using deep convolutional neural network for recognizing unhealthy tongue with tooth-mark. Comput. Struct. Biotechnol. J. 2020, 18, 973–980. [Google Scholar] [CrossRef] [PubMed]
- Meng, D.; Cao, G.; Duan, Y.; Zhu, M.; Tu, L.; Xu, D.; Xu, J. Tongue Images Classification Based on Constrained High Dispersal Network. Evid.-Based Complement. Altern. Med. 2017, 2017, 7452427. [Google Scholar] [CrossRef] [PubMed] [PubMed Central]
- Kanawong, R.; Ajayi, T.O.; Ma, T.; Xu, D.; Li, S.; Duan, Y. Automated Tongue Feature Extraction for Zheng Classification in Traditional Chinese Medicine; Hindawi Publications Corporation: London, UK, 2012. [Google Scholar] [CrossRef]
- Rajakumaran, S.; Sashikala, J. An Automated Tongue Color Image Analysis for Disease Diagnosis and Classification Using Deep Learning Techniques. Eur. J. Mol. Clin. Med. 2021, 7, 4779–4796. [Google Scholar]
- Mansour, R.F.; Althobaiti, M.M.; Ashour, A.A. Internet of Things and Synergic Deep Learning Based Biomedical Tongue Color Image Analysis for Disease Diagnosis and Classification. IEEE Access 2021, 9, 94769–94779. [Google Scholar] [CrossRef]
- Soma, P.; Saradha, K.R.; Jothika, S.; Dharshini, S. Tongue Diagnosis using CNN for Disease Detection. Int. J. Electr. Electron. Res. 2022, 10, 817–821. [Google Scholar]
- Xie, J.; Jing, C.; Zhang, Z.; Xu, J.; Duan, Y.; Xu, D. Digital tongue image analyses for health assessment. Med. Rev. 2022, 1, 172–198. [Google Scholar] [CrossRef] [PubMed] [PubMed Central]
- Bhatnagar, V.; Bansod, P.P. Double U-Net a Deep Convolution Neural Network for Tongue Body Segmentation for Diseases Diagnosis. In Proceedings of the International Conference on Communication and Computational Technologies, Jaipur, India, 26–27 February 2022; Algorithms for Intelligent Systems. Springer: Singapore; pp. 293–303. [Google Scholar] [CrossRef]
- Mayer, S.; Müller, D.; Kramer, F. Standardized Medical Image Classification across Medical Disciplines. arXiv 2022, arXiv:2210.11091. [Google Scholar]
- Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. MobileNetV2: Inverted Residuals and Linear Bottlenecks. Computer Vision and Pattern Recognition (cs.CV). arXiv 2019, arXiv:1801.04381. [Google Scholar]
- Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely Connected Convolutional Networks. arXiv 2018, arXiv:1608.06993. [Google Scholar]