1.1. Dental Scientific Background
The two most common oral diseases among adults are dental caries (tooth decay) and periodontal inflammation (gum disease) [1]. Caries is caused by bacteria in dental plaque that ferment sugars and produce acid; the acid demineralizes the tooth enamel, after which the bacteria enter the tissues beneath it [2]. Periodontal diseases [3] are classified by severity as either gingivitis or periodontitis. Gingivitis is the reversible stage: only the gums are affected, with symptoms of redness, swelling, and bleeding. With professional treatment and a regimen of daily oral hygiene, recovery from gingivitis usually takes about 10–14 days [4]. Untreated gingivitis usually escalates to periodontitis, the irreversible stage of gum disease. Acids, which are the immune system’s response to the presence of toxic bacteria, cause deterioration and eventually destruction of the tooth-supporting tissues and can lead to tooth loss [5]. There are several ways to reverse inflammation. The two most studied in the literature are (1) mechanical plaque removal and (2) chemical plaque removal. It has been demonstrated [6] that alcohol-free mouthwash combined with regular toothbrushing can reduce gingival inflammation. The significant difference between caries and periodontal disease is that caries often causes pain even in its early stages, while gum infections are often asymptomatic until quite an advanced stage [7]. What the two diseases have in common is that they progress and escalate, so that delayed care can necessitate complex and expensive treatment or lead to loss of teeth [1].
Dentists have developed numerous indices to assess the severity of gingivitis [8,9], based on one or more of the following criteria: gum coloration (redness), gum contour, presence of bleeding, stippling, and crevicular fluid flow [10,11]. Another method for distinguishing unhealthy oral tissue from healthy tissue is bioimpedance: unhealthy tissue has been found to offer lower resistance to electrical current than healthy tissue [12]. Most of the indices require both visual and invasive measures (probing the gums with instruments) to assess gum health and reach a rating. An exception is the Modified Gingival Index (MGI) [8], which is completely noninvasive (i.e., exclusively visual). A survey study demonstrated that the gingivitis indices in common use, including in particular the MGI, are all strongly correlated [9]. The MGI uses a rating score between 0 and 4, with 0 indicating a tooth with healthy gums and 4 indicating the most severe inflammation, with spontaneous bleeding; for a precise definition of each rating, see Table 1.
Numerous epidemiological studies have concluded that gingivitis [8] is prevalent in children, adolescents, and adults [11,13,14,15], and that more than 80% of the world’s population suffers from mild to moderate forms of gingivitis from time to time [1]. Treating gingivitis is relatively simple and is done mostly at home through oral hygiene maintenance, including brushing twice daily, mouthwash, interproximal flossing, and dental toothpicks when appropriate [16,17,18]. In the clinic, gingivitis is usually treated in a single visit to a dental hygienist, who removes plaque and calculus (tartar, or hardened plaque) [19]. If a proper oral hygiene regimen is not maintained, gingivitis is likely to persist and progress to periodontitis.
Although routine checkups are essential for monitoring and maintaining oral health, most people do not follow a recommended checkup schedule [20]. The problem intensifies when people are instructed to practice social distancing and avoid all unnecessary contact, as during the current COVID-19 pandemic. This calls for a paradigm shift, with new protocols and software tools that enable patients to have their oral health monitored by a dental healthcare provider in a more accessible, user-friendly way that does not require a major effort on their part (such as visiting a clinic).
Recently, we presented iGAM [21], the first mHealth app for monitoring gingivitis using dental selfies. In a qualitative pilot study, we showed the potential acceptance of iGAM as a means of facilitating information flow between dentists and patients, which may be especially useful when face-to-face consultations are not possible. With iGAM, the patient’s gum health is monitored remotely. The app instructs the patient how to take and upload weekly gum photographs, and the dentist can monitor gum health (MGI score) without the need for a face-to-face meeting. The data are stored and transferred between the dentist and the patient by a data storage server (Figure 1).
The goal of this paper is to take the next step with the iGAM app and use the patients’ dental selfies to automatically classify their gum health status by predicting the MGI score.
1.2. Machine Learning Background
Automated machine learning (AutoML) has proven to be a very effective accelerator for repetitive tasks in machine learning pipelines [22], aiding data preprocessing and streamlining tasks such as model selection, hyperparameter optimization, and feature selection and elimination [23]. AutoML packages are enabling considerable advances in machine learning by shifting the researcher’s focus to the feature engineering aspect of the ML pipeline, rather than spending a large amount of time searching for the best algorithm or hyperparameters for a given dataset [24]. In particular, AutoML-H2O trains a variety of algorithms (e.g., GBMs, random forests, deep neural networks, GLMs), providing a diversity of candidate models that can then be stacked to produce more powerful final models. Although it uses random search for hyperparameter optimization, AutoML-H2O frequently outperforms other AutoML packages [25,26]. All machine learning training and testing in this paper was done using AutoML-H2O [27]. Our goal is to develop and evaluate a suite of features tailored to the unique characteristics of our cropped single-tooth images. Dental selfies taken by users vary widely due to differences between cameras (vendor and quality), lighting conditions, image perspective, and more. We aim to make our features robust against such variations. An advantage we can exploit is the significant underlying homogeneity of the content being depicted (teeth and gums). Our overall purpose is for the suite of features to correlate with the visual characteristics dentists use to establish the MGI score: redness, swelling, irregularity, and more.
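To make concrete the random search mentioned above, the following is a minimal, generic sketch in Python. It is not the AutoML-H2O implementation: the search space, the toy objective, and the names `random_search`, `space`, and `toy_objective` are hypothetical and stand in for a real cross-validated model score.

```python
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Sample n_trials random configurations and keep the best-scoring one.

    objective: callable mapping a dict of hyperparameters to a score
               (higher is better).
    space:     dict mapping each hyperparameter name to a list of
               candidate values.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw one value per hyperparameter, independently and uniformly.
        params = {name: rng.choice(values) for name, values in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space, for illustration only.
space = {
    "max_depth": [2, 4, 6, 8, 10],
    "learn_rate": [0.01, 0.05, 0.1, 0.3],
}


def toy_objective(params):
    # Stand-in for a validation score; it peaks at max_depth=6, learn_rate=0.1.
    return -abs(params["max_depth"] - 6) - 10 * abs(params["learn_rate"] - 0.1)


best_params, best_score = random_search(toy_objective, space, n_trials=200)
```

Unlike an exhaustive grid, random search evaluates only a fixed budget of configurations, which is why its cost does not grow with the number of hyperparameters being tuned.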
This paper presents a method that analyzes dental images, extracts the most relevant image features from the dental selfies (those that correspond to the MGI), and uses machine learning algorithms to classify the state of gum health.
The innovations of our proposed method fall into three main aspects: (1) an accurate method that predicts gum health status from a noninvasive selfie image alone; (2) a lightweight method that can be implemented on mobile devices; and (3) just 35 scalar features, tailored to the unique characteristics of dental selfies and robust against wide variation in cameras, lighting conditions, image perspective, and more.
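As an illustration of what a lighting-robust scalar feature might look like, the sketch below computes the mean red chromaticity R/(R+G+B) over a set of pixels. This is a standard normalization chosen here as an assumption for illustration; it is not claimed to be one of the 35 features, and the function name `mean_chromaticity` and the sample pixel values are hypothetical. The sketch uses only the Python standard library.

```python
def mean_chromaticity(pixels):
    """Return the mean (r, g, b) chromaticity of an iterable of RGB pixels.

    Each channel is divided by R + G + B, so multiplying every pixel by the
    same brightness factor leaves the result unchanged.
    """
    sums = [0.0, 0.0, 0.0]
    count = 0
    for red, green, blue in pixels:
        total = red + green + blue
        if total == 0:
            continue  # chromaticity is undefined for pure black pixels
        sums[0] += red / total
        sums[1] += green / total
        sums[2] += blue / total
        count += 1
    if count == 0:
        raise ValueError("no non-black pixels to analyze")
    return tuple(channel_sum / count for channel_sum in sums)


# Hypothetical gum-region pixels, and the same pixels at half brightness.
gum_pixels = [(200, 80, 90), (180, 70, 85)]
dimmed_pixels = [(r / 2, g / 2, b / 2) for r, g, b in gum_pixels]

redness = mean_chromaticity(gum_pixels)[0]
redness_dimmed = mean_chromaticity(dimmed_pixels)[0]
```

Here `redness` and `redness_dimmed` agree up to floating-point error, since halving the brightness cancels in the ratio. The normalization compensates only for a uniform change in brightness; color casts from white balance or camera-specific processing would need additional handling.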