**Preface to "Temporomandibular Joint Diseases: Diagnosis and Management"**

The temporomandibular joint (TMJ) is capable of remodeling even after growth has stopped, allowing it to make structural changes and adapt to different physiological demands. Temporomandibular disorders (TMD) are a group of degenerative disorders involving the components of the TMJ, which can lead to displacement of the disc, joint remodeling, and eventually osteoarthritis. Different methods of diagnosis and treatments of TMD have been described in the literature in the past years. This reprint was created to provide updated information regarding all methods of diagnosis of TMD, from clinical exams to immunohistologic and molecular diagnosis and novel treatments for this disease, ranging from non-invasive techniques, such as physical therapy, ultrasound, low-level laser therapy, and splints, to surgical treatments of TMJ.

> **Luis Eduardo Almeida** *Editor*

### *Article* **A Novel Quantitative Method for Tooth Grinding Surface Assessment Using 3D Scanning**

**Benedikt Sagl 1,\*, Ferida Besirevic-Bulic 2, Martina Schmid-Schwap 2, Brenda Laky 1, Klara Janjić 1, Eva Piehslinger 2 and Xiaohui Rausch-Fan 1**


**Abstract:** Sleep bruxism is an oral parafunction that involves involuntary tooth grinding and clenching. Splints with a colored layer that gets removed during tooth grinding are a common tool for the initial diagnosis of sleep bruxism. Currently, such splints are assessed either qualitatively or using 2D photographs, leading to a non-negligible error due to the 3D nature of the dentition. In this study we propose a new and fast method for the quantitative assessment of tooth grinding surfaces using 3D scanning and mesh processing. We assessed our diagnostic method by producing 18 standardized splints with 8 grinding surfaces each, giving us a total of 144 surfaces. Moreover, each splint was scanned and analyzed five times. The accuracy and repeatability of our method were assessed by computing the intraclass correlation coefficient (ICC) as well as reporting means and standard deviations of surface measurements for intra- and intersplint measurements. An ICC of 0.998 was computed as well as a maximum standard deviation of 0.63 mm<sup>2</sup> for repeated measures, suggesting an appropriate accuracy of our proposed method. Overall, this study proposes an innovative, fast and cost-effective method to support the initial diagnosis of sleep bruxism.

**Keywords:** sleep bruxism; digital dentistry; diagnostic bruxism splint

#### **1. Introduction**

Traditionally, bruxism is defined as an oral parafunction involving involuntary tooth grinding and clenching [1]. Moreover, a distinction is made between awake and sleep bruxism, which potentially have different origins and pathophysiology [2]. Bruxism is a possible risk factor for different pathologies and can lead to severe abrasion of teeth, tooth hypermobility, masticatory muscle pain, headache, periodontal tissue damage as well as temporomandibular joint (TMJ) pain. Most people will go through phases of tooth grinding or clenching during the course of their lifetime [3] with studies reporting approximately 5–13% of adults as frequent tooth grinders [4–6].

**Citation:** Sagl, B.; Besirevic-Bulic, F.; Schmid-Schwap, M.; Laky, B.; Janjić, K.; Piehslinger, E.; Rausch-Fan, X. A Novel Quantitative Method for Tooth Grinding Surface Assessment Using 3D Scanning. *Diagnostics* **2021**, *11*, 1483. https://doi.org/10.3390/diagnostics11081483

Academic Editor: Luis Eduardo Almeida

Received: 28 June 2021 Accepted: 11 August 2021 Published: 16 August 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Diagnosis of bruxism is a challenging task due to its involuntary nature. Initial assessment often relies on reports of tooth grinding sounds and symptoms such as flattened teeth, which already imply a rather late time of diagnosis. The American Academy of Sleep Medicine defined diagnostic criteria for sleep bruxism, which involve the occurrence of abnormal tooth wear, associated sounds and jaw discomfort [5]. A polysomnographic (PSG) investigation, including video, audio as well as a multitude of different respiratory, muscular and other parameters, is generally seen as the gold standard for a definitive diagnosis of sleep bruxism [7]. Since PSG is very expensive and time consuming for the patient, many studies have used electromyography (EMG) [8] devices to measure masticatory muscle activity during sleep, investigating rhythmic masticatory muscle activity (RMMA), which is a diagnostic sign of sleep bruxism [9,10]. Another approach using an instrumented splint to measure peaks in bite force has been proposed previously [11].

While EMG gives reliable information on RMMA occurrence and, as a consequence, helps with detecting bruxism [10], portable EMG devices are still rather expensive and most clinics do not own enough devices to easily use them for every potential patient. A previously proposed simple and cost-effective tool for bruxism diagnosis is a colored splint to monitor tooth contact during sleep. The first reports of this method go back to the 1970s [12,13]. The proposed splint consists of four colored layers comprising an overall thickness of 0.51 mm. During grinding of the teeth, one or more colored layers are ground off, depending on the grinding force, revealing information on occlusal contact areas. More recently, a semi-automatic method to analyze such splints has been published [14,15]. The method uses standardized pictures to measure the abraded area in a 2D projection but neglects the 3D nature of the tooth shape. Another comparable product was developed by a group at the Kanagawa Dental College [16]. While their splint only has a single colored layer, reducing the diagnostic information on bruxing force, it is very thin (0.1 mm thickness), which potentially limits the alteration of muscular activity during sleep caused by the splint [17]. To the best of our knowledge, analysis of this tool has also solely focused on quantitative assessments of occlusal grinding patterns in 2D projections (photographs) [18–21]. All analysis methods that rely on 2D projections introduce an error, which increases with the angle between the projection plane and the tooth facet. With the advance of digital dentistry and improvements in the quality as well as the accessibility of 3D scanning devices, a logical next step would be the digitization of occlusal splints and the detailed diagnostic analysis of the occlusal contacts using 3D mesh analysis approaches.
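The geometric source of this projection error can be made concrete. As a minimal sketch, assume a planar facet of true area A tilted by an angle θ against the projection plane and an idealized orthographic projection (a simplification of a real photograph, and our assumption rather than something stated in the cited studies). The projected area is then A·cos θ, so the relative underestimation is 1 − cos θ. The function names below are hypothetical.

```python
import math

def projected_area(true_area_mm2: float, tilt_deg: float) -> float:
    """Area of a planar facet as seen in an orthographic 2D projection."""
    return true_area_mm2 * math.cos(math.radians(tilt_deg))

def relative_underestimation(tilt_deg: float) -> float:
    """Fraction of the true area that is lost in the projection."""
    return 1.0 - math.cos(math.radians(tilt_deg))

# Print the loss for a few illustrative tilt angles.
for tilt in (0, 15, 30, 45, 60):
    loss = 100.0 * relative_underestimation(tilt)
    print(f"tilt {tilt:2d} deg: {loss:5.1f} % of the area is lost")
```

Under this idealization, a facet tilted by 60° appears at half its true area, which illustrates why steep facets are affected most.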

Consequently, the presented study proposes a novel method for the semi-automated 3D analysis of colored occlusal splints for the diagnostic investigation of tooth contacts in the context of bruxism. This method has the potential to gather more accurate information on nocturnal occlusal contacts in an easy and reliable fashion, helping clinicians to collect the information necessary for bruxism diagnosis, while only using equipment accessible in a dental practice.

#### **2. Materials and Methods**

To test and validate our diagnostic method, a model consisting of an idealized gingival arch with a total of 8 embedded icosahedrons was designed using the Autodesk® Meshmixer toolkit (Autodesk, San Rafael, CA, USA) (Figure 1).

**Figure 1.** Top view of the 3D model created in the Meshmixer toolkit.

To later test the performance of the presented method for different sizes of grinding surfaces, the geometrical bodies varied in size. The triangular surfaces of the icosahedrons' faces decreased from posterior to anterior, with respective triangle heights of 5 mm, 4 mm, 3 mm and 2 mm. The base model was produced with an additive manufacturing approach using a Formlabs® Form 2 printer (Formlabs, Somerville, MA, USA) and the Formlabs® Dental LT Clear V1 resin (Formlabs, Somerville, MA, USA). The model was used in combination with a pressure molding device (Biostar®, Scheu Dental, Iserlohn, Germany) to produce the splints themselves from a dedicated pressure molding foil with one red-colored side and a thickness of 0.1 mm (Bruxchecker®, Scheu Dental, Iserlohn, Germany). After production the splints are relatively translucent and normally turn opaque in the patient's mouth. To get the same effect in vitro, we submerged the finished splints in water with some added toothpaste for 6 h. After this step the splints showed surface opaqueness comparable to clinical splints.

To simulate tooth grinding, one triangle per icosahedron was prepared using a KaVo K4 handpiece (KaVo Dental, Biberach an der Riß, Germany) and the red layer was ground off to leave the respective surface transparent. Processed triangles varied between splints and were used to test the performance of our method for different surface angles. Scanning of the transparent surfaces led to rather severe 3D reconstruction artifacts; consequently, we spray-painted the inside of the splint using a colored (green) powder spray (Occlu®Spray Plus, Hager & Werken, Duisburg, Germany) (Figure 2).

**Figure 2.** Example splint after "grinding surface" preparation and powder spraying. S1 to S8 depict the respective grinding surfaces.

After preparation, splints were scanned using an optical 3D scanner (Primescan™ AC, Dentsply Sirona, Bensheim, Germany) and mesh files were exported as .ply files including vertex position as well as vertex color information. To check for intrascan accuracy of the method, each splint was scanned 5 times. Meshes were imported into the Meshmixer software toolkit (version 3.4) and the "grinding surfaces" were segmented using a semiautomatic method. For this purpose, an initial vertex inside the grinding surface was selected and the selection was expanded using a similarity measure of vertex color for the abraded areas. The abraded areas were green and the rest of the splint remained red. The surface area of each grinding facet was recorded for 18 splints for 5 repeated measurements, giving 90 scans and 720 grinding surfaces. A detailed description of our software workflow can be found in Appendix A.
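The seed-and-expand selection described above can be illustrated with a generic region-growing sketch. This is not Meshmixer's implementation, whose internals are not described here. It assumes a Euclidean distance in RGB space, a comparison against the seed color and an arbitrary threshold, and every name and the toy data are hypothetical.

```python
from collections import deque

def color_distance(c1, c2):
    """Euclidean distance between two RGB triples (0-255 per channel)."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2)) ** 0.5

def grow_region(seed, colors, triangles, threshold=100.0):
    """Breadth-first region growing: collect all vertices connected to `seed`
    whose color lies within `threshold` of the seed color."""
    # Build vertex adjacency from the triangle connectivity.
    neighbors = {i: set() for i in range(len(colors))}
    for a, b, c in triangles:
        neighbors[a].update((b, c))
        neighbors[b].update((a, c))
        neighbors[c].update((a, b))
    selected = {seed}
    queue = deque([seed])
    while queue:
        v = queue.popleft()
        for n in neighbors[v]:
            similar = color_distance(colors[n], colors[seed]) <= threshold
            if n not in selected and similar:
                selected.add(n)
                queue.append(n)
    return selected

# Toy mesh (hypothetical data): four greenish vertices followed by two red ones.
colors = [(20, 200, 30), (25, 190, 35), (30, 210, 20), (15, 205, 40),
          (220, 20, 25), (210, 30, 30)]
triangles = [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5)]
facet = grow_region(seed=0, colors=colors, triangles=triangles)
print(sorted(facet))
```

In this toy case the selection stops at the green/red boundary, so only the four greenish vertices are returned.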

The intraclass correlation coefficient (ICC) over the 5 repeated measures was evaluated and an analysis of variance (ANOVA) of repeated scans of the same physical splint was performed. To better describe the grinding surfaces, we moreover reported the maximum, minimum, mean and relative standard deviations over the repeated measures for each grinding surface. Additionally, to showcase the differences in results computed using a 2D projection approach with respect to the proposed 3D method, all 18 splints were photographed using a standardized set-up and grinding areas were segmented in 2D using ImageJ (https://imagej.nih.gov/ij, [22]). We reported mean surface areas and standard deviations for each grinding surface for both measurement methods and compared 2D photographs to 3D scans using an independent-samples *t*-test. Statistical assessment was performed using IBM SPSS Statistics 26® (IBM, Armonk, NY, USA).
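For readers unfamiliar with the ICC, the following sketch shows how a single-measures, absolute-agreement ICC can be computed from two-way ANOVA mean squares, using the form usually labelled ICC(A,1). It does not reproduce the SPSS analysis used in this study. The function name and the example values are hypothetical and are not study data.

```python
def icc_a1(data):
    """Single-measures, absolute-agreement ICC from a two-way layout.
    `data` holds one row per grinding surface and one column per repeated scan."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-surface mean square
    msc = ss_cols / (k - 1)               # between-scan mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical areas in mm^2: four surfaces, three repeated scans each.
areas = [[16.9, 17.1, 16.8],
         [9.2, 9.0, 9.3],
         [5.1, 5.2, 5.0],
         [2.8, 2.7, 2.9]]
print(f"ICC(A,1) = {icc_a1(areas):.4f}")
```

When the spread between surfaces is large compared with the scatter between repeated scans, as in this toy table, the coefficient approaches 1.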

#### **3. Results**

The proposed workflow allowed for the successful completion of all necessary substeps. Using the colored powder spray enabled easy and fast scanning, without any artifacts caused by the transparent grinding areas on the splint (Figure 3). Moreover, the clear difference in color between the red splint and the green grinding surfaces allowed for easy segmentation of the grinding surfaces (Figure 4). To assess this statement, the repeatability and accuracy of the scanning procedure were tested as follows.

**Figure 3.** Example scans of two out of the 18 different splints; different combinations of triangles were prepared for each splint to investigate different surface angles.

**Figure 4.** Example of a scanned splint after successful segmentation.

The ICC score of 0.998 (95% confidence interval, CI 0.997–0.998; *p* < 0.001), for single measures using a two-way mixed effects model assessing absolute agreement, suggests a high repeatability and reliability of our proposed method. No significant differences between repeated scans and segmentations were detected, suggesting an appropriate repeatability of our approach (F = 1.112; *p* = 0.350).

Table 1 reports the mean surface area and standard deviation for each grinding surface across all 18 splints using the 2D and 3D methods. For 2D measurements only a single measurement was completed, while we report the mean over the five repetitions for our new method. Generally speaking, higher standard deviations can be seen for the 2D measurements. Moreover, the independent-samples *t*-test showed statistically significant differences for all grinding surfaces between surface areas measured in 2D and 3D. Figure 5 shows the results of the 2D measurements for an example splint (Splint 2) and depicts clear differences in measured grinding area for similarly sized icosahedrons.

**Table 1.** Mean surface area and standard deviation for all 8 grinding surfaces over the 18 prepared splints measured from 2D photographs and 3D scans. *p*-values are reported for independent-samples *t*-test for differences between the measurement methods.


**Figure 5.** Example of a 2D measurement (Splint 2 shown). Surface area for each grinding surface is reported in mm<sup>2</sup>. While measurements on the same size icosahedrons should be relatively close, stark differences can be seen, e.g., between S1 and S8.

The accuracy of the presented method was assessed by reporting the maximum, minimum and mean standard deviations between the five repeated scans of the same splint reported in absolute mm<sup>2</sup> and relative to the mean size of the grinding area (%; Table 2). The highest maximum standard deviation was 0.63 mm<sup>2</sup>. Generally, a trend for larger absolute variation was found for the measurements of larger grinding surfaces. Taking the size of the grinding surface into account, the largest relative variation was found to be 10.36%. Generally, the relative standard deviation was larger for the smaller grinding surfaces.


**Table 2.** Standard deviation of repeated measures for each surface over 18 prepared splints.

#### **4. Discussion**

The presented study established and reports a novel method for the semi-automatic, quantitative, 3D assessment of grinding surfaces on a colored occlusal splint; a task that, to the best of the authors' knowledge, has not been accomplished in the previous literature so far. Our measurements suggest a high repeatability and accuracy of the presented method. Overall, the proposed workflow could be a valuable tool for future investigations regarding occlusal variables and has the potential to increase the understanding of various functional, parafunctional and dysfunctional tasks of the masticatory system.

Generally speaking, occlusal splints are a cheap, non-invasive and easy-to-use method to assess the grinding pattern of a patient [14,17]. Consequently, they are a great tool for the initial assessment in bruxism diagnosis [20]. Currently these splints are mostly qualitatively assessed by defining the involved regions of the occlusal grinding patterns (e.g., "canine guided", "premolar and/or molar involved") [16], which limits their diagnostic value. Some quantitative methods have been proposed, but they all use 2D photographs of the splints [14,15]. Those methods so far cannot calculate the grinding area precisely, since the 3D nature of human teeth induces a non-negligible error caused by the 2D projection of a photograph. This error increases with the angle between the 2D projection plane and the grinding facet plane. When models are photographed from above, the largest error can be seen on steep tooth surfaces, e.g., on the canines. By using an optical 3D scanner, we solved this problem and computed accurate 3D shapes.

One major problem during initial testing of the presented method was the detection of the grinding areas during 3D scanning. The patient (or, in our case, the polishing device) grinds off the colored layer on the splint, leaving translucent grinding areas. While these areas are easy to register visually, the translucency of the foil makes them very hard for an optical scanner to detect, which leads to non-repeatable and noisy results, where the scanner sometimes detects the splint and sometimes scans the dental model below the splint. This problem often induces sharp edges and switching of the surface between the level of the cast and the splint, which leads to an overestimation of the grinding surfaces and a generally cumbersome scanning process. We solved this problem by using a colored powder spray with a different color with respect to the splint color. We chose a green spray because it gave good contrast to the red color of the splint and since red and green are well separated in RGB (red, green, blue) color space, we expected this color decision to further improve the segmentation process. This simple and cost-effective solution enabled us to drastically increase the scan quality, while simultaneously reducing scanning time.

To assess the repeatability of our results we scanned each splint five times, segmented the grinding surfaces and compared the differences between the repeated scans. The high ICC of 0.998 detected with the presented method suggests an excellent repeatability. Moreover, this finding was confirmed by detecting no significant differences between the repeated measurements, giving us confidence in the results computed with the proposed method.

In general, the same grinding surface on different splints should be relatively comparable in surface area (e.g., Splint1 S1 and Splint2 S1) with only minor differences caused by the manual grinding process. Moreover, since the triangles on the left and right sides have the same size in our model, differences between the respective surfaces (e.g., S1 and S8) should be minimal. This was indeed true and we could only detect statistically significant differences between the grinding areas of surfaces on differently sized icosahedrons.

To further evaluate the accuracy of the measurement method, we investigated the standard deviation of the grinding surfaces between the repeated measurements and compared it to the standard deviation between the grinding surfaces on different splints. Standard deviations were larger between models, compared to repeated measures of the same splint. The largest standard deviation for the repeated measures was 0.63 mm<sup>2</sup> for surface 8. Relative to the mean grinding area of the surface, the computed standard deviation for surface 8 is equal to 3.73%. As expected, the largest relative difference was found for the smallest grinding surfaces, with 10.36% for surface 4, which corresponds to an absolute standard deviation of 0.29 mm<sup>2</sup>. These maximum values represent the worst case, and when looking at the mean standard deviations for each surface we see values of approximately half of these maximums. We think that these relatively small differences suggest an appropriate accuracy for clinically relevant differences in grinding areas.
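The relation between absolute and relative standard deviation used above is simply SD divided by the mean. The sketch below shows that calculation and back-calculates the mean areas implied by the two worst-case values. It reads the 0.29 mm² figure as an absolute standard deviation (our interpretation) and assumes the sample standard deviation. The implied means are our arithmetic, not values reported by the study, and the function names and the repeated-scan values are hypothetical.

```python
import statistics

def relative_sd_percent(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

def implied_mean(sd_mm2, relative_sd_pct):
    """Mean area implied by an absolute SD and its relative SD in percent."""
    return sd_mm2 / (relative_sd_pct / 100.0)

# Hypothetical areas (mm^2) from five repeated scans of one surface.
repeated_scans = [5.10, 5.25, 4.95, 5.05, 5.15]
print(f"relative SD: {relative_sd_percent(repeated_scans):.2f} %")

# Back-calculation from the worst-case values quoted in the text.
print(round(implied_mean(0.63, 3.73), 2))   # surface 8
print(round(implied_mean(0.29, 10.36), 2))  # surface 4
```

The back-calculation gives roughly 16.9 mm² for surface 8 and roughly 2.8 mm² for surface 4, consistent with the observation that the same absolute scatter weighs more heavily on small facets.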

Additionally, we showcased our novel measurement method by comparing it to the currently used method of assessing surface areas on 2D photographs. Larger standard deviations for the surfaces were found when using the 2D method. As described above, this is due to the fact that, in addition to the standard deviation caused by the actual differences from manual preparation of the grinding areas, an additional variability is included by using grinding facets with different angles with respect to the imaging plane. This can clearly be seen in Figure 5 when comparing S1 and S8. These surfaces are roughly the same size, apart from small variances caused by the manual grinding, but due to the projection error S8 is substantially smaller than S1 when using photographs. By using the presented method this error is drastically reduced. On the other hand, the projection error for S1 is relatively small since the surface is well aligned with the projection plane of the photograph. Consequently, our data show that if a grinding surface with a large angle to the imaging plane is chosen, the surface area was underestimated drastically. As expected, significant differences in grinding area were detected between the two measurement methods (photographs vs. 3D scanning) for all grinding surfaces.

While our study computed convincing results, some limitations remain. Firstly, the occlusal splint used in our study can only assess the direction of the grinding movement, the area and number of occlusal grinding surfaces, but it cannot define the magnitude, frequency and duration of the applied grinding force, which are relevant parameters related to the pathogenesis of TMD [23]. Other splint designs have been proposed that use multiple layers of colored material, providing some information on grinding force magnitude [14], but some authors have reservations regarding the thickness of these multilayer splints [24]. It was suggested that the thicker splints act in the same way as an actual therapeutic splint and reduce muscle activity, which would make them infeasible as a diagnostic tool. Nevertheless, colored splints have proven to be a valuable first diagnostic tool in bruxism diagnosis [16,17,19,20] and we are confident that our method is transferable to other splint designs. Secondly, we did not compare our optical scans to a different physical measurement of the grinding surface. Optical scanning has been shown to be a valuable and accurate tool in digital dentistry [25–27] and is used for various dental applications [28,29]. More specifically, the trueness and precision of the 3D scanner used in this study have been assessed for complete arch scans by multiple previous studies [30–32]. Schmidt et al. found a mean deviation of 33.8 ± 31.5 µm. Moreover, Dutton et al. assessed the performance of the Primescan over multiple materials and found a trueness of 17 µm and a precision of 25 µm. Lastly, Ender et al. report a trueness of 33.9 ± 7.8 µm and a precision of 31.3 ± 10.3 µm. Consequently, we do not think that the correctness of the overall geometry needs to be validated separately for our specific study.

Future studies could, for example, focus on the assessment of a potential correlation between occlusal grinding areas in 3D and muscle activity measured by EMG, in order to include additional information on the frequency and magnitude of the grinding events. This could provide important clues to predict diseases of traumatic occlusion and TMJ disorders.

#### **5. Conclusions**

In conclusion, this study proposes an innovative, fast and cost-effective method to support the initial diagnosis of sleep bruxism. Moreover, due to the 3D nature of the presented method, it facilitates the fast and easy quantitative assessment of the surface area of the respective grinding facets. The study results suggest a high accuracy as well as repeatability of the proposed method, which will allow for better quantitative assessment and comparison of the grinding areas in future clinical studies. This will potentially help in gathering knowledge and developing better screening and treatment methods for patients in the early stages of sleep bruxism.

**Author Contributions:** Conceptualization, B.S., F.B.-B., M.S.-S., E.P. and X.R.-F.; methodology, B.S., F.B.-B., M.S.-S., K.J. and B.L.; software, B.S., B.L.; validation, B.S. and B.L.; formal analysis, B.S and B.L.; investigation, B.S. and K.J.; resources, M.S.-S., E.P. and X.R.-F.; data curation, B.S.; writing—original draft preparation, B.S.; writing—review and editing, F.B.-B., M.S.-S., B.L., K.J., E.P. and X.R.-F.; visualization, B.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author.

**Acknowledgments:** The authors want to thank Dominique Flechl for her support during splint creation and Denisa Hani for her support during 3D scanning of the splints.

**Conflicts of Interest:** The authors declare no conflict of interest. The Bruxchecker® foils were sponsored by Scheu Dental. The company had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **Appendix A. Software Workflow**

This appendix will briefly describe the software settings and procedures used during the various digital processing and production steps.

Model design: The model is based on the idealized morphology of a gingival arch. We idealized the arch using manual smoothing in the Autodesk® Meshmixer toolkit. Afterwards we selected icosahedrons under Meshmix → Primitives and pulled them onto the appropriate positions on the arch. The offset was kept at 0 and the dimension was used to change the size as described in the Methods section. The icosahedrons were added using the Boolean Union composition mode.

Model 3D Printing: The model was printed in a layer density of 0.05 mm with a total of 413 layers. It was printed within approximately 3 h, consuming 18.48 mL of resin. The object was printed on full rafts, including a raft label and internal supports. Rafts had a touchpoint size of 0.60 mm and a density value of 1. Automated advanced settings included a flat spacing of 5 mm, a slope multiplier value of 1, 5 mm height above raft, 2 mm raft thickness, a Z-compression correction of 0.75 mm and an early layer merge of 0.30 mm. After printing, the object was washed in isopropanol (IPA) for 15 min. Washing was repeated in fresh IPA for another 5 min, followed by drying overnight. The next day, the printed model was cured in a Formlabs® Form Cure for 20 min at 80 °C, following the recommendations of the manufacturer. After curing, all support structures were removed manually.

Splint 3D Scanning: The colored splints were scanned with the scanner's dedicated software using the standard parameters. Afterwards the saved .ply files were collected and exported for post-processing.

Splint 3D assessment: .ply files of the scanned splints were opened using the Autodesk® Meshmixer toolkit and the function under Select → Filters → Vertex Color Similarity was used. Back face selection was not enabled and no crease angle threshold was used. Each grinding facet was segmented and separated into its own component (Edit → Separate). Afterwards, each grinding facet was selected and the surface area was computed using Analysis → Stability. Values were computed and collected for all grinding areas of all splints and used for statistical analysis.
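The area of a segmented facet can also be cross-checked outside of Meshmixer, because the surface area of a triangle mesh is the sum of its triangle areas, each being half the norm of the cross product of two edge vectors. The sketch below is our own illustration of that standard formula, not part of the published workflow, and the function names and the example patch are hypothetical.

```python
def triangle_area(p0, p1, p2):
    """Area of a 3D triangle: half the norm of the cross product of two edges."""
    ux, uy, uz = (p1[i] - p0[i] for i in range(3))
    vx, vy, vz = (p2[i] - p0[i] for i in range(3))
    cx = uy * vz - uz * vy
    cy = uz * vx - ux * vz
    cz = ux * vy - uy * vx
    return 0.5 * (cx * cx + cy * cy + cz * cz) ** 0.5

def mesh_area(vertices, triangles):
    """Total surface area of a triangle mesh given as vertex and index lists."""
    return sum(triangle_area(vertices[a], vertices[b], vertices[c])
               for a, b, c in triangles)

# Hypothetical patch: a 1 mm x sqrt(2) mm rectangle tilted 45 degrees
# about the x-axis, split into two triangles.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.0, 1.0, 1.0)]
triangles = [(0, 1, 2), (0, 2, 3)]
print(f"3D area: {mesh_area(vertices, triangles):.4f} mm^2")
```

The tilted patch has a 3D area of √2 mm², whereas its footprint in the xy-plane is only the 1 mm² unit square, which is the kind of discrepancy a top-view photograph would introduce.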

#### **References**


### *Article* **Validity and Reliability of the Helkimo Clinical Dysfunction Index for the Diagnosis of Temporomandibular Disorders**

**Roger Alonso-Royo 1, Carmen María Sánchez-Torrelo 1, Alfonso Javier Ibáñez-Vera 2,\*, Noelia Zagalaz-Anula 2, Yolanda Castellote-Caballero 2, Esteban Obrero-Gaitán 2, Daniel Rodríguez-Almagro 2 and Rafael Lomas-Vega 2**


**Abstract:** The Helkimo Clinical Dysfunction Index (HCDI) is a simple and quick test used to evaluate subjects affected by temporomandibular disorders (TMDs), but its psychometric properties have not been tested. The test evaluates movement, joint function, pain and musculature, providing a quick general overview that could be very useful at different levels of care. For this reason, the aim of this study was to validate the use of the HCDI in a sample of patients with TMD. Methods: The sample consisted of 107 subjects, 60 TMD patients and 47 healthy controls. The study evaluated concurrent validity, inter-rater concordance and predictive values. Results: The HCDI showed moderate to substantial inter-rater concordance among the items and excellent concordance for the total scores. The correlation with other TMD assessment tests was high, the correlation with dizziness was moderate and the correlation with neck pain, headache and overall quality of life was poor. The prediction of TMD showed a sensitivity of 86.67%, a specificity of 68.09% and an area under the curve (AUC) of 0.841. Conclusions: The HCDI is a valid and reliable assessment instrument; its clinimetric properties are adequate, and it has a good ability to discriminate between TMD-affected and TMD-unaffected subjects.

**Keywords:** temporomandibular disorder; validity and reliability; questionnaires and survey validity study

#### **1. Introduction**

Temporomandibular joint disorders (TMDs) are a very prevalent condition that, according to some authors, is present in 27.4% of adolescents [1] and 25% of adults [2]. Costs in European public hospitals due to erroneous diagnosis of TMD range from a minimum of €52 to a maximum of €425, with a mean of €146, according to the amounts received from mutual insurance companies and insurers [3]. The analysis of the aetiology of TMDs has focused on several factors such as inflammatory diseases [4], fractures and trauma [5,6], as well as biomedical models related to temporomandibular joints, muscles of mastication and occlusal factors [7]. The management of TMDs includes clinical examination [8] and the use of imaging techniques both for diagnosis and for monitoring the efficacy of treatments [9,10], which classically included the use of botulinum toxin [11], occlusal splint therapy [12] and polyphenols as potential therapeutic agents [13]. TMDs are related to headache, neck pain, shoulder pain, insomnia, vertigo, ocular pain and hearing loss [14], and 91% of TMD patients reported pain, 61.2% joint clicks or crepitation and 53.3% limited range of movement of the temporomandibular joint [15].

**Citation:** Alonso-Royo, R.; Sánchez-Torrelo, C.M.; Ibáñez-Vera, A.J.; Zagalaz-Anula, N.; Castellote-Caballero, Y.; Obrero-Gaitán, E.; Rodríguez-Almagro, D.; Lomas-Vega, R. Validity and Reliability of the Helkimo Clinical Dysfunction Index for the Diagnosis of Temporomandibular Disorders. *Diagnostics* **2021**, *11*, 472. https://doi.org/10.3390/diagnostics11030472

Academic Editor: Luis Eduardo Almeida

Received: 24 February 2021 Accepted: 4 March 2021 Published: 8 March 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Due to the wide list of related symptoms, the diagnostic criteria for temporomandibular disorders (DC/TMD) were designed for the performance of an exhaustive assessment of each patient [16]; for this reason, a considerable amount of time is needed for adequate evaluation with these internationally accepted criteria, which are considered the gold-standard reference test for the diagnosis of temporomandibular disorders. The test examines 12 dimensions that evaluate mandibular movement, type of bite, pain on movement, pain on touch of the musculature, alterations in mandibular movement and headache [16].

Given the cost of misdiagnosis and the time necessary to perform the reference test for TMD diagnosis, it would be beneficial to find a simpler and quicker tool to use as a diagnostic method for TMD in primary care. The Helkimo Clinical Dysfunction Index (HCDI) has been widely used for the clinical diagnosis of TMDs [17–19]. It is a simple and quick test that assesses limitations of mandibular movement, pain and joint function. However, the studies that analysed the reliability [20,21] and validity of this tool are old, used a very small sample, applied incorrect statistical techniques and were limited to the analysis of a single clinimetric property [22,23].

Therefore, a thorough analysis of the main properties of the HCDI is necessary, using the DC/TMD protocol as a reference. For this reason, the aim of the study was to assess and test the psychometric properties of the HCDI in patients with TMD.

#### **2. Materials and Methods**

#### *2.1. Participants*

To meet the objectives of this work, a cross-sectional validation study was designed. The protocol of this study received the approval of the Research Ethics Committee of Jaén, Spain (date of approval: 27 April 2020; internal code ABR.20/2.TFM). This study was conducted in accordance with the Declaration of Helsinki, good clinical practice guidelines and all applicable laws and regulations, and written informed consent was obtained from all subjects to participate in the study.

The sample size calculation was carried out using the recruitment of at least 10 subjects per item of the scale as a criterion, with a minimum of 80 subjects for validity studies and 20 for reliability [24]. This study was conducted between May and August 2020. The sample was selected from the patients of the Dental Medical Center Doctores López Collantes (Dos Hermanas, Sevilla, Spain), which provides stomatology services, and from those at the FisioMedic Clinic (Dos Hermanas, Sevilla, Spain), which provides physiotherapy, general medicine and traumatology services. Recruitment was performed by telephone contact and personal interviews.

#### *2.2. Measurements*

Once the patients were selected, demographic data were recorded: age, sex, height, weight, body mass index (BMI), educational level, work situation, smoking status, alcoholic habits and physical activity [25].

The diagnostic validity of the HCDI was measured against the DC/TMD protocol, which is the gold-standard diagnostic test for TMD. The DC/TMD protocol is composed of 12 items that assess muscle and joint pain, pain during jaw movement, headache, bite, noise, obstacles or blockages during jaw movement and discomfort on palpation of the muscles of the temporomandibular joint. Finally, a diagnostic tree is used to specify a diagnostic result. The DC/TMD protocol has a sensitivity of 86%, a specificity of 98% and an inter-examiner reliability of 85% [16].

The main measure was the HCDI. The instrument comprises five items, with each assessment having three possible answers, scored as 0, 1 or 5. The first item (A) is related to the limitation in the range of jaw movement and is subdivided into four sections: the maximum opening of the mouth and the protrusion and lateral shift to both sides. In the opening of the mouth, a value of more than 40 mm scores 0 points, a value between 30 and 39 mm scores 1 point and opening less than 30 mm scores 5 points; protrusion and lateral mouth shifts score 0 if the measurement is 7 mm or more, 1 point if the range of motion is between 4 and 6 mm and 5 points if the range is less than 4 mm. These subsections of item A are added together to obtain a subtotal that scores 0 if the sum of the four sections is 0, 1 point if the subtotal is between 1 and 4 points and 5 points if the subtotal is greater than 4 points. The second item (B) evaluates the alterations of joint function that produce deviations, sounds and/or joint locks or blockages; the third item (C) evaluates the presence of pain when performing some movements; the fourth item (D) evaluates muscular pain in the masticatory muscles; and the fifth item (E) evaluates the presence of discomfort or pain in the prearticular area of the temporomandibular joint (TMJ) through palpation. From the sum of the 5 items, we identify no TMJ involvement if the score is 0, mild TMJ involvement when the score ranges from 1 to 9, moderate TMJ involvement if the score ranges between 10 and 19 and severe TMJ involvement for a score between 20 and 25. Previous studies have shown that the HCDI is able to detect TMD-affected subjects with rheumatoid arthritis, with a statistically significant difference between affected and unaffected subjects [26–28].
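The scoring rules for the HCDI can be sketched in a few lines of Python. This is a minimal illustration of the thresholds described in the text, not the authors' software; the function names are hypothetical, and two boundary choices are assumptions: an opening of exactly 40 mm is scored 0, and non-integer measurements are assigned to the band whose lower bound they reach.

```python
def score_opening(mm):
    """Score maximum mouth opening (item A, first section)."""
    if mm >= 40:  # assumption: exactly 40 mm counts as unrestricted
        return 0
    if mm >= 30:
        return 1
    return 5


def score_excursion(mm):
    """Score protrusion or a lateral shift (item A, remaining sections)."""
    if mm >= 7:
        return 0
    if mm >= 4:
        return 1
    return 5


def score_item_a(opening, protrusion, lateral_right, lateral_left):
    """Collapse the four section scores of item A into a single 0/1/5 score."""
    subtotal = (
        score_opening(opening)
        + score_excursion(protrusion)
        + score_excursion(lateral_right)
        + score_excursion(lateral_left)
    )
    if subtotal == 0:
        return 0
    if subtotal <= 4:
        return 1
    return 5


def hcdi_severity(item_scores):
    """Return (total, severity label) for the five item scores A to E."""
    if len(item_scores) != 5 or any(s not in (0, 1, 5) for s in item_scores):
        raise ValueError("expected five item scores, each 0, 1 or 5")
    total = sum(item_scores)
    if total == 0:
        label = "none"
    elif total <= 9:
        label = "mild"
    elif total <= 19:
        label = "moderate"
    else:
        label = "severe"
    return total, label
```

For example, a subject with an opening of 35 mm and normal excursions scores 1 on item A; combined with scores of 1, 5, 1 and 1 on items B to E, `hcdi_severity` would classify the total of 9 as mild involvement.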

Concurrent validity was also measured with Fonseca's anamnestic index (FAI), which is made up of 10 questions that can be answered with yes, no or sometimes, and these answers are scored 10, 0 or 5, respectively. This questionnaire classifies patients according to the affectation, with a total score between 0 and 100. The test categorises temporomandibular disorder as not affected when the score is between 0 and 15 points, mild affectation when the score is between 20 and 40 points, moderate affectation when the score is between 45 and 65 points and severe affectation when the score is between 70 and 100 points. The FAI has a Cronbach alpha of 0.826, an intraclass correlation coefficient of 0.937, a cut-off point of >35 points, a sensitivity of 83.33% and a specificity of 77.97% [29,30]. Similarly, the short version of Fonseca's anamnestic index (SFAI) was also considered; it is a five-question questionnaire that is answered and scored the same as the standard version of the FAI, and the questionnaire categorises patients as unaffected by TMD when the score is between 0 and 15 points and as affected by TMD when the score is between 20 and 50 points. The SFAI has a sensitivity of 86% and a specificity of 95.5% based on a cut-off point of >17.5 [31].
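As an illustration of how the FAI and SFAI categories follow from the answers, the sketch below sums the answer points and applies the bands given above. The names are hypothetical. Because every answer is worth 0, 5 or 10 points, totals are always multiples of 5, so the apparent gaps between bands (for example between 15 and 20) can never occur.

```python
ANSWER_POINTS = {"yes": 10, "sometimes": 5, "no": 0}


def questionnaire_total(answers):
    """Sum the points for a list of 'yes' / 'sometimes' / 'no' answers."""
    return sum(ANSWER_POINTS[a] for a in answers)


def classify_fai(answers):
    """Classify the 10-question FAI into the four bands described in the text."""
    if len(answers) != 10:
        raise ValueError("the FAI has 10 questions")
    total = questionnaire_total(answers)
    if total <= 15:
        label = "not affected"
    elif total <= 40:
        label = "mild"
    elif total <= 65:
        label = "moderate"
    else:
        label = "severe"
    return total, label


def classify_sfai(answers):
    """Classify the 5-question SFAI as unaffected or affected."""
    if len(answers) != 5:
        raise ValueError("the SFAI has 5 questions")
    total = questionnaire_total(answers)
    return total, ("unaffected" if total <= 15 else "affected")
```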

Pain perception was evaluated by the Numerical Pain-Rating Scale (NPRS) test. The subjects indicate their perceived pain with a number between 0 (no pain) and 10 (the worst pain possible). This tool was used to quantify pain in both the neck and the temporomandibular joint and is the pain assessment test preferred by Spanish-speaking patients. The test has a strong correlation with the Visual Analogue Scale (VAS) and the Four-category Verbal Rating Scale (VRS-4) instruments; the Kaiser–Meyer–Olkin (KMO) value is 0.85, with a Bartlett sphericity of <0.01, a factor loading of 0.95 and a lack of implementation percentage of <0.01% [32].

To evaluate the possibility of associated neck disability, the Neck Disability Index (NDI) test was used; it is a 10-question survey, with answers being reported as a number between 0 and 5. For each question, a score of 0 refers to the total absence of disability, while a score of 5 refers to total disability. Accordingly, a total score between 0 and 5 indicates absence of disability, 5–14 points indicate low disability, 15–24 points indicate moderate disability and 35–50 points indicate great disability. Cronbach's alpha is 0.89, and the intraclass coefficient is 0.98, with a Pearson's correlation coefficient with the visual analogue pain scale of r = 0.65 and with the Northwick Park neck pain questionnaire of r = 0.89 [33].

The presence of vertigo and balance problems was assessed by the Dizziness Handicap Inventory (DHI). This questionnaire is composed of 25 questions that can be answered with yes, no or sometimes, scoring 4, 0 and 2 points, respectively. This questionnaire assesses physical, emotional and functional dimensions, each of which has an independent score in addition to the total score. There is a high correlation between each of the dimensions and the total score (*p* < 0.01); factorial analysis shows a structure formed by three components, and there is a strong correlation with the Dizziness Characteristics and Impact on Quality of Life (UCLA-DQ) (>0.75) [34–36].

Headache-associated symptoms were measured with the Headache Impact Test (HIT-6), which is an evaluation questionnaire consisting of six questions that can be answered with usual, almost always, sometimes, rarely and never, with a total score between 36 and 78 points. The correlation between the HIT-6 in different languages is high, it has high reliability, and its items are comparable [37].

Finally, the quality of life was assessed using the 12-item Short-Form Health Survey (SF-12). This questionnaire is the short version of the SF-36 and retains its self-administered form. It results in a Mental Component Summary score (MCS-12) and a Physical Component Summary score (PCS-12), differentiating between the two components of the quality of life. The weights of the Spanish version of the SF-12 are similar to those of the original American version, with a correlation of >0.9. The questionnaire explains 91% of the variance of the SF-36 in the sum of the components, and the coefficient of internal consistency is 0.9 for the SF-36 and slightly lower for the SF-12 [38].

#### *2.3. Statistical Analysis*

Descriptive analysis was performed by calculating means and standard deviations for continuous variables and frequencies and percentages for categorical variables. The Kolmogorov–Smirnov test was used to verify the normal distribution of the continuous variables, and the Levene test was used to test the homoscedasticity of the samples. The confidence level was set at 95% (*p* < 0.05).

To test the agreement between the two raters for the total HCDI score, the intraclass correlation coefficient (ICC) of Shrout and Fleiss was used in a one-way random effects model of the absolute agreement type; it estimates the reliability of single ratings [39]. Reliability was considered poor when the ICC was <0.40, moderate when the ICC was between 0.40 and 0.75, substantial when the ICC was between 0.75 and 0.90 and excellent when the ICC was >0.90. From the ICC, the standard error of measurement (SEM) and the minimum detectable change (MDC) were calculated. The SEM was calculated as the baseline standard deviation (SD) (σbase) multiplied by the square root of (1 − Rxx), where Rxx is the ICC: SEM = σbase × √(1 − Rxx). The MDC was quantified at the 95% confidence level (MDC95) from the SEM formula as follows: MDC95 = 1.96 × σbase × √(1 − ICC), where 1.96 is the z-value corresponding to the 95% confidence interval. The MDC provides a good tool for translating the ICC into units of change in the instrument. To measure agreement between the two raters for the individual items, a Kappa coefficient with quadratic weights was used [40]. The agreement was considered null if Kappa was <0.00, slight if Kappa was between 0.00–0.20, fair if Kappa was between 0.21–0.40, moderate if Kappa was between 0.41–0.60, substantial if Kappa was between 0.61–0.80 and almost perfect if Kappa was between 0.81–1.00 [41]. In addition, Bland–Altman charts were generated to evaluate the limits of agreement [42].
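The reliability quantities in this paragraph can be illustrated with a short, dependency-free Python sketch. It assumes the SEM is σbase × √(1 − ICC) and follows the MDC95 formula as written in the text; many references additionally multiply the MDC by √2, which is not done here. The quadratic weights are applied to the rank of each category (for example 0, 1, 5 mapped to positions 0, 1, 2), which is an assumption about how the items were coded. All names are hypothetical.

```python
import math


def sem(sd_base, icc):
    """Standard error of measurement: sd_base * sqrt(1 - ICC)."""
    return sd_base * math.sqrt(1.0 - icc)


def mdc95(sd_base, icc):
    """Minimum detectable change at 95% confidence, as written in the text."""
    return 1.96 * sem(sd_base, icc)


def quadratic_weighted_kappa(rater1, rater2, categories):
    """Cohen's kappa with quadratic disagreement weights on category rank."""
    index = {c: i for i, c in enumerate(categories)}
    k = len(categories)
    n = len(rater1)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater1, rater2):
        observed[index[a]][index[b]] += 1
    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = ((i - j) ** 2) / ((k - 1) ** 2)
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j] / n
    if weighted_expected == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - weighted_observed / weighted_expected
```

With a hypothetical baseline SD of 10 points and an ICC of 0.75, `sem` gives 5 points and `mdc95` gives 9.8 points, meaning that a change smaller than about 10 points could not be distinguished from measurement error under these assumptions.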

To analyse the concurrent validity of the HCDI with the FAI, NPRS, NDI, DHI, HIT-6 and SF-12, Pearson's correlation coefficient r was used. The correlation coefficient was considered strong if it was >0.50 and moderate if it was between 0.30 and 0.50 [43].
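For reference, Pearson's r and the interpretation bands used here can be written as follows. This is a generic sketch, not the authors' analysis script. The label for coefficients below 0.30 and the use of the absolute value (so that negative correlations are graded by magnitude) are assumptions that the text does not state explicitly.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_strength(r):
    """Grade |r|: >0.50 strong, 0.30 to 0.50 moderate, otherwise poor (assumed label)."""
    magnitude = abs(r)
    if magnitude > 0.50:
        return "strong"
    if magnitude >= 0.30:
        return "moderate"
    return "poor"
```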

The ability of the HCDI to discriminate between TMD patients and healthy subjects was determined using receiver operating characteristic (ROC) curves. First, the classification of the subjects as TMD patients or healthy controls was carried out based on the diagnostic criteria of the DC/TMD protocol, and the total score obtained in the HCDI was evaluated as a variable. In the ROC curve, the fraction of true positives (sensitivity) was represented as a function of the fraction of false positives for different cut-off points. The area under the curve (AUC) was also calculated as a measure of the ability of the score to discriminate between the two diagnostic groups (TMD patients or healthy subjects). The AUC was considered statistically significant when the 95% confidence interval did not include 0.5 [44]. Values between 0.5 and 0.7 indicated low accuracy, values between 0.7 and 0.9 indicated good accuracy and values greater than 0.9 indicated high accuracy [45].
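The ROC quantities described above can be sketched without any statistical library. The AUC is computed here as the proportion of patient-control pairs in which the patient has the higher score (ties count as half), which is equivalent to the area under the empirical ROC curve. The names and the rule that a score strictly greater than the cut-off counts as a positive test are assumptions for illustration.

```python
def sensitivity_specificity(scores, has_tmd, cutoff):
    """Sensitivity and specificity when 'score > cutoff' is a positive test."""
    tp = sum(1 for s, d in zip(scores, has_tmd) if d and s > cutoff)
    fn = sum(1 for s, d in zip(scores, has_tmd) if d and s <= cutoff)
    tn = sum(1 for s, d in zip(scores, has_tmd) if not d and s <= cutoff)
    fp = sum(1 for s, d in zip(scores, has_tmd) if not d and s > cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, has_tmd):
    """Area under the ROC curve via pairwise comparison of patients and controls."""
    patients = [s for s, d in zip(scores, has_tmd) if d]
    controls = [s for s, d in zip(scores, has_tmd) if not d]
    wins = 0.0
    for p in patients:
        for c in controls:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patients) * len(controls))


def classify_auc(value):
    """Accuracy bands from the text; below 0.5 is labelled by assumption."""
    if value > 0.9:
        return "high"
    if value >= 0.7:
        return "good"
    if value >= 0.5:
        return "low"
    return "no discrimination"
```

Repeating `sensitivity_specificity` over every candidate cut-off yields the points of the ROC curve, from which an optimal cut-off can be chosen.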

#### **3. Results**

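In all, 158 people were contacted, but the final sample was composed of 107 participants (60 TMD patients and 47 healthy controls), as 51 did not meet the selection criteria or refused to participate. The sociodemographic and anthropometric characteristics of the sample are shown in Table 1.

**Table 1.** Sociodemographic and anthropometric characteristics of the sample.

| | **All** *n* = 107 | **Healthy** *n* = 47 | **Temporomandibular Disorders (TMDs)** *n* = 60 |
|---|---|---|---|
| Weight (kilograms) | 72.83 (17.05) | 77.86 (19.22) | 68.90 (14.07) |
| Height (meters) | 1.63 (0.09) | 1.65 (0.09) | 1.61 (0.07) |
| Body mass index | 27.48 (6.91) | 28.48 (7.10) | 26.69 (6.72) |
| Age (years) | 46.25 (13.88) | 49.66 (14.56) | 43.53 (12.79) |
| **Sex** | | | |
| Female | 83 (77.6) | 27 (57.45) | 56 (93.3) |
| Male | 24 (22.4) | 20 (42.55) | 4 (6.7) |
| **Study level** | | | |
| Primary | 19 (17.8) | 12 (25.53) | 7 (11.7) |
| Secondary | 52 (48.6) | 25 (53.19) | 27 (45.0) |
| University | 36 (33.6) | 10 (21.28) | 26 (43.3) |
| **Physical activity** | | | |
| No | 45 (42.1) | 19 (40.43) | 26 (43.3) |
| Yes | 62 (57.9) | 28 (59.57) | 34 (56.7) |
| **Economic level** | | | |
| <€20.000 | 62 (57.9) | 29 (61.70) | 33 (55.0) |
| >€20.000 | 45 (42.1) | 18 (38.30) | 27 (45.0) |
| **Smoker** | | | |
| No | 69 (64.5) | 28 (59.57) | 41 (68.3) |
| Yes | 13 (12.1) | 6 (12.77) | 7 (11.7) |
| Occasional | 12 (11.2) | 6 (12.77) | 6 (10.0) |
| Ex-smoker | 13 (12.1) | 7 (14.89) | 6 (10.0) |
| **Drinker** | | | |
| No | 38 (35.5) | 19 (40.43) | 19 (31.7) |
| Regular drinker | 6 (5.6) | 3 (6.38) | 3 (5.0) |
| Occasional | 63 (58.9) | 25 (53.19) | 38 (63.3) |

Continuous variables are reported as mean (standard deviation); categorical variables are reported as frequency (percentage).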

#### *3.1. Inter-Rater Reliability*

Results showed a maximum weighted kappa value of 0.774 for item C and a minimum value of 0.426 for item A2. Based on these values, reliability ranged from moderate to substantial, while the total score of the scale reached an excellent degree of concordance of 0.905 (Table 2). Figure 1 shows the Bland–Altman plot. Table 3 shows concurrent validity of the Helkimo Clinical Dysfunction Index with other specific and generic instruments.

**Table 2.** Inter-rater concordance of the Helkimo items and the total score.


**Figure 1.** Limits of concordance by Bland–Altman plot.

**Table 3.** Concurrent validity of the Helkimo Clinical Dysfunction Index with other specific and generic instruments.


#### *3.2. Validity and Accuracy of the TMD Diagnostic Ability*

ROC curve analysis found an optimal cut-off point of more than 1 point in the HCDI score, which showed a sensitivity of 86.67% and a specificity of 68.09% for the diagnosis of TMDs, taking the DC/TMD protocol as the gold standard (Table 4). This analysis showed an area under the curve (AUC) of 0.841 (Figure 2), which can be interpreted as good accuracy.
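The likelihood ratios and predictive values that accompany a sensitivity and specificity pair follow from simple arithmetic, sketched below. The cell counts in the usage example (52 true positives, 8 false negatives, 32 true negatives, 15 false positives) are not reported in the text; they are inferred here from the stated percentages and group sizes (52/60 ≈ 86.67%, 32/47 ≈ 68.09%) and should be read as an assumption.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (+LR, -LR) from sensitivity and specificity given as fractions."""
    positive_lr = sensitivity / (1.0 - specificity)
    negative_lr = (1.0 - sensitivity) / specificity
    return positive_lr, negative_lr


def predictive_values(tp, fp, tn, fn):
    """Return (+PV, -PV) from the cells of a 2x2 diagnostic table."""
    positive_pv = tp / (tp + fp)
    negative_pv = tn / (tn + fn)
    return positive_pv, negative_pv


# Usage with the reported sensitivity and specificity and the inferred counts.
plr, nlr = likelihood_ratios(0.8667, 0.6809)
ppv, npv = predictive_values(tp=52, fp=15, tn=32, fn=8)
```

Predictive values depend on the proportion of TMD patients in the sample (60 of 107 here), so they would differ in a setting with a different prevalence, whereas the likelihood ratios would not.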

**Table 4.** Predictive values of the Helkimo Clinical Dysfunction Index (HCDI) total score by ROC curve analysis for the diagnosis of TMDs.


95% CI: 95% confidence interval; +LR: positive likelihood ratio; -LR: negative likelihood ratio; +PV: positive predictive value; -PV: negative predictive value.

**Figure 2.** Receiver operating characteristic (ROC) curve plot showing the area under the curve (AUC).

#### **4. Discussion**

This study evaluated the clinimetric properties of the Helkimo Clinical Dysfunction Index. The data obtained suggested that it is a valid and reliable instrument for evaluating patients with TMD, determining the degree of severity of the condition and discriminating between subjects affected and unaffected by TMD. In this study, a total sample of 107 subjects was used (60 TMD patients and 47 healthy subjects), and all of them were evaluated by this test, which lasted approximately 4 min. The two groups were comparable, except for a higher proportion of females in the TMD group, which is a consistent observation among TMD studies [17,27]. This fact may have led to a reduction in the mean weight and height and a higher proportion of university-educated subjects among the female population [46].

Despite being a commonly used tool for TMD assessment [19], few authors have studied the HCDI in depth. In 1987, Van der Weele et al. conducted an argumentative analysis of the HCDI, studying the pertinence of the construction of such a test to evaluate patients with TMD according to the evidence of the moment. They concluded that there was insufficient scientific evidence to support the use of these items in a diagnostic test for TMD [28]. However, in the analysis of the current scientific evidence regarding the pertinence of the use of these items in a diagnostic test for TMD, there is a general consensus that supports their use, and no evidence casts doubt on it [19,47]. In 2007, Da Cunha et al. conducted a comparative study between the HCDI and the craniomandibular test. As in the present study, they found greater affectation of TMD among women, who represented 70% of the total sample of affected people in the study, and a mean age of 46 years in affected patients, which agrees with the mean age of 43 years observed in this study [27].

Oliveira de Santis et al. conducted the only study analysing the psychometric characteristics of the HCDI and the American Association of Orofacial Pain (AAOP) index in subjects aged between 6 and 18 years, using the DC/TMD protocol as a reference. The authors found no statistically significant difference between genders, a sensitivity of 53.40% and a specificity of 77.27% for the HCDI, as well as a low level of accordance between the test being considered and the gold standard [47]. In contrast, in the present study, the sensitivity obtained was 86.67%, while the specificity was 68.09%. These differences in the results may be due to the difference in age between samples (46.25 years old in our study, 8.18 years old in that of Oliveira de Santis et al.), which could indicate that the HCDI is more useful for adults than children.

The present study had some limitations. First, the study sample had a higher proportion of women due to the higher proportion of women affected by TMD. Furthermore, although this study analysed the most common psychometric properties, we did not study the sensitivity to change or the ability to discriminate between different TMD populations. Additionally, this study was carried out on a sample of resident patients in a well-defined geographic location, which limits the generalisation of the results obtained.

#### **5. Conclusions**

The study shows that the HCDI is suitable for the diagnosis of TMD. The inter-observer concordance was between moderate and substantial for each of the items and excellent for the total score of the test. The HCDI has strong concurrent validity with the FAI, SFAI and NPRS orofacial assessment instruments; moderate validity with the NPRS neck pain assessment, emotional and physical facets and the total DHI value; and poor validity with respect to HIT-6 instruments, the mental and physical components of the SF-12 and the functional component of the DHI. The HCDI shows a sensitivity of 86.67%, a specificity of 68.09% and an AUC of 0.841 to predict the presence of TMD.

**Author Contributions:** All authors actively participated in the study and made substantial contributions to this article. Conceptualisation, R.A.-R., A.J.I.-V., C.M.S.-T., N.Z.-A. and R.L.-V.; methodology, N.Z.-A., Y.C.-C. and R.L.-V.; software, R.L.-V.; formal analysis, Y.C.-C. and R.L.-V.; investigation, R.A.-R., A.J.I.-V., C.M.S.-T., N.Z.-A., Y.C.-C., D.R.-A., E.O.-G. and R.L.-V.; data curation, R.A.-R., A.J.I.-V., C.M.S.-T. and R.L.-V.; writing—original draft preparation, R.A.-R., A.J.I.-V. and R.L.-V.; writing—review and editing A.J.I.-V. and R.L.-V.; visualisation and supervision, N.Z.-A., A.J.I.-V. and R.L.-V. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Ethics Committee of University of Jaen (internal code ABR.20/2.TFM; date of approval 27 April 2020).

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Data Availability Statement:** Data are available on request from the corresponding author, owing to the terms of participants' consent.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


*Article*
