**Optimal Examination Sites for Periodontal Disease Evaluation: Applying the Item Response Theory Graded Response Model**

**Yoshiaki Nomura <sup>1</sup>, Toshiya Morozumi <sup>2,\*</sup>, Mitsuo Fukuda <sup>3</sup>, Nobuhiro Hanada <sup>1</sup>, Erika Kakuta <sup>4</sup>, Hiroaki Kobayashi <sup>5</sup>, Masato Minabe <sup>2</sup>, Toshiaki Nakamura <sup>6</sup>, Yohei Nakayama <sup>7</sup>, Fusanori Nishimura <sup>8</sup>, Kazuyuki Noguchi <sup>6</sup>, Yukihiro Numabe <sup>9</sup>, Yorimasa Ogata <sup>7</sup>, Atsushi Saito <sup>10</sup>, Soh Sato <sup>11</sup>, Satoshi Sekino <sup>9</sup>, Naoyuki Sugano <sup>12</sup>, Tsutomu Sugaya <sup>13</sup>, Fumihiko Suzuki <sup>14</sup>, Keiso Takahashi <sup>15</sup>, Hideki Takai <sup>7</sup>, Shogo Takashiba <sup>16</sup>, Makoto Umeda <sup>17</sup>, Hiromasa Yoshie <sup>18</sup>, Atsutoshi Yoshimura <sup>19</sup>, Nobuo Yoshinari <sup>20</sup> and Taneaki Nakagawa <sup>21</sup>**



Received: 9 October 2020; Accepted: 19 November 2020; Published: 21 November 2020

**Abstract:** Periodontal examination data have a complex structure. For epidemiological studies, mass screenings, and public health use, a simple index that represents the periodontal condition is necessary. Periodontal indices for partial examination of selected teeth have been developed. However, the selected teeth vary between indices, and a justification for the selection of examination teeth has not been presented. We applied a graded response model based on item response theory to select optimal examination teeth and sites that represent periodontal conditions. Data were obtained from 254 patients who participated in a multicenter follow-up study. Baseline data were obtained from the initial follow-up. Optimal examination sites were selected using item information calculated by graded response modeling. Twelve sites were selected: maxillary 2nd premolar (palatal-medial), 1st premolar (palatal-distal), canine (palatal-medial), lateral incisor (palatal-central), central incisor (palatal-distal), and mandibular 1st premolar (lingual-medial), each on the right and left sides. Mean values for clinical attachment level, probing pocket depth, and bleeding on probing by full-mouth examination were used as objective variables. Measuring the clinical parameters of these sites can predict the results of full-mouth examination. For calculating the periodontal index by partial oral examination, a justification for the selection of examination sites is essential. This study presents an evidence-based partial examination methodology and its modeling.

**Keywords:** periodontitis; epidemiological index; item response theory; oral examination; diagnosis; bleeding on probing

### **1. Introduction**

Periodontal examination should be carried out precisely for the evaluation of periodontal diseases, especially their clinical parameters. Improvement in or progression of periodontitis should be monitored at the site level along with periodontal treatment [1]. One of the important characteristics of periodontal disease is the localization of infectious processes at specific sites, which eventually leads to tissue destruction [2]. Therefore, the data accumulated during periodontal examination are copious and structurally complex [1]. Each of the 28 teeth has six examination sites for measuring parameters such as bleeding on probing (BOP), periodontal pocket probing depth (PD), and clinical attachment level (CAL). These measurements are evaluated several times throughout the course of periodontal disease treatment. Representative summary statistics include the mean and maximum values of CAL and PD and the percentage of sites with BOP (BOP%). Conventionally, summary statistics have been used in several clinical trials for the patient-level evaluation of clinical parameters [3]. However, aggregating data into summary statistics such as the mean or maximum value can lead to a loss of information [4]. Full-mouth protocols are proven to be the most effective [5]. When time and labor are limited, a simplified index may be necessary to represent the periodontal condition.

Several periodontal indices have been developed for epidemiological surveys and periodontal disease screening: the periodontal disease index (PDI) [6,7], the periodontal index (PI) [8], the community periodontal index (CPI or CPITN) [8,9], the gingival bone count [10], the PMA index [11], and the gingival index [12]. They are based on a partial-examination method in which the target teeth for examination vary between the indices. A justification for the selection of these target teeth is not always clear. Nevertheless, the developers of these indices may have noticed that several teeth or examination sites within the oral cavity represent periodontal conditions at the individual level.

In many educational and psychological studies, a latent variable is often used as the outcome variable. Such variables cannot be measured directly and are estimated from the responses to observed items. Item response theory (IRT) modeling is a methodology commonly used in test development to measure ability from the total score of a test that consists of items with weighted scores [13]. Through the application of IRT, we can examine each item's reliability and whether it contributes to an overall construct [14,15]. In periodontal examination, the items correspond to the values recorded at the individual examination sites, and the total score corresponds to the sum of these values. Therefore, IRT is applicable to the indices used in dental research.

IRT can be applied to the clinical parameters of periodontal disease. The progress of periodontal disease at an individual level corresponds to ability. Susceptibility to the progress of PD, CAL, and BOP corresponds to item difficulty. The discrimination parameter corresponds to how well each site predicts the condition of all examination sites or teeth. Previous studies have shown that IRT can efficiently characterize dental caries susceptibility [16,17]. IRT graded response modeling has also been applied in the evaluation of existing measures in several clinical areas, such as those related to swallowing and communication disorders [18,19]. Furthermore, by using IRT models with clinical diagnoses from electronic health records, a constellation of high-risk patients could be identified [20]. Therefore, by applying the IRT model to periodontal data, evidence-based target sites or teeth can be selected to represent and reflect the periodontal conditions of all sites or teeth in the oral cavity.

This study aimed to identify the most reliable subset of teeth able to represent a full-mouth periodontal diagnosis.

### **2. Materials and Methods**

### *2.1. Study Design*

### 2.1.1. Setting

This study was part of a clinical research project by the Japanese Society of Periodontology, in cooperation with 17 facilities (one clinic and 16 university hospitals) in Japan for the diagnosis of periodontitis [1,21,22]. Between February 2009 and February 2012, 254 patients with chronic periodontitis who had completed active treatment regulated by the Japanese health insurance system were enrolled. All 254 patients who registered for the study were analyzed.

### 2.1.2. Diagnosis

Each patient was diagnosed according to the guideline at the time (Guidelines of the American Academy of Periodontology) [23]. One examiner from each institute (T.M., M.F., H.K., M.M., T.N., Y.N., K.N., S.S., N.S., S.S., T.S., F.S., H.T., H.Y., A.Y., N.Y. and T.N.) was chosen to carry out the oral examinations. Each examiner was a periodontist licensed by the Japanese Society of Periodontology.

Intra- and inter-examiner calibration sessions were conducted at the beginning and middle of the study period. Diagnosis of periodontitis was based on the criteria proposed by the Centers for Disease Control and Prevention (CDC) in partnership with the American Academy of Periodontology (AAP) [24].

### 2.1.3. Patients

Each patient was ≥ 30 years of age, possessed at least 20 teeth, was systemically healthy, and had not been administered immunosuppressive or anti-inflammatory drugs or systemic antibiotics within 3 months before the initiation of the investigation.

### *2.2. Research Data*

In this study, we analyzed CAL, PD, BOP, plaque index (PlI), and tooth mobility. CAL was measured at six sites for all of the remaining teeth (mesiobuccal, buccal, distobuccal, mesiolingual, lingual, and distolingual). The data of CAL were categorized as < 4, 4–5, and > 5 mm.
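The three-level coding of CAL described above can be sketched in a few lines of Python. This is only an illustration, not the code used in the study: the function name `categorize_cal`, the category codes 1 to 3, and the example measurements are assumptions, and the cut points are exposed as parameters so that other groupings can be tried.

```python
def categorize_cal(cal_mm, lower=4, upper=5):
    """Map a CAL measurement (mm) to an ordinal category.

    1: CAL < lower, 2: lower <= CAL <= upper, 3: CAL > upper.
    The defaults follow the < 4, 4-5, > 5 mm grouping in the text.
    """
    if cal_mm < lower:
        return 1
    if cal_mm <= upper:
        return 2
    return 3


# Hypothetical measurements at the six sites of one tooth (illustrative only)
sites = ["mesiobuccal", "buccal", "distobuccal",
         "mesiolingual", "lingual", "distolingual"]
cal_values = [3, 2, 4, 6, 5, 3]

coded = {site: categorize_cal(value) for site, value in zip(sites, cal_values)}
print(coded)
```

Coding each site as an ordered category is what allows an ordinal model such as the graded response model to be fitted to the data.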

### *2.3. Statistical Analysis*

### 2.3.1. IRT Modeling

Based on the IRT model for ordinal polytomous data, we applied a graded response model [25–29]. Item difficulty, item discrimination, item information for the examined sites, and the ability of the subjects were calculated [30–34]. The R software with the ltm package was used to perform the IRT analysis [27]. To reduce the total number of examination sites, sites with small item information were removed from the IRT model. This procedure was carried out step by step. Using the data for CAL, a model was constructed for all examination sites (Model 1). Next, out of all 168 examination sites, the 28 sites with the highest information for each tooth (sum of the left and right sides) were selected. An IRT model was constructed using these 28 sites (Model 2). Out of these 28 sites, the 12 sites in six teeth with the highest information were selected (Model 3). Finally, the data from the right and left sides were combined and categorized as follows: at least one site with >5 mm CAL; at least one site with 4–5 mm CAL; or both sites with <4 mm CAL (Model 4). Even though there may be optimal examination sites for each clinical parameter, the examination of numerous sites for each clinical parameter may be a laborious procedure for an epidemiological examiner or clinician. IRT models for BOP and PD were constructed in the same manner as for CAL.
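To make the role of item information in this selection concrete, the sketch below evaluates the standard graded response model for a single site: the cumulative curves are logistic in ability, category probabilities are differences of adjacent cumulative curves, and item information is the sum over categories of the squared derivative of each category probability divided by that probability. It uses Python with NumPy rather than the R ltm package used in the study, and the function names and the discrimination and threshold values are hypothetical, chosen only for illustration.

```python
import numpy as np


def _cumulative(theta, a, b):
    """Cumulative curves P(X >= k), padded with 1 on the left and 0 on the right."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    b = np.asarray(b, dtype=float)
    p_star = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b[None, :])))
    ones = np.ones((theta.size, 1))
    zeros = np.zeros((theta.size, 1))
    return np.hstack([ones, p_star, zeros])


def grm_category_probs(theta, a, b):
    """Category probabilities for one item; shape (len(theta), len(b) + 1)."""
    cum = _cumulative(theta, a, b)
    return cum[:, :-1] - cum[:, 1:]


def grm_item_information(theta, a, b):
    """Item information I(theta) = sum_k (dP_k/dtheta)^2 / P_k."""
    cum = _cumulative(theta, a, b)
    slope = a * cum * (1.0 - cum)        # derivative of each cumulative curve
    probs = cum[:, :-1] - cum[:, 1:]
    dprobs = slope[:, :-1] - slope[:, 1:]
    return np.sum(dprobs ** 2 / probs, axis=1)


# Two hypothetical sites with the same thresholds but different discrimination
theta_grid = np.linspace(-4, 4, 81)
step = theta_grid[1] - theta_grid[0]
items = {"site_A": (2.5, [0.0, 1.5]), "site_B": (0.8, [0.0, 1.5])}

# Approximate the area under each information curve on the grid
total_info = {
    name: float(grm_item_information(theta_grid, a, b).sum() * step)
    for name, (a, b) in items.items()
}
ranked = sorted(total_info, key=total_info.get, reverse=True)
print(ranked, total_info)
```

Ranking sites by the area under their information curves is one simple way to mimic the rule of dropping sites with small item information; the criterion actually applied at each step of the study is the one described in the text.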

### 2.3.2. Model Evaluation

Regression analysis was carried out on the scatter plots using generalized linear models. To choose the optimal link function, models were compared using Akaike's information criterion [35]. Receiver operating characteristic (ROC) curves were used to analyze sensitivity and specificity. The cutoff points were set where the difference between sensitivity and specificity was smallest [36,37]. The mean CAL of all examination sites and the community periodontal index (CPI) were used as references. The diagnostic criteria of the CDC-AAP [24] were used. Statistical Package for the Social Sciences version 24.0 (IBM, Tokyo, Japan) was used to perform the analyses.
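The cutoff rule, which chooses the point where sensitivity and specificity differ least, can be illustrated with a short Python sketch. The scores and disease labels below are synthetic, the function name `balanced_cutoff` is hypothetical, and the convention that a subject is positive when the score is at or above the cutoff is an assumption; the study itself performed the ROC analysis in SPSS.

```python
import numpy as np


def balanced_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) minimising |sens - spec|.

    A subject is classified as positive when score >= cutoff.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best = None
    for cutoff in np.unique(scores):
        predicted = scores >= cutoff
        sens = np.mean(predicted[labels])      # true positives / all positives
        spec = np.mean(~predicted[~labels])    # true negatives / all negatives
        gap = abs(sens - spec)
        if best is None or gap < best[0]:
            best = (gap, float(cutoff), float(sens), float(spec))
    return best[1:]


# Synthetic abilities: diseased subjects (label 1) tend to score higher
scores = [-1.2, -0.8, -0.3, 0.1, 0.4, 0.2, 0.9, 1.3, 1.8, 2.1]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

cutoff, sens, spec = balanced_cutoff(scores, labels)
print(cutoff, sens, spec)
```

Compared with maximising the sum of sensitivity and specificity, this rule favours a cutoff at which the two error rates are balanced.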

To compare the model with other studies, sensitivity, relative bias for severity, and relative bias for extent were calculated [38–40].
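As a rough illustration of these three quantities, the Python sketch below assumes commonly used definitions: relative bias is (partial estimate − full-mouth estimate) / full-mouth estimate, severity is the mean CAL, extent is the proportion of sites above a CAL threshold, and sensitivity is the share of subjects identified as cases by full-mouth examination who are also identified by the partial protocol. These definitions, the threshold, the function names, and the data are assumptions for illustration and may differ in detail from those in the cited studies.

```python
import numpy as np


def relative_bias(partial_estimate, full_estimate):
    """(partial - full) / full; negative values indicate underestimation."""
    return (partial_estimate - full_estimate) / full_estimate


def protocol_summary(cal, partial_idx, threshold=4.0):
    """Compare a partial protocol with full-mouth data.

    cal: subjects x sites array of CAL (mm).
    partial_idx: column indices of the sites in the partial protocol.
    A site counts as affected when CAL > threshold (assumed convention).
    """
    cal = np.asarray(cal, dtype=float)
    part = cal[:, partial_idx]
    severity_bias = relative_bias(part.mean(), cal.mean())
    extent_bias = relative_bias((part > threshold).mean(),
                                (cal > threshold).mean())
    full_case = (cal > threshold).any(axis=1)
    part_case = (part > threshold).any(axis=1)
    sensitivity = part_case[full_case].mean()
    return float(sensitivity), float(severity_bias), float(extent_bias)


# Hypothetical data: 3 subjects x 6 sites; the partial protocol uses sites 0 and 2
cal = [[2, 3, 5, 2, 3, 2],
       [3, 3, 3, 6, 2, 3],
       [2, 2, 2, 2, 3, 2]]
sensitivity, severity_bias, extent_bias = protocol_summary(cal, [0, 2])
print(sensitivity, severity_bias, extent_bias)
```

A relative bias close to zero means that the partial protocol reproduces the full-mouth estimate, whereas sensitivity shows how many affected subjects the partial protocol misses.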

### *2.4. Ethical Approval*

The study was conducted in compliance with the principles outlined in the Helsinki Declaration. Informed written consent was obtained from each subject, and the protocol was approved by the Institutional Review Board of each participating institution. The ethics committee members' names and reference numbers are listed in Appendix A.

### **3. Results**

### *3.1. Descriptive Statistics of the Subjects Who Participated in This Study*

The descriptive statistics of the periodontal clinical parameters were as follows: mean CAL, 3.1 mm; mean PD, 2.5 mm; BOP%, 15.0%; and PlI, 0.3.

### *3.2. Optimal Site Selection by IRT Modeling*

The final model for CAL (Model 4) is shown in Table 1, together with the models for the remaining clinical parameters. Item information and item response curves of Model 4 are shown in Figure S1. Using these steps, 168 examination sites were narrowed down to six variables located at 12 sites (same sites on the right and left side). The results of each step from Model 1 to 4 are shown in Table S1. A quick reference for the calculation of ability by Model 4 is presented in Appendix B.


**Table 1.** Final model (Model 4) for the clinical attachment level.

Extrmt: extremity parameters; CAL: clinical attachment level; PD: probing depth; BOP: bleeding on probing; PlI: plaque index. For CAL, the extremity parameters show the cutoffs that discriminate CAL < 4 mm, CAL 4–5 mm, and CAL > 5 mm. Extrmt1 discriminates CAL < 4 mm from (CAL 4–5 mm and CAL > 5 mm), and Extrmt2 discriminates (CAL < 4 mm and CAL 4–5 mm) from CAL > 5 mm. Discrimination: this parameter shows the steepness (slope) of the item characteristic curves. For the item response theory (IRT) analysis, CAL and PD were categorized as at least one site with >6 mm, at least one site with 4–6 mm, or both sites with <4 mm on the left or right side. IRT analysis was carried out using a graded response model. AIC: Akaike's information criterion; BIC: Bayesian information criterion; both are fit indices, for which smaller values indicate a better model fit.

### *3.3. Model Evaluation*

### 3.3.1. Evaluation of Selected Sites

In clinical practice, the mean values of all examination sites are often used as summary statistics. The selected 12 sites were evaluated using scatter plots of the full-mouth mean value of each clinical parameter against the mean value of the selected 12 sites. The results are shown in Figure 1. For each clinical parameter, adequate correlations were obtained.

**Figure 1.** Scatter plot of the mean values of clinical parameters against the mean values of the selected 12 sites. (**A**) CAL: clinical attachment level. (**B**) PD: probing depth. (**C**) BOP: bleeding on probing. (**D**) PlI: plaque index. The selected 12 sites are the same sites that are listed in the Figure 2 legend.

### 3.3.2. Model Evaluation

The models were evaluated using two methods: correlation between predicted and observed values, and ROC curve analysis. The ability calculated by IRT analysis serves as the predicted value for each subject. The scatter plot of the ability calculated using Model 4 against the mean values of CAL is illustrated in Figure 2. A scatter plot for all 168 examination sites (Model 1) is also presented as a reference. As the plot appears to be a curve, a generalized linear model was applied. The coefficient and intercept were statistically significant. The results of the generalized linear model and of the corresponding models for the other clinical parameters are shown in Table S2.

**Figure 2.** Scatter plot of the mean value of the clinical attachment level against the ability calculated by item response theory. Abilities are calculated using a graded response model under the item response theory approach. Abilities are calculated for all the 168 examined sites (**A**) and the selected six variables at the 12 sites (**B**). Ability is the sum of the weighted scores of the items. In this case, ability indicates the sum of the weighted scores of the positive periodontal examination findings.

The data from the same site on the left and right sides are combined and categorized as at least one site with >6 mm, at least one site with 4–6 mm, or both sites <4 mm.

The plot appears to be a curve. Therefore, the generalized linear model is applied for the relationship.

The selected sites were maxillary 2nd premolar (palatal-distal), maxillary 1st premolar (palatal-medial), maxillary canine (palatal-distal), maxillary lateral incisor (palatal-central), maxillary central incisor (palatal-medial), and mandibular 1st premolar (lingual-medial).

Based on the ROC analysis, sensitivity, specificity, likelihood, and area under ROC curve (AUR) are presented in Table 2. The results of the models for other clinical parameters are also shown in Table 2. For the mean value of CAL >3 mm, sensitivity and specificity were 0.832 and 0.852, respectively, and for >5 mm, they were 0.895 and 0.911, respectively. The ROC curve of each clinical parameter and various cutoff points are presented in Figure S2.

**Table 2.** Sensitivity, specificity, and area under the receiver operating characteristic curve for the six variables.


CAL: clinical attachment level; PD: probing depth; BOP: bleeding on probing; PlI: plaque index; AUR: area under receiver operating characteristic curve. Cutoff points are set on the abilities calculated using the graded response model, which is one of the models of item response theory.

### *3.4. Application of the CAL Model for Diagnosis of Periodontal Disease*

The cutoff point, sensitivity, specificity, and AUR for the diagnosis of periodontal disease by the CDC-AAP criteria are presented in Table 3. ROC curves are shown in Figure S3. For moderate periodontitis, the simple mean CAL of all examination sites is most useful, followed by the mean CAL of the optimal examination sites. For severe periodontitis, the CPI is most useful; the simple mean CAL did not achieve an AUR similar to that of the CPI.


**Table 3.** Receiver operating characteristic analysis for the selected optimal examination sites for the diagnosis of periodontal disease.

CPI: community periodontal index; CAL: clinical attachment level.

### *3.5. Prediction of Conventional Periodontal Indices by the CAL Model*

The model was applied to the conventionally used summary statistics, i.e., the mean values of CAL and PD and BOP%. Clinically useful cutoff points were set for each index. The results are presented in Table 4. The best predictor of the mean value of each clinical parameter was the mean value of the same parameter at the 12 selected sites (e.g., mean CAL by the mean CAL of the 12 sites). For the mean values of PD and BOP%, the simple mean CAL of the 12 selected sites yielded a higher AUR than the CPI. The ability for CAL, i.e., the weighted CAL, also yielded a higher AUR. All ROC curves are shown in Figure S4. By measuring the CAL, all other clinical parameters can be predicted.

### *3.6. Model Evaluation by Prevalence, Severity, and Extent*

The prevalence and the relative biases for severity and extent of Model 4 were 62%, −0.0035, and −0.017, respectively.




### **4. Discussion**

For public health applications of periodontal examination, such as epidemiological surveys, mass screenings, and community diagnosis, simplified indices that represent an individual's disease status are indispensable. For this purpose, several indices have been developed [6–12]. Developed by the World Health Organization, the CPI has been utilized in epidemiological surveys not just for community diagnosis but also for the screening of periodontal disease. Originally, the index was calculated through the examination of eight teeth [41]. It has since been revised to include the examination of all teeth [42]; however, the original method of examining just eight teeth is still applied. Japan's Survey of Dental Diseases, a national oral health survey conducted every six years, still uses the original CPI examination method. However, we were unable to find any justification for the tooth-selection methodology or periodontal indices used in these surveys [43].

Several methods that do not require oral examinations for the screening of periodontal disease have been proposed [36,44–48]. These include questionnaires [49,50] and biochemical analysis of the saliva [47] or gingival crevicular fluid [51]. However, the sensitivity and specificity of questionnaires used for periodontitis screening are not high enough, and biochemical analyses require a special measuring device. Therefore, oral examinations are still widely used for periodontal disease screenings.

In comparison to the CPI, our model was superior in predicting clinical parameters and almost equivalent in diagnosing moderate periodontitis. Further, partial examination using the CPI requires the examination of 60 sites; our model requires the examination of only 6–12 sites. Furthermore, the examination sites presented by our model are more representative of the periodontal conditions in an oral cavity. As shown in Table 4, the mean value of each index at these sites represents the mean value of that index across all examination sites in the oral cavity. By simply measuring the CAL, all other clinical parameters can be predicted. The model presented in this study was derived using the IRT approach. The IRT model is very useful for the selection of items that have high information. However, IRT models can only process dichotomous or ordinal-scale variables; they are unable to process continuous variables. When continuous measurements are categorized, a loss of information can occur. Therefore, the simple mean of the 12 selected sites is more suitable for calculating some of the predictions presented in this study.

Compared with other partial examination protocols, the 12 selected sites constitute a considerably smaller number of sampling sites. Other protocols sample 84 [38,39,52,53], 60 [54], or 56 [38,39,52] sites. A partial examination protocol with a large number of examination sites can detect a small number of sites with deep CAL or PD. Here, sensitivity indicates the ability to detect subjects with at least one site of CAL > 4 mm. The sensitivity was 92% with 84 examination sites [52], 66% with 56 sites [52], and 57% with 28 sites [52]. The sensitivity with the 12 sites in this study was 62%. In addition, the relative bias for severity of the 12 sites, which estimates the difference in mean CAL from the full-mouth examination, was −0.0035. The corresponding values for protocols with 84 sites were 0.009 [52], −0.046 [38], and −0.01 [53]. The 12 sites based on a statistical model may therefore perform comparably to other partial examination protocols that use more than five times as many examination sites.

In this study, the teeth selected for examination included premolars and anterior teeth; the molars were excluded. The molar is a multi-rooted tooth with complex anatomical root morphology, including root length, furcation area, and divergence of root and root trunk. Cervical enamel projections and enamel pearls also occur commonly in molars and are considered to be risk factors for periodontal disease; however, their occurrence varies among individuals [55]. Additionally, molars have to withstand high occlusal forces, which can contribute to periodontal tissue destruction. Therefore, we excluded molars from the representative oral examination sites in our analysis.

The periodontal disease index is used for assessing the periodontal status in epidemiological surveys; six target teeth (maxillary right 1st molar, maxillary left central incisor, maxillary left 1st premolar, mandibular left 1st molar, mandibular right central incisor, and mandibular right 1st premolar) are scored for the assessment of the disease. However, a sufficient justification for the selection of these teeth has not been provided [6]. For such an index, evidence is indispensable.

In this study, optimal sites were selected based on item information through the models presented in Table S1 and Table 1. The twelve sites selected in this study, namely maxillary 2nd premolar (palatal-medial), 1st premolar (palatal-distal), canine (palatal-medial), lateral incisor (palatal-central), central incisor (palatal-distal), and mandibular 1st premolar (lingual-medial), were based on statistical modeling and represented the periodontal conditions. A full-mouth examination is the best method; however, when time and labor are limited, partial examination may be applicable. Partial examination of these sites may be a useful tool for epidemiological studies, mass screenings, and public health use.

There are several limitations in this study. The study population consisted of patients who had undergone active periodontal treatment. A wider population is necessary to confirm the robustness of the model presented in this study. However, several existing partial-mouth assessments are not based on statistical modeling. The strength of this study is that the model is based on IRT, and weights for the sites were calculated to improve the predictive values.

### **5. Conclusions**

For calculating the periodontal index by partial oral examination, a justification for the selection of examination sites is necessary. This study presents an evidence-based partial examination method and its modeling. The 12 sites presented in this study are almost equivalent to other partial examination protocols that have more than five times the number of sampling sites.

**Supplementary Materials:** The attached supplementary materials are available online at http://www.mdpi.com/2077-0383/9/11/3754/s1, Figure S1: Item response curve and item information curves by the selected six values of the clinical attachment level, Figure S2: ROC curves by the selected six values, Figure S3: ROC curves for the diagnosis of periodontal disease by the selected site, Figure S4: ROC curves for clinical parameters by the selected site, Table S1: Constructed models, Table S2: Generalized linear model to predict the mean value of the clinical attachment level by the selected site.

**Author Contributions:** Conceptualization, Y.N. (Yoshiaki Nomura) and T.M.; methodology, Y.N. (Yoshiaki Nomura); software, Y.N. (Yoshiaki Nomura); validation, E.K. and N.H.; formal analysis, Y.N. (Yoshiaki Nomura); investigation, T.M., M.F., H.K., M.M., T.N. (Taneaki Nakagawa), Y.N. (Yukihiro Numabe), F.N., K.N., Y.N. (Yohei Nakayama), Y.O., A.S., S.S. (Soh Sato), N.S., S.S. (Satoshi Sekino), T.S., F.S., K.T., H.T., M.U., H.Y., A.Y., N.Y. and T.N. (Toshiaki Nakamura); data curation, S.T.; writing—original draft preparation, Y.N. (Yoshiaki Nomura); writing—review and editing, Y.N. (Yoshiaki Nomura) and T.M.; visualization, Y.N. (Yoshiaki Nomura); supervision, T.M. and H.Y.; project administration, T.M., T.N. (Taneaki Nakagawa) and H.Y.; funding acquisition, T.N. (Taneaki Nakagawa) and H.Y. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by a clinical research project grant from the Japanese Society of Periodontology for the diagnosis of periodontitis.

**Acknowledgments:** The authors thank Toshihide Noguchi, Masamitsu Kawanami, Koichi Ito, Yuichi Izumi, Yoshitaka Hara, Osamu Fujise, Yuzo Abe, Tomoo Kono, Asako Makino-Oi, and Chie Fukaya, for their advice and useful comments.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **Appendix A**

List of ethical committees and their approval numbers.



### **Appendix B**

Quick reference for the calculation of ability by Model 4.





1: CAL < 4 mm; 2: CAL = 4–6 mm; 3: CAL > 6 mm.

### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
