Article

Associations of Preterm Birth with Dental and Gastrointestinal Diseases: Machine Learning Analysis Using National Health Insurance Data

1 Department of Oral and Maxillofacial Surgery, Korea University College of Medicine, Korea University Anam Hospital, Seoul 02841, Republic of Korea
2 Department of Obstetrics and Gynecology, Korea University College of Medicine, Korea University Anam Hospital, Seoul 02841, Republic of Korea
3 Department of Gastroenterology, Korea University College of Medicine, Korea University Anam Hospital, Seoul 02841, Republic of Korea
4 Department of Statistics, Korea University College of Political Science & Economics, Korea University Anam Hospital, Seoul 02841, Republic of Korea
5 AI Center, Korea University College of Medicine, Korea University Anam Hospital, Seoul 02841, Republic of Korea
* Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Int. J. Environ. Res. Public Health 2023, 20(3), 1732; https://doi.org/10.3390/ijerph20031732
Submission received: 27 December 2022 / Revised: 15 January 2023 / Accepted: 16 January 2023 / Published: 18 January 2023

Abstract

Background: This study uses machine learning with large-scale population data to assess the associations of preterm birth (PTB) with dental and gastrointestinal diseases. Methods: Population-based retrospective cohort data came from Korea National Health Insurance claims for 124,606 primiparous women aged 25–40 who delivered in 2017. The 186 independent variables included demographic/socioeconomic determinants, disease information, and medication history. Machine learning analysis was used to establish a prediction model for PTB. Random forest variable importance was used to identify major determinants of PTB and to test its associations with dental and gastrointestinal diseases, medication history, and socioeconomic status. Results: The random forest with oversampled data registered an accuracy of 84.03% and areas under the receiver-operating-characteristic curve in the range of 84.03–84.04%. Based on random forest variable importance with oversampled data, PTB has strong associations with socioeconomic status (0.284), age (0.214), year 2014 gastroesophageal reflux disease (GERD) (0.026), year 2015 GERD (0.026), year 2013 GERD (0.024), progesterone (0.024), year 2012 GERD (0.023), year 2011 GERD (0.021), tricyclic antidepressant (0.020), and year 2016 infertility (0.019). For example, the accuracy of the model will decrease by 28.4%, 2.6%, or 1.9% if the values of socioeconomic status, year 2014 GERD, or year 2016 infertility are randomly permuted (shuffled). Conclusion: Using machine learning, we established a valid prediction model for PTB. PTB has strong associations with GERD and infertility. Pregnant women need close surveillance for gastrointestinal and obstetric risks at the same time.

1. Introduction

Preterm birth (PTB), a delivery occurring between 20 0/7 and 36 6/7 gestational weeks, is one of the major unsolved problems of obstetrics. PTB is a main cause of serious neonatal morbidity and mortality, which leads to heavy socioeconomic and public health costs [1,2]. The incremental economic cost of caring for preterm infants in the United States was reported to be 25 billion dollars [3]. In addition, PTB affects the long-term health and well-being of preterm infants in later life [4]. In Korea, the cumulative medical cost for preterm children was approximately 43 million dollars during the 6 years after discharge from the neonatal intensive care unit (NICU) [5]. Despite continual efforts to reduce and prevent PTB, it still accounts for approximately 11% of total births globally. Even though more than half of PTBs occur in low- and middle-income countries, the PTB rate is also increasing in higher-income countries and has remained a significant social and medical issue for decades [6].
Fully identifying the pathophysiology of PTB is challenging because it is a complex syndrome in which many factors are involved. Studies have reported various risk factors for PTB, including sociodemographic, lifestyle, genetic, medical, and obstetric factors. Periodontal disease, such as gingivitis, which affects more than 40% of the population in the United States, is one of the contributing factors of PTB [7,8,9]. The prevalence of periodontitis in Korea was approximately 24%, and periodontitis is reported to share some risk factors with PTB, such as low socioeconomic status, obesity, and smoking [5,10,11]. Gastroesophageal reflux disease (GERD), a common gastrointestinal disease during pregnancy, is accompanied by dental disease, including periodontitis and dental erosions [12,13]. The prevalence of GERD worldwide was 13%. Its prevalence has more than doubled in Asia over the past two decades, from 6.0% to 15.0% [14,15]. Based on this linkage between PTB, periodontitis, and GERD, a few studies were conducted to assess the effect of both periodontitis and GERD on PTB. These previous studies found that GERD is one of the major determinants of PTB and has a stronger association with PTB than periodontitis does [16,17]. These studies were clinically meaningful in pointing to the importance of GERD, which is often overlooked, in the context of preventing PTB. However, the previous studies were limited by either a small study population (731 participants) or a relatively low range of areas under the receiver-operating-characteristic curve (0.51–0.57) [16,17].
To overcome those limitations, we aimed to establish a high-performance prediction model with machine learning and large-scale population data. Furthermore, we assessed the association of PTB with a wider range of dental diseases than in previous studies.

2. Materials and Methods

2.1. Participants and Variables

Population-based retrospective cohort data came from Korea National Health Insurance claims for 172,462 primiparous women aged 25–40 who delivered in 2017. This retrospective cohort study was approved by the Institutional Review Board (IRB) of Korea University Anam Hospital on 5 November 2018 (2018AN0365). Informed consent was waived by the IRB.
The dependent variable was PTB (birth before 37 weeks of gestation) in 2017. Four categories of PTB were introduced according to the ICD-10 Code: (1) PTB 1: PTB with premature rupture of membranes (PROM) only; (2) PTB 2: preterm labor and birth without PROM; (3) PTB 3: PTB 1 or PTB 2; (4) PTB 4: PTB 3 or other indicated PTB (Table S1). The 186 independent variables covered the following information: (1) demographic/socioeconomic determinants in 2016, including age and socioeconomic status measured by an insurance fee with the range of 1 (the highest socioeconomic group) to 20 (the lowest socioeconomic group); (2) dental diseases for any of the years 2002–2016, i.e., dental cavity, oral mucositis, periodontitis, salivary gland disease, and tooth loss; (3) gastrointestinal diseases for any of the years 2002–2016, i.e., Crohn’s disease, gastroesophageal reflux disease (GERD), irritable bowel syndrome, and ulcerative colitis; (4) obstetric history in 2016, that is, infertility; (5) medication history in 2016, including benzodiazepine, calcium channel blocker, nitrate, progesterone, sleeping pill, and tricyclic antidepressant. The selection of these 186 independent variables was based on previous studies and data availability. The data on disease and medication history were screened from ICD-10 and ATC codes, respectively (Supplementary Tables S1 and S2).

2.2. Analysis

Logistic regression and the random forest were used for the prediction of PTB [16,17]. A random forest is a group of decision trees with a majority vote on the dependent variable. A random forest with 100 decision trees and default parameters (GINI criterion, max depth none, max features square root) was employed in this study: 100 training sets were sampled with replacement, 100 decision trees were trained with the 100 training sets, the 100 decision trees made 100 predictions, and the random forest took a majority vote on the dependent variable. The data of 124,606 cases with full information were split into training and validation sets with an 80:20 ratio (99,685 vs. 24,921 cases). Criteria for the validation of the trained models were accuracy (the ratio of correct predictions among the 24,921 validation cases) and the area under the receiver-operating-characteristic curve (the plot of sensitivity vs. 1-specificity). In other words, accuracy can be expressed as the number of true positives plus true negatives divided by the number of all cases. Likewise, the area under the receiver-operating-characteristic curve can be interpreted as “how much sensitivity can be secured when the threshold of sensitivity rises from 0 to 1 and specificity rises from 0 to 1” [11,12]. Random forest variable importance was introduced to identify major determinants of PTB and to test its associations with dental and gastrointestinal diseases, medication history, and socioeconomic status. The package “sklearn 1.2.0” in Python (CreateSpace: Scotts Valley, 2009) was employed for the analysis from 15 December 2021 to 31 July 2022.
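The workflow described above can be sketched in a few lines of scikit-learn. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic (`make_classification`), the sample size, random seeds, and the stratified split are choices made here for illustration, and only the forest settings (100 trees, Gini criterion, no depth limit, square-root feature subsampling) and the 80:20 split come from the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the claims data (assumption: NOT the study data).
X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)

# 80:20 split into training and validation sets.
# Stratifying on y is an assumption made here; the text does not mention it.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# 100 trees, Gini criterion, unlimited depth, sqrt(n_features) tried per split.
forest = RandomForestClassifier(n_estimators=100, criterion="gini",
                                max_depth=None, max_features="sqrt",
                                random_state=0)
forest.fit(X_train, y_train)

# Accuracy = (true positives + true negatives) / all validation cases.
accuracy = accuracy_score(y_val, forest.predict(X_val))
# AUC is computed from predicted probabilities of the positive class.
auc = roc_auc_score(y_val, forest.predict_proba(X_val)[:, 1])
print(f"accuracy={accuracy:.4f}, AUC={auc:.4f}")
```

Note that scikit-learn combines the trees by averaging their predicted class probabilities rather than by counting hard votes, which usually, but not always, gives the same label as a majority vote.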

3. Results

Descriptive statistics for the 124,606 participants are presented in Supplementary Table S3. The proportion of those with preterm birth (PTB4) was 5.8% (7285/124,606) in 2017. The mean of socioeconomic status was 11.11 for preterm birth vs. 11.08 for term birth, whereas the mean of age was 32.1 for preterm birth vs. 31.8 for term birth (p-value < 0.01). The proportions of those with GERD during 2011–2016 were higher for preterm birth than for term birth, and these differences were statistically significant: 7.7% vs. 6.6% (2011 GERD), 8.2% vs. 7.6% (2012 GERD), 8.9% vs. 8.0% (2013 GERD), 9.8% vs. 9.1% (2014 GERD), 10.5% vs. 9.6% (2015 GERD), and 11.7% vs. 10.0% (2016 GERD). Likewise, the proportions of those with progesterone, tricyclic antidepressant, and infertility in 2016 were higher for preterm birth compared to term birth, and these differences were statistically significant: 19.0% vs. 15.8% (2016 progesterone), 10.8% vs. 9.6% (2016 tricyclic antidepressant), and 27.7% vs. 18.1% (2016 infertility). The findings of the univariate analysis above confirm the positive associations of preterm birth with GERD during 2011–2016, as well as age, progesterone, tricyclic antidepressant, and infertility in 2016.
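The univariate comparisons above can be checked approximately from the reported figures alone. The sketch below applies a pooled two-proportion z-test to the 2016 infertility proportions (27.7% vs. 18.1%), assuming group sizes of 7285 preterm births and 124,606 − 7285 term births. The source does not name the test that was actually used, so the choice of test is an assumption, and `two_proportion_z` is a hypothetical helper written for this illustration.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Return the pooled two-proportion z statistic and its two-sided p-value."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Group sizes taken from the text (assumed to apply to the PTB 4 comparison).
n_preterm = 7285
n_term = 124606 - n_preterm

# Reported proportions with infertility in 2016: preterm vs. term.
z, p = two_proportion_z(0.277, n_preterm, 0.181, n_term)
print(f"z={z:.1f}, p={p:.3g}")
```

Because the inputs are rounded percentages, the result is only an approximation of what an analysis on the raw counts would give.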
In Table 1, the random forest with oversampled data registered an accuracy of 84.03% and areas under the receiver-operating-characteristic curve in the range of 84.03–84.04%. Its logistic-regression counterparts were within the ranges of 50.45–60.25 (accuracy) and 54.40–60.19 (areas under the receiver-operating-characteristic curve). The performance measures of the random forest with oversampled data were far beyond those of logistic regression with oversampled data. Here, oversampling is an approach to match the sizes of two groups (participants with and without preterm birth) so that the training of machine learning models can be balanced between the two groups. Based on random forest variable importance with oversampled data in Table 2 and Figure 1, PTB 4 has strong associations with socioeconomic status (0.284), age (0.214), year 2014 GERD (0.026), year 2015 GERD (0.026), year 2013 GERD (0.024), progesterone (0.024), year 2012 GERD (0.023), year 2011 GERD (0.021), tricyclic antidepressant (0.020), and year 2016 infertility (0.019). For example, the accuracy of the model will decrease by 28.4%, 2.6%, or 1.9% if the values of socioeconomic status, year 2014 GERD, or year 2016 infertility are randomly permuted (shuffled). Among the top 10 important variables, socioeconomic status and age are well-known contributing factors for PTB [18,19,20]. Infertility was strongly associated with an increased risk of PTB. This finding is consistent with previous studies that reported an increased risk of PTB in women with a history of infertility or assisted reproductive technology [21,22,23]. GERD during 2011–2016 ranked among the top 10 important variables. Even though GERD has not been considered a conventional risk factor for PTB, recent studies have reported an association between GERD and PTB [16,17,24]. The mechanism by which GERD and infertility increase the risk of PTB has not yet been elucidated.
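As an illustration of the oversampling idea described above, the sketch below duplicates randomly chosen minority-class rows until the two groups have equal size. The text does not state which oversampling method was used or at which stage it was applied, so simple random oversampling is an assumption here, `random_oversample` is a hypothetical helper, and the toy data only mimic the reported 5.8% preterm proportion.

```python
import numpy as np

def random_oversample(X, y, seed=0):
    """Resample minority-class rows with replacement until both classes match in size."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    minority_idx = np.flatnonzero(y == minority)
    n_extra = counts.max() - counts.min()
    extra_idx = rng.choice(minority_idx, size=n_extra, replace=True)
    # Keep every original row and append the duplicated minority rows.
    keep = np.concatenate([np.arange(len(y)), extra_idx])
    return X[keep], y[keep]

# Toy data (assumption): about 5.8% positives, three random predictors.
rng = np.random.default_rng(1)
y = (rng.random(10_000) < 0.058).astype(int)
X = rng.normal(size=(10_000, 3))

X_bal, y_bal = random_oversample(X, y)
print("before:", np.bincount(y), "after:", np.bincount(y_bal))
```

A common precaution is to oversample only the training portion of the data, since duplicating rows before the training/validation split can place copies of the same case in both sets.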
Among the medications, TCA and progesterone were important variables. It is assumed that progesterone showed a strong association with PTB not because it increased the risk of PTB but because it is widely used to prevent PTB in pregnant women with short cervical length, a high-risk group for PTB. It is reasonable that the administration of TCA, a popular antidepressant, was associated with the risk of PTB, considering that maternal stress is also a well-known contributor to PTB [25]. In addition, recent studies have demonstrated that antenatal maternal depression increases the risk of PTB [26,27]. It should be noted that the random forest variable importance measures for oversampled data were very similar to those for the original data in Supplementary Table S4 and Figure 1.
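Variable importance for a random forest can be computed in more than one way in scikit-learn, and the two common options answer slightly different questions. The sketch below, on synthetic data, contrasts the impurity-based importance stored in `feature_importances_` (normalized to sum to 1) with `permutation_importance`, which measures the mean drop in validation accuracy when one variable is shuffled. Which of the two underlies the reported values is not specified beyond the description in the text, so this is an illustration of the concepts rather than a reproduction; the data, seeds, and sizes are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data (assumption): columns 0-2 are informative, columns 3-7 are noise.
X, y = make_classification(n_samples=1500, n_features=8, n_informative=3,
                           n_redundant=0, shuffle=False, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# Impurity-based (Gini) importance: shares of impurity reduction, summing to 1.
gini_importance = forest.feature_importances_

# Permutation importance: mean decrease in validation accuracy after shuffling
# one column at a time, averaged over 10 shuffles.
perm = permutation_importance(forest, X_val, y_val, scoring="accuracy",
                              n_repeats=10, random_state=0)

for i in np.argsort(gini_importance)[::-1]:
    print(f"x{i}: gini={gini_importance[i]:.3f}, "
          f"accuracy drop={perm.importances_mean[i]:.3f}")
```

Only the permutation measure has the direct reading "accuracy falls by this much when the variable is shuffled"; the impurity-based values are relative shares and need not match it numerically.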

4. Discussion

4.1. Summary

The random forest with oversampled data registered an accuracy of 84.03% and areas under the receiver-operating-characteristic curve in the range of 84.03–84.04%. Based on random forest variable importance with oversampled data, PTB has strong associations with socioeconomic status, age, year 2014 GERD, year 2015 GERD, year 2013 GERD, progesterone, year 2012 GERD, year 2011 GERD, tricyclic antidepressant, and year 2016 infertility.

4.2. Contributions

This study presents the most comprehensive analysis of the determinants of PTB, using a population-based cohort of 124,606 participants and the richest collection of 186 predictors, including demographic/socioeconomic determinants, dental and gastrointestinal diseases, and medication history. We established a valid prediction model for PTB and investigated its associations with dental and gastrointestinal diseases. Moreover, this study has the following clinical and policy implications. Firstly, the findings of this study agree with those of previous studies on the positive association of preterm birth with low socioeconomic status [20,28]. The odds ratio of low socioeconomic status was 1.75 in a retrospective cohort study of 1,282,172 pregnant women in Scotland during 1980–2000 [20]. The corresponding statistic was 5.1 in a multicenter prospective study of 2645 pregnant women in Korea [28]. In a similar vein, socioeconomic status, measured by an insurance fee with the range of 1 (the highest socioeconomic group) to 20 (the lowest socioeconomic group), ranked first in random forest variable importance, and its mean was higher for preterm birth (11.11) than for term birth (11.08) in this study. Secondly, the results of this study call for due attention to the importance of infertility and its determinants in the prediction of preterm birth. In this study, the variable importance of year 2016 infertility was within the top 10, and its proportion was higher for preterm birth (27.7%) than for term birth (18.1%) with statistical significance. These findings are consistent with those of a retrospective cohort study of 2034 pregnant women in Australia during 1986–1998, reporting that the proportion of preterm birth was higher with high infertility treatment history (5.2%) than without the history (1.0%) [22].
Likewise, in a retrospective cohort study of 117,401 pregnant women during 1991–2016 in the United Kingdom, the odds ratios of preterm birth were higher with the following causes of infertility than without them: ovulatory disorders (1.25), tubal disorders (1.25), and endometriosis (1.17) [23]. However, the literature is sparse, and more analysis is needed on the determinants of infertility and their associations with preterm birth.
Thirdly, the findings of this study affirm those of the existing literature on the positive association of preterm birth with GERD [16,17,24]. In this study, GERD during 2011–2016 ranked within the top 10 in random forest variable importance for the prediction of preterm birth in 2017, and its proportions were higher for preterm birth than for term birth with statistical significance: 7.7% vs. 6.6% (2011 GERD), 8.2% vs. 7.6% (2012 GERD), 8.9% vs. 8.0% (2013 GERD), 9.8% vs. 9.1% (2014 GERD), 10.5% vs. 9.6% (2015 GERD), and 11.7% vs. 10.0% (2016 GERD). Similarly, in a retrospective cohort study of 405,586 pregnant women during 2002–2017 in Korea, the random forest variable importance rankings of GERD during 2009–2014 were within the top 10 for the prediction of preterm birth during 2015–2017 [24]. However, pregnant women usually neglect the significant role of GERD symptoms in preterm birth; hence, more active counseling is needed for effective prenatal care. Fourthly, this study brings new insights into a positive relationship between inflammatory bowel disease and preterm birth. Irritable bowel syndrome (IBS) in 2005 and 2006 ranked within the top 20 in random forest variable importance, and its proportions were higher for preterm birth than for term birth in this study, i.e., 3.61% vs. 2.89% (2005 IBS) and 3.69% vs. 3.13% (2006 IBS). As a matter of fact, inflammatory bowel disease, including Crohn’s disease and ulcerative colitis, was reported to be a risk factor for preterm birth in previous studies [29,30,31,32,33], and inflammatory agents are expected to mediate uterine contractility, cervical dilation, and inflammatory bowel disease during the process of labor [34]. This study makes a unique contribution to this line of research using machine learning analysis and a population-based cohort of 124,606 participants. Finally, it is notable that the importance rankings of dental diseases were outside the top 30 in this study.
Further machine learning investigation is needed for more concrete evidence and more rigorous validation in this direction.

4.3. Limitations

This study had some limitations. Firstly, indicated PTB (preterm birth due to maternal and fetal indication) and spontaneous PTB (preterm birth due to spontaneous preterm labor) have different etiologies, but this study could not separate them based on the ICD-10 code. Secondly, this study did not examine possible mediating effects among variables (e.g., the mediating effects of socioeconomic status between heart disease and preterm birth). Thirdly, a recent review suggests that different machine learning approaches would be optimal for different types of data regarding the prediction of preterm birth: the artificial neural network, logistic regression, and/or the random forest for numeric data; the support vector machine for electrohysterogram data; the recurrent neural network for text data; and the convolutional neural network for image data [23]. Uniting various kinds of machine learning approaches for various kinds of PTB data would bring innovations and deeper insights into this line of research. Fourthly, this study used the default parameters of the random forest, which leaves a possibility of overfitting; parameter tuning (e.g., the number of trees, their max depth) would help to resolve the issue. Lastly, various dental and gastrointestinal diseases would have different effects on a mother or fetus. However, we did not consider the different mechanisms among these diseases in this study.
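The parameter tuning mentioned in the fourth limitation could, for instance, take the form of a cross-validated grid search over the number of trees and the maximum depth. The sketch below shows one way to do this with scikit-learn's `GridSearchCV`; the synthetic data, the grid values, the 3-fold cross-validation, and the AUC scoring are all assumptions chosen for illustration, not settings taken from the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data (assumption: NOT the study data).
X, y = make_classification(n_samples=600, n_features=10, n_informative=4,
                           random_state=0)

# Candidate values for the two parameters named in the text (illustrative only).
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5, 10]}

# 3-fold cross-validation, selecting the combination with the highest mean AUC.
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid,
                      scoring="roc_auc", cv=3)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 4))
```

Limiting `max_depth` trades some flexibility for smoother trees; whether that helps on the claims data would have to be checked on held-out cases.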

5. Conclusions

In conclusion, using machine learning with large-scale population data, this study established a valid prediction model for PTB and demonstrated that GERD and infertility are significantly associated with PTB. The study confirms the need for close surveillance of obstetric risks as well as the gastrointestinal risk for PTB, which has been overlooked. Further prospective studies are needed to elucidate the pathophysiology by which GERD increases the risk of PTB.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijerph20031732/s1, Table S1: ICD-10 Code for Preterm Birth and Dental/Gastrointestinal Disease; Table S2: ATC Code for Medication; Table S3: Descriptive Statistics; Table S4: Random Forest Variable Importance—No Sampling.

Author Contributions

I.-S.S., E.-S.C., E.S.K., K.-S.L. and K.H.A. designed the study. Y.H. and K.-S.L. collected, analyzed, and interpreted the data. I.-S.S., E.-S.C., K.-S.L. and K.H.A. wrote and edited the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by (1) the Korea University Medical Center grant (No. K1925051), (2) the Korea Health Industry Development Institute grant funded by the Ministry of Health & Welfare of South Korea (No. HI21C156001), (3) the Korea Health Industry Development Institute grant (Korea Health Technology R&D Project) funded by the Ministry of Health & Welfare of South Korea (No. HI22C1302), and (4) the Korea Medical Device Development Fund grant funded by the Ministry of Science and ICT, the Ministry of Trade, Industry, and Energy, the Ministry of Health & Welfare, and the Ministry of Food and Drug Safety of South Korea (No. RS-2021-KD000009). The funders had no role in the design of the study, the collection, analysis, and interpretation of the data, or the writing of the manuscript.

Institutional Review Board Statement

This retrospective cohort study was approved by the Institutional Review Board (IRB) of Korea University Anam Hospital on 5 November 2018 (2018AN0365). Informed consent was waived by the IRB.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are not publicly available. However, the data are available from the corresponding author upon reasonable request and under the permission of Korea National Health Insurance Service.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Beck, S.; Wojdyla, D.; Say, L.; Betran, A.P.; Merialdi, M.; Requejo, J.H.; Rubens, C.; Menon, R.; Van Look, P.F. The worldwide incidence of preterm birth: A systematic review of maternal mortality and morbidity. Bull. World Health Organ. 2010, 88, 31–38.
  2. Mangham, L.J.; Petrou, S.; Doyle, L.W.; Draper, E.S.; Marlow, N. The cost of preterm birth throughout childhood in England and Wales. Pediatrics 2009, 123, e312–e327.
  3. Waitzman, N.J.; Jalali, A.; Grosse, S.D. Preterm birth lifetime costs in the United States in 2016: An update. Semin. Perinatol. 2021, 45, 151390.
  4. Markopoulou, P.; Papanikolaou, E.; Analytis, A.; Zoumakis, E.; Siahanidou, T. Preterm birth as a risk factor for metabolic syndrome and cardiovascular disease in adult life: A systematic review and meta-analysis. J. Pediatr. 2019, 210, 69–80.e5.
  5. Jin, J.H.; Lee, S.A.; Yoon, S.W. Medical utilization and costs in preterm infants in the first 6 years of life after discharge from neonatal intensive care unit: A nationwide population-based study in Korea. J. Korean Med. Sci. 2022, 37, e93.
  6. Blencowe, H.; Cousens, S.; Oestergaard, M.Z.; Chou, D.; Moller, A.B.; Narwal, R.; Adler, A.; Vera Garcia, C.; Rohde, S.; Say, L.; et al. National, regional, and worldwide estimates of preterm birth rates in the year 2010 with time trends since 1990 for selected countries: A systematic analysis and implications. Lancet 2012, 379, 2162–2172.
  7. Vergnes, J.N.; Sixou, M. Preterm low birth weight and maternal periodontal status: A meta-analysis. Am. J. Obstet. Gynecol. 2007, 196, 135.e1–135.e7.
  8. Puertas, A.; Magan-Fernandez, A.; Blanc, V.; Revelles, L.; O’Valle, F.; Pozo, E.; León, R.; Mesa, F. Association of periodontitis with preterm birth and low birth weight: A comprehensive review. J. Matern. Fetal Neonatal Med. 2018, 31, 597–602.
  9. Eke, P.I.; Dye, B.A.; Wei, L.; Slade, G.D.; Thornton-Evans, G.O.; Borgnakke, W.S.; Taylor, G.W.; Page, R.C.; Beck, J.D.; Genco, R.J. Update on prevalence of periodontitis in adults in the United States: NHANES 2009 to 2012. J. Periodontol. 2015, 86, 611–622.
  10. Genco, R.J.; Borgnakke, W.S. Risk factors for periodontal disease. Periodontology 2000 2013, 62, 59–94.
  11. Takeshita, T.; Matsuo, K.; Furuta, M.; Shibata, Y.; Fukami, K.; Shimazaki, Y.; Akifusa, S.; Han, D.H.; Kim, H.D.; Yokoyama, T.; et al. Distinct composition of the oral indigenous microbiota in South Korean and Japanese adults. Sci. Rep. 2014, 4, 6990.
  12. Vinesh, E.; Masthan, K.; Kumar, M.S.; Jeyapriya, S.M.; Babu, A.; Thinakaran, M. A clinicopathologic study of oral changes in gastroesophageal reflux disease, gastritis, and ulcerative colitis. J. Contemp. Dent. Pract. 2016, 17, 943–947.
  13. Fill Malfertheiner, S.; Malfertheiner, M.V.; Mönkemüller, K.; Röhl, F.W.; Malfertheiner, P.; Costa, S.D. Gastroesophageal reflux disease and management in advanced pregnancy: A prospective survey. Digestion 2009, 79, 115–120.
  14. Richter, J.E.; Rubenstein, J.H. Presentation and epidemiology of gastroesophageal reflux disease. Gastroenterology 2018, 154, 267–276.
  15. Jung, H.K.; Tae, C.H.; Song, K.H.; Kang, S.J.; Park, J.K.; Gong, E.J.; Shin, J.E.; Lim, H.C.; Lee, S.K.; Jung, D.H.; et al. 2020 Seoul consensus on the diagnosis and management of gastroesophageal reflux disease. J. Neurogastroenterol. Motil. 2021, 27, 453–481.
  16. Lee, K.S.; Song, I.S.; Kim, E.S.; Ahn, K.H. Determinants of spontaneous preterm labor and birth including gastroesophageal reflux disease and periodontitis. J. Korean Med. Sci. 2020, 35, e105.
  17. Lee, K.S.; Kim, E.S.; Kim, D.Y.; Song, I.S.; Ahn, K.H. Association of gastroesophageal reflux disease with preterm birth: Machine learning analysis. J. Korean Med. Sci. 2021, 36, e282.
  18. Meis, P.J.; Michielutte, R.; Peters, T.J.; Wells, H.B.; Sands, R.E.; Coles, E.C.; Johns, K.A. Factors associated with preterm birth in Cardiff, Wales. II. Indicated and spontaneous preterm birth. Am. J. Obstet. Gynecol. 1995, 173, 597–602.
  19. Goldenberg, R.L.; Culhane, J.F.; Iams, J.D.; Romero, R. Epidemiology and causes of preterm birth. Lancet 2008, 371, 75–84.
  20. Fairley, L.; Leyland, A.H. Social class inequalities in perinatal outcomes: Scotland 1980–2000. J. Epidemiol. Community Health 2006, 60, 31–36.
  21. Wang, Y.A.; Sullivan, E.A.; Black, D.; Dean, J.; Bryant, J.; Chapman, M. Preterm birth and low birth weight after assisted reproductive technology-related pregnancy in Australia between 1996 and 2000. Fertil. Steril. 2005, 83, 1650–1658.
  22. Wang, J.X.; Norman, R.J.; Kristiansson, P. The effect of various infertility treatments on the risk of preterm birth. Hum. Reprod. 2002, 17, 945–949.
  23. Sunkara, S.K.; Antonisamy, B.; Redla, A.C.; Kamath, M.S. Female causes of infertility are associated with higher risk of preterm birth and low birth weight: Analysis of 117 401 singleton live births following IVF. Hum. Reprod. 2021, 36, 676–682.
  24. Lee, K.S.; Song, I.S.; Kim, E.S.; Kim, H.I.; Ahn, K.H. Association of preterm birth with medications: Machine learning analysis using national health insurance data. Arch. Gynecol. Obstet. 2022, 305, 1369–1376.
  25. Dole, N.; Savitz, D.A.; Hertz-Picciotto, I.; Siega-Riz, A.M.; McMahon, M.J.; Buekens, P. Maternal stress and preterm birth. Am. J. Epidemiol. 2003, 157, 14–24.
  26. Khanam, R.; Applegate, J.; Nisar, I.; Dutta, A.; Rahman, S.; Nizar, A.; Ali, S.M.; Chowdhury, N.H.; Begum, F.; Dhingra, U.; et al. Burden and risk factors for antenatal depression and its effect on preterm birth in South Asia: A population-based cohort study. PLoS ONE 2022, 17, e0263091.
  27. Liu, C.; Cnattingius, S.; Bergstrom, M.; Ostberg, V.; Hjern, A. Prenatal parental depression and preterm birth: A national cohort study. BJOG 2016, 123, 1973–1982.
  28. Kim, Y.J.; Lee, B.E.; Park, H.S.; Kang, J.G.; Kim, J.O.; Ha, E.H. Risk factors for preterm birth in Korea: A multicenter prospective study. Gynecol. Obstet. Investig. 2005, 60, 206–212.
  29. Varner, M.W.; Esplin, M.S. Current understanding of genetic factors in preterm birth. BJOG 2005, 112 (Suppl. S1), 28–31.
  30. Couceiro, J.; Matos, I.; Mendes, J.J.; Baptista, P.V.; Fernandes, A.R.; Quintas, A. Inflammatory factors, genetic variants, and predisposition for preterm birth. Clin. Genet. 2021, 100, 357–367.
  31. Bengtson, M.B.; Solberg, I.C.; Aamodt, G.; Jahnsen, J.; Moum, B.; Vatn, M.H. Relationships between inflammatory bowel disease and perinatal factors: Both maternal and paternal disease are related to preterm birth of offspring. Inflamm. Bowel Dis. 2010, 16, 847–855.
  32. Bröms, G.; Granath, F.; Stephansson, O.; Kieler, H. Preterm birth in women with inflammatory bowel disease—The association with disease activity and drug treatment. Scand. J. Gastroenterol. 2016, 51, 1462–1469.
  33. Cornish, J.; Tan, E.; Teare, J.; Teoh, T.G.; Rai, R.; Clark, S.K.; Tekkis, P.P. A meta-analysis on the influence of inflammatory bowel disease on pregnancy. Gut 2007, 56, 830–837.
  34. Nasef, N.A.; Ferguson, L.R. Inflammatory bowel disease and pregnancy: Overlapping pathways. Transl. Res. 2012, 160, 65–83.
Figure 1. Random forest variable importance: oversampling vs. no sampling for PTB 4. (A) Oversampling; (B) No sampling. Based on random forest variable importance with oversampled data in (A), PTB 4 has strong associations with socioeconomic status (0.284), age (0.214), year 2014 GERD (0.026), year 2015 GERD (0.026), year 2013 GERD (0.024), progesterone (0.024), year 2012 GERD (0.023), year 2011 GERD (0.021), tricyclic antidepressant (0.020), and year 2016 infertility (0.019). For example, the accuracy of the model will decrease by 28.4%, 2.6%, or 1.9% if the values of socioeconomic status, year 2014 GERD, or year 2016 infertility are randomly permuted (shuffled). It should be noted that the random forest variable importance measures for oversampled data were very similar to those for the original data in (B). GERD = gastroesophageal reflux disease, IBS = irritable bowel syndrome, SES = socioeconomic status, TCA = tricyclic antidepressant.
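The permutation reading of variable importance given in the Figure 1 caption (the drop in accuracy when one variable's values are shuffled) can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy data, the rule-based `toy_model` standing in for a random forest, and the function names are all assumptions made for illustration only.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return hits / len(labels)

def permutation_importance(model, rows, labels, n_features, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other column untouched.
            column = [row[j] for row in rows]
            rng.shuffle(column)
            shuffled = [row[:j] + (v,) + row[j + 1:] for row, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances

# Hypothetical toy data: feature 0 determines the label, feature 1 is pure noise.
data_rng = random.Random(42)
rows = [(data_rng.random(), data_rng.random()) for _ in range(500)]
labels = [1 if x0 > 0.5 else 0 for x0, _ in rows]

def toy_model(row):
    """Stand-in for a fitted classifier (assumption: not a random forest)."""
    return 1 if row[0] > 0.5 else 0

baseline, importances = permutation_importance(toy_model, rows, labels, n_features=2)
print("baseline accuracy:", baseline)
print("accuracy drop per feature:", importances)
```

In this sketch, shuffling the informative feature costs roughly half of the accuracy, while shuffling the noise feature costs nothing. That is the same kind of contrast the caption draws between socioeconomic status and the lower-ranked variables.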
Table 1. Model performance.
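| Metric | Sampling | Model | PTB 1 | PTB 2 | PTB 3 | PTB 4 |
|---|---|---|---|---|---|---|
| Accuracy | No Sampling | Logistic Regression | 0.9827 | 0.9932 | 0.9781 | 0.9772 |
| Accuracy | No Sampling | Random Forest | 0.9818 | 0.9927 | 0.9767 | 0.9758 |
| Accuracy | Oversampling | Logistic Regression | 0.5445 | 0.6025 | 0.5551 | 0.5582 |
| Accuracy | Oversampling | Random Forest | 0.8599 | 0.8403 | 0.8403 | 0.8403 |
| AUC | No Sampling | Logistic Regression | 0.5000 | 0.5000 | 0.5000 | 0.5000 |
| AUC | No Sampling | Random Forest | 0.5022 | 0.5027 | 0.5023 | 0.5023 |
| AUC | Oversampling | Logistic Regression | 0.8403 | 0.6019 | 0.5539 | 0.5572 |
| AUC | Oversampling | Random Forest | 0.8404 | 0.8404 | 0.8404 | 0.8404 |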
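The contrast in Table 1 between high accuracy and an AUC near 0.5 without sampling is the pattern expected when a rare outcome is mostly predicted as absent. The sketch below reproduces that pattern on made-up labels and shows simple random oversampling of the minority class. The 2% positive rate, the always-negative classifier, and the helper names are assumptions for illustration, not values or methods taken from the study.

```python
import random

def accuracy(y_true, y_pred):
    """Share of cases where the predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def random_oversample(y_true, seed=0):
    """Return row indices that repeat minority-class cases until both classes have equal size."""
    rng = random.Random(seed)
    pos_idx = [i for i, t in enumerate(y_true) if t == 1]
    neg_idx = [i for i, t in enumerate(y_true) if t == 0]
    minority, majority = sorted([pos_idx, neg_idx], key=len)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority + minority + extra

# Hypothetical labels: 20 positives among 1000 cases (2%), chosen only for illustration.
y = [1] * 20 + [0] * 980

# A classifier that always predicts "no event" and gives every case the same score.
acc_raw = accuracy(y, [0] * len(y))
auc_raw = auc(y, [0.0] * len(y))

# After oversampling, the same classifier is no longer rewarded for ignoring positives.
idx = random_oversample(y)
y_bal = [y[i] for i in idx]
acc_bal = accuracy(y_bal, [0] * len(y_bal))

print("no sampling:  accuracy =", acc_raw, "AUC =", auc_raw)
print("oversampling: accuracy =", acc_bal, "n =", len(y_bal))
```

On the unbalanced labels, the uninformative classifier scores high on accuracy but only 0.5 on AUC. On the balanced labels, its accuracy falls to chance level, which is why accuracy measured on oversampled data is a more demanding yardstick.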
Table 2. Random forest variable importance-oversampling.
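| Rank | PTB 1 | Importance | PTB 2 | Importance | PTB 3 | Importance | PTB 4 | Importance |
|---|---|---|---|---|---|---|---|---|
| 1 | SES | 0.284 | SES | 0.274 | SES | 0.281 | SES | 0.284 |
| 2 | Age | 0.227 | Age | 0.198 | Age | 0.218 | Age | 0.215 |
| 3 | GERD_2013 | 0.025 | Infertility_2016 | 0.038 | GERD_2014 | 0.025 | GERD_2014 | 0.026 |
| 4 | GERD_2015 | 0.025 | Benzodiazepine | 0.034 | GERD_2015 | 0.025 | GERD_2015 | 0.026 |
| 5 | GERD_2012 | 0.024 | GERD_2014 | 0.025 | Progesterone | 0.024 | GERD_2013 | 0.024 |
| 6 | Progesterone | 0.023 | GERD_2015 | 0.025 | GERD_2012 | 0.023 | Progesterone | 0.024 |
| 7 | TCA | 0.022 | GERD_2016 | 0.025 | GERD_2013 | 0.023 | GERD_2012 | 0.023 |
| 8 | GERD_2014 | 0.021 | GERD_2013 | 0.024 | GERD_2010 | 0.020 | GERD_2011 | 0.021 |
| 9 | Benzodiazepine | 0.020 | GERD_2012 | 0.022 | Benzodiazepine | 0.019 | TCA | 0.020 |
| 10 | GERD_2011 | 0.017 | GERD_2011 | 0.021 | GERD_2016 | 0.019 | Infertility_2016 | 0.019 |
| 11 | GERD_2016 | 0.016 | INFE_2015 | 0.021 | GERD_2011 | 0.018 | GERD_2010 | 0.017 |
| 12 | Infertility_2015 | 0.015 | GERD_2010 | 0.020 | Infertility_2016 | 0.018 | GERD_2016 | 0.017 |
| 13 | GERD_2008 | 0.014 | Progesterone | 0.019 | TCA | 0.018 | Benzodiazepine | 0.016 |
| 14 | GERD_2009 | 0.014 | TCA | 0.019 | GERD_2009 | 0.016 | GERD_2009 | 0.016 |
| 15 | GERD_2010 | 0.014 | GERD_2008 | 0.013 | Infertility_2015 | 0.013 | Infertility_2015 | 0.014 |
| 16 | IBS_2006 | 0.013 | GERD_2007 | 0.012 | GERD_2007 | 0.012 | GERD_2007 | 0.012 |
| 17 | GERD_2007 | 0.012 | GERD_2009 | 0.012 | IBS_2005 | 0.012 | GERD_2008 | 0.012 |
| 18 | IBS_2005 | 0.012 | IBS_2005 | 0.012 | GERD_2008 | 0.011 | IBS_2006 | 0.012 |
| 19 | Infertility_2016 | 0.012 | IBS_2006 | 0.012 | IBS_2002 | 0.010 | GERD_2006 | 0.011 |
| 20 | GERD_2006 | 0.010 | Sleeping Pill | 0.012 | IBS_2003 | 0.010 | IBS_2004 | 0.011 |
| 21 | IBS_2004 | 0.010 | IBS_2004 | 0.011 | IBS_2004 | 0.010 | IBS_2005 | 0.011 |
| 22 | Infertility_2014 | 0.010 | Infertility_2014 | 0.011 | IBS_2006 | 0.010 | IBS_2002 | 0.009 |
| 23 | Sleeping Pill | 0.010 | IBS_2002 | 0.009 | Infertility_2014 | 0.010 | IBS_2003 | 0.009 |
| 24 | IBS_2002 | 0.009 | IBS_2003 | 0.009 | Sleeping Pill | 0.010 | Infertility_2014 | 0.009 |
| 25 | IBS_2003 | 0.009 | Infertility_2013 | 0.008 | GERD_2006 | 0.009 | Sleeping Pill | 0.008 |
| 26 | GERD_2005 | 0.007 | GERD_2006 | 0.007 | GERD_2005 | 0.007 | GERD_2005 | 0.007 |
| 27 | GERD_2004 | 0.006 | GERD_2004 | 0.006 | GERD_2004 | 0.006 | GERD_2004 | 0.006 |
| 28 | Infertility_2013 | 0.006 | GERD_2005 | 0.006 | Infertility_2013 | 0.006 | Infertility_2013 | 0.006 |
| 29 | GERD_2002 | 0.005 | Infertility_2012 | 0.005 | GERD_2002 | 0.005 | GERD_2002 | 0.005 |
| 30 | GERD_2003 | 0.005 | GERD_2002 | 0.004 | GERD_2003 | 0.005 | GERD_2003 | 0.005 |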
GERD = gastroesophageal reflux disease, IBS = irritable bowel syndrome, SES = Socioeconomic Status, TCA = Tricyclic Antidepressant.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Song, I.-S.; Choi, E.-S.; Kim, E.S.; Hwang, Y.; Lee, K.-S.; Ahn, K.H. Associations of Preterm Birth with Dental and Gastrointestinal Diseases: Machine Learning Analysis Using National Health Insurance Data. Int. J. Environ. Res. Public Health 2023, 20, 1732. https://doi.org/10.3390/ijerph20031732

