**Taiyue Jin 1, Jiyoung Youn 1, An Na Kim 1, Moonil Kang 2, Kyunga Kim 3,4, Joohon Sung <sup>5</sup> and Jung Eun Lee 1,6,\***


Received: 2 July 2020; Accepted: 23 July 2020; Published: 26 July 2020

**Abstract:** Habitual coffee consumption and its association with health outcomes may be modified by genetic variation. Adults aged 40 to 69 years who participated in the Korea Association Resource (KARE) study were included in this study. We conducted a genome-wide association study (GWAS) on coffee consumption in 7868 Korean adults, and examined whether the association between coffee consumption and the risk of prediabetes and type 2 diabetes combined was modified by the genetic variations in 4054 adults. In the GWAS for coffee consumption, a total of five single nucleotide polymorphisms (SNPs) located in 12q24.11-13 (rs2074356, rs11066015, rs12229654, rs11065828, and rs79105258) were selected and used to calculate weighted genetic risk scores. Individuals who had a larger number of minor alleles for these five SNPs had higher genetic risk scores. Multivariate logistic regression models were used to estimate the odds ratios (ORs) and 95% confidence intervals (95% CIs) to examine the association. During the 12 years of follow-up, a total of 2468 (60.9%) and 480 (11.8%) participants were diagnosed as prediabetes or type 2 diabetes, respectively. Compared with non-black-coffee consumers, the OR (95% CI) for ≥2 cups/day by black-coffee consumers was 0.61 (0.38–0.95; *p* for trend = 0.023). Similarly, sugared coffee showed an inverse association. We found a potential interaction by the genetic variations related to black-coffee consumption, suggesting a stronger association among individuals with higher genetic risk scores compared to those with lower scores; the ORs (95% CIs) were 0.36 (0.15–0.88) for individuals with 5 to 10 points and 0.87 (0.46–1.66) for those with 0 points. Our study suggests that habitual coffee consumption was related to genetic polymorphisms and modified the risk of prediabetes and type 2 diabetes combined in a sample of the Korean population. The mechanisms between coffee-related genetic variation and the risk of prediabetes and type 2 diabetes combined warrant further investigation.

**Keywords:** coffee consumption; type 2 diabetes; prediabetes; genome-wide association analysis (GWAS); single nucleotide polymorphism (SNP)

#### **1. Introduction**

The incidence rate and prevalence of type 2 diabetes have steadily increased in Asian populations. The International Diabetes Federation (IDF) estimated that 163 million people (35.2% of global diabetic population) in the Western Pacific region had prevalent type 2 diabetes in 2019 [1], contributing the most to type 2 diabetes in the world. In Korea, the prevalence of type 2 diabetes has increased from 6.9% in 1998 to 10.8% in 2017 [2,3]. Additionally, type 2 diabetes contributed to 17.1% of the total deaths in Korea in 2018 [4].

Coffee consumption has been suggested to lower several chronic diseases, including type 2 diabetes [5], metabolic syndrome [6], coronary heart disease [7], liver disorders [8], and several types of cancers [9]. The bioactive compounds in coffee, such as caffeine and chlorogenic acids, have been investigated as potential compounds that lower the risk of type 2 diabetes. Caffeine has been shown to stimulate the metabolic rate [10,11], and its thermogenic effect has been hypothesized to decrease the risk of metabolic disease development. Antioxidants, including chlorogenic acids, commonly found in coffee, have also been highlighted as a preventing factor for type 2 diabetes by inhibiting the generation of free radicals and removing hyperglycemia-induced oxidative stress [12–14].

A heritability study on caffeine [15] and genome-wide association studies (GWASs) have suggested that coffee consumption behavior may be linked to genetic polymorphisms. The first genome-wide meta-analysis of coffee consumption was conducted in a European population and identified two independent loci, rs4410790 nears *AHR* and rs2470893 between *CYP1A1* and *CYP1A2*, which are caffeine metabolism-related genes [16]. Additional European/Caucasian GWASs also discovered single nucleotide polymorphisms (SNPs) located in *AHR*, *CYP1A1* and *CYP1A2* as well as *ABCG2*, *POR*, *BDNF*, *SLC6A4*, *GCKR* and *MLXIPL* [17], near *NRCAM* or *ULK3* [18]. A Japanese GWAS, the first GWAS of coffee consumption in Asia, identified 24 SNPs on chromosome 12, showing rs2074356 in *HECTD4* as the strongest significant SNP [19]. Another Japanese GWAS found two loci located in *CUX2* (rs7910258) and *AHR* (rs10251701) [20]. The few GWASs in Asia warrant further investigation of the coffee-related genetic polymorphisms in Asian populations because there has been an increase in coffee consumption and type 2 diabetes in Asia. Because coffee consumption has the potential to prevent type 2 diabetes [21], it is important to investigate whether coffee consumption is linked to a lower risk of type 2 diabetes and whether this association is modified by genetic variations common in Asian populations.

The objective of this study was to identify genetic polymorphisms associated with habitual coffee consumption and examine whether the association between coffee consumption and the risk of prediabetes and type 2 diabetes combined was modified by these genetic variants in Korean population.

#### **2. Materials and Methods**

#### *2.1. Study Population*

Participants were recruited from the Korea Association Resource (KARE) study, which is part of the Korean Genome and Epidemiology Study (KoGES), a community-based cohort study. A total of 5012 and 5018 participants aged 40–69 years were enrolled from Ansan and Ansung, respectively, between 2001 and 2002. Socio-demographic status, anthropometric indices, dietary lifestyle, physical activity, and disease history were examined at baseline and follow-up phase every 2 years. In this study, we included data from baseline to the 6th follow-up (2001–2014). The details of the KoGES project have been described elsewhere [22].

Among the 10,030 participants of this study, DNA samples from a total of 8840 participants were genotyped at baseline. We excluded participants who did not have genetic information (*n* = 1190); those who were diagnosed with a type 2 diabetes by physicians or had been treated with oral hypoglycemic medication or insulin therapy at baseline (*n* = 619); those who had a history of cardiovascular disease including myocardial infarction and stroke (*n* = 150) or cancer (*n* = 20) at baseline; and those who did not provide daily coffee consumption frequency (*n* = 129) or amount

(*n* = 54). In summary, 7868 participants (3685 men and 4183 women) were included in the GWAS (Table S1).

When we examined the association between habitual coffee consumption and the incident risk of prediabetes and type 2 diabetes combined, we further excluded participants who had prevalent type 2 diabetes (*n* = 666) or prediabetes (*n* = 2099) determined by the fasting plasma glucose (FPG) test, a 2-h oral glucose tolerance test (OGTT) and a hemoglobin A1c (HbA1c) test at baseline. We also excluded those who attended neither fifth nor sixth follow-up examination (*n* = 1049). As a result, a total of 4054 participants (1904 men and 2150 women) were included in the association analysis. This study was approved by the Institutional Review Board of Seoul National University (IRB No. E1911/003-011).

#### *2.2. Dietary Assessment*

A 103-item semi-quantitative food frequency questionnaire (FFQ) for the KoGES was used to assess the habitual coffee consumption at baseline. The validity of the FFQ was evaluated by 3-day diet records [23,24]. Participants were asked to answer the frequency of coffee consumption over the last year (almost none, 1 time/month, 2–3 times/month, 1–2 times/week, 3–4 times/week, 5–6 times/week, 1 time/day, 2 times/day or 3 times/day) as well as the amount of coffee (0.5, 1 or 2 cups per time). The amount of sugar added in coffee was also investigated (1, 2 or 3 teaspoons per time). We multiplied the frequency and the amount of the daily consumption of coffee to obtain the number of cups consumed per day. Coffee consumption was categorized into non-coffee consumers, <1 cup/day, 1 to <2 cups/day and ≥2 cups/day. Among coffee consumers, participants who consumed coffee without sugar were defined as black-coffee consumers and those who consumed coffee with sugar as sugared-coffee consumers.

We calculated the amount of caffeine consumption of each participant by using the caffeine database published by the Korea Ministry of Food and Drug Safety (MFDS) [25] or the United States Department of Agriculture (USDA) [26] or by contacting the product companies. The foods considered to contain caffeine were as follows: beverages (coffee, tea and soft drinks) and foods made with cocoa (chocolate, chocolate candies, chocolate ice cream, chocolate milk, chocolate pies, and chocolate cakes).

#### *2.3. Ascertainment of Cases*

Participants underwent a 2-h 75 g OGTT and HbA1c test at baseline and each follow-up. The concentrations of FPG and 2-h plasma glucose were measured by the hexokinase method (ADVIA 1650; Bayer, Berkley, MI, USA). Incident cases of prediabetes and type 2 diabetes were defined according to the American Diabetes Association (ADA) criteria [27]. Incident type 2 diabetes was ascertained if a participant had one of the following during the follow-up examination: had been diagnosed with a type 2 diabetes by physicians or had been treated with oral hypoglycemic medication or insulin therapy; had an FPG ≥ 126 mg/dL (≥7.0 mmol/L); had a 2-h plasma glucose ≥ 200 mg/dL (≥11.1 mmol/L); or had a HbA1c ≥ 6.5%. Prediabetes incidence was defined as an FPG ≥ 100 mg/dL (≥5.6 mmol/L) or 2-h plasma glucose ≥ 140 mg/dL (≥7.8 mmol/L) or HbA1c ≥ 6.0%.

#### *2.4. Genotyping and Quality Control*

DNA samples were genotyped using the Affymetrix Genome-Wide Human SNP Array 5.0 (Affymetrix, Santa Clara, CA, USA). The Bayesian Robust Linear Modeling using Mahalanobis Distance (BRLMM) Genotyping Algorithm was used for the genotype calling of 500,568 SNPs [28]. After quality control filtering, a total of 352,228 SNPs remained for analysis. Details about quality control criteria have been described elsewhere [29].

#### *2.5. GWAS on Co*ff*ee Consumption*

We identified SNPs related to habitual coffee consumption in the GWAS. Participants were grouped into non-coffee consumers and coffee consumers, and this group information was used as a binary outcome variable in a logistic regression model for each SNP. We adjusted for age (years, continuous), sex, and alcohol consumption (g/day, continuous) in the GWAS. We also adjusted for baseline BMI in the secondary GWAS. In addition, we replicated a GWAS using a continuous variable of coffee consumption. A GWAS on continuous caffeine intake was also conducted. A box-cox transformed coffee or caffeine intake and a linear regression model were used for continuous analysis. A Manhattan plot and quantile-quantile (Q-Q) plot were generated, and the inflation factor (λ) was calculated. The GWAS was performed using PLINK version 1.07, and SNPs with a *p*-value < 1 <sup>×</sup> 10−<sup>5</sup> were considered suggestive significant. A regional association plot of the significant SNPs was generated using the LocusZoom program (http://locuszoom.org). The linkage disequilibrium (LD) between the significant SNPs was examined using the HaploView program [30].

#### *2.6. Statistical Analysis*

To calculate the genetic risk scores, each coffee-related SNP identified from the GWAS was assigned 0, 1, or 2 according to the number of minor alleles and then weighted by its relative effective size (β coefficient obtained from the GWAS). The genetic risk scores were calculated as following the equation: 5 × (β<sup>1</sup> × SNP1 + β<sup>2</sup> × SNP2 + β<sup>3</sup> × SNP3 + β<sup>4</sup> × SNP4 + β<sup>5</sup> × SNP5)/(β<sup>1</sup> + β<sup>2</sup> + β<sup>3</sup> + β<sup>4</sup> + β5) [31]. The genetic risk scores ranged from 0 to 10 points, and the median value was 5 points among those with at least one minor allele. Participants were categorized into 3 groups: 0 point, 0.1 to <5 points, and 5 to 10 points. We examined the association between coffee consumption and the risk of prediabetes and type 2 diabetes combined in black coffee and sugared coffee. Non-coffee consumers and sugared-coffee consumers combined were regarded as reference group for black-coffee consumers. For sugared-coffee consumers, non-coffee consumers and black-coffee consumers combined were regarded as a reference group. Odds ratios (ORs) and 95% confidence intervals (CIs) for the association between coffee consumption and the risk of prediabetes and type 2 diabetes combined were calculated using multivariate logistic regression models among men and women combined or separately. The median cups of coffee consumed per day were assigned to each coffee group and used to test the linear trends. We also examined whether the association varied by genetic risk scores. The interaction analysis was performed by comparing the models with or without interaction term using a likelihood ratio test. All of the analyses were adjusted for age (years, continuous), sex (for men and women combined), body mass index (BMI, <23, 23 to <25, 25 to <30 and <sup>≥</sup>30 kg/m2), smoking status (never smokers, ≤10 and >10 pack-years for black-coffee consumers; never smokers, ≤10, 10.1 to ≤20, 20.1 to ≤30 and >30 pack-years for sugared-coffee consumers), alcohol consumption (non-drinkers, ≤5, 5.1 to ≤10 and >10 g/day for black-coffee consumers; non-drinkers, ≤5, 5.1 to ≤10, 10.1 to ≤20 and >20 g/day for sugared-coffee consumers), family history of type 2 diabetes (yes or no), and total energy intake (kcal/day, continuous). Additionally, we also adjusted for the amount of sugar added in coffee when analyzing the association between black coffee consumption and the risk of prediabetes and type 2 diabetes combined (0, ≤5, 5.1 to ≤10, 10.1 to ≤15 and >15 g/day). SAS version 9.4 (SAS Institute, Cary, NC, USA) was used for all analyses, and *p*-value < 0.05 in two-sided tests was defined as a significant difference.

#### **3. Results**

#### *3.1. Baseline Characteristics*

Of the 4054 participants in the association analysis, a total of 480 (11.8%) and 2468 (60.9%) participants developed type 2 diabetes and prediabetes during the 12-year follow-up period, respectively. Table 1 shows the baseline characteristics of the study population according to habitual coffee consumption. Compared with non-coffee consumers, participants who consumed either black coffee or sugared coffee were more likely to be younger, smoke, drink alcohol, and have a higher BMI.


**Table 1.** Baseline characteristics of study population according to habitual coffee consumption.

Abbreviations: SD, standard deviation; BMI, body mass index.

#### *3.2. Genetic Polymorphisms Associated with Co*ff*ee Consumption*

In the GWAS of habitual coffee consumption, a total of 18 SNPs located in 12q24 achieved a suggestive significance (*p* < 1 <sup>×</sup> 10−5) (Table S2). When we additionally adjusted for baseline BMI, the same 18 significant SNPs were identified as well (Table S3). SNPs identified based on habitual coffee consumption were the same when we additionally conducted a GWAS for continuous caffeine intake.

Figures 1 and 2 show the Manhattan plot and regional association plot, respectively. Figure 3 shows the Q-Q plot, and the genomic inflation factor (λ) of the GWAS was 1.0103, suggesting that the study population structure was well-adjusted [32]. Among the SNPs achieved suggestive significance at *p* < 1 <sup>×</sup> 10−5, we selected SNPs as follows, which were used later for calculating the genetic risk scores. First, pairwise correlations were examined and high correlation was declared when r2 > 0.8 (Figure 4). Then, we excluded all imputed SNPs having a high correlation with any genotyped SNP, and selected the SNPs among highly-correlated genotyped SNPs. As a result, three genotyped SNPs (rs2074356, rs11066015, and rs12229654) and two imputed SNPs (rs11065828 and rs79105258) were considered as coffee-related SNPs and used for calculating the genetic risk scores. Table 2 presents the information on these five SNPs related to coffee consumption. The SNPs were located in genes *HECTD4*, *ACAD10*, *MYL2*, and *CUX2*, and the minor allele frequency ranged from 0.143 to 0.172 in our population.

**Figure 1.** Manhattan plot of the genome-wide association study (GWAS) on coffee consumption. Each color represents a different chromosome. The strongest significant SNP was rs2074356 in chromosome 12 (*p*-value <sup>=</sup> 6.62 <sup>×</sup> <sup>10</sup><sup>−</sup>8).

**Figure 2.** Regional association plot of the 18 significant single nucleotide polymorphisms (SNPs) discovered from the GWAS on coffee consumption. The strongest significant SNP, rs2074356, was shown in purple, and the gray dots represent chromosomal positions of other SNPs near the rs2074356.

**Figure 3.** Quantile-quantile (Q-Q) plot of the GWAS on coffee consumption.

**Figure 4.** Linkage disequilibrium (LD) plot of the 18 significant SNPs discovered from the GWAS on coffee consumption. Values shown in the red boxes represent the LD (r2) between the SNPs.


**Table 2.** SNPs related to coffee consumption (*p*-value <sup>&</sup>lt; <sup>1</sup> <sup>×</sup> <sup>10</sup><sup>−</sup>5).

Abbreviations: Chr, chromosome; SNP, single nucleotide polymorphism; MAF, minor allele frequency. <sup>1</sup> Alleles are presented as minor allele/major allele; <sup>2</sup> minor allele frequency in this study population; <sup>3</sup> minor allele frequency in the 1000Genomes, East Asian; <sup>4</sup> the beta (β) coefficient was obtained from the GWAS; <sup>5</sup> the *p*-value was calculated using a Wald test from logistic regression model adjusted for age (years; continuous), sex, and alcohol consumption (g/day; continuous).

#### *3.3. Co*ff*ee Consumption and the Risk of Prediabetes and Type 2 Diabetes Combined*

We examined the association between black coffee consumption and the risk of prediabetes and type 2 diabetes combined (Table 3). Compared with non-black-coffee consumers, participants who consumed ≥2 cups/day of black coffee had a 39% lower risk of prediabetes and type 2 diabetes combined among men and women combined (95% CI = 0.38–0.95; *p* for trend = 0.023). When we separated men and women, compared with non-black-coffee consumers, participants who consumed ≥2 cups/day of black coffee had a 54% lower risk of prediabetes and type 2 diabetes combined among men (95% CI = 0.23–0.94; *p* for trend = 0.026). Although the association was not statistically significant among women, an inverse trend was observed (OR = 0.74; 95% CI = 0.41–1.34).


**Table 3.** Multivariate-adjusted ORs and 95% CIs for the risk of prediabetes and type 2 diabetes combined according to black coffee consumption.

Abbreviations: OR, odds ratio; 95% CI, 95% confidence interval. <sup>1</sup> For black-coffee consumers, non-coffee consumers and sugared-coffee consumers combined were regarded as reference group; <sup>2</sup> the ORs (95% CIs) were adjusted for age (years, continuous), sex (for men and women combined), body mass index (BMI, <23, 23 to <25, 25 to <30 and ≥30 kg/m2), smoking status (never smokers, ≤10 and <sup>&</sup>gt;10 pack-years), alcohol consumption (non-drinkers, ≤5, 5.1 to ≤10 and >10 g/day), family history of type 2 diabetes (yes or no), total energy intake (kcal/day, continuous), and the amount of sugar added in coffee (0, ≤5, 5.1 to ≤10, 10.1 to ≤15 and >15 g/day).

Table 4 presents the association between sugared coffee consumption and the risk of prediabetes and type 2 diabetes combined. The ORs (95% CIs) of prediabetes and type 2 diabetes combined comparing sugared-coffee consumers with non-sugared-coffee consumers were 0.73 (0.60–0.89; *p* for trend = 0.005) for men and women combined, 0.71 (0.52–0.97; *p* for trend = 0.015) for men, and 0.75 (0.57–0.99; *p* for trend = 0.080) for women.

**Table 4.** Multivariate-adjusted ORs and 95% CIs for the risk of prediabetes and type 2 diabetes combined according to sugared coffee consumption.


Abbreviations: OR, odds ratio; 95% CI, 95% confidence interval. <sup>1</sup> For sugared-coffee consumers, non-coffee consumers and black-coffee consumers combined were regarded as reference group; <sup>2</sup> the ORs (95% CIs) were adjusted for age (years, continuous), sex (for men and women combined), body mass index (BMI, <23, 23 to <25, 25 to <sup>&</sup>lt;30 and ≥30 kg/m2), smoking status (never smokers, ≤10, 10.1 to ≤20, 20.1 to ≤30 and <sup>&</sup>gt;30 pack-years), alcohol consumption (non-drinkers, ≤5, 5.1 to ≤10, 10.1 to ≤20 and >20 g/day), family history of type 2 diabetes (yes or no), and total energy intake (kcal/day, continuous).

### *3.4. Associations Modified by Genetic Risk Scores*

We analyzed whether the association between habitual coffee consumption and the risk of prediabetes and type 2 diabetes combined varied by genetic polymorphisms. For black-coffee consumers, we found that an inverse association was more pronounced among individuals with high genetic risk scores, but the interaction was not statistically significant (*p* for interaction = 0.261) (Table 5). The ORs

(95% CIs) for ≥2 cups/day of black coffee vs. non-black-coffee consumers were 0.87 (0.46–1.66) for participants with 0 points, 0.49 (0.17–1.44) for those with 0.1 to <5 points, and 0.36 (0.15–0.88) for those with 5 to 10 points for their genetic risk scores. For sugared coffee consumption, the associations with the risk of prediabetes and type 2 diabetes combined were similar across genetic risk scores (*p* for interaction = 0.608) (Table 6). When we conducted an interaction analysis for each coffee-related SNP for black coffee consumption, the association between black coffee consumption and the risk of prediabetes and type 2 diabetes combined was inverse for the minor allele of each SNP (Table 7).

**Table 5.** Multivariate-adjusted ORs and 95% CIs for the risk of prediabetes and type 2 diabetes combined by genetic risk scores according to black coffee consumption.


Abbreviations: OR, odds ratio; 95% CI, 95% confidence interval. <sup>1</sup> Genetic risk scores were calculated by 5 SNPs related to coffee consumption weighted by relative effect size (β coefficient); <sup>2</sup> for black-coffee consumers, non-coffee consumers and sugared-coffee consumers combined were regarded as reference group; <sup>3</sup> the ORs (95% CIs) were adjusted for age (years, continuous), sex (for men and women combined), body mass index (BMI, <23, 23 to <25, 25 to <sup>&</sup>lt;30 and ≥30 kg/m2), smoking status (never smokers, ≤10 and <sup>&</sup>gt;10 pack-years), alcohol consumption (non-drinkers, ≤5, 5.1 to ≤10 and >10 g/day), family history of type 2 diabetes (yes or no), total energy intake (kcal/day, continuous), and the amount of sugar added in coffee (0, ≤5, 5.1 to ≤10, 10.1 to ≤15 and >15 g/day).

**Table 6.** Multivariate-adjusted ORs and 95% CIs for the risk of prediabetes and type 2 diabetes combined by genetic risk scores according to sugared coffee consumption.


Abbreviations: OR, odds ratio; 95% CI, 95% confidence interval. <sup>1</sup> Genetic risk scores were calculated by 5 SNPs related to coffee consumption weighted by relative effect size (β coefficient); <sup>2</sup> for sugared-coffee consumers, non-coffee consumers and black-coffee consumers combined were regarded as reference group; <sup>3</sup> the ORs (95% CIs) were adjusted for age (years, continuous), sex (for men and women combined), body mass index (BMI, <23, 23 to <25, 25 to <30 and ≥30 kg/m2), smoking status (never smokers, ≤10, 10.1 to ≤20, 20.1 to ≤30 and >30 pack-years), alcohol consumption (non-drinkers, ≤5, 5.1 to ≤10, 10.1 to ≤20 and >20 g/day), family history of type 2 diabetes (yes or no), and total energy intake (kcal/day, continuous).


**Table 7.** Multivariate-adjusted ORs and 95% CIs for the risk of prediabetes and type 2 diabetes combined by 5 coffee-related SNPs according to black coffee consumption.

Abbreviations: OR, odds ratio; 95% CI, 95% confidence interval. <sup>1</sup> For black-coffee consumers, non-coffee consumers and sugared-coffee consumers combined were regarded as reference group; <sup>2</sup> the ORs (95% CIs) were adjusted for age (years, continuous), sex (for men and women combined), body mass index (BMI, <23, 23 to <25, 25 to <30 and ≥30 kg/m2), smoking status (never smokers, ≤10 and <sup>&</sup>gt;10 pack-years), alcohol consumption (non-drinkers, ≤5, 5.1 to ≤10 and >10 g/day), family history of type 2 diabetes (yes or no), total energy intake (kcal/day, continuous), and the amount of sugar added in coffee (0, ≤5, 5.1 to ≤10, 10.1 to ≤15 and >15 g/day.

#### **4. Discussion**

Our first GWAS of coffee consumption in the Korean population identified five SNPs (rs2074356 in *HECTD4*, rs11066015 in *ACAD10*, rs12229654 in *MYL2*, rs11065828 and rs79105258 in *CUX2*) related to habitual coffee consumption. Compared with non-coffee consumers, the risk of prediabetes and type 2 diabetes being combined was inversely associated with habitual coffee consumption, either black coffee or sugared coffee. Individuals with black coffee consumption had a lower risk of prediabetes and type 2 diabetes combined compared with non-black-coffee consumers among those with multiple minor alleles for these five SNPs.

The significant SNPs discovered to be related to habitual coffee consumption in our GWAS were all introns. Although the introns were noncoding regions of genes, they may affect the transcription rate and translation efficiency, further regulating gene expression [33]. Recent GWASs have identified several loci on the *AHR*, *CYP1A1*, and *CYP1A2* genes associated with coffee consumption [16,17]. However, most of the GWASs on coffee consumption were conducted in European populations, only two GWASs, to our knowledge, reported SNPs related to coffee consumption in Asian populations. A GWAS in the Japan Multi-Institutional Collaborative Cohort (J-MICC) study, the first GWAS on coffee consumption in Asia, found that rs2074356 located in 12q24 was most strongly associated with habitual coffee consumption (*<sup>p</sup>* <sup>=</sup> 2.2 <sup>×</sup> <sup>10</sup><sup>−</sup>6) [19]. Similarly, in our GWAS, a total of 18 SNPs were associated with coffee consumption at *<sup>p</sup>* <sup>&</sup>lt; <sup>1</sup> <sup>×</sup> <sup>10</sup><sup>−</sup>5, all of which were found to be related to habitual coffee consumption in the J-MICC study, and the strongest significant variant was rs2074356 as well. Another Japanese coffee GWAS identified two independent loci (rs79105258 in 12q24 and rs10252701 in 7p21) that were associated with coffee consumption [20]. rs79105258 was also selected as a coffee-related variant in our study. In addition to the association with habitual coffee consumption, these SNPs were associated with type 2 diabetes [34], blood glucose levels [35], blood pressure levels [36], and obesity [37]. rs2074356 in *HECTD4* was associated with prevalent type 2 diabetes and blood glucose level in a Korean population [34,35]. Three SNPs (rs12229654, rs11066015 and rs2074356) were also identified to be linked with both systolic and diastolic blood pressure in a Japanese GWAS [36].

Previous epidemiologic studies have shown that coffee consumption was inversely associated with the risk of type 2 diabetes. A meta-analysis of 28 cohort and nested case-control studies reported that participants who consumed 5 cups/day of coffee had a 30% lower risk of type 2 diabetes compared with almost non-consumers, and the associations were similar between men and women [5]. Although we found a stronger inverse association among men than among women, further investigation is needed to explore a larger amount of coffee consumption, e.g., 3 or more cups/day, and whether the inverse association holds for women.

The lower risk of type 2 diabetes linked to coffee consumption could be linked to several biological mechanisms. As a main polyphenolic compound in coffee, chlorogenic acids have been shown as inhibitors of hepatic glucose-6-phosphatase, the rate-limiting enzyme of glucose hydrolysis [38]. Reduced hepatic glucose-6-phosphatase may affect the glucose output and thus decrease the blood glucose concentration. In addition, chlorogenic acids act as antioxidants to lower oxidative stress shown in both in vitro and in vivo studies [39]. Additionally, caffeine and magnesium, both of which are commonly found in coffee, have been suggested to have roles in type 2 diabetes prevention by improving insulin resistance. Previous studies suggested that caffeine could improve insulin resistance by stimulating insulin secretion from pancreatic β cells [40]. In addition to insulin secretion, caffeine increases thermogenesis, lipolysis, and β-oxidation [41]. Magnesium supplementation has reduced the development of type 2 diabetes and improved glucose disposal in experimental studies [42,43], and cohort studies have reported a significant inverse association between magnesium intake and type 2 diabetes risk [44].

In this study, we observed that both black coffee and sugared coffee decreased the risk of prediabetes and type 2 diabetes combined for individuals who consumed more than 2 cups of coffee per day. But participants who consumed 1 to <2 cups/day of sugared coffee had a 45% higher risk of prediabetes and type 2 diabetes combined among men alone, but not among women. Although sugar-sweetened beverage intake including carbonated beverages has been positively associated with the risk of type 2 diabetes [45,46], the results of coffee with sugar remained equivocal. In a French cohort study, compared to non-coffee consumers, the incident type 2 diabetes decreased by 40% and 31% among participants consuming more than 1.1 cups of coffee per lunch with or without sugar, respectively [47]. In a small clinical trial, where eight lean, young and healthy adults drank six types of beverages or water 1 h before a potato-based meal, postprandial hyperglycemia, an early abnormality of type 2 diabetes [48], was significantly reduced when they drank sweetened coffee before their meals [49]. Further studies are needed to examine whether the benefit of coffee remains even after adding a small amount of sugar.

To our knowledge, this study is the first GWAS of habitual coffee consumption in a Korean population. The strengths of this study include good ascertainment of prediabetes and type 2 diabetes, adjustment for potential confounding factors, and a 12-year follow-up. The incidence of prediabetes and type 2 diabetes were identified based on the circulating levels of FPG, 2-h plasma glucose, and HbA1c, which could minimize the misclassification. Adjustment for potential confounding factors, including smoking status and alcohol consumption, may enable us to remove the effect of the confounding factors. However, we cannot rule out the possibility that residual confounding factors remained. There are several limitations to our study. First, the rate of revisits to the clinic for the blood draw decreased to 60% in the last sixth follow-up. Therefore, our study may not be representative of the full cohort of KARE study. However, the internal validity may not be impaired as we obtained relatively accurate information on the incidence of prediabetes and type 2 diabetes during the 12-year follow-up. Second, we could not distinguish caffeinated and decaffeinated coffee or boiled and filtered coffee. However, previous studies have shown that the associations between coffee and type 2 diabetes were similar by the amount of caffeine [5] or preparation methods [50]. Third, we were not able to examine high coffee consumption (e.g., 3 or more cups/day) because only a few participants consumed more than 3 cups/day. Fourth, we did not consider medicinal caffeine intake when we performed a GWAS. However, beverages may mainly contribute to daily caffeine intake in Korea.
