**Preface to "Basic Research in Endocrinology: A Modern Strategy for the Development and Technologies of Personalized Medicine"**

The first all-Russia conference with international participation "Basic Research in Endocrinology: A Modern Strategy for the Development and Technologies of Personalized Medicine"was held in Novosibirsk on 26–27 November 2020. The purpose of this conference was to disseminate the latest basic and clinical findings in the fields of etiology, clinical characteristics, and modern diagnostics and treatments of endocrine disorders among various relevant specialists. The conference was intended for practicing endocrinologists, primary care physicians, medical geneticists, pediatric endocrinologists, pediatricians, and physician–scientists. The conference included plenary sessions, specialty sessions, satellite symposia, an open competition for young scientists, and the first-in-Russia educational course for physicians: "Maturity Onset Diabetes of the Young (MODY): Molecular Genetic Determinants and a Personalized Approach to Patient Management."

The main topics included epidemiology and pathogenesis of endocrine disorders; genomic research in endocrinology; biochemical characteristics of endocrine aberrations; immunology and immunogenetics in endocrinology; cellular technologies in endocrinology; metabolomic research in endocrinology; pharmacogenetics; basic pathomorphology; high-tech care of patients with endocrine disorders; iodine-deficiency–related, autoimmune, and oncological diseases of the thyroid; modern diagnostic and therapeutic strategies for diabetes mellitus; osteoporosis and osteopenias; polyendocrinopathies; an interdisciplinary approach to the diagnosis and treatment of obesity and metabolic syndrome; hypo- and hyperparathyroidism, vitamin D; neuroendocrine disorders; reproductive health; rehabilitation of patients with endocrine disorders; health resort and spa treatments of endocrine disorders and comorbid conditions.

Scientists from Siberian cities, from Moscow, St. Petersburg, and Kazan, as well as Kazakhstan and Great Britain, delivered presentations at the conference. The conference was organized by the Institute of Internal and Preventive Medicine –a branch of the Institute of Cytology and Genetics, the Siberian Branch of the Russian Academy of Sciences (IIPM –a branch of the ICG SB RAS). The objectives and subject areas of the IIPM are basic, exploratory, and applied scientific studies in priority areas of molecular medicine and human genetics as well as safeguarding and improvement of human health, the development of health care and medical science, and preparation of advanced specialists in science and medicine.

**Yuliya I. Ragino, Mikhail I. Voevoda**

*Editors*

### *Editorial* **Basic Research in Endocrinology: A Modern Strategy for the Development and Technologies of Personalized Medicine**

**Elena Shakhtshneider 1,2 , Alla Ovsyannikova <sup>1</sup> , Oksana Rymar <sup>1</sup> , Yuliya Ragino 1,\* and Mikhail Voevoda 2,\***


The first all-Russia conference with international participation, "Basic Research in Endocrinology: A Modern Strategy for the Development and Technologies of Personalized Medicine", was held in Novosibirsk on 26–27 November 2020. The purpose of this conference was to disseminate the latest basic and clinical findings in the fields of etiology, clinical characteristics, and modern diagnostics and treatments of endocrine disorders among various relevant specialists. The conference was intended for practicing endocrinologists, primary care physicians, medical geneticists, pediatric endocrinologists, pediatricians, and physician–scientists. The conference included plenary sessions, specialty sessions, satellite symposia, an open competition for young scientists, and the first-in-Russia educational course for physicians: "Maturity Onset Diabetes of the Young (MODY): Molecular Genetic Determinants and a Personalized Approach to Patient Management." This Special Issue on "Basic Research in Endocrinology: A Modern Strategy for the Development and Technologies of Personalized Medicine" includes a review and 10 original studies about diabetes mellitus, hypothalamic norepinephrine, obesity, and osteoporosis.

Six of the special-issue articles, Samoilova et al. [1], Mustafina et al. [2], Ivanoshchuk et al.[3,4], Musina et al. [5], and Kruchinina et al. [6], focus on the detection and characterization of various types of diabetes mellitus. Samoilova et al. [1] evaluated the specificity of hippocampal spectroscopy for parameters of type 1 and type 2 diabetes mellitus (T1DM and T2DM) and cognitive dysfunction. The authors analyzed 65 T1DM patients with cognitive deficits and 20 T1DM patients without, as well as 75 T2DM patients with cognitive deficits and 20 T2DM patients without. They determined that patients with diabetes possessed an altered hippocampal metabolism, which may serve as an early predictive marker of neurometabolic changes. The main modifiable risk factors whose correction may slow down the progression of cognitive dysfunction were identified. Mustafina et al. [2] investigated the 14-year risk of T2DM and developed a risk score for T2DM in a Siberian cohort. A random population sample (males/females, 45–69 years old) was examined at baseline in 2003–2005 (Health, Alcohol, and Psychosocial Factors in Eastern Europe [HAPIEE] project, *n* = 9360, Novosibirsk) and re-examined in 2006–2008 and 2015–2017. After the exclusion of subjects with baseline T2DM, the final analysis included 7739 participants. In addition, secondary education, low physical activity, and a history of cardiovascular disease were significantly associated with T2DM in females. Ivanoshchuk et al. [3] researched the genetic characteristics of MODY. The authors analyzed 14 MODY genes in 178 patients with a MODY phenotype in Western Siberia. A multiplex ligation-dependent probe amplification analysis of DNA samples from 50 randomly selected patients without detectable mutations did not reveal large rearrangements in the MODY genes. In 38 patients (among them, 37% males) out of the 178, mutations were identified in *HNF4A*, *GCK*, *HNF1A*, and *ABCC8*. The authors of ref. [4] investigated the *APPL1* gene, which is associated with MODY 14. Thirteen variants were found in *APPL1*, three of which (rs79282761, rs138485817, and

**Citation:** Shakhtshneider, E.; Ovsyannikova, A.; Rymar, O.; Ragino, Y.; Voevoda, M. Basic Research in Endocrinology: A Modern Strategy for the Development and Technologies of Personalized Medicine. *J. Pers. Med.* **2021**, *11*, 895. https://doi.org/10.3390/ jpm11090895

Received: 3 September 2021 Accepted: 6 September 2021 Published: 8 September 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

rs11544593) were located in exons. There were no statistically significant differences in the frequencies of rs11544593 alleles and genotypes between T2DM patients and the general population. In the MODY group, AG rs11544593 genotype carriers were significantly more frequent (AG vs. AA + GG: odds ratio 1.83, confidence interval 1.15–2.90, *p* = 0.011) compared with the control group. Musina et al. [5] established relations among inflammatory status, ferrokinetics, and lipid metabolism in patients with diabetes mellitus. The discovered relations among lipid profile indices, inflammatory status, and microalbuminuria confirmed the mutual influences of hyperlipidemia, inflammation, and nephropathy in diabetes patients. Their results justify the strategy involving early hypolipidemic therapy for patients with diabetes mellitus to prevent the development and progression of microvascular complications. Kruchinina et al. [6] investigated the feasibility of a differential diagnosis of degrees of rheological disturbances in patients with T2DM by dielectrophoresis of erythrocytes. The proposed experimental approach features a low invasiveness, high productivity, shorter duration, and vividness of the results. The method allows one to evaluate not only the local (renal and ocular) but also systemic status of microcirculation using more than 20 parameters of erythrocytes.

Redina et al. [7] addressed the role of hypothalamic norepinephrine in the activation of the sympathetic nervous system. They carried out genetic mapping by quantitative trait loci analysis and identified loci associated both with an increased hypothalamic norepinephrine concentration and with an increase of the heart weight in Inherited Stress-Induced Arterial Hypertension (ISIAH) rats, a model of the stress-sensitive type of arterial hypertension. The contribution to the development of heart hypertrophy in ISIAH rats is controlled by different genetic loci, one of which is associated with hypothalamic norepinephrine concentration (on chromosome 18) and the other correlating with high blood pressure (on chromosome 1).

Kiseleva et al. [8] investigated correlations between the parameters of classic clinical blood tests and the proteomic profiles of 104 lean and obese subjects. As a result, the authors compiled patterns of the proteins whose presence or absence allowed one to predict the weight of a patient fairly well. Ragino et al. [9] studied the blood cytokine/chemokine profile of 25–44 year old patients with early ischemic heart disease comorbid with abdominal obesity. Their findings related to Flt3 ligand, granulocyte macrophage–colony stimulating factor (GM-CSF), and interleukin 4 (IL-4) are consistent with the international literature. The results of that study are partly confirmative and partly hypothesis-generating.

Mazurenko et al. [10] examined the frequency of osteoporotic forearm fractures in postmenopausal women to assess their association with risk factors of chronic noncommunicable diseases. In the studied population sample of postmenopausal women, a high total cholesterol level and a history of smoking were cross-sectional determinants of osteoporotic forearm fractures, whereas the body–mass index was a protective factor against the risk of osteoporotic fractures.

Grigor'eva [11] wrote a review about the relation of gallstone disease, obesity, and the Firmicutes/Bacteroidetes ratio as a possible biomarker of gut dysbiosis. This review presents and summarizes recent findings of studies on the gut microbiota in patients with gallstone disease.

The articles in this special issue cover interesting topics in endocrinology that are also related to gastroenterology, genetics, hematology, and cardiology. The presented data expand our knowledge about basic endocrinology.

**Author Contributions:** E.S. and A.O. were responsible for the preparation of the scientific program and overall organization of the conference; O.R. chaired the sessions and was in charge of the conference content; Y.R. and M.V. conceived the conference. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work received no external funding.

**Conflicts of Interest:** The authors declare that they have no conflicts of interest.

#### **References**


## *Article* **A Prospective Study: Highlights of Hippocampal Spectroscopy in Cognitive Impairment in Patients with Type 1 and Type 2 Diabetes**

**Julia Samoilova <sup>1</sup> , Mariia Matveeva 1,\* , Olga Tonkih <sup>1</sup> , Dmitry Kudlau <sup>2</sup> , Oxana Oleynik <sup>1</sup> and Aleksandr Kanev <sup>1</sup>**



**Citation:** Samoilova, J.; Matveeva, M.; Tonkih, O.; Kudlau, D.; Oleynik, O.; Kanev, A. A Prospective Study: Highlights of Hippocampal Spectroscopy in Cognitive Impairment in Patients with Type 1 and Type 2 Diabetes. *J. Pers. Med.* **2021**, *11*, 148. https://doi.org/ 10.3390/jpm11020148

Academic Editors: Mikhail Ivanovich Voevoda and Yuliya I. Ragino

Received: 30 November 2020 Accepted: 12 February 2021 Published: 19 February 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

**Abstract:** Diabetes mellitus type 1 and 2 is associated with cognitive impairment. Previous studies have reported a relationship between changes in cerebral metabolite levels and the variability of glycemia. However, the specific risk factors that affect the metabolic changes associated with type 1 and type 2 diabetes in cognitive dysfunction remain uncertain. The aim of the study was to evaluate the specificity of hippocampal spectroscopy in type 1 and type 2 diabetes and cognitive dysfunction. Materials and methods: 65 patients with type 1 diabetes with cognitive deficits and 20 patients without, 75 patients with type 2 diabetes with cognitive deficits and 20 patients without have participated in the study. The general clinical analysis and evaluation of risk factors of cognitive impairment were carried out. Neuropsychological testing included the Montreal Scale of Cognitive Dysfunction Assessment (MoCA test). Magnetic resonance spectroscopy (MRS) was performed in the hippocampal area, with the assessment of *N*-acetylaspartate (NAA), choline (Cho), creatine (Cr), and phosphocreatine (PCr) levels. Statistical processing was performed using the commercially available IBM SPSS software. Results: Changes in the content of NAA, choline Cho, phosphocreatine Cr2 and their ratios were observed in type 1 diabetes. More pronounced changes in hippocampal metabolism were observed in type 2 diabetes for all of the studied metabolites. Primary risk factors of neurometabolic changes in patients with type 1 diabetes were episodes of severe hypoglycemia in the history of the disease, diabetic ketoacidosis (DKA), chronic hyperglycemia, and increased body mass index (BMI). In type 2 diabetes, arterial hypertension (AH), BMI, and patient's age are of greater importance, while the level of glycated hemoglobin (HbA1c), duration of the disease, level of education and insulin therapy are of lesser importance. Conclusion: Patients with diabetes have altered hippocampal metabolism, which may serve as an early predictive marker. The main modifiable factors have been identified, correction of which may slow down the progression of cognitive dysfunction.

**Keywords:** type 1 diabetes mellitus; type 2 diabetes mellitus; proton spectroscopy; hippocampus; cognitive dysfunction

#### **1. Introduction**

Clinical studies indicate that type 1 and type 2 diabetes are risk factors for cognitive decline, while structural and functional deficits are associated with synaptic plasticity processes. Experimental data derived from rodent model of diabetes demonstrate the decline in learning and memory of varying extent, owing to the hippocampus dysfunction. These changes are associated with an increase in oxidative stress, levels of pro-inflammatory cytokines, β-amyloid, as well as dysfunction of the hypothalamic-pituitary-adrenal axis [1]. The hippocampus is a vital structure for learning and memory. High density of insulin

receptors has been found in this area of the brain [2]. Magnetic resonance spectroscopy (MRS), the latest biochemical analysis method, detects changes in metabolic neurochemical levels and energy metabolism in different areas of the brain in vivo [3–5]. Considering that MRS is both safe and non-invasive, researchers have used MRS to investigate the underlying causes of many neurological conditions, such as Alzheimer's disease and diabetes mellitus [6]. Monitoring changes in the level of each metabolite and metabolite ratios can provide information on neuronal damage, membrane metabolic dysfunctions, and transmission defects that occur in neurological diseases [7,8]. The purpose of the present study was to evaluate the features of spectroscopy of the hippocampus in patients with type 1 and 2 diabetes. Hypothesis: patients with type 1 and type 2 diabetes have risk factors for impaired neurometabolism in the hippocampus, which is clinically manifested by cognitive impairment.

#### **2. Materials and Methods**

All patients included in the study signed an informed consent, which was approved by an ethics committee (No. 265 from 2 May 2017). Sixty-five patients with type 1 diabetes with cognitive deficits and 20 patients without, 75 patients with type 2 diabetes with cognitive deficits and 20 patients without have participated in the study. According to the design, the study is characterized as an observational, cross-sectional, single-center, continuous, comparative study. Inclusion criteria for the study: diabetes type 1 and 2, age 18–65 years with varying degrees of disease compensation, carbohydrate metabolism, severity of vascular changes, the duration of the disease, and the type of therapy; obligatory presence of voluntary informed consent. Exclusion criteria: non-compliance with the inclusion criteria; presence of organic brain damage (tumors, stroke); of drugs or substances that alter cognitive functions (psychotropic, narcotic drugs); chronic alcoholism (anamnestic, ambulatory history); vitamin B12 deficiency (determined at study inclusion); condition after severe trauma and surgery; presence of hematologic, oncologic, and severe infectious diseases; decompensation of chronic heart failure with pronounced clinical symptoms, functional class of chronic heart failure functional class higher than II; acute coronary syndrome and transient ischemic attack in previous 6 months.

#### *2.1. Sample Characteristics*

The cohort of patients examined is presented below (Table 1).


**Table 1.** Characteristics of patients with type 1 and type 2 diabetes; M (interquartile range—Q1–Q3).

Patients with type 1 diabetes were younger and had longer duration of the disease. However, inside of type 1 and type 2 diabetes groups, no statistically significant differences in patients' characteristics were noted between subgroups with and without cognitive dysfunction.

#### *2.2. Risk Factors*

Each patient underwent the assessment of risk factors for cognitive impairment, including metabolic parameters, acute complications of diabetes, hypertension, Body Mass Index (BMI), duration of the disease, age, level of education, smoking status, alcohol abuse, insulin therapy.

#### *2.3. Cognitive Function*

For the diagnosis of cognitive impairment, the generally accepted test is the Montreal cognitive assessment (MoCA) (Copyright 2019, Ziad Nasreddine, MD) [9]. The test assesses eight cognitive domains: executive and visual-constructive skills, naming, memory, attention, speech, abstraction, delayed memory, and orientation. The maximum score is 30 points; borderline—26 points. The specificity of MoCA in detecting mild cognitive impairment is 90%, the sensitivity is 87%. A MoCA survey takes up to 10 min. This scale is now recommended by most modern experts in the field of cognitive impairment for widespread use in everyday clinical practice.

#### *2.4. Proton Spectroscopy of the Brain*

MRS of the brain with an echo time (TE) of 135 ms, and the volume of one voxel of 1.5 cm<sup>3</sup> was performed immediately after MRI of the brain. This technique was carried out in a multi-voxel mode, which allows placing 64 voxels on one slice simultaneously. In the areas of interest, the main spectra of *N*-acetylaspartate (NAA), choline (Cho), creatine (Cr), and phosphocreatine (PCr), as well as their ratios, were recorded. The protocol of proton magnetic resonance spectroscopy of the brain included the following steps (the total time of the study is approximately 35–40 min): (1) positioning of the examined patient (supine); (2) conducting a standard MRI of the brain; (3) performing sighting slices on the hippocampus area, T2 3 mm hippocampus; (4) MRS on the area of interest—frontal and temporal lobes area of the hippocampus (the camera serial interface protocol was used); (5) obtaining MR spectra of NAA, Cho, Cr, PCr in multivoxel mode; (6) postprocessing of the results of proton magnetic resonance spectroscopy including the analysis of spectrograms and the construction of color maps of the distribution of the main metabolites, as well as their ratios. The Turbo Spin Echo (Turbo SE) method with the parameters: recovery time—1500 ms, TE—135 ms, field of view—160 mm, matrix—192 × 256, slice thickness—5 mm, scanning time—12 min, was used. The quantitative characteristics of the studied metabolites and their ratios in the gray matter of the cerebral cortex, white matter of the brain, subcortical structures, and in the hippocampus on both sides were assessed. Using a regional approach, data were selected for NAA, Cho, Cr, PCr, localized in the left and right hippocampus, as shown in Figure 1. There were no missing data in our selection of patients.

**Figure 1.** A Regional approach to assessing metabolite content in the hippocampus.

To calculate sample size, we used the formula calculating the minimum sample size. As previous studies investigating hippocampal metabolism parameters in diabetes patients are lacking, the required sample size was estimated based on the results of two studies that were assessing the NAA/Cr ratio in patients with type 2 diabetes mellitus [10] (NAA/Cr

ratio 1.36 ± 0.15) and mild cognitive impairment, which also included diabetes patients (NAA/Cr ratio 1.58 ± 0.14) [6]. According to the aforementioned data, should we pursue the power of 0.9, the sample size of 11 in each group would be sufficient to prove the initial hypothesis.

The statistical processing was performed using the IBM SPSS Statistics software (USA), 19.0.0 Russian version. The following coefficients were evaluated: Shapiro–Wilk *W*-test, distribution estimate. Differences were determined by Student's *t*-test for normal, Mann–Whitney *Z*-test for non-normal distribution, Wilcoxon's test for dependent data. Descriptive analysis included the determination of the arithmetic mean X, error of the mean m, and calculation of median and quartiles (Me, Q1–Q3) depending on the type of distribution. The critical level of significance *p* when testing statistical hypotheses in the study was taken equal to 0.05. Qualitative data were analyzed using frequency analysis. For definition of accuracy, we used Pearson *χ 2* test. Spearman test was used for the correlation analysis [11]. *χ*

#### **3. Results**

As a result of randomization, 15 people each from the group with type 1 and type 2 diabetes dropped out of the study (Figure 2).

**Figure 2.** A Flowchart of randomized process.

In patients with type 1 diabetes presenting with cognitive impairment, an increase in NAA and PCr, and a decrease in the Cho content was noted. NAA/Cho, Cho/Cr ratios were also decreased, while the Cho/Cr ratio was increased. In patients with type 2 diabetes, more pronounced metabolic impairments were evidenced (Table 2).


**Table 2.** The content of metabolites of the hippocampus in patients with type 1 and type 2 diabetes with different cognitive functions.

Note: \* *p* ≤ 0.05.

The correlation analysis was carried out between the content of basic metabolites in brain cells and the basic indicators of glycemic variability (Figure 3).

− **Figure 3.** Correlation analysis of the relationship between brain metabolites and glycemic variability indicators. Note. NAA—*N*-acetylaspartate, Cho—choline, Cr—creatine, PCr—creatine phosphate, LBGI—low blood glucose index, LI—lability index, TIR—time in range, CONGA—glycemia long-term increase index, <<+>>—positive correlation, <<−>>—negative correlation.

As a result of the study on the relationship between metabolite content and glycemic variability indices, a number of significant positive correlations were registered: the level of NAA with the index of hypoglycemic risk/low blood glucose index (LBGI), Cho with the index of glycemic lability/liability index (LI), the duration and the time in range level (TIR), Cr with the index of prolonged glycemic increase (CONGA), and PCr with LI, CONGA and TIR.

In addition, the elevated HbA1c level, the glycemic lability index, and the mean glucose value were correlated in patients with Type 2 diabetes with reduced creatine levels in the hippocampus region. A decrease in the Cho creatine to NAA/Cr ratio was recorded in patients with increased LI and average glycemia level, when the Cho/Cr increase was more frequently noted when the long-term glycemic increase index (CONGA) was increased (Table 3).


**Table 3.** Correlation of brain metabolites, carbohydrate metabolism parameters, and serum protein taupe in patients with Table 2. diabetes mellitus.

Note: NAA—*N*-acetylaspartate, Cho—choline, Cr—creatine, PCr—creatine phosphate, mean—average value of glycemia, LI—glycemia lability index, CONGA—glycemia long-term increase index.

When assessing the influence of risk factors for cognitive impairment on brain metabolism in type 1 diabetes, there was a relationship between NAA left and HbA1c level (−0.247, *p* ≤ 0.05), Cho on the right and age, history of diabetic ketoacidosis and degree of cognitive impairment (−0.237, −0.230, 0.216, *p* ≤ 0.05), Cr on the right and a history of severe hypoglycemia (0.220, *p* ≤ 0.05), Cr2 on the left and body mass index (−0.218, *p* ≤ 0.05), NAA/Cr on the right and a history of severe hypoglycemia episodes (0.210, *p* ≤ 0.05).

When assessing the influence of risk factors for cognitive impairment on brain metabolism in type 2 diabetes, associations between NAA on the left and HbA1c level (−0.733, *p* ≤ 0.05), NAA on the left and the level of arterial hypertension (0.511, *p* ≤ 0.05), Cho on the left and the level of arterial hypertension (0.682, *p* ≤ 0.05), Cho on the right and age and the degree of cognitive impairment (0.785, 0.576, −0.561, *p* ≤ 0.05), Cr on the right and the duration of the disease and the degree of cognitive impairment (0.445, −0.508, 0.619, *p* ≤ 0.05), Cr2 on the left and the degree of arterial hypertension (0.577694, *p* ≤ 0.05), NAA/Cr on the left and a history of episodes of diabetic ketoacidosis in the anamnesis, and the level of HbA1c (0.451, 0.733, *p* ≤ 0.05), NAA/Cr on the left and the presence of higher education and insulin therapy (0.424, −0.596, *p* ≤ 0.05), NAA/Chon the left and body mass index, and degree of arterial hypertension (−0.562, −0.481, *p* ≤ 0.05), NAA/Chon the left and body mass index (−0.529, *p* ≤ 0.05), Ch/Cr on the left and age, and glycemic level (−0.457, −0.733, *p* ≤ 0.05), Ch/Cr on the right and the degree of arterial hypertension (−0.512, *p* ≤ 0.05) were noted.

For the purpose of timely detection of cognitive dysfunction at the earliest stage of the pathological process development, a method of early prediction of cognitive dysfunction in patients with type 1 and 2 diabetes using neuroimaging methods was developed.

For this aim, the data obtained by MRS were processed by the program on the platform of IBM Watson Studio 1.2.2.0, 2018, Armonk, NY, USA, 2018, to build a neural network model, which is a decision support system. This program is used to generalize a large number of complexly structured data using non-linear mathematical operations and allows you to build a model for predicting cognitive disorders in diabetes patients.

The predicted value of the cognitive test was used as the output parameter. The accuracy of this model for type 1 diabetes was 73%, error 1.7%, and for type 2 diabetes—79%, error 1.3%.

#### **4. Discussion and Conclusions**

With a change in the modern lifestyle, the incidence of diabetes increases from year to year [12]. The prevalence of cognitive dysfunction caused by diabetes is 10.8–65.0%, and its occurrence is associated with hippocampal atrophy [13–15]. In patients with diabetes, the literature suggests lower values of NAA of the hippocampus on both sides in patients with diabetic retinopathy, which is in accordance with our results, as in our study, patients with cognitive impairment also had a decrease in this metabolite [16]. These data indicate a common effect of complications of diabetes on metabolism in hippocampal cells, which is associated with the loss of neurons or axons and is independently associated with the development of diabetes.

Obesity combined with type 2 diabetes is an important factor of brain damage. The available knowledge demonstrates that in patients with obesity, there are a number of cerebral disorders, either associated with or preceding diabetes, including impaired substrate processing, insulin resistance, and impaired organ interconnections [17]. In keeping with that, in our study, it is BMI that is a risk factor for neurometabolic disorders.

In experimental models of diabetes, the role of dysglycemia on the change in the level of metabolites of the hippocampus was shown, along with the violation of glucose oxidation and an increase in creatinine levels [18]. Another study demonstrated the acute effect of hyperglycemia, i.e., in fact, DKA, on a decrease in NAA, which was combined with hyperphosphorylation of the tau protein, which supports the results of our study and the role of acute complications in the dysfunction of the hippocampus at the stage of manifestation [19]. Currently, data are scant on the effects of hypoglycemia on hippocampal metabolism. In a recent study, Wiegers et al. performed a two-stage hyperinsulinemic euglycemic (5 mmol/L)—hypoglycemic (2.8 mmol/L) test in 7 patients with poor hypoglycemic awareness and in seven patients with normal awareness of hypoglycemia, Hypoglycemia, and in healthy controls. The results showed that all metabolites were reduced by 20% in patients with poor awareness (*p* < 0.001) [20]. In the same study, patients with DM type 1 and frequent hypoglycemia had altered metabolism of the hippocampus.

Cao et al. investigated brain metabolites in the frontal and parietal cortex in 33 patients with diabetes and hypertension and noted the significant decrease in NAA/Cr and Cho/Cr ratios [21]. In our study, hypertension was an important determinant of changes in the hippocampal region.

Other risk factors, such as duration of the disease and age, naturally impair cognitive function and cannot be corrected [22].

A limitation of the study was the non-randomized nature of their comparisons. However, in the future, these points will be taken into account.

In this study, when performing proton spectroscopy, patients with type 1 diabetes showed an increase of NAA, Cho, and Cr2 levels in the hippocampus area, which are responsible for normal neuronal functioning. In type 2 diabetes, NAA levels were increased, as well, Cho, Cr, and Cr2 content decreased. According to several authors, such changes occur in gliosis and membrane necrosis and when oxidative stress is activated [23,24]. MRS of the brain in patients with diabetes shows changes associated with neurodegenerative processes [25]. In this connection, we determined non-invasive markers of metabolic and structural brain changes in patients with diabetes by means of various MRI techniques.

**Author Contributions:** Conceptualization, J.S. and M.M.; methodology, M.M.; software, O.T. and M.M.; validation, J.S., O.O. and M.M.; formal analysis, J.S. and M.M.; investigation, D.K., J.S. and M.M.; resources, D.K. and O.T.; supervision, D.K. and J.S.; data curation, J.S., O.O., O.T., A.K. and M.M.; writing—original draft preparation, J.S., O.O., O.T., A.K. and M.M.; writing—review and editing, M.M. and A.K.; visualization, O.T.; supervision, D.K. and J.S.; project administration, M.M.; funding acquisition, M.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded with the support of a grant from the President, agreement 075-15-2020-192 dated 03.2019.

**Institutional Review Board Statement:** The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Ethics Committee of Siberian State Medical University (265 from 2 May 2017).

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author. The data are not publicly available due to ethical restrictions.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Article* **The Risk of Type 2 Diabetes Mellitus in a Russian Population Cohort According to Data from the HAPIEE Project**

**Svetlana V. Mustafina 1,\* , Oksana D. Rymar <sup>1</sup> , Liliya V. Shcherbakova <sup>1</sup> , Evgeniy G. Verevkin <sup>1</sup> , Hynek Pikhart <sup>2</sup> , Olga V. Sazonova <sup>3</sup> , Yuliya I. Ragino <sup>1</sup> , Galina I. Simonova <sup>1</sup> , Martin Bobak <sup>2</sup> , Sofia K. Malyutina 1,3 and Mikhail I. Voevoda <sup>1</sup>**


**Citation:** Mustafina, S.V.; Rymar, O.D.; Shcherbakova, L.V.; Verevkin, E.G.; Pikhart, H.; Sazonova, O.V.; Ragino, Y.I.; Simonova, G.I.; Bobak,M.;Malyutina, S.K.; et al. The Risk of Type 2 Diabetes Mellitus in a Russian Population Cohort According to Data from the HAPIEE Project. *J. Pers. Med.* **2021**, *11*, 119. https://doi.org/10.3390/jpm11 020119

Academic Editor: Mikhail Ivanovich Voevoda

Received: 10 December 2020 Accepted: 8 February 2021 Published: 11 February 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

**Abstract:** The aim of this study is to investigate the 14-year risk of type 2 diabetes mellitus (T2DM) and develop a risk score for T2DM in the Siberian cohort. A random population sample (males/females, 45–69 years old) was examined at baseline in 2003–2005 (Health, Alcohol, and Psychosocial Factors in Eastern Europe (HAPIEE) project, *n* = 9360, Novosibirsk) and re-examined in 2006–2008 and 2015– 2017. After excluding those with baseline T2DM, the final analysis included 7739 participants. The risk of incident T2DM during a 14-year follow-up was analysed using Cox regression. In age-adjusted models, male and female hazard ratios (HR) of incident T2DM were 5.02 (95% CI 3.62; 6.96) and 5.13 (95% CI 3.56; 7.37) for BMI ≥ 25 kg/m<sup>2</sup> ; 4.38 (3.37; 5.69) and 4.70 (0.27; 6.75) for abdominal obesity (AO); 3.31 (2.65; 4.14) and 3.61 (3.06; 4.27) for fasting hyperglycaemia (FHG); 2.34 (1.58; 3.49) and 3.27 (2.50; 4.26) for high triglyceride (TG); 2.25 (1.74; 2.91) and 2.82 (2.27; 3.49) for hypertension (HT); and 1.57 (1.14; 2.16) and 1.69 (1.38; 2.07) for family history of diabetes mellitus (DM). In addition, secondary education, low physical activity (PA), and history of cardiovascular disease (CVD) were also significantly associated with T2DM in females. A simple T2DM risk calculator was generated based on non-laboratory parameters. A scale with the best quality included waist circumference >95 cm, HT history, and family history of T2DM (area under the curve (AUC) = 0.71). The proposed 10-year risk score of T2DM represents a simple, non-invasive, and reliable tool for identifying individuals at a high risk of future T2DM.

**Keywords:** diabetes mellitus; risk factor; diabetes risk scale; diabetes risk model

#### **1. Introduction**

In the last decades, the prevalence of diabetes mellitus (DM) has consistently risen in the general world population, thus making DM a medical and social problem worldwide [1]. According to the International Diabetes Federation forecast in 2019, the number of subjects with diabetes is expected to reach 578 million by 2030 and 700 million by 2045 [2]. In the Russian Federation, the prevalence of DM is also rising. According to the Federal Register of Diabetes Mellitus, in Russia by the end of 2016, 4.35 million subjects (3.0% of the population) had been registered in a dispensary as DM patients, of which 92% (4 million) had type 2 diabetes mellitus (T2DM), 6% (255,000) had type 1 diabetes mellitus, and 2% (75,000) other types of diabetes [3]. According to the NATION study, among the adult Russian population of 20–79 years old, 5.4% have T2DM. Moreover, approximately one-half of patients with diagnosed DM (54%) were unaware of this disease [4]. In our baseline survey within the Health, Alcohol, and Psychosocial Factors in Eastern Europe (HAPIEE)

project, 2003–2005, we found a high prevalence of T2DM (11.3%) in the population sample aged 45–69 years old in Novosibirsk [5]. The rate is close to the data on compatible age groups in the NATION study conducted in Russia in 2013–2015 [4].

T2DM is known to be a multifactorial disease, and environmental factors are important for T2DM pathogenesis. The risk factors of T2DM are well established and include abdominal obesity (waist circumference (WC) ≥94 cm in males and ≥ 80 cm in females), a family history of DM, age >45 years old, hypertension and major cardiovascular diseases (CVDs), gestational diabetes, the use of drugs that contribute to hyperglycaemia, and weight gain. Early identification of T2DM risk factors and their clusters with the aim of modification can help to prevent T2DM [6,7]. At present, preventive strategies rely on the identification of risk factors and their combinations and subsequent lifestyle intervention. Appropriate lifestyle changes, including the normalisation of diet, increased physical activity, and weight loss can reduce the risk of T2DM by as much as 56% [8].

One of the first tools to identify individuals at high risk for T2DM is the Finnish diabetes risk score (FINDRISC) [9,10]. This tool was later successfully validated in other countries, including Germany, Holland, Denmark, Sweden, England, and Australia, [10,11]. The results show good sensitivity (Se) and specificity (Sp) in Germany, the USA, Switzerland, and Canada [12–15], although it did not perform well among the Omani Arabs [16].

To prevent further increase in the prevalence of diabetes, the identification of individuals at high risk for hyperglycaemia using inexpensive and available methods is crucial. Using risk score methods of prediction allows us to set the level of total risk, identify high-risk patients, and prescribe the necessary preventive measures.

Risk factors of T2DM have been studied in healthcare institutions and cross-sectional studies in Russia. Nonetheless, we are not aware of a Russian prospective cohort analysis of the long-term risk of T2DM in the general population. In this study, we aimed to investigate the 14-year risk of T2DM in a Russian population cohort in order to develop a risk scale to predict the development of T2DM over 10 years in people aged 45–69 years old.

#### **2. Materials and Methods**

#### *2.1. Study Population and Methods*

The data came from the Russian arm of the HAPIEE project [17]. The cohorts in this multicentre project were randomly selected from population registers or electoral lists and stratified by sex and five year age groups. The planned sample size was 10,000 persons in each country. At baseline, a cross-sectional analysis of the random age- and sex-stratified population sample of males and females aged 45–69 years old was performed in 2003–2005 in Novosibirsk (Russia) (*n* = 9360, response rate 61%). In the Russian arm of the study, both the questionnaire and the examination have been completed in an outpatient clinic. The details of sampling have been described elsewhere [17]. The cohort was re-examined in 2006–2008 and in 2015–2017. Follow-up data were collected between 2003 and 2017; the average follow-up period comprised 13.7 (0.7) years (mean (SD)). The last survey and follow-up were supported by Russian Science Foundation. We excluded from the analysis those who have no baseline glucose assessment and those with T2DM at baseline defined as fasting plasma glucose (FPG) ≥7.0 mmol/L or current treatment with insulin or oral hypoglycaemic agents (1621 subjects excluded). In total, the final analysis included 7739 participants (3376 males; 4363 females). Incident DM in any wave of follow-up was registered as the endpoint.

#### *2.2. Baseline Examination*

The baseline and repeated examinations involved standardised questionnaires, objective measurements, and blood sampling for biomarker assays. Details of the baseline protocol of the HAPIEE project have been published previously [17,18]. The participants did an interview with trained technicians regarding sociodemographic characteristics, behavioural risk factors (including smoking and alcohol intake), a history of DM and hypertension and their treatment, a history of major CVDs and other chronic diseases, physical activity (PA), a family history of T2DM and CVDs, meal frequency and other health, lifestyle, and social characteristics.

Objective measurements included anthropometry (body weight, height, WC, and hip circumference) and blood pressure (BP) determination. The body weight was measured on a weighing scale with a participant wearing one layer of clothes (accuracy to 0.1 kg). The height was measured using a vertical stadiometer (accuracy 0.1 cm). WC and hip circumference were measured by means of a tape with 0.1 cm accuracy. The body–mass index (BMI) was calculated as weight in kilograms divided by the square of height in meters and categorised as BMI < 25.0 kg/m<sup>2</sup> and BMI ≥ 25 kg/m<sup>2</sup> [18]. Abdominal obesity was defined as a WC of ≥94 cm for males and ≥80 cm for females [3].

BP was measured after 5-min rest on the right hand in a sitting position using Omron M5-I (Omron Co. Ltd., Terado-cho, Muko, Kyoto, Japan). The measurement of BP was carried out three times with an interval of 2 min. For the present analysis, the average of the three values of BP was used. Hypertension was defined as systolic BP (SBP) ≥140 mmHg or diastolic BP (DBP) ≥90 mmHg and/or antihypertensive medication use during the last two weeks.

Blood samples were drawn after overnight fasting (at least 8 h). Serum levels of glucose, total cholesterol (TC), triglycerides (TGs), and high-density lipoprotein cholesterol (HDL-C) were determined by enzymatic methods on a KoneLab 30i automated analyser (Thermo Fisher Scientific Inc., Waltham, MA, USA). The fasting serum glucose value was converted to fasting plasma glucose (FPG) via the formula of the European Association for the Study of Diabetes in 2007 [19]:

FPG (mmol/L) = −0.137 + 1.047 × serum glucose concentration (mmol/L). Impaired fasting glucose was defined as FPG of 6.1–6.9 mmol/L.

Hypertriglyceridemia was defined as a serum TG level of ≥2.8 mmol/L, and abnormal HDL-C was defined as HDL cholesterol of ≤0.9 mmol/L for males and females [3].

We followed the cohort from the baseline survey up to 31 December 2017; the average period of follow-up comprised 13.7 (0.7) years (mean (SD)). The incident cases of T2DM were ascertained using overlapping sources—the cases registered by the municipal diabetes register and new cases identified by the repeated surveys in 2006–2008 and 2015–2017. Baseline T2DM was defined as FPG of ≥7.0 mmol/L or current treatment with insulin or oral hypoglycaemic agents [20]; these persons were excluded from the study. An incident case of DM was defined as a new case registered by the T2DM register during the 2003–2017 period or a new case identified in the second or third survey as FPG ≥7.0 mmol/L or current treatment with insulin or oral hypoglycaemic agents.

#### *2.3. Statistical Analyses*

These analyses were carried out using the statistical package SPSS for Windows Version 13.0, (SPSS Inc., Chicago, IL, USA). Baseline characteristics of the study participants are given as means (SD) and were compared by the χ 2 test, unpaired Student's *t* test (2-tailed), or Mann–Whitney test, depending on the type of distribution of the variables.

First, the association between potential risk factors and incident T2DM was assessed by Cox regression analysis in age- and sex-adjusted Model 1 and in age-adjusted Model 2 split by sex. Incident T2DM served as a dependent variable. Independent variables tested sequentially included the education level (categorised into three groups—higher education, secondary, or primary education); marital status (categorised into two groups married or cohabitating/single); SBP and DBP (as continuous variables); hypertension (dichotomised as yes/no); TC, TG, HDL-C, and FPG (as continuous variables); high TG concentration (dichotomised as ≥2.8 mmol/L and <2.8 mmol/L); low HDL-C concentration (dichotomised as cholesterol ≤0.9 mmol/L for males and females and cholesterol >0.9 mmol/L for males and females); fasting hyperglycaemia (dichotomised as ≥6.1 and <6.1 mmol/L); obesity (categorised into two groups—BMI < 25 kg/m<sup>2</sup> and ≥25 kg/m<sup>2</sup> ); abdominal obesity (categorised into two groups—WC ≥ 94 cm for males and ≥80 cm for females and <94 cm for males and <80 cm for females); smoking (categorised into three

groups—current smokers, past smokers, or never smoked); alcohol consumption was rated in two versions—as a continuous variable (the mean dose of ethanol per occasion, in grams) and as a dichotomised variable (higher than a calculated sex-specific mean amount of alcohol intake per session in the population sample, yes/no); the lack of leisure time PA weekly or daily (PA in a previous week was categorised into three groups—none, insufficient (1– 179 min), and sufficient (≥180 min); everyday PA was categorised into three groups—none, insufficient (1–29 min), and sufficient (≥30 min)), and a dichotomised variable 'low PA' was generated based on weekly or daily insufficient PA at leisure time (yes/no); fruit and vegetable consumption (dichotomised as every day/not every day); a family history of DM (dichotomised as a T2DM history in first-degree relatives and no family history of T2DM); and a history of major CVDs (dichotomised as yes/no). Covariates were age (as a continuous variable, per year) and sex (male or female).

Among the tested factors, for subsequent Cox regression analysis, we selected those significantly associated with T2DM, and in a set of similar variables, we selected the ones with greater hazard ratios (HRs). The selected variables included age, BMI, abdominal obesity, hypertension, dyslipidaemia (high TG and low HDL-C levels), FPG, the history of CVDs, smoking status, alcohol intake, the education level, marital status, PA, fruit and vegetable consumption, and the family history of DM.

At the second stage, we applied Cox proportional hazards regression analysis to assess the association between risk factors and incident T2DM in age-adjusted and multivariableadjusted models separately in males and females.

HRs with a 95% confidence interval (CI) were calculated for the above factors selected as independent variables. In males, the multivariable model included age, BMI ≥ 25 kg/m<sup>2</sup> , hypertension, high TG concentration, FPG, a family history of DM, and a history of CVD. In females, the model included age, BMI ≥ 25 kg/m<sup>2</sup> , hypertension, high TG concentration, FPG, education level, PA, marital status, a family history of DM, and a history of CVD.

At the third stage, we used dichotomised variables based on cut-off points for risk factors of T2DM obtained using receiver-operating characteristic (ROC) analysis. These cutoffs were applied further to build a 10-year risk score for T2DM using Cox regression and ROC analysis and select a model which includes a minimum number of prognostic parameters and has the maximum positive predictive power for T2DM risk.

Logistic regression was used to compute β-coefficients for significant risk factors for T2DM. Coefficients (β) of the model were used to assign a score value for each variable, and the composite diabetes risk score was calculated as the sum of those scores. The sensitivity (the probability that the test is positive for subjects who will receive drug-treated diabetes in the future) and the specificity (the probability that the test is negative for subjects without drug-treated diabetes) with 95% CIs were calculated for each diabetes risk score level in differentiating subjects developed incident T2DM from those who did not. Then, ROC curves were plotted for the diabetes risk score; the sensitivity was plotted on the *y*-axis, and the false-positive rate (1-specificity) was plotted on the *x*-axis. The more accurately discriminatory the test, the steeper the upward portion of the ROC curve and the higher the area under the curve (AUC), the optimal cut point being the peak of the curve.

#### **3. Results**

The population sample of males and females aged 45–69 years old was examined at baseline in 2003–2005 in Novosibirsk (*n* = 9360 subjects). The present analysis was limited to those without T2DM at baseline (*n* = 7739). The baseline characteristics of this sample are presented in Table 1.

#### *3.1. The 14-Year Risk of T2DM*

During the follow-up of 13.7 (0.7) years (mean (SD)), 915 participants developed T2DM for the first time (11.8%). Among males, the frequency of incident cases of T2DM was 1.8-fold lower than that among females, 9.7% and 15.5%, respectively (*p* < 0.0001).

Low HDL-C ■

BMI ≥ 25 kg/m<sup>2</sup>

Fruit and vegetable consumption less than

ment during the last two weeks; ■

cient: 1–179 min or sufficient: ≥180 min).

♦ High triglyceride (TG): serum TG concentration ≥2.8 mmol/L. ●

*3.1. The 14-Year Risk of T2DM* 

Leisure-time PA in previous week #


**Table 1.** Characteristics of the studied population sample (aged 45–69 years old at baseline, *n* = 7739).  **Males and Females Males Females** *p*  Examined 7739 3376 4363

are presented in Table 1.

**3. Results** 

*J. Pers. Med.* **2021**, *11*, x FOR PEER REVIEW 5 of 15

**Table 1.** Characteristics of the studied population sample (aged 45–69 years old at baseline, *n* = 7739).

The population sample of males and females aged 45–69 years old was examined at

During the follow-up of 13.7 (0.7) years (mean (SD)), 915 participants developed

We compared 26 factors potentially related to T2DM between the participants who

T2DM for the first time (11.8%). Among males, the frequency of incident cases of T2DM

developed T2DM and those who remained free of T2DM; the results are summarised in Table 2. Individuals of both sexes who developed T2DM were younger, had a greater BMI, greater WC, more frequent abdominal obesity, higher SBP and DBP, more frequent

was 1.8-fold lower than that among females, 9.7% and 15.5%, respectively (*p* < 0.0001).

Fasting hyperglycaemia: glucose ≥6.1 mmol/L. ^ Fruit

baseline in 2003–2005 in Novosibirsk (*n* = 9360 subjects). The present analysis was limited to those without T2DM at baseline (*n* = 7739). The baseline characteristics of this sample

Only primary 719 (9.3) 335 (9.9) 384 (8.8) Any secondary 4722 (61.0) 1927 (57.1) 2795 (64.1) University 2298 (29.7) 1114 (33.0) 1184 (27.1) Abdominal obesity ▪ , *n*(%) 5014 (64.8) 1563 (46.3) 3451 (79.1) <0.001 SBP (mmHg) 142.8 (24.5) 142.4 (22.6) 143.1 (25.8) 0.202 DBP (mmHg) 90.0 (13.3) 90.0 (13.0) 89.9 (13.5) 0.899 Abdominal obesity: waist circumference (WC) ≥94 cm for males and ≥80 cm for females. \* Hypertension is defined as systolic blood pressure (SBP) ≥140 mmHg or diastolic blood pressure (DBP) ≥90 mmHg or antihypertensive drug treatment during the last two weeks; Low high-density lipoprotein cholesterol (HDL-C) ≤0.9 mmol/L for males and females. High triglyceride (TG): serum TG concentration ≥2.8 mmol/L. • Fasting hyperglycaemia: glucose ≥6.1 mmol/L. ˆ Fruit and vegetable consumption (every day/not every day); # Leisure time physical activity (PA) in a previous week (insufficient: 1–179 min or sufficient: ≥180 min).

Family history of T2DM (%) 863 (11.2) 300 (9.0) 563 (12.9) <0.001 History of CVD, *n*(%) 832 (10.8) 475 (14.1) 357 (8.2) <0.001 Fruit and vegetable consumption less than every day ^, *n*(%) 833 (10.8) 361 (10.7) 472 (10.8) 0.856 Leisure-time PA in previous week # , min 6260 (80.9) 2803 (83.0) 3457 (79.2) <0.001 FPG (mmol/L) 5.61 (0.6) 5.63 (0.6) 5.59 (0.6) 0.003 TC (mmol/L) 6.2 (1.2) 5.9 (1.1) 6.4 (1.2) <0.001 TG (mmol/L) 1.4 (0.7) 1.4 (0.7) 1.5 (0.6) <0.001 HDL-C (mmol/L) 1.5 (0.4) 1.5 (0.4) 1.6 (0.3) <0.001 Hypertension \* *n*(%) 4910 (63.5) 2082 (61.7) 2828 (64.8) 0.005 We compared 26 factors potentially related to T2DM between the participants who developed T2DM and those who remained free of T2DM; the results are summarised in Table 2. Individuals of both sexes who developed T2DM were younger, had a greater BMI, greater WC, more frequent abdominal obesity, higher SBP and DBP, more frequent hypertension, a more frequent family history of DM, and higher FPG, TG, and HDL-C levels as compared to those without T2DM.

Abdominal obesity: waist circumference (WC) ≥94 cm for males and ≥80 cm for females. \* Hypertension is defined as systolic blood pressure (SBP) ≥140 mmHg or diastolic blood pressure (DBP) ≥90 mmHg or antihypertensive drug treatment during the last two weeks; ■ Low high-density lipoprotein cholesterol (HDL-C) ≤0.9 mmol/L for males and females. , *n*(%) 43 (0.6) 26 (0.8) 17 (0.4) 0.026 High TG ♦, *n*(%) 296 (3.8) 125 (3.7) 171 (3.9) 0.609 Fasting hyperglycaemia ●, *n*(%) 1541 (20.2) 708 (21.2) 833 (19.4) 0.045 Males with incident T2DM had higher TC levels were less frequently current smokers and more frequently smokers in the past than their counterparts without T2DM. Females with incident T2DM had a more frequent history of CVD, engaged in less PA, and more frequently had a secondary-education level than their counterparts without T2DM.

♦ High triglyceride (TG): serum TG concentration ≥2.8 mmol/L. ● Fasting hyperglycaemia: glucose ≥6.1 mmol/L. ^ Fruit and vegetable consumption (every day/not every day); # Leisure time physical activity (PA) in a previous week (insufficient: 1–179 min or sufficient: ≥180 min). , *n*(%) 5510 (71.2) 1993 (59.0) 3517 (80.6) <0.001 Alcohol, mean dose per occasion (g) 35.8 (36.5) 54.2 (45.6) 21.7 (17.3) <0.001 **Smoking,** *n***(%)** <0.001 Never smoked 4624 (59.8) 917 (27.2) 3707 (85.0) The results of the Cox regression analysis are presented in Table 3. In the age-adjusted model for males, the 14-year risk of incident T2DM was associated with BMI ≥ 25 kg/m<sup>2</sup> , abdominal obesity, fasting hyperglycaemia, a high TG level, hypertension, and a family history of DM (Table 3). In the age-adjusted model for females, the 14-year risk of incident T2DM was associated with abdominal obesity, BMI ≥ 25 kg/m<sup>2</sup> , fasting hyperglycaemia, a

Single 2145 (27.7) 387 (11.5) 1758 (40.3)

Married or cohabitating 5594 (72.3) 2989 (88.5) 2605 (59.7)

Only primary 719 (9.3) 335 (9.9) 384 (8.8) Any secondary 4722 (61.0) 1927 (57.1) 2795 (64.1) University 2298 (29.7) 1114 (33.0) 1184 (27.1)

**Marital status,** *n***(%)** <0.001

**Education level,** *n***(%)** <0.001

Family history of T2DM (%) 863 (11.2) 300 (9.0) 563 (12.9) <0.001 History of CVD, *n*(%) 832 (10.8) 475 (14.1) 357 (8.2) <0.001

Abdominal obesity: waist circumference (WC) ≥94 cm for males and ≥80 cm for females. \* Hypertension is defined as systolic blood pressure (SBP) ≥140 mmHg or diastolic blood pressure (DBP) ≥90 mmHg or antihypertensive drug treat-

and vegetable consumption (every day/not every day); # Leisure time physical activity (PA) in a previous week (insuffi-

every day ^, *n*(%) 833 (10.8) 361 (10.7) 472 (10.8) 0.856

, min 6260 (80.9) 2803 (83.0) 3457 (79.2) <0.001

Low high-density lipoprotein cholesterol (HDL-C) ≤0.9 mmol/L for males and females.

During the follow-up of 13.7 (0.7) years (mean (SD)), 915 participants developed

We compared 26 factors potentially related to T2DM between the participants who

T2DM for the first time (11.8%). Among males, the frequency of incident cases of T2DM

developed T2DM and those who remained free of T2DM; the results are summarised in Table 2. Individuals of both sexes who developed T2DM were younger, had a greater BMI, greater WC, more frequent abdominal obesity, higher SBP and DBP, more frequent

was 1.8-fold lower than that among females, 9.7% and 15.5%, respectively (*p* < 0.0001).

high TG level, hypertension, a family history of DM, any secondary -education level, low PA (1–179 min/week or 1–29 min/day), and a history of CVD (Table 3). **Table 1.** Characteristics of the studied population sample (aged 45–69 years old at baseline, *n* = 7739).

*J. Pers. Med.* **2021**, *11*, x FOR PEER REVIEW 5 of 15

The population sample of males and females aged 45–69 years old was examined at

baseline in 2003–2005 in Novosibirsk (*n* = 9360 subjects). The present analysis was limited to those without T2DM at baseline (*n* = 7739). The baseline characteristics of this sample

Low high-density lipoprotein cholesterol (HDL-C) ≤0.9 mmol/L for males and females.

During the follow-up of 13.7 (0.7) years (mean (SD)), 915 participants developed

We compared 26 factors potentially related to T2DM between the participants who

T2DM for the first time (11.8%). Among males, the frequency of incident cases of T2DM

developed T2DM and those who remained free of T2DM; the results are summarised in Table 2. Individuals of both sexes who developed T2DM were younger, had a greater BMI, greater WC, more frequent abdominal obesity, higher SBP and DBP, more frequent

Fasting hyperglycaemia: glucose ≥6.1 mmol/L. ^ Fruit

was 1.8-fold lower than that among females, 9.7% and 15.5%, respectively (*p* < 0.0001).

Fasting hyperglycaemia: glucose ≥6.1 mmol/L. ^ Fruit


**Table 2.** Baseline characteristics of groups with incident T2DM and without T2DM stratified by sex (mean (SD), and *n*(%)).

are presented in Table 1.

**3. Results** 

Only primary 719 (9.3) 335 (9.9) 384 (8.8) Any secondary 4722 (61.0) 1927 (57.1) 2795 (64.1) University 2298 (29.7) 1114 (33.0) 1184 (27.1) Abdominal obesity ▪ , *n*(%) 5014 (64.8) 1563 (46.3) 3451 (79.1) <0.001 SBP (mmHg) 142.8 (24.5) 142.4 (22.6) 143.1 (25.8) 0.202 DBP (mmHg) 90.0 (13.3) 90.0 (13.0) 89.9 (13.5) 0.899 Abdominal obesity: WC ≥ 94 cm for males and ≥80 cm for females. \* Hypertension is defined as SBP ≥ 140 mmHg or DBP ≥ 90 mmHg or antihypertensive drug treatment during the last 2 weeks; Low HDL-C: ≤0.9 mmol/L for males and females. High TG: serum TG level ≥2.8 mmol/L. • Fasting hyperglycaemia: glucose concentration ≥6.1 mmol/L. ˆ Fruit and vegetable consumption (every day/not every day); # Leisure time PA in a previous week (insufficient: 1–179 min or sufficient: ≥180 min).

Family history of T2DM (%) 863 (11.2) 300 (9.0) 563 (12.9) <0.001 History of CVD, *n*(%) 832 (10.8) 475 (14.1) 357 (8.2) <0.001 Fruit and vegetable consumption less than every day ^, *n*(%) 833 (10.8) 361 (10.7) 472 (10.8) 0.856 Leisure-time PA in previous week # , min 6260 (80.9) 2803 (83.0) 3457 (79.2) <0.001 Abdominal obesity: waist circumference (WC) ≥94 cm for males and ≥80 cm for females. \* Hypertension is defined as FPG (mmol/L) 5.61 (0.6) 5.63 (0.6) 5.59 (0.6) 0.003 TC (mmol/L) 6.2 (1.2) 5.9 (1.1) 6.4 (1.2) <0.001 TG (mmol/L) 1.4 (0.7) 1.4 (0.7) 1.5 (0.6) <0.001 HDL-C (mmol/L) 1.5 (0.4) 1.5 (0.4) 1.6 (0.3) <0.001 Hypertension \* *n*(%) 4910 (63.5) 2082 (61.7) 2828 (64.8) 0.005 Low HDL-C ■ , *n*(%) 43 (0.6) 26 (0.8) 17 (0.4) 0.026 The results of multivariable-Cox regression analysis are given in Table 3. In the multivariable-adjusted model, BMI ≥ 25 kg/m<sup>2</sup> made the greatest contribution to the development of T2DM, with HR = 3.34 (2.60; 4.30); besides, the risk of incident T2DM was independently associated with fasting hyperglycaemia (HR = 2.70 (2.35; 3,10)), hypertension (HR = 1.86 (1.57; 2.21)), a high TG level (HR = 1.59 (1.26; 2.01)), a family history of DM (HR = 1.53 (1.28; 1.83)), a low leisure-time PA (HR = 1.22 (1.02; 1.46)) (Table 3).

systolic blood pressure (SBP) ≥140 mmHg or diastolic blood pressure (DBP) ≥90 mmHg or antihypertensive drug treat-

BMI ≥ 25 kg/m<sup>2</sup>

Fruit and vegetable consumption less than

ment during the last two weeks; ■

cient: 1–179 min or sufficient: ≥180 min).

Leisure-time PA in previous week #

ment during the last two weeks; ■ High TG ♦, *n*(%) 296 (3.8) 125 (3.7) 171 (3.9) 0.609 Fasting hyperglycaemia ●, *n*(%) 1541 (20.2) 708 (21.2) 833 (19.4) 0.045

cient: 1–179 min or sufficient: ≥180 min).

♦ High triglyceride (TG): serum TG concentration ≥2.8 mmol/L. ●

*3.1. The 14-Year Risk of T2DM* 

Alcohol, mean dose per occasion (g) 35.8 (36.5) 54.2 (45.6) 21.7 (17.3) <0.001

**Smoking,** *n***(%)** <0.001

**Marital status,** *n***(%)** <0.001

**Education level,** *n***(%)** <0.001

Family history of T2DM (%) 863 (11.2) 300 (9.0) 563 (12.9) <0.001 History of CVD, *n*(%) 832 (10.8) 475 (14.1) 357 (8.2) <0.001

Abdominal obesity: waist circumference (WC) ≥94 cm for males and ≥80 cm for females. \* Hypertension is defined as systolic blood pressure (SBP) ≥140 mmHg or diastolic blood pressure (DBP) ≥90 mmHg or antihypertensive drug treat-

and vegetable consumption (every day/not every day); # Leisure time physical activity (PA) in a previous week (insuffi-

every day ^, *n*(%) 833 (10.8) 361 (10.7) 472 (10.8) 0.856

*3.1. The 14-Year Risk of T2DM* 

Married or cohabitating 5594 (72.3) 2989 (88.5) 2605 (59.7)

Only primary 719 (9.3) 335 (9.9) 384 (8.8) Any secondary 4722 (61.0) 1927 (57.1) 2795 (64.1) University 2298 (29.7) 1114 (33.0) 1184 (27.1)

Single 2145 (27.7) 387 (11.5) 1758 (40.3)

Never smoked 4624 (59.8) 917 (27.2) 3707 (85.0) Former smoker 973 (12.6) 786 (23.3) 187 (4.3) Current smoker 2141 (27.7) 1672 (49.5) 469 (10.7)

and vegetable consumption (every day/not every day); # Leisure time physical activity (PA) in a previous week (insuffi-

, min 6260 (80.9) 2803 (83.0) 3457 (79.2) <0.001

Low high-density lipoprotein cholesterol (HDL-C) ≤0.9 mmol/L for males and females.

During the follow-up of 13.7 (0.7) years (mean (SD)), 915 participants developed

We compared 26 factors potentially related to T2DM between the participants who

T2DM for the first time (11.8%). Among males, the frequency of incident cases of T2DM

developed T2DM and those who remained free of T2DM; the results are summarised in Table 2. Individuals of both sexes who developed T2DM were younger, had a greater BMI, greater WC, more frequent abdominal obesity, higher SBP and DBP, more frequent

was 1.8-fold lower than that among females, 9.7% and 15.5%, respectively (*p* < 0.0001).

**Table 3.** The preliminary assessment of the relationship between the 14-year risk of T2DM and risk factors by age-adjusted and age- and sex-adjusted Cox regression.



\* Model 1: age- and sex-adjusted model. \*\* Model 2: age-adjusted model split by sex. \*\*\* Model 3: The model is multivariable and adjusted for age, sex, a family history of diabetes mellitus (DM), fasting hyperglycaemia, a history of cardiovascular disease (CVD), hypertension, abdominal obesity, high TG level, low HDL-C concentration, education level, smoking, low PA, and fruit and vegetable consumption.

#### *3.2. Development of the Type 2 Diabetes Risk Scale*

In many countries, 10-year risk scales for T2DM have been created on the basis of epidemiological studies [11]. These scales are based on the most specific risk factors of T2DM in a population under study. To build models for assessing the risk of T2DM, we used cut-off points (cut-off) which were calculated for the Siberian population in the age group under study. This allowed us to take into account the regional characteristics of the studied cohort. Different values were obtained for males and females: for males, the BMI (cut-off) was 27 kg/m<sup>2</sup> , WC (cut-off) 95.0 cm, SBP (cut-off) 150 mmHg, DBP (cutoff) 90 mmHg, TG (cut-off) 1.4 mmol/L, HDL-C (cut-off) 0.9 mmol/L, and FPG (cut-off) 6.0 mmol/L; for females, the BMI (cut-off) 32 kg/m<sup>2</sup> , WC (cut-off) 95 cm, SBP (cut-off) 135 mmHg, DBP (cut-off) 90 mmHg, TG (cut-off) 1.5 mmol/L, HDL-C (cut-off) 0.9 mmol/L, and FPG (cut-off) 5.7 mmol/L.

We designed a model for assessing the 10-year risk of T2DM on the basis of the cut-offs and multivariate Cox regression analysis. To create a risk scale, a new variable was created arterial hypertension (HT)<sup>1</sup> BP ≥ 150/90 mmHg for males and BP ≥ 135/90 mmHg for females. During the 10 years of follow-up, 463 (5%) new cases of T2DM were registered. The average age at first diagnosis of T2DM in the study population was 61.3 (6.7) years old.

Having studied the association between dichotomised risk factors and the 10-year risk of T2DM by multivariable-adjusted Cox regression analysis.

In males, the final version of the T2DM risk model includes significant risk factors dichotomised by the cut-off as T2DM predictors: FPG (cut-off) ≥6.0 mmol/L (HR = 3.79 (2.6; 5.6)), the BMI (cut-off) ≥27 kg/m<sup>2</sup> (HR = 3.03 (2.0; 4.7)), HDL-C (cut-off) ≤0.9 mmol/L (HR = 2.20 (1.2; 3.9)), TG (cut-off) ≥1.4 mmol/L (HR = 1.55 (1.0; 2.3)), and HT<sup>1</sup> ≥150/90 mmHg (HR = 1.57 (1.0; 2.4). The model for males is adjusted for FPG (cut-off), BMI (cut-off), HDL-C (cut-off), TG (cut-off), HT1. For females, predictors were included that were different from those in the model for males: WC (cut-off) ≥95 cm (HR = 2.25 (1.6; 3.1)), FPG (cutoff) ≥5.7 mmol/L (HR = 2.58 (2.0; 3.3)), TG (cut-off) ≥1.5 mmol/L (HR = 1.81 (1.4; 2.3)), HT<sup>1</sup> ≥135/90 mmHg (HR = 1.64 (1.2; 2.2)), a family history of T2DM (HR = 1.50 (1.1; 2.0)), and the BMI (cut-off) ≥32 kg/m<sup>2</sup> (HR = 1.47 (1.1; 1.9)). The model for females is adjusted for WC (cut-off), FPG (cut-off), TG (cut-off), HT1, age, family history of T2DM, BMI (cut-off).

Exp(B) measures served as weights to create the risk scale. Each predictor included in the regression model was scored by rounding out Exp(B) to a whole number (Table 4). The maximum total number of points on the created T2DM risk scale of the model is 13 for males and 12 for females.


**Table 4.** The 10-year risk scale for incident T2DM.

To determine the threshold of the total score associated with a high risk of T2DM, a receiver-operating characteristic (ROC) curve was constructed. The optimal cut-off for the sum of points that allowed to divide the analysed groups into two subgroups was 7 points (sensitivity (Se) 76.0%, specificity (Sp) 71.5%) for males and 6 points (Se 71.7%, Sp 69.2%) for females. When cross-checking the adequacy of the model in the population of Novosibirsk, we calculated the actual incidence of T2DM. Among the males who scored 7 or more points, T2DM developed in 10.2% of cases, and in the group of people who scored less than 7 points, only in 1.4% of cases; females who scored 6 or more points developed T2DM in 15.8% of cases, and among those who scored less than 6 points, incident cases of DM were detected only in 3.2% of cases.

Clinical testing of the newly developed model and of the Finnish model predicting the 10-year risk of T2DM on persons of retirement age in Novosibirsk revealed difficulties with independent filling out of the questionnaire [21]. Determination of the BMI, blood lipid levels, and in some cases, of own BP was difficult for the elderly. Accordingly, the next aim was to develop a simple T2DM risk calculator that would be convenient to use in primary health care and for self-completion for both sexes. Our model had to include only the parameters that can be easily assessed without laboratory tests or other measurements that require specialised medical skills. Predictors of the 10-year risk of T2DM selected for the best scale included history of hypertension 1.6 (HR = 1.6 (1.3; 1.8)), family history of T2DM (HR = 1.8 (1.1; 1.9)), WC (cut-off) (HR = 3.6 (1.9; 3.8)).

As a result of the analysis of various multivariate models, a scale with the best quality was selected (Table 5), where the area under the curve (AUC) was 0.71; this scale included such risk factors as WC, a history of arterial hypertension, and a family history of T2DM.

The relative risk scores obtained in the Cox regression analysis were chosen as variables to create a risk scale. Each of the three predictors included in the regression model was scored by rounding out to a whole number (Table 5). The highest total score on the created T2DM risk scale is 8 points. The cut-off of the scale was found to be 4 points—Se 74.7% and Sp 60.0%.


**Table 5.** The risk scale predicting the development of T2DM within 10 years (both females and males).

Note: WC of 95 cm was obtained by receiver-operating characteristic (ROC) analysis for those surveyed who experienced the first onset of type 2 diabetes mellitus (T2DM) within 10 years of observation in Novosibirsk.

In the group that scored ≥4 points during 10 years, T2DM developed in 10.7% of cases, and in the group with <4 points, T2DM developed in 2.6% of cases.

#### **4. Discussion**

According to our analysis of the study population aged 45–69 years old, the prevalence of T2DM was 11.0% in both the male and female samples [22]. During 14 years of observation, new cases of T2DM occurred in 9% of the population more often among females than among males, 328 (9.7%) and 587 (15.5%), respectively (*p* < 0.001).

The results were subjected to Cox univariate proportional hazards regression analysis with adjustment for age. This analysis revealed significant risk factors for T2DM among males and females: BMI ≥ 25 kg/m<sup>2</sup> , abdominal obesity, fasting hyperglycaemia, a high TG level, hypertension, a family history of DM, and a history of CVD. Among females, additional predictors were vocational- or primary-education level and leisure-time PA reported for the previous week (insufficient: 1–179 min per week or 1–29 min every day).

An increased BMI and WC indicate the presence of increased intra-abdominal visceral fat, which disrupts insulin metabolism through a release of serum-free fatty acids [23]. Nonetheless, according to a meta-analysis in 2018, not all obese individuals are at the same risk of T2DM; it seems that the risk is affected by their metabolic profile—the metabolically unhealthy obese have a ~10-fold higher risk of T2DM, whereas the metabolically healthy obese have a ~4.5-fold higher risk of T2DM, as compared to nonobese individuals. Moreover, in that study, weight gain during early adulthood was found to be more harmful than weight gain after the age of 25. On the contrary, peripheral fat accumulation has been linked to a better metabolic profile, which manifests itself in the observed protective effect of the greater hip circumference on T2DM [24].

Hypertension is known to be associated with the development of T2DM [25,26]. Persons with hypertension have an increased activity of the renin–angiotensin system, which causes systemic inflammatory processes leading to T2DM [27].

The presence of a family history of DM indicates a genetic contributor to DM but can also reflect the lifestyle or environmental conditions people were exposed to during their upbringing [28].

Impaired lipid metabolism with insulin resistance is characterised by an increase in TG levels, lower concentration of HDL-C, and an increase in free fatty acid levels. Diabetic dyslipidaemia also includes qualitative and kinetic lipid disorders, which are more atherogenic in nature, because cholesterol ester transfer protein (CETP) increases the production of small particles of low-density lipoproteins (LDL) [29]. Earlier studies in mice have shown dyslipidaemia to be a factor contributing to the apoptosis of pancreatic β-cells, to insulin biosynthesis, defective insulin secretion, and altered glucose metabolism. Fatty acid metabolism is known to be affected by ceramide formation, endoplasmic reticulum stress, oxidative stress, inflammation, the insulin signalling pathway, and protein kinase B, associated with damage to pancreatic β-cells. According to observational studies, there is a correlation between the level of TGs and the risk of T2DM [30,31].

High PA at leisure time reduces the relative risk of T2DM. Regular PA improves blood glucose control and can prevent or delay the onset of T2DM. [32,33]. Observational studies suggest that greater physical performance is associated with a reduced risk of T2DM [23], even if only as moderate-intensity exercises.

A 2018 meta-analysis revealed an association of lower educational attainment with a higher risk of T2DM [34]. The education level constitutes a component of socioeconomic status. Lower socioeconomic status correlates with higher stress levels, leading to a disruption of endocrine function through perturbations in the neuroendocrine system. Additionally, people with low socioeconomic status are more prone to an unhealthy lifestyle and have limited access to healthcare facilities [35].

In a multivariate regression analysis, we identified gender differences in the risk factors of T2DM. Among males, the best predictors were BMI, fasting hyperglycaemia, hypertension, and a family history of DM. Among females, the best predictors were BMI, fasting hyperglycaemia, hypertension, and fasting hyperglycaemia.

In 2016, Kautzky-Willer et al. analysed sex dimorphism in diabetes risk factors and found that T2DM is more often diagnosed at an earlier age in males, whereas the best predictors of T2DM are the BMI and WC in males and the BMI and WC in females. Limited mobility increases the risk of T2DM in females, and fasting hyperglycaemia in males [36]. The obtained data are explained by those authors as the influence of sex hormones on energy metabolism, body composition, vascular function, and inflammatory reactions. An endocrine imbalance denotes adverse cardiometabolic symptoms in females or males. Furthermore, genetic effects, epigenetic mechanisms, nutritional factors, and a sedentary lifestyle have different effects on the risk of T2DM and complications in both sexes [36].

According to the literature, the risk of DM increases with age [37], although in our models no such increase in risk was found. A possible explanation is that other factors such as WC or physical inactivity also increase with age [38], thus cancelling out the age-specific increase in DM risk.

According to a prospective study in Turkey with an average follow-up of 5.9 years, significant independent predictors of DM are abdominal obesity (risk ratio = 2.61 (95% CI 1.87–3.63)) and age in both sexes, hypertension (risk ratio = 1.81 (95% CI 1.10–2.98)), and low HDL-C in males only [39].

In the FINDRISK study, 38,689 participants aged 30–59 years old were analysed to estimate the prevalence of T2DM at the start and within 10 years. Among males, the frequency of diagnosed pharmacologically controlled T2DM increased over time. Compared to males surveyed in the 1970s, the incidence of diabetes was higher among males in the 1980s (adjusted HR = 1.44, 95% CI: 1.13–1.84) and in the 1990s (adjusted HR = 1.72, 95% CI: 1.32–2.24). The BMI explained some but not all of this variance. The increase occurred predominantly among males with a low education level (adjusted HR in the 1980s = 2.07, 95% CI: 1.28–3.35; adjusted HR in the 1990s = 2.12, 95% CI: 1.28–3.53) and an average education level (adjusted HR in the 1980s = 1.30, 95% CI: 0.85–1.99; adjusted HR in the 1990s = 1.65, 95% CI: 1.05–2.60). The female subgroup showed no dependence of DM incidence on education [40].

In a study conducted in Iran, after observation for an average of six years, 237 new cases of diabetes were identified, which corresponded to an age- and sex-adjusted cumulative incidence of 6.4% (95% CI: 5.6–7.2). In that study, in addition to the classic risk factors of DM, female sex and low levels of education significantly increased the risk of DM in the age-adjusted models. In the full model, independent predictors were age (odds ratio = 1.2 (95% CI: 1.1–1.3), family history of DM (1.8 (1.3–2.5)), BMI ≥ 30 kg/m<sup>2</sup> (2.3 (1.5–3.6)), abdominal obesity (1.9 (1.4–2.6)), high TG concentration (1.4 (1.1–1.9)), impaired fasting glucose (7.4 (3.6–15.0), impaired glucose tolerance (5.9 (4.2–8.4)), and combined impaired fasting glucose and impaired glucose tolerance (42.2 (23.8–74.9)) [41].

In Australia, among 554 adults who completed the study with a six-year followup, 100 developed DM. Abdominal obesity increased the risk of DM for aboriginal people (risk ratio = 2.0 (95% CI: 1.1–3.6)) and for residents of the Torres Strait Islands (odds ratio = 6.3 (95% CI: 2.5–16.1)) as compared to subjects with normal weight. Metabolic

syndrome was a strong predictor of DM (corrected risk ratio = 2.4 (CI 95% 1.6–3.7)). For both groups, the ratio of WC to hip circumference and the presence of metabolic syndrome predicted DM better than did WC or the BMI [42].

Risk models and calculators were first developed for cardiovascular diseases and are widely used in clinical practice and public health.

Diabetes risk scales in various countries have been actively designed since the 2000s. Most risk scales are based on the most sensitive and easily identifiable risk factors of T2DM such as age, sex, BMI, WC, T2DM in relatives, PA level, hypertension or regular use of antihypertensive drugs, and occasionally detectable hyperglycaemia. According to a 2011 review, 145 T2DM risk scales had been published, of which 94 risk models were consistent with qualitative studies—overall, 55 were inferences from risk models based on population studies and 39 were based on validation in new populations [10]. Millions of participants worldwide have already taken part in epidemiological studies assessing the risk of DM. A greater number of possible risk scales are now available to those wishing to use them clinically, but none of them is perfect, all have strengths and weaknesses.

The study has several limitations. One potential limitation is related to non-responders at baseline who might differ from responders by the distribution of DM2 risk factors or poor health; but, the response rate (61%) was a commonly accepted level for population studies and it is unlikely to influence the estimates of outcome risk. Against potential 'recall' bias in the survey, we applied standardised questionnaires/personnel/interview procedures, duplicated questions, and definitions of diseases by few sources (e.g., T2DM by interview of DM history and treatment, and fasting plasma glucose). We could not exclude the attrition bias at the follow-up stage due to those who dropped-out from the follow-up as non-responders for waves 2 or 3 or missed from the DM register, and those who died from competing causes. However, we used overlapping sources for diabetic endpoints in the cohort (DM register, CVD and mortality register, two repeated surveys, phone calls for non-responders) which minimised the chance for missing new cases of T2DM. The study was conducted in an urban Siberian population and has limited generalisability. However, it fits well for the risk of T2DM in the region and provides an excellent approach from the point of personalised medicine for region-specific preventive measures.

The validation of the FINDRISC questionnaire in Novosibirsk demonstrated the good quality of the model, which provides support for its use in Siberian populations. The cut-off risk score of 11 using the FINDRISC questionnaire to identify diabetes had a sensitivity and specificity (76.0% and 60.2%, respectively). The area under the receiver-operating curve for T2DM was 0.73 (0.73 in men and 0.70 in women) [43]. The T2DM risk model that includes three risk factors has several advantages over existing models. The cut-off of the scale was found to be 4 points—Se 74.7% and Sp 60.0%. Thus, the indicators of sensitivity and specificity of the cut-off values of the FINDRISK risk scale and the short Siberian risk scale are comparable and can be used in the medical practice.

#### **5. Conclusions**

Millions of people around the world have already participated in epidemiological studies aimed at assessing the risk of DM. A large number of risk scales are now available to those who seek to apply them clinically, but none of these instruments is ideal; each has strengths and shortcomings. In 2015, validation of the FINDRISC risk scale was conducted on a population sample in Novosibirsk [43]. During the validation, we obtained data suggestive of good quality of the model, which allows us to recommend its use in the Siberian population. Nevertheless, according to our data, not all risk factors included in the FINDRISC scale are widespread among patients with newly diagnosed T2DM, and the incidence of newly diagnosed DM in the very-high-risk group is lower than that predicted in Finland (22.6% versus 50%) [43]. Thus, the task of creating a risk scale for DM remains urgent, as does the issue of finding risk factors with a pronounced contribution to the development of T2DM in the Siberian population representing the appropriate approach from the point of personalised medicine. In addition, the revealed risk determinants might

be validated on a wider sample for generalisability. In this study, for the first time in Russia, as part of a cohort study, an assessment of the risk factors of T2DM in males and females was carried out via multivariate risk models for the development of T2DM. This approach allowed us to identify significant risk factors and direct preventive measures for correcting these factors.

The T2DM risk model that includes three risk factors has several advantages over existing models. It is based on a questionnaire, which takes little time to complete and enables a person to independently determine their risk of T2DM and then visit a healthcare institution for examination, determination of carbohydrate metabolism disorders or T2DM, and preventive measures.

#### **6. Patents**

Patent No. 030585 (EAPO), issued on 31.08.2018. "A method for predicting the risk of developing type 2 diabetes".

**Author Contributions:** Conceptualization M.I.V., G.I.S., M.B. and S.K.M.; methodology H.P., M.B., S.V.M. and S.K.M.; validation L.V.S., E.G.V. and S.V.M.; formal analysis S.V.M., S.K.M., L.V.S. and O.D.R.; investigation S.K.M., Y.I.R., S.V.M. and O.V.S.; data curation S.K.M.; writing—original draft preparation S.V.M., O.D.R., S.K.M.; writing—review and editing H.P., M.B.,Y.I.R., M.I.V.; supervision S.K.M., M.B.; project administration M.B. and S.K.M.; funding acquisition S.K.M., M.B. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Russian Science Foundation (grant No. 20-15-00371) and the Russian Academy of Science (state assignment No. AAAA-A17-117112850280-2); the HAPIEE study was funded by the Welcome Trust (WT064947, WT081081) and the US National Institute of Aging (1RO1AG23522).

**Institutional Review Board Statement:** The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Ethics Committee of Research Institute of Internal and Preventive Medicine–Branch of the Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (protocol № 1 from 14.mar.2002).

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Data Availability Statement:** The data presented in this study are available in tabulated form on request. The data are not publicly available due to ethical restrictions and project regulations.

**Acknowledgments:** The authors acknowledge A. Peasey, M. Holmes, D. Stefler, and J. Hubacek for valuable advice on manuscript planning and discussion.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of the data; in the writing of the manuscript; or in the decision to publish the results.

#### **References**


## *Article* **Hypothalamic Norepinephrine Concentration and Heart Mass in Hypertensive ISIAH Rats Are Associated with a Genetic Locus on Chromosome 18**

**Olga E. Redina 1,\*, Svetlana E. Smolenskaya <sup>1</sup> , Yulia K. Polityko 1,2, Nikita I. Ershov <sup>1</sup> , Michael A. Gilinsky <sup>2</sup> and Arcady L. Markel 1,3**


**Abstract:** The relationship between activation of the sympathetic nervous system and cardiac hypertrophy has long been known. However, the molecular genetic basis of this association is poorly understood. Given the known role of hypothalamic norepinephrine in the activation of the sympathetic nervous system, the aim of the work was to carry out genetic mapping using Quantitative Trait Loci (QTL) analysis and determine the loci associated both with an increase in the concentration of norepinephrine in the hypothalamus and with an increase in heart mass in Inherited Stress-Induced Arterial Hypertension (ISIAH) rats simulating the stress-sensitive form of arterial hypertension. The work describes a genetic locus on chromosome 18, in which there are genes that control the development of cardiac hypertrophy associated with an increase in the concentration of norepinephrine in the hypothalamus, i.e., genes involved in enhanced sympathetic myocardial stimulation. No association of this locus with the blood pressure was found. Taking into consideration previously obtained results, it was concluded that the contribution to the development of heart hypertrophy in the ISIAH rats is controlled by different genetic loci, one of which is associated with the concentration of norepinephrine in the hypothalamus (on chromosome 18) and the other is associated with high blood pressure (on chromosome 1). Nucleotide substitutions that may be involved in the formation or absence of association with blood pressure in different rat strains are discussed.

**Keywords:** norepinephrine concentration in the hypothalamus; heart mass; QTL analysis; SNPs; ISIAH hypertensive rat strain

#### **1. Introduction**

Hypertension is known to contribute to the development of such often fatal complications as cerebral stroke and myocardial infarction. Myocardial hypertrophy associated with hypertension, although not so pronounced in its initial clinical manifestations, ultimately leads to the development of cardiosclerosis, dangerous arrhythmias and heart failure. The development of myocardial hypertrophy in arterial hypertension is produced mainly by an increased afterload on the left ventricle. However, a lot of information has accumulated suggesting that the cause of myocardial hypertrophy in hypertension is far from unambiguous.

The dissociation between the blood pressure (BP) level and myocardial mass increase has been described both in patients with hypertension and in experimental hypertensive animals [1]. Thus, it was shown that the mass of the left ventricle in spontaneously hypertensive rats (SHR) is increased, not only in adult rats in which the blood pressure level is high, but also in young rat pups without obvious signs of arterial hypertension,

**Citation:** Redina, O.E.; Smolenskaya, S.E.; Polityko, Y.K.; Ershov, N.I.; Gilinsky, M.A.; Markel, A.L. Hypothalamic Norepinephrine Concentration and Heart Mass in Hypertensive ISIAH Rats Are Associated with a Genetic Locus on Chromosome 18. *J. Pers. Med.* **2021**, *11*, 67. https://doi.org/10.3390/jpm 11020067

Academic Editor: Mikhail Ivanovich Voevoda Received: 1 December 2020 Accepted: 21 January 2021 Published: 23 January 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

which indicated the genetic origin of cardiac hypertrophy. Antihypertensive therapy in adult SHR rats led to a decrease in blood pressure. At the same time, there was some involution of myocardial mass, but only if the antagonist of the sympathetic system, alphamethyldopamine, was used as a hypotensive drug. The use of another antihypertensive drug (hydralazine) led to an even greater decrease in blood pressure, but there was no involution of myocardial hypertrophy. The authors concluded that SHR rat myocardial hypertrophy is caused not only by an increase in blood pressure, but also by hormonal stimulation by the sympathetic adrenal and renin–angiotensin systems [2]. Another type of hypertension with myocardial hypertrophy was developed in rats receiving a high fructose diet (a condition similar to the metabolic syndrome of humans). It turned out that in this case the mechanism of cardiac hypertrophy is caused not so much by the degree of increase in blood pressure, but by an increase in the function of the sympathetic adrenal system [3]. When studying the role of adrenergic stimulation on the growth of cardiomyocytes in cell culture, it was shown that stimulation of α1-adrenergic receptors induces the growth and hypertrophy of neonatal cardiomyocytes and enhances the expression of contractile protein genes. Moreover, this effect did not depend on the mechanical activity of cells [4]. All this indicates the presence of additional factors, along with enhanced blood pressure, responsible for the development of myocardial hypertrophy in arterial hypertension, and these additional factors are directed to stimulation of the sympathetic adrenal and other neuroendocrine systems.

Recently, more and more attention has been paid to the role of the central nervous system and its sympathetic part as a kind of trigger in the pathogenesis of hypertension [5]. Using modern methods of visualizing the functioning of the brain (Fmri—functional magnetic resonance imaging) and constructing a neuroconnectome it has been shown that nerve connections between a number of brain structures, including the hypothalamus and the brainstem regions, are involved in maintaining the sympathetic tone and the regulation of blood pressure [6]. Hypothalamus, receiving extensive efferent information and having wide effector connections, is considered as the central integrating link of the nervous system. It functions as a regulatory center in maintaining the body's homeostasis, controlling endocrine, autonomic and somatic reactions. It is the hypothalamus that plays the role of coordinator of the neuroendocrine and sympathoadrenal systems' functions [7].

One of the most important hypothalamic neurotransmitters is norepinephrine. The paraventricular nucleus of the hypothalamus is thought to be the primary recipient of stress-related information, in particular, from the norepinephrine-releasing Locus Ceruleus (LC). The hypothalamus–LC axis controls input and output information from and to the peripheral autonomic nervous system, from and to the hypothalamic–adrenocortical and adrenomedullary systems with numerous brain-to-gland positive and negative (feedback) loops [8]. The neurons of the paraventricular nuclei, in which the central stimulator of the pituitary–adrenal system (stress system)—corticotropin releasing hormone (CRH) is synthesized, also receives noradrenergic stimulation [9,10]. Postganglionic sympathetic neurons innervate the myocardium, and myocardial hypertrophy associated with increased sympathetic tone is realized through stimulation of both myocardial alpha and beta adrenoreceptors [11].

The concentration of norepinephrine in the hypothalamus is an important factor in the regulation of the activity of the sympathetic nervous system and blood pressure [12,13]. The relationship between activation of the sympathetic nervous system and cardiac hypertrophy is shown in experimental rat models. Adult rats with spontaneously developing hypertension (SHR) are characterized by heart hypertrophy [14] and an increased content of norepinephrine in the hypothalamus [15,16]. Sympathetic hyperactivity in rats with heart failure is associated with an increase in norepinephrine in the paraventricular nucleus of the hypothalamus at rest, and in acute sympathetic activation under conditions of restraint stress in rats with heart failure, an impairment of the central noradrenergic system's response was observed [17].

Based on the available information, in Inherited Stress-Induced Arterial Hypertension (ISIAH) rats simulating the stress-sensitive form of arterial hypertension, one can also suggest a connection between sympathetic activation and pathological changes in the myocardium. In the medulla of the adrenal glands of the ISIAH rats, an increased amount of chromogranin A was shown [18], which is an indicator of the catecholamines synthesis activation [19]. In addition, an increased concentration of epinephrine in the adrenal glands has been shown [20], which indicates an enhanced function of the sympathoadrenal system in ISIAH rats. ISIAH rats are also characterized by increased reactivity of the sympathoadrenal and hypothalamic–pituitary–adrenal systems [20], as well as morphological changes in the peripheral target organs, including the heart. In early postnatal ontogenesis in ISIAH rats, specific morphometric changes were described, such as an increase in the thickness of the walls of the ventricles and interventricular septum, indicating the formation of structural remodeling of the ventricular myocardium [21], and in adult rats there is an increase in the absolute and relative mass of the heart and signs of left ventricular hypertrophy compared with normotensive control rats [22].

It is known that an increase in the activity of the sympathetic nervous system is associated with the development of pathophysiological processes in the cardiovascular system and the development of hypertension [23]. Taking into account that ISIAH rats have characteristic signs of increased sympathetic tone at rest and its increased reactivity under stress, including a sharp increase in norepinephrine and adrenaline in blood plasma, we can assume that cardiac hypertrophy in ISIAH rats is, at least in part, the result of sympathetic tension. This is also indicated by the high expression of mRNA of the *Adrb1* (beta 1 adrenergic receptor) gene in the myocardium of intact ISIAH rats [24]. The involvement of sympathetic activation both in the functioning of the heart and in the development of hypertension in ISIAH rats is also supported by the fact that intravenous administration of antisense oligonucleotides directed to mRNA of beta 1-adrenergic receptors resulted in a long-term decrease in the blood pressure in ISIAH rats [24].

One of the approaches to studying the genetic control of physiological and pathophysiological traits is Quantitative Trait Loci (QTL) analysis, which allows the determination of the genetic loci associated with phenotypic traits. QTL analysis is also an effective approach for determining common loci for several traits, where genes with pleiotropic effects on the traits under study can be located. Previously, using QTL analysis, we described the genetic loci associated with the level of blood pressure and the mass of the adrenal glands, kidneys, and heart [25,26], as well as the loci associated with the plasma corticosterone concentration in hypertensive ISIAH rats [27]. All the studied traits were associated with several loci suggesting polygenic control of the manifestation of the listed characters of ISIAH rats. In addition, it was previously shown using the linear regression method (ANOVA) that the trait "concentration of norepinephrine in the hypothalamus" can be associated with several genetic loci in ISIAH rats [28]. In addition, previously, using data from transcriptome (RNA-Seq) analysis of several organs/tissues of ISIAH and control (Wistar Albino Glaxo (WAG) rats, as well as the available sequencing data for the genomes of 42 other rat strains and substrains, ISIAH rat-specific single nucleotide polymorphisms (SNPs) have been described, and significant genetic distances between the genotype of ISIAH rats and the genotypes of all other rat strains and substrains were shown [29].

The aim of this work was to conduct a genome-wide analysis of the trait designated as "concentration of norepinephrine in the hypothalamus" using QTL analysis and to describe the complete list of genetic loci associated with this trait. In addition, the goal of this work was to identify the loci associated with both the concentration of norepinephrine in the hypothalamus and the increased myocardial mass in ISIAH rats, which may contain genes that lead to cardiac hypertrophy associated with increased sympathetic stimulation. In the QTL of interest, nucleotide substitutions specific to the ISIAH rat strain, which may possibly contribute to the phenotypes under consideration, were identified and discussed.

#### **2. Results**

#### *2.1. Determination of Loci Associated with the Concentration of Norepinephrine in the Hypothalamus*

Determination of loci associated with the concentration of norepinephrine in the hypothalamus was carried out using hypertensive ISIAH and normotensive control WAG rats. The concentration of norepinephrine in the hypothalamus of ISIAH rats is significantly higher than that of WAG rats (Table 1).

**Table 1.** Comparative characteristics of Inherited Stress-Induced Arterial Hypertension (ISIAH) and Wistar Albino Glaxo (WAG) rats aged 6 months.


\*\* *p* < 0.01 in ISIAH as compared with WAG (Student's *t*-test). M—mean, SEM—standard error of mean.

A QTL analysis of this trait was performed using F2(ISIAH × WAG) male rats at the age of 6 months. The trait "concentration of norepinephrine in the hypothalamus" was analyzed for the first time. Figure 1 shows the LOD (**logarithm of odds)** plots for chromosomes.

**Figure 1.** Genome-wide LOD (**logarithm of odds)** plots for concentration of norepinephrine in the hypothalamus in male rats F<sup>2</sup> (ISIAH × WAG) aged 6 months. The X axis shows the chromosome numbers. Lines show the level of experimentwise threshold: the dashed line corresponds to *p* = 0.05, the dotted line corresponds to *p* = 0.025.

The genome scan revealed one significant and several suggestive loci associated with concentration of norepinephrine in the hypothalamus, indicating the polygenic control of the trait. The description of these QTL is given in Table 2.

The effects of alleles on the trait are shown in Table 3. In rats with two ISIAH alleles at the loci on chromosomes 18, 6 (in the region of the D6Rat143 marker) and X, the value of the trait "concentration of norepinephrine in the hypothalamus" was significantly increased in comparison with animals with both alleles of normotensive WAG rats in these QTL. At the loci, on chromosomes 4 and 16, the opposite effect was observed—a decrease in the value of the trait in the presence of two alleles of ISIAH rats. At two loci, a significant change in the concentration of norepinephrine in the hypothalamus was associated with a heterozygous genotype. At the locus on chromosome 6 (in the region of the marker D6Rat75), the value of the trait in animals with a heterozygous genotype decreased, and at the locus on chromosome 14 the value of the trait increased in heterozygotes. Accordingly, the obtained results show that the effects of alleles of hypertensive ISIAH rats at different loci have a specific modulating effect on the trait.


**Table 2.** Quantitative Trait Loci (QTL) associated with concentration of norepinephrine in the hypothalamus in male rats F<sup>2</sup> (ISIAHxWAG) aged 6 months.

\* Boundaries of a locus are defined in the respective one LOD interval; Mb—megabases.

**Table 3.** Allele effects in QTL associated with concentration of norepinephrine in the hypothalamus in male rats F2 (ISIAHxWAG) aged 6 months.



**Table 3.** *Cont.*

I/I—a homozygote for the ISIAH allele; W/W—a homozygote for the WAG allele; I/W—a heterozygote; Mb—megabases; \* *p* < 0.05, \*\* *p* < 0.01—compared with W/W; † *p* < 0.05, †† *p* < 0.01—compared with H/W; #—degree of dominance. Н

#### *2.2. Determination of Loci Associated with both the Concentration of Norepinephrine in the Hypothalamus and Heart Mass*

Using the QTL Cartographer program, which allows one to determine the covariance of two or more characters, the character covariance of the norepinephrine concentration in the hypothalamus with heart mass was found at the locus on chromosome 18 in the region of marker D18Mgh1 (Figure 2). The description of the locus associated with the mass of the heart on chromosome 18 is given in Table 4.

−

−

−

**Figure 2.** The covariance of two traits ( in the hypothalamus and heart mass) at the locus on chromosome 18.

**Table 4.** QTL on chromosome 18 associated with heart mass in male rats F<sup>2</sup> (ISIAHxWAG) aged 6 months.


\* Boundaries of a locus are defined in the respective one LOD interval; Mb—megabases.

The presence of ISIAH alleles in QTL on chromosome 18 (in the region of marker D18Mgh1) is associated with an increase in both the concentration of norepinephrine in the hypothalamus and heart mass (Figure 3).

**Figure 3.** Allele effects on trait values in the QTL on chromosome 18 in the region of marker D18Mgh1: (**a**) concentration of norepinephrine in the hypothalamus; (**b**) heart mass; I/I—a homozygote for the ISIAH allele; W/W—a homozygote for the WAG allele; I/W—a heterozygote.

#### *2.3. Nucleotide Substitutions (SNPs) Detected in ISIAH but not in WAG Rats*

In the QTL associated with the concentration of norepinephrine in the hypothalamus and heart mass, 240 SNPs (Supplementary Table) were identified in the transcribed regions belonging to the sequences of 82 genes. None of the substitutions found at the locus were characterized as having a high impact effect, but most of them can have a modifying effect (Table 5).



Nonsynonymous substitutions were analyzed using the Sorting Intolerant From Tolerant (SIFT) algorithm (Table 6). In two genes, *Slc4a9* (solute carrier family 4, member 9) and *Tcof1* (treacle ribosome biogenesis factor 1), nonsynonymous substitutions were characterized in the SIFT program as Deleterious, i.e., these substitutions are likely to have an effect on the structure and/or function of the proteins encoded by these genes. Comparison with the sequencing data of the genomes of 42 rat strains and substrains showed that a nonsynonymous substitution in the mRNA sequence of the *Slc4a9* gene (c.269C>T; p.Ala90Val) occurs only in ISIAH rats (Table 6). Several substitutions were found in the *Tcof1* gene (Table 6 and Supplementary Table) and the substitution c.1556C>A, leading to the replacement of the amino acid p.Ala519Glu, which is characterized as a Deleterious one, occurs in 20 out of 42 rat strains (Supplementary Table).


**Table 6.** Predicting the effects of missense variants on protein function using the Sorting Intolerant From Tolerant (SIFT) algorithm.

\*—statistically significant values are given in bold; ID—identification number; DP—sequencing depth.

#### **3. Discussion**

The genome scan was carried out and QTL associated with the concentration of norepinephrine in the hypothalamus of ISIAH rats were identified. QTL analysis of this trait on other rat strains has not yet been performed. The obtained results unambiguously show that the manifestation of the trait "concentration of norepinephrine in the hypothalamus" in ISIAH rats is under polygenic control, like other earlier studied physiological traits (blood pressure, body weight and mass of the adrenal glands, kidneys, and heart, as well as plasma corticosterone concentration) [26,27]. We showed that the presence of ISIAH hypertensive rat alleles in some loci can lead to a decrease in the concentration of norepinephrine in the hypothalamus, and in other loci—to its increase. These data may be useful in selecting candidate genes for a specific modulating effect on the trait.

The most highly significant association of norepinephrine concentration in the hypothalamus was found with a genetic locus on chromosome 18. The significant extent of this QTL suggests that, in this region of chromosome 18, there are most likely several genes that may be involved in the genetic control of the norepinephrine concentration in the hypothalamus in ISIAH rats. In the central part of chromosome 18 in the region of marker D18Mgh1, this QTL overlaps with the locus previously described as associated with heart mass [26]. Despite the fact that the genetic control of heart mass in ISIAH rats was earlier associated with several genetic loci, only one of these was found to be associated with both heart mass and norepinephrine concentration in the hypothalamus (Figure 2). Accordingly, it is precisely in the central part of chromosome 18 in the region of the D18Mgh1 marker the closely linked genes controlling the manifestation of the studied traits, or genes with a pleiotropic effect on the traits "concentration of norepinephrine in the hypothalamus" and "heart mass" can be localized. Earlier, for this locus on chromosome 18, an association with the heart mass (Cardiac mass QTL 125, Cm125) was also found in the congenic rats SS.LEW- (D18Chm41-D18Rat92)/Ayd, in which the fragment of chromosome 18 (30.8—52.3 Mb) of Dahl salt-sensitive (DSS) rats, which are a model of the salt-sensitive hypertension, has been replaced by a fragment of the genome of normotensive Lewis (LEW) rats [30].

Increased sympathetic activity is considered a risk factor for cardiovascular problems and as a cause of the onset and maintenance of high blood pressure [23]. The central nervous system, in particular the nuclei of the hypothalamus, plays a vital role in the regulation of these processes [31]. The current work made it possible for the first time to determine the genetic loci associated with the concentration of norepinephrine in the hypothalamus and compare them with the loci previously associated with heart mass and blood pressure both in ISIAH rats and in other hypertensive rat strains. When studying SS hypertensive rat strains simulating salt-sensitive hypertension [30,32–34], and in SHR

rats with spontaneously developing hypertension [35], this locus on chromosome 18 was also associated with the Blood pressure trait. We mapped the Blood pressure trait earlier in ISIAH rats [26]; however, no associations of the Blood pressure trait with the genetic locus on chromosome 18 discussed in this work was identified. This is not the first time that certain loci have been associated only with the weight parameters of target organs, but not with the level of blood pressure. It is believed that such results make it possible to identify the mechanisms that underlie the development of hypertrophy of target organs but independent of hypertension [30]. Accordingly, we assume that the genetic control of sympathetic activation, which affects the development of hypertension in ISIAH rats, differs from that in SHR rats and rats with salt-sensitive hypertension. The differences found in the genetic control of the traits in SS, SHR, and ISIAH rats are in good agreement with the fact that the genotype of ISIAH rats is significantly different from the genotypes of other hypertensive strains [29]. The fact that no association with blood pressure was found in ISIAH rats at this locus is in good agreement with the concept of the nature of hypertension in humans, when chronic activation of the sympathetic nervous system in hypertension has a diverse range of pathophysiological consequences independent of any increase in blood pressure [36].

It is known that various (hemodynamic and hormonal) factors can influence the functioning of the heart, which can cause left ventricular hypertrophy [37]. Our results are in good agreement with these ideas. Despite the fact that in ISIAH rats at the locus on chromosome 18, the value of heart mass is not associated with the level of blood pressure; such an association was found on other chromosomes. We previously showed that in the distal part of chromosome 1 there is a locus associated with both blood pressure and heart mass in ISIAH rats [26]. Moreover, at the locus common to these two traits on chromosome 1, the presence of ISIAH rat alleles was associated with an increase in the values of both characters. Our results allow us to conclude that in ISIAH rats the development of heart hypertrophy has a complex nature and is associated both with sympathetic activation, which is controlled by the locus on chromosome 18, and with increased blood pressure, which is controlled by the locus on chromosome 1. These data are in good agreement with the idea that the activation of the sympathetic nervous system plays an important role in the development and maintenance of arterial hypertension and development of cardiac hypertrophy and heart failure. In addition, our data emphasize that these processes can be controlled by different genetic loci. This is consistent with the opinion that pharmacological treatment of hypertension should be aimed not only at lowering blood pressure, but also at correcting the sympathetic nervous activity [38].

In ISIAH rats, numerous nucleotide substitutions were found at the discussed locus, which can have a modifying effect on the level of gene expression. As shown in a number of studies, nucleotide substitutions in the regulatory regions of mRNA [39–42] and in introns [43,44] can have a significant modifying effect on transcription and translation and lead to various pathologies. Accordingly, it can be assumed that some of the substitutions listed in Table 5 may be essential in the formation of phenotypes associated with the discussed locus.

However, it is believed that it is the single nucleotide polymorphisms that lead to nonsynonymous amino acid substitutions in protein molecules that can most significantly affect its function and have a significant impact on human health compared to SNPs in other regions of the genome [45]. Our analysis allowed us to identify two SNPs that lead to nonsynonymous amino acid substitutions and, according to the SIFT algorithm, can presumably lead to changes in the structure and/or function of proteins encoded by the *Slc4a9* and *Tcof1* genes.

*Slc4a9* (solute carrier family 4, member 9) is encoding transmembrane anion exchange protein involved in chloride/bicarbonate exchange. However, according to the National Center for Biotechnology Information (NCBI) Database, this gene is expressed mainly in the kidneys, and its expression in brain and in heart is very low [46]. Accordingly, on the one hand, it can be assumed that the substitution found in the *Slc4a9* gene sequence may not play a key role in the formation of the discussed phenotypes. However, on the other hand, it can be assumed that the expression level of this gene may increase with sympathetic activation or other conditions and affect the function of the heart, which, however, has not been studied so far.

*Tcof1* (treacle ribosome biogenesis factor 1) gene product is involved in ribosomal DNA gene transcription by interacting with upstream binding factor [47]. Mutations in the *TCOF1* gene cause Treacher Collins syndrome (TCS), which is an autosomal dominant disorder characterized by abnormalities of craniofacial development that arises during early embryogenesis. However, the nucleotide substitution c.1556C>A, detected in the genome of ISIAH rats, also occurs in 20 out of 42 analyzed rat strains and substrains (see Supplementary Table), the development of which corresponds to normal, which indicates that the discussed substitution in the mRNA sequence of the gene *Tcof1* is not associated with TCS. Interestingly, the *Tcof1* gene is considered the only function candidate for Blood pressure in QTL 319 also known as C18QTL3 (53.3–78.6 Mb), described in the study of genetic control of hypertension in salt-sensitive rats [48].

Could a missense mutation in the *Tcof1* gene sequence be the reason that no association with blood pressure was found at this locus in ISIAH rats? As we wrote above, this mutation was found in 20 out of 42 analyzed rat strains/substrains. These included both several rat strains used as a normotensive control (FHL/EurMcwi, WKY/N, WKY/Gla, WKY/NHsd), and several hypertensive strains (FHH/EurMcwi, SBH/Ygl, SHRSP/Gla, SHR/OlaIpcv, SHR/NCrlPrin, SHR/NHsd, SHR/OlaIpcvPrin). However, it should be noted that among the hypertensive strains with this substitution, there were no strains with salt-sensitive hypertension, in which arterial blood pressure-associated loci were described in the discussed genome locus, and rats with spontaneous hypertension (SHR/Mol), for which an association with blood pressure was found at the locus of chromosome 18 [35] were not included in SNPs analysis. For SHRSP rats on chromosome 18, the Blood pressure QTL2 locus was described in region 1–34.9 Mb [49], which does not include the *Tcof1* gene, and for FHH [50] and SBH rats [51], Blood pressure trait mapping showed associations with other chromosomes. Based on the foregoing, it can be concluded that the found substitution in the *Tcof1* gene deserves additional attention and subsequent study of its possible association with the manifestation of hypertensive status in rats of different strains. As shown earlier, the epistatic hierarchy may play an important role in the genetic regulation of blood pressure [52]. Perhaps for the manifestation of this polymorphism on the regulation of blood pressure, its effect must be combined with other genetic factors.

The results presented in this work add new information to the existing knowledge about the functional annotation of the genome, which is an important step in understanding the formation of external phenotype [53]. Undoubtedly, the QTL method has its limitations and is only the first stage in a long process of further experimental evidence of the relationship between the function of candidate genes found in loci and the manifestation of the phenotype. Nevertheless, this work was the first to analyze the QTL for an important trait—the concentration of norepinephrine in the hypothalamus, and for the first time, a genetic locus associated with both the concentration of norepinephrine in the hypothalamus and an increased heart mass was described. In addition, it was shown that the locus found on chromosome 18 may or may not be associated with the level of blood pressure in different strains of hypertensive rats. Accordingly, the results obtained in this study can be useful for identifying specific targets for the treatment of heart pathology associated with increased sympathetic activity at the identified genetic locus on chromosome 18. Nucleotide substitutions specific to ISIAH rats that were identified and discussed in this study can potentially be used in further studies of genetic control of the manifestation of the corresponding phenotypes in human populations.

#### **4. Materials and Methods**

#### *4.1. Animals*

We used hypertensive ISIAH/Icgn rats (abbreviation from the words Inherited Stress-Induced Arterial Hypertension) and the normotensive WAG/GSto-Icgn (Wistar Albino Glaxo) strain. This work was carried out on the basis of the *Center for Genetic Resources* of *Laboratory Animals*, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia. Rats were kept under standard conditions, water and balanced food were given without restriction. All manipulations with animals were carried out in accordance with the European Convention for the protection of Vertebrate Animals Used for experimental and Other Scientific Purposes (ETS 123), Strasbourg, 18 March 1986 and in compliance with the rules of the Animal Care and Use Committee of Institute of Cytology and Genetics SB RAS, and approved by the Bioethical Committee of the Scientific Research Institute of Physiology and Basic Medicine (protocol No. 7 of 9 October 2015), Novosibirsk, Russia.

#### *4.2. Determination of Norepinephrine Concentration in the Hypothalamus*

Rats were quickly decapitated, the hypothalamus was isolated and frozen in liquid nitrogen, and then stored at −70 ◦C until the norepinephrine concentration was measured. A tissue sample was homogenized in a glass homogenizer in 300 µL 0.1 M perchloric acid, to which isopropyl–norepinephrine (100 ng/mL) (Sigma, United States) was added as an internal standard. The homogenate was centrifuged for 15 min (12,000× *g*). The supernatant was transferred to a separate tube. High performance liquid chromatography with two-electrode electrochemical detection was used for the analysis of norepinephrine in the supernatant [54]. The sensitivity of the measurement was 2 pg/mL. Norepinephrine concentration was measured in ng/mg of tissue.

#### *4.3. QTL (Quantitative Trait Locus) Analysis*

QTL analysis was performed on male F2 hybrids (ISIAHxWAG) at the age of 6 months (*n* = 126) using 149 polymorphic microsatellite markers. Their list and primer sequences are given on the website of the Institute of Cytology and Genetics SB RAS (http://icg. nsc.ru/isiah/en/category/qtl/). The position of microsatellite markers on chromosomes was determined using the RGSC Genome Assembly v 5.0. and expressed in millions of nucleotides (megabases, Mb) from the start of the chromosome. The genotyping was performed as described previously [26].

Linkage analysis was performed using the programs MAPMAKER/EXP 3.0 and MAPMAKER/QTL 1.1 [55]. The concentration of norepinephrine in the hypothalamus was transformed using the natural logarithm (ln) to reduce the asymmetry and excess (skewness and kurtosis) in the distribution of the values of the trait. A one-LOD dropoff was used to obtain an approximate 95% confidence interval for QTL position. QTL Cartographer Version 1.17, JZmapqtl [56] was used to map loci that are common for pairs of traits (bivariate analysis), and to calculate the LOD score significance thresholds. The level of statistical significance was calculated by random permutation of experimental data with replication 1000 times (permutation test) [57]. The linkage was considered significant if the experimentally obtained LOD score exceeded 5% of the threshold value in the analysis of the genome (experiment-wise threshold) [58], linkage was considered probable if the experimentally obtained LOD score exceeded 5% of the threshold value during a permutation test for a single chromosome (chromosome-wise threshold). Statistical processing of the results was carried out using Statistica 6.0 (StatSoft, Tulsa, OK, USA). The significance of differences between the mean values was evaluated using Student's *t*-test. The degree of dominance was calculated by the standard method [59], assuming that the complete dominance of ISIAH rat alleles is +1, and the complete dominance of WAG rat alleles is −1.

The Rat Genome Database (RGD, http://rgd.mcw.edu/) was used to compare the genetic loci found in this work with the results of studies on other model animals.

#### *4.4. Tissue Collection for SNP Analysis*

To analyze SNPs, male ISIAH/Icgn and WAG/GSto-Icgn rats were used. Each experimental group consisted of 6 rats. Samples of five tissues were taken from each animal for analysis of the transcriptome (brainstem, hypothalamus, adrenal gland, cortex and medulla of the kidney). Sequencing of transcriptomes was performed at JSC Genoanalytica (Moscow, Russia).

#### *4.5. RNA Sequencing*

Over 10 million single-end reads of 50-bp length were obtained for each sample in accordance with standard Illumina protocols as described previously [29]. All samples were analyzed as biological replicates.

Mapping was performed for the reference genome (Rnor\_5.0\rn5) using the TopHat2 software [60]. The quality of the mapped data was assessed using the "CollectRnaSeqMetrics" module in the Picard software package (http://broadinstitute.github.io/picard/). Potential PCR duplicates were removed from the mapped bam data obtained for 5 different tissues of each animal (hypothalamus, brainstem, adrenal gland, renal medulla and cortex), after which they were combined into one pool for each animal with the Picard "MergeSamFiles" module for further analysis.

#### *4.6. Polymorphism Detection*

The initial set of polymorphisms was determined using the Genome Analysis Toolkit (GATK) [61] using the "HaplotypeCaller" module in the "GVCF" mode and the "GenotypeGVCFs" module for combined genotyping, using the settings for variant calling and hard filtering recommended by the GATK developers. The details were described in [29].

Determination of polymorphisms in transcriptome data of ISIAH and WAG rats was carried out relative to the reference genome sequence of BN/NhsdMcwi rats in the assembly version RGSC5.0 [62]. Further analysis of the list of polymorphic variants of ISIAH rats was carried out using the Rat Genome Sequencing Consortium data for the genome sequences of 42 rat strains and substrains including 11 hypertesive rat strains and substrains: FHH/EurMcwi, LH/MavRrrc, MHS/Gib, SBH/Ygl, SHR/OlaIpcv, SHRSP/Gla, SHR/NCrlPrin, SHR/NHsd, SHR/OlaIpcvPrin, SS/Jr, SS/JrHsdMcwi; 10 rat strains and substrains that usually serve as a normotensive control: FHL/EurMcwi, LN/MavRrrc, LL/MavRrrc, MNS/Gib, SBN/Ygl, SR/Jr, WKY/N, WKY/Gla, WKY/NCrl, WKY/NHsd; and 21 other rat strains and substrains that are used in experiments not related to hypertension: ACI/N, ACI/EurMcwi, BBDP/Wor, BN-Lx/Cub, BN-Lx/CubPrin, BN/SsN, BUF/N, DA/BklArbNsi, F334/N, F344/NHsd, F344/NCrl, SUO\_F344, GK/Ox, LE/Stm (SOLiD), LEW/Crl, LEW/NCrlBR, LE/Stm (Illumina), M520/N, MR/N, WAG/Rij, WN/N [63]. Comparison of single nucleotide polymorphisms of ISIAH/Icgn and WAG/Gsto-Icgn strains with genotypes of 42 rat strains and substrains was carried out only for genomic loci sequenced in transcriptomic analysis.

#### *4.7. Prediction of the SNP Effects*

The classification of the found polymorphisms and their effects are given according to the description in the SnpEff program (http://snpeff.sourceforge.net/SnpEff\_manual. html). To determine the possible effect of amino acid substitution on protein function, the Sorting Intolerant From Tolerant (SIFT) program [64] was used.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/2075-442 6/11/2/67/s1, Supplementary Table: SNPs in QTL.

**Author Contributions:** Conceptualization, O.E.R. and A.L.M.; data curation, O.E.R., S.E.S., Y.K.P. and M.A.G.; formal analysis, O.E.R., S.E.S. and N.I.E.; funding acquisition, O.E.R.; investigation, O.E.R., S.E.S., N.I.E., Y.K.P., M.A.G. and A.L.M.; methodology, O.E.R. and A.L.M.; supervision, O.E.R. and A.L.M.; writing—original draft, O.E.R.; writing—review and editing, O.E.R., S.E.S., N.I.E., Y.K.P., M.A.G. and A.L.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** Genotyping was supported by budget project No. 0259-2021-0015. SNPs analysis was supported by budget project No. 0259-2021-0013. The mathematical processing of the obtained data and the preparation of the results for publication was supported by the Russian Foundation for Basic Research, grant No. 20-04-00119a.

**Institutional Review Board Statement:** All manipulations with animals were carried out in accordance with the European Convention for the protection of Vertebrate Animals Used for experimental and Other Scientific Purposes (ETS 123), Strasbourg, March 18, 1986 and in compliance with the rules of the Animal Care and Use Committee of Institute of Cytology and Genetics SB RAS, and approved by the Bioethical Committee of the Scientific Research Institute of Physiology and Basic Medicine (protocol No. 7 of 9 October 2015), Novosibirsk, Russia.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** The authors are grateful to JSC Genoanalytica (Moscow, Russia) for conducting the technological part of the RNA-Seq analysis. The Siberian Branch of the Russian Academy of Sciences (SB RAS) Siberian Supercomputer Center is gratefully acknowledged for providing supercomputer facilities.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Article* **Does Proteomic Mirror Reflect Clinical Characteristics of Obesity?**

**Olga I. Kiseleva 1,\* , Viktoriia A. Arzumanian <sup>1</sup> , Ekaterina V. Poverennaya <sup>1</sup> , Mikhail A. Pyatnitskiy <sup>1</sup> , Ekaterina V. Ilgisonis <sup>1</sup> , Victor G. Zgoda <sup>1</sup> , Oksana A. Plotnikova <sup>2</sup> , Khaider K. Sharafetdinov 2,3,4 , Andrey V. Lisitsa <sup>1</sup> , Victor A. Tutelyan 2,4, Dmitry B. Nikityuk 2,4, Alexander I. Archakov <sup>1</sup> and Elena A. Ponomarenko <sup>1</sup>**


**Abstract:** Obesity is a frightening chronic disease, which has tripled since 1975. It is not expected to slow down staying one of the leading cases of preventable death and resulting in an increased clinical and economic burden. Poor lifestyle choices and excessive intake of "cheap calories" are major contributors to obesity, triggering type 2 diabetes, cardiovascular diseases, and other comorbidities. Understanding the molecular mechanisms responsible for development of obesity is essential as it might result in the introducing of anti-obesity targets and early-stage obesity biomarkers, allowing the distinction between metabolic syndromes. The complex nature of this disease, coupled with the phenomenon of metabolically healthy obesity, inspired us to perform data-centric, hypothesisgenerating pilot research, aimed to find correlations between parameters of classic clinical blood tests and proteomic profiles of 104 lean and obese subjects. As the result, we assembled patterns of proteins, which presence or absence allows predicting the weight of the patient fairly well. We believe that such proteomic patterns with high prediction power should facilitate the translation of potential candidates into biomarkers of clinical use for early-stage stratification of obesity therapy.

**Keywords:** obesity; BMI; blood tests; proteomics; mass spectrometry

#### **1. Introduction**

Obesity in most cases is blatantly visible by the unaided eye. Paradoxically, at the same time both clinicians and citizens tend to ignore this pathology, acquiring the scale of "globesity" [1,2]. Being a generally preventable disease, obesity, resulting from the excess of body fat, often entails the development of 50+ various pathologies, significant disability, and premature death [3].

The pathogenesis of obesity involves the interaction of genetic, environmental, and behavioral factors [4]. Each time, figuring out the characteristic features in the biomedical portrait of obesity, scientists are trying to resolve the nature vs nurture debate [5]. The multifactorial nature and high comorbidity of obesity make it difficult to understand the clear molecular nature of this disease. Moreover, about a third of obese patients are "metabolically healthy" with little or no evidence of metabolic syndrome. There are four

**Citation:** Kiseleva, O.I.; Arzumanian, V.A.; Poverennaya, E.V.; Pyatnitskiy, M.A.; Ilgisonis, E.V.; Zgoda, V.G.; Plotnikova, O.A.; Sharafetdinov, K.K.; Lisitsa, A.V.; Tutelyan, V.A.; et al. Does Proteomic Mirror Reflect Clinical Characteristics of Obesity? *J. Pers. Med.* **2021**, *11*, 64. https:// doi.org/10.3390/jpm11020064

Academic Editor: Mikhail Ivanovich Voevoda Received: 10 December 2020 Accepted: 15 January 2021 Published: 21 January 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

central features (insulin resistance, increased visceral fat, atherogenic dyslipidemia, and endothelial dysfunction), which make up the essential definition for metabolic syndrome. Among them, only the first two are obligatory [6].

However, metabolic syndrome is not the one and only way to identify individuals with increased risk of cardiovascular diseases and diabetes, as well as other comorbidities. Reliable identification of individuals with a significant risk of endocrine or cardiovascular complications requires assessment methods taking into account orthogonal factors (e.g., family history, age, sex, smoking, and other crucial parameters) [6].

Importantly, the "healthy" phenotype of an obese individual with no metabolic aberrations is not constant. Thus, the metabolism of half of such patients ceases being "healthy" in ca. 10 years [7]. It means that early diagnostics and intervening even for "metabolically healthy obesity" is crucial. The dynamic nature of obesity makes it even more difficult to find out meaningful differences between normal state, metabolically healthy, and unhealthy obesity.

Despite the apparent obviousness of strategies for treatment and prevention of obesity, unfortunately, in the long term, they demonstrate low efficiency, primarily because standard pharmacological solutions and significant behavioral changes regarding nutrition and activity, in most cases, do not take root in the modus vivendi of the patients.

A major diagnostic criterion of obesity is body mass index (BMI), expressed by a person's weight divided by the square of his or her height. BMI reliably indicates the anthropometric condition for the overwhelming majority of cases; however, it does not accurately reflect the severity of the health risks [8,9]. The same could be stated for traditional laboratory tests, including monitoring of triglycerides and lipid profiles [10].

The dynamic and multifactorial nature of obesity may be the reason why there are still no biomarkers approved by the FDA or other reputable organizations that could be effectively used to diagnose this disease in personalized—not "one size fits all"—mode.

High-throughput technologies accelerated life science dramatically: the speed of reading the sequences of biological macromolecules is no longer a bottleneck for unraveling the mechanisms of health and disease [11]. Since the sequencing of DNA emerged, a wide range of projects aimed at establishing genetic markers of various medical conditions were performed, and obesity is no exception. Several large-scale genomic studies (e.g., DiOGenes, which explored biological samples from 350 European families [12,13]) paved the way for a further selection of nutritional recommendations through understanding the dynamics of weight maintenance based on the uniqueness of a particular patient [14].

Technical progress in the exploration of the proteome, the final level of transmission of biological information and predecessor of the metabolome, achieved during the last decade has provided a base for illuminating risks and improving current therapeutic strategies [15]. Proteomics allows the development of a personalized molecular profile that takes into account the pattern of biomarkers. This "molecular mirror" reflects all significant processes in the body, including systemic chronic inflammation, associated with obesity [16–18].

In the study, we used a proteomics approach to get a panoramic picture of the proteome of patients with respect to their weight status. To the best of our knowledge, we provide the first evidence that qualitative plasma protein landscape significantly differs from classic clinical parameters on an issue of obesity. We highlighted proteins, which could play a significant role in the development of obesity and obesity-associated disease and should be additionally explored as a prominent tool to improve risk stratification.

#### **2. Materials and Methods**

#### *2.1. Sample Collection*

One hundred and four human plasma samples were obtained from the patients of the Clinic of "Federal Research Centre of Nutrition, Biotechnology and Food Safety" (Moscow, Russia).

All study participants gave informed consent confirming their willingness to participate in the research. All procedures performed in studies involving human participants were under the ethical standards of the institutional or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. The study was approved by the relevant ethical review committee of the Federal Research Centre of Nutrition, Biotechnology and Food Safety (protocol #4 from 15 June 2018).

The present study included 104 individuals in accordance with the inclusion and exclusion criteria. The inclusion criteria were the age of the study participants from 18 to 45 years, BMI from 18.5, absence of diagnosed somatic and mental disorders.

Individuals younger than 18 and older than 45 years were excluded from the study, as well as pregnant/breastfeeding patients, patients with mental disorders, identified cancer, cardiovascular and any gastrointestinal diseases, other somatic disorders, and recent (6 months) weight loss.

The patients were divided into groups according to their body mass indexes. BMI, calculated as the mass of the individual in kilograms divided by his/her height in meters squared, is one of the most popular metrics to characterize body condition [19].

Five groups of patients (Table 1) were enrolled for this study: controls (NORM, BMI = 18.5–24.9), overweight patients (OW, BMI = 25.0–29.9), and patients with obesity stage 1 (OB1, BMI = 30.0–34.9), 2 (OB2, BMI = 35.0–39.9), and 3 (OB3, BMI > 40.0).


**Table 1.** Characteristics of patients enrolled in the study.

**<sup>1</sup>** To refute the theory that the group characteristics do not differ significantly, a *p*-value < 0.05 was used. *p*-value, calculated for age and height characteristics of groups under study, provides evidence, that differences of age and height groups are statistically insignificant. NORM—individuals with BMI = 18.5–24.9; OW—overweight patients with BMI = 25.0–29.9; OB1—patients with obesity stage 1 and BMI = 30.0–34.9; OB2—patients with obesity stage 2 and BMI = 35.0–39.9; OB3—patients with obesity stage 3 and BMI > 40.0.

> Venous blood samples were collected into EDTA tubes after overnight fasting and centrifuged at 1500× *g*, for 10 min at room temperature. Plasma fractions were stored at −80 ◦C in cryotubes until processing. Samples were randomized prior to proteomic investigation to avoid potential batch effects.

#### *2.2. Anthropometric and Clinical Tests*

#### 2.2.1. Anthropometric Tests

The BMIs of the patients were evaluated according to the standard formula [19]. The weight distributions were measured using the bioelectrical impedance analysis method.

#### 2.2.2. Biochemical Blood Test and Complete Blood Count

Serum levels of fasting plasma glucose, triglycerides, high-density lipoprotein, lowdensity lipoprotein, cholesterol, alanine aminotransferase, aspartate aminotransferase, *γ*-glutamyl transpeptidase, alkaline phosphatase, uric acid, urea, creatinine, albumin, bilirubin, etc. were determined according to standard protocols. Blood levels of hemoglobin, hematocrit, and blood cell indexes were established according to standard protocols [20].

Results of anthropometric and blood tests are provided in Supplementary Table S1.

#### *2.3. Sample Preparation*

#### 2.3.1. The Depletion of Blood Plasma

The immunoaffinity depletion of the high abundance plasma proteins (albumin and IgG) was used to enhance the detection of lower abundance but more insightful proteins in further shotgun proteomic analysis. For plasma depletion, we used ProteoPrep Kit (Sigma-Aldrich, St. Louis, MO, USA). The depletion was carried out following the manufacturer's instructions [21].

#### 2.3.2. Trypsinolysis of Depleted Plasma

The depleted blood plasma samples (175 µg of total protein) were in-solution digested in accordance with a standard protocol [22]. In brief, proteins were denatured and reduced with a solution containing sodium deoxycholate, tris-2-carboxyethyl-phosphine hydrochloride, and 1,4-dithiothreitol, and further alkylated with vinylpyridine. Trypsin was added to the sample (trypsin/total protein = 1/100) and then incubated within 2 h at a temperature of 44 ◦C. After 2 h, an aliquot of trypsin was added and then incubated for 2 h at 37 ◦C. Trypsinolysis was quenched by adding formic acid to each sample to a final concentration of 5%, then a mixture of peptides was centrifuged at 10,000 rpm within 15 min. The supernatant was collected and subjected to further chromatography-mass spectrometric analysis.

#### *2.4. HPLC-MS/MS Analysis*

Separation and identification of the peptides were performed on an Ultimate 3000 nano-flow HPLC (Thermo Fisher Scientific, Cleveland, OH, USA) connected to Orbitrap Exactive (Thermo Fisher Scientific, Cleveland, OH, USA) mass spectrometer equipped with a Nanospray Flex NG ion source (Thermo Fisher Scientific, Cleveland, OH, USA). Peptide separation was carried out on an RP-HPLC column Zorbax 300SB-C18 (C18 particle size of 3.5 µm, inner diameter of 75 µm and length of 150 mm, Acclaim® PepMap™ RSLC, Thermo Fisher Scientific, Cleveland, OH, USA) using a linear 90-min gradient from 95% solvent A (0.1% formic acid) and 5% solvent B (0.1% formic acid, 80% acetonitrile) to 60% solvent B over 95 min at a flow rate of 0.3 µL/min.

Mass spectra were registered in the positive ion mode. Data was acquired in the Orbitrap Exactive analyzer with a resolution of 70,000 (at *m/z* 400) for MS and 15,000 (*m/z* 400) for MS/MS scans. For peptide fragmentation higher energy collisional dissociation (HCD) was used, the signal threshold was set to 17,500 for an isolation window of 1 *m/z* and the first mass of HCD spectra was set to 100 *m/z*. The collision energy was set to 35%. Fragmented precursors were dynamically excluded from targeting for 10 s. Singly charged ions and ions with not defined charge states were excluded from triggering MS/MS scans. Three LC-MS/MS repetitions were performed for each plasma sample.

#### *2.5. Interpretation of Experimental Data*

Raw files were converted into .mgf files by MSConvert (v. 3.0). Each of the 312 mgf files containing the feature list for protein identification was processed by SearchGUI software (v. 4.0.4 [23]) using three search engines (X!Tandem, MS-GF+, OMMSA) against SwissProt library of human canonical and alternatively spliced protein sequences in automatic mode [24]. Trypsin was specified as the proteolytic enzyme; maximum of 2 missing cleavages were allowed. Pyridylethylation (C) was used as a constant modification, and oxidation of methionine was set as a variable one. Charge states of +2, +3, and +4 were selected as parent ions. Mass tolerance was set to ±15 ppm for precursor ions and ±0.01 Da for fragment ions. The cut-off of false discovery rates for peptide-spectra matches, peptides, and proteins was ≤1%. Results were visualized in PeptideShaker (v. 2.0.5 [25]). The MS data were deposited to the ProteomeXchange Consortium via the PRIDE partner repository [26] with the dataset identifier PXD023526.

#### *2.6. Statistical Analysis*

Clustering analysis of clinical and anthropometric tests was performed using Ward's method and Euclidean distance for normalized data. Clustering patterns of protein presence/absence were done using Ward's method and Jaccard distance metric. All statistical analyses and graphics were performed using R version 4.0 [27].

Each protein of interest was annotated with its GO-terms from UniProt using ViSEAGO package [28]. We used the "2020-03" GO release and "2020\_01" UniProt release.

When comparing the results of proteomic and clinical analysis, we explored publicly available data on the relationship between proteins and parameters of clinical analysis. The automatic analysis of the texts of scientific publications was performed by Scanbious platform [29,30], which visualizes semantic networks between objects of various types (names of proteins, pathological processes, etc.).

We predicted the BMI of the patient based on the pattern of presence/absence of certain proteins in his/her blood plasma using the Least Absolute Shrinkage and Selection Operator (LASSO) regression implemented in glmnet package [31]. We performed 10 iterations, each time randomly selecting 90% of the samples. For each iteration we needed to select the optimum value of LASSO tuning parameter lambda, which penalizes the sum of the absolute values of the coefficient. Optimum value of lambda was also selected by performing cross-validation (10 runs of 10-fold cross-validation cycle). The lambda with the minimum average error was selected as a lambda for the current iteration. Final model included only proteins, which were selected at every iteration (10 out of 10 times). Model performance was estimated as the median absolute error which was defined as the median of absolute differences between the true BMI and the predicted BMI.

#### **3. Results and Discussion**

#### *3.1. Clinical Component*

Much attention has been riveted on the phenomenon of metabolically healthy obesity (MHO), characterized by the absence of the metabolic abnormalities that traditionally accompany excess adiposity [32]. Thus, a substantial proportion of the obese subjects does not seem to be at an (at least temporarily, [33]) increased risk of mortality and metabolic complications of obesity. MHO is characterized by the absence of dyslipidemia, hypertension, insulin resistance, and chronic inflammation.

Moreover, lean subjects may possess abnormal metabolic parameters (exhibiting metabolically unhealthy non-obesity, MUNO) [34]. A gradient of metabolically healthy and unhealthy obese and lean phenotypes makes the revealing of abnormalities as well as relevant prevention of risks more difficult even for non-obese individuals.

To elucidate whether there is a bias to any of the selected extremes (four combinations of metabolic status and BMI) in our sample collection, we selected the monitored parameters of blood and anthropometric tests (Supplementary Table S1), which significantly differed between groups under study (NORM, OW, OB1, OB2, OB3). For these differed parameters, we performed a principal components analysis (Supplementary File S1) and hierarchical cluster analysis (Figure 1) using Ward's minimum variance. The results of cluster analysis were evaluated with the Adjusted Rand Index (ARI), which reflects an agreement between two partitions: one given by the clustering process and the other defined by external criteria.

In our case, ARI was equal to 0.051, which indicates a low similarity between resulting and expected clustering. According to the obtained result, it is not possible to explicitly define the boundaries between groups of subjects with different weight conditions under study.

The impossibility to unambiguously divide patients according to their weight conditions based only on the results of clinical tests once again emphasizes the controversial and considerably challenging nature of obesity and indicates the need for orthogonal data.

In our opinion, the most promising for solving this problem will be the transition to the proteome level and multiplex assessment of the patient's proteome landscape.

**Figure 1.** Visualization of the results of hierarchical cluster analysis of significantly different parameters of clinical and anthropometric tests performed for patients with different weight conditions. Sample IDs are presented as follows: group\_number of the sample (NORM—normal weight, OW—overweight, OB1, 2, 3—obesity stage 1, 2, and 3, correspondingly).

#### *3.2. Proteomic Component*

In total, 154 proteins were reliably identified in the entire collection of plasma samples. These proteins are predominantly associated with peptidase activity, receptor binding, and lipid transporter activity. Most of the proteins are localized in blood microparticles or plasma lipoprotein particles and vesicles, and therefore we expect stable detection of them under various mass spectrometric protocols. Of those, 36 proteins were consistently found in all plasma samples under study. A total of 138 proteins were identified in the NORM group of lean subjects. A total of 148 proteins were identified in the integrated group of overweight (OW) and obese (OB1, OB2, OB3) samples.

.

Next, we performed a principal components analysis (Supplementary File S1) and studied possible relationships between the pattern of presence/absence of proteins in blood plasma and the patient's BMI using cluster analysis.

Preliminarily, unrepresentative proteins (identified in a single sample in the collection) and non-specific proteins (identified in all samples) were excluded from the calculations. The data matrix consisted of 104 rows (samples) and 101 columns (proteins).

The pattern of 15 proteins (namely, P07225, P00748, P07357, P07358, P09871, P01591, P01861, O43866, P00736, P02654, P13671, P25311, P01619, P01859, and P29622) allows to distinguish (Figure 2a) a group of 14 samples with increased BMI (mean 39 vs 33, *p*-value = 0.002). Moreover, 11 of them belong to the OB2 and OB3 groups, and three samples were obtained from overweight individuals. It should be noted that these three patients from OW were diagnosed with blood lipids disorder (there are seven samples with such a diagnosis in the whole OW group). Half of the samples from the OB2 and OB3 groups were also characterized by this diagnosis. As part of a pattern of 15 proteins for 5 (P07225 [35], O43866 [36,37], P02654 [38], P25311 [39–41], P29622 [42]) the association with the obesity was shown. It is noteworthy that five of these 15 proteins are complement components, included in two complexes: P07358, P07357, and P13671 organize membrane attack complex (MAC), that plays a key role in the innate and adaptive immune response, and P09871 and P00736 combine with serine protease to form the first component of the classical and less variable pathway of the complement system, also associated with obesity [43,44].

**Figure 2.** Cluster analysis of the presence/absence patterns of (**a**) 101 proteins (columns) in 104 blood plasma samples (rows) for all 104 samples under study (rows) and (**b**) 98 proteins (columns) in 83 samples, excluding plasma samples obtained from overweight individuals. The color bar indicates BMI.

The collection of the samples under study contains blood plasma from overweight patients with an increased body mass index (OW), but not exceeding the threshold values required for the diagnosis of obesity, which could affect the results of cluster analysis. In this respect, we removed from consideration 21 samples from the borderline OW group, as a result, the total number of identified proteins remained practically unchanged, as well as the set of proteins common for the two—NORM and OB—groups. The updated matrix consisted of 83 rows (samples) and 98 columns (proteins) plotted with the same parameters. Clustering indices improved slightly, so for the group with high BMI its mean value was 42, and for the rest—32 (*p*-value = 0.002, Figure 2b).

The composition of the cluster with high BMI practically did not change—three samples from the OW group left, and one image from the OB2 group was added. Accordingly, the pattern of specific proteins did not change significantly, it included 13 proteins, where 12, except two immunoglobulins (P01619 and P01859) and serpin (P29622), coincide with the results of the pattern of proteins according to the all-samples clustering. New in the resulting pattern is the component of the above MAC complex—P07360.

#### *3.3. BMI Prediction*

To assess the contribution of proteins to obesity, an attempt to predict the BMI of the sample based on proteomic data was performed. For this, using the LASSO method, we built a regression model predicting the BMI of a sample according to the pattern of presence/absence of proteins in blood plasma. The model based on all data consisted of five proteins (P08185, P0DJI8, P10643, P25311, and P35858), and the median absolute error (MAE) was 5.1 kg/m<sup>2</sup> (Figure 3a). At the same time, the model obtained on the basis of processing data excluding samples from the OW group showed a higher accuracy, MAE = 3.2 kg/m<sup>2</sup> (Figure 3b), and the number of proteins required to build the model was 18.

**Figure 3.** BMI prediction based on pattern of presence/absence of proteins for (**a**) all samples and for (**b**) collection without plasma samples obtained from overweight individuals.

It is noteworthy that the pattern of 18 proteins includes five proteins from the model with lower predictive power, as well as three previously considered proteins of the complement system (P00736, P07358, P07360) included in the MAC complex, which is indirectly associated with obesity [45].

Text-mining [29,30] performed for these proteins and their relations with pathological processes showed that 15 out of 18 proteins (Table 2) are associated to varying degrees with metabolic disorders, including obesity. For example, according to our model, the absence of sex hormone-binding globulin (P04278) correlates with increased BMI, which is confirmed by studies on its expression, where it was shown that inhibition of the corresponding gene leads to the development of obesity [46].

**Table 2.** Proteins included into predicting pattern and their association with obesity.


Summarizing the above said, we can conclude that among the reliably and reliably detected proteins [76] in blood plasma, there is a pattern that has predictive power in the issue of obesity. The minimum pattern size is five proteins. Expanding the panel increases the level of BMI prediction accuracy, which can be critical in examining borderline states in metabolically healthy obese and unhealthy lean, and also provide researchers with additional information about body composition status even when exploring protein profiles from the patients with non-obesity disorders.

We would like to stress that our intention was not to build the perfect BMI prediction model (the dataset is quite limited for this task) but rather to point to some plasma proteins likely associated with obesity when analyzed together. We suppose that the further studies needed to elaborate on this issue will also allow detection of the transition from a "metabolically healthy" phenotype of the patient with a high BMI to an "unhealthy" one.

#### **4. Conclusions**

According to the authors' knowledge, no approved omics pattern has been developed to distinguish individuals at increased risk of obesity and its comorbidities. In the present study, we analyzed clinical and anthropometrical parameters of 104 subjects with different weight conditions. Each individual was also characterized by the profile of core proteins circulating through his/her blood plasma.

Our main conclusions were two-fold:


We strongly believe that such proteomic patterns have great potential as warning labels, signaling about obesity-associated alterations, and, thus, improving early-stage therapy of both metabolically unhealthy obese and lean individuals.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/2075-442 6/11/2/64/s1, Table S1: Anthropometrical and clinical characteristic of individuals under study; File S1: Principal Components Analysis of clinical parameters and proteins identified in the samples of blood plasma.

**Author Contributions:** Conceptualization, E.A.P. and K.K.S.; sample collection, O.A.P., E.A.P. and E.V.I.; methodology, V.G.Z., V.A.A. and O.I.K.; data curation, V.A.A., O.I.K., E.V.P. and E.V.I.; writing, O.I.K., E.V.P., M.A.P. and V.A.A.; visualization, V.A.A. and M.A.P.; revision, O.I.K., V.A.A., E.V.P., M.A.P., K.K.S., D.B.N.; project administration and funding acquisition, E.A.P., A.V.L., V.A.T., D.B.N., A.I.A. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Ministry of Science and Higher Education of the Russian Federation within the framework of state support for the creation and development of World-Class Research Centers "Digital biodesign and personalized healthcare" No. 075-15-2020-913.

**Institutional Review Board Statement:** The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Institutional Ethics Committee of the Federal Research Centre of Nutrition, Biotechnology and Food Safety (protocol #4 from 15.06.2018).

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Data Availability Statement:** The data presented in this study are available in PRIDE repository (dataset identifier PXD023526); Supplementary Materials are available via link https://zenodo.org/ record/4432333#.YAbOw1X7QuU.

**Acknowledgments:** We would like to thank Alexandra M. Nazarova for her help in sample collection and annotation. We acknowledge the IBMC "Human Proteome" Core Facility for access to the computer cluster and assistance with the generation of mass spectrometry data.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **The Mutation Spectrum of Maturity Onset Diabetes of the Young (MODY)-Associated Genes among Western Siberia Patients**

**Dinara E. Ivanoshchuk 1,2,\* , Elena V. Shakhtshneider 1,2 , Oksana D. Rymar <sup>2</sup> , Alla K. Ovsyannikova <sup>2</sup> , Svetlana V. Mikhailova <sup>1</sup> , Veniamin S. Fishman <sup>1</sup> , Emil S. Valeev <sup>1</sup> , Pavel S. Orlov 1,2 and Mikhail I. Voevoda <sup>1</sup>**


**Abstract:** Maturity onset diabetes of the young (MODY) is a congenital form of diabetes characterized by onset at a young age and a primary defect in pancreatic-β-cell function. Currently, 14 subtypes of MODY are known, and each is associated with mutations in a specific gene: *HNF4A*, *GCK*, *HNF1A*, *PDX1*, *HNF1B*, *NEUROD1*, *KLF11*, *CEL*, *PAX4*, *INS*, *BLK*, *KCNJ11*, *ABCC8*, and *APPL1*. The most common subtypes of MODY are associated with mutations in the genes *GCK*, *HNF1A*, *HNF4A*, and *HNF1B*. Among them, up to 70% of cases are caused by mutations in *GCK* and *HNF1A*. Here, an analysis of 14 MODY genes was performed in 178 patients with a MODY phenotype in Western Siberia. Multiplex ligation-dependent probe amplification analysis of DNA samples from 50 randomly selected patients without detectable mutations did not reveal large rearrangements in the MODY genes. In 38 patients (37% males) among the 178 subjects, mutations were identified in *HNF4A*, *GCK*, *HNF1A*, and *ABCC8*. We identified novel potentially causative mutations p.Lys142\*, Leu146Val, Ala173Glnfs\*30, Val181Asp, Gly261Ala, IVS7 c.864 −1G>T, Cys371\*, and Glu443Lys in *GCK* and Ser6Arg, IVS 2 c.526 +1 G>T, IVS3 c.713 +2 T>A, and Arg238Lys in *HNF1A*.

**Keywords:** maturity onset diabetes of the young; MODY; diabetes mellitus; multiplex ligationdependent probe amplification; next-generation sequencing; GCK; HNF1A; HNF4A; HNF1B; singlenucleotide variant; population

#### **1. Introduction**

Maturity onset diabetes of the young (MODY) is a congenital form of diabetes characterized by onset at a young age and a primary defect in pancreatic-β-cell function. This type of diabetes (OMIM # 606391) differs from classic types of diabetes mellitus (types 1 and 2, or T1DM and T2DM) in its clinical course, treatment strategies, and prognosis [1,2]. MODY is predominantly inherited in an autosomal dominant manner, but cases of spontaneous mutagenesis and germline mosaicism have been described [3,4]. MODY can be suspected [5] when hyperglycemia is detected in patients under 25–35 years of age, there is little or no need for insulin, secretion of C-peptide is intact, β-cell antibodies are absent, and dysfunction of pancreatic β-cells results in a decrease in the insulin amount. MODY patients usually have a normal body mass index. Currently, 14 subtypes of MODY are known, and each is associated with mutations in a specific gene: *HNF4A*, *GCK*, *HNF1A*, *PDX1*, *HNF1B*, *NEUROD1*, *KLF11*, *CEL*, *PAX4*, *INS*, *BLK*, *KCNJ11*, *ABCC8*, and *APPL1* [6]. The genes associated with MODY are presented in Table 1.

**Citation:** Ivanoshchuk, D.E.; Shakhtshneider, E.V.; Rymar, O.D.; Ovsyannikova, A.K.; Mikhailova, S.V.; Fishman, V.S.; Valeev, E.S.; Orlov, P.S.; Voevoda, M.I. The Mutation Spectrum of Maturity Onset Diabetes of the Young (MODY)-Associated Genes among Western Siberia Patients. *J. Pers. Med.* **2021**, *11*, 57. https://doi.org/10.3390/jpm11010057

Received: 10 December 2020 Accepted: 13 January 2021 Published: 18 January 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).


**Table 1.** The genes associated with maturity onset diabetes of the young (MODY).

The most common subtypes of MODY are associated with mutations in the genes *GCK*, *HNF1A*, *HNF4A*, and *HNF1B* [7]; among them, up to 70% of cases are caused by mutations in *GCK* and *HNF1A* [8]. Clinical manifestations of MODY are diverse and may vary even among members of the same family, i.e., carriers of identical mutations. This phenotypic variation is due to the interaction of the mutations with different genetic backgrounds and with environmental factors (e.g., the lifestyle) [9]. Identification of mutations causing MODY is important for early diagnosis of the disease in first-degree relatives of a proband. Up to 80% of MODY cases remain undetected or are misdiagnosed as T1DM or T2DM, resulting in incorrect treatment, including unjustified insulin therapy and its complications [1]. Next-generation sequencing (NGS) is most commonly used in MODY diagnostics and allows for simultaneous analysis of numerous genes and for the identification of single-nucleotide variants (SNVs) or small deletions or insertions, which constitute the majority of the known mutations causing this disease [10]. Nonetheless, large rearrangements in MODY-associated genes have been described, including heterozygosity of a complete *HNF1B* deletion [11,12], detected by multiplex ligation-dependent probe amplification (MLPA). This method makes it possible to detect variations in the copy number of a gene and accordingly is widely used in molecular diagnostics of diseases whose pathogenesis is associated with deletions or duplications of certain genomic regions [13]. The purpose of the present study was to determine the spectrum of genetic variants in patients with a MODY phenotype in Russia (Western Siberia) by whole-exome sequencing, targeted sequencing, and MLPA.

#### **2. Materials and Methods**

#### *2.1. Patients*

The study protocol was approved by the Ethics Committee of the Institute of Internal and Preventive Medicine- branch of the Institute of Cytology and Genetics, SB RAS, Novosibirsk, Russia, protocol number 7, 22 June 2008. Written informed consent to be examined and to participate in the study was obtained from each patient or his/her parent or legal guardian.

A total of 178 unrelated patients aged 4 to 35 years (23.4 ± 11.1 years [mean ± SD]; 41.7% males) with a MODY phenotype and 140 available family members were enrolled in this study (Figure 1).

**Figure 1.** The study design. DNA: deoxyribonucleic acid, MLPA: multiplex ligation-dependent probe amplification, MODY: maturity onset diabetes of the young, SNV: single-nucleotide variant, WES: whole-exome sequencing.

All 178 probands were examined in the Clinical Department of the Institute of Internal and Preventive Medicine from the year 2014 to 2020. The probands were referred for molecular genetic testing if they met the following criteria: a verified diagnosis of diabetes [according to the criteria of the American Diabetes Association: HbA1C ≥6.5%, or fasting plasma glucose ≥126 mg/dL (7.0 mmol/L), or 2-h plasma glucose ≥200 mg/dL (11.1 mmol/L) during an oral glucose tolerance test (in the absence of unequivocal hyperglycemia; the result should be confirmed by repeat testing), or a patient with classic symptoms of hyperglycemia or hyperglycemic crisis with a random plasma glucose ≥200 mg/dL (11.1 mmol/L)] [14]; a debut of the disease in probands at the age of 35 years or earlier; a family history of diabetes mellitus; the absence of obesity; the absence of antibodies against pancreas islet cells and glutamic acid decarboxylase; intact secretory function of β-cells; normal or mildly reduced C-peptide levels; no need for insulin therapy; and the absence of ketoacidosis at the onset of the disease. Patients with clinical features of untypical diabetes (differing from those of T1DM and T2DM) and, in some cases, lacking a family history were included in this study because MODY can be occasionally caused by de novo mutations [3,4]. Patients with a history of tuberculosis, infection with the human immunodeficiency virus, an infectious disease caused by hepatitis B or C virus that required antiviral treatment, or substance abuse or alcoholism in the 2 years before the examination were excluded from this study. The studied MODY group may have included patients with MODY, patients with T1DM and negative test results on antibodies, and patients with early-onset T2DM.

#### *2.2. Isolation of Genomic DNA*

Genomic DNA was isolated from venous blood leukocytes by phenol–chloroform extraction [15].

#### *2.3. Genome Library Preparation and Sequencing*

At the first stage, exome sequencing was performed on 40 randomly selected probands from the MODY group. The exome sequencing was carried out on an Illumina HiSeq 1500 instrument (Illumina, San Diego, CA, USA). The enrichment and library preparation were performed using the SureSelectXT Human All Exon v.5 + UTRs Kit. Whole-exome libraries were prepared with the AmpliSeq Exome Kit (Thermo Fisher Scientific, Walthamm, MA, USA). On the DNA samples from the other patients (138 unrelated probands), we performed targeted sequencing. In the target panel, we included coding parts and adjacent splicing sites of the following MODY-associated genes: *HNF4A*, *GCK*, *HNF1A*, *PDX1*, *HNF1B*, *NEUROD1*, *KLF11*, *CEL*, *PAX4*, *INS*, *BLK*, *KCNJ11*, *ABCC8*, and *APPL1*. Oligodeoxynucleotide probes and the KAPA HyperPlus Kit (Roche, Basel, Switzerland) were used to prepare libraries at the Genomics Multi-Access Center (Institute of Cytology and Genetics, SB RAS, Novosibirsk, Russia). The quality of the analyzed DNA and of the prepared libraries was evaluated by means of a capillary electrophoresis system, the Agilent 2100 Bioanalyzer (Agilent Technologies Inc., Santa Clara, CA, USA). Analysis of each fully prepared library was conducted on the Illumina MiSeq platform (Illumina, San Diego, CA, USA).

#### *2.4. Bioinformatic Analysis*

The sequence reads were mapped to the reference human genome (GRCh37/hg19) via the Burrow–Wheeler Alignment tool (BWA v.0.7.17) [16]. Polymerase chain reaction (PCR)-generated duplicates were removed with MarkDuplicates of PicardTools (https: //gatk.broadinstitute.org/hc/en-us). Base quality recalibration and searches for SNVs were conducted using Genome Analysis Toolkit (GATK) v.3.3, with BaseRecalibration and HaplotypeCaller tools, respectively; variant calling included local remapping of short insertions/deletions and correction of base quality [17].

The depth of coverage was 34× to 53×. SNVs with genotype quality scores <20 were filtered out and excluded from further analysis. In sequenced groups, we kept sequence variants if they were present in 10 or more variant reads with a quality score ≥20. Annotation of the SNVs was performed in the ANNOVAR (ANNOtate VARiation) software [18] with the help of populational data gnomAD [19] as a basic source and databases for several specific populations, such as The Greater Middle East (GME) Variome Project (http: //igm.ucsd.edu/gme/), ABraOM: Brazilian genomic variants (http://abraom.ib.usp.br/), Korean Personal Genome Project (http://opengenome.net/Main\_Page), other available data from populations and data on clinical significance (ClinVar) [20], and the Human Gene Mutation Database (HGMD) [10]; literature data were taken into account too.

Possible functional effects of SNVs were assessed in the dbNSFP database (https: //sites.google.com/site/jpopgen/dbNSFP), aggregating data from 37 in silico prediction tools (SIFT, Polyphen2, MutationTaster2, PROVEAN, and others), nine conservation scores (e.g., PhyloP, phastCons, GERP++, and SiPhy), and data on population frequencies. Thresholds for prediction scores were set according to respective authors' recommendations; for the conservation ratio, we set one single threshold at 0.7, which means that a variant can be considered conserved if its predicted conservation score is greater than the scores of 70% variants.

Variants described in ClinVar, Leiden Open Variation Database (https://www.lovd. nl/), or predicted in silico as benign (B) or likely benign (LB) as well as variants with minor allele frequency higher than 0.01% in any of the population databases listed above were excluded from the analysis. The pathogenicity of each novel candidate mutation was assessed according to the recommendations of the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology [21]. Using these criteria, we also assessed two variants (p.Val590Met in the *HNF1A* gene in proband P81 and p.Ala173Thr in the *GCK* gene in P90) previously reported without any description of a phenotype (Table 2).


**Table 2.**The genetic variants identified in Western Siberia patients with a phenotype of maturity onset diabetes of the young (MODY).


**Table 2.** *Cont*.

\* RefSeq reference transcript: *GCK* (NM\_000162.5), *ABCC8* (NM\_000352.3), *HNF1B* (NM\_000458.2), and *HNF1A* (NM\_000545.6). HGMD: Human Genome Mutation Database; LOVD database: Leiden Open Variation Database; NA: not available; PR: previously reported to be associated with a pathological phenotype (in the literature, in LOVD, ClinVar, or HGMD) and was not classified according to ACMG recommendations [21]. Variants identified for the first time are boldfaced.

To identify pathogenic variants at potential splice sites, we performed in silico testing, using dbscSNV (http://www.liulab.science/dbscsnv.html) for splice sites and regSNPintron (https://regsnps-intron.ccbb.iupui.edu/) for both intronic and splice-altering SNVs. Nevertheless, these databases can be only a supplementary tool for the estimation of variant pathogenicity, in addition to the population and clinical data.

#### *2.5. Verification Analysis*

All selected SNVs were verified by Sanger direct automatic sequencing on an ABI 3500 DNA sequencer (Thermo Fisher Scientific, USA) by means of the BigDye Terminator v3.1 Cycle Sequencing Kit (Thermo Fisher Scientific, USA). Primer design for the selected SNVs was performed in the Primer-Blast software (https://www.ncbi.nlm.nih.gov/tools/ primer-blast/).

At the second stage, segregation analysis was performed on the selected variants (novel and previously described) in the probands' family members available for the analysis.

#### *2.6. MLPA Analysis*

The third stage included an MLPA analysis of DNA samples from 50 randomly selected patients without detectable mutations in MODY-associated genes to search for major structural rearrangements (deletions and duplications). MLPA was performed using SALSA MLPA Probemix P241 MODY Mix 1 (MRC-Holland, Amsterdam, Netherlands), containing 52 specific MLPA probes for four genes: *HNF4A*, *GCK*, *HNF1A*, and *HNF1B* (https://www.mrcholland.com/product/P241/2528). The reaction was carried out in accordance with the manufacturer's recommendations. The resulting amplification products were detected by capillary electrophoresis on an ABI 3500 DNA sequencer (Thermo Fisher Scientific, USA). The data were processed in the Coffalyser.Net software.

#### **3. Results**

In 40 probands with a MODY phenotype, whole-exome analysis was performed; in 138 probands, we analyzed exons and adjacent splice sites of MODY-associated genes: *HNF4A*, *GCK*, *HNF1A*, *PDX1*, *HNF1B*, *NEUROD1*, *KLF11*, *CEL*, *PAX4*, *INS*, *BLK*, *KCNJ11*, *ABCC8*, and *APPL1*. In this work, we ignored common genetic variants and did not assess their possible effects on the phenotype. The results are presented in Table 2. We did not find any rare pathogenic variants in genes *HNF4A*, *PDX1*, *NEUROD1*, *KLF11*, *CEL*, *PAX4*, *INS*, *BLK*, *KCNJ11*, and *APPL1*. In 38 patients (37% males) out of the 178 subjects, mutations were identified in *HNF4A*, *GCK*, *HNF1A*, and *ABCC8* (one in a compound heterozygous state and 37 in a heterozygous state). The vast majority of them were carriers of rare variants of the *GCK* gene (26 probands; 68.4%), and nine probands (26.3%) had mutations in the *HNF1A* gene; one male (2.6%) carried a pathogenic variant in the *ABCC8* gene, and one female was a carrier of a rare variant in *HNF1B*. One male patient (73) was a compound heterozygote on two previously described mutations: Arg54\* in *HNF1A* and Arg521Gln in *ABCC8*.

In total, we identified 36 rare variants in the studied genes: 12 of them (33.3%) are described for the first time, and the rest are present in databases (LOVD, HGMD, ClinVar, and/or gnomAD) or are previously reported in the literature. Among the identified variants, only three (all located in the *GCK* gene) were found twice in unrelated patients. Probands 3 and 77 both were found to carry Gly258Cys, probands 46 and 80 to carry undescribed Val181Asp, and probands 6 and 50 to carry Arg36Trp in the *GCK* gene. All the novel variants proved to be likely pathogenic or pathogenic according to the ACMG criteria [21]. For 28 families with identified relevant variants, a segregation analysis was performed. For most patients, segregation of the identified mutations with pathogenic phenotypes was found among their relatives.

An exception was the His336Asp mutation in *HNF1B* in a patient (P27) with gestational diabetes. The same mutation was found in her normoglycemic mother and her daughter.

In 50 patients without mutations in the studied genes, the MLPA analysis did not reveal any structural abnormalities in the genes *HNF4A*, *GCK*, *HNF1A*, and *HNF1B*.

#### **4. Discussion**

Genetic diagnosis is an important step in clinical practice, especially for family screening of individuals with borderline or moderate carbohydrate metabolic disorders. Clinical manifestations of MODY differ among patients and do not allow us to identify the type of diabetes unambiguously; therefore, it is important to employ timely genetic diagnostics for the optimal choice of a treatment. In the present study, we searched for mutations in MODY-associated genes by combining an examination of clinical criteria with NGS (the latter method has been effectively used as a first-line screening test for MODY-associated mutations). Next, the identified mutations were verified by Sanger sequencing followed by cascade genetic screening of available family members. In a Siberian population, we have previously reported some of the mutations: c.580 –1G>A in *GCK* [23], Ser6Arg in *HNF1A* [24], and Ala1457Thr in *ABCC8* [25]. Among the studied patients (all from Western Siberia), the predominance of subtypes MODY2 (68.4%) and MODY3 (26.3%) was demonstrated here. We found eight novel genetic variants in *GCK* and four novel variants in *HNF1A* in this work (Figure 2, Table 2).

Routine or accidental blood glucose testing is the main route of hyperglycemia detection in *GCK*-MODY. The disease is characterized by an insignificant increase in the fasting glucose level (well controlled without medication) and low prevalence of microand macrovascular complications of diabetes [2]. Encoded by the *GCK* gene, glucokinase B (GlkB, hexokinase IV) is the first enzyme in the glycolytic pathway and phosphorylates glucose in pancreatic β-cells [26].

**Figure 2.** *Cont*.

**Figure 2.** Screened MODY families (**A**–**L**) with novel identified variants in genes *GCK* and *HNF1A*. A grey filled symbol: a patient with prediabetes, MT: altered allele, SB: stillbirth, WT: wild type allele, ?/?: persons not genotyped, ?: a person with unknown health status. \* RefSeq reference transcript: GCK (NM\_000162.5), ABCC8 (NM\_000352.3), HNF1B (NM\_000458.2), and HNF1A (NM\_000545.6).

The crystal structure of human glucokinase indicates that it has a large domain and small domain forming a deep cleft for glucose binding [27]. Amino acid residues 1–64 and 206–439 belong to the large domain, and residues 72–201 and 445–465 belong to the small domain; residues 65–71, 202–205, and 440–444 constitute three loops connecting the domains. Glucokinase is switched from an inactive to active conformation by ligand binding via a large rotation of the small domain [27,28]. Many mutations of this gene have been described (nonsense, missense, frameshift, and splice site mutations) in various populations [10]. Of note, all *GCK*-MODY patients have similar clinical phenotypes regardless of the mutation type [29].

p.Lys142\* was found here in a 10-year-old boy (P51 in family A) with hyperglycemia. Lysine at codon 142 is adjacent to three basic amino acid residues (His-141, Lys-142, and Lys-143) that are phylogenetically conserved and form a positively charged surface. This basic patch in GlkB is believed to be a critical site for the binding of glucokinase-regulatory protein because substitutions at this site decrease this binding and nuclear entry of GlkB [30]. There are no reports on mutations in this codon in the literature.

The Leu146Val substitution is located in the small domain and was found in a 12-yearold male proband here (P4 in family B). Within codon 146, two other variants (Leu146Pro and Leu146Arg) have been described, both associated with MODY [10]. Research on enzyme kinetics has revealed that mutation Leu146Arg reduces enzymatic activity owing to decreased affinity for glucose [31].

A novel frameshift mutation, p.Ala173Glnfs\*30, was identified here in family C with a MODY2 phenotype. Namely, this heterozygous deletion of four nucleotides in exon 5 was detected in a young woman 35 years of age (P86), in her son, and in her sibling. This AGGC deletion in codon 173 of the *GCK* gene changes the amino acid sequence and generates a premature stop codon at amino acid position 203. In our study, this variant segregated with the pathological phenotype in the proband and proband's daughter and sister. In view of the young age of the proband's nephew, monitoring of his carbohydrate metabolism parameters was recommended. Enzyme kinetics should be assessed to determine the functional basis of the disease.

In two unrelated female patients (P80 in family D and P46 in family E), a variant was found resulting in the replacement of valine in codon 181 with aspartic acid. In both families, the supposed carriers (fathers) of the Val181Ala variant were not available for the analysis; it turned out that healthy mothers of the probands and relatives of P46 are not carriers of this substitution. Val181Ala in a family of a MODY patient was described in Italy [32]. In the present study, a young patient (P88 in family F) with gestational diabetes mellitus and a family history of diabetes was found to have a previously undescribed missense substitution, Gly261Ala, in the *GCK* gene. Earlier, two substitution variants of this codon—Gly261Arg and Gly261Glu—were found in France [33,34], both associated with MODY.

Here, in a 2-year-old boy (P54 in family G) with hyperglycemia, the c.864 −1G>T substitution was identified in the acceptor site of *GCK* intron 7. This variant segregated with a pathological phenotype in the examined family members. Substitutions −1G>A and −1G>C at this position have been described in the United Kingdom and Czech Republic [35,36].

Several variants in codon Cys371 have been described [10]. We first identified a nonsense variant in this codon in patient P57 (family H). The substitution segregated with a pathological phenotype in other family members. Cys371 is a highly conserved amino acid residue located in the hydrophobic core of the protein [37], where it participates in disulfide bond formation [38].

It is known that L-arginine stimulates the production of glucose-6-phosphate and induces insulin secretion [39]. In a male patient (P87) in family I, the p.Glu443Lys variant was detected, which is involved in L-arginine binding and insulin secretion [39]. Glu443 is located in the α-13 helix of human GlkB [26]. During the domain reorganization between the active and inactive forms of the enzyme, helices α-13 and α-5 take part in the global conformational change [27].

Hepatocyte nuclear factor I homeobox A (HNF1A) is a transcription factor regulating the differentiation of pancreatic cells [40]. Mutations in the *HNF1A* gene are associated with MODY3, which is characterized by impaired insulin secretion, retention of sensitivity to sulfonylureas, and a decrease in the renal threshold for glucose. Cases of liver adenomatosis, renal dysplasia, and hypopituitarism in carriers of these mutations have been documented [41,42]. The clinical manifestations of MODY3 can vary within the same family and among unrelated mutation carriers. Moreover, carriers of *HNF1A* gene mutations may be normoglycemic, while their siblings can be hyperglycemic [43].

The Ser6Arg mutation of the *HNF1A* gene was found here in a family with diabetes mellitus in five generations [24] in proband P19 (family J). Another variant, c.526 +1 G>T in intron 2 of *HNF1A*, was detected in a 12-year-old patient (proband P34 in family K). She was found to have fasting hyperglycemia of 11.3 mmol/L and glycosuria of more than 55 mmol/L. In the proband's family, this variant segregated with a pathological phenotype and has not been described previously.

Proband P91 is a carrier of novel variant Arg238Lys. Two other substitutions at the same position, Arg238Met and Arg238Thr, have been previously described in the LOVD database as likely pathogenic (HNF1A\_000306 and HNF1A\_000417). One of them, Arg238Thr, is associated with MODY in the United Kingdom [44], and the other one, Arg238Met, is associated with hyperinsulinism diagnosed at birth (LOVD database). According to the proband, her ancestors had diabetes mellitus for three generations, but they were not available for analysis.

One of our young male patients (78 in family L) has substitution c.713 +2 T>A in intron 3 of the *HNF1A* gene. In an available relative of the proband, this mutation segregated with non-insulin-dependent diabetes mellitus and was found for the first time.

ATP-binding cassette superfamily transporter family C8 gene (*ABCC8*) encodes sulfonylurea receptor 1 (SUR1), which is a part of ATP-sensitive potassium channels on the

islet β-cell membrane. Mutations in this gene can cause T2DM [45], a transient type of neonatal diabetes [46], neonatal diabetes [47], or MODY12 [6]. Previously, for the first time in Russia, we described the case of a patient with an *ABCC8* mutation [25]. In the current study, a 13-year-old male patient (73, family history is not shown) proved to be compound heterozygous on Arg521Gln of *ABCC8* and previously reported Arg54\* of *HNF1A*. His mother, also with diabetes mellitus, has the same genotype. Elsewhere, the Arg521Gln substitution in the ABCC8 protein has been identified in a female with dominant hyperinsulinism [48] and in a case of a heterozygous carrier with diabetes [49]. In our case, neither the mother nor her son had any relevant complications.

A mutation in the *HNF1B* gene results in MODY5, which is characterized by reduced insulin secretion and usually a renal disease [50]. This gene plays the major part in the normal development of (and tissue-specific gene expression in) the kidneys, liver, pancreas, bile ducts, urogenital tract, lungs, thymus, and gut [50]. In patient P27 (family history is not shown), a diagnosis of gestational diabetes was made. Mutation His336Asp in *HNF1B* was detected in a proband and in her normoglycemic mother and daughter; no other *HNF1B*associated clinical phenotypes, such as genital deformities or kidney involvement, were revealed in this proband [51]. The information on the pathogenicity of this variant is unclear. This mutation has been identified and categorized as pathogenic in two unrelated patients with severe renal anomalies, but healthy relatives of the patients are reported to be carriers of the mutation as well [52]. In another study, two out of three affected members from one family were carriers of this mutation, and it was found in one healthy individual [53]. In other studies, this variant was identified in a patient with suspected T1DM and absence of autoimmunity, but information about their parents was not available [54]. It is likely that the mutation does not have 100% penetrance and that other modulating pathological factors, including genetic ones, are required for disease manifestation.

In Russia, genetic screening of patients with MODY has been previously performed in the European part of Russia, and mutations in the *GCK* gene seem to be most prevalent among these MODY patients [55,56].

Direct sequencing of MODY genes is time-consuming and does not cover the entire spectrum of genes that can cause this type of diabetes. NGS techniques enable more efficient and cost-effective diagnosis of MODY subtypes [57]. High-tech sequencing also helps to determine the cause of a MODY phenotype if mutations in the genes known to be associated with MODY1–MODY14 are absent.

#### *Limitations*

This study has some limitations due to the unavailability of information about probands' family members (in some cases). This situation did not allow us to perform a segregation analysis of some rare potentially pathogenic variants identified in our patients (data not shown).

#### **5. Conclusions**

The spectrum of mutations in MODY genes was determined in a Western Siberian population by NGS. We believe that NGS techniques will lead to more effective and cost-efficient methods of MODY diagnosis. Apparently, mutations in the *GCK* gene are the predominant cause of MODY in Russia. We identified novel potentially causative mutations p.Lys142\*, Leu146Val, Ala173Glnfs\*30, Val181Asp, Gly261Ala, IVS7 c.864 −1G>T, Cys371\*, and Glu443Lys in *GCK* and Ser6Arg, IVS 2 c.526 +1 G>T, IVS3 c.713 +2 T>A, and Arg238Lys in *HNF1A* (both are known MODY-associated genes). We did not find large rearrangements in MODY genes in a randomly selected cohort of 50 patients devoid of relevant point mutations.

**Author Contributions:** Conceptualization, M.I.V.; data curation, E.V.S., O.D.R., and A.K.O.; investigation, D.E.I.; methodology, D.E.I., E.S.V., and V.S.F.; project administration, M.I.V.; validation, S.V.M. and P.S.O.; writing—Original draft, D.E.I., E.V.S., E.S.V., and V.S.F.; writing—Review and editing, E.V.S. and S.V.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** The collection of the samples and of clinical data and clinical examination were conducted as part of the main topic in state assignment No. AAAA-A17-117112850280-2, and NGS, Sanger sequencing, and bioinformatic analyses are financially supported as part of the publicly funded topic in state assignment No. AAAA-A19-119100990053-4.

**Institutional Review Board Statement:** The study protocol was approved by the Ethics Committee of the Institute of Internal and Preventive Medicine—Branch of the Institute of Cytology and Genetics, SB RAS, Novosibirsk, Russia, protocol number 7, 22 June 2008.

**Informed Consent Statement:** Informed consent was obtained from all subjects or his/her parent or legal guardian involved in the study.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy.

**Acknowledgments:** The authors thank the patients for the participation in this study.

**Conflicts of Interest:** The authors declare that they have no conflicts of interest related to the publication of this article.

#### **References**


### *Review* **Gallstone Disease, Obesity and the Firmicutes/Bacteroidetes Ratio as a Possible Biomarker of Gut Dysbiosis**

**Irina N. Grigor'eva**

**Abstract:** Obesity is a major risk factor for developing gallstone disease (GSD). Previous studies have shown that obesity is associated with an elevated *Firmicutes/Bacteroidetes* ratio in the gut microbiota. These findings suggest that the development of GSD may be related to gut dysbiosis. This review presents and summarizes the recent findings of studies on the gut microbiota in patients with GSD. Most of the studies on the gut microbiota in patients with GSD have shown a significant increase in the phyla *Firmicutes* (Lactobacillaceae family, genera *Clostridium*, *Ruminococcus*, *Veillonella*, *Blautia*, *Dorea*, *Anaerostipes*, and *Oscillospira*), *Actinobacteria* (*Bifidobacterium* genus), *Proteobacteria*, *Bacteroidetes* (genera *Bacteroides*, *Prevotella*, and *Fusobacterium*) and a significant decrease in the phyla *Bacteroidetes* (family *Muribaculaceae*, and genera *Bacteroides*, *Prevotella*, *Alistipes*, *Paludibacter*, *Barnesiella*), *Firmicutes* (genera *Faecalibacterium*, *Eubacterium*, *Lachnospira*, and *Roseburia*), *Actinobacteria* (*Bifidobacterium* genus), and *Proteobacteria* (*Desulfovibrio* genus). The influence of GSD on microbial diversity is not clear. Some studies report that GSD reduces microbial diversity in the bile, whereas others suggest the increase in microbial diversity in the bile of patients with GSD. The phyla *Proteobacteria* (especially family *Enterobacteriaceae*) and *Firmicutes* (*Enterococcus* genus) are most commonly detected in the bile of patients with GSD. On the other hand, the composition of bile microbiota in patients with GSD shows considerable inter-individual variability. The impact of GSD on the *Firmicutes*/*Bacteroidetes* ratio is unclear and reports are contradictory. For this reason, it should be stated that the results of reviewed studies do not allow for drawing unequivocal conclusions regarding the relationship between GSD and the *Firmicutes*/*Bacteroidetes* ratio in the microbiota.

**Keywords:** Firmicutes; Bacteroidetes; gut microbiota; bile microbiota; gallstone patients

#### **1. Introduction**

Obesity is defined as excessive fat accumulation that may impair health; obesity is a result of an imbalance between energy intake and expenditure [1,2]. Today obesity has become pandemic; about 1.9 billion people on the planet are overweight: overall, about 13% of the world's adult populations (11% of men and 15% of women) were obese in 2016 [3]. The World Health Organization (WHO) estimated that nearly 2.8 million deaths annually are a consequence of overweight and obesity-associated conditions [3], such as atherosclerosis, diabetes, gallstone disease (GSD), etc. [4–6].

GSD is a common benign gastrointestinal disease affecting 10–15% of adults around the world that greatly contributes to health care costs [7–10]. Risk factors of the GSD are age, female sex, obesity, insulin resistance, physical inactivity, genetic background, dietary factors (high carbohydrate, high-calorie intake), dyslipoproteinaemia, certain diseases (such as diabetes mellitus, nonalcoholic fatty liver disease (NAFLD), hypertension, and cardiovascular disease) and medications (hormone replacement therapy, fibrates, etc.), social and economic issues, fertility, and intestinal factors (with increased absorption of cholesterol, slow intestinal motility, and dysbiosis) [7–10]. Obesity is a major risk factor for developing GSD [9–12] because it is accompanied by increased synthesis and excretion of cholesterol into bile [13], wherein the amount of cholesterol produced is

**Citation:** Grigor'eva, I.N. Gallstone Disease, Obesity and the Firmicutes/ Bacteroidetes Ratio as a Possible Biomarker of Gut Dysbiosis. *J. Pers. Med.* **2021**, *11*, 13. https://dx.doi. org/10.3390/jpm11010013

Received: 14 November 2020 Accepted: 22 December 2020 Published: 25 December 2020

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/ licenses/by/4.0/).

directly proportional to being overweight [11].Obesity is regarded as an inflammatory condition [14]. Inflammation may be the potential link between insulin resistance and gallstones [15]. Insulin resistance is considered a risk factor for GSD, as it may lead to excess biliary cholesterol production and saturation [16,17] and alone may be responsible for gallbladder dysmotility [18]. However, the absence of a relationship between body mass index (BMI) and GSD had been reported in several epidemiologic studies [7,8,19]. The possible pathogenesis for the close association between obesity and GSD iscomplex and not fully understood.

A significant relationship exists among food intake, energy balance and gut peptides that are secreted from gastrointestinal enteroendocrine cells, such as ghrelin, leptin, glucagon-like peptide-1, cholecystokinin (CCK), peptide tyrosine tyrosine (PYY), and serotonin [20]. Let's focus on two of them. Ghrelin, an orexigenic peptidyl hormone secreted from the stomach, was discovered in 1999 and is associated with feeding and energy balance [21]. Ghrelin increases appetite and energy expenditure and promotes the use of carbohydrates as a source of fuel at the same time as sparing fat [22]. The development of resistance to leptin andghrelin, hormones that are crucial for the neuroendocrine control of energy homeostasis, is a hallmark ofobesity [23]. The impact of acyl-ghrelin on glucose metabolism and lipid homeostasis may allow for novel preventative or early intervention therapeutic strategies to treat obesity-related type 2 diabetes and associated metabolic dysfunction [24]. There were no differences for total bile acids, insulin, ghrelin, and glucosedependent insulinotropic polypeptide between patients with GSD and the control group without gallstones [25]. Mendez-Sanchez et al. (2006) found an inverse correlation of serum ghrelin levels and theprevalence of GSD in alogistic regression analysis (OR = 0.27, 95% CI 0.09–0.82, *p* = 0.02) [26]. Authors suggest that serum ghrelin concentrations are associated with a protective effect of GSD and this is related to a motilin-like effect of ghrelin on the gallbladder motility. However, themedian of serum ghrelin values did not show a difference between the patients and controls (660 vs. 682 ng/L) [26].

Leptin is associated with obesity: although it should reduce food intake and body weight, in obese patientsthe serum leptin levels are higher than in the lean individuals and do not manage reducing their food intake [27]. Insulin and leptin play an important role in the development of prediabetes and NAFLD, which is a risk factor for GSD. There could be the following pathogenic links: obesity promotes insulin resistance; high levels of insulin increase leptin levels; leptin cannot lead to decreased insulin levels and decreased appetite because of leptin resistance in the nervous system and the adipose tissue; and high levels of leptin promote hepatic steatosis which in turn increases insulin resistance [27]. Positive correlations between serum leptin and BMI, CCK, total cholesterol, and insulin were found in the gallstone group [28].

Gut microbiota can regulate levels of these gut peptides and thus regulate intestinal metabolism via the microbiota-gut-brain axis [20]. Serum ghrelin levels were negatively correlated with *Bifidobacterium, Lactobacillus* and *B. coccoides–Eubacterium rectale*, and positively correlated with *Bacteroides* and *Prevotella* [29]. Leptin was negatively correlated with *Clostridium, Bacteroides* and *Prevotella*, and positively correlated with *Bifidobacterium* and *Lactobacillus* [29].The results of the studies on the relationship between GSD, obesity, and incretin hormones remain controversial.

GSD and obesity have similar prevalence [10]. Most of the above risk factors are common to GSD and obesity. Despite the increasing number of scientific publications on the gut microbiota in obesity, there is a lack of studies that assess the gut microbiota in GSD. Research on this topic is limited and mainly focused on the study of certain genera and species of microorganisms, but not the *Firmicutes/Bacteroidetes* ratio. Many studies have shown that, in humans, obesity is associated with an increased *Firmicutes/Bacteroidetes* ratio in comparison with lean or "healthy obese" individuals [1,2,30–42]. This review presents and summarizes the recent findings of studies on the gut microbiota in patients with GSD regarding the *Firmicutes/Bacteroidetes* ratio, as a possible biomarker of obesity, given that obesity is a key risk factor for GSD.

#### *The Gut Microbiome and Its Functions*

Bacteria emerged 3.8 billion ago [43]. There are about 10 trillion human cells in the human body and about 100 trillion cells outside and inside our bodies being of microbial origin [44,45]. The gut microbiome is a dynamic assembly of microorganisms and the resultant products of their collective genetic and metabolic materials, containing from 2 to 20 million microbial genes by the human microbiome's predominantly in the gut [44]. The gut microbiome plays an array of biological functions ranging from controlling gut– immune system axis, providing several key metabolites and maintaining an optimal digestive system due to the presence of genes, which encode digestive enzymes that are not present in human cells but are associated with the metabolism and fermentation of many food compounds necessary for the host's nutrition [46]. A greater richness and diversity of bacterial species in the human intestine may be an indicator of health [45,47,48].

#### **2. The Firmicutes/Bacteroidetes Ratio**

#### *2.1. Short Characteristics of the Firmicutes and the Bacteroidetes*

As the dominant gut microbiota in healthy adult humans [4], intestinal bacteria include members of both the *Firmicutes* (range of quantitative data–20.5% up to 80% [49–53]) and the *Bacteroidetes* (from 13.85% up to 75.3% [20,49,51,52]). Major taxa of the *Firmicutes* to be included of more than 200 genera [53,54]. *Proteobacteria, Fusobacteria, Actinobacteria, Cyanobacteria*, and *Verrucomicrobia* phyla also are present as minor players [54].

Some *Bacteroides* spp. and *Prevotella* spp. have a variety of glycans and glycosidases that can utilize polysaccharides [53,55]. Other important functions of *Bacteroides* spp. include deconjugation of bile acids [56]. The gut microbiota, especially *Bacteroides intestinalis*, and to a certain extent *Bacteroides fragilis* and *E. coli*, also has the capacity to deconjugate and dehydrate the primary bile acids and convert them into the secondary bile acids in the human colon [57]. Bacteria belonging to the phylum *Bacteroidetes* have high functional redundancy, whereas the phylum *Firmicutes* was comprised of a large number of more functionally diverse core bacteria [53,54,58]. Commensal *Clostridial* clusters XIVa and IV plays an important role in the host and gut homeostasis from the metabolic point of view through the production of short-chain fatty acids, normalizes intestinal permeability, involved the brain–gut axis regulation, in the immune system development, etc. [59]. Many *Firmicutes'* abilities are related to the host's body weight: obesity-associated gut microbiota is enriched in *Clostridium leptum* [54], Roseburia intestinalis, Eubacterium ventriosum, *Eubacterium hallii* [60], *Lactobacillus reuteri* [42], *Blautia hydrogenotorophica*, *Coprococcus catus*, *Ruminococcus bromii*, *Ruminococcus obeum* [50]. However, other *Firmicutes* are abundant in non-obese individuals: *Clostridium cellulosi,* associated with the degradation of plant material [60,61], *Clostridium orbiscindens* (currently known as *Flavonifractor plautii*), capable of utilizing flavonoids [52], *Clostridium bolteae*, *Blautia wexlerae* [58], *Clostridium difficile*, the *Staphylococcus* genus [40], *Oscillospira guillermondii* [60], *Faecalibacterium (prausnitzii), Lactobacillus plantarum,* and *paracasei* [42]. Also, two *Bacteroides* species (*B. faecichinchillae* and *B. thetaiotaomicron*) [58] and *Akkermansia muciniphila,* and *Methanobrevibacter smithii* [42] were significantly more abundant in stool samples from non-obese compared with obese subjects. Such differences in the "behaviour" of bacteria cannot be explained only by their metabolic properties, because of the exact functions of bacteria are still unclear.

#### *2.2. The Story of "Discovery" of the Firmicutes/Bacteroidetes Ratio*

Increased efficiency of energy harvest, due to alterations in the gut microbiota has been implicated in obesity in mice [31,32,62] and humans [38].Alterations affecting the dominant intestinal phyla the *Firmicutes* and the *Bacteroidetes* were first described by Ley et al. (2005) in obese animals [1]. In the analysis of the cecal microbiota (by the 16S rRNA gene sequences) of genetically obese ob/ob mice, lean ob/+ and wild-type +/+ siblings, ob/ob animals have a 50% reduction in the abundance of Bacteroidetes and a proportional increase in Firmicutes compared with lean mice [1]. The authors also pointed out that an increase ofthe *Firmicutes*/*Bacteroidetes* ratio may help promote adiposity in *ob*/*ob* mice. The *Firmicutes/Bacteroidetes* ratio is also under debate as a possible biomarker of obesity and related dysfunctions [53,62–66]. A low *Firmicutes/Bacteroidetes* ratio was found to be associated with lean phenotypes, younger age, cardiovascular health, and a balanced immune system and is generally considered beneficial for health [67–69].

#### *2.3. The Firmicutes/Bacteroidetes Ratio in Obesity: Pro*

Ley et al. (2006) have shown that the microbiota in obese subjects shows an elevated proportion of the *Firmicutes* and a reduced population of the *Bacteroides.* Conversely, the relative proportion of the *Bacteroidetes* decreased in humans on a weight-loss program [30]. 16S rRNA gene sequencing revealed a lower proportion of *Bacteroidetes*, more *Actinobacteria* in obese versus lean individuals, but no significant difference in *Firmicutes* in 31 monozygotic twin pairs and 23 dizygotic twin pairs [33]. Armougom et al. [34] confirmed a reduction in the *Bacteroidetes* community in 20 obese patients compared with 20 normal-weight individuals (*p* < 0.01). Zuo et al. (2011) reported that obese people had fewer cultivable *Bacteroides* than their normal-weight counterparts [37]. In the gut in obese adolescents, the total microbiota was more abundant on the phylum *Firmicutes* (94.6%) as compared with *Bacteroidetes* (3.2%) [39]. In the systematic review (PubMed: 2005–2017) adecrease in the *Bacteroidetes* phylum and *Bacteroides/Prevotella* groups was related to high BMI and the *Firmicutes* phylum was positively correlated with weight gain in children between 0 and 13 years of age [40]. In an adult Ukrainian population, the *Firmicutes/Bacteroidetes* ratio was significantly associated with BMI (OR = 1.23, 95% CI 1.09–1.38) and this association continued to be significant after adjusting for confounders such as age, sex, smoking and physical activity (OR = 1.33, 95% CI 1.11–1.60) [41]. The recent systematic review confirmed that individuals with obesity have a greater the *Firmicutes/Bacteroidetes* ratio, more *Firmicutes, Fusobacteria, Proteobacteria, Mollicutes,* and less *Bacteroidetes* [42].

#### *2.4. The Firmicutes/Bacteroidetes Ratio in Obesity: Contra*

However, other human trials not only failed to confirm a high proportion of *Firmicutes* in obese patients [63,70–78] and, but reported even the opposite: about higher amounts of *Bacteroidetes*, and decreased amounts of *Clostridium* cluster XIVa in obese subjects as compared with lean donors [71]. Proportions of the genus *Bacteroides* were greater in overweight volunteers than lean and obese volunteers and the *Firmicutes/Bacteroidetes* ratio changed in favour of the *Bacteroidetes* in overweight and obese subjects [72]. Duncan et al. (2008) found that weight loss did not change the relative proportions of the *Bacteroides* spp, or the percentage of the *Firmicutes* present, in the human gut [73]. In another study, no significant differences in the *Firmicutes/Bacteroidetes* ratios were found between obese and normal-weight adults [74] or obese and normal-weight children [75]. Two meta-analyses have shown that the content of the *Firmicutes* and the *Bacteroidetes* and their ratio is not a consistent feature distinguishing lean from obese human microbiota generally [76,77].

Many authors have concluded that there is no simple taxonomic signature of obesity in the microbiota of the human gut and that significant technical and clinical differences exist between published studies [63] and that the phylum level difference of the gut microbiota between obese and lean individuals might not be universally true [78]. Likely explanations for these controversies are discussed below.

#### **3. Role of the Microbiota in the Pathogenesis of Gallstone Disease**

The pathogenesis of cholesterol GSD is multifactor, it is determined by five primary defects: genetic background and LITH genes, hepatic hypersecretion of biliary cholesterol, rapid precipitation of solid cholesterol crystals in bile, gallbladder dysmotility, and intestinal factors (with increased absorption of cholesterol, slow intestinal motility, and dysbiosis) [10].

In recent years, attention has been focused on the potential impact of the gut microbiota on the pathogenesis of pigment and cholesterol gallstones. It is proved that intestinal dysbiosis makes a significant contribution to the development of not only the GSD itself [5,6,79–82], but also to the development of numerous disorders that are risk factors for GSD, including obesity [31–42], type 2 diabetes [83], hypercholesterolemia [20,52], diet [84], NAFLD [85–88], cardiovascular diseases [68,89], physical inactivity [29,90,91], etc.

Gut microbiota affects the pathogenesis of GSD through several mechanisms. Some bacteria alter the composition of bile directly via *β*-glucuronidase, cholyl-glycyl hydrolase, phospholipase A1, or urease activities, or by biofilmformationthereby promoting calcium bilirubinate (pigment) stone generation [92,93]. Till now, it has not been clear whether bacterial pathogens of the biliary tree contribute to the stone formation or alternatively if the presence of gallstones promotes chronic colonization [15]. The activity of the gut microbiota could also be linked to the development of GSD by altering the concentration of serum lipids [94], and biliary lipids in bile and/or increasing the faecal excretion of bile salts [95]. Gut microbiota can modulate bile acid metabolism through the activity of bile salt hydrolases, which deconjugate bile acids, and the activity of 7α-dehydroxylase, which converts primary bile acids (cholic acid and chenodeoxycholic acid) to secondary bile acids (deoxycholic acid and lithocholic acid) [94].

Bile acids regulate metabolism via activation of specific nuclear receptors (e.g., farnesoid X receptor, pregnane X receptor, vitamin D receptor, and cell surface G protein-coupled receptors, such as the G protein-coupled bile acid receptor (TGR5 and Gpbar-1)) [96,97]. The effect of the farnesoid X receptor is antilithogenic:farnesoid X receptor activation in the intestine by bile acids induces fibroblast growth factor 15 expression, which suppresses the expression of cholesterol 7α-dehydroxylase in the liver [98]. Gallstone patients had significantly higher levels of 7α-dehydroxylating bacteria than individuals without gallstones [99]. The increase of 7α-dehydroxylation activity of the intestinal microflora promoted the deoxycholic acid excess in the bile acid pool [100], and the increase in the percentage of deoxycholic acid in bile and bile acid hydrophobicity leads to a decrease in the cholesterol microcrystal nucleation time and the formation of cholesterol gallstones [101].

#### **4. The Firmicutes/BacteroidetesRatio and GSD**

#### *4.1. Gut Microbiota*

#### 4.1.1. Gut Microbiota in Mice and Cholelithiasis

Many reports are underlining the association of the gut microbiota with the pathogenesis of cholesterol cholelithogenesis in mice [15,102,103] and humans [5,6,80–82,93,100,104– 119].

Alteration of indigenous gut microbiota by bacteria transferring has been shown to make germ-free mice more susceptible to the formation of cholesterol gallstones [102]. In a study of mice without and with cholesterol gallstones (induced by a lithogenic diet) using 16S rRNA gene sequencing, it was found that in the faeces of mice, the *Firmicutes/Bacteroidetes* ratio and the *Firmicutes* content decreased (from 59.71% under chow diet to 31.45% under lithogenic diet, *p* < 0.01), the richness and alpha diversity of the microbiota also significantly reduced [103]. Cholelithogenic enterohepatic *Helicobacter* spp. (phylum *Proteobacteria*) have been identified and their important role in the formation of cholesterol gallstones in mice and perhaps in humans has been shown [15].

#### 4.1.2. Gut Microbiota in Humans and Gallstones

In the gallstone group included 30 patients, the diversity of intestinal bacteria and the abundances of certain phylogroups significantly decreased, especially *Firmicutes*, the *Firmicutes/Bacteroidetes* ratio was also significantly decreased compared with the control group included 30 healthy individuals [6]. 7α-dehydroxylating gut bacteria (the *Clostridium* genus) were significantly increased, whereas cholesterol-lowering bacteria (the *Eubacterium* genus) were significantly reduced. *Clostridium* was positively correlated with secondary bile acids. It can be assumed that an increase in *Clostridium* and a decrease in *Eubacterium* contribute to bile saturation with cholesterol in patients with gallstones [100]. In the gallstone group, *Ruminococcus gnavus* could be used as a biomarker, while in the control group–*Prevotella 9* and *Faecalibacterium* [6].

Keren et al. (2015) showed that intestinal microbial diversity, the abundances of the genus *Roseburia* and the species *Bacteroides uniformis* were decreased, and those of the family *Ruminococcaceae* and the genus *Oscillospira* were increased in patients with gallstones before cholecystectomy compared with the controls [5]. After cholecystectomy in the patients with gallstones, the abundance of the phylum *Bacteroidetes*, and also the family *Bacteroidaceae* and the genus *Bacteroides* showed a significant increase. Gallstone patients had higher overall concentrations of faecal bile acids [5]. *Roseburia* was significantly positively correlated with faecal cholesterol, but not with bile acids; *Oscillospira* correlated negatively with primary bile acids and faecal cholesterol concentration and positively–with the secondary bile lithocholic acid in the faeces. Thus, the authors suggest that *Oscillospira* may predispose individuals to cholesterol gallstones [5]. Cholecystectomy alters bile flow into the intestine and bidirectional interactions between bile acids and intestinal microbiota, thereby increasing bacterial degradation of bile acids into faecal secondary bile acids [104,105]. Deoxycholic acid can inhibit the growth of thececal microbiota in rats; moreover, members of the *Bacteroidetes* phylum (*Bacteroides vulgatus, Bacteroides sartorii*) are more sensitive to secondary bile acids exposure than members of the *Firmicutes* phylum (*Clostridium innocuum*, *Blautia coccoides*) [120]. Deoxycholic acid concentrations were negatively correlated with the *Bacteroidetes* phylum in patients with GSD [5]. Increasing levels of the cholic acid cause a dramatic shift toward the *Firmicutes* (from 54.1% before of administration of cholic acid up to 95% after [120]), particularly *Clostridium* cluster XIVa and increasing production of the harmful deoxycholic acid [104,121].

Wang W et al. (2018) identified ageing-associated faecal microbiota in a healthy population, which was lost in cholecystectomy patients [81]. Absent intestinal bacteria, such as *Bacteroides*, were also negatively related to secondary bile acids in cholecystectomy patients. The abundances of *Prevotella, Desulfovibrio, Barnesiella, Paludibacter*, and *Alistipes* all decreased, whereas those of *Bifidobacterium, Anaerostipes*, and *Dorea* all increased in the cholecystectomy patients [81].

In the frame of a case-control study, Yoon W et al. (2019) showed that *Blautia obeum* and *Veillonella parvula*, which have azoreductase activity, were more abundant in faecal samples in the 27 patients of the cholecystectomy group compared to the control group [82]. The abundance of family *Muribaculaceae* belonging to the phylum *Bacteroidetes* was decreased and that of the family *Lactobacillaceae* was increased in the cholecystectomy group. At the genus level, the abundance of *Ruminococcus* was greater in the cholecystectomy group [82].The actual number of taxa observed in a faecal sample was significantly lower in the cholecystectomy group. However, the difference in the diversity of the gut microbiota between the cholecystectomy and control groups was subtle [82].

Two years after cholecystectomy, eight patients with the symptomatic post-cholecystectomy syndrome, eight patients with the asymptomatic post-cholecystectomy syndrome, and eight healthy individuals were examined [106]. It was shown that *Firmicutes* and *Bacteroidetes* had similar abundance and contents among the three groups. The gut microbiome of the symptomatic post-cholecystectomy syndrome patients was dominated by *Proteobacteria* in faeces and contained little *Firmicutes* and *Bacteroidetes* [106].

Wu et al. (2013) studied the composition of bacterial communities of the gut, bile, and gallstones from 29 cholesterol gallstone patients and the gut of 38 healthy controls [107] by 16S rRNA gene sequencing method. They found a significant increment of the gut bacterial phylum *Proteobacteria* anddecrement of gut bacterial genera *Faecalibacterium*, *Lachnospira*, and *Roseburia*. When compared with gut, a significantly decreased level of the bacterial phylum *Bacteroidetes* in the biliary tract was found. The *Firmicutes/Bacteroidetes* ratio in faeces in patients with GSD did not differ in comparison with the control group [107].

Ren X et al. (2020) examined stool samples from 104 subjects (equally post-cholecystectomy patients and healthy controls) which were collected for 16S rRNA gene sequencing to analyze the bacterial profile [80]. It was shown noteworthy compositional and abundant alterations of bacterial microbiota in post-cholecystectomy patients, characterized as *Bacteroides ovatus, Prevotella copri*, and *Fusobacterium varium* remarkably increased; *Faecalibacterium prausnitzii, Roseburia faecis*, and *Bifidobacterium adolescentis* significantly decreased. Machine learning-based analysis, that integrates gut microbiota and other anthropometric parameters, showed a pivotal role of *Megamonas funiformis* in discriminating post-cholecystectomy patients from healthy controls. Additionally, the duration after cholecystectomy notably affected bacterial composition in post-cholecystectomy patients [80].

Eventually, if we summarize the results of most studies of the microbiota in patients with GSD different authors found both a significant increment of gut bacterial phyla *Firmicutes* (*Lactobacillaceae* family, genera *Clostridium*, *Ruminococcus*, *Veillonella*, *Blautia*, *Dorea, Anaerostipes*, and *Oscillospira*), *Actinobacteria* (*Bifidobacterium* genus), *Proteobacteria*, *Bacteroidetes* (genera *Bacteroides*, *Prevotella*, and *Fusobacterium*) (Figure 1) and significant decrement of gut bacterial phyla *Bacteroidetes* (*Muribaculaceae* family, and *genera Bacteroides, Prevotella, Alistipes, Paludibacter, Barnesiella)*, *Firmicutes* (genera *Faecalibacterium*, *Eubacterium*, *Lachnospira*, and *Roseburia*), *Actinobacteria* (*Bifidobacterium* genus), and *Proteobacteria* (*Desulfovibrio* genus) (Figure 2). In other words, in patients with GSD, an increase and decrease in almost all major intestinal bacterial phyla were detected. In one study the *Firmicutes/Bacteroidetes* ratio in faeces in patients with GSD was significantly decreased in comparison with the controls [6], in two studies–did not differ [106,107]. In addition to *Firmicutes* and *Bacteroidetes* as the main phyla, *Proteobacteria* and other phyla may contribute to the gut dysbiosis in patients with GSD.

**Figure 1.** Characteristics of the gut microbiome of patients with GSD. A significant increase of the phyla *Firmicutes*, *Actinobacteria*, *Proteobacteria*, and *Bacteroidetes* is reflected. The number in square brackets indicates a reference in the list of references.

**Figure 2.** Characteristics of the gut microbiome of patients with GSD. A significant decrease of the phyla *Firmicutes*, *Actinobacteria*, *Proteobacteria*, and *Bacteroidetes* is reflected. The number in square brackets indicates a reference in the list of references.

Using metagenomic DNA sequencing, researchers have been able to categorize individuals as either high gene count (HGC) or low gene count (LGC) [44]. HGC individuals are generally considered to have a greater repertoire of microbial metabolic functions, a functionally more robust gut microbiome, and greater overall health, including a lower prevalence of obesity and metabolic disorders [48]. Examples of bacterial taxa that have been associated with human health and proper gastrointestinal function include *Bacteroides, Bifidobacterium, Clostridium* clusters XIVa and IVa (butyrate producers), *Eubacterium, Faecalibacterium, Lactobacillus,* and *Roseburia*. Bacterial species that might protect against weight gain and are enriched in HGC individuals include *Anaerotruncus colihominis, Butyrovibrio crossotus, Akkermansia* spp., and *Faecalibacterium* spp. [48]. The studies of the gut microbiota in patients with GSD included in our review demonstrated a reduction of bacterial taxa that have been associated with human health, i.e., genera *Bacteroides, Faecalibacterium*, *Roseburia*, *Eubacterium*, an increase in *Lactobacillaceae* family, and oppositely directed changes in *Bifidobacterium*.

#### 4.1.3. Bile Microbiota in Humans and Gallstones

The presence of bacterial amplicons belonging to *Firmicutes, Bacteroidetes*, and *Actinobacteria,* and *Proteobacteria* phyla in the human intact gallbladder bile was proved by 16S rRNA gene sequencing [108,109]. Associations between alpha- and beta-diversity, a taxonomic profile of bile microbiota (*Bacteroidetes, Proteobacteria, Actinobacteria,* and *Firmicutes* phyla, analyzed with 16S rRNA gene sequencing), and taurocholic and taurochenodeoxycholic bile acid levels were evidenced in 37 Russian patients with GSD [110].

At the phylum level, *Bacteroidetes* was statistically more abundant in the bile of patients with GSD (24.00%) compared to the control (13.49%) [109]. Members of the families *Bacteroidaceae*, *Prevotellaceae*, *Porphyromonadaceae*, and *Veillonellaceae* were more frequently

detected in patients with GSD. The genus *Dialister* and enterobacteria *Escherichia-Shigella* also showed a significantly higher representation in the bile in the patients with GSD [109]. The Shannon diversity index was statistically higher in the bile of the control group than that obtained in the patients with GSD [102].However, it was not taken into account that bile samples from the gallbladder of individuals from a control group were obtained from liver donors, and they were not only treated with antibiotics but also not fully examined to exclude hepatobiliary or other important pathology [109].

The *Proteobacteria*, *Firmicutes*, *Bacteroidetes,* and *Actinobacteria* phyla dominated the biliary microbiota in the persons, all of whom were diagnosed with GSD, at that biliary tract microbiota of participants with GSD showed substantial person-to-person variation [79]. Metagenomic sequencing of bile from gallstone patients showed that oral cavity/respiratory tract inhabitants were more prevalent than intestinal inhabitants [108]. At the same time, bile samples from gallstone patients had reduced microbial diversity compared to healthy faecal samples [108]. Among patients with the new onset of common bile duct stones, five dominant phyla were identified: *Proteobacteria* (60%), *Firmicutes* (27%), *Bacteroidetes* (4%), *Actinobacteria* (3%), and Unclassified\_Bacteria (3%) in biliary microbiota [111]. At the genus level, the five genera with the highest relative abundances in patients with the new onset of common bile duct stones were *Escherichia/Shigella*, *Halomonas*, *Klebsiella, Streptococcus,* and *Enterococcus* [111].

In patients with cholangiolithiasis associated with sphincter of Oddi laxity, *Proteobacteria* and *Firmicutes* were the most widespread phylotypes, especially *Enterobacteriaceae*, in the bile, which was collected intraoperatively [112]. In the bile of the cholecystectomized gallstone patients *Escherichia coli*, *Salmonella* sp., and *Enterococcusgallinarum* were detected by using next-generation sequencing technology [113]. *Enterobacteriaceae* are frequently isolated from bile aspirates or gallbladder bile from GSD patients using cultural [114,115] and culture-independent techniques [116,117]. The biliary microbiota (investigated by using 16S rRNA amplicon sequencing) had a reduced diversity comparatively with the duodenal microbiota in gallstone patients [117]. Although the majority of identified bacteria were greatly diminished in bile samples, three *Enterobacteriaceae* genera (*Escherichia*, *Klebsiella*, and an Unclassified genus) and *Pyramidobacter* were abundant in bile [117].

In terms of bile microbial distribution, analyzed by the 16S rRNA encoding gene (V3-V4), patients with recurrent common bile duct stone had significantly higher *Proteobacteria*, while *Bacteroidetes* and *Actinobacteria* are significantly lower compared with the control group at the phylum level [117]. At the family level, *Enterobacteriaceae* was significantly abundant in the bile samples of the recurrence stone group compared with the control group. At the genus level, the recurrence stone group had significantly more *Escherichia*. The diversity of bile microbiome in patients with recurrent common bile duct stone is lower than that in the control non-cholelithiasis group [117].

During a cholecystectomy, mucosal DNA extraction and metagenomic sequencing were performed to evaluate changes in the microbiota between chronic calculous cholecystitis and gallbladder cancer patients [118]. At the phylum level, *Firmicutes*, *Bacteroidetes*, *Actinobacteria*, and *Proteobacteria* were found to be stable in both groups. The diversity of the biliary microbiota was significantly lower in the calculous cholecystitis group, compared with the gallbladder cancer group [118].

In four patients who underwent cholecystectomy for acute calculous cholecystitis metagenome analysis of bile, faeces, and saliva was performed [119]. In all the examined patients with acute calculous cholecystitis, *Escherichia coli* (*Enterobacteriaceae* family) was found in large quantities in the bile, in two of them-also in the faeces, in the third patient, *Bifidobacterium* prevailed in the faeces. This is not enough to conclude the relationship between the intestinal microbiota and acute calculous cholecystitis, since if bile samples were taken during surgery, then saliva and faeces were collected by patients during hospitalization (it is not clear before or after the cholecystectomy) [119].

During endoscopic retrograde cholangiopancreatography, a total of 44 bile samples of patients with GSD were collected. Bacterial infection in bile samples was detected in 54.5% of patients with GSD. *Escherichia coli* showed a significant association with gallstones [122].

Thus, bile samples from patients with GSD had reduced microbial diversity in some studies and increased microbial diversity in others compared to healthy faecal samples. Nevertheless, most authors recognize that patients with GSD have reduced bacterial diversity of intestinal and bile microbiota. The phyla *Proteobacteria* (especially family *Enterobacteriaceae*) and *Firmicutes* (*Enterococcus* genus) were more often detected in the bile of patients with GSD, and the phyla *Bacteroidetes* and *Synergistes* (*Pyramidobacter* genus) were less frequently detected.

Some reports described live bacteria and bacterial DNA as long-term constituents in different fat depots in obesity and diabetes mellitus type 2 [123,124]. In humans with the metabolic syndrome, altered microbiome composition together with a defective intestinal barrier has been suggested to facilitate translocation of microbes, thereby contributing to low-grade inflammation. A recent study demonstrated a bacterial signature in mesenteric adipose tissues without the obvious presence of blood: members of the *Enterobacteriaceae* family compartmentalize in the extra-intestinal tissues of people with diabetes mellitus type 2 independently of obesity [123]. The authors suggest that members of the *Enterobacteriaceae* family are key players in diet-induced dysmetabolism in the host. Unfortunately, the intriguing topic of possible translocation of living bacteria (perhaps even members of the *Enterobacteriaceae* family) from the gut to other body sites in patients with GSD remains undiscovered.

So, when analyzing available studies of intestinal and bile microbiota in animals and patients with GSD [5,6,15,79–82,92,93,100,102–119] there were no unidirectional changes in the *Firmicutes/Bacteroidetes* ratio. This situation with opposite results is typical not only for GSD. For comparison, we will briefly present the results of several studies reporting differences in phylum levels in patients with non-alcoholic fatty liver disease (usually associated with obesity): the phylum *Bacteroidetes*–increased [86], decreased [88,125], did not differ [87,126], the phylum *Firmicutes*–decreased [86,87], increased [126], and the *Firmicutes*/*Bacteroidetes* ratio decreased [88].

This variation in the relative abundance of the phylum of the gut corresponds to the analysis of seven studies in Finucane et al. (2014): *Bacteroidetes*–from 0% to 90%, *Firmicutes*– from 0 to 100% [63]. This also applies to GSD. For example, the highest abundance of *Firmicutes* phylum in the human gastrointestinal tract in one GSD patient was 93.30% and the lowest was 1.17% in another. A similar result was also seen in bile with a high of 55.10% and low of 0.08% [107]. In another study, the range of relative abundance of *Firmicutes* phylum was 0–92% in the bile of patients with GSD [79].

#### **5. Some Reasons for the Lack of Unity in the Assessment of the** *Firmicutes/Bacteroidetes* **Ratio**

Gut microbiota is changing with human development and is influenced by many confounding variables which could prevent the existence of a unique taxonomic signature as a standard feature for obesity and associated comorbidities such as GSD [64,83,89].


with the abundance of the phylum *Bacteroidetes* and negatively–with the abundance of the phylum *Firmicutes* in the faeces [38]. As a rule, the "western diet" increases biliary secretion of bile acids and reshapes the gut microbiota in obesity by increasing the *Firmicutes* and decreasing the *Bacteroidetes* [35,62]. Several population-based studies have shown that populations given increased amounts of polyunsaturated fats have a significant risk of developing gallstones [9,12,132–134]. The MICOL study, however, showed no such association [135]. Gutiérrez-Díaz et al. (2018) support a link between diet, biliary microbiota, and GSD [84]. Comparing to health control in patients with GSD, dairy product intake was negatively associated with the proportions of *Bacteroidaceae* and *Bacteroides*, and several types of fibre, phenolics, and fatty acids were linkedto the abundance of *Bacteroidaceae, Chitinophagaceae, Propionibacteraceae, Bacteroides,* and *Escherichia-Shigella* [84]. However, the timing of these changes is surprising. In response to dietary perturbations, the gut microbiota took from 24 h [130] to 3.5 days [36] to change detectably and reaches a new steady state. Repeated dietary shifts demonstrated that most changes to the gut microbiota are reversible [36]. Also, Carmody et al. (2015) suggest, that the effects of dietary intake overshadow any pre-existing differences between strains due to host genotype [36]. Add to this the inter-individual variability in the processing of dietary compounds by the human gastrointestinal tract [136] and the hope of finding patterns in the relationship "microbiota–host–diet" becomes quite vague.


a stable temporal core microbiome. Yet, each person still maintained a personalized microbiome [140].

10. There are also hard-to-determine factors, such as the Earth's geomagnetic field, weather, etc.

#### **6. Conclusions**

Meta-analysis has shown that the microbial changes associated with obesity may be minor shifts in the community that escape detection with significance tests [77]. It may be the case that the microbiome's effect on obesity is not mediated through its taxonomic composition but rather its function, since closely related taxa can have widely varying functions and distantly related taxa can have similar functions [63]. It is proved that variable combinations of species from different phyla could 'presumptively' fulfil overlapping and/or complementary functional roles required by the host, a scenario where minor bacterial taxa seem to be significant active contributors [39]. For example, the cocolonization of germ-free mice with *B. Thetaiotaomicron* and *E. rectale* constitutes a mutualism, in which both members show a clear benefit [148] and the efficiency of fermentation of dietary polysaccharides to short-chain fatty acids by *B. thetaiotaomicron* increases in the presence of *M. Smithii* [149].

Based on the analysis of the great number of contradictory results reported in the literature, it is currently difficult to associate specific microbial signatures or the *Firmicutes/Bacteroidetes* ratio with determining health status and more specifically to consider it as a hallmark of GSD and/or obesity. However, most authors believe that both obesity [33,34,40,48,64,130] and GSD [5,6,81,103,109,111,117–120] are associated with reduced microbial diversity.Therefore, it is important to look at the overall composition of the gut microbial population structure as an indicator of obesity and obesity-associated pathologies, such as GSD, rather than simply the *Firmicutes/Bacteroidetes* ratio [150]. However, in my opinion, it is possible to modify this ratio, e.g., to introduce a coefficient that characterizes BMI, to calculate the ratio not of the *Firmicutes* phylum, but only of the *Clostridia* class, and so on.

Further studies should focus on the possibility of modulating the intestinal microbiota to find out whether variations in the microbiota may be a target for lowering the risks and prevalence rates of GSD. Future studies to identify specific bacterial species or populations associated with the obesity or GSD phenotype will help optimize disease therapies through microbiome-informed patient stratification, through personalized treatment decisions. A better understanding of bacterial communities in both the gut and biliary tract of gallstone patients is crucial in developing strategies to promote personalized microbiome-based GSD prediction and treatment responsiveness.

**Funding:** This study was performed according to the framework of the budget theme of the State assignment no. AAAA-A17-117112850280-2 and with the financial support of the Biocodex MICRO-BIOTA Foundation, France.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Ethical Statement::** The author is accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The author declares no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**

1. Ley, R.E.; Bäckhed, F.; Turnbaugh, P.; Lozupone, C.A.; Knight, R.D.; Gordon, J.I. Obesity alters gut microbial ecology. *Proc. Natl. Acad. Sci. USA* **2005**, *102*, 11070–11075. [CrossRef] [PubMed]


#### *Article*
