*Article* **The Structure of Relationships between the Human Exposome and Cardiometabolic Health: The Million Veteran Program**

**Kerry L. Ivey 1,2,3,\*, Xuan-Mai T. Nguyen 1,4,5, Daniel Posner 1, Geraint B. Rogers 2, Deirdre K. Tobias 3,6, Rebecca Song 1,7, Yuk-Lam Ho 1, Ruifeng Li 3, Peter W. F. Wilson 8,9, Kelly Cho 1,4,5, John Michael Gaziano 1,4,5,10, Frank B. Hu 3,11,12, Walter C. Willett 3,11,12 and Luc Djoussé 1,4,5,†**


**Abstract:** The *exposome* represents the array of dietary, lifestyle, and demographic factors to which an individual is exposed. Individual components of the exposome, or groups of components, are recognized as influencing many aspects of human physiology, including cardiometabolic health. However, the influence of the whole exposome on health outcomes is poorly understood and may differ substantially from the sum of its individual components. As such, studies of the complete exposome are more biologically representative than fragmented models based on subsets of factors. This study aimed to model the system of relationships underlying the way in which the diet, lifestyle, and demographic components of the overall exposome shape the cardiometabolic risk profile. The current study included 36,496 US Veterans enrolled in the VA Million Veteran Program (MVP) who had complete assessments of their diet, lifestyle, demography, and markers of cardiometabolic health, including serum lipids, blood pressure, and glycemic control. The cohort was randomly divided into training and validation datasets. In the training dataset, we conducted two separate exploratory factor analyses (EFA) to identify common factors among exposures (diet, demographics, and physical activity) and laboratory measures (lipids, blood pressure, and glycemic control), respectively. In the validation dataset, we used multiple normal regression to examine the combined effects of exposure factors on the clinical factors representing cardiometabolic health. The mean ± SD age of participants was 62.4 ± 13.4 years for both the training and validation datasets. The EFA revealed 19 Exposure Common Factors and 5 Physiology Common Factors that explained the observed (measured) data. Multivariate regression in the validation dataset revealed the structure of associations between the Exposure Common Factors and the Physiology Common Factors.
For example, we found that the factor for fruit consumption was inversely associated with the factor summarizing total cholesterol and low-density lipoprotein cholesterol (LDLC, *p* = 0.008), and the latent construct describing light levels of physical activity was inversely associated with the blood pressure latent construct (*p* < 0.0001). We also found that a factor capturing the tendency of participants who frequently consume whole milk to consume skim milk less frequently was positively associated with the latent constructs representing total cholesterol and LDLC as well as systolic and diastolic blood pressure (*p* = 0.0006 and <0.0001, respectively). Multiple multivariable-adjusted regression analyses of exposome factors allowed us to model the influence of the exposome as a whole. In this metadata-rich, prospective cohort of US Veterans, there was evidence of structural relationships between diet, lifestyle, and demographic exposures and subsequent markers of cardiometabolic health. This methodology could be applied to answer a variety of research questions about human health exposures that utilize electronic health record data and can accommodate continuous, ordinal, and binary data derived from questionnaires. Further work to explore the potential utility of including genetic risk scores and time-varying covariates is warranted.

**Keywords:** exposome; diet; lifestyle; demographics; cardiovascular disease; cholesterol; triglycerides; blood pressure; glycemic control

**Citation:** Ivey, K.L.; Nguyen, X.-M.T.; Posner, D.; Rogers, G.B.; Tobias, D.K.; Song, R.; Ho, Y.-L.; Li, R.; Wilson, P.W.F.; Cho, K.; et al. The Structure of Relationships between the Human Exposome and Cardiometabolic Health: The Million Veteran Program. *Nutrients* **2021**, *13*, 1364. https://doi.org/10.3390/nu13041364

Academic Editor: Rosa Casas

Received: 10 March 2021 Accepted: 14 April 2021 Published: 19 April 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Background**

Cardiovascular disease (CVD) is the leading cause of adult mortality globally [1]. Strategies aimed at reducing CVD rates involve modulation of markers of CVD risk. In particular, elevated circulating cholesterol and triglyceride levels are associated with higher CVD risk and mortality and are principal targets for risk reduction [2,3]. Further, elevated blood pressure is one of the leading non-communicable disease risk factors [4], and elevated glycated hemoglobin is also a predictor of cardiovascular disease [5–8]. As such, strategies aimed at improving lipid profile, blood pressure, and glycemic control are urgently needed.

The array of external factors an individual is exposed to, referred to as the exposome, represents a complex network of interrelationships within and between different components comprising diet, lifestyle, and demographics. Despite the well-documented ability of individual exposome components, or groups of exposome components, to influence many aspects of human physiology [9–11], surprisingly little is known about how the complex array of exposures as a whole shapes the cardiometabolic risk profile. Reductionist approaches, such as single-exposure models, are unable to account for the complex interactions between the many potential exposome components and their effect on physiology [12]. The absence of models that integrate the many different components undermines the capacity of current studies of exposome components to draw robust generalizations. This project therefore aims to model the system of relationships underlying the way in which the exposome, as a whole, shapes the cardiometabolic risk profile. To achieve this aim, we utilized a unique dataset generated from an exposome assessment, as well as longitudinally assessed markers of cardiometabolic health in the Million Veteran Program.

#### **2. Methods**

#### *2.1. Study Population*

Between January 2011 and November 2019, approximately 800,000 Veterans enrolled in the Million Veteran Program (MVP) [13]. The current prospective study draws from the approximately 350,000 Veterans who had enrolled in the MVP between January 2011 and 2016. Of the 297,937 participants who had exposure data, 182,363 were excluded because they were using antilipemic, antihypertensive, and/or hypoglycemic medications during either the exposure-assessment or outcome-assessment periods (Supplementary Table S1). A further 79,078 participants were excluded because they had incomplete exposure and/or physiology data. Consequently, the final analysis included 36,496 MVP participants. Consent was obtained in accordance with all VA policies and under the authority of the VA Central IRB [13].
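The participant flow described above can be checked with simple arithmetic. The sketch below uses only the counts reported in this section; the function and variable names are illustrative and are not part of the study's analysis code.

```python
# Participant counts as reported in Section 2.1.
with_exposure_data = 297_937
excluded_medication = 182_363  # antilipemic, antihypertensive, and/or hypoglycemic use
excluded_incomplete = 79_078   # incomplete exposure and/or physiology data


def remaining_after(start, *exclusions):
    """Return the running participant count after each exclusion step."""
    counts = [start]
    for n_excluded in exclusions:
        counts.append(counts[-1] - n_excluded)
    return counts


flow = remaining_after(with_exposure_data, excluded_medication, excluded_incomplete)
print(flow)  # [297937, 115574, 36496]
```

The final value matches the 36,496 participants reported as the analytic cohort.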

#### *2.2. Exposome Assessment*

Variables representing the exposome were observed (measured) using various questionnaires. At enrollment, participants completed a baseline questionnaire in which they reported their date of birth, height, smoking status, gender, ethnicity, and race. Participants also completed a lifestyle questionnaire in which they reported body weight, frequency of light, moderate, and vigorous physical activities both during and outside of work hours, smoking status, and number of hours per week spent in sedentary activities (watching television, DVDs, or videos, using a computer, or playing video games). Body mass index (BMI) was calculated as weight (kg)/height (m)².
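As a minimal sketch of the BMI calculation, weight in kilograms is divided by the square of height in meters. The function name and the example values are hypothetical, and the inputs are assumed to be already expressed in metric units.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI = weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical participant: 80 kg and 1.80 m.
bmi = body_mass_index(80.0, 1.80)
print(round(bmi, 1))  # 24.7
```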

A food frequency questionnaire (FFQ) was used to assess habitual dietary intake over the year preceding lifestyle questionnaire administration. Participants were asked to describe the average consumption frequency of 67 different foods using the following instruction: "For each food listed, please mark the column indicating how often, on average, you have used the amount specified during the past year". The pre-specified answers were as follows: "Never or less than once a month; 1 to 3 per month; once a week; 2 to 4 per week; 5 to 6 per week; once (1) a day; 2 to 3 per day; 4 to 5 per day; or 6+ per day".
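The nine response categories form an ordered scale. One way to prepare such responses for analysis is to map each label to its rank, as sketched below. The text does not state how responses were coded in this study, so the 0 to 8 coding is an assumption, and the food item names are hypothetical.

```python
# The nine pre-specified FFQ response options, in ascending order of frequency.
FFQ_CATEGORIES = [
    "Never or less than once a month",
    "1 to 3 per month",
    "once a week",
    "2 to 4 per week",
    "5 to 6 per week",
    "once (1) a day",
    "2 to 3 per day",
    "4 to 5 per day",
    "6+ per day",
]

# Assumed coding: rank 0 (lowest frequency) through 8 (highest frequency).
ORDINAL_CODE = {label: rank for rank, label in enumerate(FFQ_CATEGORIES)}


def encode_response(label: str) -> int:
    """Map one response label to its ordinal rank; raises KeyError if unknown."""
    return ORDINAL_CODE[label.strip()]


# Hypothetical responses for two food items from one participant.
responses = {
    "whole_milk": "2 to 4 per week",
    "skim_milk": "Never or less than once a month",
}
encoded = {item: encode_response(label) for item, label in responses.items()}
print(encoded)  # {'whole_milk': 3, 'skim_milk': 0}
```

Keeping the labels in a single ordered list means the rank coding follows directly from the questionnaire wording and cannot drift out of order.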

For this study, we included 83 measured (observed) variables to represent the exposome. This is a large number of individual variables, many of which are not independent of one another because of patterns of behavior among participants. To make sense of the numerous individual measured (observed) exposome variables, we needed to reduce their dimensionality to a smaller set of common factors representing groups of covarying measured (observed) variables. The methods used to achieve this dimension reduction are detailed in the statistical analysis section.
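To illustrate what "covarying measured variables" means in practice, the sketch below computes pairwise correlations for three hypothetical FFQ items. It is not the exploratory factor analysis used in the study; it only shows the kind of covariation that motivates grouping variables into common factors. The data are invented, and Pearson correlation is used purely for simplicity.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented ordinal codes (0-8) for six hypothetical participants.
toy = {
    "whole_milk": [6, 5, 0, 1, 7, 0],
    "skim_milk": [0, 1, 6, 5, 0, 7],
    "apples": [2, 6, 3, 5, 1, 4],
}

names = list(toy)
corr = {(a, b): pearson(toy[a], toy[b]) for a in names for b in names}

for a in names:
    print(a.ljust(12), " ".join(f"{corr[(a, b)]:+.2f}" for b in names))
```

In this toy data, whole milk and skim milk are strongly negatively correlated, so a single underlying factor could summarize both items instead of treating them as two independent exposures.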
