## **1. Introduction**

Systemic hypertension is a consistently elevated systolic or diastolic blood pressure in the systemic arteries. Systolic blood pressure (SBP) is generated by the contraction of the ventricles and represents the highest blood pressure (BP) level. Diastolic blood pressure (DBP) is the BP remaining during the relaxation of the ventricles and represents the lowest BP level. The term Pulse Pressure (PP) refers to the difference (in mmHg) between the systolic and diastolic pressures, while the Mean Arterial Pressure (MAP) is the average BP during a single cardiac cycle [1–3]. Clinicians consider 140 mmHg as the maximum normal adult SBP value, and 90 mmHg as the upper limit for normal DBP value, as suggested by the World Health Organization (WHO) [4]. Usually, high SBP is caused by the narrowing of the arterioles. This narrowing raises the peripheral resistance to blood flow, which requires a greater workload for the heart and raises arterial pressure [1]. Elevated BP levels still represent a huge public health issue worldwide, being the major risk factor for cardiovascular disease, including coronary heart disease, stroke, and heart failure. Each year, 17 million people prematurely die because of cardiovascular disease, and, among these, nine million deaths occur as a consequence of hypertension-related complications [5].
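The derived quantities defined above can be illustrated with a minimal sketch. PP follows directly from the definition in the text; for MAP, the sketch assumes the common resting approximation MAP ≈ DBP + PP/3, which is not stated in the text and is used here for illustration only. The function names are hypothetical.

```python
def pulse_pressure(sbp, dbp):
    """Pulse pressure (mmHg): difference between systolic and diastolic pressure."""
    return sbp - dbp


def mean_arterial_pressure(sbp, dbp):
    """Approximate MAP (mmHg).

    Assumption: the usual resting-heart-rate approximation
    MAP ~ DBP + PP / 3; the true value depends on the pressure waveform.
    """
    return dbp + pulse_pressure(sbp, dbp) / 3.0


if __name__ == "__main__":
    # Illustrative reading of 120/80 mmHg
    print(pulse_pressure(120, 80))
    print(round(mean_arterial_pressure(120, 80), 1))
```

The one-third weighting reflects that diastole usually occupies about two thirds of the cardiac cycle at rest, so the approximation becomes less accurate at high heart rates.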

Ninety-five percent of hypertensive patients present with a form lacking an obvious identifiable cause (Essential or Primary Hypertension). Twin and family studies revealed a moderate heritability ranging between 30% and 50% [6,7]. Hypertension is a heterogeneous disease; besides genetic variation, several factors such as age, sex, and ethnicity influence this trait, in addition to other environmental factors (e.g., lipid levels and obesity).

So far, the study of hypertension has mostly been based on Genome-Wide Association Studies (GWAS). GWAS represent a valuable approach to type hundreds of thousands of Single Nucleotide Polymorphisms (SNPs) in very large cohorts. During the last ten years, many studies have been published thanks to the establishment of very large consortia, including the International Consortium for Blood Pressure Genome-Wide Association Studies, Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) and Global BPgen, the Wellcome Trust Case Control Consortium Studies, the UK Biobank, the PBCHARGE-EchoGen consortium, and the CHARGE-HF consortium [8–12], leading to the identification of more than 100 SNPs implicated in BP levels, as recently reviewed by Seidel and Scholl [13].

The cause of a complex trait, like essential hypertension, remains elusive if examined in the light of the GWAS results alone. Systems genetics approaches and related statistical methods represent a step forward compared to the classic GWAS analyses [14]. These approaches use intermediate phenotypes, such as transcript, protein, or metabolite levels, and quantify and integrate them with several traits of interest. Several genes, pathways, and networks underlying common human diseases have been discovered using systems genetics studies.

For example, data derived from GWAS were integrated with expression data to provide a measure of functional variation, i.e., the expression Quantitative Trait Loci (eQTL). When one of these loci is located within ≤1 Mb from the gene encoding the transcript, it is termed a *cis*-eQTL. When an eQTL affects the expression level of a distal gene, it is called *trans*-eQTL. Disease susceptibility can be regulated by a plethora of genes controlled by *trans*-eQTLs which, for this reason, are very informative [15].
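The distinction between *cis*- and *trans*-eQTLs described above can be sketched as a simple distance rule. The sketch below uses the ≤1 Mb window given in the text; measuring the distance to the nearest gene boundary (rather than, for example, the transcription start site) and treating variants on a different chromosome as *trans* are assumptions made for illustration, and the function name is hypothetical.

```python
CIS_WINDOW_BP = 1_000_000  # 1 Mb window, as stated in the text


def classify_eqtl(variant_chrom, variant_pos, gene_chrom, gene_start, gene_end,
                  window=CIS_WINDOW_BP):
    """Label an eQTL as 'cis' or 'trans' relative to the gene it regulates.

    Assumption: distance is measured to the nearest gene boundary,
    and a variant inside the gene body has distance 0.
    """
    if variant_chrom != gene_chrom:
        return "trans"
    if gene_start <= variant_pos <= gene_end:
        distance = 0
    else:
        distance = min(abs(variant_pos - gene_start), abs(variant_pos - gene_end))
    return "cis" if distance <= window else "trans"


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only
    print(classify_eqtl("chr1", 1_500_000, "chr1", 2_000_000, 2_050_000))
    print(classify_eqtl("chr1", 5_000_000, "chr1", 2_000_000, 2_050_000))
```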

Thanks to studies based on rat, mouse, or human cells and tissues, it has been calculated that about 30% of mammalian genes are under the control of eQTLs and that they heavily contribute to complex disease susceptibility [16]. Moreover, using comparative genomics between established rat models of hypertension and humans, several studies have shown that human genes found to be associated with hypertension through GWAS, when conserved in the rat, are likely to form both *cis*- and *trans*-acting eQTLs in multiple tissues [17]. These studies have also taken advantage of a statistical method known as Weighted Gene Co-Expression Network Analysis (WGCNA), which studies biological networks based on pairwise correlations between variables and is often used to highlight clusters (modules) of highly correlated genes [18].
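The pairwise-correlation idea behind WGCNA can be sketched as follows. This is only the first step of the method (an unsigned, soft-thresholded adjacency of the form |r|^β); the full procedure, including topological overlap and module detection by clustering, is not reproduced here. The value β = 6 and the toy expression values are assumptions chosen for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def soft_threshold_adjacency(expression, beta=6):
    """Unsigned weighted adjacency |r|**beta for every pair of genes.

    expression: dict mapping gene name -> list of expression values (one per sample).
    beta: soft-thresholding power (assumed value, normally tuned per dataset).
    """
    genes = list(expression)
    adjacency = {}
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            r = pearson(expression[g1], expression[g2])
            adjacency[(g1, g2)] = abs(r) ** beta
    return adjacency


if __name__ == "__main__":
    # Toy profiles: A and B are perfectly correlated, C is only weakly related
    toy = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [1, -1, 1, -1]}
    for pair, weight in soft_threshold_adjacency(toy).items():
        print(pair, round(weight, 4))
```

Raising the absolute correlation to a power suppresses weak correlations while keeping strong ones close to 1, which is what lets tightly co-expressed genes stand out as modules.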

At the heart of the GWAS-based approaches lies the "common variant–common disease" hypothesis. However, when all the detected high-frequency variants are considered in aggregate, they explain only 2–3% of BP variability. Moreover, blood pressure changes related to different genotypes at these loci are estimated to be modest, approximately 1.0 and 0.5 mmHg for SBP and DBP, respectively [19]. Considering the moderate effects and the scarce genetic control ascribable to high-frequency variants, two possible scenarios came forward: heritability was incorrectly estimated, or the underlying alleles are more likely to be heterogeneous and uncommon. Furthermore, array-based technologies were seldom used to detect the causal polymorphisms themselves. These observations implied strong limits in exploiting GWAS to identify druggable targets with high confidence and supported the idea that rare (frequency < 1%) and uncommon (frequency between 1% and 5%) functional variants may explain a greater fraction of hypertensive individuals. The arrival of Next-Generation Sequencing (NGS) technologies facilitated a shift in focus from common to rare variants and provided the opportunity to unravel the genomic architecture underlying hypertension risk. Along with the development of even more advanced laboratory methodologies, statistical genetic models must also evolve to meet the challenge of using rare variants to link previously unidentified genome loci to BP changes [20,21].
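The frequency classes used throughout this review can be written as a small helper. The cut-offs (rare < 1%, uncommon 1–5%) come from the text; treating exactly 5% as uncommon and everything above it as common is an assumption, and the function name is hypothetical.

```python
def frequency_class(maf):
    """Classify a variant by Minor Allele Frequency (MAF, as a fraction).

    rare:     MAF < 1%
    uncommon: 1% <= MAF <= 5%  (boundary handling is an assumption)
    common:   MAF > 5%
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("MAF must lie between 0 and 0.5")
    if maf < 0.01:
        return "rare"
    if maf <= 0.05:
        return "uncommon"
    return "common"


if __name__ == "__main__":
    for maf in (0.002, 0.03, 0.14):
        print(maf, frequency_class(maf))
```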

In this review, we first present an overview of the most recent findings regarding the role of rare and uncommon variants in BP alteration identified through the currently available technologies, moving from the candidate-gene approach to the high-throughput exome chips, and then to NGS solutions; next, we report the statistical methods proposed so far for rare variants analysis. Finally, we draw conclusions on the contribution ascribable to rare and low-frequency variants in the improvement of cardiovascular risk assessment.

## **2. Results**

#### *2.1. Results from Studies on Selected Single Nucleotide Variants and Genes*

Conducting studies based on a candidate-gene approach is the easiest and cheapest way to investigate genetic variation. *FBN1*, a gene that is thought to be causative of vascular damage and whose mutations have been previously detected only in relation to Marfan syndrome, was selected by Jeppesen and colleagues [22] as their research focus. A sample of 4839 Danish subjects was genotyped for the rs11856553 rare variant (Minor Allele Frequency, MAF, of A allele = 0.2%, 1000 Genomes) using a PCR-based method. In the Health 2006 study, an unadjusted risk of hypertension of 2.67 (95% Confidence Interval, CI, 1.14–6.18) for the G/A genotype was reported. The adjusted risk of moderate to severe hypertension (grade 3) for the A/A–G/A genotypes (homozygous and heterozygous carriers were grouped) was 8.01 (95% CI, 3.27–19.58; *p* < 0.0001). No significant differences in BP between G/A and G/G variant carriers were described within the MONICA10 study; however, the adjusted risk of moderate to severe hypertension (grade 2) for A/A–G/A variants was 6.54 (2.12–20.2; *p* < 0.01). How this intronic mutation could functionally affect hypertension remains undefined [22].

The cytokine Interleukin-6 (IL-6) is a fundamental mediator of the acute-phase response to endothelial injury and regulates the production of C-Reactive Protein (CRP) in hepatocytes [23]; therefore, both *IL-6* and *CRP* genetic variants have been evaluated in relation to hypertension [2,20]. In the paper from Karaman et al. [24], the *IL-6* rs1800795 and rs1800796 SNPs (MAF = 14.12% and 31.39%, respectively, 1000 Genomes) were genotyped in a Turkish sample of 108 controls and 111 hypertension patients. Neither SNP genotype was significantly related to hypertension or to IL-6 and CRP plasma levels. The CC genotype of the rs1800796 SNP is very rare in the examined population, and large frequency differences among different populations and geographic regions have been reported [25].

Endothelial nitric oxide synthase (eNOS) produces Nitric Oxide, a vasodilator of vascular smooth muscles, and thus plays a crucial role in regulating BP. A four-SNP haplotype, comprising the uncommon variant rs11699009 in the *BPIFB4* gene, has been associated with notable longevity [26]. In the study of Vecchione et al. [27], 416 individuals were genotyped to determine their haplotypes. The rare variant-haplotype carriers showed a significantly increased DBP (*p* = 0.013) and a borderline increased SBP (*p* = 0.067). The authors demonstrated that the overexpression of the *BPIFB4* uncommon variant in mice impaired eNOS signaling and increased BP, opening the way for the development of new therapeutic strategies.

#### *2.2. Results from Exome Chips-Based Studies*

When the 1000 Genomes Project became publicly available, data from NGS technology allowed Affymetrix (Santa Clara, CA, USA) and Illumina (San Diego, CA, USA) to develop array-based genotyping platforms that capture a greater range of single nucleotide variability compared to GWAS. Table 1 lists studies investigating common and rare variants in association with hypertension and BP phenotypes through exome array approaches. Most publications [28–38] (Table 1) took advantage of the Illumina HumanExome BeadChip (Exome Chip; Illumina, Inc., San Diego, CA, USA).




#### *Int. J. Mol. Sci.* **2018**, *19*, 688


**Table 1.** *Cont.*

Weighted-Sum (w-SUM), Simple Sum Test (SST), Variable-Threshold (VT), Replication-Based Statistic (RBS), Functional Principal Components Analysis (FPCA), Generalized Estimating Equation model (GEE model), burden test on deleterious variants (burden-T1-del).

This chip was produced to meet the need of moving from the relatively frequent variants derived from GWAS to functional variants located in coding regions. In terms of both cost and practical issues, the array constitutes an intermediate choice between GWAS and NGS of large numbers of samples. The Exome Chip was designed on genome and exome sequencing data of 16 contributing studies, reaching a total of 12,031 subjects. In the array, 247,039 markers were assayed including 84% rare variants, 9.2% low-frequency variants, and only 5.8% common variants which were identified more than three times in at least two different datasets. Most variants (>90%) are non-synonymous or splicing variants that were absent in previously available chips. Genotyped individuals were mostly of European American ancestry, which led to some concerns about the evolutionarily young age of variants and population-specific results. The Exome Chip consortia provided information on several common diseases, including cardiovascular disease (Available online: http://genome.sph.umich.edu/wiki/Exome\_Chip\_Design) [39].

In 2015, Sung and colleagues [28] reported the results of rare and low-frequency single variants and four sets of gene-based analyses using Exome Chip data on 2045 African-American subjects from the HyperGEN cohort. Neither Single Nucleotide Variants (SNVs) nor gene level analyses reached genome-wide Bonferroni-corrected thresholds (*p* < 6.4 × 10<sup>−7</sup> for SNVs; *p* < 2 × 10<sup>−6</sup> with MAF < 1% and *p* < 3.9 × 10<sup>−6</sup> with MAF < 5%) [28].
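For orientation, a Bonferroni-corrected threshold is simply the family-wise error rate divided by the number of tests. The sketch below assumes α = 0.05; the test counts in the example are not reported in the text and are back-calculated from the quoted thresholds purely for illustration.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold controlling the family-wise error rate."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


if __name__ == "__main__":
    # Hypothetical test counts, back-calculated from the thresholds quoted above
    print(bonferroni_threshold(78_125))  # single-variant analysis
    print(bonferroni_threshold(25_000))  # gene-based analysis
```

Because gene-based analyses involve far fewer tests than single-variant analyses, their corrected thresholds are less stringent, which is one motivation for aggregating rare variants by gene.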

The same cohort was used for analyses focused on both rare (through the Exome Chip) and common (using the Affymetrix Genome-Wide Human SNP 6.0 Array) variants within the *PCSK9* gene in relation to BP traits. PCSK9 is a protease able to interact with the three subunits of the renal epithelial sodium channel (ENaC). This interaction increases proteasomal degradation of the ENaC, which regulates sodium reabsorption [40]. Among the 31 SNPs identified, none of the single-variant associations were statistically significant (*p* > 0.05). The cumulative effect of rare variants (mostly non-synonymous or stop-gain SNVs) detected in *PCSK9* was significantly associated with DBP in HyperGEN (*p* = 0.04) and with SBP in REGARDS data (*p* = 0.04). The disparity in the associated phenotypes was probably due to differences in the age of the populations [29].

Alteration in lipid levels is strongly related to hypertension [41], and a pleiotropic effect of lipid-associated loci on hypertension can be hypothesized. To investigate this, the group of Kim et al. [38] interrogated 135 Exome Chip SNVs for associations with ten cardiometabolic traits in 14,028 Korean individuals. Three new common variants in the *BRAP*, *ACAD10*, and *ALDH2* genes within the 12q24.12 locus were significantly associated with both SBP and DBP (*p* < 1.09 × 10<sup>−4</sup>; effect sizes between −1.53 ± 0.32 and −0.78 ± 0.20). The locus was also associated with High-Density Lipoprotein (HDL), Low-Density Lipoprotein (LDL), triglycerides, fasting plasma glucose, body mass index, and waist–hip ratio (*p* < 1.06 × 10<sup>−2</sup>; effect sizes between −7.60 ± 1.72 and 2.55 ± 0.53) [38]. Subsequently, a longitudinal Exome-Wide Association Study (EWAS), which is a genotyping method restricted to exonic SNVs using Illumina exome chips, allowed the detection of six hypertension-related SNVs at the 12q24.1 locus, creating an East Asian-specific haplotype comprising five derived alleles. The study was conducted in 6026 Japanese individuals whose disease progression and physiological changes were traced for several years during annual health check-ups. The rationale of this study was the observation that SBP, DBP, and the prevalence of hypertension are significantly correlated with age, while conventional GWAS have commonly been conducted in a cross-sectional manner, measuring traits at a single point in time. People carrying the East Asian-specific haplotype displayed a significantly lower hypertension prevalence than individuals carrying a common haplotype (mean Odds Ratio (OR) = 0.78, *p* < 1.0 × 10<sup>−8</sup>). Furthermore, using a recessive model, an SNV located within the *COL6A5* gene was significantly associated with SBP (Estimate: −2.93; *p* = 2.3 × 10<sup>−8</sup>) [30].

Premature stop codons are highly likely to alter protein function and thereby affect BP-related traits; for this reason, Ohlsson et al. [31] focused on the relationship between BP and protein truncating variants in the genotypes of 5453 Swedish people. They reported 19 SNVs associated with SBP with a *p* value < 0.05. The *PDE11A* R307X mutation conferred a 7 mmHg higher SBP and a 4.6 mmHg higher DBP (β coefficients = 7.0 (1.8–12) for SBP corrected, *p* = 0.009 and 4.6 (1.8–7.4) for DBP corrected, *p* = 0.001) and was previously described as a loss-of-function mutation linked to familial hypertension and Cushing's syndrome [42]. The stop codon mutation caused a three-fold increased risk of hypertension in female carriers (OR = 3.1 (95% CI, 1.3–7.4), *p* = 0.009).

Two very large meta-analyses were then published at the same time to identify novel coding variants and loci influencing BP traits and hypertension. In the first meta-analysis, Surendran et al. [32] genotyped 192,763 subjects, mostly of European descent, in the discovery phase. Fifty-one genomic regions were found to be significantly associated with at least one of the following BP traits: SBP, DBP, PP, and hypertension in the discovery analysis (*p* < 5 × 10<sup>−8</sup>). Thirty novel SNVs replicated in 155,063 subjects from multiethnic populations (*p* < 6.2 × 10<sup>−4</sup>; βs: −1.43 to 2.70). Among these, rare putative functional variants were identified within the *A2ML1*, *COL21A1*, *RRAS*, *RBM47*, and *ENPEP* genes. Interestingly, intersecting previous GWAS data with Exome Chip data revealed five out of 35 known loci which likely had rare coding functional variants. The second large meta-analysis was conducted on subjects of all ancestries from the same five consortia described in Surendran et al. [32], reaching a total sample number of 327,288. Here, the authors identified 31 additional new loci with statistically significant associations with one of the BP traits (*p* < 3.4 × 10<sup>−7</sup>). Three variants had frequencies between 1% and 5% and were non-synonymous substitutions in *NPR1* (already established), *SVEP1*, and *PTPMT1* (novel genes) with a *p* value less than 3.4 × 10<sup>−7</sup> when corrected for multiple testing. Of note, the BP increment attributable to any of these low-frequency variants (>1.5 mmHg) was higher than that of any of the novel common SNPs described here. Low-frequency and frequent SNVs with non-synonymous, stop-coding, and splicing effects were aggregated using burden tests to identify new gene-based associations. These analyses showed significant results for *NPR1* (*p* = 4.4 × 10<sup>−5</sup>) and marginal results for the *PTPMT1* and *DBH* genes (*p* = 0.019 and 0.053, respectively). Considering that an overlap between cardiovascular-specific pathways and metabolic disease-related factors was observed, the authors suggested a shared origin between the phenotypes that could be exploited for new drug discovery [33].
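The idea of a burden test mentioned above can be sketched in its simplest form: the qualifying variants in a gene are collapsed into one score per individual, and the trait is regressed on that score. The unweighted allele count and the plain least-squares slope used below are assumptions for illustration; the cited studies may have used weighted or covariate-adjusted versions, and the genotype and SBP values are invented toy data.

```python
def burden_scores(genotypes):
    """Collapse variants into one score per individual.

    genotypes: list of per-individual lists of minor-allele counts (0, 1, or 2),
    one entry per qualifying variant in the gene.
    """
    return [sum(person) for person in genotypes]


def regression_slope(x, y):
    """Ordinary least-squares slope of y on x (no covariates)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


if __name__ == "__main__":
    # Toy data: four individuals, three variants in one hypothetical gene
    genotypes = [[0, 0, 0], [1, 0, 0], [0, 1, 1], [0, 0, 0]]
    sbp = [120, 125, 130, 120]
    burden = burden_scores(genotypes)
    # Slope = estimated mmHg change in SBP per additional minor allele
    print(burden, regression_slope(burden, sbp))
```

Collapsing assumes that the aggregated variants act in the same direction; when effects are mixed, a burden score loses power, which is why variance-component alternatives exist.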

In the most recent meta-analysis on Exome Chip data, Nandakumar et al. [34] screened 15,914 individuals of African ancestry to detect novel genes and BP-related SNVs considering the full frequency spectrum. Nine rare SNVs (mostly missense) within eight genes (*SLC28A3*, *KRBA1*, *SEL1L3*, *YOD1*, *COL6A1*, *CRYBA2*, *GAPDHS*, and *AFF1*) exhibited Bonferroni-corrected associations with SBP or DBP (SeqMeta βs (βssm): 21.10 (4.12)–73.65 (13.19); *p* < 4.6 × 10<sup>−7</sup>), and the *CCDC13* and *QSOX1* genes were also identified through a burden test including only predicted damaging variants (βssm: 54.38 (10.68) and 32.93 (7.13), respectively; *p* < 3.86 × 10<sup>−6</sup>). By contrast, no significant results were obtained considering common and low-frequency variations.

Linkage analysis can have good power to detect multiple rare or lower frequency BP variants in a gene or region with relatively larger effect sizes [43]. However, the linkage regions identified by well-designed linkage family studies [44–46] did not overlap with many BP loci identified by large BP GWAS of mostly unrelated individuals. Therefore, He and colleagues [35] applied variance-component linkage analysis to the Cleveland Family Study (CFS) to identify candidate genomic regions related to SBP, DBP, and PP. Since the region identified (16p13) did not overlap with any SNPs derived from previous GWAS, 517 individuals from the CFS who had been genotyped using the Illumina OmniExpress Exome array [39] were screened for variants within the 16p13 locus. At a gene-based level, the association between the aggregation of five rare variants within the *RBFOX1* gene and SBP as well as PP traits replicated in the meta-analysis of a large sample of 57,234 participants (*p* < 1.71 × 10<sup>−2</sup>). This gene encodes the Ataxin-2 Binding Protein 1, whose genetic variations were suggested to have a protective effect on BP levels, although the underlying mechanisms remain to be clarified [35].

The UK Biobank is a huge prospective cohort of 500,000 individuals of European ancestry, recruited to investigate genetic and non-genetic factors underlying diseases, that takes advantage of many phenotypes and biological samples [47]. Genotypes obtained through a customized array in addition to genome-wide imputation based on 1000 Genomes and UK10K sequence data, and information related to BP traits, were retrieved for 140,886 participants included in the discovery phase of the study conducted by Warren and colleagues [37]. Both GWAS and exome analyses were performed to identify SNVs with MAF ≥ 1% and MAF ≥ 0.01%. Among the 240 loci derived from the discovery phase, 102 GWAS and five exome variants with *p* < 5 × 10<sup>−8</sup> were reported. Notably, a 9.3 mmHg higher SBP was observed when comparing subjects above 50 years old with the highest genetic risk score (estimated on the basis of all the loci identified) with those with the lowest genetic risk score (95% CI: 6.9–11.7, *p* = 1 × 10<sup>−13</sup>) [37]. In the recently published paper from Pazoki and coauthors [48], the 267 SNPs identified by Warren et al. [37] were combined with the 47 BP-associated loci reported by Hoffman et al. [11] to calculate a genetic risk score for high BP in 277,005 subjects belonging to the prospective UK Biobank cohort. A healthy lifestyle score was also constructed for all the individuals in order to investigate whether adherence to a favorable lifestyle could counteract a high genetic susceptibility to develop hypertension and cardiovascular diseases. The authors reported an association between healthy lifestyle and lower SBP and DBP within each tertile of genetic risk. In particular, at low genetic risk, the estimated mean SBP was 140 mmHg (95% CI, 102–177) among subjects with an unhealthy lifestyle and 134 mmHg (95% CI, 95–172) among those with a healthy lifestyle.
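A genetic risk score of the kind described above is typically a sum of risk-allele counts across loci. The sketch below assumes a weighted score in which each allele count is multiplied by a per-allele effect size; whether and how the cited studies weighted their scores is not stated in the text, and the dosages and effect sizes shown are invented for illustration.

```python
def genetic_risk_score(dosages, effect_sizes):
    """Weighted genetic risk score for one individual.

    dosages: risk-allele counts per locus (0, 1, or 2; imputed values may be fractional).
    effect_sizes: per-allele effect estimates (e.g., mmHg), one per locus.
    Assumption: a simple weighted sum, with no scaling or standardization.
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


if __name__ == "__main__":
    # Hypothetical individual genotyped at three loci
    print(genetic_risk_score([0, 1, 2], [0.5, 1.0, 0.25]))
```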

To date, the largest meta-analysis on exome chip data was conducted on 475,000 individuals (mostly European) genotyped using the UK BiLEVE array and the UK Biobank Axiom Array. These are closely related next-generation microarrays (95% identical content) designed by Affymetrix. More than 800,000 markers were included to cover, beyond common SNPs, rare and low-frequency coding variants, copy number variants, pharmacogenomics markers, Human Leukocyte Antigen (HLA), inflammation, and eQTL variants. Among rare variants, in addition to primarily missense mutations, protein truncating variants resulting in premature stop codons, frameshifts, and loss of start mutations were included as loss-of-function variants. The genomic coverage was optimized for European and British populations. The array provided the opportunity to test the association between a wide range of genetic variations and many frequent human diseases, including cardiovascular disease and cardiometabolic traits such as BP (Available online: http://www.ukbiobank.ac.uk) [49]. In the paper from Kraja and colleagues [36], 21 SNVs showed significant associations with at least one BP trait after correcting for multiple testing (*p* < 5 × 10<sup>−8</sup>; βs (se): −1.14 (0.19) to 0.42 (0.06)). Moreover, all variants had concordant directions across all the datasets. Only one SNV (in the *DBH* gene) had a MAF less than 1% and exhibited the lowest effect estimate (β (se): −1.14 (0.19); *p* = 1.23 × 10<sup>−9</sup>). Four novel associations of common SNPs within the *SLC4A1AP*, *AFAP1*, *STAB1*, and *SYNPO2L* genes were reported [36].

#### *2.3. Results from DNA Sequencing Studies*

Large-scale genotyping through high-throughput platforms opened the way to great efforts aimed at discovering the causative variants explaining the associations described.

#### 2.3.1. Pre-Next-Generation Sequencing Era

Direct sequencing represented an easy way to characterize hypertension-related genes embedding SNPs found through GWAS. Okuda et al. [50] validated 143 SNPs identified in a small Japanese population. Among these SNPs, most had frequencies higher than 5% and caused amino acid substitutions, whereas almost all novel variants were rare (13 out of 16).

Genetic Epidemiology Network of Salt Sensitivity (GenSalt) study participants were recruited to evaluate SBP, DBP, and MAP responses to a dietary sodium intervention. The renin-angiotensin-aldosterone system (RAAS) is a hormonal cascade essential for the control of homeostasis, BP, and vascular tone [51,52]. In the first re-sequencing study focused on the RAAS pathway, Kelly and coauthors [53] analysed seven genes for putative associations with BP salt-sensitivity among participants of the GenSalt study. Carriers of 124 rare variants had 1.55-fold increased odds (95% CI: 1.15, 2.10) of salt sensitivity compared to non-carriers (*p* = 0.004). No genes showed significant associations with salt sensitivity after Bonferroni correction. No significant common and low-frequency single markers were detected when the analyses were corrected for multiple comparisons [53]. The reabsorption of sodium in epithelial cells located in the renal tubule is carried out by the renal epithelial sodium channel (ENaC), whose activity is fundamental for BP control [54]. The *SCNN1A*, *SCNN1B*, and *SCNN1G* genes encode the three ENaC subunits [55]. These genes were targeted by Gu et al. [56] to identify novel common, low-frequency, and rare variants in 300 GenSalt participants with the highest MAP response to the high-sodium intervention and 300 GenSalt participants with the lowest MAP response to the high-sodium intervention. No significant associations with salt sensitivity were observed. In gene-based analyses, the *SCNN1A* gene showed a significant association with salt sensitivity (*p* = 0.009). Individuals carrying rare variants in the *SCNN1A* gene had an odds ratio of 0.52 (95% CI: 0.32–0.85). Neither *SCNN1G* nor *SCNN1B* was associated with salt sensitivity in rare variant analyses. Three common variants in *SCNN1A* were associated with salt sensitivity of BP (*p* < 1.3 × 10<sup>−3</sup>; 1.23-fold increased odds and 0.68–0.69-fold decreased odds of salt sensitivity) [56].
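Odds ratios with 95% confidence intervals, such as those quoted above for rare-variant carriers, can be obtained from a 2 × 2 table of carrier status by outcome. The sketch below uses the unadjusted Wald interval on the log scale; the cited studies may have used adjusted models instead, and the counts in the example are invented for illustration.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2 x 2 table.

    a: carriers with the outcome        b: carriers without the outcome
    c: non-carriers with the outcome    d: non-carriers without the outcome
    z: normal quantile (1.96 for a 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, not taken from any cited study
    estimate, lower, upper = odds_ratio_ci(20, 10, 10, 20)
    print(round(estimate, 2), round(lower, 2), round(upper, 2))
```

An interval that excludes 1 corresponds to a nominally significant association at the chosen level, before any correction for multiple testing.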

Another suggested candidate gene for hypertension is the Cadherin-13 gene (*CDH13*). *CDH13* encodes a cell adhesion molecule involved in protecting vascular endothelial cells from apoptosis following oxidative stress, and in endothelial cell survival, proliferation, and migration [57–59]. The promoter region was re-sequenced and subjected to methylation QTL (meQTL) analysis within the HYPertension in ESTonia (HYPEST) and Coronary Artery Disease in Czech (CADCZ) studies. The meQTL rs8060301 (a frequent variant) showed a pleiotropic effect on HDL and DBP (nominal *p* < 0.005), which was not confirmed after multiple testing correction [60].
