**3. Results**

## *3.1. DCM-Related Reconnection of mRNA Expression*

We first normalized the RNA-Seq read counts of each gene in each sample. In the sample–sample distance plot of the screening cohort, the majority of DCM samples clustered together (Figure 1B), while only a few DCM samples lay close to the cluster of control samples. This was expected and is consistent with the known heterogeneous transcriptomic profile of DCM patients, whose disease states range from mild to severe. The corresponding principal component analysis showed a similar distribution pattern (Figure 1C). In the sample–sample distance plot of the replication cohort, only one sample in each group was an outlier (Figure S1A), and the same effect was seen in the PCA plot (Figure S1B). Overall, the presence of outliers might render further analysis more challenging; however, since our aim was to pinpoint only the most robust associations, we proceeded with all samples.
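As a rough illustration of what a sample–sample distance plot is built from, the sketch below computes pairwise Euclidean distances between samples from log-transformed normalized counts. The function name `sample_distances`, the log2(x + 1) transform, the input layout, and the toy counts are all assumptions made for illustration; they are not taken from the study's pipeline.

```python
import math

def sample_distances(counts):
    """Pairwise Euclidean distances between samples.

    counts: dict mapping sample name -> list of normalized counts,
            with genes in the same order for every sample
            (hypothetical input layout).
    """
    # Assumed transform: log2(x + 1), so that a few highly expressed
    # genes do not dominate the distance.
    logged = {s: [math.log2(c + 1.0) for c in v] for s, v in counts.items()}
    names = sorted(logged)
    dist = {}
    for a in names:
        for b in names:
            dist[(a, b)] = math.sqrt(
                sum((x - y) ** 2 for x, y in zip(logged[a], logged[b]))
            )
    return names, dist

# Toy data, invented for illustration only.
toy = {
    "DCM_1": [120.0, 15.0, 300.0],
    "DCM_2": [110.0, 18.0, 280.0],
    "CTRL_1": [20.0, 90.0, 310.0],
}
names, dist = sample_distances(toy)
```

In this toy example the two DCM samples are closer to each other than either is to the control sample, which is the kind of pattern a distance heatmap makes visible.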

**Figure 1.** RNA sequencing in the screening cohort. (**A**) Heatmap of the normalized gene counts of the fifty most significantly differentially expressed genes between dilated cardiomyopathy (DCM) and control samples, shown as an example of the distinct gene expression patterns in DCM and control subjects. (**B**) Heatmap of sample–sample distances of the gene expression. (**C**) Principal component analysis (PCA) plot of the gene expression.

In the differential gene expression analysis, 1330 genes were significantly differentially expressed between DCM and control samples at a significance threshold of FDR < 0.05 (Supplemental File 1). Of these, 259 genes were upregulated and 1071 genes were downregulated (Supplemental File 2). These findings indicate that, on the transcriptomic level, orchestrated changes of gene expression govern the disease state (Figure 1A). The FPKM (fragments per kilobase per million) scatter plot shows the relative expression in DCM and control samples (Figure 2B). The MA plot visualizes the relationship between the mean of normalized counts and the log fold change in the analyzed samples (Figure 2A). As examples of genes upregulated in DCM, we show the gene browser tracks of *NPPA* and *NPPB* (Figure 2C). *NPPA* and *NPPB* encode the natriuretic peptides ANP and BNP, respectively, which are commonly used as biomarkers in the diagnosis and monitoring of DCM since they are strongly associated with the disease.
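The two quantities plotted in Figure 2 can be sketched as follows. FPKM uses its standard definition (fragments per kilobase of transcript per million mapped fragments). The MA coordinates here are computed from plain group means with a pseudocount, which is a simplifying assumption; dedicated differential expression tools estimate fold changes from a statistical model. The function names and example numbers are hypothetical.

```python
import math

def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)

def ma_point(dcm_counts, control_counts, pseudocount=1.0):
    """Return (mean normalized count, log2 fold change DCM vs. control) for one gene.

    The pseudocount is an assumption that avoids taking the log of zero.
    """
    mean_dcm = sum(dcm_counts) / len(dcm_counts)
    mean_ctrl = sum(control_counts) / len(control_counts)
    mean_all = (sum(dcm_counts) + sum(control_counts)) / (
        len(dcm_counts) + len(control_counts)
    )
    log2_fc = math.log2((mean_dcm + pseudocount) / (mean_ctrl + pseudocount))
    return mean_all, log2_fc

# Invented numbers for illustration only.
example_fpkm = fpkm(fragments=500, gene_length_bp=2000, total_mapped_fragments=25_000_000)
example_ma = ma_point([7.0, 7.0], [1.0, 1.0])
```

A positive log2 fold change places a gene above the horizontal axis of the MA plot (higher in DCM), a negative one below it.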

**Figure 2.** (**A**) MA plot for the analysis of differential gene expression. The significant candidates (FDR < 0.05) are marked in red. (**B**) FPKM scatter plot of the gene expression in DCM and control samples. The significantly differentially expressed genes (FDR < 0.05) are marked in red. FPKM: fragments per kilobase per million. (**C**) Gene browser tracks for *NPPA* and *NPPB* as examples of differential gene expression. The track(s) on top represent(s) common transcripts of the genes. RNA-Seq coverage of only one selected DCM sample and one selected control sample is shown below. In these single samples, both genes appear to differ in the abundance of their transcripts. However, when all samples of the same condition were pooled and robust statistical testing was applied, isoform differences could not be shown.

In the gene ontology analysis for cellular components (Supplemental File 3), the upregulated genes were enriched for several neural system components, extracellular matrix components, ion channel complexes, as well as contractile fibers and sarcolemma (FDR < 0.05). The downregulated genes were enriched for several immunological complexes, ribosomal subunits, and numerous cellular membranous components, including reticulum membrane (FDR < 0.05). These findings are typical of DCM pathogenesis. In summary, the whole-transcriptome data from patients and controls accentuate the distinct expression landscape of mRNA transcripts. This raised the question of whether the individual transcripts are also differentially composed, e.g., by alternative splicing.
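The enrichment test behind the gene ontology results is not specified in this section. A common approach for over-representation analysis is a one-sided hypergeometric test followed by Benjamini-Hochberg correction, and the sketch below assumes exactly that. The function names and all numbers are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k): probability of seeing at least k annotated genes
    when n genes are drawn from N, of which K carry the annotation."""
    upper = min(n, K)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # stay monotone in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Invented example: 4 genes in the universe, 2 annotated, 2 selected, both annotated.
p_small = hypergeom_enrichment_p(k=2, n=2, K=2, N=4)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A term would then be reported as enriched when its adjusted p-value falls below the chosen threshold, here FDR < 0.05.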

## *3.2. Epigenome-Wide Linkage of DNA Methylation and Inclusion of Exon*

In the data exploration of 394,247 qualified methylation probes in the screening cohort, two principal components separated the samples of DCM patients from those of control subjects well, although the two clusters overlapped in the middle (Figure S3A). In the replication cohort, the clustering of DCM patients and control subjects was more clearly delineated in the PCA plot (Figure S3B), with only one DCM sample standing out as an outlier in the top right quadrant.

Of all probed sites of the Infinium HumanMethylation 450 K BeadChip, 88,699 probes were located in intronic regions, comprising approximately 20% of all probes. Further, around 33% of all exons had available methylation measurements in their neighboring introns (upstream, downstream, or on both sides). These exons and their flanking introns with available methylation measurements were then analyzed together as pairs in order to inspect possible regional effects of intronic methylation on the inclusion of alternative exons. The analyzable pairs comprised 41,158 intron-exon pairs and 41,253 exon-intron pairs across the whole genome (Figure 3). The association between intronic DNA methylation and the calculated exon usage was modeled with logistic regression, which showed a robust positive correlation between intronic DNA methylation and exon usage (up- and downstream, *p* value < 2 × 10<sup>−16</sup> and *p* value < 2 × 10<sup>−16</sup>, respectively), even after adjustment for intron-exon distance (Table 1).
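The pairing step described above can be sketched as follows: for each exon, look for methylation probes in the intron directly upstream (an intron-exon pair) and directly downstream (an exon-intron pair). This is a minimal sketch under stated assumptions: a single plus-strand transcript, inclusive coordinates, and a hypothetical function name and toy coordinates. It is not the study's implementation.

```python
from bisect import bisect_left, bisect_right

def pair_exons_with_intronic_probes(exons, probes):
    """Pair exons with probes in their flanking introns.

    exons:  sorted, non-overlapping (start, end) tuples of one
            plus-strand transcript, inclusive coordinates (assumption).
    probes: sorted probe positions on the same chromosome.
    Returns (intron_exon_pairs, exon_intron_pairs); each pair is
    (exon, [probe positions in the flanking intron]).
    """
    def probes_in(lo, hi):
        # Probes with lo <= position <= hi.
        return probes[bisect_left(probes, lo):bisect_right(probes, hi)]

    intron_exon, exon_intron = [], []
    for i, (start, end) in enumerate(exons):
        if i > 0:  # an upstream intron exists
            hits = probes_in(exons[i - 1][1] + 1, start - 1)
            if hits:
                intron_exon.append(((start, end), hits))
        if i < len(exons) - 1:  # a downstream intron exists
            hits = probes_in(end + 1, exons[i + 1][0] - 1)
            if hits:
                exon_intron.append(((start, end), hits))
    return intron_exon, exon_intron

# Toy coordinates, invented for illustration only.
exons = [(100, 200), (500, 600), (900, 1000)]
probes = [150, 320, 450, 1200]
intron_exon, exon_intron = pair_exons_with_intronic_probes(exons, probes)
```

In the toy example only the first intron (201 to 499) contains probes, so it yields one intron-exon pair with the second exon and one exon-intron pair with the first exon. The exonic probe at 150 and the probe beyond the last exon are ignored.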

**Figure 3.** Scheme presenting the included intron-exon and exon-intron pairs in the epigenome-wide analysis between DNA methylation and inclusion of exon.



\* Generalized regression analysis using quasi-Poisson distribution. # Median distance between exon and intron = 1741 bp. &*n* = 41,158 intron-exon pairs. \$*n* = 41,253 exon-intron pairs.
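The footnote above names a generalized regression with a quasi-Poisson distribution. As a hedged illustration of what such a fit involves, the sketch below estimates a log-link model by iteratively reweighted least squares and derives the dispersion from the Pearson statistic. The response, the covariates (intronic methylation and intron-exon distance), the function names, and the toy data are assumptions for illustration and do not reproduce the study's model or results.

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_quasi_poisson(X, y, iterations=25):
    """IRLS fit of a log-link model; returns (coefficients, dispersion)."""
    n, p = len(X), len(X[0])
    eta = [math.log(v + 0.1) for v in y]  # common starting point
    beta = [0.0] * p
    for _ in range(iterations):
        mu = [math.exp(e) for e in eta]
        # Working response of the log link.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        # Weighted normal equations (X' W X) beta = X' W z with W = diag(mu).
        XtWX = [[sum(mu[i] * X[i][a] * X[i][c] for i in range(n))
                 for c in range(p)] for a in range(p)]
        XtWz = [sum(mu[i] * X[i][a] * z[i] for i in range(n)) for a in range(p)]
        beta = solve_linear(XtWX, XtWz)
        eta = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    mu = [math.exp(e) for e in eta]
    # Quasi-Poisson dispersion: Pearson chi-square / residual degrees of freedom.
    dispersion = sum((yi - m) ** 2 / m for yi, m in zip(y, mu)) / (n - p)
    return beta, dispersion

# Toy data generated from known coefficients (invented, noise-free).
methylation = [0.1, 0.3, 0.5, 0.7, 0.9, 0.2]
distance_kb = [0.5, 2.0, 1.0, 3.0, 1.5, 2.5]
X = [[1.0, m, d] for m, d in zip(methylation, distance_kb)]
y = [math.exp(1.0 + 0.8 * m - 0.1 * d) for m, d in zip(methylation, distance_kb)]
beta, dispersion = fit_quasi_poisson(X, y)
```

The coefficient estimates are the same as for an ordinary Poisson fit; the quasi-Poisson variant differs only in scaling the standard errors by the estimated dispersion, which is why it is often chosen for overdispersed data.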
