*Article* **Systems Analysis Reveals Ageing-Related Perturbations in Retinoids and Sex Hormones in Alzheimer's and Parkinson's Diseases**

**Simon Lam 1, Nils Hartmann 2,†, Rui Benfeitas 3, Cheng Zhang 4, Muhammad Arif 4, Hasan Turkez 5, Mathias Uhlén 4, Christoph Englert 2,6, Robert Knight 1,\* and Adil Mardinoglu 1,4,\***


**Abstract:** Neurodegenerative diseases, including Alzheimer's (AD) and Parkinson's diseases (PD), are complex heterogeneous diseases with highly variable patient responses to treatment. Due to the growing evidence for ageing-related clinical and pathological commonalities between AD and PD, these diseases have recently been studied in tandem. In this study, we analysed transcriptomic data from AD and PD patients, and stratified these patients into three subclasses with distinct gene expression and metabolic profiles. Through integrating transcriptomic data with a genome-scale metabolic model and validating our findings by network exploration and co-analysis using a zebrafish ageing model, we identified retinoids as a key ageing-related feature in all subclasses of AD and PD. We also demonstrated that the dysregulation of androgen metabolism by three different independent mechanisms is a source of heterogeneity in AD and PD. Taken together, our work highlights the need for stratification of AD/PD patients and development of personalised and precision medicine approaches based on the detailed characterisation of these subclasses.

**Keywords:** neurodegeneration; Alzheimer's; Parkinson's; ageing; systems biology

### **1. Introduction**

Neurodegenerative diseases, including Alzheimer's (AD) and Parkinson's diseases (PD), cause years of a healthy life to be lost. Much previous AD and PD research has focused on the causative neurotoxicity agents, namely, amyloid β and α-synuclein, respectively. The current front-line therapies for AD and PD are cholinesterase inhibition and dopamine repletion, respectively, which are considered gold standards. Unfortunately, these therapies are not capable of reversing neurodegeneration [1,2], thus necessitating potentially lifelong dependence on the drug and risking drug-associated complications. Moreover, AD and PD are complex multifactorial diseases with heterogeneous underlying molecular mechanisms involved in their progression [3–5]. This variability can explain the differences in patient response to other treatments such as oestrogen replacement therapy [6,7] and statin treatment [8,9]. Hence, we observed that there are distinct disease

**Citation:** Lam, S.; Hartmann, N.; Benfeitas, R.; Zhang, C.; Arif, M.; Turkez, H.; Uhlén, M.; Englert, C.; Knight, R.; Mardinoglu, A. Systems Analysis Reveals Ageing-Related Perturbations in Retinoids and Sex Hormones in Alzheimer's and Parkinson's Diseases. *Biomedicines* **2021**, *9*, 1310. https://doi.org/ 10.3390/biomedicines9101310

Academic Editor: Masaru Tanaka

Received: 28 July 2021 Accepted: 22 September 2021 Published: 24 September 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

classes affecting specific cellular processes. Therefore, there is a need for the development of personalised treatment regimens.

In this study, we propose a holistic view of the mechanisms underlying the development of AD and PD rather than focusing on amyloid β and α-synuclein [10]. To date, complex diseases including liver disorders and certain cancers have been well studied through the use of metabolic modelling. This enabled the integration of multiple omics data for stratification of patients, discovery of diagnostic markers, identification of drug targets, and proposing of personalised or class-specific treatment strategies [11–14]. A similar approach may be applied for AD and PD since there is already a wealth of data from AD and PD patients from post-mortem brain tissues and blood transcriptomics.

AD and PD share multiple clinical and pathological similarities, including comorbidities [15,16], inverse associations with cancer [17,18], and ageing as a risk factor [19,20]. One type of ageing is telomeric ageing, which is associated with the loss of telomeres, protein/nucleic acid structures that protect chromosome ends from degradation [21]. The enzyme telomerase is necessary for the maintenance of telomeres. In adults, telomerase activity is mostly limited to progenitor tissues such as in the ovaries, testes, and bone marrow. Loss of telomerase activity leads to telomere shortening, loss of sequences due to end-replication, and eventual degradation of sequences within coding regions, leading to telomeric ageing. Considering AD and PD as products of ageing, we can use an ageing model organism to study its effects on the brain. In our study, we used zebrafish (*Danio rerio*) as a model organism since it has been used extensively used to study vertebrate ageing [22]. For example, a zebrafish ageing model can harbour a nonsense mutation in the *tert* gene, which encodes the catalytic subunit of telomerase, and exhibit faster-than-normal ageing [23,24].

In our study, we first analysed post-mortem brain gene expression data and protein– protein interaction data from the Genotype-Tissue Expression (GTEx) database [25], Functional Annotation of the Mammalian Genome 5 (FANTOM5) database [26–29], Human Reference Protein Interactome (HuRI) database [30], and Human Protein Atlas (HPA) (http://www.proteinatlas.org, accessed on 9 March 2021) [31] for characterization of normal brain tissue (Figure 1A). Secondly, we analysed transcriptomic data from the Religious Orders Study and Rush Memory Aging Project (ROSMAP) [32–34] with published expression data from anterior cingulate cortices and dorsolateral prefrontal cortices of PD and Lewy body dementia patients, hereafter referred to as the Rajkumar dataset [35], and from putamina, substantiae nigrae, and prefrontal cortices from patients with PD, hereafter referred to as the Zhang/Zheng dataset [36,37]. On these data, we conducted differential gene expression and functional analysis, and then constructed biological networks to further explore coordinated patterns of gene expression. Next, we performed global metabolic analyses using genome-scale metabolic modelling. Alongside these analyses, we also leveraged zebrafish *tert* mutants to test the hypothesis that the identified changes may be associated with telomeric ageing. Finally, on the basis of our integrative systems analysis, we defined three distinct disease subclasses within AD and PD and identified retinoids as a common feature of all three subclasses, being likely to be perturbed through ageing. We revealed subclass-specific perturbations at three separate processes in the androgen biosynthesis and metabolism pathway, namely, oestradiol metabolism, cholesterol biosynthesis, and testosterone metabolism.

**Figure 1.** Overview and exploratory data analysis. (**A**) Workflow for the analysis of human AD and PD samples. (**B**) AD and PD samples were clustered into *k* clusters without supervision on the basis of normalised expression counts. Results are shown for *k* = 3 and 1000 bootstrap replicates. Colour bars indicate cluster identity for each sample. For 2 ≤ *k* ≤ 7, refer to Figure S1. (**C**) Normalised expression data from AD, PD, and control samples were projected onto 2-D space using t-distributed stochastic neighbour embedding (t-SNE). Points are coloured according to cluster assignment by unsupervised clustering. For further data visualisation, refer to Figure S2.

### **2. Materials and Methods**

### *2.1. Data Acquisition and Processing*

Gene expression values of protein-coding genes from the ROSMAP dataset were determined using kallisto (version 0.46.1, Pachter Lab, Berkeley, CA, USA) [38] by aligning raw RNA sequencing reads to the *Homo sapiens* genome in Ensembl release 96 [39]. Raw single-cell RNA sequencing reads from ROSMAP were converted to counts in Cell Ranger (version 4.0, 10x Genomics, Pleasanton, CA, USA, https://support.10xgenomics.com/ single-cell-gene-expression/software/pipelines/latest/installation; accessed on 24 July 2020), and aligned to the Cell Ranger *Homo sapiens* reference transcriptome version 2020- A. Single-cell expression values were compiled into pseudo-bulk expression profiles for each sample.

Expression values of protein-coding genes from brain samples of the ROSMAP dataset [32–34], GTEx database version 8 [25], FANTOM5 database [26–28] via Regulatory Circuits Network Compendium 1.0 [29], HPA database [31], Rajkumar dataset [35], and Zhang/Zheng dataset [36,37] were then combined. Genes from GTEx and FANTOM5 brain samples were filtered such that only genes whose products are known to participate in a protein–protein interaction described in the HuRI database [30] were included. Expression values were scaled and TMM normalised per sample, Pareto scaled per gene, and batch effects removed with the *removeBatchEffect* function from the limma (version 3.42.0, The Walter and Eliza Hall Institute of Medical Research, Parkville, Australia) [40] R package. After quality control and normalisation, a total of 64,794 genes and 2055 samples resulted. As the data also included samples from patients with neurological conditions other than AD or PD, we then removed those samples and finally accepted 1572 samples corresponding to AD, PD, or control for further analysis.

Projections onto 2-D space by PCA, t-SNE [41], and UMAP [42] methods were generated on data after missing value imputation with data diffusion [43]. t-SNE projections were generated with perplexity 20 and 1000 iterations. All other parameters were kept default. PCA and UMAP projections were generated using all default parameters.

#### *2.2. Transcriptome Analysis*

Using normalised, imputed expression values, AD and PD samples were then arranged into clusters without supervision using ConsensusClusterPlus (version 1.50.0, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA) [44] with maxK = 20 and rep = 1000. All other parameters were kept default. Clustering by *k* = 3 clusters was selected for downstream analysis. A fourth cluster containing only control samples was artificially added to the analysis.

For differential gene expression analysis, normalised, non-imputed counts were used. Genes were removed if expression values were missing in 40% or more of samples or were zero in all samples. Differential expression was then performed using DESeq2 (version 1.26.0, European Molecular Biology Laboratory, Heidelberg, Germany) [45] with uniform size factors and all other parameters set to default. Genes with a Benjamini– Hochberg adjusted *<sup>p</sup>*-value at or below a cut-off of 1 × <sup>10</sup>−<sup>10</sup> were determined significantly differentially expressed genes.

Gene set enrichment analysis was performed using piano (version 2.2.0, Chalmers University of Technology, Göteborg, Sweden) [46] using all default parameters. GO term lists were obtained from Ensembl Biomart (https://www.ensembl.org/biomart/martview, accessed on 9 March 2021) and were used as gene set collections. Enrichment of GO terms was determined by analysing GO terms of genes differentially expressed genes detected by DESeq2 as well as the parents of those GO terms. GO terms with an adjusted *p*-value at or below 0.05 for distinct-directional and/or mixed-directional methods were determined statistically significant.

### *2.3. Metabolic Analysis*

For each cluster, consensus gene expression values were determined by taking the arithmetic mean of normalised expression counts across all samples within each cluster.

A reference GEM was created by modifying the gene associations of all reactions within the adipocyte-specific GEM *iAdipocytes1850* [47] to match those within the generic human GEM HMR3 [48]. The resulting GEM was designated *iBrain2845*. Cluster-specific GEMs were reconstructed using the RAVEN Toolbox (version 2.0, Chalmers University of Technology, Göteborg, Sweden) [49] tINIT algorithm [50,51], with *iBrain2845* as the reference GEM.

FBA was conducted on each cluster-specific GEM using the *solveLP* function from the RAVEN Toolbox with previously reported constraints [52] and defining ATP synthesis (*iBrain2845:* HMR\_6916) as the objective function. All constraints were applied with the exception of the following reaction IDs, which were excluded: EX\_ac[e] (*iBrain2845*: HMR\_9086) and EX\_etoh[e] (*iBrain2845*: HMR\_9099).

Reporter metabolite analysis was conducted using the *reporterMetabolites* function [53] from the RAVEN Toolbox, using *iBrain2845* as the reference model.

#### *2.4. Network Analysis*

To generate gene networks, we took normalised, non-imputed expression values from AD and PD samples. Control samples and samples from blood were excluded. One network was generated each for AD and PD. For the AD network, all male samples were included, and 171 female samples were chosen at random and included. For the PD network, all samples were included. Genes with any missing values were dropped. Genes with the 15% lowest expression or 15% lowest variance were disregarded from further analysis. Spearman correlations were calculated for each pair of genes, and the top 1% of significant correlations were used to generate gene co-expression networks. Random Erd˝os–Rényi models were created for the AD and PD networks, with the same numbers of nodes and edges to act as null networks, and compared against their respective networks in terms of centrality distributions. Community analyses were performed through the Leiden algorithm [54] by optimising CPMVertexPartition, after a resolution scan of 10,000 points between 10−<sup>3</sup> and 10. The scan showed global maxima at resolutions = 0.077526 and 0.089074 for AD and PD networks, respectively, which were used for optimisation. Enrichment analysis was performed on modules with >30 nodes using enrichr (https://maayanlab.cloud/Enrichr, accessed on 5 March 2021) [55,56] using GO Biological Process, KEGG, and Online Mendelian Inheritance in Man libraries and was explored using Revigo (http://revigo.irb.hr, accessed on 5 March 2021) [57].

### *2.5. Zebrafish Data Acquisition and Analysis*

The *tert* mutant zebrafish line (*tert*hu3430) was obtained from Miguel Godhino Ferreira [24]. Fish maintenance, RNA isolation, processing, and sequencing were conducted as described previously [58].

From *n* = 5 wild-type (*tert*+/+), *n* = 5 heterozygous mutant (*tert*+/−), and *n* = 3 homozygous mutant (*tert*−/−), expression values were determined from RNA sequencing reads using kallisto by aligning to the *Danio rerio* genome in Ensembl release 96 [39]. Expression values were generated for each extracted tissue as well as 'psuedo–whole animal', containing combined values across all tissues.

A reference zebrafish GEM was manually curated by modifying the existing *ZebraGEM2* model and was designated *ZebraGEM2.1*.

Differential expression analysis, gene set enrichment analysis, GEM reconstruction, FBA, and reporter metabolite analysis were conducted on *tert*−/<sup>−</sup> and *tert*+/<sup>−</sup> animals against a *tert*+/+ reference using DESeq2, piano, and RAVEN Toolbox 2.0 with default parameters. Reporter metabolite analysis was conducted with *ZebraGEM2.1* as the reference GEM.

FBA was attempted as described for the human GEMs with the exception that the following metabolic constraints were excluded: r1391, HMR\_0482 (*ZebraGEM2.1*: G3PDm), EX\_ile\_L[e] (*ZebraGEM2.1*: EX\_ile\_e), EX\_val\_L[e] (*ZebraGEM2.1*: EX\_val\_e), EX\_lys\_L[e] (*ZebraGEM2.1*: EX\_lys\_e), EX\_phe\_L[e] (*ZebraGEM2.1*: EX\_phe\_e), GLCt1r, EX\_thr\_L[e] (*ZebraGEM2.1*: EX\_thr\_e), EX\_met\_L[e] (*ZebraGEM2.1*: EX\_met\_\_L\_e), EX\_arg\_L[e] (*ZebraGEM2.1*: EX\_arg\_e), EX\_his\_L[e] (*ZebraGEM2.1*: EX\_his\_\_L\_e), EX\_leu\_L[e] (*ZebraGEM2.1*: EX\_leu\_e), and EX\_o2[e] (*ZebraGEM2.1*: EX\_o2\_e). The objective function was defined as ATP synthesis (*ZebraGEM2.1*: ATPS4m). FBA results for zebrafish are not presented.

#### *2.6. Data and Code Accessibility*

All original computer code, models, and author-curated data files have been released under a Creative Commons Attribution ShareAlike 4.0 International Licence (https:// creativecommons.org/licenses/by-sa/4.0/; accessed on 29 March 2021) and are freely available for download from <https://github.com/SimonLammmm/ad-pd-retinoid>; accessed on 29 March 2021.

Zebrafish *tert* mutant sequencing data have been deposited in the NCBI Gene Expression Omnibus (GEO) and are accessible through GEO Series accession numbers GSE102426, GSE102429, GSE102431, and GSE102434.

#### *2.7. Ethics Statement*

Zebrafish were housed in the fish facility of the Leibniz Institute on Aging—Fritz Lipmann Institute (FLI) under standard conditions and a 14 h light and 10 h dark cycle. All animal procedures were performed in accordance with the German animal welfare guidelines and approved by the Landesamt für Verbraucherschutz Thüringen (TLV), Germany.

#### **3. Results**

### *3.1. Stratification of Patients Revealed Three Distinct Disease Classes*

We retrieved gene expression and protein–protein interaction data from GTEx, FAN-TOM5, HuRI, HPA, and ROSMAP databases and integrated these data with the published datasets by Rajkumar and Zhang/Zheng. After performing quality control and normalisation (as outlined in the Materials and Methods), we included a total of 629 AD samples, 54 PD samples, and 889 control samples in the analysis (Table 1). To reveal transcriptomic differences between AD/PD samples compared to healthy controls, we identified differentially expressed genes (DEGs) and performed gene set enrichment (GSE) analyses. However, since AD and PD are complex diseases with no single cure, it is likely that multiple gene expression profiling exist, manifesting in numerous disease classes requiring distinct treatment strategies. We therefore used unsupervised clustering to elucidate these expression profiles and stratify the AD and PD patients on the basis of the underlying molecular mechanisms involved in the disease occurrence.


**Table 1.** Summary of expression data sources.

Expression data from AD and PD samples were obtained from the Genotype-Tissue Expression (GTEx) database, Functional Annotation of the Mammalian Genome 5 (FANTOM5) database, Human Protein Atlas (HPA), Religious Orders Study and Rush Memory Aging Project (ROSMAP), Rajkumar dataset, and Zhang/Zheng dataset.

Following unsupervised clustering with ConsensusClusterPlus [44], we separated AD and PD samples into three clusters (Figures 1B and S1). Clusters 1 and 2 contained samples from Zhang/Zheng and Rajkumar datasets, respectively, in addition to samples in the ROSMAP dataset, and consisted of 127 and 186 samples from female donors, respectively, and 73 and 95 samples from male donors, respectively. Cluster 2 also contained 14 samples with sex not recorded. Cluster 3 contained only ROSMAP samples and consisted of 114 female and 74 male samples. Clusters did not form firmly along lines of sex, age, or brain tissues or brain subregion (Figure S2). Samples from non-diseased individuals were artificially added as a fourth, control cluster, consisting of 495 female samples, 262 male samples, 13 samples with sex not recorded, and 119 samples derived from aggregate sources.

By differential expression analysis using DESeq2 [45], we then characterised the distinct transcriptomic profiles within our disease clusters (Figure 2A). Cluster 1 showed mixed up- and downregulation of genes compared to control, whereas cluster 2 showed more downregulation and cluster 3 showed vast downregulation of genes compared to control.

To infer the functional differences between the subclasses, we performed GSE analysis using piano [46] (Figure 2B, Supplementary Data S1). Globally, DEGs in any cluster 1–3 were enriched in upregulated Gene Ontology (GO) terms for immune response, olfaction, retinoid function, and apoptosis, but downregulated for copper ion transport and telomere organisation, compared to the control cluster. Considering individual clusters, cluster 1 DEGs were enriched in upregulated GO terms associated with immune signalling, cell signalling, and visual perception. We also found downregulation of GO terms associated with olfactory signalling and cytoskeleton. DEGs in cluster 2 were found to be enriched in downregulated GO terms associated with the cytoskeleton, organ development, cell differentiation, retinoid metabolism and response, DNA damage repair, inflammatory response, telomere maintenance, unfolded protein response, and acetylcholine biosynthesis and binding. On the other hand, we did not find any significantly enriched upregulated GO terms. In cluster 3, we found that DEGs were enriched in upregulated GO terms associated with neuron function, olfaction, cell motility, and immune system. DEGs in cluster 3 were found to be enriched in downregulated GO terms associated with DNA damage response, ageing, and retinoid metabolism and response.

The difference in expression profiles illustrate highly heterogeneous transcriptomics in AD and PD and that there are notable commonalities and differences between the subclasses of AD or PD samples. Interestingly, we found retinoid metabolism or function to be a common altered GO term in all subclasses. This was upregulated in cluster 1 but downregulated in clusters 2 and 3. We therefore observed that retinoid dysregulation appears to be a common ageing-related hallmark in AD and PD.

**Figure 2.** *Cont*.

**Figure 2.** Transcriptomic and functional characterisation of AD and PD subclasses. Differentially expressed gene (DEG) analysis and gene set enrichment (GSE) analysis were performed for AD and PD and control samples for each disease cluster, using the control cluster as reference. (**A**) DEG results. Significant DEGs were determined as those with a Benjamini–Hochberg adjusted *p*-value at or below a cut-off of 1 <sup>×</sup> <sup>10</sup>−10. Upregulated significant DEGs are coloured red. Downregulated significant DEGs are coloured blue. Non-significant DEGs are coloured grey. (**B**) Selected significantly enriched GO terms by number of genes as determined by GSE analysis. Red bars indicate upregulated GO terms. Blue bars indicate downregulated GO terms. For full data, refer to Supplementary Data S1.

### *3.2. Metabolic Analysis Revealed Retinoids and Sex Hormones as Significantly Dysregulated in AD and PD*

On the basis of clustering and GSE analysis, we identified distinct expression profiles, but these alone could not offer insights into metabolic activities of brain in AD and PD. To determine metabolic changes in the clusters compared to controls, we performed constraintbased genome-scale metabolic modelling. We reconstructed a brain-specific genomescale metabolic model (GEM) based on the well-studied HMR2.0 [47] reference GEM by overlaying transcriptomic data from each cluster and applying brain-specific constraints as described previously [52] using the tINIT algorithm [50,51] within the RAVEN Toolbox 2.0 [49]. We generated a brain-specific GEM (*iBrain2845*) (Supplementary File S1) and used it as the reference GEM for reconstruction of cluster-specific GEMs in turn. We constructed the resulting context-specific *iADPD* series GEMs *iADPD1*, *iADPD2*, *iADPD3*, and *iADPDControl*, corresponding to cluster 1, cluster 2, cluster 3, and the control cluster, respectively (Supplementary File S2).

We conducted flux balance analysis (FBA) by defining maximisation of ATP synthesis as the objective function. *iADPD1* and *iADPD2* both showed upregulation of fluxes in reactions involved in cholesterol biosynthesis and downregulation in O-glycan metabolism, with reaction flux changes being more pronounced in *iADPD2* than in *iADPD1* (Table 2, Supplementary Data S2). We found that the fluxes in *iADPD1* were uniquely upregulated in oestrogen metabolism and the Kandustch–Russell pathway. *iADPD2* was uniquely upregulated in cholesterol metabolism, whereas *iADPD3* uniquely displayed roughly equal parts upregulation and downregulation in several pathways, including aminoacyltRNA biosynthesis; androgen metabolism; arginine and proline metabolism; cholesterol biosynthesis; galactose metabolism; glycine, serine, and threonine metabolism; and Nglycan metabolism.


**Table 2.** Flux balance analysis of *iADPD1*, *iADPD2*, and *iADPD3* versus *iADPDControl*.


**Table 2.** *Cont.*

Flux balance analysis was performed for each *iADPD*-series GEM, and the predicted fluxes for the three disease cluster GEMs were compared against the predicted fluxes for the control cluster GEM. Reactions are grouped by subsystem and flux difference values are expressed as mean flux difference between disease clusters and the control cluster across all changed reactions within a subsystem. For full results, refer to Supplementary Data S2.

In particular, we observed increased positive fluxes through reactions HMR\_2055 and HMR\_2059 in *iADPD1*, which convert oestrone to 2-hydroxyoestrone and then to 2 methoxyoestrone (Figure 3). In *iADPDControl*, these reactions carried zero flux. In *iADPD2*, we observed increased positive fluxes through HMR\_1457 and HMR\_1533, which produce geranyl pyrophosphate and lathosterol, respectively. Both of these molecules are precursors to cholesterol, and while we did not see a proportionate increase in the production of other molecules along the pathway (namely, farnesyl pyrophosphate and squalene), we did observe a general increase in fluxes through the androgen biosynthesis and metabolism pathway. Finally, we observed that *iADPD3* displayed a decreased production of testosterone from 4-androstene-3,17-dione via HMR\_1974, despite an increase in production of 4-androstene-3,17-dione via HMR\_1971.

**Figure 3.** Metabolic characterisation of AD and PD subclasses. Flux balance analysis (FBA) was performed on *iADPD1-3* genome-scale metabolic models (GEMs), and flux values were compared with those of *iADPDControl*. Key metabolites and reactions within the androgen metabolism pathway are shown and key dysregulations are displayed as coloured arrows: red indicates increased flux compared to *iADPDControl*; blue indicates decreased flux compared to *iADPDControl*. Dysregulations associated to each GEM are shown in coloured boxes. The dashed line indicates multiple reactions are involved. Human Metabolic Reactions (HMR) identifiers are shown for androgen metabolism reactions with dysregulated fluxes. For full data, refer to Supplementary Data S2.

> Taken together, the obtained results indicate the existence of three distinct metabolic dysregulation profiles in AD and PD, with dysregulation being most pronounced in cluster 2 patients and least pronounced in cluster 3 patients. Furthermore, we found that all three clusters show dysregulations in or around sex hormone biosynthesis and metabolism, which might explain the heterogeneity in responses to sex hormone replacement therapy in AD and PD patients as extensively reported previously [6,59–61]. We also confirmed that dysregulations through sex hormone pathways in the *iADPD* series GEMs were not due to differences in relative frequencies between sexes in the main clusters 1-3 (Fisher's exact test, *p* = 0.4700).

> In addition to metabolic inference and FBA, we performed reporter metabolite analysis [53] by overlaying DEG analysis results onto the reference GEM to identify hotspots of metabolism (Table 3, Supplementary Data S3). In short, we uniquely identified oestrone as a reporter metabolite in cluster 1, and lipids such as acylglycerol and dolichol in cluster 2. No notable reporter metabolites were identified as significantly changed in cluster 3 only. In common to all clusters 1–3, retinoids, and sex hormones such as androsterone

and pregnanediol were identified as significantly changed reporter metabolites, which are generally in line with GSE and FBA results.


**Table 3.** Reporter metabolite analysis of AD and PD subclasses.

Reporter metabolite analysis was performed for each AD/PD subclass by overlaying differential expression results onto *iBrain2845*. Top 10 unique reporter metabolites by *p*-value for each cluster compared to the control cluster are shown. For full results, refer to Supplementary Data S3.

#### *3.3. Network Analysis Supported Retinoid and Androgen Dysregulation and Suggests Transcriptomic Similarity between AD and PD*

To further explore the gene expression patterns shown across AD and PD patients, we took expression data and constructed a weighted gene co-expression network for each group (Spearman ρ > 0.9, FDR < 10−9; see the Materials and Methods section). Each network was compared against equivalent randomly generated networks acting as null models. After quality control, the AD network contained 4861 nodes (genes) and ≈397,000 edges (significant correlations), and the PD network contained 5857 nodes and ≈394,000 edges (Figure 4A,B, Table 4). A community analysis to identify modules of highly co-expressed genes [54] highlighted 9 and 15 communities with significant functional enrichment in AD and PD, respectively.

In the AD network, gene module C3 was enriched for genes involved with neuron and synapse development, similar to patient cluster 3; C4 for genes involved with mRNA splicing, similar to patient cluster 2; and C5 for genes involved with the mitochondrial electron transport chain (Figure 4C, Supplementary Data S4). C1 and C2 were the gene modules with the largest number of genes. C1 was enriched for gene expression quality control genes and development and morphogenesis genes, mirroring patient cluster 2, whereas C2 contained cytoskeleton-related genes, similar to patient cluster 1.

In the PD network, C1 was enriched for genes involved with retinoid metabolism, glucuronidation, and cytokine signalling. Since androgens are major targets of glucuronidation [62], these results are in line with our main findings. Further, C2 contained DNA damage response and gene regulation genes, similar to patient cluster 2; C3 contained nuclear protein regulation genes; and C4 contained mRNA splicing genes, again similar to patient cluster 2.

Further, the two networks share a large number of enriched terms in common, and there is high similarity between the major gene modules, highlighting the similarity between AD and PD. In addition to this, enrichment analysis for KEGG terms was unable to assign "Alzheimer disease" and "Parkinson disease" to the correct gene modules from the respective networks, and additional neurological disease terms such as "Huntington disease" and "amyotrophic lateral sclerosis" were also identified by the analysis, further suggesting the transcriptomic similarity between neurological diseases. We found that AD C1 and PD C2 were frequently annotated with these disease terms, and these gene modules are also highly similar. Therefore, this gene module could constitute a core set of dysregulated genes in neurodegeneration.

Taken together, the network analysis supports our GSE findings. The functional consequences of differential expression in the patient clusters could be explained by differential modulation of gene modules identified in our network analysis together with dysregulation of a core set of genes implicated in both AD and PD.

**Figure 4.** *Cont*.

**Figure 4.** Network analysis of AD and PD gene co-expression modules. (**A**) Gene co-expression networks were constructed from transcriptomic data from AD and PD samples. Community analysis was used to identify gene modules (see the Materials and Methods section). Modules with at least 30 genes are shown as nodes. Node size indicates number of genes. Nodes are coloured by network of origin and numbered in descending order of module size. Shared genes between modules are shown as edges. Edge weight indicates number of shared genes. (**B**) Degree distribution of AD, PD, and random networks. (**C**) Enrichment analysis was performed on gene modules containing at least 30 genes using the KEGG database (see the Materials and Methods section). Significantly enriched gene modules are shown as coloured, numbered blocks. Colour and number keys are as in (**A**).



Gene co-expression networks were generated for AD and PD samples. AD, PD, and random networks are shown.

#### *3.4. Zebrafish Transcriptomic and Metabolic Investigations Suggest an Association between Brain Ageing and Retinoid Dysregulation*

To further validate our findings regarding the differences between clusters of human AD and PD samples, we analysed transcriptomic data from *tert* mutant zebrafish and reconstructed tissue-specific GEMs (Figure 5A). To ascertain that these effects of ageing were limited to the brain, we analysed the brain, liver, muscle, and skin of zebrafish as well as the whole animal.

We first repeated DEG and GSE analyses in the *tert* mutants using brain transcriptomic data. We found significant enrichment of GO terms associated with retinoid metabolism as well as eye development and light sensing, in which retinoids act as signalling molecules [63] (Figures 5B and S3, Supplementary Data S5). To further support our findings, we then reconstructed mutant- and genotype-specific GEMs by overlaying zebrafish *tert* mutant transcriptomic data onto a modified generic *ZebraGEM2* GEM [64]. We designated the modified GEM *ZebraGEM2.1* (Supplementary File S3) and used it as the reference GEM. We also generated zebrafish organ-specific GEMs and provided them to the interested reader (Supplementary File S4).

We then repeated reporter metabolite analysis using the transcriptomic data from zebrafish tissue-specific GEMs and found that retinoids were identified as significant reporter metabolites in *tert*+/<sup>−</sup> zebrafish (*p* = 0.045) but not in *tert*−/−, where evidence was marginal (*p* = 0.084) (Figure 5C, Table 5, Supplementary Data S6). We also observed this result in the skin of *tert*-/- mutants, where evidence was significant (*p* = 0.017). This result can be explained due to the susceptibility of skin as an organ to photoageing, for which topical application of retinol is a widely used treatment [65]. However, we did not find evidence for significant changes in pregnanediol, and androsterone was significant only in the skin of *tert*−/<sup>−</sup> zebrafish (*p* = 0.017). This would suggest that either change in sex hormones are not ageing-related with regards AD and PD, or the changes were outside the scope of the zebrafish model that we used.

Taken together, these results indicated that ageing can largely explain alterations in retinoid metabolism in the brain but not alterations in sex hormone metabolism. These results also suggest that ageing has a differential effect on different organs, implying that metabolic changes due to ageing in the brain are associated with neurological disorders.

**Figure 5.** Summary of zebrafish *tert* mutant analysis. (**A**) Workflow for the analysis of zebrafish *tert* mutants. (**B**) Differentially expressed gene (DEG) (left panels) and gene set enrichment (GSE) analysis (right panels) of zebrafish brain samples. DEG and GSE analyses were performed on zebrafish *tert* mutant brain expression data for *tert*−/<sup>−</sup> (upper panels) and *tert*+/<sup>−</sup> (lower panels), using *tert*+/+ as a reference. Methods and colour keys are as in Figure 2. For muscle, liver, skin, and pseudo-whole animal analyses, refer to Figure S3. For full data, refer to Supplementary Data S5. (**C**) Reporter metabolite analysis of zebrafish samples. DEG data were overlaid on *ZebraGEM2.1* to determine reporter metabolites. Shown are reporter metabolites with *p* < 0.1 within the retinoic acid metabolic pathway. Red numbers indicate *p*-values in *tert*−/<sup>−</sup> compared to *tert*+/+. Blue numbers indicate *p*-values in *tert*+/<sup>−</sup> compared to *tert*+/+. Green numbers indicate *p*-values in *tert*−/<sup>−</sup> compared to *tert*+/−. Tissues are indicated with icons. For full data, refer to Supplementary Data S6.


**Table 5.** Reporter metabolite analysis of zebrafish *tert* mutants.

Reporter metabolite analysis was performed for the brains of zebrafish *tert* mutant by overlaying differential expression results onto *ZebraGEM2.1*. Top 20 unique reporter metabolites by *p*-value for each cluster compared to wild-type *tert*+/+ zebrafish are shown. For full results, refer to Supplementary Data S6.

#### **4. Discussion**

In this work, we integrated gene expression data across diverse sources into contextspecific GEMs and sought to identify and characterise disease subclasses of AD and PD. We used unsupervised clustering to identify AD/PD subclasses and employed DEG and GSE analysis to functionally characterise them. We used network exploration, constraint-based metabolic modelling, and reporter metabolite analysis to characterise flux and metabolic perturbations within basal metabolic functions and pathways. We then leveraged expression data from zebrafish ageing mutants to validate our findings that these perturbations might be explained by ageing. Our analysis concluded with the identification and characterisation of three AD/PD subclasses, each with distinct functional characteristics and

metabolic profiles. All three subclasses showed depletion of retinoids by an ageing-related mechanism as a common characteristic.

We believe that a combined analysis that integrates AD and PD data is necessary to elucidate common attributes between the two diseases. However, we realised that such an analysis will likely obscure AD- and PD-specific factors, such as amyloid β and α-synuclein, but should aid the discovery of any factors in common. Since AD and PD share numerous risk factors and comorbidities such as old age, diabetes, and cancer risk, we believe that an AD/PD combined analysis can identify factors in common to both diseases and prove valuable for the identification of treatment strategies that might be effective in the treatment of both diseases.

GSE analysis highlighted significant changes related to retinoid function or visual system function, in which retinol and retinal act as signalling molecules [63], in all clusters (Figure 2, Supplementary Data S1). Together with the identification of multiple retinol derivatives as significant reporter metabolites in *iBrain2845* (Table 3, Supplementary Data S3), we hypothesised that retinoids are a commonly dysregulated class of molecules in both AD and PD, and that this may be due to an ageing mechanism. Indeed, in our investigation with zebrafish telomerase mutants, we again found alterations in retinoid and visual system function in GSE analysis (Figures 5B and S3, Supplementary Data S5) and reporter metabolite analysis (Figure 5C, Table 5, Supplementary Data S6).

Retinoids were identified as a reporter metabolite in all three clusters of patients in this study. Further, our zebrafish analysis highlighted the importance of retinoids in ageing of the brain and the skin (Figure 5C, Table 5, Supplementary Data S6). Retinol, its derivatives, and its analogues are already used as topical anti-ageing therapies for aged skin [65], and there is a growing body of evidence suggesting its efficacy for the treatment of AD [66–69]. We add to the body of evidence with this in silico investigation involving zebrafish telomerase mutants, suggesting that the source of retinoid depletion in AD and PD is ageing-related. Interestingly, regarding our finding for skin ageing in zebrafish, lipid biomarkers have been proposed in a recent skin sebum metabolomics study in PD patients [70]. This could be interpreted as co-ageing in brain and skin tissues, possibly allowing for cheap, non-invasive prognostic testing for PD.

In addition to retinoids, we found evidence for subclass-specific dysregulation within the androgen metabolism pathway in each of the three clusters in FBA (Table 2, Supplementary Data S2) and reporter metabolite analysis (Table 3, Supplementary Data S3). We found that *iADPD1* displayed increased oestrone conversion to the less potent [71] 2-methoxyoestrone, *iADPD2* displayed increased production of the cholesterol precursor molecules geranyl pyrophosphate and lathosterol and increased androgen biosynthesis, and *iADPD3* displayed decreased conversion of 4-androstene-3,17-dione to testosterone. However, there was no definitive evidence to suggest an ageing-related basis for these observations on the basis of our zebrafish study, but this may be due to the diverse functional roles that sex hormones have, limitations within the *ZebraGEM2.1* model, or absence of an actual biological link between sex hormones and ageing of the brain. Despite this, given the widely reported variability in responses to sex hormone replacement therapy in AD and PD [6,8,60,61], we believe that this observation represents a possible explanation for the heterogeneity. Our observation regarding the dysregulation of the androgen pathway at three separate points suggests that dysregulation at other points might also be linked to AD and PD, thus implying that androgen metabolism dysregulation in general might be important for the development of AD and PD. Our finding via network community analysis of a gene module associated with glucuronidation activity points to a possible therapeutic strategy to combat androgen dysregulation. In our study, the limitation of our dataset that some samples were aggregate samples or did not record the donor's sex meant that we were unable to assess in detail the sex-dependency of our results. However, as has been extensively studied in the AD model mouse [72–74], this remains an important question, and more work is needed to elucidate the importance of sex hormones and glucuronidation regarding AD and PD.

Identification of subclasses is desirable to address the heterogeneity in disease with regards transcriptomic profile and treatment response, but patients must be stratified in order to be diagnosed with the correct disease subclass and therefore administer the appropriate treatment. To this end, we used GSE analysis to functionally characterise the AD/PD subclasses (Figure 2, Supplementary Data S1). Cluster 2, which was associated with a decreased immune and stress response, appeared to be most severe disease subclass, whereas cluster 3, which was associated with an increased sensory perception of smell, reduced haemostasis, and reduced immune and DNA damage response, seemed to be the least severe. Meanwhile, cluster 1 was associated with an increased immune and inflammatory responses and reduced sensory perception of smell. The functional terms are supported by community analysis of our AD and PD gene co-expression networks, which identified gene modules that roughly align with the GSE results (Figure 4, Supplementary Data S4). The proposed severity ratings are supported by FBA findings, which show *iADPD2* as having the highest total flux dysregulation compared to control, and *iADPD3* as having the least (Table 2, Supplementary Data S2). Although we did not attempt to characterise for stratifying and diagnosing patients in our study, our findings clearly show that such stratification is possible.

In this work, we leveraged samples from zebrafish telomerase mutants and insights from the *ZebraGEM2.1* metabolic model. These models were utilised in order to test the hypothesis that the observations we made in the human subjects could be explained by telomeric ageing in a way where we could control the 'dosage' of ageing, i.e., *tert+/*<sup>−</sup> and *tert*−*/*<sup>−</sup> mutants. However, it is important to acknowledge the limitations of such models. Although zebrafish are a widely used model organism to study vertebrate ageing [22], most neurons of the brain do not divide and are therefore not likely to be subject to the direct effects of telomeric ageing. Therefore, we cannot conclude that the dysregulations we observed in AD and PD were caused by ageing of neurons per se, but rather correlations exist between telomeric ageing in zebrafish and AD and PD in humans, and these correlations may act via an indirect mechanism affecting multiple systems of the ageing organism. Due to these limitations, more data and more studies are required to support the link between ageing and neuronal degeneration in AD and PD. The basic requirement of such studies would be brain tissue from donors of a wide range of ages. In our study, we utilised datasets containing aggregated samples, where age cannot be assigned, and samples from age-matched donors, meaning that younger donors were poorly represented in our data, making it unsuitable for an ageing analysis.

In conclusion, we report three distinct subclasses of AD and PD. The first subclass was identified as being associated with increased immune response, inflammatory response, and reduced sensory perception of smell, according to GSE results. We observed that this subclass exhibited increased oestradiol turnover, according to FBA results. The second subclass was linked with increased cholesterol biosynthesis and general increased flux through the androgen biosynthesis and metabolism pathway. This subclass was characterised by reduced immune response. The third subclass was characterised by enrichment of GO terms indicating increased sensory perception of smell, reduced haemostasis, and reduced immune and DNA damage response. This subclass also exhibited reduced testosterone biosynthesis from androstenedione, as determined by FBA. All subclasses exhibited dysregulation within the retinoid metabolism pathway. For all subclasses of AD and PD, more investigation is required to verify the effectiveness of these stratification methods and to aid prediction of effective precision therapies. To our knowledge, this is the first meta-analysis at this scale highlighting the potential significance of retinoids, oestradiol, and testosterone in AD and PD by studying the two diseases in combination. We observed that the existence of disease subclasses demands precision or personalised medicine and explains the heterogeneity in AD and PD response to single-factor treatments.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/article/ 10.3390/biomedicines9101310/s1, Figure S1: Unsupervised clustering of AD and PD samples, Figure S2: Visualisation of AD and PD samples, Figure S3: Transcriptomic and functional characterisation of zebrafish *tert* mutants, Supplementary File S1: *iBrain2845* genome-scale metabolic model, Supplementary File S2: *iADPD*-series context-specific genome-scale metabolic models, Supplementary File S3: *ZebraGEM2.1* genome-scale metabolic model, Supplementary File S4: Zebrafish context-specific genome-scale metabolic models, Supplementary Data S1: Gene set enrichment analysis results for AD and PD subclasses, Supplementary Data S2: Flux balance analysis results for *iADPD*-series genome-scale metabolic models, Supplementary Data SS3: Reporter metabolite analysis results for AD and PD subclasses, Supplementary Data S4: Network analysis results for AD and PD samples, Supplementary Data S5: Gene set enrichment analysis results for zebrafish *tert* mutants, Supplementary Data S6: Reporter metabolite analysis results for zebrafish *tert* mutants.

**Author Contributions:** A.M. and M.U. supervised the study and designed the study on human data. R.K. and C.E. designed the study on zebrafish. N.H. performed the zebrafish RNA sequencing and generated the raw counts. With the exception of the network analysis, S.L. performed all in silico analysis and M.A., R.B. and C.Z. provided technical consultation. R.B. performed the network analysis. S.L. analysed all the results. S.L. wrote the manuscript with input from H.T. and the rest of the authors. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by the German Ministry for Education and Research within the framework of the GerontoSys initiative (research core JenAge, funding code BMBF 0315581) to C.E.; and the Knut and Alice Wallenberg Foundation (grant number 2017.0303) to A.M.

**Institutional Review Board Statement:** All animal procedures were performed in accordance with the German animal welfare guidelines and approved by the Landesamt für Verbraucherschutz Thüringen (TLV), Germany.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** All original computer code, models, and author-curated data files have been released under a Creative Commons Attribution ShareAlike 4.0 International Licence (https://creativecommons.org/licenses/by-sa/4.0/) and are freely available for download from <https://github.com/SimonLammmm/ad-pd-retinoid>. Zebrafish tert mutant sequencing data have been deposited in the NCBI Gene Expression Omnibus (GEO) and are accessible through GEO Series accession numbers GSE102426, GSE102429, GSE102431, and GSE102434.

**Acknowledgments:** We are grateful to Catarina Henriques and Miguel Godinho Ferreira for sharing the *tert* mutant zebrafish line. We also thank Ivonne Heinze, Ivonne Görlich, and Marco Groth from the FLI sequencing facility for sequencing the zebrafish samples. The authors acknowledge use of the research computing facility at King's College London, *Rosalind* (https://rosalind.kcl.ac.uk; accessed on 17 July 2019). The Genotype-Tissue Expression (GTEx) Project was supported by the Common Fund of the Office of the Director of the National Institutes of Health, and by NCI, NHGRI, NHLBI, NIDA, NIMH, and NINDS. The data used for the analyses described in this manuscript were obtained from the GTEx Portal on 6 December 2019. The results published here are in part based on data obtained from the AD Knowledge Portal (https://adknowledgeportal.synapse.org; accessed on 1 October 2020). Study data were provided by the Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago. Data collection was supported through funding by NIA grants P30AG10161 (ROS), R01AG15819 (ROSMAP; genomics and RNAseq), R01AG17917 (MAP), R01AG30146, R01AG36042 (5hC methylation, ATACseq), RC2AG036547 (H3K9Ac), R01AG36836 (RNAseq), R01AG48015 (monocyte RNAseq), RF1AG57473 (single nucleus RNAseq), U01AG32984 (genomic and whole exome sequencing), U01AG46152 (ROSMAP AMP-AD, targeted proteomics), U01AG46161(TMT proteomics), U01AG61356 (whole genome sequencing, targeted proteomics, ROSMAP AMP-AD), the Illinois Department of Public Health (ROSMAP), and the Translational Genomics Research Institute (genomic). Additional phenotypic data can be requested at www.radc. rush.edu; accessed on 1 October 2020). We thank the patients and their families for their selfless donation to further understanding Alzheimer's disease. This project was supported by funding from the National Institute on Aging (AG034504 and AG041232). Many data and biomaterials were collected from several National Institute on Aging (NIA) and National Alzheimer's Coordinating Center (NACC, grant #U01 AG016976) funded sites. Amanda J. Myers, PhD (University of Miami, Department of Psychiatry), prepared the series. The directors, pathologist and technicians involved

include Rush University Medical Center; Rush Alzheimer's Disease Center (NIH #AG10161); David A. Bennett, M.D.; Julie A. Schneider, MD, MS; Karen Skish, MS, PA (ASCP) MT; Wayne T Longman. The Rush portion of this study was supported by National Institutes of Health grants P30AG10161, R01AG15819, R01AG17917, R01AG36042, R01AG36836, U01AG46152, R01AG34374, R01NS78009, U18NS82140, R01AG42210, and R01AG39478, and the Illinois Department of Public Health. Quality control checks and preparation of the gene expression data was provided by the National Institute on Aging Alzheimer's Disease Data Storage Site (NIAGADS, U24AG041689) at the University of Pennsylvania.

**Conflicts of Interest:** The authors declare no competing financial interests.

### **Abbreviations**

AD, Alzheimer's disease; DEG, differentially expressed gene; FANTOM5, Functional Annotation of the Mammalian Genome 5; FBA, flux balance analysis; GEM, genome-scale metabolic model; GO, Gene Ontology; GSE, gene set enrichment; GTEx, Genotype-Tissue Expression; HuRI, Human Reference Protein Interactome; HPA, Human Protein Atlas; PD, Parkinson's disease; ROSMAP, Religious Orders Study and Rush Memory Aging Project.

### **References**

