#### **1. Introduction**

Kidney transplantation (KTx) is the preferred method of treatment for end-stage kidney failure [1]. Increasing the longevity of transplanted kidneys is critical because of the shortage of available kidneys and kidney donors [2]. While improved short-term survival of the transplanted kidney has been attributed to better immunosuppressive drugs and sophistication in organ procurement and surgical methods [3], long-term survival outcomes have remained limited and largely unchanged [4]. Currently used methods of KTx monitoring, such as patient serum creatinine and proteinuria, are neither sufficiently sensitive nor specific to detect early-stage injury and only detect advanced and often irreversible tissue injury [5]. Additionally, kidney biopsies cannot easily be used to predict injury [6,7]. Over recent years, high-throughput technologies have been applied in a more discovery-based approach to finding correlative biomarkers of graft injury, using sequencing [8], gene expression, proteomic [9–14], and metabolomic methods [15,16]. Many of these approaches show background signals from other clinical confounders, such as immunosuppression exposure [17], and thus require more customized and robust analytical techniques to improve the diagnostic accuracy of biomarkers in blood and urine for different transplant (Tx) injury phenotypes [18–20].

In this study, we hypothesized that the recipient's immune response towards the graft induces immunological and downstream metabolic changes at the time of specific injuries, such as acute rejection (AR), which result in perturbations in specific urine metabolite concentrations. We also hypothesized that specific metabolic pathways are injury-specific, such that a panel of metabolites can be used as a surrogate biomarker to monitor KTx injuries. In this report, we present our findings from a comprehensive targeted metabolomics analysis of urine collected from pediatric KTx patients. These samples were biopsy-matched, providing an accurate phenotype characterization and enabling exploration of metabolic pathways associated with KTx dysfunction.

#### **2. Experimental Section**

#### *2.1. Patients and Samples*

Biobanked urine samples available in the Sarwal lab from previously funded studies were screened for matching biopsy data on the day of urine collection. Out of a total of 2016 biobanked urine samples collected between 2006 and 2009, 770 were biopsy-matched, of which 326 unique and clinically annotated urine samples were included in the first part of this study. These patients were on calcineurin inhibitor (CNI) based immunosuppression (IS). All urine samples were stored at −80 °C using urine processing techniques, procedures, and conditions under which we have previously shown negligible degradation of urine components [11].

All samples from our biobank were matched with transplant biopsies; all biopsies were read by a central pathologist and scored by the Banff and Chronic Allograft Damage Index (CADI) [21–23] as acute cellular or humoral rejection with clinical graft dysfunction, and tubulitis and/or vasculitis on histology (AR; *n* = 106) [24], stable with no histological or clinical graft injury (stable graft function (STA); *n* = 111), interstitial fibrosis and tubular atrophy (IFTA; *n* = 71) [25], and BK viral nephritis with SV40 staining on histology, with/without clinical graft dysfunction (BK viral nephritis (BKVN); *n* = 22). Intragraft C4d stains were performed to assess for antibody-mediated rejection (ABMR). AR was defined, at minimum, by the following criteria: (i) T-cell-mediated rejection (TCMR) consisting of either a tubulitis (t) score > 2 accompanied by an interstitial inflammation (i) score > 2 or a vascular changes (v) score > 0; (ii) C4d-positive ABMR consisting of positive donor-specific antibodies (DSAs) with a glomerulitis (g) score > 0 or peritubular capillaritis (ptc) score > 0 or v > 0 with unexplained acute tubular necrosis/thrombotic microangiopathy (ATN/TMA) with C4d = 2; or (iii) C4d-negative ABMR consisting of positive DSAs with unexplained ATN/TMA with g + ptc ≥ 2 and C4d = 0 or 1. Stable allografts were defined by an absence of substantial injury on the matched biopsy pathology, based on the definitions of the inflammation (i) score and the tubulitis (t) score. IFTA was defined using standard pathology definitions as described by the Banff schema on the paired biopsies from each individual urine sample.
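
The minimum AR criteria above can be written as a small decision function. The following Python sketch is illustrative only: the `BiopsyScores` container and all function names are hypothetical, and the grouping of the conditions in criterion (ii) is one possible reading of the text (DSA-positive, C4d = 2, unexplained ATN/TMA, and at least one of g > 0, ptc > 0, or v > 0). It is not the scoring procedure used by the central pathologist.

```python
from dataclasses import dataclass


@dataclass
class BiopsyScores:
    """Hypothetical container for the lesion scores named in the text."""
    t: int = 0                         # tubulitis
    i: int = 0                         # interstitial inflammation
    v: int = 0                         # vascular changes
    g: int = 0                         # glomerulitis
    ptc: int = 0                       # peritubular capillaritis
    c4d: int = 0                       # intragraft C4d stain score
    dsa_positive: bool = False         # donor-specific antibodies detected
    unexplained_atn_tma: bool = False  # unexplained ATN/TMA on histology


def is_tcmr(s: BiopsyScores) -> bool:
    # Criterion (i): (t > 2 together with i > 2) or v > 0.
    return (s.t > 2 and s.i > 2) or s.v > 0


def is_c4d_positive_abmr(s: BiopsyScores) -> bool:
    # Criterion (ii), under the ASSUMED grouping described in the lead-in.
    microvascular_or_vascular = s.g > 0 or s.ptc > 0 or s.v > 0
    return (s.dsa_positive and microvascular_or_vascular
            and s.unexplained_atn_tma and s.c4d == 2)


def is_c4d_negative_abmr(s: BiopsyScores) -> bool:
    # Criterion (iii): DSA-positive, unexplained ATN/TMA,
    # g + ptc >= 2, and C4d of 0 or 1.
    return (s.dsa_positive and s.unexplained_atn_tma
            and s.g + s.ptc >= 2 and s.c4d in (0, 1))


def is_acute_rejection(s: BiopsyScores) -> bool:
    # AR is met if any one of the three minimum criteria holds.
    return is_tcmr(s) or is_c4d_positive_abmr(s) or is_c4d_negative_abmr(s)
```

For example, `is_acute_rejection(BiopsyScores(t=3, i=3))` is true under criterion (i), whereas a biopsy with all scores at zero is not classified as AR.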

This study was conducted in accordance with the relevant guidelines and regulations as approved by the University of California San Francisco (UCSF) Human Research Protection Program Institutional Review Board (IRB) under IRB #14-13573. All patients provided written informed consent. In cases of pediatric and young adult patients, written informed consent was obtained from a parent and/or legal guardian to participate in the research, in full adherence to the Declaration of Helsinki. The clinical and research activities being reported are consistent with the Principles of the Declaration of Istanbul as outlined in the 'Declaration of Istanbul on Organ Trafficking and Transplantation Tourism'. As such, no organs or tissues were procured from prisoners. All organs or tissues were procured from the Departments of Surgery at either UCSF or Stanford University.

#### *2.2. Urine Collection, Initial Processing, Storage, and GC/MS-TOF Analysis*

Second morning void mid-stream urine (50–100 mL) was collected in sterile containers and centrifuged at 2000× *g* for 20 min at room temperature within 1 h of collection. Specifically, the urine specimens were collected in sterile, leak-resistant polypropylene collection tubes with a sterility seal. Processing of the urine was done all in one bath with sterile polypropylene plastic tubes. The supernatant was separated from the pellet containing any particulate matter, including cells and cell debris. The pH of the supernatant was adjusted to 7.0 with Tris-HCl, and the supernatant was stored at −80 °C in polypropylene plastic tubes until further analysis. The identification of metabolites followed the well-established FiehnLib protocol [26]. In brief, all metabolite reference standards underwent a two-step derivatization procedure following the previously published protocol [27]. The derivatization procedure for urine metabolites has also been described previously [27]. Briefly, neat urine samples were lyophilized without further pretreatment, following our initial finding of severe alterations with urease treatment. To the dried samples, 20 μL of 40 mg/mL methoxylamine hydrochloride in pyridine was added, and samples were agitated at 30 °C for 30 min. Subsequently, 180 μL of the trimethylsilylating agent *N*-methyl-*N*-trimethylsilyltrifluoroacetamide (MSTFA) was added, and samples were agitated at 37 °C for 30 min. GC–MS analysis was performed using an Agilent 6890 N gas chromatograph (Atlanta, GA, USA) interfaced to a time-of-flight (TOF) Pegasus III mass spectrometer (Leco, St. Joseph, MI, USA) [27]. Automated injections were performed with a programmable robotic Gerstel MPS2 multipurpose sampler (Mülheim an der Ruhr, Germany). The GC was fitted with both an Agilent injector and a Gerstel temperature-programmed injector, cooled injection system (model CIS 4), with a Peltier cooling source. An automated liner exchange (ALEX) designed by Gerstel was used to eliminate cross-contamination from sample matrix between sample runs. Multiple baffled liners for the GC inlet were deactivated with 1 μL injections of MSTFA. The Agilent injector temperature was held constant at 250 °C, while the Gerstel injector was programmed (initial temperature 50 °C, hold 0.1 min, then increased at a rate of 10 °C/s to a final temperature of 330 °C, hold time 10 min).
Injections of 1 μL were made in split (1:5) mode (purge time 120 s, purge flow 40 mL/min). Chromatography was performed on an Rtx-5Sil MS column (30 m × 0.25 mm inner diameter (i.d.), 0.25 μm film thickness) with an Integra-Guard column (Restek, Bellefonte, PA, USA). Helium carrier gas was used at a constant flow of 1 mL/min. The GC oven temperature program was initially 50 °C with a 1-min hold time, ramping at 20 °C/min to a final temperature of 330 °C with a 5-min hold time before cool-down, for a 20-min run time. MS parameters were based on Autotune using FC43 (perfluorotributylamine) with manufacturer-specific tune settings. Transfer line temperature was 250 °C and electron impact ionization was set at 70 eV. Filament source temperature was 250 °C and the TOF was at room temperature. After a solvent delay of 350 s, mass spectra were acquired at 20 scans/s over a mass range of 50 to 500 *m*/*z*. Initial peak detection and mass spectrum deconvolution were performed with Leco ChromaTOF software (version 2.25, Leco), and samples were exported to the netCDF format for further data evaluation with MZmine [28] and XCMS [29].
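
The oven program and acquisition settings above can be checked with simple arithmetic. The short Python sketch below recomputes the stated 20-min run time from the hold and ramp values, and estimates the number of spectra per run. The spectra estimate assumes, hypothetically, that acquisition continues from the end of the solvent delay to the end of the oven program; the text does not state when acquisition stops.

```python
def ramp_minutes(start_c: float, end_c: float, rate_c_per_min: float) -> float:
    """Time needed to ramp linearly from start_c to end_c."""
    return (end_c - start_c) / rate_c_per_min


# Oven program values taken from the text.
initial_hold_min = 1.0
final_hold_min = 5.0
oven_ramp_min = ramp_minutes(50.0, 330.0, 20.0)  # (330 - 50) / 20

total_run_min = initial_hold_min + oven_ramp_min + final_hold_min

# Gerstel injector ramp: 50 to 330 degrees C at 10 degrees C per second.
injector_ramp_s = (330.0 - 50.0) / 10.0

# ASSUMPTION: spectra are acquired from the end of the solvent delay
# until the end of the oven program.
solvent_delay_s = 350
scans_per_s = 20
acquisition_s = total_run_min * 60 - solvent_delay_s
n_spectra = int(acquisition_s * scans_per_s)

print(f"oven ramp: {oven_ramp_min} min, total run: {total_run_min} min")
print(f"injector ramp: {injector_ramp_s} s, spectra per run: {n_spectra}")
```

The ramp contributes 14 min, which together with the 1-min and 5-min holds reproduces the 20-min run time reported above.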

#### *2.3. Raw Data Processing and Statistics*

All chromatograms were assessed in the same manner by the software packages MZmine [28] and XCMS [29]. These packages performed peak finding in an automated and unbiased way using the common MS netCDF file format, which enables uniform data export irrespective of the instrument platform. For the raw GC–MS data, the netCDF export function of the Leco ChromaTOF software was used. For MZmine, the *m*/*z* bin size was set to 0.01, the chromatographic threshold level was set to 0.5, the absolute intensity threshold was set to 2500, the tolerance in *m*/*z* values was set to 0.5, the tolerance in intensity was set to 1.0, and the minimum peak length was set to 2 s.
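
To make two of these settings concrete, the sketch below shows how an absolute intensity threshold and a minimum peak length could act as filters on candidate peaks. It is a simplified illustration and not MZmine's peak-detection algorithm: the `CandidatePeak` record and `passes_filters` function are hypothetical, and treating both thresholds as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CandidatePeak:
    """Hypothetical peak record; MZmine's internal representation differs."""
    mz: float          # mass-to-charge ratio of the peak
    height: float      # apex intensity in detector counts
    duration_s: float  # chromatographic peak length in seconds


ABS_INTENSITY_THRESHOLD = 2500.0  # from the text
MIN_PEAK_LENGTH_S = 2.0           # from the text
SCANS_PER_S = 20                  # acquisition rate given in Section 2.2


def passes_filters(peak: CandidatePeak) -> bool:
    # ASSUMPTION: both thresholds are inclusive lower bounds.
    return (peak.height >= ABS_INTENSITY_THRESHOLD
            and peak.duration_s >= MIN_PEAK_LENGTH_S)


# At 20 scans/s, a 2 s minimum peak length spans this many scans.
min_scans = int(MIN_PEAK_LENGTH_S * SCANS_PER_S)

peaks = [
    CandidatePeak(mz=73.0, height=3000.0, duration_s=2.5),   # kept
    CandidatePeak(mz=147.0, height=2000.0, duration_s=3.0),  # too weak
    CandidatePeak(mz=205.0, height=9000.0, duration_s=1.0),  # too short
]
kept = [p for p in peaks if passes_filters(p)]
```

At the stated acquisition rate, the 2 s minimum peak length corresponds to 40 consecutive scans.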

The raw data were normalized to urine creatinine, used as an internal control and measured as part of the urine metabolome assessment, and quantile normalization was applied for batch correction. In total, 310 biopsy-matched urine samples, with resulting panels of 266 metabolites, were used for the analyses of both post-transplant injury classification and significant metabolite selection. Non-parametric imputation was applied to these samples via the missForest algorithm [30]. Samples with missing data for more than one-third of the metabolites were excluded; sixteen samples met this criterion.
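
The creatinine normalization and the missing-data exclusion rule can be sketched as follows. This is a minimal Python illustration under stated assumptions: each sample is represented as a hypothetical dictionary of metabolite intensities with `None` for missing values, normalization is taken to be a simple ratio to the creatinine signal, and the quantile normalization and missForest imputation steps are not shown.

```python
from typing import Dict, Optional

Sample = Dict[str, Optional[float]]


def creatinine_normalize(sample: Sample, creatinine_key: str = "creatinine") -> Sample:
    """Divide every metabolite by the sample's creatinine signal.

    ASSUMPTION: normalization is a plain ratio and creatinine is never missing.
    """
    creatinine = sample[creatinine_key]
    return {
        name: (value / creatinine if value is not None else None)
        for name, value in sample.items()
        if name != creatinine_key
    }


def exceeds_missing_limit(sample: Sample) -> bool:
    """True if more than one-third of the metabolites are missing."""
    n_missing = sum(value is None for value in sample.values())
    # Integer comparison avoids floating-point issues at exactly one-third.
    return 3 * n_missing > len(sample)


raw = {"creatinine": 2.0, "glycine": 4.0, "lactate": None, "citrate": 10.0}
normalized = creatinine_normalize(raw)
excluded = exceeds_missing_limit(normalized)
```

In this toy sample, one of three metabolites is missing, which is exactly one-third and therefore does not trigger exclusion.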

Clustering was performed and visualized with Morpheus (Broad Institute) using average linkage hierarchical clustering. The log-transformed data were median-centered, per metabolite, prior to clustering for better visualization. One minus Pearson's correlation was used as the similarity metric. A fire color scheme was used in heat maps of the metabolites. Z-score analysis scaled each metabolite according to a reference distribution.
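
The per-metabolite median centering and the one-minus-Pearson metric can be illustrated with a few lines of Python. This sketch is not Morpheus code: the function names are hypothetical, and the use of a base-2 logarithm is an assumption, since the base of the log transform is not specified above.

```python
import math
from statistics import median
from typing import List


def median_center_log(values: List[float]) -> List[float]:
    """Log-transform one metabolite across samples, then subtract its median.

    ASSUMPTION: base-2 logarithm; all values are strictly positive.
    """
    logs = [math.log2(v) for v in values]
    center = median(logs)
    return [x - center for x in logs]


def one_minus_pearson(x: List[float], y: List[float]) -> float:
    """Dissimilarity defined as 1 - Pearson correlation of two profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (norm_x * norm_y)


centered = median_center_log([1.0, 2.0, 4.0])
same_trend = one_minus_pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
opposite_trend = one_minus_pearson([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

Profiles that rise and fall together give a value near 0, and perfectly anti-correlated profiles give a value near 2, so the metric ranges over [0, 2].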

To evaluate the performance of the classification models, these 310 samples were randomly assigned to training (75%) and test (25%) sets. To avoid overfitting, 10-fold cross-validation was performed for models on the training set. The primary statistical learning method used for allograft outcome classification was Random Forests [31] via the randomForest package in R. Significant metabolites were selected from the Random Forests model using the VSURF package in R [32]. Additionally, for visualization of significant metabolites, volcano plots were produced using variable importance values derived from Random Forests models as a significance measure. Metabolite selection was also done by Bonferroni-corrected *p*-value, in addition to VSURF, to display a traditional volcano plot and to directly compare VSURF with traditional t-testing methods and their resulting metabolite lists. The variable importance scores are defined as the mean percentage decrease in classification accuracy of the model if the metabolite data were to be randomly permuted rather than taken as quantified (a higher score denotes a higher variable importance). Comparison of classification models was done by computing and plotting the area under the curve (AUC) of the receiver operating characteristic (ROC) using the pROC package in R. Statistical comparison between full and abbreviated metabolite models, to assess similarity in diagnostic accuracy, was carried out using DeLong's test [33]. Given that certain clinical data variables were significantly different between groups, these variables were reviewed for any association with particular variable differences within or between groups and their impact on metabolite signatures of different transplant phenotypes. Analysis was performed using the R statistical software version 3.4.3. MetaboAnalyst (www.metaboanalyst.ca) was used to perform targeted pathway and enrichment analysis [34].
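
Three of the quantities described above (the 75%/25% split, the AUC, and the permutation-based variable importance) can be sketched in plain Python. This is an illustration rather than the analysis code, which used the randomForest, VSURF, and pROC packages in R. All function names are hypothetical, the AUC is computed through its rank-statistic definition, and the importance function performs a single permutation on a supplied data set, whereas Random Forests average the decrease over the out-of-bag samples of each tree.

```python
import random
from typing import Callable, List, Sequence, Tuple


def train_test_split(items: Sequence, train_fraction: float = 0.75,
                     seed: int = 0) -> Tuple[list, list]:
    """Randomly assign items to training and test sets."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_importance(predict: Callable[[List[float]], int],
                           rows: List[List[float]], labels: Sequence[int],
                           column: int, seed: int = 0) -> float:
    """Percentage-point drop in accuracy after permuting one metabolite column.

    SIMPLIFICATION: one permutation on the given rows, not an out-of-bag average.
    """
    def accuracy(data: List[List[float]]) -> float:
        return sum(predict(r) == y for r, y in zip(data, labels)) / len(labels)

    rng = random.Random(seed)
    permuted_values = [r[column] for r in rows]
    rng.shuffle(permuted_values)
    permuted_rows = [r[:column] + [v] + r[column + 1:]
                     for r, v in zip(rows, permuted_values)]
    return 100.0 * (accuracy(rows) - accuracy(permuted_rows))


train, test = train_test_split(range(100))
example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

A metabolite that the classifier never uses yields an importance of zero under this definition, because permuting it leaves every prediction unchanged.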

#### *2.4. Data Availability*

The datasets generated and analyzed during the current study are not publicly available due to legacy IRB consent restrictions on public sharing of data from these patient populations, but are available from the corresponding author on reasonable request.
