**3. Discussion**

The value of combining the two most commonly used techniques in metabolomics, NMR and MS, was recently recognized and addressed in a review by Marshall and Powers [8]. However, no method has attempted to directly correlate an NMR chemical shift with an MS *m*/*z* value within a single sample because nothing in the data indicates that the two features belong to the same molecule [8]. To overcome this limitation, MS intensities from a cohort of samples were cross-correlated with NMR spectral regions [17]. A correlation that cannot be established in one sample emerges across the cohort because it is the distribution of intensities across samples that determines whether a given chemical shift belongs to a molecule with a certain *m*/*z*. This approach, known as Statistical Heterospectroscopy, nevertheless led to the identification of only a small number of metabolites. Its major drawback is, in our opinion, that the correlation was attempted between the intensities of compounds whose levels are measured separately by HPLC–MS and regions of the NMR spectrum whose intensities result from the simultaneous contributions of many metabolites. A clear correlation between MS and NMR bin intensities can only be expected for metabolites that, because of their concentration, strongly dominate the shape of the region.
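The cohort-level correlation behind Statistical Heterospectroscopy can be illustrated with a minimal sketch. It is not the published implementation: the function names, the feature labels, and all intensities are hypothetical and chosen only to show why a bin dominated by one metabolite correlates with its MS feature while a crowded bin does not.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant vector: correlation undefined, reported as 0
    return sxy / sqrt(sxx * syy)


def correlate_features(ms_features, nmr_bins):
    """Correlate every MS feature with every NMR bin across the cohort.

    Both arguments map a label to a list of intensities, one per sample,
    with samples in the same order.
    """
    return {
        (mz, nmr_bin): pearson(ms_vals, nmr_vals)
        for mz, ms_vals in ms_features.items()
        for nmr_bin, nmr_vals in nmr_bins.items()
    }


# Hypothetical cohort of five samples (invented numbers, arbitrary units).
ms_features = {"mz_A": [1.0, 2.0, 3.0, 4.0, 5.0]}
nmr_bins = {
    # Bin whose area is dominated by the metabolite detected as mz_A.
    "bin_dominated": [2.1, 4.0, 6.2, 7.9, 10.1],
    # Crowded bin: many metabolites contribute, so the trend is masked.
    "bin_crowded": [7.0, 3.5, 9.0, 4.0, 6.5],
}
r = correlate_features(ms_features, nmr_bins)
```

In this toy cohort, the dominated bin correlates almost perfectly with the MS feature, whereas the crowded bin shows essentially no correlation, which mirrors the limitation discussed above.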

In contrast, SYNHMET uses the resolving power of NMR to separate most of the signals contributing to the spectral profile, coupled with that of UHPLC–HRMS. The deconvolution strategy has been used to extract more than 200 metabolite concentrations from urine [16], a result that, to the best of our knowledge, has not been reproduced in any further study. The difficulty of this methodology lies mainly in extracting levels for low-concentration metabolites or for those presenting signals in crowded regions. These regions only provide the sum of the contributions of the various compounds, and without further information, there are many ways to combine the positions and intensities of the mixture components to reproduce the experimental shape of the NMR spectrum. The simultaneous use of UHPLC–HRMS intensities provides the key to obtaining a single solution because it adds two new features for calculating the relative contribution of metabolites to a profile: the molecular weight and the chromatographic resolution. The latter is not practical in NMR measurements due to a combination of low sensitivity and long acquisition times. In this way, the correct proportions are extracted by combining NMR and UHPLC–HRMS, which transforms the experiment used for metabolite identification/quantification from one-dimensional (chemical shift) into three-dimensional (by adding retention time and exact mass). Overall, the mechanism by which this method operates can be defined as an MS-assisted NMR deconvolution, which improves the quality and quantity of the data compared with what can be expected when using NMR alone (Scheme 2).

**Scheme 2.** Correlation between NMR concentration and MS intensity helps determine the relative position of signals in crowded regions of the NMR spectrum, as in the example in (**a**), which shows the superposition of five signals from different metabolites, by resolving the peaks in the retention time–exact mass plane (**b**).
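The way MS intensities can single out one solution for a crowded NMR region can be illustrated with a deliberately simplified sketch. It is not the SYNHMET algorithm. It assumes, purely for illustration, that the area of an overlapped region in sample *j* is a linear combination of the MS intensities of two metabolites, each scaled by an unknown response factor, and that these factors are constant across the cohort. The function name and all numbers are hypothetical.

```python
def fit_response_factors(ms_a, ms_b, nmr_area):
    """Least-squares fit of nmr_area[j] ~ k_a * ms_a[j] + k_b * ms_b[j].

    Solves the 2x2 normal equations with Cramer's rule and returns
    the pair (k_a, k_b).
    """
    saa = sum(a * a for a in ms_a)
    sbb = sum(b * b for b in ms_b)
    sab = sum(a * b for a, b in zip(ms_a, ms_b))
    say = sum(a * y for a, y in zip(ms_a, nmr_area))
    sby = sum(b * y for b, y in zip(ms_b, nmr_area))
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        # Proportional MS profiles carry no information to split the sum.
        raise ValueError("MS profiles are collinear; contributions cannot be separated")
    k_a = (say * sbb - sby * sab) / det
    k_b = (sby * saa - say * sab) / det
    return k_a, k_b


# Synthetic cohort of four samples (invented numbers, arbitrary units).
ms_a = [10.0, 20.0, 30.0, 40.0]      # MS intensity of hypothetical metabolite A
ms_b = [8.0, 3.0, 9.0, 1.0]          # MS intensity of hypothetical metabolite B
nmr_area = [21.0, 16.0, 33.0, 22.0]  # total area of the overlapped NMR region

k_a, k_b = fit_response_factors(ms_a, ms_b, nmr_area)

# Per-sample contribution of each metabolite to the overlapped region.
contrib_a = [k_a * a for a in ms_a]
contrib_b = [k_b * b for b in ms_b]
```

With NMR alone, each sample supplies a single number (the summed area) for two unknowns. The chromatographically resolved MS intensities add the independent per-metabolite information that makes the split unique in this toy model.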

According to the Metabolomics Standards Initiative, a definitive metabolite identification, called level 1, requires a direct comparison of experimental data with an authentic reference standard [27]. It has been argued that NMR identification of compounds in mixtures by comparison with a spectral database approaches level 1 identification [27]. In the SYNHMET strategy, the NMR parameters characterizing a compound (chemical shift, multiplicity, and number of signals) are complemented by the elemental composition provided by the correlation with the MS data. This additional information limits the possible structures to the existing isomers, which constitute a very restricted chemical space for low molecular weight compounds. The probability that two isomers show the same NMR parameters is extremely low, if it is possible at all. For all these reasons, the confidence in a SYNHMET identification should be considered, in our opinion, similar to that of level 1.

Applying SYNHMET enabled us to quantify a large number of metabolites in urine. Many papers have supported the concept that the utility of a given approach is directly proportional to the number of metabolite levels it can measure. However, this is only one of the two essential parameters defining the value of a dataset for metabolomics studies. The other is the completeness of the data matrix: if there are too many missing values, the classification ability and the detection of correlations between metabolites become weaker [28]. In our experience and from analyzing the literature on NMR urine metabolomics, the maximum number of metabolites quantified in at least 80% of samples is around 50–60 [10–15]. This number is far from that achieved in LC-MS studies, which have reached more than a thousand [29]. However, the identity of many of these metabolites is only putative because it is supported by fragmentation spectra alone.
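The completeness criterion mentioned above (metabolites quantified in at least 80% of samples) can be expressed as a simple filter. The sketch below is only illustrative: the function name, the data layout (missing values stored as `None`), and the example concentrations are assumptions, not part of the study.

```python
def metabolites_above_completeness(matrix, threshold=0.8):
    """Return the metabolites quantified in at least `threshold` of samples.

    `matrix` maps a metabolite name to a list with one entry per sample,
    where None marks a missing (not quantified) value.
    """
    kept = []
    for name, values in matrix.items():
        if not values:
            continue  # no samples recorded for this metabolite
        fraction = sum(v is not None for v in values) / len(values)
        if fraction >= threshold:
            kept.append(name)
    return kept


# Invented example with five samples (concentrations in arbitrary units).
matrix = {
    "creatinine": [9.8, 12.1, 7.4, 10.3, 11.0],   # 5/5 quantified
    "hippurate": [3.1, None, 2.7, 4.0, 3.6],      # 4/5 quantified
    "trigonelline": [None, 0.4, None, None, 0.5], # 2/5 quantified
}
kept = metabolites_above_completeness(matrix)
```

Counting the retained metabolites, rather than all detected ones, gives the figure that is comparable across studies when missing values are taken into account.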

In our scheme, both the reproducibility and the accuracy of the results are mainly supported by the characteristics of NMR. It is commonly accepted that these are two of the main robust features of NMR, which make it possible to obtain the same instrumental response even when different spectrometers are used. These characteristics have favored the construction of a community-built reference calibration line, with the participation of twenty-three laboratories, including ours [30]. We foresee that a similar calibration line can be produced among laboratories keen to prove the validity of the SYNHMET approach, allowing the scientific community to obtain more robust results in metabolomics. However, the number of samples analyzed in this work did not allow for a definitive answer, and further studies will be needed to assess the limits in terms of precision and accuracy.

A compelling application of SYNHMET is the possibility of generating a detailed personalized profile of urinary metabolites. Obtaining a reliable profile depends mainly on choosing an effective way to normalize the metabolite concentrations to correct for the variation induced by the subject's hydration status. Usually, concentrations are normalized by the total urine volume collected during 24 h or by the urinary creatinine level. Both normalization strategies have advantages and drawbacks. When the total 24 h urine volume is used, incomplete collection is the main problem [22]. On the other hand, creatinine concentration is affected by several factors that are not directly related to the glomerular filtration rate, such as muscle mass, diet, age, sex, and race [31,32]. Creatinine is also secreted by the renal tubules, which is not desirable for a glomerular filtration marker. A study comparing the uncertainties associated with standardizing urine samples by volume and by creatinine concentration showed that the latter introduces a 19–35% error [22]. However, compared with total volume normalization, this error is partially offset by the higher risk of incomplete collection of voids over a 24 h interval. More recently, a study showed that normalization with urinary creatinine is better than normalization with volume in rats under controlled preclinical conditions, even when compared to a more recently proposed normalizer, cystatin C [33].

Our study used creatinine to normalize the metabolite concentrations, mainly because almost all the normal ranges available in the literature are expressed in μM/mM of creatinine [34]. Regardless of the strategy used, the profile of normalized metabolite concentrations constitutes a personalized urinary picture, which can be used to expand the current capability of classical biochemical tests to determine a person's health status. This approach is very different from classical metabolomics, which seeks to find universal biomarkers of a disease or of drug effects. The concept of personalized medicine grew out of the scientific evidence that there is high interindividual variability in the metabolic response to any change in health status or to a drug. Therefore, expanding the number of metabolites that can be routinely monitored in biofluids can define a more accurate picture to be used in clinical practice [35]. At the heart of this analysis is the concept that a person's metabolic profile can reflect their overall health status. Nowadays, physicians only capture a tiny fraction of the information contained in the metabolome, mainly due to its high complexity and the lack of robust and efficient analytical methods to determine the absolute, rather than the relative, level of a large number of chemical compounds in biofluids. Routine analyses only evaluate a very restricted number of compounds, such as glucose for monitoring diabetes, cholesterol and low/high-density lipoproteins for cardiovascular health, or urea and creatinine for renal disorders. Simultaneously determining the absolute concentration of hundreds of molecules will open up new scenarios towards more accurate personalized medicine and increase the predictive value of such analyses.
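The creatinine normalization and the comparison with literature ranges can be sketched as follows. This is only an illustration of the arithmetic (μM of metabolite divided by mM of creatinine): the function names are hypothetical, and the metabolite labels, concentrations, and reference ranges are invented placeholders, not values from the study or from the literature.

```python
def normalize_to_creatinine(conc_uM, creatinine_mM):
    """Convert absolute concentrations (uM) into uM per mM of creatinine."""
    if creatinine_mM <= 0:
        raise ValueError("creatinine concentration must be positive")
    return {name: c / creatinine_mM for name, c in conc_uM.items()}


def flag_out_of_range(normalized, reference_ranges):
    """Label each metabolite as 'low', 'normal', or 'high'.

    `reference_ranges` maps a metabolite name to a (low, high) tuple in
    uM/mM creatinine. Metabolites without a reference range are skipped.
    """
    flags = {}
    for name, value in normalized.items():
        if name not in reference_ranges:
            continue
        low, high = reference_ranges[name]
        if value < low:
            flags[name] = "low"
        elif value > high:
            flags[name] = "high"
        else:
            flags[name] = "normal"
    return flags


# Placeholder data for one hypothetical subject.
conc_uM = {"metabolite_X": 500.0, "metabolite_Y": 40.0}
creatinine_mM = 10.0

# Placeholder ranges in uM/mM creatinine (NOT real reference values).
reference_ranges = {"metabolite_X": (10.0, 30.0), "metabolite_Y": (2.0, 8.0)}

profile = normalize_to_creatinine(conc_uM, creatinine_mM)
flags = flag_out_of_range(profile, reference_ranges)
```

Applied to hundreds of metabolites at once, the resulting set of flags is one simple way to represent the personalized urinary picture described above.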

For example, a patient suffering from BC showed a urinary profile with significantly abnormal values for metabolites belonging to galactose/starch and sucrose, caffeine, and lysine metabolism (Figure 5). A recent study that used machine learning to recognize different stages of BC identified the first two as the main dysregulated pathways in early stages, whereas lysine metabolism was found to be unbalanced in late stages [36]. The case of caffeine metabolism is remarkable. Along with one of its metabolites, 1,3-dimethylurate, caffeine is processed by a P450 family cytochrome acting in the liver, CYP1A2 [37]. The connection between caffeine metabolism, exposure to tobacco compounds, and urinary mutagenicity has been known for a long time [38]. Significantly, cigarette smoking is the leading risk factor for BC, accounting for 50% of the total [39]. In addition to this patient, urine caffeine levels were significantly elevated in six other subjects with BC.

This patient also presented significant comorbidity due to cardiovascular pathologies, particularly severe myocardial ischemia. We observed several altered metabolic pathways related to cardiopathies, such as those corresponding to branched-chain amino acids, lactate, and fatty acids [40]. These alterations are the consequences of increased fatty acid metabolism, decreased glucose metabolism, and impaired branched-chain amino acid catabolism. Finally, the patient showed chronic pancreatitis, probably related to past alcohol abuse. The malfunction of the pancreas would explain the very high level of glucose in the urine, as seen in diabetic subjects.
