1. Introduction
Despite extensive improvement in the diagnosis and management of breast cancer, which is the most common cancer type in women with an incidence of around 30/1000 [1,2], advanced breast cancer is still accompanied by high mortality rates. For instance, in the case of metastatic breast cancer, the 5-year survival rate is around 23% [3]. Given the dismal prognosis of advanced forms of breast cancer, screening continues to be one of the most important strategies for improving the survival of breast cancer patients [4]. To this end, the guidelines issued by the European Society for Medical Oncology (ESMO) recommend biennial screening for all women aged 50 to 69 years using mammography [5]. However, mammography is an unpleasant experience for the patients and the accuracy of the method is affected by the high density of the breast tissue [5]. Therefore, there is ongoing research for simpler, quicker and more accurate strategies of diagnosing breast cancer.
Raman spectroscopy is a type of vibrational spectroscopy based on the inelastic scattering of laser photons, which can assess the vibrational energy structure and implicitly the molecular structure of samples [6]. However, the use of Raman is often limited by the low sensitivity of the effect. For instance, the concentration of most metabolites in biofluids like serum or urine is below the detection limit of Raman scattering. Thus, analysing biofluids requires Raman amplification methods such as surface-enhanced Raman scattering (SERS).
SERS is a method of enhancing the Raman signal of molecules [7], using nanometre-sized metal substrates such as metal colloids [8]. SERS has attracted much attention as a method of analysing biofluids in the point-of-care setting, especially for cancer detection [9]. Thus, SERS enables the assessment of the chemical composition of biofluids, which exhibit a wealth of spectral information that can aid in diagnosing diseases such as cancer.
For instance, preliminary results in a murine model showed that Raman spectra of urine samples displayed distinguishable features in the case of rats with breast cancer [10]. Similarly, Del Mistro et al. showed on n = 20 samples that the SERS spectra of urine displayed distinguishable modifications in the case of patients with prostate cancer [11]. Good classification accuracies using SERS spectra of serum were also reported in the case of colorectal, lung, oral, breast or prostate cancer [12,13,14,15,16,17].
Previous reports on SERS spectra acquired from urine showed that the urinary proteins prevent the acquisition of SERS spectra [11]. Therefore, the authors filtered the urine and then acquired the SERS spectra from protein-free urine. On the other hand, the SERS spectra of proteins can be selectively amplified by modifying the nanoparticles with iodide, which facilitates the chemisorption of proteins onto the nanoparticles [18]. The selective amplification of the SERS signal of proteins using iodide-modified nanoparticles suggests that ions can play important roles in promoting the chemisorption of analytes, in line with the chemical mechanism of SERS [19].
In this study, we aimed to demonstrate the feasibility of diagnosing breast cancer based on the SERS spectra of urine in the case of n = 53 patients with breast cancer and n = 22 controls. By including in our analysis a larger cohort than in previous reports, the study contributes to the effort of translating SERS into the clinical setting as a novel diagnostic tool for breast cancer.
2. Materials and Methods
In this study, we enrolled n = 53 female patients with biopsy-confirmed breast cancer, who were referred to the Ion Chiricuta Oncologic Institute Cluj-Napoca, Romania, for mastectomy/lumpectomy. Patients were included irrespective of their stage, grade or histologic type. Male patients presenting with breast cancer were excluded from this study.
Morning urine samples were collected in plastic containers and stored at −80 °C until analysis. Before the SERS measurements, the urine samples were centrifuged for 10 min at 5800 g in order to remove crystals and cell debris. All urine samples were collected from treatment-naïve patients. Control urine samples were obtained from n = 22 subjects confirmed to be healthy by clinical examinations. The study was approved by the ethics committee of Iuliu Hatieganu University of Medicine and Pharmacy Cluj-Napoca and all subjects provided written informed consent for enrolling in the study.
For the SERS analysis, silver nanoparticles synthesized by reduction with hydroxylamine hydrochloride (hya-AgNPs) were used [20]. Briefly, the nanoparticles were synthesized by mixing 17 mg of H2NOH·HCl with 1.2 mL of 1% NaOH solution and 8.8 mL of ultrapure water. Separately, 17 mg of AgNO3 was dissolved in 90 mL of ultrapure water. The two solutions were rapidly mixed under vigorous stirring in order to synthesize the nanoparticles. The pH of the colloid was 7.5 after the synthesis. The fresh colloid was left at room temperature overnight before measurements. All chemicals were purchased from Sigma–Aldrich (Steinheim, Germany). UV-Vis absorption spectra of the silver nanoparticles were recorded using a V-630 Spectrometer (Jasco) by diluting the colloidal solution 10-fold.
For acquiring the SERS spectra, 10 µL of urine was added to 90 µL of hya-AgNPs. Then, the colloid was activated by adding 1 µL of 10−2 M Ca(NO3)2 (final Ca(NO3)2 concentration of approximately 10−4 M). A 5 µL droplet from this mixture was then placed on an aluminium foil-covered microscope slide and analysed by Raman spectroscopy immediately (in liquid form). The SERS spectra were acquired using an InVia Raman Spectrometer (Renishaw), equipped with a frequency-doubled Nd:YAG laser emitting at 532 nm (laser power 20 mW on the sample), which was focused for 40 s on the sample through a 5X microscope objective (NA 0.12). Acquiring SERS spectra from samples in liquid form results in an average SERS spectrum due to the continuous thermal motion of the molecules.
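The stated final concentration can be checked with a simple dilution calculation. The sketch below uses only the volumes and the stock concentration given above; the function and variable names are illustrative and are not taken from the study.

```python
def final_concentration(stock_molar, stock_volume_ul, receiving_volume_ul):
    """Molar concentration after adding a stock aliquot to a receiving volume."""
    total_volume_ul = stock_volume_ul + receiving_volume_ul
    return stock_molar * stock_volume_ul / total_volume_ul


# Volumes and stock concentration as stated in the text.
urine_ul = 10.0
colloid_ul = 90.0
ca_stock_molar = 1e-2
ca_stock_ul = 1.0

# Ca(NO3)2 concentration in the final 101 µL mixture.
ca_final_molar = final_concentration(ca_stock_molar, ca_stock_ul, urine_ul + colloid_ul)

# Overall dilution factor of the urine in the measured mixture.
urine_dilution_factor = (urine_ul + colloid_ul + ca_stock_ul) / urine_ul

print(f"Final Ca(NO3)2 concentration: {ca_final_molar:.2e} M")
print(f"Urine dilution factor: {urine_dilution_factor:.1f}")
```

The result is slightly below 10−4 M because the added microlitre increases the total volume to 101 µL; the difference is about 1%.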
Spectra pre-processing consisted of background removal using a linear baseline correction, which removed the background contribution due to autofluorescence, followed by mean normalization. In the case of mean normalization, each intensity value is divided by the mean intensity of the spectrum. In this way, spectra can be compared even when there are significant differences in their absolute intensities [21].
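The two pre-processing steps can be sketched as follows. This is a minimal illustration, not the implementation used in the study: it assumes that the linear baseline is the straight line through the first and last points of the spectrum, and the toy spectrum is invented for the example.

```python
def linear_baseline_correct(wavenumbers, intensities):
    """Subtract the straight line joining the first and last spectral points.

    Assumption: the baseline is anchored at the two ends of the spectrum;
    other anchor choices are possible.
    """
    x0, x1 = wavenumbers[0], wavenumbers[-1]
    y0, y1 = intensities[0], intensities[-1]
    slope = (y1 - y0) / (x1 - x0)
    return [y - (y0 + slope * (x - x0)) for x, y in zip(wavenumbers, intensities)]


def mean_normalize(intensities):
    """Divide every intensity by the mean intensity of the spectrum."""
    mean_intensity = sum(intensities) / len(intensities)
    if mean_intensity == 0:
        raise ValueError("mean intensity is zero; spectrum cannot be normalized")
    return [y / mean_intensity for y in intensities]


# Invented toy spectrum: one band sitting on a sloping background.
toy_wavenumbers = [400.0, 600.0, 800.0, 1000.0, 1200.0]
toy_intensities = [10.0, 25.0, 40.0, 25.0, 30.0]

corrected = linear_baseline_correct(toy_wavenumbers, toy_intensities)
normalized = mean_normalize(corrected)
print(corrected)
print(normalized)
```

After mean normalization the mean of every spectrum equals 1, which is what makes spectra with different absolute intensities comparable.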
The statistical analysis consisted of principal component analysis (PCA) and principal component analysis-linear discriminant analysis (PCA-LDA). PCA is an exploratory data analysis technique that reduces the dimensionality of the data down to a desired number of principal component (PC) score values. The information captured by each PC can be visualized by analysing the corresponding loading plot of the PC. Given that PCA is an unsupervised method, it cannot be used for assessing the classification accuracy. Therefore, we employed PCA-LDA, which is a supervised multivariate data analysis technique that requires prior knowledge of the group to which each sample belongs (in our case the breast cancer and the control group) in order to calculate figures of merit such as sensitivity, specificity and overall accuracy. Sensitivity is defined as the true positive rate (the percentage of samples in the disease group correctly assigned as such), whereas specificity is defined as the true negative rate (the percentage of samples in the control group correctly assigned as such). Overall accuracy is defined as the percentage of correctly assigned samples (irrespective of the group to which they belong).
The need to perform PCA first and then LDA resides in the fact that LDA can be applied only when the number of variables is smaller than the number of samples in any group of the model [21]. Thus, PCA replaces each spectrum (a string of intensity values, one per wavenumber) with a desired number of principal component (PC) score values. In our case, the number of PCs was chosen such that the explained variance exceeded the 80% threshold. Moreover, we also visually inspected the loading plots for the presence of spectral features. For each sample, LDA calculates a discriminant value corresponding to each group, the sample being assigned to the group having the highest discriminant value. All statistical analysis was performed using The Unscrambler (version 10.1, CAMO Software, Oslo, Norway).
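The rule for choosing the number of PCs can be illustrated with a short sketch. The function name is hypothetical and the explained-variance ratios are invented for the example; they are not the values obtained in this study.

```python
def n_components_for_variance(explained_ratios, threshold=0.80):
    """Return the smallest number of leading PCs whose cumulative
    explained variance exceeds the threshold."""
    cumulative = 0.0
    for n_components, ratio in enumerate(explained_ratios, start=1):
        cumulative += ratio
        if cumulative > threshold:
            return n_components
    raise ValueError("the threshold is not reached by the supplied components")


# Invented per-PC explained-variance ratios, sorted in decreasing order.
example_ratios = [0.35, 0.18, 0.10, 0.07, 0.05, 0.04, 0.03, 0.02]

n_pcs = n_components_for_variance(example_ratios, threshold=0.80)
print(n_pcs)
```

In practice the ratios would come from the PCA output itself, and the visual inspection of the loading plots described above remains a separate, manual check.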
3. Results
In this study, we acquired SERS spectra of urine samples from n = 53 breast cancer patients and n = 22 controls. The distribution of patients according to stage I, II and III breast cancer is presented in Figure 1. The average age of the breast cancer patients was 55 ± 11 years, while the average age of the control group was 45 ± 5 years. The detailed information regarding stage and age for each patient is presented in Table S1.
Adding cations such as Ca2+ or Mg2+ to the silver colloid promotes the chemisorption of anionic species (including purine metabolites) to the silver surface and the switching on of the SERS effect, without necessarily aggregating the nanoparticles, as shown recently by us [19,22]. For instance, adding 10−4 M Ca2+ leads to intense SERS spectra of 10−4 M uric acid, while in the absence of Ca2+, the spectra are much weaker (Figure S1 in the Supplementary Materials).
No aggregation of the hya-AgNPs was observed upon the addition of Ca2+ up to 10−4 M. To test the stability of the hya-AgNPs after their supplementary activation with Ca2+, we acquired UV-Vis absorption spectra of the nanoparticles before and after activation with 10−4 M Ca2+. The UV-Vis spectrum of hya-AgNPs showed the characteristic plasmonic resonance band at 408 nm (Figure 2), which did not shift significantly after the SERS-activation of the colloid. Thus, the UV-Vis spectra recorded before and after activation overlapped.
For the acquisition of the SERS spectra, we employed the 532 nm laser (green), which meets pre-resonant conditions with the absorption maximum of the nanoparticles. The average SERS spectra of urine samples from breast cancer patients and controls along with the difference spectrum are presented in Figure 3. The standard deviation of the difference spectrum, which is a measure of the robustness of the separation between the groups, is presented as a grey shaded area.
The SERS spectra in Figure 3 are dominated by several bands tentatively attributed to uric acid at 650, 809, 1017, 1135 and 1522 cm−1. The SERS bands tentatively attributed to xanthine are represented by the ones at 1135, 1251 and 1318 cm−1, while the bands at 724, 1094, 1459 and 1595 cm−1 were tentatively attributed to hypoxanthine [18,23,24]. However, several bands remained unassigned.
To explore the spectral data, which show significant variability across SERS spectra, we performed PCA (Figure 4). Figure 4a depicts the grouping of samples based on the score values of PC 1 and PC 3, which performed best in separating the two groups. Figure 4b shows the corresponding loading plots of PC 1 and PC 3, which depict the SERS bands captured by each PC. Given that PCA is an unsupervised statistical method, some bands are present across PCs.
To test the classification accuracy yielded by SERS spectra of urine, we also performed PCA-LDA, using the first 7 PCs as input, which captured 82% of the variability of the original data set (Figure S2). The discriminant values of PCA-LDA are presented in Figure 5a. For each sample, PCA-LDA calculated the group-associated discriminant values and samples were then attributed to the group with the highest discriminant value.
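The assignment rule amounts to picking, for each sample, the group with the largest discriminant value. A minimal sketch follows; the sample names and discriminant values are invented for illustration and do not come from Figure 5a.

```python
def assign_group(discriminant_values):
    """Return the group label with the highest discriminant value."""
    return max(discriminant_values, key=discriminant_values.get)


# Invented discriminant values for three hypothetical samples.
example_samples = {
    "sample_A": {"breast cancer": -3.1, "control": -7.4},
    "sample_B": {"breast cancer": -9.8, "control": -4.2},
    "sample_C": {"breast cancer": -5.0, "control": -5.6},
}

assignments = {name: assign_group(values) for name, values in example_samples.items()}
print(assignments)
```

Samples such as the hypothetical sample_C, whose two discriminant values are close, lie near the decision boundary and are the ones most likely to be misclassified.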
The confusion matrix yielded by the PCA-LDA analysis is shown in Figure 5b. Among the 22 urine samples in the control group, PCA-LDA correctly classified 21 samples, corresponding to a specificity of 95%. In the case of the breast cancer group, PCA-LDA correctly classified 43 out of 53 samples. Therefore, the sensitivity of this test was 81%. In total, the test correctly classified 64 out of 75 samples (85%); the overall accuracy of 88% corresponds to the mean of the sensitivity and the specificity.
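The figures of merit can be recomputed from the counts quoted above (43 of 53 cancer samples and 21 of 22 control samples correctly classified). The sketch below uses the definitions from the Materials and Methods; the balanced accuracy (mean of sensitivity and specificity) is added here for comparison and is not a quantity named in the study. With these counts, the fraction of correctly classified samples evaluates to about 0.85, while the mean of sensitivity and specificity evaluates to about 0.88.

```python
def figures_of_merit(tp, fn, tn, fp):
    """Compute classification figures of merit from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
    }


# Counts taken from the text: 43/53 cancer and 21/22 control samples correct.
metrics = figures_of_merit(tp=43, fn=10, tn=21, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the two groups differ in size (53 versus 22), the plain accuracy is weighted towards the larger breast cancer group, whereas the balanced accuracy gives both groups equal weight.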
4. Discussion
The staging of breast cancer is made according to the TNM system, which takes into account the size of the tumour (T), the presence of metastases in the lymph nodes (N) and the presence of distant metastases (M) [5]. If a distant metastasis is present, the disease is considered stage IV irrespective of the size of the tumour and the presence of metastases in the lymph nodes. Among the 53 patients in the breast cancer group, stage I disease was present in 5% of patients, stage II was present in around 40% of the patients, while 55% of the patients had stage III breast cancer (Figure 1). No patient had stage IV disease, which means that all patients were free of distant metastases. The bias towards stage II and III cases was determined by the fact that sample collection was performed in a tertiary referral hospital specialized in breast cancer surgery. Thus, only a few cases with stage I breast cancer were referred to the clinic in which patient enrolment took place. The lack of stage IV breast cancer is explained by the fact that these cases are usually treated only with chemotherapy and/or radiotherapy.
For the acquisition of high-intensity SERS spectra from urine, we employed hya-AgNPs which were SERS-activated using Ca2+. The SERS signature of the urine samples suggests that the main class of metabolites to chemisorb onto the metal surface is represented by purine metabolites [18,24]. We have recently shown that cations such as Ca2+ or Mg2+ facilitate the specific adsorption of anionic species, including citrate and chloride ions, which chemisorb onto the nanoparticles in the order of their relative affinity for the activated metal surface [19,22]. Moreover, the SERS activating effect does not depend on the aggregation of the nanoparticles, as evidenced by the absence of a measurable shift in the plasmon resonance after the addition of 10−4 M Ca2+ (Figure 2). The absence of a shift in the absorbance maximum also explains the use of the 532 nm laser in favour of other laser lines such as the 633 nm or 785 nm lines due to (pre)resonant conditions with the surface plasmons of nanoparticles (UV-Vis absorption maximum at 408 nm). Using the 532 nm laser is also convenient because the sensitivity of the detector and the transmittance of the optical components of the spectrometer are maximal for this laser line.
When acquiring SERS spectra from purine metabolites using a colloid such as hya-AgNPs that contains chloride ions from the synthesis reaction, there is a competitive adsorption to the AgNPs surface between chloride ions and purine metabolites that favours the latter due to their higher affinity for the silver surface. The addition of Ca2+ or Mg2+ shifts the equilibrium towards purine metabolites even further, leading to an increase in the SERS intensity of purine metabolites such as uric acid (Figure S1).
In contrast to the study by Del Mistro et al. [11], who filtered the urine using centrifugal filter devices for eliminating proteins, our strategy allowed the acquisition of intense SERS spectra of purine metabolites without any filtering step, since the Ca2+ added to the solution facilitated the specific adsorption of purine metabolites to the detriment of protein traces [19,22]. Conversely, the chemisorption of proteins can also be favoured to the detriment of purine metabolites by modifying the nanoparticles with iodide, as shown recently in our study regarding albumin detection in urine [22]. It is worth mentioning that this strategy works well only for biofluids containing low concentrations of proteins such as urine [22].
The assignment of SERS spectra in the case of biofluids such as urine or serum is difficult, given the enormous chemical complexity of these matrices [9]. In line with previous studies, we have assigned several bands to uric acid (SERS bands at 650, 809, 1017, 1135 and 1522 cm−1), xanthine (SERS bands at 1135, 1251 and 1318 cm−1) and hypoxanthine (SERS bands at 724, 1094, 1459 and 1595 cm−1) [18,23,24].
Perturbations in the pathways of purine metabolism (including the purinosome, the multi-enzyme complex of de novo purine synthesis) are known to play a significant role in the onset and progression of cancer [25]. For instance, an increase in the levels of uric acid is known to accompany an increase in the cellular turnover, as is the case with a malignant lesion which spreads into the surrounding tissues [26]. However, the processes that regulate the purine metabolism are intricate and there are ongoing efforts to clarify the differential expression of purine metabolites that accompany different types of cancer [25]. The fact that some SERS bands remain unassigned underscores an important limitation of the SERS strategy, which is the uncertainty regarding the analytes responsible for the SERS signal. This is particularly applicable in the case of biofluids such as urine or serum, which contain a myriad of metabolites that could be responsible for the SERS signal.
Nonetheless, the difference between the mean spectra showed that some of the bands are more intense in the case of urine samples from breast cancer patients, whereas other SERS bands are more intense in the case of urine samples from controls.
To inspect the degree of separation between samples corresponding to the breast cancer group and controls, we performed PCA. The results show a good separation between the two groups, especially when plotting PC 1 versus PC 3 (Figure 4). Given that PCA is an unsupervised method, it cannot be used for calculating figures of merit such as sensitivity and specificity values but only for inspecting the relation between the samples.
To quantify the classification accuracy resulting from SERS spectra of urine, we performed PCA-LDA using the first 7 PC score values as input. Together, the first 7 PCs accounted for 82% of the variability in the data set (Figure S2). For each sample, LDA calculated a discriminant value corresponding to each group and the sample was assigned to the group having the highest discriminant value (Figure 5a). Thus, the points in Figure 5a that sit above the dashed line were assigned to the breast cancer group, while the points below the dashed line were assigned to the control group. The Hotelling T2 (T-squared) values corresponding to the control group and the breast cancer group are presented in Figure S3 and Figure S4, respectively.
The confusion matrix corresponding to the PCA-LDA of SERS spectra from urine samples is presented in Figure 5b and it corresponds to a specificity of 95%, a sensitivity of 81% and an overall accuracy of 88%. For comparison, in the study by Del Mistro et al. on n = 20 urine samples from prostate cancer patients, the authors reported an overall accuracy of 95% [11], while in the preliminary Raman study on urine from rats with breast cancer the sensitivity and specificity reported by the authors were 80% and 72%, respectively, using unprocessed urine and 78% and 91%, respectively, using concentrated urine [10]. Thus, the results of our study are in line with previous reports on the use of Raman and SERS spectroscopy of urine for discriminating between cancer patients and controls, which reported similar figures of merit [10,11]. Given that our dataset contained very few urine samples from patients with stage I breast cancer and none from patients with stage IV breast cancer, we did not include the stage of the cancer in our analysis. Nonetheless, Bonifacio et al. showed that SERS spectra of filtered serum acquired from breast cancer patients enable the classification of samples according to cancer stage [14]. Whether the SERS spectra of urine also display features that are specific to cancer stage will require further studies.
These results pave the way for future studies aiming to validate these preliminary findings in the clinical setting. To this end, investigators should enrol patients prospectively and they should also check the discrepancies in the results yielded by different laboratories in multicentric trials. The latter obstacle is especially difficult to overcome, given the notorious sensitivity of SERS to even slight modifications of the experimental setup. Nonetheless, a better understanding of the underlying mechanisms behind SERS might allow the successful clinical translation of SERS in the near future.