2.1. Sample Cleanup in the SPE System
The goal of this section of the study was to compare our in-house optimized C18 protocol (for a detailed protocol, see Section 3.3), developed for the purification of hydrophilic peptide samples, with a number of other SPE methods. The C18 method differs from the reference method in three respects: (i) the cartridge and the buffers (except the elution buffer) were cooled to 4 °C instead of being used at room temperature, (ii) the ion-pairing reagent for loading and washing was changed from trifluoroacetic acid (TFA) to heptafluorobutyric acid (HFBA), and (iii) a third elution step was added with formic acid (FA) instead of TFA to reduce ion-pairing effects (Figure 1a).
Performance was evaluated on fractionated plasma samples using multiple characteristics, both qualitative and quantitative. The tested SPE methods were as follows: C18 (C18 sorbent with the in-house optimized method for hydrophilic species), TopTip (graphite sorbent), Pierce (graphite sorbent), and Cotton (cotton sorbent with HILIC characteristics). Moreover, two combined methods were tested in which the flow-through of the C18 cleanup was further purified with the respective graphite resin (C18 + TopTip and C18 + Pierce). For a visual demonstration of the combined methods, see Figure 1b. The C18 sorbent used with the manufacturer’s protocol was chosen as the reference method. Each method was carried out in triplicate using 1 µg of fractionated plasma (for exact protocols, see Section 3.3).
First, the performance of the seven SPE methods was compared using qualitative measures: the number of proteins, peptides, glycopeptides, and glycosylation sites detected. These are shown in Figure 2.
The performance of the different SPE methods was markedly different. The C18 method showed the best performance regarding peptide detection, with more than 800 peptides detected on average (Figure 2a). The other C18-based methods were slightly worse (between 700 and 750 peptides detected), while the Pierce, TopTip, and Cotton methods performed significantly worse (fewer than 500, 600, and 700 detections, respectively). These differences, however, were much smaller at the protein detection level, where the difference between the best and worst methods (C18 and Pierce) was only 25%, compared to 43% for peptides. The C18 method showed better performance than the Cotton and both graphite-based (Pierce, TopTip) sorbents (Figure 2b); on average, 41–49 proteins were detected with the other sorbents, compared to 55 with the C18 method. It also showed a ca. 10% detection gain compared to the reference method with the same sorbent. These small differences in protein detection are attributed to the relatively low complexity of the samples.
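The protein-level percentage quoted above is consistent with expressing the gap relative to the best-performing method. The following minimal sketch reproduces that arithmetic from the average protein counts given in the text. It assumes that the lower end of the 41–49 range belongs to the worst method and that the denominator is the best method's count; neither is stated explicitly, and the function name `relative_drop` is purely illustrative. Exact average peptide counts are not given in the text, so the peptide-level figure is not recomputed here.

```python
def relative_drop(best: float, worst: float) -> float:
    """Fraction by which `worst` falls short of `best` (assumed definition)."""
    if best <= 0:
        raise ValueError("best must be positive")
    return (best - worst) / best


# Average protein counts quoted in the text: 55 (C18) and 41 (lowest of the others).
protein_gap = relative_drop(best=55, worst=41)
print(f"protein-level gap: {protein_gap:.0%}")
```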
Next, we combined the C18 method with the graphite-based setups to see whether an additional gain could be achieved by purifying the flow-through and combining the two elution fractions (Figure 1b). Both combined methods showed inferior performance to the C18 and reference methods. One possible explanation is that solvent evaporation is carried out twice in the combined methods, thus decreasing peptide recovery after purification (Figure 2a). Note that 20–30% sample loss is expected during solvent evaporation with a heated vacuum centrifuge (unpublished data). However, this loss is strongly influenced by the organic solvent content, the volume, and thus the time of evaporation. A higher organic solvent content results in a larger interaction surface for non-specific binding with the tube walls and increases the likelihood of droplets escaping the tube, while longer evaporation times facilitate peptide degradation and permanent interactions with the tube walls. In our method, we evaporated the solvents after the first step; the elution fraction of the second step was then pipetted into the same Eppendorf tube, and a second evaporation took place. Thus, two evaporation steps were performed for both parts of each sample in the combined methods, in contrast to the reference and C18 methods, where only one drying down/reconstitution step is performed between the purification and the measurement. The additional sample loss caused by the second evaporation has to be offset by the gain attributable to the second purification step. In the presented combinations, the evaporation loss was larger than the gain from the second purification, thus decreasing the number of detected peptides and proteins.
Although the graphite-based SPE methods performed worst for the cleanup of non-glycosylated species, they were among the best for glycopeptides (Figure 2c). The TopTip method was significantly better than any other method (on average, over 45 glycopeptides were detected), while the Pierce and the C18 methods also showed good performance (38 and 39 detections, respectively). The Cotton method performed the worst (60% fewer detections than the TopTip method), contrary to our expectations based on its frequent use as a glycopeptide enrichment method. The number of detected glycosylation sites was highly similar for all methods except for the Cotton method; this deviation is presumably due to the low reproducibility of that method (indicated by the high standard deviation in Figure 2c).
Possible differences in selectivity between the SPE methods were assessed by comparing the detected glycosylation sites; the glycan types for glycopeptides; and the peptide length, hydrophobicity (Grand Average of Hydropathy, GRAVY, scores), and isoelectric point distributions for peptides.
Regarding glycopeptides, no differences in selectivity could be detected. The glycan types detected were mostly bi-, tri-, and tetra-antennary complex types for each method, and all of them contained both non-sialylated and highly sialylated variants, which suggests no major differences in selectivity towards glycans (Table A1).
Similarly, peptide hydrophobicity (described by the GRAVY score) and isoelectric point distributions did not reveal significant differences in selectivity towards peptide backbones (Figure A1). GRAVY score and isoelectric point distributions showed excellent correlation between all methods (mean correlation coefficients of 0.995 and 0.998, respectively). However, peptide length distributions differed between the C18-containing sorbents and the Pierce, TopTip, and Cotton methods (Figure 3).
The shift in the peptide length distribution towards smaller peptides for the Pierce, TopTip, and (to a lesser extent) the Cotton methods (Figure 3), combined with the lower peptide detection numbers (Figure 2a), suggests that these methods are selective towards shorter peptides and that a significant number of larger peptides were lost. For the graphite-based sorbents, this is in line with their retention characteristics and suggests that the retention is dominated by polar interactions in this solvent system. This hypothesis is supported by the fact that peptide length distributions were heavily affected, while GRAVY score (average hydrophobicity normalized to peptide length) distributions were not (Figure A1).
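To illustrate why a length-selective loss can leave the GRAVY distribution unchanged, the sketch below computes the GRAVY score as the mean Kyte-Doolittle hydropathy of the residues in a peptide. It assumes the standard Kyte-Doolittle scale; the text does not state which software was used to calculate the scores, and the peptide sequences are made up for illustration.

```python
# Kyte-Doolittle hydropathy index per amino acid (one-letter codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_sum(peptide: str) -> float:
    """Total hydropathy of a peptide; grows with peptide length."""
    return sum(KYTE_DOOLITTLE[residue] for residue in peptide.upper())


def gravy(peptide: str) -> float:
    """Grand average of hydropathy: total hydropathy divided by length."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    return hydropathy_sum(peptide) / len(peptide)


# Made-up sequences: the long peptide repeats the short one three times.
short_peptide = "AVLK"
long_peptide = short_peptide * 3

for peptide in (short_peptide, long_peptide):
    print(peptide, len(peptide), round(hydropathy_sum(peptide), 2), round(gravy(peptide), 3))
```

In this toy case the two peptides differ threefold in length and total hydropathy but have the same GRAVY score, so a method that loses long peptides shifts the length distribution without necessarily shifting the GRAVY distribution.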
Overall selectivity differences between the methods were addressed by comparing all the detected peptides on Venn diagrams. When comparing the C18 method with the TopTip and Cotton methods, only a minor selectivity difference was observed. More than 70% of all peptides were detected using the C18 method, whereas more than 30% were not detected with either TopTip or Cotton (Figure A2a). On the other hand, the unique detections with the TopTip and Cotton methods were around 10% of all peptides, which is comparable to the variability attributed to the data-dependent acquisition mode. Furthermore, there was no significant additional selectivity gain from using another type of stationary phase combined with the C18 method (Figure A2b).
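The Venn-diagram comparison amounts to set operations on the lists of identified peptides. The following sketch shows one way to obtain the shared and unique fractions for two methods, expressed relative to all peptides detected by either method. The function name, the choice of denominator, and the peptide identifiers are illustrative assumptions, not the authors' actual processing.

```python
def venn_fractions(a: set, b: set) -> dict:
    """Shared and unique fractions of two peptide sets, relative to their union."""
    union = a | b
    if not union:
        raise ValueError("at least one set must be non-empty")
    total = len(union)
    return {
        "shared": len(a & b) / total,
        "only_a": len(a - b) / total,
        "only_b": len(b - a) / total,
    }


# Hypothetical peptide identifiers for two cleanup methods.
c18_peptides = {"pep1", "pep2", "pep3", "pep4"}
toptip_peptides = {"pep3", "pep4", "pep5"}

print(venn_fractions(c18_peptides, toptip_peptides))
```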
Next, the performance of the different SPE methods was compared quantitatively. For this comparison, MaxQuant LFQ (Label-Free Quantitation) intensity values were used, and recovery values were calculated for each method relative to the reference method for proteins quantified in all the samples (28 in total). The recovery value and LFQ intensity relative standard deviation (RSD) distributions are presented in Figure 4.
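The two quantities can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure used: recovery is taken as the ratio of mean replicate intensities (method over reference), RSD as the sample standard deviation divided by the mean, and a protein counts as quantified only if its intensity is positive in every replicate of every method. The intensity values and protein names are hypothetical.

```python
from statistics import mean, stdev


def shared_proteins(data: dict) -> set:
    """Proteins with a positive intensity in every replicate of every method."""
    per_method = [
        {protein for protein, reps in proteins.items() if all(v > 0 for v in reps)}
        for proteins in data.values()
    ]
    return set.intersection(*per_method)


def recovery(method_reps: list, reference_reps: list) -> float:
    """Mean intensity of a method relative to the reference method (assumed definition)."""
    return mean(method_reps) / mean(reference_reps)


def rsd(reps: list) -> float:
    """Relative standard deviation across replicates (sample standard deviation)."""
    return stdev(reps) / mean(reps)


# Hypothetical LFQ intensities: {method: {protein: [replicate 1, 2, 3]}}.
data = {
    "reference": {"P1": [100.0, 110.0, 90.0], "P2": [50.0, 55.0, 45.0]},
    "C18": {"P1": [120.0, 115.0, 125.0], "P2": [40.0, 0.0, 42.0]},
}

for protein in sorted(shared_proteins(data)):
    for method, proteins in data.items():
        rec = recovery(proteins[protein], data["reference"][protein])
        print(protein, method, round(rec, 2), round(rsd(proteins[protein]), 3))
```

Restricting the comparison to proteins quantified in all samples keeps the recovery and RSD distributions comparable across methods, at the cost of a smaller protein set.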
The recovery value distributions (Figure 4a) show great variation between the different methods. For the C18-containing methods, most recovery values were between 0.8 and 1.2. However, for the graphite sorbents and the Cotton method, they showed a much wider distribution, which suggests uncontrolled binding and elution performance for the different peptides. The RSD values also varied greatly between the different methods (Figure 4b). The C18-containing methods clearly showed superior repeatability compared to the other methods, with the majority of RSD values under 0.1. Overall, the quantitative comparison suggests that there are differences in selectivity between the different methods (especially between the C18-containing and the other three methods) and that C18-based methods are more suitable for comparative proteomics due to their excellent repeatability.
In summary, the in-house optimized C18 method outperformed all other methods for the cleanup of heavily glycosylated samples in terms of peptide and protein detection and quantitation. This method has also proven highly useful in our laboratory for other sample types, such as FFPE tissues [18], cell lines, and extracellular vesicles (unpublished data). The graphite-based methods showed slightly different selectivity than the C18-based methods, and the TopTip method was best for the detection of glycopeptides. On the other hand, the combined (C18 and graphite) methods showed similar behavior to the one-step C18 method, which implies that the addition of different retention mechanisms did not improve performance. Finally, the Cotton method showed poor overall performance except for good repeatability for quantitation. The performance of the different SPE methods is summarized in Table 1.
2.2. Reversed-Phase Gradient Separation
During gradient development, we compared the performance of five different gradients. The gradient programs are summarized in Section 3.4. As the initial slope was presumed to have a crucial effect on the separation of the hydrophilic region, we designed methods with initial slopes ranging from 0.21 to 0.51% acetonitrile/min (Table 2).
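The initial and average slopes can be derived from a gradient program as the change in acetonitrile percentage divided by the elapsed time. The sketch below shows this calculation for a two-step program. The time points are placeholders chosen for illustration only; the actual programs are those given in Section 3.4, and the function names are hypothetical.

```python
def segment_slopes(points: list) -> list:
    """Slope (% acetonitrile/min) of each linear segment of a gradient program.

    `points` is a chronological list of (time in min, % acetonitrile) pairs.
    """
    slopes = []
    for (t0, b0), (t1, b1) in zip(points, points[1:]):
        if t1 <= t0:
            raise ValueError("time points must be strictly increasing")
        slopes.append((b1 - b0) / (t1 - t0))
    return slopes


def average_slope(points: list) -> float:
    """Overall slope between the first and last point of the program."""
    (t0, b0), (t1, b1) = points[0], points[-1]
    return (b1 - b0) / (t1 - t0)


# Placeholder two-step program: 4% -> 20% -> 40% acetonitrile.
# The time points (0, 80, 120 min) are illustrative, not the published values.
two_step = [(0, 4), (80, 20), (120, 40)]

print("segment slopes:", segment_slopes(two_step))
print("initial slope:", segment_slopes(two_step)[0])
print("average slope:", average_slope(two_step))
```

A two-step program of this kind decouples the initial slope, which governs the separation of early-eluting hydrophilic peptides, from the average slope, which determines how peaks spread over the whole elution window.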
The most important factor influencing the repeatability and detection performance of the chromatographic method is the distribution of peaks throughout the elution window, which depends on both the initial and the average slope of the gradient. The 2step 4-20-40 method provided an ideal peak distribution across the whole elution window (Figure 5b). The peak distributions obtained with the other two lower-slope methods (Lin 4–27 and 2step 4-25-40) were also close to ideal. However, their higher initial slopes (0.256 and 0.280% acetonitrile/min, respectively) generally decreased retention times, producing a time window scarce in peaks (95–110 min, Figure 5a,c). This difference is well reflected in the detection rate; more peptides were detected using the 2step 4-20-40 method after 90 min, especially in the 110–120 min region, than with the other two above-mentioned methods. This effect is even more pronounced for the two gradients with a larger average slope (Figure 5d,e). The shallow first gradient step in the 3step 4-15-35-50 method resulted in an ideal peak distribution in the first 60 min, but the high second slope caused a stacked peak distribution between 70 and 80 min and a region scarce in peaks between 85 and 110 min. Finally, the large linear slope of the Lin 4–50 method resulted in the elution of most peptides in the first 70 min in a visibly stacked manner, leaving a region lacking peaks after 70 min. For these two methods, the unfavorable peak distribution resulted in a lack of detected peptides after 90 min and 78 min, respectively. The ion suppression caused by the stacked peak distribution made peptide detection less effective between 70 and 80 min for the 3step 4-15-35-50 method and between 50 and 80 min for the Lin 4–50 method.
The number of detected peptides ranged from 896 to 971 across the different gradients (Table 3). Generally, gradients with a lower overall average slope (Lin 4–27, 2step 4-20-40, and 2step 4-25-40) resulted in better peptide detection numbers due to the near-optimal distribution of peaks through the elution window.
Peptide and glycopeptide detections up to 90 min correlate well with the average slope up to that point. The three shallower gradients (2step 4-20-40, Lin 4–27, and 2step 4-25-40 in increasing order of slope) showed a decreasing trend with increasing slope due to co-elution and ion suppression of some components. The two higher-slope gradients resulted in higher detection rates that were almost identical, since they had the same average slope and almost all the components eluted within 90 min in both cases. Note that with the shallower gradients, on average 102–169 peptides and up to 21 glycopeptides were detected in the last part of the elution window, while practically no additional detections occurred in that section when using the higher-slope gradients. Similar trends could also be observed when analyzing the detection rates up to 60 min.
Surprisingly, the differences in peptide and glycopeptide detection were not reflected in protein and glycosite detection. The methods with a higher average slope allowed for slightly fewer detected proteins and slightly more glycosites, but the differences were not significant (Table 3).
Selectivity and quantitation performance can also be affected by the distribution of peaks in the different (hydrophilic and hydrophobic) regions of the chromatogram. However, these differences are inherently smaller than those seen with different sorbents for SPE. Small differences were observed with regard to the peptide length and GRAVY score distribution of peptides. Regarding quantitation, a gain of only 3% in the average peak areas was observed when using the optimal gradient as compared to the 3step 4-15-35-50 method. Additional information is provided in Appendix A.
In summary, gradients with shallower starting conditions resulted in better peak distribution not only in the first part but throughout the whole elution window. The 2step 4-20-40 method provided the best peak distribution, quantitation performance, and repeatability across all investigated measures (Table 4).