3.2. Overnight Resting of PBMC Samples Increases the Magnitude and Statistical Significance of Responses to Individual Peptides of the A2-CEF Pool
To more effectively measure the effect of ON resting, each of the A2-CEF peptides included in the A2-CEF pool was tested individually (
Figure 3 and representative images for EBV-1 in
Figure 4B). TRP-2, a tumor associated antigen, was also included as a control. As summarized in
Figure 3B, after ON resting, donors not only had a greater number of strongly significant responses to peptides, but also developed new moderately significant responses (left). Further, the magnitude of response upon resting increased significantly in multiple cases. The raw spot counts for each of the triplicate wells used for the evaluation of the peptides and for each of the six replicates used for background control are shown in
Supplementary Table 1.
Looking at specific examples, the response to EBV-2 by H10 is statistically significant only when cells are rested, and the statistical significances for EBV-2 and EBV-1 by H12 change from moderate (yellow) to strong (green) when cells are rested for at least 18 h. The responses to Inf A, EBV-2 and EBV-1 by H4 are significant only after resting 22 h, albeit only moderately so (
Figure 3B, left). Further, the magnitude of the response was found to be strongly statistically significant between no resting and either 18 h or 22 h of resting for Inf A and EBV-1 by H3, Inf M and EBV-2 by H4 and EBV-1 by H10 (
Figure 3B, right). Out of the total of 20 responses to the individual viral peptides evaluated in the four donors (4 donors × 5 single viral peptides), four responses (20%) were significantly different at 18 h compared to no resting and eight (40%) were significantly different at 22 h compared to no resting. Furthermore, the magnitude of the response was only statistically different between 18 and 22 h for EBV-1 by donor H10. This analysis suggests that 22 h of resting could be even more advantageous than 18 h, but an 18-h resting period appears to be sufficient for evaluating the T-cell responses to individual peptides. Of interest is the fact that the impact of resting the samples was less pronounced in donor H12, whose samples had been stored in liquid nitrogen for a shorter time than those of the other three donors tested. Future studies including more samples and more variable storage periods could be considered.
Figure 2.
The magnitude of the response to the A2-CEF peptide pool in PBMC samples from normal donors increases with overnight resting. PBMC samples from the indicated donors were thawed and seeded in the presence or absence (no Ag, background control) of the A2-CEF peptide pool after 0 h, 18 h or 22 h of resting. (A) Results are the average of spots from triplicate wells. (B) Statistical significance for the responses to A2-CEF as compared to No Ag at each resting time was determined by modified DFR(2x) (DFR, distribution free resampling) after Westfall–Young max-T correction; p-values <5% are shown in green. The statistical significance of the responses obtained for the three different resting time points was determined by a DFR-like permutation method with Westfall–Young max-T correction; 5% or 10% significances are shown in dark blue or light blue, respectively.
Figure 3.
Resting of PBMC samples prior to the evaluation of individual peptides of the A2-CEF pool increases the magnitude of response detected by IFNγ ELISPOT. PBMC samples from the indicated donors were thawed and seeded in the presence or absence (no Ag, background control) of peptide. The peptides included TRP-2 and each of the individual peptides from the A2-CEF pool. PBMCs were not rested (0 h) or rested 18 h or 22 h prior to testing. (A) Results are the average of spots from triplicate wells for wells tested with peptide and from six replicates in the absence of peptide. (B) The statistical significance for the response at each of the rest times was determined by modified DFR(2x) or DFR(eq) after Westfall–Young max-T correction, and p-values <5% are shown in green or yellow, respectively. The statistical significance of the responses obtained for the three different resting time points was determined by a DFR-like permutation method with Westfall–Young max-T correction; 5% or 10% significances are denoted by dark blue or light blue, respectively.
Figure 4.
Representative images of the IFNγ ELISPOT performed with previously frozen PBMC. PBMC samples from donors H3, H4, H10 and H12 were thawed and stimulated (A) with the A2-CEF pool or (B) with EBV-1. PBMCs were not rested (0 h) or rested for 18 h or 22 h prior to testing. Each picture is a representative of triplicate wells. Responses that were significantly different from 0 h were determined by a DFR-like permutation method with Westfall–Young max-T correction; 5% or 10% significances are denoted by (**) or (*), respectively.
As can be observed in
Figure 3B, many responses remained undetectable, even when resting the cells, such as the response to HCMV by H4. Indeed, out of the 14 responses that were negative without resting among all four donors, nine remained negative and five became positive after 18 or 22 h of resting (
Figure 3B). The only responses that were detected without resting and not detected with resting were those to HCMV and Inf M by donor H3; however, they were only moderately statistically significant, and this lack of detection after resting was not reproducible in an earlier experiment. In that experiment, moderate (Inf M) or strong (HCMV) significance was also observed after resting 22 h; see
Supplementary Table 2. Lastly, no responses to TRP-2 were detected by any of the subjects, at any time point. This finding was not surprising, since TRP-2 is a tumor-associated antigen and the PBMCs used in this study were obtained from healthy donors. These results emphasize the specificity of resting effects on antigen-specific responses and demonstrate that resting does not cause a non-specific increase of spot counts or induce false positives.
3.4. Discussion
In this study, a condensed version of the CEF peptide pool, containing only five A2-restricted peptides (A2-CEF), was used. Clearly, the strong significance level of A2-CEF responses observed in this study indicates that the A2-CEF peptide pool can be an effective control for CD8+ memory responses in HLA-A2 subjects and may be of interest in light of cost-preserving measures. In immune monitoring efforts that include large panels of clinical samples with the need for confirmatory assay repeats, this can represent a significant cost saving.
Equally apparent in the data is the advantage of resting the cells.
Figure 2A and
Table S1 clearly show that the number of spots in wells tested with peptide is higher in most cases after resting. In addition to the response to EBV-2 by H10 being undetected without resting, but detected with strong significance after resting 18 h, and the responses to Inf A, EBV-2 and EBV-1 by H4 being moderately detected only after 22 h resting, there was also an improvement in the significance of the detected stimulation from moderate to strongly significant in the responses to EBV-2 and EBV-1 by H12. Indeed, of the nine strongly significant results for individual peptides after resting 18 h, only two thirds were strongly significant without resting. The advantages to increasing the level of detected significance should not be underestimated, because, in general, moderately statistically significant results are less likely to reproduce than strongly significant results; as previously reported [
19], DFR(eq) has a false positive rate of 10.7%, over five times higher than that of DFR(2x) (2.0%). Direct comparisons between spot counts associated with not resting
versus resting also support this conclusion, with eight subject/stimulus combinations showing at least moderate significance between no resting and either (or both) 18 and 22 h of resting. Note that these significances do not always track with increases in significance overall. If the sample is already significant without resting, for example, then the significance between resting times will not be evident in the significance at each resting time. Conversely, because significances between resting times were the results of comparisons between two samples of a size of three, rather than three
versus six background samples, as was the case of measuring the significance at each resting time, a change in significance after resting does not necessarily imply significance between those two measurements.
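The effect of replicate number on the two kinds of comparison can be illustrated with a short counting sketch. It is a simplified illustration rather than the analysis used in this study: it assumes a single comparison considered in isolation, an exact one-sided permutation test, and no tied statistics. The function name `min_permutation_pvalue` is hypothetical. The jointly corrected p-values reported here do not reduce to this count, but the sketch shows why a 3 versus 3 comparison is inherently coarser than a 3 versus 6 comparison.

```python
from math import comb


def min_permutation_pvalue(n_sample, n_control):
    """Smallest one-sided p-value that an exact permutation test can return
    for one isolated comparison, assuming no ties among relabelings."""
    # Number of distinct ways to relabel the pooled wells into the two groups.
    n_relabelings = comb(n_sample + n_control, n_sample)
    # Only the observed labeling itself is guaranteed to be at least as extreme.
    return 1 / n_relabelings


# Comparison between two resting times: 3 wells versus 3 wells (20 relabelings).
print(min_permutation_pvalue(3, 3))
# Comparison against background: 3 wells versus 6 wells (84 relabelings).
print(min_permutation_pvalue(3, 6))
```

With fewer possible relabelings, the between-resting-time comparison has less resolution, which is consistent with the observation that a change in significance category after resting need not be accompanied by a significant difference between the two resting times.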
Overall, moderately significant responses (yellow) were found in wells in which the number of spots was low (less than approximately 35). An excellent example of one way in which resting is advantageous to such a moderately significant sample is shown through analysis of the response of donor H12 to EBV-2. The response to the peptide did not result in a particularly high spot count at 0 h, and the average spot count did not increase after resting; rather, there was a decrease in the number of background spots. Reduction in the number of spots for the background wells after resting for 18 and 22 h contributed to the increase of the statistical significance of the difference between background wells and peptide response (
Supplementary Table 1), even though the unrested background number of spots was not unreasonably high. This shows that the background can be “cleaned up” by resting. Another observed example of such “cleaning up” was the removal of testing artifacts after resting (Figure 1S). Although this finding was observed in only one out of the four donors included in this study, it suggests that when analyzing a large number of donors, a nontrivial percentage of the samples may present artifacts in the background and, thus, will benefit from ON resting for the determination of peptide-specific T-cell responses.
Both of these observations, the increase of spot counts and the decrease of artifacts, underline the applicability of the previously described benefits of ON resting. Small artifacts in ELISPOT are most commonly caused by dying cells during the assay incubation time. As Kutscher
et al. convincingly showed, cells die during prolonged resting periods [
18]; hence, fewer apoptotic cells are added to the actual assay, giving less cause for artifacts. Indeed, the percent recovery determined in the presented study also supports that observation. The median recovery of the seeded “live” cells for all samples after resting was 67%. Although, in this study, apoptotic markers were not used to confirm that these lost cells were undergoing apoptosis, our data show that resting the cells is beneficial for reliable determination of peptide-specific T-cell responses. Future studies comparing the performance of freshly isolated PBMC with rested cryopreserved samples could be performed in order to determine which of the tested resting conditions more accurately resembles the performance of freshly isolated PBMC. In addition, the fact that cells were cryopreserved at a high concentration (100 × 10⁶ cells/vial) could have contributed to the effect of overnight resting; however, the results presented here clearly show that samples that are potentially impaired due to processing steps can benefit greatly from overnight resting prior to testing. ELISPOT results using rested PBMC samples derived from clinical studies that have been cryopreserved at a concentration of 5 to 20 × 10⁶ cells/vial have shown, in general, low background responses and variance (manuscript in preparation) [
22].
It should also be noted that the improvements associated with ON resting are not simply an alternative to using more cells, even if it is practical to do so (i.e., cell scarcity is not an issue). A doubling of all values in this study would lead to exactly the same significances, since the factor of two would cancel in the numerator and denominator of the test statistic. A doubling of the number of replicates would not offer substantial benefit either, as it would only improve non-significant samples with high variance. Because it results from the multiple aforementioned biological processes that occur during ON resting, the improvement in detectability of responsive cells presented here is far more universal in its benefit, especially since non-responders continue to be non-significant rather than presenting as false positives.
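The cancellation of a common factor can be checked numerically. The following is a minimal sketch, assuming the standardized difference takes the pooled-variance form used in the Student's t-test; the exact form used in this study is not spelled out here, so the function `standardized_difference` and the spot counts are hypothetical and for illustration only.

```python
import math
from statistics import mean, variance


def standardized_difference(sample, control):
    """Difference in means divided by its standard error
    (pooled-variance form, an assumption made for this sketch)."""
    n1, n2 = len(sample), len(control)
    pooled_var = ((n1 - 1) * variance(sample)
                  + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(sample) - mean(control)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


# Hypothetical spot counts, not data from this study.
peptide_wells = [42, 51, 47]
background_wells = [6, 9, 4, 7, 5, 8]

t_original = standardized_difference(peptide_wells, background_wells)
t_doubled = standardized_difference([2 * x for x in peptide_wells],
                                    [2 * x for x in background_wells])

# Doubling scales the mean difference and the standard error by the same
# factor, so the two statistics agree up to floating-point rounding.
print(t_original, t_doubled)
```

Because the statistic is unchanged under a common scaling of all counts, every permutation of the doubled data yields the same statistic as the corresponding permutation of the original data, and the resulting p-values are identical.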
Furthermore, using polychromatic flow experiments, Kutscher et al. demonstrated that the quantity, as well as the quality of T-cells responding to viral antigens changes after resting. As the mono-functional T-cell fraction decreased upon ON resting, the fraction of multi-functional T-cells, as well as their antigen sensitivity increased. Importantly, tetramer assessment revealed that the actual number of TCR-specific T-cells does not change during the resting period. This fact is important for obtaining reliable estimates of true precursor frequencies of antigen-specific T-cells, even after ON resting.
An enlightening insight into the mechanism related to the benefits of resting cells before functionally assessing them (such as the increase in responses observed in the study presented here) is given by Roemer
et al. [
13]. Their work is based on the acknowledgement that the cells typically assessed in most studies consist of circulating cells (PBMC), which contain less than 1% of the body’s T-cells. It had already been shown in the murine model that T-cells entering the circulation lose their primed stage [
23]. Roemer et al. show that the expression of phosphotyrosine, a key player of the signal complex assembly, is low in circulating T-cells, but is regained after high density culture, correlating with a recovery of CD4 functionality. The cellular interactions during the high density resting period were necessary for functional maturation, including the provision of weak TCR signals from HLA scanning [
24], as well as the upregulation of the sensitivity to TCR signals [
25]. Similar effects of resting periods were recently presented for CD8+ cells, postulating that high density preculture of PBMC resets cells to a tissue-like state with a proper functioning signaling platform [
26].
Interestingly, the data presented here show that
in vitro culture of PBMCs for 13 days results in qualitative detection of antigen-specific T-cells similar to that obtained after 22 h of resting. There was only one case in which IVS led to the detection of a significant response that was negative when tested
ex vivo (H12 with HCMV,
Supplementary Table 2).
The ELISPOT results associated with this study comprised a total of 164 separate comparisons between stimulated and background control wells, over all subjects and stimuli. Studies that analyze large datasets to determine a significant change in response, but that use arbitrary criteria, such as doubling of spot numbers [
27] rather than
p-values and statistical significances
versus background, lack analytical rigor. In particular, these methods are incapable of giving
p-values and correcting for multiple comparisons and, hence, cannot quantify the Type I error (
i.e., the theoretical false positive rate) of the study. The importance of accounting for multiple comparisons in the determination of statistical significance has been previously highlighted [
19,
28]. Also highlighted in these publications were the difficulties associated with differing orders of magnitude of response; the standard DFR framework, which uses the difference in the log of the average spot number as its test statistic, corrects within a given stimulus, but does not correct between stimuli, because differences in the magnitude of two stimulus responses can disrupt the ability of a permutation method, like Westfall–Young correction, to properly account for sample randomness. Herein, we have attempted to overcome this issue by using an alternative test statistic, namely the standardized difference between the sample and control means. It should be emphasized that this statistic was the only change to the existing DFR framework of generating
p-values using Westfall–Young correction; although this statistic is used in the Student’s
t-test, the Student’s
t-test was found previously to be suboptimal for these analyses [
19], likely because of the failure of the ELISPOT data to be normally distributed. When this statistic is used in the context of Westfall–Young correction, however, it was quite effective at generating adjusted
p-values across stimuli; the standardized difference between the sample and control means is capable of making data with different orders of magnitude comparable by normalizing the count difference by the standard deviation, placing all data on the same scale (
i.e., the number of standard deviations above the control). Thus, in this study, all
p-values were simultaneously corrected across all experiments, strengthening any findings of significance in the process. In particular, DFR(2x) would have indicated 87 total significant results; modified DFR(2x) indicated only 72, with 10 becoming only moderately significant and five losing significance altogether; DFR(eq) indicated 14 significant results in addition to the above, and 11 of them lost significance in modified DFR(eq).
p-value corrections of IVS results were more similar to DFR, with five DFR(eq) significant values becoming non-significant and no change in significance in those significant in the DFR(2x) analysis. In general, decreasing the false positive rate can potentially increase the false negative rate, but the correction makes the significance results in this study extremely unlikely to be false positives associated with multiple comparisons, and the results are thus more representative of the pre- and post-resting detection ability in general future contexts.
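The general idea of correcting across stimuli with a standardized statistic can be sketched as a single-step max-T permutation procedure. This is an illustrative sketch under stated assumptions, not the implementation used in this study: it assumes a pooled-variance standardized difference, a one-sided test, independent relabeling of wells within each comparison, and random rather than exhaustive permutations. The names `standardized_difference` and `maxt_adjusted_pvalues` and all spot counts are hypothetical.

```python
import math
import random
from statistics import mean, variance


def standardized_difference(sample, control):
    """Mean difference divided by its standard error (pooled-variance form)."""
    n1, n2 = len(sample), len(control)
    pooled_var = ((n1 - 1) * variance(sample)
                  + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    diff = mean(sample) - mean(control)
    if pooled_var == 0:
        # Degenerate relabeling with no spread within either group.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def maxt_adjusted_pvalues(comparisons, n_perm=2000, seed=0):
    """Single-step max-T adjusted p-values (one-sided) for several comparisons.

    comparisons maps a label to a (stimulated wells, background wells) pair.
    """
    rng = random.Random(seed)
    observed = {k: standardized_difference(s, c) for k, (s, c) in comparisons.items()}
    exceed = dict.fromkeys(comparisons, 0)
    for _ in range(n_perm):
        # Relabel wells within every comparison and keep the largest statistic.
        max_stat = -math.inf
        for sample, control in comparisons.values():
            pooled = list(sample) + list(control)
            rng.shuffle(pooled)
            stat = standardized_difference(pooled[:len(sample)], pooled[len(sample):])
            max_stat = max(max_stat, stat)
        # Each observed statistic is judged against the permutation maximum.
        for k in comparisons:
            if max_stat >= observed[k]:
                exceed[k] += 1
    return {k: (exceed[k] + 1) / (n_perm + 1) for k in comparisons}


# Hypothetical spot counts, not data from this study.
wells = {
    "peptide A": ([42, 51, 47], [6, 9, 4, 7, 5, 8]),
    "peptide B": ([9, 5, 7], [6, 9, 4, 7, 5, 8]),
}
adjusted = maxt_adjusted_pvalues(wells)
print(adjusted)
```

Comparing every observed statistic with the maximum over all comparisons is what controls the family-wise error rate, and the standardization is what allows statistics from stimuli with very different spot counts to share one reference distribution.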
Previous studies have demonstrated the benefits of ON resting [
14,
15,
29]. Furthermore, two large HIV networks, the HVTN (HIV Vaccine Trials Network) and the IAVI (International AIDS Vaccine Initiative), have included ON resting in their SOPs [
30,
31]. However, it should be noted that a recent study [
27] has presented an analysis of a seemingly similar experimental set-up without coming to this conclusion. In addition to having methodological differences, such as sample condition, freezing and ON resting methodologies, and differing cut-off criteria, Kuerten
et al. approached the analysis and interpretation of the data in a manner that is not able to accurately reflect the benefits of ON resting. In particular, by applying a single statistical test to all samples in a given response category, the authors limit themselves to determining whether ON resting causes an overall increase in spot count for all samples. However, stating that the overall effect of ON resting is not significant because not all samples increase in spot count essentially penalizes samples for not improving after resting. For this to be a valid means of analysis, all samples would have to be expected to yield a positive response; this assumption is not true in the case of the antigens used in the present study and is certainly false in the general research and immunomonitoring setting. Since, in general, there will be samples containing no cells with specific reactivity for some of the antigens tested (
i.e., there will be truly negative responders), one needs to specifically focus on the ability of resting the cells to improve the detection of the samples that do contain cells specific for the antigen tested (
i.e., are truly positive responders). Indeed, to avoid false positives, it is a benefit if ON resting does not make true negative responses significant. The fact that ON resting does not increase non-specific reactivity has also been previously reported [
29]. It is important to note that this distinction in behavior between responders and non-responders can also be observed in the results as presented in Kuerten
et al.; a greater overall increase in stronger responders (
i.e., samples that are most likely to have responsive cells) was observed as compared to weak responders (
i.e., samples that may be comprised of truly negative responding cells), implying that the results of these studies may agree and that the difference is a matter of analysis and interpretation of the data. Our analysis demonstrates the improvement in response detection in the most rigorous and non-assumptive statistical way possible, through the generation of distribution-free, multiple comparison-corrected
p-values, and not by simply grouping possibly disparate responder-types based on unrested response; this strict statistical analysis revealed the clear advantage of ON resting.
Importantly, our observed results fit into recently published observations addressing the benefits and outcomes of extended resting of thawed PBMC on their functionality [
13,
18,
32], as demonstrated with different experimental approaches. ELISPOT is one of the most commonly used tests in the immune monitoring setting. Most studies use frozen PBMC for batch testing. Freezing and thawing procedures can induce an increased rate of apoptosis, leading to a dilution of the test population with non-responsive cells, which may further impact the antigen processing. Further, there is only limited feasibility of obtaining cells from relevant tissue, e.g., tumor-infiltrating T-cells (TILs) [
33]. If such samples are available, they can only be obtained in small sample sizes, and repeated availability (as for numerous time points) is close to impossible. The resting approach appears to offer a partial solution to the dilemma in that it can provide conditions that aid in resetting T-cells to a tissue-like state with an improved responsiveness to the antigens of interest. Such improved responsiveness in samples that are relatively easy to obtain (e.g., PBMCs) is essential for reliable immune monitoring in translational research and clinical trials, in order to guide the development of biomarkers and new immunotherapies [
16].
In summary, the results presented here strongly support the implementation of an ON resting step of previously-frozen PBMC samples for the detection of peptide-specific responses by ELISPOT testing, a step that has already been identified as a critical protocol variable in ELISPOT and, hence, has been incorporated in ELISPOT harmonization guidelines [
14,
15,
30]. One of the reasons for the slow adoption of this recommendation has been the fear of losing too many cells. However, a possible lower recovery of cells after resting has to be accepted, since the cells lost were likely undergoing apoptosis and, hence, would not have responded in the assay. As a matter of fact, the recovered population contains fewer apoptotic cells, hence posing a lower risk for impaired antigen processing, and contains cells with a recovered signaling platform for efficient immune functioning. Although viral antigens were used in the presented study, these findings should be easily applicable to samples from clinical trial subjects, especially in light of the questionable sample functionality, due to the often apparent difficulties in obtaining PBMCs in a timely manner. The results presented clearly demonstrate the positive influence of extended resting on the quality of the antigen-specific response detection and its statistical significance.