#### *3.4. Qualification of the Assay Threshold*

An essential aspect of the study was to investigate the assay's ability to predict the potential for unwanted immune responses in line with FDA labels [13]. Accordingly, we characterized our assay in terms of accuracy (overall rate of correct predictions at the compound level), sensitivity (probability of detecting an immunogenic treatment), specificity (probability of correctly identifying a non-immunogenic treatment), and positive/negative predictive value (confidence in assigning either label correctly). To this end, we tested the aforementioned 24 molecules for which clinical data were available. However, since ADA responses in a limited number of patients would not necessarily be considered a relevant risk, we divided the tested molecules into two immunogenicity risk categories according to the ADA rates reported upon treatment: high risk (≥20% reported ADA rate) and low risk (<20% reported ADA rate). This classification was correlated with the proportion of donors in which a given biopharmaceutical triggered CD4+ T cell-driven IFN-γ production in the assay: a positive assay readout was defined as an SI statistically significantly above 2 compared to the blank control, and any other readout was considered negative.

Using 10% as the optimal threshold (>3/30 positive donors according to our criteria), the assay reported 4 true positives (TP) and 16 true negatives (TN) for a total of 24 tested biopharmaceuticals (6 categorized as high risk, 18 as low risk). It categorized 2 antibodies (daratumumab and pembrolizumab) as high risk of immunogenicity even though their clinical ADA rates were below 20% (false positives, FP), while brentuximab and atezolizumab were categorized as low risk of immunogenicity even though their clinical ADA rates were above 20% (false negatives, FN). The accuracy is the sum of true positives and true negatives over the total number of tested compounds, yielding an estimated assay accuracy of 83% (20/24). The sensitivity, TP/(TP + FN), and specificity, TN/(TN + FP), are two additional important estimators, which reflect the two possible types of error. At this threshold, the DC:CD4+ T cell restimulation assay provides 67% sensitivity at 89% specificity, with a positive predictive value of 67% (4/6) and a negative predictive value of 89% (16/18).
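The figures above follow directly from the four confusion-matrix counts. As a minimal sketch of that arithmetic, the reported counts can be plugged into the standard definitions; the helper name `confusion_metrics` and the dictionary layout are illustrative assumptions and not part of the study's analysis.

```python
def confusion_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Return standard binary-classification metrics as floats in [0, 1].

    Hypothetical helper for illustration only; not the authors' code.
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # TP / (TP + FN)
        "specificity": tn / (tn + fp),  # TN / (TN + FP)
        "ppv": tp / (tp + fp),          # TP / (TP + FP)
        "npv": tn / (tn + fn),          # TN / (TN + FN)
    }


# Counts reported in the text at the 10% threshold.
metrics = confusion_metrics(tp=4, tn=16, fp=2, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Because FP and FN are both 2 at this threshold, the denominators for the predictive values (TP + FP = 6, TN + FN = 18) coincide with those for sensitivity and specificity, which is why the two pairs of percentages are identical here.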

#### *3.5. Case Studies in Pre-Clinical Research*

An important motivation for running a DC:CD4+ T cell restimulation assay in a preclinical setting is to derive information on whether compounds in development might be at risk of inducing an immunogenic response in treated patients. In this context, it is important to reduce false negative compound categorization, even at the expense of a higher false positive rate (i.e., over-classifying new molecules in the high immunogenicity risk category). As part of an integrated immunogenicity risk assessment, other risk factors (e.g., peptide presentation, mode of action, etc.) should also be taken into consideration. Our analysis demonstrates that a direct comparison of the responder rates in the DC:CD4+ T cell restimulation assay with the proportion of ADA-positive patients for a given treatment may not provide the best context of use for this assay. Our proposed strategy is to apply a given threshold to interpret results, essentially reducing the assay output to a binary outcome for biotherapeutic immunogenicity hazard identification. This enables us to retain the essential information on compound risk categorization while minimizing the impact of noise in the data. Our data suggest that a threshold of 10% positive responders is the optimal cutoff to flag compounds with high immunogenic potential, while limiting the number of false negatives at an early stage of preclinical development.
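The binary interpretation described here can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the study's implementation: the function name `classify_compound` is hypothetical, and the comparison is taken to be strict (more than 10% of donors), following the ">3/30" wording in Section 3.4.

```python
def classify_compound(n_positive: int, n_donors: int, threshold_pct: int = 10) -> str:
    """Reduce a responder count to a binary immunogenicity-hazard label.

    Assumption: a compound is flagged when the share of positive donors
    strictly exceeds the threshold, mirroring the ">3/30" criterion.
    """
    if n_donors <= 0:
        raise ValueError("n_donors must be positive")
    if not 0 <= n_positive <= n_donors:
        raise ValueError("n_positive must lie between 0 and n_donors")
    # Integer arithmetic avoids floating-point edge cases at exactly 10%.
    exceeds = n_positive * 100 > threshold_pct * n_donors
    return "higher risk" if exceeds else "lower risk"


# Hypothetical donor counts, for illustration only (not study data).
print(classify_compound(3, 30))  # exactly 10% of donors: not above the threshold
print(classify_compound(4, 30))  # about 13% of donors: above the threshold
```

Collapsing the responder rate to a single label discards the magnitude of the response, which is the intended trade-off: small donor-to-donor fluctuations near the cutoff no longer change how a compound is ranked, only whether it is flagged.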

To illustrate the strategy delineated above, we provide here a case study derived from one of our internal programs in which seven potential clinical candidates from the same project, differing in their primary sequence, were tested in the assay (Figure 4a). The results showed that compounds A, B, D, and G were above the threshold, whereas variants C, E, and F were below the threshold and were therefore associated with a lower risk of immunogenicity.

**Figure 4.** Case studies in pre-clinical research. (**a**) Stimulation Index (SI) obtained for 7 different candidate compounds of the same project (named A to G). The change in color, from green to red, depicts the positivity of the donor within the screen. The lower panel represents the proportion of positive donors. The threshold derived from the validation study is set at 10% positive donors. (**b**) DC:CD4+ re-stimulation results obtained for a selection of known T cell epitopes derived from biotherapeutics and their "de-immunized" counterparts.

Furthermore, we demonstrated that this assay is also suitable for testing whether peptides can trigger a CD4+ T cell response. To this end, we tested known T cell epitopes from natalizumab and interferon β, as well as potential deimmunized versions [15,16] (Figure 4b). Peptides were tested at 2 µg/mL, following the same experimental procedure as described in the Materials and Methods section. Results from the assay demonstrate that minor changes in the amino acid sequence of the T cell epitopes could reduce the onset of a CD4+ T cell response, thus confirming the published findings, and also show that this assay can accommodate peptides (e.g., peptide-based biotherapeutics or T cell epitopes).
