1. Introduction
Investigation tools to diagnose pancreatic ductal adenocarcinoma (PDAC) are of unsatisfactory sensitivity and specificity. Indeed, around 10% of patients that undergo surgery for suspected PDAC have benign inflammatory diseases, mainly chronic (CP) or autoimmune pancreatitis (AIP) [
1,
2]. However, these benign diseases do not usually require surgical resection except for compromised quality of life by major pain symptoms, deterioration of nutritional status or inability to determine the exact nature of the lesion. Moreover, AIP can be treated by steroids where a high response to corticosteroid therapy is an important diagnostic criterion. The distinction between AIP and particularly AIP-type 2 and PDAC can be challenging [
1,
2,
3]. An international multicentre survey conducted in 2011, showed that 60% (123 of 204) and 78% (50 of 64) of respectively AIP-1 and AIP-2 were evaluated retrospectively from resected pancreas on suspicion of pancreatic cancer [
4]. The need of new diagnostic tools is crucial to decrease these numbers. Therefore, an accurate diagnosis may pre-empt the misdiagnosis of cancer, allowing the appropriate medical treatment of AIP and consequently decrease the number of unnecessary pancreatic resections.
Few tumour-associated antigens are established in clinical routine as serological markers for discrimination of PDAC and the most frequently used is the carbohydrate antigen CA 19-9 [
5]. Although CA 19-9 is not sufficient enough for the screening of PDAC, it helps for the differential diagnosis between PDAC and pancreatitis, to assess treatment response, prognosis and follow-up of PDAC. Furthermore, Lewis antigen-negative individuals do not secrete CA 19-9 or secrete it in small amounts [
6].
A distinguishing feature of autoimmune diseases and cancer is the expression of disease-associated autoantibodies. Their release into the blood circulation and their assessment could aid early diagnosis of high-risk populations and assist the clinical management of patients. Tumour associated autoantibodies (TAAbs) are promising serum biomarkers for detection of early stage disease [
7,
8,
9,
10,
11]. Indeed, TAAbs can be detected earlier before the disease progresses to an advanced, incurable stage. The mechanisms by which autoantibodies are produced accompanying the development of cancer are complex and poorly understood. The immune system induces immunologic processes causing autoantibody production in response to mutations, overexpression of proteins, altered antigen folding, aberrant degradation, aberrant glycosylation and/or the release of proteins from damaged tissue [
12,
13]. Another explanation is that B cells generate autoantibodies (AAbs) by escaping the self-tolerance checkpoints of the immune system [
14]. AAb may be detectable both at early onset of the disease and in higher concentrations as compared with the tumour antigen itself. Their long presence and synthesis due to limited proteolysis and clearance of the tumour antigen representing an in vivo amplification of signal detection [
15]. As biomarkers, AAbs are highly specific and easily purified from serum which makes them an interesting tool for diagnosis and distinction between different disease groups. High-throughput screenings on larger sample cohorts, including patients with well-defined early stage diseases, are required to differentiate autoantibodies’ expression in malignant and benign pancreatic diseases and define disease specific signatures. Protein microarrays are modern high-throughput tools applicable to detect new disease-specific antigen signatures recognized by AAbs for diagnosis and prognosis, but also to help gain more insight into the molecular nature of diseased conditions [
16].
Applying antigen microarray technology [
17] we describe here the array manufacture process and subsequent analysis procedures leading to the identification of disease specific AAbs through profiling of preoperative sera derived from patients with resectable PDACs, CP, both AIP types, other GI-tract diseases and healthy controls.
3. Discussion
Interest on antibodies as biomarkers for both autoimmunity and cancer has been a subject of debate these last decades [
20,
21]. Profiling on protein microarrays is one of the most powerful techniques that allow large-scale quantitative protein determination in a high-throughput way for initial protein biomarker discovery. The majority of autoantibody profiling investigations determined antigens recognized by autoreactive IgG antibodies. To this end they have been used to identify biomarkers for cancer diseases [
22,
23,
24]. Microarrays assembled of thousands of different proteins have been already reported and used to identify tumour associated autoantibodies in the context of colorectal cancer [
25], bladder cancer [
26], ovarian cancer, and pancreatic cancer [
8,
27]. Furthermore, several studies described elevated autoantibody concentrations in sera of patients with AIP including anti-lactoferrin [
28,
29], anti-carbonic anhydrase II, anti-carbonic anhydrase IV, anti-pancreas secretory trypsin inhibitor, anti-anionic and cationic trypsinogens, anti-amylase-1, anti-heat shock protein 10 and anti-plasminogen-binding protein peptide autoantibodies [
7,
18,
30]. However, their performance as diagnostic markers is still under debate. Some of these antibodies are organ-specific mainly against antigens from the pancreatic ducts and acini [
28,
30], whereas others are not organ-specific and directed against nuclear antigen (ANA). Recently, additional autoantibodies have been identified against pancreatic enzyme precursors in AIP patients [
7]. The study also reported different expression of anti-pancreatic lipase and anti-transaldolase in the two AIP-subtypes.
Though AIP have been clearly recognized as two entities with distinct clinical profiles the aetiology and the pathophysiological mechanisms of AIP remain still unknown [
4,
31,
32,
33,
34].
In the current study we aimed at the identification of new biomarkers able to discriminate between very similar pancreatic diseases. We showed the usefulness of a custom protein microarray approach that provides specific serum antibodies pattern in patients affected by PDAC, CP, AIP and GID. Starting with 6400 human recombinant proteins with subsequent two steps refinement strategy as selection, we identified 29 autoantigens which were differentially and significantly recognized by autoantibodies present in the IgG-enriched fraction sera of patients with pancreatic diseases.
Our study cohort was able to address the three following unsolved issues: (i) can autoantibodies differentiate between benign pancreatic diseases from PDAC, (ii) can autoantibodies differentiate AIP-patients particularly those with AIP type 2 from PDAC-patients and (iii) can autoantibodies differentiate both subtypes of AIP. We identified TOR1B, RNF138, PPP1R15A, PAICS, LENG1, GPR3 and CYP3A5 as autoantigens that allowed discrimination between PDAC, CP and AIP.
Many previously described proteins used in the clinic to distinguish PDAC from healthy patient actually fail to discriminate PDAC from pancreatitis (CP and AIP), a spectrum of diseases that shares many molecular and imaging features with PDAC [
35]. In these patients, CT scan can be non-diagnostic and more invasive endoscopic testing may be required toward a final diagnosis. A non-invasive pancreatitis-biomarker panel may be helpful for the investigation of such patients. From a clinical standpoint, the actual utility of a biomarker would depend on the context in which it is used. Indeed, biomarkers for PDAC necessarily need to have a very high specificity because of the low prevalence of the disease (0.01%) that would lead to numerous false positive. CA 19-9 is the most extensively studied and validated serum biomarker for the diagnosis of PDAC. With an overall sensitivity and specificity in the range of 80–86% both values do not reach the necessary sensitivity and specificity for pancreatic cancer detection. This unsatisfactory performance is partly due to elevated CA19-9 levels in benign diseases such as acute and chronic pancreatitis, biliary obstruction, cholangitis, and liver cirrhosis but also in other gastrointestinal cancers resulting in a significant number of false positive. Additionally in patients with Lewis negative genotype (5–10% of population) CA 19-9 is not expressed leading to false negative results.
On the other hand, the prevalence of CP, including autoimmune subtype, approaches 1–5% of the population and in such a population less specificity could be tolerated as long as the biomarker test can distinguish healthy controls.
Some of the markers reported in this study are able to distinguish PDAC from AIP-1 but fail to discriminate between PDAC and AIP-2. This finding may be related to the fact that the clinical spectrum of AIP-1 is a systemic disease compared to AIP-2. Furthermore, it is difficult to distinguish AIP-2 from PDAC as both can induce the same clinical profile. We therefore assessed different antigen profiles between both types of AIP and PDAC and were able to generate predictive models with high AUC values for AIP1 and AIP2 vs. PDAC.
The most interesting biomarker is Torsin 1B (TOR1B) because it is capable to discriminate AIP from other diseases. Torsin Family 1, member B also known as TOR1b or DQ1 is found primarily in the endoplasmic reticulum and nuclear envelope and can act as a chaperone allowing the maintenance of the integrity of the nuclear envelope and endoplasmic reticulum.
Surprisingly only a small number of antigens have been identified that differentiate PDAC from other diseases. We can extrapolate that it could be related to the desmoplastic reaction that often takes place in this tumour [
8]. Furthermore, heterogeneity among our patients with PDAC (stage and grade) could also explain variation of recognition of antigens. Nevertheless, the protein microarray screening was able to identify candidate proteins that show strong differential recognition between the pancreas diseases PDAC, CP and both AIP forms.
We took a step-by-step analysis approach with three different but interlocking profiling parts. The third part involved analysis of a large cohort of patients tested individually. MFI were used to address statistical analysis. The statistical prediction models revealed high predictive performance with AUC values that reached about 90%. This leads us to conclude that the identified antigens might serve as a basis to establish accurate predictive models. However, valid models need to be calculated including a much larger sample size. The uncertainty of the predictive strength can be recognized by the moderate-sized confidence intervals.
Not only might the novel autoantibody panel help to improve diagnostics and to understand the pathophysiology of AIP, but the autoantibodies could also lead to new possible immunotherapeutic targets. Customized, small-sized arrays or similar assay formats could be a useful and affordable way to diagnose and discriminate pancreatic cancer from benign inflammatory diseases and avoid unnecessary surgery on patients with AIP-1 and AIP-2.
4. Materials and Methods
4.1. Patients and Samples
Serum samples of patients with pathologically confirmed PDAC, CP, AIP-1, AIP-2, gastro-intestinal diseases (GID) and controls were obtained from the Pancobank of the
European Pancreas Center (EPZ/Department of Surgery, University Hospital Heidelberg; Ethical Approval Votes no. 301/2001 and 159/2002), a member of BMBH/Biomaterial Bank Heidelberg. Additional sets of serum samples were provided from the Department of Gastroenterology, Kobe University Graduate School of Medicine, Kobe, Japan.
Data of patients who were referred for an operation were collected in a prospectively designed database. The clinico-pathological parameters included age, gender, TNM classification and AJCC stage of tumour location are listed in
Table 5.
The first group, the PDAC sera (65 patients mean age 66.2 ± 11 years) derived from patients with operable pancreatic tumours. The second investigated group, chronic pancreatitis (CP) consisted of 50 serum samples from patients who have undergone duodenum preserving pancreatic head resection (mean age 53.3 years ± 9.6). The histopathological analysis of all patients confirmed CP. The third group, represented by 55 AIP-1 patients, is a cohort composed of 25 and 30 serum samples from Heidelberg and Kobe, respectively (mean age 64.5 ± 12.2 years). The fourth group, AIP-2 consisted of 15 patients from Heidelberg only. All AIP cases were confirmed using histology whenever tissues were available. If histology was not available, the HISORt Mayo Clinic criteria or the International Consensus Diagnostic Criteria (ICDC) for AIP were used to confirm the disease. The fifth patient group was composed of benign and malign extra-pancreatic gastrointestinal diseases (GID), and consisted of 60 patients, (mean age 63.3 ± 10.8 years).As controls, a group of 70 healthy volunteers (Co) (mean age 47 ± 19.5 years) including 25 healthy controls from Kobe was used.
These six groups were used in three dependent sample sets in this study. Sample set 1 consisted of 60 pooled sera IgG fractions (15 sera/group), sample set 2 consisted of 260 sera pooled in 52 IgG fractions (5 sera/pool) and sample set 3 was composed of 185 individual patient sera. Exact numbers of the sera and the patients’ clinical data in each sample set used for controls, gastrointestinal diseases (GID), chronic pancreatitis (CP), autoimmune pancreatitis (AIP-1 and AIP-2) and pancreatic adenocarcinoma (PDAC) cohorts are presented in the
Table 1, text and figures.
4.2. IgG Enrichment
To limit artifacts in the array binding process and increase specificity, serum samples were enriched for IgGs using the NAb protein A/G spin column and buffers (ThermoScientific, Rockford, IL). The purification procedure was performed according to the manufacture’s protocol. The collected fractions were neutralized by adding 40 µL of 1.0 M Tris, pH 8.8, measured at 280 nm to assess the amount of purified antibodies and subsequently used for profiling procedures.
4.3. Generation of DNA Templates for Recombinant Human Antigens from Bacterial Library
The ORF-clone library of the ORFeome Collaboration was provided by the DKFZ Genomic and Proteomic Core Facility [
36]. Clones were selected according to the encoded protein’s role in inflammation and/or cancer. Two microliters of each selected clone were transferred with 150 µL of LB-kanamycin-medium and incubated at 37 °C overnight. The day after, the cultures were spun down at 1900×
g for 30 min. The pellets were resuspended in 100 µL of PCR-grade water, heated in a ventilated oven at 75 °C for 20 min and spun down once more as above-described. Subsequently, the supernatants containing plasmid DNA were transferred into standard 96-well PCR plates. DNA-samples were stored at −20 °C. Five microliters of the isolated plasmid DNA used as a template for the synthesis of the gene of interest was amplified using PCR with a Taq DNA polymerase kit (Qiagen, Hilden, Germany). The kit was used according to the manufacturer’s instructions manual during 40 cycles of denaturation at 94 °C for 30 sec, annealing at 52 °C for 30 sec and elongation at 72 °C for 210 sec. This allowed the gene of interest to be amplified with a pair of expression primers carrying regulatory sequences and fusion tags: forward expression primer: 5’-GAAATTAATA CGACTCACTA TAGGGAGACC ACAACGGTTT CCCTCTAGAA ATAATTTTGT TTAAGAAGGA GATATACATA TGCATCATCA TCATCATCAT AAAGCAGGCT CCACCATG-3’; reverse expression primer: 5’-CTGGAATTCG CCCTTTTATT ACGTAGAATC GAGACCGAGG AGAGGGTTAG GGATAGGCTT ACCAACTTTG TACAAGAAAG CTGGGTC-3’. The C-terminal V5 sequence was intended for the detection of full-length expressed proteins. Agarose gel (1.3%) electrophoresis was performed to verify the amplified DNA fragments for correct DNA bp-length. The identity of the last 29 candidate autoantigens was further confirmed by sequencing using a Mix2seq kit (Eurofins).
4.4. Manufacture of Protein Microarrays
The fabrication of protein microarrays was performed applying the multiple spotting technique, an approach that uses DNA templates to synthesize proteins directly on microarray slides [
17,
37]. The protein microarrays were spotted on epoxysilane coated slides (SCHOTT Nexterion AG, Jena, Germany) using a microarray non-contact printer, the Nanoplotter 2 (GeSIM, Radeberg, Germany) following the protocol described by Hufnagel et al. [
38]. The protein microarrays manufacture process is outlined in
Figure 5. Briefly, during a first spotting step 900 pL of each DNA template was transferred onto the epoxysilane slides. On top of each DNA spot, in a second spotting event, droplets were printed consisting of 3.6 nL of the cell-free expression mixture (S30 T7 High-Yield Protein Expression kit, Promega). Protein synthesis occurred during an incubation phase at 37 °C for one hour followed by an overnight incubation (12-16 hours) at 30 °C. Each candidate was spotted in duplicates. In addition, an experimental positive control (Epstein–Barr Virus VCA p18) and negative control consisting of PBS in PCR mixture were printed in several replicates on each array.
All antigens corresponded to distinct human recombinant proteins. A quality control of the spotting procedure was performed randomly on 10% of each batch of production. The percentage of proteins that were successfully expressed on the protein microarray was assessed using fluorescence-conjugated antibodies directed against fusion tags (6xHis and V5) present at either end of the expressed proteins. Briefly, the slides were blocked using a 2% BSA buffer for one hour. All incubations were performed on a shaker at room temperature. After two washing steps with PBST, the Penta-His Alexa Fluor 647 conjugate antibody (Qiagen) and the anti-V5 Cy3 monoclonal antibody (Sigma) were incubated at a dilution of 1:1000 both in 2% BSA for one hour. The slides were then washed three times and dried in an oven at 30 °C prior to scanning using the Powerscan (Tecan, Männedorf, Switzerland) at two wavelengths (532 and 635 nm). The signal intensity was considered to be representative of the amount of expressed proteins on the slides.
4.5. Immunoassay with Patient Antibodies
In order to compare the reactivity of each antigen equally in the different patients, all IgG fractions used for profiling on microarrays were adjusted to a protein concentration of 10 µg/ml prior to incubation on the slides. As first step, the slides were blocked using SuperBlock (ThermoScientifc) for one hour at room temperature on a shaker. After two washing steps with PBST, 10 μg of IgG fractions diluted in PBS with 0.81 mg of E. coli lysate were incubated for 15 h to allow the antibodies to bind to the recombinant antigens on the microarray. Since an E. coli-based expression mixture was used for protein synthesis, remaining E. coli proteins on the slide could otherwise react with antibodies against E. coli proteins in the serum and thereby produce unspecific signals. After two more washing steps with PBST, a fluorescence-conjugated secondary antibody (goat anti-human whole IgG, (Jackson ImmunoResearch, Europe Ltd., Cambridgeshire, UK) diluted 1:200 in PBS directed against the primary antibodies from the patient sera on the microarray was added for two hours. The slides were afterwards washed three times for 15 min, rinsed in Milli-Q sterile water and dried at 30 °C. All slides were scanned (Powerscan, Tecan, Männedorf, Switzerland) using a 635 nm laser. The signal intensity obtained from the fluorescence-conjugated antibody was approximately proportional to the amount of primary antibody in the enriched-IgG sera that recognize the antigens on the protein microarray.
4.6. Construction and Optimization of the Protein Microarrays
Applying the top-down strategy as a stepwise refinement for selection of the best differentiating autoantigens, three differently sized protein microarrays were manufactured.
The large scale arrays (
n = 16, four per profiled group, each with 1600 individual antigens/array covering a total of 6400 antigens) were printed and incubated with a small cohort of total 60 serum samples assembled in four groups of PDAC, CP, AIP patients and Co. Each group consisted of 15 individual sera (20 µL) combined in a pool (300 µL) and subjected to IgG-enrichment. The patients’ characteristics in this sample set 1 are presented in
Table 1.
The intermediate size microarrays, n = 60, consisted of 345 antigens spotted in duplicates/array. On these 260 IgG-enriched fractions were analysed: Co: n = 70; PDAC: n = 40; GID: n = 50; CP: n = 35 AIP-1: n = 50; AIP-2: n = 15; they were profiled as 52 pools of 5 patients per group (pools: Co n = 14, PDAC n = 8, CP n = 7, GID n = 10 AIP-1 n = 1 0, and AIP-2 n = 3).
Finally, small-size arrays of 36 antigens in duplicates were printed and tested in sample-set 3 on cohorts of 185 patients individual IgG-enriched fractions (PDAC: n = 25; CP: n = 24; AIP-1: n = 47; AIP-2: n = 15; GID: n = 26; and Co: n = 48).
4.7. Data Analysis
GenePix Pro software (Molecular Devices, Sunnyvale, CA, USA) was used to analyse the fluorescence intensity of each spot and to eliminate spatial artifacts. Several negative controls consisting of PBS in PCR mixture spotted on each array were used to assess possible nonspecific signals.
After analysing the microarrays using the GenePixPro Software, the signal intensity expressed in median fluorescence intensity values (MFI), was normalized by linear scaling of log-ratios for all array probes. As variables were not normally distributed, the nonparametric two-sided Kruskal–Wallis test was used to assess differences of an antigen between disease groups. All p values were adjusted using false discovery rate (FDR) method. Adjusted p-values of < 0.05 were considered to be significant.
Logistic regularized regression using an elastic net [
39] was applied to build a prediction model based on selected antigens. The elastic net hyper-parameters α (elastic net mixing parameter) and λ (shrinkage parameter) were tuned by conducting a five-fold cross-validation, assessing variable importance maximize the AUC. An advantage of cross-validation is that each observation is assigned to the test data exactly once. Hence, for each patient the estimated probability for its classification can be extracted. Model building was performed with R version 3.6.2 [
40] using the packages “glmnet” and “caret‘ [
41,
42]. ROC curves were plotted with the “pROC” package [
43]. Confidence intervals for the AUC values were computed by the formula according to Wilson [
19].
All graphs except
Figure 5 were designed in GraphPad Prism software (version 5; La Jolla, CA, USA). The normalized MFI of each candidate were graphically presented as box-and-whisker plots.