**Omics Approaches to Immune-Mediated Inflammatory Diseases: Towards Novel Biomarkers and Potential Therapeutic Targets**

Editor

**Maria-Ioanna (Marianna) Christodoulou**

Basel • Beijing • Wuhan • Barcelona • Belgrade • Novi Sad • Cluj • Manchester

*Editor* Maria-Ioanna (Marianna) Christodoulou European University Cyprus Nicosia, Cyprus

*Editorial Office* MDPI St. Alban-Anlage 66 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *Biomedicines* (ISSN 2227-9059) (available at: https://www.mdpi.com/journal/biomedicines/special_issues/Omics_Inflammatory_Dieases).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

Lastname, A.A.; Lastname, B.B. Article Title. *Journal Name* **Year**, *Volume Number*, Page Range.

**ISBN 978-3-0365-9276-3 (Hbk) ISBN 978-3-0365-9277-0 (PDF) doi.org/10.3390/books978-3-0365-9277-0**

© 2023 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license. The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons Attribution-NonCommercial-NoDerivs (CC BY-NC-ND) license.

### *Article* **Comprehensive Profiling of Early Neoplastic Gastric Microenvironment Modifications and Biodynamics in Impaired BMP-Signaling FoxL1+-Telocytes**

**Alain B. Alfonso, Véronique Pomerleau, Vilcy Reyes Nicolás, Jennifer Raisch, Carla-Marie Jurkovic, François-Michel Boisvert † and Nathalie Perreault \*,†**

> Département d'Immunologie et Biologie Cellulaire, Faculté de Médecine et des Sciences de la Santé, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada

**\*** Correspondence: nathalie.perreault@usherbrooke.ca

† Co-senior authors.

**Abstract:** FoxL1+-telocytes (TCFoxL1+) are novel gastrointestinal subepithelial cells that form a communication axis between the mesenchyme and epithelium. TCFoxL1+ are strategically positioned to be key contributors to the microenvironment through production and secretion of growth factors and extracellular matrix (ECM) proteins. In recent years, the alteration of the bone morphogenetic protein (BMP) signaling in TCFoxL1+ was demonstrated to trigger a toxic microenvironment with ECM remodeling that leads to the development of pre-neoplastic gastric lesions. However, a comprehensive analysis of variations in the ECM composition and its associated proteins in gastric neoplasia linked to TCFoxL1+ dysregulation has never been performed. This study provides a better understanding of how defective BMP signaling in TCFoxL1+ participates in the gastric pre-neoplastic microenvironment. Using a proteomic approach, we determined the changes in the complete matrisome of *BmpR1a*-FoxL1+ and control mice, both in total antrum and in isolated mesenchyme-enriched antrum fractions. Comparative proteomic analysis revealed that the deconstruction of the gastric antrum led to a more comprehensive analysis of the ECM fraction of the gastric tissue microenvironment. These results show that TCFoxL1+ are key members of the mesenchymal cell population and actively participate in the establishment of the matrisomic fraction of the microenvironment, thus influencing epithelial cell behavior.

**Keywords:** FoxL1+-telocytes; epithelial–mesenchymal interaction; BMP signaling; extracellular matrix; mechanical microenvironment; matrisome

#### **1. Introduction**

The extracellular matrix (ECM) is a complex assembly of large fibrous proteins, glycoproteins, proteoglycans, and ECM-associated proteins, such as growth factors, whose composition varies from one tissue to another [1]. The ECM represents the insoluble fraction of the microenvironment, and although it was long believed to be a passive component, it is in fact highly dynamic and influences the behavior of neighboring cells through mechanosensing and signaling [2,3]. Thus, the architecture and homeostasis of a tissue, such as the stomach, are maintained in part by tight regulation of ECM dynamics. Dysregulation of the ECM composition in the microenvironment creates an imbalance in the physical (force, porosity, stiffness) and biochemical (growth factor density, cell adhesion, signaling) stimuli, provoking an abnormal cell response to these biomechanical forces and leading to the development of diseases such as gastric neoplasia [4–8]. In gastric cancer, pre-malignant lesions already show dysregulated ECM dynamics, which also influence the prognosis and therapeutic strategies at later stages of the disease [2,5,9].

In mammals, the ECM is composed of approximately 300 proteins. This represents the core matrisome, which is mainly composed of proteins, such as collagens (CLs) and proteoglycans, with structural and fibrillar glycoproteins [10–13]. The biochemical properties of these proteins, such as their size, insolubility, and cross-linking, have made attempts to systematically characterize the entire tissue ECM composition challenging [14]. Recently, Naba et al. developed a proteomics-based approach to identify, quantify, and compare the matrisome of whole tissues, partially resolving the limitations of in vivo analysis of ECM dynamics [14]. This approach allows for comprehensive evaluation of the proteins from the core matrisome, as well as the components of matrisome-associated proteins such as ECM regulators (ECM-remodeling enzymes, cross-linkers, proteases) and secreted factors such as growth factors and cytokines binding the ECM [13,14].

**Citation:** Alfonso, A.B.; Pomerleau, V.; Nicolás, V.R.; Raisch, J.; Jurkovic, C.-M.; Boisvert, F.-M.; Perreault, N. Comprehensive Profiling of Early Neoplastic Gastric Microenvironment Modifications and Biodynamics in Impaired BMP-Signaling FoxL1+-Telocytes. *Biomedicines* **2023**, *11*, 19. https://doi.org/10.3390/biomedicines11010019

Academic Editor: Marianna Christodoulou

Received: 18 November 2022 Revised: 13 December 2022 Accepted: 16 December 2022 Published: 22 December 2022

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

As the microenvironment plays an essential role in tissue homeostasis and in the development of pathologies such as gastric cancer [4–8], mesenchymal cells have attracted considerable attention in recent years [15–17]. Mesenchymal cells, more precisely myofibroblasts as well as FoxL1+-telocytes (TCFoxL1+), are better known for their contribution to the sub-epithelial microenvironment. Both myofibroblasts and TCFoxL1+ are capable secretors of cytokines, chemokines, growth factors, and ECM proteins [16–22]. In addition, TCFoxL1+ are advantageously positioned directly underlying the epithelium, forming a 3D nexus between the epithelium and the rest of the stroma [17,23]. As documented in recent years, TCFoxL1+ contribute to the stem cell niche microenvironment by secreting soluble factors such as WNT5a, R-spondin3, and gremlin [15,17,20,23,24]. However, the precise role of TCFoxL1+ in the insoluble fraction of the gastrointestinal (GI) microenvironment is poorly defined. Considering the effect of TCFoxL1+ on GI epithelial cells [17–19,22,23], there is a critical need to rigorously characterize the role of the ECM biodynamic microenvironment in GI epithelial cell behavior in vivo and to determine the contribution of TCFoxL1+.

To date, the study of the various roles of TCFoxL1+ in the in vivo microenvironment has been restricted by the limited models available [17,20,23,25,26]. A previous study, using a murine model with an impaired BMP signaling pathway in TCFoxL1+, demonstrated the importance of these cells and this pathway in inducing gastric neoplastic lesions and polyps in 90-day-old mice [22]. *BmpR1a*-FoxL1+ mice did not develop chronic inflammation or a malignant phenotype; however, disturbed TCFoxL1+ led to early precancerous events with markedly disorganized gastric gland architecture, intestinal metaplasia, and spasmolytic polypeptide-expressing metaplasia (SPEM), in addition to remodeling of the ECM into a reactive microenvironment [22]. Consequently, *BmpR1a*-FoxL1+ mice represent an excellent model to investigate how TCFoxL1+ contribute to instructing the ECM biodynamics of the microenvironment that lead to gastric neoplasia. Using this model, we can perform a matrisomic investigation of the stomach of control and *BmpR1a*-FoxL1+ mice, and better understand the contribution of TCFoxL1+ to this aspect of the microenvironment [13,14].

In the present study, we evaluated the contribution of TCFoxL1+ to the matrisomic microenvironment in mice with early gastric neoplasia. This matrisomic investigative approach, used in concert with the TCFoxL1+ signaling-impaired gastric pre-neoplastic mouse model, revealed a detailed inventory of dysregulated core-matrisome and matrisome-associated proteins in early events of gastric neoplasia. We identified important and subtle changes in the ECM biology that occur during the etiology of gastric neoplasia associated with BMP-signaling-impaired TCFoxL1+.

#### **2. Materials and Methods**

#### *2.1. Animals*

The transgenic mouse line C57BL/6J *FoxL1*Cre was provided by Dr. Kaestner [27] and 129 SvEv-*BmpR1a*fx/fx mice were supplied by Dr. Mishina [28]. *BmpR1a*ΔFoxL1+ conditional knockout mice were generated as previously described [18,21,22]. Male and female 90-day-old age-matched mice were used for the study. All experiments were performed in accordance with our animal welfare protocol (approval number: FMSS-2019-2370).

#### *2.2. Deconstruction of Mouse Ex Vivo Stomach Tissues*

Tissue deconstruction was performed stepwise to enrich each compartment (the epithelial, mesenchymal, and muscular layers). First, stomachs were opened along the greater curvature and rinsed with cold 1× PBS, and the antrums were isolated from the corpus and fundus sections of the total tissue. Mouse antrums were cut with a razor blade into 5 mm tissue sections and the muscle layer was mechanically dissociated using forceps under a stereomicroscope. Leftover tissues (mesenchyme and epithelium) were subsequently incubated in 4 mL sterile Corning™ Cell Recovery Solution without agitation (Corning Life Science, Corning, NY, USA) at 4 °C for 24 h. The following day, dissociation of the epithelial layer was performed with a 30 min incubation of the tissue on ice followed by vigorous manual shaking for 15 s. The mesenchymal tissue was incubated once again in 6 mL of sterile Corning™ Cell Recovery Solution (Corning Life Science, Corning, NY, USA) on ice with gentle shaking for 30 min followed by further dissociation by vigorous manual shaking for 15 s. Finally, mesenchymal tissues were washed four times with 1× PBS while all remaining epithelial cells were pooled and kept on ice. Deconstructed tissue sections were either snap-frozen for immunoblotting and proteomic analysis or fixed in 4% paraformaldehyde (PFA) (Thermo Fisher Scientific, Waltham, MA, USA) and paraffin-embedded for histological analysis. Total tissue samples were also collected to allow for a more comprehensive comparison of the matrisome content.

#### *2.3. Histological Analysis*

The total stomach antrum or deconstructed fractions were fixed overnight at 4 °C in 4% PFA (Thermo Fisher Scientific, Waltham, MA, USA) and subsequently processed for tissue embedding as previously described [18,21]. To avoid the diffusion of cells in paraffin, the epithelial layer from the deconstructed tissue was embedded in HistoGel™ (Thermo Fisher Scientific, Waltham, MA, USA) and wrapped in lens paper prior to embedding. Histological staining (H&E) on tissue sections was performed as previously described [18,21]. Virtual images were acquired with a slide scanner (Nanozoomer; Hamamatsu, Japan) and visualized using the NDP.view2 software (version 2.8.24).

#### *2.4. In-Solution Digestion of Proteins to Peptides for Mass Spectrometry Analysis*

Frozen samples of either the total stomach antrum or mesenchymal-enriched stomach antrum fractions were thawed on ice and homogenized directly in 8 M urea (Sigma Aldrich, St. Louis, MO, USA) dissolved in 10 mM HEPES pH 8.0 (Wisent, Saint-Jean-Baptiste, QC, Canada) (100 μL/10 mg wet tissue weight), using the QIAGEN TissueLyser LT (Hilden, Germany). Prior to protein quantification by BCA assay (Pierce Thermo Scientific, Waltham, MA, USA), samples were centrifuged following their homogenization to remove urea-insoluble materials. Following the protocol described by Naba et al., proteins were reduced, alkylated, deglycosylated, and digested, except for the Lys-C digestion, which was omitted [14,29]. Solutions were prepared using MS-grade water and low protein binding tubes were used for these experiments.

#### *2.5. Purification and Desalting of the Peptides on C18 Columns*

Trifluoroacetic acid (TFA) was added following incubation with the proteases to a final concentration of 0.2%, and the samples were desalted using C18 tips (Pierce Thermo Scientific, Waltham, MA, USA). Acetonitrile was first aspirated into the C18 tip, which was then equilibrated with 0.1% TFA. Each peptide sample was bound to the C18 tip by 10 successive up-and-down pipetting cycles until the entire sample was loaded. The tip was then washed with a solution containing 0.1% TFA, and the peptides were eluted in a separate low-bind tube using a 50% acetonitrile/1% formic acid solution. The eluted peptides were lyophilized using a centrifugal evaporator at 60 °C and the dry peptides were resuspended in 1% formic acid. The peptide concentration was measured using a NanoDrop spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) at 205 nm absorbance. The peptide samples were transferred to autosampler vials and stored at −20 °C until analyzed by mass spectrometry.

#### *2.6. LC-MS/MS Analysis*

Analysis of the purified peptides was carried out at the Université de Sherbrooke proteomics facility using the following parameters: Each sample was injected into an HPLC system (NanoElute, Bruker Daltonics, Billerica, MA, USA) for LC-MS/MS. A total of 250 ng of peptides were loaded onto a trap column at a constant flow of 4 μL/min (Acclaim PepMap100 C18 column, 0.3 mm id × 5 mm, Dionex Corporation, Sunnyvale, CA, USA) and eluted onto the C18 analytical column (1.9 μm beads size, 75 μm × 25 cm, PepSep) over a 2 h gradient of acetonitrile (5–37%) in 0.1% FA at 500 nL/min into a TimsTOF Pro ion mobility mass spectrometer equipped with a Captive Spray nanoelectrospray source (NanoElute, Bruker Daltonics, Billerica, MA, USA). The data were acquired in data-dependent MS/MS mode with a 100–1700 m/z mass range, and the number of PASEF scans was set at 10 (1.27 s duty cycle) with a dynamic exclusion m/z isolation window of 0.4 min. The collision energy was set at 42.0 eV, and the target intensity was 20,000 with an intensity threshold of 2500.

#### *2.7. Protein Identification Using MaxQuant Analysis*

MaxQuant software version 1.6.17 (Munich, Bavaria, Germany) was used to analyze the raw files using the Uniprot mouse proteome database (25 March 2020, 55,366 entries). The analysis was performed under TIMS-DDA type in group-specific parameters, and included the following parameters: two miscleavages were allowed; fixed modification was carbamidomethylation of cysteine; the enzyme selected was trypsin (not before a proline). The following variable modifications were included in the analysis: methionine oxidation, N-terminal protein acetylation, and protein carbamylation (K, N-terminal). The limit for mass tolerance was set at 10 ppm for the precursor ions and at 20 ppm for the fragment ions. The identification values "PSM FDR", "Protein FDR", and "Site decoy fraction" were set to 0.05. The minimum peptide count was set to 1. Label-free quantification (LFQ) was performed using an LFQ minimal ratio count of 2. Both the "Second peptides" and "Match between runs" options were allowed.
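To make the mass tolerances above concrete, the following minimal sketch converts a ppm tolerance into an absolute m/z window. The function name and the example m/z value of 1000 are illustrative assumptions; they are not taken from the MaxQuant configuration or from the data of this study.

```python
from typing import Tuple


def ppm_window(mz: float, tol_ppm: float) -> Tuple[float, float]:
    """Return the (low, high) m/z bounds of a symmetric ppm tolerance."""
    delta = mz * tol_ppm / 1e6  # 1 ppm is one millionth of the m/z value
    return mz - delta, mz + delta


# Hypothetical ion at m/z 1000 (assumed value, for illustration only)
precursor_low, precursor_high = ppm_window(1000.0, 10.0)  # 10 ppm, precursor ions
fragment_low, fragment_high = ppm_window(1000.0, 20.0)    # 20 ppm, fragment ions
print(f"precursor: {precursor_low:.3f} to {precursor_high:.3f}")
print(f"fragment:  {fragment_low:.3f} to {fragment_high:.3f}")
```

At this assumed m/z, the 10 ppm precursor tolerance corresponds to a window of ±0.01 and the 20 ppm fragment tolerance to ±0.02.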

#### *2.8. Differential and Statistical Analyses of Mass Spectrometry Data*

Following the MaxQuant analysis, LFQ intensities were sorted according to several parameters using the ProStar software version 1.28.1 (Grenoble, France) [30]. Filtered proteins positive for the "Reverse", "Only.identified.by.site", or "Potential.contaminant" categories were eliminated, as were proteins identified from only one unique peptide. Data were normalized with quantile centering set to 0.5 for the intensity distribution. The non-detection of a protein was considered biologically relevant in the following cases: 75% (3 of 4) of the control or mutant mice group with respect to the other for total antrum (TA) and in 83% (5 of 6) of the control or mutant mice group with respect to the other for enriched mesenchyme (EM). Considering the aforementioned conditions, for all data corresponding to the matrisome, the partially observed value (POV) imputation was revised according to the following cases, followed by recalculation of Log2FC and *p*-value in ProStar. For TA data, the imputed POV was removed and replaced by the minimum POV when three out of four mice presented an LFQ intensity = 0 for a given protein. If two out of four mice presented an LFQ intensity = 0, the Log2FC and the *p*-value recalculated in ProStar were considered non-conclusive (NC). For the EM data, the imputed POV was removed and replaced by the minimum POV when five out of six mice in one of the two groups presented an LFQ intensity = 0 for a given protein. If four out of six mice presented an LFQ intensity = 0 in one or both groups, the Log2FC and *p*-value recalculated by ProStar were considered NC. Structured least square adaptation (SLSA) and detQuantile imputation were performed for POV and missing values in the entire condition (MEC), respectively. The results were ranked to preserve the proteins present in at least three of the four (in TA) and three of the six (in EM) biological replicates for each condition. 
For hypothesis testing, a Limma statistical test was used, with a fold-change threshold of 1.5 and a *p*-value of 0.05, to determine the list of differentially abundant proteins. A "st.boot" calibration plot was chosen for *p*-value distribution.
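The decision rules described in this section can be sketched as follows. This is an illustration under stated assumptions and not the ProStar implementation: the function names and return labels are hypothetical, and cases that the text does not spell out (for example, zero counts other than those named) are simply reported as unchanged here.

```python
import math

FC_THRESHOLD = 1.5   # fold-change threshold stated in the text
P_THRESHOLD = 0.05   # p-value threshold stated in the text


def zero_rule(lfq_values, replace_zeros, nc_zeros):
    """Classify one group of LFQ intensities by its number of zero values.

    replace_zeros: zero count at which the imputed POV is replaced by the
                   minimum POV (3 for total antrum, 5 for enriched mesenchyme).
    nc_zeros:      zero count at which Log2FC and p-value are treated as
                   non-conclusive (2 for total antrum, 4 for enriched mesenchyme).
    """
    zeros = sum(1 for value in lfq_values if value == 0)
    if zeros == replace_zeros:
        return "replace_with_min_POV"
    if zeros == nc_zeros:
        return "non_conclusive"
    return "unchanged"  # not covered by the rules quoted above


def is_differential(log2_fc, p_value):
    """Apply the 1.5-fold change and p < 0.05 cut-offs on the log2 scale."""
    return abs(log2_fc) >= math.log2(FC_THRESHOLD) and p_value < P_THRESHOLD


# Hypothetical intensities, for illustration only
print(zero_rule([0, 0, 0, 2.1e6], replace_zeros=3, nc_zeros=2))
print(zero_rule([0, 0, 1.4e6, 2.1e6], replace_zeros=3, nc_zeros=2))
print(is_differential(log2_fc=0.7, p_value=0.01))
```

Note that a 1.5-fold change corresponds to an absolute log2 fold change of about 0.585, which is the quantity compared in `is_differential`.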

#### *2.9. Matrisome Identification*

The Matrisome Annotator webtool (matrisomeproject.mit.edu) was used to annotate the list of differentially abundant proteins as previously described [13]. Matrisome divisions (core matrisome or matrisome-associated) and categories (ECM glycoproteins, collagens (CLs) and proteoglycans, ECM-affiliated, ECM regulators, and secreted factors) were used according to Naba et al. [13].

#### *2.10. Indirect Immunofluorescence*

Indirect immunofluorescence of stomach sections from 90-day-old control and *BmpR1a*-FoxL1+ mice was performed as previously described [18,21,22,31–33]. Antigen blocking was performed with a solution of 2% bovine serum albumin (BSA), 0.1% fish gelatin, and 0.2% Triton X-100 in 1× PBS for 1 h at room temperature. The following primary antibodies were used in this study: S100A9 (Cell Signaling Technology, Danvers, MA, USA; Cat#73425; RRID:AB\_2799839), fibronectin (Millipore; Burlington, MA, USA, Cat# AB2033, RRID:AB\_2105702), and tenascin C (Millipore, Burlington, MA, USA; Cat# AB19013, RRID:AB\_2256033). The following day, slides were incubated with anti-rabbit IgG Alexa-488 labeled secondary antibody (Cell Signaling Technology; Danvers, MA, USA; Cat# 4412; RRID:AB\_1904025). Slides were examined under a Zeiss Axioscope 5 (Oberkochen, Germany) equipped with a Zeiss Axiocam 705 mono CMOS camera. Images were analyzed using ImageJ v.1.53j (RRID:SCR\_003070).

#### *2.11. Picro-Sirius Red Staining*

Tissue sections of 90-day-old mouse stomach were stained with picrosirius red following a previously published protocol [34], and CL content and fibers were analyzed under bright-field and polarized light. Images from four mice in each group were taken using a Zeiss Axioscope 5 equipped with a linear polarizer and analyzer. Multiple representative regions of interest (ROI) were assessed per image to characterize the alignment properties of CL fibers. ROI were selected in both the top and middle antrum glands of *BmpR1a*-FoxL1+ mice to better assess tissue complexity. All ROI had the same dimensions. The distribution of CL fiber angles and coherency was determined using the ImageJ software (Madison, WI, USA) package OrientationJ (version 2.0.5; RRID:SCR\_014796). Statistical analysis was performed using Prism v9.4.1 (San Diego, CA, USA, RRID:SCR\_002798). To test the normal distribution of the samples, we used the D'Agostino-Pearson omnibus normality test, and for group analyses we used nested ANOVA.
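For readers unfamiliar with the coherency measure, the sketch below computes it for a single ROI from a structure tensor built with central-difference gradients. It is a simplified illustration and not the OrientationJ code: the plugin applies its own gradient and windowing options, and the function name and the synthetic test images are assumptions made here. Coherency ranges from 0 (no preferred orientation) to 1 (fibers aligned along one direction).

```python
import math


def roi_coherency(image):
    """Coherency of a grayscale ROI given as a list of equally long rows."""
    height, width = len(image), len(image[0])
    jxx = jyy = jxy = 0.0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            fx = (image[y][x + 1] - image[y][x - 1]) / 2.0  # gradient along x
            fy = (image[y + 1][x] - image[y - 1][x]) / 2.0  # gradient along y
            jxx += fx * fx
            jyy += fy * fy
            jxy += fx * fy
    if jxx + jyy == 0.0:
        return 0.0  # flat ROI: no measurable orientation
    return math.sqrt((jyy - jxx) ** 2 + 4.0 * jxy ** 2) / (jxx + jyy)


# Synthetic ROIs (assumed test patterns, not experimental data)
stripes = [[math.sin(0.8 * x) for x in range(16)] for _ in range(16)]
flat = [[1.0 for _ in range(16)] for _ in range(16)]
print(round(roi_coherency(stripes), 3))  # perfectly aligned pattern
print(roi_coherency(flat))               # no structure
```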

#### *2.12. Immunoblot Analysis*

The same 8 M urea protein extracts from total antrum tissues used for proteomic analyses were also assessed to validate the potential proteins of interest (*n* = 4). Samples (10 μg each) were separated on NuPage 4–12% Bis-Tris gels (Thermo Fisher Scientific, Waltham, MA, USA) with MES buffer and transferred onto a PVDF membrane. Membranes were probed with the following antibodies: S100A8 (Proteintech, Rosemont, IL, USA; Cat# 15792-1-AP, RRID:AB\_10666315), S100A9 (Cell Signaling Technology; Danvers, MA, USA; Cat#73425; RRID:AB\_2799839), SPARCL1 (R&D Systems, Minneapolis, MN, USA; Cat# AF2836, RRID:AB\_2195097), and ADAM9 (Cell Signaling Technology; Danvers, MA, USA; Cat# 4151, RRID:AB\_1903892). GAPDH (Cell Signaling Technology; Danvers, MA, USA; Cat# 2118, RRID:AB\_561053) was used as a loading control. Anti-rabbit HRP-labeled secondary antibodies (Cat#7074; RRID:AB\_2099233) were purchased from Cell Signaling Technology (Danvers, MA, USA), and anti-goat HRP-labeled antibodies (Cat#705-035-003; RRID:AB\_2340390) were from Jackson ImmunoResearch Laboratories (West Grove, PA, USA). Immunoreactive bands were detected using the Amersham ECL Western blotting Detection System (GE Healthcare Life Sciences/Cytiva, Chicago, IL, USA) with an Azure Biosystems c280 digital imager (Azure Biosystems, Dublin, CA, USA). Quantification was performed using ImageJ v1.53j (*n* = 4 mice/group). The Mann–Whitney U test was used to determine data significance.
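As a point of reference for a Mann–Whitney U test with *n* = 4 per group, the sketch below computes the exact two-sided *p*-value by enumerating all rank assignments. It assumes no tied values; the function name and the band intensities are hypothetical and are not data from this study.

```python
from itertools import combinations


def mann_whitney_exact(group_a, group_b):
    """Exact two-sided Mann-Whitney U test for small samples without ties."""
    pooled = sorted(group_a + group_b)
    rank = {value: i + 1 for i, value in enumerate(pooled)}  # assumes no ties
    n_a, n_total = len(group_a), len(pooled)
    mean_u = n_a * len(group_b) / 2.0

    def u_from_rank_sum(rank_sum):
        return rank_sum - n_a * (n_a + 1) / 2.0

    u_observed = u_from_rank_sum(sum(rank[value] for value in group_a))
    extreme = total = 0
    # Under the null hypothesis every choice of n_a ranks is equally likely
    for ranks in combinations(range(1, n_total + 1), n_a):
        total += 1
        if abs(u_from_rank_sum(sum(ranks)) - mean_u) >= abs(u_observed - mean_u):
            extreme += 1
    return u_observed, extreme / total


# Hypothetical normalized band intensities (illustration only)
control = [0.10, 0.12, 0.09, 0.11]
mutant = [2.10, 1.95, 2.40, 2.25]
u_stat, p_value = mann_whitney_exact(control, mutant)
print(u_stat, round(p_value, 4))
```

With four animals per group there are 70 possible rank assignments, so even complete separation of the two groups yields an exact two-sided *p*-value of 2/70 (about 0.029), the smallest value this design can produce.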

#### **3. Results**

To study the contribution of TCFoxL1+ in instructing the microenvironment ECM biodynamics leading to gastric neoplasia through a matrisomic investigative approach, we compared and analyzed two methods for tissue preparation of the stomach antrum of 90-day-old control and *BmpR1a*-FoxL1+ mice (Figure 1A). In the first approach, an 8 M urea extraction of total proteins was performed on the stomach antrum of the control and *BmpR1a*-FoxL1+ mice. Proteins from the total antrum were identified using LC-MS/MS as previously described [14]. For the second method, we investigated whether other cell compartments in the tissue caused unwanted interference during the protein identification and quantification within the proteomic analysis. As the bulk of ECM/matrisome proteins is located in the mesenchymal compartment, we decided to deconstruct the stomach antrum to obtain an enriched mesenchymal compartment (Figure 1B–E). First, the stomach antrum was isolated from control and *BmpR1a*-FoxL1+ mice (Figure 1B), and the muscle layers (Figure 1C) were mechanically separated from the antrum using tweezers. Next, the remaining epithelium/mesenchymal fraction (Figure 1D) was incubated with a nonenzymatic cell recovery solution that dissociated the epithelial fraction (Figure 1E) from the underlying mesenchyme, as previously described [18,32,33,35]. The 8 M urea protein extraction was carried out for the isolated enriched mesenchymal fraction, and the analysis was performed as described above for the total tissue.

#### *3.1. Analysis of the Matrisome from Total Antrum of BmpR1a*-*FoxL1+ Mouse*

To evaluate the changes in ECM composition in our pre-neoplastic gastric *BmpR1a*-FoxL1+ mouse model, we calculated the fold change in matrisome proteins between the total antrum of mutant and control mice. The ratio (*BmpR1a*-FoxL1+/control) of relative expression of total proteins between both groups was compared. Among the 3803 proteins detected, 279 were shown to be upregulated, while 484 were downregulated (Figure 2A). The analysis identified, from the total antrum, the presence of 36 overexpressed matrisome proteins (dark red spots, FC > 1.5) and 37 downregulated proteins (dark blue spots, FC < −1.5) in *BmpR1a*-FoxL1+ mice compared to those observed in the control group (Figure 2A). Matrisome proteins were identified using the Matrisome Annotator analytical tool (http://matrisomeproject.mit.edu/; accessed on 29 September 2020) [13,14,36]. A total of 169 proteins were identified, 70 of them belonging to the core matrisome and 99 to matrisome-associated proteins. Of the proteins belonging to the core matrisome, we identified 11 proteoglycans, 10 CLs, and 49 glycoproteins, whereas we identified 28 ECM-affiliated proteins, 54 ECM regulators, and 17 secreted factors among the matrisome-associated proteins (Figure 2B). Surprisingly, all but two of the CL chains detected in *BmpR1a*-FoxL1+ mice (CL1α2, CL4α1, and α2; CL6α1, α2, and α5; CL12α1 and CL14α1) were downregulated compared to those observed in controls (Table 1). Only CL15α1 and CL18α1 were upregulated in the mutant mice compared to those in the controls (Table 1). Similarly, most proteoglycans (HSPG2, perlecan; ASPN, asporin; DCN, decorin; LUM, lumican; and VCAN, versican) were observed to be negatively modulated in *BmpR1a*-FoxL1+ mice compared to those in the controls. Only biglycan (BGN) and bone marrow proteoglycan (PRG2) were upregulated in the mutant mice compared to those in the controls (Table 1). Glycoproteins such as agrin (AGRN), fibronectin 1 (FN1), tenascin C (TNC), vitronectin (VTN), and periostin (POSTN) were upregulated in mutant mice compared to those in the controls, whereas others such as microfibrillar-associated proteins (MFAP2, 4, and 5), nidogen 1 and 2 (NID1 and NID2), as well as SPARC-like protein-1 (SPARCL-1) were downregulated (Table 1). Among the matrisome-associated proteins, the analysis revealed that ECM-affiliated proteins such as members of the annexin family including annexin 10 (ANXA10), galectins such as galectin-4 (LGALS4), and mucin 4 (MUC4) were upregulated, whereas annexin 6 (ANXA6) and chondroitin sulfate proteoglycan 4 (CSPG4) were downregulated in *BmpR1a*-FoxL1+ mice compared to those in the controls (Table 1). Analysis of ECM regulators revealed that disintegrin and metalloproteinase family members (ADAM9 and 10) and various serpins (SERPINB1a, SERPINB5, and SERPINB12) were overexpressed, whereas α-1-microglobulin/bikunin (AMBP) and transglutaminase 2 (TGM2) were downregulated in mutant mice compared to those in the controls (Table 1). For the secreted factors, proteomic analyses showed that most members of the S100 protein group (S100A1, A2, A4, A6, A8, A9, A11, A13, A14, A16, and G) were overexpressed, except for S100B, which was downregulated in *BmpR1a*-FoxL1+ mice compared to that measured in controls (Table 1).

**Figure 1.** Methods for tissue preparation of stomach antrum for proteomic analysis. (**A**) Schematic representation of the experimental pipeline to assess the gastric matrisome profile in the *BmpR1a*-FoxL1+ mouse model. Created with BioRender.com. (**B**–**F**) Histological assessment of the deconstructed antrum tissue. Total antrum tissue (**B**) was deconstructed in a stepwise manner, where the muscle layers (**E**) were first dissociated from the other two compartments (**C**). Epithelial/mesenchymal tissue (**C**) was further dissociated, yielding the mesenchyme compartment (**D**) and the epithelium (**F**). Scale bar = 100 μm.

**Figure 2.** Total antrum matrisome in mice upon deletion of telocyte BMP-associated signaling. (**A**) Proteomic data from total antrum tissue isolated from control and *BmpR1a*-FoxL1+ mice (*n* = 4) were analyzed using ProStar to determine which proteins were significantly modulated. The volcano plot shows all differentially regulated proteins identified following mass spectrometry, highlighting significant matrisome proteins with at least a 1.5-fold change (plotted as log2FC) and a *p*-value lower than 0.05. Blue dots represent downregulated matrisome proteins; red dots represent upregulated matrisome proteins. The horizontal line represents the threshold *p*-value of 0.05. Vertical lines represent the 1.5-fold change threshold (in log2FC). Volcano plot was generated using GraphPad Prism version 9.4.1. (**B**) Pie chart indicates the number of matrisome proteins identified in total antrum tissue according to categories (core matrisome proteins in green and matrisome-associated proteins in black).

#### *3.2. Analysis of the Matrisome from Enriched Mesenchymal Antrum of BmpR1a*-*FoxL1+ Mouse*

Next, we evaluated changes in the ECM composition of antrum-enriched mesenchyme extracts from both mutant and control mice. We detected 37.5% fewer proteins in the enriched mesenchyme (2377) compared to those in the total antrum (3803); however, we discovered that a greater number of proteins were modulated, with 827 being upregulated and 492 being downregulated (Figure 3A). The analysis of the enriched mesenchymal antrum revealed the presence of 34 overexpressed matrisome proteins (dark red spots, FC > 1.5) and 59 downregulated proteins (dark blue spots, FC < −1.5) in *BmpR1a*-FoxL1+ mice compared to those in the control group (Figure 3A). As described above, matrisome proteins were identified using the Matrisome Annotator analytical tool (access date: 15 December 2020). A total of 135 proteins were identified, of which 68 belonged to the core matrisome and 67 to the matrisome-associated proteins. Of the proteins belonging to the core matrisome, we identified 10 proteoglycans, 12 CLs, and 46 glycoproteins, whereas among the matrisome-associated proteins, 21 ECM-affiliated proteins, 34 ECM regulators, and 12 secreted factors were identified (Table 2). As observed for the total tissue extract, most CL chains (CL1α1, CL4α1, CL6α1, α2, α3 and α5, and CL15α1) and most proteoglycans (perlecan, asporin, decorin, lumican, and versican) in the antrum-enriched mesenchyme were downregulated in *BmpR1a*-FoxL1+ mice compared to those in the controls (Table 2). We observed that, unlike the total antrum extract, biglycan was downregulated in the enriched mesenchymal antrum extract from mutant mice compared to that from controls (Table 2). Similar results were obtained with the enriched mesenchymal antrum extract for glycoproteins. FN1, TNC, and VTN were upregulated, whereas MFAP2, 4, and 5, NID1 and NID2, and SPARCL-1 were downregulated in mutant mice compared to those measured in controls (Table 2). However, in the enriched mesenchymal antrum extract, agrin was downregulated, in contrast to our observations for the total antrum extract. Finally, our analysis of the matrisome-associated proteins (ECM-affiliated proteins, ECM regulators, and secreted factors) revealed variations in largely the same proteins as those identified in the total tissue extract (Table 2). When we compared both analyses, we discovered that the matrisomic variations obtained from the enriched mesenchymal antrum extracts were more robust than those obtained from the total antrum extract.
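The category counts reported for the two preparations can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in Sections 3.1 and 3.2; the dictionary layout and function name are illustrative choices.

```python
# Counts per matrisome category, as reported in the text
total_antrum = {
    "core": {"proteoglycans": 11, "collagens": 10, "glycoproteins": 49},
    "associated": {"ECM-affiliated": 28, "ECM regulators": 54, "secreted factors": 17},
}
enriched_mesenchyme = {
    "core": {"proteoglycans": 10, "collagens": 12, "glycoproteins": 46},
    "associated": {"ECM-affiliated": 21, "ECM regulators": 34, "secreted factors": 12},
}


def summarize(counts):
    """Return (core, matrisome-associated, total) protein counts."""
    core = sum(counts["core"].values())
    associated = sum(counts["associated"].values())
    return core, associated, core + associated


ta_summary = summarize(total_antrum)
em_summary = summarize(enriched_mesenchyme)

# Relative reduction in detected proteins: 3803 (total antrum) vs. 2377 (enriched mesenchyme)
percent_fewer = round((3803 - 2377) / 3803 * 100, 1)

print(ta_summary, em_summary, percent_fewer)
```

The sums reproduce the reported totals (70 + 99 = 169 for the total antrum and 68 + 67 = 135 for the enriched mesenchyme), and the reduction in detected proteins rounds to the stated 37.5%.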


**Table 1.** Total antrum tissue.

**Figure 3.** Enriched mesenchyme antrum matrisome in mice upon deletion of telocyte BMP-associated signaling. (**A**) Proteomic data from mesenchyme-enriched antrum tissue isolated from control and *BmpR1a*-FoxL1+ mice (*n* = 6) were analyzed using ProStar to determine which proteins were significantly modulated. The volcano plot shows all differentially regulated proteins identified following mass spectrometry, highlighting significant matrisome proteins with at least a 1.5-fold change (plotted as log2FC) and a *p*-value lower than 0.05. Blue dots represent downregulated matrisome proteins; red dots represent upregulated matrisome proteins. The horizontal line represents the threshold *p*-value of 0.05. Vertical lines represent the 1.5-fold change threshold (in log2FC). Volcano plot was generated using GraphPad Prism version 9.4.1. (**B**) Pie chart indicates the number of matrisome proteins identified in mesenchyme-enriched antrum tissue according to categories (core matrisome proteins in green and matrisome-associated proteins in black).

**Table 2.** Total Enriched mesenchyme from antrum tissue.




Data from both types of tissue extracts analyzed were further processed to remove irrelevant data, which led to the identification of 184 matrisome proteins across both experiments (Figure 4). Venn diagrams of the different protein categories, core matrisome (in green) and matrisome-associated proteins (in black), revealed that mesenchymal enrichment did not lead to a heavy loss of matrisomal proteins relative to the total tissue extract, except for the ECM regulators, which were more affected by the tissue treatment. Next, we performed a functional association network analysis using the STRING database and the 116 matrisome proteins that were identified to be significantly modulated in both experiments to obtain a signature profile of proteins indicative of biological processes occurring in the microenvironment of our mouse model. The STRING analysis revealed changes in proteins involved in immune regulation, fibrosis, and tumor microenvironment in *BmpR1a*-FoxL1+ mice compared to those in controls (data not shown).
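The Venn comparison between the two extraction approaches amounts to basic set operations. The sketch below uses placeholder identifiers rather than real protein lists, so the resulting counts are illustrative and do not reproduce the 184 proteins reported here.

```python
# Placeholder identifiers; the memberships below are hypothetical.
total_antrum = {"protein_1", "protein_2", "protein_3", "protein_4", "protein_5"}
enriched_mesenchyme = {"protein_2", "protein_3", "protein_4", "protein_6"}

combined = total_antrum | enriched_mesenchyme         # union of both experiments
shared = total_antrum & enriched_mesenchyme           # overlap region of the Venn diagram
only_total_antrum = total_antrum - enriched_mesenchyme
only_enriched = enriched_mesenchyme - total_antrum

print(len(combined), len(shared), len(only_total_antrum), len(only_enriched))
```

The three regions of the diagram partition the union, so their sizes always sum to the combined count. The same bookkeeping can be repeated per matrisome category to see which categories lose members after enrichment.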

#### *3.3. Loss of BMP Signaling in Gastric TCFoxL1+ Induces Dysregulations in ECM Biodynamics Associated with Inflammation*

The tissue microenvironment can play an important role in cellular behavior, and ECM proteins influence the biodynamics as well as cell biology of tissues [37–39]. The core matrisome proteins' influence on the microenvironment through biomechanical and biochemical sensing is evident. However, it is important to take into consideration that the ECM can act as a reservoir for secreted growth factors, chemokines, and cytokines also affecting the microenvironment and impacting cell behavior [37,39]. Histopathologically, *BmpR1a*-FoxL1+ mice have been shown to be more prone to gastric neoplasia with mild inflammation [22]. Here, a part of the functional network analysis suggested a protein signature profile linked to immune regulation. S100A8 and S100A9, both secreted factors associated with the ECM, have been associated with acute and chronic inflammatory conditions and autoimmune diseases [40–42]. Matrisomic profiling revealed a significant increase in S100A8 and S100A9 between *BmpR1a*-FoxL1+ mice and controls in total antrum (FC = 11,412 and 13058, respectively; Table 1) as well as in the enriched mesenchymal antrum (FC = 37.9 and 85.2, respectively; Table 2). S100A9 expression in mutant mice was confirmed through immunofluorescence, with strong expression in the *BmpR1a*-FoxL1+ mouse mesenchyme, whereas controls showed no expression of the protein (Figure 5A). In addition, immunoblot analysis against secreted factors S100A8 and A9 revealed de novo expression of both proteins in the mutant mice but not controls, where these proteins were not detected (fold change = 20.34 and 20.48, respectively; Figure 5B,C).

**Figure 4.** Mesenchymal enrichment of the antrum does not lead to notable ECM protein loss. Venn diagrams illustrating the overall *BmpR1a*-FoxL1+ mouse gastric matrisome proteins identified using the two methods combined, indicating a wide overlap between the two approaches. Core matrisome proteins are presented in green and matrisome-associated proteins are shown in black. TA, total antrum; EM, enriched mesenchyme; CM, core matrisome; MA, matrisome-associated.

**Figure 5.** S100A8 and A9 proteins are upregulated secreted factors in *BmpR1a*-FoxL1+ mice, indicating an inflammatory response. (**A**) Immunostaining against S100A9 (shown in green) revealed an increase in its expression in the mesenchyme-enriched area of the antrum tissue of BmpR1a-FoxL1+ mice compared to that in controls. (**B**) Immunoblot analysis of the total antrum tissue indicates strong expression of both S100A8 and S100A9 proteins in BmpR1a-FoxL1+ mice compared to that in controls. (**C**) Quantification of immunoblots confirmed a significant increase in both S100A8 and S100A9 in the mutant animals (FC = 20.34 and 20.48, respectively) compared to that in controls. Statistical analysis was assessed using the Mann–Whitney test with \* *p* < 0.05. Evans blue was used as a counterstain (red signal in (**A**)). Scale bar = 100 μm.

#### *3.4. Disruption of the CL Network in Mice with Impaired Gastric BMP Signaling in TCFoxL1+*

CL is a dominant and important element in the pathological microenvironment and has a significant influence on the initiation and development of pathologies such as cancer [10]. Furthermore, its expression is generally increased in gastric cancers [43]. However, as shown in Tables 1 and 2, the expression of almost all CL chains was negatively modulated in *BmpR1a*-FoxL1+ mice compared to that in controls (CL1α2, CL4α1, and α2; CL6α1, α2 and α5; CL12α1 and CL14α1). Only a few chains were observed to be positively modulated in mutant mice using both tissue preparation methods (Tables 1 and 2). These results differ from previously published work with this mouse model [22], in which marked expression and accumulation of CLI and IV in the gastric glands of *BmpR1a*-FoxL1+ mice were observed. Therefore, we decided to perform further analyses of the CL network in both mouse groups. Collagen deposition, fiber orientation, and spatial distribution were analyzed using picrosirius red staining under bright field and polarized light microscopy in both control and mutant mice (Figure 6). The loss of BMP signaling in TCFoxL1+ affected the sub-epithelial CL fiber network in mutant mice, mainly towards the upper part of the gland, compared to controls, as shown following picrosirius red staining under bright field (Figure 6A, left panels). Visualization of CL fiber orientation and alignment was performed with polarized light, where fibrillar CL appeared in a range of colors including red, orange, yellow, and green (Figure 6A, middle panels). Heterogeneous organization of CL fibers was observed in *BmpR1a*-FoxL1+ mice, with areas of increased alignment of fibrillar collagen towards the top of the gland compared to that in controls (Figure 6A, middle and right panels). Analysis using the OrientationJ plugin in ImageJ indicated a similar distribution of fiber angles between the control and *BmpR1a*-FoxL1+ mice in the middle part of the glands (Figure 6B). However, the upper gland of the mutant mice revealed a divergent spatial organization of CL fibers with respect to the organization observed in the controls (Figure 6C). The coherency factor was significantly higher in the top of the gland in *BmpR1a*-FoxL1+ mice (CF = 0.338), indicating that the CL fibers tended to be oriented in a predominant direction and had an increased alignment compared to that observed in control mice (CF = 0.245; Figure 6D).
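To make the coherency factor more concrete, the sketch below computes it for two synthetic images, assuming the standard structure-tensor definition, (λmax − λmin)/(λmax + λmin), where λ are the eigenvalues of the tensor. This is not the OrientationJ implementation (which applies its own gradient and windowing choices); the function and test images are illustrative only.

```python
import math


def coherency(image):
    """Coherency of a 2D intensity grid (list of equal-length rows).

    Gradients use central differences on interior pixels, and the structure
    tensor is summed over the whole grid, i.e., the grid is treated as one ROI.
    """
    rows, cols = len(image), len(image[0])
    jxx = jyy = jxy = 0.0
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0
            jxx += gx * gx
            jyy += gy * gy
            jxy += gx * gy
    trace = jxx + jyy
    if trace == 0.0:
        return 0.0  # flat image: no orientation information
    # Eigenvalue difference over eigenvalue sum of the 2x2 structure tensor.
    return math.hypot(jxx - jyy, 2.0 * jxy) / trace


SIZE = 9
# Intensity varies along one axis only: perfectly aligned "fibers".
aligned = [[math.sin(c) for c in range(SIZE)] for _ in range(SIZE)]
# Radially symmetric pattern: no predominant direction.
radial = [[(c - 4) ** 2 + (r - 4) ** 2 for c in range(SIZE)] for r in range(SIZE)]

aligned_cf = coherency(aligned)
radial_cf = coherency(radial)
print(aligned_cf, radial_cf)
```

Values close to 1 indicate one dominant fiber direction and values close to 0 indicate no preferred direction, so the reported increase from 0.245 to 0.338 corresponds to a shift towards more aligned fibers.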

#### *3.5. Loss of BMP Signaling in Gastric TCFoxL1+ Causes Remodeling of ECM Glycoproteins Associated with Early Gastric Neoplasia*

ECM glycoproteins and ECM regulators are other matrix components essential for proper tissue function, including in the stomach [10,44]. In addition, part of the functional annotation analysis also suggested a protein signature profile linked to the tumor microenvironment. Over the years, several ECM glycoproteins and ECM regulators have been associated with every stage of gastric cancer [45–47]. Matrisomic profiling revealed a significant increase in ECM glycoproteins such as FN1 between *BmpR1a*-FoxL1+ and control enriched mesenchymal antrum (FC = 1.46; Table 2) and TNC in total antrum (FC = 1.4; Table 1) as well as in enriched mesenchymal antrum (FC = 1.95; Table 2). A significant decrease in SPARCL-1 in total antrum (FC = −15377; Table 1) was also observed. Finally, we identified a significant increase in the ECM regulator ADAM9 only in the total antrum (FC = 510; Table 1). FN1 and TNC exhibited increased expression in *BmpR1a*-FoxL1+ mice compared to that in controls, as confirmed by immunofluorescence of stomach sections (Figure 7A,B). The immunoblot analysis against SPARCL-1 confirmed a significant decrease in this ECM glycoprotein in mutant mice compared to that measured in controls (fold change = 0.48; Figure 7C,D). Immunoblot analysis against ADAM9 confirmed a significant increase in this ECM regulator in *BmpR1a*-FoxL1+ mice compared to that in controls (fold change = 1.976; Figure 7C,D).

**Figure 6.** Loss of BMP signaling in gastric TCFoxL1+ disrupts the collagen network. (**A**) Picrosirius red staining was performed on stomach sections from both control and *BmpR1a*-FoxL1+ mice. Collagen fiber organization and alignment were evaluated under bright field (left panels) and polarized light (middle panels). Imaging was performed using a Zeiss Axioscope 5 equipped with an analyzer and a linear polarizer. ROI (dotted squares) were converted to grayscale 16-bit images and color-coded so that pixel hue corresponds to the angle of local fiber orientation, which ranges from −90° to +90°. Representative ROI are shown with their color-coded fiber orientation (right panels), and the color-coded orientation legend is shown. (**B**) Distribution of fiber orientations was compiled for each ROI in all analyzed images, to compare control tissue with the middle of the gland in *BmpR1a*-FoxL1+ mouse antrum. Data are shown as means of distribution ± SD, for four individual mice in each group. (**C**) Distribution of fiber orientations was compiled for each ROI in all analyzed images, to compare control tissue with the top of the gland in *BmpR1a*-FoxL1+ mouse antrum. Data are shown as means of distribution ± SD, for four individual mice in each group. (**D**) Coherency factor was computed for all ROI and data were plotted, showing a significant increase in fiber alignment in the top part of the antrum gland in *BmpR1a*-FoxL1+ mice, with a mean coherency factor of 0.338 compared to 0.245 observed in control mice. No significant difference was observed between the middle part of the antrum gland in control mice and that in *BmpR1a*-FoxL1+ mice. Statistical analyses were performed using Prism, with a table and group nested ANOVA. Scale bar = 50 μm. \*\* *p* < 0.01. ROI: region of interest.

**Figure 7.** Modulations in ECM glycoproteins and ECM regulator correlate with a neoplasia phenotype in the stomach of *BmpR1a*-FoxL1+ mice. (**A**) Immunostaining against the ECM glycoprotein fibronectin (shown in green) revealed an increased expression in the enlarged mesenchymal area of the antrum tissue in *BmpR1a*-FoxL1+ mice compared to that in controls. (**B**) Immunostaining against the ECM glycoprotein Tenascin C (shown in green) revealed an increased expression in the antrum mesenchyme of *BmpR1a*-FoxL1+ mice compared to that in controls. (**C**) Immunoblot analysis showed a decrease in the expression of the ECM glycoprotein SPARCL-1 and an increase in the ECM regulator ADAM9 in total antrum samples of *BmpR1a*-FoxL1+ mice compared to that in controls. GAPDH was used as a loading control. (**D**) Quantification of immunoblots revealed a significant modulation of SPARCL-1 and ADAM9 between both groups (FC = 0.48 and 1.98, respectively). All quantifications were performed using ImageJ and statistical analyses were performed using Prism. All immunoblot quantification data are presented as the mean ± SD (*n* = 4). Statistical analysis was assessed using the Mann–Whitney test with \* *p* < 0.05. Evans blue was used as a counterstain (red signal in (**A**,**B**)). Scale bar = 100 μm.
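For readers less familiar with the Mann–Whitney test at small sample sizes, the sketch below enumerates the exact two-sided *p*-value for two groups of four. It assumes an exact test without ties; the measurements are hypothetical and are not the immunoblot quantifications, and the function is an illustration rather than the procedure used in Prism.

```python
from itertools import combinations


def exact_mann_whitney_p(group_a, group_b):
    """Exact two-sided p-value by enumerating all group relabelings (no ties)."""
    pooled = list(group_a) + list(group_b)
    n_a, n_total = len(group_a), len(group_a) + len(group_b)

    def u_stat(a_indices):
        chosen = set(a_indices)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(n_total) if i not in chosen]
        # U counts the pairs in which the group-A value exceeds the group-B value.
        return sum(1 for x in a for y in b if x > y)

    mean_u = n_a * len(group_b) / 2
    observed_dev = abs(u_stat(range(n_a)) - mean_u)
    extreme = total = 0
    for indices in combinations(range(n_total), n_a):
        total += 1
        if abs(u_stat(indices) - mean_u) >= observed_dev:
            extreme += 1
    return extreme / total


# Hypothetical relative band intensities, for illustration only.
control = [1.0, 1.1, 0.9, 1.2]
mutant_separated = [2.0, 2.2, 1.9, 2.1]     # every mutant value above every control
mutant_overlapping = [2.0, 2.2, 1.15, 2.1]  # one mutant value inside the control range

p_separated = exact_mann_whitney_p(control, mutant_separated)
p_overlapping = exact_mann_whitney_p(control, mutant_overlapping)
print(round(p_separated, 4), round(p_overlapping, 4))
```

Under these assumptions, with four animals per group only complete separation of the two groups yields *p* < 0.05 (2/70 ≈ 0.029), whereas a single overlapping value already gives 4/70 ≈ 0.057.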

#### **4. Discussion**

Due to the complexity and extremely low solubility of the ECM, exhaustive biochemical characterization of tissues has long been a challenge. In recent years, mass spectrometry has been used to characterize ECM proteins in various tissues [14,48–50]. In addition, the developments brought forward by Naba et al. of an in silico definition of the matrisome provide a possibility for a detailed characterization of the biochemistry and composition of the ECM in normal and diseased tissues [13,14,48,51]. Similar to other diseases, ECM deregulation has been shown to play a role in gastric neoplasia by creating a favorable microenvironment for the transformed cells to thrive from pre-neoplastic lesions to metastatic stages [5,52]. Recent studies have demonstrated that TCFoxL1+ are strong contributors to the GI microenvironment [15,17,20,23,24]; however, their precise contribution to the ECM fractions of the microenvironment is less clear. Qualitative analysis of ECM proteins in the *BmpR1a*-FoxL1+ mouse, where TCFoxL1+ are impaired in BMP signaling, suggests a potential role for this mesenchymal cell population in contributing to the ECM fraction of the microenvironment [18,21,22]. In addition, the pathophysiological phenotype of the *BmpR1a*-FoxL1+ mouse model is characterized by the development of gastric pre-neoplastic lesions [22]. Together, we discovered that *BmpR1a*-FoxL1+ mice represent an adequate model for understanding how TCFoxL1+ participate in an aberrant gastric pre-neoplastic ECM microenvironment.

Because part of our study was to characterize the ECM contribution of BMP-signaling impaired TCFoxL1+ to the pre-neoplastic gastric microenvironment, we explored the validity of using enriched mesenchyme over total tissue extract for targeting matrisomic proteins. Tissue deconstruction into a minimal mesenchymal compartment, where TCFoxL1+ and the microenvironment are observable, allows for the possibility of circumventing the complexity of the total tissue protein content. As expected, we observed an important decrease in the presence of ECM regulator proteins when we used enriched mesenchymal extract in comparison to the total tissue extract because these proteins are not bound to the ECM and are thus easily lost during purification processes [48]. Deconstruction of the gastric antrum provides a more comprehensive analysis of the matrisome in *BmpR1a*-FoxL1+ mice compared to controls, with the removal of background noise from non-matrisomic proteins. In addition, the mesenchymal-enriched extract allows for improved identification of proteins with low expression levels that could be easily lost in a larger pool of proteins.

In a previous study, the gastric pathophysiological aspects of the *BmpR1a*-FoxL1+ mouse model showed that disruption of BMP signaling in TCFoxL1+ led to the creation of a toxic microenvironment with an increase in CLI, fibronectin, HGF, and FSP1/S100A4, pressuring the epithelium to initiate pre-malignant lesions [22]. Correa's cascade of gastric carcinogenesis shows that a normal gastric epithelium gradually transitions from initial gastritis to chronic gastritis, mucosal atrophy, metaplasia, dysplasia, and carcinoma [53,54]. Early steps of this cascade prior to carcinoma involve the presence of inflammatory processes [54,55] and a reorganization of the nurturing microenvironment into a tumor microenvironment [5]. Interestingly, some protein profiles, such as immune regulation, fibrosis, and tumor microenvironment, were noticeably modulated in the *BmpR1a*-FoxL1+ matrisome analysis. Thus, the present protein profile, in combination with our previous phenotypic analysis of *BmpR1a*-FoxL1+ mice, allows for a better understanding of the sequence of events occurring in the ECM microenvironment of these mice with BMP-impaired TCFoxL1+ with regard to early events in gastric neoplasia.

Consequently, the overexpression of S100A8 and A9 in the matrisomic analysis, as secreted factors associated with the ECM, supports these profiles. Both proteins have been associated with numerous human disorders, including acute and chronic inflammatory conditions, autoimmune diseases, and cancer [40,56,57]. They are also reported to represent highly potent biomarkers of a wide range of inflammatory processes, including rheumatoid arthritis and inflammatory bowel disease [41,58]. In tumor biology, both proteins play a fundamental role, and their levels are elevated in numerous tumors, including gastric cancer, which is in line with our model [57,59–63]. Although there are signs of inflammation in mice with infiltration of lymphocytes (CD3) and macrophages (F4/80), no chronic inflammation was observed [22]. This could partially explain the overexpression of S100A8/A9 in the gastric microenvironment of the *BmpR1a*-FoxL1+ mice.

As for the tumor microenvironment profile identified in this study, ECM glycoproteins and ECM regulators are known to play key roles in the microenvironment for proper tissue function, including in the stomach [2,5,10,45,64–66]. For example, matricellular proteins such as FN1, TNC, and ADAM9 were upregulated, while SPARCL-1/Hevin was downregulated. In addition, these ECM glycoproteins and ECM regulators have been linked to the tumor microenvironment in various stages of gastric cancer [67–70]. Deregulation of protein expression, such as FN1 and ADAM9 (upregulated) or SPARCL-1 (downregulated), has been shown to affect cell growth and tissue proliferation in gastric cancer [70–73]. The hyperplasia seen in the gastric glands of *BmpR1a*-FoxL1+ mice [22] could be, in part, explained by the modification of these proteins in the microenvironment. TNC is generally absent or suppressed in most normal adult tissues, while it is markedly overexpressed in some pathological conditions, such as wound healing, inflammation, and a variety of neoplasms [74]. This expression pattern was observed in the stomachs of *BmpR1a*-FoxL1+ mice when compared to that of controls. Thus, similar to gastrointestinal stromal tumors [67], where TNC is used as a potential marker, it can also be used as an indicator of gastric premalignancies, according to the results shown in this study.

CL is a polymeric protein present in greater quantities in the ECM under physiological conditions [75,76], as well as in the tumor microenvironment, where its extensive deposition is one of the pathological characteristics of cancers, such as gastric neoplasia [43,77]. As collagens play an important structural role in the ECM and contribute to its mechanical properties by influencing cellular behavior [78], any changes in CL organization, expression, and/or crosslinking will directly affect optimal tissue function [79]. Unexpectedly, in this study, we discovered that almost all CL chains analyzed using MS were downregulated in the *BmpR1a*-FoxL1+ pre-neoplastic model. This is in contrast to previous findings, especially regarding what is known from descriptive studies on ECM in gastric cancer, as well as previous studies with *BmpR1a*-FoxL1+ [5,22,43]. Other proteomic analyses have shown the difficulties of optimal CL protein extraction from tissues, especially when fibrotic [36,80,81]. We hypothesize that the extraction method used in this study was not optimal for CL protein analysis [81]. However, the choice of another method favoring CL protein extraction could be detrimental to the analysis of other matrisomic proteins [81]. Considering that CL chain expression, as well as its mechanical and biochemical organization, could be validated through other techniques, proteomic analyses would not be the preferred technique for studying fibrotic tissues. In this study, Sirius red staining under bright field was used for the visualization of total CL deposition in tissue, while under polarized light microscopy it provided more relevant information regarding the CL network, such as its organization, stiffness, and fiber alignment.

Altogether, the present study provides a more comprehensive representation of the evolving ECM fraction from the microenvironment in pre-neoplastic gastric lesions associated with BMP signaling-impaired TCFoxL1+. These findings support the importance of TCFoxL1+ and BMP signaling in the maintenance of a healthy microenvironment to maintain gastric homeostasis and prevent the development of pathologies such as neoplasia.

**Author Contributions:** Conceptualization, N.P. and F.-M.B.; methodology, A.B.A.; software, J.R., C.-M.J.; validation, A.B.A., V.P. and V.R.N.; formal analysis, A.B.A., V.P., F.-M.B. and N.P.; investigation, A.B.A.; resources, N.P.; data curation, A.B.A.; writing—original draft preparation, N.P.; writing—review and editing, N.P. and F.-M.B.; visualization, A.B.A.; supervision, N.P. and F.-M.B.; project administration, N.P.; funding acquisition, N.P. and F.-M.B. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Natural Sciences and Engineering Research Council of Canada (NSERC) RGPIN-2018-05414 to F.M.B. and RGPIN 2018-06115 to N.P.

**Institutional Review Board Statement:** The study protocol was approved by the Animal Welfare Research Committee of the Faculty of Medicine and Health Sciences of the Université de Sherbrooke (FMSS-2019-2370), and all experiments were conducted in strict adherence to the standards and policies of the Canadian Council on Animal Care in Sciences.

**Data Availability Statement:** Raw files, databases, and MaxQuant results have been deposited in ProteomeXchange with the accession number PXD038603.

**Acknowledgments:** NP and FMB are members of the Fonds de Recherche du Québec-Santé-funded "Centre de Recherche CHUS". FMB is an FRQS senior scholar (award number 281824). The authors thank KHK for providing the *FoxL1*Cre transgenic line, Ariane De Castro for her technical assistance with the mouse colony and genotyping, and Electron Microscopy and Histology Research Core of the Faculté de Médecine et des Sciences de la Santé at the Université de Sherbrooke for their histology, electron microscopy, and phenotyping services.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

### *Article* **Aberrant Expression and Prognostic Potential of IL-37 in Human Lung Adenocarcinoma**

**Panayiota Christodoulou 1,2,†, Theodora-Christina Kyriakou 3,†, Panagiotis Boutsikos 2, Maria Andreou 1, Yuan Ji 4, Damo Xu 5, Panagiotis Papageorgis 3 and Maria-Ioanna Christodoulou 1,4,\***


**Abstract:** Interleukin-37 (IL-37) is a relatively new IL-1 family cytokine that, due to its immunoregulatory properties, has lately gained increasing attention in basic and translational biomedical research. Emerging evidence supports the implication of this protein in any human disorder in which immune homeostasis is compromised, including cancer. The aim of this study was to explore the prognostic and/or diagnostic potential of IL-37 and its receptor SIGIRR (single immunoglobulin IL-1-related receptor) in human tumors. We utilized a series of bioinformatics tools and -omics datasets to unravel possible associations of IL-37 and SIGIRR expression levels and genetic aberrations with tumor development, histopathological parameters, distribution of tumor-infiltrating immune cells, and survival rates of patients. Our data revealed that amongst the 17 human malignancies investigated, IL-37 exhibits higher expression levels in tumors of lung adenocarcinoma (LUAD). Moreover, the expression profiles of IL-37 and SIGIRR are associated with LUAD development and tumor stage, whereas their high mRNA levels are favorable prognostic factors for the overall survival of patients. What is more, *IL-37* correlates positively with a LUAD-associated transcriptomic signature, and its nucleotide changes and expression levels are linked with distinct infiltration patterns of certain cell subsets known to control LUAD anti-tumor immune responses. Our data indicate the potential value of IL-37 and its receptor SIGIRR to serve as biomarkers and/or immune-checkpoint therapeutic targets for LUAD patients. Further, the data highlight the urgent need for further exploration of this cytokine and the underlying pathogenetic mechanisms to fully elucidate its implication in LUAD development and progression.

**Keywords:** interleukin (IL-)37; lung adenocarcinoma; biomarker; survival; infiltration rates

#### **1. Introduction**

Interleukin-37 (IL-37) is one of the latest members included in the IL-1 family of cytokines, known to suppress innate immune responses and modulate acquired ones. Thus, this cytokine plays a pivotal role in inflammation related to the pathophysiology of various human disorders, including autoimmune diseases, inflammatory systemic conditions, infections, and cancer [1]. It is produced by immune and non-immune cells and acts via inhibition of the production of pro-inflammatory cytokines and activation of anti-inflammatory signals [2]. Similar to other immune-regulatory cytokines (e.g., TGF-β and IL-10), IL-37 has attracted notable interest from both a basic biological and a translational research perspective [2].

**Citation:** Christodoulou, P.; Kyriakou, T.-C.; Boutsikos, P.; Andreou, M.; Ji, Y.; Xu, D.; Papageorgis, P.; Christodoulou, M.-I. Aberrant Expression and Prognostic Potential of IL-37 in Human Lung Adenocarcinoma. *Biomedicines* **2022**, *10*, 3037. https://doi.org/10.3390/biomedicines10123037

Academic Editor: Manoj K. Mishra

Received: 20 October 2022 Accepted: 21 November 2022 Published: 24 November 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The human *IL-37* gene is located on chromosome 2q12-13, very close to the regulatory regions of the genes encoding the IL-1α and IL-1β cytokines [3]. The gene encodes five protein isoforms, a, b, c, d, and e; however, their specific functions as well as their relative abundance are not yet fully elucidated [2]. Among all isoforms, IL-37b is the longest one (consisting of five of the six exons; all except exon 3) and the most well-studied [4,5]. In humans, IL-37 is reported to be constitutively expressed by circulating monocytes, tissue macrophages, dendritic cells (DCs), tonsil B cells, and plasma cells [2,6]. Upon pro-inflammatory stimuli, its expression is significantly augmented in tissue cells and peripheral blood mononuclear cells (PBMCs), predominantly monocytes, consistent with a prevailing immunoregulatory role [1,6,7].

IL-37 exerts its effects on target cells via two distinct mechanisms: (i) Extracellularly, it binds to interleukin 18 receptor 1 (IL-18R1), which recruits interleukin 1 receptor 8 (IL-1R8, also known as single Ig IL-1-related receptor, SIGIRR), essential for the anti-inflammatory actions of the cytokine, and forms a complex that transduces the signal intracellularly [8–11]; the activity of the cytokine lies mainly in the suppression of pro-inflammatory signaling factors, including mTOR (mammalian target of rapamycin), STAT1 (signal transducer and activator of transcription 1), AKT (Ak strain transforming), p53, p38, SHP-2 (SH2 domain-containing protein tyrosine phosphatase-2), and Syk (spleen-associated tyrosine kinase) [2], and also in the enhancement of anti-inflammatory signaling factors, such as PTEN (phosphatase and tensin homolog) phosphatase, to further inhibit inflammation through the PI3K (phosphoinositide 3-kinase), mTOR, MAPK (mitogen-activated protein kinase), and FADK (focal adhesion kinase) pathways [12]. (ii) Intracellularly, upon its cleavage by caspase-1 at the aspartic acid (D20) residue and binding to Smad3, IL-37 can be translocated into the nucleus, where it dampens the expression of inflammatory genes [13–16].

Evidence of the role of IL-37 in cancer, emerging in recent years, supports its tumor-protective properties exerted through the enhancement of anti-tumor immunity, specifically within the tumor microenvironment (TME). At this point, it is worthwhile to highlight the importance of the local microenvironment, consisting of distinct immune and non-immune, cellular and non-cellular components (growth factors, chemokines, cytokines), in the development and progression of human tumors, as well as in the response to treatment [17]. TME interplays are orchestrated by tumor-infiltrating lymphocytes (TILs), natural killer (NK) cells, tumor-infiltrating dendritic cells (TIDCs), tumor-associated macrophages (TAMs), tumor-associated neutrophils (TANs), cancer-associated fibroblasts (CAFs), and myeloid-derived suppressor cells (MDSCs). Within these cell types, T helper (Th) 1, cytotoxic T, NK, B cells, M1 macrophages (MΦ), and mature DCs represent partners of immune control against malignant cells, whereas Th2, regulatory T cells (Tregs), M2 MΦ, neutrophils, CAFs, immature DCs, and MDSCs promote immune escape. Essentially, certain elements of the TME have been targeted by therapeutic drugs (antibodies against immune checkpoints, such as PD-1/PD-L1, as well as anti-angiogenic factors, such as anti-VEGF-A) that are associated with good clinical outcomes [18].

IL-37 overexpression by TAMs derived from patients with human hepatocellular carcinoma (HCC) inhibits M2 polarization via regulation of the IL-6/STAT3 pathway, to suppress tumor growth in vivo [19]. High IL-37 expression by HCC tumor cells is associated with upregulated levels of CCL3 and CCL20 and increased recruitment of CD1a+ dendritic cells (DCs) in tumor infiltrations [20]. What is more, IL-37 secreted from HCC cells enhances the expression of MHC-II, CD86, and CD40 surface molecules and the secretion of IL-2, IL-12, IL-12p70, interferon-α (IFN-α), and IFN-γ cytokines by DCs, which is, in turn, associated with an increased proportion of IFNγ+CD8+ T cells [20]. Additional in vivo experiments showed that overexpression of IL-37 in HCC cells resulted in increased recruitment of CD11c+ DCs in the tumor microenvironment and tumor growth delay [20]. On the other hand, a very recent study using an experimental colorectal cancer (CRC) model reported that IL-37 transgenic mice are highly prone to developing colitis-associated CRC, which is characterized by severely increased tumor burdens and dysfunction of infiltrating CD8+ T cells, dependent on SIGIRR [21].

Apart from immune-related effects, IL-37 also exerts its anti-cancer activity on other aspects of tumor development. First, it acts as an anti-angiogenic factor; its expression by cancer cells suppresses the tubule formation of human umbilical vein endothelial cells (HUVECs) in vitro, decreases the expression of matrix metallopeptidase 2 (MMP2) and vascular endothelial growth factor (VEGF) in SK-Hep-1 cells, and inhibits tumor angiogenesis in a murine model of HCC [22]. Second, it suppresses migration through the inhibition of Rac1 activation in various tumor cell types; indeed, intracellular IL-37 binds to the C-terminal region of Rac1, preventing its membrane translocation and downstream signaling [22,23]. It has been observed that decreased expression of IL-37 in human lung adenocarcinoma (LUAD) biopsies is associated with tumor metastasis [23]. Lastly, the cytokine can act against tumor progression through the modulation of N6-methyladenosine (m6A) activity and inhibition of the Wnt5a/5b pathway in lung cancer cells [24].

Clinical observations over the last years have shed light on the potential of this cytokine to serve as a possible biomarker in various human malignancies. In CRC patients, serum IL-37 levels were found to be significantly elevated and positively correlated with the levels of CEA (carcinoembryonic antigen), a commonly used diagnostic biomarker for the disease [21]. In these patients, a negative correlation between IL-37 levels in the serum and CD8+ T cell infiltration in the tumor was also observed [21]. Importantly, IL-37 expression in CRC tumors was found (a) to be linearly correlated with their stage, with the highest expression detected in stage I and the lowest in stage IV tumors; and (b) to be associated with survival rates, with higher levels predicting longer disease-free (DFS) and overall (OS) survival [25]. It is of note that intratumoral IL-37 levels, together with the incidence of CD66b+ neutrophils, as well as mismatch repair (MMR) status, have been proven to be independent prognostic factors and are included in nomograms predicting DFS and OS in CRC, which could facilitate individualized patients' management [25].

Elevated serum IL-37 levels were also detected in patients with transitional cell carcinoma of the bladder (TCC) [26]. In melanoma, high levels of IL-37 expressed by peripheral Tregs were found to mirror the secretion of IL-1β mediators, especially TGFβ, by the tumor, suggesting it could be used as a possible biomarker for tumor-induced immunosuppression [27]. Furthermore, in HCC biopsies, a high prevalence of tumor-infiltrating IL-37+CD1a+ DCs was linked to higher survival rates of patients [20]. On the other hand, the ratio of IL-18-to-IL-37 levels was higher in the serum and PBMCs of patients with oral squamous cell carcinoma (OSCC) compared to non-cancer individuals and was associated with shorter OS and DFS [28]. Low levels of IL-37 in the sera of patients with acute myeloid leukemia (AML) were shown to be associated with poor prognosis of the disease, but they were restored to normal in complete remission [29]. Finally, in breast cancer, peripheral blood *IL-37* mRNA levels and CD8+ T cell numbers were decreased in patients compared to healthy individuals, and they were correlated with ER+/PR+/HER2+ status [30].

In this study, we aimed to investigate the prognostic potential of IL-37 in patients with cancer, utilizing bioinformatics tools and publicly available databases. Since our initial results indicated that, among various human malignancies, *IL-37* exhibits its highest expression in lung adenocarcinoma (LUAD), the study was subsequently focused on this cancer type. IL-37 levels were found to be correlated with tumor development, stage, grade, and the improved overall survival of patients, and mutations and gene expression levels were associated with the differential distribution of immune cells infiltrating the tumor.

#### **2. Materials and Methods**

#### *2.1. Study Design*

We first investigated the possible differential distribution of *IL-37* and *SIGIRR* expression levels in various human cancers using the Tissue Atlas tool of the Human Protein Atlas website [31]. Lung cancer, and more specifically lung adenocarcinoma (LUAD), was selected for further analysis. The TNMplot web tool [32] was used to compare the mRNA expression levels in LUAD versus non-LUAD specimens, the UALCAN portal [33] to explore the differential distribution among tumors of various histology, stage, nodal metastasis, or TP53 status, and the Kaplan–Meier plotter tool [34] to assess the effect of mRNA levels on survival rates of LUAD patients. Protein expression profiles as well as associations with various parameters of the pathology of the tumor were investigated through the UALCAN portal [33] and the Pathology tool of the Human Protein Atlas website [31]. To explore the expression distribution of *IL-37* and *SIGIRR* genes in different cell types of the human lung, the Single-Cell Type Atlas part of the Human Protein Atlas was used [35]. Finally, the effect of *IL-37* nucleotide changes or aberrations in expression levels on the differential distribution of various immune cell subsets infiltrating the LUAD tumor was analyzed using the TIMER2.0 webserver [36].

#### *2.2. Study of the Expression Levels of IL-37 and SIGIRR in Various Human Cancers*

To explore the expression patterns of *IL-37* and *SIGIRR* in various human cancers, the Tissue Atlas tool [31] of the Human Protein Atlas website was used (www.proteinatlas.org, accessed on 24 August 2022). The program processes data from RNA-sequencing experiments on tumor samples of various origin (17 different types of cancer, n = 7932 total samples). The mean of the FPKM levels of the genes in each cancer type and in the total cohort of cancer patients, as well as its standard deviation (SD), was estimated. Non-parametric Mann–Whitney U tests were applied for the evaluation of differences in gene expression levels.

#### *2.3. Investigation of IL-37 and SIGIRR Expression Levels in LUAD versus Non-LUAD Lung Tissue*

The TNMplot tool (www.tnmplot.com, accessed on 3 September 2022) [32] was used to explore impaired expression levels of the *IL-37* and *SIGIRR* genes in LUAD tumors. Comparative analysis processed RNA-sequencing data deposited in The Cancer Genome Atlas (TCGA) database from (a) 524 LUAD versus 486 non-LUAD individuals and (b) 57 pairs of LUAD versus adjacent normal tissue biopsies. Fold-changes of the median expression levels between the groups and the *p*-values assessed utilizing the non-parametric Mann–Whitney U test are reported; significant changes were considered those with a *p*-value < 0.05 and a fold-change > 2 or <0.5.
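The significance rule described above (a *p*-value < 0.05 combined with a fold-change > 2 or < 0.5) can be illustrated with a minimal Python sketch. The function names and the toy expression values are hypothetical and are not part of TNMplot; the *p*-value is assumed to come from a Mann–Whitney U test computed elsewhere.

```python
from statistics import median

def median_fold_change(tumor, normal):
    """Ratio of the median expression in tumor vs. normal samples."""
    return median(tumor) / median(normal)

def is_significant(fold_change, p_value, alpha=0.05, up=2.0, down=0.5):
    """Apply both criteria: p < alpha and a fold-change above `up` or below `down`."""
    return p_value < alpha and (fold_change > up or fold_change < down)

# Hypothetical expression values, for illustration only.
tumor = [12.0, 30.0, 24.0, 18.0, 60.0]
normal = [1.0, 2.0, 3.0, 2.0, 1.0]

fc = median_fold_change(tumor, normal)
print(fc, is_significant(fc, p_value=1e-4))
```

Under this rule, a change can be statistically significant by *p*-value alone and still be filtered out when the fold-change stays between 0.5 and 2.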

#### *2.4. Exploration of Associations between IL-37 or SIGIRR Expression Levels with Certain Pathological Characteristics of the LUAD Tumor*

To investigate possible associations between the gene-expression levels of *IL-37* and *SIGIRR* and certain parameters of the pathology of the tumor, data from the UALCAN portal (http://ualcan.path.uab.edu/, accessed on 3 September 2022) [33] were processed. Gene expression levels were analyzed in correlation with histological type, cancer stage, nodal metastasis, and TP53 mutation status. Fold-changes compared to control groups >2 or <0.5 and *p*-values < 0.05 were considered significant.

#### *2.5. Assessment of the Effect of IL-37 and SIGIRR Expression Levels on Survival Rates of Patients with LUAD*

The Kaplan–Meier (KM) Plotter tool (www.kmplot.com, accessed on 24 August 2022) [34] was used to evaluate the ability of *IL-37* expression to serve as a prognostic factor for overall or relapse-free survival (OS or RFS, respectively) in LUAD patients. The tool analyzed RNA-sequencing data from 504 LUAD participants (deposited in the Gene Expression Omnibus (GEO), European Genome-Phenome Archive (EGA), and TCGA databases), categorized based on *IL-37* or *SIGIRR* levels of expression ranging from high to low, and calculated the hazard ratio (HR) and logrank *p*-values for the probability of survival at 250 months.
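For readers unfamiliar with the survival curves produced by this kind of tool, the following is a minimal sketch of the Kaplan–Meier product-limit estimator. It is an illustration only, not the KM Plotter implementation; the follow-up times and event flags are invented, and the hazard ratio and logrank test reported by the tool are not reproduced here.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time of each patient (e.g. months)
    events : 1 if death was observed at that time, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Patients who died or were censored at t leave the risk set.
        at_risk -= leaving
    return curve

# Hypothetical follow-up data (months) for a small group of patients.
months = [1, 2, 3, 4]
died = [1, 0, 1, 1]
print(kaplan_meier(months, died))
```

Censored patients (event flag 0) do not lower the curve themselves, but they reduce the number at risk for later event times.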

#### *2.6. Development of a List of Positively/Negatively Correlated Genes of IL-37 in LUAD Biopsies*

To obtain a list of genes whose expression levels are positively or negatively correlated with those of *IL-37*, the UALCAN tool (http://ualcan.path.uab.edu/, accessed on 20 September 2022) was utilized [33]. The tool processed RNA-seq data from LUAD tumor biopsies of TCGA. A heatmap was generated, and Pearson's correlation test was applied to assess the significance of the data (*p*-values < 0.05 were considered significant). Correlations of *IL-37* with genes encoding IC proteins were specifically explored via the Correlation analysis option of the TNMplot tool [32]. Data from RNA-seq experiments on LUAD tumors were processed; Spearman's rho and *p*-values were obtained, and a correlation coefficient cut-off of 0 was set.
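The two correlation measures mentioned above differ in what they capture: Pearson's *r* measures linear association on the raw values, whereas Spearman's rho is Pearson's *r* computed on ranks and therefore captures any monotonic relationship. A minimal sketch follows; the helper names and toy expression values are hypothetical and do not reflect the internals of UALCAN or TNMplot.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks of x and y."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical expression values of two genes across four samples.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [1.0, 4.0, 9.0, 16.0]
print(pearson_r(gene_a, gene_b), spearman_rho(gene_a, gene_b))
```

In this toy case the relationship is monotonic but not linear, so Spearman's rho reaches 1 while Pearson's *r* stays slightly below it.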

#### *2.7. Investigation of the IL-37 and SIGIRR Protein Expression Levels*

Protein expression in LUAD tumors was first explored using the Pathology tool of the Human Protein Atlas website, www.proteinatlas.org [35] (accessed on 20 September 2022). Deposited pictures of immunohistochemically stained sections of paraffin-embedded LUAD tissues were observed. Staining had been performed using polyclonal antibodies against human IL-37 (HPA054371) and SIGIRR (HPA023188). Levels of protein expression (z-values) in paired LUAD primary vs. normal tissues, as well as in tumors of different histological subtype, stage, grade, status of the HIPPO, WNT, mTOR, NRF2, RTK, or p53/Rb-related pathways, SWI-SNF complex, MYC/MYCN or chromatin modifier, were analyzed through the UALCAN web portal http://ualcan.path.uab.edu/ (accessed on 1 October 2021) [33]. *p*-Values calculated utilizing the non-parametric Mann–Whitney U test for differences in between-two group analyses are reported; *p*-values < 0.05 were considered significant.

#### *2.8. Blood and Immune Single-Cell Analysis in Lung Tissue*

To visualize single-cell RNA-seq (scRNAseq) data from human lung tissue, the Single-Cell Type Atlas was used. *IL-37* and *SIGIRR* expression profiles of blood and immune cells including macrophages, alveolar cells type 1 and 2, T cells, granulocytes, fibroblasts, club cells, ciliated cells, and endothelial cells are depicted in colored clusters in UMAP plots and in bar charts. Elevated expression levels (read counts normalized to transcripts per million protein coding genes, pTPM) of *IL-37* and *SIGIRR* in different blood and immune cell groups can categorize genes as cell type-enriched (at least four-fold higher mRNA level in a certain cell type compared to any other cell type), group-enriched (at least four-fold higher average mRNA level in a group of 2–10 cell types compared to any other cell type), and cell type-enhanced (at least four-fold higher mRNA level in a certain cell type compared to the average level in all other cell types).
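The three specificity categories defined above can be expressed as a small decision procedure. The sketch below is a simplified reading of those definitions, not the Human Protein Atlas pipeline: the function name, the order in which the checks are applied, the "low specificity" fallback label, and the pTPM values are all assumptions made for illustration.

```python
def classify_specificity(ptpm, fold=4.0):
    """Classify one gene from a {cell_type: pTPM} mapping.

    Checks are applied in order and the first match is returned.
    """
    levels = sorted(ptpm.values(), reverse=True)
    if len(levels) < 2:
        return "not classified"
    # Cell type-enriched: top cell type >= fold x every other cell type.
    if levels[0] > 0 and levels[0] >= fold * levels[1]:
        return "cell type-enriched"
    # Group-enriched: mean of the top 2-10 cell types >= fold x any other.
    for k in range(2, min(10, len(levels) - 1) + 1):
        group_mean = sum(levels[:k]) / k
        if group_mean > 0 and group_mean >= fold * levels[k]:
            return "group-enriched"
    # Cell type-enhanced: top cell type >= fold x mean of all the others.
    rest_mean = sum(levels[1:]) / (len(levels) - 1)
    if levels[0] > 0 and levels[0] >= fold * rest_mean:
        return "cell type-enhanced"
    return "low specificity"

# Hypothetical pTPM profile of one gene across lung cell types.
profile = {"T cells": 40.0, "granulocytes": 36.0,
           "macrophages": 2.0, "fibroblasts": 1.0}
print(classify_specificity(profile))
```

Because the values are sorted in descending order, comparing a group mean against the next value in the list is equivalent to comparing it against every cell type outside the group.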

#### *2.9. Analysis of Possible Correlations between IL-37 Gene Alterations or Expression Levels with Immune Cell Infiltration Patterns*

To analyze the effect of *IL-37* nucleotide changes on or the association of *IL-37* expression levels with immune cell infiltration of lung tumors, the TIMER2.0 webserver was used [36]. The "mutation" module was utilized for the investigation of possible differential distribution of macrophages, CD4+ or CD8<sup>+</sup> T cells, Tregs, dendritic cells (DCs), neutrophils, B cells, monocytes, NK cells, MDSCs, and endothelial cells in LUAD tumors of patients with *IL-37* somatic mutations vs. those without. The Wilcoxon *p*-value and log2fold-change of infiltration levels between the groups were estimated. The "gene" module was utilized for the exploration of possible correlations between the expression levels of *IL-37* and the levels of infiltration of the tumors by CD4+ or CD8<sup>+</sup> T cells, Tregs, γδ T cells, B cells, neutrophils, monocytes, macrophages, DCs, NK cells, mast cells, cancer-associated fibroblasts, lymphoid, myeloid or granulocyte-lymphocyte progenitor cells, endothelial cells, eosinophils, hematopoietic stem cells, and MDSCs. All data were filtered for tumor purity. Spearman's rho and *p*-values were calculated for the evaluation of the linear positive or negative correlation.

#### *2.10. Exploration of the Molecular/Cellular Networks in Which IL-37 Signaling Is Implicated*

The STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) database (https://string-db.org) [37] (accessed on 1 October 2022) was used to explore the cancer/LUAD molecular/cellular networks in which IL-37 signaling is implicated. The Biological Process (Gene Ontology), Molecular Function (Gene Ontology), Cellular Component (Gene Ontology), Reference publications (PubMed), Local network cluster (STRING), KEGG Pathways, Reactome Pathways, WikiPathways, Tissue expression (TISSUES), Subcellular localization (COMPARTMENTS), Annotated Keywords (UniProt), Protein Domains (Pfam), Protein Domains and Features (InterPro), and Protein Domains (SMART) depositories were searched. The level of confidence for the minimum interaction score was 0.4. Pathways with a false-discovery rate (FDR) < 0.05 were considered to be significantly affected.
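The FDR threshold above refers to *p*-values corrected for multiple testing. As a minimal sketch, and assuming the Benjamini–Hochberg procedure (a common choice for enrichment analyses), the adjustment can be computed as follows; the function name and the raw *p*-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (FDR) in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw enrichment p-values for four pathways.
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
significant = [q < 0.05 for q in fdr]
print(fdr, significant)
```

Each raw *p*-value is scaled by the number of tests divided by its rank, so the smallest *p*-values are penalized most heavily.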

#### *2.11. Statistical Analysis*

Part of the statistical analysis of the data had been performed by the aforementioned web portals. Additional statistical tests included (as required): (a) for between two-groups analysis, non-parametric Mann–Whitney *U* or unpaired t-test with Welch's correction; (b) for differences among means, an ordinary one-way ANOVA test; (c) for linear correlations, Pearson's *r* test. The analysis was performed using the GraphPad Prism 8.4.2 software (GraphPad Software, San Diego, CA, USA). *p*-Values < 0.05 were considered significant.
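As an illustration of the unpaired t-test with Welch's correction listed in (a), the sketch below computes the test statistic and the Welch–Satterthwaite degrees of freedom. It is not the GraphPad Prism implementation; the function name and sample values are hypothetical, and the *p*-value (which requires the t distribution) is omitted.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error of each group mean (sample variance / n).
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical expression values in two groups of five samples each.
group_1 = [1, 2, 3, 4, 5]
group_2 = [2, 4, 6, 8, 10]
t_stat, dof = welch_t(group_1, group_2)
print(round(t_stat, 4), round(dof, 4))
```

Unlike Student's t-test, this form does not assume equal variances, which is why the degrees of freedom are generally non-integer.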

#### **3. Results**

#### *3.1. Among Human Cancers, IL-37 Exhibits the Highest Expression Levels in Lung Adenocarcinoma (LUAD)*

The first aim of this study was to investigate the expression patterns of *IL-37* and its receptor, *SIGIRR*, in tumor biopsies of various human cancer types, utilizing the Tissue Atlas tool of the Human Protein Atlas database [31]. Analysis revealed that the *IL-37* expression pattern significantly differs in various types of human cancers (one-way ANOVA *p* < 0.0001), and that, among the 17 types examined, lung cancer exhibited the highest *IL-37* mRNA levels (mean ± SD: 6.5 ± 31.88, n = 994) (Table 1, Figure 1A). Lung cancer specimens were further analyzed based on their specific type: either lung adenocarcinoma (LUAD) or lung squamous carcinoma (LUSC); the former exhibited increased *IL-37* expression levels compared to the latter (12.85 ± 44.05, n = 500 vs. 0.25 ± 1.79, n = 494, respectively; Mann–Whitney *p* = 0.0001). In contrast, *SIGIRR* was expressed at similar levels in almost all cancer types, besides glioma and head-and-neck tumors, in which its expression levels were decreased compared to the rest (Table 1, Figure 1B). However, within lung cancer samples, LUAD specimens exhibited higher *SIGIRR* levels compared to LUSC specimens (41.6 ± 4.46 vs. 13.64 ± 7.03, respectively; Mann–Whitney *p* < 0.0001). It is also noteworthy that, overall, human tumor biopsies express lower *IL-37* compared to *SIGIRR* mRNA levels (1.029 ± 11.68 vs. 13.04 ± 8.911, respectively, n = 7932, *p* < 0.0001).

**Table 1.** *IL-37* and *SIGIRR* mRNA expression levels (FPKMs) in various human cancer types. Data analyzed were obtained from www.proteinatlas.org [31] (accessed on 24 August 2022). *p*-Values compared to total lung or LUAD specimens using the non-parametric Mann–Whitney U test are reported.




NA: not applicable, NS: non-significant.

**Figure 1.** Box and Tukey whiskers diagrams showing the differential distribution of *IL-37* (**A**) and *SIGIRR* (**B**) expression levels (FPKMs), as analyzed by RNA sequencing in human tumor samples of various cancer types. Especially for lung cancer, samples were separately analyzed as lung adenocarcinoma (LUAD) or squamous carcinoma (LUSC). Data were obtained from www.proteinatlas.org [31] (accessed on 24 August 2022) and further processed for statistical analysis. Left and right sides of the boxes correspond to the lower and upper quartiles; the box covers the interquartile interval including 50% of the data. The vertical line in the box represents the median. Whiskers outside the box expand from the minimum to the lower quartile and from the upper quartile to the maximum of data range. Each dot represents the value of a single sample.

#### *3.2. IL-37 Levels Are Increased in LUAD versus Non-LUAD Lung Tissues*

Following the observation that LUAD exhibits the highest levels of *IL-37* expression among solid tumors, we investigated its differential expression pattern in cancerous (LUAD) vs. non-cancerous lung biopsies (non-LUAD) using the TNMplot web tool [32]. *IL-37* mRNA levels were found to be increased by 12-fold in samples from LUAD (n = 524) compared to those from non-LUAD individuals (n = 486, Mann–Whitney *p* = 3.83 × 10<sup>−59</sup>), and by 11-fold in tumor compared to paired adjacent normal tissues from LUAD individuals (n = 57; *p* = 3.81 × 10<sup>−8</sup>) (Figure 2A). *SIGIRR* mRNA expression was slightly decreased (fold-change = 0.85; *p* = 5.63 × 10<sup>−5</sup>) between LUAD and non-LUAD individuals; however, no change was observed between tumor and adjacent normal tissues in LUAD patients (fold-change = 1.05; *p* = 6.28 × 10<sup>−1</sup>) (Figure 2A).

#### *3.3. IL-37 and SIGIRR Levels Are Associated with Histological Type and Tumor Grade of LUAD Tumors*

Based on the UALCAN webtool analysis [33], *IL-37* and *SIGIRR* expression levels derived from RNA-sequencing experiments are associated with the histological type of LUAD tumors. The one-way ANOVA test revealed significant differential expression among groups (*p* = 0.0001). More specifically, LBC (lung bronchioloalveolar carcinoma)-mucinous and mucinous-colloid tumors express significantly higher *IL-37* levels (TPM median (range), fold-change, *p*-value, respectively; for LBC-mucinous (n = 5): 1.07 (0.20–1.96), 13.38, 0.0072 and for mucinous-colloid (n = 10): 0.26 (0–0.76), 3.25, 0.0153) compared to normal lung tissue (n = 59, TPM median (range) 0.08 (0–0.38)) (Figure 2B, Table 2). Regarding *SIGIRR* levels, the observed changes were modest (fold-changes: between 0.5–2; one-way ANOVA *p* > 0.05). *IL-37* levels also correlated with LUAD tumor stage. Compared to controls, stage 1 biopsies (n = 277) expressed 7.63 times higher mRNA levels (TPM median (range) = 28 (0–23.62)) (Figure 2B, Table 2). Compared to stage 1 biopsies, expression levels were significantly decreased in stage 2 (n = 125; 0.37 (0–13.55)) and stage 4 (n = 28; 0.35 (0–6.17)) specimens, being 4.63- and 4.38-fold, respectively, higher than controls. In stage 3, *IL-37* levels were 2.13-fold increased compared to non-LUAD biopsies (n = 85; 0.17 (0–19.55)). A notable trend of association was observed between *SIGIRR* levels and the stage of the LUAD tumor; none of the stage subgroups exhibited significant change compared to controls (n = 59; 35.06 (17.01–55.55)) (fold-changes ranged between 1.21 and 0.73). However, *SIGIRR* levels displayed a statistically significant linear decrease among subgroups, following the stage 1-to-4 order (one-way ANOVA test: *r*<sup>2</sup> = 0.6909, *p* < 0.0001). Between-group analyses revealed that *SIGIRR* levels were significantly lower in biopsies of stage 2 compared to those of stage 1 (36.16 (12.59–75.62) vs. 42.53 (3.97–89.73), unpaired t-test *p* < 0.0001), in biopsies of stage 3 (33.03 (3.84–74.01)) compared to those of stage 1 (*p* < 0.0001), in biopsies of stage 3 compared to those of stage 2 (*p* = 0.038), and in biopsies of stage 4 (25.65 (7.76–69.02)) compared to those of stage 2 (*p* = 0.022). When compared to the control group, only stage 1 biopsies were found to express significantly different (higher) *SIGIRR* levels (*p* < 0.0001).

**Table 2.** Differential distribution of *IL-37* and *SIGIRR* mRNA levels in LUAD vs. non-LUAD tissues or among LUAD tumors of different histological type or stage. Number of patients in each group, the median levels of expression and their range (where available), fold-change over non-LUAD/normal samples, *p*-values, and the webtool used for the analysis (TNMplot [32], UALCAN [33]) are reported.


NOS: not otherwise specified, LBC: lung bronchioloalveolar carcinoma, NS: non-significant, NA: not applicable.

Other between-two-groups comparisons for *IL-37*: NOS vs. mixed: NS, NOS vs. clear cell: NA, NOS vs. LBC-nonmucinous: NS, NOS vs. solid pattern predominant: NS, NOS vs. acinar: *p* = 0.0192, NOS vs. mucinous: *p* = 0.000003, NOS vs. mucinous colloid: NA, NOS vs. papillary: NS, NOS vs. mucinous: *p* = 0.000003, NOS vs. micropapillary: NS, NOS vs. signet ring: NA, mixed vs. clear cell: NA, mixed vs. LBC-nonmucinous: NS, mixed vs. solid pattern predominant: NS, mixed vs. acinar: *p* = 0.011, mixed vs. LBC-mucinous: *p* = 0.0006, mixed vs. mucinous colloid: NA, mixed vs. papillary: NS, mixed vs. mucinous: NA, mixed vs. micropapillary: NS, mixed vs. signet ring: NA, clear cell vs. LBC-nonmucinous: NA, clear cell vs. solid pattern predominant: NA, clear cell vs. acinar: NA, clear cell vs. LBC-mucinous: NA, clear cell vs. mucinous colloid: NA, clear cell vs. papillary: NA, clear cell vs. mucinous: NA, clear cell vs. micropapillary: NA, clear cell vs. signet ring: NA, LBC-nonmucinous vs. solid pattern predominant: NS, LBC-nonmucinous vs. acinar: NS, LBC-nonmucinous vs. LBC-mucinous: NS, LBC-nonmucinous vs. mucinous colloid: NA, LBC-nonmucinous vs. papillary: NS, LBC-nonmucinous vs. mucinous:, LBC-nonmucinous vs. micropapillary: NS, LBC-nonmucinous vs. signet ring: NA, solid pattern predominant vs. acinar: NS, solid pattern predominant vs. LBC-mucinous: NS, solid pattern predominant vs. mucinous colloid: NS, solid pattern predominant vs. papillary: NS, solid pattern predominant vs. mucinous: NS, solid pattern predominant vs. micropapillary: NS, solid pattern predominant vs. signet ring: NA, acinar vs. LBC-mucinous: NS, acinar vs. mucinous colloid: NA, acinar vs. papillary: NS, acinar vs. mucinous: NA, acinar vs. micropapillary: NS, acinar vs. signet ring: NA, LBC-mucinous vs. mucinous colloid: NA, LBC-mucinous vs. papillary: NS, LBC-mucinous vs. mucinous: NS, LBC-mucinous vs. micropapillary: NS, LBC-mucinous vs. signet ring: NA.

Other between-two-groups comparisons for *SIGIRR*: NOS vs. mixed: NS, NOS vs. clear cell: NS, NOS vs. LBC-nonmucinous: NS, NOS vs. solid pattern predominant: NS, NOS vs. acinar: *p* = 0.0066, NOS vs. mucinous: NS, NOS vs. mucinous colloid: NS, NOS vs. papillary: NS, NOS vs. mucinous: NS, NOS vs. micropapillary: NS, NOS vs. signet ring: NA, mixed vs. clear cell: NS, mixed vs. LBC-nonmucinous: *p* = 0.0422, mixed vs. solid pattern predominant: NS, mixed vs. acinar: *p* = 0.0307, mixed vs. LBC-mucinous: NS, mixed vs. mucinous colloid: NS, mixed vs. papillary: NS, mixed vs. mucinous: NS, mixed vs. micropapillary: NS, mixed vs. signet ring: NA, clear cell vs. LBC-nonmucinous: NS, clear cell vs. solid pattern predominant: NS, clear cell vs. acinar: NS, clear cell vs. LBC-mucinous: NS, clear cell vs. mucinous colloid: NS, clear cell vs. papillary: NS, clear cell vs. mucinous: *p* = 0.0245, clear cell vs. micropapillary: NS, clear cell vs. signet ring: NA, LBC-nonmucinous vs. solid pattern predominant: NS, LBC-nonmucinous vs. acinar: *p* = 0.0140, LBC-nonmucinous vs. LBC-mucinous: NS, LBC-nonmucinous vs. mucinous colloid: *p* = 0.0414, LBC-nonmucinous vs. papillary: NS, LBC-nonmucinous vs. mucinous, LBC-nonmucinous vs. micropapillary: NS, LBC-nonmucinous vs. signet ring: NA, solid pattern predominant vs. acinar: NS, solid pattern predominant vs. LBC-mucinous: NS, solid pattern predominant vs. mucinous colloid: NS, solid pattern predominant vs. papillary: NS, solid pattern predominant vs. mucinous: NS, solid pattern predominant vs. micropapillary: NS, solid pattern predominant vs. signet ring: NA, acinar vs. LBC-mucinous: NS, acinar vs. mucinous colloid: NA, acinar vs. papillary: NS, acinar vs. mucinous: NA, acinar vs. micropapillary: NS, acinar vs. signet ring: NA, LBC-mucinous vs. mucinous colloid: NS, LBC-mucinous vs. papillary: NS, LBC-mucinous vs. mucinous: NS, LBC-mucinous vs. micropapillary: NS, LBC-mucinous vs. signet ring: NA.

**Figure 2.** (**A**) Violin plots depicting the differential expression levels of *IL-37* and *SIGIRR* in lung tissues from LUAD (n = 524) vs. non-LUAD (n = 486) individuals or paired tumor vs. adjacent normal tissues from LUAD patients (n = 57 pairs). Median expression levels, Mann–Whitney *p*-values, and fold-changes between medians are reported. Data were obtained from www.tnmplot.com [32] (accessed on 3 September 2022). (**B**) Box and Tukey whiskers diagrams showing the differential distribution of *IL-37* and *SIGIRR* expression levels (FPKMs) as analyzed by RNA sequencing in LUAD samples of different histological type or stage. Data were obtained from http://ualcan.path.uab.edu/ (accessed on 20 September 2022). Asterisks designate statistically significant differences compared to normal samples; between groups (where accompanied by brackets) as analyzed by unpaired *t*-test with Welch's correction; or a statistically significant linear trend between group means and left-to-right order (where accompanied by an arrow) as analyzed by one-way ANOVA test; \*: *p* < 0.05, \*\*: *p* < 0.01, \*\*\*: *p* < 0.001, \*\*\*\*: *p* < 0.0001. (**C**) Kaplan–Meier plots depicting the probability of overall survival in months in LUAD patients exhibiting high (red) or low (black) expression levels of *IL-37* or *SIGIRR*. Hazard ratio (HR), logrank *p*-values, and the number of patients with either high or low gene expression who survived for 50, 100, 150, 200, and 250 months are reported. Graphs were exported from www.kmplot.com [34].

#### *3.4. Increased IL-37 Expression Is a Favorable Prognostic Factor for Overall Survival in LUAD Patients*

We further searched for possible correlations between *IL-37* expression levels and survival rates in LUAD patients utilizing the Kaplan–Meier Plotter tool [34]. Analysis of RNA sequencing data from 504 individuals with LUAD tumors revealed significant differences in overall survival (OS) time between patients with tumors expressing high (n = 226) vs. low *IL-37* levels (n = 278) (logrank *p* = 0.021) (Figure 2C). Specifically, high *IL-37* expression was associated with a 29% lower hazard of death over the 250-month follow-up (hazard ratio (HR) = 0.71, 95% CI = 0.53–0.95) compared to the low *IL-37*-expressing group. Further, median survival time for the *IL-37* high expression cohort was 54.07 months, whereas for the low expression cohort, it was 39.9 months. In the case of *SIGIRR*, high expression was associated with a 35% lower hazard of death over the same period compared to low expression of the gene (HR = 0.65 (0.49–0.88), *p* = 0.0041; n = 235 and n = 269 for patients with high and low expression, respectively). The median survival time for the *SIGIRR* high expression cohort was 55.1 months, whereas for the low expression cohort, it was 40.3 months. We also checked the prognostic potential of the mean expression of the two genes, which was found to be weaker than that of each individual gene (HR = 0.74 (0.55–1), *p* = 0.046; n = 167 and n = 337 for patients with high and low mean expression, respectively).

Lastly, we analyzed the prognostic potential of *IL-37* and *SIGIRR* in individual group biopsies of different grade (1–4), stage (1–4), low or high mutation burden, and neoantigen load. The results revealed that both genes exhibit differential patterns and the ability to predict LUAD OS in patients bearing biopsies of distinct histopathological characteristics. Statistics of the analysis can be found in Supplementary Table S1. A low sample size, though, did not allow us to perform a combinatorial analysis of the parameters.

#### *3.5. Correlation of IL-37 Expression with Cancer-Associated Genes in LUAD Tumors*

The UALCAN portal [33] was used to explore possible linear associations of *IL-37* mRNA levels with the expression of other genes in LUAD tumors. As shown in Figure 3A, there are 20 genes that are positively associated with *IL-37* expression, as revealed upon processing RNAseq data from TCGA biopsies and analysis with Pearson's correlation test. In detail, *IL-37* levels (log2TPM + 1) were positively associated with those of: *PRODH* (enzyme proline dehydrogenase), *HINF1A* (hepatocyte nuclear factor 1-alpha), *DPP4* (dipeptidyl-peptidase 4), *DGCR5* (DiGeorge syndrome critical region gene 5), *DUSP6* (dual-specificity phosphatase 6), *NMNAT2* (nicotinamide nucleotide adenylyltransferase 2), *HLF* (hepatic leukemia factor), *ADORA1* (adenosine A1 receptor), *SHF* (Src homology 2 domain containing F), *MFSD4* (major facilitator superfamily domain containing 4A), *CXCL14* (C-X-C motif chemokine ligand 14), *ITGA2* (integrin subunit alpha 2), *DGCR9* (DiGeorge syndrome critical region gene 9), *CEACAM2* (carcinoembryonic antigen-related cell adhesion molecule 2), *PLAT* (plasminogen activator, tissue type), *PPP1R1B* (protein phosphatase 1 regulatory inhibitor subunit 1B), *STK39* (serine/threonine kinase 39), *MUC1* (mucin 1, cell surface-associated), *DPY19L1* (dpy-19 like C-mannosyltransferase 1), and *CDC42EP1* (CDC42 effector protein 1) (*p* for all < 0.0001) (Figure 3B). No gene was found to be negatively associated with *IL-37* in LUAD tumors.

Particular attention was paid to possible associations of IL-37 with certain immune-checkpoint (IC) molecules. Since the UALCAN portal did not provide data on non-significant correlations, we used the correlation analysis tool of TNMplot [32]. Spearman's *r* and *p*-values of linear associations with IL-37 were: *r* = 0.05 and *p* = 0.265 for PD-1 (or PDCD1; programmed cell death protein 1), *r* = 0.13 and *p* = 0.0035 for PD-L1 (CD274), *r* = 0.12 and *p* = 0.0065 for CTLA-4 (cytotoxic T-lymphocyte-associated protein 4), *r* = 0.1 and *p* = 0.023 for VISTA (V-domain Ig suppressor of T cell activation or VSIR V-set immunoregulatory receptor), *r* = −0.08 and *p* = 0.0892 for LAG3 (lymphocyte activating 3), and *r* = −0.19 and *p* = 0.00001 for TIM-3 (T cell membrane protein 3 or HAVCR2: hepatitis A virus cellular receptor 2).

The STRING database [37], which processes data from various depositories, was utilized to pinpoint any potential cellular or molecular networks or biological processes that are regulated by the aforementioned set of positively correlated genes. "Regulation of leukocyte migration" and "regulation of response to external stimulus" were the two biological processes (Gene Ontology) enriched (false-discovery rate (FDR): 0.0032 for both; level of confidence for minimum required interaction score: 0.4). Since the activity of IL-37 lies mainly in the suppression of proinflammatory signaling factors such as mTOR, AKT, and PI3K, all involved in autophagy and key metabolic functions of immune and cancer cells [2], we further processed individual correlations with genes in these processes. In the aforementioned list, we pinpointed *HIF1A* as a key metabolic regulator, further implicated in key processes during cancer development and progression [38,39], and we further explored its possible associations with OS of LUAD patients through the Kaplan–Meier Plotter tool [34]. No association was revealed when *HIF1A* was analyzed alone; however, the ratio of expression levels of *IL-37*-to-*HIF1A* was found to possess a prognostic potential in this cohort (HR = 0.67 (0.5–0.92), *p* = 0.012; low ratio samples n = 130, high ratio samples n = 374).

Linear correlations with SIGIRR's expression levels were also explored. The UALCAN portal [33] revealed 646 genes positively and 34 genes negatively associated with SIGIRR. The total list of the genes and the corresponding Pearson's *r* values can be found in Supplementary Table S2. Gene-set enrichment analysis of SIGIRR-related genes via the STRING database [37] did not reveal any cancer/LUAD-related entry in any of the STRING-connected repositories.

#### *3.6. IL-37 Protein Expression Correlates with the Grade of LUAD Tumor*

Following gene expression analysis, investigation of the protein expression pattern through the Pathology tool of the Human Protein Atlas website [31] initially revealed that both IL-37 and SIGIRR are expressed in lung tumor biopsies (Figure 4A). IL-37 protein levels were similar between 111 LUAD and paired non-LUAD specimens (median z-value (range): 0.11 (−1.02–1.16) vs. −0.043 (−1.02–1.16), *p* = 1.10 × 10<sup>−1</sup>) (Table 3, Figure 4B). However, protein levels were associated with tumor grade: the lowest levels were observed in grade 2 biopsies and the highest in grade 3, whereas intermediate levels in grade 1 biopsies led to non-cancerous tissues. Regarding SIGIRR protein expression, this was modestly upregulated in tumor tissues compared to normal ones (−0.25 (−1.14–0.38) vs. 0 (−1.91–2.26), *p* = 2.82 × 10<sup>−2</sup>) and was associated with tumor grade, since SIGIRR levels exhibited a linear decrease following the grade 1-to-3 order (*r*<sup>2</sup> = 0.1634, *p* = 0.0001).

**Figure 3.** (**A**) Heatmap depicting the relative expression levels (log2TPM + 1) of *IL-37* and twenty IL-37-correlated genes in normal and LUAD biopsies as analyzed by RNAseq. (**B**) Dot plot diagrams showing the correlation between *IL-37* and each of the twenty significantly positively correlated genes. No negative associations were detected. Pearson's *r* values are reported in each case. All *p*-values were <0.0001. Figures were exported from http://ualcan.path.uab.edu/ [33] (accessed on 10 September 2022). (**C**) Heatmap depicting the grade of association between the expression levels of each of the twenty *IL-37*-positively correlated genes and the levels of *IL-37*. Pearson's *r* values are reported.

**Table 3.** Differential distribution of IL-37 and SIGIRR protein levels in LUAD vs. normal tissues or among LUAD tumors of different grades. Number of patients in each group, the median levels of expression, and their range as obtained from UALCAN portal [33], as well as *p*-values of statistical differences are reported.


**Figure 4.** (**A**) Immunohistochemical staining of IL-37 and SIGIRR proteins in lung biopsies of LUAD patients. Pictures were obtained from www.proteinatlas.org [35] (accessed on 20 September 2022). (**B**) Box and Tukey whiskers diagrams depicting the differential protein expression levels of IL-37 and SIGIRR (z-values) in LUAD primary tumors (n = 111) vs. normal tissues (n = 111) and among LUAD tumors of different grade (1 to 3). Figures and graphs were exported from http://ualcan.path.uab.edu/ [33] (accessed on 20 September 2022). \*: *p* < 0.05, \*\*: *p* < 0.01, \*\*\*: *p* < 0.001, \*\*\*\*: *p* < 0.0001.

#### *3.7. T-Lymphocytes and Macrophages of the Lung Express IL-37 and SIGIRR Genes*

According to single-cell RNA sequencing (scRNAseq) data deposited and processed through the Single-Cell Atlas (of the Human Protein Atlas) [31], *IL-37* was found to be expressed by resident T lymphocytes and macrophages of the normal lung tissue (read counts normalized to transcripts per million protein coding genes (pTPM) = 2.1 and 1.2, respectively) (Supplementary Figure S1). Moreover, *SIGIRR* was found to be expressed by immune cell populations, including T cells, granulocytes, and macrophages, but also by other resident cell types including alveolar cell types 1 and 2, fibroblasts, club ciliated cells, and endothelial cells. The highest *SIGIRR* expression was detected in T cells (pTPM = 115.2) and the lowest in macrophages (pTPM = 23.6). It is also noteworthy that, based on the scRNAseq analysis, *IL-37* expression levels in normal human lung cell subsets are lower than those of *SIGIRR*.
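To illustrate what the pTPM unit expresses, the following Python sketch rescales read counts so that the protein-coding genes of one cell cluster sum to one million. This is an assumption about the general idea only; the exact Human Protein Atlas pipeline may differ, and the function name `to_ptpm`, the gene set, and the counts are hypothetical.

```python
def to_ptpm(counts, protein_coding):
    """Rescale counts so that protein-coding genes sum to one million.

    counts: dict mapping gene symbol -> aggregated read count in one cluster.
    protein_coding: set of gene symbols treated as protein coding.
    Genes outside the protein-coding set are dropped before rescaling.
    """
    coding = {gene: c for gene, c in counts.items() if gene in protein_coding}
    total = sum(coding.values())
    if total <= 0:
        raise ValueError("no protein-coding counts to normalize")
    return {gene: c * 1_000_000 / total for gene, c in coding.items()}


# Hypothetical counts for one cluster (illustration only, not study data).
cluster_counts = {"IL37": 2, "SIGIRR": 100, "ACTB": 898, "MALAT1": 5000}
coding_genes = {"IL37", "SIGIRR", "ACTB"}  # MALAT1 is a non-coding RNA
ptpm = to_ptpm(cluster_counts, coding_genes)
print(ptpm["IL37"], ptpm["SIGIRR"])
```

Because every cluster is rescaled to the same total, values such as 2.1 and 115.2 pTPM can be compared across cell types even when the clusters differ in sequencing depth.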

#### *3.8. IL37 Gene Alterations Correlate with Differential Immune Cell Infiltration of the Lung Tumor*

To investigate the effects of *IL-37* gene mutations on immune cell infiltration in lung adenocarcinoma tumors, the "Mutation" module of the TIMER2.0 webserver was applied [36]. Our analysis indicates that tumors bearing non-synonymous, somatic mutations in the *IL-37* gene were characterized by significantly higher infiltration of CD4+ T lymphocytes (CIBERSORT project; log2 fold-change = 2.105, Wilcoxon *p* = 0.004) and significantly lower infiltration of M2 macrophages (XCELL project; log2 fold-change = −2.709, Wilcoxon *p* = 0.009) and neutrophils (MCPCOUNTER project; log2 fold-change = −0.665, *p* = 0.031). The contribution of myeloid dendritic cells (mDCs) within tumor-associated infiltration was of similar levels in patients with and without *IL-37* non-synonymous mutations (log2 fold-change = 1.171, *p* = 0.047) (Figure 5A).
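The comparison above rests on two quantities per cell type: a log2 fold-change between mutated and non-mutated tumors and a Wilcoxon rank-sum *p*-value. A minimal, dependency-free Python sketch of both is given below. It assumes that the fold-change is the ratio of group means and uses the tie-corrected normal approximation without continuity correction; TIMER2.0 may define or compute these values differently, and all function names and numbers are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(group_a, group_b):
    """Two-sided Wilcoxon rank-sum test; returns (U for group_a, p-value)."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b, n = len(group_a), len(group_b), len(pooled)
    ranks = average_ranks(pooled)
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    tie_counts = {}
    for value in pooled:
        tie_counts[value] = tie_counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    variance = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        raise ValueError("all observations are identical")
    z = (u_a - n_a * n_b / 2) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_a, p_value


def log2_fold_change(group_a, group_b):
    """log2 of mean(group_a) / mean(group_b); both means must be positive."""
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    return math.log2(mean_a / mean_b)


# Hypothetical infiltration estimates (illustration only, not study data).
mutated = [0.30, 0.25, 0.40, 0.35, 0.28]
wild_type = [0.10, 0.08, 0.12, 0.15, 0.09, 0.11, 0.20, 0.07]
u_stat, p = rank_sum_test(mutated, wild_type)
print(f"log2FC = {log2_fold_change(mutated, wild_type):.3f}, p = {p:.4f}")
```

With only a handful of tumors in the mutated group, the normal approximation is rough, so an exact test would usually be preferred for samples of that size.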


**Figure 5.** (**A**) Violin plots depicting the distribution of infiltrated T cells, M2 macrophages, myeloid dendritic cells, and neutrophils in LUAD tumors without versus with mutation on *IL-37*. Wilcoxon *p*-values and log2 (fold-changes, FC) are reported. (**B**) Scatter plot diagrams depicting the linear association between levels of *IL-37* gene expression (log2TPM; y-axis) and infiltration of the tumor by certain cell subsets (x-axis). Spearman's Rho and *p*-values are reported. Data were filtered for tumor purity. Graphs were exported from http://timer.cistrome.org/ [36] (accessed on 1 October 2022).

#### *3.9. IL37 Expression Levels Correlate with Infiltration Levels of Certain Immune Cell Subsets*

Exploration through the "Gene" module of the TIMER2.0 webserver [36] revealed that *IL37* gene expression levels, as assessed in previous RNAseq experiments, were linearly correlated with certain immune cell populations infiltrating the LUAD tumor. More specifically, *IL37* expression levels (log2TPM) were positively associated with the infiltration rate of mDCs (XCELL project; Spearman's rho = 0.42, *p* = 1.84 × 10<sup>−22</sup>), granulocyte-monocyte progenitors (GMPs) (XCELL project; Spearman's rho = 0.323, *p* = 2.07 × 10<sup>−13</sup>), activated mast cells (CIBERSORT-ABS project; Spearman's rho = 0.303, *p* = 5.89 × 10<sup>−12</sup>), Tregs (QUANTISEQ project; Spearman's rho = 0.241, *p* = 5.70 × 10<sup>−8</sup>), and M2 macrophages (QUANTISEQ project; Spearman's rho = 0.279, *p* = 2.80 × 10<sup>−10</sup>) and negatively associated with the infiltration rate of MDSCs (TIDE project; Spearman's rho = −0.292, *p* = 3.54 × 10<sup>−11</sup>) (Figure 5B); all correlations are summarized in Table 4.
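Spearman's rho, as reported above, is Pearson's correlation computed on ranks. The following Python sketch shows the calculation with average ranks for ties. It is a plain, unadjusted version: it does not include the tumor purity adjustment applied by TIMER2.0, and the function names and input values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r applied to the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression (log2TPM) and infiltration scores (illustration only).
expression = [1.0, 2.0, 3.0, 4.0, 5.0]
infiltration = [5.0, 6.0, 7.0, 8.0, 7.0]
print(f"rho = {spearman_rho(expression, infiltration):.4f}")
```

Because the statistic depends only on ranks, it captures any monotonic relationship and is insensitive to the scale on which each deconvolution method reports infiltration.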

**Table 4.** Associations between the levels of expression of *IL-37* and those of infiltration by certain immune cell types in LUAD tumors, as attested in 515 specimens. Spearman's *r* values are reported. For significant associations, *p*-values are also mentioned. Tumor purity filter was applied. Data were obtained from www.timer.cistrome.org, (accessed on 24 August 2022) [36].




\* non-regulatory, NA: not available, NS: non-significant, Tregs: regulatory T cells, MDSCs: myeloid-derived suppressor cells.

#### *3.10. IL-37 Signaling Shares Common Nodes with PD-1/PD-L1 and CTLA-4 Immune Checkpoint Pathways*

Finally, analysis using the STRING database [37], which processes data from various repositories, revealed that protein molecules involved in the IL-37 signaling pathway are also members of cancer-related pathways (Figure 6). Specifically, STAT3 is involved in "PD-L1 expression and PD-1 checkpoint pathway in cancer" (KEGG database) and "Cancer immunotherapy by PD-1 blockade" (WikiPathways); PTPN6 is involved in "PD-L1 expression and PD-1 checkpoint pathway in cancer" (KEGG database), "PD-1 signaling" (Reactome), and "Cancer immunotherapy by CTLA4 blockade" (WikiPathways); and PTPN11 is involved in all of the above. The false discovery rates for the enriched pathways are: 0.0015 for PD-L1 expression and PD-1 checkpoint pathway in cancer, 0.0029 for cancer immunotherapy by CTLA4 blockade, 0.0053 for cancer immunotherapy by PD-1 blockade, and 0.0081 for PD-1 signaling.
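The false discovery rates quoted above are *p*-values adjusted for testing many pathways at once. The text does not state which correction was applied; as a sketch under the assumption of a Benjamini–Hochberg-style procedure, the adjustment can be written in Python as follows. The function name and the raw *p*-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw enrichment p-values (illustration only, not study data).
raw_p = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw_p)
print([round(q, 4) for q in fdr])
```

A pathway is then usually called enriched when its adjusted value falls below a chosen threshold such as 0.05, which all four pathways listed above would satisfy.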

**Figure 6.** Network of interactions (exported by STRING portal [37]) (**A**) among proteins involved in IL-37 signaling and other proteins implicated in PD-1/PD-L1 and CTLA-4 immune-checkpoint pathways (**B**), as attested using the STRING (Search Tool for the Retrieval of Interacting Genes/Proteins). The above pathways share common nodes (each color corresponds to different pathway; nodes marked with more than one color belong to equal number of pathways). Edges represent protein–protein associations: known interactions, predicted interactions, or other associations. Level of confidence for minimum required interaction score was 0.4. Data were obtained from https://string-db.org [37] (accessed on 3 October 2022).

#### **4. Discussion**

Lung cancer is a leading cause of cancer death in both men and women worldwide [40–42]; it is associated with almost three times the rate of deaths compared to prostate cancer in men and breast cancer in women. In 2022, an estimated 236,740 new cases of lung cancer are expected to be diagnosed, and approximately 130,180 deaths are predicted to be recorded in the United States of America (USA) [43]. However, during the last 15 years, a steady decline in the incidence of new lung cancer diagnoses (2.8% in men and 1.4% in women) and deaths (50% and 67%, respectively) has been observed. This is probably attributable to recent advances in the management of patients, including the use of novel chemotherapeutic (cisplatin/pemetrexed or gemcitabine), molecular targeted (gefitinib), and/or immunotherapeutic (pembrolizumab, nivolumab, atezolizumab) agents, together with more personalized therapeutic approaches based on specific mutation patterns, such as those on *EGFR*, *ALK*, and *ROS-1* genes [44].

Lung adenocarcinoma (LUAD), which falls under the umbrella of non-small-cell lung cancer (NSCLC), represents about 40% of all lung cancer types [45], and even though there has been a significant decrease in its incidence and mortality rates, it remains the main cause of cancer death in the USA [45]. LUAD tumors, which evolve primarily from mucosal glands, usually develop in the lung periphery but can also be found in scars or areas of chronic inflammation. The immune microenvironment exerts functions associated with the development and progression of the disease, as well as the response to therapy [40]. Alternatively activated MΦ (M2 type) and T cells, specifically resting memory CD4<sup>+</sup>, are the predominant populations surrounding LUAD tumors [46]. Further, high immune cell infiltration rates correlate with a better prognosis compared to lower ones [47,48] and are characterized by an increased incidence of naïve B cells, plasma cells, follicular helper T cells, and classical (M1 type) macrophages, as well as by a decreased prevalence of resting memory CD4<sup>+</sup> T cells, monocytes, and resting dendritic cells (DCs) [48].

Recently, Zuo et al. developed the immune-cell characteristic score (ICCS) model, which is suggested to facilitate LUAD prognosis [49]. This model assesses the infiltration rate of six immune cell populations in LUAD biopsies (B cells, immature DCs, eosinophils, mast cells, granzyme K-expressing CD8<sup>+</sup> T cells, and Th2 cells) and independently predicts the overall survival (OS) of the patient. High infiltration of all the ICCS immune cell populations is associated with better prognosis, apart from that of Th2 cells, which indicates a poor outcome of the disease [49]. What is more, processing RNA-sequencing and clinical data from TCGA [50] and GEO databases [51] with specific bioinformatics tools and algorithms has led to the construction of certain gene signatures with independent prognostic/predictive values for the survival of LUAD patients [47,48,52] or their response to immune-checkpoint inhibitors (ICIs) [53,54].

Focusing on the pivotal role of the tumor-immune microenvironment in LUAD development and progression, we aimed to explore the expression patterns of *IL-37*, a novel cytokine with regulatory properties in LUAD tumors, using various bioinformatics tools and available databases. Based on our data, human LUAD tumors exhibit the highest gene-expression levels of the cytokine amongst various common human cancers, suggesting its possible pathophysiological involvement in this malignancy. Importantly, these are significantly increased in LUAD vs. non-LUAD tissues, suggesting a disease-specific involvement of IL-37 in the pathological lesion. Within LUAD, LBC-mucinous and mucinous-colloid histological subtypes are characterized by the highest *IL-37* levels; however, the small number of specimens assessed for each subtype does not allow firm conclusions to be drawn regarding the histology-specific distribution of the cytokine.

It is of note that *IL-37* expression exhibits a lung cancer stage-specific pattern: there is a trend of steady decline in its levels from earlier to later stages. This observation provides initial evidence for a possible prognostic value for *IL-37* expression patterns, but it also implies a potential pathogenetic role of this regulatory cytokine in the LUAD inflammatory lesion. However, there are clearly several unresolved questions regarding the molecular mechanisms underlying this differential expression profile, such as whether aberrant *IL-37* regulation is a cause or consequence of lung cancer development and progression. Therefore, it is of utmost importance to pinpoint the exact cellular source(s) of IL-37 that are responsible for the variability in its distribution (LUAD cells and/or immune cell subsets) before making any assumptions about the link between molecular/cellular pathogenetic mechanisms and the mRNA signature of *IL-37* in LUAD malignancies. Even though most of the current knowledge points towards a possible protective role of this cytokine in human malignancies, the enhanced expression in early-stage lung tumors, independently from the exact cellular source(s), may indicate a cellular response to promote anti-tumor immunity. Consequently, the observed gradual decrease in *IL-37* expression during tumor progression and advanced lung cancer stages suggests that this anti-tumor effect could probably be attenuated. The differential expression of *IL-37* between LUAD and non-LUAD biopsies does not apply to its receptor, *SIGIRR*. The receptor's gene expression levels are similar between the groups; nevertheless, within LUAD, the mixed, LBC-mucinous, and mucinous-colloid subtypes exhibit a tendency to express higher *SIGIRR* mRNA levels, possibly associated with the aforementioned enhanced *IL-37* levels in the same subtypes. Of interest, though, is the sharp downward trend observed from stage 1 to stage 4, which also matches the corresponding *IL-37* signature.

Our results also reveal the prognostic potential of tumor-expressed *IL-37* and *SIGIRR* patterns. Patients whose biopsies exhibit high *IL-37* or *SIGIRR* mRNA levels have a better prognosis for OS compared to those with low levels. This, together with the stage-specific expression patterns described above, suggests that lower *IL-37* and *SIGIRR* gene expression could be associated with more advanced LUAD cases, of high grade/stage and/or metastatic status linked to poorer prognosis. Yet, *SIGIRR* seems to be an even stronger OS predictor, which is in accordance with the more significant, compared to *IL-37*, correlations with stage and metastasis status described herein.

*IL-37* expression levels were found to be positively correlated with a 20-gene signature in LUAD biopsies. Most of these genes have an already established association with lung cancer and specifically with LUAD. Increased levels of *CXCL14* mRNA have been detected in LUAD biopsies with a micropapillary pattern [55], and smoking-induced *CXCL14* expression in the human airway epithelium has been implicated in chronic obstructive pulmonary disease (COPD)-mediated lung cancer development [56]. Similarly, *NMNAT2* expression was found to be increased in LUAD specimens and correlated negatively with OS of patients, whereas the DGUOK-NMNAT2-NAD+ axis was suggested as a potential therapeutic target for the disease [57]. Increased expression levels of *HLF*, shown to promote cell-cycle progression in various cancers [58], *ITGA2*, *MUC1*, and *DPY19L1* have also been proposed to confer prognostic value for the survival of patients suffering from LUAD malignancies [59–64]. Additionally, *DUSP6*/*MKP3* has been designated as a tumor suppressor phosphatase implicated in LUAD and other cancer types, and several studies have revealed the clinical relevance of its expression patterns in lung cancer. In addition, various *DUSP6*/*MKP3*-associated SNPs have been linked with the response to chemoradiation therapy [65–69]. Similarly, *PPP1R1B* has been shown to interfere with the response to molecular targeted therapy in *EGFR*-mutated LUAD [70], and *SHP2* has been associated with *MET*-mutated NSCLC [71].

Moreover, additional *IL-37*-correlated genes have been described to be implicated in NSCLC pathogenesis and progression. *PLAT* has been reported to inhibit apoptosis in NSCLC cells, and its knockdown augments the therapeutic efficacy of gefitinib [72]. *PRODH* has been involved in NSCLC metastasis, as shown both in vitro and in vivo [73,74], and *ADORA1*, which is highly expressed in *EGFR*-mutant NSCLC biopsies [75], has been involved in tumor-immune evasion in NSCLC xenograft models [76]. *STK39* and the lncRNA *DGCR5* have been proposed as critical molecules for the regulation of the growth, migration, and invasion of NSCLC tumors [74,77–79]. *DGCR5* has been specifically implicated in the tumor progression of LUAD through the inhibition of hsa-mir-22-3p [80]. Regarding the remaining genes that exhibit an *IL-37*-correlated expression pattern, there is evidence for their association with the development of human cancers and for their potential to serve as disease biomarkers. This is the case with *HIF1A*, the most pivotal gene regulating metabolic pathways related to hypoxia, which is further implicated in proliferation, energy metabolism, invasion, and metastasis in a series of human cancers, and has been viewed as a highly promising therapeutic target [38,39]. Indeed, there is evidence that expression levels of *HIF1A* by tumor cells have a diagnostic and prognostic significance among different histological types of lung cancer [81,82]. Interestingly, in our study, the ratio of *IL-37*-to-*HIF1A* expression levels was found to have a favorable prognostic potential in LUAD patients. Further, *MFSD4* is considered to be a tumor-suppressor gene and a biomarker for hepatic metastasis in gastric cancer patients [83], as well as a diagnostic marker of esophageal carcinoma [83]. In addition, the lncRNA *DGCR9* has been reported as a potential tumor neoantigen [84], with a possible pathogenetic role in gastric cancer [85]. As for *CDC42EP1*, it was very recently described that certain gene mutations drive the development of parathyroid and oral tongue squamous cell carcinomas [86,87]. Lastly, *DPP-4* has been reported to possess a deleterious role and potential to be used as a biomarker in respiratory diseases, such as lung cancer, asthma, and COPD [88–90].

Regarding protein expression, IL-37 levels were found to be similar between LUAD and non-LUAD specimens. As with mRNA expression, its levels were associated with the tumor grade. It is essential to note that the web portal used made it possible to analyze gene expression levels with regard to the stage of the tumor, and protein levels with regard to its grade. Given the differences among these terms and scales, one can speculate that the significant downward trend in *SIGIRR* gene expression levels is pathophysiologically connected to the trend in protein levels, both associated with more severe disease. In the case of *IL-37*, things seem to be more complicated: biopsies of intermediate severity (Grade 2) show the highest gene expression but the lowest protein levels compared to the rest of the subgroups in each classification. These seemingly opposite trends need further exploration.

By taking advantage of the TIMER2.0 webserver, we had the opportunity to search for possible correlations between *IL-37* gene alterations and infiltration rates of various immune cell types in LUAD. Interestingly, it was revealed that tumors with non-synonymous, somatic mutations of *IL-37* are characterized by a higher infiltration of CD4<sup>+</sup> T cells and a lower infiltration of M2 MΦ and neutrophils. Although the differences are statistically significant, the low number of samples in the mutated arm (n = 5) means that the conclusions cannot be confidently evaluated. What is more, even though these mutations are expected to alter protein function, assumptions about the pathophysiological links of the above relationships, if not falsely positive, could be made only upon determining their cell-specific distribution, as well as the activation status of the corresponding immunocytes. It is essential to confirm whether *IL-37* aberrations indeed affect the infiltrating incidence and/or function, especially of M2 MΦ and neutrophils, which dominate the myeloid landscape of LUAD tumors and are of vital importance for their growth and metastatic potential [91].

Supplementing these observations, the *IL-37* gene expression pattern was found to be significantly associated with the infiltration rate of certain immune cell types. Myeloid dendritic cells (mDCs), which showed the most significant correlation, are known to support protective anti-tumor immunity in lung cancer, while being subjected to suppression mediated by cancer cells via different mechanisms [92,93]. In agreement, there was also a positive association with the percentage (%) of GMPs, which produce DCs and macrophages and have also been shown to be involved in LUAD immune cell infiltration [94]. Moreover, *IL-37* mRNA levels positively correlated with the percentage (%) of activated mast cells, which have been assigned as predictors of improved OS and PFS in NSCLC [49,95]. In contrast, *IL-37* expression was negatively correlated with the percentage (%) of infiltrating MDSCs, which are pivotal immunosuppressive partners and key targets for immunotherapy in lung cancer [96]. The above observations support the notion that IL-37 may exert tumor-protective immune functions. However, it should also be noted that our analyses revealed some, at first sight, conflicting evidence: *IL-37* expression levels correlated positively with the rate of infiltration by regulatory T cells, as well as M2 MΦ, which have been reported to exert pro-tumoral and anti-tumor immunity actions in LUAD [97,98]. The molecular and cellular networks responsible for these phenotypes need further investigation.

Finally, pathway analysis revealed that IL-37-signaling mediators, such as STAT3, are crucial partners of PD-1, PD-L1, and CTLA-4 pathways, providing hints for an interfering role of this cytokine in the immune-checkpoint blockade of anti-tumor immune responses.

Taken together, our data highlight the prognostic and diagnostic potential of *IL-37* mRNA levels in LUAD and provide evidence for its involvement in molecular networks and cellular distributions reported to play pivotal roles in LUAD tumorigenesis and progression. What is more, the interplay with the SIGIRR receptor, as well as its possible disturbance and/or independence, as possibly reflected by their differential expression pattern, points toward a crucial role in the differential disease phenotype, which could be further exploited as a potential immune-checkpoint therapeutic target. Our results are in agreement with previous studies supporting the implication of this cytokine and its receptor in the anti-tumor cytotoxic [24,99,100] and anti-angiogenic responses [101], as well as anti-invasion/metastatic processes in NSCLC [102,103]. Specifically, for LUAD, a study on patients' samples showed that the loss of or reduced IL-37 expression in the tumor correlates with metastasis development [23]. The protective effect of IL-37 has also been shown in other lung diseases such as idiopathic pulmonary fibrosis (IPF) [104].

However, the current study is limited in certain parameters, such as the fact that it is based on clinical samples analyzed through omics approaches. Therefore, it is essential that the resulting data are further validated in large patient cohorts of interest using specifically designed, targeted assays (including RT-qPCR). A detailed exploration of possible associations with clinical, histopathological, laboratory, and therapeutic parameters needs to be performed to strengthen the case for *IL-37* as a LUAD biomarker. For a better understanding of the pathogenetic implications of the cytokine, serial biopsies of LUAD specimens could be used to monitor its differential expression profile throughout the course of the disease and/or in association with certain treatment strategies. Further, peripheral blood vs. tumor samples should be comparably processed to explore the inflamed-tissue or peripheral LUAD-specific distribution of *IL-37* as a demand of local or systemic immunoregulation, as previously described in other immune-mediated inflammatory disorders [105,106]. Moreover, it is important to further deepen our study and investigate the likely differential distribution of each of the five IL-37 isoforms [2], both in mRNA and protein levels. The analysis of the total mRNA and protein isoforms in -omics assays could hide specific patterns of certain variants, which also need to be further checked as to whether they exert similar or different functions in the LUAD microenvironment. The different isoforms bear different exons that have been implicated in extracellular and intranuclear activities of the cytokine, thus mediating different signaling regulations [2]. Additionally, and in combination with the above, it is of utmost importance to pinpoint the exact cellular source(s) of IL-37 and its receptor, to fully understand the effect of their aberrations in the intercellular responses within the LUAD microenvironment. Single-cell analyses supported their expression by lung tissue-resident T cells and MΦ, but a relevant study should also be applied in LUAD tumor biopsies to explore the possible inducible expression of IL-37 isoforms and SIGIRR by malignant epithelial and/or tumor-infiltrating immune cell subsets. Towards that direction, our preliminary data indicate that both IL-37 mRNA and protein are expressed by A549 human lung adenocarcinoma cells, as attested by a specifically developed RT-qPCR assay and flow cytometry (Supplementary Figure S2). SIGIRR expression was not detected in those cells by the aforementioned approaches.

Despite its limitations, the current study clearly supports the crucial involvement of IL-37 in LUAD pathogenesis and monitoring. Moreover, it highlights the plausible necessity for further investigation through mechanistic studies at the molecular and cellular level and validation in experimental models as well as in well-defined patient cohorts, in order to fully elucidate the exact role of this cytokine and to further exploit its potential for the improvement of LUAD patients' personalized management.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/article/10.3390/biomedicines10123037/s1, Figure S1: UMAP plot (top) and bar chart (bottom) showing *IL-37* and *SIGIRR* expression levels (read counts normalized to transcripts per million protein coding genes, pTPM) in certain clusters of the lung. The numbers of cells in each cluster analyzed in single-cell RNA sequencing are included in parentheses. Figures were exported from www.proteinatlas.org [35] (accessed on 20 August 2022). Figure S2: A. Indicative amplification and melting curve of RT-qPCR *IL-37* product in A549 human lung adenocarcinoma cells, B. Dot plot diagrams and related histograms depicting the expression of IL-37 protein by A549 cells as attested by flow cytometry. Table S1: OS prognostic potential of IL-37 and SIGIRR expression levels in LUAD patients bearing tumors of distinct histopathological characteristics. Table S2: Genes whose expression levels correlate with those of *SIGIRR* in LUAD tumors, as attested by the UALCAN portal (http://ualcan.path.uab.edu/) [33] (accessed on 20 September 2022). Pearson's *r* values for positive or negative associations are reported in each case.

**Author Contributions:** Study conception and supervision: M.-I.C.; Study design: M.-I.C. and P.P.; Acquisition of data: T.-C.K., P.C., P.B., M.A. and M.-I.C.; Statistical analysis: M.-I.C.; Drafting of the manuscript: T.-C.K., P.C. and P.B.; Writing of the manuscript: M.-I.C.; Critical revision of the manuscript: P.P., Y.J., D.X. and M.-I.C.; Response to reviewers: P.C. and M.-I.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no funding.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


*Review*

## **Omics and Multi-Omics Analysis for the Early Identification and Improved Outcome of Patients with Psoriatic Arthritis**

**Robert Gurke 1,2,3,\*, Annika Bendes 4, John Bowes 5,6, Michaela Koehm 1,2,7, Richard M. Twyman 8, Anne Barton 5,6, Dirk Elewaut 9, Carl Goodyear 10, Lisa Hahnefeld 1,2,3, Rainer Hillenbrand 11, Ewan Hunter 12, Mark Ibberson 13, Vassilios Ioannidis 13, Sabine Kugler 2,14, Rik J. Lories 15, Eduard Resch 1,2, Stefan Rüping 2,14, Klaus Scholich 1,2,3, Jochen M. Schwenk 4, James C. Waddington 16, Phil Whitfield 17, Gerd Geisslinger 1,2,3, Oliver FitzGerald 18, Frank Behrens 1,2,7, Stephen R. Pennington 16,18,\* and on behalf of the HIPPOCRATES Consortium**

	- <sup>11</sup> Novartis Pharma AG, CH-4056 Basel, Switzerland
	- <sup>12</sup> Oxford BioDynamics Limited, Oxford OX4 2JZ, UK
	- <sup>13</sup> Vital-IT Group, SIB Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
	- <sup>14</sup> Fraunhofer IAIS, Institute for Intelligent Analysis and Information Systems, Schloss Birlinghoven 1, 53757 Sankt Augustin, Germany
	- <sup>15</sup> Department of Development and Regeneration, KU Leuven, Skeletal Biology and Engineering Research Centre, P.O. Box 813 O&N, Herestraat 49, 3000 Leuven, Belgium
	- <sup>16</sup> Atturos Ltd., c/o UCD Conway Institute, University College Dublin, D04 V1W8 Dublin, Ireland
	- <sup>17</sup> Glasgow Polyomics, College of Medical, Veterinary and Life Sciences, Garscube Campus, University of Glasgow, Glasgow G61 1QH, UK

**Citation:** Gurke, R.; Bendes, A.; Bowes, J.; Koehm, M.; Twyman, R.M.; Barton, A.; Elewaut, D.; Goodyear, C.; Hahnefeld, L.; Hillenbrand, R.; et al. Omics and Multi-Omics Analysis for the Early Identification and Improved Outcome of Patients with Psoriatic Arthritis. *Biomedicines* **2022**, *10*, 2387. https://doi.org/10.3390/biomedicines10102387

Academic Editor: Marianna Christodoulou

Received: 11 August 2022 Accepted: 17 September 2022 Published: 24 September 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

**Abstract:** The definitive diagnosis and early treatment of many immune-mediated inflammatory diseases (IMIDs) is hindered by variable and overlapping clinical manifestations. Psoriatic arthritis (PsA), which develops in ~30% of people with psoriasis, is a key example. This mixed-pattern IMID is apparent in entheseal and synovial musculoskeletal structures, but a definitive diagnosis often can only be made by clinical experts or when an extensive progressive disease state is apparent. As with other IMIDs, the detection of multimodal molecular biomarkers offers some hope for the early diagnosis of PsA and the initiation of effective management and treatment strategies. However, specific biomarkers are not yet available for PsA. The assessment of new markers by genomic and epigenomic profiling, or the analysis of blood and synovial fluid/tissue samples using proteomics, metabolomics and lipidomics, provides hope that complex molecular biomarker profiles could be developed to diagnose PsA. Importantly, the integration of these markers with high-throughput histology, imaging and standardized clinical assessment data provides an important opportunity to develop molecular profiles that could improve the diagnosis of PsA, predict its occurrence in cohorts of individuals with psoriasis, differentiate PsA from other IMIDs, and improve therapeutic responses. In this review, we consider the technologies that are currently deployed in the EU IMI2 project HIPPOCRATES to define biomarker profiles specific for PsA and discuss the advantages of combining multi-omics data to improve the outcome of PsA patients.

**Keywords:** psoriatic diseases; psoriatic arthritis; psoriasis; multi-omics; data integration

#### **1. Introduction**

#### *1.1. Psoriasis and Psoriatic Arthritis*

Psoriasis is a chronic, immune-mediated inflammatory disease (IMID) of the skin, which affects 0.91–8.5% of the population, varying by age, region and ethnicity [1]. The most common manifestation is plaque psoriasis (psoriasis vulgaris), which accounts for ~80% of cases and typically involves the formation of erythematous and scaly plaques on the head, ears, elbows and knees, as well as gluteal and umbilical areas. These skin changes are often highly conspicuous, and the resulting stigmatization can lead to psychosocial issues. There is also a high rate of comorbidities, including cardiovascular disease and obesity [2]. Approximately 30% of psoriasis patients go on to develop psoriatic arthritis (PsA) [3], a mixed-pattern IMID characterized by the inflammation of mainly entheseal and synovial musculoskeletal structures [4]. Predisposition to the development of PsA has a strong genetic basis [3] and correlates with the severity of psoriatic skin lesions, including nail involvement (pitting, cracking, separation or nail loss). However, in a minority of cases, the symptoms of PsA develop alongside psoriasis or even before it. Various environmental and lifestyle factors also increase the risk of PsA at the population level, including a high body mass index and smoking [5–7], although paradoxically, smoking is negatively associated with progression to PsA at the level of individual psoriasis patients [8]. There is also increasing evidence that dietary factors influence the risk of progressing to PsA [9,10]. PsA can lead to structural damage and loss of function of the joints due to bone erosion, new bone formation and cartilage loss [11]. It has diverse presentations including asymmetric oligo-articular forms of arthritis, polyarticular disease, dactylitis and spinal inflammation [12].

#### *1.2. Current Diagnostic Practices and Disease Management Strategies*

A diagnosis of psoriasis is usually based on the appearance of the skin [13]. Blood tests or other diagnostic procedures are generally unnecessary [14]. If clinical diagnosis is uncertain, psoriasis can be differentiated from visually similar conditions (such as certain forms of eczema) by skin biopsy, which will confirm epidermal thickening interdigitating with the dermis, changes to the stratum granulosum, the presence of nuclei in the superficial layer, and the presence of infiltrating T cells [15]. In contrast, there is no definitive diagnosis for PsA because the clinical manifestations overlap with other arthritic diseases, including rheumatoid arthritis (RA), osteoarthritis and inflammatory bowel disease (IBD)-associated arthritis. Current diagnostic practice is based on rheumatologic assessment involving physical examination, medical history, blood tests and imaging. More definitive diagnosis is generally dependent on the presence of inflammation and musculoskeletal damage, which makes early intervention much more challenging. The identification of early and specific biomarkers of PsA would facilitate immediate treatment with the most appropriate drugs, therefore offering a much better prognosis for PsA patients and even preventing disease progression in its early stages [16]. In addition to early diagnosis, which allows treatment to improve patient outcomes, the management of chronic disease plays an important role, with a focus on individualized and personalized treatment strategies. Even following the initiation of appropriate immunosuppressive therapy, up to 40% of patients may not respond or may experience adverse effects [17]. There is an urgent medical need for biomarkers that facilitate the early differentiation of PsA and allow the prediction and monitoring of therapeutic responses during the chronic disease stage, thus helping to normalize function and improve outcomes and quality of life.

#### *1.3. The Promise of Omics and Multi-Omics Technology*

Biomarkers that are distinct for specific groups of patients can be used for the early diagnosis of diseases because they often correspond to qualitative or even quantitative indicators of biological and pathological processes [18,19]. The genomics revolution in the 2000s identified a large panel of new genetic markers that are associated with particular disease phenotypes, but the potential of biomarkers expanded enormously as omics technology broadened to encompass the global analysis of DNA modifications (epigenomics), RNA (transcriptomics), proteins (proteomics) and metabolites (metabolomics). Furthermore, it is reasonable to differentiate between the analysis of polar metabolites and the analysis of lipids (lipidomics) because the physicochemical properties of these compounds are quite distinct and optimized methods for analyzing these groups are necessary. The advent of proteomics and metabolomics/lipidomics in particular has raised the possibility of using combinations of markers to differentiate between diseases or disease stages in a quantitative manner, which is not possible with genetic markers outside the field of oncology. As the corresponding technologies have become increasingly sophisticated, sensitive and automated, the cost of analysis has fallen and more ambitious studies are possible, including the correlation of multiple omics biomarker profiles across large groups of patients. This requires stringent quality control standards to be applied during sample collection, storage, preparation and analysis, including due attention to sample sizes and replicates, as well as appropriate randomization (Figure 1).

**Figure 1.** Experimental design considerations for the utilization of multi-omics data.

The EU-funded HIPPOCRATES project (https://hippocrates-imi.eu, accessed on 20 September 2022) is an ambitious collaboration that considers the potential of multiple molecular marker types across the spectrum of omics technology and seeks to combine them with conventional clinical diagnostic methods (imaging, medical records and physical examinations) for PsA. The value of omics technologies in the clinical care of PsA patients has been explored in a recent review article, including transcriptomics (which is not part of the HIPPOCRATES project) [20]. HIPPOCRATES aims to extend the concept by combining marker profiles for the differential diagnosis of psoriasis and PsA, as well as prognosis and the monitoring of treatment responses. In this review, we focus on the main objectives of the HIPPOCRATES project by considering the advantages and disadvantages of different omics technologies for the discovery of biomarkers for psoriasis and PsA, the potential of multi-omics approaches that combine different technologies to take advantage of synergies and how the diverse data formats may be combined and interrogated using advanced data evaluation tools (e.g., tools based on artificial intelligence) to identify patterns with diagnostic or prognostic value.

#### **2. Genomics**

#### *2.1. Brief Overview of Relevant Genomics Technologies*

Genomics is the branch of biology that deals with the analysis of genomes. In the context of psoriasis and PsA, genomics can be used to identify and characterize the genes, and more importantly the gene variants (alleles), that are associated with each disease. Many of the genes identified as associated with psoriasis have also been found to be associated with PsA when compared to population controls, highlighting their shared genetic basis. Susceptibility loci associated with PsA alone have also been identified, including several *HLA-B* alleles and *IL23R* [21,22]. The detection of pathological gene variants can be used to assist diagnosis and also to predict the age of onset, severity and likely symptoms of the disease. However, the multiple genes that distinguish between psoriasis and PsA may also be shared with other arthritic diseases, such as RA or ankylosing spondylitis.

The fundamental technology underlying the field of genomics is the genome-wide genotyping array, the contents of which are routinely enhanced by imputation, which provides the structure and sequence of key disease-associated genes and allows causative allelic variants to be identified. Genome-wide association studies (GWAS) and gene chip experiments have identified more than 20 additional loci outside the HLA system that are associated with PsA [23,24], some of which are exclusive to PsA (i.e., not also associated with psoriasis) [25]. The advent of next-generation sequencing platforms that are faster, cheaper and easier to automate than classic Sanger sequencing will enable researchers to amass a large body of sequence data from various patient cohorts and to compare patient groups to identify relevant alleles, in particular rare variants not captured on genotyping arrays or by imputation.

#### *2.2. Applications for Early Diagnosis, Prognosis and Treatment Monitoring*

PsA is known to have a strong genetic component, which means that certain allelic variants are likely to be more prevalent among PsA patients than controls (or other disease cohorts). Because such genetic variation is present from conception, it should be possible to detect disease-causing alleles before the onset of symptoms and commence treatment as early as possible. Similarly, it should be possible to detect PsA-associated alleles in cohorts of psoriasis patients and thus identify those at the greatest risk of progression. Although many different alleles are associated with psoriasis, PsA or both, GWAS can be used to screen for large panels of variants in a single test, which is generally based on array hybridization or multiplex PCR [26,27]. The detection of one or more informative variants can therefore provide data to indicate causality. Other markers may be useful for the assessment of therapy, and to determine which subcomponents are heritable, and therefore more predictable [28]. Accordingly, prospective studies are needed in psoriasis patients, ideally recruited from primary care before disease-modifying therapy commences, to assess the ability of genetic variants to predict the onset of PsA.
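As a minimal illustration of how the enrichment of an allele among cases relative to controls can be quantified, the following sketch computes an allelic odds ratio with a 95% confidence interval using the Woolf (log) method. The function name and the allele counts are hypothetical and are not taken from any of the cited studies.

```python
import math


def allelic_odds_ratio(case_alt, case_ref, control_alt, control_ref, z=1.96):
    """Return (odds ratio, CI lower, CI upper) for a 2x2 allele-count table.

    The confidence interval uses the Woolf (log) method, which assumes
    that all four counts are greater than zero.
    """
    counts = (case_alt, case_ref, control_alt, control_ref)
    if min(counts) <= 0:
        raise ValueError("all four allele counts must be positive")
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    # Standard error of log(OR) is the root of the summed reciprocal counts
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: risk/reference alleles in cases, then in controls
or_value, ci_low, ci_high = allelic_odds_ratio(120, 880, 60, 940)
print(f"OR = {or_value:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

An interval that excludes 1 suggests an association at that locus, although in a genome-wide setting a much stricter significance threshold would be needed to account for multiple testing.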

#### *2.3. Case Studies/Examples in Psoriasis and PsA*

The primary genetic factors that distinguish PsA from psoriasis map to the *HLA-B* locus [29,30]. The alleles *HLA-B\*39*, *HLA-B\*07*, *HLA-B\*38* and in particular *HLA-B\*27* have been described as specific risk factors for PsA [31]. Although gene mapping is consistent across different studies, resolution to a precise allelic variant is conflicting when the reported index associations point to amino acid positions 45 or 97 (Table 1). Outside the HLA region, there is convincing evidence for a PsA-specific effect at the *IL23R* locus independent of the known psoriasis risk variant [32–34]. Other genes associated with PsA but not psoriasis include *KIR2D* [35], *IL4* and *KIF3A* [36], *B3GNT2* [37] and *PTPN22* [25].

**Table 1.** Genetic variants with evidence to support their ability to distinguish between PsA and cutaneous-only psoriasis.


#### **3. Epigenomics**

#### *3.1. Brief Overview of Relevant Epigenomics Technologies*

Epigenomics is the large-scale analysis of epigenetic phenomena, which include DNA methylation and histone modification as regulators of the 3D configuration of the genome, and the expression of small regulatory RNAs. Epigenetic mechanisms play a key role in the regulation of gene expression, and specific epigenetic markers can be associated with diseases such as psoriasis and PsA. Various technologies can be used to monitor genome-wide epigenetic phenomena, including chromatin immunoprecipitation (ChIP) followed by detection on microarrays (ChIP-chip) [38] or by sequencing (ChIP-Seq) [39], the detection of methylated DNA using bisulfite sequencing or (directly) by nanopore sequencing or SMRT sequencing [40], and enzyme-based chromatin accessibility assays [41]. The detection of chromosome conformation signatures (sequences that are likely to control the 3D structure of the genome) can also be used to pinpoint abnormal chromosome structures that are associated with diseases or responses to treatment. For example, the Oxford Biodynamics EpiSwitch platform is based on the testing of more than 10,000 samples in 30 disease indications, enabling the screening, evaluation, validation and monitoring of 3D genomic biomarkers [42].

#### *3.2. Applications for Early Diagnosis, Prognosis and Treatment Monitoring*

The EpiSwitch platform facilitates the discovery of stable and heritable 3D genomic markers and the development of highly sensitive clinical assays based on non-invasive blood readouts. In the case of PsA, it can assist with a definitive diagnosis and prognosis in the context of comorbidities and overlapping symptoms, without resorting to biopsy. This technique has already delivered biomarkers that predict the response to methotrexate treatment in RA patients [43], that predict the response to immune checkpoint inhibitors in cancer [44], and that are prognostic for severe outcomes of COVID-19 based on individual patient immune health profiling [45]. The markers profiled by EpiSwitch technology are governed by all forms of genetic and epigenetic variation, and their combined influence has a major impact on the regulation of gene expression by controlling access to chromatin. Therefore, such markers are powerful high-level integrators of multi-omic signals [46]. In order to utilize the full potential of EpiSwitch, a representative cohort of whole blood samples with clinical annotations is required, representing extreme clinical outcomes. That spectrum will define the quality of the EpiSwitch biomarkers and their correlation with other modalities.

#### *3.3. Case Studies/Examples in Psoriasis and PsA*

Although chromosome conformation signatures for psoriasis and PsA are not yet available, the promise of the technique has been demonstrated in early RA patients commencing methotrexate treatment [43]. Using blood samples from responders, non-responders and healthy controls, a custom biomarker discovery array was refined to a five-marker chromosome conformation signature that could discriminate, at baseline, between responders and non-responders to methotrexate. The signature consisted of binary chromosome conformations in the genomic regions of *IFNAR1*, *IL-21R*, *IL-23*, *CXCL13* and *IL-17A*. When tested on a cohort of 59 RA patients, the signature provided a negative predictive value of 90% for methotrexate response. When tested on a blinded, independent validation cohort of 19 early RA patients (9 responders and 10 non-responders), the signature demonstrated a true negative response rate of 86%, and 90% sensitivity for the detection of non-responders. The corresponding loci were mapped to an RA-specific expression quantitative trait locus (eQTL), and only conformations in responders mapped to this eQTL.
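The performance measures quoted above (sensitivity, negative predictive value and related quantities) all derive from the four cells of a confusion matrix. The following minimal sketch shows the standard definitions, assuming that a "positive" call means a predicted non-responder. The counts are purely illustrative and are not the counts from the cited RA study.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts.

    tp/fn: true non-responders called positive/negative by the signature
    fp/tn: true responders called positive/negative by the signature
    Assumes every denominator is non-zero.
    """
    return {
        "sensitivity": tp / (tp + fn),  # fraction of non-responders detected
        "specificity": tn / (tn + fp),  # fraction of responders correctly cleared
        "ppv": tp / (tp + fp),          # reliability of a positive call
        "npv": tn / (tn + fn),          # reliability of a negative call
    }


# Illustrative counts only (hypothetical cohort of 50 patients)
metrics = diagnostic_metrics(tp=18, fn=2, fp=3, tn=27)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that sensitivity and specificity are properties of the test, whereas the predictive values also depend on the proportion of non-responders in the cohort, so they may not transfer directly between cohorts with different response rates.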

#### **4. Proteomics**

#### *4.1. Brief Overview of Relevant Proteomics Technologies*

Proteomics can be defined as the large-scale analysis of proteins. In the context of PsA, it has been applied mainly to identify biomarkers that can be detected in blood, synovial fluid or skin samples for the early diagnosis of PsA and its differentiation from psoriasis [47–49]. The proteome is much more complex and dynamic than the genome: although there are only ~20,000 protein-encoding genes in the human genome [50], these give rise to multiple variants by alternative transcription, splicing and processing of RNA, post-translational modification and protein–protein interactions. About 10% of the human proteome lacks experimental evidence, and the combined effect of differential protein abundance, protein modifications, sequence variation and interactions further complicates the task of measuring all proteins in every sample [50].

The technologies used to interrogate the proteome can be broadly divided into untargeted methods that attempt to consider all proteins in a sample, and targeted methods that focus on specific proteins or classes of proteins. Mass spectrometry (MS) is a key platform in both approaches because it is a sensitive, high-throughput technology that is relatively easy to automate. Proteins are digested into peptides using a protease with known specificity such as trypsin, and the mass of each peptide, and its fragments generated inside a collision cell, is correlated with values in databases to achieve peptide and hence protein identification. Untargeted methods are based on the analysis of complex, uncharacterized peptide mixtures from multiple proteins. These are generally fractionated by liquid chromatography before injection into the mass spectrometer (LC-MS) and/or by multiple rounds of MS. In the latter case, data-dependent acquisition (DDA) involves the selection of specific peptides during the first round of MS for further fragmentation in subsequent rounds, whereas data-independent acquisition (DIA) involves the fragmentation and further analysis of all peptides from the first round [51]. Targeted methods involve the selection of one or a relatively small number of proteins from a sample for quantitative analysis [52]. Targeted analysis can be undertaken using MS-based methods as exemplified in the Atturos platform or methods that rely on highly specific affinity reagents. In the latter case, the production of high-quality data requires the use of validated binders (affinity reagents) that capture target proteins at low abundance [53]. Current affinity proteomics methods can detect more than 3000 proteins simultaneously by using different selectivity concepts, as well as the amplification capabilities of DNA-based readout methods. 
One relevant example is the Olink platform, a proximity extension assay that involves the recognition of proteins by antibodies linked to protein-specific DNA barcodes that can be amplified and read out by qPCR or sequencing [54]. This may have a broader dynamic range and greater sensitivity than LC-MS and can simultaneously detect 3000 human proteins in plasma samples [55]. The use of slow off-rate DNA aptamers, provided by SomaLogic, has enabled large-scale studies of 10,000 donors targeting 4000 circulating proteins across human diseases [56].
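The principle of MS-based peptide identification described above (digestion with a protease of known specificity, followed by matching measured masses against theoretical values) can be sketched in a few lines. This is a simplified illustration rather than a real search engine: it assumes trypsin cleaves after K or R except before P, allows no missed cleavages or modifications, and compares neutral monoisotopic masses rather than the m/z values of charged ions. The example sequence, function names and tolerance are hypothetical.

```python
# Monoisotopic residue masses in daltons (standard values)
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per peptide for the free termini


def tryptic_digest(sequence):
    """Split a protein after K or R, unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        at_end = i == len(sequence) - 1
        if at_end or (residue in "KR" and sequence[i + 1] != "P"):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_mass(observed, peptides, tol_ppm=10.0):
    """Return the peptides whose theoretical mass is within tol_ppm."""
    hits = []
    for peptide in peptides:
        theoretical = peptide_mass(peptide)
        if abs(theoretical - observed) / theoretical * 1e6 <= tol_ppm:
            hits.append(peptide)
    return hits


peptides = tryptic_digest("MAGKPLRSTK")  # hypothetical sequence
print(peptides)
print(match_mass(334.1852, peptides))
```

In practice, search engines score fragment-ion spectra against in silico digests of whole proteome databases, and the intact-mass comparison shown here only narrows down the list of candidate peptides.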

#### *4.2. Applications for Early Diagnosis, Prognosis and Treatment Monitoring*

For the proteomic analysis of body fluids, particularly blood, further challenges arise due to the broad concentration range of different proteins, dynamic changes induced by disease processes, and analytical factors that influence protein detection [57]. Collectively, more than 4500 proteins have been detected in plasma samples by discovery-driven MS [58]. Many abundant plasma constituents are secreted by the liver, whereas other secreted proteins, such as inflammatory cytokines, are often elevated only transiently [59]. Accordingly, differences have been observed between individuals and between molecular profiles at longitudinal study time points [60]. When searching for protein biomarkers in healthy individuals, as well as psoriasis and PsA patients, the heterogeneity of signatures from circulating proteins should be expected.

Multiple candidate biomarkers of PsA have been reported in serum and plasma, in addition to a smaller number found in synovial fluid/tissue and skin biopsies [47,48]. Most of the biomarker candidates are proposed for the detection of PsA [61,62], differentiation between mild and severe forms [63,64], measuring disease activity [65], or predicting which psoriasis patients are likely to develop PsA [66]. However, others have been proposed to distinguish PsA from other arthritic diseases such as RA [49,67] or to monitor responses to therapy [68–71]. For example, the label-free MS analysis of synovial fluid from PsA patients revealed 12 candidate PsA markers including the injury marker MMP3, as well as the inflammatory proteins S100A9 and CRP [62]. A subsequent study using LC-MS identified periostin, which is related to cell-adhesion proteins, and the angiogenesis marker PGK1 [72]. More recently, a systematic search of five bibliographic databases for clinical, laboratory and genetic markers was used to determine the level of evidence for each marker and its association with concomitant/developing PsA [73]. These have been converted into proteomic biomarkers in Table 2. For the prediction of PsA in psoriasis patients, highly characterized cohorts of patients are needed with each disease, minimizing the proportion of undiagnosed subclinical PsA patients in the psoriasis group. Alternatively, longitudinal observation and sample collection in the psoriasis group may directly identify those progressing to PsA, allowing the retrospective analysis of early samples to look for predictive biomarkers.

#### *4.3. Case Studies/Examples in Psoriasis and PsA*

In a recent study, a set of 951 circulating proteins was analyzed in serum samples to interrogate possible differences between patients with PsA, psoriasis and healthy controls [74]. Sixty-eight differentially expressed proteins were identified when comparing PsA patients and healthy controls, but no differentially expressed proteins were identified when comparing PsA and psoriasis patients. This led the authors to propose a "shared serum proteomic signature" between psoriasis and PsA. However, the cohorts were very small and subclinical PsA in the psoriasis group could not be excluded. Indeed, no information was provided about patient inclusion/exclusion criteria or the criteria used for the differentiation of PsA from psoriasis, which is necessary in such studies. In conclusion, the authors recommended that future studies focus on skin and synovial tissue to find differences between PsA and psoriasis patients.

**Table 2.** Proteomic markers with evidence to support their ability to distinguish between PsA and cutaneous-only psoriasis. A gene-centric table of candidates was created by using the biomarkers listed by Mulder et al. [73]. The proteins and mRNAs were converted into gene-centric entries using the Human Protein Atlas portal (www.proteinatlas.org, accessed on 5 July 2022), and were annotated for secretion location, tissue expression and biological function based on the recent clustering of single-cell expression data [75].




#### **5. Metabolomics**

#### *5.1. Brief Overview of Relevant Metabolomics Technologies*

Metabolomics can be defined as the investigation of changes in the populations of endogenous and exogenous low-molecular-weight metabolites (<1500 Da), representing a shift from single metabolite monitoring to complex profiling and pattern recognition [76]. This is a considerable analytical challenge that involves the identification and quantification of a broad spectrum of molecules in biological matrices such as human plasma or urine, which contain hundreds or thousands of metabolites with diverse chemical and physical properties across a wide dynamic range of concentrations. The most widely used techniques include nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry in combination with gas chromatography (GC-MS) or liquid chromatography (LC-MS). Advanced bioinformatics and statistical tools are used to maximize the recovery of information from the resulting metabolomic datasets.

#### *5.2. Applications for Early Diagnosis, Prognosis and Treatment Monitoring*

Low-molecular-weight metabolites are important indicators and even integrators of phenotypes, reflecting the biochemical activity of cells and tissues. Metabolomics recognizes that changes in cell function are most evident at the level of small-molecule metabolism and can provide a coherent view of the response of individuals to a variety of genetic and environmental influences [77]. The abnormal cellular processes associated with disease often disrupt the composition of low-molecular-weight metabolites. Perturbations in metabolite abundance and temporal profiles in readily accessible body fluids may provide an index of disease severity through the direct measurement of biochemical changes. As such, metabolomics has the potential to identify biomarkers of PsA that may improve diagnostic accuracy and predict disease progression as well as defining patient responses to specific therapeutic interventions. Similarly, metabolomics may offer additional insight into the metabolic pathways that drive the chronic, immune-mediated processes that are characteristic of PsA, opening routes to potential new drug targets.

#### *5.3. Case Studies/Examples in Psoriasis and PsA*

Researchers are increasingly using metabolomics for the clinical assessment of PsA [78–80]. Several studies have reported alterations in the metabolomes of PsA patients in comparison to healthy controls or individuals with related inflammatory diseases such as psoriasis or RA. The serum levels of various amino acids are modified in PsA patients relative to RA cohorts [81,82]. Changes in the levels of circulating glucuronic acid and α-ketoglutaric acid were detected among psoriasis patients with or without PsA [83] and a correlation was made between serum levels of the choline metabolite trimethylamine *N*-oxide (TMAO) and inflammation in PsA patients [84]. A more recent study used untargeted metabolomics to characterize the metabolic changes in the transition from psoriasis to PsA, revealing differences in the abundance of bile acids (particularly glycoursodeoxycholic acid sulfate) and butyrate to differentiate between psoriasis patients who did or did not progress to PsA [85].

Metabolite profiles in other matrices can also provide a window of opportunity to elucidate the metabolic changes in PsA. It was recently reported that α/β-turmerone, glycerol 1-hexadecanoate, dihydrosphingosine, pantothenic acid and glutamine may act as fecal biomarkers for PsA [86]. In addition, a metabolomic study focusing on urinary metabolites revealed lower levels of citrate, alanine, methylsuccinate and trigonelline in PsA patients compared to unaffected individuals [87]. Metabolomic approaches have also been used to evaluate PsA patient responses to anti-TNF therapy. For example, histamine, glutamine, phenylacetic acid, xanthine, xanthurenic acid and creatinine levels were elevated in urine samples from patients who responded to TNF antagonists, whereas ethanolamine, *p*-hydroxyphenylpyruvic acid and phosphocreatine levels were depleted [88].

#### **6. Lipidomics**

#### *6.1. Brief Overview of Relevant Lipidomics Technologies*

Recent technological improvements in LC-MS enable comprehensive lipidomic analysis in clinical studies, analyzing extensive sample sets for different lipids and lipid mediators. Depending on the specific lipids, targeted or untargeted LC-MS may be most appropriate. The untargeted approach is based on high-resolution mass spectrometry (HRMS) and can potentially examine the whole lipidome in a single run, but focuses on the more abundant lipids because the dynamic range is not sufficient to detect scarce molecules such as lipid mediators alongside abundant lipids such as triglycerides. Scarce lipids are analyzed using targeted approaches based on tandem mass spectrometry (MS/MS), which has greater sensitivity and selectivity. However, targeted methods cannot display the whole lipidome, so an approach combining untargeted and targeted methods is used for comprehensive lipidomics analysis, searching for lipids and lipid mediators relevant in the context of psoriatic diseases.

#### *6.2. Applications for Early Diagnosis, Prognosis and Treatment Monitoring*

Lipids and lipid mediators play a fundamental role in the immune system and changes in homeostatic status are closely related to IMIDs [89–93] such as RA [94], IBD [95–99] and psoriatic diseases [100–103]. Lipids are involved in many different processes and are also essential building blocks of membranes and key components in energy metabolism. Lipid mediators such as oxylipins and endocannabinoids, which are present at very low concentrations, are signaling molecules implicated in diverse physiological and pathological processes. Therefore, lipid profiles might be used as biomarkers for early diagnosis, prognosis of disease progression or the development of comorbidities, and to guide the selection of the most promising therapeutic approach.

Biomarker discovery in the field of lipidomics is challenging due to strict procedural requirements for sampling, sample preparation and analysis. This is partly due to the variable concentration of lipids in different biological matrices, the broad spectrum of isomeric compounds and the special procedures required to ensure lipid stability at all stages, including pre-analytical sample handling. On the other hand, lipidomics covers a field of up to several thousand different molecules and one of its key advantages is the close temporal linkage between these markers and individual clinical phenotypes or disease states [77,104].

#### *6.3. Case Studies/Examples in Psoriasis and PsA*

The close link between lipid profiles and IMIDs was recently demonstrated in PsA patients, where oxylipins [102,103,105,106], endocannabinoids [103,107], fatty acids [105–107] and phospholipids [106] were found to be potentially pathophysiologically relevant. A recent study also found that the level of inflammatory lipid mediators in psoriasis patients increased following a PsA diagnosis, particularly leukotriene B4 [85]. However, a comprehensive study is required to compare the lipid profiles of patients with psoriasis and PsA, and this will be the first step toward the identification of lipid biomarkers that improve the diagnosis and treatment of PsA.

#### **7. Complementary Technologies—Multiple Sequential Immunohistochemistry**

Several multiple immunohistochemistry systems have been developed that allow the staining of tissue sections with a theoretically unlimited number of antibodies. The technology makes use of directly labeled antibodies carrying a fluorophore or heavy metal ion. The antibodies are applied to the sample in an automated process, which includes a short incubation period, washing steps, imaging and signal removal. The latter involves either bleaching or chemical inactivation, and is followed by the addition of the next antibody [108,109]. This process can be repeated as often as necessary, and typically creates image stacks representing 30–50 antibodies. Recent developments include software that combines single-cell phenotyping and localized information about neighboring cells, facilitating a quantitative "tissue FACS analysis" (FACS = fluorescence-activated cell sorting) with the description of disease-specific immunological neighborhoods within inflamed tissues [108].

One of the key benefits of multiple immunohistochemistry systems in the context of PsA is single-cell phenotyping in different patient groups using 40–50 antibody probes in automated cycles. By detecting and quantifying a large panel of corresponding markers, it is possible to identify nearly all immune cells and their subtypes, and to characterize their cellular neighborhood to quantify and visualize cellular networks (information that is lost during FACS analysis). The comparison of samples from psoriasis and PsA cohorts can therefore identify differences between the patient groups and generate information about biomarkers and immune cell networks/interactions that may lead to new therapeutic options.

#### **8. Data Management/Integration and Artificial Intelligence**

To benefit from the wealth of methods used to mine multi-omics data, it is essential to align the data and verify their quality before integration. Data should be formatted according to international standards, including standard bioinformatics file formats (such as FASTA, FASTQ, SAM/BAM, VCF and GFF), and "minimum information" standards for omics experiments [110], including MIGS/MIMS for genomics [111] and MIAPE for proteomics [112]. The data must be checked to ensure they include the same annotation references (e.g., genome version, standard gene and protein names). This is challenging with lipidomics and metabolomics data where there are currently no widely accepted standards, although efforts are ongoing to establish equivalent minimum information standards such as MIAMET [113] as well as standards for lipidomics analysis [114,115]. Following de-identification, clinical data are standardized using the OMOP common data model (Observational Health Data Sciences and Informatics, OHDSI program, available at https://ohdsi.org/, accessed on 20 June 2022) and aligned to standard dictionaries to ensure interoperability. Once formatted and standardized, data are stored and accessed via a secure data management infrastructure specifically designed to protect sensitive clinical and biomedical data [116].
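The check that all data layers share the same annotation references can be automated before integration. The following is a minimal sketch under the assumption that each omics layer carries a small metadata record. The field names (`genome_build`, `gene_nomenclature`), the layer names and the function itself are hypothetical and do not describe the HIPPOCRATES infrastructure.

```python
def find_annotation_mismatches(
    layers, required_keys=("genome_build", "gene_nomenclature")
):
    """Report metadata keys whose values differ or are missing across layers.

    layers: mapping of layer name -> metadata dict
    Returns {key: {value: [layer names]}} for each inconsistent key.
    """
    mismatches = {}
    for key in required_keys:
        by_value = {}
        for name, metadata in layers.items():
            by_value.setdefault(metadata.get(key, "MISSING"), []).append(name)
        # More than one distinct value, or any missing value, is a problem
        if len(by_value) > 1 or "MISSING" in by_value:
            mismatches[key] = by_value
    return mismatches


# Hypothetical metadata for three data layers
layers = {
    "genomics": {"genome_build": "GRCh38", "gene_nomenclature": "HGNC"},
    "epigenomics": {"genome_build": "GRCh37", "gene_nomenclature": "HGNC"},
    "proteomics": {"genome_build": "GRCh38", "gene_nomenclature": "HGNC"},
}
print(find_annotation_mismatches(layers))
```

In this example, the epigenomics layer would be flagged because it refers to a different genome build, indicating that its coordinates need to be converted before the layers are combined.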

Multiple processing steps should be considered to ensure data integrity, including missing value imputation, normalization, transformations, aggregation and batch effect correction [117–119]. Unsupervised multivariate analysis methods such as common dimensions [120] can be used to assess overall variability, trends and potential biases across multiple integrated layers of multi-omics and clinical data before further data exploration by supervised multivariate analysis methods such as OPLS-DA [121] or artificial intelligence-based approaches such as machine learning. A wide range of computational methods can be applied depending on the study design and research aim. In addition to classical statistical analysis, machine learning can be used to evaluate data in an unsupervised manner for preliminary exploration and dimensional reduction (e.g., clustering approaches such as DBSCAN or k-means algorithms, or dimensional reduction such as PCA or t-SNE). Dimensional reduction and clustering approaches can also expose batch effects and reveal outliers [118]. In a clinical setting, supervised machine learning often tackles classification problems rather than regression. Due to the high dimensionality of multi-omics data and the so-called "curse of dimensionality" (low number of subjects and high number of features), regularization and feature selection methods such as LASSO or ridge regression can be applied to enhance the results of supervised learning [118,122]. In addition to feature selection, class imbalance is a common challenge in multi-omics analysis, but can be addressed by resampling or cost-sensitive learning [122]. Commonly used algorithms such as random forests, support vector machines and the k-nearest neighbor algorithm can provide insight into the underlying structures of datasets and can be developed into powerful models for the support of clinical decision making [123–125]. In order to develop further hypotheses and integrate data with the literature, pathway analysis can embed the data in a broader context [126].

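Two of the steps mentioned above, preprocessing of a single feature and cost-sensitive handling of class imbalance, can be sketched as follows. This is a minimal illustration under stated assumptions rather than a recommended pipeline: median imputation, a log2(x + 1) transform, z-scoring and inverse-frequency class weights are choices made here for simplicity, and the function names and toy values are hypothetical.

```python
import math
import statistics


def preprocess_feature(values):
    """Median-impute missing values, log2-transform, then z-score one feature."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    imputed = [median if v is None else v for v in values]
    # log2(x + 1) keeps zeros finite; assumes non-negative intensities.
    logged = [math.log2(v + 1.0) for v in imputed]
    mean = statistics.fmean(logged)
    sd = statistics.pstdev(logged)
    if sd == 0:
        return [0.0 for _ in logged]
    return [(v - mean) / sd for v in logged]


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


# Toy values for illustration only, not real measurements.
intensities = [3.0, None, 7.0, 15.0, 1.0]
scaled = preprocess_feature(intensities)
weights = balanced_class_weights(["PsO", "PsO", "PsO", "PsA"])
```

The resulting weights could be passed to a classifier that supports per-class costs, so that errors on the rarer class are penalized more heavily.
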
#### **9. The Advantage of Multi-Omics Evaluation**

As discussed above, several markers have been identified that commonly occur in PsA patients, but no single marker stands alone as a specific indicator of the disease. Even the most reliable markers are also present in other IMIDs, which therefore makes it difficult if not impossible to achieve a definitive diagnosis. In other fields, the lack of definitive qualitative markers has been addressed by (a) seeking quantitative markers, whose abundance rather than presence/absence correlates with a disease, and (b) profiles based on combinations of several markers that are more informative than single molecules, a strategy first applied to ovarian cancer [127]. Indeed, this approach has also been successful in RA, where the proteomic analysis of serum and synovial fluid has revealed the elevation of multiple biomarkers representing processes such as joint inflammation and injury (e.g., MMP3, IL-12, IL-15 and IL-18), cartilage integrity and bone or connective tissue degradation (e.g., MMP13 and neoepitopes of collagen) [128]. Considering that such panels of RA markers have been assembled based solely on proteomics data, it is clear that the combination of proteomics with orthogonal omics datasets plus more diverse data can increase the power of this approach exponentially, both for diagnosis/prognosis [129] and the monitoring of drug responses [130]. In one recent study, metabolomics and lipidomics analysis revealed that a combination of the bile acid conjugate glycoursodeoxycholic acid sulfate and lipid mediator leukotriene B4 provided a sensitive and specific predictor of progression from psoriasis to PsA [85]. However, adding new features will also require the careful evaluation of added value, both for discovery in basic research and translation to the clinic. Models that incorporate more markers may be more sensitive and specific, but the cost of acquiring the data in routine clinical practice may be prohibitive, although this may not always be the case [131]. 
A risk remains that expensive and large datasets merely report already known aspects, such as the effect of the body mass index or inflammation on disease progression. The field should also strive to identify causal markers rather than solely correlative observations without a direct biochemical connection to the phenotype. As the amount and complexity of the data increase, it becomes more difficult for humans to identify consistent patterns that correlate with certain diseases. Machine learning algorithms, whether supervised to assign samples to known categories or unsupervised to devise categories de novo, have the power to reveal these hidden patterns and then to apply the same approach when analyzing data from new patients, greatly improving the accuracy of the resulting predictions.
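
The argument that a profile of several quantitative markers can be more informative than any single marker can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the marker names, the equal weights, the threshold and the four synthetic samples do not come from any study cited here.

```python
def panel_score(sample, weights):
    """Weighted sum of marker levels; the weights are illustrative only."""
    return sum(weight * sample[marker] for marker, weight in weights.items())


def sensitivity_specificity(scores, labels, threshold):
    """Call a sample positive when score >= threshold.

    labels are True for disease; both classes must be present.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y and s >= threshold)
    fn = sum(1 for s, y in pairs if y and s < threshold)
    tn = sum(1 for s, y in pairs if not y and s < threshold)
    fp = sum(1 for s, y in pairs if not y and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Synthetic toy samples: two hypothetical markers, arbitrary units.
samples = [
    {"marker_a": 4.0, "marker_b": 1.0},
    {"marker_a": 1.0, "marker_b": 4.0},
    {"marker_a": 2.0, "marker_b": 1.0},
    {"marker_a": 1.0, "marker_b": 2.0},
]
labels = [True, True, False, False]
weights = {"marker_a": 0.5, "marker_b": 0.5}

single = sensitivity_specificity([s["marker_a"] for s in samples], labels, 2.0)
panel = sensitivity_specificity(
    [panel_score(s, weights) for s in samples], labels, 2.0
)
```

In this constructed example the single marker misclassifies half of the samples, whereas the combined score separates the two groups. Real data would of course require validation in independent cohorts.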

#### **10. Conclusions and Outlook**

The definitive diagnosis and early treatment of PsA are not yet possible because the clinical manifestations and associated biomarkers are not, on an individual basis, able to distinguish PsA from other IMIDs or predict those individuals with psoriasis who will progress to PsA. However, the combination of biomarker profiles based on data from multi-omics technologies and classical sources, such as imaging data and clinical evaluations, could provide the body of information required for early diagnosis and the initiation of effective treatment before symptoms emerge. Combinations of different types of biomarkers have proven effective in other fields, particularly oncology, but such biomarker profiles are often mainly based on a single method. The power of biomarker profiles may increase with the number of complementary modalities that can be tested simultaneously. For PsA, combining information from disease-associated alleles and chromatin structures, the levels of proteins, lipids and other metabolites, multimodal image analysis, histology and classical phenotyping will provide an important step forward. Ultimately, the use of multiple orthogonal technologies that embed machine learning will lead to the generation of unique molecular and clinical fingerprints that can be used for PsA diagnosis, prognosis and therapeutic monitoring. However, research on the identification of biomarker profiles/fingerprints using different omics technologies is still in the discovery phase with much work to be conducted to turn the anticipated results of these analyses into assays which are applicable in routine clinical settings. The HIPPOCRATES project is therefore strategically important because it combines expertise from all relevant fields with access to comprehensive cohorts, technologies and translational research experience. This will ultimately improve the quality of life for those living with PsA or at risk of developing PsA.

**Author Contributions:** All authors contributed to the conceptualization, writing, reviewing and editing of the manuscript. All authors have read and agreed to the published version of the manuscript.

**Funding:** HIPPOCRATES has received funding from the Innovative Medicines Initiative 2 Joint Undertaking (JU) under grant agreement no. 101007757. The JU receives support from the European Union's Horizon 2020 research and innovation program and EFPIA.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** R.G., M.K., L.H., S.K., E.R., S.R., K.S., G.G. and F.B. acknowledge support from the Fraunhofer Cluster of Excellence Immune Mediated diseases (CIMD). A.B. is an NIHR Senior Investigator. A.B. and J.B. receive support from Versus Arthritis (grant ref. 21754) and the NIHR Manchester Biomedical Research Centre. The views expressed are those of the authors and not necessarily those of the NHS, National Institute for Health and Care Research or the Department of Health. The authors would like to thank Daniel Kratz and Samuel Rischke for support in creating the figures.

**Conflicts of Interest:** The authors, including Richard M. Twyman, who is the director of the company TRM, declare no conflict of interest.

#### **References**


### *Review* **Rebooting Regulatory T Cell and Dendritic Cell Function in Immune-Mediated Inflammatory Diseases: Biomarker and Therapy Discovery under a Multi-Omics Lens**

**Dimitra Kerdidani 1,2,†, Nikos E. Papaioannou 1,2,†, Evangelia Nakou 1,2,† and Themis Alissafi 1,2,\***


**Abstract:** Immune-mediated inflammatory diseases (IMIDs) are a group of autoimmune and chronic inflammatory disorders with constantly increasing prevalence in the modern world. The vast majority of IMIDs develop as a consequence of complex mechanisms dependent on genetic, epigenetic, molecular, cellular, and environmental elements that lead to defects in immune regulatory guardians of tolerance, such as dendritic cells (DCs) and regulatory T cells (Tregs). As a result of this dysfunction, immune tolerance collapses and pathogenesis emerges. Deeper understanding of such disease-driving mechanisms remains a major challenge for the prevention of inflammatory disorders. The recent renaissance in high-throughput technologies has increased the amount of data collected through multiple omics layers, while additionally narrowing the resolution down to the single-cell level. In light of these developments, this review focuses on DCs and Tregs and discusses how multi-omics approaches can be harnessed to create robust cell-based IMID biomarkers in the hope of leading to more efficient and patient-tailored therapeutic interventions.

**Keywords:** immune-mediated inflammatory disorders; autoimmune diseases; immune regulation; dendritic cells; regulatory T cells; omics; therapeutic targeting; biomarkers; chronic inflammation

#### **1. Introduction**

Immune-mediated inflammatory diseases (IMIDs) are a diverse group of incurable clinical disorders that constitute a unique conceptual and medical challenge for the scientific community. Under the umbrella of the broad term IMIDs, many autoimmune as well as chronic inflammatory diseases, such as rheumatoid arthritis (RA), inflammatory bowel disease (IBD), systemic lupus erythematosus (SLE), type 1 diabetes (T1D), cutaneous inflammatory disorders (including psoriasis and atopic dermatitis (AD)), asthma and autoimmune neurological diseases such as multiple sclerosis (MS), can be incorporated. IMIDs develop as a consequence of complex mechanisms that depend on genetic, epigenetic, molecular, cellular, and environmental elements and result in defects in immune regulatory checkpoints of tolerance [1,2]. This breakdown of self-tolerance leads to the aberrant activation of lymphocytes against otherwise harmless self or foreign antigens causing chronic unrestrained inflammation that destroys self-organs and tissues.

Two key checkpoints of self-tolerance and decision-makers of the type and magnitude of the immune response are dendritic cells (DCs) and regulatory T cells (Tregs). On the one side, DCs, by taking up environmental cues and self or foreign antigens and translating them into signals for the proper initiation of the immune response, constitute the sensors of the immune system and the link between innate and adaptive immunity [3,4]. On the other side are Tregs, which respond to signals from DCs, regulating and restraining exacerbated inflammation, thus comprising the brakes of the immune response [5,6]. During IMIDs, both cell types have been reported to be dysregulated, with altered frequencies in the periphery of patients, overt activation, and certain degrees of imbalance in their phenotype and function [7–10], thus leading to the breakdown of self-tolerance. Although the previous two decades have been transformative for the understanding of the mechanisms that govern immune dysregulation in IMIDs, effective and highly targeted treatments have proven to be elusive. Consequently, IMIDs remain a major burden on health systems around the world, accounting annually for several billion EUR in medical costs and lost income. Deciphering in depth the cellular and molecular mechanisms that contribute to the breakdown of immune tolerance is thus an important goal, with the prospect that this knowledge will pave the way to new clinical advances in the treatment of IMIDs.

**Citation:** Kerdidani, D.; Papaioannou, N.E.; Nakou, E.; Alissafi, T. Rebooting Regulatory T Cell and Dendritic Cell Function in Immune-Mediated Inflammatory Diseases: Biomarker and Therapy Discovery under a Multi-Omics Lens. *Biomedicines* **2022**, *10*, 2140. https://doi.org/10.3390/biomedicines10092140

Academic Editor: Marianna Christodoulou

Received: 27 July 2022 Accepted: 29 August 2022 Published: 31 August 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The recent breakthrough in advanced multi-omics technologies provides the essential tools for a large-scale and in-depth understanding of the mechanisms driving immune dysfunction in IMIDs. Indeed, bulk and single-cell omics, multi-parameter flow and mass cytometry, next-generation spatial omics, and systems biology are among the current approaches expected to be applied in daily clinical practice to improve patient management and quality of life. Here, we focus on Tregs and DCs, the two fundamental gatekeepers of immune tolerance, and discuss how recent advances in the field of IMIDs, illuminated by the dawn of omics technologies, can be harnessed to create robust cell-based biomarkers and patient-tailored therapeutic interventions.

#### **2. Regulatory T Cells as Multifaceted Orchestrators of Immune Responses**

Tregs are an important immune system component, critical for maintaining homeostasis and immunological self-tolerance [11,12]. Tregs exert their suppressive functions either by cell-to-cell contact or secretion of cytokines. More specifically, Tregs can effectively suppress immune responses via (a) secretion of anti-inflammatory cytokines such as interleukin-10 (IL-10), IL-35, and TGF-β [13–16], (b) granzyme- and perforin-mediated cytolysis [17–19], (c) expression of nucleotide-metabolizing enzymes such as CD39 and CD73 [20–22], (d) competition with effector T cells for IL-2, an essential T cell survival cytokine [23,24], and (e) dampening of the maturation/antigen-presenting capacity of dendritic cells [25–27].

Both human and murine Tregs are phenotypically distinguishable by the expression of the transcription factor Foxp3 and the IL-2 receptor alpha chain (IL-2Rα, CD25). However, since CD25 can also be highly expressed in other subsets of activated CD4+ T cells in humans, the absence of the IL-7 receptor alpha chain (IL-7Rα, CD127) is used as a complementary criterion to identify human Tregs [28]. Expression of the master regulator Foxp3 is a cardinal feature of Tregs, fundamental for their development and suppressive function [6,29]. Accordingly, loss-of-function mutations of the *FOXP3* gene in humans lead to the development of a severe autoimmune disease termed immune dysregulation, polyendocrinopathy, enteropathy, and X-linked (IPEX) syndrome [30,31]. Miyara et al. classified CD4+Foxp3+ Tregs from the peripheral blood of healthy individuals into three main fractions: Fr. I naïve Tregs (CD45RA+/CD25low), Fr. II effector Tregs (CD45RA−/CD25high) and Fr. III non-Tregs (CD45RA−/CD25low) [32]. This classification, which is based on surface markers, correlates well with the epigenetic profile and suppressive function of Tregs, with Fr. I and II being suppressive resting or activated Tregs, respectively, and Fr. III being non-suppressive and cytokine-secreting non-Tregs [8].
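
The three-fraction scheme described above amounts to a simple lookup on two surface markers. The sketch below encodes it exactly as stated in the text, for cells already gated as CD4+Foxp3+. The function name and the `"low"`/`"high"` encoding are hypothetical, and the intensity cut-offs that define low and high CD25 are assumed to come from the gating strategy of the individual experiment.

```python
# Marker combinations as given in the text for CD4+Foxp3+ cells.
FRACTIONS = {
    (True, "low"): "Fr. I (naive Treg)",
    (False, "high"): "Fr. II (effector Treg)",
    (False, "low"): "Fr. III (non-Treg)",
}


def treg_fraction(cd45ra_positive, cd25_level):
    """Map CD45RA status and CD25 level ('low' or 'high') to a fraction label."""
    # Combinations not covered by the scheme are reported as unclassified.
    return FRACTIONS.get((cd45ra_positive, cd25_level), "unclassified")


label = treg_fraction(cd45ra_positive=False, cd25_level="high")
```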

Tregs are generated either in the thymus (thymic-derived Tregs, tTreg) or in the periphery through conversion of CD4+Foxp3<sup>−</sup> T conventional cells following antigenic stimulation in the presence of TGF-β and IL-2 (induced Treg, iTreg) [33,34]. Whereas Treg cells were traditionally considered a terminally differentiated population, it is now well accepted that they acquire plasticity that allows them to adapt to the cues of the microenvironment [8]. By acquiring expression of specific T cell lineage transcription factors, such as T-bet, GATA-3, IRF-4, STAT-3, RORγt, and Bcl-6, as well as chemokine receptors, Tregs can skew to Th1, Th2, Th17, or T follicular helper cell-like phenotypes [35–41]. This functional adaptability is context- and tissue-dependent [8].

Th1-like Tregs circulate in the blood of patients with autoimmune diseases [9]. In addition to T-bet, they upregulate CCR5 and CXCR3 and secrete IFN-γ, while displaying reduced suppressive capacity when compared to other Tregs. IFN-γ secretion has been shown in vitro to depend on PI3K/AKT/FoxO signaling [9,42]. Similarly, Th2-like Tregs upregulate GATA-3 and IRF-4 and secrete IL-4 and IL-13 [9]. In the setting of IMIDs, Th2-like Tregs have been found in tissues rather than the periphery [43]. Th17-like Tregs upregulate the transcription factor RORγt and secrete IL-17A. Although it is as yet unclear whether they are a stable subcluster of Tregs or a transitory stage between Tregs and Th17 cells, Th17-like Tregs are found in steady state in the gastrointestinal tract, where they have a protective role [40,44], but also in the synovium of arthritic patients and in psoriatic lesions, where they contribute to disease pathogenesis [9,45–47].

Adding to their heterogeneity, Tregs also possess a certain degree of instability. Unstable Tregs, named ex-Tregs, produce inflammatory cytokines, downregulate Foxp3 expression, and concomitantly lose their suppressive function [46,48,49]. Post-translational modifications of the Foxp3 protein, namely acetylation, phosphorylation, and ubiquitination of specific residues, also contribute to Treg instability and plasticity, as they may lead to Foxp3 protein stabilization or proteasomal degradation [50–52]. Treg instability seems to have a key role in the pathogenesis of autoimmune diseases [49]. However, the extent to which Treg plasticity and instability contribute to the pathogenesis of IMIDs, and whether the modulation of Treg state can prove therapeutically relevant, are under investigation.

Lastly, major advances in the field have uncovered Tregs that reside in non-lymphoid structures and contribute to tissue homeostasis rather than immune surveillance [53]. Tissue-resident Tregs have been identified in several tissues including adipose tissue, skin, lung and gastrointestinal tract, where they become epigenetically adapted to the specific cues of the microenvironment [54]. Thus, the transcriptomic profile of tissue-resident Tregs differs significantly from that of their lymphoid tissue counterparts, as well as among different tissues. Several markers have been identified that distinguish tissue-Treg precursors that reside in lymphoid organs prior to their migration to homing tissues, such as PPARγlow, TCF1low, ID3low, and NFIL3+ [55]. Nevertheless, tissue-resident Treg biology remains largely unexplored and constitutes a fruitful field of research.

Tregs are instrumental in preventing IMIDs and preserving immune homeostasis. In fact, most autoimmune diseases bear numerical or functional alterations in their Treg cell compartment. For example, in T1D, the activated Tregs (CD4+CD45RA−Foxp3high) in peripheral blood of patients are increased in numbers and functionally impaired akin to a pro-inflammatory phenotype [56–58]. In RA patients, although the frequencies of Tregs (CD4+CD25+CD127−) in the periphery are either similar or lower compared to healthy controls [32,59,60], Tregs in the synovial fluid are increased in numbers and less suppressive [32,61]. In most studies, individuals suffering from relapsing-remitting MS have decreased numbers of CD4+CD25+ Tregs and increased frequencies of Th1-like (CD4+CD25highCD45RA−CD127−Foxp3+) Tregs in their blood [32,59,60,62,63]. The latter have been shown to express the pro-inflammatory cytokine IFN-γ and have reduced suppressive function when co-cultured with effector T cells in vitro [61]. In line with the perturbed function and frequency of Tregs noted in various autoimmune diseases, Treg (CD4+CD25high) numbers are also decreased in the peripheral blood of SLE patients and demonstrate reduced suppressive capacity relative to healthy controls [32,60,64]. Although human studies that investigate Treg frequencies and function in various autoimmune diseases suffer from discrepancies due to a lack of consistency in Treg definition markers, they nevertheless reveal the significance of Tregs for immune homeostasis [59].

Studies of Tregs in IMIDs derive mostly from flow cytometry data and ex vivo assays, and thus lack collective and high-throughput insight. Recent technological advances have established multi-omics platforms in the field of research and offer holistic approaches to data acquisition that are unbiased and hypothesis-independent. Herein, we review Treg-specific multi-omics approaches that have been applied in IMIDs research to date.

#### *2.1. Transcriptomic Studies Paving the Way for Illuminating Tregs' Functional Profiles and Subsets in IMIDs*

The study of bulk mRNA transcripts within a biological sample, termed transcriptomics, has now become a standard approach for investigating molecular mechanisms that underlie steady-state and pathogenic conditions, as transcriptional profiling of cells is able to reveal gene function and gene structure [65]. With the advent of the single-cell era, transcriptomics at the single-cell level has reshaped modern research and has uncovered cellular differences and heterogeneity in biological samples that had been masked by bulk RNA-sequencing (RNA-seq). Mostly bulk RNA-seq, and to a lesser extent single-cell RNA-sequencing (scRNA-seq), has been applied thus far in studying Tregs in the context of IMIDs.

A recent study in our lab interrogated the transcriptomic profile of Tregs from the peripheral blood of individuals suffering from MS, RA and SLE [60]. RNA-seq analysis revealed a plethora of deregulated transcripts when compared to healthy controls. Tregs were predominantly altered in metabolic pathways related to oxidative stress, mitochondrial dysfunction, cell death and DNA damage response. Interestingly, this signature was consistent across all autoimmune disease settings studied [60].

As mentioned above, Tregs are able to adapt to specific microenvironments; thus, conditions such as excessive inflammation leave an imprint on the Treg profile. Two independent studies have reported that in juvenile idiopathic arthritis (JIA), Tregs obtained from inflamed joints have a specific effector profile [66,67]. Both studies compared, among others, the transcriptome of Tregs from synovial fluid to that of Tregs from the peripheral blood of individuals with JIA. Differential gene expression analysis revealed that Tregs in the synovial fluid express a Th1 transcriptomic signature that is characterized by the expression of transcription factor *TBX21* (T-bet), chemokine receptor *CXCR3*, and IL-12 receptor β2 (*IL12RB2*). IFN-γ was also found upregulated in one of the studies [66]; nevertheless, when Tregs were stimulated ex vivo they failed to produce this cytokine [67]. Despite high expression of Th1-related proteins, Tregs preserved their suppressive features as shown by the maintenance of a robust Treg-associated transcriptional program [66] and functional assays [66,67].

Julé et al. employed scRNA-seq on Tregs sorted from synovial fluid of individuals experiencing JIA. Among five Treg clusters identified in this study, cluster 1 matched the expression profile of Th1-like Tregs while preserving the Treg transcriptomic signature, thus confirming the existence of a stable effector Treg population that maintains Treg-specific demethylation patterns and suppressive capacity, as identified by bulk RNA-seq. The newly identified and highly suppressive population of Th1-like Tregs, which was unveiled through the prism of transcriptomics, could constitute an attractive target with important therapeutic benefits for individuals with JIA. Four additional Treg subpopulations were identified that spanned from the classical and highly activated HLA-DR+ Tregs that robustly express Treg signature genes to the CD161<sup>+</sup> and IFN-induced Tregs that share some genes with effector T cell clusters [66].

Recently, the transcriptome of Tregs in the blood of individuals experiencing autoimmune polyendocrine syndrome type I (APS-1) versus healthy controls has been interrogated [68]. Whereas only subtle changes were observed between disease and healthy groups, the G Protein-Coupled Receptor 15 (*GPR15*) gene was found significantly downregulated, while the Fatty Acid Synthase (*FASN*) gene was upregulated in APS-1 Tregs [68]. Given that individuals with APS-1 suffer from gastrointestinal manifestations and GPR15 is a homing receptor for the gut, it was speculated that GPR15 downregulation might be indicative of a defective influx of Tregs in the gut [68]. In addition, an increase in FASN, important for fatty acid synthesis, is suggestive of metabolic reprogramming of APS-1 Tregs [68]. However, the data were strictly descriptive and lacked functional evidence; thus, the results must be interpreted cautiously.

Despite the dominant role of Tregs in immunosuppression, so far only a very limited number of studies have focused on the single-cell analysis of tissue-specific Treg cells in IMIDs. In one of them, scRNA-seq was used to characterize Treg cells isolated from the peripheral blood and synovial fluid of two individuals with ankylosing spondylitis (AS) [69]. Analysis revealed ten specialized Treg clusters, present in both tissues, with unique gene expression signatures. Among them, a CD8<sup>+</sup> Treg subset expressing cytotoxic markers such as granzyme B and granulysin was significantly enriched in the synovial fluid of individuals with AS, whereas a Th17-like RORC+KLRB1+ Treg subset characterized by IL-10 and LAG-3 expression was significantly enriched in the blood of AS patients. Despite the small sample size, these two clusters were also identified in the peripheral blood and synovial fluid of individuals with psoriatic arthritis, another type of spondyloarthritis (SpA) [69]. Total synovial fluid Tregs were characterized by the upregulation of activation and inhibitory markers, as well as TNF and interferon response genes, and they were clonally expanded, suggesting tissue-specific adaptation. Thus, targeting these unique characteristics of joint-specific Treg subsets could have promising applications for the amelioration of SpA.

Immune-related adverse events (irAEs) constitute an atypical IMID that is worth mentioning. The impressive success of immune checkpoint therapies in the treatment of various types of cancer is often overshadowed by irAEs that arise due to excessive activation of the immune system. Previous studies in our lab applying RNA-seq have demonstrated that Tregs from the peripheral blood of individuals developing irAEs bear a pro-inflammatory profile accompanied by enrichment in the apoptotic and metabolic pathways [70]. Moreover, the irAE-associated Treg signature is shared across different types of cancer and resembles Treg traits of individuals with autoimmune diseases [70]. Unraveling phenotypic switches of Tregs that drive or precondition the development of irAEs is of utmost importance for the prevention of toxicities that often accompany cancer immunotherapies.

Although not an IMID per se, graft-versus-host disease (GVHD) manifests as an autoimmune disease, and transcriptomic approaches have been employed to dissect Treg complexity in patients receiving hematopoietic stem cell transplantation [71]. Specifically, single-cell transcriptomic analysis was performed in Tregs of the peripheral blood and bone marrow of healthy donors and of patients after hematopoietic stem cell transplantation who were either experiencing GVHD or not. The analysis resolved nine clusters in both the peripheral blood and bone marrow of individuals, including naïve (CCR7hi), activated (HLA-DRhi), LIMS1hi, effector (Foxp3hi), and proliferative (MKI67hi) Tregs. Functional evaluation revealed MKI67hi and Foxp3hi clusters as highly suppressive, followed by HLA-DRhi and LIMS1hi clusters. Pseudotime trajectory analysis uncovered the transitions among clusters, according to which naïve Tregs followed two distinct differentiation pathways towards either Foxp3hi Tregs (Path 1) or MKI67hi Tregs (Path 2). Whereas similar clusters, spanning from naïve to activated/effector Tregs, were identified in all groups, effector Treg clusters in individuals developing GVHD displayed downregulation of suppression and migration pathways as well as a senescence-like signature compared with non-GVHD patients [71]. Although the latter can be attributed to the age gap between GVHD and non-GVHD patients, Treg interrogation on a single-cell level offered a greater understanding of Treg features upon GVHD.

Regarding organ-specific immune-mediated diseases, the role of cell-based omics technologies, and particularly the advances in single-cell TCRαβ sequencing, is of primary importance to illuminate the antigen specificities of the pathogenic cells that mediate tissue damage, or of the regulatory cells that suppress the former in the inflammatory niche. Such knowledge will be decisive during the design of more efficient and targeted therapeutic approaches such as autoantigen-specific TCR engineering.

One such case is T1D, in which Treg cells have already been exploited in therapies, with early phase clinical trials of ex vivo expanded polyclonal Treg cells showing promising results [72,73]. However, since polyclonal Tregs are not antigen-specific, the approach utilized in these clinical trials could potentially lead to unwanted systemic immunosuppression. Interestingly, preclinical studies using the non-obese diabetic (NOD) murine model for T1D revealed that relatively small numbers of antigen-specific Treg cells, either pancreatic lymph node-derived or genetically engineered, and not polyclonal Treg cells, could prevent and even reverse T1D, pointing to therapies utilizing diabetogenic TCR-expressing Treg cells [74,75]. Still, most antigen-specific Treg cells are tissue-resident and only a small portion of them circulates in the bloodstream, rendering them difficult to isolate and characterize in humans. Additionally, so far, the attempts to create tailored Tregs utilize recombinant TCRs from Teff cells [76,77]. Due to these challenges, up to now the identification of the exact TCR sequences specific for dominant diabetogenic epitopes in Treg cells has been restricted to NOD mice. To this end, Spence et al. employed TCR repertoire profiling and TCRαβ scRNA-seq to determine the specificity of Treg cells in the islets of Langerhans. Treg clonotypes were found to be expanded and the least diverse in inflamed islets compared to other lymphoid organs, while some of their TCRs were specific for islet-derived antigens including insulin B:9–23 and proinsulin, implying tissue-specific antigen-driven expansion of Treg clonotypes [78]. This transcriptomic observation was further confirmed utilizing insulin B:9–23 tetramers able to detect increased insulin-specific Treg clones in the islets of NOD mice. Moreover, the adoptive transfer of total Treg cells from the islets, but not of Tregs from lymphoid organs, in NOD.CD28<sup>−/−</sup> mice could lead to disease rescue, further supporting the suitability of engineered Treg cells expressing insulin-specific TCRs as a promising strategy for suppressing autoimmune reactions against beta cells.
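
Statements such as "expanded and the least diverse" rest on a quantitative measure of repertoire diversity. The sketch below shows one common option, Shannon entropy and a normalized clonality index, computed from a list of clonotype labels (one entry per cell). The choice of metric, the function names and the placeholder labels are assumptions made for illustration; the text does not state which measure was used in the cited study.

```python
import math
from collections import Counter


def shannon_entropy(clonotypes):
    """Shannon entropy (natural log) of clonotype frequencies."""
    counts = Counter(clonotypes)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def clonality(clonotypes):
    """1 - normalized entropy: about 0 for an even repertoire, 1 for a single clone."""
    if not clonotypes:
        raise ValueError("empty repertoire")
    n_unique = len(set(clonotypes))
    if n_unique == 1:
        return 1.0
    return 1.0 - shannon_entropy(clonotypes) / math.log(n_unique)


# Placeholder labels standing in for paired TCR alpha/beta clonotypes.
even_repertoire = ["A", "B", "C", "D"]
expanded_repertoire = ["A"] * 7 + ["B", "C", "D"]

even_score = clonality(even_repertoire)
expanded_score = clonality(expanded_repertoire)
```

A repertoire dominated by a few expanded clones yields a higher clonality value than an evenly distributed one, which is the pattern described for Tregs in inflamed islets.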

In JIA, TCR repertoire assessment on a single-cell level revealed that the Th1-like Tregs identified in the joints of individuals with JIA are bona fide Tregs, as their clonotypic composition was similar to that of other Treg clusters and not to that of effector T cells [66]. Another study identified a subpopulation of activated Tregs (HLA-DR+) in the blood of JIA and RA patients that correlated negatively with response to therapy [79]. In JIA, these so-called inflammation-associated (ia) Tregs expand during inflammation and decrease when the disease is inactive. It is important to note that iaTregs also expand when children respond poorly to therapy. TCR-seq revealed antigenic stimulation and shared clonotypes between these iaTregs and Tregs from the synovium [79]. This observation confirmed that HLA-DR+ Tregs recirculate between the synovium and blood, which until then could only be hypothesized based on the expression of tissue-homing receptors [79]. Synovial Tregs that migrate to the blood could offer easy, non-invasive access to arthritis-associated clonotypes and at the same time could be exploited to monitor response to therapy [79].
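
As an illustration of how shared clonotypes between two compartments can be quantified, the sketch below computes the set of shared clonotypes and a Jaccard overlap index. It is a simplified example rather than the method of the cited study: a clonotype is assumed here to be a (V gene, CDR3 amino-acid sequence) pair, and all sequences are invented placeholders.

```python
def shared_clonotypes(repertoire_a, repertoire_b):
    """Return the clonotypes present in both repertoires."""
    return set(repertoire_a) & set(repertoire_b)


def jaccard_overlap(repertoire_a, repertoire_b):
    """Shared clonotypes divided by all distinct clonotypes (0 = disjoint, 1 = identical)."""
    a, b = set(repertoire_a), set(repertoire_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical clonotypes as (V gene, CDR3 amino-acid sequence); placeholders only.
blood_ia_tregs = [
    ("TRBV5-1", "CASSLGQGNTEAFF"),
    ("TRBV7-2", "CASSPGTSYEQYF"),
    ("TRBV20-1", "CSARDRGNEQFF"),
]
synovial_tregs = [
    ("TRBV5-1", "CASSLGQGNTEAFF"),
    ("TRBV7-2", "CASSPGTSYEQYF"),
    ("TRBV12-3", "CASSFSGANVLTF"),
]

print(sorted(shared_clonotypes(blood_ia_tregs, synovial_tregs)))
print(jaccard_overlap(blood_ia_tregs, synovial_tregs))
```

A non-zero overlap between blood and synovial repertoires is the kind of signal that would be consistent with recirculation, although in practice sampling depth and sequencing error need to be taken into account.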

#### *2.2. Unraveling the Epigenetic Mechanisms Governing Tregs Links Molecular Traits to Pathogenicity*

Mapping epigenetic changes across many genes is the aim of another multi-omic tool, termed epigenomics. Gene expression is driven by promoters, enhancers, insulators, etc. Epigenetic regulation of enhancers via histone modifications, which reveals gene regulation, has been used in IMIDs research [80]. Epigenomic approaches are often combined with transcriptomics to uncover context-specific gene regulation, as changes noted at the mRNA level are expected to be reflected at the epigenetic level as well [67]. ChIP-seq was performed to profile histone modification marks that indicate transcriptionally active enhancers (acetylation of lysine 27 on histone H3 (H3K27ac) and monomethylation of lysine 4 on histone H3 (H3K4me1)) using Tregs obtained from the synovial fluid versus peripheral blood of individuals with JIA. The study validated the Th1-like profile of synovial fluid Tregs that was observed from RNA-seq data [66,67]. Specifically, ChIP-seq identified super-enhancers of genes that were found upregulated at the mRNA level, such as *TBX21* and *IL12RB2*, as well as super-enhancers associated with putative Treg markers, indicating that Tregs in the inflammatory environment of arthritic joints are adapted to a Th1-related profile while maintaining Treg-specific features [67]. The same study uncovered vitamin D receptor (VDR) as one of the top predicted regulators of Treg differentiation in the arthritic joints, marking it as an attractive therapeutic target. Ex vivo stimulation with vitamin D3 skewed Tregs towards an effector Treg profile [67].

Similar epigenetic profiling was performed in peripheral blood-Tregs in individuals with T1D versus healthy controls [81]. ChIP-seq and subsequent sophisticated in silico analysis revealed that (a) T1D-Tregs have fewer active enhancers compared to healthy Tregs, many of which regulate genes implicated in T1D pathogenesis, and (b) certain single nucleotide polymorphisms (SNPs) in enhancer regions disrupt the binding of key transcription factors that regulate transcriptome changes in T1D-Tregs [81]. Similar studies that translate, via multi-omics approaches, non-coding genetic variants to functional/pathological states of Tregs are needed for the prediction and understanding of IMIDs.

ChIP-seq, along with ATAC-seq (which determines chromatin accessibility) and RNA-seq, has also been used to highlight Tregs' contribution to the development of IMIDs at large [82–86]. Epigenetic profiling of Tregs from the peripheral blood of healthy individuals revealed that autoimmune disease-associated SNPs are enriched in hypomethylated regions of naïve Tregs that control transcription and epigenetic changes, and hence Treg function [86]. A recent study further supports the functional relevance of SNPs by showing that immune disease variants reside in chromosomal loci involved in Treg cell activation and IL-2 signaling [83]. In general, genetic variants associated with immune diseases are found enriched in regulatory regions of Tregs [82,84,85]. Multi-omics approaches combined with genome-wide association studies (GWAS) pave the way for the understanding of Tregs' contribution to IMIDs and the discovery of new Treg-specific therapeutic targets.
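
One simple way to ask whether disease-associated SNPs are over-represented in Treg regulatory regions is a one-sided hypergeometric test. The sketch below is an illustrative simplification and not the procedure used in the cited studies, which typically also account for linkage disequilibrium and matched background variants; all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k, N, K, n):
    """P(X >= k) when drawing n SNPs without replacement from N background SNPs,
    K of which lie in regulatory regions (one-sided enrichment p-value)."""
    upper = min(K, n)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


# Hypothetical counts (assumed for illustration only).
N_background = 1000   # all SNPs considered
K_regulatory = 100    # background SNPs overlapping Treg regulatory regions
n_disease = 50        # disease-associated SNPs
k_overlap = 15        # disease-associated SNPs in regulatory regions (5 expected by chance)

p_value = hypergeom_enrichment_p(k_overlap, N_background, K_regulatory, n_disease)
print(f"enrichment p-value: {p_value:.2e}")
```

A small p-value indicates that the overlap is larger than expected if disease-associated SNPs were scattered at random with respect to the regulatory annotation.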

#### *2.3. Proteomic Studies Shed Light on Distinct Treg Subsets with Opposing Functions*

Proteomic analyses have helped elucidate the mechanisms of inflammation-mediated pathology. They have also long been considered a valuable platform for the identification of autoimmune disease biomarkers for diagnostic and prognostic purposes in accessible biological fluids. In the new multi-omics era, approaches combining cell-type-based proteomics with transcriptomics could foster the characterization of disease-specific Treg subtypes which may serve as biomarkers for disease initiation or progression. However, to date, only one study has focused on the proteomic profiling of Tregs in the context of IMIDs.

In this study, Weerakoon et al. employed proteomics in sorted Tregs (CD4+CD25highCD127<sup>−</sup>) and conventional CD4+ T cells (CD4+CD25−) from the peripheral blood of IBD patients. Their analysis pinpointed the absence or presence of integrin CD49f as a marker that distinguishes conventional T cells from Tregs. However, CD49f expression was also variable among Tregs, and could separate two Treg subsets with distinct functions in the peripheral blood of IBD patients. CD49f<sup>−</sup> Tregs show increased suppressive ability and expression of inhibitory receptors, whereas CD49f<sup>high</sup> Tregs possess a proinflammatory phenotype and are increased in the blood of IBD patients with active disease. The authors also suggest that the ratio of CD49f<sup>high</sup> to CD49f<sup>−</sup> Tregs may constitute a useful predictor of disease activity, but this result should be validated in larger cohorts of patients [87]. Still, more studies in the field of Treg proteomics in IMIDs are clearly needed in order to disentangle the protein profile of these cells and identify novel Treg-specific markers and potential therapeutic targets. Furthermore, following the road paved by single-cell transcriptomics, newly developed single-cell proteomic platforms have the potential to uncover additional layers of Tregs' complexity in the setting of IMIDs.

#### *2.4. Microbiome-16S-Sequencing at the Crossroads between Tregs and Microbiota, Leading the Way to Microbiota-Related Therapeutic Interventions*

Commensal microbes colonize barrier sites where they are essential for immune homeostasis, predominantly by modulating the generation of Treg cells. The advancements in 16S rRNA and metagenomics sequencing technologies have shed light on the composition and function of the human microbiome, as well as its direct role in modulating immune responses through its components or metabolites [88]. Increasing evidence suggests that gut dysbiosis is implicated in many IMIDs including SLE [89,90], RA [91,92], IBD [93,94], T1D [95], Graves' disease [96] and MS [97,98], and it is characterized by a reduction in short-chain fatty acid (SCFA)-producing species. Given the importance of Treg cells in establishing immune tolerance to self-antigens and commensal microbes, researchers' attention has now shifted towards Treg–microbiota interactions in autoimmune disorders, which may underpin the decreased numbers and/or dysfunction of Tregs in these conditions. Specifically, it has been shown that in mice, the SCFA butyrate promotes the induction of Treg cells, and that treatment of naïve T cells with butyrate enhanced histone H3 acetylation in the promoter and conserved non-coding sequence regions of the *FOXP3* locus, leading to differentiation into Treg cells [99]. These unique effects of butyrate on Treg cells could provide protection from diabetes in NOD mice fed with a diet that generates large amounts of butyrate after colonic fermentation [100].

Another study showed that microbial species found in fecal samples of SLE patients induced a pro-inflammatory immune phenotype characterized by lymphocyte activation and Th17 differentiation. Interestingly, supplementation of SLE stool samples with Treg-inducing bacteria could restore Treg/Th17/Th1 imbalance [90]. Furthermore, long-term propionic acid supplementation in MS patients could reduce the annual relapse rate and ameliorate disease progression by increasing Treg cell numbers and suppressive function [98]. Thus, further exploring the crosstalk between Tregs and microbiota by integrating information from different high-throughput technologies (single-cell, metabolomics) will facilitate the development of therapeutic interventions that restore immunological tolerance through manipulation of the microbiome.

Key observations by studies employing transcriptomic, proteomic and epigenomic approaches have provided insight into Treg cells' function and contribution to the pathogenesis of numerous IMIDs (Figure 1). In the single-cell era, multi-omics approaches are indispensable for understanding the perplexing mechanisms that underlie Treg cell biology in IMIDs, the elucidation of which can lead to specific and effective therapeutic regimes.

**Figure 1.** Multi-omics approaches utilized in IMIDs research, focusing on regulatory T cells. The pie chart depicts omics technologies that have been used to study the contribution of regulatory T cells to the pathology of IMIDs. Predominantly RNA-seq but also proteomic and epigenomic technologies have revealed Treg profiles that are suppressive, pro-inflammatory, or metabolically reprogrammed, as well as distinct Treg subsets across various diseases. scRNA-seq, single-cell RNA-seq; ChIP-seq, Chromatin Immunoprecipitation sequencing; TCR-seq, T Cell Receptor sequencing; GVHD, Graft Versus Host Disease; JIA, Juvenile Idiopathic Arthritis; SpA, Spondyloarthritis; MS, Multiple Sclerosis; RA, Rheumatoid Arthritis; SLE, Systemic Lupus Erythematosus; irAEs, immune-related Adverse Events; APS-1, Autoimmune Polyendocrine Syndrome Type I; T1D, Type I Diabetes; IBD, Inflammatory Bowel Disease.

#### **3. Dendritic Cells as Multifaceted Orchestrators of Immune Responses**

Ever since their initial discovery by Steinman and Cohn [101], DCs have grown from simply being viewed as highly motile stellate cells to being recognized as an essential connective link between the innate and adaptive arms of immunity in mammals. DCs constantly sample their microenvironment by engulfing self or non-self antigenic molecules. By possessing a large array of surface and intracellular receptors, they integrate the context in which these molecules are encountered, and thus determine whether they are associated with invading pathogens or damaged cells, or constitute innocuous antigens. After antigen processing, DCs present peptides to T cells, thereby activating them in an antigen-specific way. Most importantly, the induced T cell activation is polarized accordingly, through the production of cytokines and provision of specific costimulatory signals, in order to ensure either sufficient protection against the pathogen encountered, or establishment and maintenance of tolerance against self and innocuous antigens [102–104]. This has earned them the title of orchestrators of immune responses.

The multifaceted role of DCs in immune responses is a derivative of their heterogeneity. Notably, the DC term functions as an umbrella that encloses several cell subsets, each possessing distinct developmental requirements, phenotype and functional properties [102,105]. While DCs have initially been studied more extensively in mice, with the help of multi-omics approaches, recent publications have elegantly dissected the human DC compartment, elucidating in parallel a high interspecies conservation of their development, phenotype and function [105–107]. Among DCs, two main distinct lineages can be distinguished, namely conventional DCs (cDCs) and plasmacytoid DCs (pDCs).

In both mice and humans, pDCs have a prominent role in anti-viral defense due to their ability to secrete copious amounts of type I interferons (IFN) in response to virally derived nucleic acids [108]. The efficiency of pDCs in antigen presentation and T cell activation is still not clearly defined due to conflicting findings between different experimental settings [109–112]. While their exact developmental trajectory has also been a highly debated topic in recent years [113–115], the consensus is that their differentiation is dependent on the transcription factor E2-2 in both species [107,108]. In contrast, their major defining phenotypic markers seem to be less well conserved. Despite MHC-II/HLA-DR expression being a common trait, murine pDCs are characterized as B220+, SiglecH+, CD317+, Ly6C+, CD11c<sup>intermediate</sup>, while in humans characteristic pDC markers are CD123, CD303 and CD304, combined with a lack of CD11c and CD5 expression [107,108,116].

cDCs excel in the activation of adaptive immune responses by presenting antigens to T cells [105]. They are further divided into cDC1 and cDC2 and exhibit a remarkable division of labor when it comes to their role in immune responses [105]. Both cDC subsets are characterized by the expression of CD11c and MHC-II/HLA-DR but differ in their dependence on transcription factors and in the expression of other surface markers. Continuous and high expression of the transcription factors IRF8 and BATF3 is a prerequisite for maintaining the developmental and functional program of both human and murine cDC1 [106,117–119]. Genetic approaches have additionally elucidated the role of ID2 [120] and NFIL3 [121,122] in mouse cDC1 development; however, their involvement in humans has yet to be determined. In terms of their phenotype, murine cDC1 can be reliably identified across tissues by the expression of XCR-1, CLEC9A, CD24 and CD205 [105]. Moreover, CD8α and CD103 are used as cDC1 characteristic markers in lymphoid and non-lymphoid tissues, respectively, despite the latter also being expressed in an intestinal cDC2 population [105]. In addition to XCR-1 and CLEC9A, human cDC1 in both blood and non-lymphoid tissues have characteristic expression of CD141 and CADM1 [107,116]. Functionally, cDC1 play a dominant role in inducing cytotoxic CD8+ T and Th1-polarized CD4+ T cell responses against intracellular pathogens, such as viruses and bacteria, but also play a major part in antitumor immunity [105]. They do so by producing ample amounts of IL-12, which activates T cells both directly and indirectly by promoting a Th1-favorable cytokine milieu from bystander cells [119,123,124]. In addition, their remarkable potential as CD8+ T cell activators is extended by their ability to cross-present extracellular antigens on MHC-I molecules [119,123,124]. In contrast to the pro-inflammatory role described above, a high potential of cDC1 to induce peripheral regulatory T cells has also been proposed, especially in mice [125,126].

In contrast to pDCs and cDC1 subtypes, the phenotype and developmental requirements of cDC2 between humans and mice seem to overlap the least. In mice, studies have identified the transcription factors IRF-4, ZEB2, KLF4 and RELB as central mediators of cDC2 development [102,105,122], as well as pathways with a more tissue-specific context such as NOTCH and retinoic acid signaling [127]. While human cDC2 distinctively express IRF-4, its role in their development is not yet elucidated. Characteristic murine cDC2 surface markers include CD11b, CD172a, CD4 and CLEC4A4 [105], of which only CD172a is a common defining marker with their human counterparts. The latter are additionally identified by their expression of CD1c, FcεR1α and CLEC10A [107,116]. Functionally, human and murine cDC2 align and are believed to be more efficient in inducing CD4+ T cell activation and polarization towards Tfh, Th2 or Th17 effector responses, which are crucial for T cell-dependent antibody production by B cells, defense against multicellular pathogens such as helminths, and defense against extracellular bacteria and fungi, respectively [128–135]. Their CD4+ T cell activation pattern also extends to regulatory directions via the induction of Tregs both in the thymus and in peripheral tissues [136,137]. Remarkably, cDC2 have been found to exhibit the highest intra-subset diversity compared to pDCs and cDC1. This ever-growing heterogeneity has been studied in detail in mice [138–140]; however, it has only recently been appreciated in humans.

In addition to pDCs and cDCs, a new cell subset termed transitional DCs has quite recently been identified in both humans and mice [141]. As implied by their name, these cells are placed in between the two aforementioned populations in the DC spectrum and have been described to possess shared pDC and cDC properties. Nonetheless, their exact function is yet to be defined and needs to be investigated further.

Many studies focusing on DCs, especially in humans, use peripheral blood monocytes as a source to generate them in vitro. While not ontogenetically related to pDCs and cDCs, these monocyte-derived DCs (moDCs) have been used extensively due to the existence of established protocols for their generation, the enhanced availability of monocytes in the peripheral blood and their implementation in clinical practice [107,142]. Similar protocols exist in mice; however, bone marrow rather than peripheral blood is the selected source to generate such cells [107]. Data suggest that mature in vitro-differentiated moDCs likely align with monocyte-derived cells arising under inflammatory conditions in vivo [107,143]. The latter cells are characterized by the expression of CD11c, MHC-II/HLA-DR, CD14, CD64, CD11b, CCR2, CD209 and CD206 in mice and humans, with Ly6C positivity being an extra distinctive phenotypic trait of the murine cells [143]. In vivo generated moDCs have a profound pro-inflammatory potential and functional specialization, primarily connected with direct anti-microbial effector function, as evidenced by the fact that they were first described in mice infected with *L. monocytogenes* [144]. Their T cell activation potential in most cases does not match that of cDCs; however, it is non-redundant for the clearance of some pathogens requiring strong Th1 immunity [145]. As expected, their pro-inflammatory role can function as a double-edged sword, since these cells have been postulated to enhance many IMIDs manifestations [107,143].

#### *3.1. Elucidating the Role of Dendritic Cells in IMIDs Utilizing Multi-Omics Approaches*

Given their role in maintaining the balance between protective immune responses and self-tolerance, DCs play a critical part in IMID manifestations in which this balance is by default perturbed. Their detailed role has been extensively reviewed elsewhere [7], and in brief entails the dysregulation of one or more of the following functional properties: (a) perturbation in the pattern of secreted cytokines, quantitatively and qualitatively, that promote pro-inflammatory responses from other innate and adaptive immune system cells; (b) enhanced antigen presentation of primarily self-antigens; and (c) altered distribution in terms of both frequency and spatial arrangement, often related to differences in their migratory capacity, that affects especially the inflamed tissues but also peripheral blood. Here, we aim to report cases in which the role of DCs in IMIDs has been refined or enriched by the advent of recent omics approaches (Figure 2).

**Figure 2.** Multi-omics approaches utilized in IMIDs research focusing on dendritic cells. The pie chart depicts omics technologies that have been used to study the contribution of dendritic cells to the pathology of IMIDs. Mainly scRNA-seq but also proteomic and metabolomic studies have highlighted dendritic cell subsets and inflammatory signatures that drive pathogenic responses in the disease spectrum of IMIDs. scRNA-seq, single-cell RNA-sequencing; RA, Rheumatoid Arthritis; SLE, Systemic Lupus Erythematosus; T1D, Type I Diabetes; AD, Atopic Dermatitis; PsO, Psoriasis; SSc, Systemic sclerosis.

#### *3.2. Bulk and Single-Cell RNA Sequencing Have Expanded the Portfolio of DC Subsets and Illuminated Their Role in IMIDs Perturbations*

One remarkable advantage of multi-omics approaches is their potential for single-cell resolution. This was made apparent especially for human cDC2, as recent studies identified novel subsets within the CD1c<sup>+</sup> cDC2 population using scRNA-seq coupled with index sorting [146,147]. The subdivision of these new subpopulations, namely DC2 (CD5+/−CD163−CD14−) and DC3 (CD5−CD163+CD14+/−), based on their immunophenotype was also found to be accompanied by functional differences [147,148]. In the context of IMIDs, CD163+ DC3s were found to be expanded in the blood of SLE patients and presented a highly activated phenotype compared to healthy controls. Interestingly, their frequency in blood was highly correlated with clinical scores. Secretome analysis showed that, among cDC2 subsets, DC3s uniquely produced many pro-inflammatory mediators when activated by the serum of SLE patients [147]. Given the above, it would be intriguing to investigate the performance of these cells as disease biomarkers and establish whether manipulating their function could ameliorate disease progression. Additionally, their role in other IMIDs such as RA and Psoriatic Arthritis (PsA) warrants further investigation due to their increased potential for induction of IL-17A-producing T cells [147].

DC3s were also found to be selectively expanded among cDC2, as assessed by scRNA-seq of peripheral blood mononuclear cells (PBMCs) from pediatric SLE (cSLE) patients compared to age-matched healthy individuals [149]. Interestingly, the most expanded cDC cluster, surpassing even DC3s, resembled the AXL+ DCs first identified by Villani et al. [146]. Additionally, although pDCs were found decreased as a total population in cSLE samples, further analysis revealed four distinct subclusters, one of which was profoundly expanded in SLE compared to healthy individuals. Notably, the defining markers of this expanded pDC subcluster consisted primarily of interferon-induced genes, accompanied by genes connected to transcription factors (e.g., *STAT1, IRF7*) and antigen presentation (e.g., *CD74, HLA-DRA, CTSB*) [149]. The latter could point towards a yet unexplored role of these cells in propagating the disease through activation of autoreactive T cells. In line with their initial placement at the border between cDCs and pDCs, AXL<sup>+</sup> DCs together with the expanded pDC subcluster were found to be among the PBMC clusters contributing the most to the SLE IFN signature. The above study went a step further by comparing the pediatric samples side by side with corresponding samples from adults, highlighting age as another contributor to the fluctuation of disease-specific subclusters. Staying with pDCs and SLE, Hjorton et al. investigated the cellular source of type III IFNs, a cytokine group whose contribution to the SLE IFN signature and disease progression remains poorly studied. To this end, they isolated pDCs from healthy donors, stimulated them with a mix containing RNA immunocomplexes (widely used as IFN inducers in these cells) plus IFN2ab and IL-3, and subjected them to scRNA-seq [150]. Unexpectedly, they found that a small population of the sequenced pDCs accounted for most of the detected transcripts of both type III and type I IFNs. Compared to the non-IFN III-producing cells, the identified small pDC cluster was also characterized by higher mRNA levels of genes connected to immune activation such as *TNF, CD40*, *CD83* and *IL12A*. While not explored by the authors, it would be interesting to speculate as to whether their identified pDC subcluster aligns with the expanded one mentioned by Belaid et al. [149], as in both cases its frequency among healthy donor pDCs was minimal. Thus, the importance of single-cell resolution in identifying disease- and age-relevant cell populations was once more underscored.

In most cases, peripheral blood has been used as a mirror to study DC properties in IMIDs; however, analyses of inflamed tissues are equally or even more important, as suggested by the expected effect of tissue microenvironments on DC transcriptional and functional signatures [151,152]. This site-specific analysis has been bolstered by recent omics advances, since their high-throughput performance decreases the cell numbers required to conduct meaningful experiments. As an example, Caravan and colleagues studied the impact of the synovial microenvironment on cDCs from RA patients [153,154]. Using multiparameter flow cytometry and RNA sequencing, they found not only that cDCs were enriched in the synovial tissue of RA patients, compared to the blood of the same individuals as well as that of healthy controls, but also that they exhibited a highly activated phenotype as assessed by expression of costimulatory molecules [153,154]. Regarding the CD1c+ cDC2, the synovial microenvironment was shown to induce metabolic alterations, polarizing them to a more glycolytic phenotype, while a more detailed analysis was performed for CD141<sup>+</sup> cDC1. For the latter, the hypoxic synovium was shown to specifically induce the expression of TREM-1 as part of a site- and disease-specific signature. Interestingly, in vitro crosslinking of TREM-1 in cDC1 isolated from synovial tissue could induce their activated phenotype in parallel to an increased ability to induce pro-inflammatory cytokine production from heterologous and autologous T cells [153]. Additionally, supernatants from these cDC1-T cell co-cultures could activate synovial fibroblasts to produce an array of soluble mediators consistent with the acquisition of an invasive phenotype. The authors concluded that the discovered synovium-specific signatures could be harnessed in order to design novel cDC-targeted therapeutic strategies, with TREM-1 being a frontline example.

Omics analysis targeted to the inflamed tissue is additionally essential for another IMID, namely AD. Two recent studies have attempted to interrogate the immune and non-immune skin compartments of patients with AD and healthy controls using scRNA-seq [155,156]. Both studies found that DC populations were expanded in the pathogenic samples in relation to healthy skin, with cDC2 probably being the most over-represented population, judging by the characteristic expression pattern of surface markers. Alongside the "typical" cDC populations, another smaller cluster expressing *CCR7* and *LAMP3* was identified. Despite their small numbers, these cells exhibited some very interesting traits, such as clear characteristics of mature and migratory behavior and selective enrichment in the lesional skin of AD patients combined with their almost complete absence from healthy samples [155,156]. He et al. also found that these LAMP3+CCR7+ DCs robustly expressed type 2 chemokines such as *CCL17* and *CCL22*. These data were corroborated by the fact that T cell populations with Th2 and Th22 polarization states were additionally enriched in the AD skin samples, opening the possibility that DCs are the major innate immune cell attracting these pathogenic T cells to the site of inflammation. Notably, these type 2 chemokines have already been used as reliable biomarkers to measure disease progression and response to therapy [157]; however, the source cells were not clearly defined. Moreover, Rojahn et al. reported that, in addition to type 2 chemokines, myeloid cells, including DCs, produced amphiregulin in the lesional skin, which can activate keratinocytes and thus worsen the clinical manifestations of AD [156]. Collectively, the above studies could be the starting point of further investigations into whether DCs are a major source of the above soluble factors and, if so, whether targeting them could yield better therapeutic interventions and/or more accurate biomarkers.

Similarly focusing on IMIDs with skin-related pathological manifestations, Kim and colleagues [158] interrogated the immune compartment of skin biopsies from patients with psoriasis as compared to healthy volunteers. To cope with the inherent issues introduced by enzymatic digestion of the skin as well as with the low leukocyte frequencies in that tissue, they implemented a novel approach by profiling with scRNA-seq the cells naturally emigrating from skin biopsies over the course of 48 h. In line with the studies above, they found DCs to be greatly expanded in the samples of psoriatic patients, with a reported increase in their numbers of over three-fold compared to healthy skin [158]. Interestingly, they identified DCs with both a mature and a semi-mature phenotype. Semi-mature DCs, in both sample groups, were found to express genes encoding IL-10 and CD141. While the authors did not elaborate further on this, the description could fit skin-resident cDC1 and at the same time render these cells potential targets of therapeutic approaches aimed at re-establishing tolerance. Mature DCs, on the other hand, had higher expression of genes related to antigen presentation machinery and costimulatory signals, a signature that was further reinforced in psoriasis samples. A defining marker of these mature DCs was *LAMP3*, highlighting, in conjunction with the above studies in AD, that the same DC populations can have disease-promoting roles in a broad spectrum of IMIDs. Additionally, in psoriatic samples mature DCs expressed considerably more *IL-23A*, a cytokine related to the establishment of a pathogenic Th17 profile, while at the same time having markedly lower expression of *KYNU*, an enzyme participating in the kynurenine pathway known for its immunomodulatory role. Going a step further, by using a computational algorithm to simulate cell-to-cell communication events, the authors were able to show that the increased *IL-23A* production by mature DCs in psoriatic skin would signal to *IL-17F*-producing Th17 cells, shown to express the highest amount of the cognate receptor. Interestingly, these IL-17F+ cells were the largest subset of IL-17-producing T cells in psoriasis samples, therefore suggesting that their expansion and pathogenic function derive from the pro-inflammatory secretory behavior of mature DCs.
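
The general idea behind computational inference of cell-to-cell communication can be sketched as follows. This is a deliberately simplified illustration, not the algorithm used in the cited study: the interaction score is assumed to be the product of the mean ligand expression in the sender cluster and the mean receptor expression in the receiver cluster, the cluster names are hypothetical, and the expression values are invented. *IL23R* is used as the receptor gene for the *IL23A* ligand.

```python
from statistics import mean


def interaction_score(expression, sender, receiver, ligand, receptor):
    """Mean ligand expression in the sender cluster times mean receptor
    expression in the receiver cluster (a common, simplified scoring scheme)."""
    return mean(expression[sender][ligand]) * mean(expression[receiver][receptor])


def rank_receivers(expression, sender, receivers, ligand, receptor):
    """Rank candidate receiver clusters by interaction score, highest first."""
    scores = {
        r: interaction_score(expression, sender, r, ligand, receptor)
        for r in receivers
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-cell normalized expression values: cluster -> gene -> list of cells.
expression = {
    "mature_DC": {"IL23A": [2.0, 4.0]},
    "IL17F_T": {"IL23R": [1.0, 3.0]},
    "IL17A_T": {"IL23R": [0.5, 0.5]},
}

ranking = rank_receivers(
    expression, "mature_DC", ["IL17F_T", "IL17A_T"], "IL23A", "IL23R"
)
print(ranking)
```

Published tools typically add permutation-based significance testing and curated ligand-receptor databases on top of a score of this kind, so the ranking above should be read only as a conceptual illustration.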

#### *3.3. Contribution of Proteomics in the Identification of DC-Presented Epitopes in IMIDs*

As mentioned above, a prominent DC function lies in the processing and presentation of autoantigens to autoreactive T cells. However, a pertinent question is whether specific epitopes dominate and are preferentially presented, even in cases where a particular cell population is targeted. In the example of T1D, a recent study aimed to delineate the epitopes derived from pancreatic beta cells that are naturally processed and presented by DCs [159]. The authors isolated peripheral blood CD14+ monocytes from healthy donors and cultured them in vitro with GM-CSF and IL-4 in order to generate moDCs. These moDCs were then pulsed in vitro with various pancreatic islet autoantigens, and their "presentome" (the eluted epitopes presented on surface HLA-DR molecules) was analyzed with mass spectrometry. Their experimental setup also held augmented clinical relevance, as they selectively used moDCs from individuals possessing the alleles HLA-DR3 and HLA-DR4, which are associated with a high risk of disease emergence. Among their findings was the addition of new epitopes to those already characterized for some peptide autoantigens, as well as the discovery of some derived from pancreatic islet peptides for which epitope generation had not been previously reported. Interestingly, they were able to show that not all the discovered epitopes induce a pro-inflammatory reaction, as evidenced by the magnitude of the response and the IL-10 or IFN-α signature induced upon incubation of PBMCs from T1D patients with these epitopes. Importantly, the most immunodominant epitopes were generated by moDCs when compared to B cells, another immune system cell with antigen presentation capacity, in an HLA-DR allele-independent manner. Thus, such approaches could help increase the efficacy of peptide-based tolerogenic immunotherapies as well as their patient-specific tailoring by using epitopes preferentially presented by each HLA haplotype in moDCs.

#### *3.4. Bridging the Metabolic Profile and the Function of DCs in IMIDs: An Emerging Field of Research*

The relevance of metabolomics in IMIDs is constantly increasing; however, its targeted application in DCs has only recently gained attention. To that end, a recent study aimed to identify metabolic pathways exhibiting similar dysregulation in the circulation and DCs of patients with systemic sclerosis [160]. The authors first performed metabolite analysis using plasma of patients and healthy individuals and discovered evidence of imbalanced fatty acid and carnitine levels in systemic sclerosis samples. In line with this, they also found increased levels of L-carnitine in moDCs, derived from GM-CSF and IL-4 cultures of peripheral blood monocytes from systemic sclerosis patients, after their stimulation with TLR agonists. Building on these observations, the authors tested the effect of etoposide, a carnitine transporter inhibitor widely used for cancer treatment, on the activation of patient-derived moDCs after TLR stimulation and showed reduced secretion of pro-inflammatory cytokines such as IL-6 in its presence. As carnitine transports fatty acids into the mitochondria in order to facilitate their oxidation, the above observations suggest that targeted suppression of fatty acid oxidation in DCs could help decrease the inflammation related to this particular IMID.

#### **4. Conclusions, Challenges and Open Questions**

Multi-omics data can play a crucial role in clinical practice in the near future, for predicting disease susceptibility, disease severity and treatment response or identifying new therapeutic targets for IMIDs. However, we are still at the dawn of this exciting new era. Building large-scale patient cohorts with high-quality clinical data consisting of patient demographics, disease response and multiple layers of omics data, as well as refined analytic approaches to handle these data, would contribute to a better understanding of the mechanisms governing IMID biology and accelerate precision medicine.

Certain barriers need to be considered and overcome on the way to biomarker discovery and new targeted therapies for IMIDs. First of all, high-throughput analysis has so far been mostly restricted to total PBMCs of patients, with data extracted from distinct immune cell types being very limited. To analyze the genome, which is regarded as a stable feature of each individual, the use of an easily accessible tissue such as blood, and the analysis of whole PBMCs, is broadly acceptable. However, many other types of omics, such as the transcriptome, proteome and metabolome, vary between immune cell types and tissues. Given the high degree of complexity of the immune system, selective targeting of the specific immune cell populations that dictate the complex immune responses during IMIDs, such as DCs and Tregs, allows a deeper understanding of the mechanisms driving disease pathogenesis, with the prospect of identifying more precise therapeutic targets that avoid broad immunosuppression. Additional multi-omics data extracted from the analysis of Tregs and DCs specifically are needed to elucidate the degree of dysfunction rendering these cells pathogenic for IMIDs.

Secondly, the few existing studies utilizing diverse omics approaches to analyze DCs and Tregs in IMIDs are restricted to information extrapolated from a single omic level. A single omic data layer characterizes a specific biological process from one aspect. However, biological processes are based on interactions among genes, proteins, metabolites, etc., and are regulated by epigenetic modifications. Single biomolecules or signaling pathways cannot fully explain biological mechanisms or functions. To acquire a comprehensive picture of the intrinsic molecular mechanisms driving disease pathogenesis, a systematic collection of multi-omics data is required. The increasing availability of multi-omic platforms and layers poses new challenges in data analysis. Integration and common visualization of multi-omics data are fundamental in comprehending connections across diverse molecular layers and in fully utilizing the multi-omics resources available to make breakthroughs in biomarker and therapy discovery. Artificial intelligence (AI) and machine learning (ML) approaches are further required to identify clinically relevant biomarkers and biological processes that can be targeted for therapy. To achieve this vision, the interdisciplinary collaboration of biologists, computer scientists, mathematicians, and physicians is indispensable for precision medicine, which holds the promise of clinically meaningful benefits for the individual patient with IMIDs.
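As a minimal sketch of what the integration of omic layers can involve at its simplest, the following Python example standardizes each layer and concatenates the features of the samples that were profiled in every layer. The layer names, sample identifiers, feature values and the choice of z-scoring are illustrative assumptions and are not taken from the studies discussed in this review.

```python
from statistics import mean, pstdev


def zscore_layer(layer):
    """Standardize every feature of one omics layer across its samples.

    layer: {sample_id: {feature_name: value}}; all samples are assumed
    to carry the same features.
    """
    samples = sorted(layer)
    features = sorted(layer[samples[0]])
    scaled = {s: {} for s in samples}
    for f in features:
        column = [layer[s][f] for s in samples]
        mu, sd = mean(column), pstdev(column)
        for s in samples:
            # A constant feature carries no information; map it to 0.
            scaled[s][f] = 0.0 if sd == 0 else (layer[s][f] - mu) / sd
    return scaled


def integrate(layers):
    """Join layers on the samples they share, prefixing features by layer."""
    shared = set.intersection(*(set(layer) for layer in layers.values()))
    if not shared:
        return {}
    scaled = {
        name: zscore_layer({s: layer[s] for s in shared})
        for name, layer in layers.items()
    }
    return {
        s: {
            f"{name}:{feature}": value
            for name in sorted(scaled)
            for feature, value in scaled[name][s].items()
        }
        for s in sorted(shared)
    }


# Hypothetical toy data: patient P3 lacks a metabolome profile.
layers = {
    "transcriptome": {
        "P1": {"FOXP3": 2.0},
        "P2": {"FOXP3": 4.0},
        "P3": {"FOXP3": 6.0},
    },
    "metabolome": {
        "P1": {"carnitine": 10.0},
        "P2": {"carnitine": 30.0},
    },
}
integrated = integrate(layers)
print(integrated)
```

In this sketch, samples missing from any layer are dropped before standardization, so that the scaling reflects only the cohort that is actually analyzed jointly. Real integration methods go considerably further than this per-sample concatenation.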

Thirdly, the tissue (or source) where the immune cells to be studied are located is another critical aspect to be considered. Indeed, due to sample accessibility, fewer studies have been performed on tissues than on blood. Taking into account: (a) the diversity of IMIDs, each manifesting in different tissues of the body; and (b) the recently appreciated residency of DCs and Tregs in non-lymphoid tissues such as skin, adipose tissue, lung, bone marrow, etc., where they are able to control local inflammatory responses and express transcriptional programs distinct from those in peripheral blood or lymphoid organs, additional multi-omics studies investigating the role and function of distinct immune cells in diverse tissues are required, in order to acquire a more holistic view of the complexity of the mechanisms governing the development of IMIDs.

Finally, taking into consideration the dynamics, rapid responses and spatial particularity of the immune system, temporal and spatial omics studies will be meaningful in providing insights into the dynamic processes dictating the manifestation of IMIDs. For example, the process of antigen uptake, presentation, immunological synapse formation and cell-to-cell contact in the interplay of DCs and Tregs is highly dynamic and depends on the spatial position of immune cells, stroma and other non-immune counterparts. Despite the importance of these processes for the immune response and immune-mediated diseases, our current knowledge is only basic, which calls for more extensive research.

**Author Contributions:** D.K., N.E.P., E.N. and T.A. wrote the article and revised the manuscript. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was funded by the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (agreement no. 947975) and the Hellenic Foundation for Research and Innovation (H.F.R.I.), under the "2nd Call for H.F.R.I. Research Projects to support Post-Doctoral Researchers" (Project Number: 166).

**Acknowledgments:** The authors would like to thank Pegy Melissa for providing technical and administrative assistance and Dora Togia for financial and administrative management. Figures were partly created using BioRender.com.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Review* **Contribution of the Environment, Epigenetic Mechanisms and Non-Coding RNAs in Psoriasis**

**Charalabos Antonatos 1, Katerina Grafanaki 2, Paschalia Asmenoudi 1, Panagiotis Xiropotamos 1, Paraskevi Nani 1, Georgios K. Georgakilas 1,3, Sophia Georgiou 2 and Yiannis Vasilopoulos 1,\***


**Abstract:** Despite the increasing research and clinical interest in the predisposition to psoriasis, a chronic inflammatory skin disease, the multitude of genetic and environmental factors involved in its pathogenesis remains unclear. This complexity is further exacerbated by the several cell types that are implicated in the progression of psoriasis, including keratinocytes, melanocytes and various immune cell types. The observed interactions between the genetic substrate and the environment lead to epigenetic alterations that directly or indirectly affect gene expression. Changes in DNA methylation and histone modifications that alter DNA-binding site accessibility, as well as non-coding RNAs implicated in post-transcriptional regulation, are mechanisms that modify gene transcriptional activity and therefore affect the pathways involved in the pathogenesis of psoriasis. In this review, we summarize the research conducted on the environmental factors contributing to the disease onset, epigenetic modifications and non-coding RNAs exhibiting deregulation in psoriasis, and we further categorize them based on the cell types under study. We also assess the recent literature considering therapeutic applications targeting molecules that compromise the epigenome, as a way to suppress the inflammatory cutaneous cascade.

**Keywords:** epigenetics; psoriasis; methylation; histone; ncRNAs

#### **1. Introduction**

Psoriasis (PsO) is a chronic, inflammatory skin disease with its prevalence ranging from 1.83 to 5.32% in central European adults [1], with similar frequencies observed in white individuals in the United States [2]. PsO's manifestation lies in the epidermal keratinocytes (KCs), where the perturbation of inflammatory and cell-cycle-related pathways leads to their uncontrolled proliferation, aberrant differentiation and the development of distinctive, erythematous plaques on the skin surface [3]. Significant progress has been accomplished in characterizing the mechanisms involved in the pathogenesis of PsO, which are related to the activation of immune cell types and the maintenance of the chronic inflammation through the production by KCs of numerous signaling and chemotactic molecules [3,4]. Specifically, antimicrobial peptides (AMPs) produced by KCs and melanocyte auto-antigens, such as ADAMTSL5, in response to cell damage and altered microbial environment stimulate Toll-like Receptor (TLR) 9 and 8 signaling pathways in the plasmacytoid dendritic cells (pDCs) and myeloid dendritic cells (mDCs), respectively [5,6]. The pDCs are activated and type I IFN is produced, inducing both the maturation of mDCs and the secretion of proinflammatory cytokines, such as IL-12, IL-23 and tumor necrosis factor-alpha (TNFα), and consequently the expansion of T helper (Th) cells [7]. Both interleukins modulate the differentiation and proliferation of Th1 and Th17 cell subtypes [8], while TNFα is capable of enhancing the mitotic rhythm of KCs through the stimulation of cutaneous fibroblasts and production of the Keratinocyte Growth Factor (KGF) [9,10], as well as fostering leukocyte migration and T regulatory (Treg) cell suppression [11]. Specifically, the diverse role of TNFα in both the facilitation of the leukocyte migration in the cutaneous inflammation and the stimulation of KCs for the production of inflammatory cytokines has been exploited by the development of anti-TNFα drugs, displaying a high remission rate amongst PsO patients [3]. TNFα is secreted by the majority of the implicated cell types, including IFNγ-producing Th1 cells, which stimulate chemokine synthesis by the KCs [4] (Figure 1). The inflammatory cascade is amplified through the dysregulated IL-23/Th17 axis that plays a central role in the pathogenesis of the disease via the secretion of IL-17 and IL-22 [12]; IL-23 exhibits a lesional-specific increased expression profile, in contrast to other Th17-dependent inflammatory diseases, highlighting the accumulated cutaneous levels of Th17 as well as the role of IL-23 in the Th17 polarization [9]. The direct influence of IL-17 on the inflamed KCs via their increased proliferative activity is enhanced due to its secretion from the majority of the diverse cell types implicated in the disease pathogenesis; nevertheless, the IL-23-dependent IL-17 secretion through the activation of Th17 cells is widely demonstrated as a core pathogenetic mechanism and utilized as a therapeutic approach [12]. The inflammatory milieu is preserved by the abundant production of chemokines, AMPs and proinflammatory cytokines by the KCs [9] (Figure 1).

**Citation:** Antonatos, C.; Grafanaki, K.; Asmenoudi, P.; Xiropotamos, P.; Nani, P.; Georgakilas, G.K.; Georgiou, S.; Vasilopoulos, Y. Contribution of the Environment, Epigenetic Mechanisms and Non-Coding RNAs in Psoriasis. *Biomedicines* **2022**, *10*, 1934. https://doi.org/10.3390/biomedicines10081934

Academic Editor: Marianna Christodoulou

Received: 7 July 2022 Accepted: 8 August 2022 Published: 9 August 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

**Figure 1.** Overview of the inflammatory cascade observed in PsO, as well as key deregulated epigenetic factors and ncRNAs during the disease progression.

Disruption of such core mechanisms that regulate the immune response and cell proliferation is mediated through a multi-layered interaction between genetic and environmental factors. The largest genome-wide association analysis, conducted in 2017 by Tsoi et al., uncovered 63 associated genetic loci, mapped in genes that participate in the inflammatory cascade occurring in PsO, such as the adaptive immune response and differentiation of lymphocytes [13]. Nevertheless, the complexity of PsO development and progression is also attributed to environmental factors that aggravate the existing genetic predisposition. The continuous infiltration of the epidermal barrier by immunogenic stimuli, as well as smoking [14], diet [15] and sun exposure [16], significantly alter the epigenomic profile of the diverse collection of cell types involved in the pathogenesis of PsO. Epigenetic alterations are characterized as reversible chemical modifications in the structure of DNA that modify gene expression without affecting the genomic sequence [17]. DNA methylation and post-translational histone modifications contribute to transcriptional activity, whereas post-transcriptional regulation is performed by non-coding RNA molecules (ncRNAs) [18] (Figure 1). While epigenetic changes can normally be utilized as a tool to control gene expression throughout the developmental stages of a cell type, multiple studies have associated aberrant epigenetic changes with the pathogenesis of cancer [19] and cardiovascular and autoimmune diseases [20]. This review focuses on the research conducted in characterizing the contribution of environmental factors, epigenetic factors and ncRNAs to PsO onset, in the context of KCs and the various implicated immune cell types, as well as their potential clinical relevance as biomarkers and therapeutic targets.

#### **2. Microbiome and Environmental Risk Factors**

Since PsO is a multifactorial disease, genetic and environmental factors affect its onset and progression. The microbiome [21,22] is responsible for triggering adaptive and innate immune responses that extensively affect numerous immunomodulatory mechanisms (Figure 1). The connection between PsO and bacterial infections was established decades ago by providing evidence that streptococcal infection [23] can lead to PsO, and PsO patients can be distinguished from healthy individuals based on differences in their skin and gut microbiome [24]. Lifestyle factors such as diet, smoking and alcohol intake can further alter the gut microbiota composition, while metabolites derived from the latter can influence epigenetic modifying enzymes (Figure 1).

#### *2.1. Skin and Gut Microbiome*

Bacterial and fungal populations on the skin differ between healthy individuals and psoriatic patients; *Streptococcus*, *Staphylococcus* and *Malassezia* [25] are increased and *Propionibacterium* and *Corynebacterium* are decreased [22]. It is suggested that *Streptococcus*' M protein, which is highly homologous to type I keratins, can induce the expression of superantigens by T cells, further targeting KCs and causing chronic inflammation and proliferation of KCs [23]. Such superantigens are also produced by the Gram-positive *Staphylococcus aureus* through the secretion of pyrogenic exotoxins, leading to severe cutaneous inflammation. HLA-DR, expressed by KCs, binds these superantigens along with secreted TNFα, leading to inflammatory cascades [26]. Abundant fungal populations of *Malassezia* spp. [27] can disrupt the epidermal barrier by producing lipases and phospholipases, attracting polymorphonuclear leukocytes and causing local skin sensitization. Production of propionate and radical oxygenase by *Propionibacterium* [26] reduces the oxidative stress levels and prevents skin inflammation. Additionally, it can modulate Th17 cells to maintain immune homeostasis. Decreased populations of *Corynebacterium* are associated with the onset and exacerbation of PsO [28], as *Corynebacterium* possesses anti-inflammatory abilities by negatively regulating interferon signaling in pDCs [28,29].

The gut microbiome is associated with skin diseases via the intestinal barrier, inflammatory mediators, and metabolites. In general, PsO patients appear to have inadequate intestinal flora [30,31], which is characterized by a reduced population of *Bacteroides* and abundant *Actinobacteria* and *Firmicutes*. *Bacteroides* possess anti-inflammatory capabilities through the production of polysaccharide-A, which is able to activate Tregs, stimulate anti-inflammatory pathways (i.e., IL-10), and thus inhibit the maintenance of inflammation. *Firmicutes* and *Bacteroides* can decrease the level of short-chain fatty acids (SCFAs) that cause inflammation and increase the vulnerability of the intestinal barrier [32]. A disrupted barrier enables microbiota dysbiosis via the circulatory system inducing both local and systemic immune responses.

#### *2.2. Lifestyle*

It is widely accepted that the interaction of environmental and genetic factors via epigenetic modifications contributes to the onset of a wide spectrum of diseases, including PsO (Figure 1). Food and nutrient intake can lead to alterations in the composition of the gut microbiome, allowing differential growth [33] of certain populations that are associated with PsO. In parallel with microbiota alterations, several nutrients such as sulforaphane, curcumin [34] and omega-3 polyunsaturated fatty acids [35] can induce DNA methylation in leukocytes [36] and histone modifications by activating the epigenetic-related enzymes DNMTs, HDAC and HAT. Dietary habits confer a further added risk of developing PsO via the increased obesity prevalence amongst PsO cases, as described in numerous studies [37–39]. Meta-analysis of the leptin levels in patients with PsO confirmed the increased levels of the pro-inflammatory adipokine, a hormone that inhibits hunger and autoregulation of T cells, despite the increased between-study heterogeneity [40,41]. The implication of the adipose tissue in the inflammation through the secretion of both adipokines and the classical pro-inflammatory cytokines, such as TNFα and IL-6, establishes abdominal obesity as a risk factor for PsO, although the exact causal mechanisms are not fully clarified [42]. Smoking and alcohol can additionally enhance such psoriatic signals through a variety of mechanisms implicated in immunological disorders (i.e., KC hyperproliferation) due to the overexpression [43] of α5 integrin, cyclin D1, KGF receptor and pro-inflammatory cytokines. The epigenetic effect of tobacco is based on the induction of CpG island [44] methylation, decreased HDAC activity, increased histone methylation [45] levels and altered expression of non-coding RNAs [46,47].

Psychological factors as well as mood disorders, such as stress and depression, appear to play an important role in the onset and exacerbation of PsO. Stress is implicated in PsO pathogenesis through immune regulation and abnormal T cell activation. Indeed, patients with PsO have lower cortisol levels [48] when stressed. Moreover, cortisol, in addition to its anti-inflammatory effects, induces epigenetic changes such as DNA methylation [49] and histone modifications [50], and may affect the expression of ncRNAs [51]. The psychological burden stimulates the secretion of pro-inflammatory cytokines [52], including TNFα and IL-6, further strengthening the correlation between depression and inflammatory disorders. Specifically, PsO patients undergo significant social stigmatization due to the presence of the psoriatic plaques, leading to an increased risk of social anxiety and depression [53].

UV radiation, especially UVB, is used to treat psoriatic plaques, although in some cases exposure to low light UVA may trigger photosensitivity of the skin and cause inflammation by enabling the local infiltration of neutrophils and lymphocytes [54]. The therapeutic mechanism of UVB radiation is highlighted by the elevated serum levels of 25(OH) Vitamin D (the serum marker of vitamin D) in PsO patients undergoing UVB phototherapy [55]; Vitamin D binds to the Vitamin D receptor (VDR), exhibiting an immunomodulatory activity by decreasing IL-17 and IFNγ levels in Peripheral Blood Mononuclear Cells (PBMCs) [56], while Vitamin D deficiency is associated with perturbed differentiation and increased proliferation of KCs [57]. Epigenetic modifications related to Vitamin D show an anti-inflammatory and anti-proliferative profile [58], further strengthening the role of Vitamin D as an anti-inflammatory mediator.

#### **3. DNA Methylation**

DNA methylation is a well-studied epigenetic alteration mechanism mainly occurring in CpG islands localized on gene promoters. The addition of a methyl group to the cytosine's C5 position, forming a 5-methylcytosine (5mC), can significantly reduce the accessibility of Transcription Factor (TF) and RNA polymerase binding sites on the DNA helix, thus repressing the transcriptional activity. DNA methylation is catalyzed by the DNA methyltransferase enzymes (DNMTs), which consist of DNMT1, DNMT3a, DNMT3b and DNMT3L. DNMT1 prefers hemimethylated DNA and is characterized as a "maintenance DNMT" due to its repairing activity, while DNMT3L induces the de novo methyl group transfer activities catalyzed by both DNMT3a and DNMT3b [59,60]. Demethylation of 5mC is performed either by the ten-eleven translocation (TET) enzymes with the formation of the intermediate 5-hydroxymethylcytosine (5hmC), or by the deamination of 5mC and the utilization of the base excision repair (BER) pathway [60].

#### *3.1. DNA Methylation in KCs*

The role of KCs in PsO pathogenesis includes both the formation of the psoriatic plaques due to their increased proliferation and the maintenance of inflammation through their contribution to the inflammatory milieu and the production of multiple proinflammatory cytokines [61]. Thus, epigenetic modifications affecting the transcriptional activity of genes involved in these pathways may contribute to PsO pathogenesis.

The pro-inflammatory Ca2+-binding proteins S100A8 and S100A9 are members of the S100 family. The heterodimeric protein S100A8/A9 is actively released during inflammation, modulating its progression by stimulating leukocyte recruitment and inducing cytokine secretion [62]. These molecules are highly up-regulated in KCs and leukocytes of psoriatic skin and their expression is induced by IL-10 in differentiated human dendritic cells [63]. The methylation status of their gene promoters has been characterized by multiple whole-genome methylation analyses (Figure 1).

Roberson et al. were the first to observe global alterations of CpG methylation in skin from PsO patients compared to skin from healthy volunteers. They identified 674 hypermethylated and 444 hypomethylated CpG sites, which were mainly localized on gene promoters, unveiling a significant correlation between methylation and the expression of nearby genes such as *C10orf99*, *OAS2* and *KYNU* [64] (Figure 1). Furthermore, the methylation patterns were shown to be reversible to a non-psoriatic state after the administration of anti-TNFα therapy for one month. In the study by Zhang et al., whole-genome DNA methylation of psoriatic and non-psoriatic skin samples showed more hypermethylated regions (15,684) than hypomethylated ones (11,084). *PDCD5* and *TIMP2*, which induce KC proliferation, were hypermethylated and hypomethylated, respectively, exhibiting reversed expression levels [65]. Chandra et al. reported that 25% of differentially methylated CpGs were located at characterized PsO susceptibility (PSORS) loci, including PSORS2, PSORS4, PSORS6 and PSORS7, encoding several genes such as *S100A9*, *SELENBP1*, *CARD14*, *KAZN* and *PTPN22* with an inverse correlation between methylation and expression (Figure 1). Differentially methylated genes associated with histopathological aspects were also found, including *AIF1*, *FFAR2* and *TREM1*, which are implicated in neutrophil and leukocyte chemotactic events [66]. Hou et al. detected 96 hypermethylated genes, including MAPK signaling-related genes, such as *CACNA2D3* and *SRF*, and 234 hypomethylated genes participating in the increased angiogenesis of psoriatic lesions, namely, *NRP2*, *VEGF* and *VASH1* [67]. In PsO skin, Zhou et al. discovered nine differentially methylated sites near metabolism-related genes, including *CYP2S1*, *ECE1*, *EIF2C2*, *MAN1C1* and *DLGAP4*, whose methylation was negatively correlated with their expression. In the intergenic area surrounding *CYP21*, considerably low methylation has been observed [68].
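The two quantities reported by these studies, the direction of the methylation change at a CpG site and its correlation with the expression of a nearby gene, can be sketched as follows. This is a minimal illustration on invented values: the 0.1 cutoff on the difference in methylation fractions, the function names and the toy data are assumptions, and the cited studies relied on their own statistical pipelines.

```python
from statistics import mean


def classify_site(case_levels, control_levels, min_delta=0.1):
    """Label a CpG site from methylation fractions (0 to 1) in cases vs. controls.

    min_delta is a hypothetical cutoff on the difference in group means.
    """
    delta = mean(case_levels) - mean(control_levels)
    if delta >= min_delta:
        return "hypermethylated"
    if delta <= -min_delta:
        return "hypomethylated"
    return "unchanged"


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Invented promoter methylation and expression values for five samples.
promoter_methylation = [0.80, 0.70, 0.55, 0.30, 0.20]
gene_expression = [1.0, 1.5, 2.5, 4.0, 5.0]

status = classify_site([0.80, 0.70, 0.90], [0.30, 0.40, 0.20])
r = pearson(promoter_methylation, gene_expression)
print(status, round(r, 2))
```

A negative coefficient corresponds to the inverse relationship between promoter methylation and expression described above.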

There is ample evidence in the literature linking the methylation status and expression level of genes that are candidates for PsO pathogenesis. Bisulfite sequencing in skin lesions revealed that hypermethylation of the *p14ARF* promoter resulted in its downregulation [69]. The low expression of *SFRP4* in psoriatic skin, which is involved in the Wnt pathway and KC hyperproliferation, is correlated with its promoter's hypermethylation [70] (Figure 1). The Wnt pathway plays a key role in PsO by regulating the proliferation and differentiation of KCs [71]. Additional evidence exists regarding the non-malignant effect of specialized promoters' methylation status in a human malady. The promoter of the SHP-1 isoform was found to be hypomethylated in psoriatic lesions, indicating that the methylation of *SHP-1*'s promoter in PsO might be related to the STAT3's binding affinity, due to the upregulation of the latter in the lesional skin [72] (Figure 1). The same group also studied ID4, a protein that participates in cell proliferation, differentiation and apoptosis, as well as in tumorigenesis (cholangiocarcinoma, breast cancer, lymphoma). *ID4* promoter hypermethylation was linked with parakeratosis, which refers to deficient development of KCs, and skin-related cellular differentiation in PsO cases [73] (Figure 1). The promoter of the *p16INK4a* gene, which is involved in hyperproliferative skin diseases, was found to be hypermethylated in psoriatic skin [74]. Sheng et al. investigated the hypomethylation of *CYP2S1* and further identified the hypomethylation of two extra loci within the *CYP2S1* region, leading to its upregulation in psoriatic tissues [75] (Figure 1). Members of the growth arrest and DNA damage-inducible gene family, such as *GADD45a* and *GADD45b*, exhibit low expression in psoriatic lesional skin [76]. Specifically, the expression of *GADD45a*, which has a demethylase activity, was found to be positively correlated with IFN-γ and TNFα expression. Its depletion leads to hypermethylation of the *UCHL1* promoter in PsO cases [77]. The expression of *WIF1*, an inhibitor of Wnt signaling, is linked to its promoter's demethylation, which is a result of *DNMT1* silencing, while its hypermethylation is a consequence of *DNMT1* overexpression. Indirubin, a traditional medicine utilized for treating various inflammatory diseases, inhibits the expression of *DNMT1* and the methylation of the *WIF1* promoter, as well as the expression of Wnt-pathway core genes, such as *FZD2*, *FZD5*, and β-catenin [78].

#### *3.2. DNA Methylation in Immune Cells*

Despite the predominant role of KCs in the pathogenesis of cutaneous diseases, PsO, as an autoimmune disorder, is driven by the substantial activation of multiple immune cell types, which further stimulate KCs' proliferation through the secretion of pro-inflammatory cytokines. Moreover, PsO is considered to be a T-cell-mediated autoimmune disease, since the role of Th cells, as well as their secretome, has been extensively studied and targeted therapeutically [79]. Immune cells can be isolated either from lesional skin or PBMCs. The latter is a non-invasive approach and has therefore been established as the standard method for studying PsO-related immune cells.

PBMCs from psoriatic patients have been shown to exhibit aberrant DNA methylation, for example, hypermethylation of *p14ARF*, *MBD2*, *MeCP2* and hypomethylation of *DNMT1* [69] (Figure 1). Genome-wide DNA methylation profiling of CD4+ T cells unveiled the significantly hypermethylated promoters of immune-related X chromosome genes, such as *SLITRK4*, *EMD*, *ZIC3*, *CXorf40A*, *HDAC6*, *IKBKG*, *SH3KBP1*, *OTUD5*, *NDUFA1*, *WNK3* and *MSL3* [80]. Recent studies uncovered the hypermethylated profile of genes that are implicated in the TGFβ pathway, including *SNX25*, *STAD3* and *BRG1*, from whole blood samples of monozygotic twins [81] (Figure 1). Additional comparison of CD8+ T cells in monozygotic twins from psoriatic samples and healthy controls identified 110 hypermethylated and 224 hypomethylated loci. DNA methylation analyses of CD8+ T cells between PsO, psoriatic arthritis cases and healthy controls revealed numerous differences, indicating that DNA methylation screening in these cell subtypes could act as a potential diagnostic biomarker [82]. Furthermore, genome-wide DNA methylation profiling from peripheral whole blood showed that *FOXP3* is hypermethylated, leading to reduced Treg levels in patients with PsO [83] (Figure 1).

Bisulfite sequencing on a targeted gene panel revealed the low methylation levels of *p15* and *p21* promoters in hematopoietic stem cells; *p15*, *p21* and *p16* exhibit similar methylation patterns and play a well-established role in controlling the cell cycle [84,85] (Figure 1). Research conducted in CD4+ T cells of monozygotic twins with PsO showed the promoter hypomethylation of the transcription-regulator *ZNF99* gene. In CD8+ T cells associated with PsO, hypomethylation of the serine/threonine MAST3 and MTOR kinases and hypermethylation of the *PM20D1* peptidase gene were shown [86] (Figure 1).

#### **4. Histone Modifications**

The post-translational histone modification (PTM) process is an important mechanism of gene expression regulation since these proteins directly participate in DNA organization and accessibility. Briefly, nucleosomes are the fundamental subunit of chromatin and are composed of histone octamers. Each histone (H2A, H2B, H3 and H4) is represented twice in the nucleosome structure that forms a binding scaffold for 147 DNA base pairs [87]. PTMs usually occur in the overhanging N-terminal tails of histones, resulting in either enhanced transcriptional activity through nucleosome unwinding and euchromatin formation, or strengthened DNA-histone interactions that form heterochromatin and suppress gene expression. Several types of histone modifications, implicated in both gene silencing and enhanced transcription, have been described, including acetylation, methylation, phosphorylation, ADP ribosylation and sumoylation among others [88].

#### *4.1. Histone Modifications in KCs*

Epigenetic regulation of KCs is an essential part of chronic skin inflammation. Recent research demonstrated that decreased H3K9 dimethylation leads to increased IL-23 expression in KCs; H3K9me2 levels play a key role in regulating basal and TNF-induced IL-23A expression [89]. H3K27me3 and *EZH2*, a histone methyltransferase enzyme, were significantly enriched in cutaneous biopsies from individuals with PsO when compared to healthy controls. *EZH2* is implicated in cell proliferation and tumorigenesis [90] and thus its transcriptional silencing affects the proliferation and differentiation of KCs.

Sirtuins (SIRTs) are a family of NAD+-dependent deacetylases involved in cell apoptosis, gene transcription, tumor development, autoimmune inflammation and epigenetic modification processes. Specifically, *SIRT1* regulates inflammation-associated signaling pathways [91,92]. Hwang et al. showed that *HDAC-1* is overexpressed and *SIRT1* expression is decreased in skin biopsies of patients with PsO [93]. GLS1-mediated glutaminolysis induces proliferation of KCs in PsO and promotes Th17 and γδ T17 cell differentiation through the acetylation of H3 on the *Il17a* promoter [94]. The first whole-genome study of histone modifications showed that H3K27 is hyperacetylated in 60% of the overexpressed gene promoters in cutaneous lesions and in the binding sites of TFs overexpressed in lesional skin, such as *GRHL* [95] (Figure 1). *WT1*, a TF implicated in cell proliferation and apoptosis, is highly expressed in psoriatic skin lesions and has two binding sites in the IL-1β gene promoter. IL-1β is produced by KCs, and its gene promoter shows considerably high histone acetylation levels that correlate positively with expression of the histone acetyltransferase p300 (P300) [96]. H3K27 hyperacetylation of the *RPL22* promoter in PsO lesional skin leads to overexpression of *RPL22*, which is linked to CyclinD1 upregulation, inducing KC proliferation. *RPL22* also prevents KC apoptosis and is involved in CD4+ T cell chemotaxis [97].

#### *4.2. Histone Modifications in Immune Cells*

PBMCs from PsO patients exhibit higher H3K4 methylation levels when compared to healthy individuals. Responders and non-responders to biological therapies tend to have different H3K27 and H3K4 methylation profiles [98] (Figure 1). Th17 cell differentiation can be regulated by the TCR-induced H3K27 demethylase Jmjd3, which is overexpressed and decreases H3K27me3 levels. Jmjd3 controls chromatin accessibility of numerous Th17-related loci, such as Th17-specific gene promoters, and induces Th17 cell differentiation by decreasing H3K27me3 enrichment [99]. Zhang et al. reported decreased H4 acetylation in psoriatic PBMCs, downregulation of *P300*, *CBP* and *SIRT1*, as well as increased HDAC1 levels [100] (Figure 1).

#### **5. Non-Coding RNAs**

Non-coding RNAs (ncRNAs) are RNA molecules that are not translated into functional proteins. They are typically grouped into distinct families according to their size and function. Most ncRNA families play a key role in directly or indirectly affecting gene expression at the transcriptional and post-transcriptional level. The microRNA (miRNA) family consists of small (~18–23 nucleotides), single-stranded RNA molecules that are loaded on and guide the RNA-induced silencing complex (RISC) to mediate mRNA degradation and/or translation suppression [101]. Circular RNAs (circRNAs) form a continuous loop through the linkage of their 5′ and 3′ termini, which makes them stable, exonuclease-resistant RNA molecules with numerous roles in transcriptional and translational regulation [102]. Long non-coding RNAs (lncRNAs) are defined as transcripts longer than 200 nucleotides and have an emerging role in the pathogenesis of autoimmune diseases [91]. LncRNAs and circRNAs can regulate gene expression by participating in processes that alter chromatin conformation, forming triplexes with DNA, as well as interfering with transcription enzymes [103].

Due to the abundance of the ncRNAs implicated and studied in the context of PsO, we conducted an exhaustive literature search regarding the differential expression of ncRNAs in cutaneous biopsies, serum levels as well as PBMCs. We filtered the screened studies according to the importance of the ncRNAs under study, evaluated through their statistically significant differential expression in contrast to healthy controls as well as their identified, direct or indirect target genes. Non-coding RNAs screened by more than one study were also identified as important regulators of the psoriatic transcriptome.

#### *5.1. MiRNAs in KCs*

The immortalized nontumorigenic human epidermal (HaCaT) cell line and KCs have been widely used in PsO studies as they are easy to isolate and provide reliable results regarding the transcriptome and proteome profile of the disease [104]. Fibronectin 1 (*FN1*) and integrin subunit α9 (*ITGA9*) signaling pathways are both implicated in cell motility and are direct targets of miR-4516, which was found to be downregulated in cutaneous PsO biopsies (Figure 1). Specifically, Chowdhari et al. found significant overexpression of both *FN1* and *ITGA9* as well as STAT3 in PsO, which could be partly responsible for the activated state of KCs in lesional skin since it induces proliferation and terminal differentiation [105]. MiR-424 has also been investigated as a potential biomarker in PsO given its regulatory role in signaling pathways that orchestrate differentiation and cell cycle regulation in KCs [106] (Figure 1). Decreased miR-424 expression has been associated with the overexpression of *MEK1* and *CCNE1*, members of the metabolic pathways responsible for the abnormal KC proliferation observed in the clinical manifestation of the disease; however, the exact regulatory mechanism has yet to be defined. Another miRNA whose downregulation has been associated with PsO is miR-145, a molecule that has been widely studied in immune-mediated inflammatory diseases due to its inhibitory role in cell proliferation and immune responses [107] (Figure 1). It has been observed that miR-145 downregulation promotes proliferation and chemokine expression in the lesional skin, as it directly targets *MLK3*, which in turn regulates the STAT3 and NF-κB TFs. Increased expression of miR-21 has been associated with the epidermal downregulation of *TIMP-3*, leading to the activation of *TACE*, which subsequently induces TNFα overexpression and a psoriasis-like phenotype [108] (Figure 1).

MiR-200c, a miRNA involved in apoptosis and senescence of KCs, was found to be upregulated in lesional skin compared to non-lesional skin biopsies and healthy controls [109] (Figure 1). MiR-200c is also known to directly repress *SIRT1*, which has a key role in oxidative stress and the regulation of skin inflammation, as well as *eNOS* and *FOXO1*, which have a significant role in regulating the function and preservation of endothelial cells. MiR-200c expression also correlates positively with, without directly regulating, molecules involved in inflammation, such as *IL-6* and *COX-2*, and in plaque destabilization, such as *MMP-1* and *MMP-9*. Table 1 presents further deregulated miRNAs in KCs from PsO patients.


**Table 1.** Deregulated ncRNAs in PsO in keratinocytes.

Abbreviations: ncRNA, non-coding RNA; miR, microRNA; HEK, human embryonic kidney cell line; HaCaT, human epidermal keratinocyte cell line.

#### *5.2. MiRNAs in Immune Cells and Serum*

Fu et al. showed that miR-138, whose expression levels regulate the balance between Th1 and Th2 cells by targeting RUNX3, was downregulated in PBMCs of PsO patients [125] (Figure 1). RUNX3 is an important TF regulating cell proliferation and apoptosis, and its increased expression in PsO classifies it as a key gene for PsO susceptibility. MiR-143 downregulation exhibits a significant correlation with PsO severity; specifically, patients with stable disease stages showed higher miR-143 expression levels, while patients in progressive stages had lower expression levels [126] (Figure 1). *BCL2*, which is thought to be responsible for shortening the lifetime of cortical cells, has been proven a direct target of miR-143 [126]. Additionally, García-Rodríguez et al. showed that miR-155, which is upregulated in PsO plasma samples, is an important regulator of *SOCS1*, a susceptibility locus of PsO [127]. The miRNA-155/SOCS1 pathway is targeted in macrophages by vitamin D or vitamin D receptor (VDR) signaling to reduce the inflammatory response [128] (Figure 1). The upregulation of miR-210 in CD4+ T cells has also been found to significantly correlate with PsO onset and progression, mainly by targeting *FOXP3* [129] (Figure 1). *FOXP3* plays a central role in the development and diverse functionality of Treg cells, as it appears to facilitate their differentiation through genetic programming [129], thus establishing it as an important contributor to the pathogenesis of the disease. In normal CD4+ T cells, overexpression of miR-210 can indirectly contribute to the expression of inflammatory cytokines such as IFN-γ and IL-17 while suppressing other cytokines such as IL-10 and TGF-β, which are secreted by Tregs. Table 2 presents further deregulated miRNAs in PBMCs and plasma from PsO patients.


**Table 2.** Deregulated ncRNAs in PsO in immune cells and plasma.

Abbreviations: ncRNA, non-coding RNA; miR, microRNA; PBMCs, peripheral blood mononuclear cells.

#### *5.3. LncRNAs in PsO*

Despite the established regulatory role of lncRNAs, there is still limited evidence regarding their participation in PsO pathogenesis. The maternally expressed gene 3 (*MEG3*), a downregulated lncRNA in HaCaT cells, has an identified miR-21 binding site, thus acting as a sponge or decoy for miR-21. It is postulated that *MEG3* participates in the regulation of PsO KCs proliferation and apoptosis as well as the expression of *CASP8* through its interplay with miR-21 [135]. By utilizing a luciferase reporter assay, it was confirmed that MSX2P1, a lncRNA overexpressed in HaCaT and KCs, is a direct target of miR-6731, thus negatively affecting its function on other RNAs. Further research in IL-22-stimulated KCs revealed MSX2P1's indirect role in increasing the protein levels of S100A7, IL-23, NFκB, TNFα, IL-12β, HLA-C, and CCHCR [136]. Psoriasis-susceptibility-related RNA Gene Induced by Stress (PRINS) [137] and GAS5 [138] are lncRNAs displaying an under- and over-expression pattern, respectively, in the serum levels of patients with PsO. Specifically, plasma levels of PRINS transcripts were found to be down-regulated in patients with PsO, similar to its direct target *G1P3* and interacting partner *NPM*, while PRINS' miRNA targets that function as decoys, consisting of miR-124, miR-203a, miR-129, miR-146a and miR-9, were overexpressed, thus indicating a possible lncRNA–miRNA–mRNA axis of diagnostic value.

While the role of circRNAs in PsO progression remains obscure, CDR1as, a circRNA significantly downregulated in cutaneous PsO biopsies, has been associated with numerous genes that are involved in the pathogenesis of the disease, such as *EGR3*, *GATA6*, *GATA3* and *FOXN3* [139]. However, the direct regulatory mechanism remains unclear (Figure 1). Xiaoxin Liu et al. also showed that circRNA hsa\_skin\_088763 was downregulated in lesional skin compared to normal controls. It is postulated that this circRNA is indirectly associated with several PsO-related genes such as *GATA6*, *SIK2*, *IL17RD*, *EGR3*, *FAS*, *LRIG1*, and *PPARGC1A*, due to their shared regulatory ncRNAs (Figure 1), thus characterizing it as a competing endogenous RNA (ceRNA). A comprehensive list of all deregulated lncRNAs is presented in Tables 3 and 4, stratified by the cell type under study.


**Table 3.** Deregulated lncRNAs in psoriasis in keratinocytes. In this table, miRNAs, which are targets of lncRNAs that function as decoys, are highlighted in bold.

Abbreviations: circRNA, circular RNA; lncRNA, long non-coding RNA; HEKs, human embryonic kidney cell line; HaCaT, human epidermal keratinocyte cell line; NHEK, normal human epidermal keratinocyte.

**Table 4.** Deregulated lncRNAs in psoriasis in serum. In this table, miRNAs, which are targets of lncRNAs that function as decoys, are highlighted in bold.


#### **6. Therapeutic Approaches Targeting the Epigenetic Mechanisms**

In recent years, biologic drugs targeting TNF, IL-23 and IL-17 and small-molecule drugs such as phosphodiesterase-4 (apremilast) and Janus kinase (JAK) inhibitors have been effective in the clinical management of plaque psoriasis. The predominant role of epigenetic modifications in the pathogenesis of complex diseases, as analyzed in the framework of PsO, establishes therapeutic interventions targeting the epigenome as a promising clinical field. Numerous inhibitors of molecules that participate in epigenetic reprogramming, including DNMTs and HDACs, have been extensively utilized in clinical trials. Recently FDA-approved agents, combined with cytotoxic chemotherapies, show promising results despite their limited implementation, mostly in hematologic malignancies [146]. Such repurposing approaches, already applicable in cancer, might prove beneficial in PsO considering the diverse cell subtypes that are involved in its pathogenesis.

Resveratrol, a polyphenol with anti-inflammatory properties, was shown to stimulate the expression of *SIRT1*, leading HaCaT cells to death [147]. Recently, trichostatin A (TSA), a class I and II HDAC inhibitor (HDACi), significantly decreased the proliferative phenotype of KCs, both in vitro and in vivo [148]. These results are in accordance with previous studies examining the effect of TSA on human Tregs and the prevention of their differentiation into IL-17A-producing cells through the overexpression of *FOXP3* [149,150]. Another example of an epigenetic-driven therapeutic intervention is peroxisome proliferator-activated receptor gamma (PPARγ) and/or alpha (PPARα) antagonists, which inhibit *AQP3* expression in KCs, while agonists induce the differentiation of the latter, establishing them as a topical treatment for cutaneous diseases such as PsO. AQP3 is a water channel protein that regulates multiple aspects of KCs [151,152], with its expression induced by HDACs, particularly HDAC3, via acetylated transcription factors such as the p53 family and PPARs.

MiRNA-mediated gene expression regulation is another approach in the anti-PsO therapeutic arsenal. MiRNAs exhibit a diverse role through their interaction with numerous transcripts and can therefore affect multiple pathways implicated in PsO. Imiquimod-induced psoriasis-like murine models were treated with a topical gel containing a miR-210 antisense molecule. Consistent with the predominant role of miR-210 in the regulation of multiple genes expressed in CD4<sup>+</sup> T cells, as analyzed before, cellular markers of cell proliferation were significantly decreased in KCs, while the imbalance of CD4+ T cells was reversed to a non-pathological state [153]. Another topical-application approach used quaternized starch (Q-starch) as a miRNA-197 delivery system in lesional-xenotransplanted mice, alongside ultrasound for increased cutaneous permeability. The Q-starch/miRNA-197 complex was able to alleviate the psoriatic symptoms through the targeting of IL-22RA1 and IL-17RA transcripts, although without a homogeneous effect along the transplanted psoriatic skin sample [154]. Locked nucleic acid (LNA) *anti-miR-21* oligonucleotides were also assessed for their efficacy as a therapeutic approach in a double-knockout PsO murine model, displaying an important amelioration in the histopathological symptoms of the disease [108], further highlighting the role of miR-21 overexpression in PsO. Additionally, Qiao et al. inhibited miR-6731 in IL-22-stimulated HaCaT cells, showing an increased proliferative activity of the IL-22-induced KCs, while protein expression levels of therapeutic targets of PsO, including TNFα, IL-23, HLA-C as well as inflammatory molecules such as NF-κB and the PSORS1 locus were significantly overexpressed. These results imply a protective role of miR-6731 in the pathogenesis of PsO [135].

#### **7. Discussion**

The etiopathology of PsO lies in the complex interactions between genetic, immunological and environmental factors, including but not limited to the imbalance in the gut and skin microbiome, as well as lifestyle and stress-inducing factors. The effect of genome–environment interactions can extend to the epigenome with direct and indirect modulation of DNA methylation and histone modifications in PsO-related loci. These cascading effects are further amplified by the function of ncRNAs, which in numerous ways regulate the expression of PsO-associated genes and pathways. However, epigenetic modifications and deregulation of ncRNA molecules occur in both KCs and immune cells, while the polygenicity of PsO complicates candidate-gene approaches, thus obscuring the characterization of the exact mechanisms that alter disease predisposition in individuals. Furthermore, skin tissue and immune cells consist of abundant and heterogeneous cellular populations, where each cell type exhibits a distinct epigenomic and, thus, transcriptomic profile. This diversity dramatically increases the complexity of research conducted in the field. The secretome of implicated cell types is affected by the distinct epigenetic and regulatory profiles of each sub-population, which are triggered by stress-inducing factors and disease progression (Figure 1). Nevertheless, with the advent of next-generation sequencing, modern flow cytometry techniques and genome-wide analyses, the epigenetic reprogramming that participates in homeostasis disruption and the cutaneous inflammatory cascade is gradually being elucidated, leading to the clarification of PsO pathogenesis.

Even though the epigenetic modifications cannot be utilized as clinical biomarkers for the disease progression, shedding light upon the molecular mechanisms governing their development and maintenance can potentially uncover novel therapeutic targets associated with the induced epigenetic changes. In contrast, ncRNAs and especially miRNAs have a rich history of being used as disease biomarkers and targets of therapeutic intervention, despite the difficulties imposed by the off-target effects due to the wide spectrum of pathways affected by miRNAs.

Future studies should focus on developing methods for reversing DNA methylation in the context of PsO therapeutic interventions, as a way to understand the molecular discrepancies between responders and non-responders to therapy. Additionally, the catalogue of ncRNAs implicated in PsO should be significantly enriched to provide an extensive view of the targeted genes set and an accurate description of the vast interactome that underlies PsO. The gene regulatory networks that will emerge from the combination of epigenetic and ncRNA meta-analyses, as well as the genetic predisposition to PsO, will facilitate the development of a new generation of highly precise therapeutic approaches with minimum adverse effects and maximum impact.

**Author Contributions:** Conceptualization, C.A. and Y.V.; resources, C.A., P.A., P.X. and P.N.; data curation, C.A., P.A., P.X., P.N., G.K.G. and Y.V.; writing—original draft preparation, C.A., P.A., P.X., P.N. and G.K.G.; writing—review and editing, C.A., K.G., P.A., P.X., P.N., G.K.G., S.G. and Y.V.; visualization, C.A, P.X. and Y.V.; supervision, Y.V. All authors have read and agreed to the published version of the manuscript.

**Funding:** The publication of this article has been financed by the Research Committee of the University of Patras.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The datasets generated or analyzed during the current study are available from the corresponding author on reasonable request.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Lucia Martin-Gutierrez <sup>1</sup>, Robert Wilson <sup>2</sup>, Madhura Castelino <sup>2</sup>, Elizabeth C. Jury <sup>1</sup> and Coziana Ciurtin <sup>3,</sup>\***


**Abstract:** Sjögren's syndrome (SS) is a heterogeneous autoimmune rheumatic disease (ARD) characterised by dryness due to the chronic lymphocytic infiltration of the exocrine glands. Patients can also present other extra-glandular manifestations, such as arthritis, anaemia and fatigue or various types of organ involvement. Due to its heterogeneity, along with the lack of effective treatments, the diagnosis and management of this disease are challenging. The objective of this review is to summarize recent multi-omic publications aiming to identify biomarkers in tears, saliva and peripheral blood from SS patients that could be relevant for better stratification, with the aim of improved treatment selection and hopefully better outcomes. We highlight the relevance of pro-inflammatory cytokines and interferon (IFN) as biomarkers identified in higher concentrations in serum, saliva and tears. Transcriptomic studies confirmed the upregulation of IFN and interleukin signalling in patients with SS, whereas immunophenotyping studies have shown dysregulation in immune cell population frequencies, specifically activated CD4+ and CD8+ T cells, and their correlations with clinical parameters, such as disease activity scores. Lastly, we discuss emerging findings derived from different omic technologies which can provide integrated knowledge about SS pathogenesis and facilitate personalised medicine approaches leading to better patient outcomes in the future.

**Keywords:** Sjogren's syndrome; patient stratification; clinical relevance; multi-omics

#### **1. Introduction**

Sjögren's syndrome (SS) is an autoimmune rheumatic disease (ARD) characterised by a chronic inflammatory process associated with lymphocytic infiltrate affecting the exocrine glands. The disease has significant heterogeneity in clinical presentation according to age at disease onset, type of organ involvement, as well as serological features and response to therapy [1,2]. When the disease occurs on its own, it is called primary SS (pSS), while when it accompanies other autoimmune conditions, it is defined as secondary SS (sSS). Various classification criteria have been used to define pSS and exclude mimicking pathology, with the most recent ones being the data and consensus-driven American College of Rheumatology/European League Against Rheumatism Classification Criteria proposed in 2016 [3].

There are currently no universally accepted classification criteria for sSS and some experts argue that making a distinction between pSS and sSS is not adequate anymore, as both phenotypes represent the same disease [4]. Moreover, the classification criteria validated in adults have minimal utility in SS with childhood-onset (defined as disease onset before the age of 18 years), as the disease presentation in children and young people, although rare, is different [5]. This, in addition to the lack of validated classification criteria for childhood-onset SS, further limits the research opportunities for younger people affected by this disease [6].

**Citation:** Martin-Gutierrez, L.; Wilson, R.; Castelino, M.; Jury, E.C.; Ciurtin, C. Multi-Omic Biomarkers for Patient Stratification in Sjogren's Syndrome—A Review of the Literature. *Biomedicines* **2022**, *10*, 1773. https://doi.org/10.3390/biomedicines10081773

Academic Editor: Marianna Christodoulou

Received: 22 June 2022 Accepted: 20 July 2022 Published: 22 July 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The disease manifestations vary among patients; some have predominant exocrine glandular involvement leading to dryness, which is the hallmark symptom of the disease. Glandular involvement manifests as dry mouth (xerostomia), dry eyes (xerophthalmia), dry skin (xerosis cutis), as well as vaginal dryness, dry cough, pancreatic dysfunction, salivary gland inflammation/enlargement (e.g., parotitis), etc., while patients with extraglandular involvement can experience frequent musculoskeletal, haematological, and rarely hepatic, renal, pulmonary, cardiac, peripheral or central nervous system manifestations, as well as less specific symptoms of fatigue (common) or fever and lymphadenopathy (less common) [7]. Clinical presentation usually guides the disease management, which is largely symptomatic for glandular manifestations and involves the use of immunosuppressive treatment approaches in patients with more severe organ involvement [8]. The evidence for the efficacy of various therapies currently recommended for the management of SS is modest overall [9,10], emphasising the need for better research.

As a direct consequence of the disease pathogenesis being centred around the process of autoimmune epithelitis, powered by the interplay between the cells of the innate and adaptive immune systems, and activated by interferons and other pro-inflammatory cytokines leading to chronic immune activation in a host with genetic susceptibility [11], various disease fingerprints can be identified from the peripheral blood, as well as serum, saliva, tears and salivary gland biopsies.

Significant progress has been achieved recently in clinical research in terms of better patient clinical and molecular characterisation [12], but despite this, the management of this condition remains challenging because of patient heterogeneity and various limitations of the way the disease activity and response to treatment are measured. These aspects very likely contribute to the lack of significant treatment advances in SS, despite preliminary signals of the efficacy of various biologic agents in clinical trials [13–15]. Better research into disease pathogenesis and distinct clinical and molecular phenotypes will hopefully enable better patient selection for available therapies as well as new target discoveries.

#### **2. Materials and Methods**

This review aimed at identifying the main papers published since 2000 investigating multi-omics (cytokine profiling in serum, tears, saliva, immunophenotyping, genomic, transcriptomic and metabolomic) studies in SS in an effort to identify distinct patient groups (endotypes) which can inform meaningful stratification for better disease characterisation and improved treatment strategies. Publications selected for this review met the following inclusion criteria: a sample size greater than 10, the inclusion of age- and gender-matched healthy controls, data on at least one omic analysis in any biologic sample relevant to SS (blood, serum, tears, saliva, salivary gland biopsy), and publication in English. We present the most informative papers found in the literature in tables, summarising the study design, sample size, control groups, main findings and their clinical relevance.
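The inclusion criteria listed above can be read as a simple conjunctive filter over screened publications. The following is only an illustrative sketch and not the authors' screening procedure: the `Study` record, its field names and the reading of "published since 2000" as `year >= 2000` are assumptions introduced here.

```python
from dataclasses import dataclass

# Sample types the text lists as relevant to SS.
RELEVANT_SAMPLES = {"blood", "serum", "tears", "saliva", "salivary gland biopsy"}


@dataclass
class Study:
    """Hypothetical record for one screened publication (field names are illustrative)."""
    year: int
    sample_size: int
    has_matched_healthy_controls: bool  # age- and gender-matched HCs included
    omic_sample_types: set              # sample types with at least one omic analysis
    language: str


def meets_inclusion_criteria(study: Study) -> bool:
    """Return True only if every criterion stated in the text is satisfied."""
    return (
        study.year >= 2000                                      # assumed reading of "since 2000"
        and study.sample_size > 10                              # sample size greater than 10
        and study.has_matched_healthy_controls
        and bool(study.omic_sample_types & RELEVANT_SAMPLES)    # at least one relevant sample
        and study.language.lower() == "english"
    )
```

Because the criteria are combined with a logical AND, a study failing any single one (for example, a cohort of exactly 10 patients) would be excluded under this reading.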

#### **3. Results**

#### *3.1. Multi-Omic Biomarkers for Patient Stratification*

#### 3.1.1. Tear Biomarkers

One of the main symptoms of pSS is dry eye (xerophthalmia) as a result of lymphocytic infiltration of the lacrimal glands. Tears represent a valuable biological sample resource because of their proximity to the site of glandular inflammation, and they might contain biomarkers that could help us to understand the pathogenesis of pSS, improve its diagnosis and have therapeutic implications.

Recently, several studies have concentrated their efforts on identifying those biomarkers through either cytokine, metabolomic or proteomic tear profiling (Table 1).

In this context, Chen et al. [16] determined the cytokine profile of tears, measured by a 27-plex cytokine assay, in 29 pSS patients and 20 gender- and age-matched controls (non-SS sicca subjects and healthy controls, HCs). Elevated levels of pro-inflammatory cytokines, such as interleukin (IL)-1 receptor antagonist (IL-1ra), IL-2, IL-17A, interferon (IFN)-γ, macrophage inflammatory protein-1β (MIP-1β), and RANTES (Regulated upon Activation, Normal T Cell Expressed and Presumably Secreted), and of the anti-inflammatory cytokine IL-4, were found in pSS patients compared to controls. Interestingly, higher cytokine levels correlated positively with eye dryness severity and negatively with Schirmer's test, which measures the volume of tears secreted over 5 min [16]. These findings are supported by another study, by Willems et al., which also found an increased concentration of IFN-γ, tumor necrosis factor alpha (TNF-α), IL-2, IL-4, IL-6, IL-10, IL-12p70 and IL-5 in tears of pSS patients compared to HCs [17]. Moreover, they also verified the negative correlation between the Schirmer test and the concentration of IL-2, IL-4, IL-10 and IL-12p70 in tears.

Tears are composed of water, electrolytes, mucins and hundreds of different proteins and metabolites. Urbanski et al. identified a metabolic signature of tears comprising nine metabolites specific to pSS, compared to patients with non-immune dry eye disease [18]. Metabolomic quantification by mass spectrometry and liquid chromatography showed that three metabolites, serine, aspartate and dopamine, had lower concentrations, whereas six lipids (including pro-inflammatory lysophosphatidylcholine [19], sphingomyelin, and phosphatidylcholine diacyl) had increased concentrations in pSS patients compared to the non-pSS sicca controls. Moreover, age, sex, use of anticholinergic drugs, or the presence of anti-Ro/SSA antibodies did not influence the association between the metabolomic signature and the pSS status, suggesting that it is a true disease signature.

Tear proteomic analysis by Das et al. [20] using high performance liquid chromatography (HPLC) and mass spectrometry revealed the upregulation of 83 proteins and downregulation of 112 proteins in pSS patients compared to HCs. Pathway enrichment analysis of the upregulated proteins highlighted leukocyte trans-endothelial migration, protein-lipid complex remodelling and the collagen catabolic pathway. On the other hand, the analysis of downregulated proteins indicated that pathways such as glycolysis and amino acid metabolism were diminished in tears from pSS patients. The relationship between proteomic biomarkers and clinical outcomes was not explored.

#### 3.1.2. Saliva Biomarkers

Dry mouth (xerostomia) is a key symptom of pSS occurring in more than 95% of patients as a consequence of autoimmune destruction of salivary glands [21,22]. Salivary gland pathology detected by salivary gland biopsy is included in the classification criteria for pSS, and in many patients, this is an essential diagnostic and prognostic tool. However, this is an invasive procedure that could lead to local complications in a minority of cases [23], whereas the collection of saliva for research purposes is, in contrast, a non-invasive procedure. Biomarkers found in saliva could potentially reflect the pathogenesis of this disease. Therefore, many researchers have recently aimed to identify them (Table 2).

One of the few studies examining the cytokine profile of unstimulated saliva of pSS patients using the Luminex platform found an increase in IFN-γ, IL-1, IL-4, IL-10, IL-12p40, IL-17, and TNF-α levels in pSS patients compared to non-SS and HCs [24]. Moreover, IL-6 levels were higher in pSS compared to HCs. Notably, unstimulated saliva flow rate correlated with the IFN-γ/IL-4 ratio, and salivary gland biopsy focus score (the number of inflammatory infiltrates of at least 50 cells present in 4 mm<sup>2</sup> of salivary gland area) correlated with the TNF-α/IL-4 ratio in pSS, suggesting a predominant Th1 saliva signature. Years later, Chen et al. [16] reported enhanced levels of IP-10 (interferon gamma-induced protein 10 or C-X-C motif chemokine ligand 10, CXCL10) and MIP-1α in saliva samples from pSS compared to HCs and a negative correlation between MIP-1α levels and both the unstimulated and the stimulated whole saliva flow rates.
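The focus score definition given in parentheses above amounts to normalising a count of foci to a 4 mm² reference area. A minimal sketch of that arithmetic follows, assuming that only infiltrates of at least 50 cells have already been counted as foci and that the examined glandular area is known; the function and parameter names are illustrative and not taken from the cited studies.

```python
def focus_score(n_foci: int, gland_area_mm2: float) -> float:
    """Foci (infiltrates of at least 50 cells) per 4 mm^2 of salivary gland area.

    n_foci: number of qualifying foci counted in the section (assumed pre-counted).
    gland_area_mm2: examined glandular surface area in mm^2.
    """
    if n_foci < 0:
        raise ValueError("n_foci must be non-negative")
    if gland_area_mm2 <= 0:
        raise ValueError("gland_area_mm2 must be positive")
    # Scale the observed density (foci per mm^2) to the 4 mm^2 reference area.
    return n_foci * 4.0 / gland_area_mm2
```

For instance, under these assumptions three foci counted over 8 mm² of tissue would correspond to a focus score of 1.5.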




**Table 2.** Examples of studies investigating potential Sjögren's Syndrome biomarkers in saliva/salivary glands.



Metabolomics analysis of saliva identified a total of 41 metabolites reduced in pSS patients compared to HCs [25]. Principal component analysis (PCA) revealed that saliva from pSS patients had less biological diversity compared to HCs. Two distinct groups of pSS patients were identified based on their metabolomic profile: the only clinical differences between the groups were older age and the presence of major salivary gland glanditis in one group compared to the other. Recently, a longitudinal study by Herrala et al. [26] investigated changes in the levels of salivary metabolites in pSS and HCs using proton nuclear magnetic resonance (NMR) spectroscopy over 20 weeks. Choline, taurine, alanine, and glycine were the most significantly different metabolites, all of which were found in higher concentrations in saliva samples from pSS patients than in HCs. Compared to the baseline of the HCs, choline was significantly elevated at each time point, taurine and glycine were significantly higher at weeks 1, 10 and 20, whereas alanine was higher at weeks 10 and 20, suggesting that the distinct saliva metabolic signature is relatively stable over time.

Saliva proteomic analysis by Delaleu et al. [27] using a multiplex capture antibody-based assay identified 61 differentially expressed proteins in pSS vs. non-SS controls, including rheumatoid arthritis (RA) and HC samples. Interestingly, only one protein, fibroblast growth factor (FGF)-4, was found at a lower concentration in pSS, while 60 different proteins were present at higher concentrations compared to controls. This comprehensive analysis recognised a proteomic signature based on the following proteins: clusterin, IL-5, FGF-4 and IL-4. The proteomic signature could correctly identify pSS patients with an accuracy of 93.8% and non-SS patients with an accuracy of 100%. However, none of the protein biomarkers correlated with saliva flow rates in pSS. A more recent study by Das et al. [20] identified the upregulation of 104 proteins and downregulation of 42 proteins in pSS compared to controls. Some enriched pathways in patients' saliva included JAK-STAT signalling after IL-12 stimulation, superoxide metabolic process and phagocytosis.

#### 3.1.3. Potential pSS Biomarkers in Peripheral Blood

It is well known that pSS is characterised by an imbalance of immune cell types, including a loss of T cell tolerance and autoreactivity, increased infiltration of exocrine gland tissues that contributes to the inflammatory microenvironment, and B cell activation, which is crucial for ectopic lymphoid structure and germinal centre formation and eventually leads to irreversible glandular damage [29] (Table 3).

Mingueneau et al. [30] published a fascinating study in which, using mass spectrometry and immunochemistry in paired blood and salivary gland biopsies, an SS disease signature was uncovered. These findings highlighted the presence of activated CD8<sup>+</sup> T cells, terminally differentiated plasma cells, and activated epithelial cells in biopsies, whereas blood samples showed a cell signature of low numbers of CD4<sup>+</sup> T cells, memory B cells and plasmacytoid dendritic cells and high numbers of activated CD4<sup>+</sup> and CD8<sup>+</sup> T cells and plasmablasts. The blood signature observed correlated with clinical parameters and enabled patient stratification into different endotypes with distinct disease activity and degrees of glandular inflammation. In line with this result, Van der Kroef et al. [31] also observed reduced frequencies of memory B cells and plasmacytoid dendritic cells and increased frequencies of activated HLA-DR CD4<sup>+</sup> and CD8<sup>+</sup> T cells in pSS patients compared to HCs.

In 2021, Szabó et al. [32] published an article whose aim was to investigate whether the distribution of B cells in pSS could be affected by a change in the balance of circulating T follicular helper (Tfh) cell subsets and follicular regulatory T cells. Utilising multicolour flow cytometry, they discovered that pSS patients had a significant increase in activated Tfh cells compared to HCs. Interestingly, anti-La/SSB-positive patients had a higher frequency of T follicular regulatory cells compared to seronegative patients. In the B-cell compartment, they observed that memory B cells were decreased, and transitional and naïve B cells were significantly increased. Lastly, they identified a positive correlation between the proportion of activated Tfh cells and both the levels of anti-La/SSB autoantibody and the serum IgA titre. Moreover, they demonstrated that the frequency of pro-inflammatory Tfh1 cells correlated positively with levels of serum IgG and anti-La/SSB autoantibody, suggesting the potential implication of various immune cell subsets in the disease pathogenesis through correlations with serological markers.

Martin-Gutierrez et al. [33] identified an immune signature, derived from the analysis of 29 different cell subsets including B and T cells, which differentiated between pSS patients and matched HCs. The signature was driven by five distinct cell subsets: transitional Bm2 cells, late memory Bm5 cells, IgD<sup>−</sup>CD27<sup>−</sup> B cells, CD8<sup>+</sup> naive T cells and CD8<sup>+</sup> Tem cells. Moreover, they identified a shared immunological profile across three disease phenotypes: systemic lupus erythematosus (SLE), pSS and SLE associated with SS. By applying machine learning approaches, they identified two patient endotypes based on immune cell alterations, irrespective of the underlying diagnosis, suggesting significant pathogenic commonalities between these three disease groups. Notably, correlations were found between clinical manifestations and the frequencies of the immune cell subsets driving the stratification. CD8<sup>+</sup> and CD4<sup>+</sup> T cell subsets and B cell populations correlated with the erythrocyte sedimentation rate (ESR) in pSS patients, whereas haemoglobin levels correlated with the frequency of CD8<sup>+</sup> central memory T cells. Disease damage scores also correlated with the frequency of CD8<sup>+</sup> TEMRA (effector memory T cells re-expressing CD45RA) cells, CD8<sup>+</sup> responder cells (CD25<sup>−</sup>CD127<sup>+</sup>) and CD8<sup>+</sup>CD25<sup>−</sup>CD127<sup>−</sup> T cells.

Single-cell RNA sequencing of peripheral blood mononuclear cells (PBMCs) identified the expansion of CD4<sup>+</sup> cytotoxic T lymphocytes and a population of CD4<sup>+</sup> T cells highly expressing the T cell receptor Alpha Variable 13-2 gene in pSS patients compared to HCs [34]. Pathway enrichment analysis revealed upregulation of genes involved in type I and II interferon signalling, TNF family signalling and antigen processing and presentation in pSS patients. Flow cytometry confirmed that the percentages of CD4<sup>+</sup> Granzyme B<sup>+</sup> T cells in the CD4<sup>+</sup> T cell populations were significantly higher in pSS patients compared to the HCs. No correlations were found between the frequencies of CD4<sup>+</sup> T cells and clinical or serological parameters, including the disease activity index ESSDAI (EULAR Sjögren's syndrome (SS) disease activity index), ESR levels, or the presence of anti-Ro antibodies.

Disease-associated biomarkers can be detected in serum through proteomic or metabolomic technologies. Serum concentrations of proteins in pSS patients and HCs were measured by a high-throughput proteomic assay in a recent publication [35]. Using this complex assay, 1110 proteins were quantified and, of those, 82 were found to be differentially expressed in pSS patients. Significant correlations between nine differentially expressed serum proteins and the ESSDAI score were found. Using a second cohort of pSS patients, five proteins including CXCL13, TNF-receptor 2, CD48, B-cell activating factor (BAFF), and PD-L2 (Programmed cell death ligand 2) were validated as pSS-associated biomarkers. Another study investigated which serum protein biomarkers, measured by Bio-Plex, could distinguish pSS from other autoimmune diseases, such as SLE and RA [36]. Out of 63 proteins, they were able to identify eight and four proteins that could differentiate pSS from SLE and RA, respectively. A combination of four different proteins, BDNF (Brain Derived Neurotrophic Factor), I-TAC/CXCL11, soluble (s) CD163 and Fractalkine/CX3CL1, was identified as a pSS protein signature as it could discern pSS from other autoimmune diseases. A negative correlation between ESSDAI score and serum sCD163 concentrations was found.

Several reports [37–39] have also focused on analysing changes in serum metabolites using techniques such as mass spectrometry, to find new molecules that could play a role in the pathogenesis of pSS and could become new drug targets [39]. Using non-targeted gas chromatography-mass spectrometry (GC-MS) serum metabolic profiling, the authors detected 21 metabolites that differentiated between pSS patients and controls, with 18 out of 21 metabolites further validated in another cohort. Two metabolites, stearic acid and linoleic acid, had adequate discriminatory capacity to separate pSS patients from HCs and correlated with clinical parameters, such as C-Reactive Protein (CRP), ESR, IgG, anti-Ro/SSA, anti-La/SSB, antinuclear antibodies, IgA and rheumatoid factor.


**Table 3.** Examples of studies investigating potential Sjögren's Syndrome biomarkers in saliva.



#### 3.1.4. Genetic and Epigenetic Studies

Although the aetiology of SS is unknown, environmental, genetic and epigenetic factors are all considered to contribute to the disease pathogenesis. In this context, several studies [40–43] have focused on finding genetic and epigenetic factors that could be associated with SS. Widely used approaches include transcriptomics, genome-wide association studies (GWAS), which identify genomic variants that are statistically associated with the risk of developing the disease, and epigenetic studies, which determine whether gene expression is active or inactive based on DNA methylation.

#### Multi-omic pSS Signatures

It is well known that the diagnosis and treatment of SS is challenging due to the existing molecular and clinical heterogeneity, which reflects different disease stages, variable types of organ involvement, disease severity and treatment, as well as patient-specific factors, such as age, environmental exposures and comorbidities. Thus, recent research (Table 4) has focused on integrating genomic/epigenomic, transcriptomic, proteomic, metabolomic and immunophenotype characterisation with clinical data, both to gain more knowledge about the disease pathogenesis and to classify patients into groups defined by their molecular pattern.

Integrated transcriptomic and serum proteomic data with an immune signature comprising 24 different cell populations highlighted the presence of a pSS gene signature driven by interferon genes as well as ADAMs (a disintegrin and metalloprotease) substrates [44]. Interestingly, the genomic regions coding the genes identified as part of the disease signature were predominantly hypomethylated and, therefore, transcriptionally activated. In addition, the proteomic analysis revealed some correlations between ADAMs substrates and ESSDAI scores. Notably, the authors confirmed that CD8<sup>+</sup> T cells, especially TEMRA, produced the signature observed. Similarly, transcriptomic and cytokine profiling of pSS patients allowed their stratification into three distinct clusters, defined by IFN-responsive and inflammation-associated genes [45]. Interestingly, patients belonging to the cluster with the strongest IFN and inflammation gene signature also had high ESSDAI scores and elevated levels of anti-Ro/SSA and La/SSB autoantibodies. This cluster was also defined by a high serum concentration of cytokines, such as LIGHT and BLyS, and the chemokine CXCL13 [45]. Soret et al. [46] and Barturen et al. [47] independently validated some of these findings by showing a pSS patient transcriptomic stratification also driven by IFN-related pathways. Interestingly, in both studies, a cluster of patients with low disease activity was identified, characterised by a transcriptomic profile similar to that of HCs. Soret et al. [46] did not detect any differentially expressed genes, single nucleotide polymorphisms (SNPs) or differences in the frequencies of B and T cells, monocytes, basophils, eosinophils and neutrophils in pSS patients with low disease activity when compared to HCs.


**Table 4.** Examples of studies investigating potential Sjögren's Syndrome biomarkers using multi-omic approaches.





signature; IFN: interferon; CXCL: C-X-C motif ligand; SNPs: single nucleotide polymorphisms; Notch: neurogenic locus notch homolog protein; MX1: myxovirus resistance protein 1; NLRC5: NLR Family CARD Domain Containing 5; CCL8/MCP2: monocyte chemotactic protein-2; SLEDAI: Systemic Lupus Erythematosus Disease Activity Index; ESSDAI: EULAR Sjögren's syndrome (SS) disease activity index.

#### **4. Discussion**

The significant progress made by high-throughput technologies, increased effort for large-scale academic and industry research collaborations to facilitate external validation, and advancement of computer algorithms for big data integration and cluster analysis provide unprecedented opportunities for better patient classification, improved pathogenic characterisation, prediction and therapeutic opportunities across all autoimmune diseases. The need for better-quality research, including both pre-clinical and clinical validation, is increasingly recognised as essential to achieving the ultimate goal of clinical utility and patient benefit. Despite the impressive therapeutic advances leading to the licensing of many new targeted therapies in autoimmune rheumatic diseases (ARDs), such as inflammatory arthritis or SLE [48], patients with SS do not benefit from the same range of therapeutic options available for other conditions, despite shared pathogenesis [49]. Many of the signals of efficacy from early phase clinical trials of various biologics investigated in SS have not been replicated in larger studies, and research is ongoing [50,51].

Efforts have been made to improve the way the response to treatment is assessed in clinical trials of patients with SS [52], while current treatment recommendations have expanded to targeted biologic treatment options despite the lack of large phase 3 clinical trials [10]. In addition, novel approaches, such as advocating for a molecular classification of SS to drive precision medicine strategies, have been proposed [46], which suggests that the future of clinical research in SS will likely involve multi-omic characterisation of patients (Figure 1). In this respect, good quality, reproducible research involving large cohort collaborations, to capture the disease heterogeneity and to facilitate the validation of disease signatures, is required to improve knowledge about SS pathogenesis and enable the much-needed therapeutic advances.

**Figure 1.** Potential multi-omic approaches taken in clinical research.

It is widely recognised that SS is associated with a genetic predisposition, similar to other ARDs. This has been confirmed in large GWAS, which validated the associations with the HLA, IRF5, STAT4 and BLK genetic loci while also detecting novel susceptibility loci [53]. The best characterised are the HLA genes, associated with an increased disease risk ranging from 1.85 to 3.41 as per a large meta-analysis [54]. Various non-HLA genes associated with the disease have also been described, but very few have been validated across studies [11].

Research into the role of environmental factors and epigenetics currently supports the old hypothesis that a ubiquitous virus is a potential trigger for the mechanism of autoimmunity, with most data potentially implicating Epstein–Barr Virus (EBV), Human T cell Leukemia Virus-1 (HTLV-1), or Coxsackie virus in the development of SS [55], despite the lack of conclusive evidence for their causal role. The most well-defined epigenetic mechanisms likely to play a role in the pathogenesis of SS have been described as DNA methylation, histone modifications and non-coding RNAs [42].

SS is characterised by a pro-inflammatory environment, and cytokine profiling of serum, tears and saliva identified a predominance of pro-inflammatory cytokines, such as MIP, IL-1, IFN-γ, TNF-α, IL-6, IL-12 or IL-17 in various proportions, as well as increased anti-inflammatory molecules, such as IL-4 or IL-10. Some of these data have been validated across studies, while some of the biomarkers, including cytokine ratios suggesting a Th1 signature, also correlated with clinically meaningful parameters, such as tear and saliva secretion [16,17,24].

Dysregulation of various immune cell populations has been hypothesised as one of the key factors implicated in disease pathogenesis. Immune profiling of patients with SS found distinct immune signatures in the salivary gland tissue compared to peripheral blood, as expected. The main players seem to be activated CD8<sup>+</sup> T cells, terminally differentiated plasma cells, and activated epithelial cells in biopsies, whereas the peripheral blood signatures comprised high numbers of activated CD4+, CD8+ T cells. Although these signatures were not perfectly validated across various studies [30,33], some correlated with serological and clinical parameters, suggesting a potential clinical utility.

Proteomic analysis revealed distinct signatures in tears and saliva compared to the serum of patients with SS, with the majority of signatures being able to differentiate, with high accuracy, SS patients from controls, and a few correlating with clinically meaningful parameters. Enriched pathway analysis also overlapped with some cytokine signature findings, such as the upregulation of the JAK-STAT signalling after IL-12 stimulation in saliva [27]. The protein patterns identified in saliva were associated with B cell immune responses, macrophage differentiation and T cell chemotaxis, which showed similarities with salivary gland histopathological features [27], suggesting a potential role for saliva analysis as a proxy measure of glandular inflammation.

Transcriptomic profiling of salivary gland tissue was characterised by the upregulation of IFN-α and IL-12/IL-18 signalling, CD3/CD28 T cell activation and CD40 signalling in B cells, and by a significant correlation with the IFN-α score in PBMCs [28], which shows similarities with proteomic profiles of saliva. IFN response genes were also upregulated in most cell subsets when assessed by single-cell blood transcriptomic analysis, highlighting the role of the IFN activation pathway in the pathogenesis of the disease. In terms of potential clinical implications, the IFNγ/IFNα mRNA ratio in salivary gland tissue was shown to have the best discriminative capacity for lymphoma development in patients with pSS [56]. Patient stratification based on transcriptomic signatures identified distinct clusters driven by IFN and B cell activation, as well as SNPs in HLA genes and epigenetic modifications including gene hypomethylation [46], all processes recognised as involved in the disease pathogenesis, although the clinical significance of patient stratification was less clear.

Metabolomic characterisation of serum, tears and saliva of patients with pSS identified distinct signatures with almost no overlap between various biologic fluids [18,26,39]. Further research exploring the inter-individual variability and its stability over time is required [26].

Integrating several omic technologies in the investigation of disease fingerprints provided evidence for the role of cytotoxic CD8<sup>+</sup> T cells in the disease pathogenesis [44] and enabled the identification of inflammatory, lymphoid and IFN-driven patient clusters generated by a combination of transcriptome, methylome and cytokine profiling [45,47]. Patient clusters driven by high IFN and pro-inflammatory signatures were also associated with high disease activity, suggesting that these pathways are relevant to the disease pathogenesis.

#### **5. Conclusions**

Omic investigation of SS provides a valuable insight into the disease pathogenesis and patient molecular heterogeneity which has implications for SS prognosis and better management strategies to address the unmet patient needs. Further research into standardising technologies and validating findings across large patient populations, as well as further exploration of potential correlations with clinical significance, are required to establish which are the strongest molecular signals that could be potentially translated into research with patient benefit. Ultimately, integrating data provided by multiple omics analysis can provide the much-required complementary knowledge related to the interplay between genes, environment, immune cell activation and pro-inflammatory milieu which all sustain the pathogenic processes associated with SS.

Understanding how the disease's natural course or treatment impacts these molecular signatures, as well as which pathways can be targeted by available and novel treatments will open a new era for research in SS.

**Author Contributions:** L.M.-G. and C.C. conceptualised the paper search strategy. All authors reviewed the literature and contributed to data collection. L.M.-G. and C.C. wrote the first draft of the manuscript. All authors reviewed the manuscript, provided intellectual input in the presentation of findings, conclusions, discussions, and limitations. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by grants from the MRC EMINENT project grant (MR/X004694/1), NIHR UCLH Biomedical Research Centre grant BRC772/III/EJ/101350, BRC773/III/CC/101350. This work was performed within the Centre for Adolescent Rheumatology Versus Arthritis at UCL UCLH and GOSH supported by grants from Versus Arthritis (21593, 22908 and 20164). The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health.

**Institutional Review Board Statement:** No institutional review board was required as this is a literature review.

**Informed Consent Statement:** Not required as this study is a literature review.

**Conflicts of Interest:** The authors declare that the perspective was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

#### **References**


**Zahra Alghoul 1,2, Chunhua Yang 1,3,\* and Didier Merlin 1,3**

	- <sup>3</sup> Atlanta Veterans Affairs Medical Center, Decatur, GA 30033, USA

**Abstract:** Diagnosis and prognosis of inflammatory bowel disease (IBD), a chronic inflammation that affects the gastrointestinal tract of patients, are challenging, as most clinical symptoms are not specific to IBD and are often seen in other inflammatory diseases, such as intestinal infections, drug-induced colitis, and monogenic diseases. To date, there is no gold-standard test for monitoring IBD. Endoscopy and imaging are essential diagnostic tools that provide information about the disease's state, location, and severity. However, the invasive nature and high cost of endoscopy make it unsuitable for frequent monitoring of disease activity in IBD patients, and even when it is possible to replace endoscopy with imaging, high cost remains a concern. Laboratory testing of blood or feces has the advantage of being non-invasive, rapid, cost-effective, and standardizable. Although the specificity and accuracy of laboratory testing alone need to be improved, it is increasingly used to monitor disease activity or to diagnose suspected IBD cases in combination with endoscopy and/or imaging. The literature survey indicates a dearth of summaries of biomarkers for IBD testing. This review introduces currently available non-invasive biomarkers of clinical importance in laboratory testing for IBD, and discusses the trends and challenges in IBD biomarker studies.

**Keywords:** proteomics; epigenetics; endoscopy; imaging; laboratory testing

#### **1. Introduction**

Inflammatory bowel disease (IBD) is a set of chronic and idiopathic inflammatory conditions that affect more than 3.5 million patients worldwide. The two major forms of IBD are Crohn's disease (CD), in which inflammation affects any segment of the gastrointestinal (GI) tract [1], and ulcerative colitis (UC), in which inflammation affects the inner lining of the colon or rectum [2]. Patients with IBD are up to six times more likely to develop colorectal cancer than the general population [3,4]. In addition to the molecular alterations (such as chromosomal instability, microsatellite instability, and hypermethylation) that contribute to sporadic colorectal cancer, IBD-related colorectal cancer is linked to inflammation that induces the transcription of mutated cancer genes [5]. Loss-of-function mutations in tumor-suppressor protein p53 occur in both sporadic and IBD-related colorectal cancer, but they occur earlier in the non-dysplastic mucosa of IBD-related colorectal cancer than in sporadic colorectal cancer [4,5]. Another mutation observed in both types of cancer is the nonfunctional adenomatous polyposis coli (APC) gatekeeper gene. Unlike the p53 mutation, APC mutation occurs just prior to carcinoma in IBD-related colorectal cancer, but at a much earlier stage in sporadic colorectal cancer [4]. Other gene mutations linked to IBD-related colorectal cancer include p27, k-Ras (12p12) oncogene, human mismatch repair genes (e.g., hMLH1, hMSH2), and p16 [4].

CD and UC are both characterized by mucosal inflammation, with occasional flares and remittance. Inflammation in CD can affect any segment of the GI tract, and spreads in a non-continuous pattern [1,6]. CD commonly involves the formation of strictures, abscesses, and fistulas [6]. Its histological features include thickened submucosa, fissuring ulceration, transmural inflammation, and non-caseating granulomas [6]. Inflammation in UC affects the inner lining of the colon or rectum, and spreads in a continuous pattern [2,6]. It shows superficial inflammatory changes in the mucosa and submucosa, and involves the formation of cryptitis and crypt abscesses [6]. The clinical symptoms of IBD include abdominal pain, diarrhea, rectal bleeding, weight loss, nausea, intestinal pain and, in some cases, fever [7,8]. As these symptoms are not specific to IBD, the clinical diagnostic process must consist of using a combination of endoscopic, radiological, clinical, histological, and laboratory tests [9]; a single technique is often insufficient for the diagnosis.

**Citation:** Alghoul, Z.; Yang, C.; Merlin, D. The Current Status of Molecular Biomarkers for Inflammatory Bowel Disease. *Biomedicines* **2022**, *10*, 1492. https://doi.org/10.3390/biomedicines10071492

Academic Editor: Marianna Christodoulou

Received: 24 May 2022 Accepted: 21 June 2022 Published: 24 June 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Endoscopy and imaging are essential techniques for the diagnosis, management, and treatment of IBD. They are used in the initial evaluation of patients with suspected IBD, as well as in making a differential diagnosis of UC versus CD in confirmed IBD cases [10]. The strength of endoscopy as a diagnostic tool lies primarily in its ability to visually observe different bowel segments, allowing clinicians to assess disease severity and monitor disease activity over time. Ileocolonoscopy has traditionally been the most used form of endoscopy in IBD. The initial evaluation of patients presenting with clinical symptoms suggestive of IBD should be carried out with ileocolonoscopy, as recommended by the American Society for Gastrointestinal Endoscopy (ASGE) Standards of Practice Committee [11]. In addition to providing a visual of the colon and the terminal ileum, ileocolonoscopy can be used to obtain biopsy specimens for further analysis. The ASGE suggests obtaining at least two biopsy specimens from five sites throughout the bowel during the initial evaluation [12]. However, the invasiveness and high cost of ileocolonoscopy are major drawbacks that have limited its frequent use for monitoring disease activity.

New, less-invasive endoscopic techniques that can more accurately diagnose IBD, while also providing a differential diagnosis of CD and UC, have emerged in the past few years. These include video capsule endoscopy (VCE), confocal laser endomicroscopy (CLE), and single- or double-balloon-assisted enteroscopy (SBE and DBE, respectively). VCE provides imaging of the whole bowel via ingestion of a wireless capsule endoscope [13]. This technique is particularly useful for inspecting areas in the GI tract that cannot be visualized by colonoscopy [14]. Although the risk of capsule retention is low, it remains the primary concern in patients with suspected or known IBD [15]. VCE is less invasive and more cost-effective than ileocolonoscopy, but it cannot be used in performing biopsies. In CLE, a confocal laser microscope is used in vivo to obtain living tissue images during colonoscopy [16]. CLE has the advantage of offering a faster diagnosis than a traditional colonoscopy. Enteroscopy in both of its forms (SBE and DBE) allows access to small bowel areas that standard endoscopy cannot reach. Additionally, enteroscopy can be used in performing histological analysis. However, due to its technical complexity and time-consuming preparation, enteroscopy is not recommended for the initial evaluation of suspected IBD cases [17].

In confirmed IBD cases, clinical symptoms alone are insufficient for clinicians to determine the extent of mucosal inflammation, or to make a differential diagnosis between UC and CD. There has been a growing interest in the use of cross-sectional imaging modalities such as magnetic resonance enterography (MRE), ultrasonography (US), and computed tomography (CT) as tools to supplement endoscopy in the diagnosis and monitoring of IBD [18]. These techniques are instrumental in detecting mural and extramural complications and assessing laminal inflammation in areas affected by CD in the small bowel that are beyond the reach of colonoscopy [19]. Due to their ability to diagnose CD with high accuracy, cross-sectional imaging modalities are used to make differential diagnoses in suspected cases of UC [20]. This aspect is critical because these diseases differ in their prognosis and required treatments.

Although imaging techniques offer highly accurate IBD diagnosis, they require experienced personnel and sophisticated instruments and incur high costs, hampering their routine application. Laboratory tests have the advantage that they can be standardized, rapid, and cost-effective, and they can also be applied to already established patient sample libraries for independent investigations. An increasing number of laboratory tests, combined with endoscopy or imaging, are used to monitor disease activity or diagnose suspected IBD cases. As good laboratory test results rely on the proper use of molecular biomarkers from the patients' tissue, blood (serum), or fecal samples, this review summarizes currently available biomarkers of clinical importance in laboratory testing of IBD, discusses the possible involved genetic and epigenetic factors, and envisions the trends and challenges of biomarker discovery in IBD.

#### **2. Non-Invasive Molecular Biomarkers of IBD: Serum Proteins, Serological Antibodies, and Fecal Proteins**

Biomarkers play critical roles in the early detection and monitoring of disease progression and therapeutic responses (Figure 1). Disease activity can be monitored with laboratory tests that measure circulating biomarkers in the blood (serum or plasma), tissue, or feces. A biomarker is defined as "a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacological responses to a therapeutic intervention" [21]. Identifying a biomarker or several biomarkers of a given condition's pathologies might help to diagnose, prognose, and assess therapeutic responses. For a biomarker to be effective, it should possess several attributes, such as being non-invasive, inexpensive, convenient for sampling, reproducible, and disease-specific (i.e., accurate and precise). An ideal biomarker also needs to have a rapid test-to-result turnaround time, be standardizable to provide comparable test results across different assays, be widely available and stable for storage, have a wide dynamic range, use defined thresholds to determine the absence/presence or extent of inflammation, and be responsive to changes in the state of inflammation [22].

**Figure 1.** The potential role of biomarker assays in the care of patients with suspected or established IBD: Biomarkers may be used in all phases of the care. For patients with suspected IBD, biomarkers can be used to select which patients are unlikely to have IBD and could forgo further testing. Once patients are diagnosed, biomarkers can determine which patients have CD or UC and predict the disease course. Biomarkers can be used to determine which patients are most likely to respond to therapies, determine prognosis, and identify those who require more aggressive therapies. In patients with recurrent symptoms, biomarkers can differentiate patients with active inflammation from those likely to have symptoms from other causes. Adapted from James D. Lewis's review [23]; *Gastroenterology*, Volume 140 Issue 6, Pages 1817–1826.e2; https://doi.org/10.1053/j.gastro.2010.11.058.

Several molecular biomarkers have been established as reliable measures for disease activity in IBD [22,24]. They are minimally invasive and relatively inexpensive compared to colonoscopy and imaging techniques. They can also assist in identifying patients who require diagnosis with endoscopy and biopsies. However, many of these biomarkers have limitations in terms of their specificity, sensitivity, responsiveness, and/or other desirable attributes of IBD biomarkers [22]. There are currently three major types of molecular biomarkers available for IBD: serum biomarkers, serological antibodies, and fecal biomarkers.

#### *2.1. Serum Biomarkers*

Several inflammatory serum biomarkers have become part of routine laboratory testing for the diagnosis of IBD. Although they are not specific to IBD, these serum biomarkers are commonly used for initial diagnosis due to their ease of use, low cost, and well-established protocols. The most common of these tests are those for C-reactive protein (CRP) and the erythrocyte sedimentation rate (ESR).

CRP is a pentameric protein that is produced in the liver by hepatocytes. It is found in serum at <1 mg/L under physiological conditions. Its concentration increases during an acute-phase response, as pro-inflammatory cytokines such as IL-6, tumor necrosis factor α (TNF-α), and IL-1β stimulate its production in the hepatocytes [25–27]. CRP has a relatively short half-life (about 19 h) [28], making it a better indicator of inflammation than most acute-phase proteins. Elevated CRP levels are observed in most active CD cases, whereas the CRP levels of UC patients show little-to-no increase in the case of active disease [27,29]. This may reflect the production of CRP by mesenteric adipocytes in patients with CD [30]. Although CRP is widely used as a biomarker for IBD, it lacks specificity; elevated CRP levels are also observed in autoimmune disorders, infections, and malignancies [25].
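To illustrate why a short half-life makes CRP responsive to changes in inflammation, the following minimal sketch computes the fraction of serum CRP that would remain over time. It assumes simple first-order clearance and that production has stopped completely, which is an idealization; the function name `remaining_fraction` is hypothetical, and only the 19 h half-life comes from the text.

```python
def remaining_fraction(hours_elapsed, half_life_hours=19.0):
    """Fraction of the initial serum CRP left after `hours_elapsed`.

    Assumes first-order clearance and no ongoing hepatic production
    (an idealization used only for illustration).
    """
    if hours_elapsed < 0:
        raise ValueError("hours_elapsed must be non-negative")
    return 0.5 ** (hours_elapsed / half_life_hours)


if __name__ == "__main__":
    for hours in (0, 19, 38, 72):
        print(f"{hours:>3} h: {remaining_fraction(hours):.3f} of initial CRP")
```

Under these assumptions, the level halves every 19 h, so a marker with a longer half-life would lag further behind the resolution of a flare.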

ESR is a measure of how quickly erythrocytes sediment through plasma in a column, with a higher rate taken as indicating more inflammation. ESR values are affected by physiological factors such as pregnancy, age, and gender, as well as changes in hematocrit levels in patients with anemia and polycythemia [31]. Medications that cause changes in the size of erythrocytes can also affect ESR values [32]. Changes in ESR values are not specific to IBD, and can be due to any inflammatory stimulus. Unlike CRP, ESR values are altered in both UC and CD, so ESR cannot distinguish between the two. ESR values peak more slowly than CRP, and take longer to return to normal after the end of an inflammatory flare [28].

CRP and ESR have been studied long enough to become established in IBD diagnosis. While both tests lack the specificity and accuracy to be considered a gold-standard diagnosis, CRP has some advantages over ESR. For example, the CRP concentration changes faster than the ESR value upon a change in disease activity, CRP has a broader range of abnormal values than ESR, and (unlike ESR) CRP does not show age-related variation [33].

Leucine-rich alpha-2 glycoprotein (LRG) is a 50 kDa protein that is secreted by hepatocytes, neutrophils, macrophages, and intestinal epithelial cells [34–36]. It has recently emerged as a novel serological biomarker for IBD and rheumatoid arthritis. Studies have found that levels of LRG are elevated in patients with active UC, and decrease with a decline in disease activity [37,38]. Notably, elevated levels of LRG correlate better than CRP with clinical and endoscopic scores in patients with active UC and CD [38–40]. LRG has also been found to predict mucosal healing in both UC and CD patients with normal CRP levels [41].

#### *2.2. Serological Antibodies*

Serological testing is a well-established diagnostic tool for a variety of immune diseases. Its use in IBD has been mainly focused on patients with a confirmed diagnosis; little work has been done on its potential as a primary diagnostic tool in patients with suspected IBD. Perinuclear anti-neutrophil cytoplasmic antibodies (p-ANCAs) and anti-*Saccharomyces cerevisiae* antibodies (ASCAs) are the two primary antibodies currently examined in IBD studies. ANCAs are a group of antibodies produced against antigens in the cytoplasm of neutrophils. ASCAs are produced against mannan and other yeast cell wall components. Both have been reported to provide clinically useful positive or negative predictive values: p-ANCA+/ASCA− is reported in patients with UC, while p-ANCA−/ASCA+ is seen in patients with CD. Although each of these biomarker antibodies can be used to discriminate UC from CD, they both have low accuracy and sensitivity [42]. Positive results for either antibody are not unique to IBD, and may be related to several other GI and inflammatory conditions, such as celiac disease, Behçet's disease, cystic fibrosis, and rheumatoid arthritis [42,43].

#### *2.3. Fecal Biomarkers*

Fecal biomarkers are proteins that are specifically found in stool samples of patients with IBD. The fecal biomarkers for IBD reported to date are mainly fecal leukocyte proteins. These include calprotectin, calgranulin C, lactoferrin, and lipocalin-2. They have several advantages over blood biomarkers, including the ease of sample accessibility, high biomarker concentration due to the direct contact of the fecal sample with the site of inflammation, and higher specificity for IBD because they reflect GI inflammation (unlike serum biomarkers, which are increased by various types of inflammation) [44].

Calprotectin is the most widely used fecal biomarker for IBD. It is a calcium- and zinc-binding protein that is abundant in neutrophils, eosinophils, and macrophages. Changes in its concentration are observed in various secretory and excretory products in the body upon activation of granulocytes and mononuclear phagocytes [45]. Elevated fecal calprotectin levels are expected in patients with active IBD, due to the presence of a high number of neutrophils in the GI tract, which is characteristic of the disease [28]. Calprotectin is resistant to degradation, and is stable for 7 days in fecal samples stored at room temperature [46]. Changes in fecal calprotectin levels are not exclusive to IBD; alterations are also observed in various colon and intestine diseases [47].

Calgranulin C (S100A12) belongs to the S100 family of low-molecular-weight calcium-binding proteins, which activate the NF-κB pathway and increase cytokine release during pro-inflammatory processes [31]. The serum concentration of calgranulin C is high in IBD [48], but the fecal concentration is higher, making the fecal assay more sensitive to IBD. Elevated levels of calgranulin C have been reported in other inflammatory conditions, such as arthritis [49].

Lactoferrin is another biomarker whose levels are significantly elevated in active IBD. It is an iron-binding glycoprotein that is found specifically in neutrophils; in this respect, it contrasts with calprotectin, which is found in several types of cells. Lactoferrin has high specificity and sensitivity for diagnosing active IBD [50].

Lipocalin-2 (LCN-2), also known as neutrophil gelatinase-associated lipocalin (NGAL) or siderocalin (Scn), is a bacteriostatic protein stored in neutrophil granules [51,52]. LCN-2 is involved in innate immunity by sequestering iron from pathogenic bacteria, limiting their invasion. It is a highly stable protein whose elevated expression by gut epithelial cells has been demonstrated in colonic biopsies from inflamed areas of patients with IBD. Serum LCN-2 has been proven to be an active biomarker in UC patients, and it is widely used as a fecal biomarker of acute inflammation in the animal model of UC, indicating that it can potentially be used as a fecal biomarker of human UC. Upregulation of LCN-2 is believed to be induced by IL-22 and IL-17A [53].

#### *2.4. Diagnostic/Prognostic Accuracy*

The major concern about IBD diagnosis and prognosis that rely solely on single molecular biomarkers is their detection accuracy. A study showed that the biomarkers' correlation coefficients with endoscopy could vary from 0.48 to 0.83 (for calprotectin) and from 0.19 to 0.87 (for lactoferrin) in IBD patients [23] (Table 1). IBD detection methods that combine endoscopy with histopathology biomarkers can be highly accurate. One example is oncostatin M (OSM) and oncostatin M receptor (OSMR), which are found to be highly overexpressed in the inflamed intestinal tissue of active IBD patients, with a *p*-value < 0.001 for OSM (*n* = 42) and a *p*-value < 0.05 for OSMR (*n* = 86) at a false discovery rate (FDR) of 1% [54].


**Table 1.** Correlation of biomarkers with disease activity, determined by endoscopy.

\* CDEIS, Crohn's Disease Endoscopic Index of Severity; \*\* SES-CD, Simple Endoscopic Score for Crohn's Disease. Adapted from James D. Lewis's review [23]; *Gastroenterology*, Volume 140 Issue 6 Pages 1817–1826.e2; https://doi.org/10.1053/j.gastro.2010.11.058.

To date, C-reactive protein and fecal calprotectin are considered reliable markers of disease activity, with demonstrated utility in IBD management [55]. However, single-biomarker-based detections often present a larger ambiguous "grey zone" than detections made using composite biomarkers (Figure 2). Composite biomarkers are defined as "a combination of ≥2 biomarkers", and are selected using an optimized algorithm to render a single interpretive output. The combination of different biomarkers has shown higher accuracy, and is expected to reduce the "grey zone" of each biomarker and replace single-marker approaches in the future of research and clinical practice [55] (Figure 2).

**Figure 2.** Improvements are provided by composite biomarkers. Careful selection of markers and their integration can optimize the diagnostic accuracy of single biomarkers of disease activity and drastically reduce the blind spot resulting from the "grey zone". Adapted from Dragoni G. et al.'s review [55]; *Digestive Diseases*, https://doi.org/10.1159/000511641.
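The idea of a composite biomarker rendering "a single interpretive output" can be sketched in a few lines. The example below contrasts a single marker read against two cutoffs (which leaves a grey zone) with a logistic combination of two markers. It is a minimal illustration only: the function names, cutoffs, weights, and intercept are all hypothetical placeholders, not values taken from the cited studies or from clinical practice.

```python
import math


def classify_single(value, lower_cutoff, upper_cutoff):
    """Read one biomarker against two cutoffs; values in between are ambiguous."""
    if value < lower_cutoff:
        return "negative"
    if value > upper_cutoff:
        return "positive"
    return "grey zone"


def composite_probability(markers, weights, intercept):
    """Combine several biomarkers into one score with a logistic model.

    `weights` and `intercept` would normally be fitted on a training cohort;
    here they are hypothetical placeholders.
    """
    linear = intercept + sum(weights[name] * markers[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-linear))


if __name__ == "__main__":
    # Hypothetical patient values and model parameters, for illustration only.
    patient = {"crp": 4.0, "fecal_calprotectin": 120.0}
    weights = {"crp": 0.10, "fecal_calprotectin": 0.01}
    print(classify_single(patient["fecal_calprotectin"], 50.0, 250.0))
    print(round(composite_probability(patient, weights, intercept=-2.0), 3))
```

In this sketch, the single-marker reading falls in the grey zone, whereas the combined model still returns one continuous score that can be thresholded once.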

#### **3. Trends in IBD Biomarker Discovery**

#### *3.1. Proteomics*

Proteomics, the study of the set of gene-encoded proteins known as the proteome, uses a range of techniques for separating, identifying, and structurally characterizing proteins. Proteomics goes beyond cataloguing the proteins in a given cell to include their isoforms, post-translational modifications, and protein–protein interactions [56]. Depending on the analysis method, proteomic approaches can be bottom-up or top-down. In bottom-up proteomics, proteolytic digestion breaks the extracted proteins into peptides, which are then analyzed by mass spectrometry (MS). In top-down proteomics, intact proteins are analyzed. The samples used in IBD-related studies are usually obtained from blood (serum or plasma) or colonic biopsies. Liquid chromatography coupled with electrospray tandem mass spectrometry (LC–ESI-MS/MS) is the most widely used proteomic technique in IBD research. Other commonly used techniques include two-dimensional gel electrophoresis coupled with matrix-assisted laser desorption/ionization (MALDI)-MS screening and immunofluorescence microscopy.
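The digestion step of the bottom-up workflow described above can be illustrated with a small in silico sketch. The text does not name a protease; the example assumes trypsin, using the commonly quoted simplified rule that it cleaves after lysine (K) or arginine (R) unless the next residue is proline (P). The function name `tryptic_digest` and the input sequence are hypothetical.

```python
def tryptic_digest(sequence):
    """Split a protein sequence into peptides using a simplified trypsin rule.

    Assumed rule: cleave after K or R, except when the next residue is P.
    Missed cleavages and other real-world effects are ignored.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Keep any C-terminal remainder that does not end in K or R.
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


if __name__ == "__main__":
    print(tryptic_digest("AAKPGGRTTK"))
```

Each resulting peptide would then be measured by MS and matched back to its parent protein, whereas a top-down workflow would skip this step and analyze the intact sequence.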

Due to the strong connections between protein expression and disease activity, the application of proteomics in biomarker discovery is a promising emerging field. Advances in LC–MS instrumentation, such as the combination of ultrahigh-performance liquid chromatography (UPLC) with nano-electrospray ionization and high-resolution mass spectrometry (HRMS), have revealed the potential of MS-based proteomics to compete with or even replace traditional immunoassay techniques. It is hoped that proteomics may help to develop personalized and precision medicine [57]. Instead of focusing on finding a single biomarker, current proteomic biomarker research aims to identify protein biomarker panels representing an individual's disease state. In this context, three approaches have emerged over the past few years: (1) Proteotyping—a multiprotein approach used to determine an individual's unique proteome [58]. (2) Proteogenomics—a multi-omics approach in which genomic and proteomic analyses are performed on the same sample; data obtained from this pairing contain information that would not be obtained using either technique alone [59,60]. (3) Proteoforms—protein variants that result from post-translational modifications of proteins, genetic mutations, or truncations. MS immunoassays are often used to map a specific protein's proteoforms to distinguish between normal and clinical fluctuations [61,62].

To date, the use of proteomics in IBD has focused on three areas: identifying novel protein biomarkers for diagnosis, understanding the pathological mechanisms underlying disease activity, and monitoring the response to treatment. Berndt et al. pioneered the use of proteomics in IBD by performing proteomic analysis of normal and inflamed intestinal mucosa using multi-epitope ligand cartography immunofluorescence microscopy. The authors found that different T-cell populations in the mucosa expressed distinct proteins in each form of IBD [63]. An experimental approach based on combining discovery proteomics with targeted verification experiments successfully assessed transmural intestinal complications in CD, with 70% sensitivity and 72.5% specificity. This approach, which used label-free LC–MS/MS, identified a serological biomarker panel that could discriminate complicated CD from uncomplicated CD, rheumatoid arthritis, UC, and healthy controls [64]. Another study that used LC–MS identified a panel of four proteins that could distinguish active pediatric IBD from non-IBD with high sensitivity and specificity.

Additionally, the study found that two of the identified proteins were elevated in IBD stool samples, demonstrating that fecal samples can be used for measuring these biomarkers [65]. Several studies attempted to identify differentially expressed proteins in patients with UC and CD through proteomic profiling of serum or colonic biopsies. Proteomic profiling of colon biopsies using MALDI-MS identified distinct protein peaks for UC and CD specimens, indicating that it could be possible to differentially diagnose these IBD forms using protein profiles [66–68]. In a study that compared the proteomic spectra of submucosal samples from inflamed UC versus CD and uninflamed UC versus CD, two distinct peaks were identified in the first case, and three in the second [66]. Another study identified a set of 25 proteins as differentiators for UC and CD in colonic mucosal tissue samples obtained from 62 patients with confirmed UC/CD [67]. Screening of mucosal biopsies obtained from children with suspected IBD identified two distinct biomarker panels: one consisted of 5 proteins that were reported to discriminate IBD from control patients, while the other consisted of 12 proteins reported to allow the differential diagnosis of CD and UC patients [68]. Protein profiling of 120 serum samples from patients with CD or UC and inflammatory and healthy controls was performed using surface-enhanced laser desorption/ionization–time-of-flight mass spectrometry (SELDI-TOF-MS). This work identified four diagnostic protein biomarkers for IBD, one of which could reportedly discriminate UC from CD with accuracies similar to or higher than those of the ANCA and ASCA serological tests [69]. Proteomic profiling of stricturing CD, non-stricturing CD, and UC patients identified a smaller set of peptides for differentiating stricture versus non-stricture CD in IBD [70].
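The sensitivity and specificity figures quoted for these panels follow directly from a confusion matrix. The sketch below shows the calculation; the counts are hypothetical and were chosen only so that the output matches the 70% sensitivity and 72.5% specificity reported for the complicated-CD panel, since the underlying patient numbers are not given here.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = true_pos / (true_pos + false_neg)  # fraction of cases detected
    specificity = true_neg / (true_neg + false_pos)  # fraction of controls cleared
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical counts: 40 cases and 40 controls.
    sens, spec = sensitivity_specificity(true_pos=28, false_neg=12,
                                         true_neg=29, false_pos=11)
    print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```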

In addition to diagnostic biomarkers, several studies have used proteomics to identify biomarkers that could be used to assess treatment responses in IBD. One study monitored the treatment response to infliximab in IBD patients by measuring the levels of circulating chemokines and monocyte activation using LC–nano-ESI-MS/MS. The study found that 2 weeks from the start of treatment, decreases were evident in the levels of macrophage-derived CD14 and CD86, as well as the chemokine CCL2, potentially providing a mechanistic explanation for why not all patients respond to this treatment [71]. Another study investigated the treatment response to infliximab and prednisone in children with IBD. The study identified 18 proteins and 3 miRNAs that were responsive to both drugs; some were downregulated with inflammation, while others were upregulated as the inflammation was resolved [72].

#### *3.2. Genetics*

Pathological studies of IBD and its two subtypes suggest a genetic risk factor behind the immune response to the intestinal microbiota. Genome-wide association studies (GWASs) have identified approximately 240 gene loci associated with susceptibility to IBD [73]. Several studies have used genetic profiling of blood samples to identify gene panels that may help to differentiate IBD from healthy controls [74], active from inactive CD [75], and CD from UC [76–78]. Distinct gene panels were also identified in peripheral blood samples from pediatric IBD patients in clinical remission compared to healthy controls [79]. Other studies performed gene expression analysis on mucosal biopsies from IBD patients, and identified distinct gene panels for IBD versus healthy controls [80] and UC versus healthy controls [81]. The use of genetics to identify loci associated with IBD can potentially define causal disease mechanisms, which could, in turn, advance the biomarker discovery process [82].

#### *3.3. Epigenetics*

Epigenetics, which describes changes in gene function caused by gene–environment interactions rather than changes in the DNA sequence, is gaining research interest among scientists seeking to study the pathogenesis and diagnosis of IBD [83,84]. DNA methylation and RNA interference are the two most heavily researched areas in IBD epigenetic studies.

DNA methylation refers to adding a methyl group to cytosine residues in the CpG dinucleotide sequence [85]. Early studies of DNA methylation changes in the mucosa of IBD patients focused primarily on their use as predictors of malignancy [86]. Recent studies have shown that the DNA methylation of specific genes plays a role in the pathogenesis of IBD, suggesting that they could be useful as biomarkers [87,88]. A genome-wide methylation profiling conducted on rectal biopsies identified panels of genes (e.g., *THRAP2*, *FANCC*, *GBGT1*, *DOK2*, *TNFSF4*, *TNFSF12*, and *FUT7*) that showed evidence of differential methylation in CD and UC specimens in comparison to those from healthy controls [88]. Another study identified seven differentially methylated CpG sites in the diseased intestinal tissue of IBD patients compared to normal intestinal tissue from the same patients [89]. Genome-wide changes in DNA methylation have also been analyzed using the peripheral blood of patients with IBD. Analysis of the DNA methylation changes using peripheral blood from CD patients identified 50 genes that showed significant differential methylation compared to that in healthy controls [87]. Site-specific DNA methylation changes in genes associated with IBD pathways have also been identified, with the results showing a 45% overlap of the differentially methylated positions in UC and CD [90].
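Because methylation is described here in terms of CpG dinucleotides, a short sketch can show how candidate CpG sites are located in a DNA sequence. This is only an illustration of the definition; the function name `cpg_positions` and the example sequence are hypothetical, and real methylation profiling relies on array or sequencing data rather than sequence scanning alone.

```python
def cpg_positions(sequence):
    """Return 0-based positions of the C in every CpG dinucleotide.

    A CpG site is a cytosine immediately followed by a guanine on the
    same strand (5'-CG-3'); input case is ignored.
    """
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


if __name__ == "__main__":
    example = "ACGTCGCG"  # hypothetical sequence
    sites = cpg_positions(example)
    print(f"{len(sites)} CpG sites at positions {sites}")
```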

MicroRNAs (miRNAs) are non-coding, single-stranded RNA species that consist of 18–25 nucleotides. Disruptions in their expression profiles and function are observed in human diseases such as cancer and neurological, cardiovascular, and autoimmune diseases [91]. The potential of miRNAs as diagnostic biomarkers and treatment options in IBD has garnered growing interest in the past few years. Colonic tissue and circulating miRNAs (e.g., serum, feces) are the two types of samples used in most of these studies.

Several studies have successfully identified distinct miRNA profiles reflecting the up- or downregulation of one or more miRNAs in colonic biopsy specimens of IBD patients [92] (Table 2). One of the pioneering studies in this area identified the differential expression of 11 miRNAs in the mucosal tissue samples of patients with active UC [93]. Other studies that examined the colonic mucosa of patients with active UC reported upregulation of one or more miRNAs (such as miR-21 [94], miR-150 [95], and miR-155 [94]) and downregulation of others (such as miR-143 and miR-145 [96]), in comparison to healthy controls. Similarly, some studies compared the colonic mucosa of patients with active CD to healthy controls, and reported upregulation of miR-196 [97] and downregulation of miR-7 [98]. Other studies assayed the expression of hundreds of miRNAs, and identified panels differentially expressed in the colonic tissues of patients with UC and CD versus controls [99–101].

**Table 2.** A summary of microRNAs that are correlated with ulcerative colitis (UC, #1–12) or Crohn's disease (CD, #13–22).




HC: healthy controls, RT-qPCR: quantitative real-time polymerase chain reaction, Biopsy: colon tissue biopsy, ISH: in situ hybridization, PBMCs: peripheral blood mononuclear cells, DSS: dextran sodium sulfate, TNF: Tumor necrosis factor alpha. Adapted and modified from Jaslin P. James et al.'s review [92]; *Int. J. Mol. Sci*. 2020, 21, 7893; doi:10.3390/ijms21217893.

Distinct profiles of circulating miRNAs have also been identified in blood samples of IBD patients. Several studies identified many upregulated or downregulated miRNAs in peripheral blood samples from patients with IBD. Samples were obtained from patients with UC or CD versus healthy controls [101–104] and pediatric CD versus healthy controls [105]. Distinct panels of miRNAs have also been identified in fecal samples of IBD patients [106–108]. More investigation into the specificity of miRNAs for IBD is required before they can be used as diagnostic tools, as some miRNAs are known to be associated with other conditions. For example, miR-21 is significantly elevated in the blood of UC patients [103], but is also upregulated in patients with colorectal cancer [109]. One study examined the differential expression of miRNAs between UC and CD in saliva, in addition to blood and colon tissue samples [110]. The study identified several miRNAs (i.e., miR-21, miR-31, miR-142-3p, miR-142-5p) whose expression levels in all three types of samples were significantly altered between IBD and non-IBD patients.

#### **4. Challenges and Future Directions**

#### *4.1. Proteomic Biomarker Discovery*

The typical protein biomarker discovery and validation process consists of six phases: discovery, qualification, verification, assay optimization, clinical evaluation/validation, and commercialization [111]. During the discovery phase, researchers identify a list of 20 to several hundred proteins that are differentially expressed between healthy and disease-confirmed samples. This identification process is based on an unbiased, semi-quantitative assessment of peptide abundances in both samples. In the next phase, qualification, this unbiased approach is replaced with a targeted analysis to confirm the differential expression of the candidate proteins identified in the discovery phase. In the verification phase, a larger number of samples is used to account for the variations in the proteomes of the different studied sets. At this stage, specificity and sensitivity acquire particular importance when the researchers select the few protein biomarkers used in the assay optimization and clinical evaluation phases. In the assay optimization phase, an antibody is selected for each biomarker candidate and used to develop an immunoassay to replace the MS step in protein quantification. During the evaluation/validation phase, the assay is evaluated for analytical parameters, such as accuracy and precision. If clinical validation is successful, the protein biomarker moves to the commercialization phase [111].

The path to successful protein biomarker discovery through this multistage process faces several challenges. As a result, the introduction of new protein biomarkers has been slow, and has not met the clinical need for proteomic tests [112]. Some relevant challenges include the low number of samples under study and the lack of well-designed study methods and standard protocols [113]. These variables can be optimized through more careful choices of sample types and sizes. Sample selection and processing require special consideration when performing a proteomic analysis. For example, human plasma contains tens of thousands of proteins that differ in their structures and abundances [114]. It is not always possible to identify one or more disease-specific proteins that could be used as markers for a particular disease. The proteins selected in the discovery phase often turn out to be false positives. This is primarily due to the low frequency of selecting low-abundance proteins and limitations in their detection [111]. Even using other sample types, such as urine, cerebrospinal fluid, cell line homogenates, or tissue lysates, has not eliminated this complexity [111]. There are also considerations more specific to the study of IBD. Intestinal mucosal biopsies are widely used in IBD studies. Protein degradation during and after extraction might lead to the under- or over-representation of specific proteins [115]. The use of protease inhibitors that minimize protein degradation can keep this variable under control. Cell heterogeneity of the mucosal specimens is another variable that could lead to an inaccurate proteome analysis [115]. Enriching samples for specific cell types and/or organelles can lower the sample's complexity and improve the protein identification efficiency [115,116]. The statistical power of a proteomic study is another factor that requires special attention in the biomarker discovery pipeline, especially in the discovery and verification stages.
Skates et al. proposed a statistical framework for increasing the probability of identifying a biomarker that can reach the clinical validation stage [117]. According to their framework, the success of a biomarker in reaching clinical validation depends on the number of candidate proteins examined at each stage, the separation in biomarker signal between cases and controls (as measured by standard deviation), and the percentage of cases in which the biomarker is expressed. The authors provided probability tables that can be used in determining the proper sample size for a given study.
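To make the link between case-control separation and sample size concrete, the sketch below uses the standard normal-approximation formula for comparing two group means, n = 2(z₁₋α/₂ + z_power)² / d², where d is the separation in standard deviations. This is a generic textbook calculation and not the framework or the probability tables of Skates et al.; it ignores the number of candidates tested and the fraction of cases expressing the biomarker. The function name `samples_per_group` is hypothetical.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size_sd, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-sample comparison.

    `effect_size_sd` is the case-control separation in standard deviations.
    Uses the normal approximation, so very small studies are slightly
    underestimated compared with a t-test based calculation.
    """
    if effect_size_sd <= 0:
        raise ValueError("effect_size_sd must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size_sd ** 2)


if __name__ == "__main__":
    for separation in (0.5, 1.0, 2.0):
        print(f"separation {separation} SD: "
              f"{samples_per_group(separation)} samples per group")
```

The inverse-square dependence on the separation shows why weakly separated candidates demand far larger verification cohorts than strongly separated ones.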

Although significant progress has been achieved in the instrumentation and sample preparation of proteomic techniques, proteomics in biomarker discovery is still in its early stages. Compared to molecular biomarkers, significant work is required to prove the utility of any protein panel as a new biomarker for IBD.

#### *4.2. Epigenetics in Diagnostic Biomarkers*

Epigenetic signatures are tissue- and cell-type-specific. A major challenge in IBD epigenetic studies using peripheral blood or mucosal biopsies is the cell-type heterogeneity of these specimens. Additional non-disease-specific cell types can complicate the interpretation of the data, because their individual epigenetic features interfere with the signal of interest. Thus, disease-specific cell types should be purified from the mixed cell or tissue samples before analysis. However, several cell types have been linked to the pathogenesis of IBD, making the selection of disease-specific cell types in IBD a challenge. Although the techniques used in epigenetic studies are well established, they also have their limitations. Most microRNA studies use real-time quantitative PCR followed by microarrays. Although these techniques can identify a large number of miRNAs, they are not sensitive to functionally distinct microRNA variants and slight nucleotide variations between microRNAs in the same families. They also have a low dynamic range, and cannot detect miRNAs with low expression levels [118]. Next-generation sequencing (NGS) is a fast, high-throughput method that has recently emerged as a more effective technique for identifying novel microRNAs [119].

Other challenges emerge from environmental factors, such as age, diet, and smoking, which can affect the epigenome. Hence, a well-designed study seeking to identify disease-specific variations selectively would require a careful selection of patients and controls.

#### **5. Conclusions**

The role of endoscopy and inflammatory biomarkers in the diagnosis of IBD has been extensively studied over the years, improving our understanding of the utility and limitations of each diagnostic tool in clinical settings. Although the combination of endoscopy and molecular tests has become a well-established diagnostic tool for IBD, there is continuing effort to find an ideal diagnostic tool that can overcome the challenges limiting the current tools. Lately, there has been growing interest in switching from using a single biomarker to the biomarker panel approach, in an effort to identify biomarkers that, together, are specific to IBD and can enable differential diagnosis of UC versus CD. This shift in research focus is evident from the increasing number of studies looking into the use of proteomics and genomics for identifying biomarker signatures. As the causes of IBD are still undetermined, with immunological, genetic, and environmental triggers having been found to contribute to disease progression [120–123], researchers also continue to search for new molecular biomarkers that are associated with these factors—especially in the context of new fecal biomarkers and serological antibodies.

**Author Contributions:** C.Y. and D.M. developed the concept, and Z.A. wrote the draft. The manuscript was then critically revised by D.M. and C.Y. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was funded by the Department of Veterans Affairs (Merit Award BX002526 to D. Merlin) and by the National Institute of Diabetes and Digestive and Kidney Diseases (RO1-DK-116306 and RO1-DK-107739 to D. Merlin). D. Merlin is a recipient of a Senior Research Career Scientist Award (BX004476) from the Department of Veterans Affairs.

**Data Availability Statement:** No new data were generated or analyzed in support of this research.

**Acknowledgments:** The authors appreciate the support from Elsevier in reusing the figure and table from *Gastroenterology* (Volume 140, Issue 6, pages 1817–1826.e2) for our review's Figure 1 and Table 1. We also appreciate S. Karger AG, Basel, for allowing reuse of the figure from the journal *Digestive Diseases* (2021; 39: 190–203, https://doi.org/10.1159/000511641) for our review's Figure 2, and thank MDPI for allowing reuse of the table from *Int. J. Mol. Sci.* (2020, 21, 7893; doi:10.3390/ijms21217893) for our review's Table 2.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**


### *Review* **Cell Death-Related Ubiquitin Modifications in Inflammatory Syndromes: From Mice to Men**

**Nieves Peltzer 1,2,3,\*,† and Alessandro Annibaldi 1,\*,†**


**Abstract:** Aberrant cell death can cause inflammation and inflammation-related diseases. While the link between cell death and inflammation has been widely established in mouse models, evidence supporting a role for cell death in the onset of inflammatory and autoimmune diseases in patients is still missing. In this review, we discuss how the lessons learnt from mouse models can help shed new light on the initiating or contributing events leading to immune-mediated disorders. In addition, we discuss how multiomic approaches can provide new insight on the soluble factors released by dying cells that might contribute to the development of such diseases.

**Keywords:** cell death; apoptosis; necroptosis; pyroptosis; ubiquitin; LUBAC; OTULIN; A20; inflammation; autoimmunity; human genetics

#### **1. Introduction**

The field of cell death has undergone a substantial evolution over the past three decades. During the 1990s and the first years of the new century, scientists mainly focused on how knowledge of cell death pathways could help improve cancer therapy by eliminating as many cancer cells as possible, with little interest in the pathophysiological roles of cell death [1,2]. In the last decade or so, the field has moved in a new direction, aiming to understand how the decision between life and death regulates tissue homeostasis and inflammatory responses during tissue damage or pathogen infection [3,4]. It became increasingly clear that cell death-regulating molecules are not always programmed to kill and can fulfil cell death-independent functions [5,6]. In addition, cell death is not a biological end point and, in the process of dying or after their death, cells can still emit signals in a programmed manner [7]. These signals evoke inflammatory programs that are essential for the ability of tissues to recover from different types of insults and restore homeostasis [8]. However, aberrantly regulated cell death can exacerbate inflammatory processes that can in turn cause tissue failure, inflammatory disorders and autoimmunity [9]. Therefore, the magnitude of cell death processes is always fine-tuned by multiple control mechanisms that are in place to prevent the detrimental effects of uncontrolled cell death [10].

Cytokines of the tumour necrosis factor (TNF) family are crucial regulators of cell death, inflammation and autoimmunity [11]. TNF receptor 1 (TNFR1) is a member of the death receptor (DR) family. These receptors are characterised by the presence, in their intracellular portion, of a death domain (DD) that is able to initiate cell death cascades [11]. Other well-studied members of this family are CD95 (Fas/APO-1), TNF-related apoptosis-inducing ligand (TRAIL)-R1 (DR4) and TRAIL-R2 (DR5) [12]. It is the research conducted on the TNF-TNFR1 system during the past two decades that allowed scientists to discover the previously unappreciated link between cell death and inflammation. Indeed, the numerous mouse models developed in the past 10 years, bearing inactivating mutations in components of the TNFR1-signalling pathway, revealed that cell death could be the triggering event in inflammatory and autoimmune diseases [9]. In this review, we focus on different mouse models that led to the discovery of the relationship between cell death and inflammation, how they helped establish the link between cell death and inflammation-related disorders and how these disorders resemble human autoinflammatory and autoimmune diseases. In addition, we review the latest omic-based approaches adopted to elucidate the inflammatory potential of dying cells.

**Citation:** Peltzer, N.; Annibaldi, A. Cell Death-Related Ubiquitin Modifications in Inflammatory Syndromes: From Mice to Men. *Biomedicines* **2022**, *10*, 1436. https://doi.org/10.3390/biomedicines10061436

Academic Editor: Marianna Christodoulou

Received: 20 May 2022 Accepted: 15 June 2022 Published: 17 June 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **2. TNFR1-Signalling Pathway**

Upon binding to its cognate ligand, TNF, TNFR1 trimerizes and initiates the formation of a receptor-bound complex called complex-I or TNFR1-associated signalling complex (TNFR1-RSC) [13]. The first event in complex-I formation is the recruitment of the adaptor protein TRADD and the kinase RIPK1 via DD interaction [14]. Subsequently, TRADD recruits TRAF2, which in turn binds to the E3 ligases cIAP1 and cIAP2 [15]. These cIAPs synthesise ubiquitin chains of different topologies (i.e., K63, K48 and K11) on different components of complex-I, including themselves and RIPK1 [16,17]. These ubiquitin chains serve as a scaffold to recruit another E3 ligase, the linear ubiquitin chain assembly complex, LUBAC [18]. LUBAC is a tripartite E3 ligase complex formed of HOIP, HOIL-1 and SHARPIN [19,20]. LUBAC conjugates linear ubiquitin chains, also called methionine (M) 1 chains, to several complex-I components, including RIPK1 and TNFR1 [21]. The ubiquitin chains formed by cIAP1/2 and LUBAC stabilise complex-I and favour the recruitment of different kinase complexes: TAB2/TAB3/TAK1 [22–24], NEMO/IKKα/IKKβ and NEMO/TANK1/NAP1/TBK1/IKKε [25]. The TAK1 and IKKα/IKKβ kinases are required for the activation of NF-κB and MAPKs, which drive the expression of pro-survival and pro-inflammatory genes that are required to mount an innate immune response [26]. TAK1 also controls MK2 activation. The M1 chains synthesised by LUBAC also serve to recruit deubiquitinating enzymes (DUBs), namely A20 and CYLD, the latter via the adaptor SPATA2, which have opposing effects in complex-I [21,27]. While CYLD hydrolyses ubiquitin chains, predominantly K63 chains [28], to control the extent of NF-κB activation, A20 shields them and prevents their removal, ensuring complex-I stability [21] (Figure 1).

The TNF-signalling pathway is tightly controlled by a number of checkpoints that rely on ubiquitination-, phosphorylation-, gene-expression-dependent and proteolytic events [10,29]. In conditions where any of these checkpoints is disabled, there is the formation of a secondary cytoplasmic complex, referred to as complex-II, which is composed of FADD, Caspase-8, cFLIP, RIPK1 and RIPK3, and that has cytotoxic activity [10,13,29]. Complex II can induce (i) apoptosis, via the activation of the initiator Caspase-8, which, in turn, cleaves and activates the executioner Caspase-3 and Caspase-7 [30]; (ii) necroptosis, mediated by the kinase activity of RIPK1 and RIPK3 and the pseudokinase MLKL, which, following RIPK3-mediated activation, forms pores in the plasma membrane [31–33]; and (iii) pyroptosis, following Caspase-8-mediated cleavage of Gasdermin D that, similarly to MLKL, has the ability to form pores on the plasma membrane, leading to a lytic type of death [34,35]. While in non-immune cells Caspase-8 activation mainly leads to apoptosis, in innate immune cells, such as macrophages, Caspase-8 activation can induce both caspase-dependent apoptosis and Gasdermin D-mediated pyroptosis-like death [35] (Figure 1).

**Figure 1.** TNFR1-induced signalling pathway. Cartoon depicting the TNFR1-induced signalling pathway. Upon binding of TNF to TNFR1, a membrane-bound complex referred to as complex-I forms. This complex is characterised by the presence of adaptor proteins (e.g., TRADD, TRAF2, SPATA2, TAB1/2 and NAP1/TANK1), E3 ligases (e.g., cIAP1/2 and LUBAC), which conjugate poly-ubiquitin chains of different topology (i.e., K63, K48, K11 and M1) to different proteins of the complex, the deubiquitinases (DUBs) A20 and CYLD, and protein kinases such as RIPK1, IKK1/2, TAK1 and TBK1/IKKε. Complex-I promotes the activation of NF-κB and MAPKs that in turn mediate the expression of pro-survival as well as pro-inflammatory genes. Under certain circumstances, a secondary complex, referred to as complex-II, originates in the cytosol from complex-I. This complex is composed of FADD, cFLIP, Caspase-8, RIPK1 and RIPK3. Caspase-8 can trigger apoptosis via activation of Caspase-3 or, in some cell types, pyroptosis via cleavage of Gasdermin D. Upon Caspase-8 inhibition by means of synthetic or virally encoded caspase inhibitors, RIPK1 activates RIPK3, which in turn phosphorylates MLKL, which undergoes activation and executes necroptosis. Of note, OTULIN, XIAP and MK2 have important regulatory functions in the TNFR1-signalling pathway, although they are not directly recruited to complex-I or complex-II. OTULIN regulates the availability of the LUBAC components for their recruitment to complex-I. XIAP controls RIPK1 ubiquitination status outside complex-II and potentially its cytotoxic activity. MK2, by phosphorylating RIPK1, modulates its killing activity.

The conjugation and hydrolysis of ubiquitin moieties are among the most studied checkpoint mechanisms that control complex-II formation and restrain TNF cytotoxicity [36]. Amongst the key proteins and protein complexes responsible for these ubiquitin-system-mediated control mechanisms are LUBAC (E3 ligase complex) [37], XIAP (X-linked IAP) [38], A20 and OTULIN (DUBs) [21] (Figure 2). In the next sections, we focus on the pathological consequences of mutations disrupting the activity of these E3s and DUBs in genetically modified mice and human patients, and highlight the similarities and differences between the two systems.

**Figure 2.** A20, LUBAC, OTULIN and XIAP regulate the balance between cell death and inflammation in mice. Cartoons illustrating how A20 (**a**), LUBAC (**b**), OTULIN (**c**) and XIAP (**d**) control the balance between NF-κB-mediated gene activation and complex-II-mediated cell death in mouse cells. A20 deletion results in the deregulation of both NF-κB response and RIPK1/RIPK3/MLKL-induced necroptosis, which, in turn, triggers inflammasome activation (**a**). Individual deletion of the LUBAC components causes an attenuation of NF-κB response, but an exacerbation of complex-II-mediated cell death, which can result in embryonic lethality or cell death-dependent inflammation in adult mice (**b**). OTULIN deletion leads to hyperactivation of NF-κB and, at the same time, can unleash complex-II-mediated cell death. This can in turn cause embryonic lethality or cell death-dependent inflammation in adult mice (**c**). XIAP deletion causes both Caspase-8-dependent apoptosis and RIPK3-dependent inflammasome activation, which eventually triggers inflammation (**d**).

#### **3. LUBAC**

Genetic deletion of any of the three LUBAC components causes either absence of linear chains (HOIP and HOIL-1) or reduction in linear chains (SHARPIN) in TNFR1-induced complex-I [39]. The phenotype of mice bearing a naturally occurring mutation in the SHARPIN gene, referred to as *cpdm* (chronic proliferative dermatitis mice) (Table 1), which causes its deletion, initially puzzled scientists in the cell death field [40]. Although SHARPIN deletion caused reduced NF-κB-mediated gene activation in vitro, *cpdm* mice developed chronic dermatitis and multiorgan inflammation [20]. This conundrum was solved by findings showing that (i) attenuation of linear ubiquitination in complex-I on the one hand impairs NF-κB activation and reduces gene expression, but on the other hand causes cell death by favouring complex-II formation in response to TNF stimulation [20]; and (ii) TNF-mediated cell death can be a potent trigger of inflammation. The latter was supported by the evidence that genetic deletion of TNF rescues *cpdm* mice from developing dermatitis, a finding that represented a watershed in the cell death field [20]. Indisputable evidence confirming that cell death was the cause of inflammation in *cpdm* mice came from the fact that combined deletion of FADD in keratinocytes or Caspase-8, to suppress apoptosis, and RIPK3 or MLKL, to suppress necroptosis, prevented the inflammatory phenotype of SHARPIN-mutant mice [41,42]. Around the same time, animal models of cell death-induced inflammation, some of which we discuss here, boomed, further corroborating the notion that cell death can be the etiological agent of inflammatory syndromes. Unlike mice bearing the SHARPIN mutation, mice lacking HOIP or HOIL-1 die during embryogenesis due to exacerbated endothelial cell death and heart defects [39,43] (Table 1). Yet, the selective deletion of HOIP or HOIL-1 in keratinocytes causes severe skin inflammation, which is cell death dependent [44,45]. However, unlike in *cpdm* mice, lethal dermatitis was found to be only partially TNF-driven [44]. Indeed, while concomitant deletion of RIPK3 and Caspase-8 completely prevents the inflammatory lesions, loss of TNFR1 delays the onset of dermatitis, but mutant mice still succumb later in life due to severe skin inflammation [44]. What triggers cell death in the absence of linear ubiquitination beyond TNF? It was reported that the inflammatory phenotype occurring as a consequence of HOIL-1 deficiency could be significantly delayed by the simultaneous deletion of TNFR1 and the DD of CD95 and TRAIL-R. This indicates that these three death receptors act in concert to induce cell death in the skin when LUBAC activity is completely abrogated [44]. In the liver, HOIP deletion causes hepatocellular carcinoma that arises from inflammation caused by hepatocyte death [46]. The role of LUBAC in immune cells has also been described. T cell-specific deletion of HOIP or HOIL-1 leads to an almost complete depletion of CD4/CD8+ T cells in mice [47].
HOIP- and HOIL-1-deficient T cells exhibited delayed NF-κB activation upon TCR and TNFR1 stimulation, consistent with a role for LUBAC in the activation of NF-κB. However, enforced NF-κB activation via the overexpression of a constitutively active version of IKKβ (IKKβca) does not restore normal T cell numbers [47]. Equally intriguing was the fact that T cell developmental defects also seem to be independent of cell death activation. Similarly, loss of LUBAC in B cells impairs signalling via the TNFR superfamily member CD40, highlighting an important role of linear ubiquitination in B cell activation [48]. Notably, mice with full-body deficiency in HOIP or HOIL-1 displayed severe defects in haematopoietic progenitors, which affected erythropoiesis, and this was independent of cell death [39,49] (Table 1). This evidence would suggest that, unlike in keratinocytes or other cell types, LUBAC activity does not primarily inhibit cell death in immune cells, but rather plays an important role in coordinating different signals required for cell development or differentiation.

Lastly, although HOIL-1 also bears E3 ligase activity, this activity is not essential for LUBAC function, as mice expressing catalytically inactive HOIL-1 are viable [18,50,51]. Instead, HOIL-1 catalytic activity limits linear ubiquitination. Indeed, mouse embryonic fibroblasts (MEFs) expressing catalytically inactive HOIL-1 are protected from cell death and have enhanced NF-κB activation in response to TNF due to increased levels of linear ubiquitin chains [39,51]. As a consequence, mice expressing catalytically inactive HOIL-1 are protected from hepatocyte death in a model of liver damage and are also protected from dermatitis in a *cpdm* background [51]. Intriguingly, mice harbouring catalytically inactive HOIL-1 displayed increased glycogen deposition in muscle [52].

To date, only a limited number of patients bearing LUBAC mutations have been reported. The amino acid sequence similarity between human and mouse is 91.6% for HOIL-1, 86.5% for HOIP and 73.6% for SHARPIN. The first reported LUBAC-mutant patients, in 2012, were two sisters from a non-consanguineous marriage, with compound heterozygous mutation of HOIL-1, consisting of a deletion and a nonsense mutation (p.Q185X), and one boy from a consanguineous marriage, with homozygous deletion of two nucleotides (c.121\_122delCT) in HOIL-1 [53]. A few weeks after birth, they started developing a series of disorders including autoinflammation (e.g., abdominal pain), immunodeficiency, which rendered the three patients susceptible to bacterial infections, amylopectin-like deposits in muscle and cardiomyopathy [53]. They all died by the age of eight years [53]. Immune cells lacking HOIL-1, in particular monocytes, are hyperresponsive to IL-1β stimulation, with exaggerated cytokine production, including IL-6 and IL-8. On the other hand, non-immune cells, such as fibroblasts, exhibited a delayed NF-κB response following IL-1β and TNF stimulation and severely impaired cytokine and chemokine production [53] (Table 1). This dichotomy most likely explains the paradoxical clinical phenotype observed in the three HOIL-1 mutant patients. Indeed, while the hyperresponsiveness of monocytes is the underlying cause of autoinflammation, the failure of non-immune cells to mount an innate immune response could explain the immunodeficiency and susceptibility to bacterial infections. One year later, two independent studies reported a few more cases of patients with homozygous or compound heterozygous truncating or missense mutations of HOIL-1 [54]. These patients suffered from progressive muscular weakness, abnormal accumulation of glycogen in muscles and cardiomyopathy. Intriguingly, they presented no sign of autoinflammation or immunodeficiency [54]. Even more intriguing was the discovery of two more patients, in 2018, carrying HOIL-1 mutations with both autoinflammatory/immunodeficiency and myopathic features [55]. At present, it is not known what determines whether HOIL-1 mutant patients predominantly have one or the other phenotype, or both. It is also difficult to understand whether the mutations present in these patients would result in reduced HOIL-1 E3 ligase activity or whether they would rather behave as a linear ubiquitin-null mutant.

The first patient with a homozygous HOIP missense mutation (L72P) was reported in 2015 [56]. This patient was born to consanguineous parents. The second patient, born to non-consanguineous parents and carrying biallelic variants, was identified in 2019 [57]. In both cases, the authors reported that, similarly to some HOIL-1 mutant patients, fibroblasts have impaired NF-κB activation upon TNF and IL-1β stimulation while monocytes are hyperresponsive to IL-1β stimulation [56]. In both cases, patients presented clinical features characteristic of multiorgan autoinflammation and immunodeficiency (recurrent bacterial infection). Lymphopenia (T cell depletion) was only present in the first observed patient [56]. The similar clinical manifestations between HOIL-1 and HOIP mutant patients can probably be attributed to the loss of linear ubiquitination, which is a common feature of loss of these two proteins, at least in mice [39] (Table 1).


**Table 1.** Overview of the pathological consequences of the deletion or mutation of the indicated genes in mice and human patients.



Very recently, a non-synonymous variant of SHARPIN was identified as a genetic risk factor for LOAD (late-onset Alzheimer's disease) in a cohort of 202 Japanese individuals [58]. This variant has an amino acid substitution (G186R) that seemingly affects its subcellular localisation and NF-κB activation. In a follow-up study, six more SHARPIN variants were identified from a cohort of 180 patients with LOAD and 184 patients with mild cognitive impairment (Table 1). This, at present, is the only reported association between SHARPIN mutation and human disease [59].

The different LUBAC mutant mouse models described above have been instrumental in unveiling the physiological role of linear ubiquitination and how linear ubiquitin chains orchestrate inflammatory programs. Equally important, they made it possible to unravel the link between cell death and inflammation and the strong inflammatory potential of cell death activation. Mouse work revealed that the individual LUBAC components contribute to optimal gene activation following stimulation of immune receptors, including TNFR1, but, most importantly, they limit the killing activity of TNF. Therefore, the predominant phenotypic effect triggered in mice by their absence is cell death and cell death-induced inflammation. However, the phenotypic similarities between LUBAC mutant mice and patients are limited to some features of glycogen deposition and heart defects. In the human setting, loss-of-function mutations of HOIP, HOIL-1 or SHARPIN result not only in autoinflammation, but also in immunodeficiency, glycogen storage disorders (HOIP and HOIL-1) and neurodegeneration (SHARPIN). This might indicate that (i) in humans, the gene-activating functions of LUBAC are as predominant as, or more predominant than, its cell death inhibitory functions, although the occurrence of cell death has not been fully analysed in patients; and (ii) those individuals carrying LUBAC mutations who escape lethality in utero might have backup systems in place to regulate cell death and inflammation that are not completely dependent on LUBAC. An extensive cell death analysis in LUBAC mutant patients using classical cell death markers might help elucidate the role of LUBAC in controlling organism homeostasis and identify therapeutic strategies to improve the care of LUBAC mutant patients.

#### **4. OTULIN**

OTULIN is the only known linear chain-specific deubiquitinase [81]. The amino acid sequence similarity between human and mouse OTULIN is 90.1%. In 2016, a homozygous missense mutation of OTULIN was identified in three siblings, from a consanguineous family, affected by a severe sterile form of autoinflammation, which was named OTULIN-related autoinflammatory syndrome (ORAS) [61]. In the same year, another group identified three more patients, from three different consanguineous families, carrying OTULIN biallelic mutations and symptoms of systemic sterile inflammation (e.g., prolonged fevers and diarrhoea), which they called Otulipenia [63] (Table 1). A few more patients carrying compound heterozygous mutations in OTULIN were identified, with similar clinical manifestations [62]. This prompted different groups of scientists to investigate the molecular basis of this mutant OTULIN-driven inflammatory disorder using mouse models. The most intuitive explanation as to why OTULIN-mutant patients suffered from a severe inflammatory disease was that, without functional OTULIN, there would be an excess of linear chains that would in turn exacerbate NF-κB responses and the consequent production of pro-inflammatory factors. Indeed, both OTULIN-mutant patients and OTULIN-deficient mice (conditional full body and myeloid cell-specific) exhibited linear chain accumulation, increased NF-κB activation and excessive cytokine production [61,63]. The fact that Infliximab (TNF-neutralising antibody) drastically reduced the inflammatory syndrome in patients and mice [61] indicated that the main instigator of the excessive NF-κB activation and cytokine production observed in the absence of functional OTULIN (patients) and full-length protein (mice) is TNF (Table 1). Intriguingly, it was concluded that, unlike in LUBAC-deficient mice, it was the gene activation ability of TNF rather than its cell death-promoting potential that caused the inflammatory phenotype in mice lacking OTULIN.

This view was subsequently challenged by another study that, using a catalytically inactive mutant of OTULIN in mice, showed that the primary effect of linear chain accumulation is not NF-κB hyperactivation but rather complex-II-mediated cell death, in the form of apoptosis and necroptosis [64] (Table 1). The reason is that absence of OTULIN causes hyperubiquitination of the LUBAC components, which impairs their recruitment to complex-I (Figure 1) [21,64]. This would in turn favour complex-II formation and cell death. This scenario was later confirmed by reports showing that OTULIN deletion in the liver causes TNFR1-driven, apoptosis- and compensatory proliferation-mediated liver pathology, while OTULIN deletion in keratinocytes causes TNFR1-driven, RIPK1 kinase activity-mediated, cell death-dependent skin inflammation [65,66] (Table 1). Of note, some OTULIN-mutant patients display signs of liver dysfunction and skin inflammation in the form of panniculitis and neutrophilic dermatosis [67]. Despite the contrasting results concerning the etiological agent of the systemic inflammatory syndrome that characterises OTULIN mutant mice (NF-κB hyperactivation vs. cell death), the common denominator of these mouse models is that, as in the human setting, the blockade of the TNF/TNFR1 system significantly ameliorates the disease.

Importantly, OTULIN has also been implicated in signalling events that are different from cell death and inflammation per se. For example, the *Gumby* mutation, which is a spontaneous mutation in OTULIN (W96R), results in embryonic lethality resembling that reported for catalytically inactive OTULIN mice [60]. *Gumby* mice display increased Wnt signalling [60]. Whether this is the cause of lethality remains unresolved. In addition, the pathology of OTULIN deficiency in the liver seems to be independent of TNFR1 signalling but dependent on aberrant mTOR activation [67].

The OTULIN-mutant mouse models were extremely useful in unveiling the link between OTULIN mutations and excessive linear chains in ORAS/Otulipenia. In addition, they helped explain why infliximab has such therapeutic benefits in patients. At present, as for LUBAC-mutant patients, evidence that OTULIN absence can unleash inflammatory cell death in patients is still missing. One could speculate that absence of OTULIN activity can induce cell death only in some cell types, while in others sustained NF-κB activation would be the main outcome. Future analysis employing cell death-specific staining will be required to understand the mechanisms of hyperinflammation and the respective contributions of cell death and NF-κB in OTULIN mutant patients.

#### **5. A20**

A20 is a deubiquitinase enzyme that exhibits 88.1% amino acid sequence conservation between human and mouse. It was identified in 1990 as an NF-κB target gene [82] that was able to prevent TNF-induced cytotoxicity [83]. It was subsequently discovered that A20 is not only an NF-κB target gene but also an inhibitor of the NF-κB-signalling pathway [84]. Consistent with this idea, A20 null mice die perinatally due to multiorgan inflammation [84]. Intriguingly, the ability of A20 to suppress inflammation does not reside in its deubiquitinase activity, since mice carrying a point mutation in the A20 catalytic domain do not display inflammation [85]. After the realisation that A20 deubiquitinase activity is dispensable for controlling inflammation, different groups tried to uncover which other domains of A20 are responsible for this. A20 is a ubiquitin-editing enzyme, which possesses not only DUB activity but also E3 ligase activity, mediated by the fourth zinc finger domain (ZnF4) [68]. Surprisingly, the E3 ligase activity is not required to keep inflammation in check, since ZnF4 mutant mice are viable and healthy [86,87]. A major advance in the understanding of the role of A20 in repressing inflammatory processes came from studies in which ZnF7 was mutated. It was shown that ZnF7 is required for the ability of A20 to suppress NF-κB activation, in a non-catalytic fashion [88]. Subsequent work showed that A20 stabilises linear chains by direct ZnF7-mediated binding; indeed, in the absence of A20, complete absence of linear chains was observed in complex-I [21,89]. Therefore, a model was proposed whereby A20, via the ZnF7 domain, binds to and shields linear chains, preventing their hydrolysis by other DUBs (e.g., CYLD) and ensuring complex-I stability. At the same time, this shielding prevents the excessive recruitment of NF-κB-activating molecules to complex-I, such as the NEMO/IKK complex, thereby controlling NF-κB activation [21,90].
Consistent with this model, ZnF7 mutant mice display spontaneous inflammation [76]. Similar to the various LUBAC-mutant mice, the *Tnfaip3*<sup>−/−</sup>, the *Tnfaip3*<sup>myel-KO</sup> (A20 full knock-out or selectively in myeloid cells, respectively) and *Tnfaip3*<sup>ZnF7mut</sup> mutant mice were extremely useful in strengthening the link between cell death and inflammation, and in gaining a better understanding of the cause of the inflammatory diseases displayed by A20 mutant patients. A20-deficient mice suffered from severe multiorgan inflammation, including, but not limited to, liver, kidneys and joints. Similarly, although with a milder phenotype, mice lacking A20 in myeloid cells and those bearing a ZnF7 mutant version of A20 developed arthritis [76] (Table 1). Importantly, it has been proven at the genetic level that the cause of arthritis in these mice is not hyperactivation of NF-κB, but rather RIPK1/RIPK3/MLKL-dependent necroptosis of macrophages. This leads in turn to NLRP3 inflammasome activation within the same dying macrophages with the consequent release of IL-1β. Excessive production of IL-1β then causes cartilage erosion and joint inflammation [76,91].

In humans, it has been known for two decades that A20 is a susceptibility gene for autoinflammatory diseases such as systemic lupus erythematosus (SLE), rheumatoid arthritis, psoriasis and diabetes [70–75]. However, it was only in 2016 that it was proven that loss-of-function germline mutations in A20 cause systemic autoinflammatory disease [69]. The authors of this study identified five heterozygous truncating mutations in five families. Patients carrying the mutations displayed a range of clinical manifestations including early-onset systemic inflammation, arthritis, oral and genital ulcers, SLE-like disease and central nervous system vasculitis. This systemic inflammatory disease caused by A20 haploinsufficiency was named HA20. Patient-derived immune cells had a strong inflammatory signature (e.g., elevated levels of TNF, IL-6 and IL-17) and were hyperresponsive to inflammasome activation following LPS stimulation. Along the same line, patient-derived fibroblasts exhibited increased NF-κB activation upon TNFR1 stimulation [69] (Table 1).

A20 is another example of how mouse models can be extremely valuable in accelerating the understanding of (i) the genetic cause of a human pathology, or group of pathologies, and (ii) the molecular mechanisms driving the pathology. At the same time, genetic studies in humans can indicate how to refine the existing mouse models to develop better preclinical disease models. For example, although the A20 mutant mice develop a set of diseases that quite closely recapitulate the patients' clinical features, it is surprising that almost all the human mutations are found in the OTU catalytic domain (catalytically inactive A20 mutant mice are normal), and no mutation has ever been found in the ZnF7 domain [92]. One explanation could be that in patients there is very little to no A20 detected in lysates from fibroblasts or PBMCs, suggesting that the mutations in the OTU domain destabilise A20 rather than solely abolishing its catalytic activity [69]. This possibility could be addressed by generating new genetically modified mice bearing the corresponding human A20 mutations. In addition, the fact that in mice the individual mutations in the OTU and ZnF4 domains do not trigger the inflammatory phenotype [86,87] could suggest that the combination of the two mutations might induce the phenotype observed in *Tnfaip3*<sup>−/−</sup> and *Tnfaip3*<sup>ZnF7mut</sup> mice. These new potential mouse models would broaden the possibilities for studying HA20 and finding novel therapeutic approaches.

Finally, while the mouse models clearly indicated that the aetiology of the systemic inflammation observed in A20 mutant mice is cell death, in patients there is still a lack of evidence supporting this possibility. Similar to what was highlighted for LUBAC and OTULIN mutant patients, cell death marker staining and (phospho)proteomic and ubiquitinome analyses of patients' samples might help determine the contribution of cell death vs. NF-κB hyperactivation to the disease.

#### **6. XIAP**

XIAP is an E3 ligase enzyme that belongs to the IAP (inhibitor of apoptosis) family [93]. Human and mouse XIAP share 89.3% of their amino acid sequence. It was initially characterised as able to inhibit Caspase-9, by preventing its dimerization, and Caspase-3 and Caspase-7, by blocking their active sites [94–97]. However, it subsequently became clear that XIAP is an important immune regulator, both in a cell death-independent and -dependent manner. Indeed, its E3 ligase activity plays a crucial role in pathogen responses mediated by NOD2, a member of the NLR family [98]. In the NOD2-signalling pathway, XIAP-mediated ubiquitination of RIPK2 is crucial for the correct activation of the pathway and secretion of the cytokines needed for the pathogen response [98]. Therefore, the importance of XIAP in the NOD2 pathway is independent of its ability to regulate cell death. By contrast, following TLR activation (e.g., TLR2 or TLR4), XIAP is required to prevent, in myeloid cells, RIPK3-mediated necroptosis and the concomitant NLRP3 activation and IL-1β release [99] (Figure 2 and Table 1). Additionally, although XIAP is not recruited to TNF-induced complex-I, it regulates RIPK1 ubiquitination outside complex-I, therefore contributing to the regulation of complex-II formation [38] (Figure 1).

In 2006, mutations in XIAP were found in 12 individuals belonging to three different families affected by X-linked lymphoproliferative syndrome (XLP) [79]. This was the first time that XIAP mutations were associated with a human disease. To date, many XIAP mutations have been identified, recently summarised in [100], which include nonsense and missense mutations, exon deletions, and small insertions and deletions, often leading to a premature stop codon and protein deficiency. XLP is a rare immunodeficiency that is characterised by hemophagocytic lymphohistiocytosis (HLH), hypogammaglobulinaemia and lymphoma [101]. This syndrome normally develops following Epstein–Barr virus (EBV) infection. The identification of more XIAP-mutant patients in the following years prompted clinicians to classify XIAP deficiency-caused disease as familial HLH (FHHL) or XLP2 [102] (Table 1). XLP2 differs from XLP1 in some immunological features, including absence of lymphoma development and high risk of IBD (inflammatory bowel disease). In particular, IBD is observed in about 25% of XIAP mutant patients; it is often refractory to treatment and is lethal in 10% of the cases. Moreover, XIAP mutations are detected in up to 4% of male paediatric patients with very early onset IBD [80,103]. Another immunologic feature of XLP2 patients is the excessive activation of macrophages and dendritic cells in response to EBV and other viruses [101]. However, the mechanisms linking XIAP mutations to XLP2 disease and intestinal epithelial barrier damage are not yet entirely understood. 
Given the fact that XIAP plays a crucial role in the NOD2 pathway and NOD2 was the first ever identified risk gene for IBD, one could argue that XIAP deletion would predispose patients to IBD by impairing the NOD2 pathway. This speculation would be supported by the evidence that the majority of the missense mutations on XIAP map either in the BIR2 or the RING domain, both crucial for correct activation of the NOD2-signalling pathway [100]. However, while the penetrance of IBD in XIAP-mutant patients is 23%, only 1.5% of individuals carrying homozygous NOD2 risk variants develop IBD [78]. This suggests that the role of XIAP in ensuring intestinal homeostasis goes beyond its function in the NOD2 pathway, perhaps to its cell death regulatory functions.

Mouse models were again of great help to gain insights into the role of XIAP in the regulation of immune system responses (Table 1). Indeed, two recent works have shed new light on the role of XIAP deficiency in IBD pathogenesis, using *Xiap*−*/*<sup>−</sup> mice as a model [77,78]. In one case, the authors showed that XIAP-deficient mice have a reduced number of Paneth cells, as a consequence of their death, which is TNF- and microbiota-dependent and RIPK1/RIPK3-mediated. The decrease in Paneth cells correlates with a decrease in the production and secretion of antimicrobial peptides and a change in the structure of the microbiota, termed dysbiosis. These changes, per se, are not enough to elicit intestinal inflammation in mice. However, *Xiap*−*/*<sup>−</sup> mice are very sensitive to intestinal inflammation triggers, such as DSS or the pathogenic bacterial strain *Helicobacter hepaticus*. Importantly, delivery of antimicrobial peptides to the intestine, by means of adenoviruses, allowed XIAP-deficient mice to clear *H. hepaticus* [78]. These findings are in line with the fact that in XIAP-mutant patients, similar to the mouse, there is a reduction in Paneth cells in the ileum and the intestinal inflammation is often triggered by bacterial or viral infection. In the other study, the authors demonstrated that, in contrast to the abovementioned work, XIAP-deficient mice develop spontaneous ileal inflammation, which is microbiome- and TNF-dependent. In addition, the authors showed that both TNFR1 and TNFR2 contribute to the inflammation, since their individual genetic ablation abrogates it. Furthermore, they proved that the death of dendritic cells mediated by the TLR/TNFR2/RIPK3 axis ignites the intestinal inflammation in XIAP-null mice [78]. Very interestingly, elevated levels of TNFR2 correlate with disease severity in paediatric patients affected by IBD. 
These studies are attractive representative examples of how mouse models can help take human genetic studies forward to understand the molecular mechanisms underlying pathological clinical cases. For example, they corroborate the central role of Paneth cell damage in IBD and the importance of TNF as the instigator of the disease. They also hint at the importance of cell death, of both Paneth cells and dendritic cells, in the initiation of the intestinal inflammation in XIAP-mutant patients. Equally important, these mouse-based studies can help devise new therapeutic interventions, such as cell death inhibition or delivery of specific antimicrobial compounds to the intestine, which will ultimately help improve patient care.

#### **7. Omics**

Genomic sequencing approaches have contributed decisively to identifying the cause of many inflammation-related genetic diseases. The combination of genetic studies with biochemical studies has then helped dissect the molecular mechanisms underlying these genetic diseases and provided a sound scientific basis for therapeutic intervention. The question that can be raised now is: How can different omic approaches further help understand the aetiology of human inflammatory diseases? It is established that TNF often plays a prominent role in inflammatory syndromes and that TNF-induced cell death, rather than TNF-induced gene activation, might be the decisive factor for the development of these syndromes. Therefore, the next question to be addressed relates to the nature of the factors released by dying cells that can instigate inflammatory processes.

Different cell death forms have different inflammatory potential. It is widely accepted that, while necroptosis and pyroptosis are inflammatory cell death forms, because of membrane rupture and intracellular content spillage, apoptosis is immunologically silent [3]. Interestingly, it has been demonstrated that necroptotic cells do not only passively release soluble factors as a consequence of plasma membrane rupture, but are also transcriptionally and translationally active in the production of pro-inflammatory cytokines [7]. Conversely, it was reported that apoptotic cells shut down translation via caspases, hence their scarce inflammatory potential [104]. However, some mouse models seem to challenge, at least partially, the current dogma. For example, the skin inflammatory phenotype observed in SHARPIN mutant mice can, at present, be attributed solely to apoptosis [41,42]. In order to ascertain whether the different inflammatory potential of apoptosis and necroptosis is attributable to differences in the soluble factors released by the dying cells, Tanzer and colleagues took an unbiased, mass-spectrometry-based approach [105]. With this approach they analysed the supernatant of human lymphoma cells and primary human macrophages undergoing TNF-induced apoptosis or necroptosis. As expected, a large number of proteins were significantly associated with either cell death type. Surprisingly, there was no significant qualitative or quantitative difference in terms of conventional cytokines between apoptosis and necroptosis. However, intriguingly, the authors observed that the supernatant of apoptotic cells had high levels of nucleosome components, while the supernatant of necroptotic cells had high levels of lysosomal proteins [105]. How this translates into the different inflammatory potential of apoptosis and necroptosis is unknown. 
Furthermore, it is conceivable that only a limited number of factors, differentially released from necroptotic cells with respect to apoptotic cells, have the potential to trigger inflammation. More refined proteomic analyses conducted in more clinically relevant settings will be needed to determine the inflammatory potential of factors specifically secreted by dying cells. Proteomic analysis could be coupled with transcriptomic-based and mass-cytometry-based approaches in order to examine cell death-specific signatures. Reliable animal models that recapitulate human mutation-driven, cell death-dependent diseases would again be the starting point for this combinatorial approach.

#### **8. Conclusions**

The last decade saw an impressive body of work that enabled us to understand how cell death modulates inflammatory diseases and how this was associated with human genetics. It is not always expected that mouse models completely recapitulate the human settings. However, model organisms spanning from 2D cells/organoids to invertebrates and mice bring us closer to the identification of aetiological factors of chronic inflammation and autoimmune disorders in humans. Excitingly, the gap between mice and men is becoming smaller with the advancement in technologies and preclinical animal models. Currently, mouse work has become the springboard for human studies with the ultimate purpose to design novel therapies that improve the care of patients affected by inflammatory/autoimmune diseases.

**Author Contributions:** N.P. and A.A. wrote the text and drew the figures. Both authors approved the submitted version. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation, project no. AN 1717/1-1), the Jürgen Manchot Foundation, and the CMMC Junior Research Group Program for A.A.; N.P. is supported by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation; project SFB1399, ID 413326622) and the Jürgen Manchot Foundation. Both N.P. and A.A. are supported by the collaborative research centres SFB1430, ID. 414786233 (A10 for N.P. and associated project for A.A.) and SFB1503, ID 455784452 (B02 for N.P. and A05 for A.A.).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Acknowledgments:** We would like to thank members of the Annibaldi, Peltzer, and Liccardi laboratories for helpful discussions. We would like to apologise to the many authors whose work we could not cite due to space restrictions.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Article* **Comparison of Transcriptional Signatures of Three Staphylococcal Superantigenic Toxins in Human Melanocytes**

**Nabarun Chakraborty 1,\*,†, Seshamalini Srinivasan 1,2,†, Ruoting Yang 1, Stacy-Ann Miller 1, Aarti Gautam 1, Leanne J. Detwiler 1,2, Bonnie C. Carney 3,4,5, Abdulnaser Alkhalil 3, Lauren T. Moffatt 3,4,5, Marti Jett 1, Jeffrey W. Shupp 3,4,5,6 and Rasha Hammamieh <sup>1</sup>**


**Abstract:** *Staphylococcus aureus*, a gram-positive bacterium, causes toxic shock through the production of superantigenic toxins (sAgs) known as Staphylococcal enterotoxins (SE), serotypes A-J (SEA, SEB, etc.), and toxic shock syndrome toxin-1 (TSST-1). The chronology of host transcriptomic events that characterizes the response to the pathogenesis of superantigenic toxicity remains uncertain. The focus of this study was to elucidate time-resolved host responses to three toxins of the superantigenic family, namely SEA, SEB, and TSST-1. Due to the evolving critical role of melanocytes in the host's immune response against environmental harmful elements, we investigated herein the transcriptomic responses of melanocytes after treatment with 200 ng/mL of SEA, SEB, or TSST-1 for 0.5, 2, 6, 12, 24, or 48 h. Functional analysis indicated that each of these three toxins induced a specific transcriptional pattern. In particular, the time-resolved transcriptional modulations due to SEB exposure were very distinct from those induced by SEA and TSST-1. The three superantigens share some similarities in the mechanisms underlying apoptosis, innate immunity, and other biological processes. Superantigen-specific signatures were determined for the functional dynamics related to necrosis, cytokine production, and acute-phase response. These differentially regulated networks can be targeted for therapeutic intervention and marked as the distinguishing factors for the three sAgs.

**Keywords:** superantigens; gene expression; transcriptional dynamics; staphylococcal enterotoxins; SEB; SEA; TSST-1; toxins; biological networks; clustering; functional pathways; time–course analysis; cDNA microarray; human melanocytes

**Citation:** Chakraborty, N.; Srinivasan, S.; Yang, R.; Miller, S.-A.; Gautam, A.; Detwiler, L.J.; Carney, B.C.; Alkhalil, A.; Moffatt, L.T.; Jett, M.; et al. Comparison of Transcriptional Signatures of Three Staphylococcal Superantigenic Toxins in Human Melanocytes. *Biomedicines* **2022**, *10*, 1402. https://doi.org/10.3390/biomedicines10061402

Academic Editor: Marianna Christodoulou

Received: 20 April 2022 Accepted: 6 May 2022 Published: 14 June 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

*Staphylococcus aureus* (*S. aureus*) is widely distributed in nature and carried by 25–33% of normal individuals in the anterior nares and skin [1,2]. The extreme penetrance of this bacterium and its ability to colonize skin, open wounds, and other surfaces make it a serious threat in facilities that provide health care [3,4]. The myriad of exotoxins synthesized and secreted by *S. aureus* include the Staphylococcal enterotoxins (SEs), such as SEA-SEE, SEG-SEI, SEK-SET, and SEY, and the toxic shock syndrome toxin (TSST-1). Because SEA is the most common toxin in food poisoning, SEB is recognized for its potent toxicity as a biological weapon, and TSST-1 is known as the causative agent of lethal toxic shock [5–7], these three toxins remain the primary focus of *S. aureus* toxin research [8].

Staphylococcal enterotoxins and TSST-1 are superantigens (sAgs) that bind as intact molecules to major histocompatibility complex class II (MHC II) and interact directly with the variable region of the beta chain of T-cell receptors (TCRs) without the need for processing or presentation by antigen-presenting cells (APCs). These interactions activate T-cells, resulting in massive production of cytokines and chemokines, activation-induced apoptosis, and T-cell anergy [9].

The interaction of sAgs with immune cells and the ensuing pathogenesis have been well documented [10,11]. Previous work from our lab identified a set of genes in human peripheral blood mononuclear cells (PBMCs) that were expressed as early as 2 h post-SEB treatment [11] and played important roles in tissue repair, inflammation, and increased vascular permeability. Supporting studies reported that SEB-induced proinflammatory mediators contribute to vasodilation, vascular leak, and edema [12–14].

The immunologic barrier raised by the skin is a concerted effort of different cell types. Keratinocytes, melanocytes, and Langerhans cells actively contribute to the innate immune response to sAgs [15]. Here, we focused on melanocytes, which are dendritic cells of neuroectodermal origin and an integral part of the epidermis [16–18]. The dendritic nature and strategic location of melanocytes in the epidermal layer of the skin provide an ideal milieu to interact with the extra-skin environment and to coordinate responses among neighboring shallow skin cells.

The immunological responses of melanocytes have been attributed to their ability to express MHC molecules and various adhesion molecules, including intercellular adhesion molecule (ICAM)-1 and vascular cell adhesion molecule (VCAM)-1 [19–21]. In addition, melanocytes can produce several cytokines, including tumor necrosis factor alpha (TNF-α) and transforming growth factor beta (TGF-β1), with potential functions in phagocytosis and antigen processing and presentation [20,22,23]. The immunomodulatory cytotoxic properties of melanocytes were highlighted in a recent in vitro study, where melanocytes were exposed to *C. albicans* infection [24]. Despite the wide coverage of melanocyte research and the increasing knowledge of their role in the immune response, minimal information is available about the role of melanocytes in the response to sAgs. The selection of melanocytes for the present study was further justified by the differential host responses to sAgs, which were essentially determined by the significant structural differences among the three sAgs that affect their interactions with host cells [8,25]. Major findings include toxin-specific melanocyte response dynamics that enable the distinction of toxin pathogenesis; in particular, we elucidated later-stage molecular events that could serve as common or customized therapeutic targets for the three toxins of choice.

#### **2. Materials and Methods**

#### *2.1. Cell Culture and Toxin Treatment*

Normal human epidermal melanocytes (NHEM) and the reagents required for culturing the cells were purchased from Clonetics® (Lonza, Walkersville, MD, USA). Cells were maintained in Melanocyte Growth Medium (MGM) BulletKit® according to the supplier's instructions (Lonza, Walkersville, MD, USA). Cell cultures were established at the recommended starting cell density of 10,000 cells per cm<sup>2</sup> and maintained in 150 cm<sup>2</sup> flasks at 37 °C and 5% CO<sub>2</sub> in a humidified incubator.

SEA, SEB, and TSST-1 were purchased from Toxin Technology (Toxin Technology, Sarasota, FL, USA). The toxins were diluted from the stock solution to 25 μg/mL in the MGM growth media (Lonza, Walkersville, MD, USA). On the day of the assay, the cells were treated with the appropriate amount of toxins to reach a final concentration of 200 ng/mL. The toxins were inactivated by adding TRIzol (Invitrogen, Carlsbad, CA, USA) at 0.5, 2, 6, 12, 24, and 48 h post-exposure (p.e.). As controls, untreated melanocytes were grown in parallel and harvested at the same time points. Each time point for each toxin was represented by a single culture.
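As a worked check of the dilution described above, the following minimal sketch applies the relation C1·V1 = C2·V2 to the stated 25 μg/mL working solution and the 200 ng/mL final concentration. The 20 mL culture volume is a hypothetical value chosen only for illustration; the source does not state the volume used.

```python
def stock_volume_ml(stock_ng_per_ml, final_ng_per_ml, final_volume_ml):
    """Volume of stock needed so the final mixture reaches the target
    concentration, from C1 * V1 = C2 * V2."""
    if final_ng_per_ml > stock_ng_per_ml:
        raise ValueError("target concentration exceeds stock concentration")
    return final_ng_per_ml * final_volume_ml / stock_ng_per_ml


STOCK_NG_PER_ML = 25_000.0   # 25 ug/mL working solution, expressed in ng/mL
FINAL_NG_PER_ML = 200.0      # final concentration stated in the text
ASSUMED_CULTURE_ML = 20.0    # hypothetical volume, not given in the source

dilution_factor = STOCK_NG_PER_ML / FINAL_NG_PER_ML
volume_needed_ml = stock_volume_ml(STOCK_NG_PER_ML, FINAL_NG_PER_ML, ASSUMED_CULTURE_ML)

print(f"dilution factor: {dilution_factor:g}-fold")
print(f"stock per {ASSUMED_CULTURE_ML:g} mL final volume: {volume_needed_ml * 1000:g} uL")
```

The dilution factor depends only on the two stated concentrations, so it holds for any culture volume.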

#### *2.2. RNA Isolation*

Total RNA was isolated with TRIzol reagent (Invitrogen, Carlsbad, CA, USA) using the manufacturer's procedure, followed by a cleanup procedure using the RNeasy MinElute Cleanup kit (QIAGEN, Germantown, MD, USA). The integrity of the extracted RNA was assessed using the 2100 Bioanalyzer instrument (Agilent Technologies, Santa Clara, CA, USA), and RNA integrity number (RIN) values were recorded.

#### *2.3. Transcriptomic Assay and Analysis*

The dual-dye microarray hybridization was carried out using the SurePrint 4 × 44 K v2 Microarray Kit (Agilent Technologies, Inc., Santa Clara, CA, USA) following the vendor's protocol. A total of 200 ng of Cy-5-labelled purified RNA was co-hybridized with Cy-3-labelled reference RNA (Agilent Technologies, Inc., Santa Clara, CA, USA) and bound to Agilent 4 × 44 K slides (Design ID: 026652). These arrays contain 41,000 unique probes targeting 27,958 Entrez gene RNAs. Following the standard protocol, overnight hybridization at 55 °C was followed by a series of washes. The slides were scanned with an Agilent DNA microarray scanner and the features were extracted using the default settings of the Feature Extraction software (v.10.7, Agilent, CA, USA). Genes whose expression changed by at least two-fold (fold change ≥ 2) were selected for further analysis.
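The fold-change filter can be sketched as follows. This is a minimal illustration, assuming that the threshold is applied symmetrically to up- and downregulated genes and that expression is summarized as a treated/reference ratio; the gene names and ratios are hypothetical and are not data from the study.

```python
import math


def passes_fold_change(ratio, threshold=2.0):
    """Return True if a treated/reference ratio is at least `threshold`-fold
    up or down (assumed symmetric treatment of both directions)."""
    if ratio <= 0:
        raise ValueError("expression ratio must be positive")
    return abs(math.log2(ratio)) >= math.log2(threshold)


# Hypothetical ratios for illustration only.
toy_ratios = {"GENE_A": 2.0, "GENE_B": 0.4, "GENE_C": 1.5, "GENE_D": 0.5}

selected = sorted(gene for gene, ratio in toy_ratios.items() if passes_fold_change(ratio))
print(selected)
```

Working on the log2 scale keeps a two-fold decrease (ratio 0.5) and a two-fold increase (ratio 2.0) equidistant from no change.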

Gene expression analysis used functions available in the Bioconductor Project [26] and a functional heatmap tool (https://bioinfo-abcc.ncifcrf.gov/Heatmap/, accessed on 26 August 2021). GeneSpring v.10.1 (Agilent Technologies, Inc., Santa Clara, CA, USA) was used for data visualization. Enrichment analysis was performed using Ingenuity Pathways Analysis (IPA, QIAGEN, Inc., Germantown, MD, USA). The data from this study were submitted to GEO under accession number GSE124756.

#### *2.4. Gene Expression Validation by Nanostring Assays*

A custom NanoString panel (NanoString Technologies, Seattle, WA, USA) was designed for genes deemed functionally important for the current study. The Results and Discussion sections justify our choice of the genes listed in Table S1. Six genes (*GIGYF2*, *INO80*, *USF2*, *WDR89*, *PPIA*, and *EIF2B1*) were selected as housekeeping genes based on their stable expression levels in melanocytes [27]. Following the standard nCounter instructions [28], we prepared a master mix containing hybridization buffer, Reporter ProbeSet, and Capture ProbeSet (volume:volume ratio of 1:1:0.5), of which 25 μL was added to 5 μL of target RNA. The GEN2 Prep Station incubation time was set at the higher sensitivity setting (3 h) and 280 fields of view (FOV) were routinely captured. Analysis and normalization of the raw NanoString data were conducted using nSolver Analysis Software v3.0 (NanoString Technologies, Seattle, WA, USA).
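To illustrate how housekeeping genes can be used to normalize count data of this kind, the sketch below scales each sample by the geometric mean of its housekeeping counts. This is a common approach and is offered only as an assumption: the source does not state which nSolver settings were used, and the sample names, `TARGET_GENE`, and all counts are hypothetical.

```python
from statistics import geometric_mean, mean

# Housekeeping genes named in the text.
HOUSEKEEPING = ("GIGYF2", "INO80", "USF2", "WDR89", "PPIA", "EIF2B1")


def scale_factors(samples):
    """Per-sample factor that brings each sample's housekeeping geometric mean
    to the average geometric mean across all samples."""
    geo = {name: geometric_mean([counts[g] for g in HOUSEKEEPING])
           for name, counts in samples.items()}
    target = mean(list(geo.values()))
    return {name: target / value for name, value in geo.items()}


def normalize(samples):
    """Multiply every count in a sample by that sample's scale factor."""
    factors = scale_factors(samples)
    return {name: {gene: count * factors[name] for gene, count in counts.items()}
            for name, counts in samples.items()}


def toy_sample(housekeeping_count, target_count):
    """Build a hypothetical sample with identical housekeeping counts."""
    counts = {gene: housekeeping_count for gene in HOUSEKEEPING}
    counts["TARGET_GENE"] = target_count
    return counts


# Sample 2 has twice the overall signal of sample 1 (hypothetical values).
toy = {"sample_1": toy_sample(100.0, 50.0), "sample_2": toy_sample(200.0, 100.0)}
normalized = normalize(toy)
print(normalized["sample_1"]["TARGET_GENE"], normalized["sample_2"]["TARGET_GENE"])
```

In this toy case the two samples differ only in overall signal, so the normalized target counts coincide.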

#### **3. Results**

#### *3.1. Genomic Responses to the Three Toxins Are Characterized by Unique Host Expression Patterns*

Principal component analysis (PCA) of transcriptomic expression data showed time-resolved clustering patterns of melanocytes exposed to the three toxins across six treatment time points (Figure 1). PC1 and PC2 represented 21.7% and 16.1% of the total variance, respectively; together, they represent nearly 38% of the total variance. Within the transcriptomic variance defined by PC1 and PC2, we found three distinct clusters for each of the toxin types. The time points clustered along longitudinal trends. For example, the 30 min and 2 h SEA p.e. time points clustered together, and this combination was labelled as the early treatment phase. The early treatment phase was located far from the middle treatment phase in the PCA plot; the middle phase was defined by the 6 h and 12 h SEA p.e. time points. Finally, the late treatment phase was defined by the 24 h and 48 h SEA p.e. time points, which were juxtaposed in the PCA landscape and distally located from the middle phase. A hypothetical line connecting these three treatment phases showed a potential temporal trend. A very similar picture emerged for TSST-1. The genes responding to SEB treatment, however, showed a different clustering pattern, which was more apparent in the late treatment phase of SEB. A considerable Euclidean distance was observed between 24 h and 48 h SEB p.e. Therefore, unlike for SEA and TSST-1, we included 24 h SEB p.e. in the middle treatment phase along with its original two members, namely 6 h and 12 h SEB p.e. This arrangement automatically labelled 48 h SEB p.e. as the sole member of the SEB late treatment phase. Interestingly, the middle-to-late treatment phases (12 h, 24 h, and 48 h) of SEA p.e. clustered closely to the middle treatment phases (6 h, 12 h, 24 h) of SEB p.e.

**Figure 1.** Principal component analysis (PCA) of time-resolved gene expression values. Black-, red-, green-, and blue-colored open circles represent control, SEA, SEB, and TSST-1, respectively. Dotted lines trace the temporal shifts caused by the different toxins; here, red, green, and blue dotted lines represent SEA, SEB, and TSST-1, respectively.
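The two quantities used to read the PCA plot, cumulative variance share and Euclidean distance between time points, can be sketched as below. The variance shares are the values reported in the text; the two coordinates are hypothetical placeholders, since the source does not list the PC scores.

```python
from math import dist


def cumulative_share(shares_percent):
    """Running total of per-component variance shares, rounded to 0.1%."""
    total, running = 0.0, []
    for share in shares_percent:
        total += share
        running.append(round(total, 1))
    return running


# Variance shares of PC1 and PC2 as reported in the text.
reported = cumulative_share([21.7, 16.1])
print(reported)

# Hypothetical (PC1, PC2) coordinates for two time points, for illustration only.
point_24h = (1.0, 2.0)
point_48h = (4.0, 6.0)
separation = dist(point_24h, point_48h)
print(separation)
```

A large separation between consecutive time points, relative to the others, is the kind of observation that led to 24 h SEB p.e. being grouped with the middle phase.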

Since none of the time-point experiments had technical or biological replicates, the present strategy of grouping time points into early, middle, and late treatment phases enhanced the statistical confidence of the overall results. Using the longitudinal patterns of transcriptomic expression, we sub-grouped the genes into three sets: (i) the 'Early' gene group, in which the transcriptomic fold changes were greater than |2| for at least one of the two time points (30 min and 2 h p.e.) of the early treatment phase; (ii) the 'Consistent' gene group, in which the transcriptomic fold changes were greater than |2| at all time points; and (iii) the 'Late' gene group, in which the transcriptomic fold changes were greater than |2| for at least one of the two time points (24 h and 48 h p.e.) of the late treatment phase. The exception was the SEB treatment, for which the late treatment phase included only 48 h p.e. Next, we combined (i) and (ii) to form 'Early–Consistent' gene groups; similarly, (ii) and (iii) were combined to form 'Late–Consistent' gene groups. These gene groups were used for functional analysis.
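The grouping rules above can be expressed as set operations. The sketch below follows a literal reading of the three definitions (under which any Consistent gene also meets the Early and Late criteria); the source does not say whether the Early and Late groups exclude genes responding in other phases, so that reading is an assumption. Gene names and fold changes are hypothetical.

```python
EARLY_TIMES = (0.5, 2)     # hours p.e. in the early treatment phase
LATE_TIMES = (24, 48)      # late phase used for SEA and TSST-1
SEB_LATE_TIMES = (48,)     # late phase used for SEB only


def group_genes(fold_changes, late_times=LATE_TIMES, threshold=2.0):
    """Group genes by when their signed fold change exceeds the threshold in magnitude.

    fold_changes maps gene -> {time in hours: signed fold change}.
    """
    early, consistent, late = set(), set(), set()
    for gene, by_time in fold_changes.items():
        hits = {t for t, fc in by_time.items() if abs(fc) > threshold}
        if hits & set(EARLY_TIMES):
            early.add(gene)
        if by_time and hits == set(by_time):
            consistent.add(gene)
        if hits & set(late_times):
            late.add(gene)
    return {
        "early": early,
        "consistent": consistent,
        "late": late,
        "early_consistent": early | consistent,
        "late_consistent": late | consistent,
    }


# Hypothetical signed fold changes at the six time points.
toy = {
    "GENE_A": {0.5: 3.1, 2: 1.2, 6: 1.0, 12: 1.1, 24: 1.3, 48: 1.0},
    "GENE_B": {0.5: 2.5, 2: -2.2, 6: 2.4, 12: 3.0, 24: 2.1, 48: -2.6},
    "GENE_C": {0.5: 1.0, 2: 1.1, 6: 1.2, 12: 1.0, 24: -4.0, 48: 1.5},
}

groups = group_genes(toy)
seb_groups = group_genes(toy, late_times=SEB_LATE_TIMES)
print(sorted(groups["early_consistent"]), sorted(groups["late_consistent"]))
print(sorted(seb_groups["late_consistent"]))
```

Passing `SEB_LATE_TIMES` reproduces the SEB exception: a gene that responds only at 24 h (here `GENE_C`) no longer counts as a late responder.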

Figures S1A, S2A, and S3A depict Early–Consistent gene profiles of SEA, SEB, and TSST-1, respectively. Likewise, Figures S1B, S2B, and S3A depict Late–Consistent gene profiles of SEA, SEB, and TSST-1, respectively. A total of 445, 123, and 376 transcripts emerged, and these time clusters were called 'SEA—Early–Consistent' (SEA-E), 'SEB—Early–Consistent' (SEB-E), and 'TSST—Early–Consistent' (TSST-1-E), respectively. As explained above, the clustering for late-phase SEB exposure was performed differently than for late-phase SEA and TSST-1 exposures. Hence, genes responding exclusively at 48 h p.e. (for SEB, Figure S2C) or in one of the two late p.e. phases (24 h or 48 h p.e. for SEA, Figure S1C, and TSST-1, Figure S3C) were combined with their respective consistently expressed genes (i.e., Figure S2A for SEB, Figure S1A for SEA, and Figure S3A for TSST-1). A total of 555, 1071, and 661 genes emerged, and they were called 'SEA—Late–Consistent' (SEA-L), 'SEB—Late–Consistent' (SEB-L) and 'TSST—Late–Consistent' (TSST-1-L), respectively.

#### *3.2. Differences in Transcriptional Regulation in Response to the Three Toxins*

In agreement with the PCA trend, the number of genes showing altered transcription varied greatly in response to the three toxins (Figure S4). Comparisons of early and late genomic responses to each of the toxins showed differences that were at their maximum after SEB treatment in both up- and downregulated genes. The largest number of genes with FC > |2| (nearly 1100 genes) was observed in SEB-L, whereas nearly 100 genes showed FC > |2| in SEB-E. In contrast, SEA-E and SEA-L comprised the smallest number of genes with FC > |2|: nearly 450 genes showed transcriptomic modulations at early time points and nearly 600 genes were modulated during the late time points. Treatment with TSST-1 elicited a response somewhat similar to that of SEA. Interestingly, there was a common trend among all three toxins: the number of perturbed genes increased with the progression of treatment time, indicating the transcriptomic storm typically augmented by this family of sAgs [29–32] (Figure S4).

#### *3.3. Biological Networks and Functions That Were Differentially Regulated by the Three Toxins*

Functional analysis was performed using the genes listed under SEA-E, SEA-L, SEB-E, SEB-L, TSST-1-E, and TSST-1-L, respectively, to elucidate the time-dependent, toxin-specific enrichment profiles of biological and canonical functions. Table S2 lists the top biological functions (*p* < 0.001) and canonical networks (*p* < 0.01) associated with the three early treatment categories, SEA-E, SEB-E, and TSST-1-E. The list was filtered to include only those biological processes which were significantly enriched and functionally relevant to cell survival and the defense and maintenance of skin cells. In a similar fashion, genes belonging to the late treatment phase were probed to generate a list of significant biological and canonical processes that were enriched due to the prolonged toxin exposure (Table S3).

Table 1 lists the top biological pathways (*p* < 0.001) and canonical functions (*p* < 0.01) that represent the dendritic cell-like (DC-like) or macrophage-like properties of melanocytes. 'Antigen presentation pathways', 'dendritic cell maturation', 'IL17 signalling', and 'chemokine signalling', among others, emerged as the top functions related to the immunogenicity of melanocytes.

ILK signalling emerged as a significant network that was conserved between the early and late treatment phases in response to all three sAgs. Functional annotation of the 36 genes (Table S4) associated with the ILK signalling pathway demonstrated an association with two cellular processes, namely cell death and tight junction signalling. Other networks that responded in common to at least two toxins and were conserved throughout the time course of the study include acute phase response signalling, the antigen presentation pathway, the complement system, and agranulocyte adhesion and diapedesis.

The Venn diagram in Figure S5A shows the biological networks that were common to, as well as exclusive to, SEA-E, SEB-E, and TSST-1-E. Nine networks related to cell survival and maintenance were affected by all three toxins. SEA-E and TSST-1-E shared the largest number (28) of networks, including those associated with endometriosis, proliferation of connective tissue cells, and angiogenesis. SEA-E and SEB-E shared the smallest group of networks (2), which were related to skin disorders such as chronic skin disorder and chronic psoriasis.

A Venn diagram of the functional annotations enriched in the three late treatment phases, namely SEA-L, SEB-L, and TSST-1-L (Figure S5B), showed a picture consistent with that of the early treatment phase (Figure S5A). The number of overall annotated networks was greater for the late phase (87 as compared to 66 networks for the early treatment phase), as described in Tables S2 and S3. As in the early treatment phase, the largest number of networks was shared between SEA-L and TSST-1-L, with similar enriched networks, namely endometriosis and proliferation of connective tissue cells. A total of 19 networks were commonly enriched for SEA-L and SEB-L; hence, the late treatment phase was associated with a higher number of significantly enriched gene networks than the early treatment phase.
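The shared and exclusive regions of a three-way Venn diagram such as those in Figure S5 can be computed with set operations. In the sketch below, the network names are taken from the text, but their assignment to each toxin is hypothetical and serves only to exercise the function.

```python
def venn3_regions(a, b, c):
    """Split three collections into the seven disjoint regions of a Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "all_three": a & b & c,
        "a_and_b_only": (a & b) - c,
        "a_and_c_only": (a & c) - b,
        "b_and_c_only": (b & c) - a,
        "a_only": a - b - c,
        "b_only": b - a - c,
        "c_only": c - a - b,
    }


# Hypothetical membership, for illustration only (a = SEA, b = SEB, c = TSST-1).
sea = {"cell survival", "endometriosis", "angiogenesis", "chronic psoriasis"}
seb = {"cell survival", "chronic psoriasis"}
tsst1 = {"cell survival", "endometriosis", "angiogenesis"}

regions = venn3_regions(sea, seb, tsst1)
for name, members in regions.items():
    print(name, sorted(members))
```

Because the seven regions are disjoint, their sizes add up to the size of the union, which is a convenient sanity check when tallying counts for a diagram.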

**Table 1.** Biological pathways (*p* < 0.001) and canonical functions (*p* < 0.01) that represent the dendritic cell-like (DC-like) or macrophage-like properties of melanocytes. Networks perturbed by the toxins are marked with a double tick (√√). In addition, the association of the networks with DC-like and/or macrophage-like properties is noted by a single tick (√).




All three sAgs induced responses highly enriched for three biological processes: necrosis, skin diseases, and inflammation. Separate hierarchical clustering was performed using three gene sets, namely 217 genes from the necrosis network (Figure 2), 53 genes from the inflammation network (Figure S6), and 167 genes from the skin diseases network (Figure S7). The clustering analysis in Figure 2 identified four distinct groups of genes (indicated within yellow borders and labeled as groups A–D in Figure 2), which could be exclusive necrosis markers for TSST-1-L, TSST-1-E, SEA-L, and SEA-E, respectively. This hierarchical analysis failed to mine any exclusive signatures for SEB-E or SEB-L. Both the inflammation (Figure S6) and the skin diseases (Figure S7) clusters were mined as a single set, each under SEB-L (labelled group A in Figures S6 and S7, respectively). The complete list of all six gene sets is compiled in Table S5.

**Figure 2.** Hierarchical clustering analysis of 217 genes with |log2 fold change| > 2 enriching the necrosis pathway. Euclidean distance was used to sort both conditions and genes. Each block represents one gene, and its color code is at the bottom right. Clusters bordered by yellow lines represent genes that were potentially unique signatures of the particular condition. The conditions from left to right are TSST-1-L, TSST-1-E, SEA-L, SEA-E, SEB-L, and SEB-E, which represent TSST-1 at the late time point, TSST-1 at the early time point, SEA at the late time point, SEA at the early time point, SEB at the late time point, and SEB at the early time point, respectively.
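
As an illustration of the kind of procedure this caption describes, the following Python sketch filters genes by a fold-change cutoff and computes Euclidean distances between condition profiles, which is the quantity an agglomerative clustering would use to decide which conditions to merge first. The gene names and values are invented placeholders, the rule "cutoff exceeded in at least one condition" is an assumption, and the software and linkage method actually used in the study are not stated here.

```python
import math
from itertools import combinations

# Condition order as listed in the Figure 2 caption.
CONDITIONS = ["TSST-1-L", "TSST-1-E", "SEA-L", "SEA-E", "SEB-L", "SEB-E"]

# Hypothetical log2 fold changes (placeholders, NOT data from the study).
LOG2FC = {
    "GENE_A": [3.1, 2.4, 0.2, -0.1, 0.0, 0.3],
    "GENE_B": [0.1, 0.0, -2.6, -2.2, 0.4, 0.1],
    "GENE_C": [0.5, -0.3, 0.2, 0.1, 0.6, -0.4],
}


def passes_cutoff(values, cutoff=2.0):
    """Assumed rule: keep a gene if |log2 FC| > cutoff in at least one condition."""
    return any(abs(v) > cutoff for v in values)


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Step 1: apply the fold-change filter.
selected = {gene: vals for gene, vals in LOG2FC.items() if passes_cutoff(vals)}


def condition_profile(index):
    """Values of all selected genes in one condition (one heatmap column)."""
    return [vals[index] for vals in selected.values()]


# Step 2: pairwise Euclidean distances between conditions.
distances = {
    (CONDITIONS[i], CONDITIONS[j]): euclidean(condition_profile(i), condition_profile(j))
    for i, j in combinations(range(len(CONDITIONS)), 2)
}

# The closest pair is the first merge an agglomerative clustering would make.
closest_pair = min(distances, key=distances.get)

if __name__ == "__main__":
    print("genes kept:", sorted(selected))
    print("first merge:", closest_pair)
```

The same distance function can be applied to rows instead of columns to order genes; a full dendrogram would repeat the merge step under a chosen linkage rule.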

#### *3.4. Confirmation of Expression Pattern for Select Genes from the Necrosis Clusters*

We performed validation of gene expression levels by NanoString nCounter® technology. Table S1 lists the top thirteen highly perturbed genes (up- and downregulated) grouped under necrosis. This list is limited to genes responding only to SEA and TSST-1 for two reasons: first, none of the genes responding in the SEB-E phase were grouped in the three clusters discussed above (Table S5), and second, a lack of sufficient RNA samples for the SEB 48 h treatment point forced us to exclude genes that belong to the SEB-L treatment phase.

Overall, a positive correlation was observed between the NanoString and microarray results. Of the 13 genes tested, 12 showed the same direction of fold change in the NanoString and microarray data (Figure S8); the one exception was PLCB1.
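
A directionality check of this kind can be sketched in a few lines of Python. In the example below, every fold-change value is an invented placeholder; only the gene count (13), the single discordant gene (PLCB1), and the resulting 12-of-13 agreement come from the text. The signs assigned to PLCB1 on each platform are arbitrary.

```python
def same_direction(a, b):
    """True when two fold changes point the same way (both up or both down)."""
    return (a > 0) == (b > 0)


def concordance(pairs):
    """Return (number of genes agreeing in direction, number of genes tested)."""
    agreeing = sum(1 for micro, nano in pairs.values() if same_direction(micro, nano))
    return agreeing, len(pairs)


# Placeholder (microarray, NanoString) log2 fold changes: twelve hypothetical
# genes that agree in sign, alternating between up- and downregulated.
pairs = {
    f"GENE_{i:02d}": (2.5, 1.8) if i % 2 else (-2.3, -1.5)
    for i in range(1, 13)
}
# PLCB1 is the discordant gene named in the text; these signs are arbitrary.
pairs["PLCB1"] = (2.1, -0.7)

if __name__ == "__main__":
    agreeing, tested = concordance(pairs)
    print(f"{agreeing} of {tested} genes agree ({100 * agreeing / tested:.1f}%)")
```

Sign agreement is a coarser measure than a correlation coefficient: it confirms direction only and says nothing about whether the two platforms report similar magnitudes.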

#### **4. Discussion**

The present study investigated in vitro host gene expression patterns induced by SEA, SEB, and TSST-1 at six time points ranging from 0.5 h to 48 h post-toxin exposure. Melanocytes, a less frequently tested human skin cell type but a major component of skin cell-mediated immunity, were selected as the target cells. The hybrid character of melanocytes was highlighted as we mined those biofunctions which were linked to dendritic cell activities and/or macrophage-based immune responses. This study could have benefited from incorporating additional time points to enhance the resolution of sequential biological events. For instance, our data suggested that the dosages of SEA and TSST-1 used for melanocyte treatments were potentially exhaustive within 24 h p.e.; in this context, extended time points could be highly informative. Furthermore, additional replicates would improve the statistical power to identify significantly altered genes. To mitigate this drawback to some extent, we mined the networks that met the cutoff *p* < 0.05 using hypergeometric tests.
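
For readers unfamiliar with hypergeometric enrichment testing, a minimal Python sketch follows. It computes the upper-tail probability of seeing at least the observed overlap between a perturbed gene list and a network's gene set. The function name, argument names, and example numbers are hypothetical; the exact implementation used by the pathway analysis software in this study is not described here.

```python
from math import comb


def hypergeom_enrichment_p(background, in_network, perturbed, overlap):
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    background: total number of genes considered
    in_network: genes in the background that belong to the network
    perturbed:  genes called differentially expressed
    overlap:    perturbed genes that belong to the network
    """
    largest = min(in_network, perturbed)
    favourable = sum(
        comb(in_network, i) * comb(background - in_network, perturbed - i)
        for i in range(overlap, largest + 1)
    )
    return favourable / comb(background, perturbed)


if __name__ == "__main__":
    # Hypothetical numbers: 20,000 background genes, a 200-gene network,
    # 500 perturbed genes, 12 of which fall in the network (5 expected by chance).
    p = hypergeom_enrichment_p(20000, 200, 500, 12)
    print(f"p = {p:.4g}; passes p < 0.05 cutoff: {p < 0.05}")
```

Because the test depends on the size of the perturbed gene list, small lists from under-replicated experiments yield less extreme p-values for the same proportional overlap, which is one reason a lenient cutoff is sometimes adopted.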

#### *4.1. Distinct Temporal Trend of Pathogenesis Initiated by sAgs*

The three toxins SEA, SEB, and TSST-1 of the sAg family are distinct in their structural, functional, and mechanistic properties [7,8,33]. The current literature lacks both an understanding of the molecular pathogenesis underlying sAg toxicity and a full account of the role of melanocytes in the response to sAgs. The melanocytes' dendritic-like nature and their strategic location in the superficial layers of skin qualify them to be excellent mediators of initial immune defense against the sAgs [16–22]. We presented a whole genome-level investigation to compare the melanocytes' temporal responses to SEA, SEB, and TSST-1.

A striking observation when comparing SEA and TSST-1 was the similarity in their gene expression patterns across the p.e. time course. Although SEA and TSST-1 share weak overall structural homology, TSST-1 can be displaced by SEA due to shared MHC class II binding sites [33]. This sheds light on the similarities in their mode of action as evidenced by the maximum number of shared networks for both early and late treatment phases.

Compared to SEA and TSST-1, the magnitude of the transcriptional response induced by SEB was relatively smaller during the early treatment phase. However, the number of genes perturbed by SEB ramped up over time. This sort of delayed response is typical for any tissue that is not enriched with lymphocytes, as they are not the direct cellular targets of SEB [14]. Subsequently, SEB caused considerable genomic perturbations between 24 h and 48 h p.e. This trend is to be expected, as SEB typically causes rapid neutrophil cell death accompanied by vascular congestion and leakage 24 h p.e., causing a shift to a predominantly adaptive immune response [30]. A perturbation in eNOS signalling pathways, which drive production of the potent vasodilator nitric oxide, was observed in the current study.

Another important observation was the temporal differences between SEA- and SEB-induced pathogenesis, particularly during their middle-to-late treatment phases (Figure 1). Nevertheless, a certain cohesiveness emerged between these two sAgs at the functional level. There were 11 and 19 networks that were synchronously enriched by both SEA and SEB at the early and late treatment phases, respectively (Figure S5A,B). This fact may demonstrate an underlying similarity in their mode of pathogenesis. Early pathogenesis caused by SEA- and SEB-perturbed genes manifested in skin disorders. At the same time, SEB exclusively targeted genes linked to T-lymphocytes and their related functions, whereas SEA targeted glucose and protein metabolism networks. The consequences may include dysregulation of immune functions, apoptosis, and cell death.

All three toxins enriched several networks related to cell death in the early exposure phase, and this response continued throughout the time course of the study. This response could be attributed to the moderately high doses of toxins used in the present study. Even though the three toxins perturbed similar networks during the early exposure phase, as time progressed, each toxin followed its own mode of action in reaching the outcome manifested by cell death and apoptosis. One of the networks that was consistently perturbed by all three toxins across the p.e. time course was ILK signalling. ILK functions as a kinase and signal transmitter or as a scaffold protein to facilitate cell–matrix interactions, cell signalling, and cytoskeletal organization [34]. These signals control processes related to survival, proliferation, differentiation, adhesion, migration, contractility, and neovascularization. Inhibition of ILK arrests the cell cycle and promotes apoptosis [35]. This is a key observation to support the following argument.

Early perturbation of genes associated with superoxide radical degradation by SEA indicates an oxidative stress-driven early onset of cell death [36]. TSST-1 also perturbed this mechanism at later time points. Treatment with SEA and TSST-1 downregulated the transcriptional levels of SOD1 and TYRP1, which potentially diminished the synthesis of different isoforms of superoxide dismutase (SODs). The potential loss of SODs highlighted the onset of oxidative stress initiated by the toxins [37], ultimately leading to the onset of apoptosis during the late phase p.e.

Additional aspects of the apoptotic network, such as ERK/MAPK, were enriched by SEA and SEB at early p.e. phases, which appears to show an SE-induced apoptotic pathway distinct from that induced by TSST-1 [38,39]. We observed increased expression levels of FOS and NFAT genes during early p.e. SEA and SEB treatments. The FOS gene encodes the proto-oncoprotein c-FOS; NFATs, which are widely known for regulating cytokine gene expression, have been increasingly shown to regulate other genes related to cell cycle progression, cell differentiation, and apoptosis [39,40]. During the late phase p.e., SEB upregulated genes that encode oncoproteins, such as Rho GTPase, which is also linked with ERK/MAPK [41]. Consequently, the G2/M DNA Damage Checkpoint Regulation, a critical biofunction closely linked with apoptosis, was highly perturbed. At the late p.e. phase, SEA cross-activated PI3K/AKT signaling, a critical pathway which affects many intracellular processes, including cell survival, growth, and migration.

#### *4.2. Late Phase SEB Is Associated with Certain Dermatological Disorders*

sAgs have long been implicated in the development of various inflammatory skin diseases such as psoriasis, atopic dermatitis, and Kawasaki syndrome [42,43]. We observed that all three toxins modulated genes associated with the pathogenesis of psoriasis and chronic psoriasis starting from the early treatment phase. Psoriasis is often associated with functions like cell death, inflammation, autoimmune syndrome, and the production of ROS and nitric oxide [44]. From early to late treatment phases, SEA and TSST-1 shifted the expression of the genes enriching networks that are linked to lichen planus and endometriosis. During the late treatment phase, SEB regulated two unique sets of genes that are closely linked to psoriasis and dermatomyositis, respectively. These genes are listed under their respective disease names in Table S3. Concurrent enrichment of oxidative stress networks could be related to the NRF2-mediated oxidative stress response and eNOS signaling pathways. Together, these networks typically compromise the host's antioxidant defense mechanisms, a hallmark indicator of psoriasis [45].

#### *4.3. Several Genes of Immunological Networks Are Differentially Modulated by Toxins*

The skin exhibits a highly specialized innate immune response to invading pathogens and external stimuli. The major immune players (keratinocytes, Langerhans cells, dendritic cells, resident T-cells, and innate lymphoid cells) act in a coordinated fashion, from sensing the external stimuli to communicating through inflammatory signalling cascades, to ultimately regulating immune homeostasis [46,47]. Accumulating evidence has uncovered a hybrid role of melanocytes in regulating innate and adaptive immunity [16–22,24,48–54]. Similar to keratinocytes, melanocytes express several types of toll-like receptors (TLRs) and have the ability to produce several pro-inflammatory cytokines and chemokines [48,52,54]. Melanocytes also regulate adaptive immunity through their functional similarities to lysosomes, such as their capacity to phagocytose and their aptitude for antigen processing and presentation [20,48,55]. In this context, we listed those networks (Table 1) which are associated with melanocytes' hybrid role in responding to sAgs.

All of the toxin-induced adaptive immune responses could be attributed to the networks associated with leukocyte (granulocyte/agranulocyte) adhesion, a marker for second-tier responses to inflammation induced by infection. Although all toxins contributed to adaptive immunity stimulation, the patterns of cytokine production and acute-phase responses differed among the three toxins. For instance, during the early treatment phases of both SEA and SEB, the cytokine and chemokine signalling networks comprised CXCL1, CXCL12, and PLCB1, which control leukocyte trafficking; CCL2 and CCL7, which are involved in monocyte migration and macrophage recruitment; and CFL1, which regulates cell morphology and cytoskeletal organization. Early host responses to SEB and TSST-1 included an acute phase response signal that typically triggers non-specific inflammation, leukocytosis, complement activation, protease inhibition, clotting, etc. These responses persisted until 48 h p.e.

All three toxins perturbed IL-17 signalling, a pro-inflammatory signal that bridges innate and adaptive immune responses by playing critical roles in T-cell activation and in promoting the expansion and recruitment of innate immune cells, such as neutrophils [56]. The IL-17 signalling pathway was implicated in response to toxins via alterations of the transcription of several genes in this network, including CXCL1, CXCL5, CXCL8, CCL2, CCL20, and MAP2K6.

#### **5. Conclusions**

To our knowledge, this is the first mRNA-level study describing the temporal response of human melanocytes to three staphylococcal superantigenic toxins, namely SEA, SEB, and TSST-1. We observed distinct temporal patterns of transcriptomic regulation for the three individual toxins. The majority of the identified networks were related to necrosis and inflammation, in agreement with previous publications [38–40], although most of the past studies targeted cells other than melanocytes. Pathways related to innate immunity, such as the patterns of cytokine production and acute-phase response, showed toxin-specific regulation. The time-resolved response to SEB followed a more distinct pattern than the responses to SEA and TSST-1. In conclusion, these three toxins followed distinguishable pathways to achieve a common endpoint manifested by cell death coordinated with apoptosis and necrosis. Hence, temporal knowledge of their pathogenesis could be the key to customized intervention.

**Supplementary Materials:** The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/biomedicines10061402/s1, Figure S1: Temporal profile of the overall gene expression patterns in response to SEA: (A) Genes that follow the same expression pattern independent of the treatment time. (B) Genes with altered expression patterns only in one of the two early time points, i.e., 30 min or 2 h post-exposure. (C) Genes with altered expression patterns only toward the end of the exposure period, i.e., either 24 h or 48 h post-exposure; Figure S2: Temporal profile of the overall gene expression patterns in response to SEB: (A) Genes that follow the same expression pattern independent of the treatment time. (B) Genes with altered expression patterns only in one of the two early time points, i.e., 30 min or 2 h post-exposure. (C) Genes with altered expression patterns only toward the end of the exposure period, i.e., 48 h post-exposure; Figure S3: Temporal profile of the overall gene expression patterns in response to TSST-1: (A) Genes that follow the same expression pattern independent of the treatment time. (B) Genes with altered expression patterns only in one of the two early time points, i.e., 30 min or 2 h post-exposure. (C) Genes with altered expression patterns only toward the end of the exposure period, i.e., either 24 h or 48 h post-exposure; Figure S4: Number of genes altered in the different experimental conditions. The stacked bar chart shows the numbers of over-expressed (fold change > 2) and under-expressed (fold change < −2) genes, which are marked in black and white; Figure S5: Venn diagram showing the temporal profiles of non-canonical pathways. (A) Common and unique non-canonical pathways enriched by the three sAgs at early phases of pathogenesis. For instance, there are 17, 4, and 12 networks which are uniquely perturbed by SEA, SEB, and TSST-1, respectively, at early time points. There are 2 networks commonly perturbed by SEA and SEB.
Likewise, 19 and 3 networks are commonly perturbed by SEA and TSST-1, and by SEB and TSST-1, respectively. There are 9 non-canonical networks that were perturbed by all three sAgs. All of these networks are listed in the diagram. (B) Common and unique non-canonical pathways enriched by the three sAgs at late phases of pathogenesis. For instance, there are 24, 30, and 3 networks which are uniquely perturbed by SEA, SEB, and TSST-1, respectively, at late time points. There are 10 networks commonly perturbed by SEA and SEB. Likewise, 11 and 0 networks are commonly perturbed by SEA and TSST-1, and by SEB and TSST-1, respectively. There are 9 non-canonical networks that were perturbed by all three sAgs. All of these networks are listed in the diagram; Figure S6: Hierarchical clustering analysis of 53 genes with |log2 fold change| > 2 enriching the inflammation pathway. Euclidean distance is used to sort both conditions and genes. Each block represents one gene, and its color code is at the bottom right. The conditions from left to right are TSST-1-L, TSST-1-E, SEA-L, SEA-E, SEB-L, and SEB-E, which represent TSST-1 at the late time point, TSST-1 at the early time point, SEA at the late time point, SEA at the early time point, SEB at the late time point, and SEB at the early time point, respectively; Figure S7: Hierarchical clustering analysis of 167 genes with |log2 fold change| > 2 enriching the skin disease pathway. Euclidean distance is used to sort both conditions and genes. Each block represents one gene, and its color code is at the bottom right. The conditions from left to right are TSST-1-L, TSST-1-E, SEA-L, SEA-E, SEB-L, and SEB-E; Figure S8: Targeted gene expression analysis using the NanoString platform to validate the microarray data. The validation study includes a set of 13 genes from three different conditions, namely SEA at the early time point, SEA at the late time point, and TSST-1 at the late time point. The color code profile is at the bottom left; Table S1: Top 13 highly perturbed genes grouped under the necrosis cluster; Table S2: Top biological functions and diseases (*p* < 0.001) and canonical functions (*p* < 0.01) identified through IPA for early post-exposure SEA, SEB, and TSST-1 treatments; Table S3: Top biological functions and diseases (*p* < 0.001) and canonical functions (*p* < 0.01) identified through IPA for late post-exposure SEA, SEB, and TSST-1 treatments; Table S4: A list of 36 genes highly perturbed by one of the three sAgs during the early and late post-exposure phases; Table S5: List of significantly different genes that enrich the necrosis, inflammation, and skin diseases pathways, respectively. Genes are sorted by their log2-transformed fold changes. The necrosis network is perturbed by TSST-1 at the early (TSST-1-E) and late (TSST-1-L) time points, and by SEA at the early (SEA-E) and late (SEA-L) time points. Likewise, networks linked to skin disease and inflammation, respectively, are perturbed by SEB at the late time point (SEB-L).

**Author Contributions:** Conceptualization, R.H., J.W.S. and M.J.; methodology, R.Y. and N.C.; software, R.Y.; validation, L.J.D.; formal analysis, N.C., R.Y., and S.S.; investigation, S.S., A.G., S.-A.M., L.T.M. and B.C.C.; resources, R.H., J.W.S., R.Y. and M.J.; data curation, R.Y., N.C., S.S. and A.A.; writing—original draft preparation, N.C. and S.S.; writing—review and editing, N.C., A.G., R.Y., A.A. and J.W.S.; visualization, N.C. and S.S.; supervision, R.H., J.W.S., and M.J.; project administration, S.S., N.C. and A.G.; funding acquisition, R.H., J.W.S. and M.J. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Defense Threat Reduction Agency, Project Number: G0020\_04\_WR\_B.

**Institutional Review Board Statement:** Not Applicable.

**Informed Consent Statement:** Not Applicable.

**Data Availability Statement:** Data is contained within the article and supplementary files. Array data is available in GEO under accession number GSE124756.

**Acknowledgments:** All of the individual support from the Integrative Systems Biology Program, the US Army Center for Environmental Health Research, and the Geneva Foundation, and the editing assistance of Julia Scheerer, Derese Getnet, and Linda Brennan are deeply appreciated. We also thank Joshua Williams for his assistance with the Functional Heatmap tool.

**Conflicts of Interest:** The authors declare no conflict of interest.

**Disclaimer:** Material has been reviewed by the Walter Reed Army Institute of Research. There is no objection to its presentation and/or publication. The opinions or assertions contained herein are the private views of the author, and are not to be construed as official, or as reflecting the true views of the Department of the Army or the Department of Defense.

#### **References**


### *Review* **Novel Biomarkers, Diagnostic and Therapeutic Approach in Rheumatoid Arthritis Interstitial Lung Disease— A Narrative Review**

**Alesandra Florescu <sup>1,†</sup>, Florin Liviu Gherghina <sup>2,†</sup>, Anca Emanuela Mușetescu <sup>1,\*</sup>, Vlad Pădureanu <sup>3,\*</sup>, Anca Roșu <sup>1</sup>, Mirela Marinela Florescu <sup>4</sup>, Cristina Criveanu <sup>1</sup>, Lucian-Mihai Florescu <sup>5</sup> and Anca Bobircă <sup>6</sup>**


**Abstract:** Rheumatoid arthritis (RA) is considered a systemic inflammatory disease marked by polyarthritis which affects the joints symmetrically, leading to progressive damage of the bone structure and eventually joint deformity. Lung involvement is the most prevalent extra-articular feature of RA, affecting 10–60% of patients with this disease. In this review, we aim to discuss the patterns of RA interstitial lung disease (ILD), the molecular mechanisms involved in the pathogenesis of ILD in RA, and also the therapeutic challenges in this particular extra-articular manifestation. The pathophysiology of RA-ILD has been linked to biomarkers such as anti-citrullinated protein antibodies (ACPAs), MUC5B mutation, Krebs von den Lungen 6 (KL-6), and other environmental factors such as smoking. Patients at the highest risk for RA-ILD and those most likely to advance will be identified using biomarkers. The hope is that finding biomarkers with good performance characteristics would help researchers better understand the pathophysiology of RA-ILD and, in turn, lead to the development of tailored therapeutics for this severe RA manifestation.

**Keywords:** rheumatoid arthritis; interstitial lung disease; biomarkers; treatment

**Citation:** Florescu, A.; Gherghina, F.L.; Mușetescu, A.E.; Pădureanu, V.; Roșu, A.; Florescu, M.M.; Criveanu, C.; Florescu, L.-M.; Bobircă, A. Novel Biomarkers, Diagnostic and Therapeutic Approach in Rheumatoid Arthritis Interstitial Lung Disease—A Narrative Review. *Biomedicines* **2022**, *10*, 1367. https://doi.org/10.3390/biomedicines10061367

Academic Editor: Marianna Christodoulou

Received: 25 April 2022 Accepted: 7 June 2022 Published: 9 June 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

Rheumatoid arthritis (RA) is considered a systemic inflammatory disease marked by polyarthritis, which affects the joints symmetrically, leading to progressive damage of the bone structure and eventually joint deformity. This pathology affects around 1% of the population in the United States and northern Europe [1,2]. Even though arthritis is the most prevalent clinical manifestation of RA, extra-articular manifestations are often evidenced in people with the disease. Extra-articular manifestations include cardiac, ocular, lung, cutaneous, gastrointestinal, neurological, and renal involvement, but also rheumatoid vasculitis and rheumatoid nodules [3,4].

Lung involvement is the most prevalent extra-articular feature of RA, affecting 10–60% of patients with this disease. Any segment of the respiratory tract can be affected in RA patients. The involved segments include the parenchyma, which can cause ILD or rheumatoid nodules, the pleura, causing pleural effusions or inflammation, the small and large airways (bronchiolitis, bronchiectasis, and cricoarytenoid inflammation), but also the pulmonary vessels, resulting in vasculitis and pulmonary hypertension. ILD is considered to have a prevalence ranging from 5 to 58%, clinically overt RA-ILD being encountered in less than 50% of patients [2,5,6].

Pleural effusion was thought to be the most frequent feature of RA-ILD before the development of computed tomography (CT), which aids in assessing the correct diagnosis. High-resolution computed tomography (HRCT) can identify more subtle changes in the parenchyma, leading to earlier discovery of the ILD, especially in subclinical phases when the patients have not developed symptoms such as dyspnea [7–9].

The aim of this review is to present the patterns involved in RA-ILD and the molecular mechanisms described in the pathogenesis of this extra-articular manifestation. We also aim to present the diagnostic and therapeutic approach in patients with RA-ILD.

#### **2. Pathogenesis**

Rheumatoid factors (RF) and anti-citrullinated protein antibodies (ACPAs) are frequently found in the serum of RA patients. These autoantibodies are detected in 50–80% of RA patients. They have been found in the serum of patients with subclinical disease several years prior to clinical manifestations, supporting the view that genetic and environmental predispositions play an important part in the development of antibodies [10]. The production of antibodies leads to inflammation, followed by the development of clinical manifestations of the disease. Citrullination, the process through which arginine is converted to citrulline, leads to an immune response that involves the formation of ACPAs. ACPAs are significantly linked to the development of RA in those who are genetically susceptible [11,12].

Several immunopathogenic routes for RA-ILD have been proposed, although the precise location of the trigger event in the RA pathogenic cascade remains unknown. It is thought that the citrullinated proteins cross-react with the antigens in the lungs, albeit the immune response might be initiated in the synovium. This finding is reinforced by the fact that articular involvement precedes the pulmonary involvement in patients with RA. Recent literature data have shown that the microbiome plays a lead part in the development of RA due to its role in modulating the immune response. The "mucosal origins" theory posits that the development of RA begins in the mucosa of either the mouth, airway, or gastrointestinal tract. The bacterial, viral or mycobacterial antigens cross-react with antibodies, leading to the development of RA. Germs such as Proteus spp. and Porphyromonas gingivalis are thought to be involved in the pathogenesis of RA-ILD [13,14].

The genetic background of a patient might have either a predisposing (HLA-DRB1\*15, HLA-DRB1\*16, DQB1\*06, and HLA-A\*31:01 alleles) or protective (HLA-DRB1 SE) role in the establishment of RA-ILD. Environmental conditions have a critical impact in genetically predisposed individuals. Tobacco usage has been identified as a probable cause of RA-ILD development. Smoking can harm pulmonary epithelial and vascular endothelial cells directly and increase citrullination of proteins in the lungs by activating PAD enzymes locally. Citrullinated proteins act as antigen targets, even in the preclinical stage, leading to a local immune response. This process leads to the formation of ACPAs, followed by the development of RA and ILD. This stage is characterized by increased citrullination [15–17].

These antibodies trigger an inflammatory response that includes the production of pro-inflammatory cytokines, tumor necrosis factor (TNF)-α being one of the most important. B-cells are activated, and their differentiation is promoted by T-lymphocytes after antigen exposure. CD4+ T-cell infiltrates are more prominent in RA-ILD than CD3+ T lymphocytes, in contrast with idiopathic pulmonary fibrosis (IPF) infiltrates. Other researchers have speculated that CD8+ T cells might have a role in the progression of pulmonary fibrosis in RA. Certain data suggest that CD8+ lymphocytes also have an important role in the development of ILD associated with RA, although this claim rests on a study suggesting that smoking leads to an increase in CD8+ T lymphocytes in the lungs [15,18–20].

The lung cellular infiltrates in RA-ILD have proven to be complex in SKG mouse models, consisting of CD4+ T lymphocytes, B lymphocytes, neutrophils, and macrophages. Cytokines and chemokines are of great importance in interstitial lung involvement in RA patients. TNF-α is a pro-inflammatory cytokine generated primarily by activated lymphocytes, macrophages, endothelial, and epithelial cells involved in the pathophysiology of ILD. TNF-α is important in the early stages and preservation of the cytokine and chemokine generation cascade and the induction of cell–cell adhesion and trans-endothelial migration [6,21].

The proliferation of fibroblasts is stimulated by TNF-α. Additionally, TNF-α promotes their capacity to degrade the extracellular matrix and to trigger the appearance of growth factors (GFs). GFs implicated in the pathogenesis of ILD include platelet-derived growth factor (PDGF-β), transforming growth factor (TGF-β), and vascular endothelial cell growth factor (VEGF). The expression of cytokines such as IL-4 and IL-13 and of chemokines (CXCL5, 8, 12, and 13) is also important. The GFs, cytokines, and chemokines stimulate the fibroblasts to differentiate and proliferate, thus connecting the inflammatory and fibrotic stages. Macrophages, fibroblasts, epithelial, and endothelial cells all generate PDGF-β. PDGF-β, like TGF-β and TNF-α, is one of the pro-fibrotic and pro-inflammatory molecules recognized to be important in the pathophysiology of ILD [5,22].

TGF-β's profibrotic effect is mediated via monocyte and fibroblast recruitment and activation and the stimulation of extracellular matrix deposition. TGF-β also causes fibroblasts to differentiate into myofibroblasts, which are the primary source of the extracellular matrix in the process of fibrosis of the lungs. Chemokines do not have a well-defined role in the formation of the inflammatory infiltrates in the lungs of patients with RA-ILD. These chemokines are produced by macrophages, fibroblasts, and epithelial cells, and they function by recruiting and activating fibroblasts.

The pro-fibrotic and/or pro-inflammatory cytokines and GFs are known to activate the Janus kinase (JAK)/signal transducer and activator of transcription (STAT) pathway. JAK/STAT activation leads to the polarization of macrophages into pro-inflammatory M1 type macrophages, which increases the secretion of cytokines such as IL-6, CXCL10, and TNF-α. These pro-inflammatory cytokines promote inflammation and/or fibrotic changes.

Other mediators involved in the pathogenesis are matrix metalloproteinases (MMPs) produced by damaged epithelia. MMPs maintain the crosstalk between inflammation and fibrosis by increasing the recruitment of cells such as B and T lymphocytes, macrophages, and neutrophils and producing additional pro-fibrotic mediators. The inflammatory process promotes the generation of VEGF, which aids angiogenesis. The exact mechanism of VEGF generation is still not well determined (Figure 1).

**Figure 1.** Pathogenesis of RA-ILD.

#### **3. Biomarkers**

#### *3.1. Antibody Biomarkers*

Patients with RA are known to have a preclinical stage in which autoantibodies such as RF and/or ACPA are detected in the serum, before the appearance of clinical synovial inflammation. However, the presence of the serological markers in the serum in situations when ILD develops before the articular manifestations might somehow be confusing [23].

RF are autoantibodies oriented against modified Fc segments of immunoglobulin (Ig) G. RFs are found in the bloodstream of up to 80% of patients with RA. The majority of RF consists of IgM antibodies, which are linked to the development of interstitial lung involvement in RA. IgA RFs have also been linked to ILD [24].

ACPAs have specificity for proteins in which peptidylarginine deiminases (PADs) have converted arginine residues to citrulline, and they are seen in the sera of 70–80% of RA patients. Anti-PAD3 antibodies have been associated with interstitial lung disease. ACPAs are reported to have greater specificity for RA than RFs. ACPA levels over a certain threshold are linked to ILD in RA. It has been claimed that IPF is related to the generation of IgA type ACPAs, albeit this has not been linked to ILD as a consequence of RA. Circulating secretory IgA-ACPAs have also been found in the serum of RA patients with ILD [25,26].

Other ACPAs have been discovered in RA-ILD patients. Antibodies against the citrullinated alpha-enolase peptide 1 (anti-CEP1) were linked to RA-ILD in an Italian investigation. A Chinese study also showed that increased levels of anti-CEP antibodies contributed to the development of ILD in RA patients. Antibodies against citrullinated heat shock protein 90 (cit-Hsp90) α or β have also been linked to RA-ILD, with low sensitivity but high specificity. Peripheral blood mononuclear cells (PBMC) from patients with RA-ILD produced more interferon-γ (IFN-γ) than those from patients without ILD when cultured in contact with cit-Hsp90β. IFN-γ was not detected in the PBMC of patients with other connective tissue disease (CTD)-associated ILD. This is because IFN-γ production is increased by cit-Hsp90-specific T lymphocytes, which are specific for RA-ILD [19,27,28].

In addition to citrullinated proteins, antibodies targeting other post-translationally modified proteins have been reported. Antibodies against carbamylated proteins (anti-CarP) have recently been linked to the development of RA-ILD. Four anti-CarP antibodies were often found at high serum titers: IgG anti-fetal calf serum (FCS), anti-chimeric fibrin/filaggrin homocitrullinated peptide, anti-fibrinogen, and IgA anti-FCS. Finally, antibodies against malondialdehyde-acetaldehyde (anti-MAA) have been linked to lung involvement in RA. Anti-MAA antibodies have been linked with increased disease activity and response to ACPA [29–31].

Patients with RA-ILD had greater plasma levels of IgA and IgM anti-MAA antibodies than those with RA without ILD. Levels of IgM anti-MAA antibodies were also higher in patients with RA-ILD than in those with lung disease not related to RA, such as chronic obstructive pulmonary disease (COPD) [32,33].

To summarize, there is no evidence that ACPAs have a role in RA-ILD risk. In clinical practice, ACPA positivity should not be used as a predictor for the risk of development of RA-ILD. However, high anti-CCP antibody titers and rheumatoid factor titers may assist in identifying individuals with RA who are at high risk of ILD [34].

#### *3.2. Genetic Biomarkers*

Few papers have reported genetic associations with interstitial lung involvement in RA, even though genetic risk factors for RA or IPF have been thoroughly researched. A single nucleotide variation (SNV) in the promoter region of the MUC5B gene, rs35705950, has been linked to familial and sporadic IPF. Additionally, a link has been established between RA-ILD and this MUC5B variant [35,36].

The MUC5B gene is overexpressed when this risk allele is present. MUC5B overproduction may impede alveolar repair. On the other hand, this risk allele has been linked to a better prognosis in IPF patients, suggesting its relevance in moderate IPF. Genome-wide association studies (GWASs) have been conducted to assess the influence of common variants on disease predisposition. In a Japanese GWAS, the SNV rs12702634 in the RPA3-UMAD1 gene showed a major association with the development of RA-ILD. This genetic polymorphism was mostly associated with the UIP pattern [37–39].

A study conducted by Jönsson et al. on 1466 RA patients from northern Sweden analyzed 571,151 SNVs and found that four of the tested SNVs were associated with interstitial lung involvement in RA: rs35705950 (MUC5B gene), rs2609255 (FAM13A gene), rs111521887 (TOLLIP gene), and rs2736100 (TERT gene). However, more extensive studies on a larger number of patients are yet to be conducted [40].

Antigens are presented to T-cell receptors by HLA molecules; thus, HLA alleles are connected to a wide range of diseases. IPF is linked to HLA-B\*15, HLA-B\*40, HLA-DR2 (DRB1\*15 and DRB1\*16), and MICA\*001. RA is linked to HLA-DRB1\*04:01, \*04:04, \*04:05, \*01:01, and \*10:01. These RA risk alleles are known as "shared epitope" (SE) alleles because they share amino acid sequences at positions 70–74 of the HLA-DR protein (QKRAA, RRRAA, or QRRAA). In RA, DR2 alleles have been found to predispose to ILD, whilst SE alleles have been found to protect against ILD. Even though SE alleles are closely linked to ACPA-positive RA, the frequency of these alleles is lower in RA patients with interstitial lung involvement [41].
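The shared-epitope definition above is essentially a motif lookup at fixed positions, which can be sketched in a few lines of Python. This is only an illustration: the function names and the toy poly-glycine chain are hypothetical, positions are assumed to be 1-indexed on the mature protein, and no real HLA sequence data are used.

```python
# Shared-epitope (SE) motifs named in the text, located at positions 70-74
# of the HLA-DR protein.
SE_MOTIFS = {"QKRAA", "RRRAA", "QRRAA"}


def motif_at_70_74(mature_sequence: str) -> str:
    """Return the five residues at positions 70-74 (1-indexed, assumed numbering)."""
    if len(mature_sequence) < 74:
        raise ValueError("sequence is shorter than 74 residues")
    # Python slices are 0-indexed and end-exclusive: [69:74] covers residues 70..74.
    return mature_sequence[69:74].upper()


def carries_shared_epitope(mature_sequence: str) -> bool:
    """True if the residues at positions 70-74 match one of the SE motifs."""
    return motif_at_70_74(mature_sequence) in SE_MOTIFS


def make_toy_chain(motif: str) -> str:
    """Hypothetical placeholder chain (not a real HLA sequence):
    69 glycines, the given 5-residue motif, then 20 more glycines."""
    return "G" * 69 + motif + "G" * 20


if __name__ == "__main__":
    for motif in ("QKRAA", "RRRAA", "QRRAA", "GGGGG"):
        chain = make_toy_chain(motif)
        print(motif, carries_shared_epitope(chain))
```

In practice, allele-level typing (for example, DRB1\*04:01) would be mapped to its SE status from a curated reference rather than from raw sequence, so this sketch only illustrates why the alleles are grouped together.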

MicroRNAs (miRNAs) are non-coding RNAs of about 22 nucleotides that control the expression of protein-coding genes. Circulating miRNAs are rapidly emerging as disease biomarkers in several illnesses. Plasma levels of hsa-miR-214-5p and hsa-miR-7-5p are elevated in RA and IPF. Additionally, the potential of long non-coding RNAs has been tested. These are transcripts longer than 200 nucleotides that cannot be translated into proteins. The levels of several of these long non-coding RNAs were likewise shown to be higher in the PBMCs of RA-ILD patients [42,43].

#### *3.3. Other Biomarkers*

Krebs von den Lungen 6/MUC1 (KL-6) is a mucin-like glycoprotein which stimulates fibrosis and inhibits apoptosis of pulmonary fibroblasts. Serum KL-6 levels were shown to be higher in those with RA lung involvement, suggesting that it might help detect ILD development early on. In a study of 47 RA patients, the findings on lung computed tomography were related to higher serum KL-6 levels and increased disease severity. Severity was defined as extensive lung fibrosis on HRCT (>30%) or forced vital capacity (FVC) on PFT less than 50% and also the need for oxygen supplementation. Increased levels of KL-6 were also found in a study by Lee and colleagues in the serum of patients with CTD-ILD [44]. Type II pneumocytes and bronchiolar epithelial cells both express KL-6. KL-6 is expected to leak into the vascular system after epithelial breakdown caused by lung damage, indicating that it might be employed as a marker of epithelial injury. KL-6 might be used as a diagnostic marker in CTD-ILD. In a study by Oguz et al. of 113 CTD patients and 45 healthy controls, median KL-6 values were significantly higher in the CTD-ILD group [44–48].

The pathophysiology of IPF is influenced by matrix metalloproteinases (MMPs) and tissue inhibitors of metalloproteinases (TIMPs), but also by cytokines and chemokines. MMP-7 levels were regularly observed to be higher in IPF patients. Several studies have looked at the involvement of these proteins in interstitial lung involvement in RA patients. High levels of MMP-7, soluble programmed death-ligand 1, C-X-C motif chemokine ligand 10 (CXCL10), interleukin (IL)-13, and IL-18 were discovered in the serum of patients with lung involvement in RA [49,50].

Chen et al. showed that MMP-7 and CXCL10 serum levels were higher in patients with RA-ILD than in those with RA without ILD [51]. Doyle et al. conducted a study which might help diagnose RA-ILD in the subclinical phase, finding that a biomarker profile consisting of MMP-7, activation-regulated chemokines, and surfactant protein D (SP-D) is consistent with the development of ILD in RA patients [49,52,53].

Fu et al. found that lysyl oxidase-like 2 (LOXL2) levels in RA patients with or without ILD were higher than in healthy controls. LOXL2 levels were substantially higher in subjects with RA-ILD who had had ILD for ≤3 months than in those who had had ILD for >3 months [52]. The main candidates for biomarkers in RA-ILD are presented in Table 1.

**Table 1.** Value of biomarkers in RA-ILD.


ACPA—anticitrullinated protein antibodies; Anticitrullinated HSP90—heat shock protein 90; PAD—peptidylarginine deiminases; anti-CEP1—citrullinated alpha-enolase peptide 1; anti-CarP—anti-carbamylated proteins; anti-MAA—anti-malondialdehyde-acetaldehyde; KL-6—Krebs von den Lungen 6/MUC1; MMP-7—matrix metalloproteinase 7; CXCL10—C-X-C motif chemokine ligand 10; sPD-1—soluble programmed death ligand 1; IL—interleukin; SP-D—surfactant protein D; LOXL2—lysyl oxidase-like 2.

In conclusion, each of these potential biomarkers, such as RF and ACPA, has some evidence of a link to RA-ILD. If any of them is to be regarded as a clinically useful biomarker for RA-ILD, more research is needed to explain these relationships and establish their validity. Multiple ongoing clinical studies aim to investigate biomarkers in RA-ILD, as presented in Table 2.


*Biomedicines* **2022**, *10*, 1367

**Table 2.** Clinical studies on RA-ILD biomarkers.


#### **4. Similarities between RA-ILD and IPF**

RA-ILD has certain phenotypic similarities with IPF, unlike other CTD-associated ILD. First, some risk factors are shared by RA-ILD and IPF, the most important being smoking, followed by age and male sex. Second, they have a similar imaging and pathology phenotype, with an apparent predominance of the usual interstitial pneumonia (UIP) pattern, which is the most prevalent pattern of interstitial lung involvement in RA [72].

ACPAs have recently been discovered in patients with IPF. ACPA positivity was shown to be more common in two different IPF cohorts. In these two IPF cohorts, IgA-ACPA positivity was higher than in the general population control group. The concept of a common genetic foundation in RA interstitial lung involvement and IPF is supported by phenotypic resemblance and shared environmental risk factors [73]. An increase in rare variations in genes associated with familial pulmonary fibrosis has been identified in RA-ILD. The functional MUC5B rs35705950 promoter mutation has recently been described as a risk factor for RA-ILD, in addition to being a significant risk factor for IPF. Strong MUC5B staining was evidenced in lung samples from individuals with RA-ILD, located in the areas with alveolar epithelium hyperplasia in the fibrotic regions, comparable to that seen in IPF. According to immunohistochemistry, IgA-ACPA positivity was higher than IgG-ACPA positivity in patients with IPF, whereas IgG-ACPA positivity was higher than IgA-ACPA positivity in patients with RA [74,75].

#### **5. Diagnosis of RA-ILD**

The diagnosis of ILD in patients with diagnosed or suspected RA demands a coordinated multidisciplinary approach involving radiology, pathology, rheumatology, and pulmonology expertise, as well as consideration of other possible causes of ILD. Each specialist has a well-defined role in the diagnosis and treatment of ILD. After a diagnosis of RA is established by the rheumatologist, the patient has to be thoroughly evaluated. An HRCT has to be performed and interpreted by a specialized radiologist and, if alterations in the lung parenchyma are detected, a complete evaluation with pulmonary function tests (PFTs) has to be conducted by a pulmonologist. Regarding treatment, collaboration between the rheumatologist and pulmonologist is essential, since the therapeutic arsenal is different in each specialty. Thus, frequent meetings and conferences, or even the formation of multidisciplinary teams, are of great importance in the diagnosis and treatment of RA-ILD [76].

#### *5.1. Clinical Presentation*

Exertional dyspnea, cough, chest discomfort, and exhaustion are symptoms of ILD that are similar to those of a variety of more frequent lung disorders.

In individuals with fibrotic ILD, a clinical evaluation might reveal digital clubbing and/or Velcro-crackles on lung auscultation. Up to 15% of patients with RA-ILD have been reported to present clubbing [77,78].

Patients with RA-ILD have been found to exhibit bilateral basal crackles in almost 90% of cases. Crackles were also found in individuals with RA who did not have ILD, albeit less frequently. The complexity of the illness and the diversity in HRCT patterns are most likely responsible for the clinical variability [34].

#### *5.2. Imaging*

The use of a chest X-ray to detect ILD in RA patients is ineffective. On a thoracic radiograph, up to 64% of individuals with ILD on HRCT will have no visible interstitial abnormalities. As a result, if ILD is suspected, HRCT must be performed as part of the diagnostic process.

The UIP pattern is the most frequently encountered in RA-ILD, although all types of interstitial pneumonia have been described. UIP, obliterative bronchiolitis, nonspecific interstitial pneumonia (NSIP), and organizing pneumonia (OP) were identified as the four primary HRCT patterns in individuals with RA-ILD [79,80].

#### *5.3. Phenotypes of RA-ILD*

The most prevalent type of ILD is usual interstitial pneumonia, evidenced in up to 70% of cases. It is associated with worse outcomes in comparison with other RA-ILD patterns. Typical HRCT features of UIP include a subpleural distribution with a basal predominance; honeycombing, which is highly specific and reflects the stage and severity of the disease; reticular opacities associated with honeycombing and traction bronchiectasis; ground-glass opacities, which are usually less extensive than the reticular pattern; architectural distortion; and lobar volume loss [81].

Non-specific interstitial pneumonia is less prevalent than UIP. NSIP has two main subtypes, fibrotic and cellular, with lung involvement being mostly subpleural with an apicobasal gradient. Typical HRCT features of NSIP include ground-glass opacities with immediate subpleural sparing, mostly bilateral and symmetric; reticular opacities and irregular linear opacities; thickening of bronchovascular bundles; traction bronchiectasis; and lung volume loss, particularly in the lower lobes. It is associated with a lower risk of disease progression and a better response to treatment in comparison with UIP (Figure 2) [82].

**Figure 2.** CT of the thorax—lung window—showing bilateral fine interstitial thickening and ground glass opacities with a basal predominance, minimal traction bronchiectasis and relative subpleural sparing (NSIP pattern).

Organizing pneumonia is a less frequent pattern encountered in RA-ILD. HRCT typical features include focal ground-glass opacities, consolidation and reversed halo sign.

Other less common patterns are lymphocytic interstitial pneumonia (LIP) and desquamative interstitial pneumonia (DIP) [83].

LIP may present HRCT features such as diffuse involvement with mid to lower lobe predominance, interstitial thickening along lymph channels, thickening of the bronchovascular bundles, pulmonary nodules (either centrilobular or subpleural), ground-glass opacities, and thin-walled cysts.

DIP is characterized on HRCT by ground-glass opacities, irregular linear opacities, and small cystic spaces [84,85].

#### *5.4. Pulmonary Function Tests*

PFTs, especially the lung's carbon monoxide diffusing capacity (DLCO), are able to detect subclinical pulmonary disease. The presence of concomitant emphysema and the variability of PFTs within the normal range may restrict the use of this diagnostic method. PFT outcomes in individuals with RA-ILD vary depending on the study population and the severity of the illness. PFT abnormalities are present in 45–65% of individuals with RA, whether or not they have respiratory symptoms.

The abnormalities observed include restrictive patterns, airway obstruction, and decreased DLCO. The incidence of a restrictive pattern ranges from 5 to 25%. Approximately 20–45% of people with RA have an impaired DLCO. Although many people have abnormal PFTs, most of these abnormalities are clinically inconsequential and silent [86,87].

#### *5.5. Bronchoalveolar Lavage*

In individuals with RA-ILD, the cellular characteristics of bronchoalveolar lavage (BAL) fluid are frequently aberrant but nonspecific. Lymphocytosis tends to be more common in the non-UIP pattern, while increased neutrophil levels are characteristic of the UIP pattern. BAL is not always required. Usually, it is conducted to rule out other causes of lung disease. The nonspecific results prevent this method from being a useful diagnostic tool [88].

#### *5.6. Histopathology*

Insights into the histopathological structure of interstitial pneumonia obtained through surgical lung biopsy (SLB) may help to clarify the diagnosis and might also have prognostic significance. However, the risks outweigh the benefits in some cases, and the decision to perform an SLB needs to be carefully considered [89].

In RA-ILD, the histological patterns are varied, and any kind of interstitial pneumonia can occur and even overlap. Idiopathic interstitial pneumonias are classified according to a variety of distinct histological characteristics that are also observed in RA-ILD. Patches of fibrosis with honeycombing and fibroblast foci alternate with patches of normal lung tissue in a UIP pattern marked by heterogeneity. The appearance of NSIP is uniform, with thickening of the alveolar septa and various degrees of inflammatory and fibrotic changes. DIP and follicular bronchiolitis include peribronchiolar inflammation and fibrosis, while intra-alveolar connective tissue plugs characterize OP [90].

#### **6. Treatment**

It is critical to carry out a baseline evaluation of disease severity in patients diagnosed with RA-ILD and closely monitor patients to identify those who develop disease progression. When selecting whether to start or continue therapy in individuals with RA-ILD, the severity and progression of the illness are two significant variables to consider.

The best therapeutic plan for RA-ILD patients has yet to be determined. To date, there have been no randomized controlled trials (RCTs) comparing therapeutic options for RA-ILD [91].

#### *6.1. Corticosteroids, Synthetic, Biological, and Targeted Therapy*

In patients with refractory disease the most frequently utilized therapeutic strategies consist of corticosteroids, azathioprine, and mycophenolate, with rituximab or TNF-α inhibitors. In RA-ILD with an inflammatory pattern, treatment response is frequently better. Fibrotic lung disease, for example RA-UIP, is usually less responsive to treatment and disease progression is similar to IPF [92].

Current therapy is primarily centered on immunosuppression and is based on empirical information. Corticosteroids are usually administered either in a daily oral dose or as pulse therapy. The dose is tapered over several months according to tolerance and clinical response. In inflammatory types of RA-ILD, such as NSIP and OP, corticosteroids have proved to have a limited effect on disease progression. TNF-alpha inhibitors, methotrexate (MTX), azathioprine (AZA), mycophenolate mofetil (MMF), and cyclophosphamide (CYC) are among the immunosuppressive medications used as maintenance therapy or in corticosteroid-resistant cases [93].

Therapy with corticosteroids alone or in combination with DMARDs alleviated or stabilized the disease in almost half of the 84 patients with RA-UIP, according to a retrospective study by Song et al., but there was no substantial difference in survival compared to the untreated group [94].

In rapidly advancing, severe ILD and RA-ILD with substantial UIP, cyclophosphamide in conjunction with methylprednisolone has shown potential efficacy; however, the data are based on a limited retrospective case series [22].

In RA patients, methotrexate is recommended as the first-line therapy, since it successfully reduces disease progression, disability, and mortality. MTX, on the other hand, has been linked to the development or worsening of ILD in RA patients [95]. Kiely et al. set out to determine whether treatment with MTX is linked to RA-ILD diagnosis and whether it delays RA-ILD development. In a multicenter prospective early RA cohort analysis involving 2701 participants, they found that MTX exposure was linked with a substantially lower incidence of RA-ILD. Furthermore, they found that therapy may help RA patients postpone the onset of ILD. This research offers reason to believe that MTX may be helpful in the prevention and treatment of RA-ILD [96].

In a study conducted by Yusof et al., rituximab (RTX) was administered to 700 individuals with RA, 56 of whom already had RA-ILD. After receiving rituximab, 68% of these patients had improved or maintained pulmonary function. Rituximab was shown to have a good safety profile, with only three individuals (0.4%) having developed ILD following therapy [96,97].

Drug-induced interstitial lung involvement has been reported for most anti-TNF agents, including infliximab, adalimumab, etanercept, certolizumab pegol, and golimumab, as well as for the IL-6 receptor antagonist tocilizumab. The majority of evidence for TNF inhibitor-related ILD comes from case reports. A thorough literature search revealed that establishing a causal link between RA treatment and the onset or progression of ILD is extremely challenging [98].

Due to the lack of a dedicated RCT, the effect of bDMARDs on RA-ILD is uncertain. Rituximab, tocilizumab, and abatacept have all been shown to have favorable results in recent studies, with the disease in treated individuals remaining stable or improving as measured by PFTs. However, most of these investigations are small, uncontrolled retrospective studies, and their findings must be confirmed in RCTs [99–101].

The JAK/STAT pathway is implicated in the development of ILD. The beneficial effect of JAK inhibitors on CTD-ILD has been reported in a number of case reports in the recent literature, in mouse models, and in a few clinical studies. An open-label trial conducted by Chen Z. et al. evaluated the efficacy of tofacitinib in amyopathic dermatomyositis associated with ILD in patients with anti-melanoma differentiation-associated gene 5 (MDA5) antibodies. The study involved 18 patients treated with glucocorticoids (GC) and tofacitinib at a dose of 10 mg/day, while 32 patients treated with GC alone were included as historical controls. The 6-month survival rate was significantly higher in the group treated with tofacitinib than in the control group. Favorable outcomes were also noted for FVC, DLCO, and HRCT findings in the study group [102].

D'Alessandro et al. conducted a study on 15 patients with RA (4 of whom were diagnosed with RA-ILD) in order to evaluate adipokine levels after 6 months of baricitinib treatment. The study showed a significant decrease in KL-6 levels in the patients with ILD, as well as an improvement in DLCO. Although the RA-ILD group was too small to reach statistical significance, the results of this study may be a cornerstone for the development of other trials [103]. Other case reports on ruxolitinib have shown improvement in PFTs and HRCT in patients with ILD [104–106]. However, more extensive RCTs have to be conducted in order to establish the beneficial effect of JAK inhibitors in RA-ILD and the potential adverse events.

#### *6.2. Antifibrotic Therapy*

Due to the mechanistic similarities between RA-related UIP and IPF, antifibrotic medication may have a beneficial effect on progressive fibrotic RA-ILD, particularly with UIP patterns. Antifibrotic drugs are not known to be beneficial for articular symptoms of the condition, thus immunomodulating therapy may be needed in addition. When a varied group of patients with progressive fibrotic ILD (PF-ILD) (other than IPF) were placed together as a single entity, the results of the INBUILD study suggested a therapeutic advantage with nintedanib in individuals displaying pulmonary disease progression. A post hoc assessment of all diagnostic categories (including some autoimmune-ILDs) revealed a treatment advantage (particularly, the rate of FVC decline) [107–110].

Another treatment option is represented by pirfenidone which lowers serum concentrations of IL-6 and TNF-alpha, two important cytokines in RA pathogenesis. A recent discovery suggested that pirfenidone prevents the transition from fibroblast to myofibroblast in the lung tissues of patients with ILD. Due to this fact, treatment with pirfenidone may be considered in the case of UIP patterns. According to recent studies, pirfenidone has a beneficial effect on disease progression by slowing it in patients with unclassifiable PF-ILD [111,112].

The main limitation of our review is the fact that it is a narrative review, therefore eligibility criteria for studies, search strategy, selection process, study risk of bias assessment, and data collection are not explained.

#### **7. Conclusions**

For RA patients, ILD is a frequent and sometimes fatal complication. Unfortunately, the precise etiology of RA-ILD is not yet fully understood. The pathophysiology of RA-ILD has been linked to biomarkers such as ACPA, the MUC5B variant, and KL-6, and to environmental factors such as smoking. Biomarkers may help to identify the patients at the highest risk for RA-ILD and those most likely to progress. The hope is that finding biomarkers with good performance characteristics would help researchers better understand the pathophysiology of RA-ILD and, in turn, lead to the development of tailored therapeutics for this severe RA manifestation. Although multiple biomarkers have been studied, none has shown performance characteristics sufficient to reliably identify interstitial lung disease in RA patients. More studies have to be performed in order to establish and validate their clinical implications, sensitivity, specificity, and utility in assessing diagnosis, prognosis, and disease severity.

**Author Contributions:** Conceptualization, A.F. and F.L.G.; methodology, A.E.M., V.P. and A.B.; validation, C.C. and A.R.; investigation, L.-M.F. and M.M.F.; data curation, A.F. and A.E.M.; writing—original draft preparation, L.-M.F., M.M.F. and C.C.; writing—review and editing, A.R., F.L.G. and A.F.; supervision, A.E.M., V.P. and A.B. All authors have read and agreed to the published version of the manuscript.

**Funding:** The Article Processing Charges were funded by the University of Medicine and Pharmacy of Craiova, Romania.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Article* **Gelsolin as a Potential Biomarker for Endoscopic Activity and Mucosal Healing in Ulcerative Colitis**

**Keiko Maeda 1,\*, Masanao Nakamura 2, Takeshi Yamamura 2, Tsunaki Sawada 1, Eri Ishikawa 2, Akina Oishi 2, Shuji Ikegami 2, Naomi Kakushima 2, Kazuhiro Furukawa 2, Tadashi Iida 2, Yasuyuki Mizutani 2, Takuya Ishikawa 2, Eizaburo Ohno 2, Takashi Honda 2, Masatoshi Ishigami 2 and Hiroki Kawashima 2**


**Abstract:** The therapeutic goal in ulcerative colitis is mucosal healing, which requires improved non-invasive biomarkers to evaluate disease activity. Gelsolin is associated with several autoimmune diseases, and here, we aimed to analyze its usefulness as a serological biomarker for clinical and endoscopic activities in ulcerative colitis. Patients with ulcerative colitis (*n* = 138) who had undergone blood tests and colonoscopy were included. Serum gelsolin was measured using enzyme-linked immunosorbent assay, and correlation between the gelsolin level and clinical and endoscopic activities was examined. The serum gelsolin level in patients with ulcerative colitis was significantly lower than that in healthy subjects, and it decreased in proportion to increasing Mayo score and Mayo endoscopic subscore. The area under the curve for correlation between clinical and endoscopic remission and serum gelsolin level was higher than that for C-reactive protein. Furthermore, in C-reactive protein-negative patients, the serum gelsolin level was lower in the active phase than in remission. Our findings indicate that the serum gelsolin level correlates with clinical and endoscopic activities in ulcerative colitis, has a higher sensitivity and specificity than C-reactive protein, and can detect mucosal healing, suggesting that gelsolin can be used as a biomarker for ulcerative colitis.

**Keywords:** biomarker; ulcerative colitis; gelsolin; mucosal healing

#### **1. Introduction**

Inflammatory bowel disease (IBD), represented by ulcerative colitis (UC) and Crohn's disease, is a chronic inflammatory ailment of the gastrointestinal tract with an increasing incidence worldwide [1,2]. With recent advances in medical therapies, such as the development of immunomodulators and biologics, the goal of IBD treatment has shifted from alleviating clinical symptoms to achieving endoscopic mucosal healing. Mucosal healing reduces subsequent recurrence, surgery, and carcinogenesis rates [3–5], and the concept of treat-to-target, aimed at achieving endoscopic mucosal healing, is being widely accepted [6,7]. The gold standard for determining disease activity and mucosal healing in IBD is endoscopy; however, this method is associated with physical, time, and economic burdens. Therefore, in clinical practice, non-invasive and repeatable blood- and stool-based biomarkers that reflect disease activity and mucosal healing are necessary. In addition to fecal markers such as fecal calprotectin and the fecal immunochemical test, serum C-reactive protein (CRP) [8,9] and serum leucine-rich glycoprotein (LRG) [10–12] have been reported as useful.

**Citation:** Maeda, K.; Nakamura, M.; Yamamura, T.; Sawada, T.; Ishikawa, E.; Oishi, A.; Ikegami, S.; Kakushima, N.; Furukawa, K.; Iida, T.; et al. Gelsolin as a Potential Biomarker for Endoscopic Activity and Mucosal Healing in Ulcerative Colitis. *Biomedicines* **2022**, *10*, 872. https://doi.org/10.3390/biomedicines10040872

Academic Editor: Marianna Christodoulou

Received: 17 March 2022 Accepted: 6 April 2022 Published: 9 April 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

CRP is produced by hepatocytes in the acute phase upon IL-6 stimulation and is used as a biomarker for various inflammatory diseases. In terms of correlation with disease activity, CRP is associated with endoscopic activity in Crohn's disease (CD), but only with histologically severe inflammation in UC. Therefore, in UC, a low CRP does not necessarily mean the absence of endoscopic activity, and this low sensitivity is problematic. Leucine-rich glycoprotein is also expressed on neutrophils, macrophages, and intestinal epithelial cells and is induced by interleukin (IL)-22, tumor necrosis factor-α, and IL-1 independent of IL-6. In CD, the LRG level is strongly correlated with disease activity, and in UC, it correlates with endoscopic activity, but more evidence is needed in this regard. Therefore, biomarkers that reflect clinical and endoscopic activities and predict mucosal healing more accurately than conventional markers are needed. Accurate non-invasive assessment of disease activity would enable optimal therapeutic choices and improve patient prognosis.

In this study, we performed proteome analysis of colon mucus samples from patients with UC and focused on gelsolin (GSN), which is significantly downregulated in active UC compared with that in remission. GSN is an 82–84 kDa protein consisting of 730 amino acids organized into six homologous domains and expressed in both extracellular fluids and the cytoplasm of most human cells [13,14]. GSN is a multifunctional protein; because of its strong effect on the cytoskeleton and inflammation-related biological processes, it shows potential as a biomarker for inflammation-associated medical conditions, such as for predicting illness severity, treatment efficacy, and clinical outcomes [15,16]. Reduced GSN levels have been observed in patients with chronic autoimmune diseases such as rheumatoid arthritis and psoriasis [17,18]. GSN localization is also altered in patients with Crohn's disease. Moreover, GSN is an actin-depolymerizing protein that regulates actin dynamics and is involved in cytoskeletal remodeling [16,17]. Its extracellular isoform, plasma GSN, is present in the blood, urine, and other extracellular fluids, such as lymph, burn wound fluid, cerebrospinal fluid, and airway surface fluids [19,20]. The secreted GSN also functions in the extracellular actin scavenger system, where it is responsible for the severance and removal of actin filaments released from dead cells into the bloodstream [13]. In addition, the secreted GSN binds to lipopolysaccharides, which are compounds derived from the cell wall of Gram-negative bacilli, and inhibits the activation of Toll-like receptors, thereby regulating immune responses [21–24]. The secreted GSN has anti-inflammatory properties, and decreased GSN levels in the blood have been reported in chronic inflammatory diseases.
Although the mechanism by which the GSN levels in the blood are reduced remains unclear, the re-distribution of GSN to inflammatory sites, binding to some plasma factors secreted in association with inflammation, and decreased GSN production have previously been reported [23,24]. Therefore, in this study, we examined whether GSN can be used as a biomarker for the clinical and endoscopic activities of UC.

#### **2. Materials and Methods**

#### *2.1. Study Subjects and Sample Collection*

In total, 138 patients with UC and 16 healthy controls were enrolled in this study at the Department of Gastroenterology and Hepatology, Nagoya University Hospital, between April 2016 and April 2021. The healthy controls included 10 women, and their median age was 45 (range 36–66) years. Patients were diagnosed with UC based on clinical, endoscopic, and histological criteria and received medical therapy. Clinical and endoscopic activity scores were reviewed from their medical records. Blood sampling and endoscopy were performed within a maximum interval of 1 month. Serum was obtained from the blood samples and stored at −80 °C until GSN analysis. Patients with UC comprised 84 women and 54 men, and their median age was 47 (range 20–82) years. The median duration of disease was 143 (range 7–372) months. The median C-reactive protein level was 0.08 (range 0–8.4) mg/dL. The median albumin level was 4.1 (range 1.8–4.9) g/dL. The median Mayo score was 3 (range 0–12). Overall, 74.6% (103/138) of patients were administered 5-aminosalicylic acids, 13% (18/138) were administered corticosteroids, and 32.6% (45/138) were administered biologic agents. Patients with UC were classified according to the extent of disease involvement as those with proctitis, left-sided colitis, or pancolitis, as described in the Montreal classification.

The proportion of patients with proctitis, left-sided colitis, and extensive colitis was 5.8% (8/138), 26.1% (36/138), and 68.1% (94/138), respectively. Clinical activity was determined using the Mayo score, and remission was defined by a score of ≤2. The endoscopic Mayo score was used to determine endoscopic activity, and endoscopic remission was defined by a score of 0. The proportion of clinically and endoscopically active patients was 50.7% (70/138) and 63.8% (88/138), respectively. The patient characteristics are presented in Table 1.

**Table 1.** Characteristics of patients with ulcerative colitis.


#### *2.2. Measurement of Serum GSN Level*

Serum GSN level was measured using an enzyme-linked immunosorbent assay (ELISA) kit (Abcam, Cambridge, UK), according to the manufacturer's instructions. The absorbance of each sample was measured at 450 and 570 nm using a PowerScan4 microplate reader (DS Pharma Medical Co., Osaka, Japan). The level of GSN was calculated using a standard curve.
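As an illustration of how a concentration can be read off a standard curve, the following minimal sketch subtracts the 570 nm reading from the 450 nm reading (a common optical correction, assumed here) and interpolates linearly between standards. The standard concentrations, absorbance values, and function names are hypothetical, and the curve-fitting model actually used with the kit is not stated in the text.

```python
from bisect import bisect_left

def corrected_od(od450: float, od570: float) -> float:
    """Subtract the 570 nm reference reading from the 450 nm signal."""
    return od450 - od570

def interpolate_concentration(od: float, standards: list[tuple[float, float]]) -> float:
    """Piecewise-linear interpolation on (concentration, corrected OD) pairs.

    `standards` must be sorted by increasing OD. Readings outside the
    standard range raise ValueError instead of being extrapolated.
    """
    ods = [s[1] for s in standards]
    if not ods[0] <= od <= ods[-1]:
        raise ValueError("OD outside the range of the standard curve")
    i = bisect_left(ods, od)
    if ods[i] == od:
        return standards[i][0]
    (c0, od0), (c1, od1) = standards[i - 1], standards[i]
    return c0 + (od - od0) * (c1 - c0) / (od1 - od0)

# Hypothetical standards: (concentration, corrected OD); units are arbitrary here.
STANDARDS = [(0.0, 0.05), (5.0, 0.25), (10.0, 0.45), (20.0, 0.85)]

sample_od = corrected_od(od450=0.40, od570=0.05)
sample_conc = interpolate_concentration(sample_od, STANDARDS)
```

Samples that fall outside the standard range would normally be re-assayed at a different dilution rather than extrapolated, which is why the sketch raises an error in that case.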

#### *2.3. Mass Spectrometry*

Lower rectum intestinal mucus samples of 3 patients with active UC and 3 patients with UC in remission were collected through colonoscopy. Colon mucus from the anterior and right rectal walls was collected using brush catheters (Colonoscope Cytology Brush®; Cook Medical, Winston-Salem, NC, USA).

Patients were diagnosed with UC based on clinical, endoscopic, and histological criteria. These samples were lysed using a Minute Total Protein Extraction Kit for mass spectrometry (Funakoshi, Tokyo, Japan), and the specimens were adjusted to the same protein level before mass spectrometry (MS).

The proteins were digested using trypsin for 16 h at 37 °C after reduction and alkylation, and the peptides were analyzed using LC–MS on an Orbitrap Fusion mass spectrometer (Thermo Fisher Scientific Inc., Waltham, MA, USA) coupled to an UltiMate3000 RSLC nano LC system (Dionex Co., Amsterdam, The Netherlands), using a nano HPLC capillary column (Nikkyo Technos Co., Tokyo, Japan) with a nano electrospray ion source. Reverse-phase chromatography was performed with a linear gradient (0 min, 5% B; 100 min, 40% B) of solvent A (2% acetonitrile with 0.1% formic acid) and solvent B (95% acetonitrile with 0.1% formic acid). A precursor ion scan was carried out over a mass-to-charge ratio (m/z) range of 400–1600 before MS/MS analysis.
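The linear gradient described above can be expressed as a simple interpolation. The sketch below is only illustrative: the function names are hypothetical, the default values are the gradient end points and solvent compositions quoted in the text, and the overall acetonitrile content is computed under the assumption of ideal volumetric mixing.

```python
def percent_b(t_min: float, t0: float = 0.0, b0: float = 5.0,
              t1: float = 100.0, b1: float = 40.0) -> float:
    """Percentage of solvent B at time t for a linear gradient (t0, b0) -> (t1, b1)."""
    if not t0 <= t_min <= t1:
        raise ValueError("time outside the programmed gradient")
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)

def acetonitrile_percent(b_pct: float, acn_a: float = 2.0, acn_b: float = 95.0) -> float:
    """Overall acetonitrile content (%) of the A/B mixture, assuming ideal mixing."""
    return (acn_a * (100.0 - b_pct) + acn_b * b_pct) / 100.0

halfway_b = percent_b(50.0)                    # 22.5% B at 50 min
halfway_acn = acetonitrile_percent(halfway_b)  # 22.925% acetonitrile overall
```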

#### *2.4. Data Analysis*

The raw data were processed using Proteome Discoverer 1.4 (Thermo Fisher Scientific) in conjunction with the MASCOT search engine, version 2.6.0 (Matrix Science Inc., Boston, MA, USA) for protein identification. The peptides and proteins were identified using the human protein database in UniProt (release 2020\_03) with a precursor mass tolerance of 10 ppm and a fragment ion mass tolerance of 0.8 Da. Fixed modification was set to carbamidomethylation of cysteine, and variable modification was set to oxidation of methionine.
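To make the two tolerances concrete, the sketch below converts a relative tolerance in ppm into an absolute window and checks a fragment match against an absolute tolerance in Da. The function names and the example values are hypothetical; how the search engine applies these tolerances internally is not described in the text.

```python
def ppm_window(mass: float, tol_ppm: float = 10.0) -> tuple[float, float]:
    """Return the (low, high) bounds for a relative tolerance given in ppm."""
    delta = mass * tol_ppm / 1e6
    return mass - delta, mass + delta

def fragment_matches(observed: float, theoretical: float, tol_da: float = 0.8) -> bool:
    """True if an observed fragment lies within an absolute tolerance in Da."""
    return abs(observed - theoretical) <= tol_da

# A precursor of mass 1000.0 with a 10 ppm tolerance spans 999.99 to 1000.01.
low, high = ppm_window(1000.0)
```

The relative tolerance scales with mass, so the same 10 ppm corresponds to ±0.02 at a mass of 2000, whereas the 0.8 Da fragment tolerance is constant.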

#### *2.5. Statistical Analysis*

All analyses were performed using Prism software (GraphPad Prism version 8; GraphPad Software, San Diego, CA, USA). Differences between groups were compared using the Mann–Whitney U-test and Kruskal–Wallis test. The area under the receiver operating characteristic (ROC) curve (AUC) was calculated by plotting sensitivity on the *y* axis against 1 − specificity on the *x* axis for each value. The correlation analysis was performed using Pearson coefficients. Statistical significance was defined as *p* < 0.05 (\*, *p* < 0.05; \*\*, *p* < 0.01; and \*\*\*, *p* < 0.001).
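The ROC construction described above can be sketched as follows. This is not the Prism implementation; it is a minimal illustration with hypothetical marker values, in which the group with the higher marker level is arbitrarily treated as the positive class. The AUC is computed in two equivalent ways: as the area under the empirical curve and as the probability that a randomly chosen positive scores higher than a randomly chosen negative.

```python
def roc_points(pos: list[float], neg: list[float]) -> list[tuple[float, float]]:
    """(1 - specificity, sensitivity) for every rule 'value >= threshold is positive'."""
    thresholds = sorted(set(pos) | set(neg), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        sensitivity = sum(v >= t for v in pos) / len(pos)
        false_positive_rate = sum(v >= t for v in neg) / len(neg)
        points.append((false_positive_rate, sensitivity))
    return points

def trapezoid_auc(points: list[tuple[float, float]]) -> float:
    """Area under the piecewise-linear ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def rank_auc(pos: list[float], neg: list[float]) -> float:
    """P(positive > negative), with ties counted as one half."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical marker levels for two groups (not study data).
group_high = [14.0, 12.5, 11.0, 9.0]
group_low = [10.0, 8.5, 12.0, 7.0]

auc_curve = trapezoid_auc(roc_points(group_high, group_low))
auc_rank = rank_auc(group_high, group_low)
```

Both routes give 0.8125 for these values, which reflects the known equivalence between the empirical AUC and the Mann–Whitney U statistic divided by the product of the group sizes.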

#### *2.6. Ethical Considerations*

This study was approved by the ethics committee of the Nagoya University Hospital, Japan (Protocol number 2015-0420, August 2016). Written informed consent was obtained from all patients before their enrollment in accordance with the tenets of the Declaration of Helsinki.

#### **3. Results**

#### *3.1. Downregulation of Serum GSN in Patients with Clinically Active UC*

We first conducted proteomic analysis of the specimens from patients with UC in the active phase and in remission. The specimens were collected from the anterior and right walls of the rectal mucosa using brush samples. We identified 460 proteins with a score of ≥30 from the brush samples in patients with active UC (Table S1). An inflammatory protein marker (protein S100-A9) and a neutrophil-derived protein (myeloperoxidase) presented high scores. We compared protein expression in patients with UC in remission and in the active phase. Consistent with previous study results, Mucin-5B and Mucin-13 were downregulated in active UC compared with those in remission UC [25,26]. In addition, we found that GSN was downregulated in patients with active UC compared with that in patients with remission UC (Table 2).


**Table 2.** List of genes downregulated in active UC compared with those in remission UC.

We then compared the serum GSN level between patients with UC and healthy controls. We analyzed samples from 138 patients (54 males and 84 females) whose median age was 47 (20–82) years. Of all patients, 68.1% had extensive colitis, 26.1% had left-sided colitis, and 5.8% had proctitis. The proportion of clinically and endoscopically active patients was 50.7% (70/138) and 63.8% (88/138), respectively. The serum GSN level was lower in patients with UC than in the healthy controls (138 patients with UC and 16 healthy controls, *p* < 0.001, Figure 1a). In addition, the serum GSN level was significantly lower in clinically active patients with UC than in those in remission (138 patients with UC, *p* < 0.001, Figure 1b). The correlation between the GSN levels and Mayo scores was determined using Pearson coefficients, and a significant correlation was found (r = −0.70229, *p* < 0.001) (Figure S1).
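For reference, the Pearson coefficient used here is the covariance of the two variables divided by the product of their standard deviations. The sketch below computes it from first principles; the paired values are hypothetical and serve only to show that a marker that falls as the score rises yields a negative coefficient.

```python
from math import sqrt

def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired observations (not study data).
scores = [0.0, 2.0, 4.0, 6.0, 8.0]
marker = [14.0, 12.0, 11.0, 8.0, 6.0]

r = pearson_r(scores, marker)  # strongly negative for these values
```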

**Figure 1.** Serum gelsolin (GSN) level decreased in clinically active patients with ulcerative colitis (UC). Serum GSN level was measured in (**a**) 138 patients with UC and 16 healthy subjects (control); (**b**) 68 patients with UC in clinical remission and 70 patients with clinically active UC; and (**c**) 56 patients with UC in clinical remission and 26 patients with clinically active UC and normal C-reactive protein (CRP) level (<0.14 mg/dL). Statistical significance was defined as *p* < 0.05 (\*\*\* *p* < 0.001) using Mann–Whitney U-test.

The expression of CRP, which is used as a serum biomarker for UC, is induced by IL-6, but the expression of GSN is downregulated by a mechanism different from that of CRP. Because the sensitivity of CRP for disease activity is low, we tested whether GSN is effective for patients in whom activity was difficult to assess with CRP. Among the 82 patients with UC whose CRP level was normal (<0.14 mg/dL), the GSN level was significantly lower in patients with clinically active disease than in those in the remission phase (82 patients with UC, *p* < 0.001, Figure 1c). These findings indicate that the GSN level correlates with clinical activity, even in cases with a normal CRP level.

#### *3.2. Inverse Relationship between the Serum GSN Level and Endoscopic Activity in Patients with UC*

We analyzed whether the serum GSN level is associated with endoscopic activity in patients with UC. A decreased GSN level correlated with an increased endoscopic activity score (Mayo 0 vs. 1, *p* = 0.999; Mayo 0 vs. 2, *p* = 0.0113; Mayo 0 vs. 3, *p* < 0.01; Mayo 1 vs. 2, *p* = 0.0549; Mayo 1 vs. 3, *p* < 0.001; and Mayo 2 vs. 3, *p* < 0.001). Patients with MES 2 had a lower GSN level than those with MES 0 (54 patients (Mayo = 2), 50 patients (Mayo = 0), *p* = 0.0113), possibly reflecting minor mucosal changes (Figure 2a).

**Figure 2.** GSN level correlates with the endoscopic activity score in patients with UC. (**a**) Serum GSN level in patients with UC categorized according to disease activity (MES 0 (*n* = 50), 1 (*n* = 26), 2 (*n* = 54), and 3 (*n* = 8)). (**b**) Serum GSN level in 50 patients with UC in endoscopic remission (Mayo endoscopic score (MES) = 0) and 88 patients with endoscopically active UC (MES > 0). (**c**) Serum GSN level was measured in 34 patients with UC in endoscopic remission and 48 patients with endoscopically active UC and normal CRP level (CRP < 0.14 mg/dL). Statistical significance was defined as *p* < 0.05 (\* *p* < 0.05; \*\* *p* < 0.01; \*\*\* *p* < 0.001; and N.S., not significant) using Mann–Whitney U-test and Kruskal–Wallis test.

Recently, mucosal healing has been reported to reduce operative and relapse rates, and therapeutic goals have shifted from symptom relief to mucosal healing [3–5]. The detection of mucosal healing is important when using a serum biomarker that correlates with UC activity. We defined mucosal healing using the Mayo endoscopic score of 0 and tested whether mucosal healing could be detected by the GSN level.

The GSN level was lower in patients with endoscopically active UC (Mayo endoscopic score (MES) > 0) than in those with endoscopic remission (MES = 0) (*p* < 0.001, Figure 2b). The correlation between the GSN levels and Mayo endoscopic scores was measured using Pearson coefficients, and a significant correlation was found (r = −0.7585, *p* < 0.01) (Figure S2). The Pearson coefficient was also used to determine the correlation between the GSN and CRP levels, and it was found that they had a low correlation (r = −0.287, *p* = 0.006) (Figure S3A).

The correlation between the GSN and albumin levels was also measured using Pearson coefficients, and it was found that they had a low correlation (r = 0.44755, *p* < 0.001) (Figure S3B).

In addition, we tested whether the GSN level could detect mucosal healing in cases with a normal CRP level. Among the 82 patients with UC whose CRP level was within the normal range (<0.14 mg/dL), the GSN level was significantly lower in patients in the endoscopically active phase than in those in the remission phase (*p* < 0.001, Figure 2c). These findings indicate that the GSN level could detect clinical and endoscopic activities in UC patients with high sensitivity. Furthermore, even in patients with a normal CRP level, it correlated with clinical and endoscopic activities, making GSN useful for patients whose activity is difficult to assess with conventional blood tests.

#### *3.3. GSN as a Serological Biomarker of Clinically and Endoscopically Active UC*

Given that the GSN level correlated with clinical and endoscopic activities in patients with UC, we next investigated its diagnostic potential to detect clinical remission and mucosal healing in order to use it as a serum biomarker.

We compared the sensitivity and specificity of GSN with those of CRP using ROC curve and AUC analyses. The sensitivity and specificity of GSN were 91.43% and 89.71%, respectively, for the detection of clinical remission at a cut-off of 10.67 μg/mL (Figure 3a). The AUC of GSN was 0.874 and that of CRP was 0.78. For the detection of endoscopic remission, the sensitivity and specificity of GSN were 78.41% and 86.54%, respectively (Figure 3c), whereas those of CRP were 56.82% and 82.00%, respectively (Figure 3d). The AUC of GSN was 0.835, and that of CRP was 0.692. The AUC of GSN was higher than that of CRP for identifying both clinical and endoscopic remission (Figure 3a–d). These data suggest that GSN is a biomarker that reflects clinical and endoscopic activities and that it can detect mucosal healing.
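The way sensitivity and specificity follow from a single cut-off can be sketched as below. The marker values are hypothetical, and the decision rule (a value below the cut-off is classified as active disease) is an assumption made for illustration; the text does not spell out which group was treated as the positive class.

```python
def sensitivity_specificity(active: list[float], remission: list[float],
                            cutoff: float) -> tuple[float, float]:
    """Classify 'value < cutoff' as active disease and score the rule.

    Sensitivity is the fraction of active cases below the cut-off;
    specificity is the fraction of remission cases at or above it.
    """
    true_positives = sum(v < cutoff for v in active)
    true_negatives = sum(v >= cutoff for v in remission)
    return true_positives / len(active), true_negatives / len(remission)

# Hypothetical marker levels in μg/mL (not study data).
active_values = [6.0, 8.5, 9.9, 10.2, 11.5]
remission_values = [10.0, 11.0, 12.5, 13.0]

sens, spec = sensitivity_specificity(active_values, remission_values, cutoff=10.67)
```

With these values, four of five active cases fall below the cut-off and three of four remission cases lie at or above it, giving a sensitivity of 0.8 and a specificity of 0.75.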

To determine whether GSN can be used as a biomarker for assessing the clinical and endoscopic activities of UC, we analyzed its sensitivity and specificity using the ROC curve and AUC analyses and compared the results of GSN and CRP, an existing UC marker. The AUC of GSN was higher than that of CRP for identifying both clinical and endoscopic remission (Figure 3a–d).

**Figure 3.** GSN level reflects clinical and endoscopic activities in patients with UC. Receiver operating characteristic (ROC) curves for GSN and CRP indicating their sensitivity and specificity in discriminating (**a**,**b**) clinical remission and (**c**,**d**) endoscopic remission.

#### **4. Discussion**

In this study, we showed that the GSN level correlates with the clinical and endoscopic activities of UC. GSN also showed high sensitivity and specificity in predicting the achievement of mucosal healing in patients with UC.

Currently, CRP and LRG are used as blood-based biomarkers to evaluate the activity of IBD. CRP expression is induced by IL-6 and is used to evaluate various inflammatory diseases [27]. CRP is a useful marker for the diagnosis of IBD, evaluation of disease activity, and prediction of therapeutic efficacy [8,9,28]. However, CRP correlates mostly with severe inflammation and does not reflect mild inflammation [29].

We performed proteomic analysis of the specimens from patients with UC in the active phase and remission phase. We identified 460 proteins in patients with active UC; as in previous studies, inflammatory markers such as S100-A9 and myeloperoxidase were detected. In addition, IgG-Fc, which is required for the stabilization of Mucin-2, was also detected. Among them, we focused on GSN, the expression of which was downregulated in patients with active UC compared with that in patients with remission UC.

The GSN level decreases with endoscopic activity, and patients with MES 2 have significantly lower GSN level than patients with MES 0, suggesting that GSN may also reflect mild intestinal inflammation. In addition, the fact that the GSN level reflects clinical and endoscopic activities even in a group of patients with normal CRP levels suggests that the GSN level may be useful for patients whose activity has been difficult to assess with conventional biomarkers. Furthermore, in this study, we demonstrated that the GSN level reflects clinical and endoscopic remission with higher sensitivity and specificity than CRP. We believe that GSN can help detect IL-6-independent inflammation and mucosal healing because it reflects even mild inflammation. Moreover, it can be used to assess endoscopic activity even in CRP-negative cases and could be a new biomarker with an underlying mechanism of action that is different from that of CRP. LRG, a newly identified serum marker for UC, has also been reported to correlate with clinical and endoscopic activities of the disease [10,12]. In the future, it will be necessary to compare the sensitivity and specificity of LRG and GSN, and utilize them according to disease activity and stage. Furthermore, it has been suggested that the measurement of both LRG and GSN may allow more accurate assessment of disease activity and predict mucosal healing with higher sensitivity and specificity.

Mucosal healing was previously defined as MES 0 and MES 1; however, as patients with MES 1 have a higher relapse rate than those with MES 0 [4,30], several studies have considered only MES 0 to reflect mucosal healing. Therefore, mucosal healing was defined as MES 0 in this study. As the operation and relapse rates are low in patients who have achieved mucosal healing, mucosal healing has become the therapeutic goal in UC. GSN presented higher sensitivity and specificity than CRP in detecting mucosal healing with a cut-off of 10.67 μg/mL. UC is a chronic inflammatory disease with recurrent remissions and relapses, and optimization of treatment based on more accurate assessment of disease activity is needed. Optimal therapeutic options may improve the prognosis of patients by enabling long-term maintenance of mucosal healing. Therefore, using GSN as a biomarker will enable accurate assessment of mucosal healing and treatment optimization.

GSN is a multifunctional protein with altered blood levels in chronic inflammatory and autoimmune disorders such as rheumatoid arthritis [17,31], ankylosing spondylitis [32], systemic lupus erythematosus [33], and Henoch–Schoenlein purpura [34]. However, there have been no reports on the association between GSN level and disease activity in patients with IBD. In the gastrointestinal tract, GSN, along with the GSN superfamily protein villin-1, regulates actin dynamics, intestinal epithelial cell death, and intestinal inflammation [35], but its function in IBD is unknown, and the mechanism of its decreased expression in the intestinal tissues and blood requires further analysis. GSN binds to lipopolysaccharides (LPS), a bacterial cell wall component, and inhibits the activation of Toll-like receptors on the surface of innate immune system cells, such as macrophages and dendritic cells. In IBD, the intestinal epithelial barrier, including mucus production and tight junction formation, is disrupted, and this disruption induces bacterial translocation of LPS from the intestinal tract into the bloodstream. The progressive disruption of the intestinal epithelial barrier mechanism associated with inflammation in IBD may induce an increase in the blood levels of LPS and decrease the level of GSN. In addition, GSN has been reported to be associated with multiple immune cell functions, such as neutrophil migration, suggesting that the abnormal activation of immune cells associated with chronic inflammation may be related to the mechanism of decreased GSN level. The proteins that we identified using proteomic analysis included calprotectin, which is currently used as a stool marker, and could have comprised proteins that can be used as blood- or stool-based markers of disease activity.

Our study had some limitations. For instance, it was a retrospective, single-center study with a small number of patients with heterogeneous backgrounds and treatments. Future studies should involve the recruitment of a prospective cohort to ascertain whether GSN reflects endoscopic activity and mucosal healing in patients with UC. As mentioned earlier, GSN is affected by other inflammatory and autoimmune disorders, and therefore, it may not be useful when other inflammations are involved. Prospective correlations between the GSN level and clinical and endoscopic activities should be carefully examined for the presence of intestinal and other infections or other autoimmune complications.

Nevertheless, we believe that GSN has the potential to be developed into a biomarker to assess UC disease activity and mucosal healing, and can contribute to the realization of treatment targets aimed at achieving mucosal healing. Our findings may lead to a reduction in the number of endoscopic procedures that are needed to assess UC disease activity, reducing patient stress and medical costs. Furthermore, non-invasive markers for disease activity will enable us to accurately assess UC disease activity and adjust treatments appropriately, as well as enable the use of treat-to-target approaches to achieve mucosal healing.

#### **5. Patents**

This work has been submitted for a patent application.

**Supplementary Materials:** The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/biomedicines10040872/s1, Table S1: Proteins detected in colonic samples from active UC patients; Figure S1: Correlation between the GSN level and Mayo score determined using Pearson coefficients; Figure S2: Correlation between the GSN level and Mayo endoscopic score determined using Pearson coefficients; Figure S3: GSN level and C-reactive protein or albumin had a low correlation.

**Author Contributions:** Conceptualization, K.M. and M.N.; methodology, K.M.; formal analysis, K.M. and A.O.; investigation, T.Y., T.S. and S.I.; resources, K.M.; data curation, K.M. and E.I.; writing original draft preparation, K.M.; writing—review and editing, N.K., K.F., T.I. (Takuya Ishikawa), Y.M., T.I. (Tadashi Iida) and E.O.; supervision, M.I., T.H. and H.K.; project administration, K.M.; funding acquisition, K.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by a Grant-in-Aid for Young Scientists (to K.M.), grant numbers 19K20667 and 21K15920.

**Institutional Review Board Statement:** The study was conducted in accordance with the tenets of the Declaration of Helsinki and approved by the Ethics Committee of the Nagoya University Hospital, Japan (protocol code 04667146; date of approval: 1 April 2016).

**Informed Consent Statement:** Written informed consent was obtained from all participants.

**Data Availability Statement:** The original data and the materials generated in this study are available from the corresponding author on request.

**Acknowledgments:** We thank members of the Department of Gastroenterology at Nagoya University Graduate School of Medicine for careful reading of the manuscript and for helpful discussions.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

#### **References**


### *Article* **Periodontal Disease Augments Cardiovascular Disease Risk Biomarkers in Rheumatoid Arthritis**

**Jeneen Panezai 1,2,3,\*, Ambereen Ghaffar 4, Mohammad Altamash 5, Mikael Åberg 6, Thomas E. Van Dyke 3,7, Anders Larsson 6 and Per-Erik Engström 1**


**Abstract:** Objectives: Periodontal disease (PD) and rheumatoid arthritis (RA) are known chronic conditions with sustained inflammation leading to osteolysis. Cardiovascular diseases (CVD) are frequent comorbidities that may arise from sustained inflammation associated with both PD and RA. In order to determine CVD risk, alterations at the molecular level need to be identified. The objective of this study, therefore, was to assess the relationship of CVD-associated biomarkers in RA patients and how it is influenced by PD. Methods: The study consisted of patient (26 RA with PD, 21 RA without PD, 51 patients with PD only) and systemically and periodontally healthy control (*n* = 20) groups. The periodontal parameters bleeding on probing, probing pocket depth, and marginal bone loss were determined to characterize the patient groups. Proteomic analysis of 92 CVD-related protein biomarkers was performed using a multiplex proximity extension assay. Biomarkers were clustered using the search tool for retrieval of interacting genes (STRING) to determine protein–protein interaction (PPI) networks. Results: RA patients with PD had higher detection levels for 47% of the measured markers (ANGPT1, BOC, CCL17, CCL3, CD4, CD84, CTRC, FGF-21, FGF-23, GLO1, HAOX1, HB-EGF, hOSCAR, HSP 27, IL16, IL-17D, IL18, IL-27, IL6, LEP, LPL, MERTK, MMP12, MMP7, NEMO, PAPPA, PAR-1, PARP-1, PD-L2, PGF, PIgR, PRELP, RAGE, SCF, SLAMF7, SRC, THBS2, THPO, TNFRSF13B, TRAIL-R2, VEGFD, VSIG2, and XCL1) as compared to RA without PD. Furthermore, a strong biological network was identified amongst these proteins (clustering coefficient = 0.52, PPI enrichment *p*-value < 0.0001). Coefficients for protein clusters involved in CVD (0.59), metabolic (0.53), and skeletal (0.51) diseases were strongest in the PD group.
Conclusion: Periodontal disease augments CVD-related biomarkers in RA through shared pathological clusters, concurrently enhancing metabolic and skeletal disease protein interactions, independent of autoimmune status.

**Keywords:** inflammation; proteins; proteomics; rheumatoid arthritis; periodontal disease; cardiovascular disease

**Citation:** Panezai, J.; Ghaffar, A.; Altamash, M.; Åberg, M.; Van Dyke, T.E.; Larsson, A.; Engström, P.-E. Periodontal Disease Augments Cardiovascular Disease Risk Biomarkers in Rheumatoid Arthritis. *Biomedicines* **2022**, *10*, 714. https://doi.org/10.3390/biomedicines10030714

Academic Editor: Marianna Christodoulou

Received: 15 February 2022 Accepted: 16 March 2022 Published: 19 March 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

Chronic inflammation stems from persistent acute inflammation due to the failure to resolve the acute phase, often associated with the inability to remove the inducing agent or stimulus. Several diseases that acquire such chronicity due to a dysregulated immune response include atherosclerosis, type 2 diabetes, rheumatoid arthritis (RA), and periodontal disease (PD) [1].

Cardiovascular diseases (CVD) are the leading cause of global mortality with over 75% of cases in low- and middle-income countries [2]. CVD comprises a group of disorders that involve the heart muscle and blood vessels. The most common pathogenic pathway that leads to CVD is atherosclerosis [3]. Risk factors such as smoking, diabetes, hypertension, and obesity are transduced into atherosclerotic events via complex interactions between endothelial adhesion molecules and inflammatory cells including macrophages and T lymphocytes. The inflammatory response also has an autoimmune component as low-density lipoprotein (LDL) cholesterol, one of the retained lipids in atherosclerotic plaques, is antigenic, resulting in production of high-affinity antibodies [4].

PD is an independent risk factor for the development of CVD [5]. Systematic reviews have shown a consistent association between CVD and PD which may be partially attributed to shared risk factors and the dissemination of periodontal pathogens into the bloodstream or an increase in systemic inflammation [6].

RA is an autoimmune disease characterized by synovial inflammation and destruction due to immune-mediated inflammation. This sustained inflammation in RA promotes cardiovascular pathology to such an extent that CVD remains the leading cause of mortality in RA patients [7]. The overall increased CVD risk in RA has been attributed less to traditional CVD risk factors and more to underlying autoimmunity and inflammatory burden [8].

The combined influence of PD and RA may amplify the harmful effects of inflammation and raise an individual's risk of developing CVD even further. This can be evaluated through exploration of emergent biomarkers involved in CVD initiation and pathology; therefore, the aim of this study was to assess CVD-related biomarkers in RA patients and how they are influenced by PD.

#### **2. Materials and Methods**

The study was performed at the Department of Periodontology, the Altamash Institute of Dental Medicine, between October 2012 and August 2017 in Karachi, Pakistan. Upon obtaining informed consent, a detailed questionnaire was used to acquire information pertaining to medical and dental history. The minimum sample size was calculated using the Epitools Epidemiological Calculators [9] with the assumptions of a power of 80% and a confidence interval (CI) of 95%. The parameters reported in the literature pertaining to the South Asian population were used [10]: (1) frequency of 60% for PD among RA patients and (2) frequency of 28% for PD without RA.
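A sample size of this kind can be approximated with the standard normal-approximation formula for comparing two independent proportions. The sketch below is an assumption-laden illustration rather than the Epitools calculation itself: it interprets the 95% confidence level as a two-sided α of 0.05, uses equal group sizes and no continuity correction, and the function name is hypothetical. The text does not report the resulting minimum sample size.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Minimum size of each group for detecting a difference between two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)

# Frequencies quoted in the text: 60% (PD among RA patients) vs. 28% (PD without RA).
n_required = n_per_group(0.60, 0.28)
```

Under these assumptions the formula gives 37 participants per group; a calculator that applies a continuity correction or unequal allocation would return a somewhat different figure.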

#### *2.1. Rheumatoid Arthritis Group*

A total of 47 RA patients were recruited via consecutive sampling from the Department of Rheumatology at the Habib Medical Centre in Karachi, Pakistan. These patients were established RA cases diagnosed by a rheumatologist (AG) using current ACR/EULAR classification criteria [11]. All patients were receiving disease-modifying anti-rheumatic drugs (DMARDs), corticosteroids, non-steroidal anti-inflammatory drugs (NSAIDs), or a biologic DMARD (Rituximab) at the time of the examination. Based on their periodontal status, the patients were divided into two groups: RA with PD (*n* = 26) and RA without PD (*n* = 21).

#### *2.2. Periodontal Disease Group*

A group of 51 participants diagnosed with PD, but exhibiting no signs of RA, gout, or osteoarthritis were also included. Individuals with a history of treatment for PD during the last six months and/or treatment with antibiotics in the last three months were excluded.

We used twenty controls for comparison and better characterization of the groups. All controls had clinically healthy periodontium and no systemic disease. Blood samples were drawn from all participants and prepared sera were stored at −80 °C until the time of analyses.

#### *2.3. Periodontal Examination*

Periodontal examination was carried out for all teeth except for the third molars by a single examiner (JP). The defining criterion for PD was a probing pocket depth (PPD) of ≥5 mm (to the nearest millimeter) in at least three different sites, measured using a periodontal probe (Hu-Friedy Manufacturing, Chicago, IL, USA). Pockets measuring ≥5 mm were added to calculate the sum of deep pockets representing PD-affected sites. Our group has designed continuous periodontal indices to gauge the severity of PD as a continuous rather than a dichotomous variable. These parameters were used to assess the severity of PD:


Details of the parameters and their measurements are described in our previous publications [12,13].
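The case definition given above (PPD of ≥5 mm in at least three sites) can be written as a short rule. The sketch below uses hypothetical site measurements and function names, and it interprets the "sum of deep pockets" as the number of sites meeting the threshold, which is an assumption; the exact index definitions are given in the cited publications.

```python
DEEP_POCKET_MM = 5    # threshold for a deep pocket, from the text
MIN_DEEP_SITES = 3    # minimum number of deep sites defining PD, from the text

def count_deep_pockets(ppd_mm: list[int]) -> int:
    """Number of probed sites with a pocket depth at or above the threshold."""
    return sum(depth >= DEEP_POCKET_MM for depth in ppd_mm)

def has_periodontal_disease(ppd_mm: list[int]) -> bool:
    """Apply the dichotomous case definition to one participant's site measurements."""
    return count_deep_pockets(ppd_mm) >= MIN_DEEP_SITES

# Hypothetical probing depths (mm, rounded to the nearest millimeter).
participant_a = [2, 3, 5, 6, 4, 7, 3]
participant_b = [2, 3, 5, 4, 3, 2]
```

Here `participant_a` has three deep sites and meets the definition, whereas `participant_b` has only one and does not.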

#### *2.4. Anthropometric Measures*

Body weight was measured to the nearest kg. Using non-stretchable measuring tape, height and waist were measured to the nearest cm. Waist was measured in the horizontal plane at the midpoint between the lowest rib and the iliac crest. Body mass index (BMI) was calculated from weight and height measurements (kg/m²). Anthropometric measurements were recorded for all four groups.
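As a brief worked example of the BMI calculation, the sketch below converts height from centimeters to meters before dividing. The function name and the example values are hypothetical.

```python
def body_mass_index(weight_kg: float, height_cm: float) -> float:
    """BMI in kg/m², from weight in kilograms and height in centimeters."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2

example_bmi = body_mass_index(weight_kg=70, height_cm=175)  # about 22.9 kg/m²
```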

#### *2.5. Glycated Hemoglobin (HbA1c)*

Glycated hemoglobin levels were determined for all four groups after collecting four milliliters of whole blood into spray-coated EDTA tubes (lavender top, Becton, Dickinson, Franklin Lakes, NJ, USA). The samples were analyzed on the same day using the ion-exchange high-performance liquid chromatography system Bio-Rad D-10 Hemoglobin Testing System (Bio-Rad Laboratories, Hercules, CA, USA). The HbA1c values are standardized according to the National Glycohemoglobin Standardization Program (NGSP) system [14].

#### *2.6. Proteomic Profiling*

Serum samples were analyzed at SciLifeLab Affinity Proteomics Uppsala, Uppsala University (Uppsala, Sweden) using proximity extension assay (PEA) technology (Olink Proteomics, Uppsala, Sweden). Levels of 92 proteins from the Olink Target96 CVD II panel were measured (Supplementary File Table S1). The PEA technology utilizes pairs of antibodies equipped with DNA reporter molecules [15]. When binding in close proximity to their correct targets, the antibody pairs give rise to new DNA amplicons, each ID-barcoding its respective antigen. The amplicons are subsequently quantified using the Fluidigm BioMark™ HD real-time PCR platform (South San Francisco, CA, USA). Olink NPX Manager Software was used for data analysis and quality control, and the inter-plate variability was adjusted by intensity normalization. The final protein values are expressed as Normalized Protein eXpression (NPX) values, which are on a log2 scale, so that a one-unit increase in NPX represents a doubling of the measured protein concentration. Data quality was controlled and normalized using an internal and an inter-plate control. Assay validation data for all proteins from the panel are available (www.olink.com).
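Because NPX values are on a log2 scale, a difference in NPX between two samples translates into a relative (fold) difference in concentration. The short sketch below illustrates that relationship only; the function name and the example values are hypothetical and are not part of the Olink software.

```python
def npx_fold_change(npx_a: float, npx_b: float) -> float:
    """Relative concentration of sample A over sample B from log2-scale NPX values."""
    return 2.0 ** (npx_a - npx_b)


# Hypothetical NPX values (not study data)
print(npx_fold_change(6.0, 5.0))  # 2.0: one NPX unit higher is a doubling
print(npx_fold_change(5.0, 7.0))  # 0.25: two NPX units lower is a quarter
```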

#### *2.7. Protein–Protein Interaction (PPI) Network Analysis*

The protein–protein interaction (PPI) network analysis was performed using the Search Tool for the Retrieval of Interacting Genes (STRING) (https://string-db.org, accessed on 21 February 2021). The interaction evidence in the STRING database is thematically grouped into 'channels' (such as text mining, co-expression, and laboratory experiments), and the analysis was limited to *Homo sapiens*. An interaction score > 0.4 was applied to construct the PPI networks. STRING performed identifier mapping, tested the proteins of each known pathway for any nonrandom skew within the user-provided input values, reported statistically significant pathways, and displayed a network with all of the mapped proteins and their interconnections [16]. In the networks, the nodes correspond to the proteins and the edges represent the interactions. The clustering coefficient, where 0 represents the absence of connections and 1 a fully connected network, was calculated to quantify the abundance of connected nodes in a PPI network. The PPI enrichment *p*-value is used to indicate that the nodes are not random and that the observed number of edges is significant.
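To make the clustering coefficient concrete, the sketch below computes the average local clustering coefficient of a small undirected network. It assumes this averaged local definition is the one of interest; the node labels and edges are a hypothetical toy network, not STRING output.

```python
from itertools import combinations


def average_clustering(edges):
    """Average local clustering coefficient of an undirected graph given as edge pairs."""
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    if not neighbors:
        return 0.0
    total = 0.0
    for nbrs in neighbors.values():
        k = len(nbrs)
        if k < 2:
            continue  # a node with fewer than two neighbors contributes 0
        # count the edges that exist among this node's neighbors
        links = sum(1 for u, v in combinations(nbrs, 2) if v in neighbors[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(neighbors)


# Hypothetical toy network: a triangle (P1, P2, P3) with one pendant node (P4)
toy_edges = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"), ("P3", "P4")]
print(round(average_clustering(toy_edges), 3))  # 0.583
```

A fully connected triangle scores 1 and a star-shaped network scores 0, which matches the 0 to 1 interpretation given in the text.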

#### *2.8. Statistical Analyses*

All analyses were performed using GraphPad Prism version 9.0 for Windows (GraphPad Software, San Diego, CA, USA). Patient characteristics were analyzed using one-way ANOVA or Kruskal–Wallis tests, depending on the normality of the data, to identify group-wise differences. Inter-group differences in biomarker distributions were analyzed using the Mann–Whitney U test. The relationship between each of the 92 protein biomarkers and periodontal parameters was assessed using Spearman correlation analyses. To control the false discovery rate (FDR), the Benjamini–Hochberg procedure was applied to adjust *p*-values from multiple testing [17]. The significance level was defined at *p* ≤ 0.05.
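The Benjamini–Hochberg adjustment mentioned above can be sketched in a few lines. This is a generic illustration of the procedure, not a reproduction of the GraphPad Prism implementation; the function name and the example *p*-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so adjusted values never decrease with rank
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values (not study data)
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in adjusted]  # the study's threshold of p <= 0.05
```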

For exploration of biomarker patterns within disease groups, principal component analysis (PCA) was performed. PCA is an exploratory statistical technique used for data exploration and simplification. The technique is based on generating principal components (latent variables) from the original dataset. The relationship of the principal components to the samples is referred to as 'scores', and that to the variables is called 'loadings'. A threshold of 0.5 was deemed significant for variable loadings.
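As an illustration of how loadings relate variables to a component, the sketch below computes the loadings on the first principal component from standardized data, using power iteration on the correlation matrix. It is a simplified stand-in for the statistical software used in the study; the function names, the starting vector, the use of absolute values for the 0.5 threshold, and the toy data are all assumptions, and the code expects no constant columns.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length, non-constant columns."""
    n = len(columns[0])
    standardized = []
    for col in columns:
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (n - 1))
        standardized.append([(x - mean) / sd for x in col])
    p = len(columns)
    return [
        [sum(a * b for a, b in zip(standardized[i], standardized[j])) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]


def pc1_loadings(columns, iterations=200):
    """Loadings (variable-to-PC1 correlations) via power iteration."""
    corr = correlation_matrix(columns)
    p = len(corr)
    vec = [1.0 / (i + 1) for i in range(p)]  # arbitrary non-uniform start (assumption)
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    eigenvalue = sum(
        vec[i] * sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)
    )
    # For standardized data, loading = eigenvector entry * sqrt(eigenvalue)
    loadings = [v * math.sqrt(eigenvalue) for v in vec]
    if sum(loadings) < 0:  # the sign of a component is arbitrary; fix a convention
        loadings = [-x for x in loadings]
    return loadings


def high_loading_indices(loadings, threshold=0.5):
    """Indices of variables whose absolute loading exceeds the threshold."""
    return [i for i, value in enumerate(loadings) if abs(value) > threshold]


# Hypothetical toy data: three perfectly (anti-)correlated variables
toy = [[1, 2, 3, 4], [3, 5, 7, 9], [4, 3, 2, 1]]
toy_loadings = pc1_loadings(toy)
```

In this toy case all three variables load on PC1 with an absolute value of about 1, so all of them pass the 0.5 threshold.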

#### **3. Results**

#### *3.1. Characteristics of Study Groups*

The characteristics of the disease groups (RA with PD, RA without PD, and PD only) and controls are shown in Table 1. There were no differences in age amongst the four groups. The number of females was higher in the disease groups than in controls. The clinical status comprised self-reported hypertension and diabetes confirmed by medication and prescription. The frequency of both conditions was similar amongst the disease groups. Periodontal parameters, waist circumference, and HbA1c differed significantly amongst the groups, with the highest medians in PD patients. The median HbA1c value in the PD group places the group, overall, in the pre-diabetes range (Supplementary File Table S3).



BOP = bleeding on probing, PPD = probing pocket depth, MBL = marginal bone loss, HbA1c = glycated hemoglobin. <sup>a</sup> Differences in means were tested using one-way ANOVA test (testing overall difference among the three groups). <sup>b</sup> Differences in frequency were tested using χ<sup>2</sup> (chi-squared) test (testing overall difference among the three groups). <sup>c</sup> Differences in medians were tested using Kruskal–Wallis test (testing overall difference among the three groups). <sup>d</sup> Missing data (*n* = 5) was excluded in the analyses. <sup>e</sup> Missing data (*n* = 1) was excluded in the analyses. <sup>f</sup> Missing data (*n* = 1) was excluded in the analyses.

#### *3.2. Group-Wise Biomarker Distribution*

The distribution of 92 CVD biomarkers was assessed among the four groups. Two samples from the PD group were excluded due to unacceptable technical variations. Biomarkers with significantly increased levels in the RA with PD group as compared to the RA without PD group are shown in Figure 1. Higher NPX values were noted for 47% (43/92) of the markers: ANGPT1, BOC, CCL17, CCL3, CD4, CD84, CTRC, FGF-21, FGF-23, GLO1, HAOX1, HB-EGF, hOSCAR, HSP 27, IL-16, IL-17D, IL-18, IL-27, IL-6, LEP, LPL, MERTK, MMP12, MMP7, NEMO, PAPPA, PAR-1, PARP-1, PD-L2, PGF, PIgR, PRELP, RAGE, SCF, SLAMF7, SRC, THBS2, THPO, TNFRSF13B, TRAIL-R2, VEGFD, VSIG2, and XCL1. For 32 of these biomarkers (BOC, CCL17, CCL3, CD84, CTRC, FGF-21, GLO1, HAOX1, HB-EGF, hOSCAR, HSP 27, IL-16, IL-17D, IL-18, IL-27, IL-6, LEP, LPL, MERTK, MMP12, NEMO, PAR-1, PARP-1, PD-L2, PRELP, RAGE, SCF, SLAMF7, THBS2, TNFRSF13B, TRAIL-R2, and XCL1), the PD and RA with PD groups exhibited no differences (Supplementary File Table S2).

**Figure 1.** Group-wise analyses for CVD-related biomarkers. Graphs 1–43 show higher detection levels in RA with PD as compared to RA without PD. (1) ANGPT1, (2) BOC, (3) CCL17, (4) CCL3, (5) CD4, (6) CD84, (7) CTRC, (8) FGF-21, (9) FGF-23, (10) GLO1, (11) HAOX1, (12) HB-EGF, (13) hOSCAR, (14) HSP 27, (15) IL-16, (16) IL-17D, (17) IL-18, (18) IL-27, (19) IL-6, (20) LEP, (21) LPL, (22) MERTK, (23) MMP7, (24) MMP12, (25) NEMO, (26) PAPPA, (27) PAR-1, (28) PARP-1, (29) PD-L2, (30) PGF, (31) PIgR, (32) PRELP, (33) RAGE, (34) SCF, (35) SLAMF7, (36) SRC, (37) THBS2, (38) THPO, (39) TNFRSF13B, (40) TRAIL-R2, (41) VEGFD, (42) VSIG2, and (43) XCL1. Data are presented as median with interquartile range. Group differences were calculated using the Mann–Whitney U test. \* *p* value ≤ 0.05, \*\* *p* value < 0.01, \*\*\* *p* value < 0.001, \*\*\*\* *p* value < 0.0001.

#### *3.3. Correlation of CVD Biomarkers with Periodontal Parameters*

The correlations between periodontal parameters and CVD-related biomarkers are shown in Table 2. The highest frequency of significant correlations was seen in the PD group for all parameters except for adjusted MBL. The anti-inflammatory marker IL-4RA was inversely related with three out of five indices for inflammation and pocketing. The proto-oncogene tyrosine-protein kinase Src (SRC) was inversely correlated with four out of five indices for inflammation and pocketing. All correlations were direct amongst the RA with PD group, except for ACE-2. The most frequent and moderately strong correlations were noted with adjusted MBL. The least frequent correlations were noted in the RA without PD group. Dickkopf-related protein 1 (Dkk-1) and thrombospondin 2 (THBS2) were directly associated with Adjusted PPD Total. There was no overlap between the associated biomarkers amongst the three groups.


**Table 2.** Correlations of CVD risk biomarkers with periodontal pocketing and marginal bone loss.

Spearman rank correlation was used to identify correlations. All coefficients shown are for biomarkers with adjusted *p*-values ≤ 0.05 after using the Benjamini–Hochberg procedure for multiple testing.

#### *3.4. PCA*

PCA was performed using standardized data and PC selection via parallel analysis. In the initial PCA output, loadings > 0.5 on the selected component PC1 are shown for all disease groups (Table 3). The individual values show the correlation between the specific biomarker and PC1, the component for which the loading is calculated. For the RA with PD group, 64 biomarkers exceeded the 0.5 loading threshold. Similarly, the RA without PD group had 65 biomarkers exceeding the threshold, whereas the PD group had 55.

**Table 3.** PC loadings for disease groups.



PC = principal component. All loadings > 0.5 are in bold. The variance represented by two principal components in proportion and cumulatively are shown as percentages.


Visual representation of PC loadings plot (Figure 2) shows how the biomarkers are clustered closely together. In the disease groups, the majority of the biomarkers not only correlated strongly with each other but also with PC1 as most of the values were close to 1. The clustering pattern was more similar between the PD and RA without PD groups as some biomarkers showed stronger correlation with PC2. Using controls for reference, the loading plot showed weaker correlations between the biomarkers themselves and PC 1 and 2. The PC score plots reveal the variation in the dimensionality of the four groups.


**Figure 2.** Principal component analysis (PCA) showing loadings (left side) and scores (right side) for RA with PD (panel **A**), PD (panel **B**), RA without PD (panel **C**), and controls (panel **D**).

#### *3.5. Protein–Protein Interaction Network*

The protein–protein interaction (PPI) network analysis of 43 proteins discriminating RA with PD from RA without PD is shown in Figure 3. The potential interactions between ANGPT1, BOC, CCL17, CCL3, CD4, CD84, FGF-21, FGF-23, HB-EGF, hOSCAR, HSP 27, IL-16, IL-17D, IL-18, IL-27, IL-6, LEP, LPL, MMP12, MMP7, NEMO, PAPPA, PAR-1, PARP-1, PD-L2, PGF, RAGE, SCF, SRC, THBS2, THPO, TNFRSF13B, TRAIL-R2, VEGFD, and XCL1 yielded a clustering coefficient of 0.52, with a PPI enrichment *p* value < 0.0001. Markers from the CVD panel that also play a significant role in metabolic and skeletal disease areas were identified from the PC1 results for each disease group based on bioinformatic databases, including UniProt, Human Protein Atlas, Gene Ontology (GO), and DisGeNET. PPI network analysis was performed for three disease areas per group. These results are shown in Figure 4. The clustering coefficient was strongest for the PD group in all three disease areas when compared to the RA groups. The metabolic disease proteins were identical in clustering strength in both RA groups, uninfluenced by PD status.

**Figure 3.** Protein–protein interactions (PPI) showing networking of 43 CVD-related biomarkers identified to be increased in RA with PD patients. The cluster shows frequent and strong interactions (represented by the same color of the nodes).

**Figure 4.** PC1 biomarkers and their protein network analysis according to disease area in RA with PD, PD, and RA without PD groups. The network nodes represent proteins with red colored nodes denoting first shell interactors and green color showing second shell of interactors. All cluster coefficients (CC) have a PPI enrichment value of <0.0001.

#### **4. Discussion**

In this report, we identified 43 markers with a strong interactive network in patients suffering from PD, with and without RA. The risk of CVD exists in both PD and RA through shared pathological clusters. Several markers also increase associated metabolic and skeletal disease risk, independent of autoimmune status. In order to prevent CVD-related morbidity and mortality in chronic inflammatory conditions, it is crucial to identify and study CVD risk biomarkers in the early stages of inflammatory disease. Studying a vast array of biomarkers that are significant in CVD development is an advantage offered by protein profiling using proteomic techniques. The biological mechanisms can be better understood by identifying early-stage biomarkers that predispose patients with RA and PD, independently or combined, to the risk of CVD.

In our study, we examined an array of 92 biomarkers related to cardiovascular dysfunction and inflammation in RA patients with or without PD and PD patients alone. The disease groups showed a higher number of women of a relatively young age (<50 years). The predominance of women was expected since they are affected more by RA and seek dental care more frequently than men [18]. In young women, being affected with RA is a risk for CVD [19]. Female RA patients are 2.6 times more likely to develop CVD than the general population. Our findings in relation to the age of the present cohort are, therefore, relevant.

An overall dysregulated level of HbA1c and increased waist circumference, a measure of central obesity, in PD patients have been reported previously as well [20]. Periodontal parameters were less severe in RA patients with PD, which is attributable to the use of disease-modifying anti-rheumatic drugs (DMARDs) by this group.

For direct comparisons, CVD biomarker distribution was assessed in all groups. Based on biological processes, the frequency of PD-relevant biomarkers represented immune response (47%), cell adhesion (40%), intracellular mitogen-activated protein kinase (MAPK) signaling cascade (35%), inflammation (30%), catabolic process (23%), and proteolysis (19%). MAPKs are implicated as key regulators of inflammatory cytokines like IL-6 and TNF, thus transducing inflammation [21]. One of the contributing factors to CVD is endothelial dysfunction, which is brought about by overexpression of adhesion molecules due to inflammatory mediators [22].

The association of periodontal parameters with CVD biomarkers was also examined per disease group. In RA with PD, the associated biomarkers for periodontal pocketing spanned from enzymes (ACE-2) and membrane proteins (LOX-1) to plasma proteins (PTX3). The inverse relationship between ACE-2 levels and PPD Total scores reflects a pro-atherogenic state, as ACE-2 levels have been detected in RA patients with a negative correlation with intima media thickness of carotid arteries [23]. Diseased probing sites correlated moderately with PTX3, also a pro-atherogenic inflammatory marker expressed by vascular endothelium and known to modify angiogenesis and atherosclerotic lesion development [24]. Oxidized low density lipoproteins (ox-LDL) have been directly implicated in the pathogenesis of RA through signaling via the lectin-like ox-LDL receptor 1 (LOX-1) in the joint synovium [25]. LOX-1 activates downstream pathways that enhance atherosclerosis via endothelial dysfunction. LOX-1 is also expressed in platelets, where it enhances platelet activation and adhesion to endothelial cells [26]. Both LOX-1 and PTX3 associations with PPD Disease were moderately strong, suggesting that PD contributes to a pro-atherosclerotic milieu in RA.

In PD patients only, three biomarkers (IL-4RA, SRC, MMP-12) conveyed a consistent pattern associated with deep pocketing. They reflect an unbalanced state with low anti-inflammatory IL-4RA levels, confirming previous findings [27]. These findings corroborate a defect in the regulatory involvement of SRC and MMP-12 with phagocytosis and host defense mechanisms in PD patients. Low MMP-12 levels in periodontal tissues may be a risk factor underlying excessive pro-inflammatory IFN-γ macrophage activation in disease [28].

FGF-23, a bone-derived hormone, can also drive an increased production of pro-inflammatory cytokines [29]. Dkk-1 is known to play a pathophysiological role in bone erosion and joint remodeling in RA patients [30]. It negatively regulates the function of the Wnt pathway, which is involved in the differentiation of osteoblasts. Thrombospondin 2 (THBS2), a matricellular protein, has been demonstrated as an endogenous regulator of angiogenesis and inflammation in the RA synovium [31].

High levels of LEP (leptin) associated with increased MBL in RA with PD patients reinforce previous findings of increased LEP levels in a dysfunctional immune phenotype including insulin resistance, inflammation, and disturbances in hemostatic factors [32]. The association of TNFRSF13B with MBL reflects an increased B-cell proliferative and survival capacity via its ligand BAFF (B-cell-activating factor). BAFF is up-regulated in RA synovial joints as well as in early stages of PD [33,34]. IL-27 enhances the TNF-α-mediated upregulation of adhesion molecules and pro-inflammatory IL-6 in blood monocytes of patients with acute myocardial infarction (MI), which associates it with high CVD risk [35]. The TGM2 levels in RA without PD correlate inversely with the total sum of MBL, which aligns with previous findings that TGM2 correlates with RANKL production in human periodontal ligament cells as part of the inflammatory response in PD [36].

Protein biomarkers with high loadings (>0.5) on PC1 exceeded 50% of the total biomarkers analyzed in all disease groups. The biomarkers contributing to the greatest variance were similar in all three groups. Based on their disease–gene associations, these biomarkers are involved in vascular inflammation (HO-1, LPL, PAPPA, ADAMTS13, ADM, PGF, and GDF-2), hypertension, and arterial disease (PAPPA, ADAMTS13, ADM, and PGF). The underlying gene ontology represents upregulation of chemotaxis (XCL1 and CCL3), T helper 1 cytokines (SLAMF1, IL-18, XCL1, and IL-1ra), T helper 2 cytokines (XCL1), negative regulation of vasoconstriction (ADM and LEP), hematopoietic stem cell proliferation (THPO and ATXN) and increased bone loss (TNFSF11A and TF) [37].

The additional PPI network analyses for PC1 markers in CVD, metabolic, and skeletal disease areas were performed because osseous and metabolic disturbances, especially insulin resistance, are highly frequent co-morbidities in both PD and RA [38–41]. The clustering coefficients displayed by the PD group PC1 biomarkers reflect a greater involvement of disease-related proteins, making this the group with the highest risk of developing CVD, insulin resistance, and skeletal diseases. The dampening of inflammatory circuits due to the use of NSAIDs and DMARDs in the RA groups is likely to have some impact on the level of engagement amongst these proteins. Future studies are required to identify and validate markers of diagnostic and therapeutic relevance that may enhance the 'treat-to-target' strategy for RA and, hopefully, PD.

The limitations to our study pertain to the limited size of samples and the exploration of proteins which have been associated with cardiovascular diseases. Due to the exploratory nature of our study and the low prevalence of RA (~1%), we used a non-probability sampling method in which groups were not sex-matched. Despite these limitations, our findings have identified a direction for the exploration of other pathways in order to understand molecular alterations responsible for increased risk of CVD development in RA and PD.

#### **5. Conclusions**

We identified 43 markers with a strong interactive network in patients suffering from PD, with and without RA. In addition, several of these markers also increase associated metabolic and skeletal disease risk, independent of autoimmune status.

**Supplementary Materials:** The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/biomedicines10030714/s1, Table S1, Olink Cardiovascular II panel; Table S2; Table S3, Clinical Parameters.

**Author Contributions:** Conceptualization, J.P., M.Å., T.E.V.D., A.L. and P.-E.E.; methodology, J.P., A.G., M.A., M.Å., A.L. and P.-E.E.; software, J.P., M.Å. and A.L.; validation, J.P., A.G., M.A., M.Å., T.E.V.D., A.L. and P.-E.E.; formal analysis, J.P., M.Å. and A.L.; investigation, J.P., A.G., M.A., M.Å. and A.L.; resources, A.G., M.A. and P.-E.E.; data curation, J.P.; writing—original draft preparation, J.P.; writing—review and editing, J.P., A.G., M.A., M.Å., T.E.V.D., A.L. and P.-E.E.; visualization, J.P., T.E.V.D., A.L. and P.-E.E.; supervision, J.P., M.A., T.E.V.D., A.L. and P.-E.E.; project administration, J.P., A.G., M.A., T.E.V.D., A.L. and P.-E.E. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** The study protocol was approved by the regional research ethical committees in Karachi, Pakistan (26 September 2012, 30 September 2016) and Stockholm, Sweden (2016/296-31/1).

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Data Availability Statement:** All relevant data are provided as Supporting Information in Tables S1–S3.

**Acknowledgments:** The authors would like to acknowledge support of SciLifeLab Affinity Proteomics Uppsala, (Uppsala University, Sweden) for aiding in protein analyses. All authors gave their final approval and agree to be accountable for all aspects of the work.

**Conflicts of Interest:** The authors declare no conflict of interest. Van Dyke is supported by USPHS, NIH, NIDCR DE025020.

#### **References**


## **Frequency and Clinical Significance of Elevated IgG4 in Rheumatoid Arthritis: A Systematic Review**

**Rajalingham Sakthiswary <sup>1,</sup>\*, Syahrul Sazliyana Shaharir <sup>1</sup> and Asrul Abdul Wahab <sup>2</sup>**


**Abstract:** Immunoglobulin (Ig)G4 is a unique protein molecule and its role in autoimmune diseases remains elusive and controversial. Accumulating evidence suggests a pathogenic role of IgG4 in rheumatoid arthritis (RA). Rheumatoid factors (RF) in RA can recognize the Fc domains of IgG4 to form RF-IgG4 immune complexes that may activate the complement system leading to synovial injury. The aim of this article was to systematically review the literature from the past 2 decades to determine the frequency of elevated IgG4 and its clinical significance in RA. We comprehensively searched the Pubmed, Scopus, and Web of Science databases with the following terms: "IgG4", "rheumatoid arthritis", and "immunoglobulin G4", and scrutinized all of the relevant publications. Based on the selection criteria, 12 studies were incorporated, which involved a total of 1715 RA patients. Out of 328 subjects from three studies, the pooled frequency of elevated non-specific IgG4 was 35.98%. There was a significant positive correlation between the IgG4 levels and the RA disease activity based on DAS-28 measurements (r = 0.245–0.253) and inflammatory markers, i.e., erythrocyte sedimentation rate (ESR) and C-reactive protein (CRP) levels (r = 0.262–0.389). Longitudinal studies that measured the serial levels of IgG4 consistently showed a decline in the concentrations (up to 48% less than baseline) with disease modifying anti-rheumatic drug (DMARD) treatment. Current evidence suggests that serum IgG4 levels are significantly elevated in RA compared to the general population. This review indicates that IgG4 is a promising biomarker of disease activity and tends to decline in response to DMARD therapies. Biologic therapies have revolutionized the therapeutic armamentarium of RA in the recent decade, and IgG4 appears to be a potential treatment target.

**Keywords:** arthritis; rheumatoid; immunoglobulin G; immune system

#### **1. Introduction**

Immunoglobulin (Ig)G accounts for 80% of the total immunoglobulins in human serum, and can be divided into four subclasses, i.e., IgG1 (60–70%), IgG2 (15–20%), IgG3 (5–10%), and IgG4 (4–6%). Each of these has different immunological properties and functions [1]. Immunoglobulins play a pivotal role in autoimmune diseases such as rheumatoid arthritis (RA), systemic lupus erythematosus (SLE), and myasthenia gravis. RA is a chronic inflammatory joint disease with a complex pathogenesis. The orchestrated interaction of a wide array of cytokines, autoreactive B cells, and T cells underpins the mechanisms in RA. The sera of RA patients typically exhibit a wide variety of autoantibodies [2]. Rheumatoid factors (RF), which are the predominant autoantibodies in RA, have Fab segments that react with the Fc portion of the IgG molecule to generate IgM (RF)-IgG immune complexes, which can stimulate the complement system and trigger a cascade of events in the synovial microenvironment [3]. IgG4 molecules have stirred much interest among researchers in the past decade, ever since IgG4-related disease (IgG4-RD) was endorsed in 2011 [4]. The striking difference between IgG4-RD and RA is the presence and frequency of joint involvement. IgG4-RD is a multisystemic disorder, with arthritis being reported in only 10% of patients [5]. The common forms of presentation of IgG4-RD include autoimmune pancreatitis, sclerosing cholangitis, sclerosing sialadenitis, orbital disease, and retroperitoneal fibrosis [6]. RA, on the other hand, is primarily a disease of the joints. The presence of arthritis is mandatory to diagnose RA [7].

**Citation:** Sakthiswary, R.; Shaharir, S.S.; Wahab, A.A. Frequency and Clinical Significance of Elevated IgG4 in Rheumatoid Arthritis: A Systematic Review. *Biomedicines* **2022**, *10*, 558. https://doi.org/10.3390/biomedicines10030558

Academic Editor: Marianna Christodoulou

Received: 4 January 2022 Accepted: 17 February 2022 Published: 26 February 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

IgG4 is a poorly understood molecule with controversial roles in the immune system. Traditionally, IgG4 has been viewed as a "non-inflammatory" molecule, which dampens rather than incites immune activation. This is due to the unique molecular structure of IgG4, whereby the heavy chains in each IgG4 molecule have inefficient disulphide bridges due to a single amino acid difference in the hinge region [8]. In hemi-IgG4 molecules, one heavy chain may covalently bind with one light chain, and then dissociate from each other and re-associate randomly with other hemi-IgG4 molecules. This phenomenon is known as the "Fab arm exchange", which exclusively occurs in the IgG4 subclass [9]. This half-antibody exchange generates antibodies that are capable of binding two different antigens, but are rarely able to form large immune complexes [10]. Based on these theories, IgG4 has a limited ability to form immune responses.

Nevertheless, accumulating evidence suggests a pathogenic role of IgG4 based on its correlations with disease activity and the severity of certain disease entities [4,11]. IgG4 may theoretically bind to Fc receptors on macrophages and eosinophils, and facilitate the presentation of extracellular antigens to CD4+ T lymphocytes [12,13]. Some recent publications have implicated IgG4 autoantibodies in the pathogenesis of RA. Histopathological findings of IgG4 infiltration in the rheumatoid synovium and elevated serum levels of IgG4 in RA patients lend further credence to the above notion [3,14,15].

As far as we know, there are no published systematic reviews focusing on IgG4 in RA. Hence, the purpose of this systematic review was to gather and scrutinize all available literature in the past few decades to determine the pooled frequency of elevated IgG4 and its clinical significance.

#### **2. Materials and Methods**

#### *2.1. Search Strategy*

We comprehensively searched the Pubmed, Scopus, and Web of Science databases with the following terms: "IgG4", "immunoglobulin G4", and "rheumatoid arthritis", and tracked all of the publications. All three authors independently performed a literature search by title and abstract screening using the Endnote software. In the event of uncertainty, the full text of the article was obtained and assessed. Disagreements were resolved by a consensus-based discussion. Only articles that were approved by all authors after careful scrutiny were finally included in the review. To minimize the selection, information, and confounding biases, the PICOT (patient/population, intervention, control, outcome, time) approach was employed to develop the inclusion and exclusion criteria [16]. The population in this review referred to patients with RA, intervention in most studies included treatment with disease modifying anti-rheumatic drugs, and the outcome was the serum IgG4 levels. A clear search protocol reduced the ambiguity in the selection process of the articles. In order to achieve extensive coverage without missing any relevant articles, the references of all retrieved articles were reviewed. This systematic review was conducted in accordance with the standards set by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Statement [17]. Figure 1 summarizes our search strategy.

#### *2.2. Inclusion Criteria*

All adult human studies written in English that looked into IgG4 in RA were included. Conference abstracts with sufficient data were considered eligible.

**Figure 1.** The algorithm for the selection of studies in this systematic review.

#### *2.3. Exclusion Criteria*

We excluded studies published before 2000. Furthermore, articles in other languages, case reports, case series, animal studies, editorials, and review articles were excluded.

#### *2.4. Data Extraction*

After compiling the relevant studies, the authors extracted the relevant data from each paper, including year of publication, country, study design, study population, frequency of subjects with elevated IgG4, mean/median IgG4 levels in RA, and the correlations with clinical and biochemical markers. The Newcastle–Ottawa Scale [18] (Table 1) was used for the quality assessment of the 11 included observational studies. The above scale is not applicable for randomized trials. Scores of ≥3 were considered as low risk of bias, whereas scores of <3 were judged as high risk. Disagreements among the authors were solved through discussions and a consensus was reached.



#### **3. Results**

#### *3.1. Study Characteristics and Proteomic Analysis of IgG4*

Based on the selection criteria, 12 studies were incorporated, which involved a total of 1715 RA patients. Among the twelve included studies, four were from Asia [14,15,19,20], seven from Europe [21–27], and one from North America [28]. All of the studies in this series were observational, except for one randomized trial [23]. There were seven cross-sectional studies [14,15,19,20,22,24,27] and five longitudinal studies [21,23,25,26,28] included in this review. The quality assessment of the observational studies based on the Newcastle–Ottawa Scale revealed that six articles were at low risk of bias (≥3 points) and the remaining five were at high risk of bias (<3 points).

IgG4 levels were detected using three methods, i.e., ELISA in seven studies [21–24,26–28], immunonephelometry in four studies [14,15,19,20], and radioimmunoassay in a single study [25]. Of note, all Asian studies used the immunonephelometry method of testing IgG4. The studies that performed the immunonephelometric quantification of IgG4 stored the samples between −80 and −70 °C after the samples were processed in a centrifugal separator. The total levels of IgG and IgG4 were determined with liquid reagent kits [15,19]. The levels of IgG4-specific anti-cyclic citrullinated peptide (CCP) antibodies were determined using an ELISA kit containing a CCP-coated plate with horseradish peroxidase-conjugated anti-human IgG4 antibodies [28].

#### *3.2. Frequency of Elevated IgG4 in Rheumatoid Arthritis*

Five studies analyzed the non-specific IgG4 levels in RA [14,15,19,20,23] (Table 2). Two of these did not provide data on the frequency of subjects with raised levels of IgG4 [20,23]. Out of 328 subjects from the remaining three studies, the pooled frequency of elevated IgG4 was 35.98%. The studies used different kits with variable units of measurement, i.e., g/L, mg/L and mg/dL. Calculation of effect size was not performed as only two studies [15,19] provided the mean values of IgG4.

There were four studies that investigated the levels of IgG4 specific to citrullinated cyclic peptide (CCP) [21,22,27,28] and two studies on citrullinated cyclic fibrinogen (CCF) [24,26] (Table 3). The pooled frequency of elevated IgG4 anti-CCP was 330 out of 581 subjects (56.79%).
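As a minimal sketch of how such a pooled frequency can be obtained, the Python snippet below assumes crude pooling, i.e., the total number of subjects with elevated IgG4 divided by the total number of subjects across studies, which is what a figure reported as "X out of Y subjects" suggests. The function name `pooled_frequency` and the per-study counts are hypothetical and are not data from the included studies.

```python
from typing import Sequence, Tuple


def pooled_frequency(studies: Sequence[Tuple[int, int]]) -> float:
    """Crude pooled frequency (%) from (n_elevated, n_subjects) pairs."""
    for elevated, subjects in studies:
        if subjects <= 0 or not 0 <= elevated <= subjects:
            raise ValueError("Each study needs 0 <= n_elevated <= n_subjects, n_subjects > 0.")
    total_elevated = sum(elevated for elevated, _ in studies)
    total_subjects = sum(subjects for _, subjects in studies)
    return 100.0 * total_elevated / total_subjects


# Hypothetical per-study counts, for illustration only.
example_studies = [(12, 40), (30, 100), (18, 60)]
print(pooled_frequency(example_studies))  # 30.0
```

Crude pooling weights each study by its sample size; it does not account for between-study heterogeneity the way a random-effects model would.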

#### *3.3. Clinical Significance of IgG4 in Rheumatoid Arthritis*

#### 3.3.1. IgG4 and Disease Activity

Four studies investigated the association of serum IgG4 levels with RA disease activity [14,15,19,20]. Of note, two of the studies were from the same group of researchers [14,15]. Kim et al. [19] found significant correlations between serum IgG4 levels and DAS28-ESR (r = 0.245; *p* = 0.016), and with ESR (r = 0.262; *p* = 0.010). In keeping with these findings, Chen et al. [15] reported that IgG4 levels correlated positively with CRP (r = 0.373), ESR (r = 0.389), and DAS28 (r = 0.253; all *p* < 0.05) [4]. Across these studies, the Pearson correlation coefficient (r) between IgG4 levels and RA disease activity measured by DAS28 ranged from 0.245 to 0.253, whereas for the inflammatory markers, i.e., ESR and CRP, it ranged from 0.262 to 0.389. In general, r values between 0.2 and 0.4 reflect a weak to moderate relationship between the variables [29]. There was a trend towards higher IgG4 levels in the high disease activity group compared to the moderate, low, and remission groups, although statistical significance was not achieved.

In one of the studies, the synovial samples of RA patients had a median IgG4-positive (IgG4+) plasma cell count of 83 (10–192)/mm² and a median ratio of IgG4+/IgG+ plasma cells of 19.1 (8.4–31.5). Both measures correlated positively with ESR, CRP, and serum IgG4 (r = 0.216–0.394, all *p* < 0.05) [14].
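The correlation coefficients reported in this subsection can be read against the 0.2–0.4 band described above. The following Python sketch shows how a Pearson r is computed from paired measurements and how a value can be checked against that band. The helper names are hypothetical, the band limits are the only thresholds taken from the text, and the r values listed are those quoted above; no patient-level data are implied.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient for two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("Need two samples of equal length (at least 2 values).")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("Correlation is undefined for a constant sample.")
    return sxy / math.sqrt(sxx * syy)


def is_weak_to_moderate(r: float, low: float = 0.2, high: float = 0.4) -> bool:
    """True if |r| lies in the 0.2-0.4 band described in the text."""
    return low <= abs(r) <= high


# r values quoted in this subsection (serum IgG4 vs. DAS28, ESR, CRP).
reported_r = [0.245, 0.253, 0.262, 0.373, 0.389]
print(all(is_weak_to_moderate(r) for r in reported_r))  # True
```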



#### 3.3.2. IgG4 and Treatment Response

Four longitudinal studies [21,23,26,28] evaluated the changes in the levels of IgG4 with therapy. The therapies used included biologic disease-modifying anti-rheumatic drugs (DMARDs), namely tocilizumab [28] and adalimumab [26], conventional DMARDs [21], and an experimental agent, oral bovine type II collagen [23]. All of these studies except for one [23] consistently showed a decline in IgG4 levels with treatment, accompanied by a parallel decrease in the disease activity of the subjects. Bos et al. [26] reported that although all IgG subclasses (IgG1–4) decreased with treatment, the good responders based on the European League Against Rheumatism (EULAR) response criteria [30] had the greatest decline in antibody levels, and this effect was most pronounced for IgG4 (48% reduction). Similarly, Carbone et al. [28] found a 2–3-fold reduction in IgG4 levels with tocilizumab therapy, but not in IgG1 levels, despite IgG1 being the most frequent IgG subtype.
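The two ways of expressing a decline used above (a percentage reduction and a fold reduction) can be put on the same scale. The sketch below assumes that an "n-fold reduction" means the post-treatment level equals the baseline level divided by n; the function name is hypothetical and the calculation is only an aid to interpretation, not a reanalysis of the cited studies.

```python
def fold_to_percent_reduction(fold: float) -> float:
    """Convert an n-fold reduction (post = baseline / n) to a percentage reduction."""
    if fold < 1:
        raise ValueError("A fold reduction is expected to be >= 1.")
    return (1.0 - 1.0 / fold) * 100.0


# Under this assumption, a 2-3-fold reduction corresponds to roughly 50-67%.
print(round(fold_to_percent_reduction(2), 1))  # 50.0
print(round(fold_to_percent_reduction(3), 1))  # 66.7
```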

Among the subjects who were treated with adalimumab, secondary failure of this biologic therapy was due to the formation of anti-drug antibodies, which were of the IgG4 subclass in up to 29% of the subjects [25].


**Table 3.** Summary of Rheumatoid Arthritis studies on specific types of IgG4.

#### **4. Discussion**

To the best of our knowledge, this is the first systematic review in the literature on IgG4 in RA. Our pooled analyses showed that IgG4 levels were elevated in 35.98% of RA patients, a markedly higher frequency than that reported in healthy individuals. Various studies have found that the frequency of elevated IgG4 in healthy subjects ranges from 0 to 2.5% [19,31]. In keeping with our findings, several studies have reported higher frequencies of elevated IgG4 in autoimmune diseases such as Sjögren syndrome, systemic lupus erythematosus, myasthenia gravis, and eosinophilic granulomatosis with polyangiitis [32,33]. A serum IgG4 concentration of above 135 mg/dL has been widely accepted as the cutoff value to define "elevated IgG4" and as a criterion for the diagnosis of IgG4-related disease [34].

There was a significant positive correlation between IgG4 levels and RA disease activity in all three studies that performed correlation analyses between these parameters. Of note, all three studies used the same composite clinical disease activity tool, i.e., DAS28-ESR, which may partially explain the similarity in the findings. Disease activity in RA reflects synovial inflammation, which is due to the effects of circulating cytokines, such as interleukin (IL)-1, IL-6, and tumor necrosis factor (TNF)-α. The synthesis of IgG4 in vitro was regulated by IL-6. IL-6 may enhance IgG4 production through IL-21 expressed in CD4+ T cells [35], which in turn promotes the differentiation of B cells into antibody-secreting plasma cells [36]. The link between IL-6 and IgG4 may explain the relationship between the latter and RA disease activity. The pro-inflammatory nature of IL-6 is well established in RA, and it plays important roles in the regulation of the immune response, inflammation, and bone metabolism [37]. The association reported in the studies does not necessarily imply that IgG4 directly causes RA disease activity. Nevertheless, elevated IgG4 levels may indicate a relapse of RA. The conventional biomarkers of disease activity widely used by clinicians in day-to-day clinical practice are ESR and CRP. Clinicians may consider IgG4 as an adjunct biomarker in this regard, but not for diagnostic purposes.

The independent role of IgG4 in RA remains elusive, although there is some supporting evidence based on the histopathological analysis by Chen et al. [14,15]. The studies demonstrated marked infiltration of RA synovium by IgG4-positive plasma cells, which correlated with the total synovitis score, inflammatory infiltration subscore, CD3-positive T cells, CD20-positive B cells, and CD38-positive plasma cells. This finding suggests that IgG4 is potentially a culprit molecule in RA rather than an innocent bystander. It is tempting to speculate that the fibrosis observed in the RA synovium could be secondary to the upregulation of a fibrogenic cytokine, i.e., transforming growth factor (TGF)-β, by IgG4 [38]. This postulation is based on our knowledge of IgG4-related disease (IgG4-RD) and its striking histological feature, which is fibrosis [39].

Rheumatoid factors (RF) in RA can recognize the Fc domains of IgG4 to form RF-IgG4 immune complexes that may activate the complement system, leading to synovial injury [40]. Although IgG1 is the most frequent isotype against citrullinated cyclic peptide, Bos et al. [26] proposed that prolonged exposure to autoantigens might shift the IgG4/IgG1 antibody ratio towards an IgG4-dominated response. Figure 2 illustrates the theoretical role of IgG4 in the pathogenesis of RA.

The evidence from this systematic review suggests that IgG4 is a reliable biomarker of treatment response. The results from the studies in this regard were consistent. There are a few hypothetical explanations for the above. DMARD therapy tends to inhibit IgG4 production via TNFα inhibition [41]. Furthermore, IgG4 levels, unlike IgG1 levels, tend to decline with therapy among responders due to disruptions in the chronic stimulation by citrullinated proteins [42]. Citrullination is inflammation-dependent and is hence suppressed by DMARD therapies, which have anti-inflammatory properties. IgG1 levels are stable and inflammation-independent as IgG1 is predominantly produced by long-lived plasma cells, whereas IgG4 is produced by short-lived plasma cells, which are driven by citrullinated proteins [43].

**Figure 2.** Hypothetical pathogenesis model on the role of IgG4 in rheumatoid arthritis.

We acknowledge the limitations of this systematic review. Most of the studies were conducted in Europe and Asia, which may limit the representativeness of the results to a certain extent. There are racial and ethnic disparities with regard to the disease characteristics and clinical outcomes in RA [44]. We were unable to calculate the effect size for the correlation between RA disease activity and IgG4, or the difference in the means of IgG4 across the various categories of RA patients, due to the lack of relevant numerical data in the included studies. Because the data come from only a few studies, firm conclusions cannot be drawn and some of our interpretations remain speculative. Moreover, several articles with ambiguous data descriptions were excluded, which may have affected the pooled frequency.

#### **5. Conclusions**

Current evidence suggests that serum IgG4 levels are elevated in RA compared to the general population. This review indicates that IgG4 is a promising biomarker of disease activity and tends to decline in response to DMARD therapies. Thus, IgG4 could serve as an alternative modality in RA to assess patients' disease severity. There are several theories with regard to the pro-inflammatory role of IgG4. Further research is necessary to substantiate these hypotheses.

**Author Contributions:** Conceptualization, R.S., S.S.S. and A.A.W.; methodology, R.S., S.S.S. and A.A.W.; data curation, R.S.; writing, R.S. and S.S.S.; funding acquisition, R.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** The research was funded by Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available upon request from the corresponding author.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

