2.2.4. Genes That Encode Costimulatory Molecules

Some of the most significant RA candidate genes encode enzymes or costimulatory auxiliary proteins that are important for the metabolism of immunocompetent cells, although they are not directly involved in the formation of the immune response. For instance, a missense mutation in exon 14 of the *PTPN22* gene leads to an arginine-to-tryptophan substitution at position 620 of the polypeptide chain (p.R620W, rs2476601) [30]. The R620W substitution is localised in the motif responsible for interaction with cofactor proteins and, in experimental settings, could mediate an increase in autoreactive B-cell numbers. This substitution is associated not only with RA but also with other autoimmune diseases, namely systemic lupus erythematosus, type 1 diabetes, and Graves' disease (in European populations only; the site is usually not polymorphic in Asians). The *PTPN22* gene encodes lymphocyte tyrosine phosphatase, and the substitution acts as a gain-of-function mutation. A detailed study of the functions of the normal and mutant alleles in model experiments revealed unexpected difficulties: the homologous 619W mutation in mice acts as a loss-of-function mutation that is equivalent to the full knockout of the orthologous gene [6,19,30]. According to a recent meta-analysis of 53 original case–control studies, the polymorphism rs2476601 is associated with RA development in homozygous and heterozygous models in Caucasians and Africans [31]. In addition, the C allele of rs2488457 is associated with RA progression in Caucasians but not in Asians [32]. PTPN22 interacts with the product of *PADI4*, another RA candidate gene, which encodes peptidylarginine deiminase. This enzyme converts arginine residues into citrulline, and a disturbance of this process can contribute to the formation of anticitrulline antibodies. Increased PADI4 activity leading to abnormal citrullination of fibrin strands is noted in inflammatory infiltrate.
In addition, it is known that PADI enzyme expression is increased and citrullinated peptides are found in the bronchoalveolar lavage cells of smokers compared to nonsmokers. Smoking is considered an RA-provoking environmental factor that contributes to the multifactorial basis of disease development [33]. The *PADI4* gene, located in the 1p36 region, was the first RA candidate gene identified in the Asian population. Unfavourable substitutions in *PADI4* codons (G55S, V82A and G112A) do not significantly affect the activity of the enzyme itself but reduce the stability of the gene's mRNA. A meta-analysis of SNPs in this gene in 20,000 RA patients and 25,000 controls determined associations of the disease with the –94G/A polymorphism in Asian populations, the –92C/G polymorphism in African populations, and the –90C/T polymorphism in Latin populations. Interestingly, the interaction between PTPN22 and PADI4 most likely has functional significance, since PTPN22 deficiency leads to increased production of citrullinated compounds and the formation of neutrophil extracellular traps [6,27,34].

The autoimmune regulator transcription factor (AIRE), with its 11 polymorphic variants studied in autoimmune diseases, should also be noted. A meta-analysis that included a total of 6696 RA patients and 8164 controls showed an association of the *AIRE* gene polymorphisms rs2075876 and rs760426 with RA. The polymorphisms were localised in the gene promoter; the presence of unfavourable alleles reduced gene expression. A lack of AIRE protein led to the failure of the specific selection of naive T-cells in the thymus, resulting in the survival of autoreactive T-cells [35]. Thus, according to a number of meta-analyses, specific polymorphisms that act as predisposition factors in various populations and heterogeneous clinical groups can be distinguished among the large number of genetic variants related to RA. The main RA-predisposing polymorphisms are presented in Table 1.


**Table 1.** Polymorphisms (non-HLA genes) associated with rheumatoid arthritis (RA) in mixed populations according to meta-analysis data.

Some of the candidate genes and polymorphic variants are not directly related to cytokines and intracellular signalling pathways that enhance inflammatory processes. However, polymorphisms in such genes may change the activity of the protein product, which leads to the accumulation of certain metabolites and the subsequent overproduction of proinflammatory cytokines. For example, a meta-analysis showed an association of an intronic SNP in the *SLC8A3* gene, which encodes a Na<sup>+</sup>/Ca<sup>2+</sup> transmembrane transfer protein, with ACPA-positive RA. There is evidence that hyperactivity of this carrier protein may be accompanied by an increased level of tumour necrosis factor alpha (TNFα) [47]. The *MTHFR* gene encodes methylenetetrahydrofolate reductase, which promotes the conversion of homocysteine to methionine, a step that is important for folate metabolism and the synthesis of nucleic acids. According to the meta-analysis, the T allele of the C677T polymorphism (rs1801133) is associated with RA in Caucasians. This allele encodes an enzyme with reduced activity, which leads to hyperhomocysteinemia in homozygotes and, in turn, to an increase in proinflammatory cytokine concentration [40]. Moreover, according to a meta-analysis of 16 original studies published in 2020, C677T is associated with RA in the dominant and recessive models in Caucasians and Africans, and in the recessive model in Asians [41]. In addition, a meta-analysis of eight papers demonstrated an association of *TIM3* with RA development. This gene encodes an auxiliary factor expressed by dendritic cells, macrophages, type 1 T-helper cells, and type 2 T-helper cells. TIM3 modulates the differentiation of type 1 T-helper and type 17 T-helper cells, which suppress autoimmune processes. The polymorphism rs1036199 in *TIM3* is associated with RA development [48].
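
The dominant, recessive, homozygous, and heterozygous models referred to in the meta-analyses above differ only in how the three genotype counts are collapsed into a 2×2 table before an odds ratio is computed. The following Python sketch illustrates this under stated assumptions: the genotype counts are invented for illustration and are not taken from any cited study, the function names `odds_ratio` and `genetic_model_tables` are hypothetical, and a single-study Wald confidence interval stands in for the pooled estimates that an actual meta-analysis would report.

```python
import math

def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Odds ratio with a 95% Wald confidence interval for one 2x2 table."""
    cells = [case_exp, case_unexp, ctrl_exp, ctrl_unexp]
    # Continuity correction (add 0.5 to every cell) if any cell is empty.
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    or_value = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_value) - 1.96 * se_log_or)
    upper = math.exp(math.log(or_value) + 1.96 * se_log_or)
    return or_value, lower, upper

def genetic_model_tables(cases, controls):
    """Collapse genotype counts (keys 'CC', 'CT', 'TT'; T = risk allele)
    into (case_exp, case_unexp, ctrl_exp, ctrl_unexp) for each model."""
    return {
        # Dominant: carriers of at least one T allele vs. CC.
        "dominant": (cases["TT"] + cases["CT"], cases["CC"],
                     controls["TT"] + controls["CT"], controls["CC"]),
        # Recessive: TT vs. carriers of at least one C allele.
        "recessive": (cases["TT"], cases["CT"] + cases["CC"],
                      controls["TT"], controls["CT"] + controls["CC"]),
        # Homozygous: TT vs. CC (heterozygotes excluded).
        "homozygous": (cases["TT"], cases["CC"],
                       controls["TT"], controls["CC"]),
        # Heterozygous: CT vs. CC (TT homozygotes excluded).
        "heterozygous": (cases["CT"], cases["CC"],
                         controls["CT"], controls["CC"]),
    }

# Invented genotype counts, for illustration only.
cases = {"CC": 100, "CT": 80, "TT": 20}
controls = {"CC": 120, "CT": 70, "TT": 10}

for model, table in genetic_model_tables(cases, controls).items():
    or_value, lower, upper = odds_ratio(*table)
    print(f"{model:12s} OR = {or_value:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

Because the same genotype data yield a different odds ratio under each model, an association reported "in the recessive model" in one population and "in the dominant and recessive models" in another is not contradictory; the statements describe different contrasts.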

#### **3. Predisposition to RA Depends on the Method of Analysis of Genes and Population Characteristics**

Studies of the impact of missense variants in candidate genes have led to a greater understanding of the molecular pathogenesis of RA. A multivariate GWAS-based analysis identified the genes and signalling pathways involved in the pathogenesis of the disease. However, the identified RA candidate gene variants explain only 30% of the disease cases with familial aggregation, while the actual familial prevalence among all RA patients reaches 65% [1,14,18]. The genetic causes of a significant proportion of familial RA have not yet been identified. Perhaps high-throughput sequencing (HTS) technology will help to solve this problem in the near future. The first HTS experiments have already supplemented the list of unfavourable alleles with missense variants of the *IL2RA* and *IL2RB* genes. In silico construction of an interactome of the already known RA candidate genes identified about 160 other presumptive candidates. Subsequently, a number of them were considered RA-associated according to GWAS results: the *NOTCH4* gene, the *TNXB* gene located near the MHC class III genes, the *BTNL2* gene (especially rs3817963; also associated with systemic sarcoidosis), as well as the insufficiently characterised sequence *C6orf10*, expressed in autoimmune and some other diseases [49].

Population genetic characteristics, as well as the results of previous meta-analyses, need to be taken into account when using GWAS and HTS technologies. The list of the main RA-predisposing genes identified in Latin American populations is an example of such research. The list was mostly formed from the results of studies performed between 2003 and 2013, and it is consistent with similar lists for European populations. This could be explained by the significant proportion of RA patients of Spanish origin among the examined patients. At the same time, RA candidate genes specific to Latin American populations, such as *ENOX1* on chromosome 13 and *NAA25* on chromosome 12, have only been described in recent years [7].

SNPs occur in no more than 1% of the coding part of the genome; most of them are localised in noncoding DNA. Single RA-associated nucleotide substitutions have been detected near the noncoding regions of 40–50 genes. At least some of them are suggested to increase RA risk by activating tissue-specific superenhancers. According to data from the FANTOM5 consortium (which contains information on tissue-specific enhancers for 71 cell types), polymorphic variants associated with RA are localised mainly in T-cell and natural killer cell enhancers rather than in those of other cell types. In addition, SNPs associated with RA and other autoimmune diseases are more likely to occur in superenhancers than in "regular" enhancer regions of CD4<sup>+</sup> T-cells [50]. These superenhancers are associated with 27 of the 100 major RA candidate genes. Moreover, 12 loci that have shown an association with RA according to GWAS contain long noncoding RNA genes that regulate the expression of other candidate genes. RA-associated SNPs of noncoding regions are often found in candidate gene enhancer regions. Therefore, a number of functionally significant polymorphisms could be identified by reverse genetics methods. Some authors have searched for SNPs in the regulatory regions of genes using data on changes in their expression. The RA association of rs2013109 in the *RNASE2* gene was shown this way [51]. Currently, the search for RA candidate genes is performed by bioinformatic analysis of published expression profiles obtained with high-density microchip technology (for instance, Affymetrix) [52–55]. Whole-genome or whole-exome sequencing on HTS platforms could also help to identify rare and population-specific pathogenic genetic variants. In particular, rare RA-associated germline variants were identified in the *NCR3LG1*, *RAP1GAP*, *CHCHD5*, *HIPK2*, and *DIAPH2* genes by whole-exome sequencing with Illumina HiSeq in a cohort of Han Chinese patients [56].
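
The statement that disease-associated SNPs are "more likely to occur" in superenhancers than in regular enhancers is, in statistical terms, an enrichment claim. One common way to test such a claim is a one-sided hypergeometric (Fisher-type) test, sketched below in Python. This is an illustrative assumption about how the enrichment could be assessed, not a description of the method used in the cited study [50]; the function names and all counts are hypothetical.

```python
from math import comb

def enrichment_p_value(k, K, n, N):
    """One-sided hypergeometric p-value P(X >= k).

    N: all enhancer SNPs considered
    K: enhancer SNPs that lie in superenhancers
    n: enhancer SNPs that are disease-associated
    k: disease-associated SNPs that lie in superenhancers
    """
    total = comb(N, n)
    upper = min(K, n)
    # Sum the probabilities of observing k or more overlapping SNPs by chance.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

def fold_enrichment(k, K, n, N):
    """Observed superenhancer fraction among associated SNPs
    divided by the superenhancer fraction among all SNPs."""
    return (k / n) / (K / N)

# Invented counts, for illustration only.
N_all, K_super, n_assoc, k_overlap = 1000, 100, 50, 15

print("fold enrichment:", fold_enrichment(k_overlap, K_super, n_assoc, N_all))
print("p-value:", enrichment_p_value(k_overlap, K_super, n_assoc, N_all))
```

A fold enrichment above 1 with a small p-value would support the interpretation that associated variants cluster in superenhancers more often than expected by chance, although a real analysis would also need to account for linkage disequilibrium between neighbouring SNPs.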

The data presented in this section indicate the significant experimental and bioinformatic work performed by researchers in countries all over the world to elucidate the genetic causes of RA development. However, whether the genetic determinants can help in personalising and increasing the effectiveness of RA treatment remains an important question for clinical rheumatology.

#### **4. Genetic Factors are Prognostic Markers Associated with RA Clinical Manifestations**

There are currently many genes and loci associated with a predisposition to RA. The identification of genetic markers associated with the clinical prognosis of RA is more difficult than identifying predisposition markers. This is because the severity of the disease depends not only on genetic factors but also on epigenetic changes, as well as the action of provoking environmental factors.

Some genetic factors predisposing to RA can also be associated with the severity of the disease course. For example, *HLA-DRB1* alleles were identified as being associated with radiologically determined joint damage, which is one of the main manifestations of disease severity. Sixteen *HLA-DRB1* haplotypes are associated with an increased risk of RA, erosive joint damage, and patient lifespan [57]. Valine at position 11 of HLA-DRB1 is a predictor of joint erosion and an unfavourable outcome, whereas serine at the same position is associated with a less severe clinical course of the disease [58]. Besides these amino acid variants, the rs112112734 polymorphism of the *HLA-DRB1* gene is also connected with radiologically determined joint damage [17]. In addition, the same study showed that rs112112734 is associated with the presence of rheumatoid factor and ACPAs, which are serological markers of RA.

The IL6 signalling pathway is an important participant in the proinflammatory cytokine network, which contributes to the destruction of articular tissues. A study of *IL6R* gene polymorphisms and the extent of joint erosion in RA revealed an association between SNP rs4845618, located in the first intron of the *IL6R* gene, and joint damage. An analysis of whole blood samples showed that rs4845618 is significant for the expression of IL6R. Using data from the Roadmap Epigenomics Project, it has been demonstrated that the region in which rs4845618 is located is associated with the regulation of gene activity in more than 50 different types of human cells, including T-cells. This suggests an association between joint damage and the regulatory sequence of the *IL6R* gene [59].

The serum level of another interleukin, IL37, which is one of the key modulators of RA, correlates with disease activity. An analysis of the rs3811047 polymorphism of the *IL37* gene in RA patients from the Egyptian Arab population showed that patients with the GG genotype had higher disease activity scores on the DAS28 clinical scale than patients with the AA or AG genotypes [60]. In another study, the polymorphism rs911263, located within the *RAD51B* gene, was presented as a genetic factor associated with joint erosion and the severity of RA. The C allele of rs911263 is associated with a lower incidence of joint erosion in patients with RA and demonstrated a strong protective effect [61]. RAD51B is a member of the RAD51 family of proteins that are required for DNA repair through recombination. It is currently not known how this polymorphism is functionally related to the severity of the clinical manifestations of RA.
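
For orientation, DAS28 combines counts of tender and swollen joints (out of 28) with an inflammation marker and the patient's global health assessment into a single score. The Python sketch below uses the commonly published ESR-based formula and the conventional activity cut-offs (2.6, 3.2, and 5.1); it is an assumption that the cited studies used this variant rather than the C-reactive protein-based one, the function names are hypothetical, and the input values are invented.

```python
import math

def das28_esr(tender_joints, swollen_joints, esr_mm_h, global_health_vas):
    """DAS28 based on erythrocyte sedimentation rate (ESR).

    tender_joints, swollen_joints: counts out of 28 joints
    esr_mm_h: ESR in mm/h (must be positive)
    global_health_vas: patient global health, 0-100 mm visual analogue scale
    """
    if not (0 <= tender_joints <= 28 and 0 <= swollen_joints <= 28):
        raise ValueError("joint counts must be between 0 and 28")
    if esr_mm_h <= 0:
        raise ValueError("ESR must be positive")
    if not 0 <= global_health_vas <= 100:
        raise ValueError("global health VAS must be between 0 and 100")
    return (0.56 * math.sqrt(tender_joints)
            + 0.28 * math.sqrt(swollen_joints)
            + 0.70 * math.log(esr_mm_h)
            + 0.014 * global_health_vas)

def activity_category(score):
    """Map a DAS28 score to the conventional disease activity category."""
    if score < 2.6:
        return "remission"
    if score <= 3.2:
        return "low"
    if score <= 5.1:
        return "moderate"
    return "high"

# Invented patient values, for illustration only.
score = das28_esr(tender_joints=4, swollen_joints=9,
                  esr_mm_h=20, global_health_vas=50)
print(f"DAS28-ESR = {score:.2f} ({activity_category(score)} activity)")
```

Genotype comparisons such as the one reported for rs3811047 amount to comparing the distribution of such scores between genotype groups.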

The *UBASH3A* gene, which is associated with the development of RA in the Korean population and in Caucasians [62], is also associated with the severity of the disease course. The C allele of the rs1893592 polymorphism of the *UBASH3A* gene was shown to be protective with respect to RA activity (as assessed by DAS28 score, C-reactive protein level, and bone erosion). However, this effect may be population-specific [63]. The *UBASH3A* gene encodes a protein involved in the degradation of receptor tyrosine kinases, with proapoptotic properties in T-cells, which may explain its involvement in the pathogenesis of RA. It is noteworthy that AA homozygotes at position –308 of the *TNFA* gene also have a pronounced association with both the risk of developing RA and the severity of the clinical picture of this disease [64].

The above examples show that polymorphic variants of RA candidate genes can not only increase the risk of developing the disease but also, ceteris paribus, increase the likelihood of a more severe disease course. Although their use as markers in clinical practice is impractical due to low penetrance and, in some cases, population specificity, polymorphisms associated with RA activity point to the molecular pathways underlying the intensive development of RA and to the possibilities of targeted therapy of this disease.
