**RNA Sequencing (RNA-Seq) Based Transcriptome Analysis in Immune Response of Holstein Cattle to Killed Vaccine against Bovine Viral Diarrhea Virus Type I**

#### **Bryan Irvine Lopez 1,**†**, Kier Gumangan Santiago 2,3,**†**, Donghui Lee 2, Seungmin Ha <sup>4</sup> and Kangseok Seo 2,\***


Received: 7 February 2020; Accepted: 18 February 2020; Published: 21 February 2020

**Simple Summary:** Due to the undeniable detrimental impact of bovine viral diarrhea virus (BVDV) on cattle worldwide, various preventive approaches are carried out to control the spread of this disease. Among the established preventive approaches, vaccination remains the most widely used cost-effective method of control. Hence, a deeper study into the host immune response to vaccines will further refine the efficacy of these vaccines; the identification of differentially expressed genes (DEGs) related to immune response might bring a long-lasting solution. Thus far, studies showing the genes related to the immune response of cattle to vaccines are still limited. Therefore, this study identified DEGs in animals with high and low sample to positive (S/P) ratio based on the BVDV antibody level, using RNA sequencing (RNA-seq) transcriptome analysis, and functional enrichment analysis in gene ontology (GO) annotations and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway. Results revealed that several upregulated and downregulated genes were significantly annotated to antigen processing and presentation (MHC class I), immune response, and interferon-gamma production, indicating the immune response of the animals related to possible shaping of their adaptive immunity against the BVDV type I. Moreover, significant enrichment to various KEGG pathways related to the development of adaptive immunity was observed.

**Abstract:** Immune response of 107 vaccinated Holstein cattle was initially obtained prior to the ELISA test. Five cattle with high and low bovine viral diarrhea virus (BVDV) type I antibody were identified as the final experimental animals. Blood samples from these animals were then utilized to determine significant differentially expressed genes (DEGs) using the RNA-seq transcriptome analysis and enrichment analysis. Our analysis identified 261 DEGs in cattle identified as experimental animals. Functional enrichment analysis in gene ontology (GO) annotations and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways revealed the DEGs potentially induced by the inactivated BVDV type I vaccine, and might be responsible for the host immune responses. Our findings suggested that inactivated vaccine induced upregulation of genes involved in different GO annotations, including antigen processing and presentation of peptide antigen (via MHC class I), immune response, and positive regulation of interferon-gamma production. The observed downregulation of other genes involved in immune response might be due to inhibition of toll-like receptors (TLRs) by the

upregulation of the Bcl-3 gene. Meanwhile, the result of KEGG pathways revealed that the majority of DEGs were upregulated and enriched to different pathways, including cytokine-cytokine receptor interaction, platelet activation, extracellular matrix (ECM) receptor interaction, hematopoietic cell lineage, and ATP-binding cassette (ABC) transporters. These significant pathways supported our initial findings and are known to play a vital role in shaping adaptive immunity against BVDV type 1. In addition, type 1 diabetes mellitus pathways tended to be significantly enriched. Thus, further studies are needed to investigate the prevalence of type 1 diabetes mellitus in cattle vaccinated with inactivated and live BVDV vaccine.

**Keywords:** Bovine Viral Diarrhea Virus; RNA-Seq; Transcriptome analysis; Holstein cattle

#### **1. Introduction**

Bovine viral diarrhea virus (BVDV) is an economically important pathogen of domestic and wild ruminants affecting multiple organ systems, incurring most economic losses due to respiratory diseases, low reproductive performance (due to reduced conception rates), early embryonic deaths, abortion, congenital weak calves, and high costs of control programs [1–4]. BVDV belongs to the genus *Pestivirus* within the family *Flaviviridae*, with two species, namely BVDV1 and BVDV2; both consist of strains, belonging to biotypes non-cytopathogenic or cytopathogenic, based on cell-cultured characteristics [5]. The non-cytopathogenic strain has the ability to cross the maternal placenta, infecting the growing fetus at early gestation (before 150 gestation days) due to the undeveloped immune system and failure of recognizing the virus as foreign [6]. This fetal infection affects fetal development [7], and results in persistently infected born calves, shedding lifelong reservoir of BVDV in the herd, while cytopathic BVDV plays a vital role by superinfecting persistently infected cattle, leading to mucosal disease [5]. In addition, recent studies reported that the immunosuppressive ability of BVDV heightens other viral disease potentiators, particularly the bovine respiratory disease [8–10]. The differences in genotypes and biotypes of BVDV, a wide range of susceptible hosts, the ability to induce persistent infection, and intervene with both innate and adaptive immunity, makes the prevention and control program difficult [4].

To date, modified live viral and inactivated viral vaccines are widely used to prevent the consequences of BVDV infection [11]. However, vaccine efficacy varies, depending on the animal's nutritional status [12], maternal antibody from the dam colostrum [13], and the presence of persistently infected cattle in the herd. Thus far, numerous studies were conducted to improve vaccine efficacy against BVDV, yet still remains prevalent among the cattle herd worldwide.

Numerous studies focusing on the transcriptomic analysis of animals infected with various diseases were carried out. In the study of Li et al. [4], various differentially expressed genes (DEGs) related to goat immune response, including inflammation, defense response, cell locomotion, and cytokine/chemokine-mediated signaling were revealed by transcriptome analysis with samples from BVDV2 artificially infected goat peripheral blood mononuclear cells (PBMCs). Similar methodology was done in the study of Singh et al. [14], where various significant DEGs related to the immune system processes of goat and sheep against bluetongue virus serotype 16 (BTV-16) were revealed, such as NFκB, MAPK, Ras, NOD, RIG, TNF, TLR, JAK-STAT, and VEGF signaling pathways. Meanwhile, comparative transcriptomic analyses between infected and non-infected animals were conducted by Barreto et al. [15], where infected bovine were observed to have massive changes in the expression profiles of keratinocyte, immune system, cell proliferation, and apoptosis genes.

All of these studies used transcriptome analysis and next-generation sequencing (NGS) approach particularly the RNA sequencing (RNA-Seq). RNA sequencing is a developed method that uses deep sequencing technology for transcriptome profiling [16]; aside from providing a comprehensive picture of the transcriptome, it also reveals the activity and mechanism of the molecular structure and explores the biological function of a gene [17]. Furthermore, future works using this technology are considered, such as genetic linkage mapping, quantitative trait analysis, disease-resistant strains, effective vaccines, and therapies development [18].

Although genotyping and RNA sequencing still remain costly, such studies may provide novel insights and solid foundation in improving herd performance through the robustness of animals against viral diseases. However, studies pertaining to the immune responses of the animal to the vaccine were not fully explored. Thus, the objective of this study is to identify differentially expressed genes (DEGs) related to the functional immune response of the host vaccinated with the inactivated BVDV type I vaccine, and to provide insight on how vaccines improve the immunity of animals against diseases.

#### **2. Materials and Methods**

#### *2.1. Experimental Animals and Vaccination*

A total of 107 vaccinated Holstein Cattle (*Bos taurus)* was used in the study. Multivalent killed vaccine Bar Vac Elite 4-HS (Boehringer Ingelheim Vetmedica, Inc., St Joseph, MO, USA) containing antigen of infectious bovine rhinotracheitis virus, bovine viral diarrhea virus (BVDV, type I), bovine respiratory syncytial virus (BRSV), Myxovirus parainfluenza type 3 (PI3) and *Haemophilus somnus* bacterin were given intramuscularly, as prescribed by the manufacturer.

Blood samples were collected from the jugular vein at 7, 28, and 168 d post-vaccination. Subsequently, blood was allowed to coagulate for 1–2 h at 4 ◦C and centrifugate for 20 min at room temperature with the relative centrifugal force of 1800 × g. Serum was collected and aliquoted into 1.5 mL tubes and stored below −60 ◦C until ELISA test.

#### *2.2. Serological Antibody Detection*

Competitive ELISA, using VDPro BVDV AB ELISA (Median Diagnostics Inc., Chuncheon, Republic of Korea) was used to assay the antibody responses of each animal against the vaccine. Assaying was performed as per manufacturer protocols. Concisely, the BVDV gp63 antigen was allowed to absorb in the polystyrene plate and bind with antibodies in serum samples. It was competed for corresponding hydrogen peroxide conjugated monoclonal antibodies. The chromogenic change after the addition of 3, 3', 5, 5'-Tetramethylbenzidine substrate was measured at 450 nm optical density using BioTek ELISA reader and Gen5 2.07 software; results with lower color development signify a higher level of antibody. The optical density (OD) value was measured using a microplate reader set at 405 nm and concentration was valued with the corresponding standard references. The obtained OD value of BVDV type I antibodies were evaluated for the comparative value, related to the positive control value, to get antibodies level in the sample to positive (S/P) ratio form by applying the equation as below.

$$\frac{S}{P} = \frac{\text{Sample O.D. value - Negative Control O.D. value}}{\text{Positive control O.D. value - Negative Control O.D. value}}\tag{1}$$

The obtained BVDV type I S/P ratio, and other immune-related parameters, such as TNF-alpha, IFN-gamma, IL-17A, IL-1b, IL-4, IL-2, and IL-6 were used as the basis for selecting experimental animals. Among the 107 experimental animals, only ten (10) animals were selected and grouped into two groups, namely the low and high BVDV type I groups; each group had five (5) animals (Figure 1).

#### *2.3. RNA Isolation, Library Preparation, and RNA Sequencing (RNA-seq)*

Total RNA was stabilized and isolated from the blood samples (collected after vaccination) of the selected animal groups using Tempus Blood RNA Tube (Applied Biosystems, Seoul, Korea), according to the manufacturer's instructions. RNeasy MinElute Cleanup Kit (Qiagen, Valencia, CA, USA) was used to purify and concentrate the previously isolated RNA. RNA quality based on RNA integrity number (RIN) was determined using Agilent Technologies 2100 Bioanalyzer (Agilent, Santa Clara, CA, USA) with an acceptable RIN value of ≥7. These RNA samples were used to generate RNA-Seq transcriptome libraries, using the TruSeq Stranded Total RNA LT sample preparation kit (Globin) of Illumina (San Diego, CA, USA). To ensure the quality of prepared libraries, the size of the PCR enriched fragments were verified by checking the template size distribution, by running on Agilent Technologies 2100 Bioanalyzer using a DNA 1000 chip, and were quantified using qPCR according to the Illumina qPCR quantification protocol guide. After a series of quality control and quantification, prepared paired-end libraries for animals with high (test) and low (control) BVDV type I antibody were then sequenced with the Illumina NovaSeq 6000 platform, performed by TNT Research Corporation Limited (Anyang, South Korea).

**Figure 1.** Sample to positive (S/P) ratio of cattle groups identified as high (*n* = 5), low (*n* = 5) and average (*n* = 107) bovine viral diarrhea virus (BVDV) type I antibody level at different time points. The error bars indicate standard error.

#### *2.4. Reads Trimming, Mapping, and Assembly of Sequenced RNA Reads*

Quality control (QC) of raw paired-end reads were done by trimming reads using Trimmomatic 0.38 (http://www.usadellab.org/cms/?page=trimmomatic). QC included the removal of adapter sequence, contaminant DNA, and low-quality reads with lengths below 36 bp. Clean reads (cDNA) were indexed to reference (*Bos taurus*) cattle genome GCF\_002263795.1 ARS-UCD1.2 and were mapped against the reference genome using HISAT2 version 2.1.0 (Bowtie2 aligner) (https://ccb.jhu.edu/software/hisat2/ index.shtml). Reference-based aligned read assembly of transcripts was performed using the StringTie 1.3.4d (https://ccb.jhu.edu/software/stringtie/). This allowed the identification of transcript or genes with annotation information in the assembled genome, while genes without annotated information were defined as new transcripts [19]. On the other hand, mapping of each sample without the –e option of StringTie allowed the prediction of novel transcript and novel alternative splicing transcript. The gffcompare program from GFF utilities was used to compare existing annotations and distinguish novel transcript types.

#### *2.5. Di*ff*erential Expression Genes (DEGs) Analysis and Clustering*

The identification of DEGs between case and control samples was based on the expression level on each transcript, which was calculated using the fragments per kilobase of exon per million mapped reads (FPKM) method. The DESeq2 package, equipped with fold change and negative binomial (nbinom) Wald test, was used for differential expression analyses. DEGs were identified based on the following parameters: the logarithmic fold change was greater than or equal to 2 (|fc|>=2) and nbinom Wald test raw *p* < 0.05. In addition, hierarchical clustering of significant genes was done to determine the similarity level of each sample.

#### *2.6. Functional Annotation and Enrichment Analysis*

DEGs were based on several functional annotation databases, specifically gene ontology (GO) (http: //geneontology.org/) and Kyoto Encyclopedia of Genes and Genomes (KEGG) (http://kegg.jp). Enrichment analysis was performed using the Database for Annotation, Visualization and Integrated Discovery 6.8

(DAVID) tool (http://david.abcc.ncifcrf.gov/) equipped with the modified Fisher's exact test. DEGs with a *p*-value of less than 0.05 were significantly considered enriched in GO terms and KEGG pathways.

#### **3. Results**

#### *3.1. Experimental Animals*

Table 1 shows the two groups of experimental animals identified in this study, namely, low and high group, based on the level of BVDV type I antibody and level of immune responses, including TNF-alpha, IFN-gamma, IL-17A, IL-1b, IL-4, IL-2, and IL-6. For the earlier group, there were five identified animals namely (ID number), 13064, 13083, 13090, 14010, and 14017, while a similar number of animals belong to the latter group, namely, 14107, 15060, 15071, 15083, and 15094.

#### *3.2. Transcriptome Sequencing Data*

An average of 9.0G bp (Table 1) raw data for each sample was obtained from paired-end transcriptome sequencing from the Illumina NovaSeq 6000 platform. Prior to further analysis, raw data were subject to quality control using Trimmomatic version 0.38. The trimmed results show that the total read bases, GC (%), and Q30 (%) of each sample have values ranging from 7.0 G to 11.0 G, 44.86% to 45.99%, and 94.65% to 95.31%, respectively, as shown in Table 1. Trimmed data were mapped against the reference genome (GCF\_002263795.1 ARS-UCD1.2) using the HISAT2 program. Obtained mapped reads were then assembled using StringTie-e option version 1.3.4d, which resulted in a total of 100,685 transcripts and 39,127 genes successfully mapped against the reference genome. Thereafter, the removal of low-quality transcripts and genes was done, leaving only 10,000 transcripts and 35,000 genes for differentiation analysis. Furthermore, a total of 1452 novel transcripts, 13,060 novel splicing variants, and 4199 novel genes were identified using the StringTie software.


**Table 1.** Summary of the mapping information for each sample.

\* Animals belongs to high immune responses and BVDV type I antibody group.

#### *3.3. Di*ff*erentially Expressed Genes*

The resulting good quality genes from StringTie software were filtered by excluding genes with at least one zero count, leaving only 45% (16,315 genes) for DEG analysis. Prior to DEG analysis, expression levels between genes of each sample were first normalized using the Relative Log Expression (RLE) normalization method, based on raw read counts (Figure 2).

The analysis of the differently expressed genes (DEGs) were done by comparing the normalized values using the DESeq2 package, equipped with log fold change and nbinom Wald test. A total of 261 genes were considered differentially expressed based on the threshold level (fold change (log2) ≥ 2 and *p*-value < 0.05) (Figure 3). Results of comparison analysis between animals with high (test) and low (control) BVDV type I antibody groups revealed that 143 genes were classified as up-regulated while the remaining 118 genes were down-regulated (Figure 3).

**Figure 2.** Density plot of normalized data using Relative Log Expression (RLE) normalization method based on read count and log2.

**Figure 3.** Number of up- and down-regulated genes after comparison of normalized values using the DESeq2 package.

#### *3.4. Functional Enrichment Analysis of Identified DEGs*

RNA-seq transcriptome analysis successfully identify a total of 261 DEGs. Several functional annotation databases, such as gene ontology (GO) and KEGG pathway using the Database for Annotation, Visualization and Integrated Discovery 6.8 (DAVID) tool, were used to determine the biological function of these identified DEGs. DAVID gene enrichment analysis revealed 28 significant GO terms throughout the differentiation analysis (*p* < 0.05). However, there were only three GO major categories, namely biological process (GOTERM\_BP), cellular component (GOTERM\_CC), and molecular function (GOTERM\_MF) where the significant DEGs were grouped based on their functionality (Figure 4). In this study, 38 significant DEGs were distributed to top 10 GOTERM\_BP, namely antigen processing and presentation of peptide antigen (via MHC class I), immune response, positive regulation of gene expression, negative regulation of gene expression, negative regulation of cell growth, negative regulation of oxidoreductase activity, positive regulation of interferon-gamma production, collagen biosynthetic process, and caveola assembly. In GOTERM\_MF, 21 DEGs were distributed into the following top 5 GO terms; calcium ion binding, calcium-dependent cysteine-type endopeptidase activity, SH3 domain binding, superoxide-generating Nicotinamide Adenine Dinucleotide Phosphate (NADPH) oxidase activator activity, and protein complex scaffold. Meanwhile, 78 DEGs were distributed to the top 7 GOTERM\_CC as follows; integral component of membrane, class I protein complex, extracellular region, proteinaceous extracellular matrix, membrane raft, axon and anchored component of the external side of the plasma membrane.

**Figure 4.** Gene ontology (GO) enrichment analysis of differentially expressed genes (DEGs) in vaccinated cattle, selected based on BVDV type I antibody level (**a**) GOTERM\_Biological Process, (**b**) GOTERM\_Cellular Component, and (**c**) GOTERM\_Molecular Function. GO terms are located on the y-axis and terms with (\*) (\*\*) (\*\*\*) means significant enrichment with a *p*-value of <0.05, 0.01, and 0.001, respectively.

#### *3.5. KEGG Pathway Enrichment Analysis*

To allow deeper understanding of the biological function of significant DEGs, KEGG pathway enrichment analysis was done using DAVID 6.8 tool. Initially, KEGG pathways analysis successfully annotated DEGs into 10 pathways which later reduced into 5 significantly enriched pathways (*P*<*0.05*) namely; cytokine-cytokine interaction, platelet activation, ECM-receptor interaction, hematopoietic cell lineage, and ABC transporters (Table 2). Cytokine-cytokine interaction pathway involved 2 upregulated (IL18, IL1RAP) and 4 down-regulated DEGs (CCR8, CCL3, IL20RA, TGFB2); hematopoietic cell lineage pathway linked 4 upregulated DEGs (GP5, GP1BA, CD24, GP9); platelet activation pathway with 5 up-regulated DEGs (GP5, P2RX1, MAPK12, GP1BA, GP9); ECM-receptor interaction pathway with 3 up-regulated (GP5, GP1BA, GP9) and 1 down-regulated (ITGB4) DEGs and ABC transporters pathway with 1 upregulated (ABCB11) and 2 downregulated DEGs (LOC100296627, CFTR). Results obtained from GO and KEGG analysis indicated that various DEGs were involved in host immune response to inactivated BVDV Type I vaccine.

**Table 2.** List of significantly enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways associated with immune response.


ECM—Extra Cellular Matrix, ABC—ATP Binding Cassette

#### **4. Discussion**

BVDV infection is undeniably detrimental to bovine raisers by reducing milk yield, and is associated with low reproductive performance and growth retardation, allowing the occurrence of other disease potentiators, premature culling, and a high rate of mortality to young stock [20]. Houe et al. [20] and Carman et al. [21] estimated that national herd could experience economic loss ranging between \$10 million and \$40 million per million calvings, and \$40,000–\$100,000 (USD) per herd, respectively. Thus, to prevent such negative effects of BVDV, the development of cost-effective controls, including vaccines and eradication schemes were considered [22]. However, despite effective control programs, BVDV infection remains rampant in most cattle herd worldwide. Evidence reveals that variability of the BVDV strains, cross placental ability of the virus leading to persistent infections, wide spectrum of susceptible hosts, and the ability to interfere both innate and adaptive immunity makes prevention and control, such as vaccination, less effective [4,23].

Recently, similar studies that used next-generation sequencing technology (NGS) purported various DEGs related to animal immune response against viral diseases, providing a deeper understanding of immune responses. Since the development of microarray-based analysis and completion of the Human Genome Project, more advanced sequencing technology has come about, such as RNA-Seq based transcriptome analysis [24]. Compared to DNA microarray-based technology, RNA–Seq provide greater dynamic range by directly revealing sequence identity crucial for annotation quantification of unknown genes and novel transcript isoforms [25,26]. In studies by Li et al., Singh et al., and Barreto et al. [4,14,15], RNA-Seq based transcriptome analyses were used to successfully identify both up- and down-regulated genes related to the host immune response during BVDV, bluetongue virus of sheep and goats, and bovine papillomatosis infection, respectively.

In this study, cattle vaccinated with inactivated multivalent vaccine (BVDV type I, BRSV, Myxovirus parainfluenza type 3 (PI3), *Haemophilus somnus* bacterin) were evaluated days after the last vaccination, to identify and understand changes in gene expressions related to the immune response brought by the vaccine. Specifically, this study examines the only animal with a high and low BVDV type I antibody level; thus, DEGs identified in the transcriptome analysis were highly attributed to the immune response of the animal to inactivated BVDV type I vaccine.

Vaccination is considered an effective tool in preventing and controlling infectious diseases involving the cooperative action of innate and adaptive immunity [27]. Innate immunity plays a key role in triggering adaptive immune response by involving hematopoietic cells, such as macrophages, mast cells, neutrophils, eosinophils, dendritic cells, natural killer cells, and non-hematopoietic cells, such as skin and epithelial linings of the gastrointestinal, genitourinary, and respiratory tract [28]. Meanwhile, adaptive immunity plays its vital role in the immune system as it involves a tightly regulated interaction between antigen -presenting cells and T and B lymphocytes that facilitate pathogen-specific immunologic effector pathways, immunologic memory, and regulation of host immune homeostasis [29].

In this study, functional enrichment analysis revealed upregulated DEGs related to both innate and adaptive immune responses, such as BoLA, IL18, and BCL3. Among identified immune-related genes, the bovine lymphocyte antigen (BoLA) caught the attention of the researcher, as it was directly involved in antigen presentation. The BoLA gene located on chromosomes BTA 23 [30], and generally known as the MHC of cattle, was reported to play an integral role in immune responsiveness and susceptibility to the diseases of the host animal [31]. MHC is a cell surface glycoprotein molecule, having the binding ability to foreign peptides, such as viral proteins, and provides context for the recognition of T-lymphocytes responsible for cell-mediated immunity [32,33]. In studies conducted by Gutierrez et al. [34] and Weigel et al. [35], it was discovered that MHC genes are strongly associated with disease resistance and susceptibility to a wide range of diseases; thus, it can be a natural strategy in controlling infectious diseases, by incorporating it to the selection index and in genetic manipulation techniques.

The IL18 and Bcl-3 gene was also identified, upregulated in this study; IL18 gene play an important role in the T-cell-helper type 1 (Th1) and are involved in the regulation of innate and adaptive immune response by inducing IFN-gamma in natural killer cells (NKC) and T helper (Th1) lymphocytes [36,37]. Primary precursors of IL-18 are expressed in epithelial cells of the body, while the primary sources are macrophages and dendritic cells [38]. In a study conducted on laboratory mice, IL-18 was considered an effective adjuvant by enhancing immunogenicity through its relevant activities, such as activator of NK cells, a strong stimulator of Th1 responses, and other immunoactive cytokines in Th1 cells, monocytes, and NK Cells [38].

Whereas, the BCL3 is a proto-oncogene member of the IkB family, and also reportedly plays an important role in immune responses. In a study by Schwarz et al., it was purported that the Bcl-3 protein interacts specifically with the NFkB subunits (p50 and p52). It was also reported that mice lacking the Bcl-3 gene exhibit normal development and immunoglobin levels, but the humoral immune response was severely affected, and fail to produce antigen-specific antibodies [39]. Further, in the study of Fredericksen et al. [40], it was observed that the BVDV-1 infected Madin-Darby bovine kidney cell line induces immune marker production, such as BCL3, IL-1, IL-8, IL-15, IL-18, Mx-1, IRF-1, and IRF-7 through the NF-kB signaling pathway. Furthermore, Carmody et al. [41] reported that Bcl-3 limits the strength of toll-like receptors (TLRs) that are responsible for triggering inflammatory cytokines production and development of both adaptive and innate immunity through p50 subunit ubiquitination stabilization. Thus, this limitation of TLR responses might be responsible for the downregulation of other DEGs related to inflammatory responses.

Enrichment analysis through KEGG pathways of DEGs was done to understand signal transduction pathways activated and repressed by inactivated antigen (vaccine). Results of KEGG pathway enrichment analysis revealed five (5) significantly enriched pathways, such as platelet activation, cytokine-cytokine receptor interaction, ECM receptor interaction, hematopoietic cell lineage, and ABC transporters. Among identified significant pathways, cytokine-cytokine receptor interaction involved the most number of DEGs. The cytokine-cytokine (c-c) receptor interaction plays a vital role in health during immunological and inflammatory responses to diseases through the synergistic convergence of signaling pathways and divergence of the cytokine signal, which activates another cytokine system [42]. In this study, six (6) significant DEGs under c-c receptor interaction pathways were identified, namely; CCR8, CCL3, IL20RA, TGFB2, IL18, and IL1RAP with only the last two (2) DEGs identified as upregulated (IL18, IL1RAP). Downregulation of other DEGs belonging to c-c receptor interaction and linked to TLR may be attributed to previously reported upregulation of Bcl-3, which limits the duration of TLR responses that control deleterious inflammatory diseases. However, further studies are warranted to fully support this claim.

Another significantly enriched pathway observed in this study is the extracellular matrix (ECM) receptor interaction pathway, which includes four (4) upregulated DEGs, namely, glycoprotein (GpV), GpIba, GpIX, and ITGB4. Briefly, ECM is a non-cellular component found in all tissue and organs, providing cellular constituents its physical framework, and it also plays a vital role in tissue morphogenesis, differentiation, and homeostasis by initiating crucial biochemical and biomechanical signals [43]. Additionally, ECM conveys specific signals to cells resulting in the modulation of basic functions that are important for the early steps of inflammation, particularly the migration of immune cells during tissue inflammation and immune cell differentiation [44]. These functions of ECM are believed to be mediated primarily by integrins under the family of cell surface receptors [45]. In support, Kroll et al. [46] and Englund et al. [47] reported that platelet membranes, such as GpIb and GpIX, when bound to the von Willebrand factor (vWF), would help transmit signals to the platelet that leads to platelet activation and adhesion. Whereas, the platelet glycoprotein (GP) Ib-IX-V complex is responsible for platelet rolling and adhesion to the site of injury [48]. As such, the literature suggests that the upregulation of integrin subunit beta 4 (ITGB4), GpV, GpIba, and GpIX in this study might be involved in the immune-related functions of ECM.

Fascinatingly, enrichment of ECM receptor pathways supported the succeeding enriched pathways, such as the hematopoietic cell lineage and platelet activation pathways. In a study by Klein [49], it was reported that the ECM matrix molecules (collagen, proteoglycans, and glycoproteins) are part of the bone marrow microenvironment that plays a very significant role in promoting hematopoietic cell proliferation and differentiation. Thus, the upregulation of all DEGs under the hematopoietic cell lineage pathway (GP5, GP1BA, CD24, GP9) supports the observed upregulation of some DEGs under the ECM pathway.

On the other hand, the platelet activation pathway, which are believed to be related to ECM glycoprotein, was reported to be triggered during viral antigen-antibody complexes, from which virus-induced platelet activation can modulate platelet count that help shape immune response through their released products that suppressed infection [50]. Under this pathway, there were four identified upregulated DEGs, namely, GP5, P2RX1, GP1BA, and GP9, with only MAPK12 as down-regulated.

Another important enriched pathway is the ATP-binding cassette (ABC) transporter, which purportedly plays a crucial role in adaptive immunity by its ability to shuttle degrade proteasomal products into the endoplasmic reticulum (ER), which then loaded to MHC class I before antigen presentation on the cell surface [51,52]. In the study of Hinz and Tampé [53], it was also reported that transporters associated with antigen processing (TAP) could be challenged with a number of viral factors, which prevent antigen translocation and loading MHC class I in virally infected cells. Thus, this literature suggests that the upregulation of ABCB11 (ABC transporter pathways) previously observed in this study was associated with the development of adaptive immunity against BVDV Type I.

Another interesting pathway that tended to be significantly enriched was the type 1 diabetes mellitus pathway. This information catches the attention of researchers due to a previous report that cattle infected with the BVD-mucosal disease virus can induce insulin-dependent diabetes mellitus [54].

#### **5. Conclusions**

The results of this study showed significantly identified DEGs under different immune-related gene ontologies and signaling pathways in response to BVDV type 1 antigen. These observed findings will surely provide assistance by enlightening end-users and other researchers on the changes happening in the animal immune system brought by vaccination. In addition, the potential inclusion of DEGs to animal improvement programs, such as breeding, selection, and genetic manipulation techniques will surely help improve the efficacy of the vaccine. Furthermore, the DEGs, annotation, and pathways identified in this study can be utilized for future studies concerning the immune response of cattle to vaccines.

**Author Contributions:** Conceptualization, K.S.; Data curation, K.S.; Formal analysis, B.I.L. and D.L.; Methodology, B.I.L., S.H. and K.S.; Project administration, K.S.; Supervision, K.S.; Visualization, S.H. and K.S.; Writing—original draft, K.G.S.; Writing—review & editing, B.I.L. and K.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by the Cooperative Research Program for Agriculture Science and Technology Development (Project No. PJ012704012019), Rural Development Administration, Republic of Korea. Bryan Irvine Lopez was supported by the 2020 RDA Research Associate Fellowship Program of the National Institute of Animal Science, Rural Development Administration, Republic of Korea.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Genetics of Arthrogryposis and Macroglossia in Piemontese Cattle Breed**

#### **Liliana Di Stasio 1,\*, Andrea Albera 2, Alfredo Pauciullo 1, Alberto Cesarani 3, Nicolò P. P. Macciotta <sup>3</sup> and Giustino Gaspa <sup>1</sup>**


Received: 7 September 2020; Accepted: 18 September 2020; Published: 24 September 2020

**Simple Summary:** The study was carried out in order to investigate the genetic background of arthrogryposis and macroglossia in the Piemontese cattle breed, for which limited information is available so far. The genotyping of affected and healthy animals with a high-density chip and the subsequent genome-wide association study did not evidence a single strong association with the two pathologies. Therefore, for arthrogryposis, the results do not support the existence of a single-gene model, as reported for other breeds. Rather, 23 significant markers on different chromosomes were found, associated to arthrogryposis, to macroglossia, or to both pathologies, suggesting a more complex genetic mechanism underlying both diseases in the Piemontese breed. The significant single nucleotide polymorphisms (SNPs) allowed the identification of some genes (*NTN3*, *KCNH1*, *KCNH2*, and *KANK3*) for which a possible role in the pathologies can be hypothesized. The real involvement of these genes needs to be further investigated and validated.

**Abstract:** Arthrogryposis and macroglossia are congenital pathologies known in several cattle breeds, including Piemontese. As variations in single genes were identified as responsible for arthrogryposis in some breeds, we decided: (i) to test the hypothesis of a similar genetic determinism for arthrogryposis in the Piemontese breed by genotyping affected and healthy animals with a high-density chip and applying genome-wide association study (GWAS), *F*ST and canonical discriminant analysis (CDA) procedures, and (ii) to investigate with the same approach the genetic background of macroglossia, for which no genetic studies exist so far. The study included 125 animals (63 healthy, 30 with arthrogryposis, and 32 with macroglossia). Differently from what reported for other breeds, the analysis did not evidence a single strong association with the two pathologies. Rather, 23 significant markers on different chromosomes were found (7 associated to arthrogryposis, 11 to macroglossia, and 5 to both pathologies), suggesting a multifactorial genetic mechanism underlying both diseases in the Piemontese breed. In the 100-kb interval surrounding the significant SNPs, 20 and 26 genes were identified for arthrogryposis and macroglossia, respectively, with 12 genes in common to both diseases. For some genes (*NTN3*, *KCNH1*, *KCNH2*, and *KANK3*), a possible role in the pathologies can be hypothesized, being involved in processes related to muscular or nervous tissue development. The real involvement of these genes needs to be further investigated and validated.

**Keywords:** Piemontese breed; arthrogryposis; macroglossia; genetic model

#### **1. Introduction**

Arthrogryposis and macroglossia have long been known as congenital abnormalities observed in several cattle breeds [1,2]. Arthrogryposis is characterized by joints contractures with different degrees of severity, which can affect one to four legs, with various associated clinical signs, the most frequent being cleft palate [3]. More than one etiologic event, such as plant toxicosis [4], prenatal viral infections, and a possible hereditary component, have been reported as responsible for the disease occurrence [5]. Less information is available for macroglossia, which consists in the swelling of the tongue that may interfere with the calf's ability to nurse. The defect is thought to have a genetic basis, but no scientific evidence is available so far.

For both defects, double muscling is considered as a predisposing factor. Already back in the 1963, Lauvergne et al. [6] listed rickets-like troubles and macroglossia among the clinical signs displayed by the hypertrophied animals. The observation that the manipulation of the myostatin gene and, more specifically, the downregulation of its expression resulted in a series of adverse effects, including leg problems and macroglossia, which seems to confirm the negative influence of double muscling [7]. Moreover, macroglossia is one of the primary features of the human Wiedemann-Beckwith syndrome (OMIM 130650), which is clinically similar to muscular hypertrophy in cattle [8].

Both arthrogryposis and macroglossia have been reported for decades in the hypertrophied Piemontese cattle breed. Since the end of 1980s, the National Association of the Piemontese cattle Breeders (ANABORAPI) started to select against these two pathologies by culling Artificial Insemination (AI) bulls with a high percentage of affected progeny. A decrease from 2.74% to 0.34% and from 2.36% to 0.28% in the occurrence of arthrogryposis and macroglossia, respectively, were obtained in the period 1990–2017 (Supplemental Figure S1) as a consequence of this selection strategy (ANABORAPI).

These data seem to support the hypothesis of a genetic background for the defects, but the few investigations in the Piemontese breed did not give conclusive results. Huston et al. [3] suggested that, in the Piemontese, arthrogryposis could be determined by an incompletely penetrant recessive allele, with higher penetrance in males, which seems to be consistent with the ANABORAPI data (Supplemental Figure S1). However, a genome-wide association study carried out on Piemontese calves affected by arthrogryposis and macroglossia genotyped with a medium density (50 K) single nucleotide polymorphism (SNP) BeadChip did not detect clear signals of association for both pathologies [9]. On the contrary, recent studies detected variations in single genes as responsible for arthrogryposis in Angus [10], Swiss Holstein [11], Belgian Blue [12], and Red Danish [13] cattle breeds.

Therefore, the aims of this study were: (i) to test the hypothesis of a similar monogenic determinism for arthrogryposis in the Piemontese cattle breed by genotyping affected and healthy animals with a high-density chip never used in previous studies and applying genome-wide association study (GWAS), *F*ST and canonical discriminant analysis (CDA) procedures, and (ii) to investigate with the same approach the genetic background of macroglossia, for which no information is available so far.

#### **2. Materials and Methods**

#### *2.1. Ethics Statement*

No animals were used in the present study. The biological samples belonged to collections available from the ANABORAPI institutional activity. For this reason, the Animal Care and Use Committee approval was not necessary.

#### *2.2. Animals, Genotyping, and Data Editing*

Animals affected by arthrogryposis or macroglossia were found and sampled by the veterinarians of the ANABORAPI during the routine inspections in the farms registered in the Herd Book of the Piemontese breed. The phenotypic expression of arthrogryposis in the Piemontese breed is very variable, ranging from moderate contracture of the legs to more severe expressions that can be only surgically corrected. As this variability could represent a confounding factor, only animals with

extreme expressions of the defect (Supplemental Figure S2a) were considered. Moreover, as in about 3% of the affected animals the two pathologies coexist, only animals affected by a single pathology, arthrogryposis or macroglossia (Supplemental Figure S2), were included in the study, in order to avoid a further confounding effect.

Blood samples were collected from a total of 98 Piemontese male veals: 17 affected by arthrogryposis (Ar), 18 affected by macroglossia (Ma), and 63 healthy (He). The Ar, Ma, and He subjects were distributed in 17, 15, and 59 herds, respectively. All these animals were genotyped with the customized GeneSeek® Genomic Profiler™ Bovine 150 K (Lincoln, NE, USA). As it was difficult to collect a larger number of affected animals for their low incidence due the selection policy, we decided to include 31 additional subjects (16 affected by arthrogryposis and 15 by macroglossia) previously genotyped with the Illumina® BovineSNP50v2 (San Diego, CA, USA). Both 150 K and 50 K markers were mapped on the ARS-UCD1.2 and subjected to quality checks (QC). The QC was performed using PLINK v.1.923 [14] independently for each dataset using the following filters: SNPs with missing rate > 0.02, minor allele frequency (MAF) < 0.05 or deviating from Hardy-Weinberg equilibrium (*p*-values < 10<sup>−</sup>6) were discarded; moreover, only autosomal SNPs with known genomic positions were considered for further analysis. On the subjects' side, both SNP missingness per individual (>0.02) and individual heterozygosity deviations caused an animal to be removed from the datasets. After data editing, 94 and 31 veals were left in 150 K and 50 K, respectively. Then, the default settings of Beagle software [15] were applied for imputing the 50-K genotypes at 150 K on the whole dataset, obtaining a group of 125 animals (63 He, 30 Ar, and 32 Ma). This dataset was then split into two subsets of 93 (He + Ar) and 95 (He + Ma) individuals each that underwent a new round of QC with the aforementioned settings. A total of 100,791 and 100,907 SNPs were analyzed for the Ar and Ma subsets, respectively.

#### *2.3. Statistical Analysis*

A genome-wide association study (GWAS) was conducted using two separate case-control designs for the two syndromes following the approaches of [16,17]. The individuals involved in this study were weakly or not related. The genomic relationship matrix off-diagonal elements were, on average, −0.03, −0.03, and −0.02 for Ar, Ma, and He animals, respectively. The negative relatedness values signify that the individuals have genotypes less similar in expectation than the average. The affected and healthy animals were respectively assigned to cases or controls prior to test the allelic associations (–assoc flag in PLINK) performed by a X2 test. Genomic-control adjusted *p*-values were plotted against the genomic position for the two analyzed diseases. To tackle the multiple testing issue that arises when thousands of hypotheses are simultaneously tested, we decided to consider as significant the associations with false discovery rate (FDR) [18] below 0.05 using the –adjust flag in PLINK [14]. On the same two datasets, *F*ST analysis was conducted using the Weir and Cockerham [19] estimator implemented in PLINK, and those SNPs exceeding the 99.9th percentile threshold were retained. Then, the *F*ST outliers were merged with SNPs resulting from the GWAS, obtaining two sets of SNPs associated to Ar and Ma, respectively. In order to corroborate the GWAS results, the canonical discriminant analysis (CDA) was run on the top significant SNPs associated to both diseases, jointly considering all the animals and using the CANDISC procedures of SAS software (Sas Institute, Cary, NC, USA). The correlation structure between the top ranked SNPs was used to derive new variables, namely canonical discriminant variables (CAN), able to maximize the separation between predefined groups [20]. If S is the n × p matrix of the p SNP genotypes (coded as 0, 1, or 2 for the "aa", "Aa", or "AA" genotypes, respectively) measured in n animals belonging to k groups (three in our cases: Ar, Ma, and He), the ith CAN may be calculated as

$$\text{CAN}\_{\text{i}} = \text{w}\_{\text{i}1}\text{s}\_{1} + \text{w}\_{\text{i}2}\text{s}\_{2} + \dots + \text{w}\_{\text{i}\text{p}}\text{s}\_{\text{p}} \tag{1}$$

where *si* are the centered SNP genotypes and wip are the raw canonical coefficients for the p analyzed SNP (i.e., the weights of each SNP in the discriminant function). The vectors of coefficients w were obtained with a procedure that involves the eigen-decomposition of a linear transformation of betweenand within-group SNP (co)variance matrices [20].

#### *2.4. Candidate Gene Detection*

Annotated genes were retrieved in the region of 100-kilo base pairs (kb) (50-kb down- and upstream, respectively) surrounding each SNP highlighted by the combined use of GWAS and *F*ST. The National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov) and UCSC Genome Browser Gateway (http://genome.ucsc.edu/) databases were used.

#### **3. Results**

#### *3.1. Genome-wide Association Study and FST*

The results of the genome-wide case-control study are shown in Figure 1, which reports the −log10 of genomic-control adjusted *p*-values for the two diseases; in green are highlighted the SNPs that exceeded the chosen threshold. The factor λ was 1.11 and 1.17 for Ar and Ma, respectively, indicating a slight inflation of the statistical test. Quantile-quantile (QQ) plots of the ordered *p*-values for the Ar and Ma case-control GWAS (Supplemental Figure S3) showed that few SNPs strongly deviate from the diagonal identity lines for both diseases.

**Figure 1.** Manhattan plots. (**a**) Healthy vs. arthrogryposis and (**b**) healthy vs. macroglossia. GWAS: genome-wide association study and SNP: single nucleotide polymorphism.

The *F*ST analysis enabled to highlight 102 and 101 outlier SNPs for Ar and Ma, respectively, (Supplemental Tables S1 and S2) that overcame the 99.9th percentile threshold (Ar vs. He: 0.1401 and Ma vs. He: 0.1425) (Supplemental Figure S4). The combined GWAS and *F*ST approaches were able to identify 23 SNPs that were considered in a further analysis. In particular, seven SNPs were exclusively associated with arthrogryposis (on *Bos taurus* chromosomes, BTAs, 7, 8, 24, 25, 28, and 29) and 11 exclusively with macroglossia (on BTAs 1, 2, 7, 16, 17, and 26), whereas five markers (on BTAs 4, 11, 22, and 24) were in common to both pathologies (Table 1). Consistently, those SNPs presented MAF

values markedly deviating in the two groups of animals, as also highlighted by the *F*ST values (from 0.217 to 0.336 and from 0.200 to 0.319 for Ar and Ma, respectively) compared with healthy samples (Supplemental Table S3).

**Table 1.** Relevant markers found by the genome-wide association study (GWAS) and *F*ST analysis and raw canonical coefficients (CC) associated to significant single nucleotide polymorphisms (SNPs). Within-class average scores for the two discriminant functions (Can1 and Can2). CC: canonical coefficients.


<sup>1</sup> *Bos taurus* chromosome; <sup>2</sup> Ar: arthrogryposis and Ma: macroglossia.

#### *3.2. Canonical Discriminant Analysis*

The analysis of the squared Mahalanobis distances computed using the top significant SNPs indicated that affected (Ar or Ma) and healthy controls statistically differ (*p* < 0.001) as well as Ar and Ma did within affected animals (Supplemental Table S4).

The results of the CDA carried out using the significant markers identified by the GWAS and *F*ST are presented in Figure 2 and Table 1. The two canonical functions explained 75% and 25% of the variance, respectively. When the individual samples were plotted in the new coordinate system defined by the first two discriminant functions, a clear, albeit imperfect, separation between the affected (regardless of the pathology) and the healthy subjects was highlighted (Figure 2). The animals with positive CAN1 scores are mostly the affected animals. In addition, CAN2 allowed to separate the subjects affected by arthrogryposis (with negative scores on CAN2) from those affected by macroglossia (with positive scores), as confirmed by the within-class (Ar, Ma, or He) average scores for CAN1 and CAN2 (Table 1).

**Figure 2.** Plot of the individual scores on the two canonical variables (Can1 and Can2) depicted in different colors and symbols according to their health status.

The raw canonical coefficients for the identified SNPs had mainly large positive weights for Ma (ranging from −0.09 to 0.72) and negative weights for Ar, even with some exceptions (from −0.57 to 0.62). A rather elusive pattern was ascertained looking at the most deviating canonical weight; conversely, the selected SNPs jointly were able to discriminate between groups of animals. This seems also confirmed looking at the canonical correlation between the SNP genotype and CAN variables (Supplemental Figure S5).

#### *3.3. Candidate Gene Detection*

The lists of genes that mapped in the interval of 100 kb surrounding the significant SNPs and that may be putatively associated with the diseases are reported in Tables 2 and 3. As for arthrogryposis, a total of 20 genes were identified in the considered interval, and eight of them included a significant marker, while, for macroglossia, 26 genes were mapped in the highlighted regions, with 10 of them including a marker. Of the identified genes, 12 were in common to both investigated diseases (Tables 2 and 3).


*Animals* **2020**

, *10*, 1732


### *Animals* **2020**, *10*, 1732

**Table 3.** Genes near the SNPs associated to

macroglossia.

 The genes including a marker are in bold.

#### **4. Discussion**

The present study provides new data on the genetics of arthrogryposis and the first insight into the analysis of macroglossia in the Piemontese breed. An important systematic bias in GWAS often reported in the literature is caused by population stratification due to ethnic/breed admixture and/or close relationships among individuals of case-control studies [21,22]. In our case, a limited inflation of the statistical tests was observed; thus, a genomic-control approach was adopted, since the individuals included in the design belonged to the same breed and were weakly or not related [21].

Interestingly, the combined use of case-control GWAS, *F*ST, and CDA highlighted several markers potentially associated with the investigated syndromes. The use of multiple approaches is generally advised in genome-wide analysis [17,23,24]. The use of CDA was recently proposed as an effective tool for improving the discovery rate either alone or in a combination with GWAS, especially when the sample size is reduced [25].

As for arthrogryposis, the results depict a situation different from what was observed in the other investigated breeds [10–13], where variations in single genes were identified as responsible for the disease. In fact, our data did not evidence a single strong association with the pathology, while they highlighted a number of significant markers located on different chromosomes, suggesting a polygenic mechanism underlying the disease. The joint role of these markers is supported by their ability to separate the three groups of animals according to their health status.

None of the markers for arthrogryposis identified in the Piemontese breed are located within or near the genes reported as causing the disease in the other breeds. In this respect, it is important to underline that also the causal variations found in those breeds were of different types and in different genes: a large deletion encompassing three genes (BTA 16) in Angus [10], a missense mutation in the *MYBPC1* gene (BTA 5) in Swiss Holstein [11], a splicing variant in the *PIGH* gene (BTA 10) in Belgian Blue [12], and a small deletion in the *CHRNB1* gene (BTA 19) in Red Danish [13]. This implies that the genetic determinism of arthrogryposis is not the same in the affected breeds. On the other hand, it must be considered that a large variability in the phenotypic expression of what is called "arthrogryposis" was observed in the breeds studied so far, from lethal consequences, as in Belgian Blue or Angus breeds, to less severe problems, as in the Piemontese. Additionally, at least six types of arthrogryposis with different clinical signs and grades of severity were reviewed by Huston et al. [3] in cattle. Such heterogeneity makes it difficult to clearly define the trait that could explain the differences observed at the genetic level. In all cases, however, the findings of the different studies are compatible with the autosomal recessive mode of inheritance suggested since the earliest studies. The incidences of the two pathologies in the Piemontese breed in the last decades also showed a trend compatible with the case of selection against the recessive phenotype, and this led us to hypothesize the existence of a monogenic determinism similar to what was observed in the other cattle breeds. However, the present data do not support this hypothesis, suggesting that a more complex mechanism is responsible for the disease in the Piemontese breed.

For macroglossia, no previous genetic data exist. The results of the current study highlight a situation comparable to that obtained for arthrogryposis, so that a multifactorial mechanism can be hypothesized also for macroglossia.

The identified SNPs were located within or close to 33 genes, of which 9 and 13 were exclusive for arthrogryposis and macroglossia, respectively, and 11 common to the two pathologies. This is worthy of note, considering that, in the Piemontese breed, both pathologies are sometimes observed in the same animal. Thus, the findings of this study might suggest that the putative candidate genes common to both diseases could be involved in basic physiological processes common to both defects.

In the case of arthrogryposis, seven of the relevant SNPs mapped in coding genes, whereas, for macroglossia, 11 SNPs were located within coding genes. In some cases, it is unclear from the gene annotations their possible involvement in the pathologies. Instead, for other genes, a possible role can be hypothesized, as their products are part of processes related to muscular or nervous

tissue developments whose defects are included among the common causes of the pathologies here considered [26].

Among these genes, Netrin3 (*NTN3*) encodes a member (NTN3) of a family of extracellular proteins that act as chemotropic guidance cues for migrating cells and axons during neural development [27]. In mice, it was demonstrated that NTN3 is expressed in muscle cells, and therefore, it may play a role in guiding peripheral axons to their corrected muscle targets [28].

Additionally, *KCNH1* (potassium voltage-gated channel subfamily H member 1) and *KCNH2* (potassium voltage-gated channel subfamily H member 2) genes code for proteins that belong to a complex protein superfamily widely distributed during embryonic development and involved in a wide variety of cell functions. In mice, the two genes are co-expressed in the skeletal muscle during embryogenesis, including the cranial, thoracic, and limb regions [29]. In man, KCNH1 was shown to be involved in myoblast fusion, a complex process that includes withdrawal from the cell cycle, cell-cell interactions, adhesion, alignment, and a final membrane fusion to form the multinucleated skeletal muscle fiber [30].

A possible role can be also suggested for the *KANK3* (KN motif and ankyrin repeat domains 3) gene strongly expressed in different body compartments, including the skeletal muscle, and involved in the control of cytoskeleton formation by negatively regulating actin polymerization [31].

#### **5. Conclusions**

The overall findings indicate that the genetic determinism of arthrogryposis and macroglossia in the Piemontese breed is more complex than previously believed. In fact, the results do not support the existence of a single-gene model, while suggesting a multifactorial genetic mechanism underlying the investigated pathologies. Several markers significantly associated with both diseases were found, and genes possibly affecting the traits were identified. The real involvement of these genes needs to be further investigated and validated.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2076-2615/10/10/1732/s1: Figure S1: Trend of arthrogryposis and macroglossia in the Piemontese breed: Incidence from 1990 to 2017. Figure S2: Affected Piemontese veals: (**a**) arthrogryposis and (**b**) macroglossia. Figure S3: Quantile-quantile plot for the case-control genome-wide analysis (in red, the identity line). Figure S4: Distribution of *F*ST values and threshold line of the 99.9th percentile of the ranked *F*ST values for arthrogryposis and macroglossia vs. the healthy control. Figure S5: Canonical correlation between Can1, Can2, and the original SNP results significantly associated with the disease status, represented by different colors. SNPs associated with arthrogryposis (Ar), macroglossia (Ma), or both (Ar\_Ma). Table S1: *F*ST outliers (99.9% of ranked *F*ST, threshold value: 0.140158) from the arthrogryposis vs. healthy control comparison (n = 102). Table S2: *F*ST outliers (99.9% of ranked *F*ST, threshold value: 0.1425518) from the macroglossia vs. healthy control comparison (n = 101). Table S3: Statistics of the significant markers in common between the GWAS and *F*ST analysis. Table S4: Quadratic distance between the animals affected and healthy controls for each comparison.

**Author Contributions:** Conceptualization, L.D.S. and A.A.; formal analysis, G.G., A.C., and N.P.P.M.; investigation, L.D.S., A.A., and A.P.; resources, L.D.S.; data curation, G.G.; writing—original draft preparation, L.D.S. and G.G.; writing—review and editing, L.D.S., A.A., A.P., A.C., N.P.P.M., and G.G.; supervision, L.D.S.; and funding acquisition, L.D.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by MIUR ex60% (DISL\_RILO\_16\_01), Department of Agricultural, Forest and Food Sciences, University of Torino, Grugliasco (Italy).

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*
