*Article* **Methylation of Estrogen Receptor 1 Gene in the Paraspinal Muscles of Girls with Idiopathic Scoliosis and Its Association with Disease Severity**

**Piotr Janusz 1,†, Małgorzata Chmielewska 2,\* ,†, Mirosław Andrusiewicz <sup>2</sup> , Małgorzata Kotwicka <sup>2</sup> and Tomasz Kotwicki <sup>1</sup>**


**Abstract:** Idiopathic scoliosis (IS) is a multifactorial disease with epigenetic modifications. Tissue dependent and differentially methylated regions (T-DMRs) may regulate tissue-specific expression of the estrogen receptor 1 gene (*ESR1*). This study aimed to analyze methylation levels within T-DMR1 and T-DMR2 and its concatenation with ESR1 expression of IS patients. The study involved 87 tissue samples (deep paravertebral muscles, both on the convex and the concave side of the curve, and from back superficial muscles) from 29 girls who underwent an operation due to IS. Patient subgroups were analyzed according to Cobb angle ≤70◦ vs. >70◦ . Methylation was significantly higher in the superficial muscles than in deep paravertebral muscles in half of the T-DMR1 CpGs and all T-DMR2 CpGs. The methylation level correlated with *ESR1* expression level on the concave, but not convex, side of the curvature in a majority of the T-DMR2 CpGs. The T-DMR2 methylation level in the deep paravertebral muscles on the curvature's concave side was significantly lower in patients with a Cobb angle ≤70◦ in four CpGs. DNA methylation of the T-DMRs is specific to muscle tissue location and may be related to *ESR1* expression regulation. Additionally, the difference in T-DMR2 methylation may be associated with IS severity.

**Keywords:** spinal curvatures; scoliosis; idiopathic; DNA methylation; pyrosequencing; estrogen receptor 1; ESR1; scoliosis progression; adolescent idiopathic scoliosis

### **1. Introduction**

The most common spine disorder in adolescents is idiopathic scoliosis (IS), affecting 1–3% of the population. It is a structural, three-dimensional spinal deformity characterized by lateral curvature of the spine, impaired kyphosis or lordosis, and vertebral rotation with a rib hump [1]. IS is a highly heterogeneous condition, with some patients having a rapidly progressive presentation, resulting in severe curves, and others progressing slowly to mild or moderate curves [2]. Progressive scoliosis may result in cosmetic deformity, back pain, and functional deficits as well as psychological problems and impaired social interactions. Severe curvatures are associated with cardiac dysfunction and pulmonary constraints [3–5]. Currently, clinical or radiological criteria cannot adequately predict which children who are diagnosed with mild disease may ultimately undergo subsequent curve progression that requires surgical intervention [6]. Identifying patients at risk of scoliosis, or those at risk of curve progression, is essential for early, appropriate treatment [1,7].

Despite the high prevalence of IS, its etiology remains poorly understood [8]. IS is considered a multifactorial disease with genetic susceptibilities [9]. Many candidate genes potentially associated with IS have been described in family linkage studies, single nucleotide poly-

**Citation:** Janusz, P.; Chmielewska, M.; Andrusiewicz, M.; Kotwicka, M.; Kotwicki, T. Methylation of Estrogen Receptor 1 Gene in the Paraspinal Muscles of Girls with Idiopathic Scoliosis and Its Association with Disease Severity. *Genes* **2021**, *12*, 790. https://doi.org/10.3390 /genes12060790

Academic Editor: Robert Winqvist

Received: 20 April 2021 Accepted: 20 May 2021 Published: 21 May 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

morphisms association studies, and genome-wide association studies [8,10–12]. Results from these suggest IS is a complex genetic disorder [6]. It was postulated that genetic factors are more important in the occurrence of IS while environmental factors have a more significant impact on disease progression [13]. An epigenetic link between genetic and environmental factors may be involved in IS etiopathogenesis [14]. As a new area of research, only a few publications concerning the impact of DNA methylation on IS have been published [15–19]. However, none of these studies have evaluated this mechanism in paraspinal muscle tissues.

Due to the gender-related distribution of idiopathic scoliosis, the role of estrogen hormones in IS occurrence and progression has been suggested [20,21]. Previous studies have reported the effect of estrogens on skeletal muscles, in which the mRNA and protein expression of estrogen receptor 1 and 2 (*ESR1*, *ESR2*) has been demonstrated [22–24]. *ESR1* and *ESR2* expression was confirmed in the superficial and deep paravertebral muscles of patients with IS. Moreover, expression of *PELP1* (proline-, glutamic acid-, and leucinerich protein) was significantly higher in the deep back muscles compared to superficial muscles. This protein participates in estrogen-induced signal transduction pathways. Additionally, *PELP1* expression level was correlated with both the Cobb angle value and *ESR1* expression [25].

DNA methylation is one of the most well-characterized epigenetic modifications. Methylation of CpG islands (CGI), located near the transcription start site of gene promoter and regulatory regions, is associated with altered gene expression [26,27]. Methylation at the gene promoter inhibits recognition and binding of transcription factors. This leads to the recruitment of proteins binding to methylated CpG dinucleotides which, in turn, interact with transcription repressors and activate chromatin condensation by recruiting histone deacetylases. As a result, DNA methylation in CpG site-rich regions, found in close proximity to the promoter region, is thought to play an essential role in gene silencing [28]. It has been indicated that a different methylation status is characteristic for particular types of tissues or the development phase [29]. Although the CpG islands in intragenic and regulatory regions of genes may display a tissue-dependent and differentially methylated region pattern [30], CGIs associated with transcription start sites rarely show tissue-specific patterns of methylation [31].

It was shown that in the case of ESR1, the level of methylation within the promoter is cell-specific [32]. Analysis of the C promoter (Figure 1) in in vitro study indicated that demethylation of this region is responsible for the increased expression of ESR1 [33]. The relationship between regulatory regions methylation of ESR1 and its expression was suggested. It has been reported that ESR1 has tissue-dependent and differentially methylated regions (T-DMRs; Figure 1), which are associated with tissue-specific gene expression [34]. A previous study showed that methylation at T-DMR1 and T-DMR2 is correlated with decreased ESR1 expression in the placenta and skin tissue but not in mammary glands and the endometrium [34]. Maekawa et al. suggested that ESR1 expression is tissue-specific and regulated by DNA methylation at T-DMR1 rather than by DNA methylation at the promoter region [34]. Thus, changes in ESR1 mRNA expression may not correspond with methylation of the ESR1 promoter. Moreover, it was indicated that, in the case of some breast cancer tissues, ESR1 expression might be modulated not only by DNA methylation at T-DMRs and promoter regions but also by different mechanisms that require clarification in future studies [34].

Taking into consideration that methylation level alterations among patients with different IS phenotypes may be associated with susceptibility to disease or disease progression, we therefore analyzed T-DMR1 and T-DMR2 methylation status. Subsequently, the expression level of *ESR1* in the superficial and paraspinal muscles on the convex and concave side of the IS curvature was analyzed and evaluated in relation to methylation status.

*Genes* **2021**, *12*, 790 3 of 15

**Figure 1.** *ESR1* promoters (A-D) and T-DMR1, and T-DMR2 regions localization with respect to transcription start site (TSS) and translation start codon (ATG). EGR1 indicates transcription factor binding sites. Adapted from [34]. **Figure 1.** *ESR1* promoters (A-D) and T-DMR1, and T-DMR2 regions localization with respect to transcription start site (TSS) and translation start codon (ATG). EGR1 indicates transcription factor binding sites. Adapted from [34].

### Taking into consideration that methylation level alterations among patients with **2. Results**

### different IS phenotypes may be associated with susceptibility to disease or disease *2.1. Patient Characteristics*

progression, we therefore analyzed T-DMR1 and T-DMR2 methylation status. Subsequently, the expression level of *ESR1* in the superficial and paraspinal muscles on the convex and concave side of the IS curvature was analyzed and evaluated in relation to methylation status. **2. Results**  *2.1. Patient Characteristics*  The study group consisted of 29 female IS patients (age at surgery: 12.1–17.9 years, mean age: 14.5 ± 1.5 years). The Cobb angle ranged from 52◦ to 115◦ , with a mean of 77.4 ± 16.1◦ . The mean age, number of curvatures, and Risser sign value did not differ significantly between the subgroups of patients with Cobb angles ≤70◦ and ≥70◦ (14.5 ± 1.3 vs. 14.7 ± 1.7, *p* = 0.9; 3 single: 7 double vs. 8 single: 11 double, *p* = 0.7; Me = 4 vs. Me = 4, *p* = 0.7, respectively). The mean Cobb angle value of patients with a Cobb angle ≤70◦ and those ≥70◦ was 61.1◦ ± 6 ◦ and 86◦ ± 12.7◦ , respectively.

### The study group consisted of 29 female IS patients (age at surgery: 12.1–17.9 years, *2.2. DNA Methylation at the ESR1 T-DMR1 and T-DMR2*

mean age: 14.5 ± 1.5 years). The Cobb angle ranged from 52° to 115°, with a mean of 77.4 ± 16.1°. The mean age, number of curvatures, and Risser sign value did not differ significantly between the subgroups of patients with Cobb angles ≤70° and ≥70° (14.5 ± The methylation pattern within T-DMR1 and T-DMR2 of individual patients is shown in Figure 2.

1.3 vs. 14.7 ± 1.7, *p =* 0.9; 3 single: 7 double vs. 8 single: 11 double, *p =* 0.7; Me = 4 vs. Me = 4, *p =* 0.7, respectively). The mean Cobb angle value of patients with a Cobb angle ≤70° and those ≥70° was 61.1° ± 6° and 86° ± 12.7°, respectively. *2.2. DNA Methylation at the ESR1 T-DMR1 and T-DMR2*  The methylation pattern within T-DMR1 and T-DMR2 of individual patients is shown in Figure 2. The methylation level within the *ESR1* T-DMR1 region was significantly higher in the superficial muscle compared to the deep paravertebral muscles at the CpG1 (*p* = 0.0001; Figure 3; Supplementary Table S1) and CpG2 sites (*p* < 0.0001; Figure 3; Supplementary Table S1). The methylation level was significantly higher in the superficial muscle compared to both, the concave (*p* < 0.05; Figure 3; Supplementary Table S1) and convex side of the curvature (*p* < 0.05; Figure 3; Supplementary Table S1). Moreover, in the deep paravertebral muscles, methylation was decreased on the concave side in contrast to the convex side of the curvature. However, the difference was not statistically significant (Supplementary Table S1).

Significant differences in methylation levels of all CpG sites within the *ESR1* T-DMR2 region between superficial and deep paravertebral muscles were observed (*p* < 0.05; Figure 4; Supplementary Table S1). Methylation was found to be significantly higher in the superficial muscle versus the concave (at CpG sites 1–4 and 6–8; *p* < 0.05; Figure 4; Supplementary Table S1) and convex side of the curvature (at all CpG sites; *p* < 0.05; Figure 4; Supplementary Table S1). In contrast to the *ESR1* T-DMR1 region, the methylation level within the T-DMR2 region in the deep paravertebral muscles was lower on the convex side of the curvature in seven of eight CpGs compared to the concave side. However, the difference was not statistically significant (Figure 4; Supplementary Table S1).


*Genes* **2021**, *12*, 790 4 of 15

**Figure 2.** Dot plot of *ESR1* T-DMR1 and T-DMR2 regions methylation pattern of individual patients. **Figure 2.** Dot plot of *ESR1* T-DMR1 and T-DMR2 regions methylation pattern of individual patients.

**Figure 3.** DNA methylation level within *ESR1* T-DMR1 region in deep paravertebral muscles and superficial muscles; \*\* *p* < 0.01, \*\*\* *p* < 0.001. **Figure 3.** DNA methylation level within *ESR1* T-DMR1 region in deep paravertebral muscles and **Figure 3.** DNA methylation level within *ESR1* T-DMR1 region in deep paravertebral muscles and superficial muscles; \*\* *p* < 0.01, \*\*\* *p* < 0.001.

superficial muscles; \*\* *p* < 0.01, \*\*\* *p* < 0.001.

Supplementary Table S1).

**Figure 4.** DNA methylation level within *ESR1* T-DMR2 region in deep paravertebral muscles and superficial muscles; \* *p* < 0.05, \*\* *p* < 0.01, \*\*\* *p* < 0.001. **Figure 4.** DNA methylation level within *ESR1* T-DMR2 region in deep paravertebral muscles and superficial muscles; \* *p* < 0.05, \*\* *p* < 0.01, \*\*\* *p* < 0.001.

### *2.3. Correlation between ESR1 Methylation Levels and Relative Expression of the ESR1 Gene 2.3. Correlation between ESR1 Methylation Levels and Relative Expression of the ESR1 Gene*

The ESR1 relative expression did not differ significantly between the deep paravertebral muscles on both, convex and concave side, and superficial muscles (*p* > 0.05; Supplementary Figure S1). The ESR1 relative expression did not differ significantly between the deep paravertebral muscles on both, convex and concave side, and superficial muscles (*p* > 0.05; Supplementary Figure S1).

Significant differences in methylation levels of all CpG sites within the *ESR1* T-DMR2 region between superficial and deep paravertebral muscles were observed (*p <*  0.05; Figure 4; Supplementary Table S1). Methylation was found to be significantly higher in the superficial muscle versus the concave (at CpG sites 1–4 and 6–8; *p <* 0.05; Figure 4; Supplementary Table S1) and convex side of the curvature (at all CpG sites; *p <* 0.05; Figure 4; Supplementary Table S1). In contrast to the *ESR1* T-DMR1 region, the methylation level within the T-DMR2 region in the deep paravertebral muscles was lower on the convex side of the curvature in seven of eight CpGs compared to the concave side. However, the difference was not statistically significant (Figure 4;

On the concave side of the curvature, a significant, moderate, and positive correlation was observed between *ESR1* mRNA expression and methylation level at the CpG1 dinucleotide in the T-DMR1 region and at six CpG sites (CpG2-CpG8) in the T-DMR2 region (R ranged from 0.44 to 0.59; *p <* 0.05; Figure 5). No correlation between *ESR1* expression and methylation level within the T-DMR1 and T-DMR2 regions was found either in the superficial muscle or on the convex side of thoracic scoliosis (*p >* 0.05; Figure 5; Supplementary Figures S2 and S3). On the concave side of the curvature, a significant, moderate, and positive correlation was observed between *ESR1* mRNA expression and methylation level at the CpG1 dinucleotide in the T-DMR1 region and at six CpG sites (CpG2-CpG8) in the T-DMR2 region (R ranged from 0.44 to 0.59; *p* < 0.05; Figure 5). No correlation between *ESR1* expression and methylation level within the T-DMR1 and T-DMR2 regions was found either in the superficial muscle or on the convex side of thoracic scoliosis (*p* > 0.05; Figure 5; Supplementary Figures S2 and S3).


**Figure 5.** Correlation between *ESR1* expression and methylation level within T-DMR1 and T-DMR2 regions in deep paravertebral muscles and superficial muscles. R—Spearman rank correlation coefficient.

### *2.4. Association between Methylation Status of ESR1 and Cobb Angle*

In the deep paravertebral muscle, the methylation level within the *ESR1* T-DMR2 region on the concave side of the curvature was significantly different between groups of patients with a Cobb angle >70◦ or ≤r0◦ at four CpG sites: CpG2 (*p* = 0.02; Figure 6; Supplementary Table S2), CpG3 (*p* = 0.04; Figure 6; Supplementary Table S2), CpG4 (*p* = 0.04; Figure 6; Supplementary Table S2) and CpG 6 (*p* = 0.005; Figure 6; Supplementary Table S2). There was no difference in the *ESR1* T-DMR1 region methylation level between groups of patients with a Cobb angle ≤70◦ or >70◦ (*p* > 0.05; Supplementary Table S2). No differences were observed in T-DMR1 methylation levels between groups of patients with Cobb angles ≤70◦ and >70◦ (*p* > 0.05; Supplementary Table S2).

No correlation was found between T-DMR1 region methylation level and Cobb angle value in either the superficial or deep paravertebral muscle tissues (r ranged from 0.02 to 0.23; *p* > 0.05). Examining the concave side of thoracic scoliosis, a significant, moderate and positive correlation between T-DMR2 methylation and Cobb angle was observed at CpG2 (R = 0.44; *p* = 0.02) and CpG6 (r = 0.5; *p* = 0.005). There was no significant correlation between T-DMR2 methylation and Cobb angle in the superficial muscles (CpG2 and CpG4- CpG8, R ranged from 0.03 to 0.27 (*p* > 0.05); CpG1 and CpG3, r ranged from 0.12 to 0.14 (*p* > 0.05)) or in the deep paravertebral muscles on the convex side of the curvature (CpG1, CpG2, CpG6 and CpG8, R ranged from 0.03 to 0.24 (*p* > 0.05); CpG3-CpG5 and CpG7, r ranged from 0.07 to 0.26 (*p* > 0.05)).

**Figure 5.** Correlation between *ESR1* expression and methylation level within T-DMR1 and T-DMR2 regions in deep paravertebral muscles and superficial muscles. R—Spearman rank

In the deep paravertebral muscle, the methylation level within the *ESR1* T-DMR2

region on the concave side of the curvature was significantly different between groups of patients with a Cobb angle >70° or ≤r0° at four CpG sites: CpG2 (*p =* 0.02; Figure 6; Supplementary Table S2), CpG3 (*p =* 0.04; Figure 6; Supplementary Table S2), CpG4 (*p =*  0.04; Figure 6; Supplementary Table S2) and CpG 6 (*p =* 0.005; Figure 6; Supplementary Table S2). There was no difference in the *ESR1* T-DMR1 region methylation level between groups of patients with a Cobb angle ≤70° or >70° (*p >* 0.05; Supplementary Table S2). No differences were observed in T-DMR1 methylation levels between groups of patients

*2.4. Association between Methylation Status of ESR1 and Cobb Angle* 

with Cobb angles ≤70° and >70° (*p >* 0.05; Supplementary Table S2).

**Figure 6.** DNA methylation level within *ESR1* T-DMR2 region in deep paravertebral muscles and superficial muscles in patients with Cobb angles ≤70° and >70°; \* *p* < 0.05, \*\* *p* < 0.01. **Figure 6.** DNA methylation level within *ESR1* T-DMR2 region in deep paravertebral muscles and superficial muscles in patients with Cobb angles ≤70◦ and >70◦ ; \* *p* < 0.05, \*\* *p* < 0.01.

### **3. Discussion**

correlation coefficient.

No correlation was found between T-DMR1 region methylation level and Cobb angle value in either the superficial or deep paravertebral muscle tissues (r ranged from 0.02 to 0.23; *p >* 0.05). Examining the concave side of thoracic scoliosis, a significant, moderate and positive correlation between T-DMR2 methylation and Cobb angle was observed at CpG2 (R = 0.44; *p =* 0.02) and CpG6 (r = 0.5; *p =* 0.005). There was no significant Although the history of IS has been thoroughly described and treatment methods are established, the exact etiology and pathology have yet to be elucidated [1,8,35]. IS has been extensively analyzed with respect to susceptibility to scoliosis development and curvature progression, with various theories concerning IS etiology suggested. Recently, genetic studies revealed an important association between DNA polymorphisms and disease susceptibility and severity [36–38]. Despite promising results, these studies did not provide insight to any IS predisposition nor provide a molecular explanation of the disease. It has become increasingly apparent that many diseases are likely the result of interactions between genes and the environment [2]. According to Grauers et al., 38% of the variance in the liability of IS development is due to additive genetic effects and 62% to unique environmental effects [39]. Thus, one of the most interesting hypotheses regarding IS etiology is the linkage of genetic susceptibilities to environmental factors.

Several epigenetic studies concerning the etiopathogenesis of IS have been conducted. As of now, five of them described DNA methylation as an epigenetic mechanism associated with IS. Mao et al. evaluated methylation levels of the cartilage oligomeric matrix protein gene. They found that hypermethylation of the gene promoter correlated with adolescent idiopathic scoliosis (AIS) curve severity [16]. Shi et al. published two studies concerning DNA methylation in AIS and revealed an association of paired-like homeodomain 1 and protocadherin-10 gene methylation with IS susceptibility and curvature severity [17,18]. Meng et al. conducted analysis of the whole-genome methylation in two pairs of twins. They found an association between methylation levels at site cg01374129 and curve severity [19]. Liu et al. also performed whole-genome methylation analysis in a pair of twins. They discovered several signaling pathways potentially associated with AIS and a significantly higher methylated region in chromosome 15 of the AIS group [15]. All mentioned studies concerning DNA methylation were performed with peripheral blood samples. In the search for the molecular explanation of IS, we analyzed the local molecular predisposition to IS occurrence or progression at the apex of the curvature. We focused on paraspinal muscles as a possible target tissue for locally acting factors as these muscles play a key role in controlling spinal stability [2]. There is a hypothesis that dysfunctional paraspinal muscles may contribute the development of the scoliotic curve [2,40]. Additionally, reports have described functional and histological differences in the paraspinal muscles between the convex and the concave sides of the curve in IS patients [41,42].

An interesting feature of IS is the correlation of disease severity with gender, especially after puberty. The female/male ratio in mild scoliosis is reported to be 1.4/1, while in severe scoliosis, it is estimated to be 8.4/1 [13,20,43]. This shift suggests a relationship between sex hormones with a clinical manifestation of IS [43,44]. As *ESR1* and *ESR2* are known to mediate the effects of estrogens, they became the subject of genetic studies concerning DNA polymorphisms of *ESR1* and *ESR2* in IS. Although early studies were promising, they failed to be replicable in subsequent studies [45,46]. A recent cross-sectional study revealed that some *ESR1* and *ESR2* variants were associated with the occurrence risk of idiopathic scoliosis [12]. Meta-analysis performed by Sobhan et al. suggested that *ESR1* polymorphisms rs9340799 and rs2234693 are not related to the risk of IS occurrence. However, rs9340799 may be associated with the risk of developing AIS among the Asian population [47]. Due to the unknown role of estrogens and their receptors in IS etiology, we evaluated the gene methylation status of estrogen receptors in IS.

In our study, we found differences in methylation levels between the deep paravertebral muscles (*m. longissimus*) and the superficial muscles (*m. trapezius*) in two CpGs of T-DMR1 and in all CpGs of T-DMR2. We consider the superficial muscle as a control, due to its distance from deformation, anatomical borders (fascia layers) between it and the deep muscles, different embryogenesis and function from the deep muscles, and a disparate nerve supply. Thus, this observed difference in methylation supports the theory of distinctive methylation patterns depending on localization in the same tissue type. Slieker et al. identified, using genome-wide DNA methylation data, that there are T-DMRs in CpG-poor regions such as CGI shores or distal promoters, which are associated with aberrant transcription. Interestingly, the authors observed interindividual variation of DNA methylation for more than 8000 CpGs in the skeletal muscle tissue and within-individual methylation differences between muscle and blood tissues for over 2000 CpGs [48]. Therefore, the interpretation of methylation patterns for tissues representing cellular heterogeneity, such as skeletal muscle, is particularly complex. It is also a challenge in comparative tissue research [48–50]. According to Maekawa et al., *ESR1* has tissue-dependent and differentially methylated regions (T-DMRs), which are associated with tissue-specific gene expression [34].

Our results did not reveal a difference in DNA methylation between the concave and convex side of paravertebral muscles when all patients were considered together. However, we found a difference in DNA methylation between patients with a Cobb angle ≤70◦ and >70◦ in the T-DMR2 region at the concave side of the curvature. Moreover, in CpG2 and CpG6 in the T-DMR2 region, the level of methylation at the concave side of the curvature correlated with the Cobb angle value. According to the abovementioned studies concerning the differences between paraspinal muscles at the apex of the curvature, this side was significant in relation to the etiopathogenesis of IS [51–53]. The absence of this correlation in the superficial and paravertebral muscles on the convex side in any of the evaluated CpGs also supports the importance of the concave side of the curvature in predisposition to IS progression. Thus, we interpret these results as a lack of association of *ESR1* methylation with a predisposition to IS development and consider methylation status as an IS phenotype modifier rather than the direct molecular background. These results are in line with the opinion of Cheung et al. who state that some factors may contribute to curve progression while others contribute to curve initiation [54]. According to Leboeuf et al., estrogens may be considered contributing factors in the progression of scoliosis [55]. Our results support this hypothesis from an epigenetic point of view.

When debating on changes of the paravertebral muscles at the level of the concave side of the curvature, there exists a dilemma regarding the primary versus secondary nature due to impaired spine biomechanics (asymmetrical loading). It is possible that the difference in methylation may contribute to some asymmetry in muscle function and promote curvature progression. On the other hand, the presented results may be a consequence of exposure to different local mechanical conditions due to asymmetric loading or other unknown factors. In our opinion, the results of this study most likely reveal primary changes. The impact of asymmetrical loading or other factors on the methylation level should be detectable in all evaluated CpGs, not only in specific ones. To evaluate the direct impact of DNA methylation on IS progression, two patient subgroups were distinguished. The patients were divided according to disease severity and its possible impact on the patient health [35,56,57]. The skeletally mature patients with a Cobb angle between 50◦ and 70◦ require surgical scoliosis correction to avoid further curvature deterioration into adulthood [35,56,57]. Whereas severe curvatures can negatively impact patient health including outcomes such as decreased lung function, cardiac function, back pain, and degenerative spine disease [3,5,57]. There is no solid threshold for Cobb angle value when curvature significantly impacts the patients' health. Studies concerning surgical treatment of IS classify scoliosis as a severe when a curvature exceeds 70◦ in Cobb angle [58–60]. Thus, we used this value to categorize study subgroups.

Contrary to previous studies, we observed a positive correlation between *ESR1* expression and methylation level within regulatory regions. Maekawa et al. showed that *ESR1* expression was tissue specific and downregulated by DNA methylation at T-DMRs in normal tissues but not always in breast cancer. They have also evaluated the expression level of different *ESR1* variants and suggested that there is interplay between DNA methylation of T-DMRs and regions around upstream exons [34]. Our result, different from those of Maekawa [34], may be explained by the fact that we examined all transcription variants in one quantitative reaction. Additionally, we did not evaluate methylation of promoter regions but only both T-DMRs. Moreover, it is indicated that T-DMR methylation may modulate the availability of DNA sequences for methylation-dependent transcription factors [61]. Those findings are in line with results presented by Maekawa et al., who identified that EGR1 (early growth response protein 1) may be the potential transcription factor that binds to the T-DMRs and, as a result, upregulates *ESR1* expression [34]. It has also been suggested that there are T-DMRs negatively and positively correlated with gene expression depending on genomic localization [61].

Direct comparison of our results with other published studies concerning methylation of DNA in IS was challenging due to different tissue samples used for evaluation. Peripheral blood is a very good source of DNA when polymorphisms are considered. However, the methylation level obtained from the blood will only show the methylation level of whole DNA without specific local disturbances. Thus, a strong point of this study was the evaluation of tissues at the center of the pathology, thereby bringing forth new facts about the impact of *ESR1* DNA methylation on the IS phenotype.

Our study reveals new aspects concerning IS etiopathogenesis. It develops a further explanation of why some IS progress more often than others. A better understanding of the pathology improves the diagnosis and treatment methods. However, the direct clinical implications of this study are limited. Genetic studies aim to develop a test that may help distinguish the prognosis between patients with severe disease from a more benign condition. We hope that further studies beyond our results can be useful in the development of such a test.

The main limitation of our study is the lack of a healthy control group. It was impossible to obtain paraspinal muscles samples from healthy, age-matched females. We considered harvesting muscle samples from patients who have undergone a surgery due to degenerative spine disease. However, the vast difference in patient age and the muscle atrophy due to long-lasting degenerative spine disease may induce unknown methylation changes. This would be possible introduction of bias rather than a reliable evaluation of methylation impact on the etiology of IS. Another limitation in this study was sample size. However, it is comparable with other studies evaluating DNA methylation in IS even though the other studies were performed on blood samples [15–19].

### **4. Materials and Methods**

### *4.1. Patient Population*

The study group consisted of 29 girls who underwent an operation due to IS between January 2017 and December 2019 at the Department of Spine Disorders and Pediatric Orthopedics in Poznan University Hospital. All patients met the following inclusion criteria: (1) confirmed diagnosis of IS (other backgrounds of scoliosis were excluded); (2) no coexisting genetic, neurological, or orthopedic disorders; (3) thoracic location of the main curvature; (4) surgical treatment with posterior spinal instrumentation and fusion. All patients underwent clinical and radiological examinations, including long-cassette standing X-rays taken prior to surgery. Number, localization, and curvature size (Cobb angle) was measured [62]. Skeletal maturity was assessed by the Risser sign [63]. One experienced spine surgeon performed all measurements. Patients were divided into two subgroups according to final disease severity at skeletal maturity: scoliosis of equal to or less than 70◦ vs. greater than 70◦ as measured with the Cobb angle. The first subgroup (Cobb ≤ 70◦ ) consisted of 10 patients without a major risk of significant impact on cardio-pulmonary function in adulthood. The second subgroup (Cobb > 70◦ ) consisted of 19 patients with severely progressive IS that possibly may impact cardio-pulmonary function.

### *4.2. Tissue Samples*

During surgery, 1 cm<sup>3</sup> muscle tissue fragments were obtained from one deep paravertebral muscle (*m. longissimus thoracic*) on the convex and concave side of the curvature as well as from one superficial muscle (*m. trapezius*). Samples were stored at −80 ◦C in tubes containing nucleic acid preservation solution (Novazym, cat no. ST01; Poznan, Poland).

### *4.3. Genomic DNA Methylation Analysis*

### 4.3.1. Genomic DNA Isolation and Bisulfite Conversion

Total genomic DNA was extracted using a silica matrix column kit (Zymo Research, cat no. D4069; Irvine, CA, USA) with a modified protocol. In short, 25 mg of tissue samples ground in liquid nitrogen were incubated overnight at 55 ◦C with proteinase K. The lysate was then centrifuged (12,000× *g*, 1 min, room temperature). Next, the procedure followed the isolation according to manufacturer's protocol. The gDNA quantity, purity, and integrity were assessed both spectrophotometrically and electrophoretically. One microgram of gDNA was bisulfite converted using an EZ DNA Methylation™ Kit (Zymo Research, cat no. D5002; Irvine, CA, USA) according to the manufacturer's protocol.

### 4.3.2. Polymerase Chain Reaction and Pyrosequencing Analysis

Bisulfite converted DNA served as the template for polymerase chain reaction (PCR) followed by pyrosequencing (PSQ). The primers for PCR and PSQ reactions were designed using PyroMark Assay Design software (version 2.0.1.15; Qiagen; Hilden, Germany). The input DNA sequences corresponded to the T-DMR1 and T-DMR2 regions of the *ESR1* gene (https://www.ncbi.nlm.nih.gov (accessed on 5 March 2019); GenBank N◦ : NG\_008493.2). Sequencing, forward, and biotinylated reverse primers are presented in Table 1.

Polymerase chain reactions were performed using ZymoTaqTM PreMix (Zymo Research; cat no. E2004; Irvine, CA, USA) designed for the amplification of bisulfite-treated DNA. Reaction mixture components, concentrations, and thermal profile is presented in Table 2. Two microliters of the product were separated using a standard 2% agarose gel and compared to molecular mass marker (Novazym, cat no. MA1000-03; Poznan, Poland).


### **Table 1.** Primer sequences and location.

<sup>→</sup> PCR—forward primer; <sup>←</sup> PCR—reverse primer; <sup>B</sup>—biotinylated primer; Tm—melting temperature, GC—guanine-cytosine content; bp—base pairs; TSS—transcription start site; ATG—start codon; → SEQ—sequencing primer.


**Table 2.** PCR mixture content and thermal profile of the reactions.

→PCR—forward primer; ←PCR—reverse primer; min—minutes, s—seconds.

PSQ analysis was performed using the PyroMark Q48 instrument (Qiagen; Hilden, Germany) according to CpG assays designed with Pyromark Q48 Autoprep 2.4.2 software (Qiagen; Hilden, Germany). Analysis of 4 and 8 CpG sites for T-DMR1 and T-DMR2, respectively, were performed (internal sodium bisulfite treatment quality control was included in each reaction). The methylation level was quantified using Pyromark Q48 Autoprep 2.4.2 software and expressed as a percentage ratio of methylated to non-methylated dinucleotides.

### *4.4. Analysis of ESR1 mRNA Expression*

### 4.4.1. Total RNA Isolation and Reverse Transcription

Total cellular RNA was extracted using Renozol (GenoPlast Biochemicals, cat no. BNGPB1100-2; Rokocin, Poland) and Direct-zol RNA Miniprep Kit (Zymo Research, cat no. R2052; Irvine, CA, USA) following the manufacturer's protocol. RNA quantity and purity were assessed similarly to gDNA. RNA integrity was evaluated with 18S and 28S ribosomal RNA using 1% standard denaturing agarose gel electrophoresis.

Reverse transcription reactions were performed using M-MuLV-RT (Sigma-Aldrich, cat no.11785826001; Saint Louis, MO, USA) according to the manufacturer's protocol. The total reaction volume was 10 µL. In the first step, the mixture containing 500 ng of total RNA, water, 5 mmol/µL universal oligo(dT)<sup>10</sup> primer, and 300 nmol/µL random hexamer primer were denatured at 65 ◦C for 10 min then cooled on ice. Subsequently, 2 mmol/µL of each deoxynucleotide triphosphates, 1.5 U/reaction of *E. coli* Poly(A) Polymerase (Carolina Biosystems, cat no. PAPY-30; Prague, Czech Republic), 150 nm/µL deoxyadenosine triphosphates, 15U/reaction of ribonuclease inhibitor, 1X buffer M-MuLV-RT buffer, and 10 U/reaction of M-MuLV reverse transcriptase were added. Samples were incubated at 25 ◦C for 10 min, 55 ◦C for 60 min, then 5 min at 85 ◦C. Complementary DNA (cDNA) was

either immediately used for quantitative polymerase chain reaction (qPCR) or stored at −20 ◦C until further analysis (but no longer than seven days).

### 4.4.2. Quantitative Polymerase Chain Reaction

*ESR1* mRNA was quantified using sequence-specific primers (sense: CCTTCTTCAA-GAGAAGTATTCAAGG and antisense: ATTCCCACTTCGTAGCATTTG) and the Roche Universal ProbeLibrary TaqMan® hydrolysis probe (#69, cat no. 04688686001) using the ProbeFinder Assay Design Center (https://lifescience.roche.com/en\_pl/brands/universalprobe-library.html, accessed on 4 October 2016). The hypoxanthine-guanine phosphoribosyltransferase (*HPRT*) gene was used as a reference gene (RealTime ready *HPRT*, Roche, cat no. 05532957001; Basel, Switzerland). The 20µL total volume reaction mixture contained 5 µL cDNA, 1X LightCycler® FastStart TaqMan® Probe Master (Roche, cat no. 04673417001; Basel, Switzerland), and 1X RealTime ready *HPRT* for reference gene or 200 nm of hydrolysis probe #69 along with 400 nm of the primer mixture for the gene of interest, and nuclease-free water. qPCR reactions were performed using the LightCycler® 2.0 carousel glass capillary-based system (Roche). The thermal profile was performed as previously described [25]. Each sample was analyzed in duplicate with independently synthesized cDNA. The quantitative PCR results were assembled using the LightCycler Data Analysis Software version 5.0.0.38 (Roche; Basel, Switzerland), and the fluorescence measurement results were normalized to standard curves [25]. In each sample, *ESR1* expression was compared to reference gene expression in order to obtain a Cr value (concentration ratio) which corresponded to the relative *ESR1* expression level.

### *4.5. Statistical Analyses*

Data analyses were performed using Statistica 13.3 software (TIBCO Software Inc.; Palo Alto, CA, USA) and PQStat 1.8.0.414 software (PQStat software; Poznan, Poland). The methylation level of specific CpG sites was analyzed in T-DMR1 and T-DMR2 separately for each CpG site in each region. The Shapiro–Wilk test was used for the normality of continuous variable distribution assessment. The differences in methylation levels between concave, convex, and superficial muscles was evaluated using repeated measures ANOVA or Friedman ANOVA with HSD Tukey and Dunn's Bonferroni post-hoc tests, respectively. Methylation between patient subgroups with a Cobb angle ≤70◦ or >70◦ was compared using an independent t-test or Mann–Whitney U test. The correlation coefficients were determined by Pearson's (r) or Spearman's rank test (R). Data are presented as mean ± SE (standard error) or median with interquartiles. Data was considered statistically significant when *p* < 0.05.

### **5. Conclusions**

The DNA methylation level of *ESR1* regulatory regions is specific to the muscle tissue localization in patients with idiopathic scoliosis. The lack of significant asymmetry between the concave, compared to the convex, side of the spinal curvature suggests that *ESR1* methylation level does not signify predisposition to the occurrence of IS. The difference in *ESR1* T-DMR2 CpGs methylation of the deep paravertebral muscles on the concave side of the curvature may be associated with IS severity.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/article/10 .3390/genes12060790/s1, Supplementary Table S1. DNA methylation level (%) within *ESR1* T-DMR1 and T-DMR2 regions in deep paravertebral muscles and superficial muscles (*n* = 29). Supplementary Table S2. DNA methylation level (%) within *ESR1* T-DMR1 and T-DMR2 regions in deep paravertebral muscles and superficial muscles in the groups of patients with Cobb angles ≤70◦ (*n* = 10) and >70◦ (*n* = 19). Supplementary Figure S1. Relative *ESR1* expression level in deep paravertebral muscles and superficial muscles. Supplementary Figure S2. Scatter plots showing correlations between *ESR1* expression and methylation level within T-DMR1 region in deep paravertebral muscles and superficial muscles. Figure S3. Scatter plots showing correlations between *ESR1* expression and methylation level within T-DMR2 region in deep paravertebral muscles and superficial muscles.

**Author Contributions:** Conceptualization, P.J., M.C., M.A., M.K. and T.K.; methodology, P.J., M.C. and M.A.; validation, M.C. and M.A.; formal analysis, P.J., M.C., M.A. and M.K.; investigation, P.J., M.C. and M.A.; resources, P.J.; data curation, P.J., M.C. and M.A.; writing—original draft preparation, P.J., M.C., M.A. and M.K.; writing—review and editing, M.C., M.A. and T.K.; visualization, M.C. and M.A.; supervision, P.J., T.K.; project administration, P.J.; funding acquisition, P.J. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Science Centre, grant number 2016/23/D/NZ5/02606.

**Institutional Review Board Statement:** The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Institutional Review Board of Poznan University of Medical Sciences (protocol code No 546/17 and 741/19 and date of approval 5/11/2017 and 6/19/2019, respectively).

**Informed Consent Statement:** Informed consent was obtained from all the patients or their parents/legal guardians in the case of underage participants.

**Data Availability Statement:** The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.

**Acknowledgments:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **Abbreviations**

AIS: adolescent idiopathic scoliosis; cDNA: complementary DNA; ESR1: estrogen receptor 1; ESR2: estrogen receptor 2; *HPRT*: hypoxanthine-guanine phosphoribosyltransferase; IS: idiopathic scoliosis; PCR: polymerase chain reaction; PELP1: proline-, glutamic acid- and leucine-rich protein; qPCR: quantitative polymerase chain reaction; SE: standard error of mean; T-DMRs: differentially methylated regions.

### **References**

	- **2013**, *7*, 11–16. [CrossRef]

## *Review* **Models of Distal Arthrogryposis and Lethal Congenital Contracture Syndrome**

**Julia Whittle <sup>1</sup> , Aaron Johnson <sup>2</sup> , Matthew B. Dobbs <sup>3</sup> and Christina A. Gurnett 1,\***


**Abstract:** Distal arthrogryposis and lethal congenital contracture syndromes describe a broad group of disorders that share congenital limb contractures in common. While skeletal muscle sarcomeric genes comprise many of the first genes identified for Distal Arthrogyposis, other mechanisms of disease have been demonstrated, including key effects on peripheral nerve function. While Distal Arthrogryposis and Lethal Congenital Contracture Syndromes display superficial similarities in phenotype, the underlying mechanisms for these conditions are diverse but overlapping. In this review, we discuss the important insights gained into these human genetic diseases resulting from in vitro molecular studies and in vivo models in fruit fly, zebrafish, and mice.

**Keywords:** contracture; arthrogryposis; congenital

**Citation:** Whittle, J.; Johnson, A.; Dobbs, M.B.; Gurnett, C.A. Models of Distal Arthrogryposis and Lethal Congenital Contracture Syndrome. *Genes* **2021**, *12*, 943. https://doi.org/ 10.3390/genes12060943

Academic Editors: Philip Giampietro, Nancy Hadley-Miller and Cathy L. Raggio

Received: 17 May 2021 Accepted: 16 June 2021 Published: 20 June 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

### **1. Introduction**

Arthrogryposis (arth = joint; grp = curved; osis = pathological state) describes a broad range of phenotypes consisting of multiple congenital joint contractures presenting at birth [1]. About 1 in 3000 live births presents with some form of arthrogryposis, many of which are nonprogressive and improve with physiotherapy. The core root of arthrogryposis is fetal akinesia, or lack of fetal movement, that results in contractures forming in the joints [1–3]. Movement is required for normal joint development; it influences the structure of the joints, as well as promoting cellular signaling that guides normal tissue development. Mechanical forces also influence bone morphology, affecting organization of chondrocytes, bone elongation, and differential growth, all affecting the shape of bones as they develop. Fetal akinesia impairs joint formation, which may lead to joint fusions. Furthermore, tension is required for normal tendon development, forming a connection between bone and muscle [4]. Arrested movement during development has significant impact on the formation of the skeleton, joints, muscle, and connective tissues.

The full range of joint movement in utero can be perturbed both intrinsically and extrinsically. Intrinsically, mutations affecting the muscle, bone, connective tissue, and neural system can affect the range of movement of joints. Currently, there are over 400 genes associated with arthrogryposis broadly, encapsulating a wide diversity of genes affecting different pathways, including genes associated with axon structure, circulatory development, or synaptic transmission [5]. Extrinsically, maternal disease or exposures, uterine space limitation, and decreased blood supply are also root causes for contraction defects [1,2]. Because joint motion is affected by many different systems, a wide range of issues during development can arrest joint motion.

A subset of arthrogryposis is described as Distal Arthrogryposis (DA), a group of genetically induced contractures that predominantly affect the joints of the distal limbs, including the hands, wrists, ankles, and feet. Clinically, the lower extremity manifestations

commonly include clubfoot and vertical talus. There are currently 10 classifications of DA, including Sheldon-Hall syndrome (DA2B) and Freeman-Sheldon syndrome (DA2A) [6–9]. Freeman-Sheldon syndrome is considered the most severe form of DA, and also presents with facial contractures [9].

Currently, DA patients are offered supportive care to improve quality of life, including occupational therapy, physical therapy, and surgery [10]. While these treatments improve outcome for patients, they often fall short of complete restoration of range of motion in the joints and functionality. This strategy also fails to address underlying causes for DA, such as muscle weakness and impaired neurotransmission. Therefore, further investigation is necessary to understand the impact of disease variants which will allow us to determine the most effective treatment options for patients.

Lethal Congenital Contracture Syndromes (LCCS) are included in this review, as some of the same genes and disease mechanisms apply to this serious condition, which is typically fetal or neonatal lethal. LCCS presents with severe generalized contractures, along with many other typical features including incomplete lung development, and polyhydramnios [11]. In contrast to DA, which is most often inherited as an autosomal dominant condition, LCCS has only been described in the autosomal recessive state. Eleven subtypes have been described to date [10].

Various disease models have been developed to examine the mechanisms behind DA and to test therapeutic interventions (Table 1). Molecular and single-cell studies are useful for precisely examining the effects of DA-causing variants on the affected proteins and tissues. Access to human tissues, particularly from muscle biopsies, has facilitated molecular analysis for research, yet is clinically useful only in select cases [12,13]. In addition, protein modeling can help predict the impact of various amino acid substitutions on molecular interactions [14,15]. On the other hand, animal models are necessary to analyze the effect of single gene variants on organisms on scales larger than single cells. The effect of zygosity and gene dosage may also be better studied in animal models to assess interactions between normal and abnormal gene products. Animal models are also useful for studying experimental interventions that may improve patient quality of life and outcome, acting as stand-ins for potential human patients.


**Table 1.** List of genes and associated conditions and models of distal arthrogryposis (DA) and lethal congenital contracture syndrome (LCCS) used for study. Autosomal dominant (AD), Autosomal recessive (AR) [9,11,12,14,16–57].


**Table 1.** *Cont.*

Models of human disease are rapidly becoming more sophisticated, with the ability to knock-in single nucleotide variants and create conditional (tissue specific or timedependent) knockouts [16,17]. Loss-of-function alleles, which are often easier to generate, provide critical information about gene function, but may not fully explain autosomal dominant phenotypes in which gain-of-function or dominant negative effects can cause markedly different phenotypes. Conditional knockouts, while very helpful in defining gene function, rarely replicate the human phenotype in its entirety, but may be required when early lethality limits further study. These methods allow researchers to design models that more accurately represent these human conditions, and replicate pathogenic effects broadly or in specific tissues.

This review will examine genetic models of DA and LCCS, and the impact they have had in understanding the underlying pathophysiology. While we will describe both in vitro and in vivo approaches, we will focus primarily on vertebrate models, as these have the potential to provide insight into the multifaceted effects of disease variants on the multiple tissue types that contribute to these complex human phenotypes. We will also examine the current trajectory of DA research, and how these research strategies can help those afflicted by DA.

### **2. Muscle-Related Distal Arthrogryposis**

### *2.1. MYH3*

Missense mutations in *MYH3*, the earliest expressed embryonic myosin heavy chain gene that is predominantly expressed in myotubes destined to become fast-twitch myofibers [9], are strongly associated with DA clinically, and contribute to multiple subtypes including DA1, DA2A, and DA2B with varying degrees of severity [7–9,59,60]. *MYH3* associated DA is almost always caused by single missense variants, as frameshift knockout or premature stop mutations are frequently observed in healthy population controls [9]. However, nonsense *MYH3* variants may contribute to autosomal recessive spondylocarpotarsal syndrome in the compound heterozygous state when presenting along with a missense allele [21], and have also been described with autosomal dominant spondylocarpotarsal syndrome [61]. DA-associated pathogenic variants cluster in the motor domain, but have also been found in the tail region of the protein [9]. Many of these missense variants are de novo, but some segregate in families with complete, or nearly complete penetrance [9]. *MYH3* appears to be one of the most common genes associated with DA, therefore various in vitro and in vivo studies, including protein modeling, cell models, and vertebrate studies, have been performed to elucidate the effects of *MYH3* variants on muscle function and the subsequent effects on the joints and skeleton.

### 2.1.1. Biochemical and Cell Models for *MYH3*-Associated Distal Arthrogryposis

Single molecule and single cell studies are useful to examine the precise impact of a variant on protein function. The effects of amino acid substitutions are difficult to predict without mechanistic examination. Single-molecule studies facilitate understanding the effect of a missense variant on protein function and can later be translated into an understanding of how small mechanical differences affect tissues and whole body systems. Missense variants can be studied in human skeletal muscle biopsies. However, these are not routinely performed for DA diagnosis, which makes these studies challenging. To study this mechanistic link between DA phenotype and gene variant, Racca et al. performed contractility studies on isolated muscle cells and myofibrils derived from biopsied muscle tissue from DA2A patients [12]. They found that a DA2A-associated *MYH3* variant inhibited cross-bridge detachment, thereby slowing muscle relaxation and lowering force production. A later study replicated these results while examining multiple *MYH3* variants associated with DA2 [54]. In addition to slower actin-myosin detachment, ATP binding and ATPase activity were lower in variant MYH3 molecules.

Development of single cell models, in which *MYH3* variants are exogenously expressed or overexpressed from a plasmid or virus, have been limited by the difficulties of expressing large genes, like *MYH3*, in vitro. Other key challenges of in vitro modeling include the paucity of skeletal muscle cell lines other than C2C12 cells, and the propensity of muscle cells to form a syncytium. In addition, difficulties in differentiating human induced pluripotent stem cells (iPSCs) into skeletal muscle have also limited their use for arthrogryposis disease modeling. Likewise, because many features of DA are due to complex relationships between different cell types, co-cultures of muscle cells with tenocytes and bone may be required to recapitulate the human condition. Thus, many investigators have preferred to study DA genes in whole organisms.

### 2.1.2. Invertebrate Models for *MYH3*-Associated Distal Arthrogryposis

*Drosophila melanogaster* (fruit flies) are useful tools for studying muscle function and myofibril assembly, particularly as introduction of single variants are traditionally simpler in this system compared to other models. *MYH3* and the *Drosophila* myosin heavy chain gene, *Mhc*, are highly conserved. *Drosophila* have the advantage of having only a single myosin heavy chain, which eliminates the possible obscuration of effects due to compensation by other myosin heavy chain analogs. Therefore, the effect of variants on protein function can be examined in a setting without other myosin heavy chain isoforms.

*Drosophila* transgenic models have been generated by overexpressing *Mhc* constructs containing DA variants [14,25]. Guo et al. predicted that a DA1 variant would perturb a hydrophobic interaction, while a DA2B mutation would introduce a hydrogen bond that was not present in the wild type. The effect of these predicted interactions was tested mechanistically in *Drosophila* models. Muscle fibers containing the DA alleles were extracted and found to have lower actin affinity, reduced power output, and increased stiffness, which may explain the motor deficits [14].

Morphological studies of *Drosophila* skeletal muscle expressing three DA2A *Mhc* transgenes (R672H, R672C, and T178I) showed branching and splitting defects, which were most severe in the R672C variant, which caused Z-discs to be split and malformed. In addition, Z-disc distance was shorter in the transgenic flies indicating an overall shortening of the sarcomeres, perhaps due to an enhanced contractile state of the myofibers. Presumably, the shortened sarcomeres observed in *Drosophila* contribute to the formation of contractures in human patients. Indeed, ATPase activity was reduced in these transgenic flies, leading to functional defects in muscle activity [25].

In studying the intact adult *Drosophila*, Guo et al. showed that the *Mhc*F437I mutants had a much longer lifespan than *Mhc*A234T mutants, consistent with the less severe phenotype of DA1 patients compared to DA2B patients [14]. In addition, the researchers found that both mutants displayed aberrant myofibril assembly, as well as misaligned sarcomere structure including distorted M and Z lines [14]. Again, this phenotype was more severe in A234T mutants than in the F437I mutants. *Mhc*F437I mutants displayed essentially normal myofibers and sarcomeres, while *Mhc*A234T mutants had small myofibers with disrupted morphology, as well as abnormal sarcomeres [14]. Das et al. also found that decreased climbing capability of adult flies also correlated with the phenotypic severity in humans [25].

Like *Drosophila*, *Caenorhabditis elegans* (*C. elegans*) has also been used to study myosin heavy chain genes [62]. There are many advantages of *C. elegans* and *Drosophila* for disease modeling, including large numbers of progeny, knowledge of ontogeny of individual cells, and ease of functional studies for drug screening. However, as described in Gil-Galvez et al., the evolutionary distance, and differences in number of myosin heavy chain genes between invertebrates and humans makes it difficult to determine whether the gene being studied has the same function, particularly in terms of spatial and temporal expression, as its human counterpart. Furthermore, Gil-Galvarez et al. caution against overexpression studies in *C. elegans* broadly, citing interference with muscle cell function overall [62]. The major drawback of invertebrate models is, quite obviously, the lack of skeletal structures which limits their use in understanding the complex relationship between muscle, nerve, and bone.

### 2.1.3. Vertebrate Models for *MYH3*-Associated Distal Arthrogryposis

Germline loss of *Myh3* in mice results in altered muscle fiber size, fiber number, fiber type, and misregulation of genes, and adult *Myh3* null mice develop scoliosis [32]. However, the molecular defect may make these mice a better model for recessive *MYH3* spondylocarpotarsal synostosis syndromes than for autosomal dominant DA [21,61]. Notably, many patients with spondylocarpotarsal synostosis syndrome also have congenital contractures, which highlights the phenotypic overlap between DA and spondylocarpotarsal synostosis syndrome. Interestingly, *MYH3* was also shown to be expressed in

bone, which the authors state may explain the effects of *MYH3* variants on both skeletal muscle and bone, particularly for patients with spondylocarpotarsal synostosis syndrome and bony fusions [61].

To more accurately model DA2A, Whittle et al. recently introduced one of the most common Freeman-Sheldon syndrome *MYH3* variants, R672H, into an analogous gene in zebrafish (*Danio rerio*) (*smyhc1*R673H) [16]. Zebrafish breed profusely and are cost-effective compared to mice. They also mature quickly, develop in vitro, and are transparent in the first few days of life, which facilitates imaging. Zebrafish are also vertebrates, making them more closely related to humans than *Drosophila* or *C. elegans*. Because two zebrafish lines were created, including *smyhc1* null and *smyhc1*R673H lines, gene dosage effects were studied by examining the variant in the context of different zygosities. Indeed, *smyhc1*R673H homozygotes displayed severe, early lethal phenotype compared to *smyhc1*R673H heterozygotes, indicating that the *smyhc1*R673H mutation acts as a hypermorph [16]. This result suggests human fetal lethality if a DA missense variant occurs in the homozygous state, which has not yet been described.

Zebrafish larvae harboring the *smyhc1*R673H variant demonstrated severe notochord kinks [1], and adults had vertebral fusions that were similar to those seen in patients autosomal dominant spondylocarpotarsal synostosis due to *MYH3* variants. On histological examination, skeletal muscle showed severely shortened and misshapen muscle fibers. Similar to studies in *Drosophila*, the somite length was reduced in *smyhc1*R673H mutants, consistent with shortening of the sarcomere.

A major advantage of zebrafish is the ease with which drugs can be administered for therapeutic investigations and drug screening. Based on knowledge that myosin ATPase inhibitors are now being evaluated to treat human cardiomyopathy due to similar variants in cardiac myosin genes [63], Whittle et al. preemptively treated embryos with paraaminoblebbistatin to prevent contractures from forming in larvae. Para-aminoblebbistatin inhibits myosin heavy chain ATPase activity, which chemically relaxed the skeletal muscle and prevented the curved phenotype of the treated *smyhc1* mutant fish (Figure 1) [16]. Previous molecular and single fiber studies predicted this mechanistic effect. Based on this experimental work, myosin ATPase inhibitors may be a viable avenue for *MYH3* associated DA treatment, but will most likely require development of skeletal-musclespecific inhibitors and treatment at an appropriately early developmental window. *Genes* **2021**, *12*, x FOR PEER REVIEW 7 of 15

**Figure 1.** The curved spinal phenotype associated with both *smyhc1*+/R673H and *smyhc1*R673H/R673H genotypes is normalized with the myosin inhibitor para-aminoblebbistatin. Embryos were treated from 24–48 hpf and photographed at 48 hpf. Treated embryos are shown below DMSO treated controls. Unlike the newer myosin inhibitors that are being developed, para-aminoblebbistatin has many toxic effects, including lethal cardiac edema, which limits its use as a human therapeutic. These images are similar to those published in [16]. *2.2. MYBPC1 and MYBPC2* **Figure 1.** The curved spinal phenotype associated with both *smyhc1*+/R673H and *smyhc1*R673H/R673H genotypes is normalized with the myosin inhibitor para-aminoblebbistatin. Embryos were treated from 24–48 hpf and photographed at 48 hpf. Treated embryos are shown below DMSO treated controls. Unlike the newer myosin inhibitors that are being developed, para-aminoblebbistatin has many toxic effects, including lethal cardiac edema, which limits its use as a human therapeutic. These images are similar to those published in [16].

### ing protein C1 (*MYBPC1*) to dominantly inherited DA1 [30], DA2 [58], arthrogryposis 136

phenotype [37], but single variants have not yet been studied.

congenital contracture syndrome LCCS4 [11].

significance.

*2.3. TPM2*

Strong evidence now exists linking variants in the slow skeletal muscle myosin bind-

Morpholino knockdown of *mybpc1* in zebrafish resulted in embryos with severe body curvature, as well as impaired motor excitation with defective myofibril organization and reduced sarcomere numbers [31]. Furthermore, overexpression of human *MYBPC1* DA1 associated variants in zebrafish resulted in hypermorphic effects with body curvature, decreased motor activity, and impaired survival. No effect was seen with overexpression of wild-type transcripts, suggesting that overexpression studies in zebrafish could be an efficient model for future functional testing of the human variants of uncertain clinical

In contrast to *MYBPC1,* which is strongly implicated in human disease, the role of fast skeletal muscle myosin binding protein C2 (*MYBPC2*) in DA is less clear, as there is only a single report of *MYBPC*2 variants in DA patients in whom other known arthrogryposis gene variants were also observed, suggesting a possible role as a modifier [18]. Knockdown of *MYBPC2* with morpholino oligonucleotides produced a myopathic

*TPM2* variants cause a spectrum of phenotypes, including DA1, DA2, as well as nemaline myopathy and cap myopathy (reviewed in (Tajsharghi et al., 2012)) [64]. All are autosomal dominant with the exception of a pathogenic null variant identified in a consanguineous family with Escobar variant of multiple pterygium syndrome that was observed in the recessive state [65]. Biochemical studies were undertaken to study *TPM2* gain-of-function phenotypes, including in vitro motility assays, which showed variable effects on calcium sensitivity and tropomyosin flexibility [20,39]. In addition, *TPM2* was

multiplex congenita [28], myopathy with tremor [49], and, in the recessive state, to lethal

### *2.2. MYBPC1 and MYBPC2*

Strong evidence now exists linking variants in the slow skeletal muscle myosin binding protein C1 (*MYBPC1*) to dominantly inherited DA1 [30], DA2 [58], arthrogryposis multiplex congenita [28], myopathy with tremor [49], and, in the recessive state, to lethal congenital contracture syndrome LCCS4 [11].

Morpholino knockdown of *mybpc1* in zebrafish resulted in embryos with severe body curvature, as well as impaired motor excitation with defective myofibril organization and reduced sarcomere numbers [31]. Furthermore, overexpression of human *MYBPC1* DA1-associated variants in zebrafish resulted in hypermorphic effects with body curvature, decreased motor activity, and impaired survival. No effect was seen with overexpression of wild-type transcripts, suggesting that overexpression studies in zebrafish could be an efficient model for future functional testing of the human variants of uncertain clinical significance.

In contrast to *MYBPC1,* which is strongly implicated in human disease, the role of fast skeletal muscle myosin binding protein C2 (*MYBPC2*) in DA is less clear, as there is only a single report of *MYBPC*2 variants in DA patients in whom other known arthrogryposis gene variants were also observed, suggesting a possible role as a modifier [18]. Knockdown of *MYBPC2* with morpholino oligonucleotides produced a myopathic phenotype [37], but single variants have not yet been studied.

### *2.3. TPM2*

*TPM2* variants cause a spectrum of phenotypes, including DA1, DA2, as well as nemaline myopathy and cap myopathy (reviewed in (Tajsharghi et al., 2012)) [64]. All are autosomal dominant with the exception of a pathogenic null variant identified in a consanguineous family with Escobar variant of multiple pterygium syndrome that was observed in the recessive state [65]. Biochemical studies were undertaken to study *TPM2* gain-offunction phenotypes, including in vitro motility assays, which showed variable effects on calcium sensitivity and tropomyosin flexibility [20,39]. In addition, *TPM2* was recently shown to have noncanonical roles other than its sarcomeric function, where it binds thin filament actin to regulate muscle contraction. In this work, *TPM2* directly regulated muscle morphogenesis by directing myotubes toward tendon attachment sites [56]. Muscle morphology was disrupted in both flies and zebrafish expressing DA1-associated *TPM2* variants, likely by causing myofiber hypercontraction (Figure 2).

### *2.4. TNNI2*

While the function of the fast skeletal muscle Troponin I (*TNNI2*) has been described in flightless *Drosophila* models [53], only a single DA disease-associated missense variant has been modeled in mice, which accurately recapitulated the human disease [17]. However, the small body size of mice carrying the *TNNI2* DA variant could not be explained by the direct effect of the variant on skeletal muscle morphology or function. Rather, *TNNI2* was shown to be expressed in osteoblasts and chondrocytes of long bone growth plates, through which its effects on growth was predicted to occur. Therefore, like the studies described above for *MYH3*, this model provides evidence that some DA phenotypes may be directly attributable to expression in non-muscle tissue, such as bone.

**Figure 2.** DA associated *TPM2* variants cause muscle phenotypes in *Drosophila*. Confocal micrographs of live L3 larva that express GFP-tagged TPM2 variants in skeletal muscles (body wall muscles). Mef2.Gal4 was used to activate UAS.TPM2 transgenes. Lateral and dorsal views are shown for each genotype. (**A**,**B**) Larva that express TPM2.GFP showed normal muscle histology. Larva that expresses TPM2.E41K.GFP (**C**,**D**) or TPM2.R91G.GFP (**E**,**F**). GFP have rounded myofibers that appear to result from internal tears (arrows; note affected muscles remain associated with tendons at segment boundaries) and shortened segments that could be due to hypercontractile muscles (arrowheads). Thoracic segments (T1–T3) and abdominal segments (A1–A8) are labeled. Scale bars, 500 mM. Previously unpublished data. **Figure 2.** DA associated *TPM2* variants cause muscle phenotypes in *Drosophila*. Confocal micrographs of live L3 larva that express GFP-tagged TPM2 variants in skeletal muscles (body wall muscles). Mef2.Gal4 was used to activate UAS.TPM2 transgenes. Lateral and dorsal views are shown for each genotype. (**A**,**B**) Larva that express TPM2.GFP showed normal muscle histology. Larva that expresses TPM2.E41K.GFP (**C**,**D**) or TPM2.R91G.GFP (**E**,**F**). GFP have rounded myofibers that appear to result from internal tears (arrows; note affected muscles remain associated with tendons at segment boundaries) and shortened segments that could be due to hypercontractile muscles (arrowheads). Thoracic segments (T1–T3) and abdominal segments (A1–A8) are labeled. Scale bars, 500 mM. Previously unpublished data.

### *2.4. TNNI2 2.5. TNNT3*

While the function of the fast skeletal muscle Troponin I (*TNNI2*) has been described in flightless *Drosophila* models [53], only a single DA disease-associated missense variant has been modeled in mice, which accurately recapitulated the human disease [17]. However, the small body size of mice carrying the *TNNI2* DA variant could not be explained by the direct effect of the variant on skeletal muscle morphology or function. Rather, *TNNI2* was shown to be expressed in osteoblasts and chondrocytes of long bone growth plates, through which its effects on growth was predicted to occur. Therefore, like the DA-associated variants in fast skeletal muscle Troponin T (*TNNT3*) are also dominantly inherited missense variants, and therefore, like those in many genes described previously, cause disease through a gain-of-function manner and therefore cannot be adequately modeled using a simple knockout approach. Therefore, while knockout approaches have shown a critical function of *TNNT3* during vertebral development [34], no models with disease-specific missense variants have been generated, and future studies are needed.

recently shown to have noncanonical roles other than its sarcomeric function, where it binds thin filament actin to regulate muscle contraction. In this work, *TPM2* directly regulated muscle morphogenesis by directing myotubes toward tendon attachment sites [56]. Muscle morphology was disrupted in both flies and zebrafish expressing DA1-associated

*TPM2* variants, likely by causing myofiber hypercontraction (Figure 2).

### studies described above for *MYH3*, this model provides evidence that some DA pheno-*2.6. MYLPF*

types may be directly attributable to expression in non-muscle tissue, such as bone. *2.5. TNNT3* DA-associated variants in fast skeletal muscle Troponin T (*TNNT3*) are also dominantly inherited missense variants, and therefore, like those in many genes described previously, cause disease through a gain-of-function manner and therefore cannot be adequately modeled using a simple knockout approach. Therefore, while knockout approaches have shown a critical function of *TNNT3* during vertebral development [34], no models with disease-specific missense variants have been generated, and future studies Exome sequencing recently identified *MYLPF*, a phosphorylatable fast skeletal muscle regulatory light chain, as a cause of DA [23]. Some affected individuals were homozygous for rare variants in the gene, while other individuals have autosomal dominant disease, a finding similar to what was described for *MYH3*-related disorders. However, unlike *MYH3*-related disorders, the phenotypes for *MYLPF* autosomal dominant and recessive conditions are apparently indistinguishable. Protein modeling of *MYLPF* alleles suggested that the autosomal dominant pathogenic variants cause disease through their direct interaction with myosin, while the recessive alleles only indirectly affect the interaction with myosin [23].

are needed. *2.6. MYLPF* Previously described *mylpf* knockout mice did not develop either fast or slow type skeletal muscle mass, which resulted in early death before or after delivery, presumably due to respiratory failure [66]. Similarly, an individual with recessive *MYLPF*-associated DA1 was found to have absent skeletal muscle in an amputated foot, which could best be described as amyoplasia. The recessive *MYLPF* pathogenic variant in this individual was hypothesized to be hypomorphic. Therefore, to further model hypomorphic alleles of *MYLPF*, a zebrafish *mylpfa* mutant was characterized. Because zebrafish have 2 *MYLPF* genes, *mylpfa* and *mylpfb*, the more prominently expressed gene (*mlfpfa*) was chosen to model hypomorphic *MYLPF* alleles [23]. Of note, zebrafish had an evolutionary genome duplication event that has resulted in many human genes being represented twice in

the zebrafish genome. Some of these genes have subsequently evolved to occupy additional temporal or spatial expression patterns and/or adopted new developmental roles. While this duplication event may complicate identification of the relevant human gene, gene duplications can also be an advantage, allowing genes to be studied whose knockout is early embryonic lethal in other species.

*Mylpfa* zebrafish null mutants were found to have paralyzed pectoral fins, an impaired escape response, and consistently lower trunk muscle force compared to wildtype [23]. In in vitro studies, myosin extracted from *mylpfa* mutant larvae propelled significantly more slowly than their wild type protein. Skeletal muscle fibers were also found to degenerate in mutant larvae, collapsing and losing structure, developing membrane abnormalities, indicating that *mylpf* is necessary to maintain cellular integrity in muscle cells [23]. Thus, *MYLPF* may be unique among the DA genes in also causing amyoplasia, which may have important implications for personalized therapeutic strategies.

### **3. Neural-Related Distal Arthrogryposis**

### *3.1. PIEZO2*

DA is not only associated with muscle proteins. Whole genome sequencing was performed on individuals with DA5, in which arthrogryposis occurs in combination with ptosis, ophthalmoplegia, and facial dysmorphism, and gain of function variants (I802F and E2727del) were discovered in *PIEZO2* [24], a mechanosensitive cation channel responsible for mediating cation currents in primary sensory neurons [24,40,55]. Because of their mechanosensitivity, these channels are also termed "stretch-activated ion channels". *PIEZO2* was found to affect the skeleton non-autonomously in mice. Loss of *PIEZO2* specifically in skeletal tissue did not affect bone development; however, loss of gene function in proprioceptive neurons caused spine malalignment [67]. This suggests both that the neural system is necessary in maintaining normal skeletal development, and that *PIEZO2* is a critical gene in this process.

To test the mechanistic effects of these pathogenic variants, one group recently transfected human embryonic kidney cells with wild type *PIEZO2*, or *PIEZO2* with missense variants encoding I802F, or E2727del [24]. Both of these disease-associated variants caused the channel to recover more quickly from inactivation and resulted in increased channel activity following a mechanical stimulus. This supports the hypothesis that DA5 is caused by gain of function mutations that alter mechanosensory nerves. Although additional studies are needed, overstimulation may directly affect the neuromuscular pathway that controls muscle tone in developing fetuses, perhaps causing near-constitutive contractions that constrain the developing joint.

### *3.2. ECEL1*

Variants in *ECEL1*, which encodes the endothelin-converting enzyme like-1, were identified in the recessive state in several families with DA5D, a rare form of arthrogryposis in which affected individuals have contractures as well as distinctive facial features and ptosis [41]. *ECEL1*, which is expressed in brain and nerve, is required for post-natal development in mice. Loss of *Ecel1* in mice results in abnormal terminal branching of motor neurons at the skeletal muscle endplate [68]. The mechanism by which *ECEL1* directs motor neuron branching is currently unknown; however, the resulting contractures in patients with *ECEL1* variants were proposed to be caused by a similar mechanism to those caused by genes such as *CHRNG* that causes multiple pterygium syndrome that impairs neurotransmission at the neuromuscular junction [69].

### **4. Lethal Congenital Contracture Syndrome**

In contrast with DA, which are more common and often autosomal dominant, Lethal Congenital Contracture Syndromes (LCCSs) are a group of rare autosomal recessive forms of arthrogryposis. LCCS are characterized by lack of fetal movement (akinesia), micrognathia, incomplete lung development, polyhydramnios, characteristic contractures of the

limbs (clubfoot, hyperextended knees, elbow and wrist flexion contractures) and motoneuron degeneration. Eleven subtypes of LCCS have been characterized. However, there are likely to be many more genes that result in these conditions as more genetic studies are performed on products of conception due to spontaneous abortion or stillbirth. LCCS is more common in communities with high rates of consanguinity consistent with the recessive inheritance pattern. Variable expression of LCCS phenotypes may be due to residual gene function in patients with missense variants or modifier genes. The recessive phenotypes of LCCS have made them more amenable to study by complete knockdown of gene expression.

### *4.1. Nuclear mRNA Export (GLE1, ERBB3, and PIP5K1C)*

The first three LCCS subtypes may all act through a similar pathway by supporting nuclear mRNA export. LCCS type 1 (LCCS1) is caused by mutations in *GLE1* RNA Export Mediator (*GLE1*), a regulator of post-transcriptional gene expression [70]. *GLE1* acts as an mRNA export factor, as well as by mediating translation initiation and termination [15]. In mice, in situ hybridization showed marked expression in the neural tube of 11 dpf embryos, specifically in the ventral portion from which motoneurons generate [70]. In zebrafish, the gene is expressed prominently in the central nervous system during development [33].

A mutation in *GLE1*, FinMajor, has been linked to LCCS1 by causing a splice-site mutation that results in a 3 amino acid insertion in the coiled-coil domain [70]. The coiled-coil domain is required for the protein to self-associates to form oligomers, and one group examined the effect of the FinMajor mutation on polypeptide self-association in vitro and in vivo [29]. Both in vitro and in living cells, the GLE1 protein self-aggregated, and FinMajor mutant oligomers were malformed. In human cell culture and in the yeast model, these malformed oligomers were found to perturb mRNA export from the nucleus [29].

Because the FinMajor mutation reduced function of the GLE1 protein in mRNA transport, *gle1* knockdown and knockouts were studied in zebrafish to understand its effects on development. Knockouts developed with small eyes and underdeveloped jaws and pectoral fins [33]. Cell death was also observed in the head and spinal cord, and there were fewer motoneurons than in wild type fish. Motoneurons also exhibit aberrant branching that worsened with age. Maternal *gle1* mRNA is loaded into the yolk sac of oocytes, where it contributes to zebrafish embryogenesis; therefore, morpholino oligonucleotides were also used to knock down expression of the mRNA in embryos. This exacerbated the phenotype, with CNS cell death becoming apparent earlier in development, at 1 dpf, which suggests an important role for *gle1* for early development. Notably, this phenotype is rescued in morphants injected with human wild type *GLE1*, but not when injected with the FinMajor allele [33]. Thus, this zebrafish model may be a viable tool for screening and determining the pathogenicity of human alleles.

LCCS2 is due to loss-of-function mutations in Erb-B2 Receptor Tyrosine Kinase 3 (*ERBB3*), which encodes HER3, a known modulator of the phosphatidylinositol pathway [44]. Interestingly, variants in LCCS3 were found to be due to variants in Phosphatidylinositol-4-Phosphate 5-Kinase Type 1 (*PIP5K1C*), which encodes the enzyme PIPK-gamma of the phosphatidylinositol pathway [43]. Nouslainen et al. realized that both *ERBB3* and *PIP5K1C* are involved in the synthesis of inositol hexakisphosphate, which binds directly to yeast *Gle1*, activating *Dbp5* for mRNA transport [70]. Because Gle1 is expressed in the neural tube during development, pathogenic variants in this gene can be devastating to development of the nervous system, as *Gle1* is integral to mRNA transport [70].

### *4.2. Peripheral Nerve (CNTNAP1, ADGRG6, GLDN)*

The genes responsible for LCCS7, LCCS9, and LCCS11 are all highly expressed in peripheral nerves and required for proper peripheral nerve function. *Contactin Associated Protein 1* (*CNTNAP1)*, which causes LCCS7, is a contactin-associated protein that is required for localization of the paranodal junction proteins contactin and neurofascin. *CNTNAP1* is

also required for the normal spatial expression patterns of neuronal sodium and potassium channels [19]. Likewise, the causative gene for LCCS11, *gliomedin (GLDN)*, is a ligand for neurofascin and Nrcam, which are axonal immunoglobulin cell adhesion molecules critical for association with sodium channels at the nodes of Ranvier [71]. *Adhesion G Protein Coupled Receptor G6 (ADGRG6)*, which is also known as GPR126, is required for normal Schwann cell development. Thus, defects in all three genes likely result in similar peripheral nerve dysfunction at very early stages in development that leads to the LCCS phenotype.

### **5. Conclusions**

Many techniques and organisms have been used for modeling arthrogryposis, each of which provides complementary information that is essential for understanding basic mechanisms and will yield translational benefits to human patients. There is an expanding list of genes that are associated with limb contractures, as one of many clinical features, beyond those discussed in this review article. Other genes are yet to be discovered, and disease models are often needed to provide evidence of causality. Furthermore, as exome sequencing becomes standard care, disease models may be helpful to facilitate variant interpretation. However, it will be essential to develop more efficient methods for introducing and studying large numbers of individual variants.

Although most genes responsible for distal arthrogryposis and LCCS are skeletal muscle sarcomeric genes or genes critical for neuronal function and neuromuscular transmission, crucial aspects remain to be established using disease models. It is important to determine whether common pathways and mechanisms supported by the genetic data will predict a unifying approach to therapy. Furthermore, now that gene therapies are becoming viable treatment mechanisms, where and when the defect needs to be corrected to prevent development of the DA or LCCS phenotype needs to be elucidated. Disease models will be essential to improve treatment for these challenging disorders.

**Author Contributions:** J.W. and C.A.G. wrote the first draft of the manuscript. J.W., A.J., M.B.D. and C.A.G. wrote or critically reviewed the manuscript, and reviewed and approved the final version. All authors have read and agreed to the published version of the manuscript.

**Funding:** Research reported in this publication was supported by National Institute of Arthritis and Musculoskeletal and Skin Diseases under Award Numbers R01AR067715 and R01AR070299, Eunice Kennedy Shriver National Institutes of Child Health and Human Development of the National Institutes of Health under the Award Numbers R03 HD104065 an P01 HD084387, Washington University Institute of Clinical and Translational Sciences grant UL1 TR002345 from the National Center for Advancing Translational Sciences of the National Institutes of Health, Washington University Musculoskeletal Research Center (NIH/NIAMS P30 AR057235) (NIH/NIAMS P30 AR074992), and the Eunice Kennedy Shriver National Institute of Child Health & Human Development of the National Institutes of Health under Award Number P50 HD103525 to the Intellectual and Developmental Disabilities Research Center at Washington University. This study was funded with support from Shriners Hospital for Children, and the Children's Discovery Institute of Washington University and St Louis Children's Hospital.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**


## *Article* **Prevalence of** *POC5* **Coding Variants in French-Canadian and British AIS Cohort**

**Hélène Mathieu <sup>1</sup> , Aurélia Spataru <sup>1</sup> , José Antonio Aragon-Martin <sup>2</sup> , Anne Child <sup>3</sup> , Soraya Barchi <sup>1</sup> , Carole Fortin 4,5, Stefan Parent 1,6 and Florina Moldovan 1,7,\***


**Abstract:** Adolescent idiopathic scoliosis (AIS) is a complex common disorder of multifactorial etiology defined by a deviation of the spine in three dimensions that affects approximately 2% to 4% of adolescents. Risk factors include other affected family members, suggesting a genetic component to the disease. The *POC5* gene was identified as one of the first ciliary candidate genes for AIS, as three variants were identified in large families with multiple members affected with idiopathic scoliosis. To assess the prevalence of p.(A429V), p.(A446T), and p.(A455P) *POC5* variants in patients with AIS, we used next-generation sequencing in our cohort of French-Canadian and British families and sporadic cases. Our study highlighted a prevalence of 13% for *POC5* variants, 7.5% for p.(A429V), and 6.4% for p.(A446T). These results suggest a higher prevalence of the aforementioned *POC5* coding variants in patients with AIS compared to the general population.

**Keywords:** *POC5*; adolescent idiopathic scoliosis; cilia; genetics; spine deformity

### **1. Introduction**

Adolescent idiopathic scoliosis (AIS) is a common disorder characterized by a combination of deviations of the spine in the sagittal and the coronal plane, with vertebral rotation. It affects approximatively 3% of the adolescent population [1,2]; affects females more than males, with a ratio ranging from 1.5:1 to 3:1 [2,3]; and is more prevalent in northern latitudes [4]. The etiology of AIS remains not fully understood, but it is now widely accepted that this disorder has a genetic component, as supported by family history, and higher concordance rates for monozygotic twins compared to dizygotic twins [5–7]. Approximatively 40% of AIS patients have a family history [8,9]. The genetic model for AIS remains unclear; indeed, several studies have suggested that it is a polygenic and multifactorial disease [9]. However, other analyses suggest mendelian inheritance, such as autosomal dominant or sex-related, could show with incomplete penetrance [10–12]. Since the advent of next-generation sequencing, candidate-gene analysis using pedigrees and population-based genome-wide association studies (GWAS) have been widely used to assess the genetic etiology of AIS. Despite all these efforts, only a few of the candidate genes have been functionally linked to the development of AIS. In 2015, Patten et al. [13] performed a linkage analysis followed by exome sequencing, and identified coding variants in the centrosomal protein gene *POC5* (NM\_001099271) in a multiplex four-generation

**Citation:** Mathieu, H.; Spataru, A.; Aragon-Martin, J.A.; Child, A.; Barchi, S.; Fortin, C.; Parent, S.; Moldovan, F. Prevalence of *POC5* Coding Variants in French-Canadian and British AIS Cohort. *Genes* **2021**, *12*, 1032. https://doi.org/10.3390/ genes12071032

Academic Editors: Philip Giampietro, Nancy Hadley-Miller, Cathy L. Raggio and Zhousheng Xiao

Received: 17 May 2021 Accepted: 28 June 2021 Published: 1 July 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

AIS French family [13]. Indeed, a rare SNP, p.(A446T), was found to perfectly segregate with AIS in four families from a pool of 41 AIS French families. Two additional rare variants, p.(A429V) and p.(A455P), were found in the *POC5* gene in AIS sporadic cases. Moreover, all three *POC5* coding variants were functionally related to AIS using a zebrafish model [13]. More recently, a fourth SNP (rs6892146) was identified to be associated with AIS development in a Chinese population [14], but the three other SNPs were not found in this cohort. POC5 is a centriolar protein that is essential for cell cycle progression, cilia elongation [15], centriole elongation, and maturation. Since the identification of the *POC5* gene, a ciliary gene that is strongly associated with AIS, the ciliary pathway has been thoroughly investigated and has revealed promising results [12,15–18].

To investigate the prevalence of *POC5* genetic variants in AIS, French-Canadian and British AIS patients were screened by targeted or whole-exome sequencing followed by Sanger analyses of DNA.

### **2. Materials and Methods**

### *2.1. Patients*

One hundred and seventy-seven AIS patients with a Cobb angle of at least 10◦ , 73 patients from a British cohort (63 unrelated consecutives IS individuals and 10 families), and 104 patients from a French-Canadian cohort (30 unrelated consecutive AIS individuals and 74 patients from 43 families), were recruited. Genomic DNA was extracted from saliva (Cat. RU49000, Norgen, Thorold, ON, Canada) or blood following the protocol provided by the company.

Samples were collected in accordance with the policies regarding the ethical use of human tissues for research. The protocol used in this study was approved by the Centre hospitalier universitaire Sainte-Justine Ethics Committee (#3704).

The control population consisted of an in-house cohort of 1268 individuals with similar ancestry (French, French-Canadian, or European) and was not screened for the presence of AIS [13].

### *2.2. Targeted Exome Sequencing*

A library was generated from 10 ng of genomic DNA to perform targeted sequencing of the *POC5* gene using the Ion AmpliSeq (Life Technologies). Sequencing of the 12 exons of *POC5* of 63 British unrelated AIS individuals and 10 families, and 18 French-Canadian families affected with autosomal dominant AIS, was performed by the Centre de Génomique Clinique Pédiatrique intégré CHU Sainte-Justine. The library was prepared using the Ion AmpliSeq DNA and RNA Library Preparation (MAN0006735, Rev. B.0, Ion Torrent, Life Technologies) prior to the exome sequencing following the Ion PGM IC 200 Kit (MAN0007661, Rev. B.0) protocol. Sequencing reads were aligned to the reference human genome sequence (hg19) [19], and the SNPs were identified by Ion Reporter (Ion Torrent). Identified variants were annotated using ANNOVAR [20], which is implemented in the VarAFT software [21] that we used to select exonic and splicing variants with a MAF (minor allele frequency) ≤ 1%. The identified SNPs were then compared to the EVS (Exome Variant Server) database, the Genome Aggregation Database (gnomAD: https://gnomad.broadinstitute.org (accessed on 21 March 2020)), and our in-house control cohort (*n* = 1268).

### *2.3. Whole-Exome Sequencing*

A library was generated from 10 ng of genomic DNA to perform whole-exome sequencing using the HiSeq 4000 sequencing machine from the Centre de Génomique Clinique Pédiatrique intégré CHU Sainte-Justine. The gDNA extracted from saliva or blood was sheared to a mean fragment size of 200 pb (Covaris E220 Montreal, PQ, Canada), and gDNA fragments were used for DNA library preparation following the protocol for the SeqCap EZ HyperCap (Roche-NimbleGen, Pleasanton, CA, USA). Enriched DNA fragments were sequenced with 100 pb paired-end reads (HiSeq 4000, Illumina, Vancouver,

BC, Canada). Sequencing reads were converted to FASTQ using bcl2fastq software (Illumina) and trimmed by Trimmomatic. The reads were then aligned to the reference human genome sequence (hg19) using the Burrows–Wheeler transform (BWT), followed by a local alignment for indels using Genome Analysis ToolKit software (Broad Institute). Duplicate sequencing reads were excluded by Picard software, and SNPs were identified using the GATK Unified Genotyper and annotated by ANNOVAR software [20], implemented in the VarAFT software [21] that we used to select exonic and splicing variants with a MAF (minor allele frequency) ≤ 1% in the *POC5* gene. The identified SNPs were then compared to the EVS (Exome Variant Server) database, the Genome Aggregation Database (gnomAD: https://gnomad.broadinstitute.org (accessed on 21 March 2020)), and our in-house control cohort (*n* = 1268).

### *2.4. Validation with Sanger Sequencing*

All *POC5* coding variants identified by WES or targeted exome sequencing were validated by Sanger sequencing. The segregation of the coding variants was also completed using Sanger sequencing for the take-out-concerned families studied. PCR amplification was performed using the TransStart FastPfu FLY DNA Polymerase (AP231, Civic Bioscience) following the instructions of the manufacturer with primers FWD-50 - GGACCAAACTTTAGCCAGTATG-30 and RV-50 -TCTCGATCTCCTGACCTCGT-30 . Sanger sequencing of amplicons was performed on an ABI 3730xl DNA Analyzer (Applied Biosystems, Louisville, KY, USA) at Eurofins Genomics (Louisville, KY, USA).

### *2.5. Statistics*

The allelic frequency of the *POC5* coding variants was compared to the control population using a one-tailed Fischer's exact test. A *p*-value < 0.05 was considered statistically significant.

### **3. Results**

### *3.1. Patient Enrolment*

Since the identification of *POC5* as a candidate gene for adolescent idiopathic scoliosis and its functional validation, we have analyzed the prevalence of *POC5* coding variants within the AIS population, and also have sought to identify new candidate genes. Ninety AIS French-Canadian families were recruited demonstrating different types of transmission: autosomal dominant or recessive, and 30 AIS sporadic cases. For this study, 43 families were selected. Our French-Canadian cohort was supplemented with the 73 UK AIS patients of our collaborators from London, United Kingdom.

### *3.2. POC5 Variants Prevalence Using Next-Generation Sequencing*

Among the 53 French-Canadian and British AIS families and 94 unrelated AIS patients, 177 AIS patients were screened for *POC5* coding variants. The combination of whole-exome sequencing and targeted sequencing by AmpliSeq using a targeted Amplicon chip followed by a confirmation with Sanger sequencing revealed that 13% (*p* < 0.0001, Fisher's exact test) of AIS patients with or without family history were carrying one of the three variants of *POC5,* previously identified as the first causative gene [13]. Indeed, 11 of them carried the A429V variant (6 families and 5 sporadic cases); i.e., 7.5% (*p* < 0.0001), and 11 were found to carry the A446T variant (2 familial, 6 sporadic cases); i.e., 6.4% (*p* = 0.0052). No patient with the A455P variant was reported (Table 1).

**Table 1.** *POC5* coding variant distribution among the French-Canadian and British AIS cohort compared to 1268 controls. The number of patients from families or sporadic cases that were carrying *POC5* coding variants and the frequency for each of the 2 variants (p.(A446T) and p.(A429V)) are reported.


The EVS (Exome Variant Server) database [22] reported a minor allele frequency (MAF) in the European-American population of 1.2% for the p.A429V variant and 1.5% for p.A446T. The Genome Aggregation Database (gnomAD: https://gnomad.broadinstitute. org (accessed on 21 March 2020)) [23] reported a MAF of 1.1% for the p.A429V (rs146984380) variant and 1.6% for p.A446T (rs34678567).

### *3.3. Segragation Analysis of POC5 Coding Variants with AIS*

The segregation analysis of the *POC5* coding variants with the disease for the AIS families was then performed using Sanger sequencing (Figure 1). The variant p.(A446T) was found in two families (F62 and F80), and showed a perfect segregation with the disease in family 80. Sadly, DNA was not able for the rest of family 62. For the families that were found to carry the variant p.(A429V)—F02, F18, F37, F57, F58, and F66—the segregation analysis showed incomplete penetrance (Figure 1).

**Figure 1.** *Cont*.

**Figure 1.** Pedigrees of French-Canadian families showing the co-segregation of POC5 variants (c. 1286C > T (p.(A429V) and c.1336G > A (p.(A446T)) with the disease. Open circles and squares indicate unaffected individuals, Blackened circles and squares indicate affected females and males, respectively. Blue circles and squares indicate juvenile females and males. Yellows stars indicate exomed AIS patients. \* Incomplete penetrance.

### **4. Discussion**

We reported the prevalence of *POC5* gene variants in 13% of AIS patients with or without a family history of this condition; that is, six times more frequent than in our inhouse control cohort that matched for ethnicity. Two methods were used for the screening of *POC5* gene in both French-Canadian and British families with AIS. This study confirmed the previously reported data by Patten et al. [13], which evidenced three rare SNPs (p.A446T, p.A429V, and p.A455P) in the *POC5* gene in four families of a pool of 41 AIS French families and 150 IS cases from France. We also confirmed the autosomal dominant transmission pattern of *POC5* coding variants in families with AIS.

Adolescent idiopathic scoliosis is a complex disease with a multifactorial etiology including genetic, environmental, and hormonal factors, but the pathogenesis of this disease remains poorly understood. To decipher the genetic implication in AIS, different approaches were used: association studies and linkage analyses to identify causative genes or genes that may impact AIS susceptibility and/or disease progression. Many association studies were performed using GWAS technology and genome-wide linkage analysis, followed by exome sequencing and highlighted multiple-locus candidate, especially on chromosomes 6, 9, 16, 17 [24], and 19 [25]. *POC5*, locus 5q13.3, was the first unambiguously causative gene that was identified and functionally related to the diseases using a zebrafish (*Danio rerio*) model. The three causative variants of the *POC5* gene (p.A429V, p.A446T, and p.A455P) were found in exon 10, and all three corresponded to the substitution of an alanine to another amino acid, suggesting that the alanine in exon 10 of *POC5* could

play an important role in the pathogenesis of AIS. Our study supported the importance of *POC5* variants in the AIS population. In AIS patients from French-Canadian and British families, we found *POC5* variants in 13% of AIS cases, but not all the variants showed a perfect segregation with the disease, highlighting the fact that AIS is a complex disease, and very likely a polygenic disorder. In our cohort of French-Canadian subjects, the previously identified variant in IS cases, namely p.A429V, was the most predominant.

To date, these three variants of *POC5* have not been found in the Chinese population [14]; however, in the cohort of this Chinese study, AIS patients were not screened for their family history. It is important to underline that *POC5* gene variants were first identified in a huge multiplex family in which variants and the disease were transmitted in an autosomal dominant manner through four generations. This was consistent with the hypothesis of the initiating role of POC5 in the development of the familial form of AIS.

POC5, a conserved protein, is essential for centriole assembly, elongation, and cell cycle [15,26]. Recent studies identified POC5 as interacting with POC1B, FAM161A, and centrin-2 to build an inner scaffold with an helicoidal assembly, which provides the structural flexibility and strength to maintain microtubule and ciliary cohesion [27]. It can be hypothesized that those alanines in exon 10 of *POC5* are important for the proper interaction of POC5 with its protein partners, and therefore for the proper ciliary integrity. Several osteogenic pathways are hosted on primary cilia, including Sonic Hedgehog, Wnt, or the calcium-signaling pathway, and could play a part in AIS pathogenesis. More recently, primary cilia have been found to be related to bone-mass reduction through the microtubule (MT) network disorganization caused by the reduction of MT anchorage to the basal body [28]. Altogether, these findings could explain the low bone mineral density observed in AIS patients [29].

### **5. Conclusions**

The *POC5* gene was one of the first pieces of the puzzle of the genetic etiology of AIS, and since its identification, other genes coding for components of the primary cilia have been found to be linked to this disease [12,16]. Our study confirmed a higher prevalence of *POC5* variants in patients with AIS compared to the general population, as Patten et al. [13] already reported, and this reinforces that *POC5* plays a role in the pathogenesis of AIS. The functionality of those variants also was previously reported [13], and was related to ciliary functionality [15]. Further investigations are necessary in order to identify additional genes and finally draw complete pathways that relate the primary cilia to AIS. Finding causative genes for AIS and understanding the molecular consequences of these gene variants is necessary to improve the knowledge about this disease, especially by deciphering the genetic involvement.

**Author Contributions:** H.M.: methodology, investigation, validation, writing—original draft preparation, writing—review and editing; A.S.: investigation, formal analysis; J.A.A.-M.: recruitment of the patients, funding acquisition; A.C.: recruitment of the patients, funding acquisition, editing; S.B.: recruitment of the patients, writing—original draft preparation; C.F.: recruitment of the patients; S.P.: recruitment of the patients; F.M.: project administration, funding acquisition, visualization, writing—original draft preparation, writing—review and editing. All authors have read and agreed to the published version of the manuscript.

**Funding:** This study was supported by the Yves Cotrel Foundation (to F.M., S.P., and A.C.), the Scoliosis Research Society (to F.M.), the Marfan Trust UK (to A.C. and J.A.A.-M.), the Peter and Sonia Field Charitable Trust (to A.C.), and St. George's University of London (to A.C. and J.A.A.-M.). This study also was supported by RSBO (Réseau de recherché en Santé Buccodentaire et Osseuse), CHU Sainte-Justine Foundation (to H.M.), and FRQS (to H.M.).

**Institutional Review Board Statement:** The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Centre hospitalier universitaire Sainte-Justine Ethics Committee (protocol code #3704, July 2016).

**Informed Consent Statement:** Written informed consent was obtained from all subjects involved in this study and from legally authorized representative of minor participants prior to enrollment.

**Data Availability Statement:** The data presented in this study are available in doi: 10.3390/genes1 2071032.

**Acknowledgments:** Special thanks to the Clinical Research Orthopaedic Unit (URCO) team, who contributed to the recruitment of the patients. We also thank Philippe Campeau for his help with the exome analysis.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

### **References**


### *Article* **The Mutational Landscape of** *PTK7* **in Congenital Scoliosis and Adolescent Idiopathic Scoliosis**

**Zhe Su 1,2,†, Yang Yang 1,3,4,†, Shengru Wang 1,3,4, Sen Zhao 1,4, Hengqiang Zhao 1,2,3, Xiaoxin Li 3,4,5 , Yuchen Niu 3,4,5, Deciphering Disorders Involving Scoliosis and COmorbidities (DISCO) Study Group ‡ , Guixing Qiu 1,3,4, Zhihong Wu 3,4,5, Nan Wu 1,3,4 and Terry Jianguo Zhang 1,3,4,\***


**Abstract:** Depletion of *ptk7* is associated with both congenital scoliosis (CS) and adolescent idiopathic scoliosis (AIS) in zebrafish models. However, only one human variant of *PTK7* has been reported previously in a patient with AIS. In this study, we systemically investigated the variant landscape of *PTK7* in 583 patients with CS and 302 patients with AIS from the Deciphering Disorders Involving Scoliosis and COmorbidities (DISCO) study. We identified a total of four rare variants in CS and four variants in AIS, including one protein truncating variant (c.464\_465delAC) in a patient with CS. We then explored the effects of these variants on protein expression and sub-cellular location. We confirmed that the c.464\_465delAC variant causes loss-of-function (LoF) of *PTK7*. In addition, the c.353C>T and c.2290G>A variants identified in two patients with AIS led to reduced protein expression of PTK7 as compared to that of the wild type. In conclusion, LoF and hypomorphic variants are associated with CS and AIS, respectively.

**Keywords:** protein tyrosine kinase 7 (*PTK7*); congenital scoliosis; adolescent idiopathic scoliosis; whole exome sequencing

### **1. Introduction**

Protein tyrosine kinase (*PTK7*), also known as colon carcinomakinase-4 (*CCK-4*), is an evolutionarily conserved atypical receptor tyrosine kinase. The PTK7 protein consists of seven extracellular immunoglobulin (Ig)-like domains, one transmembrane domain and one catalytically inactive kinase domain [1]. *PTK7* plays an important role in vertebrate canonical and non-canonical Wnt (planar cell polarity, PCP), Semaphorin/Plexin and vascular endothelial growth factor (VEGF) signaling pathways [2–5]. These signaling pathways are important for embryonic developmental processes, including tissue specification, axial morphogenesis, formation of the cardiovascular, endocrine and immune systems, and regulation of neural crest migration and tumorigenesis [6–10].

Zebrafish models depleted of *ptk7* presented various spinal curve phenotypes. Maternal zygotic *ptk7* (MZ*ptk7*) and zygotic *ptk7* (Z*ptk7*) mutant zebrafish develop spinal

**Citation:** Su, Z.; Yang, Y.; Wang, S.; Zhao, S.; Zhao, H.; Li, X.; Niu, Y.; Deciphering Disorders Involving Scoliosis and COmorbidities (DISCO) Study Group; Qiu, G.; Wu, Z.; et al. The Mutational Landscape of *PTK7* in Congenital Scoliosis and Adolescent Idiopathic Scoliosis. *Genes* **2021**, *12*, 1791. https://doi.org/10.3390/ genes12111791

Academic Editors: Philip Giampietro, Nancy Hadley-Miller, Cathy L. Raggio and Donato Gemmati

Received: 11 July 2021 Accepted: 8 November 2021 Published: 12 November 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

curvatures that model congenital scoliosis (CS) and adolescent idiopathic scoliosis (AIS), respectively, due to differential timing of ptk7 loss-of-function [2,11]. Meanwhile, a novel sequence variant *PTK7P545A* has been reported in a patient with AIS, but without further in vitro investigation [11]. The association between human *PTK7* variants and scoliotic phenotypes continue to be understudied.

In this study, we analyzed variants in *PTK7* identified in a mixed cohort of patients with congenital scoliosis and adolescent idiopathic scoliosis, then performed in vitro experiments to determine the effects of these variants on protein expression and sub-cellular location.

### **2. Materials and Methods**

### *2.1. Human Subjects*

A total of 885 Han Chinese individuals who received a diagnosis of congenital scoliosis (CS, *n* = 583) and adolescent idiopathic scoliosis (AIS, *n* = 302) were recruited between 2009 and 2018 at Peking Union Medical College Hospital (PUMCH) for the Deciphering disorders Involving Scoliosis and COmorbidities (DISCO, http://discostudy.org/, accessed on 1 November 2021) project. Clinical manifestations, physical examination results, and detailed medical histories were obtained with the patients' informed consent. Clinical diagnoses were confirmed by radiology imaging, including X-ray and computed tomography (CT). The criteria for the diagnosis of congenital scoliosis and adolescent idiopathic scoliosis were as follow: congenital scoliosis was caused by vertebral defects, and may be associated with rib anomalies, while idiopathic scoliosis was diagnosed by spinal curvature exceeding 10◦ on a plain antero-posterior X-ray image, with no other identifiable underlying disease. For a diagnosis of adolescent idiopathic scoliosis, patients were required to have an onset age of 10–18 years old. All radiographic evaluations were conducted by trained spine surgeons, while clinical reviews were performed by alternate observers blinded to the radiographic assessment. Patients with a prior molecular diagnosis such as a disease-causing genetic variant were excluded. Blood was obtained from all subjects and whole-exome sequencing (WES) was performed.

Written informed consent for both clinical data and the genetic exome sequencing was obtained from each participant prior to study participation. This study was approved by the Department of Scientific Research and Ethics Committee of PUMCH in China.

### *2.2. Bioinformatic Analysis and Variant Interpretation*

Two DNA extraction and purification kits, Red Blood Cell (RBC) Lysis Buffer (R1010, Solarbio) and Circulating Nucleic Acid Kit (55114, Qiagen), were used in accordance with the manufacturers' protocols. Approximately 4 mL of peripheral blood was transferred to an Eppendorf safe lock tube after sufficient centrifugation. 10 mL RBC lysis solution was added to each centrifuge tube for efficient lysis. 4.5 mL cell lysis solution and 250 µL proteinase K solution were added to each tube and placed at 56.5 ◦C constant temperature shaker digestion overnight. 1.5 mL protein precipitation solution was added to each tube and allowed to incubate for 10 min at −20 ◦C. After centrifugation, the supernatant was taken, and 7 mL precooled isopropanol was added into the supernatant until floccule was precipitated. Finally, 1 mL of 75% ethanol was used to wash the DNA pellet after inverting the tube several times, followed by centrifugation at 17,000× *g* for 10 min. The quality and quantity of the DNA was evaluated using a spectrophotometer (NanoPhotometer Pearl, Denville Scientific, Inc., Holliston, MA, USA) and fluorometer (Qubit® dsDNA High Sensitivity and dsDNA Broad Range assay, Life Technologies Corporation, Waltham, MA, USA). DNA samples were prepared in Illumina libraries and then underwent whole-exome capture with the SureSelect Human All Exon V6 + UTR r2 core design (91 Mb, Agilent, Santa Clara, CA, USA), followed by sequencing on the Illumina HiSeq 4000 platform in 150-bp paired-end reads mode (Illumina, San Diego, CA, USA). WES data processing was performed with the Peking Union Medical College Hospital Pipeline (PUMP) based on the reference genome GRCh37-v1.6 [12,13]. Combined Annotation Dependent Depletion (CADD PHRED-score) [14] and Polyphen-2 [15] were used to predict the pathogenicity of

candidate variants. Genotype was filtered for read-depth (DP > 10×), genotype quality (GQ > 20), quality by depth (QD < 2), strand odds ratio (SOR > 9), and allele balance (AB > 0.25). The populational frequency of each QC-passed variant was obtained from the public population databases, including the 1000 Genomes Project, the Exome Sequencing Project [16], the Genome Aggregation Database (gnomAD) [17], and the in-house database of DISCO (Deciphering disorders Involving Scoliosis and COmorbidities, http://discostudy.org/, ≈8000 exomes/genomes, accessed on 1 November 2021) study. Rare variants (minor allele frequency < 0.001) were retained for further filtering. From these rare variants, we included the protein-altering or splice-region variants for subsequent analysis. Potential spicing variants were predicted using SpliceAI [18].

Candidate variants of *PTK7* were selected based on the following criteria:


### *2.3. Site-Directed Mutagenesis*

pcDNA3.1+ with C-terminal flag-tagged wild type (WT) and variant *PTK7* cDNA (NM\_152881.3) plasmids were acquired from Beijing Hitrobio Biotechnology (Beijing, China). The variant constructs were sequenced on both strands to verify nucleotide changes.

### *2.4. Cell Culture and Transfection Assay*

HEK293T cells were cultured in Dulbecco's Modified Eagle's medium (DMEM, Invitrogen, Carlsbad, CA, USA) supplemented with 10% fetal bovine serum (Gibco, Waltham, MA, USA), penicillin (50 U/mL), and streptomycin (50 µg/mL) in six-well plates. HEK293T cells were transfected with full-length WT or variant *PTK7* constructs using Lipofectamine 3000 (Invitrogen).

### *2.5. Western Blot Assay*

After a 48 h transfection, HEK293T cells were harvested in RIPA lysis buffer (Solarbio, Beijing, China) and whole-cell lysate was resolved on gels under reducing conditions, transferred to a Nitrocellulose Transfer (NC) membrane, blocked with non-fat milk for 30 min at room temperature, and probed with primary antibodies: mouse anti-DDK(FLAG) antibody (1:1000, ZSGB-BIO, TA-05) and mouse anti-β-actin antibody (1:5000, Proteintech, 66009-1- Ig) over-night at 4 ◦C, and then with a goat anti-mouse horseradish peroxidase-conjugated secondary antibody (1:5000, ZSGB-BIO, ZB-2305). Immunoreactivity was visualized using Western Lighting chemiluminescence reagent (Beyotime, Shanghai, China). All Western blotting experiments were repeated three times.

### *2.6. Immuno-Fluorescence*

Cells were grown on 35 mm glass bottom cell culture dishes for 48 h after transfection, washed three times, fixed in 4% paraformaldehyde, permeabilized with 0.1% Triton X-100, and incubated with primary anti-DDDDK-tag mouse antibody (1:1000, MBL, M185-3L) at 4 ◦C overnight. Cells were incubated and stained with secondary antibody Alexa FlourTM 488 goat anti-mouse IgG (1:2000, Invitrogen, 1911843), then covered with DAPI (Solarbio, Beijing, China).

### *2.7. Statistical Analysis*

The overall protein expression levels were normalized to WT (set as 1.0) and mean values of variant versus WT from all three experiments were compared using unpaired *t*-test. All charts are drawn and analyzed using GraphPad Prism 8 and *p* < 0.05 was considered significant for all analyses.

**3. Results** 

*2.7. Statistical Analysis*

ered significant for all analyses.

### **3. Results** variants in nine patients were found. The mean age of the included nine patients with

*Cells* **2021**, *10*, x FOR PEER REVIEW 4 of 13

A total of 885 genomes from patients with scoliosis were sequenced and eight *PTK7* variants in nine patients were found. The mean age of the included nine patients with variants was 11.11 ± 5.51 years. In five CS patients, the mean Cobb angle of the coronal plane was 59.94◦ ± 25.85◦ . Among them, three patients displayed kyphosis with a mean angle of 50.53◦ ± 8.05◦ . The mean Cobb angle of structural curve in four AIS patients was 48.95◦ ± 4.77◦ . In the CS group (*n* = 583), four possibly deleterious variants were revealed in five patients, including one frameshift variant and three missense variants (c.464\_465delAC, c.1394A>G, c.1879G>A, c. 1955G>T) (Table 1). One of the missense variants (c.1955G>T) was identified in two patients. In the AIS group (*n* = 302), four deleterious missense variants (c.49C>T, c.353C>T, c.2290G>A, c.2384G>A) were identified (Figure 1A, Table 2). No peripheral blood samples from the patients' families were obtained, and no similar family history of spinal deformity was found after follow-up. variants was 11.11 ± 5.51 years. In five CS patients, the mean Cobb angle of the coronal plane was 59.94° ± 25.85°. Among them, three patients displayed kyphosis with a mean angle of 50.53° ± 8.05°. The mean Cobb angle of structural curve in four AIS patients was 48.95° ± 4.77°. In the CS group (*n* = 583), four possibly deleterious variants were revealed in five patients, including one frameshift variant and three missense variants (c.464\_465delAC, c.1394A>G, c.1879G>A, c. 1955G>T) (Table 1). One of the missense variants (c.1955G>T) was identified in two patients. In the AIS group (*n* = 302), four deleterious missense variants (c.49C>T, c.353C>T, c.2290G>A, c.2384G>A) were identified (Figure 1A, Table 2). No peripheral blood samples from the patients' families were obtained, and no similar family history of spinal deformity was found after follow-up.

The overall protein expression levels were normalized to WT (set as 1.0) and mean

A total of 885 genomes from patients with scoliosis were sequenced and eight *PTK7*

values of variant versus WT from all three experiments were compared using unpaired ttest. All charts are drawn and analyzed using GraphPad Prism 8 and *p* < 0.05 was consid-

**Figure 1.** Mapping and conservation analysis of the identified variants. (**A**) Mapping of eight *PTK7* variants revealed that p.L17F, p.S118F, p.H155Pfs\*16 and p.K465R are lo-**Figure 1.** Mapping and conservation analysis of the identified variants. (**A**) Mapping of eight *PTK7* variants revealed that p.L17F, p.S118F, p.H155Pfs\*16 and p.K465R are located in the extracellular region of PTK7 protein. p.D764N and p.R795H are located in the intracellular pseudokinase domain. (**B**) The conservation of variants in human and other vertebrate species.

cated in the extracellular region of PTK7 protein. p.D764N and p.R795H are located in

### the intracellular pseudokinase domain. (**B**) The conservation of variants in human and *3.1. Variant and Phenotypic Characteristics*

other vertebrate species. Patient SCO1905P0038 is a 13-year-old male with T12 butterfly vertebra and T9-T10 segmentation defect. The spinal plain radiograph shows not only a coronal curve to the left, but also has a severe thoracolumbar kyphosis in the sagittal plane with a 65◦ Cobb angle, both results of continuous deformity in the vertebral body (Figure 2A). The patient has a heterozygous deletion between nucleotide 464 and 465 (c.464\_465delAC, p.H155Pfs\*16). This frameshift variant was mapped to the extracellular immunoglobulin region of PTK7 protein (Figure 1A) and is predicted to cause the early termination of mRNA translation. It is a novel variant, previously undescribed in mutational databases and is highly conserved across different vertebral species except zebrafish (Figure 1B).



*Genes* **2021**, *12*, 1791


Patient SCO1905P0038 is a 13-year-old male with T12 butterfly vertebra and T9-T10 segmentation defect. The spinal plain radiograph shows not only a coronal curve to the left, but also has a severe thoracolumbar kyphosis in the sagittal plane with a 65° Cobb angle, both results of continuous deformity in the vertebral body (Figure 2A). The patient has a heterozygous deletion between nucleotide 464 and 465 (c.464\_465delAC, p.H155Pfs\*16). This frameshift variant was mapped to the extracellular immunoglobulin region of PTK7 protein (Figure 1A) and is predicted to cause the early termination of mRNA translation. It is a novel variant, previously undescribed in mutational databases and is highly conserved across different vertebral species except zebrafish (Figure 1B).

*3.1. Variant and Phenotypic Characteristics*

**Figure 2.** Antero-posterior, lateral spinal X-ray and the spinal three-dimensional CT reconstruction of patient SCO1905P0038 (**A**), SCO2003P2127 (**B**), SCO2003P0372 (**C**), SCO1908P0053 (**D**), SCO1907P0150 (**F**), SCO2003P0632 (**G**), SCO2003P2288 (**H**), SCO2003P2237 (**I**). Antero-posterior, lateral spinal X-ray of patient SCO2003P0541 (**E**). Arrowheads point to the vertebral or rib deformities. **Figure 2.** Antero-posterior, lateral spinal X-ray and the spinal three-dimensional CT reconstruction of patient SCO1905P0038 (**A**), SCO2003P2127 (**B**), SCO2003P0372 (**C**), SCO1908P0053 (**D**), SCO1907P0150 (**F**), SCO2003P0632 (**G**), SCO2003P2288 (**H**), SCO2003P2237 (**I**). Antero-posterior, lateral spinal X-ray of patient SCO2003P0541 (**E**). Arrowheads point to the vertebral or rib deformities.

Patient SCO2003P2127 is a 21-year-old female with a diagnosis of congenital scoliosis. This patient presents with the failure of segmentation in the concave side of T9-T11, fusion of the 4th and 5th ribs, and absence of 12th ribs. As a result, the patient has a severe imbalance of the spine in the coronal plane, with a 92◦ Cobb angle of the main curve at T5- L1 and a compensatory curve of 45◦ at the lumbar level (Figure 2B). She also has a history of patent ductus arteriosus. The heterozygous missense variant c.1394A>G (p.K465R) of the *PTK7* gene is mapped to the extracellular immunoglobulin region of the PTK7 protein (Figure 1A). This variant is previously unreported in mutational databases and is highly conserved (Figure 1B). It is predicted by CADD and PolyPhen-2 to be deleterious.

Patient SCO2003P0372 is an 8-year-old female with L2 hemivertebrae and L2-L3 segmentation defect. Her spine has a 38◦ right curve at the lumbar region with the hemiver-

tebrae as its apex in the coronal plane, and a slight compensatory curve at the thoracic region (Figure 2C). She has a heterozygous missense variant c.1879G>A (p.G627R) of *PTK7*. This variant is located in the intracellular domain and adjacent to the transmembrane region of the PTK7 protein (Figure 1A). This variant is highly conserved (Figure 1B) and has been reported in the gnomAD database with low frequency. It is predicted by CADD and PolyPhen-2 to be deleterious.

Patient SCO1908P0053 is a 1-year-old female detected segmented wedge vertebrae in T11 and T12. This child displays severe imbalance in both the coronal and sagittal planes, in which the left curve reached 78◦ with the deformed vertebral body as the apex in the coronal plane, and severe kyphosis of 48◦ in the thoracolumbar region in the sagittal plane (Figure 2D). Due to her young age, long segment involvement, and the combination of thoracolumbar scoliosis and kyphosis, she received growth rod implantation and repeated growth rod extension processes over the past few years. Patient SCO2003P0541 is a 7-year-old female with segmentation failure from T7 to T11. Although the spinal deformity was discovered at the age of 7, the patient did not receive proper treatment until adulthood, leading to the continued progression of kyphoscoliosis. Similar to the patient SCO1908P0053, this patient has severe imbalance on both the coronal and sagittal planes in the thoracic spine, with a right curve of 87◦ and a thoracic kyphosis of 62◦ , which compared with the normal physiological thoracic kyphosis ranged from 10◦ to 40◦ (Figure 2E). She also suffered from chest deformity and diastematomyelia from MRI scans, which is consistant with the reported association between *PTK7* variants and neural tube defects (NTDs) [19]. The patient showed no dyspnea, sensory or motor disorders in the lower limbs, nor did she show urinal or excretory dysfunction. These two individuals carry the same heterozygous missense variant c. 1955G>T (p.R652L). This variant is located in the intracellular domain of PTK7, close to the pseudokinase (PK) domain (Figure 1A). This variant is conserved across vertebral species besides zebrafish (Figure 1B), and has been reported in gnomAD database with low frequency.

Patients SCO1907P0150, SCO2003P0632, SCO2003P2288 and SCO2003P2237 all suffer from adolescent idiopathic scoliosis (AIS). Patient SCO1907P0150 is an 11-year-old female. The patient has two curves on the coronal plane, a 47◦ left curve in the upper thoracic segment (T2-T6) and a 51◦ right curve in the thoracic segment (T6-T12) (Figure 2F). She had a congenital ventricular septal defect that was surgically treated. WES analysis reveals a missense variant c. 49C>T (p.L17F) in *PTK7*, which has not been reported in the mutational databases. This variant is located at the beginning of the extracellular portion of the PTK7 protein (Figure 1A). It is predicted by CADD and PolyPhen-2 score to be deleterious.

Patient SCO2003P0632 displays a 50◦ right curve and a 42◦ left curve at the thoracic and thoracolumbar levels, respectively, without significant trunk deviation (Figure 2G). A missense variant c. 353C>T (p.S118F) is identified in patient SCO2003P0632, a 12-year-old girl, which is also absent from mutational databases. It is predicted to be located at the junction between the first and second Ig domains of the extracellular part of PTK7 protein (Figure 1A). This variant is predicted by CADD and Polyphen-2 scores to be deleterious.

Patient SCO2003P2288 is a 14-year-old female with a c. 2290G>A (p.D764N) variant in *PTK7*. X-rays showed that the main curve was a 46◦ right-sided lumbar scoliosis, with a long compensatory curve of 25◦ at T1-T11 to maintain basic balance of the trunk (Figure 2H). This missense variant has been reported in gnomAD database with low frequency and the altered amino acid is highly conserved in vertebrates except zebrafish (Figure 1B). The variant is mapped to the PK domain of the intracellular portion of the PTK7 protein (Figure 1A). The pathogenicity assessment has contradictory results using CADD and Polyphen-2 scores.

Patient SCO2003P2237 is a 13-year-old girl. On the coronal plane, the patient presents a single thoracic curve from T7 to T12 with a Cobb Angle of 52◦ . Due to the severe vertebral rotation, a relatively obvious razor-back deformity is seen on the sagittal plane (Figure 2I). WES analysis reveals that she has a missense variant c.2384G>A (p.R795H) in *PTK7*. The variant amino acid is highly conserved in vertebrates (Figure 1B) and mapped to the PK

domain of the intracellular section of the PTK7 protein (Figure 1A). Reported in gnomAD database with low frequency, it is predicted by both CADD and Polyphen-2 scores to be deleterious.

### *3.2. Western Blot and Immunocytochemistry Analyses*

To identify the influence of the identified variants on PTK7 protein function, we evaluated overall protein expression and sub-cellular location of the variants of PTK7 compared to the WT. Western blotting analysis identified the immunoreactive-specific band for flag-tagged WT PTK7 at 150kDa and β-actin at 42kDa. The overall expression level was quantified by estimating bands from PNGase-treated samples and normalizing to WT. As anticipated, the overall expression of the frameshift variant (c.464\_465delAC, p.H155Pfs\*16) was significantly decreased compared to that of WT (*p* < 0.0001) (Figure 3A), indicating the loss-of-function effect of this variant. The expression level of two missense variants (p.S118F and p.D764N) were partially reduced (*p* = 0.0061 and *p* = 0.0293, respectively) (Figure 3A). Interestingly, these two variants were both identified in patients with AIS but not CS. There were no significant differences in the overall protein expression of the other missense variants (p.K465R, p.G627R, p.R562L, p.L17F, p.R795H) compared with the WT.

We also performed immunocytochemistry (ICC) assays and found there was a faintest signal from the frameshift variant of the PTK7 protein (Figure 3B), which supported the results of Western blotting assays. However, the sub-cellular location of all missense variant PTK7 protein did not change compared to WT (Figure 3B). *Cells* **2021**, *10*, x FOR PEER REVIEW 10 of 13

**Figure 3.** Western blot and ICC analyses. (**A**) HEK293T cells transiently transfected with WT or variant *PTK7* constructs revealed diminished PTK7 protein expression levels of p.H155Pfs\*16, p.S118F and p.D764N. \* *p* value < 0.05, \*\* *p* value < 0.01, \*\*\*\* *p* value < 0.0001. (**B**) ICC assays showed a faint signal of the frameshift variant PTK7 protein. No difference in PTK7 protein sub-cellular location was detected. **Figure 3.** Western blot and ICC analyses. (**A**) HEK293T cells transiently transfected with WT or variant *PTK7* constructs revealed diminished PTK7 protein expression levels of p.H155Pfs\*16, p.S118F and p.D764N. \* *p* value < 0.05, \*\* *p* value < 0.01, \*\*\*\* *p* value < 0.0001. (**B**) ICC assays showed a faint signal of the frameshift variant PTK7 protein. No difference in PTK7 protein sub-cellular location was detected.

Here, we performed WES on the genomes of 885 scoliosis patients, including 583 CS patients and 302 AIS patients, and identified seven missense variants and one frameshift variant in *PTK7*. Loss-of-function of PTK7 resulted in skeletal phenotypes in both zebrafish and mice. In zebrafish models, embryos with the *PTK7hsc9* variant (a mutant allele harboring a 10bp deletion that results in a frameshift and the incorporation of multiple

and axial mesoderm tissues, and abnormal three-dimensional spinal curvature with growth and development [2]. In 2016, Grimes et al. reported that mutated *ptk7* zebrafish exhibited spinal curvature as well as hydrocephalus, ependymal cell (EC) ciliary dysfunction and abnormal rate and pattern of the cerebrospinal fluid (CSF) flow [20]. The *chuzhoi* mutant mice embryo showed several congenital abnormalities including neural tube, heart and lung defects caused by the disruption of PTK7 protein expression [21]. In human, LoF mutations in *PTK7* and the double-heterozygous mutation of *PTK7* and *VANGL2* were associated with spina bifida and increased the genetic risk of NTDs [19]. However, only one case of scoliosis with a *PTK7* mutation has been previously reported [11], and our identification of these eight variants expands the variant and phenotypic

The PTK7 protein consists of an extracellular continuous immunoglobulin domain, a transmembrane domain and an intracellular PK domain [1]. PTK7 interacts with several molecules in both canonical and non-canonical Wnt signaling as a co-receptor. These molecules include Wnt ligands, Wnt receptors as well as intracellular components such as Dvls and β-catenin [22]. According previous studies, the extracellular domains and intracellular domains of PTK7 play distinct roles. By constructing and studying *PTK7* with

**4. Discussion** 

spectrum of *PTK7*.

### **4. Discussion**

Here, we performed WES on the genomes of 885 scoliosis patients, including 583 CS patients and 302 AIS patients, and identified seven missense variants and one frameshift variant in *PTK7*. Loss-of-function of PTK7 resulted in skeletal phenotypes in both zebrafish and mice. In zebrafish models, embryos with the *PTK7hsc9* variant (a mutant allele harboring a 10bp deletion that results in a frameshift and the incorporation of multiple premature termination codons) had defects in the convergence of both neuroectoderm and axial mesoderm tissues, and abnormal three-dimensional spinal curvature with growth and development [2]. In 2016, Grimes et al. reported that mutated *ptk7* zebrafish exhibited spinal curvature as well as hydrocephalus, ependymal cell (EC) ciliary dysfunction and abnormal rate and pattern of the cerebrospinal fluid (CSF) flow [20]. The *chuzhoi* mutant mice embryo showed several congenital abnormalities including neural tube, heart and lung defects caused by the disruption of PTK7 protein expression [21]. In human, LoF mutations in *PTK7* and the double-heterozygous mutation of *PTK7* and *VANGL2* were associated with spina bifida and increased the genetic risk of NTDs [19]. However, only one case of scoliosis with a *PTK7* mutation has been previously reported [11], and our identification of these eight variants expands the variant and phenotypic spectrum of *PTK7*.

The PTK7 protein consists of an extracellular continuous immunoglobulin domain, a transmembrane domain and an intracellular PK domain [1]. PTK7 interacts with several molecules in both canonical and non-canonical Wnt signaling as a co-receptor. These molecules include Wnt ligands, Wnt receptors as well as intracellular components such as Dvls and β-catenin [22]. According previous studies, the extracellular domains and intracellular domains of PTK7 play distinct roles. By constructing and studying *PTK7* with deletion of different domains, Hayes et al. found that the plasma membrane-tethered *Ptk7* extracellular domain (*Ptk7*∆ICD) was sufficient to promote normal PCP as well as the inhibition of canonical Wnt signaling [2], while the intracellular domains may play a specific role in oriented cell division, radial intercalation and cilia orientation [23–26]. The intracellular PK domain of PTK7 may act as a scaffold promoting the binding of intracellular proteins and other receptors. In *Xenopus*, the ptk7 intracellular domain was stimulated by Wnt5A and induced its translocation into the nucleus, which promoted cell and tissue movements [27]. Additionally, the C-terminal PTK7 species up-regulated cadherin-11 expressed in the mesoderm-derived tissues, and which regulates osteogenesis. It is possible that PTK7 and Cadherin-11 might interact in embryogenesis and regulate similar developmental processes [28,29]. In our study, the frameshift variant and three missense variants (p.L17F, p.S118F, and p.K465R) were located in the extracellular domains of the PTK7 protein, while the other four missense variants (p.G627R, p.R652L, p.D764N, and p.R795H) were mapped to the intracellular portion, with p.D764N and p.R795H variants being located in the pseudokinase domain. In our expression assay, one variant in the extracellular domain and one variant in the intracellular domain were shown to alter the expression of *PTK7*, suggesting that both domains are critical for the integrity of PTK7 protein.

Interestingly, the frameshift variant that caused almost no expression of PTK7 was found in the CS patient, while the two missense variants that resulted in the significant decreases in protein levels were both found in patients with AIS. The patient carrying the truncating variant was diagnosed to have a mixed subtype of CS, with failure of formation (T12 butterfly vertebra) and segmental disorder (T9-T10) and no other developmental malformations. The two patients with the missense variants (p.S118F and p.D764N) did not have congenital developmental deformities of the vertebral body or other systems, presenting simple thoracolumbar scoliosis. However, due to the complex etiology of AIS, the role of PTK7 in AIS still warrants further validation and investigation.

Taken together, we hypothesize that the pathogenesis of vertebral deformities and the onset time of spinal curvature may be due to the effect of *PTK7* variants on the protein expression. In other words, congenital and adolescent idiopathic scoliosis may have a common genetic basis. Although some missense variants in our study did not show

abnormalities in protein expression or sub-cellular location, it is possible that the altered amino acids can affect the structure of the protein and subsequently affect downstream signaling pathways. Therefore, it is necessary to further explore the possible downstream pathways of PTK7, such as the canonical and non-canonical Wnt pathways.

### **5. Conclusions**

In conclusion, we identified eight *PTK7* variants in our mixed scoliosis cohort, including one frameshift variant (c.464\_465delAC) and seven missense variants (c.49C>T, c.353C>T, c.1394A>G, c.1879G>A, c. 1955G>T, c.2290G>A, c.2384G>A). The frameshift variant resulted in a depleted expression of PTK7 protein, and two of missense variants caused reduced expression of PTK7. Our study extended the variant and phenotype spectrums of *PTK7* and suggested a common genetic basis of CS and AIS.

**Author Contributions:** Conceptualization, Z.S., N.W. and S.Z.; methodology, Z.S., S.Z. and H.Z.; software, S.Z. and Y.N.; validation, Z.S., S.W., S.Z. and X.L.; formal analysis, Z.S. and Y.Y.; investigation, Z.S.; resources, S.W., Y.Y. and Deciphering Disorders Involving Scoliosis and COmorbidities (DISCO) Study Group; data curation, Z.S., S.Z. and H.Z.; writing—original draft preparation, Z.S., Y.Y.; writing—review and editing, S.W., S.Z., X.L. and N.W.; visualization, Z.S.; supervision, N.W., Z.W., G.Q. and T.J.Z.; project administration, N.W., Z.W., G.Q. and T.J.Z.; funding acquisition, N.W., Z.W., G.Q. and T.J.Z. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded in part by the National Natural Science Foundation of China (81902178 to S.W., 81972037 and 82172382 to J.Z., 81822030 and 82072391 to N.W., 81930068 and 81772299 to Z.W., 81772301 and 81972132 to G.Q.), Beijing Natural Science Foundation (No. L192015 to J.Z., JQ20032 to N.W., 7191007 to Z.W.), Capital's Funds for Health Improvement and Research (2020- 4-40114 to N.W.), Tsinghua University-Peking Union Medical College Hospital Initiative Scientific Research Program, Non-profit Central Research Institute Fund of Chinese Academy of Medical Sciences (No. 2019PT320025), and the PUMC Youth Fund & the Fundamental Research Funds for the Central Universities (No. 3332019021 to Shengru W.).

**Institutional Review Board Statement:** Approval for the study was obtained from the ethics committee at Peking Union Medical College Hospital (JS-098).

**Informed Consent Statement:** Written informed consent was provided by each participant.

**Data Availability Statement:** Data are available upon reasonable request. The datasets analyzed during the current study are available from the corresponding author on reasonable request.

**Acknowledgments:** We thank the patients and their families who participated in this research. We also thank the Nanjing Geneseq Technology Inc. for the technical help in sequencing, the Ekitech Technology Inc. for the technical support in database and data management. The members of the DISCO study group are provided in http://discostudy.org/.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**


MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Genes* Editorial Office E-mail: genes@mdpi.com www.mdpi.com/journal/genes

MDPI St. Alban-Anlage 66 4052 Basel Switzerland

Tel: +41 61 683 77 34

www.mdpi.com

ISBN 978-3-0365-5976-6