#### **1. A Brief History of FAS1 Domain Proteins**

The first FAS1 protein was identified in an insect model for central nervous system development, the grasshopper *Schistocerca americana*. To identify cell surface molecules potentially involved in the formation of axon bundles (fascicles), monoclonal antibodies (mAbs) recognizing cell surface antigens on specific fascicles were characterized. One of these antibodies recognized a 70 kDa glycoprotein named Fasciclin 1 (SaFas1 (Appendix A)) [1]. The genes coding for grasshopper SaFas1 and *Drosophila melanogaster* DmFas1 were cloned soon afterwards [2], and a homologous fruit fly gene called Midline fasciclin (*DmMfas*) was identified later [3]. In the fruit fly, a *DmFas1* knockout affected neuronal branching as well as synaptic function [4], and laser ablation of the grasshopper ortholog *SaFas1* led to disrupted cell adhesion of pioneer axons [5]. The crystal structure of DmFas1 provided the prototype for the structurally novel FAS1 domain [6].

In the meantime, molecular techniques and sequence comparison tools revealed the widespread occurrence of homologous proteins defined by the FAS1 domain (IPR000782; PF02469). The *Homo sapiens* genome encodes four FAS1 domain proteins named transforming growth factor-β induced protein (HsTgfbi), Periostin (HsPn), Stabilin-1 (HsStab1) and Stabilin-2 (HsStab2). The *HsTgfbi* gene (Appendix B) was identified in human adenocarcinoma cells as a transcript that was induced 20-fold by transforming growth factor-β [7]. Likewise, *HsPn* was cloned based on its expression in an osteoblast cell line [8] and subsequently found to be enriched in the periosteum [9]. Finally, HsStab1 and HsStab2 were identified using two different antibodies binding to a subpopulation of endothelial cells and to the hyaluronic acid (HA) clearance receptor, respectively [10]. Due to their association with a multitude of clinical conditions, human and mammalian FAS1 proteins have been the focus of numerous detailed studies that are instructive for a general understanding of FAS1 domain proteins.

Unexpectedly for proteins associated with cell adhesion, FAS1 domain proteins exist not only in animals but also in plants, fungi and prokaryotes. The first plant FAS1 protein was discovered using antibody interference in the alga *Volvox carteri* [11], a simple model for multicellularity consisting of just two cell types. When specific mAbs raised against a crude membrane preparation were added to *Volvox* cultures, they inhibited embryo development. The cognate protein was named algal cell adhesion molecule (algal-CAM) based on its apparent role in the formation of intercellular contacts during early embryogenesis. The existence and physiological role of algal-CAM, which contains two FAS1 domains, raised the exciting possibility of a cell adhesion mechanism conserved between animals and plants. In higher plants, FAS1 domain proteins were also identified by the biochemical and bioinformatic analysis of a group of highly *O*-glycosylated hydroxyproline-rich glycoproteins called arabinogalactan-proteins (AGPs) [12,13]. The bioinformatic investigation of the *Arabidopsis thaliana* genome revealed the existence of many fasciclin-like AGPs (FLAs) in plants [12–14]. At the same time, a different investigation mapped one of several *Arabidopsis thaliana* salt overly sensitive (*sos*) mutations to the Salt overly sensitive 5 (*AtSos5*) gene encoding the AtFla4 protein [15].
In a mutant screen in a crop plant, the rice locus Microspore and tapetum regulator 1 (*OsMtr1*) was found to be required for male reproductive development and to encode a tandem FAS1 glycoprotein [16].

In fungi, FAS1 proteins have been identified in transcriptomics or proteomics studies in the Shiitake mushroom *Lentinula edodes* [17] and the rice pathogen *Magnaporthe oryzae* [18], while in the fission yeast *Schizosaccharomyces pombe* the FAS1 domain protein SpFsc1 was identified in a screen for autophagy-related loci [19]. FAS1 proteins apparently existed before the evolution of eukaryotes. The best-known eubacterial FAS1 proteins are Mpb70 and Mpb83, which were identified in *Mycobacterium bovis* culture filtrates [8,20–24]. Database queries reveal FAS1 proteins in both eubacteria and archaea, suggesting that the domain originated before the last universal common ancestor (LUCA) [25].

FAS1 proteins are often implicated in the interaction between the cell and the extracellular matrix (ECM). Considering the diversity of ECM architectures and compositions, FAS1 domain proteins are surprisingly widespread among the different kingdoms of uni- and multicellular life. However, despite their broad presence throughout the tree of life, FAS1 proteins are not ubiquitous, especially in microbes, whose genomes rapidly adapt to differing lifestyles. This suggests that FAS1 domain proteins are not essential for life per se but are suited for specialized cellular interactions that some organisms do not require.

I will next describe what is known about the structure of the FAS1 domain itself and discuss diverse additional structural features of FAS1 proteins in various kingdoms. This will be followed by a review of the biological roles of mammalian and plant FAS1 domain proteins, including the relationship of structure to function, which should help elucidate the mechanisms of FAS1 proteins in plant development.

#### **2. The Structure of the Fasciclin 1 Domain**

#### *2.1. The Fasciclin 1 Domain*

The FAS1 domain spans approximately 140 amino acids. Although sequence conservation between different FAS1 proteins can be quite low, there are two more highly conserved sequence stretches of around 15 residues, called H1 and H2, and a conserved central YH motif (Figure 1A). Because of this low overall conservation, domain enhanced lookup time accelerated BLAST (DELTA-BLAST) should be used to identify FAS1 domain proteins in sequence databases [26]. Using X-ray crystallography and NMR spectroscopy, several studies have elucidated the structures of isolated FAS1 domains or of entire FAS1 proteins [27–33]. The FAS1 domain is globular and contains a central structural fold of two β-sheets oriented at an almost perpendicular angle, variously described as a β-wedge or β-sandwich (Figure 1B).
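As an illustration of the database search recommended above, the following minimal sketch assembles a command line for the `deltablast` program of the NCBI BLAST+ suite. It only builds and prints the command and does not run it. The query file name, output file name, E-value cutoff and the choice of the `nr` protein database are assumptions made for this example and are not taken from the text; `cdd_delta` is assumed to be a locally installed copy of the conserved domain database that DELTA-BLAST searches first.

```python
import shlex


def build_deltablast_command(query_fasta, protein_db="nr", cdd_db="cdd_delta",
                             evalue=1e-5, out_path="fas1_hits.tsv"):
    """Assemble (but do not run) a deltablast command line.

    All default values are illustrative assumptions, not settings
    reported in the article.
    """
    return [
        "deltablast",
        "-query", query_fasta,    # FASTA file with a FAS1 domain sequence
        "-db", protein_db,        # protein database to search
        "-rpsdb", cdd_db,         # conserved domain database used to build the PSSM
        "-evalue", str(evalue),   # expectation value cutoff for reported hits
        "-outfmt", "6",           # tabular output
        "-out", out_path,
    ]


# Hypothetical query file name, used only for demonstration.
cmd = build_deltablast_command("fas1_query.fasta")
print(shlex.join(cmd))
```

The printed command can be inspected and then executed in a shell on a system where BLAST+ and the two databases are installed.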

**Figure 1.** The FAS1 domain across kingdoms of life. (**A**) DELTA-BLAST alignment of some FAS1 domains mentioned in this article. Note the conservation of the N- and C-proximal H1 and H2 regions as well as the central YH motif. The sequences used by DELTA-BLAST were HsTgfbi (NP\_000349.1), HsPn (gi 93138709), DmFas1 FAS1-3 (1O70\_A), Algal-CAM (gi 75282282), AtFLA11 (gi 116247778), AtFLA4 (gi 75206907), MbMpb83 (gi 614094354), and SmNex18 (gi 81635876). Red/blue indicates highly/not conserved residues; color bits threshold set to 2.5; (**B**) The general topology of the FAS1 domain is reminiscent of the "thumbs-up" gesture. The secondary structure of HsTgfbi FAS1-1 is annotated, omitting the three N-terminal helices for clarity; (**C**) The crystal structure of HsTgfbi FAS1-1 [33] in ribbon display.

In the human HsTgfbi structure, the first β-sheet encompasses strands β1–β2–β8–β6/7 [33] (Figure 1C). The two inner strands are oriented parallel and the two outer ones antiparallel. The second β-sheet consists of β3–β4–β5. There are three α-helices at the N-terminus (not shown in Figure 1B), three more (α4 to α6) between β1 and β2 and a less highly conserved α-helix (α ) between β2 and β3. The FAS1 domain (Figure 1B, redrawn from [25]) is a member of the "β-grasp fold" superfamily [25] and may be imagined as a "thumbs-up" gesture of the right hand, with the palm representing the first β-sheet (light green in Figure 1B), the bent index, middle and ring fingers symbolizing the second β-sheet (dark green in Figure 1B) and the thumb and pinkie resembling α4 to α6 (light brown in Figure 1B) and α (dark brown in Figure 1B), respectively. Among all FAS1 proteins, structure-function relationships have been most intensely studied for HsTgfbi (reviewed in [34]). Therefore, the elucidation of the entire HsTgfbi crystal structure [33] could be seen as the "Rosetta stone" for a better mechanistic understanding of the many biological roles of FAS1 proteins in different organisms, including plants. Several studies identified individual regions and amino acid residues on the four Tgfbi FAS1 domains that are critical for function. The main approach was to use in vitro cell adhesion as a functional assay (Figure 2).

**Figure 2.** Cell adhesion assay. (**A**) On an uncoated or control-coated (red) plastic substrate, cell adhesion is inefficient; (**B**) cells adhere rapidly when plastic is coated with an adhesion protein such as Tgfbi or Pn (green); (**C**) to identify the receptor for the adhesion protein, integrin isotype-specific antibodies (red) are co-incubated; (**D**) mammalian FAS1 proteins are thought to bind to different types of dimeric integrins (blue and yellow) that mediate mechanical contact between the cytoskeleton (grey) and the ECM and transduce intracellular signals using numerous associated proteins (pink and nude).

Briefly, cells adhere more efficiently to surfaces coated with adhesion proteins such as Tgfbi than to control-coated surfaces (Figure 2A,B). Whether this adhesion depends on the family of ECM receptors called integrins can be tested by adding integrin antibodies that block the Tgfbi-stimulated cell adhesion (Figure 2C). To identify sites on Tgfbi that might mediate cell adhesion, peptides corresponding to conserved Tgfbi sequence motifs were added. For instance, the NKDIL and the EPDIM peptides, which are part of the FAS1-2 and the FAS1-4 domains (see following section), respectively, both interfered with Tgfbi-stimulated cell adhesion. By contrast, the KADHH peptide in the corresponding region of FAS1-1 had no effect [35]. In a study of a different cell type expressing a different integrin, the NKDIL and EPDIM peptides did not interfere with adhesion, but an 18-amino acid peptide that covered the YH motif did [36]. An HsPn-specific mAb identified a corresponding integrin interaction region on FAS1-2 [37]. The cell adhesion assay is too crude to demonstrate *direct* binding between FAS1 proteins and integrins; however, in combination with the crystal structure of HsTgfbi, it showed that different integrins interact with different surface regions of the FAS1 domains [33]. Although integrins are not known in plants, this insight should be valuable for predicting the molecular function of plant FAS1 proteins, some of which have complex biological roles.
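The peptide motifs named above (NKDIL, EPDIM, KADHH and the central YH dipeptide) can be located in any protein sequence with a simple string search. The sketch below is a minimal illustration only: the function name is hypothetical and the toy sequence is invented for demonstration and is not the HsTgfbi sequence. Finding a motif in a sequence says nothing about whether it mediates integrin binding, which has to be tested experimentally as described.

```python
import re


def find_motifs(sequence, motifs=("NKDIL", "EPDIM", "KADHH", "YH")):
    """Return 1-based start positions of each motif in a protein sequence.

    A lookahead pattern is used so that overlapping occurrences are
    also reported.
    """
    seq = sequence.upper()
    hits = {}
    for motif in motifs:
        pattern = f"(?={re.escape(motif)})"
        hits[motif] = [m.start() + 1 for m in re.finditer(pattern, seq)]
    return hits


# Invented toy sequence (NOT a real FAS1 protein), for demonstration only.
toy_sequence = "MAKADHHLTGNKDILAAYHGEPDIMSS"
hits = find_motifs(toy_sequence)
for motif, positions in hits.items():
    print(motif, positions)
```

For a real analysis, the toy sequence would be replaced by a sequence retrieved from a database, for example the HsTgfbi accession listed in Figure 1.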
