#### **1. Introduction**

The compartmentalization of eukaryotic cells by membrane-bound organelles provides diverse environments for a wide variety of biochemical pathways essential to cell function and survival. Plastids are a dynamic class of organelle that evolved from an ancient cyanobacterial endosymbiont [1–3]. As they evolved, plastids adopted various central functions in cell metabolism and biosynthesis as well as in signalling, embryogenesis, leaf development, gravitropism, temperature response, and plant–microbe interactions [4]. Unlike other organelles, plastids can differentiate from a common precursor, known as the proplastid, into various types that serve different metabolic needs throughout plant tissues [5–8]. Plastids can also transition between types in response to different developmental [9,10] or environmental cues [11–15]. One such example is the process of photomorphogenesis, where, in the presence of light, the etioplasts of leaf cells grown in the dark are converted to green photosynthetic chloroplasts, the most notable and well-characterized type of plastid [16]. The ability of plants to trigger these types of transitions and maintain a variety of plastid types in different tissues and at different stages of life is what allowed for the radiant expansion of the plant kingdom [6]. The interconversion of plastids is made possible by the coordinated remodelling of the plastid proteome [17–19].

**Citation:** Fish, M.; Nash, D.; German, A.; Overton, A.; Jelokhani-Niaraki, M.; Chuong, S.D.X.; Smith, M.D. New Insights into the Chloroplast Outer Membrane Proteome and Associated Targeting Pathways. *Int. J. Mol. Sci.* **2022**, *23*, 1571. https://doi.org/10.3390/ijms23031571

Academic Editor: Koichi Kobayashi

Received: 15 December 2021; Accepted: 27 January 2022; Published: 29 January 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Over the course of evolution, a large majority of plastid genes were transferred to the nuclear genome by horizontal gene transfer [1,20–23]. As a result, plastid biogenesis and function rely on the fidelity of intracellular protein trafficking pathways to deliver the corresponding proteins to plastids [24]. To recognize and import chloroplast precursor proteins synthesized in the cytosol, plant cells evolved complex proteinaceous machinery at the outer and inner membranes of the chloroplast known as the general import apparatus, composed of the translocon at the outer membrane of the chloroplast (TOC complex) and the translocon at the inner membrane of the chloroplast (TIC complex) [1,25]. Chloroplast outer membrane proteins, including components of the TOC complex, use alternative targeting pathways to reach the chloroplast outer membrane [26]. A unique set of challenges exists for membrane proteins that are targeted to their resident membranes post-translationally. As the hydrophobic segments of their amino acid chains are synthesized on cytosolic ribosomes, chaperones are often required to maintain stability, prevent misfolding and avoid aggregation before the proteins reach their target membranes [27]. In some cases, the chaperones themselves serve as targeting elements, whereas in other cases, a targeting sequence may engage a receptor at the target membrane surface, or local secondary and tertiary structures may engage the membrane directly, inducing self-insertion [28].

Classifying common targeting pathways is challenging, as the specific mechanisms used by many outer membrane proteins remain uncharacterized. This is, in large part, due to the limited number of known and confirmed chloroplast outer membrane proteins, as well as the difficulties associated with studying membrane proteins in vitro [29]. Some proteins even use a combination of redundant strategies [26], which further complicates efforts to distinguish between these mechanisms. Advances in proteomics and the development of more powerful bioinformatic tools have led to the identification and characterization of an increasing number of chloroplast outer membrane proteins in recent years [29,30]. In this review, we discuss current advances in our understanding of the targeting signals and pathways used by chloroplast outer membrane proteins during their biogenesis. Further, we put forth a proteome-wide bioinformatic approach for identifying novel chloroplast outer membrane protein-targeting signals and pathways. This analysis allowed us to expand the current list of chloroplast outer membrane proteins from 117 to 138.

#### **2. The Chloroplast Outer Membrane**

#### *2.1. Composition and Function of the Chloroplast Outer Membrane*

Like their bacterial ancestors and mitochondria, the other endosymbiotic organelles found in eukaryotes, chloroplasts are surrounded by two membranes that differ in function and composition [31]. The inner membrane is studded with transport proteins and tightly regulates the flux of ions and metabolites between the intermembrane space (IMS) and the stroma, the interior compartment of the chloroplast [32]. The inner membrane is composed primarily of galactolipids with some phospholipids and sulfolipids [33]. In contrast, the chloroplast outer membrane is permeable to ions and metabolites and controls the recognition and import/insertion of chloroplast proteins [32]. It also serves as the site for galactolipid biosynthesis [34] and is composed primarily of phospholipids and galactolipids in equal proportions with some sulfolipids [33]. The entire chloroplast outer membrane proteome is encoded by genes in the nucleus, synthesized on cytosolic ribosomes, and targeted post-translationally [20–22]. Therefore, these proteins must contain signals that direct them to the chloroplast outer membrane. Inoue (2015) published a comprehensive list of chloroplast outer membrane proteins in *Arabidopsis thaliana*, of which there are 117, and categorized them based on their function [30]. This represents a diverse proteome with functions including solute and ion transport, protein import, protein turnover and modification, lipid metabolism, carbohydrate metabolism and regulation, other metabolism and regulation, and intracellular communication, as well as proteins of unknown function [30]. More proteins still have been identified that associate with the chloroplast outer membrane surface but are not inserted in the membrane [35]. This functional diversity highlights the crucial role that the chloroplast outer membrane plays in plastid biogenesis, cell metabolism and intracellular signalling between plastids and the rest of the cell. Despite the advances made in our understanding of chloroplast precursor protein import into the stroma, the targeting and insertion mechanisms for chloroplast outer membrane proteins remain elusive. This gap is all the more apparent when compared to our understanding of mitochondrial outer membrane proteins [36].

#### *2.2. Topologies of Chloroplast Outer Membrane Proteins*

Based on their structure and topology, chloroplast outer membrane proteins fit into several categories. α-helical proteins are categorized by the number and location of their transmembrane domain(s) (TMD(s)). Signal anchored (SA) proteins contain single membrane-spanning α-helices located at their N-terminus and tail anchored (TA) proteins contain single membrane-spanning α-helices located at their C-terminus [37]. Both SA and TA proteins adopt similar topologies in which their soluble domains are exposed to the cytosol, with some containing short extensions into the IMS. Although very few have been characterized, some α-helical proteins are anchored by a single membrane-spanning α-helix in the middle of their sequences, containing both N- and C-terminal soluble domains exposed at opposite sides of the membrane. These are not classified as SA or TA proteins. Further, some α-helical proteins in the chloroplast outer membrane contain two or more membrane-spanning α-helices. Finally, β-barrel proteins span the membrane through the formation of a cylindrical barrel composed of β-strands [27,38]. Beyond these well-established categories of integral membrane proteins are proteins that contain either or both α-helices and β-strands, which form uncharacterized membrane anchors, proteins that are anchored to the membrane by the covalent attachment to lipid molecules, and peripheral membrane proteins that rely on electrostatic or hydrophobic interactions at the membrane surface, as well as on interactions with integral membrane proteins [27,39].

#### **3. Protein Entry into Chloroplasts: Structure and Function of the TOC Complex**

The mechanisms by which chloroplast precursor proteins are targeted to the chloroplast, recognized, and imported to the stroma by the general import apparatus have been extensively reviewed [25,40–43]. In brief, chloroplast precursor proteins are synthesized on cytosolic ribosomes and targeted to the chloroplast post-translationally via a cleavable N-terminal transit peptide (TP) [25,44–46]. TP sequences are moderately hydrophobic, containing an amphipathic α-helix. They are typically rich in hydroxylated and basic residues, devoid of acidic residues and often contain multiple proline residues [46–50]. A lack of arginine residues in TPs differentiates them from mitochondrial presequences and precludes mitochondrial targeting [51]. Molecular chaperones in the cytosol guide precursor proteins to the chloroplast surface in an import-competent state [52]. Heat shock proteins (Hsp) 70 and 90 are involved in the transport of most chloroplast precursor proteins containing TPs [53] and may use TOC64 as a receptor [54,55]. Hsp70 forms a guidance complex with the 14-3-3 protein, which interacts directly with the TOC complex [52,53]. The TOC complex mediates recognition and import of nuclear-encoded precursor proteins into the chloroplast through the TOC-TIC supercomplex, after which the TP is cleaved in the stroma by a stromal processing peptidase [56,57].

The components of the core TOC complex were originally identified in pea (*Pisum sativum*) and *Arabidopsis thaliana* [58], although its composition and function appear to be highly conserved across plant species [59,60]. TOC34 and TOC159 GTPase receptors act to recognize the cleavable N-terminal TPs of chloroplast precursor proteins before they are translocated across the outer membrane via the TOC75 translocation channel [61]. Cryoelectron microscopy was used to shed light on the organization of the TOC complex and its subunits [62]. This, in combination with affinity purification and electrophoretic techniques, suggests that the TOC complex exists in a 4:4:1 (TOC75:TOC34:TOC159) arrangement [62–65].

Further, paralogous TOC complexes have been identified in *Arabidopsis thaliana*, which may be responsible for the recognition and import of different subsets of chloroplast precursor proteins (housekeeping vs. photosynthetic) [5,17–19]. Functionally distinct TOC complexes are defined by the presence of TOC34 (TOC33) and TOC159 (TOC90, TOC120, TOC132) receptor homologs, where TOC75 is central to all TOC complexes [66]. The abundance and ratio of various paralogous TOC complexes in the plastid membrane are likely to play a role in plastid biogenesis and the transition of plastids between various types based on developmental and environmental signals [9], such as in the process of photomorphogenesis described previously.

Less understood is the proteinaceous machinery present in the chloroplast outer membrane that is responsible for the recognition and insertion of α-helical and β-barrel chloroplast outer membrane proteins. In mitochondria, the translocon at the outer membrane of the mitochondrion (TOM) complex imports β-barrel proteins, and they are then transferred to the sorting and assembly machinery (SAM) complex by IMS chaperones before being integrated into the membrane [67,68]. Most, if not all, α-helical outer membrane proteins of mitochondria are inserted directly via the mitochondrial import (MIM) complex [67]. Such distinct pathways have not been identified for chloroplast outer membrane proteins to date.

#### **4. Biogenesis of α-Helical Chloroplast Outer Membrane Proteins**

#### *4.1. Signal Anchored (SA) Proteins*

SA and TA proteins represent a diverse array of functions, acting as receptors in pathways of protein translocation, membrane fusion, vesicle trafficking, electron transport, apoptosis and protein quality control [39,69]. SA proteins lack a cleavable TP and are anchored in the membrane by a single hydrophobic α-helix of approximately 20 amino acid residues in length at their N-terminus [30]. This TMD is flanked by a C-terminal positively charged region (CPR) and, together, the TMD and CPR act as an intrinsic targeting signal [70]. At the sequence level, it is difficult to identify a conserved sequence motif that may direct targeting. SA proteins do not share sequence similarity at their N-termini the way they do at their C-termini. Instead, proteinaceous factors in the cytosol may recognize SA proteins directed to chloroplasts and mitochondria based on their physicochemical characteristics, which are highly conserved [27]. Specifically, the majority of TMDs within SA proteins of chloroplasts and mitochondria are only moderately hydrophobic relative to SA proteins directed to the endoplasmic reticulum (ER), with hydrophobicity values below 0.4 on the Wimley and White scale [70]. Increasing the hydrophobicity of the TMD will redirect a chloroplast SA protein to the plasma membrane [71]. Additionally, the CPR contains three or more positively charged amino acid residues that assist in the evasion of the signal recognition particle (SRP) that directs SA proteins to the ER [27]. As with changes to the TMD, substitutions that reduce the positive charge of the CPR redirect chloroplast SA proteins to the plasma membrane [70].
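The two physicochemical criteria described above (a moderately hydrophobic TMD and a CPR with at least three positively charged residues) can be illustrated with a minimal Python sketch. This is not the authors' method: the function names, the 10-residue CPR window, the use of a mean per-residue value for the TMD, and the choice to count only lysine and arginine as positive are all assumptions made for illustration. The hydrophobicity scale must be supplied by the caller, because no scale values are given in the text.

```python
from typing import Mapping

# Assumption: only lysine (K) and arginine (R) count as positively charged.
POSITIVE = frozenset("KR")


def mean_hydrophobicity(segment: str, scale: Mapping[str, float]) -> float:
    """Average per-residue value of `segment` on a caller-supplied scale."""
    if not segment:
        raise ValueError("segment must not be empty")
    return sum(scale[aa] for aa in segment) / len(segment)


def count_positive(segment: str) -> int:
    """Number of K/R residues in `segment`."""
    return sum(aa in POSITIVE for aa in segment)


def looks_like_chloroplast_sa_signal(
    seq: str,
    tmd_start: int,
    tmd_end: int,
    scale: Mapping[str, float],
    cpr_len: int = 10,               # hypothetical window length for the CPR
    max_hydrophobicity: float = 0.4,  # threshold quoted in the text
    min_positive: int = 3,            # "three or more" positive residues
) -> bool:
    """Check the TMD (seq[tmd_start:tmd_end]) and the CPR that follows it."""
    tmd = seq[tmd_start:tmd_end]
    cpr = seq[tmd_end:tmd_end + cpr_len]
    return (
        mean_hydrophobicity(tmd, scale) < max_hydrophobicity
        and count_positive(cpr) >= min_positive
    )
```

A caller would pass the TMD boundaries from a transmembrane predictor and a published hydrophobicity scale as a dictionary keyed by one-letter residue codes. The thresholds are exposed as parameters so they can be tuned against experimentally validated SA proteins.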

It is still an open question how the plant cell achieves selective targeting of SA proteins to chloroplasts and mitochondria. Despite the ambiguity in the targeting signals of chloroplast and mitochondrion SA proteins, a cytosolic factor has been identified that is responsible for targeting chloroplast SA proteins to the membrane surface. Ankyrin repeat-containing protein 2 (AKR2) interacts with the SA targeting signal during translation and acts as a chaperone to shield the hydrophobic TMD and prevent aggregation to keep the SA protein in a membrane insertion competent state [39,72,73]. AKR2 achieves selective targeting through its lipid binding domain, which recognizes monogalactosyldiacylglycerol, a lipid unique to plastid membranes, and phosphatidylglycerol headgroups [74]. AKR2 may recognize subtle differences in the density of positively charged residues and amino acid residue composition of the CPR as well as its distance from the TMD [27]. The mechanism by which SA proteins are inserted in the chloroplast outer membrane is not well understood, and there is some evidence that it may vary between SA proteins [27]. Currently, it is not clear whether SA proteins can insert spontaneously into the membrane or whether a yet-to-be-discovered insertase is involved; regardless, targeting and insertion seem to be dependent on the presence of the TOC complex [27]. The targeting mechanisms of SA, TA and β-barrel chloroplast outer membrane proteins are depicted in Figure 1.

**Figure 1.** Signal anchor (SA), tail anchor (TA) and β-Barrel Mediated Targeting Pathways to the Chloroplast Outer Membrane. (**A**) β-barrel proteins are targeted to the chloroplast outer membrane by a variety of signals, like the β-signal. Although some may use Hsp70, Hsp90 and 14-3-3 proteins, a specific cytosolic chaperone that aids in their targeting is yet to be discovered. OEP80 likely plays a role in the insertion of β-barrels, although the translocon at the outer membrane of the chloroplast (TOC complex) is also involved in their targeting; (**B**) SA and TA proteins are targeted to the chloroplast outer membrane by their moderately hydrophobic N-terminal and C-terminal transmembrane α-helix, respectively, and a C-terminal positively charged region, sometimes accompanied by an RK/ST motif for TA proteins. Both are guided by ankyrin repeat-containing protein 2 (AKR2), which interacts with monogalactosyldiacylglycerol (MGDG) and phosphatidylglycerol (PG) in the chloroplast outer membrane. Whether they are inserted by the TOC complex or an undiscovered insertase is unknown, but interaction with the TOC complex is an essential step in their targeting. Chloroplast outer membrane (COM); intermembrane space (IMS); chloroplast inner membrane (CIM). Created using BioRender.com (accessed on 1 December 2021).

#### *4.2. Tail Anchored (TA) Proteins*

Like SA proteins, TA proteins lack a cleavable TP, but are anchored in the membrane by a single α-helix at their C-terminus [75]. The TMD is flanked by a C-terminal sequence (CTS) and, together, the TMD and CTS act as an intrinsic targeting signal [76]. Interestingly, the physicochemical characteristics of the TA are not as important to targeting. The hydrophobicity of the TMD varies widely across TA proteins and the importance of the CTS varies from protein to protein, although a net positive charge seems to contribute to chloroplast-targeting specificity [39]. Like the CPR of SA proteins, this assists in the evasion of the SRP. Eliminating the net positive charge redirects TA proteins to the mitochondrion [37]. A subset of TA proteins contains an RK/ST motif in their CTS. This motif is interchangeable among TA proteins in the subset. The charge distribution within the RK/ST motif appears to play a larger role than net charge [69,77]. For some TA proteins, the CTS is both necessary and sufficient for targeting, whereas for others, it is necessary but not sufficient [37]. The latter seems to be the case for those TA proteins that contain GTPase (G) domains, such as the TOC34 receptors discussed in detail below.

As with SA proteins, AKR2 also interacts with TA proteins in the cytosol to guide them to the chloroplast outer membrane, although there is evidence that other cytosolic factors are involved [37,69]. The targeting of TA proteins is well characterized in yeast and mammalian cells, and involves the guided entry of TA proteins (GET) and transmembrane recognition complex (TRC) pathways, respectively [78,79]. GET homologs have been identified in *Arabidopsis thaliana* [77] and have been implicated in the targeting of TOC34 receptors [80]. GET3B has been shown to target TA proteins to the thylakoid membrane [81]. Although cytosolic factors contribute to the efficiency of TA protein targeting, specificity for the chloroplast outer membrane seems to rely on lipid composition of the membrane, which may be important for insertion [82]. It was previously thought that SA and TA protein insertion in the chloroplast outer membrane occurred exclusively through interactions with the naked chloroplast outer membrane. More recently, it has been shown that targeting of both SA and TA proteins to the chloroplast outer membrane requires TOC75 and competes with precursor proteins for insertion, suggesting they do use the TOC complex, at least as a first step in their insertion [83].

#### **5. Biogenesis of β-Barrel Chloroplast Outer Membrane Proteins**

β-barrel proteins are characterized by a topology distinct from the more common α-helical TMDs of many membrane proteins. They are defined as proteins composed of 8–24 β-strands, where individual strands are usually 9–11 amino acids in length and are tilted approximately 45 degrees from the plane of the membrane. Alternating patterns of amino acid side chains result in amphiphilic segments with a hydrophilic face lining the interior of the pore and a hydrophobic face exposed to the lipid bilayer. The structure is stabilized by hydrogen bonding networks between the peptide backbone of neighbouring β-strands. These pores are found exclusively in the outer membrane of plastids and mitochondria of eukaryotes, as well as the outer membrane of Gram-negative bacteria from which they originated [84,85]. They act as membrane anchors or channels that recognize and transport a wide variety of substrates (ions, small molecules, peptides, nucleic acids, and proteins) with varying levels of specificity [84]. Their shared ancestry, along with their homologous structure and function, would suggest easily identifiable targeting elements, but their primary sequences are highly divergent [27], complicating attempts to identify conserved targeting sequences. Their targeting information is thought instead to be dispersed throughout the primary sequence and displayed in the form of secondary and/or tertiary structures [38,86].
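The alternating side-chain pattern described above can be made concrete with a small Python sketch that scores how strictly a candidate strand alternates between hydrophobic and non-hydrophobic residues. The function name, the residue set treated as hydrophobic, and the example peptides are illustrative assumptions and are not taken from the review or from any specific prediction tool.

```python
# Assumption: a simple, commonly used grouping of hydrophobic residues.
HYDROPHOBIC = frozenset("AVILMFWY")


def alternation_score(strand: str, hydrophobic: frozenset = HYDROPHOBIC) -> float:
    """Fraction of adjacent residue pairs that switch hydrophobic class.

    A score of 1.0 means strict alternation, as expected for an idealized
    amphiphilic transmembrane beta-strand; 0.0 means no alternation at all.
    """
    if len(strand) < 2:
        raise ValueError("strand must contain at least two residues")
    flags = [aa in hydrophobic for aa in strand.upper()]
    switches = sum(a != b for a, b in zip(flags, flags[1:]))
    return switches / (len(flags) - 1)


# Hypothetical 9-residue peptides, within the 9-11 residue strand length
# quoted in the text.
print(alternation_score("VSVTVKVNV"))  # strictly alternating
print(alternation_score("LLLLLLLLL"))  # uniformly hydrophobic
```

A score near 1.0 over a 9–11 residue window is only a crude indicator. Real β-barrel predictors combine such patterns with other evidence, so a sketch like this is best treated as a first-pass filter.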

Bacterial and mitochondrial β-barrels contain β-signals, which are conserved motifs in the C-terminal β-strand(s) of the barrel [87] that initiate interactions with the β-barrel assembly machinery (BAM) and SAM complexes in bacterial and mitochondrial outer membranes, respectively. It was previously thought that membrane-embedded β-barrels were highly rigid and unlikely to open laterally. This is not the case. In Gram-negative bacteria and mitochondria, the β-signal acts as an insertion signal, triggering the lateral opening of BAM and SAM pores [88]. It has been shown that the targeting of β-barrel proteins to mitochondria relies on the hydrophobicity of the C-terminal β-hairpin. Specifically, a hydrophilic amino acid residue positioned at the C-terminus of the penultimate β-strand determines mitochondrial targeting. Interestingly, deletion of the C-terminal β-hairpin of chloroplast β-barrel proteins disrupts their targeting and redirects them to mitochondria [85]. Mislocalization of β-barrels between chloroplasts and mitochondria does not occur in plant cells but does occur in vitro [38], suggesting that cytosolic factors yet to be discovered play a crucial role in targeting fidelity. Although mutagenesis can alter the localization of a chloroplast β-barrel to mitochondria [38], the reverse has not been demonstrated. This would suggest that chloroplast β-barrel proteins have gained additional targeting elements that both enable chloroplast localization and prevent localization to mitochondria. In fact, there is evidence that chloroplast β-barrels contain distinct groups of signals and may reach the chloroplast outer membrane using a variety of targeting pathways. Some may even be capable of self-insertion [89,90]. Unlike SA and TA proteins, the unassisted targeting of a β-barrel protein to the chloroplast outer membrane has not been demonstrated experimentally to date. For some chloroplast outer membrane β-barrels, targeting information is contained in a set of N-terminal β-strands that engage the TOC complex and trigger import into the IMS [91]. They are then integrated into the outer membrane by OEP80, a process dependent on the C-terminal β-strand [92]. OEP80 is required for the accumulation of other β-barrel proteins in the chloroplast outer membrane and may represent part of the machinery that recognizes and inserts β-barrels in the chloroplast outer membrane [93,94].

#### **6. Targeting and Assembly of the TOC Complex**

#### *6.1. TOC75 Translocation Channel Targeting*

Although the components of the TOC complex fit well into the above-mentioned chloroplast outer membrane protein categories based on their structure and topology, their targeting signals and pathways seem to diverge from those described for β-barrel and TA proteins (Figure 2). TOC75 is the first component of the TOC complex to be inserted in the membrane and facilitates the targeting and insertion of TOC34 and TOC159 receptors [26,95,96]. TOC75 is a β-barrel protein, related to the OMP85 family of proteins that includes the BAM and SAM β-barrels [97,98], although no other homologs of the BAM and SAM complex machinery have been detected in the chloroplast outer membrane. Interestingly, and unlike other β-barrel proteins, TOC75 is synthesized as a precursor protein in the cytosol and contains a cleavable bipartite N-terminal chloroplast TP [95,99]. The TP may help to ensure that TOC75 targeting and insertion are coupled to the formation of new TOC complexes, as well as to maintain its reverse topology [26]. Although there is evidence that existing TOC complexes facilitate the biogenesis of new TOC complexes in the chloroplast outer membrane [99], the bipartite signal could act as a temporary anchor to couple insertion by a β-barrel assembly machinery such as OEP80 with TOC complex integration [26]. A poly-glycine stretch downstream of the TP arrests import through the TOC-TIC supercomplex, and the TP and poly-glycine stretch are cleaved by the stromal processing peptidase and a type I signal peptidase in the IMS, respectively, yielding mature TOC75 that is inserted into the membrane by an unknown mechanism [99]. The poly-glycine stretch may also act to avoid proteinaceous factors in the IMS that drive import [100], instead releasing the protein to be inserted into the membrane laterally by the TOC complex, or via OEP80. Interestingly, OEP80 has been shown to contain a cleavable N-terminal TP, but not a poly-glycine stretch [101,102]. OEP80 is not isolated from mature TOC complexes [103] and its targeting is not in competition with chloroplast precursor proteins [101], suggesting it is inserted in a different manner than TOC75.

#### *6.2. TOC34 GTPase Receptor Family Targeting*

"Free" TOC75, or TOC75 not associated with mature TOC complexes, serves as the site for TOC34 receptor integration in the chloroplast outer membrane and, simultaneously, its assembly into maturing TOC complexes [96]. TOC34 receptors are TA proteins and are guided to the chloroplast outer membrane by interactions between their C-terminal α-helical TMD and AKR2 [37,104]. Arsenite/tail-anchored protein-transporting ATPase (ARSA1) has also been implicated in the targeting of TOC34 to the chloroplast outer membrane and TOM7, a mitochondrial TA protein, to mitochondria [78]. ARSA1 is similar to the GET proteins that target TA proteins to the ER in eukaryotes. Several ARSA homologs have been identified in plants and may be responsible for the targeting of TA proteins in plant cells [79]. There is evidence that TA proteins such as TOC34 engage the TOC complex thereafter [37]. Whether this occurs after insertion by an undiscovered insertase is unclear. Alternatively, there is evidence that interactions with galactolipids unique to the plastid membrane play an important role in TOC34 targeting [37]. This selectivity occurs at the membrane surface, independent of proteinaceous factors in the cytosol [82], and may facilitate self-insertion before or in addition to interaction with TOC75. Although there is evidence to suggest that TOC34 may self-insert into the chloroplast outer membrane [105–107], there is additional evidence to suggest that insertion is enhanced in the presence of nucleotides and inhibited when exposed to chloroplasts after proteolytic treatment [108], strengthening the theory that a yet-to-be-discovered insertase is involved. Additionally, the G domain of TOC34 receptors and the C-terminal tail exposed to the IMS have been implicated in targeting and integration into TOC complexes [106,109]. GTP hydrolysis could induce interactions between TOC34 monomers to produce TOC34 dimers that promote membrane insertion and/or TOC complex assembly [110]. Alternatively, GTP hydrolysis could induce an active conformation for monomeric TOC34 interaction with TOC75.

**Figure 2.** Targeting and assembly of the TOC complex: (**A**) TOC75 is targeted to the chloroplast outer membrane by a bipartite transit peptide (TP). The recognition and insertion of "free" TOC75 requires mature translocons at the outer membrane of the chloroplast (TOC complexes) and may be aided by OEP80. The presence of a TP suggests the use of chaperones such as Hsp70, Hsp90 and 14-3-3 employed by the canonical TP-mediated targeting pathway, although the existence of an uncharacterized cytosolic factor is possible; (**B**) TOC33/34 is a tail anchored (TA) protein and is targeted to the chloroplast outer membrane by both the TA and the GTPase (G) domain to form immature TOC complexes. Like other TA proteins, ankyrin repeat-containing protein 2 (AKR2) acts as a cytosolic chaperone; (**C**) TOC159/132/120 is targeted to the chloroplast outer membrane by a reverse TP-like sequence at the C-terminus (rTP), which may also engage chaperones employed by the canonical TP-mediated targeting pathway. Both TOC GTPase receptors rely on their G domains for successful targeting and TOC complex integration. The targeting of each component in sequence is coupled to the formation of mature TOC complexes. Chloroplast outer membrane (COM); intermembrane space (IMS); chloroplast inner membrane (CIM). Created using BioRender.com (accessed on 1 December 2021).

#### *6.3. TOC159 GTPase Receptor Family Targeting*

Relative to TOC75 and TOC34, we know very little about how TOC159 is targeted to and inserted in the chloroplast outer membrane [111]. Like TOC34, TOC159 targeting and insertion rely on TOC75, and there is evidence that TOC34 also supports the targeting and integration of TOC159 [107,112,113]. Whether the binding of TOC34 to TOC75 induces a conformation favourable to TOC159 binding and insertion and/or the interactions between the homologous G domains of the TOC34 and TOC159 receptors enhance these interactions, the membrane (M) domain clearly plays a critical role [5]. Several lines of evidence exist to support the targeting capability of the M domain [107,114,115]. The C-terminus of the M domain has demonstrated the ability to target fluorescent fusion proteins to the chloroplast outer membrane, likely due to a reverse TP-like sequence that shares physicochemical characteristics with canonical N-terminal chloroplast TPs [116,117]. This reverse TP may engage the other TOC complex components in the same way that chloroplast TPs do. Until recently, it was thought that TOC159 receptors used an atypical membrane anchor [118,119]. Recent structural prediction by AlphaFold would suggest that TOC159 receptors, like TOC75, are anchored in the membrane by a β-barrel [120]. The G domain also demonstrates intrinsic targeting capabilities [113], suggesting it may play an important role in targeting specificity and/or assembly of TOC159 into premature TOC complexes that contain TOC75 and TOC34 receptors [107]. In fact, it may even be required for the integration of TOC159 into mature TOC complexes [112,113]. It is important to distinguish between targeting, insertion, and TOC complex integration as they may represent distinct but interdependent processes. TOC complex proteins likely evolved redundant targeting measures in this way to couple their targeting with the assembly of TOC complexes, further ensuring fidelity.

#### **7. A Bioinformatic Approach to Identifying Novel Chloroplast Outer Membrane Targeting Signals and Pathways**

Much remains unknown about how most chloroplast outer membrane proteins make their way to and are inserted in the chloroplast outer membrane. The targeting of TOC complex components highlights the complexity of chloroplast outer membrane targeting pathways and their dependence on not only protein structure but coupled assembly into protein complexes. The fact that there are 50 proteins known to be dually targeted to chloroplasts and mitochondria further illustrates the subtle yet powerful role physicochemical characteristics within targeting signals play in targeting fidelity [121–124]. A better understanding of the variety of targeting signals within the chloroplast outer membrane proteome, the structures of their membrane anchors, physicochemical characteristics, and role in targeting and insertion is important in expanding our understanding of already characterized pathways and essential to identifying novel ones. We are the first to develop a bioinformatic approach (Figure 3) to categorize the chloroplast outer membrane proteome based on these properties. The goal of this approach is to select candidates for experimental studies that will be useful in the validation of additional targeting signals and pathways.

Briefly, we scanned the literature to produce an updated version of the chloroplast outer membrane proteome generated by Inoue (2015) [30]. We identified 21 additional proteins with potential to localize to the chloroplast outer membrane [22,29], bringing the total number of proteins in the chloroplast outer membrane proteome to 138 (Table 1). We sorted these proteins into the functional categories provided by Inoue (2015) for the previous list of 117 proteins. Five proteins (At2g25660 [125], At3g49560 [126], At4g26670 [126], At5g24650 [126] and At5g55510 [127]) are involved in protein import; three proteins (At1g54150 [128], At1g59560 [128] and At5g13530 [129]) are involved in protein turnover and modification; five proteins (At2g40690 [130], At3g63520 [131], At4g12470 [132], At4g13550 [133] and At5g16010 [127]) are involved in lipid metabolism; one protein (At2g32290 [127]) is involved in carbohydrate metabolism and regulation; two proteins (At1g26340 [127] and At5g02580 [127]) are involved in other metabolism and regulation; two proteins (At2g34585 and At3g03870) have unknown functions; and three proteins (At3g07430 [134], At3g19720 [135] and At3g57090 [136]) are involved in organellar fission, a function not described by Inoue (2015). Next, we compiled a database that allowed us to group proteins based on structural and physicochemical characteristics (such as secondary structure elements, amino acid composition and pI at each terminus) that have been experimentally validated as unique to SA, TA and β-barrel proteins. The predictive aspects of the database were tested against experimentally determined structures and targeting function of well-studied chloroplast outer membrane proteins. Finally, we added elements to the database that allowed us to identify potential chloroplast TPs at the N-terminus and reverse TP-like sequences at the C-terminus independent of categorized targeting pathways. At each stage, thresholds were established, and a small number of conflicts were investigated further by careful analysis of secondary structure predictions. Together, this approach allowed us to positively identify TOC75 and OEP80 as β-barrel proteins containing an N-terminal TP and TOC34 as a TA protein. The results showed that 30% of proteins were categorized as single-pass α-helical proteins, with 12% being SA, 14% being TA and 4% being "other"; 25% of proteins were categorized as multi-pass α-helical proteins; 9% of proteins were categorized as β-barrel proteins; and 36% of proteins were categorized as "other", containing no predictable transmembrane elements. The 21 newly identified proteins are evenly distributed among the categories, except for the β-barrel category, into which none of the 21 newly identified proteins were sorted. Among the last group, 35% are predicted to contain a cleavable TP at their N-terminus and 39% are predicted to have a reverse TP-like sequence at their C-terminus. Interestingly, cleavable N-terminal TPs and reverse TP-like sequences at the C-terminus were also predicted, in small numbers, across all other categories, which suggests that more proteins than just TOC75 could use a bipartite signal sequence. This was also the case for the 21 newly identified chloroplast outer membrane proteins. It is also important to note that no protein was predicted to contain both an N-terminal TP and a reverse TP-like sequence at the C-terminus.
In summary, at least 65% of chloroplast outer membrane proteins were identified as having the potential to use novel targeting signals and pathways. Clearly, there is a disproportionate focus on traditional SA, TA and β-barrel-mediated targeting pathways in the literature.
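
The category shares reported above can be checked with a short tally. The sketch below is illustrative only and is not part of the published analysis: the dictionary keys are labels chosen here for readability, and it assumes that the 65% figure corresponds to the sum of every category outside the characterized SA, TA and β-barrel pathways.

```python
# Category shares (percent of the 138-protein proteome) as reported in the text.
# The key names are labels chosen for this sketch, not identifiers from the study.
shares = {
    "single-pass SA": 12,
    "single-pass TA": 14,
    "single-pass other": 4,
    "multi-pass alpha-helical": 25,
    "beta-barrel": 9,
    "other (no predicted TMD)": 36,
}

# Assumption: these three categories are the "traditional" targeting pathways.
CHARACTERIZED = {"single-pass SA", "single-pass TA", "beta-barrel"}


def summarize(category_shares):
    """Return (total, single-pass subtotal, share outside characterized pathways)."""
    total = sum(category_shares.values())
    single_pass = sum(
        value for name, value in category_shares.items()
        if name.startswith("single-pass")
    )
    novel = sum(
        value for name, value in category_shares.items()
        if name not in CHARACTERIZED
    )
    return total, single_pass, novel


total, single_pass, novel = summarize(shares)
print(f"total={total}%, single-pass={single_pass}%, outside SA/TA/beta-barrel={novel}%")
```

Under this reading, the single-pass subtotal (12 + 14 + 4) reproduces the stated 30%, and the remaining categories (4 + 25 + 36) reproduce the stated 65%.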


**Table 1.** An updated list of the chloroplast outer membrane proteome and associated targeting pathways. Identified and predicted proteins of the chloroplast outer membrane proteome were categorized by potential targeting pathway and signal according to bioinformatic analyses.




1 *Arabidopsis* gene identifier (AGI) number, 2 Proteins without a name are marked with -, 3 N-terminal transit peptide (N TP), 4 C-terminal transit peptide-like sequence (C TP-Like), \* Proteins not included in the list published by Inoue (2015) [30].

**Figure 3.** Bioinformatic approach to categorizing the chloroplast outer membrane proteome by targeting pathway. Bioinformatic tools used for sequence retrieval, β-barrel detection, β-barrel exclusion, α-helical transmembrane domain (TMD) detection and N- and C-terminal transit peptide detection [137–148] are provided in green squares, and the resultant targeting pathway categories and percent composition of the proteome sorted into each category are given in green circles.
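
The sorting logic summarized in Figure 3 can be pictured as a simple decision cascade. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Prediction` record, the `categorize` function and the 50-residue `TERMINAL_WINDOW` cutoff are all hypothetical, and the inputs stand in for the output of whichever prediction tools are used upstream. It relies only on the general definitions that signal-anchored (SA) proteins carry their single TMD near the N-terminus and tail-anchored (TA) proteins carry it near the C-terminus.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Prediction:
    """Hypothetical per-protein summary of upstream predictor output."""
    length: int                   # protein length in residues
    is_beta_barrel: bool          # result of beta-barrel detection and exclusion
    tmds: List[Tuple[int, int]]   # predicted alpha-helical TMDs as 1-based (start, end)


# Hypothetical cutoff: how close to a terminus a TMD must lie to count as SA or TA.
TERMINAL_WINDOW = 50


def categorize(p: Prediction) -> str:
    """Assign one targeting-pathway category to a protein."""
    if p.is_beta_barrel:
        return "beta-barrel"
    if len(p.tmds) > 1:
        return "multi-pass alpha-helical"
    if len(p.tmds) == 1:
        start, end = p.tmds[0]
        # N-terminal check runs first, so very short proteins default to SA.
        if start <= TERMINAL_WINDOW:
            return "single-pass SA"
        if end > p.length - TERMINAL_WINDOW:
            return "single-pass TA"
        return "single-pass other"
    return "other"  # no predictable transmembrane element


# Made-up inputs for illustration only; they do not describe real proteins.
examples = [
    Prediction(300, True, []),
    Prediction(300, False, [(10, 30)]),
    Prediction(300, False, [(270, 290)]),
    Prediction(300, False, []),
]
for example in examples:
    print(categorize(example))
```

Transit peptide and reverse TP-like sequence detection is left out of this sketch because, as described in the text, those signals were assessed independently of the pathway categories.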

## **8. Conclusions and Future Directions**

In summary, plastids, such as the chloroplast, play a central role in a variety of metabolic and signalling processes within plant cells. The biogenesis and function of chloroplasts rely heavily on the fidelity of intracellular protein targeting pathways. Like mitochondria, chloroplasts evolved from an ancient bacterial endosymbiont and the two organelles share many common characteristics in the post-translational targeting of their nuclear-encoded proteomes [3]. Similar to mitochondrial presequences, chloroplast transit peptide sequences are highly divergent, but conserved physicochemical and structural properties govern their interactions with proteinaceous factors in the cytosol, recognition by import complexes at the membrane surface and even direct interactions with specific lipids [47], all of which contribute to targeting specificity. Despite recent advances in our understanding of the targeting of chloroplast precursor proteins and their recognition and import by the TOC and TIC complexes, much remains unknown about the targeting of chloroplast outer membrane proteins, such as TOC159. This gap is further emphasized when compared to our knowledge of outer membrane proteins of mitochondria. Here, we have reviewed the current understanding of how SA, TA and β-barrel chloroplast outer membrane proteins are targeted to the organelle. Further, we used a novel bioinformatic approach to expand the current list of known chloroplast outer membrane proteins from 117 to 138 and provided new insight into novel targeting signals and pathways that could be used by a significant portion of these proteins and that have yet to be explored experimentally.

In mitochondria, two distinct complexes, SAM and MIM, ensure insertion and assembly of β-barrel and α-helical outer membrane proteins. These processes are mediated by the TOM complex [67]. Such complexes have not been identified in chloroplasts, and so our understanding of the molecular mechanisms by which β-barrel and α-helical chloroplast outer membrane proteins are recognized, inserted and assembled at the chloroplast outer membrane is severely limited. Whether the TOC complex assumes these roles or, like the TOM complex, merely mediates interactions between membrane proteins and their insertases is unclear. It is crucial that future studies focus on identifying the targeting signals, cytosolic factors and integration complexes involved in chloroplast outer membrane protein biogenesis. Specifically, experiments should focus on clarifying the role of the TOC complex in the insertion of different types of chloroplast outer membrane proteins; identifying β-signals within the chloroplast outer membrane proteome and the cytosolic chaperone involved in β-barrel targeting; and further characterizing OEP80's interaction with the TOC complex and its role in β-barrel insertion and assembly. Beyond this, we hope to further define bipartite signals that incorporate TPs and reverse TP-like sequences at the C-terminus, and how these sequences contribute to the modularity of targeting signals. AKR2 shuttles SA and TA proteins in the cytosol to the chloroplast outer membrane through specific interactions with lipids unique to the plastid membrane [74]. This emphasizes the interplay between the proteins of the outer membrane and the lipid molecules in which they are embedded, and specifically the role that chloroplast outer membrane lipid composition plays in targeting specificity, another area that requires further exploration.

**Author Contributions:** Conceptualization, M.F., D.N., S.D.X.C. and M.D.S.; bioinformatic analyses, D.N., A.G. and M.F.; writing—original draft preparation, M.F.; writing—review and editing, M.F., D.N., A.G., A.O., M.J.-N., S.D.X.C. and M.D.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Natural Sciences and Engineering Research Council of Canada (NSERC); postgraduate doctoral scholarship awarded to M.F. (PGSD3-2021-559542) and Discovery Grants awarded to M.D.S. (RGPIN-2017-05437), S.D.X.C. (RGPIN-2017-04416) and M.J.-N. (RGPIN-2019-05900).

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
