**1. Introduction**

Food security is the most crucial challenge in the current scenario of a rapidly growing global population. According to cautious estimates, the global population will reach ten billion by the end of 2050, and a 60–100% rise in global food production will be necessary [1]. Besides extreme weather, increasing biotic and abiotic stressors, a growing population, and the shrinking availability of agricultural land and water resources are important constraints on food production and farming. Over the past few decades, research on crop improvement has deciphered numerous biological mechanisms and elucidated the role of genetic and epigenetic factors [2]. Crop breeders and plant scientists are striving to understand the genetic mechanisms underlying plant responses to environmental stressors. Recently, numerous novel genes and their regulatory pathways have been identified in plants [3,4]. For crop improvement and the development of elite cultivars with increased productivity, a breeding strategy of "cross the elite with the elite and wait for the best" has been applied, concentrating on the genes linked with vital agronomic traits [5]. Classical plant breeding strategies are challenging because germplasm selection takes a long time. On the other hand, modern tools for genome editing (GE) can integrate a foreign gene precisely into a predetermined site of the genome, allowing accurate substitution of an existing allele with an alternative one [6,7]. Genome editing has emerged as a powerful strategy for efficient and targeted genome manipulation, especially for crops that have complex genomes and are difficult to improve through conventional breeding approaches [6].

For basic as well as applied plant biology, the unstable and non-specific incorporation of transgenes into the host genome has been a matter of concern for edible crop species [8]. The discovery of programmable sequence-specific nucleases (SSNs) has facilitated precise gene editing. In both plant and animal systems, the application of SSNs for accurate GE has been recognized as a breakthrough in genome engineering. SSNs can be applied to produce several kinds of mutations, such as insertions, deletions, replacements, site-directed substitutions, and integration of a specific DNA sequence at a desired locus, across many organisms and cell types. Though each type of SSN has unique features, the mechanism for producing double-strand breaks (DSBs) in the target DNA is similar for all. The DSBs created by SSNs are repaired via non-homologous end joining (NHEJ) or homology-directed repair (HDR). Non-homologous end joining is an error-prone DNA repair mechanism that directly joins the ends of a DSB without a homologous template and can generate insertions or deletions at target sites, producing gene knockouts. NHEJ can also be used to introduce insertions at the site of the DSB during repair. On the other hand, the HDR pathway is a highly accurate mechanism that needs a homologous template to mediate repair and can be used to make precise changes such as gene insertion and gene replacement [6,9,10]. Compared with transgenic strategies, which result in inadvertent gene insertions and sometimes random phenotypes, GE approaches produce well-defined mutants, making GE a powerful technique for plant breeding and functional genomics. In contrast to transgenic plants, genome-edited plants have the added benefit of site specificity [11]. In breeding programs, these improved plants can prove useful: the resulting plants can be employed reliably with fewer concerns, and comparatively little monitoring is needed in contrast to traditional genetically engineered plants [12].
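
The difference between the two repair outcomes can be illustrated with a toy string model. The sketch below is a deliberate simplification and not a biological simulator: the function names (`cut`, `repair_nhej`, `repair_hdr`), the example sequence, and the 5-nt homology arms are all assumptions introduced here for illustration.

```python
from typing import Tuple


def cut(seq: str, position: int) -> Tuple[str, str]:
    """Model a DSB as splitting the sequence into two free ends."""
    if not 0 < position < len(seq):
        raise ValueError("cut position must lie inside the sequence")
    return seq[:position], seq[position:]


def repair_nhej(left: str, right: str, deleted: int = 0, inserted: str = "") -> str:
    """Rejoin the two ends directly, without a template.

    `deleted` bases are lost from the left end and `inserted` bases are added,
    mimicking the small indels that error-prone end joining can leave behind.
    """
    if not 0 <= deleted <= len(left):
        raise ValueError("cannot delete more bases than the left end contains")
    return left[: len(left) - deleted] + inserted + right


def repair_hdr(left: str, right: str, donor: str, arm: int = 5) -> str:
    """Repair the break by copying from a donor flanked by homology arms.

    The first and last `arm` bases of the donor must match the sequence on
    either side of the break; everything between the arms is copied in.
    """
    left_arm, right_arm = donor[:arm], donor[-arm:]
    i = left.rfind(left_arm)
    j = right.find(right_arm)
    if i < 0 or j < 0:
        raise ValueError("donor arms are not homologous to the break flanks")
    return left[:i] + donor + right[j + arm:]


if __name__ == "__main__":
    target = "ATGGCTAGCTTGACCGTAAGC"  # hypothetical target locus
    left_end, right_end = cut(target, 10)
    # NHEJ with a 2-bp loss: a frameshift-style knockout
    print(repair_nhej(left_end, right_end, deleted=2))
    # HDR with a donor carrying a 3-bp insertion between two 5-nt arms
    print(repair_hdr(left_end, right_end, "TAGCTGGGTGACC"))
```

In this model, NHEJ needs nothing but the two ends and may change the length of the locus unpredictably, whereas HDR fails outright unless a donor with matching flanks is supplied, which mirrors the template requirement described above.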

#### **2. Modern Trends in Plant Genome Editing**

In recent years, many fascinating GE approaches have been established because of advancements in molecular biology, which have permitted site-specific and accurate editing in many genomes [8]. In GE, engineered nucleases are composed of a sequence-specific DNA binding domain fused with a non-specific nuclease domain. Targeted genes can be precisely cleaved by such fused nucleases, and the resulting breaks can be repaired by HDR or NHEJ [13,14]. A vital strategy to execute targeted GE through SSNs is to generate DSBs at targeted sites; these breaks activate the DNA repair machinery through the HDR or NHEJ pathway [15]. The HDR pathway requires a homologous template to repair the DSB, whereas the two ends of the DSB are directly ligated in the NHEJ pathway [15]. Though NHEJ is more common, it has flaws that make it undesirable in many studies. Its major disadvantage is that it produces insertions or deletions of different sizes during repair, which may lead to unintended mutations. In contrast to NHEJ, repair via HDR is more accurate and reliable because it depends on homologous DNA to repair the DSB [16]. Thus, SSNs can be applied to manipulate genomic sequences by targeted addition or deletion of specific nucleotides at the targeted locus [10].

Recently, great achievements have been made in genome engineering with the development of meganucleases (MNs), zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeats/CRISPR-associated protein 9 (CRISPR/Cas9). Experimental evidence gradually showed that these SSNs could be used not only for gene insertion or inactivation, but also to significantly enhance the efficiency of homologous recombination and thus allow more precise gene replacement events. In 1993, Puchta and co-workers [17] provided the first evidence of homologous recombination in plant cells by using SSNs. After the discovery of ZFNs in 1996 by Kim and colleagues [18], extensive efforts were made to advance this tool, which offered a significant breakthrough in plant GE. In 2003, scientists were able to inactivate genes using ZFNs for the first time [19]. In 2005, the first SSN-based mutagenesis via ZFNs was carried out successfully in plants [20]. ZFNs thus have the ability to produce site-specific DSBs and have many applications in genome engineering [21]. Later, TALENs were added to the toolbox of SSNs for programmed genome engineering [22]. Genome manipulation through engineered nucleases had gained much importance by the end of 2011, when *Nature Methods* named it the "Method of the Year". In 2012, *Science* listed GE using TALENs among the runners-up for its "Breakthrough of the Year" because of the significant progress achieved. Recently, an emerging GE nuclease, CRISPR/Cas9, was added to the toolbox of editing nucleases. In 2013, the first CRISPR/Cas9-based GE event was reported in eukaryotes [23]. In 2015, Ma's group [24] developed a multiplex genome editing system for monocots and dicots. *Science* listed CRISPR among the runners-up for "Breakthrough of the Year" in 2013 and selected it as the "Breakthrough of the Year" in 2015. Furthermore, advancements in the CRISPR/Cas system introduced the more precise technique of base editing, which *Science* again highlighted among its "Breakthrough of the Year" runners-up in 2017. CRISPR/Cas9-based GE has revolutionized genome engineering since the initial few research articles were published in *Nature Biotechnology* [25,26]. All of the above-mentioned approaches have been extensively employed for GE and have produced mutations via site-directed substitutions, replacements, deletions, and insertions at specific sites in the genome [10].

TALENs, ZFNs, and MNs are the first-generation editing tools for genome manipulation, as illustrated in Figure 1. However, they are time-consuming and require lengthy protocols to attain target specificity. Compared with first-generation GE approaches, second-generation GE tools such as the CRISPR/Cas9 technique are easier to design, cost-effective, and robust [26–28]. The CRISPR/Cas9 toolkit is very simple to design because, in contrast to TALENs and ZFNs, it involves only a single-guide RNA (sgRNA) and the Cas9 protein. The procedures for TALENs and ZFNs are complex because their construction requires protein engineering. Because of these constraints, applications of TALENs and ZFNs in plants have been limited [9]. Continuous innovation for efficient GE has expanded the applications of the CRISPR/Cas9 system in several fields of plant science, and the system is quickly becoming a highly promising GE tool [6,11,23–26]. The plant GE tools and their corresponding applications are depicted in Figure 2, while the successive steps of the GE strategies are shown in Figure 3.

In the present review, we discuss fascinating GE tools for crop improvement. We briefly describe first-generation genome editing tools such as TALENs, ZFNs, and MNs and comprehensively elaborate on second-generation genome editing strategies with special focus on the applications of the CRISPR/Cas9 system in plant breeding for crop improvement. We briefly outline historical background, structural organization, and mode of action of the CRISPR/Cas9 toolbox. We describe the workflow of CRISPR/Cas9 from vector design to mutant screening. We also highlight recent breakthrough events in technology improvement in the CRISPR/Cas9 system. Furthermore, we discuss the recent role of CRISPR/Cas9 technology in crop breeding to develop the best performing cultivars with biotic and abiotic stress resilience, improving yield-related traits and production of high-quality crops. Finally, we outline the future outlook of CRISPR/Cas9 and pinpoint the current challenges with respect to the regulation of edited crops and their safe use.

*Int. J. Mol. Sci.* **2019**, *20*, x FOR PEER REVIEW 4 of 49

**Figure 1.** Structural illustration of first-generation genome editing tools: (**A**) meganucleases (MNs) have multifunctional domains with the ability to bind double-stranded target DNA and generate DSBs. The meganuclease is shown bound to a target spacer sequence of 14–40 bp (yellow). The FokI nuclease cuts the target sequence (color). (**B**) Representation of a ZFN bound to a target sequence 18–36 bp long. Each ZFN monomer (blue) is made of ZFPs. There are two basic domains: the DNA binding domain at the N-terminus and the catalytic domain with FokI nuclease (white) at the C-terminus. The connection between these domains is indicated with a pink line. The ZFN modules are fused with FokI (white) and dimerize to cut the target sequence at a spacer of 5–7 bp (pink) to produce DSBs. (**C**) A TALEN pair bound to the target site (pink). Each TALEN module is composed of a TALE that contains repeats of 33–35 amino acids. The pair of TALENs is separated by a spacer region of 12–21 bp (pink). There are specific RVD modules (green NN, grey NH, red HD, dark blue NI, orange NG, and yellow N) that each recognize a single nucleotide. TALE modules are fused with FokI (at the C-terminus) and dimerize to produce DSBs in the spacer region.


**Figure 2.** Diagrammatical representation of various genome editing tools and their applications in plants.

**Figure 3.** A general description of the GE mechanism in plants. Plant GE typically consists of the following steps: design and construction of vectors, targeted delivery of vectors via *Agrobacterium*-mediated or biolistic transformation, callus induction and regeneration, mutation screening and analysis, and phenotypic characterization for the desired trait.

#### *2.1. Meganucleases*

Among SSNs, MNs were the pioneering class of nucleases (Figure 1A) and have been extensively applied for plant GE [29,30]. Meganucleases are also termed homing endonucleases. Later, they were utilized for generating DSBs in several genomes [31]. Meganucleases recognize target DNA sequences of ~12–40 bp and are reported to be amenable to delivery by all vectors, including plant RNA viruses [32]. Compared with other SSNs, MNs are difficult to re-design for target sequences different from their natural ones. The non-modular nature of these proteins is the main obstacle to re-designing MNs. For plants, therefore, the applications of MNs have been restricted to naturally existing MNs such as the I-SceI and I-CreI nucleases [29].

#### *2.2. Zinc-Finger Nucleases*

ZFNs are extensively applied for plant GE [33]. Zinc-finger nucleases are one of the main techniques for genome manipulation and are beneficial in various GE applications. They have been widely used for target-specific mutagenesis to disrupt gene function and produce gene knockouts [34]. GE with ZFNs has been used to produce herbicide-resistant plants, and various kinds of targeted and specific gene insertion have also been demonstrated [35]. In plant biotechnology, zinc-finger proteins (ZFPs) can be exploited in two ways: ZFNs and ZFN-TFs. The flexible nature of ZFPs provides a strong basis for designing ZFNs with desired sequence-specific domains to produce DSBs and facilitate GE [36,37]. ZFNs were reported for the first time in 1996 and were named chimeric restriction enzymes. According to this research, chimeric restriction enzymes were developed by joining the non-specific FokI cleavage domain with the DNA binding domains of two dissimilar ZFPs. The ZFNs were thus chimeric proteins composed of a DNA cleavage domain and a DNA binding domain. A set of 3–6 Cys2His2 ZFs formed the DNA binding domain, while the DNA cleavage domain was derived from the FokI restriction enzyme [18]. FokI is homodimeric in nature and belongs to the type IIS class of restriction enzymes isolated from *Flavobacterium okeanokoites* [18]. The FokI nuclease domain needs to dimerize in order to cut DNA [38]. Active cleavage therefore requires two ZFN monomers, each carrying a C-terminal FokI fusion, that bind DNA and dimerize. Each monomer, containing 3–6 ZFs, recognizes 9–18 bp of target DNA (Figure 1B). The two monomers bind adjacent half-sites separated by a spacer region of 5–7 bp and dimerize to cleave the targeted DNA [10]. ZFNs, as compared to MNs, are small in size (about 300 aa in one monomer and 600 aa in a pair of nucleases), which makes them amenable to many delivery procedures. In the last two decades, ZFNs have been applied for site-specific mutations in plants such as *Arabidopsis thaliana*, soybean, maize, tobacco, and petunia [36,39,40]. However, modular assembly of ZFs has achieved only partial success [41]. Currently, ZFNs are not recommended in several cases because of their lower target specificity, the limited number of specific target domains, and frequent off-target editing [42].
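
The site geometry described above reduces to simple arithmetic, which the following sketch makes explicit. It assumes roughly 3 bp per zinc finger, which is what the stated figures imply (3–6 fingers recognizing 9–18 bp); the function name `zfn_pair_footprint` and the returned field names are hypothetical and chosen only for this illustration.

```python
BP_PER_FINGER = 3  # assumption: one zinc finger reads about 3 bp (3-6 fingers -> 9-18 bp)


def zfn_pair_footprint(left_fingers: int, right_fingers: int, spacer_bp: int) -> dict:
    """Return the half-site sizes and total span of a ZFN pair on its target.

    The accepted ranges follow the text: 3-6 fingers per monomer and a
    5-7 bp spacer between the two half-sites.
    """
    for n in (left_fingers, right_fingers):
        if not 3 <= n <= 6:
            raise ValueError("each monomer is assumed to carry 3-6 zinc fingers")
    if not 5 <= spacer_bp <= 7:
        raise ValueError("the spacer is assumed to be 5-7 bp")

    left_bp = left_fingers * BP_PER_FINGER
    right_bp = right_fingers * BP_PER_FINGER
    return {
        "left_half_site_bp": left_bp,
        "right_half_site_bp": right_bp,
        "recognized_bp": left_bp + right_bp,  # bases actually read by the fingers
        "total_span_bp": left_bp + spacer_bp + right_bp,  # half-sites plus spacer
    }


if __name__ == "__main__":
    # Smallest and largest pairs allowed by the ranges above
    print(zfn_pair_footprint(3, 3, 5))
    print(zfn_pair_footprint(6, 6, 7))
```

Under these assumptions a pair recognizes between 18 and 36 bp in total, which agrees with the target length given for ZFNs in the caption of Figure 1.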

#### *2.3. TALENs*

Another interesting tool for GE is transcription activator-like effector nucleases (TALENs). These are a more recent addition to the SSN toolbox and have been extensively applied for GE in plants [43]. The effectors underlying TALENs were first discovered in 1989, when the pathogenic bacterium *Xanthomonas* was studied in many plant varieties [44]. *Xanthomonas* causes uncontrolled growth of plant cells by synthesizing proteins termed transcription activator-like effectors (TALEs), which target specific DNA sequences and greatly influence gene expression [45]. For targeted GE, TALENs are engineered by changing the TALE repeat domains required for target recognition and then linking them to the FokI nuclease. Like ZFNs, TALENs work in pairs: each TALEN recognizes a stretch of 12–21 bp, and a spacer region of 14–20 bp is required for FokI dimerization (Figure 1C) [10].

TALENs have an advantage over other SSNs such as MNs and ZFNs because of their modular domain. The TALE domain consists of direct sequence repeats of 33–35 aa, and two amino acids within each repeat are called repeat variable di-residues (RVDs). The RVDs are responsible for the recognition of specific nucleotides: NI for adenine, HD for cytosine, NG for thymine, and NN for guanine or adenine. Because a single RVD corresponds to a single nucleotide, specific DNA binding motifs can be designed directly, which removes the re-engineering problems faced in the case of ZFNs and MNs [22,46–48]. Target specificity is another advantage of TALENs over other nucleases. Typically, 15–20 RVDs are used per TALEN monomer, giving a combined target site of more than 30 bp. Compared with ZFNs, TALENs reduce toxicity and are more specific because of their large target sites [49]. Their large size of ~950 aa to ~1900 aa is the main drawback of TALENs as an accurate tool for GE. TALENs are usually delivered into cells via direct introduction of DNA or by integration of a construct harboring a TALEN-encoding unit into the genome. TALENs have been successfully applied for GE in plants such as rice [43], *Arabidopsis* [22], tobacco [42], and *Brachypodium* [26]. TALENs are more extensively exploited for targeted GE than ZFNs, but they still need an efficient way to assemble the tandem repeats that bind the targeted DNA region. Furthermore, their repetitive nature and large size pose major hurdles for the successful delivery of TALENs [45]. Some reported events of MN-, TALEN-, and ZFN-mediated mutagenesis in plants are described in Table 1.
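
The one-RVD-per-nucleotide principle lends itself to a small lookup sketch. The example below assumes the commonly cited pairing NI for adenine, HD for cytosine, NG for thymine, and NN for guanine; NN is also reported to bind adenine, so the mapping is a simplification. The names `RVD_FOR_BASE`, `rvds_for_target`, and `is_typical_monomer` are hypothetical and introduced only for illustration.

```python
from typing import List

# Assumed simplified RVD code: one repeat variable di-residue per nucleotide.
# NN is used here for guanine although it is also reported to bind adenine.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "T": "NG", "G": "NN"}


def rvds_for_target(target: str) -> List[str]:
    """Translate a DNA target (5'->3') into the ordered list of RVDs."""
    target = target.upper()
    unknown = sorted(set(target) - set(RVD_FOR_BASE))
    if unknown:
        raise ValueError(f"unsupported bases in target: {unknown}")
    return [RVD_FOR_BASE[base] for base in target]


def is_typical_monomer(rvds: List[str]) -> bool:
    """Check the 15-20 RVD range that the text gives for a TALEN monomer."""
    return 15 <= len(rvds) <= 20


if __name__ == "__main__":
    half_site = "TACGTTGACCATGCA"  # hypothetical 15-bp half-site
    repeats = rvds_for_target(half_site)
    print(" ".join(repeats))
    print("typical monomer length:", is_typical_monomer(repeats))
```

A table lookup of this kind is all that target design requires in this simplified picture, which is the contrast with ZFNs and MNs drawn above; the practical difficulty lies in assembling and delivering the resulting long, repetitive constructs.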



Catechol-O-methyltransferase (*COMT*), MATRILINEAL (*MTL*), Phytoene desaturase (*PDS*), Fatty acid desaturase (*FAD*), Vacuolar invertase gene (*VInv*), Acetolactate synthase gene (*ALS*), Maize glossy2 (*GL2*), Betaine aldehyde dehydrogenase (*BADH2*), MILDEW-RESISTANCE LOCUS (*MLO*), Green fluorescent protein (*GFP*), Acetolactate synthase genes (*Sur A, Sur B*), Rice bacterial blight susceptibility gene (*Os11N3*).
