*Microorganisms* **2020**, *8*, 305



### *3.2. Comparative Genomics and Molecular Basis of Pathogenicity and Virulence in Fusarium graminearum*

To offset the serious disease problems caused by *F. graminearum* and to gain insight into virulence and the antagonistic defense mechanisms of host plants, it is imperative to identify the pathogenicity and virulence factors that make up the arsenal of this fungal plant pathogen. At present this is conveniently done using high-throughput genome sequencing technologies, which generate large datasets that reveal the genome organization and gene content of *F. graminearum*. The information generated from genome sequencing and RNA sequencing projects can be used to understand the mechanisms that *F. graminearum* employs to gain entry into plant tissue and to advance within the plant. Several commendable projects launched in previous years have advanced our understanding of the pathogenicity and virulence of *F. graminearum*. Among these was the MIPS *Fusarium graminearum* Genome Database (FGDB), which covers an estimated 14,000 genes and for which downstream analysis in a live gene validation process was established to provide a comprehensive genomic and molecular analysis [40]. Because *Fusarium* species are among the most important phytopathogenic and toxigenic fungi, it is essential to understand the molecular underpinnings of their pathogenicity. In addition, the mycotoxins produced by these fungi put animals and humans who consume the crop products at risk. Given this alarming situation, intensive research on *F. graminearum* is imperative, and it must take advantage of the available tools to generate the data needed to elucidate the various pathogenicity and virulence processes of *F. graminearum*. A comparative genomic analysis of three phenotypically diverse species, including *F. graminearum*, revealed that this homothallic species has a relatively narrow host range that includes important cereals [41]. The pathogen is particularly notorious on wheat, barley, rice and oats, where it causes head blight or 'scab', and on maize, where it mainly causes stalk and ear rot [2]. However, genomic analysis shows that this fungus may also infect other plant species without causing disease symptoms. Further genome analysis revealed 67 expressed gene clusters significantly enriched in predicted secondary metabolite genes and functional enzymes, of which 30%, showing gene overexpression, were likely involved in virulence [102]. The exchange of genes between core and supernumerary genomes offers an organism significant opportunities for adaptation and evolution; in *F. graminearum*, this is reminiscent of the compartmentalization of genetic material in which non-conserved regions are found at various places on the four core chromosomes [111]. Comparative genomics studies indicate that mobile pathogenicity chromosomes exist in most *Fusarium* species with lineage-specific genomic regions [41]. Nevertheless, the molecular foundation of pathogenicity in *F. graminearum* was shown to be closely associated with the MAP1 gene, which is also responsible for the development of perithecia in this species [78]. Furthermore, 29 *F. graminearum* genes are rapidly evolving, induced in planta and encode secreted proteins, which strongly points toward effector function [112]; such genomic footprints can be used to predict gene sets likely to be involved in host–pathogen interactions. In association with this, as forward and reverse genetics have improved our understanding of the molecular mechanisms involved in pathogenesis, it was revealed that the mitogen-activated protein kinase and cyclic AMP-protein kinase A cascades both regulate virulence in *Fusarium* species, and it has been postulated that cell wall integrity might be necessary for invasive growth and/or resistance to plant defense compounds [113]. These findings, which give clues to the pathogenicity and virulence of *F. graminearum*, called for a deeper and more comprehensive interrogation of the genome of this fungal pathogen to uncover all of its pathogenicity and virulence genes. NGS has been used to unravel various pathogenicity and virulence factors of *F. graminearum*, to the benefit of researchers worldwide.

### *3.3. Pathogenicity and Virulence Factors of Fusarium graminearum Discovered Using NGS Technologies*

Understanding of the molecular mechanisms that fungal pathogens deploy in plants has accelerated over the past decade. Notably, NGS has contributed immensely to the generation of vast genome and transcriptome datasets. The availability of genome sequences for a majority of fungal plant pathogens has contributed to these discoveries [28,29,114,115]. Genomics, transcriptomics, proteomics, and metabolomics approaches were introduced, allowing the identification of fungal genes, proteins, and metabolites in various artificial cultures and during infection of plants under different experimental conditions [116,117]. Fungal pathogens cause tissue damage and disease in plants through pathogenicity and virulence factors, which assist the survival and persistence of the pathogen [118]. The pathogens affect their hosts by adapting to their environment and producing or secreting pathogenicity-related toxins, pectic enzymes, and hormone-like compounds. These products can have devastating effects on the quality and yield of crops in the field and can also cause postharvest diseases [119]. The mechanisms involved in fungal pathogenesis in plants are therefore being studied broadly, to protect plants against diseases of economic importance. *F. graminearum* is one of the plant pathogens that affect grain cereal crops globally, causing different diseases in different crops [80,120,121]. The pathogen produces toxic and non-toxic metabolites, which enable it to manipulate the plant to acquire nutrition. The deployment of pathogenicity and virulence factors by the pathogenic fungus results in the development of disease in plants. Pathogenicity factors involved in plant-pathogen interactions have been investigated extensively in different plants, and genes, proteins, and metabolites have been identified [81,97,122–125], including those involved in the response to *F. graminearum* infection [55,104,113,120,126,127]. Most of the pathogenicity and virulence genes identified so far belong to the trichothecene biosynthesis gene cluster, as described by Proctor and colleagues [95]. For *F. graminearum*, it is widely accepted that pathogenicity and virulence proceed through the emergence of germ tubes from conidia, the production of cell wall-degrading enzymes, and the production of trichothecene mycotoxins [53,94,95,128].

However, complete delineation of the infection process of *F. graminearum* requires sequencing and analysis of the entire fungal genome, preferably using NGS. Among the various efforts to study pathogenicity and virulence factors of *F. graminearum* using NGS, four studies are worth noting. The first is the study by King and colleagues [115], in which sequencing was done on the Illumina HiSeq 2000 platform and which provided a complete genome sequence by combining *F. graminearum* genome sequence information from various sources. Through modification of the gene model set, the FGRRES\_17235\_M gene was identified as being of particular interest because it is a virulence factor: it encodes a cysteine-rich secretory protein, allergen V5/Tpx-1-related, with CAP and signal peptide domains; it has a previous link with plant pathogenesis proteins of the PR-1 family [129]; and it had been identified in the highly virulent *F. graminearum* strain CS3005 (gene ID: FG05\_09548). Another gene, 15917\_M, was identified as encoding an endo-1,4-beta-xylanase, an enzyme which hydrolyzes (1→4)-beta-D-xylosidic linkages in the xylans of cell walls. The second study, by Wang and colleagues [130], identified eight genes responsible for *F. graminearum*-wheat interactions, three of which had already been identified in other studies [112,131]. Gene annotation revealed largely polymer-degrading functions, i.e., xylanase, catalase, protease and lipase. The third study was by Cuomo and colleagues [114], who identified a variety of pathogenicity and virulence factors belonging to the gene classes cutinases, pectate lyases and pectin lyases, along with other genes encoding secreted proteins. The fourth study, by Laurent and colleagues [132], reports among its findings the presence of 616 potential effector genes, 126 of which are expressed in a host-specific manner. This study, which used the Illumina HiSeq 2000, also identified 252 variants within the genic and intergenic sequences of *Tri* genes, which are involved in the production of type B trichothecenes. Given the mammoth task of elucidating the pathogenicity and virulence factors of *F. graminearum*, increased efforts in whole-genome sequencing of strains from various parts of the world are necessary. These efforts must be coupled with in planta gene expression studies to assess genes that are differentially expressed during the infection of a host plant by *F. graminearum*. Traditionally, common techniques for gene expression studies included northern blotting, real-time PCR and microarrays. Nowadays, most studies that could be done using these techniques can be conducted conveniently using NGS. However, these relatively old techniques paved the way for NGS.

### 3.3.1. Notable Studies Which Paved the Way for NGS

Before NGS technologies became commonplace, northern blotting, real-time PCR, and microarrays were used to discover pathogenicity and virulence factors of fungi, including *F. graminearum*. These studies paved the way for studies based on NGS. Some of the studies reviewed in this section, none of which were performed in the past two years, used proteomic and metabolomic approaches. Analysis of the *F. graminearum* genome indicates that the pathogen has 1250 genes that encode secreted protein effectors [133]. Genes associated with fungal pathogenesis in vitro and in planta have been identified, and a large number are activated during infection. The classes of *F. graminearum* genes overlap with those reported in other plant-microbe interaction studies. They include the trichothecene gene cluster [95], a bZIP transcription factor [104], syntaxin-like t-SNARE proteins [81], PKS [134], lipase [65], EBR1 [127], and FgATG15 [135], among others. These genes play different roles in fungal pathogenesis. Among them, the *Tri* gene cluster has been characterized in the *F. graminearum* species complex, with the type B trichothecenes being the most studied owing to their ability to cause disease in both animals and humans [136]. In addition, proteins such as the five TRI proteins (TRI1, FG00071; TRI3, FG03534; TRI4, FG03535; TRI14, FG03543; TRI101, FG07896) [137]; kinases [138]; and FG00028, metallopeptidase MEP1; FG00060, KP4 killer toxin; FG00150, NADP-dependent oxidoreductase (COG2130); FG00192, peptidase S8 (pfam00082); and FG00237, O-acyltransferase (pfam02458), among others [139], were found to be involved in *F. graminearum* pathogenesis [124,140,141]. The study by Dhokane et al. [124] demonstrates the interface between metabolomics and NGS. The roles of these genes have since been characterized using high-throughput sequencing approaches, confirming their involvement in fungal pathogenesis. NGS studies can examine either the genome or gene expression, the latter by means of transcriptomics. Both have been instrumental in discovering pathogenicity and virulence factors of *F. graminearum*.

### 3.3.2. *Fusarium graminearum* Pathogenesis-Related Genes Discovered Using RNA-Seq Transcriptomics

Many technologies have been employed for the detection, identification, and quantification of mycotoxins produced by *F. graminearum* during infection of grain cereals. A few studies, including that of Pasquali and Migheli [136], report that the most important fungal mycotoxins produced by *Fusarium* spp. belong to the type B trichothecenes. Identifying differentially expressed genes regulated by mycotoxins using transcriptomics is one approach to identifying and cataloging the pathogenicity and virulence factors involved in *F. graminearum* infection of grain cereals. Using comparative transcriptomic analysis, Walkowiak and colleagues [142] identified 1500 differentially expressed genes between two *F. graminearum* strains with the 3-ADON and 15-ADON trichothecene chemotypes. Furthermore, a whole-genome sequencing and comparative genomics study investigated four *Fusarium* strains and reported a few pathogenicity and virulence genes [99]. These included the g8968 gene, which was predicted to contain the *Tri5* domain. *Tri5* is a terpene synthase gene whose product catalyzes the first step of trichothecene biosynthesis in *F. graminearum* [95]. *Tri8* was also identified in this study and was reported to exhibit a high frequency of SNPs and indels. The importance of *Tri5* in pathogenicity and virulence is also supported by non-NGS studies. Boddu and colleagues [143] report that *Tri5* encodes an enzyme of DON biosynthesis and showed that loss-of-function *F. graminearum tri5* mutants were unable to produce DON in wheat and barley. The importance of *Tri5* was likewise observed in a study by Jonkers and colleagues [144], in which the Wor1-like protein Fgp1 regulated pathogenicity, toxin synthesis and reproduction in *F. graminearum*. That study predicted that the loss of mycotoxin accumulation alone may be enough to explain the associated loss of pathogenicity to wheat.
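As a rough illustration of how differentially expressed genes are flagged in comparative transcriptomics, the sketch below computes a log2 fold change between two conditions and applies a threshold. It is a simplification under stated assumptions: the gene names, expression values, pseudocount, and cut-off are hypothetical, and the cited studies would typically rely on dedicated statistical packages that also model replicate variance rather than a bare fold-change filter.

```python
import math


def log2_fold_change(value_a, value_b, pseudocount=1.0):
    """Return log2((a + pc) / (b + pc)); the pseudocount avoids division by zero."""
    return math.log2((value_a + pseudocount) / (value_b + pseudocount))


def call_degs(expr_a, expr_b, min_abs_lfc=1.0):
    """Flag genes whose expression differs between two conditions.

    expr_a / expr_b: dicts mapping gene ID -> normalized expression value.
    min_abs_lfc: assumed cut-off on |log2 fold change| (illustrative only).
    """
    degs = {}
    for gene in sorted(expr_a.keys() & expr_b.keys()):
        lfc = log2_fold_change(expr_a[gene], expr_b[gene])
        if abs(lfc) >= min_abs_lfc:
            degs[gene] = lfc
    return degs


# Made-up values for two hypothetical strains (not data from any cited study).
strain_one = {"geneA": 799.0, "geneB": 50.0, "geneC": 9.0}
strain_two = {"geneA": 99.0, "geneB": 48.0, "geneC": 39.0}

degs = call_degs(strain_one, strain_two)
for gene, lfc in degs.items():
    direction = "up" if lfc > 0 else "down"
    print(f"{gene}: log2FC = {lfc:+.2f} ({direction} in strain one)")
```

With these toy values, geneA and geneC pass the threshold while geneB does not; a positive value means higher expression in the first condition.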

Using transcriptomic analyses, differentially expressed genes (DEGs) were identified in infected spikelet and rachis samples of wheat following *F. graminearum* infection [145]. Among the DEGs identified, a few trichothecene biosynthesis genes of the *F. graminearum Tri* cluster were mostly upregulated in the pathogen when it infected the resistant near-isogenic lines (NILs). Interestingly, another transcriptomic study compared three host plants infected with an *F. graminearum* strain to identify DEGs during colonization [103]. The study found that some genes were expressed only in a specific host and that the functional categories of the genes identified also differed between hosts. In summary, the pathogenicity and virulence factors of *F. graminearum* discovered using NGS technologies (listed in Section 3.3) are provided in Table 2 below:




**Table 2.** *Cont.*

### *3.4. Fusarium graminearum Pathogenesis Proteins Discovered Using Proteomics Approaches*

Proteins are macromolecular machines that undertake various biochemical functions, serving as building blocks, transporters, and enzymes, among other roles. Protein functions are coordinated and intertwined with other constituents of organisms, such as genes, RNA, and metabolites.

Proteomics is the large-scale study of the sets of proteins produced by organisms. The total set of proteins produced by an organism is termed the proteome. The proteome varies across cells and is, to some extent, defined by the underlying transcriptome. Traditionally, proteins were studied using low-throughput methods, which focus on a relatively small set of proteins and provide qualitative data on structure, function, and interaction with other cell constituents. The small windows of knowledge opened by these traditional techniques denied biochemists a broader picture of the entire proteome of a cell. From these traditional methods emerged proteomics, which is able to provide a snapshot of the total proteins in an organism. In contrast to gel-based and antibody-based methods of studying proteins, mass spectrometry (MS) has been used to produce large datasets on the proteome. The basic proteomics workflow is the extraction of total proteins from the tissue, followed by trypsin digestion, chromatographic separation of the resulting short peptides, and mass analysis by MS. This is followed by the identification of the proteins in the studied sample and the generation of the protein list. As with NGS, proteomic studies have been accelerated by the invention of various instruments that perform the separation of the digested peptides, mass analysis, and other downstream applications. These instruments had to meet a number of requirements, including high throughput and high confidence in peptide identification; notable examples are the Orbitrap and time-of-flight mass analyzers [147–150]. The common trend in improving the performance of mass spectrometers was to create hybrid systems, which use different ion analyzers or separators to enhance the capability, quality and usefulness of the results obtained. The advent of the triple quadrupole instrument enhanced MS capability over that of a single quadrupole. With the triple quadrupole, data on m/z values are combined with data on molecular fragmentation patterns to improve accuracy. The fragmentation pattern data are made possible by the second quadrupole, which acts as a collision cell.
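The digestion and mass-analysis steps of this workflow can be sketched computationally. The example below performs an in silico tryptic digest (cleavage after K or R unless the next residue is P, with no missed cleavages) and computes the monoisotopic m/z expected for each peptide at a given charge state. It is a minimal sketch: the protein sequence is invented, the function names are illustrative, and real search engines additionally handle missed cleavages, modifications, and fragment ions.

```python
import re

# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O added to the residue sum of a free peptide
PROTON = 1.007276   # mass of the proton carried per positive charge


def tryptic_digest(protein):
    """Split after K or R, except when the next residue is P."""
    return [pep for pep in re.split(r"(?<=[KR])(?!P)", protein) if pep]


def peptide_mass(peptide):
    """Monoisotopic neutral mass: sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def mz(peptide, charge=1):
    """m/z of the peptide ion carrying `charge` protons."""
    return (peptide_mass(peptide) + charge * PROTON) / charge


# Invented sequence used only to exercise the functions.
toy_protein = "AGKRPSTRMLK"
for pep in tryptic_digest(toy_protein):
    print(f"{pep}: M = {peptide_mass(pep):.4f}, [M+2H]2+ = {mz(pep, 2):.4f}")
```

In an actual experiment the measured m/z values and fragmentation patterns are matched against such theoretical values to identify the peptides and, from them, the proteins in the sample.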

Fungi produce proteins for pathogenicity and virulence. Some of these proteins are secreted into the intercellular spaces of the plant, where they may either degrade the cell wall or act as effectors to perform various other pathogenicity and virulence functions. The secreted proteins are part of what is called the secretome. For comprehensive studies of pathogenicity and virulence proteins, studying the whole-organism proteome becomes necessary, and this is made possible by high-throughput proteomics instruments. Yang and colleagues [138] pointed out that the advent of "omics" and bioinformatics tools has enhanced the proteome analysis of phytopathogenic fungi and their host interactions. Paper and colleagues [139] identified 120 fungal proteins of *F. graminearum*, including CWDEs, from infected wheat heads through vacuum infiltration. Among the identified proteins, about 56% contained putative secretion signals. Transcriptomics data can be complemented with other omics approaches such as proteomics. The functions of proteins expressed at a given time can be identified and understood using proteomics approaches [138,151]. Although high-throughput sequencing technologies have been available for over a decade, the identification of differentially expressed proteins involved in fungal pathogenicity in cereals has not been widely investigated. Several proteins have been identified and characterized in *F. graminearum* and associated infections; however, these studies focused on the secretome and the impact of DON [137,139,152]. When the expression of *F. graminearum* proteins was investigated in response to in vitro stimulation of trichothecene mycotoxin biosynthesis, 130 *F. graminearum* proteins that showed changes in expression were reported [137]. Many of the proteins identified were involved in fungal virulence. Moreover, investigation of a secretome of *F. graminearum* annotated secreted pathogenesis proteins related to the KP4 killer toxin and gEgh 16 proteins, among others, which were associated with pathogenicity [146].

Recently, a genome-wide analysis by Lu and Edwards [153] reported about 190 small secreted cysteine-rich proteins (SSCPs) in the genome of *F. graminearum*. Of the SSCPs reported, five were found to belong to the cysteine-rich secretory proteins, Antigen 5 and pathogenesis-related 1 proteins. These SSCPs were observed to share homology with proteins that have established crystal structures. The authors also noted that previous studies had not reported an association of these SSCPs with pathogenicity or virulence in plants. Moreover, in planta expression patterns showed upregulation of nine proteins associated with pathogenesis. These proteins contain conserved domains (Ecp-2-like, panels 1 and 4; CFEM-like, panel 3; Kp4-like, panels 10, 14 and 9; PR-1-like, panel 11; hydrophobin-like, panel 12; and glycol\_61 family, panel 13) which are linked to fungal pathogenicity. This is in line with a study by Paper and colleagues [137], who used a high-throughput MS/MS comparative study to identify 229 in vitro and 120 in planta proteins secreted by *F. graminearum*, the latter during infection of a wheat head. The study reported that 49 in planta proteins were not present in vitro, indicating that fungal lysis occurred during pathogenesis. Rampitsch and colleagues [154], on the other hand, used a comparative secretome analysis and reported 29 proteins whose relative abundance in the secretome was affected following infection by *F. graminearum*. These proteins included metabolic enzymes, proteins of unknown function and pathogenesis-related proteins. Other pathogenesis-related proteins identified in vitro and in planta in studies of *F. graminearum* include the PR-3 and PR-5 proteins [155]. Various forms of proteomics are valuable for studying plant-pathogen relations and other factors, including elicitors. One example is phosphoproteomics, which has to some extent been applied to necrotrophic pathogens such as *B. cinerea* [156–158] and *Septoria tritici* [159]. These studies need to be extended to *F. graminearum* to deepen the understanding of its pathogenicity and virulence.
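A genome-wide screen for small secreted cysteine-rich proteins of the kind reported by Lu and Edwards [153] can be pictured as a simple filter over predicted protein sequences. The sketch below is illustrative only: the length and cysteine-content thresholds are assumptions chosen for the example and are not the criteria of the cited study, the sequences and gene IDs are invented, and the secretion-signal flag is assumed to come from an external signal peptide predictor.

```python
from dataclasses import dataclass


@dataclass
class Protein:
    gene_id: str              # hypothetical identifier
    sequence: str             # amino acid sequence, one-letter code
    has_signal_peptide: bool  # assumed output of an external predictor


def cysteine_fraction(sequence):
    """Fraction of residues that are cysteine (0.0 for an empty sequence)."""
    if not sequence:
        return 0.0
    return sequence.upper().count("C") / len(sequence)


def is_sscp_candidate(protein, max_length=200, min_cys_fraction=0.04):
    """Apply the three assumed criteria: secreted, small, cysteine-rich."""
    return (
        protein.has_signal_peptide
        and len(protein.sequence) <= max_length
        and cysteine_fraction(protein.sequence) >= min_cys_fraction
    )


# Invented proteins used only to exercise the filter.
proteome = [
    Protein("toy_0001", "MKCACCGTCA", True),    # small, secreted, cysteine-rich
    Protein("toy_0002", "MKAAGGTTSA", True),    # secreted but no cysteines
    Protein("toy_0003", "MKCACCGTCA", False),   # cysteine-rich but not secreted
    Protein("toy_0004", "MC" + "A" * 300, True),  # secreted but too long
]

candidates = [p.gene_id for p in proteome if is_sscp_candidate(p)]
print(candidates)
```

Candidates passing such a filter would still need expression data and functional assays before any role in pathogenicity or virulence could be claimed.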
