**Contents**


Reprinted from: *Toxins* **2020**, *12*, 730, doi:10.3390/toxins12110730

#### **Ryan S. Mote and Nikolay M. Filipov**


## **About the Editor**

#### **James L. Klotz**

James L. Klotz is a research animal scientist with the USDA-ARS, Forage-Animal Production Unit. Dr. Klotz's research has focused on the influence that ergot alkaloids have on grazing livestock physiology. He has collaborated with scientists around the world and across many disciplines to improve the understanding of how ergot alkaloids interact with animal and plant systems.

## *Editorial* **Global Impact of Ergot Alkaloids**

**James L. Klotz**

United States Department of Agriculture-Agricultural Research Service, Forage-Animal Production Research Unit, Lexington, KY 40546, USA; james.klotz@usda.gov; Tel.: +1-859-257-1647; Fax: +1-859-257-3334

For many years, ergot alkaloids have been considered both a problem to be mitigated and a potential medical cure. These compounds have been primarily studied in the medical/pharmaceutical [1] and agricultural fields [2]. Depending on one's perspective, the impact that ergot alkaloids have had on the progress of human medicine and livestock production can be either positive or negative. The dose or concentration of ergot alkaloid exposure is paramount. This can determine whether these compounds are implicated in the morbidity and mortality of individuals with St. Anthony's Fire, or whether they can be used to treat migraines and post-partum bleeding; it can determine whether they are used to maximize plant resistance and persistence, or whether they constitute an animal welfare concern for grazing livestock [3,4]. The ethics of ergot alkaloid use is debated to this day, but there is no debating the impact of these compounds. Many of the positive and negative issues associated with ergot alkaloids have specific conditions with regional implications, but that does not diminish the magnitude of impact that these compounds have had on humans, livestock, and plants globally.

Research evaluating ergot alkaloids can be both basic and applied. Many types of research perspectives are necessary in understanding ergot alkaloids' functions and how these findings might be applied. The focus of this Special Issue concerns original research and review articles that highlight benefits and detriments, and successes and failures, involving ergot alkaloids around the world with deference to regional distinctions. Research models range from fungus, to plant, to mammal; and the ergot alkaloids produced by both *Claviceps* and *Epichloë* spp. of fungi are included in this Special Issue. All submissions focus on ergot alkaloids' effects (positive or negative) in different contexts. There is a benefit to this shared interest, even if the issues with ergot alkaloids do not directly overlap.

Significant advancements in the manipulation of plant–endophyte symbioses have been made in recent years to optimize the profile and concentration of the secondary compounds produced [5]. Eady [6] has reviewed the complexities plant breeders encounter when selecting a desired plant–endophyte symbiont, with New Zealand ryegrass as a model. This is a balance between selecting a source of ergot alkaloids that permits greater plant persistence and inhibiting the ergot alkaloid production that results in mycotoxicosis in grazing livestock, all in combination with desired plant traits. This makes the understanding of ergot alkaloid production paramount. The potential of using various -omics technologies to study ergot alkaloid production has been demonstrated in this Special Issue. In addition to traditional selection processes, Florea et al. [7] demonstrate the use of CRISPR technology to create a non-transgenic strain of *Epichloë* fungus without the genes necessary to produce ergot alkaloids. Fungi that produce ergot alkaloids can be endophytic or parasitic. Ergot contamination of cereal crops in Canadian provinces has become an issue of increasing concern. Hicks et al. [8] evaluate diversity in genes related to ergot alkaloid production in Canadian strains of the parasitic *Claviceps purpurea* to better characterize and understand the variation of ergot alkaloid content. Also looking at Canadian strains of *C. purpurea*, Liu et al. [9] evaluated the evolution patterns of gene clusters associated with different classes of ergot alkaloid production. Work of this caliber is critical to better understand this evolving issue.

**Citation:** Klotz, J.L. Global Impact of Ergot Alkaloids. *Toxins* **2022**, *14*, 186. https://doi.org/10.3390/toxins14030186

Received: 18 February 2022 Accepted: 1 March 2022 Published: 3 March 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Historically, human interactions with ergot alkaloids have been defined by large-scale poisonings through the consumption of contaminated grains [1]. Incidents of human ergot poisonings are increasingly rare due to improvements in crop management, grain screening and cleaning [10], and the regulation of safe quantities in food and feed [11]. However, there are still areas in the world where this can be an issue [12], and there is also still interest in the pharmaceutical potential of ergot alkaloids. Their most prominent uses have been the treatment of migraines and the control of post-partum bleeding in the 18th and 19th centuries. In a current review of the past gynecological and obstetric uses of ergot alkaloids, Smakosz et al. [13] defined a potential role for the application of ergot alkaloids in modern obstetrics. In addition to clinical uses of ergot alkaloids, research assessing the sustainable production of ergot alkaloids in desirable formulations is needed. Shahid et al. [14] have developed a response surface methodology to select strains of *Penicillium citrinum* for their ability to produce ergot alkaloids in culture. Many researchers who study ergot alkaloids can relate to the challenges associated with obtaining purified forms of desired ergot alkaloids in any quantity.

Although medical applications focus on ergot alkaloids' positive effects in humans, animal agriculture has historically and consistently viewed ergot alkaloids as a problem to be solved. Further, changing environmental conditions continually alter the profiles and concentrations of ergot alkaloids that fungi produce. This necessitates routine surveys of grains and grasses. In this Special Issue, such surveys are exemplified by the on-farm monitoring of ergot alkaloid levels in Kentucky horse pastures described by Lea and Smith [15], as well as the ergot alkaloids found in Slovenian feed grains, as described by Babic et al. [16]. Research of this nature is ongoing globally and contributes greatly to the mitigation of large-scale problems as well as the identification of future areas in need of research.

The variation in the content and concentration of ergot alkaloids is further complicated by the varied responses that exposed livestock show to the toxins. Poole et al. [17] and Wilbanks et al. [18] have studied various aspects, including genetics, that may make cattle more resistant to consumed ergot alkaloids. Ault-Seay et al. [19] used advanced -omics technologies to look at the rumen microbial and host metabolomes to provide a whole-animal characterization of impacts of ergot alkaloids. Mote and Filipov [20] reviewed the use of interactomics to provide a systemic understanding of the pathologies caused by the ergot alkaloids that cause fescue toxicosis. A very specific pathology associated with ergot alkaloids and ergotism is chronic vasoconstriction. Yonpaim et al. [21] looked at the effect of acute ergot alkaloid exposure on vasoactivity in ovine vasculature, and Valente et al. [22] evaluated the effect of prolonged ergot alkaloid exposure on the vasoactivity of bovine vasculature. Both studies [21,22] evaluated aspects related to the ability of ergot alkaloids to interact with adrenergic and serotonergic receptors [23], and both papers concluded that receptor-mediated treatments for ergot alkaloid-induced vasoconstriction could be explored as potential therapies. From a systemic evaluation of ergot alkaloids' impact on the whole animal or microbiome, to the study of a specific symptom, there is much yet to be learned about how ergot alkaloids disrupt mammalian physiology.

The collection of papers in the Global Impact of Ergot Alkaloids (https://www.mdpi.com/journal/toxins/special\_issues/ergot\_alkaloid) (accessed on 17 February 2022) Special Issue highlights the rich diversity of research and the complexity of the problems centered around ergot alkaloids. Although many specific issues related to accidental or intentional consumption of ergot alkaloids can be localized to a certain geographic region, the problems, challenges, and fascination with ergot alkaloids are global.

**Funding:** This research received no funding.

**Acknowledgments:** This editor is grateful to all the contributing authors. Their expertise and high-quality research made this special issue possible. Also, special thanks are extended to all the reviewers who provided rigorous reviews that have improved the content of this special issue. Lastly, I would like to thank the staff of MDPI Toxins journal for their patience and steadfast organization.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **References**


## *Review* **The Impact of Alkaloid-Producing** *Epichloë* **Endophyte on Forage Ryegrass Breeding: A New Zealand Perspective**

**Colin Eady**

Barenbrug, New Zealand Ltd., 2547 Old West Coast Road, Courtenay, Christchurch 7671, New Zealand; ceady@barenbrug.co.nz; Tel.: +64-27-717440

**Abstract:** For 30 years, forage ryegrass breeders have known that the germplasm may contain a maternally inherited symbiotic *Epichloë* endophyte. These endophytes produce a suite of secondary alkaloid compounds, dependent upon strain. Many produce ergot and other alkaloids, which are associated with both insect deterrence and livestock health issues. The levels of alkaloids and other endophyte characteristics are influenced by strain, host germplasm, and environmental conditions. Some strains in the right host germplasm can confer an advantage over biotic and abiotic stressors, thus acting as a maternally inherited desirable 'trait'. Through seed production, these mutualistic endophytes do not transmit into 100% of the crop seed and are less vigorous than the grass seed itself. This causes stability and longevity issues for seed production and storage should the 'trait' be desired in the germplasm. This makes understanding the precise nature of the relationship vitally important to the plant breeder. These *Epichloë* endophytes cannot be 'bred' in the conventional sense, as they are asexual. Instead, the breeder may modulate endophyte characteristics through selection of host germplasm, a sort of breeding by proxy. This article explores, from a forage seed company perspective, the effects that endophyte characteristics and breeding them by proxy have on ryegrass breeding, and outlines the methods used to assess the 'trait' and the application of these through the breeding, production, and deployment processes. Finally, this article investigates opportunities for enhancing the utilisation of alkaloid-producing endophytes within pastures, with a focus on balancing alkaloid levels to further enhance pest deterrence and improve livestock outcomes.

**Keywords:** endophyte transmission; livestock safety; insect testing; quality control; alkaloid profile

**Key Contribution:** This manuscript details the history and commercial exploitation of alkaloid-producing endophyte in New Zealand, highlighting the issues faced with ergot alkaloid-producing strains, and the future opportunities and risks of deploying alkaloids via a plant host.

#### **1. Introduction**

**Citation:** Eady, C. The Impact of Alkaloid-Producing *Epichloë* Endophyte on Forage Ryegrass Breeding: A New Zealand Perspective. *Toxins* **2021**, *13*, 158. https://doi.org/10.3390/toxins13020158

Received: 30 November 2020 Accepted: 6 February 2021 Published: 18 February 2021

**Copyright:** © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Efficiently producing animal-sourced food (ASF) from pasture using 'low-cost livestock' (in situ grazing of forage) is a key requirement in order to meet land-use sustainability criteria [1]. New Zealand pastoral farming systems are some of the most efficient 'low-cost livestock' systems in the world, producing ASF economically and with a relatively small environmental footprint [2–4] (see also https://www.dairynz.co.nz/news/researchshows-nz-dairy-the-world-s-most-emissions-efficient/ accessed on: 11 February 2021). This is principally because the NZ climate is conducive to high year-round biomass production of high metabolizable energy, palatable temperate pasture. *Lolium perenne*, ryegrass, although an introduced species, is currently the principal grass of choice in NZ because cultivars have been bred to perform within NZ climatic ranges [5], with a cardinal temperature that enables good growth [6], along with architecture and physiology to provide high year-round photosynthetic conversion efficiency [7]. To complement these plant characteristics, management requires the addition of low-cost nutrients provided in part by addition of a legume, white clover, nitrogen and phosphate [8]. Persistence (biomass production/time)
of such pastures is achieved through a combination of good stock and pasture management, breeding of the ryegrass germplasm (e.g., for rust resistance and seasonal growth), and the use of *Epichloë* endophytes to relieve biotic and abiotic stressors. These endophytes are asexual, maternally inherited and produce characteristic alkaloid profiles [9]. The compatibility of endophytes with the grass host, the effect they have on the host phenotype, and the temporal and spatial alkaloid profile produced are key components that need to be understood before deployment decisions can be made. The successful use of *Epichloë* endophytes has been achieved despite numerous application problems, and no single perfect endophyte exists on the market. Biologically, this is not surprising as the asexual *Epichloë* is trapped in the vegetative tissue or seed of a host ecotype. Variants are essentially limited to somatic mutations as recombination and reassortment of chromosomes do not occur. There are exceptions to this, and hybridisations between endophytes have occurred throughout their evolution [10,11] to produce diploid and triploid interspecific hybrids. For more on the evolution of endophytes, refer to the works of Schardl and Hettiarachchige [9,10,12]. These biological characteristics favour the development of local variants evolved to live within a geographically constrained grass ecotype with specific growth characteristics (e.g., architecture, flowering dates, dormancy) and under local biotic and abiotic stressors. Endophyte discovery projects [13] have found localised variants such as AR37, nea2, nea6, and CM142. Transfer of endophytes between ecotypes or species can cause gross changes to the symbiosis [13]. Elite forage ryegrass has been bred, through sexual recombination, primarily for biomass production and performance within an environment.
This may rapidly change architecture, heading date, vernalisation requirement, etc., of the plant, which a recent study has suggested may limit the symbiotic association [14]. More studies are required to fully understand what dictates the temporal and spatial growth of the endophyte in the ryegrass and particularly the related alkaloid expression profiles. Many studies have demonstrated host-induced differences in *Epichloë* traits other than transmission and viability, which are crucial production traits. The same *Epichloë* in a different ryegrass host can demonstrate a >10-fold alkaloid expression range [15] and/or have altered infection characteristics. Previous research has primarily consisted of one-off studies comparing one combination with another [15]. Exactly replicating biotic conditions, sampling, analysis, etc., so that results can be compared between manuscripts is difficult, which makes it hard to home in on the cause(s) of any differences. One theme that comes through is that tetraploid ryegrass tends to have lower alkaloid concentrations than diploid. This is perhaps not surprising, as the larger cell size of the tetraploids would provide a larger plant cell volume:endophyte ratio, thus diluting any endophyte contribution. A better understanding of the interplay between host and endophyte is required and, until then, each ryegrass cultivar–*Epichloë* strain relationship must be assessed on its own merits.
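The dilution argument for tetraploids can be made explicit with a deliberately simple model. This is an illustrative assumption, not a result from the cited studies: the symbols $A$ (amount of alkaloid contributed by the endophyte per unit of sampled tissue), $V$ (plant cell volume in that tissue), and $k$ (the ratio of tetraploid to diploid cell volume) are introduced here only for the sketch.

$$
c = \frac{A}{V}, \qquad V_{4x} = k\,V_{2x} \ (k > 1) \quad\Longrightarrow\quad c_{4x} = \frac{A}{k\,V_{2x}} = \frac{c_{2x}}{k}
$$

Under these assumptions, an endophyte producing the same amount of alkaloid would show a measured concentration lower by the factor $k$ in the tetraploid host. Any host effect on endophyte biomass or on alkaloid expression itself would change $A$ and is not captured by this sketch.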

This manuscript provides a brief overview of endophyte strains used in New Zealand and the ergot profiles produced, and then focuses on the practical breeding of ryegrass containing *Epichloë* endophyte and outlines some of the key challenges facing a grass seed company's day-to-day breeding, production, and distribution. This manuscript will highlight quality control (QC) testing requirements and their difficulties, and functionality testing requirements of the industry, including shortfalls in alkaloid knowledge. This manuscript then investigates opportunities for breeding the host to better accept an endophyte as well as advances in endophyte selection, and deployment systems for use in future pasture systems. Although the focus is on *Epichloë*–ryegrass relationships in NZ, the issues and opportunities outlined are likely applicable to endophytes of other crop and forage species.

#### **2. Brief History of Endophyte Strains Used in New Zealand Grass Breeding**

Endophytic fungi have been known about for a long time [16] but the connection to animal livestock health and insect deterrence took some time to discover. It was not until the late 1970s in the USA that an endophytic fungus in tall fescue (*Festuca arundinacea*) was shown to cause fescue toxicosis in cattle [17] and then in New Zealand in the early 1980s the closely related endophyte in perennial ryegrass was shown to cause ryegrass staggers in sheep [18]. In the subsequent discovery period and following a series of name changes, they eventually became known as *Epichloë festucae* var. *lolii* and the tall fescue endophyte *Epichloë coenophiala* [19]. The original New Zealand var. *lolii* strain that produces the alkaloids peramine, lolitrem B and ergovaline was termed Standard Endophyte (SE), wild type or Common Toxic to distinguish it from the strains discovered from there on. Endosafe™ was the first commercial ryegrass endophyte, which was released in 1990. From the literature, it appears (but is difficult to confirm) that Endosafe™ was originally AR6, which itself was two strains, later identified to be AR5 and AR77, which both produce peramine and ergovaline [20,21]. Initial studies (1991) indicated that Endosafe™ demonstrated good insect deterrence and was safe for animal performance [22]. This claim was quickly questioned as subsequent studies reported animal health issues attributed to the ergovaline [21,23] and it became understood that alkaloid profiles of the same endophyte could differ substantially within different hosts [15]. Endosafe™ in diploid perennial ryegrass cultivar 'Pacific' was withdrawn from the market, but the tetraploid cultivar 'Greenstone' with Endosafe™ continued to be sold. Reselection (possibly from the original Endosafe™) for lower ergovaline identified the strain AR5, which was later marketed as Endo5 and is still sold in Australia today. The known role of ergot alkaloids in fescue toxicosis steered ryegrass endophyte research away from ergot alkaloid-producing *Epichloë* and, shortly after Endosafe™, strain AR1 was introduced to the New Zealand market in 2001. AR1 produced peramine but no ergovaline or lolitrems and proved to be animal safe and demonstrated good Argentine stem weevil deterrence [24–26], leading to its successful uptake by the NZ pastoral sector.
Unfortunately, the single alkaloid profile of AR1 did not deter black beetle and other pests, which led to widespread product failure in the upper North Island of NZ, where these pests are prevalent [27].

AR1 was the first endophyte to obtain plant variety rights (PVR) protection in NZ, which set the precedent for future endophytes (Table 1). After AR1, the NEA2 endophyte in diploid cultivar 'Tolosa' was released. This endophyte was initially not fully characterised (see below) and produced moderate levels of peramine and ergovaline, and low levels of lolitrems. The product was withdrawn due to seed production issues. In 2007, a fourth ryegrass endophyte, AR37, was released to the market. AR37 ryegrass appeared to have similar or better persistence than ryegrass with SE endophyte [28,29]. AR37 still caused occasional outbreaks of ryegrass staggers, which could be severe, but generally animal production compared favourably with nil endophyte ryegrass [30]. AR37 produces epoxy-janthitrem alkaloids, via a similar biochemical pathway to that for lolitrem production, but janthitrems have a lower potency than lolitrem B [31,32]. Again, as with Endosafe™, it was shown that the individual host–endophyte relationship was important in regulating alkaloid expression, with some AR37/ryegrass cultivars causing greater staggers than others (certain AR37 products contain warnings to this effect). In some studies, a decrease in milk solids production in dairy cows has also been observed in pastures containing AR37 [26,33]. License terms for AR37 caused some NZ companies to search for alternative endophytes. This led to additional Novel Endophyte Agriseed (NEA) endophytes being developed. These have primarily originated in Spanish ryegrass germplasm but, as the discovery programme gained momentum, endophytes from other regions of the world have been included. Many hundreds of *Epichloë* were screened but present commercial NEA endophytes are derived mainly from three strains: nea2, nea6, and nea3.
Note that most endophyte strains are denoted by capital letter(s) and a number; for the sake of clarity, the 'nea' strains are represented here in lowercase to distinguish them from the commercial 'NEA' products. The nea strains are marketed singly or in combination as NEA, NEA2, or NEA4, and produce a combination of peramine, ergovaline and a low level of lolitrem B. They have been tested to be animal safe and give good insect deterrence, but like AR37, host–endophyte interactions are important and industry evaluation tables carry caveats for animal performance issues under extreme circumstances (https://www.nzpbra.org/ accessed on: 11 February 2021). Ryegrass containing AR37 or NEA endophytes have been the principal proprietary perennial or hybrid ryegrasses sold in NZ over the last 10 years. Other companies have followed, either through licensed products or by discovering their own endophytes, although commercial success in the market has been limited. CropMark released U2, an *N*-formylloline-producing meadow fescue endophyte (*Epichloë uncinata*), and it appears that they have tried to aid stabilisation of this in ryegrass through creation of festulolium hybrids. Recently, they have protected a new endophyte, CM142 (NZ PVR website), classed as a novel janthitrem-producing *Epichloë*. DLF (DLF.com) market Edge, an *Epichloë festucae* var. *lolii* that produces high peramine, low lolitrem B and ergovaline (potentially similar to nea2); an *Epichloë coenophiala* called Protek that produces low ergovaline and loline, and is derived from (and for) tall fescue (2017 Australian plant breeders rights); and an *Epichloë siegelii* called Happe, a meadow fescue endophyte from Germany that produces lolines and is suitable for use in some ryegrass, offering protection against porina [34]. Meanwhile, AgResearch extended their species range through the discovery of AR501, a non-ergovaline-producing tall fescue endophyte, which they superseded with AR542 and AR584 and market as MaxP and MaxQ [35] for use in fescue, and Barenbrug (barusa.com) came out with a similar product, E34®. Although a couple of different *Epichloë* species above are being sold in ryegrass (U2 and Happe), it is generally recognised that switching host species for an endophyte is difficult and usually results in gross symbiotic changes that render the relationship unmarketable [13]. Whilst recognising these other host–*Epichloë* relationships, this review focusses on *Epichloë festucae* var. *lolii* in ryegrass.

The use of *Epichloë* endophytes as a 'trait' of the plant is an almost unique characteristic of forage grass plant breeding. Its commercial success is demonstrated by expansion of its first designed use in New Zealand to Australia, South America, and South Africa. Development of *Epichloë* for fescues, for example MaxQ and MaxQII [35], has further extended the use to North American markets. There is also a renewed interest in Europe for the use of endophytes as biological control agents due to increasing constraints on synthetic chemical use. AgResearch has valued the contribution of endophytes to be worth \$200 million NZD (~118M Euro) a year to the NZ economy [36]. The AR37 endophyte patent has been estimated to be worth \$3.6 billion NZD (~2.1B Euro) [37]. Conceptually, using the plant as a solar factory to produce natural biological protectants (via the symbiotic fungus) is an efficient, sustainable way of delivering protection to broad acre pastures. For the broader exploitation of *Epichloë* endophytes in agriculture, and for future perspectives, the reader is referred to recent reviews [13,35,38]. Use is still mainly limited to temperate grasses, but considerable research is aimed at extending the host range, and endophyte species used, to advance further commercial exploitation [39].

Within commercial ryegrass *Epichloë* associations, there is a dichotomy within the industry around the use of ergot alkaloids. Some groups consider ergot alkaloids to be toxic to livestock and do not support their use in current product lines. This position has arisen from fescue toxicosis observations and early trials in NZ with a high ergovaline-producing *Epichloë*, Endosafe™, that did cause health issues [15,38]. Other groups take the position that it is the 'dose that makes the poison'. These groups have put considerable effort into finding products that produce low levels of ergovaline across the pasture, such that it can still confer insect resistance, but intake into the animal is low enough not to cause health issues. A review of the literature [40,41] identified theoretical non-toxic levels but admitted caveats to the research, as historical work mainly used SE endophyte and often failed to note pasture alkaloid concentrations for the actual grass consumed. The review work was followed up by a study of NEA2 endophyte demonstrating that in managed pasture situations intake levels by the animal were unlikely to be above detrimental threshold levels [42]. For a review on the use of ergot alkaloid endophytes in New Zealand pastures, the reader is referred to Caradus et al. [15].

**Table 1.** Lists commercial endophytes, their botanical name, principal host, secondary host in brackets, the PVR approval date and the alkaloid profile, P = peramine, L = lolines, E = ergovaline, Lol = lolitrem, and J = epoxy-janthitrems. \* Only basic endpoint compounds identified as data from PVR reports and manuscripts often fail to detail all compounds tested, sampling protocols, temporal and spatial sampling differences, thus making relativities difficult to assess.




#### **3. Ergot Alkaloid Profiles Produced in** *Epichloë* **Endophytes**

Deployment of endophytes in grasses has primarily focused on expression of five key alkaloids: peramine, loline, lolitrem, janthitrem, and ergovaline. The myriad of associated precursors and derivatives have largely been ignored. Peramine is produced by a single enzymatic step with no intermediates and is known to be non-toxic to mammals [43]. Loline (*N*-formylloline) is produced via 3 enzymatic steps [43] and has several intermediates but, like peramine, is not known to be toxic to mammals [43]. Further, it is produced in endophytes primarily used in *Festuca pratensis* type grasses. The indole diterpenes lolitrems and janthitrems are toxic to mammals and have very complex pathways with many intermediates [44], some of which are also toxic. As such, the 'simple' determination of endpoint lolitrems (lolitrem B) and janthitrems (epoxy janthitrem I–IV) may give misleading information regarding animal health effects, as intermediate and derivative levels may also have biological impact. Likewise, ergovaline is derived from a complex biochemical pathway that has among its precursors ergoline, clavines, ergoamides and ergopeptines [45,46]. The ergoline ring structure, with its similarity to dopamine, serotonin and adrenaline, provides ergot structures the basis to act on the respective receptors as agonists or antagonists, thus producing a multitude of effects depending on the secondary structure. The ergopeptines ergotamine and ergovaline have similar activities in mouse models [47], causing vasoconstriction, an increase in blood pressure, and bradycardia. *Epichloë festucae* var. *lolii* has been studied with respect to its particular ergovaline pathway [48]; this analysis was for a standard toxic strain that follows the published pathway to produce a 'standard' level of ergovaline, although this level is often poorly determined [41].
Research in the ergot pathways has been limited to comparisons between species or across a few strain(s) within a species, which does not necessarily reflect what the profiles in particular variants might be. For example, nea2 produces a low level of ergovaline and lolitrem B [49,50]. Whilst some progress has been made in identifying the gene clusters and biochemistry present or absent in some of these strains [48,51], little work has been performed to understand differences in levels of ergot intermediates or derivatives in particular strains. For example, ergovaline is seen as a key compound detrimental to livestock health, but intermediate compounds can transfer more readily across the rumen and may in themselves cause animal health effects [52]. Finch et al. 2019 [53] studied mammalian toxicity of chanoclavine and demonstrated this to be safe, but this compound is early in the pathway. In a recent study (Barenbrug NZ 2020, unpublished), the non-lolitrem-producing strain nea3 was shown to produce high levels of paxilline and terpendole C, greater than those observed in the SE strain, and under hot, dry summer conditions with 'rank' feed this was sufficient to cause ryegrass staggers, even though no lolitrem B was detectable. Analysis of *E. uncinata* haplotypes for loline content identified similar differences in the three forms of loline (NAL, NANL, and NFL) between different haplotypes [54]. Analysis of six endophytes by Young 2013 looked at presence vs. absence of the *Eas* gene cluster and whether ergot alkaloids were produced or not [55]. The complex nature of ergot production in *Claviceps* [48], between-strain variation of the genes, expression profiles, and ergot bioactivity suggest that a greater level of characterisation is needed than is currently undertaken before a new endophyte strain is advanced for commercial exploitation.

#### **4. Strategic Breeding Challenges with Existing Commercial Endophytes**

Strategically, the requirement for endophyte in a grass breeding programme throws up a conundrum. Does the breeder breed the host to fit the endophyte or breed the best ryegrass and find a compatible endophyte? The first scenario may limit the host grass germplasm to only those genotypes that fit an endophyte, e.g., work of Gagic [56]. The second strategy does not limit the host genotypes but may produce a grass that cannot sufficiently host a suitable endophyte. If several endophyte types are available, then the second strategy is possibly best but if a single endophyte is available then the first strategy is probably advisable. In NZ this has been depicted in the market through the predominant marketing of endophyte AR37, and to a much lesser extent AR1 by PGGW Seeds and Agricom brands (DLF), whilst Barenbrug have marketed NEA, NEA2, NEA4, AR1 and AR37. Most ryegrass breeding pipelines, such as 1/2 sib family selections and population-based family selections, have been developed primarily because of basic ryegrass characteristics (outcrossing, S and Z incompatibility, wind pollination, small male and female organs situated together) and the traits of importance (biomass, persistence, heading date) [57,58]. Research efforts are slowly changing this and developments in genomic selection [58–60], hybrid technology [61], paternity testing [62], self-fertility [63] and biotechnology [64] will likely change the current pipelines. For now though, breeding pipelines generally revolve around 1/2 sib family selection [60] or between- and within-family population breeding approaches [58]. Both these systems essentially identify best families and put together phenotypically similar parent plants to make a new 'synthetic' cultivar. The process of performing this is very different depending on the endophyte strategy taken.

Limiting the host to fit an endophyte is a relatively easy option, and as cycles of the breeding pipeline are completed, more germplasm contains the single endophyte, and subsequent crossing and selection becomes a simple process. Introgression of ecotypes or competitor germplasm may require the elimination of any incumbent endophyte, a relatively simple heat treating of seed and testing of seedlings [65]. Then crossing with the desired endophyte-containing mother line and harvesting from these mother plants will provide introgressed offspring with the endophyte. As this material is further backcrossed checks are required to determine the stability and transmission of the endophyte. Even crosses between two lines that do contain the desired endophyte may create a host with characteristics less conducive to stability and transmission, so testing is always required. Determining compatibility in such a system, using AR37, has been made easier by the development of a genomic estimated breeding value (GEBV) tool for host acceptance of an endophyte [56]. Whilst this is a great advance, care is still required as transmission and stability within a host is not all that needs to be considered. Spatial and temporal growth, and alkaloid profile knowledge is also required, along with potential detrimental effects on the host, e.g., reduced growth. Even with a GEBV for symbiotic potential the strategic decision to use a single endophyte may limit host germplasm, such that some plant characteristics are ultimately compromised.

Breeding the best host and then finding a suitable endophyte has other challenges but does not limit the host germplasm, so plant potential is not compromised. As many breeding lines and ecotypes already contain an endophyte, it is first necessary to understand what endophyte is contained and what this contributes to, or detracts from, the plant phenotype. Initially, this was problematic as many endophytes were unknown and testing was a costly, skilled exercise. A small set of simple sequence repeat (SSR) markers was available to detect endophytes [66], at a cost of ~23 Euro per test. Therefore, breeders often worked with limited knowledge, using immunological tests for endophyte presence or absence, and only confirmed type when necessary. Breeding a high-performing synthetic ryegrass population would often occur and then a subset of plants would be tested to determine endophyte status. Following this, individual plants could be chosen to constrain endophyte status. Historically, at least a couple of early commercial endophyte strains were actually mixtures of two or three endophytes [20,30]. If the correct endophyte was not present, then a sample of seed could be heat cured of undesirable endophyte [65], and seedlings re-inoculated with the endophyte of choice [67]. Following this, further re-characterisation was required, potentially putting the breeding programme back three years or more.

Such 'blind' breeding resulted in the release of a ryegrass cultivar 'TrojanNEA2' containing two endophytes, determined by SSR, and PVR protected as nea2 and nea6. This serendipitous mix provided an effective alkaloid profile and saw TrojanNEA2 become the top selling ryegrass across NZ between 2015 and 2018. Later, using a broader SSR panel, it was shown that the nea6 component was in fact two different endophyte strains [30], with similar alkaloid profiles. So TrojanNEA2 in fact consisted of nea2, nea6 and the variant endophyte [30] (later PVR protected as nea47).

Recently, Kompetitive allele specific PCR (KASP) technology [68] (see molecular testing below) has been applied to typing endophytes [69], which has enabled affordable identification of numerous endophyte strains quickly and easily. The system was developed for quality control determination and is based on sequence information that identifies a unique single nucleotide polymorphism (SNP) for a strain. The test result either concurs with the SNP under test, or identifies the sample as an 'other' strain, or returns a 'below threshold/negative'. If identified as 'other', then this can be further investigated with different strain-specific KASP primers. KASP works very well when breeding material has known commercial strain(s), providing a simple yes/no test for identification. If unknown ecotypes are to be tested, it is also advisable to check with broad SSR panels or multiplex PCR panels such as those developed by Vikuk et al. [70].

The KASP test has transformed breeding the best host because it allows populations to be easily endophyte typed and so parents, F<sub>1</sub> and F<sub>2</sub> individuals can be combined with a knowledge of the endophyte. This also by default utilises two generations of selection pressure for endophyte transmission and survival such that new synthetics are created with some level of selected compatibility. Further, any plants within the synthetic that have the wrong endophyte, or no endophyte, can be eliminated, or used as pollen donors only. If two endophytes are required (as in TrojanNEA2 nea2/6 above), then harvest from equal numbers of mother plants containing nea2 or 6 can favour a balance of seed containing the respective endophytes, which KASP can confirm.

#### **5. Practical Methods for** *Epichloë***-Infected** *L. perenne* **Quality Control**

Endophyte transmission is variable and endophyte viability decreases more quickly than the seed that contains it. Thus, moving from 100% infected nucleus seed to breeder's seed, to G1 and G2 production seed results in a loss of endophyte. If this cumulative loss is >30% then the final seed will have insufficient endophyte to be sold as endophyte-containing seed (NZPBRA industry standard). Additionally, ingress of seed with standard toxic endophyte, or packaging, labelling, handling, and storage errors can all affect the endophyte type and percentage in the final product. New host–endophyte combination(s) also must be checked for alkaloid profile and ultimately effectiveness as an insect deterrent, and impact on animal wellbeing. Practically, this means testing for endophyte is required at all stages of the process. Such QC testing must be simple, robust, cost effective, i.e., fit for purpose. The following section provides a brief outline of common protocols used, including potential caveats, and outlines how they are used in each stage of the seed production process. They are not necessarily the latest, or all, research methods but primarily robust and cost effective.
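The cumulative loss described above can be illustrated with a short calculation. The following is a minimal sketch only: the per-stage transmission rates are hypothetical values chosen for illustration, the function names are not part of any industry tool, and the 70% minimum is simply the complement of the >30% cumulative loss quoted in the text.

```python
from typing import Sequence

# Complement of the ">30% cumulative loss" figure quoted in the text.
MIN_FINAL_INFECTION = 0.70


def final_infection(start: float, stage_transmission: Sequence[float]) -> float:
    """Multiply the starting infection rate by the transmission rate of each stage."""
    rate = start
    for t in stage_transmission:
        if not 0.0 <= t <= 1.0:
            raise ValueError("transmission rates must lie between 0 and 1")
        rate *= t
    return rate


def meets_standard(final_rate: float, minimum: float = MIN_FINAL_INFECTION) -> bool:
    """True if the final seed retains enough endophyte (assumed inclusive boundary)."""
    return final_rate >= minimum


# Hypothetical transmission rates for nucleus -> breeder's -> G1 -> G2 seed.
stages = [0.95, 0.90, 0.90]
rate = final_infection(1.0, stages)
print(f"final infection: {rate:.1%}; sold as endophyte seed: {meets_standard(rate)}")
```

Because the losses multiply, even modest per-stage losses compound quickly, which is why testing at every multiplication stage matters.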

#### *5.1. Endophyte Detection—Microscopy*

Tillers—Briefly, a ~1 cm base section of tiller is secured to a microscope slide using glue tape and the outer leaf blade unrolled so that it is flat against the slide. These leaf blade sections are then stained using a drop of 2% aniline blue and left for ~30 min before being examined under a compound microscope (×40 magnification). Endophyte hyphae stain blue. For a more detailed description, the reader is referred to Bacon and White [71].

Seed—Briefly, ~250 seeds are placed in a test tube and 15.4 M nitric acid added to double the volume of the seeds. This is then heated at 60 °C for approximately 15 min and then the seeds are rinsed under tap water. Individual seeds are then dissected to leave the caryopsis, which is stained as for the tillers above. For more detail, refer to [72].

Caveats—Endophyte is obvious growing alongside the cells of the plant. However, the strain of endophyte cannot be determined and even different endophyte species—for example, *E. occultans, E. uncinata,* or *E. typhina* can easily be mistaken for each other. Low levels of infection may go undetected and, within seeds, it should not be assumed that the endophyte is alive.

#### *5.2. Endophyte Detection—Immunoblot*

Briefly, tillers, cut at the base of the plant, are blotted by pressing sap from the freshly cut base onto a nitrocellulose membrane. Once blotted, the membrane is incubated with blocking solution, rinsed, and then incubated with a primary antibody. Following this, the blot is rinsed with blocking solution and incubated with a secondary antibody. This is rinsed off and a final chromogen mix added to bind to the secondary antibody. Blots containing endophyte protein stain according to the chromogen used. For a detailed protocol, refer to [73,74]. Most companies have their own custom testing regimes, but a kit can also be purchased (Agrinostics, Watkinsville, GA, USA; cat. #ENDO797-3).

Caveats—Immunoblots are not strain specific and can react to different *Epichloë* species. A recent in-house comparison of seed immunoblots with microscopy revealed a small percentage of samples (~2%) were positive on the microscope and negative on the blot or vice versa (Barenbrug NZ, unpublished). Immunoblots are also based on threshold levels of detection and tiller tests use standard 6-week-old tillers; younger tillers can be used but the negative results are less reliable. Additionally, late season tillers with little sap can be difficult to blot. All of this results in semi-quantitative/qualitative data.

#### *5.3. Endophyte Detection—Molecular Testing*

Molecular analysis of plant endophyte status was developed with SSRs [75]. This was rapidly developed into a fingerprinting system for endophyte types based on flanking variation around a microsatellite tandem repeat locus B [66] and soon became a routine test in NZ supplied by AgResearch. Similar systems were developed in the USA for fescue endophytes [76]. Testing requires isolation of good quality DNA from tiller or seed samples, usually via freeze dried or fresh samples, followed by multiplex assays using 3 to 5 primer sets to discern B allele loci. Assays require electrophoretic separation of fragment sizes. Recently, using more complete sequence analysis of different endophyte strains, a catalogue of SNPs has been developed to enable the simple discrimination of known endophytes through KASP [69]. This relies on fluorescent (HEX or FAM) primers competing for a binding site, which matches or mismatches the SNP in question. This process can use crudely isolated DNA and provides an immediate digital result (within 2 h, see Figure 1) via real-time PCR, reducing costs and making it suitable for in-house QC testing.

Caveats—Molecular testing requires quality assessment and threshold setting of the data. Is a negative truly negative, or just below the threshold of detection? This is particularly true if performing bulk analysis, although digital droplet PCR and/or advancements in next-generation sequencing (NGS) techniques may improve this in the future. In addition, neither SSR nor KASP eliminates the risk that a new but related endophyte is wrongly detected. Using NGS techniques, it is becoming increasingly possible to identify small genetic differences even between identical twins [77]; thus molecular differences depend in part on how hard you look for them.
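The three KASP outcomes (strain under test, 'other', or below threshold) and the threshold caveat can be sketched as a simple well-calling rule. This is an illustrative sketch under stated assumptions, not the published assay logic: the FAM dye is assumed to report the strain under test and HEX the alternative allele, and the signal floor and ratio cut-off are hypothetical values that a laboratory would set from its own controls.

```python
from collections import Counter
from typing import Iterable, Tuple


def call_kasp_well(fam: float, hex_: float,
                   min_signal: float = 0.2, min_ratio: float = 2.0) -> str:
    """Classify one well from normalised FAM and HEX fluorescence (hypothetical cut-offs)."""
    if max(fam, hex_) < min_signal:
        return "below threshold"   # may be a true negative or simply undetected
    if fam >= min_ratio * hex_:
        return "target"            # assumed: FAM reports the SNP under test
    if hex_ >= min_ratio * fam:
        return "other"             # assumed: HEX reports the alternative allele
    return "ambiguous"             # both dyes strong, e.g. a possible mixed sample


def summarise_plate(wells: Iterable[Tuple[float, float]]) -> Counter:
    """Count the calls across a plate of (FAM, HEX) readings."""
    return Counter(call_kasp_well(fam, hex_) for fam, hex_ in wells)


# Hypothetical readings for five wells.
plate = [(1.0, 0.1), (0.9, 0.2), (0.1, 0.9), (0.05, 0.08), (0.6, 0.5)]
print(summarise_plate(plate))
```

Keeping 'below threshold' and 'ambiguous' as explicit categories, rather than forcing a yes/no answer, reflects the point that a negative is not necessarily a true negative.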

**Figure 1. Left**: typical immunoblot of tiller sap. Blue staining represents presence of endophyte through the detection of immunoblot antibodies bound to endophyte protein motifs. Clear squares represent ryegrass tillers without endophyte. **Right**: typical KASP assay result—in this case, assaying NEA4. Blue squares represent nea2, orange 'other' (in this case nea3) and black represent below threshold (negative). (Images courtesy of Amanda James.)

#### *5.4. Alkaloid Profiles*

Testing in grass breeding situations is usually limited to analysis of leaf material; traditionally, field- or pot-grown material is cut to ground level. Samples are placed on ice, then frozen, freeze dried, and ground to <1 mm. This material is then analysed via HPLC and LC–MS techniques [78–80]. Recently, a higher-throughput system has been developed for ergovaline, lolitrem B and peramine by Agriculture Victoria, Australia [81]. The authors investigated methanol extraction protocols, comparing multiple separate extractions with two extractions combined and concentrated. Matrix effects and recovery efficiency were also investigated. Analysis was undertaken across three analytical platforms: two LC–MS systems (the QE MS and the QQQ MS) and, for ergovaline, an established HPLC-FLD method. The epoxy-janthitrem alkaloids have been very difficult to analyse due to instability issues. This has recently been overcome [82] but it is still a complex HPLC protocol requiring acetone extraction in the dark to prevent degradation. For full structural elucidation, nuclear magnetic resonance (NMR) assignments for the four epoxyjanthitrem (I–IV) compounds, and a new epoxyjanthitriol, were required. The instability, extraction requirements and specific techniques for detailed analysis highlight the importance of understanding sample preparation right through to a result and the difficulties of comparing data across platforms and between laboratories.

Caveats—Alkaloid analysis requires skilled biochemical knowledge, expensive equipment, and accurate sample preparation. Endophyte alkaloid profiles in a plant are difficult to sample, measure, and interpret, and previous testing methodology and analysis has often raised more questions than it has answered [41]. Cross-laboratory validations are recommended but rarely performed due to costs, and some standards such as epoxy-janthitrems are not readily available. Eady et al. [42] compared Australian and New Zealand laboratories for ergovaline and obtained a 95.5% correlation. Although other simpler techniques to detect alkaloids are being investigated, such as NIR [83], for now, well-equipped laboratories with specific skills are needed for accurate alkaloid detection. These capabilities are expensive and usually result in only essential analysis being performed. For a semiquantitative technique that can be used in house, it is possible to quantify ergot alkaloids with ELISA kits (agrinostics.com/shop/), but nothing similar is available for other alkaloids. Most of the reported alkaloid sample testing in ryegrass states 'cut to ground level', but even this depends upon interpretation of ground level and may vary by several cm (Author's personal observation). Combine this with poor characterisation of the height of the sample, and the knowledge that alkaloid temporal or spatial spread throughout the plant also varies with host, strain and environment, and it becomes almost impossible to conclude anything beyond the individual data set [40,41]. Klotz and Nicol [40] and Nicol and Klotz [41] highlighted this in a review on animal effects caused by ergovaline in ryegrass and concluded that actual knowledge of alkaloid levels in the grass consumed was a much more relevant measure for animal health studies. It seems that the original purpose of alkaloid testing was to understand levels in plants, not levels ingested by grazing animals. Many animal health studies followed the 'cut to ground level' methodology, causing a reduced ability to deduce meaningful knowledge from, or compare, the results [41]. Since that review, Eady et al. [42] estimated animal intake levels against the threshold level deduced by Nicol and Klotz [41]. The data would seem to agree with the threshold. However, more research is required to fully understand the impact on health of different alkaloid levels ingested through 'natural' grazing conditions.

#### *5.5. Livestock Safety Testing*

In New Zealand, evaluating novel endophyte ryegrass combinations for animal safety is undertaken voluntarily under agreed industry protocols developed by the Endophyte Technical Committee (ETC), which is part of the New Zealand Plant Breeding Research Association (https://www.nzpbra.org/ accessed on: 11 February 2021). The guidelines set out requirements for trialing including ethical approval and include scores for ryegrass staggers, heat stress, dags, and liveweight gain. Pasture under test must contain >85% viable endophyte, and chemical profiles to ground level determined. For a trial to be valid, a negative health effect must be observed in SE plots. Data are presented to the ETC and, if approved, a star rating for risk of ryegrass staggers and a comment on animal performance is assigned. Each year, industry rated tables are published to allow comparison of the animal safety parameters (https://www.nzpbra.org/ accessed on: 11 February 2021).
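The validity requirements listed above lend themselves to a simple checklist. The sketch below is illustrative only: the record fields and function name are hypothetical and do not represent the ETC's own forms or software; only the >85% viable endophyte requirement and the need for a negative health effect in SE plots come from the text.

```python
from dataclasses import dataclass
from typing import List

MIN_VIABLE_ENDOPHYTE = 0.85  # pasture under test must exceed this (from the text)


@dataclass
class TrialRecord:
    """Hypothetical summary of one animal safety trial."""
    viable_endophyte: float                 # fraction of viable endophyte in test pasture
    ethics_approved: bool
    ground_level_profile_recorded: bool
    se_plots_showed_negative_effect: bool   # positive control behaved as expected


def validity_problems(trial: TrialRecord) -> List[str]:
    """Return the reasons a trial would fail the stated requirements (empty if none)."""
    problems = []
    if not trial.ethics_approved:
        problems.append("no ethical approval")
    if trial.viable_endophyte <= MIN_VIABLE_ENDOPHYTE:
        problems.append("viable endophyte not above 85%")
    if not trial.ground_level_profile_recorded:
        problems.append("chemical profile to ground level missing")
    if not trial.se_plots_showed_negative_effect:
        problems.append("no negative health effect observed in SE plots")
    return problems


trial = TrialRecord(0.90, True, True, False)
print(validity_problems(trial))
```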

Caveats—Animal trialing is difficult and expensive to undertake, and such trials are under increasing pressure regarding animal welfare concerns. Data from the trials are of limited use for several reasons: (1) chemical profiles to ground level of the plant provide little knowledge as to what the actual alkaloid intake by the animal is; (2) environmental conditions vary year on year and influence alkaloid production, with trials often only run for a 4–8 week period within one year; (3) sheep genetics is not controlled and considerable variation is known to exist in regard to alkaloid tolerance; (4) the range between a valid positive control, ~1.1 ppm [84], and an extreme positive control, > 4 ppm [85], for lolitrem B alkaloid level is large; and (5) large differences in pasture quality may exist. This makes comparisons between trials difficult and may underestimate effects by up to 3.5-fold if conditions are such that only the minimum 1.1 ppm threshold is achieved. Thus, the result is open to manipulation by a skilled trial manager and the true potential of an endophyte to cause harm may be missed. The grazing practices of the trial, i.e., 'worst-case scenario', do not represent typical on-farm practice management and so the relevance of the test is also now under question. In addition, presence of other fungi such as *Claviceps purpurea,* or animal diseases may also cause animal fitness issues (though these should be countered by the control plots).

It may be possible with the extensive data set held by the ETC, other published material and on-going experiments, to calculate a relationship between alkaloid level and animal effect. This would allow trial material to be grown under much more defined conditions (e.g., recommended good management practice, vs. extreme, defined hot summer drought conditions) and animal alkaloid intake levels modelled by chemical analysis of the grass profiles. Such an approach would reduce environmental variables and eliminate or reduce the need for animal trials. This concept is not new, as Nicol and Klotz [41] calculated such an effect using published ergovaline data. However, for industry application, more comprehensive models are required.

#### *5.6. Insect Testing*

As with the animal testing, insect testing in NZ is via industry-agreed protocols developed by the ETC. Six key pest insects affecting NZ pastures have agreed testing protocols. Testing is customised for each insect but generally involves pot and/or field trials against susceptible (no endophyte) plants and control plants, typically SE. Insects under question include Argentine stem weevil (*Listronotus bonariensis*), black beetle (*Heteronychus arator*), grass grub (*Costelytra zealandica*), pasture mealybug (*Balanococcus poae*), porina (*Wiseana* spp.), and root aphid (*Aploneura lentisci*). Other insect pests have also been investigated, and ryegrass *Epichloë* alone have been demonstrated to have activity against over 20 insect species [86]. As needs arise, or commercial advantage is sought, it is likely that additional protocols will be added to the current list of six. Each year, data are submitted to the ETC and industry-approved star rating tables for endophyte insect control are published by the NZPBRA (https://www.nzpbra.org/ accessed on: 11 February 2021).

Caveats—As with livestock testing, insect testing is difficult, with many variables requiring control. Health and life stage of the insect, genetics of the populations, health and maturity stage of the plant, and environmental conditions all impact upon the response. These can cause variability between trials in pots, and in-field trials where infestation, or not, is heavily influenced by geography and season. Trials require entomological expertise, and like many livestock experiments are often contracted out to universities or research organisations at considerable expense.

#### **6. Use of Quality Control Methods within a Ryegrass Seed Company**

#### *6.1. Breeding*

Once a strategy of how to breed has been decided (see Section 4), then a further strategy on what to test and when must be established. Determining the order of importance of endophyte traits and when and where to test for them through the breeding pipeline is a perplexing task. Any new endophyte discovered, serendipitously or via screening of seed banks or collections, is usually initially phylogenetically identified through SSR, KASP, or NGS-based molecular analysis. The new endophyte is then assessed in its natural host and an alkaloid profile deduced (although this is rarely spatially and temporally studied at these early stages). If desirable, transfer to elite host germplasm follows, then a repeated round of alkaloid profiling (1–2 years) is undertaken. Ability of the endophyte to be inoculated into a broad range of host germplasm is also desirable, but this is a resource-dependent process, as most inoculations are only successful at a low frequency (~4–20%). Stability and transmission tend to be studied in parallel to field evaluations that also investigate the effect of the endophyte on host architecture and performance. Research groups such as those at the Noble Research Institute (www.noble.org accessed on: 11 February 2021), AgResearch (www.agresearch.co.nz accessed on: 11 February 2021), and Agribio (www.agribio.com.au accessed on: 11 February 2021) have research teams dedicated to this process and usually work collaboratively with seed companies.

The success rate of this discovery process has declined with time as it becomes increasingly unlikely to find a novel and useful profile within the finite species/strains available [87]. Research, e.g., Kaur et al. [88], has looked at many hundreds of *Epichloë* endophytes; of these, few have proceeded beyond their initial screening. The New Zealand PVR database lists 34 endophytes (www.iponz.govt.nz accessed on: 11 February 2021); of these, 3 have expired, 4 have been withdrawn, 24 granted, and only 3 are currently filed. Application dates and numbers suggest (given discovery is some time before application) a peak discovery before 2012–2014 for ryegrass endophytes (Table 2).

**Table 2.** *Epichloë* PVR applications granted in NZ 1997 to 2020.


Inoculation into elite germplasm requires at least 40 individual ryegrass genotypes to be successfully inoculated, otherwise the allele frequency of the host may shift significantly from the original elite germplasm [89]. An alternative way if inoculation is difficult is to inoculate a few plants and then cross onto them with many different elite pollen donor genotypes. If inoculation is not possible, then backcrossing onto the original host, acting as the mother, can also be undertaken. This method may help by introgression of host genes, required for endophyte stability, into the elite germplasm, but requires at least four rounds of backcrossing to get a genotype similar to the original elite germplasm.
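Two pieces of arithmetic sit behind this paragraph and can be sketched briefly. The first combines the 40-genotype target with the ~4–20% inoculation success frequency mentioned earlier in this section; the second uses the standard expectation that, without selection, each backcross halves the remaining donor genome. The function names are illustrative, and counting the initial cross as generation zero is an assumption about how the 'four rounds' are counted.

```python
def expected_attempts(target_successes: int, success_rate: float) -> float:
    """Mean number of inoculation attempts needed to reach the target number of successes."""
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    return target_successes / success_rate


def recurrent_parent_fraction(backcrosses: int) -> float:
    """Expected share of elite (recurrent parent) genome after n backcrosses.

    The initial cross gives 1/2; every backcross halves the remaining donor share.
    Assumes no selection and ignores linkage drag.
    """
    if backcrosses < 0:
        raise ValueError("backcrosses must be non-negative")
    return 1.0 - 0.5 ** (backcrosses + 1)


# Attempts needed for 40 inoculated genotypes at the low and high ends of the range.
for rate in (0.04, 0.20):
    print(f"success rate {rate:.0%}: about {expected_attempts(40, rate):.0f} attempts")

# Expected elite genome share after the initial cross and each backcross.
for n in range(5):
    print(f"backcross {n}: {recurrent_parent_fraction(n):.2%} elite genome")
```

Under these assumptions, four backcrosses give an expected elite genome share of about 97%, which is consistent with the statement that at least four rounds are needed to approach the original germplasm.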

In mature breeding pipelines regardless of strategy, existing elite germplasm already contains known functional endophyte(s). With such material the process is easier, and by selective harvesting from mother plants endophyte status can be controlled and new synthetics created. Whatever method is used, each stage of the process needs to be monitored by tiller immunoblots to confirm inoculation, and viable transmission. Microscopy or immunoblots of seeds for presence of the endophyte in the seed is also required. Subsamples of plants, or seed need to be checked by molecular analysis to confirm strain and detect any contamination. Information on transmission, stability and effect on host phenotype is gathered, as resource allows, during the breeding of the new 'synthetic' line. Failure of any of these characteristics can prevent the new endophyte/host symbiont from progressing.

#### *6.2. Agronomy*

Once a new ryegrass synthetic line x endophyte strain(s) combination has been chosen for advancement, it requires agronomic assessment. Initially, confirmation of the alkaloid profile, temporally and spatially, is undertaken, along with nationwide trials to understand the host performance. If chosen for advancement, then animal and insect trials, following ETC testing protocols, can be undertaken. For all these trials, seed batches and tiller samples again need to be tested for endophyte presence and subtested for endophyte strain to ensure QC through the process.

#### *6.3. Production*

Seedlings of the new line (putative cultivar) are tested for endophyte using immunoblots, and a subset for molecular profiling, any negatives are removed, and seed is harvested from 100% infected plants. This is then multiplied up by seed production specialists within the companies. Some marketed endophytes are a mixture of two different endophytes, e.g., NEA2. Initially, this was serendipitously multiplied up as a mixture but now at each stage of multiplication the ratio of the strains is monitored through molecular testing. Alternatively mixtures can be made by combining individual host–endophyte combinations [90] at various stages of the multiplication process. For single strains, testing is simpler but still required to ensure that no contamination or mislabelling has occurred. Additionally, because endophyte transmission is rarely 100%, endophyte seed percentage, and tiller viability assessments are required at each stage of multiplication from breeders, to pre-basic, to certified seed generation 1 and 2. The final seed for sale requires an endophyte viability of above 70%, otherwise it has to be sold as a low-endophyte product. Despite the importance of transmission and viability, it still varies through management and climatic conditions. Research in this area [35,74,91–93], and considerable in-house research, has made some advancement but endophyte status in production crops is still a major production risk. In addition, crop treatments such as pesticides (especially fungicides) have to be evaluated for potential negative effects on endophyte status [92]. Loss of endophyte status can require the production manager to rework the cultivar starting with a new 100% infected batch of nucleus plants. For new host–endophyte combinations, specific testing of fungicide applications, plant growth regulators, and seed treatments, on the transmission and viability of the endophyte may be necessary.

#### *6.4. Storage and Sale*

Endophyte viability in the seed declines more quickly than that of the host seed [94–97] and this rate varies depending upon the relationship and the storage environment [98]. Storage at low temperature and humidity is required to maintain endophyte viability and ensure delivery of good quality product to the farmer [99]. Most seed companies involved in endophyte have strict harvest and storage conditions to quickly cool the seed and maintain the low humidity required for endophyte longevity. This includes strict timelines from harvest to cool storage. Within the NZ industry, freshly harvested seed is given a 6 month grace period, whereby machine dressed endophyte level (seed squash or immune test) is accepted for viability. Following that, seed lots must be tested every 6 months, usually via a 6 week tiller grow-out test to ensure viability of the endophyte. This is a logistical problem, as failure to maintain testing can see lines requested for immediate sale having to wait 6 weeks for a test result. Farmers may go elsewhere for product under such a scenario. Thus, efforts to reduce the time required to perform seed viability tests are important (see Section 5.2 above). Sale to farmers is usually via a third-party store and 'just in time' delivery to these stores reduces the likelihood of poor storage between leaving the production company and sale to the farmer. Advice on on-farm seed storage is given, but there is little monitoring, and once the grass is sown in the paddock little or no monitoring is undertaken. Occasionally, if a poor paddock is produced, checks on endophyte status can be made using molecular profiling. This has resulted in the identification of wrong product, or products seemingly mixed with other products (Author, unpublished). Buying from a reputable dealer can help avoid this. Using endophyte status as a QC or product marker is another useful attribute that endophyte can confer to a ryegrass product.
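The testing timeline described above (a 6 month grace period, retesting every 6 months, and a 6 week grow-out) can be sketched as a small scheduling helper. This is an illustrative sketch, not an industry tool: six months is approximated as 182 days, validity is assumed to run from the date of the last test result, and all names are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional

GRACE_PERIOD = timedelta(days=182)    # ~6 months after harvest (approximation)
TEST_VALIDITY = timedelta(days=182)   # ~6 months per viability test (approximation)
GROW_OUT = timedelta(weeks=6)         # duration of the tiller grow-out test


def valid_until(harvest: date, last_test_result: Optional[date] = None) -> date:
    """Last day on which the current viability status is accepted."""
    if last_test_result is None:
        return harvest + GRACE_PERIOD
    return last_test_result + TEST_VALIDITY


def latest_retest_start(harvest: date, last_test_result: Optional[date] = None) -> date:
    """Latest date to start a grow-out so the result arrives before validity lapses."""
    return valid_until(harvest, last_test_result) - GROW_OUT


def is_saleable(today: date, harvest: date, last_test_result: Optional[date] = None) -> bool:
    """True while the seed lot still has an accepted viability status."""
    return today <= valid_until(harvest, last_test_result)


harvest = date(2021, 2, 1)
print("valid until:", valid_until(harvest))
print("start retest by:", latest_retest_start(harvest))
```

Starting each grow-out at least six weeks before the current status lapses avoids the situation in which a line requested for immediate sale has to wait for a test result.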

#### **7. Opportunities and Risks for the Future**

#### *7.1. Host Breeding*

As mentioned above, ryegrass breeding methods are improving; new phenotyping [64] and genotyping techniques [57] are starting to deliver cost effective genomic selection [58,60,100] and these are being applied to improve endophyte stability and transmission [56]. Addition of hybrid breeding [61], self-fertility [101], along with gene editing technologies [102] will likely reduce variation between genotypes within a host germplasm (cultivar or F1 hybrid). This reduced G × G variation would be expected to result in a more consistent endophyte profile within the product. Still, factors affecting the symbioses are many and complex. As an example, Rinklake et al. [14] recently proposed that the process of vernalisation may be an important transmission factor. Endophyte has a different cardinal temperature to ryegrass and the range of cold required for ryegrass vernalisation may affect endophyte growth and triggers for establishment of reproductive tillers differentially. This could lead to reduced endophyte entering the newly formed tiller, where, through intercalary growth [103], it eventually enters the developing inflorescences [103]. Reduced initial colonisation of the shoot apical meristem may therefore reduce the number of inflorescences infected. Knowing what traits are important in stability and transmission is key to future breeding efforts. In a completely different breeding approach, Spangenberg (European Patent Office EP2521442A1) has proposed growing and selecting populations of ryegrass containing multiple different endophyte strains in vivo and simply selecting for the best plants to create a host population containing a plurality of endophytes [104].

#### *7.2. Endophyte Discovery and Manipulation*

Discovery and exploitation have become increasingly sophisticated with the use of NGS [12] and high-throughput biochemical analysis techniques [81]. There is no doubt that discovery programmes will continue but, as stated, this will require smarter or greater efforts to discover new, useful endophytes. Increasing ease of certain techniques has widened the scope for who can look, and existing collections may still hold interesting endophytes. A recent NZ PVR for CM142 demonstrates this, as it was found in an existing collection from the Margot Forde Germplasm Centre [105]. Use of molecular QC analysis is now routine for many companies and provides opportunities to detect 'unknown' endophytes. This expands discovery capability that was previously limited to a few research laboratories around the world. Aside from discovery, it is possible to screen endophyte/host cultivar populations and identify individuals that express different levels of alkaloid. This can be used to select for a particular profile, such as lower ergovaline levels. Ryegrass cultivar Reward containing Endo5 endophyte is an example of such a product, being a subselection of AR5, itself a subselection of Endosafe™, with evidence for this shown by their interconnectedness in the paper by Thom et al. [20].

Genetic modification (GM) and/or gene editing techniques can be used to manipulate the asexual endophyte. With no sexual recombination, breeders are reliant on somatic mutations within the endophyte. Evolutionarily, hybridisations have arisen, but attempts to achieve this in vitro have been unsuccessful to date, and use of mutagens invokes Muller's ratchet, making accumulation of deleterious mutations far more likely than a specific beneficial one [106]. GM research has been demonstrated in *Epichloë* for a considerable time [107,108], as has some site-specific mutagenesis using NHEJ [109], and recently the use of CRISPR/Cas9 gene editing has been demonstrated [110]. Due to regulatory hurdles, GM research is limited to laboratory research activities, but gene editing, having been deregulated in many countries, offers up some unique opportunities. Precise editing of toxin profiles is an obvious target, and repair of a perA gene within a LPTG-3 janthitrem-producing endophyte, as currently achieved through GM [107], is a potential candidate. If GM becomes acceptable, then adding more complex multigene alkaloid pathways to make customised profiles might be possible.

#### *7.3. Deployment Systems*

Beyond genetic manipulation of endophyte and host, through conventional breeding, mutagenesis or biotechnology, there is also the opportunity to improve endophyte functionality in the pasture through creative deployment systems. One method is simply to dilute the intake by the animal through growing multiple species within the pasture. This is currently a topical approach, as regenerative agriculture (RA) proponents advocate multiple-species pastures. However, NZ soils already have high, stable carbon reserves [111], thus negating a major proposed benefit of RA. Additionally, ryegrass has an annual biomass, ME, and intake preference advantage in much of NZ [112], so RA would likely dilute production gains achieved by current proprietary ryegrass/endophyte pastures. If diverse species and a more 'natural' habitat is the goal of RA, then perhaps sustainable intensification and set-aside land with native species may be a better approach. Diluting alkaloid levels without losing production could be achieved by blending with a non-endophyte-containing ryegrass, and such products are currently sold as LE (low-endophyte) ryegrass in NZ. However, under high insect pressure the endophyte-free grass might be preferentially predated, leaving the high-alkaloid ryegrass. In reality, it seems that if this occurs, then weeds fill the space and maintain the alkaloid dilution [42]. However, weeds are defined by their poor feed value or growth, so animal performance is again potentially lost. A second and more sophisticated strategy would be to design novel endophyte combinations within the sward. NEA2 contains the nea2 strain, a low ergovaline producer, and the nea6 strain, a moderate ergovaline producer, resulting in a sward with an ergovaline level sufficient to confer good insect protection but low enough to be unlikely to cause animal health issues under normal grazing conditions [42].
Alternatively, a farmer can purchase two different endophyte-containing cultivars, one with ergovaline and peramine and another producing janthitrem. This could provide a broad spectrum of excellent insect protection properties and reduce the individual alkaloid levels within the pasture. As janthitrem and ergovaline have different effects on livestock, neurological (janthitrem) vs. physiological (ergovaline), the dilution helps prevent either reaching a threshold that will cause animal health issues. Designer mixtures within the best-performing cultivar have been proposed [90] and offer the additional opportunity of maintaining optimal pasture yield. Currently in NZ, nitrogen leaching issues are encouraging farmers to reduce synthetic fertiliser application and increase the clover and herb (plantain or chicory) percentage within the pasture. This science-based approach results in a diverse pasture blend (ryegrass, clover, herb, endophyte, rhizobia) with effective alkaloid levels, whilst maintaining a productive and sustainable pasture.

#### *7.4. Risks*

The success of ergovaline- and non-ergovaline-producing endophytes in New Zealand ryegrass is well documented, and without endophytes insect pests can devastate pastures, especially in the North Island. However, this situation is unique, and uptake and deployment in Australia, South America, Africa, Europe and North America has been much lower. Fescue toxicosis, ryegrass staggers and ergot poisoning are key contributors behind this restraint and represent real issues, with historically severe outbreaks reported in many countries [113] that have led to numerous animal deaths. In addition, many other regions of the world are not suited to, or reliant on, the almost exclusive temperate ryegrass growth that is practiced in New Zealand. Different grass species require different endophytes, different insects require different alkaloid deterrents, greater reliance on other feed detracts from the value of the endophyte, and annual rather than perennial production and other factors all change the commercial equation that is made in estimating the cost-to-benefit ratio, and hence the risk, of using endophyte. Climate change will alter insect pests within regions of the globe and what pasture species are grown within specific regions. Such changes are both risk and opportunity for endophyte-based alkaloid deployment, but given the long lead-in time (6 to 16 years) to incorporate new endophytes into ryegrass cultivars, these decisions also represent a risk to the breeding companies concerned.

#### **8. Conclusions**

Plant breeding of ryegrass would be much simpler without endophytes. Most breeding data management software has issues coping with a maternally inherited trait that can be lost, added, or interchanged with a different 'trait'. For the breeder, following the trait, and all the QC required around this, is an enormous drain on resources. Having to re-test the performance of a novel host germplasm with a new endophyte can take three years from re-inoculation. An inability to routinely produce commercial quantities of seed with sufficient (>70%) viable endophyte can see a huge loss in margin for that crop and disrupt supply, leading to many marketing issues. Likewise, loss of endophyte viability in storage is a logistical challenge requiring constant re-evaluation and re-prioritising of stock for sale. Contamination with wild-type endophyte, or another proprietary endophyte, for example through growing on contaminated land, can see a crop ruined and, if unknowingly on-sold, can cost millions in reparation costs. This recently happened with an Australian tall fescue cultivar, Barnaby [114]. Sale of the wrong endophyte for consumption by deer or horses, which are more sensitive to alkaloids, can also result in large lawsuits or reparation costs.

Despite the catalogue of negatives, *Epichloë* endophytes have stood the test of time and proved to be a vital component of NZ pasture systems. The obvious natural, sustainable protection that *Epichloë* alkaloids offer to pasture is too valuable to ignore. Farming systems continue to come under ever-increasing pressure to 'be natural', and many synthetic pesticides are being removed from the market. This provides unique opportunities that will no doubt see the use of endophytes within and beyond pasture systems expand in the future. In addition, the new knowledge of host–endophyte associations, along with better breeding capability for the associations, will see improved production and deployment. Finally, the deregulation (in many parts of the world) and use of gene editing, and other biotechnologies, bring an exciting capability to customise alkaloid profiles and/or other attributes of the *Epichloë* endophyte to potentially provide new sustainable solutions for the agriculture industry.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** The author wishes to acknowledge Barenbrug for the time to undertake this MS, Courtney Inch, Will Clayton and Graham Kerr for review of the MS and the NZ pasture industry for immersion into the practical world of endophytes.

**Conflicts of Interest:** The author is employed by Barenbrug NZ Ltd a company that benefits from the use of endophytes.

#### **References**


## *Article* **Non-Transgenic CRISPR-Mediated Knockout of Entire Ergot Alkaloid Gene Clusters in Slow-Growing Asexual Polyploid Fungi**

**Simona Florea <sup>1</sup>, Jolanta Jaromczyk <sup>2</sup> and Christopher L. Schardl <sup>1,</sup>\***


**\*** Correspondence: schardl@uky.edu; Tel.: +1-859-218-0730

**Abstract:** The *Epichloë* species of fungi include seed-borne symbionts (endophytes) of cool-season grasses that enhance plant fitness, although some also produce alkaloids that are toxic to livestock. Selected or mutated toxin-free endophytes can be introduced into forage cultivars for improved livestock performance. Long-read genome sequencing revealed clusters of ergot alkaloid biosynthesis (*EAS*) genes in *Epichloë coenophiala* strain e19 from tall fescue (*Lolium arundinaceum*) and *Epichloë hybrida* Lp1 from perennial ryegrass (*Lolium perenne*). The two homeologous clusters in *E. coenophiala*, a triploid hybrid species, were 196 kb (*EAS*1) and 75 kb (*EAS*2), and the *E. hybrida EAS* cluster was 83 kb. As a CRISPR-based approach to target these clusters, the fungi were transformed with ribonucleoprotein (RNP) complexes of modified Cas9 nuclease (Cas9-2NLS) and pairs of single guide RNAs (sgRNAs), plus a transiently selected plasmid. In *E. coenophiala*, the procedure generated deletions of *EAS*1 and *EAS*2 separately, as well as both clusters simultaneously. The technique also gave deletions of the *EAS* cluster in *E. hybrida* and of individual alkaloid biosynthesis genes (*dmaW* and *lolC*) that had previously proved difficult to delete in *E. coenophiala*. Thus, this facile CRISPR RNP approach readily generates non-transgenic endophytes without toxin genes for use in research and forage cultivar improvement.

**Keywords:** CRISPR/Cas9; non-transgenic engineered fungi; genome editing; genome sequencing; MinION; nanopore; secondary metabolites

**Key Contribution:** A CRISPR-based approach was used to generate non-transgenic deletion mutants of *Epichloë* species that are seed-borne fungal symbionts (endophytes) of important forage grasses. Targeted deletions of several genes and genome regions were demonstrated, including the simultaneous deletion of two large gene clusters.

#### **1. Introduction**

A few species of filamentous fungi have been genetic models of choice since the 1950s due to their haploid growth stage, facile sexual cycles, abundant sporulation, rapid growth and, with time, large repertoires of mutants and molecular transformation systems. However, given the importance of fungi in medicine, agriculture, and ecosystems, considerable efforts have been invested over several decades to establish molecular transformation and targeted mutation systems for a much broader range of species. These include the *Epichloë* species (family Clavicipitaceae, order Hypocreales), which are systemic, constitutive, and often seed-transmitted symbionts (endophytes) of cool-season grasses (Poaceae, subfam. Poöideae), and which are capable of producing a panoply of bioprotective alkaloids [1–3]. However, features of important *Epichloë* species that present special difficulties for genetic experimentation include growth rates far slower than model fungi, sparse sporulation, limited availability of selectable markers, and for many, diploid or triploid genomes and the lack of a sexual cycle [2,4].

**Citation:** Florea, S.; Jaromczyk, J.; Schardl, C.L. Non-Transgenic CRISPR-Mediated Knockout of Entire Ergot Alkaloid Gene Clusters in Slow-Growing Asexual Polyploid Fungi. *Toxins* **2021**, *13*, 153. https://doi.org/10.3390/toxins13020153

Received: 30 October 2020 Accepted: 21 November 2020 Published: 16 February 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

jolanta.jaromczyk@uky.edu

The development of CRISPR technologies has opened the door to facile gene inactivation, removal, or even replacement for a wide range of organisms, including filamentous fungi [5,6]. Initially, the application of CRISPR in eukaryotes involved transformation or transfection with a gene construct expressing the Cas9 double-strand DNase and another construct to transcribe guide and tracrRNAs to direct the activity of Cas9 to the target sites. In *Aspergillus* species, Nødvig et al. [7] developed a system based on a single plasmid harboring a chimeric RNA guide and the *cas*9 gene under fungal promoters, along with a marker gene required for fungal selection. More recently, Cas9-sgRNA (single guide and tracrRNA) ribonucleoprotein complexes (RNPs) have been employed in a wide range of fungi [5,6].

In an example of the RNP approach, targeted mutations have been introduced into the genome of the fungus *Pyricularia oryzae* [8], which is among the most important fungal pathogens globally impacting cereal grain production. The fungus was transformed simultaneously with the RNPs to mutate the target gene and others to generate mutant genes that are positively selectable. This strategy is based on the presumption that those nuclei in which the genes are converted to their selectable forms are also those most likely to have taken up the RNP and consequently mutate the target gene as well.

We use *Epichloë* spp. as exemplars of particularly difficult but important fungi for targeted genetic modification. *Epichloë coenophiala* is widespread as a seed-borne endophyte of the highly popular pasture, forage, and turf grass, *Lolium arundinaceum* (= *Schedonorus arundinaceus = Festuca arundinacea*; tall fescue), although its existence was unsuspected in the first decades of widespread propagation of the grass during the mid-20th century. The fungus provides important benefits that translate to enhanced stand longevity and productivity and improved tolerance of drought and other stresses [9,10], and it is capable of producing up to four different classes of alkaloids that protect the grass hosts against invertebrates [2,11]. Unfortunately, the strains of *E. coenophiala* that have been unwittingly co-propagated with tall fescue, and which remain dominant in much of the cool-season pasturelands, produce ergovaline, which is an ergot alkaloid of the highly toxic ergopeptine type [12–14]. Levels of ergovaline tend to be very low, but they are often sufficient to at least cause reproductive problems and reduce livestock health and productivity. For the same reason, cultivars of *Lolium perenne* (perennial ryegrass) with *Epichloë hybrida* [15] strain Lp1 were pulled from the market in 1992 after it was determined that they had toxic levels of ergovaline [16,17]. A study including deletion of the dimethylallyltryptophan synthase gene (*dmaW*) from *E. hybrida* Lp1 and its subsequent complementation with the ortholog from *Claviceps fusiformis* [18] has demonstrated that *dmaW* is essential for ergot alkaloid biosynthesis [19]. There is a potential to develop and deploy such genetically altered strains of *Epichloë* species in forage cultivars because the fungus can be cultured, manipulated, and reintroduced to produce new, stable symbioses with forage and pasture cultivars of tall fescue. 
However, it is desirable and perhaps essential that such modifications should not involve integration of any foreign gene in the genome, which is a requirement that makes the CRISPR RNP approach especially attractive.

The endophyte *E. coenophiala* is a particularly difficult system for genetic inquiry because of its slow growth [20] and the fact that it is a triploid interspecific hybrid [21]. Two of its three ancestors were ergovaline producers, so *E. coenophiala* has two homeologous copies of the ergot alkaloid biosynthesis (*EAS*) gene clusters [2]. An effort to delete the key gene *dmaW*2 by marker exchange mutagenesis with a hygromycin B-resistance gene (loxP-flanked *hph*) was a particular *tour de force* in which over 1500 transformants were screened and the frequency of homologous gene replacement was 0.2% [22]. Once a ∆*dmaW*2 mutant was obtained and reintroduced into host plants, there was essentially no effect on ergovaline production because of the presence of its homeologue *dmaW*1. Furthermore, although the selectable marker was readily removed from the ∆*dmaW*2 mutant by transformation with a plasmid for the transient expression of Cre recombinase [22], the resulting marker-free mutants consistently had lost the ability to establish stable symbiosis with tall fescue (unpublished results).

Natural mutants of *Epichloë* species with deleted or inactivated alkaloid biosynthesis genes consistently have inactivated the entire set of genes for downstream steps [23–25]. Whether or not this relates to the host incompatibility of the aforementioned marker-free ∆*dmaW*2 mutants, the nature of these natural variants suggests that there is selection against expression of enzymes that, due to loss of upstream genes, no longer have access to their normal substrates. Therefore, we consider the most prudent approach to generating ergot alkaloid-negative mutants to be the deletion of both *EAS* clusters in their entirety.

After genome sequencing revealed the subterminal location of the *EAS*1 cluster in *E. coenophiala* isolate e19, we devised a new approach to replace that cluster with a telomere-repeat array [24]. Since the homeologous cluster, *EAS*2, has an inactivating mutation in a late-pathway gene, *lpsB*2, the resulting "*EAS*1-knockoff" mutant produced only two early products of the ergot alkaloid pathway, chanoclavine and ergotryptamine. The technique employed transient expression of antibiotic resistance conferred by an *hph* gene positioned in the vector to be lost subsequently by breakage of the integrated DNA at the introduced telomere-repeat array. The success of this approach suggested that transient antibiotic selection could be used in other approaches for mutation. In this study, this strategy is applied to CRISPR-based deletion of both *EAS* clusters as well as individual genes.

Both for research and for the practical aim of completely eliminating production of all ergot alkaloids from an agriculturally important grass symbiont, we have chosen to adapt a Cas9-sgRNA RNP approach to entirely eliminate both the 196-kb *EAS*1 cluster and the 75-kb *EAS*2 cluster. Here, we describe the success of that effort and follow-up experiments to demonstrate the facile nature of our approach and its broader applicability, opening the door to a wide range of non-transgenic manipulations of even slow-growing, asexual, polyploid fungi.

#### **2. Results**

#### *2.1. Assembly of E. coenophiala and E. hybrida Genome Sequences Including Nanopore Data*

The genome of *E*. *coenophiala* e19 wild-type strain was previously sequenced by a combination of pyrosequencing (Roche) and Sanger sequencing of fosmid-cloned ends [24]. Due to its triploid hybrid nature, the genome is complex and has been difficult to assemble, especially across repetitive regions. To assess the potential of Oxford Nanopore technology to improve the genome data and assembly, the *E*. *coenophiala* e19 genome was sequenced using the portable DNA sequencer MinION. The MaSuRCA v. 3.4.1 de novo assembly was manually curated to give 216 scaffolds with the contig sizes varying from 1915 to 12,291,650 bp with an N50 of 1,403,312 bp (Supplementary Table S1; GenBank accession number JAFEMN000000000). The estimated genome size was 104.2 Mb. The 196.2 kb complete sequence of the *EAS*1 cluster was identified on a 676 kb scaffold that ended with a telomere repeat array, and the 75.2 kb *EAS*2 cluster was identified on a 2.6 Mb scaffold sequence where it was flanked by housekeeping genes. Both clusters had the 11 known ergot alkaloid biosynthesis genes similarly arranged and oriented, and the difference in their sizes was due to lengths of AT-rich noncoding regions consisting mainly of repeats.

The *E. hybrida* Lp1 genome was sequenced using several sequencing platforms (Supplementary Figure S1) and the assembled data generated 158 scaffolds with the contig size varying from 9259 to 8,313,425 bp, with an N50 of 2,103,505 bp, and estimated total genome size of 79.9 Mb sequence (GenBank accession number JAFEKR000000000). The 83.2-kb *EAS* cluster was located on a 6.7 Mb scaffold.
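The N50 values quoted for both assemblies summarise contiguity: N50 is the sequence length at which sequences of that length or longer account for at least half of the assembled bases. The following is a minimal sketch of that calculation, not the pipeline used in this study; the function name `n50` and the toy lengths are hypothetical illustrations.

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("no lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (not the e19 or Lp1 data).
toy_lengths = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_lengths)
print(f"N50 of toy assembly: {toy_n50} bp")
```

Because N50 depends only on the sorted lengths, it can rise through manual curation or scaffolding without any change in the total assembly size.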

#### *2.2. Deletion of Ergot Alkaloid Biosynthesis Gene Clusters from the E. coenophiala Genome*

After *E. coenophiala* e19 protoplasts were treated simultaneously with the *EAS*2-directed RNPs with specific sgRNAs (Figure 1 and Table 1) and the plasmid pKAES329 with a fungal-active *hph* gene, 115 hygromycin B-resistant colonies were recovered and subsequently single-spore isolated on plates without selection. DNA was extracted and subjected to a series of PCR screens (Figure 2 and Supplementary Figure S1) with primers designed for identification of the expected deletion mutants (Table 2). The first screen with primers specific for *easA*1 and *easA*2 indicated 31 colonies lacking only *easA*1, 14 lacking only *easA*2, and five lacking both *easA*1 and *easA*2. As a further check if both *EAS*1 and *EAS*2 clusters were absent in the last five transformants, they were subjected to a second PCR screen with primers targeting a common region in the two *dmaW* homeologs, and all five tested negative for both (Figure 2 and Supplementary Figure S1).
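The logic of the first PCR screen can be sketched as a simple classification on the presence or absence of the two *easA* amplicons. This is an illustration only: the function names are hypothetical, and the count of colonies retaining both genes (65) is an assumption derived by subtraction, on the premise that all 115 recovered colonies were screened.

```python
from collections import Counter


def classify_colony(has_easA1, has_easA2):
    """Assign a colony to a deletion class from two PCR results."""
    if has_easA1 and has_easA2:
        return "both present"
    if not has_easA1 and has_easA2:
        return "lacks easA1 only"
    if has_easA1 and not has_easA2:
        return "lacks easA2 only"
    return "lacks both"


def tally(pcr_results):
    """Count colonies per class; pcr_results holds (has_easA1, has_easA2) pairs."""
    return Counter(classify_colony(a1, a2) for a1, a2 in pcr_results)


# Counts from the text; the 65 'both present' colonies are an assumption
# (115 recovered minus 31, 14 and 5), not a figure stated in the paper.
pcr_results = (
    [(False, True)] * 31
    + [(True, False)] * 14
    + [(False, False)] * 5
    + [(True, True)] * 65
)
counts = tally(pcr_results)
for label, n in counts.most_common():
    print(f"{label}: {n}")
```

A negative *easA* PCR only indicates that the amplicon is missing, which is why the text follows up with the *dmaW* and oligotag screens before attributing a loss to Cas9.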

**Figure 1.** Maps of genes and gene clusters in *E. coenophiala* e19 indicating target locations and sequences of the single guide RNAs (sgRNAs) that direct cleavage by modified Cas9 nuclease. On each map, the AGG (underlined) protospacer adjacent motif (PAM) required for Cas9 nuclease to generate a double-strand break at the target site is 3′-adjacent to the 20-nucleotide target DNA sequence, and the trans-activating CRISPR RNA (tracrRNA) segment is a 67 nt sequence that interacts with Cas9. Black box-arrows indicate *EAS* or *LOL* genes, which are labeled with the full name or an abbreviation with the last letter of the gene name or with *B* for *cloA*. Hash marks on the *EAS* maps indicate long stretches of noncoding sequences. (**a**) Ergot alkaloid biosynthesis gene cluster *EAS*2 with the *dmaW*2 region magnified. The sgRNA sequences for *EAS*2 deletion match genomic sequences outside but near the 3′ end of the *lpsB*2 pseudogene and within but near the 3′ end of *lpsA*2. The sgRNA sequences for *dmaW*2 deletion match flanking intergenic regions. (**b**) Ergot alkaloid biosynthesis gene cluster *EAS*1. The sgRNA guides are the same as those for *EAS*2, and the sequence flanking *lpsB*1 has a single mismatch to the sgRNA (asterisk). (**c**) Loline alkaloid biosynthesis gene cluster *LOL* with *lolC* magnified.


**Table 1.** sgRNA guides.

<sup>1</sup> Asterisks indicate 2′-O-methyl analogs with 3′ phosphorothioate linkages.

**Figure 2.** Graphic summary of the screening strategy and results for ergot alkaloid biosynthesis (*EAS*)-cluster deletions in *E. coenophiala* e19. Cylinders represent mutants identified as *EAS*1 knockoffs by chromosome-end deletions [24] (gray), Cas9-mediated *EAS*1 deletions (green; ∆*EAS*1), and Cas9-mediated *EAS*2 deletions (yellow; ∆*EAS*2). Cylinders with two colors represent losses of both *EAS*1 and *EAS*2 gene clusters in the same mutants.



Since the pKAES329 plasmid used to transiently select transformants carried a truncated fragment of *lpsA*1, and there was a single-base mismatch of the EAS2lpsBguide to the *EAS*1 locus target site (Figure 1), we expected that the loss of the *EAS*1 cluster in some mutants might be due to a homologous recombination of the *lpsA*1 sequence followed by chromosome-end knockoff, as was previously accomplished using the same plasmid [24], rather than by Cas9-mediated deletion. To check this possibility, 31 colonies that were negative for *easA*1 and another five that were negative for both *easA*1 and *easA*2 were tested by PCR with primers targeting a 223 bp fragment spanning from the oligotag into the truncated *lpsA*1 gene on pKAES329, which was a sequence that was expected to be present in such knockoff mutants [24] but to be absent from mutants induced only by CRISPR. The result was negative for six of the mutants that lacked only *easA*1 and two that lacked both *easA* genes, suggesting that the loss of *EAS*1 in those mutants was due to Cas9-catalyzed cleavage. The other mutants, which were positive for the oligotag, were not investigated further. The CRISPR-mediated cluster deletions were designated ∆*EAS*1, ∆*EAS*2 and ∆*EAS*1 ∆*EAS*2 (Figure 2).

In addition to generating strains completely lacking ergot alkaloid genes, the goal was to avoid integration of any foreign gene in the genome of the modified strains. A PCR test for *hph* identified two ∆*EAS*1 ∆*EAS*2 mutants (designated e7801 and e7802) and one ∆*EAS*2 mutant (designated e7803) that were marker-free (Figure 2 and Supplementary Figure S1).

#### *2.3. Deletion of dmaW2*

*Epichloë coenophiala* e19 is a triploid interspecific hybrid with orthologous *EAS*1 and *EAS*2 gene clusters that can largely complement each other's ergot alkaloid-biosynthesis gene mutants [22,24]. We previously eliminated the telomere-linked *EAS*1 cluster to generate strain e7480 [24]. This "knockoff" mutant retained the *EAS*2 cluster with most functional ergot alkaloid biosynthesis genes including *dmaW*2, which encodes the enzyme for the first determinant step in the pathway [19], and which has proven exceptionally difficult to eliminate by marker-exchange homologous recombination [22]. Therefore, *dmaW*2 in e7480 was targeted to test if the use of CRISPR RNP technology might provide more efficient editing of the locus. Protoplasts of e7480 were simultaneously treated with the RNP mixture and pKAES329; then, they were regenerated on medium with hygromycin B to obtain a total of 318 selected colonies. Then, these were propagated without selection, and their DNA was screened by PCR for *dmaW*2 to identify 54 putative ∆*dmaW*2 mutants (Supplementary Figure S1). Screening those for *hph* indicated 50 out of the 54 putative mutants that also lacked the selection marker, which could have occurred either because the marker was expressed without plasmid integration or because of recapitulation of the chromosome-end knockoff [24] whereby the plasmid integrated at the *lpsA*1 site and was subsequently lost by breakage within the introduced telomere-repeat array. Positive controls were PCR screens with primers specific for *easA*2 and the oligotag and, as expected, all 50 putative marker-free mutants tested positive for both. Moreover, since the *EAS*1 cluster is missing in the e7480 genome, the PCR test for *easA*1 was negative for all the samples. Two of the ∆*dmaW*2 mutants were designated e7799 and e7800 (Table 3).

**Table 3.** Mutants confirmed by genome sequencing.


#### *2.4. Deletion of the EAS Cluster in E. hybrida Lp1*

*Epichloë hybrida* is a diploid interspecific hybrid that inherited, from its *Epichloë festucae* ancestor, an *EAS* cluster [15] similar to *EAS*2 of *E. coenophiala*. The sgRNAs designed for deletion of the *EAS* clusters in e19 were used in an attempt to delete the *EAS* cluster in *E. hybrida* Lp1. The transformation plasmid providing transient hygromycin B resistance was pKAES328, which has the fungal-active gene *hph* but differs from pKAES329 in lacking an *lpsA*1 fragment [24]. Following the same PCR procedures as mentioned above for *E. coenophiala* (Supplementary Figure S1), 16 *E. hybrida* colonies were screened, out of which one (designated e7806) was a putative marker-free ∆*EAS* mutant (Table 3).

#### *2.5. Deletion of lolC in E. coenophiala e19*

Genomic analysis of *E. coenophiala* e19 indicated a single cluster, designated *LOL*, with the loline alkaloid biosynthesis genes. The *lolC* gene has been suggested to encode the enzyme for the first step in the loline alkaloid pathway and its role has been tested previously by RNA interference (RNAi) in *Epichloë uncinata,* resulting in significantly reduced loline-alkaloid production [26]. To further evaluate the general utility of the CRISPR RNP technology in *E. coenophiala,* sgRNAs were designed and used for the deletion of *lolC*. Following protoplast treatment with the RNP mixture followed by initial selection with hygromycin B, a total of 185 colonies were screened with *lolC*-specific primers to identify 11 that tested negative for the gene (Supplementary Figure S1). These putative ∆*lolC* mutants were further screened with primers for *hph,* identifying two marker-free mutants designated e7804 and e7805 (Table 3).
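The screening yields reported in Sections 2.3 and 2.5 can be set against the 0.2% homologous replacement frequency cited in the Introduction for the earlier marker-exchange attempt on *dmaW*2 [22]. The sketch below is only illustrative arithmetic on those published counts; the function and variable names are hypothetical, and the two kinds of experiment used different selection schemes, so the fold values are a rough indication rather than a controlled comparison.

```python
def percent(hits, screened):
    """Percentage of screened colonies scored as putative deletions."""
    if screened <= 0:
        raise ValueError("screened must be positive")
    return 100.0 * hits / screened


# Counts from the text: (putative deletion mutants, colonies screened).
screens = {
    "dmaW2 in e7480": (54, 318),
    "lolC in e19": (11, 185),
}

# Frequency reported for the earlier marker-exchange deletion of dmaW2 [22].
MARKER_EXCHANGE_PERCENT = 0.2

rates = {target: percent(hits, n) for target, (hits, n) in screens.items()}
for target, rate in rates.items():
    fold = rate / MARKER_EXCHANGE_PERCENT
    print(f"{target}: {rate:.1f}% of colonies, about {fold:.0f}-fold the marker-exchange rate")
```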

#### *2.6. Genome Analysis of the CRISPR-Derived Mutants*

A de novo genome sequence assembly was performed for each deletion mutant in Table 3. The dataset size (in bases), number of reads (average length 130 bp), genome coverage, and assembly quality metrics for each are presented in Supplementary Table S2. In every case, the sequences were consistent with the PCR results regarding the gene losses due to Cas9 nuclease, absence of the marker gene, and absence of the oligotag sequence. Moreover, to check if any plasmid sequences were present in the deletion mutants, the reads were mapped against the pKAES328 and pKAES329. None of the mutants had sequences from those plasmids.
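The check for residual plasmid sequence was done here by read mapping. As a much simpler stand-in for that step, and explicitly not the method used in this study, the sketch below flags any read that shares an exact k-mer with a reference sequence on either strand. All sequences are hypothetical toy strings, and the function names and the default `k` are illustrative choices.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Set of all k-mers in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reads_sharing_kmers(reads, reference, k=21):
    """Return the reads that share at least one k-mer with either strand
    of the reference. Exact matching only, so mismatches are not tolerated."""
    ref_kmers = kmers(reference, k) | kmers(reverse_complement(reference), k)
    return [read for read in reads if kmers(read, k) & ref_kmers]


# Hypothetical toy data: a short 'plasmid' and three reads.
plasmid = "ATGAAAAAGCCTGAACTCACCGCGACG"
reads = [
    "GGGGAAAAGCCTGAACGGGG",                 # contains a forward-strand match
    "TTTTTTTTTTTTTTTTTTTT",                 # unrelated
    reverse_complement("CCTGAACTCACCGC"),   # matches the reverse strand
]
hits = reads_sharing_kmers(reads, plasmid, k=8)
print(f"{len(hits)} of {len(reads)} reads share a k-mer with the plasmid")
```

An empty result from such a screen would be consistent with the absence of plasmid sequence, although a dedicated read aligner is better suited to real data because it tolerates sequencing errors.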

Interestingly, almost all of the gene deletions resulted from the cleavage and flawless rejoining of the two flanking ends with no other sequence changes (Figure 3). Positions of the cleavage by Cas9 in the e7801 and e7802 ∆*EAS*1 ∆*EAS*2 mutants varied between the strains but also between the two clusters in the same strain. For instance, in e7802, the cleavage at the *EAS*1 cluster near the *lpsB*1 locus was inferred to be 3-bp upstream of the PAM site, even though the position had a mismatch between the sgRNA and *EAS*1 target site, and 4-bp upstream of the PAM site in *EAS*2 where the sgRNA was an exact match to the target sequence. Furthermore, the cleavage near *lpsA*1 was 4-bp upstream of the PAM site but 3-bp upstream of the PAM site near *lpsA*2. Among the six sequenced mutants with 15 Cas9-mediated cleavage sites in all, 11 cleavages were 3-bp from the PAM site and four (27%) were 4-bp from the PAM site.
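The cleavage geometry described above can be made concrete with a small model: locate a 20-nt target followed by an NGG PAM, place the cut 3 bp (or 4 bp) 5′ of the PAM, and join the two outer ends. This is a sketch under simplifying assumptions, namely that both targets lie on the same strand and that repair is a clean end-joining. The sequences are hypothetical toy strings rather than the guides in Table 1, and the function names are illustrative.

```python
def cut_index(seq, protospacer, offset=3):
    """Index in seq where Cas9 is modelled to cut: `offset` bp 5' of the PAM.

    The protospacer must be followed immediately by an NGG PAM.
    """
    start = seq.find(protospacer)
    if start == -1:
        raise ValueError("protospacer not found")
    pam_start = start + len(protospacer)
    pam = seq[pam_start:pam_start + 3]
    if len(pam) < 3 or pam[1:] != "GG":
        raise ValueError("no NGG PAM 3' of the protospacer")
    return pam_start - offset


def delete_between(seq, guide_a, guide_b, offset_a=3, offset_b=3):
    """Remove the segment between two modelled cuts and join the ends."""
    cut_a = cut_index(seq, guide_a, offset_a)
    cut_b = cut_index(seq, guide_b, offset_b)
    left, right = sorted((cut_a, cut_b))
    return seq[:left] + seq[right:]


# Hypothetical toy locus: flank + target A + PAM + 'cluster' + target B + PAM + flank.
guide_a = "ACGTACGTACGTACGTACGT"
guide_b = "GATTACAGATTACAGATTAC"
locus = "TTTT" + guide_a + "AGG" + "CCCCCCCCCC" + guide_b + "AGG" + "AAAA"

deleted = delete_between(locus, guide_a, guide_b)
print(f"{len(locus) - len(deleted)} bp removed; junction product: {deleted}")

# Fraction of 4-bp cuts among the 15 sequenced cleavage sites reported above.
four_bp_fraction = 4 / 15
print(f"4-bp cleavages: {four_bp_fraction:.0%}")
```

Changing `offset_a` or `offset_b` to 4 shifts the junction by one base pair, which is the kind of difference observed between the e7801 and e7802 junctions.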

The changes at both *EAS* clusters in e7801 differed from those in e7802 in several interesting ways. In addition to the different cleavage sites mentioned above, e7801 (but not e7802) had undergone a reciprocal recombination event between the two *EAS* clusters, which was likely triggered by the Cas9-induced double-strand breaks in the orthologous positions of the two *lpsA* genes. Furthermore, in e7801, the cleaved end of the *EAS*1 cluster was linked to a new telomere repeat array, from which it was separated by only a single base pair. In contrast, both the *EAS*1 and the *EAS*2 clusters of e7802 were deleted with the flanking sequences joined.

**Figure 3.** CRISPR-induced deletions in *E. coenophiala* e19 as deduced from genome sequencing. Targets of the sgRNAs (Table 1) are indicated, adjacent PAM sites (AGG in all cases) are underlined, and cleavage sites are assumed to have been 3 bp or 4 bp 5′ of the PAM sites. Colons (:) indicate joined ends. (**a**) Deletions and recombination at *EAS* gene loci. Filled circles indicate telomere repeat arrays of (CCCTAA)n near *EAS*1, and an asterisk (\*) indicates a mismatch between EAS2lpsBguide sgRNA and its target near the *EAS*1 gene cluster. In the ∆*EAS*1 locus of e7801 a single base-pair addition is indicated (†) just inside the new telomere (italic text and filled circle). In e7801, single-nucleotide polymorphisms within the recombined segments labeled *lpsA*1/2 and *lpsA*2/1 are indicated as capital letters. (**b**) CRISPR-mediated deletion of the *dmaW*2 gene for the first step in ergot alkaloid biosynthesis. (**c**) CRISPR-mediated deletion of the *lolC* gene for the presumed first step in loline alkaloid biosynthesis.

A BLAST (basic local alignment search tool) search against the *E. hybrida* e7806 mutant genome with sequences for the *EAS* cluster and *hph* gene indicated their absence, which was consistent with the previous PCR results. The Cas9-mediated cleavage occurred 3 bp upstream from the PAM site near *lpsB* and 4 bp upstream from the PAM site in *lpsA*. The junction generated by joining the cleaved ends was identical to the junction in the deleted *EAS*2 locus of mutant e7801 shown in Figure 3.

The *E. coenophiala* mutants e7999 and e7800 had lost *dmaW*2 as indicated by the PCR screen and validated by their sequenced genomes. Since the cleavages induced by Cas9 occurred 3 bp upstream of the PAM sequence for the dmaW24KOf and dmaW28KOr sgRNA target sites, both ∆*dmaW*2 mutants had identical junctions with no indels (Figure 3).

The genome sequences of *E. coenophiala* ∆*lolC* mutants e7804 and e7805 also confirmed that no other foreign DNA was integrated into the genome. When comparing the *lolC* deletion site in the two mutants, it appeared that Cas9 had generated cleavages 3 bp upstream from the PAM site at all sites except for the lolCr1 sgRNA target site of e7804, for which the cleavage occurred 4 bp upstream of the PAM site. The rejoining of the ends was perfect without introduction of indels or any sequence changes (Figure 3).


#### *2.7. Inoculation Efficiency*

Endophyte infection determined by tissue-print immunoblot indicated that all CRISPR-mediated deletion mutants established symbiotic relationships with the plant at infection rates similar to or higher than those typically seen for wild-type strain inoculations (Table 4).

**Table 4.** Establishment of symbiosis of mutant endophytes with host plants.


#### **3. Discussion**

We have demonstrated that CRISPR/Cas9 technology can be applied to polyploid *Epichloë* species for the precise removal of individual genes or gene clusters, and even the simultaneous removal of a pair of large clusters (196 kb and 75 kb). Specifically, the method utilized RNPs (consisting of sgRNAs and Cas9 protein that was translationally fused with two NLSs), which were introduced into the fungus by cotransformation with a transiently selected antibiotic resistance gene. There are two obvious advantages to this approach. One is that more than one gene or genome region can be deleted in a single procedure, and the other is that the procedure leaves no selectable marker or transgenes in the genome. The absence of the selectable marker in the final product allows for its reuse to eliminate additional genes or to reintroduce genes for complementation analysis, and it also addresses regulatory and public concerns about the use of transgenic organisms in applied research and agriculture.

We previously constructed a plasmid (pKAES329) to transiently integrate a selectable antibiotic-resistance gene (*hph*) at a chromosome end [24], and here, we have combined that approach with the RNP approach and found it to be successful and facile. However, we also questioned whether a transient integration of the marker was even necessary. We show here that marker integration is unnecessary by demonstrating the simultaneous elimination of two long gene clusters (*EAS*1 and *EAS*2) by treatment with the appropriate RNP mixtures together with a plasmid that, without integrating into the genome, provided expression of the selectable marker.

In this study, we did not test whether even transient marker selection was needed. However, we previously used the transformation of *Epichloë* species without selection to delete a loxP-flanked *hph* gene by transient expression of the Cre-recombinase gene [22]. Although successful, the screen was tedious and time-consuming, resulting in a 0.5–2.1% frequency of Cre-mediated deletions among the unselected colonies. In contrast, here, we report that plasmids mixed with the RNPs provided for temporary selection of a limited number of initially antibiotic-resistant transformants. From those, mutants with the target deletions were readily identified by PCR screens, and a high proportion were marker free (18–100% depending on the experiment). Therefore, although integration of the plasmid-borne selectable marker was not required, our results suggest that its inclusion in the transformation mixture aided in selection of the RNP-transformed isolates, including those with the desired deletions.

In targeting the *EAS*1 gene cluster, a mismatch 3 bp upstream of the PAM site between the EAS2lpsBguide and its target genomic sequence near *lpsB*1 did not prevent cleavage at this position. The mismatch at a single-nucleotide polymorphism between the sites near *EAS*1 and *EAS*2 (the sgRNA sequence was based on the latter) was in the "seed" region but not part of the "core" region for sgRNA-directed Cas9 cleavage [27]. Since deletion of the 196-kb *EAS*1 region was achieved with cleavage at the mismatch position, a precise match to the target was evidently not required.

Two additional genome alterations were observed in one of the ∆*EAS*1 ∆*EAS*2 mutants (e7801). One was the introduction of a telomere at the cut site near *EAS*1. Interestingly, that cut was repaired very simply with the addition of a single base pair followed by a telomere array. The other alteration in the same mutant was a recombination between the remnants of *lpsA*1 and *lpsA*2, presumably moving the genes near *lpsA*2 from genomic locations far from a telomere to close proximity to the newly generated telomere. Whether that alteration affects the expression of those genes may be an interesting topic for future inquiry.

Our results demonstrate the high efficiency of the CRISPR/Cas9 technology while not indicating any size limit for genome segments that can be efficiently and precisely deleted. The frequency of *dmaW*2 deletion was 17%, compared to only 0.2% previously obtained for marker exchange mutagenesis of the same gene in the same *E. coenophiala* strain [22]. In addition, the frequency of the 75-kb *EAS*2 deletion (5.2%) was not dramatically less than that of the 2.6-kb *dmaW*2 deletion (17%) and was very close to that of the 1.9-kb *lolC* deletion (5.9%). Conceivably, size limits might be imposed by boundaries of chromatin states, so it is worth considering whether or not the clean deletions of large genome regions are more likely when those genes are part of a coordinately regulated cluster.

Though often characterized as "error-prone", non-homologous end joining (NHEJ) frequently occurs without error. A study in mouse cell lines reported 70% precise NHEJ events after Cas9-mediated cleavage [28]. Many target sites also exhibited approximately 10% 1-bp and 2-bp templated insertions. Insertions of 1 bp following Cas9 cleavage and NHEJ have also been described in *Saccharomyces cerevisiae*, with 74–100% apparently being templated, implying that in those cases, Cas9 cleavage left a staggered (5′-overhanging) end that was filled in by DNA polymerase 4 [29]. In our study, all of the eight sequenced junctions appeared to be precisely joined, but four of those had one or the other end cleaved 4 bp from the PAM site. In those four cases, the result was similar to the common 1-bp templated insertions observed by Lemos et al. [29], so it is reasonable to speculate that they also resulted from staggered cleavage and end-repair.
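One way to picture this speculation is a toy model of a single target site, sketched below under simplifying assumptions: the site is written as the top strand (protospacer followed by the PAM), a blunt cut falls 3 bp upstream of the PAM, and a staggered cut leaves 1-nt 5′ overhangs that are filled in before joining. The function name and the example sequence are hypothetical. In this model, the PAM-proximal end of a filled-in staggered cut retains four bases upstream of the PAM, which would be indistinguishable from a blunt cut 4 bp from the PAM when that end is joined to a distant flank.

```python
def cas9_ends(site, pam_len=3, staggered=False):
    """Split a top-strand target (protospacer followed by PAM) into repaired ends.

    Returns (pam_distal_end, pam_proximal_end) after cleavage and, for a
    staggered cut, fill-in of the 1-nt 5' overhangs.
    """
    blunt_cut = len(site) - pam_len - 3  # index of the first base 3' of a blunt cut
    if not staggered:
        return site[:blunt_cut], site[blunt_cut:]
    # Staggered model: the non-target strand is nicked 4 nt from the PAM, so
    # after fill-in both ends carry the base at position -4 (index blunt_cut - 1).
    return site[:blunt_cut], site[blunt_cut - 1:]


site = "ACGTACGTACGTACGTACGT" + "AGG"  # hypothetical 20-nt protospacer + AGG PAM

blunt_left, blunt_right = cas9_ends(site)
stag_left, stag_right = cas9_ends(site, staggered=True)

print(len(blunt_right) - 3)  # bases retained upstream of the PAM after a blunt cut
print(len(stag_right) - 3)   # bases retained upstream of the PAM after fill-in
# Rejoining both filled-in ends of the same site duplicates one base (templated insertion).
print(stag_left + stag_right == site[:17] + site[16] + site[17:])
```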

As commercial sources have recently made Cas9-NLS fusion proteins and synthetic sgRNAs available, a variety of approaches that include Cas9-sgRNA RNPs have been employed in fungi. Most involve the integration of a stable selectable marker [30] or simultaneous generation of a selectable mutation [8]. Khan et al. [31] report using the Cas9 sgRNA RNP system without selection to target the *TOX3* effector gene in *Parastagonospora nodorum*, with all of the six analyzed "transformants" exhibiting mutations at the repair site. This result contrasts with ours in that we found no mutations other than the deletions resulting from rejoining of the cleaved ends and, in approximately half of the junctions, an apparent 1-bp templated insertion. Thus, applications of this technology in different fungal systems may give substantially different outcomes.

#### **4. Conclusions**

We report five findings with regard to the application of Cas9-sgRNA RNP technology for deleting DNA segments in *Epichloë* species. First, transient selection for an antibiotic resistance gene included in the transformation mix allowed for the efficient identification of deletion mutants. Second, the deletions were precisely repaired by NHEJ (sometimes with templated 1-bp insertions). Third, even large segments up to 196 kb in our tests were efficiently deleted. Fourth, simultaneous deletions of two large DNA segments with entire biosynthetic gene clusters were obtained. Fifth, the deletion mutants retained their compatibility as grass symbionts. These results highlight the range of non-transgenic manipulations of even slow-growing, asexual, polyploid fungi that can be undertaken with this CRISPR RNP approach. The application to *Epichloë* species provides for tailored genotypes to use in turf and forage, for example to generate elite lines of livestock-friendly cultivars.

#### **5. Materials and Methods**

#### *5.1. Biological Materials*

The wild-type *Epichloë coenophiala* strain e19 [= American Type Culture Collection (ATCC) 90664] from tall fescue (*Lolium arundinaceum*) cv. Kentucky 31 [32], and the *E. coenophiala EAS*1-knockoff strain e7480 (=ATCC PTA-126679) [24] were maintained in tall fescue elite breeding line KYFA0601 [24]. The wild-type *Epichloë hybrida* Lp1 (=ATCC TSD-66) strain from perennial ryegrass (*Lolium perenne*) [33] was maintained in perennial ryegrass cv. Rosalin [19]. The fungi were isolated from these symbiotic plants and grown as described in Florea et al. [34].

#### *5.2. Miscellaneous Molecular Methods*

Plasmid DNA was isolated from bacterial cultures by use of the ZR Plasmid Miniprep-Classic kit (Zymo Research, Irvine, CA, USA). Fungal DNA was isolated from fresh mycelium by use of the DNeasy 96 Plant Kit (Qiagen, Valencia, CA, USA) and a Geno/Grinder 2000 (SPEX CertiPrep, Metuchen, NJ, USA) or by use of the ZR Fungal/Bacterial DNA MiniPrep kit (Zymo Research). The DNA quality and concentrations were assessed by NanoDrop and Qubit (ThermoFisher Scientific, Waltham, MA, USA). Plasmids pKAES328 and pKAES329, with a fungal-active hygromycin B-resistance gene (*hph*), are described in Florea et al. [24].

#### *5.3. Nanopore Single-Molecule Sequencing of Genomic DNA*

Genomes of wild-type *E. coenophiala* e19 and *E. hybrida* Lp1 were sequenced by a hybrid approach including Illumina, pyrosequencing, and single-molecule sequencing. For Oxford Nanopore (Oxford, UK) single-molecule sequencing, the fungi were grown on potato dextrose agar (PDA) (BD, Franklin Lakes, NJ, USA) plates topped with cellophane for ease of removal. High molecular weight (>40 kb) DNA was extracted from young mycelium by the method of Al-Samarrai and Schmid [35] and recovered by spooling on glass rods. The Ligation Sequencing Kit 1D (SQK-LSK108) was used for library preparation. For each library, the end-repair and dA-tailing step was performed with the NEBNext End repair/dA-tailing kit (New England Biolabs, Ipswich, MA, USA) by incubating 100 µL Ultra II End-prep reaction buffer containing 3 µg of high molecular weight DNA and 6 µL of Ultra II End-prep enzyme mix for 20 min at 20 °C, then for 20 min at 65 °C. Then, the end-repaired DNA was mixed with 120 µL AMPure XP beads (Beckman Coulter, Pasadena, CA, USA) and incubated 5 min at room temperature. The beads were pelleted on a magnetic rack and the supernatant was discarded; then, they were washed twice with 200 µL of 70% ethanol and allowed to dry. The beads were resuspended in 31 µL of nuclease-free water and incubated for 10 min at room temperature; then, they were pelleted on the magnetic rack until the solution became clear. Then, the DNA solution was transferred into a new 1.5 mL Eppendorf DNA LoBind tube, and the DNA concentration was determined using the Qubit. The adapter ligation reaction was performed with 50 µL NEB Blunt/TA Ligase Master Mix and 20 µL adapter mix added to 30 µL containing 2 µg of the End-prep-treated DNA and incubated for 10 min at room temperature. Then, the adapter-ligated DNA was added to 60 µL AMPure XP beads and incubated for 5 min at room temperature; then, it was pelleted on a magnetic rack, and the supernatant was discarded. The beads were resuspended in 500 µL adapter bead binding buffer (ABB), pelleted again on the magnetic rack for the removal of the ABB buffer, allowed to dry, and then resuspended in 15 µL elution buffer. After 10 min incubation at room temperature, the beads were pelleted once more on the magnetic rack to recover the DNA library in the supernatant.

Oxford Nanopore FLO-MIN106 flowcells were primed with a mix consisting of 480 µL RBF (running buffer with fuel mix) and 520 µL nuclease-free water by loading 800 µL of the priming mix into the priming port and incubating at room temperature for 5 min. The flowcell priming was completed by lifting the SpotON port cover and then loading the remaining 200 µL of priming mix through the priming port. Then, the 75 µL DNA library loading mix consisting of 12 µL DNA library mixed with 35 µL of RBF, 2.5 µL nuclease-free water, and 25.5 µL LLB (library loading beads, EXP-LLB001) was loaded on the flowcell via the SpotON port. Sequence runs were implemented with MinKNOW software version 3.1.0.30 with the NC\_48Hr\_e19\_exp\_Run\_FLO-MIN106\_SQK-LSK108 and NC\_48Hr\_Lp1\_exp\_Run\_FLO-MIN106\_SQK-LSK108 sequencing scripts, and with the live base calling turned off. The MinION FAST5 output files were processed using Guppy basecalling software ver. 2.2.2 supporting GPU processing with NVIDIA-SMI driver version 418.67 and CUDA version 10.1. Guppy was run as a Singularity container on the University of Kentucky HPC system.
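The priming and loading volumes stated above can be cross-checked with simple bookkeeping. The sketch below only restates the published volumes; the variable names are illustrative.

```python
# Volumes in µL, taken from the flowcell priming and loading description.
priming_mix_ul = {"RBF": 480, "nuclease-free water": 520}
priming_loads_ul = [800, 200]  # first load, then the remainder after opening the SpotON port
library_mix_ul = {
    "DNA library": 12,
    "RBF": 35,
    "nuclease-free water": 2.5,
    "LLB": 25.5,
}

priming_total = sum(priming_mix_ul.values())
library_total = sum(library_mix_ul.values())

print(f"priming mix prepared: {priming_total} µL, loaded: {sum(priming_loads_ul)} µL")
print(f"library loading mix: {library_total} µL")
```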

#### *5.4. Genome Sequence Assembly*

Oxford Nanopore and Illumina HiSeq data (described above) were combined with Roche 454 GS FLX+ (454 Life Sciences, Branford, CT, USA) and paired-end Illumina MiSeq data described previously [24] in genome assemblies using MaSuRCA ver. 3.4.1 [36], which is a de novo assembler that combines the benefits of de Bruijn graph and overlap-layout-consensus assembly approaches [37]. For Oxford Nanopore data, only the longest reads providing 25-fold genome coverage were included.
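The read-selection rule in the last sentence can be sketched as follows. This is an illustrative reading of "the longest reads providing 25-fold genome coverage", not the authors' script: reads are taken in order of decreasing length until their summed bases reach the target coverage. The function name and the toy numbers are assumptions.

```python
def select_longest_reads(read_lengths, genome_size, target_coverage=25):
    """Return the longest read lengths whose summed bases first reach the target coverage."""
    needed_bases = genome_size * target_coverage
    selected, total = [], 0
    for length in sorted(read_lengths, reverse=True):
        if total >= needed_bases:
            break
        selected.append(length)
        total += length
    return selected


# Toy example: a 4-base "genome" needs 100 bases for 25-fold coverage.
toy_lengths = [5, 50, 20, 40, 10, 30]
chosen = select_longest_reads(toy_lengths, genome_size=4, target_coverage=25)
print(chosen, sum(chosen))
```

In practice the lengths would come from the basecalled FASTQ files, and the genome size would be an estimate from a prior assembly.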

#### *5.5. Illumina HiSeq Sequencing of Genomic DNA*

DNA libraries were prepared by use of the Nextera DNA library prep and index kits per the manufacturer's reference guide (Epicentre Biotechnologies, Madison, WI, USA). Sequencing was carried out by Novogene Corporation Inc. (Sacramento, CA, USA) on the HiSeq platform with 2 × 150-cycle paired-end reads with 11 barcoded samples multiplexed on one lane (Illumina, San Diego, CA, USA). The data were evaluated for quality using FastQC v. 0.11.9 (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ (accessed on 20 October 2020)). Based on the FastQC results, the reads were trimmed, and the remnants of the adapters were removed using Trimmomatic v. 0.39 [38]. Duplicate reads were removed using Prinseq-lite v. 0.20.4 [39]. Assembly was performed with MaSuRCA v. 3.4.1.

#### *5.6. Design of sgRNA Molecules and Assembly of RNP Complexes*

The sgRNAs were designed using the Benchling platform (https://www.benchling.com/ (accessed on 20 October 2020)), having the genome of *Claviceps purpurea* 20.1 set as reference in the guide parameter design. All the sgRNAs used in this study were designed upstream of an AGG protospacer adjacent motif (PAM) recognition site (Table 1). Cas9 nuclease typically cleaves the DNA 3 bp upstream of the PAM site [40]. The sgRNAs chosen to delete *EAS* clusters were exact matches to target sequences near the ends of the *EAS*2 cluster. One (EAS2lpsBguide) was 163 bp downstream of *lpsB*2, and the other (EAS2lpsAguide) was 718 bp upstream of the stop codon of *lpsA*2. The guide sequence in EAS2lpsAguide was also an exact match to the homologous sequence in *lpsA*1, but EAS2lpsBguide had a single mismatch located 3 bp from the PAM site at the sequence near *lpsB*1 (Figure 1).
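The geometry of a guide target relative to its PAM can be illustrated with a short sketch. This is not the Benchling design algorithm; it simply lists 20-nt protospacers that sit immediately upstream of an AGG PAM on the given strand and reports the predicted cut position 3 bp upstream of the PAM. The function name and the example sequence are hypothetical, and the reverse strand is not scanned.

```python
def find_agg_targets(seq, guide_len=20, pam="AGG", cut_offset=3):
    """List (protospacer, cut_index) for each PAM found on the given strand.

    cut_index is the 0-based index of the first base 3' of the predicted cut,
    i.e. the cut falls between cut_index - 1 and cut_index.
    """
    seq = seq.upper()
    hits = []
    start = seq.find(pam, guide_len)  # a full protospacer must fit upstream
    while start != -1:
        protospacer = seq[start - guide_len:start]
        hits.append((protospacer, start - cut_offset))
        start = seq.find(pam, start + 1)
    return hits


example = "ACGT" * 5 + "AGG" + "TTTT"  # hypothetical sequence with one AGG PAM
hits = find_agg_targets(example)
print(hits)
```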

The *dmaW*2 sgRNAs were designed to recognize target sequences 431 bp upstream (dmaW28KOr) and 1055 bp downstream (dmaW24KOf) of the *dmaW*2 coding region and were an exact match to the genomic sequence at the *dmaW*2 locus. The *lolC* sgRNAs were designed to target sequences 140 bp upstream (lolCf4) of the start codon and 15 bp downstream (lolCr1) of the stop codon, and they were an exact match to the genomic sequence near the *lolC* gene (Figure 1). The *Streptococcus pyogenes* Cas9 nuclease fused with two nuclear localization signal motifs (Cas9-2NLS) and the sgRNAs were purchased from Synthego Corp. (Redwood City, CA, USA). To form RNP complexes, 180 pmol of each sgRNA was incubated 10 min at room temperature with 20 pmol Cas9-2NLS nuclease and nuclease-free water in 15 µL volume. For transformation, the RNP complexes were paired and mixed with the plasmid DNA, as indicated in Table 1.

#### *5.7. Fungal Transformation and Selection*

*Epichloë coenophiala* e19, its derivative e7480 [24], and *E. hybrida* Lp1 were transformed with the plasmid–RNP mixtures by a modification of previously described methods [24,41] as follows: Fungus was grown in potato dextrose broth (PDB) (BD, Franklin Lakes, NJ, USA) in an incubator shaker for 5–10 days at 22 °C and 200 rpm. The culture was transferred into 50 mL conical tubes and harvested by centrifugation at 5525× *g* for 20 min at 4 °C. The mycelium was resuspended in 20 mL osmotic medium (1.2 M MgSO4, 10 mM NaHPO4) containing an enzyme mixture consisting of 5 mg/mL Vinoflow FCE (Novozymes, Franklinton, NC, USA), 5 mg/mL lysing enzymes from *Trichoderma harzianum*, 5 mg/mL Driselase from Basidiomycetes sp., and 3 mg/mL bovine serum albumin (all from Sigma-Aldrich, St. Louis, MO, USA); then, it was incubated for 3 h on a rocker shaker at 30 °C. The remaining mycelial mass was removed from the protoplast suspension by passage through autoclaved Miracloth; the filtrate was then transferred into 30 mL Corex tubes (10 mL of suspension in each) and gently overlaid with 10 mL of ST solution (0.6 M sorbitol, 0.1 M Tris-Cl pH 7.4). To isolate the protoplasts, the tubes were centrifuged at 3329× *g* for 20 min. The protoplasts were recovered from the interface of the two solutions with a pipette and transferred into a tube containing 5 mL of STC solution (1 M sorbitol, 50 mM Tris-Cl pH 7.4, 50 mM CaCl2). The protoplasts in STC were pelleted in a centrifuge at 3329× *g* for 10 min. The supernatant was discarded, and the protoplast pellet was gently resuspended in 5 mL STC and pelleted again, and the supernatant was discarded. Finally, the protoplasts were gently resuspended in a small volume of STC, counted by microscopy with a hemocytometer, and diluted in STC so that 100 µL solution would contain at least 5 × 10<sup>6</sup> protoplasts.
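The final counting and dilution step can be sketched as a short calculation. The counts, dilution factor, and stock volume below are hypothetical; the conversion assumes a standard hemocytometer in which each large square holds 0.1 µL, so the mean count per large square is multiplied by 10<sup>4</sup> to give protoplasts per mL.

```python
def protoplasts_per_ml(mean_count_per_large_square, dilution_factor=1):
    """Convert a hemocytometer count to protoplasts per mL (0.1 µL per large square)."""
    return mean_count_per_large_square * 1e4 * dilution_factor


def max_final_volume_ul(stock_per_ml, stock_volume_ul, target_per_100ul=5e6):
    """Largest final volume (µL) that still gives the target density per 100 µL."""
    total_protoplasts = stock_per_ml * stock_volume_ul / 1000
    return total_protoplasts / target_per_100ul * 100


# Hypothetical example: 120 protoplasts per large square in a 100-fold diluted sample,
# with 500 µL of concentrated stock in STC.
density = protoplasts_per_ml(120, dilution_factor=100)
final_volume = max_final_volume_ul(density, stock_volume_ul=500)
print(f"stock density: {density:.2e} per mL")
print(f"dilute to at most {final_volume:.0f} µL (add up to {final_volume - 500:.0f} µL STC)")
```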

Protoplasts were transformed using a modification of the polyethylene glycol (PEG) method as described by Panaccione et al. [41]. The transformation mix was prepared by combining 15 µL of each preassembled RNP complex with 7–10 µg of *Mlu*I-linearized plasmid DNA that was previously incubated 30 min at room temperature with 10 µg of Lipofectin Transfection Reagent (ThermoFisher Scientific, Waltham, MA, USA). The PEG-amendment solution was prepared by mixing two parts of 60% (*w*/*v*) PEG 3350 with one part amendments (1.8 M KCl, 150 mM CaCl2, 150 mM Tris-HCl pH 7.4). The transformation was done in a sterile borosilicate tube by gently mixing 100 µL of protoplast suspension with 50 µL PEG-amendment solution and the DNA mix, and it was incubated on ice for 30 min. Then, 1 mL of PEG-amendment solution was added to the content of the tube and mixed by flicking the tube several times; then, it was incubated for 20 min at room temperature.
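The composition of the PEG-amendment solution follows from the 2:1 mixing ratio. The sketch below works out the resulting concentrations from the stock values given above; it assumes volumes are additive, and the dictionary layout is only illustrative.

```python
peg_parts, amendment_parts = 2, 1
total_parts = peg_parts + amendment_parts

# (stock concentration, parts contributed) for each component
stocks = {
    "PEG 3350 (% w/v)": (60, peg_parts),
    "KCl (M)": (1.8, amendment_parts),
    "CaCl2 (mM)": (150, amendment_parts),
    "Tris-HCl (mM)": (150, amendment_parts),
}

final = {name: conc * parts / total_parts for name, (conc, parts) in stocks.items()}

for name, value in final.items():
    print(f"{name}: {value:g}")
```

Under that assumption the mixed solution contains 40% (*w*/*v*) PEG 3350, 0.6 M KCl, 50 mM CaCl2, and 50 mM Tris-HCl.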

The treated protoplasts were plated on complete regeneration medium (CRM) [41], which contained per liter 304 g of sucrose, 1 g of KH2PO4, 1 g of NaCl, 0.46 g of MgSO4·7H2O, 0.13 g of CaCl2·2H2O, 1 g of NH4NO3, 1 g of yeast extract, 12 g of PDB powder, 1 g of peptone, 1 g of casein (acid hydrolysate), and 7 g of agarose (Sigma-Aldrich). Plates of 20 mL CRM bottom-agarose contained hygromycin B at concentrations calculated to give a final 50 µg/mL for *E. coenophiala* or 200 µg/mL for *E. hybrida*. Each aliquot (250 µL) of treated protoplast suspension was added to 7 mL of CRM prepared with low-melting SeaPlaque Agarose (Lonza, Mapleton, IL, USA) and kept molten at 50 °C; then, it was immediately poured and distributed onto the surface of a CRM plate. The plates were incubated at 21 °C for 3–4 weeks, and then, the fungal colonies were transferred to PDA without hygromycin B for sporulation and single-spore isolation.
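The phrase "calculated to give a final" concentration implies that the bottom layer carries more antibiotic than the target value. A minimal sketch of that calculation follows, assuming (this is not stated in the text) that only the bottom layer contains hygromycin B and that it equilibrates evenly through the bottom layer, the 7 mL overlay, and the 250 µL protoplast aliquot. The function name is hypothetical.

```python
def bottom_layer_concentration(final_ug_per_ml, bottom_ml=20.0, top_ml=7.0, inoculum_ml=0.25):
    """Hygromycin B concentration needed in the bottom layer for a given final concentration."""
    total_ml = bottom_ml + top_ml + inoculum_ml
    return final_ug_per_ml * total_ml / bottom_ml


for species, final in (("E. coenophiala", 50), ("E. hybrida", 200)):
    needed = bottom_layer_concentration(final)
    print(f"{species}: {needed} µg/mL in the bottom layer for a final {final} µg/mL")
```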

#### *5.8. Screening and Analysis of Deletion Mutants*

Putative transformants were subjected to three rounds of single-conidiospore isolation on PDA without selection; then, they were grown and maintained on PDA plates. DNA was extracted by use of the DNeasy 96 Plant Kit (Qiagen, Valencia, CA, USA). PCR primers were purchased from Integrated DNA Technologies (Coralville, IA, USA) and are listed in Table 2. PCR was performed in 25 µL reaction mixtures with 5–10 ng DNA template, 200 µM each dNTP, 0.2 µM each primer, 2.5 units AmpliTaq Gold, and AmpliTaq Gold PCR buffer with MgCl2 at 1.5 mM final concentration (Applied Biosystems, Foster City, CA, USA). Thermocycler conditions were 9 min at 94 °C, 35 cycles of 94 °C for 30 s, annealing at 59 °C for 30 s, and 72 °C for 1 min, and then a final 7-min incubation at 72 °C.

Selected isolates were subjected to Illumina HiSeq DNA sequencing (see above) to check for the results of deletions and nonhomologous end joining at the target sites. The reads were also screened for sequences from transformation plasmids pKAES328 and pKAES329 by using Deconseq ver. 0.4.3 [42].

#### *5.9. Establishment of Symbiota*

To establish symbiotic relationships with host plants, wild-type *E. coenophiala* and mutants were surgically introduced into seedlings of tall fescue elite breeding line KYFA0601, and *E. hybrida* mutants were similarly introduced into seedlings of the perennial ryegrass experimental line GA66 [43] by the procedure described by Latch and Christensen [44] and modified by Chung et al. [45]. After seedlings developed at least three tillers, one tiller each was sacrificed and analyzed for presence of the fungus by tissue-print immunoblot with antiserum raised against *E. coenophiala* protein [46].

#### **6. Patents**

Pending: Schardl, C.L.; Florea, S.; Farman, M.L. Fungal chromosome-end knockoff strategy. US 2017/0349899 A1, 2017.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/2072-6651/13/2/153/s1, Figure S1: PCR tests of putative CRISPR/Cas9-mediated deletion mutants, Table S1: Genome assembly statistics for wild-type *Epichloë* species, Table S2: Genome assembly statistics for CRISPR/Cas9-mediated mutants.

**Author Contributions:** Conceptualization, C.L.S. and S.F.; methodology, S.F.; formal analysis, J.J.; investigation, S.F.; data curation, J.J.; writing, S.F. and C.L.S.; project administration, C.L.S.; funding acquisition, C.L.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the U.S. Department of Agriculture National Institute of Food and Agriculture Hatch project KY012044, and by the Mycological Society of America.

**Acknowledgments:** The authors thank Lusekelo Nkuwi for maintenance of plants, Alfred D. Byrd for technical support, and Neil Moore (University of Kentucky) for assistance in genomic data analysis.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Evolution of the Ergot Alkaloid Biosynthetic Gene Cluster Results in Divergent Mycotoxin Profiles in** *Claviceps purpurea* **Sclerotia**

**Carmen Hicks <sup>1,†</sup>, Thomas E. Witte <sup>1,†</sup>, Amanda Sproule <sup>1</sup>, Tiah Lee <sup>1</sup>, Parivash Shoukouhi <sup>1</sup>, Zlatko Popovic <sup>2</sup>, Jim G. Menzies <sup>2</sup>, Christopher N. Boddy <sup>3</sup>, Miao Liu <sup>1</sup> and David P. Overy <sup>1,</sup>\***


**Abstract:** Research into ergot alkaloid production in major cereal cash crops is crucial for furthering our understanding of the potential toxicological impacts of *Claviceps purpurea* upon Canadian agriculture and to ensure consumer safety. An untargeted metabolomics approach profiling extracts of *C. purpurea* sclerotia from four different grain crops separated the *C. purpurea* strains into two distinct metabolomic classes based on ergot alkaloid content. Variances in *C. purpurea* alkaloid profiles were correlated to genetic differences within the *lpsA* gene of the ergot alkaloid biosynthetic gene cluster from previously published genomes and from newly sequenced, long-read genome assemblies of Canadian strains. Based on gene cluster composition and unique polymorphisms, we hypothesize that the alkaloid content of *C. purpurea* sclerotia is currently undergoing adaptation. The patterns of *lpsA* gene diversity described in this small subset of Canadian strains provide a remarkable framework for understanding accelerated evolution of ergot alkaloid production in *Claviceps purpurea*.

**Keywords:** *Claviceps purpurea*; fungal plant pathogen; biosynthetic gene cluster; ergot alkaloids; mycotoxins; untargeted metabolomics; mass spectrometry; sclerotia

**Key Contribution:** Metabolomic analysis of Canadian *Claviceps purpurea* sclerotia extracts indicates there are two distinct chemical phenotypes associated with divergent ergot alkaloid profiles within the species. Genomic comparison of select strains links specific mutations within the ergot alkaloid biosynthetic gene cluster to the emergence of ergot alkaloid diversity.

**Copyright:** © 2021 by the authors and Her Majesty the Queen in Right of Canada as represented by the Minister of Agriculture and Agri-Food Canada; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (https://creativecommons.org/licenses/by/4.0/).

**Citation:** Hicks, C.; Witte, T.E.; Sproule, A.; Lee, T.; Shoukouhi, P.; Popovic, Z.; Menzies, J.G.; Boddy, C.N.; Liu, M.; Overy, D.P. Evolution of the Ergot Alkaloid Biosynthetic Gene Cluster Results in Divergent Mycotoxin Profiles in *Claviceps purpurea* Sclerotia. *Toxins* **2021**, *13*, 861. https://doi.org/10.3390/toxins13120861

Received: 27 October 2021 Accepted: 26 November 2021 Published: 2 December 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

#### **1. Introduction**

Ergot fungi of the genus *Claviceps* are phytopathogenic ascomycetes with the ability to parasitize over 400 monocotyledonous plant species, most notably affecting a large number of grasses and common cereals [1,2]. *Claviceps* infections are characterized by the formation of recalcitrant resting structures known as sclerotia. Sclerotia form as fungal hyphae invade unfertilized ovules of host plants via stigmas and pollen tubes during anthesis, replacing developing host seeds with compact masses of hardened fungal mycelium [3,4]. While dormant in the sclerotial state, the fungus is protected, ensuring its survival through environmental extremes until appropriate weather conditions are presented [5]. Fruiting bodies (ascomata) then develop on the sclerotium and forcefully eject ascospores into the air, which are then dispersed by the wind to infect flowering host plants, thus continuing the life cycle.

Ergot infection of cereal crops directly impacts grain quality and yield and has become increasingly problematic across Canadian provinces over the past two decades [6,7]. Ergot-infested grain must have sclerotia removed prior to shipping, or grain shipments become downgraded or rejected at the point of sale, resulting in decreased returns to farmers [6]. However, yield reduction is seen as being of secondary importance when compared to the toxic effects of accidental consumption by humans and other animals [8]. Ergot sclerotia contain a wide variety of ergot alkaloids that constitute 0.5–2% of the entire sclerotium mass [9]. Sharing structural similarities to neurotransmitters, ergot alkaloids interact antagonistically or agonistically with α-adrenergic, dopamine, serotonin 5-HT and other receptors, impairing both psychological and motor functions [10]. Preventing or minimizing the occurrence of ergot alkaloids in our food and animal feed is, therefore, of great importance.

The ergot alkaloid biosynthetic pathway is well characterized (Figure 1), and associated biosynthetic gene clusters were identified from the genomes of several *Claviceps* spp. [11]. The presence or absence of various genes in the gene cluster has a direct impact on the diversity of ergot alkaloids produced by a species. Ergot alkaloids are generally divided into three subclasses: simple clavines, ergoamides (simple lysergic acid amides) and the more complex peptide-containing ergopeptides; however, all share the same initial biosynthetic pathway steps involved in the tetracyclic ring formation [12,13]. Ergoamide and ergopeptide biosynthesis initiates with LpsB, a unique monomodular non-ribosomal peptide synthetase (NRPS) that recognizes D-lysergic acid as a substrate, activates it as the AMP-ester and loads it on the LpsB carrier protein. Unique among NRPS pathways, the two downstream NRPSs, LpsA and LpsC, compete for LpsB binding and the subsequent elaboration of lysergic acid [14]. LpsC is a monomodular NRPS responsible for the incorporation of a single amino acid to produce the ergoamides. LpsA is a trimodular NRPS that incorporates three amino acid units through progressive elongation to form ergopeptams, which the mono-oxygenase EasH then converts into the ergopeptines via oxidation and spontaneous cyclization [15]. Two homologs of *lpsA* encoding LpsA1 and LpsA2 are present in the biosynthetic gene cluster. Sequence variations in the second adenylation domain (A-domain) of the two LpsA enzymes are responsible for incorporation of phenylalanine or leucine in the ergopeptams and ergopeptines [16]. Characterization of an *lpsA1* deletion mutant in *C. purpurea* P1 suggests that LpsA1 is responsible for formation of the phenylalanine-containing ergopeptines ergotamine and ergocristine [17].

In Canada, guidelines are placed on grain sclerotia content allowances that range from 0.01 to 0.1% depending on the host crop [18]. Previous studies showed that ergot count and weight are not predictive of diagnostically relevant ergot alkaloid concentrations [19], as ergot alkaloid constituents and concentrations within sclerotia are highly variable and directly dependent upon the producing species. *Claviceps purpurea* is documented as producing a structurally diverse range of ergopeptides due to the adenylation site promiscuity of LpsA and to the tandem duplication and neofunctionalization of the *lpsA* gene; the two copies were recently denoted *lpsA1* and *lpsA2* [11,17]. In addition to increased alkaloid content and variability, *C. purpurea* has the widest reported host range of any *Claviceps* species, making it a major contributor to ergot alkaloid food contamination across Canada [8,20]. Due to its wide host range, *C. purpurea* is a species complex that has been the focus of taxonomic evaluation. Multigene phylogenetic studies have confirmed cryptic speciation within the taxon, recorded as G1–7, wherein G1 represents *C. purpurea sensu stricto* [21,22]. The European Food Safety Authority (Panel on Contaminants in the Food Chain) recently called for increased monitoring and improved knowledge on the relation between *C. purpurea* sclerotia and ergot alkaloids of concern, in particular: ergometrine, ergotamine, ergosine, ergocristine, ergocryptine (α- and β-isomers), ergocornine and their corresponding -inine (S)-epimers [23]. Information on variability of *C. purpurea* ergot alkaloid content across crop hosts is, therefore, required if knowledgeable regulatory decisions are to be made regarding acceptable levels of ergot alkaloids for livestock and human consumption [24].

**Figure 1.** Lysergic acid and ergot alkaloid biosynthesis. (**A**) The organization of the ergot alkaloid biosynthetic gene cluster from *C. purpurea* 20.1. (**B**) The biosynthetic pathway for lysergic acid and the ergot alkaloids (C = condensation; A = adenylation; R = reductase domain; C<sup>0</sup> = C-terminal portion of a C domain).

In this study, we used UPLC-HRMS and an untargeted metabolomics approach to profile the alkaloid content of *C. purpurea* sclerotia from four different grain crops. Consistent trends in sclerotia ergot alkaloid content among grain crops separated the *C. purpurea* strains into two distinct metabolomic classes. *Claviceps purpurea* alkaloid profiles were correlated to genetic differences of the ergot alkaloid biosynthetic gene cluster from previously published isolate genomes and from newly sequenced, long-read genome assemblies of Canadian strains. Based on the gene cluster composition and unique polymorphisms, we hypothesize that the alkaloid content of *C. purpurea* sclerotia is undergoing adaptation that contributes to the diversity of ergopeptides produced.

#### **2. Results**

#### *2.1. Untargeted Metabolomics Profiling*

An initial unsupervised Euclidean Ward hierarchical clustering was performed individually for each host crop to reveal the underlying structure within the metabolomics data. Hierarchical clustering of the binary mass feature data showed a dichotomy in metabolite profiles amongst the isolates, separating the sclerotium extracts into two putative "metabolomic classes", which was consistent for all four host crops (Figure 2A, and Figures S2A, S3A and S4A). Class 1 consisted of isolates LM04, LM207, LM232 and LM233, while Class 2 consisted of isolates LM30, LM60, LM469 and LM474. The results of the discriminant multivariate analyses supported the observed Class dichotomy in *C. purpurea* sclerotia. The top 30 ranking mass features with the greatest importance in the random forest model classification (based on both mean decrease accuracy and mean decrease Gini) were found to be highly conserved between host crops and demonstrated large fold changes between classes (Figure 2B, and Figures S2B, S3B and S4B). From the orthogonal partial least squares discriminant analysis (OPLS-DA) models generated for each host crop, the resulting S-plots separated the mass features associated with latent variable class separation (black dots in Figure 2C, and Figures S2C, S3C and S4C), where positive values on the Y-axis represent mass features with greater association to Class 1, and negative values are mass features with greater association to the Class 2 isolates. When overlaid on the OPLS-DA S-plot, the top 30 mass features derived from the random forest analysis were found to occur at opposite ends of the Y-axis (red and blue dots in Figure 2C), indicating a high correlation of the selected mass features with the formation of the observed classes in both discriminant multivariate analyses.
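As an illustration of the unsupervised step only, the following is a minimal sketch of Ward's minimum-variance agglomeration applied to presence/absence vectors with squared Euclidean distances. The isolate names and the 0/1 calls are invented for the example, and `ward_two_clusters` is a hypothetical helper, not the software used in the study.

```python
from itertools import combinations


def ward_two_clusters(profiles):
    """Merge samples with Ward's criterion until two clusters remain.

    profiles: dict mapping sample name -> list of 0/1 mass-feature calls.
    Returns two sorted lists of sample names (sorted for stable output).
    """
    # Each cluster is a pair: (member names, centroid vector).
    clusters = [([name], [float(v) for v in vec]) for name, vec in profiles.items()]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b were merged.
        na, nb = len(a[0]), len(b[0])
        d2 = sum((x - y) ** 2 for x, y in zip(a[1], b[1]))
        return na * nb / (na + nb) * d2

    while len(clusters) > 2:
        # Pick the pair whose merge increases the total variance the least.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        a, b = clusters[i], clusters[j]
        na, nb = len(a[0]), len(b[0])
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(a[1], b[1])]
        merged = (a[0] + b[0], centroid)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]

    return sorted(sorted(members) for members, _ in clusters)


# Toy presence/absence calls (invented for illustration, not data from the study).
toy = {
    "iso_A": [1, 1, 1, 0, 0, 0],
    "iso_B": [1, 1, 0, 0, 0, 0],
    "iso_C": [0, 0, 0, 1, 1, 1],
    "iso_D": [0, 0, 1, 1, 1, 1],
}
groups = ward_two_clusters(toy)
print(groups)
```

With these toy vectors the two recovered groups are the pairs that share most of their features, which mirrors how a two-class dichotomy would appear at the top split of a dendrogram.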

An untargeted consensus binary heatmap representing mass feature frequency of occurrence in sclerotia produced on the four host crops was constructed to further explore the overall mass feature diversity amongst the strains and between classes (Figure 2D). When hierarchical cluster analysis was applied, the resulting class dichotomy was consistent across individual plant crop hosts. Clear groupings of mass features were found to occur at high frequencies (on all four hosts) and were consistent between *C. purpurea* isolate classes. A larger section of mass features was found to be infrequently observed (occurring on a single host by a specific isolate) and likely resulted from inherent differences in crop host- and/or *C. purpurea* isolate-specific metabolism. Most notable, however, was the consistent production of class-specific mass features, which had a high frequency of occurrence on all four hosts but occurred only in sclerotia from either Class 1 or Class 2 *C. purpurea* isolates (highlighted in Figure 2D). These were the same mass features designated as being responsible for class separation in the discriminant multivariate analyses derived from the non-binary mass feature data. Further investigation (see below) confirmed that these class-specific mass features represented metabolites derived from ergot alkaloid biosynthesis (Table S2).
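The frequency-of-occurrence idea behind the consensus heatmap can be sketched as a simple count: for each isolate and mass feature, on how many of the four hosts was the feature detected? The sketch below then flags features present on every host in all isolates of one class and absent from the other class. All isolate names, feature names and detections are invented, and both functions are hypothetical helpers rather than the authors' pipeline.

```python
HOSTS = ("rye", "triticale", "durum wheat", "common wheat")


def host_frequency(detections):
    """Count the hosts on which each (isolate, feature) pair was detected.

    detections: iterable of (isolate, host, feature) presence calls.
    Returns {(isolate, feature): number_of_hosts}.
    """
    freq = {}
    for isolate, _host, feature in set(detections):  # set() drops duplicate calls
        key = (isolate, feature)
        freq[key] = freq.get(key, 0) + 1
    return freq


def class_specific_features(freq, classes, n_hosts=len(HOSTS)):
    """Map each feature seen on all hosts in one class, and never in the others, to that class."""
    features = {feature for _, feature in freq}
    result = {}
    for feature in sorted(features):
        for name, members in classes.items():
            others = [iso for other, isos in classes.items() if other != name for iso in isos]
            in_class = all(freq.get((iso, feature), 0) == n_hosts for iso in members)
            absent_elsewhere = all(freq.get((iso, feature), 0) == 0 for iso in others)
            if in_class and absent_elsewhere:
                result[feature] = name
    return result


# Toy data (invented): two isolates per class, three kinds of features.
classes = {"Class 1": ["iso_A", "iso_B"], "Class 2": ["iso_C", "iso_D"]}
detections = set()
for host in HOSTS:
    for iso in classes["Class 1"]:
        detections.add((iso, host, "feat_1"))       # only in Class 1, all hosts
    for iso in classes["Class 2"]:
        detections.add((iso, host, "feat_2"))       # only in Class 2, all hosts
    for iso in classes["Class 1"] + classes["Class 2"]:
        detections.add((iso, host, "feat_shared"))  # everywhere
detections.add(("iso_A", "rye", "feat_rare"))       # single host, single isolate

specific = class_specific_features(host_frequency(detections), classes)
print(specific)
```

The shared and rare features are not reported, which corresponds to the high-frequency common block and the infrequent host- or isolate-specific block described in the text.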

#### *2.2. Ergot Alkaloid Profiles Drive the Separation of Claviceps purpurea Isolate Classes*

A molecular network was constructed to visualize ergot alkaloid/*C. purpurea* class associations using representative strains of both Class 1 and Class 2 sclerotium extracts from rye (Figure 3). Ergocristine/ergocristinine and ergotamine were the major ergopeptines produced by, and unique to, LM60 (Class 2) sclerotia, whereas ergovaline/ergovalinine, ergocornine/ergocorninine, ergocornam and ergocryptam were the major ergopeptines and ergopeptams produced by, and unique to, LM04 (Class 1) sclerotia. Both ergosine and ergocryptine were also major metabolites produced by LM04, with minor production observed from LM60 sclerotium extracts from rye.

Clear and consistent trends in ergot alkaloid production were also evident in the frequency consensus heatmap (Figure 4). When produced, all of the ergopeptine- and ergopeptam-associated mass features were typically detected in sclerotium extracts from all four host crops surveyed. Trends observed in ergopeptine and ergopeptam production between *C. purpurea* classes were consistent with the trends observed from the molecular network. Ergotamine and ergocristine were the predominant ergopeptines, with the most abundant peak areas, in Class 2 sclerotia, and ergocryptine and ergocornine were the most abundant ergopeptines in Class 1 sclerotia. Both ergocornine (with the exception of LM30 and LM60) and ergocryptine were detected in both Class 1 and Class 2 sclerotia from all four hosts; however, Class 1 sclerotia from all four hosts contained a significantly greater abundance of ergocornine and ergocryptine, as only very minor peak intensities were observed in Class 2 sclerotium extracts (LM469, LM474). Ergosine was observed in both Class 1 and Class 2 sclerotia, with a greater abundance associated with Class 1 isolates on all four crop hosts. Finally, ergoamide mass features associated with ergometrine were also detected in all Class 2 sclerotia and only two strains of the Class 1 sclerotia, with only minor production observed from all strains on all four host crops. The two Class 1 strains that did not produce ergometrine (LM04 and LM232, Figure 4) were reported to have an internal stop codon within the *lpsC* gene, which encodes the NRPS responsible for biosynthesis of ergoamides such as ergometrine ([25], this issue).

**Figure 2.** Untargeted analyses support two putative classes of *C. purpurea* strains grown on 'common' wheat. (**A**) Euclidean Ward hierarchical clustering of pseudo-binary (presence/absence) mass feature data of replicate sclerotia. Labels in red and blue represent specimens identified in Class 1 and Class 2, respectively, with (n = 6) replicates per specimen. (**B**) Top 30 mass features associated with class formation as determined via random forest analysis of raw data (left: mean decrease in accuracy; right: mean decrease in Gini of raw data of replicate sclerotia). (**C**) S-plot from OPLS-DA highlighting the top 30 features identified by random forest analysis of replicate sclerotia; features in red and blue represent specimens identified in Class 1 and Class 2, respectively. (**D**) Consensus phenotype heatmap constructed using Ward D2 hierarchical clustering of Euclidean distances between all conserved mass features across the 4 hosts (n = 518 mass features). The mass features indicated in yellow are the likely drivers of this dichotomy.

**Figure 3.** Molecular networking map of ergot alkaloids detected in sclerotium extracts from LM04 (Class 1, red) and LM60 (Class 2, blue) grown on rye. The sizes of the pie charts reflect the intensity of the precursor ions (summed) and the pie chart ratio is calculated from the *m*/*z* peak areas. Lines connect nodes if cosine scores comparing MS<sup>2</sup> scans are above 0.7, with thickness increasing as the cosine score increases. Some annotations are duplicated where the MS<sup>2</sup> analysis was unable to differentiate putative stereoisomers.
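The edge rule stated in the Figure 3 caption (connect two nodes when the cosine score of their MS<sup>2</sup> scans exceeds 0.7) can be illustrated with a plain cosine similarity over binned fragment peaks. This is a simplified sketch: molecular networking software commonly uses a modified cosine that also matches peaks shifted by the precursor mass difference, which is not implemented here. The spectra, the bin width and the function names are assumptions made for the example.

```python
import math


def cosine_score(spec_a, spec_b, bin_width=0.01):
    """Plain cosine similarity between two spectra given as (m/z, intensity) pairs."""
    def binned(spec):
        vec = {}
        for mz, intensity in spec:
            key = round(mz / bin_width)  # peaks in the same bin are summed
            vec[key] = vec.get(key, 0.0) + intensity
        return vec

    a, b = binned(spec_a), binned(spec_b)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def network_edges(spectra, threshold=0.7):
    """Return (node_u, node_v, score) for every pair scoring above the threshold."""
    names = sorted(spectra)
    edges = []
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            score = cosine_score(spectra[u], spectra[v])
            if score > threshold:
                edges.append((u, v, score))
    return edges


# Toy spectra with invented m/z values and intensities.
spectra = {
    "spec_x": [(100.0, 10.0), (150.0, 5.0)],
    "spec_y": [(100.0, 8.0), (150.0, 6.0)],
    "spec_z": [(200.0, 10.0)],
}
edges = network_edges(spectra)
print(edges)
```

Only the two spectra that share fragment peaks are linked; the edge score could then be mapped to line thickness, as in the figure.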


Major ergopeptine signals from the Class 2 group were almost universally not detected in Class 1 isolate extracts, whereas the ergopeptines that were characteristic of Class 1 isolate extracts were consistently detected in the Class 2 extracts, albeit at very low intensities. When the molecular structures of the ergopeptines and ergopeptams associated with the Class dichotomy are compared, a clear trend is observed. Ergopeptines and ergopeptams in Class 1 lack phenylalanine incorporation at the second amino acid position (aa2) of the tripeptide moiety.

#### *2.3. Ergot Alkaloid Gene Clusters Show Increased Rates of Mutation Associated with LpsA Gene Modules*

A comparison of the ergot alkaloid biosynthetic gene clusters from LM04 and LM60, selected as representative isolates from Classes 1 and 2, respectively, as well as *C. purpurea* isolate 20.1 [26] and LM72 ([25], this issue), revealed a higher degree of polymorphism and repetitive DNA elements surrounding the *lpsA* genes (Figure 5A). Higher proportions of inter-strain polymorphisms were localized to specific *lpsA* modules, showing as much as 20% polymorphic sites (Figure 5A, red line graphs), well above the 1.5% 'background' polymorphic frequency calculated by comparison of two other NRPS genes, *lpsB* and *lpsC*, present in the cluster. Patterns of site divergence between *lpsA* genes correlated with the metabolomics data in two ways. First, we detected a frameshift mutation caused by an indel in the sequence encoding the second condensation domain of LM04 *lpsA2* (Figure 5A) that prematurely terminates the encoded NRPS. As such, LpsA2 is not expected to be expressed as a functional full-length NRPS in the Class 1 strain LM04. Second, the sequence encoding the second adenylation, carrier protein and condensation domains of *lpsA1* in LM04 had approximately 18% polymorphic sites compared to the Class 2 strain LM60 as well as LM72 and *C. purpurea* 20.1. These polymorphisms included missense mutations that change the A-domain's 'specificity code', which is predictive of the domain's amino acid specificity during NRPS biosynthesis [27] (Table S3).

**Figure 4.** Consensus phenotype heatmap constructed using hierarchical clustering of representative mass features of ergot alkaloids and related derivatives demonstrating the frequency of occurrence and dichotomy in *C. purpurea* classes. Boxplots comparing frequency of occurrence by host crop of mass features accompanied with associated ergopeptine. In the boxplots, outliers are represented by black ovals; key structural differences in the ergopeptine aa2 residue are coloured by class association (blue = Class 1, red = Class 2).

**Figure 5.** (**A**) Synteny analysis between ergot alkaloid biosynthetic gene clusters in *C. purpurea* strains 20.1, LM60, LM72 and LM04. Dark blue genes represent *lpsA* NRPSs; all other genes are labelled in light blue (see Figure 1 for gene classifications). Gene letters have the prefix '*eas*' removed for brevity, with the exception of *B*, which represents *cloA*, and *W*, which represents *dmaW*. Transparent arrows represent repetitive elements. Grey blocks represent syntenic DNA above 88% nucleotide identity, with the exception of the block connecting *easH2* pseudogenes located between LM60 and LM72, which have only 50% nucleotide identity. The red line graph overlaid onto the syntenic blocks represents a running average of the proportion of polymorphic DNA sites as calculated in 100 bp stepwise increments of a 500 bp sliding window, in each instance comparing the sequences above and below, and should be interpreted as aligning to the track below. NRPS domains predicted by antiSMASH 6.0 are overlaid on the *lpsA* genes. A yellow diamond is placed where a frameshift mutation was predicted to have disrupted the second condensation domain of *lpsA2* in LM04 via premature stop codon insertion. (**B**) A phylogenetic comparison of the nucleotide sequence encoding the second A-domain (aa2) of all *lpsA* genes from all strains included in this study. The tree shows that the clustering of all aa2 A-domains from predicted *lpsA1* genes correlates with the metabolomic clustering of strains identified as 'Class 1' vs. 'Class 2'. (**C**) Comparison of polymorphic site frequencies between *lpsA2* of *C. purpurea* strain 20.1 and *lpsA1* of LM04.
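The running average described in the Figure 5 caption (a 500 bp window moved in 100 bp steps over two compared sequences) can be sketched as follows. The sketch assumes the two sequences are already aligned to equal length and counts any mismatching column, including gap characters, as polymorphic; how gaps were treated in the study is not stated, so that choice is an assumption. The sequences are synthetic.

```python
def polymorphic_proportion(seq_a, seq_b, window=500, step=100):
    """Proportion of differing sites per sliding window of a pairwise alignment.

    Returns a list of (window_start, proportion) tuples.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # 1 where the aligned columns differ, 0 where they match.
    diffs = [int(x != y) for x, y in zip(seq_a.upper(), seq_b.upper())]
    profile = []
    for start in range(0, len(diffs) - window + 1, step):
        profile.append((start, sum(diffs[start:start + window]) / window))
    return profile


# Synthetic 1000 bp alignment with a 100 bp divergent block at positions 500-599.
seq_a = "A" * 1000
seq_b = "A" * 500 + "C" * 100 + "A" * 400
profile = polymorphic_proportion(seq_a, seq_b)
for start, proportion in profile:
    print(start, proportion)
```

Windows that overlap the divergent block report an elevated proportion, which is how localized peaks over particular *lpsA* modules would stand out against a low background.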

Sequence analysis of the second A-domain from LM04 *lpsA1* shows that it encodes residues that most closely match the leucine 'specificity code' identified by Stachelhaus et al. [27,28]. While substrate specificity predictions of A-domains in bacterial NRPS systems are fairly robust, they are known to be less so in fungal NRPS systems. Fungal A-domain predictions are often limited to gross physicochemical properties [28] and, in some cases, they fail entirely [29]. Together, these observations provide a genetic framework for understanding the metabolomic patterns detected in strains from Class 1 and Class 2. The Class 1 strain LM04 has only one functional copy of *lpsA*, *lpsA1*, which encodes substrate specificity in the second A-domain of LpsA1 for leucine and other branched hydrophobic amino acids rather than phenylalanine, leading to the formation of the observed Class 1 ergopeptines. This contrasts with the documented phenylalanine selectivity of LpsA1 in *C. purpurea* P1 [17].

To verify whether the LpsA1 second A-domain sequences from other strains grouped in Class 1 contained a similar leucine-selective A-domain, we extracted the corresponding protein sequences from *lpsA1* and *lpsA2* from the long-read genomes compared in this study and previously published Illumina short-read genomes for the *C. purpurea* strains LM30, LM60, LM207, LM461, LM232, LM233, LM469 and LM474 [30]. The short-read genomes were fragmented at the *lpsA* regions; however, the second A-domain encoding sequences appeared intact. A phylogenetic tree of the aligned sequences (Figure 5B) supports the presence of two divergent *lpsA1* lineages correlating to Class 1 and Class 2 strains. This result supports the hypothesis that all Class 1 strains inherited a divergent second A-domain sequence with substrate specificity for leucine rather than phenylalanine. Additionally, we observed that the *lpsA2* second A-domain of the *C. purpurea* strain 20.1 was highly divergent from all of the other A-domains included in the analysis. It also lacked the predicted specificity for phenylalanine. However, the high levels of sequence divergence between *C. purpurea* 20.1 *lpsA2* and LM04 *lpsA1* suggest that it is unlikely that these two altered A-domains share a common origin (Figure 5C).

Following the prediction of a frameshift mutation in LM04 *lpsA2*, we looked for evidence of *lpsA2* disruption in the genomes of other Class 1 strains to explore the possibility that this lineage has consistent pseudogenization of *lpsA2*. The same C2-domain frameshift mutation was detected in one other genome, LM232 (Figure S5). Alignment of the fragments of putative *lpsA1* and *lpsA2* genes from LM207 and LM233 to the LM04 genes showed no obvious signs of disruption; however, we note that the Illumina assemblies from LM207 and LM233 are fragmented and would benefit from long-read sequencing to thoroughly investigate this possibility. Although LM72 was not selected for metabolomic studies, our genome analysis indicated there is a premature stop codon in *lpsA2* in this strain, disrupting a condensation domain (Figure 5A).
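The two kinds of disruption discussed above (an indel that shifts the reading frame, and a premature stop codon) can be illustrated with a short sketch that scans a coding sequence for its first in-frame stop. The sequences are invented toy examples, not *lpsA2* sequence, and the helper names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_index(cds):
    """Return the 0-based codon index of the first in-frame stop codon, or None."""
    cds = cds.upper()
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


def is_frameshift(indel_length):
    """An indel shifts the reading frame unless its length is a multiple of three."""
    return indel_length % 3 != 0


# Toy reference ORF: ATG GCT ATA GCC TTG GGC TAA (stop is the 7th codon).
reference = "ATGGCTATAGCCTTGGGCTAA"
# Toy mutant: one base deleted after the start codon, so the frame shifts
# and the codons become ATG CTA TAG ... with an early stop.
mutant = reference[:3] + reference[4:]

print(first_stop_index(reference), first_stop_index(mutant))
```

In the mutant the stop codon moves from the seventh to the third codon, which is the signature of a frameshift that truncates the encoded protein. A single nucleotide substitution that creates a stop codon, as described for LM72, would shorten the product without changing the frame.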

#### *2.4. Transposons and Transposon Fossils in the LpsA1/LpsA2 Region*

The intergenic regions between *lpsA1* and *lpsA2* contained numerous small (200–1000 bp) sequences that are duplicated in other regions of the respective genomes, including one that is inserted into the remnants of the truncated and pseudogenized *easH2* in LM04. This suggests that these sequences are potentially transposable element fossils. Repetitive elements in the *C. purpurea* 20.1 and LM60 intergenic regions show some similarity to repetitive elements annotated as DNA transposons of the 'MULE-MuDR' type, whereas repetitive elements in the LM72 and LM04 genomes show similarity to the 'TcMar-Tc1' type DNA transposons annotated in the LM04 genome. The *easH2* copies, which were truncated compared to *easH1* and predicted to be pseudogenized, had low sequence similarity (approximately 50% nucleotide identity across shared sequence lengths) between the 20.1/LM60 and LM72/LM04 isoforms. A putative transposon was observed in LM72 between *easF* and *easG*. As this study did not include metabolomic analysis of LM72 sclerotia, we did not pursue the potential effects of this putative transposon on ergot alkaloid production.

#### **3. Discussion**

In this study, we classified Canadian *C. purpurea sensu stricto* strains into two broad categories based on ergot alkaloid profiles. We arbitrarily designated these groups as 'Class 1' and 'Class 2', and showed that the classes are dominated by either aliphatic hydrophobic residue-incorporating or phenylalanine-incorporating ergot alkaloids, respectively. Our identification of diversified intra-species ergot alkaloid profiles is generally consistent with previously published analyses of *C. purpurea* strains. Historically, ergopeptide production in *C. purpurea* has been quite diverse and varied between geographic locations and broader population groupings. A prominent early example of this trend compared strain 'P1', which was found to produce primarily ergotamine and ergocryptine, with strain 'Ecc93', a producer primarily of ergocristine [16]; however, we note the lineage of Ecc93 may need to be further investigated in light of newer *Claviceps* nomenclature [21]. *Claviceps purpurea* strain 20.1 appears to have an ergot alkaloid profile similar to that of strain P1, reportedly producing ergotamine, ergocryptine and ergometrine as major products [26]. The presence of divergent ergot alkaloid profiles was also previously described from broader genetic groupings of *C. purpurea sensu lato* (G1–G4); however, the profiles of *Claviceps purpurea sensu stricto* (G1) were more difficult to define [31,32]. For example, early work by Pažoutová et al. [31] described *C. arundinis* (G2a) sclerotia as consistent producers of ergocristine, ergosine and small amounts of ergocryptine; however, the G1 strains showed highly variable ergot alkaloid presence and relative abundance. More recent work by Negård et al. [2] showed that Norwegian *C. purpurea* G1 sclerotia contained a complex mixture of ergotamine, ergosine, ergocornine, ergocryptine and ergocristine as major constituents. In light of the present study, we suggest that the profiles generated by Pažoutová and Negård indicate that G1 strains have diverse LpsA A-domain specificities, which result in a broad spectrum of chemo-phenotypes. More recently, a detailed high-resolution LC-MS analysis of ergot alkaloid profiles from various *Claviceps* species was published, which included six strains of *C. purpurea* isolated from Canadian cereals [33]. The metabolomic classes we defined here broadly correlate with their results: four of the six *C. purpurea* strains' profiles exhibited abundant ergotamine, ergocristine and ergostine mass feature peak areas (consistent with Class 2), and one strain showed abundant ergocornine, ergocryptine and ergovaline peak areas (consistent with Class 1).
Taken together, our results contribute to a continuing understanding of the diversity detected in this species, and highlight the tension inherent in classifying strains based on chemotaxonomic vs. genetic criteria.

Our genetic analysis of the ergot alkaloid gene clusters from long-read genomes provides a framework to understand the underpinnings of chemical phenotype diversity within the *C. purpurea* G1 group. The ergot alkaloid biosynthetic gene clusters extracted from long-read assemblies of LM04 and LM60, chosen as representatives from the Class 1 and Class 2 metabolomic groupings, support a hypothesis that the class-specific ergot alkaloid profiles result from sequence diversity in the *easH*/*lpsA* tandem-duplicated region [16,17]. We predict that these mutations shift the substrate specificity of the *lpsA1* second A-domain to favor leucine and related aliphatic hydrophobic substrates such as isoleucine and valine, and additionally propose that *lpsA2* has been pseudogenized, at least in LM04, restricting ergopeptine biosynthesis to interactions between LpsB and LpsA1. Given that the specific frameshift mutation described above is only detected in the genomes of LM04 and LM232, but not the other Class 1 strains LM233 or LM207, it is possible that the frameshift is not a causal feature for the metabolomic shift away from phenylalanine-incorporating ergot alkaloids across all Class 1 strains, but rather could be a symptom of an overall relaxed selection for LpsA2 functionality due to some other mechanism. Importantly, this relaxed selection is potentially not limited to Class 1 strains, but may also be present in Class 2 strains, for which both the *lpsA1* and *lpsA2* second A-domains are predicted to have phenylalanine specificity. The identification of a premature stop codon due to a single nucleotide polymorphism in LM72 *lpsA2* supports this hypothesis. Although LM72 has yet to be associated with a given metabolomic class, we hypothesize it is in Class 2 based on A-domain selectivity predictions.
The lack of detected phenylalanine-containing ergopeptines in Class 1 metabolomes strongly suggests that the *lpsA2* gene is nonfunctional, at least in those strains. Notably, this pattern departs from the known ergot alkaloid biosynthetic profile of *C. purpurea* P1, in which researchers found that the *lpsA1* and *lpsA2* genes were both active, and that strains with *lpsA1* knocked out lost the ability to produce phenylalanine-containing ergotamine but still produced leucine-containing ergocryptine [17,34].

The ergot alkaloid chemical phenotype patterns observed in this study indicate that a representative portion of the Canadian populations of *C. purpurea* produce either exclusively short-chain alkyl group-bearing products (Leu, Ile, Val), or primarily phenylalanine-incorporating products with trace amounts of short-chain alkyl group-bearing products. At this time, the advantages of either phenotype for plant colonization, if any, are unknown, limiting the utility of speculation on the selective pressures shaping the evolution of *lpsA*. One important question that remains unanswered is: what are the origins of the mutations? A larger population of genomic and metabolomic data is likely needed to form a robust hypothesis; however, the presence of transposable elements (TEs) with similarity to the MULE and TcMar families of DNA transposons in the intergenic space of the *lpsA* genes suggests that TE-mediated transposition, or mutations associated with TE insertions, are potential contributors to the *lpsA* diversity. Notably, the repetitive element-rich, highly polymorphic *lpsA1/lpsA2* intergenic spaces associated with either *C. purpurea* 20.1/LM60 or LM72/LM04 (Figure 5) are tightly linked with divergent *lpsA2* C0 domain sequences (Figure S5), thereby linking the presence of repetitive element insertions to nearby gene-coding sequence changes. Although the pattern of linked intergenic and *lpsA2* C0 domains is reflected in all genomes included in this study, the resulting groupings do not perfectly correlate with the assigned metabolomic classes of the strains. The domain-specific, highly variable proportions of *lpsA1/2* mutations detected when comparing strains (for example, see the comparison of *lpsA1* between LM04 and LM72, Figure 5) suggest that *lpsA* genes are likely undergoing recombinational shuffling [35,36]. Our analysis highlights the highly dynamic nature of the tandem-duplicated *easH*/*lpsA* region and the nature of ergot alkaloid NRPS module-specific evolution.

Ergot alkaloids are responsible for a broad spectrum of biological activities, as they can act as agonists, partial agonists and antagonists at multiple receptor sites located throughout the body in various tissue types [37]. The effects of ergot alkaloids on mammalian systems such as humans and livestock (i.e., cattle, sheep, horses and goats) are therefore diverse; in the case of livestock, observed symptomatic responses to ergot alkaloid exposure can be highly variable and have historically been difficult to diagnose [38]. Gangrenous ergotism occurs most frequently from acute ergot alkaloid exposure, caused by general blood vessel vasoconstriction and dysfunction resulting in tissue necrosis (dry gangrene) of the extremities such as the ear tips, tail, lower limbs and hooves [37–39]. In other instances, the consumption of small quantities of ergot alkaloids, and the prolonged exposure of livestock to them, negatively impacts energy metabolism, feed efficiency and livestock productivity, as evidenced by decreases in food intake, live weight gain, circulating prolactin, reproductive performance and milk production, and by hyperthermia [37–39]. Pregnant and lactating animals are most susceptible to ergot alkaloid exposure due to increased risk of abortion and agalactia syndrome (the absence of, or failure to secrete, milk); there is a direct correlation of ergopeptide exposure with a decrease in prolactin, growth hormone and luteinizing hormone secretion, and the inhibition of milk production [37]. Increasing our current understanding of the variability in ergot alkaloid content in *C. purpurea* sclerotia derived from different cereal crops used for food and feed production is, therefore, of continued relevance.

Ergot alkaloid profiles from *Claviceps* sclerotia consist of complex mixtures of minor stereoisomers, constitutional isomers and transient intermediate products, resulting in numerous peaks with the same *m*/*z* being detected, often at very low abundances when compared to the dominant *m*/*z* peak [31]. This was reflected in our analysis, as numerous mass features were observed that were isobaric with ergotamine, ergocornine, ergocryptine and ergosine isomers, as well as with isomers of other lactam and precursor molecules. However, many of these isobaric mass features were found to have random occurrences across strains and plant crop hosts, often occurring at a low abundance just above mass detection thresholds. The main ergoamides and ergopeptides of regulatory concern [23], namely ergometrine, ergotamine, ergosine, ergocristine, ergocryptine and ergocornine, when produced, were found to occur in *C. purpurea* sclerotia regardless of host crop. Of the *C. purpurea* strains examined, ergotamine and ergocristine were consistently the most abundant ergopeptides observed in Class 2 sclerotia, and ergocornine and ergocryptine were the most abundant in Class 1 sclerotia, from all four host crops tested. Based on the limited sample size, these four ergopeptides are likely to be of greatest concern in food and feed commodities derived from rye, triticale, durum wheat and 'common' wheat; however, a more extensive sampling of these crops using a larger breadth of *C. purpurea s.s.* isolates will first be required to lend credence to this assumption.

#### **4. Conclusions**

The patterns of *lpsA* gene diversity described in this small subset of Canadian strains provide a remarkable framework for understanding the accelerated evolution of ergot alkaloid profiles. The results of this study provide insight into the variation in alkaloid content across Canadian *C. purpurea* isolates that could help guide future experiments in the exploration of secondary metabolite production. This research contributes to the understanding of the toxicological impacts of *Claviceps* species, with the aim of improving consumer safety. *Claviceps purpurea* alkaloid content is an important topic of investigation because the species is recognized as having a major impact on Canadian crops. The eight *C. purpurea* isolates that were analyzed within this study show clear differentiation of ergot alkaloid production. To inform the current regulatory mandate, the next steps would involve analyzing a larger sample size, with isolates across a broader range of hosts and host replicates as well as a broader geographic distribution, to develop a better understanding of metabolomic class distribution. An additional recommendation that can be made based on the dichotomy in the observed ergot alkaloid production from *C. purpurea s.s.* isolates is that care should be taken when drawing taxonomic inferences based solely on ergot alkaloid profiles from sclerotia of unknown origin.

#### **5. Materials and Methods**

#### *5.1. Strain Selection and Greenhouse Inoculation*

Eight isolates of *C. purpurea* obtained from different Canadian provinces were selected from the Canadian Fungal Culture Collection (DAOMC) and the Liu lab research collection for in planta inoculation and untargeted metabolomic profiling (Table S1). From each isolate, representative sequences of the *RPB2* and *TEF1*-*α* protein coding genes were obtained, concatenated, and used in a maximum parsimony analysis with various *Claviceps* spp.; all eight selected isolates were classified as *C. purpurea s.s.* (Figure S1). In planta inoculations were completed on four different crop species: rye (*Secale cereale* L.), the 'common' wheat cultivar AC Cadillac (*Triticum aestivum* L.), durum wheat (*Triticum durum* Desf.) and triticale (*x Triticosecale* Wittmack). *Claviceps purpurea* isolates were grown on potato dextrose broth to obtain spores, and spore viability was checked by plating onto malt extract agar. Viable spores were inoculated at a concentration of 10,000 conidia per mL on 20 florets per spike with 3 replicates per host species. Five healthy spikelets on each side of the spike were selected and the primary and secondary florets were filled with the conidial suspension using a syringe and hypodermic needle [40]. The sclerotia were collected at crop maturity when the infection was successful. Greenhouse inoculation trials yielded sufficient *C. purpurea s.s.* sclerotia (n = 6/host, with the exception of a single isolate, LM60, which only yielded three sclerotia when inoculated on durum wheat) from the four cereal host plants. Each harvested sclerotium was cut into 5 mg fragments, with one fragment subjected to DNA extraction and *easE* gene sequencing to confirm fungal identity and the second fragment reserved for alkaloid profiling.

#### *5.2. Isolate Identity Confirmation*

Genomic DNA was extracted from sclerotium fragments using a Macherey–Nagel NucleoMag 96 Trace Kit (Macherey–Nagel, Düren, Germany) following a modified protocol as described [22]. Polymerase chain reaction (PCR) amplification was performed on all extracted gDNA samples in 10 µL reactions. Final concentrations of 1× Titanium Taq buffer (with 3.5 mM MgCl2), 0.1 mM dNTPs, 0.08 µM of both forward and reverse primers, 1× Titanium Taq polymerase (Clontech, Mountain View, CA, USA) and 0.01 mg bovine serum albumin (BSA) were obtained with 1 µL of DNA template. FAD-linked oxidoreductase *easE*, a single copy ergot alkaloid synthesis gene, was amplified using designed primers: easE996f and easE1895r [22]. PCRs were run on a Mastercycler Pro S (Eppendorf, Mississauga, ON, Canada) using a touchdown protocol with initial denaturation at 95 °C for 3 min, followed by 5 cycles of 95 °C for 1 min, annealing at 63 °C (decrease of 1 °C per cycle) for 45 s and extension at 72 °C for 1 min 30 s, and then 30 cycles of 95 °C for 1 min, annealing at 58 °C for 45 s and extension at 72 °C for 1 min 30 s, with a final extension at 72 °C for 8 min. The PCR products were visualized on a 1% agarose gel with ethidium bromide treatment for 30 min at 200 V.
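The touchdown protocol can be written out cycle by cycle. The following is a minimal sketch, assuming the annealing temperature starts at 63 °C and drops by 1 °C in each of the five touchdown cycles; the function and variable names are illustrative and do not come from the thermocycler software. The total counts programmed hold times only and ignores ramping between temperatures.

```python
# Sketch of the touchdown schedule described above (names are illustrative).

def touchdown_schedule(start_anneal=63, step=1, touchdown_cycles=5,
                       final_anneal=58, final_cycles=30):
    """Return the annealing temperature (deg C) used in each PCR cycle."""
    touchdown = [start_anneal - step * i for i in range(touchdown_cycles)]
    return touchdown + [final_anneal] * final_cycles

schedule = touchdown_schedule()

# Hold times per cycle in seconds: denaturation 60, annealing 45, extension 90.
cycle_seconds = 60 + 45 + 90

# Initial denaturation (3 min) + all cycles + final extension (8 min).
total_seconds = 3 * 60 + len(schedule) * cycle_seconds + 8 * 60

print(schedule[:6])   # annealing temperatures of the first six cycles
print(len(schedule))  # total number of cycles
print(total_seconds)  # programmed hold time in seconds, excluding ramps
```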

PCR products were amplified for Sanger sequencing using the ABI BigDye Terminator 3.1 cycle sequencing kit in a reaction volume of 10 µL, with BigDye Seq Mix diluted 1:8 with Seq buffer (Thermo Fisher Scientific, Ottawa, ON, Canada). The final volumes of each reagent were as follows: 1.75 µL of 5× sequencing buffer, 0.5 µL of BigDye Seq Mix and 0.5 µL of 3.2 µM primer, with reaction volumes increased to 9 µL using sterile high-performance liquid chromatography (HPLC) water. One microlitre of PCR product was added directly from the initial PCR amplification. The thermocycler profile used for the sequencing reactions had an initial denaturation step at 95 °C for 3 min, followed by 40 cycles at 95 °C for 30 s, annealing at 58 °C for 15 s and extension at 60 °C for 3 min. An Applied Biosystems Prism 3130xl Genetic Analyzer (Life Technologies, Streetsville, ON, Canada) was used to generate DNA sequences and chromatograms. DNA sequences were edited and aligned to reference sequences in Geneious 11.1.5 to confirm identity. Once identity was confirmed, additional PCR and sequencing reactions were completed; DNA-directed RNA polymerase II subunit (*RPB2*) and translation elongation factor 1-α (*TEF1*-*α*) were amplified and sequenced using the above methods for phylogenetic construction.
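The reagent volumes stated for the sequencing reaction imply the water volume and the final primer concentration. A short worked calculation, using only the volumes given in the text (variable names are illustrative):

```python
# Volumes in microlitres, taken from the paragraph above.
buffer_5x = 1.75
bigdye_mix = 0.5
primer = 0.5               # added from a 3.2 uM stock
pre_template_volume = 9.0  # volume after topping up with HPLC water
template = 1.0             # PCR product

# Water needed to bring the reagents to 9 uL.
water = pre_template_volume - (buffer_5x + bigdye_mix + primer)

# Final reaction volume and primer concentration after adding template.
final_volume = pre_template_volume + template
final_primer_uM = primer * 3.2 / final_volume

print(water, final_volume, round(final_primer_uM, 2))
```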

#### *5.3. Phylogenetic Analysis*

A total of 43 reference sequences and 2 outgroup sequences were downloaded for each locus, i.e., the second largest subunit of RNA polymerase II (*RPB2*) and elongation factor 1-α (*TEF1*-*α*) genes, and were aligned individually using MAFFT version 7 (online service https://mafft.cbrc.jp/alignment/server/ accessed on 22 January 2020). The Auto alignment strategy (FFT-NS-1, FFT-NS-2, FFT-NS-i or L-INS-i, depending on data size) was chosen. The alignments were visualized and verified, and the two genes were concatenated using Geneious 11.1.5. Parsimony analysis was completed using PAUP\* 4.0b10 [41]; heuristic searches with 50 replicates of random stepwise addition and tree bisection-reconnection branch swapping were conducted with a limit of 1,000,000 rearrangements set for each replicate. Bootstrapping analyses were set with 100 replicates, each with a full heuristic search of 50 random stepwise addition replicates and a limit of 1,000,000 rearrangements per replicate.

#### *5.4. HRMS Profiling of Sclerotium Extracts*

Sclerotium fragments were individually frozen (−80 °C) and pulverized before being transferred into a 2 mL HPLC amber vial and extracted in 300 µL acetone:water (4:1, *v*:*v*) on a rotary shaker for 1 h (100 rpm). Extracts were centrifuged to pellet debris and 50 µL of the supernatant was transferred into a new HPLC vial for UPLC-HRMS profiling. Sclerotium extracts were profiled in a randomized injection order with solvent blanks interspersed between every six samples to assess for sample carryover and to aid in data curation during metabolomics processing. A reserpine standard was injected at the beginning of each sample sequence to confirm accurate calibration of the mass spectrometer and to aid in data alignment. Chemical profiling was completed using a Thermo Ultimate 3000 UPLC coupled to a Thermo LTQ Orbitrap XL HRMS and an UltiMate Corona VeoRS charged aerosol detector (Thermo Fisher Scientific Inc, Waltham, MA, USA). Chromatography was performed on a Phenomenex C<sub>18</sub> Kinetex column (50 mm × 2.1 mm ID, 1.7 µm) with a flow rate of 0.35 mL/min, running a gradient of H2O (+0.1% formic acid) and ACN (+0.1% formic acid). The gradient started at 5% ACN, increased to 95% ACN over 4.5 min, was held at 95% ACN until 8.0 min, returned to 5% ACN by 9 min and was left to equilibrate until 10 min before the next injection. The HRMS was operated in ESI<sup>+</sup> mode (with a 100–2000 *m*/*z* range) using the following parameters: sheath gas (40), auxiliary gas (5), sweep gas (2), spray voltage (4.2 kV), capillary temperature (320 °C), capillary voltage (35 V) and tube lens (100 V). MS<sup>n</sup> fragmentation was performed in high resolution on select ions in subsequent experiments using CID at 35 eV.
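The UPLC gradient can be expressed as a piecewise-linear function of time. This is a minimal sketch, assuming linear ramps between the stated breakpoints (the text gives only the breakpoints, not the ramp shape); `GRADIENT` and `percent_acn` are illustrative names.

```python
# Breakpoints (time in min, % ACN) read from the gradient description above.
GRADIENT = [(0.0, 5.0), (4.5, 95.0), (8.0, 95.0), (9.0, 5.0), (10.0, 5.0)]

def percent_acn(t):
    """Linearly interpolate % ACN at time t (min) between breakpoints."""
    if not GRADIENT[0][0] <= t <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-10 min run")
    for (t0, p0), (t1, p1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t <= t1:
            return p0 + (p1 - p0) * (t - t0) / (t1 - t0)

for t in (0.0, 2.25, 6.0, 8.5, 9.5):
    print(t, percent_acn(t))
```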

Additional experiments were performed to investigate the alkaloid diversity from LM04 and LM60 sclerotium extracts derived from rye. Samples were filtered through 0.2 µm PTFE membrane filters prior to analysis by nanoLC coupled to the Q-Exactive Plus mass spectrometer (Thermo Fisher Scientific). Chromatographic separation of metabolites was performed on a Proxeon EASY nLC II System (Thermo Fisher Scientific) equipped with a Thermo Scientific™ Acclaim™ PepMap™ RSLC C18 column (P/N ES800A), 15 cm × 75 µm ID, 3 µm, 100 Å, employing a H2O/ACN gradient (with 0.1% formic acid). Chromatography ran for 60 min at a flow rate of 0.25 µL/min: initiated with a linear gradient from 10 to 100% of ACN for 45 min, held at 100% ACN until 50 min, then decreased from 100 to 10% of ACN by 55 min and held at 10% until 60 min to equilibrate to starting conditions. The mass spectrometer used positive electrospray ionization (ESI) at an ion source temperature of 250 °C and an ionspray (Thermo Scientific™ EASY spray) voltage of 2.1 kV. The FTMS scan type was full MS/data-dependent (dd)-MS<sup>2</sup>. The parameters of the full mass scan were as follows: a resolution of 70,000, an auto gain control target under 3 × 10<sup>6</sup>, a maximum isolation time of 100 ms and an *m*/*z* range of 100–1500. The parameters of the dd-MS<sup>2</sup> scan were as follows: a resolution of 17,500, an auto gain control target under 1 × 10<sup>5</sup>, a maximum isolation time of 100 ms, a loop count of top 10 peaks, an isolation window of *m*/*z* 2, a normalized collision energy of 35 and a dynamic exclusion duration of 10 s.

#### *5.5. Metabolomics: Data Pre-Processing*

UPLC–HRMS profiles of sclerotium extracts were compiled into a representative data matrix by converting metabolite mass spectral data into mass features (consisting of a retention time and a mass/charge ratio; RT and *m*/*z*), where each metabolite was represented by one or more mass features associated with various pseudomolecular ions such as protonated mass, salt adducts, neutral losses and charged fragments. UPLC-HRMS profiles of *C. purpurea* sclerotium extracts from the four grain crops were preprocessed together. Preprocessing of the acquired Xcalibur raw data files was carried out using MZmine 2.53 [42]. Masses were detected with a noise threshold of 1 × 10<sup>4</sup>. Chromatogram building was completed with the ADAP algorithm using a minimum group size of 6, a group intensity threshold of 1 × 10<sup>6</sup> and a minimum highest intensity of 5 × 10<sup>6</sup>. RT and *m*/*z* tolerances were consistently set to 0.01 min and 0.005 *m*/*z* (or 5 ppm), respectively, throughout processing. Chromatogram deconvolution was carried out using the ADAP wavelets algorithm with a signal-to-noise threshold of 10, a minimum feature height of 2 × 10<sup>6</sup>, a coefficient/area threshold of 110, a peak duration range of 0.00–0.50 and an RT wavelet range of 0.00–0.03. Isotopic peaks were then removed, followed by alignment of peaks using the JOIN aligner method and a 20:10 ratio for *m*/*z* to RT weight. Peaks missing from the data matrix were then back filled using the same RT and *m*/*z* range gap-filling algorithm.

#### *5.6. Metabolomics: Data Reduction*

The resulting data matrix was exported from MZmine as a csv file and imported into RStudio, where all mass features were normalized by mean. Mass features with retention times below 1 min and after 6 min were also removed due to the amplification of background noise at these time points. Prominent mass features occurring above cut-off thresholds in solvent blanks were subtracted from all feature values for a given mass feature to reduce false positives as well as to remove "system peaks" via in-house scripts. Correlation analysis was also performed to group adducts and fragments. The applied correlation analysis used Pearson correlations (coefficient threshold set to 0.85) with an RT window of 0.02 to group mass features. Representative mass features for each of the groupings were selected based on evaluation of the aligned chromatograms for the most consistent peak shape, relative intensity and preference for the most likely [M + H]<sup>+</sup> ion by manual determination.
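The correlation-based grouping of adducts and fragments can be illustrated with a small example. This is a sketch only: the authors used in-house scripts that are not reproduced here, and the greedy grouping strategy, the feature records and the peak areas below are all hypothetical. Only the Pearson threshold (0.85) and the RT window (0.02) come from the text.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def group_features(features, r_min=0.85, rt_window=0.02):
    """Greedily group features that co-elute and correlate across samples.

    features: list of dicts with keys 'id', 'rt' and 'areas'.
    A feature joins the first group containing a member within the RT
    window whose areas correlate at or above r_min.
    """
    groups = []
    for f in sorted(features, key=lambda f: f["rt"]):
        for g in groups:
            if any(abs(f["rt"] - m["rt"]) <= rt_window
                   and pearson(f["areas"], m["areas"]) >= r_min for m in g):
                g.append(f)
                break
        else:
            groups.append([f])
    return [[m["id"] for m in g] for g in groups]

# Hypothetical features: two ions of compound A, plus unrelated B and C.
features = [
    {"id": "A [M+H]+", "rt": 3.10, "areas": [10, 20, 30, 40]},
    {"id": "A [M+Na]+", "rt": 3.11, "areas": [1.1, 2.0, 3.2, 3.9]},
    {"id": "B", "rt": 3.11, "areas": [40, 10, 35, 5]},
    {"id": "C", "rt": 4.50, "areas": [10, 20, 30, 40]},
]
groups = group_features(features)
print(groups)
```

Feature C has the same intensity pattern as A but elutes far outside the RT window, so it stays in its own group; B co-elutes with A but does not correlate with it.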

#### *5.7. Metabolomics: Binary Matrix Data Transformation*

Mass feature peak area values were converted to binary presence/absence matrices (to facilitate comparison between the various host crops) using a peak area threshold value of 5 (where values lower than 5 were denoted as 0 and those greater than 5 were denoted as 1). For each individual host, mass features of host crop sclerotium replicates were individually averaged to generate a single representative value for each isolate. Mass features with averaged values greater than 0.65 were denoted as 1, while lower values were set to 0, to ensure that only mass features consistently produced by isolates were included in the resulting host sclerotia binary matrix (4/6 replicates). An untargeted consensus binary heatmap representing mass feature frequency of occurrence in sclerotia produced on the four host crops was constructed to further explore the overall mass feature diversity amongst the strains and between classes. The mass features across the four host binary matrices were then summed for each isolate to create a consensus frequency phenotype matrix consisting of frequency values ranging from 0 to 4, with 0 indicating that the feature was not detected for the isolate on any of the examined hosts and 4 indicating its presence in sclerotia derived from all hosts.
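The binary transformation and the consensus frequency value can be traced for a single isolate and mass feature. In this minimal sketch the peak areas are hypothetical, the function names are illustrative, and a value exactly equal to the area threshold is treated as absent, which the text does not specify.

```python
def consistent_presence(replicate_areas, area_threshold=5, freq_threshold=0.65):
    """Return 1 if enough replicates exceed the peak area threshold, else 0."""
    calls = [1 if a > area_threshold else 0 for a in replicate_areas]
    return 1 if sum(calls) / len(calls) > freq_threshold else 0

def consensus_frequency(areas_by_host):
    """Sum the per-host binary calls for one isolate and one mass feature."""
    return sum(consistent_presence(reps) for reps in areas_by_host.values())

# Hypothetical normalized peak areas for six replicate sclerotia per host.
example = {
    "rye":       [12, 9, 30, 7, 0, 8],  # present in 5/6 replicates
    "wheat":     [6, 7, 0, 0, 9, 11],   # present in 4/6 replicates
    "durum":     [6, 0, 0, 0, 9, 11],   # present in 3/6 replicates
    "triticale": [0, 0, 0, 0, 0, 0],    # not detected
}
frequency = consensus_frequency(example)
print(frequency)
```

With six replicates, 4/6 (0.67) passes the 0.65 cut-off while 3/6 (0.50) does not, which matches the "4/6 replicates" criterion stated above.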

#### *5.8. Metabolomics: Multivariate and Univariate Analyses*

Multivariate and univariate analyses were completed using the "MUMA" R package with applied Pareto scaling and half minimum imputation on zero values. An initial unsupervised Euclidean Ward hierarchical clustering was performed individually for each host crop to reveal the underlying structure within the data. Metabolomic phenotype heatmaps were generated using the binary data matrices in the "pheatmap" R package, with row and column dendrograms calculated using "ward.D2" clustering of the Euclidean distance matrix. A second metabolomic phenotype heatmap was also generated using only mass features associated with ergot alkaloid biosynthesis to further demonstrate the dichotomy in ergot alkaloid production across the two classes. Discriminant multivariate analyses (random forest and OPLS-DA) were performed using the original non-binary data matrix to determine which mass features contributed most to class separation for each crop host. The supervised random forest analysis was completed using the "randomForest" R package with the identified classes used as training data, and the OPLS-DA was executed using the "MUMA" R package.
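Pareto scaling and half minimum imputation are both simple column-wise transformations. The sketch below shows their usual definitions in plain Python rather than the MUMA implementation; imputing per feature (rather than across the whole matrix) and using the sample standard deviation are assumptions, and the input values are hypothetical.

```python
from math import sqrt

def half_min_impute(column):
    """Replace zeros with half of the smallest non-zero value in the column."""
    nonzero = [v for v in column if v > 0]
    if not nonzero:
        return list(column)
    fill = min(nonzero) / 2
    return [v if v > 0 else fill for v in column]

def pareto_scale(column):
    """Mean-centre, then divide by the square root of the standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    if sd == 0:
        return [0.0] * n
    return [(v - mean) / sqrt(sd) for v in column]

# Hypothetical peak areas of one mass feature across four samples.
raw = [0, 4, 8, 12]
imputed = half_min_impute(raw)
scaled = pareto_scale(imputed)
print(imputed)
print([round(v, 3) for v in scaled])
```

Dividing by the square root of the standard deviation, rather than the standard deviation itself, reduces the dominance of intense features while keeping more of the original data structure than unit-variance scaling.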

In the random forest analysis, two indices (mean decrease accuracy and mean decrease Gini) are used to assess variable importance associated with class separation. Mean decrease accuracy uses permuting "out of bag" samples to compute the importance of each variable in the predictive accuracy of the random forest [43]. Mean decrease Gini is a measure of how important a variable is in contributing to node and leaf homogeneity across all of the decision trees, where the larger the Gini index value, the greater the importance the variable has in terms of classification in the random forest model [43]. Both mean decrease accuracy and mean decrease Gini were used to rank the influence of the mass features on class separation between sclerotium extracts based on variances in relative peak intensity patterns.
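The idea behind mean decrease Gini can be shown with a single split. This is a simplified illustration, not the randomForest implementation: a random forest accumulates such impurity decreases over every split that uses a given variable and averages them across trees. The class labels, intensities and cut-off below are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def gini_decrease(values, labels, cutoff):
    """Impurity decrease obtained by splitting samples at values <= cutoff."""
    left = [l for v, l in zip(values, labels) if v <= cutoff]
    right = [l for v, l in zip(values, labels) if v > cutoff]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted

labels = ["class1"] * 3 + ["class2"] * 3

# A feature whose intensity separates the classes, and one that does not.
informative = [1, 2, 3, 10, 11, 12]
uninformative = [1, 10, 2, 11, 3, 12]

dec_informative = gini_decrease(informative, labels, cutoff=5)
dec_uninformative = gini_decrease(uninformative, labels, cutoff=5)
print(dec_informative, round(dec_uninformative, 4))
```

The informative feature produces two pure child nodes and therefore the largest possible decrease for a balanced two-class node, whereas the uninformative feature leaves both child nodes mixed.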

#### *5.9. Metabolomics: Mass Feature Annotation*

Mass feature annotations were completed using an in-house *C. purpurea* reference database created from literature reports [33,44]. Putative ergot alkaloid annotations were assigned based on the mass spectral accuracy (±5 ppm) and relative elution order. MS<sup>n</sup> experiments were used to confirm ergot alkaloid annotations based on common fragmentation patterns expected from reports in the literature. MassWorks™ software (v5.0.0, Cerno Bioscience, Las Vegas, NV, USA) was used to improve spectral accuracy and confirm the molecular formulas of annotated ions. The sCLIPS searches were performed in dynamic analysis mode with allowances for the elements C, H, N and O set at a minimum of 1 and a maximum of 100. Charge was specified as 1, mass tolerance was set to 5 ppm and the profile mass range was −1.00 to 3.50 Da.
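The ±5 ppm criterion can be made concrete with a worked example for ergotamine (C33H35N5O5). The sketch below computes the theoretical [M + H]<sup>+</sup> *m*/*z* from monoisotopic atomic masses and checks a measured value against it; the observed *m*/*z* is hypothetical and the function names are illustrative.

```python
# Monoisotopic masses (Da) of the elements allowed in the formula search.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647

def mz_protonated(formula):
    """Theoretical m/z of the singly charged [M+H]+ ion for an element count dict."""
    return sum(MONO[element] * count for element, count in formula.items()) + PROTON

def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

ergotamine = {"C": 33, "H": 35, "N": 5, "O": 5}
theoretical = mz_protonated(ergotamine)

observed = 582.2725  # hypothetical measured m/z
error = ppm_error(observed, theoretical)
print(round(theoretical, 4), round(error, 2), abs(error) <= 5)
```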

To support the ergot alkaloid annotations, the obtained nanoLC-HRMS/MS data of rye sclerotia extracts from LM04 and LM60 were compared to the ESI+ LCMS spectral database downloaded from the MassBank of North America website in August 2021. MS<sup>2</sup> spectral comparisons were made using MS-DIAL [45] and again using default parameters in the GNPS web-based workflow for molecular networking [46]. Top database matches supported annotations for ergometrine, methylergometrine, ergovaline, ergosine, ergotamine, ergocornine, ergocryptine, ergocristam and ergocristine. Most ergot alkaloid peptam/peptide precursors are not represented in spectral databases at this time. Additionally, MS/MS spectral matching is unable to reliably differentiate between very similar spectra from the same parent ion *m*/*z*, such as the '-ine' and '-inine' forms of ergot alkaloids.

#### *5.10. Genomics: DNA Isolation*

Isolates were cultured on PDA, harvested and ground in liquid N2. Genomic DNA was extracted using a cetyltrimethyl ammonium bromide (CTAB) protocol [47,48]. DNA samples were run on a 1% agarose gel to check for DNA shearing and the presence of RNA, while DNA integrity was evaluated using TapeStation and Genomic DNA ScreenTapes (Agilent, Santa Clara, CA, USA) and impurities were evaluated using DropSense 16 (Trinean, Pleasanton, CA, USA). gDNA was quantified using a Qubit® 2.0 Fluorometer (Invitrogen by Life Technologies, Carlsbad, CA, USA) before submission for in-house sequencing (Molecular Technologies Laboratory, Ottawa Research & Development Centre, Agriculture and Agri-Food Canada), where DNA libraries were prepared, loaded onto a FLO-MIN106 flow cell and run with a MinION (Oxford Nanopore Technologies, Oxford, UK) for 48 h.

#### *5.11. Genomics: Genome Assembly and Ergot Alkaloid Biosynthetic Gene Cluster Annotation*

Genome assembly of LM72, LM04 and LM60 was performed with CANU v1.8 [49] using the sequenced Nanopore reads with default settings and an estimated genome size of 35 Mb. Two rounds of correction were applied to the resulting assemblies: the first round was performed using Nanopolish v. 0.13.2 [50], and the second round used Pilon v1.23 [51], which corrected the nanopolished CANU assemblies using Illumina reads that were mapped with BWA v0.7.17 [52]. Assemblies were annotated using the Funannotate v1.8.8 pipeline [53] using default settings, with predicted proteins from *C. purpurea* isolate 20.1 supplied as protein evidence to assist gene modeling. Libraries of repetitive elements were generated using RepeatModeler2 [54], and were identified, where possible, using the 2018 Repbase database of annotated transposons [55]. Repetitive elements in intergenic regions between *lpsA1* and *lpsA2* were annotated by searching for blast hits in the RepeatModeler2-generated repetitive element libraries. Illumina-based genome assemblies for LM04, LM30, LM60, LM207, LM232, LM233, LM464 and LM479 were previously published and are publicly available in the NCBI database [30].

**Supplementary Materials:** The following are available online at https://www.mdpi.com/article/10.3390/toxins13120861/s1, Figure S1: Concatenated *RPB2* and *TEF1*-*α* gene sequence MP (maximum parsimony) algorithm phylogenetic analyses of 44 *Claviceps* spp., Figure S2: Untargeted metabolomics analysis of *C. purpurea* strains grown on Durum wheat, Figure S3: Untargeted metabolomics analysis of *C. purpurea* strains grown on rye, Figure S4: Untargeted metabolomics analysis of *C. purpurea* strains grown on triticale, Figure S5: Phylogenetic tree of aligned nucleotides from C0 domains with associated intergenic 'genotypes', Table S1: Origin of *Claviceps purpurea* isolates, Table S2: Stachelhaus codes and predicted substrate specificities for all *lpsA* genes extracted from long-read genomes.

**Author Contributions:** Conceptualization, C.H., T.E.W. and D.P.O.; methodology, C.H., T.E.W., T.L. and D.P.O.; validation, T.E.W. and D.P.O.; formal analysis, C.H., T.E.W., A.S. and T.L.; investigation, C.H., T.E.W., A.S., P.S. and Z.P.; resources, P.S., Z.P., J.G.M., M.L. and D.P.O.; data curation, C.H., T.E.W. and A.S.; writing—original draft preparation, C.H., T.E.W. and D.P.O.; writing—review and editing, C.H., T.E.W., A.S., J.G.M., C.N.B., M.L. and D.P.O.; visualization, C.H., T.E.W., C.N.B. and D.P.O.; supervision, J.G.M., C.N.B., M.L. and D.P.O.; project administration, D.P.O.; funding acquisition, J.G.M., M.L. and D.P.O. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Agriculture & Agri-Food Canada (Project ID#002272: Fungal and Bacterial Biosystematics-bridging taxonomy and "omics" technology in agricultural research and regulation).

**Institutional Review Board Statement:** Not Applicable.

**Informed Consent Statement:** Not Applicable.

**Data Availability Statement:** Annotated ergot alkaloid biosynthetic gene cluster sequences for LM04, LM60 and LM72 were uploaded to the NCBI nucleotide database under the following accession numbers respectively: OK662595, OL348384, and OL348385.

**Acknowledgments:** We thank the Molecular Technologies Laboratory (MTL) at the Ottawa Research & Development Centre of Agriculture & Agri-Food Canada.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Mining Indole Alkaloid Synthesis Gene Clusters from Genomes of 53** *Claviceps* **Strains Revealed Redundant Gene Copies and an Approximate Evolutionary Hourglass Model**

**Miao Liu <sup>1,</sup>\*, Wendy Findlay <sup>1</sup>, Jeremy Dettman <sup>1</sup>, Stephen A. Wyka <sup>2</sup>, Kirk Broders <sup>3</sup>, Parivash Shoukouhi <sup>1</sup>, Kasia Dadej <sup>1</sup>, Miroslav Kolařík <sup>4</sup>, Arpeace Basnyat <sup>1</sup> and Jim G. Menzies <sup>5</sup>**


**Abstract:** Ergot fungi (*Claviceps* spp.) are infamous for producing sclerotia containing a wide spectrum of ergot alkaloids (EA) toxic to humans and animals, making them nefarious villains in the agricultural and food industries, but also treasures for pharmaceuticals. In addition to three classes of EAs, several species also produce paspaline-derived indole diterpenes (IDT) that cause ataxia and staggers in livestock. Furthermore, two other types of alkaloids, i.e., loline (LOL) and peramine (PER), found in *Epichloë* spp., close relatives of *Claviceps*, have shown beneficial effects on host plants without evidence of toxicity to mammals. The gene clusters associated with the production of these alkaloids are known. We examined genomes of 53 strains of 19 *Claviceps* spp. to screen for these genes, aiming to understand the evolutionary patterns of these genes across the genus through phylogenetic and DNA polymorphism analyses. Our results showed (1) varied numbers of *eas* genes in *C.* sect. *Claviceps* and sect. *Pusillae*, none in sect. *Citrinae*, six *idt/ltm* genes in sect. *Claviceps* (except four in *C. cyperi*), zero to one partial (*idtG*) in sect. *Pusillae*, and four in sect. *Citrinae*, (2) two to three copies of *dmaW*, *easE*, *easF*, *idt/ltmB*, *idt/ltmQ* in sect. *Claviceps*, (3) frequent gene gains and losses, and (4) an evolutionary hourglass pattern in the intra-specific *eas* gene diversity and divergence in *C. purpurea*.

**Keywords:** ergot alkaloids; ergot fungi; gene divergence; gene diversity; indole diterpenes; phylogeny; secondary metabolites

**Key Contribution:** Indole alkaloid gene clusters from a wide range of *Claviceps* spp. were identified through genome screening. Six indole diterpene/lolitrem genes, *idt/ltmP*, *Q*, *B*, *C*, *S*, and *M*, were commonly present in various species in *C*. sect. *Claviceps.* Micro-evolution of *eas* genes within *Claviceps purpurea* revealed that their evolutionary rates fit an hourglass model.

#### **1. Introduction**

Fungi in the genus *Claviceps* (Clavicipitaceae, Hypocreales, Sordariomycetes) infect the florets of cereal crops, nonagricultural grasses (Poaceae), sedges (Cyperaceae), and rushes (Juncaceae) [1], followed by occupying the unfertilized ovaries and eventually replacing the seeds with fungal resting bodies, i.e., sclerotia, known as ergots [2]. In light of molecular phylogenetics, 63 named species [3,4] are classified into four sections, i.e., *Claviceps* sect. *Claviceps, C.* sect. *Citrinae*, *C.* sect. *Paspalorum*, and *C.* sect. *Pusillae*, on the basis of morphological, ecological, and alkaloid-producing features [3]. Ergot bodies or sclerotia contain a wide spectrum of alkaloids toxic to humans and animals, making them unwelcome pathogens in agricultural and food production [5,6], but also important resources for pharmaceuticals [7,8]. Among the alkaloids produced by *Claviceps*, ergot alkaloids (EAs) are the major culprit for the mass food/feed poisoning in humans and livestock, as well as a number of tragedies in human history [9,10]. EAs are indole compounds characterized by a tricyclic or tetracyclic ring system. Over 80 different EAs found in nature fall into three structural groups: clavines, lysergic acid amides, and ergopeptines [8,11], corresponding to their structural complexity. Clavines are the intermediates or derivatives of the intermediates in the lysergic acid amide pathway, whereas ergopeptines are the most complex group [11]. Intensive investigations on biochemistry and molecular genetics have elucidated the EA biosynthetic pathways in EA producers, especially *Claviceps* spp. [12,13]. A cluster of 12 functioning EA synthesis (*eas*) genes (*cloA*, *dmaW*, *easA, easC*–*H*, *lpsA*–*C*) in *C. purpurea* strain 20.1 were considered to encode all the enzymes needed for the end-products ergotamine and ergocryptine [14]. The four early steps, requiring *dmaW*, *easF*, *easC*, and *easE*, are responsible for the closure of the third ring resulting in chanoclavine, followed by middle steps, requiring *easD, easA, easG*, and *cloA*, for forming tetracyclic clavines, and later steps for producing the lysergic acid amides, dihydroergot alkaloids, and complex ergopeptines [13] (Figure 1). Among the 12 genes, the homologs of nine were found in *C. fusiformis* in a cluster. In *C. paspali*, two additional genes (*easP* and *easO*) were found; however, *easE* was defective. The presence or absence of *eas* genes has proven to be correlated with EA profiles in several *Claviceps* spp. and strains [13,14]. However, the investigation of *eas* gene clusters in a wide range of *Claviceps* spp. is lacking, and less is reported about the evolution of the individual genes in these clusters among and within the species.

**Citation:** Liu, M.; Findlay, W.; Dettman, J.; Wyka, S.A.; Broders, K.; Shoukouhi, P.; Dadej, K.; Kolařík, M.; Basnyat, A.; Menzies, J.G. Mining Indole Alkaloid Synthesis Gene Clusters from Genomes of 53 *Claviceps* Strains Revealed Redundant Gene Copies and an Approximate Evolutionary Hourglass Model. *Toxins* **2021**, *13*, 799. https://doi.org/10.3390/toxins13110799

Received: 21 October 2021 Accepted: 10 November 2021 Published: 13 November 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by Her Majesty the Queen in Right of Canada, as represented by Agriculture and Agri Food Canada. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

**Figure 1.** The ergot alkaloid biosynthetic pathway in *Claviceps* spp. Modified from Young et al. [13] and Robinson and Panaccione [15].

Indole diterpenes (IDTs) are another large group of bioactive compounds with diverse structural variations, triggering toxicity in animals and insects through interfering with ion channels [16,17]. In the literature, there are copious reports that certain species in *Claviceps* (i.e., *C. paspali* and *C. cynodontis*) and close relatives in *Epichloë* (*Neotyphodium* as the asexual name before implementation of the International Code of Nomenclature for algae, fungi, and plants (Shenzhen Code) [18]) produce the paspaline-derived IDTs, such as paspalitrem, lolitrem, and paxilline, causing ataxia and staggers in livestock that feed on the grasses infected by those fungal species [19–21]. Biosynthetic pathways and associated gene clusters of these paspaline-derived IDTs have been investigated [22–24], resulting in the discovery of at least 10 genes involved in IDT production in *Epichloë* spp. and the prediction that *ltmG*, *M*, *C*, and *B* were responsible for the synthesis of paspaline, the basic structural backbone of IDTs, whereas *ltmP* and *Q* were essential for the production of lolitrem, and *ltmF*, *J*, *K*, and *E* were required for more complex structures [25,26]. The proposed scheme for the biosynthesis of paspalitrem in *C. paspali* involved seven genes including the initial formation of paspaline through *ltmG*, *M*, *C*, and *B*, followed by the sequential functioning steps of *ltmP*, *Q*, and *F* [22]. Recently, the pre-paspaline steps were further resolved as three sequential steps: starting from *ltmG* converting farnesyl diphosphate (FPP) to geranylgeranyl diphosphate (GGPP), followed by *ltmC* transferring GGPP to 3-geranylgeranylindole, and finally through *ltmM* and *B* yielding paspaline [27]. In addition to *C. paspali* and *C. cynodontis*, other *Claviceps* spp., i.e., *C. arundinis*, *C. humidiphila*, and *C. purpurea*, could also produce indole diterpenes or paspaline-like compounds [28–30]. The genome investigation of *C. purpurea* 20.1 revealed the presence of *ltmM*, *C*, *B*, *P*, *Q*, and an extra gene *ltmS* [14]. It is not known whether these genes are consistently present in various strains of *C. purpurea* and other *Claviceps* species. In addition, two other classes of alkaloids, i.e., lolines (LOL) and peramines (PER), produced by *Epichloë* spp., are known to function as insecticides, but are not associated with any toxicity symptoms in grazing mammals [31,32]. Given the close relationship between *Epichloë* and *Claviceps*, it is reasonable to raise the question of whether any of the loline or peramine gene homologs are present in any of the *Claviceps* spp. even though those two classes of alkaloids have not been reported in *Claviceps*.

The 'hourglass model' borrowed from ontogeny refers to the pattern in which the mid-development stages of an embryo are morphologically more conserved than the earlier and later stages, resembling an hourglass with a narrow waist but broad ends [33,34]. Before the hourglass model (HGM) was proposed in the 1990s, the early conservation model (ECM) was widely accepted, which echoed von Baer's third law [35], i.e., embryos progressively diverge in morphology during ontogeny. The debates about these two, along with other models, i.e., the adaptive penetrance model [36] and the unconstrained (random) model [37], are still ongoing, although recent evidence at molecular and genomic levels has provided support for the presence of the phylotypic stage (the waist stage of development) in fungi, insects, plants, and vertebrates [38–40]. According to Haeckel's biogenetic hypothesis, ontogeny recapitulates phylogeny [41]. The evident similarities between the development of an individual and the evolution of the whole biological system have been addressed by many generations [42] to verify that these models in ontogeny are recapitulated in other evolutionary processes. For example, studies on gene evolution in *Drosophila* spp. recaptured the hourglass model in that the early maternal genes showed a higher level of diversity than zygotic genes [43]. Here, we propose the biosynthesis of complex biological compounds as an analogy of the development of an organism, and ask whether any of the models fit the evolution of the genes involved in the biosynthesis.

The objectives of the present study were to shed light on the presence of four classes of alkaloid genes (clusters) in 53 strains of 19 *Claviceps* species, and to understand the evolutionary patterns of these genes at inter- and intraspecific levels. This information helps build the foundation for future studies on chemo- and genotype associations and for developing gene-based chemotyping and toxin detection.

#### **2. Results**

#### *2.1. Genome Assemblies*

The 37 genome sequences assembled in this study resulted in 1362 to 2581 contigs, N50 values ranged from 19,946 to 55,909 bp, and the completeness measured by Benchmarking Universal Single-Copy Orthologs (BUSCO) over the fungal database (fungi_odb10) ranged from 97% to 99.1% (Table 1; deposited in GenBank, https://www.ncbi.nlm.nih.gov/ (accessed on 9 November 2021), as accessions JAIURI000000000–JAIUSS000000000, available upon publication of the article). The quality of the assemblies was equivalent to that of the assemblies of 17 genomes from previous studies (Table 1) [44,45]. Overall, the 54 assemblies of 53 strains (two versions of assemblies for CCC1102 were included because certain genes were obtained from one or the other assembly) of 19 *Claviceps* species included in this study belong to three sections, *C.* sect. *Citrinae*, *C.* sect. *Claviceps*, and *C.* sect. *Pusillae*, and originate from six continents (Africa, Asia, Australia, Europe, North America, and South America) and from host plants in 26 genera (Table S1).
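N50, the contiguity statistic reported above, is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal sketch with hypothetical contig lengths:

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first contig that brings coverage to half the assembly.
        if running * 2 >= total:
            return length

# Hypothetical contig lengths in bp.
contigs = [100, 80, 60, 40, 20]
result = n50(contigs)
print(result)
```

Here the two longest contigs (100 + 80 = 180 bp) are the first to cover at least half of the 300 bp total, so the N50 is the length of the second of them.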

#### *2.2. Presence of Four Classes of Alkaloid Genes in 53 Genomes*

One thousand sequences of 19 loci were extracted from the 53 genome assemblies as detailed below. The DNA sequences of each gene were submitted to GenBank under the following accession numbers: *cloA* (49 sequences) MZ882098–MZ882146, *dmaW* (118 sequences) MZ871640–MZ871757, *easA* (51 sequences) MZ851397–MZ851447, *easC* (50 sequences) MZ851807–MZ851856, *easD* (51 sequences) MZ871767–MZ871817, *easE* (66 sequences) MZ877968–MZ878033, *easF* (88 sequences) MZ881959–MZ882046, *easG* (50 sequences) MZ882047–MZ882096, *easH1* (50 sequences) MZ934760–MZ934809, *easH2* (32 sequences) MZ934810–MZ934841, *lpsB* (48 sequences) MZ934842–MZ934889, *lpsC* (44 sequences) MZ934890–MZ934933, *idt*/*ltmB* (55 sequences) MZ935033–MZ935087, *idt*/*ltmC* (47 sequences) MZ935088–MZ935134, *idt*/*ltmG* (three sequences) MZ935227–MZ935229, *idt*/*ltmM* (47 sequences) MZ935135–MZ935181, *idt*/*ltmP* (46 sequences) MZ934987–MZ935032, *idt*/*ltmQ* (60 sequences) MZ934934–MZ934986, and *idt*/*ltmS* (45 sequences) MZ935182–MZ935226.

#### 2.2.1. Ergot Alkaloid Genes (eas)

More consistency in the presence/absence of *eas* genes was observed in *C.* sect. *Claviceps* than in *C.* sect. *Pusillae*. The results from BLASTn searches using an in-house script (see Section 4.2 for details) and Geneious mapping (https://www.geneious.com, accessed on 9 November 2021) with reference genes showed that the genomes of all isolates from *C.* sect. *Claviceps* contained at least 10 *eas* genes matching the *C. purpurea* 20.1 reference sequences (Table 2). *lpsA1* and *lpsA2* were excluded from the analyses because they were heavily fragmented owing to their considerable length; a study by Hicks et al. using long-read sequencing of several selected strains focused on these two genes (in this Special Issue). The 10–12 genes were assembled on two to three contigs. For most strains, nine genes (*lpsC*, *easA*, *lpsB*, *cloA*, and *easC*–*G*) were on the same contig. Genes after *dmaW*, i.e., *easH1*, *easH2*, and fragments of *lpsA*, were on different contigs. The *easH2* gene was either not detected or on a separate contig, possibly because of the long length of *lpsA1*, as it is located between *lpsA1* and *lpsA2* in the reference genome *C. purpurea* 20.1. The exceptions were *C. humidiphila* LM576, *C. spartinae* CCC535, *C. purpurea* LM461, and *C. ripicola* LM220 and LM454, in which *lpsC* was on a different contig, or *lpsC* along with the next three to four genes was on one contig separated from the other genes (Table 2).

Both inter- and intraspecific variation was observed, despite the general consistency in the presence of *eas* genes. As species-specific features, all three strains of *C. occidentalis* had two partial copies of *dmaW* (~658 bp and ~641 bp, composed of a partial exon 1 and full-length exons 2 and 3) and a single copy of all other *eas* genes except *easH*. Of note, all partial genes detected in the present study were located at the ends of contigs. Moreover, all three strains of *C. quebecensis* had a second partial, nonfunctional copy of *easE* (275, 275, and 1208 bp) and two partial copies of *easF* with good open reading frames (ORFs), and they lacked *easH2* (Table 2).


**Table 1.** Statistics of genome assemblies screened.

\* BUSCO completeness for these strains was based on the Dikaryon fungal database; see Wyka et al. [44] for details. 


**Table 2.** The *eas* gene copies and their locations in 18 species in *Claviceps* sect. *Claviceps* and sect. *Pusillae*.

\* The assembly versions: BW was from Wingfield et al. [45], SW was from Wyka et al. [44], and WF was generated in the present study; values in the cells denote contig numbers; the second contig number is preceded by "/" when the fragment is on two contigs; green color represents full-length genes, light orange represents partial or gapped sequences, and no fill represents no gene matches; hatches denote fragments containing frameshifts or internal stop codons. None of the genes were detected in *C. citrina* (*C.* sect. *Citrinae*, not listed).

Intraspecific variation among the 27 strains of *C. purpurea* was evident. Most strains contained one copy of *lpsC*, *easA*, *lpsB*, *cloA*, *easC*, *D*, *G*, *H1*, and *H2*, and two copies of *dmaW*. However, three strains (LM65, LM72, and LM582) lacked *easH2*. Eleven strains had a second copy of *easE* (*easE2*), six full- or near-full-length and five partial, but these gene fragments contained indels of various sizes and internal stop codons (Table 2). This would indicate that they may not be functional genes, unless those variations were caused by sequencing or assembly errors. In contrast, the full-length second copy of *easF* (*easF2*) from LM72 (MZ881984) and LM461 (MZ881981) had good ORFs. The *easF2* gene of the other six strains was split on two contigs with gaps in the middle. Most of these fragments, except the second exon at the 3′ end in Clav26, Clav55, and LM470, were free of internal stop codons. Four strains had a full-length (or nearly full-length) third copy, *easF3*, and one strain (LM469, 652 bp) had a partial third copy, yet these gene fragments had a number of indels and internal stop codons (Table 2). Intraspecific variation was also found in *C. arundinis* and *C. ripicola* (Table 2).

The six genomes from *C.* sect. *Pusillae* had more variable numbers of the *eas* genes observed, but all six genomes lacked *lpsC* and *easH2* (Table 2). The strain *C. lovelessii* CCC647 had the highest number of matches, i.e., 10 full- or near-full-length matches (*cloA* 1788 bp, *easD* had an 8 bp short gap at split region), while all but *easH1* and *lpsB* had good ORFs. In contrast, *C. digitariae* CCC659 had only two gene matches: *dmaW* and *easA*, but both were full-length with good ORFs. *C. maximensis* CCC398 and *C. citrina* CCC265 (*C.* sect. *Citrinae*) had no matches for any *eas* genes (Table 2).

Examining each *eas* gene individually, *easA* was the most consistently present: it occurred as a single copy in 51 of 53 genomes and had good ORFs, except for the copy in *C. pusilla* CCC602, which had an internal stop codon. Similarly, *lpsB*, *cloA*, *easC*, *easD*, and *easG* were present as a single copy in all species of sect. *Claviceps* and in two to four species of sect. *Pusillae* (Table 2).

For *easE*, all species in sect. *Claviceps* contained at least one copy. Six strains of *C. purpurea* (LM39, LM63, LM72, LM461, LM469, and LM474) had a full-length second copy (*easE2*), whereas five other strains of *C. purpurea* (Clav04, Clav46, Clav52, Clav55, and LM470), all three *C. quebecensis*, one *C. spartinae*, and one *C. monticola* had a partial second copy. Compared with the *C. purpurea* 20.1 *easE1* reference sequence, all the *easE2* sequences contained a large number of deletions (gaps) of various sizes in exon and intron regions, internal stop codons, and no start codon, indicating that they are likely not functional. For species in sect. *Pusillae*, one copy of *easE* with a good ORF was found in four species (*C. africana* CCC489, *C. lovelessii* CCC647, *C. pusilla* CCC602, and *C. sorghi* CCC632).
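The criteria used above to judge whether a copy is likely functional (presence of a start codon, no frameshift, and no internal stop codons) can be illustrated with a short sketch. It is not the procedure used in this study; it assumes that the input is an already spliced coding sequence (introns removed) in the correct reading frame, and the function name and example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard nuclear genetic code


def orf_report(cds):
    """Summarise basic ORF integrity checks for a spliced coding sequence."""
    cds = cds.upper().replace("-", "")  # drop alignment gaps, if any
    usable = len(cds) - len(cds) % 3    # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # Stop codons anywhere except the final position count as internal.
    internal_stops = [i for i, c in enumerate(codons[:-1]) if c in STOP_CODONS]
    return {
        "has_start": bool(codons) and codons[0] == "ATG",
        "length_multiple_of_3": len(cds) % 3 == 0,
        "internal_stops": internal_stops,
        "ends_with_stop": bool(codons) and codons[-1] in STOP_CODONS,
    }


# Hypothetical toy sequences, for illustration only.
print(orf_report("ATGGCTTAA"))     # intact toy ORF
print(orf_report("ATGTAAGCTTGA"))  # internal stop at codon index 1
```

A length that is not a multiple of three after splicing is only a crude indicator of a frameshift; indels relative to a reference are better assessed on an alignment.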

For *easF*, all species in sect. *Claviceps* contained at least one copy; however, the copies in two strains of *C. purpurea* (Clav55 and LM470) had internal stop codons near the 3′ end. Twenty-three strains of seven species (*C. arundinis*, *C. humidiphila*, *C. monticola*, *C. pazoutovae*, *C. purpurea*, *C. quebecensis*, and *C. spartinae*) had a second full-length or partial copy, among which 19 strains had good ORFs. In addition, a third copy was found in some *C. purpurea* strains, either full-length (LM39, LM63, and LM65) or partial (LM461 and LM469). Despite 77–93% similarity to *C. purpurea* 20.1 *easF1*, none of the third copies had a correct open reading frame, i.e., they are not functional (Table 2). Three species in sect. *Pusillae* (*C. africana*, *C. lovelessii*, and *C. sorghi*) had one functional copy.

For *dmaW*, most species (strains) in sect. *Claviceps* contained two full-length copies or copies split on two contigs with gaps. Six strains of *C. purpurea* (Clav26, Clav52, LM223, LM232, LM4, and LM470) had a partial second copy, but all three strains of *C. occidentalis* had partial sequences (~650 bp) for both copies. One strain of *C. arundinis* (CCC1102) had a third copy in full length, with 81% and 83% similarities with *dmaW1* and *dmaW2*, but frameshifts and internal stop codons were present. Five species in sect. *Pusillae*, except *C. maximensis,* had one copy.

Interestingly, the additional copies of *easE*, *easF*, and *dmaW* were more or less clustered together, such that the second copies of all three genes were present on the same contig in *C. monticola* CCC1483 and *C. spartinae* CCC535 (Figure 2A). Alternatively, the *easF2* sequence was split on two contigs, located with *easE2* on one contig and *dmaW2* on the other, i.e., in *C. purpurea* Clav55 and *C. quebecensis* Clav32, Clav50, and LM458 (Figure 2B). More commonly, *easE2* was on the same contig as *easF2*, whereas *dmaW* was on another contig, such as in seven strains of *C. purpurea* (Clav46, Clav52, LM470, LM474, and LM72; Table 2), or *easF2* was co-located with *dmaW* when *easE* was a single copy (LM583; Figure 2C). When a third copy of *easF* was present, it was often on the same contig as *dmaW2*, i.e., in *C. purpurea* LM39, LM63, LM65, and LM469 (Figure 2D). The arrangement in LM461 was more peculiar in that the second copies of *easE* and *easF* were on the same contig as *dmaW1* and *easG* (a single-copy gene), which indicates that they may all be on the primary ergot alkaloid gene cluster (Figure 2E). The third *dmaW* from CCC1102 (from the SW assembly) was not connected to other *eas* genes (Table 2).


**Figure 2.** The schematic arrangements of multiple copies of *easE, easF*, and *dmaW* in relation to the primary cluster of other *eas* genes. The dark solid bars denote the contigs, while gray boxes represent genes labeled accordingly, with the ranges underneath. The lengths of genes and spaces are in approximate scale. The dashed bars and genes on them indicate that those genes are on the same contig; however, the details are not displayed. (**A**–**E**) represent different patterns of locations (see the text Section 2.2.1 for details).

For *easH*, *easH1* was present in 50 genomes, i.e., all except *C. citrina*, *C. digitariae*, and *C. maximensis*; however, the genes of four species (CCC489, CCC602, CCC632, and CCC647) in *C.* sect. *Pusillae* had numerous indels of various sizes throughout the sequence, causing frameshifts and internal stop codons. Further validation of the sequences is needed to confirm whether these genes are functional. The *easH2* gene was present in 32 strains of six species (*C. arundinis*, *C. humidiphila*, *C. occidentalis*, *C. perihumidiphila*, *C. purpurea*, and *C. ripicola*). The reference sequence of *easH2* from *C. purpurea* 20.1 was 840 bp, which is about 100 bp shorter than *easH1* (945 bp), and it was considered a pseudogene. Our results showed that the 32 *easH2* sequences had variable lengths and high levels of nucleotide variation (see further notes in the later sections on phylogenies and gene diversity). Most of these sequences appeared not to be functional; however, the sequences from two strains of *C. ripicola* (LM218 and LM220) were 954 bp long and contained full-length ORFs, indicating that they are likely functional genes.

For *lpsC*, at least one strain per species in sect. *Claviceps* (except *C. perihumidiphila*) showed one copy of *lpsC*, i.e., in total, 43 out of 46 strains contained a single copy of *lpsC*, among which three strains of *C. purpurea*, i.e., Clav26, LM4, and LM232, had a single internal stop codon; otherwise, the full range of sequences aligned very well with the reference. It is possible that the single internal stop codon could be a sequencing error. Another five strains/species, including *C. capensis* CCC1504*, C. cyperi* CCC1219*, C. humidiphila* LM576, *C. monticola* CCC1483, *C. purpurea* LM223, and *C. spartinae* CCC535 had partial sequences 1000–5000 bp long. These sequence fragments contained several indels and internal stop codons, and they are apparently not functional genes. Only one strain of *C. perihumidiphila* lacked *lpsC*.

#### 2.2.2. Indole-Diterpene/Lolitrem (idt/ltm) Genes

Compared with *eas* genes, the presence/absence and copy numbers of *idt/ltm* genes were less variable. Through mapping genome assemblies to the reference genes, all members in sect. *Claviceps* had one copy of *ltmC*, *M*, *P*, and *S* and one or two copies of *idt/ltmB* and *Q*, except *C. cyperi* CCC1219 that lacked *ltmQ* and *S*. All members in *C.* sect. *Pusillae* had no matches to any *ltm* genes, whereas members of sect. *Citrinae* (*C. citrina* CCC265) had full-length matches with *ltmB*, *C*, *G*, and *M*.

Notable species-specific features were that all three strains of *C. occidentalis* (LM77, LM78, and LM84) had two partial copies of *ltmQ* (1517–1518 bp), and that *C. arundinis* (CCC1102, LM583), *C. perihumidiphila* (LM81), *C. ripicola* (LM218, LM219, LM220, and LM454), and *C. spartinae* (CCC535) had two functional copies of *ltmB* (Table 3). The translated sequences of *ltmS* from three strains of *C. occidentalis* (LM77, LM78, and LM84) and three strains of *C. quebecensis* (Clav32, Clav50, and LM458) were 14 amino-acid residues longer than those of other species, and those 14 amino acids were identical among the six strains.

Intraspecific variation was observed in *C. purpurea*; four out of 27 strains showed a second copy of *ltmQ* (Table 3). In strain Clav04, the fragment on the primary cluster (contig 130), *ltmQ1*, was a partial copy, whereas the copy on contig 637 was full-length (*ltmQ2*) with a good ORF. Clav46 had two partial copies; ironically, the copy on contig 43 (where all other *ltm* genes were co-located) had a number of short deletions causing frameshifts and internal stop codons, whereas the copy on contig 229 had good ORFs, except that the first 243 bp (including 53 residues in exon 1 and part of exon 2) were missing. On the other hand, some of the single-copy *ltmQ* sequences, such as those in *C. arundinis* CCC1102, *C. pazoutovae* CCC1485, *C. perihumidiphila* LM81, *C. purpurea* LM72, *C. quebecensis* Clav32 and LM458, *C. ripicola* LM218 and LM219, and *C. spartinae* CCC535, had varied numbers of indels causing frameshifts and internal stop codons; however, phylogenetically, they still belonged to copy 1 (more details in Section 2.3).

All six genes were clustered on the same contig in 29 strains of the 12 species in sect. *Claviceps*; otherwise, at least three genes were on the same contig. The clustered six *ltm* genes were arranged in the same order as in *C. purpurea* 20.1 [14] (Table 3; gene coordinates are not shown). In *C. citrina*, *ltmB* and *C* were on the same contig (1947), whereas *ltmM* and *G* were on separate contigs, so it could not be determined whether they were in one cluster. In general, the intergenic sequences ranged from 500 to 1200 bp; however, several strains had very long spacers between *ltmP* and *B*, such as 4 kb in *C. ripicola* LM220 and over 2 kb in LM218 and in *C. arundinis* CCC1102 and LM583 (results not shown).


**Table 3.** The *idt/ltm* gene copies and their locations in *C.* sect. *Claviceps* and sect. *Citrinae*.

\* The assembly versions: BW was from Wingfield et al. [45], SW was from Wyka et al. [44], and WF was generated in the present study; values in the cells denote contig numbers; two values connected by "/" indicate that the fragment was on two contigs; green color represents full-length genes, light orange represents partial or gapped sequences, and no fill represents no gene matches; hatches denote fragments containing frameshifts or internal stop codons. None of the *idt/ltm* genes were detected in *C.* sect. *Pusillae* except for two short fragments of *ltmG* from *C. maximensis* CCC398 and *C. digitariae* CCC659 found by a low-stringency search, which are not listed (see also Section 2.2.2).

Through additional BLAST searches with lower stringency (E-value < E−50), fragments of 483 and 501 bp of *ltmG* from *C. maximensis* CCC398 and *C. digitariae* CCC659, respectively, were retrieved by using *ltmG* from *C. paspali* RRC-1481 as the query. Over the aligned region, they were 76% and 78% similar, respectively, to the reference sequence (comparable to the 74% similarity between *C. citrina* CCC265 and *C. paspali* RRC-1481). BLAST searches of these two fragments against the NCBI database indicated that 60 bp of the 483 bp from *C. maximensis* matched *Beauveria bassiana* ARSEF 2860 geranylgeranyl pyrophosphate synthetase, and 279 of the 501 bp from *C. digitariae* matched *idtG* (geranylgeranyl diphosphate synthase) from *Periglandula ipomoeae* strain IasaF13.
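As an illustration of how such a stringency threshold can be applied, the sketch below filters BLAST hits in the standard 12-column tabular format (`-outfmt 6`) by E-value. It is not the in-house script used in this study; the cutoff is interpreted here as 1e-50, and the query and subject names in the example lines are hypothetical.

```python
def filter_blast_hits(lines, max_evalue=1e-50):
    """Keep hits from BLAST tabular output (outfmt 6) below an E-value cutoff.

    Column order assumed: qseqid, sseqid, pident, length, mismatch,
    gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        evalue = float(fields[10])
        if evalue < max_evalue:
            hits.append({
                "query": fields[0],
                "subject": fields[1],
                "pident": float(fields[2]),
                "length": int(fields[3]),
                "evalue": evalue,
            })
    return hits


# Hypothetical example lines, for illustration only.
example = [
    "ltmG_ref\tcontig_12\t76.0\t483\t110\t6\t1\t483\t2001\t2483\t3e-60\t240",
    "ltmG_ref\tcontig_87\t71.5\t150\t40\t3\t10\t159\t55\t204\t2e-20\t95.3",
]
for hit in filter_blast_hits(example):
    print(hit["subject"], hit["pident"], hit["length"])
```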

#### 2.2.3. Loline Alkaloid (lol) and Peramine (per) Genes

All the searches with *lol* and *per* reference genes resulted in no hits, except for the low-stringency BLAST with *lolC*, which matched short sequences (~150–180 bp) to the start of the fifth exon in seven species (strains): *C. africana* (CCC485), *C. citrina* (CCC265), *C. digitariae* (CCC659), *C. lovelessii* (CCC647), *C. maximensis* (CCC398), *C. pusilla* (CCC602), and *C. sorghi* (CCC632). These fragments matched with 80% to 92% identity to *O*-acetylhomoserine from *Purpureocillium lilacinum* (XM_018324292), *Drechmeria coniospora* (XM_040800194), and *Verticillium dahliae* (XM_009654023) in the NCBI database (https://blast.ncbi.nlm.nih.gov/Blast.cgi, accessed in August 2021). These sequences were not submitted to GenBank because of their short length.

#### *2.3. Phylogenies of eas and idt/ltm Genes*

The individual phylogenetic trees of 11 *eas* genes all agreed on the long-branched separation between *C.* sect. *Pusillae* and sect. *Claviceps*, which was congruent with the pattern inferred by the previous multigene analyses combined with morphological, ecological, and metabolic features [3] and supported by the phylogenomic analyses [44] (Figure 3a). In *C.* sect. *Pusillae*, all genes agreed on the close proximity of *C. fusiformis*, *C. lovelessii*, and *C. pusilla*, as well as of *C. africana* and *C. sorghi*. The main incongruence among the gene trees appeared in the uncertain placements of *C. digitariae* and *C. paspali*, as well as in the variable relationships among *C. fusiformis*, *C. lovelessii*, and *C. pusilla*, which could be a result of insufficient sampling (see further explanation in Section 3; Figures 3b–d and S1).

In terms of the species relationships in sect. *Claviceps*, considering single-copy genes, a majority of gene trees agreed on the grouping of the four major clades inferred by the previous phylogenomic study [44]. For convenience, we named them Batches to avoid confusion with the species level and the general use of "clade": Batch humidiphila including *C. arundinis*, *C. humidiphila*, and *C. perihumidiphila*; Batch purpurea including *C. capensis*, *C. monticola*, *C. pazoutovae*, and *C. purpurea* (previously designated as Clade purpurea by Píchová et al. [3]); Batch occidentalis including *C. occidentalis* and *C. quebecensis*; and Batch spartinae including *C. ripicola* and *C. spartinae* (Figures 3a and S1). The exceptions were *C. perihumidiphila* and *C. cyperi*, which had uncertain placements on different gene trees (Figure S1b,d,f,g). The more notable disparities among the gene trees appeared in the order of divergence of the four Batches from *C.* sect. *Pusillae* or sect. *Paspalorum* (Figures 4, S1 and S2). Previous phylogenomic analyses resulted in a twice-bifurcating topology, ((Batch humidiphila)(Batch spartinae); (Batch occidentalis)(Batch purpurea)) [44], and this pattern was only supported by *easG* (Figure 4a). The *easA* tree varied slightly in that Batch humidiphila was an earlier-diverged lineage than Batch spartinae, so that these two formed a paraphyletic group instead of a monophyletic group (Figure 4b). All other genes supported the derived position of Batch humidiphila and Batch spartinae (Figure 4c–e). Furthermore, eight genes (*cloA*, *dmaW1*, *easC*, *easE1*, *easH1*, *lpsC*, and *ltmB1*) placed Batch purpurea at a more ancestral position than Batch occidentalis, whereas six genes (*easF1*, *lpsB*, *ltmM*, *ltmP*, *ltmS*, and *ltmQ1*) reversed the divergence order of these two Batches (Figure 4c,d). The other three genes (*easD*, *lpsC*, and *ltmC*) showed an unresolved order of divergence (Figure 4e).

As for genes with multiple copies, the most complex was *dmaW*. The *dmaW2* sequences were separated into two groups. Group I included 16 strains of eight species (all non-*C. purpurea* *dmaW2* except *C. monticola* CCC1483 and *C. pazoutovae* CCC1485), forming a lineage parallel to their *dmaW1* counterparts and representing one gene duplication at node <sup>1</sup> (Figures 5a and S2a). Group II included *C. purpurea*, *C. monticola* CCC1483, and *C. pazoutovae* CCC1485, as well as one sequence each of *dmaW1* (LM60) and *dmaW3* (CCC1102). This group diverged from *C. purpurea* *dmaW1*, representing the second duplication at node <sup>2</sup>. Within group II, the otherwise consistent close relationship between *C. monticola* and *C. pazoutovae* was broken by seven strains of *C. purpurea*. This can be explained by a third duplication at node <sup>3</sup>. The presence of *dmaW3* of *C. arundinis* CCC1102 and *dmaW1* of *C. purpurea* LM60 in group II indicated extra duplication events at nodes <sup>4</sup> and <sup>5</sup> (Figure 5a).

The second and third copies of *easF* (*easF2*, *easF3*) grouped in one clade that diverged from *C. cyperi* *easF1*. Within this clade, *C. purpurea* *easF2* (14 strains) appeared as a paraphyletic group, from which diverged a clade composed of *C. purpurea* *easF3* (five strains) and a subclade of *easF2* from *C. quebecensis*, *C. humidiphila*, *C. arundinis*, *C. spartinae*, *C. pazoutovae*, and *C. monticola*. From this tree topology, at least two gene duplication events were inferred (Figures 5b and S2b).

**Figure 3.** (**a**) The hypothetical species relationships of *Claviceps* spp. inferred from orthologous genes by Wyka et al. [44]. (**b**–**d**) Variant species relationships in sect. *Pusillae* summarized from the phylogenies inferred from each *eas* gene tree (Supplementary Figures S1 and S2). The thickened branches denote bootstrap values >80%. The letters next to thick branches denote the genes supporting the grouping, abbreviated as *A*, *C*–*H1* = *easA*, *easC*–*H1*; *cl* = *cloA*; *W1* = *dmaW1*. Dashed branches indicate that the taxon was present only on the gene trees listed after the species name. *lpsC* and *lpsB* are not listed here because only one or three sequences were available on the trees. DNA sequences of *C. fusiformis* and *C. paspali* were from GenBank EU006773 and JN613321.

**Figure 4.** (**a**–**e**) Varied species relationships in sect. *Claviceps* summarized from phylogenetic trees of *eas* and *ltm* genes by PhyML analyses (the full trees are provided in Supplementary Figures S1 and S2). The thick branches denote bootstrapping values >80%. The letters beside the thick branches indicate that those genes had strong support for those branches; otherwise, all genes listed below the figure had strong support.

The second copy of *easE* (*easE2*) from 16 samples grouped into one clade, which diverged from *easE1* of *C. occidentalis*. However, within the *easE2* clade, *C. purpurea* samples were separated into two subclades. The sample Clav04 appeared as an orphan clade located close to *C. quebecensis* *easE2*, and another 10 samples grouped together and had affinity with *C. monticola* *easE2*, indicating that historical gene duplications possibly occurred twice, at nodes <sup>1</sup> and <sup>2</sup> (Figures 5c and S2c).

The second copies of *easH* (*easH2*) fell into three groups that diverged independently. Group I includes two strains of *C. ripicola* (LM218 and LM220) and diverged from *easH1* of the clade composed of *C. capensis*, *C. monticola*, and *C. pazoutovae*. As noted earlier, the *easH2* sequences from these two strains are similar in length to *easH1* and contain good ORFs, indicating that they likely arose from a very recent gene duplication. Group II, including three strains of *C. occidentalis*, one strain each of *C. arundinis*, *C. humidiphila*, and *C. perihumidiphila*, and 15 strains of *C. purpurea*, diverged from the *easH1* clade composed of eight species in sect. *Claviceps* (*C. occidentalis*, *C. cyperi*, *C. quebecensis*, *C. perihumidiphila*, *C. ripicola*, *C. spartinae*, *C. arundinis*, *C. humidiphila*, and *C. purpurea*). Group III, including nine strains of *C. purpurea* and the reference sequence of *C. purpurea* 20.1 *easH2*, diverged within the clade of *C. purpurea* *easH1* (Figures 5d and S2d).


**Figure 5.** The simplified phylogenies of individual multicopy genes showing potential duplication events. The unedited trees generated by PhyML are presented in Supplementary Figure S2. (**a**) *dmaW*, (**b**) *easF*, (**c**) *easE*, (**d**) *easH*, (**e**) *ltmB*, and (**f**) *ltmQ*. The thickened branches indicate bootstrap values ≥80%; dashed and hatched branches are drawn shorter than their real length. The lineages that are not shaded gray are the first copies of each gene.

For *idt/ltm* genes, the second copies of *ltmB* can be considered as one group arising from one gene duplication, except that *ltmB1* of *C. humidiphila* LM576 was placed in this group. This sequence was the only copy detected in LM576 and, therefore, labeled as copy one. However, it was on a separate contig (contig 478), clustered with neither *ltmP* and *ltmQ* (contig 945, Table 3) nor *ltmC, ltmS*, and *ltmM* (contig 745). It is very likely that this represents the second copy of this gene, and copy one was either lost or not detected (Figures 5e and S2e).

The three partial *ltmQ2* genes from three strains of *C. occidentalis* grouped closely with a clade composed of four *C. purpurea* *ltmQ1* sequences (Clav04, Clav46, LM71, and LM72) and two *ltmQ2* sequences (Clav55 and LM461) (Figures 5f and S2f). As noted earlier in Section 2.2.2, *ltmQ1* of Clav04 and Clav46 was either a partial gene or a nonfunctional gene, respectively, whereas the second copies were functional genes. Here, *ltmQ2* of Clav04 and Clav46 grouped in *C. purpurea* *ltmQ1* clade 1. This situation can be explained by a scenario in which these two copies switched locations because of assembly errors. For the other two sequences, *ltmQ1* of LM71 was on a different contig from the other *ltm* genes, and in LM72, the gene was split across two contigs, where one half was connected with *ltmP*, while the other half was independent. Overall, these four sequences appeared to be the same copy as *C. purpurea* *ltmQ2* (Clav55 and LM461). If that is the case, one gene duplication event possibly happened at node <sup>1</sup>. Alternatively, the *ltmQ2* of Clav04 and Clav26, as well as the two *ltmQ2* groups, could have resulted from independent gene duplications (Figures 5f and S2f). Long-read sequencing, e.g., Nanopore or PacBio, could bring more insight by ruling out possible assembly errors.

#### *2.4. Intraspecific Genetic Variation within C. purpurea*

Overall, the haplotype diversities (Hd) of *eas* genes ranged from 0.936 to 1 (close to saturation), except for *easH2* that had a lower value, 0.858. Nucleotide diversity (Pi) of *eas* genes ranged from 0.08 (*easD*) to 0.168 (*easH2*), the average number of nucleotide difference (K) ranged from 7.1510 (*easD*) to 212.238 (*easE2*), tree-based divergence from COT ranged from 0.06 (*easA* and *easD*) to 0.150 (*easH2*), and tree-based diversity ranged from 0.01 (*easD*) to 0.219 (*easE2*). In general, *easD* and *easA* had lower values for divergence and/or diversity. The second copies of *dmaW*, *easE*, *easF*, and *easH* had much higher values of the four parameters. Some of those genes may not function and, therefore, had fewer functional constraints. If only the first copy of the genes was considered, the genes with the highest diversity and divergence values were Pi 0.03 (*dmaW1*), K 92.379 (*lpsC*), tree-based divergence from COT 0.0025 (*dmaW1*), and tree-based diversity 0.038 (*dmaW1*). The two genes functioning in the middle of the pathway, i.e., *easA* and *easD*, were observed to be the most conserved genes compared with the other genes in the earlier or later steps (Table 4, Figure 6a).

Compared with the first copy of *eas* genes, *idt*/*ltm* genes had a similar level of the highest diversity and divergence. Pi ranged from 0.007 (*ltmM* and *ltmS*) to 0.02 (*ltmQ1*), average number of nucleotide difference (K) ranged from 6.839 (*ltmS*) to 41.486 (*ltmQ1*), tree-based divergence from COT ranged from 0.005 (*ltmM*) to 0.066 (*ltmB1*), and tree-based diversity ranged from 0.009 (*ltmM*) to 0.04 (*ltmQ*) (Table 4, Figure 6b).
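To make the polymorphism statistics above concrete, the following is a minimal sketch of how haplotype diversity (Hd), the average number of nucleotide differences (K), and nucleotide diversity (Pi) can be computed from an alignment. The values in this study were estimated with DnaSP; the code below is an independent illustration that assumes complete deletion of alignment columns containing gaps or ambiguous bases, and the toy alignment is hypothetical.

```python
from collections import Counter
from itertools import combinations


def diversity_stats(aligned):
    """Return (Hd, Pi, K) for a list of equal-length aligned DNA sequences."""
    n = len(aligned)
    if n < 2:
        raise ValueError("at least two sequences are required")
    aligned = [s.upper() for s in aligned]
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    # Complete deletion: keep only columns where every sequence has A, C, G, or T.
    keep = [i for i in range(length) if all(s[i] in "ACGT" for s in aligned)]
    if not keep:
        raise ValueError("no usable alignment columns")
    seqs = ["".join(s[i] for i in keep) for s in aligned]
    # K: mean number of differing sites over all sequence pairs.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Pi: K expressed per site.
    pi = k / len(keep)
    # Hd = n/(n-1) * (1 - sum of squared haplotype frequencies).
    counts = Counter(seqs)
    hd = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))
    return hd, pi, k


# Hypothetical toy alignment, for illustration only.
hd, pi, k = diversity_stats(["ACGT", "ACGA", "ACGT"])
print(round(hd, 3), round(pi, 3), round(k, 3))
```

An Hd close to 1 means that nearly every sequence is a distinct haplotype, which is the sense in which the *eas* values above are described as close to saturation.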

**Figure 6.** Nucleotide diversity and tree-based diversity and divergence for individual *eas* genes (**a**) and *idt/ltm* genes (**b**). Error bars denote the standard deviation for Pi and standard error for the other two parameters. The genes are arranged from top to bottom according to their order in the biosynthetic pathway. *ltmS* is not included in the chart as its function is unknown.


**Table 4.** Nucleotide polymorphism, tree-based divergence, and diversity of ergot alkaloid (*eas*) and indole-diterpene/lolitrem (*idt*/*ltm*) synthesis genes in *C. purpurea*.

<sup>1</sup> Sequences with large gaps causing a significant reduction in the number of sites were excluded from the analyses. <sup>2</sup> Tree-based divergence from the center of tree (COT) and diversity were estimated by DIVEIN; other parameters were estimated by DnaSP.

#### **3. Discussion**

#### *3.1. Correlations between the Presence/Absence of Alkaloid Genes and Alkaloid Production*

Attempts to induce EA production for pharmaceutical purposes (see the review by Flieger [46]) have shown that different ergot species produce different types of ergot alkaloids. At the same time, mycologists explored the use of alkaloid chemistry for characterizing *Claviceps* species [47,48]. Pažoutová and colleagues [49] differentiated chemoraces using the qualitative and quantitative features of EA production. A systematic study on EA production in 43 *Claviceps* species confirmed that ergopeptides were produced only by members of *C.* sect. *Claviceps*, whereas dihydroergot alkaloids (DH-ergot alkaloids) were produced only by certain members of *C.* sect. *Pusillae*, i.e., *C. africana*, *C. gigantea*, and *C. eriochloe*. Sixteen out of 28 species in *C.* sect. *Pusillae* were shown not to produce any EAs, including *C. maximensis*, *C. pusilla*, and *C. sorghi*. Species producing only clavines included *C. fusiformis*, *C. lovelessii*, and three other species [3]. More recent studies demonstrated that the indole alkaloid profiles supported the recognition of new species based on molecular and ecological data [29,30].

The EA genes detected in the present study were, for the most part, consistent with the known EA production of the included species. For example, *C. africana* CCC489 had eight genes detected (lacking *cloA*, *easH2*, *lpsB*, and *lpsC*), and all appeared to be functional, consistent with its production of DH-ergot alkaloids. Similarly, in *C. lovelessii* CCC647, ten EA genes were detected (lacking *lpsC* and *easH2*); however, *easH1* and *lpsB* had mutations resulting in a number of internal stop codons, which is consistent with the production of clavines, products of the early pathway [3]. A lack of EA production corresponded to no matches for any EA genes in *C. maximensis* CCC398 and *C. citrina* CCC265 (*C.* sect. *Citrinae*). However, for *C. pusilla* and *C. sorghi*, several functional genes were detected even though no EA production has been reported [3]. In *C. pusilla* CCC602, eight genes had full-length matches (*dmaW1*, *easA*, *C*, *D*, *E*, *G*, and *H1*, and *lpsB*) and one had a partial match (*cloA*, 332 bp), but only *dmaW1*, *easC*, and *easE* had ORFs. The lack of *easF*, which encodes dimethylallyltryptophan *N*-methyltransferase for the second step in the pathway, might explain the lack of EA production. *C. sorghi* CCC632 had seven full-length matches (*dmaW1* and *easA*, *C*, *E*, *F*, *G*, and *H1*) and two partial matches (*cloA*, 435 bp, and *easD*, 653 bp). Except for *cloA* and *easH1*, all genes had good ORFs. Theoretically, at least chanoclavine should be produced unless those genes were not expressed, possibly because of a lack of triggers from physical or environmental conditions [50].

Only members of *C.* sect. *Claviceps* had *lpsC* and *easH2*, although *C. perihumidiphila* LM81, one strain of *C. ripicola* (LM454), and *C. arundinis* (LM583) lacked *lpsC*, and *C. capensis*, *C. cyperi*, *C. humidiphila*, and *C. monticola* had a partial *lpsC*. Moreover, three *C. purpurea* strains (LM65, LM72, and LM582) and three *C. quebecensis* strains (Clav32, Clav50, and LM458) lacked *easH2*. Whether the absence of these genes causes variation in their EA profiles requires a systematic investigation of the associations between *eas* genes and products in those species. It is worth noting, however, that the possibility of false negatives in genome screening cannot be ruled out. For instance, for *C. arundinis* CCC1102, *lpsC* was detected in the WF version of the genome assembly (created in the present study), but not in the previous version (SW [44], Table 2). The opposite also occurred in that a full-length *dmaW3* was detected in the SW assembly, but only a partial sequence (360 bp) was detected in the WF assembly (this study).

The production of indole diterpenoid compounds in ergot fungi was reported in a small number of species, i.e., *C. arundinis*, *C. cynodontis*, *C. humidiphila*, *C. paspali*, and *C. purpurea* [21,28–30]. Our genome mining showed that *ltmQ*, *P*, *B*, *C*, *M*, and *S* were present in all species in *C.* sect. *Claviceps* except *C. cyperi*. Furthermore, *ltmB*, *C*, and *M* and a nonfunctioning *ltmG* were detected in *C. citrina*, while a partial *ltmG* was detected in *C. maximensis* CCC389 and *C. digitariae* CCC659. According to the proposed pathway, to produce paspaline, the first step requires *ltmG*, followed by *ltmC*, *ltmM*, and *ltmB* [27]. The absence of *ltmG* could stop production unless GGPP is present through other resources. This might be the case in the producers of indole diterpenoid compounds listed above. In the same way, it is very likely that most of the species in *C.* sect. *Claviceps* and the three species in sect. *Citrinae* and sect. *Pusillae* could also produce some forms of indole diterpenoid compounds.

#### *3.2. Macro-Evolution of the Gene Clusters—Frequent Gene Duplications and Losses*

Ergot alkaloid diversity among diverse producers, i.e., species in Hypocreales, Eurotiales, and Xylariales, was formed by three major processes: gene gains, gene losses, and gene sequence changes [13,14]. This is true within the genus *Claviceps*. A recent genus-level genome comparison hypothesized that unconstrained tandem gene duplications were caused by putative loss of repeat-induced point mutations in *C.* sect. *Claviceps* [44]. This pattern of duplication was confirmed here by the presence of a cluster of second or third copies of *easE*, *easF*, and *dmaW*, as well as second copies of *ltmQ* and *B* (Tables 2 and 3). Moreover, *easE2* and *F2* of *C. purpurea* LM461 were on the same contig as *easG* and partial *dmaW1*, suggesting that the second copies of *easE* and *F* were arranged on the primary cluster, possibly as a result of tandem gene duplication. None of the extra gene copies were found in *C.* sect. *Pusillae* or sect. *Citrinae*, consistent with a previous observation that the genomes of sect. *Pusillae* and sect. *Citrinae* had far fewer predicted gene duplication events [44]. According to the phylogenies of multicopy genes, one to five gene duplications can be inferred for individual genes. The *dmaW* gene, encoding the enzyme for the first and determinant step of EA production, had the highest number of potential gene duplications. Even though the presence of *dmaW* was conserved across various EA producers and proven to be a monophyletic group [51], its evolutionary rate was faster than that of genes in the middle steps of the EA pathway.

Gene losses can be inferred through the discrepant placement of certain gene copies on the phylogenies. For instance, one copy of *ltmB* in *C. humidiphila* LM576 was detected; however, this copy grouped with *ltmB2*. It is very likely that this was the second copy of the *ltmB* gene, and the first copy was either lost or not detected (Figure 5e, see also Sections 2.2.2 and 2.2.3). The *ltmQ1* from four strains of *C. purpurea* (LM71, LM72, Clav04, and Clav46) was placed in the *ltmQ2* clade. For LM71 and LM72, there was only one copy detected (*ltmQ1*); the scenario is likely similar to *ltmB* of LM576, where this single copy was the second copy, and the original gene was either lost or not detected (Figure 5f). On a related note, *ltmQ2* of Clav04 and Clav46 was located in the *ltmQ1* clade. An intuitive explanation would be that the identities of the two copies switched due to assembly artefacts (Figure 5f). Lastly, the incongruent order of divergence of the four Batches of species in *C.* sect. *Claviceps* inferred by single-copy genes could be explained as lineages sorted during the frequent gains and losses of the ancestral genotypes (Figure 4). Unlike *C.* sect. *Claviceps*, the phylogeny incongruence in *C.* sect. *Pusillae* was mainly caused by the uncertain placement of *C. digitariae* and *C. paspali*. In light of the genome structure, this was likely caused by insufficient sampling instead of gene lineage sorting.

#### *3.3. Micro-Evolution of eas Genes within C. purpurea—An Approximate Hourglass Model*

The inter- and intraspecific variations of the secondary metabolite gene clusters in fungi are typically reported as variations in structures, gene contents, copy numbers, null alleles, and nonhomologous clusters (see review by Rokas [52]). Fewer studies have focused on the DNA sequence variations in each of the gene members. Lorenz et al. [53] identified the sequence differences in *lpsA* between two *C. purpurea* strains (P1 and ECC93) that were associated with the different alkaloid types; however, they could not find differences in *cloA* between *C. fusiformis* and *C. hirtella* that could explain why this gene was functional in the former but not in the latter. Phylogenetic analyses of DNA sequences of four core genes (*dmaW*, *easF*, *easC*, and *easE*) from selected samples across Clavicipitaceae (with emphasis on *Epichloë*) uncovered extensive gene losses, and the origin of EA clusters in Clavicipitaceous fungi was determined to be direct descent rather than horizontal transfer [13].

The present study is the first, to our knowledge, to examine the variations of each gene on a fine scale, i.e., among 28 strains of *C. purpurea*. Both DNA polymorphism analyses of the DNA sequence alignments through DnaSP and tree-based diversity and divergence analyses using the DIVEIN software indicated that the evolutionary rate of early step genes, i.e., *dmaW* and *easF*, is much higher than that of the middle step genes, i.e., *easA*, *C*, *D*, and *E* (Figure 6, Table 4). The pattern matches the hourglass model in ontogeny, which was also evidenced in genomic studies [39]. The hourglass model (HGM) and early conservation model (ECM) in ontogeny are explained by developmental constraints. HGM considers that, at the middle stage, the *meta*- and *cis*-interactions reach the highest complexity, posing constraints for development [54,55], whereas ECM considers the constraints at the early stage to be critical because any alterations at the early stage would cause cascading effects [56]. The EA pathway was reported as an unusually inefficient one, such that certain intermediates accumulated in greater volumes than needed for producing the end-products [57]. This may impose less selective pressure on the middle steps. The sclerotia of *C. purpurea* from tall fescue contained chanoclavine (4 ± 3 µg/g) and agroclavine (2 ± 1 µg/g) in addition to the end-products, i.e., ergopeptines and ergonovine [57]. The extra amount of chanoclavine coincides with the lowest evolutionary rates of *easD* and *easA* inferred in the present study (Figure 6). The role of *easD* is to oxidize chanoclavine to chanoclavine aldehyde, followed by the reactions of *easA* and *easG* to yield agroclavine. It is likely that *easD* is under less selective pressure because plenty of supplies are available. Alternatively, it might be under a high level of functional constraints because of its pivotal position in the pathway (first step of closure of the D-ring). A different isoform of *easA* in *C. africana* and *C. gigantea* reacts differently, creating a shunt yielding dihydroergot alkaloids (Figure 1). This diversification may result from the change in ecological niches. Nevertheless, the rates of diversity and divergence of *easA* were the second lowest after *easD*, even though it is physically located in between *lpsB* and *lpsC*. Both of these later step genes had much higher rates than *easA*, possibly due to fewer constraints or more direct positive selection, as they are involved in the final steps. The *cloA* gene represents another point of the pathway where shunts may take place. Presumably depending on the different isoforms of *cloA*, varied levels of oxidation occur, resulting in different end-products [13,15]. The high rates of diversity and divergence of *cloA* may reflect a high level of positive selection.

The signatures of selective pressure in DNA sequences can be detected through neutrality tests. For instance, if the value of Tajima's D deviates significantly from zero, it indicates the presence of selective pressures, i.e., negative values suggest positive selection, whereas positive values indicate balancing selection [58]. We conducted neutrality tests and found that none of the genes departed significantly from neutrality (results not shown). These results contradict those of Liu et al. [59], who found that *easE* and *easA* were under positive selection in Canadian and western USA *C. purpurea* populations. We speculate here that the small sample size in the present study (28 sequences versus 200–300 in the previous study) might be the factor limiting the ability of the Tajima's D test to detect selective pressures.
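To make the sign convention of Tajima's D concrete, the following minimal Python sketch computes the statistic from a small gap-free alignment using Tajima's standard formulas. It is an illustration only and not the software used in this study; the function name `tajimas_d` and the two toy alignments are hypothetical, and the sketch assumes equal-length sequences of at least four samples with no gaps or ambiguous bases.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length, gap-free aligned sequences (needs n >= 4)."""
    n = len(seqs)
    if n < 4 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least four aligned sequences of equal length")

    # S: number of segregating (polymorphic) alignment columns.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: average number of pairwise nucleotide differences.
    pair_diffs = [sum(a != b for a, b in zip(s1, s2))
                  for s1, s2 in combinations(seqs, 2)]
    pi = sum(pair_diffs) / len(pair_diffs)

    # Constants from Tajima's derivation of the variance of (pi - S/a1).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy data: an excess of rare variants (every polymorphism is a singleton).
rare_variants = ["AAAA", "TAAA", "ATAA", "AATA", "AAAT"]
# Toy data: two haplotypes held at intermediate frequency.
balanced = ["AAAA", "AAAA", "TTTT", "TTTT"]

print(tajimas_d(rare_variants) < 0)  # excess of rare variants gives D < 0
print(tajimas_d(balanced) > 0)       # intermediate frequencies give D > 0
```

Whether a given value departs significantly from zero still has to be judged against the expected distribution of D, which this sketch does not attempt; with few sequences that test has little power, in line with the caveat above.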

Compared with *eas* gene pathways, it is difficult to evaluate whether or not the evolutionary pattern of *ltm* genes conformed with the hourglass model because the sequential order of steps was uncertain. Even if we assume that paspaline-derived compounds are the main products, in the absence of *ltmG,* there are only two to three sequential steps to paspaline. Nevertheless, *ltmM* had the lowest rate of divergence and diversity compared with earlier (*ltmC*) and later steps (*ltmP* and *Q*).

Our results provide evidence for the first time that *eas* gene evolution follows the hourglass model. Whether this pattern exists in other metabolic gene pathways and the mechanisms that underpin this or other patterns are questions to be answered in future work.

#### **4. Materials and Methods**

#### *4.1. Genome Acquisition*

Fifty-four genomes of 19 *Claviceps* spp. were studied. The assemblies of 17 genomes and the raw reads of another 34 genomes were from previous studies (Table 1) [44,45], which outlined the protocols for the DNA extraction, library preparation, and sequencing platforms. In the present study, three additional genomes were sequenced (LM63, LM65, and LM72) using a protocol similar to that described in [44]. Briefly, the gDNA samples were normalized to 300 ng and sheared to 350 bp fragments using an M220 Covaris Focused-Ultrasonicator instrument (Covaris, Woburn, MA, USA). The obtained inserts were used as a template to construct PCR-free libraries using the NxSeq AmpFREE Low DNA Library kit (LGC, Biosearch Technologies, Middleton, WI, USA) following LGC's library protocol. Balanced libraries in equimolar ratios were pooled, and paired-end sequencing was carried out on a NextSeq500/550 (Illumina, San Diego, CA, USA) using a 2 × 150 bp NextSeq Mid Output Reagent Kit (Illumina, San Diego, CA, USA) according to the manufacturer's recommendations.

The new assemblies of 37 genomes were achieved using the following protocols: raw reads were trimmed using BBDuk, a component of BBTools downloaded from the Joint Genome Institute website (https://jgi.doe.gov/data-and-tools/bbtools/, accessed on 9 November 2021). Both quality-trim and kmer-trim were applied using the parameters qtrim=rl, trimq=20, forcetrimleft=10, minlength=36, ftm=5, ref=adapters/adapters.fa, ktrim=r, k=22, mink=11, hdist=1, tbo, tpe. The qualities of initial reads and post-trimming reads were assessed using FastQC version 0.11.9, setting parameters as quiet, noextract. Pairs of trimmed reads for each strain were assembled using the SPAdes version 3.14.0 genome assembly toolkit with the default parameters [60]. QUAST version 5.0.2 was used to evaluate the resulting assemblies and to obtain statistics about the assembled contigs [61]. To assess the completeness of the genome assemblies, BUSCO 4.1.4 was run on the contigs using the fungal database (fungi_odb10) (Creation date: 10 September 2020, number of species: 549, number of BUSCOs: 758) [62].
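The trimming, quality-control, and assembly steps above can be sketched as command lines. The Python snippet below only builds and prints the argument lists; it does not run the tools. The trimming settings are those quoted in the text, whereas the file names, the output directory, and the choice to apply quality- and kmer-trimming in a single BBDuk call are assumptions made for illustration and may differ from the pipeline actually used.

```python
import shlex


def bbduk_args(r1, r2, out1, out2, adapters="adapters/adapters.fa"):
    """BBDuk arguments for one read pair, using the settings quoted in the text."""
    return [
        "bbduk.sh",
        f"in={r1}", f"in2={r2}",
        f"out={out1}", f"out2={out2}",
        f"ref={adapters}",
        # kmer-based adapter trimming at the right (3') end of reads
        "ktrim=r", "k=22", "mink=11", "hdist=1", "tbo", "tpe",
        # quality trimming of both ends to Q20
        "qtrim=rl", "trimq=20",
        # fixed 5' trim, length-multiple trim, and minimum read length
        "forcetrimleft=10", "ftm=5", "minlength=36",
    ]


def fastqc_args(*fastq_files):
    """FastQC arguments matching the 'quiet, noextract' settings."""
    return ["fastqc", "--quiet", "--noextract", *fastq_files]


def spades_args(r1, r2, outdir):
    """SPAdes paired-end assembly with default parameters."""
    return ["spades.py", "-1", r1, "-2", r2, "-o", outdir]


# Hypothetical file names for one strain; adjust to the actual data layout.
raw = ("LM63_R1.fastq.gz", "LM63_R2.fastq.gz")
trimmed = ("LM63_R1.trim.fastq.gz", "LM63_R2.trim.fastq.gz")

for command in (
    fastqc_args(*raw),
    bbduk_args(*raw, *trimmed),
    fastqc_args(*trimmed),
    spades_args(*trimmed, "LM63_spades"),
):
    print(shlex.join(command))
```

Keeping the commands as argument lists rather than shell strings makes it straightforward to hand them to `subprocess.run` once the tools are installed, without quoting problems in file names.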

#### *4.2. Alkaloid Gene Screening and Extraction*

To investigate the presence/absence of the four classes of alkaloid synthesis genes in 54 genomes, BLAST searches were conducted to interrogate the genomes with the reference genes of interest using an in-house perl script (running blastn with an E-value of 1 × 10<sup>−99</sup> as the cutoff). Alternatively, each individual genome assembly was mapped onto the reference genes using the 'Map to Reference' function in Geneious Prime 2020.1.2 (https://www.geneious.com, accessed on 9 November 2021). The reference gene clusters were downloaded from GenBank and applied as follows: the clusters of 14 ergot alkaloid synthesis (*eas*) genes and six indole-diterpene/lolitrem genes (*IDT/ltm*) from *C. purpurea* strain 20.1 (JN186799 containing *cloA*, *dmaW*, *easA*, *C*–*G*, *easH1*, *easH2*, *lpsA1*, *lpsA2*, *lpsB*, and *lpsC*; JX402756 containing *idt/ltmB*, *C*, *M*, *P*, *Q*, and *S*) and *C. paspali* RRC-1481 JN186800 (*easO*) were first applied as a query to interrogate each genome. In addition, the cluster from *C. fusiformis* PRL1980 EU006773 (10 genes: *cloA*, *dmaW*, *easA*, *C*–*H*, and *lpsB*) was applied to further interrogate genomes in *C.* sect. *Pusillae* and *C. citrina*. For the *IDT/ltm* genes that were not previously reported in *Claviceps purpurea* 20.1, the reference sequences from *C. paspali* JN613321 (*ltmF* and *ltmG*) and *Epichloë* (*ltmE* and *J* on JN613318, and *K* on JN613320) were used to conduct lower stringency megablast searches (https://www.geneious.com, accessed on 9 November 2021) with E-values of 1 × 10<sup>−50</sup> and 1 × 10<sup>−20</sup>. Megablast searches were also conducted for loline alkaloid genes (*lolA*, *D*, *E*, *M–P*, *T*, and *U* on JF830816, *lolC* FJ464781, and *lolF* FJ594413) and peramine (*perA* JN640287) in all 54 genomes. Genes that were present in genomes were extracted manually. Split fragments of a single gene on different contigs were concatenated on the basis of reference sequences. DNA sequences of genes extracted from the new genomes were submitted to GenBank.

When multiple copies of certain genes were present (such as *dmaW*, *easE*, *easF*, *ltmB*, and *ltmQ*), the copy on the main cluster was designated as copy 1, as determined by examining the contig numbers. The exception was *easH,* which was determined on the basis of the similarity to the two copies determined by previous studies [14]. Disconnected fragments shorter than 300 bps were not considered.
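The screening criteria described in this section (an E-value cutoff of 1 × 10<sup>−99</sup> and the exclusion of fragments shorter than 300 bp) can be illustrated with a short Python filter over BLAST tabular output. This is a sketch, not the in-house perl script; it assumes the standard 12-column tabular format (`-outfmt 6`), and the three example hits, contig names, and function names are invented for demonstration.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str      # reference gene used as the BLAST query
    subject: str    # contig in the genome assembly
    identity: float
    length: int     # alignment length in bp
    evalue: float


def parse_hits(lines: Iterable[str]) -> List[Hit]:
    """Parse standard 12-column BLAST tabular output, skipping comments and blanks."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        f = line.rstrip("\n").split("\t")
        hits.append(Hit(f[0], f[1], float(f[2]), int(f[3]), float(f[10])))
    return hits


def keep_hits(hits: Iterable[Hit], max_evalue: float = 1e-99,
              min_length: int = 300) -> List[Hit]:
    """Keep hits that pass the E-value cutoff and the minimum fragment length."""
    return [h for h in hits if h.evalue <= max_evalue and h.length >= min_length]


# Invented example rows (not real results from this study).
EXAMPLE = (
    "easA\tcontig_12\t95.20\t1140\t55\t0\t1\t1140\t2001\t3140\t0.0\t1790\n"
    "easD\tcontig_3\t91.00\t250\t22\t0\t1\t250\t10\t259\t1e-102\t380\n"
    "lpsB\tcontig_7\t80.00\t900\t180\t0\t1\t900\t500\t1399\t1e-40\t170\n"
)

for hit in keep_hits(parse_hits(EXAMPLE.splitlines())):
    print(hit.query, hit.subject, hit.length, hit.evalue)
```

In this toy input only the *easA* row survives: the *easD* fragment is shorter than 300 bp and the *lpsB* hit fails the E-value cutoff. Relaxing `max_evalue` to 1e-50 or 1e-20 would mimic the lower-stringency searches described above.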

#### *4.3. Phylogenetic Analyses*

The extracted sequences for each gene were aligned individually through the Geneious Prime (https://www.geneious.com, accessed on 9 November 2021) Align/Assemble function using Global alignment with free end gaps, 93% similarity (5.0/−9.026168) as the cost matrix, a gap open penalty of 12, a gap extension penalty of 3, and two refinement iterations. This protocol is particularly suitable for aligning sequences with large gaps or shorter fragments to full-length sequences. Maximum likelihood phylogenetic trees were developed using the PhyML 3.3.20180621 [63] plugin of Geneious Prime (https://www.geneious.com, accessed on 9 November 2021). Both GTR and HKY substitution models were attempted; branch supports were evaluated through bootstrapping analyses of 100 replicates. The reference sequences of *lpsB* of *C. paspali* have only 52% similarity with those of *C. purpurea*, causing spurious alignment and a significantly long branch; therefore, they were not included in the analyses.

#### *4.4. Intraspecific Gene Diversity and Divergence Analyses*

Population demographic parameters are suitable for investigating genetic differentiation and gene evolution at an intraspecific level. We investigated the DNA polymorphisms, nucleotide diversity (Pi), and average number of nucleotide differences (K) among 27 strains of *C. purpurea* using DnaSP [64]. Another reason for choosing this sub-set of data, instead of all 53 samples, is that all but three strains (LM65, LM72, and LM582, which lacked *easH2*) contained all 12 genes, making the results more comparable. Nonetheless, the sequences with long gaps causing a significant reduction in alignment length in *dmaW* and *easF* were excluded from the DnaSP analyses. In addition, the tree-based diversity and divergence from the center of the tree (COT) were calculated through the web-based DIVEIN software (https://indra.mullins.microbiol.washington.edu/DIVEIN/diver.html, accessed on 9 November 2021) [65]. The following parameters were applied: GTR substitution model, optimized equilibrium frequencies, the best of NNI and SPR tree improvement, and topology + branch length tree optimization algorithm. For multicopy genes (*dmaW*, *easE*, *easF*, and *easH*), we calculated the parameters for each individual copy and combined them as one gene (Table 4).
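As a worked illustration of the two polymorphism statistics named above, the Python sketch below computes K (the average number of pairwise nucleotide differences) and Pi (K divided by the number of sites compared) from a toy alignment. It is not DnaSP; the function name `diversity` and the example sequences are hypothetical, and the sketch assumes that alignment columns containing a gap are excluded from all comparisons, which also shows why sequences with long gaps shorten the usable alignment.

```python
from itertools import combinations


def diversity(seqs):
    """Return (K, Pi) for aligned sequences, ignoring columns that contain a gap."""
    if len(seqs) < 2 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least two aligned sequences of equal length")

    # Complete deletion: drop every alignment column with a gap in any sequence.
    columns = [col for col in zip(*seqs) if "-" not in col]

    # Sum nucleotide differences over all pairs of sequences.
    pairs = list(combinations(range(len(seqs)), 2))
    total_diffs = sum(
        sum(1 for col in columns if col[i] != col[j]) for i, j in pairs
    )

    k = total_diffs / len(pairs)                 # average pairwise differences
    pi = k / len(columns) if columns else 0.0    # per-site nucleotide diversity
    return k, pi


# Toy alignment: the last column contains a gap and is therefore ignored.
toy = ["ACGT-", "ACGAA", "ACCAA"]
k, pi = diversity(toy)
print(f"K = {k:.3f}, Pi = {pi:.3f}")
```

For the toy alignment, the three pairwise comparisons differ at 1, 2, and 1 of the four gap-free sites, giving K = 4/3 and Pi = 1/3.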

**Supplementary Materials:** The following are available online at https://www.mdpi.com/article/10.3390/toxins13110799/s1, Figure S1: The phylogenetic trees developed by PhyML for each individual single-copy *eas* and *idt/ltm* genes, thickened branches indicate bootstrapping values >80%. Figure S2: The phylogenetic trees developed by PhyML for each individual multi-copy *eas* and *idt/ltm* genes. Table S1: The collection information for 53 strains of *Claviceps* spp.

**Author Contributions:** Conceptualization, M.L.; methodology, K.D., P.S. and S.A.W.; software (pipeline), W.F.; formal analysis, M.L., W.F., P.S. and A.B.; resources, J.D., M.K., J.G.M., S.A.W. and K.B.; data curation, W.F., P.S., S.A.W. and A.B.; writing—original draft preparation, M.L.; writing—review and editing, W.F., J.D., S.A.W., K.B., P.S., K.D., M.K. and J.G.M.; funding acquisition, M.L., J.D. and K.B. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Agriculture and Agri-Food Canada's Growing Forward 2 for a research network on Emerging Mycotoxins (EmTox, project # J-000048), STB fungal and bacterial biosystematics J-002272, the Agriculture and Food Research Initiative (AFRI) National Institute of Food and Agriculture (NIFA) Fellowships Grant Program: Predoctoral Fellowships grant no. 2019-67011-29502/project accession no. 1019134 from the United States Department of Agriculture (USDA), and the American Malting Barley Association grant no. 17037621. Additional funding was provided by Agriculture and Agri-Food Canada grant J-001564, Biological Collections Data Mobilization Initiative (BioMob, Work Package 2). This research was supported in part by the U.S. Department of Agriculture, Agricultural Research Service.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The genome and gene data presented in this study are openly available in NCBI upon publication of this article https://www.ncbi.nlm.nih.gov/ (accessed on 9 November 2021). Accession numbers are detailed in the text Section 2.2 and Table 1.

**Acknowledgments:** We thank the Molecular Technologies Laboratory (MTL) at the Ottawa Research & Development Centre of Agriculture and Agri-Food Canada and Kassandra R. Bisson for technical assistance, Chunfang Zheng and Frank You for bioinformatics assistance, Christopher Schardl for advice during the early stages of the study, and two anonymous reviewers for reviewing the manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results. The USDA is an equal opportunity provider and employer.

#### **References**


## *Review* **The Usage of Ergot (***Claviceps purpurea* **(fr.) Tul.) in Obstetrics and Gynecology: A Historical Perspective**

**Aleksander Smakosz <sup>1</sup>, Wiktoria Kurzyna <sup>2</sup>, Michał Rudko <sup>3</sup> and Mateusz Dąsal <sup>2,</sup>\***


**Abstract:** In the past centuries consumption of bread made of ergot-infected flour resulted in mass poisonings and miscarriages. The reason was the sclerotia of *Claviceps purpurea* (Fr.) Tul.—a source of noxious ergot alkaloids (ergotamine and ergovaline). The authors have searched the 19th century medical literature in order to find information on the following topics: dosage forms of drugs based on ergot and their application in official gynecology and obstetrics. The authors also briefly address the relevant data from the previous periods as well as the 20th century research on ergot. The research resulted in a conclusion that applications of ergot in gynecology and obstetrics in the 19th century were limited to controlling excessive uterine bleeding and irregular spasms, treatment of fibrous tumors of the uterus, and prevention of miscarriage, abortion, and amenorrhoea. The most common dosage forms mentioned in the works included in our review were the following: tinctures, water extracts (Wernich's and Squibb's watery extract of ergot), pills, and powders. The information documented in this paper will be helpful for further research and helpful in broadening the understanding of the historical application of the described controversial crude drugs. Ergot alkaloids were widely used in obstetrics, but in modern times they are not used in developed countries anymore. They may, however, play a significant role in developing countries where, in some cases, they can be used as an anti-hemorrhage agent during labor.

**Keywords:** ergot; ethnopharmacology; abortion; childbirth

**Key Contribution:** This review presents a historical view on the application of ergot in 19th century official pharmacy and medicine in Europe and the USA. In this paper we highlight an initial framework for in-depth analysis of historical ergot properties and usage issues.

#### **1. Introduction**

In the history of mankind, *materia medica* representatives have experienced their ups and downs. Plant and animal raw materials have been commonly used in medicine, pharmacy, and other economic sectors. However, fungal raw materials were usually out of the scope of interest. On the other hand, it is difficult not to appreciate the significance of *Saccharomyces cerevisiae* (Desm.) Meyen ex E.C. Hansen (brewer's yeast and baking yeast; with biotechnology, it is possible to produce various drugs with cultures of the above species).

The situation is a bit different with sclerotia of *Claviceps purpurea* (Fr.) Tul. (ergot)—a source of unknown origins and unknown ancient history. Through the ages, this medicinal fungus has been closely connected with gynecology and obstetrics.

We do not know if the ancients could identify and cultivate rye. Possibly this species was used in Thrace and Macedonia to bake bread. Dated around 600 BC, an Assyrian tablet alluded to a "noxious pustule in the ear of grain" (Could it be ergot?). In classical Latin, rye was called secale [1]. A little later, during the medieval period, it was called siligo [1].

**Citation:** Smakosz, A.; Kurzyna, W.; Rudko, M.; Dąsal, M. The Usage of Ergot (*Claviceps purpurea* (fr.) Tul.) in Obstetrics and Gynecology: A Historical Perspective. *Toxins* **2021**, *13*, 492. https://doi.org/10.3390/toxins13070492

Received: 21 May 2021; Accepted: 13 July 2021; Published: 15 July 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

A large number of *Poaceae* representatives may become infected by fungi belonging to the *Claviceps* genus [1,2]. However, from both the historical and economic points of view, the most important one is *Claviceps purpurea* (ergot), which causes damage to rye, wheat, and barley [2].

In the past centuries, consumption of bread made of ergot-infected flour resulted in mass poisonings. With regard to symptoms, two forms of ergot poisoning can be distinguished: *ergotismus gangrenosus* and *ergotismus convulsivus* [3]. The first stage of the poisoning is similar in both cases: gastrointestinal symptoms and an abnormal crawling sensation in the limbs, which later develops into pain. With time, the poisoning develops into one of the types mentioned above. In the case of gangrenous ergotism, ischemia affecting the limbs results in distal changes of skin color and then gangrene, which in turn causes the loss of limbs or even death [3]. Convulsive ergotism manifests itself in nervous system disorders. This form of poisoning causes painful involuntary muscle twitching while the body of the person affected by the illness takes abnormal postures. In some cases, mania and hallucinations occur simultaneously [2,3].

*Ergotismus gangrenosus* is most probably an equivalent of the so-called "St. Anthony's Fire", while *ergotismus convulsivus* is sometimes compared to "St. Vitus Dance" [1–3]. However, the majority of sources associate it with Huntington's disease [3]. The medieval epidemics of *ergotismus gangrenosus* were common in the regions of Europe west of the Rhine (France), while *ergotismus convulsivus* occurred mostly in the part of Europe east of the Rhine (Germany) and in Scandinavia [3,4]. Ergotism epidemics frequently occurred in communities where the diet was rich in rye and took place after cold and wet winters followed by wet springs, as high air moisture and wind facilitate the spreading of the *Claviceps purpurea* fungus. It was mentioned that breast-fed infants did not show any poisoning symptoms [3].

The last time ergot poisoning happened in Europe on a larger scale was in Westphalia, Hanover, and Lauenburg in 1771 where in some villages only 5 out of 120 people survived [3]. It is also implied that ergot may be responsible for the well-known "choreomania" (dancing plague) during which European villagers of the Middle Ages were falling into an involuntary dance trance [3].

#### **2. Ethnomycology of** *C. purpurea*

#### *2.1. Nomenclature of Ergot*

For centuries, many scientists had difficulties with classifying ergot. In 1816, Augustin Pyramus de Candolle (Swiss botanist who specialized in economic botany and agronomy) described *Secale Cornutum* as a fungus and named this species *Sclerotium Clavis* [5]. A Swedish researcher, Elias Magnus Fries (professor of botany and applied economics at Uppsala University, the father of modern fungal taxonomy), denominated it *Spermoëdia Clavus* [5]. What is crucial is that Edwin John Quekett (botanist, surgeon, and microscopist) called it *Ergotoetia abortifaciens* (*abortifaciens*: Latin term for a substance that induces abortion). In 1839 he wrote [6] the following: "I adopted the term abortans [ . . . ] is it probable that I ever should have proposed [ . . . ] name as being fitted to form the specific one of the newly discovered genus."

Miles Joseph Berkeley (Anglican clergyman interested in plant pathology) was also conscious of the specific effect of the aforementioned fungus and therefore he called it *Oidium abortifaciens* [5]. In 1853, Edmond Tulasne (French mycologist and botanist) introduced his views on the development cycle of ergot. Since then, this fungus has been called *Claviceps purpurea* [5].

This name, however, was not widely accepted and editors of contemporaneous London Pharmacopoeia referred to ergot as *Acinula Clavus*—a species never described before [2]. In pharmaceutical sources, ergot was called *Clavus siliginis, Calcar, Secalis mater*, *Secale luxurians*, *Secale cornutum*, and *Grana secalis degenerate* [2,4].

#### *2.2. Ethnopharmacology of C. purpurea*

We have no clear evidence as to when ergot was first introduced into medicinal use in the West. However, its history in the East is clearer. Alexander Tschirch (pharmacist and pharmacognosist who lived at the turn of the 20th century) stated that Chou Kung (Chinese philosopher and physician) wrote about ergot circa 1100 BC [1]. According to this man of science, this raw material was used as an obstetrical remedy, an application that was unknown in Europe until the 17th or 18th century. On the other hand, later botanical works, e.g., "Thousand Golden Remedies" of Sun Simiao written in the 7th century and "Pen Ts'ao Kang Mu" written in the 16th century by Li Shih-chen (he compiled 11,091 prescriptions used throughout centuries; these are listed in 52 volumes describing 443 animal substances and 1,074 plant materials), do not mention ergot [1].

Rye was primarily cultivated by the Teutons who were a Germanic tribe [1], so it should not be surprising that the first medicinal reference to ergot as a drug comes from German sources. Adam Lonicer was a German botanist and physician in the city of Frankfurt [7]. He was an author of the *Kreuterbuch*—one of the most notable herbals in the history of herbalism and pharmacy. In 1582, the fourth edition of this work was published. In the chapter concerning agricultural crops and grains (the subchapter devoted to rye), he wrote that sometimes long black grains stick out of the spike similar to long nails [7]. This description clearly indicates that Lonicer knew what ergot was. Furthermore, he also wrote that women used three sclerotia of the *Claviceps* to induce a strong uterine contraction [7]. This is the oldest evidence of *Secale cornutum* application in gynecology and obstetrics.

The oldest known illustration (woodcut) of ergot can be found in a book entitled "Botanical theatre or the history of plants" (Lat. *Theatrum Botanicum Sive Historia Plantarum*), edited by the son of Caspar Bauhin in 1658 [8].

#### **3. Ups and Downs of Ergot Application in Gynecology and Obstetrics**

#### *3.1. Excessive Uterine Bleeding and Irregular Spasms*

According to John Stearns, the official usage of ergot in gynecology started in 1747 in Holland. It was subsequently used in France until it was interdicted in 1774 by a legislative act [9]. In the first decade of the 19th century in the state of Washington, there was a high-profile case of a Scottish woman who applied ergot medications in obstetrics practice with fatal results. These issues were associated with the overdosage of *C. purpurea*. In addition, this fungal resource is very variable in its active constituent content. Due to this quality, it is very difficult to determine what a safe dose should be. Dr. J. Stearns, a physician from the State of New York, began applying ergot-based drugs in his gynecology practice in 1807 [9]. Due to the rejection of this drug by previous generations, he had to collect it by hand. He started with ergot powder and then he tried the decoction, which he considered superior. The maximum dose was 10 grains (an equivalent of about 0.65 g) but the regular dose was much lower [9]. In the following years, in the USA, a clinical trial network of obstetricians who used ergot in their practice in the form of powder and decoction was established [9]. Stearns called this powder "Childbirth powder" (Lat. *pulvis parturiens*) [4]. In 1813, the trial results were published in *The New England Journal of Medicine and Surgery*. They were met with general approval. What is important is that some of the physicians described the cases of children that were stillborn after ergot had been applied [9]. This is not surprising; *C. purpurea* is a very potent medication. Due to prior French objections, some American doctors considered the "new" drug to be harmful and worthless [9].

The British doctors were more reserved—they valued the action of ergot, but they observed that application of ergot during pregnancy and delivery was associated with a greater death rate of infants. Apart from applying ergot in midwifery they also tried to use *C. purpurea* in cases of amenorrhoea [4].

The approach to this raw material in France changed in 1872 when the Academy of Medicine in Paris enacted Article 32 of the law of the 19th Ventose and allowed midwives to prescribe ergot [10]. The Academy stated that ergot-based medicines had a lot of advantages in obstetric practice. That regulation was in opposition to the previous laws and decrees, which restricted the prescription of poisons to physicians and veterinarians [10]. Therefore, pharmacies were able to provide gynecological drugs based on sclerotia of *C. purpurea* to a new group of medical professionals.

It is important to point out that usage of ergot was limited to cases that posed a threat to the life of both mother and child. In such cases, the risk was legitimate. Taking this into consideration, Dr. John Stearns created a list of observations regarding the use of ergot [9]. He stated that it should never be administered during labor and in quantities larger than thirty grains (an equivalent of about 1.94 g) [9]. Combined with opium and water, the sclerotia of *C. purpurea* were given when interrupted pains of regular labor occurred (dosage: a teaspoonful every ten minutes). The described drug was contraindicated when labor was regular or the contractions were uninterrupted [9].

On the other hand, application of ergot was indicated in cases of prolonged labor when the contractions were irregular and/or too weak to advance the labor, or when the contractions and pains were transferred from the uterus to other parts of the body, producing puerperal convulsions [9].

In 1814, Dr. Henry S. Waterhouse witnessed difficulties during childbirth [9]. The patient reported vaginal bleeding and constriction of the lower abdomen. Then, she began to lose consciousness and started to bite her tongue. In the hours that followed, the doctor began to observe alarming contractions of the muscles of her limbs, back, abdomen, neck, and lower jaw [11]. Conventional drugs of the time, such as tincture of opium and tincture of asafoetida (*Ferula assa-foetida* L.), had not stopped the symptoms from progressing [9]. Eventually, he decided to give her a mixture of 30 grains of ergot with water. The administration of this medication stopped the worsening of her condition and she was soon able to deliver the baby successfully [9].

Another important remark on ergot was made by Dr. John Paterson who specialized in midwifery. In 1840, he wrote the following [11]: "I consider the ergot more to be depended on, as to its particular effects on the uterus than almost any other specific in the Pharmacopoeia."

Paterson also mentioned that he had never seen any side effects of ergot application. This is rather implausible, since even a single dose can cause symptoms such as vomiting, colic, pains, headache, and hallucinations [9,11].

Considering the side effects mentioned above, many pharmacists, physicians, and herbalists tried to compound an ergot-based drug with as few side effects as possible. Dr. Rees's ethereal solution seemed to be the best choice. According to *The Retrospect of Practical Medicine and Surgery* from 1840 [11]: "The ethereal solution, the properties of which you have so well tested, was prepared by digesting 4 ounces of the powdered ergot in 4 fluid ounces of ether for seven days. The result was a solution of the fatty matters contained in the drug: this was poured off, evaporated to dryness, and the residue again dissolved in 2 fluid ounces of ether."

One part of the preparation was the equivalent of two parts of ergot. Both ether and *Claviceps* possess narcotic potential, and so usage of the ethereal solution was highly hazardous [11].

Other forms of ergot-based drugs used in obstetrics practice were emulsions, mucilages, syrups, and water extracts mixed with aromatic water [2].

#### *3.2. Fibrous Tumor of the Uterus*

In the 19th century, treatment of a fibrous tumor of the uterus was regarded as beyond the reach of medicine. Among the most problematic symptoms associated with this illness are excessive uterine bleeding and hypertrophy. Ergot-based medicines worked via elimination (excretion) of the polyp or intramural tumor [12]. Dr. Byford used a fluid extract of ergot in this treatment (half a spoonful for three weeks; by mouth and sometimes hypodermically). As a result of the therapy, the tumor was expelled and the uterus inverted. The surgeon had to remove the remaining fibroids. In some cases, the pain during therapy was intolerable and therefore the treatment had to be terminated before the expected therapeutic effect was obtained. At that time, there were two main preparations used in the therapy of fibrous tumors of the uterus: Wernich's watery extract of ergot and Squibb's watery extract of ergot [12,13].

The first was obtained by successive extraction of the drug, first with ether, then with ethanol, and lastly with water, after which the solvent was evaporated. Squibb's preparation was a simple water extract evaporated on an evaporating dish [13,14]. Both drugs were mixed with glycerin and sometimes with belladonna (*Atropa belladonna* L.) or opium [12,15]. The therapy described above was forgotten until 1953, when further studies on the effects of pure ergometrine on this type of tumor (especially when associated with uterine bleeding) were conducted [14]. In this case, a dose of 0.2–0.4 mg of ergometrine was used [14].

#### *3.3. Abortion and Poisoning*

In 1844, a physician was tried on the charge of intending to procure an abortion [16]. The accuser held that Dr. James Calder administered noxious pills made of powdered savin juniper and essential oil (*Juniperus sabina* L.) and a decoction of ergot to an unmarried woman. In addition, he gave her a powder compounded of iron(II) carbonate and cantharides (*Lytta vesicatoria* L.) [16]. In the further proceedings, it turned out that they were in a relationship and that she had likely been impregnated by him [16]. He had therefore tried to induce an abortion by instructing her to take two pills a day. Due to their terrible smell and taste, she threw them into the fireplace [16].

Another example of a similar case took place in 1889 [17]. The prosecutrix—a nurse at the Selby Oak Workhouse—accused a surgeon named Cuthbert of an intentional abortion procured with the fluid extract of ergot. Due to unreliable evidence, the trial ended without a sentence. This result may have been influenced by the support of the entire medical community [17].

Poisoning (both criminal and suicidal) with ergot was (and is) uncommon. Cases in which the autopsy proved death by poisoning with this crude drug were usually connected with an attempt to procure abortion. One such case was reported in 1864 [18]. The victim, a young unmarried woman, had all the clinical signs of ergot poisoning before her death: yellow skin, vomiting, headache, dryness and irritation of the throat, and intestinal hyperemia [18]. Death occurred due to the aforementioned *ergotismus gangrenosus*. Further investigation demonstrated that she had been consuming the tincture of ergot and pennyroyal (*Mentha pulegium* L.) essential oil for 11 weeks [18]. The latter substance had been used as an abortifacient since the times of Ancient Greece [19].

On the other hand, the administration of ergot in the case of profuse hemorrhage could prevent miscarriage. This kind of treatment was suggested by Dr. A. Freer, who declared that he had used this medicine in such cases over 200 times [20]. At the same time, he wrote, with a high level of uncertainty, about the application of ergot in abortion (10th–12th week) [20]. Some practicing physicians had the opposite experience with this remedy. Dr. John Basset published his remarks on the application of ergot as a medical abortion drug in 1872 [21]. He expressed the opinion that the contractions of the muscular fiber caused by pulverized sclerotia are too weak to expel the fetus. Moreover, the ergot alkaloids stopped uterine bleeding, which was a significant occurrence during the abortions performed at that time [21].

According to the ethnobotanists D. Allen and G. Hatfield, only a solitary and contemporary record of folk usage of ergot in procuring abortions (in Norfolk) has been traced, but it may well have been a widespread practice throughout the centuries [22].

It should be noted that ergot-induced poisoning is also an issue in animal husbandry. Cattle (*Bos taurus* L.) are very sensitive to ergot alkaloids (especially ergotamine, ergovaline, ergonovine, ergocristine, and ergocornine). Clinical signs of ergotism in cattle include convulsions, ataxia, gangrenous extremities, vasoconstriction, and abortion [23].

#### **4. Contemporary Pharmacology of Ergot**

As stated above, ergot was used as a crude drug until the 19th century. In the 20th century, individual alkaloids were extracted and described. Ergotoxine was discovered and described in 1906 as a single chemical compound. This state of affairs held until 1943, when individual alkaloids were isolated from it. Another alkaloid, ergotamine, was isolated in 1918 by Stoll. Lysergic and isolysergic acids were described in 1934 and 1936, respectively. In 1935, another alkaloid was discovered—ergometrine [24,25].

Ergoline is the core chemical structure of ergot alkaloids. The molecular mechanism of drug action is associated with the effect of ergot alkaloids on dopamine, noradrenaline, and serotonin receptors, which arises from structural similarities between these neurotransmitters and the ergoline core [26]. The similarities in molecular structure are shown in Figure 1.

**Figure 1.** Comparison of dopamine, noradrenaline, and serotonin structures with ergoline. Similarities are highlighted in color: (**A**) dopamine, noradrenaline, and ergoline; (**B**) serotonin and ergoline.

By modifying lysergic acid, a range of derivatives with different receptor activity can be obtained. Amide derivatives of lysergic acid with a small side chain have lower adrenolytic and higher 5-HT antagonist activity. Hydrogenation of ergotamine-class alkaloids results in a higher adrenolytic effect [27]. Nicergoline is one example of the impact of hydrogenation on ergot alkaloid derivatives, as it is administered in hypertension as a strong α1-receptor blocker [26]. The introduction of elaborate moieties at C-13 or C-14 strongly weakens the interaction with dopamine receptors and results in increased selectivity towards 5-HT<sub>2</sub> receptors. Selectivity towards 5-HT<sub>1</sub> receptors can be obtained via modification of the C and D rings of the ergoline core [27].

Ergotamine and dihydroergotamine are α-adrenergic agonists/antagonists, display dopamine D2 receptor agonist activity, and show partial agonist action on 5-HT receptors. Ergometrine is an α-adrenergic partial agonist with no impact on dopamine D2 receptors and has partial agonist action on 5-HT receptors [27,28].

Ergot alkaloids in obstetrics are administered at the third stage of labor to prevent postpartum hemorrhage. Because of side effects such as blood pressure elevation and pain, they must be administered with great caution and the dose must be chosen with great care. The use of ergot alkaloids is more important in developing countries, where postpartum hemorrhage is the cause of many deaths and where access to modern medical care is limited [28]. On the other hand, the administration of oxytocin in postpartum hemorrhage prevention (in contrast to ergot alkaloids) can result in a prolonged third stage of labor [29]. Ergot alkaloids used to be widely applied in obstetrics, yet nowadays only a few are still used. Methylergonovine is used as a highly effective second-line uterotonic medication (unfortunately, it is associated with severe vasoconstriction) [30]. In some cases in developing countries, ergot alkaloids can be the only anti-hemorrhage agents available during labor. A summary of the properties of ergot alkaloids is shown in Table 1.



**Table 1.** Active compounds found in ergot, application, year of discovery, and impact on receptors for chosen compounds.

\* Discovery that ergotoxine is a mixture of alkaloids, \*\* extraction and characterization, \*\*\* year of the first synthesis, Ref. = Reference.

#### **5. Conclusions**

In conclusion, the application of ergot in gynecology and obstetrics in the 19th century was limited to controlling excessive uterine bleeding and irregular contractions, treatment of fibrous tumors of the uterus, and prevention of miscarriage. There is little evidence that sclerotia of *Claviceps purpurea* were used in abortion or amenorrhoea. In the cases described above, the aforementioned abortifacient was either ineffective or caused the death of the patient. On the other hand, abortion in the 19th century was in most cases illegal, and so information about the usage of abortive medications was taboo.

The most common dosage forms mentioned in the works included in our review were the following: tinctures, water extracts (Wernich's watery extract of ergot, Squibb's watery extract of ergot), pills, and powders. The information documented in this paper will be helpful for further research and for broadening the understanding of the historical application of this controversial crude drug. Twentieth and twenty-first century applications of ergot in medicine will be covered by the authors in future papers.

#### **6. Methodology**

The authors conducted a literature search within the JSTOR [35] and archive.org [36] databases using keywords such as "Ergot obstetrics", "Ergot abortion", and "Ergot gynecology" for works published in the United Kingdom and the United States of America between 1822 and 1896. After reading the works, we selected the most relevant papers and books for this article. Works were included if they covered the following topics: dosage forms of drugs based on ergot and their application in official gynecology and obstetrics. All other articles, which did not cover one of these topics as their primary subject, were excluded. The binominal Latin names of plants were synchronized with The Plant List [37] database and binominal names of fungi were synchronized with the Index Fungorum [38] database.

**Author Contributions:** Conceptualization, A.S. and M.D.; methodology, A.S.; formal analysis, A.S. and M.D.; investigation, A.S., W.K., M.R. and M.D.; resources, A.S., W.K., M.R. and M.D.; writing – original draft preparation, A.S., W.K., M.R. and M.D.; writing – review and editing, A.S. and M.D.; visualization, M.R. and M.D.; supervision, A.S. and M.D.; project administration, A.S.; funding acquisition, W.K. and M.D. All authors have read and agreed to the published version of the manuscript.

**Funding:** This review was funded by the Wroclaw Medical University.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Acknowledgments:** The authors wish to thank Piotr Czerwik for the review.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


*Article*
