**Improved Algal Toxicity Test System for Robust** *Omics***-Driven Mode-of-Action Discovery in** *Chlamydomonas reinhardtii*

**Stefan Schade <sup>1</sup>, Emma Butler <sup>2</sup>, Steve Gutsell <sup>2</sup>, Geoff Hodges <sup>2</sup>, John K. Colbourne <sup>1</sup> and Mark R. Viant <sup>1,</sup>\***


Received: 12 April 2019; Accepted: 7 May 2019; Published: 10 May 2019

**Abstract:** Algae are key components of aquatic food chains. Consequently, they are internationally recognised test species for the environmental safety assessment of chemicals. However, existing algal toxicity test guidelines are not yet optimized to discover molecular modes of action, which require highly-replicated and carefully controlled experiments. Here, we set out to develop a robust, miniaturised and scalable *Chlamydomonas reinhardtii* toxicity testing approach tailored to meet these demands. We primarily investigated the benefits of synchronised cultures for molecular studies, and of exposure designs that restrict chemical volatilisation yet yield sufficient algal biomass for omics analyses. Flow cytometry and direct-infusion mass spectrometry metabolomics revealed significant and time-resolved changes in sample composition of synchronised cultures. Synchronised cultures in sealed glass vials achieved adequate growth rates at previously unachievably-high inoculation cell densities, with minimal pH drift and negligible chemical loss over 24-h exposures. Algal exposures to a volatile test compound (chlorobenzene) yielded relatively high reproducibility of metabolic phenotypes over experimental repeats. This experimental test system extends existing toxicity testing formats to allow highly-replicated, *omics*-driven, mode-of-action discovery.

**Keywords:** synchronisation; algae; bioassay; biomarker; key event; adverse outcome pathway

#### **1. Introduction**

Algae are internationally established test organisms in chemical risk assessment [1–3]. Toxicity test guidelines incorporating algae are largely focused on traditional apical endpoints such as growth rates following a 72-h chemical exposure. Consequently, no knowledge is obtained on the causes of chemical toxicity, nor on the generality of the chemical effects on other species. Increasingly, chemical risk assessment includes New Approach Methodologies (NAMs) to improve precision in the prioritisation and categorisation of chemicals [4]. One application of NAMs is to increase confidence in cause–effect analyses by providing mechanistic evidence on the harmful effects of chemicals, e.g., in the form of rapid and large-scale discovery of molecular key events (mKE) in an adverse outcome pathway (AOP) framework [5]. This transition, from traditional whole organism endpoint-based to molecular pathway-based (eco)toxicology, is motivated by the application of high-content technologies such as transcriptomics and metabolomics, which have demonstrated a capability to discover biological processes [6]. A robust, scalable test guideline for *omics*-driven discovery of toxicological key events in algae first requires a rigorously characterised algae culturing and exposure protocol with biological processes synchronised across all cells within a test population, so as to minimise molecular variation for the pathway-based cause-effect assessments. Ideally, such an exposure system features minimal test duration to facilitate rapid measurements, and is matched to an algal lifecycle to allow accurate linking of the molecular perturbations to growth inhibition to address regulatory requirements of phenotypic anchoring [6–10]. Additionally, the exposure system should be capable of testing both non-volatile and volatile toxicants, and providing highly-replicated generation of sufficient algal biomass for multi-*omics* measurements. Design of such a robust test guideline should facilitate the generation of highly reproducible, high-quality multi-*omics* data.

Current algal growth inhibition test guidelines [2,3] recommend culturing in constant lighting, a condition that induces rapid drift and loss of cell-cycle alignment over a population of algae that are responsive to light-regulation of the cell cycle [11]. In such a culturing regimen, samples for molecular analyses taken at any given time-point over the test duration will contain a cross-section of individual cells performing simultaneously all of the biological functions in the algal cell cycle, leading to averaged molecular signatures that are non-specific to biological functions or condition dependent perturbations (Figure 1A). Equally, apical endpoints (e.g., inhibition of cell division in algae) will be observable throughout this time course, preventing the possibility of biological and temporal anchoring of hypothesised mKEs to the adverse outcome (AO). To emphasize the challenging nature of interpreting molecular results from such experiments, an analogy is drawn to a hypothetical rodent bioassay with inclusion of ante-natal, neonatal, adolescent, middle aged, old aged and pregnant animal subjects, and pooling tissue samples from all of these life-stages to infer a chemical mode of action (MoA). There is a logical argument that just as life stage is controlled in vertebrate and invertebrate animal testing, similar considerations should be given to algal toxicity testing, which regrettably have so far been limited in scope [12,13].

**Figure 1.** Abstract representation of the differences in sample composition between algal cell cultures grown in (**A**) continuous lighting over 72 h—as routinely used in OECD toxicity testing, and (**B**) alternating light:dark cycles for a single algal generation—as we propose and investigate in this study.

Conventional algal growth inhibition tests commonly apply a 72 h exposure duration. Following introduction of cell cycle synchronisation, we aim to reduce the duration of exposure, and adjust patterns of sampling time-test duration for high-throughput screening, to ultimately enable discovery of mKEs occurring before (and that are predictive of) acute growth inhibition. Here, the prime consideration is the biological anchoring of the test duration to the life-cycle of a single algal generation (Figure 1B).

While *omics* technologies hold considerable promise for the discovery of MoA(s) [14–17], to date the majority of *omics* studies have been too small-scale to deliver on this promise in algae research. Seminal studies in microalgal toxico-*omics* include those by Nestler et al., Jamers et al. [18,19] and Pillai et al. [20]. However, the majority of past study designs either tended to (i) scrutinise only single snapshots of molecular perturbation [16,18,21–25], (ii) apply test durations which lack a biologically-justified anchoring of sampling time-points [19,26–31], and/or (iii) utilise non-synchronised cultures over time-course experiments [20,32–35]. Overall, new robust, miniaturised and scalable designs in routine algal toxicity testing are required [36]. However, large scale studies factoring time and toxicant exposure are currently not supported by conventional microalgal toxicity test practices. For instance, recent multi-*omics* studies on *C. reinhardtii* feature experimental designs that are restricted in sample size, analytical methodology, and time-points due to lack of biomass [37]. This challenge of generating sufficient biological material for *omics* analysis is particularly aggravated during the testing of volatile toxicants. Due to the CO2-dependence of natural photoautotrophic growth, existing volatile test systems are incompatible with above-mentioned experimental designs.

The overarching aim of the current study was to overcome the existing limitations in the algal growth inhibition test in the context of extending its applicability to *omics*-driven toxicological biomarker research. Following some initial modifications to the culture media, the first objective was to demonstrate the benefits of synchronised algae cultures for molecular studies using a combination of untargeted metabolomics, cell counting and flow cytometry. The second objective was to implement a highly replicable, closed-vial chemical exposure system with optimised *C. reinhardtii* growth at high inoculation cell densities, while minimising chemical volatilisation and test duration. The third objective was to demonstrate the application of untargeted metabolomics to this closed-vial test system by characterising the repeatability of the metabolic perturbations to a model narcotic, chlorobenzene, across three independent exposure experiments.

#### **2. Results**

#### *2.1. Synchronised Versus Non-Synchronised Algae Cultures for Molecular Studies*

Cells grown in constant lighting or alternating 12 h:12 h light:dark conditions were compared regarding time-point specificity of zoospore release (i.e., increase in cell number and density) and mitotic activity, over a time-course of 24 h. A clear logarithmic increase of cell density over the whole measurement period (thus a constant presence of zoospore hatching events) was observed in the cell populations grown in constant light (Figure 2A). By contrast, for cell cultures grown in alternating 12 h:12 h light:dark conditions, an increase in cell number occurred exclusively after the light:dark transition in a narrow time window of approximately 2.5–3 h starting around the 16 h time-point, indicating a high population-wide coordination of zoospore release from zoosporangia within the experimental time-frame.

A similar, although phase-shifted, dynamic process was observed when mitotic activity between the two culturing regimens was compared with flow cytometric measurements (Figure 2). Cells were analysed for fluorescence of the DNA-intercalating fluorophore propidium iodide after a nucleotide staining procedure and enzymatic RNA degradation. Cell cultures grown in constant lighting conditions comprised either single cell zoospores, zoosporangia enclosing two or four zoospores, or cell particles undergoing S-phase, to narrowly-defined percentages over the whole course of the experiment (single genome copy zoospores 92.07 ± 0.77% of population, two-genome copy zoosporangia 3.25 ± 0.26%, four copy 3.21 ± 0.77%, cells in S-phase 1.48 ± 0.26%). Cell cultures from alternating light:dark conditions were composed of a highly variable and time-dependent cell population, dominated by cells containing a single genome copy (SGC zoospores) at all time-points before the light:dark transition at 12 h (97.06 ± 1.53%). Yet at the 13 h time-point, these SGC zoospores comprised only 58% of the population, with a concordant increase in cell particles in S-phase, or zoosporangia containing two or four genome copies. The number of cells containing ever greater numbers of genome copies grew exponentially at later time-points, concordant with the occurrence of multiple-fission cycles of the cells. Replication activity started as early as 11 h into the light-phase (cells in S-phase at 11 h significantly increased from 9 h, Welch's t-test *p* = 2.7 × 10<sup>−3</sup>). Nuclear division events were observed at merely 1 h after the light:dark transition (13 h time-point). By comparison, zoospore hatching and the respective increase in cell density were only observed after 15.5 h.

**Figure 2.** Cell density (**Top**) and number of distinct peaks in PI fluorescence intensity (**Bottom**; ~ amount of genomic material found in each zoosporangium or zoospore) indicating genomic division events during multiple-fission cycles for non-synchronised (**A**) or synchronised (**B**) cultures. For synchronised cultures, light:dark transition occurred at 12 h.

To analyse the impact of light-induced cell cycle synchronisation on the metabolic phenotypes of algal cultures, direct infusion mass spectrometry (DIMS) based metabolomics was applied to cells grown in alternating light:dark conditions (synchronised cell cycles) versus grown in constant lighting (non-synchronised). Initially, two small optimisation studies were conducted; the first to determine the number of washes required during the extraction of metabolites from the algal cells; the second to optimise the dilution of the metabolite extract that was infused into the mass spectrometer (see Supp. Info SI-1 and SI-T1). Basal (unexposed) algal cultures were sampled from each culturing regimen at 4 h 15 min (±15 min), 8 h 15 min (±15 min) and 12 h 15 min (±15 min) post-seeding (*n* = 8 for each light regime and time-point). Intra-study quality control samples (QC) were derived from a single pool of aliquots from each biological sample. Using the established measure of median relative standard deviation (mRSD) of measured metabolite features as a descriptor of variance [38], technical variability over the dataset was estimated from QC samples [39], and total (dominated by biological) metabolic variability was assessed from groups of biological samples. mRSD of all *m*/*z* features for the QC samples in the analysis of the synchronised cell cultures was 9.1%, while QC mRSD for analysis of non-synchronised cell cultures was 12.1%. These results indicate low technical variability and hence high quality of the DIMS metabolomics data, as previously described by Parsons et al. [38]. Total mRSD for synchronised cultures was as follows: 25.1% (4 h 15 min ± 15 min), 27.4% (8 h 15 min ± 15 min) and 30.6% (12 h 15 min ± 15 min); while for non-synchronised cultures: 36.2% (4 h 15 min ±15 min), 27.6% (8 h 15 min ±15 min) and 33.8% (12 h 15 min ± 15 min). These values indicate a slightly higher biological variability in non-synchronised cultures *versus* synchronised cultures. 
Unsupervised multivariate analysis (principal components analysis, PCA) compared the metabolic phenotypes of the algae as a function of lighting regime and time-course (Figure 3). Outliers were removed if their measurements exceeded the 95% confidence interval derived for all the biological samples, within each dataset (4 samples in non-synchronised study, 2 samples in synchronised). No metabolic separation of time-points was observed for non-synchronised cell cultures grown in constant lighting conditions. However, for cells grown in alternating light:dark conditions, a progression of their metabolic phenotypes was discovered over time (along PC1). Statistical analysis of the PC scores for each sample indicated significant differences in PC1 (ANOVA, *p* = 1.8 × 10<sup>−11</sup>) and PC2 (*p* = 6.39 × 10<sup>−3</sup>) scores between metabolic phenotypes of the time-points for cells with synchronised cell-cycles, but not for non-synchronised cell cultures (ANOVA, PC1 *p* = 0.84, PC2 *p* = 0.85). This result indicates homogeneity of the metabolic composition of samples from non-synchronised cultures, and discrete compositions over time of those from synchronised cultures. Given the data, all subsequent studies were conducted using synchronised cell cultures.
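The ANOVA on PC scores described above can be illustrated with a minimal sketch that computes the one-way F statistic for the scores of one principal component grouped by time-point. The function name `one_way_anova_f` and the example scores are hypothetical and not taken from the study; converting F to a *p*-value additionally requires the F distribution (available, for example, through `scipy.stats.f_oneway`), which is omitted here to keep the sketch dependency-free.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, e.g. PC1 scores per sampling time-point.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of samples around their own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical PC1 scores for three time-points (illustrative values only)
pc1_scores = [[-2.1, -1.9, -2.0], [0.1, -0.1, 0.0], [1.9, 2.1, 2.0]]
f_stat, df_b, df_w = one_way_anova_f(pc1_scores)
print(f"F({df_b}, {df_w}) = {f_stat:.1f}")
```

A large F, as in this toy example, corresponds to time-points that separate clearly along the component, which is the pattern reported for synchronised cultures.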

**Figure 3.** PCA scores plots comparing the DIMS metabolic phenotypes of (**A**) synchronised algal cell cultures grown in alternating light:dark conditions, and of (**B**) non-synchronised algal cell cultures grown in constant light. Plotted are PCA scores for samples taken at 4 h 15 min ±15 min (Red), 8 h 15 min ±15 min (Green) and 12 h 15 min ±15 min (Blue).

#### *2.2. Validation of C. reinhardtii Test System for Volatile Substances*

To establish the validity of the medium and vial culturing system, growth rates of synchronised *C. reinhardtii* cultures and medium pH in capped vials (7.5 × 10<sup>5</sup> cells/mL inoculation density, 10% vial air space) were checked against OECD 201 Test Guideline criteria [3] prior to initiating dose-response experiments in the vials. Adequate growth rates of >0.92 d<sup>−1</sup> (mean 1.14 d<sup>−1</sup>; *n* = 4) and a pH drift < 1.5 (mean 0.25; *n* = 4) over a 24 h incubation period were achieved, meeting the required specification (Supplementary Figure S2).
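As a worked illustration of the growth criterion, the average specific growth rate is the difference of the natural logarithms of the final and initial cell densities divided by the elapsed time; the OECD 201 threshold of 0.92 d<sup>−1</sup> corresponds to a 16-fold biomass increase in 72 h. The sketch below uses the reported inoculation density, while the final density is a hypothetical value chosen to match the reported mean rate.

```python
import math

def specific_growth_rate(n0, nt, days):
    """Average specific growth rate (per day) between two cell densities."""
    return (math.log(nt) - math.log(n0)) / days

# 16-fold increase in 3 days, i.e. about 0.92 per day
OECD_MIN_RATE = math.log(16) / 3

n0 = 7.5e5                  # inoculation density, cells/mL (from the text)
nt = n0 * math.exp(1.14)    # hypothetical density after 24 h
mu = specific_growth_rate(n0, nt, days=1.0)

print(f"threshold = {OECD_MIN_RATE:.2f} per day, observed = {mu:.2f} per day")
print("criterion met" if mu > OECD_MIN_RATE else "criterion not met")
```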

Subsequently, the modified capped-vial volatile test system was checked for robust and repeatable generation of concentration-response data, specifically for the traditional apical endpoint of algal growth, for both an OECD-recommended reference test substance 3,5-dichlorophenol (Supplementary Figure S3; [3]) and the volatile model toxicant chlorobenzene. The exposure period used was 24 h, aligning with both the cell culture studies presented above. An EC50 = 0.93 ± 0.09 mg/L was derived for 3,5-dichlorophenol, while growth data from three independent chlorobenzene exposure experiments yielded an effective concentration estimate of EC50 = 32.5 ± 3.6 mg/L (Figure 4). Of particular note is the high repeatability (low variation) in the growth data derived from the synchronised culture test system.
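For readers unfamiliar with the 4-parameter log-logistic model used for the concentration-response fit (Figure 4), the sketch below evaluates one common parameterisation. The parameter names, the slope and the asymptote values are assumptions for illustration; only the chlorobenzene EC50 is taken from the text, and the fitting procedure itself is not shown.

```python
def log_logistic_4p(conc, lower, upper, ec50, slope):
    """Response at concentration `conc` under a 4-parameter log-logistic model.

    With slope > 0 the response falls from `upper` (no effect) towards
    `lower` (maximal effect) and is halfway between them at `ec50`.
    """
    if conc <= 0:
        return upper
    return lower + (upper - lower) / (1.0 + (conc / ec50) ** slope)

# Growth as % of control; slope and asymptotes are hypothetical
params = dict(lower=0.0, upper=100.0, ec50=32.5, slope=2.0)
for c in (0.0, 10.0, 32.5, 100.0):
    print(f"{c:6.1f} mg/L -> {log_logistic_4p(c, **params):5.1f}% of control growth")
```

By construction, the predicted response at the EC50 is the midpoint between the two asymptotes, which is how the reported EC50 values should be read.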

To assess the suitability of the modified capped-vial exposure system for maintaining stable concentrations of a volatile toxicant in the medium over the test duration, extra vials were added to the experimental design and GC-MS analysis was conducted on chlorobenzene levels in the medium of these sacrificial vials, sampled at the beginning (0 h) and after 12 h 15 min incubation. Exposures were conducted and samples taken at each time-point for both low (14 mg/L) and high (24 mg/L) nominal chlorobenzene concentrations (*n* = 4 for each time-point and concentration; Figure 5). Concentrations of chlorobenzene from medium sampled at 0 h were slightly lower than nominal (Low: 12.32 ± 0.31 mg/L; High: 20.87 ± 1.78 mg/L). However, they remained stable until 12 h 15 min (Low: 12.51 ± 0.13 mg/L; High: 22.43 ± 0.78 mg/L). No significant differences were observed between the 0 h and 12 h 15 min medium chlorobenzene concentrations for either level (Welch's t-test, Low *p* = 0.39, High *p* = 0.24).
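The Welch's t-test used for this comparison can be sketched from summary statistics alone. The function below computes the t statistic and the Welch–Satterthwaite degrees of freedom; it is an illustrative helper, not the authors' analysis script, and because the published means and standard deviations are rounded it is not expected to reproduce the reported *p*-values exactly.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2      # squared standard errors
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Low group, 0 h versus 12 h 15 min (mean, sd, n as reported in the text)
t, df = welch_t(12.32, 0.31, 4, 12.51, 0.13, 4)
print(f"t = {t:.2f}, df = {df:.2f}")
```

The two-sided *p*-value then follows from the t distribution with `df` degrees of freedom.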

**Figure 4.** Dose-response curve (4 parameter log-logistic model) of 24 h algal growth inhibition experiments of chlorobenzene. The graph represents data from three independent experiments (*n* = 4 per concentration), error bars correspond to 95%-confidence intervals of fitted curve.

**Figure 5.** Barplots comparing chlorobenzene concentrations in test system medium, measured using GC-MS, to assess whether any loss occurs over the test duration. At 0 h and after a 12 h 15 min exposure period, two concentration groups 'Low' (nominal 14 mg/L) and 'High' (nominal 24 mg/L) test groups were measured. Bars represent mean ±1sd (*n* = 4 per group).

#### *2.3. Repeatability of C. reinhardtii Metabolic Phenotypes in Test System*

To characterise the reproducibility of the metabolite phenotypes of synchronised *C. reinhardtii* cultures grown in the capped-vial exposure system, DIMS metabolomics was conducted on samples generated over three independent experiments. Each biological batch comprised highly replicated algal cultures in pure growth medium (control, *n* = 10) and cultures exposed to 25 mg/L chlorobenzene (CB, *n* = 10), with all samples harvested after a 3 h incubation. This exposure concentration was selected based on the previous dose-response study (Figure 4) with the 3 h time-point assumed to capture early metabolic perturbations induced by the narcotic substance; exposures were conducted early in the light phase of the 12 h:12 h light:dark cycle. Algal cell samples of each biological batch were stored at −80 °C, then all samples were extracted and DIMS metabolomics analysis (positive ion mode, polar metabolite fraction) performed as a single analytical batch. This design allowed biological variation between the exposure experiments to be isolated and identified.

First, RSD and median RSD values were calculated for all batches and groups to characterise technical and total (dominated by biological) metabolic variability. mRSD of all *m*/*z* features within intra-study QC samples was 9.3%, confirming low technical variability and high quality of the instrumental analysis (Supplementary Figure S4). RSD distributions and mRSD values suggested observable, although small, differences between biological batches, with batches 2 and 3 displaying lower mRSD than batch 1. mRSD values in biological batches 1, 2 and 3 were—for Control groups—26.0%, 23.1% and 22.4%, and in CB groups 26.7%, 24.1% and 24.1%.
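A minimal sketch of the mRSD calculation follows, assuming a samples-by-features intensity matrix held as nested lists. The function name and the example intensities are hypothetical; the study's own processing pipeline (filtering, normalisation, missing-value handling) is not reproduced here.

```python
from statistics import mean, median, stdev

def median_rsd(samples):
    """Median relative standard deviation (%) across m/z features.

    `samples` is a list of per-sample intensity lists of equal length,
    one entry per feature.
    """
    rsds = []
    for feature in zip(*samples):      # iterate over features (columns)
        mu = mean(feature)
        if mu > 0:                     # skip features without signal
            rsds.append(100.0 * stdev(feature) / mu)
    return median(rsds)

# Hypothetical QC intensities: 4 injections x 3 features
qc = [
    [100.0, 50.0, 10.0],
    [110.0, 55.0, 12.0],
    [90.0, 45.0, 8.0],
    [100.0, 50.0, 10.0],
]
print(f"mRSD = {median_rsd(qc):.1f}%")
```

Applied to QC injections, this summarises technical variability; applied to a group of biological replicates, it summarises total variability.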

PCA was applied to study the metabolic variability inherent to the complete dataset. Two outliers were removed (batch 1, CB samples) as they exceeded the 95% confidence interval of the complete dataset, and PCA conducted on the 56 remaining samples. The PCA scores plot (Figure 6A) revealed that the three batches were not identical, with batch 1 CB samples showing the highest intra-group variance as well as differing from the batch 2 and 3 CB samples in multivariate space. Additionally, while the metabolic effects of CB relative to the control samples were apparent for batches 1 and 3, a minimal difference between controls and CB-treated samples was observed for batch 2.

**Figure 6.** PCA scores plots (PC1 vs. PC2) visualising (**A**) the metabolic differences across three exposure studies (termed biological batches), with each batch comprising control (*n* = 10) and CB-exposed (*n* = 10) samples; (**B**) only the control samples from each batch. Sample groups are biological batch 1 Control (Purple) and CB (Green), batch 2 Control (Yellow) and CB (Orange), and batch 3 Control (Red) and CB (Blue).

To further investigate, PCA was conducted on the control samples alone, from all three biological batches (Figure 6B and Supplementary Figure S5), to determine the extent to which the baseline algal metabolome varied across studies. ANOVA of the PC scores for the three control groups indicated significant differences along PC1 and PC3 (*p* = 7.15 × 10<sup>−6</sup> and 3.84 × 10<sup>−4</sup>, respectively; Table 1), indicating some inter-batch metabolic variation between the three biological batches. Similarly, PCA and subsequent ANOVA of PC scores were applied to just the CB samples, which also revealed significant metabolic differences between the three biological batches (Supplementary Figure S5 and Table 1). Potential sources of this batch effect included metabolic variation in the cultures themselves (over the duration of the three studies), the sampling of the algae, and/or other operator effects.

To achieve the high level of repeatability required for toxicity testing, we explored the effect of normalising each batch of exposure data to the corresponding control metabolic phenotypes. First, we calculated the log2 fold-change (LFC) for every *m*/*z* feature (peak), specifically by determining log2 of the ratio of the median feature intensity in the CB group over the median intensity in the respective control group (log2 (median[IntCB]/median[IntControl]), for each *m*/*z* feature). This calculation was repeated for each of the three batches. The resulting density distributions of LFC values, one for each biological batch, were visually similar (Figure 7A). Yet upon statistical testing, these LFC batch values were significantly different (Kruskal-Wallis test, *p* = 1.38 × 10<sup>−12</sup>).
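The batch-wise normalisation described above can be sketched as follows, assuming each group is a samples-by-features matrix of strictly positive intensities. The function name and the toy values are hypothetical.

```python
import math
from statistics import median

def median_lfc(treated, control):
    """Per-feature log2(median treated intensity / median control intensity)."""
    return [
        math.log2(median(t_feature) / median(c_feature))
        for t_feature, c_feature in zip(zip(*treated), zip(*control))
    ]

# Hypothetical intensities: 3 samples x 2 features per group
control = [[100.0, 40.0], [120.0, 40.0], [80.0, 40.0]]
cb = [[200.0, 20.0], [210.0, 10.0], [190.0, 30.0]]

print(median_lfc(cb, control))
```

Running this once per biological batch, each against its own controls, yields one LFC vector per batch for comparison.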


**Table 1.** ANOVA results comparing PC scores of control samples only, CB samples only, and CB samples normalised to the batch-specific controls, in each case over the three independent biological batches.

**Figure 7.** To increase comparability of exposure effects over the three repeated experiments, *m*/*z* feature intensities in CB groups were normalised to their batch-respective control groups by calculating the log2 fold-change of *m*/*z* features: (**A**) density distribution of LFC values for each batch (log2 (median[IntCB]/median[IntControl]), per *m*/*z* feature); (**B**) PCA scores plot from analysis of LFC values of all three batches (log2 (IntCB/median[IntControl]), for each *m*/*z* feature and for each of 10 CB samples).

To further explore the effect of normalising each batch of exposure data to the corresponding control metabolic phenotypes, specifically using PCA, log2 fold-changes were recalculated individually for the metabolite features of each of the *n* = 10 samples in the CB group relative to the median feature intensity of their respective control group (log2 (IntCB/median[IntControl])). This allowed PCA of these LFC values to reveal any metabolic differences between the effects of CB-treatment normalised to each batch-control, for all three batches (Figure 7B), with the scores plot now highlighting the significant overlap and therefore consistency of the batches. Indeed, ANOVA indicated no significant differences between PC1 and PC3 scores (PC1 *p* = 0.20, PC3 *p* = 0.11). However, slight batch differences were still detectable along PC2 (*p* = 0.015). Importantly, initial batch effects in PCA (Figure 6A,B) were greatly decreased by applying this normalisation strategy to the relevant batch-controls (Figure 7B).

We next evaluated the repeatability of the CB-induced metabolic perturbations in the three exposure experiments in the context of molecular biomarker discovery. This was achieved by applying set enrichment analysis [40] to the metabolomics data. This statistical process, which compares the rank order of genes based on the magnitude or significance of their responses to multiple experimental conditions, was here applied to the rank order of mass over charge (*m*/*z*) metabolic features. Three comparisons were made (Table 2).

First, the top 100 LFC *m*/*z* features from batch 1 were designated as a metabolite set (termed set batch 1, or SB1), and used to interrogate the rank order of *m*/*z* metabolic features obtained from the other two batches. The significance of this set enriching the leading edges (LE) of each of the two batches was calculated using the normalized enrichment score (NES) and by estimating a false discovery rate (FDR). This enrichment test was repeated twice, for SB2 and SB3. The results indicate that the top 100 LFC *m*/*z* features from each batch strongly and significantly enrich all biological batch-metabolite set combinations (Table 2), indicating that a common and consistent subset of metabolic markers was discovered as a toxicological signal across all the biological batches, despite small batch effects in PCA.
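To make the enrichment idea concrete, the sketch below computes an unweighted running-sum enrichment score of the kind used in set enrichment analysis: walking down a ranked feature list, the score rises at each set member and falls otherwise, and the extreme deviation from zero is reported. This is a simplified illustration under stated assumptions; the weighting scheme, the permutation-based normalisation to NES and the FDR estimation used in the study are not shown, and the feature labels are hypothetical.

```python
def enrichment_score(ranked_features, feature_set):
    """Unweighted running-sum enrichment score (Kolmogorov-Smirnov-like).

    `ranked_features` is ordered from most to least responsive;
    `feature_set` is the set being tested, e.g. a top-100 list from another batch.
    """
    members = set(feature_set)
    n_hit = sum(1 for f in ranked_features if f in members)
    n_miss = len(ranked_features) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("set must cover some, but not all, ranked features")
    running, extreme = 0.0, 0.0
    for f in ranked_features:
        running += 1.0 / n_hit if f in members else -1.0 / n_miss
        if abs(running) > abs(extreme):
            extreme = running          # keep the largest deviation from zero
    return extreme

# Hypothetical m/z feature labels ranked by LFC in one batch
ranked = ["a", "b", "c", "d", "e", "f"]
print(enrichment_score(ranked, {"a", "b"}))   # set concentrated at the top
print(enrichment_score(ranked, {"e", "f"}))   # set concentrated at the bottom
```

A score near +1 means the set clusters at the top of the other batch's ranking, which is the behaviour the reported enrichment of SB1 to SB3 describes.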

**Table 2.** Enrichment analysis applied to three metabolomics datasets following exposure of *C. reinhardtii* to chlorobenzene in independent experiments (batches 1, 2, 3). Ranked lists of *m*/*z* feature LFC were calculated (SB1–SB3 for batches 1–3) and every list was compared against every other list to determine whether SB1-SB3 were enriched with elements of the other sets. Significance of enrichment was assessed using 1000 random permutations of *m*/*z* values in LFC-ranked lists, and false-discovery rates are reported. NES = normalised enrichment score, degree of metabolite enrichment of S within dataset. LE = leading edge, subset of *m*/*z* features in S that are most impactful towards a high NES.


Finally, we sought to evaluate the metabolic perturbations in the three exposure experiments using a supervised multivariate analysis, which focuses on the most consistent changes induced in the CB-treated samples. Specifically, partial least squares-discriminant analysis (PLS-DA) was conducted on *m*/*z* features of all of the samples from the three biological batches. The algorithm selected an optimum of 5 latent variables for the classification model. While the control and CB samples were clearly separated in the PLS-DA scores plot of latent variable 1 vs. 2 (Figure 8), a robust interpretation of these results required any over-fitting of the PLS model to be assessed. Following 10-fold internal cross-validation, the PLS model yielded an R<sup>2</sup> = 0.99 (amount of variation in data explained by model) and Q<sup>2</sup> = 0.72 (cross-validated predictive ability), which indicated good generalisability.
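As a brief illustration of how a cross-validated Q<sup>2</sup> is commonly obtained, the sketch below computes one minus the ratio of the predicted residual sum of squares (PRESS) to the total sum of squares, given class labels and their held-out predictions. The labels and predictions are hypothetical, and the exact definition used by the authors' software may differ.

```python
def q2(observed, predicted_cv):
    """Q2 = 1 - PRESS / TSS, using predictions made on held-out samples."""
    mean_obs = sum(observed) / len(observed)
    press = sum((y - p) ** 2 for y, p in zip(observed, predicted_cv))
    tss = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - press / tss

# Hypothetical class labels (0 = control, 1 = CB) and cross-validated predictions
labels = [0, 0, 1, 1]
cv_predictions = [0.1, 0.2, 0.8, 0.9]
print(f"Q2 = {q2(labels, cv_predictions):.2f}")
```

R<sup>2</sup> uses the same formula with predictions from the model fitted to all samples, so a large gap between R<sup>2</sup> and Q<sup>2</sup> points towards over-fitting.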

**Figure 8.** PLS-DA scores plot visualising the metabolic effects in *C. reinhardtii* following exposure to 25 mg/L chlorobenzene (red circles) relative to unexposed controls (green circles). All samples from three independent exposure experiments (biological batches) were pooled in this analysis.

#### **3. Discussion**

Chemical risk assessment by new approach methodologies requires test protocols with experimental designs that are tailored to applying *omics*, so to gain insights into toxicity mechanisms on a global molecular level and for discovering mKEs [6,15,36,41]. We here developed an extended variant of the regulatory algal growth inhibition test towards routine large-scale investigation of molecular toxicological processes. We achieved this by combining population-wide synchronisation of molecular functions in algal cell cultures with an easily scalable and abbreviated single-generation test system, suitable for testing of both soluble and volatile toxicants, and characterised the quality of generated molecular data using metabolomics.

#### *3.1. Synchronised Cultures in Algal Toxicity Testing*

For various unicellular green algae commonly applied in growth inhibition tests (e.g., genera *Chlamydomonas*, *Raphidocelis*, *Desmodesmus*, *Scenedesmus*, *Chlorella*), the phenomenon of light-driven cell cycle synchronisation is a known manifestation of evolutionary adaptation to scarcity of light in natural environments that can be experimentally induced by introduction of alternating light:dark cycles during incubation [42–44]. The degree of induced coordination between cells is close to absolute [45]; nonetheless, the benefits of this algae culture synchronisation in the context of environmental toxicology have previously only been given brief consideration [12]. Exposure and effect analysis of multiple life-stages within the same experiment are commonly avoided in toxicity assessments due to introduction of various confounding factors [46]. Initial matching of test organisms by life-stage is prescribed in the standardised OECD test criteria for most environmental model organisms, such as water flea, birds, bee, annelids, springtail, molluscs, amphibia and fish [47]. As one of the few cell-based in vivo systems, microalgal bioassays do not feature the requirement of life-stage/cell-cycle matching [2,3]. Commonly prescribed test conditions, including constant lighting, may further be considered non-representative of natural environments, and drive suboptimal metabolic efficiency and rapid cell cycle misalignment [11,48,49].

In the presented data, coordination of genomic division and sporulation events over the complete algal population in cultures grown in light:dark cycles, and lack thereof in cultures grown in constant light, was confirmed via electronic cell counting and flow-cytometry. This difference was indicative of cell cycle synchronisation in algae grown under alternating light cycles [45]. From the perspective of mechanistic (eco-)toxicology, one significant benefit of the observed population-wide coordination of nuclear division and sporulation is the ensuing occurrence and measurability of the algal adverse phenotypic endpoint within a narrow time-window across the cell population. This in turn enables temporal phenotypic anchoring of the AO to prior molecular perturbations, thereby defining the AOPs [9]. In contrast, described coordination and opportunity of mechanistic anchoring is lost in algal cell culturing using constant light regimes [11], due to consistent occurrences of nuclear division and sporulation over the entire experimental time-frame.

In mammalian cell models, synchronisation is routinely applied to enable detailed investigations into specific phases of the cell cycle [50]. In microalgae such as *C. reinhardtii*, synchronisation of the cell cycle has been applied to study various biological processes [51–55]; however, its explicit importance for toxicological investigations remains undervalued [12,13]. Our experiments comparing the metabolite phenotypes from *C. reinhardtii* grown under the two culturing regimens emphasised substantial and significant changes in the molecular composition in the synchronised *C. reinhardtii* cultures through time. Cells are tightly aligned in their progress through the entire cell cycle up to nuclear and cellular division [56] and characteristically evolving metabolic and transcriptomic patterns have previously been linked to this progression in *C. reinhardtii* [51,57]. In both pro- and eukaryotic cells, evidence is mounting that metabolic activity is not merely affected by position in the cell cycle, but a major driver of it [58,59]. Accordingly, distinct changes of metabolic patterns along the cell cycle were discernible in the scope of metabolomics analysis of synchronised cell populations. However, a complete lack of such an effect was apparent in cultures grown in constant light as prescribed by regulatory guidelines. Under non-nutrient-limited exponential growth conditions, non-synchronised cultures grown in constant light will comprise algal cells from every possible stage of the cell cycle, irrespective of experimental time-point. Although individual cells still progress through the cell cycle, extracted samples for molecular studies comprise cross-sections over the whole complement of cell cycle-dependent biological functions, as evidenced by the high similarity of metabolic profiles between successive time-points over the experimental time-frame.

Metabolic phenotypes from samples of synchronised algae cultures taken at specific time-points during the exposure period were representative of narrow windows of biological functions anchored to the cell cycle. The same is true for perturbations of cellular functions (mKEs), which can be precisely measured in the respective time-course analyses. Light:dark cycle-induced algal cell cycle synchronisation should thus be able to improve *omics*-driven time-course analysis of mKEs by avoiding phase-shifts of molecular signals in the test system [60,61]. In non-synchronised cultures, discrete sampling intervals cannot be precisely rooted to cell-cycle-dependent biological processes. Interpretation of toxicological mechanisms remains limited to snapshots of overlaying molecular cross-sections of algal populations, and molecular phenotypes are averaged over the heterogeneous cell population within a sampling time-point. We conclude that synchronised cell cultures offer an approach to study molecular biological processes and their perturbations with strongly reduced variability, enabling increased analytical resolution of both.

#### *3.2. Design and Validation of an Alternative Volatile Testing System*

Reducing the duration of a conventional algal test protocol would increase sample throughput and thus facilitate applicability for high throughput *omics* technologies, e.g., for applications in read-across and prioritisation. For the purpose of discovering mKEs, the commonly prescribed 72 h test duration represents an arbitrarily-selected timeframe to capture low-concentration delayed-onset mechanisms of toxicity. A warranted question in this context is why the standard test duration should not be increased to even longer durations to optimise sensitivity of the system to lower concentration effects (bioaccumulation or damage). Given the selection of inhibition of cellular division as the AO for microalgal toxicity, it can be assumed that AO-inducing key events must occur and be measurable within perturbations of preceding biological functions [62,63] and thus within previous cell cycle stages. Justification of the test duration in microalgal exposure studies aiming to discover mKE and infer chemical MoA, therefore, should not hinge upon absolute length, but on biological anchoring, i.e., the minimal time required to link molecular stress responses to successive phenotypic effects. A reduction of the test duration to encompass a single generation of synchronised cells (Figure 1B) is proposed in this context. It may be argued that an optimal sampling strategy would extend exposure and sampling to the respective dark phase of the alternating light:dark cycle after sporulation, allowing the effects of a chemical on dark cycle processes to be examined. The practical implementation of such a protocol is not readily feasible, because exposure of algae to even minimal light risks activating biological functions indicative of a commencing cell growth cycle [51,64]. Furthermore, algal biological activity in complete darkness is indicated to be largely limited to mitochondrial energy metabolism, which is maintained throughout the light phase [51,65–67]. Consequently, the proposed time-frame from commencement of light phase to sporulation covers the single largest set of molecular and cellular functions within a cell cycle that chemicals may perturb.

Algal toxicity testing of volatile chemicals has historically proven problematic when combined with requirements for test miniaturisation and increased harvestable biomass, due to natural autotrophic requirements. Conventional agitation- and aeration-based exposure vessel systems become highly cumbersome to maintain with increasing experimental scale, and the stability of mid-concentration exposures is difficult in open flask culturing. A variety of solutions has been proposed to address the issue, including tailored complex flasks [68–70] and variants of bottle or tube testing [71–79]. The lower limit of *C. reinhardtii* biomass to achieve reproducible data in metabolomics and lipidomics studies has been reported at ca. 2–2.5 × 10<sup>6</sup> cells per sample [80,81], and similar or higher for transcriptomics [51,82–84] or proteomics [22,85]. This precludes the application of existing methods to large-scale (multi-)*omics* studies due to low cell densities, problematic physical scalability, disturbance of phase equilibria during repeat-sampling procedures, strong pH drift, or a combination thereof.

Here we developed and validated a hybrid sacrificial capped-vial exposure system to address existing shortcomings. The achieved biomass of >7.5 × 10<sup>6</sup> harvestable cells per test vial has been substantially increased from comparable vial systems [75–77] and is sufficient to generate biological material for the demands of multiple *omics* technologies in parallel. In the exposure system, synchronous *C. reinhardtii* cultures performed well with mean growth rates at 1.14 d<sup>−1</sup>, well above the validity requirement of 0.92 d<sup>−1</sup> [3]. Low pH drift of 0.25 over the test duration was achieved (required <1.5 [3]), precluding risk of changing bioavailability due to photosynthesis-driven pH changes, an issue more likely observed in other volatile test systems [68,71,86].

Past studies hypothesising on chemical MoA in algae were frequently limited in exploring the temporal dynamics of toxicity. Many existing multi-*omics* studies attempting to deduce a MoA of chemicals as a dynamic process in *C. reinhardtii* have employed experimental designs with relatively low numbers of sampling time-points [18,19,21–24,37,87]. Again, the required experimental designs are not sufficiently supported by conventional microalgal toxicity test practices, and the scarcity of existing publications with time-series analyses of algae is partly due to a relative lack of physical scalability. The volatile testing system reported here represents a miniaturised and abbreviated variant of the conventional algal growth inhibition test, reorienting its focus on molecular events predictive of a measurable acute adverse outcome after a single clonal generation of *C. reinhardtii*. By utilising small glass vials, parallelisability of test vessels and thus factorisation of experimental variables can in principle be drastically increased in individual experiments. Capacity to test hundreds of vial replicates in parallel, per individual toxicity test, has been achieved in multiple other studies in the authors' laboratory (in prep.), enabling experimental designs to accommodate high replication, temporal resolution and multiple exposure concentrations.

To demonstrate that the designed testing approach generates reproducible results for soluble and volatile toxicants, dose-response experiments with 3,5-DCP and chlorobenzene were performed and the resulting data compared to existing knowledge. The EC50 value for the reference substance 3,5-DCP was estimated at 0.93 ± 0.09 mg/L, well within the range of 0.5 to 6.1 mg/L (48 to 72 h-EC50) observed for *C. reinhardtii* and other green algae [88]. The 24 h-EC50 of the highly volatile chemical chlorobenzene for *C. reinhardtii* was estimated at 32.52 ± 3.64 mg/L (data of three repeats), again within range of published closed-test system data (*P. subcapitata* 48 h-EC50 = 14.4 mg/L [79]; *Selenastrum capricornutum* 96 h-EC50 = 12.5 mg/L [69]), and drastically reduced compared to results derived from open-flask testing (*S. capricornutum* 96 h-EC50 = 224 mg/L [89]). When accounting for inter-species variability in sensitivity and differences in test exposure durations and physical setups [69], the estimated EC50 suggests adequate performance of the vial exposure system within expected ranges of variation.

Another requirement of the exposure system was to enable stable exposure concentrations of volatile toxicants over the experimental timeframe. Measured chlorobenzene levels at the start of the experiment were lower than nominal (14 mg/L nominal measured at 12.32 ± 0.31 mg/L; 24 mg/L nominal measured at 20.87 ± 1.78 mg/L); however, the levels did not change significantly over the 12 h 15 min incubation period. This indicates chemical loss during initial stock preparations, but high stability of concentration levels over the incubation period. According to Henry's law, the fraction of chlorobenzene in the gas phase is calculated as $\%\mathrm{CB}_g = V_g / (V_g + V_L \cdot R \cdot T / K_H)$, with $V_g$ = 10% vial gas phase, $V_L$ = 90% vial liquid phase, $T = 298.15\ \mathrm{K}$, $R = 8.205 \times 10^{-5}\ \mathrm{m^3 \cdot atm / (K \cdot mol)}$ and $K_H = 3.7 \times 10^{-3}\ \mathrm{atm \cdot m^3 / mol}$ [90]. At equilibrium, only 1.65% of chlorobenzene theoretically partitions into the gas phase. A similar calculation for the expected partitioning of 204 Volatile Organic Compounds [91] suggests that 85% would partition less than 6% into the gas phase of the proposed exposure system. In conclusion, stable exposure over the time-course of the experiment was well achievable using the designed capped-vial testing system, and minimal partitioning into the gas phase occurs during the experimental timeframe. Anticipation of and correction for initial volatilisation loss during stock preparation of highly volatile chemicals at inoculation is advised for further studies.
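The partitioning estimate above can be reproduced with a few lines of arithmetic. The following is a minimal sketch using only the relation and parameter values quoted in the text; the function and variable names are illustrative and not taken from the study.

```python
# Gas-phase fraction of a volatile compound in a sealed vial at equilibrium,
# using the Henry's law relation and the parameter values quoted in the text.
# Function and variable names are illustrative assumptions.

R = 8.205e-5   # gas constant, m3·atm/(K·mol)
T = 298.15     # temperature, K


def gas_phase_fraction(v_gas, v_liquid, k_h, temperature=T):
    """Return the fraction of the compound residing in the gas phase.

    v_gas, v_liquid: gas and liquid volumes (any consistent unit or fraction)
    k_h: Henry's law constant in atm·m3/mol
    """
    return v_gas / (v_gas + v_liquid * R * temperature / k_h)


# Chlorobenzene in a vial with 10% headspace and 90% liquid phase
cb_fraction = gas_phase_fraction(v_gas=0.10, v_liquid=0.90, k_h=3.7e-3)
print(f"Chlorobenzene in gas phase: {cb_fraction:.2%}")
```

Substituting the Henry's law constant of another compound gives the corresponding estimate for the same vial geometry.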

#### *3.3. Characterising Variability and Repeatability of Metabolomics Data*

Robust and reproducible determination of biomarkers and toxicological key events predictive of chemical insult from the algal metabolome is paramount for downstream utilisation of the generated data to support environmental hazard assessments. To characterise the reproducibility of data gained from our exposure approach, differences in the metabolite phenotypes between three repeated exposure studies were compared. Ranges of recorded median RSD values (20.86% to 24.99%) lay within the expected ranges for test organisms and cells, e.g., median RSD values have been reported for human cell lines between 20–22% [38].
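For readers unfamiliar with the metric, the sketch below shows one way a median RSD can be computed from a feature intensity table. It assumes the sample standard deviation and uses hypothetical intensities; it is not the processing code used in the study.

```python
from statistics import mean, median, stdev


def rsd_percent(values):
    """Relative standard deviation (%) of one feature across samples.

    Assumption: the sample standard deviation (n - 1) is used.
    """
    return 100.0 * stdev(values) / mean(values)


def median_rsd(samples):
    """Median of per-feature RSD values.

    samples: list of samples, each a list of feature intensities.
    """
    n_features = len(samples[0])
    per_feature = [
        rsd_percent([sample[j] for sample in samples])
        for j in range(n_features)
    ]
    return median(per_feature)


# Hypothetical intensities: three samples (rows) by three m/z features (columns)
example = [
    [90.0, 80.0, 70.0],
    [100.0, 100.0, 100.0],
    [110.0, 120.0, 130.0],
]
example_median_rsd = median_rsd(example)
print(f"Median RSD: {example_median_rsd:.1f}%")
```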

Small batch effects could be observed between the baseline metabolic phenotypes across the three repeats of the exposure experiment when compared in PCA. A number of possible factors might have contributed to these differences, including subtle changes in laboratory maintenance that unknowingly affected the growth and biological activity of the sensitive algal cultures, as reported previously for microalgal bioassays [92–95]. We demonstrated that normalisation of the observed differences in algal stress response, by calculating log2 fold-change of the measured *m*/*z* feature intensities to their batch-specific biological baseline (controls), proved to mitigate existing batch differences.
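The baseline normalisation described above can be illustrated as follows. This is a minimal sketch that assumes the batch baseline is the per-feature mean of the control samples in the same batch; the intensities are hypothetical and the function name is illustrative.

```python
import math
from statistics import mean


def log2_fold_change(treated, controls):
    """Log2 fold-change of each treated sample against the control baseline.

    treated, controls: lists of samples from ONE batch, each sample a list
    of feature intensities. Assumption: the baseline is the per-feature
    mean of that batch's controls.
    """
    n_features = len(controls[0])
    baseline = [mean(s[j] for s in controls) for j in range(n_features)]
    return [
        [math.log2(s[j] / baseline[j]) for j in range(n_features)]
        for s in treated
    ]


# Hypothetical data: batch B carries a two-fold intensity offset versus batch A
batch_a = {"control": [[100.0, 50.0], [120.0, 50.0]], "treated": [[220.0, 25.0]]}
batch_b = {"control": [[200.0, 100.0], [240.0, 100.0]], "treated": [[440.0, 50.0]]}

lfc_a = log2_fold_change(batch_a["treated"], batch_a["control"])
lfc_b = log2_fold_change(batch_b["treated"], batch_b["control"])
print(lfc_a, lfc_b)
```

In this toy case the raw intensities differ two-fold between batches, yet the fold-changes relative to each batch's own controls coincide.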

Despite these small batch effects in PCA, the algal metabolic stress response appeared to be regulated by a consistent mechanism across studies, as indicated by PLS-DA on pooled data. Additionally, we applied set enrichment analysis to metabolomics data, an approach originally developed for analysing gene set enrichment [40] which lately has become increasingly applied in metabolomics [96–99]. Set enrichment analysis was conducted to characterise the overall degree of similarity of whole sets of metabolic markers associated with chlorobenzene exposure between the repeat experiments. Metabolite sets containing the most important class markers within each of the biological batches were found to be significantly and strongly enriched within the top ranks of *m*/*z* features within the other batches, adding further support to our conclusion that a consistent toxicological effect could be measured.

#### **4. Materials and Methods**

#### *4.1. Algae Culturing*

Axenic *Chlamydomonas reinhardtii* Dangeard (1888) CC-125 wild type mt+ was purchased from the Culture Collection of Algae and Protozoa (CCAP, Dunbeg, UK). The strain was maintained heterotrophically on agar plates (1.5% *w*/*v* agar in tris-acetate-phosphate media) on laboratory shelves at 23 °C until liquid inoculation. A modified growth medium (termed *Chlamydomonas* growth medium; CGM) was developed here based on Sueoka's high salt medium supplemented with Kropat's trace elements [100,101] with the following modifications. The final concentrations of the divalent inorganic salts CaCl2·2H2O and MgSO4·7H2O were increased to 50 mg/L and 100 mg/L, and the concentration of total phosphates was reduced by 92.5% to 54 mg/L KH2PO4 and 108 mg/L K2HPO4 to decrease both palmelloid formation and mass spectrometric ion suppression due to phosphate accumulation in *C. reinhardtii* cells [102–104]. To increase inorganic carbon levels, the growth medium was supplemented with 500 mg/L NaHCO3. As *C. reinhardtii* growth is optimal between pH 5.5 and 9 [105,106], pH was adjusted to 6 to shift carbon equilibrium towards photosynthetically-available carbonic acid species [107]. To retain buffer capacity for a photosynthesis-driven change in medium pH, CGM was supplemented with 20 mM MOPS buffer, similar to Renberg et al. [102].

The growth medium was sterilised by 0.22 μm-filtration to avoid degradation of MOPS, salt precipitation and shift of the carbon equilibrium caused by autoclaving [108]. For initial liquid stock cultures, agar cultures were inoculated into 35 mL CGM within foam-bunged 250 mL wide-neck glass flasks. Cultures were incubated in a Multitron Pro rotary shaking incubator (Infors HT, Bottmingen-Basel, Switzerland) at 25 °C and 200 rpm orbital shaking. Lighting conditions were either set as constant lighting (effecting non-synchronised cell cycles), or as a 12 h:12 h light:dark cycle (inducing population-wide synchronisation of cell cycle). Light colour temperature was 8500 K. Algae were adapted to this lighting and autotrophic growth conditions in liquid cultures for at least 120 h prior to experiments.

The first study compared the metabolic phenotypes and mitotic activity of cell cultures grown under continuous lighting versus those grown under alternating 12 h:12 h light:dark cycles. Algae grown in either condition were seeded in vials as described below. Samples were taken at 4 h 15 min (±15 min), 8 h 15 min (±15 min) and 12 h 15 min (±15 min) post-seeding. 5.25 × 10<sup>6</sup> cells (7 mL) were taken from synchronised samples at every time-point, and the sampling volume of the non-synchronised samples was adjusted to match this cell number by prior electronic cell counting. The latter was performed using CASY TT cell counting technology (Roche Innovatis AG, Basel, Switzerland), which operates via electric pulse area analysis to count suspended particles. As described below, cells were harvested and stored at −80 °C until combined metabolite extractions.
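The volume adjustment described above is a simple ratio of target cell number to measured cell density. A minimal sketch follows; the synchronised density is assumed equal to the seeding density given in Section 4.2, and the non-synchronised density is a hypothetical count.

```python
TARGET_CELLS = 5.25e6  # cells per sample, as stated in the text


def sampling_volume_ml(cells_per_ml, target_cells=TARGET_CELLS):
    """Volume (mL) that contains the target number of cells."""
    return target_cells / cells_per_ml


# Synchronised culture at the seeding density of 7.5e5 cells/mL (assumed
# unchanged before division) versus a hypothetical non-synchronised culture
# that has already grown to 1.05e6 cells/mL.
vol_sync = sampling_volume_ml(7.5e5)
vol_nonsync = sampling_volume_ml(1.05e6)
print(f"synchronised: {vol_sync:.1f} mL, non-synchronised: {vol_nonsync:.1f} mL")
```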

Over the course of this study, further samples were taken for flow cytometric measurements of genomic material in cell particles at 4 h, 8 h, 12 h and 16 h post-seeding for non-synchronised cultures, and 3 h, 6 h, 9 h, 11 h, 13 h, 14.5 h, 16 h and 18 h post-seeding for synchronised cultures. 1 × 10<sup>6</sup> cells were sampled (*n* = 3), centrifuged at 7000× *g* for 3 min (23 °C), pellets washed once in 1 mL phosphate-buffered saline (PBS), and cells fixed in 70/30 ethanol/water (both HPLC-grade, Fisher Scientific, Loughborough, UK) for 24 h at 4 °C. This last step served to permeabilise cells for propidium iodide (PI) and to extract chlorophylls which might interfere with the PI emission spectrum [109]. The cells were washed again in 1 mL PBS and incubated in the dark for 30 min in 500 μL PBS containing 0.1 mg/mL RNAse-A (Qiagen, Hilden, Germany) and 0.01 mg/mL PI in water (94%, Sigma-Aldrich, Gillingham, UK). For 1 × 10<sup>4</sup> particles per sample, PI fluorescence was measured using a Becton Dickinson FACS-Calibur (488 nm excitation, bandpass filter 610/10), and analysed using open-source Flowing v2.5.1 (flowingsoftware.btk.fi). Absence of chlorophyll interference was confirmed with non-stained samples. Due to the haploidy of *C. reinhardtii* vegetative cells, discrete peaks in per-particle fluorescence intensity histograms from cells sampled at a given time were equated to the number of genome copies present within each cell particle (zoosporangia or zoospores) post nuclear division. Fluorescence areas in between discrete peaks were assigned to cells currently undergoing DNA replication ('S-phase'). Growth curves of both culture regimens were determined using CASY cell-counting, at 2–4 h intervals for non-synchronised cultures and at higher resolution (30 min intervals) at the light:dark transition for synchronised cultures.

For each sampling time-point, vials were of sacrificial nature to maintain carbon equilibrium. Synchronised cell culturing was determined as preferable to non-synchronised cultures, as described in results, and used in all further studies.

#### *4.2. Chlorobenzene Exposures*

To expose cells, quadruplicate independent culture batches of synchronised cultures were grown in open flask culture, and at commencement of the light phase the cell cultures were concentrated individually by centrifugal separation in 50 mL tubes at 1200× *g* for 3 min (Eppendorf 5920 r, Eppendorf, Stevenage, UK). The supernatant was decanted, cells were diluted in toxicant-spiked CGM to yield a final cell concentration of 7.5 × 10<sup>5</sup> cells/mL, and seeded in clear 10.5 mL glass vials with aluminium cap inlay (No. T101/V4, Scientific Glass Laboratories, Stoke-On-Trent, UK) to establish 10% total gas phase volume within each vial (Figure 9). Vials were capped and incubated for 24 h, rolling free on a lined tray in the rotary incubator to maintain cell suspension. Preparation steps were restricted to a maximum duration of 30 min within the onset of light phase. Culture cell density for inoculation was scrupulously controlled and growth rates measured in technical duplicate using CASY cell counting.
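Growth rates from cell counts are conventionally expressed as the average specific growth rate, i.e., the difference of log-transformed cell densities divided by the elapsed time. The sketch below assumes this standard definition; the end-point count is hypothetical and the 0.92 d<sup>−1</sup> threshold is the validity criterion cited in the discussion.

```python
import math


def specific_growth_rate(n_start, n_end, days):
    """Average specific growth rate (per day) between two cell counts."""
    return (math.log(n_end) - math.log(n_start)) / days


n0 = 7.5e5    # seeding density, cells/mL (from the text)
n24 = 2.35e6  # hypothetical density after 24 h, cells/mL

mu = specific_growth_rate(n0, n24, days=1.0)
meets_validity = mu >= 0.92  # validity threshold quoted in the discussion
print(f"growth rate = {mu:.2f} per day, valid = {meets_validity}")
```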

Toxicity tests on synchronised cell cultures were performed using chlorobenzene (CB; 99.9%, Sigma-Aldrich, Gillingham, UK) as a volatile substance and 3,5-dichlorophenol (3,5-DCP; 97%, Sigma-Aldrich, Gillingham, UK) as the reference substance. All tests were performed with biological quadruplicates. *C. reinhardtii* was exposed to CB in three independent experiments at individual concentration ranges: 0, 15, 24, 38, 60, 95 mg/L (repeat 1, for range-finding); 0, 3.5, 7, 10.5, 14, 17.5, 21, 25, 37.5, 50 mg/L (repeat 2, adjusted range to encompass curve slope); 0, 1.5, 3, 6, 12, 18, 25, 36, 48 mg/L (repeat 3, for focus on environmentally-relevant lower end of dose-response curve). Cells were exposed to 3,5-DCP at 0, 0.6, 1.2, 2.4, 4.8 and 9.6 mg/L. A 4-parameter log-logistic dose-response model was fitted and effective concentrations (EC) were estimated using R [110] and the drc package.
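To make the model concrete, the sketch below writes out the four-parameter log-logistic function in its common parameterisation (slope b, lower limit c, upper limit d, EC50 e) and derives an effective concentration from given parameters. It does not perform the curve fit itself, and all parameter values are hypothetical rather than the fitted values of this study.

```python
def ll4(x, b, c, d, e):
    """Four-parameter log-logistic response at concentration x.

    b: slope, c: lower limit, d: upper limit, e: EC50 (same unit as x).
    """
    return c + (d - c) / (1.0 + (x / e) ** b)


def effective_concentration(p, b, e):
    """Concentration causing a p% reduction of the response range (d - c)."""
    return e * (p / (100.0 - p)) ** (1.0 / b)


# Hypothetical parameters for illustration only
b, c, d, e = 2.0, 0.0, 1.14, 32.5

ec50 = effective_concentration(50.0, b, e)
ec10 = effective_concentration(10.0, b, e)
print(f"EC50 = {ec50:.1f} mg/L, EC10 = {ec10:.1f} mg/L")
print(f"response at EC50 = {ll4(ec50, b, c, d, e):.2f}")
```

The response at EC50 lies halfway between the lower and upper limits, which serves as a quick sanity check on any fitted parameter set.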

**Figure 9.** Individual steps for centrifugal concentration, cell density measurement, dilution, seeding of *C. reinhardtii* cell cultures into T101/V4 vials, and incubation of vials on a lined tray. Biological replicates in vials (*n* = 4 per experimental group) were generated from independent flask cultures.

A further exposure study was conducted to compare the metabolite phenotypes of synchronised cell cultures in pure growth medium (control) versus cells exposed to 25 mg/L CB, to characterise the metabolic variability and repeatability of data generated in the exposure vial system. Ten biological replicates for each of the control and CB groups were seeded in capped vials, incubated for 3 h, harvested as described below, and cell pellets stored at −80 °C until metabolite extraction. This experiment was repeated three times in an identical set-up, with 5 d between experiments. Metabolite extraction was performed for all samples of the three repeats combined.

#### *4.3. Sampling and Metabolite Extraction for DIMS Metabolomics*

#### 4.3.1. Sampling Procedure

Modifying the procedure by Lee and Fiehn [80], algae cell suspensions were quenched by injection into an equivalent volume of −78 °C 70:30 methanol:water (HPLC-grade, Fisher Scientific, Loughborough, UK) solution in 15 mL tubes over dry ice. Quenched cell suspensions were centrifuged at 3800× *g* and −11 °C for 3 min, and the supernatant aspirated. The number of washes was first optimised in a pilot study by comparing 10 samples from each of three groups: no wash, single wash of cell pellet, or two washes. Each wash comprised resuspension of the cell pellet in 10 mL of −20 °C 35% methanol:water, then repeating the centrifugation and supernatant aspiration (described above). Between centrifugation and aspiration steps, samples were stored in a custom-manufactured mobile freezer unit between −20 °C pre-cooled aluminium beads to minimise metabolic activity while avoiding freezing of the suspension, and thus the risk of cell lysis, at lower temperatures. After aspiration, pellets were flash-frozen in liquid nitrogen, and all samples stored at −80 °C until metabolite extraction. A single pellet wash was determined as optimal (see results in Supplementary Figure S1) and used for all further experiments in this study.

#### 4.3.2. Metabolite Extractions

Biphasic metabolite extractions were performed with a final solvent ratio of methanol:chloroform:water (chloroform HPLC-grade, Fisher Scientific, Loughborough, UK) of 2:2:1.8 [111]. Cell pellets were thawed in randomised subsets of 24 tubes on ice. Cell pellets were resuspended in 970 μL ice-cold 70:30 methanol:water, transferred into homogenisation tubes with 0.5 mm glass beads (VK05, Stretton Scientific, Stretton, UK) on ice, and immediately homogenised at 6800 rpm for 2 × 15 s bursts using a Precellys-24 tissue homogeniser (Bertin Instruments). Samples were placed back on ice for 2 min, then the supernatant (742 μL) was transferred to a 1.8 mL glass vial (aluminium-lined caps) containing 795 μL of ice-cold 1:2 water:chloroform, to yield the final solvent ratio of 2:2:1.8. Vials were vortexed for 30 s and centrifuged for 15 min at 2585× *g* at 4 °C to achieve phase separation. Half the polar phase was aliquoted to a 2 mL Eppendorf tube using a glass syringe for untargeted metabolomics of the polar metabolite fraction (the other half was retained for a second metabolomics assay, i.e., mass spectrometry in a different ion mode). Each aliquot was dried in a SPD11V SpeedVac sample concentrator (Thermo Scientific, Rugby, UK) at room temperature. Extracts were stored at −80 °C until reconstitution in injection solvent (below). The same procedure was performed to generate an extraction blank sample.
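The stated final solvent ratio can be checked from the two volumes combined in the extraction vial. The sketch below is a simple volume balance that assumes ideal mixing and ignores any water carried over in the cell pellet; the helper name is illustrative.

```python
def solvent_volumes(mixture_ul, ratio):
    """Split a mixture volume (μL) into its components by volume ratio."""
    total = sum(ratio.values())
    return {name: mixture_ul * part / total for name, part in ratio.items()}


# 742 μL of 70:30 methanol:water plus 795 μL of 1:2 water:chloroform
homogenate = solvent_volumes(742.0, {"methanol": 70, "water": 30})
organic = solvent_volumes(795.0, {"water": 1, "chloroform": 2})

final = {}
for part in (homogenate, organic):
    for name, volume in part.items():
        final[name] = final.get(name, 0.0) + volume

# Express relative to methanol = 2 for comparison with the nominal 2:2:1.8
scale = 2.0 / final["methanol"]
normalised = {name: round(volume * scale, 2) for name, volume in final.items()}
print(normalised)
```

Under these assumptions the mixture comes out close to the nominal methanol:chloroform:water ratio of 2:2:1.8.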

#### *4.4. Direct Infusion Mass-Spectrometry (DIMS) Metabolomics*

Sample preparation, direct infusion mass spectrometric analyses and data processing were performed following the procedure of Southam et al. [39]. To maximise the number of reproducibly detected *m*/*z* features, the dilution of the reconstituted algal extracts was first optimised (see results in Supplementary Table S1). A reconstitution volume of 50 μL injection solvent (80:20 methanol:water containing 0.25% *v*/*v* formic acid (98%, VWR)) was found to result in the maximal number of unique *m*/*z* features post-processing; this option was therefore used for preparation of all further samples. Ten μL of each reconstituted sample was pooled into an intrastudy quality control (QC) sample. All samples (biological, extraction blank, intrastudy QC) were analysed as technical quadruplicates (10 μL) on an Orbitrap Elite mass spectrometer (Thermo Scientific) with a direct infusion nanoelectrospray ionisation (nESI) source (Triversa, Advion Biosciences, Ithaca, NY, USA) in positive ionisation mode. Individual SIM-windows of samples were stitched into single mass spectra, ≥75% replicate filtered, global peak aligned with a 3 ppm error tolerance, extraction blank filtered, QC-based locally-estimated scatterplot smoothing (LOESS) signal corrected [112], and ≥80% QC sample-based peak filtered. Features with ≥20% missing values were filtered out, and probabilistic quotient normalisation was applied [113]. RSD values were calculated for each *m*/*z* feature, and features with RSD values >30% in QC samples were omitted from the data. Missing value imputation (k-NN) was performed and the generalised log transformation applied for multivariate data analysis.
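As an illustration of one step in this workflow, the sketch below outlines probabilistic quotient normalisation in its basic form. It assumes the reference spectrum is the per-feature median across all supplied samples and omits any preceding total-intensity normalisation; it is not the processing code used in the study, and the intensities are hypothetical.

```python
from statistics import median


def pqn(samples):
    """Probabilistic quotient normalisation (basic form).

    samples: list of samples, each a list of positive feature intensities.
    Assumption: the reference spectrum is the per-feature median over all
    supplied samples.
    """
    n_features = len(samples[0])
    reference = [median(s[j] for s in samples) for j in range(n_features)]
    normalised = []
    for sample in samples:
        # Quotient of each feature against the reference spectrum
        quotients = [
            sample[j] / reference[j]
            for j in range(n_features)
            if reference[j] > 0
        ]
        # The median quotient is taken as the sample's dilution factor
        factor = median(quotients)
        normalised.append([value / factor for value in sample])
    return normalised


# Hypothetical samples that differ only by overall dilution
raw = [[10.0, 20.0, 30.0], [20.0, 40.0, 60.0], [5.0, 10.0, 15.0]]
normalised_samples = pqn(raw)
print(normalised_samples)
```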

Statistical tests (t-test, Kruskal-Wallis test, ANOVA) were performed using R and the R stats base package; PCA and PLS-DA were conducted using MetaboAnalyst v4.0 [114]. Set enrichment analysis was conducted with a Java-based tool [40]. The method calculates an enrichment score of a supplied set S of variables against test data, to determine whether variables of S tend to occur among the *m*/*z* features that rank highly in importance for separation of phenotype in the test data. S commonly comprises elements of a biological pathway (pathway enrichment analysis), but may also contain individually defined variables. Here, S comprised *m*/*z* features with high log2 fold-change (LFC) in one biological batch, which were checked for enrichment against LFC-ranked lists of *m*/*z*-values in the other biological batches. Thus, it was determined whether those *m*/*z*-values that were responsible for LFC-driven phenotype-separation in one biological batch were equally important in the other biological batches. Strength of the enrichment was calculated using the normalised enrichment score (NES), and significance was determined by estimating false-discovery rate corrected *p*-values using 1000-fold permutation of the *m*/*z* features. *m*/*z* features within S deemed most impactful towards a high NES are referred to as the leading edge subset (LE, [40]).
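The core of set enrichment analysis is a weighted running-sum statistic over the ranked feature list. The sketch below implements that raw enrichment score in the commonly described form (weight exponent of 1); the permutation-based normalisation to NES and the FDR estimation are omitted, and the feature names and scores are hypothetical.

```python
def enrichment_score(ranked, feature_set, weight=1.0):
    """Raw enrichment score of feature_set in a ranked feature list.

    ranked: list of (feature_id, score) tuples sorted by descending score.
    feature_set: set of feature ids (the set S).
    Returns the running-sum value with the largest deviation from zero.
    """
    in_set = [feature_id in feature_set for feature_id, _ in ranked]
    n_hits = sum(in_set)
    n_miss = len(ranked) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("feature_set must cover some but not all features")

    # Normalising constant: summed weighted scores of the set members
    hit_norm = sum(
        abs(score) ** weight
        for (_, score), hit in zip(ranked, in_set)
        if hit
    )

    running, best = 0.0, 0.0
    for (_, score), hit in zip(ranked, in_set):
        if hit:
            running += abs(score) ** weight / hit_norm
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical LFC-ranked m/z features from one batch
ranked_features = [
    ("mz_a", 3.0), ("mz_b", 2.0), ("mz_c", 1.0), ("mz_d", -1.0), ("mz_e", -2.0),
]
es_top = enrichment_score(ranked_features, {"mz_a", "mz_b"})
es_bottom = enrichment_score(ranked_features, {"mz_d", "mz_e"})
print(es_top, es_bottom)
```

A set concentrated at the top of the ranking yields a score near +1, and a set concentrated at the bottom yields a score near −1.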

#### *4.5. Water Chemistry of Chlorobenzene*

To measure concentrations of chlorobenzene in test medium over time, algae were incubated in vials with medium containing 14 mg/L or 24 mg/L chlorobenzene (nominal) for 12 h 15 min. At 0 h and 12 h 15 min, 100 μL (*n* = 4) was injected into 20 mL headspace vials with PTFE-lined septa (Scientific Glass Laboratories, UK) containing 9.7 mL CGM and 100 μL 0.8 M H2SO4. Samples were stored in the dark at 4 °C. One hundred μL of 10 mg/L d5-chlorobenzene (99 atom-%, Sigma-Aldrich) in CGM was added immediately prior to analysis (final concentration 0.1 mg/L). Headspace GC-MS was performed on an Agilent 5975B inert XL MSD using a DB-624 column (30 m × 0.25 mm i.d. × 1.4 μm) and Gerstel Autosampler MPS2. Analytes were identified using target *m*/*z* = 112 and qualifier *m*/*z* = 77 (chlorobenzene); target *m*/*z* = 117 and qualifier *m*/*z* = 82 (d5-chlorobenzene). Sample sequence was randomised and quantification of chlorobenzene was performed relative to d5-chlorobenzene.

#### **5. Conclusions**

Firstly, we designed an alternative closed-vial chemical exposure system that allows for sufficient growth of *C. reinhardtii* (at high inoculation cell densities) for a metabolomics assay, while minimising test duration and chemical volatilisation. The latter was proven by stable exposure concentrations to chlorobenzene over the test duration. We also examined the repeatability of this *C. reinhardtii* exposure system, identifying batch variation when applying unsupervised multivariate analyses to the metabolomics data, yet reducing these unwanted effects by normalising each biological batch to its respective control data. Furthermore, we used set enrichment analysis as well as supervised multivariate analysis to demonstrate the consistency of the metabolic markers discovered in three repeat exposures to chlorobenzene. Secondly, we have demonstrated that the application of an *omics* technology to synchronised algal cell cultures grown in alternating light:dark cycles can detect characteristic metabolic phenotypes that evolve through the cell-cycle. This should benefit the interpretability of molecular data in terms of discovering early mKEs that are anchored to, and predictive of, the algal adverse phenotype of reduced growth.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2218-1989/9/5/94/s1, Figure S1: Metabolic fingerprints for pellet wash optimisation, Table S1: Unique m/z features in direct infusion mass spectra in dilution study, Figure S2: Growth rates and pH drift over 24 h growth of synchronised *C. reinhardtii* in CGM medium, Figure S3: Dose-response curve of 3,5-dichlorophenol, Figure S4: Relative standard deviation of *m*/*z* features per group between three independent repeated exposure studies, Figure S5: PCA scores plots (PC1 vs. PC3) visualising the similarities and differences between three independent repeated exposure studies.

**Author Contributions:** Conceptualization, experimental design, data analysis, review and editing, S.S., E.B., J.K.C., G.H., M.R.V.; Conceptualization, experimental design, S.G.; Experimentation, S.S.

**Funding:** The authors thank Unilever for funding and BBSRC for a PhD studentship (BB/N503587/1).

**Acknowledgments:** We thank Alessio Perroti for help with conducting flow cytometric measurements. We also thank Chris Sparham, Mike Pleasants and Alexandre Teixeira for their major support in conducting gas-chromatography mass-spectrometry.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


p. 123. Available online: https://ntp.niehs.nih.gov/iccvam/docs/about\_docs/validate.pdf (accessed on 19 February 2019).


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*

## *Euglena* **Central Metabolic Pathways and Their Subcellular Locations**

#### **Sahutchai Inwongwan, Nicholas J. Kruger, R. George Ratcliffe and Ellis C. O'Neill \***

Department of Plant Sciences, University of Oxford, South Parks Road, Oxford OX1 3RB, UK; sahutchai.inwongwan@new.ox.ac.uk (S.I.); nick.kruger@plants.ox.ac.uk (N.J.K.); george.ratcliffe@plants.ox.ac.uk (R.G.R.)

**\*** Correspondence: ellis.oneill@plants.ox.ac.uk; Tel.: +44-(0)1865-275-024

Received: 30 April 2019; Accepted: 11 June 2019; Published: 14 June 2019

**Abstract:** Euglenids are a group of algae of great interest for biotechnology, with a large and complex metabolic capability. To study the metabolic network, it is necessary to know where the component enzymes are in the cell, but despite a long history of research into *Euglena*, the subcellular locations of many major pathways are only poorly defined. *Euglena* is phylogenetically distant from other commonly studied algae, they have secondary plastids bounded by three membranes, and they can survive after destruction of their plastids. These unusual features make it difficult to assume that the subcellular organization of the metabolic network will be equivalent to that of other photosynthetic organisms. We analysed bioinformatic, biochemical, and proteomic information from a variety of sources to assess the subcellular location of the enzymes of the central metabolic pathways, and we use these assignments to propose a model of the metabolic network of *Euglena*. Other than photosynthesis, all major pathways present in the chloroplast are also present elsewhere in the cell. Our model demonstrates how *Euglena* can synthesise all the metabolites required for growth from simple carbon inputs, and can survive in the absence of chloroplasts.

**Keywords:** *Euglena*; central metabolic pathway; subcellular location

#### **1. Introduction**

Euglenids, a group of unicellular flagellate algae, have long been studied for their biochemistry, physiology, anatomy, and industrial potential, due to the remarkable metabolic plasticity that allows them to grow in a wide range of conditions [1]. *Euglena* can harness energy heterotrophically, mixotrophically, and photo-autotrophically, and its cultivation is relatively easy, fast, and well established. Euglenids can be found in a broad range of ecological niches including fresh water, brackish water, snow, high and low pH conditions, and both aerobic and anaerobic environments [2]. *Euglena gracilis* is the most studied species of *Euglena* and is regarded as a useful model organism for studying cell biology and biochemistry. Euglenids were once considered one of the most ambiguous groups in terms of evolution and metabolic operation, due to the combination of both "plant-" and "animal-" like features [3]. They are now classified into the kingdom Excavata, superphylum Discoba, subphylum Euglenozoa. *Euglena* is one of the very few plastid-containing organisms for which complete loss of the chloroplast is not lethal. Even the human parasitic apicomplexans retain their plastids for the synthesis of isoprenoids, fatty acids, and heme, while in non-photosynthetic, parasitic plants plastids are necessary for aromatic amino acid biosynthesis and are involved in starch synthesis [4]. Whilst these plastid-localised pathways can be targeted to kill such organisms, *Euglena* can survive complete loss of the plastid and the biochemical explanation for this remains to be established.

The genome of *E. gracilis* is estimated to be around 500 Mb in size, with large amounts of highly repetitive sequences [5], which leads to difficulty in genome sequencing and analysis. The structural complexity of the genome has arisen from a series of horizontal gene transfers and endosymbiosis events throughout its evolutionary history, causing difficulty in classifying euglenids using modern molecular techniques [6]. A study of the distribution of the homologues of 2770 expressed sequence tags (ESTs) from *E. gracilis* has shown that euglenids are closely related to the kinetoplastids [7]. Euglenids first split from the ancestral Euglenozoa, a eukaryotic protozoa, around a billion years ago [8]. After the endosymbiotic transfer of genes from a hypothesized, since-lost, red algal endosymbiont to the nuclear genome [9], a eukaryotic green alga endosymbiont was incorporated [10], bringing many genes involved in the function and maintenance of the chloroplast. The transcriptome of *Euglena* suggests that many other genes were acquired from diverse distantly related species and the genetic control mechanisms in *Euglena* involve genes which are as sophisticated as those in animal and plant eukaryotes [11].

*Euglena* is considered to be a promising organism for industrial application due to its ability to produce various nutrients and bioactive compounds, such as proteins, polyunsaturated fatty acids, vitamin A, vitamin C, and β-1,3-glucan [12]. The application of *Euglena* in environmental engineering has been studied for wastewater treatment systems, energy sources and bioindicators for environmental pollutants. *Euglena* sp. isolated from sewage treatment plants had higher nutrient removal capability and growth rate than other algae [13]. These results indicate that *Euglena* could be considered as a viable source for biofuel production from wastewaters.

There is no doubt that *E. gracilis* is an interesting organism in terms of its evolution, metabolic capacity, and application and has thus been the subject of intense study. Due to its extraordinary metabolic capacity, investigating and understanding the *Euglena* metabolic network could help expand the applications of this organism and shed light on several mysteries of evolution and secondary endosymbiosis. Investigation of the metabolism of *Euglena* requires the definition of the metabolic network, whether at genome scale for flux balance analysis, or at the level of core metabolism for metabolic flux analysis. This would allow the metabolic phenotype of the organism to be investigated in much the same way as in highly compartmented plant cells [14]. In organisms with complex evolution like *Euglena*, even though the central metabolic pathways are conserved, the characteristics and subcellular localisation of the enzymes involved in the pathway can differ. This is particularly true for *Euglena*, where the secondary chloroplast has a relatively recent evolutionary origin (~600 MYA [15]) and a unique third plastid membrane, giving rise to a novel subcellular compartment in this intermembrane space.

Here, we provide an overview of the central metabolic pathways in *Euglena gracilis*, highlighting unique features. We assess the reported subcellular location of enzyme activities and proteins in *Euglena* and propose a model of the organisation of the central metabolic network.

#### **2. Results**

#### *2.1. Pathway Localisation from Sequence Information*

Even though *Euglena* has long been studied for its biotechnological potential, its genetic and metabolic capacities are poorly established due to the size and complexity of its genome. In the absence of an annotated genome sequence for any species of *Euglena*, transcriptome sequencing has been used as the preliminary alternative to genome structure analysis, with the aim of providing data on gene expression and regulation under different conditions [16,17].

#### 2.1.1. Metabolic Pathways in *Euglena*

The earliest reported extensive transcriptomic analysis of *E. gracilis* studied cells grown in dark and light conditions and illustrated the versatile metabolic capacity of *Euglena* [16]. All the core pathways of carbohydrate metabolism and photosynthesis were identified, including glycolysis, gluconeogenesis, the tricarboxylic acid cycle (TCA), the pentose phosphate pathway (PPP), and the Calvin cycle. In addition, the pathways for production of other major classes of compounds including carotenoids, thylakoid glycolipids, fatty acids, and isoprenoids were also identified in the transcriptome. Besides the evidence for lipid, amino acid, carbohydrate, and vitamin metabolism, the transcriptome also revealed the capacity of *E. gracilis* to produce multifunctional polydomain proteins that relate to those from both fungi and bacteria and may have been obtained by horizontal gene transfer during its evolution [11]. Furthermore, the transcriptome showed the capacity for polyketide and non-ribosomal peptide biosynthesis [18], along with capacities for using the pathways for vitamin C, vitamin E, and glutathione metabolism to respond to stresses. A subsequent comparative study of the transcriptome of *E. gracilis* under aerobic and anaerobic conditions investigated the regulatory system of wax ester metabolism [17]. The metabolic network of *Euglena mutabilis* has been reconstructed using assembled transcript sequences and topology gap filling [19]. The initial draft network was incomplete with many missing reactions and could not simulate the heterotrophic growth of *E. mutabilis* in the dark [19], despite the long documented capacity of this species to do so. In combination, these studies demonstrate that the genomes of *Euglena* have features in common with genomes from both phototrophic and heterotrophic organisms, and these features provide *Euglena* with the metabolic capacity to adapt to a wide variety of conditions. These studies also demonstrate that transcript abundance does not vary greatly under different growth conditions and does not correlate with protein abundance. Thus, exploration of the metabolic capacity of *Euglena* using an exclusively transcriptomic approach is unlikely to be sufficient to understand pathway control.

#### 2.1.2. Metabolic Pathways in the *Euglena* Plastid

The chloroplast genome of *E. gracilis* has been sequenced [20] and is very similar to that of higher plants in its gene content, although its structure and evolution are different [21]. As with other organisms, the acquisition of the plastid came with many gene losses and gene transfers from the endosymbiont to the host genome [22]. The expression level of plastid genes was found to respond to environmental stimuli [23] and the rate of protein synthesis by the *E. gracilis* plastid in the dark is extremely low compared to that in the light [5,24].

As in the primary plastids of other organisms, most of the *Euglena* secondary plastid proteome is encoded in the nuclear genome. However, since the plastid of *Euglena* was acquired through secondary endosymbiosis of a photosynthetic eukaryote, its chloroplasts are surrounded by three membranes [25,26]. Thus, hundreds of plastidic proteins synthesized in the cytosol have to be transported through either three or four membranes to reach their destination in the plastid stroma or the thylakoid lumen [27] and we have no knowledge of the metabolic capabilities of the unique intermembrane space, found in no other group of organisms.

#### 2.1.3. Predicting the Subcellular Location of *Euglena* Proteins

Most of the previously published studies of the subcellular compartmentation of *Euglena* enzymes have relied on subcellular fractionation of organelles and measurement of enzyme activity distributions. Very few studies have exploited complementary molecular techniques to investigate localisation in *Euglena*. In principle, eukaryotic protein subcellular location prediction tools could be useful. To test this, the protein sequences of selected marker enzymes with defined compartmentation were analysed using a subcellular location prediction workflow. These included proteins known to be located in the chloroplast, mitochondria, cytosol, or directed through the secretory pathway. The predicted amino acid sequences of these marker proteins were deduced from the *E. gracilis* transcriptome [16]. In total 28% of these sequences had spliced leader sequences, indicated in bold in Tables 1–3. Two programs, WoLF PSORT [28] and TargetP 1.1 [29], were used to predict the subcellular localisation of all the matching *E. gracilis* protein sequences. Due to the potential presence of plant and non-plant targeting signals on *Euglena* proteins (arising from the complex evolutionary origin of *Euglena* genes), these analyses were conducted using plant, animal, and fungal reference databases in WoLF PSORT and both plant-based and nonplant-based prediction modes in TargetP 1.1. Moreover, since transport of proteins into *Euglena* chloroplasts requires transit via the secretory pathway [27,30,31], any sequence that was predicted to contain a secretion signal based on the plant-based algorithm in TargetP 1.1 was subjected to extended analysis in which the signal sequence was removed and the prediction process repeated to establish the ultimate predicted location of the mature protein.
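The extended analysis described above (predict, remove any secretion signal, predict again) can be illustrated with a minimal Python sketch. It does not call WoLF PSORT or TargetP; the `Prediction` type, the `predict_with_signal_removal` function, and the `toy_predictor` with its made-up sequence prefixes are all hypothetical names introduced only for illustration, and a real workflow would substitute the output of the actual prediction tools.

```python
from typing import Callable, List, NamedTuple


class Prediction(NamedTuple):
    location: str         # e.g. "Sec", "Chl", "Mt", "Cyt"
    presequence_len: int  # residues in the predicted N-terminal signal (0 if none)


# A predictor is any callable mapping a protein sequence to a Prediction.
Predictor = Callable[[str], Prediction]


def predict_with_signal_removal(sequence: str, predict: Predictor) -> List[str]:
    """Return the predicted route for a protein.

    The first entry is the call on the full sequence. If that call is a
    secretion signal, the signal is removed and the mature sequence is
    predicted once more, giving a second entry.
    """
    first = predict(sequence)
    route = [first.location]
    if first.location == "Sec" and first.presequence_len > 0:
        mature = sequence[first.presequence_len:]
        route.append(predict(mature).location)
    return route


def toy_predictor(sequence: str) -> Prediction:
    """Toy stand-in keyed on made-up prefixes; these are not real targeting rules."""
    if sequence.startswith("MKLL"):
        return Prediction("Sec", 4)
    if sequence.startswith("MRS"):
        return Prediction("Chl", 3)
    return Prediction("Cyt", 0)


print(predict_with_signal_removal("MKLLMRSAAAA", toy_predictor))
print(predict_with_signal_removal("MAAAA", toy_predictor))
```

Keeping the predictor as a plain callable means the same two-step logic could, in principle, be reused for each tool and prediction mode without changing the function.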

#### Mitochondrial Targeting

The mitochondrial marker enzymes are all well-established biochemical markers and are only detected in mitochondrial fractions in *Euglena*, with the exception of isocitrate dehydrogenase which is also detected in the cytosol. At least one isoform of each of these enzymes is predicted to be targeted to mitochondria using TargetP and WoLF PSORT in all modes (see Table 1). However, using the plant-based algorithm in WoLF PSORT there was more support for some of these enzymes being in the chloroplast. One isoform of succinic semialdehyde dehydrogenase, containing a spliced leader sequence, appears to have no targeting signal and so would be predicted to be in the cytosol. One isoform of isocitrate dehydrogenase has no predicted targeting, in line with biochemical evidence for some cytosolic activity of this enzyme.

#### Proteins without Targeting Signals

Cytosolic marker proteins were selected that are routinely used as marker enzymes in subcellular fractionation studies. Overall, these had less confident predictions and some weak predictions for mitochondrial targeting (Table 2). The exception is thiosulfate sulfurtransferases, for which three isoforms had plastid targeting sequences in WoLF PSORT using the plant mode. Two of these had strong secretion signal predictions in both animal and fungi modes and in TargetP, whilst another isoform has a strong secretion signal prediction in these WoLF PSORT modes. This may indicate that some of these isoforms are targeted to the chloroplast via the endoplasmic reticulum (see below).

#### Targeting for Secretion

Proteins known to be in the Golgi, and which thus utilise the secretory pathway, were used as benchmarks to test the reliability of secretion signal prediction for *Euglena* proteins (Table 2). They were predominantly identified as being targeted for secretion by TargetP with a high level of confidence, especially using the nonplant algorithm, although mitochondrial targeting was predicted in some instances. WoLF PSORT predicted that these proteins were targeted to the plasma membrane, as they are integral membrane proteins. Some were predicted to also contain secretory signals with high confidence, but not all. One of the mannosyltransferases was predicted to target to the chloroplast using WoLF PSORT in plant mode.

#### Chloroplast Targeting

A selection of biochemical marker enzymes and components of the photosynthetic apparatus was used to test the ability of these programs to predict targeting to the plastids. TargetP predicted most of these proteins to be either mitochondrial or secreted (Table 3). The only exceptions were for one of the isoforms of fructose-bisphosphate aldolase, and one ribulose-bisphosphate carboxylase/oxygenase (small subunit) that were predicted to be targeted to the chloroplast after removal of the secretory signal peptide. WoLF PSORT on the other hand correctly predicted many soluble enzymes to be targeted to the chloroplast but predicted many of the integral membrane proteins, such as photosystem components, as being targeted to the plasma membrane.

The limitations of the chloroplast targeting prediction of TargetP have been reported before [29]. The predictive power of TargetP 1.1 is based on the presence of N-terminal presequences, including chloroplast transit peptide (cTP), mitochondrial targeting peptide (mTP), or secretory pathway signal peptide (SP) [29]. However, the structure of cTP is not well characterized, especially in *Euglena*, and the prediction performance of chloroplast targeted proteins was reported to be less accurate than that for mitochondria, with occasional poor discrimination between mTP and cTP [32]. This lack of discrimination is partly due to some proteins using the same targeting sequence for both chloroplasts and mitochondria [29]. Thus, using TargetP and WoLF PSORT to predict the location of proteins in *Euglena* might not cover all the possible protein transport systems.



Transcript numbers in bold indicate the presence of the spliced leader sequence. PSORT score is the discriminant score, with larger scores having a higher probability. Scores below 5 are not reported. TargetP score is the reliability class, rated from 1 to 5 (1 is the strongest prediction and 5 is the weakest). Chl, chloroplast (green); Cyt, cytosol (grey); E.R., endoplasmic reticulum (blue); Mt, mitochondria (orange); Nu, nuclear; Per, peroxisome; PM, plasma membrane (yellow); Sec, secreted or extracellular (blue). Strength of colour indicates score.

#### *Metabolites* **2019**, *9*, 115





**Table 3.** Subcellular location prediction of *E. gracilis* chloroplast marker proteins. Transcript numbers in bold indicate the presence of the spliced leader sequence. PSORT score is the discriminant score, with larger scores having a higher probability. Scores below 5 are not reported. TargetP score is the reliability class, rated from 1 to 5 (1 is the strongest prediction and 5 is the weakest). Chl, chloroplast (green); Cyt, cytosol (grey); E.R., endoplasmic reticulum (blue); Lyso, lysosome; Mt, mitochondria (orange); Nu, nuclear; Per, peroxisome; PM, plasma membrane (yellow); Sec, secreted or extracellular (blue). Strength of colour indicates score.


Apart from the evident limitations of these algorithms as protein localisation prediction tools in *Euglena*, protein targeting into chloroplasts of *Euglena* is likely to be inherently complex. In contrast to plants, the chloroplast of *Euglena* evolved through secondary endosymbiosis, which led to the chloroplast being surrounded by three membranes [25,26,33]. A recent study of the *E. gracilis* chloroplast proteome identified three classes of chloroplast pre-protein based on targeting signal analysis. Class I and II proteins possess a bipartite topogenic signal (BTS), with Class I proteins composed of a signal peptide (SP) followed by a stop-transfer signal (STS) and a transit peptide (TP), whilst Class II proteins contain only an SP and TP [31,34]. The third class of chloroplast proteins was referred to as unclassified, with no signal sequence detected in the proteins. The transport mechanism used to import proteins from this unclassified category into the plastid remains unknown [30]. The transport of *Euglena* Class I and II pre-proteins into the chloroplast begins with co-translational transport into the endoplasmic reticulum (ER) lumen, where the signal peptide is cleaved (Figure 1). The pre-proteins are subsequently transported to the chloroplasts from the Golgi body via vesicles, which then fuse with the outermost plastid membrane. However, the transport across the inner two membranes of the three-membrane-bound plastids in euglenophytes remains unclear [27,30,34]. The TOC/TIC-like pathway was believed to be involved in transport across the inner membranes of the *Euglena* plastid due to the presence of a plant-like targeting signal (TP) in the pre-proteins [35]. However, none of the TOC subunits have been detected in the transcriptome of *E. gracilis*, whereas homologues of several TIC subunits were identified [5]. A recent analysis of the structure of TP sequences in *E. gracilis* has suggested that it is possible for the TP to be recognised by the symbiont-derived ERAD-like machinery (SELMA) transport system, as is the case for diatoms [30,36].

**Figure 1.** Protein transport into the secondary chloroplast of *Euglena*. Nuclear-encoded chloroplast pre-proteins (blue strip) are synthesised into the lumen of the endoplasmic reticulum (ER) where the signal peptide (SP) is cleaved. Pre-proteins with transit peptides (TP) are subsequently transferred to the outermost chloroplast membrane through the Golgi body via vesicles. GOSR and RAB5 GTPase are proposed to mediate the fusion of the vesicle to the outermost membrane. After transport of proteins into the stroma, where the TP is removed, the mature protein can enter the thylakoid lumen via the SEC, TAT, or Alb3/SRP pathways. This scheme only considers proteins possessing Class I and II targeting signals, as the transport of those with unclassified signals is not known [34].

It can be concluded that WoLF PSORT and TargetP have limitations with predicting cTPs and do not specifically include protein targeting to the secondary plastid. Predicting chloroplast protein targeting in *Euglena* is likely to require more specific databases or algorithms, since the evolution of the *Euglena* chloroplast is different from that of plants. In contrast, the prediction of mitochondria targeting with high reliability scores, when there is a high degree of agreement amongst the algorithms, can be informative. However, due to the false predictions of chloroplast proteins to other locations, the prediction results cannot be fully relied upon and need to be carefully evaluated in conjunction with evidence from enzymatic and biochemical analyses.
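The idea that a mitochondrial prediction is informative only when the algorithms largely agree can be expressed as a simple vote across tools and modes. The following Python sketch is one possible way to do this, not the procedure used in this study: the function name `consensus_location`, the 0.75 agreement threshold, and the example calls are all assumptions chosen for illustration.

```python
from collections import Counter
from typing import Dict, Tuple


def consensus_location(
    calls: Dict[str, str], min_agreement: float = 0.75
) -> Tuple[str, float, bool]:
    """Summarise location calls from several tools and modes.

    Returns the most frequent location, the fraction of calls that agree
    with it, and a flag that is True only for a mitochondrial ("Mt") call
    reaching the agreement threshold. Every other outcome should be
    checked against enzymatic and biochemical evidence.
    """
    if not calls:
        raise ValueError("at least one prediction is required")
    location, votes = Counter(calls.values()).most_common(1)[0]
    agreement = votes / len(calls)
    informative = location == "Mt" and agreement >= min_agreement
    return location, agreement, informative


# Hypothetical calls for one protein; the values are made up for illustration.
example_calls = {
    "WoLF PSORT (plant)": "Chl",
    "WoLF PSORT (animal)": "Mt",
    "WoLF PSORT (fungi)": "Mt",
    "TargetP (plant)": "Mt",
    "TargetP (nonplant)": "Mt",
}
print(consensus_location(example_calls))
```

The threshold is deliberately a parameter: a stricter value trades fewer accepted calls for fewer false assignments, and the appropriate setting would need to be calibrated against marker proteins of known location.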

#### *2.2. Pathway Localisation from Biochemical*/*Proteomic Information*

#### 2.2.1. Central Metabolic Pathways of *Euglena*

The central metabolic pathways are essential to all organisms, providing the precursors for other peripheral pathways, especially metabolites with carbon backbones that are derived from carbohydrate metabolism. In addition, under non-photosynthetic conditions, these pathways have a major role in producing the energy and reducing power for the cell. Pathways of carbohydrate metabolism generally consist of glycolysis (Embden–Meyerhof–Parnas pathway), gluconeogenesis, the PPP, the Entner-Doudoroff (ED) pathway, and the TCA cycle. Notably, there is no evidence for the ED pathway in *Euglena*. Results for subcellular location predictions are available in Supplementary Table S1.

#### Glycolysis and Gluconeogenesis

The intracellular distribution of the glycolytic enzymes in *Euglena* has been studied using fractionation in aqueous and non-aqueous media. This approach showed that most of the glycolytic enzymes are in the cytosol and that several of them are present in both the chloroplast and the cytosol [37,38]. By using sucrose density gradient centrifugation, it was found that phosphofructokinase, pyruvate kinase, triosephosphate isomerase, and aldolase were present in the plastid fraction [39]. In addition, a recent proteomic study reported that several enzymes involved in glycolysis and gluconeogenesis were present in *Euglena* chloroplasts [30].

Hexose-Phosphorylating Enzymes. The activity of hexokinase (EC 2.7.1.1) was three times higher in *E. gracilis* grown on glucose than that on ethanol and acetate [40]. The activity of this enzyme in glucose media was also four times higher in heterotrophic cells than that in autotrophic cells [41]. *E. gracilis* was found to have glucokinase (EC 2.7.1.2) and fructokinase (EC 2.7.1.4) in different locations in both autotrophic and heterotrophic conditions. At 105,000 g separation, the glucokinase was present in the cell pellet while the fructokinase activity was only found in the supernatant [2,42]. Glucokinase is therefore concluded to be in organelles, whilst fructokinase is in the cytosol.

Phosphoglucoisomerase (EC 5.3.1.9). The activity of this enzyme was detected in *E. longa* [2,43], although the subcellular location has not been reported. Strong targeting signals were not detected in the protein sequences.

6-Phosphofructokinase (ATP-PFK, EC 2.7.1.11) and Diphosphate–Fructose-6-Phosphate 1-Phosphotransferase (PPi-PFK, EC 2.7.1.90). In *E. gracilis*, ATP-PFK was reported to be located in both chloroplasts and the cytosol [39], while PPi-PFK was reported exclusively in the cytosol. During cell growth, the activity of PPi-PFK was 10–30 times higher than the activity of ATP-PFK [44]. No strong targeting signals were detected in these protein sequences.

Fructose Bisphosphate Aldolase (EC 4.1.2.13). There are two classes of aldolase found in *Euglena*: Class I is located in the chloroplast and proplastid, and Class II is located in the cytosol [45]. Class I enzyme peptides were detected in the chloroplast proteome [30] and the Class II cytosolic enzyme was shown to be more active when the *E. gracilis* culture was grown in the dark and is presumed to play the main role in heterotrophic glycolysis and gluconeogenesis [46]. One isoform has no strong targeting signal, whilst two have plastid targeting and one has a strong mitochondrial targeting sequence.

Glyceraldehyde 3-Phosphate (G3P) Dehydrogenase (EC 1.2.1.12). *E. gracilis* contains both NAD-linked and NADP-linked G3P dehydrogenase, which are found in different subcellular locations [45,47]. The NAD-linked enzyme showed higher activity in heterotrophic conditions and was located in the cytosol. On the other hand, the NADP-linked enzyme was shown to be located in chloroplasts and had higher activity in autotrophic cells [48]. Only the NADP-linked enzyme was detected in the proteome of *E. gracilis* chloroplasts [30].

Triosephosphate Isomerase (EC 5.3.1.1). As with fructose bisphosphate aldolase, two types of the isomerase were identified in *E. gracilis* using enzymatic activity profiling [47]. Type A triosephosphate isomerase was reported to function in the chloroplasts and proplastids of *E. gracilis*, while type B enzymes were located in the cytosol [49]. Sequences matching triosephosphate isomerase could also be detected in the *E. gracilis* chloroplast proteome [30].

Phosphoglycerate Kinase (EC 2.7.2.3)/Phosphoglycerate Mutase (EC 5.4.2.11). The activity of phosphoglycerate kinase was reported in isolated *E. gracilis* chloroplasts [50] and the enzyme was detected in the *E. gracilis* chloroplast proteome [30], although its presence in other subcellular locations has not been investigated. No specific studies of the activity of phosphoglycerate mutase have been reported in *Euglena*. However, the enzyme was recently reported to be present in the *E. gracilis* chloroplast proteome [30]. WoLF PSORT identifies a strong chloroplast targeting sequence on one isoform, with the other three isoforms predicted to remain in the cytosol.

Enolase (EC 4.2.1.11). The activity of enolase was previously detected in *E. gracilis* but the subcellular location was not described [38,51]. N-terminal targeting peptide analysis of cDNA clones of *E. gracilis* suggested that enolase could be present in both the cytosol and the chloroplast [52]. However, as shown in Section 2.1.3, it is difficult to predict protein targeting into the chloroplasts of *Euglena* and, furthermore, enolase was not found in the chloroplast proteome of *E. gracilis* [30].

Pyruvate Kinase (EC 2.7.1.40). Pyruvate kinase in *E. gracilis* was shown to be highly active in cultures grown on glucose [53]. This enzyme was reported to be located in both proplastids and the cytosol of *E. gracilis*; however, its activity was not detected in the mature chloroplast [39]. WoLF PSORT predicts a plastid targeting sequence in two isoforms with very low confidence, whilst one of these has mitochondrial targeting with slightly more confidence, highlighting the challenging nature of predicting subcellular locations.

Fructose-1,6-Bisphosphatase (EC 3.1.3.11). Fructose-1,6-bisphosphatase is involved in gluconeogenesis and has been reported from *Euglena* [39,44]. The cytosolic fructose-1,6-bisphosphatase in *E. gracilis* was detected and characterized [54]. Recently, the enzyme was reported in the *E. gracilis* chloroplast proteome [30], where it is presumably involved in the Calvin cycle. One isoform is predicted not to contain a targeting signal, but the other four are predicted to be variously targeted to the chloroplast, for secretion, or to the plasma membrane, possibly indicating that they all pass through the secretory system to the chloroplast.

#### Pentose Phosphate Pathway

Oxidative Phase. In contrast to higher plants and green algae, all the enzymes of the oxidative arm of the pentose phosphate pathway in *E. gracilis* were reported to be present in the cytosol, but not the chloroplast. Using non-aqueous fractionation, it was found that two dehydrogenases of the oxidative pentose phosphate pathway were absent from the *E. gracilis* plastid [37]. In separate studies, the activity of 6-phosphogluconate dehydrogenase (EC 1.1.1.44) was confirmed to be in the cytosol [38], and glucose-6-phosphate dehydrogenase (EC 1.1.1.49) was reported to be located in the cytosol [2,38,55–57] and has been used as a cytosolic marker enzyme [58]. Although a single glucose-6-phosphate dehydrogenase was detected in the chloroplast proteome, this fraction was reported to be moderately contaminated with protein from other organelles [30], and thus the subcellular location of this enzyme needs further investigation. This enzyme is specific for NADP in *Euglena* and induced by glucose, with low activity detected under heterotrophic growth in the absence of glucose [53]. There has been no specific study of *Euglena* 6-phosphogluconolactonase (EC 3.1.1.31).

Non-Oxidative Phase. All the enzymes involved in the non-oxidative section of the pentose phosphate pathway have been detected in *Euglena* and most of the enzymes were reported to localize to the chloroplast [2,30]. The activity of ribose 5-phosphate isomerase (EC 5.3.1.6) was reported in isolated *E. gracilis* chloroplasts [59]. The subcellular location of pentose-5-phosphate-3-epimerase (EC 5.1.3.1) has not been reported, although the activity of this enzyme was detected in heterotrophic, autotrophic, and mixotrophic growth conditions, along with the activity of transketolase (EC 2.2.1.1) [60] and transaldolase (EC 2.2.1.2) [47]. Non-aqueous separation techniques showed the presence of transaldolase in *Euglena* chloroplasts and proplastids [39].

Notably, there are two isoforms of each enzyme of the non-oxidative PPP in the *E. gracilis* transcriptome, except transketolase which has three. For three of these enzymes, only one isoform was identified in the chloroplast proteome [30], whereas neither isozyme of transaldolase could be detected. This suggests that the other isoforms are present in another location within the cell and the lack of any detectable targeting signal indicates this is likely to be the cytosol. However, extensive study of this pathway has not been reported and further investigation would be needed to confirm the operation of the pathway in the cytosol.
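The isoform bookkeeping behind this argument can be made explicit with a short Python sketch. The counts are taken from the paragraph above; assigning them to the four named enzymes assumes that the non-oxidative PPP comprises the four enzymes listed in the preceding paragraph and that the three enzymes with one chloroplast isoform are those other than transaldolase. The function name `unaccounted_isoforms` is hypothetical.

```python
from typing import Dict

# Isoforms found in the E. gracilis transcriptome (two each, transketolase three).
transcriptome_isoforms: Dict[str, int] = {
    "ribose 5-phosphate isomerase": 2,
    "pentose-5-phosphate-3-epimerase": 2,
    "transketolase": 3,
    "transaldolase": 2,
}

# Isoforms detected in the chloroplast proteome (assumed assignment, see lead-in).
chloroplast_proteome_isoforms: Dict[str, int] = {
    "ribose 5-phosphate isomerase": 1,
    "pentose-5-phosphate-3-epimerase": 1,
    "transketolase": 1,
    "transaldolase": 0,
}


def unaccounted_isoforms(
    transcriptome: Dict[str, int], chloroplast: Dict[str, int]
) -> Dict[str, int]:
    """Count isoforms per enzyme that were not detected in the chloroplast proteome."""
    return {
        enzyme: total - chloroplast.get(enzyme, 0)
        for enzyme, total in transcriptome.items()
    }


remaining = unaccounted_isoforms(transcriptome_isoforms, chloroplast_proteome_isoforms)
for enzyme, count in remaining.items():
    print(f"{enzyme}: {count} isoform(s) not found in the chloroplast proteome")
```

Under these assumptions every enzyme of the pathway retains at least one isoform outside the chloroplast proteome, which is consistent with, but does not prove, a complete cytosolic pathway.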

#### Anaplerotic Pathway: Dicarboxylic Acid Bypass

Malate dehydrogenase (NADP-specific oxaloacetate-decarboxylating, EC 1.1.1.40) in *Euglena* is located in the cytosol but not in mitochondria, and is specific for NADP and l-malate [2]. The NAD-specific malate dehydrogenase (decarboxylating, EC 1.1.1.39) can only be detected in *E. gracilis* cultured with d-malate [61]. Recently, a proteomic study detected malate dehydrogenase (NADP-specific) in *E. gracilis* chloroplasts [30]. The activity of this enzyme varied widely with light and carbon sources, and was 55 times greater in heterotrophic cells than in autotrophic cells. This result suggests a physiological role in *Euglena* for these enzymes in providing NADPH for cytosolic fatty-acid synthesis in the dark [62,63].

Phosphoenolpyruvate carboxylase (PEP carboxylase, EC 4.1.1.31) was shown to have multiple isozymes which were active in different light conditions. It has been reported that PEP carboxylase functions for CO2 fixation in *E. gracilis* grown in the dark and under CO2 limited conditions [64,65]. The activity of phospho*enol*pyruvate carboxykinase (PEP carboxykinase, EC 4.1.1.32) in *E. gracilis* is specific for GTP rather than ATP [66]. PEP carboxylase and PEP carboxykinase are discrete, separate enzymes in *E. gracilis* [67]. PEP carboxykinase was reported to be located exclusively in the cytosol and the enzyme could not be detected in cells grown under autotrophic conditions [68]. One isoform is predicted to be localised in the chloroplast by WoLF PSORT with a high degree of confidence, but the locations of the other two isoforms are not predicted confidently. In addition, the activity of PEP carboxykinase was detected in *E. gracilis* cultured with acetate or ethanol, but not with glucose [62]. Pyruvate carboxylase (EC 6.4.1.1) was also reported to be located in the cytosol [69]. The activity of this enzyme was found in cells grown under heterotrophic culture fed with glucose, but not with acetate or in autotrophic cells [2].

#### TCA Cycle

The reactions of the TCA cycle occur in the mitochondria of *Euglena* in common with all other eukaryotic organisms [2]. Most of the enzymes involved in the TCA cycle are predicted to target to the mitochondria with high reliability (Table S2), in line with previous studies on the localisation of the TCA cycle.

Pyruvate Dehydrogenase (NAD<sup>+</sup> complex EC 1.2.4.1, NADP<sup>+</sup> EC 1.2.1.51). In *E. gracilis* the conventional NAD<sup>+</sup> pyruvate dehydrogenase complex only contributes around 1% of the activity and instead an NADP<sup>+</sup>-dependent pyruvate dehydrogenase is used to produce the majority of the acetyl-CoA from pyruvate [70]. This latter enzyme has been detected in the mitochondrial fractions of *E. gracilis* [71–73] and all three component polypeptides are predicted to be targeted to the mitochondria. The activity of the NAD<sup>+</sup> complex has not been localised.

Citrate Synthase (EC 4.1.3.7). Citrate synthase activity was detected in both particulate and soluble fractions from bleached *E. gracilis* [38], indicating that the enzyme is located in cytosol and other cell compartments. Testing the activity of this enzyme from different organelle suspensions showed the presence of this enzyme in both mitochondria and microbodies (glyoxysome-like particles) [74,75]. Only one of the four isoforms is predicted to be targeted to mitochondria.

Aconitase (EC 4.2.1.3). The activity of aconitase was detected in *E. gracilis* [76,77]. However, the subcellular location of this enzyme has apparently never been investigated and only one of the two isoforms is predicted to be targeted to mitochondria.

Isocitrate Dehydrogenase (NAD-specific EC 1.1.1.41, NADP-specific EC 1.1.1.42). NAD- and NADP-specific isozymes of isocitrate dehydrogenase have been characterised from *Euglena*. The activity of NAD-specific isocitrate dehydrogenase was detected in mitochondria and cytosol of *E. longa* [38,43]. The NAD-specific isozyme was detected solely in the mitochondria of the streptomycin-bleached *E. gracilis* [75,78,79]. The NADP-specific isozyme was reported in both mitochondria and the cytosol, with the activity of the mitochondrial enzyme being about 25% of that in the cytosol [75,79].

2-Oxoglutarate Decarboxylase (EC 4.1.1.71). *E. gracilis* contains a 2-oxoglutarate decarboxylase that is dependent on thiamine pyrophosphate, in contrast to the more common CoA-dependent 2-oxoglutarate dehydrogenase complex, which was not detected [80]. The thiamine pyrophosphate-dependent activity, which converts 2-oxoglutarate to succinic semialdehyde, is located solely in mitochondria [81].

Succinic Semialdehyde Dehydrogenase (EC 1.2.1.16). NAD- and NADP-specific succinate semialdehyde dehydrogenase were detected in *E. gracilis* and reported to be in the mitochondria [73,82]. Three isoforms are predicted to be located in the mitochondria, whilst the remaining isoform is not predicted to have a targeting sequence.

Succinate Dehydrogenase (EC 1.3.5.1). As with other eukaryotes, the succinate dehydrogenase in *E. gracilis* is tightly bound to the inner membrane of mitochondria and has been used as a marker enzyme for mitochondria in *Euglena* [58,74,75,78,83]. It is predicted to be associated with the plasma membrane by WoLF PSORT, in line with the integral membrane nature of the protein.

Fumarase (EC 4.2.1.2). Using cell fractionation and enzyme activity assays, fumarase is routinely detected solely in *E. gracilis* mitochondria [39,74,75,78] and is commonly used as a soluble mitochondrial marker enzyme [83].

Malate Dehydrogenase (EC 1.1.1.37). In *E. gracilis*, malate dehydrogenase is found in both mitochondria and the cytosol. The cytosolic enzyme had three times higher activity in heterotrophically grown cells than in photoautotrophic cells, whereas the activity of the mitochondrial isoform was largely uninfluenced by variation in growth conditions [62]. *E. gracilis* contains two forms of malate dehydrogenase, NAD-linked and NADP-linked isozymes. Unlike in higher plants, where the NADP-linked malate dehydrogenase is present exclusively in chloroplasts, in *E. gracilis* the majority (81–91%) of both NAD-linked and NADP-linked activity was located in the cytosol, with a smaller proportion (13–16%) found in mitochondria. The activity of the NAD-linked isozyme was reported to be about three times higher than that of the NADP-dependent isozyme [84,85].

#### Glyoxylate Cycle

The glyoxylate cycle is a modified form of the TCA cycle that is found in plants, bacteria, protists and fungi. The cycle has an important role in provision of precursors for gluconeogenesis and allows the cell to use other respiratory substrates when sugars are not available [86]. The subcellular location of the glyoxylate cycle in *Euglena* under different conditions is poorly defined, with studies suggesting that the cycle operates in either mitochondria or discrete microbodies (glyoxysome-like particles). Notably, the presence of microbodies in *E. gracilis* was reported to vary under different conditions [87]. Following cell fractionation on sucrose density gradients, the activities of isocitrate lyase (EC 4.1.3.1) and malate synthase (EC 2.3.3.9), enzymes unique to the glyoxylate cycle, were found in the microbody fraction of *E. gracilis* grown on acetate [75,78]. In contrast, using similar cell fractionation techniques and immunocytochemical analysis, both isocitrate lyase and malate synthase were localised to mitochondria in *E. gracilis* grown on ethanol in which microbodies could not be detected [88].

#### C2 Metabolism

Ethanol, which can readily diffuse into the cell, is first oxidized to acetaldehyde by alcohol dehydrogenase (EC 1.1.1.1), and the product is then oxidised by acetaldehyde dehydrogenase (EC 1.2.1.10) to produce acetate. Both enzymes are found in *E. gracilis* mitochondria [89–91]. Acetate is taken up either by simple diffusion or active transport through monocarboxylate transporters and is then converted to acetyl-CoA by acetyl-CoA synthetase (EC 6.2.1.1), also located in *E. gracilis* mitochondria [92], and then metabolized through the TCA cycle or channelled into the glyoxylate cycle.

#### 2.2.2. Subcellular Locations of Biomass Production

The composition of *Euglena* biomass is similar to that of many organisms, with storage carbohydrates, proteins and lipids predominating. The amounts of the different components vary substantially depending on the growth conditions, from almost 10% dry weight wax esters [93] under anaerobic growth to over 80% paramylon under aerobic conditions [94].

#### Carbohydrate Biosynthesis

Unlike most other photosynthetic organisms, such as plants and green algae, *Euglena* stores carbohydrate in the form of a crystalline β-1,3-glucan, called paramylon, instead of starch, and the soluble disaccharide trehalose, instead of sucrose. *Euglena* has a wide range of enzymes involved in carbohydrate metabolism but it is difficult to predict their substrates and products from sequence alone [95].

Paramylon. Paramylon is synthesized from UDP-glucose [96] using the membrane bound paramylon synthetase (beta-1,3-glucan beta-glucosyltransferase, EC 2.4.1.34) that was identified in the *E. gracilis* mitochondrial fraction by measuring activity following differential centrifugation [97] and the genes identified in the transcriptome [98]. Based on transmission electron microscopy, paramylon was synthesised in vesiculated mitochondrial related membrane complexes (chondriomes). The matrix of these vesicles was dense with paramylon granules and extended into the cytosol. The vesicles developed, resulting in the membrane-bound paramylon grains found in the cytosol [41,99,100]. The endo-1,3-β-glucanases (EC 3.2.1.6 and EC 3.2.1.39), exo-1,3-β-glucanases (EC 3.2.1.58), and 1,3-β-glucan phosphorylases (EC 2.4.1.97) involved in glucan metabolism have been characterized [101–103], though the subcellular locations of these enzymes have not been defined. Some of these are predicted to be membrane associated or chloroplast localised.

Trehalose. In *Euglena gracilis*, trehalose synthesis was reported to have a role in the acclimation to osmotic stress [104,105]. Trehalose biosynthesis involves a two-step process through the sequential action of trehalose-phosphate synthase (TPS, EC 2.4.1.15) and trehalose-phosphate phosphatase (TPP, EC 3.1.3.12). It was found that the activities of TPS and TPP could not be separated and so a TPS/TPP enzyme complex of about 250 kDa was suggested to be responsible for trehalose synthesis in *E. gracilis* [106]. In *Arabidopsis*, the bulk of the TPP was reported to be cytosolic [107,108]. However, the subcellular localisation of the TPS/TPP complex in *Euglena* has not been investigated. Analysis of the chloroplast proteome of *E. gracilis* [30] shows no evidence of TPS or TPP, suggesting it is more likely that the TPS/TPP complex is located in the cytosol (or conceivably mitochondria) rather than in chloroplasts. There is no strong targeting signal predicted for this enzyme, supporting the putative cytosolic location.

#### Amino Acid Biosynthesis

The pathways of amino acid biosynthesis in *Euglena* have been poorly investigated, especially with regard to their subcellular localisation. The recent evidence from the proteomic analysis of *Euglena* chloroplasts suggested that their capacity for synthesis of amino acids is extremely limited, in contrast to plant and algal chloroplasts, which are the major subcellular sites for synthesis of various amino acids [30]. Here we present a summary of the likely subcellular localisation of amino acid biosynthesis in *Euglena*.

Glycine and Serine (Glycolate Pathway Associated). Glycine and serine are synthesised from glyoxylate, an intermediate of photorespiration and gluconeogenesis. Glycolate dehydrogenase (EC 1.1.99.14), the starting enzyme of the glycolate pathway, was reported to be located in both mitochondria and microbodies in *E. gracilis* [78]. Glutamate:glyoxylate aminotransferase (EC 2.6.1.4), which adds the amino group to form glycine [109], is found in mitochondria, the cytosol and microbodies [78,110]. A small proportion of the glyoxylate is converted to glycine by glutamate:glyoxylate aminotransferase in mitochondria, and the majority is split into CO2 and formate. As in higher plants, the formate is then used to produce serine through condensation with glycine [111,112]. Folate coenzymes, which are involved in this C1 transfer, were reported to be located largely in the cytosol [79]. Glycine can also be produced through the cleavage of threonine by threonine aldolase (EC 4.1.2.5/48) [113], though the subcellular location of this activity has not been reported. The enzymes involved in serine biosynthesis from 3-phosphoglycerate have not been studied in detail in *Euglena*. However, recently, phosphoserine phosphatase was identified in the *E. gracilis* chloroplast proteome, indicating the possibility of a plastidic serine biosynthesis pathway [30].

Methionine, Cysteine, and Threonine. The activity of cobalamin-dependent methionine synthase (EC 2.1.1.13), producing methionine from N5-methyltetrahydrofolate and homocysteine, was reported to be distributed between the cytosol (68.9%), chloroplast (18.4%) and mitochondria (9.5%) of phototrophic cells. The more stable, Mg-dependent, variant was reported to be found only in the cytosol [114]. Cysteine synthesis in *Euglena* has not been investigated in detail and the subcellular localisations of the enzymes associated with this pathway have not been elucidated. Two enzymes involved in the synthesis of cysteine (serine O-acetyltransferase and cysteine synthase) were reported in the *E. gracilis* transcriptome [113] and isoform A of cysteine synthase was detected in the *E. gracilis* chloroplast proteome [30]. Threonine is synthesized from aspartate via homoserine. Five enzymes involved in threonine biosynthesis in *E. gracilis* were reported to be expressed in different growth conditions [113]. However, the localisations of the enzymes involved in the synthesis pathway have not been elucidated.

Aromatic Amino Acids (Phenylalanine, Tyrosine, and Tryptophan). Chorismate, the precursor to aromatic amino acids, is synthesised from d-erythrose 4-phosphate and phosphoenolpyruvate by the shikimate pathway in seven steps. Five reactions can be catalysed either by separate enzymes, as in plants [115], or by a pentafunctional enzyme, as in fungi [116]. There is evidence for both of these in the *E. gracilis* transcriptome [27].

In green algal and plant cells, the aromatic amino acids are produced exclusively in the plastid but the protein analysis of isolated organelles of *E. gracilis* suggests that the shikimate pathway occurs in both the chloroplast and cytosol [117]. The preferred pathway depends on the growth conditions, with the cytosolic pathway used in the dark and the plastidic pathway in the light [117,118].

Chorismate is then converted into tyrosine and phenylalanine, via prephenate by dehydration, dehydrogenation, and transamination. The enzymes catalysing these reactions are present in *E. gracilis* as unusual domain fusions, also found in thermophilic bacteria [16]. Tryptophan is synthesised from chorismate by a series of reactions via anthranilate. In *E. gracilis* all four of these reactions are carried out by a unique fusion protein rather than a series of separate enzymes, as in other organisms [11,113].

Together the data suggest that aromatic amino acid biosynthesis in *Euglena* is carried out by a combination of plant-, bacterial-, and fungal-like enzymes, as well as unique proteins. The evidence suggests that these pathways are not exclusively located in the plastid, unlike in plants, supporting the dispensability of the plastid for their biosynthesis.

Branched-Chain Amino Acids (Valine, Isoleucine, and Leucine). Pyruvate and α-ketobutyrate are the precursors for valine, leucine and isoleucine biosynthesis in *Euglena*, as in other organisms [119]. In *E. gracilis*, α-ketobutyrate is synthesized by the action of two threonine dehydratases (EC 4.3.1.19 and EC 4.3.1.17) that are located in the cytosol [120]. The subsequent steps are catalysed by acetolactate synthase, dihydroxy-acid reductoisomerase, and branched-amino-acid aminotransferase, all of which are located in the mitochondria [119], suggesting the biosynthesis of branched-chain amino acids is located in mitochondria.

Arginine and Proline. Arginine is synthesised by the sequential transfer of nitrogen onto glutamate semialdehyde. Arginine biosynthesis is likely to occur mostly in the cytosol in *Euglena*, as the majority of ornithine carbamoyltransferase is located in the cytosol and a smaller portion in mitochondria [2]. Arginine metabolism follows the arginine dihydrolase pathway in which arginine is converted into citrulline and then ornithine, which occurs in the mitochondria [121]. Proline synthesis in *Euglena* has not been investigated. However, proline metabolism is tightly associated with arginine metabolism as ornithine is the precursor for proline synthesis [122], suggesting that synthesis is likely to be located in the cytosol or mitochondria.

Lysine. Bacteria, plants and algae synthesize lysine via the diaminopimelate (DAP) pathway, using aspartate and pyruvate as the precursors. On the other hand, fungi synthesize lysine through the α-aminoadipate (AAA) pathway, which uses 2-oxoglutarate and acetyl-CoA [123,124]. Several enzymes involved in AAA pathway were detected in *Euglena*, including homocitrate synthase (EC 2.3.3.14), homoaconitate hydratase (EC 4.2.1.36) and homoisocitrate dehydrogenase (EC 1.1.1.87) [113]. However, the subcellular location of the AAA pathway has not been reported.

Histidine. Histidinol dehydrogenase, the enzyme catalysing the final step of histidine biosynthesis, has been detected in *E. gracilis* [113,125]. No other enzyme involved in this process was detected and the subcellular localisation of the enzymes involved in histidine biosynthesis has not been investigated.

Glutamate, Glutamine, Alanine, Aspartate, and Asparagine. Aminotransferases and dehydrogenases play the main role in the synthesis of glutamate, alanine, and aspartate from organic acids. For glutamate, the aspartate aminotransferase (glutamate: oxaloacetate aminotransferase) is present in mitochondria, chloroplasts, microbodies, and cytosol, and was shown to be more active in dark growth conditions [74,78]. NADP-specific glutamate dehydrogenase was reported to be located solely in the cytosol of *E. gracilis*, instead of the mitochondria as in other organisms [126]. Similarly, glutamate synthase was reported to be localised to the cytosol in both wild-type and streptomycin-bleached *E. gracilis* strains [127]. Glutamine is synthesized from glutamate using glutamine synthetase, but the properties of this enzyme have not been studied in *Euglena* [128]. Asparagine synthetase, the enzyme that converts aspartate to asparagine, has not been reported from *Euglena*. The activities of alanine aminotransferase and alanine dehydrogenase were detected in *E. gracilis*, but the localisation of these enzymes has not been described [2,115,116].

Tetrapyrrole Biosynthesis. Tetrapyrrole, the core of heme and chlorophyll, is synthesised from δ-aminolevulinic acid (ALA). Heterotrophs tend to synthesize ALA from glycine and succinyl-CoA via the Shemin pathway in the mitochondria [129], whilst photoautotrophs make ALA from glutamate in the C5 pathway, located in the chloroplast [130]. *E. gracilis* is known to utilise both routes [131], and the transcriptome shows a bacterial-derived Shemin pathway and a green algae-related C5 pathway, presumably obtained with the chloroplast [16]. These have been identified in the mitochondria and chloroplasts of *E. gracilis* respectively [132]. This again supports the multiple locations of core metabolic pathways that are plastid localised in other photosynthetic organisms.

#### Lipid Biosynthesis

The subcellular locations of the enzymes involved in lipid metabolism in *Euglena* are poorly investigated. As in other organisms, *Euglena* produces the lipid building block malonyl-CoA from CO2 and acetyl-CoA using acetyl-CoA carboxylase, which forms a multienzyme complex with phosphoenolpyruvate carboxylase and malate dehydrogenase in the cytosol [133]. Malonyl-CoA is then used to synthesise fatty acids using fatty acid synthases (FAS), of which three types have been reported in *E. gracilis*. FAS I and FAS III were reported to function in heterotrophic growth conditions. The properties of FAS III have not been investigated in detail. The structure of FAS I is similar to yeast and mammalian enzymes, and it was located in the cytosol [2]. On the other hand, FAS II resembles the plant and bacterial enzymes, and is located in the chloroplasts of *E. gracilis* [134]. In addition to these three types of FAS, a fatty acid biosynthesis system was found in the mitochondria of *Euglena* and is involved in wax-ester synthesis [134].

#### **3. Discussion**

By combining multiple strands of evidence, including biochemical, proteomic, and bioinformatic data, we propose a model for the subcellular localisation of the reactions of the network of central carbon metabolism in *E. gracilis* (Figure 2). Many of these pathways are found in similar subcellular locations to those in other, well-characterised organisms. Glycolysis, which catalyses the initial breakdown of sugars produced by photosynthesis or absorbed from the medium, is present in the cytosol and plastids, as commonly found in green plants. The products of this pathway feed into the TCA cycle, which is mitochondrial, as in other eukaryotes. The enzymes commonly associated with microbodies in higher plants are additionally present in the mitochondria, and it is often difficult to separate these two groups of organelles in *Euglena*. The site of synthesis of many amino acids is unclear, though several appear to be synthesised in the mitochondria from TCA cycle intermediates. Lipids can be made in several cellular compartments, though for different purposes, such as the mitochondrial lipids which are directed towards wax ester biosynthesis and plastid lipids that are used to make photosynthetic glycolipids.

**Figure 2.** Proposed distribution of central metabolic pathways in *Euglena*. Abbreviations: oxPPP, oxidative pentose phosphate pathway; non-oxPPP, non-oxidative pentose phosphate pathway.

However, the locations of many metabolic processes in *Euglena* differ substantially from those found in other photosynthetic organisms. For instance, in *Euglena* the complete PPP is present in the cytosol, with a duplicated non-oxidative phase present in the plastid. A plant-like pathway for aromatic amino acid biosynthesis is present in the plastids [117]. However, unlike plants, in *Euglena* an additional pathway, similar to that found in fungi, is located in the cytosol. Tetrapyrroles, essential prosthetic groups of both the respiratory and photosynthetic electron transport chain proteins, are synthesised in both the chloroplast and mitochondria in *Euglena*.

Overall, these results indicate that, aside from the reactions of photosynthesis, all the metabolic pathways found in the *Euglena* plastid are also found elsewhere in the cell. This includes the biosynthesis of isoprenoids, for which two pathways are found in other plastid-containing organisms, the methylerythritol phosphate pathway found in the plastids and the mevalonate pathway in the cytosol. Although we have not found evidence for the location of these pathways in *Euglena*, the methylerythritol phosphate pathway only contributes to carotenoid biosynthesis in *E. gracilis*, and phytol is instead made by the mevalonate pathway [135], unlike in other studied organisms. The unusual and well-established ability of *E. gracilis* to survive on a simple carbon source when their chloroplasts have been destroyed can be rationalised from the subcellular localisation and duplication of these various critical pathways.

The complicated evolutionary history of *Euglena* means it is not trivial to predict the likely subcellular locations of the various metabolic pathways, or to decide whether the pathways will be similar to those in free-living heterotrophs, or plants, or be entirely different. Precise information is missing for some biosynthetic pathways and the lack of understanding of *Euglena* chloroplast protein targeting restricts the prediction of the subcellular location of some *Euglena* proteins. Despite these limitations, overall, the model is similar to plants and green algae, but has some important differences. The development of this model will lead to the ability to predict the metabolic phenotypes of *Euglena* under various growth conditions.

#### **4. Conclusions**

The subcellular compartmentation of metabolism has been intensively studied in yeast and in plants. For many, more distantly related organisms, most information is typically inferred by extrapolation from these thoroughly examined species. Drawing on a range of *Euglena* biochemical and proteomic data, we propose a model for the organisation of central metabolism in *E. gracilis*. These analyses reveal unique features of this alga that diverge significantly from expectations derived from well-studied organisms. The most striking difference in *Euglena* is the presence of extra activities of the enzymes of various biosynthetic pathways solely present in the plastids of plants, contributing to the ability of *Euglena* to lose its plastid entirely and survive on simple carbon sources. We propose that this is due to the requirement of the heterotrophic ancestor to synthesise all necessary cellular components before the acquisition of the secondary plastid. In this context, it seems likely that the plastid pathways are replicating pathways that were originally present in the euglenid progenitor.

#### **5. Materials and Methods**

#### *5.1. Identification of Euglena Enzymes*

The transcriptome of *E. gracilis* was searched for the target proteins using BLASTP with templates that were selected from the corresponding enzymes from other organisms represented in the NCBI databases. Identified *E. gracilis* transcripts were then used as templates to interrogate the NCBI databases, to confirm the correct identification of the proteins. The presence of a spliced leader was confirmed in 39% of all of these sequences, in the range previously reported in *Euglena* transcriptomes [16,17], by searching for a 10 bp sequence (TTTTTTTTCG or ATTTTTTTTC) at the 5′ end of the transcript.
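The spliced-leader check described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: the two 10 bp motifs are taken from the text, whereas the function names and the `window` parameter (how many bases count as "the 5′ end") are assumptions made for the example.

```python
# Spliced-leader motifs quoted in the text (10 bp each).
SL_MOTIFS = ("TTTTTTTTCG", "ATTTTTTTTC")


def has_spliced_leader(transcript: str, window: int = 50) -> bool:
    """Return True if either motif occurs within the first `window` bases.

    `window` is an assumed parameter; the text only says the motif was
    searched for at the 5' end of the transcript.
    """
    head = transcript[:window].upper()
    return any(motif in head for motif in SL_MOTIFS)


def spliced_leader_fraction(transcripts) -> float:
    """Fraction of transcripts that carry a spliced-leader motif."""
    transcripts = list(transcripts)
    if not transcripts:
        return 0.0
    hits = sum(has_spliced_leader(t) for t in transcripts)
    return hits / len(transcripts)
```

Applied to a list of transcript sequences, `spliced_leader_fraction` returns the proportion carrying a motif, which is the kind of figure reported above as 39%.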

#### *5.2. Protein Targeting Prediction for Euglena*

A selection of proteins known to be localized to the chloroplast, mitochondria, Golgi or cytosol [2,136,137] were used to validate the use of WoLF PSORT [28] and TargetP 1.1 [29] (Tables 1–3). For proteins predicted to be secreted by TargetP using the plant search parameters, the signal sequence was removed, using the algorithm's predicted cleavage site (Figure 3). The remaining sequence was then reanalysed to identify any alterations in targeting and potentially unveil a chloroplast targeting sequence. WoLF PSORT did not predict any secreted proteins using the plant search parameter. Results for metabolic pathway components are available in Supplementary Table S1.
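The two-pass logic of this workflow (predict, strip the signal sequence from proteins called secretory, then predict again) can be illustrated as follows. The sketch is hypothetical throughout: `predict` stands in for any targeting predictor and does not reproduce the real TargetP or WoLF PSORT interfaces or output formats, and `toy_predict` is an invented rule used only to make the example runnable. The location codes follow the abbreviations of Figure 3 (Mt, Chl, S, others).

```python
from typing import Callable, Optional, Tuple

# A predictor returns (location, cleavage_site). cleavage_site is the number
# of N-terminal residues in the predicted presequence, or None if no site is
# predicted. This interface is an assumption made for illustration only.
Prediction = Tuple[str, Optional[int]]


def two_pass_location(
    seq: str, predict: Callable[[str], Prediction]
) -> Tuple[str, Optional[str]]:
    """Return (first_pass, second_pass) predicted locations.

    The second pass is run only when the first pass calls the protein
    secretory ("S") and provides a cleavage site; otherwise it is None.
    """
    location, cleavage = predict(seq)
    if location != "S" or cleavage is None:
        return location, None
    trimmed = seq[cleavage:]  # remove the predicted signal sequence
    second_location, _ = predict(trimmed)
    return location, second_location


def toy_predict(seq: str) -> Prediction:
    """Invented stand-in predictor; the prefixes have no biological meaning."""
    if seq.startswith("MSIG"):
        return "S", 4
    if seq.startswith("CHL"):
        return "Chl", 3
    return "others", None
```

With the toy predictor, a sequence beginning `MSIG` followed by `CHL` is first called secretory and then, after trimming, chloroplast-targeted, which mirrors the bipartite targeting signal the reanalysis is intended to reveal.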

**Figure 3.** Subcellular location prediction workflow for *Euglena* proteins. Abbreviations: Mt—mitochondria; Chl—chloroplast; others—cytosol; S—secretory pathway.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2218-1989/9/6/115/s1, Table S1: Subcellular location prediction of *E. gracilis* metabolic pathway components using WoLF PSORT and TargetP1.1.

**Author Contributions:** Conceptualization, S.I., N.J.K., R.G.R., and E.C.O.; Data gathering and analysis, S.I.; Visualization, S.I.; Writing—review and editing, S.I., N.J.K., R.G.R., and E.C.O.

**Funding:** The Development and Promotion of Science and Technology Talents Project (Royal Government of Thailand scholarship) funded this research. E.C.O. is supported by a Violette and Samuel Glasstone Fellowship.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Far-Red Light Acclimation for Improved Mass Cultivation of Cyanobacteria**

**Alla Silkina 1, Bethan Kultschar <sup>2</sup> and Carole A. Llewellyn 2,\***


Received: 1 August 2019; Accepted: 15 August 2019; Published: 19 August 2019

**Abstract:** Improving mass cultivation of cyanobacteria is a goal for industrial biotechnology. In this study, the mass cultivation of the thermophilic cyanobacterium *Chlorogloeopsis fritschii* was assessed for biomass production under light-emitting diode white light (LEDWL), far-red light (FRL), and combined white light and far-red light (WLFRL) adaptation. The induction of chl *f* was confirmed at 24 h after the transfer of culture from LEDWL to FRL. Using combined light (WLFRL), chl *f*, *a*, and *d* maintained the same level of concentration in comparison to FRL conditions. However, phycocyanin and xanthophylls (echinenone, caloxanthin, myxoxanthin, nostoxanthin) concentration increased 2.7–4.7 times compared to LEDWL conditions. The productivity of culture was double under WLFRL compared with LEDWL conditions. No significant changes in lipid, protein, and carbohydrate concentrations were found in the two different light conditions. The results are important for informing on optimum biomass cultivation of this species for biomass production and bioactive product development.

**Keywords:** cyanobacteria; chromatic adaptation; LED; far-red light; growth; photosynthesis; mass cultivation; pigments; *Chlorogloeopsis*

#### **1. Introduction**

Cyanobacteria are photosynthetic prokaryotes that are increasingly explored for use in industrial biotechnology. They are extremely diverse and genetically tractable, making them attractive as cell factories, and can adapt to a wide range of extreme habitats, often with the production of unique metabolites [1]. These adaptations can be exploited in industry to increase productivity and for the production of useful compounds such as pigments, mycosporine-like amino acids (MAAs), and fatty acids [2,3].

Having a long evolutionary history, cyanobacteria have evolved with the ability to cope with varying light intensities and wavelengths. They are able to modify their chlorophylls (chls) and carotenoids, as well as rearrange photosystem I (PSI), PSII, and phycobilisomes (PBS) during excess or limited light conditions [4,5]. These rearrangements allow absorption of light to maximise photosynthetic efficiencies. These light-dependent acclimation processes include complementary chromatic acclimation (CCA), far-red light photoacclimation (FaRLiP), and low light photoacclimation (LoLiP) [6].

Chlorophyll (chl) *a* is the major photosynthetic photo-pigment within almost all organisms that utilise oxygenic photosynthesis [7]. Some cyanobacteria have photoadaptive strategies for absorbing longer wavelengths in the far-red light region (700–750 nm) by the production of chl *d* and *f* [8]. Inducible production of these chls has been seen in a variety of species, such as *Chlorogloeopsis fritschii* PCC 6912 [9], *Synechococcus* sp. PCC 7335 [10], *Chroococcidiopsis thermalis* PCC 7203, *Leptolyngbya* sp. JSC-1, and *Calothrix* sp. PCC 7507 [4]. This phenomenon, FaRLiP, achieves remodeling of PSI and PSII as well as the PBS [11], with production of chl *d*, *f*, and far-red light (FRL) absorbing phycobiliproteins to maximise photosynthesis, productivity, and survival [7,12].

*Chlorogloeopsis fritschii* (*C. fritschii*) is a subsection V cyanobacterium, first isolated from soils of paddy fields [13]. It has a variety of morphologies and is tolerant to a variety of growth conditions, which are good attributes for an industrial species [14,15]. Previous research on *C. fritschii* has shown the production of chl *d* and *f* under near infrared radiation [9].

Algal biotechnology is a developing area with continued advancements in technologies for cultivation and downstream processing. The main commercial applications of algal biomass are aquaculture feeding, bioremediation, and high value products [16]. The mass production of microalgae species, including cyanobacteria, is investigated around the world. This is because they are rich sources of bio-products such as polysaccharides, lipids, proteins, pigments, and bioactive compounds which can be utilised as feed and food and for pharmaceuticals, cosmetics, and health supplements [17]. The species-specific production of useful metabolites from the algal biomass, including cyanobacteria, has been widely reviewed for industrial biotechnological applications [3,16–18].

Additional research is required to understand the regulation of photosynthesis, photoprotection, and photomorphogenesis in cyanobacteria and the implication of the use of FRL in increasing productivity and/or pigment accumulation as a robust platform in industrial biotechnology [19]. In this study, we characterise the changes in productivity, pigment, and biochemical composition of *C. fritschii* during exposure to light-emitting diode white light (LEDWL) and FRL, followed by a comparison of white light with combined white light and FRL (WLFRL) results. We finish with a discussion on the application within industry.

#### **2. Results**

#### *2.1. Growth and Productivity of C. fritschii Under LED White Light and Far Red Light*

The two growth conditions (LEDWL and FRL) showed similar growth patterns over the 9 days. Cultures grew in lag phase for the first 4 days in both conditions, followed by an exponential growth phase for 5 consecutive days (Figure 1).

**Figure 1.** Growth curve, measured using optical density at 750 nm (OD750 nm) of *Chlorogloeopsis fritschii (C. fritschii)* under either light-emitting diode white light (LEDWL) or far-red light (FRL) only for 10 days. The dotted line represents the transfer of LEDWL cultures into FRL for a further 10 days.

The overall average growth rate over the first 8 days (day 0 to day 8) was 0.32 (STDEV = 0.01) for LEDWL, compared with 0.26 (STDEV = 0.02) for FRL conditions, showing a low light adaptation of *C. fritschii* cultures. No significant difference was found for the accumulation of biomass during the first 8 days under the two light conditions (*p* > 0.05). From day 4, both cultures reached an exponential growth rate. After 10 days of growth under LEDWL, the cultures were transferred to FRL conditions and exposed to FRL for a further 10 days. The average growth rate over this period was 0.27. The growth of *C. fritschii* continued in the exponential phase, with a reduced rate compared to LEDWL.
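For readers who wish to derive comparable rates from optical density readings, the sketch below uses the standard specific growth rate formula, μ = ln(OD<sub>2</sub>/OD<sub>1</sub>)/(t<sub>2</sub> − t<sub>1</sub>). This section does not state the exact calculation used for the values above, so the formula choice and the function name are assumptions.

```python
import math


def specific_growth_rate(od_start: float, od_end: float,
                         t_start: float, t_end: float) -> float:
    """Specific growth rate (per day) between two OD readings.

    Assumes exponential growth between the two time points (in days) and
    that OD at 750 nm is proportional to biomass.
    """
    if od_start <= 0 or od_end <= 0:
        raise ValueError("optical densities must be positive")
    if t_end <= t_start:
        raise ValueError("t_end must be later than t_start")
    return (math.log(od_end) - math.log(od_start)) / (t_end - t_start)
```

Under this formula, a culture whose OD<sub>750</sub> rises from 0.10 to about 1.29 over 8 days has μ ≈ 0.32 d<sup>−1</sup>; these OD values are illustrative and are not data from this study.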

The final biomass productivity of the culture under LEDWL was 0.014 g L<sup>−1</sup> d<sup>−1</sup> (STDEV = 0.001) and 0.03 g L<sup>−1</sup> d<sup>−1</sup> (STDEV = 0.001) for WLFRL conditions.
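Volumetric biomass productivity is commonly computed as the gain in dry weight per litre divided by the cultivation time. The sketch below assumes that definition, which is not spelled out in this section; the function name and the example dry-weight values are hypothetical.

```python
def volumetric_productivity(dw_start: float, dw_end: float,
                            t_start: float, t_end: float) -> float:
    """Biomass productivity in g L^-1 d^-1.

    dw_start and dw_end are dry weights in g L^-1; t_start and t_end are
    times in days. The definition used here is an assumption.
    """
    if t_end <= t_start:
        raise ValueError("t_end must be later than t_start")
    return (dw_end - dw_start) / (t_end - t_start)


# Hypothetical example: 0.02 g/L rising to 0.16 g/L over 10 days.
example = volumetric_productivity(0.02, 0.16, 0, 10)
```

With these illustrative inputs the result is approximately 0.014 g L<sup>−1</sup> d<sup>−1</sup>, the same order as the LEDWL value reported above.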

The pigment profile for LEDWL showed the presence of the following pigments: myxol-quinovoside (myxo), nostoxanthin (nosto), caloxanthin (calo), zeaxanthin (zeax), echinenone (echin), chl *a*, and β-carotene (Figure 2A). The FRL culture had a similar pigment profile, except that nosto was absent. This could be due to the concentration of this pigment being below the detection level of the HPLC system. A general trend of accumulation was observed for myxo, nosto, calo, and zeax under LEDWL conditions. Under FRL, the biggest changes were observed for myxo, echin, and β-carotene (Figure 2B).

**Figure 2.** Pigment composition of *C. fritschii* under (**A**) LED white light (LEDWL) and (**B**) far red light (FRL).

Maximum concentration of chl *a* was measured for both light conditions and the cultures grown under LEDWL had double the chl *a* concentration compared to the cultures exposed to FRL. After transferring the cultures from LEDWL to FRL conditions on day 10, the cultures showed a slight decrease in chl *a* concentration. The carotenoids maintained a consistent concentration after the transfer (day 10, Figure 2A).

The final concentration of pigments on day 9 (Table 1) showed a general trend of higher concentration (μg g<sup>−1</sup> of dry weight) under LEDWL compared to FRL conditions, except for myxo, which was 1.2 times higher under FRL than under LEDWL. Other pigments, such as calo, zeax, and β-carotene, were 1.3–1.6 times higher in LEDWL cultures, and the chl *a* and echin concentrations were over two-fold higher in LEDWL than in FRL cultures (Table 1).

Chl *d* (detected at 706 nm) was present both under LEDWL and after transfer to FRL conditions (Figure 3) and increased gradually over time. For chl *f*, under FRL there was a ~10-fold increase at day 9 compared to day 2. After transfer of LEDWL-exposed cultures to FRL, chl *f* was induced and there was a slight reduction in chl *a* and chl *d* (Figure 3).


**Table 1.** Concentration (μg g<sup>−1</sup> of dry weight) of main pigments present at day 9 within *C. fritschii* biomass during LEDWL and FRL conditions, including fold change comparing LEDWL and FRL.

Statistical significance was measured using a two-sample *t*-test with equal variance, \* = 0.05 ≥ *p* > 0.01, \*\* = 0.01 ≥ *p* > 0.001, \*\*\* = *p* ≤ 0.001.
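The significance coding in the note above can be illustrated with a minimal Python sketch. It assumes the thresholds are read as 0.05 ≥ *p* > 0.01, 0.01 ≥ *p* > 0.001, and *p* ≤ 0.001, and it computes only the pooled-variance (equal-variance) *t* statistic; the *p* value itself would come from a statistics package. The function names and the replicate values are hypothetical and are not data from this study.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Two-sample t statistic assuming equal variances (pooled variance)."""
    n1, n2 = len(a), len(b)
    # Pooled sample variance weighted by degrees of freedom.
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))


def significance_stars(p):
    """Map a p value to the star notation used in the table notes (assumed reading)."""
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return ""


# Hypothetical triplicate pigment concentrations (not measured values).
ledwl = [10.0, 11.0, 12.0]
frl = [7.0, 8.0, 9.0]
t_value = pooled_t_statistic(ledwl, frl)  # degrees of freedom = n1 + n2 - 2 = 4
```

With triplicate cultures per condition, the test has only four degrees of freedom, so a large *t* statistic is needed before a difference is flagged.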

**Figure 3.** Chlorophyll content (chl *a*, chl *d*, and chl *f*, detected at 706 nm) of *C. fritschii* exposed to FRL at day 2, 5, and 9 and LEDWL (day 9) and 24 h after transfer into FRL (day 10).

#### *2.2. Enhancement of Growth by Combining Two Light Sources (White LED Supplemented with Far Red-Light)*

Next, LEDWL supplemented with FRL (WLFRL) was compared with LEDWL alone. During the first 6 days, growth under the two light conditions (Figure 4, LEDWL and WLFRL) showed no significant difference (*p* > 0.05). After day 8, the cultures diverged in growth performance, with improved results for WLFRL. This was reflected in the average growth rate (μ) of 0.39 d<sup>−1</sup> (STDEV = 0.02) for WLFRL and 0.32 d<sup>−1</sup> (STDEV = 0.01) for LEDWL. The exponential growth phase for WLFRL lasted 8 days (day 8 to day 16, μ = 0.42, STDEV = 0.02), whereas the LEDWL condition had a 5-day exponential growth phase (μ = 0.33). The WLFRL light combination therefore resulted in improved growth.

The algal pigments, such as xanthophylls, carotenes, and chlorophylls were detected in both culture conditions (Figure 5). The WLFRL resulted in improved pigment accumulation (Figure 5B) with all pigments considerably increased in their quantity up to the last day of cultivation (day 19). During the exponential growth phase (WLFRL, day 8 to 16), the highest concentration for most of the analysed pigments was observed. The pigments under LEDWL conditions (Figure 5A) showed saturation at day 15, with a slight reduction in concentration by the final day of cultivation (day 19). Final pigment concentrations at day 19 (Table 2) showed an increase in levels under WLFRL conditions compared to LEDWL, with the exception of β-carotene, which showed increased levels in cultures exposed to LEDWL only.

**Figure 4.** Growth curve of *C. fritschii* under LED white light (LEDWL) and supplemented LED white light with far-red light (WLFRL).

**Figure 5.** Pigment composition of *C. fritschii* exposed to (**A**) LEDWL only and (**B**) LEDWL supplemented with FRL (WLFRL).

**Table 2.** Final pigment concentration (μg g<sup>−1</sup> of dry weight) of main pigments present in *C. fritschii* biomass at day 19 of the experimental trial for LEDWL and WLFRL conditions, including fold change comparing LEDWL and WLFRL.


Statistical significance was measured using a two-sample *t*-test with equal variance, \* = 0.05 ≥ *p* > 0.01, \*\* = 0.01 ≥ *p* > 0.001, \*\*\* = *p* ≤ 0.001.

The detection of chl *f*, chl *d*, and chl *a* at 706 nm (Figure 6) was investigated under LEDWL supplemented with FRL. An increase in chl *a*, chl *d*, and chl *f* was observed for the cultures grown under supplemented far-red light (WLFRL). Chlorophyll *f* reached its maximum concentration on day 13, after which it gradually reduced to its lowest content at day 19. The same result was observed for chl *d*, with accumulation at day 13; however, the concentration was 5 times less than chl *f*. Chlorophyll *a* consistently increased during the cultivation period and by day 19 reached its maximum concentration (Figure 6).

**Figure 6.** Chlorophyll content (chl *d*, chl *f*, and chl *a*, detected at 706 nm) of *C. fritschii* exposed to LED white light supplemented with far-red light conditions (WLFRL) at day 1, 6, 13, 15, and 19.

#### *2.3. Phycocyanin Concentration During LEDWL and WLFRL Conditions*

Cultures grown under LEDWL had high initial concentrations of phycocyanin followed by a reduction of the concentration until day 15. After this, a slight increase in the concentration was observed up to the final day of cultivation (day 20, Figure 7).

**Figure 7.** Phycocyanin concentration (μg mL<sup>−1</sup>) of *C. fritschii* under LEDWL and WLFRL conditions.

The growth conditions under WLFRL had a positive influence on the accumulation of phycocyanin. The concentration increased two-fold in two weeks of cultivation. Maximum concentrations were observed at day 15, followed by a slight decrease until day 20. The maximum concentrations observed under LEDWL and WLFRL were similar, at ~9.7 μg mL<sup>−1</sup>. At day 13, the cultures grown under the two light conditions (LEDWL and WLFRL) had similar concentrations of phycocyanin, suggesting 13 days as an optimum time for adaptation under both light regimes (Figure 7).

#### *2.4. Biochemical Composition during LEDWL and WLFRL Conditions*

Finally, the protein, carbohydrate, and lipid composition of cultures grown under the two light conditions (LEDWL and WLFRL) was evaluated (Figure 8). The biomass grown under both light conditions contained 21–25% carbohydrates, 15–22% proteins, and 2–4% lipids. Statistical analysis (Supplementary Materials, Table S1) showed no significant effect of light, time, or their interaction (light and time) on composition.

**Figure 8.** Biochemistry composition (%) of *C. fritschii* grown under LEDWL and WLFRL conditions.

#### **3. Discussion**

The discovery of chl *d* and chl *f* in terrestrial cyanobacteria demonstrated that the wavelength range of cyanobacterial photosynthesis could be extended into the far-red region (λ = 700 to ~800 nm) [7,12]. This specific adaptation helps cyanobacteria utilise FRL for growth and photosynthesis [20,21]. Under FRL, cyanobacteria tend to change their metabolism and sustain effective growth and active photosynthesis through metabolic changes, developing chl *f* and *d* as accessory pigments in antenna systems. These pigments absorb energy and transfer it to the photosynthetic reaction center (RC). Usually these pigments are not involved in the photosynthetic electron transport chain. Additionally, chl *d* as well as chl *a* can function in the photosynthetic RC [22]. Such a transformation in cyanobacterial metabolism increases the possibilities for absorbing light at longer wavelengths, which is important for cyanobacterial survival. There are many studies on these unique chls; however, their full function and role in cyanobacterial metabolism is still not clear, specifically how growth and productivity will be affected in the mass cultivation of this species for biotechnological purposes.

In our study, we can confirm that the changes in *C. fritschii* pigment composition (xanthophylls, chlorophylls, and phycocyanin) under FRL combined with LEDWL improved growth and doubled biomass productivity compared with LEDWL alone. This FaRLiP process triggered antennal transfer of energy to the photosynthetic RC, confirmed by an increase in chl *f* and *d* concentrations [4,22]. The combination of lights (WLFRL) changed the carotenoid profile. Under WLFRL, we observed an increased concentration of myxo, nosto, calo, and echin (Table 2). This effect was not seen when comparing the single-light adaptations (LEDWL only compared to FRL only). An extensive study of carotenoid changes in cyanobacteria by Zakar et al., 2016 [23], confirmed that these pigments are responsible for light harvesting and photoprotective capacities, showing their essential roles in photosynthetic metabolism [24,25]. Photoprotection can also occur through cyanobacterial carotenoid-proteins. One of these protein complexes is the orange carotenoid protein (OCP), discovered by David Krogmann [23,26]. The carotenoid composition of OCP is 60% echin, 30% of the keto-carotenoid 3′-hydroxyechinenone, and 10% zeax [27]. The increase in echin and zeax under LEDWL and FRL in our study is consistent with activation of this protein, which prevents cellular damage from excessive light. This effect has additionally been confirmed by non-photochemical quenching of the carotenoid-binding protein [28].

The combination of both lights activated different acclimation mechanisms and effective light assimilation for productive photosynthesis. Two main processes are involved in light adaptation: light energy harvest and light energy transfer [6]. In our case, using both lights increased the effectiveness of both light-adaptive mechanisms. Light energy harvest was demonstrated by the appearance of chl *f* and chl *d* and by increased concentrations of carotenoids and phycocyanin. Light energy transfer was indicated by an increased concentration of the OCP carotenoids under WLFRL in comparison with LEDWL conditions. Furthermore, chl *d* was involved in both processes [29]. This dual functioning of chl *d* was confirmed by a recent study of transcriptional profiling of *C. fritschii* in FRL for chl *d* synthesis regulation [22]. In summary, we propose that LEDWL maintained stable growth and that FRL activated the synthesis of chl *d* and chl *f* and restructured the functioning of PSI and PSII [11]. The combination of both lights increased growth, productivity, and oxygenic metabolism within *C. fritschii*.

Mass cultivation of *C. fritschii* is very relevant for applications in biotechnology [30]. The biology of this species has great potential for scale-up and mass cultivation in different latitudes around the world. This thermophilic cyanobacterium has several advantages in terms of large-scale growth. It grows at high temperatures, which gives it real advantages over other species for mass cultivation. In our study we grew this species at 25 °C; however, successful growth at mass scale was shown in Balasundaram et al., 2012, and it can grow at up to 50 °C [31]. The mass cultivation set-up (PBR and raceways) could be placed in desert conditions. It has been confirmed that this species can grow at elevated CO<sub>2</sub> concentrations of up to 5% [31,32]. It can therefore be co-located with industries emitting flue gas, e.g., power stations [31]. Additionally, this species could be cultivated in African and South East Asian weather conditions, as the biomass contains many valuable compounds for food and feed additives, can serve as a whole food, and can be used to combat malnutrition [33,34]. The application of this species as a feed for tilapia has been studied, showing that this species has potential in aquaculture [35].

Several advantages of the mass cultivation of this species relate to easy downstream processing. This species is autoflocculating and does not require expensive membrane filtration and/or centrifugation equipment to obtain the algal biomass paste for further processing and preservation [15,16]. This is another factor favouring successful mass cultivation of this species in different locations around the world, making it a model for worldwide application.

The use of mass cultivation of *C. fritschii* in the bioeconomy is an important target of algal biotechnology. Understanding its cell physiology and specific light adaptation will help to improve the production of biomass and specific compounds. The main bioactive compounds of *C. fritschii* are presented in Table S2. The principal groups are mycosporine-like amino acids (MAAs) and pigments. *Chlorogloeopsis* produces chlorophylls, carotenoids, and phycobiliproteins, which have different colours and can be used as biodegradable dyes [15,36]. Furthermore, bioproducts such as biodegradable and biocompatible plastics could be produced by *Chlorogloeopsis*. These are now very important biomolecules, with the potential to substitute for single-use plastic, because petrochemical and non-biodegradable contamination presents a major problem worldwide [37].

Many other applications of *Chlorogloeopsis* and cyanobacteria could be developed. This algal group can produce antimicrobial, antiviral, anticancer, and antiprotozoal compounds for pharmaceutical applications and can be used as a food, feed, and in other value-added products [38]. Further research and product development activities need to be established. In our research, we confirmed that the production of the main group of pigments (chlorophylls, carotenes, xanthophylls, and phycocyanin) could be of potential commercial interest.

#### **4. Conclusions**

A combination of LEDWL and FRL showed higher productivity of *C. fritschii*, with an increased concentration of myxo, nosto, calo, and echin. These combined light conditions triggered light harvesting and light energy transfer together with the induction of chls *d* and *f*, giving increased growth, photosynthetic effectiveness, and double the productivity of *C. fritschii* cultures. However, the overall protein, lipid, and carbohydrate composition did not significantly change under WLFRL. Our results suggested that the overall production of this biotechnologically promising species can be increased by cultivation using additional far-red light.

#### **5. Materials and Methods**

#### *5.1. Experimental Design*

For the first experiment, three flasks with a total volume of 800 mL each were placed under FRL and LEDWL conditions. The initial cell concentration was 0.5 × 10<sup>6</sup> cells mL<sup>−1</sup> (or 750 nm measurements ~0.05).

On day 9, at an OD750 of 0.3–0.4, the flasks under FRL were harvested. The cultures grown under LEDWL conditions were sampled in triplicate for growth and pigment analysis and then transferred to FRL (far red light) conditions for a further 6 days. After 24 h of exposure to FRL, the cultures were sampled again in triplicate for pigments and growth analysis.

For the second experiment, three flasks with a total volume of 800 mL of *C. fritschii* culture were placed under LEDWL and FRL combined with LEDWL. The initial cell concentration was 2 × 10<sup>6</sup> cells mL<sup>−1</sup> (or 750 nm measurements ~0.2). The duration of the experiment was 20 days.

Every 24 h, the growth parameters (cell concentration, biovolume, and optical density, OD) were measured, and algal biomass (harvested from 15 mL of culture) was collected for pigment analysis by HPLC and spectrophotometry and for biochemical analysis by FTIR.

#### *5.2. Source Organism and Cultivation Conditions*

*Chlorogloeopsis fritschii* was purchased from the Pasteur Culture Collection (PCC 6912; Paris, France). The master axenic culture was maintained in a temperature-controlled room (T = 28 °C) under white fluorescent light with a 16:8 h light:dark cycle.

#### *5.3. Growth Estimation*

Every 24 h, cell concentration and biovolume were measured with a Coulter counter (Multisizer 4, Beckman, USA) to quantify culture growth. Further details are described in Reference [39]. The OD (750 nm) was measured with a UNICAM UV 300 spectrophotometer.

During sampling days, a minimum of 50 mL of culture was taken from each tube and centrifuged (Beckman Coulter Centrifuge, Avanti J-20XP) for 20 min at 8000 rpm. The biomass was washed twice with deionized (DI) water, centrifuged for 20 min at 8000 rpm, then collected and freeze dried (ScanVac Cool Safe, LaboGene, Lynge, Denmark) for 24 h prior to further analysis.

The specific growth rate (μ) was determined for all LED light treatments using Equation (1), where *N*<sub>0</sub> and *N*<sub>1</sub> are the cell concentrations (cells mL<sup>−1</sup>) at times *t*<sub>0</sub> and *t*<sub>1</sub>, as follows:

$$
\mu = \frac{\ln N_1 - \ln N_0}{t_1 - t_0}. \tag{1}
$$
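As an illustration of Equation (1), the following minimal Python sketch computes μ from two cell counts. The function name and the example values are hypothetical; times are assumed to be in days, so that μ is returned in d<sup>−1</sup>.

```python
import math


def specific_growth_rate(n0, n1, t0, t1):
    """Specific growth rate: (ln N1 - ln N0) / (t1 - t0).

    n0, n1: cell concentrations (cells per mL) at times t0 and t1.
    t0, t1: sampling times (assumed here to be in days).
    """
    if n0 <= 0 or n1 <= 0:
        raise ValueError("cell concentrations must be positive")
    if t1 == t0:
        raise ValueError("sampling times must differ")
    return (math.log(n1) - math.log(n0)) / (t1 - t0)


# Hypothetical example: a culture growing from 0.5e6 to 6.0e6 cells per mL in 8 days.
mu = specific_growth_rate(0.5e6, 6.0e6, 0.0, 8.0)
print(f"mu = {mu:.2f} per day")
```

Because the calculation uses the difference of logarithms, only the ratio *N*<sub>1</sub>/*N*<sub>0</sub> matters, so any consistent unit of cell concentration gives the same μ.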

#### *5.4. Dry Weight and Biomass Productivity*

Dry weight was measured according to Reference [39]. A known volume of algae was pelleted and washed with deionized (DI) water (three times using 25 mL of DI water each time) prior to being filtered onto pre-weighed and dried filters (Whatman GF/F 47 mm Ø). The filters with algal biomass were then dried and re-weighed until constant weight was reached. Dry weight (g L<sup>−1</sup>) was then calculated as the difference between the final filter weight and the pre-weighed filter weight, divided by the filtered culture volume.

Biomass productivity was calculated as the difference in DW between the sample day and the previous day. The results are expressed in g L<sup>−1</sup> d<sup>−1</sup>.
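The dry weight and productivity calculations described in this subsection can be sketched in Python as follows. This is a minimal illustration, assuming that dry weight is normalised to the filtered culture volume and that consecutive samples are one day apart; the function names and numerical values are hypothetical.

```python
def dry_weight_g_per_l(filter_with_biomass_g, filter_empty_g, volume_l):
    """Dry weight (g per L): biomass mass on the filter divided by filtered volume."""
    if volume_l <= 0:
        raise ValueError("filtered volume must be positive")
    return (filter_with_biomass_g - filter_empty_g) / volume_l


def biomass_productivity(dw_today, dw_previous, interval_days=1.0):
    """Productivity (g per L per day): change in dry weight over the sampling interval."""
    if interval_days <= 0:
        raise ValueError("sampling interval must be positive")
    return (dw_today - dw_previous) / interval_days


# Hypothetical example: 25 mL of culture filtered on two consecutive days.
dw_day1 = dry_weight_g_per_l(0.1250, 0.1200, 0.025)
dw_day2 = dry_weight_g_per_l(0.1256, 0.1202, 0.025)
productivity = biomass_productivity(dw_day2, dw_day1)
```

Keeping the two steps separate makes it easy to check each filter weighing before the day-to-day difference is taken.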

#### *5.5. Pigments Extraction and Measurements*

A known mass of frozen cell paste was transferred to an extraction tube containing 1 mL HPLC grade acetone and 0.2 mg zirconium (0.1 mm diameter) beads. The sample was then lysed in a Precellys®24 high-throughput tissue homogenizer at 6500 rpm for 2 × 30 s with a pause of 5 s. The sample was centrifuged (5 min at 20,000× *g*, Microcentrifuge) and the supernatant was removed and used for pigment analysis by HPLC (Agilent HP 1200).

#### *5.6. HPLC*

The pigment extract was analysed using a high performance liquid chromatography (HPLC) method described previously (Method C in Airs et al., 2001 [40]). Pigment extracts were injected (100 μL) onto the HPLC column (2 Waters Spherisorb ODS2 cartridges coupled, each 150 × 4.6 mm, particle size 3 μm, protected with a precolumn containing the same phase). Elution was carried out using a mobile phase comprising methanol, acetonitrile, ammonium acetate (0.01 M), and ethyl acetate (Method C in Airs et al., 2001) at a flow rate of 0.7 mL min<sup>−1</sup>. The photodiode array (PDA) detector was set to monitor wavelengths at 406, 440, 660, 696, and 706 nm. Carotenoids and chl *a* were quantified against standards (Sigma) and, for chls *d* and *f*, peak areas were used [7,9].

#### *5.7. Phycocyanin Extraction and Determination*

Phycocyanin (PC) was extracted using a modified version [41] of the method developed in Reference [42]. The freeze-dried biomass of each sample was weighed on a semi-micro analytical balance (MSE 124S-100-DU, Sartorius, Germany). The sample weight was noted to the nearest 0.1 mg and all PC extractions were conducted in triplicate. The samples were transferred into 15 mL Falcon tubes and subjected to a minimum of five freeze-thaw cycles: the samples were immersed in 5 mL of 0.1 mol L<sup>−1</sup> phosphate buffer (pH = 6) and stored at −20 °C until frozen (∼2 h), then thawed and sonicated on ice for 10 min [43]. The samples were then vortexed for 5 min and placed back at −20 °C, and the process was repeated. After the final freeze-thaw cycle, the cell debris was removed via centrifugation at 8000 rpm for 5 min. The supernatant was recovered and used for PC measurements. Absorbance of the extracts was measured at 592, 618, and 645 nm using a UNICAM UV 300 spectrophotometer. The concentration of PC was determined using the equation in Reference [44], where OD is the optical density of the extract at the particular wavelength.

$$\text{PC (mg mL}^{-1}) = \left[(\text{OD}_{618} - \text{OD}_{645}) - (\text{OD}_{592} - \text{OD}_{645}) \times 0.15\right] \times 0.15 \tag{2}$$
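Equation (2) can be applied to spectrophotometer readings as in the minimal Python sketch below. The two coefficients are copied from Equation (2) as printed here and should be checked against Reference [44] before use; the function name, parameter names, and absorbance values are hypothetical.

```python
# Coefficients as printed in Equation (2); verify against Reference [44] before use.
INNER_COEFFICIENT = 0.15
OUTER_COEFFICIENT = 0.15


def phycocyanin_mg_per_ml(od592, od618, od645):
    """Phycocyanin concentration (mg per mL) from optical densities at 592, 618, and 645 nm."""
    corrected_618 = od618 - od645  # OD at 618 nm relative to OD at 645 nm
    corrected_592 = od592 - od645  # OD at 592 nm relative to OD at 645 nm
    return (corrected_618 - corrected_592 * INNER_COEFFICIENT) * OUTER_COEFFICIENT


# Hypothetical absorbance readings for one extract (not measured values).
pc = phycocyanin_mg_per_ml(od592=0.30, od618=0.50, od645=0.10)
print(f"PC = {pc:.4f} mg per mL")
```

Keeping the coefficients as named constants means they can be corrected in one place if the values in the cited reference differ.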

#### *5.8. Proteins, Lipids, and Carbohydrate Analysis Using Fourier Transformed Infra-Red (FTIR)*

FTIR attenuated total reflectance (ATR) spectra were collected using a PerkinElmer Model Spectrum Two instrument equipped with a diamond crystal ATR reflectance cell with a DTGS detector scanning over the wavenumber range of 4000–450 cm<sup>−1</sup> at a resolution of 4 cm<sup>−1</sup>, as described in References [45,46]. Briefly, ethanol (70%) was used to clean the diamond ATR before the first use and between samples. Approximately 3–5 mg of finely powdered freeze-dried *C. fritschii* biomass was applied to the surface of the crystal and then pressed onto the crystal head. Each bioreactor sample was measured in duplicate (each measurement an average of 12 scans) for each light type; the resulting 6 ATR spectra were averaged. Background correction scans of ambient air were made prior to each sample scan. Scans were recorded using the spectroscopic software Spectrum (version 10., PerkinElmer, Germany). The contents of lipids, proteins, and carbohydrates in the biomass samples were determined using FTIR, which had previously been calibrated using a mix of monosaccharides (rhamnose, xylose, glucuronic acid, and glucose) for carbohydrates, palmitic acid for lipids, and BSA for proteins at different concentrations. The carrier powder for the FTIR calibration was potassium bromide (KBr) [41].

#### *5.9. Statistical Analysis*

Statistical analyses were carried out in R on the OD, pigment concentration, lipid, carbohydrate, and protein content data. Data normality was tested using a Shapiro test. For data not following a normal distribution, significance was assessed using generalised linear models (GLMs) followed by an analysis of variance (ANOVA). Crossed-factor ANOVAs were carried out on normally distributed data. Both statistical methods tested the impact of experimental duration on pigments and on intracellular lipid, carbohydrate, and protein content. When statistical significance was found, post hoc Tukey tests were implemented.

All the experiments were performed in triplicate. Means and standard deviations were calculated, and significance was analysed by one-way ANOVA in Excel. The Duncan multiple range test was used to compare the significance of differences among treatments at *p* values of < 0.05. Results are reported as mean ± SD or as error bars.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2218-1989/9/8/170/s1, Table S1: Statistical analysis of composition of *C. fritschii* culture, Table S2: Bioactive compounds produced by *Chlorogloeopsis* sp.

**Author Contributions:** A.S. and C.A.L. conceived, designed, and performed the *C. fritschii* growth experiments under different light conditions; the biomass and pigments analysis were performed by A.S. and B.K.; all authors wrote the paper.

**Funding:** This research was funded by PHYCONET Proof of concept funding "Exploring chlorophyll-f and associated metabolism for improved intensive cultivation of cyanobacteria".

**Acknowledgments:** The authors would like to thank all CSAR technical staff for the support of the project.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Metabolites* Editorial Office E-mail: metabolites@mdpi.com www.mdpi.com/journal/metabolites
