**Zofia E. Taranu <sup>1,2,</sup>\*, Frances R. Pick <sup>1</sup>, Irena F. Creed <sup>3</sup>, Arthur Zastepa <sup>4</sup> and Sue B. Watson <sup>5</sup>**


Received: 1 October 2019; Accepted: 23 October 2019; Published: 26 October 2019

**Abstract:** Cyanobacterial blooms increasingly impair inland waters, with the potential for a concurrent increase in cyanotoxins that have been linked to animal and human mortalities. Microcystins (MCs) are among the most commonly detected cyanotoxins, but little is known about the distribution of different MC congeners despite large differences in their biomagnification, persistence, and toxicity. Using raw-water intake data from sites around the Great Lakes basin, we applied multivariate canonical analyses and regression tree analyses to identify how different congeners (MC-LA, -LR, -RR, and -YR) varied with changes in meteorological and nutrient conditions over time (10 years) and space (longitude range: 77°2′60″ to 94°29′23″ W). We found that MC-LR was associated with strong winds, warm temperatures, and nutrient-rich conditions, whereas the equally toxic yet less commonly studied MC-LA tended to dominate under intermediate winds and wetter, nutrient-poor conditions. A global synthesis of lake data in the peer-reviewed literature showed that the composition of MC congeners differs among regions, with MC-LA more commonly reported in North America than Europe. Global patterns of MC congeners tended to vary with lake nutrient conditions and lake morphometry. Ultimately, knowledge of the environmental factors leading to the formation of different MC congeners in freshwaters is necessary to assess the duration and degree of toxin exposure under future global change.

**Keywords:** Cyanotoxins; microcystin congeners; MC-LA; nutrients; climate; Great Lakes; raw water intake; multivariate statistics; long-term monitoring

**Key Contribution:** We showed that, regionally, microcystin congener composition varied systematically with environmental change (primarily in response to changes in wind speed, with secondary effects of nutrients and temperature). In turn, this regional-scale heterogeneity helped explain global-scale patterns and the reported dichotomy in congener dominance between Europe and North America.

### **1. Introduction**

The incidence of severe cyanobacterial blooms is increasing worldwide [1–5], along with the risk of exposure to cyanobacterial toxins [6–8]. Cyanotoxins are found on all continents [9], though among the suite of cyanotoxins that occur in freshwater ecosystems, microcystins (MCs) are the most commonly reported [9] and the most diverse, with over 240 different structural variants (congeners) identified to date [10]. Despite considerable research on MCs, little is known about the global or multi-region occurrence of the different congeners, and how their distribution relates to environmental conditions (though see key work by [11,12]). This inability to predict the spatial (and temporal) variability in MC congeners is a major concern given that numerous wildlife and livestock fatalities have been linked to exposure to these cyanotoxins [13,14]. Humans are also at risk. People living in close proximity to lakes with frequent MC-producing cyanobacterial blooms or exposed to contaminated drinking or recreational water have been reported to experience various health problems such as muscle pain and gastrointestinal, skin and ear irritations [15]. There is also mounting evidence of chronic health problems associated with MC exposure, including a higher incidence of non-alcoholic liver cancer [16–19]. More recently, MCs have been shown to transfer up the pelagic food chain to fish [20], with additional long-term implications for human health [21,22].

Although new MC congeners are continually being identified [10], only a small number are monitored on a regular basis, largely due to the lack of standards and the need to use advanced LC/MS/MS techniques to identify multiple congeners. More routinely, total MC concentrations are reported as MC-LR equivalents based on enzyme-linked immunosorbent assays (ELISA) or protein phosphatase inhibition bioassays (PPI). Among MC congeners, MC-LR is considered the most common and widely distributed [23,24]; however, the use of ELISA kits by many agencies may create a bias as the method has a lower reactivity to MC congeners other than MC-LR. This bias would be most extreme in regions where MC-LR is not the most common congener or not among the MC congeners detected (e.g., [25,26]), leading to an underestimation of total MC concentrations. For instance, MC-RR and -YR have been increasingly reported globally (e.g., [27]), and there are reports of MC-LA dominance in Canada (e.g., [28,29]) and in the US (e.g., Midwest [24]), suggesting that these variants are more common than previously thought [30]. From a toxicity point of view, the occurrence of MC-LA-dominated blooms in some North American lakes can have important consequences for animal health. MC-LA has been shown to be as toxic as MC-LR [31,32] and as persistent as, if not more so than, MC-LR. For example, [33] found that MC-LA persisted throughout the recreational season (9.5 weeks) in a small temperate Canadian lake, long after the disappearance of surface cyanobacterial blooms (visible for five weeks), whereas MC-LR was found to decline much more rapidly (two to four weeks) in other lakes. MC-LA may also penetrate lipid membranes more readily (more hydrophobic) than MC-LR or -RR, increasing its likelihood of bioaccumulation, and it has been directly tied to wildlife mortalities [14]. MC-LA is also less readily removed by carbon filtration than MC-LR, -RR, and -YR [32], posing additional challenges for water treatment facilities.

Compared to the large number of studies based on total microcystins, there are fewer studies on the dynamics of specific MC congeners, and these studies suggest there may be a wide range in congener composition among lakes [24,26]. Indeed, while MC-LA appears to be an important player in North American lakes, it seems less common in European lakes [7]. Though this difference may be due to differences in standards available for MC congener analysis, recent work suggests that regional differences in environmental conditions may explain some of the spatial variability in cyanotoxin composition. For instance, environmental factors such as water temperature, light regime, and thermal stratification were shown to be significant drivers of the cyanotoxin composition observed across a broad-scale synthesis of data from 137 European lakes [12]. These spatial patterns in cyanotoxin composition may admittedly reflect the dominance of certain cyanobacterial strains or species, which are themselves driven by environmental gradients (e.g., [28]). However, patterns in species composition may not necessarily determine patterns in MC congener composition. MCs are produced by a number of cyanobacteria (e.g., *Dolichospermum*, *Microcystis*, and *Planktothrix*), each of which can produce several MC congeners simultaneously [26,34–36]. In addition, there is a wide variation in congener composition among strains of the same species. Furthermore, given that guidelines associated with drinking water consumption are based on MC concentrations, and not species composition [12,37], and that MC congeners can vary in toxicity by one to two orders of magnitude [32,38], the questions of which MC congener dominates and why are important to resolve directly. Agencies would also benefit from robust predictions of impending toxic blooms and knowledge of which cyanotoxins are likely to occur under which routinely monitored environmental conditions [39]. Knowledge of the persistence of MC congeners in natural systems would also be important to reduce the discrepancy between lake closures and the period of potential toxin exposure [33,40]. Overall, a shift in focus from the analysis of one (MC-LR) to many (MC congener composition) will likely have important implications for the protection and management of freshwater resources.

In this study, we modelled changes in the relative concentration of different MC congeners (MC-LA, -LR, -RR, and -YR) in response to routinely monitored environmental factors using a raw water intake dataset collected by a single agency (the Ontario Ministry of the Environment, Conservation and Parks (OMECP)) using the same analytical methods. This entailed both spatial (19 intake sites situated across 12 main water bodies) and temporal (seven months sampled over 10 years) analyses to identify the importance of changes in meteorological conditions and lake nutrient concentrations on the variation of different MC congeners across the Great Lakes region of North America. We examined the potential role of weather-related variables including air temperature, precipitation, wind speed and wind direction as these have previously been shown to enhance cyanobacterial dominance and bloom formation [41–43]. We also examined the potential role of major nutrients (phosphorus (P) and nitrogen (N)) as they are strong predictors of cyanobacterial biomass, cyanobacterial dominance, and cyanotoxins in freshwaters [44,45]. Lastly, we considered characteristics of each intake site (depth and distance from shore) as well as dreissenid mussel control measures (raw water pre-chlorination) which may affect MCs [46,47]. To place this regional analysis within a global context, we conducted a synthesis of the peer-reviewed literature to test for systematic patterns of MC congener dominance across multiple regions.

### **2. Results**

#### *2.1. Regional Analysis of the Great Lakes Intake Sites*

The time period for which analytical data were available differed among MC congeners. Most were only quantified from 2013 onwards, whereas MC-LR, -RR, and -YR were quantified from 2004 onwards and MC-LA from 2006 onwards. We thus restricted our statistical analyses to years when the four most dominant congeners (MC-LA, -LR, -RR, and -YR) were measured (i.e., 2006–2016). We also restricted our analysis to ice-free months (April to November). This provided us with a regional raw water intake database spanning a 10-year period, collected during the open water season from 12 main water bodies in Ontario, Canada (Laurentian Great Lakes, Lake of the Woods, and other Ontario lakes and rivers, Figure 1). For the years and months examined, intake sites were sampled, on average (mean and median), four times per month (range: one to nine days per month), for a total of 2002 unique day-month-year-site combinations covering the years 2006–2016 and 19 intake sites (as some water bodies had multiple intake sites).

Despite differences in the time of sampling across the main water bodies (e.g., complete sampling in Lake Ontario, intermittent in Detroit River and Lake Erie, and late-onset of sampling in Lake St. Clair (Figure 2)), we observed important spatial patterns in MC congener composition within the Great Lakes and the surrounding region. When averaging the concentrations of each congener across all sampling dates, we noted the variability in congener dominance among water sources (Figure S1). Some water bodies (e.g., Lake St. Clair) had greater MC-LA concentrations, while others (e.g., Lake Ontario) had greater MC-LR or MC-RR concentrations across time points. In general, MC-LA was higher at intake sites located along the Detroit River, Lake Erie, and Lake St. Clair, and was identified by the indicator value index (*indval*, Dufrêne and Legendre 1997) as having high fidelity and specificity to these sites (Figure 1), whereas MC-LR was an indicator of sites along Lake Ontario's Bay of Quinte. These dominance patterns were also consistent from year to year within a given water body (Figure S2), notably so among the most routinely monitored sites with highest MC concentrations (i.e., intakes on Lake Ontario's Bay of Quinte (Bayside, Belleville, Deseronto, and Picton), Lake Erie (Elgin, Essex, Pelee, and Union), Lake St. Clair (Lakeshore and Stoney Point) and the Detroit River).
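As a rough illustration of the indicator value index, the sketch below computes the Dufrêne and Legendre statistic for a single congener as the product of specificity (the group's share of the summed group-mean concentrations) and fidelity (the fraction of samples in the group where the congener was detected). It is not the code of the *indval* function in {labdsv}; the function name, the 0–1 scaling, the group labels, and the concentrations are all assumptions made for the example.

```python
from collections import defaultdict


def indval(abundances, groups):
    """Indicator values (Dufrene and Legendre 1997) for one congener.

    abundances: concentration of the congener in each sample
    groups:     water-body label of each sample
    Returns {group: specificity * fidelity}, each on a 0-1 scale.
    """
    by_group = defaultdict(list)
    for value, group in zip(abundances, groups):
        by_group[group].append(value)

    # Mean concentration per group; specificity divides by the sum of means.
    means = {g: sum(v) / len(v) for g, v in by_group.items()}
    total = sum(means.values())

    result = {}
    for g, values in by_group.items():
        specificity = means[g] / total if total > 0 else 0.0
        fidelity = sum(1 for v in values if v > 0) / len(values)
        result[g] = specificity * fidelity
    return result


# Hypothetical MC-LA concentrations (ug/L) in two made-up groups of samples.
mc_la = [0.4, 0.6, 0.0, 0.5, 0.0, 0.0, 0.1, 0.0]
water_body = ["west"] * 4 + ["east"] * 4
scores = indval(mc_la, water_body)
print(scores)
```

A congener scores highly for a group only when it is both concentrated in that group and detected in most of its samples, which is why the index is read as combining specificity and fidelity.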

**Figure 1.** Map of main water bodies with raw water intake samples analyzed by the OMECP. Points are colour coded by dominant congener as determined by an indicator analysis (*indval* function from the {labdsv} package in R), which showed that Robin Lake (**8**), Lake St. Clair (**9**), Detroit River (**10**) and Lake Erie (**11**) were dominated by MC-LA (red), Lake of the Woods (**1**), Lake Wabigoon (**2**), Lake Couchiching (**5**), the Otonabee River (**6**), and Lake Ontario's Bay of Quinte (**7**) were dominated by MC-LR (green), Ramsey Lake (**3**) was dominated by MC-YR (blue), and Lake Nipissing (**4**) by MC-RR (orange). Note: There are four intake sites located along Lake Erie (Elgin, Essex, Union, and Pelee island), four along the Bay of Quinte (Bayside, Belleville, Deseronto, and Picton) and two along Lake St. Clair (Lakeshore and Stoney Point). All other main water bodies only have one intake site.

When examining all unique sampling dates (i.e., not averaged across years within a water body, nor across observations within a given year and water body) in the four most frequently monitored water bodies (Bay of Quinte, Lake Erie, Lake St. Clair and the Detroit River), we noted that the variance in MC-LA and -LR (the two dominant congeners) greatly increased after the year 2012 in the Bay of Quinte, with similar patterns in Lake Erie, Lake St. Clair and the Detroit River (Figure S3), though the sampling was more sporadic for the latter three prior to 2012. For the years monitored, the concentration of MC-LA likewise increased over time in Lake Erie, Lake St. Clair and the Detroit River (Figure S3). Consequently, the relative concentration of each MC congener changed over time (Figure 2), whereby MC-LA concentrations approached and even surpassed those of MC-LR and -RR after 2012.

Environmental factors likewise varied over space and time in the four most frequently sampled water bodies of the Great Lakes basin. On the dates sampled, the Bay of Quinte region tended to experience warmer conditions than the other sites, whereas the Detroit River and Lake St. Clair region tended to experience higher wind and wetter conditions (Table 1). Meteorological conditions also varied over the years monitored and across stations (Figure S4). Lower wind speed, decreased precipitation, and higher maximum temperatures were recorded after 2012 at the meteorological station near the Detroit River and Lake St. Clair. Similar drops in wind speed and precipitation were recorded in 2015 and 2016 at the weather stations near Lake Erie and the Bay of Quinte (Figure S4). Across sites and time points, nutrient concentrations were typically within the oligo-mesotrophic range, with minimum, median, and maximum TP concentrations of 5, 21 and 675 μg/L, respectively. TP concentrations tended to be higher along the Bay of Quinte, whereas TN was highest at the Lake St. Clair intake sites (Table 2). There was a tendency for TP to decrease over time across the four most frequently monitored water bodies, whereas TN only decreased in the Bay of Quinte, Lake Erie and Lake St. Clair sites (Figure S3). No significant trend in TN was detected in the Detroit River. The depth of the intake sites, their distance from shore, and the use of pre-chlorination to control for dreissenid mussels (pre-chlorination in 59% of sites) also differed among intake sites and main water bodies (Figure S5, Table 3).

**Figure 2.** Relationships between dominant congeners from the Laurentian Great Lakes basin raw water intake data. Log-transformed concentrations of (**a**) MC-RR and (**b**) MC-LA vs. MC-LR.

As mentioned previously (and shown in Figure S2), most water bodies were only intermittently monitored. Thus, to provide a more robust investigation of the distribution pattern of MC congeners in time and space, we restricted further statistical analyses to the four water bodies most frequently sampled by the OMECP (Bay of Quinte, Lakes Erie and St. Clair, and the Detroit River). In terms of relationships between MC congener composition and environmental change, a multivariate canonical ordination (redundancy analysis, RDA) showed that the relative abundance of each congener varied with meteorological conditions, nutrient concentrations, chlorine treatment (Yes/No), and the distance of the intake sites from shore (Figure 3). The relative abundance of MC-LR and -RR increased as mean monthly maximum temperatures and TP increased, and as the direction of the wind with the highest speed changed from southwesterly winds (200°) to northern winds (360°). In contrast, the relative abundance of MC-LA increased as maximum wind speeds decreased. Furthermore, intakes closer to shore tended to have higher MC-LR and -RR, and chlorinated sites tended to have higher MC-LA (Figure 3). The multivariate linear model examined with this RDA accounted for a small portion of the total variance in MC congener composition (R<sup>2</sup>-adj = 0.09). We also identified a clustering among observations from the same water body (Figure 3), and including water body as a co-variate in the RDA accounted for an additional 7% of the variance in MC composition (figure not shown). A limitation of the RDA, however, is that observations with missing environmental data were omitted prior to analysis and that the relationships were based on linear regressions.
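The variance-explained figure quoted for the RDA can be illustrated with a minimal sketch of the multivariate linear regression that underlies a redundancy analysis. Each response column is regressed on the centered predictors, R<sup>2</sup> is the ratio of fitted to total sum of squares, and the adjustment shown is the usual Ezekiel formula. The sketch omits the ordination step (the PCA of fitted values that yields the canonical axes), assumes the predictors are not collinear, and uses hypothetical data and function names; it is not the analysis code used for Figure 3.

```python
def center(column):
    """Subtract the column mean from every value."""
    mean = sum(column) / len(column)
    return [v - mean for v in column]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def rda_r2(y_cols, x_cols):
    """Return (R2, adjusted R2) for responses y_cols given predictors x_cols."""
    y_cols = [center(y) for y in y_cols]
    x_cols = [center(x) for x in x_cols]
    n, m = len(y_cols[0]), len(x_cols)
    xtx = [[dot(xi, xj) for xj in x_cols] for xi in x_cols]
    total = explained = 0.0
    for y in y_cols:
        beta = solve(xtx, [dot(x, y) for x in x_cols])
        fitted = [sum(b * x[i] for b, x in zip(beta, x_cols)) for i in range(n)]
        total += dot(y, y)
        explained += dot(fitted, fitted)
    r2 = explained / total
    # Ezekiel adjustment: penalises R2 for the number of predictors m.
    r2_adj = 1 - (1 - r2) * (n - 1) / (n - m - 1)
    return r2, r2_adj


# Hypothetical data: one predictor (maximum wind speed, km/hr) and the
# relative abundance of two congeners in five samples.
wind = [30, 35, 40, 45, 50]
rel_la = [0.1, 0.2, 0.3, 0.4, 0.5]
rel_lr = [0.3, 0.1, 0.2, 0.1, 0.3]
r2, r2_adj = rda_r2([rel_la, rel_lr], [wind])
print(round(r2, 3), round(r2_adj, 3))
```

Because the adjustment shrinks R<sup>2</sup> as predictors are added, a model with many environmental variables and a modest fit can end up with a small adjusted value such as the one reported above.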


**Table 1.** Summary statistics of the climate data (monthly averages from April to November, for 2006–2016) from weather stations near the four main water bodies most frequently monitored by the Ontario Ministry of the Environment, Conservation and Parks (OMECP).

MAT = maximum air temperature, and *N* = total sample size (only one station used for each water body, with the exception of Lake St. Clair and Detroit River where the same station was used due to proximity of both water bodies). Due to some NAs, *N* = 384 observations for wind direction in Lake Erie, whereas all other climate variables have *N* = 476 observations in Lake Erie.

The use of a multivariate regression tree (MRT) analysis helped identify significant non-linear responses to environmental factors, as well as their interactions, which together accounted for an additional 25% of the variance in MC congener composition (R<sup>2</sup> = 0.34, Figure 4). Among the variables tested, wind speed was the most important explanatory variable in predicting congener composition (Figure 4, tree node 1), but temperature, precipitation, and nutrients had significant secondary effects. When average monthly winds were very stable (maximum wind speed <36 km/hr on average), MC-LR dominated, followed by MC-RR (Figure 4, node 5). When the wind was relatively stable (maximum wind speed between 36 and 51 km/hr) coupled with low average monthly precipitation (<1.6 mm), MC-LR still dominated, followed by MC-RR (Figure 4, node 5). In contrast, when intermediate winds (between 36 and 51 km/hr) were coupled with precipitation above 1.6 mm, MC-LA was the main MC encountered, with more MC-LA than other congeners when temperatures were cooler (<13 °C). Under wet but warmer conditions (≥13 °C), MC-LA, -LR, and -RR were generally more abundant when TP was at least in the mesotrophic range (TP > 8.5 μg/L) (Figure 4, node 8), but MC-LA was greatest when TN was below 169 μg/L (Figure S6). Under high wind speeds (maximum wind speed > 51 km/hr), combined with warm air temperatures (MAT > 14 °C on average) and as the direction of wind with the highest speed changed from southwest to more northern winds, MC-LR once again dominated, especially under nutrient-rich conditions (TP > 26 μg/L), followed by MC-RR (Figure 4, node 4).
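To make the threshold logic of a multivariate regression tree more concrete, the sketch below searches one predictor for the single split that minimises the summed within-node sum of squares across all response columns, which is the splitting criterion generally used for MRTs. A full analysis would recurse over every predictor and prune the tree by cross-validation; this example stops at one split, and its function names and data are hypothetical rather than taken from the study.

```python
def within_ss(rows):
    """Sum of squared deviations of each response column from its node mean."""
    if not rows:
        return 0.0
    n_cols = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(n_cols)]
    return sum((r[j] - means[j]) ** 2 for r in rows for j in range(n_cols))


def best_split(predictor, responses):
    """Find the threshold that minimises the summed within-node SS.

    Returns (threshold, r2), where r2 is the share of the total SS
    removed by splitting the samples at that threshold.
    """
    total = within_ss(responses)
    if total == 0:
        return None, 0.0
    values = sorted(set(predictor))
    best = (None, 0.0)
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [r for x, r in zip(predictor, responses) if x < threshold]
        right = [r for x, r in zip(predictor, responses) if x >= threshold]
        r2 = 1 - (within_ss(left) + within_ss(right)) / total
        if r2 > best[1]:
            best = (threshold, r2)
    return best


# Hypothetical samples: maximum wind speed (km/hr) and the relative
# abundance of [MC-LA, MC-LR] in each sample.
wind = [30, 32, 34, 40, 44, 48]
composition = [[0.2, 0.8], [0.2, 0.8], [0.2, 0.8],
               [0.7, 0.3], [0.7, 0.3], [0.7, 0.3]]
threshold, r2 = best_split(wind, composition)
print(threshold, round(r2, 3))
```

Each node of a fitted tree is the result of one such search, so the reported thresholds (for example 36 and 51 km/hr for wind speed) are the cut points that best separated congener composition at that level of the tree.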


**Table 2.** Summary statistics of nutrient data (months: April–November, years: 2006–2016) from the four main water bodies most frequently monitored by OMECP.

TN = total nitrogen (μg/L), TP = total phosphorus (μg/L), *N* = total sample size, and *n sites* = number of raw water intake sites within each main water body.

**Figure 3.** Redundancy analysis of the relationships between the relative abundance of microcystin congeners (i.e., MC-LA, -LR, -RR, and -YR) and environmental factors (i.e., meteorological conditions, nutrient concentrations, intake location, and chlorine treatment) from the most frequently sampled water bodies of the Laurentian Great Lakes basin raw water intake data. Quantitative environmental factors were scaled and centered to reduce variance prior to analysis. The centroids of the qualitative variable (categorical variable for chlorine treatment, Yes/No) are shown by the grey diamonds.


**Table 3.** Summary of morphometric and nutrient (total phosphorus (TP)) variables for the Laurentian Great Lakes basin (raw water intake sites analyzed by OMECP) and global synthesis of the literature. *N* represents the number of main water bodies.


**Figure 4.** Multivariate regression tree (MRT) of congener relative abundance from the most frequently sampled water bodies of the Laurentian Great Lakes basin, constrained by meteorological conditions (MAT = maximum temperature, WindSpeed = maximum wind speed, WindDir = direction of wind with the highest speed, Precip = precipitation) and nutrient concentrations (TP = total phosphorus).

To tease apart potential effects of geographical location from those of the landscape-scale environmental gradients, we used univariate mixed-effect regression trees with either the concentration of MC-LA or -LR, the two most common congeners detected in the region, as the response variable (the analysis provided similar results when using the relative abundance of MC congeners instead of absolute concentrations and, although not shown, MC-RR behaved similarly to MC-LR). The univariate trees showed that MC-LR concentrations were highest under warm, high nutrient conditions (Figure 5a) and that MC-LR concentrations were higher on average in the Bay of Quinte (Figure 5b). In contrast, MC-LA concentrations were higher on average in Lake St. Clair, and primarily related to wind speed (Figure 5c,d). The concentrations of both MC congeners were greatest from 2013 onwards (Figure 5).

*Toxins* **2019**, *11*, 620

**Figure 5.** Mixed effect regression tree analysis for MC-LR and MC-LA concentrations from the most frequently sampled water bodies of the Laurentian Great Lakes basin, where fixed effects relationships are shown in (**a**,**c**) and random intercept coefficients are shown in (**b**,**d**) for MC-LR and -LA, respectively. MAT = monthly averaged maximum air temperature (°C), TP = Total Phosphorus (μg/L), and WindSpeed = monthly averaged maximum wind speed (km/hr).

#### *2.2. Global Analysis of the Peer-Reviewed Literature*

At the spatial scale of the Great Lakes basin and the surrounding region, we detected significant heterogeneity in MC congener dominance (notably between MC-LA and MC-LR, -RR) due in part to environmental variability among water bodies and over time. Within a global context, the synthesis of studies reporting MC congener data likewise showed a pattern in MC congener occurrence. In particular, MC-LA was most common in North and South American lakes (Figure 6a, Kruskal–Wallis test: χ<sup>2</sup> = 44.76, *p* < 0.0001). The lakes sampled in North America (predominantly the US and Canada) tended to have a smaller surface area, be shallower in depth, and have lower TP concentrations than the lakes sampled in the other five continents (Table 3, Figure S7a). MC-LR was more evenly distributed (Figure 6b, Kruskal–Wallis test: χ<sup>2</sup> = 1.73, *p* = 0.943), though most common in eutrophic lakes (i.e., intermediate TP range for these sites (Figure S7b)), which corresponded to lakes sampled in Africa, Asia, Europe, and North America (Figure S7a, Table 3). Lastly, MC-RR and -YR tended to divide the depth niche space (the former being relatively more abundant in lakes of intermediate depth, the latter being relatively more abundant in deeper lakes (Figure S7b)).
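For readers less familiar with the Kruskal–Wallis test used for these among-continent comparisons, the sketch below computes its H statistic from pooled ranks, with the standard correction for ties. H is then compared against a chi-square distribution with one fewer degree of freedom than the number of groups (the p-value step is left out here). The regions and percentages are hypothetical, the function name is ours, and the data are assumed to contain at least two distinct values.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the standard tie correction."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)

    # Assign each distinct value the mean of the ranks it occupies.
    rank_of, tie_term = {}, 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        i = j

    # Sum over groups of (rank sum squared / group size).
    rank_sum_term = sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    )
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    return h / (1 - tie_term / (n ** 3 - n))


# Hypothetical percentage of MC-LA in lakes from three made-up regions.
region_a = [40, 55, 62]
region_b = [5, 12, 20]
region_c = [25, 30, 35]
h_stat = kruskal_wallis_h([region_a, region_b, region_c])
print(round(h_stat, 2))
```

Because the test works on ranks rather than raw percentages, it suits data such as congener proportions, which are bounded and often strongly skewed.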

**Figure 6.** Summary of global meta-analysis of the peer-reviewed literature (OMECP raw intake sites were omitted). Boxplot of the percentage of (**a**) MC-LA and (**b**) MC-LR in different continents, where n = number of lakes and N = number of countries in each continent.

### **3. Discussion**

#### *3.1. Regional Relationship between Congener Occurrence and Environmental Conditions*

Our regional analysis of the Laurentian Great Lakes showed that the variability in MC congener composition across raw water intake sites of Southern Ontario, Canada could be due to differences in meteorological conditions and lake nutrient status. MC-LA and MC-LR were the most commonly observed MC congeners in the region, though their dominance was distinct across the landscape (Figure 1). As environmental conditions changed (lower wind speed, decreased precipitation and higher temperatures in most recent years (Figures S3,S4)), the relative abundance of MC-LA and MC-LR also changed (increased proportion of -LA in western locations (Figure 2)), and the concentration of both MC congeners became increasingly variable among sampling dates from 2013 onwards (Figure S3). This observation led us to further assess whether environmental factors influenced MC congener prevalence. Our findings showed that when intermediate winds (average monthly wind speeds ranging from 36 to 51 km/hr) were coupled with wetter conditions, MC-LA tended to dominate. These conditions were typical of the raw water intake sites in Lake St. Clair (Figure 5c,d), as well as along the Detroit River and Lake Erie (Figure 1). In contrast, either weak wind (<36 km/hr), or stronger winds (>51 km/hr) coupled with warm conditions (>14 °C) and nutrient-rich waters (TP > 26 μg/L) were related to the dominance of MC-LR (and co-dominance of MC-RR). These conditions were typical of the Bay of Quinte (Figures 1 and 5a,b).

The overriding effect of monthly-averaged wind speed on MC composition in this regional dataset is noteworthy (Figure 4). Wind speed, through changes in turbulent mixing and water temperature, is known to affect phytoplankton and cyanobacteria species composition due to differences in buoyancy regulation and temperature optima [48]. That is, gas-vacuolated species can regulate their buoyancy and overcome settling to optimize resource acquisition during low wind speeds, strong stratification and high irradiance [41,49]. Michalak et al. [50] found that the combined effect of calm wind conditions, reduced lake mixing, increased nutrient loading and increased precipitation may have facilitated a record-breaking *Microcystis* bloom in Lake Erie in 2011, and furthermore, predicted increases in the frequency of such events in the future (50% increase in large storms with precipitation >30 mm under future climate models). Recently, Kelly et al. [51] found that lower wind speed (≤37 km/hr), combined with an increase in eutrophication indicators (Chl *a*, nutrients) and temperature, was associated with an increased probability of total MC concentrations exceeding drinking water standards (1.5 μg/L) in the Bay of Quinte. Given the decline in wind speed observed in recent years in the Laurentian Great Lakes region (Figure S4) and predictions of higher total annual precipitation [52], the increase in MC-LA-dominated blooms may very well continue. However, MC-LA dominance was also observed in a small Ontario lake with no flushing (i.e., a closed-basin lake with no inflow or outflow [33]). Thus, the cumulative effect of environmental stressors (wind, precipitation, and nutrients) may vary across the landscape and interact with other factors such as lake morphometry. Furthermore, threshold wind speeds for complete mixing of the water column are likely to vary among water bodies and regions. In some lakes, winds greater than 29 km/hr were required to favour non-buoyant species [53,54], whereas other systems required stronger winds (>72 km/hr, [41]).

The likely reason for the observed relationship between environmental factors and MC congener dominance is that different congeners are produced by different cyanobacterial species or strains, which themselves vary along their ecological niche [55]. MC congener composition has been shown to vary between and within species (i.e., among strains), and to a lesser extent within strains. For example, within-strain variation in congener composition has been observed in response to changes in temperature [56], light availability [57,58] and nitrogen concentrations [59,60]. Different species may also produce many MC congeners concurrently. For instance, *Planktothrix* blooms have been dominated by both MC-LA (North American lakes [61]) and the -RR variant (European lakes [62–65]). Similarly, *Microcystis aeruginosa* has been associated with MC-LA dominance in some lakes (North America [33]) but -LR dominance in others (European lakes [66]).

A caveat of this regional study is the limitation of raw water intake data in terms of extrapolating to surface water conditions. There may be times of the year (e.g., during storm events) when toxins measured at intake sites are partially or wholly derived from resuspended sediments, and may be less representative of surface water conditions. This is especially an issue given that MC-LA is more resilient to degradation and has lower sediment adsorption than MC-LR [32,67]. The differential resuspension and removal efficiency of MC congeners may thus vary with the depth of intake (greater exchange between surface and bottom waters in shallow sites). For the sites examined here, however, we failed to detect any significant effect of depth of intake on MC congener composition, and all intake sites were relatively shallow (<12 m). Instead, we found that the distance from shore was more important (Figure 2), which may be related to the difference in water currents.

Chlorine treatment and dreissenid mussel occurrence could further decouple MC congener composition between surface and intake waters. Indeed, many water treatment plants chlorinate the raw water to serve as a chemical barrier to prevent dreissenid mussel veligers from clogging intake pipes as well as to prevent them from moving into the water treatment plant infrastructure. In this regional analysis, we detected a weak effect of chlorination on MC congener composition, which may be driven by the secondary effects of zebra mussel presence. There are reports, for instance in the US Midwest, of higher MC levels in dreissenid-infested water bodies compared to those without mussels [47]. This, in turn, may be linked to the changes in nutrient ratios and concentrations created by the mussels towards N:P ratios more suitable for toxic cyanobacteria growth [47,68–71]. Interestingly, distance from shore and chlorination were not selected by the MRT analysis. Wind speed and air temperature (>12 °C) were selected instead, though many water treatment plants only chlorinate when water temperatures exceed 12 °C (i.e., during increased zebra mussel reproduction). Furthermore, although the effect of distance from shore and chlorine treatment on MC congener composition was less clear (Figure S8), the concentration of total MC within the four most frequently monitored water bodies tended to decrease with depth and distance from shoreline (notably so among the intake sites along Lakes Ontario and Erie (Figure S9a,b)) and increase with chlorination (or presence of zebra mussels; Figure S5d). From a water treatment perspective, we suggest that more work is needed to determine the effect of chlorination and intake location on total MC concentration and on the relative composition of MC congeners, which may guide treatment optimization (such as using additional treatment methods at MC-LA dominated sites).

#### *3.2. Global Relationships between Congener Occurrence and Environmental Conditions*

Our meta-analysis of the peer-reviewed literature on MC congener dynamics identified significant differences in dominance patterns among continents. In general, the relative proportion of MC-LA was low for the lakes sampled in Africa, Europe, and Asia, but notably greater in lakes sampled in North and South America (Figure 6a). In contrast, MC-LR was more uniformly observed across sample sites (Figure 6b). In light of the relationships identified at the regional scale (Great Lakes and the surrounding region), we explored whether the difference in MC-LA occurrence among continents and countries was due in part to differences in environmental conditions among locations. We noted that climate (wind speed) was the most important driver of MC congener composition in the Great Lakes region, and concordantly, MC-LA was more common in lakes with smaller surface area (i.e., North and South American lakes (Figure 6a, Figure S7)), which may modulate climate signals such as wind exposure [72,73]. We also noted that the effect of nutrients on regional congener composition was weaker, but still an important driver of MC-LR, with the latter more common at higher nutrient concentrations (Figure 5a). Similarly, on the global scale, we found that MC-LR was more common in eutrophic lakes (Figure S7). In general, despite the lack of a standardised protocol among the studies synthesized in our meta-analysis, the global-scale relationships between lake morphometry, nutrient concentrations, and MC congener composition in the water column echoed the regional patterns observed in the Great Lakes Basin raw water intake data.
