1. Introduction
Threats to the ecosystem health of estuaries, one of the most altered and at-risk aquatic environments [
1], are predicted to worsen if development along estuaries and their watersheds continues unabated [
2]. Thus, monitoring and assessment of the biota (including fishes and crustaceans) in these ecologically and economically important ecosystems are urgent and crucial [
3,
4]. However, monitoring estuaries is time-consuming and costly because their high spatial and temporal variability requires greater sampling effort than many other aquatic ecosystems [
5,
6].
Long-term monitoring programs provide the critical historical data required to define the natural variability of an estuary and capture change over time [
7,
8,
9,
10]. Both abiotic and biotic indicators are recommended to adequately monitor estuaries [
11]. Yet, time and cost constraints can lead programs to sample only abiotic indicators [
4], and estuary biological monitoring programs are constantly under threat of being canceled [
12].
Resource managers tasked with designing a long-term environmental monitoring program have often found success in implementing a community-based monitoring program (CBMP). CBMPs come in various forms, including citizen science initiatives in which local community members volunteer their time to assist in the collection of environmental monitoring data [
13]. Designing a monitoring program that can be executed by local volunteers can both reduce costs associated with monitoring programs and engage local community members [
14,
15]. Enlisting the help of volunteers also enables researchers to collect more data on a larger geographical scale and over a longer time period than would otherwise be possible [
14,
16,
17,
18,
19]. CBMPs continue to gain recognition for their potential to fill data gaps, inform decision-makers, and educate communities [
20]. As such, these programs are becoming increasingly popular among government and non-profit agencies [
21,
22] with millions of volunteers participating in CBMPs worldwide [
18]. A downside of such programs is that professional scientists and decision-makers have expressed concerns regarding the quality of data collected by community members [
15,
20,
23]. These concerns have limited the incorporation of CBMP data into the scientific literature [
18,
22] and their use to support decision-making. However, previous studies have detected no significant differences between the environmental data collected by community members and those collected by professional scientists (e.g., Fore et al. [
24]; Thériault et al. [
25]; Danielsen et al. [
26]; van der Velde et al. [
27]).
The Community Aquatic Monitoring Program (CAMP) is a long-term CBMP that monitors estuaries in the southern Gulf of St. Lawrence, Canada. Implemented in 2003, CAMP continues to be administered by Fisheries and Oceans Canada (DFO) in collaboration with the Southern Gulf of St. Lawrence Coalition on Sustainability (Coalition-SGSL). DFO and Coalition-SGSL personnel work alongside volunteers from watershed groups, First Nation groups, and maritime universities to collect annual data [
28]. Data include littoral nekton (i.e., fish, shrimp, and crabs) counts, estimates of aquatic vegetation cover, water quality measurements, and sediment characteristics [
29]. The initial objective of CAMP was to provide an avenue for community outreach and interaction with Environmental Non-Government Organizations (ENGOs), and to raise awareness of estuarine ecology [
28]. A current goal for the CAMP dataset is to determine if it can be used to assess the relationship between the health of an estuary and the diversity and abundance of nekton within it [
28,
29].
The objective of the present study was to determine whether the CAMP sampling design, which was shaped to accommodate volunteer participation, provides information similar to what a more rigorous sampling approach would generate. Specifically, even though various sampling station selection criteria were established in the original CAMP design, most station locations were selected primarily to allow volunteers to easily access stations from the road [
28]. In heterogeneous habitats, such as estuaries, a stratified random sample is recommended where the total area is divided into equal plots and a similar number of units is selected randomly from each plot [
30]. Currently, six stations are sampled in each estuary, regardless of estuary size, because that is the number of stations that volunteers were assumed to be capable of sampling in one day [
28]. Two hypotheses were tested: (1) Nekton assemblages observed at CAMP stations would be similar to those at stations located using a stratified random sampling design (SRD) and (2) Sampling a greater number of stations in an estuary would not substantively alter nekton community assemblage estimates.
2. Materials and Methods
2.1. Description of Estuaries Sampled in This Study
Ten estuaries, six in the province of New Brunswick and four in the province of Prince Edward Island, Canada, were sampled in this study (
Figure 1,
Table 1). The selected estuaries encompassed the range of estuary sizes sampled under CAMP to prevent estuary size influencing the results. Likewise, estuaries were selected to include some with CAMP sampling stations clustered in the lower estuary (i.e., closest to the marine environment), and some with CAMP stations spread throughout the estuary (
Figures S1–S10). The initial CAMP sampling design suggested sampling stations should cover as much of the estuary as possible with stations located in the upper, middle, and lower estuary, and located equally on either side of the estuary [
31]. Yet, the consideration of road access for volunteers resulted in many stations clustered in the lower estuary. The lower estuary tends to have more public road access for harbour activities, while private land, inaccessible to volunteers, is more common in the upper estuary.
2.2. Stratified Random Sampling Design (SRD)
Twelve sampling stations were designated within each estuary. Six stations were the established CAMP station locations, and six stations were randomly located and stratified among the upper, middle, and lower estuary. Estuary sampling maps were created using ArcGIS (
Figures S1–S10). The lower extent of the estuary was marked at the mouth of the estuary and where fresh and saline water are fully mixed (i.e., where the coastline opens to a bay or open ocean) or to the lowest CAMP station when sampling extended into the bay. For the purpose of this study, the upper extent of the estuary was marked where (when information was available) the salinity is known to be 10, or where the estuary narrows to a stream channel. A minimum salinity of 10 was selected as the upper estuary benchmark, as that is the lowest average salinity that CAMP samples. Both shorelines were divided into three equal length sections and overlaid by a grid of 50 m² squares. Numbers were assigned to each grid square. One station location was randomly assigned to each section using a random number generator. Once a number was randomly selected, the aerial imagery beneath the corresponding grid square was inspected to ensure there were no obvious impediments to seining (e.g., piers, docks). If an obstruction was observed, then a new station location was assigned using the random number generator.
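The station-selection procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the ArcGIS workflow used in the study: the section labels, grid-square numbers, and the `is_obstructed` check (a stand-in for visual inspection of aerial imagery) are all hypothetical. Shuffling the candidate squares and taking the first unobstructed one is equivalent to redrawing a random number whenever an obstruction is found.

```python
import random

def pick_stations(sections, is_obstructed, rng=None):
    """Pick one grid square per shoreline section, redrawing when obstructed.

    sections: dict mapping a section label to the grid-square numbers it contains.
    is_obstructed: callable returning True when a square has a pier, dock, etc.
                   (hypothetical stand-in for inspecting aerial imagery).
    """
    rng = rng or random.Random()
    chosen = {}
    for label, squares in sections.items():
        candidates = list(squares)
        rng.shuffle(candidates)  # random draw order, without replacement
        for square in candidates:
            if not is_obstructed(square):
                chosen[label] = square
                break
        else:
            raise ValueError(f"no unobstructed square in section {label!r}")
    return chosen

# Two shorelines x three equal-length sections = six SRD stations.
# The square numbers and the obstructed set below are made up for illustration.
sections = {
    ("north", "upper"): range(1, 41),
    ("north", "middle"): range(41, 81),
    ("north", "lower"): range(81, 121),
    ("south", "upper"): range(121, 161),
    ("south", "middle"): range(161, 201),
    ("south", "lower"): range(201, 241),
}
blocked = {3, 57, 130, 222}
stations = pick_stations(sections, lambda sq: sq in blocked, random.Random(2016))
```

Passing a seeded `random.Random` makes the draw reproducible, which is useful when the station map has to be regenerated later.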
2.3. Field Data Collection
All estuary sampling was completed by the same core team and was supplementary to the regular annual CAMP sampling. We did not use data collected by volunteers to ensure any detected differences were due to the sampling design and not a difference in sampling teams. Estuaries were sampled once in July or August 2016. The environmental data collected at each station are summarized in
Table 2. SRD stations were accessed using a 19-foot Carolina Skiff at New Brunswick estuaries and a 17-foot Carolina Skiff at Prince Edward Island estuaries. One CAMP station at Souris was not sampled because members of the public were swimming at the station during the sampling time. One Summerside SRD station could not be sampled due to unsafe weather conditions.
Figures S1–S10 display estuary maps with finalized station locations.
At each station, nekton and water quality parameters were collected using the CAMP methods outlined by Weldon et al. [
31]. Nekton species were captured using a 30 m by 2 m beach seine with a mesh size of 6 mm and central bag measuring 2 m by 1 m, which samples a standardized area of 225 m² at each station. All captured nekton were placed in a live-box with water exchange, identified, classified as either young-of-the-year (YOY) or adult, enumerated, and then released. All fish were handled in accordance with the approved University of Waterloo animal care protocol (AUPP #14–15). All fish collection activities complied with DFO Gulf Region License to Fish for Scientific Purposes, License No. SG-RHQ-16-016C. The following species counts were pooled together for data analysis due to the difficulty of field identification: alewife (
Alosa pseudoharengus) young-of-the-year (YOY) and blueback herring (
Alosa aestivalis) YOY counts were pooled as Gaspereau YOY; blackspotted stickleback (
Gasterosteus wheatlandi) YOY and threespine stickleback (
Gasterosteus aculeatus) YOY counts were pooled as Gasterosteus YOY; mummichog (
Fundulus heteroclitus) YOY and banded killifish (
Fundulus diaphanus) YOY counts were pooled as Fundulus YOY; and winter flounder (
Pseudopleuronectes americanus) YOY and smooth flounder (
Liopsetta putnami) YOY counts were pooled as flounder YOY.
Water quality data, including temperature, dissolved oxygen (DO) (mg/L), and salinity (PSU), were collected using a handheld YSI Professional Plus model at New Brunswick estuaries and a YSI model 6600 M at Prince Edward Island estuaries. Water quality was measured from the middle of the water column within the seined area. The tide height (m) above chart datum for each station at the time of sampling was documented using the tide tables available on the DFO website [
33].
2.4. Data Analysis and Statistics
Similarities between pairs of samples (i.e., sampling stations [six stations per estuary]) were defined with a similarity matrix generated using the Bray–Curtis similarity coefficient. A square-root transformation was applied to the nekton data to reduce the dominance of the highly abundant species prior to analysis while not overemphasizing rare species. The resulting Bray–Curtis similarity matrix was the basis for all multivariate analyses.
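As a minimal sketch of the similarity calculation described above, the following Python function computes the Bray–Curtis similarity between two stations after square-root transforming the counts. The study used PRIMER for this step; the function name, the 0 to 100 scaling, and the example counts are assumptions made purely for illustration.

```python
import math

def bray_curtis_similarity(counts_a, counts_b):
    """Bray-Curtis similarity (0-100) on square-root-transformed species counts."""
    a = [math.sqrt(x) for x in counts_a]
    b = [math.sqrt(x) for x in counts_b]
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty; similarity is undefined")
    # Sum of absolute differences across species, scaled by total abundance.
    diff = sum(abs(x - y) for x, y in zip(a, b))
    return 100.0 * (1.0 - diff / total)

# Made-up counts for four species at two stations.
station_1 = [400, 25, 0, 9]
station_2 = [100, 25, 4, 0]
similarity = bray_curtis_similarity(station_1, station_2)
```

In this example the transformed counts are [20, 5, 0, 3] and [10, 5, 2, 0], so the fourfold difference in the most abundant species contributes a difference of 10 rather than 300, which is how the transformation reduces the dominance of highly abundant species.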
The nekton assemblages collected from the SRD and CAMP stations were compared to assess if sampling nekton in these estuaries at different sampling stations would result in a different nekton community. A hierarchical cluster analysis using a group average linkage was performed. A similarity profile (SIMPROF) test was applied to determine which groups created by the cluster analysis were significantly different. The SIMPROF significance level was set at 5% with 9999 permutations. The differences between the nekton assemblages collected were further visualized using non-metric Multi-Dimensional Scaling (nMDS) ordination.
A two-way, permutational MANOVA (PERMANOVA) was used to formally test the effects on nekton community assemblage of CAMP versus SRD “sampling design”, “Estuary”, and their interaction. A Type III sums of squares was used, because it is the most conservative approach to partitioning variability, which is appropriate for unbalanced designs [
34].
p-values were obtained by applying 9999 permutations of residuals under a reduced model, because it yields the best power and most accurate type I error [
34].
The distance-based linear models (DISTLM) routine was used to identify which (if any) environmental variables had the greatest influence on the nekton assemblage data, and thus whether environmental factors contributed to any significant differences detected between sampling designs. The DISTLM routine determines the combination of environmental variables that best describe the multivariate data cloud produced by the Bray–Curtis similarity matrix on the nekton abundance data [
34].
p-values were generated using 9999 permutations, to test the null hypothesis of no relationship between the environment data and nekton abundance data [
34]. The test yields an R² value that is the estimate of the amount of variation in the nekton assemblage explained by the variation in environmental variables [
34]. The Best selection procedure and AICc selection criteria were used.
The CAMP and SRD datasets were merged to create a combined dataset to understand how the characterization of the nekton assemblages may change with additional station data. It is acknowledged that the combined dataset is not an ideal tool for evaluating CAMP station numbers, because the data were collected at different locations selected using two different approaches. Ideally, the 12 sampling stations would have been located using the same sampling design (e.g., 12 SRD stations within each estuary). However, employing the SRD stations as theoretical “new” CAMP stations is a pragmatic method to predict the potential for increased station data to alter conclusions based on nekton assemblages.
Species accumulation plots were generated for each estuary using the combined dataset (n = 12, except for Souris and Summerside with n = 11) to determine the typical number of stations required within each estuary to capture all potential nekton species. An ideal number of stations would be the number after which no new species are gained. The combined data were permuted 999 times for each estuary.
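A permutation-based species accumulation curve of the kind described above might be computed as follows. This is a hedged sketch rather than the PRIMER routine: the function name, the seed, and the example species sets are hypothetical. Each permutation shuffles the station order and records the cumulative number of species seen; the curve is the average over permutations.

```python
import random

def species_accumulation(station_species, n_perm=999, seed=0):
    """Mean cumulative species count after 1..k stations over random station orders."""
    rng = random.Random(seed)
    k = len(station_species)
    totals = [0] * k
    for _ in range(n_perm):
        order = list(station_species)
        rng.shuffle(order)  # random order in which stations are "added"
        seen = set()
        for i, species in enumerate(order):
            seen |= set(species)
            totals[i] += len(seen)
    return [t / n_perm for t in totals]

# Made-up species lists for four stations in one estuary.
stations_in_estuary = [
    {"mummichog", "sand shrimp"},
    {"mummichog"},
    {"sand shrimp", "Atlantic silverside"},
    {"fourspine stickleback"},
]
curve = species_accumulation(stations_in_estuary)
```

The station count at which the curve stops rising is the point after which, on average, no new species are gained.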
The combined dataset was also used to predict the potential for additional stations to alter conclusions of dissimilarities among estuaries (i.e., would additional stations change which estuaries are considered significantly similar/different?). An nMDS plot was generated using the combined dataset to visualize how the pattern of estuaries produced by their relative dissimilarities may change with additional station data. A one-way PERMANOVA was performed to assess if an increase in station numbers would result in different conclusions regarding estuary dissimilarities.
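For readers unfamiliar with the test, a one-way PERMANOVA can be sketched as below. This is a simplified illustration, assuming the usual pseudo-F statistic computed from squared pairwise dissimilarities and unrestricted permutation of group labels; it is not the PRIMER implementation, and the function names and toy data are hypothetical.

```python
import random
from itertools import combinations

def pseudo_f(dist, groups):
    """Pseudo-F from a square dissimilarity matrix and one group label per sample."""
    n = len(groups)
    labels = sorted(set(groups))
    # Total sum of squares from all pairwise squared dissimilarities.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in labels:
        members = [i for i in range(n) if groups[i] == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    a = len(labels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))

def permanova(dist, groups, n_perm=9999, seed=0):
    """Return (pseudo-F, permutation p-value) for a one-way design."""
    rng = random.Random(seed)
    f_obs = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign group labels at random
        if pseudo_f(dist, shuffled) >= f_obs:
            hits += 1
    return f_obs, (hits + 1) / (n_perm + 1)

# Toy data: six samples on a line, two well-separated groups (illustration only).
values = [0, 1, 2, 10, 11, 12]
dist = [[abs(x - y) for y in values] for x in values]
groups = ["A", "A", "A", "B", "B", "B"]
f_stat, p_value = permanova(dist, groups, n_perm=999)
```

In practice `dist` would hold Bray–Curtis dissimilarities among stations and `groups` the estuary of each station.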
The ideal sample size for CAMP is one that yields sufficient precision to detect the typical differences in mean species abundance estimates among estuaries. The method for calculating measurement error introduced by Bailey and Byrnes [
35] was used to define the precision of the CAMP stations and predict the precision that could be gained by sampling up to six additional stations. This analysis was completed using the data from the CAMP dataset. As a univariate method, the analysis focused on counts of the individual species that were determined to have the greatest influence on estuary dissimilarities. Influential species were identified by using the similarity percentages routine (SIMPER) in PRIMER. For each influential species, one-way, Model II ANOVAs were used to partition the total variance in counts of each species into among and within estuary components, as described by Bailey and Byrnes [
35]. The within-estuary mean square ($MS_{within}$) is an estimate of the variance among stations within an estuary ($\sigma^2_{within}$). The among-group mean square ($MS_{among}$) includes both among-estuary and within-estuary variability, so the among-estuary variance ($\sigma^2_{among}$) is calculated as follows:
$$\sigma^2_{among} = \frac{MS_{among} - MS_{within}}{m}$$
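The variance of the mean ($\sigma^2_{\bar{Y}}$) was calculated using the within-estuary ($\sigma^2_{within}$) and among-estuary ($\sigma^2_{among}$) components of variance, where n is the number of estuaries sampled and m is the number of stations sampled per estuary:
$$\sigma^2_{\bar{Y}} = \frac{\sigma^2_{among}}{n} + \frac{\sigma^2_{within}}{n m}$$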
Values of m were then substituted with values of 7 through 12 to measure the reduction in the variance of the mean that could be obtained by sampling up to 12 stations at each estuary. Subsequently, 95% confidence intervals on the mean estimates of each influential species were calculated by multiplying the square-root of the variance of the mean by its corresponding t-value. The confidence intervals for station numbers 6 to 12 were used to understand the precision that could be gained with greater station numbers.
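The substitution of m = 6 through 12 can be illustrated with a short Python sketch. It assumes the standard variance-component formulas for a balanced one-way random-effects (Model II) design, which may differ in detail from the exact expressions used in the study. The mean squares, the number of estuaries, and the t-value below are made-up placeholders, not results from the CAMP dataset.

```python
import math

def variance_components(ms_among, ms_within, m):
    """Split mean squares into among- and within-estuary variance components.

    Assumes a balanced design with m stations per estuary; negative
    among-estuary estimates are truncated to zero.
    """
    var_within = ms_within
    var_among = max((ms_among - ms_within) / m, 0.0)
    return var_among, var_within

def variance_of_mean(var_among, var_within, n, m):
    """Variance of the mean for n estuaries with m stations each."""
    return var_among / n + var_within / (n * m)

def ci_half_width(var_among, var_within, n, m, t_value):
    """Half-width of the confidence interval on the mean."""
    return t_value * math.sqrt(variance_of_mean(var_among, var_within, n, m))

# Placeholder inputs (assumptions for illustration only).
ms_among, ms_within = 130.0, 10.0
n_estuaries, t_value = 10, 2.262
var_among, var_within = variance_components(ms_among, ms_within, m=6)
half_widths = {
    m: ci_half_width(var_among, var_within, n_estuaries, m, t_value)
    for m in range(6, 13)
}
```

In this made-up example the among-estuary component dominates, so doubling the number of stations from 6 to 12 narrows the interval only slightly; the gain from extra stations depends on how large the within-estuary component is relative to the among-estuary component.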
PRIMER 7 with the PERMANOVA add-on package was used to complete multivariate analyses to test for differences (α = 0.05) between the CAMP and the SRD data. The stations were treated as replicates within each estuary. All univariate analyses were completed using RStudio version 0.99.489.
4. Discussion
Previous studies have assessed the scientific utility of community-based monitoring programs (CBMPs) through comparisons of the data collected by community members and professional scientists (e.g., Fore et al. [
24]; Thériault et al. [
25]; Danielsen et al. [
26]; van der Velde et al. [
27]). This study took a different approach by testing if the CBMP data are biased due to the sampling design facilitating volunteer participation, rather than testing volunteer competency. Thériault et al. [
25] previously tested the accuracy of the CAMP volunteers’ nekton identification skills and concluded they were comparable to DFO scientists (i.e., <10% disagreement in abundance counts).
The first objective of this study was to test the hypothesis that nekton assemblage data collected from the CAMP stations would not be significantly different than data collected from stations located through a stratified random design. The nekton assemblages collected with the two sampling designs were generally not significantly different. In addition, when observed individually, both study designs yielded the same result that all estuaries differ significantly in nekton structure. The exceptions were the Cocagne and Shediac estuaries, where significant differences were detected between the sampling designs. However, Shediac’s nekton assemblages differed from all estuaries, regardless of sampling design. Cocagne was not significantly different from Richibucto when considering the nekton data collected from the SRD stations, but was significantly different from all estuaries when considering the data collected from the CAMP stations. The potential causes for the differences detected between the sampling designs at Cocagne and Shediac were explored.
Of the environmental variables measured, tide height had the strongest influence on the differences in Cocagne nekton between the sampling designs. The average difference in tide height between the CAMP and SRD stations during sampling in Cocagne was 0.6 m. Conversely, the next largest difference in tide height between sampling designs within an estuary was 0.1 m. Since tides are known to influence the distribution of nekton within an estuary [
37,
38,
39,
40], studies typically standardize their sampling to a certain tide height (e.g., Schein et al. [
41]; Ellis and Bell [
42]; Gerwing et al. [
43]). This study followed the CAMP protocol of beginning sampling at 8:00 AM every day, and sampled the different sampling designs only one day apart to maintain similar environmental conditions. However, Cocagne CAMP station sampling began 1.75 h before the SRD sampling and finished 4.00 h before the SRD sampling. The difference in time was a consequence of the late start of the SRD sampling and longer sampling time due to shallow waters inhibiting boat access to the shoreline. Therefore, the difference in sampling timing (resulting in a difference in tide height) is likely the cause of the significant difference between the Cocagne datasets, rather than the difference in station locations. These results provide further support for estuary monitoring programs to standardize tide height among sampling locations and sampling dates. Further research is required to test the maximum difference in tide height that will not introduce significant variability.
Salinity had the strongest influence on the differences in Shediac nekton between the sampling designs. Salinity is an influential factor contributing to nekton variability within estuaries [
10,
44,
45,
46,
47]. Shediac CAMP stations are clustered in the bay, resulting in those stations experiencing higher salinity concentrations than the majority of the SRD stations that were located further up the estuary (
Figure S6). Yet, a greater difference in salinity was measured between sampling designs at the Bouctouche estuary, and CAMP stations are also clustered in the lower estuary at Brudenell and Scoudouc. Thus, there may have been other environmental variables not measured that had a greater influence on the differences between the sampling designs.
Overall, regardless of sampling design, the assessment of the nekton assemblage at each of these estuaries would not change. Therefore, if assessments of these estuaries were to be made using nekton assemblages as an indicator of estuary condition, the recommendations informed by the assessment would not change. These findings are evidence that a lack of station stratification and randomization does not limit the utility of CAMP for decision-makers.
The remaining question concerning the scientific utility of CAMP was whether the station number (i.e., six stations per estuary) is appropriate. When initiating a monitoring program, it is best to oversample initially to identify the sufficient number of samples. However, budget and time constraints often dictate sample numbers. Six stations were originally proposed for CAMP estuaries, because that was the number of stations that volunteers were predicted to be able to sample within one day [
28], and because six was considered the minimum needed to achieve sufficient power with multivariate analyses. Accordingly, this study’s second objective was to test the hypothesis that sampling an additional six stations would not significantly change the dissimilarities between estuaries or substantively improve the precision of nekton abundance estimates.
The results from analysis of the combined dataset (i.e., CAMP plus SRD station data) demonstrated that collecting data from an additional six stations would not alter conclusions based on nekton assemblages. While the six CAMP and six SRD stations generally had similar species richness values, the species accumulation plots showed that, on average, the number of species continued to increase until ten stations. Yet, the one-way PERMANOVA results for the combined dataset suggested that collecting data from an additional six stations would not alter the conclusion that all estuaries have dissimilar nekton assemblages. These results indicate that the additional species gained past six stations do not significantly influence the dissimilarity calculation for the multivariate analyses.
It was important to understand how confident one can be in the result that all estuaries are significantly different based on the nekton assemblage data from six stations. The nekton assemblage data are composed of abundance estimates for individual species. The confidence in the accuracy of these abundance estimates is based on the number of samples and variability of the abundance data. Although accuracy cannot be determined because we have no independent estimate of species number, the precision of these estimates can be calculated to understand the likely range of values within which the mean truly falls. The precision of estimates will govern whether one can be confident in the conclusions and understand the risk of a Type I error (i.e., falsely concluding there are significant differences between the means). Increasing sample size is one way to increase sampling precision to detect biological differences between estuaries [
48]. Therefore, the ideal sample size for CAMP is one that yields sufficient precision to detect the typical differences in mean species abundance estimates that are considered biologically meaningful.
The SIMPER analysis revealed that the differences in nekton assemblages among estuaries were governed by four influential species (adult mummichog, fourspine stickleback, and sand shrimp, and young-of-the-year Atlantic silverside). Abundance estimates from the current six CAMP stations only have sufficient precision to be confident in the largest differences in species abundances that differentiated estuaries. However, even six additional stations would mostly still be insufficient for the numerous smaller differences recorded in mean species abundances. Abundance estimates of all four influential species can range in the hundreds within estuaries. This high within- and among-estuary variability makes it difficult to substantially increase the precision of abundance estimates by adding stations. Therefore, the results do not provide compelling evidence to suggest more stations would substantively increase the precision of nekton community descriptions enough to warrant additional sampling effort.
Restricting the number of CAMP stations in consideration of volunteer participation does not appear to have substantively limited the precision of nekton abundance estimates. Rather, the precision of nekton abundance estimates is limited by the naturally high variability in nekton abundances, which has been identified by other studies as a limiting factor in using nekton as an indicator of estuary health [
42,
49,
50,
51].
Future analysis and interpretation of the CAMP dataset should evaluate the precision of the influential species estimates to understand if there can be a reasonable level of confidence in the conclusions that nekton assemblages are dissimilar. This method will help to differentiate between the inherent variability in nekton assemblages within and among estuaries and those differences that are large enough to signify a biologically meaningful change. These larger differences in nekton abundances should then be investigated to understand if they indicate healthy or degraded conditions relative to the other estuaries or the same estuary over time. Conclusions of either healthy or degraded conditions have implications for management decisions to restore habitat or protect what has yet to be impacted.