2.1. Instrumentation and Design of the PHySS
The Programmable Hyperspectral Seawater Scanner (PHySS) embodies a considerable leap forward in the domain of marine environmental surveillance, tracing its origins to the innovative Optical Phytoplankton Discriminator [9]. This submerged device exemplifies the ongoing technological evolution within aquatic research. It is equipped with two bi-directional ports, engineered to draw seawater samples concurrently into its separate first and second capillary cells. This dual-port design facilitates the swift and efficient examination of marine settings, offering critical insights into the prevalence of specific phytoplankton species, particularly those belonging to the genus Karenia.
At the core of the PHySS’s operational principle lies its reliance on the absorbance spectra derived from the seawater samples. This approach is protected under United States Patent No. US 7,236,248 B2, issued in 2007 [9], underscoring the device’s proficiency in distinguishing and pinpointing phytoplankton varieties, especially the Karenia species. These species are of significant ecological interest due to their role in occasional HABs. The functionality of the PHySS is significantly enhanced by an intricate processing algorithm devised by researchers at Mote Marine Laboratory. As Hails et al. elaborated in 2009 [10], this algorithm scrutinizes the fourth derivative of the particulate absorbance spectrum from each sample. This analysis allows for comparison against a reference spectrum of Karenia brevis, facilitating the accurate identification and categorization of this particular phytoplankton.
Ultimately, the PHySS is a fusion of innovative hardware capabilities and sophisticated analytical methodologies. This combination furnishes marine scientists with an exceptionally potent instrument for monitoring and deciphering phytoplankton dynamics within oceanic ecosystems. The amalgamation of proprietary technology and forefront algorithms elevates the PHySS to a crucial role in the concerted efforts to demystify oceanic phenomena and protect marine habitats against emerging ecological challenges. Through its precise detection and analysis capabilities, the PHySS stands as an indispensable asset in the arsenal against HABs, offering a nuanced understanding of marine biodiversity and ecosystem health.
2.2. Data Collection
The research entailed a methodical approach to continuous seawater sampling, carried out at the designated New Pass site, as illustrated in Figure 1, employing the advanced capabilities of the PHySS. Firstly, 250 mL samples were immediately fixed using Lugol’s iodine solution to preserve the cells for later analysis. The concentration of the samples was achieved by decanting the grab samples into 20 mL scintillation vials; then, 1 mL aliquots were transferred to a 24-well plate. The phytoplankton cells were then enumerated using an inverted microscope, following [11], for accurate cell counting. The PHySS captured the fourth-derivative absorption spectra of seawater samples, spanning a spectral range from 400 to 700 nanometers (nm). This specific range was chosen to effectively encompass the light absorption characteristics of various phytoplankton species, including those known to induce HABs, such as Karenia brevis.
The PHySS’s operation was programmed to execute spectral recordings at predetermined intervals every two hours throughout a comprehensive 30-day sampling period. This systematic and uninterrupted data acquisition strategy was designed to gather an extensive dataset, ensuring a robust foundation for subsequent analysis. The temporal resolution of the sampling, which was maintained consistently throughout this study, was critical in capturing the dynamic nature of the marine environment under observation. Such a dataset provides snapshots of the spectral characteristics at various points in time and allows for the identification of temporal trends and patterns in the presence and concentration of target phytoplankton species. The timeframe of January 2023 was chosen because a small bloom event was occurring, providing an ideal opportunity to compare the PHySS data to physical cell counts. This period allowed us to observe the cell counts at zero, track the bloom’s progression, and see the cell counts return to zero.
To compare the spectral data with physical cell counts, which were conducted daily, we selected the sensor data point that most closely aligned with the time of the physical sampling. Physical cell count samples were taken as surface grab samples and then enumerated by a technician in the lab. This approach ensures that the data from the PHySS and the physical cell counts are comparable, allowing for accurate correlation analysis.
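The matching step described above, pairing each daily grab sample with the sensor reading closest to it in time, can be sketched as follows. This is a minimal illustration rather than the study's actual workflow; the timestamps, SI values, and the helper name `nearest_reading` are hypothetical.

```python
from datetime import datetime, timedelta


def nearest_reading(sample_time, readings):
    """Return the (timestamp, value) reading closest in time to sample_time."""
    return min(readings, key=lambda r: abs(r[0] - sample_time))


# Hypothetical two-hourly sensor log: (timestamp, similarity index)
start = datetime(2023, 1, 10, 0, 0)
sensor_log = [(start + timedelta(hours=2 * k), 0.10 + 0.01 * k) for k in range(12)]

# Hypothetical grab sample taken at 09:20 on the same day
grab_time = datetime(2023, 1, 10, 9, 20)
matched_time, matched_si = nearest_reading(grab_time, sensor_log)
print(matched_time, round(matched_si, 2))
```

With a two-hour recording interval, the matched reading is never more than one hour away from the grab sample.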
The 30-day sampling period, as used in this study, reflects the normal operational cycle of the PHySS. This cycle is designed to balance the need for continuous, long-term monitoring with the practical requirements of device maintenance and servicing. While some studies may benefit from shorter sampling intervals, the 30-day period used here is sufficient to capture the necessary data on HAB dynamics, including the onset and progression of algal blooms. The extended duration provides a comprehensive dataset that enhances our understanding of temporal patterns in Karenia brevis’s presence and abundance.
The strategic deployment of the PHySS and its sampling regimen facilitated the collection of high-quality, detailed spectral data. These data can advance our understanding of the spectral signatures associated with different phytoplankton populations and their fluctuations over time. By harnessing the power of hyperspectral imaging and analysis, this study aimed to shed light on the complex interactions within marine ecosystems and the factors contributing to the emergence and development of HABs. We employed the fourth derivative of the absorption spectra to differentiate phytoplankton species. This method enhances the subtle spectral features characteristic of specific phytoplankton, allowing for more precise identification.
Figure 2 and Figure 3 show the detailed spectral characteristics of representative phytoplankton models, including Karenia brevis.
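As an illustration of the fourth-derivative step, the sketch below estimates the fourth derivative of an evenly sampled absorbance spectrum with a five-point central finite difference. It is an assumption-laden example, not the PHySS algorithm: the function name and the synthetic test curve are hypothetical, and practical pipelines usually smooth the spectrum first because differentiation amplifies noise.

```python
def fourth_derivative(absorbance, step_nm):
    """Central finite-difference estimate of the fourth derivative.

    Assumes evenly spaced wavelengths. The two points at each end of the
    spectrum are dropped because the five-point stencil needs two
    neighbours on both sides.
    """
    a = absorbance
    h4 = step_nm ** 4
    return [
        (a[i - 2] - 4 * a[i - 1] + 6 * a[i] - 4 * a[i + 1] + a[i + 2]) / h4
        for i in range(2, len(a) - 2)
    ]


# Synthetic check: a quartic curve w**4 has a constant fourth derivative of 24.
wavelengths = list(range(10))
quartic = [w ** 4 for w in wavelengths]
d4 = fourth_derivative(quartic, step_nm=1)
print(d4)
```

Narrow absorption shoulders that are barely visible in the raw spectrum appear as distinct peaks and troughs after this operation, which is why the derivative spectrum is the one compared against the reference.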
2.3. Data Processing
In the critical data processing phase, our methodology incorporated an extensive normalization procedure to refine the collected spectral data, a foundational aspect to enhance the analytical precision and reliability. The primary goal of this normalization process was to address and minimize the impact of extraneous noise and fluctuations commonly present in raw spectral data while simultaneously highlighting subtle but significant deviations indicative of phytoplankton presence, particularly that of Karenia brevis. Normalizing the spectral dataset served as a crucial pre-processing step, ensuring a uniform baseline across all samples. This standardization was instrumental in mitigating the variability inherent in raw data, facilitating a more coherent and comparable analysis across the entire dataset.
The methodology employed in the normalization process involved adjusting the spectral intensity levels of each sample to a common scale. This adjustment was critical for eliminating disparities caused by factors such as varying sample concentrations, light path differences, and instrumental sensitivity variations. By creating a stable and consistent analytical foundation, the normalization process significantly enhanced the dataset’s utility for detailed and rigorous subsequent analysis.
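The text does not state which scaling was applied, so the following is only one plausible reading of "a common scale": each spectrum is divided by its Euclidean length so that overall magnitude (for example, from differing sample concentration) no longer matters and only spectral shape remains. The function name and sample values are hypothetical.

```python
import math


def normalize_spectrum(values):
    """Scale a spectrum to unit Euclidean length (one possible common scale)."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0:
        raise ValueError("cannot normalize an all-zero spectrum")
    return [v / norm for v in values]


# Two hypothetical spectra with the same shape but different overall magnitude
dilute = [0.1, 0.2, 0.4, 0.2]
dense = [0.5, 1.0, 2.0, 1.0]
n1 = normalize_spectrum(dilute)
n2 = normalize_spectrum(dense)
print(n1)
print(n2)
```

After scaling, the two spectra coincide, which is the property a shape-based comparison needs.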
Following normalization, our analysis advanced to the computation of the Fourth-Derivative Spectral Similarity Index (SI) for each processed sample. The SI emerged as a quantitative metric, innovatively designed to evaluate and quantify the degree of similarity between the spectral profile of each sample and the reference spectrum of Karenia brevis, the phytoplankton species of interest, due to its pivotal role in the onset of HABs, including red tide events.
The computation process of the SI leveraged established spectral analysis techniques, integrating sophisticated algorithms and mathematical models tailored to precisely capture and quantify the distinct spectral features characteristic of Karenia brevis. This step was paramount in objectively identifying the presence of this target phytoplankton within the sampled seawater, utilizing the nuanced differences in spectral signatures as a reliable marker for detection.
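The exact SI formula is not given here. One formulation reported in the phytoplankton-discrimination literature treats the two fourth-derivative spectra as vectors and converts the angle between them into an index; the sketch below assumes that form, and whether the PHySS algorithm uses exactly this expression is an assumption. The function name and input values are hypothetical.

```python
import math


def similarity_index(sample_d4, reference_d4):
    """Angle-based similarity between two fourth-derivative spectra.

    Assumed form: SI = 1 - 2 * arccos(cos_theta) / pi, where cos_theta is
    the cosine of the angle between the two spectra treated as vectors.
    Identical shapes give 1, orthogonal shapes give 0.
    """
    dot = sum(s * r for s, r in zip(sample_d4, reference_d4))
    norm_s = math.sqrt(sum(s * s for s in sample_d4))
    norm_r = math.sqrt(sum(r * r for r in reference_d4))
    if norm_s == 0 or norm_r == 0:
        raise ValueError("spectra must not be all zeros")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm_s * norm_r)))
    return 1.0 - 2.0 * math.acos(cos_theta) / math.pi


same_shape = similarity_index([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
unrelated = similarity_index([1.0, 0.0], [0.0, 1.0])
print(same_shape, unrelated)
```

Because the index depends only on the angle between the vectors, it is insensitive to the overall magnitude of either spectrum.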
Integrating the data normalization step with the computation of the SI epitomizes a comprehensive and methodical approach to accurately discerning the presence of harmful algal species in seawater samples. These advanced analytical methodologies not only bolster the integrity and fidelity of the collected dataset but also significantly enhance the overall reliability of our research findings. Such a detailed and systematic analysis is instrumental in advancing our understanding of HAB dynamics, enabling more effective monitoring and predictive capabilities for managing red tide events and other similar ecological phenomena in marine ecosystems. This strategic approach underscores our commitment to leveraging cutting-edge analytical techniques to address the complex challenges posed by HABs, aiming to contribute substantially to the field of marine environmental science.
2.4. Data Correlation Analysis and Its Implications
Our study undertook a Spearman’s rank correlation coefficient analysis to evaluate the relationship between the SI metric and the actual phytoplankton cell counts obtained from our samples. This analysis is crucial for understanding how closely the SI metric, derived from our hyperspectral data, aligns with direct biological measurements of phytoplankton abundance, specifically targeting the Karenia brevis species associated with HABs.
Spearman’s rank correlation analysis was conducted to explore the relationship between physical cell counts of phytoplankton and the PHySS similarity index across our samples. Spearman’s rank correlation was chosen due to its suitability for non-parametric data and its robustness to outliers. A correlation coefficient was derived by converting our datasets into ranks and calculating the differences and squared differences of these ranks. The results showed a Spearman’s ρ of 0.75, indicating a strong positive monotonic relationship. This suggests that higher physical cell counts are associated with greater spectral similarity to the Karenia brevis reference spectrum. These findings provide valuable insights into phytoplankton dynamics and underscore the utility of the PHySS similarity index in ecological studies.
The correlation coefficient, a statistical measure, plays a pivotal role in our analysis by quantifying both the strength and the directionality of the relationship between these two variables. A positive correlation coefficient indicates that as the SI metric increases, indicating a higher spectral similarity to Karenia brevis, there is a corresponding increase in the physical cell counts of this phytoplankton. Conversely, a negative coefficient would suggest an inverse relationship. The coefficient ranges from −1 to 1, and its magnitude indicates the strength of the relationship, with values closer to either extreme indicating a stronger correlation.
In conducting this analysis, we used MS Excel [12] to calculate the correlation coefficient and its associated p-value. The p-value is critical in this context as it helps us determine the statistical significance of our findings, indicating whether the observed correlation could have occurred by chance. A low p-value (typically p < 0.05) would suggest that the correlation observed is statistically significant, thereby reinforcing the reliability of the SI as a predictive tool for phytoplankton cell counts.
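To make the meaning of the p-value concrete, the sketch below estimates it with a permutation test: the cell counts are shuffled many times, and the p-value is the share of shuffles whose rank correlation is at least as extreme as the observed one. This is an illustrative alternative in Python, not the Excel procedure used in the study; all data values and function names are hypothetical.

```python
import random


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the Spearman correlation."""
    rng = random.Random(seed)
    observed = abs(spearman(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(x, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


si = [0.05, 0.10, 0.12, 0.20, 0.33, 0.41, 0.56, 0.62]   # hypothetical
counts = [0, 3, 8, 15, 40, 90, 150, 400]                # hypothetical
p_value = permutation_p_value(si, counts)
print(p_value)
```

A small value means that random pairings of SI and cell counts rarely produce a correlation as strong as the observed one.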
Figure 4 illustrates the temporal progression of SI values throughout the comprehensive 30-day sampling period undertaken. Each data point on this graph is aligned with specific temporal milestones, with the corresponding SI values plotted along the y-axis. This graphical representation is instrumental in illustrating the dynamic fluctuations of SI values over time, offering an intuitive insight into the trends and patterns that emerge from our dataset. The graph’s construction is designed to facilitate an immediate understanding of how the SI values, which reflect the spectral similarity of sampled water to the reference spectrum of Karenia brevis, evolve in response to environmental and biological variables. Notably, SI values of 0 appeared over the weekends when no physical cell counts were performed, highlighting periods without corresponding data for comparison. By tracking these changes, we can discern potential correlations between the presence of Karenia brevis and various environmental conditions or events within the sampling period.
Figure 5 presents the outcomes of the correlation analysis between the SI metric and phytoplankton cell counts, covering the date range of January 2023. The graph showcases both the calculated correlation coefficient and the associated p-value, providing a clear and comprehensive view of the relationship between these variables. The first 7 days show phytoplankton cell counts of 0, corresponding to the time off for the holidays when no sampling was conducted. Additionally, a harmful algal bloom (HAB) event between the 21st and the 25th of the month resulted in the highest phytoplankton cell counts observed during the study period on day 24. This visual representation and the accompanying statistical analysis validate the efficacy of the SI as a reliable indicator of the presence and abundance of Karenia brevis in marine environments. The methodology behind Figure 5 involves plotting the SI values against corresponding phytoplankton cell counts, with each data pair representing a matched set of observations from our sampling regime. Through this graphical analysis, we aim to identify any linear or non-linear trends that may exist, providing a quantifiable measure of how well the SI metric serves as a proxy for the physical presence of harmful algae. The strength, direction, and shape of the trend line drawn through these data points will indicate the robustness of the SI as a predictive tool for detecting Karenia brevis blooms.
Both figures collectively contribute to a deeper understanding of the predictive capabilities of the SI and its practical applications in marine biology and environmental monitoring. By meticulously analyzing the temporal evolution of SI values and their correlation with phytoplankton cell counts, our study endeavors to advance the field of algal bloom monitoring, offering potential pathways for the early detection and management of harmful algal events. These insights are invaluable for developing more effective strategies to mitigate such phenomena’s ecological and economic impacts on marine ecosystems and coastal communities.
To analyze the correlation, we employed Spearman’s rank correlation coefficient [13] to assess the relationship between the SI and phytoplankton abundance, specifically focusing on quantifying Karenia brevis cell counts. Spearman’s rank correlation is a non-parametric measure of rank correlation, meaning that it assesses how well the relationship between two variables can be described using a monotonic function without making assumptions about the frequency distribution of the variables. The equation for calculating Spearman’s rank correlation coefficient is grounded in the ranks of the data points rather than their raw values. This approach is particularly useful for our study as it mitigates the effects of outliers and non-normal distribution of data, providing a more robust measure of the relationship between SI and phytoplankton abundance.
The formula is based on the differences ($d_i$) between the ranks of each pair of corresponding values (SI value and phytoplankton count) across the dataset, as follows:
$$\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^{2}}{n\,(n^{2} - 1)}$$
where $\rho$ represents the correlation coefficient, $d_i$ stands for the difference between each paired rank, and $n$ denotes the number of data points.
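The rank-difference calculation can be worked through on a small example. The sketch below applies the standard Spearman formula to five hypothetical SI and cell-count pairs; the values and function names are illustrative only and are not taken from the study's data.

```python
def ranks_no_ties(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def spearman_rho(x, y):
    """Spearman's rho from squared rank differences (valid without ties)."""
    n = len(x)
    if len(set(x)) != n or len(set(y)) != n:
        raise ValueError("tied values: use a tie-aware method instead")
    rx, ry = ranks_no_ties(x), ranks_no_ties(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


si_values = [0.12, 0.30, 0.25, 0.48, 0.55]   # hypothetical
cell_counts = [10, 40, 55, 90, 70]           # hypothetical
# SI ranks: [1, 3, 2, 4, 5]; count ranks: [1, 2, 3, 5, 4]; sum of d^2 = 4
rho = spearman_rho(si_values, cell_counts)
print(rho)  # 0.8
```

The closed-form expression is exact only when no values are tied. Datasets with repeated values, such as several days with a cell count of 0, are normally handled by assigning average ranks and computing the Pearson correlation of those ranks.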
Based on the data, we calculated a correlation coefficient of ρ = 0.542, which signifies a moderate positive correlation between the SI and phytoplankton abundance. A positive correlation coefficient indicates that as the SI value increases (suggesting a greater spectral similarity to Karenia brevis), there is a corresponding increase in the actual counts of Karenia brevis cells. The magnitude of this coefficient, just over the halfway point between 0 and 1, indicates a relationship of moderate strength; statistical significance is determined by the associated p-value rather than by the magnitude alone. It also implies that other contributing factors to phytoplankton abundance are not captured by the SI alone.
The coefficient of 0.542, derived from our analysis, provides valuable insights into the efficacy of the SI as a tool for predicting Karenia brevis abundance. While the positive correlation supports the premise that SI can serve as an effective indicator of phytoplankton presence, the moderate strength of this correlation underscores the complexity of marine ecosystems and the multitude of factors influencing phytoplankton dynamics. This correlation coefficient forms the basis for further investigation and analysis. It prompts a deeper exploration into how the SI can be optimized or combined with other variables and predictive models to enhance the accuracy and reliability of HAB forecasts. Additionally, this finding highlights the need for continuous refinement of the SI metric and its computational algorithms to capture better the nuances of Karenia brevis blooms, aiming for a more comprehensive understanding and management of these ecological phenomena.
2.5. Distribution Analysis and Its Implications
In our comprehensive study, an in-depth distribution analysis was performed to scrutinize the statistical behavior of the SI values and the direct measurements of phytoplankton cell counts. This analysis stage is crucial for understanding the underlying statistical properties of our datasets, which, in turn, influences the selection of appropriate analytical methods and the interpretation of correlation results. A key component of this analysis involved conducting tests for normality, a foundational assumption in many statistical inference techniques. We utilized quantile–quantile (QQ) plots as a graphical method to assess the normality of both the SI values and phytoplankton cell count data. QQ plots compare the quantiles of our datasets against the quantiles of a standard normal distribution. Deviations from a straight line in the plot indicate departures from normality, offering a visual representation of how our empirical data match or diverge from the expected distribution under normality.
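The coordinates behind a normal QQ plot can be computed in a few lines. The sketch below pairs each sorted observation with the standard-normal quantile at the plotting position (i − 0.5)/n; the choice of plotting position, the function name, and the right-skewed sample counts are assumptions made for illustration.

```python
from statistics import NormalDist


def qq_points(data):
    """Return (theoretical, observed) quantile pairs for a normal QQ plot."""
    n = len(data)
    observed = sorted(data)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n stay strictly inside (0, 1).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))


# Hypothetical right-skewed cell counts (many zeros, a few large values)
counts = [0, 0, 0, 2, 5, 9, 40, 120, 300, 15]
points = qq_points(counts)
for theo, obs in points:
    print(f"{theo:+.3f}  {obs}")
```

For normally distributed data these pairs fall close to a straight line. A sample like the one above, with a flat run of zeros and a steep upper tail, bends away from the line, which is the visual signature of right skew.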
Figure 6 illustrates the QQ plot for the SI values. The plot visually assesses the SI data’s alignment with a normal distribution. By examining the pattern of points relative to the reference line (which represents a perfect match to a normal distribution), we can discern the degree and nature of the distribution’s deviation from normality. This analysis is instrumental in identifying skewness, kurtosis, or other anomalies that might affect the reliability and interpretation of subsequent statistical tests applied to the SI dataset.
Figure 7 presents the QQ plot for the phytoplankton cell count data. Similarly, this plot offers insights into the distributional characteristics of the cell counts, highlighting any significant deviations from normality. The shape and trend of the plotted points against the theoretical normal line enable us to detect whether the cell count data exhibit a normal distribution or if transformations or alternative non-parametric methods are warranted for accurate analysis.
The results from our distribution analysis provide essential insights into the statistical nature of our datasets. Identifying any significant departures from the normal distribution in the SI values or phytoplankton cell counts has profound implications for our analytical approach. Should the data not follow a normal distribution, non-parametric statistical methods for correlation analysis or the application of data transformation techniques should be considered to meet the assumptions of parametric tests. Furthermore, understanding the distributional properties of our datasets aids in the appropriate selection of predictive models and the interpretation of their outputs, ensuring that our findings are both statistically robust and ecologically meaningful. This level of scrutiny enhances the credibility of our research and reinforces the utility of the SI as a valuable tool in the ongoing study and management of HABs.