1. Introduction
Accurate prediction of material drift in the ocean is critical for responses to human misadventures in the marine environment, such as oil spill responses and search and rescue operations [1], as well as scientific pursuits, like mapping larval drift and ecosystem connectivity [2]. Drift is forced by the interaction of ocean currents, waves, and winds. These interactions range from simple additive or linearly scaled forcing [3,4] to more complex interactions giving rise to small-scale motions, such as Coriolis–Stokes forcing [5] and submesoscale turbulence [6].
Uncertainty from all sources should be quantified as accurately as possible for accurate drift prediction. Predicted drift trajectories that do not acknowledge the associated uncertainty, or that include it in an improper manner, may lead to adverse consequences ranging from improper scientific conclusions to loss of life. It is important to note here that overestimation of uncertainty (i.e., predicting too large a possible search area) may have impacts just as detrimental as underestimation. Steps towards accurate accounting of uncertainty have been taken in, amongst others, previous work by Blanken et al. [7], who employed fuzzy numbers [8,9,10] to propagate uncertainty through a drift trajectory model forced by time series measured at a single point. However, that work did not consider spatially variable velocity fields and, therefore, the algorithm used by Blanken et al. [7] is not suitable for use with forcing data obtained from numerical models of the ocean and atmosphere. Fuzzy numbers were used in that study since they offer appreciable benefits in the aggregation of uncertainty from various sources through well-defined arithmetic (cf. [11,12]), can be constructed from a minimum of only three data points, and do not require assumptions about the statistical distribution of the underlying data [13].
Generally, information used to inform drift prediction is obtained from numerical models of currents, wind, and waves, as it is not feasible to measure these parameters at all times with high spatial fidelity. Models, however, are by definition an abstraction of reality and are unlikely to reliably reproduce all physical processes that affect currents, wind, and waves. This is especially true at small spatial and temporal scales, as dynamics at these scales are not well understood and are an active area of research [14]. Misrepresentation of physical processes may be due to assumptions made in the model formulation or due to sub-optimal configuration of the model parameters. Publications such as that by Paquin et al. [15] show that, even in a state-of-the-art ocean modelling system, there is disagreement between modelled currents and those observed in the real world, leading to demonstrated subsequent disagreement between modelled and observed drift trajectories. An example of the implications of this is given in the work by the Canadian Coast Guard [16], who noted that, during the response to a fuel oil spill in the harbour of Vancouver, Canada, three trajectory models were run but none were able to correctly predict the observed trajectory of the spilled oil. The trajectory was reproduced after the fact by Zhong et al. [17], who employed a very-high-resolution hydrodynamic model to provide information on the currents at the time of the incident. Currents were the primary driver of oil drift and dispersion, as there was no significant wind and wave activity in the area at the time. In a general sense, however, results from very-high-resolution hydrodynamic models are unlikely to be readily available in an operational context due to their significant computational expense. Therefore, it is desirable to appropriately account for the uncertainty induced by the use of existing operational ocean models in trajectory prediction.
All instruments used to collect observational data for validating models of the ocean and atmosphere are also subject to uncertainty arising from instrument limitations, limitations in the frequencies of motion resolved in the measurement, and uncertainty in the representativity of the measurements arising from sampling at a finite number of locations. This uncertainty was aggregated for a single observation location and propagated through a trajectory model in the work by Blanken et al. [7].
Other commonly used approaches to uncertainty propagation in drift prediction treat uncertainty as a singular stochastic parameter. This stochastic parameter is often applied by simulating a large number of particles subject to a series of random velocity perturbations drawn from a prescribed distribution; i.e., a random walk [18,19,20]. More involved approaches have been built on this, such as applying uncertainty to velocity or acceleration rather than position [21] and considering possible persistent autocorrelation in the velocity field by modelling uncertainty as fractional Brownian motion [22,23]. The latter approach is a generalization of a random walk, which is recovered when the fractal dimension of the modelled motion is set to two. Here, strong agreement has been found between the relative dispersion of drifting buoy tracks analyzed via fractional Brownian motion [22] and independent observations of the growth of dye patches in the ocean [24]. For a discrete random walk in velocity, uncertainty is expected to grow proportionally to $t^3$ [25], analogous to the growth rate of a particle cloud in a turbulent field as first described by Richardson [26]. Growth rates are expected to be slower when correlation in the velocity field exists, as was noted by Sanderson and Booth [22] and Okubo [24], who both found relative dispersion proportional to approximately $t^{2.3}$.
An empirical alternative is given by the leeway model for object drift [3], which sees widespread use in search and rescue applications. Here, the wind response of a drifting object is described by nine coefficients determined through a regression analysis of the drift motion against wind velocity. Three of these coefficients are error terms used to define a Gaussian distribution that can be sampled in Monte Carlo simulations to estimate the uncertainty in the prediction [27]. This approach relies on empirical, object-specific coefficients and is therefore limited in its ability to account for new or unknown object geometries. However, when comparing predictions made for five different types of drifting buoys, it was found to perform similarly to the deterministic, fuzzy-based analysis performed by Blanken et al. [7].
Some efforts have also been made to reduce model uncertainty by applying hyper-ensemble techniques; i.e., combining results from several different ocean and atmospheric models [28,29,30]. Ensemble approaches may include models that assimilate observed ocean surface velocities, which has been shown to improve representation of modelled surface velocity structure [31]. These approaches show promise, but for operational application, they are limited to areas where multiple model solutions or the capacity for ensemble modelling with parameter perturbation exist.
Irrespective of the chosen method of uncertainty propagation, making an appropriate choice of the parameters describing the applied uncertainty is critical to producing an accurate trajectory simulation. Within the uncertainty characterization, it is desirable to describe the differences between the numerical model fields used to force the trajectory prediction and observations of currents and winds in the region of interest with as much detail as practicable. This is likely to be the primary source of uncertainty, along with the wind response of the drifting object. When using random velocity kicks to apply uncertainty in drift prediction, model–observation differences may be used to inform the magnitude of these kicks, which are generally drawn from an assumed statistical distribution (cf. [3]). However, describing the uncertainty as a fuzzy number offers the benefits that biases in the forcing model are accounted for without additional modification (through the location of the kernel, see Section 2.1) and no distribution needs to be assumed. Using the exact distribution of the data rather than an assumed approximate distribution has been shown to improve results in geophysical modelling applications [32].
In the current paper, work is presented towards treating uncertainty in drift prediction on a per-source basis. Consideration is given to the differences between the modelled current, wind, and wave fields that inform the drift prediction and analogous observations of these parameters. Specifically, an algorithm for propagating such uncertainty in spatially variable flow fields was developed, as was suggested by Blanken et al. [7]. This algorithm was subjected to a sensitivity analysis of free parameters in order to document its performance. The results from the proposed scheme are contextualized within dispersion theory. This is a crucial step in developing a broader framework for using fuzzy numbers to describe and propagate uncertainty through a drift prediction system, as described by Blanken et al. [7]. The current effort builds on work such as that by Ni et al. [33], who used interval analysis to predict drift trajectories. This is similar in spirit to Blanken et al. [7] and the extension of this work in the present paper. However, Ni et al. [33] did not consider spatial or temporal variability in model forcing fields.
2. Materials and Methods
2.1. A Quick Primer on Fuzzy Numbers
Fuzzy numbers are used throughout this paper to describe uncertainty. They were first introduced by Zadeh [9,10] as a method to accurately describe imprecise quantities. A fuzzy number consists of a value that is believed to certainly be a possible value of the quantity being described (the crisp value or kernel), as well as the ranges of the quantity's value considered to be possible at lower degrees of belief. These ranges are described by the membership function as a function of their corresponding degree of belief. Degree of belief is given in the interval $[0, 1]$, with 1 being assigned to the crisp value. The lowest and highest values of the quantity thought to be possible form the support of the fuzzy number. The membership function increases monotonically from the lower bound of the support to the crisp value and then decreases monotonically towards the upper bound of the support.
Fuzzy numbers are weakly related to probability theory through the consistency principle, which states that the degree of belief in a range of possible values of a quantity must be greater than or equal to the probability of this range of values [10]. This does not have a direct impact on the work presented here but is mentioned to contextualize the work in the broader literature on uncertainty in particle tracking. In the present work, fuzzy numbers are used since their arithmetic is well defined, which allows straightforward and accurate aggregation of uncertainty from various sources and propagation of this uncertainty through complex mathematical models.
To perform arithmetic on fuzzy numbers, they are first discretized into a set of membership levels (also called $\alpha$-cuts in the literature) [34]. Membership levels are intervals describing the possible values of a fuzzy quantity at a fixed degree of belief. For mathematical models where no terms are repeated, arithmetic can be performed on each membership level separately using interval arithmetic [11]. If terms are repeated in the model to be solved, then interval arithmetic cannot be used, as it overestimates the uncertainty in the result due to "double-counting". In these cases, the transformation method can be used to discretize the fuzzy number into array form, and the resulting arrays are used to solve the equations using element-wise arithmetic [12]. The work presented in this paper does not directly require fuzzy arithmetic; however, a brief summary is presented here for context. For further discussion in the context of ocean particle tracking, please refer to [7]. For general discussion of fuzzy mathematics, the reader is referred to the foundational papers [8,9,10], as well as texts such as [11,12].
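As an illustration of the level-wise arithmetic described above, the following sketch discretizes two triangular fuzzy numbers into membership levels and adds them with interval arithmetic. This is an illustrative Python fragment, not the implementation used in this study; the function names and values are invented for demonstration.

```python
# Illustrative sketch: a triangular fuzzy number discretized into membership
# levels (alpha-cuts), each represented as an interval, with addition
# performed level-by-level via interval arithmetic (valid when no term repeats).

def fuzzy_from_triangle(lo, kernel, hi, n_levels=5):
    """Discretize a triangular fuzzy number into n_levels alpha-cut intervals."""
    levels = []
    for i in range(n_levels):
        alpha = i / (n_levels - 1)          # degree of belief, 0..1
        left = lo + alpha * (kernel - lo)   # interval narrows toward the kernel
        right = hi - alpha * (hi - kernel)
        levels.append((left, right))
    return levels

def fuzzy_add(a, b):
    """Interval arithmetic per membership level."""
    return [(al + bl, ar + br) for (al, ar), (bl, br) in zip(a, b)]

u = fuzzy_from_triangle(-0.1, 0.0, 0.1)   # e.g., current uncertainty (m/s)
w = fuzzy_from_triangle(0.1, 0.2, 0.3)    # e.g., scaled wind contribution (m/s)
total = fuzzy_add(u, w)
# At full belief (alpha = 1) the interval collapses to the crisp sum.
```

Note that the support of the sum spans the combined extremes of both inputs, while the kernel remains the sum of the two crisp values, which is how biases propagate without an assumed distribution.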
2.2. Uncertainty and Fractional Brownian Motion
To propagate uncertainties expressed as fuzzy numbers through drift trajectory predictions, the crisp velocity fields used to force a trajectory model will be perturbed in a systematic manner based on the prescribed uncertainty. For the purpose of this paper, when describing the method for uncertainty propagation, these fields are arbitrarily defined. Rigorous definition of the uncertainty fields based on oceanographic principles is not the focus of the present paper but is discussed in the context of fuzzy numbers in Blanken et al. [7] and will be further discussed in future publications.
Broadly, uncertainty is defined as consisting of an epistemic (or global) component, $\sigma_e$, and an aleatoric (or local) component, $\sigma_a$. The epistemic component of uncertainty relates to broad systemic uncertainties arising from the model configuration, while the aleatoric component relates to changes in this uncertainty over the course of the simulation due to internal variability, such as turbulence, numerical noise, and fluctuations in instrument noise in observations. Contributions to the epistemic component of uncertainty in a particle tracking model come from the following sources: the deviations between the hydrodynamic and atmospheric models providing information on the current, wind, and wave fields (the input models) and analogous observations of these fields; unresolved spatial and temporal scales of motion in the input models and observations; instrument uncertainty in the equipment used to make observations of the input fields; and interpolation error due to the finite size of the model input grids.
In this paper, we describe uncertainty as a fuzzy number [8,9,10] with five membership levels [34] corresponding to values of $\sigma_e = 0$, 0.025, 0.05, 0.075, and 0.1 (for the unit-magnitude forcing considered here, equivalent to signal-to-noise ratios ranging from $\infty$ down to 10). For each membership level, the corresponding value of $\sigma_e$ is mapped into $u$–$v$ space by treating it as a radius in polar coordinates and converting to Cartesian coordinates, resulting in the membership function shown in Figure 1.
The aleatoric component is related to the rate of change in uncertainty, which is in turn related to the autocorrelation of the deviations between the input model fields and analogous observations. If the autocorrelation of these deviations decays exponentially in time, then the aleatoric component of uncertainty is equivalent to Brownian motion with the same magnitude as the epistemic uncertainty. For generality, we model the aleatoric uncertainty as a fractional Brownian function [22,35,36]. While continuous solutions to fuzzy fractional systems exist and recent advances have been made in the field (cf. [37,38,39]), we consider a discrete solution and exploit the self-similar properties of these functions. In the literature, this self-similarity is given as follows:

$$B(t + r\,\Delta t) - B(t) \doteq r^{1/D}\left[B(t + \Delta t) - B(t)\right]$$
Here, $\doteq$ implies equality of statistical properties. Therefore, self-similarity suggests that, statistically, the changes in a time series $B$ over an arbitrary interval $r\,\Delta t$ are the same as those over an interval $\Delta t$ if they are scaled by a factor $r^{1/D}$. For a two-dimensional process with fractal dimension $D$, the fractal dimension of the process lies in the interval $1 \le D \le 2$. A value of $D = 1$ implies motion in a straight line, while $D = 2$ implies random motion that eventually fills the available space completely. For the case of $D = 2$, fractional Brownian motion reduces to standard Brownian motion [22]. As the fractal dimension of the uncertainty in drift velocity for a modelled particle is not known a priori, we tested values in the interval $1 < D \le 2$.
If we consider the positional uncertainty of a particle in a simulation of length $T$ with timestep $\Delta t$, then the self-similarity above allows us to write the following power law for the aleatoric uncertainty, $\sigma_a$, in terms of the epistemic uncertainty, $\sigma_e$:

$$\sigma_a = \sigma_e \left( \frac{\Delta t}{T} \right)^{1/D} \qquad (3)$$

As such, for $D = 1$, $\sigma_a$ evolves as a linear function of the timestep fraction ($\Delta t / T$), while for $D = 2$, the evolution is inverse parabolic, with large changes in $\sigma_a$ at small timestep fractions and comparatively smaller changes at large timestep fractions. Details on the application of the uncertainty described in this section to perturb the forcing velocity field are given in Section 2.4.
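The scaling of the aleatoric magnitude can be sketched numerically as follows. This illustrative Python fragment assumes the power law $\sigma_a = \sigma_e (\Delta t / T)^{1/D}$ as reconstructed above; the function name and values are ours.

```python
import math

def aleatoric_magnitude(sigma_e, dt, T, D):
    """Per-timestep (aleatoric) uncertainty magnitude implied by
    fractional-Brownian self-similarity: sigma_a = sigma_e * (dt/T)**(1/D).
    D = 2 recovers Brownian (square-root) scaling; D = 1 gives linear scaling."""
    return sigma_e * (dt / T) ** (1.0 / D)

# With 360 timesteps per simulation and sigma_e = 0.1:
sigma_a_brownian = aleatoric_magnitude(0.1, 1.0, 360.0, 2.0)  # sigma_e / sqrt(360)
sigma_a_ballistic = aleatoric_magnitude(0.1, 1.0, 360.0, 1.0)  # sigma_e / 360
```

As expected, the Brownian case ($D = 2$) produces much larger per-timestep perturbations than the ballistic case ($D = 1$) for the same epistemic magnitude, since uncorrelated increments must individually be larger to accumulate the same total uncertainty.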
2.3. Forcing Field
The crisp velocities to be perturbed by the uncertainty described in the previous section are given by two idealized, analytical vector fields (Figure 2). Both represent a variation of a counterclockwise rotating monopole vortex. The first field under consideration is the simplest one, representing the vortex in steady form with constant angular velocity. In mathematical form,

$$u_r = 0, \qquad \omega = \mathrm{const},$$

where $u_r$ is the radial velocity and $\omega$ is the angular velocity. All other symbols are as conventionally defined in polar coordinates.
The second field was designed to test performance in unsteady velocity fields; here, a periodically reversing flow in which the magnitude of the angular velocity varies sinusoidally with time $t$. For the purpose of this paper, particle tracking is started at $(x, y) = (0, 1)$ and completed for a fixed duration $T$. For the constant velocity case, this duration implies that the analytical solution completes one full rotation around the vortex.
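A minimal sketch of the steady forcing field and its analytical solution is given below, assuming solid-body rotation with unit angular velocity; the paper's exact constants are not reproduced here, so `OMEGA` and the helper names are illustrative.

```python
import math

OMEGA = 1.0  # assumed (unit) angular velocity of the vortex

def velocity(x, y):
    """Counterclockwise solid-body vortex, expressed in Cartesian components:
    u = -omega * y, v = omega * x (i.e., u_r = 0, constant angular velocity)."""
    return -OMEGA * y, OMEGA * x

def analytical_position(x0, y0, t):
    """Exact trajectory: rotation of the start point by omega * t radians."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return c * x0 - s * y0, s * x0 + c * y0

# Starting at (0, 1), a quarter period carries the particle to (-1, 0);
# this exact solution is what the crisp numerical trajectory is scored against.
x, y = analytical_position(0.0, 1.0, math.pi / 2)
```

Having a closed-form trajectory is what makes the skill assessment of Section 2.5 possible: the crisp numerical solution can be compared pointwise against `analytical_position` at any time.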
2.4. Fuzzy Particle Tracking
Particle tracking is conducted in two loops: an outer loop over the membership levels of the fuzzy number describing velocity uncertainty (Figure 1 and Figure 3A) and an inner loop over the number of timesteps required to reach the end of the simulation (orange outlined area in Figure 3).
The outer loop begins by determining $\sigma_a$ based on the value of $\sigma_e$ corresponding to the membership level under consideration and the prescribed values of the timestep ($\Delta t$), simulation length ($T$), and fractal dimension of the uncertainty ($D$), in accordance with Equation (3). The area in $u$–$v$ space described by $\sigma_e$ and $\sigma_a$ is then discretized into a set of possible perturbations. The discretization of $\sigma_e$ is done on a grid of size $M$, resulting in $M$ possible perturbations. Similarly, $\sigma_a$ is discretized on a grid of size $K$, resulting in $K$ possible per-timestep perturbations (panels B1 and B2 of Figure 3). $M$ and $K$ are free parameters of the simulation. The simulation then proceeds to the inner loop, which is run over $T/\Delta t$ timesteps.
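The discretization of an uncertainty magnitude into a set of candidate velocity perturbations can be sketched as follows. The polar-grid layout and parameter values are illustrative assumptions standing in for the gridding of panels B1 and B2, not the paper's exact scheme.

```python
import math

def discretize_perturbations(sigma, n_r=4, n_theta=4):
    """Discretize the disc of radius sigma in u-v space into a set of
    candidate velocity perturbations (an assumed polar-grid layout;
    M = n_r * n_theta candidates in total)."""
    perts = []
    for i in range(1, n_r + 1):
        r = sigma * i / n_r                     # radial rings up to sigma
        for j in range(n_theta):
            th = 2.0 * math.pi * j / n_theta    # evenly spaced directions
            perts.append((r * math.cos(th), r * math.sin(th)))
    return perts

# M = 16 perturbations for the widest membership level (sigma_e = 0.1),
# matching the smallest value of M tested in the sensitivity analysis.
perts = discretize_perturbations(0.1)
```

Each tuple is a candidate $(\Delta u, \Delta v)$ added to the crisp velocity; no perturbation exceeds the prescribed magnitude $\sigma$.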
For the first two steps of the inner loop, we begin by describing the procedure on the first iteration ($N = 0$), as this presents a special case. Here, the epistemic uncertainty, represented by $\sigma_e$, is introduced. Initially, the crisp velocity is updated (panel C1 of Figure 3) to be the value of the crisp velocity field interpolated to the initial particle position. Following this, the possible perturbations of this crisp value are individually added to the crisp velocity to produce a set of possible velocity fields (panel C2 of Figure 3). For the first iteration, this means that the $M$ possible perturbations discretized from $\sigma_e$ at the beginning of the outer loop are added to the crisp velocity at the initial position to produce $M$ possible velocity fields.
Particles are then advanced by one timestep from their position in each of the possible velocity fields using a fourth-order Runge–Kutta integration scheme with inverse distance-weighted interpolation (panel C3 of Figure 3). For the interpolation, the search radius is set to 0.1 and the power parameter to 6, as these values were found to optimize the interpolation results. Once the integration is complete and the new positions have been obtained for all possible velocity fields, the set of positions is decimated to manage computational expense by preventing exponential growth in the number of possible positions considered. The decimation is done by placing a grid over the positions and, in cells containing multiple possible positions, keeping only the position closest to the center of the grid cell (panel C4 of Figure 3). The velocity fields resulting in the remaining $L$ positions are recorded and passed back to the start of the inner loop for further time stepping.
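The decimation step can be sketched as follows. This is an illustrative Python fragment; the grid spacing and sample positions are invented for demonstration, not taken from the study.

```python
import math

def decimate(positions, cell):
    """Thin a set of candidate positions: overlay a grid of spacing `cell`
    and, in each occupied cell, keep only the position closest to the cell
    center (a sketch of the decimation in panel C4 of Figure 3)."""
    best = {}  # cell index -> (distance to cell center, position index)
    for idx, (x, y) in enumerate(positions):
        i, j = math.floor(x / cell), math.floor(y / cell)
        cx, cy = (i + 0.5) * cell, (j + 0.5) * cell
        d = math.hypot(x - cx, y - cy)
        if (i, j) not in best or d < best[(i, j)][0]:
            best[(i, j)] = (d, idx)
    return sorted(idx for _, idx in best.values())

# The first two candidates share a cell; only the one nearer its center survives.
kept = decimate([(0.01, 0.01), (0.04, 0.06), (0.26, 0.24)], cell=0.1)
```

Because at most one position survives per occupied cell, the number of candidates carried forward is bounded by the number of cells, which is what prevents the exponential growth described above.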
For subsequent timesteps, the aleatoric uncertainty, represented by $\sigma_a$, is introduced. Here, the crisp velocity is updated by first computing the difference in the crisp velocity field between the current timestep and the previous timestep and then adding this difference to the set of velocity fields resulting in the $L$ possible positions at the end of the previous timestep (panel C1 of Figure 3). Each updated velocity field is then perturbed by the $K$ possible per-timestep perturbations (determined from $\sigma_a$ at the beginning of the outer loop), resulting in a set of $K$ new possible velocity fields to advance each possible position (panel C2 of Figure 3). The processes for integration to obtain new possible positions and decimation of these positions for computational efficiency are unchanged from the first timestep to subsequent timesteps.
Once timestepping is complete for all membership levels, the membership function of possible final positions is obtained from the final sets of possible positions at each membership level.
The free parameters of the model are the particle tracking timestep ($\Delta t$), the resolution of the forcing velocity field ($\Delta x$), the grid size for the discretization of $\sigma_e$ ($M$), the grid size for the discretization of $\sigma_a$ ($K$), and the fractal dimension of the uncertainty ($D$). To gain further insight into appropriate values for these parameters, a sensitivity analysis was conducted over a parameter space with bounds determined by the limits of computational stability and the results of the skill assessment discussed in Section 2.5. The timestep $\Delta t$ was varied as a fraction of the simulation length and is hereafter reported as the denominator of that fraction (i.e., 360 for the smallest timestep, at which the analytical solution in the constant velocity case is displaced by one degree per timestep). The spatial resolution of the forcing field $\Delta x$ was varied from 1% to 15% of the vortex radius. Larger values of $\Delta t$ and $\Delta x$ led to instability in the RK4 particle tracking scheme. The minimum values of $M$ and $K$ were set to 16 and 5, respectively. Below these values, the reliability of the method deteriorated as the decimation of positions (panel C4 of Figure 3) began to fail, with too many positions removed as a result of the coarse grid and small number of candidates. The maximum values of $M$ and $K$ were set to 64 and 15, respectively, as the skill assessment (Section 2.5) indicated that these maxima for the parameter space would cover the appropriate values. This is further discussed in Section 3 and Section 4. The fractal dimension of the uncertainty $D$ should be informed by analysis of the errors to be accounted for, for example, through yardstick analysis [40] of a time series of model–observation differences. Since we have no prior information on the value of this parameter, the entire possible parameter space $1 < D \le 2$ should be evaluated.
2.5. Skill Assessment
The algorithm presented in this section was evaluated by performing an analysis of sensitivity to variations in $M$ and $K$, as well as the particle tracking timestep $\Delta t$ and the resolution of the forcing velocity field $\Delta x$. The objective here was to determine the combination of parameters that maximizes model skill while minimizing computational effort. Additionally, the choice of parameters should not diminish the representation of the uncertainty in the final result.
Simulations were performed with each possible combination of parameters and scored using a cost function combining three terms. Here, $F$ is the integral of the membership function of particle position, representing the amount of information on uncertainty contained in the solution, and $F_{max}$ is the largest value of $F$ for a specified fractal dimension of aleatoric uncertainty, $D$. Trajectories are computed for several values of $D$ in $1 < D \le 2$, and the mean cost from all values of $D$ for a specific parameter combination is reported. $T_c$ is the time required for the simulation to complete on a consumer-grade laptop, with $T_{ref}$ being the value of $T_c$ for a reference simulation. Finally, $\epsilon_f$ is the separation between the crisp numerical solution for the given $\Delta t$ and $\Delta x$ and the analytical solution; a value of $\epsilon_f = 0$ therefore indicates that the crisp numerical solution reproduces the analytical solution exactly.
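For concreteness, the three ingredients can be combined as in the sketch below. The additive form shown is an assumption for illustration only; the actual weighting and functional form of the paper's cost function may differ.

```python
# Assumed additive cost combining the three ingredients described above:
# retained uncertainty information F (rewarded), wall-clock time T_c
# (penalized relative to a reference T_ref), and crisp-solution error
# eps_f (penalized). Lower cost is better.

def cost(F, F_max, T_c, T_ref, eps_f):
    """Illustrative cost: information deficit + normalized runtime + error."""
    return (1.0 - F / F_max) + T_c / T_ref + eps_f

# A run that retains all information (F = F_max), matches the reference
# runtime, and reproduces the analytical trajectory exactly (eps_f = 0)
# scores 1.0 under this assumed form.
baseline = cost(1.0, 1.0, 10.0, 10.0, 0.0)
```

The essential design point, independent of the exact form, is that losing uncertainty information, computing longer, or deviating from the analytical solution each increases the score, so minimizing the cost balances skill against expense.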
4. Discussion
The skill of the method presented here is clearly evident in the very strong correspondence between the crisp numerical solution and the analytical solution (indicated by the small values of $\epsilon_f$ in Figure 5 and Figure 8), the convergence of represented uncertainty content for increasing values of $M$ and $K$, and the physically sound evolution of uncertainty over the course of the simulation. The latter point was evidenced by the characteristics of the trajectory described in the previous section, as well as the fact that the solution remained convex despite membership levels being solved independently of one another (Figure 6 and Figure 9). Further, the amount of uncertainty contained in the simulation increased with $D$, as was expected from the theory. Therefore, these simulations inspire confidence that the proposed method works in the intended manner.
The method above therefore offers the potential to account for uncertainty in the forcing data for a drift trajectory simulation in a systematic and deterministic manner and then propagate this uncertainty through the simulation. The drift trajectory model used by Blanken et al. [7], as well as in many other studies (cf. [4,20,41]), is expressed as follows:

$$\mathbf{u}_d = \mathbf{u}_c + \mathbf{u}_s + \alpha\,\mathbf{u}_w \qquad (7)$$

Here, the drift velocity is $\mathbf{u}_d$, $\mathbf{u}_c$ is the ocean surface current, $\mathbf{u}_s$ is the Stokes drift due to the ocean surface wave field, $\mathbf{u}_w$ is the ocean surface wind, and $\alpha$ is a scaling coefficient. Accounting for uncertainty in this model will require numerical models of the ocean, atmosphere, and surface wave field, as well as observations of surface currents, winds, and waves/Stokes drift to validate the model results. Broadly speaking, for each forcing term, uncertainty can be expressed as a fuzzy number by: (1) mapping the model–observation differences as a histogram in $u$–$v$ space and converting this histogram to a fuzzy number using the method described by Khan and Valeo [42]; (2) expressing the uncertainty due to unresolved time scales and instrument uncertainty as fuzzy numbers, as in the work by Blanken et al. [7]; and (3) combining the fuzzy numbers from steps one and two using the principles of fuzzy arithmetic [11,12,43]. Uncertainties in the forcing from surface currents, wind, and Stokes drift can then be aggregated using Equation (7) and fuzzy arithmetic to form an overall uncertainty term equivalent to $\sigma_e$ in the present study. The crisp forcing field can be similarly derived by combining the modelled forcing fields according to Equation (7). Finally, the individual time series of model–observation differences may be combined according to Equation (7), and the result can be used to determine an appropriate value for $D$ through yardstick or other fractal analysis [36,40,44,45].
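The aggregation of per-source fuzzy uncertainties through a drift model of the form $\mathbf{u}_d = \mathbf{u}_c + \mathbf{u}_s + \alpha\,\mathbf{u}_w$ can be sketched with level-wise interval arithmetic, which is valid here because each term appears only once. The intervals and the value of $\alpha$ below are invented for illustration.

```python
# Hedged sketch: aggregate one velocity component's fuzzy forcing terms,
# membership level by membership level, via interval arithmetic.

def interval_add(a, b):
    return (a[0] + b[0], a[1] + b[1])

def interval_scale(a, k):  # assumes k > 0
    return (k * a[0], k * a[1])

def aggregate_drift(u_c, u_s, u_w, alpha):
    """Combine current, Stokes drift, and scaled wind, level by level."""
    return [interval_add(interval_add(c, s), interval_scale(w, alpha))
            for c, s, w in zip(u_c, u_s, u_w)]

# Two membership levels per term: (support, kernel), illustrative values:
u_c = [(0.10, 0.30), (0.20, 0.20)]   # surface current (m/s)
u_s = [(0.00, 0.04), (0.02, 0.02)]   # Stokes drift (m/s)
u_w = [(4.0, 6.0), (5.0, 5.0)]       # surface wind (m/s)
u_d = aggregate_drift(u_c, u_s, u_w, alpha=0.03)
```

The kernel level of the result is the crisp drift velocity, while the support aggregates the extremes of all three sources, giving a single fuzzy term playing the role of $\sigma_e$ for the trajectory model.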
Innovations and improvements are expected to arise from this approach through the accurate aggregation of uncertainty from various sources without the need to assume a statistical distribution [32], the accounting for correlations in the error fields through the use of fractional Brownian motion, and the implicit inclusion of any biases in the forcing models. The latter are included through the fuzzification of model–observation differences, which negates any need for biases to be accounted for through dispersion over an artificially large area. It is noted that biases and other uncertainty characteristics are likely to exhibit spatial variability and, therefore, collecting representative observations at as many locations as practicable and repeating the uncertainty quantification at each location is likely to provide significant benefit.
In the present study, analysis of the growth rate of the uncertainty content indicated strong agreement with the theory of relative dispersion; i.e., the growth rate of a dispersing patch of material about its center of mass. For the steady velocity case (Figure 10), uncertainty content was found to grow as $t^3$ with $D = 2$, which is consistent with the proportionality to $t^3$ proposed by [26] and the finding that piecewise-ballistic models of turbulence (functionally similar to the proposed method) exhibit $t^3$ scaling of relative dispersion [25]. A comprehensive analysis of expected dispersion behaviour for lower values of $D$ and increasing correlation in the velocity increments is beyond the scope of the present study. However, it is noted that, for lower values of $D$, the exponent of time decreases from 3, approaching 2 as $D$ approaches 1. For $D = 1.3$, proportionality of approximately $t^{2.3}$ was observed, which is consistent with the observation of Sanderson and Booth [22] for a similar value of $D$, as well as those of Okubo [24], who observed this time dependence in dye studies but did not invoke fractal concepts. Therefore, the proposed method appears to be well grounded in physics, reproducing previously observed behaviours of particle separation and patch growth in stochastic fields.
A remaining question is whether the parameter space investigated in Section 3 completely resolves the uncertainty contained in the formulated problem. It is feasible that, even though further improvements in cost are unlikely, further uncertainty may be resolved by considering increased values of $M$ and $K$. However, in both the steady and unsteady cases, the value of $F$ converges for $M = 64$ and $K = 15$ while computational expense increases drastically. From the local minimum of the cost function at $M = 32$, it is evident that further increases in $M$ may not be necessary for practical application.
The optimal combination of free parameters in the proposed trajectory model will likely be application-specific. Values given here are for a consumer-grade laptop with a 4.6 GHz processor, 24 MB of cache, and 32 GB of RAM. It is likely that optimizing the code from its current serial formulation to make use of parallel processing and/or GPUs would result in significant improvements in computational speed. Therefore, the metrics for the computational effort and total cost given in this paper should be taken as guidelines rather than firm conclusions.
In future studies, this method will be applied to realistic scenarios with accurately described uncertainty to evaluate the performance against real-world observations and traditional models for uncertainty, such as Brownian motion.