2.1. GPS Data Processing
We used data from 568 continuously operating GPS receivers distributed over the United States, including Alaska. These data were used by two analysis centers, PANGA/CWU (Pacific Northwest Geodetic Array/Central Washington University) and New Mexico Tech (NMT), to compute time series of position estimates with 1-day sampling. Their solutions are given in the global International Terrestrial Reference Frame ITRF2008 [28]. The CWU analysis center computes the daily positions with the Precise Point Positioning method using the GIPSY software developed by NASA's Jet Propulsion Laboratory (JPL), which also provides the necessary satellite ephemerides [29], clock corrections, and wide-lane phase bias estimates [30]. Note that the station positions were loosely constrained during the initial estimation and subsequently transformed into ITRF2008 [28] using only the translation and rotation components, but not the scale, of the JPL-provided Helmert transformations. The NMT processing, on the other hand, was performed with the GAMIT/GLOBK software [31,32], utilizing the same North American stations as the PANGA processing but also including additional stations in other parts of the world for the stability of the reference frame. Both the PANGA and the NMT processing used the Vienna Mapping Function 1 (VMF1) grid to handle the troposphere delay [33]. All parameters common to both processing strategies are explained in Herring et al. (2016) [34]. No common mode error filtering is used in either processing. It is important to emphasize that GAMIT double differencing does not remove common mode error; rather, the network of GPS stations selected by NMT and PANGA is large (a quarter of the Earth's surface), which dilutes the strong common mode error that is detectable over smaller regions. The final processing of these time series, described by Herring et al. (2016) [34], rotates the loosely constrained solutions provided by PANGA and NMT into the NAM08 reference frame using GLOBK [31,32].
Our study focuses only on the PANGA and the NMT solutions (the original time series are cwu.final_nam08.pos and nmt.final_nam08.pos). For both solutions, the baseline lengths and their uncertainties do not change between the "loose" solutions submitted by PANGA and NMT and the solutions rotated/translated into the NAM08 reference frame. The difference between the two processing strategies is mainly due to how the scale parameter is handled: NMT includes the scale in the Helmert transformation, whereas PANGA does not estimate it directly. Montillet et al. (2018) [35] emphasized that the choice of including a radial scaling degree of freedom during daily reference frame realization primarily impacts the average network radial height and produces apparent height anomalies in excess of 5 mm that persist for months. A comprehensive discussion of the Helmert transformation and the scale parameter can be found in references [34,35].
In our analysis, the 568 stations have time series that began on 1 January 2008 and ended on 1 January 2018. Our reason for choosing a fixed data time span is to reduce the differences between random models at different time scales. We also chose GPS stations with very few data gaps (less than 8%), which reduced the total number to 568 sites. In Appendix A, Table A1 lists the percentage of the 568 permanent GPS stations with less than a 3% data gap in their time series. As a result, more than 90% of the stations have more than 9.7 years of data. The average, maximum, and minimum data gaps of the 568 stations are also listed in this table to provide information (see Table A1 in Appendix A) on the quality of the selected time series used throughout this study.
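As an illustration of this screening step, the data-gap percentage of a daily time series can be computed as sketched below. This is a minimal example, not the authors' actual selection code, and `gap_percentage` is a hypothetical helper name.

```python
import numpy as np

def gap_percentage(observed_days, first_day, last_day):
    """Percentage of missing daily epochs between first_day and last_day
    (inclusive), given the integer day numbers that actually have a position
    estimate. Hypothetical helper, not the authors' actual screening code."""
    expected = last_day - first_day + 1
    present = np.sum((observed_days >= first_day) & (observed_days <= last_day))
    return 100.0 * (expected - present) / expected
```

Under the criterion described above, a station would be retained when the returned value is below 8%.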
The GPS stations analyzed in this study have a large diversity of monuments on which the GPS antenna has been installed. The metadata file (or log file) associated with each station provides a description of the monument, often referred to as a mast, pillar, roof top, tower, or tripod [36,37,38]. In this study, we classified all monument types into four categories: concrete pier (CP), deep-drilled brace monument (DDBm), shallow-drilled brace monument (SDBm), and roof top/chimney (RTC). This classification follows previous studies [39,40]. A concrete pier (CP) is a pillar that can reach several meters in height and is anchored deeply into the ground (up to 10 m below the surface). A DDBm is a braced monument in which four or five 2.5 cm-diameter pipes are installed and cemented into inclined boreholes, with the antenna attached ~1 m above the surface [40,41]; the pipes are anchored deeply below the surface (up to ~10 m) using heavy motorized equipment. An SDBm refers to equipment attached close to the surface (<1 m deep) using a hand drill. The fourth category (RTC) gathers the antennas installed on the tops of buildings, sometimes using a mast attached to a wall or a concrete support. Note that our classification is based on the monument description included in the log file available for each station (Table A1 in Appendix C gives more details on the monument types of the 568 analyzed sites).
2.2. GPS Time Series Analysis
First, outliers were removed from the time series. Outliers are observations that are larger than 3 times the interquartile range of the residual time series [42]. Second, the parameters of the trajectory model were estimated with weighted least-squares, while the parameters of the model that describes the noise were estimated using maximum likelihood estimation [18,19]. For this estimation process, we used the Hector software [43].
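A minimal sketch of this 3×IQR screening, assuming the rule is applied to the distance of each residual from the median (Hector's exact convention may differ):

```python
import numpy as np

def iqr_outlier_mask(residuals, factor=3.0):
    """Boolean mask that is True for observations to KEEP: residuals within
    `factor` times the interquartile range of the median. One common reading
    of the 3*IQR rule; the exact convention used in Hector may differ."""
    q1, q3 = np.percentile(residuals, [25.0, 75.0])
    iqr = q3 - q1
    return np.abs(residuals - np.median(residuals)) <= factor * iqr
```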
The trajectory model is a linear sum of the tectonic rate, seasonal signals, co-seismic offsets, and random stochastic processes ε(t); see Bevis et al. [20]:

x(t) = x_0 + v(t − t_0) + \sum_{i=1}^{2} [a_i \sin(2π f_i t) + b_i \cos(2π f_i t)] + \sum_{j=1}^{n} o_j H(t − t_j) + ε(t),   (1)

where x_0 is the initial position at the reference epoch t_0, v is the rate, and a_i and b_i are the periodic motion parameters (f_1 and f_2 for the annual and semiannual seasonal terms, respectively). The offset term o_j can be caused by earthquakes, equipment (environment) changes, or human intervention; it is the magnitude of the change at epoch t_j; n is the total number of offsets; H is the Heaviside step function. The times of known offsets t_j are retrieved from the station's metadata. Finally, the automatic offset detection algorithm developed by Fernandes and Bos (2016) [44] is applied to detect undocumented offsets.
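The deterministic part of this trajectory model can be sketched as a weighted least-squares fit. The design below assumes annual and semiannual harmonics and known offset epochs; the noise parameters would be estimated separately (with Hector in this paper), and the function names are illustrative only.

```python
import numpy as np

def design_matrix(t, t0, offset_epochs):
    """Columns: intercept, rate, annual and semiannual sin/cos pairs, and one
    Heaviside step H(t - t_j) per offset epoch. Times t are in years."""
    cols = [np.ones_like(t), t - t0]
    for period in (1.0, 0.5):                 # annual, semiannual (years)
        w = 2.0 * np.pi / period
        cols += [np.sin(w * t), np.cos(w * t)]
    for tj in offset_epochs:
        cols.append((t >= tj).astype(float))  # Heaviside step
    return np.column_stack(cols)

def fit_trajectory(t, x, sigma, t0=0.0, offset_epochs=()):
    """Weighted least-squares estimate of the trajectory parameters,
    solving the normal equations A^T W A p = A^T W x."""
    A = design_matrix(t, t0, offset_epochs)
    AtW = A.T * (1.0 / sigma**2)
    return np.linalg.solve(AtW @ A, AtW @ x)
```

For noiseless synthetic data with a known rate and offset, the estimator recovers the input parameters to machine precision.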
The Cascadia subduction zone is a convergent plate boundary that stretches from northern Vancouver Island in Canada to northern California in the United States. As we model and study time series of stations located in the Pacific Northwest, including the Cascade Mountains, specific events must be modeled, such as Episodic Tremor and Slip (ETS) [45,46,47,48], for which we used the hyperbolic tangent function [49]. The amplitude of this function is an additional estimated parameter, and ε(t) describes the noise/random stochastic processes. The time of the slow slip event and the delay of the postseismic deformation are required as input parameters for the estimation of the ETS using Hector [49], with the shape of the hyperbolic tangent prescribed by the time the ETS event occurred and its width (see Appendix B). The times of the slow slip events can be requested from the Pacific Northwest Geodetic Array website or obtained by a careful analysis of the time series with some training. The start of a slow slip event is evaluated via the correlation of seismic data together with a careful check of each time series [50]. In the remainder of this work, we use four delays (namely 30, 80, 100, and 130 days) for the postseismic deformation, because it is difficult to precisely estimate the duration of crustal decay. Note that these delays are conservative numbers, knowing that the repetition interval of ETS events is ~14 months, as evaluated by previous geophysical studies of Cascadia [35,50]. These values represent a tradeoff between under-modeling the phenomenon and, in contrast, absorbing other geophysical phenomena due to an overestimation of the decay time [50,51,52].
Figure 1 displays an example of the functional model, including slow slip events, superimposed on the observations at station ALBH. In this example, we use a 100-day decay time scale; the residuals of the time series are shown at the bottom.
2.3. Stochastic Model Selection Criteria and Simulation Experiment
Power-law noise with a spectral index of −1 is called flicker noise (FN). Random walk (RW) noise has a spectral index of −2. Generalized Gauss Markov (GGM) noise is similar to power-law noise (PL) but flattens below a specified frequency. As noted in the introduction and shown in the following sections, the selection of the correct noise model has a significant influence on the trend uncertainty [21,23,53,54]. The theory of selecting the best model to describe observations has a long history, and many research areas simply use the Akaike or Bayesian Information Criterion [55,56]. However, Langbein (2004) [42] followed a more empirical approach by performing Monte Carlo simulations using synthetic noise and, in this way, determined how large the difference between the log-likelihood values of two competing noise models must be before one can confidently choose one over the other. In Langbein (2004) and Santamaría-Gómez et al. (2011) [42,57], the default noise model for these simulations, or the null model, was in all cases random walk plus white noise, and it was determined how much higher the log-likelihood value needed to be before one could accept another noise model as being better with 99% confidence. The log-likelihood value will be abbreviated as MLE since it is estimated with the maximum likelihood method. Differences between two MLE values are represented as dMLE. The likelihood function represents a probability, although not a normalized one; therefore, MLE and dMLE are logarithms of probabilities and have no units.
We repeated the simulations using 5000 daily time series with a length of 10 years of synthetic random walk + white noise, with amplitudes of 1 mm/yr^0.5 and 0.5 mm, respectively. Out of the 5000 simulations, the dMLE value exceeded by only 50 of them is identified as the 99% level for rejecting the null hypothesis. The results are shown in Table 1.
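The synthetic null-model series and the extraction of the 99% level can be sketched as follows. The MLE computation itself (done with Hector in this paper) is omitted, so the quantile step is shown as a stand-alone function.

```python
import numpy as np

rng = np.random.default_rng(0)
N_DAYS = 3650                 # 10 years of daily samples
DT_YR = 1.0 / 365.25

def rw_plus_wn(rw_amp_mm_per_sqrt_yr=1.0, wn_amp_mm=0.5):
    """One synthetic daily series: random walk (amplitude in mm/yr^0.5)
    plus white noise (amplitude in mm)."""
    rw = np.cumsum(rng.normal(0.0, rw_amp_mm_per_sqrt_yr * np.sqrt(DT_YR), N_DAYS))
    return rw + rng.normal(0.0, wn_amp_mm, N_DAYS)

def level_99(dmle_samples):
    """With one dMLE value per simulation, the 99% rejection level is the
    value exceeded by only 1% of the runs (50 out of 5000)."""
    return np.quantile(dmle_samples, 0.99)
```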
The Bayesian Information Criterion (BIC) and BIC_tp are defined as follows (He et al., 2019) [25]:

BIC = −2 MLE + ν ln(N),   (2)

BIC_tp = −2 MLE + ν ln(N/(2π)),   (3)

where MLE = ln(L) is the log-likelihood value, ν is the number of parameters in the noise model, and N is the number of observations. The noise model with the lowest BIC value is selected. For 10 years of daily observations, ln(N) = 8.2. Following Langbein (2004) and Santamaría-Gómez et al. (2011) [42,57], we can rewrite the difference in BIC values as a difference in MLE values:
(MLE_b − MLE_n) − (ν_b − ν_n) ln(N)/2,   (4)

where ν_n and ν_b are the numbers of parameters and MLE_n and MLE_b are the MLE values of the null and new models, respectively. If this criterion is larger than zero, then the new noise model (b) is more likely than the null model (n). RW + FN and PL have one more parameter than RW, while GGM has two more parameters, resulting in correction values of 4.1 and 8.2, respectively. For example, the MLE value of the GGM model needs to be 8.2 higher than that of RW before one can be confident that it is a better representation of the noise. These correction values are similar to the values listed in Table 1. For BIC_tp, the weight factor for each extra parameter in the noise model is 3.2 instead of 4.1 for time series with a length of 10 years.
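These definitions and the per-parameter weights can be checked numerically. The sketch below assumes BIC_tp weights each parameter by ln(N/(2π)), which reproduces the 3.2-versus-4.1 values quoted above; the function names are illustrative.

```python
import numpy as np

def bic(mle, nu, N):
    """BIC = -2*MLE + nu*ln(N), with MLE = ln(L)."""
    return -2.0 * mle + nu * np.log(N)

def bic_tp(mle, nu, N):
    """Assumed variant: each parameter weighted by ln(N / (2*pi))."""
    return -2.0 * mle + nu * np.log(N / (2.0 * np.pi))

def new_model_preferred(mle_n, nu_n, mle_b, nu_b, N):
    """BIC criterion rewritten in MLE form: accept model b when
    MLE_b - MLE_n > (nu_b - nu_n) * ln(N) / 2."""
    return (mle_b - mle_n) > 0.5 * (nu_b - nu_n) * np.log(N)
```

For N = 3652 daily observations (10 years), ln(N)/2 ≈ 4.1 and ln(N/(2π))/2 ≈ 3.2, matching the correction values above.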
As in He et al. (2019) [25], we only consider a detection of GGM as the most likely noise model to be significant if ϕ < 0.98; this parameter is also estimated by Hector. If this condition is not met, then the second most likely noise model is chosen. He et al. (2019) [25] explained that this condition implies that we only detect GGM noise whose flattening already starts around a period of 1 year. For the rest of this research, this extra condition of ϕ < 0.98 was always applied in addition to satisfying Equation (4).
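The selection rule with this extra ϕ condition can be sketched as follows, using a hypothetical data structure of one (name, criterion value, ϕ) triple per candidate model, with ϕ set to None for models without a GGM flattening parameter.

```python
def select_noise_model(candidates, phi_max=0.98):
    """Return the name of the best model by ascending criterion value
    (e.g. BIC_tp), skipping GGM detections with phi >= phi_max so that
    the second most likely model is chosen instead. A sketch of the rule
    described in the text, not Hector's own code."""
    for name, value, phi in sorted(candidates, key=lambda c: c[1]):
        if name == "GGM" and (phi is None or phi >= phi_max):
            continue  # GGM detection not significant; try the next model
        return name
    return None
```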
The values of the parameters in the noise models used in the Monte Carlo simulations of Langbein (2004) [42] are slightly different from the noise values discussed here. To ensure the most realistic results, we determined the mean values of each estimated noise model for the horizontal and vertical components of our time series; see Table 2 and Table 3. For each noise model, 5000 synthetic noise time series were generated. Each of them was analyzed using FN + WN (flicker noise + white noise), RW + FN + WN (random walk + flicker noise + white noise), GGM + WN (generalized Gauss Markov + white noise), and PL + WN (power-law + white noise).
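Synthetic power-law noise of a given spectral index can be generated with the standard fractional-differencing recursion (Hosking, 1981). This sketch covers the PL/FN cases only (κ = −1 gives flicker noise), not GGM or the full simulator used with Hector.

```python
import numpy as np

def powerlaw_noise(n, kappa, sigma, rng):
    """Synthetic power-law noise with spectral index kappa via fractional
    differencing: filter coefficients h_i = h_{i-1} * (d + i - 1) / i with
    d = -kappa / 2, convolved with white noise of standard deviation sigma."""
    d = -kappa / 2.0
    h = np.empty(n)
    h[0] = 1.0
    for i in range(1, n):
        h[i] = h[i - 1] * (d + i - 1.0) / i
    return np.convolve(rng.normal(0.0, sigma, n), h)[:n]
```

Setting κ = 0 gives d = 0, so the filter reduces to [1, 0, 0, …] and the output is pure white noise, which is a convenient sanity check.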
Instead of tabulating the 99% quantile of the difference in MLE, we show all differences as box–whisker plots in Figure 2 and Figure 3. Thus, if all values in the box–whisker plots were negative, one could be 100% sure that selecting the noise model with the highest MLE value (the noise model we think is correct) would indeed recover the true underlying noise model (better than the alternative noise models). One can see that this is not always the case. Applying the BIC correction (BIC and BIC_tp), resulting in the blue box–whisker plots, reduces the MLE of noise models with more parameters than the test/null model and increases it for models with fewer parameters. Therefore, BIC helps to detect FN noise while reducing the rate of detecting GGM noise. The red box–whisker plots represent the results of using BIC_tp, which helps increase the number of true positives (the true noise model is selected) and reduce the number of false positives (a false noise model is selected). Overall, its performance is better than that of BIC; thus, it will be used for the rest of this research.
From Figure 2 and Figure 3, it can be concluded from the RW + FN panels with positive box–whisker plots (positive dMLE values) that, in many cases, PL + WN noise is detected while, in fact, the true underlying noise is RW + FN + WN [23]. On the other hand, if RW + FN + WN noise is detected, then we have high confidence that it is correct, since false positives are extremely rare; see also He et al. [25]. Important for this research is the fact that, for synthetic GGM noise, the MLE of the other noise models is always lower than that of GGM. In addition, from the other panels of Figure 2 and Figure 3, one can also see that GGM is almost never selected when the underlying noise is not GGM. Thus, we can conclude with great confidence that any detection of GGM using BIC or BIC_tp is correct.
The random walk + flicker + white noise model was used by Langbein and Svarc [40] in the analyses of their time series. However, the analyses of our time series show that some flatten at low frequencies, which can be better described by a GGM noise model. These different conclusions might be caused by the fact that Langbein and Svarc [40] analyzed regionally filtered time series, whereas we are looking at unfiltered time series that are noisier and in which the smaller random walk signal might be hidden.
Note that Santamaría-Gómez et al. (2011) [57] concluded that neither AIC nor BIC is recommended as a means to discriminate between models. Using similar Monte Carlo simulations, we have shown that, for the time series with a length of 10 years used in this research, the performance of BIC and BIC_tp is actually similar to that of the approach of Langbein and Svarc (2019) [40], leaving the general debate on whether or not to use information criteria in the selection of the stochastic noise model for the future.