4.1. Results of GNSS Denoised Data
Herein, the N component of site P242 is presented as an example to assess the proposed method. The GNSS coordinate time series was first decomposed using CEEMD, which yielded eight IMFs and one residue (Figure 4). IMF 8 shows a clear annual variation in the GNSS daily time series, whereas IMF 7 shows a strong seasonal cycle.
Table 2 shows the CCs between the raw north GNSS time series of station P242 and each IMF decomposed by CEEMD. The transition IMF was IMF5. Based on our method, IMF1–IMF4 are “noise-dominant” IMFs, IMF5 is a “transition” IMF and the remaining IMFs are “signal-dominant” IMFs. IMF1–IMF5 were treated as “noise-dominant” modes and combined into a new matrix. Wavelet soft-threshold denoising was applied to every column of this matrix, which yielded a denoised matrix. The denoised matrix and the “signal-dominant” IMFs were then reconstructed into a new matrix, to which the MPCA method was applied. The top k components, chosen such that the cumulative contribution rate of the first k components exceeded 85%, were selected, and the final denoised matrix was reconstructed from them.
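The two numerical rules in this paragraph, soft thresholding and the 85% cumulative contribution rate, can be sketched as follows. This is a minimal illustration rather than the processing chain used in the study: the wavelet transform and the MPCA decomposition are omitted, and the function names, the threshold value and the example eigenvalues are hypothetical.

```python
import math

def soft_threshold(coeffs, thr):
    """Soft thresholding: shrink each coefficient toward zero by thr
    and set coefficients with |c| <= thr to zero."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]

def select_k(eigenvalues, rate=0.85):
    """Smallest k for which the k largest eigenvalues account for more
    than `rate` of the total variance (cumulative contribution rate)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total > rate:
            return k
    return len(ordered)

# Hypothetical wavelet detail coefficients and threshold
shrunk = soft_threshold([-2.0, -0.3, 0.1, 1.5], thr=0.5)

# Hypothetical eigenvalues of the covariance matrix
k = select_k([5.0, 3.0, 1.0, 0.6, 0.4])
print(shrunk, k)
```

In practice the threshold would be estimated from the data for each noise-dominant IMF, and the eigenvalues would come from the decomposition of the reconstructed matrix.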
The raw and denoised GNSS coordinate time series at station P242 in the N direction and their corresponding residual time series are shown in Figure 5. The raw coordinate time series exhibited clear nonlinear variations and oscillations with time, which indicates that it was affected by WN and by colored noise that was nonlinear and nonstationary (Figure 5a). The series denoised by the proposed method was smoother than those obtained with WD-PCA and EMD-PCA, indicating that the proposed method not only effectively eliminated noise at high frequencies, but also reduced noise at low frequencies. Nonlinear oscillations and random fluctuations with high frequencies were observed in the raw residual time series (Figure 5b), indicating the presence of WN with high frequencies and systematic noise with low frequencies in the raw GNSS daily coordinate sequence. The proposed method can eliminate more noise at both high and low frequencies than WD-PCA and EMD-PCA. WD-PCA considers the advantages of wavelet denoising and PCA in removing noise at both high and low frequencies. Meanwhile, EMD-PCA considers the advantages of EMD and PCA in partially removing low-frequency noise. Furthermore, the proposed method combines the advantages of CEEMD-WD and MPCA, which considers the correlation between all directions of the station and eliminates various types of noise effectively. CEEMD is more accurate than the other methods because of its highly adaptive and data-driven ability to decompose GNSS datasets into IMFs of different frequencies.
To analyze the denoising effect of the proposed method more comprehensively, the raw and denoised GNSS coordinate time series in the U direction for Blocks 1 (Figure 6) and 2 (Figure A1) are shown. The raw GNSS daily time series contains centimeter-level fluctuations, namely low-frequency colored noise and clear random oscillations with high frequencies. The denoised coordinate time series was much smoother than the original time series, indicating that both types of noise were effectively suppressed. For example, the time series of station P242 shows apparent annual cycles, which may be due to changes in the surface environment, comprising sea level, continental water and atmospheric load changes, that result in periodic changes in crustal deformation [44, 48].
To quantitatively describe the improvement in the accuracy of the GNSS daily positions, the standard deviations (STDs) of the GNSS residual time series of the various methods at each station are shown in Figure 7. Regardless of the block, the STDs of the original data were the highest. The EMD-PCA method exhibited the least improvement in terms of accuracy, whereas the proposed method exhibited the most significant improvement.
For Block 1, the proposed method performed significantly better than the other methods in the horizontal components. For the N component, the STD ranges of the original residual data, WD-PCA, EMD-PCA and the proposed method were [0.86–2.02], [0.20–1.71], [0.41–1.98] and [0.17–0.58] mm, respectively. For the E component, the STD ranges of the original residual time series, WD-PCA, EMD-PCA and the proposed method were [1.09–1.66], [0.34–0.93], [0.62–1.46] and [0.16–0.38] mm, respectively. The proposed approach exhibited a significantly higher accuracy than the other approaches in the horizontal direction because the horizontal coordinate time series of the stations were more strongly correlated, and MPCA can theoretically fully utilize the correlation between different coordinate components to remove noise (Figure 7). In the up direction, the accuracies achieved by the three methods for stations P216, P232, P233, P235 and P242 were similar. At the remaining stations, the proposed method exhibited a slightly higher accuracy than the other methods. The STD ranges of the original residual time series, WD-PCA, EMD-PCA and C-MPCA data were [4.50–13.02], [2.29–12.36], [2.45–12.71] and [2.35–12.46] mm, respectively. Because the correlation between the U components of the stations was weak, accounting for the correlation between different coordinate components did not significantly improve the filtering effect in the U component.
For Block 2, the proposed approach showed higher accuracy in the north component than the other approaches, except at the MIDA and TBLP stations. For the N component, the STD ranges of the original residual data, WD-PCA, EMD-PCA and the proposed method were [0.83–1.46], [0.15–0.33], [0.25–0.64] and [0.04–0.29] mm, respectively. At MIDA and TBLP, the error of the proposed method exceeded that of the WD-PCA method, mainly because of the inferior decomposition effect of CEEMD at these stations, which resulted in low filtering accuracy there. In the east direction, except for the TBLP station, the accuracy of the proposed method exceeded that of the other methods. For the E component, the STD ranges of the original residual time series, WD-PCA, EMD-PCA and the proposed method were [1.00–1.64], [0.20–0.36], [0.37–0.95] and [0.15–0.26] mm, respectively. For the U component, the STD ranges of the raw residual data, WD-PCA, EMD-PCA and C-MPCA were [4.24–6.26], [2.54–4.73], [2.92–5.51] and [2.11–4.40] mm, respectively. For the LAND and MASW stations, the WD-PCA method performed slightly better than EMD-PCA and the proposed method, whereas for the other stations, the proposed method performed better than the other methods. As in Block 1, the correlation between the U components of the stations was weak, so accounting for the correlation between different coordinate components did not significantly improve the filtering effect in the up component.
Table 3 presents the mean STD in each direction for the overall, Block 1 and Block 2 regions. For the overall denoising, the average STDs of the original residual data, WD-PCA and EMD-PCA in the N, E and U directions were (1.12, 1.28, 5.25) mm, (0.45, 0.53, 3.70) mm and (0.56, 0.68, 3.91) mm, respectively, whereas the average STDs of the proposed approach were (0.31, 0.30, 3.46) mm. Furthermore, the GNSS network was separated into two blocks by determining each block’s scale or range and considering the spatial distribution of the GNSS stations. For Block 1 in the N, E and U directions, the average STDs of the raw residual data, WD-PCA and EMD-PCA were (1.15, 1.36, 5.70) mm, (0.57, 0.51, 4.32) mm and (0.76, 0.87, 4.59) mm, respectively, whereas the average STDs of the proposed approach were (0.27, 0.29, 4.03) mm. For Block 2 in the N, E and U directions, the average STDs of the original residual data, WD-PCA and EMD-PCA were (1.09, 1.20, 4.79) mm, (0.28, 0.28, 3.19) mm and (0.50, 0.57, 3.58) mm, respectively, whereas the average STDs of the proposed approach were (0.15, 0.20, 2.86) mm. The original residual time series exhibited the highest STDs, followed by EMD-PCA, whereas the STDs shown by WD-PCA were lower because WD can accurately separate high-frequency WN and irregular periods. By contrast, the proposed method exhibited the lowest mean STD. This is because the CEEMD method can accurately decompose the GNSS coordinate time series, whereas our method takes the correlations between the different coordinate directions of each station into consideration.
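To make the comparison above easier to read, the mean STDs can be converted into relative reductions with respect to the original residual series. The sketch below does this for the overall network using the values quoted in the text; the relative-reduction measure and the function name are illustrative choices, not quantities reported in Table 3.

```python
# Overall mean STDs in mm, copied from the text, ordered as (N, E, U)
original = (1.12, 1.28, 5.25)
methods = {
    "WD-PCA": (0.45, 0.53, 3.70),
    "EMD-PCA": (0.56, 0.68, 3.91),
    "proposed": (0.31, 0.30, 3.46),
}

def reduction_percent(before, after):
    """Relative STD reduction with respect to the original series, in percent."""
    return 100.0 * (before - after) / before

# One (N, E, U) tuple of reductions per method, rounded to one decimal
reductions = {
    name: tuple(round(reduction_percent(b, a), 1) for b, a in zip(original, stds))
    for name, stds in methods.items()
}

for name, values in reductions.items():
    print(f"{name:9s} N={values[0]}%  E={values[1]}%  U={values[2]}%")
```

The same function can be applied to the Block 1 and Block 2 values to compare the two regions.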
4.2. Feature Extraction of Seasonal and Trend Items
The CEEMD method was subsequently applied to decompose the denoised GNSS coordinate time series into numerous IMFs to obtain the trend and seasonal terms, and each IMF was arranged based on frequency, as described by Wu et al. [43]. The decomposed results of the denoised GNSS data via CEEMD in the N direction at station P242 are presented and compared with those of WD and EMD (Figure A2). d1–d6, d7–d8 and a8, which were extracted using WD, denote the noise, seasonal and trend terms, respectively (Figure A2a). IMF1–IMF3, IMF4–IMF5 and Res, which were extracted using EMD, denote the noise, seasonal and trend terms, respectively (Figure A2b). IMF1–IMF7, IMF8–IMF9 and Res, which were extracted using CEEMD, denote the noise, seasonal and trend terms, respectively (Figure A2c).
WD and EMD can both be used to perform feature extraction on a time series. However, for WD, the amplitudes of the noise part are larger, which indicates that noise exerts a more significant effect on feature extraction, and EMD exhibits modal overlap. To demonstrate the feature extraction ability of the CEEMD method on the denoised GNSS coordinate time series, two measures were used: the similarity of the slope of the trend term and the Pearson correlation coefficients (PCCs) of the seasonal term.
(1) Comparison results between trend terms.
WD, EMD and CEEMD were applied to trend extraction for Blocks 1 and 2. The slopes of the original sequences and the trend terms were used to verify the superiority of the trend extraction. The similarity of the slope between the raw data and trend terms was calculated as follows:
where Similarity represents the similarity between the trend terms, and the two slope quantities are the slope of the ith trend term and the slope of the ith original series, respectively.
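As an illustration of a slope-similarity measure of this kind, the sketch below fits least-squares slopes to a raw series and to a trend term and compares them. The relative-difference form used in `slope_similarity` is an assumption made for this example only and is not taken from the expression in this section; the synthetic series and all names are hypothetical.

```python
import math

def ols_slope(t, y):
    """Ordinary least-squares slope of y against t."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    num = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    den = sum((ti - t_mean) ** 2 for ti in t)
    return num / den

def slope_similarity(k_trend, k_raw):
    """ASSUMED form: 100 * (1 - |k_trend - k_raw| / |k_raw|), in percent."""
    return 100.0 * (1.0 - abs(k_trend - k_raw) / abs(k_raw))

# Synthetic daily series over about two years (time in years, position in mm)
t = [i / 365.25 for i in range(730)]
raw = [2.0 * ti + 0.5 * math.sin(2.0 * math.pi * ti) for ti in t]
trend = [1.98 * ti + 0.1 for ti in t]  # hypothetical extracted trend term

k_raw = ols_slope(t, raw)
k_trend = ols_slope(t, trend)
print(f"raw slope {k_raw:.3f} mm/yr, trend slope {k_trend:.3f} mm/yr, "
      f"similarity {slope_similarity(k_trend, k_raw):.2f}%")
```

Under this assumed form, a value close to 100% means that the extracted trend reproduces the slope of the original series almost exactly.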
The similarity in the slope between the trend term and the denoised GNSS time series is presented in Table 4. The closer the slope of the trend term is to the slope of the raw GNSS data, the more reliable the extracted result. Based on Table 4, CEEMD is more stable and reliable than EMD and WD. The slope similarity of CEEMD was 99.99%, which was higher than that of WD (99.96%) and EMD (97.60%), demonstrating that CEEMD is superior to WD and EMD in terms of trend-term extraction.
(2) Comparison results between seasonal terms.
The performances of WD, EMD and CEEMD in extracting seasonal items were compared, and the PCC was used to establish the superiority of CEEMD (Table 5).
For Block 1, the average PCCs of the seasonal terms extracted via WD, EMD and CEEMD were 0.10, 0.18 and 0.35, respectively, whereas for Block 2, they were 0.12, 0.26 and 0.37, respectively (Table 5). The average PCCs obtained by CEEMD in Block 1 were 250.0% and 94.4% higher than those obtained by WD and EMD, respectively, whereas those in Block 2 were higher by 208.3% and 42.3%, respectively. These results demonstrate that CEEMD is more accurate than EMD and WD for the extraction of seasonal items. The seasonal terms extracted via CEEMD and the other parameters in Blocks 1 and 2 are shown in Figure 8a,b, respectively. Seasonal terms are indicated by yellow lines, GNSS detrended signals are illustrated by grey lines and the errors in the GNSS data are represented by black I-shaped lines. The results showed that the seasonal terms extracted via CEEMD are consistent with the residual time series.
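The PCC and the relative differences between the average PCCs can be computed as in the following sketch. The average PCCs are the rounded values quoted above; the `pearson` function is a generic implementation shown for reference, and which pair of series it is applied to in the study is not restated here.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def relative_gain(new, old):
    """Relative increase of `new` over `old`, in percent."""
    return 100.0 * (new - old) / old

# Average PCCs per block, rounded values copied from the text
mean_pcc = {
    "Block 1": {"WD": 0.10, "EMD": 0.18, "CEEMD": 0.35},
    "Block 2": {"WD": 0.12, "EMD": 0.26, "CEEMD": 0.37},
}

# Gain of CEEMD over WD and over EMD in each block
gains = {
    block: {m: round(relative_gain(v["CEEMD"], v[m]), 1) for m in ("WD", "EMD")}
    for block, v in mean_pcc.items()
}
print(gains)
```

Because the inputs are rounded to two decimals, gains recomputed this way can differ slightly from values derived from unrounded PCCs.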
4.3. Noise Analysis
Typical methods for the noise characteristic analysis of GNSS data include maximum likelihood estimation (MLE) and power spectrum analysis. For noise analysis in a time series, the nature and intensity of noise in the GNSS data are determined in the frequency and time domains, respectively. Bos et al. [13] developed a technique that substantially increases the efficiency of the MLE method, which has since been improved [49]. Additionally, more advanced and computationally expensive methods, such as Markov chain Monte Carlo methods [14, 50] and wavelet analysis [51], have been proposed to obtain unbiased probability distributions for noise in the time series. The definition of the related information criterion (IC) is provided in Appendix A.3.
He et al. objectively selected the best noise model by investigating various criteria such as the AIC and BIC [52]. The performance of the models investigated was quantified by analyzing 500 batches of simulated time series with lengths of 8.2, 16.4 and 24.6 years and known noise characteristics. He et al. processed a time series before jointly estimating stochastic and functional models using a maximum log-likelihood estimator via Hector software [13, 53]. To prevent the over-selection of noise models with numerous parameters, the optimal noise model was selected based on the log-likelihood value and information criteria (i.e., AIC, BIC and BIC_tp). He et al. applied these information criteria to Monte Carlo simulations of synthetic time series and obtained highly reliable results [54]. In this study, the Hector software package was employed to estimate the parameters of various combinations of noise models, which were then used to analyze the noise in raw and denoised GNSS residual sequences at the selected GNSS sites in the two blocks. The AIC, BIC and BIC_tp were applied to describe the feasibility of each noise model. WN, WN plus power-law noise (WN + PL), WN plus flicker noise (WN + FN), WN plus random walk noise (WN + RW), WN + FN + RW, generalized Gauss–Markov noise (GGM) and WN plus generalized Gauss–Markov noise (WN + GGM) were considered in the noise analysis.
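The model-selection step can be illustrated with a short sketch that ranks candidate noise models by the three criteria. The AIC and BIC follow their standard definitions; the BIC_tp form used here, in which the number of observations n in the BIC penalty is replaced by n/(2π), is assumed from the literature, and the definitions in Appendix A.3 take precedence. The log-likelihoods, parameter counts and series length are hypothetical and are not Hector output.

```python
import math

def information_criteria(log_likelihood, n_params, n_obs):
    """AIC, BIC and BIC_tp for one fitted model (lower is better).
    BIC_tp is ASSUMED to use log(n_obs / (2*pi)) as the penalty factor."""
    aic = 2.0 * n_params - 2.0 * log_likelihood
    bic = n_params * math.log(n_obs) - 2.0 * log_likelihood
    bic_tp = n_params * math.log(n_obs / (2.0 * math.pi)) - 2.0 * log_likelihood
    return {"AIC": aic, "BIC": bic, "BIC_tp": bic_tp}

# Hypothetical (log-likelihood, number of estimated parameters) per noise model
candidates = {
    "WN": (-5210.0, 1),
    "WN + FN": (-5105.0, 2),
    "WN + PL": (-5103.5, 3),
    "WN + FN + RW": (-5104.0, 3),
}
n_obs = 3000  # hypothetical number of daily solutions

scores = {
    model: information_criteria(logl, k, n_obs)
    for model, (logl, k) in candidates.items()
}

# Preferred model under each criterion
best = {
    crit: min(scores, key=lambda m: scores[m][crit])
    for crit in ("AIC", "BIC", "BIC_tp")
}
print(best)
```

With these made-up numbers the AIC prefers the model with one more parameter, whereas the BIC and BIC_tp prefer the more parsimonious one, which illustrates why the criteria are compared with each other rather than used in isolation.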
The percentages of each noise model with the lowest AIC, BIC and BIC_tp values in the raw and denoised residual sequences of the selected GNSS stations (N, E and U) in Blocks 1 and 2 are shown in Table 6. The values of the AIC, BIC and BIC_tp were mostly consistent, thus demonstrating the reliability of the AIC, BIC and BIC_tp criteria used to determine the best noise model. Additionally, based on the BIC_tp criterion, the optimal noise model for the raw GNSS residual sequences in both Blocks 1 and 2 was the WN + FN + RW combination, with a percentage of 46.7% in each block. Furthermore, the percentage of the WN + FN model was 26.7% in the Block 1 region, whereas that of the WN + PL model was 23.3% in the Block 2 region.
For the WD-PCA method, a combination of GGM + WN can be regarded as the optimal noise model for most cases in the selected Blocks 1 and 2 regions, followed by the GGM model. For the EMD-PCA method, a combination of WN + FN + RW can be regarded as the optimal noise model for most cases in the selected Blocks 1 and 2 regions, whereas the GGM model in Block 1 and WN + FN in Block 2 can be regarded as the second-best model. For the proposed method, a combination of WN + GGM can be regarded as the optimal noise model for most cases in the selected Blocks 1 and 2 networks, whereas the GGM model in Block 1 and GGM + WN model in Block 2 can be regarded as the second-best model. One can conclude that the noise components in the raw and denoised GNSS sequences are complex and that applying only one model to all the GNSS positions in the selected GNSS region is unreasonable (Table 6).
The power spectra of the raw and denoised GNSS residual sequences in the three directions at station P242 are shown in Figure A3. No significant period was removed by CEEMD, whereas WD-PCA and EMD-PCA eliminated high- and low-frequency noise, respectively, although only to a limited extent. The proposed method performed better than WD-PCA in eliminating low-frequency noise and better than EMD-PCA in eliminating high-frequency noise. For the high-frequency noise on the right side of the spectrum, both WD-PCA and the proposed method exhibited a favorable denoising effect, indicating that the two approaches can fully utilize the merits of the WD method for eliminating high-frequency noise. For the low-frequency noise in the intermediate section, the proposed method was significantly better than the other two methods for all three components. This indicates that the proposed method fully exploits the merits of CEEMD and WD: CEEMD is first used to obtain the IMFs adaptively, and WD is then applied to the noise-dominant IMFs. For the lower-frequency noise on the left side, the proposed method performed better than the other two methods, particularly for the horizontal components. This indicates that the proposed method fully considers the correlation between the different components of each station and the non-uniform behavior of the CME on a spatial scale.
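For readers unfamiliar with power spectrum analysis, the sketch below computes a simple periodogram of a synthetic daily residual series and locates its dominant period. It is a minimal illustration under stated assumptions: the series (an annual sinusoid plus white noise), the naive discrete Fourier transform and the normalization are chosen for this example and are not the procedure used to produce Figure A3.

```python
import cmath
import math
import random

def periodogram(y, dt, n_freqs):
    """Power at the first n_freqs non-zero Fourier frequencies of y,
    sampled every dt time units, using a naive DFT of the demeaned series."""
    n = len(y)
    mean = sum(y) / n
    centered = [v - mean for v in y]
    freqs, power = [], []
    for k in range(1, n_freqs + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(centered))
        freqs.append(k / (n * dt))             # cycles per time unit
        power.append(abs(coeff) ** 2 * dt / n)
    return freqs, power

random.seed(0)
dt = 1.0 / 365.25   # daily sampling expressed in years
n = 1461            # four years of daily solutions
series = [2.0 * math.sin(2.0 * math.pi * i * dt) + random.gauss(0.0, 1.0)
          for i in range(n)]

freqs, power = periodogram(series, dt, n_freqs=40)
peak = freqs[power.index(max(power))]
print(f"dominant frequency: {peak:.2f} cycles per year")
```

Comparing such spectra before and after denoising shows in which frequency bands a method removes power, which is how the high-, intermediate- and low-frequency behavior described above is assessed.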