*Article* **An Ionospheric Anomaly Monitor Based on the One Class Support Vector Algorithm for the Ground-Based Augmentation System**

**Zhen Gao 1, Kun Fang 2, Yanbo Zhu 1, Zhipeng Wang 1,\* and Kai Guo <sup>3</sup>**


**Abstract:** An ionospheric anomaly is the irregular change of the ionosphere. It may result in potential threats for the ground-based augmentation system (GBAS) supporting the high-level precision approach. To counter the hazardous anomalies caused by the steep gradient in ionospheric delays, customized monitors are equipped in GBAS architectures. A major challenge is to rapidly detect the ionospheric gradient anomaly from environmental noise to meet the safety-critical requirements. A one-class support vector machine (OCSVM)-based monitor is developed to clearly detect ionospheric anomalies and to improve the robust detection speed. An offline-online framework based on the OCSVM is proposed to extract useful information related to anomalous characteristics in the presence of noise. To validate the effectiveness of the proposed framework, the influence of noise is fully considered and analyzed based on synthetic, semi-simulated, and real data from a typical ionospheric anomaly event. Synthetic results show that the OCSVM-based monitor can identify the anomaly that cannot be detected by other commonly-used monitors, such as the CCD-1OF, CCD-2OF and KLD-1OF. Semi-simulation results show that compared with other monitors, the newly proposed monitor can improve the average detection speed by more than 40% and decrease the minimum detectable gradient change rate to 0.002 m/s. Furthermore, in the real ionospheric anomaly event experiment, compared with other monitors, the OCSVM-based monitor can improve the detection speed by 16%. The result indicates that the proposed monitor has encouraging potential to ensure integrity of the GBAS.

**Keywords:** GNSS; GBAS; ionospheric gradient anomaly; one class support vector machine

#### Received: 3 October 2021 Accepted: 26 October 2021 Published: 28 October 2021

**Citation:** Gao, Z.; Fang, K.; Zhu, Y.; Wang, Z.; Guo, K. An Ionospheric Anomaly Monitor Based on the One Class Support Vector Algorithm for the Ground-Based Augmentation System. *Remote Sens.* **2021**, *13*, 4327. https://doi.org/10.3390/rs13214327 Academic Editor: Fabio Giannattasio

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

#### **1. Introduction**

The Ground-Based Augmentation System (GBAS) is a short-baseline, airport-based augmentation of the Global Navigation Satellite Systems (GNSS). It can provide advanced civil-aviation services concerning accuracy, integrity, continuity, and availability. Integrity is one of the key aspects to indicate the safety of precision approach at airports around the world. It refers to the capability to alert when system outputs cannot be trusted. The GBAS ensures integrity in three ways [1], i.e., providing local-area differential corrections and removing common-mode errors; providing the authorized users with customized monitors for rare system faults; allowing users to establish a bound on residual errors to perform navigation operations.

Among all the identified hazards, the severest one is the undetected ionospheric anomaly [2]. Ionospheric delays because of free electrons along the path of the GNSS signal are always uniformly distributed under normal conditions. However, when GNSS signals travel through the disturbed ionosphere, severe extreme errors have been observed, which

potentially compromise integrity of the GBAS. A model is thus developed to predict the maximum errors that GBAS users can suffer from this threat. The threat of ionospheric anomaly in mid-latitude regions is modeled as a spatially linear semi-infinite and wavefront wedge moving at a constant speed. A predigested model of the ionospheric anomaly is shown in Figure 1a [3]. This piecewise and linear curve is parameterized by the slope of the ramp, gradient width and front speed. In the presence of the ionospheric anomaly, the ionospheric delay changes monotonously. The spatial change rate of the ionospheric delay is the ionospheric gradient. It is the "slope" with the dimension of mm/km in Figure 1a measured by ground stations, namely the maximum ionospheric delays (with the dimension of mm) divided by the gradient width (with the dimension of km). We can directly estimate the ionospheric correction residual through the distance between the user and the reference station. The ionospheric gradient is simplified to a constant in small regions, although seasonal variances play a significant role on the dynamics of the ionosphere [4]. This model is consistent with the study of ionospheric delays in the case of ionospheric anomalies in the mid-latitude region. Compared with the ionospheric delay anomaly, an extreme steep gradient in the ionosphere caused by intense ionospheric storms is generally considered as a dominant ionospheric threat. A predigested model of the ionospheric gradient anomaly is shown in Figure 1b.

**Figure 1.** Illustrations of predigested ionospheric anomaly (**a**) and ionospheric gradient anomaly (**b**) models (Figure based on [3]).

Ionospheric gradients can be estimated using code and carrier measurements in practice, including the long-term ionospheric anomaly monitoring and the single-frequency carrier-based and code-aided method [5,6]. Severe extreme gradients have been identified on rare occasions [7]. The largest slant gradient observed in the ionosphere was up to 850 mm/km in Brazil [8], which means measurement errors as large as 850 mm per kilometer were generated between the airborne user and the ground reference station. These errors cannot be accurately corrected using the precise ionospheric model, thus causing inaccurate positioning results and catastrophic consequences for civil aviation. This is unacceptable for the safety-critical GNSS application. An ideal way is to apply a combination of dual-frequency measurements, which can theoretically eliminate the first-order effect of the ionosphere [9,10]. However, this way is not yet available for civil aviation applications, because the correlative GBAS Approach Service Type F (GAST F) is still in the stage of standardization. Therefore, rapid and accurate detection of ionospheric anomalies for single-frequency-based GBAS is still crucial for its performance.

Monitors deployed in the GBAS to detect the ionospheric anomaly mainly refer to the code-carrier divergence (CCD) monitor [11], the dual solution pseudo-range ionospheric gradient monitor [12], and the ground ionospheric gradient monitor [2,13]. Among all the monitors, the CCD monitor is the only one capable of detecting ionospheric gradients observable to both the user and ground subsystem for GAST D. CCD is the anomalous phenomenon caused by the GNSS signal propagation in the disturbed ionosphere, where code phase measurements are delayed and carrier phase measurements are advanced. The CCD monitor is a filter operating on the difference of the code-minus-carrier (CMC) to estimate ionospheric divergences, i.e., the change rate of ionospheric delay over time. If anomalies occur, the divergence increases and can be tested by the customized monitor. The normal ionospheric data with well-defined patterns and the data with unexpected profiles can be separated. In addition, the divergence can also be triggered by potential payloads occurring at the ranging source. Thus, the monitor performs dual tasks, deployed to counter both satellite-induced faults and ionospheric anomalies by detecting divergences.

Two kinds of traditional moving-average-based filters are implemented in the GBAS to monitor the divergence [14,15], i.e., one first-order low-pass auto-regressive and moving average (ARMA) filter, named CCD-1OF, and two first-order cascade ARMA filters, named CCD-2OF. Generally, the CCD-2OF outperforms the CCD-1OF in monitoring divergences due to the smaller smoothing constant. However, both monitors cannot detect divergence rapidly and suffer from a serious delay [16]. These monitors are statistical based, which extract information in the form of simple one-dimensional time series. We cannot understand which process can generate such phase time series. Additionally, the inherent averaging characteristics of filters lead to the weak anomaly-related components being easily overwhelmed by normal noise. Consequently, it is often difficult for traditional divergence monitors to quickly detect the ionospheric anomaly. These monitors even fail to alert in noisy environment. These limitations increase the probability of missed detection, thus leading to misleading information [1]. Therefore, high-level precision approaching requirements are difficult to meet. Developing a novel monitor is highly necessary for future GBAS constructions.

To improve detection performance, a variety of optimization strategies have been studied. Instances of creative research include the monitors based on generalized least squares [17], a two-step method based on an adaptive Kalman filter [3], and a Kullback– Leibler divergence metric [18]. These approaches prove the feasibility of capturing abnormal ionospheric behaviors. However, further work needs to be carried out to overcome the shortcomings, such as the insufficient estimation accuracy, low generalization ability caused by miscellaneous assumptions and poor performance under weak noise.

To overcome these issues, an option is to exploit machine learning algorithms rather than purely statistical-based algorithms. The support vector machine (SVM) is a supervised algorithm designed for binary classification problems [19]. It is famous and appealing, as it can simplify the complex task of classification in an explainable way. It has been extensively utilized in the GNSS community such as ionospheric scintillation detection [20], GBAS availability prediction [21], and high precision positioning [22]. A one class support vector machine (OCSVM) algorithm extends the SVM algorithm to solve a single classification problem [23]. It learns the underlying characteristics of the existing normal samples to judge whether the new samples come from this distribution, thus well suitable for anomaly detection. It has complete mathematical derivation and a meaningful interpretation. As anomalies occur rarely, it is reasonable to assume that they only occur in the tails of normal distributions, thus the goal of anomaly detection is to estimate the density level sets of the normal distributions. This assumption is typical for the OCSVM algorithm. When only a few anomalies occur and the knowledge of new information is limited or unavailable, the OCSVM is suitable, as it does not require any anomalous data. The OCSVM has been found successful in accurately sensing the abnormal information in many domains, such as materials science [24], computer vision [25], power system [26], and agriculture [27]. However, it has not been applied in the GBAS. It is of great significance to investigate the feasibility of using the OCSVM for ionospheric anomaly detection in the GBAS.

The remaining sections are arranged as follows. Section 2 introduces the principle of the OCSVM, followed by the methodology proposed in this work, i.e., an offline-online ionospheric anomaly monitor based on the OCSVM. In Section 3, the synthetic, semisimulated and real anomalous events are discussed in detail. Finally, conclusions and perspectives of this work are summarized in Section 4.

#### **2. Ionospheric Anomalies Detection Based on the OCSVM**

As previously described, a new monitor is required to meet the requirements of the high-level precision approach. The major goal is to perform ionospheric anomaly detection in a time-efficient way. To achieve this purpose, we present an analysis of the statistics of OCSVM metrics and establish a framework of the monitor in this section.

#### *2.1. The OCSVM Algorithm*

The principles of the OCSVM are described in this section. As a special form of the SVM, the OCSVM is a domain-based detection algorithm and can tackle the one-class classification problem. It learns the training set with identical labels and determines whether the samples from the new test set are subordinate to the training set [23].

Suppose that we have vectors **X** = {**x***i*, *i* = 1, ... , *K*}. *K* is the number of vectors in training sets and **x***<sup>i</sup>* comes from the input space *R*. *φ*(**x***i*) represents the projection of the original training sample **x***<sup>i</sup>* in *R* to the huge dimensional feature space *F*. A hyper-plane *f* is used to separate the projected vectors from the original which is assigned as the only anomaly in *F*. The hyper-plane is given by

$$f(\mathbf{x}) = \boldsymbol{\omega}^T \boldsymbol{\phi}(\mathbf{x}) - \boldsymbol{\rho} \tag{1}$$

where *ω* and *ρ* are the normal phase vectors and compensation values of the hyperplane in *F*, respectively. These parameters can be calculated by the following quadratic programming issues

$$\begin{array}{ll}\underset{\omega,\rho,\xi}{\min} & \frac{1}{2}||\omega||^{2} + \frac{1}{\gamma\mathcal{K}}\sum\_{i=1}^{K}\xi\_{i} - \rho\\ & \omega \cdot \phi(\mathbf{x}\_{i}) \ge \rho - \xi\_{i}\\ & \xi\_{i} \ge 0, i = 1,\ldots,K\end{array} \tag{2}$$

where *γ* ∈ (0, 1] is the trade-off constant value used to determine the proportion of the normal and anomalous data. *ξ<sup>i</sup>* is the slack variable, referring to the extent where the samples are misclassified. Δ denotes the 2-norm of the vector Δ. The Lagrange factor [α*1*, ... , α*K*] *<sup>T</sup>* is introduced for each vector **x***<sup>i</sup>* to obtain the dual problem of (2). Solving the dual problem gives rise to

$$
\omega = \sum\_{i=1}^{K} \alpha\_i \phi(\mathbf{x}\_i) \tag{3}
$$

where 0 ≤ *α<sup>i</sup>* ≤ 1/*γK*. The famous kernel trick can replace the inner product of two vectors in *F* with a kernel operation in *R*. This projection transforms a non-linearly separable problem into a linearly separable problem. Thus (1) combined with (3) becomes a function as follows

$$f(\mathbf{x}) = \sum\_{i=1}^{K} \alpha\_i G(\mathbf{x}\_i, \mathbf{x}) - \rho \tag{4}$$

where G(**x***i*,**x**) is the inner product of two projecting functions, i.e., *G*(**x***i*, **x**) = (*φ*(**x***i*), *φ*(**x**)). A popular radial basis function (RBF) kernel function is utilized here to expand vectors into the high dimension. The RBF is given by

$$G(\mathbf{x}\_i, \mathbf{x}) = \exp\left(-\frac{\left\|\mathbf{x}\_i - \mathbf{x}\right\|^2}{\sigma}\right) \tag{5}$$

where *σ* is the hyper-parameter controlling the width of the kernel. The kernel provides the OCSVM the ability to detect abnormal samples in the high dimension feature space *F*. Moreover, it can be proved that only for the vector on the boundary, namely the support vector, the non-zero α imposes effects on the construction of the boundary.

Based on the derivation above, the OCSVM can be constructed and used to detect new samples. If the new samples fit well with the trained model, they are labeled as normal samples. Otherwise, they are adjudged as anomaly samples. *f*(**x**) serves as a score function to determine the classification quantitatively. A vector x is classified to be normal with *f*(**x**)>0 and anomalous with *f*(**x**) < 0. The smaller the *f*(**x**) is, the more abnormal the vector is viewed.

#### *2.2. Ionospheric Anomaly Detection with the OCSVM-Based Monitor*

This section applies the OCSVM algorithm to the monitor. Based on the analysis above, a new offline-modeling-online-monitoring monitor is proposed. It has the potential to meet the high-level approaching requirements. Figure 2 shows the overall schematic diagram of the proposed integrity monitoring algorithm illustrating the process. Statistical information can be extracted to characterize the normal states offline. A classifier can then be trained to detect the potential ionospheric anomaly. The GBAS can monitor the ionospheric anomaly more effectively through the OCSVM-based detection strategy. Detailed analyses are given next.

**Figure 2.** Overall schematic diagram of the ionospheric anomaly monitor based on the OCSVM.

In data collection, we collected both the code-phase and carrier-phase measurements. To eliminate the same item in the code and carrier phase measurements and emphasize the differences, a CMC is introduced, given by

$$\text{CMC} = P - \Phi = 2I + N\lambda + M \tag{6}$$

where *P* and *Φ* represent the code-phase measurement and carrier-phase measurement between the receivers and satellites, respectively. *I* is the ionospheric delay. *N* is the ambiguity of whole cycles. λ is the carrier wavelength. *M* includes noise including multipath effects, thermal noise and other error sources.

Before the introduction of the monitor design, it is necessary to understand which original measurements can be dealt with. A pre-processing is performed to process GNSS raw measurements and to perform various corrections to get the true CMC as in [28]. Recall that our aim is to detect the ionospheric anomaly and separate it from the other errors. The first term in the right side of (6) is what we are interested in. The presence of the double ionospheric delay errors is induced by the dispersive nature of the ionosphere. Although ionospheric delays are time-variant during normal ionospheric days, we can use a neural network and single-frequency GNSS signals to estimate the delays [29]. Therefore, the delays during normal ionospheric days can be considered as a time-variant but known bias. The second term in the right side of (6) is the ambiguity term. It does not directly affect the results when cycle slips are absent in measurements. It is always fixed or easily repaired under the circumstance of relatively weak ionospheric perturbations using the Kalman filter [30]. Therefore, the cycle-relevant term is treated as a pure bias. The third term in the right side of (6) indicates ground-related noise. The noise cannot be simply modeled by a zero-mean Gaussian distribution. It is considered the major obstacle in the ionospheric anomaly detection, as the ionospheric anomaly-related components are easily overwhelmed. Based on the above analysis, we understand what we can cope with before stepping into the monitor design. Compared with traditional CCD monitors, we focus on the original CMC instead of the difference of the CMC. Namely, we concern about the accumulation errors induced by the ionospheric gradient anomalies over a period of time.

Due to the fact that only vector inputs can be imported into the OCSVM, onedimension time series cannot be applied directly [26]. Therefore, the CMC sequence must be converted to a high-dimension vector first. We need to ensure that we can construct a sound mathematical model from a single time series. The method of nonlinear analysis can be used, provided that the CMC is a dynamic, chaotic, and irregular sequence. The CMC is assumed as an observable variable of a multivariate system. Additional knowledge obtained from the collection of real data is required to establish the system. The state phase space of the original system can be reconstructed by the time-delay embedding to maintain the system dynamics. Specifically, let CMC1, ... , CMC*<sup>k</sup>* be a time sequence, it can be unfolded into the phase space *<sup>Q</sup>*, where *<sup>Q</sup>* ⊆ *<sup>R</sup><sup>L</sup>* and *<sup>L</sup>* refers to embedding dimension (ED). We set the embedding delay as the sampling interval to consider more information [31]. The reconstructed vector can be then obtained by

$$\text{CMC}\_{L,k} = \begin{bmatrix} \text{CMC}\_{k-L+1} \text{ CMC}\_{k-L+2} \dots \text{ CMC}\_k \end{bmatrix} \tag{7}$$

where CMC*<sup>k</sup>* ⊆ *Q*. Thus, a one-dimensional time series can be transferred to a string of vector *VL*(*N*), given by

$$V\_L(N) = \left\{ \text{CMC}\_{L,k}, k = L, \dots, N \right\} \tag{8}$$

A continuous track formed by vectors connected by lines in the reconstructed phase space is called a phase diagram, whose geometry reflects the characteristics of unknown systems. The reconstructed phase space is topologically identical to the original system, provided that the embedding dimension is sufficiently large.

After converting the time series to a set of vectors in the phase space, the OCSVM can be used to monitor the ionospheric anomaly. We can model the collected data offline and monitor the real-time data online [32]. An offline modeling needs to be built by training tests. We first pay attention to the parameter selection. The RBF is chosen as the kernel function of the OCSVM. The selection of the hyper-parameter *σ* affects the final detection results. A larger *σ* results in a greater generalization capability but a lower discernibility and vice versa. Note that both *σ* and *L* affect the final test results. They will be discussed in the next section. Because the probability of serious ionospheric anomaly in the actual environment is extraordinarily low, it is advisable to set the trade-off parameter *γ* to 1, indicating that there is no outlier in the training set. Thus, the OCSVM model can be obtained by the offline data. Additionally, the OCSVM operating on the RBF kernel performs poorly with the number of samples in training sets. It does not perform well in the case of big data because of the increased computational load, while hardware auxiliaries can be used to overcome this difficulty [33]. Considering the actual situation, the offline OCSVM is feasible to be trained quickly. At the same time, the offline modeling part is not required to be real-time, and the appropriate computational load is acceptable in practice.

Once a model reflecting the normal state is established, detecting any departure from the normal operation is vital. Because of the complexity of the real environment, we cannot directly determine whether the new sample is an ionospheric anomaly by the sign of the score value. A threshold is set to determine the decision result. A validation test is required to predict new scores. It acts as cross-validation to judge the extent of how the trained OCSVM matches the underlying distribution. Then, the threshold needs to be obtained by the scores. Because the distribution of the OCSVM scores does not follow the Gaussian distribution, the traditional Gaussian overbounding to construct a threshold is a little conservative. Instead, we use non-parametric methods to estimate the threshold. A famous kernel density estimation (KDE) is employed to approach the probability density function in a non-parametric approach [34]. A simple Gaussian kernel is chosen in this work. Since KDE is to estimate the values of unknown random variables

from known samples, it is reasonable to assume that the weight of known samples on unknown variables decreases as a Gaussian distribution gradually for the central limit theorem. The Gaussian kernel is widely employed to estimate the underlying distribution of the scores [35]. Note that different from the traditional monitors, the OCSVM monitor uses test statistics to describe the degree of misclassification. When the value is very large, the distribution of the test sample is very close to the real sample, and there is no abnormality. The ionospheric anomaly occurs only when the statistic is negative and its absolute value is large. Therefore, the threshold is unilateral. Then the threshold *TKDE* can be determined as the unilaterally lower bound of the estimated cumulative distribution function with respect to the corresponding false-alarm risk. As a result, the real threshold *T* is the minimum of zero and *TKDE*. For real samples, the errors are strongly correlated with the elevation angle due to the multipath, thus the threshold is determined related to the elevation bin [36]. It is also noted that because the ambiguity term is included, each satellite channel in each elevation bin needs to set a different threshold, which makes a higher request for the GNSS signal channel.

For the online-monitoring part, the parameter is the same as the offline model. The calculated test statistic is the score function as *D*. This statistic is compared with the threshold values calculated offline to find whether the sample patterns deviate from the majority. If the test statistics are less than the detection threshold *T*, timely warnings are provided and broadcast. Note that the OCSVM only parse the information in normal conditions without any presupposition for any ionospheric anomaly points, thus any ionospheric anomaly causing CMC changes will be detected. Although this unpredictable and unclassified fault mode may cause troubles in areas with high fault probabilities, any threat identified should be informed for the GBAS. Moreover, the linear characteristics of the online-monitoring ensure that the computational cost is comparable to that of the traditional monitors.

The OCSVM-based monitor has three benefits in monitoring ionospheric anomalies: (1) Using the phase-space reconstruction technique, one-dimensional CMC time series can be transformed into high-dimensional vectors. The full dynamics of the CMC system accessible in the phase space can be passed through the OCSVM; (2) The OCSVM is a domain-based detection algorithm that can distinguish between normal and anomaly classes with a redefined boundary or "domain" as the test statistics. It offers better flexibility compared with the Gaussian statistics; (3) The OCSVM has certain robustness to noise and generalization ability to ionospheric anomalies. Therefore, superior performance can be expected with the proposed OCSVM-based monitor to exhibit high-level precision approach.

#### **3. Experiment Analysis**

In this section, three examples are illustrated to show the anomalous monitoring capability of the proposed OCSVM-based monitor. In the first example, synthetic data are used to simulate the ionospheric gradient-free and non-negligible ionospheric gradient anomaly CMC, respectively. In the second example, we verified performance of ionospheric anomaly detection with semi-simulation data. The raw BeiDou Navigation Satellite System (BDS) data collected at Dongying Shengli Airport with an artificial anomaly are used to demonstrate the effectiveness of the proposed monitor. In the third example, a real anomaly event are taken into account to evaluate the performance under real circumstance. The OCSVM-based monitor is compared with traditional CCD monitors and KLD-1OF monitor with respect to the ionospheric anomaly detection performance.

#### *3.1. Ionospheric Anomaly Detection with Synthetic Data*

The aim of the synthetic data analysis is to show the effect of the different noise levels on the required time of detecting anomalous events, which cannot be realized using real data. Because the CMC depends solely on the difference in code and carrier phase rather than individual values in the single-frequency mode, code phase and carrier phase anomalies are equivalent [10]. Thus, only one fault mode analysis is required. To demonstrate the ionospheric anomaly detection performance, 2000 samples are generated to simulate the gradient changes induced by the ionosphere. The first 1000 samples are ionospheric gradient-free. To simulate the ionospheric anomaly, a constant steep gradient change, as shown in Figure 1, is added. The ramp-type change starts at the 1001st sample at a velocity of 0.01 m per epoch [3,18]. The parameters of the samples are written as

$$I\_k = \begin{cases} 4 + n\_k & 0 < k \le 1000 \\ 4 + 0.01(k - 1000) + n\_k & 1000 < k \le 2000 \end{cases} \tag{9}$$

where *nk* is the independent measured noise at the *k*th epoch. Because the noise follows a heavy-tailed distribution, we choose distributions that can characterize upper and lower bounds of the real distribution to conduct the simulation. The tail distribution of real noise lies between Gaussian and Laplacian distributions [36]. Thus, the noise is assumed to be zero-mean, and it includes components following the Gaussian distribution with the variance ranging from 0.1 to 2.5, and components following the Laplacian distribution with the scale ranging from 0.1 to 2.5, respectively [3,18]. We choose them to simulate the noise of different levels. When the variance and scale are 0.1, the noise is weak. With the increase of the variance and scale, the noise amplitude in the simulated data increases gradually. When the variance and scale approach 2.5, the noise is strong enough to conservatively simulate hostile environment [37]. According to our experience, noise beyond this range seldom appears in the real environment. Figure 3 shows the variation of the simulated data amplitude when both the variance of Gaussian noise and the scale of the Laplacian noise are set to 0.5 and 2.5, respectively.

**Figure 3.** Variation of the amplitude of synthetic data simulating ionospheric anomalies with Gaussian noise (**a**) and Laplacian noise (**b**). The ionospheric gradient anomalies are simulated from the 1001st epoch by adding ramp-type errors. The absolute maximum values of noise in (**a**) are 1.7 m (variance = 0.5) and 6.9 m (variance = 2.5). The absolute maximum values in (**b**) are 1.9 m (scale = 0.5) and 9.6 m (scale = 2.5).

To detect whether anomalies occur, (9) is first reconstructed to the form of (7). Figure 4 takes the *L* = 3 as an example to show the constructed phase state of Gaussian and Laplacian noise. The distributions in the constructed phase state are spindle-shaped, indicating that the distributions of normal and anomalous vectors are different. The thick middle part where points are brought together represents normal vectors, while the sharp head part where points spread far away represents abnormal vectors. These two parts are easily separated.

**Figure 4.** Vector sets of the time series in a phase space (*L* = 3) under (**a**) Gaussian noise and (**b**) Laplacian noise. Both distributions are spindle-shaped, indicating that ionospheric anomalies, denoted by the sharp head part, can be easily distinguished from the normal points denoted by the thick middle part.

The first 500 epochs of the time sequences are considered as the training set to construct the OCSVM, while the 501st to 1000th epochs are the validation set to obtain the threshold [3,18]. Figure 5 is the visualization of the 3D trained OCSVM decision bound and the corresponding support vectors in the input space. The vectors are constructed under Gaussian noise with the variance set to 2.5 in Figure 5a, and under Laplacian noise with the scale set to 2.5 in Figure 5b. The red plane is the 3D decision boundary. The corresponding hyperplane maximizes the distance between the original points and vectors in the high-dimensional space. Thus, the plane separates the normal vectors from the anomaly ones. The case when the variance and scale parameters are 0.5 is basically the same as that of 2.5, thus it is not shown.

During the monitoring process, the parameter determination of the dimension and the RBF kernel is important for the determination of an appropriate OCSVM-based monitor. It is noticeable that both of them have an impact on the OCSVM-based monitor; thus, they need to be estimated simultaneously. Optimal parameters are chosen to get the lowest average detection time under different levels of noise. By adjusting the amplitude of noise from 0.5 to 2.5 at a step of 0.5 to characterize the main changes in noise [18], the corresponding parameter is obtained. Taking Gaussian noise as an example, the average detection time with different dimensions and kernel scales is shown in Figure 6. There is an obvious variation of the detection time with the increase of the dimension *L*. The minimum average time of 86 epochs occurs when *L* = 10. As for the RBF kernel scale *σ*, a larger scale generally results in a lower average time. When it reaches a critical point, the average detection time tends to converge and maintains unchanged even keeping increasing the kernel scale continuously. When *L* = 10, the critical point appears when *σ* is 20, which is considered the optimal parameter of the RBF kernel scale. As a result, *σ* = 20 and *L* = 10 are chosen as the optimal values. A similar method is used to determine the optimal parameter under the Laplacian noise. In this case, *σ* = 18 and *L* = 60 are chosen as the optimal values.

**Figure 5.** Trained OCSVM and the vectors in an input phase space (*L* = 3) with (**a**) Gaussian noise and (**b**) Laplacian noise. Red plane is the 3D contour plane.

**Figure 6.** Variation of the average detection time in relation to *L* and σ with synthetic data. The minimum time of 86 epochs occurs when *σ* = 20 and *L* = 10.

To compare the ionospheric anomaly detection performance, three monitors are used as the control group, i.e., the traditional CCD-1OF with *τ<sup>1</sup>* = 100 epochs [14], the traditional CCD-2OF with *τ<sup>1</sup>* = *τ<sup>2</sup>* = 30 epochs [15], the Kullback-Leibler metric-based monitor KLD-1OF with *τ* = 150 epochs and *L* = 50 [18]. The proposed OCSVM-based monitor is selected with the parameters and threshold estimated through the process described previously. The probability of fault detection allocated for the monitor is 10<sup>−</sup>8.

Figure 7 illustrates the test statistics and the corresponding thresholds of the CCD-1OF, CCD-2OF, KLD-1OF and the OCSVM-based monitors, under Gaussian noise with the variance set as 0.5 and 2.5, respectively. In the presence of weak noise when the noise variance is set to 0.5 in Figure 7a, the test statistic of the OCSVM-based monitor takes

the shortest time to cross the threshold, thus the OCSVM-based monitor can perform faster than others. In the presence of strong noise when the noise variance is set to 2.5 in Figure 7b, traditional CCD monitors cannot detect any divergence, thus the detection of the ionospheric anomaly is missed. Therefore, such noise, which may be significant for high-level precision approach, can be missed without timely alarming with CCD-1OF and CCD-2OF. Although the KLD-1OF can also detect anomalies, the OCSVM-based monitor outperforms it in the detection speed.

**Figure 7.** Variation of the test statistics and thresholds of the CCD-1OF, CCD-2OF, KLD-1OF and the proposed OCSVM-based monitors under Gaussian noise, with the noise variance set to 0.5 in (**a**) and 2.5 in (**b**). If the ionospheric anomaly is successfully detected, the detection time is included. Note that the ionospheric anomaly is inserted from the 1000th epoch.

Figure 8 illustrates the test statistics and the corresponding thresholds under Laplacian noise with the scale set to 0.5 and 2.5, respectively. When the amplitude of the noise is small, the OCSVM-based monitor outperforms all the other monitors. When the noise variance is 2.5, only the test statistic of the OCSVM-based monitor crosses the threshold, which means that only this monitor successfully detects anomalies. This result again indicates the effectiveness of the OCSVM-based monitor under all noise levels.

**Figure 8.** Variation of the test statistics and thresholds of the CCD-1OF, CCD-2OF, KLD-1OF and the proposed OCSVM-based monitors under Laplacian noise, with the noise variance set to 0.5 in (**a**) and 2.5 in (**b**). If the ionospheric anomaly is successfully detected, the detection time is included. Note that the ionospheric anomaly is inserted from the 1000th epoch.

Figure 9 presents the average detection time of the four monitors in the presence of varied noise from 50 repeated tests. The results are not shown when monitors fail to detect any ionospheric anomaly. The proposed OCSVM-based monitor is always faster than the others in detecting the ionospheric anomaly. When the noise amplitude is getting larger, the CCD-1OF and CCD-2OF monitors fail to sense the anomaly and the KLD-1OF detects the anomaly at a lower speed. Only the OCSVM-based monitor can detect the anomaly at a relatively high speed. In conclusion, the OCSVM-based monitor is more suitable for ionospheric anomalous detection compared with the traditional CCD monitors in the GBAS regardless of the noise levels and types.

**Figure 9.** Variations of the ionospheric anomaly time in relation to Gaussian (**a**) and Laplacian (**b**) noise levels when the CCD-1OF, CCD-2OF, KLD-1OF and OCSVM-based monitors are applied. Ionospheric anomalies that cannot be detected are not shown in the figure.

#### *3.2. Ionospheric Anomaly Detection with Semi-Simulation Data*

To further evaluate the performance of the proposed ionospheric anomaly monitor, the real observed data are processed. In this work, the real data are collected from the GBAS deployed at the Dongying Airport in Shandong, China [38]. The B1I signals of BDS-2 are captured by four reference receivers, each equipped with the same choke ring antenna as shown in Figure 10. The BDS code and carrier phases measurements are recorded continuously at a sampling rate of 1 Hz. Previous literature shows that one day's data is sufficient to construct a model to perform an evaluation of the detection time [3,5,18]. The antenna of this receiver is installed beside the runway of the airport. The elevation mask is set to 15◦. Thus, 24-h data collected by the 1st receiver on 2 September 2020 are used in this analysis. In Figure 11, the ground-related ranging noise of PRN 7 is drawn as an example. It is obtained as shown in [29]. The amplitude lies between −1.5 m and 1.5 m. The data are then binned at an interval of 5◦ to illustrate the performance of monitors under different noise levels. In addition, because the probability of ionospheric gradient anomaly in practice is very small, which is in the magnitude of about 10−7, we artificially add the simulated anomalies rather than process the real anomalies. As the airport is located in the mid-latitude region, we use the ramp-type error model to simulate the anomaly as shown in Figure 1.

**Figure 10.** BDS GBAS antenna mounted on the ground near the runway at the Dongying Airport in Shandong, China.

**Figure 11.** Processed BDS ranging noise of PRN 7.

To test the performance of the OCSVM-based monitor, a parameter determination process for real data is required first, which is the same as that for the synthetic data. *L* and σ are estimated in the light of the average detection time of the satellite PRN 7 with the ionospheric gradient changing at a rate of 0.018 m/s [3,15]. The satellite is chosen as it covers a larger range of elevations. According to Figure 12, values of *L* = 20 and *σ* = 9.6 are chosen as the optimal parameters.

**Figure 12.** Variation of the average detection time for determining the optimal *L* and *σ* with real data. The minimum time of 14.75 s occurs when *σ* = 9.6 and *L* = 20.

After determining the parameters of the proposed monitor, the threshold in every elevation bin is calculated with the similar method described in Section 2.2. Figure 13 shows the estimated metric and the determined thresholds as a function of elevations.

**Figure 13.** Variation of the metric and threshold of the proposed OCSVM-based monitor in terms of satellite elevations.

The impact of severe ionospheric gradients is then simulated. Two different ionospheric gradients, with the change rate respectively set to 0.01 and 0.02 m/s, are inserted to illustrate the effectiveness of the proposed monitor. The duration of the inserted gradients is 200 s. The experiment is repeated 50 times to compare the stability of different monitors. Figure 14 shows the average detection time in each elevation bin when the four different monitors are implemented in Figure 14a,c. Standard deviations of the detection time are also shown in Figure 14b,d. A greater deviation indicates a lower stability of the detection performance. Because the signal at a lower elevation angle tends to suffer more noise, such as multipath, the detection time decreases as the elevation increases. The standard deviation at lower elevation bin is missing in the figure due to the detection failure. Compared with the other three monitors, the OCSVM-based monitor has the shortest detection time and the greatest detection speed. Especially, compared with the KLD-1OF, the OCSVMbased monitor can decrease the average detection time respectively by about 55% and 44% when the change rate is 0.01 m/s and 0.02 m/s. Additionally, the CCD-2OF and KLD-1OF perform more steadily than the OCSVM-based monitor. The reason is that the averaging process is absent in the OCSVM-based monitor, thus the OCSVM-based monitor has more instabilities in detecting the ionospheric anomalies. However, the shortest detection time of the traditional monitor is mostly greater than the longest detection time of the OCSVMbased monitor. Thus, the instability of the OCSVM-based monitor can be tolerated. In addition, there is no false alarm event in the simulation.

Because the GBAS involves the field of life safety, small ionospheric anomalies are non-negligible for the reliable navigation [18]. Being able to detect small gradient changes, such as at a rate of 0.001m/s is also a clear advantage compared with the traditional monitors. The capabilities of the four different monitors are compared for detecting the ionospheric anomaly occurring on PRN 7. The detection capability refers to the minimum gradient change rate that can be detected during the ionospheric anomaly occurrence. In this example, the rate of the gradient change is varied from 0.001 to 0.025 m/s. Figure 15 shows the minimum gradient changes that can be detected for the four monitors. As it can be seen, the OCSVM-based monitor presents a better detection capability than the other monitors. When it is applied, the minimum gradient changes detected are lower than

0.005 m/s over all elevations. Thus, the OCSVM-based monitor can reduce the integrity risk by alarming quickly and accurately.

**Figure 14.** Comparison of the average and the standard deviations of the detection time when applying the CCD-1OF, CCD-2OF, KLD-1OF and the proposed OCSVM-based monitors to detect the ionospheric anomaly on PRN7. The ionospheric gradients, with the change rate set to 0.01 m/s in (**a**,**b**) and 0.02 m/s in (**c**,**d**), are inserted.

To further verify the effectiveness of the OCSVM-based monitor, all visible satellites are considered in the ionospheric anomaly detection. The ionospheric gradient change, with the rate of 0.02 m/s, is added to all the satellites. The detection time of ionospheric anomaly on each satellite in each elevation bin is counted through 50 experiments when the traditional and proposed monitors are implemented. The average and the standard deviation of the detection time in bin are calculated. Additionally, because the signal qualities of satellites are different, the signals with large noise may be excluded when using the traditional monitors [12]. By contrast, the signal quality does not affect the OCSVM-based monitor, since the OCSVM-based monitor can detect ionospheric anomalies in the environment with large noise. The detection time is shown in Figure 16. It can be seen that the detection time maintains the lowest when the OCSVM-based monitor is used. The OCSVM-based monitor decreases the average detection time by about 43% compared with the KLD-1OF monitor. When the elevations are high, the CCD-2OF and KLD-1OF monitors slightly outperform in terms of the standard deviations, i.e., the stability. However, considering the significant contribution of the OCSVM-based monitor to the detection speed, it is acceptable with more instability for the proposed monitor. It can be seen that the test results of all visible satellites are quite similar to the results of PRN7 alone, regardless of the average or the standard deviation of the detection time. This proves that the OCSVM-based monitor performs well under all circumstances. Therefore, the proposed monitor can help to ensure the integrity of the GBAS by accelerating the detection. Thus, the proposed monitor can help to ensure integrity of the GBAS by accelerating the detection.

**Figure 15.** Variations of the minimum gradient changes that can be detected by different monitors in terms of elevations.

**Figure 16.** Comparison of the average (**a**) and the standard deviations (**b**) of the detection time of all BDS-2 satellites in relation to satellite elevations using the CCD-1OF, CCD-2OF, KLD-1OF and the proposed OCSVM-based monitors. The ionospheric gradient change with a rate of 0.02 m/s is added to all visible satellites.

#### *3.3. Real Ionospheric Anomaly Event Experiment*

The performance of the monitors is evaluated in the presence of real ionospheric anomaly in this section. Global Positioning System (GPS) data on L1 measurements collected from the Crustal Movement Observation Network of China from 2011 to 2014 were processed in [39]. During this period, the largest detected gradient as 71 mm/km in northern China occurred on 23 May 2012, which was identified using the data from two stations, i.e., BJSH (Shisanling, Beijing, 40.3◦ N, 116.2◦ E) and BJYQ (Yanqing, Beijing, 40.4◦ N, 116.0◦ E), located in middle latitude region with a baseline of 25.5 km. The satellite affected by the ionospheric anomaly was PRN 10. The geomagnetic storm class of this day is moderate (Kp = 4.7, Dst = −30 nT). When the GPS signal passed through the ionosphere, it was affected by the irregular geomagnetic storm, resulting in an ionospheric anomaly on this day [40].

The ionospheric slant delays for PRN 10 captured at the BJSH and BJYQ stations from 15 to 21 Universal Time (UT) are shown in the Figure 17a. It can be seen that ionospheric delay calculated at the two stations shares a similar tendency, while it is slightly larger from 15.6 to 15.7 UT at BJSH. This difference is caused by a spatial ionospheric gradient. Figure 17b shows the gradient levels measured along the BJSH-BJYQ baseline. The maximum value of 71 mm/km occurs at 15.7 UT. This spatial gradient is not as extreme as those reported in [7], but it is still a severe threat for the safety of civil aviation. Figure 17c shows the elevation of PRN 10 viewed at BJSH and BJYQ stations. It can be seen that the anomaly occurred when the satellite was about 30◦. Then we pay attention to the measurements at BJSH station. In Figure 17d, the observation ranging noise and ionospheric delay of PRN 10 calculated at BJSH station are shown. During the period of ionospheric anomalies lasting for 420 s, the anomalous ionospheric gradient increased by about 2.1 m; thus, the average change rate is approximately 0.005 m/s. Additionally, the predicted normal ionospheric delay is opposite to the actual abnormal ionospheric delay [29], so the net ionospheric delay change rate exceeds 0.005 m/s and is up to a constant as 0.008 m/s.

**Figure 17.** A typical ionospheric anomaly caused by geomagnetic storms in Beijing, China on 23 May 2012. Panels are: (**a**) The ionospheric delay measured on GPS PRN 10 at BJSH and BJYQ stations; (**b**) the ionospheric gradient calculated along the baseline between the two stations; (**c**) the satellite elevations of PRN 10; (**d**) the observation ranging errors of PRN 10 at BJSH station. The amplitude of noise is briefly limited to 1.5 m.

Under the circumstance of the ionospheric anomaly shown in Figure 17, the performance of the proposed ionospheric anomaly monitor is evaluated. Parameters of traditional CCD monitors are the same as that presented in Section 3.2. *τ* = 30 epochs and *L* = 25 are chosen for KLD-1OF [18], and *L* = 21 and *σ* = 15 are chosen for the OCSVM-based monitor. Due to the fact that the time of anomalies is only 420 s, the threshold is set to a fixed value during this period. Figure 18 illustrates the test statistics and the corresponding thresholds of four monitors. The CCD-1OF monitor fails to detect the ionospheric anomaly in this real case. On the other hand, the CCD-2OF, KLD-1OF and the OCSVM-based monitors can detect the ionospheric anomalies, while the OCSVM-based monitor is at a higher speed. Compared with the KLD-1OF monitor, the detection time can be decreased by 16%. There are two possible reasons to explain why the improvement in the real event is not as obvious as that in the simulation data. First, there may be cycle slips that are not fully repaired, resulting in slightly different distributions between the training set and test set. In addition, the distribution of noise in the irregular environment may be more complex than that of normal noise. However, in any case, the OCSVM-based monitor still performs well in detecting the real anomalous ionospheric gradient.

**Figure 18.** Variation of the test statistics and thresholds of the CCD-1OF (**a**), CCD-2OF (**b**), KLD-1OF (**c**) and the proposed OCSVM-based (**d**) monitors in the presence of an real ionospheric anomaly event. If the ionospheric anomaly is successfully detected, the detection time is included. Note that the ionospheric anomaly starts at the 1000th epoch.

#### **4. Conclusions and Perspectives**

Anomalous ionospheric gradient is a challenging risk source during the high-level precision approach in the GBAS. Several traditional monitors such as CCD-1OF, CCD-2OF, KLD-1OF have been proposed to detect the ionospheric anomalies based on the statistics in terms of the divergence. However, they do not guarantee a rapid identification or a prompt warning to users in noisy environment. This issue may dissatisfy the requirements of the precision approach and result in potential risks.

To overcome this issue, a novel OCSVM-based monitor was proposed in this work to more accurately and rapidly detect ionospheric anomalies in the GBAS. By combining phase-space reconstitution, the one-dimensional CMC time series can be transformed into high-dimensional vectors. Then, the trained OCSVM model was determined using the normal vectors offline to effectively extract the characteristics of the normal measurements and the threshold. Then, current scores can be obtained on the basis of the online measurements passing through the trained model. The distinction between the online score and the offline threshold was sensed to determine whether the ionospheric anomalies occurred. On the basis of the synthetic and BDS real data, the detection speed for the single-frequency GBAS under different levels of noise was analyzed. In addition, the performance evaluation based on real ionospheric anomalous event environment is also carried out. The proposed monitor showed superiority over the CCD-1OF, CCD-2OF and

KLD-1OF monitors regardless of the noise levels. Results showed that the efficiency can be increased by more than 40% compared with the KLD-1OF monitor. Additionally, the OCSVM-based monitor mainly has the ability to detect small anomalies. In real abnormal environment, the OCSVM-based monitor performs better than the KLD-1OF monitor for a reduction of detection time by 16%. As a result, the OCSVM-based monitor can provide real-time protection against the ionospheric anomaly, which provides a better safeguard for future civil aviation.

Follow-up work can be carried out in three ways. First, the OCSVM-based monitor can only be utilized when the cycle slip is absent. Thus, this monitor is used under the implicit assumption that the ionospheric disturbance is slight. Under strong ionospheric disturbance, the possibility to improve the OCSVM-based monitor ability will be investigated in future work. Second, limited by hardware, only the ionospheric anomaly monitoring capability at Dongying airport is tested. Further verification of different scenarios is necessary before the OCSVM algorithm can be leveraged in practice in the safety-critical systems. Finally, the proposed OCSVM-based monitor could still be further applied in the other GNSS argumentation systems. A further analysis of how the proposed monitor can be utilized in other systems will be carried out.

**Author Contributions:** Conceptualization, Z.G., Z.W.; methodology, Z.G., K.F.; validation, Z.G.; formal analysis, Z.G.; writing—original draft preparation, Z.G.; writing—review and editing, K.F., Z.W. and K.G.; visualization, Z.G., K.G.; supervision, Y.Z.; project administration, Z.W.; funding acquisition, Z.W. All authors have read and agreed to the published version of the manuscript.

**Funding:** The work was carried out with financial support from the National Key Research and Development Program of China (grant no. 2020YFB0505602), the National Natural Science Foundation of China (grant nos. 61871012 and 62022012), the Civil Aviation Security Capacity Building Fund Project (grant nos. CAAC Contract 2021(77) and CAAC Contract 2020(123)), and the Beijing Nova Program of Science and Technology (grant no. Z191100001119134).

**Data Availability Statement:** The data presented in this study are available upon request from the corresponding author.

**Acknowledgments:** The authors would like to thank many people at the National Key Laboratory of CNS/ATM for their advice and interests.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**


#### **References**


## *Article* **SafeNet: SwArm for Earthquake Perturbations Identification Using Deep Learning Networks**

**Pan Xiong 1, Dedalo Marchetti 2,3, Angelo De Santis 3, Xuemin Zhang <sup>1</sup> and Xuhui Shen 4,\***


**Abstract:** Low Earth orbit satellites collect and study information on changes in the ionosphere, which contributes to the identification of earthquake precursors. Swarm, the European Space Agency three-satellite mission, has been launched to monitor the Earth geomagnetic field, and has successfully shown that in some cases it is able to observe many several ionospheric perturbations that occurred as a result of large earthquake activity. This paper proposes the SafeNet deep learning framework for detecting pre-earthquake ionospheric perturbations. We trained the proposed model using 9017 recent (2014–2020) independent earthquakes of magnitude 4.8 or greater, as well as the corresponding 7-year plasma and magnetic field data from the Swarm A satellite, and excellent performance has been achieved. In addition, the influence of different model inputs and spatial window sizes, earthquake magnitudes, and daytime or nighttime was explored. The results showed that for electromagnetic pre-earthquake data collected within a circular region of the epicenter and with a Dobrovolsky-defined radius and input window size of 70 consecutive data points, nighttime data provided the highest performance in discriminating pre-earthquake perturbations, yielding an F1 score of 0.846 and a Matthews correlation coefficient of 0.717. Moreover, SafeNet performed well in identifying pre-seismic ionospheric anomalies with increasing earthquake magnitude and unbalanced datasets. Hypotheses on the physical causes of earthquake-induced ionospheric perturbations are also provided. Our results suggest that the performance of pre-earthquake ionospheric perturbation identification can be significantly improved by utilizing SafeNet, which is capable of detecting precursor effects within electromagnetic satellite data.

**Keywords:** earthquake; pre-earthquake anomalies; swarm satellites; ionospheric plasma; deep learning; physical mechanisms

## **1. Introduction**

The ionosphere is an important layer of the solar-terrestrial space observation environment, and the process of earthquake preparation and occurrence can also cause anomalous changes in ionospheric parameters over the preparation zone, known as seismic ionospheric disturbances. These phenomena are a manifestation of earthquakes in the ionosphere, a result of lithosphere–atmosphere–ionosphere coupling, and are considered to be one of the more promising ideas for detecting short-term earthquake signals. Decades ago, Moore [1] and Davies and Baker [2] first reported anomalous ionospheric perturbations associated with the 1964 Alaska earthquake in the USA, and studies on seismic ionospheric phenomena have been rapidly developed. With the accumulation of available observational resources, seismic ionospheric phenomena have been detected, reported, and confirmed [3]. Currently, these phenomena are an issue of great concern and have become a hot topic at

**Citation:** Xiong, P.; Marchetti, D.; De Santis, A.; Zhang, X.; Shen, X. SafeNet: SwArm for Earthquake Perturbations Identification Using Deep Learning Networks. *Remote Sens.* **2021**, *13*, 5033. https://doi.org/ 10.3390/rs13245033

Academic Editor: Fabio Giannattasio

Received: 2 November 2021 Accepted: 8 December 2021 Published: 10 December 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

the intersection of seismology and space physics. In the 21st century, with the development of space satellite observation technology, many countries launched satellites dedicated to monitoring space environment changes and natural disaster activities such as earthquakes and volcanoes, for example the QuakeSat (USA), SICH-1M (Ukraine), COMPASS-2 (Russia), DEMETER (France), and China Seismo Electromagnetic Satellite (CSES). Furthermore, even satellites dedicated to other scientific purposes have been demonstrated to provide important observations for seismic ionospheric disturbances, such as the Swarm constellation of the European Space Agency (ESA). The Swarm constellation consists of three identical satellites that carry the same payloads. The combined observation of multiple satellites in the Swarm constellation offers significant advantages over single-satellite observations, allowing for better detection efficiency, better global coverage in a day, higher spatial resolution, and improved capabilities for analyzing anomalies. In this way, Swarm represents a new, successful approach to the study of seismic ionospheric phenomena.

Studies of these phenomena from the Swarm satellites are currently ongoing, yielding continuous investigation and reporting of the valuable description of their mechanisms. A review of previous studies on seismic ionospheric phenomena based on Swarm satellites shows that the work is generally divided into two types: earthquake case studies and statistical investigations, both of which focus on the analysis of ionospheric disturbance phenomena associated with the preparation phase of an earthquake before its occurrence. Both types explore the precursor and provide related criteria potentially useful for earthquake prediction and forecasting.

Various perturbations have been observed using Swarm data before the occurrence of large earthquakes. De Santis et al. [4] investigated magnetic field anomalies from Swarm data for one month before and after the magnitude 7.8 Nepal earthquake that occurred on 25 April 2015 at 06:26 UTC, and found that the cumulative number of anomalies followed the same typical power-law behavior of a critical system close to its critical time, thereafter returning to normal after the earthquake. In another study, Marchetti et al. [5] used multiparameters from ground and space, that is, Earth geomagnetic field data (magnetic data from the ESA Swarm constellation and from L'Aquila and Duronia ground observatories of the INGV (Italian National Institute of Geophysics and Vulcanology)) along with the MERRA-2 climatological dataset to study the lithosphere–atmosphere–ionosphere coupling effects of the 2016–2017 central Italy earthquake sequence. They revealed anomalies in the ground-based geomagnetic observations 275 and 85 days before the earthquake sequence, anomalies from satellite observations 240 days and 3 days before the start of the earthquake sequence, and two perturbations in the chemical/physical composition of the atmosphere 200 and 150 days prior to the earthquake sequence. In a further study, Marchetti et al. [6] analyzed the Swarm satellite magnetic data prior to the magnitude 7.1 California Ridgecrest earthquake that occurred on 6 July 2019 and found some increase in anomalies in the Y (eastern) component of the magnetic field around 200 days before the earthquake; moreover, 15 min before the earthquake, the Swarm Bravo satellite passed right over the epicentral region, and its Y component presented interesting anomalies. Zhu et al. [7] analyzed ionospheric magnetic field data from the Swarm Alpha satellite before the 16 April 2016 Ecuador earthquake (magnitude 7.8) based on the non-negative matrix factorization method and found that the energy and entropy of one of the weighting components were more concentrated within the preparation region of the seismic event. In that study, the cumulative number of orbits with anomalies inside that region showed an acceleration before the Ecuador earthquake and recovered to a linear (i.e., standard) trend after the earthquake. In addition to the aforementioned examples, numerous studies have confirmed that the Swarm ionospheric perturbations are sensitive enough and useful for detecting earthquake-related anomalies, such as for the 2014 Ludian earthquake [8], 2017 Sarpol-e Zahab (Iran) earthquake [9], 2017 Mexico earthquake [10], and the above mentioned 2016 Ecuador earthquake [11,12].

Statistical analysis is a common technique for investigating ionospheric anomalies that occur before earthquakes using satellite data [13–15]. De Santis et al. [16] statistically

analyzed the Swarm electron density and magnetic field data observed by the three Swarm satellites over 4.7 years using a superposed epoch approach and found that some electron density and magnetic anomalies were significantly concentrated from more than two months before the earthquake to a few days prior to the earthquake. This confirmed the well-known Rikitake empirical law between the time of the precursors and the magnitude of the earthquake by studying different magnitude ranges. In another statistical analysis of 5.3 years of magnetic field data from the Swarm satellite by Marchetti et al. [17], the distance from the satellite to the earthquake epicenter matched the measured distance arrival time of the coseismic disturbance from the surface to the ionosphere, confirming that observed anomalies were likely to be caused by seismic events due to their occurrence with a mixed transmission mechanism, i.e., by acoustic gravity waves and electromagnetic propagation in the ionosphere.

However, most existing pre-earthquake anomaly studies are lacking in consistent analysis methods and anomaly evaluation metrics; thus, the analysis results of anomalies lack universality and often lead to various or even contradictory interpretations for the same earthquake. Moreover, statistical studies of seismic anomalies often do not consider the influence of non-seismic anomalies.

Deep learning, which has been widely used in recent earthquake research [18–22], could perform consistent analysis and assessment of a large number of earthquakes, and could take into account the influence of non-seismic anomalies, which are an effective tool to solve the above issue. By investigating different DEMETER satellite datasets, Xiong et al. [21] confirmed some frequency bands with low-frequency electric and magnetic fields to be the main features for pre-seismic electromagnetic perturbation identification using deep learning. Xiong et al. [23] also proposed a deep learning framework termed SeqNetQuake by training whole life cycle dataset from the DEMETER satellites and transferring the well-trained model to the CSES satellite to form a new identification model which achieved a 12% improvement in classification performance. Based on the classical AdaBoost machine learning algorithm and the feature of satellite remote sensing products such as infrared and hyperspectral gases, Xiong et al. [22] proposed a novel earthquake prediction framework based on inverse boosting pruning trees (IBPT), and achieved promising forecasting results in the validation of global earthquake cases retrospectively.

In this study, we use deep learning techniques, combined with multi-year accumulated Swarm satellite data for pre-seismic ionospheric disturbance identification. The proposed method, known as SafeNet (SwArm for Earthquake study using deep learning networks), is a deep-learning method based on a sequence-based classification neural network for pre-earthquake perturbation identification. The suggested model was trained using 9017 recent independent 4.8+ magnitude earthquakes and 7-year plasma and magnetic field data from the Swarm A satellite. The results indicated that nighttime data provided the best performance in distinguishing pre-earthquake perturbations, with an F1 score of 0.846 and a Matthews correlation coefficient of 0.717. It also worked effectively in detecting pre-seismic ionospheric abnormalities as earthquake magnitude increased. Furthermore, the findings of this study enabled us to propose a hypothesis regarding the physical mechanism behind earthquake-induced ionospheric disturbances. In general, SafeNet could considerably enhance the effectiveness of pre-earthquake ionospheric perturbation identification.

This paper is organized as follows. In Section 2, the used data from the Swarm satellite and its data pre-processing are discussed, together with the observation cases of the anomalies before two actual earthquakes. Section 3 describes the network structure design and performance evaluation metrics for the proposed deep learning model. The results are analyzed and discussed in Section 4, and a hypothesis of the earthquake mechanism is presented. In Section 5, we provide a conclusion and further possible orientation for future work.

#### **2. Datasets and Observations**

#### *2.1. The Swarm Satellites*

The Swarm constellation is the first ESA constellation program for geomagnetic observation, whose main scientific objective is to study the delicate structure of the Earth's magnetic field, its dynamics, and its interaction with Earth systems [24,25]. The constellation consists of three identically equipped satellites, Swarm A (Alpha), B (Bravo), and C (Charlie), which were launched together into a near-polar orbit on 22 November 2013, and finalized their commissioning on 17 April 2014. For the main purpose of the Swarm constellation, each satellite is equipped with a high-precision magnetometer in addition to several other sensors to increment the scientific satellites' capabilities. Among them, the charged particle sensor (Langmuir detector) provides a new avenue for the study of the seismic ionosphere phenomena.

#### *2.2. Earthquake Case Study*

#### 2.2.1. 2016 Sumatra Earthquake

Figure 1 shows the Earth magnetic field and electron density residuals as obtained by applying the MASS (MAgnetic Swarm anomaly detection by Spline analysis) method (De Santis et al. [4]) to the data measured by the Swarm Alpha satellite on 15 February 2016, i.e., 16 days before the Mw = 7.8 Sumatra 2016 earthquake localized at 4.952◦S, 94.330◦E, and 24 km depth. The magnetic track shows a clear anomaly highlighted by a red circle in panel E only in the Y-East component; this is compatible for signals coming from below (i.e., the internal ones, Pinheiro et al. [26]). The FFT (fast Fourier transform) spectrum shows a signature highlighted by a red circle in panel B at about 0.4 Hz (i.e., a period of 2.5 s) that could correspond to the frequency of the anomaly. Such a signature is not visible in the track without evident anomalies (the amplitude at this frequency is normally lower than the level in this figure as visible in the other examples in Text S1, Supplementary Materials). The electron density shows the clear EIA diurnal profile, but it is quite unusual that it appeared during nighttime (local time of about 1:30 a.m.). Furthermore, the electron density shows some disturbances at +5◦ geographic latitude and even they do not coincide with the magnetic anomaly; both phenomena could be produced by the preparation phase of this large earthquake. Geomagnetic indices Dst = 6 nT and ap = 4 nT depict a very quiet geomagnetic condition. Thus, all the investigated measurements show an unusual status of the ionosphere in the nighttime of 15 February 2016, without known external perturbations; furthermore, the longitude of the satellite matched with the one of the future epicenters, so finally we can consider this track a good candidate for an earthquake precursor.

A complete investigation of nighttime Swarm Alpha from this track until earthquake occurrence is presented in Text S1 of the Supplementary Materials. Some days are affected by perturbed geomagnetic conditions and thus do not permit searching for possible seismoinduced ionospheric disturbances. Instead, on at least other 3 days (21, 25, and 26 February 2016), there are interesting signals: all of them present higher signal content in the Y-East component with respect to the vertical and North ones. These anomalies, detected inside Dobrovolsky's area and a few days before earthquake occurrence, can in principle be good candidates for precursors.

#### 2.2.2. The Ecuador Earthquake Occurred on 16 April 2016

A similar example to the previous one was recorded before the Ecuador earthquake occurred on 16 April 2016 at 0.371◦ N, 79.940◦ W, and 20.6 km depth by the Swarm Alpha satellite on 19 January 2019 (see Figure 2). Akhoondzadeh et al. [12] found a pattern, in terms of the chain of phenomena, in the lithosphere–atmosphere and ionosphere in preparation of the Ecuador earthquake. In particular, they detected an increase of Swarm magnetic anomalies about 9 days before the event and during geomagnetic quiet conditions; these were probably related to the preparation phase of this seismic event. The track shown in Figure 2 is one example acquired during such an increase of anomalies and it shows an anomaly in three components of the magnetic field highlighted by red circles in panels

D, E, and F. We notice that in the Y-East component the anomaly seems to have a longer duration with respect to the other components and a northern anomalous signal (even if this second North disturbance is formally out of the Dobrovolsky area). From the frequency point of view, in this case the highest intensity in the Y-East FFT spectrum (see panel B) seems to be located around 0.2 Hz (period of 5 s) with some frequency spread, and in particular it seems that there is a double peak at lower frequencies (0.182 Hz and 0.197 Hz) not present in Z FFT (see panel C), while the peak at 0.23 Hz is present also in the Z component as highlighted by the red circle in panel C. Future investigations are necessary to check by a systematic approach if a particular frequency is more prone to identifying possible seismo-induced phenomena. From the multiparametric and multi-instruments investigation, it is possible to note that there is a depletion of electron density which mostly coincides in latitude with the magnetic disturbance. A common alteration of the magnetic field with a decrease of electron density is probably a sign of the crossing of a "plasma bubble". This feature, which is produced for the standard behavior of the ionosphere, can be also induced by air ionization in the preparation of an earthquake [27]. The earthquake was recently re-investigated by another method by Zhu et al. [7], confirming the previous result and offering new insights.

**Figure 1.** Swarm Alpha satellite nighttime track 26 of 15 February 2016 acquired 16 days before the M7.9 Sumatra 2016 earthquake. Panels D, E, and F show magnetic X, Y, and Z components residual after derivate and cubic-spline removal (MASS method; De Santis et al. [4]). Panels A, B, and C provide the FAST Fourier Transform of the residual of X, Y, and Z,

respectively. Panel G shows the logarithm of electron density, with a 10-degree polynomial fit as a red line. The pixels that present Ne values that significantly deviate from the fit are identified as blue stars and they are potential anomalies as defined by NeLOG in De Santis et al. [28]. The map in panel H shows the epicenter of the earthquake by a green star and its preparation area defined by Dobrovolsky et al. [29] as a yellow circle. In the title of the figure, the satellite (A for Alpha, B for Bravo, and C for Charlie), date, track number (counted daily), and time in local and UTC times of the center of the track are indicated. The values of the geomagnetic indexes Dst and ap at the track acquisition time are also provided, and in the second line of title the number of flagged samples are provided together with the total number of samples in the track.

**Figure 2.** Swarm Alpha satellite nighttime track 3 of 7 April 2016 acquired 9 days before the M7.8 Ecuador 2016 earthquake. The description of the subfigures is the same as in Figure 1.

Also, for the case study in Text S2 (Supplementary Materials), from the Swarm Alpha nighttime data, one track for each day was shown until the earthquake's occurrence. In this case, most of the other detected anomalies were probably associated with external geomagnetic activity.

#### *2.3. Dataset and Preprocessing*

All three satellites in the Swarm constellation carry the same scientific payloads to detect ionospheric electromagnetic parameters. This study focused on the use of magnetic field and plasma parameter data from the Swarm A satellite and space weather data. The types of sensors and data utilized in the study are described as follows:

(1) The vector magnetometer (VFM) is a fluxgate magnetometer from which the onground processor provides both high-frequency (50 Hz) and low-frequency (1 Hz) signals. The magnetic field intensity is available in an instrumental reference system as well as in the Earth one (which is used in this study), consisting of three components, X (North), Y (East), and Z (Vertical), and is measured in nT, while time is measured in Universal Time. The VFM measures field components with an accuracy of 0.1 nT every 3 months for signals at global scales within a space resolution of 20 km [25]. In this study, we focus on the X, Y, and Z components of the low-frequency (1 Hz) data.

(2) The Langmuir probe (LP) measures the electron temperature (Te), electron density (Ne), and other parameters of plasma by measuring the current generated by electrons and ions at a sampling rate of 1 Hz. This study used plasma Ne data from a level 1b product [24,25].

(3) To avoid effects caused by space weather events, we collected the Kp index, an indicator of global geomagnetic activity, to be used as an auxiliary means of discriminating between solar (or geomagnetic activity) and seismic ionospheric disturbance phenomena. It should be noted that data corresponding to periods with a Kp index greater than 3.0 were not analyzed in this study.

The data used for this study were from Swarm A and included the parameters mentioned above, collected from 1 April 2014 to 30 April 2020, i.e., 7 years of data. According to the United States Geological Survey, 18,621 earthquakes with magnitudes greater than or equal to 4.8 were reported during this period. Thus, we used the same technique as reported by Yan et al. [14] to exclude aftershocks from the list of earthquakes in order to prevent mixing pre- and post-seismic effects. After this process, the final list comprised 9017 independent earthquakes. We also removed data that corresponded to the aftershocks.

To test the reliability of the machine learning methods and improve their robustness, we created the same number of artificial non-seismic events as actual earthquakes, while randomizing and changing the timings and locations to avoid overlapping with real earthquakes. Within the selected spatio-temporal range, we sampled latitude, longitude, and time at random, adhering to the following constraints: (1) the latitude or longitude is not within 10◦ of the latitude or longitude of a real earthquake and (2) the time is not within 15 days of the occurrence time of a real earthquake.

#### **3. Methodology**

Figure S25 (Supplementary Materials) depicts a flowchart diagram of the methodology used in this study. To begin, a total of 9017 earthquakes with magnitudes of 4.8 or higher were extracted from seismic catalogues considering those that occurred all over the world for the study. Thirteen datasets were built by combining different magnitudes of earthquakes and features. Each dataset was divided into two parts: training data and test data. After that, we used the "sliding window" technique for data preprocessing, and lastly, we generated time series-based features.

In our work, we explored the effect of different model input and spatial window sizes, earthquake magnitudes, and whether the earthquake occurred during the day or at night using different datasets. We also performed a comparison of five state-of-the-art approaches. Considering these approaches are highly sensitive to parameter selection, we preferred to choose those configurations that would allow us to achieve the highest performance in the tests. The performance of each approach was then evaluated after the parameter selection. Finally, we evaluated the performance of each technique using ROC curves and other performance metrics.

#### *3.1. Data Preprocessing*

A variety of reasons, including satellite payload interference, the space environment, and other factors, may produce inaccuracies in constantly observed satellite data. As a precaution against such mistakes, we divided continuous observation data into fixed-length sliding windows (commonly known as "sequences"), which we then utilized as inputs to our proposed model. We also divided the data into sliding windows that were continuous but did not overlap because time series data are highly autocorrelated sequences. As a

result, we carefully examined the difference between the first and final data points ordered by time in each time window to verify that the data were continuous. Time series windows with unreasonable time differences (i.e., gaps) were removed from the analysis.

Subsequently, we reformulated the pre-earthquake ionospheric perturbation discrimination task as a multiclass multivariate time-series classification problem with the data marked as follows (Figure 3): the non-seismic-related data were marked as 0, seismicrelated data were marked as 1, and data with a Kp index greater than 3.0, regardless of whether the data were related to an earthquake, were marked as 2, indicating density perturbations due to solar and magnetic activity [30]. It is known that the Earth's magnetic field undergoes temporal fluctuations and exhibits recognized patterns related to the movement of the poles, and time series data exhibit significant autocorrelation characteristics, which indicates that the field is highly variable. Therefore, to ensure that the overall Swarm dataset could be utilized effectively, it was carefully divided into two contiguous parts: the first 80 percent (in chronological order) of the data was used for model training (the training set), and the last 20 percent was used for model testing and final assessment (the test set).

**Figure 3.** Sequence labeling after data segmentation using a sliding window. Consecutive observations of used parameters are segmented by non-overlapping sliding windows. T is the window's length. Non-seismic data are marked as 0 (class 0), seismic data are marked as 1 (class 1), and data with a Kp index greater than 3 are marked as 2 (class 2), regardless of whether they are associated with an earthquake.

#### *3.2. Deep Learning Network Architecture*

In our research, we used a combination of a convolutional neural network (CNN) and bi-directional long short-term memory (Bi-LSTM) to train our proposed SafeNet model (Figure 4). The model's architecture comprises one-dimensional (1D) convolutional layers, a 1D Bi-LSTM structural layer, and a fully connected (FC) block. For feature extraction from the input data, the SafeNet model employs CNN layers, which are combined with Bi-LSTMs to facilitate sequence prediction. SafeNet accesses subsequences of main sequence as blocks, collects features from each block, and then transfers extracted features to the LSTM layer for interpretation. To enable the CNN model to read each subsequence in the window, the entire CNN model is wrapped in a Time Distributed layer. The extracted features are then flattened and provided to the Bi-LSTM layer for reading, and further features are extracted. The 1D convolutional layers are utilized to extract data features in concept, and then Bi-LSTM structures are employed to optimize feature extraction in sequential data. Finally, the classification probability is calculated using an FC layer. The loss function is categorical cross-entropy, and the optimization is performed using the Adam method [31]. For more details on the SafeNet network architecture, please refer to Text S4 (Supplementary Materials).

The proposed model was developed in TensorFlow 2.0 with the Keras (v. 2.3.0) interface [32]. To facilitate fast training, all models were trained on a server equipped with two Intel Xeon E5-2650 v4 CPUs, 128 GB of RAM, and an NVIDIA GeForce RTX 2080 Ti graphics processing unit (GPU) [33]. Owing to the sensitivity of the proposed method to the chosen parameters, Bayesian hyperparameter tuning was utilized to determine the optimal settings [34] and was developed using the Hyperopt Python package [35]. The negative of the F-measure (F1) was utilized as the objective function's return value (loss) in this procedure. The procedure optimized hyperparameters based on their capacity to minimize an objective function by constructing a probability model based on the results of previous evaluations. Consequently, this model can be expected to perform better with fewer iterations than the random or grid searches would require. Table S1 (Supplementary Materials) summarizes the search space for SafeNet's important parameters. Each model was assigned a maximum of 100 iterations. DataSet S1 (Supplementary Materials) provides the hyperparametric optimization trial results for all the datasets used to train the SafeNet model and other benchmarking classifiers.

#### *3.3. Performance Evaluation*

Datasets utilized in this study were often class unbalanced, with the number of samples representing the non-seismic class being much greater than the number of samples representing the other classes [36]. In this situation, a simple classifier that predicted each sample as the majority class could achieve a high level of accuracy; thus, the total classification accuracy was insufficient to assess performance. As a result, we used the F-measure (F1) to evaluate model performance, which considers the correct classification of each class to be equally important. The F1 score is a metric that considers precision and recall. This is often referred to as the harmonic mean of both. Consequently, class imbalance was addressed by weighting the various classes according to their sample proportions. The Matthews correlation coefficient (MCC) [37], which emphasizes positives in samples, was also employed in this study. The specific formulas for the performance metrics such as F1 score and MCC are defined in Text S3 (Supplementary Materials).

Furthermore, in this study, receiver operating characteristic (ROC) curves, which are plots of the true positive rate versus the false positive rate, were employed to assess the output quality of the classifier's performance. The ROC curve is often used in binary classification settings to assess the output of a classifier. To extend the ROC curve and ROC area for multiclass classification, the output was binarized, and one ROC curve was drawn and used to evaluate classifier quality per class. In addition, we computed the area under the ROC curve, known as the AUC, which was used to compare various models. In this

study, higher AUC values were regarded as better methods for identifying ionospheric perturbations prior to earthquakes.

**Figure 4.** The bottom-up network framework architecture of the SafeNet model. 1-D CNN: Onedimensional convolutional neural network; Dropout: drop-out layer; Bidirectional LSTM: bidirectional long short-term memory layer. "Flatten" and "Dense" are the names of the functional layers.

Finally, to visually illustrate the classification performance of each class, ternary probability diagrams and confusion matrices were used to depict the probability distributions for each input class in the test data, as well as the distribution of predicted and actual values.

Five state-of-the-art machine learning models were benchmarked for the study task: gradient boosting machine [38], deep neural network (DNN) [39], random forest [40], CNN [41], and LSTM [42] models. These methods were implemented in Python (v. 3.6) with scikit-learn (v. 0.20.0) and Keras (v. 2.3.0). Because the explored models are sensitive to parameter selection, we chose parameters that yielded the best performance using Bayesian hyperparameter tuning, as described above. After the optimal parameters were determined for each method, the performances of the different methods were compared.

#### **4. Results**

We used the Swarm dataset to train the proposed SafeNet model directly, which had been split into the training and test sets. Initially, we configured the data with 60 consecutive observations as the input sequence length, a spatial window centered at the epicenter, a deviation of the Dobrovolsky radius [29], and nighttime data in the initial configuration, because there is no universal standard for lengths of the input sequence and the spatial window (DataSet 01 in Table 1). In our study, we consistently considered data from 15 days before to 5 days after every earthquake and set it as the temporal window.



As illustrated in Figure 5, the ROC curves were used as a performance measure because they represent relative trade-offs between true positives (benefits) and false positives (costs) for each class, and the performance of the model utilizing nighttime data is shown. The AUC values of classes 0 and 1 were both greater than 0.9 (Figure 5A,B), indicating that the model can roughly distinguish time series related to earthquakes and non-seismic events, but the AUC of class 2 was only 0.50 (Figure 5C), indicating that the accuracy of the model in identifying space weather such as magnetic storms was low. This may be due to the fact that class 2 was trained with a lower number of samples, causing the model to fail to extract the features of the class. Figure 5D depicts the MCC, F1 score, and accuracy bar plot curves, which represent the overall performance of the model. The findings were similar to those implied by the ROC curves, showing that the model could distinguish earthquakes from non-seismic and space events to some degree. In general, the performance of the model based on the initial setup was reasonable, but not remarkable. As a result, we investigated whether combining datasets with various temporal and geographic characteristics, as well as alternative models, might provide an improved performance.

**Figure 5.** Receiver operating characteristic (ROC) curves showing SafeNet's performance for (**A**) class 0, (**B**) class 1, and (**C**) class 2 utilizing nighttime data. (**D**) Matthews correlation coefficient (MCC), F1 score, and accuracy bar plot curves illustrating the model's performance with nighttime data.

#### *4.1. Considering Various Input Sequence Lengths*

To further investigate whether the SafeNet method can identify pre-earthquake disturbances with varying input sequence lengths and whether it can improve performance, datasets with the following input sequence length of consecutive observations were created (Table 1): 80 (DataSet 02), 70 (DataSet 03), 50 (DataSet 04), and 40 (DataSet 05) consecutive observations.

Figure 6 depicts the ROC curves and MCC, F1 score, and accuracy bar plot curves for the datasets with varying input sequence lengths. Table 2 lists the classification performance metrics achieved using the SafeNet. In Table 2, it is revealed that the overall F1 scores vary from 0.812 to 0.846, and the MCC varies from 0.654 to 0.717 for various datasets; these values are also shown in Figure 6D, which shows a performance comparison of the results. It was illustrated that as the input sequence length fluctuated, the performance of the model changed as well, and the optimal performance was achieved using the dataset with an input sequence length of 70 consecutive observations (DataSet 03). According to the ROC curves shown in Figure 6A–C, SafeNet also offered a reasonable performance for each class when DataSet 03 was used. Based on these results, we could conclude that the length of the input sequence had an influence on the performance of the SafeNet model, and that the best performance was achieved with an input sequence of 70 consecutive observations.

**Table 2.** Performance comparison of SafeNet and benchmark classifiers on different datasets.


**Figure 6.** Comparing model performance using ROC curves for (**A**) class 0, (**B**) class 1, and (**C**) class 2 with window sizes of 40, 50, 60, 70, and 80. (**D**) Bar plots of the MCC, F1 score, and accuracy at various window sizes. We use the letter h to define the window size, and the numbers in brackets represent the specific size.

#### *4.2. Data Comparing Nighttime Versus Daytime*

The data acquisition time may affect the identification of pre-earthquake electromagnetic perturbations. To illustrate the effect of data collection time, a daytime dataset (Dataset 06 in Table 1) was created. As shown in Figure 7 and Table 2, SafeNet's performance was compared using benchmark datasets generated during the day and night. We evaluated the model's performance using the AUC, MCC, F1, and accuracy indicators.

**Figure 7.** The ROC curves comparing model performance in nighttime vs. daytime data for (**A**) class 0, (**B**) class 1, and (**C**) class 2. (**D**) Comparison of model performance for nighttime and daytime data using bar plots of MCC, F1 score, and accuracy.

For the same spatial and temporal features, we found that using the nighttime datasets (Dataset 03 in Table 1) resulted in a higher classification performance than using the daytime dataset (Table 2). SafeNet's ROC curves for both datasets are given in Figure 7A–C, and we can observe that the AUC curve for nighttime data is somewhat higher than that for daytime data, with approximately 5.7%, 8.6%, and 2.1% increases in AUC for the three classes, respectively. When all classes were taken into account, SafeNet's F1 score improved from 0.805 to 0.846 when nighttime data were used, compared to daytime data, and MCC improved by 9.8% (Table 2).

Figure 7D compares the MCC, F1 score, and accuracy values between the daytime and nighttime data, revealing that the nighttime data performed slightly better than the daytime data. One reason for this may be that, because ionospheric conditions are typically more disturbed during the day, identifying seismic electromagnetic effects is more difficult, which may reflect a small number of significant changes in daytime data. This finding is supported by other research on statistical results of electromagnetic disturbances caused by earthquakes [16,43–46].

#### *4.3. Considering Various Spatial Windows*

SafeNet worked effectively for a circular region centered at the epicenter with a Dobrovolsky radius (DataSet 03). To further investigate the impact of various spatial windows on the performance of the model, satellite datasets with spatial windows of 3◦ (DataSet 07), 5◦ (DataSet 08), 7◦ (DataSet 09), and 10◦ (DataSet 10) were created (Table 1).

Table 2 details the SafeNet's performance on the five datasets, and Figure 8 illustrates the ROC and MCC curves, F1 score, and accuracy bar plot curves. SafeNet performed best when the dataset was used with the spatial window radius given by Dobrovolsky's formula (DataSet 03), with an F1 score of 0.846 and an MCC of 0.717. Comparing the results from Figure 8D and Table 2 shows that an improvement in the overall performance was not achieved with larger spatial windows. This tendency is also evident in the ROC curves in Figure 8A–C, where DataSet 03 had the highest AUC value of the datasets. Though the cause for these findings is unclear, it could be that a disturbance moving upward from the Earth's surface alters the ionosphere's properties geometrically, and the radius of the affected area matches the radius calculated using Dobrovolsky's formula. In addition, among the cases with different sizes of the preparation area (DataSet 07–10), it is the smaller ones that show the best performance. This could be due to the fact that there are more earthquakes with a smaller magnitude (i.e., 4.8–5.0) than those with a larger magnitude.

#### *4.4. Considering the Magnitude of the Earthquake*

It is well known that earthquake magnitude may play an active role in the identification of pre-earthquake perturbations. Therefore, to demonstrate the influence of magnitudes, we divided earthquakes into groups of three—5136 earthquakes with magnitudes between 4.8 and 5.2, 2793 earthquakes with magnitudes between 5.2 and 5.8, and 853 earthquakes with magnitudes between 5.8 and 7.5, and the corresponding datasets were created: DataSet 11, DataSet 12, and DataSet 13 (Table 1).

Table 2 illustrates the performance of the SafeNet model for the three datasets. As shown in Figure 9, the model had similar performance over the three datasets; for example, the AUC of class 0 and class 1 on all three datasets using the SafeNet method was above 0.86, and the F1 score ranged from 0.697 to 0.812, which suggests that the SafeNet model provides a satisfactory performance in discriminating electromagnetic pre-earthquake perturbations on the datasets with different magnitudes. Moreover, it can be concluded from Table 2 and Figure 9 that the larger the magnitude, the better the classification performance. This is also in line with the reality.

**Figure 8.** The ROC curves comparing model performance for (**A**) class 0, (**B**) class 1, and (**C**) class 2 with various spatial window radii of 3◦, Dobrovolsky's formula, 5◦, 7◦, and 10◦. (**D**) Bar plots showing the MCC, F1 score, and accuracy of the results.

**Figure 9.** The ROC curves comparing model performance for (**A**) class 0, (**B**) class 1, and (**C**) class 2 of earthquakes with magnitudes between 4.8 and 5.2, earthquakes with magnitudes between 5.2 and 5.8, and earthquakes with magnitudes between 5.8 and 7.5. (**D**) Bar graphs displaying the MCC, F1 score, and accuracy.

#### *4.5. Considering Unbalanced Datasets*

The actual situation of earthquake issues is usually highly unbalanced, and it is obvious that non-earthquake datasets are always significantly larger than earthquake datasets. To test the real performance of the SafeNet model on unbalanced datasets as well as to investigate whether our proposed method could be applied to earthquake anomaly identification on unbalanced datasets, datasets with the positive to negative ratio of 1:2 (DataSet 14), 1:3 (DataSet 15), 1:4 (DataSet 16), and 1:5 (DataSet 17) were generated (shown in Table 1).

Table 2 illustrates the performance of the SafeNet model on five datasets (including DataSet 03). Although the method performs most effectively on the dataset with a positive to negative ratio of 1:1 (DataSet 03), the overall performance is similar on the five datasets; for instance, the proposed method has F1 scores around 0.83 (ranging from 0.814 to 0.846) as well as MCC values ranging from 0.661 to 0.717. Figure 10 shows the ROC curves for all three classes, together with a comparison of the performance metrics; we also observed a similar trend of our proposed method's performance on the five datasets, which suggests that the SafeNet model achieves a good performance for pre-seismic perturbation identification on the unbalanced dataset. Although the five unbalanced datasets are different, these results show that our method is less sensitive on the positive to negative ratio, and our method can be used to identify possible electromagnetic preseismic perturbations on an unbalanced dataset, which suggests that it could provide a good realistic performance.

#### *4.6. Comparative Analysis of Other Classifiers*

Table 2 and Figure 11 report the performance of our SafeNet model with five other benchmarking classifiers for DataSet 03. The performance of the existing methods ranged from F1 = 0.742 to 0.846 and MCC = 0.450 to 0.717. However, SafeNet had the best performance, improving MCC by 8.6% over that of the next-best DNN model. Figure 11A–C compares the ROC curves obtained for the SafeNet model with those of the five other classifiers, and SafeNet again demonstrated the best performance with a 4.6% improvement in AUC for class 0, a 1.7% improvement for class 1, and a 7.1% improvement for class 2 over the second-best DNN model.

To further confirm the performance of SafeNet, ternary probability diagrams and a confusion matrix were used to indicate the distribution of the predicted and true values and to allow for more profound insight into the classification performance of the model. Figure 12 shows the ternary probability diagrams and confusion matrix for the three classes obtained from the SafeNet model. Ternary probability diagrams allow for a qualitative evaluation of classification results. From Figure 12A–C as a whole, most of the samples in class 0 and class 1 were correctly predicted, and classes predicted by SafeNet could be classified into the correct classes, while the performance of class 2 was slightly worse, with some samples having a prediction probability concentrated around 0.5. The confusion matrices shown in Figure 12D quantitatively present the prediction accuracy for each class, with the correct prediction accuracy for class 0, class 1, and class 2 of 89.5%, 85.5%, and 87.9%, respectively. In general, SafeNet provides a good classification performance for identifying seismic signals and space weather events from Swarm satellite data.

**Figure 10.** The ROC curves comparing model performance for (**A**) class 0, (**B**) class 1, and (**C**) class 2 of datasets with the positive to negative ratio of 1:1 (DataSet 3), 1:2 (DataSet 14), 1:3 (DataSet 15), 1:4 (DataSet 16), and 1:5 (DataSet 17). (**D**) Bar graphs displaying the MCC, F1 score, and accuracy.

**Figure 11.** The ROC curves comparing the proposed method, gradient boosting machine (GBM), deep neural network (DNN), random forest (RF), convolutional neural network (CNN), and long short-term memory (LSTM) models for (**A**) class 0, (**B**) class 1, and (**C**) class 2. (**D**) The MCC, F1 score, and accuracy bar plot curves.

**Figure 12.** Ternary probability diagrams illustrating the findings of SafeNet on the test data for (**A**) class 0, (**B**) class 1, and (**C**) class 2. Class 0 is represented by blue crosses, class 1 by red crosses, and class 2 by green crosses; the color of the cross indicates the actual class, and the distance projected from each cross to the class axis represents the probability of that class in the model prediction. (**D**) Matrix of confusion illustrating the distribution of estimated and actual values. Each tile's center contains the normalized count (overall percentage) in black text. Column percentages are shown at the bottom of each tile, while row percentages are displayed on the right (both in red text). The sum tiles on the plot's right and bottom (in shades of green) indicate the overall distribution of predictions and targets. Note that the color intensity is proportional to the number of counts.

#### **5. Discussions**

Several studies have investigated the physical mechanisms of ionospheric pre-earthquake perturbations [47–51]. Perhaps the most widely accepted hypotheses regarding these mechanisms were presented by Pulinets, et al. [52] and Kuo, Lee, and Huba [27], who proposed complex lithosphere–atmosphere–ionosphere coupling as the physical basis for the generation of short-term earthquake precursors. Specifically, the lower crust and upper mantle generate gases such as radon (Rn) at lithostatic pressure during the buildup to an earthquake, and this gas could form large-scale domains in the rock. When reaching a certain vertical extent, these gas domains could become hydrostatically unstable and force their way upward through the lithosphere. Rapid lithospheric degassing can be expected to trigger several atmospheric processes near the Earth's surface, leading to changes in air conductivity and, therefore, changes to the near-ground atmospheric electric field. It is known that, as a part of the global electric circuit, the ionosphere immediately reacts to changes in near-ground electric properties, and an electric field induced within the ionosphere can cause ion drift and irregularities in electron concentrations. To better understand the mechanisms of ionospheric pre-earthquake perturbations, Freund [53] performed a series of experiments on a loaded rock and found that stress activation of p-hole charge carriers in the Earth's crust led to regional positive ground potential. In these scenarios, ionospheric perturbations are expected.

Crustal strain measurements often fail to detect any unusual changes before earthquakes [54], and the measurement of Rn gas content is among the most reported earthquake precursors [55,56]. Moreover, continuous monitoring of soil gas radon and water radon concentrations along with the Amritsar (Punjab, India) seismic zone correlation showed that the amplitude of radon gas anomalies was positively correlated with earthquake magnitude [57]. In addition, recent studies revealed a significant decrease in radon concentration within continuous measurements of radon concentration in the atmosphere before the 2018 earthquake in northern Osaka, Japan [58] and peculiar changes in radon concentration in the atmosphere two months before the 1995 Kobe earthquake in Japan [59]; Fu et al. [60] studied the radon gas anomalies in northern and northeastern Taiwan before the earthquake and observed a significant increase in soil radon concentration from a few days to a few weeks before the earthquake. Finally, it is reasonable that the amount of emitted radon could depend on the rupture length of the fault, that is, the magnitude of the seismic event. This study considers Rn gas emissions as the most reasonable result to be initially triggered by an earthquake. Because Rn decay can emit alpha particles, we propose a new hypothesis to explain the physical mechanisms of earthquake-induced ionospheric perturbations, as schematically depicted in Figure 13. In this hypothesis, various crustal movements in the pre-earthquake stage lead to rock fragmentation, melting, mineral dissolution, or phase change, and daughter isotopes of some radioactive parent isotopes retained in certain minerals or rocks are released in large quantities. In this situation, the Rn gas content in the area near the epicenter is abnormal before the earthquake, and furthermore the Rn gas decays quickly (half-life is about 3.8 days). During Rn decay, many alpha particles are released. The energy of an alpha particle is 5.2 MeV, and the ionization energy required by an atmospheric molecule is 32 eV. Therefore, one alpha particle is sufficient to generate 150,000 pairs of positive and negative ions, thereby creating an excess of positive airborne ions near the Earth's surface.

Because pre-earthquake field ionization occurs over relatively wide areas, we suggest that the positively charged air bubble expands owing to its internal electrostatic repulsion. Furthermore, as the prevailing electric field has an acceleration effect on positive airborne ions relative to the Earth's surface, the only direction of the airborne ions is upward. Therefore, many airborne ions would rise rapidly and shorten the vertical potential difference between the Earth's surface and the ionosphere. In response, the ionospheric plasma would be expected to polarize, causing the electrons located at the bottom of the ionosphere to be pulled downward. Thus, the physical properties of the ionosphere respond to changes in the vertical distribution of electrons and ions in the ionospheric plasma. In this way, it will result in a vertical electric field, E. The electric field pushes upward current flow from the atmosphere into the ionosphere. The injected current could result in the ionosphere being subjected to an enforced electric field. The perpendicular component of the electric field causes plasma E × B motion, which results in fluctuations in the ionosphere's density [27,61].

**Figure 13.** The proposed physical mechanisms of ionospheric perturbations induced by earthquakes. A large area of rock is broken and torn before the earthquake, after which a channel is opened to continuously release radon gas to generate radioactive decay. A gas bubble, laden with positive airborne ions generated during the process of radon decay at the ground-to-air interface, expands upward through the atmosphere, carrying up the Earth's ground potential and eliciting a polarization response in the ionosphere. This leads to a redistribution of the electrons at the lower edge of the ionosphere and thus modifies its physical properties. Thus, the satellite will receive anomalous signals from the ionosphere.

> Moreover, the ionospheric anomalies before and after the earthquake could be positive or negative, and the probability of positive and negative perturbations is almost the same [62]. Simulations of anomalous electric fields show that if the anomalous electric field is westward, then the density enhancement occurs at the equator and the electron density decreases upward at the polarities [62]. If there is an eastward anomalous electric field, the positive and negative anomaly positions will be opposite. According to our knowledge, the anomalous electric field will be significantly different from one earthquake or different times of the same earthquake, so both positive and negative anomalies could be observed before and after the earthquake. In addition, Yao, et al. [63] also found that both positive and negative anomalies could occur before the earthquake by analyzing the GIM TEC of all Ms 7.0+ earthquakes in 2010. In addition, Zhao et al. studied the Wenchuan earthquake and also found that some of the GPS station data near the epicenter showed positive anomalies and some showed negative anomalies [64].

> In the development of earthquake mechanisms at this stage, more experimental and actual evidence of radon gas generation in pressurized rocks is needed because it

can provide a reasonable explanation for the uncertain relationship between pre-seismic electromagnetic anomalies and actual seismic events. In addition, in this endeavor, we suggest setting up ground-based observations of DC electric fields or using VLF noise anomalies to monitor the electrical activity of the atmosphere and lithosphere.

#### **6. Conclusions**

This study proposed the SafeNet deep learning framework for pre-earthquake ionospheric perturbation identification. SafeNet was trained and tested using 9017 independent earthquakes of magnitude 4.8 and above that occurred from April 2014 to April 2020, and the corresponding plasma and magnetic field data from the Swarm A satellite for about 6 years. The results indicated that electromagnetic pre-earthquake data within a circular region centered on the epicenter and with a radius given by the Dobrovolsky formula, with a model input window size of 70 consecutive points and nighttime sequence data, yielded the best performance in discriminating electromagnetic pre-earthquake perturbations, with an F1 score of 0.846 and an MCC value of 0.717. The study also concluded that the larger the magnitude of the earthquake, the better the performance of the SafeNet model in identifying possible pre-earthquake ionospheric anomalies. The results also suggest that the SafeNet model achieves a good performance for probable pre-seismic perturbation identification on the unbalanced dataset. In addition, based on constraints from this study, we proposed a new hypothesis on the physical mechanisms of earthquake-induced ionospheric perturbations.

In order to have a better understanding of the lithosphere, atmosphere, and ionosphere coupling mechanisms, to study pre-earthquake anomalies using electromagnetic satellites, the analysis of the spatial-temporal correlation of multi-sphere and multi-parameters by a remote sensing technique before earthquakes occur has become a hot research topic in recent years. However, it is difficult to match the different parameters and data of each sphere in both time and space. The existing multi-parameter earthquake studies are mainly focused on specific earthquakes or specific remote sensing parameters, have not sufficiently considered the other remote-sensing parameters of other spheres, and have not formed a complete chain of multi-parameter correlation analyses. The deep learning technology can overpass such limitations, combining the remote sensing parameters of multiple spheres in time and space and carrying out the analysis based on a consistent spatial-temporal framework, which could provide global earthquake cases and effectively explain the earthquake coupling mechanism models, and also expand the current tools for earthquake monitoring, providing new perspectives for earthquake prediction.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/article/ 10.3390/rs13245033/s1, Text S1. Swarm Alpha nighttime tracks before the M7.9 Sumatra 2 March 2016 earthquake. Text S2. Swarm Alpha nighttime tracks before the M7.8 Ecuador 16 April 2016 earthquake. Text S3. Performance metrics. Text S4. The SafeNet model structure. Figure S25. The flowchart of the proposed deep learning framework. Table S1. Search space of parameters for the SafeNet model.

**Author Contributions:** Conceptualization, X.Z. and X.S.; data curation, P.X.; investigation, D.M., A.D.S., and X.Z.; methodology, P.X. and A.D.S.; project administration, A.D.S. and X.S.; software, P.X.; visualization, P.X. and D.M.; writing—original draft, P.X.; writing—review and editing, D.M., A.D.S., X.Z., and X.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work is supported by the Special Fund of the Institute of Earthquake Forecasting, China Earthquake Administration (Grant No. 2021IEF0706, 2020IEF0510, 2021IEF0708) and funded by the National Key R&D Program of China under Grant No. 2018YFC1503505; Pianeta Dinamico-Working Earth (Italian Ministry of University and Research) and Limadou-Science+ (Italian Space Agency) Projects; National Natural Science Foundation of China (Grant number: 41974084); the China Postdoctoral Science Foundation (Grant number: 2021M691190).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The Swarm data can be downloaded from the ESA Swarm FTP and HTTP Server swarm-diss.eo.esa.int. The Kp indices used in this paper were provided by the World Data Center for Geomagnetism, Kyoto (http://wdc.kugi.kyoto-u.ac.jp/wdc/Sec3.html).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Statistical Study of Ionospheric Equivalent Slab Thickness at Guam Magnetic Equatorial Location**

**Yuqiang Zhang 1, Zhensen Wu 1,\*, Jian Feng 2, Tong Xu 2, Zhongxin Deng 2, Ming Ou 2, Wen Xiong <sup>2</sup> and Weimin Zhen <sup>2</sup>**


**Abstract:** The ionospheric equivalent slab thickness (τ) is defined as the ratio of the total electron content (TEC) to the F2-layer peak electron density (*Nm*F2), and it is a significant parameter representative of the ionosphere. In this paper, a comprehensive statistical analysis of the diurnal, seasonal, solar, and magnetic activity variations in the τ at Guam (144.86◦E, 13.62◦N, 5.54◦N dip lat), which is located near the magnetic equator, is presented using the GPS-TEC and ionosonde *Nm*F2 data during the years 2012–2017. It is found that, for geomagnetically quiet days, the τ reaches its maximum value in the noontime, and the peak value in winter and at the equinox are larger than that in summer. Moreover, there is a post-sunset peak observed in the winter and equinox, and the τ during the post-midnight period is smallest in equinox. The mainly diurnal and seasonal variation of τ can be explained within the framework of relative variation of TEC and *Nm*F2 during different seasonal local time. The dependence of τ on the solar activity shows positive correlation during the daytime, and the opposite situation applies for the nighttime. Specifically, the disturbance index (DI), which can visually assess the relationship between instantaneous τ values and the median, is introduced in the paper to quantitatively describe the overall pattern of the geomagnetic storm effect on the τ variation. The results show that the geomagnetic storm seems to have positive effect on the τ during most of the storm-time period at Guam. An example, on the 1 June 2013, is also presented to analyze the physical mechanism. During the positive storms, the penetration electric field, along with storm time equator-ward neutral wind, tends to increase upward drift and uplift F region, causing the large increase in TEC, accompanied by a relatively small increase in *Nm*F2. On the other hand, an enhanced equatorward wind tends to push more plasma, at low latitudes, into the topside ionosphere in the equatorial region, resulting in the TEC not undergoing severe depletion, as with *Nm*F2, during the negative storms. The results would complement the analysis of τ behavior during quiet and disturbed conditions at equatorial latitudes in East Asia.

**Keywords:** τ; geomagnetic equator; magnetic storm

#### **1. Introduction**

The ionospheric equivalent slab thickness (EST, usually named τ) is defined as the ratio of the total electron content (TEC) to the F2-layer peak electron density (*Nm*F2) and it thus represents an imaginary equivalent depth of the ionosphere, with a constant uniform density of the F2 peak. Strictly speaking, when TEC is obtained by means of GNSS (Global Navigation Satellite System) signals, the τ value will contain a plasmaspheric component in addition to the 'pure' ionospheric component [1]. In the view of satellite-to-ground radio communications, τ is a useful parameter in that it includes information on both the topside and bottomside ionosphere. The τ is, thus, very helpful in understanding the nature of variations of the upper atmosphere and is, therefore, capable of addressing many

**Citation:** Zhang, Y.; Wu, Z.; Feng, J.; Xu, T.; Deng, Z.; Ou, M.; Xiong, W.; Zhen, W. Statistical Study of Ionospheric Equivalent Slab Thickness at Guam Magnetic Equatorial Location. *Remote Sens.* **2021**, *13*, 5175. https://doi.org/ 10.3390/rs13245175

Academic Editor: Fabio Giannattasio

Received: 23 November 2021 Accepted: 16 December 2021 Published: 20 December 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

useful ionospheric parameters, such as Chapman scale height H, topside 'half-peak density height' h0.5top, and temperature changes in the ionosphere/thermosphere systems [2–7]. In addition, according to the definition of τ, it connects VTEC and *Nm*F2, belonging to different fields of remote sensing, which can be used to deduce *Nm*F2 from the TEC, and vice versa [8–13].

The ionospheric slab thickness has been a subject of study since the 1960s [8–33]. They found that the diurnal, seasonal, and solar activity variations of the τ show significant dependence on the location of the observing station. The greatest variability in τ is observed during periods of geomagnetic storms, and the influence of geomagnetic activity on the τ shows different patterns as the observing station and solar activity vary. Modelling studies have also been carried out to provide global τ [8–13,26,29].

The climatology of τ near the magnetic equator has been mentioned few times in previous studies. Odeyemi et al. [24,25] investigated the corresponding morphologies of ionospheric profile and peak parameters over an equatorial station in west Africa, Ilorin (8.50◦N, 4.68◦E), during a low solar activity year 2010, and they found that the post-sunset increase in τ is more prominent during solstice than equinoxes, of which the December solstice perceives a relatively higher magnitude than the June solstice in 2010. Àlàgbé [30] studied the correlation between the τ and B0 at the Burkina Faso station (12.4◦N, 1.5◦W, dip 5.9◦N), and they found that the maximum of τ at the station appeared in the daytime both high and low years of solar activity. The first peak occurred at between 05LT–06LT in the low solar activity year, and the daytime peak appears at about 11LT; the peak at sunrise in the high solar activity year is not apparent, while the daytime peak occurs at 12LT–13LT. The value of τ in high solar activity year is generally higher than that of low solar activity year. Duarte-Silva et al. [31] studied the τ at Palmas and São Jose dos Campos during one year of extremely low solar activity (from March 2009 to February 2010), and these two places are located at the inner edge of the anomaly region and the southern crest of the anomaly, respectively. They found that the τ at Palmas begins to increase gradually until its maximum value is reached between 13LT–15LT, and the maximum average peak values of τ, of about 477 km, were observed over Palmas during daytime (08LT–16LT) throughout the December solstice months. In addition, the study of τ behavior during the storm time at equatorial latitudes is relatively less. Therefore, long-term monitoring and analyzing observations at equatorial latitudes are needed to give a more comprehensive study of the climatology of τ and geomagnetic storm effect on it at this latitude.

Guam station (144.86◦E, 13.62◦N, 5.54◦N dip lat) is located at the equatorial north latitudes, where it lies between western Pacific Ocean and east of the Eurasian continent, and the surrounding area is dominated by marine environment. The ionospheric data are relatively sparse at this region, and there are few specific studies investigating the τ in this region. In order to strengthen the understanding of the variation of ionospheric thickness in East Asia near the magnetic equator, this paper uses the GPS-TEC and *fo*F2 data at Guam station, between 2012 and 2017, to statistically analyze τ dependence with season, local time, and solar activity. Specifically, we investigate the effect of geomagnetic storms on the τ variation in this station. This paper is organized as follows: Section 2 briefly describes the data and methodology employed in this study. In Section 3, τ variation is presented, with an emphasis on its variations during geomagnetic storms. Section 4 discusses the derived results and use one example to analyze the causes of storm enhancement of τ. Finally, conclusions of our study are found in Section 5.

#### **2. Data and Methods of Analysis**

The GPS TEC data in this paper are derived from UNAVCO (University NAVSTAR Consortium) database (http://www.unavco.org/ accessed on 23 November 2021). After downloading the receiver-independent exchange data (RINEX format) from UNAVCO, the GPS-TEC program developed by Seemala (https://Seemala.blogspot.com accessed on 23 November 2021) is applied to derive TEC values and elevation angle cutoff of 30◦ is adopted in order to eliminate the multi-path effects. In addition, the ionospheric shell

height, used to retrieve TEC, is 350 km. Uwamahoro et al. [34] explained, in detail, how the GPS-TEC software works. Moreover, the GPS-TEC software used in this study has been compared with other techniques, such as the one presented in Ref. [35], alongside the European Geostationary Navigation Overlay System (EGNOS) algorithm, which was taken as a reference in Ref. [36]. Generally, both software is consistent with EGNOS algorithm, but the Gopi software was found to be closer to EGNOS in low-latitudes. It is worth noting that the same GPS-TEC software used in the current work has extensively been used to derive TEC in ionosphere and in τ climatology studies that involved TEC computation [25,37–43]. Moreover, we used the regional kriging interpolation method to convert the STEC obtained by software to VTEC, rather than directly use the average values of VTEC data calculated by the software.

Simultaneous *fo*F2 data are obtained from the GIRO (Global Ionospheric Radio Observation) database (http://umlcar.uml.edu/DIDbase/ accessed on 23 November 2021) [44]. In the data processing, we filtered out the ionosonde data, which had a confidence score (CS) of less than 80 (maximum 100). The *Nm*F2 values are computed from the critical frequency of the F2 layer (*fo*F2) scaled from the recorded automatically ionograms, where *Nm*F2 = 1.24 × <sup>10</sup><sup>10</sup> (*fo*F2)2. Then, the TEC and *Nm*F2 data, with a 15 min resolution, are used to compute the equivalent τ by using the following relation:

$$\tau = \text{TEC} / \text{NmF2} = \text{TEC} / (1.24 \times 10^{10} \text{ (foF2)}^2) \tag{1}$$

where TEC is measured in TEC units (1016 electrons/m2) and *fo*F2 is the critical frequency of the F2-layer, with *fo*F2 given in MHz and *τ* in meters. For simplicity, we transform it in kilometers in the following.

Since (dayside) ionospheric density is produced, mainly, by solar EUV radiation (which is inferred by F10.7), we use F10.7 index to study the effect of solar activity on τ, and they are downloaded from NOAA (ftp://ftp.swpc.noaa.gov/pub/indices/old\_indices/ accessed on 23 November 2021). Figure 1 shows the overall pattern of the solar F10.7 index during the period 2010–2017. It can be seen from Figure 1 that the F10.7 in 2014 was the largest among these 8 years, with an average value of 145.9 SFU (Solar Flux Unit). The F10.7 in 2010, 2016, and 2017 was small, with the average values being 80 SFU, 88.7 SFU, and 77.3 SFU, respectively. Due to the very high rate of data availability in 2016 (94.6%), this paper selects 2016 as the year of low solar activity and 2014 as the year of high solar activity to study the influence of solar activity on τ.

**Figure 1.** F10.7 variations during the years 2012–2017.

Previous studies have found that geomagnetic storms have effect on the τ variation [15,27,32,33]. In order to obtain the τ variation during geomagnetic quiet condition, we selected all the geomagnetic storms satisfying Dstmin < −30 during the period 2010–2017, according to the Dst (http://wdc.kugi.kyoto-u.ac.jp/ accessed on 23 November 2021) index, and the τ data, during the selected geomagnetic storm, are excluded from the overall τ dataset. Then, the τ data during the geomagnetic quiet condition are divided into 288 grids, according to January–December and 01–23LT. Recently, Pignalberi et al. [22] used similar methodology to analyze τ at mid latitudes. The seasons are defined as ±45 days around vernal and autumnal equinoxes, June solstice, and December solstice.

The geomagnetic index Dst is used in this paper to select geomagnetic storm with minimum Dst < −50. To quantitatively describe the overall pattern of the geomagnetic storm effect on the τ variation, we define the disturbance index (DI):

$$\text{DI} = \pi\_{\text{s}}/\pi\_{\text{m}} - 1 \tag{2}$$

where τ<sup>s</sup> and τ<sup>m</sup> are storm-time and monthly median τ, respectively. Figure 2 shows an example during 2013 in Guam. Figure 2a illustrates the Dst variation, and the arrow 'main phase onset (MPO)' marks universal time of the magnetic main phase onset at 18:00 UT on 6 June. Figure 2b shows the variation of measured τ (blue lines) during 6–8 June and corresponding τ median (red) in June. Figure 2c presents the calculated DI index according to the measured τ value and its Median value. In addition, DI index in (storm time 1) S1 time period is used to study the characteristics of DI index during geomagnetic storm, while DI index in time periods (quiet time 1) Q1 and (quiet time 2) Q2 are used to study the one during geomagnetic quiet condition.

**Figure 2.** An example for the calculation of DI index during the period 6–8 June 2013.

#### **3. Results**

According to the grid division of January–December and 0LT–23LT, Figure 3 gives the distribution of mean and standard deviation of τ on this grid. It can be seen from the figure that the first peak of τ appears at 11LT–12LT in all seasons, and it is the maximum value throughout the day. Moreover, there is a second peak that appears at 18LT–22LT in winter and equinox. During the nighttime, τ is significantly smaller than that in the daytime. From the perspective of seasonal variations, the τ in winter and equinox are generally greater than that in summer. In addition, the annual maximum appeared in

October 12LT, reaching 437.6km, and the annual minimum appeared in April 5LT, reaching 105.2km. Moreover, the τ shows great variability during the night-time, especially during the pre-sunrise and post-sunset period, which is consistent with previous studies [7,22].

**Figure 3.** Map of mean (**a**) and standard deviation (**b**) of equivalent slab thickness (τ) under geomagnetically quiet condition, according to January–December and 0LT–23LT division in 2012–2017.

Figure 4 compared the distribution of mean and standard deviation of τ in the year of high solar activity and low solar activity, based on the same grid division. It can be seen from the figure that the EST in high/low activity year show some common features. That is, the first peak of τ appears from 11LT to 12LT in all season. The τ in 0LT–10LT is smaller than that in other periods, both during the years of high or low solar activity. The results also demonstrate that the τ during the daytime in winter and equinox is larger than the τ in summer. As for the differences between τ during the high and low solar activity year, it is worth noting that the τ in the 10LT–22LT time period of the high solar activity year seems greater than the average value of all years, while for other time periods, the τ is smaller. The opposite situation applies for the change in low solar activity year. In addition, nighttime and solar terminator hours are also characterized by the highest dispersion, in the same way as Figure 3b. Specifically speaking, τ shows more variability during the post-sunset period in the high solar activity year, while τ shows more variability during the pre-sunrise period in the low solar activity year.

Among previous studies of τ variation near geomagnetic equator, Àlàgbé. [30] found the maximum of τ at Burkina Faso (12.4◦N, 1.5◦W, dip 5.9◦N) appeared during the day, regardless of the solar activity of the year. The daytime peak appears at about 11LT in the low solar activity year; it appears at 12–13LT in the high solar activity year. Duarte-Silva et al. [31] found that the τ at Palmas station (10.12◦S, 48.21◦W, 7.73◦S dip) in the low solar activity have the following rules: the τ begins to increase gradually after sunrise, and its maximum value appeared between 13LT to 15LT. Moreover, annual maximum of τ reached 550 km in summer (that is, December) at the station. The τ pattern in this paper is similar to these findings, but it is not completely consistent. The ionospheric longitudinal difference may contribute to the differences between them [45–47].

Figure 5 presents the diurnal variations of mean and standard deviation of τ in summer, winter, and equinox at a geomagnetically quiet condition. It can be seen from Figure 5 that the first peak of τ in winter is significantly higher than that in summer, and the equinox τ is almost the same as the winter τ during this period (range 2 (R2) interval). There is another peak of τ in post sunset period in winter and equinox, and the summer τ is also significantly smaller than that in winter and equinox during this period (range 3 (R3) interval). Moreover, the τ during the post-midnight period is the smallest in equinox, especially during the period of 03LT–06LT (range 1 (R1) interval). In previous studies, the

sunrise peak of τ is a hot topic [21,26,33]. However, Figure 5 shows that, starting from 04LT in winter and summer, and 05LT in equinox, the τ gradually increases, and there is no peak. There is only a small peak appears at 06LT in winter. In addition, it can also be seen from the figure that the nighttime is characterized by the high dispersion, especially during the pre-sunrise and post-sunset period. During the pre-sunrise period, τ has highest variability in the summer, and τ has highest variability in the equinox during the post-sunset period.

**Figure 4.** Map of mean and standard deviation of equivalent slab thickness (τ) under geomagnetically quiet condition. (**a**) Mean of τ in high solar activity year 2014; (**b**) Mean of τ in low solar activity year 2016; (**c**) Standard deviation of τ in high solar activity year 2014; (**d**) Standard deviation of τ in low solar activity year 2016.

Taking into account the rapid changes during the sunrise and sunset period, mean daytime (08LT–16LT) and night-time (20LT–04LT) values of τ, for magnetically quiet days during the solar maximum phase 2014 and the solar minimum phase 2016, for three seasons (summer, winter, and equinox) are calculated (Table 1) to investigate the dependence of τ on solar activity. It can be seen from the table that, regardless of the high and low years of solar activity, the τ during the day is greater than the τ at night, and the seasonal characteristics of daytime τ (the τ in winter and equinox are greater than the summer τ) are more obvious in the years of high solar activity. The most important finding is that the mean τ during the day, in the high solar activity, is greater than that in the low solar activity, whereas the opposite situation applies for the τ at night.

**Figure 5.** Diurnal variations of mean and standard deviation of equivalent slab thickness (τ) in winter, summer, and equinox at geomagnetically quiet condition.

**Table 1.** Mean daytime (08:00–16:00LT) and night-time (20:00–04:00LT) values of ionospheric equivalent slab thickness (τ) for magnetically quiet days during the high solar activity year 2014 and the low solar activity year 2016.


The ionosphere is not the same every day, even under undisturbed geomagnetic and solar conditions. In order to compare storm-time τ variation with quiet time τ variations. Figure 6 displays the mean DI index during quiet-time. In this case, the index is

$$\mathrm{DI} = \pi\_{\mathrm{q}}/\pi\_{\mathrm{s}} - 1$$

where τ<sup>q</sup> and τ<sup>m</sup> are quiet-time and monthly median τ, respectively. According to if the DI is positive or negative, the daily variation of mean and standard deviation of DI index are shown in two panels, respectively. The blue curves represent the positive DI variation, and the red curves represent the negative DI variation. It is important to note that thick lines correspond to the average values of positive and negative DIs for the entire period, and vertical lines correspond to STD. As it shown, the τ is most stable in the daytime (08LT–19LT), and the DI index ranges from −0.12 to 0.14 during this period. On the contrary, the τ is more variable during the period of night-time to sunrise period (22LT–07LT) with DI index more/less than 0.19/−0.15. The τ is most variable during the sunrise period, with maximum/minimum having reached 0.46/−0.3. It is consistent with the above results that the great variability of τ at sunrise, indicating that the τ values fluctuate sharply around sunrise. Previous observations and model simulations also illustrate that the day-to-day variability reaches the maximum at dawn, and they suggested that the day-to-day variability of neutral winds in the E region (≤~130 km) is the primary driver of the day-to-day variability of dawn ionosphere at the geomagnetic equator [48–51]. For the post sunset period (19LT–22LT), the DI increases/decreases from 0.12/−0.11 to 0.23/−0.15, whereas it decreases/increases from 0.21/−0.2 to 0.11/−0.11 during the post sunrise period (07LT–08LT).

**Figure 6.** Diurnal variation of the mean and standard deviation (STD) of DI index at geomagnetically quiet conditions.

Figure 7 presents the storm-time DI variation. The storm-time DI pattern is basically consistent with that of quiet-time but with more variability. During the daytime (08LT–19LT), the τ has smaller variability, and the DI indices are between −0.2 to 0.27. By contrast, the τ is more variable during the nighttime to sunrise period (23LT–07LT), with DI indices more/less than 0.287/0.168. In addition, the τ is also most variable during the sunrise period (05LT–07LT), and maximum/minimum DI index are −0.45 and 0.88, respectively. For the post-sunset period (19LT–21LT), the DI increases/decreases from 0.22/−0.15 to 0.44/−0.17, and it decreases/increases from 0.26/−0.19 to 0.15/−0.12 during the sunrise period (07LT–08LT).

**Figure 7.** Same as Figure 6 but for geomagnetic storm periods.

To better illustrate the geomagnetic effect on the τ, Figure 8a compares the storm-time and quiet-time DI variations. One can find that the storm-time DI is more perturbed than quiet-time DI. Most of the storm-time positive DI are larger than quiet-time positive DI values, and this phenomenon is more apparent during night-time, especially during sunrise period. On the other hand, the storm-time negative DI are a little smaller than quiet-time negative DI values, except the pre-dawn period (4:30LT–6:30LT) with large difference. In order to illustrate the difference between them, we calculated the DI difference based on the sign of the DI index, and the result is shown in Figure 8b. As can be seen from Figure 8b, the positive DI difference during the storm is below 0.106 from 08LT to 19LT during the day. It is relatively large at sunrise (04LT–07LT) and sunset (19LT–21LT), reaching 0.45 and 0.24, respectively. The negative DI difference is less than −0.08 in all time periods, except for the period of sunrise, which reached −0.15 between 05LT–06LT. Therefore, we preliminarily draw the conclusion that geomagnetic storms have positive effects on the τ during most of the storm period, except they have negative effects on the τ during the sunrise period (4:30LT–6:30LT) in some cases.

**Figure 8.** (**a**) Comparison of diurnal DI index for geomagnetically quiet condition and geomagnetic storm condition (**b**) The minus between the DI index during geomagnetic storm periods and DI index at geomagnetically quiet time, according to the sign of DI.

#### **4. Discussion**

The Guam station is located near the magnetic equator. The most important factor affecting local ionospheric state is the fountain effect and the ionospheric equatorial anomaly (EIA) caused by it. At the same time, the geographic latitude of Guam Station is 13.43◦N, and the trans-equator neutral wind can affect its ionospheric state significantly. Strong geomagnetic storms lead to positive or negative ionospheric storms, depending on season, local time, and the universal time (UT) arrival time of storms [52–55]. Different scales of neutral atmosphere waves, such as gravity waves, thermospheric tides, and planetary waves, driven by lower atmosphere processes, can also influence the ionosphere directly through plasma transport, or indirectly through electrodynamic processes [56,57]. Therefore, the τ in this paper displays a complicated pattern. Considering that geomagnetic activity is the main contributor to the relative ionospheric variability, we first discussed the τ under geomagnetic quiet conditions in Guam.

In order to explain the diurnal variation of τ, Figure 9 gives the quiet-time variation of TEC and *Nm*F2 in Guam. In addition, it can be easily determined that the local time variation of τ depends on the variation in TEC and *Nm*F2, obtained from Equation (1), as the following:

**Figure 9.** (**a**) Diurnal variations of mean and standard deviation of total electron content (TEC) in Guam during winter, summer, and equinox, respectively. (**b**) Diurnal variations of mean and standard deviation of F2-layer peak electron density (*Nm*F2) in Guam during winter, summer, and equinox, respectively.

To better understand the possible formation mechanisms of the peak of τ, the temporal variation in the ionospheric parameters, TEC and *Nm*F2, are given in Figure 10. According to Equation (3), the increase in the ratio variation of τ can be divided into following categories: (1) TEC increase and *Nm*F2 decrease; (2) TEC and *Nm*F2 increase, but the ratio of TEC increase is larger than that of *Nm*F2; (3) TEC and *Nm*F2 decreases but the ratio of *Nm*F2 decrease is larger than that of TEC. The opposite situation applies for the decrease in τ. As shown in Figure 9, TEC and *Nm*F2 consistently increase with the solar zenith angle, which increases after sunrise. Meanwhile, new electrons produced by the photoionization, in the magnetic equatorial ionosphere, began to be transported upward and diffused downward due to the fountain effect. Considering that the photoionization plays a much more important role in controlling the overall variability in the integrated quantity (i.e., TEC) than a localized quantity (i.e., *Nm*F2) [58], the increases of ratio in TEC are larger than that in NmF2, as presented by Figure 10, and it causes the increases in τ during the forenoon.

**Figure 10.** (**a**) Variation rates in TEC for winter, summer, and equinox, respectively. (**b**) Variation rates in NmF2 for winter, summer, and equinox, respectively.

Figure 5 shows that the τ in Guam reaches its maximum value at noontime, consistent with previous results for equatorial latitudes [24,25,30,31]. Moreover, Figure 5 also shows that the first peak of τ, in winter and equinox, are larger than in summer and this phenomenon is a new finding of this study. It is observed from Figure 9b that *Nm*F2 in winter and equinox show 'noon bite-out' during the period of first τ peak, whereas TEC do not show this phenomenon during the full year (R2 interval). Apparently, these variations, together, result in seasonal minimum of the first peak in summer, according to Equation (1). Noon bite-out in the low latitude to low equator has long been known, and previous studies illustrate that this feature is mainly caused by the upward **E** × **B** drifts in the F region during daytime and noon bite-out was absent in the daytime TEC variation [59,60]. Lee [61] used TEC, *Nm*F2, and *hm*F2 data, observed by the Jicamarca Digisonde and SAMI2, to model the noon bite-out under geomagnetic quiet condition and found that the reason why the noontime bite-out in TEC is absent is, principally, the upward **E** × **B** drifts because the drifts mainly account for the absent noontime bite-out in bottomside TEC and topside TEC. As for the absence of noon bite-out in summer, it might be caused by the neutral winds in Guam. Chen et al. [62] also suggested that the reason is seasonally dependent trans-equator plasma transport, induced by neutral winds. Since the meridional winds are equatorward during the summer daytime in Guam, plasma outflow, induced by transequator transport, results in significantly electron density depletion before the fountain effect induced bite-out. As a result, the pre-noon *Nm*F2 peak is restrained. Thus, noontime bite-out variation pattern does not appear in summer in Guam.

As the solar zenith angel decreases and vertical drift velocity takes a downward trend, the electron density continues to decrease after noontime. This indicates a reduction in the *Nm*F2 and TEC after midday, as illustrated in Figure 9. As described above, the photoionization plays a much more important role in controlling the overall variability in the TEC than *Nm*F2. The variation of *Nm*F2 is, therefore, less sensitive to the solar zenith angle compared to TEC. As shown in Figure 10, the ratio in decrease in TEC is larger than that in *Nm*F2, which causes the decrease in τ in the afternoon.

Figure 5 depicts that there is another peak of τ that appears around sunset in winter and equinox. As shown in Figure 9, TEC and *Nm*F2 continue to decrease during this period, due to the solar radiation decreases (R3 interval). It is the fact that the ratio of decreasing in TEC is smaller than that in *Nm*F2 results in the increase in τ, based on Equation (3) and Figure 10. It is generally accepted that pre-reversal enhancement (PRE), which refers to the strong upward **E** × **B** drift during the post-sunset period, appears from the equator to the low latitude region. Moreover, the electrodynamic process, such as **E** × **B**, has stronger and more direct control only in the F-region height and density. Ionosphere generally exists at heights between 60–1000 km, whereas the plasmasphere could reach tens of thousands kilometers. Therefore, the plasmasphere receives more solar radiation than ionosphere

after sunset, and TEC thus decreases slower than *Nm*F2. Based on these scenarios, there is an increase in τ during sunset period. In addition, night-time enhancement in electron density, at low latitudes, has been studied by several groups of workers and the post sunset enhancement is known to be caused by the PRE. It is necessary to note the night-time enhancement in electron density is said to occur if its value is greater than the exponentially decaying background value. Su et al. [63] reported that the frequency of occurrence of the post-sunset TEC enhancement has two maxima in equinox, and these equinoctial maxima are separated by a broad summer minimum at the eastern low-latitude station Taiwan (11.5◦geomagnetic latitude). Therefore, the post-sunset τ does not illustrate an increase in summer, whereas they show another peak in winter and equinox.

Agrawal et al. [64] has shown that pre reversal increase in the vertical **E** × **B** drift, the primary source for the electron density enhancement, is strongest at equinox in both the eastern and western longitudes, and the night-time enhancement in *Nm*F2 and TEC is, therefore, greatest in equinox, as displayed by Figure 9. However, there is no solar radiation after sunset, and F region only constitutes a part of space environment when evaluating TEC. Therefore, the TEC enhancement is not apparent as *Nm*F2 enhancement. It causes the equinox/solstice ratio in *Nm*F2 to grow larger than that in TEC during the post-midnight to sunrise period, as shown in Figure 9. Figure 9 also shows that the post-midnight enhancement is most obvious in summer, consistent with previous studies. As the recombination process takes place steadily, one can find that the TEC and *Nm*F2 continues to decreases during the post-midnight period, respectively. In addition, the downward transport of plasma will slow the decrease in *Nm*F2 in the nighttime. Therefore, the decrease in τ to the ratio of decrease in TEC is larger than that in *Nm*F2, as shown in Figure 10. Moreover, the τ in equinox is smallest, which is caused by the aforementioned largest equinoctial *Nm*F2 and the faster decrease in TEC in equinox than in solstice, as shown in Figures 9 and 10, respectively.

In previous studies, the sunrise peak of τ has been a hot topic [21,22,26,33]. These studies indicate that the pre sunrise peak is a widely observed feature from low to high latitudes, and τ even reaches a maximum at sunrise for specific seasons. However, Figure 5 shows that, starting from 04LT in winter and summer, and 05LT in equinox, the τ gradually increases, and τ gets very low values (around 100 km) during the pre-sunrise period. For equatorial latitudes, the similar phenomenon was also seen for other longitudes, as follows. Àlàgbé [30] studied the correlation between the τ (TF2 in his paper) and B0 at the Burkina Faso station (12.4◦N, 1.5◦W, dip 5.9◦N), and they found that the τ at Burkina Faso station (12.4◦N, 1.5◦W, dip 5.9◦N) seemed to also have a very low value (less than 100 km) at pre-sunrise period. Duarte-Silva et al. [31] have also found that the Palmas station (10.12◦S, 48.21◦W, 7.73◦S dip lat), located in the inner edge of the anomaly region, also seems to get very low values around the pre-sunrise period. Odeyemi et al. [24,25] have studied the τ over Africa Ilorin (8.50◦N, 4.68◦E), and they found that the τ also seems to get very low values on some months during the low solar activity period. Therefore, the very low values at pre-sunrise hours might not be a special phenomenon in Guam because it could also be seen at other longitudinal equatorial latitude station. Moreover, the TEC should give a more pronounced contribution than *Nm*F2 during the pre-sunrise period. Therefore, the τ begins to increase during the pre-sunrise period, as can be seen in Figure 5, but it does not reach a peak. To further understand the morphology of τ during the pre-sunrise period, more observations and modelling are needed to provide a statistical picture and study the physical mechanism of this phenomenon in the future.

Among previous studies on τ, the dependence of τ on solar activity has shown different correlations in different latitudes [16,19]. Àlàgbé [30] shows that the day-time τ on the magnetic equator, in the high solar activity, is significantly greater than that in the low solar activity. Jayachandran et al. [21] also found that τ at noon is positively correlated with solar activity when studying τ in low and middle latitudes. The results shown in Table 1 confirm their conclusions. As for the τ decreases with solar activity at night, it should be due to the fact the H+-O+ transition height in the low solar activity is lower than that in the high solar activity at night. It is known that the τ is sensitive to the variations of H+/O+ ratio at the F2 peak or equivalent to the transition level at which [O+] = [H+]. Large downward fluxes of H+ can decrease the O+ to H+ transition levels, thereby increasing the topside content and, hence, the slab thickness.

Geomagnetic activity is another key factor affecting the perturbations of the τ. Previous studies have shown that the influence of geomagnetic activity on the τ depends on the location of the observing station and solar activity. For the mid latitude, the τ seems to systematically enhance during periods of geomagnetic disturbances [15,32,65]. However, magnetic activity does not appear to have significant influence on the τ variations at low latitude [22,66]. Until now, there is no specific statistical study analyzing the geomagnetic storms effect on the τ at the near equator latitudes (best to our knowledge). Therefore, we made a statistical study of the geomagnetic storms effect on the τ in Guam, based on the aforementioned DI index during the geomagnetic storm period from the year 2010 to 2017. Comparing with the geomagnetic quiet time DI index shown in Figure 6, the results in Figure 8 suggest that the geomagnetic storm seems to have a positive effect on the τ during most of the storm period near the equatorial station Guam.

During geomagnetic storms, the main process affecting τ (TEC/*Nm*F2) in Guam includes: (1) penetration electric field driven by the solar wind ranging from high latitude to equator; (2) the equator-ward neutral wind resulting from particle precipitation and Joule heating at high latitude, sometimes accompanied with travelling atmosphere disturbance (TAD); (3) the disturbance dynamo electric fields produced by the globally altered thermospheric winds during magnetic storms; (4) composition changes, driven by storm time, altered neutral winds. These coupled drivers together with background thermosphereionosphere generate ionospheric storms and control their strengths in complicated ways. In the following, we will start from these aspects and use an example on the 1 June 2013 in Guam to analyze why the τ tends to increase when the positive or negative ionospheric storm occurs.

Figure 11 shows the geophysical conditions during 1–2 June 2013. As IMF Bz turns southward at 0:00UT, Dst began to decrease sharply, reaching a minimum of −124 nT at 8:00UT, and then, the geomagnetic storm entered the recovery phase. The DI indices of τ, TEC and *Nm*F2 of 1 June are given in Figure 12. As shown in the figure, with the start of the main phase, caused by southward reversal of IMF Bz, a TEC positive storm occurred during the main phase, but the increase in *Nm*F2 during this period was very small, resulting in a τ positive storm during this period (T1 interval in Figure 12).

**Figure 11.** The geophysical conditions during 1–2 June 2013.

**Figure 12.** (**a**) The DI index of TEC and *Nm*F2 on 1 June 2013. (**b**) The DI index of τ on 1 June 2013.

It is generally accepted that the penetration electric field and equatorward neutral wind should be the primary drivers of the positive disturbance at this latitude [43,58]. From the fact that TEC positive storm occurred during the main phase of the magnetic storm, caused by the southward reversal of IMF BZ in Figure 12, we infer that there is a penetrating electric field during this period. The daytime penetrating electric field tends to move the plasma from the equatorial region into higher latitudes through the fountain effect. Meanwhile, the increased upward vertical drifts also transport local plasma in the bottomside ionosphere or the F2 layer to higher altitudes where the chemical recombination rate is low, so plasma accumulates in the topside ionosphere. On the other hand, the plasma, transported upward from the bottomside ionosphere, would be compensated by daytime ionization. As a result, a large increase in TEC is accompanied by a rather small increase in *fo*F2 and τ tends to enhance during the positive ionospheric storm, as shown in T1 interval. In order to confirm above theory, Figure 13 shows the DI index of ionospheric bottomside TEC (BTEC) and topside TEC (TTEC) relative to their monthly median value on 1 June. Among them, BTEC is obtained by integrating the electron density below the ionospheric peak height *hm*F2, while TTEC is obtained by subtracting BTEC from the TEC. Figure 13 shows that the ionospheric positive storm in this time period is mainly caused by the TTEC, while the BTEC gives little contribution. From the perspective of plasma scale height, which has a close relationship with τ, the increased upward drifts changed the shape of the topside ionosphere and increased the topside effective plasma scale height dh/d(ln(Ne)) significantly, so the whole τ is thus increased. Moreover, the enhanced equatorward wind during storm time should also play an important role in producing changes in electron densities, as it can raise the F2 region to a higher altitude to inhibit field-aligned ambipolar diffusion, thus causing changes in the shape of the ionospheric density profile [67,68].

As the geomagnetic storm entered the recovery phase, *Nm*F2 experienced a negative storm, but the disturbance of TEC during this period was small, resulting in a positive disturbance of τ (T2 interval in Figure 12). It is well known that the disturbance dynamo electric fields produced by the globally altered thermospheric winds during magnetic storms and composition changes brought by equatorward neutral winds are the main drivers of negative storms in the region we are interested in [54]. However, TEC and *Nm*F2 might have contrasting behavior during the recovery phase of the geomagnetic storm, as shown in Figure 12. Since the topside ionosphere, at equatorial latitudes, connected with middle and low latitudes of the region near the F2 peak at the flux tube, the storm time enhanced equatorward wind tends to push more plasma at low latitudes from the region near the F2 peak into the topside ionosphere in the equatorial region [69]. Consequently, the topside TEC in the equatorial region can undergo an obvious enhancement, and it

causes τ to increase, combining with the negative disturbance of *Nm*F2. Figure 13 confirms this theory, BTEC experienced a large negative disturbance, but TTEC did not, and even was in a positive disturbance state for most of the time during T2 interval. Therefore, the positive disturbance of TTEC, caused by equatorial neutral wind during a storm, makes TEC decrease little when *Nm*F2 has a negative storm occur, resulting in an increase in τ.

**Figure 13.** The DI index of TTEC and BTEC on 1 June 2013.

#### **5. Conclusions**

Based on the TEC and *Nm*F2 data from the years 2012–2017, this paper statistically analyzed the τ at equatorial latitude Guam. The results show great diurnal, seasonal, solar, and geomagnetic activity variation. A brief review of observations made by other researchers has also been presented, and we obtained the following results, which confirm similar studies:


Moreover, we obtained some new results, which provide interesting insights into τ of this region:


other longitudinal equatorial latitude station. In addition, longitudinal difference might also contribute to the difference.

3. The geomagnetic storm seems to have a positive effect on the τ during most of the storm period in Guam, except at sunrise period, when the τ attains large variability, even at the geomagnetically quiet condition. This study also provides a new physical explanation for the observed effect of geomagnetic storm on τ in Guam. During the positive storms, the penetration electric field along with storm time equator-ward neutral wind tends to increase upward drift and uplift F region, causing the large increase in TEC accompanied by relatively small increase in *Nm*F2. On the other hand, an enhanced equatorward wind tends to push more plasma, at low latitudes, into the topside ionosphere in the equatorial region, resulting in the TEC, which does not undergo severe depletion as *Nm*F2 does during the negative storms. Therefore, the geomagnetic storm seems to enhance τ both during the positive and negative ionospheric storms.

**Author Contributions:** Conceptualization, Y.Z., Z.W. and T.X.; methodology, Z.D.; investigation, J.F. and Y.Z.; validation, Y.Z.; formal analysis, Y.Z. and W.X.; resources, M.O.; visualization, Z.D. and W.Z.; funding acquisition, M.O. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by the National Key R&D Program of China(Grant No. 2018YFF 0103700), Natural Science Basic Research Program of Shaanxi (Program No. 2020JQ-331).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The use of ionosonde data provided by GIRO (Global Ionospheric Radio Observation) database (http://spase.info/SMWG/Observatory/GIRO accessed on 23 November 2021). The use of GPS data provided by UNAVCO database (University NAVSTAR Consortium) (http://www.unavco.org/ accessed on 23 November 2021).The use of Dst data provided by WDC (http://wdc.kugi.kyoto-u.ac.jp/ accessed on 23 November 2021). The use of F10.7 data provided by NOAA (ftp://ftp.swpc.noaa.gov/pub/indices/old\_indices/ accessed on 23 November 2021).

**Acknowledgments:** We acknowledge the use of ionosonde data provided by GIRO (Global Ionospheric Radio Observation) database (http://umlcar.uml.edu/DIDbase accessed on 23 November 2021), GPS data provided by UNAVCO database (University NAVSTAR Consortium) (http: //www.unavco.org/ accessed on 23 November 2021), Dst data provided by WDC (http://wdc.kugi. kyoto-u.ac.jp/ accessed on 23 November 2021) and F10.7 data provided by NOAA (ftp://ftp.swpc. noaa.gov/pub/indices/old\_indices/ accessed on 23 November 2021).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

