1. State-of-the-Art Review and Paper Contributions
The radio frequency fingerprinting (RFF) concept refers to the process of identifying the hardware (HW) characteristics and HW-specific features, or signatures, embedded in the radio frequency (RF) waves transmitted over a wireless channel [
1,
2,
3,
4]. In a strict sense, RFF refers only to the transmitter-specific HW features. In a broader sense, the RFF process has also been studied in the context of channel characteristics or features, typically in the context of indoor positioning [
5,
6,
7,
8], as well as in the context of joint transmitter–receiver identification [
9]. In this paper, we adopted the first definition of RFF, namely that the ‘features’ to be identified refer to HW specifics of a wireless transmitter. As a side note, this RFF concept is also encountered in the research literature under the names of
specific emitter identification (SEI) or
physical layer identification. The purpose of any RFF technique is to identify genuine transmitters (or transceivers) and distinguish them from malicious ones. For example, the authors in [
10] performed a thorough analysis of GPS signals using a 30 m dish antenna, illustrating the evolution of the signal quality among the different GPS satellite generations. The paper indirectly showed that with a sufficiently high gain antenna, if the signal-to-noise ratio (SNR) is sufficiently improved, it is possible to identify the specific GNSS signal transmitter.
Especially in the context of global navigation satellite systems (GNSS), intentional interference such as jamming and spoofing has been on the rise in recent years and can have significant adverse effects on the navigation performance of GNSS receivers, as discussed for example in [
11,
12,
13,
14,
15].
Future aviation applications, and in particular unmanned aerial vehicles (UAVs), will increasingly rely on GNSS-based navigation and positioning solutions [
14,
15]. Safety-critical applications, such as those from the aviation domain, require a high capability of anti-spoofing and anti-jamming detection, or, in other words, a high identification accuracy of genuine and malicious transmitters.
There are many authentication and anti-spoofing methods in GNSS that are not based on RFF; such methods have been widely studied at the post-correlation level, and especially at the navigation level [
11,
16,
17,
18,
19]. Recently, with the advent of RFF concepts in many non-GNSS wireless communications and with the increased capabilities of machine learning (ML) approaches, RFF solutions have also started to be considered in the GNSS field. In particular, whether RFF can work with raw GNSS data in the pre-correlation domain, before acquisition and tracking, remains an open research problem. The purpose of this paper is to shed more light on whether RFF on pre-correlation GNSS data can work and on what the challenges and limitations in this field are. To address the research gap of how to apply the well-known RFF and ML methods (to date widely used in other research fields) in the context of GNSS receivers, we present a comprehensive survey of RFF and ML methods, discuss their applicability in the GNSS context, and introduce a novel methodology to deal with RFF in GNSS, by presenting equivalent block diagrams of the genuine and non-genuine GNSS transmitters. We also give an initial glimpse of which transmitter features are the most important in the context of GNSS transmitters, based on an in-house simulator with Matlab and Python modules. We further summarize the remaining challenges when dealing with realistic environments and point out a few possible paths for future research in this challenging field.
A schematic block diagram of the three domains (pre-correlation, post-correlation, and navigation) of a typical GNSS receiver is shown in
Figure 1. The pre-correlation domain refers to the data at the output of the Automatic Gain Control (AGC) and Analog-to-Digital Converter (ADC) shown in
Figure 1, in other words, to the raw I/Q samples before the acquisition stage of the GNSS receiver. These samples are typically received at a very low signal-to-noise ratio, but they can carry important information about the ‘features’ of the transmitter, as they are not yet smoothed or filtered by the correlation filters.
A good survey of anti-spoofing methods based on the post-correlation and navigation data in GNSS can be found for example in [
19]. However, neither pre-correlation methods nor RFF methods were addressed there. Other surveys of anti-spoofing methods can be found, for example, in our previous work [
11,
20], where again only the post-correlation and navigation anti-spoofing solutions were addressed. Feature-selection methods for RFF based on the navigation domain of a GNSS signal have also been addressed in [
21]. Surveys on the RFF methods are more difficult to find in the current literature, and they are typically focused on non-GNSS signals, such as cellular, Internet of Things (IoT), or WiFi signals [
22,
23,
24,
25,
26].
As seen in the discussions above, there is still a lack of surveys of RFF methods for GNSS transmitter authentication in the current literature, particularly of surveys of GNSS authentication relying on pre-correlation signals. In this paper, we address this gap via a comprehensive study of the literature of the past two decades, as well as via theoretical insights and a preliminary analysis of algorithms. Our contributions are as follows:
Offering a thorough survey of RFF methods applied with GNSS and non-GNSS wireless data in the literature, and discussing which of these RFF methods have potential in GNSS, and in particular in GNSS with pre-correlation data. Finding good anti-spoofing methods based on pre-correlation GNSS data could have tremendous benefits for the future GNSS receivers, by being able to detect and remove non-genuine signals even before processing them further in the acquisition and tracking loops. Our survey is unique in the current literature, as the RFF methods for GNSS have to date not been widely investigated and there is a current lack of unified surveys on this;
Proposing a step-by-step problem definition of RFF in the context of GNSS signals, by examining in depth the sources of possible transmitter hardware impairments, and also discussing the possible channel and receiver hardware impairments; this decomposition of the problem into a feature-by-feature investigation is also lacking from the current GNSS literature, to the best of our knowledge;
Proposing a four-step generic RFF approach, consisting of: feature identification, feature extraction, data pre-processing, and data classification. Classical ML and transform methods are used in this four-step methodology, but the four-step block diagram is rather novel;
Presenting the mathematical models of different GNSS transmitter features, with a particular emphasis on five main identified features, namely: the power amplifier non-linearities, the digital-to-analog converters’ non-linearities, the phase noises of the local oscillators, the I/Q imbalances, and the band-pass filtering at the edge of the transmitter front-end; unified mathematical models of the transmitter HW impairments are not found in the current literature, to the best of the authors’ knowledge;
Providing the equivalent transmitter block diagrams for GNSS and spoofers by incorporating the aforementioned five hardware effects into the models;
Presenting an illustrative simulation-based analysis under ideal conditions in order to emphasize the impact of each HW feature on the RFF performance. Three feature extractors were used to identify the transmitter HW impairments, namely the kurtosis, the Teager–Kaiser energy operator (TKEO), and the spectrogram. The classification accuracies given as examples are based on support vector machines (SVM). Such a simplified analysis allows us to identify the strongest features among the five considered and to point out the remaining challenges to overcome before RFF methods become feasible under more realistic GNSS scenarios;
Bringing in a qualitative discussion on the existing algorithms and providing a roadmap towards further research on RFF in GNSS for interference detection and classification.
The rest of this paper is organized as follows:
Section 2 presents the use case of a spoofing attack on an on-board GNSS receiver and describes the various spoofing types and anti-spoofing approaches existing in the literature. It also clarifies the fact that the focus of our paper is on pre-correlation approaches using the I/Q sample-level data as inputs, but the proposed methodology and the identified feature extractors and classifiers can also be applied in a broader sense, with post-correlation and navigation GNSS data, as well as with non-GNSS data.
Section 3 gives an overview of the main identified transmitter HW impairments (i.e., ‘features’), which can distinguish between genuine and spoofing transmitters in RFF-based approaches.
Section 4 presents the equivalent transmitter block diagrams for GNSS and spoofer signals, by emphasizing the places in the transmission payload where the various RF impairments can appear. This also shows the equivalent block diagram of the whole transmitter–channel–receiver chain and discusses the additional impairments that can be introduced by the channels and the receiver parts.
Section 5 focuses on feature-extractor transforms and presents various transforms which can be employed to determine the underlying features in the received signal.
Section 6 focuses on classification approaches which can be used to identify the features, after the feature-extractor transform is applied.
Section 7 presents a simulation-based example and the feature down-selection.
Section 8 summarizes the main RFF solutions from the existing literature, applied on pre-correlation signals, for both GNSS and non-GNSS signals.
Section 9 discusses the methods applicable to GNSS among those listed in
Section 8 and offers a qualitative and comparative view of such approaches. Finally,
Section 10 summarizes the open challenges in this field as well as the further methodological steps to be undertaken by a designer implementing RFF algorithms based on pre-correlation GNSS data.
2. Problem Definition and Use-Case Example
Most of the GNSS signals use the code-division multiple access (CDMA) technique, with a received signal power around dBW. This means that the received signals are usually below the noise floor. For this reason, the direct observation of the signal is in general not feasible, if not using extremely high gain antennas. Therefore, when applying RF fingerprinting it is essential to evaluate the capability of the technique to operate at low SNR.
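The low-SNR regime can be made concrete with a short calculation. The sketch below converts a carrier-to-noise density ratio C/N0 into the pre-correlation SNR within a given front-end bandwidth B, using SNR = C/N0 - 10 log10(B). The 45 dB-Hz value and the 4 MHz bandwidth are illustrative assumptions and are not taken from this paper; the 100 dB–Hz value is the near-noise-free case used in the illustrative examples of this paper.

```python
import math

def precorrelation_snr_db(cn0_dbhz: float, bandwidth_hz: float) -> float:
    """SNR (dB) of the signal within `bandwidth_hz`, before despreading."""
    return cn0_dbhz - 10.0 * math.log10(bandwidth_hz)

# Assumed, illustrative values: a nominal C/N0 and a 4 MHz front-end bandwidth.
snr_nominal = precorrelation_snr_db(45.0, 4e6)
# Near-noise-free illustrative case (100 dB-Hz) over the same assumed bandwidth.
snr_ideal = precorrelation_snr_db(100.0, 4e6)

print(f"{snr_nominal:.1f} dB, {snr_ideal:.1f} dB")  # -21.0 dB, 34.0 dB
```

Under these assumptions the nominal signal sits roughly 21 dB below the noise within the front-end bandwidth, which is why any fingerprinting feature has to survive a strongly noise-dominated input.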
A spoofing scenario is illustrated in
Figure 2. In this example scenario, both the drone-based spoofer and the GNSS target receiver (e.g., a civil aircraft such as a flying taxi or a rescue helicopter) receive the GNSS signals broadcast by the satellites. During the spoofing attack, the GNSS target receiver receives the spoofing signals from the spoofer together with the genuine GNSS signals from the satellites in the sky, and its task is to identify and mitigate the spoofing interference in order to attain optimal positioning performance. Based on the GNSS signals received from the genuine satellites in the sky, the spoofer is able to create fake GNSS-like signals, which it then broadcasts. There are many ways in which a spoofer can generate a GNSS signal, as described below, ranging from simplistic through intermediate to sophisticated attacks.
Figure 2 illustrates only one of the many possible scenarios one could imagine when a GNSS receiver is spoofed by one or several malicious transmitters. More details about spoofing classes and possible mitigation solutions are addressed below.
Spoofing attacks are typically split into three classes, described in detail in [
11]:
Simplistic spoofing attacks, such as those generated by a software-defined radio (SDR) GNSS generator connected to an antenna. In this type of attack, the GNSS transmitter is not synchronized to the genuine GNSS satellites, which means that there are typically jumps in the carrier-to-noise ratios (CNR) and Doppler shifts measured at the receiver. Such spoofing attacks can be identified in the pseudorange domain via various consistency-check algorithms, such as those described in [
27,
28,
29];
Intermediate spoofing attacks [
30,
31]: these are more complex than the simplistic attacks as they combine a GNSS generator with a GNSS receiver and are able to align the code-phase and synchronize the frequency with the signal transmitted from a genuine GNSS satellite in the sky. A replay attack or a meaconing attack with a single receiver (when the signal from a genuine GNSS satellite is captured and re-sent with a delay) is an example of such an intermediate spoofing attack;
Sophisticated spoofing attacks [
32]: these are the most complex spoofing attacks to mitigate, as they are an extension of the intermediate spoofing attack, where the signals received from multiple GNSS antennas (sometimes placed at different locations) are modified (e.g., through random delays and Doppler shifts) and re-transmitted in a combined manner, in such a way that the receiver is duped into believing that the signals come from various genuine satellites.
Spoofing attacks adversely affect the quality of the positioning, navigation and timing (PNT) services of GNSS receivers, by introducing errors in the estimated position, velocity and time (PVT). For example, as shown in [
31], an intermediate spoofer with a spoofer-to-signal ratio of 0 dB (i.e., equal spoofer and GNSS signal power) introducing a code delay of
chips can deteriorate the detection probability of the GNSS signal by
and with a code delay of only
chips, the detection probability decreases with
(i.e., from
to
). The impact of spoofing on the correct functioning of a GNSS receiver can thus be significant, and it is of utmost importance to devise counter-spoofing methods, especially in life-critical applications such as aviation.
Current counter-spoofing methods can be classified into three main categories [
11,
33], according to the three GNSS-receiver domains depicted in
Figure 1:
Pre-correlation link-level methods relying on signal samples before the acquisition stage, i.e., on I/Q data. This is the case addressed in this paper. Such pre-correlation anti-spoofing methods are still very rare in the literature;
Post-correlation link-level methods relying on the despread signal, at the output of the tracking stage for a single satellite. Examples can be found in [
33,
34] and they are out of the scope of this paper;
Navigation or system-level methods relying on the pseudorange signals coming from all visible satellites. These are by far the most encountered anti-spoofing methods in the current literature and a few examples can be found in [
27,
28,
29] (they are also outside the scope of this paper).
Our paper focuses on the pre-correlation spoofing identification approaches, taking as the input the I/Q raw data (at sample level) and aiming to identify, based on RF fingerprinting approaches, whether the received signal comes from a genuine GNSS transmitter or from a spoofing transmitter.
We are proposing a
four-step methodology for the RFF-based pre-correlation spoofing detection and transmitter identification, as listed below. Each of these four steps is further detailed in
Section 3,
Section 4,
Section 5 and
Section 6.
Identification of relevant features: this step refers to first identifying the different RF ‘features’ created by the inherent hardware impairments in any transmitter. Several such features are described in
Section 3;
Feature-extraction transform: this step refers to choosing a suitable feature-extraction transform to emphasize the features selected in the previous step. Several feature-extraction transforms are addressed in
Section 5;
Data pre-processing stage: this step refers to choosing the most suitable format for storing the data at the output of the feature-extraction transform, namely as time-stamped vector data, in matrix form, as an image of a certain size and number of pixels, etc. The data format selection is influenced by the algorithms selected in the feature-classification step, as described in
Section 6, as well as by the data type at the output of the feature-extraction step. For example, spectrogram-type data are easily stored in image form, while the outputs of transforms such as kurtosis or Teager–Kaiser are more suitable to be stored in a vector format;
Feature classification: this step refers to applying a selected classification method, based for example on analytically derived thresholds or, when training data are available, on machine learning algorithms, and classifying the received signal into ‘genuine’ versus ‘non-genuine/spoofer’ classes. Several feature classification approaches are discussed in
Section 6. A qualitative discussion is then provided in
Section 9.
The workflow of an RFF algorithm based on the aforementioned four steps is illustrated in
Figure 3.
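The four steps can be sketched as a chain of small functions. The following is a minimal sketch under stated assumptions: the function names, the block length, the choice of the per-block kurtosis of the I component as the extractor, and the fixed threshold are all hypothetical choices made for illustration, and they are not the implementation used in this paper.

```python
import numpy as np

def extract_feature(iq: np.ndarray, block: int = 1000) -> np.ndarray:
    """Step 2: feature-extraction transform (here: kurtosis of the I component per block)."""
    n_blocks = iq.size // block
    i_part = iq.real[: n_blocks * block].reshape(n_blocks, block)
    centred = i_part - i_part.mean(axis=1, keepdims=True)
    return (centred**4).mean(axis=1) / (centred**2).mean(axis=1) ** 2

def preprocess(feature: np.ndarray) -> np.ndarray:
    """Step 3: choose the storage format (here: a plain 1-D float vector)."""
    return np.asarray(feature, dtype=float).ravel()

def classify(vector: np.ndarray, threshold: float = 0.2) -> str:
    """Step 4: threshold test on the mean deviation from the Gaussian value of 3."""
    return "non-genuine/spoofer" if abs(vector.mean() - 3.0) > threshold else "genuine"

def rff_pipeline(iq: np.ndarray) -> str:
    # Step 1 (feature identification) is a design-time choice; it is implicit
    # in which extractor is plugged into the chain.
    return classify(preprocess(extract_feature(iq)))

rng = np.random.default_rng(0)
n = 20_000
noise = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
strong_tone = 3.0 * np.exp(1j * 0.3 * np.arange(n))  # assumed strong interferer

label_noise_only = rff_pipeline(noise)
label_with_tone = rff_pipeline(noise + strong_tone)
print(label_noise_only, label_with_tone)
```

Keeping each step behind its own function makes it straightforward to swap the extractor (e.g., a spectrogram) or the classifier (e.g., an SVM) without touching the rest of the chain.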
5. RF Feature Extractors
Section 3 gave an overview of the main RF features that a wireless transmitter can have. The question addressed in this section is how to identify such features, or, more precisely, what feature-extraction transforms
are available from the literature.
5.1. Error Vector Magnitude (EVM)
The error vector magnitude is a time-domain transform that measures how far the estimated symbols at the receiver side deviate from the true symbols. I/Q imbalance, thermal noise, in- and out-of-band leakage, and phase noise can all degrade the EVM metric; EVM thus has the potential to be a good feature-extractor transform for capturing hardware impairments from the received signals.
In general, EVM is applied in the context of demodulated signals, as follows: let us assume that a symbol $s$ is transmitted, and that at the receiver, a symbol $r$ is received. The receiver estimates (e.g., via a decoding process) the symbol $\hat{s}$. Therefore, the estimation error $e$ is: $e = \hat{s} - s$, as depicted in
Figure 9. The EVM of the symbol $s$ is defined as
$$\mathrm{EVM} = \frac{\| e \|}{\| s \|},$$
where $\| \cdot \|$ is the Euclidean norm.
When the input to the EVM transform is the I/Q sampled data, one can apply the EVM as follows: $s(n)$ is a complex-valued sequence of an ideal GNSS signal (i.e., without any distortions), which can be generated, for example, via a GNSS signal generator, and $r(n)$ is the received signal (genuine or spoofer) at the I/Q level. The EVM based on pre-correlation data then measures the discrepancy between the ideal GNSS signal and the received signal. Under the hypothesis that the spoofer transmitter non-idealities are further away from the ideal case than the GNSS transmitter non-idealities, the EVM of a genuine GNSS signal is expected to be smaller than the EVM of a spoofer.
Figure 10a,b show two illustrative examples of EVM outputs for a genuine GNSS transmitter and a spoofer, respectively (both using Galileo E1 signal specifications and based on a software simulator built by us). The EVM results for the genuine Galileo E1 transmitter and the spoofer have visible differences, with the EVM values for the spoofer being, on average, slightly higher than those for the Galileo signal, as predicted by the theory. The examples in
Figure 10a,b are based on a very high CNR of 100 dB–Hz, for illustrative purposes. At lower CNRs, such differences are no longer visible to the naked eye, but they still have some potential to be captured by a machine learning algorithm, for example.
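As a rough illustration of the sample-level EVM, the sketch below compares an ideal reference sequence with two impaired copies. It assumes the common normalized form, the norm of the difference between the received and reference sequences divided by the norm of the reference. The QPSK-like reference and the toy impairment model (a gain mismatch on the Q branch plus a constant phase rotation) are assumptions for illustration only; they are neither a Galileo E1 signal nor the transmitter models of this paper.

```python
import numpy as np

def evm(reference: np.ndarray, received: np.ndarray) -> float:
    """Normalized EVM: ||received - reference|| / ||reference||."""
    return float(np.linalg.norm(received - reference) / np.linalg.norm(reference))

def impaired(x: np.ndarray, gain_imbalance: float, phase_offset_rad: float) -> np.ndarray:
    """Toy transmitter: scales the Q branch and rotates the constellation."""
    y = x.real + 1j * (1.0 + gain_imbalance) * x.imag
    return y * np.exp(1j * phase_offset_rad)

rng = np.random.default_rng(1)
n = 4096
# Stand-in for the ideal signal: random unit-power QPSK-like samples.
ideal = (rng.choice([-1.0, 1.0], n) + 1j * rng.choice([-1.0, 1.0], n)) / np.sqrt(2)

genuine = impaired(ideal, 0.01, 0.01)  # assumed: small impairments
spoofer = impaired(ideal, 0.05, 0.05)  # assumed: larger impairments

evm_genuine = evm(ideal, genuine)
evm_spoofer = evm(ideal, spoofer)
print(evm_genuine, evm_spoofer)
```

In this noise-free toy setting the EVM grows with the size of the impairments, which mirrors the hypothesis stated above; with realistic noise levels the two values would be much harder to separate.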
5.2. Kurtosis
Kurtosis is a measure of the Gaussian behaviour of a random variable and it is defined as
$$\mathcal{K}(x) = \frac{E\left[\left(x(n) - E[x(n)]\right)^{4}\right]}{\sigma^{4}\left(x(n)\right)},$$
where $x(n)$ is the complex sampled signal (sampled at sampling times $nT_s$, with $T_s$ being the sampling interval, and $f_s = 1/T_s$ the sampling frequency); $E[\cdot]$ is the expectation operator, and $\sigma(\cdot)$ is the standard deviation operator. For Gaussian-distributed sequences $x(n)$, $\mathcal{K}(x)$ is close to 3. For non-Gaussian distributed sequences, this value typically deviates from 3, being either lower or higher. Kurtosis was one of the feature extractors selected in our simulations.
An example of a histogram of the kurtosis results for a genuine GNSS transmitter and a spoofer is shown in
Figure 11. The magenta line represents the threshold to differentiate the spoofer from a genuine GNSS transmitter. It is typically expected that the received GNSS signals in the pre-correlation domain are Gaussian (see blue histogram from
Figure 11), due to the fact that the pre-correlation data are dominated by the thermal noise. In the presence of a strong spoofer, this Gaussian property may be lost, due to the fact that spoofer power might become the dominant one.
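The thresholding idea can be illustrated with a small numerical experiment. In the sketch below, the kurtosis is computed on the I component only (an assumption, since the treatment of the complex samples is not detailed here), the spoofer is a hypothetical strong BPSK-like signal, and the threshold of 0.1 around the Gaussian value of 3 is an arbitrary illustrative choice.

```python
import numpy as np

def kurtosis(x: np.ndarray) -> float:
    """Pearson kurtosis E[(x - E[x])^4] / sigma^4 of a real-valued sequence."""
    centred = x - x.mean()
    return float(np.mean(centred**4) / np.mean(centred**2) ** 2)

rng = np.random.default_rng(2)
n = 200_000
noise = rng.standard_normal(n) + 1j * rng.standard_normal(n)  # thermal-noise stand-in
strong = 4.0 * rng.choice([-1.0, 1.0], size=n)                 # assumed dominant spoofer

k_genuine = kurtosis(noise.real)             # noise-dominated samples
k_spoofed = kurtosis((noise + strong).real)  # spoofer-dominated samples

threshold = 0.1  # hypothetical decision threshold on |kurtosis - 3|
spoofer_flag = abs(k_spoofed - 3.0) > threshold
print(k_genuine, k_spoofed, spoofer_flag)
```

For this particular toy spoofer the kurtosis drops well below 3, which shows that a practical detector may have to test for a deviation in either direction rather than only for values above 3.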
5.3. Teager–Kaiser Energy Operator (TKEO)
The Teager–Kaiser energy operator (TKEO) is a transform which can estimate the instantaneous energy of a signal, and thus may uncover features that are distinguishable in power or energy. The TKEO transform $\Psi[x(n)]$ of a complex signal $x(n)$ is defined as [58]
$$\Psi[x(n)] = x(n-1)\,x^{*}(n-1) - \frac{1}{2}\left[x(n-2)\,x^{*}(n) + x(n)\,x^{*}(n-2)\right],$$
where $x(n)$ is the complex sampled signal and $x^{*}(n)$ is the conjugate of $x(n)$.
TKEO has been previously used in the context of RFF in GNSS in [
3] with promising results. It is also one of the feature extractors selected in our study.
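A discrete TKEO for complex samples can be sketched in a few lines. The version below assumes the commonly used causal form psi[n] = x[n-1] x*[n-1] - 0.5 (x[n-2] x*[n] + x[n] x*[n-2]); the function name and the test tone are illustrative assumptions.

```python
import numpy as np

def tkeo(x: np.ndarray) -> np.ndarray:
    """Discrete TKEO of a complex sequence (output length is len(x) - 2).

    psi[n] = x[n-1] x*[n-1] - 0.5 * (x[n-2] x*[n] + x[n] x*[n-2])
    """
    x = np.asarray(x, dtype=complex)
    past, mid, cur = x[:-2], x[1:-1], x[2:]
    psi = mid * np.conj(mid) - 0.5 * (past * np.conj(cur) + cur * np.conj(past))
    return psi.real  # the expression is real-valued by construction

# Sanity check on a complex exponential A*exp(j*w*n):
# the output is constant and equal to A^2 * (1 - cos(2w)) = 2 * A^2 * sin(w)^2.
amp, omega = 2.0, 0.1
tone = amp * np.exp(1j * omega * np.arange(64))
expected = 2 * amp**2 * np.sin(omega) ** 2
print(tkeo(tone)[:3], expected)
```

The closed-form value for a pure tone gives a convenient unit test: the operator responds jointly to amplitude and frequency, which is why it can expose energy-related differences between transmitters.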
5.4. I/Q Data Spectrograms and Other Short-Time-Short-Frequency (STSF) Transforms
The short-time Fourier transform (STFT) $X(m,f)$ is simply a Fourier transform within a window (i.e., short time); the discrete STFT over a window of $N$ samples of the received signal $x(n)$ is given by
$$X(m,f) = \sum_{n=m}^{m+N-1} x(n)\, w(n-m)\, e^{-j 2\pi f n},$$
where $m$ is the time sample index, $x(n)$ is the complex sampled signal, containing the I and Q components ($x(n) = x_I(n) + j\,x_Q(n)$), $f$ is the frequency, and $w(n)$ is a time window (e.g., Hamming, Hanning, etc.). The spectrogram $S(m,f)$ is the squared absolute value of the STFT, namely:
$$S(m,f) = \left| X(m,f) \right|^{2}.$$
Clearly, $X(m,f)$ and $S(m,f)$ are two-dimensional time-frequency transforms and can be stored either as a matrix or in image form. We investigated both approaches and found that storing the spectrogram in image form gave more accurate results than operating on the matrix form.
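The two storage options (matrix and image) can be illustrated with a small NumPy sketch. The window length, hop size, sampling rate, test tone and the 8-bit greyscale mapping below are assumptions chosen for illustration; they are not the settings of the simulator described in this paper.

```python
import numpy as np

def spectrogram(x: np.ndarray, n_win: int = 128, hop: int = 64) -> np.ndarray:
    """|STFT|^2 of a complex sequence; rows are frequency bins, columns are time frames."""
    window = np.hamming(n_win)
    starts = range(0, x.size - n_win + 1, hop)
    frames = np.stack([x[s : s + n_win] * window for s in starts], axis=1)
    stft = np.fft.fftshift(np.fft.fft(frames, axis=0), axes=0)
    return np.abs(stft) ** 2

def to_image(spec: np.ndarray) -> np.ndarray:
    """Map the spectrogram (in dB) to an 8-bit greyscale image."""
    db = 10.0 * np.log10(spec + 1e-12)
    scaled = (db - db.min()) / (db.max() - db.min())
    return np.round(255 * scaled).astype(np.uint8)

fs = 1.0e6                                 # assumed sampling rate in Hz
n = np.arange(4096)
x = np.exp(2j * np.pi * 100e3 * n / fs)    # test tone at +100 kHz

spec = spectrogram(x)                      # matrix form
image = to_image(spec)                     # image form
freqs = np.fft.fftshift(np.fft.fftfreq(128, d=1 / fs))
peak_hz = freqs[spec.mean(axis=1).argmax()]
print(spec.shape, image.dtype, peak_hz)
```

The image form applies a logarithmic scaling and a fixed dynamic range, which may be one reason why an image-based representation behaves differently from the raw matrix in a classifier.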
Figure 12 shows a comparison of spectrogram-based results between a genuine Galileo E1 transmitter and a spoofer also based on Galileo E1 signal characteristics. The results are based on our in-house Matlab-based simulator, following the block diagrams in
Figure 7a,b, at a very high carrier-to-noise ratio (CNR) of 100 dB–Hz, so that the different HW features can also be identified (for illustration purposes) by the naked eye. The results are shown in the absence of channel and receiver effects. It can be seen in
Figure 12 that there are visible differences between the two images, e.g., the spectrogram of the spoofer I/Q data has one extra line in the upper half of the image compared to the spectrogram of the genuine Galileo I/Q data. The underlying models of the HW features used in our simulator for the genuine and spoofer transmitters were based on the assumption that phase noises and I/Q imbalances were weaker for the genuine signal than for the spoofer signal. The PA non-linearity models were based on [
59], by picking two different PA non-linearity models from there to characterize the spoofer and the genuine GNSS transmitter.
5.5. Wavelet Transforms
A wavelet transform decomposes an incoming signal into ‘coarse’ and ‘fine’ coefficients, based on shifted and scaled versions of a so-called ‘mother wavelet’ function. Unlike the Fourier transform, which cannot offer compact support in both the time and frequency domains, a wavelet transform can offer compact (bounded) support in both the time and wavelet domains. Wavelet transforms have been extensively used in watermarking and image-processing applications, and have been reported to be able to identify ‘hidden’ features; thus, they appear to be relevant feature extractors for RF fingerprints. Wavelet transforms have previously been used in the context of RF fingerprinting, for example in [
25,
60,
61]. The work in [
25,
60] focused only on narrowband signals, in contrast to the wideband GNSS signals. The work in [
61] used GNSS simulation-based signals, but only focused on a few simplified transmitter HW impairments. While the work in [
61] showed some limited promising results with the discrete wavelet transforms in the context of RFF, our further investigations with more realistic transmitter models as described in
Section 3 and
Section 4 did not show any improvement from using a wavelet transform instead of a spectrogram. Wavelet transforms have an increased complexity compared to other feature-extraction transforms because they output two sets of complex coefficients (the coarse and fine approximation coefficients); by contrast, the spectrogram, for example, has only one output sequence.
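To make the ‘coarse’ and ‘fine’ outputs concrete, the following sketch implements one level of the Haar discrete wavelet transform on complex I/Q samples. The Haar wavelet is an assumption chosen for its simplicity; the mother wavelets used in the cited works may differ.

```python
import numpy as np

def haar_dwt(x: np.ndarray):
    """One level of the Haar DWT: returns (coarse, fine) coefficient sequences."""
    x = np.asarray(x)
    x = x[: x.size - x.size % 2]            # drop a trailing sample if the length is odd
    even, odd = x[0::2], x[1::2]
    coarse = (even + odd) / np.sqrt(2.0)    # approximation (low-pass) coefficients
    fine = (even - odd) / np.sqrt(2.0)      # detail (high-pass) coefficients
    return coarse, fine

rng = np.random.default_rng(3)
iq = rng.standard_normal(1024) + 1j * rng.standard_normal(1024)
coarse, fine = haar_dwt(iq)

# The Haar transform is orthonormal, so the total energy is preserved.
energy_in = np.sum(np.abs(iq) ** 2)
energy_out = np.sum(np.abs(coarse) ** 2) + np.sum(np.abs(fine) ** 2)
print(coarse.size, fine.size, energy_in, energy_out)
```

Both output sequences have to be carried forward to the classifier, which illustrates the additional bookkeeping compared with a single spectrogram output.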
7. Simulation-Based Example and Feature Down Selection
An in-house simulator was built in Matlab (version 2020b) and Python 3.7.5. The Matlab modules were used to generate I/Q samples based on a GNSS and a spoofer model, each having five types of transmitter features: PA non-linearities, DAC non-linearities, I/Q imbalance, phase noises, and band-pass filtering (BPF). The parameters of genuine GNSS transmitters are typically not openly available, as they are protected by intellectual property rights (IPR). In the absence of the exact GNSS parameters for these HW features, we adopted various models from the literature. For example, the PA non-linearities were modelled according to [
59], and the phase noise existing in the clock unit and up-conversion unit was modelled according to [
90]. Details on the parameters used in our simulator are given in
Table 2. In order to mimic the characteristics of a sophisticated spoofer, the phase noise of the local oscillator in the spoofer was modelled according to [
52], a high-end software-defined radio designed for GNSS signal transmitting and receiving. A simplified model was used for classifying one genuine GNSS transmitter versus one spoofer transmitting GNSS-like signals. As the main goal was to study the feasibility of RFF in the context of GNSS, an ideal, almost noise-free case was considered with a carrier-to-noise ratio (CNR)
of 100 dB–Hz. While the noise-free approach is not realistic, the purpose here was to show whether RFF has any potential with pre-correlation GNSS data and to identify which HW features are likely to best differentiate between different transmitters.
A two-millisecond observation window of Galileo E1 band signals was used in the examples shown in this section. In order to deal better with CNR levels lower than the ideal case considered here, one could increase the observation window. However, the simulation times and the complexity of RFF processing would also increase. Using different random seeds, we generated 2000 matrices (or images) each of genuine GNSS signals and spoofer signals (thus a total of 4000 inputs to the ML algorithm). Furthermore, the 4000 data inputs were randomly split into
of training data and
of test data. Such matrices (or images) were the outputs of the three considered feature-extraction transforms, namely kurtosis, TKEO and spectrogram, applied to the 2 ms observation interval of the raw signal sampled at a very high sampling rate of 491 MHz. Such a high sampling rate was needed in our model because we adopted a quasi-RF model, in order to model the clocks’ non-idealities. The feature-extraction transforms were selected based on the discussions in
Section 5, in order to enhance the capability of differentiating genuine GNSS signals from spoofer signals. An SVM classifier with a radial-basis-function kernel, from the scikit-learn library, was implemented in Python to perform the classification. The grid search method was used to optimize the classifier parameters, and 100-fold cross-validation on the training dataset was employed to check the convergence of the results.
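The classification stage described above can be sketched with scikit-learn as follows. This is a minimal sketch under stated assumptions: the feature vectors are synthetic Gaussian clusters rather than simulator outputs, and the 70/30 split, the parameter grid and the 5-fold cross-validation are placeholder values chosen to keep the example fast.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(4)
# Synthetic stand-in for feature vectors: two Gaussian clusters, NOT simulator output.
n_per_class, n_features = 200, 8
genuine = rng.normal(0.0, 1.0, size=(n_per_class, n_features))
spoofer = rng.normal(1.5, 1.0, size=(n_per_class, n_features))
X = np.vstack([genuine, spoofer])
y = np.array([0] * n_per_class + [1] * n_per_class)  # 0 = genuine, 1 = spoofer

# Assumed split ratio, grid values and number of folds (illustration only).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
grid = GridSearchCV(
    SVC(kernel="rbf"),
    param_grid={"C": [0.1, 1.0, 10.0], "gamma": ["scale", 0.01, 0.1]},
    cv=5,
)
grid.fit(X_train, y_train)       # cross-validated search on the training set only
accuracy = grid.score(X_test, y_test)
print(grid.best_params_, accuracy)
```

The grid search is run on the training data only, so the held-out test set gives an estimate of accuracy that is not biased by the parameter selection.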
The results of the classification are presented via the confusion-matrix metric.
Figure 19 illustrates the definition of the confusion matrix used in our work.
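For readers less familiar with the metric, a two-class confusion matrix can be computed as sketched below. The row and column convention (rows for the true class, columns for the predicted class) is an assumption and may differ from the layout of Figure 19; the labels are made-up toy values, not results from this paper.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes: int = 2) -> np.ndarray:
    """Entry [i, j] counts samples of true class i that were predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

# Toy labels: 0 = genuine, 1 = spoofer.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 1]

cm = confusion_matrix(y_true, y_pred)
accuracy = np.trace(cm) / cm.sum()
print(cm.tolist(), accuracy)  # [[3, 1], [1, 3]] 0.75
```

The diagonal holds the correctly classified samples, so the overall accuracy is the trace divided by the total count.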
In our simulator, each feature can be active or inactive, making the simulator flexible to be able to down select or identify the ‘strongest’ features, as well as their overall impact when they act jointly (as in a realistic transmission scenario).
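The idea of switching individual features on and off can be sketched as follows. The impairment models below (a gain and phase mismatch on the Q branch, a random-walk phase noise, and a memoryless third-order PA compression), together with all parameter values and names, are hypothetical textbook-style placeholders; they are not the models or parameters of the simulator used in this paper, and only three of the five features are covered.

```python
from dataclasses import dataclass

import numpy as np

@dataclass
class Impairments:
    """Switches for three transmitter features (hypothetical toy models)."""
    pa_nonlinearity: bool = True
    iq_imbalance: bool = True
    phase_noise: bool = True

def transmit(x: np.ndarray, cfg: Impairments, rng: np.random.Generator) -> np.ndarray:
    y = np.asarray(x, dtype=complex)
    if cfg.iq_imbalance:     # assumed 0.5 dB gain and 2 degree phase mismatch on the Q branch
        g, phi = 10 ** (0.5 / 20), np.deg2rad(2.0)
        y = y.real + 1j * g * (y.imag * np.cos(phi) + y.real * np.sin(phi))
    if cfg.phase_noise:      # assumed random-walk (Wiener) phase with small increments
        y = y * np.exp(1j * np.cumsum(rng.normal(0.0, 1e-3, y.size)))
    if cfg.pa_nonlinearity:  # assumed memoryless third-order compression
        y = y - 0.05 * y * np.abs(y) ** 2
    return y

rng = np.random.default_rng(5)
chips = (rng.choice([-1.0, 1.0], 1000) + 1j * rng.choice([-1.0, 1.0], 1000)) / np.sqrt(2)

clean = transmit(chips, Impairments(False, False, False), rng)
only_pa = transmit(
    chips, Impairments(pa_nonlinearity=True, iq_imbalance=False, phase_noise=False), rng
)
all_on = transmit(chips, Impairments(), rng)
```

Generating one dataset per switch setting, and one with all switches on, is what allows a feature-by-feature comparison of the classification results.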
Figure 20 shows the confusion-matrix results, first when all features are combined, and then feature-by-feature, in order to identify which features have a strong impact on RFF and which have a weak or no impact. One very interesting result based on
Figure 20 is that, even at a 100 dB–Hz carrier-to-noise ratio, both the phase-noise and DAC-non-linearity features fail to provide differences between the two classes (spoofer present versus genuine Galileo signal present).
Moreover, as seen in
Figure 20, the band-pass filter effects can only provide moderate differentiation between the spoofer and GNSS. These results imply, to a large degree, that the phase noise and DAC non-linearity are ‘weak’ features in the GNSS RFF context, while PA non-linearity and I/Q imbalance, as well as BPF to some extent, are ‘strong’ features. This is also qualitatively illustrated in the next section.
8. Comparative Summary of Pre-Correlation RFF Methods in Existing Literature
Table 1 gives a concise survey of main RFF-related studies in the recent literature, by specifying the wireless system under investigation, as well as the main algorithms used for feature detection and classification in those RFF approaches. As seen in
Table 1, most of the research work dedicated to RFF has to date addressed non-GNSS signals. Moreover, as clearly seen from the last column in
Table 1, RFF in the aviation context has been receiving more and more attention in the last two years, e.g., focusing on automatic dependent surveillance-broadcast (ADS-B) surveillance signals and on UAV transmitters and controllers.
Table 1 shows that a wide variety of classifiers have to date been investigated in the literature in the context of RFF: from a discrete wavelet transform (DWT) and continuous wavelet transform (CWT) to various neural networks, such as convolutional neural networks (CNN), probabilistic neural networks (PNN) and other machine learning algorithms, such as support vector machines (SVM), subclass discriminant analysis (SDA), multiple discriminant analysis (MDA), or permutation-entropy (PE)-based approaches.
Unlike the narrowband terrestrial signals typically studied to date with RFF techniques (see
Table 1), the GNSS signals are wideband and continuously transmitted, and hence do not exhibit strong transients to be used as differentiating factors. This means that, for GNSS signals, one should go deeper into the transmitter hardware characteristics and detect the possibly differentiating features between spoofers and genuine GNSS transmitters.
9. Qualitative Discussion and Open Challenges
Based on our literature research and the preliminary theoretical analysis,
Table 3 shows a suitability analysis of various combinations of feature-extraction transforms and classifiers for four selected classifiers and five selected feature-extraction transforms. The suitability analysis took into account both the expected performance and the complexity of the algorithm.
The most promising combinations, based on our preliminary analysis, are the kurtosis and thresholding combination, and the spectrogram and SVM combination. Potential good results may also be expected, based on a current literature search and theoretical analysis, from kurtosis and SVM combination, as shown in
Table 3. Further simulation-based and measurement-based analysis is necessary to validate these findings and this remains a topic of future research. The methodology presented in this paper can serve as a basis for also studying other possible combinations of feature-extraction transforms and classifiers.
Table 4 also discusses the expected impact of various features of the transmitter HW on the accuracy of the results. The analysis is based on the theoretical insights from the mathematical models presented in Section 3. The PA non-linearity, the phase noise, and the I/Q imbalance are expected to be the strongest differentiating features among the transmitter HW impairments, while the DAC non-linearities are expected to have little or no impact on the classification performance (as the differences between the GNSS and spoofer DAC non-linearities are not expected to be large). The band-pass filter (BPF) at the end of the transmission chain is, however, expected to have a negative impact on the ability to differentiate among various features, because it acts as a smoother (i.e., it removes high-frequency content). In practice, an RFF algorithm would most likely not be able to distinguish between the individual transmitter features and would treat all effects jointly. Given sufficiently large databases, the positive-impact effects from Table 4 are expected to dominate the zero- and negative-impact effects.
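To make the notion of jointly observed impairments more concrete, the sketch below chains generic textbook stand-ins for an I/Q imbalance, an oscillator phase noise, a PA non-linearity, and a smoothing filter, applied to complex baseband samples. These are not the mathematical models of Section 3: the cubic PA model, the random-walk phase noise, the moving-average filter used as a baseband proxy for the BPF, and all function names and default parameter values are assumptions introduced here purely for illustration.

```python
import numpy as np


def iq_imbalance(x, gain_db=0.5, phase_deg=2.0):
    """Gain and phase mismatch applied to the Q branch (one common convention)."""
    gain = 10.0 ** (gain_db / 20.0)
    phi = np.deg2rad(phase_deg)
    return x.real + 1j * gain * (x.imag * np.cos(phi) + x.real * np.sin(phi))


def phase_noise(x, step_std_rad=1e-3, rng=None):
    """Random-walk (Wiener) oscillator phase noise; amplitude is unchanged."""
    rng = np.random.default_rng() if rng is None else rng
    phase = np.cumsum(rng.normal(0.0, step_std_rad, size=x.shape))
    return x * np.exp(1j * phase)


def pa_cubic(x, a3=0.05):
    """Memoryless cubic compression, a generic PA non-linearity stand-in."""
    return x * (1.0 - a3 * np.abs(x) ** 2)


def smoothing_filter(x, taps=8):
    """Moving average as a crude baseband proxy for the band-limiting filter."""
    return np.convolve(x, np.ones(taps) / taps, mode="same")


def transmit_chain(x, rng=None):
    """Apply all stand-in impairments in sequence, so a receiver sees them jointly."""
    return smoothing_filter(pa_cubic(phase_noise(iq_imbalance(x), rng=rng)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical unit-amplitude QPSK-like baseband samples
    symbols = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, 1000)))
    distorted = transmit_chain(symbols, rng=rng)
    print(np.mean(np.abs(distorted - symbols) ** 2))  # joint distortion power
```

Running the chain with two different parameter sets and comparing the outputs gives a simple way to explore how much of a parameter difference survives the final smoothing stage.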
10. Conclusions and Roadmap Ahead
This paper presented a survey of RFF methods for spoofing mitigation in GNSS receivers. While the survey and the methodology presented here can also be applied in a non-GNSS context, the focus of our paper has been on GNSS pre-correlation data, as pre-correlation anti-spoofing methods are still rare in the current literature.
A four-step methodological approach was proposed in Section 2, breaking down the RFF problem into several parts: the effects (or features) occurring at the transmitter side, the channel effects, and the receiver effects. We identified the main sources of possible hardware imperfections (i.e., features) at the transmitter side and introduced in Section 3 detailed mathematical models for the identified HW impairments of GNSS transmitters. It was also shown that such HW features are best identified with the help of time-domain or frequency-domain feature-extraction transforms. Some of the feature-extraction transforms most frequently encountered in the current literature were discussed in Section 5. We also surveyed the literature to identify classification algorithms useful in the context of RFF. Several classification methods, based both on thresholding and on machine learning algorithms, were addressed in Section 6. Section 8 provided a qualitative comparison of approaches suitable for GNSS pre-correlation data, based on our literature survey, theoretical modelling, and preliminary simulation-based observations. It should be emphasized that such RFF algorithms need further testing with measurement-based data in order to understand their full capability in a realistic environment. Nevertheless, one of the main take-away points of our research is that the transmitter HW imperfections can act as differentiating features between spoofers and genuine transmitters, provided that proper combinations of feature-extraction transforms and classifiers are found. Our focus has been on the transmitter HW features, but we also discussed the possible effects of the wireless channels and of the hardware blocks at the receiver side. To sum up, several challenges remain for the roadmap ahead:
Addressing the impact of the mixture of signals from various satellites and various frequency bands: typically, the received signal is a mixture of the signals from all satellites visible in the sky at the considered moment and, possibly, of one or several spoofing signals. One approach to look at a single signal at a time would be to first despread each signal with its identified pseudo-random code, and then apply successive or parallel interference cancellation methods to isolate the signals one by one. The errors in the estimation of the signals from various satellites would, of course, affect the quality of the reconstructed signal and, possibly, the accuracy of the RFF-based classification. Another approach would be to create very large training databases with all possible mixtures of satellites in the sky and to use those databases in the classification process;
Evaluating and mitigating the impact of channel multipath and fading effects: each wireless channel (from satellite or spoofer) has its own random signature, determined by the multipath delays, Doppler spreads, and fading effects. As these effects are random in nature, they will, most likely, not provide additional ‘features’, but will have a negative impact on the strength of the transmitter features. The effect of the wireless channels upon the RFF algorithms can be further investigated via simulation- or measurement-based approaches and it remains a topic of future investigation;
Understanding the impact of the receiver HW features upon the RFF methods: while the same receiver is capturing either genuine GNSS signals or a mixture of genuine signals and spoofer(s), and thus the same receiver effects are present in both situations (spoofer present or spoofer absent), the receiver also has local oscillators, ADC and filter blocks, etc., and each of them can introduce additional phase noise, non-linearities, and I/Q imbalances. Intuitively, such effects will have a negative impact upon the classification accuracy compared to an ideal receiver (without any HW imperfections), but such effects need to be further analysed based on measured or simulated data;
Dealing with the negative impact of high noise levels on RFF performance, especially for low-power signals such as those in the pre-correlation domain: GNSS signals in urban scenarios, for example at GNSS receivers on board drones flying between tall buildings, can be received at relatively low CNRs, and these low CNRs are likely to smooth the transmitter features, to the point of fading them out. It remains an open research question what the CNR threshold is above which RFF methods based on pre-correlation GNSS samples are likely to work;
Validating through real-field measurements the promising RFF performance for authenticating GNSS signals.
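To put the noise-related challenge above into numbers, a CNR expressed in dBHz can be converted into a pre-correlation SNR within a given bandwidth through the standard relation SNR [dB] = CNR [dBHz] - 10 log10(B [Hz]). The short Python sketch below applies this relation; the 2 MHz bandwidth and the listed CNR values are assumed example values and are not parameters taken from this paper. With these assumptions, a CNR of 45 dBHz corresponds to an SNR of about -18 dB, i.e., the signal lies well below the noise floor before despreading.

```python
import math


def precorrelation_snr_db(cn0_dbhz, bandwidth_hz):
    """SNR (dB) inside `bandwidth_hz` for a carrier-to-noise-density ratio in dBHz."""
    return cn0_dbhz - 10.0 * math.log10(bandwidth_hz)


if __name__ == "__main__":
    bandwidth_hz = 2.0e6  # assumed front-end bandwidth (illustrative)
    for cn0_dbhz in (100.0, 45.0, 35.0):
        snr_db = precorrelation_snr_db(cn0_dbhz, bandwidth_hz)
        print(f"{cn0_dbhz:5.1f} dBHz -> {snr_db:6.2f} dB")
```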
One of the main contributions of our paper is a step-by-step methodological approach that can be adopted by a designer wishing to build an RFF algorithm in a GNSS receiver. The identified transmitter HW features are likely to be reflected not only in the pre-correlation data (illustrated in the examples throughout the paper), but also in the post-correlation and navigation domains. Our four-step methodology thus also paves the way towards more advanced RFF GNSS processing in all three domains (pre-correlation, post-correlation, and navigation), with the future aim of offering robust and hybrid anti-spoofing solutions. An additional contribution of this paper is an extensive survey of the existing RFF methods in the literature, used with both GNSS and non-GNSS signals and already showing promising results. As described in this last section, several challenges are still to be overcome before RFF methods can succeed, especially when relying on the low-power GNSS I/Q raw data. It is our belief that this survey bridges the gap between the RFF studies in the non-GNSS context and the anti-spoofing methods studied to date only at the post-correlation and navigation levels in the GNSS context. It is our intent that this paper sheds new light on how to approach an RF fingerprinting process to identify hidden transmitter features, by first decomposing the problem into the relevant transmitter features and then selecting the most suitable pair of feature-extraction transform and classifier algorithm in order to classify the transmitters according to their features or HW impairments. While many challenges remain in the RFF GNSS research field, it is also the authors' belief, based on our understanding of the research problem, that combining various authentication methods at different levels (pre-correlation, post-correlation, and navigation) is more likely to give good results than using a single authentication method.
The simulation-based results presented here are only for some selected illustrative parameters and are useful in the context of down-selecting the most important HW features of a GNSS transmitter. We saw that, even under ideal conditions such as a 100 dBHz carrier-to-noise ratio, the phase noise and the DAC non-linearities are not differentiating features, while the PA non-linearities, the I/Q imbalances, and the band-pass filters carry the potential of being good RF 'fingerprints'. For the sake of a reduced simulation complexity, the observation window used in our simulations was 2 ms. Further investigative studies at CNRs lower than 100 dBHz should also increase the observation window, in order to deal better with the high noise levels typical of the pre-correlation domain. The equivalent block diagrams and the methodological approach presented here, as well as the initial pre-selection of relevant features and feature extractors, can also serve as a basis for further studies in the post-correlation domain, where the noise levels are significantly lower than in the pre-correlation domain, especially for long post-detection integration times.
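The trade-off between CNR and observation window can be approximated with a simple calculation. Under the idealised assumption of coherent integration, the ratio of accumulated signal energy to noise density is CNR × T, i.e., CNR [dBHz] + 10 log10(T [s]) in dB, so each 10 dB loss in CNR requires a ten times longer window to keep this ratio constant. The Python sketch below evaluates this rule; the function names and the 45 dBHz operating point are hypothetical, and actual feature extractors may scale less favourably than this bound suggests.

```python
import math


def integrated_snr_db(cn0_dbhz, window_s):
    """Energy-to-noise-density ratio (dB) after ideal coherent integration."""
    return cn0_dbhz + 10.0 * math.log10(window_s)


def window_for_equal_snr(cn0_ref_dbhz, window_ref_s, cn0_new_dbhz):
    """Window length (s) at `cn0_new_dbhz` matching the reference integrated ratio."""
    return window_ref_s * 10.0 ** ((cn0_ref_dbhz - cn0_new_dbhz) / 10.0)


if __name__ == "__main__":
    # Reference point from the text: 100 dBHz with a 2 ms observation window
    print(integrated_snr_db(100.0, 2e-3))
    # Hypothetical lower-CNR operating point (assumed value)
    print(window_for_equal_snr(100.0, 2e-3, 45.0))
```

Under this assumption, the required window grows very quickly as the CNR decreases, which is consistent with the expectation that the post-correlation domain, with its long integration times, offers more favourable noise conditions.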