This section briefly describes the dataset used and how it was employed, and then discusses the prediction results of several models.
4.2. Measuring the Resiliency of Models
As stated earlier, all six employed algorithms were trained to be as generalizable as possible for our dataset. We implemented a 10-fold cross validation [62] for all these models and controlled their generalization by using an early stopping technique. Experiments were then performed to determine the extent to which these models could resist given perturbations and random noise.
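As a rough illustration of this training setup (not the authors' exact code), a 10-fold cross-validation loop with early stopping for a Keras-style regression model might look as follows; the architecture, column handling, and patience value are assumptions made only for the sketch:

```python
# Hypothetical sketch: 10-fold cross-validation with early stopping for a
# trajectory-regression model. Architecture and hyperparameters are
# illustrative assumptions, not the paper's exact configuration.
import numpy as np
from sklearn.model_selection import KFold
from tensorflow import keras

def build_model(n_features, n_outputs):
    # Small fully connected regressor (placeholder architecture).
    model = keras.Sequential([
        keras.layers.Dense(64, activation="relu", input_shape=(n_features,)),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(n_outputs),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

def cross_validate(X, y, n_splits=10):
    scores = []
    for train_idx, val_idx in KFold(n_splits=n_splits, shuffle=True).split(X):
        model = build_model(X.shape[1], y.shape[1])
        # Early stopping monitors validation loss to limit overfitting.
        stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                             restore_best_weights=True)
        model.fit(X[train_idx], y[train_idx],
                  validation_data=(X[val_idx], y[val_idx]),
                  epochs=200, batch_size=128, callbacks=[stop], verbose=0)
        scores.append(model.evaluate(X[val_idx], y[val_idx], verbose=0))
    return float(np.mean(scores))
```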
We assume that the trained model, including its post-activation operations, was built on the given training set. The following optimization problem was then solved:
In general, this optimization problem is known as an "adversarial attack" [63]: it produces samples that are similar to the original samples but may mislead the model into making mistakes, which therefore need to be addressed.
Although classification and regression tasks are similar to each other, Equation (7) must be updated for regression problems, which have no label values. In fact, unlike classification, there is no label for an input vector in a regression task. Therefore, to formulate the adversarial optimization problem, we replace the "label" with a "threshold" and solve for it. To achieve the minimum perturbation ϵ, the following optimization statement is suggested:
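A minimal sketch of such a statement, assuming a perturbation δ bounded by ϵ under a norm-based metric and an output-discrepancy threshold τ (symbols introduced here purely for illustration and not necessarily matching the paper's Equation (8)), could read:

```latex
% Illustrative form only; \delta, \tau, and the norm choice are assumptions.
\min_{\delta} \ \lVert \delta \rVert_p
\quad \text{s.t.} \quad
\lVert f(x + \delta) - f(x) \rVert \ge \tau ,
\qquad
\lVert \delta \rVert_p \le \epsilon
```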
Optimizing over the perturbation bound and the threshold generates a series of samples that are remarkably similar to the legitimate inputs but whose associated outputs are entirely different. In other words, after solving the optimization inequality defined in Equation (8), the manipulated input is similar to the given legitimate input although their associated output vectors are not. This optimization problem could be extended with additional conditions, namely by redirecting the predicted output towards a predefined or random value, which defines a targeted attack. Such a condition would add overhead to the abovementioned optimization problem and, therefore, we do not analyze it in the current paper. In future studies, we will investigate possible approaches for defending our prediction models against adversarial attacks.
Having access to the training set, parameters, and hyperparameters of the trained model constitutes a white-box attack, although it would still be possible to attack even without them. Both white- and black-box attacks are explained next.
The architectures and training setups of all six models were the same in this paper, as explained earlier. For training data with columns of latitude, longitude, altitude, time, and speed, the models were trained to predict their future states (latitude, longitude, altitude, time). Each given input sample was randomly perturbed while keeping it close to its original value under a norm-based similarity metric. There is no generic approach to define the exact values for these hyperparameters; we obtained them empirically, and they can be adjusted by the adversary. Here, the initial values assigned to the perturbation bound and the output-discrepancy threshold are 0.01 and 100, respectively.
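A minimal sketch of this random perturbation search, assuming the 0.01 and 100 values play the roles of a perturbation bound and an output-discrepancy threshold respectively (an interpretation, not the paper's exact procedure), could look like:

```python
# Hypothetical sketch of a random-perturbation adversarial search for a
# regression model; the bound/threshold roles of 0.01 and 100 are assumed.
import numpy as np

def random_perturbation_attack(model, x, bound=0.01, threshold=100.0,
                               max_tries=1000, rng=None):
    """Search for a small random perturbation whose prediction differs
    from the clean prediction by more than `threshold`."""
    rng = rng or np.random.default_rng()
    y_clean = model.predict(x[None, :])[0]
    for _ in range(max_tries):
        # Draw a perturbation inside an L-infinity ball of radius `bound`.
        delta = rng.uniform(-bound, bound, size=x.shape)
        x_adv = x + delta
        y_adv = model.predict(x_adv[None, :])[0]
        # Accept the sample if the output discrepancy exceeds the threshold.
        if np.linalg.norm(y_adv - y_clean) >= threshold:
            return x_adv
    return None  # no adversarial sample found within the budget
```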
Table 1 summarizes the values of the perturbation bound and the output-discrepancy threshold achieved for all models trained on the traffic flow management system (TFMS) public dataset of aircraft trajectories.
Table 1 compares the perturbation-bound and threshold values found for the six benchmark regression algorithms. Basically, adopting smaller values for the perturbation bound results in higher similarity between the generated adversarial samples and their associated legitimate samples, while adopting higher values for the threshold leads to larger discrepancies between the ground-truth and the predicted outputs. Ground truth is defined for supervised learning methods in order to measure accuracy on the training set. Among these models, the highest threshold values were achieved using the DNN, which means this model yields the highest variation in its predictions for legitimate inputs.
We generated adversarial samples for all the records of the dataset and tested them using all the trained models. Interestingly, when these samples were applied, all models predicted incorrectly.
Table 2 lists the fooling rates of all six models together with their prediction confidence scores. This table compares the fooling rates of the six victim models against adversarial attacks generated by the FGSM algorithm. Unfortunately, all these models were completely vulnerable to adversarial samples. The results shown in Table 2 clearly raise a security concern regarding the robustness of data-driven models, including both conventional and advanced deep learning architectures. The scaled values of prediction confidence reveal the weakness of each model in terms of its predictions; the main difference between these algorithms is their prediction confidence. Apparently, the RNN predicted wrongly with the highest confidence.
Another important concern is the transferability of the generated fake samples from one model to another. To evaluate this, adversarial samples were crafted for each model and fed forward to the other models. The results of this experiment are shown in Table 3, which statistically describes the transferability of adversarial samples from one victim model to another. The reported percentage values are averaged over all 10 folds; that is, the given dataset was divided into 10 equal-size segments over time, each of which was in turn considered a test segment, and the average accuracy was then computed over these segments. The most transferable adversarial samples for each model are shown in Table 3 in bold characters. For instance, 81.23% of the total crafted adversarial samples for SVR are successfully transferable to the LR model.
Although the LSTM is more advanced than the RNN, it is more vulnerable to transferred adversarial attacks. Equation (8) is further explored for a better understanding of the crafted samples. A first impression could be that adversarial samples are simply "noise". To accept or reject this impression, we need to run experiments to determine whether the crafted perturbations constitute "noise" or not.
To answer the abovementioned question, we utilized the local intrinsic dimensionality (LID) score [64]. This score differentiates "noisy samples" from "crafted adversarial samples". Assume that we measure the distance from a legitimate sample to each of its nearest neighbors, where the number of neighbor samples defines the neighborhood size; the maximum of these neighbor distances can then be found. Therefore, the LID score can be computed as shown in Equation (9).
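For reference, the widely used maximum-likelihood estimate of LID, which we assume Equation (9) follows (with r_i(x) the distance from a sample x to its i-th nearest neighbor and k the neighborhood size; this notation is introduced here for illustration), is:

```latex
% Standard maximum-likelihood LID estimate; notation r_i(x), k assumed here.
\widehat{\mathrm{LID}}(x) = -\left( \frac{1}{k} \sum_{i=1}^{k}
    \log \frac{r_i(x)}{r_k(x)} \right)^{-1}
```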
A portion of the training set was randomly selected, and random noisy samples were generated from it using a Gaussian distribution with 10 different values of the mean and standard deviation. For a fair comparison, we repeated this generation 10 times and added all the generated noisy samples to the original dataset by building a new directory that includes both noisy and legitimate samples. We also generated new adversarial samples for every record in the original training set and placed them into the adversarial category. Finally, a logistic regression algorithm was trained on the two considered classes in order to separate legitimate from adversarial samples.
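A rough, self-contained sketch of this detection pipeline, with LID scores computed via a k-nearest-neighbor search and a logistic regression separating legitimate from adversarial samples (the neighborhood size, split ratio, and other settings are illustrative assumptions):

```python
# Hypothetical sketch: LID-based separation of legitimate vs. adversarial
# samples with a logistic-regression detector. k and the train/test split
# are illustrative choices, not the paper's exact settings.
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def lid_scores(points, reference, k=20):
    """Maximum-likelihood LID estimate of each point w.r.t. a reference set."""
    nn = NearestNeighbors(n_neighbors=k).fit(reference)
    dists, _ = nn.kneighbors(points)          # shape: (n_points, k)
    dists = np.maximum(dists, 1e-12)          # avoid log(0)
    # LID(x) = -1 / mean_i( log( r_i(x) / r_k(x) ) )
    return -1.0 / np.mean(np.log(dists / dists[:, -1:]), axis=1)

def train_detector(x_legit, x_adv, reference):
    lid_legit = lid_scores(x_legit, reference)
    lid_adv = lid_scores(x_adv, reference)
    X = np.concatenate([lid_legit, lid_adv]).reshape(-1, 1)
    y = np.concatenate([np.zeros(len(lid_legit)), np.ones(len(lid_adv))])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2)
    clf = LogisticRegression().fit(X_tr, y_tr)
    return clf, clf.score(X_tr, y_tr), clf.score(X_te, y_te)
```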
Table 4 summarizes the details of this binary classification.
Table 4 primarily compares the accuracy of LR on the LID scores as well as its training setup. For example, the first row of this table shows that LR without cross validation achieves 86.36% and 84.27% accuracy in training and testing, respectively. These accuracies were reached at the 120th iteration, with the chosen regularization penalty and prediction tolerance (error). Training was executed using four CPU cores (jobs) without weight normalization (the fitting-intercept option set to false). The inverse of the regularization strength (C) for this model was set to 0.002.
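In scikit-learn terms, a configuration consistent with this description could be instantiated as below; since the penalty type and tolerance value are not specified above, they are left as explicit assumptions:

```python
# Hypothetical scikit-learn configuration matching the reported setup:
# C = 0.002, up to 120 iterations, 4 parallel jobs, no fitted intercept.
# The penalty type and tolerance below are assumptions, not reported values.
from sklearn.linear_model import LogisticRegression

detector = LogisticRegression(
    C=0.002,              # inverse of regularization strength, as reported
    max_iter=120,         # accuracies reported at the 120th iteration
    n_jobs=4,             # four CPU cores (jobs)
    fit_intercept=False,  # no intercept term ("false fitting intercept")
    penalty="l2",         # assumed; the penalty type is not stated above
    tol=1e-4,             # assumed; the tolerance value is not stated above
)
```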
As shown in Table 4, LR performs well on the binary classification of the LID scores, and it supports our previous hypothesis (whether adversarial samples can be interpreted as noisy samples or not) regarding the fundamental difference between noisy and adversarial samples. To better characterize the distributions of the original, noisy, and adversarial samples, we plotted their LID scores in Cartesian space. Note that LID is a score assigned to every input.
Figure 6 visually shows the distribution of LID scores for the triplet of original, noisy, and adversarial samples, comparing the LID scores of random samples chosen from the training set. As this figure indicates, original and noisy samples lie in the same LID subspace, which denotes their structural similarity. Conversely, adversarial samples are located in a separate upper subspace, different from the original and noisy sets. To demonstrate that these LID scores are also statistically different, we trained an LR to classify the LID scores of original, noisy, and adversarial samples; obviously, higher accuracy of the trained LR means better separation of the LIDs. The details of this LR, along with other training information, are summarized in Table 4.
Overall, Table 4 statistically shows that the LIDs of adversarial samples are far from those of original and noisy samples, and Figure 6 shows this difference visually.
Generating adversarial samples with respect to the intrinsic characteristics of the given dataset could be very costly in terms of optimization overhead. In other words, Equation (8) is not always a simple optimization task and could be a non-polynomial problem; such problems cannot be solved by polynomial-function approximation (of any degree). Therefore, Equation (8) could be replaced by a faster operation, namely, by taking advantage of gradient information backpropagated through the network during its training. Generating adversarial samples relying on gradient information was first introduced in the computer vision community and was called the "fast gradient sign method" (FGSM) [63]. We adapt this attack for our regression task.
The FGSM is categorized as a white-box, non-targeted adversarial attack, mainly for architectures trained by backpropagation, and it requires the model's gradient information. For a given input, the FGSM crafts an adversarial sample as defined in Equation (10):
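For reference, the canonical FGSM update for a classifier, which we assume Equation (10) follows (with J the cost function, θ the model parameters, y the label, and ϵ the step size, as in [63]), is:

```latex
% Canonical FGSM perturbation (fast gradient sign method); we assume
% Equation (10) takes this form for the classification setting.
x' = x + \epsilon \cdot \operatorname{sign}\!\bigl(\nabla_{x} J(\theta, x, y)\bigr)
```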
where J is the cost function of the model, and ϵ is a float scalar to be defined by a local search. Since the FGSM attack was introduced for classification purposes, we needed to update the label index of the cost function to a bounded value by providing a "supremum" and an "infimum". Therefore, Equation (10) should be rewritten in the following form [63]:
where one term is an output value and the other is the actual value as defined in the training set. Our adapted version of the FGSM (AFGSM) requires an optimization over both of these bounds.
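A minimal sketch of such a gradient-sign attack adapted to regression is given below, using TensorFlow gradients; the mean-squared-error loss, the ϵ value, and the clipping bounds stand in for the supremum/infimum described above and are assumptions rather than the paper's exact AFGSM:

```python
# Hypothetical FGSM-style attack adapted to a regression model (TensorFlow).
# The MSE loss, epsilon value, and clipping bounds are illustrative
# assumptions, not the paper's exact AFGSM formulation.
import tensorflow as tf

def fgsm_regression(model, x, y_true, epsilon=0.01,
                    lower_bound=None, upper_bound=None):
    """Craft an adversarial sample by stepping along the sign of the
    gradient of a regression loss, optionally clipped to [inf, sup]."""
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    y_true = tf.convert_to_tensor(y_true, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        y_pred = model(x, training=False)
        loss = tf.reduce_mean(tf.square(y_pred - y_true))  # regression cost
    grad = tape.gradient(loss, x)
    x_adv = x + epsilon * tf.sign(grad)
    # Keep the crafted sample inside the allowed value range, if given.
    if lower_bound is not None and upper_bound is not None:
        x_adv = tf.clip_by_value(x_adv, lower_bound, upper_bound)
    return x_adv
```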
In our next experiment, we generated adversarial samples using the AFGSM for our proposed DNN, CNN, RNN, and LSTM architectures. We also studied the transferability property of the crafted samples, as shown in Table 5.
Table 5 compares the transferability of adversarial samples generated using our proposed AFGSM algorithm. For instance, the first element in Table 5 suggests that 78.25% of the total crafted adversarial samples are successfully transferable from the DNN to the LR model.
As shown in Table 5, all the models are vulnerable to our version of the FGSM attack. Not surprisingly, the adversarial samples generated using the AFGSM for the DNN and CNN are the most transferable to each other, shown in bold characters (91.25, 92.47). Moreover, AFGSM-generated adversarial samples for the RNN architecture are the most transferable to the CNN model (93.37). One hypothesis is that this is related to the convolution layers they both employ, regardless of the filters' shape, size, or order.