Evaluating Large Language Model Application Impacts on Evasive Spectre Attack Detection

Jiao, Jiajia; Jiang, Ling; Zhou, Quan; Wen, Ran

doi:10.3390/electronics14071384

College of Information Engineering, Shanghai Maritime University, No. 1550 Haigang Avenue, Shanghai 201306, China

Electronics 2025, 14(7), 1384; https://doi.org/10.3390/electronics14071384

Submission received: 28 February 2025 / Revised: 16 March 2025 / Accepted: 23 March 2025 / Published: 29 March 2025

(This article belongs to the Special Issue Enhancing Cybersecurity: Advanced Attack Detection and Defense Techniques)

Abstract

This paper investigates how different Large Language Models (LLMs), namely DeepSeek, Kimi, and Doubao, affect the attack detection success rate for evasive Spectre attacks while the models perform text, image, and code tasks. By running different LLM tasks concurrently with evasive Spectre attacks, a unique dataset with LLM noise was constructed. Subsequently, clustering algorithms were employed to reduce the size of the data and filter out representative samples for the test set. Finally, based on a random forest detection model, the study systematically evaluated the impact of different task types on the attack detection success rate. The experimental results indicate that the attack detection success rate follows the pattern of “code > text > image” in both the evasive Spectre memory attack and the evasive Spectre nop attack. To further assess the influence of different architectures on evasive Spectre attacks, additional experiments were conducted on an NVIDIA RTX 3060 GPU. The results reveal that, on the RTX 3060, the attack detection success rate for code tasks decreased, while those for text and image tasks increased compared to the RTX 2080 Ti. This finding suggests that architectural differences impact the manifestation of Hardware Performance Counters (HPCs), influencing the attack detection success rate.

Keywords:

evasive Spectre attacks; attack detection; Large Language Models; cluster analysis

1. Introduction

Large Language Models (LLMs) [1], powered by advanced architectures like Transformers, have demonstrated remarkable capabilities in natural language understanding, image recognition, and code synthesis [2,3]. Owing to their versatility and efficiency, LLMs have profoundly impacted people’s lives and provide significant convenience across various applications.

Spectre attacks [4] are a class of security vulnerabilities that exploit the speculative execution of processors to execute incorrect instruction sequences and leak sensitive information. Existing research uses features based on Hardware Performance Counters (HPCs), namely LLC references (last-level cache reference event), LLC misses (last-level cache miss event), branches (branch instructions retired event), and branch misses (branch misprediction retired event), to detect Spectre attacks and achieves good detection results, with up to 99% accuracy [5,6,7]. In response, evasive Spectre attacks, which insert nop instructions and memory delay instructions to reduce the frequency of the Spectre attack [8,9], have been proposed to mimic the behavior of benign programs and evade detection. This makes the malicious behavior more covert and significantly increases the difficulty of detection. Current detection approaches for evasive Spectre attacks primarily focus on hardware-event feature extraction and model architecture innovation. Li et al. [6] monitored last-level cache and branch prediction events on processor cores with fixed sampling rates, employing Logistic Regression (LR), Support Vector Machine (SVM), and Multilayer Perceptron (MLP) models. However, the best-performing MLP model achieved only 70% accuracy. Kosasih et al. [10] attained a 100% attack detection success rate using L1/L2 cache-related events and derived features with neural networks, but the model required training on scarce evasive Spectre attack data. Meanwhile, He et al. [11] proposed a detection method leveraging 34 HPCs, but practical implementation faced hardware constraints: the Intel Core i7-6700K supports only 4 HPCs [12] and the AMD Ryzen 7 3700X provides 6 HPCs [13], so the number of monitored HPCs must be strictly controlled to avoid performance overhead. Mandal et al. [14] innovatively integrated LLMs into microarchitectural attack defense, employing code analysis for vulnerability identification and defense generation, combined with HPCs-based real-time detection. However, this approach overlooks critical resource consumption issues when applying LLMs to HPCs analysis: the substantial computational demands of LLMs [15] may cause dramatic increases in GPU/CPU utilization, potentially inducing HPCs drift and ultimately compromising detection accuracy. Meanwhile, research on how LLMs affect the detection of evasive Spectre attacks is still limited. Specifically, it remains unclear how different tasks (such as text generation, image access, and code synthesis) impact the attack detection success rate. Therefore, evaluating the influence of LLMs on evasive Spectre attack detection is of great significance.

Our main contributions include the following four points:

(1): A novel dataset that incorporates LLMs is constructed to provide a realistic test for evaluating the attack detector.
(2): Clustering algorithms are employed to reduce the size of the data and select representative samples, which improves the efficiency of the attack detector.
(3): LLMs are integrated with the detection of evasive Spectre attacks, providing new directions for future research.
(4): A comprehensive evaluation across different hardware architectures, showing that architectural differences significantly influence HPCs, leading to varying attack detection success rates for LLMs-based evasive Spectre attacks on text, image, and code tasks.

2. Background

2.1. Large Language Models

LLMs have made significant progress in the field of Natural Language Processing (NLP) [16] and have become a crucial driving force for the advancement of artificial intelligence. The core architecture of LLMs is typically based on the Transformer [3], whose self-attention mechanism effectively captures long-range dependencies of the text, providing powerful technical support for language understanding and generation tasks. Early LLMs, such as the GPT series [17] and BERT [18], demonstrated outstanding performance in tasks like text classification, machine translation, and question-answering systems through large-scale pretraining and fine-tuning.

Additionally, LLMs perform well in cybersecurity and software engineering. In software security, LLMs have been applied to automated vulnerability detection and repair. For instance, Chen et al. [19] developed VulLibGen, an LLM-based tool for identifying third-party library vulnerabilities by combining code semantics with natural language descriptions to enhance detection efficiency. Chow et al. [20] improved vulnerability discovery accuracy through dual-modal taint analysis. In network intrusion detection, Alkhatib et al. [21] demonstrated the lightweight deployment advantages of CAN-BERT, a BERT-based anomaly detection system for CAN protocols.

To broaden LLMs applicability, Abdechakour et al. [22] created a vulnerability detection tool supporting 64K ultra-long context windows, enabling precise identification of 14 common defect types (CWE) in large-scale Python codebases and significantly improving security review efficiency for complex projects. Gonçalves et al. [23] successfully detected 905 duplicate vulnerabilities through optimized code processing techniques and dataset augmentation, achieving a maximum F1-score of 53% and validating LLMs potential in large-scale vulnerability management. For anomaly detection, Zhao et al. [24] proposed the TAD-GP method, adapting LLMs to tabular data anomaly detection tasks with performance breakthroughs on multiple benchmark datasets.

Recently, the emergence of DeepSeek has attracted widespread attention. DeepSeek-V3 and DeepSeek-R1 adopt the Mixture-of-Experts (MoE) architecture to achieve efficient training and optimization through load balancing and training enhancements. DeepSeek-V3 [25] outperforms other LLMs in knowledge-based question answering, code generation, and text processing. In contrast, DeepSeek-R1 [26] focuses on the inference phase, achieving in-depth optimization through reinforcement learning and a multi-stage training process. This article mainly leverages LLMs, including Kimi, DeepSeek, and Doubao, to perform text generation, image access, and code synthesis.

2.2. Evasive Spectre Attacks

Spectre attacks exploit the side effect of speculative execution and cache-based side channels to leak sensitive data [4,27]. However, Spectre attacks also exhibit high cache miss rates and low branch miss rates due to the cache being reloaded and the branch predictor being mistrained, which makes Spectre attacks relatively easy to detect. Thus, evasive Spectre attacks are employed to evade detection.

Two evasive Spectre attacks are used for evaluation: evasive Spectre nop and evasive Spectre memory. Evasive Spectre nop attacks are constructed by inserting nop instructions before and after the victim function to extend the program’s execution time and blur the distinction between attack data and non-attack data. Evasive Spectre memory attacks mainly insert memory delay instructions after the victim function and use the Fisher–Yates shuffle algorithm to randomize the order of memory accesses, which can further mask the attack characteristics [8]. Although these evasive Spectre attacks reduce the attack detection success rate and bandwidth of Spectre attacks, they still retain the characteristics of Spectre attacks.

2.3. Hardware Performance Counters

HPCs [5] are specialized registers integrated into processors, used to monitor and record underlying hardware events in real time, such as cache hit rates, branch prediction, and memory access latency. HPCs have two monitoring modes, sampling mode and counting mode, and this paper focuses on counting mode, which counts the occurrences of global hardware events within a fixed time interval.

As shown in Figure 1, HPCs exhibit two characteristics, a low branch miss rate and a high cache miss rate, that distinguish attack data from non-attack data. The low branch miss rate results from the attacker’s ability to manipulate the branch predictor, causing it to incorrectly predict the conditional branch as taken. This misprediction enables the attack to bypass boundary checks during speculative execution. The high cache miss rate stems from attackers repeatedly clearing and reloading cache lines, which considerably increases the frequency of cache misses. Evasive Spectre attacks weaken these two characteristics by inserting nop instructions and memory access delay instructions into the code to reduce the frequency of the attack.

2.4. Density-Based Spatial Clustering of Applications with Noise

Clustering is widely used in data analysis, with the aim of representing significant patterns and important distribution characteristics within a dataset [24]. To date, researchers have developed a variety of clustering algorithms, each employing different strategies to handle large-scale datasets. Specifically, a density-based clustering method was mentioned in [28], known as the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [29].

DBSCAN is a clustering algorithm that groups together data points based on their density in a feature space, identifying clusters of varying shapes and sizes. It works by classifying points as core points, border points, or noise, depending on their proximity to other points. A core point has a minimum number of neighboring points (MinPts) within a given radius (ϵ), while border points are close to core points but do not meet the density requirement. Noise points do not belong to any cluster.

In this paper, the DBSCAN algorithm is employed to analyze the dataset. Unlike traditional distance-based clustering methods, such as K-Means, DBSCAN can identify clusters of arbitrary shapes and effectively handle noisy data.

3. Proposed Scheme

The proposed scheme includes three parts: data collection, data cleaning based on DBSCAN and cluster analysis, and attack detection. These are shown in Figure 2.

3.1. Data Collection

The data collection process uses Perf, a performance analysis tool for Linux, to capture the four Hardware Performance Counter events listed in Table 1: branch instructions, branch misses, LLC references, and LLC misses. These events are collected separately for attack and non-attack data. The LLC miss rate and the branch miss rate are derived from them as follows:

LLC miss rate = \frac{LLC misses}{LLC references}

(1)

Branch miss rate = \frac{Branch misprediction}{Branch references}

(2)
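As a minimal sketch of Equations (1) and (2), the two rates can be computed from the four raw counts of one counting interval. The function name `miss_rates`, the zero-denominator guard, and the example counts are illustrative assumptions and are not taken from the paper.

```python
def miss_rates(llc_references, llc_misses, branches, branch_misses):
    """Return (LLC miss rate, branch miss rate) for one counting interval."""
    # Assumption: an interval with no events yields a rate of 0.0
    # instead of raising ZeroDivisionError.
    llc_rate = llc_misses / llc_references if llc_references else 0.0
    branch_rate = branch_misses / branches if branches else 0.0
    return llc_rate, branch_rate


# Hypothetical counts for a single interval (illustrative, not measured data).
llc_rate, branch_rate = miss_rates(
    llc_references=200_000, llc_misses=50_000,
    branches=1_000_000, branch_misses=10_000,
)
print(f"LLC miss rate = {llc_rate:.2f}, branch miss rate = {branch_rate:.2f}")
```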

The reasons for selecting these four HPC events to distinguish attack from non-attack data are detailed in Section 2.2. Additionally, to assess the impact of LLMs on evasive Spectre attacks, various LLMs, including Doubao, Kimi, and DeepSeek, are executed during the data collection process. Each LLM is queried cyclically from a preset sequence of questions so that it remains in a continuous question-and-answer state: once a question is answered, the next question is posed immediately.

Most importantly, data collection was performed under three usage scenarios (text, image, and code), and three sets of data were collected under each scenario.

3.2. Data Cleaning Based on DBSCAN and Cluster Analysis

3.2.1. DBSCAN Algorithm

The core idea of DBSCAN is to form clusters of arbitrary shapes based on density distribution, characterized by high similarity within clusters and low similarity between clusters. The key parameters Eps and MinPts are defined as follows:

Definition 1 (Eps Neighborhood).

For a given dataset $D = \{x_{1}, x_{2}, \dots, x_{n}\}$, the Eps neighborhood of any point $x_{i} \in D$ is defined as:

N_{Eps}(x_{i}) = \{x_{j} \in D \mid dist(x_{i}, x_{j}) \leq Eps\}

(3)

where $dist(x_{i}, x_{j})$ denotes the distance between point $x_{i}$ and point $x_{j}$, typically using Euclidean or Manhattan distance.

Definition 2 (MinPts).

MinPts is the minimum number of points required within the Eps neighborhood for a point to be considered a core point.

As shown in Algorithm 1, the algorithm first marks all data points as unvisited and initializes a cluster ID. For each unvisited point in the dataset, it calculates the set of neighboring points within the specified Eps. If the number of neighbors is at least MinPts, the point is considered a core point and forms a new cluster, to which all density-reachable points are then recursively added; otherwise, the point is marked as noise. This process repeats until all points are visited and classified. The key advantages of DBSCAN are that it does not require a predefined number of clusters, can identify clusters of various shapes, and is robust to noise.

Algorithm 1 DBSCAN Algorithm

Require: Dataset D, neighborhood radius Eps, minimum points MinPts

Ensure: Cluster labels for each point in D

Mark all points in D as unvisited
Cluster ID $\leftarrow 0$
for each point P in D do
    if P is not visited then
        Mark P as visited
        $N \leftarrow$ Get neighbors of P within Eps
        if $| N | \geq MinPts$ then
            Increment Cluster ID
            Expand cluster for P and all neighbors in N
        else
            Mark P as noise
        end if
    end if
end for
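The steps of Algorithm 1 can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions, not the code used in the paper: it uses Euclidean distance, a brute-force neighbor search, and the hypothetical names `region_query`, `dbscan`, and `NOISE`. Points first marked as noise are relabeled as border points if a core point later reaches them.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (its Eps neighborhood)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster ID (1, 2, ...) or NOISE."""
    labels = [None] * len(points)      # None means "not yet visited"
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may become a border point later
            continue
        cluster_id += 1                # points[i] is a core point
        labels[i] = cluster_id
        queue = list(neighbors)
        while queue:                   # expand the cluster
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point as well
    return labels


# Hypothetical 2D samples: two dense groups and one isolated point.
points = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.0),
          (5.0, 5.0), (5.0, 5.1), (5.1, 5.0),
          (10.0, 0.0)]
print(dbscan(points, eps=0.2, min_pts=3))
```

The brute-force neighbor search keeps the sketch short but costs quadratic time in the number of samples; a spatial index would be the usual choice for large HPCs traces.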

3.2.2. Data Cleaning

The purpose of employing clustering algorithms is to select representative samples from the HPCs data, thereby reducing the size of the dataset. DBSCAN is chosen for three reasons. (1) The evasive Spectre attack data captured while the LLMs access text, images, and code may contain abnormal or irrelevant values. DBSCAN identifies such noise and removes it rather than forcibly assigning it to a cluster. (2) DBSCAN can leverage the structure of the captured data to identify representative samples more accurately. Moreover, its core parameters, the neighborhood radius (Eps) and the minimum number of samples (MinPts), can be adjusted to the data characteristics, which allows the most suitable parameter combination to be identified separately for the data collected when LLMs access text, images, and code. (3) Evasive Spectre attacks may exhibit non-linear feature distributions when LLMs access text, images, and code. By defining clusters through density connectivity rather than rigid distance thresholds, DBSCAN can identify attack patterns with complex structures and prevent the loss of critical attack features.

In this paper, the parameter Eps is determined in two steps. First, the algorithm generates a k-distance curve. Eps candidates are taken from the y-coordinates around the point where the slope of the curve increases sharply, which gives an approximate range containing multiple possible values. Then, the algorithm iterates through these candidates and calculates the silhouette coefficient for each. The final Eps is the candidate with the highest silhouette coefficient: the silhouette coefficient measures clustering quality on the range [−1, 1], and a value closer to 1 indicates better clustering.
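The two building blocks of this procedure, the k-distance curve and the silhouette coefficient, can be sketched as follows. The names `k_distances` and `silhouette` are hypothetical, Euclidean distance is assumed, and noise points (label -1) are excluded from the silhouette average, which is an assumption since the paper does not state how noise is treated. Locating the knee of the curve and running DBSCAN for each candidate Eps are left out.

```python
import math


def k_distances(points, k):
    """Sorted distance from each point to its k-th nearest neighbor."""
    curve = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        curve.append(dists[k - 1])
    return sorted(curve)


def silhouette(points, labels):
    """Mean silhouette coefficient over clustered points (noise ignored)."""
    clusters = {}
    for p, lab in zip(points, labels):
        if lab != -1:
            clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return float("nan")            # undefined with fewer than two clusters
    scores = []
    for lab, members in clusters.items():
        for p in members:
            if len(members) == 1:
                scores.append(0.0)     # usual convention for singleton clusters
                continue
            # a: mean distance to the other members of the same cluster
            a = sum(math.dist(p, q) for q in members) / (len(members) - 1)
            # b: smallest mean distance to the members of any other cluster
            b = min(
                sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab
            )
            denom = max(a, b)
            scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical samples: two well-separated pairs.
pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(k_distances(pts, k=1))
print(silhouette(pts, [1, 1, 2, 2]))   # good grouping, close to 1
print(silhouette(pts, [1, 2, 1, 2]))   # poor grouping, negative
```

Under this scheme each candidate Eps would be scored by clustering the data with that Eps and passing the resulting labels to `silhouette`, keeping the candidate with the highest score.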

3.2.3. Cluster Analysis

As shown in Figure 3, Figure 4 and Figure 5, the clustering results show that the text, image, and code access data are each divided into two clusters, the DeepSeek cluster (attack and non-attack) and the noise cluster, which helps to exclude irrelevant data and to evaluate accurately the influence of LLMs on evasive Spectre attack detection. The clustering results for Kimi and Doubao exhibit similar characteristics.

Additionally, it can be observed that images have relatively high LLC references and low LLC misses, which indicates that LLM access to images is often accompanied by frequent LLC access. However, due to the spatial locality of images, where adjacent pixels are likely to be accessed sequentially, the cache can effectively pre-fetch neighboring pixel data, reducing LLC misses and lowering the LLC miss rate. A lower LLC miss rate aligns more closely with the characteristics of evasive Spectre attacks, making the attack harder to detect. As a result, the attack detection success rate on images is slightly lower compared to text and code.

As for code, it is observed that the LLC misses are higher compared to text and images. This leads to a higher LLC miss rate, which reflects increased cache side-channel activity, providing the detector with more signals of abnormal LLC access. Consequently, the attack detection success rate for code is higher than for text and images. The reason for this is that the LLC access pattern of code is relatively complex (e.g., function jumps and indirect calls), leading to slightly weaker locality compared to text and images and resulting in a relatively higher LLC miss rate. This further validates the experimental results in Section 4 of the paper.

3.3. Attack Detection

This paper employs the MLP [30] as the benchmark detector for evasive Spectre attack detection, leveraging its strong feature extraction capabilities. Previous studies [5,9] have also demonstrated that MLP performs well in detecting Spectre attacks. Additionally, Random Forest (RF) and Recurrent Neural Network (RNN) are selected as comparative models for evaluation.

3.3.1. Details of Various Models

The training set for these models consists of Spectre attack data, while the test set comprises evasive Spectre attack data. The training set comprises features and corresponding labels, where the labels classify instances as either attack or non-attack. The features include the four HPCs values, the LLC miss rate, and the branch miss rate. Given the challenges of collecting evasive Spectre attack data in real-world scenarios, this study utilizes common Spectre attack data for training. The dataset is divided into training and testing sets at an 8:2 ratio.

For the MLP model, the dataset is standardized using Z-score normalization before training to ensure feature balance. The model comprises two hidden layers with 100 and 50 neurons, respectively, utilizing the ReLU activation function. Training is performed using the Adam optimizer for 500 epochs. To ensure reproducibility, the random seed (random_state = 42) is fixed.
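Z-score normalization rescales every feature to zero mean and unit variance so that large raw counts do not dominate the small miss rates. A minimal sketch is given below; the names `fit_zscore` and `apply_zscore` are hypothetical, the population standard deviation is used, and fitting the statistics on the training rows only and reusing them for the test rows is an assumption about the workflow rather than something the paper states.

```python
import math


def fit_zscore(rows):
    """Per-feature mean and population standard deviation of the training rows."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / n)
        for col, m in zip(columns, means)
    ]
    return means, stds


def apply_zscore(rows, means, stds):
    """Standardize rows with fitted statistics; constant features map to 0.0."""
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical rows: [LLC miss rate, branch miss rate] (illustrative values).
train = [[0.10, 0.010], [0.30, 0.010]]
means, stds = fit_zscore(train)
print(apply_zscore(train, means, stds))
```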

For the RF model, this paper constructs an ensemble of 100 decision trees with a maximum depth of 10. Each tree splits features by minimizing Gini impurity and employs bootstrap sampling to enhance robustness against hardware noise. To ensure stability, multiple experiments are conducted with fixed random seeds. The trained model is then serialized into a .pkl file for future use.
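To illustrate the splitting criterion, the sketch below computes the Gini impurity of a label set and the weighted impurity after a threshold split on one feature. Each tree chooses the split with the lowest weighted impurity. The function names and the sample values are hypothetical and only illustrate the criterion, not the trained model.

```python
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum(p_c^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(values, labels, threshold):
    """Weighted Gini impurity after splitting on values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Hypothetical LLC miss rates with labels (1 = attack, 0 = non-attack).
llc_miss_rate = [0.05, 0.10, 0.30, 0.40]
is_attack = [0, 0, 1, 1]
print(gini(is_attack))                            # impurity before splitting
print(split_gini(llc_miss_rate, is_attack, 0.2))  # a split that separates the classes
```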

For the RNN model, a time-aware training strategy is adopted. The original HPCs data are transformed into a three-dimensional tensor (samples × time steps × features) using a sliding window of 10 time steps to preserve the temporal continuity of attack behavior. The data are standardized along the feature dimension using StandardScaler and are strictly split in chronological order. The model consists of a two-layer stacked SimpleRNN (64 → 32 units) with a Dropout rate of 0.3 between layers to mitigate overfitting, followed by a fully connected layer mapped to a Softmax output. The Adam optimizer is used for 50 epochs with a batch size of 32 to balance memory efficiency and gradient stability.
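The sliding-window transformation can be sketched as follows. The function name `sliding_windows` and the stride of one time step are assumptions; the paper specifies only the window length of 10.

```python
def sliding_windows(samples, window=10):
    """Turn a chronological list of feature vectors into overlapping windows.

    The result has shape (len(samples) - window + 1) x window x n_features.
    Assumption: consecutive windows are shifted by one time step.
    """
    if len(samples) < window:
        return []
    return [samples[i:i + window] for i in range(len(samples) - window + 1)]


# Hypothetical trace: 12 time steps with 2 features each.
trace = [[t, 2 * t] for t in range(12)]
windows = sliding_windows(trace, window=10)
print(len(windows), len(windows[0]), len(windows[0][0]))
```

Because the windows overlap, a chronological train/test split should be made on the raw trace before windowing so that no time step appears on both sides of the split.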

3.3.2. Details of Each Metric

The detection results are presented in a confusion matrix, with key metrics: True Positive (TP) and True Negative (TN). To mitigate data imbalance, TN and TP are presented as ratios, as shown in Equations (4) and (5). TP represents the attack detection success rate, while TN indicates the non-attack detection success rate.

Attack Detection Success Rate (TP):

TP = \frac{tp}{tp + fn}

(4)

where tp denotes correctly identified attack samples and fn represents undetected attack instances.

Non-Attack Detection Success Rate (TN):

TN = \frac{tn}{tn + fp}

(5)

where tn indicates correctly classified benign samples and fp corresponds to false alarms (benign samples misclassified as attacks).

Meanwhile, metrics such as precision (the proportion of actual positive instances among all instances predicted as positive by the model), recall (the proportion of actual positive instances correctly predicted as positive by the model among all actual positive instances), accuracy (the proportion of correctly predicted instances among the total number of instances), F1-score (the harmonic mean of precision and recall), ROC, and AUC provide complementary perspectives; all of them are derived from the same confusion-matrix counts (tp, fn, tn, fp).

The Receiver Operating Characteristic (ROC) curve is used to compare the classification performance of different models by plotting the true positive rate (TP) against the false positive rate (FP = 1 − TN) at various classification thresholds. The Area Under the Curve (AUC) quantifies the overall classification performance of a model; AUC values close to 1 indicate a strong ability to distinguish between positive and negative samples. The AUC is defined in Equation (6):
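The metrics above can be computed from predicted and true labels as in the following sketch. The function name `detection_metrics`, the label encoding (1 = attack, 0 = non-attack), and the convention of returning 0.0 for an empty denominator are assumptions made for illustration.

```python
def detection_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = attack, 0 = non-attack)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)

    def ratio(num, den):
        # Assumption: a metric with an empty denominator is reported as 0.0.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "TP": recall,                    # Equation (4): tp / (tp + fn)
        "TN": ratio(tn, tn + fp),        # Equation (5): tn / (tn + fp)
        "precision": precision,
        "recall": recall,
        "accuracy": ratio(tp + tn, len(pairs)),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical predictions: 4 attack and 4 non-attack samples.
print(detection_metrics([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0, 0, 1]))
```

Note that the attack detection success rate TP of Equation (4) and recall are the same quantity.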

AUC = \int_{0}^{1} TP (FP) d FP

(6)
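In practice the integral in Equation (6) is evaluated numerically. The sketch below builds the ROC points by sweeping the decision threshold over the predicted scores and integrates them with the trapezoidal rule. The function names are hypothetical, tied scores are grouped into a single ROC point, and the input is assumed to contain at least one attack and one non-attack sample.

```python
def roc_points(y_true, scores):
    """(FP rate, TP rate) pairs for every distinct threshold, from strict to lax."""
    pos = sum(y_true)
    neg = len(y_true) - pos             # assumes pos > 0 and neg > 0
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for idx, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after all samples sharing this score are counted.
        if idx + 1 == len(ranked) or ranked[idx + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Trapezoidal approximation of the area under the ROC curve."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical scores: predicted attack probability for four samples.
y_true = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(auc(roc_points(y_true, scores)))
```

A pooled ROC curve over several test sets, as used for model comparison, would follow from concatenating the labels and scores of all sets before calling `roc_points`.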

4. Results and Analysis

4.1. Experiment Configuration

The experiment configuration is listed in Table 2. All experiments in this paper were conducted on our private server, which runs the Ubuntu Linux 18.04.6 LTS operating system and is equipped with an Intel Xeon® Silver 4210 2.2 GHz processor and 125.5 GB of DDR4 memory. The server also features a 502.9 GB disk and utilizes Perf version 5.4.233 for HPCs measurements. The experiments were performed using Python 3.10 within PyCharm Professional 2022.1.3.

The dataset used in this paper consists of two parts: Spectre attacks and evasive Spectre attacks. The Spectre data are used as the training set, as they are more easily available than unknown or novel evasive Spectre attacks. The evasive Spectre attacks, on the other hand, serve as the test set to evaluate the impact of LLMs on the detection of evasive Spectre attacks. The evasive Spectre attacks are further divided into two categories: noisy and noise-free. The noisy scenario involves collecting evasive Spectre attack data while the LLMs simultaneously access text, images, or code. In contrast, the noise-free scenario involves collecting data without running any additional applications to ensure the purity of the data. In addition, the TP is relatively low, primarily because the training data, consisting of Spectre attack and benign data, are insufficient to cover the more stealthy evasive Spectre attacks.

4.2. Comparative Analysis with Alternative Machine-Learning Models

In this section, we conduct a comparative analysis of the performance of three machine-learning models, including MLP, RF, and RNN, in the task of attack detection. The evaluation of model performance is based on two key metrics: TP and the ROC curve with AUC.

As shown in Table 3, the TP of MLP is consistently low across all attack scenarios. The TP even drops to 0 in the DeepSeek image memory setting (without LLMs noise), indicating that MLP struggles with complex non-linear relationships. The performance of MLP is particularly poor in attack scenarios that involve temporal features and contextual information. Due to its simple structure, MLP lacks the ability to model sequential data, making it challenging to capture key features in complex attack patterns.

However, RNN performs well in detecting evasive Spectre memory. For example, the attack detection success rate reaches 54.17% in the DeepSeek code memory setting and 47.70% in the DeepSeek text memory setting. This suggests that RNN can effectively capture temporal features and contextual information from HPCs through its sequential modeling capability. In contrast, RNN performs poorly in detecting evasive Spectre nop, with an attack detection success rate of 0, which indicates that RNN struggles to detect the subtle variations caused by evasive Spectre nop.

Additionally, RF achieves a higher attack detection success rate in most attack scenarios. For instance, the success rate is 25.79% in evasive Spectre nop (DeepSeek code), compared to 12.70% in evasive Spectre nop (DeepSeek text). This suggests that RF, through its ensemble learning and feature selection capabilities, can effectively identify key patterns in code and text data. Meanwhile, RF also performs well in evasive Spectre memory (the attack detection success rate is 1.67%), demonstrating its strong ability to handle structured data.

This paper integrates results from multiple test sets to evaluate the classification performance of different models, including MLP, RF, and RNN. Specifically, for each test set, we extract features and labels, apply each model for prediction, and combine the predicted probabilities with the ground truth labels into a global dataset. Based on this dataset, we compute the ROC curve and AUC value for each model. By plotting the global ROC curve and comparing AUC values, as shown in Figure 6, we find that the RF model achieves the best classification performance, with a higher AUC than MLP and RNN, indicating its superior detection capability in this task.

Based on the above analysis, we propose adopting RF as the primary attack detection model, with its detection results serving as the main analytical subject. The rationale is threefold:

Superior Comprehensive Performance: RF demonstrates higher attack detection success rates across most attack scenarios, particularly excelling in detecting evasive Spectre nop within code and text data. This indicates RF’s strong capability to capture critical data patterns effectively.

Enhanced Feature Selection: The inherent advantage of RF lies in its ensemble learning framework, which automatically identifies significant features through multiple decision trees.

Non-Parametric Modeling: RF eliminates the need for predefined functional relationships between features and target variables, autonomously capturing non-monotonic correlations.

As shown in Figure 7, the test set for evasive Spectre attacks exhibits non-linear statistical deviations in both branch miss rate and LLC miss rate compared to the training set for Spectre, and RF can adapt to such changes.

4.3. The Effectiveness of Random Forest in Detecting Evasive Spectre Attacks

As shown in Figure 8, Figure 9 and Figure 10, the attack detection success rate follows the pattern “code > text > image”; LLMs have a small impact on the TP for evasive Spectre attacks. We have analyzed the results using DeepSeek as a case study.

As for evasive Spectre memory (DeepSeek) in Figure 7a, code tasks are clustered in the high LLC miss rate region (>0.25) and are accompanied by a high branch miss rate (close to 0.01). Text tasks are mainly concentrated around a branch miss rate of approximately 0.01, but with a slightly lower LLC miss rate. Image tasks are primarily distributed below 0.01 branch miss rate, with low LLC miss rates, indicating that attacks have a smaller impact on them. As for evasive Spectre nop (DeepSeek) in Figure 7b, the code still exhibits a relatively high branch miss rate, but the LLC miss rate becomes more dispersed while remaining high. The text tasks remain at a moderate level, with little change from the first plot. The images continue to maintain a low branch miss rate and LLC miss rate. Therefore, we can draw the following conclusions: Code tasks exhibit a high cache miss rate, which closely aligns with the characteristics of Spectre attacks, making them easier to detect. Text tasks show a moderate branch miss rate and LLC miss rate, leading to a detection success rate lower than code tasks but higher than image tasks. Image tasks have low branch miss rates and low LLC miss rates, making them more similar to evasive Spectre attacks. This results in the lowest attack detection success rate. The underlying reason for these differences is closely related to the computational characteristics of LLMs tasks.

To better explain each task, including code, text, and image tasks, we categorize and analyze the impact of evasive Spectre attacks on different tasks as follows:

Code Tasks: (1) High Branch Miss Rate. Code tasks often involve numerous conditional statements, loops, and nested structures, leading to frequent branch instructions. Although modern CPUs feature advanced branch prediction mechanisms, the complexity and variability of control flow still result in a certain degree of misprediction, which manifests as a high branch miss rate in HPCs. (2) High LLC Miss Rate. When processing code, LLMs must handle a vast number of parameters and intermediate computation results. These data may exceed the L3 cache capacity, leading to frequent LLC misses. Consequently, the processor must retrieve data from main memory, significantly increasing the LLC miss rate. Furthermore, the scattered nature of data structures and instruction distributions reduces the likelihood of consecutive memory accesses, making it difficult to fully utilize hardware caching and further contributing to a high LLC miss rate.

Text Tasks: Moderate LLC Miss Rate. Text tasks primarily involve string processing, sequence modeling, and transformer-based computations. Since raw text data are typically stored as contiguous arrays or sequences, they exhibit good spatial locality. Moreover, when LLMs process text, the relatively continuous data layout allows hardware prefetchers to capture and preload subsequent data more effectively, thereby reducing the LLC miss rate compared to code tasks.

Image Tasks: (1) Low LLC Miss Rate. Images are usually stored in 2D or 3D arrays, which tend to be contiguous in memory. This spatial locality enables a single cache line load to cover a large number of subsequent pixels, significantly reducing the LLC miss rate. (2) Low Branch Miss Rate. The core algorithms in image tasks focus on numerical computations and matrix operations. The control flow consists predominantly of simple loops with few conditional branches or complex jumps. This simple, linear control flow allows the CPU’s branch predictor to make highly accurate predictions, leading to a low branch miss rate.
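The locality argument above can be illustrated with a toy cache model. The following is a minimal sketch, not the measurement setup used in this study: it assumes a fully associative LRU cache with 64-byte lines and a capacity of 512 lines, 4-byte elements, and a hypothetical 64 MiB address range for the scattered pattern. All names and parameters are illustrative assumptions.

```python
import random
from collections import OrderedDict

LINE_BYTES = 64       # assumed cache line size
CAPACITY_LINES = 512  # assumed toy cache capacity, in lines
ELEM_BYTES = 4        # assumed element size (e.g., one float32 value)


def miss_rate(addresses, capacity_lines=CAPACITY_LINES):
    """Return the fraction of accesses that miss in a toy LRU cache."""
    cache = OrderedDict()  # keys are cache line numbers, ordered by recency
    misses = 0
    total = 0
    for addr in addresses:
        total += 1
        line = addr // LINE_BYTES
        if line in cache:
            cache.move_to_end(line)  # hit: mark line as most recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > capacity_lines:
                cache.popitem(last=False)  # evict least recently used line
    return misses / total if total else 0.0


n_accesses = 100_000

# Contiguous pattern: consecutive elements of one array (image-like layout).
contiguous = [i * ELEM_BYTES for i in range(n_accesses)]

# Scattered pattern: element-aligned addresses drawn from a 64 MiB range.
rng = random.Random(0)
span_elems = (64 * 1024 * 1024) // ELEM_BYTES
scattered = [rng.randrange(span_elems) * ELEM_BYTES for _ in range(n_accesses)]

contiguous_rate = miss_rate(contiguous)
scattered_rate = miss_rate(scattered)
print(f"contiguous miss rate: {contiguous_rate:.4f}")
print(f"scattered miss rate:  {scattered_rate:.4f}")
```

In this model the contiguous pattern misses once per cache line, that is, one access in 16 (64 bytes per line divided by 4 bytes per element), whereas the scattered pattern misses on almost every access. Real LLC behavior also depends on prefetching, associativity, and concurrent workloads, none of which are modeled here.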

4.4. Broader Evaluations Across Different Hardware Architectures

The previous experiments were conducted on an NVIDIA RTX 2080 Ti. To further evaluate the impact of different hardware architectures on evasive Spectre attacks, we extended our study to an NVIDIA RTX 3060 GPU. The experimental setup for this phase is as follows: GPU: RTX 3060 (12 GB), CPU: Intel Xeon E5-2680 v2 (10 cores, 20 threads), Memory: 32 GB. The experimental results (Table 4) show that, on the RTX 3060, the attack detection success rate for code tasks decreased, while those for text and image tasks increased compared to the RTX 2080 Ti. This is because the cache hierarchy design in the Ampere architecture (RTX 3060) significantly alters the impact of different tasks on hardware event behaviors. A detailed analysis of the impact of the hardware on each task type is provided below.

Code tasks (for example, those involving dynamic memory allocation and virtual function calls) generally exhibit strong temporal locality. In the Turing architecture (RTX 2080 Ti), the smaller L2 cache cannot effectively hold this locality-related data, so more accesses fall through to the L3 cache. This results in a higher L3 cache miss rate, making the behavior more similar to a Spectre attack. In contrast, in the Ampere architecture (RTX 3060), the L2 cache is significantly larger, allowing most locality-based accesses to be satisfied within L2 and reducing the dependency on L3. This lowers the L3 cache miss rate, leading to a decrease in the attack detection success rate.

Text-related tasks (such as transformer models) frequently access large-scale embedding tables, which usually cannot fit entirely within the 4 MB L2 cache. As a result, a substantial number of accesses overflow into the L3 cache. In addition, the attention mechanism involves large-scale matrix operations (such as QKV transformations), which require accessing non-contiguous memory regions, further increasing the L3 cache miss rate.

For image tasks with high parallelism in convolution operations, the RTX 3060 (Ampere) leverages its significantly larger L1 and L2 caches compared to previous architectures (Turing). The hardware prioritizes allocating more data to the L1 and L2 caches rather than directly utilizing the LLC (L3 cache). While this strategy enhances data access speeds, it may reduce LLC utilization for image tasks that heavily rely on texture data, leading to an increased LLC miss rate. As shown in Figure 11, both image and text tasks exhibit relatively higher LLC miss rates.

4.5. Comparison with State-of-the-Art Research

Table 5 compares previous studies with our approach in terms of the number of HPC events, machine-learning models, workload diversity, and attack detection success rate. Previous research primarily employed LR, SVM, MLP, or Neural Networks (NNs) for detection, leveraging 4 to 6 HPC events to identify evasive Spectre attacks, achieving attack detection success rates between 70% and 100%. However, these studies did not consider the computational load of LLMs.

In contrast, our approach uses MLP, RF, and RNN to investigate attack detection in LLM environments. Our experiments consider tasks related to code, image, and text, increasing workload diversity. Additionally, 4 HPC events are selected to ensure comparability with existing studies. However, the experimental results indicate that the introduction of LLMs significantly alters the behavioral patterns of microarchitectural attacks, making traditional HPC-event-based detection methods less effective in adapting to these new attack patterns.

Therefore, future research should explore more adaptive HPC event selection strategies and integrate adaptive cache management and hardware-based solutions to effectively counter evasive Spectre attacks under LLM interference.

4.6. Discussion and Limitation

This paper evaluates the impact of LLMs’ access to different data types (text, image, and code) on evasive Spectre attacks. (1) This study reveals that different data-accessing tasks exhibit distinct HPC patterns under evasive Spectre attacks, and that these patterns vary across hardware architectures. This insight could be leveraged to design more adaptive and architecture-aware runtime anomaly detection systems for AI workloads, enhancing the security of large-scale LLM deployments. (2) Given that attack detection success rates differ with both task type and underlying hardware architecture, security frameworks could implement architecture-specific and task-aware defense strategies. For instance, on architectures where attacks running alongside image-accessing tasks are harder to detect, enhanced memory access monitoring and cache protection mechanisms could be prioritized. However, some limitations remain that could provide directions for future research.

Limitation of Datasets: This study focuses primarily on three types of data: images, text, and code. Although these data types are representative, LLMs are likely to process more complex data types in real-world applications, such as video and audio. Future research could extend the comparative analysis to a broader range of data types. Moreover, the types of LLM tasks used in the study may not fully represent all real-world AI workloads. A more diverse set of AI tasks could be analyzed to improve the comprehensiveness of the findings.

Limitation of Attack Types: This paper investigates the widely influential evasive Spectre attacks. However, other side-channel attacks, such as Meltdown [27,31] and Foreshadow [32], may have different impacts on memory access patterns. Future studies could consider incorporating other attack types into the research to further refine the understanding of attack detection.

5. Conclusions

This paper studies the impact of LLMs (DeepSeek, Kimi, Doubao) on evasive Spectre attack detection for text, image, and code accessing. To evaluate the influence of LLMs thoroughly, a dataset was constructed by running different LLM tasks concurrently with evasive Spectre attacks. Subsequently, clustering algorithms were used to reduce the dimensionality of the data and filter out representative samples for the test set. Based on the RF detection model, the attack detection success rate of the evasive Spectre attacks follows the order of “code > text > image”. Furthermore, to assess the influence of different hardware architectures, additional cross-platform experiments were conducted on an NVIDIA RTX 3060 GPU. The results show that, on the RTX 3060, the attack detection success rate for code tasks decreased, while those for text and image tasks increased compared to the RTX 2080 Ti. This finding suggests that hardware architecture variations significantly impact the manifestation of HPCs, thereby influencing the effectiveness of attack detection.

Future research can build on these findings to explore more effective mitigation strategies for evasive Spectre attacks under LLM interference, including the following potential defense mechanisms:

Adaptive Cache Management: Since evasive Spectre attacks rely on abnormal cache behavior, dynamically adjusting cache management strategies can help mitigate the impact of such attacks.
Hardware-Based Solutions: Implementing cache partitioning or randomized cache indexing can reduce the likelihood of attackers exploiting cache side channels to infer sensitive data.

Author Contributions

Conceptualization, J.J. and L.J.; methodology, J.J. and L.J.; software, L.J. and Q.Z.; validation, J.J., L.J. and Q.Z.; formal analysis, J.J., L.J. and R.W.; investigation, J.J. and L.J.; resources, J.J., L.J. and Q.Z.; data curation, J.J., L.J. and Q.Z.; writing—original draft preparation, J.J., L.J. and R.W.; writing—review and editing, J.J., L.J. and R.W.; visualization, L.J.; supervision, J.J.; project administration, J.J.; funding acquisition, J.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Shanghai Pujiang Talent Program (No.21PJD026).

Data Availability Statement

The code used in this article can be obtained from the authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Chang, Y.; Wang, X.; Wang, J.; Wu, Y.; Yang, L.; Zhu, K.; Chen, H.; Yi, X.; Wang, C.; Wang, Y.; et al. A Survey on Evaluation of Large Language Models. Assoc. Comput. Mach. 2024, 15, 39.
2. Brown, T.B.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language Models Are Few-Shot Learners. In Proceedings of the 34th International Conference on Neural Information Processing Systems (NIPS 2020), Vancouver, BC, Canada, 6–12 December 2020; Curran Associates Inc.: Red Hook, NY, USA, 2020. Article No. 159. pp. 1–25, ISBN 9781713829546.
3. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 6000–6010, ISBN 9781510860964.
4. Kocher, P.; Horn, J.; Fogh, A.; Genkin, D.; Gruss, D.; Haas, W.; Hamburg, M.; Lipp, M.; Mangard, S.; Prescher, T.; et al. Spectre Attacks: Exploiting Speculative Execution. In Proceedings of the 2019 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 20–22 May 2019; IEEE: New York, NY, USA, 2019; pp. 1–19.
5. Li, C.; Gaudiot, J.-L. Detecting Spectre Attacks Using Hardware Performance Counters. IEEE Trans. Comput. 2022, 71, 1320–1331.
6. Li, C.; Gaudiot, J.-L. Challenges in Detecting an “Evasive Spectre”. IEEE Comput. Archit. Lett. 2020, 19, 18–21.
7. Polychronou, N.F.; Thevenon, P.-H.; Puys, M.; Beroulle, V. MaDMAN: Detection of Software Attacks Targeting Hardware Vulnerabilities. In Proceedings of the 2021 24th Euromicro Conference on Digital System Design (DSD), Palermo, Spain, 1–3 September 2021; pp. 355–362.
8. Pashrashid, A.; Hajiabadi, A.; Carlson, T.E. Fast, Robust and Accurate Detection of Cache-based Spectre Attack Phases. In Proceedings of the 2022 IEEE/ACM International Conference on Computer Aided Design (ICCAD), San Diego, CA, USA, 30 October–3 November 2022; pp. 1–9.
9. Jiao, J.; Wen, R.; Li, Y. T-Smade: A Two-Stage Smart Detector for Evasive Spectre Attacks Under Various Workloads. Electronics 2024, 13, 4090.
10. Kosasih, W.; Feng, Y.; Chuengsatiansup, C.; Yarom, Y.; Zhu, Z. SoK: Can We Really Detect Cache Side-Channel Attacks by Monitoring Performance Counters? In Proceedings of the 19th ACM Asia Conference on Computer and Communications Security, Singapore, 1–5 July 2024; Association for Computing Machinery: New York, NY, USA, 2024; pp. 172–185.
11. He, Z.; Hu, G.; Lee, R.B. CloudShield: Real-time Anomaly Detection in the Cloud. In Proceedings of the Thirteenth ACM Conference on Data and Application Security and Privacy, Charlotte, NC, USA, 24–26 April 2023; Association for Computing Machinery: New York, NY, USA, 2023; pp. 91–102.
12. Guide, P. Volume 3B: System Programming Guide Part. Intel® 64 and IA-32 Architectures Software Developer’s Manual; Intel: Santa Clara, CA, USA, 2011; pp. 1–40. Available online: https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html (accessed on 26 September 2024).
13. Advanced Micro Devices. AMD64 Architecture Programmer’s Manual Volume 2: System Programming; Advanced Micro Devices: Santa Clara, CA, USA, 2006.
14. Mandal, U.; Shukla, S.; Rastogi, A.; Bhattacharya, S.; Mukhopadhyay, D. μLAM: A LLM-Powered Assistant for Real-Time Micro-architectural Attack Detection and Mitigation. In Proceedings of the ICCAD ’24 IEEE International Conference on Computer-Aided Design, New York, NY, USA, 27–31 October 2024; Cryptology ePrint Archive, Paper 2024/1978. 2024.
15. Yu, Y.; Chen, X. Multi-Tenant Deep Learning Acceleration with Competitive GPU Resource Sharing. In Proceedings of the 2023 IEEE Cloud Summit, Baltimore, MD, USA, 6–7 July 2023; pp. 49–51.
16. Raiaan, M.A.K.; Hossain, M.S.; Kaniz, F.; Mohammad, N.F.; Sadman, S.; Jannat, M.M.; Ahmad, J.; Eunus, M.A.; Azam, S. A Review on Large Language Models: Architectures, Applications, Taxonomies, Open Issues and Challenges. IEEE Access 2024, 12, 26839–26874.
17. Radford, A.; Narasimhan, K. Improving Language Understanding by Generative Pre-Training. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), Brussels, Belgium, 31 October–4 November 2018.
18. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv 2019, arXiv:1810.04805.
19. Chen, T.; Li, L.; Zhu, L.; Li, Z.; Liu, X.; Liang, G.; Wang, Q.; Xie, T. VulLibGen: Generating Names of Vulnerability-Affected Packages via a Large Language Model. arXiv 2024, arXiv:2308.04662.
20. Chow, Y.W.; Schäfer, M.; Pradel, M. Beware of the Unexpected: Bimodal Taint Analysis. In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, Seattle, WA, USA, 17–21 July 2023; Association for Computing Machinery: New York, NY, USA, 2023; pp. 211–222.
21. Alkhatib, N.; Mushtaq, M.; Ghauch, H.; Danger, J.-L. CAN-BERT do it? Controller Area Network Intrusion Detection System based on BERT Language Model. In Proceedings of the 2022 IEEE/ACS 19th International Conference on Computer Systems and Applications (AICCSA), Abu Dhabi, United Arab Emirates, 5–8 December 2022; IEEE: New York, NY, USA, 2022; pp. 1–8.
22. Mechri, A.; Ferrag, M.A.; Debbah, M. SecureQwen: Leveraging LLMs for Vulnerability Detection in Python Codebases. Comput. Secur. 2025, 148, 104151.
23. Gonçalves, J.; Dias, T.; Maia, E.; Praça, I. SCoPE: Evaluating LLMs for Software Vulnerability Detection. arXiv 2024, arXiv:2407.14372.
24. Zhao, X.; Leng, X.; Wang, L.; Wang, N.; Liu, Y. Efficient Anomaly Detection in Tabular Cybersecurity Data Using Large Language Models. Sci. Rep. 2025, 15, 3344.
25. DeepSeek-AI; Liu, A.; Feng, B.; Xue, B.; Wang, B.; Wu, B.; Lu, C.; Zhao, C.; Deng, C.; Zhang, C.; et al. DeepSeek-V3 Technical Report. arXiv 2025, arXiv:2412.19437.
26. DeepSeek-AI; Guo, D.; Yang, D.; Zhang, H.; Song, J.; Zhang, R.; Xu, R.; Zhu, Q.; Ma, S.; Wang, P.; et al. DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning. arXiv 2025, arXiv:2501.12948.
27. Zhang, J.; Chen, C.; Cui, J.; Li, K. Timing Side-channel Attacks and Countermeasures in CPU Microarchitectures. ACM Comput. Surv. 2024, 56, 178.
28. Khan, K.; Rehman, S.U.; Aziz, K.; Fong, S.J.; Sarasvady, S.; Vishwa, A. DBSCAN: Past, Present and Future. Proc. Int. Conf. Appl. Digit. Inf. Web Technol. (ICADIWT) 2014, 5, 232–238.
29. Ester, M.; Kriegel, H.-P.; Sander, J.; Xu, X. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, OR, USA, 2–4 August 1996; AAAI Press: Portland, OR, USA, 1996; pp. 226–231.
30. Rosenblatt, F. The perceptron: A probabilistic model for information storage and organization in the brain. Psychol. Rev. 1958, 65, 386–408.
31. Lipp, M.; Schwarz, M.; Gruss, D.; Prescher, T.; Haas, W.; Horn, J.; Mangard, S.; Kocher, P.; Genkin, D.; Yarom, Y.; et al. Meltdown: Reading Kernel Memory from User Space. Commun. ACM 2020, 63, 46–56.
32. Van Bulck, J.; Minkin, M.; Weisse, O.; Genkin, D.; Kasikci, B.; Piessens, F.; Silberstein, M.; Wenisch, T.F.; Yarom, Y.; Strackx, R. Foreshadow: Extracting the Keys to the Intel SGX Kingdom with Transient Out-of-Order Execution. In Proceedings of the 27th USENIX Conference on Security Symposium, Baltimore, MD, USA, 15–17 August 2018; USENIX Association: Baltimore, MD, USA, 2018; pp. 991–1008.

Figure 1. HPCs analysis: comparative study of LLC miss rate and branch miss rate in Spectre, evasive Spectre, and normal.

Figure 2. System Structure For the Evaluation of the Impact of Large Language Models on the Detection of Evasive Spectre Attacks under Different Access Types: A Complete Process Covering Data Collection, Cluster Analysis, and Attack Detection.

Figure 3. Visualization analysis of data clustering and noise from evasive Spectre attacks during DeepSeek’s code access.

Figure 4. Visualization analysis of data clustering and noise from evasive Spectre attacks during DeepSeek’s image access.

Figure 5. Visualization analysis of data clustering and noise from evasive Spectre attacks during DeepSeek’s text access.

Figure 6. ROC curve analysis on MLP, RF, and RNN architectures.

Figure 7. Comparative analysis of LLC miss rate vs. branch miss rate under evasive Spectre attacks on the RTX 2080 Ti architecture.

Figure 8. Performance comparison of DeepSeek impacts on evasive Spectre attack detection across text, image, and code.

Figure 9. Performance comparison of Kimi impacts on evasive Spectre attack detection across text, image, and code.

Figure 10. Performance comparison of Doubao impacts on evasive Spectre attack detection across text, image, and code.

Figure 11. Comparative analysis of LLC miss rate vs. branch miss rates under evasive Spectre attacks on RTX 3060 architecture.

Table 1. Selected HPC events for attack detection.

HPC Events (Hardware Performance Counter Events)	Description
branches	Branch instructions retired
branch misses	Branch misprediction retired
LLC references	Last-level cache reference
LLC misses	Last-level cache missed
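The four events in Table 1 can be combined into the two rates discussed in the analysis. The following is a minimal sketch, assuming each rate is the ratio of the miss event to its corresponding reference event; the function name, the dictionary keys (taken from the event names in Table 1), and the sample counts are hypothetical and are not measured data.

```python
def miss_rates(counts):
    """Derive branch and LLC miss rates from raw HPC event counts."""

    def ratio(misses, total):
        # Guard against an empty sampling interval (zero reference events).
        return misses / total if total else 0.0

    return {
        "branch miss rate": ratio(counts["branch misses"], counts["branches"]),
        "LLC miss rate": ratio(counts["LLC misses"], counts["LLC references"]),
    }


# Hypothetical counts for one sampling interval (illustrative only).
sample = {
    "branches": 1_000_000,
    "branch misses": 10_000,
    "LLC references": 40_000,
    "LLC misses": 12_000,
}

rates = miss_rates(sample)
for name, value in rates.items():
    print(f"{name}: {value:.4f}")
```

Using ratios rather than raw counts keeps samples comparable across intervals in which the monitored workload executes different numbers of instructions.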

Table 2. Experimental configuration.

Item	Configuration
operating system	Linux 5.4.0-146-generic
distribution	Ubuntu 18.04.6 LTS
memory	125.5 GiB
processor	Intel Xeon® Silver 4210 CPU @ 2.2 GHz × 20
graphics	llvmpipe (LLVM 10.0.0, 256 bits)
GNOME	3.28.2
OS type	64 bit
disk	502.9 GB
software	Pycharm professional 2022.1.3
Python	Python 3.10
Perf (HPCs)	Perf version 5.4.233

Table 3. Comparison Table of Detection Success Rates (TP) for Evasive Spectre Attacks under No-Noise and Noisy Conditions Based on Different DeepSeek Access Types.

Attack Type	MLP	RF	RNN
Evasive Spectre Memory	0.33%	1.67%	0.84%
Evasive Spectre Nop	0.17%	0.67%	0.00%
DeepSeek Code Nop	4.32%	25.79%	6.37%
DeepSeek Code Memory	0.19%	30.35%	54.17%
DeepSeek Image Nop	0.13%	2.37%	1.20%
DeepSeek Image Memory	0.00%	7.32%	29.58%
DeepSeek Text Nop	0.49%	12.70%	2.12%
DeepSeek Text Memory	0.06%	27.38%	47.70%
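The ordering of task types can be checked directly against the values in Table 3. The following is a minimal sketch that ranks the three DeepSeek task types by detection success rate for each model and attack variant; the data structure and function name are illustrative, and the numbers are copied from the table.

```python
# Detection success rates (%) copied from Table 3, DeepSeek rows only.
TABLE_3 = {
    ("Code", "Nop"): {"MLP": 4.32, "RF": 25.79, "RNN": 6.37},
    ("Code", "Memory"): {"MLP": 0.19, "RF": 30.35, "RNN": 54.17},
    ("Image", "Nop"): {"MLP": 0.13, "RF": 2.37, "RNN": 1.20},
    ("Image", "Memory"): {"MLP": 0.00, "RF": 7.32, "RNN": 29.58},
    ("Text", "Nop"): {"MLP": 0.49, "RF": 12.70, "RNN": 2.12},
    ("Text", "Memory"): {"MLP": 0.06, "RF": 27.38, "RNN": 47.70},
}


def task_ranking(model, attack):
    """Return task types sorted from highest to lowest detection rate."""
    tasks = ("Code", "Image", "Text")
    return sorted(tasks, key=lambda t: TABLE_3[(t, attack)][model], reverse=True)


rankings = {
    (model, attack): task_ranking(model, attack)
    for model in ("MLP", "RF", "RNN")
    for attack in ("Nop", "Memory")
}

for (model, attack), order in rankings.items():
    print(f"{model:>3} / {attack:<6}: {' > '.join(order)}")
```

For the values in Table 3, every model and attack combination yields the order Code > Text > Image, although the absolute rates differ considerably between models.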

Table 4. Detection success rates of different attacks on RTX 2080 Ti and RTX 3060.

Attack Type	RTX 2080 Ti (%)	RTX 3060 (%)
DeepSeek Code Nop	25.79	20.22
DeepSeek Code Memory	30.35	23.91
DeepSeek Image Nop	2.37	42.47
DeepSeek Image Memory	7.32	41.64
DeepSeek Text Nop	12.70	34.45
DeepSeek Text Memory	27.38	42.56
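The direction and size of the change between the two GPUs can be computed from Table 4. The following is a minimal sketch; the names are illustrative, and the values are copied from the table.

```python
# (RTX 2080 Ti %, RTX 3060 %) detection success rates copied from Table 4.
TABLE_4 = {
    "DeepSeek Code Nop": (25.79, 20.22),
    "DeepSeek Code Memory": (30.35, 23.91),
    "DeepSeek Image Nop": (2.37, 42.47),
    "DeepSeek Image Memory": (7.32, 41.64),
    "DeepSeek Text Nop": (12.70, 34.45),
    "DeepSeek Text Memory": (27.38, 42.56),
}


def rate_changes(table):
    """Return the change in percentage points from RTX 2080 Ti to RTX 3060."""
    return {name: round(new - old, 2) for name, (old, new) in table.items()}


changes = rate_changes(TABLE_4)
for name, delta in changes.items():
    direction = "decrease" if delta < 0 else "increase"
    print(f"{name:<22} {delta:+7.2f} pp ({direction})")
```

Negative values correspond to the code tasks, and positive values to the text and image tasks, in line with the trend described in Section 4.4.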

Table 5. Comparison of methods and their attack detection success rates.

Method	HPC Events	ML Models	Workload Variety	Attack Detection Success Rate
[5] 2022	4	LR, SVM, MLP	1	evasive Spectre nop: 70%
[7] 2021	6	LR	2	evasive Spectre: 100%
[10] 2024	4	NN	1	evasive Spectre: 100%
[9] 2024	4	MLP	3	evasive Spectre nop: 95.42%; evasive Spectre memory: 100%
Ours	4	MLP, RF, RNN	3	DeepSeek Code Nop: 25.79%; DeepSeek Image Nop: 2.37%; DeepSeek Text Nop: 12.70%; DeepSeek Code Memory: 30.35%; DeepSeek Image Memory: 7.32%; DeepSeek Text Memory: 27.38%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
