An intrusion detection system (IDS) is a critical countermeasure for detecting cyberattacks on the CAN bus and protecting the vehicle’s systems. There are two main types of IDS on the CAN bus: CAN packet-based and ECU characteristic-based IDSs [16]. While CAN packet-based approaches exploit various features of CAN frames, ECU characteristic-based approaches focus on the physical features of ECUs. In this section, we provide a comprehensive literature review of CAN packet-based related works that use techniques close to CANPerFL, such as machine learning, federated learning, and transfer learning.
3.1. Machine Learning-Based IDS
Many researchers have applied machine learning and deep learning techniques to detect intrusions on the CAN bus. For instance, hierarchical temporal memory (HTM) was proposed [17] to develop a distributed anomaly detection system for the in-vehicle network. The proposed system predicted the next bit based on the data stream of each CAN ID and calculated an anomaly score from the incoming data and the prediction to determine whether an attack was present.
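The scoring step can be illustrated with a minimal sketch. It assumes the predictor emits a set of bit positions expected to be active in the next frame and uses the common HTM-style score, one minus the fraction of observed active bits that were predicted. The function names and the threshold value are hypothetical and are not taken from [17].

```python
def anomaly_score(predicted_bits, observed_bits):
    """Return 1 - |predicted & observed| / |observed|.

    0.0 means every observed active bit was predicted; 1.0 means none was.
    Assumption: both arguments are iterables of active bit positions.
    """
    observed = set(observed_bits)
    if not observed:
        return 0.0  # nothing observed, so nothing unexpected
    overlap = len(observed & set(predicted_bits))
    return 1.0 - overlap / len(observed)


def is_anomalous(predicted_bits, observed_bits, threshold=0.5):
    """Flag a frame when its score exceeds a (hypothetical) threshold."""
    return anomaly_score(predicted_bits, observed_bits) > threshold
```

In practice the threshold would be calibrated on attack-free traffic for each CAN ID.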
Song et al. [18] presented a reduced Inception-ResNet model trained on a sequential binary CAN ID matrix. The study became state-of-the-art in the field by achieving a relatively small error rate compared to other models. However, the proposed architecture is considered too complex to deploy in an ECU.
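One way to build such a binary CAN ID matrix is sketched below: each CAN ID in a window becomes one row of bits, most significant bit first. The default width of 29 bits (the length of an extended CAN identifier) and the helper names are assumptions for illustration; the exact encoding in [18] may differ.

```python
def can_id_to_bits(can_id, width=29):
    """Encode one CAN ID as a list of `width` bits, MSB first."""
    if not 0 <= can_id < (1 << width):
        raise ValueError(f"CAN ID {can_id:#x} does not fit in {width} bits")
    return [(can_id >> shift) & 1 for shift in range(width - 1, -1, -1)]


def ids_to_binary_matrix(can_ids, width=29):
    """Stack the bit rows of consecutive CAN IDs into a 2-D matrix (list of lists)."""
    return [can_id_to_bits(can_id, width) for can_id in can_ids]


# Hypothetical window of three standard (11-bit) identifiers.
window = [0x316, 0x18F, 0x260]
matrix = ids_to_binary_matrix(window, width=11)
```

The resulting matrix can be treated like a single-channel image and passed to a convolutional model.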
In another study, the IDS model was defined as an LSTM autoencoder fed with various signals along with their corresponding CAN IDs and trained with a reconstruction loss [19]. While the idea is interesting, the reported detection results leave room for improvement.
Derhab et al. [20] introduced a new way of encoding CAN data by using a histogram of sequential CAN data fields. Then, a one-class support vector machine (OCSVM) model was trained to detect attacks on the CAN bus. However, the model required a large window size to achieve high performance.
Desta et al. [21] developed a lightweight deep-learning CNN model based on recurrence plots built from sequences of consecutive CAN IDs. Although the model was able to adapt to time and hardware constraints, it performed poorly on multiclass classification.
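A recurrence plot for a categorical sequence such as CAN IDs can be sketched as a square binary matrix whose entry (i, j) is 1 when the IDs at positions i and j are equal. This is the generic definition; the window length and any preprocessing used in [21] are not reproduced here, and the function name is hypothetical.

```python
def recurrence_plot(can_ids):
    """Return an n x n binary matrix R with R[i][j] = 1 iff can_ids[i] == can_ids[j]."""
    n = len(can_ids)
    return [
        [1 if can_ids[i] == can_ids[j] else 0 for j in range(n)]
        for i in range(n)
    ]
```

Periodic traffic produces regular diagonal stripes in this matrix, so injected frames tend to show up as breaks in the pattern that a CNN can learn to recognize.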
Sequential patterns in CAN IDs can be used to build a model that predicts the next CAN ID given a sequence [22]. For prediction, the authors used a bidirectional generative pretrained transformer (GPT), a recent advanced model in natural language processing. Two bidirectional GPT models were implemented as an IDS, where the CAN ID sequences are converted into integers and performance is evaluated using the negative log-likelihood (NLL).
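The NLL criterion can be illustrated with a much simpler stand-in for the transformer. The sketch below swaps the GPT model for a smoothed bigram (previous ID to next ID) model, which is an assumption made purely to keep the example short; only the scoring idea, a higher mean NLL for less expected sequences, carries over. All names and the toy traffic are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_bigram(sequences):
    """Count next-ID occurrences for every previous ID in attack-free sequences."""
    counts = defaultdict(Counter)
    vocab = set()
    for seq in sequences:
        vocab.update(seq)
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts, vocab


def mean_nll(seq, counts, vocab, alpha=1.0):
    """Mean negative log-likelihood of `seq` under the add-alpha smoothed bigram model."""
    vocab_size = len(vocab) + 1  # one extra slot for IDs never seen in training
    total, steps = 0.0, 0
    for prev, nxt in zip(seq, seq[1:]):
        row = counts.get(prev, Counter())
        prob = (row[nxt] + alpha) / (sum(row.values()) + alpha * vocab_size)
        total -= math.log(prob)
        steps += 1
    return total / steps if steps else 0.0


# Toy attack-free traffic: three IDs repeating periodically.
counts, vocab = train_bigram([[1, 2, 3] * 10])
normal_score = mean_nll([1, 2, 3, 1, 2, 3], counts, vocab)
attack_score = mean_nll([1, 2, 0, 3, 0, 1, 0], counts, vocab)  # ID 0 injected
```

A window would be flagged when its mean NLL exceeds a threshold chosen on attack-free data.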
To improve the performance of IDS, Zhang et al. [23] combined rule-based and machine learning-based methods to develop a hybrid two-stage IDS. The proposed system was shown to be efficient when tested on the authors' own collected datasets.
Electric signals generated during the transmission of CAN packets contain slight variations that can identify individual electronic control units (ECUs). Murvay et al. [24] utilized the CAN ID field of CAN packets to create digital fingerprints for all ECUs. By measuring the electric signals associated with the CAN ID field, the authors generated fingerprints for the IDS. The IDS was evaluated in a simulated environment with ten USB-to-CAN devices and five CAN development boards.
Lee et al. [25] found that malicious CAN packets can affect the response time distributions of ECUs. They proposed an IDS that periodically sends request messages to all ECUs and compares their response times to known distributions. If a response time deviates from its expected distribution, it may indicate malicious packet injection. However, the IDS is limited to detecting attacks that affect response times.
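The comparison against a known distribution can be sketched with a simple z-score test. This is a stand-in chosen for brevity, not the statistic used in [25]; the baseline measurements, the threshold of 3.0, and the function names are hypothetical.

```python
import statistics


def fit_baseline(response_times):
    """Summarize attack-free response times (seconds) as (mean, sample std)."""
    return statistics.mean(response_times), statistics.stdev(response_times)


def deviates(response_time, baseline, z_threshold=3.0):
    """Return True when a response time lies more than z_threshold stds from the mean."""
    mean, std = baseline
    if std == 0:
        return response_time != mean
    return abs(response_time - mean) / std > z_threshold


# Hypothetical attack-free measurements for one ECU, in milliseconds.
baseline = fit_baseline([1.0, 1.1, 0.9, 1.0, 1.05, 0.95])
```

One baseline per ECU would be maintained, since response times differ between units.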
Markovitz et al. [26] identified four common data types in the CAN data field: constant value, multi-value, counter value, and sensor value. As commercial vehicle CAN packet specifications are confidential, an algorithm was developed to assign each part of the 64-bit CAN data field to one of the four data types. Classification and specific rules for each data type were used to identify potential attacks on the automotive CAN system.
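A rough sketch of how a single field could be assigned to one of the four types from its observed values is given below. The decision rules and the `multi_value_limit` parameter are illustrative assumptions, not the algorithm of [26], which also has to decide where the field boundaries lie within the 64-bit payload.

```python
def classify_field(values, bit_width, multi_value_limit=8):
    """Assign an observed field to constant / counter / multi-value / sensor.

    Assumptions: `values` are the field's integer values in arrival order,
    and a counter increments by one per frame, wrapping at 2**bit_width.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    distinct = set(values)
    if len(distinct) == 1:
        return "constant"
    modulus = 1 << bit_width
    if all((b - a) % modulus == 1 for a, b in zip(values, values[1:])):
        return "counter"
    if len(distinct) <= multi_value_limit:
        return "multi-value"
    return "sensor"
```

Once a field has a type, type-specific rules can be checked, for example that a constant never changes or that a counter never skips.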
Choi et al. [27] proposed VoltageIDS, an IDS for automotive networks that utilizes the physical properties of the electrical signals used to transmit CAN messages. The system operates in three phases: feature extraction, feature selection, and intrusion detection. During the feature extraction phase, VoltageIDS extracts 60 features from normal CAN message signals, which are filtered in the feature selection phase to retain the most relevant features. In the intrusion detection phase, a multi-class classifier, such as a support vector machine, is constructed using attack-free CAN data to predict whether a message is normal or an intrusion.
The authors of [28] proposed LSTM neural networks for anomaly detection in CAN bus messages in vehicles. Two pattern features, data and time intervals, were used for classification. The multi-dimensional LSTM framework combined these features and included a prediction and detection process. The proposed mobile edge-assisted multi-task LSTM reduced computation time and cost by enabling parallel computation on multiple servers.
Zhang et al. [29] developed an in-vehicle network intrusion detection approach using a Binarized Neural Network (BNN) and Field-Programmable Gate Arrays (FPGAs). The BNN uses binary values to accelerate intrusion detection, reducing memory usage and energy consumption. FPGAs improve performance by allowing for concurrent task processing. The IDS was three times faster than traditional IDSs, and 128 times faster after FPGA acceleration.
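The speed-up of binarized networks comes largely from replacing multiply-accumulate operations with bitwise ones. The sketch below shows the standard XNOR-popcount identity for the dot product of two vectors with entries in {-1, +1}; it illustrates the general technique, not the specific network or FPGA design of [29], and the helper names are hypothetical.

```python
def pack_signs(values):
    """Pack a list of +1/-1 values into an int (bit 1 for +1, bit 0 for -1)."""
    word = 0
    for v in values:
        if v not in (1, -1):
            raise ValueError("values must be +1 or -1")
        word = (word << 1) | (1 if v == 1 else 0)
    return word


def binary_dot(a_word, b_word, n):
    """Dot product of two packed {-1,+1} vectors of length n via XNOR and popcount."""
    mask = (1 << n) - 1
    agree = bin(~(a_word ^ b_word) & mask).count("1")  # positions where signs match
    return 2 * agree - n  # matches contribute +1, mismatches -1


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
result = binary_dot(pack_signs(a), pack_signs(b), len(a))
```

On hardware, the XNOR and popcount map to a few logic gates per bit, which is what makes this form attractive for FPGA implementations.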
A combination of CNN, LSTM, and attention mechanism was developed to build a robust model for CAN IDS [30]. In particular, the proposed approach uses a CNN to extract features from raw data, which are then fed into an LSTM to capture temporal dependencies. An attention mechanism is then applied to weight the contributions of different features to the anomaly detection process. The authors showed the effectiveness of their approach through a comprehensive evaluation using a publicly available dataset. The results demonstrate that the proposed approach outperforms several baseline methods in terms of accuracy, precision, recall, and F1 score.
Al-Jarrah et al. [31] implemented a multi-model approach to classify attacks, utilizing both LSTM and ConvLSTM. While the former was used to process tabular data, the latter employed a recursion graph. The integration of the two models resulted in a 2% increase in accuracy, with an overall accuracy of 95.1%. It should be noted that this approach demands higher computational power and incorporates the arrival time of the data, which may not contribute significantly to deep learning models.
3.2. Federated Learning for In-Vehicle Intrusion Detection
Applying machine learning and deep learning techniques to design a robust in-vehicle IDS is an active research topic. Yet, only a few studies have attempted to apply federated learning in the field. To the best of our knowledge, there are two recent studies closely related to our proposed idea.
Hussain et al. [32] designed a ConvLSTM model trained in a federated learning manner at the vehicle level. The model was developed to solve supervised multiclass classification, but the question of how to label data at the vehicle level to train the proposed scheme was not addressed. Instead, they focused on improving the client selection strategy using a deep reinforcement learning approach.
A federated random forest model with blockchain technology was developed to tackle poisoning attacks [33]. Since the study aims to develop secure blockchain storage for the global model, the issue of data heterogeneity was not mentioned. In addition, the study considered federated learning at the vehicle level, whereas our scheme is at the car manufacturer level where data labels are available.
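For readers unfamiliar with federated learning, the server-side aggregation step is commonly a sample-weighted average of client parameters (FedAvg). The sketch below shows that generic rule on flat parameter lists; it is not claimed to be the aggregation used in [32], [33], or CANPerFL, and the function name is hypothetical.

```python
def fed_avg(client_weights, client_sizes):
    """Average client parameter vectors, weighting each by its local sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must share the same parameter shape")
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(dim)
    ]


# Two hypothetical clients holding 1 and 3 samples respectively.
global_weights = fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Only parameters leave each client, so the raw CAN traffic stays local. When client data distributions differ, a plain average of this kind can fit some clients poorly, which is the data heterogeneity issue noted above.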
3.4. Research Gaps
Many studies have shown the powerful capabilities of machine learning and deep learning in intrusion detection, especially for CAN bus data. However, the effectiveness of these models relies heavily on the availability of a significant amount of data. Additionally, because data distributions differ between car manufacturers and the data are confidential, building a universal model across manufacturers is challenging. Although transfer learning can be applied, choosing a specific car manufacturer as the source introduces a bias problem. To address these challenges, we propose a novel deep learning-based IDS called CANPerFL. This approach leverages federated learning to take advantage of global features obtained from different car manufacturers while preserving the privacy of the data. The final global features can be used to produce robust local models for participants, even with limited data. By doing so, CANPerFL can enhance the performance of intrusion detection in vehicles while addressing the challenges of data distribution discrepancy and confidentiality.