Article

Distributed Data Privacy Protection via Collaborative Anomaly Detection

Research Institute of State Grid Jiangsu Electric Power Co., Ltd., Nanjing 211100, China
*
Author to whom correspondence should be addressed.
Electronics 2025, 14(2), 295; https://doi.org/10.3390/electronics14020295
Submission received: 1 December 2024 / Revised: 30 December 2024 / Accepted: 9 January 2025 / Published: 13 January 2025

Abstract

Current anomaly detection methods for charging stations primarily rely on centralized network architectures with federated learning frameworks. However, the rapid increase in the number of charging stations and the expanding scale of these networks impose significant communication traffic loads. Consequently, it is essential to explore the relationship between data aggregation among charging stations and the anomaly detection accuracy at individual stations. In this paper, we address efficient anomaly detection in charging stations and propose a distributed anomaly detection algorithm powered by federated learning. To be specific, we introduce a distributed privacy-preserving data aggregation scheme, where a Transformer model is adopted to effectively smooth abnormal data fluctuations, minimizing disruptions to network aggregation nodes. Furthermore, we develop a distributed federated learning framework incorporating an efficient parameter update method without requiring prior knowledge or a central node. Compared with some existing detection solutions, the proposed approach significantly reduces communication bandwidth requirements while maintaining anomaly detection accuracy and mitigating data isolation issues. Extensive experiments demonstrate that the proposed algorithm not only achieves high accuracy in detecting anomalies in electric vehicle charging stations but also ensures robust user data privacy protection.

1. Introduction

New energy electric vehicles (EVs) are being actively promoted worldwide due to their advantages, including clean emissions and low energy consumption [1,2]. These vehicles have not only become a key component of national strategies but also serve as a new driver of economic growth. As a crucial infrastructure for EVs, the number of charging stations and the scale of their networks are expanding rapidly. Charging stations act as critical interaction points between power grid companies, charging station operators, and users, housing extensive data related to both users and the power grid [2,3]. However, anomalies in charging stations, such as network attacks, pose significant risks to the security of the power grid and associated data networks [4,5]. Consequently, ensuring the data security of charging stations and their networks has emerged as a prominent focus in current research.
Abnormality detection in charging stations typically relies on locally collected data pertaining to users and the power grid [6,7,8]. These data are processed by predictive modules deployed within individual charging stations to identify anomalies, such as irregular data flows [9,10,11]. However, charging stations are distributed across diverse regions; thus, each station is limited by the scale and characteristics of its local data. These limitations can result in sub-optimal local solutions. A more effective approach involves leveraging the network of charging stations to enable centralized data collection, facilitating real-time global monitoring of data and load variations across stations [12,13]. Centralized federated learning offers a potential solution, wherein a central node periodically aggregates user data or model parameters reported by individual charging stations. The central node then trains a unified model and distributes updated model parameters back to the stations, enabling real-time anomaly detection across the network [14].
Despite its advantages, the centralized training approach presents significant limitations [15,16]. First, the central node’s data load and computing capacity become bottlenecks for the entire network. If the central node is subjected to an attack, the entire network’s ability to perform data aggregation and related operations is compromised, leaving individual charging stations isolated as data islands [17,18]. Additionally, while centralized federated learning transmits only model parameters during data aggregation, it still poses a risk of inferring original user data, limiting its effectiveness in safeguarding privacy. These limitations underscore the practical importance of exploring distributed federated learning frameworks that do not rely on a central node. Compared to centralized algorithms, distributed approaches offer enhanced privacy protection for charging station data while effectively predicting and detecting anomalies, abnormal states, or potential attacks [19,20]. The primary challenge lies in designing a distributed algorithm that transcends deployment region constraints and hardware limitations, utilizes deep networks for local anomaly detection, and enables the exchange and aggregation of model parameters directly between stations [21]. The goal is to enhance anomaly detection accuracy while maintaining robust user privacy, which forms the core focus of this study.
To address efficient anomaly detection for charging stations, this paper introduces an efficient approach based on the Transformer and distributed federated learning. To be specific, we introduce a distributed anomaly detection algorithm based on the Transformer architecture, focusing on analyzing the impact of anomalies or attacks on the topological structure of the charging station network. Leveraging the strong generalization capabilities of the Transformer model, the proposed detection algorithm can be directly applied to various types of charging stations as well as the associated network. Furthermore, recognizing the substantial parameter size of Transformer models, we propose a data transmission optimization strategy tailored for local nodes in distributed networks. This strategy, built on the anomaly detection model, seeks to optimize both the parameters exchanged between nodes and their transmission methods. By doing so, it alleviates the bandwidth pressure on communication networks associated with distributed anomaly detection algorithms, while preserving detection accuracy.
The structure of this paper is organized as follows. In Section 2, we provide an overview of typical federated learning algorithms. The research methodology is presented in Section 3. We propose a collaborative data privacy protection algorithm for anomaly detection, including a comprehensive analysis of the topological structure of electric vehicle charging station networks under anomalies or attacks in Section 4. Based on this analysis, a distributed anomaly detection algorithm for charging stations is proposed. Section 5 elaborates on the optimization strategy for inter-node transmission. Section 6 presents experimental results, evaluating the performance of the proposed distributed anomaly detection algorithm under various network structures and examining the training loss of charging stations. The paper is concluded in Section 7.

2. Related Work

2.1. Anomaly Detection and Federated Learning

Charging stations are exposed to a variety of attacks. The most common in charging station networks are denial-of-service (DoS) and distributed denial-of-service (DDoS) attacks [22,23], in which attackers flood the charging station's network with excessive requests, overwhelming the system and rendering it unavailable to legitimate users. Man-in-the-middle (MitM) attacks intercept and potentially alter communication between the charging station and its backend system or users, enabling data theft, payment manipulation, or service disruption. Unauthorized access and privilege escalation form another attack path: adversaries exploit vulnerabilities in the charging station's software or network to gain unauthorized access and elevate privileges, allowing them to control the system or install malicious software [23,24]. These attacks highlight the importance of implementing robust security measures, such as encryption, secure authentication, anomaly detection systems, and regular firmware updates, to safeguard charging stations against potential threats. It is worth noting that the algorithm proposed in this paper detects general anomalies in charging stations: exploiting an efficient approach based on federated learning (FL), it detects anomalies indicated by charging station data.
FL, first introduced by Google, is an efficient distributed machine learning framework that addresses the limitations of traditional cloud-centric models [25]. At its core, FL enables multiple devices to collaboratively train a shared machine learning model by exchanging model parameters rather than raw user data. This approach enhances data privacy, reduces latency, alleviates bandwidth consumption, and minimizes energy costs. Compared to cloud-centric machine learning, FL has proven to be more suitable for wireless edge networks [25,26]. It allows edge devices to collaboratively learn a common model in parallel, while ensuring that raw data remain stored locally. For instance, FL was initially applied in drone networks. However, early FL schemes were centralized, relying on a central entity for the continuous aggregation of machine learning models. This reliance introduced the risk of single-point failures, making centralized approaches unsuitable for dynamic networks, such as drone systems, which often experience unreliable nodes and links. Specifically, if the central entity, such as a drone, depletes its battery or loses its wireless connection with other drones, the entire federated training process would be interrupted [27].
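The parameter-exchange idea at the heart of FL can be sketched in a few lines. The snippet below is an illustrative, FedAvg-style weighted average of client parameter vectors; the function name and the flat-vector model representation are our own simplification, not the API of any particular FL framework:

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of client model parameters (FedAvg-style sketch).

    client_weights: list of 1-D parameter vectors, one per client.
    client_sizes:   number of local samples each client trained on.
    """
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)   # shape: (clients, params)
    coeffs = sizes / sizes.sum()         # data-proportional mixing weights
    return coeffs @ stacked              # aggregated global parameters
```

Only these parameter vectors cross the network; the raw training data that produced them never leave the client, which is what distinguishes FL from cloud-centric training.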
It is important to note that some research has explored decentralized machine learning; however, part of this work falls under traditional distributed machine learning rather than federated learning [28,29]. A novel decentralized federated learning architecture has been proposed for drone networks to address the limitations of conventional federated learning approaches [30,31]. Unlike traditional server-based federated learning models, this architecture adheres to the core principles of federated learning, such as each network node training its local model on its own data, but eliminates the need for a central entity to aggregate and combine the global model. Instead, each drone exchanges its local model only with its one-hop neighboring drones and aggregates the received models locally [32]. Building on this concept, several extended federated learning models have been proposed, including collaborative federated learning and multi-hop federated learning, all of which are designed to better accommodate the characteristics of unstable network connections.

2.2. Framework of Distributed Federated Learning

  • Collaborative federated learning: Chen et al. introduced the concept of collaborative federated learning, wherein user equipment (UE) devices that are located far from the cloud or base station, or even disconnected from these entities, can still participate in federated learning through device-to-device (D2D) communication [33,34]. In this framework, devices transmit their local models to nearby neighbors associated with a base station. Within collaborative federated learning, each UE aggregates the local models received from its neighbors and subsequently sends the aggregated model to the base station. Google's pioneering federated learning approach can be viewed as a special case of collaborative federated learning: when all UEs are directly connected to the base station, the two approaches become equivalent. The primary advantage of collaborative federated learning is its ability to incorporate a larger number of UEs and data sources, leading to improved training performance.
  • Multi-hop federated learning: Similar to collaborative FL, which enhances the training process by incorporating a greater number of UE devices, multi-hop federated learning has been proposed in [27] to facilitate FL in wireless multi-hop networks. In this approach, the local models from UEs that are not directly connected to the parameter server node are transmitted according to a predefined routing strategy. The key difference between collaborative federated learning and multi-hop FL lies in the aggregation mechanism. In collaborative federated learning, each UE aggregates the local models from its neighboring devices before transmitting them, whereas in multi-hop federated learning, the models are forwarded directly, without any aggregation.
  • Fog computing learning: Taking into account the network topology of fog computing environments, fog computing learning, a method that strategically distributes model training across various nodes, ranging from edge UEs to cloud servers, is introduced in [35]. Similar to collaborative federated learning, fog computing learning leverages device-to-device communication to coordinate heterogeneous UEs at varying distances, thereby forming a multi-layer hybrid federated learning framework.
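The key distinction drawn above between collaborative and multi-hop federated learning, aggregate-then-forward versus forward-as-is, can be made concrete with a minimal sketch (the function names and flat-array model representation are hypothetical illustrations, not drawn from the cited works):

```python
import numpy as np

def collaborative_step(own_model, neighbor_models):
    # Collaborative FL: a UE averages its neighbors' models with its own
    # before passing a single aggregated model upstream.
    models = [own_model] + list(neighbor_models)
    return np.mean(models, axis=0)

def multihop_step(own_model, neighbor_models):
    # Multi-hop FL: a relay forwards the received models unchanged along
    # the route; aggregation happens only at the parameter server.
    return list(neighbor_models) + [own_model]
```

In the collaborative case each hop reduces traffic to one model at the cost of early mixing; in the multi-hop case the server sees every local model but relays carry proportionally more data.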

2.3. Comparison Between CFL and DFL

Centralized federated learning (CFL) and distributed federated learning (DFL) differ significantly in their approaches to privacy protection. In CFL, a central server aggregates model updates from participating clients, ensuring that raw data remain on local devices. However, this setup introduces a single point of vulnerability; if the central server is compromised, aggregated updates or reconstructed models could lead to privacy breaches. Techniques like differential privacy and secure aggregation are commonly employed to mitigate these risks, but the need to trust the central server remains a concern. On the other hand, DFL eliminates the reliance on a central entity by enabling peer-to-peer or hierarchical communication among clients. This decentralized structure reduces the risk of privacy leakage from a single point of failure and distributes trust across the network. Nevertheless, DFL introduces new challenges, such as safeguarding data integrity and confidentiality during direct exchanges between untrusted peers.
In terms of bandwidth efficiency, CFL and DFL exhibit distinct trade-offs. CFL relies on transmitting model updates between clients and a central server, which can create significant bandwidth usage, especially in large-scale networks with many participants. Optimizations such as model compression, sparsification, and quantization are often used to address these bandwidth challenges. In contrast, DFL leverages localized communication, where updates are shared directly between neighboring devices or through hierarchical structures. This can reduce global bandwidth consumption but may introduce redundant message exchanges and additional communication overhead in peer-to-peer settings. While CFL is generally more bandwidth-efficient in small-scale systems due to its centralized aggregation, DFL demonstrates better scalability and efficiency in geographically distributed networks, where localized interactions minimize long-range communication costs. Ultimately, the choice between CFL and DFL depends on the specific privacy and communication requirements of the application. Table 1 summarizes the differences between CFL and DFL.
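As a rough illustration of this bandwidth trade-off, consider the following per-round traffic model; it is a deliberate simplification that ignores compression, acknowledgements, and routing overhead, and the assumed message pattern (one upload plus one download per client in CFL, one model per neighbor link direction in DFL) is our own:

```python
def cfl_traffic_per_round(n_clients, model_bytes):
    # CFL: each client uploads its update and downloads the new global model,
    # and all of this traffic converges on the central server.
    return n_clients * 2 * model_bytes

def dfl_traffic_per_round(n_clients, avg_degree, model_bytes):
    # DFL: each client sends its model to every one-hop neighbor; traffic is
    # larger in total for dense graphs, but stays on short local links.
    return n_clients * avg_degree * model_bytes
```

For 100 clients and degree 3, DFL moves more bytes in total than CFL, yet none of it crosses a single bottleneck link, which is why DFL tends to scale better in geographically distributed networks.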

2.4. Advantage of Transformer

Regarding the attention mechanism, the Transformer model uses the self-attention mechanism, which allows it to focus on different parts of the input sequence while processing each token [36]. This mechanism gives the model the ability to capture long-range dependencies effectively, which RNNs struggle with due to their sequential nature and vanishing gradient issues. CNNs, while good at extracting local patterns, are not designed to capture global relationships in sequences as effectively as Transformers. On the other hand, RNNs process sequences step-by-step, which makes them inherently sequential and slow to train, especially on long sequences. Transformers process all input tokens simultaneously (in parallel), thanks to their attention mechanism and position encoding. This allows for much faster training and inference compared to RNNs. The fully connected attention mechanism of Transformers scales better with modern hardware (e.g., GPUs/TPUs) [37]. Furthermore, RNNs are harder to parallelize, and CNNs require deep networks or large kernels to model long-term dependencies, which increases computational complexity.
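The parallelism described above follows from the fact that self-attention is a handful of matrix products over the whole sequence at once. The sketch below is a single-head, scaled dot-product attention in NumPy (projection matrices passed in explicitly; positional encoding and multi-head logic omitted for brevity):

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence.

    x:             (seq_len, d_model) input token embeddings.
    w_q, w_k, w_v: projection matrices, each (d_model, d_head).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])          # all pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v   # every position attends to every other, in parallel
```

Because `scores` covers all token pairs in one matrix product, dependencies between distant positions cost no more than adjacent ones, in contrast to the step-by-step recurrence of an RNN.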

3. Research Methodology

In this paper, we consider charging station anomaly detection. In this context, distributed federated learning is exploited, which leverages data from geographically dispersed charging stations to collaboratively train anomaly detection models without compromising data privacy. Each charging station operates as a local node, training a model on its own operational and usage data to identify patterns indicative of potential anomalies, such as unusual charging behaviors, equipment malfunctions, or cybersecurity threats. Instead of sending raw data to a centralized server, the local nodes share only model updates or gradients with neighboring stations or a decentralized aggregator using peer-to-peer or hierarchical communication protocols. Aggregation techniques, such as decentralized averaging or consensus-based methods, are employed to combine updates and improve the global anomaly detection model iteratively. This methodology ensures data confidentiality, reduces bandwidth requirements, and enhances the scalability of the detection system across a large network of charging stations. Challenges specific to this application include addressing heterogeneous data distributions due to varying usage patterns and environmental conditions, ensuring robust detection accuracy, and maintaining communication efficiency across the distributed network.
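The decentralized averaging mentioned above can be written as one mixing step over the station graph. The following sketch assumes a symmetric 0/1 adjacency matrix and a simple uniform neighbor average (our own illustrative choice; consensus-weight schemes such as Metropolis weights would also fit):

```python
import numpy as np

def decentralized_average(params, adjacency):
    """One round of neighbor averaging (decentralized aggregation).

    params:    (n_nodes, n_params) matrix of local model parameters.
    adjacency: (n_nodes, n_nodes) symmetric 0/1 matrix, no self-loops.
    """
    n = len(params)
    mix = adjacency + np.eye(n)                  # each node keeps its own model
    mix = mix / mix.sum(axis=1, keepdims=True)   # row-stochastic mixing matrix
    return mix @ params                          # average with one-hop neighbors
```

On a connected graph, repeating this round drives every station toward the same consensus model without any station ever seeing another station's raw data.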

4. Data Protection with Collaborative Anomaly Detection

In the collaboration between anomaly detection and privacy protection, charging stations can utilize locally collected data to train models that identify faults or irregularities, such as power fluctuations, charging failures, or network attacks. At the same time, the results of these distributed local model updates are shared across charging stations through federated learning. This enables different charging stations to access model data beyond their own, thereby improving the accuracy of local models. Such a process fosters collaborative learning between stations without the direct transmission of user data, thereby reducing the risk of data leakage. This section introduces the framework of the charging station anomaly detection algorithm based on distributed federated learning. First, we analyze how the network topology of the charging station system changes during anomalies, and this topology analysis forms the basis for the design of the anomaly detection algorithm. Subsequently, we propose the framework of the anomaly detection algorithm for the EV charging station.

4.1. Distributed Network Topology

In large-scale networks, we use the percolation threshold $p_c$ to represent the minimum connectivity level required to maintain the overall functionality of the network. The percolation threshold can be expressed as
$$p_c = 1 - \frac{1}{\bar{s}/\tilde{s} - 1}, \qquad (1)$$
where $\tilde{s}$ and $\bar{s}$ represent the first-order and second-order moments, respectively, of the node (i.e., charging station) degree distribution in the network. When the fraction of functioning nodes falls below this threshold, the network splits into smaller, disconnected parts. When it exceeds this threshold, the entire charging pile network forms a strongly connected component, and the corresponding network connectivity improves. The connectivity of the nodes in the network determines the percolation threshold. If nodes in the network are removed at random, the corresponding percolation threshold is denoted $p_c^r$. If the most strongly connected nodes are removed, the corresponding threshold is $p_c^t$. This criterion emphasizes the robustness of the entire network topology while overlooking the potential dynamic characteristics of the network after node removal and their impact on the topology.
Therefore, by considering both the random removal and the targeted removal of nodes, we analyze the topology of a network consisting of $N$ nodes (i.e., charging piles) under both random failures and targeted network attacks. Specifically, the topology analysis in this section keeps the number of links constant and seeks the charging pile network topology that maximizes the overall percolation threshold, which is given by
$$p_c^{\mathrm{all}} = p_c^r + p_c^t. \qquad (2)$$
The bimodal degree distribution is one of the most effective ways to achieve this goal. Its expression is
$$P(k) = (1-r)\,\delta(k - k_{\min}) + r\,\delta(k - k_{\max}). \qquad (3)$$
In a bimodal distribution, a fraction $r$ of the nodes has degree $k_{\max}$, while the remaining fraction $1-r$ has degree $k_{\min}$. The moments of the bimodal distribution are
$$\tilde{s} = (1-r)\,k_{\min} + r\,k_{\max}, \qquad (4)$$
and
$$\bar{s} = \frac{(\tilde{s} - r k_{\max})^2}{1-r} + r\,k_{\max}^2. \qquad (5)$$
Then, we have the percolation threshold under random removal as
$$p_c^r = \frac{\bar{s} - \dfrac{2 r \tilde{s} k_{\max}^2}{(1-r)\tilde{s} + r k_{\max}^2}}{\bar{s} - \dfrac{2 r \tilde{s} k_{\max}}{(1-r)\tilde{s} + r k_{\max}^2}}. \qquad (6)$$
When calculating the threshold for targeted attacks, two types of nodes need to be considered. One type has degree $k_{\max}$ (i.e., central nodes) and makes up a fraction $r$ of the network; the other has degree $k_{\min}$ and makes up the remaining fraction $1-r$. The removal of central nodes then leads to two possible cases: in case A, all central nodes are removed; in case B, only a portion of them is removed.
Case A: When $p_c^r > r$, all central nodes are removed, and after the targeted attack only the nodes with degree $k_{\min}$ remain. Therefore, we have
$$p_c^t = 1 - \frac{r\,(\tilde{s} - r k_{\max})}{\tilde{s}\,\bar{s} - r k_{\max}^2} \cdot \frac{(1-r)\bar{s} - r k_{\max}}{(1-r)k_{\max} + r}. \qquad (7)$$
Case B: When $p_c^r < r$, the removed nodes are all high-degree nodes, which leads to the retention of some $k_{\max}$ nodes. Therefore, we obtain the following expression:
$$p_c^t = \frac{\bar{s}^2 - 2 r \bar{s} k_{\max} + r k_{\max}^2}{2(1-r)\,\bar{s}\,k_{\max}\,(k_{\max}-1)(1-r)}. \qquad (8)$$
By substituting (6), (7), and (8) into (2), we can now compute the overall threshold. Determining the value of $k_{\max}$ that maximizes $p_c^{\mathrm{all}}$ yields the optimal $k_{\max}$ as a function of $r$. For small values of $r$, $k_{\max}$ is given by
$$k_{\max} \approx A\,r^{-2/3}. \qquad (9)$$
Therefore, as $r$ approaches zero, $k_{\max}$ approaches its theoretical maximum. In a charging station network consisting of $N$ charging stations, $p_c^{\mathrm{all}}$ reaches its maximum value when $r = 1/N$. This value represents the minimum fraction that leaves at least one central node with degree $k_{\max}$. Thus, we can rewrite (9) as
$$k_{\max} = A\,r^{-2/3} = A\,N^{2/3}. \qquad (10)$$
In conclusion, in a distributed network with a predetermined average node degree, the most effective topology for countering both random failures and targeted attacks should have at least one node with degree k max , while all other nodes maintain a degree of k min . In Section 6, we will use realistic data to validate the relationship between the network’s degree distribution and its robustness.
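As a small numeric check of the quantities in (1), (4), and (5), the threshold of a bimodal network can be computed directly from its degree moments. In the sketch below the second moment is written in the direct form $(1-r)k_{\min}^2 + r k_{\max}^2$, which is algebraically identical to (5); the function name is our own:

```python
def bimodal_threshold(r, k_min, k_max):
    """Percolation threshold of a bimodal network from its degree moments.

    s1 is the first moment (mean degree) and s2 the second moment,
    matching the tilde- and bar-quantities in the text.
    """
    s1 = (1 - r) * k_min + r * k_max        # first moment
    s2 = (1 - r) * k_min**2 + r * k_max**2  # second moment
    kappa = s2 / s1                         # degree heterogeneity ratio
    return 1 - 1 / (kappa - 1)              # Eq. (1)
```

Adding even a small fraction of high-degree hubs inflates the second moment, raising the threshold against random failures, which is exactly the effect the bimodal topology exploits.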

4.2. Distributed Anomaly Detection Collaborative Algorithm

In Figure 1, different charging stations deploy deep networks locally, such as Transformer. Then, based on local training of these deep network models, model data (i.e., model parameters) are transferred according to a distributed federated learning framework. This involves transmitting the deep network model data among different charging stations. Each charging station updates its local model parameters upon receiving model data from other charging stations. The specific steps are as shown in Algorithm 1.
Algorithm 1 Distributed Anomaly Detection Collaborative Algorithm
1. All functioning charging stations generate the initial Transformer based on global model parameters.
2. Each charging station system is trained on locally collected user data by minimizing reconstruction errors and continuously updates the model parameters.
3. Each charging station system sends the updated local model parameters to adjacent charging stations.
4. Each charging station performs model parameter aggregation on the received model parameters and updates its local model parameters.
5. Each charging station uses the updated model to perform anomaly detection on the collected user data.
6. Repeat Step 1 to Step 5.
We provide a detailed analysis of each step in Algorithm 1. In Step 1, the charging stations, located in different areas, initialize the deep model. Then, in Step 2, each charging station trains on locally collected data from EV users, starting from the initial parameters of the local model. Since the user data are stored locally and not transmitted over the network, user data privacy is protected. The user data include the charging start time, charging end time, electric vehicle battery capacity, current battery level, charging voltage peak, charging current fluctuations, and the time when the charging plug was unplugged. In Step 3, after each charging station completes local training and the Transformer model parameters converge, the trained model parameters are sent to the connected charging stations. In Step 4, each charging station aggregates the model parameters received from other charging stations (the parameter exchange strategy is analyzed in Section 5), thereby updating the local anomaly detection model. This distributed training and parameter aggregation process usually cycles through multiple rounds until the model at each charging station reaches the desired accuracy or convergence criteria. Additionally, the algorithm continuously collects new anomaly data from charging stations, retrains the local model, and sends it to other stations. This ensures that the overall network of charging stations maintains accuracy in detecting different anomalies.
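One round of Steps 2 through 5 at a single station can be sketched as follows. The model, gradient, and reconstruction-error functions are passed in abstractly because the text leaves the Transformer internals unspecified; the single gradient step, uniform neighbor averaging, and fixed threshold tau are our own simplifications:

```python
import numpy as np

def detection_round(local_params, neighbor_params, grad_fn, lr, x, recon_fn, tau):
    """One round of Algorithm 1 at a single charging station (sketch).

    grad_fn(params, x):  gradient of the reconstruction loss on local data x.
    recon_fn(params, x): per-sample reconstruction errors under `params`.
    tau:                 anomaly threshold on the reconstruction error.
    """
    # Step 2: local training on private data (data never leave the station).
    params = local_params - lr * grad_fn(local_params, x)
    # Steps 3-4: aggregate with parameters received from adjacent stations.
    params = np.mean([params] + list(neighbor_params), axis=0)
    # Step 5: flag samples whose reconstruction error exceeds the threshold.
    flags = recon_fn(params, x) > tau
    return params, flags
```

Only `params` is exchanged between stations; the local samples `x` and the resulting anomaly flags stay on the station, which is the privacy boundary the algorithm maintains.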

5. Parameter Exchange Strategy for Distributed Privacy Protection

Considering the rapid expansion of charging station networks and the large scale of Transformer model parameters, which can reach thousands or even tens of thousands within each charging station, directly transmitting these model parameters to neighboring stations would impose a substantial bandwidth burden on the network. To address this challenge, this section proposes a three-phase approach comprising network pruning, training quantization, and weight sharing, aimed at alleviating the network's bandwidth constraints.

5.1. Network Pruning

Through pruning, we can remove connections with weights below a certain threshold to reduce the number of connections. After pruning the network, we retrain it to maintain accuracy. Network pruning has been widely studied as a method to compress convolutional neural network (CNN) models. Early studies have demonstrated that network pruning is an effective technique for reducing network complexity and mitigating overfitting. More recently, the literature [32] has shown that pruning can be applied to current mainstream CNN models without sacrificing accuracy. Based on this approach, we first perform normal network training to achieve connectivity learning. Then, we prune the connections with small weights, i.e., we remove all connections whose weights fall below a specified threshold. Finally, we retrain the network to learn the weights that remain in the sparse connections. This process ensures that the pruned model retains high accuracy while reducing the computational burden and the number of parameters, making it more efficient for transmission between charging stations.
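The prune-then-retrain step described above reduces, at its core, to masking small-magnitude weights and keeping them at zero during retraining. A minimal sketch (the function name and the returned mask convention are our own):

```python
import numpy as np

def prune_by_magnitude(weights, threshold):
    """Zero out connections whose |weight| falls below the threshold.

    Returns the pruned weights and the binary mask used to keep the
    removed connections at zero while retraining the survivors.
    """
    mask = np.abs(weights) >= threshold
    return weights * mask, mask
```

During retraining, gradient updates are applied only where the mask is true, so the sparsity pattern learned in the pruning phase is preserved.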
Specifically, we utilize compressed sparse row (CSR) or compressed sparse column (CSC) formats to store the sparse structure of the pruned network. This requires 2 a + n + 1 values, where a denotes the number of non-zero elements and n represents the number of rows or columns. To achieve further compression, we store the differences between indices and employ eight-bit encoding to represent index differences in the convolutional layers, and five-bit encoding in the fully connected layers. In cases where the index differences exceed a predefined threshold, we implement a zero-padding strategy: if the difference exceeds eight (requiring a three-bit unsigned number), we append a zero to pad the index difference. This technique significantly reduces the overall storage requirements while maintaining an efficient representation of the pruned network’s structure, thereby minimizing the communication cost when transmitting model parameters across charging stations.
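The index-difference scheme with zero-padding can be sketched as follows. The fixed-width field is parameterized as `span` rather than hard-coding the 8-bit/5-bit choices above, and the function name is hypothetical:

```python
def delta_encode(indices, values, span=8):
    """Encode sparse (index, value) pairs as index differences.

    A gap larger than `span` cannot fit in the fixed-width field, so a
    padding entry (advance by `span`, value 0.0) is inserted to bridge it.
    """
    out, prev = [], 0
    for idx, val in zip(indices, values):
        gap = idx - prev
        while gap > span:
            out.append((span, 0.0))  # zero-padding entry bridges the long gap
            gap -= span
        out.append((gap, val))
        prev = idx
    return out
```

Decoding simply accumulates the gaps to recover absolute positions and drops the padding zeros, so the pruned layer can be reconstructed exactly on the receiving charging station.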

5.2. Training Quantization

Training quantization and weight sharing further compress the size of the pruned network by reducing the number of bits required to represent each weight. By allowing multiple connections to share the same weight, we can significantly reduce the number of effective weights that need to be stored, and then fine-tune these shared weights. Overall, training quantization and weight sharing reduce the bit representation required for each weight and enable weight sharing between connections. Weight sharing is achieved through k-means clustering, where similar weights are grouped into clusters and represented by a single shared value. Quantization significantly reduces storage requirements by approximating the weights with fewer bits, while still preserving the accuracy of the model. This approach allows us to compress the model more efficiently without sacrificing performance, which is particularly important for reducing bandwidth usage when transmitting model parameters between charging stations. We use the k-means clustering algorithm to determine the shared weights for each layer of the trained network, so that all the weights falling into the same cluster share the same value. We do not share weights between different layers. The original set of n weights, $W = \{w_1, w_2, \ldots, w_n\}$, is divided into k clusters, $C = \{c_1, c_2, \ldots, c_k\}$, where $n \gg k$. The goal is to minimize the sum of squared distances within each cluster (also known as the intra-cluster sum of squares). This is carried out by iteratively assigning each weight to the nearest cluster center and then updating the cluster centers to be the average of the weights assigned to them.
The k-means algorithm helps reduce the number of distinct weights by grouping similar weights into clusters, allowing them to be shared among multiple connections. This reduces the overall storage requirements of the network and further compresses the model, without significantly affecting its performance, as the weights within each cluster are represented by a single shared value. In summary, the k-means clustering approach for weight sharing helps significantly reduce the number of weights stored while maintaining model accuracy, which is crucial for efficient transmission of model parameters in the distributed setting of charging stations.
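The assign-then-update loop described above can be sketched for one layer's weights as follows. This is a plain 1-D k-means with random initialization (the seed, iteration count, and function name are our own choices; a production implementation would use a library clusterer and a codebook of indices):

```python
import numpy as np

def share_weights(weights, k, iters=20, seed=0):
    """Cluster a layer's weights with 1-D k-means and share centroid values.

    After clustering, only the k centroid values plus a small per-weight
    cluster index need to be stored or transmitted.
    """
    rng = np.random.default_rng(seed)
    centers = rng.choice(weights, size=k, replace=False)
    for _ in range(iters):
        # Assignment step: nearest centroid for every weight.
        labels = np.argmin(np.abs(weights[:, None] - centers[None, :]), axis=1)
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = weights[labels == j].mean()
    return centers[labels], labels
```

Every weight in a cluster is replaced by its centroid, so the transmitted representation shrinks from n floats to k floats plus n small indices.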

6. Experimental Results

6.1. Experimental Configuration and Dataset

To verify the accuracy of the electric vehicle anomaly and attack detection algorithm proposed in this paper, we examine two types of attacks: random failure and targeted attack. Random failure simulates a scenario where nodes are randomly selected for attack, with other nodes operating normally. Targeted attacks prioritize attacking nodes with higher degrees. The basic hardware specifications for running the algorithm are the following: Intel i9-14900KF, RTX 4090, and 64 GB RAM. Each charging station uses a Transformer with the same architecture for training and performs data aggregation under the federated learning framework. Relevant parameters are shown in Table 2.
The data used in this study were collected from charging stations deployed at 150 different locations by power operators. In the dataset, 13,016 charging stations provide data samples for the training set, while 14,812 charging stations provide data samples for the test set. Each sample is represented as a 24-dimensional vector. The test set includes 448,172 normal measurement data samples and 92,642 anomaly data samples. The features of the dataset are summarized in Table 3.

6.2. Experimental Results Analysis for Multiple Charging Stations

In the random network, we randomly combine individual charging stations and set the network's average degree to 3, 5, or 7; that is, each node has on average three, five, or seven neighboring nodes. The generated network topologies and their robustness are shown in Figure 2, Figure 3 and Figure 4. Here, f represents the percentage of failed charging stations, and the vertical axis indicates the proportion of the remaining network size.
From Figure 2, Figure 3 and Figure 4, it can be seen that high-degree random networks exhibit better robustness against both random failures and targeted attacks than lower-degree networks. For a given network degree, targeted attacks cause more damage than random failures, i.e., fewer nodes survive a targeted attack. This can be understood as follows: when the high-degree nodes of a random network are attacked, the remaining network proportion is smaller, and as a result, the charging stations cannot accurately detect anomalies or attacks.
In scale-free networks, the large variance in node degrees makes the average degree less informative; because many real-world networks exhibit such heterogeneous degree distributions, we consider the scale-free model more realistic. As shown in Figure 5, Figure 6 and Figure 7, this network structure is slightly more robust to random failures, but the improvement is limited.
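The two attack models can be sketched in a few lines. This is a hypothetical simulation for illustration, assuming an Erdős–Rényi random graph as a stand-in for the paper's generated topologies; the function names and parameters (`random_graph`, `attack`, n = 500) are our own choices, not the authors' code.

```python
import random
from collections import deque

def random_graph(n, avg_degree, seed=0):
    """Erdős-Rényi graph with the given expected average degree."""
    rng = random.Random(seed)
    p = avg_degree / (n - 1)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def giant_fraction(adj, removed):
    """Largest connected component after node removal, as a fraction
    of the original network size (the quantity on the vertical axis)."""
    alive = set(adj) - removed
    seen, best = set(), 0
    for s in alive:
        if s in seen:
            continue
        comp, q = 0, deque([s])
        seen.add(s)
        while q:  # breadth-first search over surviving nodes
            u = q.popleft()
            comp += 1
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    q.append(v)
        best = max(best, comp)
    return best / len(adj)

def attack(adj, f, targeted, seed=0):
    """Remove a fraction f of nodes: uniformly at random (random
    failure) or highest-degree first (targeted attack)."""
    k = int(f * len(adj))
    if targeted:
        order = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
        removed = set(order[:k])
    else:
        removed = set(random.Random(seed).sample(sorted(adj), k))
    return giant_fraction(adj, removed)

g = random_graph(500, avg_degree=5)
rand_left = attack(g, f=0.3, targeted=False)
targ_left = attack(g, f=0.3, targeted=True)
# Targeted attacks typically leave a smaller surviving network.
```

Repeating this over f from 0 to 1 reproduces curves of the kind shown in Figures 2-7.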

6.3. Performance Evaluation of Transformer

We train the Transformer using local data from different charging stations. As shown in Table 4, the Transformer achieves the highest scores on all metrics compared to ConvGRU and GRU-Attention. This is because the Transformer extracts temporal correlations from the charging-station data, efficiently mining the relationships between different users at different times while reducing the impact of temporal fluctuations on anomaly detection accuracy. These experiments validate the feasibility of deploying the Transformer inside charging stations.
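For reference, the three metrics reported in Table 4 can be computed as below. This is a generic sketch on toy data (the labels and scores are invented for illustration), not the paper's evaluation code; AUC is computed via its Mann-Whitney formulation.

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    # Harmonic mean of precision and recall on the anomaly class (1).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)

def auc(y_true, scores):
    # Probability that a random anomaly scores higher than a random
    # normal sample, with ties counted as 1/2 (Mann-Whitney form).
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: labels (1 = anomaly) and detector scores.
y = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.7, 0.4, 0.1, 0.8]
preds = [1 if s >= 0.5 else 0 for s in scores]
```

The very low F1 of GRU-Attention in Table 4 alongside a reasonable accuracy illustrates why F1 matters here: with anomalies forming a minority class, a detector can score well on accuracy while finding almost no true anomalies.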

7. Conclusions

This paper proposes a distributed privacy protection method for collaborative anomaly detection. Compared to typical published anomaly detection solutions, the proposed approach efficiently detects anomalies in charging stations without prior knowledge. Each charging station employs a Transformer to detect anomalies locally; the Transformer's model parameters are then transmitted to neighboring nodes for aggregation and update, without any transfer of user data. Experimental results show that the proposed distributed algorithm avoids the degradation of detection accuracy across all charging stations caused by central-node failure in centralized federated learning. In addition, it reduces hardware requirements, making it suitable for deployment scenarios without infrastructure.
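The neighbor-based aggregation summarized above can be illustrated with a minimal decentralized averaging round. This sketch assumes simple uniform averaging over each station and its neighbors; the paper's actual update rule may differ, and the names `neighbor_average` and `topology` are ours.

```python
def neighbor_average(params, topology):
    """One round of decentralized aggregation: each station replaces
    its parameter vector with the average over itself and its
    neighbors. No central node and no raw user data are involved."""
    new_params = {}
    for station, neighbors in topology.items():
        group = [params[station]] + [params[v] for v in neighbors]
        dim = len(params[station])
        new_params[station] = [
            sum(vec[i] for vec in group) / len(group) for i in range(dim)
        ]
    return new_params

# Three stations in a line (0 - 1 - 2), each holding a 2-parameter model.
topology = {0: [1], 1: [0, 2], 2: [1]}
params = {0: [1.0, 0.0], 1: [2.0, 2.0], 2: [3.0, 4.0]}
params = neighbor_average(params, topology)
# After one round, station 1 holds the mean of all three models.
```

Repeated rounds drive all stations toward a common consensus model, which is why detection accuracy does not hinge on any single node surviving.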
In future work, we will investigate the relationship between the different attacks that cause charging station anomalies and the charging station data. Moreover, we will study additional approaches to protecting model parameters to further enhance data privacy.

Author Contributions

Conceptualization, F.Z. and X.Y.; methodology, S.L.; software, H.M.; validation, Y.P. and H.H.; formal analysis, Y.P.; writing—original draft preparation, F.Z. and M.W.; writing—review and editing, F.Z. and M.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the science and technology project of State Grid Jiangsu Electric Power Co., Ltd., grant number J2023015.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

All the authors (i.e., Fei Zeng, Mingshen Wang, Yi Pan, Shukang Lv, Huiyu Miao, Huachun Han, and Xiaodong Yuan) are employed by the company Research Institute of State Grid Jiangsu Electric Power Co., Ltd. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as potential conflicts of interest. The authors declare that this study received funding from the science and technology project of State Grid Jiangsu Electric Power Co., Ltd. (J2023015). The funder was not involved in the study design, collection, analysis, interpretation of data, the writing of the article, or the decision to submit it for publication.

References

  1. Mohammed, A.; Saif, O.; Abo-Adma, M.; Fahmy, A.; Elazab, R. Strategies and sustainability in fast charging station deployment for electric vehicles. Sci. Rep. 2024, 14, 283. [Google Scholar] [CrossRef] [PubMed]
  2. Algafri, M.; Alghazi, A.; Almoghathawi, Y.; Saleh, H.; Al-Shareef, K. Smart City Charging Station allocation for electric vehicles using analytic hierarchy process and multiobjective goal-programming. Appl. Energy 2024, 372, 123775. [Google Scholar] [CrossRef]
  3. Gao, H.; Zang, B.B. New power system operational state estimation with cluster of electric vehicles. J. Frankl. Inst. 2023, 360, 8918–8935. [Google Scholar] [CrossRef]
  4. Surya, S.; Bhuva, D.; Bhuva, A.; Chavan, S.S.; Basha, D.K.; Chattopadhyay, S. Implementation of Internet of Things (IoT) Framework for Governing Modern Cyber Attacks in Computer Network. In Proceedings of the 2023 IEEE International Conference on ICT in Business Industry & Government (ICTBIG), Indore, India, 8–9 December 2023; IEEE: Piscataway, NJ, USA, 2023. [Google Scholar]
  5. Gao, H.; Liu, Y.F.; Rong, L.N.; Xie, X.P. Robust Consensus with Edge-based Multiplicative Uncertainties via Recursive Channel Filters. IEEE Trans. Circuits Syst. II Express Briefs 2023, 70, 2550–2554. [Google Scholar] [CrossRef]
  6. Takeda, Y.; Suzuki, Y.; Fukamachi, K.; Yamada, Y.; Tanaka, K. Efficient Simulator for P2P Energy Trading: Customizable Bid Preferences for Trading Agents. Energies 2024, 17, 5945. [Google Scholar] [CrossRef]
  7. Mchirgui, N.; Quadar, N.; Kraiem, H.; Lakhssassi, A. The Applications and Challenges of Digital Twin Technology in Smart Grids: A Comprehensive Review. Appl. Sci. 2024, 14, 10933. [Google Scholar] [CrossRef]
  8. Kang, J.; Yu, R.; Huang, X.; Maharjan, S.; Zhang, Y.; Hossain, E. Enabling localized peer-to-peer electricity trading among plug-in hybrid electric vehicles using consortium blockchains. IEEE Trans. Ind. Inform. 2017, 13, 3154–3164. [Google Scholar] [CrossRef]
  9. Almuhaideb, A.M.; Algothami, S.S. Efficient privacy-preserving and secure authentication for electric-vehicle-to-electric-vehicle-charging system based on ECQV. J. Sens. Actuator Netw. 2022, 11, 28. [Google Scholar] [CrossRef]
  10. Dui, H.; Dong, X.; Chen, L.; Wang, Y. IoT-enabled fault prediction and maintenance for smart charging piles. IEEE Internet Things J. 2023, 10, 21061–21075. [Google Scholar] [CrossRef]
  11. Shinde, S.S.; Tarchi, D. Joint Air-Ground Distributed Federated Learning for Intelligent Transportation Systems. IEEE Trans. Intell. Transp. Syst. 2023, 24, 9996–10011. [Google Scholar] [CrossRef]
  12. Sun, P.; Bisschop, R.; Niu, H.; Huang, X. A review of battery fires in electric vehicles. Fire Technol. 2020, 56, 1361–1410. [Google Scholar] [CrossRef]
  13. Pandya, S.; Srivastava, G.; Jhaveri, R.; Babu, M.R.; Bhattacharya, S.; Maddikunta, P.K.R.; Mastorakis, S.; Piran, J.; Gadekallu, T.R. Federated learning for smart cities: A comprehensive survey. Sustain. Energy Technol. Assess. 2023, 55, 102987. [Google Scholar] [CrossRef]
  14. Liu, J.; Huang, J.; Zhou, Y.; Li, X.; Ji, S.; Xiong, H.; Dou, D. From distributed machine learning to federated learning: A survey. Knowl. Inf. Syst. 2022, 64, 885–917. [Google Scholar] [CrossRef]
  15. Singh, P.; Masud, M.; Hossain, M.S.; Kaur, A.; Muhammad, G.; Ghoneim, A. Privacy-preserving serverless computing using federated learning for smart grids. IEEE Trans. Ind. Inform. 2021, 18, 7843–7852. [Google Scholar] [CrossRef]
  16. Samarakoon, S.; Bennis, M.; Saad, W.; Debbah, M. Distributed federated learning for ultra-reliable low-latency vehicular communications. IEEE Trans. Commun. 2019, 68, 1146–1159. [Google Scholar] [CrossRef]
  17. Berhorst, N.; Hino, M.; Penteado, M.; Galvis, L.; Gotardo, D.M.; Zanardini, M.; dos Santos, R.B.; Canha, L.N.; Marques, F. Business model and economic feasibility of electric vehicle fast charging stations with photovoltaic electric generation and battery storage in Brazil. In Advanced Technologies in Electric Vehicles; Academic Press: Cambridge, MA, USA, 2024; pp. 323–343. [Google Scholar] [CrossRef]
  18. Wang, W.; Peng, X.; Yang, Y.; Xiao, C.; Yang, S.; Wang, M.; Wang, L.; Wang, Y.; Li, L.; Chang, X. Self-Training Enabled Efficient Classification Algorithm: An Application to Charging Pile Risk Assessment. IEEE Access 2022, 10, 86953–86961. [Google Scholar] [CrossRef]
  19. Bhuva, D.R.; Kumar, S. A novel continuous authentication method using biometrics for IOT devices. Internet Things 2023, 24, 100927. [Google Scholar] [CrossRef]
  20. Tuballa, M.L.; Abundo, M.L. A review of the development of Smart Grid technologies. Renew. Sustain. Energy Rev. 2016, 59, 710–725. [Google Scholar] [CrossRef]
  21. Bayindir, R.; Colak, I.; Fulli, G.; Demirtas, K. Smart grid technologies and applications. Renew. Sustain. Energy Rev. 2016, 66, 499–516. [Google Scholar] [CrossRef]
  22. Deb, S.; Tammi, K.; Kalita, K.; Mahanta, P. Impact of electric vehicle charging station load on distribution network. Energies 2018, 11, 178. [Google Scholar] [CrossRef]
  23. Husnoo, M.A.; Anwar, A.; Reda, H.T.; Hosseinzadeh, N.; Islam, S.N.; Mahmood, A.N.; Doss, R. FedDiSC: A computation-efficient federated learning framework for power systems disturbance and cyber attack discrimination. Energy AI 2023, 14, 100271. [Google Scholar] [CrossRef]
  24. Deng, S.; Zhang, L.; Yue, D. Data-driven and privacy-preserving risk assessment method based on federated learning for smart grids. Commun. Eng. 2024, 3, 154. [Google Scholar] [CrossRef] [PubMed]
  25. Zafar, M.H.; Bukhari, S.M.S.; Houran, M.A.; Moosavi, S.K.R.; Mansoor, M.; Al-Tawalbeh, N.; Sanfilippo, F. Step towards secure and reliable smart grids in Industry 5.0: A federated learning assisted hybrid deep learning model for electricity theft detection using smart meters. Energy Rep. 2023, 10, 3001–3019. [Google Scholar] [CrossRef]
  26. Zhang, J.; Guo, S.; Guo, J.; Zeng, D.; Zhou, J.; Zomaya, A.Y. Towards data-independent knowledge transfer in model-heterogeneous federated learning. IEEE Trans. Comput. 2023, 72, 2888–2901. [Google Scholar] [CrossRef]
  27. Fu, L.; Zhang, H.; Gao, G.; Zhang, M.; Liu, X. Client Selection in Federated Learning: Principles, Challenges, and Opportunities. IEEE Internet Things J. 2023, 10, 21811–21819. [Google Scholar] [CrossRef]
  28. Luzón, M.V.; Rodríguez-Barroso, N.; Argente-Garrido, A.; Jiménez-López, D.; Moyano, J.M.; Del Ser, J.; Ding, W.; Herrera, F. A tutorial on federated learning from theory to practice: Foundations, software frameworks, exemplary use cases, and selected trends. IEEE/CAA J. Autom. Sin. 2024, 11, 824–850. [Google Scholar] [CrossRef]
  29. Zeng, S.; Li, Z.; Yu, H.; Zhang, Z.; Luo, L.; Li, B.; Niyato, D. HFedMS: Heterogeneous Federated Learning with Memorable Data Semantics in Industrial Metaverse. IEEE Trans. Cloud Comput. 2023, 11, 3055–3069. [Google Scholar] [CrossRef]
  30. Zhou, X.; Ye, X.; Wang, K.I.-K.; Liang, W.; Nair, N.K.C.; Shimizu, S.; Yan, Z.; Jin, Q. Hierarchical Federated Learning with Social Context Clustering-Based Participant Selection for Internet of Medical Things Applications. IEEE Trans. Comput. Soc. Syst. 2023, 10, 1742–1751. [Google Scholar] [CrossRef]
  31. Ji, S.; Tan, Y.; Saravirta, T.; Yang, Z.; Liu, Y.; Vasankari, L.; Pan, S.; Long, G.; Walid, A. Emerging trends in federated learning: From model fusion to federated x learning. Int. J. Mach. Learn. Cybern. 2024, 15, 3769–3790. [Google Scholar] [CrossRef]
  32. Park, B.; Lee, H.; Kim, Y.-K.; Youm, S. Progressive Pruning of Light Dehaze Networks for Static Scenes. Appl. Sci. 2024, 14, 10820. [Google Scholar] [CrossRef]
  33. Zhang, Y.; You, P.; Cai, L. Optimal charging scheduling by pricing for EV charging station with dual charging modes. IEEE Trans. Intell. Transp. Syst. 2018, 20, 3386–3396. [Google Scholar] [CrossRef]
  34. Moghaddam, Z.; Ahmad, I.; Habibi, D.; Phung, Q.V. Smart charging strategy for electric vehicle charging stations. IEEE Trans. Transp. Electrif. 2017, 4, 76–88. [Google Scholar] [CrossRef]
  35. Li, H.; Han, D.; Tang, M. A privacy-preserving charging scheme for electric vehicles using blockchain and fog computing. IEEE Syst. J. 2020, 15, 3189–3200. [Google Scholar] [CrossRef]
  36. Vig, J. A multiscale visualization of attention in the transformer model. arXiv 2019, arXiv:1906.05714. [Google Scholar]
  37. Chefer, H.; Gur, S.; Wolf, L. Transformer interpretability beyond attention visualization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 782–791. [Google Scholar]
  38. Sadeghi-Barzani, P.; Rajabi-Ghahnavieh, A.; Kazemi-Karegar, H. Optimal fast charging station placing and sizing. Appl. Energy 2014, 125, 289–299. [Google Scholar] [CrossRef]
Figure 1. The framework of the distributed data privacy protection algorithm for collaborative anomaly detection.
Figure 2. Random network robustness (degree = 3).
Figure 3. Random network robustness (degree = 5).
Figure 4. Random network robustness (degree = 7).
Figure 5. Scale-free network robustness (degree = 3).
Figure 6. Scale-free network robustness (degree = 5).
Figure 7. Scale-free network robustness (degree = 7).
Table 1. Comparison of CFL and DFL.

Privacy protection
  CFL:
  • Relies on a central server for aggregation, creating a single point of vulnerability.
  • Requires trust in the central server to maintain privacy.
  • Risk of privacy breaches if the server is compromised.
  • Mitigated using techniques like differential privacy and secure aggregation.
  DFL:
  • Eliminates the need for a central server, reducing the risk of single-point failures.
  • Privacy risks are distributed across the network, making attacks more challenging.
  • Vulnerable to untrusted peers during direct exchanges.
  • Requires secure communication protocols to ensure confidentiality and integrity.

Communication requirements
  CFL:
  • Requires communication between clients and the central server for model updates.
  • Bandwidth usage increases with the number of clients in large-scale systems.
  • Optimizations like model compression and sparsification can reduce bandwidth requirements.
  DFL:
  • Utilizes localized communication among peers, reducing reliance on long-range communications.
  • Lower global bandwidth usage in geographically distributed networks.
  • May incur redundant message exchanges and overhead in peer-to-peer setups.
  • Efficiency depends on the network topology and communication protocol.
Table 2. Configuration of Transformer.

Number of layers in the forward deep neural network: 16
Dimension of hidden layers: 512
Number of attention heads: 8
Intermediate layer dimension in the feedforward neural network: 1024
Dropout rate: 0.1
Learning rate: 0.5
Maximum sequence length: 1024 tokens
Vocabulary size: 20,000
Table 3. Description of dataset.

Number of charging station locations: 150
Dimension of each sample: 24
Number of normal data samples: 448,172
Number of abnormal data samples: 92,642
Table 4. Performance evaluation of different deep neural networks.

Model: Accuracy / F1 / AUC
ConvGRU [21]: 0.7916 / 0.2801 / 0.7539
GRU-Attention [38]: 0.7032 / 0.0031 / 0.7041
Transformer: 0.8678 / 0.4326 / 0.8436

