Article

Reputation-Driven Asynchronous Federated Learning for Optimizing Communication Efficiency in Big Data Labeling Systems

1 Chinese People’s Armed Police Force Engineering University, Xi’an 710086, China
2 Department of Electronic Technology, Wuhan Naval University of Engineering, Wuhan 430033, China
3 School of Mechatronics Engineering and Automation, Shanghai University, Shanghai 200444, China
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(18), 2932; https://doi.org/10.3390/math12182932
Submission received: 19 August 2024 / Revised: 14 September 2024 / Accepted: 18 September 2024 / Published: 20 September 2024
(This article belongs to the Special Issue New Advances of Operations Research and Analysis)

Abstract: With the continuous improvement in the performance of artificial intelligence and neural networks, a new computing architecture, edge computing, has emerged. However, as hybrid intelligent edge systems grow in scale, redundant communications arise between nodes and the parameter server, and the cost of these redundant communications cannot be ignored. This paper proposes a reputation-based asynchronous model update scheme and formulates federated learning as an optimization problem. First, an explainable reputation consensus mechanism for communication in hybrid intelligent labeling systems is proposed. Then, to address the significant challenges in consistency, personalization, and privacy protection that federated recommendation poses during local intelligent data annotation, a novel federated recommendation framework built on a graph neural network is developed. Additionally, information-interaction model fusion is adopted to address data heterogeneity and enhance the uniformity of distributed intelligent annotation. Furthermore, to mitigate communication delays and overhead, an asynchronous federated learning mechanism is devised on top of the proposed reputation consensus mechanism; it leverages deep reinforcement learning to optimize the selection of participating nodes, aiming to maximize system utility and streamline data-sharing efficiency. Lastly, integrating the learned models into blockchain technology and validating them there ensures the reliability and security of the shared data. Numerical findings show that the proposed federated learning scheme achieves higher learning accuracy and enhances communication efficiency.

1. Introduction

The Internet of Things (IoT) is undergoing significant changes with the rapid advancement of the big data era [1]. Given the latest developments in IoT, especially the introduction of new infrastructure in 2020, four of the seven associated keywords are closely tied to the IoT industry [2]. This is an unprecedented opportunity for growth and an important milestone for business transformation. Unfortunately, owing to its unique characteristics, such as resource constraints, self-organization, and short-range communication, the IoT has relied on the cloud for outsourced storage and computation, which introduces a new set of challenging security and privacy threats. These challenges are ubiquitous and often lead to the fragmentation of information systems, preventing enterprises from achieving truly intelligent management. Solving these problems demands substantial human, financial, and time resources, a burden that may be too heavy for many companies. To address this dilemma, IOTOS IoT centers are envisioned as a technology platform product [3]. Broadly speaking, these IoT centers can be categorized into six types: data centers [4], business centers, algorithm centers, technology centers, R&D centers, and organizational centers. The concept of IoT centers was first introduced by IOTOS in August 2020 [5]. In this specific context, the term “entity” encompasses not only physical assets, such as equipment and facilities, but also a broader range of abstract concepts. Connected components involve individuals, devices, systems, algorithms, and services. An IoT center is more abstract and advanced than a data center, as it typically encompasses all aspects of a collection platform, a communication center platform, and a data center platform. In addition to supporting abstract business services such as data analytics, processing, and transactions, it also handles data collection and communication. The data collection platform must perform protocol analysis and heterogeneous data processing for system facilities, while the communication center platform should ensure seamless transmission of data among local, public, and hybrid networks without location-specific constraints, so that the collected data can be used smoothly. Moreover, with the advent of the digital transformation (DT) [6] era, well-known enterprises such as Alibaba and Huawei have been discussing and exploring the digital transformation process [7], proposing the “data center” mechanism to optimize data utilization.
Data centers play a vital role in enterprises by supplying the data services needed for various business operations. With their platform and business capabilities, data centers continuously enhance and nurture data, transforming it into an efficient and reliable set of data assets and service capabilities [8], thereby enabling an agile response to new market changes and facilitating the rapid development of new front-end applications. It is crucial to note that the business generates the data, while the data center acts as its support system. In this symbiosis, business represents the yang aspect, symbolizing action and movement, while data represents the yin aspect, representing stability and substance. Collectively, they form a harmonious, self-sustaining, closed-loop system. The data center is a set of mechanisms for putting data to use, mainly consisting of three parts: a big data platform, a data assets system, and data service competence. In particular, the data labeling system [9,10,11,12] is an essential part of the data resource system. The label data layer is an object-oriented data model that can unify the various identities of the same object across different business modules and data fields and consolidate the data of the same object at the same granularity to form a comprehensive and rich object labeling system. With the arrival of the big data era, data labeling is gradually being upgraded to intelligent labeling. By deploying intelligent labeling models in each data middleware, data can be labeled adaptively to match the needs of ever-changing business scenarios.
The abovementioned algorithms require the nodes of a communication network to exchange information with their neighbors at each iteration in a distributed system. In reality, however, many of these communications are unnecessary, and algorithms can still converge well with fewer communications under certain conditions. This paper therefore adopts an event-driven communication mechanism between network nodes, in which each node communicates with its neighbors only when the driving conditions are met. Event-driven communication can greatly reduce the number of communications between nodes and thus save network resources.
In addition, in the distributed data center intelligent annotation scenario considered in this paper, guaranteeing data security poses challenges, including privacy leakage, data leakage, and network security isolation. Data barriers between the training models of different local databases lead to data “silos” that cannot be shared securely [13]; we therefore adopt a federated learning framework. By moving the data storage and model training phases of machine learning to local users and interacting with the central server only for model updates, the privacy of the local secondary data centers can be effectively protected. Certainly, federated learning faces its own challenges and threats [14], including shortcomings in communication efficiency, increasingly difficult privacy and security issues, and a lack of trust and incentives [15]. Meanwhile, security threats to the federated learning process should not be underestimated [16]. They fall mainly into three types: malicious clients altering model updates to disrupt global model aggregation; malicious analyzers analyzing model update information to infer the privacy of the source data; and malicious servers attempting to obtain source data from clients.
Consequently, to address these threats while achieving high communication efficiency, optimizing the number of information interactions, reducing computational and communication overhead in practical scenarios, and ensuring a consistent security environment for distributed intelligent labels, we employ the GCN model for local labeling training within a reputation consensus mechanism in the federated learning scenario, ultimately packaged as a recommender system. The contributions of the proposed method are as follows:
  • Communication problem: to address the limited communication bandwidth and the high computational cost of local training models when exchanging model parameters, we propose a reputation consensus mechanism that decides, through adaptive parameter settings, when models are fused and information is exchanged. This enables effective interaction during local model optimization and minimizes the communication cost and computational resources spent on transmission.
  • Recommendation effect problem: this paper designs the label recommendation model with a modified GCN so that the resulting federated label recommendation system attains label consistency.
  • Security issues: to address data security and privacy in the system, this paper employs blockchain technology in the cloud (primary data center) to achieve consistent data annotation across the secondary data centers under security and privacy guarantees.

2. Related Work

2.1. Distributed Data Center Computing Architecture Diagram

First, a distributed data center computational framework is established, making its components explicit in preparation for the subsequent study, as shown in Figure 1.
Our distributed data center scenario implements a cloud-edge-end collaboration structure. The cloud layer mainly consists of extended cloud servers with sufficient computational resources and proficient data scheduling capabilities. The edge layer mainly consists of N subdata centers, and the end layer contains a large number of widely dispersed sensors and mobile devices. We model the intelligent annotation of data from the various data centers in the cloud; the end devices are abundant and contribute greatly by providing large amounts of raw data.

2.2. Graph Convolution Networks

The ultimate form of the distributed intelligent annotation architecture proposed in this paper is to annotate data and recommend labels for data categorization. Since this paper builds a federated recommender system in a distributed scenario, we note that many recent studies use GCNs for model fusion in social recommendation. For instance, Sun et al. [17] proposed a generalized deep influence propagation model to simulate how users are affected by the recursive social diffusion process in social recommendation. While they identified potential data sources and future directions, offering new ideas for this paper, they did not specifically study algorithms for data privacy protection. Consequently, this paper deploys the GCN method locally (in the secondary data centers) for model training to achieve accurate and consistent label classification in distributed scenarios. Yin et al. [18] proposed a GCN-based federated learning method that accurately recommends appropriate services to participating clients without collecting raw data; it leverages the services that different clients may share to guide embedding, aggregation, and sharing, thereby optimizing local training results. Federated learning during model aggregation also provides a degree of resistance to hazards such as cyber attacks. Yang et al. [19] were the first to propose a GCN-based federated social recommendation learning framework, which not only explains what social recommendation is but also eases data collection by incorporating social information alongside user-item interactions, mitigating the cold-start problem that arises when only user-item interactions are used; the impact of jointly using social links and user-item interactions is also considered. Accordingly, owing to the heterogeneity, personalization, and privacy-preserving requirements of social recommendation tasks, they designed a federated learning recommender system: a new framework for federated social recommendation with graph neural networks (FeSoG). This approach motivated us to apply the improved method to the distributed intelligent annotation scenario of this paper in the hope of improving the consistency and accuracy of annotation results. Yang et al. also presented the ConsisRec approach, which augments GNNs with consistent neighbor aggregation for social recommendation [20]. This consistency method enhances the GNN while realizing intelligent recommendation, providing a useful reference for the intelligent labeling system in this paper. However, that work did not consider data privacy and security; building on it, this paper combines the consistency algorithm with federated learning to ensure consistency of distributed intelligent labeling on top of system safety, reliability, and privacy protection.
Information interaction is required in every distributed data center, so uncertainty in node interaction arises when local models exchange information. Yang et al. [21] proposed a new data augmentation method called node replication, together with a learning paradigm based on GCN-supervised contrastive learning (SCL): the similarity between different user and item nodes is computed during data preprocessing, and SCL purposely lets similar nodes learn from each other in the feature space, which improves the robustness of the GCN and the accuracy and effectiveness of node replication.

2.3. Federated Learning

Federated learning has been developed as an algorithmic framework since 2016 by Google Inc. and Microcosmos; in March 2021, the IEEE formally published the first international standard for federated learning, and model aggregation has since been heavily researched.
In 2022, Pillutla et al. [22] introduced a robust aggregation approach for federated learning (RFA) to improve the robustness of system aggregation against possible poisoning of local data or model parameters on participating devices. They also established the convergence of robust federated learning algorithms for stochastic learning of least-squares additive models. Huang et al. [23] provided a comprehensive and systematic overview of key recent advances in federated learning research, together with an introduction to the history of the field and definitions of its terms.
At present, the heterogeneity of federated models and the dynamics of federated learning systems pose new challenges to the development of federated learning techniques. For this reason, Lu et al. [24] addressed this challenge by proposing FedAAM, an adaptive asynchronous federated learning scheme based on momentum. This approach not only reduces the likelihood of personal data leakage but also dynamically allocates weights, enhancing the convergence rate of asynchronous federated learning and the efficiency of global training. Meanwhile, interpretable federated learning is an emerging challenge. Yang et al. [25] suggested a new federated conceptual bottleneck (FedCBM) approach that introduces client models to identify interpretable features of each client-specific personalized model in a federated learning system. Xu et al. [26] showed that deep neural networks have low interpretability, which hinders clients from deeply understanding predicted results and responding quickly. Chen et al. [27] proposed a graph-federated learning approach in which multiple servers collaborate to enhance personalized learning on clustered clients performing related learning tasks. The approach carries out intracluster and intercluster learning through local interactions between servers and neighboring servers. Clients use the alternating direction method of multipliers (ADMM) to learn their local models, addressing the uneven distribution of clients between servers and clusters and the inadequate data of isolated clients. In Appendix A, we compare and analyze these related methods against the method of this study.

3. Method

In this section, we present a federated learning approach that aims to achieve consistency among distributed training models for intelligent labeling recommendation systems and to ensure secure data transfer and privacy preservation during the intelligent labeling and recommendation process. To accomplish this goal, we combine the prediction problem with node classification by creating local graph convolutional network (GCN) models. These models use 10 different node features to determine the interconnectivity between nodes. Through the utilization of proprietary data, we construct private models while employing public data as a training benchmark. The node interactions can be facilitated by matrix similarity computation. Eventually, we compare the trained private models with the public data and perform a variance analysis to identify appropriate instances of information exchange. The summary of the symbols is listed in Table 1.
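As a concrete illustration of the matrix similarity computation mentioned above, the following is a minimal sketch. Cosine similarity over flattened parameter vectors is our own stand-in, since the section does not fix a specific similarity metric.

```python
import numpy as np

def model_similarity(params_a: np.ndarray, params_b: np.ndarray) -> float:
    """Cosine similarity between two flattened model parameter vectors.

    A stand-in for the paper's matrix similarity computation; the exact
    metric used between local GCN models is not fully specified here.
    """
    a, b = params_a.ravel(), params_b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```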

3.1. Explainable Distributed Semi-Supervised Model Fusion Reputation Consensus Mechanism

Throughout this paper, the private data accumulated by the subsidiary entities show local correlation with their respective private models, including spatial extent. The proposed distributed training framework for federated learning based on graph convolutional networks is shown in Figure 2. To facilitate model understanding and enhance data relevance, model similarity is considered when sharing data between models. Exploiting the framework's explainability, a fusion mechanism is developed that performs consistent relationship analysis and importance aggregation through relational attention. A higher model similarity means higher relevance of the shared data received from the data provider, which ultimately leads to higher quality, accuracy, and reliability of data sharing. The reputation value assigned to each subsidiary, denoted as reputation = {attention, parameter, mean absolute error (MAE)}, is determined by the corresponding coefficients $\psi_1, \psi_2, \psi_3$, whose sum equals 1 ($\psi_1 + \psi_2 + \psi_3 = 1$). In the case of subsidiary 1, the similarity between the subsidiary and its own previous temporal state model is defined as:
$SIM_{L_1,J_1} = 1 - DISS_{L_1,J_1} = \mathrm{reputation}_1$
1. Attention 1: the degree of attention is determined by the interconnections formed by the 10 feature weights derived from the private data, i.e., the connectivity topology among the 10 nodes. A higher degree of attention corresponds to a more favorable reputation.
2. Parameter 1: the reputation of subsidiary 1 depends on the parameters of the private model derived from its proprietary dataset. Higher parameter values indicate a better reputation.
3. $MAE_{11}$: by substituting the validation set into the previously trained private model, the disparity can be ascertained as $MAE_{11}$.
4. Therefore, $DISS_{L_1,J_1}$ is the normalized difference between subsidiary 1 and its previous state, defined as:
$DISS_{L_1,J_1} = \psi_1 \times \mathrm{attention}_1 + \psi_2 \times \mathrm{parameter}_1 + \psi_3 \times MAE_{11}$
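A minimal numeric sketch of this self-reputation computation, assuming the attention, parameter, and MAE terms are already normalized to [0, 1] (the normalization itself is not specified here) and using equal $\psi$ weights purely for illustration:

```python
def reputation_self(attention: float, parameter: float, mae: float,
                    psi=(1/3, 1/3, 1/3)) -> float:
    """reputation_1 = SIM = 1 - DISS, with DISS the weighted sum above.
    The equal psi weights are illustrative, not values from the paper."""
    psi1, psi2, psi3 = psi
    assert abs(psi1 + psi2 + psi3 - 1.0) < 1e-9  # coefficients must sum to 1
    diss = psi1 * attention + psi2 * parameter + psi3 * mae
    return 1.0 - diss
```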
Now, $\mathrm{reputation}_2$ consists of the following parts:
1. $MAE_{d12}$: the MAE obtained by substituting private data 1 into private model 2.
2. $MMD_{1\&2}$: the maximum mean discrepancy between public data 1 and public data 2.
3. $MAE_{p1\&2}$: the difference between parameter 1 and parameter 2.
Similarly, the similarity between subsidiary 1 and subsidiary 2 is:
$SIM_{L_1,L_2} = 1 - DISS_{L_1,L_2}$
Therefore, $DISS_{L_1,L_2}$ is the normalized difference between subsidiary 1 and subsidiary 2, standing for $\mathrm{reputation}_2$, defined as:
$DISS_{L_1,L_2} = \psi_1 \times MAE_{d12} + \psi_2 \times MMD_{1\&2} + \psi_3 \times MAE_{p1\&2}$
By the same token, $\mathrm{reputation}_n$ consists of the following parts:
1. $MAE_{d1n}$: generated by substituting private data 1 into private model $n$.
2. $MMD_{1\&n}$: the maximum mean discrepancy between public data 1 and public data $n$.
3. $MAE_{p1\&n}$: the difference between parameter 1 and parameter $n$.
Therefore, $DISS_{L_1,L_n}$ is the normalized difference between subsidiary 1 and subsidiary $n$, defined as:
$DISS(L_1, L_n) = \psi_1 \times MAE_{d1n} + \psi_2 \times MMD_{1\&n} + \psi_3 \times MAE_{p1\&n} = \mathrm{reputation}_n$
Within these entities, it is important to emphasize the weight of each subsidiary's recommended information when interacting with entity 1. Subjective views from the different subsidiaries (neighboring nodes) are therefore merged into a global viewpoint model, which can be referred to as the global model of the subsidiaries, containing the weights assigned to each individual viewpoint:
$DISS(L_1, J_1) = \psi_1\,\mathrm{attention}_1 + \psi_2\,\mathrm{parameter}_1 + \psi_3\,MAE_{11} = \mathrm{reputation}_1$
$DISS(L_1, L_2) = \psi_1\,MAE_{d12} + \psi_2\,MMD_{1\&2} + \psi_3\,MAE_{p1\&2} = \mathrm{reputation}_2$
$\vdots$
$DISS(L_1, L_n) = \psi_1\,MAE_{d1n} + \psi_2\,MMD_{1\&n} + \psi_3\,MAE_{p1\&n} = \mathrm{reputation}_n$
For subsidiary 1, the final reputation is:
$\mathrm{Global}_1 \times \mathrm{final\text{-}reputation}_1 = \mathrm{reputation}_1 \times \mathrm{mod}_1 + \mathrm{reputation}_2 \times \mathrm{mod}_2 + \cdots + \mathrm{reputation}_n \times \mathrm{mod}_n$
During the integration and interaction between the subsidiaries, the private data is constantly updated, leading to the evolution of the private model trained on said data. The determination of reputation is influenced not only by the current moment t but also by the public data. Hence, when calculating the reputation value, it is important to evaluate the variance of the private model and the difference of the public data. Furthermore, it is essential to integrate one’s own private data into the private models of the other interacting subsidiaries in order to identify the resulting variance values. Eventually, these variance values are merged together to construct a comprehensive global reputation. This methodology takes into account the various factors affecting reputation, resulting in a more persuasive assessment, improved integration between models, and consequently, increased consistency of annotations between subsidiaries.
To present the content of Section 3.1 more intuitively, we summarize it in pseudocode as Algorithm 1.
Algorithm 1: Explainable distributed semi-supervised model fusion reputation consensus mechanism
1. Initialize $\psi_1, \psi_2, \psi_3$ with $\psi_1 + \psi_2 + \psi_3 = 1$; reputation = {attention, parameter, mean absolute error (MAE)}
2. $MMD_{1\&2}$ = maximum mean discrepancy between public data 1 and public data 2
3. $MAE_{d1n}$ = MAE from substituting private data 1 into private model $n$
4. $MMD_{1\&n}$ = maximum mean discrepancy between public data 1 and public data $n$
5. $MAE_{p1\&n}$ = difference between parameter 1 and parameter $n$
6. $DISS_{L_1,J_1}$ = normalized difference between subsidiary 1 and its previous state
7. for each data center = 1, 2, …, $n$ do
8.   Compute $DISS_{L_1,J_1} = \psi_1 \times \mathrm{attention}_1 + \psi_2 \times \mathrm{parameter}_1 + \psi_3 \times MAE_{11}$ and $SIM_{L_1,J_1} = 1 - DISS_{L_1,J_1}$
9.   Compute $DISS_{L_1,L_2} = \psi_1 \times MAE_{d12} + \psi_2 \times MMD_{1\&2} + \psi_3 \times MAE_{p1\&2}$ and $SIM_{L_1,L_2} = 1 - DISS_{L_1,L_2}$
10.  Compute $DISS(L_1, L_n) = \psi_1 \times MAE_{d1n} + \psi_2 \times MMD_{1\&n} + \psi_3 \times MAE_{p1\&n} = \mathrm{reputation}_n$
11. end for
12. $\mathrm{Global}_1 \times \mathrm{final\text{-}reputation}_1 = \mathrm{reputation}_1 \times \mathrm{mod}_1 + \mathrm{reputation}_2 \times \mathrm{mod}_2 + \cdots + \mathrm{reputation}_n \times \mathrm{mod}_n$ // update global model
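To make the mechanism concrete, the following is a minimal Python sketch of Algorithm 1. It assumes all input terms (attention, parameter differences, MAE, and MMD values) are pre-normalized to [0, 1], and it uses SIM = 1 − DISS as the reputation throughout, since the text alternates between the two conventions; the argument names mirror the symbols above.

```python
import numpy as np

def diss(psi, terms):
    """Weighted dissimilarity DISS = psi1*t1 + psi2*t2 + psi3*t3."""
    return float(np.dot(psi, terms))

def reputation_consensus(psi, self_terms, pairwise_terms, model_weights):
    """Sketch of Algorithm 1 under simplifying assumptions.

    psi            -- (psi1, psi2, psi3), summing to 1
    self_terms     -- (attention_1, parameter_1, MAE_11) for subsidiary 1
    pairwise_terms -- list of (MAE_d1k, MMD_1k, MAE_p1k) for k = 2..n
    model_weights  -- the mod_k weights used in the final fusion
    """
    assert abs(sum(psi) - 1.0) < 1e-9
    # Reputation w.r.t. subsidiary 1's own previous-state model.
    reputations = [1.0 - diss(psi, self_terms)]
    # Reputation w.r.t. every other subsidiary.
    reputations += [1.0 - diss(psi, terms) for terms in pairwise_terms]
    # Global (final) reputation: weighted fusion of all viewpoints.
    return sum(r * w for r, w in zip(reputations, model_weights))
```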
The flexible reputation mechanism based on the private model, public data, private data, and updated data is detailed in Figure 3. The state of the six subsidiaries at the current time t is summarized as follows: the first row contains only the private data model; the second row contains all public data (together with the first row, forming the global model); the third row represents the private data at time t. For time t + n, the state consists of private models 1 to n (each determining the appropriate time for its next interaction based only on the information available at time t, i.e., the value of n indicates the amount of updating required), public data 1 to n, and its own private data 1.
We recommend expanding the credit value to include a variety of factors related to the intelligent labeling training of subsidiaries. These include consideration of the relationship between the model and its own prior state, as well as the differences between the model and other submodels. In particular, we consider the impact of incorporating private models into the training process of other private models, thus affecting the overall structure of pairwise model training.

3.2. Proof of Reputation-Based Fusion Mechanism

For non-Euclidean data, the connections between vertices and edges are different in each topological graph. For this reason, GCN has been chosen as a means to efficiently extract spatial features for machine learning purposes. Various techniques aimed at explaining deep models using saliency graphs have been demonstrated in recent survey studies. By this means, this paper proposes a reputation-based interpretable fusion graph based on the attention GCN network, as shown in Figure 4. The method involves utilizing the parameters and feature roots of the model to obtain the saliency graph. As f represents a relational graph model, we opt for g as a straightforward GCN. The two-layer GCN structure designed in this paper consists of one layer with 10 nodes as inputs and another layer with the parameters of the private model. The sparse matrices used in the GCN provide explainability and help in interpreting any black-box models.
We utilize the training data to train 10 feature weights that are the same as the input parts. Together, these feature weights form a feature map, with each weight representing a different feature. By incorporating different proportions of input feature weights during training, the global reputation of the system (private model) can be efficiently established, thus solving the problem of model type saturation. Furthermore, since the convolutional layer retains spatial information, only 10 input feature weights are required to generate the model map. Unique feature maps are generated for each model structure by different weighting ratios. Nevertheless, the different feature weights among the six subsidiaries and the unique connections between the nodes lead to differences in the generated feature graphs, resulting in inaccurate training models. In addition, given the existence of information interaction and fusion dynamics between subsidiaries, the establishment of a fusion mechanism can improve the accuracy and consistency of the training model based on the information on node interactions between different subsidiaries.
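The two-layer structure described above can be sketched as follows. This is a generic Kipf-Welling-style GCN forward pass in NumPy with symmetric degree normalization, not the authors' exact implementation; the trained weights W1 and W2 are assumed given.

```python
import numpy as np

def gcn_forward(A, X, W1, W2):
    """Minimal two-layer GCN forward pass: add self-loops, apply
    symmetric degree normalization D^-1/2 (A+I) D^-1/2, propagate
    through two weight matrices with a ReLU in between, and return
    per-node class logits. W1, W2 are assumed trained elsewhere.
    """
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    H = np.maximum(A_norm @ X @ W1, 0.0)           # layer 1 + ReLU
    return A_norm @ H @ W2                         # layer 2: class logits

# Example with 10 nodes and 10 features, matching the setup above:
# A = np.random.binomial(1, 0.3, (10, 10)); A = np.maximum(A, A.T)
# X = np.random.randn(10, 10)
# logits = gcn_forward(A, X, np.random.randn(10, 16), np.random.randn(16, 4))
```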
In graphical data, nodes can represent objects or concepts, and edges represent the relationships between nodes; both may carry labels. A graph neural network (GNN) employs a state vector $X_n$ to express the state of a node. The computation for node $n$ considers four components: the feature vector of node $n$, the feature vectors of neighboring nodes, the state vectors of neighboring nodes, and the feature vectors of the edges connected to $n$. These components are updated in each layer using degree-normalized aggregation, resulting in a node representation that captures task-relevant features and structural insights useful for a variety of machine learning tasks.
The input phase involves combining information from neighboring nodes and edges to derive a state vector for the current node. Afterwards, the node features and state vectors are merged into the output vector. The graph convolutional network (GCN) utilizes the concept of convolution in graphs where the convolution operation of a particular point is considered as a weighted sum of its neighboring points. Eventually, node classification is performed using node characteristics.
Assuming a pretrained model and 10 input feature weights, we attempt to explain the model by elucidating the relationships between the input features at the node level. These interpretations should ideally capture the features and corresponding labels that are relevant to the model’s predictions. It ensures that when an image interpretation is delivered to a pretrained model, it will generate predictions that are the same or similar to the original image. Interpreting relational data and relational models, however, presents a significant challenge, as the correct relational structure needs to be learned when interpreting the predicted nodes of interest.
In addition, the prediction output of the model is $Pr$. We also consider predictions on the unannotated data and calculate the loss through the MAE; the loss function to be minimized is:
$Loss = \frac{1}{L} \sum_{i=1}^{L} loss\left( y, Pr \right) + \frac{1}{U} \sum_{i=L+1}^{L+U} loss\left( \hat{y}, Pr \right)$
$MAE\left( X, h \right) = \frac{1}{m} \sum_{i=1}^{m} \left| h\left( x_i \right) - y_i \right|$
It can be seen that the proposed explainable strategy comprises three parts within the GCN framework: (i) accurate prediction, (ii) a regularization term (against overfitting), and (iii) smoothing:
$f_i = \frac{1}{2L} \sum_{i} \frac{1}{2} \left\| \hat{y}_i - \tilde{y}_i \right\|^2 + \left\| x_i \right\|^2 \left\| w \right\|^2 + \frac{\eta_i}{2}\, \hat{y}_i^{\top} \tilde{L}_i\, \hat{y}_i$
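Before turning to the three terms, here is a minimal sketch of the semi-supervised MAE loss defined above, assuming the pseudo-labels $\hat{y}$ for the unannotated data are produced by the current model:

```python
import numpy as np

def semi_supervised_mae_loss(y_lab, pred_lab, y_pseudo, pred_unlab):
    """Combined loss from the formulas above: MAE on the L labeled
    samples plus MAE of unlabeled predictions against pseudo-labels
    (assumed produced by the current model)."""
    labeled = np.mean(np.abs(y_lab - pred_lab))         # (1/L) sum loss(y, Pr)
    unlabeled = np.mean(np.abs(y_pseudo - pred_unlab))  # (1/U) sum loss(y_hat, Pr)
    return labeled + unlabeled
```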
  • Accuracy
Model accuracy, also known as model precision, represents the ability of a predictive model to correctly assign instances to their respective categories; it quantifies how close the model's predicted values are to the true values. Higher accuracy indicates more precise predictions, while lower accuracy indicates lower reliability.
To achieve accurate model annotation, intelligent annotation is implemented in this context. Distributed models require asynchronous communication between subsidiaries through information exchange to ensure consistency of intelligent annotation across models. This contributes to more accurate annotation at the global model level.
  • Regularization term
In the machine learning and statistics fields, a regularization term is a mechanism for regulating model comprehensiveness. The term is usually attached to the loss function and linked directly to the model parameters. Its main goal is to mitigate overfitting. Commonly used regularization techniques include L1 regularization and L2 regularization, which impose penalties on model parameters using L1 norms and L2 norms, respectively. L1 normalization facilitates feature selection by encouraging certain model parameters to converge to zero, whereas L2 regularization reduces model complexity by bringing model parameters closer to zero. The optimal selection of the regularization terms and fine-tuning should be based on the characteristics of the particular problem and dataset at hand.
Integrating regularization terms helps enhance the model’s generalization ability and promotes its superior performance on unseen data. By delicately controlling the complexity of the model, it effectively alleviates the overfitting challenge and achieves a well-calibrated balance between model fitting and generalization capabilities. To ensure sparsity and promote improved interpretability, we incorporate regularization measures into our objective function to ensure that the learned masks maintain their desired properties.
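A minimal sketch of attaching such penalties to a loss; the coefficients are illustrative, not values used in the paper:

```python
import numpy as np

def regularized_loss(base_loss, weights, l1=0.0, l2=1e-4):
    """Attach L1/L2 penalties to a base loss, as described above.
    The L1 term encourages sparse (interpretable) parameters; the L2
    term shrinks weights toward zero to curb model complexity."""
    w = np.concatenate([p.ravel() for p in weights])
    return base_loss + l1 * np.sum(np.abs(w)) + l2 * np.sum(w ** 2)
```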
  • Smooth items
Originally, the problem of oversmoothing manifested itself through the similarities observed between node embeddings. The ultimate goal of embedding learning is to utilize them in classification tasks to predict labels. Such oversmoothing, however, leads to generating similar embeddings for nodes that do not share the same label, resulting in misclassification. Since convolution inherently involves aggregation, it produces a smoothing effect when the convolution kernel employs specific values. Considering that parameter sharing plays a crucial role in convolutional operations, its emphasis within the graph convolutional network (GCN) framework becomes even more pronounced due to the variation of vertex degrees in the graph.
After an extended investigation into the root cause of the smoothing problem, modified convolution-kernel regularization methods were found to have the potential to address it, albeit at a high computational cost. An alternative approach is to retain results at lower levels and merge them with other features to alleviate excessive smoothing; this involves applying convolutions at different scales and then fusing their results to capture different node features.

3.3. Information Interaction Conditions

At the beginning of moment t + 1, when the first company updates its private data, its private model is updated accordingly, yielding a new reputation 1. This adjustment stems from calculating the mean absolute error (MAE) between the private data of the different companies and their respective private models, while also taking into account the difference in MinHash values between the private model of company 1 and those of the other companies. In addition, the maximum mean discrepancy (MMD) between company 1's public data and that of the other companies is calculated to further inform the real-time update of reputation 1. Absent direct interaction, the focus shifts to assessing the extent of the reputational change and the desirability of interaction; identifying the exact conditions under which such interactions are initiated remains critical.
The reputation value at the next moment ($\mathrm{reputation}^{t+1}$) can be compared pairwise with the current reputation values; the degree of difference used for the judgment is:
$\Delta \mathrm{Reputation} = \Delta\left( \mathrm{reputation}^{t+1} \,\&\, \mathrm{reputation}_1^t \right) + \Delta\left( \mathrm{reputation}_2^t \,\&\, \mathrm{reputation}_1^t \right) + \Delta\left( \mathrm{reputation}_3^t \,\&\, \mathrm{reputation}_1^t \right)$
where $\mathrm{reputation}^{t+1}$ denotes the reputation value at the next moment, $\mathrm{reputation}_2^t$ denotes the current reputation value of subsidiary 2, and so on; the updated reputation value is related both to each subsidiary's current reputation value and to the reputation value at the next moment.
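A minimal sketch of the resulting event-driven trigger; absolute differences stand in for the pairwise Δ terms, and the threshold is a tunable hyperparameter that the paper leaves unspecified:

```python
def delta_reputation(rep_next, reps_now):
    """Sum of absolute changes between the next-moment reputation and
    each subsidiary's current reputation (the formula above)."""
    return sum(abs(rep_next - r) for r in reps_now)

def should_interact(rep_next, reps_now, threshold):
    """Event-driven trigger: fuse models and exchange information only
    when the reputation drift exceeds the threshold."""
    return delta_reputation(rep_next, reps_now) > threshold
```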

3.4. Asynchronous Federated Learning

3.4.1. Node Optimization Selection

The heterogeneous computing resources and variable communication conditions of the local data centers hinder the efficiency of model fusion and information interaction, as well as the aggregation of global models. Choosing when to exchange information so as to avoid unnecessary aggregation is therefore effective. Node selection is equally crucial for improving communication efficiency. The aim is thus to select a subset of nodes $V_P \subseteq V_I$ that minimizes the number of interactions while maximizing the precision of the aggregated model.
We introduce $\lambda^t = \{ \lambda_i^t \}$ at time $t$ as the selection-state vector of the intelligent annotation models in the local data centers, where $\lambda_i^t = 1$ indicates that node $i$ is selected for interaction and $\lambda_i^t = 0$ indicates the stopped state. The time cost of node selection can therefore be expressed as:
$c_D^t(n) = f_D\left( \xi_n, d_n, t \right) = \frac{d_n \beta_m}{\xi_n^t}$
where $d_n$ is the training data of local model $n$, $\beta_m$ is the number of CPU cycles required to train model $m$ on a secondary data center, and $\omega_n$ is the local parameter learned at node $n$ in round $t$. The communication cost of local data center $n$ is expressed as:
$c_c^t(n) = f_c\left( \tau_n, \omega_n, t \right) = \frac{\omega_n}{\tau_n}$
where the quality of learning (QoL) describes the accuracy of the local model parameters learned by the local data center at round $t$; the learning accuracy loss $c_q^t$ is expressed as follows:
$c_q^t = \sum_{i \in V_P} \sigma_n^t\left( \omega^t, d_n \right) = \sum_{i \in V_P} \sum_{j \in L} \left| y_j - \hat{\omega}^t x_j \right|$
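A minimal sketch of the two cost terms above, reading $d_n$ as the local data size, $\xi_n^t$ as the available CPU frequency, $\omega_n$ as the parameter payload size, and $\tau_n$ as the transmission rate (interpretations inferred from context):

```python
def computation_cost(d_n: float, beta_m: float, xi_n_t: float) -> float:
    """Local training time: data size * CPU cycles per unit of data,
    divided by the CPU frequency available at time t (c_D above)."""
    return d_n * beta_m / xi_n_t

def communication_cost(omega_n: float, tau_n: float) -> float:
    """Upload time: parameter payload size / transmission rate (c_c above)."""
    return omega_n / tau_n
```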
DRL is used to address the node selection problem; the model is learned by interacting with the other local training models (secondary data centers). Specifically, we use the Markov decision process $M = \left( S, V, P_V, C_V \right)$.

3.4.2. Implementation Process of Asynchronous Federated Learning

  • System status
During federated learning, time is recorded as $t$, and the system status includes the available computing resources of the local data centers, the data transmission rate between local data center stations, and the selected state of the nodes; the system status is therefore:
$s^t = \left( \tau^t, \xi^t, \lambda^{t-1} \right)$
  • Action space
The action at time $t$ is the selection strategy of the information interaction nodes in the data centers, indicated as follows:
$\lambda^t = \left( \lambda_1^t, \lambda_2^t, \ldots, \lambda_n^t \right)$
  • Policy
The mapping from the state space to the action space is called the policy $P: S \to A$. In time period $t$, the action can be calculated from the policy as $\lambda^t = P\left( s^t \right)$. The local data center network status is then transmitted according to the node selection action.
  • Reward function
The effect of an action is evaluated by the system using the reward function $r$. In epoch $t$, the agent tasked with node selection takes action $\lambda^t$ at state $s^t$. The behavior is evaluated with the specified reward function as follows:
$R\left( s^t, \lambda^t \right) = \frac{1}{\sum_{i=1}^{n} \lambda_i} \sum_{i=1}^{n} c_i^t \lambda_i^t = \frac{1}{\sum_{i=1}^{n} \lambda_i} \sum_{i=1}^{n} \lambda_i^t \left( \frac{d_i \beta_m}{\xi_i^t} + \frac{\omega_i}{\tau_i} \right) + \Delta \mathrm{Reputation}$
The reward function $R\left( s^t, \lambda^t \right)$ quantifies the action $\lambda$ in epoch $t$, where $\gamma \in \left( 0, 1 \right)$ is the discount factor. The total cumulative reward is:
$E\left[ \sum_{t=0}^{T-1} \gamma^t R\left( s^t, \lambda^t \right) \right]$
Next status: after the system is updated, the state changes to $s^{t+1}$, where $s^{t+1} = s^t + P\left( s^t \right)$; the new state becomes $\left( \tau^{t+1}, \xi^{t+1}, \lambda^{t+1} \right)$.
Through node selection and the judgment of when to exchange information, the total cost of federated learning can be minimized. For the DRL model, the goal is to find the policy $\lambda^*$ with the maximum cumulative reward:
$\lambda^* = \arg\max E\left[ \sum_{t=0}^{T-1} \gamma^t R\left( s^t, \lambda^t \right) \right]$
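A toy stand-in for the DRL node-selection agent follows; the paper does not specify the network architecture or training loop in this section, so this sketch uses a simple epsilon-greedy rule over per-node value estimates together with the discounted-return objective above.

```python
import numpy as np

def select_nodes_epsilon_greedy(q_values, eps=0.1, rng=None):
    """Per-node binary selection lambda_i in {0, 1}: with probability
    eps explore a random subset, otherwise greedily select nodes whose
    estimated value is positive. The value estimates q_values are a
    hypothetical stand-in for a learned critic."""
    if rng is None:
        rng = np.random.default_rng()
    n = len(q_values)
    if rng.random() < eps:
        return rng.integers(0, 2, size=n)          # exploration
    return (np.asarray(q_values) > 0).astype(int)  # exploitation

def discounted_return(rewards, gamma):
    """Cumulative discounted reward E[sum_t gamma^t R(s_t, lambda_t)]."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))
```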

3.5. Data Sharing Process

Within federated learning, each device or data source can train the model locally and transmit the updated model parameters to a central server for aggregation, which updates the global model. The process is autonomous: first, an initialized model is installed on the terminals of two or more participants, each holding the same model. Participants then train the model with their local data; because the participants hold different datasets, the resulting model parameters on the terminals differ.
Global federated training requires uploading the different model parameters to the cloud simultaneously. The cloud then aggregates and updates the model parameters and returns the updated parameters to the participants' endpoints, where each endpoint initiates the next iteration; the process repeats until training converges. Consequently, in the scenario of training a machine learning model across distributed data centers with $n$ secondary data centers, a federated learning architecture is introduced: each business system holds its own user data, and each secondary data center holds labeled data that the model needs to predict. For data privacy and security reasons, data cannot be exchanged directly between the secondary platforms. A federated learning system can therefore be employed to build a distributed intelligent labeling recommender system model; the system architecture is shown in Figure 5.
  • Step 1: Local model training
As each local secondary data center uses different categories of data according to its business system, as well as different data of the same category, GCN models are used to train the local intelligent labeling models. Simultaneously, each local model is calibrated with encrypted samples.
  • Step 2: Sharing request
During the sharing request process, the data requester shares and uploads only the generic model to the cloud, whereas the private model and private data are stored in the secondary data center.
  • Step 3: Interaction model parameters
While the local model is being uploaded, intermediate calculation results, such as gradients and step sizes, are exchanged continuously under encryption protection.
  • Step 4: Reputation consensus mechanism
In the modeling process, an intermediate party acts as coordinator. The reputation consensus mechanism proposed in this paper can optimize the number of information interactions for locally trained model parameters and minimize the communication overhead and computational resources.
  • Step 5: Update parameters
Constantly update model parameters according to the interaction results and encrypt model training.
  • Step 6: Global aggregation
Each participant aggregates the global model and computes the final combined structure as the final model, as sketched below.
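A minimal sketch of one synchronous sharing round corresponding to Steps 1-6; the encryption, reputation check, and blockchain validation are omitted, and `local_train` is a user-supplied training routine, so this is an illustration rather than the authors' implementation.

```python
import numpy as np

def federated_round(global_params, local_datasets, local_train, weights=None):
    """One round: each secondary data center trains locally from the
    current global parameters (Steps 1-3), then the cloud aggregates
    by (weighted) averaging and returns the new global model (Steps 5-6).
    """
    updates = [local_train(np.copy(global_params), d) for d in local_datasets]
    if weights is None:
        weights = [1.0 / len(updates)] * len(updates)  # plain averaging
    return sum(w * u for w, u in zip(weights, updates))
```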

4. Experimentation

In this section, the performance of our presented blockchain-authorized asynchronous federated smart labeling recommendation system is assessed. A baseline for the experiments is first defined, and we provide a comprehensive description of the dataset and experimental details. Then, the feasibility of the proposed asynchronous federated learning recommendation algorithm based on the reputation consensus mechanism is verified, and the performance of the federated intelligent labeling recommendation system based on the adaptive GCN algorithm is evaluated.

4.1. Datasets

The proposed asynchronous federated label learning recommendation algorithm is evaluated on the NGSIM dataset. The NGSIM (next generation simulation) dataset consists of comprehensive US highway traffic data collected by the FHWA, covering the driving conditions of vehicles traversing the US-101 and I-80 roads during specified time frames. Obtained through camera-based techniques, the data were carefully processed into individual track points. To evaluate the effectiveness of the proposed method, three well-known trajectory prediction datasets were selected: NGSIM I-80 [28], US-101 [29], and the ApolloScape trajectory dataset [30].
  • NGSIM I-80
The NGSIM (next generation simulation) I-80 dataset contains real-world traffic data collected from the I-80 highway in the United States. NGSIM is a project by the Federal Highway Administration aimed at providing detailed, high-quality data for traffic research [28]. The data were collected on the I-80 highway in Emeryville, California, and include vehicle trajectories and motion patterns. Trajectory information includes position, speed, acceleration, and lane changes for each vehicle. Data were collected over multiple time intervals, often covering 15-minute periods, at high temporal and spatial resolution, with vehicle positions recorded at 0.1 s intervals. The scenario is urban highway traffic, which is particularly useful for analyzing vehicle interactions in dense conditions, such as lane changing, car following, and merging.
  • US101
The NGSIM US-101 dataset is similar to the I-80 dataset, but the data come from the US-101 highway in Los Angeles, California. This highway has different geometric and traffic characteristics from I-80, providing additional diversity [29]. The data contain vehicle trajectories and kinematic data; trajectory information includes position, velocity, and lane-changing behavior. As a scenario, the US-101 highway features higher traffic volume and more complex on/off ramps than I-80, introducing different challenges such as more frequent merges and exits.
  • ApolloScape Trajectory dataset
The ApolloScape dataset is part of Baidu's Apollo autonomous driving platform [30]. It provides high-quality trajectory data collected in both urban and highway driving scenarios, consisting of carefully captured camera-based imagery, LiDAR-scanned point clouds, and carefully labeled trajectories. Collected in the bustling city of Beijing, China, this comprehensive dataset encompasses a wide range of lighting conditions and varying levels of traffic density. In particular, it embraces complex, interwoven traffic patterns, seamlessly integrating vehicle movements with passengers and pedestrians. The data include sensor data (LiDAR, radar, cameras) along with vehicle trajectories, providing richer information than the NGSIM datasets. The trajectory information covers vehicle positions, velocities, accelerations, and interactions with surrounding vehicles and infrastructure.
These datasets are invaluable for trajectory prediction research. The NGSIM datasets (I-80 and US-101) focus heavily on highway traffic scenarios, providing detailed data on vehicle behavior in dense traffic. The ApolloScape dataset is broader, covering a variety of driving scenarios, making it highly relevant for autonomous vehicle development and testing across different environments.
Initially, the example dataset (n = 10) was divided into separate subsets that could be intersected. To maintain temporal correlation, the insertion time t for each dataset was precisely recorded, and the differences between the datasets were calculated. For each insertion and extraction, two key program parameters were used: the structural parameters and the significance parameters P1 and P2, which were shared among the five local subplatforms.

4.2. Baselines

Throughout this subsection, we provide a comprehensive comparison of our scheme against a series of baselines consistent with the methodology outlined in ref. [31]. Additionally, we evaluate various existing solutions on the NGSIM datasets, including V-LSTM, C-VGMM+VIM, GAIL-GRU, CS-LSTM, and GRIP++.
  • V-LSTM
Vanilla long short-term memory (V-LSTM) represents a specialized variant of recurrent neural network (RNN) meticulously crafted to handle the intricacies of sequential data within the realm of computer vision. Distinguishing itself from the conventional LSTM (long short-term memory) framework, V-LSTM seamlessly integrates visual information into its network architecture.
Within the V-LSTM framework, the input sequence not only contains temporal data but also integrates visual features extracted from images or video frames. This fusion enables the network to skillfully capture temporal dependencies while understanding and reviewing the sequential data through a visual lens.
Incorporating the power of LSTM with visual capabilities, V-LSTM has become a powerful force in various computer vision domains covering action recognition, video captioning, video generation, and video prediction. This fusion-based approach exhibits higher performance when compared to traditional LSTM models, especially in the field of visual sequential learning tasks.
  • C-VGMM+VIM
C-VGMM+VIM combines class-specific variational Gaussian mixture models (C-VGMM) with a vehicle interaction module (VIM) for maneuver-based vehicle trajectory prediction [31].
  • GAIL-GRU
GAIL-GRU stands for generative adversarial imitation learning with gated recurrent units. It is a machine learning algorithm that combines generative adversarial networks (GANs) with gated recurrent units (GRUs) to learn a policy from expert demonstrations in an unsupervised manner. The GAIL-GRU algorithm is commonly used in reinforcement learning and imitation learning tasks [32].
  • CS-LSTM
Contextual semantic long short-term memory (CS-LSTM) is a neural network model used to deal with text sequence modeling tasks, such as sentiment analysis and named entity recognition. It is an improvement on the traditional LSTM (long short-term memory), and the mechanism of introducing contextual semantic information in CS-LSTM can help the model better understand and capture the dependencies between contexts, which improves the performance of text series modeling tasks.
  • GRIP++
GRIP++ serves as an improved scheme of GRIP; it utilizes both fixed and dynamic graphs to capture the complex interactions among different types of traffic agents and thereby improve trajectory prediction accuracy [33].

4.3. Experiment Settings and Evaluation Criteria

Our scheme runs on a desktop with Ubuntu 16.04, a 4.0 GHz Intel Core i7 CPU, 32 GB of RAM, and an NVIDIA Titan Xp graphics card. Every dataset is randomly partitioned, allocating 70% of the data for training, 20% for validation, and 10% for testing. Each method employs two hidden graph neural network layers (e.g., GCN, GraphSAGE) with layer sizes matching the number of classes in the dataset. Concretely, the NGSIM dataset is partitioned into 100 segments: 70 segments are used for training, 20 for validation, and 10 for testing. The task of sharing edge data involves propagating the computational results of each data provider's local data. Ten nodes representing 10 labels are used for model training, and global aggregation is performed after each iteration as a continuous optimization process.
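A minimal sketch of the 70/20/10 segment split described above; the random seed is illustrative:

```python
import numpy as np

def split_segments(n_segments=100, seed=0):
    """Shuffle the 100 NGSIM segments and slice them into
    train/validation/test sets of 70, 20, and 10 segments."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_segments)
    return idx[:70], idx[70:90], idx[90:]
```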

4.3.1. Asynchronous Federated Learning Fusion Optimization Accuracy

From Figure 6, it can be concluded that as the number of experimental cycles increases, the accuracy obtained by our test set continues to improve, eventually leveling off and remaining above 95%. With this observation, we confirm the feasibility and validity of our experimental approach. It is clear that our proposed reputation-based fusion mechanism does improve the accuracy of intelligent model labeling.

4.3.2. Asynchronous Federated Learning Fusion Optimization Loss Function

Figure 7 shows the loss values for model training. The loss function decreases as the number of iterations increases and eventually converges to a steady state. To further illustrate the efficiency of the proposed reputation mechanism, Table 2 and Table 3 give the loss values for different participants at epochs 1 and 30. As the tables show, the accuracy of each participant's model improves greatly as the number of epochs increases.

5. Result

Afterwards, we continued to validate the correlation between our utilization of reputation values and the importance given to the different graphical snapshots in the GCN. Although reputation values can be compared to attention weights, it is important to determine whether these reputation values are truly consistent with the importance assigned to various graphical snapshots. It is notable that attention weights may not always serve as a strictly interpretable metric. Additionally, we successfully demonstrate the applicability of the proposed GCN by elucidating its ability to explain the behavior of distributed graph models in real-world application scenarios where the importance of graph snapshots is expected to fluctuate over time.
In graph neural networks, the establishment of connections between nodes relies on an attention mechanism that attributes varying degrees of importance to different node connection patterns. As a consequence, for datasets where node labels are uniform across all time intervals, we have the ability to assign higher connection probabilities to nodes belonging to a specific class at a given time step, which fundamentally enhances their relevance in predicting labels.
It has been established by the experiments presented in the GCN-SE paper that there is a strong correlation between attention and importance. Additionally, it has been established that the importance of graph topology and node attributes may fluctuate depending on the model and data used. In order to validate this correlation, a “perturbation” was introduced in the graph snapshots, and the link between the accuracy fluctuations and the attention weights within GCN-SE was investigated. Such an adjustment greatly helps to confirm the relationship between attention and importance.
Figure 8 shows the consistency results of periodic (per-epoch) communication in the distributed scenario. As the epoch progresses from 0 to 30, the overall accuracy gradually changes; during this period, the model undergoes five fusion and interaction processes over 30 iterations but fails to maintain an accuracy level above 95%. These results show that under periodic communication, as the number of epochs increases, the training accuracy not only fluctuates but also fails to guarantee a consistency above 95%, or even close to 90%, which cannot satisfy the need for intelligent labeling consistency.
In comparison with Figure 8, the dotted line in Figure 9 is relatively stable; although it still fluctuates, the model training accuracy stays above 95%. More importantly, the number of model fusions and information interactions is reduced by the reputation-based model fusion mechanism we designed, because the mechanism decides when to interact based on the measured difference, cutting the computational cost of unnecessary information interaction and effectively improving model utilization. As the epoch progresses from 0 to 30, the consensus curve fluctuates during fusion but always remains above the 95% threshold. With the fusion mechanism, the model requires only four iterations of fusion and interaction to achieve its goal: consistency between subsidiaries is maintained above 95% with only four timed communication updates, enhancing consistency accuracy, reducing update time, and lowering computational overhead.
Figure 10 illustrates the correlation between the number of updates and the number of iterations for the four different datasets. All four datasets exhibit remarkable consistency, maintaining over 95% accuracy, accompanied by a synchronized reduction in the frequency of information exchange, thus mitigating the computational and communication overhead. Notably, the fourth dataset reduces this overhead to only three instances.
To further validate the effectiveness of the proposed method, three other federated learning frameworks (distributed learning algorithms) were included in the experiments, given as follows:
  • FedAvg method: FedAvg is a classic federated learning method that distributes model training to multiple clients (such as user devices). Each client trains on its local data and sends the locally updated model parameters to the server, which computes a weighted average of these parameters to produce a new global model.
  • FedProx method: FedProx is an improvement on FedAvg designed specifically to address data and client compute heterogeneity in federated learning. It limits the drift of local model updates by introducing a proximal term into the client's optimization objective, preventing the local model from deviating too far from the global one. The main steps mirror FedAvg, except that each client adds this regularization term during local optimization (a minimal sketch of both baselines follows this list).
  • Proof-of-training-quality blockchain-based federated learning (PoTQBFL): This method combines model training with the consensus process, making better use of users' computing resources. For a specific data-sharing request, members of the consensus committee are selected by retrieving users related to the request from the blockchain; the committee drives the consensus process and also learns the model.
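To make the two classical baselines concrete, here is a minimal, illustrative sketch (not the authors' implementations): FedAvg's data-size-weighted parameter averaging, and FedProx's local step with a proximal term mu/2 * ||w − w_global||² that keeps clients near the global model. Function names, the learning rate, and `mu` are our choices.

```python
# Illustrative FedAvg aggregation and FedProx local update (sketch only).
import numpy as np

def fedavg_aggregate(client_params, client_sizes):
    """Weighted average of client parameter vectors, weighted by data size."""
    w = np.asarray(client_sizes, dtype=float)
    w /= w.sum()
    return sum(wi * p for wi, p in zip(w, client_params))

def fedprox_local_step(w_local, w_global, grad_fn, lr=0.01, mu=0.1):
    """One local gradient step; mu * (w_local - w_global) is the proximal pull."""
    return w_local - lr * (grad_fn(w_local) + mu * (w_local - w_global))

w_global = np.zeros(10)
clients = [w_global + 0.1 * np.random.randn(10) for _ in range(4)]
new_global = fedavg_aggregate(clients, client_sizes=[100, 200, 50, 150])
grad_fn = lambda w: 2 * w                      # toy quadratic-loss gradient
w_next = fedprox_local_step(clients[0], w_global, grad_fn)
```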
Table 4 presents a comparison of average displacement error (ADE) and final displacement error (FDE) for our method against the three baseline methods. Our model attains the lowest ADE and FDE among the compared methods, and its alignment with real-world scenarios and unique contributions to data-sharing security make it a valuable approach.
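For reference, ADE and FDE are conventionally defined as follows (these are the standard trajectory-prediction metrics; the notation is ours, with ŷ_t the predicted and y_t the ground-truth position at step t over a horizon of T steps):

```latex
% Standard trajectory-prediction metrics:
\mathrm{ADE} = \frac{1}{T}\sum_{t=1}^{T}\bigl\lVert \hat{y}_t - y_t \bigr\rVert_2,
\qquad
\mathrm{FDE} = \bigl\lVert \hat{y}_T - y_T \bigr\rVert_2 .
```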
As Figure 11 shows, the total number of interactions aligns with the corresponding number of iterations in Figure 9; for example, Figure 9a corresponds to the first black "+" mark in Figure 8, and so on. The model consistently improves as training progresses. Notably, in Figure 9d the fourth experiment achieved effective results with only three interactions, saving resources and communication costs while increasing the value extracted from the data.
The comparative results against the baselines described above are shown in Figure 12. The comparison experiments show that the results across the various datasets consistently trend upward and exhibit a commendable level of agreement. The method introduced in this paper clearly outperforms the other methods, achieving more than 95% label-training consistency within a distributed data platform and exceeding the expected results. These findings validate the feasibility and effectiveness of the proposed method.

6. Conclusions

In this paper, we presented a novel reputation-based, interpretable, distributed semi-supervised fusion mechanism. This mechanism enhances distributed intelligent labeling systems by reducing computational resources and communication overhead while improving labeling consistency and accuracy. By integrating multiple perspectives, it captures structural information and ensures coherence in the intelligent annotation task. Experimental results clearly demonstrate the superiority of the approach. Notably, the annotation information obtained from different subsidiaries is complementary, underscoring the potential for improving system performance through a well-designed fusion mechanism.
However, the proposed method may increase computational overhead, because the proof of reputation adds extra steps for verifying the quality of local training on client devices; this verification can raise computational requirements, especially for resource-constrained devices such as mobile phones or IoT devices. Scalability is another potential limitation: as the number of clients grows, managing and verifying training-quality proofs from all participants becomes computationally and logistically challenging, which affects large-scale deployments. In addition, the method still needs improvement in combating malicious nodes; although the proposed reputation mechanism aims to ensure the integrity of local training, vulnerabilities may remain if malicious clients find ways to manipulate the training process or proof generation. These limitations suggest several directions for future work. The first is optimizing computational efficiency: future research can focus on lightweight verification methods that reduce the computational burden on clients while maintaining accurate proofs of training quality. The second is a scalable proof mechanism: methods that scale reputation mechanisms efficiently, such as batching proofs or hierarchical verification systems, will be critical for applying this approach to federated learning systems with thousands of clients. Finally, security enhancements and improved incentive structures are needed to defend against malicious nodes: more robust, tamper-proof proof generation will strengthen reputation mechanisms against adversarial behavior, and better incentives for high-quality contributions and proofs can enhance overall model performance and foster more reliable client participation.

Author Contributions

Conceptualization, X.S.; methodology, X.S. and C.Y.; software, X.S. and C.Y.; validation, Y.Z.; formal analysis, X.S.; investigation, C.Y.; resources, X.C.; writing—original draft preparation, X.S.; writing—review and editing, X.C.; visualization, X.C.; supervision, X.C.; funding acquisition, Y.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 62303296.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Comparison of related methods (columns: Method; Advantages; Disadvantages; Communication Overhead; Reputation Mechanism).

FGC [18]
  Advantages: Accurately recommends proper services for participating clients without gathering the raw data.
  Disadvantages: Does not consider communication efficiency or heterogeneity, two elements that must be considered for distributed smart labeling models.
  Communication overhead: Yes. Reputation mechanism: No.

RFA [22]
  Advantages: Endows the aggregation process with greater robustness to potential poisoning of local data or model parameters of participating devices.
  Disadvantages: Suffers from computational complexity and communication overhead, and adaptation to dynamic environments has not yet been studied.
  Communication overhead: No. Reputation mechanism: No.

FedAAM [24]
  Advantages: Improves the convergence speed of models in asynchronous federated learning systems and can achieve sublinear convergence rates.
  Disadvantages: Suffers from communication bottlenecks, model-convergence issues, and data heterogeneity.
  Communication overhead: Yes. Reputation mechanism: No.

FedCBM [25]
  Advantages: Shared concepts act as the information carrier to align client-specific representations and are also applied to enhance the model's accuracy under a supervised learning loss.
  Disadvantages: Takes less account of model complexity and computational cost.
  Communication overhead: No. Reputation mechanism: Yes.

Fedrep (Ours)
  Advantages: Addresses the security, privacy, and consistency problems of the distributed intelligent annotation process while also focusing on communication overhead, introducing a reputation mechanism that helps the system reduce communication overhead while improving accuracy.
  Disadvantages: Overall model accuracy and consistency are limited by flaws in the underlying labeled values themselves and cannot be raised much beyond 95%.
  Communication overhead: Yes. Reputation mechanism: Yes.

References

  1. Bi, Z.; Jin, Y.; Maropoulos, P.; Zhang, W.J.; Wang, L. Internet of things (IoT) and big data analytics (BDA) for digital manufacturing (DM). Int. J. Prod. Res. 2023, 61, 4004–4021.
  2. Ben-Daya, M.; Hassini, E.; Bahroun, Z. Internet of things and supply chain management: A literature review. Int. J. Prod. Res. 2019, 57, 4719–4742.
  3. Katal, A.; Dahiya, S.; Choudhury, T. Energy efficiency in cloud computing data centers: A survey on software technologies. Clust. Comput. 2023, 26, 1845–1875.
  4. Wu, C.; Buyya, R. Cloud Data Centers and Cost Modeling: A Complete Guide to Planning, Designing and Building a Cloud Data Center; Morgan Kaufmann: Cambridge, MA, USA, 2015.
  5. Tawalbeh, L.A.; Muheidat, F.; Tawalbeh, M.; Quwaider, M. IoT privacy and security: Challenges and solutions. Appl. Sci. 2020, 10, 4102.
  6. Tran-Dang, H.; Kim, D.S. The physical internet in the era of digital transformation: Perspectives and open issues. IEEE Access 2021, 9, 164613–164631.
  7. Zhu, X. Emerging Champions in the Digital Economy; Springer: Singapore, 2019.
  8. Bilal, K.; Khalid, O.; Erbad, A.; Khan, S.U. Potentials, trends, and prospects in edge technologies: Fog, cloudlet, mobile edge, and micro data centers. Comput. Netw. 2018, 130, 94–120.
  9. Babbar, R.; Schölkopf, B. Data scarcity, robustness and extreme multi-label classification. Mach. Learn. 2019, 108, 1329–1351.
  10. Huang, J.; Li, G.; Huang, Q.; Wu, X. Learning label-specific features and class-dependent labels for multi-label classification. IEEE Trans. Knowl. Data Eng. 2016, 28, 3309–3323.
  11. Tarekegn, A.N.; Giacobini, M.; Michalak, K. A review of methods for imbalanced multi-label classification. Pattern Recognit. 2021, 118, 107965.
  12. Wang, H.; Liu, W.; Bocchieri, A.; Li, Y. Can multi-label classification networks know what they don't know? Adv. Neural Inf. Process. Syst. 2021, 34, 29074–29087.
  13. Zhang, C.; Xie, Y.; Bai, H.; Yu, B.; Li, W.; Gao, Y. A survey on federated learning. Knowl.-Based Syst. 2021, 216, 106775.
  14. Bharati, S.; Mondal, M.; Podder, P.; Prasath, V.B. Federated learning: Applications, challenges and future directions. Int. J. Hybrid Intell. Syst. 2022, 18, 19–35.
  15. Cai, Z.; Chen, J.; Fan, Y.; Zheng, Z.; Li, K. Blockchain-empowered federated learning: Benefits, challenges, and solutions. arXiv 2024, arXiv:2403.00873.
  16. Cabrero-Holgueras, J.; Pastrana, S. SoK: Privacy-preserving computation techniques for deep learning. Proc. Priv. Enhancing Technol. 2021, 4, 139–162.
  17. Sun, L.; Rao, Y.; Wu, L.; Zhang, X.; Lan, Y.; Nazir, A. Fighting false information from propagation process: A survey. ACM Comput. Surv. 2023, 55, 1–38.
  18. Yin, Y.; Li, Y.; Gao, H.; Liang, T.; Pan, Q. FGC: GCN-based federated learning approach for trust industrial service recommendation. IEEE Trans. Ind. Inform. 2022, 19, 3240–3250.
  19. Liu, Z.; Yang, L.; Fan, Z.; Peng, H.; Yu, P.S. Federated social recommendation with graph neural network. ACM Trans. Intell. Syst. Technol. 2022, 13, 1–24.
  20. Yang, L.; Liu, Z.; Dou, Y.; Ma, J.; Yu, P.S. ConsisRec: Enhancing GNN for social recommendation via consistent neighbor aggregation. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, Canada, 11–15 July 2021; pp. 2141–2145.
  21. Yang, C.; Zou, J.; Wu, J.; Xu, H.; Fan, S. Supervised contrastive learning for recommendation. Knowl.-Based Syst. 2022, 258, 109973.
  22. Pillutla, K.; Kakade, S.M.; Harchaoui, Z. Robust aggregation for federated learning. IEEE Trans. Signal Process. 2022, 70, 1142–1154.
  23. Huang, W.; Ye, M.; Shi, Z.; Li, H.; Du, B. Rethinking federated learning with domain shift: A prototype view. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 16312–16322.
  24. Lu, R.; Zhang, W.; Li, Q.; He, H.; Zhong, X.; Yang, H.; Alazab, M. Adaptive asynchronous federated learning. Future Gener. Comput. Syst. 2024, 152, 193–206.
  25. Yang, J.; Long, G. Concept-guided interpretable federated learning. In Proceedings of the Australasian Joint Conference on Artificial Intelligence, Brisbane, QLD, Australia, 28 November–1 December 2023; Springer Nature: Singapore, 2023; pp. 160–172.
  26. Xu, C.; Chen, G.; Li, C. Federated learning for interpretable short-term residential load forecasting in edge computing network. Neural Comput. Appl. 2023, 35, 8561–8574.
  27. Chen, F.; Long, G.; Wu, Z.; Zhou, T.; Jiang, J. Personalized federated learning with graph. arXiv 2022, arXiv:2203.00829.
  28. Coifman, B.; Li, L. A critical evaluation of the Next Generation Simulation (NGSIM) vehicle trajectory dataset—Abridged (No. 19-03752). In Proceedings of the Transportation Research Board 98th Annual Meeting, Washington, DC, USA, 13–17 January 2019.
  29. US Highway 101 Dataset; Tech. Rep. FHWA-HRT-07-030; Federal Highway Administration (FHWA): Washington, DC, USA, 2007.
  30. Ma, Y.; Zhu, X.; Zhang, S.; Yang, R.; Wang, W.; Manocha, D. TrafficPredict: Trajectory prediction for heterogeneous traffic-agents. arXiv 2018, arXiv:1811.02146.
  31. Deo, N.; Rangesh, A.; Trivedi, M.M. How would surround vehicles move? A unified framework for maneuver classification and motion prediction. IEEE Trans. Intell. Veh. 2018, 3, 129–140.
  32. Kuefler, A.; Morton, J.; Wheeler, T.; Kochenderfer, M. Imitating driver behavior with generative adversarial networks. In Proceedings of the 2017 IEEE Intelligent Vehicles Symposium (IV), Los Angeles, CA, USA, 11–14 June 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 204–211.
  33. Li, X.; Ying, X.; Chuah, M.C. GRIP++: Enhanced graph-based interaction-aware trajectory prediction for autonomous driving. arXiv 2019, arXiv:1907.07792.
Figure 1. Distributed data center computing architecture diagram.
Figure 2. The proposed framework for distributed training based on graph convolutional networks.
Figure 3. The proposed flexible proof-of-reputation mechanism based on a private model, public data, private data, and renewed data.
Figure 4. The reputation-based explainability fusion diagram based on the attention GCN network.
Figure 5. The data-sharing process.
Figure 6. Accuracy as the epoch changes from 0 to 50: an upward trend that stabilizes at about 98%.
Figure 7. The loss function as the epoch changes from 0 to 50: a downward trend that levels off at about 53%.
Figure 8. As the epoch varies from 0 to 30, the global accuracy fluctuates; the model requires five fusion-and-interaction rounds over 30 iterations, yet accuracy cannot be maintained above 95%.
Figure 9. The fused consensus curve fluctuates as the epoch changes from 0 to 30 but stays above 95%; with the fusion mechanism, the model needs only four fusion-and-interaction rounds to reach the goal.
Figure 10. The updating epochs for different companies.
Figure 11. Corresponds to Figure 9, with panels (a), (b), (c), and (d) representing experiments 1, 2, 3, and 4, respectively.
Figure 12. The accuracy of the compared methods.
Table 1. Summary of symbols.

Symbol | Meaning
ψ_1 | The attention to reputation
ψ_2 | The parameter of reputation
ψ_3 | Mean absolute error (MAE) of reputation
DISS(L1, J1) | The normalized difference between subsidiary 1 and the previous state
X | A state vector
Pr | The prediction output of the model
L_1 | Regularization
V_P, V_I | A subset of nodes
d_i | The training data of local model i
β | The number of CPU cycles
m | The number of trained models
ω_n | Local parameters learned by local model n
n | The number of local data centers
c | The learning accuracy loss
M | Markov decision process
P | The policy mapping the state space to the action space
r | The system reward function
λ_t | The action taken by the node-selection agent at time t
S_t | The state at time t
R(s_t, λ_t) | The reward for state s_t and action λ_t
Table 2. The loss values for different participants at epoch 1.

Participant | Quantity (loss per prediction step) | Loss (sum)
Car | 0.525, 0.932, 1.400, 1.880, 2.413, 2.957 | 10.112
Human | 0.163, 0.297, 0.452, 0.614, 0.800, 1.017 | 3.343
Bike | 0.488, 0.964, 1.435, 1.931, 2.446, 2.992 | 10.255
WS | 0.307, 0.571, 0.857, 1.158, 1.485, 1.839 | 6.217
Test_epoch 1 | 0.318, 0.592, 0.893, 1.231, 1.584, 1.951 | 6.569
Table 3. The loss values for different participants at epoch 30.

Participant | Quantity (loss per prediction step) | Loss (sum)
Car | 0.390, 0.733, 1.136, 1.548, 1.952, 2.405 | 8.164
Human | 0.156, 0.288, 0.438, 0.616, 0.800, 1.022 | 3.320
Bike | 0.369, 0.722, 1.086, 1.523, 1.880, 2.370 | 7.950
WS | 0.250, 0.472, 0.720, 1.002, 1.268, 1.595 | 5.307
Test_epoch 30 | 0.249, 0.475, 0.742, 1.031, 1.313, 1.648 | 5.457
Table 4. Average performance comparison.

Index | Proposed | FedAVG | FedProx | PoTQBFL
ADE | 1.11 | 3.98 | 2.86 | 1.45
FDE | 2.01 | 6.75 | 4.25 | 2.64