A Short-Term Traffic Flow Prediction Method Based on Personalized Lightweight Federated Learning
Abstract
1. Introduction
- A significant factor limiting the further improvement of prediction accuracy is the low quality of traffic data. Most cities rely on data collected from intersection cameras, while only a few have deployed specialized equipment such as magnetic or sensor loops on specific roads to gather traffic flow data. Consequently, traffic data in many areas suffer from varying degrees of missing information or other quality issues. Under these circumstances, the performance of traditional centralized training models is greatly constrained.
- Data privacy is also a critical issue: traffic data held by different cities or data collection agencies are often not shared with one another. As a result, the dataset of a single agency is frequently insufficient to train a prediction model to adequate accuracy. If traffic flow prediction models could be trained collaboratively without revealing each party’s data, the amount of usable training data would increase significantly, thereby enhancing the model’s predictive capabilities. This is particularly important for cities or regions with limited data. However, the strict confidentiality requirements of cities and data agencies currently limit data sharing and joint training, so a solution to the related data privacy and confidentiality issues is needed.
- With the continuous development of machine learning models, the models applied to short-term traffic flow prediction have become increasingly complex, and scholars tend to combine and stack machine learning models. While this approach can improve prediction accuracy to some extent, it also increases model complexity and computational burden, which in turn reduces computational efficiency.
- We introduce a dynamic model pruning strategy that performs pruning on the local models of the clients. This approach reduces the number of model parameters that need to be uploaded, thereby enhancing the communication efficiency of federated learning.
- We propose a personalized federated learning method. In practice, clients vary in the amount of data they possess, which leads to discrepancies in their influence on the model parameters during local training. Clients with substantial data can disproportionately impact the final global model, compromising the personalization of other clients and creating an unfair situation. To rectify this, we incorporate a feature attention mechanism into the global model parameter aggregation phase. When the server aggregates the model parameters uploaded by the clients, it calculates the distance between each client’s current local model parameters and the pre-processed global model parameters to quantify the differences between models. Based on these differences, we assign a customized weight to each client’s model to enhance its individuality. Afterward, the client models are combined using these weights to determine the final global model parameters, ensuring that the unique characteristics of each model are preserved while they learn from one another.
- We propose a spatiotemporal fusion model that integrates a graph attention network (GAT), a graph convolutional network (GCN), and a multi-head temporal convolutional network (MH-TCN). Initially, MH-TCN is utilized to extract temporal features from the time series data, capturing both local and long-range temporal dependencies. Subsequently, a multi-head GAT is employed to compute attention scores for the relationships between road nodes, weighting the spatial neighbor subgraphs at different time points. The GCN is then used to extract spatial correlations from the dynamically weighted subgraphs.
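The distance-based aggregation in the second contribution can be illustrated with a minimal NumPy sketch. It assumes each model is flattened into one parameter vector and, as a labelled assumption, converts Euclidean distances into weights with a softmax over negative distances, so that clients closer to the previous global model receive larger weights. The actual mapping from distance to weight is not given in this section, and the names `distance_weights`, `aggregate`, and `temperature` are hypothetical.

```python
import numpy as np

def distance_weights(client_params, global_params, temperature=1.0):
    """Turn each client's Euclidean distance from the global model into a weight.

    Assumption (not from the article): softmax over negative distances,
    so a smaller distance yields a larger weight.
    """
    distances = np.array([np.linalg.norm(p - global_params) for p in client_params])
    scores = -distances / temperature
    scores -= scores.max()                  # shift for numerical stability
    weights = np.exp(scores)
    return distances, weights / weights.sum()

def aggregate(client_params, weights):
    """Weighted sum of the client parameter vectors."""
    return np.tensordot(weights, np.stack(client_params), axes=1)

# Toy example: three clients with flattened 3-parameter models.
global_params = np.zeros(3)
client_params = [
    np.array([0.1, 0.0, 0.0]),
    np.array([0.0, 1.0, 0.0]),
    np.array([2.0, 0.0, 0.0]),
]
distances, weights = distance_weights(client_params, global_params)
new_global = aggregate(client_params, weights)
print(distances, weights, new_global)
```

A different monotone mapping (for example, inverse distance) could be swapped in without changing the rest of the sketch.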
2. Related Work
2.1. Federated Learning
2.2. Research on Model Lightweighting
2.3. Personalization of Federated Learning
- Simple personalization: The initial client personalization method uses the client’s feature vectors as input and incorporates them into the global model training process in a simple manner, for example, by adjusting the bias or weight of the model using information such as the client’s language environment and location. Dinh et al. proposed a personalized federated learning algorithm that utilizes the Moreau envelope regularization technique to learn personalized models for each client [16]. This algorithm allows for efficient communication between the clients and the server while preserving individual client privacy. Experimental results on benchmark datasets demonstrate that this method can achieve higher accuracy and faster convergence compared with traditional federated learning algorithms. The algorithm has potential applications in various fields where personalized models are required.
- Client selective feedback: Client selective feedback is an improved client personalization method that takes feedback from the client as input and selectively integrates feedback information into the training of the global model to improve model accuracy and convergence speed. Fallah et al. proposed a model-agnostic meta-learning (MAML)-based personalized federated learning algorithm that can learn a personalized model for each client with theoretical guarantees [17]. The algorithm can effectively adapt to the differences in data distributions and local models of different clients by using a small number of samples. Theoretical analysis shows that the algorithm can achieve the optimal rate of convergence for the personalized models. Experimental results on benchmark datasets demonstrate its effectiveness in achieving higher accuracy and faster convergence than existing methods.
- Client caching: Client caching is a method of storing model parameters locally on the client to accelerate global model training. The client trains based on its own feature vectors and local training data, and then uploads the updated model parameters to the server for global model updates. Yang et al. proposed a federated learning method that utilizes different client caching strategies to improve the personalized performance of the model [18]. The method uses a weighted federated averaging algorithm to update model parameters and improves client performance through personalized adjustments to the cached parameters. In experiments on the MNIST and CIFAR-10 datasets, it significantly reduces communication traffic and runtime while maintaining accuracy.
- Client adaptation: Client adaptation is a more advanced method of client personalization. It adapts to the characteristics of the client’s local data by adjusting the client’s hyperparameters, activation functions, and similar settings, improving the generalization ability and prediction performance of the model. Li et al. proposed a federated learning method based on parameter personalization [19]. This method allows different devices to upload different parameters, calculates the similarity of the parameters, and combines it with the global model for personalized training. The experimental results show that this method can significantly improve the performance of personalized devices while ensuring global model performance. In addition, it achieves better performance when the number of devices is small and is robust to device failures and the dynamic addition of devices.
3. Methodology
3.1. Personalized Lightweight Federated Learning Framework
3.2. Federated Learning Collaborative Training Framework
- (1) Pre-train the initial global model on the server side to obtain the initial weight parameters of the global model.
- (2) Distribute the initial weight parameters of the server-side global model to each client through the communication network.
- (3) The client receives the initial weight parameters of the global model sent by the server and iteratively trains the model using local data.
- (4) The client uploads the weight parameters of the local model completed in this round of training to the server through the communication network.
- (5) The server aggregates the weight parameters of the model transmitted by the various clients to obtain a new global model.
- (6) Continue with the next round of communication and repeat steps (2)–(5) to obtain the final prediction model after reaching the set number of communication rounds.
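The training loop described in the steps above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the article's implementation: the local model is a toy linear regressor, the server starts from zeros instead of a pre-trained model, aggregation is a plain unweighted mean rather than the personalized scheme, and the names `local_train` and `federated_training` are hypothetical.

```python
import numpy as np

def local_train(weights, X, y, lr=0.1, epochs=5):
    """Step 3: a client refines the received weights on its own data (MSE gradient descent)."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_training(clients, init_weights, rounds=20):
    global_w = init_weights.copy()                     # step 1 (zeros stand in for pre-training)
    for _ in range(rounds):                            # step 6: repeat for a fixed number of rounds
        # steps 2-4: send global weights out, train locally, collect uploads
        uploads = [local_train(global_w, X, y) for X, y in clients]
        global_w = np.mean(uploads, axis=0)            # step 5: aggregate (simple mean here)
    return global_w

# Three simulated clients with different amounts of local data.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for n in (30, 50, 80):
    X = rng.normal(size=(n, 2))
    y = X @ true_w + 0.01 * rng.normal(size=n)
    clients.append((X, y))

final_w = federated_training(clients, np.zeros(2))
print(final_w)
```

Only model weights cross the client boundary in this loop; the raw `(X, y)` data never leave `local_train`, which is the property the framework relies on.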
3.3. Local Model
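As a point of reference for the graph convolution component named in the contributions, the following is a minimal sketch of one standard GCN propagation step, ReLU(D^{-1/2}(A + I)D^{-1/2} X W), on a toy four-node road graph. It shows only the generic layer, not the article's fused MH-TCN, GAT, and GCN model; the graph, feature sizes, and function names are assumptions.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops added."""
    a_hat = adj + np.eye(adj.shape[0])
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]

def gcn_layer(adj, features, weight):
    """One graph convolution followed by ReLU."""
    return np.maximum(normalized_adjacency(adj) @ features @ weight, 0.0)

# Toy road graph: four nodes connected in a chain 0-1-2-3.
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))   # e.g. 8 temporal features per road node
weight = rng.normal(size=(8, 2))     # projection to 2 hidden channels
hidden = gcn_layer(adj, features, weight)
print(hidden.shape)
```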
3.4. Model Pruning
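A minimal sketch of the kind of client-side pruning described in the contributions: magnitude-based pruning whose sparsity grows over communication rounds, so that fewer parameters need to be uploaded. The magnitude criterion, the linear schedule, and the names `prune_by_magnitude` and `sparsity_schedule` are assumptions made for illustration and should not be read as the article's pruning algorithm.

```python
import numpy as np

def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)                    # number of weights to remove
    if k == 0:
        return weights.copy(), np.ones(weights.shape, dtype=bool)
    threshold = np.partition(flat, k - 1)[k - 1]     # k-th smallest magnitude
    mask = np.abs(weights) > threshold               # True for surviving weights
    return weights * mask, mask

def sparsity_schedule(round_idx, total_rounds, final_sparsity=0.5):
    """Hypothetical schedule: sparsity rises linearly to `final_sparsity`."""
    return final_sparsity * min(1.0, (round_idx + 1) / total_rounds)

rng = np.random.default_rng(0)
local_weights = rng.normal(size=(4, 5))

# Last of 10 rounds: the schedule reaches the final sparsity of 0.5.
pruned, mask = prune_by_magnitude(local_weights, sparsity_schedule(9, 10))

# Only surviving values and their positions would need to be uploaded.
upload_values = pruned[mask]
upload_indices = np.flatnonzero(mask)
print(upload_values.size, "of", local_weights.size, "parameters uploaded")
```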
3.5. Client Personalization Mechanism
3.6. Model Aggregation
4. Data Description
5. Experimental Analysis
5.1. Experimental Setup and Evaluation Function
5.2. Model Pruning Experiment
5.3. Personalized Experiment
5.4. Traffic Flow Prediction Experiment Under Personalized Lightweight Federated Learning Framework
- LSTM (long short-term memory network): LSTM is a special type of recurrent neural network (RNN) designed specifically to address sequence learning problems [25].
- TGCN (temporal graph convolutional network): TGCN combines graph convolutional networks (GCNs) and gated recurrent units (GRUs) for modeling spatiotemporal data. TGCN achieves efficient learning and prediction of spatiotemporal data by performing convolution operations on graph-structured data, making it suitable for tasks like traffic flow prediction [26].
- DCRNN (diffusion convolutional recurrent neural network): DCRNN is a model that combines diffusion convolution with recurrent neural networks, focusing on spatiotemporal sequence prediction. By simulating the information diffusion process on a graph, DCRNN can more accurately predict spatiotemporal data, making it especially useful for traffic flow prediction and sensor network data analysis [27].
- STGCN (spatiotemporal graph convolutional network): STGCN is a model that combines graph convolutional networks and one-dimensional convolutional neural networks (1D-CNN). It jointly learns spatial and temporal features to efficiently model and predict spatiotemporal data, applicable to fields such as traffic and meteorology [28].
5.5. Comparison of Prediction Performance of Different Personalized Federated Learning Methods on Different Clients
5.6. Discussion on Traffic Flow Prediction, Land Use, and Reflecting Human Activities
- (a) Traffic flow prediction with land use: Traffic flow prediction serves as a valuable tool for reflecting the traffic demand and population density across different regions, which in turn helps to identify the core activity areas and potential development zones within a city. For instance, predictive analytics can uncover the traffic characteristics of various functional areas such as commercial districts, residential neighborhoods, and industrial zones, offering a reference for functional land use zoning. If an area consistently experiences higher traffic flow than others over an extended period, it may suggest a higher commercial value, making it suitable for commercial land use planning. Conversely, areas with lower traffic flow but convenient access may be more appropriate for residential land use. Furthermore, traffic flow prediction can also reveal the strength of inter-regional connections; frequent traffic between certain areas may indicate a potential for collaboration or complementary functions, which could be prioritized for the construction of traffic corridors or for joint regional economic development. In addition, traffic flow prediction plays a crucial role in evaluating the feasibility and potential impact of land use planning proposals. During the planning phase of new developments or projects, simulating changes in traffic flow makes it possible to predict the plan’s impact on surrounding traffic, assess the rationality of road network loads, and determine the need for additional transportation infrastructure, such as roads and public transportation stops. Moreover, the outcomes of traffic flow predictions also support the development of green cities. For example, by optimizing land use layout, it is possible to reduce traffic congestion and commuting distances, which can lead to decreased energy consumption and lower carbon emissions.
- (b) Traffic flow prediction with human activities: Traffic flow prediction reflects the spatiotemporal distribution of vehicles and crowds, indirectly revealing the dynamic characteristics of human activities. Traffic data are significant indicators of human activity; for instance, peak commuting hours reflect the concentration of work-related travel, while changes in traffic around business districts or tourist attractions can indicate the intensity of shopping or tourism activities. Therefore, by forecasting traffic variations across different times and areas, it is possible to create heat maps of human activity, uncovering the functional attributes and dynamic changes of specific regions. This capability not only provides data support for studying human activity patterns but also helps identify hotspots of activity and potential traffic pressure points. Furthermore, traffic flow prediction offers vital guidance for traffic planning and urban management. In the short term, it can optimize real-time traffic control measures, such as adjusting traffic signal timing or dynamically allocating public transportation resources, to alleviate congestion during peak periods. In the long term, traffic flow prediction can provide insights into the growth trends of regional traffic, offering a reference for infrastructure expansion, road network optimization, and public transportation system planning. In conjunction with intelligent transportation systems, traffic flow prediction also supports autonomous driving route planning and the construction of smart cities, thereby enhancing the efficiency of urban operations and the quality of life for residents. Therefore, the integration of multi-source data and real-time analysis techniques will make traffic flow prediction even more significant in human activity research and urban planning.
6. Conclusions
- Conducting network-level federated learning predictions, fully considering the spatial connectivity between each road section and intersection, to further improve prediction accuracy.
- Utilizing different client initial models, as different machine learning models perform differently in traffic flow prediction. By selecting better initial models, the communication efficiency and accuracy of federated learning can be improved, thereby reducing the computational burden.
- Fully considering the influence of external factors when inputting the model and exploring the extent to which each factor affects the final prediction accuracy.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
Method | Parameters | FLOPs | MAE
---|---|---|---
Normal | | | 4.12
SMP | | | 5.06
DMP | | | 4.18
References
1. He, X.; Zhang, W.; Li, X.; Zhang, X. TEA-GCN: Transformer-Enhanced Adaptive Graph Convolutional Network for Traffic Flow Forecasting. Sensors 2024, 24, 7086.
2. Huang, H.; Fang, Z.; Wang, Y.; Tang, J.; Fu, X. Analysing taxi customer-search behaviour using Copula-based joint model. Transp. Saf. Environ. 2022, 4, tdab033.
3. Ma, C.; Dai, G.; Zhou, J. Short-Term Traffic Flow Prediction for Urban Road Sections Based on Time Series Analysis and LSTM_BILSTM Method. IEEE Trans. Intell. Transp. Syst. 2021, 23, 5615–5624.
4. Li, L.; Zhang, J.; Wang, Y.; Ran, B. Missing Value Imputation for Traffic-Related Time Series Data Based on a Multi-View Learning Method. IEEE Trans. Intell. Transp. Syst. 2018, 20, 2933–2943.
5. Meng, X.; Tang, J.; Yang, F.; Wang, Z. Lane-changing trajectory prediction based on multi-task learning. Transp. Saf. Environ. 2023, 5, tdac073.
6. Zhao, Y.; Li, M.; Lai, L.; Suda, N.; Civin, D.; Chandra, V. Federated learning with non-IID data. arXiv 2022, arXiv:1806.00582.
7. Wei, K.; Li, J.; Ding, M.; Ma, C.; Yang, H.H.; Farokhi, F.; Jin, S.; Quek, T.Q.S.; Poor, H.V. Federated Learning with Differential Privacy: Algorithms and Performance Analysis. IEEE Trans. Inf. Forensics Secur. 2020, 15, 3454–3469.
8. Qi, Y.; Hossain, M.S.; Nie, J.; Li, X. Privacy-preserving blockchain-based federated learning for traffic flow prediction. Future Gener. Comput. Syst. 2020, 117, 328–337.
9. Dong, F.; Ge, X.; Li, Q.; Zhang, J.; Shen, D.; Liu, S.; Liu, X.; Li, G.; Wu, F.; Luo, J. PADP-FedMeta: A personalized and adaptive differentially private federated meta learning mechanism for AIoT. J. Syst. Arch. 2022, 134, 102754.
10. Gregurić, M.; Vrbanić, F.; Ivanjko, E. Impact of federated deep learning on vehicle-based speed control in mixed traffic flows. J. Parallel Distrib. Comput. 2023, 186, 104812.
11. LeCun, Y.; Denker, J.S.; Solla, S. Optimal brain damage. In Advances in Neural Information Processing Systems 2; Touretzky, D., Ed.; MIT Press: Cambridge, MA, USA, 1989; pp. 598–605.
12. Luo, J.; Wu, J.; Lin, W. ThiNet: A filter level pruning method for deep neural network compression. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Venice, Italy, 22–29 October 2017; pp. 5058–5066.
13. Hinton, G.; Vinyals, O.; Dean, J. Distilling the knowledge in a neural network. arXiv 2015, arXiv:1503.02531.
14. Frankle, J.; Carbin, M. The lottery ticket hypothesis: Finding sparse, trainable neural networks. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018.
15. Han, S.; Mao, H.; Dally, W.J. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. arXiv 2016, arXiv:1510.00149.
16. Dinh, C.T.; Tran, N.H.; Nguyen, T.D. Personalized federated learning with Moreau envelopes. arXiv 2022, arXiv:2006.08848.
17. Fallah, A.; Mokhtari, A.; Ozdaglar, A. Personalized federated learning: A meta-learning approach. arXiv 2020, arXiv:2002.07948.
18. Yang, G.; Wang, S.; Wang, H. Federated Learning with Personalized Local Differential Privacy. In Proceedings of the 2021 IEEE 6th International Conference on Computer and Communication Systems (ICCCS), Chengdu, China, 23–26 April 2021; pp. 484–489.
19. Li, Y.; Qin, X.; Chen, H.; Han, K.; Zhang, P. Energy-Aware Edge Association for Cluster-Based Personalized Federated Learning. IEEE Trans. Veh. Technol. 2022, 71, 6756–6761.
20. Liu, Y.; Yu, J.J.Q.; Kang, J.; Niyato, D.; Zhang, S. Privacy-Preserving Traffic Flow Prediction: A Federated Learning Approach. IEEE Internet Things J. 2020, 7, 7751–7763.
21. Xia, M.; Jin, D.; Chen, J. Short-Term Traffic Flow Prediction Based on Graph Convolutional Networks and Federated Learning. IEEE Trans. Intell. Transp. Syst. 2022, 24, 1191–1203.
22. Yuan, X.; Chen, J.; Yang, J.; Zhang, N.; Yang, T.; Han, T.; Taherkordi, A. FedSTN: Graph Representation Driven Federated Learning for Edge Computing Enabled Urban Traffic Flow Prediction. IEEE Trans. Intell. Transp. Syst. 2022, 24, 8738–8748.
23. Zhang, C.; Zhang, S.; James, J.Q.; Yu, S. FASTGNN: A Topological Information Protected Federated Learning Approach for Traffic Speed Forecasting. IEEE Trans. Ind. Inform. 2021, 17, 8464–8474.
24. Dai, G.; Tang, J.; Zeng, J.; Hu, C.; Zhao, C. Road network traffic flow prediction: A personalized federated learning method based on client reputation. Comput. Electr. Eng. 2024, 120, 109678.
25. Tian, Y.; Li, P. Predicting Short-Term Traffic Flow by Long Short-Term Memory Recurrent Neural Network. In Proceedings of the IEEE Conference on Smart City/SocialCom/SustainCom (SmartCity), Chengdu, China, 19–21 December 2015; pp. 153–158.
26. Sun, L.; Liu, M.; Liu, G.; Chen, X.; Yu, X. FD-TGCN: Fast and dynamic temporal graph convolution network for traffic flow prediction. Inf. Fusion 2024, 106, 102291.
27. Li, Y.; Yu, R.; Shahabi, C.; Liu, Y. Diffusion convolutional recurrent neural network: Data-driven traffic forecasting. arXiv 2018, arXiv:1707.01926.
28. Zhang, Q.; Li, C.; Su, F.; Li, Y. Spatiotemporal Residual Graph Attention Network for Traffic Flow Forecasting. IEEE Internet Things J. 2023, 10, 11518–11532.
29. Liu, Y.; Wang, J.; Liu, Q.; Gheisari, M.; Xu, W.; Jiang, Z.L.; Zhang, J. FedTC: A Personalized Federated Learning Method with Two Classifiers. Comput. Mater. Contin. 2023, 76, 3013–3027.
30. Guo, Q.; Qi, Y.; Qi, S.; Wu, D.; Li, Q. FedMCSA: Personalized federated learning via model components self-attention. Neurocomputing 2023, 560, 126831.
31. Yi, L.; Shi, X.; Wang, N.; Wang, G.; Liu, X.; Shi, Z.; Yu, H. pFedKT: Personalized federated learning with dual knowledge transfer. Knowl. Based Syst. 2024, 292, 111633.
| Formula Symbols | Explanation |
|---|---|
| C | Number of clients |
| | Model parameters of the server during the nth communication round |
| | Model parameters for client c in the (n+1)th communication round |
| | Distance between two sets of neural network parameters, calculated using the Euclidean distance formula |
| | Importance weight for a client model |
| | Model parameters of server layer l |
| | Model parameters of layer l for the c-th local client |
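As a sketch in assumed notation, the quantities described in the table above can be written out as follows. The symbols $w^{n}$, $w_c^{n+1}$, $w_{l}^{n}$, $w_{c,l}^{n+1}$, $d$, and $\alpha_c$ are illustrative choices rather than the article's own, and the weights are assumed to be normalized to sum to one; the mapping from distance to weight is not stated in this section and is left unspecified.

```latex
d\left(w_c^{n+1}, w^{n}\right)
  = \sqrt{\sum_{l} \left\lVert w_{c,l}^{n+1} - w_{l}^{n} \right\rVert_2^{2}},
\qquad
w^{n+1} = \sum_{c=1}^{C} \alpha_c \, w_c^{n+1},
\qquad
\sum_{c=1}^{C} \alpha_c = 1 .
```

Here $d$ is the Euclidean distance between client $c$'s parameters and the server parameters, accumulated over layers $l$, and $\alpha_c$ is the importance weight of client $c$ derived from that distance.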
Method | Total (s) | Average per round (s)
---|---|---
Normal | 565 | 11.3
SMP | 360 | 7.2
DMP | 385 | 7.7
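The timings in the table above can be cross-checked with a few lines of arithmetic. The sketch below derives the number of communication rounds implied by each row and the relative time saved against the normal run; the variable names are hypothetical and the figures are copied from the table.

```python
# (total seconds, average seconds per round) copied from the table above
timings = {"Normal": (565, 11.3), "SMP": (360, 7.2), "DMP": (385, 7.7)}

# Implied number of communication rounds: total time / per-round average
rounds = {name: round(total / avg) for name, (total, avg) in timings.items()}

# Relative time saved versus the unpruned ("Normal") run, in percent
baseline = timings["Normal"][0]
saving_pct = {
    name: round(100 * (baseline - total) / baseline, 1)
    for name, (total, _) in timings.items()
}

print(rounds)
print(saving_pct)
```

All three rows imply 50 communication rounds, and SMP and DMP take about 36.3% and 31.9% less total time than the normal run, respectively.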
Model | RMSE | MAE
---|---|---
LSTM | 9.12/10.09/11.63 | 6.23/6.97/7.95
TGCN | 8.43/8.93/9.51 | 5.59/5.95/6.58
DCRNN | 7.86/8.03/8.22 | 4.89/5.01/5.32
STGCN | 7.57/7.88/8.09 | 4.55/4.82/5.16
PLFL | 7.11/7.63/8.21 | 4.05/4.79/4.98
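For reference, the two error measures reported in the table above have the following standard definitions. The notation is an assumption for illustration: $y_i$ is the observed traffic flow, $\hat{y}_i$ the predicted value, and $N$ the number of test samples.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^{2}},
\qquad
\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|
```

Lower values are better for both; RMSE penalizes large individual errors more heavily than MAE because the errors are squared before averaging.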
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Dai, G.; Tang, J. A Short-Term Traffic Flow Prediction Method Based on Personalized Lightweight Federated Learning. Sensors 2025, 25, 967. https://doi.org/10.3390/s25030967