Research on Water Quality Prediction Model Based on Spatiotemporal Weighted Fusion and Hierarchical Cross-Attention Mechanisms
Abstract
1. Introduction
2. Data Preprocessing
3. Model Structure Design
3.1. TCN Module
3.2. GRU Module
3.3. Weighted Hierarchical Cross Attention Based on Spatiotemporal Features
4. Experimental Results and Analysis
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Chen, Y.; Song, L.; Liu, Y.; Yang, L.; Li, D. A review of the artificial neural network models for water quality prediction. Appl. Sci. 2020, 10, 5776.
- Wang, Q.; Hao, D.; Li, F.; Guan, X.; Chen, P. Development of a new framework to identify pathways from socioeconomic development to environmental pollution. J. Clean. Prod. 2020, 253, 119962.
- Li, J.; Zhang, J.; Liu, L.; Fan, Y.; Li, L.; Yang, Y.; Lu, Z.; Zhang, X. Annual periodicity in planktonic bacterial and archaeal community composition of eutrophic Lake Taihu. Sci. Rep. 2015, 5, 15488.
- Babaeinesami, A.; Tohidi, H.; Ghasemi, P.; Goodarzian, F.; Tirkolaee, E.B. A closed-loop supply chain configuration considering environmental impacts: A self-adaptive NSGA-II algorithm. Appl. Intell. 2022, 52, 13478–13496.
- Katimon, A.; Shahid, S.; Mohsenipour, M. Modeling water quality and hydrological variables using ARIMA: A case study of Johor River, Malaysia. Sustain. Water Resour. Manag. 2018, 4, 991–998.
- Avila, R.; Horn, B.; Moriarty, E.; Hodson, R.; Moltchanova, E. Evaluating statistical model performance in water quality prediction. J. Environ. Manag. 2018, 206, 910–919.
- Jadhav, A.R.; Pathak, P.D.; Raut, R.Y. Water and wastewater quality prediction: Current trends and challenges in the implementation of artificial neural network. Environ. Monit. Assess. 2023, 195, 321.
- Gao, Y.; Zhao, T.; Zheng, Z.; Liu, D. A Cotton Leaf Water Potential Prediction Model Based on Particle Swarm Optimisation of the LS-SVM Model. Agronomy 2023, 13, 2929.
- Sabri, M.; El Hassouni, M. Photovoltaic power forecasting with a long short-term memory autoencoder networks. Soft Comput. 2023, 27, 10533–10553.
- Hu, Z.; Zhang, Y.; Zhao, Y.; Xie, M.; Zhong, J.; Tu, Z.; Liu, J. A water quality prediction method based on the deep LSTM network considering correlation in smart mariculture. Sensors 2019, 19, 1420.
- Chen, L.; Zhang, Y.; Xu, B.; Shao, K.; Yan, J.; Bhatti, U.A. An IoT-based VMD-CNN-BiGRU indoor mariculture water quality prediction method including attention mechanism. Int. J. High Speed Electron. Syst. 2024, 2540010.
- Yan, J.; Liu, J.; Yu, Y.; Xu, H. Water quality prediction in the Luan river based on 1-DRCNN and BiGRU hybrid neural network model. Water 2021, 13, 1273.
- Bi, J.; Lin, Y.; Dong, Q.; Yuan, H.; Zhou, M. Large-scale water quality prediction with integrated deep neural network. Inf. Sci. 2021, 571, 191–205.
- Niu, D.; Yu, M.; Sun, L.; Gao, T.; Wang, K. Short-term multi-energy load forecasting for integrated energy systems based on CNN-BiGRU optimized by attention mechanism. Appl. Energy 2022, 313, 118801.
- Sun, F.; Jin, W. CAST: A convolutional attention spatiotemporal network for predictive learning. Appl. Intell. 2023, 53, 23553–23563.
- Yuan, J.; Li, Y. Wastewater quality prediction based on channel attention and TCN-BiGRU model. Environ. Monit. Assess. 2025, 197, 219.
- Waqas, M.; Humphries, U.W. A critical review of RNN and LSTM variants in hydrological time series predictions. MethodsX 2024, 13, 102946.
- Li, G.; Zhang, A.; Zhang, Q.; Wu, D.; Zhan, C. Pearson correlation coefficient-based performance enhancement of broad learning system for stock price prediction. IEEE Trans. Circuits Syst. II Express Briefs 2022, 69, 2413–2417.
- Rahmad Ramadhan, L.; Anne Mudya, Y. A Comparative Study of Z-Score and Min-Max Normalization for Rainfall Classification in Pekanbaru. J. Data Sci. 2024, 2024, 1–8.
- Bai, S.; Kolter, J.Z.; Koltun, V. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv 2018, arXiv:1803.01271.
- Tan, H.; Shang, Y.; Luo, H.; Lin, T.R. A Combined Temporal Convolutional Network and Gated Recurrent Unit for the Remaining Useful Life Prediction of Rolling Element Bearings. In Proceedings of the International Conference on the Efficiency and Performance Engineering Network, Huddersfield, UK, 29 August–1 September 2023; Springer Nature: Cham, Switzerland, 2023; pp. 853–862.
- Zhao, W.; Gao, Y.; Ji, T.; Wan, X.; Ye, F.; Bai, G. Deep temporal convolutional networks for short-term traffic flow forecasting. IEEE Access 2019, 7, 114496–114507.
- Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958.
- Salimans, T.; Kingma, D.P. Weight normalization: A simple reparameterization to accelerate training of deep neural networks. arXiv 2016, arXiv:1602.07868.
- Chu, Y.; Guo, Z. Attention enhanced spatial temporal neural network for HRRP recognition. In Proceedings of the ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; IEEE: New York, NY, USA, 2021; pp. 3805–3809.
- Lin, H.; Cheng, X.; Wu, X.; Shen, D. Cat: Cross attention in vision transformer. In Proceedings of the 2022 IEEE International Conference on Multimedia and Expo (ICME), Taipei, Taiwan, 18–22 July 2022; IEEE: New York, NY, USA, 2022; pp. 1–6.
- Li, H.; Wu, X.J. CrossFuse: A novel cross attention mechanism based infrared and visible image fusion approach. Inf. Fusion 2024, 103, 102147.
- Barzegar, R.; Aalami, M.T.; Adamowski, J. Short-term water quality variable prediction using a hybrid CNN–LSTM deep learning model. Stoch. Environ. Res. Risk Assess. 2020, 34, 415–433.
- Molnar, C. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable; Leanpub: Victoria, BC, Canada, 2020.
- Antony, T.; Maruthuperumal, S. An effective illustrative approach for water quality prediction using spatio temporal features with cross attention-based adaptive long short-term memory. Intell. Decis. Technol. 2024, 18724981241298876.
Method | Advantages | Limitations |
---|---|---|
ARIMA | Statistically robust; suitable for linear time series prediction. | Lacks nonlinear modeling capability; unclear basis for forecasting accuracy. |
Improved Bayesian network | Relatively successful; demonstrates certain predictive capabilities. | 21% error rate; limited generalization, especially across different water bodies. |
ANN | Effective in handling nonlinear problems; suitable for water and wastewater treatment system prediction. | Prone to overfitting; limited generalization ability. |
LS-SVM | Enhanced prediction performance through swarm optimization; suitable for large-scale datasets. | Computationally complex; low efficiency when handling massive datasets. |
LSTM | Captures long-term dependencies in time series; reliable in forecasting. | Sensitive to training duration and hyperparameter tuning. |
VMD-CNN-BiGRU | Incorporates attention mechanism; improves dissolved oxygen prediction accuracy. | Complex architecture; requires large data support. |
CNN-BiGRU | Captures short-term dependencies in time series; suitable for short-term load forecasting. | May struggle with long-term dependencies due to limited temporal perception. |
CNN-TCN | Enhances long-term temporal correlations using a multi-head attention mechanism. | Performance can be highly sensitive to kernel size, dilation rate, and attention head settings. |
CA-TCN-BiGRU | Combines channel attention and temporal convolution; suitable for multi-parameter water quality prediction. | Complex structure; higher demand for computational optimization. |
Component | Advantages |
---|---|
Bidirectional Sliding Window Preprocessing | Enhances short- and long-term pattern learning; preserves temporal continuity; augments feature diversity; increases data utilization. |
BiGRU | Captures temporal dependencies from both past and future contexts; strong performance on time series; reduces gradient vanishing; faster convergence. |
BiTCN | Enables parallel computation and handles long-range dependencies via dilation; extracts deep spatial and local contextual features; reduces gradient vanishing; faster convergence. |
Weighted cross-layer cross-attention mechanism | Allows dynamic feature weighting and hierarchical interaction across BiGRU and BiTCN outputs; enhances spatiotemporal correlation modeling; improves interpretability and generalization. |
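
To make the interplay of these components concrete, the sketch below shows, in PyTorch, one plausible way to fuse a BiGRU branch and a dilated-convolution (TCN-style) branch through a learnable, weighted cross-attention layer. This is an illustrative reconstruction based only on the table above, not the authors' released code: the class names, layer sizes, and the single dilated Conv1d used as a stand-in for a full BiTCN stack are all assumptions.

```python
# Hypothetical sketch (not the authors' code): fusing BiGRU and TCN-style
# features with a weighted cross-attention layer in PyTorch.
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Lets one feature stream attend to the other, then blends them with a learned weight."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.alpha = nn.Parameter(torch.tensor(0.5))        # learnable fusion weight

    def forward(self, q_feats, kv_feats):
        attended, _ = self.attn(q_feats, kv_feats, kv_feats)
        return self.alpha * attended + (1 - self.alpha) * q_feats

class TSCASketch(nn.Module):
    """Illustrative BiGRU + dilated-Conv1d branches fused by cross-attention."""
    def __init__(self, n_features=5, hidden=64):
        super().__init__()
        self.bigru = nn.GRU(n_features, hidden, batch_first=True, bidirectional=True)
        self.tcn = nn.Sequential(                            # stand-in for a full BiTCN
            nn.Conv1d(n_features, 2 * hidden, kernel_size=3, padding=2, dilation=2),
            nn.ReLU(),
        )
        self.fusion = CrossAttentionFusion(2 * hidden)
        self.head = nn.Linear(2 * hidden, 1)                 # one target, e.g., total phosphorus

    def forward(self, x):                                    # x: [batch, time, features]
        g, _ = self.bigru(x)                                 # [batch, time, 2*hidden]
        t = self.tcn(x.transpose(1, 2)).transpose(1, 2)      # [batch, time, 2*hidden]
        fused = self.fusion(g, t)                            # recurrent features query conv features
        return self.head(fused[:, -1])                       # last time step -> prediction

x = torch.randn(8, 5, 5)                                     # [batch, window=5, variables=5]
print(TSCASketch()(x).shape)                                 # torch.Size([8, 1])
```

In this sketch the learnable scalar `alpha` plays the role of the dynamic feature weighting, while the attention call provides the hierarchical interaction between the two branches described in the table.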
Input Data Size: [6370, 5, 5] | | | | | |
---|---|---|---|---|---
Variable | Turbidity | Ammonia nitrogen | CODMn | Total nitrogen | Total phosphorus (Y label)
Variable | Water temperature | CODMn | Ammonia nitrogen | Total phosphorus | Total nitrogen (Y label)
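
For illustration, a minimal sketch of how sliding windows of this shape could be built is given below. The series length (6375), the random placeholder data, and the function name are assumptions chosen only to reproduce the [6370, 5, 5] tensor shape in the table.

```python
# Hypothetical sketch: building sliding windows of shape [samples, 5, 5]
# from a multivariate water-quality series; column order follows the table above.
import numpy as np

def make_windows(series: np.ndarray, window: int = 5, target_col: int = -1):
    """series: [time, variables]; returns X [n, window, variables] and y [n]."""
    X, y = [], []
    for t in range(len(series) - window):
        X.append(series[t:t + window])             # 5 consecutive time steps, all 5 variables
        y.append(series[t + window, target_col])   # next-step value of the Y-label column
    return np.asarray(X), np.asarray(y)

data = np.random.rand(6375, 5)                     # placeholder for the real monitoring data
X, y = make_windows(data)
print(X.shape, y.shape)                            # (6370, 5, 5) (6370,)
```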
Batch Size | MSE (TP) | MAE (TP) | RMSE (TP) | R² (TP) | MSE (TN) | MAE (TN) | RMSE (TN) | R² (TN)
---|---|---|---|---|---|---|---|---
128 | 0.198 | 0.348 | 0.445 | 0.579 | 0.728 | 0.810 | 0.853 | 0.292 |
64 | 0.076 | 0.218 | 0.275 | 0.857 | 0.154 | 0.357 | 0.393 | 0.874 |
32 | 0.048 | 0.160 | 0.219 | 0.909 | 0.057 | 0.197 | 0.240 | 0.952 |
16 | 0.040 | 0.142 | 0.200 | 0.925 | 0.031 | 0.117 | 0.176 | 0.974 |
8 | 0.036 | 0.132 | 0.192 | 0.950 | 0.024 | 0.087 | 0.156 | 0.980 |
4 | 0.035 | 0.131 | 0.189 | 0.953 | 0.022 | 0.080 | 0.150 | 0.982 |
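
As a rough sketch of how the batch-size setting enters training (assuming a PyTorch pipeline and randomly generated placeholder tensors), the best-performing value in the table, 4, could simply be passed to a DataLoader:

```python
# Hypothetical sketch: iterating mini-batches at batch size 4, the best setting in the table.
import torch
from torch.utils.data import DataLoader, TensorDataset

X = torch.randn(6370, 5, 5)               # windows as in the input-size table
y = torch.randn(6370, 1)                  # total phosphorus (or total nitrogen) targets
loader = DataLoader(TensorDataset(X, y), batch_size=4, shuffle=True)

for xb, yb in loader:                     # one mini-batch per iteration
    print(xb.shape, yb.shape)             # torch.Size([4, 5, 5]) torch.Size([4, 1])
    break
```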
Number of Channels | MSE (TP) | MAE (TP) | RMSE (TP) | R² (TP) | MSE (TN) | MAE (TN) | RMSE (TN) | R² (TN)
---|---|---|---|---|---|---|---|---
[32,32] | 0.046 | 0.165 | 0.216 | 0.941 | 0.013 | 0.114 | 0.177 | 0.974 |
[32,64] | 0.044 | 0.151 | 0.211 | 0.944 | 0.029 | 0.099 | 0.173 | 0.976 |
[64,64] | 0.041 | 0.139 | 0.204 | 0.946 | 0.028 | 0.088 | 0.167 | 0.977 |
[64,128] | 0.035 | 0.135 | 0.189 | 0.951 | 0.022 | 0.085 | 0.151 | 0.981 |
[128,128] | 0.035 | 0.132 | 0.188 | 0.952 | 0.022 | 0.080 | 0.150 | 0.982 |
[128,256] | 0.035 | 0.132 | 0.188 | 0.952 | 0.022 | 0.082 | 0.151 | 0.981 |
[256,256] | 0.035 | 0.131 | 0.188 | 0.952 | 0.023 | 0.083 | 0.152 | 0.981 |
[256,512] | 0.035 | 0.131 | 0.188 | 0.952 | 0.023 | 0.085 | 0.154 | 0.981 |
[512,512] | 0.035 | 0.131 | 0.189 | 0.953 | 0.024 | 0.085 | 0.155 | 0.981 |
[512,1024] | 0.036 | 0.132 | 0.190 | 0.951 | 0.024 | 0.080 | 0.156 | 0.980 |
Comparison Algorithm | MSE (TP) | MAE (TP) | RMSE (TP) | R² (TP) | MSE (TN) | MAE (TN) | RMSE (TN) | R² (TN)
---|---|---|---|---|---|---|---|---
CNN-BiGRU | 0.134 | 0.203 | 0.367 | 0.917 | 0.169 | 0.287 | 0.412 | 0.908 |
1-DRCNN–BiGRU [12] | 0.170 | 0.325 | 0.413 | 0.925 | 0.139 | 0.298 | 0.374 | 0.928 |
VMD-CNN-BiGRU [11] | 0.094 | 0.259 | 0.307 | 0.928 | 0.081 | 0.237 | 0.285 | 0.937 |
CA-TCN-BiGRU [16] | 0.337 | 0.255 | 0.581 | 0.901 | 0.210 | 0.242 | 0.459 | 0.921 |
IOOA-SRF-CAALSTM [29] | 0.187 | 0.227 | 0.433 | 0.924 | 0.068 | 0.205 | 0.261 | 0.949 |
CAST [15] | 0.046 | 0.184 | 0.214 | 0.936 | 0.042 | 0.152 | 0.206 | 0.958 |
TSCA | 0.035 | 0.132 | 0.188 | 0.952 | 0.024 | 0.080 | 0.150 | 0.982 |
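
The four metrics reported above follow their standard definitions; a minimal NumPy sketch (with hypothetical arrays) of how they can be computed is shown below.

```python
# Hypothetical sketch of the four reported metrics: MSE, MAE, RMSE, and R².
import numpy as np

def report_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    err = y_true - y_pred
    mse = np.mean(err ** 2)                               # mean squared error
    mae = np.mean(np.abs(err))                            # mean absolute error
    rmse = np.sqrt(mse)                                   # root mean squared error
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot                            # coefficient of determination
    return {"MSE": mse, "MAE": mae, "RMSE": rmse, "R2": r2}

print(report_metrics(np.array([1.0, 2.0, 3.0]), np.array([1.1, 1.9, 3.2])))
```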
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).