Study of Five-Hundred-Meter Aperture Spherical Telescope Feed Cabin Time-Series Prediction Studies Based on Long Short-Term Memory–Self-Attention
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Preprocessing
- The position must not be at the coordinate origin.
- The tracking speed must not exceed 20 mm/s, which is the maximum movement speed for this state.
- The duration of the tracking state must be at least one minute, corresponding to a minimum of 600 consecutive data points.
- Exclude data points with a speed of less than 1 mm/s to eliminate noise caused by factors such as wind vibration. A minimal sketch of these four filtering rules follows below.
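A minimal sketch of these filtering rules, assuming a 10 Hz sampling rate and hypothetical column names (`state`, `x`, `y`, `z`, `speed`) in a pandas DataFrame; the actual preprocessing pipeline may differ.

```python
import pandas as pd

MIN_SEGMENT_POINTS = 600     # one minute at the 10 Hz sampling rate stated above
MAX_SPEED_MM_S = 20.0        # maximum movement speed in the tracking state
MIN_SPEED_MM_S = 1.0         # slower motion is treated as wind-induced noise

def filter_tracking_data(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the four preprocessing rules to raw feed-cabin samples."""
    df = df[df["state"] == "Tracking"]                                  # tracking state only
    not_origin = ~((df["x"] == 0) & (df["y"] == 0) & (df["z"] == 0))    # rule 1: not at the origin
    speed_ok = (df["speed"] >= MIN_SPEED_MM_S) & (df["speed"] <= MAX_SPEED_MM_S)  # rules 2 and 4
    df = df[not_origin & speed_ok].copy()

    # Rule 3: keep only runs of consecutive samples (by original row index)
    # that last at least one minute (600 consecutive points).
    new_run = df.index.to_series().diff().ne(1)
    run_len = df.groupby(new_run.cumsum())["x"].transform("size")
    return df[run_len >= MIN_SEGMENT_POINTS]
```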
2.2. Model Architecture
2.2.1. Long Short-Term Memory (LSTM)
2.2.2. Self-Attention Mechanism
2.2.3. LSTM-Self-Attention
- The LSTM layers first process the input time-series data, such as the feed cabin’s past position and velocity measurements, capturing both short- and long-term dependencies. Through its three gates (input, forget, and output), the LSTM retains relevant information from earlier time steps so that it can inform predictions at later points.
- After passing through the LSTM layers, the data are fed into the Self-Attention mechanism, which assigns an attention score to each time step in the sequence. This allows the model to ‘attend’ to the most relevant features of the input, such as critical position changes, giving higher weight to informative time steps and lower weight to redundant or irrelevant data.
- To enhance the model’s generalization, we use a dual-head attention mechanism. This means that the attention mechanism evaluates the input sequence from two different perspectives, or ‘heads’. Each head focuses on different aspects of the sequence, improving the model’s ability to capture complex dependencies over time.
- Finally, the output of the Self-Attention layers is passed through a fully connected layer, which converts the weighted sequence features into the final prediction of the feed cabin’s remaining trajectory and future position. A minimal sketch of this pipeline is given after this list.
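A minimal PyTorch sketch of this pipeline (LSTM layers → dual-head Self-Attention → fully connected output head), assuming six input features (position and velocity along X, Y, Z) and a three-dimensional position output. The layer sizes (256 hidden units, 8 LSTM layers, 2 attention heads) follow the training configuration table at the end of this article; the exact wiring is an illustrative assumption, not the authors’ reference implementation.

```python
import torch
import torch.nn as nn

class LSTMSelfAttention(nn.Module):
    """LSTM layers followed by dual-head self-attention and a linear output head."""

    def __init__(self, input_dim: int = 6, hidden_dim: int = 256,
                 num_layers: int = 8, num_heads: int = 2, output_dim: int = 3):
        super().__init__()
        # LSTM stack captures short- and long-term temporal dependencies.
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers=num_layers,
                            batch_first=True)
        # Dual-head self-attention re-weights the LSTM outputs across time steps.
        self.attention = nn.MultiheadAttention(embed_dim=hidden_dim,
                                               num_heads=num_heads,
                                               batch_first=True)
        # Fully connected head maps the attended features to the predicted position.
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, input_dim) -> LSTM features (batch, seq_len, hidden_dim)
        h, _ = self.lstm(x)
        # Self-attention: queries, keys, and values all come from the LSTM output.
        attended, _ = self.attention(h, h, h)
        # Use the last attended time step to predict the next position (x, y, z).
        return self.fc(attended[:, -1, :])

# Example: a batch of 64 sequences, 600 time steps, 6 features (position + velocity).
model = LSTMSelfAttention()
pred = model(torch.randn(64, 600, 6))   # -> shape (64, 3)
```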
2.2.4. Model Fusion and Output
2.3. Handling Device Failures with the LSTM-SA Model
2.4. Training Process
2.5. Performance Evaluation Metrics
3. Results
3.1. Model Performance
3.2. Comparison and Analysis
3.2.1. Comparison of Models’ Performance Metrics
3.2.2. Comparison of Models’ Prediction Results
3.2.3. Error Analysis
4. Discussion and Conclusions
4.1. Key Findings
4.1.1. Model Performance
- The LSTM-SA model exhibited superior performance across all evaluated metrics, including Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and the coefficient of determination (R²); a short computation sketch follows this list. Specifically, the model achieved an MAE of less than 10 mm and an RMSE of approximately 12 mm, comfortably meeting the FAST operational precision requirement of 15 mm. In contrast, the BP and LSTM models produced larger errors, indicating their lesser suitability for this application.
- The LSTM-SA model’s enhanced capability to capture long-range dependencies and reduce the impact of noise and outliers, thanks to the Self-Attention mechanism, is a key factor in its superior performance.
- The LSTM-SA model demonstrates strong predictive capabilities for the FAST feed cabin time-series data, particularly in scenarios involving sensor failures. By combining the long-term dependency capture of the LSTM network with the global feature extraction of the Self-Attention mechanism, the model achieves high prediction accuracy. However, the LSTM-SA model is specifically designed for the FAST feed cabin system and may not be robust to changes in internal structure or external environments beyond this context. Future work could explore the model’s adaptability to different systems or external variables to improve its overall robustness.
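A small NumPy sketch of the three metrics named above (MAE, RMSE, R²), computed between measured and predicted positions; array shapes and variable names are illustrative.

```python
import numpy as np

def evaluate(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """MAE, RMSE, and R^2 between measured and predicted positions (in mm)."""
    err = y_pred - y_true
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "R2": r2}

# Example with placeholder data: arrays of shape (n_samples, 3) for the X, Y, Z axes.
rng = np.random.default_rng(0)
y_true = rng.normal(size=(1000, 3))
print(evaluate(y_true, y_true + rng.normal(scale=0.1, size=y_true.shape)))
```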
4.1.2. Error Analysis
- Detailed error analysis revealed that the error distributions of the LSTM-SA model along the X, Y, and Z axes are centered at approximately 3 mm, and 95% of the errors are below 14.1 mm (Figure 9), demonstrating greater precision and stability than the BP and LSTM models. The model’s error standard deviation was also notably lower, indicating its robustness under varying conditions.
- The model’s performance was particularly strong in predicting the Z-axis position, which is generally more stable during the feed cabin’s motion. This suggests that the model can effectively handle less complex motion patterns, making it suitable for diverse observational tasks.
4.2. Implications and Future Work
- The analysis in Table 1 shows that the model’s prediction accuracy is closely related to the motion modes of the feed cabin (a sketch for computing such per-mode correlations is given after this list). The variation in correlation across motion modes indicates that the model captures position trends well in some modes, such as MultiBeamOTF and OnTheFlyMapping, but poorly in others, such as Tracking and SwiftCalibration. Observation time does not directly correlate with model performance: the Tracking mode has one of the longest observation durations, yet its correlation is low. Future work will focus on improving the model architecture, designing specialized models for specific modes, balancing the dataset, and further analyzing sources of error to enhance overall prediction performance and generalization.
- Addressing model complexity and data dependency: The dual-head Self-Attention mechanism is computationally demanding and reliant on large volumes of high-quality data. These challenges could be mitigated by adopting more efficient architectures and enhanced data augmentation strategies.
- Enhancing feature engineering and real-time implementation: Incorporating additional factors like environmental conditions and more complex motion patterns could further boost the model’s predictive accuracy. Implementing the model in a real-time system for feed cabin positioning could help evaluate its practical applicability and identify areas for further refinement.
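One plausible way to reproduce per-mode correlations such as those in Table 1, assuming a pandas DataFrame with hypothetical columns `obs_mode`, `pred_position`, and `true_position`; the authors’ exact computation may differ.

```python
import pandas as pd

def correlation_by_mode(df: pd.DataFrame) -> pd.Series:
    """Pearson correlation between predicted and measured position, per observation mode."""
    return (df.groupby("obs_mode")
              .apply(lambda g: g["pred_position"].corr(g["true_position"]))
              .sort_values(ascending=False))

# Usage: correlation_by_mode(results_df) returns one correlation value per observation mode,
# which can be compared against the durations and correlations reported in Table 1.
```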
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
Table 1. Observation duration and prediction correlation for each observation mode.

Observation Mode | Duration (s) | Correlation |
---|---|---|
MultiBeamOTF | 47,496 | 0.338666 |
OnTheFlyMapping | 16,674 | 0.302489 |
DriftWithAngle | 68,400 | 0.235603 |
PhaseReferencing | 0 | N/A |
Tracking | 133,977 | 0.060534 |
OnOff | 59,446 | −0.001844 |
SnapShotCal | 25,980 | −0.026541 |
SnapShot | 41,340 | −0.028853 |
SwiftCalibration | 144,520 | −0.111182 |
MultiBeamCalibration | 4800 | −0.469986 |
DecDriftWithAngle | 0 | N/A |
TrackingWithAngle | 0 | N/A |
Parameter | Value |
---|---|
Dataset size | 129.6 million entries |
Time frame | January to May 2023 |
Batch size | 64 |
Learning rate | 0.0001 (initial) |
Optimizer | AdamW |
Training epochs | 1500 or early stopping |
Hidden units (LSTM) | 256 |
Layers (LSTM) | 8 |
Attention heads | 2 (dual-head attention) |
Loss function | Mean Squared Error (MSE) |
Learning rate schedule | ReduceLROnPlateau |
Training-validation split | Training: 80%, Validation: 20% |
Validation strategy | Early stopping based on validation loss |
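A hedged sketch of how the training configuration listed above could be wired together in PyTorch (AdamW with a 0.0001 initial learning rate, MSE loss, ReduceLROnPlateau, up to 1500 epochs, early stopping on validation loss, 80/20 train–validation split). The stand-in model, synthetic data, and patience value are assumptions for illustration only.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-in model and synthetic data; in practice these would be the LSTM-SA model
# sketched in Section 2.2.3 and the preprocessed feed-cabin sequences.
model = nn.Sequential(nn.Flatten(), nn.Linear(600 * 6, 3))
X, Y = torch.randn(500, 600, 6), torch.randn(500, 3)
train_loader = DataLoader(TensorDataset(X[:400], Y[:400]), batch_size=64, shuffle=True)  # 80%
val_loader = DataLoader(TensorDataset(X[400:], Y[400:]), batch_size=64)                  # 20%

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)                  # AdamW, initial LR 0.0001
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min")
criterion = nn.MSELoss()

best_val, patience, bad_epochs = float("inf"), 50, 0                        # patience is an assumption
for epoch in range(1500):                                                   # up to 1500 epochs
    model.train()
    for xb, yb in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():
        val_loss = sum(criterion(model(xb), yb).item() for xb, yb in val_loader)
    scheduler.step(val_loss)                                                # reduce LR on plateau

    if val_loss < best_val:                                                 # early stopping on validation loss
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break
```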