Anomaly Detection Method for Multivariate Time Series Data of Oil and Gas Stations Based on Digital Twin and MTAD-GAN
Abstract
1. Introduction
- We propose a digital twin operation framework in which the knowledge graph of the physical station is decomposed and mapped to a virtual station. A stochastic Petri net is designed to describe the behavior logic of the station and to achieve an efficient virtual mapping.
- To address the problems of diverse patterns, variable working conditions, and imbalanced samples, we propose MTAD-GAN, which exploits the latent relationships between time-series variables and enhances the features of multivariate time series by combining knowledge graph attention with a temporal Hawkes attention mechanism. The ADGS scoring loss function is designed to estimate the probability distribution of the samples learned by the network and thereby complete the anomaly detection.
- Experiments on several datasets show that integrating the proposed MTAD-GAN improves accuracy, precision, F1, and AUCROC. The results indicate that MTAD-GAN detects anomalies effectively and outperforms both state-of-the-art deep learning methods and traditional methods.
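To make the idea of a temporal Hawkes attention mechanism more concrete, the following is a minimal sketch rather than the formulation used in MTAD-GAN, which this outline does not give. It assumes a common Hawkes-style form: softmax attention over past hidden states, plus an excitation term that decays exponentially with the time elapsed since each step. The function name `hawkes_attention` and the parameters `epsilon` (excitation strength) and `beta` (decay rate) are illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hawkes_attention(hidden, query, timestamps, epsilon=1.0, beta=0.5):
    """Aggregate hidden states with attention plus a decaying excitation term.

    hidden:     list of T hidden-state vectors, each a list of d floats
    query:      list of d floats used to score each time step
    timestamps: list of T increasing times, one per hidden state
    epsilon, beta: assumed excitation strength and decay rate (hypothetical)
    """
    # Dot-product attention scores, normalised with softmax.
    scores = [sum(h_k * q_k for h_k, q_k in zip(h, query)) for h in hidden]
    weights = softmax(scores)

    t_now = timestamps[-1]
    dim = len(query)
    out = [0.0] * dim
    for h, w, t in zip(hidden, weights, timestamps):
        # Exponential kernel: older steps contribute less excitation.
        decay = math.exp(-beta * (t_now - t))
        for k in range(dim):
            attended = w * h[k]
            out[k] += attended + epsilon * max(attended, 0.0) * decay
    return out
```

With `epsilon=0.0` the function reduces to plain softmax attention, which makes the contribution of the decay term easy to isolate when experimenting.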
2. Related Works
2.1. Digital Twin
2.2. Deep Learning
2.3. Transfer Learning
3. Proposed Method
3.1. SSUPS Based Digital Twin Framework
3.1.1. Virtual and Reality Mapping Based on SPN
3.1.2. Digital Twin Definitions
3.2. MTAD-GAN Network
3.2.1. Network Structure
3.2.2. KH-LSTM Network Structure
3.2.3. Knowledge Graph Attention Module
3.2.4. Hawkes Attention Module
3.2.5. Knowledge-Aware Transfer Learning
3.3. Abnormal Score
4. Experiments
4.1. Datasets
4.1.1. KDD99
4.1.2. SWaT
4.1.3. WADI
4.1.4. J10031
4.1.5. SKAB
4.1.6. DAMADICS
4.1.7. MSL and SMAP
4.1.8. SMD
4.2. Evaluation Indicators
4.3. Results and Analysis
4.4. Ablation Experiment
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Rong, H.; Teixeira, A.; Soares, C.G. Data mining approach to shipping route characterization and anomaly detection based on AIS data. Ocean Eng. 2020, 198, 106936.
- Jamil, F.; Kim, D. An ensemble of prediction and learning mechanism for improving accuracy of anomaly detection in network intrusion environments. Sustainability 2021, 13, 10057.
- Chow, J.K.; Su, Z.; Wu, J.; Tan, P.S.; Mao, X.; Wang, Y.H. Anomaly detection of defects on concrete structures with the convolutional autoencoder. Adv. Eng. Inform. 2020, 45, 101105.
- Farzad, A.; Gulliver, T.A. Unsupervised log message anomaly detection. ICT Express 2020, 6, 229–237.
- Liu, J.; Song, K.; Feng, M.; Yan, Y.; Tu, Z.; Zhu, L. Semi-supervised anomaly detection with dual prototypes autoencoder for industrial surface inspection. Opt. Lasers Eng. 2021, 136, 106324.
- Cauteruccio, F.; Cinelli, L.; Corradini, E.; Terracina, G.; Ursino, D.; Virgili, L.; Savaglio, C.; Liotta, A.; Fortino, G. A framework for anomaly detection and classification in Multiple IoT scenarios. Future Gener. Comput. Syst. 2021, 114, 322–335.
- Niu, Z.; Yu, K.; Wu, X. LSTM-based VAE-GAN for time-series anomaly detection. Sensors 2020, 20, 3738.
- Jiang, W.; Hong, Y.; Zhou, B.; He, X.; Cheng, C. A GAN-based anomaly detection approach for imbalanced industrial time series. IEEE Access 2019, 7, 143608–143619.
- Geiger, A.; Liu, D.; Alnegheimish, S.; Cuesta-Infante, A.; Veeramachaneni, K. TadGAN: Time series anomaly detection using generative adversarial networks. In Proceedings of the 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA, 10–13 December 2020; pp. 33–43.
- Zeng, Z.; Jin, G.; Xu, C.; Chen, S.; Zeng, Z.; Zhang, L. Satellite Telemetry Data Anomaly Detection Using Causal Network and Feature-Attention-Based LSTM. IEEE Trans. Instrum. Meas. 2022, 71, 1–21.
- Choi, Y.; Lim, H.; Choi, H.; Kim, I.J. Gan-based anomaly detection and localization of multivariate time series data for power plant. In Proceedings of the 2020 IEEE International Conference on Big Data and Smart Computing (BigComp), Busan, Republic of Korea, 19–22 February 2020; pp. 71–74.
- Nachman, B.; Shih, D. Anomaly detection with density estimation. Phys. Rev. D 2020, 101, 075042.
- Primartha, R.; Tama, B.A. Anomaly detection using random forest: A performance revisited. In Proceedings of the 2017 International Conference on Data and Software Engineering (ICoDSE), Palembang, Indonesia, 1–2 November 2017; pp. 1–6.
- Li, J.; Izakian, H.; Pedrycz, W.; Jamal, I. Clustering-based anomaly detection in multivariate time series data. Appl. Soft Comput. 2021, 100, 106919.
- Idé, T.; Lozano, A.C.; Abe, N.; Liu, Y. Proximity-based anomaly detection using sparse structure learning. In Proceedings of the 2009 SIAM International Conference on Data Mining, SIAM, Sparks, NV, USA, 30 April–2 May 2009; pp. 97–108.
- Zeng, X.; Yang, M.; Bo, Y. Gearbox oil temperature anomaly detection for wind turbine based on sparse Bayesian probability estimation. Int. J. Electr. Power Energy Syst. 2020, 123, 106233.
- Sakurada, M.; Yairi, T. Anomaly detection using autoencoders with nonlinear dimensionality reduction. In Proceedings of the MLSDA 2014 2nd Workshop on Machine Learning for Sensory Data Analysis, Gold Coast, Australia, 2 December 2014; pp. 4–11.
- Zhou, X.; Hu, Y.; Liang, W.; Ma, J.; Jin, Q. Variational LSTM enhanced anomaly detection for industrial big data. IEEE Trans. Ind. Inform. 2020, 17, 3469–3477.
- Zhu, M.; Ye, K.; Wang, Y.; Xu, C.Z. A deep learning approach for network anomaly detection based on AMF-LSTM. In Proceedings of the IFIP International Conference on Network and Parallel Computing; Springer: Berlin/Heidelberg, Germany, 2018; pp. 137–141.
- Chen, A.; Fu, Y.; Zheng, X.; Lu, G. An efficient network behavior anomaly detection using a hybrid DBN-LSTM network. Comput. Secur. 2022, 114, 102600.
- Provotar, O.I.; Linder, Y.M.; Veres, M.M. Unsupervised anomaly detection in time series using lstm-based autoencoders. In Proceedings of the 2019 IEEE International Conference on Advanced Trends in Information Theory (ATIT), Kyiv, Ukraine, 18–20 December 2019; pp. 513–517.
- Que, Z.; Liu, Y.; Guo, C.; Niu, X.; Zhu, Y.; Luk, W. Real-time anomaly detection for flight testing using AutoEncoder and LSTM. In Proceedings of the 2019 International Conference on Field-Programmable Technology (ICFPT), Tianjin, China, 9–13 December 2019; pp. 379–382.
- Kang, J.; Kim, C.S.; Kang, J.W.; Gwak, J. Anomaly detection of the brake operating unit on metro vehicles using a one-class lstm autoencoder. Appl. Sci. 2021, 11, 9290.
- Li, L.; Lei, B.; Mao, C. Digital twin in smart manufacturing. J. Ind. Inf. Integr. 2022, 26, 100289.
- Priyanka, E.; Thangavel, S.; Gao, X.Z.; Sivakumar, N. Digital twin for oil pipeline risk estimation using prognostic and machine learning techniques. J. Ind. Inf. Integr. 2022, 26, 100272.
- Yang, M.; Moon, J.; Jeong, J.; Sin, S.; Kim, J. A Novel Embedding Model Based on a Transition System for Building Industry-Collaborative Digital Twin. Appl. Sci. 2022, 12, 553.
- Salem, T.; Dragomir, M. Options for and Challenges of Employing Digital Twins in Construction Management. Appl. Sci. 2022, 12, 2928.
- Guo, K.; Wan, X.; Liu, L.; Gao, Z.; Yang, M. Fault diagnosis of intelligent production line based on digital twin and improved random forest. Appl. Sci. 2021, 11, 7733.
- Tao, F.; Sui, F.; Liu, A.; Qi, Q.; Zhang, M.; Song, B.; Guo, Z.; Lu, S.C.Y.; Nee, A.Y. Digital twin-driven product design framework. Int. J. Prod. Res. 2019, 57, 3935–3953.
- Sacks, R.; Brilakis, I.; Pikas, E.; Xie, H.S.; Girolami, M. Construction with digital twin information systems. Data-Centric Eng. 2020, 1, e14.
- Li, Z.; Li, J.; Wang, Y.; Wang, K. A deep learning approach for anomaly detection based on SAE and LSTM in mechanical equipment. Int. J. Adv. Manuf. Technol. 2019, 103, 499–510.
- Lin, S.; Clark, R.; Birke, R.; Schönborn, S.; Trigoni, N.; Roberts, S. Anomaly detection for time series using vae-lstm hybrid model. In Proceedings of the ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; pp. 4322–4326.
- Li, Y.; Shi, Z.; Liu, C.; Tian, W.; Kong, Z.; Williams, C.B. Augmented time regularized generative adversarial network (atr-gan) for data augmentation in online process anomaly detection. IEEE Trans. Autom. Sci. Eng. 2021, 19, 3338–3355.
- Tang, T.W.; Kuo, W.H.; Lan, J.H.; Ding, C.F.; Hsu, H.; Young, H.T. Anomaly detection neural network with dual auto-encoders GAN and its industrial inspection applications. Sensors 2020, 20, 3336.
- Zhu, G.; Zhao, H.; Liu, H.; Sun, H. A novel LSTM-GAN algorithm for time series anomaly detection. In Proceedings of the 2019 Prognostics and System Health Management Conference (PHM-Qingdao), Qingdao, China, 25–27 October 2019; pp. 1–6.
- Wang, Y.; Du, X.; Lu, Z.; Duan, Q.; Wu, J. Improved lstm-based time-series anomaly detection in rail transit operation environments. IEEE Trans. Ind. Inform. 2022, 18, 9027–9036.
- Zhu, Y.; Zhu, C.; Tan, J.; Tan, Y.; Rao, L. Anomaly detection and condition monitoring of wind turbine gearbox based on LSTM-FS and transfer learning. Renew. Energy 2022, 189, 90–103.
- Torrey, L.; Shavlik, J. Transfer learning. In Handbook of Research on Machine Learning Applications and Trends: Algorithms, Methods, and Techniques; IGI Global: Hershey, PA, USA, 2010; pp. 242–264.
- Vercruyssen, V.; Meert, W.; Davis, J. Transfer learning for time series anomaly detection. In Proceedings of the Workshop and Tutorial on Interactive Adaptive Learning@ ECMLPKDD 2017; CEUR Workshop Proceedings: Aachen, Germany, 2017; Volume 1924, pp. 27–37.
- Andrews, J.; Tanay, T.; Morton, E.J.; Griffin, L.D. Transfer Representation-Learning for Anomaly Detection; JMLR: New York, NY, USA, 2016.
- Yang, B.; Xu, S.; Lei, Y.; Lee, C.G.; Stewart, E.; Roberts, C. Multi-source transfer learning network to complement knowledge for intelligent diagnosis of machines with unseen faults. Mech. Syst. Signal Process 2022, 162, 108095.
- Liniger, T.J. Multivariate Hawkes Processes. Ph.D. Thesis, ETH Zurich, Zürich, Switzerland, 2009.
- KDD99 Dataset. UCI KDD Archive. 1999. Available online: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html (accessed on 10 January 2020).
- Goh, J.; Adepu, S.; Junejo, K.N.; Mathur, A. A dataset to support research in the design of secure water treatment systems. In Proceedings of the Critical Information Infrastructures Security: 11th International Conference, CRITIS 2016, Paris, France, 10–12 October 2016; Revised Selected Papers 11. Springer: Berlin/Heidelberg, Germany, 2017; pp. 88–99.
- Ahmed, C.M.; Palleti, V.R.; Mathur, A.P. WADI: A water distribution testbed for research in the design of secure cyber physical systems. In Proceedings of the 3rd International Workshop on Cyber-Physical Systems for Smart Water Networks, Pittsburgh, PA, USA, 21 April 2017; pp. 25–28.
- Skoltech Anomaly Benchmark (SKAB). I. D. Katser and V. O. Kozitsin. 2020. Available online: https://www.kaggle.com/dsv/1693952 (accessed on 8 May 2021).
- Damadics Benchmark Website. 2020. Available online: http://diag.mchtr.pw.edu.pl/damadics/ (accessed on 4 March 2019).
- Hundman, K.; Constantinou, V.; Laporte, C.; Colwell, I.; Soderstrom, T. Detecting spacecraft anomalies using lstms and nonparametric dynamic thresholding. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 387–395.
- Su, Y.; Zhao, Y.; Niu, C.; Liu, R.; Sun, W.; Pei, D. Robust anomaly detection for multivariate time series through stochastic recurrent neural network. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 2828–2837.
- Li, S.; Wen, J. A model-based fault detection and diagnostic methodology based on PCA method and wavelet transform. Energy Build. 2014, 68, 63–71.
- Garg, A.; Zhang, W.; Samaran, J.; Savitha, R.; Foo, C.S. An evaluation of anomaly detection and diagnosis in multivariate time series. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 2508–2517.
- Habler, E.; Shabtai, A. Using LSTM encoder-decoder algorithm for detecting anomalous ADS-B messages. Comput. Secur. 2018, 78, 155–173.
- Aygun, R.C.; Yavuz, A.G. Network anomaly detection with stochastically improved autoencoder based models. In Proceedings of the 2017 IEEE 4th international conference on cyber security and cloud computing (CSCloud), New York, NY, USA, 26–28 June 2017; pp. 193–198.
- Lin, K.; Sheng, S.; Zhou, Y.; Liu, F.; Li, Z.; Chen, H.; Xu, C.Y.; Chen, J.; Guo, S. The exploration of a temporal convolutional network combined with encoder-decoder framework for runoff forecasting. Hydrol. Res. 2020, 51, 1136–1149.
References | Type | Broad Area | Specific Area | Technology |
---|---|---|---|---|
Tao et al. [29] (2019) | Case Study | Manufacturing | Product Design | Big Data |
Sacks et al. [30] (2020) | Conception | Smart City | Construction Industry | BIM, Construction Planning and Control |
Guo et al. [28] (2021) | Case Study | Manufacturing | Production Line | Transfer Learning, IRF |
Salem and Dragomir [27] (2022) | Conception | Smart City | Construction Industry | BIM, AI, Monitoring |
Li et al. [24] (2022) | Case Study | Manufacturing | Smart Manufacturing | Analytics, Evaluation |
Priyanka et al. [25] (2022) | Case Study | Manufacturing | Oil Pipeline | Risk Estimation, SVM |
Yang et al. [26] (2022) | Case Study | Manufacturing | Log Data Mining | GRU, AI |
Ours | Case Study | Manufacturing | Oil and Gas Station | Petri Net, Knowledge Graph |
References | Techniques | Outcomes | Limitations |
---|---|---|---|
LSTM-GAN [35] (2019) | LSTM, GAN | Higher accuracy rate compared to traditional methods | Relationships between multidimensional variables are ignored |
SAE-LSTM [31] (2019) | Stacked AE, LSTM | Higher detection rate on multi-feature sequences | Unbalanced data have not been considered |
VAE-LSTM [32] (2020) | VAE, LSTM | Higher recall rate and F1 score compared to standard methods | The relationship between multivariates has not been considered |
DAGAN [34] (2020) | Dual AE, GAN | High AUC is maintained even with a small quantity of training data | Unbalanced sample distribution has not been considered |
ATR-GAN [33] (2021) | Data augmentation, GAN | Using data augmentation to solve data imbalances | Data augmentation only considers a single sensor |
Improved LSTM [36] (2022) | LSTM | Time series data anomaly detection for diverse distributions | Unable to detect anomalies for small quantity of data |
LSTM-FS [37] (2022) | LSTM, FS, Transfer learning | Using transfer learning to reduce the differences between data distributions | Power status has a large impact on monitoring data |
MTAD-GAN (proposed) | LSTM, GAN, Digital twin, Transfer learning | Using a digital twin to address data imbalance and small data quantity | Performance could be improved by exploring relationships within the time series and adjusting the network structure for a higher anomaly detection rate |
Place | Meaning |
---|---|
P1 | Data collection |
P2 | Abnormal data |
P3 | Low level warning |
P4 | Normal data |
P5 | Advanced warning |
P6 | Detection feedback information |
P7 | On-site monitoring notice |
P8 | Alert information |
P9 | On-site warning information |
P10 | Historical information data |
P11 | Pre-plan library |
P12 | Accident data |
P13 | Plan confirmed |
P14 | Loss estimate |
P15 | Plan optimization |
P16 | Resume operation |
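The places listed above are the state-holding nodes of the stochastic Petri net (SPN). As a minimal sketch of how such a net executes, the code below simulates a small fragment using places P1 to P4 only. The transitions, their arcs, and their firing rates are hypothetical, since this outline does not list them. The firing rule is the standard SPN race policy: each enabled transition samples an exponentially distributed delay, and the shortest delay fires.

```python
import random

# Hypothetical transitions over a subset of the places listed above.
# Each entry: (name, input places, output places, firing rate).
# Names, arcs, and rates are illustrative assumptions, not taken from the paper.
TRANSITIONS = [
    ("t_normal", ["P1"], ["P4"], 9.0),    # collected data judged normal
    ("t_abnormal", ["P1"], ["P2"], 1.0),  # collected data judged abnormal
    ("t_warn", ["P2"], ["P3"], 2.0),      # abnormal data raises a low level warning
]


def enabled(marking, transition):
    """A transition is enabled when every input place holds at least one token."""
    _, inputs, _, _ = transition
    return all(marking.get(place, 0) > 0 for place in inputs)


def step(marking, rng):
    """Fire one transition under the race policy; return its name, or None if dead."""
    candidates = [t for t in TRANSITIONS if enabled(marking, t)]
    if not candidates:
        return None
    # Sample an exponential delay per enabled transition; the shortest one wins.
    delays = [(rng.expovariate(t[3]), t) for t in candidates]
    _, (name, inputs, outputs, _rate) = min(delays, key=lambda d: d[0])
    for place in inputs:
        marking[place] -= 1
    for place in outputs:
        marking[place] = marking.get(place, 0) + 1
    return name


def simulate(initial_tokens, seed=0):
    """Run the net from `initial_tokens` tokens in P1 until no transition is enabled."""
    rng = random.Random(seed)
    marking = {"P1": initial_tokens}
    fired = []
    while (name := step(marking, rng)) is not None:
        fired.append(name)
    return marking, fired
```

Calling `simulate(20)` moves every token out of P1 and leaves it in either P4 (normal data) or P3 (low level warning). The split between the two depends on the assumed rates and the random seed.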
Dataset | No. Sensors | Normal Data | Data with Attacks | Attacks | Attack Duration (mins) |
---|---|---|---|---|---|
KDD99 | 34 | 562,387 | 494,021 | 2 | NA |
SWaT | 51 | 496,800 | 449,919 | 36 | 2–25 |
WADI | 127 | 1,048,571 | 172,801 | 15 | 1.5–30 |
J10031 | 29 | 43,194 | 42,150 | 5 | 2–30 |
SKAB | 20 | 9401 | 35,600 | 12 | 2.4–9.8 |
DAMADICS | 12 | 8546 | 9542 | 5 | 2.1–5.6 |
MSL | 10 | 2160 | 2731 | 2 | 11–1141 |
SMAP | 24 | 2556 | 8071 | 4 | 31–4218 |
SMD | 16 | 25,300 | 25,301 | 3 | 2–3160 |
Prediction Model | Training Time (h) | Prediction Time (s) | A (%) |
---|---|---|---|
PCA | 4.5 | 2.2 | 80.23 |
RS | 4.3 | 3.8 | 88.25 |
UAE | 4.5 | 2.3 | 82.61 |
LSTM-ED | 5.1 | 4.5 | 87.52 |
AutoEncoder | 5.6 | 3.1 | 88.71 |
TcnED | 4.6 | 3.9 | 93.21 |
LSTM-GAN | 5.5 | 4.6 | 76.44 |
MTAD-GAN | 4.6 | 3.4 | 95.70 |
Dataset | TcnED | MTAD-GAN |
---|---|---|
SMAP | 76.64 | 88.46 |
SMD | 80.46 | 86.88 |
DAMADICS | 80.28 | 87.46 |
SKAB | 78.52 | 86.59 |
MSL | 80.21 | 88.62 |
KDD99 | 85.72 | 94.57 |
SWaT | 89.44 | 97.15 |
WADI | 88.35 | 98.54 |
Model | Metric | KDD99 | SWaT | WADI | J10031 |
---|---|---|---|---|---|
LSTM-GAN | P | 0.720 | 0.712 | 0.765 | 0.775 |
LSTM-GAN | R | 0.796 | 0.720 | 0.711 | 0.784 |
LSTM-GAN | F1 | 0.844 | 0.722 | 0.624 | 0.753 |
K-GAN | P | 0.780 | 0.745 | 0.782 | 0.842 |
K-GAN | R | 0.842 | 0.775 | 0.738 | 0.826 |
K-GAN | F1 | 0.882 | 0.754 | 0.652 | 0.826 |
H-GAN | P | 0.820 | 0.740 | 0.794 | 0.817 |
H-GAN | R | 0.944 | 0.842 | 0.810 | 0.827 |
H-GAN | F1 | 0.864 | 0.833 | 0.645 | 0.792 |
MTAD-GAN | P | 0.945 | 0.932 | 0.956 | 0.972 |
MTAD-GAN | R | 0.960 | 0.955 | 0.931 | 0.912 |
MTAD-GAN | F1 | 0.965 | 0.890 | 0.796 | 0.904 |
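For readers unfamiliar with the P, R, and F1 rows, the sketch below computes precision, recall, and F1 from binary labels in which 1 marks an anomaly. It uses the standard binary definitions. This outline does not say how the tabulated scores were aggregated, so this definition is an assumption and need not reproduce the values in the table. The function name `precision_recall_f1` is illustrative.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels where 1 marks an anomaly."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed anomalies

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

For example, with `y_true = [1, 1, 1, 0, 0, 0, 1, 0]` and `y_pred = [1, 1, 0, 0, 1, 0, 1, 0]` there are three true positives, one false alarm, and one missed anomaly, so all three scores equal 0.75.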
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lian, Y.; Geng, Y.; Tian, T. Anomaly Detection Method for Multivariate Time Series Data of Oil and Gas Stations Based on Digital Twin and MTAD-GAN. Appl. Sci. 2023, 13, 1891. https://doi.org/10.3390/app13031891