Application of Improved Asynchronous Advantage Actor Critic Reinforcement Learning Model on Anomaly Detection
Abstract
1. Introduction
1.1. Traditional Methods for Anomaly Detection
Classification | Representative Model | Type |
---|---|---|
Classifier based | One-Class SVM [2], unsupervised NN | Unsupervised Machine Learning |
Nearest neighbors | KNN [3], LOF [4], COF [5], … | Unsupervised Machine Learning |
Clustering based | CBLOF [5], LDCOF [5], … | Unsupervised Machine Learning |
Statistical based | HBOS [5], … | Unsupervised Machine Learning |
Subspace based | rPCA [5], … | Unsupervised Machine Learning |
Ensembles and Combination | Isolation Forest [6], FeatureBagging [7] | Ensembles |
Neural Networks | Auto-encoder with FNN, LSTM [8], … | Deep Neural Network |
Generative model | GAN [9], … | Deep Neural Network |
Others | Information theory based | Others |
Others | Spectral decomposition based | Others |
Others | Visualization based | Others |
Others | Reinforcement Learning | Others |
1.2. Deep Reinforcement Learning for Anomaly Detection
2. Related Research
- (1) We elaborated the basic principles and modelling processes of multiple related DL and RL methods, and compared the strengths and weaknesses of the individual methods.
- (2) We proposed an improved RL architecture based on the A3C framework that uses policy and value DNNs suited to the specific domain: an RNN structure with an attention mechanism for the state value in sequential tasks, and a CNN for images.
- (3) We compared the performance of the proposed anomaly detection models with machine learning models, GAN, and RL model variants. Empirical tests on three benchmark datasets showed that the proposed method outperformed, or was at least comparable to, the SOTA models in detecting anomalies.
3. Preliminaries
3.1. Reinforcement Learning
3.2. Generative Adversarial Learning
4. Materials and Methods
4.1. Definitions
4.2. Anomaly Detector Architecture
Algorithm 1. Algorithm for anomaly detection based on A3C.
5. Results
5.1. AWID Datasets Test
5.2. Time Series Anomaly Test
5.3. NSL-KDD Network Anomaly Test
6. Discussion
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
A3C | Asynchronous Advantage Actor-Critic |
A2C | Advantage Actor-Critic |
(D)RL | (Deep) Reinforcement Learning |
MDP | Markov Decision Process |
MRP | Markov Reward Process |
LSTM | Long Short-Term Memory |
TCN | Temporal Convolutional Network |
GAN | Generative Adversarial Network |
NN | Neural Network |
RNN | Recurrent Neural Network |
MLP | Multilayer Perceptron |
FCN | Fully Convolutional Network |
(D)DQN | (Double) Deep Q-Network |
MC | Monte Carlo |
TD | Temporal-difference |
CNN | Convolutional Neural Network |
References
- Zimek, A.; Schubert, E. Outlier Detection. In Encyclopedia of Database Systems; Liu, L., Özsu, M.T., Eds.; Springer: New York, NY, USA, 2017; pp. 1–5.
- Moya, M.M.; Hush, D.R. Network constraints and multi-objective optimization for one-class classification. Neural Netw. 1996, 9, 463–474.
- Altman, N.S. An Introduction to Kernel and Nearest-Neighbor Nonparametric Regression. Am. Stat. 1992, 46, 175–185.
- Breunig, M.M.; Kriegel, H.P.; Ng, R.T.; Sander, J. LOF: Identifying Density-Based Local Outliers. SIGMOD Rec. 2000, 29, 93–104.
- Goldstein, M.; Uchida, S. A comparative evaluation of unsupervised anomaly detection algorithms for multivariate data. PLoS ONE 2016, 11, e0152173.
- Liu, F.T.; Ting, K.M.; Zhou, Z. Isolation Forest. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, Pisa, Italy, 15–19 December 2008; pp. 413–422.
- Lazarevic, A.; Kumar, V. Feature Bagging for Outlier Detection. In Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining; Association for Computing Machinery, Chicago, IL, USA, 21–24 August 2005; KDD’05. pp. 157–166.
- Said Elsayed, M.; Le-Khac, N.A.; Dev, S.; Jurcut, A.D. Network Anomaly Detection Using LSTM Based Autoencoder. In Proceedings of the 16th ACM Symposium on QoS and Security for Wireless and Mobile Networks, Alicante, Spain, 16–20 November 2020; Association for Computing Machinery: New York, NY, USA, 2020; pp. 37–45.
- Di Mattia, F.; Galeone, P.; De Simoni, M.; Ghelfi, E. A survey on gans for anomaly detection. arXiv 2019, arXiv:1906.11632.
- Moustafa, N.; Hu, J.; Slay, J. A holistic review of Network Anomaly Detection Systems: A comprehensive survey. J. Netw. Comput. Appl. 2019, 128, 33–55.
- Fernandes, G.; Rodrigues, J.J.; Carvalho, L.F.; Al-Muhtadi, J.F.; Proença, M.L. A comprehensive survey on network anomaly detection. Telecommun. Syst. 2019, 70, 447–489.
- Domingues, R.; Filippone, M.; Michiardi, P.; Zouaoui, J. A comparative evaluation of outlier detection algorithms: Experiments and analyses. Pattern Recognit. 2018, 74, 406–421.
- Wang, H.; Bah, M.J.; Hammad, M. Progress in Outlier Detection Techniques: A Survey. IEEE Access 2019, 7, 107964–108000.
- Chalapathy, R.; Chawla, S. Deep Learning for Anomaly Detection: A Survey. arXiv 2019, arXiv:1901.03407.
- Bulusu, S.; Kailkhura, B.; Li, B.; Varshney, P.K.; Song, D. Anomalous Example Detection in Deep Learning: A Survey. IEEE Access 2020, 8, 132330–132347.
- Xu, X. Sequential anomaly detection based on temporal-difference learning: Principles, models and case studies. Appl. Soft Comput. J. 2010, 10, 859–867.
- Oh, M.H.; Iyengar, G. Sequential anomaly detection using inverse reinforcement learning. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Anchorage, AK, USA, 4–8 August 2019; ACM: New York, NY, USA, 2019; pp. 1480–1490.
- He, Y.; Richard Yu, F.; Zhao, N.; Leung, V.C.; Yin, H. Software-Defined Networks with Mobile Edge Computing and Caching for Smart Cities: A Big Data Deep Reinforcement Learning Approach. IEEE Commun. Mag. 2017, 55, 31–37.
- Zhu, H.; Cao, Y.; Wang, W.; Jiang, T.; Jin, S. Deep Reinforcement Learning for Mobile Edge Caching: Review, New Features, and Open Issues. IEEE Netw. 2018, 32, 50–57.
- Xiao, L.; Wan, X.; Dai, C.; Du, X.; Chen, X.; Guizani, M. Security in Mobile Edge Caching with Reinforcement Learning. IEEE Wirel. Commun. 2018, 25, 116–122.
- Akazaki, T.; Liu, S.; Yamagata, Y.; Duan, Y. Falsification of Cyber-Physical Systems; Springer: Cham, Switzerland, 2018; Volume 2, pp. 456–465.
- Yu, Z.; Machado, P.; Zahid, A.; Abdulghani, A.M.; Dashtipour, K.; Heidari, H.; Imran, M.A.; Abbasi, Q.H. Energy and Performance Trade-Off Optimization in Heterogeneous Computing via Reinforcement Learning. Electronics 2020, 9, 1812.
- Han, G.; Xiao, L.; Poor, H.V. Two-dimensional anti-jamming communication based on deep reinforcement learning. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, LA, USA, 5–9 March 2017; pp. 2087–2091.
- Xiao, L.; Li, Y.; Han, G.; Liu, G.; Zhuang, W. PHY-Layer Spoofing Detection with Reinforcement Learning in Wireless Networks. IEEE Trans. Veh. Technol. 2016, 65, 10037–10047.
- Wan, X.; Sheng, G.; Li, Y.; Xiao, L.; Du, X. Reinforcement Learning Based Mobile Offloading for Cloud-Based Malware Detection. In Proceedings of the 2017 IEEE Global Communications Conference, GLOBECOM, Singapore, 4–8 December 2017; Volume 2018, pp. 1–6.
- Xiao, L.; Li, Y.; Han, G.; Dai, H.; Poor, H.V. A Secure Mobile Crowdsensing Game with Deep Reinforcement Learning. IEEE Trans. Inf. Forensics Secur. 2018, 13, 35–47.
- Chatterjee, M.; Namin, A.S. Detecting phishing websites through deep reinforcement learning. In Proceedings of the 2019 IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC), Milwaukee, WI, USA, 15–19 July 2019; Volume 2, pp. 227–232.
- Kurt, M.N.; Ogundijo, O.; Li, C.; Wang, X. Online Cyber-Attack Detection in Smart Grid: A Reinforcement Learning Approach. IEEE Trans. Smart Grid 2018, 10, 5174–5185.
- Lu, H.; Li, Y.; Mu, S.; Wang, D.; Kim, H.; Serikawa, S. Vehicles Using Reinforcement Learning. IEEE Internet Things J. 2018, 5, 2315–2322.
- Mahmud, M.; Kaiser, M.S.; Hussain, A.; Vassanelli, S. Applications of Deep Learning and Reinforcement Learning to Biological Data. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 2063–2079.
- Otoum, S.; Kantarci, B.; Mouftah, H. Empowering Reinforcement Learning on Big Sensed Data for Intrusion Detection. In Proceedings of the 2019 IEEE International Conference on Communications (ICC), Shanghai, China, 20–24 May 2019.
- Lopez-Martin, M.; Carro, B.; Sanchez-Esguevillas, A. Application of deep reinforcement learning to intrusion detection for supervised problems. Expert Syst. Appl. 2020, 141, 112963.
- Alauthman, M.; Aslam, N.; Al-kasassbeh, M.; Khan, S.; Al-Qerem, A.; Raymond Choo, K.K. An efficient reinforcement learning-based Botnet detection approach. J. Netw. Comput. Appl. 2020, 150, 102479.
- Malialis, K.; Kudenko, D. Distributed response to network intrusions using multiagent reinforcement learning. Eng. Appl. Artif. Intell. 2015, 41, 270–284.
- Hendrycks, D.; Mazeika, M.; Dietterich, T. Deep anomaly detection with outlier exposure. In Proceedings of the 7th International Conference on Learning Representations, (ICLR), New Orleans, LA, USA, 6–9 May 2019; pp. 1–18.
- Wang, S.; Zeng, Y.; Liu, X.; Zhu, E.; Yin, J.; Xu, C.; Kloft, M. Effective End-to-end Unsupervised Outlier Detection via Inlier Priority of Discriminative Network. NeurIPS 2019, 1–14. Available online: https://ml.informatik.uni-kl.de/publications/2019/NeurIPS19_final.pdf (accessed on 1 February 2021).
- Zhu, Y.; Yang, K. Tripartite Active Learning for Interactive Anomaly Discovery. IEEE Access 2019, 7, 63195–63203.
- Zong, B.; Song, Q.; Min, M.R.; Cheng, W.; Lumezanu, C.; Cho, D.; Chen, H. Deep autoencoding Gaussian mixture model for unsupervised anomaly detection. In Proceedings of the 6th International Conference on Learning Representations, (ICLR), Vancouver, BC, Canada, 30 April–3 May 2018; pp. 1–19.
- Zhou, C.; Paffenroth, R.C. Anomaly detection with robust deep autoencoders. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; ACM: New York, NY, USA, 2017. Part F1296. pp. 665–674.
- Chen, J.; Sathe, S.; Aggarwal, C.; Turaga, D. Outlier detection with autoencoder ensembles. In Proceedings of the 17th SIAM International Conference on Data Mining, (SDM 2017), Houston, TX, USA, 27–29 April 2017; pp. 90–98.
- Hundman, K.; Constantinou, V.; Laporte, C.; Colwell, I.; Soderstrom, T. Detecting spacecraft anomalies using LSTMs and nonparametric dynamic thresholding. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, London, UK, 19–23 August 2018; ACM: New York, NY, USA, 2018; pp. 387–395.
- Lee, T.J.; Gottschlich, J.; Tatbul, N.; Metcalf, E.; Zdonik, S. Greenhouse: A Zero-Positive Machine Learning System for Time-Series Anomaly Detection. arXiv 2018, arXiv:1801.03168.
- Habler, E.; Shabtai, A. Using LSTM encoder-decoder algorithm for detecting anomalous ADS-B messages. Comput. Secur. 2018, 78, 155–173.
- Ergen, T.; Kozat, S.S. Unsupervised Anomaly Detection With LSTM Neural Networks. IEEE Trans. Neural Netw. Learn. Syst. 2020, 31, 3127–3141.
- Li, C.; Qiu, M.; Li, C. Reinforcement Learning for Cybersecurity. Reinf. Learn. Cyber-Phys. Syst. 2019, 155–168.
- Liu, Y.; Li, Z.; Zhou, C.; Jiang, Y.; Sun, J.; Wang, M.; He, X. Generative Adversarial Active Learning for Unsupervised Outlier Detection. IEEE Trans. Knowl. Data Eng. 2020, 32, 1517–1528.
- Sutton, R.S.; Mahmood, A.R.; White, M. An emphatic approach to the problem of off-policy temporal-difference learning. J. Mach. Learn. Res. 2016, 17, 1–29.
- Kasim, Ö. An efficient and robust deep learning based network anomaly detection against distributed denial of service attacks. Comput. Netw. 2020, 180, 107390.
- Awad, M.; Alabdallah, A. Addressing Imbalanced Classes Problem of Intrusion Detection System Using Weighted Extreme Learning Machine. Int. J. Comput. Netw. Commun. 2019, 11, 5.
- Sharma, J.; Giri, C.; Granmo, O.C.; Goodwin, M. Multi-layer intrusion detection system with ExtraTrees feature selection, extreme learning machine ensemble, and softmax aggregation. EURASIP J. Inf. Secur. 2019, 2019, 15.
- Bahdanau, D.; Brakel, P.; Xu, K.; Goyal, A.; Lowe, R.; Pineau, J.; Courville, A.; Bengio, Y. An Actor-Critic Algorithm for Sequence Prediction. arXiv 2016, arXiv:1607.07086.
- Zhong, C.; Gursoy, M.C.; Velipasalar, S. Deep actor-critic reinforcement learning for anomaly detection. In Proceedings of the 2019 IEEE Global Communications Conference, GLOBECOM 2019, Waikoloa, HI, USA, 9–13 December 2019.
- Van Hasselt, H.; Guez, A.; Silver, D. Deep reinforcement learning with double Q-Learning. In Proceedings of the 30th AAAI Conference on Artificial Intelligence; 2016; pp. 2094–2100. Available online: https://dl.acm.org/doi/10.5555/3016100.3016191 (accessed on 21 February 2021).
- Deep Reinforcement Learning and Control Lectures. Available online: http://www.andrew.cmu.edu/course/10-403/ (accessed on 21 February 2021).
- Pfau, D.; Vinyals, O. Connecting generative adversarial networks and actor-critic methods. arXiv 2016, arXiv:1610.01945.
- Mnih, V.; Badia, A.P.; Mirza, M.; Graves, A.; Lillicrap, T.; Harley, T.; Silver, D.; Kavukcuoglu, K. Asynchronous methods for deep reinforcement learning. In Proceedings of the International Conference on Machine Learning, New York City, NY, USA, 19–24 June 2016; pp. 1928–1937.
- Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward generating a new intrusion detection dataset and intrusion traffic characterization. In Proceedings of the International Conference on Information Systems Security and Privacy (ICISSP), Funchal, Portugal, 22–24 January 2018; pp. 108–116.
- MontazeriShatoori, M.; Davidson, L.; Kaur, G.; Lashkari, A.H. Detection of DoH Tunnels using Time-series Classification of Encrypted Traffic. In Proceedings of the 2020 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), Calgary, AB, Canada, 17–22 August 2020; pp. 63–70.
- Sabhnani, M.; Serpen, G. Application of Machine Learning Algorithms to KDD Intrusion Detection Dataset within Misuse Detection Context. In Proceedings of the MLMTA, Las Vegas, NV, USA, 23–26 June 2003; pp. 209–215.
- Yang, Y.; Zheng, K.; Wu, C.; Yang, Y. Improving the classification effectiveness of intrusion detection by using improved conditional variational autoencoder and deep neural network. Sensors 2019, 19, 2528.
- Lin, Z.; Shi, Y.; Xue, Z. Idsgan: Generative adversarial networks for attack generation against intrusion detection. arXiv 2018, arXiv:1809.02077.
- Yin, H.; Xue, M.; Xiao, Y.; Xia, K.; Yu, G. Intrusion Detection Classification Model on an Improved k-Dependence Bayesian Network. IEEE Access 2019, 7, 157555–157563.
- Zhou, K.; Wang, W.; Hu, T.; Deng, K. Time Series Forecasting and Classification Models Based on Recurrent with Attention Mechanism and Generative Adversarial Networks. Sensors 2020, 20, 7211.
Methods | Algorithms | Application | Features |
---|---|---|---|
Value-based | Q-learning, DDQN, etc. | Simple environment | Sample efficient, steady |
Policy-based | Policy gradient, REINFORCE | Continuous, stochastic environment | Unstable, time consuming |
Actor-Critic (AC) | Combines value and policy: the Actor (policy) computes actions, the Critic (value) produces Q-values of actions; outperforms value-based and policy-based methods used separately | Complex environment (such as 2D/3D video gaming) | Batch normalization, target network, replay buffer, entropy regularization, compatibility, etc. |
GAN | Generator, Discriminator | Image, video, audio, etc. | Batch normalization, label smoothing, etc. |
A2C | Introduces the advantage on top of AC to evaluate how good an action is and its room for improvement | Complex environment | Reduces the high variance of policy networks, stable |
A3C | One global network and several workers, each running a copy of the same environment; introduces asynchrony on top of A2C | Complex environment | Further improves efficiency, robustness, and speed over A2C |
Proposed method | Similar to A3C | Complex environment | Adaptable; image: CNN, sequence: TCN |
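
The advantage that the A2C and A3C rows refer to can be illustrated with a minimal sketch. It assumes the common n-step formulation, in which the advantage is the discounted return (bootstrapped from the critic's estimate of the last state) minus the critic's value estimate for each visited state. The function name `n_step_advantages`, its parameters, and the default discount factor are hypothetical and are not taken from the paper's implementation.

```python
def n_step_advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Return one advantage estimate per time step of a short rollout.

    rewards[t]      -- reward received after acting in state t
    values[t]       -- critic's value estimate V(s_t)
    bootstrap_value -- critic's estimate for the state after the last step
                       (use 0.0 if the rollout ended in a terminal state)
    gamma           -- discount factor (assumed value, not from the paper)
    """
    if len(rewards) != len(values):
        raise ValueError("rewards and values must have the same length")

    # Accumulate discounted returns backwards from the bootstrap value.
    returns = []
    running = bootstrap_value
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()

    # Advantage: how much better the observed return was than expected.
    return [ret - val for ret, val in zip(returns, values)]


if __name__ == "__main__":
    # Toy rollout with made-up numbers, for illustration only.
    print(n_step_advantages([1.0, 0.0, 1.0], [0.5, 0.5, 0.5], 0.0, gamma=0.5))
```

A positive advantage raises the probability of the action taken and a negative one lowers it, which is why subtracting the value baseline reduces the variance of the policy gradient.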
Class (AWID-CLS-R-Trn) | Records | Class (AWID-CLS-R-Tst) | Records |
---|---|---|---|
Flooding | 48,484 | Flooding | 8,097 |
Impersonation | 48,522 | Impersonation | 20,079 |
Injection | 65,379 | Injection | 16,682 |
Normal | 1,633,190 | Normal | 530,785 |
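
The class imbalance in the AWID record counts above can be made explicit with a short calculation. This is a minimal sketch that only uses the counts from the table; the names `TRAIN_COUNTS`, `TEST_COUNTS`, and `class_shares` are hypothetical helpers introduced for illustration.

```python
# Record counts copied from the AWID-CLS-R table above.
TRAIN_COUNTS = {
    "Flooding": 48_484,
    "Impersonation": 48_522,
    "Injection": 65_379,
    "Normal": 1_633_190,
}
TEST_COUNTS = {
    "Flooding": 8_097,
    "Impersonation": 20_079,
    "Injection": 16_682,
    "Normal": 530_785,
}


def class_shares(counts):
    """Return each class's fraction of the total number of records."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    for split, counts in (("train", TRAIN_COUNTS), ("test", TEST_COUNTS)):
        print(split, "total records:", sum(counts.values()))
        for name, share in class_shares(counts).items():
            print(f"  {name}: {share:.2%}")
```

In the training split, normal traffic accounts for about 91% of the records, so overall accuracy alone says little about how well the minority attack classes are detected.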
Method | Type | Accuracy (±0.5%) | Precision (±0.5%) | Recall (±0.5%) | F1 (±0.5%) |
---|---|---|---|---|---|
MLP | Flooding | 0.9488 | 0.9195 | 0.6288 | 0.7469 |
MLP | Impersonation | 0.9326 | 0.6195 | 0.5491 | 0.5822 |
MLP | Injection | 0.9496 | 0.9202 | 1.00 | 0.9584 |
MLP | Normal | 0.9872 | 0.9498 | 1.00 | 0.9732 |
Adversarial RL model | Flooding | 0.9941 | 0.9452 | 0.6183 | 0.7476 |
Adversarial RL model | Impersonation | 0.9479 | 0.3201 | 0.4392 | 0.3703 |
Adversarial RL model | Injection | 0.9980 | 0.9358 | 0.9999 | 0.9668 |
Adversarial RL model | Normal | 0.9400 | 0.9727 | 0.9620 | 0.9673 |
SMOTE | Flooding | 0.9924 | 0.6001 | 0.6211 | 0.6104 |
SMOTE | Impersonation | 0.9521 | 0.3398 | 0.9303 | 0.4978 |
SMOTE | Injection | 0.9720 | 0.4105 | 1.00 | 0.5821 |
SMOTE | Normal | 0.9396 | 0.9897 | 0.8769 | 0.9299 |
Proposed Anomaly Detector | Flooding | 0.9930 | 0.9847 | 0.6430 | 0.7780 |
Proposed Anomaly Detector | Impersonation | 0.9571 | 0.4736 | 0.9402 | 0.6299 |
Proposed Anomaly Detector | Injection | 0.9834 | 0.9848 | 0.8679 | 0.9227 |
Proposed Anomaly Detector | Normal | 0.9917 | 0.9903 | 0.9851 | 0.9877 |
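
The F1 column in the table above is the harmonic mean of the precision and recall columns, which can be checked with a short sketch. The two sample rows are copied from the table; the names `f1_score` and `SAMPLE_ROWS` are hypothetical and introduced only for this illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (method, type, precision, recall, reported F1) copied from the table above.
SAMPLE_ROWS = [
    ("MLP", "Flooding", 0.9195, 0.6288, 0.7469),
    ("Proposed Anomaly Detector", "Flooding", 0.9847, 0.6430, 0.7780),
]

if __name__ == "__main__":
    for method, kind, precision, recall, reported in SAMPLE_ROWS:
        computed = f1_score(precision, recall)
        print(f"{method} / {kind}: computed={computed:.4f}, reported={reported:.4f}")
```

Because the harmonic mean is dominated by the smaller of the two inputs, a row with high precision but low recall (such as Flooding) still receives a modest F1.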
Method | Algorithm | Accuracy (±0.5%) | Precision (±0.5%) | Recall (±0.5%) | F1 (±0.5%) |
---|---|---|---|---|---|
Linear model | OCSVM | 0.6542 | 0.6953 | 0.6512 | 0.6725 |
Ensemble | IF | 0.7911 | 0.8483 | 0.7192 | 0.7784 |
Proximity | LOF | 0.6834 | 0.7923 | 0.6597 | 0.7189 |
Proximity | KNN | 0.7808 | 0.8933 | 0.6706 | 0.7661 |
NN methods | VAE | 0.7912 | 0.8311 | 0.7018 | 0.7610 |
NN methods | MLP | 0.7966 | 0.8679 | 0.6647 | 0.7529 |
GAN | WGAN | 0.6149 | 0.8279 | 0.665 | 0.7376 |
RL methods | Plain-AD | 0.7630 | 0.8902 | 0.7829 | 0.8331 |
RL methods | Multi-AD | 0.7265 | 0.8882 | 0.7265 | 0.7987 |
RL methods | A2C | 0.7905 | 0.8930 | 0.7905 | 0.8386 |
RL methods | Proposed Anomaly Detector | 0.7920 | 0.9110 | 0.7901 | 0.8463 |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Zhou, K.; Wang, W.; Hu, T.; Deng, K. Application of Improved Asynchronous Advantage Actor Critic Reinforcement Learning Model on Anomaly Detection. Entropy 2021, 23, 274. https://doi.org/10.3390/e23030274