Article

Optimizing the Effectiveness of Moving Target Defense in a Probabilistic Attack Graph: A Deep Reinforcement Learning Approach

Qiuxiang Li and Jianping Wu *
Department of Computer Science and Technology, Tsinghua University, Beijing 100190, China
* Author to whom correspondence should be addressed.
Electronics 2024, 13(19), 3855; https://doi.org/10.3390/electronics13193855
Submission received: 25 June 2024 / Revised: 22 July 2024 / Accepted: 28 July 2024 / Published: 28 September 2024

Abstract

Moving target defense (MTD) technology thwarts potential attacks by dynamically changing the software in use and/or its configuration while preserving the application's running state. However, MTD incurs deployment costs and performance overheads. An attack graph can be used to evaluate the trade-off between the effectiveness and the cost of an MTD deployment. In this study, we consider a network scenario in which every node in the attack graph can deploy MTD technology. We aim to achieve MTD deployment effectiveness optimization (MTD-DO), i.e., to minimize the network security loss under a limited budget. Existing related works either considered only a single node for deploying MTD or ignored the deployment cost. We first establish a non-linear MTD-DO formulation. Then, two deep reinforcement learning-based algorithms are developed, based on deep Q-networks (DQN) and proximal policy optimization (PPO), respectively. Moreover, two metrics are defined to effectively evaluate MTD-DO algorithms under varying network scales and budgets. The experimental results indicate that both the PPO- and DQN-based algorithms outperform Q-learning-based and random algorithms. The DQN-based algorithm converges more quickly and achieves a marginally higher reward than the PPO-based algorithm.
Keywords: attack graph; deep Q-learning; moving target defense; proximal policy optimization; optimization
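The abstract above describes, but does not specify, the MTD-DO formulation or training details. The following is therefore only a minimal sketch of the kind of DQN-based approach it names: a toy attack graph in which each node carries an assumed compromise probability (PROB) and an assumed MTD deployment cost (COST), the agent chooses which nodes to harden under a budget, and the reward is the resulting drop in expected security loss. All names and numbers here (PROB, COST, MTD_EFFECT, the network shape) are hypothetical stand-ins, and the agent omits common refinements such as a target network.

```python
import random
from collections import deque

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-in for the paper's setting: each attack-graph node carries an
# assumed compromise probability and an assumed MTD deployment cost.
# Deploying MTD on a node scales its probability down to MTD_EFFECT of it.
N_NODES, BUDGET, GAMMA = 6, 10.0, 0.99
PROB = np.array([0.9, 0.7, 0.8, 0.5, 0.6, 0.4])  # hypothetical node risks
COST = np.array([4.0, 3.0, 5.0, 2.0, 3.0, 1.0])  # hypothetical MTD costs
MTD_EFFECT = 0.2                                 # residual risk after MTD

def state_vec(deployed, budget):
    # State: which nodes already run MTD, plus normalized remaining budget.
    return np.append(deployed, budget / BUDGET).astype(np.float32)

def feasible_actions(deployed, budget):
    return [a for a in range(N_NODES) if deployed[a] == 0 and COST[a] <= budget]

qnet = nn.Sequential(nn.Linear(N_NODES + 1, 64), nn.ReLU(),
                     nn.Linear(64, N_NODES))
opt = torch.optim.Adam(qnet.parameters(), lr=1e-3)
replay = deque(maxlen=5000)

for episode in range(300):
    deployed, budget = np.zeros(N_NODES), BUDGET
    eps = max(0.05, 1.0 - episode / 200)         # decaying epsilon-greedy
    while True:
        actions = feasible_actions(deployed, budget)
        if not actions:
            break
        s = state_vec(deployed, budget)
        if random.random() < eps:
            a = random.choice(actions)
        else:
            with torch.no_grad():
                q = qnet(torch.from_numpy(s)).numpy()
            a = max(actions, key=lambda i: q[i])
        # Reward: reduction in expected security loss from hardening node a.
        r = PROB[a] * (1.0 - MTD_EFFECT)
        deployed[a], budget = 1.0, budget - COST[a]
        done = not feasible_actions(deployed, budget)
        replay.append((s, a, r, state_vec(deployed, budget), float(done)))
        if len(replay) >= 64:                    # one gradient step per move
            S, A, R, S2, D = map(np.array, zip(*random.sample(replay, 64)))
            q_sa = qnet(torch.from_numpy(S.astype(np.float32)))
            q_sa = q_sa.gather(1, torch.from_numpy(A).long().unsqueeze(1)).squeeze(1)
            with torch.no_grad():
                q_next = qnet(torch.from_numpy(S2.astype(np.float32))).max(1).values
                target = (torch.from_numpy(R).float()
                          + GAMMA * (1 - torch.from_numpy(D).float()) * q_next)
            loss = F.mse_loss(q_sa, target)
            opt.zero_grad(); loss.backward(); opt.step()

# Greedy rollout with the learned Q-network yields a deployment plan.
deployed, budget = np.zeros(N_NODES), BUDGET
while feasible_actions(deployed, budget):
    with torch.no_grad():
        q = qnet(torch.from_numpy(state_vec(deployed, budget))).numpy()
    a = max(feasible_actions(deployed, budget), key=lambda i: q[i])
    deployed[a], budget = 1.0, budget - COST[a]
print("MTD deployed on nodes:", np.nonzero(deployed)[0], "remaining budget:", budget)
```

Restricting the argmax to budget-feasible actions is one simple way to enforce the budget constraint; the paper's actual constraint handling, state encoding, and reward definition may differ.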

