Designing and Evaluating a Flexible and Scalable HTTP Honeypot Platform: Architecture, Implementation, and Applications
Abstract
1. Introduction
1.1. State of the Art
1.2. Motivation
1.3. Structure of the Paper
2. Proposed System
2.1. Requirements Identification
2.2. System Architecture
- Honeypot units, which receive requests from attackers and form replies based on the provisioned behavior; these can operate in different modes, either autonomously or by proxying the attackers’ requests to the central control system, where responses can be prepared with greater control.
- A central control system with APIs, databases, and control logic; it registers and manages honeypot nodes, acts as the source of the latest response models, and, in proxy mode, continuously receives requests from the honeypot units and prepares responses according to the content of each request.
- The data pipeline, which ships all request logs to a storage cluster, distributes the metadata to enrichment workers, and runs analyses.
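The two operating modes of a honeypot unit can be illustrated with a minimal sketch. It is not the platform's implementation: the names `HoneypotUnit`, `Mode`, `Request`, `Response`, and `central_control` are hypothetical, the response model is assumed to be a simple lookup table keyed by method and path, and the central control system is replaced by a local stand-in function instead of a network API call.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Tuple


class Mode(Enum):
    AUTONOMOUS = "autonomous"  # reply locally from the provisioned response model
    PROXY = "proxy"            # forward each request to the central control system


@dataclass(frozen=True)
class Request:
    method: str
    path: str
    source_ip: str


@dataclass(frozen=True)
class Response:
    status: int
    body: str


class HoneypotUnit:
    """Hypothetical honeypot unit that supports both operating modes."""

    def __init__(self, mode: Mode,
                 model: Dict[Tuple[str, str], Response],
                 forward: Callable[[Request], Response]) -> None:
        self.mode = mode
        self.model = model      # assumed response model: (method, path) -> reply
        self.forward = forward  # callable standing in for the central control API

    def handle(self, request: Request) -> Response:
        if self.mode is Mode.PROXY:
            # Proxy mode: the central system prepares the reply.
            return self.forward(request)
        # Autonomous mode: answer from the local model, 404 for unknown paths.
        key = (request.method, request.path)
        return self.model.get(key, Response(404, "Not Found"))


def central_control(request: Request) -> Response:
    """Stand-in for the central control logic (assumption, not the real API)."""
    if request.path.startswith("/admin"):
        return Response(401, "Login required")
    return Response(200, "OK")
```

In this sketch, switching a unit between modes only changes which component prepares the reply; the request handling interface stays the same.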
2.2.1. Central Management
2.2.2. Honeypot Nodes
2.3. Threat Impact Assessment for Service Identification
- To estimate the vulnerability density, we first need to estimate the code base size, which is easily doable only for open-source projects; for closed-source projects, the code base size has to be approximated using a functionally similar project;
- For the number of known vulnerabilities, the first approximation can be obtained by querying a CVE database;
- To estimate the breach cost, we can first estimate the risk level of the application (ranging from low-risk applications with insignificant breach consequences to mission-critical applications where a breach causes severe service disruption) and map it to the interval from 0 to 1; for this, we propose to use the sigmoid function. However, the initial risk would have to be estimated on a per-use-case basis, which makes this impractical. To simplify the calculation, the factor can be assumed to be a constant of 0.5 and, if need be, adjusted upwards for intrinsically risky applications or downwards for less risky ones.
- Effectiveness of countermeasures could be estimated based on lookups in countermeasure databases such as [45]. This factor then signifies the average countermeasure effectiveness for known vulnerabilities.
- The existence of countermeasures is no guarantee that they are actually implemented. To get around this, we can assume the compliance index factor is a constant, signifying either that all known countermeasures are implemented or that none are.
- Finally, the installed base of the application could be estimated using market research, download counters, etc. However, in our experience, the most reliable and consistent data come from internet surveys obtained from scanners such as Shodan and Censys.
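Two of the factors above lend themselves to a short worked sketch. The code below makes several assumptions that are not stated in the text: vulnerability density is taken as the number of known vulnerabilities per thousand lines of code, the risk level is an unbounded real number centered at 0, and all function and field names are hypothetical. How the factors are combined into a final score is not described here, so the sketch only collects them.

```python
import math
from dataclasses import dataclass


def vulnerability_density(known_vulnerabilities: int, code_size_kloc: float) -> float:
    """Known vulnerabilities per thousand lines of code (assumed definition)."""
    if code_size_kloc <= 0:
        raise ValueError("code base size must be positive")
    return known_vulnerabilities / code_size_kloc


def breach_cost_factor(risk_level: float = 0.0) -> float:
    """Map a risk level onto the interval from 0 to 1 with the logistic sigmoid.

    A neutral risk level of 0 yields 0.5, which matches the constant suggested
    as the default; positive values move the factor upwards and negative
    values move it downwards.
    """
    if risk_level >= 0:
        return 1.0 / (1.0 + math.exp(-risk_level))
    # Numerically stable branch for large negative inputs.
    e = math.exp(risk_level)
    return e / (1.0 + e)


@dataclass(frozen=True)
class ImpactFactors:
    """Container for the factors listed above (field names are hypothetical)."""
    vulnerability_density: float
    breach_cost: float
    countermeasure_effectiveness: float  # average over known vulnerabilities
    compliance_index: float              # constant: all or no countermeasures applied
    installed_base: int                  # e.g., host count from an internet survey


# Illustrative numbers only, not measurements from the paper.
example = ImpactFactors(
    vulnerability_density=vulnerability_density(30, 150.0),
    breach_cost=breach_cost_factor(),
    countermeasure_effectiveness=0.6,
    compliance_index=1.0,
    installed_base=12_000,
)
```

Using the sigmoid keeps the breach cost factor bounded regardless of how the underlying risk level is scaled, which is why a single default of 0.5 can serve as a neutral starting point.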
2.4. Implementation Details
2.4.1. Central Management
2.4.2. Honeypot Nodes
2.4.3. Data Pipeline
2.4.4. Implemented Honeypots
3. Experimental Validation and Results
3.1. Experiment Setup
3.2. Acquired Data
3.3. Crawler and Bot Identification
- We manually identified well-known security scanners, search engines, and bots, and, where provided, used the source IPs and subnets to label the requests (e.g., GoogleBot);
- We performed a reverse DNS lookup for each IP, labeling the whitelisted scanners based on the resolved domain;
- We used specially crafted replies that steganographically stored the traffic source information in the HTTP response, effectively watermarking the responses. Searching for our honeypot IPs in well-known vulnerability scanner databases then often yielded enough of the watermark for us to recover the source of the scans;
- We amended the data using external services and databases.
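The first two labeling steps (published subnets and reverse DNS) can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the experiment: the label tables contain placeholder entries from documentation address ranges and example domains, and the function names `label_source` and `reverse_dns` are hypothetical. The resolver is passed in as a parameter so that the logic can be exercised without network access.

```python
import ipaddress
import socket
from typing import Callable, Optional

# Hypothetical label tables; real entries would come from operator-published
# IP ranges and a manually curated whitelist of scanner domains.
KNOWN_SUBNETS = {
    "example-search-bot": [ipaddress.ip_network("192.0.2.0/24")],
}
KNOWN_DOMAIN_SUFFIXES = {
    ".crawler.example.com": "example-search-bot",
    ".scanner.example.org": "example-scanner",
}


def reverse_dns(ip: str) -> Optional[str]:
    """Resolve an IP address to a hostname, or return None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def label_source(ip: str,
                 resolve: Callable[[str], Optional[str]] = reverse_dns) -> Optional[str]:
    """Return a label for a known crawler or scanner, or None if unrecognized."""
    address = ipaddress.ip_address(ip)

    # Step 1: match against published source subnets.
    for label, networks in KNOWN_SUBNETS.items():
        if any(address in network for network in networks):
            return label

    # Step 2: fall back to the reverse DNS name and match whitelisted domains.
    hostname = resolve(ip)
    if hostname:
        hostname = hostname.rstrip(".").lower()
        for suffix, label in KNOWN_DOMAIN_SUFFIXES.items():
            if hostname.endswith(suffix):
                return label
    return None
```

Because reverse DNS records are controlled by the owner of the address block, a production version would likely also confirm the result with a forward lookup of the resolved name before trusting the label.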
3.4. Client-Side Browser Fingerprinting
3.5. Identification of Common Vulnerabilities and Exposures
3.6. Experimental Attack Modeling and Visualization
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
AI | Artificial intelligence |
ALG | Application Layer Gateway |
API | Application programming interface |
AS | Autonomous system |
ASN | Autonomous system number |
CMS | Content management system |
COTS | Commercial off-the-shelf |
CPU | Central processing unit |
CSS | Cascading Style Sheets |
CTI | Cyber threat intelligence |
CVE | Common Vulnerabilities and Exposures |
DB | Database |
DNS | Domain name system |
GPU | Graphics processing unit |
GUI | Graphical user interface |
HTML | HyperText Markup Language |
HTTPS | Hypertext Transfer Protocol Secure |
ICS | Industrial control system |
IDS | Intrusion detection system |
IoT | Internet of Things |
IP | Internet Protocol |
IPS | Intrusion prevention system |
ISP | Internet service provider |
IT | Information technology |
JS | JavaScript |
LLM | Large language model |
ML | Machine learning |
NAS | Network attached storage |
OS | Operating system |
OT | Operational technology |
PPDR | Public protection and disaster relief |
REST | Representational state transfer |
RTSP | Real-Time Streaming Protocol |
SQL | Structured query language |
SSH | Secure Shell |
TCP | Transmission Control Protocol |
TLS | Transport Layer Security |
UA | User agent |
UDP | User Datagram Protocol |
UPS | Uninterruptible power supply |
URL | Uniform Resource Locator |
VANET | Vehicular ad hoc network |
VM | Virtual machine |
VPN | Virtual private network |
WAF | Web application firewall |
XSS | Cross-site scripting |
References
1. Al-Garadi, M.A.; Mohamed, A.; Al-Ali, A.K.; Du, X.; Ali, I.; Guizani, M. A Survey of Machine and Deep Learning Methods for Internet of Things (IoT) Security. IEEE Commun. Surv. Tutor. 2020, 22, 1646–1685.
2. Franco, J.; Aris, A.; Canberk, B.; Uluagac, A.S. A Survey of Honeypots and Honeynets for Internet of Things, Industrial Internet of Things, and Cyber-Physical Systems. IEEE Commun. Surv. Tutor. 2021, 23, 2351–2383.
3. Makhdoom, I.; Abolhasan, M.; Lipman, J.; Liu, R.P.; Ni, W. Anatomy of Threats to the Internet of Things. IEEE Commun. Surv. Tutor. 2019, 21, 1636–1675.
4. Matheu, S.N.; Hernández-Ramos, J.L.; Skarmeta, A.F.; Baldini, G. A Survey of Cybersecurity Certification for the Internet of Things. ACM Comput. Surv. 2021, 53, 115.
5. Dionaea. Available online: https://github.com/DinoTools/dionaea (accessed on 21 July 2023).
6. Cowrie. Available online: https://github.com/cowrie/cowrie (accessed on 28 July 2023).
7. Honeytrap. Available online: https://github.com/honeytrap/honeytrap (accessed on 28 July 2023).
8. Kuskov, V.; Kuzin, M.; Shmelev, Y.; Makrushin, D.; Grachev, I. Honeypots and the Internet of Things. Available online: https://securelist.com/honeypots-and-the-internet-of-things/78751/ (accessed on 21 July 2023).
9. Surber, J.G.; Zantua, M. Intelligent Interaction Honeypots for Threat Hunting within the Internet of Things. J. Colloq. Inf. Syst. Secur. Educ. 2022, 9, 5.
10. Metongnon, L.; Sadre, R. Beyond Telnet: Prevalence of IoT Protocols in Telescope and Honeypot Measurements. In Proceedings of the 2018 Workshop on Traffic Measurements for Cybersecurity; WTMC’18, Budapest, Hungary, 20 August 2018; Association for Computing Machinery: New York, NY, USA; pp. 21–26.
11. Semic, H.; Mrdovic, S. IoT Honeypot: A Multi-Component Solution for Handling Manual and Mirai-based Attacks. In Proceedings of the 2017 25th Telecommunication Forum (TELFOR), Belgrade, Serbia, 21–22 November 2017; pp. 1–4.
12. Manzanares, A.G. The Construction of a Virtual, Low-Interaction IoT Honeypot; Semantic Scholar: Seattle, WA, USA, 2017.
13. Dowling, S.; Schukat, M.; Melvin, H. Data-Centric Framework for Adaptive Smart City Honeynets. In Proceedings of the 2017 Smart City Symposium Prague (SCSP), Prague, Czech Republic, 25–26 May 2017; pp. 1–7.
14. Chen, D.D.; Egele, M.; Woo, M.; Brumley, D. Towards Automated Dynamic Analysis for Linux-based Embedded Firmware. In Proceedings of the 2016 Network and Distributed System Security Symposium; Internet Society, San Diego, CA, USA, 21–24 February 2016.
15. Wang, M.; Santillan, J.; Kuipers, F. ThingPot: An Interactive Internet-of-Things Honeypot. Available online: http://arxiv.org/abs/1807.04114 (accessed on 29 July 2023).
16. Vishwakarma, R.; Jain, A.K. A Honeypot with Machine Learning Based Detection Framework for Defending IoT Based Botnet DDoS Attacks. In Proceedings of the 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI), Tirunelveli, India, 23–25 April 2019; pp. 1019–1024.
17. Luo, T.; Xu, Z.; Jin, X.; Jia, Y.; Ouyang, X. IoTCandyJar: Towards an Intelligent-Interaction Honeypot for IoT Devices. Black Hat 2017, 1, 1–11.
18. Zhou, Y. Chameleon: Towards adaptive honeypot for internet of things. In Proceedings of the ACM Turing Celebration Conference—China, Chengdu, China, 17–19 May 2019.
19. Vetterl, A.; Clayton, R. Honware: A Virtual Honeypot Framework for Capturing CPE and IoT Zero Days. In Proceedings of the 2019 APWG Symposium on Electronic Crime Research (eCrime), Pittsburgh, PA, USA, 13–15 November 2019; pp. 1–13.
20. Guarnizo, J.; Tambe, A.; Bhunia, S.S.; Ochoa, M.; Tippenhauer, N.; Shabtai, A.; Elovici, Y. SIPHON: Towards Scalable High-Interaction Physical Honeypots. arXiv 2017, arXiv:1701.02446.
21. Shrivastava, R.K.; Bashir, B.; Hota, C. Attack Detection and Forensics Using Honeypot in IoT Environment. In Proceedings of the Distributed Computing and Internet Technology, Bhubaneswar, India, 10–13 January 2019; Fahrnberger, G., Gopinathan, S., Parida, L., Eds.; Lecture Notes in Computer Science. Springer International Publishing: Cham, Switzerland, 2019; pp. 402–409.
22. Pauna, A.; Bica, I.; Pop, F.; Castiglione, A. On the Rewards of Self-Adaptive IoT Honeypots. Ann. Telecommun. 2019, 74, 501–515.
23. Lingenfelter, B.; Vakilinia, I.; Sengupta, S. Analyzing Variation Among IoT Botnets Using Medium Interaction Honeypots. In Proceedings of the 2020 10th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 6–8 January 2020; pp. 761–767.
24. Wang, B.; Dou, Y.; Sang, Y.; Zhang, Y.; Huang, J. IoTCMal: Towards A Hybrid IoT Honeypot for Capturing and Analyzing Malware. In Proceedings of the ICC 2020—2020 IEEE International Conference on Communications (ICC), Dublin, Ireland, 7–11 June 2020; pp. 1–7.
25. Hakim, M.A.; Aksu, H.; Uluagac, A.S.; Akkaya, K. U-PoT: A Honeypot Framework for UPnP-Based IoT Devices. In Proceedings of the 2018 IEEE 37th International Performance Computing and Communications Conference (IPCCC), Orlando, FL, USA, 17–19 November 2018; pp. 1–8.
26. Trajanovski, T.; Zhang, N. An Automated and Comprehensive Framework for IoT Botnet Detection and Analysis (IoT-BDA). IEEE Access 2021, 9, 124360–124383.
27. Krishna, R.R.; Priyadarshini, A.; Jha, A.V.; Appasani, B.; Srinivasulu, A.; Bizon, N. State-of-the-Art Review on IoT Threats and Attacks: Taxonomy, Challenges and Solutions. Sustainability 2021, 13, 9463.
28. Pa, Y.M.; Suzuki, S.; Yoshioka, K.; Matsumoto, T.; Kasama, T.; Rossow, C. IoTPOT: A Novel Honeypot for Revealing Current IoT Threats. J. Inf. Process. 2016, 24, 522–533.
29. Baş Seyyar, M.; Çatak, F.Ö.; Gül, E. Detection of Attack-Targeted Scans from the Apache HTTP Server Access Logs. Appl. Comput. Inform. 2018, 14, 28–36.
30. Mx, P. HoneyPy. Available online: https://github.com/foospidy/HoneyPy (accessed on 30 July 2023).
31. Pa, Y.M.P.; Suzuki, S.; Yoshioka, K.; Matsumoto, T.; Kasama, T.; Rossow, C. IoTPOT: Analysing the Rise of IoT Compromises. In Proceedings of the 9th USENIX Workshop on Offensive Technologies (WOOT 15), Washington, DC, USA, 10–11 August 2015.
32. Dowling, S.; Schukat, M.; Melvin, H. A ZigBee Honeypot to Assess IoT Cyberattack Behaviour. In Proceedings of the 2017 28th Irish Signals and Systems Conference (ISSC), Killarney, Ireland, 20–21 June 2017; pp. 1–6.
33. Tambe, A.; Aung, Y.L.; Sridharan, R.; Ochoa, M.; Tippenhauer, N.O.; Shabtai, A.; Elovici, Y. Detection of Threats to IoT Devices Using Scalable VPN-forwarded Honeypots. In Proceedings of the Ninth ACM Conference on Data and Application Security and Privacy, Dallas, TX, USA, 25–27 March 2019.
34. Ramakrishnan, K.; Gokul, P.; Nigam, R. Pandora: An IOT Based Intrusion Detection Honeypot with Real-time Monitoring. In Proceedings of the 2021 International Conference on Forensics, Analytics, Big Data, Security (FABS), Bengaluru, India, 21–22 December 2021; Volume 1, pp. 1–7.
35. Bartwal, U.; Mukhopadhyay, S.; Negi, R.; Shukla, S. Security Orchestration, Automation, and Response Engine for Deployment of Behavioural Honeypots. In Proceedings of the 2022 IEEE Conference on Dependable and Secure Computing (DSC), Edinburgh, UK, 22–24 June 2022; pp. 1–8.
36. Kato, S.; Tanabe, R.; Yoshioka, K.; Matsumoto, T. Adaptive Observation of Emerging Cyber Attacks Targeting Various IoT Devices. In Proceedings of the 2021 IFIP/IEEE International Symposium on Integrated Network Management (IM), Bordeaux, France, 17–21 May 2021; Available online: https://ieeexplore.ieee.org/document/9464004 (accessed on 30 July 2023).
37. Tabari, A.Z.; Ou, X. A First Step Towards Understanding Real-world Attacks on IoT Devices. arXiv 2020, arXiv:2003.01218.
38. ENISA Threat Landscape Report 2018. Available online: https://www.enisa.europa.eu/publications/enisa-threat-landscape-report-2018 (accessed on 29 July 2023).
39. Sedlar, U. Network Telescope: Insights from a Decade of Observations. Electrotech. Rev. Vestn. 2022, 89, 198–204.
40. Sedlar, U.; Južnič, L.Š.; Volk, M. An Iteratively-Improving Internet-of-Things Honeypot Experiment. In Proceedings of the 2020 International Conference on Broadband Communications for Next Generation Networks and Multimedia Applications (CoBCom), Graz, Austria, 7–9 July 2020; pp. 1–6.
41. Bauer, J.M.; van Eeten, M.J.G. Cybersecurity: Stakeholder Incentives, Externalities, and Policy Options. Telecommun. Policy 2009, 33, 706–719.
42. CONCORDIA: Cyber Security Competence for Research and InnovAtion. In Work Package 4: Policy and the European Dimension, Deliverable D4.3: 3rd Year Report on Cybersecurity Threats; Technical Report; European Commission: Luxembourg, 2020.
43. Fischer-Hübner, S.; Alcaraz, C.; Ferreira, A.; Fernandez-Gago, C.; Lopez, J.; Markatos, E.; Islami, L.; Akil, M. Stakeholder Perspectives and Requirements on Cybersecurity in Europe. J. Inf. Secur. Appl. 2021, 61, 102916.
44. Kren, M.; Kos, A.; Sedlar, U. Estimating Application Cyberthreat Impact Score for Honeypot Coverage Prioritization. In Proceedings of the 2022 International Conference on Broadband Communications for Next Generation Networks and Multimedia Applications (CoBCom), Graz, Austria, 12–14 July 2022; pp. 1–6.
45. JVN iPedia. Available online: https://jvndb.jvn.jp/en/ (accessed on 7 July 2023).
46. Selenium. Available online: https://www.selenium.dev/ (accessed on 31 July 2023).
47. Puppeteer. Available online: https://pptr.dev/ (accessed on 31 July 2023).
48. Playwright. Available online: https://playwright.dev/python/ (accessed on 31 July 2023).
Project | Honeypot Type | Level of Interaction | Supported Attack Types |
---|---|---|---|
Dionaea [5] | General | Medium | N/A |
HoneyPy [30] | General | Low/Medium | N/A |
IoTPOT [31] | IoT | Hybrid | Brute force attacks, Hajime, ZmEu attacks |
Dowling [32] | IoT | Medium | Dictionary attack, brute-force attacks, reconnaissance attacks, botnet attacks, launch attacks, individual attacks |
HoneyIo4 [12] | IoT | Low | Reconnaissance attacks |
SIPHON [20] | IoT | High | Brute-force login attempts |
Metongnon [10] | IoT | Low/Medium | Reconnaissance attacks |
Scalable VPN-forwarded honeypots [33] | IoT | High | N/A |
Lingenfelter [23] | IoT | Medium | Botnet attacks |
Firmadyne [14] | IoT, entire device emulation | | Reconnaissance attacks, buffer overflow |
IoTCandyJar [17] | IoT, entire device emulation | | HTTP (HEAD, OPTIONS, CONNECT), TCP, UDP, RTSP |
Pandora [34] | General | Low | Login attempts, port scan attacks |
Bartwal [35] | General | Hybrid | HTTP attacks (XSS, SQLi, OSC), DDoS, botnet attacks |
X-POT [36] | IoT, entire device emulation | Hybrid | N/A |
HoneyCamera [37] | IoT | Low | Login attempts, command injection, shellshock |
Honeypot | Honeypot Type | Level of Interaction |
---|---|---|
EPSON c20600, printer | IoT | Low/Medium |
HP Color LaserJet m552, printer | IoT | Low/Medium |
APC SmartUPS, an uninterruptible power supply | IoT, Cloud | Low/Medium |
MongoDB v2.4 | Cloud | High |
MongoDB v3.2 | Cloud | High |
Openstack v17 | Cloud | Low/Medium |
VMware vCenter Server v6.5 | Cloud | Low/Medium |
QNAP Network attached storage system | IoT, Cloud | Low/Medium |
QSAN Network attached storage system | IoT, Cloud | Low/Medium |
Thecus Network attached storage system | IoT, Cloud | Low/Medium |
IBM Storwize v7000, enterprise storage system | IoT, Cloud | Low/Medium |
phpMyAdmin v5.1.1 | Cloud, Web | High |
Joomla CMS v3 | Web | High |
Joomla CMS v4 | Web | High |
Metric | Value |
---|---|
Length of data collection interval | 132 days (3 March 2023 to 14 July 2023) |
Number of events collected (HTTP requests) | 188,287 |
Number of unique sessions | 107,954 |
Number of unique source IP addresses | 17,176 |
Number of unique source ASNs | 1807 |
Total number of identified search engines in the dataset | 26 |
Total number of IP addresses of identified search engines | 3160 (18.4% of all captured IP addresses) |
Total number of requests from identified search engines | 45,353 (24.1% of all requests) |
Total number of ASNs of identified search engines | 39 (2.2% of all captured ASNs) |
Total number of deployed honeypots | 14 |
Number of unique user agent strings | 3227 |
Total number of IP addresses with malicious traffic | 5006 (29.2% of all captured IP addresses) |
Total number of source ASNs with malicious traffic | 1327 (73.4% of all captured ASNs) |
Total number of returning visitors 1 | 7687 |
Total number of unique IP addresses that loaded JavaScript | 159 (0.9% of all unique IP addresses) |
Total number of sessions that loaded JavaScript | 232 (0.2% of all captured sessions) |
Total number of returning visitors that loaded JavaScript | 40 |
The largest number of returning visits by an IP that loaded JS | 14 (6.0% of all JS-enabled sessions) |
Total number of detected IP addresses using browser automation tools (based on client-side fingerprinting) | 21 (13.2% of all JS-enabled visitor IPs) |
Total number of IP addresses with a high probability of human actors (based on client-side fingerprinting) | 15 |
Total number of returning IP addresses with a high probability of human actors (based on client-side fingerprinting) | 2 (2 sessions by one IP, 3 by the other) |
Total number of captured client-side events | 26,502 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Rabzelj, M.; Južnič, L.Š.; Volk, M.; Kos, A.; Kren, M.; Sedlar, U. Designing and Evaluating a Flexible and Scalable HTTP Honeypot Platform: Architecture, Implementation, and Applications. Electronics 2023, 12, 3480. https://doi.org/10.3390/electronics12163480