Abstract
North Korean cyber-attack groups such as Kimsuky, Lazarus, Andariel, and Venus 121 continue to attempt spear-phishing APT attacks that exploit social issues, including COVID-19. As the COVID-19 pandemic continues worldwide, related threats therefore persist in cyberspace as well. In January 2022, a hacking attack presumed to be the work of Kimsuky, a North Korean cyber-attack group, attempted to steal research data related to COVID-19. The problem is that the activities of cyber-attack groups are continuously increasing, and it is difficult to accurately identify cyber-attack groups and attack origins from limited analysis information alone. To solve this problem, it is necessary to expand the scope of data analysis by using BGP archive data. Infrastructure and network information must be combined to draw correlations so that infrastructure can be classified accurately by attack group. Network-based infrastructure analysis is needed to complement fragmentary host-area information, such as malware or system logs. This paper reviews cyber ISR and BGP, case studies of cyber ISR visualization for situational awareness, hacking trends of North Korean cyber-attack groups, and cyber-attack tracking. Building on this related research, we estimated the origin of the attack by analyzing hacking cases with cyber intelligence-based profiling techniques and correlation analysis using BGP archive data. Based on the analysis results, we propose an implementation of a cyber ISR visualization method based on BGP archive data. Future research will link this work to research on cyber command-and-control systems and will study the cyber battlefield area, cyber ISR, and a traceback visualization model for the origin of the attack. The final R&D goal is to develop an AI-based platform that automatically identifies cyber-attack groups and tracks attack origins by analyzing cyber-attack behavior and the infrastructure lifecycle.
1. Introduction
As COVID-19 became a global issue, hackers quickly changed their attack methods. Numerous hackers, including advanced persistent threat (APT) attack groups, are actively exploiting the COVID-19 issue. Attacks that exploit COVID-19 mainly rely on social-engineering techniques and phishing, and are classified into four types of cyber threats according to the behavior required of users: malicious code, phishing sites, financial scams, and malicious app distribution. In addition, most APT attack groups attempt attacks using malicious code and malicious apps [1,2,3]. A North Korean hacker group is pursuing a strategy of infiltrating the target system through spear-phishing that exploits social issues, including COVID-19, and then carrying out an APT attack. In January 2022, a hacking attack presumed to be the work of Kimsuky, a North Korean cyber-attack group, attempted to steal research data related to COVID-19. In addition, in August 2022, an attack aimed at stealing information took place against the Russian Ministry of Foreign Affairs, and in October 2022, the group targeted foreign and defense professors and North Korean civilian experts. Kimsuky is currently the most active of these groups, and the US Cyber Security Office continues to warn of “the danger of North Korean Kimsuky APT attacks” [2,3].
The problem is that the activities of cyber-attack groups are continuously increasing, and it is difficult to accurately identify cyber-attack groups and attack origins from limited analysis information alone. Because analysis data are lacking, the analyst’s scope is narrow, and it is difficult to trace the origin of cyber-attack groups and analyze associations between them. In addition, only a small number of known attack groups are being identified, and the results depend on the capabilities of the individual analyst. Because responding with limited manpower and manual analysis takes too much time, the reliability of analysis results decreases.
To solve this problem, it is necessary to expand the scope of data analysis by using BGP archive data. Infrastructure and network information must be combined to draw correlations so that infrastructure can be classified accurately by attack group. Network-based infrastructure analysis is needed to complement fragmentary host-area information, such as malware or system logs. In response to this need, we propose a cyber ISR visualization method that uses various analysis information to quickly identify, track, and respond to cyber-attack origins.
The proposed solution matters because, when a cyber-attack occurs, the most important element of a model for tracing its origin is to show the attack visually in connection with network infrastructure information. This makes it possible to trace the origin of the attack, identify the attack group quickly, and respond effectively.
In this paper, a profiling technique was used to analyze attack cases of distributing malicious documents attached to the hacking mail of the North Korean cyber-attack group Kimsuky, which attempted to steal research data related to COVID-19 in January 2022. Furthermore, the origin of the attack was estimated by analyzing the association based on cyber information collection data using border gateway protocol (BGP) archive data.
Most previous studies applied various visualization methods to cyber situational awareness. However, they did not consider the process of performing cyber ISR and were limited to visualizing anomaly detection for cyber threats. The academic significance of this study is that actual hacking cases of North Korean cyber-attack groups were analyzed using profiling techniques. By applying the MITRE ATT&CK framework, attack procedures, attack tactics, and attack techniques were derived, and from these the cyber ISR process and visualization elements were established. Furthermore, we designed a framework architecture for cyber ISR visualization based on various related data, including BGP archive data, and this is the first case of visualizing it in prototype form. The work is expected to contribute to fields related to cyber command and control systems and cyber operation systems in the future.
The remainder of this paper is organized as follows. Section 2 reviews visualization research cases for cyber intelligence, surveillance, and reconnaissance (ISR). Section 3 presents the hacking-case profiling analysis of North Korean cyber-attack groups for cyber ISR battlefield visualization. Section 4 covers the implementation of a model that traces back the origin of the Kimsuky group’s attack through visualization of the cyber ISR battlefield based on BGP archive data, and on that basis proposes a method to visualize cyber ISR based on BGP archive data. The conclusion is presented in Section 5.
2. Related Works
Section 2 gives an overview of cyber ISR and reviews research cases from recent literature on cyber ISR visualization for cyber situational awareness (SA). Fifteen domestic and foreign studies from 2006 to the present related to cyber ISR visualization for cyber situational awareness were reviewed, and each was analyzed in detail in terms of visualization technology, core function, visualization level, and use case. The BGP Route Views Project, the source of the BGP archive data on which this study is based, is then explained. In addition, for the profiling analysis, hacking trends of each North Korean cyber-attack group and research cases on the backtracking of cyber-attacks are reviewed. This related research is described to aid readers’ understanding and to establish the significance of the study.
2.1. Cyber ISR Overview
The cyber battlefield, the fifth battlefield, is a battlespace in which forces defend against attacks that disturb, deny, control, or destroy information systems, and conduct such attacks against the enemy’s information systems, in a virtual space where digitized information circulates. Cyber operations in cyberspace are carried out under the concept of cyberspace activities, which comprise cyber-attack, cyber defense, cyber ISR, and the cyber operation environment [4,5]. Cyber intelligence is the result of analyzing collected information for a specific purpose. Cyber surveillance refers to sustained, intensive observation of a target, and cyber reconnaissance refers to actions taken to achieve a specific goal against a specific target. Cyber targets are selected and information is collected through cyber ISR. For successful cyber operations, cyber ISR supports the commander’s correct decision-making by collecting and analyzing information on allied and enemy forces [5]. As a result, cyber command and control (C2), cyber defense, and cyber battle-damage assessment (BDA) are all carried out on the basis of cyber ISR, which makes the process of collecting and analyzing cyber information that yields useful intelligence very important.
2.2. Related Work on Cyber ISR Visualization for Cyber Situational Awareness
In US joint doctrine, cyber SA refers to current or predictive knowledge of cyberspace and of the operational environment on which cyber operations depend, including all factors that affect cyberspace, allies, and enemies [5,6]. Using the common operational picture (COP), the commander continuously evaluates the operational environment through intelligence on troops in the operating environment, reporting functions, personnel monitoring, threat warning, and various other activities. The defense network is the primary means of collecting the information that commanders use to understand the situation in the operational environment, including the current system status. Therefore, managing the collection means, communication channels, information programs (data feeds), user interfaces, and other elements of the defense network is a major activity of defense network operation [5,6].
The realm of cyber operations is gradually expanding from domestic to global, and it is difficult to identify an enemy that quickly adapts to a constantly changing operational environment. For this reason, commanders must understand the situation accurately and comprehensively to make rapid decisions. For effective cyber SA, BGP archive data, which are collected from network collection centers worldwide, must be quickly fused and linked with open-source intelligence (OSINT), which is public-source information to which cyber battlefield information-analysis theory is applied, and then visualized. Typical cases of cyber ISR visualization research for cyber situational awareness were analyzed, as shown in Table 1. Each research case was examined in terms of visualization technique, core function, level of detail, and use case.
Table 1.
Visualization work related to cyber ISR for cyber SA.
Soon Tee Teoh et al. (2006) [7] proposed BGP Eye, a visualization tool for analyzing the root cause of BGP anomalies. Unlike previous approaches, BGP Eye analyzed BGP anomalies in real time through hierarchical analysis. In addition, through several valuable points, it provided the ability to analyze BGP anomalies from both the Internet-centric view and the home-centric view of a specific autonomous system (AS). James Shearer et al. (2008) [8] proposed BGPeep, a model that visualizes BGP traffic at a detailed level using a novel depiction of Internet protocol (IP) space. This tool highlights aspects of BGP archive data that had received less attention in previous visualization applications, helping to form a complete picture of an important part of the Internet communications infrastructure. Ernst Biersack et al. (2012) [9] proposed the VIS-SENSE model, which uses network visualization to help analysts detect abnormal routing patterns in vast amounts of BGP archive data. Emphasis was placed on visualizing BGP monitoring data to identify malicious prefix-hijacking attacks. Heinbockel et al. (2016) [10] proposed MN CD2-WP2, a hierarchical graph-based tool that shows interdependencies between mission objectives, operations, information, and cyber assets. It was developed from strategic-level military scenarios within a structured methodology for cyber resilience analysis. Syamkumar et al. (2016) [11] proposed Bigfoot, a BGP update-visualization system designed to highlight and evaluate various behaviors in an update stream. It visualizes network prefixes by the geographic location of their IP addresses and was developed to filter, organize, analyze, and visualize BGP updates so that characteristics and behaviors of interest can be identified effectively.
Alex Ulmer et al. (2018) [12] proposed Global Geo-IP Changes, an interactive visualization system that relies solely on Geo-IP data to raise awareness of data sources. It was developed to analyze suspicious cases over time through the IP-block owner and location information in Geo-IP data. Vinayakumar et al. (2018) [13] proposed an extensible framework for cyber threat situational awareness based on the analysis of domain-name-system data. It can perform web-scale analytics in near real time, analyzing more than 2 million events per second, and was developed to detect and confirm malicious activity in real time from early warning signals. Paulo Fonseca et al. (2019) [14] proposed a model for easily observing the volume and AS route features most commonly used in BGP anomaly-detection technology, together with changes in BGP traffic. It was developed to analyze trends in BGP behavior that can be used to distinguish abnormal behavior and various types of abnormal traffic from normal traffic. Syamkumar et al. (2020) [15] proposed BigBen, a network telemetry-processing system designed to report Internet events (interruptions, attacks, configuration changes, etc.) in an accurate and timely manner. It was developed to identify a wide range of Internet events, characterized by location, range, and duration, and to compare the detected events with those detected by large, active probe-based detection systems. Candela et al. (2020) [16] proposed Upstream Visibility, a model for scenario-based monitoring of Internet events (interruptions, attacks, configuration changes, etc.). Its global view, based on a stacked-area chart, shows the high-level trend in the visibility of IP prefixes, and its local view was developed to check the impact on IP prefix visibility over time. Vinayakumar et al. (2020) [17] proposed a deep learning-based visualized botnet-detection system for the Internet of Things in smart cities. Based on a deep-learning architecture, a domain-generation algorithm is applied to classify normal and abnormal domain names. Various methods were used to understand the characteristics of the data sets and to visualize embedded features. In addition, significant improvements were made in detection speed and false-alarm rate.
Youn et al. (2021) [18] proposed a cyber IPB-visualization model based on BGP archive data for cyber situational awareness. BGP archive data were analyzed and preprocessed, and a cyberspace prototype was implemented in the form of a di-graph based on the Elastic Stack. The model established battlefield-visualization elements for the three layers of cyberspace and is characterized by its application of “cyber intelligence preparation of the battlefield (IPB).” Fernandes et al. (2022) [19] proposed a high-efficiency long short-term memory (LSTM) model for time-series prediction. It can handle large amounts of non-linear time-series data and was developed to predict future growth based on the increase in a specific variable. De Moura Costa et al. (2022) [20] proposed a fog and blockchain software architecture for fast and accurate decision-making. The approach draws on network latency, software scalability, blockchain, and fog computing technologies, and a decentralized infrastructure was developed to enable scalable solutions. Mohamed et al. (2022) [21] proposed a multi-layer protection approach (MLPA) model for advanced persistent threat detection. Using MITRE ATT&CK, the Mimikatz malicious application was used as a credential-dumping technique against all internal devices. The model was developed to apply the approach to the entire infrastructure, starting with the implementation of CPU utilization methods.
2.3. BGP Route Views Project Overview
BGP is a protocol for exchanging routing information, that is, reachability information for IP prefixes, and it underpins the gateway hosts that interconnect networks around the world. The University of Oregon’s Route Views Project is regarded as the best repository of BGP routing data and plays an important role in understanding the global Internet routing system. It has accumulated routing information since 2001, and BGP routing information sent by more than 140 peer-observation monitors at a total of 24 collection points has been collected and recorded in the form of BGP archive data [18,22]. Many studies have been conducted on BGP routing analysis, among which CAIDA’s AS Core Internet Graph research is representative [22]. Analysis of BGP routing information yields diverse information, such as topology changes, routing connections, network instability, network threats, and network attributes [22,23,24,25,26]. For this research, we used the BGP archive data accumulated in the BGP Route Views Project (http://archive.routeviews.org (accessed on 1 June 2022)).
2.4. Hacking Trends by North Korean Cyber-Attack Groups
According to a recent analysis of cyber threat cases based on public information, Kimsuky is the most active of the North Korean attack groups in carrying out cyber-attacks, whereas for Lazarus, including Andariel, many infringement indicators have been identified relative to the number of cyber-attacks. Venus 121, by contrast, has slowed down in terms of both cyber-attacks and infringement indicators [1,2,3]. To confirm the attack pattern of each attack group, the main attack targets, types, and techniques of each North Korean cyber-attack group were analyzed, as shown in Table 2.
Table 2.
Major targets and techniques for each cyber-attack group in North Korea.
As North Korea has recently officially secured a COVID-19 vaccine through international organizations, it is expected that Lazarus’ attacks aimed at stealing COVID-19 vaccine information will increase further [3].
In addition, as bitcoin is emerging as a cryptocurrency that could serve as a new safe asset replacing gold, continued hacking attacks aimed at stealing cryptocurrency are expected [1,3].
2.5. A Study on Cyber-Attack Traceback
Yogesh et al. (2020) [27] built Root Tracker, a network forensics framework for identifying real sources of cybercrime beyond network Internet service providers (ISP). This was done in a real-time environment to identify the attacker’s device and generate a partial-evidence-match report, even if the attacker formats the system or modifies device parameters. Nur et al. (2021) [28] proposed an AS trace-packet marking technique to infer the AS-level forward path from the attacker to the victim site. Using this, it was shown that the victim site can construct an AS-level forwarding path from the attacker site after receiving a single packet. Nur et al. (2018) [29] proposed a probabilistic packet-marking method that infers a forward path from an attacker site to a victim site and allows the victim to delegate the defense to an upstream Internet service provider. This was implemented by utilizing the record path function of the IP protocol, and compared to other technologies, it showed that the number of packets required to construct a path from the attacker site to the victim site is small. Wang et al. (2018) [30] proposed a countermeasure against specifically targeted ransomware by trapping the attacker through a network deception environment and then using a backtracking technique to identify the attack source. The deception environment consisted of an analysis system that collects tracking clues and automatically extracts and analyzes the collected clues while trapping the attacker.
3. Hacking Case Profiling Analysis of North Korean Cyber-Attack Groups for Cyber ISR Battlefield Visualization
A sequence diagram was conceived to profile and analyze the hacking cases of North Korean cyber-attack groups. Following this procedure, we first analyzed the attacker’s behavior based on the MITRE ATT&CK framework. We extracted actual North Korean infringement indicators by profiling and analyzing Kimsuky phishing attacks that exploited the COVID-19 vaccine issue. In addition, malicious HWP document-structure analysis was performed and malicious phishing sites and C2 servers were analyzed. The information obtained through this process was used for visualization implementation and verification.
3.1. Sequence Diagram for Hacking Case Profiling Analysis
As COVID-19 spreads around the world, cyber threats that exploit the situation persist in cyberspace. Attackers are conducting various types of attacks, such as distributing malicious code and malicious apps, leaking personal information, and committing financial fraud, using social-engineering techniques that exploit COVID-19, such as phishing and smishing. In particular, amid ongoing cyber-attacks targeting domestic medical personnel, attempts to steal related research data continue as domestic COVID-19 vaccination comes into view. In the second half of 2020, a change in the attack technique using Hangul Word Processor (HWP) document files was detected: it shifted from the PostScript method, which had been widely used in the past, to the object linking and embedding (OLE) method [1]. OLE refers to the linking and embedding of objects; the HWP application uses the term “entity” instead of “object.” Accordingly, we conducted an in-depth analysis of hacking cases presumed to be the work of the North Korean cyber-attack group Kimsuky.
The procedure for in-depth analysis of hacking cases of North Korean cyber-attack groups is shown in Figure 1.
Figure 1.
Sequence diagram for profiling Kimsuky’s phishing attack.
--First, the MITRE ATT&CK framework is applied to analyze the attacker’s behavior. Through this, the tactics and strategies of each North Korean hacker group are predicted.
--Second, an in-depth analysis of hacking cases presumed to be Kimsuky, a North Korean cyber-attack group, is conducted using profiling techniques.
--Third, the malicious code structure of the HWP document is analyzed. Through this, the malicious function and the purpose of the attacker are identified.
--Fourth, phishing sites and C2 servers that have been exploited are analyzed through malicious code analysis.
--Fifth, based on the BGP archive data, cyber information collection and analysis are performed to trace the origin of the attack.
--Sixth, we visualize the di-graph-based network path through cyber information collection and analysis.
--Seventh, we implement cyber ISR visualization based on the prototype architecture. Through this, the origin of the attack from the North Korean cyber hacking group is identified and estimated.
3.2. Analysis of Attacker-Behavior Based on the MITRE ATT&CK Framework
Recently, North Korean hacker groups have continuously attempted spear-phishing attacks that exploit social-engineering attack techniques and social issues. As shown in Table 3, tactics and strategies for each North Korean hacker group were predicted through attacker behavior analysis based on the MITRE ATT&CK framework for related cases. In particular, the study focused on phishing hacking cases from Kimsuky, a North Korean cyber-attack group.
Table 3.
Analysis of attacker behavior based on the MITRE ATT&CK framework.
3.3. Profiling of Kimsuky Phishing Attack That Exploits COVID-19 Vaccine Issue
It is estimated that in January 2022, the North Korean cyber-attack group Kimsuky distributed a malicious HWP file that exploited the COVID-19 vaccine issue to steal information [3]. The distribution targets were employees of domestic health-related government agencies and pharmaceutical companies. The attack combined hacking emails with phishing sites: a malicious HWP file with a malicious OLE entity inserted was attached to the email [31,32,33,34]. The threat actor arouses the recipient’s curiosity with the attack email and lures them into downloading and executing the attached file titled “COVID-19 Reinfection Case_Vaccine Useless.hwp,” as shown in Figure 2.
Figure 2.
Execution screen of malicious HWP file using a news article.
The HWP document carrying the malicious code reproduces content from domestic medical media verbatim and is disguised as a document issued by a government agency by using the logo of Korea’s Ministry of Health and Welfare. Nothing unusual is visible to the naked eye, but a transparent, square-shaped entity is in fact sized to cover the entire area. When the transparent entity is clicked, the malicious executable OLE file (Microsoft.vbs) included in the HWP document is invoked.
To summarize the phishing-attack process, a malicious HWP document is attached to an email and delivered to the attack target. The attack inserts the malicious module into the document by abusing the normal OLE function, which is not a security vulnerability, and then induces the user to execute it [34,35,36]. In particular, because the OLE method does not rely on a security vulnerability, users remain exposed to the risk even when they run the latest, fully updated version of the product.
As a result of applying the ATT&CK Framework to the phishing attack that exploited the COVID-19 vaccine issue, the ATT&CK-based attack technology exploited by the Kimsuky group was analyzed as shown in Table 4.
Table 4.
ATT&CK-based attack technology exploited by Kimsuky.
3.4. Structure Analysis of Malicious HWP Documents
The following analysis shows the malicious codes that were added to the HWP document and the functions they perform. The malicious HWP files used for analysis related to phishing attacks were obtained through cooperation with the private security-response center. To check the reliability of the malicious file used for hacking, it was also obtained through the dark web and a hash-value comparison process was performed. Table 5 shows the properties of the “COVID-19 Reinfection Case_Vaccine Useless.hwp” file attached to the email.
Table 5.
Property information of malicious HWP documents and malicious OLE files.
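The hash-value comparison mentioned above can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the choice of SHA-256 and the function names `sha256_of` and `same_sample` are assumptions made for illustration only.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_sample(path_a, path_b):
    """True when two files obtained from different sources are byte-identical."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Calling `same_sample` on the copy received from a response center and on a copy obtained elsewhere would indicate whether both sources hold the same file.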
Looking at the internal structure of the HWP document in Figure 3, the “BIN0005.OLE” stream is included. The “BIN0005.OLE” stream contains a malicious file designated by the “Microsoft.vbs” file name inside.
Figure 3.
Malware screen included in the “BIN0005.OLE” area.
Looking at the Visual Basic Script (VBS) code of the malicious “Microsoft.vbs” file, the main functions and core routines are encoded in Base64 and hidden, as shown in Figure 4. When the code is executed, it is decoded and loaded into memory, and additional commands are then executed while evading detection by security devices [36,37].
Figure 4.
PowerShell command and Base64 encoding screen of the malicious file (Microsoft.vbs).
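For static analysis, hidden Base64 routines of this kind can be recovered without running the script. The following Python sketch shows one possible approach under stated assumptions: the 16-character minimum run length, the UTF-8 decoding, and the name `decode_base64_runs` are illustrative choices, not details taken from the analyzed sample (PowerShell encoded commands, for example, often use UTF-16LE instead).

```python
import base64
import binascii
import re

# Assumption: an encoded routine appears as a run of at least 16 Base64 characters.
B64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def decode_base64_runs(script_text):
    """Return the decoded text of every well-formed Base64 run in a script."""
    decoded = []
    for match in B64_RUN.finditer(script_text):
        candidate = match.group(0)
        if len(candidate) % 4 != 0:
            continue  # not a complete Base64 string
        try:
            raw = base64.b64decode(candidate, validate=True)
        except (binascii.Error, ValueError):
            continue
        # Assumption: the payload is UTF-8 text; undecodable bytes are replaced.
        decoded.append(raw.decode("utf-8", errors="replace"))
    return decoded
```

Long identifiers can also match the pattern, so the output is a list of candidates for an analyst to review rather than a definitive extraction.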
In addition, as shown in Figure 5, a command is registered in the registry under the Run value with the name “February” and is set to run automatically when the system starts. It communicates covertly by combining a PowerShell command with the encrypted C2 server address (http://***950.cafe24.com/bbs/Samsung/do.php (accessed on 1 January 2021)).
Figure 5.
Registration screen of registry and screen for setting communication with C2 server.
In addition, as shown in Figure 5, the script communicates with the C2 server using the computer name of the attack target. Through this, actions such as information stealing and remote control can be performed according to additional responses and commands prepared by the threat actor [34,35].
3.5. Analysis of Malicious Phishing Sites and C2 Server
The C2 server identified through malicious-code analysis was a private site (Korea NICE credit information) for a debt-collection service hosted by a domestic hosting company. The attacker took control of the poorly managed web server and set up a space for the attack on it. Searching for all virus information about the domain and IP of the server showed, as in Figure 6, that the domain was http://***950.cafe24.com (accessed on 1 January 2021), the IP was 222.122.8*.***, and the AS was AS4766 (KR).
Figure 6.
Screen of malicious phishing sites and total virus information.
After analyzing the web access and error log for the abused C2 server, a list of attack IPs in the victim server was identified and is shown in Table 6.
Table 6.
Attack IP range confirmed on the abused server.
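The extraction of attack IPs from a web access log can be sketched as follows. This is an illustration only: the log is assumed to follow the common Apache-style access-log layout, which the study does not specify, and `count_clients` and the sample path marker are hypothetical.

```python
import re
from collections import Counter

# Assumption: Apache-style line, e.g.
# 203.0.113.7 - - [10/Jan/2022:10:00:00 +0900] "GET /path HTTP/1.1" 200 512
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def count_clients(log_lines, path_marker):
    """Count requests per client IP whose request path contains path_marker."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and path_marker in match.group("path"):
            hits[match.group("ip")] += 1
    return hits
```

Passing the directory that hosts the attacker's scripts as `path_marker` yields the clients that contacted it, ranked by request count.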
4. Implementation of Kimsuky’s Attack-Origin Backtracking Model through Cyber ISR Battlefield Visualization Based on BGP Archive Data
Section 4 describes the cyber information collection and analysis methods used to trace back the attack origin. It then describes the design of network path visualization based on BGP archive data and the implementation of cyber ISR visualization.
4.1. Cyber Information Collection and Analysis Method for Backtracking of the Attack Origin
Network forensics was carried out to identify abnormal behavior in network flows through packet analysis. In addition, Maltego was used to examine topology objects, from the North Korean network topology view down to endpoint OSINT information, and this process verified the effectiveness of cyber ISR and of the backtracking process for the origin of the attack. In hacking attacks, North Korea mainly routes traffic through various transit points to disguise the source IP, and many reports also state that hiding IPs behind proxy servers is a common technique [38,39].
However, thanks to advances in backtracking techniques, IP backtracking remains effective. Backtracking technology falls broadly into two types: IP-packet backtracking, which identifies the source of a distributed denial of service (DDoS) attack, and TCP-connection backtracking, which identifies the actor behind an attack relayed through intermediate hosts. Each technology has its limitations, but as the technical difficulties are overcome, IP analysis remains an important method for discovering the party behind a hack [39].
Two types of IP ranges frequently appear in relation to North Korean hacking, as shown in Table 7. One is the IP range of North Korea’s Ministry of Posts and Telecommunications, which leases Internet access from China, and the other is the IP range managed by Star Joint Venture, a joint venture between North Korea’s Ministry of Posts and Telecommunications and Thailand’s Loxley Pacific Group. If the IP address of a computer used for hacking falls within these ranges, the government presumes that North Korea was responsible [39].
Table 7.
List of hacking IP ranges in North Korea.
Through this analysis, the attacking IP 121.18.8*.***, which was identified in the web-access log of the server abused as a phishing site, was identified as an AS4837 node managed in China.
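Checking whether an observed address falls inside a watched IP range, as described in this subsection, can be sketched with Python's standard `ipaddress` module. The ranges in the usage note are documentation placeholders (RFC 5737), not the ranges listed in Table 7, and `matching_ranges` is a hypothetical helper name.

```python
import ipaddress


def matching_ranges(address, watched_ranges):
    """Return the labels of all watched CIDR ranges that contain the address."""
    ip = ipaddress.ip_address(address)
    return [
        label
        for label, cidr in watched_ranges.items()
        if ip in ipaddress.ip_network(cidr)
    ]
```

For example, with `watched = {"range-a": "192.0.2.0/24", "range-b": "198.51.100.0/24"}`, the call `matching_ranges("192.0.2.77", watched)` returns `["range-a"]`, while an address outside both ranges returns an empty list.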
4.2. Design of Network Route Visualization Using BGP Archive Data
After the published BGP, OSINT, and IP geolocation data were converted into GeoJSON format, an integrated intelligence DB for visualization was built, and the structure was designed to link with Elasticsearch and Kibana’s Elastic Maps, as shown in Figure 7 [40]. Whereas only fragmentary host-area information had previously been analyzed, the framework was designed to analyze the cyber-attack lifecycle through network-area information. Whereas identification had relied on the analyst’s manual analysis, it was designed to provide macroscopic visualization through network characteristic information. And whereas limited information had been collected and analyzed separately by each institution, the TTPs and MITRE ATT&CK data of various cyber-attack groups were combined on the basis of BGP archive data. The profiling analysis data were then used to verify the results and to uncover new information within them.
Figure 7.
Design of cyber ISR visualization framework architecture. (Adapted from Go et al. Proc AIS 2022; p. 9, with permission of Dailysecu Press [40].)
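To make the GeoJSON conversion step concrete, the following is a minimal sketch of turning one AS path into a GeoJSON `LineString` that a map layer could ingest. The coordinates are rough, hypothetical placements chosen for illustration; in practice they would come from the IP geolocation data. The function name `as_path_to_geojson` is likewise an assumption and not part of the published framework.

```python
import json

# Hypothetical approximate (longitude, latitude) per AS, for illustration only.
AS_LOCATIONS = {
    131279: (125.75, 39.03),  # around Pyongyang
    4837: (116.40, 39.90),    # around Beijing
    4766: (126.98, 37.57),    # around Seoul
}


def as_path_to_geojson(as_path, locations):
    """Build a GeoJSON FeatureCollection with one LineString for the path."""
    # GeoJSON positions are ordered [longitude, latitude].
    coords = [list(locations[asn]) for asn in as_path if asn in locations]
    if len(coords) < 2:
        raise ValueError("need at least two geolocated ASes to draw a line")
    feature = {
        "type": "Feature",
        "geometry": {"type": "LineString", "coordinates": coords},
        "properties": {"as_path": " ".join(str(asn) for asn in as_path)},
    }
    return {"type": "FeatureCollection", "features": [feature]}


if __name__ == "__main__":
    collection = as_path_to_geojson([131279, 4837, 4766], AS_LOCATIONS)
    print(json.dumps(collection, indent=2))
```

Keeping the AS path in `properties` lets the map layer filter or label routes without a second lookup.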
To visualize the global network on a GeoMap, the BGP archive data must first be dumped and pre-processed. This research used BGP archive data from the Route Views Project repository of the University of Oregon, a widely used public BGP archive. To shorten pre-processing, a Python-based BGP archive-data downloader and a BGP archive-data parser were written and used in the research process. The parsed, pre-processed data were then extracted and built into an integrated DB, as shown in Figure 8 [40]. Information related to cyber-attacks was collected, pre-processed, and organized into a DB for management. The collection channels for information on attack behavior and infrastructure were expanded, as was the DB for data-relation configuration.
Figure 8.
Design of DB for collection and pre-processing data relation. (Adapted from Go et al. Proc AIS 2022; p. 10, with permission of Dailysecu Press [40].)
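The downloader and parser themselves are not listed in the paper, so the following is only a sketch of what such helpers might look like. It assumes the public Route Views archive URL layout and the one-line-per-route text format produced by `bgpdump -m`; both the function names and the choice of that text format are assumptions.

```python
from datetime import datetime

# Assumed base URL of the public Route Views archive.
ARCHIVE_BASE = "http://archive.routeviews.org/bgpdata"


def rib_url(snapshot):
    """Build the URL of the RIB dump for a given snapshot time."""
    return f"{ARCHIVE_BASE}/{snapshot:%Y.%m}/RIBS/rib.{snapshot:%Y%m%d.%H%M}.bz2"


def parse_bgpdump_line(line):
    """Parse one pipe-delimited RIB line (as printed by `bgpdump -m`).

    Returns a dict with the peer AS, the prefix, and the AS path as a list
    of integers, or None if the line is not a table-dump record.
    """
    fields = line.rstrip("\n").split("|")
    if len(fields) < 7 or not fields[0].startswith("TABLE_DUMP"):
        return None
    # Non-numeric tokens such as AS sets ("{1,2}") are skipped in this sketch.
    as_path = [int(token) for token in fields[6].split() if token.isdigit()]
    return {"peer_as": int(fields[4]), "prefix": fields[5], "as_path": as_path}
```

For the snapshot used in the di-graph example, `rib_url(datetime(2022, 6, 1, 10, 0))` points at the file `rib.20220601.1000.bz2`. Downloading and decompression are left out so that the sketch stays free of network access.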
To analyze the AS path, a di-graph was drawn with the information extracted from the BGP archive data. The BGP AS route map between North Korea and China was visualized in the form of a di-graph, as shown in Figure 9, and the command code for the di-graph visualization is as follows.
| cat rib.20220601.1000.AS131279.tsv | tr "(" " " | tr ")" " " | awk '$2!=$4{print $2 "\t" $4}' | awk 'BEGIN{print "digraph{"} {print $0} END{print "}"}' | dot -T png -o rib.20220601.1000.AS131279.cntry.png |
Figure 9.
Di-graph visualization of BGP network topology between North Korea and China.
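The same di-graph idea can be expressed in Python. The sketch below is an illustration under stated assumptions rather than the authors' script: it takes AS paths as lists of integers, a hypothetical input shape, and emits Graphviz DOT text with one directed edge per adjacent AS pair.

```python
def as_paths_to_dot(as_paths):
    """Return Graphviz DOT text with one directed edge per adjacent AS pair."""
    edges = set()
    for path in as_paths:
        for left, right in zip(path, path[1:]):
            if left != right:  # ignore repeated ASes caused by path prepending
                edges.add((left, right))
    lines = ["digraph {"]
    for left, right in sorted(edges):
        lines.append(f'  "AS{left}" -> "AS{right}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative paths only; real input would come from the parsed RIB data.
    print(as_paths_to_dot([[4766, 4837, 131279], [4766, 4837, 4837, 131279]]))
```

The printed text can be saved to a file and rendered with the Graphviz `dot` tool. Deduplicating edges in a set keeps the picture readable when many prefixes share the same path.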
Through this, it was possible to create and analyze not only the AS-level network between North Korea and China but also the global network topology, including autonomous system numbers (ASNs), detailed AS routes, core nodes, and relay nodes. As shown in Figure 10, the attack path was confirmed from AS4837 in China to AS4766 in Korea, and because AS4837 in China connects to AS131279 in North Korea through a single path, the origin of the attack is presumed to be North Korea.
Figure 10.
Visualization of North Korean network routes based on BGP archive data.
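The single-path argument can be checked mechanically: if every observed AS path that ends at the origin AS arrives through the same neighbor, that neighbor is the only upstream seen in the data. The sketch below assumes AS paths as integer lists with the origin AS last; the helper names and sample paths are hypothetical.

```python
def upstream_neighbors(as_paths, origin_as):
    """Collect the ASes that appear directly before origin_as in any path."""
    upstreams = set()
    for path in as_paths:
        # Collapse consecutive duplicates produced by AS-path prepending.
        collapsed = [asn for i, asn in enumerate(path) if i == 0 or asn != path[i - 1]]
        if len(collapsed) >= 2 and collapsed[-1] == origin_as:
            upstreams.add(collapsed[-2])
    return upstreams


def has_single_upstream(as_paths, origin_as):
    """True when exactly one upstream AS is observed for origin_as."""
    return len(upstream_neighbors(as_paths, origin_as)) == 1
```

A result of `True` holds only for the collected vantage points and snapshot time, so it supports the presumption of origin rather than proving it.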
4.3. Implementation of Cyber ISR Visualization Based on BGP Archive Data
As in the design, the published BGP, OSINT, and IP geolocation data were converted into GeoJSON format, an integrated intelligence DB for visualization was built, and the structure was linked with ElasticSearch and Kibana’s ElasticMap [18]. Using the prototype architecture, we visualized the IP traceback. The IP range that North Korea used for hacking, within the ISP managed by Star Joint Venture, was taken as the estimated origin of the attack, and the IP traceback process was carried out. In addition, the public IP addresses believed to have been used for hacking within the North Korean network could be identified. On this basis, a prototype cyber ISR view of North Korea was visualized from BGP archive data, as shown in Figure 11.
Figure 11.
Visualization of North Korean cyber ISR based on BGP archive data.
An interesting fact emerged during the analysis and visualization process. Analyzing the connection diagram of North Korea’s network topology yields a total of five attack routes from North Korea to South Korea via China.
--First, a route utilizes a virtual private network (VPN) gate. This route leads from North Korea to South Korea via Japan.
--Second, a route utilizes a commercial VPN (NordVPN). This route leads from North Korea to South Korea via Japan.
--Third, a route utilizes domain-name-system (DNS) tunneling. This route connects North Korea to South Korea via Europe (Switzerland, London) and other areas (Singapore, etc.).
--Fourth, a route utilizes a private L3 VPN. This route leads from North Korea to South Korea via Kenya.
--Fifth, a route uses an Apple desktop based on macOS. This route connects North Korea to South Korea via the United States and Middle Eastern countries (Bahrain, etc.).
In particular, from around September 2021, the fifth route shifted rapidly from a Windows-based desktop environment to an Apple desktop environment, and Apple Remote Desktop traffic increased rapidly in North Korea’s networks. In addition, traffic from North Korea that attempted hacking attacks after passing through the Middle East increased continuously. One interpretation is that, as the size and activity area of the attack group grew, hacking operations based on Windows and Linux reached their tactical limits. Accordingly, the operating environment is estimated to have been shifting to the more versatile Apple macOS.
An additional test was conducted to analyze the attack infrastructure and its communication with the infected hosts. The analysis used two-way communication data collected from domestic and overseas network sections, keyed on the IP addresses of the victims. Relay points were derived from the communications observed at the infected IPs, and additional data on these relay points were then secured. The data for analysis were obtained from the Pure Signal Recon Company. Analysis of the communication logs for the seven IPs designated as victims confirmed that six of them exposed network services to the outside, as shown in Table 8.
Table 8.
Externally exposed network service operating in the victim IP.
Internet-section communication logs were available for the affected hosts, and the average number of bytes per packet for the seven IPs was analyzed, as shown in Figure 12.
Figure 12.
Average bytes per packet for the seven victim IPs: outbound (left) and inbound (right).
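The per-IP average shown in Figure 12 is a simple aggregate. As a minimal sketch, assuming each flow record is a tuple of victim IP, direction, packet count, and byte count (a hypothetical record shape, since the paper does not give the log schema), it could be computed as follows.

```python
from collections import defaultdict


def average_bytes_per_packet(flows):
    """Average bytes per packet for each (victim_ip, direction) pair.

    flows: iterable of (victim_ip, direction, packets, byte_count) tuples.
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [packets, bytes]
    for victim_ip, direction, packets, byte_count in flows:
        entry = totals[(victim_ip, direction)]
        entry[0] += packets
        entry[1] += byte_count
    return {
        key: total_bytes / packets
        for key, (packets, total_bytes) in totals.items()
        if packets > 0
    }
```

Small averages tend to correspond to scans or signaling, while large averages point to data transfer, which is the distinction the analysis relies on.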
The analysis showed that most inbound communications were traffic without an established connection, of the kind produced by network scans. Outbound communication was also mostly traffic without an established connection; among the traffic with established connections, however, both signal-transmission-type and data-transmission-type communication occurred. When the inbound and outbound ratios of communications involving the victim IPs were checked, outbound traffic accounted for 99.97%. This suggests that the attacker had taken control of the affected hosts and obtained unauthorized access. It also indicates that the attack not only targeted the internal system but also used the compromised hosts as a new attack base. The structure of the communication traffic generated in the affected area is shown in Figure 13.
Figure 13.
Sampling the structure of communication traffic generated in the affected area.
The nodes marked in blue are the victim IPs located in the Republic of Korea, and the arrows show the direction of communication. Notably, most domestic IPs were being attacked through network vulnerabilities. At the same time, the compromised hosts were conducting network-vulnerability attacks on servers in the overseas section.
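The inbound/outbound ratio used in this analysis can be sketched as below. The sketch assumes flow records given as (source IP, destination IP) pairs and counts records rather than bytes; whether the reported 99.97% was measured by records, packets, or bytes is not stated, so that choice is an assumption.

```python
def outbound_ratio(flows, victim_ip):
    """Percentage of the victim's flow records that the victim initiated.

    flows: iterable of (src_ip, dst_ip) pairs. Returns None if the victim
    does not appear in any record.
    """
    flows = list(flows)
    outbound = sum(1 for src, _dst in flows if src == victim_ip)
    inbound = sum(1 for _src, dst in flows if dst == victim_ip)
    total = outbound + inbound
    if total == 0:
        return None
    return 100.0 * outbound / total
```

A ratio close to 100% for a server that should mostly receive requests is the kind of signal that suggests the host is being used as a relay or attack base.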
5. Conclusions
This paper provided an overview of cyber ISR and BGP and studied cyber ISR visualization for situational awareness, hacking trends of North Korean cyber-attack groups, and cyber-attack tracking. In particular, the hacking attack case of Kimsuky, a North Korean cyber-attack group that attempted to steal research data related to COVID-19 in January 2022, was analyzed using a profiling technique. The origin of the attack was estimated from the verified North Korean hacking IP range through cyber intelligence-based profiling techniques and correlation analysis using BGP archive data.
This research helps a commander recognize the cyber situation at the command-and-control level. The network space of the cyber battlefield was visualized, and a cyber ISR visualization method based on BGP archive data was proposed. The paper proposed an architecture for the visualization model and implemented a prototype as preliminary research toward a better model. To that end, Section 3 analyzed hacking cases, and Section 4 applied the cyber ISR visualization model by correlating the analyzed breach indicators with BGP archive data. Because the model has not yet been fully developed and implemented, further evaluation is limited at this stage.
In the future, in connection with cyber command-and-control-system research, we plan to research the cyber battlefield area, cyber ISR, and a traceback visualization model for the origin of an attack. The final R&D goal is to develop an AI-based cyber-attack group automatic identification and attack-origin tracking platform by analyzing cyber-attack behavior and infrastructure lifecycle. First, we will develop technologies to collect and manage information related to cyber-attack behavior and infrastructure. Lifecycle data of cyber-attacks, both structured and unstructured, will be collected and pre-processed to build the DB. The channels for collecting information on attack behavior and infrastructure will be expanded, as will the DB for data-relation configuration. Second, cyber-attack group-clustering technology based on network infrastructure and network domain-characteristic information will be developed. The characteristics of each cyber-attack group’s network infrastructure and network-weakness information will be extracted through feature engineering, and a multi cyber-attack group-clustering model will be constructed. Third, an AI-based attack group-infrastructure identification technology will be developed by analyzing the cyber-attack lifecycle. We plan to develop an AI-based group identification module that extracts the linking and distinguishing characteristics of each cyber-attack group, links them with network-area information, and learns the characteristic information.
The final performance goals of the study are 90% identification accuracy for attack groups and 129 identifiable attack groups. The targets for the number of pre-matrix tactics and techniques and for the number of attack-group-related feature data are 30 and 300, respectively. In addition, AI-based attack-infrastructure identification is targeted to improve performance by more than 80% compared with manual work. Accordingly, the main scientific contribution is the ability to identify attack groups quickly and respond to them effectively on the basis of linked network-infrastructure information. Securing the source technology for cyber warfare response can also help secure national information-protection technology. The lifecycle analysis of the attack infrastructure can further be expected to open new research and technology fields.
Author Contributions
Conceptualization, J.Y., K.K., and D.S.; funding acquisition, D.S.; methodology, J.Y., D.K., and J.L.; design of cyber ISR visualization, J.Y., K.K., and J.L.; supervision, D.S.; validation, J.L. and M.P.; writing—original draft, J.Y., K.K., and D.K.; writing—review and editing, D.S. All authors have read and agreed to the published version of the manuscript.
Funding
This research was supported by the Future Challenge Defense Technology Research and Development Project (9129156) hosted by the Agency for Defense Development Institute in 2020.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| APT | Advanced persistent threat |
| ISR | Intelligence, surveillance, and reconnaissance |
| BGP | Border gateway protocol |
| C2 | Command and control |
| BDA | Battle damage assessment |
| SA | Situational awareness |
| COP | Common operational picture |
| OSINT | Open-source intelligence |
| IP | Internet protocol |
| AS | Autonomous system |
| ASN | Autonomous system number |
| HWP | Hangul word processor |
| OLE | Object linking and embedding |
| VBS | Visual basic script |
| VPN | Virtual private network |
| DNS | Domain name system |
| OS | Operating system |
| KR | Republic of Korea |
| DDoS | Distributed denial of service |
References
- Joint Cybersecurity Advisory. North Korean Advanced Persistent Threat Focus: Kimsuky; Cybersecurity and Infrastructure Security Agency (CISA): Arlington, VA, USA, 2020.
- Joint Cybersecurity Advisory. Guidance on the North Korean Cyber Threat; Cybersecurity and Infrastructure Security Agency (CISA): Arlington, VA, USA, 2020.
- Kim, H.K.; Kim, H.J.; No, Y.H. KISA Cyber Security Issue Report: Q4 2020; Korea Internet & Security Agency (KISA): Seoul, Republic of Korea, 2021.
- Miller, K.S. ATP 2-01.3 Intelligence Preparation of the Battlefield; Department of the Army: Washington, DC, USA, 2019.
- Scott, K.D. Joint Publication (JP) 3-12 Cyberspace Operation; The Joint Staff: Washington, DC, USA, 2018.
- Robert, G. Situation Awareness in Defensive Cyberspace Operations: An Annotated Bibliographic Assessment through 2015; NIWC Pacific: San Diego, CA, USA, 2019.
- Soon, T.T.; Supranamaya, R.; Antonio, N.; Chen, N.C. BGP Eye: A New Visualization Tool for Real-time Detection and Analysis of BGP Anomalies. In Proceedings of the 3rd International Workshop on Visualization for Computer Security, Alexandria, VA, USA, 3 November 2006; ACM: New York, NY, USA, 2006; pp. 81–90.
- Shearer, J.; Ma, K.L.; Kohlenberg, T. BGPeep: An IP-Space Centered View for Internet Routing Data. In Proceedings of the International Workshop on Visualization for Computer Security, Cambridge, MA, USA, 15 September 2008; Springer: Berlin/Heidelberg, Germany, 2008.
- Biersack, E.; Jacquemart, Q.; Fischer, F.; Fuchs, J.; Thonnard, O.; Theodoridis, G.; Tzovaras, D.; Vervier, P.-A. Visual analytics for BGP monitoring and prefix hijacking identification. IEEE Netw. 2012, 26, 33–39.
- Heinbockel, W.; Noel, S.; Curbo, J. Mission Dependency Modeling for Cyber Situational Awareness. In Proceedings of the NATO IST-148 Symposium on Cyber Defence Situation Awareness, McLean, VA, USA, 30 October 2016; pp. 1–14.
- Syamkumar, M.; Duraiajan, R.; Barford, P. Bigfoot: A Geo-based Visualization Methodology for Detecting BGP Threats. In Proceedings of the 2016 IEEE Symposium on Visualization for Cyber Security (VizSec), Baltimore, MD, USA, 24 October 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 1–8.
- Ulmer, A.; Schufrin, M.; Sessler, D.; Kohlhammer, J. Visual-Interactive Identification of Anomalous IP-Block Behavior Using Geo-IP Data. In Proceedings of the 2018 IEEE Symposium on Visualization for Cyber Security (VizSec), Berlin, Germany, 22 October 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–8.
- Vinayakumar, R.; Poornachandran, P.; Soman, K.P. Scalable Framework for Cyber Threat Situational Awareness Based on Domain Name Systems Data Analysis. In Big Data in Engineering Applications; Roy, S.S., Samui, P., Deo, R., Ntalampiras, S., Eds.; Springer: Singapore, 2018; pp. 113–142.
- Fonseca, P.; Mota, E.S.; Bennesby, R.; Passito, A. BGP Dataset Generation and Feature Extraction for Anomaly Detection. In Proceedings of the 2019 IEEE Symposium on Computers and Communications (ISCC 2019), Barcelona, Spain, 9 June–3 July 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–6.
- Syamkumar, M.; Gullapalli, Y.; Tang, W.; Barford, P.; Sommers, J. BigBen: Telemetry Processing for Internet-wide Event Monitoring. arXiv 2022, arXiv:2011.10911.
- Candela, M.; Di Battista, G.; Marzialetti, L. Multi-view Routing Visualization for the Identification of BGP Issues. J. Comput. Lang. 2020, 58, 100966.
- Vinayakumar, R.; Alazab, M.; Srinivasan, S.; Pham, Q.V.; Padannayil, S.K.; Simran, K. A Visualized Botnet Detection System Based Deep Learning for the Internet of Things Networks of Smart Cities. IEEE Trans. Ind. Appl. 2020, 56, 4436–4456.
- Youn, J.; Oh, H.; Kang, J.; Shin, D. Research on Cyber IPB Visualization Method based on BGP Archive Data for Cyber Situation Awareness. KSII Trans. Internet Inf. Syst. (TIIS) 2021, 15, 749–766.
- Fernandes, F.; Stefenon, S.F.; Seman, L.O.; Nied, A.; Ferreira, F.C.S.; Subtil, M.C.M.; Klaar, A.C.R.; Leithardt, V.R.Q. Long short-term memory stacking model to predict the number of cases and deaths caused by COVID-19. J. Intell. Fuzzy Syst. 2022, 42, 6221–6234.
- Costa, H.J.D.M.; Costa, C.A.D.; Righi, R.D.R.; Antunes, R.S.; Santana, J.F.D.P.; Leithardt, V.R.Q. A Fog and Blockchain Software Architecture for a Global Scale Vaccination Strategy. IEEE Access 2022, 10, 44290–44304.
- Mohamed, N.; Alam, E.; Stubbs, G.L. Multi-Layer Protection Approach MLPA for the Detection of Advanced Persistent Threat. J. Posit. Sch. Psychol. 2022, 6, 4496–4518.
- Lee, Y.; Lee, Y. Yet Another BGP Archive Forensic Analysis Tool Using Hadoop and Hive. J. KIISE 2015, 42, 541–549.
- Ozarslan, O.F.; Sarac, K. ZIDX: A Generic Framework for Random Access to BGP Records in Compressed MRT Datasets. In Proceedings of the 2020 29th International Conference on Computer Communications and Networks (ICCCN), Honolulu, HI, USA, 3–6 August 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–8.
- Salido, J.; Nakahara, M.; Wang, Y. An Analysis of Network Reachability Using BGP Data. In Proceedings of the 3rd IEEE Workshop on Internet Applications (WIAPP 2003), San Jose, CA, USA, 23–24 June 2003; IEEE: Piscataway, NJ, USA, 2003; pp. 10–18.
- Demchak, C.C.; Shavitt, Y. China’s Maxim–Leave No Access Point Unexploited: The Hidden Story of China Telecom’s BGP Hijacking. Mil. Cyber Aff. 2018, 3, 7.
- Douzet, F.; Pétiniaud, L.; Salamatian, L.; Limonier, K.; Salamatian, K.; Alchus, T. Measuring the Fragmentation of the Internet: The Case of the Border Gateway Protocol (BGP) During the Ukrainian Crisis. In Proceedings of the 2020 12th International Conference on Cyber Conflict (CyCon), Tallinn, Estonia, 26–29 May 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 157–182.
- Yogesh, P.R. Backtracking Tool Root-tracker to Identify True Source of Cybercrime. Procedia Comput. Sci. 2020, 171, 1120–1128.
- Nur, A.Y.; Tozal, M.E. Single Packet AS Traceback against DoS Attacks. In Proceedings of the 2021 IEEE International Systems Conference (SysCon), Virtual, 15 April–15 May 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1–8.
- Nur, A.Y.; Tozal, M.E. Record Route IP Traceback: Combating DoS Attacks and the Variants. Comput. Secur. 2018, 72, 13–25.
- Wang, Z.; Liu, C.; Qiu, J.; Tian, Z.; Cui, X.; Su, S. Automatically Traceback RDP-based Targeted Ransomware Attacks. Wirel. Commun. Mob. Comput. 2018, 2018, 7943586.
- Lee, J.; Lee, Y.; Lee, D.; Kwon, H.; Shin, D. Classification of Attack Types and Analysis of Attack Methods for Profiling Phishing Mail Attack Groups. IEEE Access 2021, 9, 80866–80872.
- Suganya, V. A Review on Phishing Attacks and Various Anti Phishing Techniques. Int. J. Comput. Appl. Found. Comput. Sci. (FCS) 2016, 139, 20–23.
- Chiew, K.L.; Yong, K.S.C.; Tan, C.L. A Survey of Phishing Attacks: Their Types, Vectors and Technical Approaches. Expert Syst. Appl. 2018, 106, 1–20.
- Qabajeh, T.F.; Chiclana, F. A Recent Review of Conventional vs. Automated Cybersecurity Anti-Phishing Techniques. Comput. Sci. Rev. 2018, 29, 44–55.
- Kim, J.Y.; Bu, S.J.; Cho, S.B. Zero-day Malware Detection Using Transferred Generative Adversarial Networks based on Deep Autoencoders. Inf. Sci. 2018, 460, 83–102.
- Gangavarapu, T.; Jaidhar, C.; Chanduka, B. Applicability of Machine Learning in Spam and Phishing Email Filtering: Review and Approaches. Artif. Intell. Rev. 2020, 53, 5019–5081.
- Lawson, P.; Pearson, C.J.; Crowson, A.; Mayhorn, C.B. Email Phishing and Signal Detection: How Persuasion Principles and Personality Influence Response Patterns and Accuracy. Appl. Ergon. 2020, 86, 103084.
- Kong, J.Y.; Lim, J.I.; Kim, K.G. The All-Purpose Sword: North Korea’s Cyber Operations and Strategies. In Proceedings of the 2019 11th International Conference on Cyber Conflict (CyCon), Tallinn, Estonia, 28–31 May 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–20.
- Shin, C.; Lee, S.J. A Study of Countermeasure and Strategy Analysis on North Korean Cyber Terror. J. Police Sci. 2013, 13, 201–226.
- Go, W. Technology to Attack groups identify based on cyber-attack life-cycle information learning. In Proceedings of the 2th Artificial Intelligence Information Security Conference 2022 (AIS 2022), Dailysecu, Seoul, Republic of Korea, 15 November 2022; pp. 9–10.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).












