6Trace: An Effective Method for Active IPv6 Topology Discovery

Shen, Zhaobin; Chen, Pan; Xie, Yi; Chen, Chiyu; Zhang, Yongheng; Yang, Guozheng

doi:10.3390/electronics14020343


by Zhaobin Shen ^1,2, Pan Chen ^1,2, Yi Xie ^1,2, Chiyu Chen ^1,2, Yongheng Zhang ^1,2 and Guozheng Yang ^1,*

¹ College of Electronic Engineering, National University of Defense Technology, Hefei 230037, China

² Anhui Province Key Laboratory of Cyberspace Security Situation Awareness and Evaluation, Hefei 230037, China

^* Author to whom correspondence should be addressed.

Electronics 2025, 14(2), 343; https://doi.org/10.3390/electronics14020343

Submission received: 3 December 2024 / Revised: 4 January 2025 / Accepted: 14 January 2025 / Published: 17 January 2025

(This article belongs to the Special Issue Network Protocols and Cybersecurity)


Abstract:

Scanning the large-scale topology of the IPv6 internet presents a significant challenge in network engineering, particularly for understanding the underlying network structure and assessing network security. The sheer size of the IPv6 address space makes traditional brute-force scanning techniques, such as traceroute, inefficient and impractical. Existing methodologies are often unable to cope with the inherent complexity of IPv6’s network structure and its probing requirements, leading to issues such as redundant probes, ICMPv6 rate limits, and network congestion. To address these challenges, this paper introduces 6Trace, an innovative solution that mitigates the impact of rate-limiting and congestion by distributing scanning traffic across the network. Furthermore, 6Trace incorporates a stateless, asynchronous scanning approach combined with a bisection-like dynamic probing strategy, significantly reducing redundancy. Experimental results demonstrate that 6Trace enhances scanning efficiency by 70% over current solutions, discovering the maximum number of interface addresses while minimizing probing time. Notably, this paper also provides the first comprehensive analysis of topology probing results across different types of target networks. The insights gained from this study will inform future research on optimal target selection and IPv6 internet measurement techniques.

Keywords:

IPv6; topology; stateless asynchronous scanning

1. Introduction

The transition to an IPv6-based network has become a globally accepted standard for the next-generation Internet. As of December 2024, over 46.7% of Google users [1] accessed their services via IPv6, marking a significant increase compared to 2018. In the rapidly expanding Internet of Things (IoT) sector, IPv6 is crucial for enabling end-to-end connections, ensuring low-latency communication, and supporting the large-scale deployment of connected devices [2]. This shift underscores the pressing need for scalable and efficient network management tools capable of monitoring and mapping IPv6-based network topologies.

Network topology discovery is essential for both optimizing performance [3,4] and enhancing security [5]. For network administrators, access to real-time information concerning network structure, routing dynamics, traffic anomalies, and fault conditions is indispensable [6]. Continuous monitoring of these parameters enables routing optimization, congestion reduction, packet loss minimization, and improved network reliability [7,8,9,10]. Additionally, the ability to rapidly acquire and update large-scale routing information is critical for capturing near-real-time network snapshots. Such snapshots allow administrators to analyze dynamic changes and make informed decisions efficiently.

Studies on IPv4 topology discovery, such as Yarrp [11] and FlashRoute [12], have addressed issues like network congestion and redundancy. These tools successfully traced all /24 prefixes in the IPv4 address space within an hour. However, IPv6 introduces unique challenges that prevent the direct application of these methods. One major issue is the stricter ICMP rate-limiting in IPv6, as specified by RFC 4443 [13], which imposes significantly tighter constraints compared to IPv4. Moreover, ICMPv6 serves a broader range of functions, further intensifying rate-limiting constraints [14].

Although Yarrp6 [15] extends the Yarrp strategy to partially mitigate rate-limiting, it still suffers from significant redundancy in large-scale probing, leading to inefficient resource utilization. Similarly, 6Search [16] improves probing efficiency using reinforcement learning to explore active IPv6 regions. However, it does not adequately address the high probing time or focus on targeting specific addresses. Given the immense size of the IPv6 address space, efficient probing of prefixes containing active addresses is critical for improving accuracy and minimizing invalid probes.

To address these challenges, this paper presents 6Trace, a novel approach for active topology discovery in large-scale IPv6 networks.

6Trace introduces a stateless, asynchronous probing mechanism. This design eliminates the need for synchronization between sending and receiving packets, allowing high-speed parallel probing of target sets. It also reduces delays, enabling efficient handling of high probing volumes during large-scale discovery.

Additionally, 6Trace employs a dynamic traffic distribution strategy to spread probing traffic evenly across routes and hop distances. This approach mitigates network congestion and reduces the impact of ICMP rate-limiting, addressing two key challenges in IPv6 probing.

A core feature of 6Trace is its feedback-based probing strategy, inspired by the DoubleTree algorithm. The strategy starts probing from intermediate hops and progressively narrows the range to maximize efficiency. Redundancy is minimized using a stop set during backward probing. A bisection-like method dynamically adjusts probing points to focus on relevant targets. These mechanisms significantly reduce unnecessary resource consumption.

6Trace organizes its scanning process into iterative rounds, refining strategies based on real-time network routing information. This iterative approach, commonly applied in topology discovery tools [12,17], ensures adaptability and efficiency. Specialized data structures further optimize memory usage and computational overhead. Scanning efficiency, defined as the ratio of discovered active interface addresses [18] to the total number of probes sent, is a key performance metric for 6Trace. The design prioritizes maximizing this efficiency while minimizing probing redundancy and time.
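Written as a formula, the scanning efficiency defined above is shown below. The symbol names are chosen here only for illustration and do not appear in the original text.

```latex
E = \frac{N_{\text{interfaces}}}{N_{\text{probes}}}
```

Here $N_{\text{interfaces}}$ is the number of distinct active interface addresses discovered and $N_{\text{probes}}$ is the total number of probes sent.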

In addition to its core methodology, this study incorporates and evaluates seed sets from multiple open-source IPv6 measurement studies. By analyzing their impact across different network categories, such as NSPs, ISPs, CDNs, and ETPs, this paper highlights how these categories influence discovery efficiency. These findings emphasize the importance of tailoring target selection strategies to network characteristics, improving both accuracy and comprehensiveness.

The main contributions of this paper are as follows:

We propose 6Trace, an innovative tool for large-scale IPv6 network topology discovery. 6Trace leverages a stateless asynchronous scanning mechanism, enabling efficient and scalable probing. A core feature of 6Trace is its feedback-based, bisection-like probing strategy, which dynamically optimizes TTL value adjustments, achieves high-speed probing, minimizes redundancy, and reduces local network load. This design significantly mitigates the impact of large-scale topology discovery on network performance, making it an effective and scalable solution for IPv6 probing.
Through extensive real-world experiments using diverse seed sets, we demonstrate that 6Trace achieves a 70% improvement in scanning efficiency on average. This enhancement enables faster, more comprehensive, and resource-efficient topology discovery compared to existing state-of-the-art methods for IPv6 network measurement.
This study is the first to systematically analyze biases in network topology discovery results. Our findings reveal that targeting Network Service Providers (NSPs) and Internet Service Providers (ISPs) uncovers significantly more extensive topology information compared to other network categories. These results provide actionable insights for selecting optimal target sets to meet varying scanning requirements, thereby improving the accuracy and relevance of future IPv6 network measurement research.

2. Related Work

This section reviews related work from two perspectives: IPv6 topology discovery and IPv6 hitlists.

2.1. IPv6 Topology Discovery

With the rapid expansion of IPv6 networks, the efficiency and comprehensiveness of network topology discovery have become increasingly crucial. Although substantial progress has been made in IPv4 topology discovery, traditional IPv4 methods face significant challenges when applied to IPv6 because of its vast address space, sparse address distribution, and protocol-specific characteristics. Consequently, various new approaches have been proposed to improve the efficiency and accuracy of IPv6 network topology discovery.

Traceroute [19,20,21] is a widely used traditional tool for network topology discovery in IPv4 networks. It operates by probing hop-by-hop to gather path information between the source and destination hosts, utilizing the Time to Live (TTL) value to trace the path. However, the application of traceroute in IPv6 networks is limited due to the vast address space and sparse address distribution, leading to slower probing speeds and redundant probes in large-scale multi-address topologies. Moreover, traceroute has limitations when handling load-balanced paths, making it less effective for IPv6 network topology discovery.

To address redundant probing, the DoubleTree [22] algorithm leverages the tree-like structure of network topologies by selecting an intermediate node to initiate probing, aggregating paths and minimizing redundant probes. While this approach reduces redundancy, it still relies on hop-by-hop probing, which fails to distribute probing traffic effectively and struggles with load-balanced paths, limiting scalability in large-scale IPv6 networks.

To further address load-balancing issues, Paris-Traceroute [19] and the Multipath Detection Algorithm (MDA) [23] introduced multi-path probing strategies. Paris-Traceroute reduces false links caused by load balancing by fixing source/destination ports, while the MDA sends numerous probe packets to identify multiple paths.

The evolution of these traditional topology discovery methods has provided valuable insights into solving issues related to IPv6 topology discovery. However, due to the need for maintaining states and their reliance on synchronous probing, these methods encounter challenges such as slow probing speeds and low efficiency when applied to large-scale networks. To address these issues, newer methods like Yarrp [11], Yarrp6 [15], and FlashRoute [12], inspired by the asynchronous probing approach of ZMap [24], introduce innovations such as decoupling the sending and receiving threads. These improvements significantly enhance scanning efficiency and employ techniques like randomized probing to avoid network congestion and improve probing response rates.

Yarrp6 [15] employs stateless scanning techniques to substantially improve probing speed. Compared to traditional methods, Yarrp6 embeds essential state information directly into the probe, enabling the ICMPv6 TTL-exceeded message to carry details about the probe. This allows Yarrp6 to efficiently reduce redundant probing and increase parallelism. Furthermore, by decoupling the sending and receiving processes, Yarrp6 accelerates probing and utilizes randomized IP × TTL probing to circumvent ICMPv6 rate-limiting. Additionally, Yarrp6 adapts to load-balanced paths by retaining fixed values for specific fields and selecting appropriate probing modes. Despite these improvements, Yarrp6 still fails to fully resolve the redundancy issue, particularly in sparse IPv6 address spaces, leading to efficiency bottlenecks.

FlashRoute integrates Yarrp6’s stateless scanning technique with the DoubleTree algorithm’s redundancy elimination strategy to further optimize the topology discovery process. By decoupling the sending and receiving processes, FlashRoute enhances parallelism and improves scanning efficiency. Additionally, it introduces a Destination Control Block (DCB) to manage the probing logic for each destination and track the progress of the probe. Through a pre-probing process, FlashRoute acquires hop distance information to target nodes, optimizing the probing path and performing forward and backward probing based on this information. However, FlashRoute [12] still encounters issues arising from the hop-by-hop probing strategy in DoubleTree, particularly in IPv6 networks, where probing traffic may concentrate on local networks, leading to ICMPv6 rate-limiting and packet loss, thus diminishing the comprehensiveness of topology discovery. Furthermore, although FlashRoute performs well in IPv4 networks, its performance in IPv6 networks remains underreported, particularly regarding challenges related to IPv6’s sparse address space and rate-limiting.

Recent advancements in IPv6 network topology discovery have built upon existing methods and tools to enhance the comprehensiveness of topology discovery by optimizing probing targets. A notable example is 6Search [16], which introduces a dynamic resource allocation method based on reinforcement learning, allocating more resources to high-value prefixes and thereby optimizing the efficiency of IPv6 network topology discovery. However, 6Search suffers from early convergence, where probing resources become overly concentrated in localized areas, limiting the global coverage of topology discovery, especially in sparse IPv6 address spaces. Furthermore, the reinforcement learning approach in 6Search relies on previous probing results to update rewards, which increases time complexity and makes it difficult to quickly adapt to changes in network topology, particularly in dynamic environments. Although 6Search incorporates the DoubleTree algorithm to reduce redundant probes, it has not fully overcome the algorithm’s limitations, particularly in large-scale IPv6 networks, where it fails to effectively distribute probing traffic. These persistent limitations in large-scale IPv6 networks motivate the design of 6Trace.

In summary, recent advancements in topology discovery methods have made notable progress in enhancing efficiency and comprehensiveness. However, challenges persist, such as redundant probing, handling load-balanced paths, and managing computational complexity in large-scale networks. While 6Search has optimized probing targets using existing tools, it still relies on earlier probing methods with inherent limitations and therefore fails to fully resolve efficiency issues in large-scale IPv6 network discovery. Further optimization and innovation are necessary.

2.2. IPv6 Hitlists

IPv6 hitlists [25] are collections of active IPv6 addresses widely used in Internet measurement and topology discovery. Over the past decade, various techniques have been proposed to generate these hitlists, leveraging address patterns, clustering [26,27], and deep learning algorithms [28,29,30] to identify active regions in the IPv6 address space. These methods aim to address the sparsity and uneven distribution of IPv6 addresses, which pose significant challenges for probing.

Yarrp6 [15] explored the impact of Autonomous System (AS) coverage in hitlist-based seed sets on topology discovery. The study demonstrated that broader AS coverage yields more comprehensive discovery results. Similarly, 6Search [16] showed that using IPv6 prefixes containing hitlist addresses as probing targets improves scanning efficiency by focusing on dense regions of the address space. However, these studies primarily optimize probing efficiency, with limited consideration of the representativeness of hitlists across different network types.

IPv6 hitlists often include addresses from various devices [31] (e.g., routers, web servers, and CPE devices) spanning diverse network types. To mitigate biases in existing datasets, we classify seed sets using PeeringDB [32] into four categories: CDN, ISP, NSP, and ETP. This classification ensures balanced representation across network types, enhancing the comprehensiveness and accuracy of IPv6 topology discovery.

3. Design of 6Trace

This section first gives an overview of 6Trace, followed by a detailed description of its key technological components: (1) Regional State Encoding Technique; (2) Probe Order; and (3) Probing Strategy.

3.1. Overview of 6Trace

6Trace is an efficient active topology discovery method for the IPv6 Internet. Its modular architecture supports dynamic probing with ICMPv6 packets and achieves stateless asynchronous scanning via independent sending and receiving threads. This design enhances scalability and ensures high efficiency in large-scale network probing. The overall framework of 6Trace, illustrating its modular design for targeted seed set probing, is shown in Figure 1.

6Trace retrieves hitlists from multiple seed sets (detailed in Section 4) and employs an address generation module to transform these hitlists into a TargetSet for topology scanning. The LFSR-based shift register module pseudorandomly arranges addresses within the TargetSet. This process evenly distributes the probing load and mitigates the impact of rate-limiting mechanisms. 6Trace subsequently initializes the TargetSet into a Target Node List. Each node contains essential probing parameters, including the target IP address, the initial TTL value, and routing metadata. The packet generation module utilizes this structured data to construct and dispatch ICMPv6 probe packets.

Additionally, 6Trace employs a multi-round iterative probing mechanism combined with a dynamic adjustment strategy. The receiving thread processes response packets from earlier iterations. It decodes the extracted information and dynamically refines the strategy for the next probing round. This asynchronous adjustment cycle iteratively continues until every target node is probed. Finally, the result processing module consolidates and presents the topology discovery results from individual points to the entire TargetSet.

3.2. Regional State Encoding Technique

To achieve efficient large-scale scanning with 6Trace, we implement a technique that encodes probing information directly into packets, removing the need to maintain state. This approach separates sending and receiving tasks, facilitating a streamlined and scalable scanning process. Although various packet types can be used for active topology probing, network targets often respond differently to different protocols due to security measures. According to [33], more than 94% of network devices respond to ICMPv6, establishing it as a reliable choice for network measurement.

ICMPv6 serves a critical role in IPv6 networks, supporting key functionalities like Neighbor Discovery, Stateless Address Autoconfiguration, Neighbor Unreachability Detection, and Router Discovery. These features make ICMPv6 highly suitable for IPv6 measurement research. Accordingly, 6Trace adopts it as the primary protocol for probing operations.

One major advantage of ICMPv6, as specified in [13], is that error messages return as much of the invoking packet as possible without exceeding the minimum IPv6 MTU. This removes the need to encode state information in the packet header: state can instead be placed in the probe’s payload and recovered from the quoted packet in the response. As shown in Figure 2, one byte in the packet is allocated for 6Trace’s unique fingerprint. This fingerprint serves to validate the authenticity of the ICMPv6 response. Another byte encodes the original Time-to-Live (TTL) value, that is, the IPv6 Hop Limit set at transmission, which plays a critical role in matching the response IP to the target IP’s path information. Since the TTL is stored in the packet’s data portion, changes to its value directly affect the checksum field in the ICMP header.

To improve path inference accuracy and prevent incorrect assumptions caused by network load balancing, 6Trace applies a two-byte “fudge” technique adapted from Yarrp6. This technique ensures that packets destined for the same target will consistently follow the same path. This approach keeps the transport header checksum unchanged, thereby maintaining per-flow load balancing across the network.
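The encoding and the checksum "fudge" can be illustrated with a minimal Python sketch. It is not the authors' implementation: the four-byte payload layout (fingerprint, TTL, two fudge bytes), the fingerprint value `0x6A`, and the constant `TARGET_SUM` are assumptions made for illustration, since the exact layout of Figure 2 is not reproduced in the text. The idea shown is that the fudge word is chosen so that the one's-complement sum of the payload stays constant regardless of the encoded TTL, which in turn leaves the ICMPv6 checksum unchanged when the other fields are fixed per target.

```python
import struct

FINGERPRINT = 0x6A   # hypothetical one-byte tool fingerprint
TARGET_SUM = 0xBEEF  # hypothetical constant payload sum (must be non-zero)

def ones_sum(data: bytes) -> int:
    """16-bit one's-complement sum, as used by the Internet checksum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:                      # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return total

def encode_payload(ttl: int) -> bytes:
    """Pack fingerprint + TTL, then a fudge word that fixes the payload sum."""
    head = (FINGERPRINT << 8) | (ttl & 0xFF)
    # fudge = TARGET_SUM - head in one's-complement arithmetic
    fudge = ones_sum(struct.pack("!HH", TARGET_SUM, ~head & 0xFFFF))
    return struct.pack("!HH", head, fudge)

def decode_payload(payload: bytes):
    """Return the encoded TTL, or None if the fingerprint does not match."""
    if len(payload) < 2 or payload[0] != FINGERPRINT:
        return None
    return payload[1]
```

For example, `decode_payload(encode_payload(7))` recovers the TTL `7`, while a quoted payload with a different first byte is rejected as not belonging to the scan.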

3.3. Probe Order

Large-scale network measurements often generate substantial traffic in a short time. To mitigate network congestion and ICMP rate limitations, the probing sequence of target addresses is randomized. This ensures a more even distribution of scanning traffic across the network.

A Linear Feedback Shift Register (LFSR) is employed for pseudo-random scanning. The LFSR generates pseudo-random sequences by shifting and feeding back bits based on predefined rules. An n-bit LFSR, built on a primitive polynomial, can produce a sequence with a period of $2^{n} - 1$ when initialized with a non-zero state. This technique is widely used in stream ciphers for random number generation. The LFSR executes minimal XOR and shift operations, ensuring computational efficiency. For example, a 25-bit LFSR generates a sequence with a period of approximately 33 million, requiring only one XOR and one shift per operation. This approach enables rapid pseudo-random permutation of large-scale target address sequences, facilitating efficient large-scale scanning.
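As a minimal sketch of this idea, the Python code below uses a Galois-form LFSR to visit target indices in pseudo-random order. The function names and the small tap table are illustrative assumptions and not part of 6Trace. The masks are commonly cited maximal-length taps for 4, 8, and 16 bits; a production scanner would use a wider register, such as the 25-bit one mentioned above.

```python
# Commonly cited maximal-length Galois tap masks (right-shift form).
TAPS = {4: 0xC, 8: 0xB8, 16: 0xB400}

def galois_lfsr(width: int, seed: int = 1):
    """Yield every state of one full LFSR cycle, starting at `seed`."""
    if not 0 < seed < (1 << width):
        raise ValueError("seed must be a non-zero value that fits the register")
    mask = TAPS[width]
    state = seed
    while True:
        yield state
        lsb = state & 1
        state >>= 1              # one shift ...
        if lsb:
            state ^= mask        # ... and one conditional XOR per step
        if state == seed:        # cycle complete
            return

def shuffled_indices(n_targets: int, seed: int = 1):
    """Yield each index in range(n_targets) exactly once, in LFSR order."""
    widths = [w for w in sorted(TAPS) if (1 << w) - 1 >= n_targets]
    if not widths:
        raise ValueError("target set too large for the tap table in this sketch")
    for state in galois_lfsr(widths[0], seed):
        if state <= n_targets:   # skip states outside the target range
            yield state - 1
```

Because the register never holds zero and cycles through all $2^{n} - 1$ non-zero states, skipping out-of-range states yields a permutation of the target indices without storing a shuffled list.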

3.4. Probing Strategy

When conducting large-scale active topology scanning of IPv6 networks based on a stateless asynchronous mechanism, the following challenges must be addressed: (1) Network Congestion: Simultaneous arrival of numerous probe packets at a router can trigger ICMP rate limits, resulting in packet loss. (2) Probing Redundancy: Single-point probing and overlapping paths to different targets result in repeated probes on the same paths. This increases traffic, consumes excessive probes, and exacerbates network congestion and ICMP rate limits.

To handle large-scale probing, 6Trace utilizes a multi-iteration strategy. In each iteration, it sends one-hop probes to all target addresses and adjusts its strategy based on prior responses.

To alleviate network congestion, 6Trace selects an initial TTL for each probe randomly from a predefined range. This ensures even and randomized traffic distribution across different hop counts, preventing congestion and minimizing rate limit impacts.

To reduce redundancy, 6Trace employs a bisection-inspired probing strategy based on the DoubleTree algorithm. Leveraging the rapid convergence of bisection, it seeks to locate the target address in the first hop of forward probing whenever possible. During backward probing, it establishes a stopping point to minimize redundancy. When a response indicates that a probe for a certain hop has gone beyond the target address, the bisection-like search principle is used to place the next probe at the midpoint between the current probing lower bound and the response hop. This approach prevents unnecessary probing beyond the target IP and reduces probe wastage.

6Trace’s probing activities are managed by sending and receiving threads, which divide the process into forward and backward stages with seamless transitions. The probing direction is determined by the InitTTL value, which is dynamically updated during probing. These stages transition according to the TTL update strategies implemented by the sending and receiving threads, as shown in Figure 3. The dynamic update strategy, guided by a bisection-like algorithm, is detailed in Algorithm 1.

Algorithm 1 Update strategy for receiving threads

Input: ResponsePacket, TN (Target Node)
Output: Updated TN

if ResponsePacket.type = TimeExceeded then
    if ProbingDirection = Backward then
        if StopSet.Contains(ResponsePacket.SrcIP) then
            DecTTL = InitTTL + 1
            ProbingDirection = Forward
        else
            StopSet.Add(ResponsePacket.SrcIP)
        end if
    else if ProbingDirection = Forward then
        if DecTTL ≥ minTTL then
            minTTL = DecTTL    ▹ Increase the probing lower bound
            InitTTL = (minTTL + maxTTL) / 2
            DecTTL = InitTTL
        end if
    end if
else if ResponsePacket.type = EchoReply then
    if ProbingDirection = Backward then
        if DecTTL ≤ maxTTL then
            maxTTL = DecTTL    ▹ Narrow the probing upper bound
            InitTTL = (minTTL + maxTTL) / 2
            DecTTL = InitTTL
        end if
    else if ProbingDirection = Forward then
        Remove TN    ▹ End probing of this target
    end if
end if
return Updated TN
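Algorithm 1 can be transcribed into Python roughly as follows. This is a sketch under stated assumptions rather than the authors' code: the `TargetNode` fields, the string response types, the use of integer division for the midpoint, and the `done` flag standing in for "Remove TN" are all illustrative choices, and a single shared stop set is used for simplicity.

```python
from dataclasses import dataclass
from enum import Enum

class Direction(Enum):
    BACKWARD = "backward"
    FORWARD = "forward"

@dataclass
class TargetNode:                 # hypothetical per-target state
    init_ttl: int
    dec_ttl: int                  # TTL of the probe currently in flight
    min_ttl: int
    max_ttl: int
    direction: Direction = Direction.BACKWARD
    done: bool = False            # stands in for "Remove TN"

def update(tn: TargetNode, resp_type: str, src_ip: str, stop_set: set) -> TargetNode:
    """Apply the receiving-thread update of Algorithm 1 to one response."""
    if resp_type == "TimeExceeded":
        if tn.direction is Direction.BACKWARD:
            if src_ip in stop_set:
                # Known router: stop going backward, continue forward.
                tn.dec_ttl = tn.init_ttl + 1
                tn.direction = Direction.FORWARD
            else:
                stop_set.add(src_ip)
        elif tn.dec_ttl >= tn.min_ttl:
            tn.min_ttl = tn.dec_ttl                       # raise lower bound
            tn.init_ttl = (tn.min_ttl + tn.max_ttl) // 2  # assumed integer midpoint
            tn.dec_ttl = tn.init_ttl
    elif resp_type == "EchoReply":
        if tn.direction is Direction.BACKWARD:
            if tn.dec_ttl <= tn.max_ttl:
                tn.max_ttl = tn.dec_ttl                   # narrow upper bound
                tn.init_ttl = (tn.min_ttl + tn.max_ttl) // 2
                tn.dec_ttl = tn.init_ttl
        else:
            tn.done = True                                # target reached
    return tn
```

With `min_ttl=1`, `max_ttl=32`, and a probe at TTL 16 answered by an Echo Reply, the sketch lowers `max_ttl` to 16 and places the next probe at TTL 8, which is the bisection step described in the text.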

The packet sending stage adopts a dynamic update strategy. An initial probing TTL range (InitTTLrange) is defined. Each target IP selects a TTL randomly from the InitTTLrange to distribute the load and evenly spread traffic across different TTLs, reducing network congestion. If no response is received, the TTL value and probing direction are adjusted using a natural update strategy.

In the sending stage, 6Trace adopts a natural update strategy that begins with the InitTTL value and decreases until the lower bound (minTTL) is reached. Once minTTL is reached, the direction switches to forward probing. The TTL is updated to one hop beyond the initial TTL, and probing continues incrementally until the upper bound (maxTTL) is reached. Probing ends at this stage. This strategy ensures the completion of the probing task, even in the absence of responses.
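The no-response behaviour of the sending stage can be sketched as a simple TTL schedule. The function names and the `(low, high)` tuple form of the initial TTL range are hypothetical. The sketch assumes that no responses arrive, so none of the receiving-thread adjustments apply.

```python
import random

def pick_init_ttl(init_ttl_range, rng: random.Random) -> int:
    """Choose a random initial TTL from an inclusive (low, high) range."""
    low, high = init_ttl_range
    return rng.randint(low, high)

def natural_schedule(init_ttl: int, min_ttl: int, max_ttl: int):
    """Yield the TTLs probed for one target when no response ever arrives."""
    ttl = init_ttl
    while ttl >= min_ttl:         # backward stage: InitTTL down to minTTL
        yield ttl
        ttl -= 1
    ttl = init_ttl + 1            # switch to forward, one hop beyond InitTTL
    while ttl <= max_ttl:         # forward stage: up to maxTTL, then stop
        yield ttl
        ttl += 1
```

For instance, `list(natural_schedule(5, 3, 7))` gives `[5, 4, 3, 6, 7]`: every TTL in the range is probed exactly once, so the task completes even without feedback.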

The packet receiving stage also adopts a dynamic update strategy. 6Trace maintains a stop set, which stores router addresses discovered during backward probing. This mechanism reduces redundancy and ensures efficient use of probes. The update strategies are determined by two key factors: the source of the response and the current probing direction. If the response is not from the target IP during backward probing, the stop set is checked. If the address is already recorded in the stop set, the probing direction switches to forward. Otherwise, the new address is added to the stop set. If a response that is not from the target IP is received during forward probing, the bisection-like method is applied: the current DecTTL (current TTL) is assigned to minTTL, raising the lower bound, DecTTL is then reset to the midpoint between minTTL and maxTTL, and the probing direction switches to backward. If the response is from the target IP during backward probing, the bisection-like method assigns the current DecTTL to maxTTL, reducing the probing range, and resets DecTTL to the midpoint between minTTL and maxTTL. If the response from the target IP is received during forward probing, the probing process for that target IP is terminated.

This bisection-like dynamic update strategy enables 6Trace to focus on backward probing, allowing for rapid convergence to the target IP. It minimizes redundant probes and supports large-scale network measurements. Additionally, it evenly distributes probing traffic to alleviate ICMP rate limits. By dynamically adjusting TTL values and switching probing directions, this strategy optimizes the overall probing process.

To efficiently handle TTL updates in multi-round iterative probing, this paper employs a composite data structure combining arrays and doubly linked lists to store probing targets. This structure is well suited for applications that require frequent insertion, deletion, and rapid access. Arrays facilitate fast lookups, while doubly linked lists enable efficient node insertion and deletion.

During probing initialization, the target set and specified probing boundaries are provided as inputs. For each target IP, a corresponding structure containing its probing attributes is created and added as a node to the doubly linked list. For probe response matching, the array achieves a target IP lookup time complexity of $O(1)$. Similarly, the doubly linked list supports node deletion with a time complexity of $O(1)$. This design significantly improves scanning efficiency, allowing for quick updates of TTL values and path information for target IPs during multi-round probing.
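One way to combine an array with a doubly linked list is sketched below. The class and field names are hypothetical, and the sketch assumes that each response can be mapped back to the array index of its target (for example via the recovered target address), which the text does not spell out.

```python
class TargetList:
    """Array-backed doubly linked list of probing targets (illustrative)."""

    def __init__(self, targets):
        n = len(targets)
        self.targets = list(targets)          # array: O(1) access by index
        self.prev = [i - 1 for i in range(n)]                 # -1 means "none"
        self.next = [i + 1 if i + 1 < n else -1 for i in range(n)]
        self.alive = [True] * n
        self.head = 0 if n else -1

    def remove(self, i: int) -> None:
        """Unlink target i in O(1); removing it twice is a no-op."""
        if not self.alive[i]:
            return
        p, nx = self.prev[i], self.next[i]
        if p != -1:
            self.next[p] = nx
        else:
            self.head = nx
        if nx != -1:
            self.prev[nx] = p
        self.alive[i] = False

    def __iter__(self):
        """Iterate over the indices of targets still being probed."""
        i = self.head
        while i != -1:
            yield i
            i = self.next[i]
```

Each probing round then walks only the remaining targets, while a finished target is dropped with a constant number of pointer updates.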

Furthermore, to address inherent path inference errors in the DoubleTree algorithm, a separate stop set is maintained for each TTL. During backward probing, probing halts at a hop only if the responding address exists in the stop set for that TTL. This approach prevents topology inference errors caused by the stop set and enhances the overall accuracy of the probing process.

3.5. Computational and Memory Requirements

To address the computational and memory requirements of 6Trace, we conducted experiments with target IPv6 address sets ranging from 1 M to 10 M addresses. The results demonstrate that 6Trace maintains a low CPU usage of below 10% across all tested scales. This efficiency is attributed to its stateless asynchronous design, which separates sending and receiving threads, thereby minimizing computational overhead.

Memory usage, however, increases linearly with the size of the target set, starting at approximately 300 MB for 1 M addresses and reaching 3000 MB for 10 M addresses. This linear growth is primarily due to additional data structures that 6Trace employs to manage probing progress dynamically for each target. These structures store information such as target IP states and routing exploration progress, ensuring efficient adaptation to dynamic network conditions.
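As a rough back-of-the-envelope reading of these figures, assuming decimal megabytes and that nearly all of the memory is per-target state, both endpoints correspond to about 300 bytes per target:

```latex
\frac{300 \times 10^{6}\ \text{B}}{1 \times 10^{6}\ \text{targets}} \approx 300\ \text{B/target},
\qquad
\frac{3000 \times 10^{6}\ \text{B}}{10 \times 10^{6}\ \text{targets}} \approx 300\ \text{B/target}
```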

In comparison with existing tools, Yarrp6 demonstrates relatively low memory usage due to its stateless nature but lacks the ability to manage dynamic probing effectively. FlashRoute, which employs techniques like TTL-based probing and redundancy reduction via a stop set, exhibits memory usage similar to 6Trace, requiring approximately 900 MB for a /24 IPv4 scan, as reported in [12]. Neither tool, however, scales as efficiently as 6Trace in highly dynamic IPv6 environments.

4. Target Selection

The vastness of the IPv6 address space introduces two primary challenges in selecting probing targets from existing open-source data: (1) The IPv6 address space is vast yet sparsely populated, which complicates the selection of suitable probing targets. This process must consider various network attributes. (2) Identifying the most appropriate dataset for topology probing is another challenge. To address these challenges, we first collect seed addresses from multiple open-source datasets and analyze their network attributes. Finally, we propose an efficient method for selecting target addresses.

4.1. Multi-Source Seeds Collection

Considering the scale and heterogeneity of devices on the Internet, probing targets must reflect diverse network configurations. This study therefore uses six seed sets divided into three groups, chosen to capture the diversity and richness of IPv6 address availability. The first group includes the IPv6 hitlist by Gasser et al. [34] and Addrminer by Song et al. [26,35]. Both sources provide responsive IPv6 addresses. The second group is the NTP dataset collected by Rye et al. [36], which contains /48 prefixes of responsive addresses. Specific addresses are extracted from these prefixes. The third group consists of measurement projects: RIPE Atlas [37], which provides probe node IPs; FDNS [38], which collects public DNS AAAA records; and CAIDA’s Ark project [39], which offers server device seed sets for research. The data sources used in this study are as follows:

IPv6 Hitlist [34]: The IPv6 hitlist project, managed by the Akamai team, provides a regularly updated dataset of globally responsive IPv6 addresses. This dataset is publicly accessible and free to use. Additional data, including daily scanned response addresses and alias prefixes, are available through registration. These addresses provide a reliable source for identifying active IPv6 devices. They are essential for mapping the current state of IPv6 adoption and usage.

RIPE Atlas [37]: RIPE Atlas, a distributed network measurement platform managed by RIPE, plays a vital role in global internet resource management. RIPE offers services for IPv4, IPv6, and AS numbers to members across Europe, the Middle East, and parts of Central Asia. The Atlas network includes thousands of measurement probes deployed globally. The IP addresses of these probe nodes are publicly available, offering researchers valuable insights into the global distribution of IPv6-enabled devices.

FDNS [38]: The Forward DNS (FDNS) project continuously collects, analyzes, and stores publicly available DNS records from across the Internet. This dataset includes a wide range of IPv6 AAAA records that can be used to identify active IPv6 addresses. FDNS is a key resource for analyzing the global distribution of IPv6 addresses and the behavior of DNS servers in resolving them.

CAIDA [39,40]: The Center for Applied Internet Data Analysis (CAIDA) conducts research on internet traffic patterns and develops infrastructure for large-scale measurement projects. The Archipelago (Ark) project provides a comprehensive IPv6 DNS name dataset, primarily composed of addresses from server devices. This dataset is crucial for analyzing IPv6 deployment at the server level and understanding the broader internet topology.

Addrminer [26,35]: Addrminer, developed by Song et al. [26], is a global IPv6 address monitoring system. It offers a comprehensive dataset of IPv6 addresses from both seed and non-seed scenarios. By employing active and passive collection techniques, Addrminer continuously updates its dataset, providing valuable insights into IPv6 address availability and usage patterns.

NTP: Rye et al. [36] collected a large-scale IPv6 address dataset through passive methods using the NTP (Network Time Protocol). This dataset primarily includes addresses from subnet client devices and Customer Premises Equipment (CPE) devices. It is particularly valuable for studying IPv6 address distribution in home and small office networks. The authors published the /48 prefixes of these addresses, and in this study, specific addresses are extracted from these prefixes to further analyze the global IPv6 address space.

Collecting these open-source seed sets benefits network measurement in several ways: (1) Increased Data Diversity. Different data sources cover a wide range of network devices and scenarios, including clients, CPEs, and server devices. This diversity enables a more comprehensive understanding of the IPv6 network topology. (2) Enhanced Data Reliability. Cross-verification and complementarity of multi-source data minimize errors and omissions inherent to single data sources, providing more reliable topology information. (3) Real-Time and Dynamic Data. Regular updates from seed sets like RIPE Atlas and Addrminer ensure the data remain current, improving the accuracy of measurement results.

Table 1 presents the basic attributes of the datasets and categorizes address patterns using the addr6 tool according to RFC 7707 [41]. The analysis highlights significant differences in Interface Identifiers (IIDs) and Autonomous System (AS) coverage. IPv6 hitlists and CAIDA collections, featuring many randomized addresses and extensive AS coverage, are ideal for large-scale topology probing. Addrminer’s hitlists, though smaller in size, also contain a high proportion of randomized addresses and extensive AS coverage, making them suitable for localized probing. NTP and FDNS collections, with moderate AS coverage, are better suited for specific protocols and services. RIPE Atlas, primarily consisting of EUI-64 addresses [42], has limited AS coverage but is well suited for detailed network device analysis. These findings demonstrate the importance of constructing multi-source seed sets to ensure accurate and comprehensive topology probing. Such seed sets facilitate the execution of measurement experiments tailored to different purposes and specific probing objectives.

4.2. Characterization

4.2.1. AS Distribution

We collected the seed sets listed in Table 1 and analyzed their distribution across Autonomous Systems (ASs) and network types. Figure 4 presents the cumulative distribution function (CDF) of six IPv6 address seed sets across the top X ASs. The CDF curves reveal significant differences in AS distribution among the seed sets. Most exhibit rapid growth within the top 1000 ASs, indicating a concentration of addresses in popular ASs. CAIDA and Addrminer’s hitlists are particularly concentrated, with their CDF curves reaching most of the cumulative value within the top 100 ASs. These seed sets are especially useful for probing the internet’s core topology, as they help map major traffic exchange hubs.
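The top-X AS curve described above can be reproduced from a per-address AS mapping. The following is a minimal sketch, not the authors' analysis code: the function name `top_as_cdf` and the sample data are illustrative assumptions, and the private-use AS numbers (64512 to 64514) are placeholders rather than measured values.

```python
from collections import Counter

def top_as_cdf(asns):
    """Return the cumulative share of addresses covered by the top-k ASs.

    asns: iterable with one origin AS number per seed address.
    The k-th list element (1-based) is the fraction of all addresses
    that fall inside the k most populated ASs.
    """
    counts = Counter(asns)
    total = sum(counts.values())
    cdf = []
    running = 0
    # most_common() yields ASs from most to least populated.
    for _, n in counts.most_common():
        running += n
        cdf.append(running / total)
    return cdf

# Hypothetical seed set: the origin AS of each of ten seed addresses.
seed_asns = [16509] * 4 + [64512] * 3 + [64513] * 2 + [64514]
cdf = top_as_cdf(seed_asns)
print(cdf)
```

A curve that rises steeply for small k indicates a seed set concentrated in a few ASs, which is the pattern reported for the CAIDA and Addrminer collections.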

Collecting seed sets from multiple sources helps balance AS distribution and mitigates biases introduced by single-source collections. Among all ASs, AS16509 (managed by AMAZON-02 in the US) emerges as the most popular. However, exceptions exist. For instance, in the NTP seed set, AS16509 contains only 12 addresses, whereas in other collections, it ranks within the top ten. These differences underscore the importance of combining diverse seed sets to avoid bias toward specific ASs and enhance the diversity and representativeness of the data.

4.2.2. Network Type Distribution

Traditional internet topology research primarily focuses on BGP prefixes and AS coverage, but this approach does not reveal the types of network entities. Understanding the network types in target sets is essential for improving topology probing accuracy. Some seed sets may have a high concentration of ISP addresses but lack adequate coverage of ETPs. Combining data from different sources balances network type distribution and mitigates biases introduced by single-source datasets.

Figure 5 illustrates the distribution of six IPv6 address subsets across various network types, highlighting differences stemming from their generation and deployment. The IPv6 hitlist and CAIDA have significant CDN coverage, accounting for 47.7% and 74.3%, respectively. The RIPE Atlas (66.9%) and NTP (41.9%) datasets prominently feature ISPs, reflecting their role in global infrastructure. Addrminer, with 91.6% ETP coverage, provides unique data from enterprise networks. NSPs are well represented in NTP (57.8%) and FDNS (57.5%), emphasizing the critical role of network service providers.

Using multiple datasets ensures broad representation and diversity, enhancing the accuracy and comprehensiveness of topology inference.

4.3. Target Generation

This research uses multi-source seed sets (e.g., IPv6 Hitlists) generated by target generation algorithms, often used for host activity studies. These addresses are frequently concentrated within the same subnet. ISPs have different IPv6 address allocation strategies. Regional Internet Registries (RIRs) like APNIC, RIPE NCC, LACNIC, and AFRINIC generally recommend assigning a /64 prefix to end-user devices.

To avoid redundant topology information caused by multiple addresses within the same /64 prefix, we extract unique /64 prefixes from the seed sets for topology discovery. This creates a /64-granularity dataset effective for internet topology studies.
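The /64 deduplication step can be sketched with Python's standard `ipaddress` module. This is an illustrative sketch rather than the authors' implementation: the helper name `unique_slash64` is hypothetical, the sample addresses come from the 2001:db8::/32 documentation range, and the source does not say which address inside each /64 is ultimately probed.

```python
import ipaddress

def unique_slash64(addresses):
    """Collapse seed addresses to their unique /64 prefixes, keeping first-seen order."""
    seen = set()
    prefixes = []
    for addr in addresses:
        # strict=False masks the host bits instead of raising an error.
        net = ipaddress.ip_network(f"{addr}/64", strict=False)
        if net not in seen:
            seen.add(net)
            prefixes.append(net)
    return prefixes

# Hypothetical seeds: the first two share a /64, so only two prefixes remain.
seeds = ["2001:db8:0:1::1", "2001:db8:0:1::abcd", "2001:db8:0:2::1"]
targets = unique_slash64(seeds)
print([str(p) for p in targets])
```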

Expanding the TargetSet to larger prefix granularities may reduce tail-end routers’ ability to reveal detailed topology. This limits the granularity of the obtained topology data.

5. Performance Evaluation

In this section, we present the results of real-world experiments conducted to evaluate the performance of 6Trace. Specifically, the experiments focus on three key aspects: (1) the impact of 6Trace’s configuration parameters on scanning efficiency, (2) a comparison of 6Trace’s performance with existing topology discovery solutions, such as Yarrp6 and FlashRoute, and (3) the influence of different network category seed sets on topology discovery.

5.1. Experimental Setup

The experiments were performed on a Lenovo-branded host machine equipped with a 13th Gen Intel® Core™ i7-13700 processor, featuring 16 cores and a base clock speed of 2.10 GHz. The system was supported by 32 GB of RAM (31.7 GB usable) and a network interface card (NIC) with a maximum link speed of 1 Gbps. This hardware configuration is representative of modern personal computing systems, ensuring the accessibility and reproducibility of the experimental environment.

To evaluate the performance of 6Trace, two widely recognized tools, FlashRoute and Yarrp6, were selected as baseline methods for comparison. Both tools were configured using their respective optimal parameters, as recommended in their original publications. FlashRoute was evaluated with a gaplimit of 8, while Yarrp6 employed its padding mode with a maximum TTL of 16 for standard configurations. Additionally, parameter tuning was conducted to assess their performance under alternative settings, such as extending the maxTTL to 32, to ensure a comprehensive comparison.

The dataset for this study was derived from multi-source IPv6 seed sets, as described in earlier sections. These seed sets were subjected to dealiasing and deduplication processes to ensure accuracy and uniqueness. The processed dataset contained 1 million unique /64 prefix addresses, which formed the basis of the target set for topology discovery experiments. The dataset included addresses from multiple categories, such as Content Delivery Networks (CDNs), Internet Service Providers (ISPs), Network Service Providers (NSPs), and Enterprise Networks (ETPs), to ensure diversity and comprehensive coverage of different network environments.

Probing rates were initially capped at 25,000 packets per second (25 kpps) to minimize network disruptions and comply with ethical probing standards [43]. This corresponds to a bandwidth utilization of approximately 20 Mbps, which is well within the capacity of most contemporary network connections and avoids triggering security alarms. All tools were evaluated under identical experimental conditions to ensure a fair comparison.
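The relation between packet rate and bandwidth can be checked with a short calculation. The sketch below assumes an on-wire probe size of 100 bytes; this value is not stated in the text and is chosen only because it reproduces the quoted figure of roughly 20 Mbps at 25 kpps. The function name is likewise illustrative.

```python
def probe_bandwidth_mbps(rate_pps, packet_bytes):
    """Bandwidth in Mbps (10^6 bit/s) for a given packet rate and packet size."""
    return rate_pps * packet_bytes * 8 / 1_000_000

# Assumed packet size: 100 bytes on the wire (not given in the paper).
mbps = probe_bandwidth_mbps(25_000, 100)
print(mbps)
```

Under the same assumption, a rate of 100 kpps would correspond to about 80 Mbps, which still fits within a 1 Gbps link.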

5.2. Evaluation Metrics

To comprehensively assess the performance of 6Trace and the baseline tools, we used three evaluation metrics:

Number of Discovered Interface Addresses: This metric represents the total number of unique interface addresses discovered during the probing process. To ensure fairness in comparison, ICMP reply addresses were excluded from this count to avoid bias in favor of ICMP-based tools.
Scanning Efficiency: Scanning efficiency was defined as the ratio of discovered interface addresses to the total number of probes sent. A higher scanning efficiency indicates that the tool achieved better performance with fewer probes, reflecting the effectiveness of mechanisms like dynamic probing strategies and redundancy reduction.
Probing Time: This metric measures the total time required to complete the probing process at a specified probing rate. It provides insight into the overall speed of the topology discovery method and highlights its scalability for large-scale network environments.
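The scanning efficiency metric defined above is a simple ratio, as the following sketch shows. The function name and the input numbers are hypothetical and are not taken from the experimental tables.

```python
def scanning_efficiency(discovered_interfaces, probes_sent):
    """Ratio of unique discovered interface addresses to probes sent."""
    if probes_sent <= 0:
        raise ValueError("probes_sent must be positive")
    return discovered_interfaces / probes_sent

# Hypothetical run: 150,000 unique interfaces found with 10 million probes.
eff = scanning_efficiency(150_000, 10_000_000)
print(eff)
```

Because the denominator counts every probe, a tool can raise this metric either by finding more interfaces or by suppressing redundant probes.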

5.3. Impact of Configuration Parameters

The performance of 6Trace is strongly influenced by key configuration parameters, particularly the InitTTL range and maxTTL. These parameters are essential for balancing scanning efficiency, resource utilization, and comprehensive topology discovery. Their configuration is informed by both empirical experiments and theoretical considerations, including insights from network theory, practical system constraints, and statistical analyses of path length distributions.

5.3.1. InitTTL Range

The InitTTL range determines the initial TTL values for probing and significantly affects traffic distribution across the network. A larger range promotes better load balancing, while a narrower range risks network congestion near the source or inefficient use of resources. The optimal InitTTL range is determined based on the following considerations:

Maximizing discovery accuracy by focusing on the most active TTL values, as indicated by response distributions.
Avoiding redundant probes near the source by setting a sufficiently large lower bound, especially in densely connected regions.
Preventing unnecessary probing beyond intended targets by limiting the upper bound, which minimizes resource overhead and scan duration.

To identify the optimal range, we conducted traceroute experiments on 1 million randomly selected addresses, with a maximum TTL of 32. Figure 6 shows that the majority of unique interfaces are discovered between 9 and 24 hops. This observation aligns with the statistical distribution of path lengths in large-scale networks, where most active paths fall within this range. As a result, we set the InitTTL range to [9, 24], ensuring a balance between comprehensive discovery and efficient resource utilization.
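One way to picture how the InitTTL range spreads traffic is to draw a starting hop for each target from [9, 24]. The sketch below assumes a uniform draw, which the text does not specify; the constant `INIT_TTL_RANGE`, the function `draw_init_ttl`, and the sample targets are illustrative and do not describe the 6Trace implementation.

```python
import random

INIT_TTL_RANGE = (9, 24)  # range selected from the hop distribution in Figure 6

def draw_init_ttl(rng, ttl_range=INIT_TTL_RANGE):
    """Pick an initial TTL inside the inclusive range (uniform draw is an assumption)."""
    low, high = ttl_range
    return rng.randint(low, high)

# Hypothetical targets from the documentation prefix 2001:db8::/32.
rng = random.Random(0)  # seeded only to make the sketch repeatable
targets = [f"2001:db8:0:{i:x}::1" for i in range(1, 1001)]
init_ttls = {t: draw_init_ttl(rng) for t in targets}
print(min(init_ttls.values()), max(init_ttls.values()))
```

Starting different targets at different hops means that the first probes of a scan do not all expire at the same routers, which is the load-balancing effect attributed to a wider range.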

5.3.2. maxTTL

The maxTTL parameter defines the probing depth and directly impacts the number of discovered interfaces and total probe packets sent. To analyze its influence, experiments were conducted using 0.1 million randomly selected addresses, varying the maxTTL from 16 to 32. The results, summarized in Figure 7, reveal that discovered interfaces increase steadily with maxTTL up to 24. Beyond this threshold, the discovery rate stabilizes, despite additional probes being sent.

This stabilization reflects the “small-world phenomenon” in network theory, which posits that most nodes in large-scale networks are reachable within a limited number of hops. Our experimental results confirm this theoretical insight, as most target interfaces are discovered within 24 hops. Therefore, a maxTTL of 24 optimizes discovery while minimizing resource usage. Increasing maxTTL beyond 24 yields diminishing returns, as the marginal gain in discovered interfaces decreases significantly.
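The diminishing-returns argument can be expressed as a small selection rule: keep the smallest maxTTL beyond which each further increase adds less than a chosen relative gain. The sketch below is an illustration only; the function name, the 1% threshold, and the interface counts are hypothetical and are not the values plotted in Figure 7.

```python
def smallest_sufficient_max_ttl(discovered_by_ttl, min_gain=0.01):
    """Return the smallest maxTTL after which every step adds < min_gain relative gain.

    discovered_by_ttl: dict mapping a tested maxTTL to the number of
    interfaces discovered with that setting.
    """
    ttls = sorted(discovered_by_ttl)
    best = ttls[-1]
    # Walk backwards from the largest setting while the gain stays negligible.
    for i in range(len(ttls) - 1, 0, -1):
        prev = discovered_by_ttl[ttls[i - 1]]
        cur = discovered_by_ttl[ttls[i]]
        if prev <= 0 or (cur - prev) / prev >= min_gain:
            break
        best = ttls[i - 1]
    return best

# Hypothetical measurements: growth flattens after maxTTL = 24.
measurements = {16: 10_000, 20: 13_000, 24: 15_000, 28: 15_050, 32: 15_060}
chosen = smallest_sufficient_max_ttl(measurements)
print(chosen)
```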

Furthermore, practical system constraints also support this choice. Widely used platforms such as traceroute often limit the maximum TTL to 30 hops. Similarly, state-of-the-art topology discovery tools, including Yarrp6 and FlashRoute, typically set their maximum TTL to 32 in experiments. By adopting a slightly lower value of 24, 6Trace balances efficiency and practicality, reducing probe redundancy while ensuring adequate coverage of the IPv6 topology.

These findings highlight the efficiency of 6Trace in adapting to the structural characteristics of the IPv6 Internet. By optimizing InitTTL and maxTTL, 6Trace achieves comprehensive topology discovery with minimal resource overhead, making it well suited for large-scale network measurements. While our parameter selection is informed by theoretical and practical considerations, we acknowledge that real-world network conditions may necessitate empirical adjustments, which remain a limitation of this study.

5.4. Comparison with Existing Solutions

To evaluate the probing performance of 6Trace relative to existing IPv6 Internet topology discovery tools, specifically Yarrp6 and FlashRoute, we conducted a series of experiments at a controlled packet rate of 25 kpps in Hefei, China, in June 2024. This rate was chosen to reflect real-world deployment scenarios while ensuring minimal disruption to the underlying network. The IPv6 seed set used for probing was collected as described in Section 4, consisting of 1 million unique /64 prefix addresses, which formed the TargetSet for each experiment.

For Yarrp6, according to [15], the optimal performance is achieved with a maxTTL of 16 in its padding mode. FlashRoute, which also supports IPv6, achieves its best performance with a Split-TTL of 16, as indicated by its IPv4 topology discovery results. While the TTL for both tools is typically set to 16, Yarrp6’s padding mode and FlashRoute’s gaplimit mechanism generate forward probes that extend the probing distance. In order to ensure a fair comparison of probing distances, 6Trace was configured with two different maxTTL settings: 24 (6Trace-24) and 32 (6Trace-32). Similarly, FlashRoute was tested with two configurations: FlashRoute-16 (gaplimit of 8) and FlashRoute-32. Yarrp6 was tested with both Yarrp6-16 and Yarrp6-32 configurations. All experiments were conducted at a consistent rate of 25 kpps. In this comparison, we define scanning efficiency as the ratio of newly discovered interfaces to the total number of probe packets sent.

The detailed experimental results in Table 2 highlight the significant performance differences between 6Trace, Yarrp6, and FlashRoute. As shown, 6Trace consistently outperforms both Yarrp6 and FlashRoute in terms of scanning efficiency. Specifically, the 6Trace-24 configuration achieves the highest scanning efficiency of 0.0163, with the shortest probing time (469 s) and the fewest probe packets (9.98 million), while discovering the largest number of interfaces. This demonstrates that 6Trace is the most efficient method for discovering new interface addresses. The second-best performance is observed with 6Trace-32, which maintains a high scanning efficiency (0.0132) while requiring only slightly more time and probe packets compared to 6Trace-24.

On average, the scanning efficiency of 6Trace is approximately 70% higher than that of both Yarrp6 and FlashRoute. This enhanced performance is attributed to the optimizations implemented in 6Trace, including the random selection of hops from a larger InitTTL range, the use of a stop set mechanism, and the forward gaplimit. These improvements enable 6Trace to discover new interfaces more quickly and with fewer probe packets, leading to reduced scanning time and lower resource consumption.

In comparison, Yarrp6 ranks second in scanning efficiency. Despite its high-speed random probing and padding mode, which improves efficiency, Yarrp6 requires more probe packets and takes longer to complete the scan than 6Trace. This suggests that while Yarrp6 performs well in terms of load distribution and avoiding ICMP rate-limiting, it suffers from higher resource consumption and slower overall performance. Nonetheless, Yarrp6 manages to discover a similar number of interfaces as 6Trace, indicating that its random probing approach offers some advantages in large-scale probing scenarios.

FlashRoute demonstrates the lowest efficiency among the tools evaluated. Research [33] has shown that ICMP probes generally yield higher response rates in IPv6 networks compared to UDP probes, which FlashRoute relies on. FlashRoute employs predictive probing to estimate the hop count (TTL) of target addresses, narrowing the TTL range for each target and using a stop set to reduce redundancy. However, its design introduces significant inefficiencies. Specifically, FlashRoute probes each target in descending TTL order during each iteration. This approach, while effective in avoiding unnecessary probes, causes TTL values to concentrate around specific ranges within each round. Such concentration can lead to overloading routers at particular hop counts, exacerbating delays and negatively impacting efficiency. Furthermore, while FlashRoute initially distributes probing traffic across routing nodes, the later stages of probing, especially when probing backward, exacerbate this concentration, further reducing overall performance.

In conclusion, 6Trace demonstrates superior performance over both Yarrp6 and FlashRoute in IPv6 topology discovery, particularly with respect to scanning efficiency. By optimizing probing paths and employing novel mechanisms such as the stop set and forward gaplimit, 6Trace is able to discover new interfaces more efficiently, using fewer resources. These advantages make 6Trace particularly well suited for large-scale IPv6 network topology discovery, offering a promising approach for future IPv6 Internet mapping.

5.5. Scalability Evaluation for Larger IPv6 Target Sets

To evaluate the scalability of 6Trace in large-scale IPv6 networks, we extended the experimental setup to include target IPv6 address sets ranging from 1 million (1 M) to 10 million (10 M) addresses. The probing rate was increased from 25 kpps to 100 kpps to reduce the total probing time, ensuring it remained within the capacity of most standard network connections while maintaining feasibility for real-world deployment. This adjustment reflects a balanced consideration between faster probing and avoiding potential disruption to network infrastructure.

The experimental results, as shown in Figure 8, demonstrate that 6Trace exhibits remarkable scalability and performance in probing large-scale IPv6 networks. Even at the largest target scale of 10 million addresses, the entire discovery process was completed in under 40 min. This rapid completion time highlights the efficiency of 6Trace and its ability to adapt to the demands of large-scale network exploration without significant increases in runtime.

The number of discovered interfaces steadily increased with the target size, surpassing 917,000 interfaces for the 10 M target set. This consistent growth reflects the capability of 6Trace to uncover more interfaces in larger address spaces without experiencing bottlenecks or diminishing returns. However, the efficiency, measured as the ratio of discovered interfaces to sent probes, exhibited a slight decline as the probing scale increased. While the efficiency peaked at 0.0129 for the 2 M target, it gradually decreased to 0.0105 for the 10 M target. This reduction in efficiency is likely attributable to packet loss caused by the high probing rate (100 kpps), which may induce network congestion or trigger ICMP rate limits, resulting in missed responses.

These findings underscore the adaptability of 6Trace in handling the challenges of modern IPv6 network exploration. Its ability to scale up to 10 million target addresses while maintaining stable performance and efficiency makes it a powerful tool for real-world applications. As the internet continues to expand and IPv6 adoption grows, 6Trace provides a reliable and scalable solution to meet the rising demands for efficient and high-speed network topology discovery.

In terms of resource consumption, 6Trace maintains a low CPU usage of under 10% even at the largest scale of 10 M target addresses. This stability reflects the efficiency of its stateless asynchronous design. Memory usage, however, increased linearly with the target scale, reaching approximately 3000 MB for the 10 M target set. This linear growth is attributed to the additional data structures required to track the probing progress dynamically, a trade-off that enables 6Trace to adapt effectively to large-scale and dynamic network conditions.
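The reported memory figure implies a rough per-target cost, which can be estimated as follows. The sketch assumes 1 MB = 10^6 bytes and purely proportional growth with no fixed overhead; both are simplifying assumptions, and the function names are illustrative.

```python
def bytes_per_target(memory_mb, targets):
    """Average memory per target in bytes, assuming 1 MB = 10**6 bytes."""
    return memory_mb * 1_000_000 / targets

def projected_memory_mb(targets, per_target_bytes):
    """Linear projection of memory use; ignores any fixed overhead."""
    return targets * per_target_bytes / 1_000_000

per_target = bytes_per_target(3000, 10_000_000)
print(per_target)                                  # bytes of state per target
print(projected_memory_mb(1_000_000, per_target))  # projection for a 1 M target set
```

Under these assumptions the state kept per target is on the order of a few hundred bytes, which is consistent with the described trade-off of tracking probing progress for each target.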

5.6. Impact of Different Network Categories on Seed Sets

This study investigates the impact of various IPv6 seed sets on network topology discovery. Four distinct types of seed sets were used: Content Delivery Network (CDN), Internet Service Provider (ISP), Network Service Provider (NSP), and Enterprise Network (ETP). Each experiment was conducted with a packet rate of 25 kpps and a maximum TTL of 24, with each seed set containing 1 million addresses.

As summarized in Table 3, the NSP seed sets demonstrated the highest scanning efficiency of 0.2274, leading to the discovery of the largest number of new interfaces (302.4 k). This exceptional performance can be attributed to the extensive geographical coverage and dense node distribution typical of NSPs, which are designed to support large-scale, high-performance connectivity across multiple regions. Their broad infrastructure enables a higher density of active interfaces, resulting in greater discovery efficiency.

ISPs, which also exhibited a high scanning efficiency of 0.1776, closely followed NSPs. ISPs typically have complex network structures that span wide geographic regions, contributing to their high performance in topology discovery. Their broad node distribution and interconnections with various NSPs further enhance their discovery efficiency, as they provide a large number of reachable targets with relatively short path lengths.

In contrast, CDN seed sets showed a relatively low scanning efficiency of 0.0162. Although CDNs are globally distributed, their relatively flat network topology with fewer intermediary nodes results in shorter paths between source and target nodes. This flatness limits the potential for discovering new interfaces, as the number of intermediate nodes is smaller, thus reducing the total number of probes needed for discovery. Additionally, CDNs often have optimized routing paths, further minimizing the need for longer hops.

ETP seed sets, characterized by their centralized network topology and strong internal security measures, exhibited the lowest scanning efficiency of 0.0049. The centralized nature of enterprise networks limits the number of reachable interfaces, as most of the network’s infrastructure is confined to a smaller, highly secured perimeter. This results in fewer new interfaces being discovered and longer discovery paths, as probes must navigate through more constrained and often isolated network structures.

These findings suggest that NSP and ISP seed sets are ideal for large-scale IPv6 network probing, particularly when the goal is to uncover a larger number of new interfaces. The selection of seed sets should therefore be tailored to the specific needs of the topology discovery task. For broader network probing with the aim of uncovering diverse and numerous interfaces, NSP and ISP networks provide the most efficient seeds. This study provides important insights for optimizing future IPv6 network probing strategies, highlighting the need to consider the inherent characteristics of different network categories when designing discovery experiments.

6. Discussion

6.1. Insights from Experimental Results

The experimental findings demonstrate the remarkable improvements achieved by 6Trace in IPv6 topology discovery. By leveraging a stateless asynchronous probing mechanism and dynamic bisection-inspired update strategies, 6Trace achieves a 70% increase in scanning efficiency compared to existing methods. This enhanced efficiency enables the discovery of a greater number of interface addresses within a shorter timeframe while minimizing probing overhead. These advancements make 6Trace particularly suited for large-scale network measurements, where traditional tools often struggle to balance discovery rates with resource constraints.

The role of multi-source seed sets in IPv6 topology discovery is particularly noteworthy. By integrating seed sets from NSP, ISP, CDN, and ETP networks, this study highlights how diversity in network categories directly influences discovery outcomes. Specifically, seed sets from NSP and ISP networks uncovered significantly more interfaces, demonstrating their critical role in comprehensive topology mapping. In contrast, seed sets from CDN and ETP networks were found to contribute less due to their constrained or specialized topologies. These insights emphasize the importance of tailoring seed set selection to the characteristics of the network under investigation, providing a solid foundation for refining future seed set strategies.

6.2. Limitations and Challenges

While 6Trace has shown substantial progress in improving IPv6 topology discovery, several challenges remain that may limit its applicability in certain real-world scenarios. One key challenge is its adaptability to dynamic network conditions. IPv6 networks are inherently dynamic, with routing changes and temporary interfaces occurring frequently. Although 6Trace captures snapshots of the network topology efficiently, it is less equipped to handle real-time changes, limiting its ability to continuously track network evolution.

Another limitation lies in its performance in high-latency networks. In environments characterized by geographical dispersion or congestion, increased latency can lead to packet loss and delayed responses, reducing the tool’s efficiency. While 6Trace’s rapid snapshot capturing mitigates some of these effects, the absence of retransmission mechanisms or latency-aware strategies leaves room for improvement when applied to challenging network conditions.

Finally, the practical deployment of 6Trace faces challenges associated with ICMPv6 filtering and DDoS protection systems. Many networks implement security mechanisms that block or restrict ICMPv6 traffic, potentially reducing the effectiveness of probing. Although strategies such as randomized probing sequences, low probing rates, and distributed probing across /64 subnets help mitigate these risks, their effectiveness in highly secured or aggressively filtered networks remains uncertain. Addressing these challenges will be critical for ensuring the broader applicability of 6Trace in diverse network environments.

6.3. Future Work

Future research should focus on refining and expanding the capabilities of 6Trace to address the identified challenges and improve its adaptability to evolving network conditions. A promising direction involves optimizing hybrid seed set strategies to ensure more accurate and representative probing targets. By integrating diverse data sources and enabling incremental updates, hybrid strategies can dynamically adapt to changes in network environments, enhancing the coverage and efficiency of topology discovery. Addressing issues such as data redundancy and alias filtering within these strategies will be essential for maximizing their potential.

The incorporation of machine learning techniques offers another avenue for improving IPv6 topology discovery. Clustering algorithms could be used to identify high-density address regions, while reinforcement learning can enable dynamic adjustment of probing parameters based on real-time feedback. These approaches not only improve probing efficiency but also enhance the tool’s ability to adapt to complex and unpredictable network behaviors. By leveraging these advancements, 6Trace could become a more intelligent and responsive topology discovery tool.

Improving 6Trace’s adaptability to dynamic network conditions remains a critical area for future work. IPv6 networks, characterized by frequent routing changes and temporary interfaces, pose challenges for accurate and consistent topology discovery. To address this, future enhancements could incorporate mechanisms for adaptive probing that dynamically adjust parameters such as probing direction and TTL ranges based on real-time feedback from network responses. These mechanisms would enable 6Trace to better respond to transient changes in network topology, ensuring that probing efficiency and accuracy are maintained even in highly dynamic environments. Additionally, techniques such as real-time path validation could be explored to ensure that discovered routes accurately reflect current network states, reducing the impact of temporary fluctuations.

Latency-aware probing strategies represent a further opportunity for improvement. Adaptive rate controls and retransmission mechanisms could mitigate the effects of high-latency environments, ensuring consistent performance across diverse network conditions. These enhancements would allow 6Trace to perform reliably in geographically dispersed or congested networks where traditional probing methods often fail.

Additionally, as network environments continue to evolve, researchers should explore the dynamic adjustment of parameters such as InitTTL and maxTTL. Tailoring these configurations to specific network scenarios would optimize the trade-off between efficiency and comprehensiveness, enabling more flexible and effective topology discovery. Building on this work, future studies could also explore optimization techniques for selecting high-value probing targets, focusing on discovering more interfaces with fewer resources. Such efforts would further improve the scalability and resource efficiency of 6Trace.
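To make the idea of scenario-specific TTL configuration concrete, the following sketch derives candidate values from the hop distances observed in a small pilot scan. It is only an assumption-laden illustration: the function name, the 95% coverage target, the two-hop margin, and the cap of 32 are hypothetical, and treating InitTTL as a single starting hop is a simplification of how 6Trace may actually use it.

```python
import math


def suggest_ttl_bounds(hop_distances, coverage=0.95, margin=2, hard_cap=32):
    """Suggest (init_ttl, max_ttl) from hop distances seen in a pilot scan.

    init_ttl is the lower median distance; max_ttl is the nearest-rank
    percentile at the given coverage plus a safety margin, capped at hard_cap.
    """
    if not hop_distances:
        raise ValueError("at least one observed hop distance is required")
    ordered = sorted(hop_distances)
    n = len(ordered)
    init_ttl = ordered[(n - 1) // 2]            # lower median
    rank = max(1, math.ceil(coverage * n))      # nearest-rank percentile (1-based)
    max_ttl = min(hard_cap, ordered[rank - 1] + margin)
    return init_ttl, max_ttl


# Hypothetical hop distances to ten responsive targets.
pilot = [8, 10, 12, 12, 13, 15, 15, 16, 18, 24]
print(suggest_ttl_bounds(pilot))
```

Raising `coverage` or `margin` favors comprehensiveness at the cost of extra probes, which is the trade-off between efficiency and completeness described above.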

By addressing these limitations and pursuing the outlined research directions, 6Trace can evolve into a more robust, efficient, and adaptive tool for IPv6 topology discovery, capable of meeting the demands of increasingly complex and dynamic network environments.

7. Conclusions

This paper introduces 6Trace, an efficient and scalable method for IPv6 topology discovery that addresses key challenges such as redundant probing, network congestion, and the vastness of the IPv6 address space. By leveraging a stateless asynchronous mechanism and a dynamic bisection-inspired probing strategy, 6Trace achieves a 70% improvement in scanning efficiency over existing methods, enabling rapid and accurate discovery of interface addresses with minimal overhead.

This study highlights the importance of multi-source seed sets in improving discovery coverage, demonstrating that careful selection and integration of diverse network categories significantly enhance IPv6 topology mapping. These insights provide a foundation for refining target selection strategies and optimizing probing methodologies.

Although 6Trace shows robust performance, challenges such as adapting to dynamic network conditions and mitigating ICMPv6 filtering remain areas for future improvement. Continued advancements in probing strategies and target optimization will further enhance the tool’s scalability and applicability to diverse network environments.

In conclusion, 6Trace provides a scalable and efficient solution for IPv6 topology discovery, contributing to a deeper understanding of the Internet’s evolving infrastructure. As IPv6 adoption accelerates globally, this work lays a foundation for developing next-generation network measurement tools to support a more connected and secure Internet.

Author Contributions

Methodology, Z.S. and P.C.; Writing—review & editing, Z.S., Y.X., C.C., Y.Z. and G.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets used in this paper can be found at: https://ipv6hitlist.github.io; https://catalog.caida.org/dataset/peeringdb; https://atlas.ripe.net; https://opendata.rapid7.com/sonar.fdns_v2/; https://www.caida.org/catalog/datasets/ipv6_allpref_topology_dataset/. These five datasets were accessed on 20 July 2024.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. The framework of 6Trace: modular design for targeted seed set probing. After inputting the target seed set, the sending and receiving threads, along with collaborative modules, work together to complete the scanning and obtain results.

Figure 2. State encoding in packet header of 6Trace.

Figure 3. The mechanism of probe state conversion.

Figure 4. Seed set AS distribution CDF.

Figure 5. Category distribution of seed sets.

Figure 6. Interfaces and destinations per hop.

Figure 7. Probes and interfaces ratio at different maxTTL values.

Figure 8. 6Trace scalability: interface discovery and efficiency trends.

Table 1. Comparison experiment on multi-source seed properties (IID: interface identifier).

Name	Date	Num	ASNs	IID: EUI-64	IID: Low-Byte	IID: Embed-IPv4	IID: Byte-Pattern	IID: Randomized
IPv6 Hitlists	17 June 2024	23.09 M	20.1 k	1.4 M (6.26%)	8.47 M (36.66%)	4.05 M (17.54%)	0.57 M (2.46%)	8.34 M (36.12%)
AddrMiner’s Hitlists	April 2024	5.90 M	9.49 k	17 k (0.30%)	0.41 M (6.98%)	109.12 k (1.85%)	106.99 k (1.81%)	5.24 M (88.88%)
NTP	January 2022–August 2022	5.25 M	7.78 k	0 (0.00%)	2.53 M (48.14%)	0 (0.00%)	0 (0.00%)	2.73 M (51.86%)
RIPE Atlas	March 2024–April 2024	84 k	2.35 k	63.98 k (75.72%)	3.29 k (3.90%)	0.62 k (0.73%)	0.12 k (0.15%)	16.33 k (19.32%)
FDNS	November 2017–November 2021	6.36 M	6.07 k	173.34 k (2.73%)	716.54 k (11.27%)	289.61 k (4.55%)	500.31 k (7.87%)	4.66 M (73.22%)
CAIDA	May 2014–June 2024	47.28 M	6.35 k	24.39 k (0.05%)	391.49 k (0.83%)	42.69 k (0.09%)	29.41 k (0.06%)	46.79 M (98.95%)

Table 2. Performance comparison of 6Trace, Flashroute, and Yarrp6 (bold values represent the best results).

Experimental Setup	Interfaces	Probes	Scan Time	• Scanning Efficiency
Flashroute-16 (gaplimit 8)	106.23 k	12.14 M	496 s	0.0088
Flashroute-32	109.18 k	21.45 M	895 s	0.0051
Yarrp-16 (Fill Mode)	157.81 k	16.94 M	684 s	0.0093
Yarrp-32	167.72 k	32 M	1278 s	0.0052
6Trace-24	162.93 k	9.98 M	469 s	0.0163
6Trace-32	164.75 k	12.48 M	519 s	0.0132

• Scanning efficiency is defined as the ratio of newly discovered interfaces to the total number of probe packets sent.
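As a quick check of this definition, the short script below recomputes the efficiency column from the interface and probe counts reported in Table 2. The function and variable names are illustrative only, and the counts are the rounded values printed in the table, so the results agree only to the four decimal places shown there.

```python
def scanning_efficiency(interfaces, probes):
    """Ratio of discovered interfaces to probe packets sent."""
    if probes <= 0:
        raise ValueError("probes must be positive")
    return interfaces / probes


# (interfaces, probes) as reported in Table 2, expanded from k and M units.
table2 = {
    "Flashroute-16": (106.23e3, 12.14e6),
    "Flashroute-32": (109.18e3, 21.45e6),
    "Yarrp-16": (157.81e3, 16.94e6),
    "Yarrp-32": (167.72e3, 32e6),
    "6Trace-24": (162.93e3, 9.98e6),
    "6Trace-32": (164.75e3, 12.48e6),
}

for name, (interfaces, probes) in table2.items():
    print(f"{name}: {scanning_efficiency(interfaces, probes):.4f}")
```

The same function reproduces the efficiency column of Table 3 when given the per-category counts.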

Table 3. Performance of different network categories (the values in bold represent the best results; “Median Distance” refers to the median distance from the source to the target).

Category	Interfaces	Probes	Scanning Efficiency	Median Distance (Target-Origin)
CDN	16.68 k	1.03 M	0.0162	12
ISP	209.60 k	1.18 M	0.1776	15
NSP	302.40 k	1.33 M	0.2274	18
ETP	6.19 k	1.26 M	0.0049	19

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
