Fast Reroute Mechanism for Satellite Networks Based on Segment Routing and Dual Timers Switching

Du, Jinyan; Zhang, Ran; Hu, Jiangbo; Xia, Tian; Liu, Jiang

doi:10.3390/aerospace12030233


by Jinyan Du ¹, Ran Zhang ²,³,*, Jiangbo Hu ², Tian Xia ² and Jiang Liu ²,³

¹ China United Network Communication Group Co., Ltd., Beijing 100033, China

² State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China

³ Purple Mountain Laboratories, Nanjing 211111, China

* Author to whom correspondence should be addressed.

Aerospace 2025, 12(3), 233; https://doi.org/10.3390/aerospace12030233

Submission received: 6 January 2025 / Revised: 18 February 2025 / Accepted: 10 March 2025 / Published: 13 March 2025

(This article belongs to the Section Astronautics & Space Science)


Abstract

Low-Earth-Orbit (LEO) satellite networks offer global internet coverage and low latency, and they have enjoyed great success in the past few years. In LEO satellite networks, laser-based inter-satellite links (ISLs) are widely employed to achieve on-board data relay and to provide high-capacity backhaul worldwide. However, ISLs are prone to breaking due to outages of the ISL capturing, tracking, and aiming systems, and breaks with different causes can last from milliseconds to hours. This mix of ISL faults causes the on-board routing protocol to flap frequently, resulting in high routing overhead, slow convergence, and degraded service consistency. In this work, we propose a hybrid fault detection mechanism to identify transient and long-term ISL outages. For transient link outages, a segment routing-based loop-free backup path provides real-time transmission recovery, while precise global route convergence restores routing after long-term failures. To handle inconsistent routing tables during the transition from a transient to a long-term fault, we propose a dual-timer mechanism that ensures the path is switched smoothly without micro-loops. Simulation results validate the feasibility and efficiency of the proposed scheme.

Keywords: segment routing; fast rerouting; dual-timer mechanism; backup path design; LEO satellite networks

1. Introduction

Satellite communication systems, with their extensive coverage, scalability, and absence of geographic blind spots, play an indispensable role in integrated networks spanning space, air, land, and sea. As a complement and extension to terrestrial networks, satellite networks facilitate bridging the digital divide across regions and expanding the coverage and service scope of terrestrial networks. These networks are widely employed to provide communication services in remote areas or during disaster relief operations. In recent years, with the advancements in Low-Earth-Orbit (LEO) satellite technology, the development and application of satellite networks have ushered in new opportunities, enabling large-scale IoT communication services in scenarios such as industrial IoT, agricultural automation, and offshore oil platforms [1].

However, in practice, the dynamic nature of satellite network topologies and the instability of ISLs result in relatively high transmission delays and increased susceptibility to ISL failures [1,2]. These failures can compromise the stability of traffic transmission in satellite networks. Therefore, improving the existing fast reroute (FRR) schemes in satellite networks is of great significance.

When addressing satellite network ISL failures, common emergency measures involve swiftly adjusting the physical state of devices to restore the underlying transport network [3]. Nevertheless, solely relying on such physical adjustments is often insufficient to ensure reliability, as network layer rerouting [4,5] is typically required to address failure scenarios. Upon detecting ISL failures through fault detection mechanisms, satellite network routing protocols leverage fast reroute mechanisms to redirect traffic to alternative paths that bypass the affected regions.

Existing FRR schemes for satellite networks fail to differentiate between stable second-level link failures and transient faults that last only milliseconds [6]. This leads to the insufficient detection of transient faults and instability in traffic forwarding. Furthermore, current FRR schemes have shortcomings in the design of both primary and backup paths, overlooking the importance of a rational switch between them [7].

The impact of these two defects is manifested when the link failure detection mechanism, which is designed to perceive stable failures, is used on its own. If the failure detection time threshold is set too long, the stability of traffic forwarding in the satellite network under transient failures will decrease. Conversely, if the failure detection time threshold is set too short, the entire underlay routing table will frequently re-converge, causing the primary and backup paths to switch irrationally and frequently, leading to unnecessary overhead.

To address these issues, this paper proposes an innovative fast rerouting scheme for satellite networks, based on segment routing and dual-timer control, designed to overcome these limitations. By employing effective fault detection mechanisms and optimized path switching strategies, the proposed approach aims to improve satellite network performance in complex fault scenarios. The main contributions of this work are outlined as follows.

Joint Failure Sensing Method. To address the insufficient detection of transient failures in existing approaches, this paper proposes a joint failure sensing method that integrates hardware-based detection for transient faults and link failure detection for stable failures.

Dual-Timer Controlled Path Switching. To counteract irrational switching between primary and backup paths in existing schemes, a dual-timer controlled path switching mechanism is proposed.

Optimized Backup Path Design Using Segment Routing. An optimized backup path design is proposed, leveraging segment routing and virtual port mechanisms to enhance traffic redirection.

The remainder of this article is organized as follows: Section 2 reviews related work on rerouting strategies. Section 3 introduces the proposed SRDT-FRR scheme based on detailed fault classification, joint fault detection, and backup path management. Section 4 describes the simulation setup, including network topology, fault scenarios, and performance metrics for evaluation; it also presents the results and discusses the advantages of the proposed scheme in terms of recovery time, resource utilization, and network stability. Finally, the last section concludes with a summary of contributions.

2. Related Work

Current rerouting strategies are generally classified into passive and active rerouting [8,9]. Passive rerouting relies on Interior Gateway Protocols (IGPs) to achieve network convergence after a failure [10,11]; however, during the convergence period, traffic cannot be rerouted promptly. In contrast, active rerouting involves precomputing backup paths and swiftly redirecting traffic to these alternative routes upon the detection of link failures. Nonetheless, active rerouting faces significant challenges, particularly related to the limitations of localized fault detection at network nodes and the asynchronous nature of network state updates [12,13]. These issues can potentially lead to routing instability, such as the formation of routing loops.

To optimize rerouting strategies and enhance the stability and efficiency of terrestrial networks, numerous classical solutions have been developed in this field. In [14], a fast rerouting (FRR) method is proposed, wherein a Point of Local Repair (PLR) reroutes traffic to precomputed Loop-Free Alternates (LFAs) when a failure occurs in protected components, such as links or nodes. However, in certain topologies, the PLR may be unable to find an LFA for specific destinations, leading to the absence of backup paths. In [15], the concept of Remote LFA (RLFA) is introduced, utilizing Multiprotocol Label Switching (MPLS) to direct data to a remote node, bypassing the PLR. While RLFA addresses some of the limitations of the LFA approach, it does not guarantee complete coverage due to challenges in selecting appropriate remote nodes. In [16], the TI-LFA, which is based on segment routing, is proposed to overcome the limitations of RLFA. This method prevents loops by encoding the backup path as a segment list, enabling rerouting in scenarios where conventional paths could lead to loops.

Building upon the classical rerouting strategies, some scholars have proposed rerouting strategies specifically suited for satellite networks. In [7], an innovative FRR algorithm for LEO satellite networks with relay satellites is presented. This algorithm combines centralized and distributed routing strategies to facilitate efficient backup path updates without requiring full recalculations. However, it faces challenges in topology design, such as the need for the careful planning of relay satellite placements and clearly defining the roles of satellites responsible for calculation and forwarding. In [17], a solution to the issues of traffic imbalance and rapid failure recovery in ground station-assisted LEO satellite constellations is proposed. The DBTS/DBTS+ traffic allocation algorithm and the LFA/LFA+ fast rerouting mechanism are introduced. DBTS+ reduces the maximum satellite load by 30% by prioritizing traffic allocation to shorter paths, with only a 10% increase in delay. LFA+ improves the protection coverage by approximately 15% by employing two backup paths. In [18], a satellite-customized segment routing protocol (STSR) is introduced to address the low bandwidth efficiency of segment routing in resource-constrained satellite environments. Using bitmap encoding (SBE) and the path unwrapping mechanism (BUM), the protocol reduces the routing header size by 43.51% (at a 2000-satellite scale) compared to SRv6, achieving a bandwidth efficiency improvement of 43% to 92%. In [19], a geo-vector-based segment routing method is proposed to mitigate the issues of load imbalance and frequent topology changes in multilayer satellite networks. Through dynamic routing table maintenance strategies (time/state hybrid trigger updates) and regional hierarchical management, the impact of satellite movement is mitigated. The end-to-end delay is reduced to near theoretical values, while the throughput is increased by 5% in high-load scenarios. Finally, ref. [20] introduces a priority-based rerouting method that calculates link redundancy and selects backup paths based on a minimal interference algorithm. This approach reduces the resources needed for path recalculations during failures and shortens repair times.

3. Design and Implementation of Fast Rerouting Scheme Based on Segment Routing and Dual Timers Switching

This section presents the design and implementation of the SRDT-FRR scheme. First, it outlines the conceptual framework of the proposed approach, with a focus on the classification of link failures and the joint sensing mechanism for transient and stable failures. Next, the design of each functional module is detailed, including the onboard control plane, the link failure detection module, the backup path activation timer module, and the backup path deactivation timer module, highlighting their respective roles in failure detection and traffic recovery. Furthermore, the section discusses the construction and utilization of SR backup paths, with particular emphasis on the label stack-based path encoding strategy and its application logic during failure switching. Finally, a comprehensive analysis of the router behaviors is provided, illustrating the key operations performed by different types of nodes (PLR nodes, intermediate nodes, and egress nodes) in backup path activation, label processing, and underlay routing restoration. This analysis offers thorough theoretical and technical support for the effectiveness and feasibility of the fast rerouting scheme.

3.1. Overview of the SRDT-FRR Scheme

This scheme establishes a comprehensive and well-coordinated architectural framework, with a focus on optimizing and managing backup paths in satellite networks. To mitigate transmission interruptions caused by link instability, the SRDT-FRR scheme is proposed. By leveraging the flexibility of segment routing (SR) technology, the speed of hardware-based detection, and the precision of link monitoring, the proposed approach enables the joint sensing of transient and stable failures, facilitating rapid fault recovery.

3.1.1. Scheme Modeling

This paper defines a satellite network as a graph G = (V,E), where V represents the set of satellite nodes and E denotes the set of ISLs. A satellite constellation consists of N orbital planes, each containing M satellites. The proposed scheme effectively tackles the challenges posed by link failures through the coordinated operation of multiple functional modules, facilitating efficient traffic management and fault recovery.
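The graph model above can be made concrete with a short sketch. The paper only states that the constellation has N orbital planes with M satellites each; the connectivity used here (each satellite linked to its successor in the same plane and to the same slot in the next plane, wrapping around in both directions) is an assumption made for illustration, and `build_constellation` is a hypothetical helper name.

```python
from itertools import product

def build_constellation(n_planes, sats_per_plane):
    """Return (V, E) for a grid-like constellation.

    Assumption (not from the paper): every satellite keeps two intra-plane
    ISLs and two inter-plane ISLs, with wrap-around in both directions.
    """
    nodes = list(product(range(n_planes), range(sats_per_plane)))
    edges = set()
    for plane, slot in nodes:
        intra = (plane, (slot + 1) % sats_per_plane)   # next satellite, same plane
        inter = ((plane + 1) % n_planes, slot)         # same slot, adjacent plane
        edges.add(frozenset({(plane, slot), intra}))
        edges.add(frozenset({(plane, slot), inter}))
    return nodes, edges

# N = 10 planes and M = 10 satellites per plane, as in the evaluation section.
V, E = build_constellation(10, 10)
degree = {v: sum(1 for e in E if v in e) for v in V}
print(len(V), len(E))
```

Under this assumed connectivity every node has four ISLs, so a single link failure always leaves alternative routes around the failed link.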

As shown in Figure 1, the SRDT-FRR scheme utilizes a set of collaboratively functioning modules to effectively address the challenges posed by link failures in satellite networks, enabling efficient traffic management and fault recovery. Firstly, the onboard control plane acts as the core decision-making unit, overseeing the overall management and control of the satellite network. Its key functions include facilitating information exchange between satellite nodes, formulating and updating routing strategies, and calculating and configuring backup paths. Secondly, the hardware monitoring module focuses on the high-precision, real-time monitoring of network links at short intervals, specifically targeting millisecond-level link interruptions caused by laser instability. By employing high-sensitivity hardware, the module quickly detects instantaneous link anomalies, triggering the PLR to swiftly switch to a backup path and ensure uninterrupted traffic flow.

Upon the occurrence of a link disruption and the subsequent switch to a backup path, the hardware monitoring module activates a backup path activation timer. This timer regulates the duration for which the backup path remains active, thereby preventing frequent switching between the primary and backup paths caused by transient faults. The link fault detection module is responsible for periodically monitoring the status of links within the satellite network. It sends detection signals at relatively long intervals and observes the link response. When a transient fault transitions into a long-term stable fault, the link fault detection module triggers a backup path deactivation timer to manage the timing of backup path switching. By setting an appropriate duration, this module ensures network stability during the fault recovery process.

3.1.2. Backup Path Design

Based on segment routing (SR), the proposed scheme achieves 100% link protection coverage under single-link failure conditions [21]. As shown in Figure 2, backup paths are encoded as ordered segment lists, referred to as backup path label stacks, which guide the satellite nodes in processing and forwarding packets. Each segment, or label, represents an instruction executed by a satellite node upon receiving a packet. By combining multiple segments into an ordered list, packets can be directed along any path within the satellite network, independent of routing protocols or shortest-path constraints. The SR backup path packets carry the backup path label stack in their headers, with nodes executing the instructions in the stack using three basic operations: PUSH, CONTINUE, and NEXT.

According to the Global Backup Path Calculation and Activation outlined in Algorithm 1, upon detecting a failure, the PLR node promptly identifies the fault through hardware detection and immediately switches to the local backup path, ensuring uninterrupted traffic flow. If the fault evolves into a stable, long-term failure, it is subsequently detected by the link failure detection mechanism as the fault duration extends. This triggers a network-wide topology update, leading to the convergence of a new onboard underlay routing configuration, after which the backup path can revert to the underlay routing.

Algorithm 1 Global backup path calculation and activation

Input: Network graph $G = (V, E)$
Output: Efficient rerouting using precomputed backup paths

// Step 1: Backup path precomputation (for each PLR node)
for each node $n \in V$ do
    for each link $l \in E$ do
        Precompute the backup path for the failure of link l
        Store the backup path in the backup routing table of n with lower priority than the base route
    end for
end for
// Step 2: Table-based forwarding with SR backup path activation
for each packet forwarded by PLR node n do
    if fault detected then
        Redirect traffic to the SR backup path label stack
    end if
end for
// Step 3: Backup path activation and deactivation
if a failure occurs then
    PLR node activates the backup path by increasing its priority in the routing table
    Traffic is redirected to the backup path
    if the fault is resolved then
        Deactivate the backup path by restoring the base route priority
        Redirect traffic back to the primary path
    end if
end if
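Step 1 of Algorithm 1 can be sketched as follows. The paper does not specify how each backup path is computed, so a plain breadth-first search that avoids the protected link is swapped in here as a stand-in. The sketch also narrows the inner loop to the links incident to each node, which is a simplification. `shortest_path` and `precompute_backups` are hypothetical names.

```python
from collections import deque

def shortest_path(adj, src, dst, banned_link=None):
    """Hop-count shortest path via BFS, skipping one undirected link."""
    banned = set(banned_link) if banned_link is not None else None
    prev = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v in adj[u]:
            if banned is not None and {u, v} == banned:
                continue  # pretend the protected link has failed
            if v not in prev:
                prev[v] = u
                queue.append(v)
    if dst not in prev:
        return None  # no detour exists in this topology
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]

def precompute_backups(adj):
    """For every node n and incident link (n, m), store a detour from n to m."""
    return {
        (n, m): shortest_path(adj, n, m, banned_link=(n, m))
        for n in adj
        for m in adj[n]
    }

# Four-node ring 0-1-2-3-0, used only as a toy topology.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
backup_table = precompute_backups(ring)
print(backup_table[(0, 1)])  # [0, 3, 2, 1]
```

In the ring, the only way around the failed link 0-1 is the long way through nodes 3 and 2. A real deployment would likely weight links by delay rather than count hops.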

3.2. Node Behavior Design

Within the SR backup path, the source node is responsible for converting base route packets into SR backup path packets and pushing the SR backup path labels. Intermediate nodes forward the packets by examining the labels according to the SR mechanisms. Upon reaching the backup path endpoint, the node converts the SR backup path packets back into base route packets and forwards them using the onboard underlay routing configuration.

3.2.1. PLR Node Behavior

According to the Backup Path Activation and Deactivation with Two Timers algorithm outlined in Algorithm 2, the PLR first precomputes the backup paths for potential link failures in its vicinity. To enable table-based forwarding in underlay routing, the PLR, acting as the head node of the SR backup path, introduces an additional mechanism into the underlay routing table lookup process. When a fault is encountered during base route forwarding, the PLR redirects traffic to the SR backup path label stack through a backup routing table entry. This backup entry, which has a lower priority than the original base route entry for the destination, utilizes a virtual port to direct traffic toward the SR backup path.

Upon detecting a failure with millisecond precision through hardware monitoring, the PLR activates the SR backup path. It increases the priority of the backup path entry in the underlay routing table, making it higher than that of the primary path entry, thereby redirecting traffic to the backup path’s virtual port. This operation modifies only the priority of the affected backup routing entries, without altering the underlay routing entries, thereby minimizing the impact of the switch on traffic flow.

To mitigate the cost of frequent switching between primary and backup paths during transient faults, the PLR sets a backup path activation timer when switching to the SR backup path via hardware detection. If the timer expires and the fault is resolved, the backup path’s priority is lowered below that of the primary path, returning traffic to the primary path. If a new transient fault occurs during this period, the backup path activation timer is reset.

Algorithm 2 Backup path activation and deactivation with dual timers

Input: PLR node n and link failure detection mechanism
Output: Efficient management of backup path activation and deactivation

// Step 1: Backup path activation timer for transient failures
for each PLR node $n \in V$ do
    if transient fault is detected then
        Set backup path activation timer
        if timer expires and fault is resolved then
            Redirect traffic back to primary path
        else if stable fault occurs during timer then
            Cancel backup path activation timer
        end if
    end if
end for
// Step 2: Link failure evolution and long-term stable fault detection
if a fault evolves into a long-term stable fault then
    Flood fault information across the network
    Deploy converged base route with the same priority as the original primary path
    Set backup path deactivation timer
end if
if backup path deactivation timer expires then
    Redirect traffic back to the onboard underlay routing system
end if

Additionally, during the activation timer for transient faults, if the link failure detection mechanism identifies that the fault has evolved into a long-term stable failure, this fault information is flooded across the network. A converged base route with a priority equal to that of the original primary path is deployed, and a backup path deactivation timer is set. Once the deactivation timer expires, the backup path’s priority is lowered below that of the base route, reverting traffic to the onboard underlay routing system.

The priority control of the backup path is a core mechanism of this scheme. This formula describes the process by which the PLR node adjusts the priority of the backup path based on the network status (i.e., whether a fault has occurred):

$$P_{backup}(t) = \begin{cases} P_{base} + \Delta P, & \text{backup path is activated} \\ P_{base} - \Delta P, & \text{backup path is deactivated} \\ P_{backup}(t-1), & \text{otherwise} \end{cases} \qquad (1)$$

where $P_{backup}(t)$ is the priority of the backup path at time t, $P_{base}$ is the priority of the underlay routing path at time t, and $\Delta P$ is the adjustment in priority, representing the change in priority of the backup path relative to the base route.
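A minimal sketch of how Equation (1) drives route selection is given below. The numeric priorities, the convention that a larger number wins, and the names `backup_priority` and `select_entry` are assumptions for illustration. The paper only requires that the backup entry rank above the base route when activated and below it otherwise.

```python
def backup_priority(previous, base, delta, event):
    """Equation (1): event is 'activate', 'deactivate', or None (no change)."""
    if event == "activate":
        return base + delta
    if event == "deactivate":
        return base - delta
    return previous

def select_entry(entries):
    """Return the name of the highest-priority routing entry."""
    return max(entries, key=lambda item: item[1])[0]

P_BASE, DELTA_P = 100, 10  # illustrative values, not from the paper

p_backup = backup_priority(None, P_BASE, DELTA_P, "deactivate")   # initial state
normal = select_entry([("primary", P_BASE), ("backup", p_backup)])

p_backup = backup_priority(p_backup, P_BASE, DELTA_P, "activate")  # fault detected
during_fault = select_entry([("primary", P_BASE), ("backup", p_backup)])

p_backup = backup_priority(p_backup, P_BASE, DELTA_P, None)        # no event
unchanged = p_backup
print(normal, during_fault, unchanged)
```

Only the backup entry's priority changes. The primary entry is never touched, which matches the statement that the switch does not alter the underlay routing entries.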

This formula describes the mechanism of the backup path activation timer, which is used to control the switching between primary and backup paths and to mitigate the overhead caused by frequent path switches:

$$T_{bpat}(t) = \begin{cases} T_{reset}, & \text{new transient fault is detected} \\ 0, & \text{backup path is deactivated} \end{cases} \qquad (2)$$

where $T_{bpat}(t)$ represents the value of the backup path activation timer at time t, and $T_{reset}$ is the reset value of the timer, which is used to reset the timer in the event of a new fault occurrence. Once the fault is resolved and the backup path is deactivated, the timer value is reset to 0, indicating that the backup path is no longer active.

To provide a clearer description of the synchronization mechanisms and potential race conditions between the hardware monitoring and link failure detection modules, Figure 3 illustrates the state transition of the PLR node. The following content analyzes the relationships between the primary and backup routes, the triggering conditions for switching between backup and primary paths, the role of timers, and other relevant details, based on the states of the PLR node.

1. Pre-Update Primary Route State:

Under normal circumstances, the PLR operates in the pre-update primary route state. This is the initial state, in which the PLR forwards data packets using the primary route and precomputes the backup route. In this state, the priority of the backup route is lower than that of the primary route. The PLR will only transition from the initial state to the backup path state if a failure is detected by the hardware monitoring module.

2. Backup Path State:

The backup path state is an intermediate state, where the priority of the backup route is adjusted to be higher than that of the primary route. This state is more complex, as its maintenance and transition are influenced by the backup path activation timer and the backup path deactivation timer. Two timers, which are unique to the PLR, are introduced as follows.

After a transient fault is detected by hardware monitoring and the PLR switches to the backup path, a short-term failure timer (also known as the backup path activation timer) is set to avoid the frequent switching between the backup and primary paths, which would incur overhead. As long as the backup path activation timer has not expired, the path will remain on the backup route.

In the case of long-lasting faults detected by the link failure detection mechanism, the PLR provides the capability to switch to the backup path. After the failure is detected, to mitigate micro-loops caused by flooding delay in the fault information, the PLR sets the backup path deactivation timer. The duration of this timer is recommended to be in seconds, covering the maximum flooding delay, the time needed for recalculating the primary route, and the propagation delay. This timer duration depends on the constellation scale and requires configuration by the network management system.
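The sizing rule for the deactivation timer can be written as a simple sum. The sketch below is only a lower-bound estimate under stated assumptions: the worst-case flooding hop count is a guess, the 20 ms per-hop figure is the upper end of the ISL delay range used in the evaluation, and the 50 ms recalculation time is the approximate value reported in the results. `deactivation_timer_ms` is a hypothetical helper.

```python
def deactivation_timer_ms(max_flooding_ms, recalculation_ms, propagation_ms,
                          safety_factor=1):
    """Lower bound on the backup path deactivation timer, in milliseconds."""
    return safety_factor * (max_flooding_ms + recalculation_ms + propagation_ms)

# Illustrative inputs only; none of these is a configured value from the paper.
HOPS = 10          # assumed worst-case hop count for flooding
PER_HOP_MS = 20    # upper end of the 10-20 ms ISL delay range
flooding_ms = HOPS * PER_HOP_MS
timer_ms = deactivation_timer_ms(flooding_ms, recalculation_ms=50,
                                 propagation_ms=PER_HOP_MS)
print(timer_ms)  # 270
```

Since the text recommends a value on the order of seconds, an operator would presumably round such an estimate up generously, for example through `safety_factor`.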

3. Post-Update Primary Route State:

The post-update primary route state represents a stable state after the failure, when the primary route has converged. In this state, the priority of the backup route is adjusted to be lower than that of the converged primary route. The PLR then forwards data packets using the converged primary route.
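The three states and two timers described above can be sketched as a small event-driven state machine. This is an illustrative reading of the text, not the authors' implementation: the class and method names, the millisecond timer values, and the choice to ignore hardware events in the post-update state are all assumptions.

```python
from enum import Enum

class State(Enum):
    PRIMARY_PRE = "pre-update primary route"
    BACKUP = "backup path"
    PRIMARY_POST = "post-update primary route"

class PlrNode:
    def __init__(self, t_activation, t_deactivation):
        self.t_activation = t_activation      # backup path activation timer length
        self.t_deactivation = t_deactivation  # backup path deactivation timer length
        self.state = State.PRIMARY_PRE
        self.fault_active = False
        self.activation_deadline = None
        self.deactivation_deadline = None
        self.switch_count = 0

    def _enter(self, state):
        if state is not self.state:
            self.state = state
            self.switch_count += 1

    def on_hardware_fault(self, now):
        """Hardware monitoring reports a (possibly transient) fault."""
        self.fault_active = True
        if self.state is State.PRIMARY_PRE:
            self._enter(State.BACKUP)
        if self.state is State.BACKUP and self.deactivation_deadline is None:
            # Every new transient fault restarts the activation timer.
            self.activation_deadline = now + self.t_activation

    def on_hardware_recovered(self, now):
        self.fault_active = False

    def on_stable_fault(self, now):
        """Link failure detection reports a long-term stable fault."""
        if self.state is State.PRIMARY_PRE:
            self._enter(State.BACKUP)
        if self.state is State.BACKUP:
            self.activation_deadline = None   # cancel the activation timer
            self.deactivation_deadline = now + self.t_deactivation

    def tick(self, now):
        """Evaluate timer expiry at time `now`."""
        if self.state is not State.BACKUP:
            return
        if self.deactivation_deadline is not None:
            if now >= self.deactivation_deadline:
                self.deactivation_deadline = None
                self._enter(State.PRIMARY_POST)
        elif (self.activation_deadline is not None
              and now >= self.activation_deadline
              and not self.fault_active):
            self.activation_deadline = None
            self._enter(State.PRIMARY_PRE)

# Two transient faults in quick succession cause one switch out and one back.
plr = PlrNode(t_activation=1000, t_deactivation=3000)  # assumed values, in ms
plr.on_hardware_fault(0)
plr.on_hardware_recovered(500)
plr.on_hardware_fault(800)       # restarts the activation timer (deadline 1800)
plr.tick(1000)                   # still on the backup path
state_mid = plr.state
plr.on_hardware_recovered(1300)
plr.tick(1800)
state_after_transient = plr.state
switches_after_transient = plr.switch_count

# A fault that turns out to be stable ends in the post-update state.
plr.on_hardware_fault(5000)
plr.on_stable_fault(8000)
plr.tick(9000)
plr.tick(11000)
state_after_stable = plr.state
```

In this run the two overlapping transient faults produce only two path switches in total, which is the flapping suppression the activation timer is meant to provide.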

3.2.2. Intermediate Nodes Behavior

When an intermediate node receives a packet, it identifies the packet as part of the SR backup path. The node forwards the packet along the path specified by the label. Specifically, the node determines the outgoing port and label operation code based on the label. It then pops the top adjacency label from the label stack in the packet header and forwards the packet accordingly.

3.2.3. Endpoint Behavior

When the egress node (stack bottom) receives a packet, it checks if the packet is part of the SR backup path and if the label in the stack is the bottom label. If both conditions are met, the node is recognized as the stack bottom. The stack bottom node then removes the SR node label from the stack, converts the packet into an underlay routing packet, performs an underlay routing table lookup, and forwards the packet based on the matching outgoing port.
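The division of labor between the PLR, intermediate nodes, and the egress node can be traced with a toy label stack. The label encoding here, tuples of `("adj", next_hop)` closed by a bottom `("node", endpoint)` label, is a hypothetical simplification. The pop performed at each intermediate node plays the role of SR's NEXT operation.

```python
def build_label_stack(path):
    """PLR side: path = [plr, hop1, ..., endpoint] -> (first_hop, labels)."""
    labels = [("adj", nxt) for nxt in path[2:]]  # one adjacency label per later hop
    labels.append(("node", path[-1]))            # bottom-of-stack node label
    return path[1], labels

def forward(node, labels):
    """Process the top label at `node`; return (next_hop, remaining_labels)."""
    kind, target = labels[0]
    if kind == "node" and len(labels) == 1 and target == node:
        return None, []            # stack bottom: hand over to the underlay lookup
    if kind == "adj":
        return target, labels[1:]  # pop the adjacency label and forward
    raise ValueError(f"unexpected label {labels[0]!r} at node {node!r}")

def walk(path):
    """Follow a packet along the backup path and list the nodes it visits."""
    hop, labels = build_label_stack(path)
    visited = [path[0], hop]
    while True:
        nxt, labels = forward(hop, labels)
        if nxt is None:
            return visited
        visited.append(nxt)
        hop = nxt

trace = walk([0, 3, 2, 1])
print(trace)  # [0, 3, 2, 1]
```

Once `forward` returns `None`, the packet has left the SR backup path and would be forwarded by the ordinary underlay routing table from that node onward.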

4. Performance Evaluation

This section presents the simulation setup and performance evaluation of the proposed SRDT-FRR scheme for fault recovery in satellite networks. The experiment is conducted with a constellation of 100 satellites, distributed across 10 orbital planes, with 10 satellites per plane, and an orbital altitude of 1000 km. The inter-satellite link delay is set to range from 10 ms to 20 ms to simulate dynamic network conditions. To evaluate the effectiveness of the SRDT-FRR scheme, it is compared with several baseline FRR methods, including SS-FRR, BP-LSRFD, BP-BFD, and LSRFD. These schemes differ in their fault detection, backup path generation, and recovery strategies, with routing interruption time and primary-backup path switching overhead serving as the key performance metrics. Three fault scenarios (stable, transient, and mixed faults) are designed for the simulations, which run for 6000 s. The results show that the SRDT-FRR scheme significantly outperforms the baseline methods in terms of fault detection speed, recovery time, and minimizing path switching overhead, thereby enhancing routing reliability and fault recovery efficiency in satellite networks.

4.1. Simulation Setup

The experiments in this study are conducted using a constellation of 100 satellites of the same specifications in a 1000 km orbit, consisting of 10 orbital planes, each containing 10 satellites. The inter-satellite link delay varies dynamically between 10 ms and 20 ms.

To evaluate the effectiveness of the proposed scheme, comparisons are made with several baseline FRR methods [7]:

SS-FRR Scheme: This scheme integrates segment routing, enabling satellites with greater computational resources to autonomously generate backup paths without relying on ground controllers. This reduces the interaction delay associated with centralized routing.
Backup Path-based Link-State Routing Fault Detection Mechanism (BP-LSRFD): This mechanism detects faults within 3 s of occurrence and swiftly switches to backup paths.
Backup Path-based BFD Detection (BP-BFD): This mechanism detects faults within 90 ms and promptly switches to backup paths.
Link-State Routing Fault Detection Mechanism (LSRFD): This mechanism detects faults within 3 s, recalculates routes, and floods fault information across the network.

Previous studies have evaluated FRR-based routing schemes in satellite networks using metrics such as routing interruption time, storage overhead for backup path forwarding tables, and computational resource utilization [22,23,24]. To demonstrate the superiority of the proposed SRDT-FRR scheme, this study focuses on the routing interruption time and primary-backup path switching overhead as key performance metrics.

It should be noted that a single performance metric is insufficient to comprehensively and accurately evaluate the advantages and disadvantages of various schemes. Therefore, this study applies min-max normalization to the two selected performance metrics and subsequently calculates weighted impact scores. These scores serve as a critical basis for comprehensively assessing the relative performance of different schemes, ensuring the scientific validity and reliability of the evaluation results. This approach more accurately underscores the advantages and significance of the SRDT-FRR scheme in satellite network routing. A shorter routing interruption time indicates better network stability and continuity. Therefore, this study places greater emphasis on the metric of routing interruption time. Based on this consideration, a relatively higher weight is assigned to routing interruption time when calculating the weighted impact scores. Correspondingly, a smaller weighted score reflects the superior overall performance of the respective scheme, indicating a higher capability to meet the requirements for efficient and stable operation in satellite networks.
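The scoring procedure described above can be sketched as follows. The paper states only that routing interruption time receives the higher weight, so the 0.7/0.3 split is an assumption. The input numbers are made up for illustration and are not the paper's measurements. `min_max` and `impact_scores` are hypothetical names.

```python
def min_max(values):
    """Min-max normalize a list to [0, 1]; constant lists map to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def impact_scores(interruption_ms, switch_counts,
                  w_interruption=0.7, w_switch=0.3):
    """Weighted sum of the two normalized metrics; smaller is better."""
    norm_i = min_max(interruption_ms)
    norm_s = min_max(switch_counts)
    return [w_interruption * i + w_switch * s for i, s in zip(norm_i, norm_s)]

# Made-up measurements for three hypothetical schemes (not the paper's data).
interruption = [100, 3100, 1600]   # routing interruption time in ms
switches = [2, 6, 4]               # primary-backup path switch count
scores = impact_scores(interruption, switches)
best = scores.index(min(scores))
print(scores, best)
```

Because both metrics are normalized against the schemes being compared, a score is only meaningful relative to the other schemes in the same scenario.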

The simulation duration is set to 6000 s. To comprehensively evaluate the fault recovery capabilities of the SRDT-FRR scheme, three fault scenarios are designed, with one packet transmitted every 1 ms to calculate the routing interruption time:

Fault scenario a: Three random stable faults (5 s each). This scenario tests the fault recovery capability of the SRDT-FRR scheme under stable faults;
Fault scenario b: Three random transient faults (500 ms each). This scenario evaluates the low switching overhead of the SRDT-FRR scheme during transient faults;
Fault scenario c: Mixed faults (three 5 s stable faults and three 500 ms transient faults). This scenario assesses the scheme’s sensitivity to transient faults, demonstrating its ability to promptly detect and recover from such faults, reducing the routing interruption time.

4.2. Simulation Results

Figure 4 and Figure 5 present the comparison results for fault scenario A, showing the routing recovery times and the primary-backup path switch counts. The SRDT-FRR scheme and Scheme 3 recover from faults within approximately 100 ms, with primary-backup path switching times of around 10 ms, resulting in lower routing interruption times. In contrast, the other schemes take about 3100 ms for routing recovery, with route recalculation times of around 50 ms, leading to longer routing interruption times. Additionally, the impact score results in Table 1 show that SRDT-FRR has the smallest impact score, further demonstrating its superior performance. Table 2 indicates that the impact scores vary, which can be attributed to the uncontrollable interval between the occurrence of a fault and its detection during the experiment. For instance, hardware monitoring checks for faults every 50 ms, while a fault can occur at any point within that 50 ms window, so the routing interruption time fluctuates and the impact scores vary accordingly.

For fault scenario B, the comparison in Figure 6 and Figure 7 shows that SRDT-FRR and Scheme 3 recover from faults in about 100 ms, whereas the other schemes experience routing interruption times of around 550 ms. Additionally, the SRDT-FRR scheme effectively reduces the number of primary-backup path switches, thereby lowering the switching overhead. Furthermore, the impact score results presented in Table 3 and Table 4 indicate that SRDT-FRR achieves the lowest impact score, highlighting its exceptional performance.

In fault scenario C, the comparison results in Figure 8 and Figure 9 indicate that the SRDT-FRR scheme effectively detects transient faults, recovering from faults in about 100 ms, while the other schemes experience routing interruption times exceeding 10 s. Moreover, as shown in Table 5, SRDT-FRR yields the lowest impact score, underscoring its outstanding performance. In Table 6, the p-values are all less than 0.001, indicating that there are significant differences in the routing interruption time and switching overhead between the "Proposed SRDT-FRR" and the other strategies.
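The text does not state which statistical test produced the p-values. As one plausible illustration only, the sketch below computes a Welch t statistic and its approximate degrees of freedom from the four per-experiment scores in Table 5 for the Proposed SRDT-FRR and BP-BFD rows. The choice of Welch's test is an assumption, and the helper name `welch_t` is hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of sample a
    vb = variance(b) / len(b)  # squared standard error of sample b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Per-experiment impact scores copied from Table 5 (fault scenario C).
srdt_frr = [0.052201, 0.051386, 0.050082, 0.050000]
bp_bfd = [0.105135, 0.104157, 0.100978, 0.100082]

t, df = welch_t(srdt_frr, bp_bfd)
print(f"t = {t:.1f}, df = {df:.1f}")
```

For roughly four degrees of freedom, the two-sided critical value at the 0.001 level is about 8.6, so a statistic of much larger magnitude is consistent with the p < 0.001 entries reported in Table 6.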

The simulation time for the above experiments was set to 6000 s, and the link failures in the three fault scenarios were randomly configured. Therefore, the simulation results of the different schemes are not significantly influenced by the simulation duration. To further validate the analysis from an experimental perspective, we extended the simulation time to 60,000 s and re-simulated the routing interruption times of each scheme under fault scenario A. Comparing the results shown in Figure 10 with those in Figure 4, it is evident that there is almost no difference between the two, which further supports the validity of our previous analysis.

In all fault scenarios, the SRDT-FRR scheme maintains routing interruption times around 100 ms, significantly outperforming the other schemes. Particularly in transient and mixed fault scenarios, its fault recovery efficiency is markedly superior, demonstrating high sensitivity and responsiveness to transient faults. The SRDT-FRR scheme, with its backup path activation and deactivation timer mechanisms, effectively prevents frequent switching during transient fault conditions. As a result, the number of primary-backup path switches is significantly lower than in the comparison schemes, minimizing the switching overhead.
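Why timers reduce the switch count can be seen with a generic hold-down sketch. It is not the PLR state machine of the scheme itself: it assumes a per-millisecond link-state trace, an activation threshold (consecutive down samples before moving to the backup path) and a deactivation threshold (consecutive up samples before returning to the primary path). All names and values are hypothetical.

```python
def count_switches(link_up, activation_ms, deactivation_ms):
    """Count primary/backup switches for a per-millisecond link-state trace.

    activation_ms:   consecutive 'down' samples required before using the backup path.
    deactivation_ms: consecutive 'up' samples required before returning to the primary path.
    """
    on_backup = False
    down_run = up_run = 0
    switches = 0
    for up in link_up:
        if up:
            up_run += 1
            down_run = 0
        else:
            down_run += 1
            up_run = 0
        if not on_backup and down_run >= activation_ms:
            on_backup = True
            switches += 1
        elif on_backup and up_run >= deactivation_ms:
            on_backup = False
            switches += 1
    return switches


# Hypothetical flapping link: five cycles of 20 ms down / 20 ms up, then stable.
flapping = ([False] * 20 + [True] * 20) * 5 + [True] * 200

print(count_switches(flapping, activation_ms=1, deactivation_ms=1))   # 10
print(count_switches(flapping, activation_ms=1, deactivation_ms=50))  # 2
```

Without a hold-down, every flap causes two switches. With a 50 ms deactivation threshold, the node stays on the backup path through the flapping period and returns to the primary path once, after the link has been stable.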

Whether under stable or transient faults, the SRDT-FRR scheme ensures rapid recovery, protecting business traffic from interruptions while minimizing additional network resource overhead. Its efficiency and robustness across the fault scenarios support the practicality and reliability of the solution. The experiments show that SRDT-FRR maintains a stable routing interruption time of about 100 ms across the three fault scenarios (Figure 4, Figure 5, Figure 6, Figure 7, Figure 8 and Figure 9), significantly outperforming the comparative schemes, with BP-LSRFD exhibiting 3100 ms and BP-BFD 550 ms. Furthermore, the mean impact scores of SRDT-FRR for fault scenarios A, B, and C are 0.103050, 0.067514, and 0.051667, respectively (Table 2, Table 4 and Table 6), representing a reduction of at least 80% compared to BP-LSRFD, baseline SS-FRR, and LSRFD; the margin over BP-BFD is smaller. The p-values are less than 0.001 for every comparison except BP-BFD in fault scenario A (Table 2), indicating significant differences in the routing interruption time and switching overhead between the proposed SRDT-FRR and the other strategies. The SRDT-FRR scheme's millisecond-level fault detection capability and flexible path switching strategy outperform the comparison schemes in key performance metrics such as routing interruption time, primary-backup path switching overhead, and network stability, offering an effective solution to the fast rerouting challenge in high-dynamics satellite networks.
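The relative reductions can be checked directly from the published means. The sketch below copies the Mean column of Table 2, Table 4 and Table 6 and computes, for each scenario, how much lower the SRDT-FRR mean is than each other scheme's mean. The dictionary layout and the helper `reduction` are illustrative choices, not part of the original evaluation.

```python
# Mean impact scores copied from Table 2 (A), Table 4 (B) and Table 6 (C).
MEANS = {
    "A": {"Proposed SRDT-FRR": 0.103050, "BP-LSRFD": 0.964990, "BP-BFD": 0.103485,
          "baseline SS-FRR": 0.966312, "LSRFD": 0.891602},
    "B": {"Proposed SRDT-FRR": 0.067514, "BP-LSRFD": 0.786478, "BP-BFD": 0.126048,
          "baseline SS-FRR": 0.757503, "LSRFD": 0.868314},
    "C": {"Proposed SRDT-FRR": 0.051667, "BP-LSRFD": 0.880935, "BP-BFD": 0.102088,
          "baseline SS-FRR": 0.902993, "LSRFD": 0.894010},
}


def reduction(proposed: float, other: float) -> float:
    """Fractional reduction of the proposed score relative to another scheme's score."""
    return 1 - proposed / other


for scenario, scores in MEANS.items():
    proposed = scores["Proposed SRDT-FRR"]
    for scheme, score in scores.items():
        if scheme != "Proposed SRDT-FRR":
            print(f"Scenario {scenario}: {reduction(proposed, score):6.1%} lower than {scheme}")
```

Running the loop shows reductions above 80% against BP-LSRFD, baseline SS-FRR, and LSRFD in all three scenarios, and a much smaller gap against BP-BFD, especially in scenario A.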

5. Conclusions

This paper presents a segment routing-based, dual-timer-controlled fast rerouting scheme aimed at significantly enhancing the communication stability and failure recovery efficiency of satellite networks under link failure scenarios. By integrating a joint fault detection mechanism and optimizing the backup path design, the proposed approach effectively mitigates the impact of both transient disruptions and permanent faults on the path switching overhead. Leveraging collaborative hardware and link fault detection, along with activation and deactivation timers for backup paths, SRDT-FRR demonstrates superior performance in terms of the interruption time and recovery efficiency compared to conventional methods. The simulation results validate its effectiveness, providing strong technical support for the efficient operation of satellite networks. However, the current study assumes single-link failures, which may not fully reflect the nature of real-world LEO constellations. Future work will extend SRDT-FRR to multi-link failure scenarios and to more complex faults such as node failures. We also plan to deploy a prototype on satellite platforms to evaluate energy consumption and latency under hardware constraints, bridging the gap between simulation and practical deployment, and to integrate AI techniques to optimize resource utilization and path planning, paving the way for intelligent solutions in integrated space–air–ground–sea networks.

Author Contributions

Conceptualization, J.D. and R.Z.; methodology, J.D.; software, J.H. and R.Z.; validation, J.D., R.Z. and J.H.; formal analysis, T.X.; investigation, J.D.; resources, J.D.; data curation, R.Z.; writing—original draft preparation, J.L. and R.Z.; writing—review and editing, J.L.; visualization, J.L. and R.Z.; supervision, J.D. and T.X.; project administration, J.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Young Scientists Fund of the National Natural Science Foundation of China (Grant No. 62201077).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data generated during this current study are available from the authors upon reasonable request.

Conflicts of Interest

Author Jinyan Du was employed by the company China United Network Communication Group Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

LEO	Low-Earth-Orbit
ISLs	Inter-Satellite Links
FRR	Fast Reroute
SR	Segment Routing
PLR	Point of Local Repair

References

Huber, S.; Younis, M.; Krieger, G.; Moreira, A.; Wiesbeck, W. A Reflector Antenna Concept Robust Against Feed Failures for Satellite Communications. IEEE Trans. Antennas Propag. 2015, 63, 1218–1224.
Zhao, G.; Kang, Z.; Huang, Y.; Wu, S. A Routing Optimization Method for LEO Satellite Networks with Stochastic Link Failure. Aerospace 2022, 9, 322.
Xu, H.; Shi, Z.; Liu, M.; Zhang, N.; Yan, Y.; Han, G. Link-State Aware Hybrid Routing in the Terrestrial–Satellite Integrated Network. Sensors 2022, 22, 9124.
Zhou, Y.; Chen, H.; Dou, Z. MOLM: Alleviating Congestion through Multi-Objective Simulated Annealing-Based Load Balancing Routing in LEO Satellite Networks. Future Internet 2024, 16, 109.
Miao, J.; Wang, P.; Yin, H.; Chen, N.; Wang, X. A Multi-Attribute Decision Handover Scheme for LEO Mobile Satellite Networks. In Proceedings of the 2019 IEEE 5th International Conference on Computer and Communications (ICCC), Chengdu, China, 6–9 December 2019; pp. 938–942.
Lai, X.; Zhao, Y.; Jing, Y.; Wang, H.; Wang, W.; Zhang, J. Fast Routing Algorithm Based on Topology Pruning in Mega Satellite Optical Networks. In Proceedings of the 2023 21st International Conference on Optical Communications and Networks (ICOCN), Qufu, China, 31 July 2023; pp. 1–3.
Chen, X.; Chen, Z.; Chang, X.; Ji, T.; Wu, Z.; Li, C. Fast Reroute Algorithms for Satellite Network With Segment Routing. IEEE Access 2023, 11, 133509–133520.
Zhang, X.; Cheng, Z.; Lin, R.; He, L.; Yu, S.; Luo, H. Local Fast Reroute With Flow Aggregation in Software Defined Networks. IEEE Commun. Lett. 2017, 21, 785–788.
Papan, J.; Segec, P.; Dobrota, J.; Koncz, L.; Yeremenko, O.; Lemeshko, O.; Yevdokymenko, M. Review of Fast ReRoute Solutions. In Proceedings of the 2020 18th International Conference on Emerging eLearning Technologies and Applications (ICETA), Košice, Slovakia, 12 November 2020; pp. 498–504.
Chiesa, M.; Sedar, R.; Antichi, G.; Borokhovich, M.; Kamisinski, A.; Nikolaidis, G.; Schmid, S. Fast ReRoute on Programmable Switches. IEEE ACM Trans. Netw. 2021, 29, 637–650.
Lemeshko, O.; Arous, K. Fast ReRoute Model for Different Backup Schemes in MPLS-Network. In Proceedings of the 2014 First International Scientific-Practical Conference Problems of Infocommunications Science and Technology, Kharkov, Ukraine, 14–17 October 2014; pp. 39–41.
Xu, W.; Li, X.; Huang, S. Shared Backup Path Protection for Satellite Ground Laser Link. In Proceedings of the International Conference on Frontiers of Electronics, Information and Computation Technologies, Changsha, China, 21 May 2021; pp. 1–6.
Pan, T.; Huang, T.; Li, X.; Chen, Y.; Xue, W.; Liu, Y. OPSPF: Orbit Prediction Shortest Path First Routing for Resilient LEO Satellite Networks. In Proceedings of the ICC 2019—2019 IEEE International Conference on Communications (ICC), Shanghai, China, 9–13 May 2019; pp. 1–6.
Atlas, A.; Zinin, A. Basic Specification for IP Fast Reroute: Loop-Free Alternates. In Document RFC 5286. 2008. Available online: https://www.rfc-editor.org/rfc/rfc5286.html (accessed on 1 March 2025).
Hegde, S.; Bowers, C.; Gredler, H.; Litkowski, S. Remote-LFA Node Protection and Manageability. In Document RFC 8102. 2017. Available online: https://www.rfc-editor.org/rfc/rfc8102.html (accessed on 1 March 2025).
Suzuki, K. An Efficient Calculation for TI-LFA Rerouting Path. IEICE Trans. Commun. 2022, 105, 196–204.
Zhang, S.; Li, X.; Yeung, K.L. Segment Routing for Traffic Engineering and Effective Recovery in Low-Earth Orbit Satellite Constellations. Digit. Commun. Netw. 2024, 10, 706–715.
Zhao, Y.; Wu, W.; Ning, X.; Tang, Y.; Li, S.; Qin, C.; Liu, J. STSR: A Satellite-Tailored Segment Routing Method for Efficient Space Communication. In Proceedings of the 2024 IEEE Wireless Communications and Networking Conference (WCNC), Dubai, United Arab Emirates, 21 April 2024; pp. 1–6.
Bai, W.; Yang, H.; Tong, J.; Qin, Z.; Lyu, R. SVector Segment Routing for Large-Scale Multilayer Satellite Network. J. Commun. Inf. Netw. 2023, 8, 24–36.
Liu, Q.; Wang, D.; Pan, C. Rerouting strategy for satellite MPLS networks based on priority mechanisms. Comput. Simul. 2015, 32, 57–61.
Filsfils, C.; Previdi, S.; Ginsberg, L.; Decraene, B.; Litkowski, S.; Shakir, R. Segment Routing Architecture. In Document RFC 8402. 2018. Available online: https://www.rfc-editor.org/rfc/rfc8402 (accessed on 1 March 2025).
Yan, F.; Luo, H.; Zhang, S.; Wang, Z.; Lian, P. A comparative study of IP-based and ICN-based link-state routing protocols in LEO satellite networks. Peer-to-Peer Netw. Appl. 2023, 16, 3032–3046.
Zhang, Z.; Zhao, K.; Li, W.; Fang, Y. Fast Recovery from Multiple Link Failures in LEO Satellite Networks. In Proceedings of the 2023 IEEE 34th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), Toronto, ON, Canada, 5 September 2023; pp. 1–6.
Zhou, R.; Zhang, Q.; Tao, Y.; Shen, Y.; Zhang, W.; Tian, F.; Tian, Q.; Qian, J. Intelligent Multipath Backup Ant Colony Routing Algorithm of Satellite Network Based on Betweenness Centrality. In Proceedings of the 2020 IEEE Computing, Communications and IoT Applications (ComComAp), Beijing, China, 20 December 2020; pp. 1–5.

Figure 1. Satellite node module division.

Figure 2. Backup path diagram.

Figure 3. The PLR node state transition diagram.

Figure 4. Routing interruption time results for fault scenario A.

Figure 5. Primary-backup path switching results for fault scenario A.

Figure 6. Routing interruption time results for fault scenario B.

Figure 7. Primary-backup path switching results for fault scenario B.

Figure 8. Routing interruption time results for fault scenario C.

Figure 9. Primary-backup path switching results for fault scenario C.

Figure 10. Routing interruption time results for fault scenario A under 60,000 s.

Table 1. The impact scores for fault scenario A.

FRR Methods	Experiment 1	Experiment 2	Experiment 3	Experiment 4
Proposed SRDT-FRR	0.102033	0.109003	0.101162	0.100000
BP-LSRFD	0.969797	0.974443	0.944821	0.970958
BP-BFD	0.103485	0.109003	0.102904	0.102033
baseline SS-FRR	0.963117	0.969797	0.965150	0.967183
LSRFD	0.884608	0.890707	0.888093	0.900000

Table 2. Statistical significance of impact scores for fault scenario A.

FRR Methods	Mean	Standard Deviation	p-Value with Proposed SRDT-FRR
Proposed SRDT-FRR	0.103050	0.003802	/
BP-LSRFD	0.964990	0.012056	<0.001
BP-BFD	0.103485	0.109003	>0.001
baseline SS-FRR	0.966312	0.002216	<0.001
LSRFD	0.891602	0.005743	<0.001

Table 3. The impact scores for fault scenario B.

FRR Methods	Experiment 1	Experiment 2	Experiment 3	Experiment 4
Proposed SRDT-FRR	0.044111	0.125149	0.066666	0.035129
BP-LSRFD	0.763473	0.799401	0.806586	0.774251
BP-BFD	0.119760	0.134131	0.135928	0.114371
baseline SS-FRR	0.761676	0.78502	0.788622	0.752694
LSRFD	0.817365	0.892814	0.9	0.860479

Table 4. Statistical significance of impact scores for fault scenario B.

FRR Methods	Mean	Standard Deviation	p-Value with Proposed SRDT-FRR
Proposed SRDT-FRR	0.067514	0.037451	/
BP-LSRFD	0.786478	0.011440	<0.001
BP-BFD	0.126048	0.009074	<0.001
baseline SS-FRR	0.757503	0.014663	<0.001
LSRFD	0.868314	0.033472	<0.001

Table 5. The impact scores for fault scenario C.

FRR Methods	Experiment 1	Experiment 2	Experiment 3	Experiment 4
Proposed SRDT-FRR	0.052201	0.051386	0.050082	0.050000
BP-LSRFD	0.910058	0.810284	0.892125	0.909243
BP-BFD	0.105135	0.104157	0.100978	0.100082
baseline SS-FRR	0.901417	0.910221	0.906797	0.903618
LSRFD	0.884349	0.900000	0.891359	0.896332

Table 6. Statistical significance of impact scores for fault scenario C.

FRR Methods	Mean	Standard Deviation	p-Value with Proposed SRDT-FRR
Proposed SRDT-FRR	0.051667	0.001188	/
BP-LSRFD	0.880935	0.039263	<0.001
BP-BFD	0.102088	0.001720	<0.001
baseline SS-FRR	0.902993	0.003356	<0.001
LSRFD	0.894010	0.007390	<0.001

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
