1. Introduction
Data relay satellites (DRSs) typically operate in geostationary orbit, providing broad and stable coverage through their antennas to offer long-term and reliable data offloading services to low-Earth orbit (LEO) satellites [1,2,3]. This improves the speed of remote sensing data collection and transmission [4]. However, due to the growing demand for reliable and continuous communication across industries such as telecommunications, defense, and emergency services, the scale of the DRS scheduling problem has increased exponentially [5,6]. This calls for better management of DRSs and the development of faster, more flexible scheduling algorithms [1,7,8].
DRSs in China are primarily managed by a centralized ground control center. Users submit their DRS usage requirements in advance to the control center [9,10]. The control center then allocates satellite resources, subject to usage constraints, to generate a usage plan, which is executed by the DRS through the telemetry and control link [11]. The plans generated in this mode are based on offline predictions of in-orbit resources. In practice, however, due to unpredictable risks such as space debris, solar radiation, and other complex space environmental factors [12,13], both remote sensing satellites (RSSs) and DRSs face failures that cannot be predicted. As a result, the satellite control center must adjust the original plan in response to unexpected changes, forming a dynamic transmission plan to improve the resilience of the system [14].
In previous studies on dynamic DRS scheduling, Deng et al. [15] proposed a two-stage, multi-constraint scheduling model for data relay satellite networks (DRSNs). Li et al. [16] developed a unified modeling framework that incorporates multiple types of system disturbances, making it applicable to both dynamic and static scheduling scenarios. Zhao et al. [17] formulated the DRS scheduling problem as a constraint satisfaction problem with multiple constraints. Chen et al. [18] modeled the problem as a Markov decision process, while Rojanasoonthon et al. [19] treated it as a mixed-integer programming problem. However, most of these studies focus solely on the utilization constraints of the DRS. In practice, due to antenna pointing losses and free-space path losses, transmission links must be established within the maximum communication cone of the DRS [20,21]. Furthermore, considering the limited transmission power of RSSs, incorporating link budget constraints into DRS scheduling models is both necessary and realistic [22,23].
Dynamic DRS scheduling refers to the process of replanning and adjusting the task transmission schedule of the DRS when the original schedule is affected by resource changes or task disturbances, in order to maintain the overall optimality or near-optimality of the scheduling scheme in a dynamic environment. Regarding scheduling algorithms, Zhai et al. [24] used a rule-based heuristic algorithm to solve the dynamic robust DRS scheduling problem, He et al. [25] applied a stochastic optimization framework to a hybrid dynamic DRS task planning problem, and Dai et al. [26] proposed an adaptive large neighborhood search algorithm with deadline awareness for the dynamic relaxation scheduling problem of the DRS. Luo et al. [27] introduced a flexible scheduling mode for the DRS, and Chen et al. [28] used a breakpoint-resume method to improve the efficiency of DRS resource scheduling. Li et al. [29] applied an end-to-end reinforcement learning approach to the DRS scheduling problem. These studies developed algorithms for traditional DRS control modes, in which the different parts of an algorithm are tightly coupled, making them suitable only for centralized operation on a single machine.
With the development of hardware and software technologies for DRSs, satellites now possess autonomous computation capabilities [1,30,31]. The traditional offline control mode of DRS control centers has severely limited the performance of DRSs. Satellites can now autonomously perceive their environment in real time and, compared with plans generated from predicted resource information, can respond to risks more promptly and accurately. Autonomous collaborative planning of DRSs in orbit is the future trend of DRS management [32,33].
The contract net protocol (CNP), as a representative distributed decision-making architecture, exhibits strong scalability, autonomy, and coordination capabilities in dynamic environments [34,35]. In recent years, it has been widely applied to space mission scheduling problems. Xiang et al. [36] introduced a hierarchical disturbance-tolerant CNP for distributed satellite TT&C tasks. Du et al. [37] applied the CNP to the autonomous coordination of RSSs. Yang et al. [32] proposed a multi-agent CNP-based planning framework for imaging missions, while Liu et al. [38] developed a bottom-up CNP structure to coordinate heterogeneous observation resources. Yang et al. [39] also designed a priority-aware variant of the CNP for large-scale constellation mission planning, and Wang et al. [40] combined ant colony optimization with the CNP to address mobile edge computing in space information networks. These studies collectively demonstrate the potential of the CNP in distributed space mission planning. However, most existing work focuses on specific task types or static environments, with limited attention to large-scale dynamic DRS scheduling under task and resource disruptions.
In light of these challenges, this paper focuses on the dynamic scheduling problem of the DRS and establishes a dynamic scheduling model that incorporates realistic link budget constraints. To solve this problem, we propose an ensemble heuristic-based adaptive contract net scheduling framework. By introducing a multi-round negotiation mechanism and dynamic strategy adjustment, the framework enables the rapid adaptation of scheduling plans, thereby enhancing the robustness and efficiency of DRS operations under system disturbances. The main contributions of this study are as follows:
- (1) A dynamic DRS scheduling model is established that considers link budget and resource usage constraints, maximizing the benefit of the dynamic adjustment scheme while minimizing disruptions to the original schedule.
- (2) To solve this problem, an ensemble heuristic adaptive contract net protocol (EH-ACNP) is proposed. The algorithm coordinates task allocation through multiple rounds of combinatorial auctions, quickly generating a rescheduling plan in disturbance scenarios.
- (3) Compared with state-of-the-art dynamic DRS scheduling algorithms, the proposed method demonstrates excellent performance in experiments involving task disruptions, resource failures, and other scenarios. Sensitivity analysis further confirms its robustness.
The remainder of this paper is organized as follows: Section 2 describes and models the dynamic DRS scheduling problem, Section 3 provides a detailed introduction to the proposed EH-ACNP, Section 4 presents the simulation results, and Section 5 summarizes the main research findings.
2. Problem Description and Mathematical Model
This section describes the dynamic DRS scheduling problem, establishes model assumptions, and develops both the potential link model and the dynamic resource scheduling model for the DRS. The symbols and variables defined in this section are listed in Table 1.
2.1. Problem Description
As shown in Figure 1, the DRS plays a crucial role in satellite networks by handling essential data transmission and relay-forwarding tasks, ensuring stable communication between ground stations and target satellites such as LEO RSSs and scientific research satellites [41]. However, due to the complex and dynamic in-orbit environment, the system may face uncertainties such as resource disturbances and task disturbances, which can render the original scheduling plan invalid, thereby affecting data transmission stability and task completion rates [36].
Resource disturbances refer to variations in DRS resources, such as link availability, power supply, and storage capacity, caused by unexpected failures or fluctuations in consumption [16]. Task disturbances refer to changes in task parameters such as service time, priority, and data volume due to shifting demands, as well as the emergence of urgent, unplanned tasks [36].
The objective of the dynamic resource scheduling problem for the DRS is to promptly adjust scheduling plans in response to these uncertainties to maximize task fulfillment. In practical applications, significantly modifying the original plan can negatively impact ongoing or upcoming tasks. Therefore, to enhance system stability, dynamic scheduling for DRS should aim to minimize drastic changes to the original schedule while balancing scheduling stability and resource utilization.
In this paper, we aim to maximize task benefits while minimizing scheduling disruptions. Considering engineering constraints such as task service windows, available connection time windows, task execution frequency, and antenna switching time, we establish a dynamic resource scheduling model for DRS and design algorithms for its solution. To simplify the problem without compromising its real-world representation, we introduce the following three model assumptions:
- (1) DRSs employ a single-access technology, meaning they can serve only one user at a time. Additionally, at any given moment, a user satellite can connect to only one DRS and cannot establish simultaneous connections with multiple DRSs [29].
- (2) User satellites do not transmit data directly to the ground via ground stations; instead, they rely solely on DRSs for data forwarding.
- (3) Energy and DRS memory limitations are not considered. The DRSs are assumed to be in geostationary orbit, where continuous sunlight provides a sufficient power supply. Inter-satellite links between relay satellites are stable and reliable, and the inter-satellite topology is relatively fixed. It is also assumed that DRSs primarily forward data rather than storing them for extended periods, enabling real-time data transmission to ground stations [42].
2.2. Potential Connection Link Model
Figure 2 illustrates the geometric relationship between the RSS and the DRS. Let $t$ represent a specific moment in the scheduling cycle, and let $c_{lr}(t)$ denote the connectivity variable between RSS $l$ and DRS $r$: when they are connected at time $t$, $c_{lr}(t)=1$; otherwise, $c_{lr}(t)=0$. The following three constraints apply:
(1) Elevation Angle Constraint: Due to the limited beamwidth of the DRS's antenna, if the relative elevation angle between the RSS and the DRS is too low, the link may fall outside the coverage range of the DRS's antenna [43]. Let $c^{e}_{lr}(t)$ represent the elevation-angle constraint logic variable. When the elevation-angle condition is satisfied, $c^{e}_{lr}(t)=1$; this can be described as follows [23,43]:

$$c^{e}_{lr}(t)=\begin{cases}1, & \alpha_{lr}(t)\ge \alpha_{\min}\\ 0, & \text{otherwise}\end{cases}$$

In the above equation, $\alpha_{lr}(t)$ represents the relative elevation angle between the RSS and the DRS at time $t$, and $\alpha_{\min}$ is the minimum communication elevation angle. When a connection is possible, the elevation angle should not be smaller than the minimum communication elevation angle. In the figure, by applying the law of sines, the following expression can be obtained:

$$\frac{R_E+H}{\sin\bigl(90^{\circ}+\alpha_{lr}(t)\bigr)}=\frac{d_{lr}(t)}{\sin\theta_{lr}(t)}$$

Here, $R_E$ is the Earth's radius, $H$ is the DRS's orbital altitude, $\theta_{lr}(t)$ is the geocentric angle between the satellites, and $d_{lr}(t)$ is the inter-satellite distance. From this expression, the elevation angle between the RSS and the DRS can be calculated as

$$\alpha_{lr}(t)=\arccos\!\left(\frac{(R_E+H)\,\sin\theta_{lr}(t)}{d_{lr}(t)}\right)$$

In the above equation, the inter-satellite distance $d_{lr}(t)$ can be calculated using the law of cosines, and the geocentric angle $\theta_{lr}(t)$ can be obtained from the satellites' longitudes and latitudes.
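To make the geometry concrete, the elevation-angle relation above can be sketched numerically. The following Python snippet is a minimal sketch under assumed parameters (a GEO DRS, a 500 km LEO RSS, and a spherical Earth); the function and variable names are illustrative, not the paper's notation:

```python
import math

R_E = 6371.0      # Earth's mean radius, km (assumed spherical)
H_DRS = 35786.0   # GEO altitude of the DRS, km
H_RSS = 500.0     # assumed LEO altitude of the RSS, km

def intersat_distance(theta):
    """Inter-satellite distance d from the law of cosines,
    given the geocentric angle theta (radians)."""
    r1, r2 = R_E + H_RSS, R_E + H_DRS
    return math.sqrt(r1 ** 2 + r2 ** 2 - 2.0 * r1 * r2 * math.cos(theta))

def elevation_angle(theta):
    """Relative elevation angle alpha (radians) from the law of sines:
    (R_E + H) / sin(90 deg + alpha) = d / sin(theta),
    hence cos(alpha) = (R_E + H) * sin(theta) / d."""
    d = intersat_distance(theta)
    c = (R_E + H_DRS) * math.sin(theta) / d
    return math.acos(max(-1.0, min(1.0, c)))  # clamp for numerical safety

def elevation_ok(theta, alpha_min_deg=10.0):
    """Elevation-angle constraint logic variable: true iff alpha >= alpha_min."""
    return elevation_angle(theta) >= math.radians(alpha_min_deg)
```

At a geocentric angle of zero the RSS sits directly below the DRS and the elevation is 90 degrees; the angle falls off as the geocentric separation grows.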
(2) Visibility Constraint: The communication link between the RSS and the DRS must not be obstructed by the Earth, as the signal would otherwise be blocked. The maximum link distance occurs when the line connecting the two satellites is tangent to the Earth. Let $c^{v}_{lr}(t)$ represent the visibility relationship logic variable, which can be described as follows [23,44]:

$$c^{v}_{lr}(t)=\begin{cases}1, & d_{lr}(t)\le d_{\max}\\ 0, & \text{otherwise}\end{cases}$$

where $d_{\max}$ is the tangent distance at which the line of sight grazes the Earth's surface.
(3) Transmission Power Constraint: Due to free-space path loss, wireless signals attenuate during transmission through the space link. When the RSS and the DRS establish a communication link, the transmitted power, after space attenuation, must be sufficient to meet the minimum demodulation threshold at the receiver, considering the antenna gain. In other words, the antenna's operating power should be less than the maximum power $P_{\max}$. Let $c^{p}_{lr}(t)$ represent the power constraint logic variable, which can be described as follows:

$$c^{p}_{lr}(t)=\begin{cases}1, & P_{lr}(t)\le P_{\max}\\ 0, & \text{otherwise}\end{cases}$$

In the above equation, $P_{lr}(t)$ represents the transmission power from $l$ to $r$ at time $t$, described in decibels, and can be expressed as follows [44]:

$$P_{lr}(t)=S_{r}-G_{r}-G_{l}+L_{f}(t)$$

Here, $S_{r}$ (dB) and $G_{r}$ (dB) represent the receiver sensitivity and gain of the DRS, $G_{l}$ (dB) is the transmission gain of the RSS, and $L_{f}(t)$ is the free-space loss, expressed in terms of the communication frequency and inter-satellite distance.
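The power constraint can be illustrated with a toy link-budget check. This sketch assumes the standard free-space path-loss formula in dB (distance in km, frequency in GHz) and illustrative Ka-band values; all names and numbers are hypothetical:

```python
import math

def free_space_loss_db(d_km, f_ghz):
    """Free-space path loss in dB: 92.45 + 20*log10(d_km) + 20*log10(f_ghz)."""
    return 92.45 + 20.0 * math.log10(d_km) + 20.0 * math.log10(f_ghz)

def required_tx_power_dbw(sensitivity_dbw, g_rx_db, g_tx_db, d_km, f_ghz):
    """Minimum transmit power so the received signal meets the DRS
    receiver sensitivity after path loss, minus the two antenna gains."""
    return sensitivity_dbw - g_rx_db - g_tx_db + free_space_loss_db(d_km, f_ghz)

def power_ok(p_max_dbw, **link):
    """Power-constraint logic variable: link feasible iff the required
    transmit power does not exceed the maximum available power."""
    return required_tx_power_dbw(**link) <= p_max_dbw
```

For example, at roughly 40,000 km and 26 GHz the path loss is about 213 dB, so feasibility hinges on the combined antenna gains and receiver sensitivity.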
By employing interval sampling to evaluate the connectivity constraints at each simulation time $t$, the connection windows between the RSS and the DRS can be determined.
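The interval-sampling step can be sketched as follows: given boolean connectivity flags at sampled instants (e.g., the conjunction of the three logic variables above), consecutive connected samples are merged into connection windows. This is a minimal sketch with an assumed data layout, not the paper's implementation:

```python
def connection_windows(times, connected, min_len=0.0):
    """Merge sampled connectivity flags into (start, end) windows.
    times: sorted sample instants; connected: boolean flag per instant;
    min_len: discard windows shorter than this duration."""
    windows, start, prev = [], None, None
    for t, c in zip(times, connected):
        if c and start is None:
            start = t                      # window opens at first connected sample
        elif not c and start is not None:
            windows.append((start, prev))  # window closes at last connected sample
            start = None
        prev = t
    if start is not None:
        windows.append((start, times[-1]))
    return [(a, b) for a, b in windows if b - a >= min_len]
```

A finer sampling step tightens the window boundaries at the cost of more constraint evaluations.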
2.3. Mathematical Programming Model
The DRS system consists of multiple DRSs, several user spacecraft, and multiple data transmission tasks. The dynamic resource scheduling problem for the DRS is an optimization problem with multiple complex constraints. To establish the dynamic scheduling model for the DRS, the relevant objects are symbolized; the related symbols are listed in Table 1.
In terms of decision variables, the model defines three: $x_{irq}$, $ts_i$, and $te_i$. $x_{irq}$ is a binary variable that indicates the execution window assigned to task $i$: if task $i$ is assigned to the $q$-th execution window of its user satellite on antenna $r$, then $x_{irq}=1$; otherwise, $x_{irq}=0$. $ts_i$ and $te_i$ are continuous variables representing the start and end times of task $i$'s execution, respectively.
We aim to maximize task completion benefits and minimize the degree of disruption in the scheduling plan. The objective function $f_1$ represents the total benefit from task completion, while $f_2$ is the dynamic disruption measure, indicating the difference between the dynamic adjustment plan and the original plan:

$$f_1=\sum_{i}w_i\sum_{r}\sum_{q}x_{irq},\qquad f_2=\lambda_1 N_{rm}+\lambda_2 N_{adj}$$

Here, $w_i$ represents the priority of task $i$, $N_{rm}$ represents the total number of tasks removed from the original plan, and $N_{adj}$ represents the total number of tasks adjusted in the dynamic plan compared to the original plan. $\lambda_1$ and $\lambda_2$ are the corresponding weights for these two components. The overall objective function is denoted as $f$, with objective coefficients $\omega_1$ and $\omega_2$:

$$\max f=\omega_1 f_1-\omega_2 f_2$$
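As a toy illustration of this bi-objective trade-off, the following sketch evaluates a candidate schedule against the original plan: it sums the priorities of scheduled tasks, counts removed and shifted tasks, and combines the two terms. The data layout, parameter names, and coefficient values are assumptions for illustration only:

```python
def objective(scheduled, original, weights,
              w1=1.0, w2=1.0, lam_rm=1.0, lam_adj=0.5):
    """Evaluate benefit minus disruption for a candidate schedule.
    scheduled/original: dicts task_id -> (start, end); weights: task_id -> priority."""
    benefit = sum(weights[i] for i in scheduled)                 # total completion benefit
    removed = [i for i in original if i not in scheduled]        # dropped from original plan
    adjusted = [i for i in original
                if i in scheduled and scheduled[i] != original[i]]  # kept but moved
    disruption = lam_rm * len(removed) + lam_adj * len(adjusted)
    return w1 * benefit - w2 * disruption
```

Increasing the disruption weights pushes the optimizer toward plans that preserve the original schedule, at the expense of raw benefit.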
The model’s relevant constraints mainly include task uniqueness constraints, service duration constraints, service time window constraints, switching time window constraints, and available time window constraints, as detailed below:
(1) Task Uniqueness Constraint: Each task should be transmitted through, at most, one time slot of a single DRS to avoid duplicate transmissions and prevent resource wastage.
(2) Service Duration Constraint: Each task has a different data size and requires a transmission time slot long enough to accommodate its data.
(3) Service Time Window Constraint: Each task must be executed within its specified transmission time window. If a task is not transmitted within this limit, it becomes invalid and no longer holds transmission significance.
(4) Switching Time Constraint: A certain amount of time should be reserved between two adjacent transmission tasks executed by a DRS to allow for antenna alignment adjustments, i.e., the antenna switching time, which comprises the antenna alignment time and the reset time of the DRS.
(5) Available Time Window Constraint: Tasks should be executed within the connectivity time window of the DRS and the user satellite. Users whose connectivity time window is not satisfied should not be assigned transmission time slots.
(6) Variable Range Constraints: The decision variables are restricted to their valid domains: the assignment variable is binary, and the start and end times must lie within the scheduling horizon.
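Two of these constraints, the switching-time and available-time-window constraints, lend themselves to simple feasibility checks. The sketch below uses assumed timing values and data layouts for illustration; it is not the paper's model implementation:

```python
def feasible_on_antenna(tasks, align_time=60.0, reset_time=30.0):
    """Switching-time constraint on one antenna: between two consecutive
    tasks there must be at least align_time + reset_time seconds.
    tasks: list of (start, end) tuples, non-preemptive."""
    gap = align_time + reset_time
    tasks = sorted(tasks)
    return all(nxt[0] - cur[1] >= gap for cur, nxt in zip(tasks, tasks[1:]))

def within_window(start, end, windows):
    """Available-time-window constraint: the execution interval
    [start, end] must lie entirely inside one connectivity window."""
    return any(ws <= start and end <= we for ws, we in windows)
```

Checks like these are evaluated for every candidate assignment before it enters the schedule, so keeping them constant-time per task pair matters for the overall runtime.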
3. Algorithm Design
This section provides a detailed description of the EH-ACNP algorithm and analyzes its time complexity.
3.1. Algorithm Framework
To address the dynamic DRS scheduling problem, we propose the EH-ACNP method. The workflow of the EH-ACNP is shown in Figure 3; it is implemented using a multi-process concurrent architecture. The EH-ACNP takes as input both task-related information, including the set of user RSSs and the set of tasks, and resource-related information, including the set of DRSs, the set of antennas, and the available time windows for each task.
The EH-ACNP begins by initializing a process pool consisting of one main process and multiple subprocesses, each corresponding to an antenna. It also initializes a task pool managed by the main process, a resource pool managed by each subprocess for its assigned antenna, and an operator pool with initial weights assigned to each operator.
In the EH-ACNP, the main process acts as the centralized tenderer, while the subprocesses act as bidders. The tenderer applies multiple heuristic rules, referred to as bidding operators, to sort the tasks. The tenderer sends bidding requests for each task in the task pool to the bidders. Each bidder uses its bidding algorithm to plan an execution scheme and returns a bid price to the tenderer. The tenderer assigns the task to the bidder offering the highest bid. Once all tasks have been bid on, the tenderer consolidates the plans from all bidders into an overall scheduling scheme, evaluates the improvement of the solution, assigns scores to the operators, and updates their weights accordingly.
After each optimization round, if the solution has not converged, the tenderer selects a disposal operator from the operator pool. Each bidder then removes a portion of the tasks from its local schedule, returns them to the task pool, and updates its resource state.
The algorithm employs an adaptive operator adjustment strategy to manage the operator pool. The convergence criterion is met if either the objective function shows no improvement in two consecutive iterations or the algorithm reaches the predefined number of iterations. When either condition is satisfied, the algorithm terminates.
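The tenderer–bidder interaction described above can be sketched as a sequential, single-process toy (the paper's implementation is multi-process); the bidder interface and the capacity-based pricing rule here are simplified assumptions:

```python
class AntennaBidder:
    """Toy bidder: one antenna with a shrinking free-capacity budget.
    Bids the task weight if the task still fits, otherwise declines."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.assigned = []

    def bid(self, task):
        return task["weight"] if task["duration"] <= self.capacity else None

    def award(self, task):
        self.assigned.append(task["id"])
        self.capacity -= task["duration"]

def auction_round(tasks, bidders, sort_key):
    """One contract-net round: the tenderer sorts tasks with a bidding
    operator (sort_key), collects bids, and awards each task to the
    highest bidder; tasks with no bid return to the task pool."""
    unassigned = []
    for task in sorted(tasks, key=sort_key, reverse=True):
        bids = [(b.bid(task), b) for b in bidders]
        bids = [(price, b) for price, b in bids if price is not None]
        if bids:
            _, winner = max(bids, key=lambda pb: pb[0])
            winner.award(task)
        else:
            unassigned.append(task)
    return unassigned
```

In the actual framework each bidder would run in its own process and price a full execution plan rather than a scalar capacity check, but the award-to-highest-bid flow is the same.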
3.2. Bidding Operator Pool
The EH-ACNP method designs four bidding operators for sorting the bidding sequence: the maximum weight bidding operator, the minimum conflict degree bidding operator, the efficiency-first bidding operator, and the random bidding operator. This section introduces each of these four operators.
(1) Maximum Weight Bidding Operator: The maximum weight bidding operator refers to the process where the main process sorts the tasks in the task pool in descending order based on task weight, and then sequentially bids the sorted tasks to the child processes. This operator is a greedy operator that prioritizes scheduling tasks with higher weights, which helps achieve the goal of maximizing the scheduling task’s benefit.
(2) Minimum Conflict Degree Bidding Operator: The conflict degree evaluates how much a task conflicts with other tasks. Following the definition in [28,45], as shown in Figure 4, $w_1$ and $w_2$ represent overlapping available transmission time windows, where $es_1$ and $es_2$ are their earliest start times and $le_1$ and $le_2$ are their latest end times. The colored dashed lines indicate the start times of transmission activities. The conflict degree is defined as the ratio of the shaded area to the area of the closed rectangle ABCD. This represents the conflict degree of a task within a single available time window; the task's overall conflict degree is defined as the average conflict degree across all of its available windows.
The Minimum Conflict Degree First Bidding Operator prioritizes bidding the tasks with the least conflict degree to the child processes. On one hand, tasks with lower conflict degrees have less impact on subsequent tasks, and on the other hand, this increases the chances of successfully scheduling the current task.
(3) Efficiency First Bidding Operator: The efficiency $e$ is defined as the ratio of the task's weight to the length of the time slot it occupies. It measures the contribution to the overall objective per unit of time-slot length, reflecting marginal utility. With $w_i$ the weight of task $i$ and $d_i$ its required slot length, the efficiency is calculated as follows:

$$e_i=\frac{w_i}{d_i}$$

The efficiency-first bidding operator sorts the tasks in the task pool in descending order of efficiency and bids higher-efficiency tasks to the child processes first. This operator fully exploits the utility of time slots by prioritizing tasks with higher marginal utility, aiming to greedily maximize the scheduling benefit.
(4) Random Bidding Operator: The random bidding operator refers to the main process shuffling the tasks in the task pool in random order and then bidding them to the child processes according to this random sequence. This operator is a random search operator that introduces more uncertainty into the solution, which is beneficial for exploring the solution space.
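Each of the four bidding operators reduces to a sort rule over the task pool. A minimal sketch, assuming each task carries weight, slot-length, and conflict-degree fields (the field names are illustrative):

```python
import random

def max_weight_order(tasks):
    """Maximum-weight bidding operator: descending task weight."""
    return sorted(tasks, key=lambda t: t["weight"], reverse=True)

def min_conflict_order(tasks):
    """Minimum-conflict-degree operator: ascending conflict degree."""
    return sorted(tasks, key=lambda t: t["conflict"])

def efficiency_order(tasks):
    """Efficiency-first operator: descending weight per unit slot length."""
    return sorted(tasks, key=lambda t: t["weight"] / t["duration"], reverse=True)

def random_order(tasks, rng=random.Random(42)):
    """Random operator: shuffle the pool to diversify the search."""
    tasks = list(tasks)
    rng.shuffle(tasks)
    return tasks
```

Because the operators only reorder the pool, swapping one for another between rounds changes the search trajectory without touching the bidding logic itself.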
3.3. Disposal Operator Pool
The EH-ACNP algorithm designs four disposal operators for sorting the task sequence: the minimum weight disposal operator, the maximum conflict degree disposal operator, the efficiency-first disposal operator, and the random disposal operator. This section introduces each of these four operators.
(1) Minimum Weight Disposal Operator: With the minimum weight disposal operator, each child process sorts the tasks assigned to it by weight and deletes the tasks with lower weights first. These tasks are returned to the main process's task pool, and their resource usage is released. This operator temporarily removes lower-weight tasks from the solution to increase the chance of scheduling higher-weight tasks.
(2) Maximum Conflict Degree Disposal Operator: The maximum conflict degree disposal operator uses the same definition of conflict degree as described earlier. When this operator is selected, each child process deletes the task with the highest conflict degree, returns it to the main process’s task pool, and releases the resource usage.
(3) Efficiency First Disposal Operator: The efficiency first disposal operator uses the same efficiency definition as in the previous formula. It prioritizes deleting tasks with lower marginal utility to improve the weight contribution of the time slots. This operator helps increase the overall solution benefit.
(4) Random Disposal Operator: The random disposal operator means that each child process randomly deletes tasks assigned to it, in order to enhance the diversity of the solution. This method increases exploration and helps explore the solution space.
3.4. Bidding Algorithm
The bidding algorithm refers to the process in which each subprocess, upon receiving a task bidding request from the main process, develops a feasible execution plan for the task and provides the corresponding bid price. The pseudocode of the bidding algorithm is shown in Algorithm 1. Steps 1–2 perform a preliminary validity check of the task; Steps 4–5 verify the availability of the current time window; Steps 7–9 remove the time intervals in the window that are already occupied to obtain the executable time slot set; and Steps 10–17 generate the task execution plan while satisfying constraints such as antenna alignment and adjustment time windows. A schematic of the bidding algorithm is presented in Figure 5. Once the scheduling plan is obtained, the bid price is calculated according to Equation (25).
Algorithm 1: Bidding algorithm (pseudocode presented as a figure in the original).
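The slot-extraction step (Steps 7–9, delegated to Function FSWA in Algorithm 2) can be sketched as subtracting occupied intervals from an availability window. The exact semantics of the paper's pseudocode are not reproduced here, so this is an assumed interpretation:

```python
def free_slots(window, occupied):
    """Subtract occupied intervals from an availability window to obtain
    the executable slot set. window: (start, end); occupied: list of
    (start, end) intervals, possibly outside the window."""
    ws, we = window
    slots, cursor = [], ws
    for s, e in sorted(occupied):
        s, e = max(s, ws), min(e, we)   # clip the busy interval to the window
        if e <= cursor:
            continue                    # entirely before the sweep cursor
        if s > cursor:
            slots.append((cursor, s))   # free gap before this busy interval
        cursor = max(cursor, e)
    if cursor < we:
        slots.append((cursor, we))      # trailing free gap
    return slots
```

Each resulting slot would then be tested against the task's service duration and the switching-time margins before a plan is offered as a bid.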
3.5. Adaptive Operator Adjustment
The adaptive operator adjustment rule scores operators based on their performance, indirectly adjusting their weights. Specifically, if an operator results in the solution reaching the historical optimum during an iteration, it indicates good performance, and a higher score is rewarded. Conversely, if the operator leads to a solution worse than the previous iteration, it indicates poor performance, and its weight is reduced, resulting in a lower score.
Since each round of bidding computation generates a new solution, using a greedy approach that only accepts better solutions may easily lead to local optima. To encourage the algorithm to explore more, we adopt a simulated annealing criterion to determine whether to update the current solution, which involves gradually lowering a "temperature" parameter to balance exploration and optimization. If the new solution reaches the historical optimum, it is accepted unconditionally. If the new solution is better than the current one, it is also accepted. If the new solution is worse than the current one, the Metropolis criterion is used to accept it with a certain probability [46].
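The acceptance rule described above can be sketched as follows; the objective is assumed to be maximized, and the temperature schedule itself is omitted for brevity:

```python
import math
import random

def accept(new_obj, cur_obj, best_obj, temperature, rng=random.Random(0)):
    """Simulated-annealing acceptance: always accept solutions reaching
    the historical best or improving on the current one; otherwise use
    the Metropolis probability exp((new - cur) / T)."""
    if new_obj >= best_obj or new_obj >= cur_obj:
        return True
    return rng.random() < math.exp((new_obj - cur_obj) / temperature)
```

As the temperature is lowered between rounds, the probability of keeping a worse solution shrinks toward zero, shifting the search from exploration to refinement.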
Since we have designed four bidding operators and four disposal operators, the operator scoring matrix is represented by a $4\times4$ matrix $S$, where element $s_{xy}$ corresponds to the combination of bidding operator $x$ and disposal operator $y$, with $x,y\in\{1,2,3,4\}$ indexing the bidding and disposal operators, respectively. In a given round of computation, if using operators $x$ and $y$ results in the total solution yield reaching the historical optimum, the operator pair is assigned a score of $\sigma_1$. If the solution does not surpass the historical optimum but is better than the current solution, a score of $\sigma_2$ is given. If the new solution is worse than the current solution but is accepted based on the simulated annealing criterion, a score of $\sigma_3$ is assigned. If the new solution is worse than the current solution and is not accepted, a score of $\sigma_4$ is given. The scoring rule is described in Formula (26):

$$s_{xy}\leftarrow s_{xy}+\begin{cases}\sigma_1, & \text{new historical optimum}\\ \sigma_2, & \text{better than the current solution}\\ \sigma_3, & \text{worse but accepted}\\ \sigma_4, & \text{worse and rejected}\end{cases}$$
After the operator scoring matrix is updated at regular intervals, the corresponding operator weight matrix is updated. $W$ represents the operator weight matrix, where $w_{xy}$ are the cross weights of the operator pairs, with $x$ being the bidding operator index and $y$ the disposal operator index. $C$ is the operator selection count matrix; each time an operator pair is selected, the corresponding count element $c_{xy}$ is incremented by 1. $U$ is the operator weight update cycle: after every $U$ updates of the operator scoring matrix, the operator weights are updated with a step size of $\rho$, and the count matrix $C$ is reset. The score matrix $S$ undergoes a discount decay with a decay factor $\gamma$, where $\bar{S}$ is the normalized score matrix with elements $\bar{s}_{xy}=s_{xy}/\max(c_{xy},1)$. The update formula is as follows:

$$w_{xy}\leftarrow(1-\rho)\,w_{xy}+\rho\,\bar{s}_{xy},\qquad S\leftarrow\gamma S$$
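The periodic weight update can be sketched in the style of adaptive large neighborhood search. The blending formula and parameter names below are assumptions consistent with the description above, not necessarily the paper's exact equation:

```python
def update_weights(W, S, C, rho=0.3, gamma=0.9):
    """Blend each operator-pair weight with its count-normalized score,
    then apply discount decay to the scores and reset the counts.
    W, S: square lists of lists of floats; C: matching counts."""
    n = len(W)
    for x in range(n):
        for y in range(n):
            avg = S[x][y] / max(C[x][y], 1)           # normalized score
            W[x][y] = (1.0 - rho) * W[x][y] + rho * avg  # step-size blend
            S[x][y] *= gamma                           # discount decay
            C[x][y] = 0                                # reset selection counts
    return W
```

Operator pairs that repeatedly produce accepted or record-breaking solutions accumulate weight and are sampled more often in later rounds, while the decay factor prevents early successes from dominating forever.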
3.6. Algorithm Complexity Analysis
The time complexity of one round of bidding is as follows: For Algorithm 1, its time complexity is
. The value of
is related to the simulation cycle length, not the problem size, and is relatively small and finite. The value of
is also finite and does not increase with the problem size, so the time complexity can be considered
. For Algorithm 2, its time complexity is
, where the upper bound of
is the total number of tasks
, and the upper bound of
is also
. Therefore, the complexity of the bidding algorithm is
. In one round of bidding, tasks need to be sorted, which has a time complexity of
. Thus, the total complexity is
.
Algorithm 2: Function FSWA, fetch available slots within window (pseudocode presented as a figure in the original).
For the disposal phase, tasks need to be sorted, which has a time complexity of at most $O(N\log N)$. A certain proportion of tasks then need to be discarded, which takes $O(N)$ time, so the overall time complexity of the disposal phase is $O(N\log N)$.
The total time complexity of one round of computation is therefore $O(N^2)$, and considering the number of iterations $G$, the overall time complexity of the proposed algorithm is $O(G\cdot N^2)$.