1. Introduction
Aircraft maintenance is one of the key contributors to aviation safety and operational efficiency in the airline industry. Total maintenance expenditure accounts for approximately 10–15% of the direct operating costs of an aircraft operator [1]. Traditional aircraft maintenance policies are based on a combination of the preventive and corrective approaches.
According to the preventive strategy, components need to be inspected or replaced at fixed intervals, expressed in flight hours (FHs), flight cycles (FCs), or calendar days (DYs). This strategy is implemented through the preventive maintenance tasks described in the Maintenance Planning Document (MPD) and included in scheduled maintenance checks, referred to as letter checks (e.g., A-, B-, C-, and D-checks). While this strategy is largely responsible for the safety and reliability of aviation today, its cost-effectiveness is limited by statistical considerations and generalizations that can lead to the replacement of a component long before its true due date is reached, or to a failure of the component prior to the assigned maintenance date. In both cases, increased operational costs are incurred.
The corrective maintenance strategy addresses unexpected maintenance tasks, such as faults reported by the pilots, bird strikes, or a finding during an inspection or a functional check as part of a preventive maintenance task. The stochastic nature of these unanticipated maintenance tasks creates disruptions to the maintenance schedule. As such, the maintenance schedule cannot be executed as planned but, rather, has to be constantly adjusted. In the best-case scenario, the aircraft can be assigned to a new maintenance opportunity. However, in the worst-case scenario, there might not be an available maintenance opportunity and the aircraft will lose its airworthiness until the issue is resolved. What is more, the resources required to repair a component after failure are usually more costly than those required to repair it before it fails.
Striving to reduce the related maintenance costs, the aviation industry is gradually shifting to condition-based maintenance (CBM). CBM is a predictive maintenance strategy that makes use of sensors and advanced analytics to continuously monitor the health of aircraft components and predict their remaining useful life (RUL). Maintenance actions are triggered only when there is strong evidence of failure risk, hence decreasing the number of unnecessary maintenance actions and, at the same time, avoiding unforeseen failures and the corresponding unscheduled maintenance events. Since preventive maintenance strategies will remain necessary for an effective transition to CBM, this gradual transition is currently realized by combining the existing corrective and preventive strategies with the predictive maintenance strategy.
However, this “hybrid” maintenance approach is very challenging for the maintenance planner. The first challenge arises from the uncertainty included in the RUL predictions. More specifically, the maintenance planner, instead of planning tasks considering a deterministic task due date, as in the case of the preventive and corrective maintenance tasks, has to devise the maintenance plan based on an uncertain outcome. This outcome is captured by the probability distribution of the predicted RUL.
The second challenge lies in the continuous update of the (uncertain) RUL predictions together with the stochastic arrival of unanticipated maintenance tasks, which eventually can create disruptions to the existing maintenance schedule. These disruptions compromise the feasibility and efficiency of the initial schedule. For this reason, the maintenance schedule needs to be continuously updated with respect to the new information, such that all maintenance tasks are scheduled ahead of their due date. Currently, the majority of airlines still rely on manual scheduling methods to adjust and update their maintenance schedule, which results, potentially, in sub-optimal planning.
Historically, the academic literature on aircraft maintenance scheduling is mostly oriented around the scheduling of maintenance tasks bundled in letter checks. Among the most recent works, Ref. [2] used a Dynamic Programming (DP)-based methodology to solve the aircraft maintenance check scheduling problem over a period of 4 years. However, this level of scheduling is too abstract to address the disruptions mentioned above. To achieve the level of detail required for dynamic scheduling of maintenance tasks, some researchers have shifted their attention to scheduling maintenance tasks individually. The most recent and detailed work was performed by [3], who proposed a hybrid value function approximation (VFA)–Rolling Horizon policy to schedule critical and non-critical tasks of an aircraft fleet. However, the availability of resources, such as material or ground equipment requirements, is not considered.
In general, task (re-)scheduling in a disruptive environment while taking into account the availability of resources touches upon three fields of research: task scheduling, disruption management, and resource allocation. A widely used method in the literature for addressing these types of problems is mixed integer linear programming (MILP) [4,5,6]. Although the literature on MILP frameworks specifically applied to the aircraft maintenance scheduling problem is extremely scarce, there are other fields of study where the developed models can be extended to the airline maintenance scheduling problem. Specifically, both the health and the construction sectors show similarities to airline maintenance operations when it comes to scheduling and managing disruptions.
In the health sector, Ref. [7] developed a MILP model to address the issue of emergency patient admission and rescheduling of elective patients, subject to the availability of various resources, such as the capacity of clinic units. The objective of the MILP model is to minimize the cost of postponing the elective surgery patients, declining the emergency patients, and overutilizing the operating rooms. Ref. [8] used a MILP model to minimize the tardiness of surgeries, idle time, and overtime, while also considering uncertainty in the duration of the surgeries and constraints related to human resources and medical equipment. Ref. [9] designed a MILP model that supports the selection and, when needed, the rescheduling of elective patients waiting for surgery. The results of the model were evaluated with respect to four different scheduling objectives. In the construction sector, Ref. [10] proposed a rescheduling optimization model which minimizes alterations to the initial project schedule. The only contribution we are aware of that addresses aircraft maintenance task scheduling in a disruptive environment is [11], who developed a MILP framework to schedule aircraft tasks, minimizing aircraft ground time while also limiting the number of schedule changes.
The main issue exhibited in the previous studies is that the problem size scales exponentially with the number of considered tasks and the related decision variables, which can demand significant computational time and memory in a real-time environment. Reinforcement learning (RL) can alleviate this limitation, as the time required to find a solution is heavily reduced once the RL model is trained. The use of RL in aircraft maintenance scheduling problems was studied in [12], who developed a deep Q-learning network (DQN) to optimize the long-term maintenance scheduling of an aircraft fleet.
Most recently, in [13], a two-stage scheduling framework for an aircraft fleet in a CBM context is developed, taking into account the uncertainty of the RUL predictions of the prognostics-driven tasks. In the first stage, the proposed framework uses the partially observable Monte Carlo planning (POMCP) algorithm developed by [14] to define the optimal maintenance action for prognostics-driven tasks with uncertain RUL predictions. In the second stage, a deep reinforcement learning (DRL) algorithm is developed, which produces the maintenance schedule for the aircraft fleet, where each aircraft contains a mixture of preventive, corrective and prognostics-driven tasks. The obtained results indicate that DRL is able to produce both an efficient and stable maintenance schedule in just a few seconds.
In this paper, we adopt the framework developed in [13] and additionally use the MILP model developed by [11] in the second scheduling stage, in order to compare the performance of the MILP and DRL scheduling approaches and explore the possible trade-offs between them.
The remainder of this paper is organized as follows: the aircraft maintenance requirements, objectives, and the general problem formulation are described in Section 2. The maintenance scheduling algorithm for the prognostics-driven tasks is briefly discussed in Section 3. An overview of the MILP and the DRL scheduling model for the aircraft fleet is given in Section 4. In Section 5, the performance of both models is evaluated on three different maintenance scenarios for different aircraft fleet sizes, where each aircraft has a list of open maintenance tasks that are updated on a continuous basis. Finally, Section 6 summarizes the research with concluding remarks and recommendations for future work.
3. Maintenance Scheduling of Prognostics-Driven Tasks
We formulate the decision-making process for the execution of prognostics-driven tasks as a partially observable Markov decision process (POMDP), which is solved using a modified version of the POMCP algorithm. Readers are referred to [13] for a detailed explanation of the POMDP formulation and the POMCP algorithm. The general concept is that the POMCP algorithm constructs a binary-decision search tree of state histories, or beliefs, for every considered component (and its associated prognostics-driven task). An example of such a tree, for a component with two hidden health states (Healthy, Degrading) and one evident state (Fail), is visualized in Figure 2. The algorithm uses the following inputs:
The RUL predictions for every component, which for the purposes of this study are assumed to follow the normal distribution, i.e., at each point in time the predicted RUL is characterized by a mean and a standard deviation.
The available maintenance slots.
The average daily aircraft utilization.
The desired planning horizon.
The component maintenance cost at each point in time, which is formulated as a combination of the corrective and the preventive maintenance cost of the component. This cost depends on the elapsed time from the installation of the component, the Mean Time Between Failures (MTBF) for the specific component, and the probability that the component will fail before the next decision epoch. The probability of failure of the component is calculated using the CDF of the normal distribution.
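As a minimal illustration of the last input, the sketch below evaluates a failure probability from a normally distributed RUL prediction. It assumes that a failure occurs when the usage accumulated until the next decision epoch exceeds the RUL; this reading, the function names, and the example values are illustrative assumptions rather than the exact expression used in this study.

```python
import math


def normal_cdf(x: float, mean: float, std: float) -> float:
    """CDF of the normal distribution N(mean, std^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def failure_probability(rul_mean_fc: float, rul_std_fc: float,
                        fc_until_next_epoch: float) -> float:
    """Probability that the RUL is shorter than the usage until the next epoch.

    Assumption (not taken from the paper): the component fails if the flight
    cycles flown until the next decision epoch exceed its remaining useful life.
    """
    return normal_cdf(fc_until_next_epoch, rul_mean_fc, rul_std_fc)


# Hypothetical prediction: mean RUL of 40 FCs with a standard deviation of 10 FCs.
p_one_day = failure_probability(40.0, 10.0, fc_until_next_epoch=4.0)
p_at_mean = failure_probability(40.0, 10.0, fc_until_next_epoch=40.0)
```

With these hypothetical numbers, the probability of failing within one day of flying is very small, while it reaches 0.5 once the accumulated usage equals the predicted mean RUL.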
We assume that the predictions from the prognostics model are available every day, so the decision epoch n corresponds to day n of the planning horizon. Accordingly, the tree is organized in n alternating layers of belief and action nodes, where each layer corresponds to a day of the planning horizon. Each node is characterized by the number of visits N, which counts the number of times this node has been visited, and a value V, which captures the average estimated return of all simulations when starting from this node.
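The per-node bookkeeping described above can be sketched as follows. The class and attribute names are hypothetical; the only behaviour taken from the text is that each node stores a visit count N and a value V equal to the average return of the simulations that passed through it.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Node:
    """A belief or action node of the search tree (illustrative structure)."""
    visits: int = 0                 # N: number of times the node was visited
    value: float = 0.0              # V: average return of simulations from here
    children: Dict[str, "Node"] = field(default_factory=dict)

    def update(self, simulated_return: float) -> None:
        """Fold one simulated return into the running average."""
        self.visits += 1
        self.value += (simulated_return - self.value) / self.visits


node = Node()
for simulated_return in (10.0, 20.0, 30.0):
    node.update(simulated_return)
# node.visits is now 3 and node.value is the mean of the three returns, 20.0
```

The incremental update avoids storing every simulated return, which matters when many simulations are run per decision epoch.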
Starting from the root node, which corresponds to the belief of the maintenance planner at the present day, we perform multiple iterations to explore the different branches of the tree or, equivalently, the different scenarios regarding the health state of the component. As a result of the chosen action at each action node, the maintenance planner receives a total discounted accumulated return, in which a discount factor is applied to the reward of each decision epoch. This reward is the difference in maintenance cost between two consecutive decision epochs.
This formulation of the reward function is intended to capture the additional maintenance cost savings or losses that can be incurred because of the decision of the maintenance planner to postpone the maintenance of the component for one additional day. The value function, which can be used to assess the quality of an action at a given time, corresponds to the expected return that will be earned over the planning horizon, starting from the current belief state. When all the simulations are complete, the maintenance planner selects the action node with the greatest value.
Finally, when a new prediction from the prognostics has been received, we prune the tree at the belief node determined by the received observation. This specific belief node becomes the new root node of the tree and, as such, all the other belief nodes are now impossible.
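The three steps described in this section (accumulating a discounted return, selecting the action with the greatest value, and pruning the tree after a new observation) can be sketched as below. This is a simplified illustration under stated assumptions: the action names, the observation labels, the nested-dictionary tree, and the reward values are hypothetical and do not reproduce the modified POMCP implementation.

```python
from typing import Dict, Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of per-epoch rewards, each weighted by gamma**k for epoch offset k."""
    return sum((gamma ** k) * r for k, r in enumerate(rewards))


def best_action(action_values: Dict[str, float]) -> str:
    """Return the action whose node has the greatest estimated value."""
    return max(action_values, key=action_values.get)


def prune(tree: Dict[str, Dict[str, object]], action: str, observation: str):
    """Keep only the subtree reached by the executed action and the
    received observation; this subtree becomes the new root."""
    return tree[action][observation]


# Hypothetical values after the simulations have finished.
values = {"postpone": 3.2, "maintain": 2.9}
chosen = best_action(values)  # "postpone"

# Hypothetical tree: action -> observation -> subtree placeholder.
tree = {
    "postpone": {"healthy": "subtree_A", "degrading": "subtree_B"},
    "maintain": {"healthy": "subtree_C"},
}
new_root = prune(tree, chosen, "degrading")  # "subtree_B"
```

Reusing the pruned subtree keeps the statistics already gathered for the observed branch, so the next day's search does not start from scratch.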
5. Computational Experiments
5.1. Case Study
In this section, the two scheduling models are applied to three different maintenance scenarios for different aircraft fleet sizes, with data provided by a major European airline. The data cover a period of 5 months of airline operations. According to the requirements defined in the MPD and MEL, for each aircraft there is a list of open preventive and corrective maintenance tasks, which is updated on a daily basis. These tasks can be executed in specified maintenance opportunities. Each maintenance opportunity has specific available resources, which vary over time. Moreover, the available workforce in each maintenance opportunity is organized by skill. Data about the (arrival of) corrective and preventive maintenance tasks, the maintenance slots schedule, and the available resources are provided by the airline.
Furthermore, for 10 aircraft from the considered fleets, we simulate additionally a total of 250 prognostics-driven tasks. More specifically, we assume that each aircraft has 25 monitored components with predictable RULs which are updated on a daily basis and are assumed to follow the normal distribution. The working state of the components included in the 10 aircraft is different, i.e., every component has a different true RUL.
The predictions are simulated by applying the Support Vector Regression (SVR) prognostics model developed in [15] on the C-MAPSS dataset [16] for turbo-fan engines, assuming that the time cycles used in the C-MAPSS dataset correspond to flight cycles (FCs). Following the approach described in [17], the obtained predictions are then organized in four clusters according to the prediction accuracy and uncertainty, described by the MAE and the standard deviation in FCs, respectively:
Cluster #1: and ;
Cluster #2: and ;
Cluster #3: and ;
Cluster #4: and .
We then assume that for 28% of the components (7 components per aircraft) we obtain continuously updated RUL predictions belonging to cluster 1, for 24% (6 components per aircraft) predictions belonging to cluster 2, for 24% (6 components per aircraft) predictions belonging to cluster 3, and for 24% (6 components per aircraft) predictions belonging to cluster 4.
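The split of monitored components over the four clusters can be checked with a few lines of arithmetic. The counts and fleet size are taken from the text; the variable names are illustrative.

```python
COMPONENTS_PER_AIRCRAFT = 25
MONITORED_AIRCRAFT = 10

# Components per aircraft assigned to each prediction-quality cluster.
cluster_counts = {1: 7, 2: 6, 3: 6, 4: 6}

# Share of the monitored components in each cluster.
cluster_shares = {
    cluster: count / COMPONENTS_PER_AIRCRAFT
    for cluster, count in cluster_counts.items()
}

# Number of simulated prognostics-driven tasks per cluster across the fleet.
fleet_totals = {
    cluster: count * MONITORED_AIRCRAFT
    for cluster, count in cluster_counts.items()
}

total_tasks = sum(fleet_totals.values())  # 250 prognostics-driven tasks
```

The counts add up to the 25 monitored components per aircraft and to the 250 simulated prognostics-driven tasks across the 10 aircraft.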
Moreover, the corrective and preventive replacement cost of every monitored component is set to
and
, respectively. The magnitude of the cost values is based on [18], where average historical values of true preventive and corrective repair costs were used.
In order to simulate the dynamic process of the arrival of new corrective and preventive tasks, and also the update of RUL predictions for the monitored components, we implement a rolling horizon approach (see Figure 5). A maintenance schedule is generated for a fixed time window. Afterward, the planning horizon shifts one day ahead, at which point new tasks may arrive and/or new RUL predictions are obtained. The scheduling algorithms then produce a new, feasible schedule for the intended time window, while we choose to minimize the number of schedule changes in the next 3 days (highlighted grey area in Figure 5). The choice of 3 days is driven by airline practice, as aircraft are usually allocated to flights 3 days before the day of operations. The same process repeats until the end of the planning horizon is reached. It is noted that no scheduling opportunities beyond the end of the planning horizon are considered.
5.2. Assumptions
In our case study, we adopt the following assumptions:
Aircraft utilization is known and constant. The daily aircraft utilization is set to 15 FHs or 4 FCs, according to historical aircraft utilization values of an airline.
RUL predictions for every monitored component are assumed to follow the normal distribution.
The prognostics-driven tasks correspond to components that are not critical for the safe operation of the aircraft.
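Under the constant-utilization assumption, a remaining interval expressed in FHs, FCs, or DYs can be converted to calendar days by a simple division. The sketch below uses the utilization values stated above; the function name and the example intervals are hypothetical.

```python
# Daily usage per unit under the constant-utilization assumption.
DAILY_UTILIZATION = {"FH": 15.0, "FC": 4.0, "DY": 1.0}


def remaining_days(remaining: float, unit: str) -> float:
    """Convert a remaining interval in FH, FC, or DY to calendar days."""
    return remaining / DAILY_UTILIZATION[unit]


# Hypothetical task with 300 FHs left and a component with a mean RUL of 120 FCs.
days_fh = remaining_days(300.0, "FH")  # 20.0 days
days_fc = remaining_days(120.0, "FC")  # 30.0 days
```

When a task carries limits in more than one unit, the earliest of the converted values would determine its due date.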
5.3. Results Analysis
The scheduling models are compared and evaluated on the basis of the four maintenance planning objectives described in Section 2.4. Table 4 summarizes the results obtained after applying the MILP and the DRL scheduling models to different maintenance scenarios for the different aircraft fleet sizes.
With respect to timely execution, both models achieve approximately the same results, with the MILP algorithm performing better in scenario #1 and the DRL algorithm achieving better results in scenario #3. In all scenarios, the DRL algorithm requires more maintenance slots, which translates into a higher allocated ground time for maintenance. However, in exchange for the increased ground time, the DRL algorithm performs significantly better than the MILP algorithm when it comes to the RUL exploitation of the monitored components, utilizing on average across all scenarios 72.6% of their RUL, compared to 40.5% achieved by the MILP algorithm. Moreover, even though the DRL algorithm uses more maintenance slots, the average duration of the chosen slots is nearly the same as the average duration of the slots used by the MILP algorithm.
The MILP algorithm, being more conservative with respect to the scheduling of the prognostics-driven tasks, induces fewer last-minute changes to the maintenance schedule than the DRL algorithm in all scenarios. Furthermore, the DRL algorithm performs slightly better than the MILP algorithm when scheduling the corrective tasks, whereas the MILP algorithm performs slightly better when scheduling the preventive tasks.
Finally, the computational time requirements of both models are summarized in Table 5. These computational times were obtained using an Intel Core i5 processor 9300, with 16 GB RAM and an NVIDIA 1660Ti GPU. In all scenarios, the computation time needed by the DRL algorithm to produce the maintenance schedule for every day of the planning horizon is lower than that of the MILP algorithm, with the DRL algorithm being up to approximately 65% faster. This highlights the scalability of the DRL algorithm when more aircraft and tasks are considered. Nevertheless, both models prove suitable for quasi real-time decision making, as required in an airline maintenance environment.
From the results, it can be concluded that both presented models are suitable for scheduling in an airline environment, reflecting different maintenance scheduling strategies. The MILP model assigns more weight to maintenance decisions that minimize ground time and increase schedule stability, whereas the DRL model assigns more weight to the exploitation of the RUL of the monitored components. This observed trade-off also provides valuable insights with respect to aircraft scheduling in a CBM environment. As the results show, in order to leverage the benefits of the CBM strategy with respect to a high RUL exploitation of the components, potentially more aircraft visits to the hangar for maintenance will be required. This is a direct consequence of the uncertainty included in the RUL predictions from the prognostics models. It means that, for an effective transition towards a CBM approach, airlines should consider adding more flexibility to their maintenance schedule. However, in the long term, this would ultimately lead to fewer replacements/repairs of the monitored components and lower inventory-related costs.
6. Conclusions
In this paper, a comparison between a MILP and a DRL model for the maintenance scheduling of an aircraft fleet in a CBM context was presented. The RUL prognostics are updated on a daily basis with new sensor measurements and are characterized by uncertainty that follows the normal distribution. On top of that, the list of preventive and corrective maintenance tasks is also continuously updated. Both scheduling models take into account the list of different types of maintenance tasks, along with the available maintenance slots, the available resources, and the existing maintenance schedule, to produce the maintenance schedule of the aircraft fleet using a rolling horizon approach. The overarching goal is to prevent tasks from exceeding their due dates while, at the same time, ensuring high fleet availability, schedule stability, and efficient task interval utilization.
The performance of the investigated models was evaluated on three real maintenance scenarios for different aircraft fleet sizes with data provided by our partner airline, enriched with simulated data for the prognostics-driven tasks from the C-MAPSS dataset. The results show that both models have similar performance with respect to timely task execution. The DRL algorithm schedules the prognostics-driven tasks more efficiently, achieving a higher RUL exploitation for the monitored components. On the other hand, the MILP algorithm induces less maintenance ground time and requires fewer last-minute changes in the maintenance schedule. As such, the choice of which model to use depends on the objective that the airline considers of higher importance. Finally, considering quasi real-time requirements for applications of scheduling models in a real airline environment, both models achieve computational times below one minute. However, the results highlight the capability of DRL to achieve an increased computational efficiency that stays unaffected by the problem size and the considered variables.
Future work may focus on examining the robustness of both models under different types of RUL prognostics distributions, with different levels of prediction accuracy. Furthermore, it would be interesting to investigate the implementation of a hybrid approach, i.e., using a MILP model to facilitate the offline training of the DRL model, as this could help to improve the quality of the solution.