Article

Optimal Mission Abort Decisions for Multi-Component Systems Considering Multiple Abort Criteria

1 School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
2 Brandeis International Business School, Brandeis University, Waltham, MA 02454, USA
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(24), 4922; https://doi.org/10.3390/math11244922
Submission received: 4 November 2023 / Revised: 21 November 2023 / Accepted: 5 December 2023 / Published: 11 December 2023
(This article belongs to the Special Issue System Reliability and Quality Management in Industrial Engineering)

Abstract

This paper studies the optimal mission abort decisions for safety-critical mission-based systems with multiple components. The considered system operates in a random shock environment and is required to accomplish a mission during a fixed mission period. If the failure risk of the system is very high, the main mission can be aborted to avoid higher failure cost. The main contribution of this study lies in the design and optimization of mission abort policies for multi-component systems with multiple abort criteria. Moreover, multi-level transitions are considered in this study to characterize the different shock-resistance abilities for components in different states. Mission abort decisions are determined based on the number of components in either defective or failed state. The problem is formulated in the framework of the finite Markov chain imbedding method. We use the Monte-Carlo simulation method to derive the mission reliability and system survivability. Numerical studies and sensitivity analysis are presented to validate the obtained result.

1. Introduction

Systems deployed in critical applications, such as underwater vehicles, high-speed trains, chemical reactors, aircraft fleets, wind turbines, and high-voltage power cables, are subject to system malfunction during mission execution, which can cause environmental pollution, economic loss, and even fatalities. Therefore, the execution of a rescue procedure (RP) to ensure system survival becomes necessary when certain deteriorating conditions are met [1,2]. For example, an aircraft with multiple engines can abort the primary mission (PM) and make a precautionary emergency landing when certain engines fail [3]. The degradation status of each engine can be detected through sophisticated sensors in real-time. When the detected engine states indicate a high system failure risk, the operators can decide to abort the mission and start the rescue procedure immediately. This decision process is feasible due to the rapid development of the Internet of Things.
For safety-critical systems with a possibility of mission abort, two distinct performance measures should be balanced: mission success probability (MSP) and system survival probability (SSP). Mission success probability is defined as the probability of mission success, while system survival probability measures the probability of system non-failure during mission execution or RP. The mission abort improves SSP but leads to a reduction in MSP [4]. To strike a balance between MSP and SSP, a significant body of research has been dedicated recently to modeling and optimizing mission abort policies with single or multiple criteria.
The design and optimization of mission abort policies have lately received a lot of attention due to their practical and theoretical importance. Various mission abort optimization models have been explored, which can be divided into age-based and condition-based policies [5]. Condition-based risk control policies outperform age-based policies because they make effective use of in situ degradation information [6]. Despite their theoretical and practical significance, condition-based mission abort policies have received comparatively little attention [7]. When a deterioration condition defined by the mission abort policy is met, the system aborts its PM and starts an RP. Zhao et al. [8] investigated the dynamic mission abort decision-making problem for regularly inspected systems using a Markov decision process.
Mission abort policies with a single criterion have been extensively studied. Myers investigated the optimal mission abort policies based on the number of failed components for k-out-of-n: G systems [9]. Ref. [10] designed mission abort strategies based on early-warning information. Ref. [11] studied optimal condition-based mission abort decisions based on the defective duration. However, in many real situations, the operation of safety-critical systems is affected by more than one factor. For example, a UAV suffers not only external shocks from a random environment but also internal degradation caused by the working load. Therefore, the mission abort decision should incorporate all of these factors. Mission abort policies with multiple criteria have also been extensively studied. Levitin et al. studied a mission abort policy based on the number of failed components and the elapsed mission time [12]. Ref. [13] investigated a mission abort policy based on the defective duration and the level of degradation. Yang et al. studied abort decision-making based on the level of degradation and the age of the system [14]. Levitin et al. studied a mission abort policy based on performance constraints and system state subsets [15]. Zhao et al. investigated a mission abort policy based on the cumulative number of valid shocks and the number of consecutive valid shocks [16].
By reviewing existing studies, we found that many practical engineering systems operate in a shock environment, and external shocks are often directly related to system failures [17]. Shock models can be classified into five categories: the cumulative shock model [18], the extreme shock model [19,20], the run shock model [21], the δ-shock model [22,23], and the mixed shock model. In the extreme shock model, a system breaks down because of an individual shock with a magnitude that exceeds a critical level [24]. Existing shock models mostly consider binary-state systems and components [25], whereas practical engineering systems and components often have more than two states, ranging from perfect to entirely failed [26]. A multi-state system under an extreme shock model was first proposed in [20]. Ref. [27] explored mission abort policies for multi-state systems, which may operate in intermediate states with varied PL. In this paper, each component has three states: perfect, defective, and failed. When a shock arrives, a component transfers to the adjacent worse state with a certain probability.
Due to the harsh shock environment, the system state deteriorates as the system ages. Each shock increases the probability of failure, resulting in multi-level transition probabilities [28]. For example, some systems can resist the damage of shocks due to their material, structure, or affiliated devices. This resistance to shocks also deteriorates as the system degrades, making failures more likely; it can be treated as a function of the system state or the system age [29]. Thus, when the failure risk of the system increases to a certain threshold, it is reasonable to abort the mission and execute a rescue. This paper assumes that the probability of a transition from the defective state to the failed state is larger than the probability of a transition from the perfect state to the defective state.
Overall, mission abort policies that account for failure propagation in multi-component systems operating in a shock environment have not been studied sufficiently. Hence, this study proposes an optimal mission abort policy with multiple abort criteria for multi-component systems. The system is considered to have failed when the number of defective components or the number of failed components exceeds the predefined critical threshold during the execution of a mission. A mission abort policy is therefore implemented when the number of defective components or failed components reaches the mission abort criteria, in order to improve the mission success reliability and system survival probability while minimizing the average cost. The contribution of this study is threefold:
Multi-level transition probabilities which indicate the defective components deteriorate to a worse state with higher probability than the normal components are considered in this study;
Both the number of defective components and failed components have an effect on the mission abort activity and rescue procedures;
A cost minimization model that balances the mission success reliability and system survivability is constructed to determine the optimal mission abort decision parameters.
The organization of this study is described as follows. Section 2 formulates a mission abort policy with multiple criteria for multi-component systems in a shock environment. Section 3 derives the closed-form of mission reliability and system survival probability by using the finite Markov chain embedding method. The total average cost minimization model is constructed to find the optimal mission abort policy. In Section 4, an illustrative example is presented to verify the availability and efficiency of the proposed mission policy. Section 5 discusses the conclusions and some possible future research directions.

2. Problem Formulation

In this paper, we consider a multi-component system that operates in a shock environment and is required to perform a mission while maintaining normal functioning within a time period $\tau$. Each component in the system can be in one of three possible states: normal, defective, and failed. When the number of defective components in the system is higher than $M_1$ or the number of failed components is higher than $M_2$, the system fails. External shocks randomly affect the components in the system. Specifically, each external shock has a probability of deteriorating the component's state by one level (e.g., from normal to defective), and a complementary probability of having no effect on the component's state. Because normal and defective components differ in their resilience against shocks, we assume that the probability of a shock deteriorating the state of a defective component is higher than that of a normal component. The impacts of random shocks on the $N$ components in the system are independent of each other. The mission abort criteria are met when the number of defective components reaches a predefined threshold $R_1$ and/or the number of failed components exceeds a threshold $R_2$. If the abort condition is met before time $s$, the mission is immediately aborted and the rescue procedure is started to save the system. The time required for rescue depends on these two abort thresholds and the states of the components, and the system stays in its original state until the rescue procedure is completed. Otherwise, if the abort condition is met after time $s$, the mission continues because the time required to complete the rest of the mission is less than the time for a successful rescue.
Figure 1 shows an illustrative example of the proposed mission abort policy with 4 possible cases when $M_1 = 5$, $M_2 = 4$, $R_1 = 3$, $R_2 = 2$. In Case 1, the mission is aborted and the rescue procedure is started at time $t_{1,k}$ with duration $t_1$ because the number of defective components reaches $R_1$. In Case 2, the number of failed components exceeds $R_2$, so the mission is aborted and the rescue procedure is triggered with duration $t_2$. The mission fails in these two cases, while the system survives. In Case 3, although the abort condition is satisfied at time $t_{3,k}$, the mission is not aborted because $t_{3,k} > s$, which means the remaining time $\tau - t_{3,k}$ is not enough to accomplish the rescue procedure. The system fails at time $t_{2,n}$ since the number of failed components reaches $M_2$ before the mission is completed at time $\tau$. Case 4 indicates that the mission is successfully finished and the system keeps functioning until time $\tau$.
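The decision rule described above can be sketched as a small function. This is only an illustration under stated assumptions: the function name `classify_state` and its return labels are hypothetical, and every threshold is treated as being met when the count reaches it (`>=`), since the text uses both "reaches" and "exceeds".

```python
def classify_state(n_def, n_fail, t, s, R1, R2, M1, M2):
    """Return the action implied by the abort policy for the current state.

    n_def, n_fail : current numbers of defective and failed components
    t             : current time; s : latest time at which an abort is still worthwhile
    R1, R2        : abort thresholds; M1, M2 : system failure thresholds
    Assumption: a threshold counts as met once the count reaches it (>=).
    """
    # System failure takes precedence over any abort decision.
    if n_def >= M1 or n_fail >= M2:
        return "system_failed"
    abort_condition = n_def >= R1 or n_fail >= R2
    # Abort only if the condition is met before time s.
    if abort_condition and t < s:
        return "abort_and_rescue"
    return "continue_mission"


if __name__ == "__main__":
    # Thresholds taken from the Figure 1 example; the times are made up.
    print(classify_state(3, 0, t=2.0, s=7.0, R1=3, R2=2, M1=5, M2=4))
    print(classify_state(3, 0, t=8.0, s=7.0, R1=3, R2=2, M1=5, M2=4))
```

The first call corresponds to Case 1 (abort), the second to the first part of Case 3 (abort condition met too late, so the mission continues).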

3. Reliability Evaluation of the System

In this section, we analyze the reliability indexes of the considered multi-component system. Mission reliability and system survivability are derived by using the Markov chain embedding approach. To balance these two indexes, an optimization model with the objective of minimizing the total expected cost is then constructed.

3.1. Mission Reliability and System Survivability

Let $K_1$ denote the total number of shocks that have arrived when the system meets the mission abort condition, and let the random variable $T$ denote the time at which the system reaches the mission abort condition. Then $T$ can be expressed as $T = \sum_{i=1}^{K_1} Y_i$, where $Y_i$ is the time interval between the $(i-1)$-th shock and the $i$-th shock. $P_1$ is the probability that a component degrades from the normal state to the defective state after a shock arrives, and $P_2$ is the probability that a component transitions from the defective state to the failed state after a shock. $P_1'$ and $P_2'$ denote the corresponding probabilities that a component degrades to the adjacent state after a shock arrives during the rescue process.
Denote $N_D^i$ and $N_F^i$ as the number of components in the defective state and the failed state after the arrival of the $i$-th shock, respectively. Then, the Markov chain can be constructed as follows:
$$X_i = (N_D^i, N_F^i), \quad i = 1, 2, \ldots$$
The corresponding state space is
$$\Omega_1 = O_1 \cup E_{fa} = \{(n_D, n_F) : 0 \le n_D \le R_1 - 1,\ 0 \le n_F \le R_2 - 1,\ n_D + n_F \le N\} \cup E_{fa},$$
where $O_1$ represents the set of states that have not reached the mission abort threshold, and $E_{fa}$ is the absorbing state indicating that the system meets the mission abort condition. The transition probabilities are listed as follows.
(1) If $0 \le a < R_1$, $0 \le b < R_2$: $P\{X_i = (a, b) \mid X_{i-1} = (a, b)\} = a(N - a - b)(1 - P_1)(1 - P_2)$,
(2) If $0 \le a < R_1 - 1$, $0 \le b < R_2$: $P\{X_i = (a + 1, b) \mid X_{i-1} = (a, b)\} = a(N - a - b)P_1(1 - P_2)$,
(3) If $0 \le a < R_1$, $0 \le b < R_2 - 1$: $P\{X_i = (a - 1, b + 1) \mid X_{i-1} = (a, b)\} = a(N - a - b)(1 - P_1)P_2$.
Since the probability distribution of the different absorbing states affects the initial distribution of the next stage, it is necessary to write down all possible states of the absorbing state and the corresponding probability. The following shows the transition probability of the system from the working state to the absorbing state.
(4) If $a = R_1 - 1$, $0 \le b < R_2 - 1$: $P\{X_i = (R_1, b) \mid X_{i-1} = (a, b)\} = a(N - a - b)P_1(1 - P_2)$,
(5) If $0 \le a \le R_1 - 1$, $b = R_2 - 1$: $P\{X_i = (a - 1, R_2) \mid X_{i-1} = (a, b)\} = a(N - a - b)(1 - P_1)P_2$.
Based on the above transition rules, the one-step transition probability matrix $\Lambda_1$ can be constructed as
$$\Lambda_1 = \begin{bmatrix} U_1 & W_1 \\ 0 & I_1 \end{bmatrix},$$
where the matrix $U_1$ of size $h_1 \times h_1$ is the transition probability matrix among the $h_1$ transient states, with $h_1 = R_1 \times R_2$. The matrix $W_1$ of size $h_1 \times h_1'$ contains the transition probabilities from the transient states to the absorbing states, with $h_1' = R_1 + R_2$. The matrix $I_1$ is an identity matrix of size $h_1' \times h_1'$, which is the one-step transition matrix among the absorbing states.
According to the state transition probability matrix $\Lambda_1$, the probability that the system is in each Markov chain state after suffering $i$ shocks can be obtained as
$$P_1(i) = \pi_1 (\Lambda_1)^i = \left(P_1^a(i), P_2^a(i), \ldots, P_{h_1}^a(i), P_{h_1+1}^a(i), \ldots, P_{h_1+h_1'}^a(i)\right),$$
where $P_x^a(i)$ denotes the probability that the system is in Markov chain state $x$ ($x = 1, 2, \ldots, h_1 + h_1'$) after suffering the $i$-th shock, $\pi_1$ is the initial state probability vector of the system, and $I_1 = (1, 1, \ldots, 1)_{h_1' \times 1}$ here denotes a column vector of ones (not the identity block of $\Lambda_1$). The probability that the system reaches the mission abort threshold within $\tau$ is
$$F_T(\tau) = P\{T \le \tau\} = \pi_1 (U_1)^{K_1 - 1} W_1 I_1.$$
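As a hedged illustration of the block structure $[U, W; 0, I]$, the following sketch evaluates the product $\pi U^{k-1} W \mathbf{1}$ for a toy absorbing chain. In a standard absorbing Markov chain this product is the probability of first entering the absorbing set at exactly the $k$-th step (here, the $k$-th shock). The two-state matrices, their numbers, and the helper names are made up for illustration and are not the paper's $U_1$ and $W_1$.

```python
def vec_mat(v, m):
    """Multiply a row vector v by a matrix m (both given as plain lists)."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


def absorption_at_step(pi, U, W, k):
    """Probability pi * U^(k-1) * W * 1 of being absorbed at exactly step k."""
    v = list(pi)
    for _ in range(k - 1):
        v = vec_mat(v, U)          # stay among transient states for k-1 steps
    return sum(vec_mat(v, W))      # then jump to any absorbing state


def absorption_by_step(pi, U, W, k):
    """Cumulative probability of absorption within the first k steps."""
    return sum(absorption_at_step(pi, U, W, j) for j in range(1, k + 1))


# Toy example (hypothetical numbers): two transient states, one absorbing state.
# Each row of [U | W] sums to 1, as required for a stochastic matrix.
U = [[0.7, 0.2],
     [0.0, 0.6]]
W = [[0.1],
     [0.4]]
pi = [1.0, 0.0]                    # start in the first transient state

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, absorption_at_step(pi, U, W, k), absorption_by_step(pi, U, W, k))
```

Summing the per-step values gives the probability of absorption within a given number of shocks; converting a number of shocks into a time horizon additionally requires the shock arrival process.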
When the system reaches the mission abort threshold but does not abort the mission, the lifespan of the system is denoted by $L_1$, and $K_2$ is the total number of shocks that the system experiences during $\tau$. The state space can be defined as follows:
$$\Omega_2 = O_2 \cup O_3 \cup E_{fb} = \{(n_D, n_F) : R_1 \le n_D \le M_1,\ 0 \le n_F \le M_2,\ n_D + n_F \le N\} \cup \{(n_D, n_F) : 0 \le n_D \le R_1 - 1,\ R_2 \le n_F \le M_2,\ n_D + n_F \le N\} \cup E_{fb},$$
where $O_2$ is the set of states in which the number of defective components has reached the mission abort threshold, whether or not the number of failed components has reached its abort threshold, and $O_3$ is the set of states in which the number of failed components meets the mission abort condition while the number of defective components does not reach the mission abort threshold. $E_{fb}$ is the absorbing state representing system failure, in which the number of defective or failed components reaches the threshold of system failure. The transition probabilities are listed as follows.
(1) If $R_1 \le a \le M_1$, $0 \le b \le M_2$: $P\{X_i = (a, b) \mid X_{i-1} = (a, b)\} = a(N - a - b)(1 - P_1)(1 - P_2)$,
(2) If $0 \le a \le R_1 - 1$, $R_2 \le b \le M_2$: $P\{X_i = (a, b) \mid X_{i-1} = (a, b)\} = a(N - a - b)(1 - P_1)(1 - P_2)$,
(3) If $R_1 \le a \le M_1 - 1$, $0 \le b \le M_2$: $P\{X_i = (a + 1, b) \mid X_{i-1} = (a, b)\} = a(N - a - b)P_1(1 - P_2)$,
(4) If $0 \le a \le R_1 - 1$, $R_2 \le b \le M_2$: $P\{X_i = (a + 1, b) \mid X_{i-1} = (a, b)\} = a(N - a - b)P_1(1 - P_2)$,
(5) If $0 \le a \le R_1 - 1$, $R_2 \le b \le M_2 - 1$: $P\{X_i = (a - 1, b + 1) \mid X_{i-1} = (a, b)\} = a(N - a - b)(1 - P_1)P_2$,
(6) If $R_1 \le a \le M_1$, $0 \le b \le R_2 - 1$: $P\{X_i = (a - 1, b + 1) \mid X_{i-1} = (a, b)\} = a(N - a - b)(1 - P_1)P_2$,
(7) If $a = M_1$, $0 \le b \le M_2$: $P\{X_i = E_{fb} \mid X_{i-1} = (a, b)\} = a(N - a - b)P_1(1 - P_2)$,
(8) If $0 \le a \le M_1$, $b = M_2$: $P\{X_i = E_{fb} \mid X_{i-1} = (a, b)\} = a(N - a - b)(1 - P_1)P_2$.
The state transition probability matrix $\Lambda_2$ can be constructed as follows,
$$\Lambda_2 = \begin{bmatrix} U_2 & W_2 \\ 0 & I_2 \end{bmatrix}_{h_2 \times h_2}.$$
$U_2$ is the matrix of size $(h_2 - 1) \times (h_2 - 1)$ representing the transition probability matrix among the $(h_2 - 1)$ transient states, where $h_2 = (M_1 - R_1 + 1) \times (M_2 + 1) + R_1 \times (M_2 - R_2 + 1) + 1$. The matrix $W_2$ of size $(h_2 - 1) \times 1$ contains the transition probabilities from the transient states to the absorbing state. $I_2$ is the identity matrix.
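As a quick check of the state count, the expression for $h_2$ can be evaluated with the illustrative thresholds of the Figure 1 example ($M_1 = 5$, $M_2 = 4$, $R_1 = 3$, $R_2 = 2$). This is only a plug-in of the formula as stated; it counts grid points and does not remove states that would violate $n_D + n_F \le N$ for a particular $N$.

```latex
\begin{aligned}
h_2 &= (M_1 - R_1 + 1)(M_2 + 1) + R_1 (M_2 - R_2 + 1) + 1 \\
    &= (5 - 3 + 1)(4 + 1) + 3\,(4 - 2 + 1) + 1 \\
    &= 15 + 9 + 1 = 25 .
\end{aligned}
```

Here the first term counts the states of $O_2$, the second the states of $O_3$, and the final $1$ is the single absorbing failure state.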
Based on the transition probability matrix $\Lambda_2$, we can derive the probability that the system is in each state after experiencing $i$ shocks:
$$P_2(i) = \pi_2 (\Lambda_2)^{i - K_1} = \left(P_1^b(i), P_2^b(i), \ldots, P_{h_2}^b(i)\right),$$
where
$$\pi_2 = \left(\frac{P_{h_1+1}^a(i)}{\sum_{j=1}^{h_1'} P_{h_1+j}^a(i)}, \ldots, \frac{P_{h_1+h_1'}^a(i)}{\sum_{j=1}^{h_1'} P_{h_1+j}^a(i)}, 0, \ldots, 0\right)_{1 \times h_2}$$
and $P_x^b(i)$ denotes the probability that the system is in state $x$ ($x = 1, 2, \ldots, h_2$). Here $h_1'$ is the number of absorbing states of $\Lambda_1$. The probability that the system does not fail within $\tau$, given that it exceeds the task termination threshold but does not abort the mission, can then be calculated as
$$P\{L_1 > \tau \mid s \le T < \tau\} = \pi_2 (U_2)^{K_2 - K_1} I_2,$$
where
$$\pi_2 = \left(\frac{P_{h_1+1}^a(i)}{\sum_{j=1}^{h_1'} P_{h_1+j}^a(i)}, \ldots, \frac{P_{h_1+h_1'}^a(i)}{\sum_{j=1}^{h_1'} P_{h_1+j}^a(i)}, 0, \ldots, 0\right)_{1 \times (h_2 - 1)}, \quad I_2 = (1, 1, \ldots, 1)^{T}_{1 \times (h_2 - 1)}.$$
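The construction of $\pi_2$ amounts to renormalizing the probability mass that the first-stage chain has placed on its absorbing states and padding the result with zeros. A minimal sketch follows; the function name, the argument names, and the assumption that the absorbing states of the first stage map, in order, to the first states of the second stage are ours and are not stated in the text.

```python
def conditional_initial_vector(p_a, h1, size):
    """Build the second-stage initial vector from a first-stage distribution.

    p_a  : first-stage state distribution, transient states first
    h1   : number of transient states in the first stage
    size : length of the second-stage vector (padded with zeros)
    Assumption: first-stage absorbing states map, in order, to the first
    entries of the second-stage state ordering.
    """
    absorbing = p_a[h1:]
    total = sum(absorbing)
    if total <= 0:
        raise ValueError("no probability mass on the absorbing states")
    if size < len(absorbing):
        raise ValueError("second-stage vector is too short")
    pi2 = [p / total for p in absorbing]        # condition on having been absorbed
    return pi2 + [0.0] * (size - len(pi2))      # remaining states start with mass 0


if __name__ == "__main__":
    # Hypothetical numbers: 2 transient states, 2 absorbing states.
    print(conditional_initial_vector([0.5, 0.2, 0.1, 0.2], h1=2, size=5))
```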
The probability of mission success can be calculated as follows:
$$\mathrm{MSP} = P\{T > \tau\} + P\{L_1 > \tau \mid s \le T < \tau\} = 1 - \pi_1 (U_1)^{K_1 - 1} W_1 I_1 + \pi_2 (U_2)^{K_2 - K_1} I_2.$$
Denote $L_2$ as the lifespan of the system when it reaches the mission abort threshold and aborts the mission. The state space can be defined as follows:
$$\Omega_3 = O_4 \cup O_5 \cup E_{fc} = \{(n_D, n_F) : R_1 \le n_D \le M_1,\ 0 \le n_F \le M_2,\ n_D + n_F \le N\} \cup \{(n_D, n_F) : 0 \le n_D \le R_1 - 1,\ R_2 \le n_F \le M_2,\ n_D + n_F \le N\} \cup E_{fc},$$
where $O_4$ is the set of states in which the number of defective components has reached the mission abort threshold, whether or not the number of failed components has reached its abort threshold, and $O_5$ is the set of states in which the number of failed components meets the mission abort condition while the number of defective components does not reach the mission abort threshold. $E_{fc}$ is the absorbing state representing system failure, in which the number of defective or failed components reaches the threshold of system failure. The transition probabilities, written with the rescue-phase probabilities $P_1'$ and $P_2'$, are listed as follows.
(1) If $R_1 \le a \le M_1$, $0 \le b \le M_2$: $P\{X_i = (a, b) \mid X_{i-1} = (a, b)\} = a(N - a - b)(1 - P_1')(1 - P_2')$,
(2) If $0 \le a \le R_1 - 1$, $R_2 \le b \le M_2$: $P\{X_i = (a, b) \mid X_{i-1} = (a, b)\} = a(N - a - b)(1 - P_1')(1 - P_2')$,
(3) If $R_1 \le a \le M_1 - 1$, $0 \le b \le M_2$: $P\{X_i = (a + 1, b) \mid X_{i-1} = (a, b)\} = a(N - a - b)P_1'(1 - P_2')$,
(4) If $0 \le a \le R_1 - 1$, $R_2 \le b \le M_2$: $P\{X_i = (a + 1, b) \mid X_{i-1} = (a, b)\} = a(N - a - b)P_1'(1 - P_2')$,
(5) If $0 \le a \le R_1 - 1$, $R_2 \le b \le M_2 - 1$: $P\{X_i = (a - 1, b + 1) \mid X_{i-1} = (a, b)\} = a(N - a - b)(1 - P_1')P_2'$,
(6) If $R_1 \le a \le M_1$, $0 \le b \le R_2 - 1$: $P\{X_i = (a - 1, b + 1) \mid X_{i-1} = (a, b)\} = a(N - a - b)(1 - P_1')P_2'$,
(7) If $a = M_1$, $0 \le b \le M_2$: $P\{X_i = E_{fc} \mid X_{i-1} = (a, b)\} = a(N - a - b)P_1'(1 - P_2')$,
(8) If $0 \le a \le M_1$, $b = M_2$: $P\{X_i = E_{fc} \mid X_{i-1} = (a, b)\} = a(N - a - b)(1 - P_1')P_2'$.
The state transition probability matrix $\Lambda_3$ can be obtained using the state transition probabilities mentioned above.
$$\Lambda_3 = \begin{bmatrix} U_3 & W_3 \\ 0 & I_3 \end{bmatrix}_{h_3 \times h_3}.$$
$U_3$ is the matrix of size $(h_3 - 1) \times (h_3 - 1)$ representing the transition probability matrix among the $(h_3 - 1)$ transient states, where $h_3 = (M_1 - R_1 + 1) \times (M_2 + 1) + R_1 \times (M_2 - R_2 + 1) + 1$. The matrix $W_3$ of size $(h_3 - 1) \times 1$ contains the transition probabilities from the transient states to the absorbing state. $I_3$ is the identity matrix.
Based on the transition probability matrix $\Lambda_3$, we can derive the probability that the system is in each state after experiencing $i$ shocks:
$$P_3(i) = \pi_2 (\Lambda_3)^{i - K_1} = \left(P_1^c(i), P_2^c(i), \ldots, P_{h_3}^c(i)\right),$$
where $P_x^c(i)$ denotes the probability that the system is in state $x$ ($x = 1, 2, \ldots, h_3$). Let $K_3$ denote the total number of shocks that have arrived by time $T + \varphi(T)$ when the system reaches the mission abort threshold and starts the rescue at $T$. The probability that the system does not fail within $T + \varphi(T)$, given that it exceeds the task termination threshold and terminates the mission, can then be calculated as
$$P\{L_2 > T + \varphi(T) \mid T < s\} = \pi_2 (U_3)^{K_3 - K_1} I_2.$$
Thus, $\mathrm{SSP}$ can be obtained as follows.
$$\mathrm{SSP} = P\{T > \tau\} + P\{L_1 > \tau \mid s \le T < \tau\} + P\{L_2 > T + \varphi(T) \mid T < s\} = 1 - \pi_1 (U_1)^{K_1 - 1} W_1 I_1 + \pi_2 (U_2)^{K_2 - K_1} I_2 + \pi_2 (U_3)^{K_3 - K_1} I_2.$$
Because of the dependency between $K_i$ ($i = 1, 2, 3$) and $T$, it is difficult to obtain analytical solutions for $\mathrm{MSP}$ and $\mathrm{SSP}$. To obtain the optimal strategy, the Monte-Carlo simulation method is used. The Monte-Carlo simulation method, also known as the random sampling statistical test method, is a branch of experimental science. The simulation flowchart is shown in Figure 2. The general steps of the simulation are listed as follows.
Step 1: Initialize the parameters of the model;
Step 2: Generate variables to simulate the shock process and the impact of the shock;
Step 3: Judge whether the time exceeds $\tau$;
Step 4: Simulate the shock arrival process and the changes in component states;
Step 5: Judge whether the system reaches the mission abort threshold;
Step 6: Decide whether to abort the mission;
Step 7: Judge whether the system meets the failure threshold;
Step 8: Obtain the result of one simulation round;
Step 9: Derive the probability of mission success and the probability of system survival based on all simulation results.
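The steps above can be sketched in Python as follows. This is a minimal illustration under our own assumptions, not the authors' implementation: shocks arrive as a homogeneous Poisson process with rate `lam`, each component reacts independently and moves at most one level per shock, every threshold counts as met once the count reaches it, shocks during the rescue use separate probabilities `p1_rescue` and `p2_rescue`, and the mission is aborted only if the rescue time `phi(t)` does not exceed the remaining mission time. All function and parameter names are hypothetical.

```python
import random


def simulate_mission(rng, N, tau, lam, p1, p2, p1_rescue, p2_rescue,
                     R1, R2, M1, M2, phi):
    """Simulate one mission; return (mission_success, system_survives)."""
    normal, defective, failed = N, 0, 0
    t, horizon, aborted = 0.0, tau, False
    q1, q2 = p1, p2                       # per-shock degradation probabilities in use
    while True:
        t += rng.expovariate(lam)         # exponential gap until the next shock
        if t > horizon:                   # mission (or rescue) ends before that shock
            return (not aborted), True
        # Each component reacts independently, at most one level per shock.
        to_defective = sum(rng.random() < q1 for _ in range(normal))
        to_failed = sum(rng.random() < q2 for _ in range(defective))
        normal -= to_defective
        defective += to_defective - to_failed
        failed += to_failed
        if defective >= M1 or failed >= M2:
            return False, False           # system failure ends the run
        if not aborted and (defective >= R1 or failed >= R2):
            rescue_time = phi(t)
            if tau - t >= rescue_time:    # abort only if the rescue is not longer
                aborted, horizon = True, t + rescue_time
                q1, q2 = p1_rescue, p2_rescue


def estimate_msp_ssp(n_runs, seed, **params):
    """Estimate (MSP, SSP) as sample frequencies over n_runs simulated missions."""
    rng = random.Random(seed)
    successes = survivals = 0
    for _ in range(n_runs):
        ok, alive = simulate_mission(rng, **params)
        successes += ok
        survivals += alive
    return successes / n_runs, survivals / n_runs


if __name__ == "__main__":
    # Parameter values loosely follow the case study; R1, R2 and p1 are examples.
    params = dict(N=5, tau=10.0, lam=1.0, p1=0.1, p2=0.25,
                  p1_rescue=0.05, p2_rescue=0.1,
                  R1=4, R2=1, M1=5, M2=4, phi=lambda t: 0.3 * t)
    print(estimate_msp_ssp(5000, seed=1, **params))
```

Because a successful mission implies a surviving system in this sketch, the estimated MSP never exceeds the estimated SSP; with abort thresholds that can never be reached, the two estimates coincide.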

3.2. Optimization Model

Intuitively, when the mission abort condition is easier to meet, the system survival probability increases while the mission success probability decreases. To balance this trade-off between mission reliability and system survivability, an optimization model is constructed as follows. The cost of the system consists of two parts: mission failure cost and system failure cost. Denote $C_u$ and $C_d$ as the cost of mission failure and the cost of system failure, respectively. The objective of the model is to find the mission abort thresholds that minimize the total expected cost. In addition, system survival probability is important for safety-critical systems, so it is included in the optimization model as a constraint. The values of $\mathrm{MSP}$ and $\mathrm{SSP}$ depend on $R_1$ and $R_2$. The optimization model can then be constructed as follows, where $\overline{\mathrm{SSP}}$ is a preset value of the desired $\mathrm{SSP}$.
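$$\min\ C(R_1, R_2) = C_u \left[1 - \mathrm{MSP}(R_1, R_2)\right] + C_d \left[1 - \mathrm{SSP}(R_1, R_2)\right]$$
$$\text{s.t.}\quad \mathrm{SSP}(R_1, R_2) \ge \overline{\mathrm{SSP}}, \quad 1 \le R_1 \le M_1, \quad 1 \le R_2 \le \min\{R_1, M_2\}.$$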
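Since the feasible set of $(R_1, R_2)$ is small, the model can be solved by plain enumeration. The sketch below assumes a user-supplied function `evaluate(R1, R2)` that returns estimates of (MSP, SSP), for instance from a simulation; the function names and the toy evaluator with its numbers are hypothetical and serve only to make the example runnable.

```python
def optimize_thresholds(evaluate, M1, M2, C_u, C_d, ssp_min):
    """Enumerate feasible (R1, R2) and return (cost, R1, R2) with minimal cost.

    evaluate(R1, R2) must return a pair (MSP, SSP).
    Returns None if no pair satisfies the SSP constraint.
    """
    best = None
    for R1 in range(1, M1 + 1):
        for R2 in range(1, min(R1, M2) + 1):
            msp, ssp = evaluate(R1, R2)
            if ssp < ssp_min:                      # survival constraint violated
                continue
            cost = C_u * (1.0 - msp) + C_d * (1.0 - ssp)
            if best is None or cost < best[0]:
                best = (cost, R1, R2)
    return best


def toy_evaluate(R1, R2):
    """Made-up closed form: later aborts raise MSP and lower SSP."""
    msp = 0.1 * R1 + 0.05 * R2
    ssp = 1.0 - 0.02 * R1 - 0.03 * R2
    return msp, ssp


if __name__ == "__main__":
    print(optimize_thresholds(toy_evaluate, M1=5, M2=4,
                              C_u=2000, C_d=5000, ssp_min=0.85))
```

When `evaluate` is a Monte-Carlo estimator, the reported optimum inherits the sampling noise of the estimates, so nearby threshold pairs with similar cost may be indistinguishable without more simulation runs.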

4. Case Study

4.1. Background

Consider a military unmanned aerial vehicle (UAV) consisting of $N = 5$ engines operating in a harsh battlefield environment. The UAV is designed to perform a reconnaissance mission within a specific time period $\tau = 10$. Each engine in the UAV can be in one of three possible states: normal, defective, and failed. The external shocks that affect the engines in the UAV are caused by gunfire, explosions, and other hostile actions on the battlefield. The arrival of shocks follows a homogeneous Poisson process with parameter $\lambda = 1$. Each shock can cause an engine to deteriorate from the normal state to the defective state with probability $P_1$, or from the defective state to the failed state with a higher probability $P_2 = 0.25$. Furthermore, the impacts of random shocks on the engines are independent of each other. The UAV will crash if the number of defective engines reaches $M_1 = 5$ or the number of failed engines reaches $M_2 = 4$.
The abort criteria for the UAV's mission are met when the number of defective components reaches $R_1$ or the number of failed components reaches $R_2$. If the abort condition is met, the mission is immediately aborted, and the rescue is started to save the UAV. The time required for rescue is $\varphi(t) = 0.3t$ when the mission is aborted at time $t$. The UAV system remains in its original state until the rescue is completed. During the rescue, the probability of a shock deteriorating the state of a defective component is $P_2' = 0.1$, higher than the corresponding probability $P_1' = 0.05$ for a normal component. The mission continues to be executed if the time required to complete the rest of the mission is less than the time for rescue. The cost of UAV failure is $C_d = 5000$, and the cost of mission failure is $C_u = 2000$.
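The case study does not state the value of the time limit $s$ explicitly. Under the assumption that $s$ is the time at which the remaining mission time equals the rescue time (our reading of the continuation rule above), it can be worked out from the stated rescue function and mission length as follows.

```latex
\begin{aligned}
\tau - s &= \varphi(s) = 0.3\,s \\
s &= \frac{\tau}{1.3} = \frac{10}{1.3} \approx 7.69 .
\end{aligned}
```

Under this reading, an abort condition first met after $t \approx 7.69$ would not trigger a rescue, because the remaining mission time $\tau - t$ would be shorter than the rescue time $0.3t$.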
For a given $P_1$, we can derive the corresponding mission reliability and system survivability, and then obtain the total expected cost for different $R_1$ and $R_2$. Based on the optimization model constructed in Section 3.2, we use the one-dimensional search method to find the optimal values of $R_1$ and $R_2$ that minimize the total expected cost.

4.2. Result Discussion

The optimal values of $R_1$ and $R_2$ for different values of $P_1$, the probability that a component degrades from the normal state to the defective state, are shown in Table 1. It can be seen that, as $P_1$ increases, $R_1$ changes from 4 to 3 and $R_2$ remains 1. These changes in $R_1$ can be explained as follows: the probability of system state degradation decreases after the rescue starts, and the probability of components changing from the normal state to the defective state grows as $P_1$ increases, so the mission should be aborted sooner to prevent system failure. Under the current parameter setting, when $P_1$ is 0.1, the optimal value of $R_2$ is 1. As $P_1$ changes from 0.1 to 0.2, a smaller $R_2$ would be preferable, but $R_2 = 1$ is already the minimum admissible value, so $R_2$ remains constant at 1. The values of $\mathrm{MSP}$ and $\mathrm{SSP}$ in terms of $P_1$ are shown in Figure 3. It can be seen that as $P_1$ increases, $\mathrm{MSP}$ decreases and $\mathrm{SSP}$ increases. The reason is that a larger value of $P_1$ leads to a higher probability of mission abort.
Comparisons of the optimal expected total cost with and without a mission abort policy are illustrated in Figure 4. The optimal mission abort policy substantially reduces the expected total cost compared to the absence of an abort policy. When $P_1 = 0.12$, for example, the expected total cost with a mission abort decision is 2174.02, about 46% less than the expected total cost of 4019.88 without a mission abort. Therefore, adopting the mission abort policy is worthwhile for minimizing the expected total cost.

5. Conclusions

This paper investigates the optimal mission abort policy with multiple abort criteria for multi-component systems operating in random environments. By incorporating multi-level transition probabilities, a shock model is utilized to characterize the impact of the environment. Mission abort policies are designed based on the number of defective components and failed components. Due to the dependence between variables, we use the Monte Carlo simulation method to derive the mission reliability and system survivability. In order to balance the trade-off between these two indexes, the optimization model is constructed to find the optimal mission abort thresholds. For comparison purposes, the total expected cost under a heuristic policy without aborting is obtained. It is found that the mission abort action can effectively reduce the total expected cost.
There are a number of extensions to the current study that deserve attention. Firstly, single mission execution is assumed in this model; one practical extension is to consider the optimal mission abort policy for systems performing multiple tasks or one task with multiple execution chances. Secondly, internal degradation can also be incorporated into this model to refine the state transition process of the considered system. Finally, preventive maintenance activities before mission abort or system failure are worth investigating because these actions can improve both mission reliability and system survivability.

Author Contributions

Conceptualization, X.C., B.C. and X.Z.; methodology, X.C. and X.Z.; software, B.C.; validation, X.C., B.C. and X.Z.; formal analysis, X.C.; investigation, X.C. and B.C.; writing—original draft preparation, X.C. and B.C.; writing—review and editing, X.Z.; visualization, B.C.; supervision, X.Z.; funding acquisition, X.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by Science and Technology Innovation Project of Beijing Institute of Technology (Grant Nos. 2023CX01028 and LY2022-23).

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
RP: Rescue procedure
PM: Primary mission
MSP: Mission success probability
SSP: System survival probability

References

  1. Qiu, Q.; Cui, L.; Gao, H.; Yi, H. Optimal allocation of units in sequential probability series systems. Reliab. Eng. Syst. Saf. 2018, 169, 351–363. [Google Scholar] [CrossRef]
  2. Filene, R.; Daly, W. The reliability impact of mission abort strategies on redundant flight computer systems. IEEE Trans. Comput. 1974, 100, 739–743. [Google Scholar] [CrossRef]
  3. Levitin, G.; Xing, L.; Dai, Y. Mission abort policy in heterogeneous nonrepairable 1-out-of-N warm standby systems. IEEE Trans. Reliab. 2018, 67, 342–354. [Google Scholar] [CrossRef]
  4. Levitin, G.; Finkelstein, M. Optimal mission abort policy for systems in a random environment with variable shock rate. Reliab. Eng. Syst. Saf. 2018, 169, 11–17. [Google Scholar] [CrossRef]
  5. De Jonge, B.; Scarf, P.A. A review on maintenance optimization. Eur. J. Oper. Res. 2020, 285, 805–824. [Google Scholar] [CrossRef]
  6. Keizer, M.C.O.; Flapper, S.D.P.; Teunter, R.H. Condition-based maintenance policies for systems with multiple dependent components: A review. Eur. J. Oper. Res. 2017, 261, 405–420. [Google Scholar] [CrossRef]
  7. Wang, X.; Zhou, H.; Parlikad, A.K.; Xie, M. Imperfect Preventive Maintenance Policies With Unpunctual Execution. IEEE Trans. Reliab. 2020, 69, 1480–1492. [Google Scholar] [CrossRef]
  8. Zhao, X.; Sun, J.; Qiu, Q.; Chen, K. Optimal inspection and mission abort policies for systems subject to degradation. Eur. J. Oper. Res. 2021, 292, 610–621. [Google Scholar] [CrossRef]
  9. Myers, A. Probability of Loss Assessment of Critical k-Out-of-n: G Systems Having a Mission Abort Policy. IEEE Trans. Reliab. 2009, 58, 694–701. [Google Scholar] [CrossRef]
  10. Yang, L.; Sun, Q.; Ye, Z.S. Designing mission abort strategies based on early-warning information: Application to UAV. IEEE Trans. Ind. Inform. 2019, 16, 277–287. [Google Scholar] [CrossRef]
  11. Qiu, Q.; Maillart, L.M.; Prokopyev, O.A.; Cui, L. Optimal condition-based mission abort decisions. IEEE Trans. Reliab. 2022, 72, 408–425. [Google Scholar] [CrossRef]
  12. Levitin, G.; Finkelstein, M.; Huang, H.Z. Optimal Abort Rules for Multiattempt Missions. Risk Anal. 2019, 39, 2732–2743. [Google Scholar] [CrossRef] [PubMed]
  13. Zhao, X.; Fan, Y.; Qiu, Q.; Chen, K. Multi-criteria mission abort policy for systems subject to two-stage degradation process. Eur. J. Oper. Res. 2021, 295, 233–245. [Google Scholar] [CrossRef]
  14. Yang, L.; Chen, Y.; Qiu, Q.; Wang, J. Risk control of mission-critical systems: Abort decision-makings integrating health and age conditions. IEEE Trans. Ind. Inform. 2022, 18, 6887–6894. [Google Scholar] [CrossRef]
  15. Levitin, G.; Xing, L.; Dai, Y. Mission aborting and system rescue for multi-state systems with arbitrary structure. Reliab. Eng. Syst. Saf. 2022, 219, 108225. [Google Scholar] [CrossRef]
  16. Zhao, X.; Chai, X.; Sun, J.; Qiu, Q. Optimal bivariate mission abort policy for systems operate in random shock environment. Reliab. Eng. Syst. Saf. 2020, 204, 107244. [Google Scholar] [CrossRef]
  17. Gut, A. Cumulative shock models. Adv. Appl. Probab. 1990, 22, 504–507. [Google Scholar] [CrossRef]
  18. Cha, J.H.; Finkelstein, M. On new classes of extreme shock models and some generalizations. J. Appl. Probab. 2011, 48, 258–270. [Google Scholar] [CrossRef]
  19. Shanthikumar, J.G.; Sumita, U. General shock models associated with correlated renewal sequences. J. Appl. Probab. 1983, 20, 600–614. [Google Scholar] [CrossRef]
  20. Eryilmaz, S. Assessment of a multi-state system under a shock model. Appl. Math. Comput. 2015, 269, 1–8. [Google Scholar] [CrossRef]
  21. Mallor, F.; Omey, E.; Santos, J. Asymptotic results for a run and cumulative mixed shock model. J. Math. Sci. 2006, 138, 5410–5414. [Google Scholar] [CrossRef]
  22. Eryilmaz, S.; Bayramoglu, K. Life behavior of δ-shock models for uniformly distributed interarrival times. Stat. Pap. 2014, 55, 841–852. [Google Scholar] [CrossRef]
  23. Zhao, X.; Guo, X.; Wang, X. Reliability and maintenance policies for a two-stage shock model with self-healing mechanism. Reliab. Eng. Syst. Saf. 2018, 172, 185–194. [Google Scholar] [CrossRef]
  24. Zhao, X.; Wang, S.; Wang, X.; Cai, K. A multi-state shock model with mutative failure patterns. Reliab. Eng. Syst. Saf. 2018, 178, 1–11. [Google Scholar] [CrossRef]
  25. Liu, Y.; Chen, Y.; Jiang, T. Dynamic selective maintenance optimization for multi-state systems over a finite horizon: A deep reinforcement learning approach. Eur. J. Oper. Res. 2020, 283, 166–181. [Google Scholar] [CrossRef]
  26. Zhao, X.; Wang, S.; Wang, X.; Fan, Y. Multi-state balanced systems in a shock environment. Reliab. Eng. Syst. Saf. 2020, 193, 106592. [Google Scholar] [CrossRef]
  27. Levitin, G.; Finkelstein, M.; Huang, H.Z. Optimal mission abort policies for multistate systems. Reliab. Eng. Syst. Saf. 2020, 193, 106671. [Google Scholar] [CrossRef]
  28. Levitin, G.; Finkelstein, M.; Xiang, Y. Optimal aborting rule in multi-attempt missions performed by multicomponent systems. Eur. J. Oper. Res. 2020, 283, 244–252. [Google Scholar] [CrossRef]
  29. Wang, J.; Wang, R.; Han, X. Degradation modeling and reliability estimation for competing risks considering system resistance. Comput. Ind. Eng. 2023, 176, 108950. [Google Scholar] [CrossRef]
Figure 1. Possible cases of mission execution when $M_1 = 5$, $M_2 = 4$, $R_1 = 3$, and $R_2 = 2$.
Figure 2. Monte-Carlo simulation flowchart for deriving the mission success probability MSP and system survival probability SSP.
Figure 3. The MSP and SSP in terms of $P_1$.
Figure 4. Comparison of the expected total cost of the optimal policies with and without mission abort.
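Table 1. The optimal solutions of $R_1$ and $R_2$ in terms of $P_1$.

| $P_1$ | 0.1 | 0.11 | 0.12 | 0.13 | 0.14 | 0.15 | 0.16 | 0.17 | 0.18 | 0.19 | 0.2 |
|-------|-----|------|------|------|------|------|------|------|------|------|-----|
| $R_1$ | 4 | 4 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| $R_2$ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |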

Chai, X.; Chen, B.; Zhao, X. Optimal Mission Abort Decisions for Multi-Component Systems Considering Multiple Abort Criteria. Mathematics 2023, 11, 4922. https://doi.org/10.3390/math11244922
