Optimal Mission Abort Decisions for Multi-Component Systems Considering Multiple Abort Criteria

Chai, Xiaofei; Chen, Boyu; Zhao, Xian

doi:10.3390/math11244922


by Xiaofei Chai ¹, Boyu Chen ² and Xian Zhao ¹,*

¹ School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
² Brandeis International Business School, Brandeis University, Waltham, MA 02454, USA
* Author to whom correspondence should be addressed.

Mathematics 2023, 11(24), 4922; https://doi.org/10.3390/math11244922

Submission received: 4 November 2023 / Revised: 21 November 2023 / Accepted: 5 December 2023 / Published: 11 December 2023

(This article belongs to the Special Issue System Reliability and Quality Management in Industrial Engineering)


Abstract

This paper studies the optimal mission abort decisions for safety-critical mission-based systems with multiple components. The considered system operates in a random shock environment and is required to accomplish a mission during a fixed mission period. If the failure risk of the system is very high, the main mission can be aborted to avoid higher failure cost. The main contribution of this study lies in the design and optimization of mission abort policies for multi-component systems with multiple abort criteria. Moreover, multi-level transitions are considered in this study to characterize the different shock-resistance abilities for components in different states. Mission abort decisions are determined based on the number of components in either defective or failed state. The problem is formulated in the framework of the finite Markov chain imbedding method. We use the Monte-Carlo simulation method to derive the mission reliability and system survivability. Numerical studies and sensitivity analysis are presented to validate the obtained result.

Keywords: mission abort; system survivability; multi-component system; mission reliability

MSC: 37M05

1. Introduction

Systems deployed in critical applications, such as underwater vehicles, high-speed trains, chemical reactors, aircraft fleets, wind turbines, and high-voltage power cables, are subject to system malfunction during mission execution, which can cause environmental pollution, economic loss, and even fatalities. Therefore, the execution of a rescue procedure (RP) to ensure system survival becomes necessary when certain deteriorating conditions are met [1,2]. For example, an aircraft with multiple engines can abort the primary mission (PM) and make a precautionary emergency landing when certain engines fail [3]. The degradation status of each engine can be detected through sophisticated sensors in real-time. When the detected engine states indicate a high system failure risk, the operators can decide to abort the mission and start the rescue procedure immediately. This decision process is feasible due to the rapid development of the Internet of Things.

For safety-critical systems with a possibility of mission abort, two distinct performance measures should be balanced: mission success probability (MSP) and system survival probability (SSP). Mission success probability is defined as the probability of mission success, while system survival probability measures the probability of system non-failure during mission execution or RP. The mission abort improves SSP but leads to a reduction in MSP [4]. To strike a balance between MSP and SSP, a significant body of research has been dedicated recently to modeling and optimizing mission abort policies with single or multiple criteria.

The design and optimization of mission abort policies have lately received a lot of attention due to their practical and theoretical importance. Various mission abort optimization models have been intensively explored, which can be divided into age-based and condition-based models [5]. Condition-based risk control policies outperform age-based policies because they make effective use of in-situ degradation information [6]. Despite their enormous theoretical and practical ramifications, condition-based mission abort policies have received little attention [7]. When a certain deterioration condition defined by the mission abort policy is met, the system aborts its PM, which is followed by an RP. Zhao et al. [8] investigated the dynamic mission abort decision-making problem for regularly inspected systems by using the Markov decision process.

Mission abort policies with a single criterion have been extensively studied. Mayer investigated the optimal mission abort policies based on the number of failed components for k-out-of-n: G systems [9]. Ref. [10] designed mission abort strategies based on early-warning information. Ref. [11] studied optimal condition-based mission abort decisions based on the defective duration. However, in many real situations, the operation of safety-critical systems is affected by more than one factor. For example, a UAV suffers not only external shocks from the random environment but also internal degradation caused by the working load. Therefore, the mission abort decision should fully incorporate these factors. Mission abort policies with multiple criteria have also been extensively studied. Levitin et al. studied a mission abort policy based on the number of failed components and the elapsed mission time [12]. Ref. [13] investigated a mission abort policy based on the defective duration and the level of degradation. Yang et al. studied abort decision-making based on the level of degradation and the age of the system [14]. Levitin et al. studied a mission abort policy based on performance constraints and system state subsets [15]. Zhao et al. investigated a mission abort policy based on the cumulative number of valid shocks and the number of consecutive valid shocks [16].

By reviewing existing studies, we found that many practical engineering systems operate in a shock environment, and external shocks are often directly related to system failures [17]. Shock models can be classified into five categories: the cumulative shock model [18], the extreme shock model [19,20], the run shock model [21], the δ-shock model [22,23], and the mixed shock model. In the extreme shock model, a system breaks down because of an individual shock whose magnitude exceeds a critical level [24]. The existing shock models mostly consider binary-state systems and components [25], whereas practical engineering systems and components often have more than two states, ranging from perfect to entirely failed [26]. A multi-state system under an extreme shock model was first proposed in [20]. Ref. [27] explored mission abort policies for multi-state systems, which may operate in intermediate states with varied PL. In this paper, each component has three states: perfect, defective, and failed. When a shock arrives, a component transfers to the adjacent worse state with a certain probability.

Due to the harsh shock environment, the system state deteriorates as the system ages. Each shock increases the probability of failure, resulting in multi-level transition probabilities [28]. For example, some systems can resist the damage of shocks due to their material, structure, or affiliated devices. This resistance to shocks also deteriorates as the system degrades, making failures more likely, and it can be described as a function of the system state or the system age [29]. Thus, when the failure risk of the system increases to a certain threshold, it is reasonable to abort the mission and execute a rescue. This paper assumes that the probability of transferring from the defective state to the failed state is larger than the probability of transferring from the perfect state to the defective state.

Overall, mission abort policies that consider failure propagation for multi-component systems in a shock environment have not been studied sufficiently. Hence, this study proposes an optimal mission abort policy with multiple abort criteria for multi-component systems. The system is considered to have failed when the number of defective components or the number of failed components exceeds the predefined critical threshold during the execution of a mission. A mission abort policy is therefore implemented when the number of defective components or the number of failed components reaches the mission abort criteria, so as to improve mission reliability and system survival probability while minimizing the average cost. The contribution of this study is threefold:

• Multi-level transition probabilities, which indicate that defective components deteriorate to a worse state with a higher probability than normal components, are considered in this study;
• Both the number of defective components and the number of failed components affect the mission abort activity and the rescue procedure;
• A cost minimization model that balances mission reliability and system survivability is constructed to determine the optimal mission abort decision parameters.

The organization of this study is described as follows. Section 2 formulates a mission abort policy with multiple criteria for multi-component systems in a shock environment. Section 3 derives the closed-form of mission reliability and system survival probability by using the finite Markov chain embedding method. The total average cost minimization model is constructed to find the optimal mission abort policy. In Section 4, an illustrative example is presented to verify the availability and efficiency of the proposed mission policy. Section 5 discusses the conclusions and some possible future research directions.

2. Problem Formulation

In this paper, we consider a multi-component system that operates in a shock environment and is required to perform a mission while maintaining normal functioning within a time period $\tau$. Each component in the system can be in one of three possible states: normal, defective, and failed. When the number of defective components in the system is higher than $M_{1}$ or the number of failed components is higher than $M_{2}$, the system fails. External shocks randomly affect the components in the system. Specifically, each external shock deteriorates a component's state by one level (e.g., from normal to defective) with a certain probability and otherwise has no effect on that component. Because normal and defective components differ in their resilience against shocks, we assume that a shock is more likely to deteriorate a defective component than a normal one. The impacts of random shocks on the $N$ components in the system are independent of each other. The mission abort criteria are met when the number of defective components reaches a predefined threshold $R_{1}$ and/or the number of failed components exceeds a certain threshold $R_{2}$. If the abort condition is met before time $s$, the mission is immediately aborted and the rescue procedure is started to save the system. The time required for rescue depends on these two abort thresholds and the states of the components, and the system stays in its original state until the rescue procedure is completed. Otherwise, if the abort condition is met after time $s$, the mission continues because the time required to complete the rest of the mission is less than the time required for a successful rescue.

Figure 1 shows an illustrative example of the proposed mission abort policy with four possible cases when $M_{1} = 5$, $M_{2} = 4$, $R_{1} = 3$, and $R_{2} = 2$. In Case 1, the mission is aborted and the rescue procedure is started at time $t_{1,k}$ with duration $t_{1}$ because the number of defective components reaches $R_{1}$. In Case 2, the number of failed components exceeds $R_{2}$, so the mission is aborted and the rescue procedure is triggered with duration $t_{2}$. The mission fails in these two cases, while the system survives. In Case 3, although the abort condition is satisfied at time $t_{3,k}$, the mission is not aborted because $t_{3,k} > s$, which means that the remaining time $\tau - t_{3,k}$ is not enough to accomplish the rescue procedure. The system fails at time $t_{2,n}$ because the number of failed components reaches $M_{2}$ before the mission is completed at time $\tau$. In Case 4, the mission is successfully finished and the system keeps functioning until time $\tau$.
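The decision logic behind these four cases can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes that a threshold is "reached" when the count is greater than or equal to it, and it takes the rescue-time function `phi` as an input (the case study in Section 4 uses φ(t) = 0.3t). The function name and the returned labels are hypothetical.

```python
def decide(n_def, n_fail, t, tau, phi, M1=5, M2=4, R1=3, R2=2):
    """Return the action at time t given the counts of defective/failed components."""
    # System failure takes precedence over any abort decision.
    if n_def >= M1 or n_fail >= M2:
        return "system_failed"
    # Abort condition: either abort threshold is reached (assumed to mean >=).
    if n_def >= R1 or n_fail >= R2:
        # Abort only if the rescue fits into the remaining mission time.
        if phi(t) <= tau - t:
            return "abort_and_rescue"
        return "continue_too_late_to_abort"
    return "continue"


phi = lambda t: 0.3 * t                  # rescue duration used in Section 4
print(decide(3, 0, 2.0, 10.0, phi))      # situation similar to Case 1
print(decide(0, 2, 9.0, 10.0, phi))      # situation similar to Case 3
```

Checking system failure before the abort condition mirrors Case 3, where a system that could no longer abort eventually fails.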

3. Reliability Evaluation of the System

In this section, we analyze the reliability indexes of the considered multi-component system. Mission reliability and system survivability are derived by using the Markov chain embedding approach. To balance these two indexes, an optimization model with the objective of minimizing the total expected cost is then constructed.

3.1. Mission Reliability and System Survivability

Let $K_{1}$ denote the total number of shocks that have arrived when the system meets the mission abort condition, and let the random variable $T$ denote the time at which the system reaches the mission abort condition. Then, $T$ can be expressed as $T = \sum_{i=1}^{K_{1}} Y_{i}$, where $Y_{i}$ is the time interval between the $(i-1)$-th shock and the $i$-th shock. $P_{1}$ is the probability that a component degrades from the normal state to the defective state after a shock arrives, and $P_{2}$ is the probability that a component transitions from the defective state to the failed state after being subjected to a shock. $P_{1}'$ and $P_{2}'$ denote the corresponding probabilities that a component degrades to the adjacent worse state after a shock arrives during the rescue process.

Denote by $N_{D}^{i}$ and $N_{F}^{i}$ the number of components in the defective state and in the failed state, respectively, after the arrival of the $i$-th shock. Then, the Markov chain can be constructed as follows:

$$X_{i} = (N_{D}^{i}, N_{F}^{i}), \quad i = 1, 2, \dots \tag{1}$$

The corresponding state space is

$$\Omega_{1} = \{O_{1}\} \cup \{E_{f}^{a}\} = \{(n_{D}, n_{F}) : 0 \leq n_{D} \leq R_{1} - 1,\ 0 \leq n_{F} \leq R_{2} - 1,\ n_{D} + n_{F} \leq N\} \cup \{E_{f}^{a}\} \tag{2}$$

where $O_{1}$ represents the set of states that have not reached the mission abort threshold, and $E_{f}^{a}$ is the absorbing state indicating that the system meets the mission abort condition. The transition probabilities are listed as follows.

(1) If $0 \leq a < R_{1}$, $0 \leq b < R_{2}$, then $P\{X_{i} = (a, b) \mid X_{i-1} = (a, b)\} = a (n - a - b)(1 - P_{1})(1 - P_{2})$,
(2) If $0 \leq a < R_{1} - 1$, $0 \leq b < R_{2}$, then $P\{X_{i} = (a + 1, b) \mid X_{i-1} = (a, b)\} = a (n - a - b) P_{1} (1 - P_{2})$,
(3) If $0 \leq a < R_{1}$, $0 \leq b < R_{2} - 1$, then $P\{X_{i} = (a - 1, b + 1) \mid X_{i-1} = (a, b)\} = a (n - a - b)(1 - P_{1}) P_{2}$.

Since the probability distribution of the different absorbing states affects the initial distribution of the next stage, it is necessary to write down all possible states of the absorbing state and the corresponding probability. The following shows the transition probability of the system from the working state to the absorbing state.

(4) If $a = R_{1} - 1$, $0 \leq b < R_{2} - 1$, then $P\{X_{i} = (R_{1}, b) \mid X_{i-1} = (a, b)\} = a (n - a - b) P_{1} (1 - P_{2})$,
(5) If $0 \leq a \leq R_{1} - 1$, $b = R_{2} - 1$, then $P\{X_{i} = (a - 1, R_{2}) \mid X_{i-1} = (a, b)\} = a (n - a - b)(1 - P_{1}) P_{2}$.
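To make the state space concrete, the following Python sketch enumerates the pairs $(n_{D}, n_{F})$ of Eq. (2) and the boundary states at which an abort threshold is first reached. It reflects one reading of the definitions above, not the authors' code: the function name is hypothetical, and the absorbing states are assumed to be the pairs with $n_{D} = R_{1}$ or $n_{F} = R_{2}$.

```python
def omega1_states(N, R1, R2):
    """Enumerate transient and absorbing (n_D, n_F) pairs for the first stage."""
    # Transient states: neither abort threshold has been reached yet (Eq. (2)).
    transient = [(d, f) for d in range(R1) for f in range(R2) if d + f <= N]
    # Assumed absorbing states: the defective count has just reached R1 ...
    by_defective = [(R1, f) for f in range(R2) if R1 + f <= N]
    # ... or the failed count has just reached R2.
    by_failed = [(d, R2) for d in range(R1) if d + R2 <= N]
    return transient, by_defective + by_failed


# Thresholds from the Figure 1 example; N = 5 as in the case study of Section 4.
transient, absorbing = omega1_states(N=5, R1=3, R2=2)
print(len(transient), transient)
print(len(absorbing), absorbing)
```

For these values the constraint $n_{D} + n_{F} \leq N$ removes no pair, so the sketch yields $R_{1} R_{2}$ transient pairs and $R_{1} + R_{2}$ absorbing pairs; for smaller $N$ the counts can be lower.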

Based on the above transition rules, the one-step transition probability matrix $\Lambda_{1}$ can be constructed as follows,

$$\Lambda_{1} = \begin{bmatrix} U_{1} & W_{1} \\ 0 & I_{1} \end{bmatrix} \tag{3}$$

where the matrix $U_{1}$ of size $h_{1} \times h_{1}$ is the transition probability matrix among the $h_{1}$ transient states, with $h_{1} = R_{1} \times R_{2}$. The matrix $W_{1}$ of size $h_{1} \times h_{1}'$ is the transition probability matrix from the transient states to the absorbing states, with $h_{1}' = R_{1} + R_{2}$. The matrix $I_{1}$ is an identity matrix of size $h_{1}' \times h_{1}'$, which is the one-step transition matrix among the absorbing states.

According to the state transition probability matrix $\Lambda_{1}$, the probability that the system is in each state of the Markov chain after suffering $i$ shocks can be obtained as

$$P_{1}(i) = \pi_{1} (\Lambda_{1})^{i} = (P_{1}^{a}(i), P_{2}^{a}(i), \dots, P_{h_{1}}^{a}(i), P_{h_{1}+1}^{a}(i), \dots, P_{h_{1}+h_{1}'}^{a}(i)), \tag{4}$$

where $P_{x}^{a}(i)$ denotes the probability that the system is in Markov chain state $x$ $(x = 1, 2, \dots, h_{1} + h_{1}')$ after suffering the $i$-th shock, $\pi_{1}$ is the initial state probability vector of the system, and $I_{1}' = (1, 1, \dots, 1)^{T}$ is the all-ones column vector of size $h_{1}' \times 1$. The probability that the system reaches the mission abort threshold within $\tau$ is as follows,

$$F_{T}(\tau) = P\{T \leq \tau\} = \pi_{1} (U_{1})^{K_{1}-1} W_{1} I_{1}'. \tag{5}$$
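The block structure of Eq. (3) is that of a standard absorbing Markov chain, for which the product $\pi U^{k-1} W \mathbf{1}$ (the same form as the right-hand side of Eq. (5)) is the probability that absorption occurs exactly at step $k$. The sketch below illustrates this on a small hypothetical chain with two transient states and one absorbing state; the matrices are made up for illustration and are not derived from the transition rules of this paper.

```python
def vec_mat(v, M):
    """Row vector times matrix (pure Python, no third-party dependencies)."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]


def absorption_pmf(pi, U, W, k_max):
    """P(absorption occurs exactly at step k) = pi U^(k-1) W 1, for k = 1..k_max."""
    pmf, v = [], list(pi)
    for _ in range(k_max):
        pmf.append(sum(vec_mat(v, W)))   # multiply by W, then by the all-ones vector
        v = vec_mat(v, U)                # advance one step among the transient states
    return pmf


# Hypothetical chain: each row of [U | W] sums to one.
U = [[0.7, 0.2],
     [0.0, 0.6]]
W = [[0.1],
     [0.4]]
pi = [1.0, 0.0]

pmf = absorption_pmf(pi, U, W, 50)
print(pmf[:3], sum(pmf))
```

Summing the step-wise probabilities gives the probability of absorption within a given number of steps, which approaches one as the number of steps grows.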

Let $L_{1}$ denote the lifespan of the system when it reaches the mission abort threshold but does not abort the mission, and let $K_{2}$ be the total number of shocks that the system experiences during $\tau$. The state space can be defined as follows:

$$\begin{aligned} \Omega_{2} &= \{O_{2}\} \cup \{O_{3}\} \cup \{E_{f}^{b}\} \\ &= \{(n_{D}, n_{F}) : R_{1} \leq n_{D} \leq M_{1},\ 0 \leq n_{F} \leq M_{2},\ n_{D} + n_{F} \leq N\} \\ &\quad \cup \{(n_{D}, n_{F}) : 0 \leq n_{D} \leq R_{1} - 1,\ R_{2} \leq n_{F} \leq M_{2},\ n_{D} + n_{F} \leq N\} \cup \{E_{f}^{b}\}, \end{aligned} \tag{6}$$

where $O_{2}$ is the set of states in which the number of defective components has reached the mission abort threshold, whether or not the number of failed components has reached its abort threshold, and $O_{3}$ is the set of states in which the number of failed components meets the mission abort condition while the number of defective components has not reached the mission abort threshold. $E_{f}^{b}$ is the absorbing state representing system failure, in which the number of defective or failed components reaches the threshold of system failure. The transition probabilities are listed as follows.

(1) If $R_{1} \leq a \leq M_{1}$, $0 \leq b \leq M_{2}$, then $P\{X_{i} = (a, b) \mid X_{i-1} = (a, b)\} = a (n - a - b)(1 - P_{1})(1 - P_{2})$,
(2) If $0 \leq a \leq R_{1} - 1$, $R_{2} \leq b \leq M_{2}$, then $P\{X_{i} = (a, b) \mid X_{i-1} = (a, b)\} = a (n - a - b)(1 - P_{1})(1 - P_{2})$,
(3) If $R_{1} \leq a \leq M_{1} - 1$, $0 \leq b \leq M_{2}$, then $P\{X_{i} = (a + 1, b) \mid X_{i-1} = (a, b)\} = a (n - a - b) P_{1} (1 - P_{2})$,
(4) If $0 \leq a \leq R_{1} - 1$, $R_{2} \leq b \leq M_{2}$, then $P\{X_{i} = (a + 1, b) \mid X_{i-1} = (a, b)\} = a (n - a - b) P_{1} (1 - P_{2})$,
(5) If $0 \leq a \leq R_{1} - 1$, $R_{2} \leq b \leq M_{2} - 1$, then $P\{X_{i} = (a - 1, b + 1) \mid X_{i-1} = (a, b)\} = a (n - a - b)(1 - P_{1}) P_{2}$,
(6) If $R_{1} \leq a \leq M_{1}$, $0 \leq b \leq R_{2} - 1$, then $P\{X_{i} = (a - 1, b + 1) \mid X_{i-1} = (a, b)\} = a (n - a - b)(1 - P_{1}) P_{2}$,
(7) If $a = M_{1}$, $0 \leq b \leq M_{2}$, then $P\{X_{i} = E_{f}^{b} \mid X_{i-1} = (a, b)\} = a (n - a - b) P_{1} (1 - P_{2})$,
(8) If $0 \leq a \leq M_{1}$, $b = M_{2}$, then $P\{X_{i} = E_{f}^{b} \mid X_{i-1} = (a, b)\} = a (n - a - b)(1 - P_{1}) P_{2}$.

The state transition probability matrix $\Lambda_{2}$ can be constructed as follows,

$$\Lambda_{2} = \begin{bmatrix} U_{2} & W_{2} \\ 0 & I_{2} \end{bmatrix}_{h_{2} \times h_{2}}. \tag{7}$$

$U_{2}$ is the matrix of size $(h_{2} - 1) \times (h_{2} - 1)$ that contains the transition probabilities among the $(h_{2} - 1)$ transient states, where $h_{2} = (M_{1} - R_{1} + 1) \times (M_{2} + 1) + R_{1} \times (M_{2} - R_{2} + 1) + 1$. The matrix $W_{2}$ of size $(h_{2} - 1) \times 1$ contains the transition probabilities from the transient states to the absorbing state. $I_{2}$ is the identity matrix.
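As a quick worked check of the state count, take the thresholds of the Figure 1 example, $M_{1} = 5$, $M_{2} = 4$, $R_{1} = 3$, $R_{2} = 2$ (the choice of these values here is ours, for illustration only):

```latex
\begin{aligned}
h_{2} &= (M_{1} - R_{1} + 1)(M_{2} + 1) + R_{1}(M_{2} - R_{2} + 1) + 1 \\
      &= (5 - 3 + 1)(4 + 1) + 3\,(4 - 2 + 1) + 1 \\
      &= 15 + 9 + 1 = 25 .
\end{aligned}
```

Under these values $U_{2}$ would be $24 \times 24$ and $W_{2}$ would be $24 \times 1$. Note that the formula counts every index pair in the stated ranges; pairs that violate $n_{D} + n_{F} \leq N$ cannot be reached, so for a given $N$ the number of reachable states may be smaller.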

Based on the transition probability matrix $\Lambda_{2}$, we can derive the probability that the system is in each state after experiencing $i$ shocks,

$$P_{2}(i) = \pi_{2} (\Lambda_{2})^{i-K_{1}} = (P_{1}^{b}(i), P_{2}^{b}(i), \dots, P_{h_{2}}^{b}(i)), \tag{8}$$

where

$$\pi_{2} = \left( \frac{P_{h_{1}+1}^{a}(i)}{\sum_{j=1}^{h_{1}'} P_{h_{1}+j}^{a}(i)}, \dots, \frac{P_{h_{1}+h_{1}'}^{a}(i)}{\sum_{j=1}^{h_{1}'} P_{h_{1}+j}^{a}(i)}, 0, \dots, 0 \right)_{1 \times h_{2}}$$

and $P_{x}^{b}(i)$ denotes the probability that the system is in state $x$ $(x = 1, 2, \dots, h_{2})$. When the system reaches the mission abort threshold but does not abort the mission, the probability that it does not fail within $\tau$ can be calculated as

$$P\{L_{1} > \tau \mid s \leq T < \tau\} = \pi_{2}' (U_{2})^{K_{2}-K_{1}} I_{2}', \tag{9}$$

where

$$\pi_{2}' = \left( \frac{P_{h_{1}+1}^{a}(i)}{\sum_{j=1}^{h_{1}'} P_{h_{1}+j}^{a}(i)}, \dots, \frac{P_{h_{1}+h_{1}'}^{a}(i)}{\sum_{j=1}^{h_{1}'} P_{h_{1}+j}^{a}(i)}, 0, \dots, 0 \right)_{1 \times (h_{2}-1)}$$

and $I_{2}' = (1, 1, \dots, 1)_{1 \times (h_{2}-1)}^{T}$.

The probability of mission success can be calculated as follows:

$$\begin{aligned} \mathit{MSP} &= P\{T \geq \tau\} + P\{L_{1} > \tau \mid s \leq T < \tau\} \\ &= 1 - \pi_{1} (U_{1})^{K_{1}-1} W_{1} I_{1}' + \pi_{2}' (U_{2})^{K_{2}-K_{1}} I_{2}' \end{aligned} \tag{10}$$

Denote by $L_{2}$ the lifespan of the system when it reaches the mission abort threshold and aborts the mission. The state space can be defined as follows:

$$\begin{aligned} \Omega_{3} &= \{O_{4}\} \cup \{O_{5}\} \cup \{E_{f}^{c}\} \\ &= \{(n_{D}, n_{F}) : R_{1} \leq n_{D} \leq M_{1},\ 0 \leq n_{F} \leq M_{2},\ n_{D} + n_{F} \leq N\} \\ &\quad \cup \{(n_{D}, n_{F}) : 0 \leq n_{D} \leq R_{1} - 1,\ R_{2} \leq n_{F} \leq M_{2},\ n_{D} + n_{F} \leq N\} \cup \{E_{f}^{c}\}, \end{aligned} \tag{11}$$

where $O_{4}$ is the set of states in which the number of defective components has reached the mission abort threshold, whether or not the number of failed components has reached its abort threshold, and $O_{5}$ is the set of states in which the number of failed components meets the mission abort condition while the number of defective components has not reached the mission abort threshold. $E_{f}^{c}$ is the absorbing state representing system failure, in which the number of defective or failed components reaches the threshold of system failure. The transition probabilities are listed as follows.

(1) If $R_{1} \leq a \leq M_{1}$, $0 \leq b \leq M_{2}$, then $P\{X_{i} = (a, b) \mid X_{i-1} = (a, b)\} = a (n - a - b)(1 - P_{1}')(1 - P_{2}')$,
(2) If $0 \leq a \leq R_{1} - 1$, $R_{2} \leq b \leq M_{2}$, then $P\{X_{i} = (a, b) \mid X_{i-1} = (a, b)\} = a (n - a - b)(1 - P_{1}')(1 - P_{2}')$,
(3) If $R_{1} \leq a \leq M_{1} - 1$, $0 \leq b \leq M_{2}$, then $P\{X_{i} = (a + 1, b) \mid X_{i-1} = (a, b)\} = a (n - a - b) P_{1}' (1 - P_{2}')$,
(4) If $0 \leq a \leq R_{1} - 1$, $R_{2} \leq b \leq M_{2}$, then $P\{X_{i} = (a + 1, b) \mid X_{i-1} = (a, b)\} = a (n - a - b) P_{1}' (1 - P_{2}')$,
(5) If $0 \leq a \leq R_{1} - 1$, $R_{2} \leq b \leq M_{2} - 1$, then $P\{X_{i} = (a - 1, b + 1) \mid X_{i-1} = (a, b)\} = a (n - a - b)(1 - P_{1}') P_{2}'$,
(6) If $R_{1} \leq a \leq M_{1}$, $0 \leq b \leq R_{2} - 1$, then $P\{X_{i} = (a - 1, b + 1) \mid X_{i-1} = (a, b)\} = a (n - a - b)(1 - P_{1}') P_{2}'$,
(7) If $a = M_{1}$, $0 \leq b \leq M_{2}$, then $P\{X_{i} = E_{f}^{c} \mid X_{i-1} = (a, b)\} = a (n - a - b) P_{1}' (1 - P_{2}')$,
(8) If $0 \leq a \leq M_{1}$, $b = M_{2}$, then $P\{X_{i} = E_{f}^{c} \mid X_{i-1} = (a, b)\} = a (n - a - b)(1 - P_{1}') P_{2}'$.

The state transition probability matrix $\Lambda_{3}$ can be obtained using the state transition probabilities mentioned above.

$$\Lambda_{3} = \begin{bmatrix} U_{3} & W_{3} \\ 0 & I_{3} \end{bmatrix}_{h_{3} \times h_{3}}. \tag{12}$$

$U_{3}$ is the matrix of size $(h_{3} - 1) \times (h_{3} - 1)$ that contains the transition probabilities among the $(h_{3} - 1)$ transient states, where $h_{3} = (M_{1} - R_{1} + 1) \times (M_{2} + 1) + R_{1} \times (M_{2} - R_{2} + 1) + 1$. The matrix $W_{3}$ of size $(h_{3} - 1) \times 1$ contains the transition probabilities from the transient states to the absorbing state. $I_{3}$ is the identity matrix.

Based on the transition probability matrix $\Lambda_{3}$, we can derive the probability that the system is in each state after experiencing $i$ shocks,

$$P_{3}(i) = \pi_{2} (\Lambda_{3})^{i-K_{1}} = (P_{1}^{c}(i), P_{2}^{c}(i), \dots, P_{h_{3}}^{c}(i)), \tag{13}$$

where $P_{x}^{c}(i)$ denotes the probability that the system is in state $x$ $(x = 1, 2, \dots, h_{3})$. Let $K_{3}$ denote the total number of shocks experienced by a system that reaches the mission abort threshold, starts the rescue at $T$, and does not fail within $T + \varphi(T)$. When the system reaches the mission abort threshold and aborts the mission, the probability that it survives the rescue procedure can be calculated as

$$P\{L_{2} > T + \varphi(T) \mid T < s\} = \pi_{2}' (U_{3})^{K_{3}-K_{1}} I_{2}'. \tag{14}$$

Thus, SSP can be obtained as follows.

$$\begin{aligned} \mathit{SSP} &= P\{T \geq \tau\} + P\{L_{1} > \tau \mid s \leq T < \tau\} + P\{L_{2} > T + \varphi(T) \mid T < s\} \\ &= 1 - \pi_{1} (U_{1})^{K_{1}-1} W_{1} I_{1}' + \pi_{2}' (U_{2})^{K_{2}-K_{1}} I_{2}' + \pi_{2}' (U_{3})^{K_{3}-K_{1}} I_{2}' \end{aligned} \tag{15}$$

Because of the dependency between $K_{i}$ $(i = 1, 2, 3)$ and $T$, it is difficult to obtain analytical solutions for MSP and SSP. To obtain the optimal strategy, the Monte-Carlo simulation method is used. The Monte-Carlo simulation method, also known as the random sampling or statistical test method, estimates quantities of interest by repeated random sampling. The simulation flowchart is shown in Figure 2. The general steps of the simulation are listed as follows.

Step 1: Initialize the parameters of the model;
Step 2: Generate variables to simulate the shock process and the impact of the shock;
Step 3: Judge whether the time exceeds $\tau$;
Step 4: Simulate the shock arrival process and the changes in component states;
Step 5: Judge whether the system reaches the mission abort threshold;
Step 6: Decide whether to abort the mission;
Step 7: Judge whether the system meets the failure threshold;
Step 8: Obtain the result of one simulation round;
Step 9: Derive the probability of mission success and the probability of system survival based on all simulation results.
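The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' simulation code: shocks are assumed to arrive as a homogeneous Poisson process, every threshold is treated as "reached" (count ≥ threshold), shocks are assumed to keep arriving during the rescue with the reduced probabilities $P_{1}'$ and $P_{2}'$, and the mission is aborted only if the rescue time φ(t) fits into the remaining mission time. All function and parameter names are hypothetical, and the default values are taken from the case study in Section 4 purely for illustration.

```python
import random


def simulate_once(rng, N=5, tau=10.0, lam=1.0, P1=0.1, P2=0.25,
                  P1r=0.05, P2r=0.1, M1=5, M2=4, R1=3, R2=1,
                  phi=lambda t: 0.3 * t):
    """Simulate one mission; return (mission_success, system_survived)."""
    normal, defective, failed = N, 0, 0
    t, deadline, rescuing = 0.0, tau, False
    while True:
        # Steps 2 and 4: next shock of a Poisson process with rate lam.
        t += rng.expovariate(lam)
        if t >= deadline:
            # Step 3: no failure before the mission (or the rescue) ended.
            return (not rescuing, True)
        p1, p2 = (P1r, P2r) if rescuing else (P1, P2)
        # Each component reacts independently and drops at most one level per shock.
        to_failed = sum(rng.random() < p2 for _ in range(defective))
        to_defective = sum(rng.random() < p1 for _ in range(normal))
        normal -= to_defective
        defective += to_defective - to_failed
        failed += to_failed
        # Step 7: failure thresholds (assumed to mean >=).
        if defective >= M1 or failed >= M2:
            return (False, False)
        # Steps 5 and 6: abort only if the rescue fits into the remaining time.
        if (not rescuing and (defective >= R1 or failed >= R2)
                and phi(t) <= tau - t):
            rescuing, deadline = True, t + phi(t)


def estimate(n_runs=20000, seed=0, **params):
    """Steps 8 and 9: average the outcomes of n_runs independent rounds."""
    rng = random.Random(seed)
    missions = survivals = 0
    for _ in range(n_runs):
        ok_mission, ok_system = simulate_once(rng, **params)
        missions += ok_mission
        survivals += ok_system
    return missions / n_runs, survivals / n_runs


msp, ssp = estimate(P1=0.12)
print(f"MSP ~ {msp:.3f}, SSP ~ {ssp:.3f}")
```

Because the modeling details above are assumptions, the estimates produced by this sketch need not match the values reported in the paper; the sketch is only meant to show how a single round maps onto Steps 1 to 9.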

3.2. Optimization Model

Intuitively, the easier the mission abort condition is to satisfy, the higher the system survival probability and the lower the mission success probability. To balance this trade-off between mission reliability and system survivability, an optimization model is constructed as follows. The cost of the system consists of two parts: mission failure cost and system failure cost. Denote by $C_{u}$ and $C_{d}$ the cost of mission failure and the cost of system failure, respectively. The objective of the optimization model is to find the mission abort thresholds that minimize the total expected cost. In addition, system survival probability is important for safety-critical systems, so it is included in the optimization model as a constraint. The values of MSP and SSP depend on $R_{1}$ and $R_{2}$. The optimization model can then be constructed as follows, where $\overline{\mathit{SSP}}$ is a preset desired value of SSP.

$$\min C(R_{1}, R_{2}) = C_{u} \, (1 - \mathit{MSP}(R_{1}, R_{2})) + C_{d} \, (1 - \mathit{SSP}(R_{1}, R_{2})) \tag{16}$$

$$\text{s.t.} \quad \begin{cases} \mathit{SSP}(R_{1}, R_{2}) \geq \overline{\mathit{SSP}}, \\ 1 \leq R_{1} \leq M_{1}, \\ 1 \leq R_{2} \leq \min\{R_{1}, M_{2}\}. \end{cases}$$
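Since $R_{1}$ and $R_{2}$ are integers in small ranges, the model in Eq. (16) can be solved by plain enumeration. The sketch below is one straightforward way to do so and is not necessarily the search procedure used by the authors; `msp` and `ssp` stand for any callables that return estimates of MSP and SSP for given thresholds (for instance from a simulation), and the toy functions in the usage part are invented solely to exercise the code.

```python
def optimal_thresholds(msp, ssp, M1, M2, C_u, C_d, ssp_min):
    """Enumerate feasible (R1, R2) and return (cost, R1, R2) with minimal cost."""
    best = None
    for R1 in range(1, M1 + 1):
        for R2 in range(1, min(R1, M2) + 1):
            survival = ssp(R1, R2)
            if survival < ssp_min:          # constraint SSP >= preset value
                continue
            cost = C_u * (1 - msp(R1, R2)) + C_d * (1 - survival)   # Eq. (16)
            if best is None or cost < best[0]:
                best = (cost, R1, R2)
    return best                              # None if no pair is feasible


# Hypothetical toy response surfaces, for illustration only.
toy_msp = lambda R1, R2: 0.15 * R1
toy_ssp = lambda R1, R2: 1 - 0.05 * R1 - 0.02 * R2

best = optimal_thresholds(toy_msp, toy_ssp, M1=5, M2=4,
                          C_u=2000, C_d=5000, ssp_min=0.5)
print(best)
```

With simulated MSP and SSP values the same seed should be reused across threshold pairs, so that differences in cost reflect the thresholds rather than sampling noise.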

4. Case Study

4.1. Background

Consider a military unmanned aerial vehicle (UAV) with $N = 5$ engines operating in a harsh battlefield environment. The UAV is designed to perform a reconnaissance mission within a specific time period $\tau = 10$. Each engine in the UAV can be in one of three possible states: normal, defective, and failed. The external shocks that affect the engines are caused by gunfire, explosions, and other hostile actions on the battlefield. The arrival of shocks follows a homogeneous Poisson process with rate $\lambda = 1$. Each shock can cause an engine to deteriorate from the normal state to the defective state with probability $P_{1}$, or from the defective state to the failed state with a higher probability $P_{2} = 0.25$. Furthermore, the impacts of random shocks on the engines are independent of each other. The UAV will crash if the number of defective engines reaches $M_{1} = 5$ or the number of failed engines reaches $M_{2} = 4$.

The mission of the UAV is aborted when the number of defective components reaches $R_{1}$ or the number of failed components reaches $R_{2}$. If the abort condition is met, the mission is immediately aborted and the rescue is started to save the UAV. The time required for rescue is $\varphi(t) = 0.3t$ when the mission is aborted at time $t$. The UAV system remains in its original state until the rescue is completed. During the rescue, the probability of a shock deteriorating the state of a defective component is $P_{2}' = 0.1$, which is higher than the corresponding probability for a normal component, $P_{1}' = 0.05$. The mission continues if the time required to complete the rest of the mission is less than the time required for rescue. The cost of UAV failure is $C_{d} = 5000$, and the cost of mission failure is $C_{u} = 2000$.
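These values also fix the latest time at which an abort is still worthwhile. Assuming that the time $s$ of Section 2 is the instant at which the remaining mission time equals the rescue time (our reading of the policy, not a value stated in the paper), it follows that

```latex
\tau - s = \varphi(s) = 0.3\,s
\quad\Longrightarrow\quad
s = \frac{\tau}{1.3} = \frac{10}{1.3} \approx 7.69 .
```

Under this assumption, an abort condition met after roughly $t = 7.69$ would no longer trigger a rescue, because finishing the mission takes less time than the rescue would.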

For a given $P_{1}$, we can derive the corresponding mission reliability and system survivability, and then obtain the total expected cost for different $R_{1}$ and $R_{2}$. Based on the optimization model constructed in Section 3.2, we use the one-dimensional search method to find the optimal values of $R_{1}$ and $R_{2}$ that minimize the total expected cost.

4.2. Result Discussion

The optimal values of $R_{1}$ and $R_{2}$ for different probabilities $P_{1}$ that a component degrades from the normal state to the defective state are shown in Table 1. It can be seen that, as $P_{1}$ increases, $R_{1}$ changes from 4 to 3 and $R_{2}$ remains 1. The change in $R_{1}$ can be explained as follows: the degradation probabilities are lower once the rescue has started, and components move from the normal state to the defective state more readily as $P_{1}$ increases, so the mission should be aborted sooner to prevent system failure. Under the current parameter setting, the optimal value of $R_{2}$ is 1 when $P_{1}$ is 0.1. As $P_{1}$ changes from 0.1 to 0.2, a smaller $R_{2}$ would be preferable, but $R_{2} = 1$ is already the minimum admissible value, so $R_{2}$ remains constant at 1. The values of MSP and SSP in terms of $P_{1}$ are shown in Figure 3. It can be seen that, as $P_{1}$ increases, MSP decreases and SSP increases. The reason is that a larger value of $P_{1}$ leads to a higher probability of mission abort.

The optimal expected total costs with and without a mission abort policy are compared in Figure 4. The optimal mission abort policy greatly reduces the expected total cost compared with the absence of an abort policy. When $P_{1} = 0.12$, for example, the expected total cost with a mission abort decision is 2174.02, which is significantly less than the expected total cost of 4019.88 without a mission abort. Therefore, adopting the mission abort policy is an effective way to reduce the expected total cost.
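The size of this saving follows directly from the two reported costs:

```latex
\Delta C = 4019.88 - 2174.02 = 1845.86,
\qquad
\frac{\Delta C}{4019.88} \approx 0.459 ,
```

that is, a reduction of about 46% at $P_{1} = 0.12$. As a rough plausibility check, and assuming that without an abort option the mission succeeds exactly when the system survives (so that $\mathit{MSP} = \mathit{SSP} = p$), Eq. (16) would give $(C_{u} + C_{d})(1 - p) = 7000\,(1 - p) = 4019.88$, i.e., $p \approx 0.426$. This value is our inference under that assumption and is not reported in the paper.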

5. Conclusions

This paper investigates the optimal mission abort policy with multiple abort criteria for multi-component systems operating in random environments. By incorporating multi-level transition probabilities, a shock model is utilized to characterize the impact of the environment. Mission abort policies are designed based on the number of defective components and failed components. Due to the dependence between variables, we use the Monte Carlo simulation method to derive the mission reliability and system survivability. In order to balance the trade-off between these two indexes, the optimization model is constructed to find the optimal mission abort thresholds. For comparison purposes, the total expected cost under a heuristic policy without aborting is obtained. It is found that the mission abort action can effectively reduce the total expected cost.

Several extensions to the current study deserve attention. Firstly, single mission execution is assumed in this model; one practical extension is to consider the optimal mission abort policy for systems performing multiple tasks, or one task with multiple execution attempts. Secondly, internal degradation can also be incorporated into this model to refine the state transition process of the considered system. Finally, preventive maintenance activities carried out before mission abort or system failure are worth investigating, because these actions can substantially improve both mission reliability and system survivability.

Author Contributions

Conceptualization, X.C., B.C. and X.Z.; methodology, X.C. and X.Z.; software, B.C.; validation, X.C., B.C. and X.Z.; formal analysis, X.C.; investigation, X.C. and B.C.; writing—original draft preparation, X.C. and B.C.; writing—review and editing, X.Z.; visualization, B.C.; supervision, X.Z.; funding acquisition, X.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Science and Technology Innovation Project of Beijing Institute of Technology (Grant Nos. 2023CX01028 and LY2022-23).

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

RP	Rescue procedure
PM	Primary mission
MSP	Mission success probability
SSP	System survival probability

Figure 1. Possible cases of mission performance when $M_{1} = 5$, $M_{2} = 4$, $R_{1} = 3$, and $R_{2} = 2$.

Figure 2. Monte-Carlo simulation flowchart for deriving the mission success probability MSP and system survival probability SSP.

Figure 3. The MSP and SSP in terms of $P_{1}$.

Figure 4. Comparison of the optimal expected total cost with and without mission abort.

Table 1. The optimal solutions of $R_{1}$ and $R_{2}$ in terms of $P_{1}$.

$P_{1}$	0.1	0.11	0.12	0.13	0.14	0.15	0.16	0.17	0.18	0.19	0.2
$R_{1}$	4	4	3	3	3	3	3	3	3	3	3
$R_{2}$	1	1	1	1	1	1	1	1	1	1	1

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
