*Article* **Optimizing a Multi-State Cold-Standby System with Multiple Vacations in the Repair and Loss of Units**

**Juan Eloy Ruiz-Castro**

Department of Statistics and O.R., Math Institute (IMAG), University of Granada, 18071 Granada, Spain; jeloy@ugr.es

**Abstract:** A complex multi-state redundant system with preventive maintenance subject to multiple events is considered. The online unit can undergo several types of failure: both internal and those provoked by external shocks. Multiple degradation levels, both internal and external, are assumed. Degradation levels are observed by random inspections and, if they are major, the unit goes to a repair facility where preventive maintenance is carried out. This repair facility is composed of a single repairperson governed by a multiple vacation policy. This policy is set up according to the number of operational units. Two types of task can be performed by the repairperson: corrective repair and preventive maintenance. The times embedded in the system are phase-type distributed and the model is built by using Markovian Arrival Processes with marked arrivals. Multiple performance measures besides the transient and stationary distribution are worked out through matrix-analytic methods. This methodology enables us to express the main results and the global development in a matrix-algorithmic form. To optimize the model, costs and rewards are included. A numerical example shows the versatility of the model.

**Keywords:** reliability; redundant systems; preventive maintenance; multiple vacations

#### **1. Introduction**

Redundant systems and preventive maintenance are of fundamental importance in ensuring reliability, preventing system failures and reducing costs. These questions, therefore, are of considerable research interest.

The occurrence of total, unexpected system failure can provoke severe damage and major financial loss. To avoid such an outcome, various reliability-enhancing methods can be applied, chief among which are redundancy and preventive maintenance. In this respect, cold, hot and warm redundant standby and *k*-out-of-*n* systems have been proposed. Among researchers who have addressed these questions, Levitin et al. [1] considered an optimal standby element sequencing problem (SESP) for 1-out-of-N: G heterogeneous warm-standby systems, while Zhai et al. [2] constructed a multi-value decision diagram with which to analyse a demand-based warm standby system. In related papers, Cha et al. [3] considered preventive maintenance for items operating in a random environment subjected to a shock Poisson process, Levitin et al. [4] evaluated the probability of mission success given an arbitrary redundancy level, and Osaki et al. [5] analysed the behaviour of a two-unit standby redundant system.

Preventive maintenance enhances system reliability and performance, reduces costs, for both repairable and non-repairable systems, and decreases the probability of sudden equipment failure. Various maintenance systems were studied by [6,7] who developed a new model for the hybrid preventive maintenance of systems with partially observable degradation. Levitin et al. (2021) [8] modelled the (time-consuming) procedure of task transfer, in an event transition-based reliability analysis of standby systems in which preventive replacements are performed according to a predetermined schedule. The aim of this approach is to optimise preventive replacement scheduling and hence to maximise reliability. In another approach to this situation, Yang et al. [9] discussed a preventive maintenance policy for a single-unit system subject to failure by internal deterioration and/or sudden shock, according to a non-homogeneous Poisson process whereby the process of internal failure is partitioned into two stages.

**Citation:** Ruiz-Castro, J.E. Optimizing a Multi-State Cold-Standby System with Multiple Vacations in the Repair and Loss of Units. *Mathematics* **2021**, *9*, 913. https://doi.org/10.3390/math9080913

Academic Editor: Panagiotis-Christos Vassiliou

Received: 12 March 2021 Accepted: 16 April 2021 Published: 20 April 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Complex systems that have a finite number of performance levels and various failure modes, each producing different effects on system performance, are termed multi-state systems (MSS). This concept was first discussed by [10] and has since been developed extensively. For example, Levitin et al. [11] described various MSS measures and considered problems of MSS optimisation, and Lisnianski et al. [12] conducted a comprehensive analysis of the question.

One of the main problems encountered with multi-state complex models is the existence of intractable expressions for their modelling and/or of difficulties in their interpretation. In this respect, matrix-analytic methods are a valuable means of analysing complex systems, preserving the Markovian structure and obtaining manageable results. This approach is usually based on two elements—phase-type distributions (PHD) and Markovian arrival processes (MAP)—which enable the results to be expressed and complex systems modelled in an algorithmic, computational form. PHD were first introduced and detailed by [13]. MAP is a counting process in which PH distributions play an important role. This method was described by [14] and comprehensively reviewed by [15,16]. A special case is that of the MAP with marked arrivals (MMAP), which enables us to count different types of arrival. Moreover, the arrival probabilities of events, for the discrete case, can be customised for different situations. MAP and MMAP theory were further developed by [16].

Many multi-state reliability systems, over time, are subject to events such as repairable or non-repairable failure, inspections or external shocks. These systems can be modelled using appropriate Markov processes, i.e., PHD and MAP ([17,18]). In parallel, unitary complex systems subject to multiple events have been discussed by [19,20]. Matrix algorithmic methods have also been used to model multi-state complex redundant systems. Ruiz-Castro (2020) [21] developed a k-out-of-n: G system, in which the units are subject to repairable and/or non-repairable failure and receive random inspections. In this system, the potential loss of units is included; thus, when a non-repairable failure occurs, the unit is removed and the system continues to be operational. In the context of complex models, a repair facility with a single repairperson is usually assumed. Thus, Ruiz-Castro et al. [22,23] analyse redundant complex systems with a general number of repairpersons and the potential loss of units, determining the optimum number of repairpersons in each case.

In brief, redundancy and preventive maintenance are incorporated into complex systems in order to enhance their reliability, and must also be included in the modelling of such systems. In theory, a unit is repaired either immediately after failure if the system is unitary or when the element is next in line in the repair facility queue. However, this might not be the case in a real scenario. For example, a failed unit might not be repaired immediately in a small or medium-sized firm that cannot afford to employ a full-time repairperson. Furthermore, when there is no failed unit to be attended to in the repair facility, what should a repairperson do? Instead of remaining idle during this period, the repairperson may take a 'vacation' and/or use the time to do other work, thus optimising resources and reducing costs. A repairperson is on vacation when absent from the repair facility, whether or not it is empty. The economic implications of this situation should be considered, taking into account that the vacation policy applied might impact both on performance and on economic rewards/costs. In studies of this question, two time points are of particular importance: the start and end times of the vacation. Moreover, the services provided may be exhaustive or non-exhaustive. In the first case, the repairperson cannot be on vacation when the repair facility is not empty, but in the second, even if an item has been sent to the repair facility, the repairperson may be on vacation. Another possibility that must be considered is that of interruption, i.e., the repairperson may take a vacation while a unit is being repaired. The vacation end time determines when the repairperson resumes work. Finally, depending on the maintenance system adopted, the vacation may occupy a single period of time or multiple periods.

Vacation policies have been considered in queuing theory and in reliability analysis, among other areas. Thus, Doshi [24] provided a wide-ranging analysis of vacation system models and Ke et al. [25] examined the application of two vacation policies (one single and the other multiple) in a repairable system. Zaiming et al. [26] developed a reliability system with multiple, but finite, vacation periods and Wu et al. [27] analysed the reliability of a two-unit cold standby system with a single repairperson, entitled to take a vacation.

Vacation periods have also been considered for systems governed by a Markov model. In this respect, Shrivastava et al. [28] presented the case of an exhaustive vacation policy, whereby the repairperson could only take a vacation when the repair facility was empty. Under the Markovian modelling described by [29], the repairperson could take a vacation if there were no failed units in need of repair, but had to return as soon as any unit failed. In another approach, Zhang et al. [30] modelled a k-out-of-n system with a single repairperson, assuming a phase-type distribution for the vacation time and an exponential distribution for the lifetime of the units. In this system, the repairperson could take a vacation whenever there was no failed component in the system. On return, the repairperson might or might not encounter failed components waiting for repair. In the second case, the repairperson would remain within the repair facility, idle, until a failed component arrived. Finally, Ruiz-Castro et al. [31] modelled a multi-state complex system subject to multiple events and where preventive maintenance was applied. In this case, the repairperson had various duties and, moreover, was entitled to take a vacation.

In the present study, we model a cold standby system with the potential loss of units. The system evolves in discrete time; the online unit is multi-state and subject to internal failure, repairable or otherwise, to external shocks with diverse consequences, and to random inspection. When a non-repairable failure occurs, the faulty unit is removed and the system continues working with one unit less. An external shock may provoke any of the following consequences: degraded system performance, a repairable failure of the online unit or its total (non-repairable) failure. Damage to the internal performance of the online unit may be minor or major. During system inspection, the internal status of the online unit is observed. If major damage is present, the faulty unit is sent to the repair facility for preventive maintenance. According to the case presented, the repairperson may perform corrective repair or preventive maintenance. The complexity of the system is determined as follows. In modelling the system, the vacation policy employed in the repair facility is determined by the number of operational units included. A general number *R* of operational units is considered. If the repairperson returns from a vacation period and there are fewer than *R* operational units, the repairperson must then remain in the facility. Otherwise, a new vacation period begins. As the system is subject to a loss of units, when there are fewer than *R* units in the system, the repairperson must remain in the facility while this situation persists. The times embedded are PH distributed and a MMAP is constructed to model the system. In modelling this system, the following measures are calculated: availability, reliability and expected times (in both transient and stationary regimes). Rewards and costs are incorporated, and a numerical optimisation is performed to determine the optimum threshold *R* and to decide whether preventive maintenance is profitable or not.

The rest of this paper is organised as follows. In Section 2, we describe the system to be modelled, after which we present the corresponding MMAP in Section 3. In Section 4, we detail the measures applied to the transient and stationary regimes, and calculate the transient and stationary distributions. The latter is obtained both algorithmically and computationally. The system costs, rewards and associated measures are then derived in Section 5. Taking advantage of the favourable properties of PHD and MMAP, the study findings are obtained in a matrix algorithmic form. Section 6 presents a numerical example, including an optimisation exercise. Finally, the main conclusions drawn are summarised in Section 7.

#### **2. Assumptions of the System: The State Space**

A cold standby system composed of *n* units is initially assumed. One unit is online and the others wait on standby without degrading. The online unit is multi-state, with its internal performance partitioned into minor and major states, and it is subject to multiple events. It can suffer internal failures, repairable or not, and external shocks. Each external shock can provoke three different consequences: total (non-repairable) failure, a modification of the internal behaviour, or an internal repairable or non-repairable failure. When a repairable failure occurs, the unit goes to the repair facility for corrective repair. The corrective repair time is PH distributed. The repair facility is composed of one repairperson who can take vacations. A major state is a state from which the online unit has a greater risk of suffering a failure. To avoid serious damage and major financial losses, random inspections are carried out. The inspector observes the online unit and, if it is operational but in a major state, the unit goes to the repair facility for preventive maintenance. The preventive maintenance time is also PH distributed. When the online unit undergoes a failure, a cold standby unit, if any, occupies the online place. The new online unit starts from the initial distribution of the internal performance because, after repair or preventive maintenance, the unit is as good as new. The system is also subject to loss of units. After a non-repairable failure the unit is removed and the system continues working until there are no units in the system. If only one unit is in the system and a non-repairable failure occurs, the system is restarted.

The repair facility has a single repairperson, who can perform two different tasks: corrective repair and preventive maintenance. To optimise the system, the repairperson is allowed to take vacations, for a random duration, according to certain criteria.

Initially, all units are operational and the repairperson is on vacation. After returning, a new vacation begins if there are *R* or more operational units in the system. Equivalently, if there are *N* = *k* − *R* + 1 or more failed units needing to be repaired, where *k* is the number of units in the system, *k* = 1, ..., *n*, the repairperson must remain in the repair facility.

After finishing a repair, the repairperson begins a new period of vacation if *R* units are then operational. As the system can lose units, the repairperson must always remain in the facility (or interrupt the vacation to return) when fewer than *R* units are in the system.
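The rule applied when the repairperson returns from a vacation period can be sketched as a small decision function. This is only an illustration of the policy described above: the function names `threshold_N` and `starts_new_vacation` are hypothetical, and the sketch assumes that every unit outside the repair facility is operational (online or on standby).

```python
def threshold_N(k: int, R: int) -> int:
    """Units in the repair facility from which the repairperson must stay (N = k - R + 1)."""
    return k - R + 1


def starts_new_vacation(k: int, in_repair: int, R: int) -> bool:
    """Decision taken when the repairperson returns from a vacation period.

    k         : units currently in the system (1..n)
    in_repair : units waiting or being attended to in the repair facility
    R         : threshold on the number of operational units
    Assumption of this sketch: every unit outside the repair facility is operational.
    """
    if not 0 <= in_repair <= k:
        raise ValueError("in_repair must lie between 0 and k")
    if k < R:
        # Fewer than R units remain in the system: the repairperson always stays.
        return False
    # Equivalent to requiring k - in_repair >= R operational units.
    return in_repair < threshold_N(k, R)


if __name__ == "__main__":
    R = 3
    for k, s in [(5, 2), (5, 3), (2, 0)]:
        print(f"k={k}, in repair={s}: new vacation -> {starts_new_vacation(k, s, R)}")
```

With *R* = 3 and *k* = 5 the threshold is *N* = 3, so a new vacation starts with two units in the repair facility but not with three.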

The following Section *"The Assumptions"* specifies the assumptions of the system.

#### *The Assumptions*

The system satisfies the following assumptions.

Assumption 1. The internal performance time is PH distributed with representation (α, **T**), and with order *m* (number of internal stages). The internal failure probability depends on the states. The column vectors **T**<sup>0</sup><sub>*r*</sub> and **T**<sup>0</sup><sub>*nr*</sub> contain the probabilities of repairable and non-repairable failures, respectively.

Assumption 2. The internal performance of the online unit is multi-state, where the first *n*<sub>1</sub> states are minor and the rest are major according to damage.

Assumption 3. The external events occur according to a PH-renewal process where the time between two consecutive shocks is a PH distribution with representation (γ, **L**), with order *t*.

Assumption 4. An external shock can provoke a total non-repairable failure of the online unit with a probability equal to *ω*<sup>0</sup>.

Assumption 5. After an external shock the internal performance state can undergo a modification. This modification between any two internal states occurs according to the transition probability matrix **W**. The column vectors **W**<sup>0</sup><sub>*r*</sub> and **W**<sup>0</sup><sub>*nr*</sub> contain the probabilities of repairable and non-repairable failures, respectively, provoked by an external shock.

Assumption 6. The time between two consecutive random inspections is PH distributed with representation (η, **M**), with order *ε*.

Assumption 7. The vacation time is distributed following a PH distribution with representation (**v**, **V**), with order *υ*.

Assumption 8. The corrective repair time is PH distributed with representation (β<sub>1</sub>, **S**<sub>1</sub>), with order *z*<sub>1</sub>.

Assumption 9. The preventive maintenance time is PH distributed with representation (β<sub>2</sub>, **S**<sub>2</sub>), with order *z*<sub>2</sub>.
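As a reminder of what a discrete PH representation encodes, the following minimal sketch evaluates the probability mass function P(*X* = *k*) = α **T**<sup>*k*−1</sup> **T**<sup>0</sup>, with **T**<sup>0</sup> = **e** − **Te**, for a generic representation (α, **T**). The numerical values and the helper names (`vec_mat`, `exit_vector`, `ph_pmf`) are hypothetical and are not taken from the paper; exact fractions are used so that the results can be checked by hand.

```python
from fractions import Fraction as F


def vec_mat(x, A):
    """Row vector times matrix."""
    return [sum(x[i] * A[i][j] for i in range(len(A))) for j in range(len(A[0]))]


def exit_vector(T):
    """Absorption probabilities T^0 = e - T e, one entry per transient phase."""
    return [1 - sum(row) for row in T]


def ph_pmf(alpha, T, k):
    """P(X = k) = alpha T^(k-1) T^0 for k >= 1 (discrete phase-type distribution)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    x = list(alpha)
    for _ in range(k - 1):
        x = vec_mat(x, T)
    return sum(a * b for a, b in zip(x, exit_vector(T)))


# Hypothetical two-phase representation (illustrative values only).
alpha = [F(1), F(0)]
T = [[F(1, 2), F(1, 4)],
     [F(0), F(1, 2)]]

if __name__ == "__main__":
    for k in range(1, 6):
        print(k, ph_pmf(alpha, T, k))
```

Under Assumption 1 the absorption vector of the internal performance time is split into the repairable and non-repairable parts, so a sketch for that case would use **T**<sup>0</sup><sub>*r*</sub> + **T**<sup>0</sup><sub>*nr*</sub> in place of the single vector computed by `exit_vector`.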

The behaviour of the system is shown in Figure 1, for inspection and repairable failure, Figure 2 for non-repairable failure, and Figure 3 for the vacation policy.

**Figure 1.** Internal repairable failure and inspection in the system.

**Figure 2.** Non-repairable failure in the system.

**Figure 3.** Vacation policy in the system.

#### **3. Modelling the System. The Markovian Arrival Process with Marked Arrivals**

The system is governed by a vector Markov process in discrete time. In this section the state space is described and, to model the proposed complex system, the behaviour of the online unit and of the repair facility is developed separately.

#### *3.1. The State-Space*

The state-space is composed of macro-states and it is denoted by *S* = {**U**<sup>*n*</sup>, **U**<sup>*n*−1</sup>, ..., **U**<sup>1</sup>}, where **U**<sup>*k*</sup> contains the phases when there are *k* units in the system. In turn, these macro-states are partitioned as follows

$$\mathbf{U}^{k} = \left\{ \mathbf{E}_{0}^{k,v}, \mathbf{E}_{1}^{k,v}, \dots, \mathbf{E}_{N-1}^{k,v}, \mathbf{E}_{N}^{k,v}, \mathbf{E}_{N+1}^{k,v}, \dots, \mathbf{E}_{k}^{k,v}, \mathbf{E}_{N}^{k,nv}, \mathbf{E}_{N+1}^{k,nv}, \dots, \mathbf{E}_{k}^{k,nv} \right\}; \quad k \ge R$$

$$\mathbf{U}^{k} = \left\{ \mathbf{E}_{0}^{k,nv}, \mathbf{E}_{1}^{k,nv}, \dots, \mathbf{E}_{k}^{k,nv} \right\}; \quad k < R$$

where **E**<sub>*s*</sub><sup>*k*,*x*</sup> contains the phases when there are *k* units in the system and *s* of them are in the repair facility, and the superscript *x* indicates whether the repairperson is on vacation (*v*) or not (*nv*). The repairperson begins to work the first time that he comes back from vacation and the system has at least *N* = *k* − *R* + 1 units in the repair facility. He remains working until *N* − 1 units are in the repair facility. At this moment the repairperson goes on vacation. In any case, the order of the units in the repair facility has to be kept in memory, and there are two types of repair, corrective repair and preventive maintenance. For this reason, the macro-state **E**<sub>*s*</sub><sup>*k*,*x*</sup> is composed of the first-level macro-states **E**<sup>*k*,*x*</sup><sub>*i*<sub>1</sub>,...,*i*<sub>*s*</sub></sub>.

These macro-states contain the phases when there are *k* units in the system, with *s* of them in the repair facility, and the type of repair is given by the ordered sequence *i*<sub>1</sub>, ..., *i*<sub>*s*</sub>. The values of *i*<sub>*l*</sub> are equal to 1 or 2 if the unit is in corrective repair or preventive maintenance, respectively.

When the number of units in the system is *R* − 1, the repairperson returns to his workplace immediately. The inspection time is restarted each time that a unit occupies the online place.

• For *k* = 1, ..., *R* − 1

$$\mathbf{E}_{0}^{k,nv} = \{(k, 0; i, j, u);\ i = 1, \dots, m,\ j = 1, \dots, t,\ u = 1, \dots, \varepsilon\}$$

$$\mathbf{E}_{s}^{k,nv} = \left\{ \mathbf{E}_{i_1,\dots,i_s}^{k,nv};\ i_l = 1, 2;\ l = 1, \dots, s \right\} \quad \text{for } s = 1, \dots, k$$

where

$$\mathbf{E}_{i_1,\dots,i_s}^{k,nv} = \{(k, s; i, j, u, r);\ i = 1, \dots, m,\ j = 1, \dots, t,\ u = 1, \dots, \varepsilon,\ r = 1, \dots, z_{i_1}\} \quad \text{for } s < k$$

$$\mathbf{E}_{i_1,\dots,i_k}^{k,nv} = \{(k, k; j, r);\ j = 1, \dots, t,\ u = 1, \dots, \varepsilon,\ r = 1, \dots, z_{i_1}\}$$

• For *k* = *R*, ..., *n*

$$\mathbf{E}_{0}^{k,v} = \{(k, 0; i, j, u, v);\ i = 1, \dots, m,\ j = 1, \dots, t,\ u = 1, \dots, \varepsilon,\ v = 1, \dots, \upsilon\}$$

$$\mathbf{E}_{s}^{k,v} = \left\{ \mathbf{E}_{i_1,\dots,i_s}^{k,v};\ i_l = 1, 2;\ l = 1, \dots, s \right\} \quad \text{for } s = 1, \dots, k$$

where

$$\mathbf{E}_{i_1,\dots,i_s}^{k,v} = \{(k, s; i, j, u, v);\ i = 1, \dots, m,\ j = 1, \dots, t,\ u = 1, \dots, \varepsilon,\ v = 1, \dots, \upsilon\} \quad \text{for } s < k$$

$$\mathbf{E}_{i_1,\dots,i_k}^{k,v} = \{(k, k; j, u, v);\ j = 1, \dots, t,\ u = 1, \dots, \varepsilon,\ v = 1, \dots, \upsilon\}$$

$$\mathbf{E}_{s}^{k,nv} = \left\{ \mathbf{E}_{i_1,\dots,i_s}^{k,nv};\ i_l = 1, 2;\ l = 1, \dots, s \right\} \quad \text{for } s = N, \dots, k$$

where

$$\mathbf{E}_{i_1,\dots,i_s}^{k,nv} = \{(k, s; i, j, u, r);\ i = 1, \dots, m,\ j = 1, \dots, t,\ u = 1, \dots, \varepsilon,\ r = 1, \dots, z_{i_1}\} \quad \text{for } s < k$$

$$\mathbf{E}_{i_1,\dots,i_k}^{k,nv} = \{(k, k; j, u, r);\ j = 1, \dots, t,\ u = 1, \dots, \varepsilon,\ r = 1, \dots, z_{i_1}\}$$

The phase (*k*, *s*; *i*, *j*, *u*, *r*) indicates that there are *k* units in the system, with *s* in the repair facility; the internal performance of the online unit is in state *i*, the external shock time is in phase *j*, *u* is the current phase of the inspection time and *r* is the corrective repair/preventive maintenance phase for the unit currently being attended to in the repair facility. If the repairperson is taking a vacation, the vacation phase is indicated by *v*.

The order of these macro-states is as follows:

$$o_{\mathbf{E}_0^{k,nv}} = m \cdot t \cdot \varepsilon; \quad \text{for } s < k,\ o_{\mathbf{E}_s^{k,nv}} = m \cdot t \cdot \varepsilon \cdot (z_1 + z_2) 2^{s-1}; \quad \text{for } s = k,\ o_{\mathbf{E}_k^{k,nv}} = t \cdot (z_1 + z_2) 2^{k-1}$$

$$o_{\mathbf{E}_0^{k,v}} = m \cdot t \cdot \varepsilon; \quad \text{for } s < k,\ o_{\mathbf{E}_s^{k,v}} = m \cdot t \cdot \varepsilon \cdot 2^{s}; \quad \text{for } s = k,\ o_{\mathbf{E}_k^{k,v}} = t \cdot 2^{k-1}$$
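The factor (*z*<sub>1</sub> + *z*<sub>2</sub>)2<sup>*s*−1</sup> in the orders of the non-vacation macro-states can be checked by direct enumeration of the ordered repair sequences. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the repair phase *r* ranges over the order of the task at the head of the queue, and that only the shock phase is kept for the online part when all *k* units are in the repair facility, as in the order formula above.

```python
from itertools import product


def repair_sequences(s):
    """All ordered sequences (i_1, ..., i_s) with i_l = 1 (corrective) or 2 (preventive)."""
    return list(product((1, 2), repeat=s))


def order_nv(s, k, m, t, eps, z1, z2):
    """Number of phases in the macro-state with s of k units in repair (repairperson working)."""
    z = {1: z1, 2: z2}
    # Online part: (i, j, u) while a unit is online; only j when every unit is in repair.
    online = m * t * eps if s < k else t
    if s == 0:
        return online
    # The repair phase belongs to the first unit in the queue, of type seq[0].
    return sum(online * z[seq[0]] for seq in repair_sequences(s))


if __name__ == "__main__":
    m, t, eps, z1, z2, k = 3, 2, 2, 2, 3, 4  # hypothetical orders
    for s in range(k + 1):
        print(s, order_nv(s, k, m, t, eps, z1, z2))
```

Half of the 2<sup>*s*</sup> sequences start with a corrective repair and half with preventive maintenance, which is where the factor (*z*<sub>1</sub> + *z*<sub>2</sub>)2<sup>*s*−1</sup> comes from.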

#### *3.2. Modelling the Online Unit*

The online unit can undergo different types of event at any time. These are noted and defined as:


Two of them are described below, and the rest are given in Appendix A. The elements of auxiliary matrices **U**<sup>1</sup> and **U**<sup>2</sup> are defined as

$$\mathbf{U}_{1}(i,j) = \begin{cases} 1 & ; \quad i = j; i = 1, \dots, n_{1} \\ 0 & ; \quad \text{otherwise} \end{cases}; \quad \mathbf{U}_{2}(i,j) = \begin{cases} 1 & ; \quad i = j; i = n_{1} + 1, \dots, m \\ 0 & ; \quad \text{otherwise} \end{cases}$$

Throughout this work the symbol ⊗ denotes the Kronecker product and, given a matrix **A**, we denote by **A**<sup>0</sup> the column vector **A**<sup>0</sup> = **e** − **Ae**, **e** being a column vector of ones of appropriate order.
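The two pieces of notation just introduced, together with the auxiliary matrices **U**<sub>1</sub> and **U**<sub>2</sub>, can be illustrated with a few lines of code. This is a minimal sketch with hypothetical helper names (`kron`, `exit_vector`, `selector`) and hypothetical values; it assumes, following Assumption 2, that the minor states are 1, ..., *n*<sub>1</sub> and the major states are the remaining ones up to *m*.

```python
from fractions import Fraction as F


def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def exit_vector(A):
    """Column vector A^0 = e - A e, returned as a list of one-element rows."""
    return [[1 - sum(row)] for row in A]


def selector(m, lo, hi):
    """Diagonal 0/1 matrix of order m with ones in positions lo..hi (1-based)."""
    return [[1 if i == j and lo <= i <= hi else 0 for j in range(1, m + 1)]
            for i in range(1, m + 1)]


# Hypothetical sizes: m = 3 internal states, the first n1 = 2 being minor.
m, n1 = 3, 2
U1 = selector(m, 1, n1)      # keeps the minor states
U2 = selector(m, n1 + 1, m)  # keeps the major states

if __name__ == "__main__":
    A = [[F(1, 2), F(1, 4)], [F(0), F(1, 2)]]
    print(exit_vector(A))
    print(kron([[1, 2], [3, 4]], [[0, 1], [1, 0]]))
```

The same operations are available in numerical libraries (for instance `numpy.kron`); plain lists are used here only to keep the sketch free of dependencies.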

#### *3.3. No Events at a Certain Time (O)*

We assume that the online unit is operational and at this time it continues working. This occurs because of different situations:

• The internal performance continues in the same phase or changes to another, equally operational state. There is no external shock (**T** ⊗ **L**), and no inspection takes place (**M**). The matrix that governs this transition for the online unit is given by **T** ⊗ **L** ⊗ **M**.


Therefore, the matrix that governs this transition for the online unit is given by

$$\mathbf{H}_O = \left(\mathbf{T} \otimes \mathbf{L} + \mathbf{T} \mathbf{W} \otimes \mathbf{L}^0 \boldsymbol{\gamma} \left(1 - \omega^0\right)\right) \otimes \mathbf{M} + \left(\mathbf{U}_1 \mathbf{T} \otimes \mathbf{L} + \mathbf{U}_1 \mathbf{T} \mathbf{W} \otimes \mathbf{L}^0 \boldsymbol{\gamma} \left(1 - \omega^0\right)\right) \otimes \mathbf{M}^0 \boldsymbol{\eta}$$
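To make the matrix-algorithmic form concrete, the expression for **H**<sub>*O*</sub> can be assembled term by term. The following is a sketch under stated assumptions rather than the author's implementation: all numerical values are hypothetical (*m* = *t* = *ε* = 2, *n*<sub>1</sub> = 1), the helper names are invented for the illustration, and exact fractions are used so that individual entries can be verified by hand.

```python
from fractions import Fraction as F


def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(c, A):
    return [[c * a for a in row] for row in A]


def exit_times_init(A, init):
    """A^0 init: outer product of the column e - A e and the initial row vector."""
    return [[(1 - sum(row)) * p for p in init] for row in A]


def build_H_O(T, W, L, gamma, M, eta, U1, w0):
    """Assemble H_O following the displayed formula, one term per line."""
    shock = scale(1 - w0, exit_times_init(L, gamma))           # L^0 gamma (1 - w0)
    no_insp = add(kron(T, L), kron(matmul(T, W), shock))       # no inspection
    U1T = matmul(U1, T)                                        # rows of minor states only
    insp = add(kron(U1T, L), kron(matmul(U1T, W), shock))      # inspection takes place
    return add(kron(no_insp, M), kron(insp, exit_times_init(M, eta)))


# Hypothetical parameters (illustrative values only).
T = [[F(3, 5), F(1, 5)], [F(0), F(7, 10)]]
W = [[F(1, 2), F(2, 5)], [F(0), F(4, 5)]]
L = [[F(1, 2), F(1, 4)], [F(1, 4), F(1, 2)]]
M = [[F(1, 2), F(1, 4)], [F(0), F(3, 4)]]
gamma, eta = [F(1), F(0)], [F(1), F(0)]
U1 = [[1, 0], [0, 0]]
w0 = F(1, 10)

H_O = build_H_O(T, W, L, gamma, M, eta, U1, w0)

if __name__ == "__main__":
    print(len(H_O), "x", len(H_O[0]))
    print([sum(row) for row in H_O])
```

Each row of the resulting matrix sums to at most one, since the remaining probability mass corresponds to the other types of event (failures and major revision).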

#### *3.4. Non-Repairable Failure (C)*

The online unit is assumed to be operational and at the next time point a non-repairable failure occurs, because:


This transition is independent of the inspection time. After the online unit experiences a non-repairable failure, the online place is occupied by a substitute, identical unit. Then, the matrix is given by

$$\mathbf{H}_{C} = \left[ \mathbf{T}_{nr}^{0} \boldsymbol{\alpha} \otimes \mathbf{L} + \left( \mathbf{T}_{nr}^{0} + \mathbf{T} \mathbf{W}_{nr}^{0} \right) \boldsymbol{\alpha} \otimes \mathbf{L}^{0} \boldsymbol{\gamma} \left( 1 - \omega^{0} \right) + \mathbf{e} \boldsymbol{\alpha} \otimes \mathbf{L}^{0} \boldsymbol{\gamma} \omega^{0} \right] \otimes \mathbf{e} \boldsymbol{\eta}$$

If only one unit is operational and online (i.e., all the others are under repair), this unit experiences a non-repairable failure and no repair is completed, then no immediate substitution can be made and therefore the system does not restart. The matrix is given by

$$\mathbf{H}'_{C} = \left[ \mathbf{T}^{0}_{nr} \otimes \mathbf{L} + \left( \mathbf{T}^{0}_{nr} + \mathbf{T} \mathbf{W}^{0}_{nr} \right) \otimes \mathbf{L}^{0} \boldsymbol{\gamma} \left( 1 - \omega^{0} \right) + \mathbf{e} \otimes \mathbf{L}^{0} \boldsymbol{\gamma} \omega^{0} \right] \otimes \mathbf{e}$$

#### *3.5. The Markovian Arrival Process with Marked Arrivals (MMAP)*

The behaviour of the system is governed by a MMAP. The representation of this MMAP is given from the types of event shown below:

*A*: Internal repairable failure (default without D)

*B*: Major revision (default without D)

*C*: Non-repairable failure (default without D)

*D*: The repairperson resumes work (default without A, B, C)

*AD*: Internal repairable failure and the repairperson resumes work

*BD*: Major revision and the repairperson resumes work

*CD*: Non-repairable failure and the repairperson resumes work

*NS*: New system

*O*: No events

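The representation of the MMAP is (**D**<sup>*O*</sup>, **D**<sup>*A*</sup>, **D**<sup>*B*</sup>, **D**<sup>*C*</sup>, **D**<sup>*D*</sup>, **D**<sup>*AD*</sup>, **D**<sup>*BD*</sup>, **D**<sup>*CD*</sup>, **D**<sup>*NS*</sup>). The transition probability matrix associated with the embedded Markov chain of the MMAP is given by **D** = ∑<sub>*Y*</sub> **D**<sup>*Y*</sup>.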
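The relation between the marked matrices and the embedded chain can be illustrated on a toy example. The sketch below is not the system of this paper: it uses a hypothetical two-phase MMAP with only three marks (*O*, *A*, *C*) and invented helper names, and it shows that the sum of the marked matrices is stochastic and that the probability of an event of type *Y* at the next step, from a phase distribution φ, is φ **D**<sup>*Y*</sup> **e**.

```python
from fractions import Fraction as F


def total_matrix(marked):
    """D = sum over marks Y of D^Y (embedded Markov chain of the MMAP)."""
    mats = list(marked.values())
    n = len(mats[0])
    return [[sum(Dy[i][j] for Dy in mats) for j in range(n)] for i in range(n)]


def event_probability(phi, Dy):
    """phi D^Y e: probability that the next step carries the mark Y."""
    return sum(phi[i] * sum(Dy[i]) for i in range(len(phi)))


# Hypothetical two-phase MMAP with marks O (no event), A and C.
marked = {
    "O": [[F(1, 2), F(1, 4)], [F(0), F(1, 2)]],
    "A": [[F(0), F(0)], [F(1, 4), F(0)]],
    "C": [[F(1, 4), F(0)], [F(1, 4), F(0)]],
}
D = total_matrix(marked)
phi = [F(1), F(0)]

if __name__ == "__main__":
    print(D)
    for mark, Dy in marked.items():
        print(mark, event_probability(phi, Dy))
```

Since exactly one mark is attached to every step, the event probabilities computed from any distribution φ add up to one.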

Two matrices **D**<sup>*Y*</sup> are described in the next section. The rest are given in Appendix B.

#### The Matrices **D**<sup>A</sup> and **D**<sup>B</sup>

The matrices **D**<sup>*A*</sup> and **D**<sup>*B*</sup> govern the transition when a repairable failure or a major inspection takes place, respectively. These matrices are composed of matrix blocks that contain the transitions between macro-states **U**<sup>*k*</sup>. Each is a block diagonal matrix, given that the number of units in the system does not change in this transition. The matrix **D**<sub>*k*</sub><sup>*Y*</sup> contains the transition probabilities when there are *k* units in the system and the event *Y* occurs, for *Y* = *A* or *B* and *k* = 1, ..., *n*. Then,

$$\mathbf{D}^{Y} = \begin{pmatrix} \mathbf{D}\_{n}^{Y} \\ & \mathbf{D}\_{n-1}^{Y} \\ & & \mathbf{D}\_{n-2}^{Y} \\ & & & \ddots \\ & & & & \mathbf{D}\_{1}^{Y} \end{pmatrix} \quad \text{for } Y = A, B.$$
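Computationally, this block diagonal structure amounts to placing the blocks for *k* = *n*, *n* − 1, ..., 1 along the diagonal. A minimal sketch follows; the function names and the placeholder blocks are hypothetical, and the blocks are assumed to be already available as lists of rows.

```python
def block_diag(blocks):
    """Place the given matrices (lists of rows) along the diagonal of a zero matrix."""
    n_rows = sum(len(b) for b in blocks)
    n_cols = sum(len(b[0]) for b in blocks)
    out = [[0] * n_cols for _ in range(n_rows)]
    r0 = c0 = 0
    for b in blocks:
        for i, row in enumerate(b):
            out[r0 + i][c0:c0 + len(row)] = row
        r0 += len(b)
        c0 += len(b[0])
    return out


def assemble(blocks_by_k, n):
    """Order the blocks as in the display: k = n first, k = 1 last."""
    return block_diag([blocks_by_k[k] for k in range(n, 0, -1)])


if __name__ == "__main__":
    # Placeholder blocks of different sizes, indexed by the number of units k.
    blocks_by_k = {2: [[1, 2], [3, 4]], 1: [[5]]}
    for row in assemble(blocks_by_k, 2):
        print(row)
```

The blocks need not have the same order, which is the case here because the number of phases in **U**<sup>*k*</sup> depends on *k*.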

These blocks are composed of further blocks.


• If the number of units is less than *R*, the repairperson is always in his workplace. Then, for *k* = 1, ..., *R* − 1

$$\mathbf{D}_{k}^{Y} = \begin{pmatrix} \mathbf{0} & \mathbf{D}_{01}^{Y,k,nv} & & & \\ & \mathbf{D}_{11}^{Y,k,nv} & \mathbf{D}_{12}^{Y,k,nv} & & \\ & & \ddots & \ddots & \\ & & & \mathbf{D}_{k-1,k-1}^{Y,k,nv} & \mathbf{D}_{k-1,k}^{Y,k,nv} \\ & & & & \mathbf{0} \end{pmatrix},$$

where the rows and columns are ordered according to the macro-states $\mathbf{E}_{0}^{k,nv}, \mathbf{E}_{1}^{k,nv}, \dots, \mathbf{E}_{k}^{k,nv}$.

The block **D**<sub>*i*,*j*</sub><sup>*Y*,*k*,*nv*</sup> contains the transition, when there are *k* units in the system, from *i* units in the repair facility to *j* (an event of type *Y* occurs) while the repairperson is in his workplace. For instance, the cases **D**<sub>01</sub><sup>*A*,*k*,*nv*</sup> and **D**<sub>01</sub><sup>*B*,*k*,*nv*</sup> (transition **E**<sub>0</sub><sup>*k*,*nv*</sup> → **E**<sub>1</sub><sup>*k*,*nv*</sup> for types *A* and *B*, respectively) are analysed.

In both cases, there are *k* units in the system and none of these is in the repair facility (all operational). The online unit goes to the repair facility if it undergoes an internal repairable failure (**H**<sub>*A*</sub>) or a major inspection (**H**<sub>*B*</sub>). In both cases a new unit will occupy the online place if the number of units in the system is greater than one. If the event is a repairable failure, then the unit will begin the repair, given that the repairperson is not on vacation (β<sub>1</sub>). If the event is a major inspection, the initial distribution for the preventive maintenance is β<sub>2</sub>.

• If the number of units is greater than or equal to *R*, the repairperson may or may not be on vacation. If the repairperson returns and there are fewer than *R* operational units, then he remains at his workplace. Events *A* and *B* occur when a repairable failure or a major inspection takes place (without the repairperson returning to work), and are considered for *k* = *R*, ..., *n* (with *N* = *k* − *R* + 1, the number of units in the repair facility from which the repairperson must remain).

For these values of *k*, the matrix **D**<sub>*k*</sub><sup>*Y*</sup> is partitioned into two large matrix blocks according to the transition between macro-states: the repairperson continues on vacation or continues in the repair facility.

The block **D**<sub>*i*,*j*</sub><sup>*Y*,*k*,*v*</sup> contains the transition, when there are *k* units in the system, from *i* units in the repair facility to *j* (type *Y*) while the repairperson continues on vacation. For instance, the cases **D**<sub>01</sub><sup>*A*,*k*,*v*</sup> and **D**<sub>01</sub><sup>*B*,*k*,*v*</sup> correspond to the transition **E**<sub>0</sub><sup>*k*,*v*</sup> → **E**<sub>1</sub><sup>*k*,*v*</sup> for types *A* and *B*, respectively.

These matrices are, for *k* = 1, ..., *n* and *R* > 1,

$$\mathbf{D}_{01}^{A,k,nv} = \left( \left( I_{\{k>1\}} \mathbf{H}_A + I_{\{k=1\}} \mathbf{H}'_A \right) \otimes \boldsymbol{\beta}_1,\ \mathbf{0} \right); \quad \mathbf{D}_{01}^{B,k,nv} = \left( \mathbf{0},\ \left( I_{\{k>1\}} \mathbf{H}_B + I_{\{k=1\}} \mathbf{H}'_B \right) \otimes \boldsymbol{\beta}_2 \right)$$

The rest of the matrices for this matrix block are as follows.

$$\mathbf{D}_{1,1}^{A,k,nv} = \begin{pmatrix} \mathbf{H}_A \otimes \mathbf{S}_1^0 \otimes \boldsymbol{\beta}_1 & \mathbf{0} \\ \mathbf{H}_A \otimes \mathbf{S}_2^0 \otimes \boldsymbol{\beta}_1 & \mathbf{0} \end{pmatrix}; \quad \mathbf{D}_{1,1}^{B,k,nv} = \begin{pmatrix} \mathbf{0} & \mathbf{H}_B \otimes \mathbf{S}_1^0 \otimes \boldsymbol{\beta}_2 \\ \mathbf{0} & \mathbf{H}_B \otimes \mathbf{S}_2^0 \otimes \boldsymbol{\beta}_2 \end{pmatrix}$$

For *r* = 2, ..., *k* − 1

$$\mathbf{D}_{r,r}^{A,k,nv} = \begin{pmatrix} \mathbf{I}_{2^{r-2}} \otimes \left( \mathbf{H}_A \otimes \mathbf{S}_1^0 \otimes \boldsymbol{\beta}_1,\ \mathbf{0} \right) & \mathbf{0} \\ \mathbf{0} & \mathbf{I}_{2^{r-2}} \otimes \left( \mathbf{H}_A \otimes \mathbf{S}_1^0 \otimes \boldsymbol{\beta}_2,\ \mathbf{0} \right) \\ \mathbf{I}_{2^{r-2}} \otimes \left( \mathbf{H}_A \otimes \mathbf{S}_2^0 \otimes \boldsymbol{\beta}_1,\ \mathbf{0} \right) & \mathbf{0} \\ \mathbf{0} & \mathbf{I}_{2^{r-2}} \otimes \left( \mathbf{H}_A \otimes \mathbf{S}_2^0 \otimes \boldsymbol{\beta}_2,\ \mathbf{0} \right) \end{pmatrix}$$

$$\mathbf{D}_{r,r}^{B,k,nv} = \begin{pmatrix} \mathbf{I}_{2^{r-2}} \otimes \left( \mathbf{0},\ \mathbf{H}_B \otimes \mathbf{S}_1^0 \otimes \boldsymbol{\beta}_1 \right) & \mathbf{0} \\ \mathbf{0} & \mathbf{I}_{2^{r-2}} \otimes \left( \mathbf{0},\ \mathbf{H}_B \otimes \mathbf{S}_1^0 \otimes \boldsymbol{\beta}_2 \right) \\ \mathbf{I}_{2^{r-2}} \otimes \left( \mathbf{0},\ \mathbf{H}_B \otimes \mathbf{S}_2^0 \otimes \boldsymbol{\beta}_1 \right) & \mathbf{0} \\ \mathbf{0} & \mathbf{I}_{2^{r-2}} \otimes \left( \mathbf{0},\ \mathbf{H}_B \otimes \mathbf{S}_2^0 \otimes \boldsymbol{\beta}_2 \right) \end{pmatrix}$$

For *r* = max{1, *k* − *R* + 1}, ..., *k* − 1

$$\mathbf{D}_{r,r+1}^{A,k,nv} = \begin{pmatrix} \mathbf{I}_{2^{r-1}} \otimes \left( \left( I_{\{r<k-1\}} \mathbf{H}_A + I_{\{r=k-1\}} \mathbf{H}'_A \right) \otimes \mathbf{S}_1,\ \mathbf{0} \right) & \mathbf{0} \\ \mathbf{0} & \mathbf{I}_{2^{r-1}} \otimes \left( \left( I_{\{r<k-1\}} \mathbf{H}_A + I_{\{r=k-1\}} \mathbf{H}'_A \right) \otimes \mathbf{S}_2,\ \mathbf{0} \right) \end{pmatrix}$$

$$\mathbf{D}_{r,r+1}^{B,k,nv} = \begin{pmatrix} \mathbf{I}_{2^{r-1}} \otimes \left( \mathbf{0},\ \left( I_{\{r<k-1\}} \mathbf{H}_B + I_{\{r=k-1\}} \mathbf{H}'_B \right) \otimes \mathbf{S}_1 \right) & \mathbf{0} \\ \mathbf{0} & \mathbf{I}_{2^{r-1}} \otimes \left( \mathbf{0},\ \left( I_{\{r<k-1\}} \mathbf{H}_B + I_{\{r=k-1\}} \mathbf{H}'_B \right) \otimes \mathbf{S}_2 \right) \end{pmatrix}$$

For *r* = 1, ..., *k* − 1 and *k* ≥ *R*

$$\mathbf{D}_{0,1}^{A,k,v} = \left( \mathbf{H}_A \otimes \left( \mathbf{V} + I_{\{k \ge R+1\}} \mathbf{V}^0 \mathbf{v} \right),\ \mathbf{0} \right); \quad \mathbf{D}_{0,1}^{B,k,v} = \left( \mathbf{0},\ \mathbf{H}_B \otimes \left( \mathbf{V} + I_{\{k \ge R+1\}} \mathbf{V}^0 \mathbf{v} \right) \right)$$

$$\mathbf{D}_{r,r+1}^{A,k,v} = \mathbf{I}_{2^{r}} \otimes \left( \left( I_{\{r<k-1\}} \mathbf{H}_A + I_{\{r=k-1\}} \mathbf{H}'_A \right) \otimes \left( \mathbf{V} + I_{\{r<N-1\}} \mathbf{V}^0 \mathbf{v} \right),\ \mathbf{0} \right)$$

$$\mathbf{D}_{r,r+1}^{B,k,v} = \mathbf{I}_{2^{r}} \otimes \left( \mathbf{0},\ \left( I_{\{r<k-1\}} \mathbf{H}_B + I_{\{r=k-1\}} \mathbf{H}'_B \right) \otimes \left( \mathbf{V} + I_{\{r<N-1\}} \mathbf{V}^0 \mathbf{v} \right) \right).$$

#### **4. Measures**

Several measures of interest, in both the transient and the stationary regime, are derived in this section.

#### *4.1. The Transient and the Stationary Distribution*

The transient distribution is determined by the initial distribution and the transition probability matrix of the vector Markov process given in Section 3.3.

Initially the online unit is new and the inspection time begins. The initial distribution of the Markov process is then $\boldsymbol{\phi} = [\boldsymbol{\alpha} \otimes \boldsymbol{\gamma}_{st} \otimes \boldsymbol{\eta}, \mathbf{0}]$, where $\boldsymbol{\gamma}_{st}$ is the stationary distribution of the phase-type renewal process with transition probability matrix $\mathbf{L} + \mathbf{L}^{0}\boldsymbol{\gamma}$. Therefore, $\boldsymbol{\gamma}_{st} = [1, \mathbf{0}]\left[\mathbf{e} \mid \left(\mathbf{L} + \mathbf{L}^{0}\boldsymbol{\gamma} - \mathbf{I}\right)^{*}\right]^{-1}$, where $^{*}$ denotes the matrix without its first column.
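As a minimal sketch of this formula, the following Python snippet evaluates $\boldsymbol{\gamma}_{st}$ for the shock-time representation used later in Section 6.1, namely $\boldsymbol{\gamma} = (1, 0)$ and the matrix $\mathbf{L}$ given there. The helper name `solve_left` is hypothetical and introduced only for this illustration; it solves $\mathbf{x}\mathbf{M} = \mathbf{b}$ by Gauss-Jordan elimination.

```python
def solve_left(b, M):
    """Return x such that x M = b (Gauss-Jordan on the transposed system)."""
    n = len(M)
    # Augmented matrix of M^T x^T = b^T
    aug = [[M[j][i] for j in range(n)] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [v / piv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


# Shock-time representation from the numerical example (Section 6.1)
gamma = [1.0, 0.0]
L = [[0.9, 0.05], [0.0, 0.5]]
t = len(L)

L0 = [1.0 - sum(row) for row in L]                      # absorption vector L^0 = e - L e
P = [[L[i][j] + L0[i] * gamma[j] for j in range(t)]     # renewal chain L + L^0 gamma
     for i in range(t)]

# [ e | (P - I) without its first column ]
M = [[1.0] + [P[i][j] - (1.0 if i == j else 0.0) for j in range(1, t)]
     for i in range(t)]

gamma_st = solve_left([1.0] + [0.0] * (t - 1), M)
print(gamma_st)
```

For these values the result agrees with the direct solution of the balance equations, $\boldsymbol{\gamma}_{st} = (10/11, 1/11)$.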

The probability of occupying the macro-state $E^{k,a}_{s}$ at time $\nu$ is worked out by matrix blocks as $\mathbf{p}^{\nu}_{E^{k,a}_{s}} = \left(\boldsymbol{\phi}\mathbf{D}^{\nu}\right)_{I^{k,a}_{s}}$, where $I^{k,a}_{s}$ indicates the range of the corresponding states. Evidently, $\mathbf{p}^{\nu}$ is the transient distribution at time $\nu$.

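To calculate the stationary distribution in a matrix-algorithmic form, we have partitioned the matrix $\mathbf{D}$, according to the transitions between the macro-states $\mathbf{U}_j$ ordered as $\mathbf{U}_n, \ldots, \mathbf{U}_1$, into the following blocks,

$$\mathbf{D} = \begin{pmatrix} \mathbf{D}_{n,n} & \mathbf{D}_{n,n-1} & & & \\ & \mathbf{D}_{n-1,n-1} & \mathbf{D}_{n-1,n-2} & & \\ & & \ddots & \ddots & \\ & & & \mathbf{D}_{2,2} & \mathbf{D}_{2,1} \\ \mathbf{D}_{1,n} & & & & \mathbf{D}_{1,1} \end{pmatrix}$$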


where

$$\begin{array}{l} \mathbf{D}_{ii} = \mathbf{D}_i^{O} + \mathbf{D}_i^{A} + \mathbf{D}_i^{B} + \mathbf{D}_i^{D} + \mathbf{D}_i^{AD} + \mathbf{D}_i^{BD}; \quad i = 1, \ldots, n \\ \mathbf{D}_{i,i-1} = \mathbf{D}_i^{C} + \mathbf{D}_i^{CD}; \quad i = 2, \ldots, n \\ \mathbf{D}_{1,n} = \mathbf{D}_1^{NS}. \end{array}$$

The stationary distribution $\boldsymbol{\pi}$ satisfies the balance equations $\boldsymbol{\pi}\mathbf{D} = \boldsymbol{\pi}$ and the normalization equation $\boldsymbol{\pi}\mathbf{e} = 1$. This vector is partitioned according to the macro-states $\mathbf{U}_j$ ($j$ units in the system), so that $\boldsymbol{\pi} = \{\boldsymbol{\pi}_n, \boldsymbol{\pi}_{n-1}, \ldots, \boldsymbol{\pi}_1\}$ for the macro-states $\mathbf{U}_n, \ldots, \mathbf{U}_1$, respectively.

The solution of this matrix system is $\boldsymbol{\pi}_j = \boldsymbol{\pi}_1\mathbf{R}_j$; $j = 2, \ldots, n$, where $\mathbf{R}_j = \mathbf{R}_{j+1}\mathbf{G}_{j+1,j} = \mathbf{G}_{1,n}\mathbf{G}_{n,n-1} \cdots \mathbf{G}_{j+1,j}$; $j = 2, \ldots, n-1$, $\mathbf{R}_n = \mathbf{G}_{1,n}$ and $\mathbf{G}_{ij} = \mathbf{D}_{ij}\left(\mathbf{I} - \mathbf{D}_{jj}\right)^{-1}$ for $(i, j) \in \{(1, n), (n, n-1), (n-1, n-2), \ldots, (3, 2)\}$.

The stationary probability vector for the macro-state $\mathbf{U}_1$ can be worked out from the normalization condition and one balance equation as

$$\boldsymbol{\pi}_1 = (1, \mathbf{0})\left[\mathbf{e} + \sum_{j=2}^{n} \mathbf{R}_j\mathbf{e} \;\middle|\; \left(\mathbf{I} - \mathbf{D}_{11} - \mathbf{R}_2\mathbf{D}_{21}\right)^{*}\right]^{-1}$$

where $^{*}$ denotes the corresponding matrix without its first column.
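The recursion above can be sketched in a few lines of Python. The blocks used here are hypothetical 2 × 2 matrices for a toy system with $n = 3$ macro-states, chosen only so that every row of the full matrix sums to one; they are not the blocks of the model. The function name `stationary_by_blocks` and its argument layout are likewise assumptions made for this illustration.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def mat_inv(A):
    """Inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [list(A[i]) + [1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def eye_minus(A):
    """I - A for a square matrix A."""
    n = len(A)
    return [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)] for i in range(n)]


def stationary_by_blocks(diag, down, restart):
    """diag[j] = D_jj, down[j] = D_{j,j-1} (j >= 2), restart = D_{1,n}; requires n >= 2."""
    n = max(diag)
    R = {n: mat_mul(restart, mat_inv(eye_minus(diag[n])))}        # R_n = G_{1,n}
    for j in range(n - 1, 1, -1):                                   # R_j = R_{j+1} G_{j+1,j}
        G = mat_mul(down[j + 1], mat_inv(eye_minus(diag[j])))
        R[j] = mat_mul(R[j + 1], G)
    A = eye_minus(diag[1])
    RD = mat_mul(R[2], down[2])
    size = len(A)
    A = [[A[i][c] - RD[i][c] for c in range(size)] for i in range(size)]
    norm = [1.0 + sum(sum(R[j][i]) for j in R) for i in range(size)]  # e + sum_j R_j e
    M = [[norm[i]] + A[i][1:] for i in range(size)]                 # [norm | A without column 1]
    pi = {1: mat_inv(M)[0]}                                         # (1, 0, ..., 0) M^{-1}
    for j in R:
        pi[j] = mat_mul([pi[1]], R[j])[0]
    return pi


# Hypothetical toy blocks (illustration only)
diag = {3: [[0.5, 0.2], [0.1, 0.6]], 2: [[0.4, 0.3], [0.2, 0.5]], 1: [[0.6, 0.1], [0.3, 0.4]]}
down = {3: [[0.2, 0.1], [0.1, 0.2]], 2: [[0.1, 0.2], [0.2, 0.1]]}
restart = [[0.3, 0.0], [0.3, 0.0]]

pi = stationary_by_blocks(diag, down, restart)
print(pi)
```

Only the diagonal blocks $\mathbf{I} - \mathbf{D}_{jj}$ and one matrix of the size of $\mathbf{U}_1$ are inverted, which is the practical advantage of the block form over inverting the full matrix.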

From the stationary distribution and considering the macro-states, multiple proportional time measures can be defined:


• Proportional time that the repairperson is in the workplace (not on vacation):

$$\Upsilon_{nv} = \sum_{k=1}^{R-1} \sum_{s=0}^{k} \boldsymbol{\pi}_{E_s^{k,nv}} \mathbf{e} + \sum_{k=R}^{n} \sum_{s=k-R+1}^{k} \boldsymbol{\pi}_{E_s^{k,nv}} \mathbf{e}.$$

• Proportional time that the repairperson is working:

$$\Upsilon_{w} = \sum_{k=1}^{R-1} \sum_{s=1}^{k} \boldsymbol{\pi}_{E_s^{k,nv}} \mathbf{e} + \sum_{k=R}^{n} \sum_{s=k-R+1}^{k} \boldsymbol{\pi}_{E_s^{k,nv}} \mathbf{e}.$$

• Proportional time that the repairperson is idle: $\Upsilon_{i} = \Upsilon_{nv} - \Upsilon_{w}$.


#### *4.2. Availability and Mean Times*

It is of interest to calculate the availability of the system, the mean time in each macro-state and the mean operational time. These measures are summarized in Table 1 for both the transient and the stationary regime.


**Table 1.** Availability and mean times in transient and stationary regime.

#### *4.3. Time up to First Time That the System Is Replaced*

A system composed of $n$ units is replaced by a new and identical one when all units have undergone a non-repairable failure. The time up to this event is phase-type distributed with representation $(\boldsymbol{\phi}, \mathbf{D}')$, where $\mathbf{D}' = \mathbf{D}^{O} + \mathbf{D}^{A} + \mathbf{D}^{B} + \mathbf{D}^{C} + \mathbf{D}^{D} + \mathbf{D}^{AD} + \mathbf{D}^{BD} + \mathbf{D}^{CD}$.

#### *4.4. Expected Number of Events*

The expected number of events up to time *ν* is determined using the Markovian Arrival Process with Marked arrivals developed in Section 3.3. If the event considered is denoted by *Y* then the corresponding expected number of events is given by

$$\Lambda^{Y}(\nu) = \sum_{u=1}^{\nu} \mathbf{p}^{u-1} \mathbf{D}^{Y} \mathbf{e}$$

for $Y = A, B, C, D, AD, BD, CD, NS$. In the stationary regime this value is $\Lambda^{Y} = \boldsymbol{\pi}\mathbf{D}^{Y}\mathbf{e}$. Further mean numbers of events can be calculated as follows.
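To illustrate how this sum is evaluated, the sketch below applies it to a much smaller marked arrival process than the full model: the renewal process of external shocks alone, with $\mathbf{D}^{O} = \mathbf{L}$ (no shock) and $\mathbf{D}^{Y} = \mathbf{L}^{0}\boldsymbol{\gamma}$ (a shock occurs), using the values of $\mathbf{L}$ and $\boldsymbol{\gamma}$ from Section 6.1. Treating the shock process in isolation, and the function name `expected_events`, are assumptions made purely for this example.

```python
def vec_mat(x, A):
    """Row vector times matrix."""
    return [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(A[0]))]


def expected_events(p0, D, DY, nu):
    """Sum over u = 1..nu of p^{u-1} D^Y e, with p^u = p^{u-1} D."""
    p, total = list(p0), 0.0
    for _ in range(nu):
        total += sum(vec_mat(p, DY))   # p^{u-1} D^Y e
        p = vec_mat(p, D)              # advance one time step
    return total


gamma = [1.0, 0.0]
L = [[0.9, 0.05], [0.0, 0.5]]
L0 = [1.0 - sum(row) for row in L]

DY = [[L0[i] * gamma[j] for j in range(2)] for i in range(2)]     # shock occurs
D = [[L[i][j] + DY[i][j] for j in range(2)] for i in range(2)]    # D = D^O + D^Y

# Stationary vector of D for these values (it satisfies gamma_st D = gamma_st)
gamma_st = [10.0 / 11.0, 1.0 / 11.0]

rate = sum(vec_mat(gamma_st, DY))                 # stationary value: pi D^Y e
shocks_110 = expected_events(gamma_st, D, DY, 110)
print(rate, shocks_110)
```

Starting from the stationary vector, every term of the sum equals the stationary rate, so the transient value is simply $\nu$ times that rate; from any other initial vector the terms differ during the first steps.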

#### *4.5. Mean Number of Repairable Failures*

A repairable failure may or may not coincide with the return of the repairperson to work. The mean number of repairable failures up to time $\nu$ is therefore $\Lambda^{rep}(\nu) = \sum_{u=1}^{\nu} \mathbf{p}^{u-1}\left(\mathbf{D}^{A} + \mathbf{D}^{AD}\right)\mathbf{e}$, and in the stationary regime it is $\Lambda^{rep} = \boldsymbol{\pi}\left(\mathbf{D}^{A} + \mathbf{D}^{AD}\right)\mathbf{e}$.

#### *4.6. Mean Number of Major Inspections*

Analogously to the repairable case, a major inspection may or may not coincide with the return of the repairperson to the workplace. In the transient regime the mean number is $\Lambda^{mi}(\nu) = \sum_{u=1}^{\nu} \mathbf{p}^{u-1}\left(\mathbf{D}^{B} + \mathbf{D}^{BD}\right)\mathbf{e}$, and in the stationary case it is $\Lambda^{mi} = \boldsymbol{\pi}\left(\mathbf{D}^{B} + \mathbf{D}^{BD}\right)\mathbf{e}$.

#### *4.7. Mean Number of Non-Repairable Failures (No Provoking System Failure)*

The mean number of non-repairable failures up to time *ν* is

$$\Lambda^{nr}(\nu) = \sum_{u=1}^{\nu} \mathbf{p}^{u-1} \left( \mathbf{D}^{C} + \mathbf{D}^{CD} \right) \mathbf{e}.$$

In the stationary case this value is $\Lambda^{nr} = \boldsymbol{\pi}\left(\mathbf{D}^{C} + \mathbf{D}^{CD}\right)\mathbf{e}$.

#### *4.8. Mean Number of Times That the Repairperson Resumes to Work*

The mean number of times that the repairperson returns and remains in his workplace up to a certain time is given by

$$\Lambda^{rejoined}(\nu) = \sum_{u=1}^{\nu} \mathbf{p}^{u-1} \left( \mathbf{D}^{D} + \mathbf{D}^{AD} + \mathbf{D}^{BD} + \mathbf{D}^{CD} \right) \mathbf{e}.$$

In the stationary case this is $\Lambda^{rejoined} = \boldsymbol{\pi}\left(\mathbf{D}^{D} + \mathbf{D}^{AD} + \mathbf{D}^{BD} + \mathbf{D}^{CD}\right)\mathbf{e}$.

#### *4.9. Mean Number of Times That the Repairperson Resumes and Begins a New Period of Vacation*

The mean number of times that the repairperson returns and immediately begins a new vacation period up to a certain time is given by

$$\Lambda^{r-b}(\nu) = \sum_{u=1}^{\nu} \mathbf{p}^{u-1} \mathbf{Q} \mathbf{e},$$

where $\mathbf{Q}$ is a matrix described in Appendix C. In the stationary case it is $\Lambda^{r-b} = \boldsymbol{\pi}\mathbf{Q}\mathbf{e}$.

#### *4.10. Mean Number of New Systems*

When the system is composed of only one unit and a non-repairable failure occurs, the system is restarted with *n* new units. The mean number of new systems up to time *ν* is

$$\Lambda^{NS}(\nu) = \sum_{u=1}^{\nu} \mathbf{p}^{u-1} \mathbf{D}^{NS} \mathbf{e}.$$

In the stationary case this measure is $\Lambda^{NS} = \boldsymbol{\pi}\mathbf{D}^{NS}\mathbf{e}$.

#### **5. Rewards and Costs**

To analyze the effectiveness of the model from an economic point of view, costs and rewards have been taken into account. A net profit vector associated with the state space is built. First, the following quantities are introduced:

*B*: gross profit per unit of time while the system is operational.

$\mathbf{c}_0$: expected cost per unit of time, depending on the operational phase, while the system is operational.

$\mathbf{cr}_1$: expected corrective repair cost per unit of time, depending on the repair phase.

$\mathbf{cr}_2$: expected preventive maintenance cost per unit of time for a unit that was observed with major damage, depending on the preventive maintenance phase.

*H*: repairperson cost per unit of time while the repairperson is idle.

*C*: loss per unit of time while the system is not operational.

*G*: fixed cost associated with each return of the repairperson (regardless of whether he stays or not).

*fcr*: fixed cost each time that the online unit undergoes a repairable failure.

*fmi*: fixed cost each time that the online unit undergoes a major inspection.

*fnu*: cost of a new unit (*n*·*fnu* is the cost of a new system).

#### *5.1. Net Profit Vector*

When the system occupies a given state, a net profit value is produced. Costs and rewards from the online unit and the cost incurred by the repairperson have been taken into account to build the net profit vector.

#### 5.1.1. Online Unit

If only the online unit is considered when the system visits the macro-state $\mathbf{E}^{k,nv}_{s}$, a net reward is obtained for each phase of this macro-state. When the repairperson is in his workplace (macro-states $\mathbf{E}^{k,nv}_{s}$), the net profit vector for the online unit is, for $k = 1, \ldots, n$,

$$\mathbf{nr}_s^{k,nv} = \begin{cases} B\mathbf{e}_{mt\varepsilon} - \mathbf{c}_0 \otimes \mathbf{e}_{t\varepsilon} & ; \quad s = 0 \\ B\mathbf{e}_{mt\varepsilon 2^{s-1}(z_1 + z_2)} - \mathbf{c}_0 \otimes \mathbf{e}_{t\varepsilon 2^{s-1}(z_1 + z_2)} & ; \quad s = 1, \ldots, k - 1 \\ -C \cdot \mathbf{e}_{t 2^{s-1}(z_1 + z_2)} & ; \quad s = k. \end{cases}$$

For any number of units in the repair facility, these vectors are stacked into the column vector $\mathbf{nr}^{k,nv}_{Total} = \left({\mathbf{nr}_0^{k,nv}}'; \ldots; {\mathbf{nr}_k^{k,nv}}'\right)'$.
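As a small illustration of how one block of this vector is assembled, the following sketch builds the $s = 0$ case, $B\mathbf{e}_{mt\varepsilon} - \mathbf{c}_0 \otimes \mathbf{e}_{t\varepsilon}$, with the values $B = 60$ and $\mathbf{c}_0 = (5, 12, 30, 40)$ used in Section 6.2. It is assumed here that $m$, $t$ and $\varepsilon$ are the orders of the internal wear, shock-time and inspection-time representations (4, 2 and 2 in the numerical example); the helper `kron_vec` is a hypothetical name.

```python
def kron_vec(a, b):
    """Kronecker product of two vectors stored as flat lists."""
    return [x * y for x in a for y in b]


B = 60.0                           # gross profit per unit of time
c0 = [5.0, 12.0, 30.0, 40.0]       # operational cost per internal phase (m = 4)
t, eps = 2, 2                      # assumed orders of L and M

ones = [1.0] * (t * eps)           # e_{t * eps}
nr_0 = [B - c for c in kron_vec(c0, ones)]   # B e_{m t eps} - c0 (x) e_{t eps}

print(len(nr_0), nr_0)
```

Each internal phase contributes a run of $t\varepsilon$ identical entries, so the net reward depends only on the degradation level of the online unit and not on the shock or inspection phase.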

If the number of units in the repair facility is $N$ or more, then the repairperson remains at his workplace without vacation. In this case we define $\mathbf{nr}^{k,nv}_{fromN} = \left({\mathbf{nr}_N^{k,nv}}'; \ldots; {\mathbf{nr}_k^{k,nv}}'\right)'$. For the case in which the repairperson is on vacation, the net profit vector for the online unit for the macro-state $\mathbf{E}^{k,v}_{s}$ is

$$\mathbf{nr}_s^{k,v} = \begin{cases} B\mathbf{e}_{mt\varepsilon\upsilon} - \mathbf{c}_0 \otimes \mathbf{e}_{t\varepsilon\upsilon} & ; \quad s = 0 \\ B\mathbf{e}_{mt\varepsilon\upsilon 2^{s}} - \mathbf{c}_0 \otimes \mathbf{e}_{t\varepsilon\upsilon 2^{s}} & ; \quad s = 1, \ldots, k - 1 \\ -C \cdot \mathbf{e}_{t\upsilon 2^{s}} & ; \quad s = k. \end{cases}$$

For any number of units in the repair facility, the column vector $\mathbf{nr}^{k,v}_{Total} = \left({\mathbf{nr}_0^{k,v}}'; \ldots; {\mathbf{nr}_k^{k,v}}'\right)'$ is defined.

If the total state space is considered, then the net reward for the online unit, according to the state visited, is

$$\mathbf{nr} = \left({\mathbf{nr}_{Total}^{n,v}}'; {\mathbf{nr}_{fromN}^{n,nv}}'; {\mathbf{nr}_{Total}^{n-1,v}}'; {\mathbf{nr}_{fromN}^{n-1,nv}}'; \ldots; {\mathbf{nr}_{Total}^{N,v}}'; {\mathbf{nr}_{fromN}^{N,nv}}'; {\mathbf{nr}_{Total}^{N-1,nv}}'; \ldots; {\mathbf{nr}_{Total}^{1,nv}}'\right)'.$$

#### 5.1.2. Repair Facility

If only the repair facility is considered when the system visits the macro-state $\mathbf{E}^{k,nv}_{s}$, the cost vector for the phases of the corresponding macro-state is, for $k = 1, \ldots, n$,

$$\mathbf{nc}_{s}^{k,nv} = \begin{cases} H \cdot \mathbf{e}_{mt\varepsilon} & ; \quad s = 0 \\ \mathbf{e}_{t(m\varepsilon)^{I_{\{s<k\}}}} \otimes \begin{pmatrix} \mathbf{e}_{2^{s-1}} \otimes \mathbf{cr}_{1} \\ \mathbf{e}_{2^{s-1}} \otimes \mathbf{cr}_{2} \end{pmatrix} & ; \quad s = 1, \ldots, k. \end{cases}$$

For any number of units in the repair facility, the following column vectors are defined,

$$\mathbf{nc}_{Total}^{k,nv} = \left({\mathbf{nc}_0^{k,nv}}'; \ldots; {\mathbf{nc}_k^{k,nv}}'\right)', \quad \mathbf{nc}_{fromN}^{k,nv} = \left({\mathbf{nc}_N^{k,nv}}'; \ldots; {\mathbf{nc}_k^{k,nv}}'\right)'.$$

For any $k$ and $s$, the cost of the repair facility is zero while the repairperson is on vacation, so the corresponding column vector is $\mathbf{nc}^{k,v}_{s} = \mathbf{0}_{(m\varepsilon)^{I_{\{s<k\}}} t\upsilon 2^{s}}$. For any number of units in the repair facility it is defined as $\mathbf{nc}^{k,v}_{Total} = \left({\mathbf{nc}_0^{k,v}}'; \ldots; {\mathbf{nc}_k^{k,v}}'\right)'$.

Then, the cost vector associated with the state space due to repair is given by

$$\mathbf{nc} = \left({\mathbf{nc}^{n,v}}'; {\mathbf{nc}_{fromN}^{n,nv}}'; {\mathbf{nc}^{n-1,v}}'; {\mathbf{nc}_{fromN}^{n-1,nv}}'; \ldots; {\mathbf{nc}^{N,v}}'; {\mathbf{nc}_{fromN}^{N,nv}}'; {\mathbf{nc}_{Total}^{N-1,nv}}'; \ldots; {\mathbf{nc}_{Total}^{1,nv}}'\right)'.$$

Therefore, the net profit vector corresponding to the online unit and the repair facility for the global state space is given by

$$\mathbf{c} = \mathbf{nr} - \mathbf{nc} = \begin{pmatrix} \mathbf{c}^{n} \\ \mathbf{c}^{n-1} \\ \vdots \\ \mathbf{c}^{1} \end{pmatrix},$$

where

$$\begin{aligned} \mathbf{c}^{k} &= \mathbf{nr}_{Total}^{k,nv} - \mathbf{nc}_{Total}^{k,nv} \ \text{ for } k = 1, \ldots, R - 1, \\ \mathbf{c}^{k} &= \left({\mathbf{nr}^{k,v}}' - {\mathbf{nc}^{k,v}}'; \ {\mathbf{nr}_{fromN}^{k,nv}}' - {\mathbf{nc}_{fromN}^{k,nv}}'\right)' \ \text{ for } k = R, \ldots, n. \end{aligned}$$

#### *5.2. Expected Net Profits and Total Net Profit*

Net reward measures are worked out, in transient and stationary regimes, to analyze the effectiveness of the system from an economic point of view.

#### 5.2.1. Expected Net Profit from the Online Unit Up to Time ν

The expected net profit up to time ν by considering only the online unit is

$$\Phi_{w}^{\nu} = \sum_{m=0}^{\nu} \mathbf{p}^{m} \cdot \mathbf{nr}.$$

In the stationary regime this is given by $\Phi_{w\_s} = \boldsymbol{\pi} \cdot \mathbf{nr}$.

#### 5.2.2. Expected Cost from Corrective Repair and Preventive Maintenance

The expected costs due to corrective repair and to preventive maintenance up to time $\nu$ are, respectively,

$$\Phi_{cr}^{\nu} = \sum_{m=0}^{\nu} \mathbf{p}^{m} \cdot \mathbf{mc}^{cr} \ \text{ and } \ \Phi_{pm}^{\nu} = \sum_{m=0}^{\nu} \mathbf{p}^{m} \cdot \mathbf{mc}^{pm},$$

where $\mathbf{mc}^{cr}$ is the vector $\mathbf{nc}$ with $\mathbf{cr}_2 = \mathbf{0}_{z_2}$ and $\mathbf{mc}^{pm}$ is the vector $\mathbf{nc}$ with $\mathbf{cr}_1 = \mathbf{0}_{z_1}$, $\mathbf{0}_a$ being a column vector of zeros of order $a$.

If the stationary regime is considered, then

$$
\Phi_{cr\_s} = \boldsymbol{\pi} \cdot \mathbf{mc}^{cr} \ \text{ and } \ \Phi_{pm\_s} = \boldsymbol{\pi} \cdot \mathbf{mc}^{pm}.
$$

#### 5.2.3. Total Net Profit

If costs, fixed costs and profits are considered, the total net profit up to time *ν* is

$$\Phi^{\nu} = \Phi_{w}^{\nu} - \Phi_{cr}^{\nu} - \Phi_{pm}^{\nu} - \left(1 + \Lambda^{NS}(\nu)\right) \cdot n \cdot fnu - \Lambda^{rep}(\nu) \cdot fcr - \Lambda^{mi}(\nu) \cdot fmi - \Lambda^{r-b}(\nu) \cdot G.$$

In the stationary case this is

$$\Phi = \Phi_{w} - \Phi_{cr} - \Phi_{pm} - \left(1 + \Lambda^{NS}\right) \cdot n \cdot fnu - \Lambda^{rep} \cdot fcr - \Lambda^{mi} \cdot fmi - \Lambda^{r-b} \cdot G.$$

#### **6. A Numerical Example**

The system modelled in this paper can be applied to real-world engineering problems. It would be interesting to examine whether or not preventive maintenance is profitable and to determine the optimum distribution for vacation time and hence the corresponding value of *R*.

#### *6.1. The System*

We assume a standby system initially composed of four units, as described in this work. Each unit has four internal performance states, of which the first two are considered minor damage and the last two major damage. The transition probability matrix for the internal wear time is given by

$$\mathbf{T} = \begin{pmatrix} 0.96 & 0.03 & 0 & 0 \\ 0 & 0.97 & 0.01 & 0 \\ 0 & 0 & 0.85 & 0.06 \\ 0 & 0 & 0 & 0.6 \end{pmatrix} \text{.} $$

The unit begins in the first state, $\boldsymbol{\alpha} = (1, 0, 0, 0)$. From each state, only a transition to failure or to the next performance level can occur. The transition probabilities to repairable and non-repairable failure, depending on the performance state, are given by the column vectors

$$\mathbf{T}^{0}_{r} = \begin{pmatrix} 0.008 \\ 0.016 \\ 0.072 \\ 0.32 \end{pmatrix} \text{ and } \mathbf{T}^{0}_{nr} = \begin{pmatrix} 0.002 \\ 0.004 \\ 0.018 \\ 0.080 \end{pmatrix}, \text{ respectively.}$$

The online unit is subject to external shocks. The time between two consecutive external shocks follows a phase-type distribution with representation $(\boldsymbol{\gamma}, \mathbf{L})$, where $\boldsymbol{\gamma} = (1, 0)$ and

$$\mathbf{L} = \begin{pmatrix} 0.9 & 0.05 \\ 0 & 0.5 \end{pmatrix}.$$

The mean time between two consecutive external shocks is equal to 11 units of time.

Each time that the system suffers an external shock the internal performance can be modified by producing a repairable or non-repairable failure. The matrix that governs the changes into the operational states is

$$\mathbf{W} = \begin{pmatrix} 0.2 & 0.1 & 0.3 & 0.1 \\ 0 & 0.1 & 0.3 & 0.1 \\ 0 & 0 & 0.3 & 0.1 \\ 0 & 0 & 0 & 0.1 \end{pmatrix}$$

and the vectors for a change to a repairable and to a non-repairable failure are

$$\mathbf{W}^{0}_{r} = \begin{pmatrix} 0.3 \\ 0.4 \\ 0.5 \\ 0.6 \end{pmatrix} \text{ and } \mathbf{W}^{0}_{nr} = \begin{pmatrix} 0 \\ 0.1 \\ 0.1 \\ 0.3 \end{pmatrix},$$

respectively.
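A quick consistency check on these inputs, sketched below, is that each row of a transition matrix plus the corresponding failure probabilities sums to one, both for the internal wear ($\mathbf{T}$, $\mathbf{T}^{0}_{r}$, $\mathbf{T}^{0}_{nr}$) and for the changes caused by a shock ($\mathbf{W}$, $\mathbf{W}^{0}_{r}$, $\mathbf{W}^{0}_{nr}$). The function name `row_totals` is an assumption introduced for the example.

```python
T = [[0.96, 0.03, 0.0, 0.0],
     [0.0, 0.97, 0.01, 0.0],
     [0.0, 0.0, 0.85, 0.06],
     [0.0, 0.0, 0.0, 0.6]]
T0_r = [0.008, 0.016, 0.072, 0.32]
T0_nr = [0.002, 0.004, 0.018, 0.080]

W = [[0.2, 0.1, 0.3, 0.1],
     [0.0, 0.1, 0.3, 0.1],
     [0.0, 0.0, 0.3, 0.1],
     [0.0, 0.0, 0.0, 0.1]]
W0_r = [0.3, 0.4, 0.5, 0.6]
W0_nr = [0.0, 0.1, 0.1, 0.3]


def row_totals(matrix, *exit_vectors):
    """Row sum of the matrix plus the matching entries of the exit vectors."""
    return [sum(row) + sum(v[i] for v in exit_vectors)
            for i, row in enumerate(matrix)]


wear_totals = row_totals(T, T0_r, T0_nr)
shock_totals = row_totals(W, W0_r, W0_nr)
print(wear_totals, shock_totals)
```

All eight totals equal one up to floating-point rounding, so the probabilities given above are mutually consistent.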

When an external shock occurs, a total failure can also be produced with a probability equal to *ω*<sup>0</sup> = 0.2.

Inspections occur randomly where the inter-inspection time is phase-type distributed with representation (η, **M**) being

$$
\boldsymbol{\eta} = (1,0), \; \mathbf{M} = \begin{pmatrix} 0.85 & 0.1 \\ 0.45 & 0.4 \end{pmatrix}.
$$

When a unit undergoes a repairable failure, or an inspection observes major damage, the unit goes to the repair facility. Therefore, two types of task can be performed by the repairperson: corrective repair and preventive maintenance. Both times are phase-type distributed, with representation for the corrective repair time

$$\boldsymbol{\beta}_1 = (1,0,0) \text{ and } \mathbf{S}_1 = \begin{pmatrix} 0.2 & 0.4 & 0.3 \\ 0.2 & 0.2 & 0.5 \\ 0.3 & 0.2 & 0.3 \end{pmatrix},$$

and for the preventive maintenance time,

$$
\boldsymbol{\beta}_2 = (1,0,0) \text{ and } \mathbf{S}_2 = \begin{pmatrix} 0.2 & 0.3 & 0.1 \\ 0.1 & 0.1 & 0.4 \\ 0.2 & 0.2 & 0.2 \end{pmatrix}.
$$

The mean corrective repair time is 7.3810 units of time and for the preventive maintenance case this is equal to 2.5 units of time.
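These means can be reproduced from the representations above through the standard expression for the mean of a discrete phase-type distribution with representation $(\boldsymbol{\alpha}, \mathbf{T})$, namely $\boldsymbol{\alpha}(\mathbf{I} - \mathbf{T})^{-1}\mathbf{e}$. The following sketch, in which `ph_mean` is a hypothetical helper name, applies it to the shock, corrective repair and preventive maintenance times.

```python
def ph_mean(alpha, T):
    """Mean of a discrete phase-type distribution: alpha (I - T)^{-1} e."""
    n = len(T)
    # Solve (I - T) x = e by Gauss-Jordan elimination; the mean is alpha . x
    aug = [[(1.0 if i == j else 0.0) - T[i][j] for j in range(n)] + [1.0]
           for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return sum(a * row[n] for a, row in zip(alpha, aug))


L = [[0.9, 0.05], [0.0, 0.5]]
S1 = [[0.2, 0.4, 0.3], [0.2, 0.2, 0.5], [0.3, 0.2, 0.3]]
S2 = [[0.2, 0.3, 0.1], [0.1, 0.1, 0.4], [0.2, 0.2, 0.2]]

mean_shock = ph_mean([1.0, 0.0], L)            # time between external shocks
mean_repair = ph_mean([1.0, 0.0, 0.0], S1)     # corrective repair time
mean_pm = ph_mean([1.0, 0.0, 0.0], S2)         # preventive maintenance time
print(round(mean_shock, 4), round(mean_repair, 4), round(mean_pm, 4))
```

The three values agree with the means quoted in the text (11, 7.3810 and 2.5 units of time).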

#### *6.2. Costs and Rewards*

Different costs and rewards have been considered, as described in Section 5. We assume a gross profit while the system is operational equal to *B* = 60. The loss per unit of time while the system is not operational takes the same value, *C* = 60. The online unit has a cost while it is operational that depends on the operational phase, given by the vector $\mathbf{c}_0 = (5, 12, 30, 40)'$. The repairperson can be on vacation or in his workplace. Each time that the repairperson returns from vacation, a cost equal to *G* = 20 is incurred. While the repairperson is idle, a cost equal to *H* = 15 per unit of time is incurred.

The online unit can undergo a repairable failure. In this case, the unit goes to the repair facility for corrective repair. A fixed cost equal to *fcr* = 10 is considered for each failure. Once in corrective repair, the cost depends on the repair phase and is given by $\mathbf{cr}_1 = (18, 18, 18)'$.

When an inspection observes major damage, the unit also goes to the repair facility, for preventive maintenance. A fixed cost *fmi* = 5 is incurred. Once in the repair facility, the cost depends on the preventive maintenance phase and is given by the vector $\mathbf{cr}_2 = (15.5, 15.5, 15.5)'$. Finally, when all units have undergone a non-repairable failure the system is re-started, with a cost per unit equal to *fnu* = 100.

#### *6.3. Optimization Analysis*

The repairperson can take a vacation of random duration, and inspections may take place at random intervals. This raises two interesting questions. Firstly, if a distribution class is assumed for the duration of the vacation, what is the optimum distribution and the optimum value of *R* (i.e., the limit value of the number of operational units needed to require the repairperson to remain in the facility on returning from vacation) from an economic standpoint? Secondly, is it profitable to perform preventive maintenance?

To answer these questions, we consider two classes of distributions, the geometric distribution and the Erlang distribution, from which optimum values for *R* and the other parameters can be determined.

#### 6.3.1. The Geometric Distribution Case

We assume that the vacation time of the repairperson is geometrically distributed with parameter $p$. Then, the p.m.f. is $P\{X = n\} = p^{n-1}(1 - p)$; $n = 1, 2, \ldots$

The stationary net profit depending on *p* for the system with and without preventive maintenance is shown in Figure 4. This has been worked out from Section 5.2. When the geometric distribution is considered, the optimum value is reached for the preventive maintenance case with *p* = 0.8 and *R* = 3. In this case, the stationary net profit per unit of time is equal to 22.0571.

**Figure 4.** Stationary net profit depending on *p* and *R* (with preventive maintenance, continuous line; without preventive maintenance, dashed line).

#### 6.3.2. The Generalized Erlang Distribution Case

Analogously to the geometric case, we now assume that the vacation time follows a generalized Erlang distribution with shape parameter equal to 2. This distribution can be expressed as a phase-type distribution with representation (**v**, **V**), where

$$\mathbf{v} = (1,0); \; \mathbf{V} = \left(\begin{array}{cc} p\_1 & 1 - p\_1 \\ 0 & p\_2 \end{array}\right).$$
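To make this representation concrete, the sketch below computes the probability mass function $P\{X = n\} = \mathbf{v}\mathbf{V}^{n-1}\mathbf{V}^{0}$ and the mean vacation time for given $p_1$ and $p_2$. The function name `vacation_pmf`, the truncation point and the parameter values $p_1 = p_2 = 0.67$ (the optimum reported below in this section) are choices made only for illustration.

```python
def vacation_pmf(p1, p2, n_max):
    """P{X = n} for n = 1..n_max under the representation (v, V), v = (1, 0)."""
    V = [[p1, 1.0 - p1], [0.0, p2]]
    V0 = [0.0, 1.0 - p2]              # V^0 = e - V e: the vacation ends only from phase 2
    state, pmf = [1.0, 0.0], []       # state holds v V^{n-1}
    for _ in range(n_max):
        pmf.append(state[0] * V0[0] + state[1] * V0[1])
        state = [state[0] * V[0][0] + state[1] * V[1][0],
                 state[0] * V[0][1] + state[1] * V[1][1]]
    return pmf


pmf = vacation_pmf(0.67, 0.67, 2000)
total_mass = sum(pmf)
mean_vacation = sum(n * q for n, q in enumerate(pmf, start=1))
print(total_mass, mean_vacation)
```

The mean obtained this way matches the closed form $1/(1 - p_1) + 1/(1 - p_2)$, the sum of the mean sojourn times in the two phases, and a vacation cannot end after a single unit of time because absorption is only possible from the second phase.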

Figures 5 and 6 show the stationary net profit depending on the parameters *p*<sup>1</sup> and *p*<sup>2</sup> and *R* for the case without preventive maintenance and with preventive maintenance, respectively.

**Figure 5.** Stationary net profit for the system without preventive maintenance depending on *R* and the parameters of the vacation distribution.

**Figure 6.** Stationary net profit for the system with preventive maintenance depending on *R* and the parameters of the vacation distribution.

We can see that, when the generalized Erlang distribution is considered for the vacation time, the optimum value is reached for the preventive maintenance case with *p*<sup>1</sup> = *p*<sup>2</sup> = 0.67 and *R* = 3. In this case, the stationary net profit per unit of time is equal to 22.4364.

#### *6.4. The Optimum System with the Generalized Erlang Distribution*

In Section 6.3 we worked out the optimum system. It is obtained when the generalized Erlang distribution is considered with parameters (2, 0.67, 0.67) and *R* = 3. In this section the performance measures of this system are analysed.

Firstly, the time up to the first replacement of the system (when all units have undergone a non-repairable failure), described in Section 4.3, has been analysed. The reliability function is plotted in Figure 7. Two cases are shown, with and without inspection.

**Figure 7.** Reliability function of the time up to a new system (with inspection, continuous line; without inspection, dashed line).

From the corresponding phase-type distribution, the mean time up to a new system has been calculated in both cases. Thus, the mean time up to replacing the system for the case without inspection is 167.7631 u.t., and with inspection 172.5269 u.t.

Multiple measures have been obtained for this system with and without inspection. These measures are described in Section 4. Table 2 shows the stationary distribution for the macro-states **U***k* (*k* units in the system). The values can be interpreted as the proportion of time that the system spends in each macro-state.

**Table 2.** Proportional time in macro-state **U***k.*


Performance measures are developed for the optimum system with and without inspection following Section 4. Table 3 shows the results.

**Table 3.** Performance measures for the optimum system (without inspection between parentheses).


The proportional time that the repairperson is on vacation is 0.3194, which is of interest for the total cost. The repairperson is therefore in his workplace for a proportion 0.6806 of the time and working for a proportion 0.3139. Thus, he is working for 46.12% of the time that he is in his workplace and is idle for the remainder.
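The arithmetic behind these proportions can be checked directly from the two reported values, as in the short sketch below; the variable names are hypothetical, and the idle proportion follows from the relation between workplace, working and idle time given in Section 4.1.

```python
on_vacation = 0.3194     # reported proportion of time on vacation
working = 0.3139         # reported proportion of time working

at_workplace = 1.0 - on_vacation            # time not on vacation
idle = at_workplace - working               # idle = at workplace - working
working_share = working / at_workplace      # share of workplace time spent working

print(round(at_workplace, 4), round(idle, 4), round(100 * working_share, 2))
```

The first and last printed values reproduce the 0.6806 and 46.12% quoted above.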

Regarding the mean number of events per unit of time, we observe 0.0409 for repairable failures, 0.0049 for major inspections and 0.0058 for new systems. Thus, for each 10,000 units of time, 58 new systems are expected to be started. The availability is also worked out: the system is operational for 87.72% of the time, 0.23% more than in the case without inspection. Although this increase is small, the difference between the two net profits is considerable: 5.79% at the maximum, in favour of the case with preventive maintenance.

#### **7. Conclusions**

Matrix analysis methods can be used to model a complex discrete cold standby system subject to multiple events. This method facilitates the algorithmic and computational development of multi-state complex systems. In the case in question, the online unit within the system is subject to wear and external shocks and may undergo periodic or random inspection. The repair facility is composed of a single repairperson, who may take a vacation (absence) from the repair facility. This repairperson may perform corrective repair and/or preventive maintenance.

The system described is not the standard one in which units are replaced when they undergo a non-repairable failure. In the present study, the analysis takes account of the loss of units following the occurrence of a non-repairable failure. When such a failure occurs, the system continues working with one less unit. This outcome often occurs in practice, and is reflected in the study method presented.

The (indeterminate) number of units within the repair facility and the vacation policy applied determine the behaviour of the repairperson. The vacation time begins when the number of operational units exceeds a given value, and the repairperson will remain in place, without taking a vacation, if the number of operational units in the system is below a pre-determined value.

The system is modelled in an algorithmic and computational form by means of a Markovian Arrival Process with marked arrivals. Matrix-analytic methods are used to obtain the stationary distributions, and multiple measures are derived in matrix form. These measures are related to system performance and financial results.

The method presented in this paper enables us to analyse optimization problems in multi-state complex systems. A numerical example of such an optimization is presented. The results obtained show whether preventive maintenance is profitable and reveal the optimum number of operational units, hence determining the appropriate policy for the repairperson's vacation times.

**Funding:** This paper is partially supported by the project FQM-307 of the Government of Andalusia (Spain) and by the project MTM2017-88708-P of the Spanish Ministry of Science, Innovation and Universities (also supported by the European Regional Development Fund program, ERDF).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The author declares no conflict of interest.

**Appendix A. Transition Probability Matrix Blocks for the Online Unit Depending on Type of Event**

$$\begin{aligned}
\mathbf{H}^{O} &= \left[\mathbf{T} \otimes \mathbf{L} + \mathbf{TW} \otimes \mathbf{L}^{0}\boldsymbol{\gamma}\left(1 - \omega^{0}\right)\right] \otimes \mathbf{M} + \left[\mathbf{U}_1\mathbf{T} \otimes \mathbf{L} + \mathbf{U}_1\mathbf{TW} \otimes \mathbf{L}^{0}\boldsymbol{\gamma}\left(1 - \omega^{0}\right)\right] \otimes \mathbf{M}^{0}\boldsymbol{\eta} \\
\mathbf{H}^{A} &= \mathbf{T}^{0}_{r}\boldsymbol{\alpha} \otimes \mathbf{L} \otimes \mathbf{e}\boldsymbol{\eta} + \left(\mathbf{T}^{0}_{r} + \mathbf{TW}^{0}_{r}\right)\boldsymbol{\alpha} \otimes \mathbf{L}^{0}\boldsymbol{\gamma}\left(1 - \omega^{0}\right) \otimes \mathbf{e}\boldsymbol{\eta} \\
\mathbf{H}'^{A} &= \mathbf{T}^{0}_{r} \otimes \mathbf{L} \otimes \mathbf{e} + \left(\mathbf{T}^{0}_{r} + \mathbf{TW}^{0}_{r}\right) \otimes \mathbf{L}^{0}\boldsymbol{\gamma}\left(1 - \omega^{0}\right) \otimes \mathbf{e} \\
\mathbf{H}^{B} &= \left[\mathbf{U}_2\left(\mathbf{e} - \mathbf{T}^{0}\right)\boldsymbol{\alpha} \otimes \mathbf{L} + \mathbf{U}_2\mathbf{T}\left(\mathbf{e} - \mathbf{W}^{0}\right)\boldsymbol{\alpha} \otimes \mathbf{L}^{0}\boldsymbol{\gamma}\left(1 - \omega^{0}\right)\right] \otimes \mathbf{M}^{0}\boldsymbol{\eta} \\
\mathbf{H}'^{B} &= \left[\mathbf{U}_2\left(\mathbf{e} - \mathbf{T}^{0}\right) \otimes \mathbf{L} + \mathbf{U}_2\mathbf{T}\left(\mathbf{e} - \mathbf{W}^{0}\right) \otimes \mathbf{L}^{0}\boldsymbol{\gamma}\left(1 - \omega^{0}\right)\right] \otimes \mathbf{M}^{0} \\
\mathbf{H}^{C} &= \left[\mathbf{T}^{0}_{nr}\boldsymbol{\alpha} \otimes \mathbf{L} + \left(\mathbf{T}^{0}_{nr} + \mathbf{TW}^{0}_{nr}\right)\boldsymbol{\alpha} \otimes \mathbf{L}^{0}\boldsymbol{\gamma}\left(1 - \omega^{0}\right) + \mathbf{e}\boldsymbol{\alpha} \otimes \mathbf{L}^{0}\boldsymbol{\gamma}\omega^{0}\right] \otimes \mathbf{e}\boldsymbol{\eta} \\
\mathbf{H}'^{C} &= \left[\mathbf{T}^{0}_{nr} \otimes \mathbf{L} + \left(\mathbf{T}^{0}_{nr} + \mathbf{TW}^{0}_{nr}\right) \otimes \mathbf{L}^{0}\boldsymbol{\gamma}\left(1 - \omega^{0}\right) + \mathbf{e} \otimes \mathbf{L}^{0}\boldsymbol{\gamma}\omega^{0}\right] \otimes \mathbf{e}
\end{aligned}$$

#### **Appendix B**

*Appendix B.1. Matrices for the Markovian Arrival Process Depending on the Type of Event*

The matrices **D***<sup>A</sup>* and **D***<sup>B</sup>* are developed in the text. The rest are given below.

#### *Appendix B.2. The Matrix D<sup>O</sup>*

The matrix **D***<sup>O</sup>* contains the transitions in which no event occurs. It is composed of blocks according to the transitions between the macro-states **U***<sup>k</sup>* for *k* = 1, . . . , *n*, and is given by

$$\mathbf{D}^{O} = \begin{pmatrix} \mathbf{D}\_{n}^{O} \\ & \mathbf{D}\_{n-1}^{O} \\ & & \mathbf{D}\_{n-2}^{O} \\ & & & \ddots \\ & & & & \mathbf{D}\_{1}^{O} \end{pmatrix}.$$

Therefore, for the different macro-states, this is given by:

• For *k* = 1, . . . , *R*−1

• For *k* = *R*,... , *n*

$$\begin{aligned} \text{with} \\ \boldsymbol{\Theta} &= \mathbf{a} \otimes \left(\mathbf{L} + \mathbf{L}^{0} \mathbf{y}\right) \otimes \mathbf{n}, \\ \mathbf{D}\_{0,k^{u}}^{0,k^{u}} &= \begin{pmatrix} \mathbf{L}\_{2^{u}-1} \otimes \left(I\_{\{k:N\}} \mathbf{z} + I\_{\{k:N\}} \mathbf{H}\_{0}\right) \otimes \mathbf{S}\_{1}^{0} \otimes \mathbf{u} \\\ \mathbf{I}\_{2^{u}-1} \otimes \left(I\_{\{k:N\}} \mathbf{z} + I\_{\{k:N\}} \mathbf{H}\_{0}\right) \otimes \mathbf{S}\_{2}^{0} \otimes \mathbf{u} \\\ \mathbf{D}\_{0,k^{u}}^{0,k^{u}} &= \mathbf{L}\_{2^{u}} \otimes \left(I\_{\{r:N\}} \mathbf{H}\_{0} + I\_{\{r:n\}} \left(\mathbf{L} + \mathbf{L}^{0} \mathbf{y}\right)\right) \otimes \left(\mathbf{V} + I\_{\{r:N\}} \mathbf{V}^{0} \mathbf{y}\right), r = 0, \dots, k^{u}\_{0} \\\ \mathbf{D}\_{0,0}^{0,k^{u}} &= \mathbf{H}\_{0} \\\ \mathbf{D}\_{0,0}^{0,k^{u}} &= \mathbf{H}\_{0} \\\ \mathbf{F} \mathbf{r} = 1, \dots, k \\\ \mathbf{0} & \mathbf{I}\_{2^{u}-1} \otimes \left(I\_{\{r:k\}} \mathbf{H}\_{0} + I\_{\{r:n\}} \left(\mathbf{L} + \mathbf{L}^{0} \mathbf{y}\right)\right) \otimes \mathbf{S}\_{1} \\\ \mathbf{0} & \mathbf{I}\_{2^{u}-1} \otimes \left(I\_{\{r:n\}} \mathbf{H}\_{0} + I\_{\{k$$

*Appendix B.3. The Matrix D<sup>D</sup>*

The matrix **D***<sup>D</sup>* contains the transitions when the repairperson resumes work without any other event. The structure of this matrix is

$$\text{For } r = N, \dots, k$$

$$\mathbf{D}\_{rJ}^{D,k;\mathbf{w}} = \begin{pmatrix} \mathbf{I}\_{2^{r-1}} \otimes \left( l\_{\{r=k\}} \left( \mathbf{L} + \mathbf{L}^0 \mathbf{y} \right) + l\_{\{r$$

*Appendix B.4. The Matrices D<sup>AD</sup> and D<sup>BD</sup>*

The matrices **D***<sup>AD</sup>* and **D***<sup>BD</sup>* contain the transitions in which the repairperson resumes work and, at the same time, a repairable failure or a major inspection occurs. In this case, for *Y* = *AD*, *BD*, we have

$$\mathbf{D}^{\boldsymbol{Y}} = \begin{pmatrix} \mathbf{D}\_{n}^{\boldsymbol{Y}} & & & & & & \\ & \mathbf{D}\_{n-1}^{\boldsymbol{Y}} & & & & \\ & & & \ddots & & & \\ & & & & \mathbf{D}\_{R}^{\boldsymbol{Y}} & & \\ & & & & & 0 & \\ & & & & & \ddots & \\ & & & & & & 0 \end{pmatrix}.$$

$$\mathbf{F}\_{r,t+1}^{\text{ED,k,uv}} = \begin{pmatrix} \mathbf{I}\_{2^{r-1}} \odot \left( \left( I\_{\{r-k-1\}} \mathbf{H}\_{A}^{\prime} + I\_{\{r$$

#### *Appendix B.5. The Matrix D<sup>C</sup>*

The matrix **D***<sup>C</sup>* contains the transitions when only a non-repairable failure occurs. In this case the matrix is

$$\mathbf{D}^{C} = \begin{pmatrix} \mathbf{0} & \mathbf{D}\_{n}^{C} & & & \\ & \mathbf{0} & \mathbf{D}\_{n-1}^{C} & & \\ & & \mathbf{0} & \ddots & \\ & & & \ddots & \mathbf{D}\_{2}^{C} \\ & & & & \mathbf{0} \end{pmatrix}.$$

• For *k* = 2, . . . , *R*−1 and *k* = *R* ≥ 3

$$\mathbf{D}\_{k}^{\mathbb{C}} = \begin{pmatrix} E\_{0}^{k,\text{uv}} & E\_{1}^{k-1,\text{uv}} & \dots & E\_{k-2}^{k-1,\text{uv}} & E\_{k-1}^{k-1,\text{uv}} \\ E\_{1}^{k,\text{uv}} & & & & & & \\ & \mathbf{D}\_{00}^{\mathbb{C},k,\text{uv}} & & & & & \\ & \vdots & & & & & & \\ E\_{k-1}^{k,\text{uv}} & & & & & & \ddots & \\ & & & & & & & \mathbf{D}\_{k-1,k-2}^{\mathbb{C},k,\text{uv}} & \mathbf{D}\_{k-1,k-1}^{\mathbb{C},k,\text{uv}} \\ & & & & & & & 0 \\ \end{pmatrix}$$

• For *k* = *R* ≥ 2

*Appendix B.6. The Matrix D<sup>CD</sup>*

The matrix **D***<sup>CD</sup>* contains the transitions in which a non-repairable failure occurs and the repairperson resumes work. In this case the matrix is

$$\mathbf{D}^{CD} = \begin{pmatrix} \mathbf{0} & \mathbf{D}\_{\mathrm{n}}^{CD} & & & & & \\ & \ddots & \ddots & & & & \\ & & \mathbf{0} & \mathbf{D}\_{\mathrm{R}}^{CD} & & & \\ & & & \mathbf{0} & \mathbf{0} & & \\ & & & & \ddots & \ddots & \\ & & & & & \mathbf{0} & \mathbf{0} \\ & & & & & & \mathbf{0} \end{pmatrix}.$$


The matrix blocks for the case *k* = *R* are

$$\mathbf{D}\_{00}^{CD,k,nv} = \mathbf{H}\_{C} \otimes \mathbf{e}.$$

For *r* = 1, . . . , *k*−1

$$\mathbf{D}\_{r,l}^{CD,k,w} = \begin{pmatrix} \mathbf{I}\_{2^{r-1}} \otimes \left( I\_{\{r=k-1\}} \mathbf{H}\_{\mathbb{C}}' + I\_{\{r$$

$$\bullet \quad \text{For } k = R+1, \dots, n \text{ and } R \le n-1$$

• For *k* = *R*

• The matrix blocks for the case *k* = *R*+1, . . . , *n* are as follows.

For *r* = *N*−1, . . . , *k*−1

$$\mathbf{D}\_{r,r}^{CD,k,w} = \begin{pmatrix} \mathbf{I}\_{2^{r-1}} \otimes \left( I\_{\{r < k-1\}} \mathbf{H}\_{\mathbb{C}} + I\_{\{r = k-1\}} \mathbf{H}\_{\mathbb{C}}' \right) \otimes \mathbf{V}^0 \otimes \mathbf{\mathcal{f}}\_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{I}\_{2^{r-1}} \otimes \left( I\_{\{r < k-1\}} \mathbf{H}\_{\mathbb{C}} + I\_{\{r = k-1\}} \mathbf{H}\_{\mathbb{C}}' \right) \otimes \mathbf{V}^0 \otimes \mathbf{\mathcal{f}}\_2 \end{pmatrix}$$

#### *Appendix B.7. The Matrix D<sup>NS</sup>*

The matrix **D***<sup>NS</sup>* contains the transitions in which a failure forces the system to be restarted. This can only happen when the system is composed of a single unit. When this unit undergoes a non-repairable failure, a new system with *n* units starts and the vacation time begins again. The structure of the matrix is

$$\mathbf{D}^{NS} = \begin{pmatrix} \mathbf{0} & & & & \\ & \mathbf{0} & & & \\ & & \ddots & & \\ & & & \mathbf{0} & \\ \mathbf{D}\_{1}^{NS} & & & & \mathbf{0} \end{pmatrix}.$$

• If *R* = 1

$$\mathbf{D}\_{1}^{NS} = \begin{pmatrix} \mathbf{D}\_{00}^{NS,1,v} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} \\ \vdots & \vdots & & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} \end{pmatrix}$$

with

$$\mathbf{D}\_{00}^{NS,1,v} = \mathbf{H}\_{C} \otimes \mathbf{e}\mathbf{v}.$$

• If *R* > 1

$$\mathbf{D}\_{1}^{NS} = \begin{pmatrix} \mathbf{D}\_{00}^{NS,1,v} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} \\ \vdots & \vdots & & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} \end{pmatrix}$$

with

$$\mathbf{D}\_{00}^{NS,1,v} = \mathbf{H}\_{C} \otimes \mathbf{v}.$$
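The three block layouts used in this appendix (block-diagonal for **D***<sup>O</sup>*, blocks on the superdiagonal for **D***<sup>C</sup>*, and a single bottom-left block for **D***<sup>NS</sup>*) can be sketched in Python as follows. This is only an illustrative sketch: the helper names `assemble` and `const_block`, the block sizes and the constant filler values are hypothetical, and the macro-states are assumed to be ordered **U***<sup>n</sup>*, . . . , **U**<sup>1</sup> as in the displays above.

```python
def const_block(rows, cols, value):
    """A rows x cols block filled with one value (placeholder content)."""
    return [[value] * cols for _ in range(rows)]

def assemble(blocks, sizes):
    """Build a square matrix from blocks {(i, j): matrix}.

    sizes[i] is the number of states in the i-th macro-state; positions
    without an entry in `blocks` are filled with zeros.
    """
    offsets = [0]
    for s in sizes:
        offsets.append(offsets[-1] + s)
    total = offsets[-1]
    out = [[0.0] * total for _ in range(total)]
    for (i, j), blk in blocks.items():
        # Each block must match the sizes of its macro-states.
        assert len(blk) == sizes[i]
        assert all(len(row) == sizes[j] for row in blk)
        for p, row in enumerate(blk):
            for q, val in enumerate(row):
                out[offsets[i] + p][offsets[j] + q] = val
    return out

# Hypothetical example with n = 3 macro-states, ordered U^3, U^2, U^1.
sizes = [3, 2, 1]
n = len(sizes)

# No event: the number of units does not change (block-diagonal).
D_O = assemble({(i, i): const_block(sizes[i], sizes[i], 1.0)
                for i in range(n)}, sizes)

# Non-repairable failure: macro-state U^k moves to U^(k-1) (superdiagonal).
D_C = assemble({(i, i + 1): const_block(sizes[i], sizes[i + 1], 2.0)
                for i in range(n - 1)}, sizes)

# Restart: from U^1 back to U^n (single bottom-left block).
D_NS = assemble({(n - 1, 0): const_block(sizes[n - 1], sizes[0], 3.0)},
                sizes)
```

Keeping the macro-state sizes in one list makes the dimension checks explicit, which is useful because the blocks off the diagonal are generally rectangular.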

#### **Appendix C**

To calculate the expected number of times that the repairperson returns to the workplace, regardless of whether the repairperson stays or begins another vacation period, the matrix **Q** is defined. It is built analogously to the matrix **D**, but every return is counted. The matrix **Q** is therefore the sum of the following matrices

$$\mathbf{Q} = \mathbf{D}\_{r-b}^{O} + \mathbf{D}\_{r-b}^{A} + \mathbf{D}\_{r-b}^{B} + \mathbf{D}\_{r-b}^{C} + \mathbf{D}^{D} + \mathbf{D}^{AD} + \mathbf{D}^{BD} + \mathbf{D}^{CD} + \mathbf{D}\_{r-b}^{NS}.$$

The matrices **D***<sup>D</sup>*, **D***<sup>AD</sup>*, **D***<sup>BD</sup>* and **D***<sup>CD</sup>* are described in Appendix B. Each of the other matrices has the same structure as the matrix given in Appendix B for the corresponding event, with all blocks equal to zero except the following.


$$\begin{array}{c} \mathbf{D}\_{r,r+1}^{A,k,v} = \mathbf{I}\_{2^r} \otimes \left( \mathbf{H}\_A \otimes \mathbf{V}^0 \mathbf{v}, 0 \right) \\ \mathbf{D}\_{r,r+1}^{B,k,v} = \mathbf{I}\_{2^r} \otimes \left( 0, \mathbf{H}\_B \otimes \mathbf{V}^0 \mathbf{v} \right). \end{array}$$
 
$$\text{For } r = 0, \dots, k - R - 1 \text{ and } k \ge R.$$

$$\begin{array}{ll} \bullet & \text{For } r = 0, \dots, k - R - 1 \text{ and } k \ge R + 1, \\ & \mathbf{D}\_{r,r}^{C,k,v} = \mathbf{I}\_{2^r} \otimes \mathbf{H}\_{C} \otimes \mathbf{V}^{0} \mathbf{v} \end{array}$$

$$\begin{array}{ll} \bullet & \text{If } R = 1, \\ & \mathbf{D}\_{00}^{NS,1,v} = \mathbf{H}\_{C} \otimes \mathbf{V}^{0} \mathbf{v} \end{array}$$
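As a hedged illustration of how **Q** might be used, the sketch below accumulates the expected number of returns over a finite horizon as the sum of **p**<sub>*m*</sub>**Qe**, with **p**<sub>*m*+1</sub> = **p**<sub>*m*</sub>**D**. This is a common way of counting marked events in a discrete-time Markovian arrival process. The discrete-time setting, the two-state toy matrices and the names `expected_returns`, `mat_sum` and `vec_mat` are assumptions made for this example and are not taken from the model above.

```python
def vec_mat(v, A):
    """Row vector times matrix."""
    return [sum(v[i] * A[i][j] for i in range(len(v)))
            for j in range(len(A[0]))]

def mat_sum(matrices):
    """Entrywise sum of a list of matrices of the same order."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[i][j] for m in matrices) for j in range(cols)]
            for i in range(rows)]

def expected_returns(phi, D, Q, horizon):
    """Sum over m = 0, ..., horizon - 1 of (phi D^m) Q e."""
    p = list(phi)
    total = 0.0
    for _ in range(horizon):
        total += sum(vec_mat(p, Q))   # p Q e: probability of a return now
        p = vec_mat(p, D)             # advance the state distribution
    return total

# Hypothetical two-state example: D is the full transition matrix and the
# "return" matrices hold the part of each transition that involves a return.
D = [[0.6, 0.4], [0.3, 0.7]]
return_parts = [
    [[0.0, 0.1], [0.0, 0.0]],   # toy stand-in for one matrix in the sum
    [[0.0, 0.0], [0.2, 0.0]],   # toy stand-in for another matrix in the sum
]
Q = mat_sum(return_parts)

phi = [1.0, 0.0]                # initial distribution (assumed)
mean_returns = expected_returns(phi, D, Q, horizon=10)
```

If `phi` is replaced by the stationary distribution of `D`, every term of the sum takes the same value, so the expected number of returns grows linearly with the horizon.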

#### **References**

