*4.1. Historical Data*

In order to verify the solutions proposed in this publication, a process of robust job scheduling is performed on the production data and on historical data of technological machine failure. The production data describes the execution of 9 production jobs processed by 12 machines, constituting a manufacturing cell (Table 1). The parts are produced in batches of 50 elements (*b* = 50) and the setup times of individual operations are not taken into account in the production scheduling process (uncertainty of setup times is a different factor and requires additional research). Therefore:

$$pt\_{ij} = b \cdot to\_{ij} \tag{14}$$

where *b*—quantity of elements in the production batch, *toij*—operation time.


**Table 1.** Production data implemented in the robust scheduling solution.

\* *tsij*—setup time, *toij*—operation time.

The data describing the failure rate of machines in the scheduled production process have been obtained from the computer records of machine operation from a maintenance department (Table 2). The collected data describe the failure rate of six technological machines that are crucial for the performed production process. The numbers of observations are as follows:


**Table 2.** Machine failure and repair time data implemented in the scheduling solution.


The nominal and robust schedules are subsequently built based on the data presented above. This, however, must be preceded by the prediction of failure parameters with the application of Markov chain and ARIMA models, which is described in the next section.

#### *4.2. Prediction of Machine Failure Parameters*

The failure prediction process is performed using appropriate scripts formulated in the RStudio computing environment [48] that enable the analysis in the range described in Section 3.3.

Modelling the machine failure rates using the Markov chain (containing information on changes in production) is performed with the use of the *markovchain* library. Initially, the collected empirical data is verified to check whether they fulfil the properties of the Markov process. The analyses confirm that the data from the analyzed machines meet the required properties, which is further evidenced by the *p*-value index (Table 3). The *p*-value is the probability of obtaining hypothesis test results as extreme as the observed results, assuming that the null hypothesis is correct (data chain has a Markov property).

**Table 3.** Markov process identification results.


In the next stage of the machine failure prediction analysis, the transition rate matrices are generated from the collected data. From the obtained information, we determine, at a given probability level, the occurrence of subsequent chain elements, which in this study is the probability of machine failure occurrence at subsequent production shifts (Table A1).

For clarity of presentation, the results from calculations are given in the form of transition diagrams (Figure 1), which additionally enable determining the probability of failure not only on subsequent shifts but also during the current shift (the arrow returning to the node). In Figure 1, knots represent shifts and the probability of machine failure during a given shift is given at the beginning of each arrow (next to the knot), e.g., for machine *M*6, the probability that the machine failure will occur after shift 1 during shift 2 is 0.593.

**Figure 1.** Markov chains with transition diagrams for the considered machines: (**a**) Machine *M*1; (**b**) Machine *M*2; (**c**) Machine *M*3; (**d**) Machine *M*6; (**e**) Machine *M*7; (**f**) Machine *M*8.

The second key parameter of failure is its time. To this end, the *forecast* package is used, which enables identifying the elements of ARIMA time series. As a result, the predicted machine failure repair times are determined. Before forecasting, each machine is tested to verify whether the process is stationary, also the component models are established (autoregression, moving average and the integration). In the subsequent step, future repair times are predicted. Due to the fact that in ARIMA models, the forecasts may also take negative values, the collected observations have been first subjected to variance-stabilizing transformation, and after the prediction, the process is completed and their original sets of values are returned. The results from the model identifying exemplary predicted times for the 5 subsequent steps of the time series are presented in Table 4.


**Table 4.** Predicted machine repair times.

The data below display the disparity between individual machine repair times. Each machine is coupled with a different ARIMA model. From the set of possible ARIMA models, we select a model that has a smaller AIC (Akaike Information Criterion) value. The analytical process is carried out with the exclusive use of autoregression (Machine *M*1), or the moving average (Machine *M*8), or a combination thereof (Machine *M*2, *M*6, *M*7). In a single case (Machine *M*2), time variability is a series of independent random variables; therefore, the forecast repair times are established on the basis of mean observations.

The results of the prediction of machine failure parameters are perfectly applicable to robust production scheduling. Expressing the time of failure through production shifts allows us to determine the intervals in which machine failures are likely to occur. In turn, the forecasted repair times could be further employed to the determination of machine inspection and maintenance times. Therefore, in the next stage, the obtained analysis results are used to formulate a robust production schedule, whose effectiveness is subsequently verified.

#### *4.3. Production Process Modelling and Scheduling*

Before the obtained prediction results could be subjected to further processing, nominal schedules are generated from the real data. Let us assume that the product is manufactured in batches of 50 pieces and the objective function of the schedule is to minimize the makespan (*C*max).

The schedules are developed using LiSA (A Library of Scheduling Algorithms) software [49], for the analysis of scheduling problems in various environments. The production data serve to represent the production system: a set of machines *M* and jobs *J*, the technology matrix *MO* and the matrix of processing times *PT*. To test the alternative versions of scheduling, the choice of the next operation is determined by two dispatching rules [50]:


The robustness of the production schedules is to be provided by the inclusion of the results from the predictions of failure parameters using the data describing the states of production shifts set *S* and predicted repair times *rtt*. As a result, we have managed to determine the elements from the set of predicted machine failure times *FTMl* and the service time buffer set *TBMl*. As noted in the introduction, the data obtained for strategic machines are analyzed from the perspective of executed production processes, hence the discrepancies in the designations in the technology records and the schedule. All the data that serve to generate the robust schedule are presented in Table 5.


**Table 5.** The data implemented into the robust production schedule.

Since it is built on the data above, the obtained schedule is robust to machine failure disturbances. The procedure for generating the robust schedule is rather straightforward: service time buffers *TBMl* are implemented into the nominal schedule in the slots indicated by the set of machine failure times *FTMl*. The time-to-failure is counted only for the machines processing jobs (idle time was disregarded). In the case when a service time buffer is required during the operation, any interfering operation was shifted right in the order of jobs.

#### *4.4. Evaluation Criteria*

To verify the effectiveness of the robust scheduling solutions, as well as for the sake of comparative analysis against the nominal schedules, the following assessment criteria are applied:


$$\overline{\mathbb{C}} = \frac{1}{n} \sum\_{i=1}^{n} (\mathbb{C}\_i)\_\prime \tag{15}$$

where *Ci*—the completion time of job *i.*

• mean flow time *F* given by:

$$\overline{F} = \frac{1}{n} \sum\_{i=1}^{n} {}^{n}\!\_{i} (F\_{i}) \, \tag{16}$$

where *Fi*—the flow time of job *i.*

• the number of critical operations *YK* is derived from:

$$Y\_K = \sum\_{i=1}^{n} \sum\_{j=1}^{m} {w\_{ij}}\_j \tag{17}$$

$$y\_{ij} = \begin{cases} 1, \text{ when } \left(tz\_{ij} - tr\_{ij+1}\right) = 0 \\ 0, \text{when } \left(tz\_{ij} - tr\_{ij+1}\right) \neq 0 \end{cases} \tag{18}$$

where *YK*—the number of critical operations, *tzij*—the completion time of operation *oij* (current), *trij*+1—the start time of operation *oij* + 1 (subsequent).

The verification of the obtained schedules is performed during the online stage (production execution), modelled with the Enterprise Dynamics software in a series of simulation tests. The computations serve to determine total completion times of production jobs under strategic machinery failure. The modelling tool used in the study allows detecting machine failure times by setting the MTTF and MTTR indicators and selected probability distributions (Table 6). The MTTF values are specified for the uniform probability distribution (i.e., the machine failure can occur at any time—from the start of the job on the machine until its completion). On the other hand, the MTTR values are determined using the Weibull distribution, obtained for machine repair times from the Cullen–Frey graph [6].


**Table 6.** Machine failure parameters defined in the simulation environment.

\* the parameters are expressed in hours.

Twenty-five simulations of the production process are performed for each of the LPT or SPT schedules. The indicators employed in the assessment of the results from simulations are:

• Increase of completion time of all jobs Δ*C*max given by:

$$
\Delta \mathcal{C}\_{\text{max}} = \mathcal{C}\_{\text{max}} - \mathcal{C}'\_{\text{max}} \, \, \, \, \, \tag{19}
$$

where Δ*C*max—increase of completion time of all jobs, *C*max—nominal schedule makespan, *C* max—actual (executed) schedule makespan.

• Relative increase of makespan *ECmax* given by:

$$E\_{\mathbb{C}\_{\text{max}}} = \frac{\mathbb{C}\_{\text{max}}}{\mathbb{C}'\_{\text{max}}},\tag{20}$$

where *ECmax*—relative increase of makespan, *C*max—nominal schedule makespan, *C* max—actual (executed) schedule makespan.
