Article

Calculation of the Dangerous Failure Rate of the Safety Function

Department of Control and Information Systems, Faculty of Electrical Engineering and Information Technology, University of Žilina, 010 26 Žilina, Slovakia
*
Author to whom correspondence should be addressed.
Appl. Sci. 2022, 12(5), 2382; https://doi.org/10.3390/app12052382
Submission received: 21 January 2022 / Revised: 16 February 2022 / Accepted: 22 February 2022 / Published: 24 February 2022
(This article belongs to the Section Electrical, Electronics and Communications Engineering)

Abstract

Each safety-related function must be implemented with a defined safety integrity level (SIL) if the control system implements safety-related functions (SFs) in addition to the standard control functions. The required SIL of the SF depends on the quantity of the risk associated with the failure of this one SF. The SIL against random failure can be expressed through the dangerous failure rate of the SF for an electronic safety-related control system (ESRCS) operating in a continuous mode of operation. The proof must be provided (among other things) that the SIL requirements for the individual SFs are met so the ESRCS can be accepted and implemented. The assessment of the impact of random failures on the SIL of the SF must be performed using the quantitative analysis method. This paper describes the procedure and derives equations for evaluating the impact of random failure on SIL of the SF using Markov chains with two absorption states. The achieved results are presented for SF implemented by ESRCS with dual architecture based on composite fail-safety technique.

1. Introduction

Some activities are associated with hazards. These hazards can cause damage to human health, damage to the environment, or major property damage. Such hazards or their consequences can be prevented by applying safety measures. How “strong” the safety measures should be depends on the results of the risk analysis associated with the machinery and equipment used, which are a part of the equipment under control (EUC). The greater the difference between the actual risk and the tolerable risk, the “stronger” the safety measures that must be applied. Electronic safety-related control systems (ESRCSs) belong among the so-called active technical safety measures. An ESRCS is a system that realizes the safety function (SF). In addition to SFs, an ESRCS can (but does not have to) realize standard control functions. The standard [1] defines SF, which is realized by a safety-related system, as a function designed to ensure or maintain the safe state of a controlled system concerning specific hazardous events. SF failure can increase the risk. Therefore, each SF must have fail-safety features.
The following techniques can be used to achieve the fail-safety feature:
  • the technique of inherent fail-safety;
  • the technique of reactive fail-safety;
  • the technique of composite fail-safety.
For ESRCS to be used in practice, written proof must be provided that the specified safety requirements are met.
A characteristic feature of an ESRCS that uses the inherent fail-safety technique is the one-channel architecture. The logic of such systems is realized by special logic elements with asymmetric failure features (most often electromechanical relays, so-called safety relays), whose properties are verified by long-term operational experience, as well as by using appropriate methods and procedures in the application of these elements. Demonstration of the safety properties of such systems is based on qualitative analysis. Verification of the functional properties of such ESRCSs is based on the checks and tests applied. The impact of faults on the SF of such a system is assessed by a qualitative analysis that searches for an answer to the question “What happens if...?”. A safety-related system is considered safe if no real failure mode (or combination of real failure modes) exists that would result in a dangerous ESRCS operation. Failure Modes and Effects Analysis (FMEA) can be successfully used for this purpose [2,3]. FMEA is an inductive method of analysis for technical objects that are implemented from elements for which it is possible to compile a list of real failure modes.
The use of electronic elements in the ESRCS area also led to an effort to avoid special elements and to apply standard elements that are commonly available on the commercial market, the so-called commercial off-the-shelf (COTS) products. With the increasing integration of electronic components, it is no longer possible to compile a list of real failure modes and to use only qualitative analysis methods to assess ESRCSs. ESRCS safety assessment therefore requires the application of methods and procedures based on a probabilistic approach. In the case of an ESRCS with lower safety requirements, it is possible to handle a failure with the reactive fail-safety technique, which is characterized by a one-channel architecture and by feedback that compares the actual state of the EUC with the required state of this EUC.
In the case of an ESRCS with higher safety requirements, the composite fail-safety technique must be used to handle a failure. The composite fail-safety technique is characterized by a multi-channel architecture, while the safety condition is the independence of the channels and early fault detection and negation.
A new term, “safety integrity”, is also emerging in the context of ESRCS. Safety integrity (SI) is the probability of a safety-related system achieving its required SFs under all the stated conditions within a stated operational environment and within a stated duration [1]. It is one of the basic safety features of the ESRCS. SI is indicated by the safety integrity level (SIL). SIL does not relate to the ESRCS, but to the SF that is realized by the ESRCS. An ESRCS can realize (and usually does) multiple SFs, and each SF may be realized with a different SIL. According to the standard [1], SI consists of three parts: systematic safety integrity (SystF-SI), software safety integrity (SW-SI), and hardware safety integrity (HW-SI). In general, ESRCS hardware is subject mainly to random failures, but also to systematic failures, whereas software has only systematic failures. For this reason, only systematic failure safety integrity (SystF-SI) and random failure safety integrity (RandF-SI) need to be taken into account. Because systematic faults have different effects in the different life cycle phases and safety measures are dependent on the specific application, a quantitative analysis of the systematic part of the SI of the SF is not possible.
SystF-SI forms a non-quantifiable part of SI, and compliance with the requirements for SystF-SI is demonstrated by applying appropriate (prescribed for each SIL) techniques and measures for the avoidance of ESRCS systematic faults, which realizes the given SF.
RandF-SI forms a quantifiable part of SI and is subject to quantitative analysis. The occurrence of random failures over time can be described by reliability parameters (for example, random failure rate). The need to use quantitative methods (probabilistic approach) to evaluate RandF-SI indirectly follows from the SIL table in the standard [1], which defines the average frequency of dangerous failure of the SF in high-demand mode of operation (PFH) for each SIL as follows:
  • for SIL 1, $10^{-6}\ \mathrm{h}^{-1} \le \mathrm{PFH} < 10^{-5}\ \mathrm{h}^{-1}$;
  • for SIL 2, $10^{-7}\ \mathrm{h}^{-1} \le \mathrm{PFH} < 10^{-6}\ \mathrm{h}^{-1}$;
  • for SIL 3, $10^{-8}\ \mathrm{h}^{-1} \le \mathrm{PFH} < 10^{-7}\ \mathrm{h}^{-1}$;
  • for SIL 4, $10^{-9}\ \mathrm{h}^{-1} \le \mathrm{PFH} < 10^{-8}\ \mathrm{h}^{-1}$.
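The bands above can be turned into a small lookup. The following Python sketch is only an illustration: the names `SIL_BANDS` and `sil_from_pfh` are hypothetical, and returning `None` for values outside the four listed bands is an assumption of this sketch, not a rule taken from the standard.

```python
from typing import Optional

# PFH bands per SIL as listed above (units: 1/h).
# Lower bound is inclusive, upper bound is exclusive.
SIL_BANDS = {
    4: (1e-9, 1e-8),
    3: (1e-8, 1e-7),
    2: (1e-7, 1e-6),
    1: (1e-6, 1e-5),
}


def sil_from_pfh(pfh: float) -> Optional[int]:
    """Return the SIL whose band contains pfh, or None if no listed band does."""
    for sil, (low, high) in SIL_BANDS.items():
        if low <= pfh < high:
            return sil
    return None
```

For example, `sil_from_pfh(5e-7)` falls into the SIL 2 band, while a value of `1e-5` lies outside all four bands.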
SI is affected by various factors, as shown in Figure 1. These factors should also include quality management and safety management, which cover all phases of the ESRCS life cycle and define the tasks that need to be performed in each phase to achieve the specified safety objectives. Other factors (reliability, diagnostics, architecture recovery, and technical independence) belong to the technical and operational parameters of the ESRCS.
An appropriate quantitative analysis method should be used for RandF-SI analysis. The most suitable method allows respecting the influence of as many factors as possible that affect RandF-SI. The most used methods include fault tree analysis (FTA) [4] and reliability block diagram (RBD) [5,6]. In the paper [6], the RBD method is used to determine the simplified algebraic expression for the probability of failure on demand (PFD) for a system with architecture 2 out of 4. Simplification relates to respecting the impact of several factors on PFD. A similar problem, but on a more general level, is solved by the paper [7].
The great advantage of these methods is that they are supported by many software tools (PTC Windchill, Item Software, BQR Reliability Engineering Ltd., …), which provide not only tools for analysis but also for creating the presentation of the achieved results. The disadvantage of these static methods is that they do not allow a comprehensive assessment of the impact of several factors (impact of diagnostic coverage, recovery intensity, the impact of SF architecture change…) on the monitored parameter of ESRCS, resp. SF.
FTA and RBD methods are static methods and were originally developed to evaluate the reliability and availability parameters of the system. These methods assume that the system has two states and is in one of these two states at any given time. Which states are considered depends on the analysed property (functional/non-functional, up/down, safe/dangerous). Modified versions of these methods (dynamic fault tree analysis (DFTA) in the paper [8] and dynamic reliability block diagram (DRBD) in the paper [9]) largely eliminate these deficiencies. As support, these methods use Markov chains, Petri nets, Bayesian networks, or Monte Carlo simulations.
When evaluating RandF-SI, it is appropriate to consider three states of the ESRCS, resp. SF (up state; down and safe state; down and dangerous state). The most used state-oriented methods include the continuous-time Markov chains (CTMC) method, either alone [10] or in combination with discrete-time Markov chains (DTMC) [11]. The effect of random failures on RandF-SI of the SF can be modelled using CTMC. With the DTMC, it is possible to model the influence of an online diagnostic mechanism that works discretely or to model the impact of periodic maintenance. The paper [11] outlines the basic idea of how to evaluate the influence of differently operating fault detection mechanisms on the RandF-SI of the SF. A CTMC/DTMC combination can be used to solve this problem.
A major disadvantage of SF analysis by the Markov chain (MC) is the state space explosion. The paper [12] deals with the mathematical procedure for creating the model of an ESRCS that consists of several subsystems. The proposed procedure consists of the suitable merging of models of subsystems into one model of the ESRCS. Tensor constructions (the tensor sum and the tensor product) have been used for the merging of CTMCs of subsystems. The paper also addresses the state explosion problem that occurred during the merging of CTMCs of subsystems.
The individual methods for analysis can be suitably combined. For example, the paper [13] deals with the procedure description and the definition of the mathematical relations for creating the model of the complex ESRCS. The proposed procedure consists of the decomposition of ESRCS into the mutually independent subsystems, the calculation of the required parameters of each subsystem by MC, and the creation of the resulting model of the whole ESRCS by the FTA. Subsequently, the dangerous failure rate of the whole ESRCS and the proof test interval of ESRCSs are determined by the FTA and the parameters of the subsystem. Quite often, a combination of FMEA and RBD methods in the paper [5] or a combination of FMEA and FTA methods in the paper [14] can be encountered.

2. CTMCs with Multiple Absorption States

Let us consider a stochastic process that fulfils the Markov property (the probability of moving to the next state depends only on the present state and not on the previous states):
$$\Pr\{X(t) \le x \mid X(t_0) = x_0,\ X(t_1) = x_1,\ \ldots,\ X(t_n) = x_n\} = \Pr\{X(t) \le x \mid X(t_n) = x_n\}, \tag{1}$$
where $X(t)$ is the random variable, $t \in T$ is the time parameter ($T$ is the time range), and $0 \le t_0 < t_1 < \cdots < t_n < t$.
If the value, which is acquired by X ( t ) , is called state and if a set of states is countable, then the Markov process forms the Markov chain (MC). We distinguish two basic types of the MC:
  • discrete-time Markov chain;
  • continuous-time Markov chain.
For the homogeneous CTMC, the transition probability of the system from state $i$ to state $j$ can be calculated as the conditional probability that the system at time $t + \Delta t$ is in state $j$ under the condition that the system at time $t$ was in state $i$, i.e.,
$$p_{ij}(t + \Delta t) = \Pr\{X(t + \Delta t) = j \mid X(t) = i\}. \tag{2}$$
The CTMC is completely defined if the transition rate matrix (3) and the initial distribution (4) are defined. The transition rate matrix:
$$\mathbf{A} = (q_{ij}) \quad \text{for } i, j \in \{1, \ldots, n\}, \tag{3}$$
where $q_{ij}$ is the transition rate from state $i$ to state $j$ and $q_{ii} = -\sum_{j=1,\, j \ne i}^{n} q_{ij}$ is the sojourn rate in state $i$. If the CTMC is homogeneous, the transition rates are constant. The initial distribution:
$$P_0 = P_0(t = 0) = \{p_1(t = 0),\ p_2(t = 0),\ \ldots,\ p_n(t = 0)\}, \tag{4}$$
where $P_0$ is the initial distribution at time $t = 0$ and $p_i(t = 0)$ is the probability of state $i$ at time $t = 0$.
The CTMC distribution at time $t$ can be calculated as a solution of the system of differential Equations (5) for the initial distribution (4):
$$\frac{dP(t)}{dt} = P(t) \cdot \mathbf{A}. \tag{5}$$
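Equation (5) can also be integrated numerically. The following Python sketch uses a classical fourth-order Runge-Kutta scheme, which is a choice made here for illustration and not the method used in the article; the function names and the two-state example with rate `lam` are hypothetical.

```python
import math


def rhs(p, A):
    """Right-hand side of dP/dt = P(t) . A for a row vector p."""
    n = len(p)
    return [sum(p[i] * A[i][j] for i in range(n)) for j in range(n)]


def solve_ctmc(p0, A, t_end, steps=1000):
    """Integrate dP/dt = P . A from 0 to t_end with classical RK4 steps."""
    h = t_end / steps
    p = list(p0)
    for _ in range(steps):
        k1 = rhs(p, A)
        k2 = rhs([pi + 0.5 * h * k for pi, k in zip(p, k1)], A)
        k3 = rhs([pi + 0.5 * h * k for pi, k in zip(p, k2)], A)
        k4 = rhs([pi + h * k for pi, k in zip(p, k3)], A)
        p = [pi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
             for pi, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p


# Hypothetical two-state chain: state 1 moves to absorbing state 2 at rate lam.
lam = 1e-3
A = [[-lam, lam],
     [0.0, 0.0]]
p = solve_ctmc([1.0, 0.0], A, t_end=1000.0)
# For this chain the exact solution is p_2(t) = 1 - exp(-lam * t).
exact_p2 = 1.0 - math.exp(-lam * 1000.0)
```

A fixed-step explicit scheme is adequate for this small example; stiff models with very different rates may need a smaller step or an implicit solver.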
Figure 2 shows a CTMC that contains $(k - 1)$ transient states and $(n - k + 1)$ absorption states; $q_{ij}$ is the rate of the transition from state $i$ to state $j$.
The CTMC in Figure 2 can be described by a general matrix of transition rates (6):
$$\mathbf{A} = \begin{pmatrix}
-\sum_{l=1,\, l \ne 1}^{n} q_{1l} & q_{12} & \cdots & q_{1(k-1)} & \cdots & q_{1n} \\
q_{21} & -\sum_{l=1,\, l \ne 2}^{n} q_{2l} & \cdots & q_{2(k-1)} & \cdots & q_{2n} \\
\vdots & \vdots & \ddots & \vdots & & \vdots \\
q_{(k-1)1} & q_{(k-1)2} & \cdots & -\sum_{l=1,\, l \ne (k-1)}^{n} q_{(k-1)l} & \cdots & q_{(k-1)n} \\
0 & 0 & \cdots & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots & & \vdots \\
0 & 0 & \cdots & 0 & \cdots & 0
\end{pmatrix}, \tag{6}$$
where the state $i = 1, \ldots, (k - 1)$ is from the set of transient states $T$ and the state $j = k, \ldots, n$ is from the set of absorption states $A$. Because the article solves the problem of calculating the dangerous failure rate of the SF, the symbol $\lambda$ will be used instead of the symbol $q$ for the transition rates in the next part of the article.
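The structure of matrix (6) can be sketched in Python as follows. The helper `build_rate_matrix` and the three-state example are hypothetical; the sketch only encodes the two properties stated above, namely that each diagonal entry is the negative sum of the other entries in its row and that rows of absorption states are zero.

```python
def build_rate_matrix(n, rates, absorbing):
    """Build an n x n transition rate matrix.

    rates: dict mapping (i, j), i != j, to the rate q_ij (1-based state indices).
    absorbing: set of 1-based indices of absorption states (their rows stay zero).
    """
    A = [[0.0] * n for _ in range(n)]
    for (i, j), q in rates.items():
        if i == j:
            raise ValueError("diagonal entries are derived, not given")
        if i in absorbing:
            raise ValueError("absorption state %d cannot have outgoing rates" % i)
        A[i - 1][j - 1] = q
    for i in range(n):
        if (i + 1) not in absorbing:
            # Diagonal entry: negative sum of the off-diagonal rates in the row.
            A[i][i] = -sum(A[i][j] for j in range(n) if j != i)
    return A


# Hypothetical example: one transient state (1) and two absorption states (2, 3).
A = build_rate_matrix(3, {(1, 2): 2e-6, (1, 3): 3e-6}, absorbing={2, 3})
```

Every row of the resulting matrix sums to zero, which is a convenient sanity check before the matrix is used in Equation (5).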

3. The Dangerous Failure Rate of the SF

Let the ESRCS implement only one SF and all its elements participate in the implementation of this SF, then the SI of safety function (SI-SF) can be identified with the SI ESRCS.
Because random failures occur randomly and continuously over time, CTMCs can be used to analyse the effects of random failures on SI-SF. In general, it is a state-oriented, graphic-mathematical method. The qualitative part of the method is related to the creation of a graphical model, which expresses the relationship between the elements of the considered system and their impact on SI-SF. Model creation is a very important activity in Markov analysis. Automatic model generation is theoretically possible under certain conditions [12], but is practically unusable. The extent to which the developed CTMC model corresponds to the real properties of the SF depends on the experience of the analyst and on knowledge of the operational and technical characteristics of the ESRCS that implements the SF. The quantitative part of the analysis is focused on the mathematical description of the graphical model and the subsequent calculation of the monitored system property, i.e., the dangerous failure rate of the SF.
Available sources of information [15,16,17] consider CTMC-based models with only one absorption state (Figure 3) or no absorption state when analysing the impact of random failures on SI-SF. A CTMC without an absorption state is considered when the recovery mechanism is also included in the model. The model without an absorption state cannot be used to calculate the dangerous failure rate of the SF.
If the CTMC contains one absorption state, then this state is considered as a down and dangerous state (state D) in safety analyses. The ESRCS (which implements the SF) enters this state due to the occurrence of a dangerous failure. If the state probability $p_D(t = 0) = 0$, then the probability $p_D(t)$ has the properties of a distribution function. The transition rate to state D can be interpreted as the dangerous failure rate of the SF and is determined by the equation
$$\lambda_D(t) = \frac{\dfrac{dp_D(t)}{dt}}{1 - p_D(t)}, \tag{7}$$
which is consistent with the definition of failure rate for a non-restored object.
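As a quick check of Equation (7), consider the simplest case, assumed here only for illustration, of a single transient state that moves to state D at a constant rate $\lambda$. Then $p_D(t) = 1 - e^{-\lambda t}$, and Equation (7) returns the constant rate:

$$\lambda_D(t) = \frac{\lambda e^{-\lambda t}}{1 - \left(1 - e^{-\lambda t}\right)} = \lambda.$$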
Recovery from state D is possible, but it is a special case of recovery. In general, the cause of a dangerous condition needs to be identified, which usually leads to changes in the system, and given the extent of the change, all procedures and measures within each life cycle phase should be applied appropriately as if it were a new system. In such a case, the recovery has its specific features, for example, because these changes are subject to assessment by an independent authority. After the recovery, ESRCS is considered “as good as new” and starts a new operation (RandF-SI of SF re-analysis starts again in time ( t = 0 ) ).
If the ESRCS operates in a continuous mode of operation and has an online mechanism for detecting and negating the fault, then the ESRCS can go to the down and safe state (in the next part of this article, this is referred to as state S). Recovery from this state is possible either without interruption of ESRCS operation or after interruption of operation (depending on the ESRCS architecture and the method of organization of the recovery).
However, in many cases, when the down state (state D or state S) is reached, the RandF-SI analysis of SF is required to be completed and the dangerous failure rate of SF (transition rate to state D) is calculated. Since states D and S are absorption states, in this case, Equation (7) cannot be used to calculate the dangerous failure rate of SF (transition rate to state D). Such a situation is shown in Figure 4.
In general, the rate of the transition from state $i$ to state $j$ for time $\Delta t$ is understood as the derivative of the transition probability, i.e.,
$$\lambda_{ij} = \frac{p_{ij}(t, t + \Delta t)}{\Delta t} = \frac{P(x(t + \Delta t) = j \mid x(t) = i)}{\Delta t} = \frac{P((x(t + \Delta t) = j) \cap (x(t) = i))}{P(x(t) = i) \cdot \Delta t}. \tag{8}$$
It is also true that
$$\sum_{i=1}^{k-1} P(x(t) = i) + \sum_{j=k}^{n} P(x(t) = j) = 1, \qquad \sum_{i=1}^{k-1} P(x(t) = i) = 1 - \sum_{j=k}^{n} P(x(t) = j). \tag{9}$$
Let it be true that
$$\sum_{i=1}^{k-1} P(x(t) = i) = \sum_{i=1}^{k-1} p_i(t), \qquad \sum_{j=k}^{n} P(x(t) = j) = \sum_{j=k}^{n} p_j(t). \tag{10}$$
The rate of transition to state $j$ (a state from the set of absorption states, $j \in A$) from the set of transient states is
$$\lambda_j = \sum_{i=1}^{k-1} \lambda_{ij} = \frac{\sum_{i=1}^{k-1} P((x(t + \Delta t) = j) \cap (x(t) = i))}{\sum_{i=1}^{k-1} p(x(t) = i) \cdot \Delta t} = \frac{\Delta p_j(t)}{\sum_{i=1}^{k-1} p(x(t) = i) \cdot \Delta t} = \frac{\dfrac{dp_j(t)}{dt}}{1 - \sum_{l=k}^{n} p_l(t)}. \tag{11}$$
If the system contains one absorption state (state D), then it follows from (11) that
$$\lambda_D(t) = \frac{\dfrac{dp_D(t)}{dt}}{1 - p_D(t)}. \tag{12}$$
Equation (12) corresponds to Equation (7).
If the system contains two absorption states (state D and state S), then it follows from (11) that
$$\lambda_D(t) = \frac{\dfrac{dp_D(t)}{dt}}{1 - (p_D(t) + p_S(t))}, \tag{13}$$
$$\lambda_S(t) = \frac{\dfrac{dp_S(t)}{dt}}{1 - (p_D(t) + p_S(t))}. \tag{14}$$
Another approach can be chosen for the CTMC in Figure 4. This CTMC can be described by a transition rate matrix (15) and a set of differential Equations (16):
$$\mathbf{A} = \begin{pmatrix} -(\lambda_D(t) + \lambda_S(t)) & \lambda_D(t) & \lambda_S(t) \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{15}$$
$$\frac{dp_X(t)}{dt} = -(\lambda_D(t) + \lambda_S(t)) \cdot p_X(t), \qquad \frac{dp_D(t)}{dt} = \lambda_D(t) \cdot p_X(t), \qquad \frac{dp_S(t)}{dt} = \lambda_S(t) \cdot p_X(t). \tag{16}$$
It follows from (16) that
$$\frac{\dfrac{dp_D(t)}{dt}}{\dfrac{dp_S(t)}{dt}} = \frac{\lambda_D(t)}{\lambda_S(t)}, \tag{17}$$
$$\lambda_D(t) = \frac{\dfrac{dp_D(t)}{dt}}{p_X(t)} = \frac{\dfrac{dp_D(t)}{dt}}{1 - (p_D(t) + p_S(t))}, \tag{18}$$
$$\lambda_S(t) = \frac{\dfrac{dp_S(t)}{dt}}{p_X(t)} = \frac{\dfrac{dp_S(t)}{dt}}{1 - (p_D(t) + p_S(t))}. \tag{19}$$
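A worked case may help to show why Equation (7) cannot be used with two absorption states. Assume, purely for illustration, that $\lambda_D$ and $\lambda_S$ in Figure 4 are constant with $\lambda_S > 0$, that $p_X(0) = 1$, and write $\Lambda = \lambda_D + \lambda_S$. Solving the system (16) then gives

$$p_X(t) = e^{-\Lambda t}, \qquad p_D(t) = \frac{\lambda_D}{\Lambda}\left(1 - e^{-\Lambda t}\right), \qquad p_S(t) = \frac{\lambda_S}{\Lambda}\left(1 - e^{-\Lambda t}\right).$$

Equation (18) recovers the constant rate, whereas Equation (7) applied to $p_D(t)$ alone gives a value that decays towards zero:

$$\frac{\dfrac{dp_D(t)}{dt}}{1 - (p_D(t) + p_S(t))} = \frac{\lambda_D e^{-\Lambda t}}{e^{-\Lambda t}} = \lambda_D, \qquad \frac{\dfrac{dp_D(t)}{dt}}{1 - p_D(t)} = \frac{\lambda_D e^{-\Lambda t}}{1 - \dfrac{\lambda_D}{\Lambda}\left(1 - e^{-\Lambda t}\right)} \xrightarrow[t \to \infty]{} 0.$$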

4. Rate of Dangerous Failure of a Safety Function for Dual Architecture with Two Absorption States

In operation, ESRCSs with a two-channel architecture that is based on the technique of composite fail-safety with comparison are very often used. Standard [1] requires RandF-SI to be performed not on the system but separately for each SF. For the sake of clarity of this paper, it is assumed that ESRCS consists of two hardware-identical and physically independent channels—channel R and channel L (Figure 5), which manage the EUC. Let ESRCS implement one SF, then the dangerous failure of ESRCS can be identified with the dangerous failure of SF.

4.1. CTMC for Dual Architecture with Two Absorption States

Let the ESRCS with dual architecture (Figure 5) have the following features:
  • the R and L channels are hardware identical, $\lambda_R = \lambda_L = \lambda$, where $\lambda_R$ is the failure rate of channel R, and $\lambda_L$ is the failure rate of channel L;
  • the ESRCS has a fault detection and negation mechanism, which is characterized by a diagnostic coverage coefficient $c$, fault detection time $t_d$, and fault negation time $t_N$;
  • the ESRCS operates in continuous operation mode.
The CTMC for evaluating an ESRCS with such properties is shown in Figure 6. The paper [11] also deals with a similar problem.
The characteristic of the individual states of the model in Figure 6 is described in Table 1.
The rate of the transition from state 3 to state S (induction of a safe ESRCS response because of the fault detection and negation) can be expressed by the relationship (20):
$$\delta = \frac{1}{t_d + t_N}. \tag{20}$$
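The CTMC in Figure 6 can be described by a system of differential equations:
$$\begin{aligned}
\frac{dp_{OK}(t)}{dt} &= -2\lambda \cdot p_{OK}(t), \\
\frac{dp_2(t)}{dt} &= 2\lambda \cdot (1 - c) \cdot p_{OK}(t) - \lambda \cdot (1 + c) \cdot p_2(t), \\
\frac{dp_3(t)}{dt} &= 2\lambda \cdot c \cdot p_{OK}(t) + \lambda \cdot c \cdot p_2(t) - (\lambda + \delta) \cdot p_3(t), \\
\frac{dp_S(t)}{dt} &= \delta \cdot p_3(t), \\
\frac{dp_D(t)}{dt} &= \lambda \cdot p_2(t) + \lambda \cdot p_3(t).
\end{aligned} \tag{21}$$
If at time $t = 0$ the ESRCS is in the OK state (Figure 6), then the initialization vector is:
$$P(t = 0) = \{1,\ 0,\ 0,\ 0,\ 0\}. \tag{22}$$
Then the probabilities of each state (with $p_1(t) = p_{OK}(t)$) are:
$$\begin{aligned}
p_1(t) &= e^{-2\lambda t}, \\
p_2(t) &= -2 \cdot e^{-2\lambda t} + 2 \cdot e^{-(1+c)\lambda t}, \\
p_3(t) &= \frac{2\lambda c}{(\lambda c - \delta)} \cdot \left(e^{-(\lambda+\delta)t} - e^{-(1+c)\lambda t}\right), \\
p_4(t) = p_D(t) &= e^{-2\lambda t} - 1 + \frac{2\delta}{(\lambda c - \delta) \cdot (1 + c)} \cdot \left(e^{-(1+c)\lambda t} - 1\right) - \frac{2c\lambda^2}{(\lambda c - \delta) \cdot (\lambda + \delta)} \cdot \left(e^{-(\lambda+\delta)t} - 1\right), \\
p_5(t) = p_S(t) &= \frac{2\delta c}{(\lambda c - \delta) \cdot (1 + c)} \cdot e^{-(1+c)\lambda t} - \frac{2c\lambda\delta}{(\lambda c - \delta) \cdot (\lambda + \delta)} \cdot e^{-(\lambda+\delta)t} + \frac{2\delta c}{(\lambda + \delta) \cdot (1 + c)}.
\end{aligned} \tag{23}$$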
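The closed-form probabilities of Equation (23) can be evaluated directly. The following Python sketch is an illustration only: the function name `state_probabilities` and the parameter values in the example call are hypothetical, and the formulas assume $\lambda c \ne \delta$, since the term $(\lambda c - \delta)$ appears in the denominators.

```python
import math


def state_probabilities(t, lam, c, delta):
    """State probabilities (p_OK, p_2, p_3, p_D, p_S) per Equation (23).

    Requires lam * c != delta, because (lam * c - delta) is a denominator.
    """
    k = lam * c - delta
    e_ok = math.exp(-2.0 * lam * t)
    e_2 = math.exp(-(1.0 + c) * lam * t)
    e_3 = math.exp(-(lam + delta) * t)
    p_ok = e_ok
    p_2 = -2.0 * e_ok + 2.0 * e_2
    p_3 = 2.0 * lam * c / k * (e_3 - e_2)
    p_d = (e_ok - 1.0
           + 2.0 * delta / (k * (1.0 + c)) * (e_2 - 1.0)
           - 2.0 * c * lam ** 2 / (k * (lam + delta)) * (e_3 - 1.0))
    p_s = (2.0 * delta * c / (k * (1.0 + c)) * e_2
           - 2.0 * c * lam * delta / (k * (lam + delta)) * e_3
           + 2.0 * delta * c / ((lam + delta) * (1.0 + c)))
    return p_ok, p_2, p_3, p_d, p_s


# Hypothetical parameter values chosen only to exercise the formulas.
probs = state_probabilities(t=50.0, lam=0.01, c=0.9, delta=0.5)
total = sum(probs)  # the five probabilities should add up to 1
```

Checking that the five values sum to one, and that the vector at `t = 0` equals the initial distribution (22), is a simple way to catch sign errors when transcribing such formulas.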

4.2. Dangerous Failure Rate of the SF for Dual Architecture with Two Absorption States

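Based on Equations (13) and (14), it can be deduced for the CTMC in Figure 6 that
$$\lambda_D(t) = \frac{\dfrac{dp_D(t)}{dt}}{1 - (p_D(t) + p_S(t))} = \frac{-2\lambda(\lambda c - \delta) \cdot e^{-2\lambda t} - 2\delta \cdot \lambda \cdot e^{-(1+c)\lambda t} + 2c\lambda^2 \cdot e^{-(\lambda+\delta)t}}{-(\lambda c - \delta) \cdot e^{-2\lambda t} + 2\lambda c \cdot e^{-(\lambda+\delta)t} - 2\delta \cdot e^{-(1+c)\lambda t}}, \tag{24}$$
$$\lambda_S(t) = \frac{\dfrac{dp_S(t)}{dt}}{1 - (p_D(t) + p_S(t))} = \frac{2c\lambda\delta \cdot e^{-(\lambda+\delta)t} - 2\delta c\lambda \cdot e^{-(1+c)\lambda t}}{-(\lambda c - \delta) \cdot e^{-2\lambda t} + 2\lambda c \cdot e^{-(\lambda+\delta)t} - 2\delta \cdot e^{-(1+c)\lambda t}}. \tag{25}$$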
By merging states 1, 2, and 3 into state X (Figure 6), the CTMC from Figure 6, can be replaced by CTMC that corresponds to the CTMC in Figure 4. CTMC with merged states is shown in Figure 7.
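For the probability of state X, it is valid that:
$$p_X(t) = p_1(t) + p_2(t) + p_3(t) = \frac{-(\lambda c - \delta) \cdot e^{-2\lambda t} + 2\lambda c \cdot e^{-(\lambda+\delta)t} - 2 \cdot \delta \cdot e^{-(1+c)\lambda t}}{(\lambda c - \delta)}, \tag{26}$$
$$\frac{dp_X(t)}{dt} = \frac{2\lambda \cdot (\lambda c - \delta) \cdot e^{-2\lambda t} - 2\lambda c \cdot (\lambda + \delta) \cdot e^{-(\lambda+\delta)t} + 2\delta \cdot \lambda \cdot (1 + c) \cdot e^{-(1+c)\lambda t}}{(\lambda c - \delta)}. \tag{27}$$
If the applied substitution is correct, then it must be true that:
$$\frac{dp_X(t)}{dt} = -(\lambda_D(t) + \lambda_S(t)) \cdot p_X(t). \tag{28}$$
After inserting Equations (24)–(26) into the right-hand side of Equation (28), we get:
$$\begin{aligned}
-(\lambda_D(t) + \lambda_S(t)) \cdot p_X(t) &= -\left( \frac{-2\lambda(\lambda c - \delta) \cdot e^{-2\lambda t} - 2\delta \cdot \lambda \cdot e^{-(1+c)\lambda t} + 2c\lambda^2 \cdot e^{-(\lambda+\delta)t}}{-(\lambda c - \delta) \cdot e^{-2\lambda t} + 2\lambda c \cdot e^{-(\lambda+\delta)t} - 2\delta \cdot e^{-(1+c)\lambda t}} + \frac{2c\lambda\delta \cdot e^{-(\lambda+\delta)t} - 2\delta c\lambda \cdot e^{-(1+c)\lambda t}}{-(\lambda c - \delta) \cdot e^{-2\lambda t} + 2\lambda c \cdot e^{-(\lambda+\delta)t} - 2\delta \cdot e^{-(1+c)\lambda t}} \right) \\
&\quad \cdot \frac{-(\lambda c - \delta) \cdot e^{-2\lambda t} + 2\lambda c \cdot e^{-(\lambda+\delta)t} - 2 \cdot \delta \cdot e^{-(1+c)\lambda t}}{(\lambda c - \delta)} \\
&= \frac{2\lambda(\lambda c - \delta) \cdot e^{-2\lambda t} + 2\delta \cdot \lambda \cdot (1 + c) \cdot e^{-(1+c)\lambda t} - 2c\lambda(\lambda + \delta) \cdot e^{-(\lambda+\delta)t}}{(\lambda c - \delta)} = \frac{dp_X(t)}{dt},
\end{aligned} \tag{29}$$
which confirms the correctness of this reasoning.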
By merging states S and D to state Y in the CTMC in Figure 7, a CTMC with two states is formed (Figure 8).
If it is true that
$$p_Y(t) = p_D(t) + p_S(t), \tag{30}$$
then the rate of the transition from state X to state Y can be expressed by:
$$\lambda_Y(t) = \frac{\dfrac{d(p_D(t) + p_S(t))}{dt}}{1 - (p_D(t) + p_S(t))} = \frac{\dfrac{dp_D(t)}{dt}}{1 - (p_D(t) + p_S(t))} + \frac{\dfrac{dp_S(t)}{dt}}{1 - (p_D(t) + p_S(t))} = \lambda_D(t) + \lambda_S(t). \tag{31}$$

4.3. Practical Application of the Equation for the Dangerous Failure Rate of the SF

Let the SF be implemented by a dual architecture based on the composite fail-safety technique with comparison, as shown in Figure 6. Let:
  • $\lambda = \lambda_L = \lambda_R = 1 \cdot 10^{-6}\ \mathrm{h}^{-1}$ (hardware identical channels);
  • the fault detection and negation mechanism is characterized by the coefficient of diagnostic coverage of faults $c = 0.99$ and the fault detection time $t_d = 2\ \mathrm{h}$; the fault negation time $t_N$ is negligible with respect to the value $t_d$ ($t_N \ll t_d$);
  • the considered time interval in which the dangerous failure rate of the SF is calculated is 20 years (the commonly assumed useful life of an ESRCS).
The calculations are performed in the software environment Wolfram Mathematica 11.
Figure 9 shows the time dependence $p_D(t)$ calculated according to Equation (23). The function is increasing, but it holds that $p_D(t) < 1$ because part of the probability is absorbed by state S, i.e., $p_D(t)$ does not meet the requirements for a distribution function. This trend can also be observed in the data presented in Table 2. Figure 10 shows the time dependence $\lambda_D(t)$ calculated according to Equation (24) for $\delta = 0.5\ \mathrm{h}^{-1}$, and Figure 11 shows the time dependence $\lambda_D(t)$ calculated according to Equation (24), but assuming that $\delta = 0\ \mathrm{h}^{-1}$. From a comparison of Figure 10 and Figure 11, it is possible to judge the influence of the fault negation mechanism on the value of the dangerous failure rate of the SF. This influence ultimately has a positive effect on prolonging the time interval between the proof tests. The effect of online diagnostics in combination with a proof test on SI-SF is described in paper [11].
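The comparison between $\delta = 0.5\ \mathrm{h}^{-1}$ and $\delta = 0\ \mathrm{h}^{-1}$ can be reproduced numerically. The following sketch uses Python rather than the Wolfram Mathematica environment used for the article, so it is an independent illustration and not the authors' computation. The function name `lambda_d` and the constant `HOURS_PER_YEAR` (a 365-day year) are assumptions of this sketch; the parameter values are those listed above.

```python
import math

HOURS_PER_YEAR = 8760.0  # assumption: 365-day year


def lambda_d(t, lam, c, delta):
    """Dangerous failure rate of the SF per Equation (24).

    Requires lam * c != delta, because (lam * c - delta) cancels between
    numerator and denominator only when it is non-zero.
    """
    k = lam * c - delta
    e_ok = math.exp(-2.0 * lam * t)
    e_2 = math.exp(-(1.0 + c) * lam * t)
    e_3 = math.exp(-(lam + delta) * t)
    num = (-2.0 * lam * k * e_ok
           - 2.0 * delta * lam * e_2
           + 2.0 * c * lam ** 2 * e_3)
    den = -k * e_ok + 2.0 * lam * c * e_3 - 2.0 * delta * e_2
    return num / den


lam, c = 1e-6, 0.99               # values from the list above
t_end = 20.0 * HOURS_PER_YEAR     # 20 years expressed in hours
with_negation = lambda_d(t_end, lam, c, delta=0.5)
without_negation = lambda_d(t_end, lam, c, delta=0.0)
print(with_negation, without_negation)
```

With these inputs the rate obtained for $\delta = 0.5\ \mathrm{h}^{-1}$ is markedly lower than the rate for $\delta = 0\ \mathrm{h}^{-1}$, which is the effect the two figures are meant to show.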

4.4. Merging of Absorption States

Let us convert a model with two absorption states (Figure 6) to a model with one absorption state by putting state D and state S into one state—state Y (Figure 12).
The probability of state Y (Figure 12) is:
$$p_Y(t) = 1 + e^{-2\lambda t} - \frac{2\lambda c \cdot e^{-(\lambda+\delta)t}}{(\lambda c - \delta)} + \frac{2 \cdot \delta \cdot e^{-(1+c)\lambda t}}{(\lambda c - \delta)}. \tag{32}$$
The equation for the rate of the transition to state Y, after adjustment according to (24) and (25), is
$$\lambda_Y(t) = \frac{\dfrac{dp_Y(t)}{dt}}{1 - p_Y(t)} = \frac{-2\lambda(\lambda c - \delta) \cdot e^{-2\lambda t} - 2 \cdot \lambda \cdot \delta \cdot (1 + c) \cdot e^{-(1+c)\lambda t} + 2 \cdot \lambda \cdot c \cdot (\lambda + \delta) \cdot e^{-(\lambda+\delta)t}}{-(\lambda c - \delta) \cdot e^{-2\lambda t} + 2\lambda c \cdot e^{-(\lambda+\delta)t} - 2\delta \cdot e^{-(1+c)\lambda t}} = \lambda_D(t) + \lambda_S(t). \tag{33}$$
Figure 13 shows the time dependence $p_Y(t)$ calculated according to (32), and Figure 14 shows the time dependence $\lambda_Y(t)$ calculated according to (33).
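The identity $\lambda_Y(t) = \lambda_D(t) + \lambda_S(t)$ can also be checked numerically. The following Python sketch evaluates the closed forms of Equations (24), (25), and (33) side by side; the function name `rates` and the parameter values are hypothetical and were chosen only to exercise the formulas, and $\lambda c \ne \delta$ is assumed.

```python
import math


def rates(t, lam, c, delta):
    """Return (lambda_D, lambda_S, lambda_Y) per Equations (24), (25), (33)."""
    k = lam * c - delta  # must be non-zero
    e_ok = math.exp(-2.0 * lam * t)
    e_2 = math.exp(-(1.0 + c) * lam * t)
    e_3 = math.exp(-(lam + delta) * t)
    # Common denominator of all three expressions.
    den = -k * e_ok + 2.0 * lam * c * e_3 - 2.0 * delta * e_2
    lam_d = (-2.0 * lam * k * e_ok
             - 2.0 * delta * lam * e_2
             + 2.0 * c * lam ** 2 * e_3) / den
    lam_s = (2.0 * c * lam * delta * e_3
             - 2.0 * delta * c * lam * e_2) / den
    lam_y = (-2.0 * lam * k * e_ok
             - 2.0 * lam * delta * (1.0 + c) * e_2
             + 2.0 * lam * c * (lam + delta) * e_3) / den
    return lam_d, lam_s, lam_y


# Hypothetical parameter values chosen only to exercise the formulas.
lam_d, lam_s, lam_y = rates(t=50.0, lam=0.01, c=0.9, delta=0.5)
difference = abs(lam_y - (lam_d + lam_s))  # expected to be at rounding level
```

Because the three expressions share one denominator, the check reduces to comparing the numerators, which is the same comparison carried out symbolically in Equation (33).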

5. Discussion and Conclusions

Different approaches can be encountered in analyses of the impact of random failures on SI-SF. If the system consists of components connected in a simple serial architecture with a defined SIL, then the simplified relations given in the standards can be used. An example of such a system is the application published in paper [18]. However, in more complex cases, simplified relationships from standards cannot be used, and it is necessary to create a suitable model. One of the tools used to create such a model is the CTMC. The available literature and scientific publications consider the use of a CTMC with one absorption state for the safety analysis. The transition of the SF to this state is interpreted as the occurrence of a dangerous fault of the SF and means the transition of a system that implements the SF to a down state. However, an ESRCS can also enter the down state due to fault detection and negation, which leads to the creation of a CTMC with two absorption states. The influence of fault detection and negation mechanisms on the safety properties of the SF and related problems are analysed in paper [11]. However, in paper [11], the analysis ends at the level of calculating the probability of occurrence of individual absorption states. The standard [1] requires the calculation of the dangerous failure rate for an SF in continuous operation mode. This paper directly solves this issue. The results can be used directly in the safety analysis of an SF implemented on hardware with fault detection and subsequent fault negation. From the point of view of CTMC analysis, such hardware leads to multiple absorption states. In this paper, the relations are applied to the calculation of the dangerous failure rate of the dual architecture with two absorption states (Section 4.1, Section 4.2, Section 4.3). In Section 4.4, the results are verified by merging the absorption states. Verification confirms the correctness of the derived equations.
The procedure for calculating the SF dangerous failure rate was designed to develop a real ESRCS (system with architecture 2oo3), intended for use in railway transport. This procedure has been assessed and accepted by an independent organization with international accreditation to assess the safety of SRCS for railway applications.
Comparing the results presented in this article with other publicly available scientific articles is problematic because the available literature solves this problem with only one absorption state (e.g., paper [19]).
Further development aims to extend this approach for use in ESRCS safety integrity assessment, which has a learning feature and can restart in uninterrupted operation. It is an ESRCS with a 2oo3 architecture. After identifying a failure state in one channel, this channel is isolated, and the system goes into a degraded mode (it works in the 2oo2 architecture). If it was a short-term fault condition, automatic recovery to full mode (without interruption of operation) is required. However, this activity must not lead to a reduction in safety integrity, which must be demonstrated.
This procedure could be applied to analyse other RAMS (reliability, availability, maintainability, and safety) parameters of ESRCS. However, creating a suitable CTMC for each property would be necessary. This contention constitutes some limitation of the presented approach, so further research in this area will focus on linking the results published in paper [11] and this paper to assess the RAMS parameters of the ESRCS comprehensively. The intention is to create a comprehensive model for typical architectures that will be useful in analysing all RAMS parameters.
The terms used in this paper are consistent with the terminology used in the standard [1], which is the internationally accepted standard that outlines the requirements, principles, and methods for safety assessment.

Author Contributions

Data curation, J.Ž.; formal analysis, K.R.; investigation, J.H.; methodology, K.R.; project administration, M.M.; resources, J.Ž.; supervision, K.R.; validation, J.Ž.; visualization, J.H.; writing—original draft, J.H.; writing—review and editing, M.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by KEGA grant number 010ŽU-4/2022.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All necessary data are given in this article (generated from the equations).

Acknowledgments

This work has been supported by the Educational Grant Agency of the Slovak Republic (KEGA) Number 010ŽU-4/2022: Implementation of special functions to the control systems for industrial applications.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. EN61508: Functional Safety of Electrical/Electronic/Programmable Electronic Safety-Related Systems. 2010. Available online: https://webstore.iec.ch/publication/22273 (accessed on 20 January 2022).
  2. Rástočný, K.; Bubeníková, E. Safety and Availability—Basic Attributes of Safety-Related Electronic Systems for Railway Signalling. In Informatics and Intelligent Applications; Springer Science and Business Media LLC: Sielpia, Poland, 2019; Volume 1049, pp. 69–82.
  3. Fithri, P.; Riva, N.A.; Susanti, L.; Yuliandra, B. Safety analysis at weaving department of PT. X Bogor using Failure Mode and Effect Analysis (FMEA) and Fault Tree Analysis (FTA). In Proceedings of the 2018 5th International Conference on Industrial Engineering and Applications (ICIEA); Institute of Electrical and Electronics Engineers (IEEE), Singapore, 26–28 April 2018; pp. 382–385.
  4. Fu, J.; Li, H.; Chi, Y.; Zhen, J.; Xu, X. nSIL Evaluation and Sensitivity Study of Diverse Redundant Structure. Reliab. Eng. Syst. Saf. 2021, 210, 107518.
  5. Ding, L.; Wang, H.; Kang, K. A novel method for SIL verification based on system degradation using reliability block diagram. Reliab. Eng. Syst. Saf. 2014, 132, 36–45.
  6. Haridasan, R.; Kumar, M.; Marathe, P.P. Safety analysis of 2oo4 coincidence logic systems. Int. J. Syst. Assur. Eng. Manag. 2014, 6, 26–31.
  7. Jin, H.; Lundteigen, M.A.; Rausand, M. New PFH-formulas for k-out-of-n:F-systems. Reliab. Eng. Syst. Saf. 2013, 111, 112–118.
  8. Kolek, L.; Ibrahim, M.Y.; Gunawan, I.; Laribi, M.A.; Zegloul, S. Evaluation of control system reliability using combined dynamic fault trees and Markov models. In Proceedings of the 2015 IEEE 13th International Conference on Industrial Informatics (INDIN); Institute of Electrical and Electronics Engineers (IEEE), Cambridge, UK, 22–24 July 2015; pp. 536–543.
  9. Robidoux, R.; Xu, H.; Xing, L.; Zhou, M. Automated Modeling of Dynamic Reliability Block Diagrams Using Colored Petri Nets. IEEE Trans. Syst. Man Cybern. Part A Syst. Humans 2010, 40, 337–351.
  10. Shu, Y.; Zhao, J. A simplified Markov-based approach for safety integrity level verification. J. Loss Prev. Process Ind. 2014, 29, 262–266.
  11. Rástočný, K.; Ždánsky, J.; Franeková, M.; Zolotová, I. Modelling of Diagnostics Influence on Control System Safety. Comput. Inform. 2018, 37, 457–475.
  12. Balak, J.; Rastocny, K. Use of tensor construction of Markov chains when evaluating observed feature of E-SRS. In Proceedings of the 2018 ELEKTRO; Institute of Electrical and Electronics Engineers (IEEE), Mikulov, Czech Republic, 21–23 May 2018; pp. 1–6.
  13. Balák, J.; Rástočný, K. Mathematical Model for Safety Evaluation of Distributed Interlocking System. In Informatics and Intelligent Applications; Springer Science and Business Media LLC: Berlin/Heidelberg, Germany, 2018; Volume 897, pp. 234–248.
  14. Peeters, J.; Basten, R.; Tinga, T. Improving failure analysis efficiency by combining FTA and FMEA in a recursive manner. Reliab. Eng. Syst. Saf. 2018, 172, 36–44.
  15. Torres, E.S.; Sriramula, S.; Celeita, D.; Ramos, G. Model for Assessing the Safety Integrity Level of Electrical/Electronic/Programmable Electronic Safety-Related Systems. In Proceedings of the 2019 IEEE Industry Applications Society Annual Meeting, Baltimore, MD, USA, 29 September–3 October 2019; Volume 1, pp. 1–7.
  16. Gabriel, A.; Ozansoy, C.; Shi, J. Developments in SIL determination and calculation. Reliab. Eng. Syst. Saf. 2018, 177, 148–161.
  17. Chen, H.; Yi, Q. Reliability and safety analysis of cross-redundant Structure based on Markov Process. In Proceedings of the 5th International Symposium on Computational Intelligence and Design, Hangzhou, China, 28–29 October 2012.
  18. Bačík, J.; Tkáč, P.; Hric, L.; Alexovič, S.; Kyslan, K.; Olexa, R.; Perduková, D. Phollower—The Universal Autonomous Mobile Robot for Industry and Civil Environments with COVID-19 Germicide Addon Meeting Safety Requirements. Appl. Sci. 2020, 10, 7682.
  19. Lu, Z.; Liang, X.; Zuo, M.J.; Zhou, J. Markov process based time limited dispatch analysis with constraints of both dispatch reliability and average safety levels. Reliab. Eng. Syst. Saf. 2017, 167, 84–94.
Figure 1. Factors that affect the SI of an ESRCS.
Figure 2. CTMCs with multiple absorption states.
Figure 3. CTMC with one absorption state.
Figure 4. CTMC with two absorption states.
Figure 5. Block diagram of a two-channel ESRCS architecture.
Figure 6. CTMC for two-channel ESRCS architecture with continuous mode of operation; a time-continuous fault detection and negation mechanism.
Figure 7. Merging transient states of the model.
Figure 8. CTMC with two states.
Figure 9. The time dependence of $p_D(t)$ for CTMC in Figure 6; $\lambda = 1 \cdot 10^{-6}\ \mathrm{h}^{-1}$, $c = 0.99$, $\delta = 0.5\ \mathrm{h}^{-1}$.
Figure 10. The time dependence of $\lambda_D(t)$ for CTMC in Figure 6; $\lambda = 1 \cdot 10^{-6}\ \mathrm{h}^{-1}$, $c = 0.99$, $\delta = 0.5\ \mathrm{h}^{-1}$.
Figure 11. The time dependence of $\lambda_D(t)$ for CTMC in Figure 6; $\lambda = 1 \cdot 10^{-6}\ \mathrm{h}^{-1}$, $c = 0.99$, $\delta = 0\ \mathrm{h}^{-1}$.
Figure 12. Merging of absorption states.
Figure 13. The time dependence of $p_Y(t)$ for CTMC in Figure 12; $\lambda = 1 \cdot 10^{-6}\ \mathrm{h}^{-1}$, $c = 0.99$, $\delta = 0.5\ \mathrm{h}^{-1}$.
Figure 14. The time dependence of $\lambda_Y(t)$ for CTMC in Figure 12; $\lambda = 1 \cdot 10^{-6}\ \mathrm{h}^{-1}$, $c = 0.99$, $\delta = 0.5\ \mathrm{h}^{-1}$.
Table 1. States of the model from Figure 6.

| State | Characteristics |
|---|---|
| OK | ESRCS is fault-free; neither channel is in a fault due to a random failure. ESRCS is in an up state. |
| 2 | Channel R or channel L is in an undetectable fault due to the occurrence of one random failure or due to the occurrence of several random failures. ESRCS is in an up state. |
| 3 | Channel R or channel L is in a detectable fault due to the occurrence of one random failure (transition from state OK to state 3) or due to the occurrence of several random failures (transition from state OK to state 2 and subsequent transition from state 2 to state 3). ESRCS is in an up state. |
| S | The safe state that the ESRCS enters after the fault detection and negation of channel R or channel L. ESRCS is in a down state. |
| D | Dangerous state: both channels are faulty. ESRCS is in a down state. |
Table 2. Probability $p_D(t)$ for CTMC in Figure 6; $\lambda = 1 \cdot 10^{-6}\ \mathrm{h}^{-1}$, $c = 0.99$, $\delta = 0.5\ \mathrm{h}^{-1}$.

| $t$ [year] | 0 | 20 | 200 | 1000 | 10,000 | 100,000 | 1,000,000 |
|---|---|---|---|---|---|---|---|
| $p_D(t)$ [$\times 10^{-3}$] | 0 | 0.244572 | 4.341654 | 5.027113 | 5.027115 | 5.027115 | 5.027115 |
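
The limiting value in the last columns of Table 2 can be cross-checked with a first-step analysis of a CTMC with two absorption states. The following Python sketch is only an illustration under stated assumptions: the transition rates in `RATES` are inferred from the state descriptions in Table 1 and are not copied from Figure 6, and the helper `absorption_probability` is a hypothetical name introduced here. With these assumed rates, the probability of eventually reaching state D from state OK comes out at roughly $5.03 \cdot 10^{-3}$, which is close to the limiting value in Table 2.

```python
# Parameters as given in the caption of Table 2 (rates per hour).
LAM = 1e-6    # random failure rate of one channel (lambda)
C = 0.99      # diagnostic coverage (c)
DELTA = 0.5   # fault detection and negation rate (delta)

# ASSUMED transition rates between the states of Table 1.
# They are an illustrative guess, not a transcription of Figure 6.
RATES = {
    "OK": {"3": 2 * C * LAM, "2": 2 * (1 - C) * LAM},
    "2": {"3": C * LAM, "D": LAM},
    "3": {"S": DELTA, "D": LAM},
}


def absorption_probability(rates, target):
    """Probability of eventual absorption in `target`, per transient state.

    Solves the first-step equations h_i = sum_j (q_ij / q_i) * h_j with
    h_target = 1 and h = 0 for every other absorbing state, using
    Gauss-Jordan elimination with partial pivoting.
    """
    states = list(rates)                       # transient states only
    n = len(states)
    index = {s: k for k, s in enumerate(states)}
    # Augmented matrix [A | b] of the linear system A h = b.
    a = [[0.0] * (n + 1) for _ in range(n)]
    for s, out in rates.items():
        i = index[s]
        total = sum(out.values())              # total exit rate q_i
        a[i][i] = 1.0
        for dest, q in out.items():
            if dest in index:                  # jump to a transient state
                a[i][index[dest]] -= q / total
            elif dest == target:               # jump straight into target
                a[i][n] += q / total
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return {s: a[index[s]][n] / a[index[s]][index[s]] for s in states}


p_d = absorption_probability(RATES, "D")
p_s = absorption_probability(RATES, "S")
print(f"p_D(inf) from OK = {p_d['OK']:.6e}")
print(f"p_S(inf) from OK = {p_s['OK']:.6e}")
```

Because S and D are the only absorption states, the two limiting probabilities computed from state OK should sum to one, which gives a quick consistency check on any alternative set of rates substituted into `RATES`.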
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rástočný, K.; Ždánsky, J.; Hrbček, J.; Medvedík, M. Calculation of the Dangerous Failure Rate of the Safety Function. Appl. Sci. 2022, 12, 2382. https://doi.org/10.3390/app12052382
