Article

Learning from Incidents in Socio-Technical Systems: A Systems-Theoretic Analysis in the Railway Sector

by Antonio Javier Nakhal Akel, Giulio Di Gravio *, Lorenzo Fedele and Riccardo Patriarca
Department of Mechanical and Aerospace Engineering, Sapienza University of Rome, Via Eudossiana—18, 00184 Rome, Italy
* Author to whom correspondence should be addressed.
Infrastructures 2022, 7(7), 90; https://doi.org/10.3390/infrastructures7070090
Submission received: 6 June 2022 / Revised: 23 June 2022 / Accepted: 28 June 2022 / Published: 30 June 2022
(This article belongs to the Special Issue Infrastructure Resilience in Emergency Situations)

Abstract
Post mortem incident investigations are vital to prevent the recurrence of similar events and improve system safety. The increasing interactions of technical, human and organizational elements in modern systems pose new challenges for safety management, demanding approaches capable of complementing techno-centric investigations with social-oriented analyses. Traditional risk analysis methods, rooted in event-chain models and looking for individual points of failure, are increasingly inadequate to deal with system-wide investigations. They normally focus on an oversimplified analysis of how work was expected to be conducted, rather than exploring what exactly occurred among the involved agents. Therefore, a detailed analysis of incidents that goes beyond the immediate failures and extends towards socio-technical threats is necessary. This study adopts the system-theoretic accident model and process (STAMP) and its nested accident analysis technique, i.e., causal analysis based on systems theory (CAST), to propose a causal incident analysis in the railway industry. The study proposes a hierarchical safety control structure, along with system-level safety constraints, and detailed investigations of the system’s components with the purpose of identifying physical and organizational safety requirements and safety recommendations. The analysis is contextualized in the demonstrative use of a railway case: it is instantiated for a 2011 incident in the United Kingdom (UK) railway system. Since the CAST technique requires information regarding incidents, facts and processes, the case study provided the information needed to analyze the incident based on systems theory. The results show the benefits of a CAST application in highlighting criticalities at both element and system level, spanning from component failure to organizational and maintenance planning, with a view to enhancing safety performance in normal work practices.

1. Introduction

Accident causal models are meant to explain how events occur in order to support the most basic elements of a risk management process. Accident models therefore help to identify an accident’s causal factors and hence determine what measures need to be implemented to avoid similar consequences or reduce their likelihood in the future [1,2]. As systems have grown more complex over time, many accidents no longer result from a single trigger event but are caused by more complex etiological structures that imply enormous costs [3]. Organizations can thus be interpreted as socio-technical systems, characterized by interrelated and interdependent structures in which social and technical aspects remain intertwined [4,5]. Accident reports are sometimes poorly written when referring to causes, since the analysis frequently stops after finding someone or something to blame, and the opportunity to learn important lessons from the accident is lost [6,7]. Such reports also tend to overlook that accidents involve different factors, e.g., human factors, mission, equipment, financial pressures, reputation, and information, that increase the normal operational variability of the system [8]. Moreover, any complex system is characterized by nonlinear behaviors generated from the interactions among the system components. Hence, dysfunctional interactions among system components are usually the most suitable way to describe accidents, replacing event-chain-based models that focus on individual component failures [9,10,11,12].
One interesting stream of research in this sense is built around the system-theoretic accident model and process (STAMP), which is rooted in control theory and previous experience of hierarchical safety control actions [8,11]. STAMP has been used to identify the systemic factors behind accident occurrence, as it provides the basis for maximizing learning from events [6]. In STAMP, an accident is regarded as a complex process, not just the sum of stand-alone events. It is represented through a dynamic process that allows mapping which safety constraints were violated in a certain safety control structure (SCS), and what systemic control failures happened during the accident. STAMP is the basis of a powerful accident analysis tool, causal analysis based on systems theory (CAST), which provides a new approach to causality for analyzing and designing systems against accidents, especially in complex socio-technical cases [13]. CAST is able to capture infrastructure issues as well as organizational, technical and human concerns.
From these observations, this paper aims to present the usage of CAST as a systemic methodology for the analysis of incidents and for the definition of synthetic yet representative explanatory bookends to report the intertwined etiology of an event. This aim has been contextualized in the railway industry, studying an incident that occurred in the United Kingdom (UK), according to the respective incident report made available by the Rail Accident Investigation Branch (RAIB).
The remainder of the paper is structured as follows. Section 2 presents a brief literature review, detailing the uses of CAST in different industries and process systems. Section 3 describes the methodology adopted to study the incident in the UK railway system. Section 4 presents a case study that exploits the capabilities of CAST in a real railway investigation. Section 5 then discusses the results and some managerial implications. Lastly, the conclusions summarize the benefits of applying CAST and the way forward to extend its application to other industry sectors.

2. Literature Review

Accident models based on control theory develop through two key phases that emphasize the hierarchical relationships between various system components. The first phase is the development of a diagram of functional interactions using control-feedback structures for system components at various hierarchical levels across life-stages, called the safety control structure (SCS) [13]. In the second phase, the functional interactions between all system components are analyzed systematically across all levels of the SCS to trace the dysfunctional interactions from the physical systems up to the higher institutional level. One of the most-used modeling approaches of this type is the system-theoretic accident model and process (STAMP) method [2]. STAMP provides a useful framework to analyze how inadequate control and constraint violations may occur. As a model, STAMP underpins a powerful accident analysis technique, causal analysis based on system theory (CAST) [14,15]. CAST provides a framework to examine the entire design and operational characteristics of the socio-technical system to determine the systemic causal factors and the modifications needed to prevent similar losses in the future [13,16]. The remainder of this section describes how the CAST method has previously been applied in industrial processes, highlighting its advantages over traditional causal analysis methods [17].
Within the gas transportation industry, accident prevention strategies mainly rely on risk assessment using traditional event-based accident models, which are not sufficient for explaining accidents in complex socio-technical systems [18]. A CAST analysis has been used to expose the safety flaws and uncover the rationale behind the decisions leading up to a catastrophic pipeline gas explosion [15]. This analysis went beyond purely technical factors, allowing the analysis of complex socio-technical accidents such as underground pipeline gas accidents. Urban underground pipelines, in particular, are regarded as high-risk processes, where various failures can give rise to unexpected interactions that cause the collapse of a safety system [18,19,20]. Similar issues emerge for oxygen gas pipelines in safety-critical sectors, such as hospitals [21].
Within the defense sector, a new approach combined a STAMP-inspired safety control theory with human reliability analysis (HRA) [22] to focus more on human error causality within a socio-technical system. This combination was rationalized and proven feasible by using a test case of the Minuteman III missile accident. The integrated method based on the STAMP and systems dynamic model fitted within HRA research, highlighting the contributing factors and categorizing them. The identified factors were related to a broader socio-technical perspective and were suggested to be more comprehensive than the causation results presented in the accident investigation report issued officially [23].
Several works combined system control theory with additional models or methods to gain further insight from the analysis, improve the safety constraints of the system and reduce the potential for unexpected events. In the air traffic domain, a STAMP-driven framework was integrated with levels of the human factors analysis and classification system (HFACS) [24] to design a STAMP-HFACS diagram, in which the STAMP individual and organizational error taxonomy was extended by the HFACS taxonomy. This research showed how the HFACS categories and factors could be incorporated into the CAST schema for an accident of a Coast Guard helicopter [25]. In the aerospace industry, an application of CAST has been proposed for the suit water intrusion mishap during International Space Station EVA 23 [26].
The maritime industry has also attracted interest for STAMP-based applications. It is among the most regulated industries, with many comprehensive standards regarding safety at sea, security, health and protection of crew members, and environmental safeguarding [27], partly because of the media attention that naval incidents attract and the associated reputational risks [28]. In these settings, CAST provided a more systematic and comprehensive perspective on accident analysis, helping to discover more problems and defects at different levels in a cruise grounding accident [29]. Specific recommendations have been derived through CAST for blackout, loss of propulsion and near-grounding situations in stormy waters [28].
Still in transportation, the railway industry is a widely investigated domain from a safety perspective, usually described as an intertwined set of human–technical–organizational factors [30]. In this regard, a study developed control theory and accident taxonomies in the Japanese High-Speed Railway (HSR) context. The analysis revealed challenges and possible improvements to organizational and institutional factors affecting safety for the Japanese HSR. CAST proved to be effective in uncovering a generalized accident mechanism archetype that considers dynamic interactions between various sub-systems and stakeholders at the institutional and organizational levels [2].
The literature confirms that CAST, despite being a relatively recent method, has attracted great interest as an accident analysis tool in various fields, including subsea systems [31], healthcare systems [32], maritime transportation [33], coal mining [34] and nuclear power plants [35], among others. These studies suggest that CAST prevails over traditional linear accident models by accounting for human and organizational factors on top of technical aspects [36]. As such, it is a reasonable testbed for tracing more complicated failure mechanisms, such as those expected to be present in railway operations.

3. Methods

3.1. System-Theoretic Accident Model and Process

In the traditional causality models, accidents are caused by chains of failure events, where each failure directly causes the next one in the chain. In systems theory, emergent properties, such as safety, arise from the interactions between the system components [11]. The emergent properties are controlled by imposing constraints on the behavior of and interactions between the components. Safety then becomes a control problem where the goal of the control is to enforce the safety constraints. Accidents result from inadequate control or enforcement of safety-related constraints on the development, design, and operation of the system [13]. The system-theoretic accident model and process (STAMP) is based on these principles.
Controls must be established to govern a system’s behavior by enforcing the safety constraints in its design and operation. These controls do not necessarily need to involve a human or automated controller: component behaviors and unsafe interactions may be controlled through physical design, or through social controls that include organizational, governmental, and regulatory structures and may span cultural and policy constraints. In this framework, preventing future accidents requires shifting from a focus on preventing failures to the broader goal of designing and implementing controls, understanding why an accident occurred and determining why the previous controls were ineffective [37]. STAMP rests on three basic constructs, introduced here with relevance for railway operations [6,38,39].
  • Safety constraints: the central concept in STAMP is not the event but the constraint; events that lead to losses occur only because safety constraints have not been successfully enforced. As technology advances, the identification and enforcement of safety constraints become increasingly hard. In the past, many constraints were physical, related to the strength of the materials or equipment, and therefore allowed the use of passive controls that ensured safety simply by being physically present. Active controls, on the other hand, require certain actions to provide protection: (i) the detection of a hazardous event; (ii) the measurement of variables; (iii) the interpretation of measurements; (iv) the response.
    The introduction of active controls, e.g., electromechanical controls, allowed operators to control the process from a great distance. However, the loss of direct contact with the process may cause a loss of information and, consequently, additional errors. To compensate, designers must provide operators with feedback on their actions and on any errors that may have occurred.
  • Hierarchical Safety Control Structure: each level imposes constraints on the underlying layers to control their behavior. Control processes work between the layers to control the processes at the lower level and to enforce safety constraints. Incidents occur when audit processes provide insufficient control and/or safety constraints are violated; some examples are: (i) missing constraints; (ii) inadequate safety-check actions; (iii) poor communication; (iv) incorrect procedures; (v) lack of feedback to detect errors. Therefore, among the hierarchical levels of each control structure, downward communication channels are required to provide the information needed to impose constraints, and upward communication channels are required to obtain feedback and measure how well the constraints are respected.
    Control structures change over time, especially those that include human components. For this reason, controls need not be applied in an authoritarian way by imposing rules or requirements; it is possible to set objectives to be achieved at the lower levels and leave to them the decision on the best way to achieve the required result.
  • Process Model: the process model plays an important role in understanding why accidents occur and why humans exert certain control over safety-critical systems, as well as in the design of safety systems. Each controller, whether human or automated, needs a process model in a STAMP formulation. The model may contain only two variables or may be very complicated, with many parameters. To define a process model, it is necessary to isolate the defined control laws and the value of the variables over time, and to relate them to the possible varieties of flawed execution: incorrect process models; incorrect commands given; correct actions requested too early or too late; control applied too early or maintained too long.
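The hierarchical structure described above can be made concrete with a small data model. The following Python sketch is illustrative only and is not part of STAMP itself: the class names, the `missing_feedback` helper, the level numbering and the three example controllers are all assumptions introduced here. It represents controllers at hierarchical levels, downward control channels and upward feedback channels, and flags control links that have no matching feedback channel.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Controller:
    """A human or automated controller and the constraints it must enforce."""
    name: str
    level: int  # assumed convention: 0 is the top of the hierarchy
    constraints: list[str] = field(default_factory=list)


@dataclass
class ControlStructure:
    """Hierarchical safety control structure with two kinds of channels."""
    controllers: dict[str, Controller] = field(default_factory=dict)
    control: set[tuple[str, str]] = field(default_factory=set)   # (upper, lower)
    feedback: set[tuple[str, str]] = field(default_factory=set)  # (lower, upper)

    def add(self, controller: Controller) -> None:
        self.controllers[controller.name] = controller

    def add_control(self, upper: str, lower: str) -> None:
        # Downward channel: the upper level imposes constraints on the lower one.
        if self.controllers[upper].level >= self.controllers[lower].level:
            raise ValueError("control must flow from a higher to a lower level")
        self.control.add((upper, lower))

    def add_feedback(self, lower: str, upper: str) -> None:
        # Upward channel: the lower level reports how well constraints are met.
        self.feedback.add((lower, upper))

    def missing_feedback(self) -> list[tuple[str, str]]:
        """Control links with no upward feedback channel closing the loop."""
        return sorted(
            (upper, lower)
            for upper, lower in self.control
            if (lower, upper) not in self.feedback
        )


# Hypothetical three-level example; the names are placeholders.
scs = ControlStructure()
scs.add(Controller("Regulator", 0, ["Define maintenance standards"]))
scs.add(Controller("Operator", 1, ["Apply the maintenance plan"]))
scs.add(Controller("Maintainer", 2, ["Inspect components"]))
scs.add_control("Regulator", "Operator")
scs.add_control("Operator", "Maintainer")
scs.add_feedback("Operator", "Regulator")

print(scs.missing_feedback())
```

In this toy structure the link from the operator to the maintainer is reported, since no feedback channel returns from the maintainer. A fuller model would also attach a process model to each controller.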

3.2. Causal Analysis Based on System Theory

Causal analysis based on system theory (CAST) is a method used to understand the nature of an adverse or undesired event that leads to loss in a certain process [31]. The purpose of using CAST is to learn how to avoid losses in the future. The causes identified should not be reduced to an arbitrary “root cause”: when they are, many systemic factors are often omitted from accident reports, with the result that some important, far-reaching causes are ignored or never fixed [13]. Figure 1 depicts the CAST methodology step by step, and each step is described in the remainder of this section.
The method therefore tries to learn as much as possible from every accident, in order to define the causes and probable causes, find the issues responsible, and reduce the losses they cause so as to avoid future accidents.
Step 1. Basic Components and Information Gathering
The method starts by identifying the boundaries of the system under analysis. Then, the hazards that led to the loss and the constraints that must be satisfied in the design and operation of the system need to be identified. The constraints cover the concerns of the investigation, and the SCS to be considered may extend beyond the boundaries of the system itself and, indeed, beyond its responsibilities [13]. Deriving the safety constraints from the hazard is usually straightforward, except perhaps for the inclusion of constraints to handle the case where the hazard is not avoided. The failure or unsafe behavior of a system component is not a system hazard; it is rather the cause of a hazard. Once such information is defined, it is essential to understand what happened in the controlling process by checking the following items [23]:
  • Requirements for hazard mitigation: check if the industry provides protection against hazards.
  • Controls: verify if the safety equipment (controls) was designed to satisfy the above hazard mitigation requirements.
  • Missing or inadequate controls that might have prevented the accident.
  • Failures: check for damage in the process equipment.
  • Unsafe interactions: verify if there were dangerous interactions among the system components that contributed to the accident.
  • Contextual factors: verify if there are factors that can influence the process.
  • Summary of the roles: inspect the roles of the components in the accident.
Step 2. Modeling the Safety Control Structure
Control processes operate between levels to control the processes at lower levels in the hierarchy. These control processes enforce the safety constraints for which the control process is responsible. Accidents occur when these processes provide inadequate control, and the safety constraints are violated by the behavior of the lower-level components [6]. At each level of the hierarchical structure, inadequate control may result from missing constraints (unassigned responsibility for safety), inadequate safety control commands, commands that were not executed correctly at a lower level, or inadequately communicated or processed feedback about constraint enforcement. The cause is always that the control structure and the controls constructed to prevent the hazard were not effective in preventing it. When CAST analyses already exist for a certain type of loss, the controls will probably have been identified in previous analyses [13]; the purpose of CAST is then (i) to determine why they fell short and how they might be improved; (ii) to generate questions and answer them; (iii) to identify further controls and constraints.
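The four sources of inadequate control named above lend themselves to a simple classification. The sketch below is a hedged illustration, not a procedure prescribed by CAST: the enumeration names and the `classify_control` function are assumptions made for this example, and the yes/no answers would in practice come from the investigation rather than from boolean flags.

```python
from __future__ import annotations

from enum import Enum


class ControlFlaw(Enum):
    """Sources of inadequate control at one level of the hierarchy."""
    MISSING_CONSTRAINT = "missing constraint (unassigned responsibility for safety)"
    INADEQUATE_COMMAND = "inadequate safety control command"
    INCORRECT_EXECUTION = "command not executed correctly at a lower level"
    INADEQUATE_FEEDBACK = "feedback inadequately communicated or processed"


def classify_control(
    responsibility_assigned: bool,
    command_adequate: bool,
    executed_correctly: bool,
    feedback_adequate: bool,
) -> list[ControlFlaw]:
    """Return the flaws implied by the investigator's answers for one control link."""
    checks = [
        (responsibility_assigned, ControlFlaw.MISSING_CONSTRAINT),
        (command_adequate, ControlFlaw.INADEQUATE_COMMAND),
        (executed_correctly, ControlFlaw.INCORRECT_EXECUTION),
        (feedback_adequate, ControlFlaw.INADEQUATE_FEEDBACK),
    ]
    return [flaw for satisfied, flaw in checks if not satisfied]


# Hypothetical control link: responsibility and command were fine,
# but execution and feedback were not.
flaws = classify_control(True, True, False, False)
for flaw in flaws:
    print(flaw.value)
```

Keeping the categories in an enumeration makes it easy to count how often each kind of flaw appears across the levels of a control structure.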
Step 3. Individual Component Analysis: Why were the Controls Ineffective?
Once the SCS has been developed, it is possible to determine which controls contributed to the losses in the process. The CAST analysis involves identifying the flaws in this model and determining why they occurred, in order to identify each component’s role in the loss and start determining whether its responsibilities were satisfied. When the component analysis has been completed and documented, a summary can be made of the role of the component in the accident along with recommendations.
Consequently, the analysis proceeds by generating questions about what happened in the event, which must be answered during the investigation in order to give a detailed description of why the accident occurred. The analysis can focus on the following issues [40]:
  • Component responsibilities related to the accident.
  • Contribution (actions, lack of actions, decisions) to the hazardous state.
  • Flaws in the mental/process model contributing to the actions.
  • Contextual factors explaining the actions, decisions, and process model flaws.
Step 4. Analyzing the Control Structure as a Whole
The systemic analysis focuses on factors that affect the behaviors and interactions of all the components working together within the SCS to prevent hazardous system states. The following are some of the important systemic factors that might be considered during the analysis [13]:
  • Communication and coordination.
  • The safety information system.
  • Safety culture.
  • Design of the safety management system.
  • Changes and dynamics over time, in the system and in the environment.
  • Internal and external economic and related factors in the system environment not covered previously in the analysis.
Step 5. Generating Recommendations and Changes to the SCS
The process of generating recommendations should be straightforward if the analysis has been conducted thoroughly. The aim of the recommendations is to change the SCS in the part of the process that failed in order to prevent a similar loss in the future. Sometimes recommendations are made but never implemented, so not only must there be some way to ensure that recommendations are followed, but there must also be feedback to ensure that they are effective in achieving objectives and strengthening the SCS. A continuous hazard improvement program can be designed as part of any risk management program in which both the hazard and the recommendations made to reduce it are monitored. There are three requirements to create and implement the recommendations [13]:
  • Assigning responsibility for implementing the recommendations.
  • Checking that they have been implemented.
  • Establishing a feedback system to determine whether they were effective in strengthening the controls.
The goal here is to ensure that the organization is continually learning and improving its risk management efforts so that the potential for losses is reduced over time.
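The three requirements for recommendations (assigned responsibility, checked implementation, effectiveness feedback) can be tracked with a very small record. The following sketch is only one possible bookkeeping scheme under stated assumptions: the `Recommendation` class, its status labels and the example entries are hypothetical and do not come from the incident report.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Recommendation:
    text: str
    owner: Optional[str] = None       # who is responsible for implementing it
    implemented: bool = False         # has implementation been checked?
    effective: Optional[bool] = None  # feedback result; None means not yet assessed

    def status(self) -> str:
        """Walk through the three requirements in order."""
        if self.owner is None:
            return "unassigned"
        if not self.implemented:
            return "assigned, not yet implemented"
        if self.effective is None:
            return "implemented, awaiting feedback"
        return "effective" if self.effective else "ineffective, revise controls"


def open_items(recommendations: list[Recommendation]) -> list[Recommendation]:
    """Recommendations that still need attention in the improvement program."""
    return [r for r in recommendations if r.status() != "effective"]


# Hypothetical entries for illustration only.
recs = [
    Recommendation("Measure cant between joints during repairs"),
    Recommendation("Set a deadline for suspension reassessment", owner="Team A"),
    Recommendation("Mandate new maintenance checks", owner="Team B",
                   implemented=True, effective=True),
]

for rec in open_items(recs):
    print(f"{rec.status()}: {rec.text}")
```

Treating "not yet assessed" as a distinct state keeps an implemented recommendation open until feedback on its effectiveness has actually been collected.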

4. Case Study

The following section describes a train derailment incident in the UK railway system. Despite its limited severity, i.e., small damage to the train structure and no injured people [41], we decided to apply CAST to it in order to show how even small events can offer high learning potential if analyzed through a systemic lens.

4.1. Summary of the Accident

On 26 August 2011, at about 00:44, a train derailment occurred on Network Rail infrastructure just before Bordesley Junction, Birmingham. The fourth wagon (PHA type) from the end of freight train 6Z31 from Lafarge Aggregates Ltd. (Lutterworth, UK) derailed while travelling at 18 km/h. The freight train continued, and the following three wagons (KJA type) were pulled into derailment at the junction. The derailed wagons then ran afoul of the adjacent line for 103 m before they re-railed on a crossover. Finally, the train stopped with its rear wagon 33 m beyond the crossover. Figure 2 shows a geographical map highlighting the location of the event, and Figure 3 shows the path followed by the freight train in greater detail.
During the derailment, another freight train (6O46) was approaching on the adjacent line. The train stopped when the driver saw the signal for the junction change from green to red. He could see clouds of dust from the train coming towards him. Both freight trains stopped alongside each other. Neither the train drivers nor any staff were injured.
The rear PHA wagon, which ran derailed, suffered damage to its suspension and braking equipment. There was extensive damage to the track and signaling equipment at the junction. The railway line through the junction remained closed while the train was repaired so it could be moved to a nearby siding; a restricted service in one direction over the junction was implemented in the early hours of 27 August, when the track damage was repaired and the signaling restored. A full service was running again on 28 August, albeit with an emergency speed restriction for trains over Bordesley Junction, Birmingham.

4.2. Causal Analysis Based on System Theory (CAST) Application

This section proposes a step-by-step application of CAST to document the applicability of the STAMP SCS as a means of isolating and providing the relevant descriptive findings about the event.
Step 1. Basic components and information gathering to identify the hazards and the safety constraints involved in the loss
The combination of factors related to the suspension on the wagon and the track geometry at the junction could produce a loss related to damage to the train or injuries to people.
For the train integrity, the related hazard is uncontrolled train passage at crossovers. Consequently, it is possible to define the following high-level safety constraints:
  • Operating conditions must be set according to the constraints posed by the infrastructure company and the train company.
  • Infrastructure maintenance must be timely and effective.
  • Train equipment maintenance must be timely and effective.
  • The railway signage system must be able to communicate the real situation to the driver.
  • The braking system must function properly when passing over a crossover.
For public safety, the hazard is the exposure of operators on board or on the track and citizens to a train derailment. Accordingly, the respective high-level safety constraints are:
  • The safety system of the train must protect staff in case of derailment.
  • Warnings and other measures must be available to protect operators and citizens close to crossovers.
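For traceability in the later steps, the two hazards and their high-level safety constraints can be recorded in a simple mapping. The sketch below copies the hazards and constraints listed above; the `SC-n` identifiers and the `checklist` helper are assumptions added for illustration and do not appear in the original analysis.

```python
# Hazards and high-level safety constraints as listed in Step 1.
SAFETY_CONSTRAINTS = {
    "Uncontrolled train passage at crossovers": [
        "Operating conditions must be set according to the constraints posed "
        "by the infrastructure company and the train company.",
        "Infrastructure maintenance must be timely and effective.",
        "Train equipment maintenance must be timely and effective.",
        "The railway signage system must be able to communicate the real "
        "situation to the driver.",
        "The braking system must function properly when passing over a crossover.",
    ],
    "Exposure of operators on board or on the track and citizens to a train derailment": [
        "The safety system of the train must protect staff in case of derailment.",
        "Warnings and other measures must be available to protect operators "
        "and citizens close to crossovers.",
    ],
}


def checklist(constraints):
    """Flatten the mapping into (identifier, hazard, constraint) rows.

    The SC-n identifiers are hypothetical labels introduced for this sketch.
    """
    rows = []
    number = 1
    for hazard, items in constraints.items():
        for item in items:
            rows.append((f"SC-{number}", hazard, item))
            number += 1
    return rows


rows = checklist(SAFETY_CONSTRAINTS)
for identifier, hazard, constraint in rows:
    print(f"{identifier} [{hazard}] {constraint}")
```

Numbering the constraints in this way would allow the violation codes used later in the analysis to be cross-referenced against them, although the source does not define such a mapping.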
Step 2. Modeling the Safety Control Structure to map the process system
Nine main organizations were involved in the incident. Four of them belong to the UK government and are in charge of creating the regulations to manage and maintain the transport system; the remaining organizations belong to the UK railway system and guarantee and manage the operation of the rail transport service.
Some organizational aspects are relevant here. Lafarge assumed that its PHA wagon fleet could continue to operate and be maintained as before unless told otherwise by the Private Wagon Registration Agreement Management Group (PWRAMG) or Wabtec. Wabtec continued to maintain the PHA wagon fleet to the approved maintenance plan, as it was contracted to do, because it expected the PWRAMG or Lafarge to instruct it otherwise if necessary. The PWRAMG continued to give its agreement for the PHA wagon fleet to operate over the national network, as its primary role was to check that the requirements of the Private Wagon Registration Agreement (PWRA) were being met. Table 1 summarizes the roles of each organization, as per the discussed incident.
Table 1 also lists the most important laws and certifications applicable to the operation of the freight train system. Other specific maintenance plans and certifications of cars and trains of the railway system are not considered in our analysis.
Table 1 thus describes the components of the UK railway system and their roles. This information supports the creation of the STAMP model of the system and identifies the safety constraints violated or inadequately provided during the incident.
Step 3. Individual Component Analysis: Why were the Controls Ineffective?
Actions to analyze the organizational processes
This section describes the violations of the system safety constraints that led to the incident. The violations are listed for each involved organization and clustered into organization-specific boxes (Box 1, Box 2, Box 3 and Box 4), in which each violation is identified by a code to ease understanding and traceability in the analyses developed in the subsequent sections.
Box 1. Safety constraints violations from Network Rail management Infrastructure.
Network Rail management Infrastructure (NRI)
Safety constraints violated or inadequately provided:
NRI.1
The maintenance team was carrying out measured shovel packing repairs at a track joint before the first track twist and at the joints within the switches and crossing for the junction after the second track twist. The aligned waveforms for the section of track with the track twists did not change throughout 2011, but they did change over time in the vicinity of the joints as a result of the work done by the maintenance team.
NRI.2
The maintenance team had carried out similar repairs at these joints several times in the past. They did not measure the cant in the section of plain line between the joints, so they did not find the reported track twists. Therefore, the track twists went unrepaired.
NRI.3
Network Rail maintenance had arranged for a tamping shift to take place at Bordesley Junction during an overnight engineering possession on 21 and 22 August. The work started over two hours later than planned as the tampers did not arrive at their starting places at Bordesley Junction until 02:45 h. The tamping shift did take place that night, but it ran short of time so not all of the planned work could be done.
NRI.4
The ballast in places at Bordesley Junction was contaminated with dirt and dust, especially near to the overbridged section. Witnesses suggested that the formation beneath the track bed at the junction was poor due to water discharging from the drains on the overbridged section onto the track at the junction.
NRI.5
Although the rail was well within its maintenance limits, the side wear changed the contact angle between the wheel flange and the rail head. This increased the likelihood of the flange climbing onto the rail head.
Box 2. Safety constraint violations from Private Wagon Registration Agreement Management Group office.
Network Rail PWRAMG (NRM)
Safety constraints violated or inadequately provided:
NRM.1
The report issued by Serco in 2009 identified the problems in the suspension, but at the time of the incident: (i) no modifications had been made to the suspension on REDA 16066; (ii) the way in which the PHA wagon fleet was operated on the national network had not been reassessed; (iii) no formal changes had been made to the wagons’ maintenance regime.
NRM.2
The Network Rail PWRAMG identified a number of changes to the suspension components to reduce the likelihood of the wheels locking up but did not set a timescale for completion of this assessment.
NRM.3
The Network Rail PWRAMG knew from the testing carried out by Serco that the suspensions were prone to lock up, but the wagon fleets continued running in service; based on the incident and failures data, they decided that no further action was needed.
NRM.4
The Network Rail PWRAMG issued a letter to all private wagon owners which provided details of the proposed engineering changes and the new maintenance checks. It did not mandate that the maintenance checks should be carried out; the maintenance checks became mandatory nineteen months after they were first proposed by the PWRAMG.
Box 3. Safety constraint violations from Wabtec maintenance Staff.
Wabtec maintenance Staff (WS)
Safety constraints violated or inadequately provided:
WS.1
The maintainers followed the instructions for each type of examination, and each instruction called for specific components within the suspension to be examined. However, some components were not fully visible or could not be measured unless the suspension was disassembled by lifting the wagon and taking the wheelset out.
WS.2
The damper pot liner wear plate on the trailing left-hand wheel’s suspension was worn beyond its maintenance limit.
WS.3
While the prescribed maintenance process for REDA 16066 was being completed, this did not lead the maintainers to identify and change worn components within the suspension before this wear increased the likelihood of the suspension locking-up.
WS.4
Wabtec did not include the new checks within the maintenance plans in 2009 or on any of the documentation that is filled in when an examination takes place.
Box 4. Safety constraint violations from DB Schenker.
DB Schenker (DB)
Safety constraints violated or inadequately provided:
DB.1
The effectiveness of grease in reducing the level of friction between the wheel flange and the rail was limited by its position low down on the inside face. It was also the case that at the time of derailment the rails were dry, and therefore the level of friction was higher than it would have been if they were wet.
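The coding scheme used in Boxes 1 to 4 (an organization prefix followed by a progressive number) can be illustrated with a minimal Python sketch. The prefixes and the number of codes per organization are taken from the boxes above; the function names and the dictionary layout are hypothetical choices made only for illustration and are not part of the original analysis.

```python
# Organization prefixes as they appear in Boxes 1-4.
ORGANIZATIONS = {
    "NRI": "Network Rail management Infrastructure",
    "NRM": "Network Rail PWRAMG",
    "WS": "Wabtec maintenance Staff",
    "DB": "DB Schenker",
}

# Number of coded violations listed for each organization in Boxes 1-4.
VIOLATION_COUNTS = {"NRI": 5, "NRM": 4, "WS": 4, "DB": 1}


def violation_codes():
    """Return every code in the form PREFIX.N, e.g. 'NRI.1'."""
    return [
        f"{prefix}.{number}"
        for prefix, count in VIOLATION_COUNTS.items()
        for number in range(1, count + 1)
    ]


def organization_of(code):
    """Trace a violation code back to the organization it belongs to."""
    prefix, _, number = code.partition(".")
    if prefix not in ORGANIZATIONS or not number.isdigit():
        raise ValueError(f"unknown violation code: {code!r}")
    if not 1 <= int(number) <= VIOLATION_COUNTS[prefix]:
        raise ValueError(f"unknown violation code: {code!r}")
    return ORGANIZATIONS[prefix]


if __name__ == "__main__":
    for code in violation_codes():
        print(code, "->", organization_of(code))
```

A lookup of this kind is one possible way to keep the codes traceable when they are reused in later figures and boxes.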
Developing the Safety Control Structure
The previous section described how the STAMP model provides a useful framework for analyzing how inadequate control and constraint violations occur. On this basis, the STAMP safety control structure (SCS) has been developed in two phases: (i) a high-level SCS (cf. Figure 4); (ii) a detailed SCS that allows a finer level of granularity in the controlled process (cf. Figure 5).
The high-level SCS is represented in Figure 4 and comprises the following controllers and controlled processes: (i) the Cabinet Minister of the Department of Transport; (ii) the Rail Accident Investigation Branch (RAIB); (iii) the Office of Rail and Road (ORR); the Network Rail organization, which is divided into two divisions, namely (iv) Network Rail Management Infrastructure and (v) the Private Wagon Registration Agreement Management Group (PWRAMG); (vi) DB Schenker; (vii) Wabtec Rail Limited; and (viii) Lafarge Aggregates Ltd. Serco (represented in gray in Figure 4) is an organization that does not belong to the railway system but worked with Network Rail to carry out an analysis of the performance of the wagons.
A subsequent SCS is proposed by isolating the organizations that were involved in carrying out train operations in the UK railway system at the time of the event and that are most relevant for the study of the event at hand (cf. Figure 5). The role and tasks of Serco are shown in gray to document that these interactions happen only in exceptional cases.
Additionally, Figure 6 presents a detailed diagram of the effects of the safety constraint violations, identified by the codes defined in the previous section. The figure gives clearer evidence of the relationships between the controllers and the controlled processes involved.
Step 4. Analyzing the Control Structure as a whole to document the causes for ineffective controls in the SCS
Once the SCS has been developed and the violated safety constraints have been defined, it is possible to highlight the ineffective control actions linked to the event. Before investigating the ineffectiveness of the controls, an updated SCS was proposed. Figure 7 highlights in red the control actions or feedback that were carried out inadequately, or were missing altogether. The red lines identify a relationship between the violated safety constraints and the organizations involved in the SCS. Figure 7 also presents blue elements that identify additional control actions or feedback that can be recommended as the CAST application proceeds. These lines have been identified by embracing a systems perspective that looks for potential improvements driven by the structure presented in the SCS and linked to the identified failures. While this is not expected to be a full-fledged set of analyses, it should be interpreted as a tentative way forward for a proper socio-technical investigation driven by a CAST application.
A detailed analysis follows of how control actions and feedback loops were ineffective during the event, along with the components involved in these actions. The analysis is provided in dedicated organization-specific boxes (Box 5, Box 6, Box 7, Box 8, Box 9 and Box 10), which use the same colors adopted in the STAMP model for the lacking safety constraints, so that each box can be related to the corresponding component highlighted in the model.
Box 5. Description of how safety constraints from Network Rail PWRAMG have been provided inadequately.
Network Rail PWRAMG
Ineffective Control Actions/Feedback:
  • The control action (Regulations) from Network Rail PWRAMG to the Maintenance Execution management unit. PWRAMG did not mandate the changes in the maintenance and inspections of the Railway structure, and the maintenance checks became mandatory only nineteen months after they were first suggested.
  • The control action (Method of inspection) from Network Rail PWRAMG to PWRAMG inspectors, which provides the guidelines for inspection of the wagons. PWRAMG did not modify the threshold in the measurement of the dynamic track twist: a 3-m track twist more severe than 1 in 200 reduced the load on the leading right-hand wheel, causing the flange to climb onto the rail head. Although trains were permitted to operate over track with this degree of track twist, the derailment is unlikely to have occurred in its absence.
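As a worked illustration of the "1 in 200 over 3 m" figure quoted in Box 5, the sketch below reads a twist of "1 in N" as the ratio between the measurement base and the change in cant across that base. This reading of the convention is an assumption, since the source does not spell out the formula; the function names and the sample cant values are hypothetical and are not measurements from the investigation. Under this reading, 1 in 200 over a 3-m base corresponds to a cant change of 3000 mm / 200 = 15 mm.

```python
def twist_ratio(cant_change_mm, base_length_m=3.0):
    """Return N for a twist expressed as '1 in N' over the given base."""
    if cant_change_mm == 0:
        return float("inf")  # no change in cant means no twist
    return (base_length_m * 1000.0) / abs(cant_change_mm)


def more_severe_than(cant_change_mm, limit_n=200.0, base_length_m=3.0):
    """A twist is more severe than '1 in limit_n' when its N is smaller."""
    return twist_ratio(cant_change_mm, base_length_m) < limit_n


if __name__ == "__main__":
    # Hypothetical cant changes in mm, measured over a 3-m base.
    for cant_change in (10.0, 15.0, 16.0):
        n = twist_ratio(cant_change)
        print(f"{cant_change} mm -> 1 in {n:.1f}, "
              f"more severe than 1 in 200: {more_severe_than(cant_change)}")
```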
Box 6. Description of how safety constraints from Wabtec Rail Limited have been provided inadequately.
Wabtec Rail Limited
Ineffective Control Actions/Feedback:
  • The control action (Planned Preventative Maintenance) from Wabtec Rail Limited to Wabtec Rail Limited maintainers, which imposed the guideline to repair and monitor the components of the wagons. Wabtec did not implement the changes in the suspension of the PHA wagon established in the POCL issued by Network Rail PWRAMG, and had not included the new checks within the maintenance plans of 2009 or on any of the documentation that is filled in when an examination takes place.
  • The control action (Procedure related to the wagons and POCLs) from Wabtec Rail Limited to Lafarge Aggregates Ltd, which imposes the working conditions of the wagon and reports, through the private owner circulation letter (POCL), the changes related to the regulations and tests.
Box 7. Description of how safety constraints from Wabtec Rail Limited maintainers have been provided inadequately.
Wabtec Rail Limited maintainers
Ineffective Control Actions/Feedback:
  • The control action (PPM) from Wabtec Rail Limited maintainers of PHA wagon, which repairs and monitors the components of the wagons. The maintainers followed the instructions for each type of examination; however, some components were not fully visible or could not be measured unless the suspension was disassembled by lifting the wagon and taking the wheelset out. While the prescribed maintenance process for REDA 16066 was being complied with, this did not lead the maintainers to identify and change worn components within the suspension before this wear increased the likelihood of the suspension locking-up.
  • The feedback (PPM report) from Wabtec Rail Limited maintainers of PHA wagon, which describes the changes and maintenance of the wagon during the PPM. There was no record of any work being done to the trailing corner on the left-hand side since February 2010, although REDA 16066 was last lifted in October 2010 during a PPM examination.
Box 8. Description of how safety constraints from Executive and Business Support Team have been provided inadequately.
Executive and Business Support Team
Ineffective Control Actions/Feedback:
  • The control action (POCL) from Executive and Business support team to Lafarge Aggregates Ltd, which communicates to the owners of the trains any change in the policies and the regulations related to the operations of the train or the conditions of the work staff. After the POCL in 2009, the Executive and Business support team did not make any effort to convince the owners to carry out the changes to the wagon suspension that were necessary to avoid the locking up of the wheel.
Box 9. Description of how safety constraints from DB Schenker maintainers have been provided inadequately.
DB Schenker maintainers
Ineffective Control Actions/Feedback:
  • The control action (Maintenance) from DB Schenker maintainers to 6Z31 train that inspects, repairs and maintains the train equipment. The damper pot liner wear plate on the trailing left-hand wheel’s suspension was worn beyond its maintenance limit.
Box 10. Description of how safety constraints from Border Junction maintenance team have been provided inadequately.
Border Junction maintenance team
Ineffective Control Actions/Feedback:
  • The control action (Maintenance) from Border Junction maintenance team to Railway infrastructure that inspects and repairs the railway structure. The team did not repair the twisted track section where the event occurred. On 25 August, the up and down main Bordesley lines and the track over the junction were inspected by Network Rail maintenance staff on foot. No problems were found by the staff that carried out this inspection. It was also the case that at the time of derailment, the rails were dry, and therefore the level of friction was higher than it would have been if they were wet.
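The ineffective control actions and feedback listed in Boxes 5 to 10 can be read as labelled links of the SCS. The following Python sketch records them as a small list of links and groups them by originating element. The class, field and function names are hypothetical. The targets are taken from the boxes where they are stated; for the two Box 7 entries, the target of the PPM control action is our reading of "maintainers of PHA wagon", and the recipient of the PPM report is left unset because the box does not name it.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Link:
    source: str            # element issuing the control action or feedback
    target: Optional[str]  # receiving element; None when the box does not state it
    label: str             # name given in parentheses in the box
    kind: str              # "control action" or "feedback"


# One entry per bullet of Boxes 5-10 (labels copied from the boxes).
INEFFECTIVE_LINKS = [
    Link("Network Rail PWRAMG", "Maintenance Execution management unit",
         "Regulations", "control action"),
    Link("Network Rail PWRAMG", "PWRAMG inspectors",
         "Method of inspection", "control action"),
    Link("Wabtec Rail Limited", "Wabtec Rail Limited maintainers",
         "Planned Preventative Maintenance", "control action"),
    Link("Wabtec Rail Limited", "Lafarge Aggregates Ltd",
         "Procedure related to the wagons and POCLs", "control action"),
    # Target assumed from "maintainers of PHA wagon" in Box 7.
    Link("Wabtec Rail Limited maintainers", "PHA wagon", "PPM", "control action"),
    # Recipient of the report is not stated in Box 7.
    Link("Wabtec Rail Limited maintainers", None, "PPM report", "feedback"),
    Link("Executive and Business support team", "Lafarge Aggregates Ltd",
         "POCL", "control action"),
    Link("DB Schenker maintainers", "6Z31 train", "Maintenance", "control action"),
    Link("Border Junction maintenance team", "Railway infrastructure",
         "Maintenance", "control action"),
]


def links_by_source(links):
    """Group links by the element that issues them."""
    grouped = defaultdict(list)
    for link in links:
        grouped[link.source].append(link)
    return dict(grouped)


if __name__ == "__main__":
    for source, links in links_by_source(INEFFECTIVE_LINKS).items():
        print(source)
        for link in links:
            print(f"  {link.kind}: {link.label} -> {link.target or 'not stated'}")
```

Keeping the links as plain records makes it straightforward to cross-check them against the red elements of Figure 7, although the figure itself remains the reference.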
Step 5. Generating Recommendations and Changes to the SCS to drive system improvements
Since CAST was conceived as a socio-technical accident analysis tool and as a support for organizational learning, one of the most important phases of its application is the generation of safety recommendations to improve existing safety constraints or add new ones. For the purpose of the paper, the recommendations have been tentatively defined by exploring the SCS in terms of the feedback or control actions that were missing or inadequate. While this is not expected to be a full-fledged set of recommendations, it should be interpreted as an example of possible outcomes of the analysis, related to inspections to be prioritized. Rather than being a validated set of off-the-shelf solutions for railway infrastructure, the following recommendations are meant to guide the reader in understanding the complete applicability of CAST.
Accordingly, in line with the previous findings, these recommendations are presented in organization-specific boxes (Box 11, Box 12 and Box 13), which use the same colors adopted in the STAMP model to highlight the recommended control actions and feedback for improving the safety management system under analysis.
Box 11. Detailed description of the new recommendation for Network Rail PWRAMG.
Network Rail PWRAMG
Improvements and recommendations:
  • The track twist measurement process should become a periodic inspection. Rather than relying on ad hoc consultations, a formal periodic process should be required to ensure that track twists are within standards. Even though this measurement process can be outsourced to an external company (such as Serco, for the event under investigation), there are no restrictions on doing it in-house. The monitoring should take into consideration the fact that the cant distance varies depending on the load. As such, it could be possible to start using sensors that automate the monitoring process on sample parts of the lines.
Box 12. Detailed description of the new recommendations for DB Schenker maintainers.
DB Schenker maintainers
Improvements and recommendations:
  • A formal inspection of the suspension system should be carried out. This recommendation starts from the missed correspondence of components for the train 6Z31 involved in the event, as maintained by DB Schenker. This should be a regulated action under the responsibility of the train company.
  • The inspection of the suspension should be part of a formalized information exchange process. The results of inspections as carried out by the train companies should be validated by the Network Rail PWRAMG.
Box 13. Detailed description of the new recommendation for Network Rail management Infrastructure.
Network Rail management Infrastructure
Improvements and recommendations:
  • Train operations should be shared between the train operating company and Network Rail to ensure the smooth and effective management of the maintenance plan.
    This recommendation emerges from the need to properly schedule time-based maintenance interventions for the train owned by a company and for Network Rail to properly establish the timescale of each intervention.

5. Discussion

This study did not analyze the system development process, since the incident was mainly caused by inadequate control actions during system operation. Furthermore, to determine the conditions under which control failures occurred, the CAST analysis identified the contexts in which decisions were made, together with the inadequate control actions, inaccurate feedback, and mental model flaws of the controllers in the SCS that was built.
The CAST analysis yielded five recommendations in addition to those of the original investigation report, raising the total number of recommendations from four to nine. In general, the original recommendations from the official report were restricted to contractors’ procedure deviations. The goal of an accident report should not be to apportion blame or to describe causes linked only to human error, but to understand what led to the error and how it can be avoided in the future. More important than the numbers, the CAST analysis identified systemic causal factors that allow the prevention not only of similar events in the future, but also of broader types of accidents that stem from unsafe control actions at higher levels of the safety control structure.
In addition, through CAST, it is possible to provide recommendations that go beyond those of official investigations, especially on systemic factors. There is a need to rethink the way in which organizational agencies conduct accident analysis or procedures. The study raises the issue that, in its analysis of this accident, the Rail Accident Investigation Branch generally refrained from reviewing its own control actions contributing to the hazard [42]. Using novel approaches such as CAST, accident analyses carried out by organizational and governmental agencies can reduce the subjectivity in selecting the chaining conditions and provide general recommendations to be absorbed by different levels of the SCS [43].
An incident can propagate further, generating other events with smaller or even larger consequences. The CAST method is not very effective for studying such propagation: as a causal analysis method, it inherits a strongly reactive, ex post perspective. Its application is thus static, in the sense that it instantiates a previous event and identifies measures related to it, along with potential recommendations that are, however, rooted in the context being investigated. Nevertheless, this specific incident showed that even minor events, if studied systematically and coherently, can offer large learning opportunities. Similar findings have been reported in other sectors, where the CAST technique provided safety recommendations in space operations [26], coal mine settings [34], and underground pipeline accidents [15].

6. Conclusions

The present study provides an exemplary application of a system-theoretic approach to a specific railway incident. The CAST method, based on systems theory, proved effective for understanding accident causation mechanisms in a way that suggests improvement measures to prevent future, even major, events [44,45]. CAST offered a systematic and comprehensive perspective of the incident and allowed the discovery of problems and defects among the different organizational levels. CAST can use specific events for accident causal analysis and improve the system’s ability to find and correct problems [46].
Moreover, the proposed analysis of the railway incident demonstrated that the CAST technique can represent a powerful alternative to traditional methods. Whilst the propositions of the official accident report were limited to changes in the process system and documentary aspects, additional recommendations, especially regarding systemic factors and design flaws, were obtained from CAST for each component that played a role in the accident causation. These recommendations may prevent not only similar events, but also broader types of accidents whose causal mechanisms range from managerial and organizational factors and a lack of governmental controls to more operational layers.
Future research could be carried out to complement or improve the current CAST analyses. The implementation of simulations [47,48] or system dynamics models [49] could be a promising way forward in this sense to ensure a dynamic integration of data [50]. System dynamics can be used to explore the causal relationships following the logic presented in the SCS, extended with ad hoc modelling. In addition, it is possible to complement the analysis with human reliability models to better represent the way human controllers behave and are influenced by work or contextual pressures [22,23]. Furthermore, quantitative analysis based on Bayesian modelling may complement the reactive dimension of CAST with proactive estimations of the events’ likelihood, even in the case of intertwined sets of contributing factors [51].

Author Contributions

Conceptualization, R.P. and A.J.N.A.; Validation, R.P. and L.F.; Formal analysis, A.J.N.A. and R.P.; Writing—original draft preparation, A.J.N.A.; Writing—review and editing, R.P. and G.D.G.; Supervision, R.P. and G.D.G.; Funding acquisition, G.D.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Underwood, P.; Waterson, P. Systems thinking, the Swiss Cheese Model and accident analysis: A comparative systemic analysis of the Grayrigg train derailment using the ATSB, AcciMap and STAMP models. Accid. Anal. Prev. 2014, 68, 75–94.
  2. Bugalia, N.; Maemura, Y.; Ozawa, K. Organizational and institutional factors affecting high-speed rail safety in Japan. Saf. Sci. 2020, 128, 104762.
  3. Paltrinieri, N.; Bonvicini, S.; Spadoni, G.; Cozzani, V. Cost-Benefit Analysis of Passive Fire Protections in Road LPG Transportation. Risk Anal. 2012, 32, 200–219.
  4. Patriarca, R.; Falegnami, A.; Costantino, F.; Di Gravio, G.; De Nicola, A.; Villani, M.L. WAx: An integrated conceptual framework for the analysis of cyber-socio-technical systems. Saf. Sci. 2021, 136, 105142.
  5. Righi, A.W.; Saurin, T.A. Complex socio-technical systems: Characterization and management guidelines. Appl. Ergon. 2015, 50, 19–30.
  6. Leveson, N. Engineering a Safer World: Systems Thinking Applied to Safety; The MIT Press: Cambridge, MA, USA, 2011; Volume 1.
  7. Ibrion, M.; Paltrinieri, N.; Nejad, A.R. Learning from failures: Accidents of marine structures on Norwegian continental shelf over 40 years time period. Eng. Fail. Anal. 2020, 111, 104487.
  8. De Carvalho, P.V.R. The use of Functional Resonance Analysis Method (FRAM) in a mid-air collision to understand some characteristics of the air traffic management system resilience. Reliab. Eng. Syst. Saf. 2011, 96, 1482–1498.
  9. Leveson, N. A new accident model for engineering safer systems. Saf. Sci. 2004, 42, 237–270.
  10. Rasmussen, J. Risk management in a dynamic society: A modeling problem. Saf. Sci. 1997, 27, 183–213.
  11. Rasmussen, J.; Svedung, I. Proactive Risk Management in a Dynamic Society; Swedish Rescue Services Agency: Karlstad, Sweden, 2000; Volume 1, ISBN 9172530847.
  12. Li, W.; Zhang, L.; Liang, W. An Accident Causation Analysis and Taxonomy (ACAT) model of complex industrial system from both system safety and control theory perspectives. Saf. Sci. 2017, 92, 94–103.
  13. Leveson, N. CAST handbook: How to Learn More from Incidents and Accidents. 2018, Volume 1, p. 188. Available online: http://sunnyday.mit.edu/CAST-Handbook.pdf (accessed on 20 February 2022).
  14. Leveson, N.; Daouk, M.; Dulac, N.; Marais, K. Applying STAMP in Accident Analysis. 2003, pp. 1–26. Available online: http://hdl.handle.net/1721.1/102905 (accessed on 25 February 2022).
  15. Li, F.; Wang, W.; Xu, J.; Dubljevic, S.; Khan, F.; Yi, J. A CAST-based causal analysis of the catastrophic underground pipeline gas explosion in Taiwan. Eng. Fail. Anal. 2020, 108, 104343.
  16. Altabbakh, H.; AlKazimi, M.A.; Murray, S.; Grantham, K. STAMP—Holistic system safety approach or just another risk model? J. Loss Prev. Process Ind. 2014, 32, 109–119.
  17. Patriarca, R.; Chatzimichailidou, M.; Karanikas, N.; Di Gravio, G. The past and present of System-Theoretic Accident Model And Processes (STAMP) and its associated techniques: A scoping review. Saf. Sci. 2022, 146, 105566.
  18. Li, F.; Wang, W.; Dubljevic, S.; Khan, F.; Xu, J.; Yi, J. Analysis on accident-causing factors of urban buried gas pipeline network by combining DEMATEL, ISM and BN methods. J. Loss Prev. Process Ind. 2019, 61, 49–57.
  19. Xu, D.; Wang, Y.; Meng, Y.; Zhang, Z. An Improved Data Anomaly Detection Method Based on Isolation Forest. In Proceedings of the 2017 10th International Symposium on Computational Intelligence and Design (ISCID), Hangzhou, China, 9–10 December 2017; pp. 287–291.
  20. Lu, L.; Liang, W.; Zhang, L.; Zhang, H.; Lu, Z.; Shan, J. A comprehensive risk evaluation method for natural gas pipelines by combining a risk matrix with a bow-tie model. J. Nat. Gas Sci. Eng. 2015, 25, 124–133.
  21. Shaban, A.; Abdelwahed, A.; Di Gravio, G.; Afefy, I.H.; Patriarca, R. A systems-theoretic hazard analysis for safety-critical medical gas pipeline and oxygen supply systems. J. Loss Prev. Process Ind. 2022, 77, 104782.
  22. Patriarca, R.; Ramos, M.; Paltrinieri, N.; Massaiu, S.; Costantino, F.; Di Gravio, G.; Boring, R.L. Human reliability analysis: Exploring the intellectual structure of a research field. Reliab. Eng. Syst. Saf. 2020, 203, 107102.
  23. Rong, H.; Tian, J. STAMP-based HRA considering causality within a sociotechnical system: A case of minuteman III missile accident. Hum. Factors 2015, 57, 375–396.
  24. Reason, J. Human Error; Cambridge University Press: Cambridge, MA, USA, 1990; ISBN 9780521306690.
  25. Lower, M.; Magott, J.; Skorupski, J. A System-Theoretic Accident Model and Process with Human Factors Analysis and Classification System taxonomy. Saf. Sci. 2018, 110, 393–410.
  26. Robertson, J.; Kothakonda, A. CAST Analysis of the International Space Station EVA 23 Suit Water Intrusion Mishap. In Proceedings of the International Astronautical Congress, Bremen, Germany, 1–5 October 2018. IAC-18-45324.
  27. Virdin, J.; Vegh, T.; Jouffray, J.-B.; Blasiak, R.; Mason, S.; Österblom, H.; Vermeer, D.; Wachtmeister, H.; Werner, N. The Ocean 100: Transnational Corporations in the Ocean Economy. Sci. Adv. 2021, 7, 8041.
  28. Ibrion, M.; Paltrinieri, N.; Nejad, A.R. Learning from failures in cruise ship industry: The blackout of Viking Sky in Hustadvika, Norway. Eng. Fail. Anal. 2021, 125, 105355.
  29. Zhang, J.; Kim, H.; Liu, Y.; Lundteigen, M.A. Combining system-theoretic process analysis and availability assessment: A subsea case study. Proc. Inst. Mech. Eng. Part O J. Risk Reliab. 2019, 233, 520–536.
  30. Stringfellow, M.V.; Dierks, M.M.; Pysessor, A. Accident Analysis and Hazard Analysis for Human and Organizational Factors; Massachusetts Institute of Technology: Cambridge, MA, USA, 2004.
  31. Mogles, N.; Padget, J.; Bosse, T. Systemic approaches to incident analysis in aviation: Comparison of STAMP, agent-based modelling and institutions. Saf. Sci. 2018, 108, 59–71.
  32. Leveson, N.; Samost, A.; Dekker, S.; Finkelstein, S.; Raman, J. A Systems Approach to Analyzing and Preventing Hospital Adverse Events. J. Patient Saf. 2016, 16, 162–167.
  33. Kim, T.; Nazir, S.; Øvergård, K.I. A STAMP-based causal analysis of the Korean Sewol ferry accident. Saf. Sci. 2016, 83, 93–101.
  34. Düzgün, H.S.; Leveson, N. Analysis of soma mine disaster using causal analysis based on systems theory (CAST). Saf. Sci. 2018, 110, 37–57.
  35. Ibrion, M.; Paltrinieri, N.; Nejad, A.R. Learning from non-failure of Onagawa nuclear power station: An accident investigation over its life cycle. Results Eng. 2020, 8, 100185.
  36. Hulme, A.; Mclean, S.; Dallat, C.; Walker, G.H.; Waterson, P.; Stanton, N.A.; Salmon, P.M. Systems thinking-based risk assessment methods applied to sports performance: A comparison of STPA, EAST-BL, and Net-HARMS in the context of elite women’s road cycling. Appl. Ergon. 2021, 91, 103297.
  37. Sultana, S.; Andersen, B.S.; Haugen, S. Identifying safety indicators for safety performance measurement using a system engineering approach. Process Saf. Environ. Prot. 2019, 128, 107–120.
  38. Pereira, D.P.; Hirata, C.; Nadjm-Tehrani, S. A STAMP-based ontology approach to support safety and security analyses. J. Inf. Secur. Appl. 2019, 47, 302–319.
  39. Salmon, P.M.; Cornelissen, M.; Trotter, M.J. Systems-based accident analysis methods: A comparison of Accimap, HFACS, and STAMP. Saf. Sci. 2012, 50, 1158–1170.
  40. Ando, T.; Wang, B.; Hisazumi, K.; Kong, W.; Fukuda, A.; Michiura, Y.; Sakemi, K.; Matsumoto, M. Verification model translation method toward behavior model for CAST. In Proceedings of the 2018 5th International Conference on Dependable Systems and Their Applications, DSA 2018, Dalian, China, 22–23 September 2018; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2018; pp. 142–147.
  41. RAIB. Rail Accident Report, Derailment at Bordesley Junction, Birmingham 26 August 2011; Rail Accident Investigation Branch: Derby, UK, 2012.
  42. Hamim, O.F.; Hasanat-E-Rabbi, S.; Debnath, M.; Hoque, M.S.; McIlroy, R.C.; Plant, K.L.; Stanton, N.A. Taking a mixed-methods approach to collision investigation: AcciMap, STAMP-CAST and PCM. Appl. Ergon. 2022, 100, 103650.
  43. Landi, R.G.; Montedo, U.B.; Lahoz, C.H.N. Using systems theory for additional risk detection in boiler explosions in Brazil. Saf. Sci. 2022, 152, 105761.
  44. Ouyang, M.; Hong, L.; Yu, M.H.; Fei, Q. STAMP-based analysis on the railway accident and accident spreading: Taking the China-Jiaoji railway accident for example. Saf. Sci. 2010, 48, 544–555.
  45. Song, T.; Zhong, D.; Zhong, H. A STAMP Analysis on the China-Yongwen Railway Accident; Springer: Berlin/Heidelberg, Germany, 2012; Volume 7612.
  46. Zhang, J.H.; Wu, B. A STAMP-based Causal Analysis of the Beiyou25 Grounding Accident. In Proceedings of the Prognostics and System Health Management Conference, Qingdao, China, 25–27 October 2019.
  47. Simone, F.; Patriarca, R. A simulation-driven cyber resilience assessment for water treatment plants. In Proceedings of the 32nd European Safety and Reliability Conference, Dublin, Ireland, 28 August–1 September 2022.
  48. Patriarca, R.; Simone, F.; Di Gravio, G. Modelling cyber resilience in a water treatment and distribution system. Reliab. Eng. Syst. Saf. 2022, 226, 108653.
  49. Leerapan, B.; Teekasap, P.; Urwannachotima, N.; Jaichuen, W.; Chiangchaisakulthai, K.; Udomaksorn, K.; Meeyai, A.; Noree, T.; Sawaengdee, K. System dynamics modelling of health workforce planning to address future challenges of Thailand’s Universal Health Coverage. Hum. Resour. Health 2021, 19, 31.
  50. Landucci, G.; Paltrinieri, N. A methodology for frequency tailorization dedicated to the Oil & Gas sector. Process Saf. Environ. Prot. 2016, 104, 123–141.
  51. Zarei, E.; Khan, F.; Yazdi, M. A dynamic risk model to analyze hydrogen infrastructure. Int. J. Hydrogen Energy 2020, 46, 4626–4643.
Figure 1. Methodology to perform a CAST technique.
Figure 2. Location of the incident.
Figure 3. Details of the incident site and infrastructures.
Figure 4. High level SCS for the UK railway system.
Figure 5. Detailed SCS model of the event under examination.
Figure 6. Focus on main unsafe constraints and system components involved.
Figure 7. Updated SCS: in red the control actions or feedback performed inadequately or missing; in blue additional recommended control actions or feedback.
Table 1. Roles and responsibilities of the organizations involved in the incident.
| Organization | Roles and Responsibilities |
| --- | --- |
| Cabinet Minister (1) | Sets the guidelines and creates legislation that gives general rules for managing safety and risk in the railway system |
| Department of Transport | Works with other agencies and partners to support the transport network, providing policy, guidance, and funding to local authorities |
| Rail Accident Investigation Branch (RAIB) (2) | Works with other agencies to investigate accidents in order to improve railway safety and inform the industry and the public |
| Office of Rail and Road (ORR) (3) | Regulates the rail industry’s health and safety performance, holds Network Rail and other rail infrastructure networks to account, and makes sure that the rail industry is competitive and fair |
| Network Rail Management Infrastructure | Manages and regulates the processes that guarantee the operation of the railway systems |
| Private Wagon Registration Agreement Management Group (PWRAMG) (4) | Manages the guidelines and oversees the staff associated with the system |
| Wabtec Rail Limited (5) | Maintains the trains and their components |
| DB Schenker | Operates the trains and contracts the train staff |
| Lafarge Aggregates Ltd. | Owns the trains that operate on the United Kingdom (UK) railway system |
Additional notes on the organizations’ roles and responsibilities:
(1) The maintenance entity for the vehicles operated in the Network Railway system must be assigned in accordance with European Railway Safety Directive 2004/49/EC (as amended by European Railway Safety Directive 2008/110/EC).
(2) The Railways and Transport Safety Act 2003 and its associated implementing regulations, the Railways (Accident Investigation and Reporting) Regulations 2005, give the RAIB authority to make recommendations. Under the Act and Regulations, the RAIB can direct recommendations to any organization or person that is considered best placed to implement the required changes.
(3) Private wagon owners must have a Private Wagon Registration Agreement (PWRA), the registration that allows them to operate freight trains subject to separate arrangements. This undertaking requires certification under the Railways and Other Guided Transport Systems (Safety) Regulations 2006 (ROGS), which provide the regulatory regime for rail safety and identify the legal responsibilities and duties of private wagon owners and Network Rail.
(4) The PWRAMG audits the owners and maintainers of private wagons to confirm that they have procedures in place to manage and maintain their wagons, and that management and maintenance are controlled in accordance with the PWRA. It is also required to carry out a bi-annual asset-condition check audit, which examines the condition of a selection of wagons within a fleet against their approved maintenance plans. Moreover, Private Owner Circulation Letters (POCLs) may be issued to private wagon owners, stating the actions that must be taken in response to any changes to procedures.
(5) The wagons undergo an annual vehicle inspection and brake test (VIBT), and two planned preventative maintenance (PPM) examinations every four months; the maintenance plan is reviewed by the PWRAMG and certified by the Rail Safety and Standards Board (RSSB).
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Nakhal Akel, A.J.; Di Gravio, G.; Fedele, L.; Patriarca, R. Learning from Incidents in Socio-Technical Systems: A Systems-Theoretic Analysis in the Railway Sector. Infrastructures 2022, 7, 90. https://doi.org/10.3390/infrastructures7070090
