2. Methodology
The Success Likelihood Index Method (SLIM) is one of the most commonly used methods for human reliability analysis in the maritime industry, where data are often insufficient and resources limited. Furthermore, the SLIM is commonly used for processes whose task steps form a hierarchical structure [18,19]. The SLIM was designed by Embrey et al. [20,21]. It belongs to the family of human reliability analysis (HRA) methods and is a decision-analytic approach in which experts estimate the probability of human error occurring while certain tasks are carried out. Because it is based on expert opinion, it is appropriate for problems in which data on the performance of the task under study are scarce [20,22,23,24]. Since very little data, i.e., reports on analyses of lifeboat accidents, are publicly available (perhaps due to the failure to report all maritime accidents), the SLIM is suitable for identifying the factors that influence the success of lifeboat drills and for quantifying the probability of human error during their execution. In addition, as stated in [23], it has advantages when it is used for “operational, maintenance and incident analysis for specific tasks or scenarios”. With this in mind, the SLIM fits the scope of this study well.
First, the experts need to identify the task(s) to be evaluated. The method assumes that each task is affected by Performance-Shaping Factors (PSFs), whose weights and ratings define how each of them contributes to the Success Likelihood Index (SLI). PSFs are the factors that, according to expert elicitation, have the most significant effect on human performance during the execution of a specific task and on its success (in this case, a lifeboat drill and all operations related to the drill). The weights (Wi) represent the relative importance of the PSFs to the overall task (values from 0 to 100), while the ratings (Ri) represent the effect of each PSF on the performance of each task (values from 0 to 100, where 0 is the most negative effect on task performance, 100 is the most positive effect, and 50 is a neutral effect) [20]. The SLI is calculated using Equation (1) [20,21,22,23,24]:
$$\mathrm{SLI} = \sum_{i=1}^{n} W_i R_i \qquad (1)$$
where n is the number of PSFs and the weights Wi are normalised before use.
Then, the human error probability (HEP) is quantified from the obtained SLI by first calculating the Probability of Success (PoS) (Equation (2)):
$$\log(\mathrm{PoS}) = a \cdot \mathrm{SLI} + b \qquad (2)$$
The PoS is then obtained by taking the antilogarithm, and the HEP follows by subtracting the PoS from 1. In Equation (2),
a and
b are constants that could be quantified by the given statistical data for the tasks performed (if available, which is not the case for the lifeboat drill accidents) or absolute probability judgments made by experts participating in the study [
22]. In the latter case, the experts estimate the PoS of a specific task in the possible best and worst situations (depending on the PSFs).
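The two calculation steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Equation (2) is taken to have the common SLIM form log10(PoS) = a·SLI + b with a base-10 logarithm, the weights are normalised to sum to 1, and all numerical inputs (weights, ratings, a, b) are hypothetical rather than values from this study.

```python
def success_likelihood_index(weights, ratings):
    """Equation (1): weighted sum of PSF ratings.

    The weights are normalised to sum to 1 here (an assumption), so the
    SLI stays on the same 0-100 scale as the ratings.
    """
    if len(weights) != len(ratings):
        raise ValueError("one rating is needed per weight")
    total = sum(weights)
    return sum(w / total * r for w, r in zip(weights, ratings))


def human_error_probability(sli, a, b):
    """Assumed form of Equation (2): log10(PoS) = a * SLI + b, then HEP = 1 - PoS."""
    pos = 10 ** (a * sli + b)  # antilogarithm gives the probability of success
    return 1.0 - pos


# Hypothetical inputs for five PSFs (not taken from the paper's tables).
weights = [30, 25, 20, 15, 10]
ratings = [60, 70, 50, 80, 40]
a, b = 0.0004, -0.05  # hypothetical calibration constants

sli = success_likelihood_index(weights, ratings)
hep = human_error_probability(sli, a, b)
print(f"SLI = {sli:.1f}, HEP = {hep:.4f}")
```

Note the order of operations: the antilogarithm is taken first to recover the PoS, and only then is the PoS subtracted from 1.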
However, the SLIM has certain drawbacks. For example, the total HEP of the operation considered is estimated by aggregating the human error probabilities of all identified tasks, which ignores the dependencies among HEPs that arise from common PSFs [25]. To mitigate these shortcomings, the Bayesian Network–Success Likelihood Index Method (BN-SLIM) [25,26,27] was applied to the part of the lifeboat drill that the SLIM showed to have the highest probability of human error. A Bayesian network (BN) is a probabilistic directed acyclic graphical model for reasoning under uncertainty. It is composed of nodes representing variables and edges representing causal relationships between the variables. These relationships are quantified by conditional probability tables (CPTs), which makes BNs a powerful reasoning tool. A CPT quantifies the conditional probability of each state of a child node given every possible combination of the states of its parent nodes. Unlike the child nodes, the root nodes (those without parents) are assigned marginal probabilities. The probabilistic relationships in a BN can be proposed by experts or updated using Bayes' theorem as new data are collected [27,28,29].
The BN-SLIM takes into account the dependencies among the SLIs of the task due to common PSFs and prevents the overestimation or underestimation of the total HEP [
27]. In addition, it enables the diagnostic analysis and identification of the PSFs that contribute the most to HEP. It can be used as a proactive measure to prevent lifeboat drill accidents and reduce the likelihood of operator error [
25]. According to the original SLIM, the effect of PSFs on HEP is modelled through the Success Likelihood Index (Equation (1)), which has already been estimated by the experts (in the SLIM). Therefore, two functions are needed to estimate the HEP value: (a) one to model the relationship between the PSFs and the SLIs, and (b) another to calculate the HEPs from the SLIs. Each PSF within the BN has several states representing its ratings. Since, according to Equation (1), PSF ratings and weights determine SLI values, the edges are drawn from PSF nodes to SLI nodes. The number of states of each SLI node is equal to the number of possible combinations of states of the PSF rating nodes. For the sake of simplicity, only two scores, “R1 = 1” and “R100 = 100”, are considered as the states of the PSF nodes of the Bayesian network. In other words, only the worst and best possible influences of the PSFs on the selected tasks were considered. In addition, the developed BN was verified by sensitivity analysis, a technique for verifying the parameters of a BN [
30]. Highly sensitive causal nodes will have a significant impact on the targeted node, and acting on those nodes will have the best effect on the target node (in this case, HEP) [
31]. Furthermore, the BN-SLIM model was validated using data from real-life accident investigation reports where causal factors were identified, and the BN-SLIM model was updated with real data.
The sequence of the methodology utilised in this paper is presented in
Figure 1.
The study involved eight experts, all master mariners (sea captains) with at least ten years of seagoing experience. They have been involved in the inspections and maintenance of lifeboats and have conducted lifeboat drills onboard seagoing ships.
3. Results
The experts agreed with the tasks performed during the lifeboat drill identified in [
32] (
Table 1). The tasks were arranged into five groups: before lowering the lifeboat, prior to lifeboat crew boarding, launching with crew onboard, recovery of the lifeboat, and post recovery (T1–T5), representing the phases of the lifeboat drill. The identified tasks are used to assist in the estimation of the HEP, which is needed to quantify constants
a and
b (Equation (2)).
Although the IMO issued revised guidelines for developing operation and maintenance manuals for lifeboat systems where procedures for finalising lifeboat drill tasks together with instructions for inspection and maintenance are given [
33], accidents during lifeboat drills still occur. After identifying the tasks involved, the experts defined PSFs. Each PSF was assigned a weight according to its significance for the safe performance of the lifeboat drill. Each expert weighted all PSFs, and then the average values were taken and normalised (
Table 2).
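The averaging and normalisation of the expert weights can be sketched as follows. The weights below are hypothetical (three experts instead of eight), and normalising to a sum of 1 is an assumption; the paper does not state which normalisation constant was used.

```python
from statistics import mean


def normalised_mean_weights(expert_weights):
    """Average each PSF weight over the experts, then normalise to sum to 1.

    expert_weights: one list per expert, holding one weight (0-100) per PSF.
    """
    n_psf = len(expert_weights[0])
    means = [mean(expert[i] for expert in expert_weights) for i in range(n_psf)]
    total = sum(means)
    return [m / total for m in means]


# Hypothetical weights from three experts for PSF1..PSF5.
expert_weights = [
    [90, 100, 80, 60, 70],
    [100, 90, 70, 50, 60],
    [80, 100, 90, 70, 60],
]

w = normalised_mean_weights(expert_weights)
for i, value in enumerate(w, start=1):
    print(f"PSF{i}: {value:.3f}")
```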
The highest weights were given to PSF 2, PSF 1, and PSF 3 (Table 2). The identified PSFs are recognised as the most common causes of lifeboat-related accidents, according to [7,34]. In addition, the largest weights were assigned to the Design and condition of equipment and Crew Competence, which are, according to [35], the most common causes of lifeboat drill accidents worldwide. Hence, it can be concluded that the experts' judgement agrees with these independently reported causes.
After weighting, the experts rated the PSFs, considering each task separately. The arithmetic means of the ratings were multiplied by the normalised weights to obtain the SLI according to Equation (1) (Table 3).
After obtaining the SLI values for each task, it was necessary to determine the HEP. Hence, the experts estimated the HEP for each group of tasks (T1–T5) for the worst- and best-case scenarios (in the best-case scenario, all PSFs are as good as possible, and the opposite is true for the worst case). Equation (2) was used for obtaining constants
a and
b, where an SLI value of 0 was taken for the worst case, while an SLI value of 100 was taken for the best case. The obtained values of the
a and
b constants are shown in
Table 4.
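The calibration step can be illustrated with a short sketch. It assumes that Equation (2) has the form log10(PoS) = a·SLI + b (base-10 logarithm) and solves the two resulting linear equations for the worst case (SLI = 0) and the best case (SLI = 100). The HEP estimates used here are hypothetical, not the values elicited in this study.

```python
import math


def calibrate_constants(hep_worst, hep_best, sli_worst=0.0, sli_best=100.0):
    """Solve log10(PoS) = a * SLI + b from two anchor points.

    hep_worst / hep_best: expert HEP estimates for the worst and best case.
    """
    log_pos_worst = math.log10(1.0 - hep_worst)
    log_pos_best = math.log10(1.0 - hep_best)
    a = (log_pos_best - log_pos_worst) / (sli_best - sli_worst)
    b = log_pos_worst - a * sli_worst
    return a, b


# Hypothetical anchors: 10% error in the worst case, 0.1% in the best case.
a, b = calibrate_constants(hep_worst=0.10, hep_best=0.001)
print(f"a = {a:.6f}, b = {b:.6f}")
```

With SLI = 0 as the worst-case anchor, b is simply the logarithm of the worst-case PoS, and a is the slope between the two anchors.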
After calculating the constants
a and
b, the HEP was obtained using Equation (2) (the obtained SLI values from
Table 3 were used together with constants a and b from
Table 4). The results are presented in
Table 5.
According to the experts’ judgement, tasks T4.5—Resuming recovery to the embarkation deck level, testing the manual operation of the limit switches during recovery, and disembarking the lifeboat crew; T4.2—Confirming crew readiness for recovery; and T4.1—Aligning the lifeboat with the forward and aft davit hooks, retrieving the painter, and engaging the hooks, were identified as the ones with the highest probability of human error during the execution, with 4.1%, 4.1%, and 3.4%, respectively (
Figure 2).
In addition, an inter-judge consistency analysis was performed to check the consistency of the experts' judgements. For that purpose, a two-way ANOVA test, which compares the means of two or more groups, was utilised. The dependent variable was log(HEP), and the factors were experts and tasks. IBM SPSS version 29.0.1.0 (171) was used for the analysis. According to the analysis, the individual log(HEP) values were significantly different between tasks (p < 0.001) but not between experts (p = 0.792), which is consistent with the study’s purpose (Table 6).
Furthermore, the intraclass correlation coefficient (r), which expresses the mean correlation between the experts’ judgements, was calculated utilising Equation (3) [20,21]:
where F is the F value from
Table 6 (F = 470.339), and n is the number of experts (n = 8). When these values are inserted into Equation (3), the result obtained is r = 0.979, meaning that there is a high level of agreement between the experts’ judgements (a result closer to 1 means a higher level of agreement).
For sensitivity analysis, two-way ANOVA was performed using PSF weights as the dependent variable and PSF categories and tasks as the factors. The results showed that the assigned weights were significantly different between PSF categories (p < 0.05) and significantly different between tasks (p < 0.05) (Table 7).
In the rating analysis, PSF ratings were the dependent variable, while PSF categories and tasks were the factors. According to the results, the ratings differ significantly between PSF categories and between tasks (p < 0.05) (Table 8).
The mean of PSF ratings can be considered a measure of the overall quality of the lifeboat drill performance, considering the evaluated tasks during the drill performance.
After obtaining results using the SLIM, Task 4—Recovery of the lifeboat was recognised as the one with the highest probability of human error. In order to reduce the shortcomings of the SLIM, the BN was utilised to estimate the probability of human error with respect to the selected tasks and the Performance-Shaping Factors according to the experts’ opinions. The BN was developed with five PSF nodes as root nodes, five SLI nodes, five HEP nodes, and overall HEP (
Figure 3). The BN was modelled using the GeNIe Modeler tool [
36]. The normalised weights from
Table 2 were taken to calculate SLI values according to Equation (1) (as stated in the
Section 2, values of 1 and 100 were taken as ratings). Therefore, 32 possible states (2^5) were calculated for each SLI according to Equation (4).
Each SLI value can be represented as a possible state of the SLI node. Furthermore, according to Equation (2), the SLI node should be the only parent of the HEP node. This node has two states, i.e., human error occurs (HEP) and no human error (PoS). When completing the Bayesian network structure, conditional probability tables (CPTs) should be assigned to the SLI and HEP nodes to quantify the effects of the PSF node. In order to determine the state values of the SLI factors (SLI T4.1–SLI T4.5), CPTs were used (
Figure A1 in
Appendix A).
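The 32 SLI states can be enumerated directly, as in the sketch below. It assumes that each SLI state is the weighted sum of Equation (1) evaluated for one combination of the two PSF rating states (1 and 100), and that the SLI node's CPT is deterministic, i.e., each combination of parent states selects exactly one SLI state with probability 1. The weights are hypothetical.

```python
from itertools import product

RATING_STATES = (1, 100)  # worst and best rating, as used for the PSF nodes


def sli_states(weights):
    """Map every combination of PSF rating states to its SLI value."""
    total = sum(weights)
    return {
        combo: sum(w / total * r for w, r in zip(weights, combo))
        for combo in product(RATING_STATES, repeat=len(weights))
    }


# Hypothetical normalised weights for PSF1..PSF5.
weights = [0.25, 0.30, 0.20, 0.15, 0.10]
table = sli_states(weights)

print(len(table))  # number of SLI states for five two-state PSFs
for combo, sli in list(table.items())[:3]:
    print(combo, round(sli, 2))
```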
After filling in CPTs for the SLI T4.1–SLI T4.5 nodes, it was necessary to fill in the CPTs for the HEP T4.1–HEP T4.5 nodes. In order to achieve this, Equation (2) was used, whereby it was necessary to estimate the probability of human error in the best and worst case for each identified task during lifeboat recovery and, based on that, to calculate the individual values of the constants
a and
b for each individual HEP, that is, for each task identified. The estimated values of the HEP and the calculated constants
a and
b for each individual task are presented in
Table 9, where the experts estimated the values of the HEP for the worst- and best-case scenarios (the probabilities were estimated by consensus during a meeting with experts).
Then, utilising Equation (2), the values of all states of the nodes HEP T4.1–HEP T4.5 were calculated. After quantifying the values of all nodes, the quantitative BN-SLIM model was obtained and is presented in
Figure A2. As shown in
Figure A2, the overall task HEP is estimated to be 4%, given the PSF probabilities as estimated by the experts. Also, there is a difference between the SLIM and BN-SLIM results (
Table 10).
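For a single task, the chain PSF nodes → SLI node → HEP node can be evaluated by brute-force enumeration, which is a simple stand-in for what a BN tool computes. The sketch below is not the GeNIe model of this study: the weights, the PSF state probabilities, and the constants a and b are hypothetical, Equation (2) is assumed to be log10(PoS) = a·SLI + b, and the aggregation into the overall HEP node is not modelled.

```python
import math
from itertools import product


def task_hep(weights, p_best, a, b):
    """Marginal HEP of one task, summed over all PSF state combinations.

    weights: normalised PSF weights (summing to 1).
    p_best[i]: probability that PSF i is in its best state (rating 100);
               the worst state (rating 1) has probability 1 - p_best[i].
    """
    hep = 0.0
    for combo in product((1, 100), repeat=len(weights)):
        # Root nodes are independent, so the joint probability is a product.
        p_combo = math.prod(
            p if r == 100 else 1.0 - p for p, r in zip(p_best, combo)
        )
        sli = sum(w * r for w, r in zip(weights, combo))
        hep += p_combo * (1.0 - 10 ** (a * sli + b))
    return hep


# Hypothetical inputs.
weights = [0.25, 0.30, 0.20, 0.15, 0.10]
p_best = [0.7, 0.6, 0.8, 0.75, 0.65]
b = math.log10(1.0 - 0.10)                  # worst-case HEP of 10%
a = (math.log10(1.0 - 0.001) - b) / 100.0   # best-case HEP of 0.1%

hep_task = task_hep(weights, p_best, a, b)
print(f"HEP = {hep_task:.4f}")
```

Because the HEP depends non-linearly on the SLI, this probability-weighted average over all states generally differs from the HEP obtained by inserting one averaged SLI into Equation (2), which is one reason the two methods can give different values.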
As can be seen from
Table 10, the largest difference between the HEPs is for task T4.4—Confirming the release lever is properly rested and hooks locked, and it is higher when using the BN-SLIM. This confirms that using the BN-SLIM in addition to the SLIM is beneficial because specific tasks that are identified with the highest probability of operator failure could be additionally analysed, and better insight could be obtained.
Sensitivity analysis in the GeNIe Modeler was performed to identify which nodes are the most sensitive regarding the overall task HEP (
Figure A3). According to the obtained results, PSF1—Crew Competence is the most sensitive node, followed by PSF2—Design and condition of equipment. It can be concluded that for the recovery of the lifeboat (part of the lifeboat drill), the greatest effect on safety would be obtained by acting on these two factors. It has to be mentioned that, when weighting PSFs for SLIM, the experts gave higher importance to PSF2 than PSF1. Consequently, the result obtained is also one of the advantages of using the BN-SLIM over the SLIM since the PSFs with the largest effect on HEP can be identified for each phase of the lifeboat drill. Therefore, if effective measures to improve Crew Competence (PSF1) and the Design and condition of equipment (PSF2) are introduced onboard ships, the HEP for tasks included in lifeboat recovery would be as presented in
Table 11.
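A simple way to see why some PSF nodes matter more than others is to compute how the task HEP changes with the probability of each PSF being in its best state. The sketch below does this for a single hypothetical task; it does not reproduce GeNIe's sensitivity algorithm or the ranking reported here, and all inputs are assumptions. Since the root nodes are independent, the HEP is linear in each PSF probability, so the slope equals the difference between the HEP with that PSF certainly good and certainly poor.

```python
import math
from itertools import product


def marginal_hep(weights, p_best, a, b):
    """Task HEP summed over all combinations of PSF states (ratings 1 or 100)."""
    hep = 0.0
    for combo in product((1, 100), repeat=len(weights)):
        p = math.prod(q if r == 100 else 1.0 - q for q, r in zip(p_best, combo))
        sli = sum(w * r for w, r in zip(weights, combo))
        hep += p * (1.0 - 10 ** (a * sli + b))
    return hep


def psf_sensitivities(weights, p_best, a, b):
    """Slope of the HEP with respect to each P(PSF is best), largest effect first."""
    slopes = {}
    for i in range(len(weights)):
        certain_good = list(p_best)
        certain_good[i] = 1.0
        certain_poor = list(p_best)
        certain_poor[i] = 0.0
        slopes[f"PSF{i + 1}"] = (
            marginal_hep(weights, certain_good, a, b)
            - marginal_hep(weights, certain_poor, a, b)
        )
    return sorted(slopes.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical inputs: equal state probabilities, unequal weights.
weights = [0.25, 0.30, 0.20, 0.15, 0.10]
p_best = [0.7] * 5
b = math.log10(1.0 - 0.10)
a = (math.log10(1.0 - 0.001) - b) / 100.0

ranking = psf_sensitivities(weights, p_best, a, b)
for name, slope in ranking:
    print(f"{name}: {slope:+.4f}")
```

In this single-task sketch the ranking follows the weights. In the full network, a PSF feeds several task nodes with task-specific constants, so the most sensitive PSF need not be the one with the largest weight.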
As presented in
Table 11, just acting on PSF1 would reduce the HEP during lifeboat recovery from 4% to 2.2%. If PSF1 and PSF2 are acted upon and improved, the HEP would be reduced to 1.4%. In addition, there is a need to show the change in HEPs if all PSF probabilities change.
Table 12 presents changes in the HEP of lifeboat recovery T4 (tasks T4.1–T4.5) when changing the probabilities of PSFs.
However, in practice, it might be difficult to implement measures that could improve all of the PSFs identified in this study. The optimal solution would be to implement measures that would improve the effect of PSF1 and PSF2 on lifeboat drill tasks and subtasks.
Furthermore, the BN-SLIM model needs to be validated; therefore, a real-life lifeboat drill accident investigation report was used. The short details of the accident are as follows: the bulk carrier was at anchor, and the abandon ship drill was carried out. During the lifeboat recovery, the aft hook-release gear mechanism opened. Because the forward mechanism was unable to take the full load, it also opened. The lifeboat then fell from a height of 11 m into the sea, fatally injuring one of the five lifeboat crew members [
37]. The accident investigation report lists the following as the causes of the accident [
37]:
“The aft release mechanism opened when the floating blocks impacted the davit arms. The load was then transferred to the forward hook, which, due to improper resetting, was released and caused the lifeboat to fall”.
“The hooks were improperly reset because they were not held in their proper position during the reset of the release mechanism”.
“Residual compressive force in the release cable allowed the lock piece to be held in place by friction alone, making hoisting of the lifeboat possible”.
“The deformation of the release cable attachment enabled the release handle to be secured in the reset position even though the hook-release mechanism was not properly reset”.
“The combination of reading the instruction manual and a pre-drill training session was insufficient to ensure that crew members fully understood how to reset the hook-release mechanism”.
According to the accident investigation report findings, the crew members were not competent for the task because they did not know how to reset the hooks properly (PSF1). In addition, the equipment was in poor condition because the release cable attachment was deformed, which enabled the release handle to be secured in the reset position (PSF2). Also, the leadership and supervision were poor because the pre-drill training session was insufficient to ensure that crew members fully understood the hook-release mechanism operation (PSF3). The investigation report also listed “Findings as to Risk”, where it was stated that the instruction manual contents were less than adequate for thorough understanding (PSF4). The same section also identified problems in shipboard safety culture because theoretical knowledge of life-saving equipment was not put into practice, and there was a constant risk when using such equipment (PSF5). Accordingly, all PSFs recognised as important in this research were identified as causes of this lifeboat drill accident, which also occurred during the recovery of the lifeboat. In addition, task T4.4 (Confirming the release lever is properly rested and hooks locked) was a key task that led to the accident because it was not performed due to the inadequate competence of the designated lifeboat crew [
37]. To validate the developed BN-SLIM model, all PSFs were set to their negative state (R1 = 100%) (Figure A4). According to the BN-SLIM results, the probability of human error increases to 13.5% when all PSFs are negative (from 3.6% as estimated during regular lifeboat drill performance). This increase is consistent with the accident findings and supports the validity of the developed model.
4. Discussion
By utilising the SLIM and expert opinions, the highest HEP was estimated for the lifeboat recovery phase of the lifeboat drill. An external validation of the findings was conducted by using real-world data. The Oil Companies International Marine Forum (OCIMF) data showed that almost half of the lifeboat drill accidents occurred while recovering the lifeboats [
38]. In addition, the same survey results confirmed that equipment failures and design shortcomings were responsible for approximately two-thirds of all reported lifeboat accidents [
38]. Tasks T4.1—Aligning the lifeboat with forward and aft davit hooks, retrieving the painter, and engaging the hooks and T4.5—Resuming recovery at the embarkation deck level, testing the manual operation of the limit switches during recovery, and disembarking the lifeboat crew, were identified as the ones with the highest HEP. To reduce the probability of human error during the performance of these tasks, it is important to have well-trained and educated crew members onboard a ship, familiar with the lifeboat davit system and lifeboat equipment, especially with the lifeboat hook release system. Therefore, familiarisation with lifeboats and the associated equipment should be thorough and performed by the officer in charge of the lifeboat. Lifeboat familiarisation checklists should be developed and used to ensure that all equipment, procedures, and actions are covered. Besides this, regular performance of lifeboat drills and training are prerequisites for well-trained and competent crew members. In addition, present weather conditions and the weather forecast should be included in the risk assessment carried out before drill performance, ensuring that environmental factors are favourable.
Furthermore, as identified by the experts and confirmed by the lifeboat accident investigation reports, the adequate design and good condition of equipment are important elements of safe and well-performed lifeboat drills, emphasising lifeboat falls (wires) and lifeboat hooks. A lifeboat equipment inspection checklist should be developed (if not already in use) and strictly followed to ensure the equipment is operational and safe for use. Besides that, special attention should be given to regular maintenance, such as lubricating falls and all other moving parts and ensuring that the hooks, together with the on-load release system and limit switches, are fully operational. Inspections should be performed by crew members trained in lifeboat maintenance to ensure the operability of the equipment and improve safety onboard.
Fall-Preventer Devices (FPDs) are recommended to be used during lifeboat drills when the lifeboat is not waterborne and during lifeboat recovery when waterborne. When properly used, FPDs could prevent the lifeboat from falling if the hooks open. Therefore, the risk of accidents could be significantly reduced. The crew members must be familiar with the usage of FPDs, which must be in good condition (without any damage).
The HEP for Task T4.2—Confirming crew readiness for recovery was identified as the third highest during the lifeboat drill. Although this task does not directly affect accident occurrence, if the lifeboat crew is not seated with seat belts fastened, the consequences of the accidents could be more significant. Hence, confirmation should be received from the lifeboat coxswain that the lifeboat crew is ready for the recovery (since control of the recovery is performed from the ship deck).
Furthermore, the BN-SLIM showed that task T4.4 (Confirming the release lever is properly rested and hooks locked) has the highest HEP. This is supported by the real-life case, in which an accident occurred because the lifeboat crew members did not confirm (or, more accurately, did not know how to confirm) that the release lever was properly rested and the hooks were locked. Special attention should be given to this task, and shipboard leadership should ensure that lifeboat crew members have good theoretical and practical knowledge of using lifeboats, especially lifeboat hooks. Additional training should be introduced to improve both safety and the learning process. Such training could also be virtual [
39], where lifeboat hook operation would be exercised without fear of injuries or fatalities.
The most sensitive PSFs, according to the BN-SLIM results, are Crew Competence and the Design and condition of equipment. Hence, adequate familiarisation with the life-saving appliances (LSA) on board immediately after embarking on a ship, together with adequate education and training in maritime training institutions, could serve as a good base for competency. Further training onboard a ship is needed to develop additional skills and gain knowledge and experience in using LSA to improve safety. Furthermore, equipment should be regularly inspected and maintained according to company maintenance procedures. Only experienced and knowledgeable officers should be in charge of inspection and maintenance to prevent possible equipment deficiencies and breakdowns. In addition, if the design of the lifeboat or some part of its equipment affects safety, it should be reported to the company. Lifeboat and lifeboat equipment manuals should be on board and refer to the equipment presently on board, and obsolete equipment manuals should be immediately removed from the ship.
As lifeboat drill tasks and subtasks with the highest probability of human error are identified in this research, they could be used in practice to improve safety. The findings of this research could serve to improve existing risk assessments for gravity lifeboat drills onboard ships. Furthermore, the proposed measures for improving the identified PSFs could be included in shipboard policies and procedures to reduce the number of lifeboat accidents. Therefore, the practical implication of our research findings is that focusing on the tasks with the highest HEP and the most sensitive PSFs while performing risk assessments for gravity lifeboat drills could improve the safety of lifeboat drills. Safety officers should ensure that the risks connected with lifeboat recovery are mitigated and acceptable. During the drill, procedures should be closely followed, and any deviations from them should be avoided. All possible issues should be noted in the risk assessment review and discussed during safety meetings on board. Corrective measures and safety barriers should be developed and implemented to avoid possible incidents during lifeboat drills.
4.1. Limitations of the Current Research
This research, however, is subject to several limitations. The estimated HEPs are based on expert judgement and not on empirical data, which could affect their reliability. However, as already stated, data on lifeboat accidents are scarce, so it was decided to use the SLIM, which relies heavily on expert judgement. This reliance may introduce biases and limit the generalisability of the findings. The study’s context is limited to master mariners from one country, potentially overlooking variations in training practices, equipment, and regulatory environments in other regions. Nonetheless, it has to be mentioned that the master mariners participating in this research sail for international shipping companies and work with multinational crews. Furthermore, the experts involved are all master mariners, who, while valuable, represent a relatively homogeneous group. However, master mariners are the most familiar with the problems of accomplishing the tasks required for successfully performing lifeboat drills. In addition, the SLIM assumes independence among the HEPs of different tasks within the lifeboat drill. This assumption may oversimplify real-world scenarios where dependencies among tasks exist due to shared PSFs. Therefore, the authors introduced the BN-SLIM to address this issue. Also, the absence of a larger empirical dataset for validation weakens the robustness of the conclusions. This research focuses specifically on gravity lifeboats, excluding free-fall lifeboats. This limitation restricts the applicability of the findings to other types of lifeboats, which might have different operational challenges and error probabilities. Finally, this study does not account for variations in ship types, crew experience levels, or environmental conditions, which could affect the generalisability of the findings.
4.2. Directions for Future Research
To address the limitations of this study, including a more diverse range of stakeholders, such as engineers and safety officers, could provide a more comprehensive perspective in future research. Chief engineers, safety officers, Flag and Port State Control officers, and master mariners would make a heterogeneous group whose expert opinions would significantly enhance this study’s results. Furthermore, including experts from different regions would enhance the generalisability of the findings and enrich the research. As this paper dealt with only gravity davits, future research will include all types of lifeboats and davits with reference to different types of ships, trying to make our findings more general and applicable to all types of ships and lifeboats.