A Resilience Engineering Approach for the Risk Assessment of IT Services

Fargnoli, Mario; Murgianu, Luca

doi:10.3390/app132011132

by Mario Fargnoli * and Luca Murgianu

Faculty of Technological and Innovation Sciences, Universitas Mercatorum, Piazza Mattei 10, 00186 Rome, Italy

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(20), 11132; https://doi.org/10.3390/app132011132

Submission received: 13 September 2023 / Revised: 4 October 2023 / Accepted: 7 October 2023 / Published: 10 October 2023

Abstract:

Nowadays, services related to IT technologies have assumed paramount importance in most sectors, creating complex systems involving different stakeholders. Such systems are subject to unpredictable risks that differ from what is usually expected and cannot be properly managed using traditional risk assessment approaches. Consequently, ensuring their reliability represents a critical task for companies, which need to adopt resilience engineering tools to reduce the occurrence of failures and malfunctions. With this goal in mind, the current study proposes a risk assessment procedure for cloud migration processes that integrates the application of the Functional Resonance Analysis Method (FRAM) with tools aimed at defining specific performance requirements for the suppliers of this service. In particular, the Critical-To-Quality (CTQ) method was used to define the quality drivers of the IT platform customers, while technical standards were applied to define requirements for a security management system, including aspects relevant to the supply chain. Such an approach was verified by means of its application to a real-life case study, which concerns the analysis of the risks inherent to the supply chain related to cloud migration. The results achieved can contribute to augmenting knowledge in the field of IT systems’ risk assessment, providing a base for further research.

Keywords:

resilience engineering; IT systems; cloud migration; risk assessment; functional resonance analysis method (FRAM); critical-to-quality (CTQ); IT services supply chain; security management systems

1. Introduction

Recently, the concept of resilience has been applied to numerous different contexts to stress the role of adaptability and variability in dealing with the events that have characterized our lives in recent years, such as the COVID-19 pandemic, the energy crisis, wars, climate change, etc. [1,2,3,4]. As stressed by Lay et al. [5], factors such as complexity, ambiguity, constrained resources, and uncertainty are increasingly shaping our lives, requiring adaptability and adaptive behaviors to face the variety of perturbations that might occur in real systems. While at a general level, the definition of resilience proposed by the United Nations can illustrate its broad application in the case of adverse situations (“The ability of a system, community or society exposed to hazards to resist, absorb, accommodate to, and recover from the effects of a hazard in a timely and efficient manner, including through the preservation and restoration of its essential basic structures and functions”) [6], in the engineering context the research approach proposed by Hollnagel et al. [7] can be considered as one of the most accepted and widespread, describing resilience as “The intrinsic ability of a system to adjust its functioning prior to, during, or following changes and disturbances, so that it can sustain required operations under both expected and unexpected conditions”. Resilience represents a key factor in enhancing risk assessment, allowing the shift from traditional risk analysis, mainly based on a control-centric approach, to a more modern concept, which is usually called “Safety II” and relies on the acknowledgment that both acceptable and adverse outcomes are based on everyday performance adjustments [8].

Actually, as stressed by Farooqi et al. [9], traditional risk analysis tools belonging to the “Safety I” approach, such as Failure Modes and Effect Analysis (FMEA), Fault Tree Analysis (FTA), and Event Tree Analysis (ETA), provide a bimodal perspective of work activities, according to which positive or negative outcomes are a consequence of different systems’ modes of functioning. Hence, despite the advantages in terms of usability, Safety I tools fail to capture the complexity and variability of human performances and related activities [10]. Accordingly, failures should be considered a result of the everyday variability of human performances rather than unique events. Hence, adverse outcomes are not only due to failures and malfunctions but also to the result of performance variabilities [11]. In order to put into practice this novel approach, the Functional Resonance Analysis Method (FRAM) is one of the most widespread tools for modeling causal factors of accidents (i.e., what goes wrong) and the behavior of sociotechnical systems [12]. This method is currently considered one of the few structured “Safety II” tools used in the industrial context [13]. As pointed out by Hollnagel [14], FRAM allows safety managers to describe the functions of a complex system by characterizing their potential variability and the functional resonance based on dependencies and couplings among them. More in detail, FRAM is aimed at analyzing how a certain socio-technical system works by describing it not in terms of components but as a set of functions that represent the “work-as-done” [15]. In this way, it is possible to continuously adjust the system’s performance, allowing things to go right; conversely, when the performance variability leads to unexpected outcomes, which are described as functional resonance, things go wrong and accidents may occur [16].

In recent literature, numerous studies have investigated the implementation of FRAM to enhance resilience in different domains [17,18,19]. Delikhoon et al. [20] observed that the reduction of unplanned chains of events and losses can be achieved by a proactive management of risks, which can enhance the quality of performance and productivity. Accordingly, most studies on resilience engineering focused on the risk of safety accidents involving modern industrial-technological systems, whose complexity makes reactive risk assessment approaches unsuitable for comprehending how and why accidents occur [21,22]. This is in line with the findings by Patriarca et al. [23], who observed that the majority of FRAM applications concern safety problems in manufacturing, transportation, or power plant contexts, while the use of such an approach in business and operational research domains is still limited. In particular, there is a need to analyze and manage the so-called “black swans”, i.e., unpredictable risks that differ from what is usually expected and are very difficult to envisage. For example, due to their unpredictability, human factors are considered a potential source of errors and malfunctions, requiring a proactive approach to ensure effective safety management [13].

To deal with them, as stressed by Aven [24], further research is needed to develop risk assessment procedures tailored for the management of “critical infrastructures”, whose vulnerability is related to threats due to human actions that cannot be properly managed using traditional risk assessment approaches, as in the case of security infrastructures and the reliability of IT services [25,26]. It must be noted that also in the latter context, recent cases of failures and malfunctioning have caused significant problems: for example, the fire that seriously damaged Europe’s largest cloud services firm, OVHcloud, in 2021 [27], or the outage that occurred in 2021, which involved one of the major Content Delivery Network (CDN) providers [28]. Similarly, several authors pointed out the need to use a resilience engineering approach to augment risk analysis of IT services in the healthcare domain, where system failure must be rigorously avoided [29,30]. Moreover, it must be noted that IoT technologies attract great interest from cybercriminals; consequently, as observed by Zhou et al. [31], the potential for significant damage to an enterprise network can be substantial. All these occurrences also show the unpredictable nature of risks that might affect IT services and cloud computing.

The need for a proper risk assessment approach in this sector was pointed out by several studies [32,33], which mainly outlined the provision of risk assessment frameworks capable of dealing with the variability, unpredictability, and vulnerability of these complex systems.

In the IT context, on the one hand, in recent years numerous efforts have been carried out to augment risk assessment approaches that can guarantee cloud security, as reported, for example, by the MEDINA project financed by the European Union [34]. On the other hand, most risk assessment tools in this domain rely on a reactive approach, while only a few of them take into account supply chain risks [35] and the elicitation of software requirements for complex systems [36]. Hence, a “Safety II” approach is needed to explain performance variability in such a context, and FRAM appears to be the proper tool for systemic safety assessment in this complex and dynamic domain [37].

Based on the above considerations, the current study aims to reduce this research gap by augmenting knowledge of the risk management of IT services by means of a resilience engineering approach through the application of FRAM to the case of cloud migration. In particular, a risk assessment framework was developed to provide a structured approach based on FRAM that can be used at an operational level, which we think can help address the lack of practical tools to apply functional resonance [38] and of a multidisciplinary approach for the requirements elicitation of IT services [30].

In more detail, the remainder of the paper is organized as follows: The next section concerns the study’s materials and methods and is divided into two parts. In the first part, an analysis of FRAM is presented in order to better highlight its main features; in the second part, the framework of the proposed research approach is illustrated to deal with the IT supply chain criticalities. Then, in Section 3, the application of such an approach to a real-world case study concerning the analysis of the risks inherent to the supply chain related to cloud migration is reported, and the results achieved are provided. Section 4 discusses the research outcomes, while Section 5 concludes the paper by addressing future work.

2. Materials and Methods

In recent literature, numerous tools have been used to evaluate and improve the resilience of socio-technical systems [20]. Relying on the tier-based approach for the classification of these tools, which was proposed by Linkov et al. [39], three different tiers can be distinguished based on the increasing level of complexity of the system and of the information needed to deal with it:

Tier I, which consists of a screening level where the main properties of the system are identified and prioritized;
Tier II, where the description of the system structure is defined, and bottlenecks are identified;
Tier III, which includes the modeling of the interactions between the sub-systems, so that different scenarios can be analyzed to verify the system’s performance under uncertainties.

Usually, Tier II tools are sufficient to provide information about the system, allowing decision-makers to properly improve the system’s resilience, while the third level is hardly ever implemented due to the high level of information and resources needed for the analysis. Among the tools belonging to Tier II, FRAM is certainly one of the most widespread, as it can provide information on the system structure and its components by means of a systemic analysis approach [8].

2.1. The Functional Resonance Analysis Method (FRAM)

In brief, the method’s scheme and functioning can be summarized in the following main phases:

Definition of the goal of the analysis;
Identification and definition of the functions;
Definition of the variability of each one of the functions;
Variability aggregation;
Identification of possible solutions.

In the first phase, the objective of the analysis is defined, such as risk assessment or hazard analysis, as well as safety management.

The second step concerns the description of the tasks/functions that allow the system’s functioning; they include human, technological, and organizational activities. Based on the taxonomy provided by Hollnagel [14], each function is characterized by the following aspects:

Input (I), representing the function’s starter or transformer;
Output (O), which is the result of the function’s transformation;
Precondition (P), i.e., what should happen to allow the function’s transformation;
Resource (R), i.e., what is needed for the function’s activation or transformation in order to achieve the output;
Time (T), representing the time constraints that can affect the function;
Control (C), i.e., the control and monitoring modes of the function.

These aspects are usually graphically related to each function by means of a hexagon, as schematized in Figure 1: they represent a certain status of the system, not an activity.
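The six aspects listed above can be pictured as a small data structure. The following is only a minimal sketch: the class name `FramFunction`, the `described_aspects` helper, the state labels, and the classification of the example function as technological are all assumptions made for illustration, not part of the FRAM taxonomy or of the case study data.

```python
from dataclasses import dataclass, field

# The six FRAM aspects, in the order listed in the text.
ASPECTS = ("input", "output", "precondition", "resource", "time", "control")


@dataclass
class FramFunction:
    """One FRAM function (one hexagon); each aspect holds state labels."""
    name: str
    kind: str  # "M" (human), "T" (technological) or "O" (organizational)
    input: list = field(default_factory=list)
    output: list = field(default_factory=list)
    precondition: list = field(default_factory=list)
    resource: list = field(default_factory=list)
    time: list = field(default_factory=list)
    control: list = field(default_factory=list)

    def described_aspects(self):
        """Return the names of the aspects that carry at least one label."""
        return [a for a in ASPECTS if getattr(self, a)]


# Hypothetical example: labels and the "T" classification are assumed.
backup = FramFunction(
    name="Backup of the source system",
    kind="T",
    input=["Migration plan approved"],
    output=["Archive files and images available"],
    precondition=["Source platform reachable"],
)
print(backup.described_aspects())
```

Labels are written as states ("archive files and images available") rather than actions, in line with the remark that aspects describe a status of the system, not an activity.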

It must be noted that if a system is characterized by a sequence of activities, each step of such a sequence can be identified as a function. Otherwise, a more specific analysis is needed to bring to light elementary actions/tasks characterizing the system, e.g., by means of tools such as the Hierarchical Task Analysis (HTA) technique that allows the decomposition of complex tasks [40]. In such a context, it must be noted that while the term “task” refers to planned working activities (i.e., “work-as-imagined”), the term “activity” identifies their practical execution (i.e., “work-as-done”) [41].

The goal of the third step is to define the variability of each one of the functions/tasks that characterize the system. In other terms, the analysis is focused on ascertaining the variability of function output, which is usually classified as:

Endogenous or internal variability;
Exogenous or external variability;
Influenced by upstream functions, i.e., when a functional upstream-downstream coupling occurs.

Based on this, following the taxonomy proposed by Hollnagel [14], the output variability of each function can be estimated considering both time and precision, as reported in Table 1, where the functions (Fu) are classified into the three categories: human (M), technological (T), and organizational (O).

Then, for each function, the variability can be evaluated following the criteria proposed in Table 2.

The variability aggregation step aims at verifying if and where functional resonance can occur by means of the analysis of the upstream-downstream couplings, which allows us to understand how the output of each function can be related to the different aspects of the others [42]. At a general level, five different types of upstream-downstream couplings should be considered: between input and output; between output and preconditions; between output and resources; between output and control; and between output and time.
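As an illustration of how upstream-downstream couplings can be located, the sketch below matches each output label of a function against the other five aspects of every other function. It assumes that couplings are expressed through identical state labels; the function name `find_couplings` and the two-function excerpt are hypothetical and do not reproduce the model built in the case study.

```python
# Downstream aspects that an upstream output can be coupled with.
DOWNSTREAM_ASPECTS = ("input", "precondition", "resource", "control", "time")


def find_couplings(functions):
    """Return (upstream, downstream, aspect, label) tuples.

    `functions` maps a function name to a dict of aspect -> list of labels.
    A coupling exists when an output label of one function also appears in
    a non-output aspect of another function.
    """
    couplings = []
    for up_name, up in functions.items():
        for label in up.get("output", []):
            for down_name, down in functions.items():
                if down_name == up_name:
                    continue
                for aspect in DOWNSTREAM_ASPECTS:
                    if label in down.get(aspect, []):
                        couplings.append((up_name, down_name, aspect, label))
    return couplings


# Hypothetical two-function excerpt; labels are illustrative only.
model = {
    "Backup of the source system": {
        "output": ["Backup archive available"],
    },
    "Migration test": {
        "input": ["Backup archive available"],
        "precondition": ["New server reachable"],
    },
}

for coupling in find_couplings(model):
    print(coupling)
```

Each tuple returned is a candidate path along which the output variability of the upstream function can propagate; whether resonance actually occurs still has to be judged by the analysts.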

The last step consists of the definition of the consequences of the variability in order to implement possible measures aimed at reducing the variability of the function’s performances (in the case of a negative variability) or augmenting it when the variability is considered positive for the system’s functioning.

Accordingly, based on the taxonomy provided by Hollnagel [14], six different solution types can be implemented:

Elimination;
Prevention;
Facilitation;
Protection;
Monitoring;
Dampening.

2.2. Research Approach

The migration of an IT system is considered a high-risk process because only when the process is almost completed is it possible to understand the relationships between the data managed and the related trade-off between the cost, effort, risk, and time on the one hand and the value creation on the other [43]. More in detail, in cloud computing, six main types of migration strategies are recognized [44], which are named the “6Rs” and can be schematized as follows:

Re-hosting (lift-and-shift)
Re-platforming (lift-tinker-and-shift)
Re-purchasing (move to a different product)
Refactoring
Retiring
Retaining.

These strategies provide different levels of optimization, requiring, at the same time, different efforts in terms of implementation costs and resources. Accordingly, the analysis of the supply chain related to the migration process and the supplier selection process are key elements for a successful migration.

More specifically, the system supply chain in the case of cloud migration is composed of the following activities, which can be carried out by internal or external actors (i.e., the organization’s technicians or the suppliers, respectively):

Connectivity and webserver of the Hosting service (external);
Hosting service configuration (internal);
Provision of applications for the management of the Hosting service (external);
Configuration of the Hosting platform (internal);
Domain name transfer/registration (external);
Domain name transfer/registration (internal);
Domain Name System (DNS) service (external);
Domain Name System (DNS) configuration (internal);
Provision of applications for the system migration (external);
Management of customers and access to data (internal).

As can be seen in the above list, half of the activities are provided by external actors; this shows the criticality of both the suppliers’ selection and the management of the whole migration process.
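The tally behind this statement can be checked with a few lines of code. The activity names and the internal/external labels are copied from the list above; the variable names are assumptions introduced for this sketch.

```python
from collections import Counter

# (activity, actor) pairs copied from the supply chain list.
SUPPLY_CHAIN = [
    ("Connectivity and webserver of the Hosting service", "external"),
    ("Hosting service configuration", "internal"),
    ("Provision of applications for the management of the Hosting service", "external"),
    ("Configuration of the Hosting platform", "internal"),
    ("Domain name transfer/registration", "external"),
    ("Domain name transfer/registration", "internal"),
    ("Domain Name System (DNS) service", "external"),
    ("Domain Name System (DNS) configuration", "internal"),
    ("Provision of applications for the system migration", "external"),
    ("Management of customers and access to data", "internal"),
]

counts = Counter(actor for _, actor in SUPPLY_CHAIN)
external_share = counts["external"] / len(SUPPLY_CHAIN)
print(counts["external"], counts["internal"], external_share)
```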

Another critical issue in the supplier selection process relies on the satisfaction of customer needs and expectations [45,46]. To deal with this, the CTQ technique can be used to gain information about customer needs and preferences. Such a tool requires a survey among customers to determine their point of view on the quality of both the products and the related services, including customer care services [47]. A general scheme of this method is illustrated in Figure 2, where QDs represent the quality drivers and PRs represent the performance requirements. The output of this analysis allows the definition of the key requirements for customers, viz., the system’s critical-to-quality features [48].
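The structure of a CTQ analysis (customer need, quality drivers, performance requirements) can be sketched as a nested mapping. Every entry below is a hypothetical placeholder chosen for illustration; none of them is taken from the survey carried out in the study, and the `flatten` helper is likewise an assumption.

```python
# CTQ tree: customer need -> quality drivers (QDs) -> performance requirements (PRs).
# All entries are hypothetical placeholders, not the study's data.
ctq_tree = {
    "Service continuity": {
        "Platform availability": ["Uptime target agreed with the supplier"],
        "Short migration downtime": ["Maximum unavailability window defined"],
    },
    "Data protection": {
        "Regulatory compliance": ["Documented compliance with the GDPR"],
    },
}


def flatten(tree):
    """Return one (need, quality driver, performance requirement) row per PR."""
    return [
        (need, driver, requirement)
        for need, drivers in tree.items()
        for driver, requirements in drivers.items()
        for requirement in requirements
    ]


rows = flatten(ctq_tree)
for row in rows:
    print(" -> ".join(row))
```

Flattening the tree into rows gives a checklist against which each candidate supplier can be compared.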

Based on the above considerations, the FRAM approach was used to analyze the risks related to the migration process of an IT platform, which includes the transfer of data and software and is usually articulated into three main phases: plan, execution, and validation. The use of FRAM was integrated with the application of two different tools aimed at improving its effectiveness:

The 6Rs approach to defining the proper strategies for IT system migration: in order to decide the proper strategy, the decision tree model proposed by Lloyd [43] was used.
The Critical-to-Quality (CTQ) technique was used by means of a survey among the IT platform customers.

In addition, it must be noted that technical standards, such as ISO 22316 [49] and ISO 28000 [50], were used both in the planning phase (to verify the system requirements and the suppliers’ features) and in the validation phase (to verify the conformity of the new system).

Overall, the scheme of the proposed approach is illustrated in Figure 3, where the main activities are summarized.

In other words, our research approach consists of two main phases:

Phase I, which is aimed at the definition of the main requirements and features of the migration process that have to be supplied; the output of this phase is represented by the definition and characterization of the performance requirements for the proper selection of the providers of the new IT platform.
Phase II, which concerns the risk analysis of the migration process that has to be supplied; the output of this phase encompasses the detection of process criticalities and the optimization of the process.

3. Case Study

The research approach was verified by means of its practical application in collaboration with an IT company willing to improve the migration process of its web platform, which is reached by means of the domain name (i.e., the company’s web address). This platform provides online services, and the company wants to update it.

More in detail, the platform consists of both a hardware infrastructure and software aimed at providing technological tools and services for the distribution of digital content and services. In particular, the analyzed IT system contains both a public part and one reserved for registered clients for e-commerce (electronic commerce) purposes. Moreover, there is an area for administrative activities, such as supply chain management, payment management, customer care, etc. The case study was carried out in collaboration with a group of company technicians and IT experts (namely two IT engineers of the company and two external IT consultants), who supported us in all the evaluation processes, which were carried out in a qualitative manner during specific meetings and interviews. Due to a non-disclosure agreement (NDA), some details concerning the company and its platform are omitted.

The first step of the analysis concerned the definition of the customer requirements, i.e., the supply features that can satisfy the company’s expectations. From the interviews, it emerged that the company desires a full migration of the web system since the current system is supported by out-of-date hardware (HW) and software (SW). Accordingly, the new HW and SW solutions must be better performing, allowing a higher level of cyber security and compliance with the requirements of the General Data Protection Regulation (GDPR) [51], as well as greater service continuity and a higher quality of customer assistance services. Furthermore, while no cost targets were indicated, it was requested that the suppliers show their attention to environmental and energy-saving concerns. Finally, it was requested to reduce as much as possible the time during which the platform is unreachable during the migration.

Accordingly, since the IT system is already hosted in the cloud, it was decided to follow the Re-hosting and Re-platforming strategies. In practice, this means that the whole service, including the infrastructure and the database, is moved to a hosting cloud (Re-hosting), and in addition, some components are modified to adapt the cloud service to the new platform features (Re-platforming).

Then the CTQ method was applied to define the main customer needs and the related quality drivers to address the suppliers’ selection; the list of customer needs and quality drivers is illustrated in Figure 4.

Then, for each quality driver, the performance requirement was decided: in Table 3, only an excerpt of the list of PRs is reported due to the NDA.

The next step of the analysis consisted of the identification of the risks related to the migration process by means of the use of FRAM. The starting point of the application of this tool concerns the definition of the functions characterizing the migration process, which was carried out together with the company’s technicians and IT experts by means of semi-structured interviews, in line with Hoy et al. [52]. The list of the identified functions is the following, where P indicates planning activities, E execution activities, and C validation ones:

1c.: Transfer request (P): in the request, the company specifies the characteristics and features of the platform as well as the motivations behind the request; the acceptance of the supplier’s offer by the company is the precondition of the evaluation and execution phases of the migration process; in the offer, the provider must specify the details of the migration procedures and the execution times.
2c.: Web service analysis (P): this activity is aimed at clarifying the features and applications of the server that has to be migrated, taking into account both hardware (e.g., CPU, type of HDD, RAM, etc.) and software (e.g., operating system, applications, CDN, etc.) characteristics; the output of this analysis consists of a document reporting all the information collected (e.g., the URL, the technical specification of the platform, etc.).
3c.: Web hosting analysis (P): the goal of this activity consists of the definition of a report that includes an evaluation of both the current provider and the new ones based on the customer needs that emerged in the first phase of the procedure.
4c.: Domain name analysis (P): since the domain name has to be transferred from the current provider to a new one, all the services related to the DNS should be analyzed; the output of this analysis consists of a technical report including all the features related to the DNS (an example of the services related to the DNS is the provision of an e-mail (electronic mail) system).
5c.: DNS provider analysis (P): this activity is related to the verification of the characteristics of all the services related to the DNS that have to be provided; the company has to decide whether to keep a unique provider or select two different suppliers (e.g., one as a maintainer for the name and domain registration and one for the management of the DNS services).
6c.: Configuration of the services related to the domain name (P): once the provider for the management of the DNS services is chosen, the DNS services must be configured to reduce the unavailability time during the migration.
7c.: Backup of the source system (E): this activity consists of performing the backup of the current platform; the output is represented by archive files and images to be used during the tests and the transfer into the new system.
8c.: Migration test (E): this test is aimed at verifying that the new server and the files generated by the backup are complete and correct. With this goal in mind, a different address is used, creating a clone platform while the current platform is still running.
9c.: Functionality test (C/E): the clone platform is tested to verify all its functionalities and make corrections and modifications if needed in order to guarantee the full accessibility of services.
10c.: New server configuration (E): once the functionality test is concluded, the new server that will host the platform is configured.
11c.: Request for transfer of the domain name (E): the request for transfer of the domain name is made, and once the transfer is confirmed by the new service provider, the control panel is accessible.
12c.: Platform installation (E): the platform is installed and all tools that are necessary for the migration are configured.
13c.: Platform migration (E): all the files generated during the backup are transferred to the new system.
14c.: Consistency test and services verification (C): both the accessibility and functionality of all services of the new platform are verified.
15c.: Completing the migration (C/E): the migration process is completed and the platform is accessible to the customers.

Moreover, for each function, a checklist reporting its main aspects (classified as Input (I), Output (O), Precondition (P), Resource (R), Time (T), and Control (C) categories) was implemented. Then, the FRAM model was developed by means of the software FRAM Model Visualiser (FMV Pro, rel. 2.1.6 [53]), and an excerpt of this application is reported in Figure 5, while the graphic output is illustrated in Figure 6, where the green hexagon represents a human function, the red hexagon an organizational function, and the blue hexagon a technological function. The whole application of the method is provided in the Supplementary Material.

Together with the group of experts, at this stage of the analysis, the following assumptions were made for all the functions, considering a medium level of complexity:

Precision: “acceptable”;
Time: “in time”.

Accordingly, the functions (human (M), technological (T), and organizational (O)) were categorized as shown in Table 4.

The next step concerned the analysis of the variability. Since it emerged that 8 out of 14 functions are human ones, which are characterized by variability in performances with a high frequency and large magnitude, we focused our attention on these functions. In particular, functions 2c, 3c, 5c, and 11c appeared as the most critical for the migration process and the quality of the services provided.

For instance, function 2c (web service analysis) is the most critical human function since it influences all subsequent human, technological, and organizational functions. It influences the selection of the suppliers and the way the backup activity is carried out. Similarly, functions 3c (web hosting analysis) and 5c (DNS provider analysis) have a large impact on the evaluation of the providers, conditioning the final results of the migration and the quality of the services provided by the new platform.

As far as function 11c (request for transfer of the domain name) is concerned, its criticality lies in the non-accessibility time during the domain transfer; the risk of a prolonged period of service unavailability was considered crucial for the company. Hence, different scenarios were evaluated: if the activity is carried out “too early”, the address is reachable but an error message appears since the migration is not completed yet; conversely, if the function is performed “too late”, the address is not reachable although the migration is completed. In the case of an “in time” execution, the time needed to correctly access the platform can vary largely, depending on the performance of the provider. Based on this, to reduce the above risks, the migration process was modified as follows:

the request for transfer was postponed, so that it occurs when the migration is completed;
the use of a temporary domain (alias domain) was foreseen so that if the migration fails or the services on the new platform are unstable, it is possible to go back from the alias domain to the current domain, deleting the redirect to the temporary solution.

In this way, once the migration is completed and both the functionality and consistency of the services are verified, it is possible to operate the request for the domain transfer. Such a new procedure modifies the list of functions as follows:

1n.: Transfer request;
2n.: Web service analysis;
3n.: Web hosting analysis;
4n.: Domain name analysis;
5n.: DNS provider analysis;
6n.: Backup of the source system;
7n.: Migration test;
8n.: Functionality test;
9n.: New server configuration;
10n.: Platform installation;
11n.: Platform migration;
12n.: Redirecting the source domain to a temporary domain;
13n.: Consistency test and services verification;
14n.: Request for transfer of the domain name;
15n.: Configuration of all the services related to the domain name;
16n.: Completing the migration.

In Figure 7, the FRAM model of the new procedure is reported, while the difference between the current model and the new one is illustrated in Figure 8, where the two migration procedures are compared.

More in detail, in the new procedure, the functions “Configuration of all the services related to the domain name” and “Request for transfer of the domain name” have a new position, while a novel function was introduced (function 12n, Redirecting the source domain to a temporary domain). This function adds a sound control phase aimed at augmenting the resilience of the procedure. In fact, the use of an alias domain allows one to go back to the “old” platform, deleting the redirect and thus stopping the migration process if the controls made in the new function 13n (consistency test and services verification) fail. Thanks to this modification, customers do not notice any malfunction since the platform is always available.
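The difference between the two procedures can also be derived mechanically from the two function lists. The sketch below uses Python's standard `difflib.SequenceMatcher` to align them; this is a technique chosen here for illustration, not the comparison method used in the study. One assumption is made explicit in the code: function 6c and function 15n are treated as the same function, although their names differ by the word "all".

```python
from difflib import SequenceMatcher

# Current procedure (functions 1c to 15c), in execution order.
CURRENT = [
    "Transfer request",
    "Web service analysis",
    "Web hosting analysis",
    "Domain name analysis",
    "DNS provider analysis",
    "Configuration of the services related to the domain name",
    "Backup of the source system",
    "Migration test",
    "Functionality test",
    "New server configuration",
    "Request for transfer of the domain name",
    "Platform installation",
    "Platform migration",
    "Consistency test and services verification",
    "Completing the migration",
]

# New procedure (functions 1n to 16n). Assumption: 15n carries the same
# label as 6c so that the two are recognized as one function.
NEW = [
    "Transfer request",
    "Web service analysis",
    "Web hosting analysis",
    "Domain name analysis",
    "DNS provider analysis",
    "Backup of the source system",
    "Migration test",
    "Functionality test",
    "New server configuration",
    "Platform installation",
    "Platform migration",
    "Redirecting the source domain to a temporary domain",
    "Consistency test and services verification",
    "Request for transfer of the domain name",
    "Configuration of the services related to the domain name",
    "Completing the migration",
]

removed, inserted = set(), set()
matcher = SequenceMatcher(None, CURRENT, NEW, autojunk=False)
for tag, i1, i2, j1, j2 in matcher.get_opcodes():
    if tag in ("delete", "replace"):
        removed.update(CURRENT[i1:i2])
    if tag in ("insert", "replace"):
        inserted.update(NEW[j1:j2])

moved = sorted(removed & inserted)   # present in both, at a new position
added = sorted(inserted - removed)   # present only in the new procedure
print("moved:", moved)
print("added:", added)
```

Under these assumptions the alignment reports two repositioned functions and one added function, which matches the description given in the text.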

Moreover, in the current migration procedure, a delay (“too late”) or an anticipation (“too early”) in the completion of function 11c (Request for transfer of the domain name), which depends on two different suppliers (the old and the new maintainers), can cause errors in the migration process. This can leave the platform malfunctioning or unavailable.

In addition, the migration process is blocked when an inaccuracy occurs in communicating the Domain Authorization Code of the Extensible Provisioning Protocol (EPP), for example a mistake by the provider towards the company or a mistake by the company towards the new supplier.

Conversely, in the new configuration, the inclusion of function 12n allows technicians to keep the platform available in both of the above cases while mistakes and problems are solved.
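
The control logic described above can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the procedure used in the case study: the `Platform` class, the `migrate` function, the boolean flags, and the `example.org` domains are all hypothetical, and real DNS or registrar operations are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Platform:
    """Toy model of the routing state during migration (illustrative names only)."""
    source_domain: str
    alias_domain: str
    redirect_active: bool = False
    transferred: bool = False
    log: list[str] = field(default_factory=list)

    def serving_from(self) -> str:
        # Customers are always served: by the new platform while the redirect
        # is active (or after the transfer), otherwise by the old platform.
        if self.redirect_active or self.transferred:
            return "new platform"
        return "old platform"


def migrate(platform: Platform, consistency_ok: bool, epp_code_ok: bool) -> bool:
    """Run functions 12n to 14n; return True only if the domain was transferred."""
    # 12n: redirect the source domain to the temporary (alias) domain.
    platform.redirect_active = True
    platform.log.append("12n: redirect to alias domain enabled")

    # 13n: consistency test and services verification.
    if not consistency_ok:
        # Roll back: delete the redirect and keep serving from the old platform.
        platform.redirect_active = False
        platform.log.append("13n failed: redirect deleted, migration stopped")
        return False

    # 14n: request for transfer of the domain name (needs a correct EPP code).
    if not epp_code_ok:
        # The transfer is blocked, but the redirect keeps the services reachable.
        platform.log.append("14n blocked: EPP code rejected, redirect kept")
        return False

    platform.transferred = True
    platform.log.append("14n: domain name transferred")
    return True


if __name__ == "__main__":
    p = Platform("example.org", "alias.example.org")
    done = migrate(p, consistency_ok=False, epp_code_ok=True)
    print(done, p.serving_from(), p.log)
```

In both failure branches the platform remains reachable, either through the old platform after the rollback or through the redirect while the EPP issue is resolved, which is the property the text attributes to function 12n.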

4. Discussion

The capabilities of providers of IT services represent a critical factor for a company that depends on their reliability and effectiveness in guaranteeing the proper functioning of the IT platform. Such a situation has been largely investigated in the literature, especially in the case of the migration from legacy applications to updated ones [54]. However, most studies focused on the quality features of the services provided in terms of Infrastructure as a Service (IaaS), Software as a Service (SaaS), and Platform as a Service (PaaS) [55]. Other research has focused on cyber risk assessment, which is mainly aimed at dealing with supply chain vulnerabilities and inadequate implementation of security controls [56]. Hence, in the former case, several multi-criteria decision-making (MCDM) approaches have been proposed for supplier selection [57], while in the latter, different supply chain cyber risk assessment models can be found [56,58,59].

However, in spite of these noteworthy studies, little research has been carried out on the specific analysis of the cloud migration process and its risks. Thus, the current study can contribute to augmenting knowledge on this issue, which is of paramount importance for avoiding the interruption or malfunctioning of the services that the company offers to its customers. As noted by Camacho et al. [60], the inaccessibility of these services due to platform malfunctioning may lead both to severe penalty costs due to the infringement of availability clauses and to the loss of market reputation.

The proposed procedure integrates tools for highlighting customer priorities and risk management to better design and carry out migration processes. This is in line with the research directions indicated by Chang et al. [61], who hoped for more research on developing resilient software systems. Moreover, consistent with Aven [24], the proposed approach allowed us to customize the risk assessment of the migration process, providing mitigation measures for risks that cannot be properly managed using traditional risk assessment approaches. Thus, the proposed procedure has proven advantageous in simplifying the system screening and enabling a more reliable migration process compared to the existing risk assessment methods for cloud migration relying on “Safety I” tools, as well as allowing IT engineers to concentrate on important aspects and system serviceability. According to the IT engineers involved, this FRAM-based procedure can be considered an effective tool for the management of complex systems, such as cloud migration processes involving multiple stakeholders. However, they also pointed out that unfamiliarity with the FRAM approach makes its application time-consuming and more complex than traditional risk assessment tools.

The application of the proposed approach to a real-life case study yielded promising results since it provided us with information about critical issues that need adequate attention when implementing a migration process. In particular, the inclusion of an additional control phase during the migration process as well as the use of an “alias” domain contributed to reducing the vulnerability of the process.

Hence, at the practical level, such an approach entails a twofold benefit: on the one hand, the analysis of performance requirements allows the company to make a sounder selection of suppliers for platform migration and maintenance.

On the other hand, the use of FRAM allowed us to perform the variability analysis of the migration procedure, and the new scheme proved less critical than the previous one, while the risks related to the unavailability of the services were satisfactorily reduced. This reflects the method’s capabilities, as underlined by Adriaensen et al. [62], who stressed the effectiveness of FRAM in better managing and mitigating risks from system weaknesses and brittleness. Accordingly, it can be concluded that a versatile risk analysis framework such as the one proposed can effectively deal at an operational level with today’s ever-changing IT landscape, where resilient strategies are crucial for ensuring uninterrupted service delivery and adaptability.

From a methodological point of view, the proposed framework shows the effectiveness and flexibility of the FRAM method, expanding research on its use in operational contexts, as suggested by Patriarca et al. [23]. Moreover, the FRAM application in the cloud migration process represents a novelty in this specific sector and responds to the need to use an in-depth description tool when analyzing contextual and requirement information, as underlined by Holgado [63]. The method allowed us to analyze the relevance and criticality of the relationships among the different activities of the migration process, bringing to light new insights for IT practitioners such as the importance of controls, preconditions, and temporal constraints, which can hardly ever be taken into account by conventional tools for IT systems analysis, in line with the findings by de Souza et al. [64]. Therefore, FRAM incorporation in a structured application framework reduces its criticalities, as pointed out by De Leo et al. [13]. Moreover, the integration of security risk standards into the risk assessment procedure makes it more effective and efficient [65].

As far as the use of FRAM is concerned, it must be noted that neither the company technicians nor the IT experts who supported us were familiar with it. However, after the initial difficulties, they showed great interest in such a tool, recognizing its effectiveness and the advantages of using a systemic procedure for guidance. This is in line with the research insights by Farooqi et al. [9] and addresses the need for a procedure on how to apply Safety-II concepts in practical contexts outlined by Provan et al. [66]. Thus, this outcome represents another research insight, fostering the implementation of practical procedures to facilitate the integration of different tools that support the application of FRAM in a synergistic manner. This can allow engineers to apply a resilience approach more effectively, reducing their difficulties. Accordingly, further research is needed to provide operational frameworks for the implementation of FRAM-based analyses.

In addition to these benefits, the study also presents several limitations. Firstly, it must be noted that the application of the proposed approach to a single case study reduces the value of the research findings. Hence, further studies are needed to better validate the proposed approach in different cloud computing contexts. Thus, on the one hand, caution is needed when generalizing these research findings beyond the sample context, as argued by Alam and Perry [67]. On the other hand, in line with several studies [68,69], a single case study can certainly be used as a research means for exploratory investigation and to generate new understandings.

Another limitation is related to the evaluation of the potential variability of functions, which in our case study was established in a qualitative manner during meetings with the group of experts. To overcome the ambiguity and reduce the subjectivity in the evaluation processes, tools can be used that support decision makers in handling the complex interrelations of the functions and their variability. One example is the Analytic Hierarchy Process (AHP) method, which can guarantee a good balance between targets, understanding, and objectivity, as suggested among others by Rosa et al. [70] and Alboghobeish et al. [71]; another is the detailed investigation of functional interactions [72]. Additionally, the use of mathematical simulation models can improve the quantification of variability in complex systems, reducing FRAM deficiencies [73].
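
As an illustration of how AHP could turn expert judgments into numerical priorities, the following is a minimal sketch, not a result of this study. It uses the row geometric mean approximation of the AHP priority vector and Saaty's consistency ratio; the pairwise comparison matrix, the three functions compared, and the function names `ahp_weights` and `consistency_ratio` are hypothetical.

```python
import math

# Saaty's random consistency index for small matrix sizes.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(matrix: list[list[float]]) -> list[float]:
    """Approximate the AHP priority vector with normalized row geometric means."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix: list[list[float]], weights: list[float]) -> float:
    """Estimate lambda_max from (A w)_i / w_i and return CR = CI / RI."""
    n = len(matrix)
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


if __name__ == "__main__":
    # Hypothetical judgments: how much more critical is the variability of
    # one function compared with another (Saaty's 1 to 9 scale).
    functions = ["12n redirect", "13n consistency test", "14n transfer request"]
    pairwise = [
        [1.0, 3.0, 5.0],
        [1.0 / 3.0, 1.0, 2.0],
        [1.0 / 5.0, 1.0 / 2.0, 1.0],
    ]
    weights = ahp_weights(pairwise)
    for name, w in zip(functions, weights):
        print(f"{name}: {w:.3f}")
    print(f"Consistency ratio: {consistency_ratio(pairwise, weights):.4f}")
```

A consistency ratio below 0.1 is conventionally taken as acceptable; larger values suggest that the experts' pairwise judgments should be revisited.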

Regarding the control phases of the proposed procedure, a further improvement would be a more thorough elaboration of ethical considerations to achieve a more comprehensive assessment.

Finally, considering the case study, it must be noted that communication with the platform’s customers was not considered in the current research, which focused on the migration process only. However, a specific function addressing the communication activities should be included; for example, the possibility of an alert system to inform customers that they are operating in a new domain should be taken into account.

5. Conclusions

Nowadays, IT services are of paramount importance in most sectors, and their maintenance and upgrade represent critical tasks for companies. The current study is an attempt to extend resilience engineering approaches to the provision of these services with the goal of reducing risks for both companies and customers.

More specifically, this research can benefit academics and practitioners alike by providing a procedure for analyzing risks related to a migration process that includes supplier selection criteria and customer needs and can be applied to different scenarios. Accordingly, on the one hand, this study should be seen as a foundation for researchers and practitioners engaged in building new conceptual models for assessing cloud migration risks by means of the functional resonance approach. On the other hand, further work is needed to extend the external validity of the proposed procedure, also by means of its augmentation with tools aimed at reducing subjectivity in establishing the potential variability of functions.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/app132011132/s1.

Author Contributions

Conceptualization, M.F. and L.M.; methodology, M.F. and L.M.; validation, M.F. and L.M.; writing—review and editing, M.F. and L.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy concerns.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Martínez, K.; Claudio, D. Expanding Fundamental Boundaries between Resilience and Survivability in Systems Engineering: A Literature Review. Sustainability 2023, 15, 4811.
2. Reyers, B.; Moore, M.-L.; Haider, L.J.; Schlüter, M. The contributions of resilience to reshaping sustainable development. Nat. Sustain. 2022, 5, 657–664.
3. Folke, C.; Carpenter, S.; Elmqvist, T.; Gunderson, L.; Holling, C.S.; Walker, B. Resilience and sustainable development: Building adaptive capacity in a world of transformations. AMBIO J. Hum. Environ. 2002, 31, 437–440.
4. Carpenter, S.R.; Arrow, K.J.; Barrett, S.; Biggs, R.; Brock, W.A.; Crépin, A.-S.; Engström, G.; Folke, C.; Hughes, T.P.; Kautsky, N.; et al. General Resilience to Cope with Extreme Events. Sustainability 2012, 4, 3248–3259.
5. Lay, E.; Branlat, M.; Woods, Z. A practitioner’s experiences operationalizing Resilience Engineering. Reliab. Eng. Syst. Saf. 2015, 141, 63–73.
6. United Nations Office for Disaster Risk Reduction, Report of the Open-Ended Intergovernmental Expert Working Group on Indicators and Terminology Relating to Disaster Risk Reduction, United Nations General Assembly, Geneve (CH). Available online: https://digitallibrary.un.org/record/852089 (accessed on 26 June 2023).
7. Hollnagel, E.; Woods, D.D.; Leveson, N. Resilience Engineering: Concepts and Precepts; Ashgate: Aldershot, UK, 2006.
8. Hollnagel, E.; Wears, R.L.; Braithwaite, J. From Safety-I to Safety-II: A White Paper. Published Simultaneously by the University of Southern Denmark, University of Florida, USA, and Macquarie University, Australia: The Resilient Health Care Net. 2015. Available online: https://www.england.nhs.uk/signuptosafety/wp-content/uploads/sites/16/2015/10/safety-1-safety-2-whte-papr.pdf (accessed on 7 April 2023).
9. Farooqi, A.; Ryan, B.; Cobb, S. Using expert perspectives to explore factors affecting choice of methods in safety analysis. Saf. Sci. 2022, 146, 105571.
10. Yousefi, A.; Rodriguez Hernandez, M.; Lopez Peña, V. Systemic accident analysis models: A comparison study between AcciMap, FRAM, and STAMP. Process Saf. Prog. 2018, 38, e12002.
11. Patriarca, R.; Bergström, J.; Di Gravio, G.; Costantino, F. Resilience engineering: Current status of the research and future challenges. Saf. Sci. 2018, 102, 79–100.
12. Patriarca, R.; Di Gravio, G.; Costantino, F.; Falegnami, A.; Bilotta, F. An Analytic Framework to Assess Organizational Resilience. Saf. Health Work 2018, 9, 265–276.
13. De Leo, F.; Elia, V.; Gnoni, M.G.; Tornese, F. Integrating Safety-I and Safety-II Approaches in Near Miss Management: A Critical Analysis. Sustainability 2023, 15, 2130.
14. Hollnagel, E. FRAM: The Functional Resonance Analysis Method: Modelling Complex Socio-Technical Systems; CRC Press: London, UK, 2012.
15. Grabbe, N.; Kellnberger, A.; Aydin, B.; Bengler, K. Safety of automated driving: The need for a systems approach and application of the Functional Resonance Analysis Method. Saf. Sci. 2020, 126, 104665.
16. Le Coze, J.C. The ‘new view’ of human error. Origins, ambiguities, successes and critiques. Saf. Sci. 2022, 154, 105853.
17. Li, W.; He, M.; Sun, Y.; Cao, Q. A proactive operational risk identification and analysis framework based on the integration of ACAT and FRAM. Reliab. Eng. Syst. Saf. 2019, 186, 101–109.
18. Patriarca, R.; Bergström, J.; Di Gravio, G. Defining the functional resonance analysis space: Combining Abstraction Hierarchy and FRAM. Reliab. Eng. Syst. Saf. 2017, 165, 34–46.
19. Falegnami, A.; Costantino, F.; Di Gravio, G.; Patriarca, R. Unveil key functions in socio-technical systems: Mapping FRAM into a multilayer network. Cogn. Technol. Work 2020, 22, 877–899.
20. Delikhoon, M.; Zarei, E.; Banda, O.V.; Faridan, M.; Habibi, E. Systems Thinking Accident Analysis Models: A Systematic Review for Sustainable Safety Management. Sustainability 2022, 14, 5869.
21. Leveson, N. A systems approach to risk management through leading safety indicators. Reliab. Eng. Syst. Saf. 2015, 136, 17–34.
22. Yu, D.J.; Schoon, M.L.; Hawes, J.K.; Lee, S.; Park, J.; Rao, P.S.C.; Siebeneck, L.K.; Ukkusuri, S.V. Toward general principles for resilience engineering. Risk Anal. 2020, 40, 1509–1537.
23. Patriarca, R.; Di Gravio, G.; Woltjer, R.; Costantino, F.; Praetorius, G.; Ferreira, P.; Hollnagel, E. Framing the FRAM: A literature review on the functional resonance analysis method. Saf. Sci. 2020, 129, 104827.
24. Aven, T. Risk assessment and risk management: Review of recent advances on their foundation. Eur. J. Oper. Res. 2016, 253, 1–13.
25. Wagner, C.; Hudic, A.; Maksuti, S.; Tauber, M.; Pallas, F. Impact of critical infrastructure requirements on service migration guidelines to the cloud. In Proceedings of the 2015 3rd International Conference on Future Internet of Things and Cloud, Rome, Italy, 24–26 August 2015; Institute of Electrical and Electronics Engineers (IEEE): Piscataway, NJ, USA; pp. 1–8.
26. Choubey, R.; Dubey, R.; Bhattacharjee, J. A survey on cloud computing security, challenges and threats. Int. J. Comput. Sci. Eng. 2011, 3, 1227–1231.
27. DIGICRT, Massive Fire Destroyed OVH Strasbourg Data Center. 2022. Available online: https://constellix.com/news/massive-fire-destroyed-ovh-strasbourg-data-center (accessed on 26 June 2023).
28. Medina, A. Inside the Fastly Outage: Analysis and Lessons Learned, ThousandEyes, Cisco Systems. 2021. Available online: https://www.thousandeyes.com/blog/inside-the-fastly-outage-analysis-and-lessons-learned (accessed on 26 June 2023).
29. Zhou, Z.; Matsubara, Y.; Takada, H. Developing Reliable Digital Healthcare Service Using Semi-Quantitative Functional Resonance Analysis. Comp. Syst. Sci. Eng. 2023, 45, 35–50.
30. de Carvalho, E.A.; Gomes, J.O.; Jatobá, A.; da Silva, M.F.; de Carvalho, P.V.R. Employing resilience engineering in eliciting software requirements for complex systems: Experiments with the functional resonance analysis method (FRAM). Cogn. Technol. Work 2021, 23, 65–83.
31. Zhou, J.; Hai, T.; Jawawi, D.N.A.; Wang, D.; Lakshmanna, K.; Maddikunta, P.K.R.; Iwendi, M. A lightweight energy consumption ensemble-based botnet detection model for IoT/6G networks. Sustain. Energy Technol. Assess. 2023, 60, 103454.
32. Theoharidou, M.; Tsalis, N.; Gritzalis, D. In Cloud We Trust: Risk-Assessment-as-a-Service. In Trust Management VII; Springer: Berlin/Heidelberg, Germany, 2013; Volume 401, pp. 100–110.
33. Sendi, A.S.; Cheriet, M. Cloud Computing: A Risk Assessment Model. In Proceedings of the 2014 IEEE International Conference on Cloud Engineering, London, UK, 8–11 December 2014; pp. 147–152.
34. Tecnalia, The MEDINA Project. Available online: https://medina-project.eu/mission-and-vision/ (accessed on 26 June 2023).
35. Akinrolabu, O.; New, S.; Martin, A. CSCCRA: A Novel Quantitative Risk Assessment Model for SaaS Cloud Service Providers. Computers 2019, 8, 66.
36. Alves Carvalho, E.; Orlando Gomes, J.; Jatobá, A.; Ferreira Silva, M.; Rodrigues Carvalho, P.V. Software Requirements Elicitation for Complex Systems with the Functional Resonance Analysis Method (FRAM). In Proceedings of the XVII Brazilian Symposium on Information Systems, Uberlândia, Brazil, 7–10 June 2021; pp. 1–8.
37. Diop, I.; Abdul-Nour, G.; Komljenovic, D. The Functional Resonance Analysis Method: A Performance Appraisal Tool for Risk Assessment and Accident Investigation in Complex and Dynamic Socio-Technical Systems. Am. J. Ind. Bus. Manag. 2022, 12, 195–230.
38. Martins, J.B.; Carim, G.; Saurin, T.A.; Costella, M.F. Integrating Safety-I and Safety-II: Learning from failure and success in construction sites. Saf. Sci. 2022, 148, 105672.
39. Linkov, I.; Fox-Lent, C.; Read, L.; Allen, C.R.; Arnott, J.C.; Bellini, E.; Coaffee, J.; Florin, M.-V.; Hatfield, K.; Hyde, I.; et al. Tiered Approach to Resilience Assessment. Risk Anal. 2018, 38, 1772–1780.
40. Fargnoli, M.; Lombardi, M.; Puri, D. Applying Hierarchical Task Analysis to Depict Human Safety Errors during Pesticide Use in Vineyard Cultivation. Agriculture 2019, 9, 158.
41. Patriarca, R.; Di Gravio, G.; Costantino, F. A Monte Carlo evolution of the Functional Resonance Analysis Method (FRAM) to assess performance variability in complex systems. Saf. Sci. 2017, 91, 49–60.
42. Alvarenga, M.A.B.; Frutuoso e Melo, P.F.; Fonseca, R.A. A critical review of methods and models for evaluating organizational factors in Human Reliability Analysis. Prog. Nucl. Energy 2014, 75, 25–41.
43. Lloyd, J. Migration Strategies. In Infrastructure Leader’s Guide to Google Cloud: Lead Your Organization’s Google Cloud Adoption, Migration and Modernization Journey; Apress: Berkeley, CA, USA, 2022; pp. 99–105.
44. Varma, K.M.; Se, G.B. Efficient Scalable Migrations in the Cloud. In Proceedings of the IEEE/ACIS 7th International Conference on Big Data, Cloud Computing, and Data Science (BCD), Danang, Vietnam, 4–6 August 2022; pp. 3–6.
45. Abdul Rahman, A.A.L.; Islam, S.; Kalloniatis, C.; Gritzalis, S. A Risk Management Approach for a Sustainable Cloud Migration. J. Risk Financ. Manag. 2017, 10, 20.
46. Karumanchi, M.D.; Sheeba, J.I.; Pradeep Devaneyan, S. Integrated internet of things with cloud developed for data integrity problems on supply chain management. Meas. Sens. 2022, 24, 100445.
47. Fargnoli, M.; Haber, N. A QFD-based approach for the development of smart product-service systems. Eng. Rep. 2023, e12665.
48. Fargnoli, M.; Haber, N.; Tronci, M. Case Study Research to Foster the Optimization of Supply Chain Management through the PSS Approach. Sustainability 2022, 14, 2235.
49. ISO 22316:2017; Security and Resilience—Organizational Resilience—Principles and Attributes. ISO: Geneva, Switzerland, 2017. Available online: https://www.iso.org/standard/50053.html (accessed on 26 June 2023).
50. ISO 28000:2022; Security and Resilience—Security Management Systems—Requirements. ISO: Geneva, Switzerland, 2022. Available online: https://www.iso.org/standard/79612.html (accessed on 26 June 2023).
51. EU. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the Protection of Natural Persons with Regard to the Processing of Personal Data and on the Free Movement of Such Data, and Repealing Directive 95/46/EC (General Data Protection Regulation). Available online: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A02016R0679-20160504&qid=1688462060670 (accessed on 26 June 2023).
52. Hoy, K.M.; Fallon, E.; Kelly, M. Paediatric Homecare Risk Management: An Application of Functional Resonance Analysis Method (FRAM). Safety 2023, 9, 52.
53. Rees Hill, FRAM Model Visualiser (FMV). Available online: https://functionalresonance.com/the%20fram%20model%20visualiser/ (accessed on 26 June 2023).
54. Sen, A.; Madria, S. Analysis of a cloud migration framework for offline risk assessment of cloud service providers. Softw. Pract. Exp. 2020, 50, 998–1021.
55. Kumar, R.R.; Mishra, S.; Kumar, C. A novel framework for cloud service evaluation and selection using hybrid MCDM methods. Arab. J. Sci. Eng. 2018, 43, 7015–7030. Available online: https://link.springer.com/content/pdf/10.1007/s13369-017-2975-3.pdf (accessed on 26 July 2023).
56. Akinrolabu, O.; Nurse, J.R.C.; Martin, A.; New, S. Cyber risk assessment in cloud provider environments: Current models and future needs. Comput. Secur. 2019, 87, 101600.
57. Lee, S.; Seo, K.K. A hybrid multi-criteria decision-making model for a cloud service selection problem using BSC, fuzzy Delphi method and fuzzy AHP. Wirel. Pers. Commun. 2016, 86, 57–75.
58. Akinrolabu, O.; New, S.; Martin, A. Cyber Supply Chain Risks in Cloud Computing—Bridging the Risk Assessment Gap. Open J. Cloud Comput. 2018, 5, 1–19.
59. Albakri, S.H.; Shanmugam, B.; Samy, G.N.; Idris, N.B.; Ahmed, A. Security risk assessment framework for cloud computing environments. Secur. Commun. Netw. 2014, 7, 2114–2124.
60. Camacho, C.; Cañizares, P.C.; Llana, L.; Núñez, A. Chaos as a Software Product Line—A platform for improving open hybrid-cloud systems resiliency. In Software—Practice and Experience; Wiley: Hoboken, NJ, USA, 2022; pp. 1–34.
61. Chang, V.; Ramachandran, M.; Yao, Y.; Kuo, Y.H.; Li, C.S. A resiliency framework for an enterprise cloud. Int. J. Inf. Manag. 2016, 36, 155–166.
62. Adriaensen, A.; Decré, W.; Pintelon, L. Can Complexity-Thinking Methods Contribute to Improving Occupational Safety in Industry 4.0? A Review of Safety Analysis Methods and Their Concepts. Safety 2019, 5, 65.
63. Holgado, M. A Systems Engineering Approach to Performance-Based Maintenance Services Design. Processes 2019, 7, 59.
64. de Souza, I.T.; Rosa, A.C.; Vidal, M.C.R.; Najjar, M.K.; Hammad, A.W.A.; Haddad, A.N. Information Technologies in Complex Socio-Technical Systems Based on Functional Variability: A Case Study on HVAC Maintenance Work Orders. Appl. Sci. 2021, 11, 1049.
65. Abioye, T.E.; Arogundade, O.T.; Misra, S.; Adesemowo, K.; Damaševičius, R. Cloud-Based Business Process Security Risk Management: A Systematic Review, Taxonomy, and Future Directions. Computers 2021, 10, 160.
66. Provan, D.J.; Woods, D.D.; Dekker, S.W.A.; Rae, A.J. Safety II professionals: How resilience engineering can transform safety practice. Reliab. Eng. Syst. Saf. 2020, 195, 106740.
67. Alam, I.; Perry, C. A Customer-oriented new service development process. J. Serv. Mark. 2002, 16, 515–534.
68. Yin, R.K. Case Study Research. Design and Methods; Sage: Thousand Oaks, CA, USA, 2014.
69. Haber, N.; Fargnoli, M.; Sakao, T. Integrating QFD for product-service systems with the Kano model and fuzzy AHP. Total Qual. Manag. Bus. Excel. 2018, 31, 929–954.
70. Rosa, L.V.; Carvalho, P.V.; Haddad, A.N. FRAM-AHP: A Resilience Engineering Approach for Sustainable Prevention. In Occupational and Environmental Safety and Health II; Springer: Cham, Switzerland, 2020; pp. 123–131.
71. Alboghobeish, A.; Shirali, G.A. Integration of Functional Resonance Analysis with Multicriteria Analysis for Sociotechnical Systems Risk Management. Risk Anal. 2022, 42, 882–895.
72. Abreu Saurin, T.; Patriarca, R. A taxonomy of interactions in socio-technical systems: A functional perspective. Appl. Ergon. 2020, 82, 102980.
73. Salehi, V.; Veitch, B.; Smith, D. Modeling complex socio-technical systems using the FRAM: A literature review. Hum. Factors Ergon. Manuf. Serv. Ind. 2021, 31, 118–142.

Figure 1. Scheme of a hexagonal representation of a function in the FRAM method.

Figure 2. Scheme of the CTQ method.

Figure 3. Scheme of the proposed approach.

Figure 4. Application of the CTQ method.

Figure 5. Excerpt of the FRAM application by means of the FMV software.

Figure 6. Application of the FRAM model to the current transfer procedure.

Figure 7. Application of the FRAM model to the new transfer procedure.

Figure 8. Comparison between the FRAM models of the current and the new transfer procedures.

Table 1. Classification of the functions in the FRAM method.

Fu	Precise	Acceptable	Imprecise
T	Normal	Improbable	Improbable
M	Possible	Typical	Possible
O	Improbable	Possible	Probable

Table 2. Criteria suggested for the evaluation of the variability in the FRAM method.

Fu	Precise	Acceptable	Imprecise
T	Improbable	Normal	Improbable
M	Possible	Possible	Possible
O	Improbable	Probable	Possible

Table 3. Excerpt of the list of performance requirements.

Quality Drivers	Performance Requirements
1.1 HW and SW performances	1.1.a Type of processor: RAM, SSD, etc.
	1.1.b Bandwidth
	1.1.c Command execution times and diffusion
1.2 SLA performances	1.2.a SLA ≥ 99.99%
	1.2.b Refund policy
	1.2.c Details of services
1.3 Customer care performances	1.3.a Troubleshooting within 2 h
	1.3.b ISO/IEC 20000-1 certification
	1.3.c ISO 9001 certification
2.1 ISO/IEC 27001, 27017, and 27018 certification	2.1.a 5-year certification
	2.1.b Guidelines published on the company’s website
	2.1.c Results of the third-party audits published on the company’s website
2.2 ISO 22301 certification	2.2.a 5-year certification
	2.2.b Guidelines published on the company’s website
	2.2.c Results of the third-party audits published on the company’s website
2.3 Tier 4 or ANSI/TIA-942 certification	2.3.a 5-year type IV certification
	2.3.b Guidelines published on the company’s website
	2.3.c Results of the third-party audits published on the company’s website
3.1 Compliance with GDPR	3.1.a EU server
	3.1.b ISO/IEC 27701 certification
	3.1.c Exhaustive privacy section of the company’s website
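
To give a sense of how quantitative requirements such as those in Table 3 could be checked, the sketch below screens a candidate supplier against three of them (1.2.a, 1.3.a, and 3.1.a) and converts an SLA percentage into the maximum downtime it allows per year. This is an illustrative assumption and not the selection procedure used in the study: the `Supplier` class, its fields, and the example values are hypothetical.

```python
from dataclasses import dataclass

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def max_downtime_minutes(sla_percent: float) -> float:
    """Maximum yearly downtime (in minutes) compatible with a given SLA."""
    return MINUTES_PER_YEAR * (1 - sla_percent / 100)


@dataclass(frozen=True)
class Supplier:
    """Hypothetical supplier profile covering three requirements of Table 3."""
    name: str
    sla_percent: float            # requirement 1.2.a
    troubleshooting_hours: float  # requirement 1.3.a
    eu_server: bool               # requirement 3.1.a


def unmet_requirements(supplier: Supplier) -> list[str]:
    """Return the labels of the requirements the supplier does not satisfy."""
    unmet = []
    if supplier.sla_percent < 99.99:
        unmet.append("1.2.a SLA >= 99.99%")
    if supplier.troubleshooting_hours > 2:
        unmet.append("1.3.a Troubleshooting within 2 h")
    if not supplier.eu_server:
        unmet.append("3.1.a EU server")
    return unmet


if __name__ == "__main__":
    candidate = Supplier("Supplier A", sla_percent=99.9,
                         troubleshooting_hours=1.5, eu_server=True)
    print(unmet_requirements(candidate))
    print(f"{max_downtime_minutes(99.99):.2f} minutes per year at 99.99%")
```

An SLA of 99.99% corresponds to roughly 52.56 minutes of allowed downtime per year, whereas 99.9% allows about 525.6 minutes, which shows how strict requirement 1.2.a is in practice.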

Table 4. Functions’ categorization.

Function	Time: In Time	Precision: Acceptable
T	Normal	Improbable
M	Possible	Typical
O	Probable	Possible

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

