*Article* **Digital Twin for Monitoring Ergonomics during Manufacturing Production**

## **Alessandro Greco, Mario Caterino \*, Marcello Fera and Salvatore Gerbino**

Department of Engineering, University of Campania Luigi Vanvitelli, via Roma 29, 81031 Aversa, Italy; alessandro.greco@unicampania.it (A.G.); marcello.fera@unicampania.it (M.F.); salvatore.gerbino@unicampania.it (S.G.)

**\*** Correspondence: mario.caterino@unicampania.it; Tel.: +39-081-50-10-318

Received: 30 September 2020; Accepted: 30 October 2020; Published: 2 November 2020

**Abstract:** In the era of smart factories, the Digital Twin (DT) is the key to setting up novel models for monitoring the performance of manual work activities, models able to provide results in near real time and to support the decision-making process for improving working conditions. This paper proposes a methodological framework that, by implementing a human DT, supports the monitoring of and decision making about the ergonomic performance of manual production lines. A case study, carried out in a laboratory, is presented to demonstrate the applicability and effectiveness of the proposed framework. The results show how it is possible to identify the operational issues of a manual workstation and to propose and test improving solutions.

**Keywords:** ergonomics; manufacturing; production process; Digital Twin

#### **1. Introduction**

Ergonomic issues represent one of the main factors characterizing the manufacturing working environment. Indeed, in addition to designing a production system according to an ergonomic approach, the working activities need to be continuously monitored, especially when the volume of production varies, causing changes in cycle time and, consequently, changes in working tasks and workload.

Going more into detail, biomechanical overload represents one of the main risk factors in a manufacturing environment and is a possible source of Musculo-Skeletal Disorders (MSDs). MSDs consist of lesions or alterations of muscles, nerves, tendons and joints and, as demonstrated by numerous studies in the literature and by international standards, they are caused by prolonged exposure to awkward working postures, exerted forces, Manual Material Handling (MMH) and repetitive actions [1,2].

Over the years, the high incidence of the MSDs related to biomechanical overload led to the development of numerous risk assessment methods, principally observation-based, as collected by Takala et al. [3]. The application of risk assessment methods is mandatory during normal production to monitor working activities whenever changes are made to production volume or to working tasks. However, since the methods are observational, mostly based on the compilation of specific check-lists, the evaluation procedure is time-consuming and, additionally, strictly subjective, so it does not guarantee the repeatability of measurements.

For these reasons, in the era of Industry 4.0, the need to develop numerical methodologies and measurement devices has become a priority for companies wishing to have efficient and safe production systems. In particular, among the enabling technologies, the integrated use of the Industrial Internet of Things (IIoT) and simulation tools allows for monitoring working performance, enhancing the control of the process and enabling near-real-time management of the whole production line, in terms of balancing and ergonomics [4,5]. However, the literature dealing with the use of Industry 4.0 concepts and tools for ergonomics is scarce, as also pointed out by Kadir et al. [6] in a thorough review. Most research is documented in conference proceedings, although the topic is highly relevant, with wide application possibilities.

This research was born with the aim of creating tools that can support experts in ergonomic screening, ensuring accuracy and repeatability of measurements. From this point of view, the DT, based on the integration of IIoT and simulation, can represent the best solution as it allows exploiting the computational capability by using numerical models in which real data are implemented.

IIoT includes objects or devices (sensors, actuators, mobiles, etc.) that are able to interact and communicate with each other via internet protocols and, thanks to the implementation of specific algorithms, are able to carry out measurements and make decisions autonomously in order to manage machines and production systems [7]. Among these technologies, for ergonomics purposes, wearable devices are crucial since they can be equipped with sensors capable of measuring parameters related to humans (postures, forces, muscular activity, etc.). Widespread in recent years, wearable motion tracking devices have seen a massive introduction in the factory environment. Thanks to their good accuracy and low invasiveness, they allow acquiring motion data during normal working activities and analyzing them in order to evaluate ergonomic indexes related to the biomechanical load due to working postures [8–10].

In addition to IIoT, a manufacturing scenario can be fully reproduced in a 3D environment, giving the possibility to simulate the operations of a real working process [11]. This makes it feasible, from the design phase onwards, to choose the correct solutions, knowing in advance the performance of the line. The implementation of real data, collected by IIoT devices, in a simulation scenario realizes the so-called Digital Twin (DT).

Manual operations are still dominant in complex manufacturing systems [12], hence the working tasks are affected by high variability. By means of DT, it is interesting to investigate the possibility to implement real data into a simulation in order to assess ergonomics indexes.

In literature, several researches deal with digital human modeling and simulation for assessing ergonomics indexes. Caputo et al. [13,14] defined experimental/numerical procedures to evaluate the ergonomics of working activities by using, separately, inertial motion tracking system and simulation. Experimental data were used in Caputo et al. [13], for validating the numerical model, by comparing experimental and numerical results. The authors proposed a framework for preventively evaluating the ergonomic indexes (in particular, the Ergonomic Assessment Work-Sheet—EAWS) during the design phase of a new manufacturing line. In Caputo et al. [14], four ergonomic indexes, the same considered for the case study presented in Section 3, have been evaluated by means of numerical models for validating the design of a new workstation.

Makarova et al. [15] demonstrated that process parameters and ergonomics indexes can be investigated in a virtual environment. Case et al. [16] investigated the workers' ageing by implementing human capability data within a simulation environment. Tarallo et al. [17] proposed a computer-aided production control framework, which includes the use of digital human models, for implementing the principles of Industry 4.0 in manual working environments. The authors described how to monitor manual manufacturing processes by using a virtual simulation software (Siemens Tecnomatix Jack®) and an optical motion capture system (Microsoft Kinect®).

Sanjog et al. [18] used both physical and virtual ergonomics tools for assessing the ergonomics of an industrial shop-floor workstation. They pointed out that DHM and simulation could be highly beneficial for engineers, production supervisors and ergonomists in setting the best design solution for a safe workstation.

However, these numerical models may not provide sufficiently accurate results, since human motion is evaluated by inverse kinematics, which sometimes makes the movements unrealistic. Thus, implementing real data in Digital Human Models (DHMs) should allow accurate simulation and assessment of production performance (ergonomics, working times, line efficiency, etc.).

Few significant studies have been documented in the technical literature about human simulation based on experimental data, and most of them are related to the control of the human body for clinical purposes. Most of the research studies based on DT concern the improvement of manufacturing processes and product lifecycle management. In fact, there are no applications of a human Digital Twin to assessing the risk factors closely related to manual working activities.

Catarci et al. [19] provided a detailed literature review about DTs and their modelling techniques, proposing a complex architecture for digital factories. Malik and Bilberg [20] presented a DT for investigating the performance of a human-robot collaboration work-cell, transferring only the data related to the robot, while Li et al. [21] developed an Augmented Reality (AR) application for the control of robots during a similar task, by implementing the DT of human hands by means of a LeapMotion sensor and a Kinect V2 camera. Nikolakis et al. [22] used experimental data for the implementation of a DT for recognizing and simulating human activities.

Zheng et al. [23] studied the literature about DT technology to realize a framework, aimed at product lifecycle management, based on three function modules: data storage, data processing and data mapping. Similarly, Ma et al. [24] proposed a framework based on DT to support the management of cyber-physical systems of a production workshop, including product design and manufacturing. Aiming to improve the order management process, Kunath and Winkler [25] proposed a conceptual framework, based on DT, of a decision support system able to find the best solution by simulating several scenarios. Instead, Havard et al. [26] combined DT and virtual reality in a co-simulation environment for assessing industrial workstations. They carried out a case study related to a human-robot collaborative workplace, also performing ergonomic analysis.

This research aims to fill the gap in the literature about the use of a human DT to evaluate ergonomics by proposing a methodology that supports ergonomists, occupational physicians and line managers in mapping the ergonomic risk for all the workstations of a manufacturing environment.

There are two main motivations behind choosing a DT-based procedure for assessing ergonomics. Firstly, it ensures that analyses are not affected by the ergonomist's subjectivity, typical of traditional (observational) techniques. Secondly, the DT drastically reduces computation times, thanks to algorithms able to quickly process data. Since ergonomic analyses are significantly time-consuming procedures, which can require hours for filling in the spreadsheets, considerable savings in time, and therefore in costs, can be expected.

This paper builds on previous research [4], in which the authors proposed a novel methodological framework, based on the implementation of a DT, for carrying out near-real-time analyses of the performance of manufacturing production lines, in terms of working times and balancing. The core of the research in [4] was the collection and transfer of human motion data for reproducing the real production in a simulation scenario. Such a model is able to supply output data useful for the evaluation of the desired line performance, besides representing a predictive model for the behavior of the line itself under possible modifications.

Herein, the methodological framework has been modified and adapted in order to evaluate the worker's performance in terms of ergonomics, investigating the possible causes of injury risk due to biomechanical overload: working postures, exerted forces, Manual Material Handling and repetitive actions. This procedure represents an innovation for the ergonomic screening of production lines, which is currently still mainly performed by observational techniques and is a highly time-consuming process. In fact, the ergonomist typically observes and records the work activity, estimates the index calculation parameters, fills in the checklists and evaluates the risk index. This process may require a significant amount of time, in the order of hours, to estimate the indexes of each workstation. In addition, the use of experimental data and the DT allows an objective analysis as well as ensuring the repeatability of measurements. This approach contributes to the transformation towards the so-called smart factory, in agreement with the principles of Industry 4.0.

A case study, aimed at demonstrating the effectiveness of the framework, has been set up and performed at the Laboratory of Machine Design of the University of Campania Luigi Vanvitelli. Data have been acquired by means of a wearable inertial motion tracking system [27], and the DT has been implemented in the Tecnomatix Process Simulate software by Siemens®.

The remainder of the paper is organized as follows. Section 2 describes the methodological framework, aimed at supporting the ergonomic assessment of the investigated working activity. Section 3 describes the case study, which investigates an assembly activity reproduced in the laboratory. Section 4 presents the results analysis and discussion, while Section 5 concludes the paper.

#### **2. Methodological Framework**

Figure 1 describes the methodological framework developed to investigate the ergonomics of manual working tasks in a manufacturing scenario.

As already mentioned, the methodological framework derives from the one described in Fera et al. [4], which was specifically related to production line performance evaluation; in this paper it has been modified and adapted for assessing the ergonomic indexes during real production.

The framework consists of seven steps:


In the next paragraphs of this section, each phase is explained and characterized. The data collection and simulation phases are the same as those described in [4] and are briefly summarized in the following.

**Figure 1.** Methodological framework for monitoring ergonomics performance during the production process.

#### *2.1. Data Collection*

According to Fera et al. [4], data collection is a crucial step for applying the proposed approach. Specifically, data collection concerns human body movements and may be performed by means of both optical and non-optical motion-capture-based systems.

Together with the motion tracking system, it could also be useful to acquire additional data, such as those related to the exerted forces, by means of specific devices (force sensors, cyber gloves, etc.) to carry out the analysis.

It is important to define the number of cycles to be acquired during the data acquisition session. This is required (see Section 3.1) in order to compute the basic statistical moments (mean and standard deviation) of the ergonomic indexes, used to assess whether a deeper investigation of critical ergonomic issues is needed. The number of cycles to acquire strictly depends on the cycle time of the working task: in particular, the smaller the cycle time, the greater the number of acquisitions, owing to the higher variability between acquisitions [28].

Once the number of cycles to acquire has been defined, the data acquisition session may start. If data acquisition is performed by means of wearable sensors, an ethical consent form must be signed by the workers wearing them. Typically, a questionnaire about the usability of the system is also submitted to the workers.

Acquired data are transferred to appropriate software able to perform human simulations, such as Tecnomatix by Siemens® [29] or Delmia by Dassault Systèmes® [30], and to integrate such data into the DHM, which accurately replicates the real workers' tasks. Depending on the type of device used during the acquisition session, the way data are transferred to the software changes. In some cases, as shown later in the case study, custom plugins and a re-sampling of the data are necessary to integrate them into the simulation scenario.

#### *2.2. Simulation*

Step 4 in Figure 1 concerns simulation. Since it is focused on ergonomic assessment, simulation has to accurately replicate manual working tasks, according to the acquired data.

The steps [4] to follow in order to carry out the simulation are substantially four: (i) virtual scenario setting, reproducing the real workplace layout; (ii) DHM creation, according to the anthropometric characteristics of the real worker; (iii) data implementation and, when necessary, operation refining (e.g., handling, picking or grasping an object, applying a force, etc.); (iv) running the simulation.

#### *2.3. Ergonomics Assessment*

Once the simulation is completed, the numerical data are analyzed in order to perform the ergonomic assessment (step 5 in Figure 1). In a manufacturing scenario, as anticipated in Section 1, it is necessary to investigate the biomechanical overload, a source of injury risk, which is principally caused by working postures, exerted forces, MMH and repetitive actions of the upper limbs. Many methods, tools and screening procedures have been developed over the years [3], some of which are already implemented in different software codes for production process simulation. Many of them comply with the standards ISO 11226 [31] and ISO 11228-1,2,3 [32–34], which regulate the whole procedure of occupational ergonomics monitoring.

After selecting the four risk indexes (one for each risk factor), numerical data are used for evaluating them (step 5.1 in Figure 1) for each acquired cycle. If the average values of the four indexes fall within the low risk area, the control phase is carried out by applying the following Equation (1) to each evaluated index:

$$I_{\sigma} \le I_{t} - I_{\mu} \tag{1}$$

where *I*<sub>µ</sub> and *I*<sub>σ</sub> are the average and the standard deviation of the evaluated index, respectively, over the investigated working cycles, and *I*<sub>t</sub> is the threshold value of the index for entering the medium risk area.
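The control phase of Equation (1) reduces to a simple statistical check. A minimal sketch in Python follows, using the standard library `statistics` module; the per-cycle index scores and the medium-risk threshold below are hypothetical values, not taken from the case study:

```python
import statistics

def control_check(index_values, threshold):
    """Control phase (Equation (1)): the standard deviation of the index over
    the acquired cycles must not exceed the margin between the medium-risk
    threshold (I_t) and the average index value (I_mu)."""
    i_mu = statistics.mean(index_values)
    i_sigma = statistics.stdev(index_values)
    return i_sigma <= threshold - i_mu

# Hypothetical per-cycle index scores with a medium-risk threshold of 200
scores = [142, 150, 138, 147, 151, 145]
print(control_check(scores, 200))  # True: variability stays within the margin
```

If the check returns `False`, the framework routes the analysis to the deeper investigation of step 5.2.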

If Equation (1) is satisfied, the framework suggests continuing the production (step 6 in Figure 1).

If any of the indexes does not fall within the low risk area, or if Equation (1) is not satisfied, it is necessary to investigate the working task further, focusing on the critical factor.

This important step should be conducted by experienced ergonomists or by occupational physicians, since they have the appropriate know-how for properly identifying and resolving the critical issues.

Let us analyze the investigation in detail (step 5.2 in Figure 1).

In case the working posture risk index does not fall within the low risk area, it is advisable to investigate, by observation or by studying the time history of the postural angles, the postures assumed by the operator, identifying the sub-phases of the work cycle that contribute most to the value of the index.

Regarding the exerted forces, if they exceed the maximum applicable value, as reported in the tables by Snook and Ciriello [35], it is necessary to investigate the posture assumed during the force exertion, as well as the intensity and the duration of the application.

MMH is evaluated when the weight of the handled object is at least 3.5 kg; if the risk index exceeds the low risk area threshold, the investigations are different, depending on the kind of handling:

• for lifting operations, attention has to be paid to the initial and final heights of the handling, as well as to the type of grip and the frequency;


If the threshold value is exceeded by the index related to repetitive actions of the upper limbs, the investigation will focus on the number of technical actions, the possible awkward joint postures, the types of grip, the frequency of actions and the recovery times.

Once the critical issues have been identified (step 5.3 in Figure 1), it is advisable to discuss the possibility of making changes to the station layout or to the operations to be carried out.

After discussing and testing possible solutions by means of numerical simulation, the ergonomist will deal with the decision-making process (step 6 in Figure 1) in order to improve the production process.

#### **3. Case Study**

Aiming to show the applicability and effectiveness of the framework depicted in Figure 1, a case study is described here, related to a working task carried out in a laboratory, where a simple assembly task was defined and performed. Figure 2 shows the working scenario reproduced in a simulation environment.

**Figure 2.** Workstation layout, reproduced in a simulation scenario.

As shown in Figure 3, which depicts both the real experiment and the Digital Twin, a female worker manually performs the assembly of two steel components (labeled "1" and "2"). After being picked up and positioned, the components are joined by performing four screwing operations; then, the assembly is placed in a cart. A detailed description of each operation is reported in Table 1.

The assumed cycle time is 30 s, which includes 2 s of recovery. In addition, the work-shift duration has been assumed equal to 8 h, including 30 min of breaks (10 min per break), equally distributed along the whole shift. A 60 min lunch break is scheduled in the middle of the shift.

Going into detail about the characteristics of the workstation, components 1 and 2 weigh 6 kg and 2 kg respectively, while the screwdriver weighs 2.5 kg.

Regarding the screwing operations, the joints are made with M10 threaded bolts and the tightening torque is 30 Nm.

The shelf where the two components are placed is 1400 mm high, while the height of the workbench is 900 mm. Finally, the assembly is positioned in the cart on a support plane 500 mm high.

The worker is P40 of the Italian female population [36], with a stature of 1550 mm and a weight of 45 kg.

In the following sections, the methodological framework depicted in Figure 1 is applied.

**Figure 3.** Tasks of the working activity. For each picture frame, on the left the real scenario and on the right the Digital Twin.



#### *3.1. Data Collection*

Motion data have been collected by using a wearable inertial motion tracking system developed at the Department of Engineering of the University of Campania Luigi Vanvitelli. The tracking system is composed of Inertial Measurement Units, and a sensor fusion algorithm, based on an Extended Kalman Filter, has been developed for evaluating the attitude of the body segments and, therefore, the posture angles related to the investigated working activity. Acquisition start and stop are managed by means of a mobile app operated by an observer. A scheme of the motion tracking system is provided in Figure 4, and a full description of the algorithm can be found in [27]. This motion tracking system does not yet work in real time, so data transfer has been carried out after data processing. Indeed, the algorithm autonomously compiles CSV (Comma-Separated Values) files in which Euler angles, quaternions and posture angles are provided for each segment of the human body in each time frame.
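The paper does not specify the column layout of the CSV files produced by the tracking system, so the following sketch assumes a hypothetical layout with one row per body segment per time frame (`time`, `segment`, `posture_angle` columns); it groups the posture-angle trace by body segment for later processing:

```python
import csv
from collections import defaultdict

def load_posture_angles(path):
    """Read a motion-tracking CSV (hypothetical columns: time, segment,
    posture_angle) and group the (time, angle) samples by body segment."""
    angles = defaultdict(list)  # segment name -> list of (time, angle)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            angles[row["segment"]].append(
                (float(row["time"]), float(row["posture_angle"]))
            )
    return angles
```

The real files also carry Euler angles and quaternions per segment; the same grouping pattern applies to those columns.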

The system is worn over the normal clothes by the worker (as in Figure 3) and, after a proper starting calibration of the sensors, the acquisition is run and the worker can normally perform the working tasks.

According to Giacomazzi [28], since the cycle time is about 30 s, motion data have been acquired for 60 consecutive working cycles (*i* = 60). Triggers have been manually inserted by the observer during data acquisition in order to separate the data of consecutive working cycles. Figure 5 shows the trends of the posture angles for trunk, elbows and arms along one working cycle.
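Segmenting the acquired stream into the 60 working cycles can be sketched as follows; the trigger format is an assumption (here, a list of timestamps marking the start of each cycle, with the last trigger closing the run):

```python
def split_into_cycles(samples, triggers):
    """Partition a time series of (time, value) samples into working cycles,
    using the observer's trigger timestamps as cycle boundaries."""
    cycles = []
    for start, end in zip(triggers, triggers[1:]):
        cycles.append([(t, v) for (t, v) in samples if start <= t < end])
    return cycles

# With a ~30 s cycle time, 60 cycles would give triggers = [0, 30, 60, ...]
```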

**Figure 4.** Wearable motion tracking system configuration.

**Figure 5.** Posture angles trends over one working cycle for: trunk (**a**), arms (**b**) and elbows (**c**).

#### *3.2. Data Transferring and Simulation*

Data transferring is strongly influenced by the software chosen to perform the simulation. Most commercial human simulation software packages have interfaces that enable directly connecting supported external devices, such as Kinect® or XSens®, to the simulation environment. Alternatively, several packages allow implementing customized plugins to read data from external devices. To simulate the proposed working activity, the software Tecnomatix Process Simulate by Siemens®, version 15.0.1, was used. It is an excellent solution for industrial process simulation, including a human simulation module which allows performing very accurate simulations of manual tasks. Moreover, several codes able to evaluate ergonomic indexes are already implemented, and it is also possible to create customized routines to implement other codes, plugins or interfaces. In this regard, a custom plugin has been developed in Visual C# to load the collected data. The plugin reads data stored in a CSV file, transfers them to the DHM and, hence, creates compound operations, according to the experimental data, by using the function "CreateHumanCompoundOperation".

The pseudocode for data transferring can be found in Fera et al. [4].

For this study, the data have been re-sampled in order to reduce the number of micro-operations and, hence, the computational cost. Figure 6 shows the interpolating curve of the sampled data related to the posture angle trends shown in Figure 5.
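The re-sampling step can be sketched as linear interpolation onto a coarser, uniform time grid; the original sampling rate and the number of keyframes below are hypothetical (the paper does not report them):

```python
def resample_trace(t, values, n_samples):
    """Down-sample a posture-angle trace onto a uniform grid of n_samples
    points (n_samples >= 2) by linear interpolation, reducing the number of
    micro-operations transferred to the simulation. Assumes t is strictly
    increasing."""
    t0, t1 = t[0], t[-1]
    out_t, out_v, j = [], [], 0
    for k in range(n_samples):
        tk = t0 + (t1 - t0) * k / (n_samples - 1)
        # advance to the source interval [t[j], t[j+1]] containing tk
        while j < len(t) - 2 and t[j + 1] < tk:
            j += 1
        w = (tk - t[j]) / (t[j + 1] - t[j])
        out_t.append(tk)
        out_v.append(values[j] + w * (values[j + 1] - values[j]))
    return out_t, out_v

# e.g. a 30 s cycle acquired at 100 Hz (3000 samples) reduced to 60 keyframes
```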

Motion data related to all the 60 investigated working cycles have been transferred and, hence, all the working cycles have been simulated.

**Figure 6.** Posture angles smooth trends over one working cycle for: trunk (**a**), arms (**b**) and elbows (**c**).

#### *3.3. Ergonomic Assessment*

At the end of the simulation, the numerical data are analyzed to perform the ergonomic assessment. As stated in Section 2.3, the main sources of injury risk are the postures assumed by the workers during each cycle of the work shift, the exerted forces, MMH and repetitive actions; thus, each of these classes has to be investigated by means of a different ergonomic assessment method.

In this case study, the methods chosen to evaluate each ergonomic category are the most suitable for the type of working activity performed. They are listed below, with a short description:

• Working Postures: OWAS (Ovako Working Analysis System) method. It allows evaluating whole-body postures [37]. The method consists in analyzing the postures, to each of which a risk class is assigned based on the values of the postural angles. There are four classes (no risk, low risk, medium risk and high risk, respectively). The index (*I*) for the whole cycle is evaluated according to the frequencies (*a*, *b*, *c*, *d*) with which the four risk classes occur, by means of the following equation:

$$I = [(a \cdot 1) + (b \cdot 2) + (c \cdot 3) + (d \cdot 4)] \cdot 100 \tag{2}$$

• Manual Material Handling: NIOSH (National Institute for Occupational Safety and Health) lifting equation, which is useful when manual lifting of loads is carried out [38]. The method is applied only if the weight of the handled object is higher than 3 kg. It assigns a score based on the vertical and horizontal displacements with which the object is handled, as well as on the frequency of action and the type of grip. The Lifting Index (*LI*) is given by the ratio between the Loaded Weight (*LW*) and the Recommended Weight Limit (*RWL*):

$$LI = LW/RWL \tag{3}$$

where *RWL* depends on: the worker's gender and age, the vertical displacement of the object, the maximum horizontal distance between the object and the body, the angular displacement of the object with respect to the sagittal plane, the grip mode and the lifting frequency;
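Equations (2) and (3) can be sketched directly in code. The class frequencies and the RWL value below are hypothetical illustrations (the RWL multipliers come from the NIOSH tables and are not reproduced here):

```python
def owas_index(a, b, c, d):
    """Equation (2): OWAS cycle index from the frequencies (fractions of the
    cycle, a + b + c + d = 1) of the four risk classes."""
    return (a * 1 + b * 2 + c * 3 + d * 4) * 100

def niosh_lifting_index(loaded_weight_kg, rwl_kg):
    """Equation (3): NIOSH Lifting Index as the ratio between the loaded
    weight (LW) and the Recommended Weight Limit (RWL)."""
    return loaded_weight_kg / rwl_kg

# A cycle spent 80% in class 1 and 20% in class 2 gives an index of ~120;
# a hypothetical 6 kg load with RWL = 6.25 kg gives LI = 0.96.
```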


Table 2 reports the risk areas for the selected indexes.


**Table 2.** Risk areas for the selected indexes.

It is worth noting that Equation (1), needed to perform the control phase of the ergonomic indexes, is not applicable to all the chosen methods. In fact, Equation (1) applies only when the index varies across the investigated cycles.

In this application, the OWAS and OCRA (OCcupational Repetitive Actions) indexes vary along the working cycles, since fatigue, distraction or other factors may affect the working postures assumed by the worker; in contrast, the exerted forces and the NIOSH index remain the same across cycles, since they depend on the positions of the objects and on their weights, which do not vary.

#### 3.3.1. Indexes Evaluation

In this section, the indexes evaluation is presented. The indexes have been evaluated by means of tools implemented in Tecnomatix Process Simulate, except for the OCRA checklist, which has been filled in using data provided by the simulation.

**Figure 7.** OWAS index scores for each working cycle and their statistical values.

The blue dotted line represents the average value of the OWAS index (*I*<sub>µ</sub>), while the red lines represent the ±σ standard deviation values (*I*<sub>σ</sub>). Table 3 summarizes the statistical values and their comparison with the medium risk threshold value (*I*<sub>t</sub>).

**Table 3.** OWAS index score: statistical values.


By applying Equation (1), it is possible to verify that, as demonstrated by the results in Figure 7, it is widely satisfied:

$$I_{\sigma} = 8.3 < I_{t} - I_{\mu} = 53.1 \tag{4}$$

Thus, no further investigations are needed about working postures.

Regarding the exerted forces, there are four operations that require the application of force. They are related to the screwing operations (OP40 in Figure 3): the worker applies a counter-reaction force at the tightening end. The value of the force is the same for each screwing operation and in each working cycle, so there is no variability. It depends on the tightening torque (*T*), which is 30 Nm, and on the lever arm (*a*), which for the gun screwdriver used (Figure 8b) is 100 mm. The exerted force (*F<sub>EX</sub>*) is therefore given by the following equation:

$$F_{EX} = T/a = 300\ \mathrm{N} \tag{5}$$

The Force Solver tool in Tecnomatix Process Simulate makes it possible to evaluate the maximum applicable force, based on the posture and on the direction of application.

Since it is not possible to predict the direction of the force at the tightening end, the worst-case scenario has been considered, corresponding to the direction, in the transverse plane, along which the minimum value of the maximum applicable force occurs (Figure 8a). The maximum applicable force is equal to 44 N.

**Figure 8.** (**a**) Maximum applicable force. (**b**) Screwdriver lever arm.

Hence, the exerted force (*F<sub>EX</sub>*) is far higher than the maximum applicable force, so an improving solution is necessary.
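The force check of Equation (5) reduces to a one-line ratio compared against the Force Solver output; the values below are those reported above (30 Nm torque, 100 mm lever arm, 44 N limit):

```python
def exerted_force(torque_nm, lever_arm_mm):
    """Equation (5): counter-reaction force (N) at the tightening end, as the
    ratio between tightening torque (Nm) and lever arm (converted from mm)."""
    return torque_nm * 1000.0 / lever_arm_mm

MAX_APPLICABLE_N = 44.0  # Force Solver worst-case output (Figure 8a)

f_gun = exerted_force(30, 100)            # gun screwdriver, 100 mm lever arm
print(f_gun, f_gun > MAX_APPLICABLE_N)    # 300.0 True -> improvement needed
```

Since the force scales inversely with the lever arm, a screwdriver with a longer lever arm reduces the counter-reaction force proportionally.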

Concerning MMH, the NIOSH Lifting Index (*LI*) has been evaluated by considering two lifting operations that are performed in each task:


The NIOSH Lifting Index (*LI*) is constant throughout the working cycles and it is equal to:

$$LI = 0.96\tag{6}$$

Concerning the OCRA checklist, the scores vary along the cycles. Figure 9 shows the results for the right limb, which is the most stressed one.

**Figure 9.** OCRA scores for each working cycle and their statistical values.

The blue dotted line represents the average value of the OCRA score (*I*<sub>µ</sub>), while the red lines represent the standard deviation (*I*<sub>σ</sub>) values. Table 4 summarizes the statistical values and their comparison with the medium risk threshold value (*I*<sub>t</sub>).

**Table 4.** OCRA score: statistical values.

By applying Equation (1), it is possible to verify that, as demonstrated by the results in Figure 9, it is satisfied:

$$I_{\sigma} = 0.9 < I_{t} - I_{\mu} = 1 \tag{7}$$

Thus, as for the postures, repetitive actions do not need further investigation, even if in this case the values are borderline between the low risk and medium risk areas. This is also deducible from Figure 9, in which more than one cycle has an OCRA score within the medium risk area.

#### 3.3.2. Critical Issues

Once the results about the ergonomic indexes have been obtained, according to the methodological framework in Figure 1, if one or more indexes do not fall within the low risk area, it is appropriate to investigate which sub-phase or specific characteristic of the working cycle contributes most to the value of the index.

The analysis of the results of the described case study shows that the values of the exerted forces and of the lifting index (NIOSH) exceed the threshold values of the low-risk area; hence, an intervention is necessary to reduce the injury risk due to biomechanical overload.

Regarding the exerted forces, the value of the counter-reaction force due to the tightening of the bolt exceeds the maximum exertable value. To decrease this force, a different type of screwdriver, with a longer lever arm, is necessary. An angle screwdriver, whose distance between the spindle and the actuation button is greater than that of a gun screwdriver, will significantly reduce the value of the counter-reaction force absorbed by the worker's arm.

Regarding the lifting index, the NIOSH equation shows that the main contributions to the index are given by the vertical distance to be covered in handling and by the horizontal distance between the handled component and the body, which is excessive especially when the assembly is placed in the cart. In order to reduce the index, a reconfiguration of the workstation is necessary: lowering the shelf where the components are placed and raising the support surface of the assembly inside the cart.

The next section describes a possible modification of the workstation layout and of the type of screwdriver. A simulation will show how these changes contribute to reducing the value of the risk indexes.

#### *3.4. Proposal and Testing of Improving Solutions*

According to the procedure shown in Figure 1, this section aims at proposing workstation layout and equipment changes in order to reduce the values of risk indexes.

As stated in Section 3.3.2, the NIOSH lifting index can be reduced by:


For this purpose, the shelf has been modified: its height from the ground has been reduced from 1400 mm, as in the previous configuration, to 1170 mm. Since the workbench is 900 mm above the ground, the vertical dislocation has been significantly reduced.

The new cart has been designed to significantly increase the height of the support surface from the ground: from 500 mm to 1030 mm. In addition, an opening has been made to facilitate the positioning of the assembly. In this way the horizontal distance is reduced and the worker does not assume a posture with a large trunk flexion.
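A rough plausibility check of the layout change can use the standard NIOSH vertical-travel multiplier DM = 0.82 + 4.5/*D* (*D* in cm, floored at 25 cm). The travel distances below are derived from the heights quoted above; the calculation is illustrative only:

```python
def travel_multiplier(d_cm: float) -> float:
    """NIOSH vertical travel distance multiplier (higher is better)."""
    return 1.0 if d_cm < 25.0 else 0.82 + 4.5 / d_cm

# Shelf -> workbench lift: |1400 - 900| = 500 mm before, |1170 - 900| = 270 mm after.
dm_before = travel_multiplier(50.0)  # 0.91
dm_after = travel_multiplier(27.0)   # ~0.99

print(dm_before, dm_after)  # the larger multiplier raises the RWL, lowering LI
```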

Figure 10 shows the new workstation layout.

**Figure 10.** Workstation layout after equipment changes.

Concerning the counter-reaction force due to the screwing operations, the gun screwdriver has been replaced with an angle screwdriver (Figure 11) with a 300 mm lever arm (size "a"), a model very commonly found on the market.

**Figure 11.** Angle screwdriver; size of lever arm equal to "a".
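A one-line check confirms that the longer lever arm accounts for the force reduction reported below (same Equation (5), values from the text):

```python
torque = 30.0           # Nm, unchanged tightening torque
f_gun = torque / 0.100  # gun screwdriver, a = 100 mm -> 300 N
f_angle = torque / 0.300  # angle screwdriver, a = 300 mm -> 100 N

reduction = (f_gun - f_angle) / f_gun
print(round(f_angle), round(reduction * 100))  # ~100 N, ~67% reduction
```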

Figure 12 shows the working tasks after the workstation layout and equipment changes, simulated in Tecnomatix Process Simulate software environment.

**Figure 12.** Working tasks after workstation layout and equipment changes.

A simulation has been run in order to numerically evaluate the four selected risk indexes. Table 5 shows the risk index values for the new workstation configuration and their comparison with the values evaluated in the previous configuration.


**Table 5.** Risk indexes comparison after workstation layout changes.

The results show that, by modifying the workstation layout and equipment, the risk indexes are drastically reduced.

Concerning the critical issues that emerged in the previous analysis, the NIOSH index has been reduced by 38.5% and now falls within the low risk area. Concerning the exerted forces, despite a reduction of 66%, the counter-reaction forces still exceed the upper limit. A further solution could be to change the characteristics of the bolted joints so that a lower tightening torque is required.

However, a reduction in cycle time of about 4 s was also observed. Therefore, with the production volume unchanged (900 pieces per shift), the recovery time for each cycle would increase. This could balance, at least in part, the biomechanical load due to the exerted forces, which exceeds the maximum limit.

Finally, although it already fell within the low risk area, the OWAS index was also reduced by 10%.

#### **4. Discussion**

Ergonomic risk mapping of workstations is fundamental, as well as mandatory, for companies to sustain high efficiency and commercial competitiveness without compromising workers' health and safety. However, the assessment of risk indexes is still carried out by means of observational techniques. This implies that their evaluation becomes highly time consuming and, above all, affected by subjective considerations. This research was born with the aim of creating tools that can support experts in ergonomic screening and, above all, can ensure accuracy and repeatability of measurements. From this point of view, the DT can represent the best solution, as it allows exploiting computational capability by using numerical models in which real data, in this case related to human motion, are implemented.

Based on the DT, the novel methodological framework proposed herein has been developed to carefully investigate the ergonomics of manual working tasks in a manufacturing scenario. The case study has been introduced to prove its applicability and effectiveness.

As a consequence, the main implication for ergonomists is a significant reduction of the time for evaluating risk indexes, allowing them to focus mostly on identifying issues and proposing solutions. For data collection, the donning and calibration time of the wearable system must be considered, which typically takes 10–15 min; this is acceptable in comparison to the very short time, on the order of seconds, taken by the software for data analysis.

On the other hand, the proposed approach has some limitations, mainly due to the characteristics of the software chosen for the simulations. The analysis in the present study has been performed using Tecnomatix Process Simulate, one of the best solutions on the market. However, only some risk assessment methods are already implemented in the software (e.g., the OCRA index is not available), and this may require additional effort to implement the algorithm related to a specific index. Moreover, if a re-design of the workstation is needed, for example when issues emerge from the ergonomic screening, the alternative solution is not automatically provided but requires an analysis by an expert ergonomist. Lastly, in order to obtain a real-time analysis, and so a proper DT, the motion capture system must be able to process and transfer data in real time. For the proposed case study, a system requiring off-line processing was used.

The literature about the use of human DTs for the ergonomic evaluation of workstations is poor, as already mentioned in the introduction. However, especially regarding the application of numerical models for evaluating ergonomic indexes, some comparative considerations can be reported.

First of all, the methodological framework is perfectly in line with the evolution of the topic in the era of Industry 4.0, namely the use of hardware and software tools for evaluating ergonomic indexes [40].

The present study offers a direct link with experimental data and the use of a numerical model, with respect to the studies by Caputo et al. [13,14], which are focused on workstation design validation by considering ergonomics as a design parameter. The advantage offered by this novel methodological framework is the realization of a cyber-physical system in which the motion data of the worker are directly implemented in the simulation model. This offers the possibility to use a numerical model that, on the basis of experimental data, provides ergonomists with a fast and reliable ergonomic screening tool for monitoring real production.

Similarly, Bortolini et al. [41] carried out a complete ergonomic analysis by setting up an automatic procedure based on an optical motion capture system. They used kinematic data to evaluate several risk indexes (OWAS, REBA, EAWS, etc.), demonstrating how such procedures make the ergonomic assessment fast, reliable and objective. The methodology proposed herein has several advantages compared to [41]. Firstly, the use of a wearable motion capture system allows collecting data directly in the working environment, during the normal production shift, which would be complicated with optical devices that, although very accurate, are bulky and require fine calibration. Moreover, the possibility to observe the simulation of the working activity allows immediately identifying possible anomalies, and the simulation itself can be used as a training tool for other workers.

An analogous approach has been proposed by Grandi et al. [42], who propose a workflow for ergonomic assessment during design through virtual environments, with the possibility of using immersive reality devices. However, this procedure is also not completely appropriate, as it is not applicable in a factory environment, where, due to the tight spaces and the high focus that a work task requires, it is not easy to use immersive reality tools.

Zhang et al. [43] proposed a mathematical model for assessing ergonomics and optimizing the assembly line. Although it is a very interesting approach, the model considers only the OCRA index, evaluated according to traditional observational techniques.

The virtual simulation environment described in [17], based on an optical motion capture system, is a good tool for controlling working time and quality in a manual manufacturing process. It could benefit from the evaluation of the ergonomic indexes proposed here to improve the ergonomic performance in the manufacturing production scenario, even though a more performant motion capture device than the Microsoft Kinect should be adopted.

The methodology proposed in the present paper is a contribution towards an innovative way of ergonomic monitoring of manual workstations, and it also contributes to reducing the literature gap on the topic. Collecting experimental data from real workers, who perform their tasks during normal working shifts, is fundamental for a correct evaluation, especially in terms of objectivity and repeatability of measurements. In addition, by using cutting-edge technologies, it is possible to perform the ergonomic assessment in real time, with immediate feedback, allowing a significant reduction in evaluation time and, therefore, costs. This approach could be very beneficial for plant ergonomists and occupational physicians.

#### **5. Conclusions**

This paper presents a methodological framework, based on the Digital Twin, aimed at assessing the ergonomic performance in a manufacturing production scenario. Implementing motion data, collected during the working activities, in a virtual 3D scenario allows performing the ergonomic screening of the investigated workstation. In this way, it is possible to evaluate the desired risk indexes and identify possible production issues.

A case study regarding a simple assembly task has been conducted in a laboratory environment to demonstrate the effectiveness of the proposed framework. Data related to working postures have been collected by a wearable inertial motion tracking system for 60 consecutive working cycles and transferred to the Digital Twin.

In this way it has been possible to easily evaluate four risk indexes related to working postures, exerted forces, material manual handling and repetitive actions, sources of biomechanical overload.

The first simulation revealed issues related to exerted forces and material manual handling, whose values exceeded the upper limit of the low risk area. Hence, workstation layout and equipment changes have been proposed, and a further time-based simulation has been run to test the solutions. The subsequent ergonomic assessment showed a significant reduction of the risk indexes.

It is fundamental to underline that traditional ergonomic screenings, carried out in an observational way, are time-consuming procedures requiring several hours of work. The framework proposed herein allows drastically reducing the evaluation times as well as making the assessment objective and repeatable.

In summary, three important key elements can be pointed out in the present study:


The procedure described in this paper offers an improvement over current ergonomic screening techniques, but a bigger advantage will come from the use of devices capable of transferring data in real time, providing immediate ergonomic analyses.

**Author Contributions:** Conceptualization, A.G., M.C., M.F., S.G.; methodology, A.G., M.C.; software, A.G., M.C.; formal analysis, A.G., M.C., M.F., S.G.; investigation, A.G., M.C.; data curation, A.G., M.C.; writing—original draft preparation, A.G., M.C., S.G.; writing—review and editing, M.F., S.G.; visualization, A.G., M.C.; supervision, M.F., S.G. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** This work was supported by the University of Campania *Luigi Vanvitelli* under SCISSOR Project—V:alere program 2019.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Towards Integrated Digital Twins for Industrial Products: Case Study on an Overhead Crane**

**Juuso Autiosalo 1,\* , Riku Ala-Laurinaho <sup>1</sup> , Joel Mattila <sup>1</sup> , Miika Valtonen <sup>2</sup> , Valtteri Peltoranta <sup>3</sup> and Kari Tammi <sup>1</sup>**


**Featured Application: The initial version of a digital twin for an overhead crane with the main focus on providing data for machine designers and maintainers to support decision making and additional features for operation.**

**Abstract:** Industrial Internet of Things practitioners are adopting the concept of digital twins at an accelerating pace. The features of digital twins range from simulation and analysis to real-time sensor data and system integration. Implementation examples of modeling-oriented twins are becoming commonplace in academic literature, but information management-focused twins that combine multiple systems are scarce. This study presents, analyzes, and draws recommendations from building a multi-component digital twin as an industry-university collaboration project and related smaller works. The objective of the studied project was to create a prototype implementation of an industrial digital twin for an overhead crane called "Ilmatar", serving machine designers and maintainers in their daily tasks. Additionally, related cases focus on enhancing operation. This paper describes two tools, three frameworks, and eight proof-of-concept prototypes related to digital twin development. The experiences show that good-quality Application Programming Interfaces (APIs) are significant enablers for the development of digital twins. Hence, we recommend that traditional industrial companies start building their API portfolios. The experiences in digital twin application development led to the discovery of a novel API-based business network framework that helps organize digital twin data supply chains.

**Keywords:** digital twins; crane; machine design; integration; maintenance; operation; API; open source

**Citation:** Autiosalo, J.; Ala-Laurinaho, R.; Mattila, J.; Valtonen, M.; Peltoranta, V.; Tammi, K. Towards Integrated Digital Twins for Industrial Products: Case Study on an Overhead Crane. *Appl. Sci.* **2021**, *11*, 683. https://dx.doi.org/10.3390/app11020683

Received: 23 October 2020; Accepted: 21 December 2020; Published: 12 January 2021

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

Digital twin (DT) represents a new paradigm for the Industrial Internet of Things. DTs are linked to a real-world counterpart and leverage several technologies and other paradigms, such as simulation, artificial intelligence, and augmented reality, to optimize the operation of their counterparts. DTs are being built at an accelerating pace in both industry and academia, and the digital twin term has reached multiple expressions of recognition, such as being among the IEEE Computer Society's Top 12 Technology Trends for 2020 [1]. The roots of the digital twin concept are in mirroring physical systems, as exemplified by two seminal publications: Grieves and Vickers [2] described digital twins as a set of virtual information describing a potential or actual physical manufactured product, and NASA [3] described the DT as an integrated ultra-realistic simulation that combines physical models and sensor data. The purpose and use cases of the concept have since been elaborated in leading publications [4–6]. These concentrate on various areas of mechanical engineering, a trend continued by the majority of digital twin applications, as shown by multiple review articles [7–12]. More recently, other domains, such as healthcare [13], business [14], and buildings [15], are starting to adopt digital twins as well. For a more in-depth exploration of the background of the digital twin concept, we refer to Section II in Autiosalo et al. [16].

Several earlier studies make recommendations for digital twin research. There seems to be a lot to be improved, as Liu et al. [11] conclude their review by stating that current digital twin literature cannot be inherited by other researchers. Other studies point out specific issues, including both technical aspects as well as considering humans as crucial actors in twin development. Marmolejo-Saucedo et al. [17] list the integration of information technology and the integration of partner companies as research issues. Barricelli et al. [10] list the cost of development and human interaction as challenges and call for a sociotechnical and collaborative approach in designing digital twins. Holler et al. [18] identify three topics for future research agenda: information architectures and models, specific applications, and the role of the human. Tao and Qi [19] recognize the need for experts from several disciplines and the need to make building digital twins easier. They also state that there should be physical 'innovation hubs' that are accessible to experts from different fields. Parmar et al. [14] recognize people and their skills as an essential factor in adopting digital twins. We condense these research issues into the following research question: "How to build integrated digital twins for industrial products?" This study aims to answer this question through a case study on the digital twin development journey of an industrial overhead crane.

Despite the high number of manufacturing-related DT publications, only a few DTs have been developed specifically for cranes. Moslått et al. [20] developed and validated a simulation-and-control focused digital twin for an offshore crane for lift planning purposes. Moi et al. [21] experimentally verified a real-time simulation of strain on a small-scale boom crane to prove that simulation-based virtual sensors can be used for condition monitoring. Szpytko and Duarte [22] developed a statistical decision-making model to efficiently schedule maintenance breaks for port gantry cranes. Additionally, an overhead crane is mentioned as a part of an ontology-focused roll grinding simulation configurator [23].

The main research data of this study comes from the digital twin development journey of an overhead crane "Ilmatar" [24]. Most of the development was implemented as an industry-university research project called "DigiTwin" which is referred to as "the project" in this study. Many parts of the project have been published earlier as separate works, and this study combines these components together to form the whole digital twin of the crane as depicted in Figure 1 and gathers together the development experiences to draw practical DT development recommendations. Hence, the key contributions of the paper are:


**Figure 1.** Overview of the digital twin components developed for the overhead crane. The black arrows depict data flow through a technical Application Programming Interface (API) and gray arrows are human-machine interfaces. The full arrowhead depicts primary information flow and half arrowhead secondary information flow, such as control. Each component is further described in the mentioned section and the APIs are further described in Section 3.15.

The digital twin of the Ilmatar crane is not yet finished as most of its components are separate from each other from the user perspective. Ongoing research and development efforts concentrate on integrating these components together. Meanwhile, this study shares the experiences of one digital twin development journey, contributing its share to the overall body of knowledge.

#### *1.1. Hypotheses*

Eight hypotheses on digital twins and related matters were made during the preparation and initial phases of the project. The first six are common assumptions mentioned in the project plan, the seventh was generated previously as a result of a practical study [25] in the same physical crane environment, and the eighth was a commonly agreed development direction at the start of the project. The hypotheses are:


8. Digital twin can be built without selecting a central visualization and simulation model.

The hypotheses were used as guiding principles during the project rather than being taken under specific examination, as the project concentrated on building practical use cases. The hypotheses are included in this paper to represent the priors and motivation of the work as well as to act as a tool for discussing the results of the project in Section 4.2.

#### **2. Methods and Materials**

We used two research methods for this study: Participation Action Research (presented in Section 2.1) to gather data from the industry–university collaboration project, and the basic principles of Grounded Theory (Section 2.2) to draw conclusions from the data. Main materials used for the study include the overhead crane Ilmatar (Section 2.3), its data interface (Section 2.4), and an IoT platform (Section 2.5).

#### *2.1. Participatory Action Research*

Participatory Action Research (PAR) [26] refers to a research data acquisition method in which the researchers themselves participate in the activity they are analyzing. PAR is a subcategory of action research and has been used in various fields of social science research for decades. PAR enables rich data collection but brings in the risk of researcher bias.

This study used PAR to acquire data from the development process of the digital twin of the Ilmatar crane. We focus our analysis on two types of qualitative data: technical data on the contents of the digital twin, and socially oriented data on how the development process of that digital twin was implemented. The technical data mainly serves as an example of what a digital twin of an industrial product can be, and the development process data are used to derive the steps that need to be taken when implementing a digital twin.

A significant portion of the digital twin development was made as a single collaborative industry–university research and development project called "DigiTwin". The project was implemented at Aalto University with four funded industrial partners and one supportive industrial partner that provided resources and attended meetings. Additionally, some activities were performed outside the project.

The data of the development journey were collected in the following ways: observation through participation, project meeting memos and other materials (e.g., internal presentation slides and emails), and presentation recordings of two seminar sessions. The data from the first two methods are confidential, whereas the recordings are publicly available on YouTube (https://www.youtube.com/channel/UCJrkhYovV4V-PwqlJeW5bmQ/videos). Furthermore, a master's thesis [27] on the dynamics of university–business cooperation is used as supportive material.

#### *2.2. Grounded Theory*

Grounded Theory [28] is a theory generation process that includes three main phases: (i) collect (qualitative) data, (ii) organize the data into categories, and (iii) analyze the relations between these categories to generate theories. Ideally, each category has multiple data points, and the same relation holds between the majority of the data points of any two categories.

In this study, we take a similar approach as Autiosalo et al. [16] in leveraging Grounded Theory, meaning we leverage the basic principles instead of strict coding practices. As a comparison to this approach, Josifovska et al. [29] implemented Grounded Theory with three distinct coding phases to identify main building blocks and their properties for digital twins to further create a digital twin reference framework.

Induction is used as a supportive theory generation method. When using induction, we observe a situation that causes an effect and afterward reason a rule stating that a certain situation leads to the observed effect. In this work, we observed causes and effects during the DT development project and could induce that easy-to-use (quick to learn and set up) programming interfaces are a requirement for efficient digital twin application development. The amount of data covered in this study is not yet enough to call this a theory.

#### *2.3. Overhead Crane Ilmatar*

Aalto University has a full-size industrial overhead crane called "Ilmatar" (Figure 2) installed in one of its laboratory spaces [24]. The crane is manufactured by Konecranes and has several smart features, such as target positioning, sway control, load floating, and snag prevention. The lifting capacity of the crane is 3200 kg, its installed movement area is 9.0 m by 19.8 m, and its maximum speed is 32 m/min. The crane is connected to the Internet, and several statistics about the use of the crane are sent to the vendor cloud and a third-party IoT platform. A subset of the crane's Programmable Logic Controller (PLC) system is linked to an Open Platform Communications Unified Architecture (OPC UA) server, which provides a two-way (read and control) interface to crane data. The crane does not perform regular production line tasks, and therefore its usage profile differs from its industry counterparts, with lower usage activity and seldom high-load lifts.

**Figure 2.** Overhead crane Ilmatar at Aalto University Industrial Internet Campus.

The crane serves as a research, innovation, and education platform and has been used in collaboration with companies as well as in student projects. The crane development environment was published as "Ilmatar OIE" (Open Innovation Environment) in November 2019 and is further described in Section 3.1. The environment is open by default, with some of the resources publicly available and some available after manually handled registration.

Ilmatar crane is a special case among cranes and industrial products for two main reasons. First, it sees an extraordinarily large number of research, development, and education activities and therefore creates a lot of unstructured data. This is highlighted by the number of cases presented in Sections 3.7–3.14. Second, being located in a student laboratory of a university, it is primarily public by nature in terms of both physical access and data sharing. Experiences from publishing results from the crane environment are described in Section 3.16.

#### *2.4. OPC UA Interface of the Crane*

OPC UA is an industrial standard for communication continuously developed by OPC Foundation [30]. It defines information, message, communication, and conformance models to enable interoperability, is platform-independent, and supports several communication protocols and data formats [31]. OPC UA has become a common standard especially among PLC manufacturers and is an important enabler for Industrial Internet applications. It is used mainly inside local factory networks.

The OPC UA server of the crane allows monitoring its current status variables, such as position and speed. The crane can also be controlled via the interface. Description of the OPC UA interface is currently available in Ilmatar OIE via sign-up [32]. The server itself is not publicly on the Internet, but in a password protected network in the laboratory hall. Authentication is not currently required to access the OPC UA server, but to control the crane, a specific access code has to be written periodically to a certain node. Modifications to the OPC UA server, such as adding new nodes, are made by manufacturer technicians to ensure safety.
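An access pattern like the one described above can be sketched with the open-source `asyncua` OPC UA client library. This is an illustrative sketch under stated assumptions: the endpoint URL and node identifiers are placeholders (the real server sits in a password-protected laboratory network), and `asyncua` is not necessarily what the project used:

```python
import asyncio

# Hypothetical endpoint and node ids, for illustration only.
ENDPOINT = "opc.tcp://crane.local:4840"
POSITION_NODE = "ns=2;s=Bridge.Position"
ACCESS_CODE_NODE = "ns=2;s=Control.AccessCode"

async def read_bridge_position() -> float:
    """Read one status variable from the crane's OPC UA server."""
    from asyncua import Client  # third-party client (pip install asyncua)
    async with Client(url=ENDPOINT) as client:
        return await client.get_node(POSITION_NODE).read_value()

async def keep_control_alive(code: str, period_s: float = 5.0) -> None:
    """Control requires periodically writing a specific access code
    to a certain node; this loop keeps it refreshed."""
    from asyncua import Client
    async with Client(url=ENDPOINT) as client:
        node = client.get_node(ACCESS_CODE_NODE)
        while True:
            await node.write_value(code)
            await asyncio.sleep(period_s)
```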

#### *2.5. IoT Platform: MindSphere*

The IoT platform "MindSphere" by Siemens [33] has been used for data gathering from the crane since its installation. Data gathering is performed by a physical MindSphere-specific gateway, "MindConnect Nano" [34], which reads data from the OPC UA server of the crane. MindSphere was also used for building the bearing lifetime estimation application shown in Section 3.7. MindSphere is built on top of the open-source cloud platform "Cloud Foundry", from which it gets its core capabilities, while the practical productization and usability solutions are MindSphere-specific. The MindSphere instance used with the crane was changed from version 2 to version 3 during the project.

#### **3. Results**

This section presents the practical components of the digital twin and the essential related material. These include a documentation solution, tools, frameworks, and case descriptions related to the digital twin development journey of the Ilmatar crane. Most of the results have been described earlier in academic publications or project seminars, which are referred to at the start of each section where applicable. This study adds any necessary details from the overall DT development perspective. The level of technical detail in the results is kept sparse, concentrating on providing a meta-description as the basis for a discussion on the development of (integrated) digital twins.

All results are purpose-specific assets built for the overhead crane Ilmatar. Some of the results were intentionally built as part of the digital twin, while others were separate development efforts connected to the crane. The criteria for including them in this study is that they could be included as part of the digital twin of Ilmatar in the future.

#### *3.1. Documentation: Ilmatar Open Innovation Environment*

Ilmatar OIE is a combination of the digital and physical resources of the crane environment. A single public web page [32] acts as a start page, including a basic description of the environment and a list of resources. The page provides links to the rest of the published digital resources, which are either in academic publications, GitHub, or the web workspace "Eduuni" that requires registration. In addition, many unpublished works have been developed in the environment, and some of those are mentioned on the web page. Physical resources are currently not very extensively documented, and guidance on those rather relies on personal on-location instruction. This is a fairly natural choice, as the crane is a full-scale industrial device with the potential to cause extensive damage to its environment even with its safety features. Each crane user has to complete safety training before operating the crane.

The environment balances what to include as public resources based on three main factors: (1) administration effort, i.e., how much time can be used to publish and update the materials; (2) user experience, i.e., comprehensiveness and ease of access to the materials; and (3) safety, i.e., physical safety, cybersecurity, and protecting intellectual property. The balancing is currently done on the fly, allowing natural development driven mainly by the number of users.

Ilmatar OIE is currently the most comprehensive public collection of information dedicated to the Ilmatar crane. It was not originally made to be a part of the digital twin, but the amount of meta-knowledge it contains makes it relevant also from the digital twin perspective; Ilmatar OIE partially fulfills the "Data link" feature described by Autiosalo et al. [16] by providing a collection of useful links to various resources of Ilmatar crane.

#### *3.2. Tool: OSEMA*

Open Sensor Manager (OSEMA) is an open-source web platform that enables the setup and modification of settings on microcontroller-based sensors. It was developed as a practical response to the multitude of existing IoT protocols and communication methods noticed during master's thesis work by Ala-Laurinaho [35]. The work was further developed and published in a journal article [36]. Currently, OSEMA has no publicly available instance, but users can install it to their own server with the source code available at GitHub [37].

OSEMA can be used for the no-code setup of sensors and therefore enables effortless retrofitting of sensors to the Ilmatar crane, aiding in data collection. After sensor installation, the manager web page can be used to change the configuration of the sensor nodes, including measurement settings and network parameters, remotely over the Internet. OSEMA offers a web user interface and a Representational State Transfer (REST) API for the monitoring and management of the sensor nodes. Ilmatar was equipped with OSEMA-managed distance sensors tracking the location of the bridge and trolley. In addition, a 3-axis accelerometer was installed on the hook of the crane; this sensor was used in the usage roughness indicator application presented in Section 3.8.
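
As an illustration of the remote-management idea, the following Python sketch composes a configuration payload for a sensor node and sends it to a manager instance over HTTP. The endpoint path and payload fields are hypothetical, written for illustration only; they do not reproduce the actual OSEMA API.

```python
import json
import urllib.request

# Illustrative sketch of remote sensor-node management over a REST API.
# The URL scheme and payload structure below are assumptions, not the
# real OSEMA interface.

def build_sensor_config(sample_rate_hz, wifi_ssid, server_url):
    """Compose a configuration payload for a sensor node."""
    return {
        "measurement": {"sample_rate_hz": sample_rate_hz},
        "network": {"wifi_ssid": wifi_ssid, "data_server": server_url},
    }

def update_sensor(base_url, node_id, config):
    """Send the new configuration to the manager over HTTP PUT."""
    req = urllib.request.Request(
        f"{base_url}/api/sensors/{node_id}/config",
        data=json.dumps(config).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

if __name__ == "__main__":
    cfg = build_sensor_config(100, "lab-wifi", "https://data.example.org")
    # update_sensor("https://osema.example.org", "hook-accelerometer", cfg)
```

The point of such a manager is that only the configuration payload changes per node; the sensor firmware itself stays generic.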

OSEMA allows data collection from the real-world entity, which is a crucial part of the digital twin concept. Therefore, the sensor manager is considered an enabler building block for the implementation of digital twins. OSEMA can be used as the "Coupling" feature of FDTF (described in Section 3.4), supporting the creation of integrated digital twins with its REST API.

#### *3.3. Tool: OPC UA–GraphQL Wrapper*

OPC UA–GraphQL wrapper is a server application that connects to an OPC UA interface to provide it as a GraphQL interface. The wrapper was developed as a master's thesis by Hietala [38] and presented and evaluated in a conference publication [39]. The source code is published as open source in GitHub [40]. A GraphQL wrapper was installed to the Ilmatar crane on a Raspberry Pi, and an example control application was made for it, presented in Section 3.11.

GraphQL is a query language and execution engine open-sourced by Facebook in 2015 [41]. Since then, GraphQL has become a popular query language among developers [42]. GraphQL overcomes multiple shortcomings of popular REST APIs and is seen as a successor to them [43].

The OPC UA–GraphQL wrapper was developed after noticing a hindrance in the crane interface: OPC UA, even though being a common standard in the industry, is not familiar to web software developers and it is fairly complicated compared to, for example, REST APIs. The OPC UA–GraphQL wrapper makes data from the OPC UA server of the crane easier to access, therefore facilitating software development for the Ilmatar. The OPC UA–GraphQL wrapper also comes with a built-in node viewer, so a user can use any standard web browser to click through the nodes and their current values. As a downside, not all features (e.g., publish-subscribe) of OPC UA are yet supported.
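
To illustrate why such a wrapper eases web development, the following Python sketch posts a GraphQL query that reads a single node value from the wrapper endpoint. The schema (field and argument names) and the node identifier are assumptions for illustration, not the actual wrapper schema, which mirrors the OPC UA address space of the crane.

```python
import json
import urllib.request

# Illustrative GraphQL client for an OPC UA-GraphQL wrapper.
# Field names ("node", "nodeId", "value") are assumed for this sketch.

def node_value_query(node_id):
    """Build a GraphQL query payload that reads one OPC UA node value."""
    return {
        "query": "query($id: String!){ node(nodeId: $id){ value } }",
        "variables": {"id": node_id},
    }

def post_query(endpoint, payload):
    """POST the query as JSON, as any web client would."""
    req = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    q = node_value_query("ns=2;s=Crane.Hoist.Position")
    # post_query("http://raspberrypi.local:4000/graphql", q)
```

A plain HTTP POST with a JSON body is all a developer needs, which is the accessibility gain over a full OPC UA client stack.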

The wrapper represents an important phenomenon in the development of integrated digital twins: adapters. It takes some of the development burden away from application development. While OPC UA might be technically possible to implement in any project, it may not be feasible in projects with a limited time frame, for example, due to the lack of libraries for some programming languages. The wrapper is also located between two cultures: OPC UA servers are typically installed in closed intranet networks in factories, whereas GraphQL servers are typically available via the public Internet. The benefits of networked digital twins can only be achieved if the twins can send messages to each other, for which the public Internet currently seems the most prominent option. Hence, data adapters that couple operational data with the Internet are important enablers for digital twins, although security solutions still need to be developed before the public Internet can be used.

#### *3.4. Conceptual Framework: Feature-Based Digital Twin Framework*

The feature-based digital twin framework (FDTF) is a framework that aims to deepen the conceptual understanding of digital twins by identifying features of digital twins and combining them into building blocks of digital twins. FDTF was developed during Ilmatar digital twin development, published as a journal article by Autiosalo et al. [16], and presented in a project seminar [44].

The features and building blocks of digital twins are put into practice by software components. One software component usually fulfills more than one feature, reflecting the fact that the features exist at different hierarchical levels. The suggested features are data link (which connects the features together); coupling, identifier, and security (enablers); data storage, user interface, and computation (resources); and simulation model, analysis, and artificial intelligence (producers). The purpose of these features is to shift the focus of conceptual digital twin development away from existing tools and toward the development of novel, digital twin-specific solutions. In the proposed process of digital twin development, one first defines a use case based on a business need, then selects the features needed to accomplish it, and lastly implements a digital twin with software components that provide the desired features. Examining the building blocks of a digital twin from the feature perspective makes redundancies evident, helping clarify the tangle created by software components that were not originally made for digital twin implementation.

The components are tied together with a data link, which is being developed into a ready-made software tool in a follow-up project. There are also several partial implementations of this conceptual approach, such as the Message Queuing Telemetry Transport (MQTT) broker [45], the whole Semantic Web approach [46], API gateways by different providers, the digital twin definition language [47], and the DT Core presented in Section 3.5. These represent practical work towards integrated digital twins, and the data link concept is a call to combine these approaches to create more comprehensive digital twins.

Defining digital twins was seen as a necessary activity before starting digital twin development. FDTF was developed as a response to this need. It aims to help understand the nature of digital twins, so that they can be developed further as a concept, rather than staying limited to the old tools and ways of working. FDTF highlights the difference between visionary conceptual development and actual implementation with existing software.

#### *3.5. Implementation Framework: DT Core*

DT core is a modular building block for DT data processing. It was presented in a project seminar by Valtonen [48]. It is a step from FDTF towards DT application implementation, such as the one described in Section 3.9. DT core represents the state of a twin and consists of data storage, logic, and interfaces.

The data storage contains state-descriptive data, metadata, and management data. The descriptive data form the main information contents of the DT, consisting of "hard" data such as 3D models, component structures, and numerical IoT data. Metadata describes the meaning of and interpretation guidelines for the hard data, such as value ranges and units. Management data provide higher-level insight into the contents of the DT, such as prioritization if data come from multiple sources, parameters for data processing, a history of state changes, and also a prediction of the future state from a historical viewpoint. The management data serve the logic side of the DT core.

The logic of the DT core creates additional value through data refinement, application-level logic, and processing management. Data within the storage can be refined with methods such as inference trees, AI models, and key performance indicator calculations. Application-level logic serves use cases through automatic reprocessing of data and signal abnormality detection. Processing management keeps data up to date by sustaining a processing heartbeat and may include alerts that are modified according to user feedback.

The interfaces bring data in and out of the DT core. They consist of query and storage management and security policies. Several query protocols can be supported to apply CRUD (create, read, update, and delete) operations, stream and batch-based updates, and standard Structured Query Language (SQL) operations to the storage and analytics parts of DT core. Each query faces security policies and is authenticated when needed. By enabling data exchange between multiple external systems, the interfaces of the DT core can be used to create an active data ecosystem.

Layering is used as a way to manage the time frame of actions for different types of DTs. The physical twin operates in real-time and the layers are divided by the heartbeat of their update frequency. Data on the layers are updated in an IoT platform, asset DT, fleet DT, and strategic DT, which leverage 1 s, 1 min, 1 h, and 1 day heartbeats, respectively (Figure 3).

**Figure 3.** The layers of the DT core framework with update rate getting less frequent as data flows from physical twin to strategic decision-making level. Examples of use cases are given on the right.
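
The layering idea can be sketched in a few lines: each layer refreshes on its own heartbeat, so a lower layer always updates at least as often as the layers above it. The code below is a minimal illustration of the update schedule, not part of the DT core implementation.

```python
# Minimal sketch of the layered heartbeat scheme: each layer refreshes
# at its own interval (1 s, 1 min, 1 h, 1 day).

HEARTBEATS_S = {          # update interval per layer, in seconds
    "IoT platform": 1,
    "asset DT": 60,
    "fleet DT": 3600,
    "strategic DT": 86400,
}

def layers_due(elapsed_s):
    """Return the layers whose heartbeat fires at the given second."""
    return [name for name, beat in HEARTBEATS_S.items()
            if elapsed_s % beat == 0]
```

For example, at the 60-second mark, the IoT platform and asset DT layers both update, while the fleet and strategic layers wait for their slower heartbeats.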

DT core is currently an implementation framework for building digital twins that are focused on data analysis. It was used as a guiding principle in the development of the brake condition monitoring application presented in Section 3.9.

#### *3.6. Information Framework: Digital Twin-Based Product Lifecycle Management*

The digital twin-based product lifecycle management (DT-PLM) framework is a way to categorize the engineering content created from the initial idea to the product use phase. The framework was presented in a project seminar by Pantsar and Mäkinen [49]. The vision was used as a guiding principle during the project although, as it is an extensive framework, only selected parts of it could be implemented.

DT-PLM contains three categories: DT data, DT intelligence, and the real world, as shown in Figure 4. The DT data contain static representations of the product, whereas the active features in the intelligent part, such as simulation, bring digital twins to life, i.e., the DT becomes an active agent in cyberspace. The real world joins the lifecycle as the first prototype tests are performed and mirrored during the use phase of the product.

**Figure 4.** Digital twin-based product lifecycle management from product idea to product in use. The accumulating knowledge is used to develop and improve the product and to learn how to make future products better [49].

The original design data are referred to as "product DNA". Storing the DNA is the first step; the second is to use it in various tasks during the product lifecycle. For example, the DNA can be used to predict problems by looking at the DNA shared by a product family. If the DNA is in a machine-readable format, it can be leveraged as an active part of the usage-phase digital twin system. For example, a system simulation can be used to provide virtual sensor information when plugged into the real counterpart. As related work, the product DNA concept has also been used by Silvola [50] with a similar meaning, although focusing especially on the uniqueness of the DNA.

DT-PLM framework assists in understanding the practical information contents of digital twins from a product development perspective. This is an important perspective as a lot of potentially useful qualitative data are created during the product design phase. A challenge is that a lot of the design data are confidential and cannot be given even to buyers of the product. Currently, most of the design data are also prepared only for internal use, making them potentially unusable for others. A very strong business case would be needed for transferring the data to outsiders, but when the manufacturer is also the maintenance provider, using the design data in later phases of the lifecycle can be a gradual process.

#### *3.7. Case 1: Bearing Lifetime Estimation*

The bearing lifetime estimation case consists of a method and a proof-of-concept implementation for the closed-loop design of crane bearings, performing automated analysis on crane usage data and bringing the results to a system used daily by product developers. The majority of the results described in this subsection were presented earlier in a project seminar [51] by Peltoranta and Autiosalo. A detailed description of the data analysis and visualization was given in a bachelor's thesis by Mattila [52].

The data flow for the case travels through several technical tools: (i) the physical crane with sensors, a Programmable Logic Controller, and an OPC UA server; (ii) a physical IoT gateway; (iii) a cloud-based IoT platform with data storage, custom engineering formulas, and web technologies (e.g., JavaScript); and (iv) a PLM system. The crane produces the raw data and provides an OPC UA interface for the IoT gateway. The gateway sends selected usage data to the IoT platform, which stores the data, performs customized analysis (kinematics-based virtual sensing and bearing lifetime calculations) on the data, and hosts a visualization that is displayed in the PLM system. The data flow is shown in Figure 5.

**Figure 5.** The data flow of the bearing lifetime estimation case. Designing a new crane is the ultimate purpose of providing the data, but this step was not implemented during this study.

Collected raw data included the vertical position and the load of the crane, which were further refined with kinematic equations to determine the revolutions and load inflicted on each rope sheave bearing. The crane has three rope sheaves, each with a distinct number of revolutions. During the project, data collection for this implementation was straightforward with existing industrial tools, but presenting the data in a meaningful way proved to be challenging.

The methodology for this case combines engineering design knowledge with practical data flow implementation tools, from raw data generation to the end-user visualization system. According to basic engineering dimensioning formulas, the lifetime estimations of ball bearings are based on usage frequency and stress. When these data are obtained from the crane, the designed lifetime and the usage-based predicted lifetime of the component can be compared. For the example crane Ilmatar, the usage-based lifetime prediction for the bearings was approximately a thousand years due to the low usage frequency and light loading exerted on a crane used for research and education. With the lifetime prediction, a machine designer can determine whether the component has been selected correctly. This knowledge can be leveraged to design new cranes that better fit their expected usage profile. Of course, the machine designer can perform such estimations themselves, but the automated analysis saves their time for more demanding tasks; with large fleets, manual estimations may be infeasible. As future development, the automated analysis could be used to trigger alerts if a crane is used more than expected.
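
As an illustration of the dimensioning formulas mentioned above, the basic rating life of a rolling bearing (ISO 281), L10 = (C/P)^p million revolutions with p = 3 for ball bearings, can be combined with a measured usage rate to obtain a usage-based lifetime estimate. The Python sketch below uses this standard formula with purely illustrative numbers; the actual formulas and parameters used in the case may differ.

```python
# Usage-based bearing lifetime estimate from the ISO 281 basic rating
# life formula. All numeric inputs here are illustrative examples.

def l10_revolutions(C_n, P_n, p=3.0):
    """Basic rating life in revolutions.

    C_n: basic dynamic load rating [N], P_n: equivalent dynamic load [N],
    p: life exponent (3 for ball bearings, 10/3 for roller bearings).
    """
    return (C_n / P_n) ** p * 1e6

def lifetime_years(C_n, P_n, revs_per_year):
    """Predicted lifetime when the measured usage rate stays constant."""
    return l10_revolutions(C_n, P_n) / revs_per_year

# A lightly loaded, rarely used bearing (e.g. C = 30 kN, P = 1 kN,
# 20,000 revolutions/year) yields a very long predicted life, which is
# consistent with the order of magnitude reported for the research crane.
```

This also shows why the reported prediction is so extreme: the cubic load ratio dominates, so a light equivalent load inflates the estimate far beyond any practical service life.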

The end-user interface of the case is the visualization that was shown inside a web-based PLM application (Figure 6a). Figure 6b shows the estimated usage (green line) of the bearing if the use of the crane continues similarly as during a selected time period (red curve). The time period and other parameters of the graph, such as the designed lifetime (pink line) that is used for drawing the reference line (blue line), can be changed either by changing parameters in the URL of the browser or with the configurator user interface shown in Figure 6c.


**Figure 6.** Screenshot samples of user interfaces for the bearing lifetime estimation case. (**a**) Product lifecycle management (PLM) software Teamcenter showing bearing usage data. The pie chart shows the types of usage and the graph shows timewise usage. The number of cycles is calculated from true usage data of the Ilmatar crane and the inspections are mockup for proof-of-concept visualization purposes. (**b**) A graph visualizing the past use (red curve), predicted expenditure according to the past usage (green), and the allowed linear usage to achieve the designed number of cycles (blue) in the designed lifetime of one of the bearings in the Ilmatar crane. The chosen lifetime for drawing the blue line is 100 years. (**c**) The user interface for selecting the information to be displayed on the graph.

The codebase for the web application was put into a GitLab instance upkept by university IT services, and a continuous integration and continuous delivery (CI/CD) pipeline was made for the application using a Linux server provided by the university IT services. Any changes to the GitLab codebase, made with a git client or via a web page, triggered tests for the new application and, if they succeeded, the changes were automatically deployed to the web application.

The case functions as a prototype to guide the creation of DTs for a larger fleet of cranes. A larger fleet is expected to enable further opportunities, such as usage profile categorization. The case represents a cultural shift in engineering, including designs that are further customized to purpose, bringing the customer closer to the machine design process, and continuous learning from the existing fleet of cranes. With the digital twin, traditional assumptions can be challenged through data-driven closed-loop design, leading to more optimized designs. Further exploration of usage-based design optimization is described in Section 3.10.

The bearing lifetime estimation case had the most participating organizations: the university and four companies. The crane manufacturer provided the initial motivation for the case, while its formalization into a concrete goal was a collaborative effort. The case was defined by looking at the available resources and finding an implementable application from the area of machine design. Most of the practical application development work was done by the university inside the MindSphere/Cloud Foundry IoT platform.

#### *3.8. Case 2: Usage Roughness Indicator*

This case features a proof-of-concept usage roughness indicator that shows how smoothly the crane hook is handled. This work was presented at an unrecorded demo session of a project seminar by Valtonen and Ala-Laurinaho. It was also presented as a use case application in the journal article that introduced OSEMA [36]. The application was developed as a collaborative effort between the university and a company member of the project consortium. The university installed a sensor on the crane and connected it to the company cloud; the consortium member performed the data processing and visualized the indicator. The components and interfaces of the system are shown in Figure 7.

**Figure 7.** Components of the usage roughness indicator application.

The usage roughness index is calculated from real-time measurement data from a three-axis accelerometer attached to the hook of the crane. The sensor node was configured with an OSEMA instance maintained by the university. The sensor sent the acceleration data to an IoT platform of the consortium member, which developed a machine learning algorithm to estimate the usage roughness. The algorithm is based on a neural network created using TensorFlow. The final application uses a flow-based approach (Figure 8) for data processing.

The machine learning algorithm was developed remotely. First, the crane was operated with different driving styles and the hook acceleration data were collected into the IoT platform. Based on the data, the consortium company implemented a machine learning algorithm. The algorithm was then tested with the crane, and observations about its performance were sent to the company. Using the feedback, the company improved the algorithm. This iterative loop was repeated several times until the algorithm worked appropriately. Hence, the machine learning model was evaluated to be sufficient through iterative test cycles with linguistic feedback from the crane operator to the algorithm developers. A key point of success for the development of the application was an easy-to-use IoT platform, which offered a well-defined interface for receiving measurement data.
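
The actual indicator uses a neural network, but the underlying idea can be illustrated with a far simpler stand-in: smooth driving keeps the free-hanging hook close to 1 g of acceleration, so the RMS deviation of the acceleration magnitude from 1 g over a measurement window serves as a crude roughness score. The sketch below is purely illustrative and is not the consortium member's algorithm.

```python
import math

# Crude usage-roughness stand-in: RMS deviation of hook acceleration
# magnitude from gravity. Jerky driving raises the score; a calmly
# hanging hook scores near zero. Illustrative only.

G = 9.81  # gravitational acceleration, m/s^2

def roughness(samples):
    """samples: list of (ax, ay, az) accelerometer readings in m/s^2."""
    devs = [math.sqrt(ax * ax + ay * ay + az * az) - G
            for ax, ay, az in samples]
    return math.sqrt(sum(d * d for d in devs) / len(devs))
```

A learned model can of course capture much richer patterns (swinging frequency, load transients) than this single statistic, which is why the case used a neural network.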

#### *3.9. Case 3: AI-Enhanced Brake Condition Monitoring*

This case presents a methodology for enhancing brake condition monitoring services with flow-based artificial intelligence (AI). This work was presented in a project seminar by Valtonen and Peltoranta [53]. The case was developed by two consortium members: the crane vendor and an IoT service provider. The case builds on an existing data collection implementation and adds value through cloud-based data processing.

The brake that holds the rope of the crane is a critical component from both functional and safety perspectives. It is also a maintenance-intensive component, and hence the crane vendor provides brake monitoring. The pre-existing monitoring services were enhanced with flow-based AI methods to provide deeper insight. As a result, the AI-enhanced twin knows the state of its real counterpart, much as people know the state of their own health. Based on the health information, the twin provides proactive observations to maintenance personnel. For example, the personnel can be alerted to service the brake earlier than scheduled due to a faster-than-expected wear rate, or to inspect the brake control system if it has triggered an excessive number of fault signals in a short period of time.

Technical implementation of the brake monitoring consists of "flow nodes" (Figure 9) that perform a certain activity on data and forward the result to the next node. The nodes also act as a visualization of the data processing pipeline and can be grouped into four sections based on their function: (1) data ingestion, (2) data selection, (3) signal processing, and (4) fuzzy inference.
The flow-based AI method provides three types of advantages: (1) data interpretation: the meaning of the data is easier to understand with linguistic terms; (2) root-cause identification: e.g., a warning on overall brake system health can be traced to the wear speed of the brake lining; and (3) predictive maintenance: automatic notices, warnings, and alerts for operators, service personnel, and owners of the crane.

The brake condition monitoring use case leverages the whole physical–digital–physical loop of a digital twin if the human maintenance at the end is accepted as part of the loop: the physical crane generates data that are processed digitally to create insights and alerts that enhance the physical condition of the crane through maintenance actions. In the future, if these algorithms prove reliable enough, critical alerts could trigger restrictions on crane control parameters.

**Figure 9.** A screenshot sample of the development environment showing the flow nodes of the artificial intelligence (AI)-enhanced brake condition monitoring application. The node categories at the bottom were added afterward. Purple nodes represent data ingestion, blue are data selection, green and yellow ones perform signal processing, and red nodes implement fuzzy inference.

#### *3.10. Case 4: Design Automation*

The design automation case presents a proof-of-concept on how real usage data can be used to redesign crane components, specifically a rope sheave and bearing. This case was presented earlier in a project seminar presentation by consortium member Sutela [54] and in a related demo session. The case was implemented by a consortium member with usage data provided by the university.

The application development started with an original design of a rope sheave and bearing combination. The design was then converted into a design-automation-enabled model that can leverage usage data. Usage data collected from Ilmatar were fed to the automated model, which created a new design optimized according to the usage. The information flow of the case is shown in Figure 10. The result was a new design for the rope sheave–bearing combination whose weight was reduced by 16% and dimensions by 3–10%. The newly selected bearing was 20–30% cheaper.

**Figure 10.** The data flow of the design automation case. The usage data originally comes from the crane as described in Case 1 (Section 3.7).

The case acts as a proof-of-concept of usage-data-driven design automation when implemented with only one crane, but the method can lead to significant benefits when applied to large fleets and more expensive parts. Being able to rely on actual usage data in dimensioning decisions can even enable deviation from standards, provided the data are robust enough to ensure safety.

#### *3.11. Case 5: Web UI with GraphQL*

This web user interface (UI) application acts as an alternative user interface to the crane via web technologies, showcasing the potential of the GraphQL wrapper. This case was presented earlier in a master's thesis [38] and conference proceedings [39]. The source code of the web app has been published on GitHub [55].

The GraphQL wrapper is intended to ease application development for Ilmatar. To demonstrate application development with it, a control application for the crane was developed. In addition to controlling the movement of the crane, the application allows monitoring the internal state of the crane by subscribing to the node values of the OPC UA server. All communication with the crane is performed using the GraphQL interface and, thus, the UI application developer does not need to be familiar with OPC UA.

#### *3.12. Case 6: Mixed Reality Control*

The mixed reality (MR) control application shows a set of real-time crane information to the user and allows control of the crane with HoloLens MR glasses. The application was developed as a master's thesis work by Hublikar [56] with Autiosalo as the instructor. The work also includes a prototype of a data linking digital twin to transfer data between the MR glasses and the crane.

HoloLens MR glasses draw 3D images to the user on a transparent screen and allow user interaction via head orientation and hand gestures. The 3D images are stationary relative to the surrounding world. The developed control application visualizes target destinations for the crane as hologram balls, and when a user taps a ball, the crane hook moves to that location. The glasses also show values from the OPC UA server of the crane, such as the x, y, and z position, in a sidebar that follows the head movement of the user. Furthermore, a voice control feature was developed and tested in the user test of the application.

The prototype of a data linking digital twin is a program hosted on a separate computer. It reads and writes data from and to the OPC UA server of the crane, and transfers the data via WebSocket to the HoloLens glasses. All the devices (crane, computer, and glasses) are connected to the same local area network via WiFi or Ethernet cable.
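
The relay logic of such a data link can be sketched as a simple polling loop that reads node values and forwards them as JSON frames. In the sketch below, the OPC UA read and the WebSocket send are stubbed with a callable and an in-memory queue so that only the relay logic is shown; all names and values are illustrative, and the actual prototype's code was not published.

```python
import json
import queue

# Sketch of a data-linking relay between an OPC UA server and MR glasses.
# read_node and send are injected, so the relay logic is transport-agnostic:
# in the real setup they would wrap an OPC UA client and a WebSocket.

class Relay:
    def __init__(self, read_node, send):
        self.read_node = read_node  # callable: node id -> current value
        self.send = send            # callable: text frame -> None

    def step(self, node_ids):
        """One polling cycle: read each node and forward one JSON frame."""
        frame = {nid: self.read_node(nid) for nid in node_ids}
        self.send(json.dumps(frame))

# Demonstration with stubbed endpoints (illustrative values).
outbox = queue.Queue()
relay = Relay(read_node=lambda nid: {"pos.x": 1.2, "pos.y": 3.4}.get(nid),
              send=outbox.put)
relay.step(["pos.x", "pos.y"])
```

Writing control commands back from the glasses is the mirror image of this loop: parse an incoming frame and write the values to the OPC UA server.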

The mixed reality control application was developed as a single-person project. The development proved laborious, with one person implementing both the data link between the devices and the user interface in the glasses. In particular, the user interface development environment required a lot of learning; for example, the synchronization of the crane and HoloLens coordinate systems was handled simply by specifying a fixed initialization location for the app. The source code of the application was not published. Nevertheless, the application proved that a connection between the HoloLens and the crane can be established and that the development environment allowed implementing the overall idea. The project also made the pain points of interfaces and coordinate synchronization apparent. Lastly, the user tests indicate that the user-friendliness of the glasses and the overall solution is promising.

#### *3.13. Case 7: High Precision Lifting Controller*

The high precision lifting controller is a combination of sensors, a minicomputer, and a web application connected to the crane to automatically insert a cylinder (the stator of an electric motor) into a tight hole (the frame of an electric motor). The application was presented by Sjöman et al. [25], with Autiosalo as one of the two main developers, and continued in a master's-level mechatronics project course with Ala-Laurinaho as one of the team members and Autiosalo as an instructor.

The position of the cylinder is measured with sensors that are attached to the frame. Sensors are connected to a minicomputer (Raspberry Pi) which also reads the position of the crane from its OPC UA server. The values are used to control the cylinder to the desired position. The actions are triggered and monitored via a web browser UI.
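
The closed-loop positioning principle can be sketched as a simple proportional controller that commands the crane in small steps until the measured position error is within tolerance. The gains, tolerances, and function names below are illustrative assumptions, not the values or code used in the actual application.

```python
# Sketch of proportional position control for the insertion task.
# In the real setup, "measured" comes from the frame-mounted sensors and
# OPC UA position reads; the command increment would be written back to
# the crane. Gains and tolerances here are illustrative.

def control_step(measured, target, gain=0.5):
    """Return the position command increment for one control cycle."""
    return tuple(gain * (t - m) for m, t in zip(measured, target))

def move_until_aligned(measured, target, tol=0.001, max_cycles=100):
    """Iterate control cycles until the error is within tolerance."""
    pos = list(measured)
    for _ in range(max_cycles):
        if all(abs(t - p) <= tol for p, t in zip(pos, target)):
            return pos
        step = control_step(pos, target)
        pos = [p + s for p, s in zip(pos, step)]
    return pos
```

With a gain below one, the error shrinks geometrically each cycle, which keeps the motion gentle enough for a tight-fit insertion.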

The OPC UA control interface of the Ilmatar crane was made by the crane vendor at the request of this project. The development led to the creation of a Python library that enabled easy access to the Ilmatar OPC UA server. The library was distributed as an individual file to several other projects until it was finally published as open-source code on GitHub [57], as part of the launch of Ilmatar OIE, more than two years after its initial creation.

The digital twin concept was not considered while developing this case. However, this case initiated the two-way communication interface to the crane, which has enabled a variety of new applications. These experiences are valuable for establishing two-way communication between the crane and its digital twin. Furthermore, the functionality of this case could be integrated into an operator-focused digital twin as an additional feature.

#### *3.14. Case 8: LIDAR-Based Pathfinding*

LIDAR-based pathfinding enables automatic scanning of the crane environment with a LIDAR sensor and pathfinding to a target with obstacle avoidance. The application was developed in a combined master's-level mechatronics and automation project course [58], with Autiosalo as the advisor for crane use. The team won an innovation competition with the application [59].

The LIDAR was installed as an additional sensor on the crane and connected via Ethernet to a Raspberry Pi microcomputer, which sent the sensor data to a Windows computer via WiFi. The Windows computer performed the data processing and sent the resulting control commands to the crane with the OPC UA Python library.
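
Once the LIDAR scan is reduced to an occupancy grid, pathfinding with obstacle avoidance becomes a standard shortest-path search. The breadth-first search sketch below illustrates the principle on a small grid; the team's actual implementation was not published, so this is not their code.

```python
from collections import deque

# Breadth-first search on an occupancy grid: 0 = free cell, 1 = obstacle.
# Returns the shortest 4-connected path from start to goal, or None.

def find_path(grid, start, goal):
    rows, cols = len(grid), len(grid[0])
    prev = {start: None}           # visited set + backpointers
    q = deque([start])
    while q:
        cell = q.popleft()
        if cell == goal:           # reconstruct path by walking back
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in prev):
                prev[(nr, nc)] = cell
                q.append((nr, nc))
    return None                    # goal unreachable
```

In practice, the grid cells would be derived from the LIDAR point cloud, and the resulting path would be converted into bridge and trolley movement commands.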

The team was mainly independent in their work, and crane support consisted mainly of handing over the documentation of the interface and the crane Python library for accessing the OPC UA server. The team noticed an error caused by an update and fixed the library. The source code for the application was not published at that time.

This case was not developed for a digital twin and is implemented as a local solution. Nevertheless, this application provides a new local operational feature for the crane. It generates a lot of data about the environment that could be provided to other parties via the digital twin of the crane. The digital twin could also relay control requests to the pathfinding application from external parties, such as a forklift.

#### *3.15. Usage of APIs in the Use Cases*

All of the described use cases (Sections 3.7–3.14) are connected to some other building block of the DT as all of them use real data from the crane. The data exchange methods were chosen individually for each use case to fulfill their specified needs. The details of API usage in the cases are shown in Table 1 and the selection processes of the APIs are described in the following paragraphs.

Case 1: Bearing lifetime estimation was developed as an opener case for the project and was planned as a collaborative effort among all project members. The planning phase included an iterative process of finding usable data sources for the crane and finding a topic that serves the goals of the project. The goals were to leverage operational data of the crane from multiple sources and to support machine design activities. Initial ideas included combining information from multiple systems, such as enterprise resource planning and maintenance, but this turned out to be too ambitious for practical reasons: access to the systems was limited or they did not have suitable APIs. The case finally leveraged two systems: MindSphere and Teamcenter.

MindSphere was set up to receive usage data from the crane OPC UA interface, so it was a natural choice for Case 1. The MindSphere database provided a REST API, which was used in a separate web application for two purposes: refining the data and presenting the data. Refining the data required both reading and writing through the REST API, whereas presenting required only reading. Setting up the REST API authentication (acquiring credentials and learning to use them) required some effort, but this proved manageable during the project.
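
The read-refine-write pattern against a REST API can be sketched as follows. The refinement step is shown as a simple min/mean/max summary, and the endpoint paths, field names, and bearer-token authentication are hypothetical stand-ins for the actual MindSphere API.

```python
import json

def refine(samples):
    """Aggregate raw load samples into a min/mean/max summary --
    the kind of refinement step performed through the REST API."""
    return {
        "min": min(samples),
        "max": max(samples),
        "mean": sum(samples) / len(samples),
    }

def fetch_and_refine(base_url, asset_id, token):
    """Read raw time series, refine them, and write the result back.
    URLs and field names are illustrative, not the MindSphere schema."""
    import requests  # third-party HTTP client, imported lazily
    headers = {"Authorization": f"Bearer {token}"}
    resp = requests.get(f"{base_url}/timeseries/{asset_id}", headers=headers)
    resp.raise_for_status()
    samples = [point["load"] for point in resp.json()]
    summary = refine(samples)
    # Write the refined result back through the same REST API.
    requests.put(f"{base_url}/refined/{asset_id}",
                 headers=headers, data=json.dumps(summary))
    return summary
```

The presentation side then only needs read access to the refined resource.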

Teamcenter was selected as the target system for the digital twin interface because it was already used by machine designers, had a lot of design data, and was supposed to store product lifecycle information. Teamcenter had a SOAP (Simple Object Access Protocol) API, but it was not used because none of the project personnel had used it before and the time required to learn to use it was considered too high. It was also unclear what the API would actually offer. However, embedding a web browser element to Teamcenter was easy, and this was the method to integrate the MindSphere-based usage data visualization to Teamcenter.

**Table 1.** Usage of APIs during the development of the Ilmatar digital twin.


Case 2: Usage roughness indicator was developed late in the project, after realizing that existing resources and competence could easily be combined into a new kind of application. The application leveraged data from an OSEMA sensor and the cloud services of the Regatta IoT platform. Regatta supported the MQTT protocol, which was chosen as the method of data transfer, as MQTT support had recently been added to OSEMA and was waiting for a use case. The Regatta MQTT specification had to be added to OSEMA, but this proved straightforward because the API was well documented and the configuration was done by the main OSEMA developer. In addition, Regatta was operated by experts in that software, allowing rapid development of the cloud application.
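
A sensor-to-cloud MQTT transfer of this kind can be sketched with the widely used paho-mqtt client. The topic layout and payload fields below are illustrative assumptions, not the Regatta MQTT specification.

```python
import json
import time

def make_payload(sensor_id, value, ts=None):
    """Build the JSON payload an OSEMA-style sensor would publish.
    Field names are illustrative, not the Regatta specification."""
    return json.dumps({
        "sensor": sensor_id,
        "value": value,
        "timestamp": ts if ts is not None else int(time.time()),
    })

def publish_reading(broker, sensor_id, value, topic_prefix="osema"):
    """Publish one reading to the (hypothetical) broker and topic."""
    import paho.mqtt.client as mqtt  # third-party MQTT client, imported lazily
    client = mqtt.Client()
    client.connect(broker)
    client.publish(f"{topic_prefix}/{sensor_id}", make_payload(sensor_id, value))
    client.disconnect()
```

The cloud side subscribes to the same topic and computes the roughness indicator from the stream.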

Case 3: AI-enhanced brake condition monitoring was developed using data from the existing crane and brake monitoring systems. The data for the analysis were extracted from the crane database and streamed into the developed flow implementation (see Figure 9) using the MQTT protocol. The flow application routed the data between its nodes using internal, code-level communication routines. The calculated data from the AI-supporting nodes were exported to the IoT platform, where they were visualized as time-series graphs.

Case 4: Design automation was built on Rulestream, and the usage data of Ilmatar were brought from MindSphere as a spreadsheet file. Hence, this case did not have an actual API. It is included among the other APIs to show that an API is not always necessary, especially in tasks that are in any case subject to human decision making. The case did, however, use refined data from Case 1 and is an end-user application at the end of an engineering data pipeline created with APIs. This is a reminder that existing APIs and data can be beneficial beyond their original use case.

Case 5: Web UI with GraphQL was built to showcase the capability of the newly developed OPC UA–GraphQL wrapper. Both were created by the same developer who learned to use both OPC UA and GraphQL during the master's thesis work. Emphasis was put on developing good documentation as the benefits of the wrapper are expected to come from applications built on top of the wrapper.
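
As a sketch of what querying such a wrapper looks like, the snippet below composes a GraphQL query for one OPC UA node value and posts it over HTTP. The schema (field and argument names) is an illustrative assumption; the real wrapper defines its own schema.

```python
def node_value_query(node_id):
    """Compose a GraphQL request body for a single OPC UA node value.
    The schema shown here is hypothetical."""
    return {
        "query": "query($id: String!){ node(nodeId: $id){ value } }",
        "variables": {"id": node_id},
    }

def fetch_value(endpoint, node_id):
    """Post the query to the wrapper endpoint; GraphQL is served over plain HTTP."""
    import requests  # third-party HTTP client, imported lazily
    resp = requests.post(endpoint, json=node_value_query(node_id))
    resp.raise_for_status()
    return resp.json()["data"]["node"]["value"]
```

Compared with raw OPC UA, the client needs only an HTTP library, which is what made building the web UI on top of the wrapper so fast.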

Case 6: Mixed reality control used the OPC UA Python library for communication between the crane OPC UA server and the middleware. The middleware used WebSocket to communicate with the HoloLens MR glasses. Both read and modify operations were used on each interface.

Case 7: High precision lifting controller was the first application to use the crane OPC UA interface for both reading and controlling. The Python OPC UA library was developed during this work; it was built on top of the open-source library "FreeOpcUa". The server application used WebSocket to communicate with the client browser.

Case 8: LIDAR-based pathfinding used the OPC UA Python library for communication. Aside from this, it used non-standard, local communication methods leveraging C# and TCP.

#### *3.16. Publishing the Results from the Crane Environment*

The publicity, combined with concrete actions and coordination, makes the Ilmatar environment a special industrial research platform. The crane is a tangible device for which it is easy to innovate new applications, and coupled with coordination from university and industrial partners, Ilmatar sees an extraordinarily large amount of development activity. Hence, it generates an extraordinarily large amount of (qualitative) product development data. In contrast to usual corporate research equipment, a significant portion of the Ilmatar data is public. Most of the published data appear in academic publications, such as conference papers [24,39], journal articles [25,36], and master's theses [38,56]. Some of these works were purely descriptive with no actual application, some featured an application that was not published, and for some, the software was published as free open source in addition to the academic publication.

Publishing pieces of software as open-source code creates extra unrewarded work in most cases. However, publishing one piece of software has saved time. The Python library developed for the crane during the high precision lifting development (Section 3.13) was also useful for later projects, but distributing and maintaining an updated version of the library required effort. Publishing the library as open source [57] removed the need to distribute the software separately and streamlined user contributions to the library. As a downside of open sourcing, it is no longer clear who uses the software, which would help keep track of demand for the library and collect user feedback. (This is one of the reasons why the Ilmatar OIE resources were put behind registration.) The other open-source projects made in the crane environment were published without user demand and have not yet attracted contributions outside their original creators. They are more complex than a simple library and are not as obviously needed as the crane library, which directly removes work from those who want to connect to the crane OPC UA interface.

#### *3.17. Scalability of Digital Twins in Company Operations*

Here, we present observations on how digital twins can become a normal part of company operations. The contents of this subsection are based on consortium member Lehto's presentation at a project seminar [60] and are included as base material for further discussion.

Each of the presented cases shows potential benefit for different stakeholders of industrial cranes, but a question about payback time remains, since creating a digital twin currently requires extra effort before the twin starts serving its users. Once digital twins become more integrated with product development, creating virtual products will likely be more cost-efficient and contain more synergies than today.

There cannot be a project for every digital twin a company produces. Digital twins are scalable only through a plug-and-play implementation that is supported by IT systems across the company. Company-wide adoption can be achieved through a strategy for digital twins and IoT data. This strategy determines what data are important, to whom, and why.

With a strategy that supports digital twin creation, a company can determine whether the benefits outweigh the costs. The costs come from multiple activities during the DT lifecycle, such as creation, sensor installation, DT structure upkeep, service updates and modernization actions, the data stream from the physical environment, analytics, computational power, and the distribution of information reports. Depending on the use case, DTs may include additional features that induce costs outside these categories, and some costs can be minimized through automation.

Uses for widely implemented fleets of digital twins include not only the previously mentioned new product development and maintenance services but also, for example, sales, where existing fleet data can show what kinds of cranes fit what kinds of customer applications. As data become more easily available across the company, more use cases will probably emerge, and these benefits come "free" on top of costs that have already been covered. This can create a positive spiral, leading to a situation where digital twins become the normal way of handling any information across the whole company.

The company-wide culture change from using familiar local data to using new cloud-based DT data is undoubtedly a major challenge, but it is crucial for achieving all the potential benefits. Digital twins and their interfaces need to be integrated into, and serve, everyday company processes.

#### **4. Discussion**

This section presents the insights, lessons learned, and recommendations from the development experiences described in Section 3.

#### *4.1. Integrated Digital Twin*

The concept of an integrated digital twin developed gradually from various phenomena observed during the creation of the Ilmatar digital twin. The basic conceptual ideas were presented in an earlier publication [16], whereas the current paper shows a practical DT implementation and formalizes the concept of an integrated digital twin according to these experiences.

By an integrated digital twin, we mean that all components of a digital twin are available from a single web location as seamlessly as possible. Instead of forcing users to fetch information from several separate systems, an integrated twin brings the information to users based on their needs. An integrated digital twin can be used by software agents and human users, as depicted in Figure 11. Therefore, integration means either machine-readable APIs or human-perceivable views of the information. The human view is provided by a UI application connected to the machine-readable interfaces, highlighting the fact that DTs operate in cyberspace, which is not native to humans.

**Figure 11.** Services and users of an integrated digital twin. The data link is not software at its core but is made available via software. "Feature software" blocks represent the features of feature-based digital twin framework (FDTF). The figure does not depict the connection to the real-world entity, which is fulfilled by one or many "Coupling" feature software blocks.

The borders of the integrated digital twin are depicted with dotted lines in Figure 11, leaving it open whether the features are part of the DT or not. This is a conscious choice, because during development it proved impossible to judge whether a given piece of software is part of the digital twin. Any attempt to draw hard lines seemed arbitrary and was dismissed. Instead, an integrated digital twin is defined by the connections between software blocks:

An integrated digital twin is a collection of digital services available via a single digital location, linked to a single real-world entity.

This service-focused definition avoids the ambiguity of determining whether a piece of software is part of the DT. The services are provided by the software blocks, which are part of the digital twin service supply chain. In the Ilmatar cases, each block is crucial in providing the end result and is therefore an obligatory part of the service supply chain. However, each case was built as an independent whole, and some include manual activities in the data supply chain. Work remains before the individual DT applications can be combined into an integrated digital twin. In the long term, however, there should be a natural push towards integrated DTs, as a digital twin should be a service for both crane users and application developers. The need for a platform for building integrated digital twins is evident.

Reaching the level of an integrated digital twin already seems to be a general, unstated goal for any digital twin development project. To demonstrate this phenomenon, we describe two cases from the Aalto environment. First, another university research project developed a digital twin for a rotor system [61]. It reached a higher level of integration into one digital twin, with a web-based 3D view of the rotor that combined the other features, including sensor data from two sources and neural-network-based virtual sensors. A simple broker server was developed to combine the data sources, with customized data adapters where necessary. The twin was built almost entirely at the university, minimizing the need for cross-organization collaboration. The rotor system is a good example that, with good coordination, it is possible to build integrated digital twins, in this case for research equipment built with industrial-grade components.

A second example of an integrated digital twin was shown by the mining technology and plant supplier Outotec in a demo session of the project seminar. They used Aveva software (AVEVA Group plc, Cambridge, UK, https://sw.aveva.com/digital-twin) to create a "plant engineering digital twin" that made the engineering information of a processing facility available via one 3D model. The twin integrated information from multiple types of documents, such as layouts, lists, and diagrams, into one view, and it is a good example of an integrated digital twin used in industry.

Even with these encouraging examples of integrated digital twins built within single organizations, current tooling seems inadequate. Creating properly integrated cross-organizational digital twins requires so much additional labor on top of the actual application development that the job simply does not get done with a reasonable amount of resources. Hence, a new, coordinated approach for building integrated digital twins is needed.

We propose creating a digital twin platform that is independent of any feature software or provider. We also propose two design principles for the platform: openness and user-centered design. Openness is further divided into two components: open-source software and open standards. Openness is needed because digital twins should be located at such a low level in the technology stack that the required network effects will only happen with an open solution. The World Wide Web and containerization are good examples of Internet-based technologies that have created network effects via an open approach, and we see that digital twins should be located at a similar level of the Internet stack. User-centered design is required to ensure adoption of the DT platform, including good usability for both digital twin end users and developers as users of the platform. The goal is that a digital twin naturally attracts content because it is the handiest place for it, and adding a new feature to a digital twin becomes as easy as downloading an app to a smartphone.

#### *4.2. Hypotheses Review*

We now review the eight hypotheses shown in Section 1.1. The review is based on the experiences of one environment during a relatively short time, and should therefore be taken as indicative rather than conclusive.

Hypothesis 1: Digital twin transforms data from a physical product to useful knowledge. Digital twin offers data and knowledge to all stakeholders across the product lifecycle.

The bearing lifetime estimation and brake condition monitoring cases focus strongly on turning data into knowledge through multi-phase data processing. The usage roughness indicator also turns raw data into more sophisticated information about how the crane is being handled. The design automation case uses existing knowledge of the application and turns data into an actionable solution. The web UI shows data rather than transforming it into knowledge. Hence, turning data into knowledge seems to be crucial in some cases, but not in all. This emphasizes that digital twins are defined by their use case, and different features of DTs have different purposes.

The Ilmatar DT as a whole provides data or knowledge to designers, maintainers, and users of the crane, although each case focuses on just one stakeholder. Currently, these cases are fragmented, and serving all stakeholders from one digital twin demands a lot of integration. Both claims of Hypothesis 1 can be fulfilled if the digital twin is a combination of multiple cases, and they are often perceived as goals for twins, but they are not general requirements for digital twins.

Hypothesis 2: Digital twin integrates digital models and data from different sources and providers and offers a customized view for each stakeholder.

The models and data used for the components of the design- and maintenance-focused DT cases come mainly from the manufacturer and the crane itself. The operator-focused applications rely heavily on crane data but also leverage external data, such as the additional sensors of the high precision lifting case and the environment data of the LIDAR-based pathfinding case. Hence, it seems that design and maintenance can rely on data from a limited number of sources, whereas operation-focused applications also crave other data sources. It may also be, though, that we did not find the right supplementary data for design and maintenance purposes. Integrating the work into a single digital twin proved more difficult than expected, but it is still seen as a promising development direction.

The existence of customized views is initially verified, as the components of Ilmatar digital twin offer views for machine design and maintenance as well as operators. In addition, sales have been identified as another relevant stakeholder. Each of these groups needs a customized view to achieve their goals.

Hypothesis 3: Digital twin enables networking and business.

Networking has several points of validation across the use cases, although they come with two preconditions: functional interfaces and a commonly beneficial use case. The digital twin concept itself as well as the buzzwordiness of the term acted as business networking enablers. Actual business creation is more difficult to validate in the context of this externally funded research project as most of these were made as proof-of-concept prototypes instead of business-critical applications.

The usage roughness case had clear API-based boundaries of responsibility between the two organizations, which enabled efficient information exchange. The clear boundaries can be thought of as a networking enabler thanks to the efficiency of communication they offer, and we find it probable that if taken to a business context, the API-based responsibility boundaries will enable easier business creation.

The bearing use case combined most parties into one case: one company provided a use case, the university acted as an integrator and application builder, a second company provided the analysis algorithms, and the end application was integrated into the PLM software maintained by a third company. The application platform was provided, and the data stream set up, by a fourth company. Making an integrated digital twin application naturally combined the parties into a digital supply chain network once a common goal had been defined.

On the conceptual side, the digital twin provided a common goal for the project consortium. Each participant had their role in that vision, and even though they were not all directly linked to each other, aiming to create one digital twin with several features brought the network together. We expect that a more integrated digital twin will support even stronger networking once the results are combined into one interface.

Buzzwordiness around the digital twin concept is another aspect that seems to create a lot of networking, for example in the number of people participating in events. Each of the two seminars organized for the project attracted more than a hundred participants, mainly from industry, an exceptional number for such a small nationally funded research project. The digital twin term gathers people from various disciplines to pursue a common goal: creating digital twins.

Hypothesis 4: Machine design dimensioning and product development processes can be redefined with the true usage and maintenance data provided by a digital twin.

One of the project partners fed usage data to a design automation model to create usage-based dimensioning of the rope sheave. This approach works only in hindsight but can be developed further to redefine existing processes. If the whole crane is designed with rule-based design automation and the usage profile of the customer can be estimated accurately (for example, based on data from similar customers), cranes can be designed and manufactured to fit the expected use case more accurately, allowing lower overall expenses. In addition, if the usage of machine parts can be measured, as for the bearings and brakes of Ilmatar described earlier in Cases 1 and 3, lifetime estimations of those parts can be made more accurate, and parts can be changed when their usage reaches the designed lifetime, rather than by trying to estimate usage or looking for signs of failure. True maintenance data were not yet included in the Ilmatar digital twin, but they will be needed to make accurate estimations.

Hypothesis 5: The overhead crane located at university premises acts as an excellent development platform, offering industrially relevant applications, such as plugging digital twin as part of product configuration, design, and life-cycle management.

The experiences for this hypothesis are mixed and depend on the viewpoint. From the university point of view, the applications are very industrially relevant, but from the industry perspective, the cases were rather academic proof-of-concept tests whose results could not yet be implemented in company operations. The uniqueness of the crane also created some obstacles in development, as the usage profile of Ilmatar differs from that of most cranes. Nevertheless, the crane has proved to be a unique development platform, enabling research collaboration in ways not possible earlier. When the pros and cons of this type of environment are known, research and development activities can be planned around topics that the environment supports. For example, data access to the crane was comprehensive, and it was easy to share the data and generate small amounts of targeted usage data, but the long-term usage data do not match those of high-usage industrial cranes. The public nature of a university and integration with teaching should also be considered when planning development activities. The general outlook on the environment has been positive and anticipates future developments.

Hypothesis 6: Acting as an interface for all Industrial Internet data is one of the most important functions of a digital twin, enabling the efficient use of a vast amount of data.

The need for this kind of integrated digital twin grew during the development, but it proved to be such a complex task that no concrete evidence was acquired to support the claim that this is an especially important function of a digital twin. Easy-to-use interfaces in general proved to be a very important enabler, as they seem to speed up application development significantly. The conceptual frameworks developed during the project support this hypothesis.

Hypothesis 7: "Using existing APIs enables fast prototyping, and bringing 'developer culture' from the 'software world' to the 'physical world' enables faster prototyping/product development cycles" [25].

This hypothesis is supported by all cases, and two cases proved especially supportive of faster prototyping. The development of the Web UI with GraphQL was trivially fast thanks to the OPC UA–GraphQL wrapper. The development of the usage roughness indicator was fast because the IoT platform used provided a well-defined interface to which the OSEMA sensors were easy to connect.

All but one of the operator-facing applications were built on top of the pre-existing OPC UA API, and the usage roughness indicator was built on an existing IoT platform and the pre-built OSEMA sensor platform. Of the design- and maintenance-focused cases, bearing lifetime estimation and design automation also leveraged OPC UA via the MindSphere IoT platform, and the AI-enhanced brake condition monitoring used an existing interface and IoT platform.

However, most of the applications are focused on user interfaces with no changes to physical products. The design automation created new designs for crane parts, but the results were not used for actual physical prototyping. We also had the API of only one crane, while making proper conclusions for physical development would have needed APIs from several cranes. It also seems that many current APIs are too laborious to be used efficiently in iterative physical product development.

Hypothesis 8: Digital twin can be built without selecting a central visualization and simulation model.

We built the digital twin without leaning on one visualization and simulation model, and it came out fragmented into multiple separate cases. This approach allowed a wide exploration of different features for digital twins, but we ended up with an intangible result: we have several separate digital twins instead of one integrated digital twin of the crane. However, we still see this as a direction worth exploring, and to overcome the difficulties, we are building a "data link" tool to tie the pieces together more concretely without model-based visualization. Hence, the hypothesis seems valid, but extra care should be put into combining the different pieces together if a visualization model is not used.

In conclusion, most of the hypotheses were at least partially validated. Some proved too ambitious to be fulfilled within one project but are seen as promising development directions.

#### *4.3. API-Based Business Network Framework*

Observing the collaboration of the project's stakeholders led to the discovery of a framework which states that business networks should be designed in parallel with the corresponding technical application network. The approach is to leverage technical interfaces (APIs) as the basis for organizational boundaries in a business network. More specifically, when organizations collaborate, they have both a technical API and a business relationship, and the two relationship structures are built to be identical, as depicted in Figure 12.

**Figure 12.** Example illustration of API-based business network framework. The data from Product 1 owned by Organization 1 go through a loop via digital products before providing an application for the physical product. The structure can be of any form and does not need to be a loop.

Our hypothesis is that the structure of APIs between organizations should be used to define the structure of teams and their responsibilities in digital twin development. This makes sense because, when data are moved to a service managed by another team, the responsibility for working on those data changes. Each collaborator works on their own platform and on the data they receive from others (or generate themselves).

There are multiple projected benefits of this kind of network design, mainly in enhanced communication. A relationship built around a technical API is simple and traceable: the API either works or it does not, and it either delivers the specified data or it does not. The data specifications are given in the API documentation. It is immediately visible if the API stops working or delivers bad data; hence, it is easy to identify who should fix the issue. The API-based network architecture also contributes to the general ambition of data monetization [62]. When a business relation and an API are parallel, it should be natural to define a monetary value for the data transferred through the API.
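
The traceability argument can be made concrete with a minimal contract check: given the field specification from the API documentation, a consumer can immediately tell whether a response honors the contract. The example specification below is hypothetical.

```python
def check_contract(payload, spec):
    """Return a list of violations of a documented API contract.
    Each spec entry maps a required field name to its expected type."""
    problems = []
    for field, expected_type in spec.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(f"bad type for {field}: "
                            f"{type(payload[field]).__name__}")
    return problems

# Hypothetical contract for a sensor-reading endpoint.
SPEC = {"sensor": str, "value": float, "timestamp": int}
```

When such a check fails, the violation points directly at the responsible side of the API boundary, which is exactly the traceability the framework relies on.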

The distribution of ownership of the different parts of the application is an intended design feature of this style. It is becoming necessary because applications are required to be so complex that it is not sensible for the main application owner to master all the technical details of the application. Instead, the application owner becomes an API architect who does not need to see what happens behind each API; it is enough that the input and output of each block work as agreed. These factors define the basic characteristics of API-based business networks.


The API-based business network architecture style strongly resembles the microservice organizational style commonly used in web application development. However, industrial DT services have physical devices as parts of the application, which brings more diverse computing environments, more complex supply chains, and long support periods. The diversity of computing environments appears in several forms, such as a high number of operating systems, limited computing power, limited connectivity, and additional security demands. The complexity of supply chains for physical products is high, as a large number of parts are bought from subcontractors and the goods need to be shipped physically. Long support periods are required for physical products, whereas IT products can be replaced completely. Therefore, while this API-based organizational style has been used in software-only contexts, special attention is required to make it work for industrial DT products as well.

The API-based networking style can be demonstrated with two examples from the project. The usage roughness case (Section 3.8) provided inspiration and a well-working example of the network style. The two parties based their cooperation on data exchange through a single API: one party provided the data, and the second analyzed them and provided the visualization application. If modifications were needed, requests were made via email, but the data were still exchanged via the API. Additionally, the whole Ilmatar DT can be visualized according to the API-based networking style, as shown in Figure 13. The identification of organizations was made after the completion of the project, but the components still have clearly defined owners. Based on this project, it seems that ending up with the API-based network structure is a natural way of organizing a multi-component industrial DT. Therefore, it seems logical to use this framework as a design tool for DTs.

**Figure 13.** A case visualization of the API-based business network framework: The components of the Ilmatar DT are colored by the organization in charge of their operation. U: university in general, C1-C4: companies, UG1-UG6: university groups.

The API-based business network architecture framework is currently a hypothesis and requires further experimentation before its benefits can be verified in operational industrial environments. Practical implementation requires all participating organizations to invest in APIs beyond the current industry standard. However, it seems inevitable that more and more companies will adopt APIs as part of normal operations, and in such a future it is only natural to end up with API-based business networks. In its current form, the framework can be used as a communication tool to present and plan multi-organization DTs efficiently.

In a related study, Barricelli et al. [10] brought up the complexity of collaborative DT design projects, emphasizing the communication and skill gaps between participants from different domains. Barricelli et al. encourage using a sociotechnical design approach to ease communication gaps and to enable development also for experts outside the IT domain. This approach supports the need for the API-based business network framework and calls for APIs that are user-friendly also for people without an IT background.

#### *4.4. Lessons Learned*

During the study, we recognized the following six themes that should be given special attention when developing digital twins that have more than one component.

APIs. The various methods of exchanging data between systems proved to be an area with much room for rapid development. APIs have been a basic tool in IT systems development for decades, but the DT world is only on the verge of recognizing them. APIs differ along several dimensions: they can be local or remote, private or public, a mere library, standardized or unstandardized, or somewhere between these categories. Documentation can often be understood only by IT experts. Hence, knowing everything about the relevant APIs seems impossible. Nevertheless, based on the experiences of the project, we made a basic flowchart for selecting APIs during DT application development, shown in Figure 14.

Key takeaways from the project are that APIs should be used more in DT development; that any properly implemented API service becomes part of the company infrastructure and can be reused in later projects; and that APIs need to be easy to use for as many people as possible if they are to be leveraged efficiently. Further notes on the usability of APIs are given in Section 4.5.

Standards. During the project, it became apparent that a standard for describing digital twins at the metadata level would be needed to make extendable digital twins. The digital twin of the crane consists of several parts built for different purposes with different tools, but there was no way to state, in a machine-readable format, that all these parts belong to the digital twin of our Ilmatar crane. Standards for metadata exchange are now being developed [63].
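A machine-readable twin descriptor of the kind this paragraph calls for might look as follows. Since no such standard existed during the project, every field name, identifier, and URL below is our assumption for illustration, not a published schema.

```python
import json

# Hypothetical descriptor stating which components belong to one digital
# twin; all names and URLs are invented placeholders.
ilmatar_descriptor = {
    "twin_id": "ilmatar-crane",
    "physical_asset": "industrial overhead crane 'Ilmatar'",
    "components": [
        {"name": "usage-roughness", "owner": "C1", "api": "https://example.org/usage"},
        {"name": "bearing-lifetime", "owner": "UG1", "api": "https://example.org/bearing"},
    ],
}

def component_owners(descriptor: dict) -> set:
    """List the organizations operating components of this twin."""
    return {c["owner"] for c in descriptor["components"]}

# Serializing the descriptor makes the membership statement exchangeable
# between tools, which is what the missing standard would enable.
descriptor_json = json.dumps(ilmatar_descriptor, indent=2)
```

With such a descriptor, a tool could discover all parts of the Ilmatar twin and their owners without any project-specific knowledge.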

Tools. While there are tools for implementing specific components of digital twins, there were no tools for combining them. In addition, only some of the component tools can be launched as services with 24/7 availability, which is a practical requirement if a component is leveraged by other teams. Furthermore, launching even simple web services proved unnecessarily difficult during the study. Therefore, we have two recommendations for digital twin tool development: digital twin builders that combine multiple components, and easier ways for non-coders to deploy components as 24/7 services. The recently published open-source tool "Digital Twins Explorer" by Microsoft [64] is a good opening in the right direction.

Open source. The majority of the project's cases were built on top of open-source solutions: all the cases built by "University groups" in Figure 13 leveraged open source, and, for example, the MindSphere IoT platform is based on open-source software. Open solutions can be tested instantly, which makes them especially suitable for innovation. Internet and software companies have found ways to leverage this innovation potential by both using and offering open solutions, whereas industrial companies are still reserved towards openness. Digital twins are mostly software and leverage the Internet, so any company building them should become familiar with open solutions to stay competitive. It is also good to acknowledge that open source is a complex field: for example, there is a plethora of open-source licenses, multiple business model styles, and specific dynamics for community engagement.

Skills. The digital twin applications of the project were developed mainly by mechanical engineers, although some of them were at least moderately experienced in programming, which proved to be an essential skill for several DT components. Even though creating many advanced features is certainly possible, they may not be implementable in a limited time frame. This can be especially deceiving when planning to use new tools and the expectations for those tools do not match reality. On the other hand, the right tools can also remove the need for experts in certain areas. Each digital twin development project should critically evaluate whether it has the necessary combination of skills and tools available for developing the type of digital twin needed. Skills have been recognized as a crucial factor also by other researchers [4,14].

Goal. Developing multi-component DTs is currently uncharted territory with no standard implementation guides, which means that project participants need both to apply their knowledge in new ways and to learn new skills. This lack of best practices and example solutions makes it practically impossible to plan the detailed outcome of the project in advance. Hence, it is important to define, communicate, and update the goal of the project constantly as new skills are acquired. The two methods for goal management during the project were defining a purposeful use case and practicing cross-organizational leadership, both of which are further described in Section 4.5.

**Figure 14.** Flowchart for acquiring API resources for a digital twin (DT) application development project, as observed during the example project. The diamonds are fairly quick checks, whereas the rectangles represent varying amounts of work (e.g., building a reusable API requires more work than building a case-specific API). Hence, rectangles should be avoided in order to get started with actual application development as fast as possible. Notes: the word "workers" refers to all the project workers collectively, i.e., it is enough that one of the workers has a specific skill. As an alternative to workers learning new skills, the project can also use outside support to perform the consequent tasks, although this may lead to a prolonged dependence on the support and possibly to including the support in updating the use case requirements. Defining and updating the use case requirements is included in the chart as they proved to be inseparably dependent on the availability of suitable APIs.

#### *4.5. Managerial Implications*

The most important takeaway from the development of digital twin applications for the Ilmatar crane was the significance of easy-to-use APIs. They currently seem undervalued, and leveraging them properly would offer significant efficiency gains. Crucial factors for achieving integrated digital twins also include a purposeful use case and cross-organizational leadership. In the long run, the roles of open standards and open-source software need to be acknowledged and developed intentionally.

Easy-to-use APIs are important because they are a requirement both for building integrated digital twins and for browsing digital twin data. Ease of use is, of course, subjective and varies from person to person, and it is important precisely in that sense: the APIs need to be usable by those who need the data in their work; it does not help if they are usable only by the most experienced developer in the company. Unfortunately, current methods to view the data from APIs are often cumbersome. For example, the user interface of the popular API browser "Postman" (Postman, Inc., San Francisco, CA, USA, https://www.postman.com/) relies heavily on typing text instead of offering clickable buttons. This text-based UI may be preferred by developers thanks to its versatility, but it is impractical for someone with little programming experience who just wants to browse the data behind the API. In contrast, the OPC UA client "UaExpert" (Unified Automation GmbH, Kalchreuth, Germany, https://www.unified-automation.com/products/development-tools/uaexpert.html) has a graphical interface that allows browsing by clicking through a tree-like structure. MindSphere offers a graphical web user interface for browsing historical data. We recommend that decision-makers demand this kind of usability from all API providers (both internal and external). To test usability, simply ask yourself: can you browse the data behind the API? If not, probably neither can the majority of your employees, which makes developers a constant bottleneck for data access throughout the organization. APIs offer a whole new view of operations and enable innovation, but only if they are accessible. We recommend treating APIs as investments.
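The kind of tree-like browsing that makes UaExpert approachable can be sketched for JSON APIs in a few lines. The helper and the sample payload below are our own illustration; the field names are invented and do not come from any real crane API.

```python
import json

def render_tree(node, indent: int = 0) -> str:
    """Render a JSON document as an indented text tree, mimicking the
    click-through browsing that graphical clients such as UaExpert offer."""
    pad = "  " * indent
    if isinstance(node, dict):
        return "\n".join(
            f"{pad}{key}:\n{render_tree(value, indent + 1)}"
            if isinstance(value, (dict, list))
            else f"{pad}{key}: {value}"
            for key, value in node.items()
        )
    if isinstance(node, list):
        return "\n".join(render_tree(item, indent) for item in node)
    return f"{pad}{node}"

# A made-up response a crane API might return; the nested structure,
# not the particular field names, is the point.
payload = json.loads('{"crane": {"hoist": {"position_mm": 2500}, "bridge": {"speed_m_s": 0.4}}}')
```

Printing `render_tree(payload)` shows the data as an indented hierarchy that a non-programmer can scan, which is exactly the usability gap between text-heavy API browsers and graphical clients.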

A purposeful use case is the starting point for actual digital twin development. The digital twin development for the Ilmatar crane started out by defining a common use case, which led to the formation of the bearing lifetime estimation case that had roles for all but one of the project partners. Other cases were defined among fewer participants and led to isolated twin applications. Combining all of these into one twin proved technically difficult and lacked a clear purpose. It should also be noted that the use cases were defined by what is possible with the currently available APIs, further highlighting their importance.

Cross-organizational leadership is required both when planning the use case and when implementing digital twin applications that cross organizational boundaries. The planning stage requires insight into the competencies of all participating organizations. During implementation, cross-organizational leadership helps to stay focused or to make the decision to change plans. Cross-organizational leadership is important for integrated digital twins because they combine the competencies and products of organizations in ways that differ from their old practices. The importance of leadership can be reduced by having strong interfaces and a clear purpose, or even by following the API-based business network framework (Section 4.3) so that everyone knows their role.

Open standards form the basis of the World Wide Web, and it seems that the network of digital twins will only materialize with a similar approach. It is important to distinguish between freely accessible open standards and paywalled traditional standards, as there seems to be a conflict between the proponents of these styles. Traditional standards may be overlooked by developers if they cannot be accessed easily, and some industries may not consider the Internet standards to be standards at all. Digital twins need them both in their mission to merge the physical world with cyberspace.

Open-source software is often overlooked by industries for various reasons, often unnecessarily. While there are cases for both open- and closed-source software, the unique benefits of open-source software, such as community creation and developer friendliness, should be taken into account when developing the company software portfolio. Many companies also use open source more than they think. For example, a vast majority of Internet servers run on Linux, so the question is not "Does your company use open-source software?" but rather "How much open-source software does your company use?" When you combine the facts that software companies already rely heavily on open source and that machines are becoming increasingly software-based, it seems inevitable that manufacturing companies will start using more open-source software. It is time for manufacturing companies to start preparing an open-source software strategy.

#### *4.6. Limitations*

The method of data collection used in this study (Participatory Action Research) is known to pose a risk of researcher bias. In an attempt to alleviate this risk and to pursue objectivity, we also presented matters that did not go well during the studied project; in fact, the main conclusions are based on the difficulties encountered. Nevertheless, the fact that the authors participated in the project may render them blind to some aspects that outsiders could see, and this should be acknowledged when reading the study. However, this data collection method allowed the authors to acquire deeper data than is possible through outside observation.

The method of theory formation used in this study (Grounded Theory) concentrates on new theory generation rather than theory evaluation. The conclusions of this study are supported by a limited amount of data, i.e., one industry–university project, and should therefore be evaluated later against more development experiences to see whether the observations were specific to the research project or applicable to the development of integrated digital twins in general.

#### **5. Conclusions**

This study presented, analyzed, and gave recommendations based on an industry–university project that developed a multi-component digital twin for an industrial overhead crane. The twin is built with two tools (OSEMA and OPC UA–GraphQL wrapper) and three frameworks (FDTF, DT core, and DT-PLM) developed during the project and consists of eight separate application cases built for the designers, maintainers, and operators of the crane. One use case was developed as an integration of multiple systems and stakeholders, but the rest of the applications were not merged into one coherent digital twin. It became clear that building integrated digital twins demands a lot of coordination work that may not be worth the trouble with current tools. Hence, the current lack or unsuitability of tools is seen as a major barrier to the development of integrated digital twins.

The cases indicate that user-friendly APIs speed up application development and are even a prerequisite for the innovation of applications. However, leveraging current APIs efficiently requires new skills from the workforce: first, every employee should attain an overall understanding of what can be achieved with APIs; second, those who can benefit from API data in their daily work need the technical know-how to use APIs as tools; and third, some employees need the technical skills to provide APIs as services to others. Currently, leveraging API data demands too much work, so we recommend investing in user-friendly interfaces and treating them as valuable digital infrastructure.

We formalized the concept of integrated digital twins and reviewed a series of eight digital-twin-related hypotheses, with mostly supporting results. We also presented experiences from making the research environment a public innovation platform, and we identified a novel API-based business and innovation network architecture style as a way to structure digital twin data supply networks.

In conclusion, our development experiences indicate that we should continue striving towards integrated digital twins, although more user-friendly tooling is required before they can be achieved.

**Author Contributions:** Conceptualization, J.A., M.V., V.P., and K.T.; methodology, J.A., M.V., and V.P.; software, J.M., R.A.-L., M.V., and J.A.; resources, J.A., R.A.-L., and M.V.; writing—original draft preparation, J.A. and R.A.-L.; writing—review and editing, J.A., R.A.-L., K.T., and M.V.; visualization, J.A., M.V., and J.M.; supervision, K.T.; project administration, J.A., V.P., and K.T.; funding acquisition, K.T. and J.A. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Business Finland grant number 8205/31/2017 "DigiTwin" and grant number 3508/31/2019 "MACHINAIDE".

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The processed data presented in this study are included in the article and cited works. Restrictions apply to the raw data of the development process due to confidentiality. Additional information is available from the corresponding author upon reasonable request.

**Acknowledgments:** The authors would like to thank all DigiTwin consortium members and those who presented or participated in project seminars. J.A. would like to thank KAUTE Foundation and Walter Ahlström Foundation. R.A.-L. would like to thank Tekniikan edistämissäätiö.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **References**


## *Article* **Developing a Digital Twin and Digital Thread Framework for an 'Industry 4.0' Shipyard**

**Toh Yen Pang 1,\* , Juan D. Pelaez Restrepo <sup>1</sup> , Chi-Tsun Cheng <sup>1</sup> , Alim Yasin <sup>1</sup> , Hailey Lim <sup>1</sup> and Miro Miletic <sup>2</sup>**

s3588698@student.rmit.edu.au (A.Y.); s3776055@student.rmit.edu.au (H.L.)

**\*** Correspondence: tohyen.pang@rmit.edu.au; Tel.: +61-3-9925-6128

**Abstract:** This paper provides an overview of the current state of the art in digital twin and digital thread technology in industrial operations. Both are transformational technologies that can improve the efficiency of current design and manufacturing. The digital twin is an important element of the Industry 4.0 digitalization process; however, the huge amount of data generated and collected by a digital twin poses challenges in handling, processing and storage. The paper aims to report on the development of a new framework that combines the digital twin and the digital thread for better data management in order to drive innovation, improve the production process and performance and ensure continuity and traceability of information. The digital twin/thread framework incorporates behavior simulation and physical control components; these two components rely on the connectivity between the twin and the thread for information flow and exchange to drive innovation. The twin/thread framework encompasses specifications that include organizational architecture layout, security, user access, databases and hardware and software requirements. It is envisaged that the framework will be applicable to enhancing the optimization of operational processes and the traceability of information in the physical world, especially in an 'Industry 4.0' shipyard.

**Keywords:** digital twin; digital thread; framework; shipyard; industry 4.0

#### **1. Introduction**

Digital twin (DTW) technology is a cornerstone of the digital transformation we are currently witnessing in the new Industry 4.0 revolution. DTW is more accessible now than ever, and many reputable and innovative companies, such as Tesla and Siemens, have adopted it with varying success. Siemens [1] has integrated DTW into the three major sections of its product lifecycle: product, production and performance. A virtual representation of the product is created and tested to validate performance under expected use conditions. Production is optimized through manufacturing process simulations in which any sources of error or failure can be identified and prevented before proceeding to physical production. Subsequently, DTW has the potential to improve performance by producing high-quality products at the lowest logical cost by integrating manufacturing processes and enhancing production planning in manufacturing implementation [2].

In addition to companies in the business of manufacturing products, organizations in other sectors, such as the National Aeronautics and Space Administration (NASA), a pioneer of the DTW, have used this technology to develop ultra-high-fidelity simulation models of aerospace vehicles. These simulations enabled NASA's engineering teams to accurately predict the future performance and status of their vehicles in the form of "factors of safety" during the design and certification phases. They also enabled mission managers to make informed decisions based on historical and real-time data and to improvise possible in-flight changes to a vehicle's mission [3].

**Citation:** Pang, T.Y.; Pelaez Restrepo, J.D.; Cheng, C.-T.; Yasin, A.; Lim, H.; Miletic, M. Developing a Digital Twin and Digital Thread Framework for an 'Industry 4.0' Shipyard. *Appl. Sci.* **2021**, *11*, 1097. https://doi.org/10.3390/app11031097

Received: 11 December 2020 Accepted: 21 January 2021 Published: 25 January 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

The global medical industry has been utilizing DTW to test medical devices virtually before introducing them into the physical world. For example, the Living Heart Project has adopted DTW to support cardiovascular surgeons in diagnosis, education and training [4]. The project is not limited to cardiovascular surgery but has positive implications for medical device design, clinical diagnosis and regulatory science. Its fundamentals involve the use of both pacemakers in live participants and virtual patients, with the goal of increasing industry innovation in tackling heart disease.

Further practical use of DTW in the medical industry relates to tailoring health care to individuals. In South Korea, DTW is being utilized in combination with Medical Artificial Intelligence to tailor healthcare plans to individual patients [5]. This, in conjunction with information on tracked health and lifestyle data from wearable devices, could eventually result in a "virtual patient." Virtual patient models allow medical personnel to perform continuous remote monitoring on patients at low-cost and provide health predictions and prescribe preventive treatments promptly. Through such interventions, South Koreans have benefitted from significant health improvements, reductions in healthcare costs and increased personal freedom in dealing with their own health.

Beyond healthcare, DTW is employed on a large scale in urban planning. For example, Virtual Singapore is a dynamic 3D city model [6] that consists of a detailed 3D map of Singapore and contains information such as texture and material representations of geometrical objects, terrain attributes and infrastructure and so forth. This 3D model is useful in virtual experimentation, virtual test-bedding, planning and decision-making and research and development.

Despite DTW now being accessible to most companies and governments, its adoption and uptake in Australian small- and medium-sized enterprises (SMEs) is still very slow. For most SMEs, tackling Industry 4.0 problems requires a number of enabling technologies, such as Product Lifecycle Management (PLM) software, enterprise resource planning (ERP) packages, the Internet of Things (IoT) and Cyber-Physical Systems (CPS), which communicate and cooperate with each other in real time. Unfortunately, it can be difficult for SMEs to integrate data into these systems when they have been developed by separate firms. Hence, the foundational knowledge, experience and potential of DTW has yet to become mainstream. There is also a gap in understanding the requirements, applicability, security and sustainability of such technologies.

There are many studies in the field of DTW, but very few have reported combined DTW and digital thread (DTH) technology in industrial transformations. The purpose of this paper is to report on the development of a new framework that combines the DTW and the DTH for better data management in order to drive innovation, reduce time to market and improve the production process and performance. First, we review the concept of DTW. Secondly, we consider its applicability in the context of the entire product life cycle. Thirdly, we describe the DTH and its entities. Fourthly, we discuss the development and integration of a DTW and DTH in a new framework and look at the components necessary for industry to embrace it. Finally, we provide an example of combining DTW and DTH in an Industry 4.0 shipyard to demonstrate how this new framework will work, particularly in the Australian context.

#### **2. The Digital Twin**

A DTW is commonly understood as a connection of data between a physical entity and its virtual representation, created for the purpose of improving the performance of the physical part using computational simulations and techniques [7]. The concept of DTW was first introduced more than a decade ago at the University of Michigan and was further developed by Michael Grieves [8]. Grieves described the DTW as a cycle of data between three components, that is, a physical object, its virtual model and the information processing hub that links the two. Grieves envisaged this new concept as a possible foundation of PLM and a new product-manufacturing method to fulfil desired design specifications [8]. Figure 1 depicts these three components (virtual representation, information hub and physical objects) of a DTW in an industry application.

**Figure 1.** An example of the application of digital twin in industry.
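Grieves' three-component cycle described above can be sketched minimally in code. The class names, the temperature attribute, and the 80-degree decision rule are our own illustrative assumptions, not part of Grieves' formulation.

```python
# Minimal sketch of the DTW data cycle: a physical object, its virtual
# model, and an information hub linking the two.
class PhysicalObject:
    def __init__(self) -> None:
        self.temperature = 20.0  # state measured from the real asset

class VirtualModel:
    def __init__(self) -> None:
        self.temperature = None  # mirrored state inside the twin

class InformationHub:
    """Moves data physical -> virtual and decisions virtual -> physical."""

    def sync(self, physical: PhysicalObject, virtual: VirtualModel) -> None:
        # Physical-to-virtual leg of the cycle: mirror the measured state.
        virtual.temperature = physical.temperature

    def advise(self, virtual: VirtualModel) -> str:
        # Virtual-to-physical leg: a decision derived from the model is
        # fed back to the physical side (threshold assumed for illustration).
        return "cool down" if virtual.temperature > 80.0 else "ok"
```

After `hub.sync(physical, virtual)`, the virtual model mirrors the physical state, and `hub.advise(virtual)` closes the loop with a recommendation, which is the cycle Figure 1 depicts.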

#### *2.1. Physical Environment*

The physical environment is the basis for developing the DTW [9,10]. Generally, the objects included in most of the studies are manufactured products such as vehicles, aircraft or 3D printers. Of key importance is the fact that the DTW is not solely limited to an object itself but often considers the environment and interactions with it. If the DTW is created for the optimization of the manufacturing process, then the purpose of the DTW in the product lifecycle must be specified [7,11–13].

#### *2.2. Virtual Space*

Virtual space is the first phase of creating a DTW and incorporates a 3D representation of the physical object, comprising the geometric model of the physical object, the virtual workers and the virtual environment in which the product is contained. The user should model and analyze the 3D product as it exists in the physical space and simulate it in the virtual space, including the movements of workers and products and how they interact. The user also needs to define the attributes and properties of the product and the corresponding rules of operation in the physical world and then simulate these in the virtual space. Once all these aspects have been successfully integrated into the DTW environment, the full virtual representation is considered complete.

#### *2.3. Information Integration*

Information collected from physical sources (suppliers, the product itself, organizational changes) is analyzed and integrated into the DTW during the data-integration phase. These data need to be analyzed and integrated into the DTW seamlessly. For example, a stock-taking DTW would need to understand the amount of stock left on the shop floor as physical objects and be able to translate this information to ensure up-to-date stock tracking. This is the step in which real-world data are integrated with the virtual representation to create a DTW.
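The stock-taking example above can be sketched as a small integration step in which counts observed on the physical shop floor are folded into the twin's stock view. The item names and the last-reading-wins update rule are assumptions for illustration only.

```python
# Hedged sketch of data integration for a stock-taking twin: the latest
# physical counts overwrite the twin's view, while items not covered by
# the current readings are kept unchanged.
def integrate_stock(twin_stock: dict, floor_readings: dict) -> dict:
    """Return the twin's stock view updated with the latest physical counts."""
    updated = dict(twin_stock)
    updated.update(floor_readings)
    return updated
```

For instance, `integrate_stock({"bolts": 100, "plates": 20}, {"bolts": 87})` corrects the bolt count to the observed value while keeping the plate count, illustrating how real-world data continuously refresh the virtual representation.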

#### *2.4. Current Digital Twin Application in the Industry*

DTWs attract interest from different industries' operational areas, such as product design, logistics, manufacturing and maintenance. DTWs can also be used to increase the efficiency and automation levels of manufacturing, maintenance and after-sales service (as shown in Figure 2) [7,14,15].

**Figure 2.** Application areas of Digital Twins according to Melesse et al. (reproduced from [15], Elsevier, 2020).

In the product concept, design and production phases, DTW can be a very useful tool in the manufacturing system. Studies show that DTWs have been used successfully to understand the performance and behavior of individual machines, making it easier to integrate the production line [16,17]. By leveraging the advantages of DTW, small manufacturing companies have achieved better performance in automation and adaptability to changes in customer orders or material properties such as hardness, strength and elasticity. These successes show that DTW can be used as a tool to increase efficiency in production planning and to optimize manufacturing implementation [16,17]. DTW has also shown potential in predictive maintenance, where, based on the information collected from the physical component, multi-physics simulations and data analysis are performed to predict future performance and possible failures. These predictions can be used to generate early warnings and to feed the maintenance plan continuously, thereby reducing the costs of unplanned disruption. However, these kinds of applications have not yet been widely adopted, and further research is needed to generalize them for wider use [9,15,18].
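An early-warning rule of the kind described above can be as simple as comparing a monitored signal against a degradation threshold. The vibration signal and the 4.5 mm/s threshold below are assumed values chosen for illustration, not figures from a real machine or standard.

```python
# Illustrative predictive-maintenance trigger: warn when the mean of the
# last three vibration readings exceeds an assumed threshold.
def early_warning(vibration_mm_s: list, threshold: float = 4.5) -> bool:
    """Flag maintenance when recent vibration levels exceed the threshold;
    too few readings means no warning can be issued yet."""
    if len(vibration_mm_s) < 3:
        return False
    recent = vibration_mm_s[-3:]
    return sum(recent) / len(recent) > threshold
```

In a real DTW, such a rule would be replaced by multi-physics simulation or a learned model, but the feedback pattern of feeding warnings into the maintenance plan is the same.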

In the services area, such as after-sales service, DTW can be used as an information tool to provide added value to customers by producing better predictions of the future behavior and the remaining lifetime of an asset and its components. DTWs can also be used to collect useful data to drive design modification, improve product performance and improve the overall production planning cycle [15,16,19]. Despite some successful applications, the methods and tools for implementing DTW in industry are still in their early stages of development and need more research. Also, many of the physical phenomena involved in the manufacturing of products such as aircraft, vehicles and machining tools are complex and hard to simulate, so more research is needed to develop better models. Additionally, the large amount of data that can be collected by a DTW introduces new challenges in data handling, processing and storage [20,21]; hence, a framework for building DTWs should address these challenges [9,14,15,22–25].

#### *2.5. Enabling Tools for Digital Twin*

In the literature [2,26], the enabling tools for DTW can be broken down into five categories: 1. tools for controlling the physical world; 2. tools for DTW modeling; 3. tools for DTW data management; 4. tools for DTW services applications; and 5. tools for connections in a DTW environment. A number of commercial application platforms provide various enabling DTW technologies from global companies, for example, Predix (General Electric Company, Boston, MA, USA), Thingworx (PTC Inc., Boston, MA, USA), MindSphere (Siemens, Munich, Germany), ANSYS (ANSYS Inc., Canonsburg, PA, USA), 3DEXPERIENCE (Dassault Systèmes®, Vélizy-Villacoublay, France), Altair (Altair Engineering, Inc., Troy, MI, USA), Oracle (©ORACLE, Austin, TX, USA), HEXAGON (MSC Software, Newport Beach, CA, USA) and SAP (Walldorf, Germany) [26].

#### **3. Traditional Product Lifecycle Management Approach**

In the traditional PLM approaches to product development, there are many user groups and stakeholders involved in creating and sharing information during the planning, design, production and service phases (Figure 3). Hence, many documents and a large amount of data are created to capture the decisions and results of PLM activities.

**Figure 3.** Traditional product life cycle management process.

Therefore, the engineers in any one team in the PLM process will continually work independently, importing files locally for modification and then exporting them for storage and future use. If subsequent user groups use different data management systems and software, the net result is that these iterations can be slow and time-consuming. The overall cost of data conversion from one part of the system to another becomes large and reduces overall value for money.

#### *3.1. Data Silos and Fragmented Information*

For decades, organizations have optimized each product life cycle phase separately from the others. As a result, the exchange of information and knowledge between life-cycle phases is highly fragmented [27,28]. Valuable information and knowledge are often lost and not used as context for decision-making in the transition phases; hence, there are information gaps in the product life cycle, especially in the design-to-manufacturing and design-to-service-and-maintenance stages. PLM is an iterative activity, so the management and exchange of information become crucial to ensure continuity of workflow, to support innovation-based models of competitiveness and to reduce the risks of failure [27,29–31].

#### *3.2. Digital Twins in Product Lifecycle Management*

In engineering PLM, the integration of DTW is a paradigm shift that can help companies establish better processes for managing all product lifecycle stages, from ideation through design, testing, certification, manufacture, operations and maintenance to, finally, disposal (Figure 4) [32,33]. With a DTW, thousands of processes and modifications can be modelled for all lifecycle phases of a product. Users can test different "what if" scenarios for changes in design, materials, manufacturing parameters, logistics and operational conditions, among others. Furthermore, the effects of these modifications on the other phases of the life cycle can also be assessed [34].

For example, a DTW enables detailed recording and storage of process data from the manufacturing stages and the immediate use of information about manufacturing difficulties, errors and part defects to identify critical manufacturing steps. Clients can also be offered customization to their needs, repair processes can be scheduled based on knowledge of the entire product operation history, and predictive maintenance of machine tools can deliver higher machine availability, considerably lower downtimes and faster response times [1,6].

**Figure 4.** Integration of Digital Twin application with the Product Life Cycle management.

#### **4. The Digital Thread**

A DTH refers to a data-driven architecture that links all information generated and stored within the DTW enabling it to flow seamlessly through the entire PLM phase from invention to disposal [10,35–37]. Mies et al. [38] described the process of a DTH in the context of additive manufacturing technologies. The DTH enabled data to be integrated into one platform, allowing seamless use of and ease of access to all data. Mies et al. hypothesized that additive manufacturing processes offer ideal opportunities to apply DTH as they rely heavily on new data-driven technologies.

Siedlak et al. [39] performed a case study on a DTH that was integrated into traditional aircraft design metrics. The use of DTH enabled the necessary multidisciplinary trades to link their data through common inputs and data flows, which facilitated integrated models and design analyses. It allowed the sharing of information between usually isolated organizations to enable a more time- and cost-efficient design process.

DTH is a multi-step process that complements the DTW over the entire lifecycle of the physical entity. It contains all the information necessary to generate and provide updates to a DTW [35]. It relies heavily on a well-designed framework that creates homogeneity and easy access to data through three main data chains: 1. the product innovation chain; 2. the enterprise value chain; and 3. the field and services chain (Figure 5).

**Figure 5.** The concept of digital thread to complement digital twin.

#### *4.1. Product Innovation Chain*

The product innovation chain is the first step in the initialization of the DTH. This is where the lifecycle of the product is created and stored for future needs. The product designs, process planning and design flow are integrated into the thread, which records the suppliers and the information created during the first development of the physical product.

#### *4.2. Enterprise Value Chain*

The enterprise value chain is the second step in the creation of the DTH and incorporates more sophisticated details of the production of the product. This is where supplier information is integrated into the thread, including how the supplier produced the parts, batch numbers and so forth. Other information on the parts, such as the materials used and manufacturing details, is also added. For this part of the thread, as much information can be added as the end-users require, including, if needed, the individuals who manufactured the parts, where the original materials came from and how they were obtained.

#### *4.3. Field and Service Chain*

Information related to maintenance and parts is found within the field and service section of the DTH. Information useful to the maintenance team and various suppliers is captured here, with maintenance manuals and part availability from suppliers being incorporated into the DTH.

#### *4.4. Key Technologies for Digital Thread*

Implementing the DTH across the three main data chains is challenged by the difficulty of aggregating disparate data in various formats from different systems and organizations throughout the product lifecycle [36]. Commercial software tools exist that support inter-operability and enable DTH applications: for example, ModelCenter (Phoenix Integration, Blacksburg, VA, USA), TeamCenter (Siemens, Plano, TX, USA), ThingWorx (PTC Inc.), 3DExperience (Dassault Systèmes®), Aras Innovator® (Aras, Andover, MA, USA) and Autodesk Fusion Lifecycle (Autodesk Inc., San Rafael, CA, USA) manage centralized data storage and integrate simulation models for optimizing product and system designs [40–42].

#### **5. New Digital Twin and Digital Thread Framework Development**

The importance of DTW and DTH is highlighted by academia and industry due to their virtual/real-world integration [9]. As DTWs can combine data collected from physical models with data from computational models and processes, using advanced prediction methods, the results can be used to improve the performance of the existing product or to produce improved versions in the future [7]. Product design, assembly, production planning and workspace layouts have also been found to be potential fields for twin/thread framework application [17,43].

The development of a new DTW and DTH framework (hereafter, the twin/thread framework) integrates the DTW and DTH steps and often requires more resources than building a DTW for the first time. The twin/thread framework has multi-layered stages (Figure 6), which require the developer to follow a loop-style iterative approach.

**Figure 6.** Digital Twin and Digital Thread framework for efficient product data management.

The new twin/thread framework comprises product design and physical asset components that are building blocks for establishing a centralized product data management (PDM) system. A PDM system ensures the inter-operability of the services and platforms involved in a project and helps to standardize file formats, adopt common data storage and representation approaches and impose version control on data files across platforms. In addition, the PDM system not only saves engineers and designers time in importing files from one platform and exporting them into another but also allows them to communicate and collaborate constantly with other stakeholders (i.e., a non-linear style approach) via a unified and consistent data representation framework, with the aim of delivering relevant data to the right person at the right time and in real time.
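As a minimal sketch of the version-control role such a centralized PDM system plays, the following Python fragment illustrates check-in/check-out with 1-based version numbers. The `PDMStore` class and file names are hypothetical illustrations; real PDM platforms track deltas, metadata and access rights rather than whole-file copies.

```python
class PDMStore:
    """Minimal sketch of a centralized PDM repository with version control."""

    def __init__(self):
        self._files = {}  # file name -> list of versions, oldest first

    def check_in(self, name, content):
        """Store a new version and return its 1-based version number."""
        versions = self._files.setdefault(name, [])
        versions.append(content)
        return len(versions)

    def check_out(self, name, version=None):
        """Return the latest version, or a specific 1-based version."""
        versions = self._files[name]
        return versions[-1] if version is None else versions[version - 1]


pdm = PDMStore()
v1 = pdm.check_in("bracket.step", "rev A geometry")  # version 1
v2 = pdm.check_in("bracket.step", "rev B geometry")  # version 2
latest = pdm.check_out("bracket.step")               # always the newest revision
```

Because every stakeholder checks data in and out of the same store, earlier revisions stay traceable instead of living in per-team local copies.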

The advantage of the new twin/thread framework is that users can use the DTW to set up virtual models, test scenarios to investigate where problems might occur and predict how to rectify them. The DTH adds the benefit of enabling all stakeholders to effectively communicate and share big data bi-directionally, upstream and downstream, throughout the entire product life cycle.

#### *5.1. Integration of a Model-Based Systems Engineering (MBSE) Approach to Support PLM*

Given the increasing volume of model-driven data across many industries, a new Model-Based Systems Engineering (MBSE) approach was introduced. MBSE uses a unified platform to support the requirements of design, analysis, verification, production and maintenance across all PLM activities [29]. MBSE uses a model-oriented approach (instead of a document-based approach) to support the exchange of information. Figure 7 provides an overview of the common tiers of an MBSE architecture. The lowest tier contains the data to be accessed and potentially used for analysis. Systems within the middle and top tiers provide functions and services that manage the translation and/or transaction of data between different organizations [36].

The decision-makers can also use MBSE to manage risks by defining proactive and reactive resilience strategies and contingency plans using the historical and real-time disruption data analytics to ensure business continuity [44].

**Figure 7.** Model-Based Systems Engineering framework to support the consistency of exchanging model-data.

#### *5.2. Behaviour Simulation*

The behavior simulation step requires an operating process in order to simulate a physical product in a virtual space. Key functions of the physical model are simulated and the response of the virtual product is examined. For example, in a stock-take model, the virtual model could simulate a real-life scenario of lost stock; the virtual model would then be required to find the supplier and order new stock to replenish the resources automatically, maintaining the flow and function of the overall product. Behavior simulation needs inputs from the DTH, with supplier information being integrated into the DTW. Once the behavior is simulated virtually, the system can move to the physical control and complete the twin/thread cycle.
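The stock-take example above can be sketched as a small behavior simulation. All names here (`VirtualStockModel`, `SupplierRecord`, the reorder rule) are hypothetical illustrations for this sketch, not part of any DTW platform API.

```python
from dataclasses import dataclass, field


@dataclass
class SupplierRecord:
    """Supplier details pulled from the Digital Thread (illustrative schema)."""
    name: str
    part_id: str
    lead_time_days: int


@dataclass
class VirtualStockModel:
    """Behavior-simulation sketch of a stock-take Digital Twin."""
    part_id: str
    on_hand: int
    reorder_point: int
    supplier: SupplierRecord
    pending_orders: list = field(default_factory=list)

    def simulate_loss(self, qty):
        """Inject a 'lost stock' scenario into the virtual model."""
        self.on_hand = max(0, self.on_hand - qty)
        self._replenish_if_needed()

    def _replenish_if_needed(self):
        """Automatically raise a replenishment order below the reorder point."""
        if self.on_hand <= self.reorder_point and not self.pending_orders:
            order_qty = self.reorder_point * 2 - self.on_hand  # simple target rule
            self.pending_orders.append(
                {"supplier": self.supplier.name, "part": self.part_id, "qty": order_qty}
            )


supplier = SupplierRecord("Acme Castings", "P-100", lead_time_days=5)
twin = VirtualStockModel("P-100", on_hand=40, reorder_point=25, supplier=supplier)
twin.simulate_loss(20)  # stock drops to 20, below the reorder point of 25
```

Running the scenario virtually lets the user verify the reorder logic before any physical control action is taken.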

#### *5.3. Physical Control*

Physical control is the last stage in the twin/thread framework and involves controlling and changing the physical system. The physical control brings the other steps together and produces a fully functioning DTW that can change and interact with the physical model. By incorporating sensing and controlling systems and linking them with the communication infrastructure, the physical model can be manipulated and changed within a virtual space. The behavior and structure of the physical world can be controlled manually or automatically through the DTW, and real-world changes can be analyzed and optimized through simulations. After the physical control has been executed, the DTW updates instantly to simulate the new physical model. For example, in an on-time stock delivery set-up, sensors would identify a low stock level of a product, the product would be ordered using the supplier information held in the DTH and the DTW would be updated with the amount of stock. Once delivered, the stock would return to 'normal' supply levels and the DTW would need to be updated immediately to reflect this change.
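A minimal sketch of the sense → update-twin → act loop described above, assuming a simple numeric stock sensor and a hypothetical `place_order` callback (both invented for illustration):

```python
class StockTwin:
    """Virtual stock level mirroring the physical store."""

    def __init__(self):
        self.level = 0
        self.status = "unknown"

    def sync(self, sensor_level, threshold=25):
        """Update the twin from a sensor reading and classify the state."""
        self.level = sensor_level
        self.status = "low" if sensor_level < threshold else "normal"
        return self.status


def control_cycle(twin, sensor_readings, place_order):
    """One pass of the physical-control loop: sense, update the twin, act."""
    for reading in sensor_readings:
        if twin.sync(reading) == "low":
            place_order(qty=50 - reading)  # replenish to a nominal target of 50


orders = []
twin = StockTwin()
# Readings over time: the third one (18) triggers a replenishment order,
# and the last reading (48) reflects stock returning to normal after delivery.
control_cycle(twin, [40, 30, 18, 48], lambda qty: orders.append(qty))
```

The twin is refreshed on every reading, so after the loop it already reflects the post-delivery physical state, mirroring the immediate-update requirement in the text.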

Once physical control is completed, the next iteration of the cycle begins and the DTW needs to be constantly updated in order to keep up with the workforce and the demanding needs of Industry 4.0.

The twin/thread framework also encompasses different aspects including organizational architecture layout, security, user access, data storage and hardware and software requirements, which will be addressed in the following sections.

#### *5.4. Organisational Architectures*

First, the organizational architecture needs to be developed in the system. This may be set up by the supplier of the software or in-house, depending on the users' needs. It includes the organization set-up, logos and context behind the DTW and, before starting the process, a clear outline of what the organization and its users need.

#### *5.5. Data Storage Requirements*

The software requirements for the twin/thread framework also need to be established so that data can be easily managed and imported into the various systems. Ideally, the software would support all the functions required in the DTW, including 3D modelling, product design chain flow, manufacturing details and service information. Whichever software is chosen, it should come with a service agreement with the vendor to ensure that any complications and issues can be resolved, enabling maximum efficiency and use of the software.

A large volume of data will be collected from a variety of sources during the entire PLM process. These data can be classified into three sets: 1. structured (i.e., data with specific formats such as digits, symbols, tables, etc.); 2. semi-structured (e.g., trees, graphs, XML documents, etc.); and 3. unstructured (e.g., texts, audios, videos, images, etc.) [45]. These data need to be stored in databases for further processing, analysis and decision-making. Big data storage technologies, such as distributed file storage (DFS), standard Structured Query Language (SQL) databases, NoSQL databases, NewSQL databases and cloud storage, can be applied according to the nature of the data [26,45].

The DTW model can be updated continuously with the newest data stored in the database via SQL queries or an online application programming interface (API). Interactive dashboards and other visualization tools, such as AR/VR goggles, can extract and consume data using the same mechanism.
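As an illustration of this update mechanism, the sketch below uses an in-memory SQLite database as a stand-in for the PLM data repository; the table layout, asset name and readings are assumptions for the example only.

```python
import sqlite3

# In-memory SQLite stands in for the PLM data repository.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sensor_data (asset_id TEXT, ts INTEGER, temperature REAL)"
)
readings = [("pump-01", 1, 61.2), ("pump-01", 2, 63.8), ("pump-01", 3, 66.1)]
conn.executemany("INSERT INTO sensor_data VALUES (?, ?, ?)", readings)


def refresh_twin(conn, asset_id):
    """Pull the newest reading so the twin reflects the latest physical state."""
    row = conn.execute(
        "SELECT ts, temperature FROM sensor_data "
        "WHERE asset_id = ? ORDER BY ts DESC LIMIT 1",
        (asset_id,),
    ).fetchone()
    return {"asset_id": asset_id, "ts": row[0], "temperature": row[1]}


state = refresh_twin(conn, "pump-01")
# A dashboard or AR/VR client would consume data through the same query path.
```

The same `refresh_twin` query can be polled by the DTW model and by visualization clients, which is the "same mechanism" point made in the paragraph above.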

#### *5.6. Hardware Requirements*

The hardware requirements also need to be established before developing the twin/thread framework. These requirements depend on the software the users will be running for the DTW (for example, 3DExperience) and the type of activities they will be undertaking with it. CPU-intensive tasks (such as design tools or CAD creation) need premium hardware to run the required software. As a number of companies offer the enabling software and technologies for DTW, users are recommended to refer to the vendor's certified hardware specifications. For example, Dassault Systèmes has its own certification process for workstations and laptops covering various manufacturers, models, operating systems, graphics cards and drivers. This ensures reliable operation and seamless integration of the DTW enabling software and removes hardware issues in running it. It is also recommended that the hardware be upgraded periodically to ensure smooth operation and functionality for all users. By investing time in development, the twin/thread framework will run effectively, free of compatibility and scalability issues. Without this investment, users may experience software instability and poor run-time performance, and productivity will suffer from a non-sustainable software environment in the long-term use of the twin/thread framework.

#### *5.7. Cyber Security Framework*

The next vital step is to set up and control cybersecurity for the twin/thread framework to ensure cyber resilience (Figure 8). The cybersecurity protocol contains three essential elements: 1. robust policies to maintain safeguards; 2. technologies that comply with security controls; and 3. training of staff to support organizational awareness [46]. Data security can be industry-specific and some industries may require more rigorous security measures than others. One measure that would ensure the safety of the information in the twin/thread framework is the implementation of ISO 27001 [47], an international security standard that provides a model for establishing, implementing, operating, monitoring, reviewing, maintaining and improving an information security management system. These security measures could be applied to all users who have access to the DTW on the server. Additional training is recommended for all users to ensure the utmost safety of the organizations and the information stored within the twin/thread framework. This is an integral stage in the framework's development, as it protects both users and suppliers from potential danger and IT crime [46].

**Figure 8.** Cybersecurity framework for the digital twin/thread system.

Identifying correct user access and creating an identity and access management (IAM) protocol for users are the next stages of the framework's development. This involves setting up correct access and roles for the right users, ensuring that each user accesses only the information and resources they need [48,49]. User authorization needs a further authentication step to ensure the security of the data. This can be achieved by adopting strong or multi-factor authentication options, such as security questions or email authorizations [48].
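A minimal sketch of such an IAM check, combining role-based permissions with a multi-factor authentication flag. The role names, permission strings and user records are hypothetical; a production IAM system would use a dedicated identity provider rather than in-code dictionaries.

```python
# Hypothetical role -> permission mapping for the twin/thread platform.
ROLE_PERMISSIONS = {
    "designer":   {"read_cad", "write_cad"},
    "maintainer": {"read_cad", "read_service_manuals"},
    "supplier":   {"read_bom"},
}


def authorize(user, permission):
    """Grant access only when a second factor passed AND the role allows it."""
    if not user.get("mfa_verified"):
        return False
    return permission in ROLE_PERMISSIONS.get(user["role"], set())


designer = {"name": "a.lee", "role": "designer", "mfa_verified": True}
supplier = {"name": "b.kim", "role": "supplier", "mfa_verified": False}
```

The check enforces least privilege (a designer cannot read the supplier-only bill of materials) and refuses access outright when multi-factor authentication has not been completed.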

#### *5.8. Proposed Architecture of Enabling Digital Twin/Thread Application*

Although the DTH spans the entire product life cycle, digital data continuity from the design to the maintenance stage, as well as between Original Equipment Manufacturers (OEMs) and suppliers, is limited. 'Discontinuity' of digital data and fragmentation of supply chain information may result from the use of multiple CAD software packages and/or PLM systems by OEMs, from cyber security and data sharing control requirements and from the lack of the required technology and digital skills among OEMs and suppliers.

A new enabling framework is, therefore, needed to link all information within the DTW to flow seamlessly through the entire product life cycle to support downstream processes in real-time and to address the challenges from design to manufacturing transition. The new enabling framework should have sufficient functionality, scalability and connectivity with customers and suppliers to ensure digital continuity and be easily integrated into the twin/thread framework.

In order to achieve digital continuity in the entire PLM, a platform and a set of software applications dedicated specifically to engineering design, verification and manufacturing are required. As noted in the literature [50,51], standardized design software, databases, tools and processes are key to success for big and complex projects that involve many stakeholders from many countries, ensuring digital continuity and traceability without causing costly mistakes and delays. Figure 9 provides an overview of the proposed architecture of a twin/thread system, which comprises organization/technical specifications, associated interface tools, PLM components, data analytics and the operation of the model-oriented MBSE approach. Each aspect of the proposed architecture is discussed in the following sub-sections.

**Figure 9.** Proposed architecture for enabling digital twin/thread application to enhance digital continuity and traceability.

#### 5.8.1. Digital Twin and Thread Application Suites

The top section of the framework includes a database, an application server and a thick client (i.e., software such as 3DExperience). The application server provides the interface between the database and internal and external clients [36]. The database contains interdisciplinary models, for example, CAD models, functional models and simulation models. Each of these models is created during the engineering process of a DTW using specific tools. Connectors are used to connect data and achieve the DTH across domains, applications, organizations, systems and systems-of-systems: 1. Open Services for Lifecycle Collaboration (OSLC) links, which establish traceability and analyze relationships between requirements, functions, resources, manufacturing and processes; 2. AUTomotive Open System ARchitecture (AUTOSAR), a standard for system specification and exchange that helps to improve the reusability of vehicle software architectures; and 3. the Unified Profile for the Department of Defense Architecture Framework (DoDAF) and the UK Ministry of Defence Architecture Framework (MODAF) (UPDM), a common software language used to describe defense architectures. The DTW and DTH are connected to the PLM data repository via the data acquisition interface.
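The traceability role of OSLC-style links can be illustrated with a small graph sketch. Real OSLC links are typed HTTP resources between lifecycle tools, so the plain string artifact IDs used here are a deliberate simplification for illustration.

```python
from collections import defaultdict


class TraceabilityGraph:
    """Sketch of link-based traceability between lifecycle artifacts."""

    def __init__(self):
        self.links = defaultdict(set)  # artifact ID -> downstream artifact IDs

    def link(self, src, dst):
        """Record a directed traceability link from src to dst."""
        self.links[src].add(dst)

    def trace(self, start):
        """Return every artifact reachable from `start` (downstream impact)."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in self.links[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen


g = TraceabilityGraph()
g.link("REQ-12", "FUNC-3")     # requirement realized by a function
g.link("FUNC-3", "CAD-77")     # function allocated to a CAD model
g.link("CAD-77", "PROC-NC-5")  # model consumed by a machining process
```

A `trace` from a requirement answers the impact-analysis question the connectors serve: which functions, models and processes are affected if that requirement changes.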

#### 5.8.2. Product Lifecycle Management Components

Users can employ the PLM features to configure the collaborative creation, management and dissemination of product-related information. These features allow users from different locations to work concurrently in real time on the same data, via a simple web connection to the twin/thread application suites. The integration of such features within the twin/thread suites allows users to optimize change management processes as well as minimize the impact on every stage of the lifecycle [52].

#### 5.8.3. Model-Based Systems Engineering (MBSE)

MBSE provides a common guideline on the management concept, system-to-system architecture and operational scenarios to promote concurrent model development and enhance the re-usability of model data. It aggregates the model data from engineering and manufacturing items and processes or from different organizations in the supply chain. With MBSE, users can employ modelling and simulation data to create DTWs of the physical assets in each step of the lifecycle journey. The DTH then links each DTW to the design of the corresponding physical system to ensure traceability links [52]. See Section 5.1 for details.

#### 5.8.4. Data Acquisition Interface

The data acquisition interface captures and stores sensor data and operational data collected from the real world. Through this interface, sensor and operational data can be transferred to the DTW, which subsequently allows users to perform dynamic behavior simulation in parallel with the real-world data. Technologies that can implement a data acquisition interface include, inter alia, Predix (General Electric Company), ThingWorx (PTC Inc.), MindSphere (Siemens) and 3DExperience (Dassault Systèmes®) [26].

#### 5.8.5. Organization and Technical Data

These data contain information about the physical asset itself. All documentation (e.g., requirements, specifications, design layouts, service manuals, maintenance reports, etc.) generated by all stakeholders throughout the entire product life cycle can be stored here [52].

#### 5.8.6. Operational Data

Real-world operational data can also be stored here via the data acquisition interface [53], such as: 1. sensor data, which continuously stream and record the current operation of an asset; 2. control data, which determine the current status of the real component; and 3. Radio-Frequency Identification (RFID) scanner data, which capture the current physical location of physical assets.
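The three operational data types can be sketched as simple record classes routed into a per-type store. The field names and the `ingest` helper are illustrative assumptions, not a standard schema.

```python
from dataclasses import dataclass


@dataclass
class SensorRecord:
    """Streamed measurement from a running asset."""
    asset_id: str
    ts: int
    value: float


@dataclass
class ControlRecord:
    """Current status of a real component."""
    asset_id: str
    ts: int
    status: str


@dataclass
class RFIDRecord:
    """Last scanned physical location of an asset."""
    asset_id: str
    ts: int
    location: str


def ingest(store, record):
    """Route a record into the per-type operational store."""
    store.setdefault(type(record).__name__, []).append(record)


store = {}
ingest(store, SensorRecord("pump-01", 100, 61.2))
ingest(store, ControlRecord("pump-01", 100, "running"))
ingest(store, RFIDRecord("hull-panel-7", 101, "dock-B"))
```

Keeping the three record types distinct but in one store mirrors the idea that the data acquisition interface aggregates heterogeneous operational data for the DTW.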

#### 5.8.7. Co-Simulation Interface

The co-simulation interface can be used to simulate the flow of the entire production system and manufacturing processes in the real world [53]. For example, a user can utilize a factory layout program in the application suite to create a DTW of an existing physical factory floor and then use the factory flow simulation interface to create the factory flow process, from the supply of raw materials to the final dispatch of end products. The user can begin the simulation by choosing a start location, the required resources (e.g., 3D objects used in simulation, raw materials, worker manikins, etc.) and manufacturing processes (e.g., conveyor belts, numerical control machines, robotic arms, etc.) for the designated tasks. While the simulation is running, the current state, utilization percentage, current capacity and total number of activities completed can be tracked. A system performance monitor in the simulation interface can display live information for the whole factory floor and all resources. This live information, which includes utilization, total activities completed, average bottleneck of resources and the current operational state of machinery, can provide unique and important support to customers and shareholders in strategic planning and optimization.
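A toy single-machine version of such a flow simulation, tracking completed jobs and machine utilization. A real co-simulation interface would model full routings, multiple resources and stochastic processing times; the deterministic job list here is invented for illustration.

```python
def simulate_flow(jobs):
    """Simulate one machine processing (arrival_time, processing_time) jobs
    in order. Returns (jobs completed, machine utilization over the horizon)."""
    clock, busy, completed = 0, 0, 0
    for arrival, duration in jobs:
        start = max(clock, arrival)  # wait for the machine or for the job
        clock = start + duration
        busy += duration
        completed += 1
    utilization = busy / clock if clock else 0.0
    return completed, round(utilization, 2)


# (arrival_time, processing_time) for four jobs on the virtual factory floor
jobs = [(0, 4), (2, 3), (10, 2), (11, 5)]
done, util = simulate_flow(jobs)
```

Even this minimal model exposes the kind of live figure the performance monitor reports: the machine is busy 14 of 17 time units, i.e., roughly 82% utilized, and idle gaps hint at where a bottleneck is *not*.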

#### 5.8.8. Big Data and Analytics

A platform is needed for reliable 'big data' storage and for performing data analytics to support decision-making. A large amount of data is generated and processed at every stage of the product lifecycle [54]. Large datasets can also come from various sources (e.g., computers, mobile devices, sensing technologies) [55,56]. Data analytics provides the capacity to analyze large and complex datasets, and project/process managers can gain greater insight to make informed decisions and implement actions by searching, discovering and processing patterns in big data [55]. When a product is manufactured, all relevant data, such as status data from machines or energy consumption data from manufacturing systems, are stored in and accessible from the DTW via the data acquisition interface. As a result, energy consumption optimization and better operational efficiency can be achieved. Such data also provide actionable insights for future decision-making.
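As one small example of such analytics, the sketch below flags machines whose latest shift's energy consumption exceeds a tolerance above their historical mean. The machine names, readings and threshold are invented for illustration; real analytics would run over the big data store rather than an in-memory dictionary.

```python
from statistics import mean

# Hypothetical per-machine energy logs (kWh per shift) pulled from the
# data acquisition interface; the last entry is the most recent shift.
energy_log = {
    "cnc-mill-1": [120.0, 118.5, 131.2, 119.8],
    "cnc-mill-2": [95.4, 97.1, 96.0, 140.3],  # last shift looks anomalous
}


def flag_anomalies(log, tolerance=1.25):
    """Flag machines whose latest shift exceeds tolerance x historical mean."""
    flagged = []
    for machine, readings in log.items():
        baseline = mean(readings[:-1])  # mean of all shifts except the latest
        if readings[-1] > tolerance * baseline:
            flagged.append(machine)
    return flagged
```

A flagged machine becomes an actionable insight of the kind the paragraph describes: a candidate for inspection or energy-consumption optimization before the anomaly turns into downtime.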

#### *5.9. Intellectual Property (IP)*

In a globalized environment where innovation is crucial, the main competitive advantage of organizations lies in the development of new ideas and intellectual property (IP). Throughout the phases of a product's lifecycle, many change iterations (e.g., changes in customer demands or amendments in design and optimization) and exchanges of highly sensitive information (e.g., IP on products and services, or personal information) take place between various user groups and stakeholders [57]. Historically, organizations have lacked integrated systems to manage their IP and have relied heavily on spreadsheets and manual documents. Managing IP protection therefore raises numerous challenges for organizations. The following sections elaborate on how the proposed approach can help to ensure IP continuity and protection.

#### 5.9.1. Intellectual Property Continuity

Traditionally, organizations use the "throw it over the wall" approach, where different teams work in isolation from each other. Once a task is completed, they hand over documents and 3D models to the next team. This approach does not address data silo issues, and information is often lost or lacks traceability [10]. The proposed twin/thread framework can play a significant role in modern product development and management. It provides a single, shared PDM platform to connect various user groups and stakeholders throughout the entire product lifecycle from concept to disposal. The PDM platform will: 1. allow users to have easy, quick and secure access to data in a central repository during the product design process and 2. enable users to support product development and management processes by sharing, updating and controlling the way users create, modify and monitor the flow of product-related information. Such processes occur during the entire product lifecycle and each stage involves dynamic interactions between entities that use the available information to generate new information and IP and share them further [57,58]. As such, the proposed twin/thread framework will transform the way organizations manage their information and IP by harmonizing all sources and types of data (of different formats, stored using different means and in different locations) to ensure digital continuity and traceability (Figure 10).

**Figure 10.** Management of intellectual property to ensure its continuity and filling the missing gaps.

As IP management maturity increases, companies can identify gaps related to engineering design, manufacturing planning, steps of a production process and service and maintenance over the lifecycle. With easily traceable information and knowledge, organizations can fill in the missing gaps to generate real growth possibilities [59].

#### 5.9.2. Intellectual Property Security and Protection

One issue is how to protect IP effectively from loss, leakage and theft. Through the adoption of a model-oriented MBSE approach with the twin/thread framework and with proper cyber-security measures in place, organizations can provide segregated access to internal and external clients (e.g., OEMs and suppliers). In this regard, a number of appropriate organizational and technical concepts for exchanging, managing and controlling access to information securely should be considered [49,60].


Depending on the business model and the needs of the organization, commercial software providers (as identified in Section 2.5) can provide consultation, implementation, integration, hosting and training services to control access to information from PDM and PLM platforms, or from a shared folder, for companies of different scales. This secure access can ensure the protection of IP and other proprietary lifecycle data [36], for example, the IP and the design of the product being fabricated, batch 'Bill of Materials' components and any processes being developed to fabricate the product [36,61].

#### **6. Industry 4.0 Shipyard**

The differences between implementing a DTW in the manufacturing and maritime domains have recently been studied [56]. The study showed that very few implementation frameworks have been developed for the maritime domain but found one promising framework with the basic requirements for a DTH solution. The study concluded that both domains are developing open platforms for DTW implementation and presented some useful real-world implementation examples of DTW [56].

DTW has also been proposed as a natural step from MBSE, with great possibilities of improvement in the production of highly complex products such as cruise ships. Among the advantages highlighted is the ease of collaboration amongst all teams involved in the ship design process [62–65]. Also, the ability to access information and manage it efficiently using an advanced interface could help develop efficient maintenance and training programs that, in time, can lead to higher operational performance levels [66,67].

Additionally, DTH has been identified as an alternative to traditional 2D drawings that allows shipbuilders to design and build their ships faster and better. DTH offers shipbuilders the possibility of having their employees and suppliers connected to and synchronized with their shipyard, production planning, customer orders and requirements, 3D models and every aspect of design [68].

However, the current DTW models applied in the shipbuilding industry show that only some of the components of the ship are represented in the DTW, which is understandable given the considerable number of sub-assets included in a modern ocean vessel. Including such a huge number of parts and their properties, interactions and performances in a model imposes great challenges. As DTH technology continues to mature, it will help the industry improve several aspects of its production processes through collaboration and constant communication of information [69].

#### *6.1. Proposing a Digital Twin/Thread Framework in an Australian Shipyard*

As a result of the progressive implementation of the smart and autonomous systems of Industry 4.0, the shipbuilding industry has developed a radical new paradigm in its manufacturing systems by integrating automated tools and processes, creating new demands for leaner production processes while increasing production efficiency, improving ship safety and reducing environmental impacts.

Furthermore, a very complex shipyard site contains large areas for the fabrication of ships, dry docks, slipways, warehouses, painting facilities and so forth; there are many moving goods, and many parts may look alike during the entire shipbuilding life cycle. Hence, ship operators need a relatively energy-efficient way of moving goods, and a means of accurately identifying and tracing them, in order to minimize impacts and improve productivity and safety in the shipbuilding process.

In developing Shipyard 4.0, we believe the right framework is required to assist in designing a virtual work environment using a highly detailed DTW, which could optimize the entire shipbuilding process by delivering the right information at the right time to avoid mistakes and increase productivity.

The concept of the DTH is typically defined as the flow of information that informs how a product moves through its design and production lifecycle (Figure 11). The implementation of DTH allows the monitoring of production in shipyard facilities and of the suppliers' production in their own plants. This provides greater product and process visibility to the ship builders, as well as greater transparency for the customer throughout the building process.

**Figure 11.** Digital Twin and thread implementation scheme for a shipyard.

A DTW of the shipyard will improve the efficiency of the factory flow when data can be extracted via behavior simulations of an operation process, such as machining time. Extracted data can be connected via DTH and fed into the DTW to identify bottlenecks. Furthermore, having a twin/thread framework where the DTW resides offers benefits in time management in scheduling and delivery. This relates to the inclusion of supply-chain data in the DTW. A fully integrated supply chain allows users to access the full spectrum of information available. A twin/thread framework could improve decision-making thanks to its single source of information.
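As a hypothetical illustration of this idea, once cycle-time data extracted from a behavior simulation are fed into the DTW, the station with the longest cycle time can be flagged as the bottleneck. All station names and times below are invented for the sketch and are not drawn from any real shipyard.

```python
# Hypothetical sketch: flagging a bottleneck from cycle times extracted
# via behavior simulation of a shipyard operation process.

def find_bottleneck(cycle_times):
    """Return the station with the longest cycle time, which limits throughput."""
    return max(cycle_times, key=cycle_times.get)

cycle_times = {            # minutes per unit, as a DTW simulation might report
    "plate_cutting": 42.0,
    "panel_welding": 67.5,
    "block_assembly": 55.0,
    "painting": 38.0,
}

print(find_bottleneck(cycle_times))  # the slowest station constrains the flow
```

In a full twin/thread setting, the same comparison would run continuously on live DTH data rather than on a static dictionary.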

In addition to improving manufacturing and design stages, DTW and DTH could enhance managerial decision-making processes. Provided that a true DTW of the shipyard is created, the information generated from all areas would be conducive to optimizing and achieving key performance indicators. Additionally, if the supply chain was integrated into the DTW, this information would give management information on what to expect, potential future issues and time to adjust to unforeseen circumstances.

Ships are normally built to last thirty years or more. Therefore, it is important to ensure the continuity and traceability of design-to-service and design-to-maintenance information until their final dismantling. After construction, ships continue to operate at sea and have impacts on the environment throughout their operational lives. The twin/thread framework, with integrated MBSE and big data, can help provide a way of dealing more effectively with environmental and other issues.

According to the Australian Naval Group's SEA 1000 program [70], a total of 12 submarines will be built, all of which are expected to be in operation by the mid-2050s. When considering future design aspects over the next 30 years, the DTW offers the opportunity to test and iterate designs via virtual testing, such as thermal and structural analysis, for improvement through its feedback loop processes. Legacy, historical and real-time data (maintenance history, sensor data, test results, etc.), connected via DTH to the physical ship, can subsequently be fed back into the design process and used to improve the design in case of unforeseen circumstances or newly identified areas of improvement.

#### *6.2. Roadmap for the Implementation of the Proposed Twin/Thread Framework*

The implementation of the twin/thread framework can be challenging for an organization. A clear understanding of the framework and careful planning are essential to deploy its applications effectively to meet the organization's requirements and needs and to prevent costly mistakes. A few global companies provide 'out-of-the-box' software applications and PLM solution suites for both DTW and DTH, including PTC Inc., Siemens, ANSYS Inc., Dassault Systèmes® and Autodesk Inc. For organizations interested in implementing the twin/thread suites as a means to improve efficiency, software providers normally offer consulting, implementation and support services that align with the customer's business requirements. While every implementation journey is unique, businesses can obtain the best results by following industry best practice and the methodology roadmap shown in Figure 12. The common phases for implementing an out-of-the-box twin/thread solution are: 1. access and definition; 2. design and build; and 3. deployment and support.

The very first step is for businesses to identify the requirements and needs for the twin/thread suites within the enterprise. A comprehensive understanding of an organization's own business processes and requirements can provide insight into how to set up the necessary organizational architectures to ensure the seamless flow of information outlined in Section 5.4. Once the requirements have been identified, it is time to engage a potential software provider and system integrator to put the plan into action by identifying the hardware, software and data storage requirements (Sections 5.5 and 5.6) and nominating the project team to champion the roles. Prior to full roll-out, it is important to design and build the distinct architectures of the twin/thread applications that are aligned to the organization's requirements. The proposed architecture (Section 5.8) can be used to guide implementation.

**Figure 12.** Design, build and implementation of the 'out-of-the-box' digital twin/thread product roadmap (modified from [71]).

While the twin/thread suites have effective hardware and software security processes, it is critical for organizations to consider the additional security measures outlined in Sections 5.7 and 5.9.2, especially in cases involving IP and new innovations, where the twin replicates its physical counterpart throughout the entire lifecycle [72]. Early participation from executive leadership, and a workforce well trained and educated in the twin/thread suites, are key to ensuring successful implementation in an organization. Finally, regular database-integrity checks and maintenance need to be considered before the application goes live, and beyond, so that any problems are detected and administrators can either restore from a backup or perform repairs.

#### *6.3. Operation and Sustainment of the Twin/Thread Framework in an Australian Shipyard*

With the adoption of the twin/thread framework, the shipyard industry can utilize DTW to transform the whole production lifecycle to ensure sustainability and to improve the performance of future programs [10,55]. For example, design engineers can leverage MBSE to work together with manufacturing engineers to create 3D models and simulations that link to real-time visualization for digital and physical production processes and instructions throughout the entire product value chain. The DTH will provide a platform to aggregate big data from disparate systems throughout the product lifecycle into actionable information through data analytics. With this deep insight from diagnostic, descriptive and predictive analytics, engineers, managerial teams and technicians can use the data to support decision-making [55].

For the proposed twin/thread framework to work, contracting authorities in the shipyard industry need the necessary hardware and software systems to facilitate multi-OEM participation in the DTH and to ensure the connectivity of data. Sustaining twin/thread frameworks will depend on continuing digital transformation, the endorsement of standardized tools and data exchange, and a better understanding of and agreements for upstream lifecycle functions to accommodate the needs of downstream functions [10].

#### **7. Conclusions**

DTW and DTH are two promising technologies that will allow the manufacturing industry to optimize its operational processes and the traceability of information across the physical and virtual worlds. However, from the literature reviewed in this article, it can be concluded that these technologies are still in their early stages and further research related to implementation is needed, especially in framework development and in data processing, storage and security.

At present, existing frameworks can perform only limited aspects of what a true DTW and DTH should be able to achieve. While a DTW is designed to cover the entire lifecycle of a physical part, from design through use to disposal, existing frameworks are largely focused on the design and creation stages only. Though some papers have referred to PLM in relation to DTH, which aims to connect data silos and isolated information elements to improve communication and collaboration, DTH technology that integrates seamlessly with DTW has yet to be successfully implemented.

The proposed twin/thread framework uses DTW to represent the enterprise chains (i.e., the product innovation chain, enterprise value chain and asset chain) and DTH to connect the enterprise data together to create digital continuity and accessibility. The advantage of the new twin/thread framework is that users can employ DTW to set up virtual models that simulate possible scenarios to predict future performance and possible failures. DTH adds the benefit of enabling all stakeholders to effectively communicate and share big data bi-directionally, upstream and downstream, throughout the entire product life cycle.

In order to adopt the twin/thread framework, OEMs need to define and adopt suitable technologies for product, process and resource modelling and validation, and then maintain a digital repository for depositing the numerous items of product, process and resource information within a single platform; to this end, the Model-Based Systems Engineering (MBSE) approach was introduced. The MBSE approach allows user groups and stakeholders to collaborate on a unified system, where they can share data, perform simulation and visualization of a highly detailed model of a future physical product and exchange information in the form of models instead of documents.

This will open avenues for accurate identification and easier traceability of information that will lead to improved efficiency and productivity. More significant is the possibility of iterative designs through feedback processes, which can shorten production lead times. This feedback is made possible through the DTH connecting the physical environment and DTW and is created through data extractions from both the physical and digital worlds. The same information that improves future design is used for management decisions.

In the context of the shipyard, the benefits of integrating a twin/thread framework into the established shipyard process span improved productivity and performance. Design engineers can leverage the DTW to test and iterate designs via virtual testing, such as thermal and structural analysis, for improvement through its feedback loop processes. The DTH will provide a platform to aggregate big data from multiple sources, such as maintenance history, sensor data and test results, throughout the product lifecycle into actionable information through data analytics, improving the performance of future programs.

**Author Contributions:** Conceptualization, T.Y.P., C.-T.C., M.M. and A.Y.; Methodology, J.D.P.R., A.Y. and H.L.; Software, C.-T.C. and M.M.; Investigation, T.Y.P., A.Y. and H.L.; Resources, J.D.P.R., A.Y. and H.L.; Writing-Original Draft Preparation, T.Y.P., C.-T.C., J.D.P.R., A.Y. and H.L.; Writing—Review & Editing, T.Y.P., C.-T.C. and M.M.; Visualization, J.D.P.R., A.Y. and H.L.; Supervision, T.Y.P., C.-T.C. and M.M.; Project Administration, A.Y., H.L. and J.D.P.R.; Funding Acquisition, T.Y.P. and C.-T.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Defence Science Institute (DSI), grant number CR-0032.

**Data Availability Statement:** No new data were created or analyzed in this study. Data sharing is not applicable to this article.

**Acknowledgments:** The authors thank the students who participated in this research. We also acknowledge the contributions of the staff of MEMKO Systems Pty Ltd.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **A Virtual Prototype for Fast Design and Visualization of Gerotor Pumps**

**Juan Pareja-Corcho <sup>1,2</sup>, Aitor Moreno <sup>2</sup>, Bruno Simoes <sup>2</sup>, Asier Pedrera-Busselo <sup>3</sup>, Ekain San-Jose <sup>3</sup>, Oscar Ruiz-Salguero <sup>1</sup> and Jorge Posada <sup>2,</sup>\***

20009 Donostia-San Sebastian, Spain; amoreno@vicomtech.org (A.M.); bsimoes@vicomtech.org (B.S.)
<sup>3</sup> Egile Innovative Solutions, Kurutz-Gain Polígono Industrial Pol., 12, 20850 Gipuzkoa, Spain; asier.pedrera@egile.es (A.P.-B.); ekain.sanjose@egile.es (E.S.-J.)

**\*** Correspondence: jposada@vicomtech.org; Tel.: +34-943-309-230

**Abstract:** In the context of generation of lubrication flows, gear pumps are widely used, with gerotor-type pumps being especially popular, given their low cost, high compactness, and reliability. The design process of gerotor pumps requires the simulation of the fluid dynamics phenomena that characterize the fluid displacement by the pump. Designers and researchers mainly rely on two methods: (i) computational fluid dynamics (CFD) and (ii) lumped parameter models. CFD methods are accurate in predicting the behavior of the pump, at the expense of large computing resources and time. On the other hand, lumped parameter models are fast and do not require CFD software, at the expense of diminished accuracy. Usually, lumped parameter fluid simulation is mounted on specialized black-box visual programming platforms. The resulting pressures and flow rates are then fed to the design software. In response to this situation, this manuscript reports a virtual prototype to be used in the context of a Digital Twin tool. Our approach: (1) integrates pump design, fast approximate simulation, and result visualization processes, (2) does not require an external numerical solver platform for the approximate model, (3) allows for the fast simulation of gerotor performance using sensor data to feed the simulation model, and (4) compares simulated data vs. imported gerotor operational data. Our results show good agreement between our prediction and CFD-based simulations of the actual pump. Future work is required in predicting rotor micromovements and cavitation effects, as well as further integration of the physical pump with the software tool.

**Keywords:** digital-twin; gerotor pump; hydraulic-systems; simulation; computer-aided design

#### **1. Introduction**

Gerotor pumps play an important role in the aerospace industry, particularly in cooling, lubrication, and fuel boost and transfer processes. In other sectors, gerotor pumps are used in a wide range of applications, such as dosing and filling technologies in pharmacy and medicine, and dispensing and coating technologies in manufacturing, among others. The popularity of such pumps in industrial applications arises from the fact that gerotor pumps represent a reasonable compromise in terms of compactness, reliability, cost, and versatility [1]. The working principle of a gerotor pump is based on the interaction between a pair of toothed gears with trochoidal envelope profiles. The relative movement between the profiles generates a series of chambers of varying volume that perform a cycle of suction and delivery actions (in interaction with input and output ports), thus effectively producing a volumetric flow (see Figure 1).

**Citation:** Pareja-Corcho, J.; Moreno, A.; Simoes, B.; Pedrera-Busselo, A.; San-Jose, E.; Ruiz-Salguero, O.; Posada, J. A Virtual Prototype for Fast Design and Visualization of Gerotor Pumps. *Appl. Sci.* **2021**, *11*, 1190. http://doi.org/10.3390/app11031190

Academic Editor: Andrew Y. C. Nee Received: 16 December 2020 Accepted: 25 January 2021 Published: 28 January 2021


**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/ licenses/by/4.0/).

**Figure 1.** Gerotor pump general architecture: (**a**) inner and outer gear, (**b**) inlet/outlet disposition in pump.

The current design process for gerotor pumps commonly involves: (i) a *geometric modeling* step in a CAD environment, (ii) a *design verification* phase using fluid mechanics simulations to validate the efficiency and other desired characteristics of the pump, and (iii) a *physical testing* phase to verify the predicted characteristics of the pump in a real test bench once the design has been validated through a simulation tool. This process can be considerably time-consuming, mostly because of the design verification stage: the design engineer must mesh the complex geometry of the volume chambers each time the design is changed and then perform a time-consuming simulation. In most design cases, the simulation of a geometric configuration takes up to a day to generate results. The described workflow hinders rapid design methodologies and the easy testing of a large number of geometric configurations of the pump in a reasonable time.

**Implementation**. In this manuscript, we present the implementation of a virtual prototype of a gerotor pump designed to be integrated with data measured in an experimental setup, in order to improve the established design process. Our implementation does not constitute a full Digital Twin, but rather a step towards a fully functional Digital Twin tool that reproduces the behavior of the real pump. This virtual prototype allows for a rough design-condition vs. performance appraisal, thus enabling design and testing scenarios. Once the designer is satisfied with this approximate design vs. performance trade-off, a more precise CFD simulation process would take place. An important current feature of the presented virtual prototype tool is the import and display of the subsequent CFD simulation results and of experimental data measured on a real pump, for the benefit of the designer, manufacturer and client. This feedback of the CFD simulation results might be included in a numerically oriented closed loop at the design stage; at present, we only report visual CFD data feedback. The implemented tool is able to use data measured in the experimental setup to feed the fast virtual prototype. Differences between the virtual prototype state variables and the measured state variables allow for several activities: (a) modifying the pump design, (b) controlling the actual pump, and (c) feeding satisfactory virtual prototype parameters into parametric or constraint-driven CAD models to obtain a full Boundary Representation of the gerotor pump. Notice that (c) streamlines the design-for-gerotor process and avoids the need for an external CAD application.

The manuscript is organized as follows: Section 2 reviews the available literature on the physical simulation of gerotor pumps and on Digital Twin implementations. Section 3 introduces the experimental setup of the pump, the lumped parameter model, and the virtual prototype tool. Section 4 presents the comparison between our predictions and a Computational Fluid Dynamics simulation used as ground truth. We do not address the comparison with respect to the experimental data, because we cannot measure the comparison variable in our experimental setup. Section 5 concludes the manuscript and discusses possible future developments in both the virtual prototype and its integration within a full Digital Twin tool.

#### **2. Previous Works**

In this section, we review the literature in two dimensions: (a) methods for fluid dynamics simulation in gerotor pumps and (b) implementations of virtual prototypes in Digital Twin oriented tools.

#### *2.1. Fluid Mechanical Simulation*

Several approaches have been proposed to simulate the performance of gerotor pumps, depending on the level of detail required. Most previous work relies on two methods for the fluid simulation: (i) lumped parameters models (LP) and (ii) computational fluid dynamics models (CFD), with each one exhibiting different performances regarding time and memory complexity.

*CFD models:* computational fluid dynamics models use specialized software to solve the Navier–Stokes equations in a discretized domain. CFD models can be classified in two categories: (i) two-dimensional (2D) models and (ii) three-dimensional (3D) models. Castilla et al. [2] and Houzeaux et al. [3] presented 2D CFD models for the simulation of rotary pumps that show accurate results with respect to an experimental setup. Recently, 3D simulations of the pump have been performed in order to analyze specific aspects of the pump's design, such as: (a) profile geometry optimization [4,5], (b) discharge coefficient calculation [6], and (c) fluid leakage due to clearances [7]. The main advantages of CFD-based methods are: (i) a detailed description of the fluid's behavior inside the cavity of the pump and (ii) very accurate prediction of the effect of cavitation and fluid–body interaction on performance. The main disadvantages of CFD methods are: (i) large simulation time and memory requirements, (ii) the requirement to remesh the entire domain in each step of the solution, and (iii) the difficulty of appropriately meshing the inter-teeth clearance domain [2].

*LP models*: lumped parameter models discretize the pump in a number of control volumes, where each CV (control volume) corresponds to a cavity of the pump. The mass and energy conservation equations are used to integrate the pressure in each control volume. The pressure inside each control volume will depend on the instantaneous volume of the chamber and net flowrate of fluid through its surface. Pellegri et al. [8,9] presented a simple lumped parameter model that was mounted on AMESIM software, coupled with a geometric module that calculates the instantaneous areas and volumes of the chambers. The results show good agreement between predicted and measured data. Shah et al. [10] presented a lumped parameter model in AMESIM software for the prediction of cavitation effects on the pump simulation; the results show that the model is accurate in predicting the effect of cavitation phenomena on the overall performance of the system. The main advantages of the lumped parameter approach are: (i) the low time and memory complexity and (ii) the flexibility to integrate with larger hydraulic circuits [1,8,9]. The main disadvantages of the lumped parameter approaches are: (i) the results are coarse with respect to CFD methods and (ii) calibration of the model vs. experimental data is needed, which makes this approach unsuitable for detailed analysis of local behavior of fluid [11].

#### *2.2. Digital Twins and Virtual Prototypes in Gerotor Applications*

Digital Twins are virtual abstractions of physical products, processes, or phenomena very commonly used in the context of Industry 4.0 [12]. Digital Twins are a valuable tool in digital design and manufacturing, as they allow for prediction of system performance and simulation/optimization. Relatively few applications of Digital Twin methodology are found in industrial contexts [13], opening opportunities for wider adoption of Digital Twins in industries, such as fluid power systems. The use of accelerated coarse simulations for fast decision making, although not being entirely similar to the concept of Digital Twin, is being explored in other industrial contexts, such as quality control in manufacturing [14,15].

The lumped-parameter models cited in the previous sections are usually implemented in specialized commercial software. This restriction limits their feasibility towards a fully functional Digital Twin tool that integrates data from an experimental test bench. In the case of lumped parameter models, the design engineer must express the pump in a CAD environment and then import the geometric data into a differential equation solver (e.g., AMESIM [9]). In the case of CFD models, several commercial codes are used in the solution of the Navier–Stokes equations, including PumpLinx, ANSYS Fluent, and CFX (all appearing in Ref. [11]). So far, we have found no standalone, fully integrated implementations of gerotor pump simulation environments that suit our design needs.

#### *2.3. Conclusions of Literature Review*

Two approaches are commonly used in the context of gerotor pump simulation: (i) Lumped Parameter (LP) models and (ii) Computational Fluid Dynamics (CFD) models. The lumped parameter models allow for the fast simulation of pump performance at the price of loss of accuracy and detail. CFD models allow for very accurate simulation of pump performance with detailed information regarding in-chamber heterogeneity, at the expense of large simulation time and complexity. Because of its accuracy, it is common to use CFD as a ground truth for pump experiments when no experimentally obtained comparison data are available. We use CFD as our point of comparison for the reasons expressed above and because of the availability of CFD simulation software. Furthermore, we found that implementations of both approaches are: (i) dependent on proprietary commercial software and (ii) not easily integrated with other standalone, non-commercial design and optimization tools in the context of Digital Twin tools.

As a response to such shortcomings, we present the implementation of a virtual prototype for a gerotor pump, which also allows for the integration of measured data, thus enabling the functioning of a Digital Twin. Our implementation: (1) integrates pump design, fast approximate simulation, and result visualization processes, (2) does not require an external numerical solver platform for the approximate model (as other approaches do), (3) allows for the fast simulation of gerotor performance, and (4) feeds the simulation model with data measured in an experimental setup to improve the accuracy of the model. Several variables can be used to assess the pump behavior, including, among others, maximum pressure, torque, and power. We use the maximum pressure in the pump as our comparison variable since (1) it is a variable of interest for the pump manufacturer and the variable that our model predicts, and (2) the comparison with other variables (e.g., torque) would require specialized proprietary software, which is not available to the industry manufacturer. See Implementation in the Introduction section.

#### **3. Methodology**

For our Digital Twin (DT) implementation, we have defined a three step workflow: (i) the design engineer inputs the values for the parameterization of pump design into the Geometry Configurator, (ii) the geometric model of the pump (inner and outer gears) is generated, and (iii) the virtual prototype performs the fast simulation of the pump with the geometric data and experimental data being measured from the test bench (see Figure 2). Please see the Abbreviations section immediately before the References section to find the meaning for the symbols used.

**Figure 2.** Implemented Tool Architecture and Workflow.

This section is divided into four parts: (1) we present the experimental setup, where we show the data collection arrangement used to feed the virtual prototype tool, (2) we explain the geometric model, showing the generation of the gerotor geometric model and the calculation of geometric quantities such as the history of chamber volumes and areas, (3) we present the fluid dynamics module, where we lay out the foundations of the simulation model, and (4) we discuss the software tool that integrates the virtual prototype model, including 3D visualization, with the data collected from the experimental setup and external CFD simulations.
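The three-step workflow can be sketched as a minimal pipeline. All class and function names below are our own illustrative assumptions and do not correspond to the actual tool's API; the parameter values are placeholders.

```python
from dataclasses import dataclass

@dataclass
class PumpParameters:
    """Design parameters entered in the Geometry Configurator (step i)."""
    R2: float   # radius locating the outer circumference centers
    r2: float   # outer-tooth circumference radius
    e: float    # eccentricity
    Z: int      # number of chambers

def generate_geometry(params: PumpParameters) -> dict:
    """Step (ii): placeholder for inner/outer gear profile generation."""
    return {"params": params, "profiles": None}

def fast_simulation(geometry: dict, port_pressures: tuple) -> dict:
    """Step (iii): placeholder for the lumped-parameter fast simulation,
    fed with pressures measured at the test-bench ports."""
    return {"geometry": geometry, "port_pressures": port_pressures}

# The design engineer supplies the parameterization; the bench supplies pressures.
design = PumpParameters(R2=28.0, r2=10.0, e=3.0, Z=7)          # placeholder values
report = fast_simulation(generate_geometry(design), port_pressures=(1.0e5, 5.0e5))
```

The real tool replaces the two placeholders with the geometric module of Section 3.2 and the fluid dynamics module of Section 3.3.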

#### *3.1. Experimental Setup*

The contact point between the physical pump and the computational modeling is the testing bench. The experimental setup hosts the sensors that are used to collect performance data and feed it to the simulation model. Figure 3 presents the testing bench setup used and a manufactured pump mounted in the bench with a translucent cover.

**Figure 3.** Experimental setup: (**a**) testing bench and (**b**) physical pump mounted on the test setup (translucent cover).

Pressure sensors located at the input and output ports measure the pressure values fed to the virtual prototype (as shown in Figure 2). The integration with the simulation model of testing-bench data for variables other than pressure is still to be addressed.

#### *3.2. Geometric Model*

The internal profile of the gerotor is generated according to the parameterization proposed by Ref. [16]:

$$x_i(\alpha_{pc}) = R_2 \cos\left(\frac{1}{Z-1}\,\alpha_{pc}\right) \pm e \cos\left(\frac{Z}{Z-1}\,\alpha_{pc}\right) - \frac{S}{m}\left[R_2 \cos\left(\frac{1}{Z-1}\,\alpha_{pc}\right) \pm r_2 \cos\left(\frac{Z}{Z-1}\,\alpha_{pc}\right)\right] \tag{1}$$

$$y_i(\alpha_{pc}) = -R_2 \sin\left(\frac{1}{Z-1}\,\alpha_{pc}\right) \mp e \sin\left(\frac{Z}{Z-1}\,\alpha_{pc}\right) + \frac{S}{m}\left[R_2 \sin\left(\frac{1}{Z-1}\,\alpha_{pc}\right) \pm r_2 \sin\left(\frac{Z}{Z-1}\,\alpha_{pc}\right)\right] \tag{2}$$

$$m = \sqrt{r_2^2 + R_2^2 \pm 2\,r_2 R_2 \cos\alpha_{pc}} \tag{3}$$

where the parameter *αpc* ∈ [0, 2*π*] corresponds to the turning angle of the rotor. Figure 4 shows the resulting shape of the internal profile as the trace of the contact point *P*′, whose position is determined by the radii *R*<sup>2</sup> and *r*2, the eccentricity *e*, and the number of chambers *Z*. In Figure 4, the external profile is determined by a set of outer circumferences truncated by a larger cutting circumference.
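As an illustration, Equations (1)–(3) can be sampled numerically. The sketch below fixes the upper sign of each ±/∓ pair (which branch applies depends on the generation convention of Ref. [16]), and the geometric constant *S*, defined in the paper's Abbreviations section, is passed as a plain argument; all numeric values are placeholders.

```python
import math

def internal_profile(R2, r2, e, S, Z, n=720):
    """Sample the internal gerotor profile per Equations (1)-(3),
    taking the upper sign of each +/- pair (an assumption here).
    Returns the sampled (x, y) points over alpha_pc in [0, 2*pi]."""
    pts = []
    for k in range(n):
        a = 2.0 * math.pi * k / (n - 1)                              # alpha_pc
        m = math.sqrt(r2**2 + R2**2 + 2.0 * r2 * R2 * math.cos(a))   # Eq. (3)
        c1, cZ = math.cos(a / (Z - 1)), math.cos(Z * a / (Z - 1))
        s1, sZ = math.sin(a / (Z - 1)), math.sin(Z * a / (Z - 1))
        x = R2 * c1 + e * cZ - (S / m) * (R2 * c1 + r2 * cZ)         # Eq. (1)
        y = -R2 * s1 - e * sZ + (S / m) * (R2 * s1 + r2 * sZ)        # Eq. (2)
        pts.append((x, y))
    return pts

profile = internal_profile(R2=28.0, r2=10.0, e=3.0, S=9.0, Z=7)      # placeholders
```

At *αpc* = 0 the sampled point reduces to (*R*2 + *e* − *S*, 0), a quick consistency check against Equations (1) and (2).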

**Figure 4.** Parameterization of internal profile shape as in Equation (2).

This construction method of the external profile, even though simple and widely used, limits the performance of the pump, as the resulting shape does not mesh perfectly with the internal profile shape [17]. We have implemented an additional method to build the external profile as the conjugated curve of the internal shape.

Suppose the curve *C* that corresponds to the internal profile (Equations (1) and (2)) is put through a series of affine transformations that are defined by the rolling without slipping of the circumference defined by *r*<sup>1</sup> with respect to the circumference defined by *r*<sup>2</sup> in Figure 4. Subsequently, the external profile shape will be defined as the envelope curve of the locus of *C* as it moves through the rotation domain.

Figure 5 shows the locus of curve *C*, as generated by the movement of the circumferences. The envelope curve of the locus can be used as the external profile shape, with the advantage that by using this external shape both curves mesh perfectly, hence improving the performance by avoiding fluid recirculation. Once the inner and outer profile shapes are defined, the geometric quantities of each chamber are calculated by sampling both the internal and external curve to form a closed polygon, finding the area *A<sup>i</sup>* and perimeter *P<sup>i</sup>* of the polygon corresponding to chamber *i*.
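A minimal sketch of that last step, assuming the chamber boundary has already been sampled into an ordered list of vertices; the shoelace formula gives the area of the closed polygon, and summing edge lengths gives its perimeter.

```python
import math

def polygon_area_perimeter(pts):
    """Area (shoelace formula) and perimeter of a closed polygon given as an
    ordered list of (x, y) vertices; an illustrative way to approximate the
    chamber quantities A_i and P_i from sampled profile points."""
    area2, perim = 0.0, 0.0
    n = len(pts)
    for k in range(n):
        x0, y0 = pts[k]
        x1, y1 = pts[(k + 1) % n]          # wrap around to close the polygon
        area2 += x0 * y1 - x1 * y0         # twice the signed area
        perim += math.hypot(x1 - x0, y1 - y0)
    return abs(area2) / 2.0, perim

# Sanity check on a unit square: area 1.0, perimeter 4.0
A, P = polygon_area_perimeter([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)])
```

The approximation sharpens as the sampling density of the internal and external curves increases.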

**Figure 5.** Locus of the internal profile curve *C*.
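The chamber area and perimeter computation described above reduces to the shoelace formula applied to the sampled closed polygon. A minimal Python sketch (the sampling of the profile curves is assumed to be done elsewhere):

```python
import math

def polygon_area_perimeter(points):
    """Shoelace area and perimeter of a closed polygon given as (x, y) vertices.

    The chamber boundary is formed by sampling the internal and external
    profile curves; `points` is assumed to be ordered along the boundary.
    """
    n = len(points)
    area2 = 0.0   # twice the signed area
    perim = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]   # wrap around to close the polygon
        area2 += x0 * y1 - x1 * y0
        perim += math.hypot(x1 - x0, y1 - y0)
    return abs(area2) / 2.0, perim
```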

#### *3.3. Fluid Dynamics Module*

We discretize the flow domain into several control volumes to obtain a lumped parameter model of the pump, as shown in Figure 6. We assume the fluid properties within each control volume (CV) to be homogeneous, but not constant in time, effectively treating each control volume as the basic simulation domain. Notice that, as the pump rotates, the geometry of the control volumes changes; therefore, the model requires the geometric quantities (area, perimeter) of each control volume to be updated continuously as the pump position changes (see Figure 2).

**Figure 6.** Control Volume discretization of the gerotor pump.

By the principles of conservation of mass and energy, together with the Reynolds transport theorem, an expression for the pressure change within a control volume can be derived:

$$\frac{dp}{dt} = \frac{\beta_{eff}(p)}{V(\theta)} \left( \sum Q_i - \omega \frac{dV}{d\theta} \right) \tag{4}$$
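For illustration, Equation (4) can be advanced in time with a simple explicit Euler step. This sketch assumes the net flowrate and the geometric quantities are supplied externally; the integrator actually used by the tool is not specified in the text, and all names here are our own.

```python
def pressure_step(p, Q_net, V, dV_dtheta, omega, beta_eff, dt):
    """One explicit-Euler step of Equation (4) for a control-volume pressure.

    Q_net      : net flowrate into the CV (sum of port and gap flows)
    V, dV_dtheta : chamber volume and its derivative w.r.t. rotor angle,
                   both supplied by the geometric module
    omega      : rotor angular speed
    beta_eff   : effective bulk modulus at pressure p (Equation (11))
    dt         : time step
    """
    dp_dt = (beta_eff / V) * (Q_net - omega * dV_dtheta)
    return p + dp_dt * dt
```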

We omit the derivation of this expression, as it is beyond the scope of this paper; the interested reader can find a thorough treatment in Ref. [11]. To integrate Equation (4), the net flowrate through the boundary of each control volume must be calculated, since the volume and its derivative at angle *θ* are already provided by the geometric module. We consider two types of flows through the boundary of a control volume (Figure 7): (i) the flow between the control volume and the input/output ports, and (ii) the leakage flow between adjacent control volumes.


**Figure 7.** Types of flows through the boundary of a control volume.

The flow between a control volume and the input/output port is modeled as flow through a variable-geometry orifice subject to a pressure difference. The pressures at the input and output ports are fed to the numerical model from data collected on the experimental test bench, while the control-volume pressure evolves according to Equation (4). The flowrate in this situation is obtained as:

$$Q_{in} = C_d A_{i,in} \sqrt{\frac{2(P_i - P_{in})}{\rho_{eff}}} \tag{5}$$

$$Q_{out} = C_d A_{i,out} \sqrt{\frac{2(P_i - P_{out})}{\rho_{eff}}} \tag{6}$$

Notice that, since a control volume interacts with only one of the ports at any given time, the net port flowrate *Q* for a control volume equals *Q*<sub>in</sub> or *Q*<sub>out</sub>, depending on the position of the control volume at the time of analysis. The discharge coefficient *C*<sub>d</sub> depends on the Reynolds number *Re* and the hydraulic diameter *D*<sub>h</sub> at that time, which are calculated as follows:

$$D_h = \frac{4A_i(\theta)}{P_i(\theta)} \tag{7}$$

$$Re = \frac{D_h}{\nu} \sqrt{\frac{2\Delta P}{\rho_{eff}}} \tag{8}$$

Finally, the discharge coefficient *C*<sub>d</sub> is estimated using an empirical expression (Ref. [11]):

$$C_d = C_{d,max} \tanh\left(\frac{2Re}{Re_{crit}}\right) \tag{9}$$

where *C*<sub>d,max</sub> is the maximum discharge coefficient and *Re*<sub>crit</sub> is the critical Reynolds number, which marks the transition between the laminar and turbulent regimes. Values for the constants *C*<sub>d,max</sub> and *Re*<sub>crit</sub> can be found in the literature as a function of the operating conditions of the pump [8].

The second type of flow, between adjacent control volumes that should nominally be sealed, is enabled by the small gap between the rotors at their position of closest approach. These gaps are necessary to allow rotation and to limit friction and wear. The resulting fluid migration between adjacent chambers is caused by: (1) the pressure difference between the chambers (Poiseuille flow, Figure 8) and (2) the relative motion of the rotor surfaces (Couette flow, Figure 9).


Typically, the gap between the rotors at the contact points is very small compared to the overall size of the pump, and the curvature radii at the throat are much larger than the throat gap. Therefore, flows (1) and (2) above may be modeled by assuming that the approaching teeth form a constant-clearance gap between two parallel plates (Figures 8 and 9).

**Figure 8.** Poiseuille flow between adjacent control volumes.

Figure 8 shows the working principle of the Poiseuille flow in the pump case, where two static plates of length *l*<sub>t</sub> and width *b* are separated by a distance *h*<sub>t</sub>. The pressure difference between adjacent control volumes induces a flow *Q*<sub>p</sub> that can be obtained as:

$$Q_p = b\, \frac{\Delta P \left(\frac{h_t}{2}\right)^3}{12 \mu\, l_t} \tag{10}$$

Notice that the width *b* corresponds to the length of the pump profiles in the *z* direction. The distance *h*<sub>t</sub> is estimated by the geometric module, as it may vary for each contact point throughout the rotation of the rotors. Since the length *l*<sub>t</sub> cannot be measured directly in the geometric model, we estimate it as a function of *h*<sub>t</sub>: starting from the point of minimum distance *h*<sub>t</sub>, we move outwards along the profile curves to the points where the distance between the rotors is *h*<sub>t</sub>\* = (1 + *ε*)*h*<sub>t</sub>, and take *l*<sub>t</sub> as the Euclidean distance between these points. This approximation has been shown to be effective for values of *ε* around 0.1 [8].
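Equation (10) translates directly into code. The following Python sketch is illustrative; the function and parameter names are our own, and the gap length *l*<sub>t</sub> is assumed to have been estimated beforehand by the procedure described above.

```python
def poiseuille_leakage(b, dP, h_t, l_t, mu):
    """Equation (10): pressure-driven (Poiseuille) flow through the tooth gap.

    b   : length of the pump profiles in the z direction (pump width)
    dP  : pressure difference between adjacent chambers
    h_t : minimum gap between the approaching teeth
    l_t : effective plate length, estimated from the (1 + eps) * h_t rule
    mu  : dynamic viscosity of the fluid
    """
    return b * dP * (h_t / 2.0) ** 3 / (12.0 * mu * l_t)
```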

**Figure 9.** Couette flow between adjacent control volumes.

Figure 9 shows the working principle of the Couette flow: two parallel plates with a relative velocity drive a flowrate between them through the viscosity of the fluid and the shear stress induced by the relative motion. The Couette and Poiseuille flows (Figures 8 and 9) both produce a fluid exchange between adjacent control volumes, thereby affecting the net flowrate through their boundaries and hence their pressures. Finally, since the pump usually operates in a low pressure range, the variation of the effective fluid properties (bulk modulus, density) of the hydraulic oil with the instantaneous pressure [18] inside a control volume must be taken into account:

$$\beta_{eff} = \frac{\beta_{oil}}{1 + \alpha \left(\frac{p_0}{p}\right)^{\frac{1}{\kappa}} \left(\frac{\beta_{oil}}{\kappa\, p} - 1\right)} \tag{11}$$

$$\rho_{eff} = \frac{\alpha\, \rho_{air,0} + (1 - \alpha)\, \rho_{oil,0}}{\alpha \left(\frac{p_0}{p}\right)^{\frac{1}{\kappa}} + (1 - \alpha) \left(1 + \frac{m (p - p_0)}{\beta_{oil}}\right)^{-\frac{1}{m}}} \tag{12}$$

where *β*<sub>oil</sub> is the bulk modulus of the oil and *ρ*<sub>air,0</sub> and *ρ*<sub>oil,0</sub> are the densities of air and oil at atmospheric conditions; *p*<sub>0</sub> is the atmospheric pressure, *α* is the void fraction, and *κ* is the polytropic constant of air.
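Equations (11) and (12) can be evaluated as follows. This Python sketch is illustrative; in particular, the interpretation of the exponent *m* as a constant of the oil compressibility model of Ref. [18] is an assumption, since the text does not define it.

```python
def beta_eff(p, p0, alpha, kappa, beta_oil):
    """Equation (11): effective bulk modulus of the air-oil mixture."""
    return beta_oil / (1.0 + alpha * (p0 / p) ** (1.0 / kappa)
                       * (beta_oil / (kappa * p) - 1.0))

def rho_eff(p, p0, alpha, kappa, m, rho_air0, rho_oil0, beta_oil):
    """Equation (12): effective density of the air-oil mixture.

    `m` is assumed to be the exponent of the oil compressibility model
    used in Ref. [18] (not defined in the excerpt).
    """
    num = alpha * rho_air0 + (1.0 - alpha) * rho_oil0
    den = (alpha * (p0 / p) ** (1.0 / kappa)
           + (1.0 - alpha) * (1.0 + m * (p - p0) / beta_oil) ** (-1.0 / m))
    return num / den
```

With a zero void fraction (pure oil), Equation (11) collapses to the oil bulk modulus and Equation (12) returns the atmospheric oil density at *p* = *p*<sub>0</sub>, which gives a quick sanity check of the implementation.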

#### *3.4. Software Tool*

The Digital Twin (DT) tool implements two functionalities: a *2D geometry configurator* and a *3D data tool*. The 2D geometry configurator allows the design engineer to define a new geometry for the profiles of the inner and outer gears according to a set of parametric variables. The 3D data tool converts the model defined in the 2D geometry configurator into a full B-Rep model for the simulation and visualization of both (i) data produced by our pre-CFD simulation model and (ii) data imported from CFD simulations or the test bench.

Figure 10 shows the visualization of the parameterized pump in the interface of the 2D geometry configurator tool. The tool allows the design engineer to input the desired values for the parametric variables of Equation (2). Our application automatically generates both the conjugated design (Figure 10a) and the classic design (Figure 10b). Both types of design, as well as other geometric configurations, are easily explorable in the 2D visualization canvas.

**Figure 10.** Visualization of parameterized pump in the two-dimensional (2D) geometry configurator interface: (**a**) pump with conjugated external profile and (**b**) pump with classic external profile.

In addition to the discretized geometric model of the inner and outer gears, the geometry configurator calculates the shape of the resulting chambers for any given angle of the rotation of the pump. The configurator allows the user to interactively rotate the pump position with a slider, as well as to visualize and record the change in the area of each one of the chambers.

The port and chamber geometries must be well aligned for the correct calculation of the intersection area between chambers and ports. To ensure this alignment, the geometry of the input/output ports (shown in Figure 11) is calculated from the inner and outer gear geometries. The geometries calculated by this application (gear, chamber, and port) are automatically imported into the 3D data tool for simulation purposes; the CAD models can also be exported to external files.

**Figure 11.** Gerotor design with input/output ports.

Figure 12 shows the interface devised for the toolkit introduced in [19,20] to enable the virtual prototype. The interface is built with the Qt™ library and comprises several panels: (i) the *3D visualization panel*, which displays the pump geometry (gears, ports, and chambers) and animates the rotation of the pump through a pumping cycle; (ii) the *hierarchy panel*, which shows the hierarchy tree of the geometric model and allows the user to select and highlight geometric entities; (iii) the *settings panel*, in which the user inputs the fluid properties, operating conditions, and initial pressure conditions needed to perform the fast pre-CFD simulation (see Figure 2); (iv) the *simulated data panel*, which displays the results of the fast simulation model; and (v) the *measured data panel*, which displays the variables imported into the virtual prototype tool, whether measured on the test bench or simulated by the CFD software.

**Figure 12.** Three-dimensional (3D) data tool interface.

The 3D visualization panel enables design engineers to examine the three-dimensional behavior of the pumps for the parameters that are defined in the Geometry Configuration interface. Notice that the shape of the chambers must be recalculated each time the pump is set to a new angle position. Figure 13 shows the high degree of flexibility that can be achieved in the visualization of the different components of the pump geometry.

In Figure 13b, the color of each chamber is related to the pressure calculated by the simulation model. The tool also allows the design engineer to import data from CFD simulations and test bench runs in order to integrate and compare predicted and measured performance data in the same software environment. Figure 14 shows the data for a measured variable imported from an external CFD simulation, as visualized within the software tool. External data can be imported in the widely used CSV format or in a custom text format tuned for our application.

**Figure 13.** Geometry visualization: (**a**) inner and outer gear of the pump, (**b**) fluid chambers with transparent ports.

**Figure 14.** Example of the pressure data collected from the experimental setup sensors.

#### **4. Results**

We present the results of a design case while using the presented virtual prototype tool to design and simulate a gerotor pump. We also compare the obtained results with the data imported from an external CFD simulation for the same geometry, fluid properties, and operating conditions. Figures 15 and 16 present the volumetric data that were calculated by our tool for the pump design used in this test run.

**Figure 15.** Volumetric data results from the virtual prototype: (**a**) profile of geometry with highlighted chamber (*CV*1) and (**b**) history of area in a z-cut for selected chamber.

**Figure 16.** Volumetric data results from the virtual prototype: (**a**) history of intersection area between chamber *CV*<sup>1</sup> and input port and (**b**) history of intersection area between chamber *CV*<sup>1</sup> and output port.

Figure 15a shows the selected analysis chamber. In this case, we have selected the maximum volume chamber when the pump is at initial position *t* = 0, even though our tool can perform the geometric analysis of all chambers simultaneously. Figure 15b shows the area evolution for the analysis chamber. Notice that it starts with the maximum value and diminishes until it reaches the minimum value for chamber area around halfway through a revolution. The minimum value for the area is not exactly zero because of the gaps in the meshing between the internal and external rotors described in the previous section. The area then rises until it reaches the maximum value once the revolution of the pump is completed. Figure 16a shows the intersection area between the analysis chamber and the input port (area through which the working fluid enters the pump). Notice that the intersection area increases at first, because, at initial position, the port and the chamber are only partially overlapped (as seen in Figure 15a). When the chamber is no longer overlapping with the input port and has started discharging fluid (overlapping with the output port), the intersection area with the input port becomes zero. The same analysis corresponds to the intersection area between the chamber and output port shown in Figure 16b. Figure 17 shows the history of areas for all chambers in the pump; the periodicity is explained by the cyclic design of the pump.

**Figure 17.** History of area for all 9 chambers in the pump.

We now discuss the results of a fluid dynamics simulation with our virtual prototype tool. The fluid properties, operating conditions, and initial pressure conditions used for the presented simulation are listed in Table 1:


**Table 1.** Simulation conditions for test operating point.

Figure 18 shows the history of calculated pressure inside an analysis chamber. Notice that the analysis chamber *CV*<sub>1</sub> (Figure 18a) is initially near the maximum volume position and, therefore, the initial pressure *P*<sub>t=0</sub> is low (near the input port pressure). As the pump rotates, the volume of the chamber decreases, which increases the pressure inside it. The maximum pressure in the chamber is reached at the minimum volume position; once the analysis chamber enters the discharge cycle, the pressure starts to decrease as a result of the discharge of fluid and the increase in the volume of the chamber.

**Figure 18.** Pressure results from virtual prototype: (**a**) profile of geometry with highlighted chamber (*CV*1), (**b**) history of pressure in chamber *CV*<sup>1</sup> , and (**c**) pressure distribution in pump after a full revolution, as seen in Digital Twin (DT).

Figure 18c shows the pressure distribution in the pump after a full revolution. Notice that there is a pressure peak in the chambers near the minimum volume position, and the pressure starts decreasing after the chamber intersects with the output port and goes through the discharge cycle. The discretization used in our virtual prototype allows the pressure history of each individual chamber to be tracked. The discretization used in CFD methods, however, makes it impossible to track the pressure history of an individual chamber, because in CFD the entire mesh is updated at every time step (remeshing) to meet convergence requirements [21]. Therefore, for comparison against a benchmark (i.e., CFD data), we use the maximum pressure across the entire fluid domain (all chambers). CFD simulations are currently used in most design processes as an accurate prediction of a pump's performance and are therefore a valid point of comparison for our implementation. Figure 19 shows the maximum pressure in the pump as predicted by both the CFD simulation and our implementation.

**Figure 19.** Virtual Prototype vs. computational fluid dynamics models (CFD) maximum predicted pressure in gerotor pump.

Our virtual prototype estimates the maximum pressure within the pump with a relative error of 21%, but it fails to reproduce the amplitude of the pressure oscillation. We attribute this shortcoming to the assumption of homogeneous pressure within each control volume; the CFD model needs no such assumption and can account for pressure variations within the chambers. Still, the maximum pressure over an entire revolution is an important indicator of the pump's performance, and it is reasonably estimated by our virtual prototype.

Table 2 shows a comparison of pre-processing and simulation times for a CFD simulation and a simulation with our virtual prototype. The pre-processing time of the CFD simulation includes the generation of the CAD models and the mesh; the pre-processing time of our virtual prototype includes the parameter input in the geometry configuration tool and the automatic generation of the B-Rep models. The main advantage of our virtual prototype is its much lower pre-processing and simulation time, while still providing valuable performance information to the design engineer. An estimate of the economic impact of our prototype on the manufacturer's operations is not yet available, since the design process at the manufacturer's facilities is intertwined with other products and processes.

**Table 2.** Comparison of pre-processing and simulation times for the CFD simulation and our virtual prototype.


#### **5. Conclusions and Future Work**

In this manuscript, we have presented the implementation of a virtual prototype tool in the context of gerotor pump design, a component widely used across industries whose design workflow usually involves time-consuming tasks. Our implementation is a first step towards a fully functional Digital Twin of a gerotor pump. The implemented tool allows data collected from an experimental setup to be integrated with a virtual prototype model; the collected data are fed to the numerical model to improve the accuracy of the performance predictions. This gives the engineer a fast overview of the pump's performance and allows unsuitable geometric configurations to be discarded efficiently. The presented implementation integrates a 2D design interface, with an interactive parameterized model of the pump, and a 3D interface that visualizes the 3D model corresponding to the previously defined 2D geometry. Our initial tests show that the implemented fast pre-CFD simulation model approaches the results of more detailed and time-consuming simulations within an acceptable margin of error. The tool also integrates, in a single application, geometry data and simulation data that are otherwise handled in separate environments. Our virtual prototype is not suitable, however, when a detailed prediction of the behavior inside each chamber of the pump is required.

Future work is needed on both the implemented software and the fast simulation model in order to achieve a fully functional Digital Twin of the pump. Efforts on the 3D tool should focus on data visualization and on a data structure for simulation profiles of different geometric configurations. The pre-CFD simulation may be improved by modeling (a) cavitation and (b) micro-movements of the rotors due to the pressure induced in the chambers. Further integration of experimental data with the simulation model is needed to improve the accuracy of the predictions.

**Author Contributions:** J.P.-C., A.M., A.P.-B., E.S.-J. and O.R.-S. conceptualized the tool; J.P.-C., A.M. and B.S. implemented the tool; J.P., O.R.-S., E.S.-J. and A.P.-B. supervised the industrial application of the tool; A.M., B.S. and J.P.-C. implemented the geometrical aspects of this research; J.P.-C., E.S.-J. and A.P.-B. supervised the simulation model results. All the authors contributed to the writing of the article. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work has received funding from the Eusko Jaurlaritza/Basque Government under the grants KK-2018/00071 (LANGILEOK) and ZL-2020/00190 (LATIDO).

**Acknowledgments:** The authors thank Tecnun-Universidad de Navarra for the support in the CFD-generated operational data.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**


#### **References**


## *Review* **Is Digital Twin Technology Supporting Safety Management? A Bibliometric and Systematic Review**

**Giulio Paolo Agnusdei 1,2,\* , Valerio Elia <sup>1</sup> and Maria Grazia Gnoni <sup>1</sup>**


**Abstract:** In the Industry 4.0 era, digital tools applied to production and manufacturing activities represent a challenge for companies. Digital Twin (DT) technology is based on the integration of different "traditional" tools, such as simulation modeling and sensors, and is aimed at increasing process performance. In DTs, simulation modeling allows for the building of a digital copy of real processes, which is dynamically updated through data derived from smart objects based on sensor technologies. The use of DT within manufacturing activities is constantly increasing, as DTs are being applied in different areas, from the design phase to the operational ones. This study aims to analyze existing fields of applications of DTs for supporting safety management processes in order to evaluate the current state of the art. A bibliometric review was carried out through VOSviewer to evaluate studies and applications of DTs in the engineering and computer science areas and to identify research clusters and future trends. Next, a bibliometric and systematic review was carried out to deepen the relation between the DT approach and safety issues. The findings highlight that in recent years, DT applications have been tested and developed to support operators during normal and emergency conditions and to enhance their abilities to control safety levels.

**Keywords:** safety; digital twin; smart operator; smart manufacturing; Industry 4.0

#### **1. Introduction**

The fourth industrial revolution—also known as the Industry 4.0 paradigm—is changing current industrial production systems. The Industry 4.0 digital transformation is aimed at the optimization and automation of the previously introduced digitalization by adding the intelligent networking of machines, processes, and people [1]. In recent years, the concepts and theories of Industry 4.0 have become considerably important within the manufacturing sector [2]. One main reason for this increasing diffusion is the use and implementation of hi-tech technologies within the production processes at an affordable cost [3] in order to create smart grids across the entire value chain and networks formed by interconnecting intelligent machines [4]. Within industrial manufacturing contexts, Industry 4.0 entails the networking of data coming from machines, products, and people; in general, this involves the interconnection of smart devices among different plants and factories [5] through tools and embedded components, such as cyber–physical systems (CPSs), Internet of Things (IoT), cloud computing, robotics, systems based on artificial intelligence, and cognitive computation [4].

Digital Twins (DTs) are key enabling technologies (KETs) that were created to improve the efficiency and profitability of Industry 4.0 systems [6,7].

The first applications of DTs in industrial systems focused on developing a "digitalized copy" of a production process and/or a product. More recently, through the overlap with the IoT, DT functionality and interoperability have been "augmented" by adding real-time interaction with the real system [1].

**Citation:** Agnusdei, G.P.; Elia, V.; Gnoni, M.G. Is Digital Twin Technology Supporting Safety Management? A Bibliometric and Systematic Review. *Appl. Sci.* **2021**, *11*, 2767. https://doi.org/10.3390/ app11062767

Academic Editor: A.Y.C. Nee

Received: 26 February 2021 Accepted: 18 March 2021 Published: 19 March 2021

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

The availability of virtual and digital information represents an effective solution for improving product design, manufacturing technology, and other critical service processes, such as safety management [8,9].

Safety management, as the process of realizing certain safety functions, aims to promote organizational safety and to protect people and property within an organization from unacceptable safety risks [10]. Recently, with the increasing global economic uncertainties, safety management in most organizations is under growing pressure to achieve the best performance, and this requires access to a variety of high-quality safety information [11].

In the era of Industry 4.0, it cannot be disputed that data and information represent an indispensable resource and a successful key factor for a new paradigm of safety management [12–15].

Within the Safety 4.0 framework [16], DT-assisted safety management systems can be implemented to help operators execute complex safety procedures, thus reducing risks and human errors. DT guidance can lead operators through safety tasks and provide them with real-time information regarding the contextual conditions [17]. This can reduce the costs and time required for service and maintenance, decrease oversights and mistakes, and increase safety [18,19].

The main purpose of this study is to present the prevailing state of research on Digital Twin technology, its manifold applications, and the intersection between DTs and safety, answering the following research questions:

RQ1. What are the current publication trends in the domain in terms of the types of studies, time, and affiliated countries?

RQ2. Which are the influential studies and themes of research in this domain and how have they evolved over the years?

RQ3. What are the recent research trends, gaps, and areas for future research in this domain?

The rest of this paper is organized as follows: Section 2 delineates the advent of DTs, their main features, and their background. Section 3 illustrates the adopted methodological framework, while Section 4 covers the findings on the publication trends and on the keyword and cluster content analyses. The study is concluded in Section 5, where future research avenues are suggested.

#### **2. Background**

The DT concept first appeared in the aerospace field around the 1970s (Figure 1); the first complete characterization was proposed by Grieves during a course on Product Lifecycle Management at the University of Michigan [20].

**Figure 1.** The timeline of the evolution of the definition of a Digital Twin (DT).

Basically, a DT refers to a system consisting of three main subsystems: (a) physical products in real space, (b) virtual products in virtual space, and (c) data and information that tie the virtual and real products together. Grieves [20] depicted DT flow as a cycle between the physical and virtual states (called twinning), with data flowing from the physical to the virtual and information and processes flowing from the virtual to the physical states (Figure 2).

**Figure 2.** Digital Twin concept.

Then, NASA provided the first definition of a DT for the aeronautic sector [21]: A "digital twin is an integrated multi-physics, multi-scale, probabilistic simulation of a vehicle or system that uses the best available physical models, sensor updates, fleet history, etc., to mirror the life of its flying twin".

Differently from the first definition, this one refers explicitly to the concept of simulation modeling, which is often adopted in tools for supporting the design, validation, and testing of a system. Within this definition, the simulation is considered multi-physics, as more than one physical field is involved simultaneously and different physical properties are integrated; it is usually multi-scale, because different levels of time and/or space are used; finally, it is probabilistic, because it can incorporate probability calculations to account for uncertainty.

These features characterize traditional simulation models, and they are enhanced in DT applications due to their stricter connection with the physical world.

Traditional simulation models are developed based on real data, but their updating is usually a static process [22]. In DTs, by contrast, a continuous data update process is required; this is the most distinctive feature of DTs. Due to the diffusion of the Industry 4.0 paradigm, a huge amount of data about physical systems is now available (even in real time) to be used for operation redesign and control. This availability is becoming the main enabler of DT development in manufacturing systems.

From 2010 onwards, the NASA definition of DTs was modified, and new definitions more focused on the industrial sector were proposed in the scientific literature. Lee et al. [23] proposed DTs as an advancement of predictive manufacturing systems, defining the DT as a simulation model that acquires real data and transfers them to a simulator in the cloud. In line with this study, Rosen et al. [24] stated that DTs, by combining real data with simulation models, allow forecasts to be drawn up from realistic data, thus providing a sort of guidance system that supports operators and planners during normal operation as well as during maintenance and service. Chen [25] described the DT as a computerized model of a physical device or system that represents all functional features and links with the working elements.

In 2018, some authors identified the Digital Twin as a digital representation of a physical production system that uses integrated simulations and service data, holding information from multiple sources across a product's life cycle. This information is continuously updated based on operational changes and is visualized in different ways to forecast current and future conditions of the physical counterpart in order to enhance decision making [26,27].

Recently, one of the latest definitions identified the Digital Twin as a virtual instance of a physical system (twin) that is continually updated with the latter's performance, maintenance, and health status data throughout the physical system's life cycle [28].

As a promising means of achieving cyber–physical interaction, integration, and fusion, Digital Twins (DTs) have captured growing attention from academic researchers as well as industrial practitioners [29,30].

They can stimulate the development of new approaches in design, production, and service, eventually leading to further innovations: better data management to improve the production process and performance and to ensure the continuity and traceability of information [31]; support for the analysis of production-line performance parameters, allowing continuous monitoring of line balancing and performance as production demand varies [32]; and support for monitoring and decision making regarding the ergonomic performance of manual production lines [33].

#### **3. Methodology**

Review studies can be of several types. In this study, a combination of bibliometric and systematic reviews is adopted. Bibliometric analyses are extensively performed to trace the knowledge anatomy of a research field and are used to analyze research topics [34]. Systematic literature reviews are used to synthesize the contents of the literature, limit bias [35], and identify possible research gaps.

For the purpose of answering RQ1 and RQ2 and identifying the publication trends and most influential research themes, two bibliometric analyses were performed, which provided comprehensive maps of the knowledge structures of the DT research field and the intersection between the DT and safety research fields. The results from the clustering of the intersection between the DT and safety research fields provided a foundation for a cluster content analysis aimed at answering RQ3 and identifying the recent research trends in the domain, as well as the gaps and areas for future research. Figure 3 proposes a schematization of the adopted methodology.

**Figure 3.** Methodological framework.

#### *3.1. Search Protocol and Datasets*

The Scopus database, the widest repository of peer-reviewed scientific literature, was used for the construction of the datasets. The first step was the identification of the keywords to be used for the selection of the document samples.

The terms selected to extract the document sample for the first bibliometric analysis (Figure 3) were the following: (i) digital; (ii) twin. Scopus was queried on 10 February 2021 with this combination of keywords: "digital AND twin". This choice allowed the borders of the analysis to be defined, ensuring a specific focus on the topics to be inspected. The extraction from the Scopus database was limited to: (i) studies written in the English language; (ii) the "Engineering" and "Computer science" areas; (iii) the period 2003–2021, because the first theorization of DTs was provided by Grieves in 2003. The query returned 3301 documents, whose citation information (authors, document titles, publication years), abstracts, and index keywords were exported.

In order to limit the analysis to the intersection between the Digital Twin and safety management topics, a second extraction from the Scopus database was performed to obtain the document sample for the second bibliometric analysis (Figure 3). For the same time period, 2003–2021, Scopus was queried on 10 February 2021 with the combination of keywords "digital AND twin AND safety" and with the same limitations as in the first bibliometric analysis. The obtained dataset consisted of 190 documents, whose citation information (authors, document titles, publication years), abstracts, and index keywords were exported.

#### *3.2. Bibliometric and Content Analysis Methods*

The methodology adopted to investigate the scientific literature dynamics related to the two document samples extracted from Scopus was based on the use of two types of software for bibliometric analysis. The Bibliometrix package application in the R software was used to evaluate the growth, maps, and trends of the scientific field of research [36], while VOSviewer 1.6.14, which was developed to conduct text mining and to construct bibliometric maps [37], was used to identify the study keywords' co-occurrence.

In this study, the VOSviewer software was used to create a network map of the co-occurrence of terms based on the index keywords. The keyword co-occurrence analysis was carried out using the full counting method, in which the relatedness of items is determined by the number of documents in which they occur together.
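The full counting method can be sketched in a few lines of Python. The keyword lists below are hypothetical placeholders, not data from the actual Scopus extraction:

```python
from itertools import combinations
from collections import Counter

# Hypothetical records: each entry is the list of index keywords of one document.
documents = [
    ["digital twin", "industry 4.0", "internet of things"],
    ["digital twin", "life cycle", "manufacture"],
    ["digital twin", "industry 4.0", "life cycle"],
]

# Full counting: every keyword pair occurring together in a document
# contributes 1 to that pair, regardless of the other pairs in the document.
co_occurrence = Counter()
for keywords in documents:
    for pair in combinations(sorted(set(keywords)), 2):
        co_occurrence[pair] += 1

print(co_occurrence[("digital twin", "industry 4.0")])  # 2
```

The resulting pair counts are exactly the link weights that VOSviewer visualizes as line thicknesses in the co-occurrence network.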

According to the methods proposed by Donohue [38], the cut-off point for the term occurrence was determined with the following Formula (1):

$$T = \left(1 + \sqrt{1 + 8 \times I}\right) / 2\tag{1}$$

where *T* represents the optimal minimum number of occurrences of a keyword and *I* is the total number of keywords. Nevertheless, to ensure wider software processing, keywords that co-occurred at least 10 times were selected for the analysis [39].
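The cut-off point can be evaluated directly. The sketch below implements Formula (1) exactly as transcribed above:

```python
import math

def donohue_cutoff(total_keywords: int) -> float:
    """Cut-off point T for term occurrence, as transcribed in Formula (1)."""
    return (1 + math.sqrt(1 + 8 * total_keywords)) / 2

# First dataset: 15,901 index keywords extracted from 3301 documents.
print(round(donohue_cutoff(15901)))  # 179
```

With *I* = 15,901 this gives *T* ≈ 179, which would retain very few keywords; this is presumably why the lower threshold of 10 co-occurrences was preferred in practice. Note that some sources state Donohue's transition point with a leading −1 rather than +1; the sketch follows Formula (1) as written.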

The index keywords were processed using VOSviewer, and the results were displayed in the network visualization and the overlay visualization. The network visualization shows keyword co-occurrence, where the dimensions of circles represent the weights of keywords, the lines represent the ways in which two words are linked, and thicker lines mean stronger connections among words. In the network visualization, VOSviewer uses colors to indicate the cluster to which a keyword has been assigned. The clustering technique [40] requires an algorithm for solving an optimization problem. For this purpose, VOSviewer implements the smart local moving algorithm introduced by Waltman and Van Eck [41].
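The local moving step at the core of the smart local moving algorithm [41] can be illustrated with a simplified, self-contained sketch: starting from singleton communities, each node is greedily reassigned to the neighboring community that most increases modularity. The real algorithm adds community splitting and iteration over reduced networks; the toy graph below is hypothetical:

```python
from collections import defaultdict

# Hypothetical toy co-occurrence graph: two densely linked keyword groups
# joined by a single edge.
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("d", "e"), ("d", "f"), ("e", "f"),
         ("c", "d")]

nodes = sorted({n for e in edges for n in e})
adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

m = len(edges)
deg = {n: len(adj[n]) for n in nodes}

def modularity(comm):
    # Fraction of intra-community edges minus the fraction expected by chance.
    q = sum(1.0 for u, v in edges if comm[u] == comm[v]) / m
    total_deg = defaultdict(int)
    for n in nodes:
        total_deg[comm[n]] += deg[n]
    return q - sum((d / (2 * m)) ** 2 for d in total_deg.values())

# Local moving: start from singleton communities and greedily move each node
# to the neighboring community that most increases modularity.
comm = {n: i for i, n in enumerate(nodes)}
improved = True
while improved:
    improved = False
    for n in nodes:
        old = comm[n]
        best, best_q = old, modularity(comm)
        for c in {comm[v] for v in adj[n]}:
            comm[n] = c
            q = modularity(comm)
            if q > best_q:
                best, best_q = c, q
            comm[n] = old
        if best != old:
            comm[n] = best
            improved = True

clusters = defaultdict(set)
for n, c in comm.items():
    clusters[c].add(n)
print(sorted(sorted(members) for members in clusters.values()))
# [['a', 'b', 'c'], ['d', 'e', 'f']]
```

On this toy graph, the procedure recovers the two triangles as separate clusters, mirroring how VOSviewer assigns each keyword to exactly one cluster.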

The overlay visualization replicates the same map as that in the network visualization, but with different colors. The items and their links are colored in order to make it possible to view temporal trends and to identify which keywords were used most frequently during the observation period. The layout of the map was built by normalizing the strengths of the links among the elements through the association strength method according to Van Eck and Waltman [42].
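The association strength normalization can be illustrated with a minimal sketch. One common formulation divides the observed co-occurrence count of two keywords by the product of their total occurrence counts (some formulations add a constant scaling factor); the co-occurrence figure below is an assumed, hypothetical value:

```python
def association_strength(c_ij: int, k_i: int, k_j: int) -> float:
    """Similarity between items i and j: observed co-occurrences divided by
    the product of total occurrences, i.e., proportional to the ratio between
    observed and statistically expected co-occurrences."""
    return c_ij / (k_i * k_j)

# "digital twin" (1404 occurrences, Table 2) vs. "industry 4.0"
# (200 occurrences), with a hypothetical 150 shared documents:
strength = association_strength(150, 1404, 200)
print(f"{strength:.6f}")  # 0.000534
```

Normalizing link strengths this way prevents very frequent keywords from dominating the map layout purely by virtue of their overall frequency.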

In order to conduct a systematic review of the safety management issues related to the DT concept, documents were extracted from the clusters obtained through the second bibliometric phase. The most recent documents were carefully examined to identify the common features that characterized each cluster as well as the gaps in the literature.

#### **4. Results**

#### *4.1. The Digital Twin Research Field*

Conference papers are essential documents for supporting the scientific development of new research and application fields. As shown in Figure 4, the analysis of the dataset according to the type of document shows that the majority are conference papers (57%), followed by journal articles (33%), conference reviews (4%), and book chapters (3%). This preponderance of conference papers is important evidence that Digital Twin research is still an emerging research field.

**Figure 4.** Publications in the DT research field by document type.

Figure 5 illustrates the progression of publications available in the Scopus data on Digital Twins in the period 2003–2021. There was an upsurge in publications, from just 25 documents published in 2003 to 1279 documents in 2020. Research on Digital Twins accelerated sharply from 2017, which can be seen as a point of discontinuity, with exponential growth in the last three years (2018–2020). This surge may mainly be attributed to governmental ICT (Information and Communication Technologies) investments in many countries, driven by the adoption of Industry 4.0 policy plans [43]. Even if there are more conference papers than journal articles in absolute terms, in the last three years (2018–2020), the share of journal articles over the total published documents regarding DTs increased noticeably, from 24% to 35%, while the share of conference papers registered the opposite trend, falling from 67% to 52%.

Table 1 lists the top countries affiliated with authors of DT research, the leading three being Germany (492 documents), the United States (477 documents), and China (434 documents). These are the top three manufacturing export countries in the world [44], and Germany, in addition to holding a leading position in Europe (Figure 6), is the country where the Industry 4.0 concept was first developed [44,45]. A noticeable self-perpetuating effect of giving and taking references was observed in these countries, which contribute the most studies to the pool of the DT field. The countries with the highest production were also among those whose studies received the most citations. It can be said that the most well-established countries are dominating or leading the field, and this can also be seen in the citation patterns.

**Figure 5.** Annual publication trend of 3301 documents retrieved from Scopus for the DT research field in the period 2003–2021.

**Table 1.** Top ten affiliated countries publishing on DTs.


**Figure 6.** Map of affiliated countries publishing on DTs.

Keyword analysis was performed to explore the most prevalent themes in the DT research field. A total of 15,901 keywords were identified in 3301 documents. It is important to evaluate the keywords of a document to understand how authors frame their work, what the most interesting aspect of the research is or how it is evolving, and what trends are being created. From the extracted document sample, a count of the index keywords was performed in order to calculate their frequency and rank them. Table 2 shows the ranking of the top ten most relevant Keywords-Plus (ID). The ID is standardized; it is defined by Scopus to help in the retrieval of documents associated with a topic. "Digital twin" is the most frequently used keyword, with 1404 occurrences, indicating that this term alone has become an established concept in the literature. The other most frequently used keywords are "life cycle" (333 occurrences), "manufacture" (325 occurrences), "embedded systems" (248 occurrences), "Internet of Things" (206 occurrences), and "Industry 4.0" (200 occurrences).
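The frequency ranking described above amounts to a simple count over the exported index keywords; a minimal sketch (with hypothetical keyword lists, not the actual Scopus export) is:

```python
from collections import Counter

# Hypothetical index-keyword lists, one per exported document record.
index_keywords = [
    ["digital twin", "life cycle"],
    ["digital twin", "manufacture", "industry 4.0"],
    ["digital twin", "life cycle", "embedded systems"],
]

# Count every occurrence and rank the keywords by frequency.
frequency = Counter(kw for doc in index_keywords for kw in doc)
print(frequency.most_common(2))  # [('digital twin', 3), ('life cycle', 2)]
```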


**Table 2.** Top ten most frequent index keywords in the DT research field.

Based on the hypothesis that a research specialty can be identified by the relations among document keywords, keyword co-occurrence analysis is useful for identifying the thematic areas or clusters that constitute the theoretical blocks or foundational topics of the field under analysis [46].

Starting from the entire document sample, including 15,901 keywords, a co-occurring keyword analysis was performed in order to construct diagrams to display the network and the overlay visualization (respectively, Figures 7 and 8). Considering keywords that co-occurred at least 10 times, 502 keywords were selected for the final analysis.

**Figure 7.** Network visualization of the DT research field.

**Figure 8.** Overlay visualization of the DT research field.

The keywords were grouped into clusters, represented by different colors. In particular, the keywords were clustered into seven groups, and each keyword was assigned to only one cluster. As shown by the network visualization (Figure 7), the DT research field is primarily composed of seven clusters of connected topics: Industry 4.0 and IoT (cluster—blue), learning systems (cluster—red), smart manufacturing (cluster—green), information management (cluster—cyan), life-cycle management (cluster—purple), holograms (cluster—yellow), and digital image analysis (cluster—orange).

A very close linkage between five clusters (red, blue, green, purple, and cyan) is clearly observable, while the other two clusters (yellow and orange) appear disconnected from them.

The clusters clearly indicate that scientific research in the field of DTs has focused, above all, on Industry 4.0 manufacturing, life-cycle management, and data processing and analysis. Since the orange and yellow clusters are almost totally unlinked from the rest of the network, the holograms and digital image analysis research fields can be considered to have developed almost autonomously.

In the overlay visualization, which shows the temporal distribution of the keywords in each cluster (Figure 8), keywords are colored according to a score based on the average year of occurrence of a keyword. Colors range from blue (oldest time period) to green and yellow (most recent time periods). It emerged that the topics related to holograms (cluster—yellow) and digital image analysis (cluster—orange) were developed before the others. This could mean that, at an early stage, the DT concept and research field derived from holograms, digital image analysis, and scanning, and then evolved toward the wider and more common applications usually adopted in manufacturing activities.

The most recent documents in the scientific literature refer, in fact, to the fields of smart manufacturing, IoT, life cycle, and information management (Figure 8), which also represent the main research fields according to the co-occurrence network map (Figure 7). These are currently the trending topics. Within them, there is a series of strongly linked research sub-fields, e.g., quality control of processes and products, design of processes and products, predictive analysis through machine learning algorithms, and, finally, safety management.

#### *4.2. Bibliometric Results on the Intersection between the DT and Safety Research Fields*

Safety aspects are fundamental within the manufacturing sector and represent a research field with great potential and wide application areas [47]. As indicated in Section 3, the linkages between the Digital Twin and safety issues were investigated through a second bibliometric analysis.

Like the document sample related to Digital Twins, the document sample related to studies that integrate the DT and safety research fields also includes many conference proceedings. The latter generally report preliminary studies, which serve as a basis for the development of more complex research activities that later conclude with publication in international journals [48]. Figure 9 shows that conference papers represent 65% of the dataset according to the type of document, followed by journal articles (33%), conference reviews (7%), reviews (4%), and book chapters (1%). In light of these results, studies that integrate DTs and safety issues can be considered the most recent emerging research field within the wider research field of DTs [49].

**Figure 9.** Publications in the intersection of the DT and safety research fields by document type.

Figure 10 highlights the trend of publications available in the Scopus data on the intersection between the Digital Twin and safety research fields in the period 2003–2021. Most of the articles were published in the three-year period 2018–2020. In fact, there was an exponential increase in publications, from 0 documents published in 2003 to a peak of 92 documents in 2020, confirming the researchers' pioneering interest in developing the scientific topics of Digital Twins related to process manufacturing activities and, consequently, to safety management.

Table 3 lists the top countries affiliated with the authors of research on the intersection of DTs and safety, the leading three being the United States (40 documents), Germany (25 documents), and China (21 documents), confirming the ranking registered for the DT research field, with the only permutation being between Germany and the United States. The numbers of documents published by authors from the UK (16 documents) and Italy (13 documents) are also noteworthy. Italy earned the fifth position, highlighting the role that the national "Industry 4.0" plan played in driving organizational changes in enterprises that particularly addressed safety issues [50]. Figure 11 shows the map of affiliated countries publishing in the intersection of the DT and safety research fields. Even if they are not very numerous, documents from Italy register a high impact on the research field. As an example, the review conducted by Cimino et al. [51] (2019) of the Politecnico di Milano, which was published in *Computers in Industry* and dealt with a topic related to the intersection between DTs and safety, is among the most cited documents (43 citations). This highly cited document is followed by the study by Oyekan et al. [52] (34 citations), which shows how virtual reality Digital Twins could assist in the safe implementation of human–robot collaborative strategies in the factories of the future.

**Figure 10.** Annual publication trend of 190 documents retrieved from Scopus for the intersection between the DT and safety research fields in the period 2003–2021.


**Table 3.** Top ten affiliated countries publishing in the intersection of the DT and safety research fields.

Keyword analysis was performed to explore the most prevalent themes in the intersection between the DT and safety research fields. A total of 1673 keywords were identified in 190 documents. From the extracted document sample, a count of the index keywords was performed in order to calculate their frequency and to rank them. Table 4 shows the ranking of the top ten most relevant Keywords-Plus (ID). As for the first bibliometric analysis, "Digital twin" is the most frequently used keyword, with 93 occurrences, followed by "life cycle" (32 occurrences), "safety engineering" (21 occurrences), "accident prevention" (19 occurrences), and "virtual reality" (19 occurrences).

A significant finding outlined by the keyword analysis is that several themes occur in both the DT research field and the intersection of the DT and safety research fields. This overlap highlights the broad agreement on the conceptualization of the Digital Twin and its applications, as well as the mostly indirect and diversified references to safety aspects in much of the scientific literature on DTs.

**Figure 11.** Map of affiliated countries publishing in the intersection of the DT and safety research fields.


**Table 4.** Top ten most frequent index keywords in the intersection of the DT and safety research fields.

Starting from the document sample, which included 1673 keywords, a co-occurring keyword analysis was performed in order to obtain Figures 12 and 13, which display the network and the overlay visualization for the intersection between the DT and safety research fields, respectively. In this case, 51 keywords were selected for the analysis, since they co-occurred at least 10 times.

As shown in Figure 12, the research field derived from the intersection between DTs and safety consists of seven clusters: decision making and offshore applications (cluster—red), IoT and life-cycle approaches (cluster—green), Industry 4.0: from manufacture to virtual reality (cluster—blue), machine learning support for DTs and safety (cluster—yellow), safety engineering (cluster—purple), hazards and risk assessment (cluster—orange), and DTs in battery management systems (cluster—cyan).

In the overlay visualization, which shows the temporal distribution of the keywords in each cluster (Figure 13), the field of studies regarding the intersection between DTs and safety has evolved from a previous concentration on topics related to aircraft and fleet operations to wider issues related to accident prevention and risk and information management, as well as to more strategic and specific themes referring to life-cycle management and offshore technology applications.

**Figure 12.** Network visualization of the intersection of the DT and safety research fields.

**Figure 13.** Overlay visualization of the intersection of the DT and safety research fields.

*4.3. Cluster Content Results for the Intersection between the DT and Safety Research Fields*

As shown in Figure 12, VOSviewer uses colors to indicate the cluster to which a keyword has been assigned within the network visualization based on the methods explained in Section 3.2.

The red cluster of keywords, called "Decision making and offshore applications", includes studies regarding DT solutions aimed at supporting decision-making processes. These studies identify the Digital Twin concept as a promising tool for decision makers and stakeholders alike, which is bound to benefit those who use it [53]. In particular, DTs solve big data problems in the field of offshore resources, letting workers spend less time looking for data and more time identifying trends and innovative ways to exploit the data, e.g., smarter drilling, greater field automation, or improved safety [54]. Based on in situ measurement information, DTs can support operational and maintenance decisions that will preserve the integrity, safety, and availability of assets [55].

The green cluster, called "IoT and life-cycle approaches", regards studies that consider that a DT should encompass and plan the entire life cycle of a physical asset, thus producing profound differences depending on the application domain [56]. Through the use of the IoT, the gap between the physical and virtual worlds is filled by bridging a physical component's sensors and actuators with its digital counterpart [57]. In the safety domain, a DT provides an opportunity to train employees in virtual environments, thus helping to achieve accident prevention and to reduce the probability of accidents that may occur during on-the-job training [58].

The blue cluster, called "Industry 4.0: from manufacture to virtual reality", groups the latest research related to Digital Twins and virtual reality environments for safety purposes within the Industry 4.0 paradigm. Over recent years, the concept of human–machine interaction has received wide attention, since it represents the basis for achieving automation in manufacturing. Conventional simulations do not allow us to experience future production systems as end-users in an immersive environment; for this reason, virtual reality has undergone substantial technological development [59]. Cyber–physical systems, however, require operators' awareness of the situation in order to be able to adequately address potential issues in a timely manner. Detecting early symptoms may speed up the incident response process and mitigate the consequences of business interruption or safety hazards. Running parallel to their physical counterparts, DTs allow for the deep inspection of their behavior without the risk of disrupting operational technology processes [60].

The yellow cluster, called "Machine learning support for DT and safety", includes studies that use machine learning algorithms to rapidly ascertain optimal aircraft dynamics to maximize the fire-retardant release effectiveness [61], to create a digital representation of humans that focuses on their vital quantities, as well as on the surrounding environment, for health monitoring purposes [62], or to enable industrial robots to bypass obstacles or people in a workspace [63].

The purple cluster, called "Safety engineering", regards studies about the interactions among DT data acquisition, DT data processing, and safety issues [49], as well as research regarding advanced structural simulations combined with physics-based deterioration models in order to calculate structural performance [64].

The orange cluster, called "Hazards and risk assessment", groups studies about risk assessment and tools of risk prediction. By comparing the real data with those obtained by the simulation software, DTs can predict risks and/or anomalies and communicate with a server in order to generate a warning [65]. This is particularly relevant when DTs serve to audit and evaluate compliance with legal requirements in everyday production and logistics processes. Non-compliance can endanger employees and the environment, and can cause financial and reputational damage [66].

The cyan cluster, called "DT in battery management systems", refers to studies that aim to enhance the safety, reliability, and performance of battery systems. All data relevant to batteries can be measured and transmitted to create a Digital Twin of a battery system, allowing for the diagnostic evaluation of a battery's charge and aging level [67].

As highlighted by the cluster content analysis, the scientific literature lacks a generic and widely recognized DT architecture. This heterogeneity unavoidably requires a deeper understanding of how to deal with DT systems by evaluating and comparing them and exploring how they differ in handling different environments. The lack of standardization and the variety of DT definitions, which cause discrepancies among DT implementation projects, represent a challenge that requires further research efforts aimed at accelerating progress in supporting safety management.

When trying to address open research questions in the field of Digital Twins, another challenge comes from the multidisciplinary approaches needed to design and develop adequate safety management measures. With technological improvements and the spread of blockchain, the scientific gap related to the integration of data for small IoT systems, as well as large heterogeneous systems, including human interaction, should be filled.

#### **5. Conclusions**

Digital Twin technology is an emerging topic that has captured the attention of researchers in recent years. It is also becoming popular amongst managers and practitioners, demonstrating that it is one of the most fertile and productive fields in engineering and computer science research.

This review further contributes to DT research by unfolding the evolving literature in terms of various themes and trends, hence setting out the status of scholarly work since the field's inception in 2003. Moreover, some gaps in the research on the intersection between the DT and safety research fields have been identified, and recent research efforts addressing them have been discussed.

With the intention of thoroughly reviewing the existing literature, this study provides valuable insights into DTs and safety. Such concepts have an ever-increasing importance in day-to-day management decisions. The scholarly work reveals the still scarce implementation of DT technologies across industry, specifically for safety management purposes. This study represents a wake-up call for decision makers and other stakeholders, who should undertake paths towards the improvements enabled by DT technologies and, ultimately, towards safety in its broadest sense. There is large scope for contributions to theoretical development, methodologies, and new applications. DT technology has vast implications for safety management, and its development can guide the way to competitive and stable industries.

**Author Contributions:** Conceptualization, G.P.A. and M.G.G.; Data curation, G.P.A.; Methodology, G.P.A., M.G.G., and V.E.; Supervision, M.G.G.; Validation, G.P.A.; Visualization, G.P.A.; Writing original draft, M.G.G., G.P.A., and V.E.; Writing—review and editing, M.G.G. and V.E. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research is part of the activities carried out within SO4SIMS project (Smart Operators 4.0 based on Simulation for Industry and Manufacturing Systems) funded by the Italian Ministry of Education, Universities and Research MIUR (Project PRIN-2017FW8BB4).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Sustainability Requirements of Digital Twin-Based Systems: A Meta Systematic Literature Review**

**Rui Carvalho 1,\* and Alberto Rodrigues da Silva <sup>2</sup>**


**Abstract:** Sustainable development was defined by the UN in 1987 as development that meets the needs of the present without compromising the ability of future generations to meet their own needs, and this is a core concept in this paper. This work acknowledges the three dimensions of sustainability, i.e., economic, social, and environmental, but its focus is on the last one. A digital twin (DT) is frequently described as a physical entity with a virtual counterpart, with data connections between the two, implying the existence of connectors and blocks for efficient and effective data communication. This paper provides a meta systematic literature review (SLR) (i.e., an SLR of SLRs) regarding the sustainability requirements of DT-based systems. Numerous papers on the subject of DTs were also selected because they cited the analyzed SLRs and were considered relevant to the purposes of this research. From the selection and analysis of 29 papers, several limitations and challenges were identified: the perceived benefits of DTs are not clearly understood; DTs across the product life cycle or the DT life cycle are not sufficiently studied; it is not clear how DTs can contribute to reducing costs or supporting decision-making; the technical implementation of DTs must be improved and better integrated in the context of the IoT; the level of fidelity of DTs is not fully evaluated in terms of their parameters, accuracy, and level of abstraction; and the ownership of data stored within DTs should be better understood. Furthermore, from our research, it was not possible to find a paper discussing DTs solely in regard to environmental sustainability.

**Keywords:** digital twins (DTs); Internet of Things (IoT); sustainability requirements; sustainable development; product design

#### **1. Introduction**

A digital twin (DT) is often described as a digital or virtual entity with a physical counterpart, and with data connections between the two [1], implying the existence of connectors and blocks to allow efficient and effective data communication. A DT is a digital representation of some more complex physical system; despite the existence of distinct definitions of the DT, this was the original one, and it is the one we adopt [1]. Grieves and Vickers of NASA are considered the pioneers of this concept, presenting it in a lecture on product life-cycle management in 2003, as acknowledged by Liu et al. [2]. They point out three components [2]: (i) a physical product, (ii) a virtual representation of that product, and (iii) the bi-directional data connections from the physical to the virtual representation, and vice versa. Among the main purposes of developing DTs are product design, modeling, simulation, and optimization of specific assets [3,4].
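The three components pointed out above can be expressed as a minimal data structure. All class, field, and method names in the sketch below are hypothetical illustrations, not an established DT API:

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Minimal sketch of the three DT components: a physical product, its virtual
# representation, and bi-directional data connections between the two.

@dataclass
class PhysicalProduct:
    sensor_readings: Dict[str, float] = field(default_factory=dict)

@dataclass
class VirtualRepresentation:
    state: Dict[str, float] = field(default_factory=dict)

    def simulate(self) -> Dict[str, Any]:
        # Placeholder for simulation/optimization on the virtual side.
        return {"target_speed": self.state.get("speed", 0.0) * 0.9}

@dataclass
class DigitalTwin:
    physical: PhysicalProduct
    virtual: VirtualRepresentation

    def sync_physical_to_virtual(self) -> None:
        # Connection 1: physical -> virtual (sensor data updates the model).
        self.virtual.state.update(self.physical.sensor_readings)

    def feedback_virtual_to_physical(self) -> Dict[str, Any]:
        # Connection 2: virtual -> physical (simulated setpoints fed back).
        return self.virtual.simulate()

twin = DigitalTwin(PhysicalProduct({"speed": 100.0}), VirtualRepresentation())
twin.sync_physical_to_virtual()
print(twin.feedback_virtual_to_physical())
```

The two synchronization methods make the bi-directional nature of the data connections explicit, which is the defining feature distinguishing a DT from a one-way digital model or digital shadow.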

Today the usage of DTs is not yet generalized, but since 2015 there has been a clear increase in scientific studies toward a better understanding of their potentialities. Machine tools and consumer goods are common examples of DT usage. However, a DT does not always have to be a high-fidelity digital model of a physical system or asset. For instance, a DT can also be used to represent a whole city (urban digital twin), geographic areas, buildings, or even human bodies and human organs. However, the focus of this paper is on DTs that represent physical assets. Other reported usages of building DTs are for cybersecurity incident prediction [3], monitoring ergonomics in IoT contexts [5], online education [6], or the optimization of farming systems [7].

**Citation:** Carvalho, R.; da Silva, A.R. Sustainability Requirements of Digital Twin-Based Systems: A Meta Systematic Literature Review. *Appl. Sci.* **2021**, *11*, 5519. https://doi.org/10.3390/app11125519

Academic Editor: Andrew Y. C. Nee

Received: 31 March 2021; Accepted: 4 June 2021; Published: 15 June 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Product design is also a fundamental aspect of DTs and environmental sustainability because it is a discipline that deals with many complex decisions and cross-cutting concerns, such as safety, security, usability, or sustainability (including the choice of materials or the use of energy). Furthermore, product design may influence the planning of a production line, another frequent application of DTs [1,2,8]. This means that errors and failures can be predicted and managed with the help of DT approaches, using data analytics, artificial intelligence (AI), and machine-learning techniques.

Sustainable development is defined by the UN (1987), in the Brundtland Report ("Our Common Future") [9], as development that meets the needs of the present without compromising the ability of future generations to meet their own needs. This framework is very useful, as the concept of a circular economy (CE) [10], defined as an economic model that minimizes the consumption of finite resources, is becoming more important and is closely related to the concept of the supply chain [11]. Industry 4.0 was introduced in 2011 and has become synonymous with the smart manufacturing/factory, encompassing established concepts such as computer-integrated manufacturing, the flexible manufacturing system [10], and the CE and, altogether, the management of the huge amount of data generated by DTs [12], allowing the integration of new tools such as those related to the Internet of Things (IoT).

Closely linked to product design optimization and CE is the concept of sustainable product design, defined by Massey [13] as the art of designing buildings, cities, and other artifacts so that they meet the objectives of sustainable development. Product design is not just art; it also involves deciding which materials to use, chosen in such a way that the product is useful for society. A DT, as defined in this research, has a close connection with sustainable product design, but our focus will be on its IT characteristics and benefits.

This paper provides a meta systematic literature review (i.e., an SLR on SLRs) on the topic of the sustainability requirements of DT-based systems. In this context, sustainability requirements are defined as requirements that make sustainable development possible. Merten et al. [14] used knowledge of generic requirements to provide automatic assistance during requirements specification. Paech et al. [15] developed a systematic process for deriving the sustainability requirements of a specific system, i.e., a checklist of general and IT-specific details for each sustainability dimension (environmental, technical, social, economic, and individual), and the influences between them. Additionally, a new model and new concepts, such as needs and the effects between needs (negative, neutral, or positive), were created; this new mindset is used to develop systems that respect a due balance between the different dimensions to achieve sustainability. In the context of IT, this means that a controlled natural language [16] that supports the systematic and rigorous specification of requirements and tests [17,18], such as the ITLingo RSL language [19], has a paramount role in defining software sustainability requirements [20], as it already supports, for example, risks, vulnerabilities, and goals/solutions [21]. These tools belong to the spectrum of model-driven engineering, using textual specifications and conceptual models to improve the efficacy and efficiency of the analysis and design of these IT systems [22] and their usability [23].

To sum up, this paper is a meta-SLR plus an attempt to add new contributions to the environmental sustainability debate. Sustainable development implies a responsible consumption of resources today, and because DTs allow the optimization of operations, they can be a tool for that purpose. When designers test a new product, they might use a DT to virtually test a new implementation without consuming raw materials, simulating the use of environmentally friendly materials and reducing working hours, and only produce the product afterward if the simulation results make sense.

This paper is organized as follows: Section 2 introduces the SLR methodology followed; Section 3 presents the results of the meta-SLR; Section 4 presents a critical analysis to identify future work paths. Finally, Section 5 presents the main conclusions.

#### **2. The Research Methodology**

To develop this study, we considered the SLR methodology proposed by Kitchenham et al. for software engineering [24]; we then looked at the work by Escallón and Aldea, because they presented a methodology that is valuable in the context of this research [25].

The SLR method, as proposed by Kitchenham et al. [24], has three main stages: planning, executing that plan, and analyzing the results. The execution phase has five tasks: (i) extract studies from databases, (ii) eliminate duplicates from the sample, (iii) apply inclusion and exclusion criteria, (iv) gather backward and forward citations, and (v) identify the final dataset of selected papers. If new papers are found at task (iv), the researcher returns to task (ii) and repeats tasks (ii) to (iv) as many times as needed. Additionally, we included certain techniques described by Wolfswinkel et al., namely, the backward and forward citation steps of their selection phase [26]: if a work related to a research task is not present in the currently selected set of references but is considered relevant, it should be added to the selected ones. Since Google Scholar is a very popular tool and supports the backward and forward analysis of citations, it was the main tool adopted. The overall process is shown in Figure 1, inspired by the methodology used by Escallón and Aldea [25].

**Figure 1.** The overall process of SLR (in BPMN notation).

To allow for a more precise fine-tuning of our research, we also considered the work by Ahmad et al. [27]. They explicitly identified and categorized the different types of controlled or common vocabularies (CVs) [27] available and their usage in the requirements specification of software development, which was an important output for us because we wished to focus on identifying sustainability requirements. A CV is an organized collection of terms with well-known meanings, free of the ambiguities or misunderstandings that synonyms could cause. Its purpose is to organize information in a structured and consistent manner, indicating semantic relationships and allowing the simple classification, querying, and retrieval of data [13,28–30]. The most frequent examples of CVs are ontologies, taxonomies, thesauri, and folksonomies. Natural language processing and knowledge management techniques [31,32] often use CV support tools.

#### *2.1. Planning Phase*

In the planning phase, the SLR is designed according to the selected methodology. Firstly, the relationship between sustainability requirements and DT-based systems is addressed, and then the relationship with product design is established. Section 2.1.1 defines the main questions that guided this research; Sections 2.1.2 and 2.1.3 refer to the scientific repositories and queries used in the search process, as well as the inclusion and exclusion criteria.

#### 2.1.1. Research Question

We defined one research question (RQ1) with three sub-questions (SQ). These questions were used to address the main objectives of the research:


The search process considers these questions and the technical definitions available in the existing literature; subsequent reading of the retrieved works is also needed.

#### 2.1.2. Search Process

The following databases or scientific repositories were searched for papers relevant to the questions of this research. These databases were selected because they are very well known among the scientific community and were also used in other SLR papers in the IT domain that we considered as models to follow:


These were the inclusion criteria: (i) articles published in the past 10 years; (ii) studies published in journals, conference proceedings, or indexed books; (iii) studies written in English; and (iv) articles referring to SLRs that were selected in the first database searches. We also included high-quality studies, even when they were short or written in Portuguese. Conversely, these were the exclusion criteria: (i) very short papers (i.e., with fewer than 5 pages); and (ii) duplicated works, i.e., the same work indexed by a database under various names.
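As a minimal illustration (not part of the original protocol), the criteria above can be encoded as a filter predicate; the `Paper` record and its field names are hypothetical, introduced only for this sketch:

```python
from dataclasses import dataclass

@dataclass
class Paper:
    title: str
    year: int
    venue: str        # "journal", "conference", or "indexed_book"
    language: str
    pages: int
    high_quality: bool = False  # manual judgment enabling the stated exception

def is_selected(p: Paper, current_year: int = 2021) -> bool:
    """Apply inclusion criteria (i)-(iii) and exclusion criterion (i).

    Deduplication (exclusion criterion ii) is handled at the collection
    level, not per paper; criterion (iv) depends on the search context.
    """
    included = (
        current_year - p.year <= 10
        and p.venue in {"journal", "conference", "indexed_book"}
        and p.language == "English"
    )
    # Exception: high-quality studies are kept even if short or in Portuguese.
    if p.high_quality and p.language in {"English", "Portuguese"}:
        included = True
    excluded = p.pages < 5 and not p.high_quality
    return included and not excluded
```

Encoding the criteria this way also makes the exception explicit: quality overrides both the length and language restrictions.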

#### 2.1.3. Queries

In the first week of January 2021, we started with the query "systematic literature review" AND "digital twin" (all fields), and then tried other queries, considering not only the main aim of this paper but also the number of papers found and the possibility of searching deeper. The first step was to search for papers regarding a meta-SLR about DTs, and the second step was to search for sustainability requirements. The time period of the search was also considered: first, publication dates between 01/01/2011 and 31/12/2020, and second, publication dates between 01/01/2019 and 31/12/2020. As a consequence, several additional queries were used:


Different databases returned distinct scientific outputs, and consequently, we had to adapt the queries. The search for the keyword "environmental sustainability" was fruitless.

#### *2.2. Execution Phase*

In the execution phase, both the results and the process of executing the SLR are explained. We followed the phases and criteria previously defined, and will now describe our experience during this process, followed by the useful information we were able to extract. The queries were submitted to the already mentioned databases, and the results are presented in Table 1. After obtaining the results from these databases, we completed the following steps: first, eliminate all duplicates; second, based on the paper title, discard articles to which the exclusion criteria apply; third, based on the title, select those articles to which the inclusion criteria apply and the exclusion criteria do not; fourth, repeat the third step, but reading the full text; fifth, for each remaining article, review the reference section and repeat steps 2 to 4; and sixth, for each remaining article, use Google Scholar to review the forward citations of its relevant references and repeat steps 2 to 4.
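Steps five and six make the selection iterative: newly gathered citations are fed back through steps two to four until no new papers appear. The following sketch shows this fixed-point loop; `passes_criteria`, `backward_citations`, and `forward_citations` are hypothetical helper functions standing in for the manual screening and the Google Scholar lookups:

```python
def select_papers(initial_hits, passes_criteria, backward_citations, forward_citations):
    """Iterate the SLR selection (steps 2-6) until no new papers are found."""
    selected = set()
    # Steps 1-4: deduplicate (via set) and screen the initial database hits.
    frontier = {p for p in set(initial_hits) if passes_criteria(p)}
    while frontier:
        selected |= frontier
        candidates = set()
        for paper in frontier:
            candidates |= set(backward_citations(paper))  # step 5: references
            candidates |= set(forward_citations(paper))   # step 6: citing works
        # Re-apply steps 2-4 to the newly found papers only.
        frontier = {p for p in candidates if passes_criteria(p)} - selected
    return selected
```

Because each pass only screens papers not yet selected, the loop terminates once the citation graph yields no new candidates, which mirrors the "repeat as many times as needed" rule of the methodology.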


**Table 1.** Search in Databases.

A first search (see column "First Search" in Table 1) allowed us to identify works related to SLRs, and a second search identified citations using Google Scholar (see column "with Google Scholar"). It was also possible to identify papers that, although not SLRs, are relevant for analyzing the context of DT usage (see column "Non-SLR"). Finally, by reading the abstracts of the papers, it was possible to select 29 papers (column "Selected Papers", with 13 SLR and 16 non-SLR papers). SLR papers are listed in Table 2, and non-SLR papers are listed in Table 3. The types of papers considered are C—conference paper, J—journal paper, T—thesis, and B—book.

The selected SLR papers are very recent: 2020 (10 papers), 2019 (2 papers), and 2018 (1 paper). The same holds true for the selected non-SLR papers: 2020 (12 papers), 2019 (2 papers), 2016 (1 paper), and 2013 (1 paper). This situation is not surprising, because the technologies and issues surveyed in this meta-SLR are very recent.


#### **Table 2.** Set of selected SLR papers.

#### **Table 3.** Set of selected non-SLR papers.


#### **3. Literature Review and Results**

This section presents the papers relevant to (i) the identification of DT sustainability requirements, and (ii) the identification of the relationship between DTs and product design. These two dimensions allowed the mapping of the answers considered in this paper. Papers were sorted along these two dimensions according to the dominant theme of each work.

#### *3.1. Digital Twins and Sustainability Requirements*

In our set of selected papers, there are 4 SLR papers and 4 non-SLR papers mainly related to DTs and sustainability.

Pokhrel, Katta, and Palacios [3] (S3) study the definition of DT and the state of the art of DT development, including reported work on the usability of DTs for cybersecurity, using the SLR methodology. Regarding incident prediction, the reported use cases of DTs are: intrusion detection; anomaly detection; monitoring (remote and on-site); virtual commissioning; autonomy; predictive analytics; documentation; and communication. Security is a major dimension of sustainability (for example, if equipment is dangerous, its daily usage is probably impossible), and their paper is an example of a relevant SLR application in the field.

Rosa et al. [10] (S5) assess the relationships between CE and I4.0 using the SLR methodology. They stress hybrid categories such as Circular I4.0 and Digital CE, and move forward to identify the main benefits of integrating CE and I4.0, such as production technologies, financial performance, market expansion, supply chain management, product life-cycle management, workforce empowerment, and business models.

Ejsmont, Gladysz, and Kluczek [33] (S6) use a bibliometric literature review to evaluate the impact of Industry 4.0 on sustainability. They find that authors who deal with CE usually also study sustainable supply chains; moreover, I4.0 concepts such as sustainability, big data, smart manufacturing, IoT, sustainable development, digital transformation, and the industrial IoT are frequently addressed. Cyber-physical systems, sustainable manufacturing, the smart factory, and digitalization are also popular concepts. Possibly the main conclusion of their paper is that the positive sustainability outcome of these technologies is not guaranteed; success requires supportive measures and specific policies to ensure the competitiveness of local actors.

Rajput and Singh [34] (S7) present an Industry 4.0 model for CE and cleaner production. Their model is built with mixed-integer linear programming (MILP) to optimize product-machine allocation, e.g., optimizing the trade-off between energy consumption and machine processing cost. In this model, sensors are also deployed to capture real-time information in the Industry 4.0 facility.
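To illustrate the kind of trade-off such a model captures (without reproducing the authors' MILP formulation, which is not given here), the sketch below performs an exhaustive search over product-machine assignments with invented energy and cost figures; a real MILP solver would replace the brute-force `min`:

```python
from itertools import product as assignments

# Hypothetical per-(product, machine) figures, for illustration only.
energy = {("P1", "M1"): 5.0, ("P1", "M2"): 3.0,
          ("P2", "M1"): 2.0, ("P2", "M2"): 4.0}   # kWh per unit
cost = {("P1", "M1"): 10.0, ("P1", "M2"): 14.0,
        ("P2", "M1"): 9.0, ("P2", "M2"): 6.0}     # processing cost per unit
products, machines = ["P1", "P2"], ["M1", "M2"]

def best_allocation(w_energy: float):
    """Minimize a weighted sum of energy and processing cost over all
    product-to-machine assignments (exhaustive stand-in for the MILP)."""
    def objective(assign):
        return sum(w_energy * energy[(p, m)] + (1 - w_energy) * cost[(p, m)]
                   for p, m in zip(products, assign))
    return min(assignments(machines, repeat=len(products)), key=objective)
```

Sweeping `w_energy` from 0 to 1 traces the trade-off: with the weight on cost the optimum here is `("M1", "M2")`, while with the weight on energy it shifts to `("M2", "M1")`.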

Iñigo [41] (NS2) stresses the complexity of the new I4.0 and the new tools associated with it, for example, 3D printing, to allow the optimization of manufacturing.

Because there is no production without energy (meaning electricity at the factory level, with non-renewable sources still being the paramount contribution to electricity production), the next three papers (i.e., NS1, NS3, and NS4) look at the general requirement of the "responsible use of energy".

Nimbalkar et al. [8] (NS1) and Supekar et al. [43] (NS4) present a framework for quantifying the energy and productivity benefits of smart manufacturing technologies. Breweries are the example used to demonstrate this framework and the implementation of smart manufacturing technologies. To determine the feasibility of a set of smart manufacturing interventions, the framework uses the cost of conserving energy (CCE) as a complementary measure. Its focus is the quantification and analysis of energy productivity, and a strategic analysis framework has been developed to estimate cost-effective improvements in energy efficiency and productivity using smart manufacturing.

Saad, Faddel, and Mohammed [42] (NS3) study the effective and efficient design and implementation of DTs for energy cyber-physical systems. With the emergence of distributed energy resources (DERs), with their communication and control complexities, it is fundamental to guarantee an efficient platform that can digest all the incoming data and ensure the reliable operation of the power system. To build this supporting technology, two DT types are introduced: one to cover high-bandwidth applications, and another for low-bandwidth applications that need centralized oversight and decision-making. The validation and testing of this approach were performed using Amazon Web Services (AWS) as a cloud host that incorporates physical and data models and can additionally receive live measurements.

#### *3.2. Digital Twins and Product Design*

The following papers are important to establish a bridge between DTs and product design. These papers introduce and analyze aspects that should be considered in real scenarios when designing or building DTs. In our set of selected papers, there are 9 SLR papers and 12 non-SLR papers mainly related to DTs and product design:

Jones et al. [1] (S1) try to characterize the DT concept using an SLR. The authors acknowledge that a variety of definitions are employed across industry and academia. To clarify the definition, they identified 13 characteristics of DTs, namely: physical entity/twin; virtual entity/twin; physical environment; virtual environment; state; realization; metrology; twinning; twinning rate; physical-to-virtual connection/twinning; virtual-to-physical connection/twinning; physical processes; and virtual processes. They also present a complete framework and operation process for the DT.

Liu et al. [2] (S2) provide a literature review on DTs based on concepts, technologies, and industrial applications. They evaluate the current state of the art, discuss the concept of the DT, and analyze certain key enabling technologies of DTs. Additionally, they discuss fifteen industrial applications with their respective life cycles, and also present valuable observations and future work recommendations for DT research.

Josifovska, Yigitbas, and Engels [4] (S4) develop a reference framework for DTs within cyber-physical systems (CPSs). The authors define CPSs as system representations that integrate physical units and processes with computational entities over the internet, allowing ubiquitous access to information and services. The framework establishes a relationship between the 5-level CPS architecture and the DT framework, to answer open questions and challenges on how to design and realize CPSs.

Barth et al. [35] (S8) systematize DTs, creating an ontological and conceptual framework. Furthermore, these authors try to answer three research questions: (i) Which dimensions are used to classify and structure DTs in academic literature? (ii) What are the fundamental differences or specifications within these dimensions? and (iii) How do these different specifications relate to each other?

Rub and Bahemia [36] (S9) try to understand the current reality of smart factory implementation using an SLR. They identify a research gap related to the make-or-buy decision around DTs and other core components of the smart factory. This is significant because it is assumed that the smart factory leads to the creation of value, but that creation depends upon the way the factory is implemented—whether the implementation project is executed in-house or using an external supplier.

Dave et al. [37] (S10) discuss the new possible reality of smart cities, where concepts such as IoT, big data, AI, robotics, and DTs are paramount, and stress the role of the latter. Smart cities, manufacturing, and healthcare are considered the main fields of application for DTs. An example is the use of DTs in traffic management systems: traffic cameras that merely record can have their recordings used to build traffic management models that reduce congestion, providing additional data to update a road network with real-time decisions. The authors conclude that there is a need for demonstration sites to test the new technologies with real data, and a need for extensive professional panels of experts in diverse research fields, for example, urban development, IT, transportation, and environmental policies.

Polini and Corrado [38] (S11) present an example of a DT for the stone-sawing process. The authors describe the DT, but have concerns regarding the accuracy of the equipment and its efficiency and efficacy.

Strmecki et al. [39] (S12) use an SLR to study the possible application of ontologies in automatic programming. Ontologies, which are typically considered as a technique or an artifact used in one or more software life-cycle phases, may be used to help achieve the goal of finding higher abstraction levels, and ways to reuse software to increase its productivity and quality, within the discipline of software engineering.

Sjarov et al. [40] (S13) study the DT concept in industry in a systematic way. The authors acknowledge significant growth in the number of scientific studies since 2015 (industry-related publications per year carrying "Digital Twin" in their title). The surveyed studies show a variety of applications of DTs, ranging from products and processes to whole production systems. Explicit definitions were found to be partly conflicting, and similar notions such as "Product Avatar" and "Digital Shadow" were also identified. Their paper extends the theoretical foundation, setting a basis for future, improved DT modeling.

Schweiger, Barth, and Meierhofer [44] (NS5) focus their work on the data resources needed to create DTs. DTs are considered one of the key technologies for organizations moving from producing goods to offering services. The main hypothesis is that a large part of the new data produced, or resources, are already generated during the beginning of life (BOL) phase of the product life cycle, but are not used in the middle of life (MOL) phase. The new framework allows a better understanding of how to use data resources from BOL phases in MOL phases, and permits the creation of an ontology of product data, making the creation and maintenance of DTs easier.

Rivera et al. [45] (NS6) look closely at the engineering of IoT-intensive DT software systems. The authors assume the real DT to be a product equipped with several sensors or computing devices that generate, consume, and transfer data for different purposes. Due to this reality, they consider DTs to be, to a large extent, IoT-intensive systems.

Lutze [46] (NS7) studies DT-based software design in eHealth as a new development approach for health/medical software products. The author's DT concept builds on (i) a personal digital twin of the patient, (ii) a group digital twin modeling the designated user group of the software, and (iii) a system digital twin for the software product itself. Agile development techniques are considered to offer better support possibilities than classic V-model-based software development.

Valk et al. [47] (NS8) present a taxonomy of DTs based on an SLR. To accomplish this task, several dimensions of DTs are pointed out: data link, purpose, conceptual elements, accuracy, interface, synchronization, data input, and creation time.

Jay [48] (NS9) identifies the infrastructure requirements for the creation of DTs. In addition, the author notes that, after 2015, simulation became a core functionality of systems, offering seamless assistance along their entire life cycle, i.e., supporting operation and service with a direct link to operation data.

Perno and Hvam [49] (NS10) investigate the processes of the manufacturing industry and develop a framework for scoping DTs in that context. Due to the novelty of the concept and the broad range of technologies upon which it is built, the process of scoping Digital Twin projects can prove to be daunting for process-manufacturing companies.

Arias [50] (NS11) introduces the new "Plattform Industrie 4.0" project and, after briefly describing its technologies, considers its implications.

Leon and Horita [51] (NS12) try to overcome two challenges: (i) how to decentralize existing legacy systems to provide a technology solution that meets the new needs of users in this more digital society; and (ii) how to create a systems architecture that addresses the characteristics inherent in digital transformation.

Sensuse et al. [52] (NS13) use SLR to identify qualitative research in ontology engineering. The main purpose of these authors is an operationalization of socio-technical ontology engineering methodology. This methodology consists of five main phases, namely: (i) planning, (ii) analysis, (iii) design, (iv) implementation, and (v) evaluation.

Dermeval [53] (NS14) uses an SLR to identify the applications of ontologies in RE (requirements engineering). The main findings of this research are that: (i) there is empirical evidence of the benefits of using ontologies in RE activities, especially for reducing the ambiguity, inconsistency, and incompleteness of requirements; (ii) the RE process is usually only partially addressed, for example, only considering functional requirements; (iii) ontologies support a great diversity of RE modeling styles; (iv) several studies describe the use/development of tools to support different types of ontology-driven RE approaches; (v) about half of the studies followed W3C recommendations on ontology-related languages; (vi) a great variety of RE ontologies were identified, although none of them has been broadly adopted; and (vii) several promising research opportunities were identified. Other authors also have valuable inputs to this discussion [3,13,37].

Rocca et al. [54] (NS15) try to bring together VR, DTs, and CE practices, and present a laboratory application case: virtually testing the configuration of a waste from electrical and electronic equipment (WEEE) disassembly plant, using a set of dedicated simulation tools. The authors stress the importance of their work due to the increasing awareness of customers of climate change effects, the high demand instability affecting several industrial sectors, and the fast automation and digitalization of production systems.

Fatwanto [55] (NS16) proposes a software requirements specification analysis using natural-language processing techniques. The author tries to improve the software production process.

#### **4. Discussion**

The study of DTs is recent, gaining momentum mainly after 2015 [33,40], and the definition of DT remains unclear. However, regardless of the growing complexity of their applications, it is agreed that there are several benefits to using DTs, such as the optimization of Industry 4.0 processes and the sustainability of the product design process. In addition, the lack of studies with technical details can be a difficulty when adopting this technology. A closer look at the possible environmental sustainability benefits at the product design level can help build a clear understanding of this reality.

The analysis of the available literature allows us to identify several aspects of the relationship between DTs, product design, and sustainability; hence, to answer the original research question, we first discuss the sub-questions (SQi) involved:

SQ1: What is the relationship between DTs and product design?

Concerning SQ1, from the set of selected papers we verify that there are primarily two relationships between DTs and product design: (i) DTs are digital models of physical products fed with real-time data, having an important role in understanding real behaviors and needed adaptations, and (ii) tests using DTs are less expensive and easier than building new physical prototypes.

SQ2: What are the environmental sustainability requirements of DTs?

Concerning SQ2, and based on the selected literature, the main classes of environmental sustainability requirements are: (i) control of energy consumption and (ii) use of environmentally friendly materials. When CE at the I4.0 level is considered, there is a clear need for a trade-off between complexity and energy consumption versus the results of the new technology implementation. Running DT tests while creating less complex products is, possibly, a main topic of research.

SQ3: What are the open issues and challenges in future research paths for DTs and sustainability?

Concerning SQ3, at first look and as already mentioned, reducing the complexity of building and setting up DTs is only one future research path. The impact of complexity and energy consumption is paramount when the decision to introduce DTs at the factory level is considered. This has an overall impact on sustainability in its security, environmental, and financial dimensions. The discussed meta-SLR allowed us to identify several possible open issues and challenges for research.

First, methods and processes to design and implement DTs are needed. Since there are several specific application domains, each with its own characteristics, these should be considered. Although there are detailed descriptions of DTs [38,56], it is unclear how to realize them, especially without previous experience of doing so.

Second, SLR studies point out gaps such as: (i) perceived benefits have not been identified; (ii) the DT across the product life cycle, or the DT life cycle itself, is not sufficiently studied (whole life cycle, evolving digital profile, historical data); (iii) DTs have not been created, and thus it is not clear how DTs contribute to reducing cost, improving service, or supporting decision-making; (iv) technical implementations must be improved and detailed in the context of the IoT; (v) the level of fidelity is not evaluated in terms of the number of parameters, their accuracy, and levels of abstraction; (vi) the ownership of data stored within the DT must be determined; and (vii) integration between virtual entities must be improved, because better methods are needed for communication [1]. As already mentioned, DTs are a recent technology, and this fact partially explains these gaps and diversified future research paths.

At the same time, there is an attempt to clearly define the DT concept, for example, by classifying existing standards such as the "Plattform Industrie 4.0", which describes a standardized DT in I4.0 [47]. If there are several approaches and contexts for defining DTs, how can we identify the most important gaps in the study of sustainability in DT usage? We start from the assumption that a DT consists of three parts: (i) a physical product, (ii) a virtual product, and (iii) connections and data flowing between them [47]. Then, we assume that, at the level of the technical implementation and at a particular level of fidelity, it is possible to identify its main contributions to sustainability.

RQ1: What is the state of the art in the area of sustainability requirements of DT-based systems related to product design?

Finally, trying to answer our original research question RQ1, we must address it carefully. Despite the existing gaps, the literature identifies several sustainability requirements for DT-based systems related to product design, namely: (i) fidelity; (ii) energy control; (iii) complexity control; (iv) identification of environmentally friendly and cost-efficient materials; and (v) easy reproduction of new product designs. Different studies investigate distinct sustainability requirements, and there is no integrated approach to understanding how DTs can create environmental sustainability. An integrated approach would imply additional complexity; for example, fully rigorous fidelity implies additional time and further energy consumption, and this might create a trade-off, leading to fuzzy fidelity. This fuzzy fidelity means that environmental costs remain external to production, because their evaluation would also imply further work and costs. Easy reproduction of new product designs might imply a reduction of costs at the production stage, but the costs of the first steps of DT implementation might explain why this requirement is such a demanding one.

#### **5. Conclusions**

In this work, it has been possible to identify relevant research regarding the study of DT-based systems and technologies, using the SLR methodology as the main tool for a meta-analysis of SLRs on the subject. Special attention was paid to the choice of vocabulary used to perform the search in several databases. Based on that analysis, it was possible to answer RQ1 as well as the sub-questions SQ1, SQ2, and SQ3.

There are five main concerns to address in the development of a sustainable DT: (i) fidelity; (ii) energy control; (iii) complexity control; (iv) identification of environmentally friendly and cost-efficient materials; and (v) easy reproduction of new product designs. It is also possible to identify areas of research related to DTs, namely: (i) the study of the DT concept and definition [1,4,35,36,40,47]; (ii) the presentation of examples [38,56]; and (iii) the use of SLRs to understand current research and to identify future research paths [2,10,33,37].

This analysis allows us to identify two main gaps that correspond to two future research paths. The first gap is the absence of a detailed paper explaining exactly what a DT is, with an extensive and rich example that covers even the hardware characteristics and shows how the connections between the physical and digital dimensions can be designed, developed, and maintained. This reality might be explained by the unwillingness of industry to disclose sensitive information; this is a gap to be filled by academia.

Secondly, we identified papers that present sustainability in DT applications and, additionally, studies that distinguish between several types of sustainability, including environmental sustainability. Furthermore, we identified papers where the connection between CE and the usage of DTs, in the context of Industry 4.0, is clearly stated. However, it was not possible to find a paper that discusses an SLR of DTs focused solely on environmental sustainability. Is this merely a vocabulary issue, with CE being equal to environmental sustainability? We believe the answer is no, because CE at the factory level is still a concept at the laboratory stage, and the complexity implied by its implementation might be paramount. In other words, the question is whether CE is environmentally sustainable when the management and technical complexity required to achieve it is so noteworthy. To evaluate this hypothesis, a future research path might be the study of a scenario where a zero-environmental-impact product, used as a control, is the design objective, with the help of a DT.

In summary, this paper presents a meta-SLR on DT-based systems that allows us to identify and discuss the main classes of requirements to consider in the development of a sustainable DT, and also to identify gaps and limitations in both research and practice.

**Author Contributions:** Conceptualization, All; formal analysis, R.C.; funding acquisition, A.R.d.S.; investigation, All; evaluation methodology, R.C.; supervision, A.R.d.S.; writing—original draft, R.C.; writing—review and editing, All. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was partially funded by Portuguese national funds through FITEC-Programa Interface, with reference CIT "INOV-INESC Inovação-Financiamento Base", and FCT UIDB/50021/2020. The APC was funded by FITEC-Programa Interface, with reference CIT "INOV-INESC Inovação-Financiamento Base".

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Acknowledgments:** Work supported by funds under FITEC-Programa Interface, CIT INOV- INESC Inovação-Financiamento Base, and FCT UIDB/50021/2020.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**


#### **References**

