**Optimization Design and Flexible Detection Method of a Surface Adaptation Wall-Climbing Robot with Multisensor Integration for Petrochemical Tanks**

#### **Minglu Zhang, Xuan Zhang, Manhong Li \*, Jian Cao and Zhexuan Huang**

School of Mechanical Engineering, Hebei University of Technology, Tianjin 300130, China; zhangml@hebut.edu.cn (M.Z.); 201811201016@stu.hebut.edu.cn (X.Z.); 201921202068@stu.hebut.edu.cn (J.C.); lnzx0508@163.com (Z.H.)

**\*** Correspondence: 2015038@hebut.edu.cn; Tel.: +86-138-0209-7213

Received: 27 October 2020; Accepted: 18 November 2020; Published: 20 November 2020

**Abstract:** Recently, numerous wall-climbing robots have been developed for petrochemical tank maintenance. However, most of them are difficult to apply widely due to common problems such as poor adsorption capacity, low facade adaptability, and low detection accuracy. To realize automatic and precise detection, an innovative wall-climbing robot system was designed. Based on magnetic circuit optimization, a passive adaptive moving mechanism that can adapt to walls of different curvatures was proposed. To improve detection accuracy and efficiency, a flexible detection mechanism incorporating a Hooke hinge, which realizes passive vertical alignment of the probe, was designed to meet the detection requirements. Through the analysis of mechanical models under different working conditions, a hierarchical control system was established to complete wall thickness and film thickness detection. The results showed that the robot could move safely and stably on the facade and complete automatic, precise detection.

**Keywords:** wall-climbing robot; passive adaptive mechanism; magnetic circuit optimization; flexible detection method

#### **1. Introduction**

With the rapid development of industries, an increasing number of spherical and cylindrical tanks have been used to store industrial products in the petrochemical field. Different degrees of damage in storage tanks have gradually emerged due to the open environment and natural aging, and regular maintenance has been adopted to ensure the safety of operation. However, traditional maintenance methods require a large number of humans and resources that are inefficient, costly, and dangerous [1–5]. Thus, developing a reliable and flexible wall-climbing robot has become a hot spot in the field of tank maintenance, as such a robot can realize the high precision detection of different detection modules under high risk and in complex petrochemical tanks [6–10].

At present, the adsorptive, moving, and detection mechanisms of wall-climbing robots have been extensively studied. Some typical robot systems have been developed and applied in various fields. The adsorption mechanism is the primary condition for ensuring robot movement on a facade. Wall-climbing robots employ different adsorption mechanisms for different working surfaces and moving modes. Numerous studies have revealed the following five adsorption modes: permanent magnet, electromagnetic, negative pressure, molecular force, and mixed adsorption [11–17]. Navaprakash et al. [18] used the principle of negative pressure adsorption to design an adsorption mechanism and verified its safe and stable adsorption on non-magnetic facades through software simulation. Chen et al. [19] designed a wall-climbing robot that uses a rotational-flow suction unit to climb rough walls and cross small obstacles. Demirjian et al. [20] designed a
caterpillar wall-climbing robot based on bionic principles that uses binder materials and breaks with traditional adsorption concepts. Seriani et al. [21] used wall-climbing robots on both sides of a wall to adsorb each other so as to realize the safe adsorption and stable movement on a non-magnetic wall. Wang et al. [22] optimized the magnetic circuit through the finite element analysis method and designed a new type of permanent magnet wheel with the same magnetic pole array arrangement that considerably improved the adsorption efficiency of the magnet. Wen [23] proposed an adjustable variable magnetic adsorption mechanism to realize the stability detection of a robot on the outer walls of storage tanks. Eto et al. [24] innovatively designed a two degrees-of-freedom (DOF) rotating magnetic attachment mechanism that maintains the optimal adsorption state of the magnet through passive adjustment and realizes safe and stable adsorption on different walls. Xiao et al. [25] designed a new steady-state permanent magnet adsorption operation mechanism to accomplish stable adsorption on complex facades. Fan et al. [26] combined electromagnetic and internal force compensation principles to realize the fast, controllable adsorption and separation of wall-climbing robots.

Many research institutions have developed a large number of wall-climbing robots for industrial applications based on the above adsorption mechanisms by combining mobile mechanisms and detection methods. By integrating viscous materials and a wheel-legged moving mechanism, Amirpasha et al. [27] innovatively proposed a wheel-legged climbing robot that can achieve large obstacle crossing and wall transition. Wang et al. [28] creatively designed a bipedal, three-DOF wall-climbing robot to realize the detection of wind turbine blades. Huang et al. [29] designed a crawler robot for ship detection by integrating a caterpillar structure and the magnetic adsorption mechanism that could realize the large-area detection of complex walls. Zhang et al. [30] designed a wall-climbing de-rusting robot for ship welds based on the visual recognition method of three-line laser structured light. Zhang et al. [31] developed a crawler wall-climbing robot to remove coatings based on high pressure water jet technology. In addition, numerous wall-climbing robots have been developed for petrochemical maintenance and other fields [32–37]. Mizota et al. [38] proposed a control method for the compliant motion of a wall-climbing robot based on propelling wave theory to realize stable and flexible movements on a facade. Wu et al. [39] innovatively proposed a coordinated control method based on task trajectory tracking to realize the compliant detection of robots. Zhang [40] used an intelligent perception system to compliantly control a robot and to realize autonomous adaptive full-range detection over complex terrain. Song et al. [41] proposed an intelligent discrete trajectory tracking control algorithm based on the improved Dual-Heuristic Dynamic Programming (DHP) algorithm to solve the circular trajectory movement of a robot on a vertical wall.

Numerous wall-climbing robots have been developed and applied for petrochemical maintenance. However, current research is generally in a bottleneck state due to the limitations of reliable adsorption, surface adaptability, and detection devices, and the following three problems should be urgently solved. (1) Permanent magnet adsorption mechanisms have low magnetic energy utilization and adsorption capacity due to the limited analysis of the transfer mechanism of the multimedium magnetic circuit. (2) Moving existing wall-climbing robots smoothly on curved surfaces with changeable morphologies is difficult due to insufficient studies on passive flexible adaptive moving mechanisms. (3) Achieving the vertical alignment of the probe for different detection modules while sticking to the facade is difficult for existing detection mechanisms, thus affecting detection effects and accuracy.

Here, a wall-climbing detection robot that can realize multimode non-destructive testing on different walls is proposed on the basis of the above-mentioned problems. A high performance permanent magnet wheel was designed on the basis of magnetic circuit optimization to solve the safe adsorption problem, and a rapid demagnetization structure for the wheel was designed to facilitate the robot's removal from the wall after detection. Different from the traditional wall-climbing mechanism with rigid connections, the wheels in this paper are flexibly connected with the moving mechanism to form a pseudopodia robot that can adapt to curved surfaces and move flexibly on the surfaces of spherical and cylindrical storage tanks. To improve the detection accuracy and efficiency of existing testing equipment, a flexible adaptive detection mechanism with multiple DOFs is proposed that passively adapts to wall surfaces by integrating a Hooke hinge mechanism. A dynamic model of the wall-climbing robot was established on the basis of different working conditions to solve the torque distribution problem of the wheels under different motion modes. Through different process controls, the robot can use ultrasonic and eddy current probes to detect the thicknesses of the wall and paint film, respectively. Experiments were conducted on a 5-mm-thick cylindrical tank surface to test the structure and detection capability of the robot. The experiments showed that the robot can move flexibly and stably on different facades. Simultaneously, the robot can accurately detect the thicknesses of walls and paint films by carrying different detection probes, which can replace manual work to a certain extent.

The remainder of this paper is organized as follows. Section 2 introduces the structure of the detection robot, which mainly includes the magnetic adsorption moving mechanism and the passive flexible detection mechanism. Section 3 establishes mechanical analysis models for different working conditions and motion modes to determine the minimum adsorption and driving forces of safe and stable motions. Section 4 introduces the hardware composition of the control system and flexible detection process control flow with multiple detection capabilities. Section 5 presents the experimental process and analysis results. Section 6 provides several conclusions drawn from this research.

#### **2. Introduction to Detection Robot**

A wall-climbing detection robot adapted to walls of different curvatures was developed while considering the varied morphology of petrochemical tanks. The robot mainly comprises an adaptive moving mechanism, magnetic adsorption wheels, and a flexible detection mechanism with multiple DOFs. The working environment of the robot comprises facades with different curvatures. Thus, solving the problems of safe adsorption and stable movement of the moving mechanism, as well as the flexible adaptation and accurate measurement of the detection mechanism, was necessary. Therefore, a wall-climbing robot with flexible detection was developed. This robot can steadily adsorb and complete different detection tasks on different facades to meet the requirements of petrochemical tank detection. A high performance magnetic wheel structure that can be quickly demagnetized is also proposed. This structure coordinates with the design of the multi-DOF moving mechanism to passively adapt to walls of different curvatures and ensure safe and stable movement. A flexible detection mechanism was designed in accordance with the operational requirements of the different detection modules by integrating rope pulling and a Hooke hinge mechanism to realize the self-adaptive vertical alignment of the probe and adapt to different detection techniques. The detection robot can realize the precise movement and action of different detection process flows through a state control strategy and finally complete the wall detection tasks. The specific structure of the robot is shown in Figure 1.

**Figure 1.** Overall structure of the wall-climbing detection robot.

#### *2.1. Magnetic Wheel*

#### 2.1.1. Structural Design of the Magnetic Wheel

The permanent magnet adsorption mechanism is the crucial point in the design of a wall-climbing robot, because it is directly related to safe adsorption and stable movement on a wall surface. The robot movement is stable and the safety factor is large when the magnetic wheel adsorption capability is strong. However, the friction between the magnetic wheels and the wall surface increases with the adsorption force, and the resistance to be overcome during movement is large, thus requiring a high driving torque. Simultaneously, detachment from the wall becomes difficult for the magnetic wheel after completing a detection task. Therefore, designing a lightweight wheel with strong adsorption was the pivotal technical problem to be solved in this paper. A new method using the combination of a fan-shaped permanent magnet and yoke iron as the excitation source is proposed to improve the utilization rate of the magnet. At the same time, to detach the robot from the wall after completing the task, a fast demagnetization method was designed using the lever principle. The specific structure is shown in Figure 2.

**Figure 2.** Magnetic wheel structure with fast demagnetization.

In the process of adsorption force production by the magnetic wheel, most magnetic induction lines come from the small part of the magnet close to the wall surface. Therefore, a radially magnetized fan magnet (Nd2Fe14B) was selected as the excitation source to reduce the weight and provide a strong adsorption force. Yoke iron was used to collect the magnetic induction lines because its high permeability can reduce magnetic flux leakage and improve the utilization ratio of the magnetic energy. Figure 2 shows that the fan magnet was placed in the suspension of the wheel, which can rotate relative to the wheel hub. When the output shaft transmits motion to the hub through a key, the permanent magnet always remains stationary relative to the wall and does not rotate with the hub, which not only maintains a constant adsorption force but also avoids relative motion with the wheel. Actively reducing the adsorption force between the magnetic wheel and the wall, that is, demagnetizing the magnetic wheel, is necessary to facilitate the robot's detachment from the wall after the detection task. In the adsorption state, a small tangential force can force the magnet to rotate relative to the wall, which reduces the adsorption force between the magnet and the wall. Thus, a fast demagnetization mechanism was designed on the basis of the lever principle, which facilitates magnet rotation around the output axis, thereby completing the demagnetization.

#### 2.1.2. Optimization of Magnetic Wheel

The magnetic wheel structure was optimized to obtain a high performance, lightweight magnetic wheel. The adsorption force of a magnetic wheel whose outside diameter and width are fixed is affected by the air gap *h*, the thickness of the yoke iron *H*, and the shape of the magnet (the inner radius *Rin* and the magnet angle θ). The electromagnetic field analysis software Ansoft was used to analyze the magnetic field strength of the permanent magnet and determine the relationship between the magnetic wheel parameters and the magnetic field strength, so as to realize a lightweight magnetic wheel and ensure the reliability of adsorption. This analysis provided a reference for motion assessment and improved the magnetic utilization rate.

According to the principle of a single variable, the relationship between wheel adsorption force *F* and variable can be obtained by changing the air gap *h*, inner radius *Rin*, and angle of magnet θ. The simulation results are shown in Figure 3.

**Figure 3.** Influence of magnetic wheel parameters on the adsorption force: (**a**) The relationship between the adsorption force and the air gap height under different inner radius of the magnet, (**b**) the relationship between the adsorption force and the air gap height under different yoke iron thicknesses, (**c**) the relationship between the adsorption force and the air gap height under different magnet center angles, and (**d**) description of magnetic wheel structure and size.

The magnetic wheel adsorption force was found to be inversely related to the distance from the wall in all three panels; the farther the wheel is from the wall, the weaker the adsorption force. Figure 3a shows that an increase in the inner radius *Rin* allows the magnet to contain more magnetic energy, excite a higher magnetic field intensity, and produce a stronger adsorption force. Figure 3b indicates that the capability of the yoke iron to collect magnetic induction lines increases with the yoke iron height *H*. This makes magnetic saturation harder to reach and enhances the utilization ratio of the magnetic energy product, thereby improving the magnetic field strength and adsorption force. Figure 3c reveals that the effective transfer area between the magnet and the wall surface increases with the magnet angle θ, which improves the adsorption performance of the magnet. Considering the volume limitation of the wheel, the shape of the magnet was optimized in accordance with the functional relationship between the magnetic field strength and the geometric parameters of the magnetic wheel (*h*, *Rin*, θ, and *H*). Continuous nonlinear programming can use the response surface method to approximate the finite element response characteristics, which is very suitable for solving optimization problems with a finite number of variables. Therefore, the continuous nonlinear programming module in Ansoft Maxwell was adopted to optimize the three parameters of the magnetic wheel. The iterative process is complex and tedious but commonly used; thus, a comprehensive description of the solution process is omitted. Finally, a permanent magnet wheel with good performance was designed, and the specific mechanism sizes are shown in Table 1.
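The parameter search described above can be sketched as a simple constrained search over the wheel parameters. The surrogate force and mass models below are purely illustrative stand-ins for the Ansoft Maxwell finite element results; the coefficients, parameter bounds, and the 120 N force floor are assumptions, not values from this paper.

```python
import itertools

def adsorption_force(h, r_in, theta, H):
    """Illustrative surrogate for the FEA adsorption force (N).

    Trends follow Figure 3: force falls with air gap h (mm) and rises
    with inner radius r_in (mm), magnet angle theta (deg), and yoke
    height H (mm). Coefficients are hypothetical.
    """
    return 250.0 * (r_in / 20.0) * (theta / 90.0) * (1.0 + 0.1 * H) / (1.0 + h) ** 2

def wheel_mass(r_in, theta, H):
    """Illustrative mass proxy (kg): more magnet and yoke material is heavier."""
    return 0.01 * theta * (22.0 - r_in) / 10.0 + 0.05 * H

best = None
for h, r_in, theta, H in itertools.product(
        [0.5, 1.0, 1.5], [14, 16, 18, 20], [60, 75, 90], [3, 4, 5]):
    F = adsorption_force(h, r_in, theta, H)
    if F < 120.0:                              # assumed minimum safe adsorption force (N)
        continue
    score = F / wheel_mass(r_in, theta, H)     # favour light, strongly adsorbing wheels
    if best is None or score > best[0]:
        best = (score, h, r_in, theta, H)
```

A response-surface optimizer such as the one in Ansoft Maxwell replaces this brute-force loop with iterative surrogate refinement, but the objective (force per unit mass under a minimum-force constraint) is the same idea.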


**Table 1.** Comparison of magnetic wheel characteristics before and after optimization.

The magnetic wheel adsorption experiment was conducted on an arc facade to test the adsorption capability of the magnetic wheel. The adsorption capacity of the wheel was tested in the horizontal, vertical, and oblique states. The actual adsorption force was obtained by reading, through a dynamometer, the maximum pull value of the magnetic wheel in the adsorption state on the wall (the pull value at the instant the wheel leaves the wall is the maximum pull value). The influence of gravity was removed in all cases during data recording. The specific values are shown in Table 2.

**Table 2.** Adsorption force of the magnetic wheel under different conditions.


The actual adsorption force of the magnetic wheel, obtained by averaging the above values, is 120 N. This adsorption capacity meets the safe adsorption requirement of the wall-climbing robot.
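The reduction of the dynamometer readings to a single adsorption force can be sketched as follows. The wheel mass, trial angles, and peak readings below are hypothetical; only the procedure (subtracting the gravity component along the pull axis, then averaging) mirrors the experiment described above.

```python
import math

g = 9.8            # m/s^2
wheel_mass = 1.2   # kg, hypothetical magnetic wheel mass

# (wall inclination from vertical in degrees, peak dynamometer reading in N)
# The three trials below are hypothetical values, not Table 2 data.
readings = [(0.0, 128.4), (45.0, 124.1), (90.0, 117.9)]

forces = []
for angle, peak in readings:
    # Assumed geometry: the gravity component along the pull axis is
    # m*g*sin(angle) - zero for a vertical wall, maximal for an overhead one.
    forces.append(peak - wheel_mass * g * math.sin(math.radians(angle)))

mean_force = sum(forces) / len(forces)
```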

Thus far, a wheeled adsorption mechanism was innovatively designed. A lightweight permanent magnet wheel with strong adsorption capability was obtained through magnetic circuit optimization design and multivariable simulation optimization. The actual adsorption capacity of the magnetic wheel under different working conditions was then measured by experiments, and the adsorption performance of the magnetic wheel was verified.

#### *2.2. Passive Adaptive Moving Mechanism*

A detection robot works on a cylindrical or spherical facade, and the adaptability of the moving mechanism to the complex wall is directly related to movement safety and stability. Traditional moving mechanisms have difficulty adapting to tanks of small curvature, which easily leads to slipping and instability, thus affecting work efficiency and operation safety and making the stable movement of the robot on the wall a problem. The possible instability of the detection robot was analyzed to solve this problem, which mainly includes the following two points. (1) A single front wheel is forced to leave the wall surface when the detection robot encounters an obstacle. Another wheel on the same side with a similar connection also leaves the surface due to the rigidity of the robot. This directly leads to a sharp decline in the adsorptive capacity of the robot on the wall surface, making the robot prone to instability. (2) The robot's movement on a curved surface leads to an incomplete fit between the wheels and the wall. Ensuring sufficient adsorption force is then difficult, and the decrease in contact area easily causes instability. Different from the traditional wheeled moving mechanism, we combined the multi-DOF deformation concept to design an innovative moving mechanism capable of passive surface adaptation. The close contact between the wheels and the wall is realized by the passive adaptation of the fuselage components, thus ensuring the safe operation of the moving mechanism. The specific mechanism is shown in Figure 4.

**Figure 4.** Pseudopodia flexible moving mechanism: (**a**) Moving mechanism structure, (**b**) obstacle crossing process, and (**c**) surface adaptation process.

A passive adaptive moving mechanism was designed in this paper to improve the adaptability of the robot to facades and ensure its safe and stable movement. Figure 4 shows that the moving mechanism comprises the wheel frames, support frame, and cam mechanism. The wheels on the left and right sides are connected with the wheel frames through the axes *A*<sub>1</sub> and *A*<sub>2</sub>, respectively, and can rotate about these axes. The cam mechanism is fixed on the front and rear sides of the support frame by springs. The elastic deformation of the springs pushes the cam to move, which drives the wheel frames to rotate around the axes *C*<sub>1</sub> and *C*<sub>2</sub>; thus, both wheels fit vertically to the wall. Hence, the robot can be safely adsorbed on walls of different curvatures. The driving motors adopt a diagonal arrangement and transfer power using a synchronous belt to ensure the driving torque and simplify the control. Figure 4b shows that the right wheel frame rotates around axis *A*<sub>1</sub> when a unilateral wheel of the robot encounters an obstacle, ensuring that each wheel remains reliably adsorbed on the wall surface. This avoids the first instability situation. Figure 4c shows that the cam mechanism is passively adjusted to drive the wheel frames on both sides to rotate around the axes *C*<sub>1</sub> and *C*<sub>2</sub> when the robot operates on an arc wall. Therefore, the magnetic wheels can closely contact the wall surface, which ensures stable and safe movement. Through this structure, the wheels on both sides of the robot can be flexibly adjusted with multiple DOFs to ensure that each wheel contacts closely with walls of different curvatures and meets the safety adsorption requirements.

#### *2.3. Detection Mechanism*

Nondestructive testing has always been highly recommended among the detection methods for petrochemical storage tanks. Ultrasonic and eddy current sensors are needed during maintenance to complete the thickness measurements of the wall and paint film. Different detection tasks require different technological processes, and the relative position between the detection device and the wall surface directly affects the detection effect and accuracy. Therefore, keeping the probe vertically aligned and close to the wall surface is necessary, but maintaining this alignment through active real-time control is difficult. Different from a traditional rigid detection mechanism, an underactuated passive adaptive detection mechanism was designed by integrating a Hooke hinge mechanism to meet the precise detection requirements of different walls. The vertical alignment of the probe is realized by the passive adaptation of the Hooke hinge mechanism, and the probe is pressed tightly to the wall surface by a spring to meet the detection requirements. The specific architecture is shown in Figure 5.

**Figure 5.** Flexible detection mechanism.

Figure 5 shows that the detection mechanism is fixed on the robot through the substrate. The Hooke hinge structure gives the detection mechanism three DOFs, which helps it stay perpendicular to various complex wall surfaces. A torsion spring is installed on the rotary shaft of the Hooke hinge mechanism and provides the torque that presses the probe on the wall surface to ensure the detection effect and improve detection accuracy. The Hooke hinge mechanism is connected with the DC motor rocker arm through a wire rope. The DC motor rotates to lower the probe in the detection state; conversely, it rotates in reverse to lift the probe away from the wall in the non-detection state. The detection mechanism can be manually fixed into an L-shape by the locating pin after the wall-climbing robot is removed from the wall surface. The holding mechanism in the Hooke hinge mechanism is used to fix the detection probe, and different detection modules can be conveniently replaced to complete different detection tasks. In addition, eight stainless steel beads are installed uniformly on the probe holding mechanism to convert sliding friction into rolling friction and avoid damage to the detected wall. The above structure ensures close contact between the end of the detection mechanism and the surface. Therefore, detection efficiency can be guaranteed.

#### **3. Mechanical Analysis**

The weight and adsorption force of the robot directly affect its safety and movement flexibility when it runs on different facades. The robot must meet the safety requirements of different working conditions and movement modes in the process of continuous detection to realize the full-domain detection of petrochemical storage tanks. Here, the critical failure states of the designed robot were analyzed through mechanical models under different working conditions to obtain the minimum adsorption force of the magnetic wheel and ensure the safe and stable movement of the wall-climbing robot on a facade. Dynamic models were also established for different motion modes, and the robot and each wheel were analyzed to achieve the optimal torque distribution and optimize the motion performance.

#### *3.1. Statics Analysis*

In the process of facade movement, a wall-climbing robot is prone to dangerous states, such as static sliding, vertical overturning, horizontal overturning, and oblique overturning. These states affect movement safety and flexibility. Thus, mechanics analysis on a robot must be conducted to determine the minimum adsorption force to ensure safe and stable movement. Here, a mechanical model was established for mechanics analysis, as shown in Figure 6.

**Figure 6.** Static model of robot: (**a**) Vertical state, (**b**) horizontal state, and (**c**) oblique state.

Force and moment balance equations were established for the above states based on classical mechanics theory. To simplify the calculation process, a safety factor *s* was introduced to compensate for relatively small adverse effects such as the cable weight and the severe environment. The following static model of the robot was obtained.

$$\begin{cases} \sum\limits\_{i=1}^{4} F\_{fi} = sG\\ \sum\limits\_{i=1}^{4} F\_{Ni} = 4F\_{Mag}\\ \sum\limits\_{i=1,3} (F\_{Ni} - F\_{Mag})l + sGh\_c = 0\\ \sum\limits\_{i=1,2} (F\_{Ni} - F\_{Mag})B + sGh\_c = 0 \end{cases} \tag{1}$$

The meanings of the letters in the formula are shown in Table 3:

**Table 3.** Parameters in the mechanical model.


The critical condition for the robot to be in a safe and stable state is that all magnetic wheels remain on the wall surface, that is, the support force and friction constraints are present and the maximum static friction is larger than the gravity component. Therefore, the value range of the magnetic wheel adsorption force can be obtained as *FMag* ≥ *sG*/(4μ).

Figure 6c shows that when the robot is inclined on the wall surface, it may flip around the AB or CD axis. Gravity (*G*) can be decomposed into *G* sin β and *G* cos β along the directions of AB and CD, respectively. Both *G* sin β and *G* cos β are less than *G*. Therefore, the critical value of the safe adsorption force in this state is less than the adsorption force thresholds calculated above for the vertical and horizontal states.

Therefore, the minimum adsorption force required by the robot was obtained to maintain static stability.
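As a worked example of the anti-slip condition *FMag* ≥ *sG*/(4μ), the numbers below are assumed for illustration (the robot mass, safety factor, and friction coefficient are not specified in this section):

```python
g = 9.8       # m/s^2
m = 20.0      # kg, assumed robot mass
s = 1.5       # assumed safety factor
mu = 0.6      # assumed rubber-on-steel friction coefficient

G = m * g
# Anti-slip condition: total friction 4*mu*F_Mag must resist s*G,
# so F_Mag >= s*G / (4*mu)
F_mag_min = s * G / (4 * mu)   # -> 122.5 N per wheel
```

With these assumptions, each of the four magnetic wheels must supply at least about 122.5 N, consistent with the roughly 120 N measured in Section 2.1.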

#### *3.2. Dynamics Analysis*

Dynamic analysis was conducted to obtain the optimal driving torque of each motor for the stable movement of the robot in different motion modes. The analysis of various motion modes revealed that the driving torque of other operation modes is less than or equal to that required for vertical upward straight or turning motion. Therefore, the dynamics analysis model of the robot was established in the two situations, and the best driving torque was obtained.

#### 3.2.1. Dynamic Analysis in the Vertical Upward Movement

Dynamic analysis is similar to static analysis when running vertically upward. However, the existence of acceleration and the difference in the friction coefficient should be considered. Assuming that each wheel performs pure rolling motion without sliding, the mechanical model was obtained, as shown in Figure 7.

**Figure 7.** Dynamics model of the robot: (**a**) Side view of the vertical upward state, (**b**) main view of the vertical upward state, and (**c**) force analysis diagram of each wheel.

Rolling resistance is generated by the deformation of the rubber layer of the wheel, but this deformation is small. Therefore, the rolling resistance moment can be ignored compared with the other torsional moments, and the rolling resistance itself can be disregarded compared with the other forces owing to the small rolling resistance coefficient. To enable the robot to overcome gravity and move stably, Formula (2) was established according to the principle of force balance, and the required motor torque was then solved.

$$2\frac{T\_t}{R\_w} - sG = \frac{sG}{g}a.\tag{2}$$

By simple derivation of Formulas (1) and (2), the required torque of the motor was calculated as follows:

$$T\_t \ge (\frac{1}{2} + \frac{a}{2g})sGR\_w. \tag{3}$$

The meanings of the letters in the formula are shown in Table 4:


**Table 4.** Parameters in the mechanical model.
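A worked evaluation of Formula (3) with assumed values (the mass, safety factor, acceleration, and wheel radius below are illustrative, not from this paper) gives a sense of the required motor torque:

```python
g = 9.8       # m/s^2
m = 20.0      # kg, assumed robot mass
s = 1.5       # assumed safety factor
a = 0.1       # m/s^2, assumed vertical acceleration
R_w = 0.06    # m, assumed wheel radius

G = m * g
# Formula (3): T_t >= (1/2 + a/(2g)) * s * G * R_w
T_min = (0.5 + a / (2 * g)) * s * G * R_w   # -> 8.91 N*m
```

Because a/(2g) is small for moderate accelerations, the torque is dominated by the gravity term (1/2)·*sG·Rw*, so the motor selection is driven mainly by the robot weight and wheel radius.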

3.2.2. Dynamic Analysis in Steering

In this research, the wall-climbing robot achieves steering through different speeds of the wheels on the two sides. The angular speed and steering radius are determined by the speeds and directions of the wheels on both sides, as shown in Figure 8.

**Figure 8.** Dynamic analysis in steering: (**a**) Large radius turning state, (**b**) small radius turning state, and (**c**) dynamic analysis of each wheel.

Figure 8 shows the relationship between the rotation speed of wheels on both sides and the turning radius:

$$\begin{cases} \; \omega = \frac{V\_l}{(R + B/2)} = \frac{V\_r}{(R - B/2)}\\ \; \qquad \alpha = \dot{\omega} \end{cases} \tag{4}$$

The turning radius formula of the robot can be easily obtained from the speeds of the wheels on both sides:

$$R = \frac{V\_l + V\_r}{V\_l - V\_r} \cdot \frac{B}{2}.\tag{5}$$

When *R* > 0.5*B*, the center of rotation is outside the robot (as shown in Figure 8a); when *R* < 0.5*B*, the center of rotation is inside the robot (as shown in Figure 8b). The condition *R* > 0.5*B* was taken as an example for force analysis, with the robot turning from the horizontal to the vertical direction on the wall as the analyzed model. The torque balance formula about the rotation center O, shown in Formula (6), was established to solve the required output torque of the motor. The mechanical model of the turning state was obtained as follows:

$$\frac{T\_t}{R\_w}(R + \frac{B}{2}) + \frac{T\_t}{R\_w}(R - \frac{B}{2}) - \sum\_{i=1}^{4} F\_{fi}\frac{l}{2} = J\alpha. \tag{6}$$

In combination with Formulas (4)–(6), Formula (7) could be obtained:

$$T\_t \ge \frac{(J\alpha + 2\mu F\_{\text{Mag}}l)R\_w}{2R}.\tag{7}$$
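The steering torque bound can likewise be evaluated numerically. The sketch below follows the torque balance of Formula (6), with four wheels each contributing a friction force of μ·F_Mag; all parameter values are illustrative assumptions, not the robot's measured data:

```python
def required_turn_torque(J, alpha, mu, F_mag, l, R_w, R):
    """Per-motor torque from the torque balance (6):
    (T_t/R_w)(R + B/2) + (T_t/R_w)(R - B/2) = J*alpha + 4*mu*F_mag*(l/2),
    i.e. T_t >= R_w * (J*alpha + 2*mu*F_mag*l) / (2*R)."""
    return R_w * (J * alpha + 2.0 * mu * F_mag * l) / (2.0 * R)

# Illustrative (assumed) values: inertia J = 0.5 kg*m^2, angular
# acceleration alpha = 0.2 rad/s^2, friction coefficient mu = 0.3,
# adsorption force F_mag = 400 N per wheel, wheelbase l = 0.3 m,
# wheel radius R_w = 0.06 m, turning radius R = 0.8 m.
T_t = required_turn_torque(J=0.5, alpha=0.2, mu=0.3, F_mag=400.0,
                           l=0.3, R_w=0.06, R=0.8)
print(f"required steering torque: {T_t:.3f} N*m")
```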

The dynamics of the two motion modes of the robot, vertical upward motion and turning motion, were analyzed, and the equations were solved to find the motor torque range suitable for the stable operation of the robot.

#### **4. Control System**

This chapter introduces the hierarchical control system, which is built around an industrial personal computer (IPC) as the upper computer. The IPC plans the whole detection process and completes detection tasks through the hierarchical control of each functional module. The control system mainly includes the precise motion control system of the flexible moving mechanism and the active adjustment control system of the flexible detection mechanism. Given the severe environment, wired remote control is used to ensure the stability and accuracy of interactive information transmission. In addition to moving flexibly on the wall, the detection robot needs to operate different detection modules for different tasks: it measures the wall and paint film thicknesses using ultrasonic and eddy current probes, respectively. RS485 communication is adopted for the high-speed transmission of real-time detection information to realize multimodule, multimode coordinated detection, and the detection workflow is implemented by a distributed control system that monitors the close cooperation between the components. The specific control system structure is shown in Figure 9.

**Figure 9.** Hardware composition of the control system.

The robot is controlled remotely by external input devices, such as the buttons on the control box. Control instructions are transmitted through RS485 to the motors on both sides of the moving mechanism. The motion pattern of the robot can be changed by adjusting the rotation speed and direction of the wheels on both sides. Simultaneously, inertial navigation information is used to adjust the running state of the motors to facilitate accurate movement. The distance between the detection device and the wall surface can be actively adjusted by controlling the motor in the flexible detection mechanism, while the passive adaptability of the mechanism ensures that the probe remains perpendicular to the wall. Data obtained by the probes are transmitted back to the main control unit in real time via RS485 and displayed digitally. During wall thickness measurement with the ultrasonic probe, an air compressor and a diaphragm pump may be required to spray coupling fluid near the probe to assist the robot in completing the detection.

The control system of a wall-climbing robot is the key to motion and detection. The control can be divided into three parts: initialization, movement control, and detection control. The movement control realizes the flexible movement of the robot on the wall based on the feedback of the inertial sensor and the active remote control of the user. The detection control is allocated in accordance with the different detection modules, which can be called for specific requirements. For example, measuring the position of marking points is necessary when the robot conducts the eddy current detection of paint film thickness, while continuous monitoring and coordination of the coupling liquid are required when the robot conducts the ultrasonic detection of wall thickness. The specific control flow chart is shown in Figure 10.

**Figure 10.** Robot detection workflow. (**A**) in the figure represents film thickness detection. (**B**) in the figure represents wall thickness detection.

The detection process can be analyzed as follows from the flow in Figure 10.

	- A: Film thickness detection:
		- (1) The motors are driven to help the robot reach the initial detection point.
		- (2) After the robot reaches the predetermined position, the DC motor in the flexible detection mechanism rotates forward to lower the detection probe. The probe cooperates with its passive adaptation mechanism to realize the vertical alignment of the probe.
		- (3) After the sensor in the flexible detection mechanism confirms the detection position, the probe collects the paint film thickness information and transmits it back to the main control unit.
		- (4) The DC motor reverses to lift the probe, and the robot completes the current position detection.
		- (5) The main control unit checks for a termination signal: if no termination signal is present, the robot moves to the next detection point and repeats steps 2–5; if the termination signal is received, the detection task is stopped.
	- B: Wall thickness detection:
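The film-thickness workflow (steps 1–5 of A) can be sketched as a simple control loop. The hardware calls below are stand-in stubs for illustration, not the robot's actual API:

```python
# A minimal sketch of the film-thickness workflow (steps 1-5 of A).
# All hardware interfaces are hypothetical callables passed in as stubs.

def film_thickness_cycle(points, move_to, lower_probe, read_probe,
                         lift_probe, stop_requested):
    readings = []
    for p in points:
        move_to(p)                     # (1) drive to the detection point
        lower_probe()                  # (2) DC motor lowers the probe;
                                       #     the Hooke hinge aligns it
        readings.append(read_probe())  # (3) sample the film thickness
        lift_probe()                   # (4) reverse the motor, lift probe
        if stop_requested():           # (5) check for termination signal
            break
    return readings

# Dry run with trivial stubs (no hardware):
vals = film_thickness_cycle([0, 50, 100], lambda p: None, lambda: None,
                            lambda: 80.0, lambda: None, lambda: False)
print(vals)
```

The same loop structure would presumably apply to workflow B, with coupling-fluid spraying coordinated before each ultrasonic reading.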

#### **5. Experiment**

The authors of this paper designed a flexible and adaptive wall-climbing robot for film and wall thickness detection on curved wall surfaces by combining the magnetic wheel, flexible moving mechanism, and multi-DOF detection unit mentioned above. The key technical parameters of the robot are shown in Table 5.


**Table 5.** Key technical parameters of the robot.

This chapter discusses the movement and detection performance tests of the wall-climbing robot, which were conducted to verify the rationality and feasibility of the above-mentioned structure and control system, as well as the correctness of the mechanical theoretical analysis. The experimental facility mainly comprised the robot system and a cylindrical facade. Figure 11 shows a vertical circular steel plate, with a radius of 8 m and a wall thickness of 5 mm, which was used to simulate the tank environment. The detection robot system included the robot body, the control box, and auxiliary equipment. The air compressor and diaphragm pump of the auxiliary equipment provided coupling fluid for ultrasonic thickness detection. Operators controlled the movement and detection of the robot through the control cabinet to complete the detection task on the wall surface.

**Figure 11.** Experimental scene.

#### *5.1. Movement Performance Test*

Experiments on the vertical arc steel plate were conducted to test the stable adsorption and flexible, precise movement capability of the robot. The performance of the robot was analyzed by monitoring the rotation speed of each wheel and the change in the position of the robot's center of mass during its vertical upward and horizontal circumferential movement on the arc plate. The wheel speed information was obtained from the encoders of the wheel motors, and the position information of the robot was obtained by the inertial navigation module mounted on the moving mechanism. The details are shown in Figure 12.

**Figure 12.** Wall motion experiments: (**a**) Vertical upward climbing motion test and (**b**) horizontal circumferential motion test.

The robot moved safely and stably without slipping, falling, or other instabilities during the experiment, indicating good adsorption performance and adaptive capability. A synchronous belt drives the wheels on each side; therefore, the four-wheel robot could be simplified to a two-wheel (left and right) form. The data in Figure 12a reveal that the rotation speed of the wheels on both sides during the vertical upward movement remained at 0.09 m/s despite slight fluctuations, and the position of the mass center did not deviate significantly in the horizontal direction. The data in Figure 12b show that the rotation speed of the wheels on both sides remained at 0.15 m/s during the horizontal circular motion of the robot, and the center of mass did not deviate significantly in the vertical direction. These experimental data prove that the robot moved steadily and accurately on the circular steel plate.

A turn right movement experiment was conducted on the vertical circular arc wall to verify the movement flexibility. The robot was controlled to move from vertically upward to horizontally to the right, and the wheel rotation speed and the change of the mass center were detected in this process. The steel plate used in the experiment was expanded along the perimeter to intuitively understand the motion state, and the change curve of the mass center was drawn. The details are shown in Figure 13.

**Figure 13.** Turn motion test from vertical to horizontal.

The robot did not lose its stability and completed the right turning movement from the vertical direction to the horizontal direction in the above experiment. The left and right wheel rotation speeds were, respectively, set at 0.15 m/s and 0.09 m/s; therefore, the expected theoretical turning radius was 0.8 m. The figure above reveals that the rotation speed of the wheels on both sides slightly fluctuated up and down around the theoretical value. The adjusted trajectories show that the robot completed the turn, albeit with some deviation due to gravity.

#### *5.2. Detection Accuracy Test*

The wall-climbing detection robot could carry different detection equipment to detect a wall surface. Measurements of the paint film and wall thicknesses were taken as examples to conduct experiments to verify its detection capability. First, the test of film thickness was conducted. Mark points were set every 50 mm on the steel plate as the detection target points, and a handheld instrument was used to collect the detection information as the standard value. Then, the experiment data were automatically detected and recorded after setting the detection mode and the advance distance of the robot. The specific experimental process and two groups of test data are shown in Figure 14.

**Figure 14.** Measurement experiment of film thickness.

The robot could move accurately and complete the corresponding detection process during the experiment, thus finally achieving the measurement and data recording of paint film thickness. A comparison of the two above-mentioned sets of data revealed that the thickness of the paint film automatically detected by the robot was 80 µm, which was close to that detected manually. Allowable error bands (±0.5 µm) were set in accordance with industrial testing requirements after consulting the relevant testing manuals, and all the measured values at the marking points were found to be within the allowable error band. The maximum error was 0.3 µm, appearing at the eighth marker, which also met the detection requirements.
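The error-band check described above can be expressed as a short sketch; the sample readings below are illustrative values, not the paper's experimental data:

```python
def within_tolerance(robot, manual, tol=0.5):
    """Check that every robot reading is within +/-tol (um) of the
    handheld (manual) reference; return the pass flag and the errors."""
    errors = [r - m for r, m in zip(robot, manual)]
    return max(abs(e) for e in errors) <= tol, errors

# Illustrative film-thickness readings in um (assumed, not measured data):
manual = [80.1, 80.0, 79.9, 80.2]
robot = [80.3, 79.8, 80.1, 80.0]
ok, errors = within_tolerance(robot, manual)
print(ok, [round(e, 2) for e in errors])
```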

An auxiliary device was necessary for the thickness detection of steel plates to provide the coupling liquid for the detection robot. The thickness of the steel plate was measured at the continuously changing splicing steel plate, and experiments of manual measurement and automatic continuous measurement were also conducted. The specific experimental process and two groups of test data are shown in Figure 15.

**Figure 15.** Measurement experiment of wall thickness.

During the experiment, the robot could effectively complete the detection process and conduct automatic detection continuously. Similarly, allowable error bands (±0.2 mm) that met the requirements of industrial testing were set for wall thickness measurement. Figure 15 intuitively shows that the robot could continuously detect the thickness of the steel plate, and its thickness changed from 5 to 8 to 5 mm, which was similar to the manual measurement result and met the detection requirements. The two kinds of test data considerably fluctuated at the welding seam due to the influence of welding quality and position deviation of measurement points. Thus, the detection accuracy problem at the welding seam was temporarily disregarded. Stable data could be collected at other locations, and the test results met the test requirements. The maximum error of measurement was +0.2 mm, which was also within the required error range and met the detection requirements.

The above experiments revealed that the designed wall-climbing robot could adapt to the curved wall and move safely, smoothly, and flexibly on it. Vertical alignment detection could be realized by carrying ultrasonic and eddy current probes and by cooperating with the passive adaptation of the multi-DOF flexible detection mechanism, and the detection tasks of wall and paint film thicknesses could be effectively completed. The experimental results showed that the robot could complete accurate and stable wall detection tasks and realize the automatic surface detection of petrochemical storage tanks.

#### **6. Discussion and Conclusions**

A wall-climbing detection robot that can adapt to tanks with different radii of curvature was designed to address the increasing maintenance and testing requirements of petrochemical storage tanks. The robot realizes the non-destructive detection of the wall surface and its safe operation through human remote and automatic controls. Different from traditional adsorption mechanisms, a fan-shaped permanent magnet with an added yoke to concentrate the magnetic induction lines is used as the excitation source in this robot. This adsorption mechanism reduces the weight of the magnetic wheel, improves the utilization rate of magnetic energy, and ensures reliable adsorption. To solve the problem that existing adsorption devices are difficult to detach from a wall after completing detection, an innovative fast demagnetization mechanism was designed using the lever principle. Considering that traditional rigid moving mechanisms are difficult to adapt to different tank wall environments (different curvatures and various obstacles), a flexible adaptive moving mechanism with multiple DOFs was innovatively designed. The multi-DOF flexible deformation of the moving mechanism can adapt to a wall surface, which ensures a close fit between the magnetic wheels and the wall. A flexible detection mechanism designed on the basis of the Hooke hinge can quickly change detection equipment to meet the technical requirements of film and wall thickness detection. Through the passive adaptation of the multi-DOF Hooke hinge mechanism, the detection probe can always remain perpendicular and close to the wall surface, thus meeting the requirements of accurate detection. Considering various working conditions, the minimum adsorption force and the optimal driving torque ranges for straight-line and turning motion were calculated by establishing mechanical models, which ensured the flexible and stable movement of the robot on an arc facade. Finally, the precise coordinated control of each component is performed through wired control to complete the detection task while considering the limitations of the severe environment.

The wall-climbing detection robot was found to be able to move stably on a facade through experiments conducted on a curved steel plate with a thickness of 5 mm and a radius of 8 m, which verified the adsorption capacity of the magnetic wheels. The robot could complete large-radius and in-situ turning movements, which verified the wall surface adaptability of the robot's moving mechanism. Through multisensor information fusion and multicomponent cooperation, the robot could complete the detection tasks of wall and paint film thickness using ultrasonic and eddy current sensors, respectively, and the detection results confirmed these findings. The experiments proved that the robot can complete automatic wall detection tasks for petrochemical storage tanks.

**Author Contributions:** M.Z. provided supervision, formal analysis, and resources to the project. X.Z. designed all the experiments and subsequently drafted the manuscript. M.L. conceived the original ideas. J.C. contributed to the construction of the experiment platform. Z.H. conducted all the experiments and provided human resources. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the National Natural Science Foundation of China (Grant Nos. 61803142, 61733001, and U1913211), the Natural Science Foundation of Hebei Province (Grant Nos. F2018202210 and E2018202338), Hebei Science and Technology Agency Science and Technology Innovation Strategy Funding Project (Grant No. 20180603).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **Soft-Tentacle Gripper for Pipe Crawling to Inspect Industrial Facilities Using UAVs**

**F. Javier Garcia Rubiales, Pablo Ramon Soria, Begoña C. Arrue \* and Anibal Ollero**

GRVC Robotics Laboratory, University of Seville, Avenida de los Descubrimientos, S/N, 41092 Seville, Spain; fragarrub@alum.us.es (F.J.G.R.); prs@us.es (P.R.S.); aollero@us.es (A.O.) **\*** Correspondence: barrue@us.es

**Abstract:** This paper presents a crawling mechanism using a soft-tentacle gripper integrated into an unmanned aerial vehicle for pipe inspection in industrial environments. The objective was to allow the aerial robot to perch and crawl along the pipe, minimizing energy consumption and allowing it to perform contact inspection. This paper introduces the design of the soft limbs of the gripper and also the internal mechanism that allows movement along pipes. Several tests were carried out to verify the grasping capability on the pipe and the performance and reliability of the developed system. This paper shows the complete development of the system using additive manufacturing techniques and includes the results of experiments performed in realistic environments.

**Keywords:** UAVs; inspection; soft robotics

#### **1. Introduction**

The use of unmanned aerial vehicles (UAVs) has grown exponentially during the last decade. This growth has been associated with technological improvements, such as those in navigation systems and perception sensors.

Nowadays, there is increasing interest in the use of UAVs for inspection and maintenance. At the moment, most inspection and maintenance tasks are carried out manually, which exposes the operators to many dangerous situations. This paper focuses on facilities with large numbers of tubes and pipes that require inspection, such as those found in the oil and gas sector.

In oil and gas production plants, some components degrade over time. Excessive corrosion of pipelines can lead to accidents and catastrophic failures, harm the environment, and affect plant availability. To prevent this, inspection processes such as wall thickness measurements are performed to ensure that plants are in a safe operating condition, or to trigger corrective actions if needed. These activities are performed manually by operators. The main problem is that the structures to be inspected are in elevated locations, at high temperatures, or near toxic materials. Ensuring the safety of inspection personnel therefore comes at a considerable cost and can require production outages.

By using UAVs, the operators are capable of inspecting inaccessible or dangerous zones without facing any risk. Furthermore, embedding sensors and cameras on the UAV allows them to perform more complex inspections. However, these operations using UAVs are still performed by manual control. The future of these applications relies on current research into the automation of these aerial systems.

For the accurate contact inspection of pipes with drones, landing gear is beneficial because it allows static contact to enable the UAV to perform measurements by coupling to pipes without causing any damage. Moreover, we are proposing a system that should also allow the robot to crawl along the pipeline. The soft gripper that is proposed in this paper is capable of having the necessary strength to hold onto the pipes, and move along them without causing damage.

**Citation:** F. J. García-Rubiales, P. Ramon-Soria, B. C. Arrue and A. Ollero Soft-Tentacle Gripper for Pipe Crawling to Inspect Industrial Facilities Using UAVs. *Sensors* **2021**, *21*, 4142. https://doi.org/10.3390/ s21124142

Academic Editors: Yashar Javadi and Carmelo Mineo

Received: 13 May 2021 Accepted: 10 June 2021 Published: 16 June 2021


**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

In addition to saving energy compared to UAVs that can only fly, the hybrid (flying and crawling) locomotion system presented in this paper does not require the ability to land accurately on the inspection point, because it can crawl to the desired position after landing. The proposed system also has benefits over conventional crawlers, since its flying capability allows it to access places that would pose challenges for a human operator.

The idea is to use soft materials for the landing gear attached to the UAV, as this is a safer alternative to other methods used in the state of the art. The use of soft materials in this area is a novelty when integrated into aerial robots. The challenge is the difficulty of designing a lightweight gripper that is at the same time compact, energy-efficient, and reliable.

Soft materials increase the adaptability of the holding system, while ensuring lower damage to the structures. This is an ideal solution for typical pipe inspection tasks in industrial facilities.

The ultimate goal is to have a system capable of crawling along pipes and inspecting them with ultrasonic sensors to perform non-destructive testing (NDT). This kind of solution is very attractive for the industry and related service providers, as it can save costs and time and prevent undesirable accidents.

The rest of this paper is divided into five parts. The second section reviews previous work. The third section describes the soft landing gear system, including the design, the manufacturing process, and the operation flow of the landing gear and the soft limbs. The fourth section discusses the validation of the proposed system and the experiments carried out to validate its functioning. In the fifth section, the flight tests with the final setup are described and evaluated. Finally, our conclusions are drawn in the sixth section.

#### **2. Related Work**

Soft devices are currently being used in many areas of robotics because they provide advantages that the more traditional systems do not have, such as adaptability, compliance, better interaction with the environment and multi-functional end-effectors. Some examples of bio-inspired systems that can be used to interact with people are lightweight compliant arms with soft muscles that are pneumatically activated [1] and a pneumatic actuator can also be used to try to imitate the movements of a fish [2]. There are also soft grippers with muscles that are pneumatically activated to work in industrial environments interacting with humans [3].

There are many examples of materials and technologies, such as dielectric elastomer actuator (DEA), silicone-based elastomers, 3D-printed flexible actuators, or pneumatic actuators [4–7]. The main limitations associated with these soft-based actuators tend to be related to the complexity of the manufacturing process.

Other authors' examples are the high-contraction ratio pneumatic artificial muscle (HCRPAM) [8,9], prosthesis and grippers for manipulation [10], robots with elastomer actuators [11] and horticultural manipulation applications [12]. However, the main disadvantages of these actuators are the weight and space required by the pneumatic systems, which have motors and compressors with relatively high dimensions and weight.

Soft robotics is starting to be used in UAVs with the aim of developing systems capable of manipulating delicate objects and interacting with people while flying. The use of soft materials is explored in [13] to make a UAV collision-resilient and increase its robustness. A special folding mechanism was investigated in [14]. DEA artificial muscles have also been used to imitate flapping wings and insects [15], and a flexible membrane based on origami folding has been used to preserve structural integrity during collisions [16].

The soft and compliant nature of the actuators ensures that soft robots are able to provide a safe interaction between the system and the facility to be inspected. Despite being lightweight structures, they are capable of achieving a high degree of freedom and a high force-to-weight ratio.

A large variety of robots have been designed to inspect pipes internally [17,18]. In fact, the most popular method of inspection is intelligent pigging [19,20], which makes use of devices equipped with sensors that navigate inside the pipes carried by the pipeline's fluid.

Robots have also been used to inspect pipes externally, usually called crawlers [21]. These robots commonly use magnetic wheels or tracks to move along the surface of pipes. This is very useful because the crawler can go underneath the pipe to make measurements, detecting possible leaks or corrosion. They are able to move freely over smooth surfaces, but in general, they cannot overcome obstacles and are limited to magnetic metal pipes.

The main locomotion alternative to these magnetic crawlers consists of an annular structure equipped with wheels, which obtain their adherence from a vacuum sucker.

UAVs can reach inaccessible areas faster than human operators or crawler robots. However, their flight endurance is very limited, and these robots are mainly used nowadays to perform visual inspections on structures by using different types of cameras (color, stereo, infrared), lasers, or other sensors. Authors in [22–24] proposed the use of UAVs to detect gas leaks and to monitor and map pipes. However, these solutions only allow for the visualization of surface damages.

Most recent research focused on the development of aerial robots that are able to not only perceive but also interact with the environment. This can be achieved by using robotic arms with several degrees of freedom (DOFs) that are attached to the UAV to interact and perform contact inspections [25,26]. These robots enable a new kind of application in which robots will be able to not only inspect but also perform the maintenance tasks at the industrial facilities [27]. These types of systems are called aerial robotic manipulators or aerial manipulators (AMs).

Regarding previous perching mechanisms for UAVs, in [28], a single soft gripper was embedded at the bottom of a UAV to perch on pipes for inspection and maintenance tasks. Similar approaches were taken in the rigid landing gripper of [29], the semi-soft perching system developed in [30], and the bio-inspired UAV with a soft landing gear used in [31] to land. However, these designs did not tackle the problem of moving along the pipe. The system presented in this paper allows crawling over pipes to inspect them, saving flight time.

#### **3. Soft Landing Gear**

In this section, the design of the landing gear and its characteristics are described. The section is split into two parts: the first focuses on the description of the forward-motion mechanism, while the second focuses on the design and mechanical properties of the soft limbs.

The complete landing-gear mechanism has been designed to allow the robot to crawl along the pipe. It is a compact and functional design that, thanks to the flexibility of the limbs, can be used on a variety of pipes. Figure 1 shows the complete CAD design.

The gripper is manufactured from three different materials: TPU, PLA, and Ecoflex. TPU (thermoplastic polyurethane) is a linear elastomeric polymer that can be used for 3D printing; its greatest qualities are its flexibility and durability. Polylactic acid (PLA) is a common plastic material in 3D printing. Finally, Ecoflex is a platinum-cure silicone rubber compound whose two components are mixed in equal parts to obtain a smooth, moldable silicone that can take any shape and greatly increases adhesion.

**Figure 1.** CAD design of the complete system.

#### *3.1. Forward-Motion Mechanism*

The forward-motion mechanism is a rigid core designed to be manufactured in PLA (as mentioned earlier). Additionally, all of the electronics are placed in this part because the rigid casing is more robust to possible impacts.

It has been designed to be flat on the upper side to make it easier to attach to different drones. This attachment is done with an embedded electromagnet. This special device is able to switch the magnetic field so that it can be enabled or disabled with a pulse. This can be used, for example, in situations where gas is detected or in emergency cases. This functionality is further explained in our previous work [32].

The mechanism is composed of two pieces: a fixed part that is attached to the drone and a mobile part that generates the forward movement. These two parts are symmetrical. The objective is to make a compact and robust design with the lowest possible weight so that it can be easily transported by the UAV, while the center of gravity does not significantly change when the UAV is moving.

Figure 2 shows the configuration of all the components. Three servomotors are responsible for all of the movements. Two of the servos are used to fold the soft limbs, which will be described later in Section 3.2. The other servo actuates as an endless screw that produces the forward movement. To restrict the torsion of the endless screw, two linear guides are located at the extremes of the rigid parts, which are attached with two 8 mm bars.

The lower part of the landing gear is a half-cylindrical section for the better adaptation to the pipes where it lands. The reference pipe size for this circular section is 160 mm, but the system can work in a diameter pipe range between 100 and 300 mm. On the sides, the system has two flaps protruding from the structure to attach the soft limbs.

**Figure 2.** CAD view of the forward-motion system. The arrangement of the three servos, the worm gear, the linear bearings at the ends of the case and the guides to contracting the soft limbs can been seen.

#### *3.2. Soft Limb Design*

This section presents the design of the soft limbs. The points of interest of the soft-tentacle gripper are its capabilities of adapting to pipes of any diameter and absorbing the impact on the pipe while landing.

Each limb is made out of TPU. This rubber-like material provides the gripper with enough rigidity to retain its shape and maintain the exerting forces, but also enough elasticity to bend and adapt to different pipe shapes.

Another benefit of this material is that it can be used by a 3D printer, making it possible to easily iterate and develop different limb shapes. Finally, the tip of the limb is coated with silicone to increase the grip force of the system and obtain a softer contact with the pipe.

The main property of the gripper is that it is intrinsically compliant, allowing it to easily adapt to variations on the diameter of the pipe. Furthermore, weight is a crucial variable in aerial systems, where any additional payload means less flight time and it also affects the maneuverability of the UAV. Therefore, the landing system to be 3D printed has been designed to keep the weight to a minimum. Finally, the soft approach also offers more safety in the case of crashes.

The shape of the limbs was selected after various tests and simulations, observing the limb's deformation and finding the shape that best adapts to the pipe. In these studies, the properties of the TPU material were analyzed. An example simulation of the deformation of the limb is shown in Figure 3.

**Figure 3.** The left-hand image shows a study of the deformation for non-linear materials and the right-hand image shows a study of the stress.

To obtain a good grip and the widest diameter range, the stiffness of the joints had to be taken into account. The stiffness of each joint is $j_i = EI/L$, where $E$ is the Young's modulus (100 N/mm² for the TPU used), $I$ is the cross-sectional moment of inertia, and $L$ is the length between the nylon thread and the flexible segment. Figure 4 shows the final shape of the limb.
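The stiffness formula can be illustrated numerically. The sketch below (not the authors' code) assumes a rectangular flexure cross-section and example dimensions; only the Young's modulus comes from the text:

```python
# Illustrative sketch: joint stiffness j_i = E*I/L for a rectangular TPU
# flexure. Cross-section dimensions and segment lengths are assumed
# example values; only E is taken from the text.

E = 100.0          # Young's modulus of TPU [N/mm^2], from the text
b, h = 20.0, 2.0   # assumed flexure width and thickness [mm]
I = b * h**3 / 12  # cross-sectional moment of inertia [mm^4]

# Segment lengths L decrease from the base (near the UAV) to the tip,
# so stiffness increases along the limb: j1 < j2 < j3 < j4.
lengths = [12.0, 10.0, 8.0, 6.0]  # assumed [mm]
stiffness = [E * I / L for L in lengths]

for i, j in enumerate(stiffness, start=1):
    print(f"j{i} = {j:.1f} N*mm")
```

Shortening $L$ toward the tip is what produces the increasing stiffness distribution discussed below.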

**Figure 4.** The image shows the final printed limb and the different angles chosen in its design.

The first tests were made with $j_1 = j_2 = j_3 = j_4$, where $j_1$ is the joint closest to the UAV and $j_4$ is the joint farthest away. This means that all lengths $L$ were equal and, therefore, the same stiffness was generated at each joint of the soft limb. With this configuration, the limb started to bend first at the tip, which meant that it was not adjusting properly to the pipe, as shown in the left-hand example of Figure 5.

**Figure 5.** Example of a bad fold of the soft limb, and a good fold when changing the $j_i$ parameters.

After this first attempt, the lengths $L_i$ were changed, making them longer in the joints closer to the UAV and reducing them progressively toward the tip; i.e., $j_1 < j_2 < j_3 < j_4$. With this approach, the tentacles bend better toward the shape of the pipe, increasing the grip of the system. The resulting stiffness distribution generates the fold pattern shown in the right-hand example of Figure 5.

The following equation describes the recursive formula for the joint's stiffness [33]. The equations in (1) indicate the curvature generated by the limb joints:

$$\begin{aligned} j_n &= \frac{1}{\theta_n}\left(F_t\left(m_n g\left(l_{cn}\cos\sum_{n=1}^{N}\theta_n - \frac{l_l}{2}\sin\sum_{n=1}^{N}\theta_n\right) + l_{an}\sin\left(\frac{\theta_n}{2}\right) + h\right)\right) \\ R_{xn} &= F_t - m_n g \sin\sum_{n=1}^{N}\theta_n \\ R_{yn} &= F_t\frac{\theta_n}{2} - m_n g \cos\sum_{n=1}^{N}\theta_n \end{aligned} \tag{1}$$

These equations have been used to determine the joint deflections required to grip a circular cross-section of a particular radius $R$, formed with angles $\theta_i = \alpha$. Figure 6 shows an example of how the limbs adapt to the circular section of a pipe, together with the parameters used in the calculation. This design radius is considered optimal; nevertheless, due to the softness of the framework, the gripper can envelop pipes of larger and smaller radii. The equations can be solved for $\theta_i$, giving:

$$\begin{cases} \cos(\alpha) = \frac{R + H}{R + H + S}\\ \tan(\alpha) = \frac{a + L_{i-1}/2}{R + H} \\ \cos(\theta_i) = \frac{L_i}{2a} \\ \tan(\theta_i) = \frac{2\delta}{L_i} \end{cases} \tag{2}$$

$$\theta\_i = \tan^{-1}(\frac{2(L\_i^2 + L\_{i-1} + L\_i)(R + H)}{L\_i(4(R + H)^2 - L\_i^2)})\tag{3}$$

Solving Equation (2) yields $\delta$; substituting $\delta$ into $\tan(\theta_i)$ gives Equation (3).
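The wrapping geometry can be checked numerically with basic chord geometry (an illustrative sketch, not the paper's exact Equations (2) and (3)): links of length $L$ lying as chords on a circle of radius $R + H$ each subtend a central angle of $2\arcsin(L/(2(R+H)))$, which approximates the per-joint deflection. The values of $H$ and $L$ below are assumed examples:

```python
import math

# Illustrative sketch of the wrapping geometry (not the paper's exact
# Equations (2)-(3)): links of length L lying as chords on a circle of
# radius R + H subtend a central angle theta = 2*asin(L / (2*(R + H))).
# H and L below are assumed example values in mm.

R = 80.0   # pipe radius for the 160 mm reference pipe [mm]
H = 5.0    # assumed limb height above the pipe surface [mm]
L = 30.0   # assumed link length [mm]

theta = 2 * math.asin(L / (2 * (R + H)))   # per-joint deflection [rad]
n_links = (2 * math.pi) / theta            # links needed for a full wrap

print(f"theta = {math.degrees(theta):.1f} deg, full wrap ~ {n_links:.1f} links")
```

Larger pipes (larger $R$) give smaller per-joint deflections, which is consistent with the limbs opening out on bigger diameters.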

**Figure 6.** Model used to determine limb deflection in a pipe. H is the height of the limb, L is the length of the link, R is the radius of the pipe, *θ* is the deflection of the joint, and *α* is the angle between the contact points of two consecutive links.

Finally, knowing the angle $\theta_i = \alpha$ for each joint, we can obtain the bending angle $\beta_i$ from the following equations:

$$\begin{cases} S = \tan(\alpha)R\\ \beta_i + 2\psi_i = 180\\ \cos(\psi_i) = \frac{S/2}{Z} \\ \sin(\beta_i/2) = \frac{S/2}{Z} \end{cases} \tag{4}$$

Figure 7 shows all the parameters used to solve the system of equations in (4):

**Figure 7.** Model used to determine the $\beta$ angle.

The conclusion was that, to obtain the best possible fit of the limbs to the pipe, the limbs should undergo a progressive deformation from the base to the tip. Table 1 shows the angles obtained for the selected radius $R$, the tendon tension $F_t$, and the dimensions of the limbs.


#### *3.3. Complete Locomotion System*

This subsection explains the components that make up the locomotion system and how they work. The locomotion system consists of three servomotors, which are embedded into the frame of the soft landing gear (as explained earlier).

Two of the servomotors are used to bend the soft limbs. The third performs the linear displacement of the landing gear by means of an endless screw. Figure 8 shows the sequence of movements made by the soft landing gear.

The first step in selecting the servomotors is to calculate the required minimum torque. The complete system weighs 3.25 kg, so each of the four limbs must exert ∼0.81 kg at its tip. Each motor drives two limbs; therefore, each servo should exert at least 1.625 kg. Under this assumption, the gripper can hold the complete weight of the UAV in the worst-case scenario. Nevertheless, in most situations the required strength will be lower, as part of the weight is carried by the pipe and the gripper only needs to prevent the UAV from slipping laterally.
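The sizing arithmetic above can be written out as a short check, using only the figures from the text:

```python
# Servo-sizing check based on the figures in the text. The worst case
# assumes the gripper alone holds the full system weight.

total_mass_kg = 3.25   # complete system mass, from the text
n_limbs = 4
limbs_per_servo = 2

force_per_limb = total_mass_kg / n_limbs            # ~0.81 kg at each tip
force_per_servo = force_per_limb * limbs_per_servo  # ~1.625 kg per servo

print(f"per limb: {force_per_limb:.4f} kg, per servo: {force_per_servo:.3f} kg")
```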

As shown in Figure 8, the motion runs as follows: two servomotors are hooked in pairs to the soft limbs using nylon threads, which allows them to open and close the limbs depending on the direction of rotation of the motor. The third servomotor is responsible for the forward movement, using an endless screw connected to the motor at one end and to a nut at the opposite end. When the motor turns in one direction, the soft landing gear moves.

**Figure 8.** Example of the movement sequence of the soft landing gear: (in **stage one**), it grips the pipe; (in **stage two**), it opens the front limbs (blue case); (in **stage three**), it moves forward; (in **stage four**), it closes the front limbs (blue case); (in **stage five**), it opens the rear limbs (black case); and finally, (in **stage six**), it moves the rear part.

#### *3.4. Soft Limb Manufacturing Process and Assembly*

This subsection describes how the limbs installed on the landing gear were manufactured. Once the design had been completed and fulfilled the specifications, the manufacturing process of the limbs began. One of the most important challenges is manufacturing a soft part on a 3D printer while keeping the process accurate and easy to replicate.

All of the designs are made with TPU and a pair of PLA stiffeners, all produced on a 3D printer. The limbs were printed with different infills and printing patterns and tested in the complete setup until the most suitable one was selected. A 10 percent infill was very flexible and did not maintain the desired curvature when pulled by the nylon; a 20 percent infill was very stiff and hardly allowed the limb to bend under tension. A limb with a 15 percent infill and a square printing pattern was chosen as the optimal case.

Once the printing of the TPU limb is finished, a PLA stiffener must be added to the tip to ensure that it does not deform; the stiffener prevents loss of strength at the tip when the nylon threads are contracted. After this, the nylon strands that exert the force to deform the limb into the circular shape of the pipe are added. These nylon threads are attached to the corresponding servomotor of the locomotion system and hooked at the end of the limb with the help of the PLA stiffeners.

Finally, Ecoflex is added. This elastomer is very flexible and rough, and it is applied to the tips of the limbs to improve adhesion to the pipe. To bond the Ecoflex to the TPU, PLA molds with the shape of the limb tip were created and filled with Ecoflex. The soft limb was then introduced into the molds, depositing the silicone on the TPU. The Ecoflex was cured for eight hours to develop its physical qualities and was then demolded. Figure 9 shows the final result of the operation.

**Figure 9.** The final result when joining TPU with Ecoflex. Ecoflex is applied only to the tip to increase the adhesion in this area.

With all the parts manufactured, everything is assembled as follows. There are two parts. The first is the fixed part, where two motors are located: one is responsible for closing and opening the limbs, and the other drives the worm screw. This part has two holes at the top where the couplers are inserted; these are joined with metal bars. The second is the mobile part, which contains only the motor that drives its limbs. It also has upper holes where the linear bearings are inserted, so that it can slide along the metal bars, and a nut at the front where the endless screw is connected.

The last step in the construction of the soft landing gear is to join these two parts with the worm screw and the metal bars. The metal bars give consistency and carry the weight of the landing gear so that the worm screw is not exposed to too much stress. In addition, the motors in charge of opening and closing the limbs each have a reel and a bearing fixed to them. The reel winds up the nylon threads, which pass along the limb and stay attached to the stiffeners.

#### **4. Soft Landing Gear Validation**

This section presents the validation tests of the mechanical behavior of the soft landing gear. Experiments have been carried out to measure the force exerted by the limbs on different pipe sizes. The deformation when the limbs close over the pipe and the maximum lateral slope that the system can withstand without falling are also studied.

It should be noted that the results in this section demonstrate the reliability of the design, as well as the actual capabilities of the gripper. These results can be extrapolated for manufacturing other customized landing gears for different pipe sizes and other payload requirements.

#### *4.1. Pull Force*

In this section, a comparison is made between various servos, checking the force that each can exert both experimentally and theoretically, in order to choose the one that meets the design expectations with the least weight and dimensions.

First, the grip force is evaluated experimentally by closing the claw on a test bench and measuring the force with a dynamometer. The dynamometer is hooked to the claws and pulled upwards to obtain the real force exerted by the servomotor.

For those tests, the base of the limb was fixed and a force was applied at the tip. This force is equal to that exerted by the nylon on the limb.

The servomotor voltages given in the datasheet are validated experimentally by comparing the force measured with a dynamometer at each voltage with the theoretical force stated in the datasheet.

A comparison has been made between three servos: the Feetech FS5103B, the Feetech SCS15, and the Feetech FT6325M. According to the technical specifications, their operating voltage ranges are 4.8–6 V, 6–8.4 V, and 6–8.4 V, respectively, and the corresponding force ranges are 0.5–0.7 kg, 2.2–2.9 kg, and 2.8–3.6 kg.

Figure 10 shows that the theoretical force is greater than the experimental force, as expected, due to normal mechanical losses. This loss is lower than 4%, being larger at lower voltages and smaller at higher voltages. Servo 1 does not meet the specifications, while the second and third do, the third one with ample margin. The second was chosen, as it meets the design requirements with a lower weight.
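The loss comparison amounts to a relative difference between datasheet and measured forces. A minimal sketch, with assumed example readings (not the measured data behind Figure 10):

```python
# Illustrative sketch: relative mechanical loss between the theoretical
# (datasheet) force and the measured force. All values below are assumed
# example readings, not the data behind Figure 10.

voltages    = [6.0, 7.2, 8.4]     # test voltages [V]
theoretical = [2.2, 2.5, 2.9]     # assumed datasheet forces [kg]
measured    = [2.12, 2.42, 2.82]  # assumed dynamometer readings [kg]

losses = [(t - m) / t * 100 for t, m in zip(theoretical, measured)]
for v, loss in zip(voltages, losses):
    print(f"{v} V: loss = {loss:.1f}%")
```

With these example values the loss stays below 4% and shrinks as the voltage rises, matching the trend described in the text.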

After analyzing this information, we chose the Feetech SCS15 servomotors, as explained above, because they have enough force to move the landing gear and hold on to the pipes. In addition, they have a serial bus connection through which the three motors can be connected on the same bus, with each motor addressed by its ID.

**Figure 10.** Comparison between the theoretical force and the real force performed by the soft limbs according to the voltage applied to the servos. The orange line represents the theoretical measurements and the blue line represents the experimental measurements obtained.

#### *4.2. Contact Pressure*

This section introduces the experiments for measuring the pressure exerted by the soft landing gear on two pipes of 140 mm and 160 mm diameter. In these experiments, force-sensing resistors (FSR sensors) were distributed over the soft surface. These sensors change their resistance when a force is applied; this measurement can be mapped to forces and extrapolated to a pressure distribution over the surface. The aim was to understand the behavior of the soft limbs and to identify the areas where the soft train exerts more or less pressure on the pipes. Several experiments were carried out to measure the pressure of each limb; the data were then collected, averaged, and processed to obtain the pressure map. This process was done for both the 160 mm and the 140 mm pipes.
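The mapping step can be sketched as follows. This is an illustrative pipeline with assumed values, using a simple inverse-resistance FSR model (force roughly proportional to conductance), not the authors' calibration:

```python
import statistics

# Illustrative sketch (assumed values): converting FSR resistance readings
# to forces and averaging repeated trials into a per-sensor pressure map.
# A common first-order FSR model is F ~ k / R, with k calibrated per sensor.

K_CAL = 150.0          # assumed calibration constant [N * kOhm]
SENSOR_AREA_MM2 = 100  # assumed sensing area per FSR [mm^2]

# Assumed resistance readings [kOhm]: one row per trial, one column per sensor.
trials = [
    [30.0, 12.0, 50.0, 10.0],
    [28.0, 13.0, 55.0, 11.0],
    [32.0, 11.0, 45.0, 9.0],
]

def force_newton(r_kohm):
    return K_CAL / r_kohm

# Average force per sensor over the trials, then convert to pressure [MPa].
pressure_map = []
for sensor in zip(*trials):
    mean_force = statistics.mean(force_newton(r) for r in sensor)
    pressure_map.append(mean_force / SENSOR_AREA_MM2)

print([round(p, 3) for p in pressure_map])
```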

Figure 11 shows that more pressure is exerted at the base of the soft train and at the tips of the limbs. Less pressure is exerted in the intermediate areas, due to the folds that form there.

**Figure 11.** The upper image shows the lower part of the landing gear on which the pressure study was made. The lower left-hand image shows the pressure map for a 140 mm-diameter pipe and the lower right-hand image shows the pressure map for a 160 mm-diameter pipe.

The same behavior is observed in both cases, i.e., in the pressure maps for the 160 mm and the 140 mm pipes. The difference is that the overall pressure recorded on the 140 mm pipe is lower; that is, there is generally less pressure at the tips of the limbs and at the base of the soft train than on the 160 mm pipe.

#### *4.3. Maximum Lateral Angle*

A study was also conducted to determine the range of angles at which the soft train can be attached to the pipe without separating from it. For this test, a test bench was installed with a smooth PVC pipe of 160 mm diameter, and the maximum inclination angle with respect to the vertical of the pipe was checked. The soft train, together with the UAV, was able to hold on up to a maximum angle of 30°. Figure 12 shows the maximum angle at which the soft gripper can maintain contact with the pipe.
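As a rough statics check (an illustrative sketch with an assumed friction coefficient, not the authors' model), the grip force needed to avoid lateral slipping at a given angle follows from balancing friction against the weight component along the surface:

```python
import math

# Illustrative statics sketch: on a laterally inclined pipe, friction must
# balance the weight component along the surface, mu * N >= m * g * sin(theta).
# The friction coefficient below is an assumed value.

m = 3.25   # system mass [kg], from the text
g = 9.81   # gravity [m/s^2]
mu = 0.6   # assumed friction coefficient (silicone tip on smooth PVC)

theta = math.radians(30)  # maximum angle found in the test
required_grip_N = m * g * math.sin(theta) / mu

print(f"required grip normal force ~ {required_grip_N:.1f} N")
```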

This experiment also tested the movement of the soft train and verified that the prominences arranged at the base of the train can pass over the pipe joints and their irregularities.

**Figure 12.** The maximum angle at which the landing gear, with the drone, can grip the pipe.

#### *4.4. Crawling Gait Analysis*

Finally, to analyze the repetitiveness of the movement of the system, a gait analysis using a motion capture (MoCap) system was carried out.

The MoCap system allowed us to record in real time the position of markers during a controlled movement. Three reflective markers were placed on each limb, located at each joint where the limb bends, as shown in Figure 13.

**Figure 13.** Placement of markers on a pair of soft limbs.

Knowledge of the marker positions can be used to validate the motion on the pipeline. When the limbs are extended, the markers line up; as the limbs begin to bend, the markers move inward, forming a semicircle. Then the displacement begins: first, the front pair of limbs opens and moves forward; in the next step, the opposite happens, with the front limbs closing and the rear limbs opening, which moves the mechanism forward. The motion must be linear along the pipe. Slippage is corrected by adjusting the position of the center of gravity of the UAV when joining the soft landing gear to the main UAV platform.

The recorded marker positions and timestamps can also be used to obtain the speed of the landing gear along the pipe; the average speed obtained is 4 cm/s.

Moreover, the positions of the three markers on each limb can also be used to obtain the radius of the circumference formed when closing the limbs. For the case studied, the radius is 84 mm.
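The radius recovery from three marker positions is the classic circumradius of the triangle they form. A minimal sketch (not the authors' code), with marker coordinates chosen as assumed examples on an 84 mm circle:

```python
import math

# Illustrative sketch: radius of the circle through three 2D marker
# positions, using the circumradius formula R = (a*b*c) / (4*Area).
# Marker coordinates are assumed example values in mm, placed on a
# circle of radius 84 mm.

def circumradius(p1, p2, p3):
    a = math.dist(p2, p3)
    b = math.dist(p1, p3)
    c = math.dist(p1, p2)
    # Twice the triangle area via the cross product of two edge vectors
    area2 = abs((p2[0] - p1[0]) * (p3[1] - p1[1])
                - (p3[0] - p1[0]) * (p2[1] - p1[1]))
    return a * b * c / (2 * area2)

r = 84.0
markers = [(r * math.cos(t), r * math.sin(t))
           for t in (0.0, 1.0, 2.0)]  # three points on an 84 mm circle
print(f"radius = {circumradius(*markers):.1f} mm")
```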

Finally, it can be concluded that the movement made by the soft landing gear on the pipe is always the same, obtaining the same circumference radius. This also allows us to correct the center of gravity to prevent slipping, and to ensure that the UAV and the soft landing gear are centered on the pipe.

Figure 14 illustrates the motion of the marker points in two limbs.

**Figure 14.** Closing sequence on the pipe and the limbs' deformation. In the first picture, the tentacles are open. In the second, they begin to close. Finally, in the third picture, the limbs are completely closed, adapting to the shape of the pipe.

#### **5. Experimental Test**

This section describes the experimental setup of the whole working system, including the flying platform.

The soft landing gear was validated on a DJI Flame Wheel F550 multirotor platform. This platform was chosen for its versatile design. One of the advantages of the developed landing gear system is that it can be adapted to any type of multi-rotor system thanks to its modular design. It can also land on pipes of different diameters and, once landed, it can crawl along the pipe to perform both visual and contact inspection.

The motor controller board supplies the power required by the motors and controls them by sending information through the data bus. An AVR-based board controls the landing and crawling on the pipe through this controller board, executing the program that sequentially opens and closes the soft limbs to generate the movement of the system.

The AVR-based board is connected to an NVIDIA TX2 onboard computer, which is in charge of relaying the commands received from the pilot and executing the high-level behavior software (which is beyond the scope of this paper). It can also be used to integrate more sensors, including cameras. All of these components are powered through specific voltage converters that regulate the supply to each device from the battery.

The AVR-based board sends the information to the motor controller over a serial link. This asynchronous Rx–Tx protocol was used because it provides two lines: one for transmitting commands to the motors and the other for receiving data such as the position, the speed, or which motor is working in each step. Figure 15 shows all of the components used.
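The addressed-bus idea can be sketched generically as follows. This is an illustrative frame format with an ID byte and a checksum for demonstration only, not the actual Feetech SCS protocol, and the instruction code and parameters are hypothetical:

```python
# Illustrative sketch of an addressed command frame on a shared serial bus
# (a generic format for demonstration; not the actual Feetech SCS protocol).
# Each motor on the bus is selected by its ID byte; a checksum guards the frame.

HEADER = b"\xff\xff"

def build_frame(motor_id: int, instruction: int, params: bytes = b"") -> bytes:
    length = len(params) + 2                      # instruction + checksum
    body = bytes([motor_id, length, instruction]) + params
    checksum = (~sum(body)) & 0xFF                # ones' complement of the byte sum
    return HEADER + body + bytes([checksum])

# Example: a hypothetical "set goal position" instruction for motor ID 2.
frame = build_frame(motor_id=2, instruction=0x03, params=(512).to_bytes(2, "little"))
print(frame.hex())
```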

**Figure 15.** Setup included in the flying platform.

This UAV has two operation modes: manual and autonomous. In the manual mode, an operator sends the commands to move along the pipe. These commands are received by the NVIDIA TX2 and transmitted to the AVR-based board via USB. The operator can select between opening and closing the limbs, or moving the soft train forward or backward. In the autonomous mode, the AVR-based board sends a sequential program to the motors so that they carry out the movement: a linear sequence of opening the front limbs, moving forward, closing the front limbs, opening the rear limbs, and moving forward. This mode should be activated once the UAV is attached to the pipe.
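The autonomous sequence amounts to a fixed command list replayed each cycle. A minimal sketch with hypothetical command names (the real firmware runs on the AVR-based board):

```python
# Illustrative gait sequencer (hypothetical command names; the real
# sequence runs as firmware on the AVR-based board).

GAIT_SEQUENCE = [
    "open_front_limbs",
    "move_forward",
    "close_front_limbs",
    "open_rear_limbs",
    "move_forward",
    "close_rear_limbs",
]

def crawl(cycles: int):
    """Yield the motor commands for a number of full gait cycles."""
    for _ in range(cycles):
        yield from GAIT_SEQUENCE

steps = list(crawl(2))
print(len(steps), "commands, starting with:", steps[:3])
```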

Tests were carried out to verify the functionality of this unit. The first experiments on the pipe were those in which the soft landing gear moves along it. Once this stage was tested, the landing gear was installed on the multi-rotor system, in this case a DJI F550 carrying a Pixhawk 2 for the control of the multi-rotor, an NVIDIA TX2 as the onboard computer, a camera connected to it to locate and position the UAV on the pipe, and a gas sensor, a safety sensor that detects gas leaks to avoid putting the installation at risk (further detailed in [32]). Figure 16 shows the scheme of the proposed system.

**Figure 16.** System scheme used.

Multiple experiments have been carried out to demonstrate that the system works under different conditions, including indoor and outdoor experiments with wind, which makes it more difficult to maneuver and land the UAV. The soft landing gear grips firmly to the pipe and helps center the multi-rotor on the pipe while landing. The grip is fast and safe and does not damage or scratch the pipe at any time, and the landing gear is strong enough to hold on to the pipe and move along it without needing to be stabilized by the UAV propellers. Figure 17 shows the soft train attached to a pipe with and without the multi-rotor UAV.

**Figure 17.** The image on the left-hand side shows the first experiment performed with the landing gear alone on the pipe. The picture on the right-hand side shows the complete system and how it is attached to the pipe.

The gripper system worked in all of the tests, demonstrating its reliability and ability to overcome joints between pipes while moving. It can also be attached to irregular surfaces.

As a proof of concept, an ultrasonic sensor was attached to the soft landing gear to acquire non-destructive measurements of the thickness of a typical steel pipe. Figure 18 shows the assembly and the components used for the inspection. The ultrasonic sensor is placed at the front of the landing gear, with a motor to move it up and down; applying the right contact pressure is important for taking the measurements.

**Figure 18.** The image shows the complete system with an ultrasonic sensor and the flaw detection computer.

#### **6. Conclusions**

A novel design of a soft-tentacle gripper for UAVs to crawl on pipes was presented.

How the soft gripper was manufactured with 3D printing and molding techniques for the silicone-based material was also described. This manufacturing process is highly repeatable and reliable.

The design of the UAV landing gear is unique and is capable of crawling on pipes without damaging them. It also provides a fast coupling and decoupling of the pipe. It can move through welds and joints, without losing adhesion and without jamming. The soft mechanism also adapts to pipes of different diameters or even to non-cylindrical pipes.

The soft gripper was designed to be as lightweight as possible, easy to transport and easy to change. The system also has low-cost implementation and is easy to reproduce using additive manufacturing techniques.

The system has been validated with real tests landing and crawling on the pipes with an automatic system.

This system can be combined with various sensors to perform inspections, an example of which is the use of an ultrasonic sensor. This type of sensor combined with this soft landing gear is very useful for inspecting pipes located at high altitudes. This saves in terms of inspection costs and time and increases the safety of workers.

This research paper represents the first step to create a fully autonomous hybrid flying and crawling contact inspection robot that is able to operate in an environment with many obstacles.

Future work will focus on the development of a faster locomotion system with more degrees of freedom. Furthermore, a standard ultrasonic transducer will be introduced in the base of the crawl system in order to realize flaw detection over the pipe and to detect anomalies.

**Author Contributions:** Conceptualization, F.J.G.R., P.R.S. and B.C.A.; methodology, F.J.G.R., P.R.S. and B.C.A.; software, F.J.G.R. and P.R.S.; validation, F.J.G.R.; formal analysis, F.J.G.R.; investigation, F.J.G.R., P.R.S. and B.C.A.; resources, F.J.G.R., P.R.S. and B.C.A.; data curation, F.J.G.R.; writing—original draft preparation, F.J.G.R. and P.R.S.; writing—review and editing, F.J.G.R., P.R.S. and B.C.A.; visualization, F.J.G.R., P.R.S. and B.C.A.; supervision, P.R.S. and B.C.A.; project administration and supervision, A.O.; funding acquisition, A.O. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was funded by the EU project HYFLIERS (Hybrid Flying-Rolling with Snake-Arm Robot for Contact Inspection), SI-1762/23/2017.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** We thank the Robotics, Vision and Control Group and the GRIFFIN (General Compliant Aerial Robotic Manipulation System Integrating Fixed and Flapping Wings to Increase Range and Safety) SI-1867/23/2018 ERC-ADG EU-funded project for supporting this work. This work is part of the EU-funded project HYFLIERS (Hybrid Flying-Rolling with Snake-Arm Robot for Contact Inspection, SI-1762/23/2017), which aims to develop an inspection and maintenance system for the oil and gas industry using hybrid (aerial and ground) robots. We also thank Alejandro Gomez Tamm and Vicente Perez Sanchez for their support of this paper.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **References**


### *Article* **Detection of the Deep-Sea Plankton Community in Marine Ecosystem with Underwater Robotic Platform**

**Jiaxing Wang 1, Mingqiang Yang 1,2,\*, Zhongjun Ding 3,\*, Qinghe Zheng 1, Deqiang Wang 1, Kidiyo Kpalma <sup>4</sup> and Jinchang Ren <sup>5</sup>**


**Abstract:** Variations in the quantity of plankton impact the entire marine ecosystem. It is of great significance to accurately assess the dynamic evolution of plankton for monitoring the marine environment and global climate change. In this paper, a novel method is introduced for deep-sea plankton community detection in the marine ecosystem using an underwater robotic platform. The videos were sampled at a distance of 1.5 m from the ocean floor, with a focal length of 1.5–2.5 m. The optical flow field is used to detect the plankton community. We show that, for each moving plankton that does not overlap in space in two consecutive video frames, the time gradients of the plankton's spatial position are opposite to each other in two consecutive optical flow fields, while the lateral and vertical gradients have the same value and orientation. Accordingly, moving plankton can be accurately detected under the complex dynamic background of the deep-sea environment. Experimental comparison with manual ground truth fully validated the efficacy of the proposed methodology, which outperforms six state-of-the-art approaches.

**Keywords:** image motion analysis; image processing; optical flow; underwater robotics

#### **1. Introduction**

Plankton are organisms that live in oceans and fresh water [1] and play an important role in the material and energy recycling of the marine food chain [2]. The study of the plankton community, and of plankton itself, is indispensable for understanding marine resources and the impact of climate change on ecosystems [3]. In addition, the number of plankton is a key indicator of carbon and energy cycling [4] and of great significance to species diversity and ecosystem diversity [5]. From the early 19th century to date, much large-scale sensor equipment has been used to address the challenge of obtaining reliable high-resolution estimates of plankton abundance at depth [6]. Acoustic and optical techniques for the in-situ observation of zooplankton are currently popular for plankton distribution assessment. Although acoustic-based observation has the outstanding advantage of a high observation frequency, its quantification is inaccurate, and it usually requires combination with optical image analysis or other traditional zooplankton sampling. In recent years, a series of advances were made in computer vision [7], including hyperspectral imaging [8], principal component analysis of images [9,10], and deep learning [11–13] for image classification [14]. As marine plankton are small and uneven in size, they are difficult to describe quantitatively, for example with inventory and abundance statistics.

**Citation:** Wang, J.; Yang, M.; Ding, Z.; Zheng, Q.; Wang, D.; Kpalma, K.; Ren, J. Detection of the Deep-Sea Plankton Community in Marine Ecosystem with Underwater Robotic Platform. *Sensors* **2021**, *21*, 6720. https://doi.org/10.3390/s21206720

Academic Editor: Shafiqul Islam

Received: 16 August 2021 Accepted: 5 October 2021 Published: 10 October 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

At present, many plankton detection methods have been proposed, which often rely heavily on sophisticated underwater instruments. J. Craig et al. [15,16] constructed the ICDeep system, based on an Image Intensified Charge-Coupled Device (ICCD) camera, to assess the quantity of low-light bioluminescent sources in the marine environment. Philips et al. [17] created a marine biological detector in which a scientific CMOS (sCMOS) camera was used to image the organisms before conducting a statistical analysis of the plankton abundance. With the development of computer vision, multitarget-tracking-enabled automatic analysis was gradually applied to this field [18]. Kocak et al. [19] proposed using active contour (snake) models to segment, label, and track plankton images for classification. Luca et al. [20] also presented an automatic plankton counting method, which mainly used the interframe difference and the intersection of bounding boxes to perform multitarget matching. The aforementioned methods achieved some results in automatic analysis and counting. However, some challenges remain due to the particularity and complexity of plankton's form and passive movement mode. Underwater plankton imaging can detect patterns in plankton distributions that could not be tackled by sampling with nets [21]. Therefore, we consider applying machine vision technology to underwater images or videos to be a currently feasible method for studying plankton.

Underwater robots play an important role in various video surveillance tasks, including data collection. A mobile robot fixed on a rotatable axis is advantageous because it provides 360° visual coverage, instead of a fixed camera installed in a predetermined direction. These mobile robots capture unprecedented shots of marine life in dangerous environments inaccessible to humans. A submarine can push and control the underwater robot to complete the collection of deep-sea data and store the data in a computer for analysis. Some underwater robots are shown in Figure 1.

In this paper, we propose a deep-sea plankton detection method based on the Horn–Schunck (HS) optical flow [22]. Optical flow is the instantaneous velocity of pixel movement of a moving object on the image plane, and its advantage is that motion vectors can be estimated accurately from the optical flow field. In this way, one can detect plankton and statistically analyze their volume using image processing and machine vision. Research on plankton covers density, position, number, and individual and total volume, among other quantities. When the spatial positions of a plankton in two consecutive frames do not coincide, its presence or absence is determined by the following conditions: the time gradients at the plankton's location in two consecutive optical flow fields are opposite to each other, while the horizontal and vertical gradients at that location are equal in magnitude and direction. Since connected components are marked as plankton locations, the number of connected components can be regarded as the number of plankton. Using this method, we first count the number of plankton in the video and then perform a statistical analysis. Various comparative experiments are carried out against other methods to fully demonstrate the effectiveness of the proposed methodology.

**Figure 1.** Underwater robot pattern: submarine can push and control underwater robot to complete collection of deep-sea data, then store data in computer for analysis.

#### **2. The Proposed Method**

#### *2.1. Principle*

The deep ocean floor is clear and suitable for video acquisition with active lighting. During video acquisition, the camera position and shooting angle change with the movement of the submersible, making plankton detection a moving-target detection problem against a complex, dynamic background. Two consecutive optical flow field matrices derived from three consecutive video frames are employed. For fast-moving plankton (plankton that does not overlap in space in two consecutive frames), the two consecutive optical flow values at the position of the plankton are opposite. In practice, the amount of grayscale change is often close to, but not exactly, 0; the two consecutive optical flows are therefore only approximately opposite to each other, and we address this by setting two thresholds in the experimental section. We use this property to map out the locations of the plankton. Figure 2 provides an overview of the proposed method, which consists of three modules.

**Figure 2.** Main processing blocks of proposed algorithm. Module 1 is for preprocessing, whilst Module 2 performs 3D convolution on video frames to extract dense optical flow. Module 3 is a dual-threshold setting to determine whether a plankton is contained at a specific location or not (see Section 3.3).

As shown in Module 1 of Figure 2, grayscale images are obtained by weighting the three channels of the input frames. In Module 2, three convolution operations are performed on each pair of consecutive frames to produce three different gradients (see Figure 3), corresponding to three different convolution kernels; the details of the convolution process are illustrated in Figure 3. We find that the time gradients of the two optical flow fields derived from three consecutive frames are opposite in value and direction at the positions of plankton in the middle frame; in the following, they are denoted $\nabla_t$ and $\nabla'_t$. The horizontal gradients of the two consecutive optical flow fields are equal in magnitude and direction at those positions, and similarly for the vertical gradients; in the following, the horizontal gradients of the two optical flow fields are denoted $\nabla_x$ and $\nabla'_x$, and the vertical gradients $\nabla_y$ and $\nabla'_y$. Finally, Module 3 performs dual thresholding, which is explained separately when discussing the parameters later.

**Figure 3.** Three convolution kernels corresponding in time and space. Two consecutive frames are used to form a 3D matrix whose size is (*height* + 1) × (*width* + 1) × 2. Size of filter is 2 × 2 × 2. Result of each operation is gradient of the pixel at upper-left corner of convolution kernel.
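As a concrete sketch, the gradient estimation for one pair of frames can be written as below (a minimal NumPy illustration under our own assumptions, not the authors' implementation; we take $x$ as the row index and mirror the averaging of Equations (3) and (5), with the vertical gradient formed analogously):

```python
import numpy as np

def pair_gradients(f0, f1):
    """Estimate horizontal, vertical, and time gradients for one pair of
    consecutive grayscale frames (f0 at time t-1, f1 at time t), averaging
    two first-order differences per gradient. Outputs share shape (H-1, W-1)."""
    f0 = np.asarray(f0, dtype=np.float64)
    f1 = np.asarray(f1, dtype=np.float64)
    # Time gradient: frame difference averaged over two x-neighbors.
    gt = 0.5 * (f1[:-1, :-1] - f0[:-1, :-1] + f1[1:, :-1] - f0[1:, :-1])
    # Horizontal gradient: x-difference averaged over the two frames.
    gx = 0.5 * (f1[1:, :-1] - f1[:-1, :-1] + f0[1:, :-1] - f0[:-1, :-1])
    # Vertical gradient: y-difference averaged over the two frames.
    gy = 0.5 * (f1[:-1, 1:] - f1[:-1, :-1] + f0[:-1, 1:] - f0[:-1, :-1])
    return gx, gy, gt
```

Applying this to the frame pairs (t−1, t) and (t, t+1) yields the two gradient triples whose signs are compared in Module 3.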

#### *2.2. Proof*

In the HS optical flow method, the constraint equation of optical flow, Equation (2), is established from the premise of the method: invariance of gray level [22]. Three first-order differences replace the horizontal, vertical, and time gradients. Let the gray value at the plankton's position in the middle frame be $I_{x,y,t}$, where the subscripts $x$ and $y$ are the pixel indices and $t$ is the time index. The position of the plankton changes with the movement of the ocean current and the camera lens. As shown in Figure 4, the plankton is small, so its position in frame $t$ does not overlap with that in frame $t+1$. When it moves from position 1 to position 2, the gray value at position 2 in frame $t-1$ is the background gray value $I_{x,y,t-1}$. Similarly, when the plankton moves from position 2 to position 3, the gray value at position 2 in frame $t+1$ becomes the background gray value $I_{x,y,t+1}$. Based on the characteristics of deep-sea underwater video, the background around the plankton is invariant in time, i.e.:

$$I_{x,y,t-1} = I_{x,y,t+1} \tag{1}$$

**Figure 4.** Position of plankton in three consecutive frames.

$$\nabla_x u + \nabla_y v + \nabla_t = 0 \tag{2}$$

The time gradients at the plankton's positions in the two adjacent optical flow fields are:

$$\nabla_t = \frac{1}{2}(I_{x,y,t} - I_{x,y,t-1} + I_{x+1,y,t} - I_{x+1,y,t-1}) \tag{3}$$

$$\nabla'_t = \frac{1}{2}(I_{x,y,t+1} - I_{x,y,t} + I_{x+1,y,t+1} - I_{x+1,y,t}) \tag{4}$$

Based on Equation (1), the background gray values satisfy $I_{x,y,t-1} = I_{x,y,t+1}$, so $\nabla_t = -\nabla'_t$: the time gradients of the two optical flow fields derived from three consecutive frames are opposite at the positions of plankton in the middle frame.

The horizontal gradients of the plankton's location in the two optical flow fields are:

$$\nabla_x = \frac{1}{2}(I_{x+1,y,t} - I_{x,y,t} + I_{x+1,y,t-1} - I_{x,y,t-1}) \tag{5}$$

$$\nabla'_x = \frac{1}{2}(I_{x+1,y,t+1} - I_{x,y,t+1} + I_{x+1,y,t} - I_{x,y,t}) \tag{6}$$

In the same way, based on Equation (1), we obtain $\nabla_x = \nabla'_x$, i.e., the horizontal gradients of the two optical flow fields derived from three consecutive frames are equal at the positions of plankton in the middle frame. Likewise, $\nabla_y = \nabla'_y$.

In fact, in the course of the proof, the time and space gradients are estimated by taking the mean over a 2 × 2 × 2 cubic neighborhood.

Then, we iterate $n$ times for gray-gradient relaxation, setting the initial conditions $u_0 = u'_0 = 0$ and $v_0 = v'_0 = 0$.

$$\Delta = \frac{\nabla_x u_n + \nabla_y v_n + \nabla_t}{\alpha^2 + \nabla_x^2 + \nabla_y^2} \tag{7}$$

$$u_{n+1} = u_n - \nabla_x \Delta \tag{8}$$

$$v_{n+1} = v_n - \nabla_y \Delta \tag{9}$$

The parameter $\alpha^2$ reflects the smoothness constraint of the HS optical flow algorithm; $\Delta$ is an iteration factor in the iterative algorithm; $\nabla_x$ and $\nabla_y$ are the horizontal and vertical gradients; and $u$ and $v$ are the horizontal and vertical optical flow field matrices, respectively.
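A compact sketch of the relaxation of Equations (7)–(9), taken literally as written (i.e., without the neighborhood averaging of the full HS scheme; the function name and default parameter values are our assumptions):

```python
import numpy as np

def hs_relax(gx, gy, gt, alpha2=1.0, n_iter=20):
    """Iterate Equations (7)-(9) with initial conditions u0 = v0 = 0.
    alpha2 is the smoothness parameter alpha^2."""
    u = np.zeros_like(gx, dtype=np.float64)
    v = np.zeros_like(gx, dtype=np.float64)
    denom = alpha2 + gx ** 2 + gy ** 2
    for _ in range(n_iter):
        delta = (gx * u + gy * v + gt) / denom   # Equation (7)
        u = u - gx * delta                       # Equation (8)
        v = v - gy * delta                       # Equation (9)
    return u, v
```

Running this recursion on $(\nabla_x, \nabla_y, \nabla_t)$ and on $(\nabla_x, \nabla_y, -\nabla_t)$ reproduces the sign property proved in this section: the two resulting flow fields are negatives of each other.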

Equations (7)–(9) define a recurrence in the iteration number $n$. Substituting Equation (7) into Equations (8) and (9) yields:

$$u_{n+1} = u_n - \nabla_x\left(\frac{\nabla_x u_n + \nabla_y v_n + \nabla_t}{\alpha^2 + \nabla_x^2 + \nabla_y^2}\right) \tag{10}$$

$$v_{n+1} = v_n - \nabla_y\left(\frac{\nabla_x u_n + \nabla_y v_n + \nabla_t}{\alpha^2 + \nabla_x^2 + \nabla_y^2}\right) \tag{11}$$

where $u_{n+1}$ and $u_n$ are the horizontal optical flow fields after and before the $n$-th iteration, and $v_{n+1}$ and $v_n$ are the corresponding vertical optical flow fields. We now show by induction that $u_{n+1} = -u'_{n+1}$ and $v_{n+1} = -v'_{n+1}$. When $n = 0$, we have:

$$v_1 = v_0 - \nabla_y\left(\frac{\nabla_x u_0 + \nabla_y v_0 + \nabla_t}{\alpha^2 + \nabla_x^2 + \nabla_y^2}\right) \tag{12}$$

$$v'_1 = v'_0 - \nabla'_y\left(\frac{\nabla'_x u'_0 + \nabla'_y v'_0 + \nabla'_t}{\alpha^2 + \nabla_x'^2 + \nabla_y'^2}\right) \tag{13}$$

$v_1$ and $v'_1$ are the two consecutive vertical optical flow fields at the first iteration. Since the time gradients of the two optical flow fields are opposite, i.e., $\nabla_t = -\nabla'_t$, we get $v_1 = -v'_1$. Now suppose that when $n = k$, $v_{k+1} = -v'_{k+1}$; that is, Equations (14) and (15) are opposite:

$$v_{k+1} = v_k - \nabla_y\left(\frac{\nabla_x u_k + \nabla_y v_k + \nabla_t}{\alpha^2 + \nabla_x^2 + \nabla_y^2}\right) \tag{14}$$

$$v'_{k+1} = v'_k - \nabla'_y\left(\frac{\nabla'_x u'_k + \nabla'_y v'_k + \nabla'_t}{\alpha^2 + \nabla_x'^2 + \nabla_y'^2}\right) \tag{15}$$

where $v_{k+1}$ and $v'_{k+1}$ represent the previous and the next vertical optical flow field matrices at the $(k+1)$-th iteration, respectively.

When $n = k + 1$, we can show that $v_{k+2} = -v'_{k+2}$:

$$v_{k+2} = v_{k+1} - \nabla_y\left(\frac{\nabla_x u_{k+1} + \nabla_y v_{k+1} + \nabla_t}{\alpha^2 + \nabla_x^2 + \nabla_y^2}\right) \tag{16}$$

$$v'_{k+2} = v'_{k+1} - \nabla'_y\left(\frac{\nabla'_x u'_{k+1} + \nabla'_y v'_{k+1} + \nabla'_t}{\alpha^2 + \nabla_x'^2 + \nabla_y'^2}\right) \tag{17}$$

By adding Equations (16) and (17) and substituting $v_{k+1} = -v'_{k+1}$, $u_{k+1} = -u'_{k+1}$, $\nabla_x = \nabla'_x$, $\nabla_y = \nabla'_y$, and $\nabla_t = -\nabla'_t$, we have:

$$v_{k+2} = -v'_{k+2} \tag{18}$$

Therefore, for fast-moving plankton, the vertical optical flow field matrices at the spatial position of the plankton are opposite to each other, $v = -v'$, and the same applies horizontally: $u = -u'$.

#### *2.3. The Volume of Plankton*

Based on the above proof, one can count the number of pixels at which a plankton is located and multiply by the actual size of a pixel to obtain the area of the plankton. The image resolution is *height* × *width*. According to the camera's intrinsic parameters, the actual field of view spans about W m by H m. The actual area is calculated as:

$$S = N \times (W/width) \times (H/height) \tag{19}$$

where *N* is the number of pixels and *S* is the corresponding actual area. An approximate calculation is adopted here: we first obtain the radius of a circle with the same area as the plankton, and then calculate the volume of the sphere with that radius. The advantage of this method is that the 3D volume of an irregular object can be obtained from its area alone [23]. In addition, the type of plankton can be predicted from the estimated size, laying the foundation for the later identification of plankton species. The volume is calculated as:

$$V = \frac{4}{3} \times \pi^{-\frac{1}{2}} \times S^{\frac{3}{2}} \tag{20}$$
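The counting and volume estimate (Equations (19) and (20)) applied to a binary detection mask can be sketched as follows; the flood-fill component labeling and the function name are our own illustration, not the paper's code:

```python
import math
import numpy as np

def count_and_measure(mask, W, H):
    """Count plankton as 4-connected components of a boolean detection mask,
    then estimate each one's area (Eq. 19) and equivalent-sphere volume (Eq. 20).
    W, H are the real-world extents (in metres) covered by the field of view."""
    h, w = mask.shape
    px_area = (W / w) * (H / h)          # real area of one pixel
    seen = np.zeros_like(mask, dtype=bool)
    results = []
    for r in range(h):
        for c in range(w):
            if mask[r, c] and not seen[r, c]:
                stack, n_px = [(r, c)], 0          # flood-fill one component
                seen[r, c] = True
                while stack:
                    y, x = stack.pop()
                    n_px += 1
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            stack.append((ny, nx))
                S = n_px * px_area                            # Equation (19)
                V = (4.0 / 3.0) * math.pi ** -0.5 * S ** 1.5  # Equation (20)
                results.append((S, V))
    return len(results), results
```

The component count gives the number of plankton, while the per-component areas and volumes feed the statistics of Section 3.1.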

The proposed method adds its own theoretical innovation on the basis of the original optical flow method, and this innovation has been proved mathematically. In this way, the complexity and passive motion patterns of plankton are handled well, and the detection accuracy improves accordingly.

#### **3. Experimental Results and Analysis**

The data were provided by the China National Deep Sea Center. The data set was obtained by an underwater robotic nondestructive testing system carried by a deep-sea manned submersible. The camera's technical specifications are: resolution: 1080*i* HDTV; minimum illumination: 2 lux; optical zoom: 10×; digital zoom: 12×; aperture range: 3.2 mm–32 mm; video aspect ratio: 16:9 or 4:3. In this study, three six-minute videos, each covering a plankton community from appearing to disappearing from the screen, were selected; they were captured on a western Pacific seamount slope during dives 76 and 77, at diving depths of 2741.88 m and 5555.68 m, respectively. These videos were selected because plankton appeared in them more frequently. Owing to the complexity of the deep-sea environment and the irregular camera movement, the background is complex and dynamic. In this case, high-precision image processing can effectively distinguish sediment clouds from plankton communities in the images. Examples of deep-sea plankton images are shown in Figure 5, and the details of the data set, including dive number, date, dive time, longitude, latitude, and depth, are shown in Table 1.

**Figure 5.** Example images of deep sea plankton.

**Table 1.** Details of datasets including diving number, date, diving time, longitude, latitude, and depth.


#### *3.1. Number and Volume of Plankton*

Processing the recorded video of a complete plankton community, from appearing to disappearing from the screen, yields the results shown in Figure 6. Figure 6a shows the variation in the number of plankton in the three six-minute videos, and Figure 6b shows the variation in volume for the same three videos. The process of plankton appearing in front of the camera and then disappearing is shown in Figure 6c,d. In the first 30 s of Figure 6c, the amount of plankton is small and the detection results are more accurate. The amount of plankton rises in the last 30 s of Figure 6c; for dense particle clouds, overlap, and hence occlusion, occurs frequently, which leads to relatively low average precision and recall rates.

**Figure 6.** (**a**) Number of plankton in three six-minute videos. (**b**) Total volume of plankton in three six-minute videos. (**c**) Number of plankton in a period. (**d**) Volume of plankton in a period.

The actual volume curves of plankton in the videos are shown in Figure 6b,d. The volume curve and the quantity curve of plankton generally follow the same trend. At the 40th second in Figure 6c, the plankton community moves away from the camera and then comes back, so the community appears smaller and the overall measured volume decreases due to perspective; accordingly, the volume curve in Figure 6d goes down and then up.

#### *3.2. Comparison with Six Target Detection Methods*

The proposed method is compared with six state-of-the-art methods for performance evaluation. The results are shown in Figure 7, where Figure 7a presents some original images of the video, including sediment clouds, plankton, and uneven backgrounds. The Top-Hat transform [24] is used to detect the locations of plankton in the image, as shown in Figure 7b; the weakness of this algorithm is that some plankton are missed. Figure 7c,d show the detection results of the frame difference method [25] and the motion estimation and image matching method [26], respectively. The result of the scan line marking method [27] is shown in Figure 7e, and results from the simple block-based sum of absolute differences flow (SD) method [28] and the Lucas–Kanade (LK) optical flow method [29] are given in Figure 7f,g. The weakness of these methods is that they produce a few false positives; both Figure 7c,e mistakenly detect the sediment cloud in the background. The result in Figure 7h is obtained using the proposed method. After comparison with the manual ground truth, we find that the plankton detected by the proposed method are the most consistent with the original image in Figure 7a.

We take 20 images from the video and clean the data by manual counting to obtain the ground truth. We then compare the number of plankton, recall rate, precision rate, and *F*1-*score* of the seven methods; the equations and related symbols are given in Table 2 and Equations (21)–(23). Using 10 frames from the first 30 s of the video, where the amount of plankton is small and the detection results are more accurate, the average precision rate is 0.901, the average recall rate is 0.955, and the *F*1-*score* is 0.927; the results are shown in Tables 3 and 4. Using 10 frames from the last 30 s of the video, the amount of plankton is high; for dense particle clouds, overlap, and hence occlusion, occurs frequently, so the average precision and recall rates are relatively low, 0.895 and 0.943, respectively, and the *F*1-*score* is 0.918; the results are shown in Tables 5 and 6. In addition, we randomly selected 10 frames from the video for testing; the experimental results, shown in Tables 7 and 8, confirm that the performance of the proposed method remains very good. We use bold font to highlight the best results in each category in Tables 4, 6, 8 and 9.

$$Precision = \frac{TP}{TP + FP} \tag{21}$$

$$Recall = \frac{TP}{TP + FN} \tag{22}$$

$$F1 = \frac{2 \times Precision \times Recall}{Precision + Recall} \tag{23}$$
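These metrics follow directly from the confusion-matrix counts; a generic helper (the example numbers are illustrative, not taken from the tables):

```python
def detection_scores(tp, fp, fn):
    """Precision, recall, and F1-score per Equations (21)-(23),
    from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, 9 true positives with 1 false positive and no false negatives give a precision of 0.9 and a recall of 1.0.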

**Table 2.** Confusion Matrix.


**Figure 7.** Location of plankton detected with seven different methods: (**a**) original image; (**b**) Top-Hat transform; (**c**) frame difference method; (**d**) motion estimation and image match; (**e**) scan line marking method; (**f**) simple block-based sum of absolute differences flow (SD); (**g**) Lucas–Kanade (LK) optical flow method, and (**h**) proposed method.


**Table 3.** In first 30 s, comparison of number of detected plankton using seven methods and Ground-Truth.


**Table 4.** In first 30 s, comparison of recall rate, precision rate, and *F*1-*score* of seven methods.

**Table 5.** In last 30 s, comparison of number of detected plankton using seven methods and Ground-Truth.


#### *3.3. Discussion of Parameters*

For each imaging system, there is a depth of field within which the closest and farthest field objects are all in focus. If the system were deployed in air, the light intensity for near-field and far-field objects would, in theory, not differ. However, when deployed in seawater, the light intensity changes as the light propagates from near field to far field because of scattering caused by the seawater and the particles within it. Therefore, two situations need to be discussed in the experiments. Firstly, 'grayscale invariance' is one of the prerequisites of the HS optical flow method, but in practice the amount of grayscale change is often close to, but not equal to, 0. The threshold $\beta_1$ is therefore set to handle this situation, as shown in Equation (24).


**Table 6.** In last 30 s, comparison of recall rate, precision rate, and *F*1-*score* of seven methods.

**Table 7.** Comparison of number of detected plankton from 10 randomly selected frames.


$$|u + u'| < \beta_1 \quad \text{or} \quad |v + v'| < \beta_1 \tag{24}$$

Secondly, when there is no plankton and the optical flow happens to be small, the two flow values may not be opposite and yet their sum may still satisfy Equation (24); the threshold $\beta_2$ is set to handle this situation, as shown in Equation (25).

$$-uu' > \beta_2 \quad \text{or} \quad -vv' > \beta_2 \tag{25}$$

The best threshold values are obtained by traversing a range of candidates: $\beta_1$ ranges from 0.05 to 0.35 with a step size of 0.05, and $\beta_2$ ranges from 3 to 9 with a step size of 1. The original images and the results obtained under the different thresholds are then represented as vectors, and we calculate the cosine similarity between each pair of vectors; the larger the cosine similarity between the two vectors, the more similar the two images are. The results are shown in Table 9.
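The dual-threshold test of Equations (24) and (25) can be sketched as a vectorized mask; the default threshold values below are mid-range picks from the search grids described above, not the tuned optima, and the function name is ours:

```python
import numpy as np

def plankton_mask(u, v, u2, v2, beta1=0.2, beta2=6.0):
    """Mark pixels where two consecutive optical flow fields (u, v) and
    (u2, v2) are approximately opposite: their sum is small (Eq. 24) while
    their product is strongly negative (Eq. 25)."""
    near_opposite = (np.abs(u + u2) < beta1) | (np.abs(v + v2) < beta1)
    strong_motion = (-(u * u2) > beta2) | (-(v * v2) > beta2)
    return near_opposite & strong_motion
```

Pixels passing both tests form the binary mask whose connected components are counted as plankton.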


**Table 8.** Comparison of recall, precision, and F1-score of detected plankton from 10 randomly selected frames.

**Table 9.** Select best threshold by comparing cosine distance between two vectors.


As shown in Figure 8, Figure 8a is the original image, Figure 8c shows the result of using the threshold $\beta_2$, and Figure 8b shows the result without it.

#### *3.4. Time Complexity Comparison*

The time complexity comparison of the proposed method and the six state-of-the-art methods is provided in Table 10. We select a one-minute video of 1440 frames and measure the computation time of the different methods. Although the proposed method does not have a great advantage in terms of time complexity, it outperforms the other methods in the accurate detection of plankton. In terms of detection efficiency, further experimental comparisons were carried out: based on the same one-minute video, the computation time and recall rate of four different sampling strategies are compared, in which pixels are sampled at an interval of 1 and frames are taken from the full sequence at an interval of 1 frame. According to the results in Table 11, the pixel interval has only a weak influence on the error of the result, with the recall rate, precision rate, and *F*1-*score* remaining the closest to the full-resolution result, while the detection efficiency is improved by greatly reducing the computation time.


**Figure 8.** Comparison with or without threshold: (**a**) original image; (**b**) one without threshold *β*2, and (**c**) result of using threshold *β*2.


**Table 10.** Time complexity comparison of proposed method and 6 state-of-the-art methods in a 1-min video of 1440 frames.

**Table 11.** Time complexity comparison of different sampling in a 1-min video of 1440 frames.


#### **4. Conclusions**

The detection of plankton plays an important role in the exploration and research of deep-sea areas. Variations in the quantity and spatial distribution of plankton determine the function of the entire marine ecosystem. In this paper, we introduce a method for deep-sea plankton community detection in the marine ecosystem using an underwater robotic platform. Compared with traditional methods, our method simultaneously improves the precision and recall of plankton detection. The obtained results and the proven theory provide a scientific basis for studying the material cycle and energy flow of deep-sea ecosystems. In future work, with a view to strengthening the proposed solution, we aim to improve our plankton detection approach and then conduct studies on plankton recognition and the identification of their species.

**Author Contributions:** Conceptualization, J.W. and Z.D.; methodology, J.W. and M.Y.; writing—original draft preparation, J.W., K.K. and D.W.; visualization, J.R. and K.K.; writing—review and editing J.W., K.K., Q.Z. and J.R. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by National Key R&D Program of China under Grant No. 2018YFC0831503.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** The authors would like to thank National Deep Sea Center for the provision of the video material used in the experimental validation of the method.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Article* **A Robotic Experimental Setup with a Stewart Platform to Emulate Underwater Vehicle-Manipulator Systems**

**Kamil Cetin 1,2,\*,†, Harun Tugal 1,2,‡, Yvan Petillot 1,2, Matthew Dunnigan 1,2, Leonard Newbrook 1,2 and Mustafa Suphi Erden 1,2**


**Abstract:** This study presents an experimental robotic setup with a Stewart platform and a robot manipulator to emulate an underwater vehicle–manipulator system (UVMS). This hardware-based emulator setup consists of a KUKA IIWA14 robotic manipulator mounted on a parallel manipulator, known as Stewart Platform, and a force/torque sensor attached to the end-effector of the robotic arm interacting with a pipe. In this setup, we use realistic underwater vehicle movements either communicated to a system in real-time through 4G routers or recorded in advance in a water tank environment. In addition, we simulate both the water current impact on vehicle movement and dynamic coupling effects between the vehicle and manipulator in a Gazebo-based software simulator and transfer these to the physical robotic experimental setup. Such a complete setup is useful to study the control techniques to be applied on the underwater robotic systems in a dry lab environment and allows us to carry out fast and numerous experiments, circumventing the difficulties with performing similar experiments and data collection with actual underwater vehicles in water tanks. Exemplary controller development studies are carried out for contact management of the UVMS using the experimental setup.

**Keywords:** underwater vehicle–manipulator system; robotics emulator; contact management; remote inspection; force control

#### **1. Introduction**

An underwater vehicle–manipulator system (UVMS) consists of an underwater robotic manipulator mounted on an underwater vehicle, typically used for subsea inspection and surveillance [1–3]. Owing to the inherent danger of manned subsea operations, research interest in underwater robotic systems has continuously increased, as UVMSs have a wide range of application areas, for instance, object inspection, underwater welding, and valve manipulation in the offshore industry [1,2]. Underwater robot manipulators enhance the capabilities of underwater vehicles and reduce operational costs and danger to human life for essential subsea tasks requiring physical interaction. However, designing a robust controller for such a complex system is a challenge from a control point of view due to the highly dynamic coupling between the manipulator and the floating vehicle. In addition, the overall system needs to be robust against external disturbances, e.g., those caused by waves or tidal streams, while the end-effector of the manipulator is interacting with the environment. These are also common problems for land-based mobile manipulators but are particularly relevant to UVMSs, as the base vehicle is floating. For testing and demonstration purposes, here we consider underwater asset inspection/manipulation tasks which require

**Citation:** Cetin, K.; Tugal, H.; Petillot, Y.; Dunnigan, M.; Newbrook, L.; Erden, M.S. A Robotic Experimental Setup with a Stewart Platform to Emulate Underwater Vehicle-Manipulator Systems. *Sensors* **2022**, *22*, 5827. https://doi.org/ 10.3390/s22155827

Academic Editors: Yashar Javadi and Carmelo Mineo

Received: 16 June 2022 Accepted: 1 August 2022 Published: 4 August 2022


**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

maintaining a physical contact with the asset surface, where the exact location of the contact on the surface or the exact trajectory followed on the surface is not critical. This is typically the case with pipe thickness measurements, corrosion measurements, cleaning of surfaces from biological structures, and placement of (e.g., magnetically attached) sensors/devices on the surface.

In this study, it is assumed that vehicle motions caused by environmental disturbances are unknown to the robotic arm controller, which must keep the end-effector in contact with a surface under these disturbances. We have emulated environmental disturbances with realistic data collected from a physical underwater vehicle floating in a water tank under the occasional impact of push movements. In addition, the physical interaction of the manipulator end-effector with the surface acts as a disturbance on the motion of the vehicle, owing to the physical coupling between the manipulator and the vehicle and the transmission of the interaction force to the base through the robot links. The position disturbance due to this force impact has been computed and applied on the emulating base platform. In this way, we have obtained a realistic emulation of the underwater disturbance impacts on the robot base, capturing the two main causes: water flow and environment interaction. As a result, a simulation environment and a physical experimental setup have been developed to interact with each other and replicate a UVMS, in order to test and validate controllers in a dry-lab environment. A force/position control method is adapted from our earlier studies [4], and an admittance-based controller [5] that applies virtual dynamics at the manipulator end-effector for perpendicular force interaction with an unknown surface is implemented in this study. This admittance controller does not require knowledge of the vehicle position/velocity, the stiffness of the environment, or the manipulator base disturbance effects. We demonstrate the use of the setup to replicate costly underwater experiments through an evaluation of the admittance-based controller in comparison with a PID-based controller, both in simulations and in physical experiments with the hardware-based emulator.
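As an illustration of the kind of admittance law described above, consider a minimal one-degree-of-freedom sketch; the virtual mass and damping values, the function name, and the explicit-Euler discretization are our assumptions, not the controller of [5]:

```python
def admittance_step(x, v, f_err, dt, m=1.0, b=20.0, k=0.0):
    """One explicit-Euler step of the virtual dynamics m*a + b*v + k*x = f_err,
    where f_err is the measured-minus-desired contact force along the surface
    normal. Returns the updated position offset and velocity commanded to the
    manipulator end-effector."""
    a = (f_err - b * v - k * x) / m   # virtual acceleration
    v = v + a * dt                    # integrate velocity
    x = x + v * dt                    # integrate position offset
    return x, v
```

With pure damping (k = 0), a constant force error drives the commanded velocity toward f_err / b, yielding a compliant response to contact without requiring knowledge of the environment stiffness.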

For the problem of physical contact and surface tracing using a UVMS in the underwater environment, the authors in [6] proposed an optimized redundancy resolution scheme for operational space tracking control of the end-effector of a UVMS. In [7,8], the authors used task-priority-based redundancy resolution methods where the primary task was defined by the operational space position/velocity tracking and force tracking was proposed as a secondary objective. In [9,10], the authors proposed force/position hybrid controllers for the interaction of the end-effector of UVMS with an underwater environment. In [11], an impedance control focused on task priority redundancy solution was developed for contact force control of UVMS. However, these approaches do not consider the problems related to the disturbance effects on the underwater vehicle motion, since they always have access to the position data of the end-effector relative to an inertial base.

In [12–14], the problem of physical interaction was considered for aerial robots, for which variable impedance controllers based on force estimation, without force sensors, were developed. For the general problem of hard contact interaction of robot manipulators, the authors in [15–17] developed dynamic adaptive hybrid impedance controllers.

The surveys of underwater robotics [18,19] review several simulators for the development of underwater robotics. In the TRIDENT project [20], a ROS-based open-source kinematic simulator named UWSim was developed for underwater robotics simulation. In [21], Gazebo was integrated into UWSim to simulate the kinematics and dynamics of underwater robots. In [22], the authors extended a Gazebo-based Unmanned Underwater Vehicle (UUV) simulator by implementing a model of hydrodynamic effects. In [10], the authors developed a hybrid simulator for underwater vehicles and manipulators able to accurately simulate the hydrodynamic and contact forces of a UVMS with its environment. These studies, however, focused only on software-based simulation frameworks for the dynamics or kinematics of underwater vehicles and manipulators. Due to the complexity of accurately modeling and simulating the physical disturbances and the interaction forces/torques with an environment, a hardware-based emulation system with physical interaction provides a more realistic means of testing and validation for a UVMS. Therefore, in our study, in addition to the software-based simulation, we employ a hardware emulation of the underwater robotic system.

Briefly, the main contributions of this study can be summarized as follows. First, we used realistic underwater vehicle movements, transmitted to the experimental setup in real time or pre-recorded in a water tank environment. Second, we simulated the water current effect on the floating base vehicle motion, considering both hydrodynamic and contact interaction effects. Third, we used a physical robotic setup with a Stewart platform and a robotic arm manipulator to emulate a UVMS. Finally, we demonstrate the use of this system to perform fast and numerous experiments comparing control schemes for underwater asset inspection, without lengthy and costly underwater experiments.

#### **2. Realistic Real-Time Data Set and Transfer from Water Tank to the Land Robotic Setup**

In this study, a real Falcon underwater ROV is deployed in a realistic water tank environment. The vehicle is connected through 4G to the remote lab (approximately 160 km away), where its position and velocity (in 6 DOF) are used to drive a 6 DOF Stewart platform; see Figure 1. This setup provides a good proxy for real experiments without the need for complex and expensive underwater hardware and integration.

As shown in Figure 1, the land robotic setup was located in the laboratory (in Edinburgh, UK) and real-time communication between the laboratory and the remote water tank (in Blyth, UK) was established through 4G routers (DrayTek Vigor 2862). During the exemplary studies, a time delay of about 0.3 s was observed. The ROV navigation data were recorded during the experiments and are reproducible on the robotic setup to evaluate future algorithm improvements.
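As a toy illustration of the kind of low-latency pose streaming carried over the 4G link, a 6-DOF pose can be serialised into a fixed-size datagram. The format and field order below are assumptions for illustration, not the actual protocol of the setup:

```python
import struct

# x, y, z, roll, pitch, yaw as little-endian doubles (48 bytes per datagram)
POSE_FMT = "<6d"

def pack_pose(pose):
    """Serialise a 6-DOF pose [x, y, z, roll, pitch, yaw] into a compact datagram."""
    return struct.pack(POSE_FMT, *pose)

def unpack_pose(payload):
    """Recover the 6-DOF pose from a received datagram."""
    return list(struct.unpack(POSE_FMT, payload))

# Example: a pose recorded on the ROV and reconstructed at the remote lab.
rov_pose = [1.2, -0.4, 0.8, 0.01, -0.02, 1.57]
assert unpack_pose(pack_pose(rov_pose)) == rov_pose
```

A fixed binary layout keeps each update small and decodable without framing logic, which matters when streaming at a high rate over a link with an observed delay of roughly 0.3 s.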

**Figure 1.** Software and hardware implementations from a real demonstration between the Robotics laboratory in Edinburgh, UK and the water tank in Blyth, UK.

#### **3. Software-Based Simulation Platform**

We have developed a UVMS simulation platform in Gazebo using an underwater vehicle and environment proposed in [22]. The simulation platform consists of a 7 DOF robot manipulator model (KUKA IIWA14) mounted on a 4 DOF underwater vehicle model (Rexrov2) and a pipe as an interaction object in the underwater environment. The force sensor attached at the end-effector of the manipulator allows us to measure the interaction force which is used to generate joint motion commands during the surface inspection. In order to move the Rexrov2 in the simulation, Gazebo uses the actual position measurements of the real Falcon ROV in the water tank. Figure 2 shows the overall underwater simulation platform; this platform was developed in Gazebo simulating a UVMS (a robot manipulator mounted on the Rexrov2 vehicle) to perform a surface inspection on a pipe. This simulation platform has been used in integration with the physical setup during the exemplary studies for controller development of contact management.

The simulator we developed is based on the UUV Simulator [22,23], consisting of Gazebo/ROS plugins implementing Fossen's equations of motion for underwater vehicles [24], 6 DOF PID controllers for the ROV thruster modules, an ocean wave model with hydrodynamic and hydrostatic effects, and the Rexrov2 vehicle model [25]. In this way, our physical land robotic setup, explained next, considers the impact of (simulated) water dynamics and manipulator force interaction effects on the base vehicle, along with other pre-recorded realistic position disturbances.
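Fossen's equations of motion take the form M ν̇ + C(ν)ν + D(ν)ν + g(η) = τ. As an illustrative sketch only (the parameter values are invented and are not Rexrov2's), a 1-DOF surge reduction with linear plus quadratic drag can be integrated as:

```python
import numpy as np

def simulate_surge(tau, m=1862.0, d_lin=200.0, d_quad=100.0, dt=0.01, steps=1000):
    """Euler-integrate a 1-DOF (surge) reduction of Fossen's equation:
        m * dv/dt = tau - d_lin * v - d_quad * v * |v|
    Added mass is folded into m; the restoring term g vanishes for surge."""
    v = 0.0
    history = []
    for _ in range(steps):
        dv = (tau - d_lin * v - d_quad * v * abs(v)) / m
        v += dv * dt
        history.append(v)
    return np.array(history)

vel = simulate_surge(tau=500.0)
# The vehicle accelerates toward the steady state where thrust balances drag,
# i.e. where tau = d_lin * v + d_quad * v**2.
```

The quadratic drag term is what makes underwater vehicle dynamics markedly nonlinear compared with land robots, and is one reason a software model is needed alongside the physical emulator.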

**Figure 2.** Simulating the UVMS using robotics simulation platform Gazebo. A KUKA IIWA manipulator model mounted on a Rexrov2 vehicle carries out surface inspection on a pipe.

#### **4. Physical Robotic Setup**

Figure 3 shows the land robotic setup, which emulates a UVMS with a real KUKA IIWA14 robot manipulator fixed on a Stewart parallel manipulator platform and interacting with a pipe. It is composed of a 7 DOF robot manipulator (KUKA IIWA14) emulating an underwater robotic manipulator, a 6 DOF base (Stewart parallel manipulator) emulating an underwater vehicle, and an ATI Gamma NET FT force sensor attached to the end-effector of the manipulator for contact management. Since pipes are among the most common objects to interact with in the offshore subsea environment [26,27], a PVC vent pipe with a diameter of 500 mm and a wall thickness of 4 mm was placed in front of the land robotic setup to emulate the underwater object that the UVMS's end-effector is supposed to inspect. In the exemplary studies, the real Falcon ROV's actual position data from the water tank were used to move the Stewart platform. It should be noted that the actual position measurement of the Falcon ROV was only used to move the platform and not to control the manipulator. Since the communication is unilateral and open-loop control is implemented on the Stewart platform, the communication time delay between the two locations did not impact the testing and verification of control quality.

**Figure 3.** Simulating the UVMS with a real KUKA IIWA14 robot manipulator fixed on the Stewart parallel manipulator platform interacting with a pipe.

#### **5. Interaction of Physical Robotic Setup-Realistic Data-Simulation Platform**

Generally in a UVMS, once the end-effector of the manipulator contacts an object, the interaction forces and torques at the contact point result in reaction forces (and torques) on the floating base vehicle that disturb its position (and orientation) with respect to the inspected object. Therefore, in our physical robotic setup, the interaction forces at the end-effector of the (KUKA) manipulator must be accounted for and reflected to the (Stewart) base platform as a position disturbance. In the simulation platform, we simulated the position disturbance of the floating vehicle due to the real-time force interaction of the end-effector, using the model of the Rexrov2 vehicle with its dynamic parameters and PID controllers on its thrusters [22,25]. We then superimposed these disturbances on the previously recorded water wave disturbances (the realistic data set), as shown in Figure 4. While the water wave disturbances were pre-recorded, the disturbances due to interaction changed dynamically in real time according to the actual interaction of the manipulator in the physical robotic setup. For that purpose, the force/torque (F/T) interaction that would occur between the underwater manipulator base and the vehicle is first computed using the end-effector F/T measurements; the resultant F/T at the center of mass of the vehicle is then computed and superimposed on the force and torque produced by the thrusters of the Rexrov2 in the simulator. The overall computed movement of the Rexrov2 in the simulator is added to the recorded realistic movement of the Falcon ROV in the water tank, and the result is finally transferred to the physical Stewart platform emulating the vehicle movement in the dry lab.

Overall, we measure the force at the tool-tip in the physical robotic setup and feed this measurement into the simulation platform. The simulator computes the movement of the base under this impact (the simulator considers the models of the robot arm [28] and the base vehicle [23] along with the water dynamics [22,24]). We then merge the simulator vehicle position with the designed disturbance effect (i: no disturbance, ii: sinusoidal movement in each direction, iii: realistic underwater disturbance recorded on an underwater vehicle; as will be explained in the following sections) and send the merged position signal in a feed-forward way to the Stewart platform in the physical robotic setup.
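The first step of this pipeline, expressing the measured end-effector wrench at the vehicle base, can be sketched with a standard rigid-body wrench transformation. The frames and numbers below are illustrative, not taken from the setup:

```python
import numpy as np

def wrench_at_base(R_be, p_be, f_ee, tau_ee):
    """Express an end-effector wrench (f, tau) at the vehicle base/centre of mass.

    R_be: 3x3 rotation of the end-effector frame in the base frame.
    p_be: position of the end-effector in the base frame (m).
    The force is rotated into the base frame; the moment picks up the
    lever-arm term p x f in addition to the rotated end-effector torque."""
    f_b = R_be @ f_ee
    tau_b = np.cross(p_be, f_b) + R_be @ tau_ee
    return f_b, tau_b

# Illustrative numbers: a -2 N contact force along the tool z-axis, with the
# tool 0.8 m ahead of the vehicle's centre of mass and aligned frames.
R = np.eye(3)
p = np.array([0.8, 0.0, 0.0])
f, tau = wrench_at_base(R, p, f_ee=np.array([0.0, 0.0, -2.0]), tau_ee=np.zeros(3))
```

Even a small contact force produces a non-negligible moment at the base when the manipulator is extended, which is why the platform position must be perturbed in response.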

**Figure 4.** Block diagram of the floating base (Stewart platform) movement.

The closed-loop force/position controllers in the operational space are applied only to the KUKA manipulator for contact management. The floating base (Stewart platform), on the other hand, is independently controlled by open-loop position commands composed of the real position data of the Falcon ROV (water wave disturbance) and the simulated position data of the Rexrov2 (contact interaction disturbance). All software implementation of the real-time controllers of the robotic setups, the reading of the F/T sensor measurements, the interaction with the Gazebo simulator, and the communication with the ROV in the water tank through 4G routers was carried out in C++ under Ubuntu with the Robot Operating System (ROS) middleware running at 1 kHz. A marker was attached to the end-effector through a compliant adapter. When the end-effector tool contacts and traces the pipe surface, the ATI Gamma F/T sensor mounted between the end-effector and the tool measures the forces and torques in the three translational directions [*xyz*] and three rotational directions [*αβγ*] of the operational space at a frequency of 1 kHz. The KUKA robot manipulator uses the KUKA Robot Controller (KRC), which operates at 1 kHz as a client on a remote workstation. The Stewart platform is connected to a real-time QNX control box running at 30 Hz, which in turn connects to the central control computer.

#### **6. Exemplary Studies for Development of Contact Management Controllers**

The experimental setup was evaluated with the force/position hybrid control architectures of [4,5] for contact management. The aim of the force controller is to keep the end-effector of the robot manipulator in perpendicular contact with the environment by applying a linear reference force in the *z* translational direction (a dynamically changing direction, always perpendicular to the unknown surface) and zero torque in the roll (*α*) and pitch (*β*) rotational directions of the local (tool) frame. The position controller, in turn, enables the end-effector to follow the desired motion in the *x* and *y* directions of the local frame. In these hybrid control methods, the force and position controls are designed independently in dynamically changing local frame directions, according to the shape of the surface, to generate the end-effector velocity commands in each iteration. This approach is an adaptation of the operational space formulation proposed in [29]. The control strategy in [4] is for fixed-base robot manipulators, where a standard proportional (P) controller was used to control the perpendicular force interaction and surface trajectory tracking. In [5], taking into account the unknown disturbance effect of the floating base vehicle on the position of the robot manipulator, the control architecture is enhanced with an admittance control approach.
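The local-frame decomposition described above can be sketched as follows, with a P force law on the tool z-axis and a PD position law in the tool x/y-plane. The gains are placeholders for illustration, not the tuned values of the study:

```python
import numpy as np

def hybrid_velocity_command(f_meas, f_des, pos_err, vel_err,
                            kf=0.05, kp=1.0, kd=0.1):
    """One iteration of a hybrid force/position law in the local (tool) frame.

    Force is regulated along the tool z-axis (perpendicular to the surface)
    with a proportional force controller; position is tracked in the tool
    x/y-plane with a PD controller. Returns a translational velocity command."""
    v = np.zeros(3)
    v[:2] = kp * pos_err[:2] + kd * vel_err[:2]   # position-controlled subspace
    v[2] = kf * (f_des - f_meas)                  # force-controlled direction
    return v

# Pressing too lightly (-1.5 N measured vs -2 N desired) while slightly off
# the raster track in x/y:
cmd = hybrid_velocity_command(f_meas=-1.5, f_des=-2.0,
                              pos_err=np.array([0.01, -0.02, 0.0]),
                              vel_err=np.zeros(3))
# cmd pushes toward the surface (negative z) while correcting the x/y error.
```

The key design choice is that the two subspaces never compete: each Cartesian direction of the tool frame is owned by exactly one controller, which is re-oriented every iteration as the surface normal changes.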

The proposed system was evaluated in three application scenarios, in each of which the platform was commanded to carry out distinct motions (*i*: no movement of the Stewart platform; *ii*: sinusoidal movement in each Cartesian direction, with a position change of 0.1sin(2*πTt*) m in the *x*, *y*, *z* translational directions and 0.1sin(2*πTt*) rad in the *α*, *β*, *γ* rotational directions, with sampling period *T* = 8 ms; *iii*: the actual Cartesian pose of a real ROV submerged in a water tank). In Scenarios II and III, the Rexrov2's position in the simulator is also added to the movement of the Stewart platform to account for the disturbance effects of hydrodynamics and contact interaction on the base vehicle movement. In all scenarios, the performance of the admittance controller [5], the P force controller of [4], and the PID force controller is compared. It should be noted that, while these force controllers are separately implemented in the end-effector's *z* translational and *α*, *β* rotational directions, the same PD position controller is simultaneously applied in the end-effector's *x* and *y* translational directions in all scenarios. For the admittance controller handling the perpendicular force contact interaction, the virtual inertia and damping coefficients were chosen as 0.5 kg and 100 Ns/m, respectively. For comparison with the P force controller, the PID control gains were set to *K<sub>P</sub>* = 0.05, *K<sub>D</sub>* = 0.5, and *K<sub>I</sub>* = 0.002.
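A minimal discrete admittance law along the perpendicular direction, using the virtual inertia (0.5 kg) and damping (100 Ns/m) quoted above, might look like the following. The explicit-Euler integration scheme is an assumption for illustration:

```python
class AdmittanceController:
    """Admittance law m*a + b*v = f_err along the tool z-axis.

    The virtual inertia (0.5 kg) and damping (100 Ns/m) follow the values
    quoted in the text; the discrete integration scheme is an assumption."""

    def __init__(self, m=0.5, b=100.0, dt=0.001):
        self.m, self.b, self.dt = m, b, dt
        self.v = 0.0  # commanded velocity along the perpendicular direction

    def step(self, f_meas, f_des):
        f_err = f_des - f_meas
        a = (f_err - self.b * self.v) / self.m  # virtual dynamics acceleration
        self.v += a * self.dt
        return self.v

ctrl = AdmittanceController()
for _ in range(100):
    v = ctrl.step(f_meas=0.0, f_des=-2.0)  # not yet in contact: approach surface
# The velocity settles near the quasi-steady value f_err / b = -0.02 m/s.
```

Because the commanded velocity emerges from virtual mass-damper dynamics rather than directly from the force error, sudden force spikes (e.g. from base disturbances) are low-pass filtered, which is consistent with the smoother contact behaviour reported for the admittance controller.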

#### *6.1. Application Scenario-I*

In the first experiment, the platform is fixed in the global frame to provide a benchmark. The P, PID, and admittance controllers are separately implemented on the manipulator for force control. As shown in Figures 5a,b and 6a, the end-effector closely tracks the pre-specified trajectory as projected on the 3D surface, and Figure 7c illustrates that it continuously applies the desired force of −2 N on the pipe surface.

**Figure 5.** Experimental results of the admittance controller in Scenario-I: (**a**) the 2D pre-specified trajectory on XY plane versus the 2D projection of the 3D trajectory tracking, (**b**) the 3D actual end-effector trajectory on pipe.

**Figure 6.** Trajectory drawing pictures on the pipe (the admittance controller was implemented): (**a**) fixed-based manipulator in Scenario-I, (**b**) floating-based manipulator in Scenario-II.

**Figure 7.** Comparative results of the F/T measurements in Scenario-I: (**a**) the P controller, (**b**) the PID controller, (**c**) the admittance controller.

#### *6.2. Application Scenario-II*

In this scenario, a pre-defined sinusoidal Cartesian position, along with the Rexrov2 movement due to the hydrodynamic and contact interaction forces affecting the base, is commanded to the platform. The purpose here is to observe the manipulator behaviour when the vehicle is subject to a known (sinusoidal) disturbance movement, without the complicated disturbance movement of realistic underwater data. The P, PID, and admittance controllers are separately implemented on the manipulator for force control, and the results of the three force control methods are compared; see Figures 6b and 8. The sinusoidal movement deviates the end-effector trajectory from the intended raster movement. However, as expected, the end-effector remains in contact with the pipe, staying perpendicular to the surface and applying a force in the *z* direction (Figure 9c), no matter how far it deviates from the pre-specified trajectory in the *x* and *y* directions (Figures 6b and 8).

**Figure 8.** Experimental results of the admittance controller in Scenario-II: (**a**) the 2D pre-specified trajectory on XY plane, (**b**) the 3D vehicle movement as disturbance effects to the robot manipulator, (**c**) the actual end-effector trajectory on pipe with respect to the global frame.

#### *6.3. Application Scenario-III*

In this scenario, the Stewart platform moves according to the actual Cartesian pose of the real ROV in the water tank plus the Rexrov2 movement in the Gazebo simulator due to the contact interaction disturbance. As in the previous scenarios, the P, PID, and admittance controllers are separately implemented on the floating-base manipulator system. Figures 10 and 11c show the 3D actual end-effector trajectories on the pipe for the admittance controller. The movement of the Stewart platform disturbs the base of the KUKA manipulator, but the admittance controller still keeps the end-effector perpendicularly in contact with the pipe, as shown in Figure 12c, and completes the trajectory tracking within the working space of the pipe surface. However, unlike in Scenarios I and II, the P and PID controllers fail to maintain continuous end-effector contact in the presence of the realistic disturbances. Since the Stewart platform mimics the ROV motions through the water wave disturbance and the contact interaction disturbance, the actual end-effector trajectory differs from the pre-specified trajectory.

**Figure 9.** Comparative results of the F/T measurements in Scenario-II: (**a**) the P controller, (**b**) the PID controller, (**c**) the admittance controller.

#### *6.4. Discussion*

Before the evaluations, the PID gains were tuned to obtain the best possible performance. The main challenge was managing the trade-off between stability in contact and fast recovery after loss of contact with the surface. For instance, when contact between the end-effector and the pipe surface was lost, a low P gain caused the controller to take significant time to recover contact; conversely, when the end-effector was in contact with the pipe, a large P gain caused instability and frequent cycles of loss and recovery of contact. Therefore, the PID gains that outperformed the pure P controller were identified by trial and error. While the base of the robot manipulator is constantly in motion, the end-effector, equipped with a highly sensitive force sensor, is in continuous interaction with an object of unknown surface shape and moves constantly in all directions. The vibrations visible in Figures 6, 9 and 11 therefore originate mainly in the measurements of this very sensitive force sensor during the perpendicular surface interaction; one advantage of the force controllers is that these vibrations are minimized.

In Scenario-I, since there is no disturbance of the base movement, the continuous contact and trajectory tracking of the end-effector are achieved as expected. The mean square force errors (*f<sub>z</sub>* − *f<sub>d,z</sub>*)<sup>2</sup> and the standard deviations in the perpendicular *z* direction are given in Table 1 for each experimental scenario. Scenario-I constitutes a reference against which the impact of disturbances of the base platform can be compared. Both the P controller proposed in [4] and the PID controller designed in this study performed as well as the admittance controller proposed in [5] (see Figure 7 and Table 1 (I)). In Scenario-II, however, the results with the admittance controller improved significantly in comparison with those of the P and PID controllers (see Figure 9 and Table 1 (II)).
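The error metric of Table 1 can be reproduced on any logged force trace. The sketch below computes the mean and standard deviation of the squared force error on synthetic data (the data are illustrative, not the experimental measurements):

```python
import numpy as np

def force_error_stats(f_meas, f_des):
    """Mean and standard deviation of the squared force error (f_z - f_dz)^2,
    as used to compare the controllers across scenarios."""
    sq_err = (np.asarray(f_meas) - f_des) ** 2
    return sq_err.mean(), sq_err.std()

# Synthetic measurements around the -2 N reference, for illustration only.
rng = np.random.default_rng(0)
f = -2.0 + 0.1 * rng.standard_normal(1000)
mse, sd = force_error_stats(f, f_des=-2.0)
# mse is close to the noise variance (0.01 N^2 here); sd quantifies the
# spread of the squared error, as reported per controller in Table 1.
```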

**Figure 10.** Experimental results of the admittance controller in Scenario-III: (**a**) the 2D pre-specified trajectory on the XY plane, (**b**) the 3D vehicle movement as disturbance effects to the robot manipulator, (**c**) the actual end-effector trajectory on pipe with respect to the global frame.

In the realistic Scenario-III, the controller must handle base movements that change suddenly in different directions as the actual ROV moves in the water. From Figure 12 and Table 1 (III), the deviation from the reference value is significantly smaller with the admittance controller than with the other two controllers. In this scenario, occasional losses of contact with the pipe were observed with all three controllers (see Figure 11). However, the total duration of contact loss is much shorter with the admittance controller (it is not even observable in the marker trace on the pipe in Figure 11c), while Figure 11a,b clearly show significant losses of contact with the P and PID controllers.

In summary, as seen in Figure 12 and Table 1, the admittance controller yields a lower mean squared force error and standard deviation than the P and PID controllers in both the fixed-base (I) and floating-base (II and III) experimental scenarios. Most importantly, the disturbance effects caused by the floating real ROV and the simulated ROV under contact interaction are compensated much better by the admittance controller than by the P and PID controllers.

**Figure 11.** Trajectory drawing pictures on the pipe for floating-based manipulator in Scenario-III: (**a**) implementation for the P controller, (**b**) implementation for the PID controller, (**c**) implementation for the admittance controller.

**Figure 12.** Comparative results of the F/T measurements in Scenario-III: (**a**) the P controller, (**b**) the PID controller, (**c**) the admittance controller.

**Table 1.** Mean and standard deviations of the squared force errors on the *z*-direction and the total loss of contact duration from the first contact to the end of the trajectory for the P, PID, and admittance controllers in experimental scenarios.


#### **7. Conclusions**

This study demonstrated that force/position control approaches for the physical interaction of a UVMS with underwater structures can be developed with an experimental robotics setup in a dry laboratory environment, allowing fast and numerous trial experiments. The experimental setup consists of three sub-setups. First, we used realistic underwater vehicle movements, transmitted to the system in real time or pre-recorded in a water tank environment. Second, we simulated the water current impact on the floating base vehicle movement, considering both hydrodynamic and contact interaction effects. Third, we used a physical robotic setup with a Stewart platform and a robotic arm manipulator to emulate a UVMS. We demonstrated the use of this system to conduct experiments comparing control schemes for underwater asset inspection without lengthy and costly underwater experiments. In particular, we showed that an admittance control scheme performs better than conventional P and PID controllers for contact and force management in interaction with an unknown surface.

**Author Contributions:** Conceptualization, K.C., H.T. and M.S.E.; methodology, K.C., H.T. and M.S.E.; software, K.C. and L.N.; validation, K.C. and H.T.; formal analysis, K.C., H.T. and M.S.E.; investigation, K.C.; resources, Y.P. and M.D.; data curation, K.C.; writing—original draft preparation, K.C.; writing—review and editing, K.C., H.T., Y.P., L.N. and M.S.E.; visualization, K.C.; supervision, M.S.E.; project administration, Y.P. and M.S.E.; funding acquisition, Y.P. and M.S.E. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by the UK Engineering and Physical Sciences Research Council (EPSRC) through the ORCA Hub project with Grant No. EP/R026173/1.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors have no conflicts of interest relevant to the content of this article. All data, materials, and codes are available; the codes were produced with open-source software. This manuscript has not been published and is not under consideration for publication elsewhere.

#### **References**


### *Article*

### **Fine Alignment of Thermographic Images for Robotic Inspection of Parts with Complex Geometries**

**Carmelo Mineo <sup>1,</sup>\*, Nicola Montinaro <sup>2</sup>, Mario Fustaino <sup>2</sup>, Antonio Pantano <sup>2</sup> and Donatella Cerniglia <sup>2</sup>**


**\*** Correspondence: carmelo.mineo@icar.cnr.it; Tel.: +39-091-680-9720

**Abstract:** Increasing the efficiency of the quality control phase in industrial production lines through automation is a rapidly growing trend. In non-destructive testing, active thermography techniques are known for their suitability to allow rapid non-contact and full-field inspections. The robotic manipulation of the thermographic instrumentation enables the possibility of performing inspections of large components with complex geometries by collecting multiple thermographic images from optimal positions. The robotisation of the thermographic inspection is highly desirable to improve assessment speed and repeatability without compromising inspection accuracy. Although integrating a robotic setup for thermographic data capture is not challenging, the application of robotic thermography has not grown significantly to date due to the absence of a suitable approach for merging multiple thermographic images into a single presentation. Indeed, such an approach must guarantee accurate alignment and consistent pixel blending, which is crucial to facilitate defect detection and sizing. In this work, an innovative inspection platform was conceptualised and implemented, consisting of a pulsed thermography setup, a six-axis robotic manipulator and an algorithm for image alignment, correction and blending. The performance of the inspection platform is tested on a convex-shaped specimen with artificial defects, which highlights the potential of the new combined approach. This work bridges a technology gap, making thermographic inspections more deployable in industrial environments. The proposed fine image alignment approach can find applicability beyond thermographic non-destructive testing.

**Keywords:** robotics; thermography; non-destructive testing; image alignment; image blending

**Citation:** Mineo, C.; Montinaro, N.; Fustaino, M.; Pantano, A.; Cerniglia, D. Fine Alignment of Thermographic Images for Robotic Inspection of Parts with Complex Geometries. *Sensors* **2022**, *22*, 6267. https://doi.org/10.3390/s22166267

Academic Editor: Ruqiang Yan

Received: 11 July 2022; Accepted: 17 August 2022; Published: 20 August 2022

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

Non-destructive Testing (NDT) comprises highly multidisciplinary groups of analysis techniques used throughout science and industry to evaluate material properties and ensure the integrity of components and structures without causing damage to them [1]. In civil and industrial manufacturing, the increasing deployment of smart/composite materials demands high integrity and traceability of NDT measurements, combined with rapid data throughput. Traditional manual inspection approaches are insufficient in some scenarios, since they create a manufacturing process bottleneck [2]. There are therefore fundamental motivations for increasing automation in NDT. Computer-Aided Design (CAD) has been used extensively in engineering design phases, and Computer-Aided Manufacturing (CAM) allows large components to be produced quickly through combinations of traditional subtractive approaches and novel additive manufacturing processes [3]. As a result, large components with complex geometries have become very common in modern structures. NDT inspection is still often performed manually by technicians who must typically move appropriate probes over the contour of the part surface. Manual scanning requires trained technicians and results in a prolonged inspection process for large samples; automation of NDT is required to cope with the inspection of such structures.

Robotic manipulation of NDT sensors also plays an essential role in inspecting parts made of composite materials. A fundamental issue with composite components is that parts designed as identical can deviate significantly from the CAD model: composite parts suffer from inherent but different part-to-part springiness out of the mould, which presents a significant challenge for deploying precision NDT measurements. While manual scanning may remain a valid approach for some specific areas of a structure, developing reliable automated solutions has become an industry priority to drive down inspection times and costs. An industrial robot is an automatically controlled, reprogrammable, multipurpose manipulator programmable in three or more axes [4]. Many manufacturers of industrial robots have produced robotic manipulators with excellent positional accuracy and repeatability, and some modern robots have attributes suitable for developing automated NDT systems: precise mechanics, the possibility to accurately master each joint and the ability to export positional data at high update rates. The key challenges in developing a robotic NDT system include integrating the NDT instrumentation with the robotic manipulator, creating a suitable robot inspection path for the part under inspection, and developing software for NDT data collection and visualisation. These challenges have been addressed by several applications of six-axis robotic arms for the inspection of parts through automated ultrasonic techniques [5–7]. Robotic ultrasonic inspection has become commonplace thanks to research investments, driven by the aerospace sector, into the suitability of ultrasonic techniques for inspecting critical aerospace components. Some works have presented robotic ultrasonic inspection systems capable of achieving high data throughput, accompanied by bespoke software for data visualisation and analysis [5,8]. Automated geometry mapping has also been demonstrated using robotically manipulated metrology sensors [9].

Besides these techniques, other types of inspections have not reached the same level of robotisation; this is the case for thermographic testing, also known as thermal imaging, infrared (IR) thermography or simply thermography. It is an NDT imaging technique that allows the visualisation of heat temporal patterns in an object or a scene and is based on the principle that two dissimilar materials possessing different thermophysical properties produce two distinctive thermal signatures that can be revealed by an infrared sensor, such as an IR thermal camera [10–12]. Although a thermographic setup in reflection mode, with a heat excitation source and an IR camera on the same side of the part under inspection, is not well suited to detect defects located deep in the volume of a component, it presents some advantages over ultrasonic-based inspections. It is contactless and full-field, meaning that the whole area of a component detectable within the field of view of an IR camera is inspected remotely at once. Schmidt and Dutta [13] proposed using industrial robots as manipulators to perform active thermography in 2012. The robotic manipulation of the thermographic instrumentation can enable the possibility of performing inspections of large components with complex geometries by collecting multiple thermographic images at given positions. Despite preliminary investigations [13,14], the robotisation of the thermographic inspection method has not been fully exploited to date due to the lack of a suitable approach capable of aligning automatically-collected thermographic images. The importance of consistent registration of NDT data in CAD models is highlighted in [15]. Aligning thermographic images for NDT analysis is not trivial since accurate and consistent pixel blending must be guaranteed and is crucial to facilitate defect detection and sizing. 
Inaccurate alignment and blending may create spurious artefacts in the composite thermographic image and cause false-positive flaw detections. In this work, an innovative inspection platform was conceptualised and implemented, consisting of a pulsed thermography setup, a six-axis robotic manipulator and a novel algorithm for image transformation, alignment and blending. The performance of the inspection platform is tested on a convex-shaped specimen with artificial defects, highlighting the potential of the new combined approach.

The remaining part of this work is organised as follows. Section 2 reviews the theoretical principles of thermographic inspection and provides scientific literature references. Following a detailed clarification of the origin of the misalignment in robotically-acquired images and the limitations of existing image alignment algorithms, Section 3 describes the novel image alignment and blending algorithm developed by this work. Section 4 introduces the automatic thermography setup used to validate the proposed method. Section 5 presents the experimental results. The outcomes of this work and the method's performance and prospects are discussed in Section 6.

#### **2. Thermography Principles**

Thermography, as introduced above, can be deployed through different techniques [16]. The essential equipment for manual (not automated) thermography includes an IR camera, a computer to record (and sometimes process) data and a monitor to display images. The main classification of the thermographic techniques differentiates between passive and active techniques. Passive thermography exploits the fact that materials and structures may naturally be at different (higher or lower) temperatures than the background. For example, the human body is generally at a higher temperature than the ambient; hence it is easily detected by an IR camera without additional stimulation. Conversely, an external stimulus is needed in active thermography to produce a thermal contrast in the object's surface. Active techniques are particularly suited to non-destructive testing since an object containing internal defects (such as voids, delaminations and/or inclusions of foreign material) will require the excitation of thermal disequilibrium to produce a distinctive surface thermal signature detectable with an IR camera. In the realm of active thermographic techniques, pulsed thermography (PT) has broad applicability in NDT. When an object's surface is heated through a short (a few milliseconds) energy pulse of light radiation, a series of thermal waves with different amplitudes and frequencies propagate inside the object medium in a transient mode. The surface temperature is monitored under the principle that defective areas cool down (or heat up) at a different rate than non-defective areas [17–19]. It is known that the thermal wave originating from the energy pulse can be decomposed into a multitude of individual sinusoidal components and that it is possible to link temporal and frequency domains. In pulsed phase thermography (PPT), the PT is combined with the phase and frequency concepts of lock-in thermography (LT), where specimens are subject to a periodical excitation [12,20–22]. 
Flash lamps generate a heat pulse of high intensity and short duration. The subsequent temperature decay is then acquired over a truncation window.

Once raw data are collected, there are multiple techniques to analyse the data. One approach consists of calculating the Discrete Fourier Transform (DFT) to evaluate the thermal response's frequency content. The phase of specific harmonic content can finally be obtained and presented as a phasegram, an image where the scalar value associated with each pixel represents the phase. Any discontinuity in phase contrast is either caused by the object geometry or indicates a potential flaw. In the PPT approach, whereas deeper anomalies are expected to be better contrasted in low-frequency phasegrams, high-frequency phasegrams probe better for superficial issues. The signal normalisation inherent in evaluating the phase is also expected to reduce the counter effects of non-uniform heat deposition and environmental reflections [23]. It must be noted that the terms phasegram(s) and thermographic image(s) are used as synonyms in the remainder of this paper.
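The phase computation described above can be sketched in a few lines. The following is a minimal illustration under simplifying assumptions (a synthetic exponential cooling sequence, NumPy's FFT); the function name `phasegram` and the decay constants are hypothetical, not the authors' implementation.

```python
import numpy as np

def phasegram(frames, harmonic=1):
    """Phase image from a thermal sequence of shape (time, height, width)."""
    spectrum = np.fft.fft(frames, axis=0)  # DFT of every pixel's cooling curve
    return np.angle(spectrum[harmonic])    # phase of the chosen frequency bin

# Synthetic 64-frame sequence over a 2 x 2 patch: the pixel above a flaw
# cools more slowly than sound material, which shifts its phase
t = np.linspace(0.0, 1.0, 64)
frames = np.empty((64, 2, 2))
frames[:] = np.exp(-8.0 * t)[:, None, None]  # sound-material decay
frames[:, 0, 0] = np.exp(-4.0 * t)           # slower decay above the flaw
phase = phasegram(frames)                    # 2 x 2 phasegram
```

The phase contrast between pixel (0, 0) and the sound pixels is what a phasegram reveals; per the discussion above, low harmonics probe deeper and high harmonics probe nearer the surface.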

#### **3. Fusion of Multiple Thermographic Images**

#### *3.1. Misalignment Issue in Robotically-Acquired Images*

Robotic NDT inspections generally occur in a well-structured environment, where the part position is precisely registered with respect to the robot reference system. Great care is dedicated to ensuring the robot tool path is accurately referenced to the sample reference frame to ensure effective data collection during automated inspections [24]. Despite these efforts, a deviation between the actual tool path and the ideal tool path always remains due to the following reasons: (i) the physical tolerances in the robot joints; (ii) the geometric deviations in the mounting support of the sensing instrumentation; (iii) the residual inaccuracy in the calibration of the part position; (iv) the deviation of the actual sample geometry from its digital counterpart. For these reasons, the resultant data usually reveal some imperfect alignment when they are encoded through robot positional feedback and plotted in the form of a single map. For robotic thermographic inspections, the problem translates to evident misalignment of the thermographic images. The issue may be mitigated through an external metrology tracking system (e.g., a six-DoF laser tracker), capable of measuring the position of the sensing instrumentation with respect to an absolute reference frame. However, such metrology systems are expensive and can increase the overall complexity of robotic inspection systems. In robotic machine vision systems, the exact position of the camera with respect to the robot mounting point is calibrated through the hand-eye calibration method [25], which is based on the knowledge of the camera's intrinsic parameters, such as focal length, aperture, field-of-view and resolution, and on the capture of a calibration pattern (e.g., a checkerboard) from different viewpoints. However, this method is not always applicable to thermographic cameras since they do not usually have a visible-light imaging sensor (RGB sensor). A similar method based on calibration patterns with different thermal infrared emissivity could be adopted to calibrate IR cameras [26]. This work developed a practical solution consisting of correcting each image's plotting location and its perspective aberrations to obtain a misalignment-free full-field view of the inspected sample. The remainder of this section explains the limitations of available image-stitching algorithms and the theoretical foundations of the method proposed herein.

#### *3.2. Limitations of Existing Alignment Methods*

Algorithms for aligning images and stitching them into seamless photo-mosaics are among the oldest and most widely used in computer vision. The alignment of images requires establishing mathematical relationships that map pixel coordinates from the unaligned images to their aligned versions. Five parametric 2D planar transformations have been defined [27] (see Figure 1). Each of these transformations can be described by a transformation matrix ($\tau(\mathbf{p})$), with $\mathbf{p}$ being a vector of parameters. Pure translation can be written as $\mathbf{x}' = \mathbf{x} + \mathbf{t}$ or $\mathbf{x}' = \tau(\mathbf{p})\cdot\mathbf{x} = \begin{bmatrix} \mathbf{I} & \mathbf{t} \end{bmatrix}\cdot\mathbf{x}$, where $\mathbf{x} = \{x, y, 1\}$ is the vector of coordinates of the untransformed image pixel and $\mathbf{x}'$ denotes the coordinates of the same pixel in the transformed image, $\mathbf{I}$ is the (2 × 2) identity matrix, and $\mathbf{t} = \{t_x, t_y\}$ is the translation vector, containing two parameters (respectively, the translation along the *x*-axis and the translation along the *y*-axis). The Euclidean transformation is written as $\mathbf{x}' = \tau(\mathbf{p})\cdot\mathbf{x} = \begin{bmatrix} \mathbf{R} & \mathbf{t} \end{bmatrix}\cdot\mathbf{x}$, where $\mathbf{R}$ is the 2D rotation matrix. Thus, the Euclidean transformation depends on three parameters: $t_x$, $t_y$, and an angle $\theta$ (for the rotation matrix). Euclidean distances are preserved. The similarity transformation, also known as scaled rotation, preserves angles between lines. It is expressed as $\mathbf{x}' = \begin{bmatrix} s\mathbf{R} & \mathbf{t} \end{bmatrix}\cdot\mathbf{x}$, where $s$ is the scale parameter that brings the parameter count to four. It must be noted that $s$ is a scalar and the scaling operation is intended to be isotropic. The affine transform is written as $\mathbf{x}' = \tau(\mathbf{p})\cdot\mathbf{x} = \mathbf{A}\cdot\mathbf{x}$, where $\mathbf{A}$ is an arbitrary 2 × 3 matrix with six parameters. Parallel lines remain parallel under affine transformations. The projective transformation, also known as perspective or homography, is expressed as $\mathbf{x}' = \tau(\mathbf{p})\cdot\mathbf{x} = \mathbf{H}\cdot\mathbf{x}$, where $\mathbf{H}$ is an arbitrary 3 × 3 matrix:

$$\mathbf{x}' = \begin{bmatrix} h_{00} & h_{01} & h_{02} \\ h_{10} & h_{11} & h_{12} \\ h_{20} & h_{21} & 1 \end{bmatrix} \cdot \mathbf{x} \tag{1}$$

Thus, perspective transformation requires eight parameters and preserves straight lines.

**Figure 1.** The basic set of 2D planar transformations (Reprinted with permission from Ref. [28]. 2007, now publishers inc).
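In homogeneous coordinates, the transformations above reduce to 3 × 3 matrix products. The sketch below is a minimal illustration (helper names and numeric values are assumptions, not from the paper); note that after a projective transformation the result must be divided by its third component.

```python
import numpy as np

def translation(tx, ty):               # 2 parameters
    return np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])

def euclidean(theta, tx, ty):          # 3 parameters, preserves distances
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]])

def similarity(scale, theta, tx, ty):  # 4 parameters, preserves angles
    m = euclidean(theta, tx, ty)
    m[:2, :2] *= scale                 # isotropic scaling of the rotation block
    return m

def homography(h):                     # 8 parameters (h22 fixed to 1)
    return np.append(np.asarray(h, dtype=float), 1.0).reshape(3, 3)

x = np.array([10.0, 5.0, 1.0])         # pixel in homogeneous form {x, y, 1}
xt = translation(2.0, 3.0) @ x         # translated pixel
xh = homography([1, 0, 0, 0, 1, 0, 0.001, 0]) @ x
xh_cartesian = xh[:2] / xh[2]          # perspective division
```

An affine matrix would simply replace the top two rows with six free parameters, as in the text.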

Assuming the choice of a suitable motion model to transform each image, a typical strategy to align a collection of images consists of aligning the images in pairs. In order to align a pair of images, it is necessary to devise a method to estimate the parameters that apply the selected transformation to one image while the other is kept fixed. One approach is to shift or warp the first image relative to the other and measure how much the pixels agree. The first methods to quantitatively measure such agreement are often called "direct methods", based on pixel-to-pixel matching [29]. These methods are usually slow since the number of pixel pairs to evaluate can be very large. Whereas direct methods work by directly minimising pixel-to-pixel dissimilarities, a different class of algorithms works by extracting a sparse set of features and then matching these to each other [27,30,31]. Feature-based approaches have the advantage of being more robust against scene movement, are potentially faster, and can be used to automatically discover the adjacency (overlap) relationships among an unordered set of images [32].

Although feature-based approaches work well to create panoramas of scenes with enough distinguishable features, they are not suited to align multiple images for NDT applications. Non-destructive testing aims to detect defects in parts and/or structures. As such, besides the presence of intrinsic geometrical details (e.g., borders and corners), most images may appear relatively featureless since the presence of defects is not the norm. An attempt to use a feature-based alignment approach was presented in [33], where the authors note the need to mark artificial points on the background of a test objective to obtain the mapping matrix from two-dimensional (2D) thermal wave imaging data to the 3D spatial coordinates of the digital model. On the other hand, feature-based approaches can also fail if plenty of spatially periodic features are present in the images, which can be the case for industrial components due to stiffeners/stringers, heat dissipators and/or fixturing holes. Direct methods are less prone to failure caused by a lack of image features or an abundance of periodicity since they can leverage any consistent low-contrast gradient to find the optimum image transformation parameters. However, the scientific literature does not show any solution readily available to work with the scalar information present in each pixel of thermographic images. As stated above, thermographic images differ from RGB or grayscale images since the pixel values may represent phases (expressed in degrees or radians) and may be negative. Moreover, the optimum solution to align and stitch multiple thermographic images cannot proceed pairwise. Pairwise alignment can work for images taken in a single row, as in the case of a horizontal panorama, but robotic thermographic inspection generally collects images through a raster tool-path, with multiple images arranged in multiple passes. A pairwise image-stitching algorithm would produce a visible drift between adjacent passes due to the progressive accumulation of alignment errors.

#### *3.3. Fine Pixel-Based Alignment Method*

This work developed a direct method capable of simultaneously aligning multiple images. The method is suitable to be used when the rough position of the camera (the shooting pose of each image) is known. That is the case for robotically acquired images, where the camera position is obtained from the robot's positional feedback. Given a set of images, knowledge of the camera shooting poses allows skipping the search for the adjacency relationships among the set. Knowing the scale factor makes it possible to convert the pixel index coordinates to real-world coordinates and identify the overlap between the images. The scale factor can easily be calculated by measuring the size of a known object or the known distance between two points in an image in terms of the number of pixels and considering the actual length it represents. Therefore, the algorithm herein is specifically targeted to perform a fine alignment of all images in the set. It is referred to as the Fine Pixel-based Alignment Method (FiPAM). It must be noted that FiPAM is currently suitable for aligning multiple mosaic images of a sample surface that curves only in one direction. Although the constraint of single-direction curvature is a significant limitation, it does not impede using FiPAM for mosaic images of any surface belonging to the large family of cylindrical surfaces intended as "generalised cylindrical surfaces" [34]. Under that condition, all collected images can be transposed to a planar domain. Indeed, any cylindrical surface can be represented in the plane by "unrolling" it on a flat surface. An additional assumption is that the part surface captured within the camera field of view is sufficiently close to a flat plane. In other words, the ratio between the local surface curvature and the camera field of view must be small. Figure 2 illustrates a set of nine images used to explain the theoretical foundations of FiPAM.

**Figure 2.** Schematic representation of a set of nine images used to explain the theoretical foundations of FiPAM.
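The pixel-to-real-world conversion mentioned above amounts to a single ratio. A minimal sketch (the reference lengths are illustrative numbers, not the paper's data):

```python
def scale_factor(known_length_mm, length_px):
    """Millimetres per pixel, from a feature of known physical size."""
    return known_length_mm / length_px

# e.g., a 30 mm reference edge spanning 400 pixels in the image
mm_per_px = scale_factor(30.0, 400.0)
frame_width_mm = 640 * mm_per_px  # real-world width of a 640-pixel frame
```

With the scale factor known, pixel index coordinates of every image can be placed in a common real-world plane to find the overlaps.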

Direct methods find the optimum alignment between a pair of images by an iterative search, where one image is transformed with respect to the other through one of the five planar transformations. To use a direct method, a suitable error metric must first be chosen to measure the goodness of the alignment. Given two images, with one image ($I_0(\mathbf{x})$) taken as a reference image sampled at discrete pixel locations ($\mathbf{x}_k = \{x_k, y_k, 1\}$), with $k$ being the pixel index, we wish to find the optimum transformation parameters that align it with the second image ($I_1(\mathbf{x})$), which is kept fixed. The error metric is defined as the sum of squared differences (SSD) of the pixel values of $I_1$ at the transformed pixel locations and the reference values of $I_0$. This kind of function has been successfully used in the image processing literature, with different aims (e.g., inpainting [35]). Given a transformation ($\tau(\mathbf{p})$), with $\mathbf{p}$ being a vector of parameters, we have:

$$SSD(\tau(\mathbf{p})) = \sum_{k} \left[ I_1(\tau(\mathbf{p}) \cdot \mathbf{x}_k) - I_0(\mathbf{x}_k) \right]^2 \tag{2}$$

The optimum set of parameters (*p*∗) can be found by solving a least-squares problem of this SSD function. Since the transformation allows multiple degrees of freedom (DoFs) for the image, this is a multi-parameter problem. Therefore, a suitable search technique must be devised. The most straightforward technique would be to exhaustively try all possible alignments (full search). In practice, this would be too slow and is not practicable. Several works have developed hierarchical coarse-to-fine search techniques based on image pyramids [27] when the approximate image alignment is unknown. In this work, since the approximate position of each image is assumed to come from the known camera pose, it has been decided to limit the search space by setting lower and upper bounds for the transformation parameters.
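As an illustration of such a bounded search, the sketch below recovers a pure translation between a fixed image and a shifted copy by minimising an Equation (2)-style SSD with `scipy.optimize.least_squares`, whose `bounds` argument constrains the parameter space. The synthetic image, bound values and helper names are assumptions for the example, not the paper's implementation.

```python
import numpy as np
from scipy.ndimage import map_coordinates, shift
from scipy.optimize import least_squares

def residuals(p, ref, moving, rows, cols):
    """Pixel differences after translating `moving` by p = (tx, ty)."""
    tx, ty = p
    warped = map_coordinates(moving, [rows + ty, cols + tx],
                             order=3, mode="nearest")
    return warped - ref[rows.astype(int), cols.astype(int)]

# Smooth synthetic image and a copy shifted by (ty, tx) = (2, 1) pixels
yy, xx = np.mgrid[0:40, 0:40].astype(float)
ref = np.sin(xx / 6.0) + np.cos(yy / 5.0)
moving = shift(ref, (2.0, 1.0), order=3)

# Evaluate the SSD only on interior pixels, away from boundary effects
rows, cols = np.mgrid[8:32, 8:32]
rows = rows.ravel().astype(float)
cols = cols.ravel().astype(float)

# Lower/upper bounds on (tx, ty) restrict the search to the expected
# maximum misalignment range, as discussed in the text
sol = least_squares(residuals, x0=[0.0, 0.0],
                    bounds=([-4.0, -4.0], [4.0, 4.0]),
                    args=(ref, moving, rows, cols))
```

Here `sol.x` converges to the translation that undoes the synthetic shift; richer motion models (similarity, affine, homography) add parameters to `p` in the same scheme.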

Regarding the set of images in Figure 2, Figure 3 illustrates all the overlaps between image #4 and its neighbour images. Given a positive scalar value, herein named "offset" ($o$), it is possible to draw shrunk overlap areas whose boundary is at distance $o$ from the boundary of the original overlap areas. The actual value to use for $o$ depends on the expected maximum extent of misalignment caused by the inaccuracy in robotic manipulation of the camera and by the deviations in the physical camera support. Assuming these offset areas move with image #4 and the original overlap areas stick with the parent neighbour image, the bounds of the transformation parameters guarantee that the offset overlap areas remain within the original overlap footprints. Generalising Equation (2) to allow simultaneous alignment of multiple images, FiPAM is based on the following SSD function.

**Figure 3.** Illustration of all overlap areas between image #4 and neighbour images. The magnified region serves to clarify the relationship between the overlap areas and the offset overlap areas.

$$SSD(\mathbf{p}) = \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k} \left[ I_j(\tau_i(\mathbf{p}) \cdot \mathbf{x}_k) - I_i(\tau_j(\mathbf{p}) \cdot \mathbf{x}_k) \right]^2 \quad \text{with } (j \neq i) \text{ and } (k \in K_{i,j}), \tag{3}$$

where $K_{i,j}$ is the set of pixel indices that fall within the offset area produced by the overlap between the $i$th and the $j$th image. Assuming a set of $n$ images, Equation (3) is the sum of squared differences of the pixel intensity values of the $j$th image and the $i$th image. Crucially, the overlap pixel locations of the $j$th image are transformed according to the transformation matrix of the $i$th image ($\tau_i(\mathbf{p})$) and the locations of the $i$th image are transformed according to the transformation matrix of the $j$th image ($\tau_j(\mathbf{p})$).

Now, it must be noted that the vector $\mathbf{p}$ includes all the parameters required in the transformation matrices, and only a subset of it is used to compute a single transformation matrix ($\tau_i(\mathbf{p})$, with $i = 1 : n$). Moreover, the summation is not evaluated for $j = i$ (an image is always aligned with itself) and for combinations of $i$ and $j$ corresponding to images that do not overlap, where $K_{i,j}$ is an empty set. This formulation solves a typical problem with pixel-based methods, which is the possibility that parts of $I_i$ may lie outside the boundaries of $I_j$. This advantage follows directly from the constraints applied to the search space for the transformation parameters. Another aspect to discuss relates to the fact that the transformed pixel indices can be fractional, so a suitable interpolation function must be applied to evaluate the image intensities ($I_i$ and $I_j$). This work employs bi-cubic interpolation, which yields better results than bilinear interpolants [36]. It must be noted that Equation (3) does not require the image pixel values to be in a specific format. Thus, it can work with the phase values of thermographic phasegrams and with images with three RGB colour channels, although it is also possible to first transform the images into a different colour space.
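Sampling an image at fractional pixel locations via bi-cubic interpolation can be illustrated with SciPy's `map_coordinates` (spline `order=3`); this is a sketch of the operation, not the paper's code, and the ramp image is a made-up example.

```python
import numpy as np
from scipy.ndimage import map_coordinates

# A 9 x 9 ramp image: intensity(r, c) = 9 r + c
image = np.arange(81, dtype=float).reshape(9, 9)

# Transformed pixel indices are generally fractional; bicubic
# interpolation evaluates intensities between the pixel grid nodes
rows = np.array([2.0, 3.5])
cols = np.array([3.0, 4.5])
values = map_coordinates(image, [rows, cols], order=3)
```

At integer locations the interpolant returns the original pixel values; on this linear ramp the value at (3.5, 4.5) is close to 9 × 3.5 + 4.5 = 36.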

The mathematical parametric formulation of all transformation matrices pictured in Figure 1 was implemented in FiPAM. The formulation allows maximum flexibility in choosing the most suitable transformation for each image, meaning that all images in a set can be aligned using the same type of planar transformation, or each image can use a transformation of a different type. In other words, each image can be transformed by allowing different DoFs, which relate to a different number of parameters. Automating the selection of the optimum transformation for each image is out of the scope of this work. In practical situations, similarity or affine transformations produce satisfactory results if the part surface captured within the camera field of view is sufficiently close to a flat plane. Once the optimum transformation parameters are found, the aligned version of the *i*th image is computed by transforming its original discrete pixel locations with the following equation:

$$\mathbf{x}_i' = \tau_i(\mathbf{p}^*) \cdot \mathbf{x}_i \tag{4}$$

#### *3.4. Image Blending*

Aligning all images in a dataset is not sufficient to merge the images into a single composite image. Indeed, multiple aligned images may present significant differences in pixel intensities in overlapping areas. For RGB images, exposure differences are typically caused by ambient light changes during image capture. In active thermographic imaging, the same problem may be caused by the progressive increase of an object's surface temperature when it is subject to multiple heat pulses. Image blending is usually accomplished by averaging the intensity of homologue/overlapping pixels or by using more sophisticated methods, such as "Laplacian pyramid blending" [37] and "gradient-domain blending" [38]. Although these blending methods work well and have been implemented in many variants for consumer imaging (e.g., for panoramic image stitching), they cannot directly be used to blend images originating from NDT inspections. Indeed, in NDT images, it is necessary to retain the robustness of quantitative information (e.g., to perform pixel intensity comparisons) and avoid introducing any image processing artefacts. A typical challenge lies in removing low-frequency exposure variations while retaining sharp intensity gradients that may indicate the presence of small defects. In other words, it is necessary to prevent blurring. In this work, image blending has been solved through a method that preserves the valuable NDT information in each image. All pixel intensities in an image are offset by a unique value to maintain gradients unaltered. To explain this approach, Figure 4a,b provides an example of nine aligned images. The intensity discontinuity between any two overlapping images has been purposely emphasised. These example images do not contain high-contrast features, which is typical of NDT images taken of a defect-free sample.

The idea is to shift the intensity of all pixels in an image vertically by a particular corrective value. Thus, with $n$ being the number of images in the set, it is necessary to compute a vector of $n$ scalar optimum intensity correction values ($\mathbf{c}^* = \{c_1^*, c_2^*, \ldots, c_i^*, \ldots, c_n^*\}$) that simultaneously correct all images in the set. These values may be positive or negative to produce an increase or a decrease in image pixel intensities. Interestingly, this computation can again be formalised through a least-squares problem of the following SSD function:

$$SSD(\mathbf{c}) = \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{h} \left\{ \left[ I_j(\mathbf{x}_h) + c_j \right] - \left[ I_i(\mathbf{x}_h) + c_i \right] \right\}^2 \quad \text{with } (j \neq i) \text{ and } (h \in H_{i,j}), \tag{5}$$

where $H_{i,j}$ is the set of pixel indices that fall within the overlap between the aligned $i$th and $j$th image. It must be noted that the formulation of this SSD function follows the same approach used for the computation of the alignment parameters. The summation is not evaluated for $j = i$ (no intensity self-correction is required) and for combinations of $i$ and $j$ corresponding to images that do not overlap, where $H_{i,j}$ is an empty set. Since intensity correction is performed after the alignment stage, Equation (5) does not perform any image transformation. Moreover, since the problem is limited to the computation of only one scalar parameter per image, convergence to a solution for Equation (5) is obtained faster than for Equation (3). Once the optimum intensity correction values are found, the matrix of corrected pixel intensities for the $i$th image ($\widetilde{I}_i$) is computed with the following equation:

$$\widetilde{I}_i(\mathbf{x}) = I_i(\mathbf{x}) + c_i^* \tag{6}$$

**Figure 4.** Exemplification of image blending, used in FiPAM. (**a**) Aligned images with discontinuous pixel intensities; (**b**) images after correction of discontinuities; (**c**) 3D plot of uncorrected pixel intensities; (**d**) 3D plot of the corrected image set.
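Because Equation (5) is quadratic in the corrections $\mathbf{c}$, its minimiser solves a small linear system. The sketch below is an illustration under simplifying assumptions (known overlap masks, identical image shapes, hypothetical helper names): it assembles the normal equations and takes the minimum-norm solution, which fixes the free global constant so that the corrections sum to zero.

```python
import numpy as np

def intensity_corrections(images, overlaps):
    """Least-squares intensity offsets, one scalar per image (Eq. (5) style).

    images:   list of aligned 2D arrays
    overlaps: {(i, j): boolean mask of the overlap pixels between i and j}
    """
    n = len(images)
    A = np.zeros((n, n))
    b = np.zeros(n)
    for (i, j), mask in overlaps.items():
        m = float(mask.sum())
        s = float((images[j][mask] - images[i][mask]).sum())
        # Stationarity d(SSD)/dc_j = 0 gives m (c_j - c_i) + s = 0,
        # and the opposite-signed equation for c_i
        A[j, j] += m; A[j, i] -= m; b[j] -= s
        A[i, i] += m; A[i, j] -= m; b[i] += s
    # A is rank-deficient (adding a constant to all c changes nothing);
    # lstsq returns the minimum-norm solution, which sums to zero
    return np.linalg.lstsq(A, b, rcond=None)[0]

# Two fully overlapping images whose intensities differ by a constant
base = np.linspace(0.0, 1.0, 16).reshape(4, 4)
images = [base, base + 5.0]
c = intensity_corrections(images, {(0, 1): np.ones((4, 4), bool)})
corrected = [img + ci for img, ci in zip(images, c)]
```

After correction the two images coincide, with the offset split symmetrically between them.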

Once all images are aligned and their intensity is corrected, the final composite image is obtained by applying the Laplacian pyramid blending, which allows a smooth transition between images. The application of blending at the end of the procedure is admissible since it does not introduce any image artefact when pixel intensity differences are low, which is the case after the phase of image pixel correction.
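For reference, the final Laplacian pyramid blending step can be sketched with plain NumPy/SciPy. This toy version (Gaussian smoothing plus 2:1 resampling; helper names and pyramid depth are assumptions) conveys the principle rather than reproducing a production implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def _down(img):
    """Blur then 2:1 subsample (one Gaussian pyramid step)."""
    return gaussian_filter(img, 1.0)[::2, ::2]

def _up(img, shape):
    """Zero-insert then blur to approximately undo _down."""
    out = np.zeros(shape)
    out[::2, ::2] = img
    return gaussian_filter(out, 1.0) * 4.0

def laplacian_blend(a, b, mask, levels=3):
    """Blend images a and b (same shape); mask = 1 selects a, 0 selects b."""
    ga, gb, gm = [a], [b], [mask.astype(float)]
    for _ in range(levels):
        ga.append(_down(ga[-1])); gb.append(_down(gb[-1])); gm.append(_down(gm[-1]))
    # Laplacian levels: difference between a level and its upsampled parent
    la = [ga[i] - _up(ga[i + 1], ga[i].shape) for i in range(levels)] + [ga[-1]]
    lb = [gb[i] - _up(gb[i + 1], gb[i].shape) for i in range(levels)] + [gb[-1]]
    blended = [m * x + (1.0 - m) * y for x, y, m in zip(la, lb, gm)]
    out = blended[-1]
    for lev in blended[-2::-1]:          # collapse the pyramid back up
        out = _up(out, lev.shape) + lev
    return out

a = np.zeros((32, 32))                   # two constant test images
b = np.ones((32, 32))
mask = np.zeros((32, 32)); mask[:, :16] = 1.0
out = laplacian_blend(a, b, mask)        # smooth left-to-right transition
```

Because the inputs reaching this stage have already been intensity-corrected, the pyramid only has to smooth small residual seams, which is why no artefacts are introduced.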

#### **4. Robotic Thermography Setup**

#### *4.1. Inspection System Integration*

Figure 5 illustrates the automatic thermography setup used in this work. The robotic manipulator was a KUKA KR10 R1100-2 arm [39], with a maximum payload of 11.1 kg and a maximum reach of 1101 mm. The setup was designed to perform PPT inspection in reflection mode, meaning that the flash lamp and the IR camera were always kept on the same side of the part under inspection. A custom-built supporting bracket was used to mount the flash lamp and the IR camera onto the robot and keep them in a fixed relative position during the inspection. The support allowed adjusting the orientation of the flash lamp to set the angular offset between the flash lamp illumination axis and the camera axis. This adjustment is not active, in the sense that no actuator varies it during the execution of a robotic inspection path. However, since the camera focal distance is kept constant for all data collection poses in a path, the optimum angular offset for any chosen camera focal distance can be set manually before executing the inspection path. The heat source was an Elinchrom Twin X4 Lamphead EL20181, capable of releasing a pulse of 4800 W·s with a duration of 5.56 ms (1/180 s), powered by two power supplies in a parallel configuration [40]. The excitation source features a lightweight aluminium chassis, two flash tubes and twin cables connected to two Elinchrom 2400 RX power packs. The presence of two flash tubes and two power packs allows shorter flash durations and faster recycle times than a single flash tube connected to a single power pack, which is advantageous for the robotisation of the thermographic inspection. The IR camera was a cooled FLIR X6540sc IR camera [41], equipped with a 50 mm F/2.0 lens; it has an adjustable acquisition rate of up to 125 Hz at full frame. The camera detector is an Indium-Antimonide (InSb) array of 640 × 512 pixels, cooled by a Stirling-cycle cooler. The camera was connected to the computer through a Gigabit Ethernet link for full-bandwidth data acquisition. The FLIR ResearchIR Max® software (version 4.40.1), running on the computer, enabled the initial configuration of the camera and the reception of the thermographic data during the robotic inspection. DFT was used to evaluate the frequency content of the thermal response.

**Figure 5.** (**a**) Schematic representation of the robotic thermographic setup used in this work; (**b**) Photo of the actual laboratory setup.

#### *4.2. Sample*

The sample was an epoxy specimen reproducing the curved geometry of a compressor blade. The specimen was produced by pouring a mix of liquid epoxy resin and a hardener into a mould. The resultant polymerised sample had one convex side, one concave side, and a varying thickness. The curvature of both surfaces is constrained to one direction. Six flat bottom holes (FBHs), three with square sections and three with round sections, were machined on the concave side of the sample as artificial defects. Thus, the FBHs are not visible from the convex side of the sample. Figure 6a,b illustrate the sample geometry, its main dimensions, and the position and size of the FBHs. The sample was coated with acrylic-based black matt paint to uniformise and enhance the surface emissivity, improving the effectiveness of PT inspection. The sample was placed on the optical table at a registered position within the working envelope of the robot arm, using a fixed custom supporting base. The specimen was inspected from the convex side. Figure 6c shows the sample ready for inspection. In order to validate the proposed alignment method, as will become clear in the following sections, the robotic thermographic inspection was also performed by wrapping the sample with a flexible plastic 3D printed grid (as shown in Figure 6d). The grid square pattern had a 3 mm pitch and wire width of 0.6 mm.

**Figure 6.** (**a**) Picture of the sample with the indication of footprint dimensions; (**b**) picture of the back wall with the indication of FBH locations and sizes; (**c**) sample placed on the supporting base without grid; (**d**) sample with the grid.

#### *4.3. Robot Path-Planning, Simulation and Control*

Six-axis robotic arms have traditionally been used in production lines to perform pick-and-place operations (e.g., palletising robots). In that scenario, where the exact trajectory between any two consecutive poses is not too important, a robot can be manually programmed by simply teaching the robot controller the coordinates of a few poses. Such teaching is usually performed by manually jogging the robot to each desired pose to record its coordinates. Then a robot programme is manually written to move the robot through the recorded poses. More recently, accurate mechanical joints and control units have made industrial robotic arms precise enough for finishing tasks in manufacturing operations [42]. As a result, software brands and robot manufacturers have developed many software applications to help technicians and engineers in programming complex robot tasks [43]. Using such software platforms to program robot movements is known as off-line programming (OLP). It is based on importing the 3D virtual model of the complete robot work cell, the robot end-effector, and the sample(s) to be manipulated or machined. Such robotic OLP software modules usually evolve from CAD/CAM applications, suited to programming Computer Numerical Control (CNC) manufacturing machines.

Despite the abundance of OLP software solutions geared towards manufacturing applications, limited solutions have been demonstrated for robotic NDT delivery [44,45]. Using commercial OLP software to generate appropriate tool paths for NDT purposes may seem relatively straightforward at first glance, but there are several inadequacies:


In this work, robotic path-planning, simulation and control for automated thermographic inspection were enabled through the development of a bespoke MATLAB-based graphic user interface. Figure 7 shows a screenshot of the application taken during the path-planning phase to inspect the sample described above. This software application imports the digital models of the robot, the thermographic instrumentation and the sample, producing a virtual representation of the inspection setup. The application was mainly developed to enable the automated thermographic data collection required for validating the data alignment method introduced by this work. Although it has no ambition to be a fully-developed software tool, it contains vital features to allow flexibility and future usability. The digital sample model is positioned in the virtual scene according to the user-specified coordinates for the sample reference frame with respect to the robot reference system. The set of coordinates comprises the three Cartesian coordinates of the sample origin and the three Eulerian angular coordinates of the coordinate axes. Although the application does not allow easy replacement of the employed thermographic instrumentation, provision has been made to enable customisation of the IR camera focal distance. Indeed, for a given camera lens with a fixed focal length, the inspection resolution depends on the camera's distance from the sample surface. Thus, changing the camera focal distance is greatly important to allow accurate planning and simulation of the robotic task. The indication of the camera focal distance enables the software to compute the robot tool centre point (TCP) coordinates.
The application allowed the creation of a raster inspection tool-path for the sample, according to the user-specified maximum spacing between consecutive image acquisition poses (25 mm) and offset from the sample edges (10 mm), resulting in an inspection path consisting of 15 data acquisition poses arranged in three passes (5 poses per pass). The TCP is kept on the part surface for all poses. The z-direction of the tool reference frame follows the surface's normal direction to keep the camera view axis always perpendicular to the surface. Due to the curvature of the surface, the fact that the IR camera view axis is kept perpendicular to the surface does not guarantee that all the infrared rays emitted by the surface are perpendicular to the camera sensor. That aspect can be neglected by reducing the part surface area imaged from a single camera position, which is the main reason for employing robotic thermography. The surface area imaged from each camera position is reduced by bringing the camera closer to the part and/or cropping the camera's full image frame (sub-windowing). The sub-windowing also allows higher frame acquisition rates, resulting in a better temporal sampling of the thermal wave.
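The raster-path construction described above can be sketched with a few lines of code. This is an illustrative reconstruction, not the MATLAB application's code: the surface dimensions and the serpentine pass ordering are assumptions, while the 25 mm maximum spacing and 10 mm edge offset are the values stated in the text. Coordinates are parametric surface coordinates in millimetres, not full TCP poses.

```python
import math

def raster_poses(surf_len, surf_wid, max_spacing, edge_offset):
    """Generate (u, v) surface coordinates for a raster inspection path.

    u runs along each pass; v indexes the passes. The number of poses per
    axis is the minimum that respects the maximum allowed spacing."""
    def axis_points(extent):
        usable = extent - 2 * edge_offset
        n = math.ceil(usable / max_spacing) + 1   # poses needed on this axis
        step = usable / (n - 1)
        return [edge_offset + i * step for i in range(n)]

    us, vs = axis_points(surf_len), axis_points(surf_wid)
    # serpentine ordering: reverse every other pass to minimise robot travel
    return [(u, v) for j, v in enumerate(vs)
            for u in (us if j % 2 == 0 else reversed(us))]

# hypothetical 110 mm x 70 mm inspection surface: 5 poses per pass, 3 passes,
# 15 poses in total, matching the path described in the text
poses = raster_poses(110.0, 70.0, max_spacing=25.0, edge_offset=10.0)
```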

**Figure 7.** MATLAB-based graphical user interface for robot path-planning, simulation and control.

The application allows simulating the automated task workflow before sending the path command coordinates to the connected robot. The connection between the computer and the robot was managed through the Interfacing Toolbox for Robotic Arms (ITRA) [46]. The ITRA allowed synchronising the robotic camera manipulation and the data collection to carry out the following steps, supported by the schematic representation given in Figure 5a:


Figure 8 shows the robotic inspection system at the first five path poses. A video of the robotic data acquisition is available for download as Supplementary Material.


**Figure 8.** Robotic inspection system during data acquisition for the first five poses.

#### **5. Results**

Figures 9 and 10 show the sets of thermographic images acquired with the tool path presented in Section 4.3, using a camera focal distance of 550 mm. The camera acquired the evolution of the thermographic field for 10 s at each pose (starting from one second before the trigger of the flashlamp). DFT was used to evaluate the frequency content of the thermal response at 0.6 Hz. All images have the same size (192 × 224 pixels). The two sets relate to the images captured from the sample with and without the grid; the same robotic tool path was thus repeated twice to ensure repeatability of the acquisition poses. It must be noted how the average pixel intensity varies from image to image within each set; for example, image #4 and image #12 present significantly lower and higher average intensity, respectively, than the rest of the images in the first set. Furthermore, pixel intensity is not repeatable, since differences are evident across the two sets: any pair of corresponding images in the two sets presents a visible difference in pixel intensity.
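The phasegram computation described above can be illustrated with a minimal sketch: a per-pixel DFT along the time axis, with the phase extracted at the bin nearest the evaluation frequency. The 0.6 Hz frequency and 10 s acquisition window come from the text; the 50 Hz frame rate, the small synthetic image size and the decaying transient are assumptions for illustration.

```python
import numpy as np

def phasegram(frames, fs, f_target=0.6):
    """Phase image at a target frequency from a thermal image sequence.

    frames: (T, H, W) array of thermograms; fs: frame rate in Hz."""
    n_frames = frames.shape[0]
    spectrum = np.fft.fft(frames, axis=0)        # per-pixel DFT along time
    k = int(round(f_target * n_frames / fs))     # bin nearest f_target
    return np.angle(spectrum[k])                 # the "phasegram"

# synthetic check: a decaying thermal transient sampled for 10 s at an
# assumed 50 Hz, on a small 16 x 16 image (the paper uses 192 x 224)
fs = 50.0
t = np.arange(0, 10, 1 / fs)
frames = np.exp(-t)[:, None, None] * np.ones((1, 16, 16))
phi = phasegram(frames, fs)                      # one 16 x 16 phase image
```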

**Figure 9.** Set of phasegrams taken from the sample with the superposed grid.

**Figure 10.** Set of phasegrams taken from the sample without the superposed grid.

Figure 11 highlights the initial estimate of the overlaps in each set of images. The images were encoded with the camera shooting positions and scaled by the measured resolution value. The known pitch of the grid (3 mm) was used to estimate the resolution of the images, which was 150 μm/pixel (≈4444 pixels/cm²). Figure 11a,b relate to the sets of images with and without the grid, respectively. There, the image pixels were plotted with 50% transparency to allow visualising the overlaps, which are more clearly illustrated in Figure 11c. Following the notation introduced in Section 3.3, the presented FiPAM method was employed using a value of 1 mm for the offset (o) between the image overlap areas and the shrunk areas. This equates to assuming that the maximum distance between a pair of corresponding pixels in neighbour images (the misalignment) does not exceed 1 mm, which is the case for the sets of images at hand.
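The reported resolution figures can be reproduced with a short calculation. The pixel count per grid pitch below is an assumed measurement, chosen to be consistent with the stated 150 μm/pixel; the 3 mm pitch is from the text.

```python
pitch_mm = 3.0            # known pitch of the superposed grid
pixels_per_pitch = 20     # assumed pitch length measured on the image, in pixels

res_mm = pitch_mm / pixels_per_pitch       # 0.15 mm/pixel = 150 um/pixel
pixels_per_cm = 10.0 / res_mm              # 10 mm per cm
density = pixels_per_cm ** 2               # ~4444 pixels/cm^2, as reported
```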

**Figure 11.** Plots of scaled and encoded images. (**a**) Set of images taken from the sample with the grid; (**b**) Images taken from the sample without the grid; (**c**) Overlapping areas.

As stated above, the FiPAM algorithm contains the mathematical parametric formulation of all five typical 2D image transformations (translation, Euclidean, similarity, affinity and homography), allowing all images in a set to be aligned with the same type of planar transformation or using a different type of transformation for each image. Each image can be given a different set of DoFs and treated in six different ways when "no transformation" (zero degrees of freedom) is included as an additional option, corresponding to a total of 6<sup>15</sup> ≈ 4.70·10<sup>11</sup> possible diverse ways of applying FiPAM to our sets of fifteen images.

Figures 12 and 13 illustrate the results obtained with FiPAM, using the similarity transformation for all images. The similarity transformation, which allows four DoFs (horizontal translation, vertical translation, rotation and scaling), proved sufficient to permit a fine alignment of all images in the given sets. In Figures 12a and 13a, the dotted blue line rectangles represent a fivefold scaled-down version of the original images, while the rectangles with a green line perimeter represent a twofold scaled-down version of the aligned images, in which the translations are magnified by a factor of 20 and the rotations by a factor of 40. These magnifications were introduced to make the computed alignment transformations visible for illustration purposes. The infill colour given to each aligned image is linked to the computed pixel intensity correction through the indicated colour map. The resulting blended mosaic thermographic image (460 × 800 pixels) is given in Figures 12b and 13b for the images with and without the grid, respectively.
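As a sketch of the similarity transformation used here (not the FiPAM implementation itself), the four DoFs can be assembled into a homogeneous 2D matrix and applied to pixel coordinates as follows; the function names are illustrative.

```python
import numpy as np

def similarity_matrix(tx, ty, theta, s):
    """Homogeneous 2D similarity transform: scaling s and rotation theta,
    followed by translation (tx, ty) -- the four DoFs per aligned image."""
    c, si = np.cos(theta), np.sin(theta)
    return np.array([[s * c, -s * si, tx],
                     [s * si,  s * c, ty],
                     [0.0,     0.0,   1.0]])

def warp_points(T, pts):
    """Apply transform T to an (N, 2) array of pixel coordinates."""
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    return (homog @ T.T)[:, :2]

# identity parameters (no translation, no rotation, unit scale) leave
# points unchanged
p = warp_points(similarity_matrix(0.0, 0.0, 0.0, 1.0), np.array([[5.0, 7.0]]))
```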

**Figure 12.** (**a**) Schematic illustration of similarity transformations and pixel intensity corrections computed through the proposed method for the set of images relative to the sample with the grid. (**b**) Resulting composite mosaic thermographic image.

**Figure 13.** (**a**) Schematic illustration of similarity transformations and pixel intensity corrections computed through the proposed method for the set of images relative to the sample without the grid. (**b**) Resulting composite mosaic thermographic image.

Although testing FiPAM with all transformation combinations is not viable, the method was evaluated through a representative subset by employing each of the five possible planar transformations for all images and changing the number of images to align; this allowed varying the problem size significantly to evaluate the execution time of FiPAM. The number of images considered for each type of transformation was 2, 5, 10 and 15, corresponding to aligning the first two images, the five images collected in the first pass of the tool path, the images in the first two passes, or all the images in the set. As a result, the total number of DoFs considered in the alignment problem spanned from four, for two images transformed through pure translation (two parameters per image), to 120, for fifteen images transformed through homography (eight parameters per image). FiPAM was implemented and evaluated through MATLAB (version 2020b), running on a computer with an Intel® i7-6820HQ CPU (2.70 GHz, 4 cores) and 32 GB of random-access memory. The MATLAB implementation code developed in this work is accessible at https://doi.org/10.5281/zenodo.6817052 (accessed on 11 July 2022). The recorded execution times are plotted in Figure 14.
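The combinatorics quoted above can be checked with a few lines of code; the dictionary of DoFs per transformation type follows the figures given in the text.

```python
# parameters (DoFs) per transformation type, as stated in the text
DOF = {"translation": 2, "euclidean": 3, "similarity": 4,
       "affinity": 6, "homography": 8}

def total_dof(assignment):
    """Total parameter count for a per-image transformation assignment."""
    return sum(DOF[t] for t in assignment)

# each of the 15 images can be given one of the 5 transformations, or none
n_ways = (len(DOF) + 1) ** 15          # 6^15 ~= 4.70e11 configurations
smallest = total_dof(["translation"] * 2)    # two images, pure translation
largest = total_dof(["homography"] * 15)     # fifteen images, homography
```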

**Figure 14.** Execution times for alignment, pixel intensity correction and Laplacian blending.

#### **6. Discussion**

Traditional manual inspection approaches are insufficient in some scenarios. Therefore, there are fundamental motivations for increasing automation in non-destructive testing. Automation of NDT is required to cope with the inspection of large and/or curved geometries. The key challenges to face when developing a robotic NDT system include integrating the NDT instrumentation with the robotic manipulator, creating a suitable robot inspection path for the part under inspection, and developing software for NDT data collection and visualisation. Although these challenges have been addressed by several applications of six-axis robotic arms for the inspection of parts through automated ultrasonic techniques, other types of inspections have not reached the same level of robotisation, which is the case with thermographic testing. This work bridges a technology gap, making thermographic inspections more deployable in industrial environments. Furthermore, the proposed fine image alignment method (FiPAM) can find applicability beyond thermographic NDT.

The results prove that FiPAM enables the proper merging of multiple thermographic images into one single mosaic image, which is easier to analyse. This is accomplished through three steps: simultaneous alignment of all images in a set, global optimum pixel intensity correction, and image blending. The reported composite mosaic images, in Figures 12b and 13b, obtained through computing similarity transformations and pixel corrections for the images acquired in this work, show a significant reduction of the original discontinuities. Whereas the scale of the composite image relative to the sample with the grid is immediately retrievable from the known grid pitch (3 mm), a reference 20 mm long scale bar was added to the image relative to the sample without the grid. It is straightforward to note that the sizes of the thermographic indications correspond to the physical sizes of the artificial FBHs. The difference in thermographic pixel intensity for FBHs of diverse sizes is coherent with the change in the aspect ratio between heat blocking and leakage surface, as described in [47]. Larger FBH diameter to depth ratios produce the emergence of localised higher intensities in the IR image sequence.

Although FiPAM execution times are machine-dependent, the patterns presented in Figure 14 provide a helpful guideline for understanding the general trends. As expected, the execution times for alignment, pixel intensity correction and Laplacian blending increase with the number of images. The alignment phase execution time also depends on the type of transformations used for the images in the set, since these influence the size of the least-squares problem and the number of transformation parameters to find through the minimisation of the SSD function in Equation (3). Thus, for a given number of images to align, using the same type of transformation for all images, the execution time increases monotonically, moving from translation to Euclidean, similarity, affinity and homography transformations. Although not all possible combinations are assessed in this work, it is not difficult to imagine intermediate execution times for generic combinations, where not all the images are transformed by the same transformation type. As a rule of thumb, for a given number of images, the alignment execution time should never exceed the time relative to the case where all images are transformed through homography, since this corresponds to the largest problem with the maximum number of parameters. Fluctuations in alignment execution times can be observed for the patterns relative to translation and Euclidean transformations. They are thought to be caused by the limited DoFs allowed by these transformations, which can cause prolonged convergence times due to the difficulty of obtaining a good image alignment. The execution times of the pixel correction and Laplacian blending phases depend on the number of images; the minor differences associated with the transformation type used are thought to be caused by the different overlaps of the aligned images, which change the number of pixel intensity differences to compute for the SSD function in Equation (4).

The advantages of FiPAM, described in this work, should be clear by now. One limitation of the current implementation is that FiPAM is suitable for aligning multiple images of a sample surface that curves only in one direction. Although this limitation does not impede using FiPAM for generalised cylindrical surfaces, future work should focus on extending FiPAM to operate with images encoded in three-dimensional space.

**Supplementary Materials:** The following supporting information can be downloaded at: https:// www.mdpi.com/article/10.3390/s22166267/s1, Video S1: Robotic thermographic data acquisition.

**Author Contributions:** Conceptualisation, C.M., N.M. and A.P.; methodology, C.M. and N.M.; software, C.M.; validation, C.M., N.M. and M.F.; formal analysis, C.M.; investigation, N.M. and M.F.; resources, C.M. and N.M.; data curation, C.M.; writing—original draft preparation, C.M.; writing—review and editing, all authors; visualisation, C.M.; supervision, D.C. and A.P.; project administration, D.C. and A.P.; funding acquisition, C.M. and D.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work received funding from the European Union's Horizon 2020 Research and Innovation Programme under the Marie Sklodowska-Curie grant agreement No. 835846.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The source code for the MATLAB-based implementation of FiPAM and the example thermographic dataset are available at https://doi.org/10.5281/zenodo.6817052 (accessed on 11 July 2022).

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **References**


### *Article* **Defects Recognition Algorithm Development from Visual UAV Inspections**

**Nicolas P. Avdelidis <sup>1,</sup>\*, Antonios Tsourdos <sup>1</sup>, Pasquale Lafiosca <sup>1</sup>, Richard Plaster <sup>2</sup>, Anna Plaster <sup>2</sup> and Mark Droznika <sup>3</sup>**


**Abstract:** Aircraft maintenance plays a key role in the safety of air transport. One of its most significant procedures is the visual inspection of the aircraft skin for defects. This is mainly carried out manually by a highly skilled inspector walking around the aircraft; it is time-consuming, costly and stressful, and the outcome depends heavily on the inspector's skill. In this paper, we propose a two-step process for automating defect recognition and classification from visual images. The visual inspection can be carried out with an unmanned aerial vehicle (UAV) carrying an image sensor, fully automating the procedure and eliminating human error. In the first step of our proposed method, we perform the crucial part of recognizing the defect. If a defect is found, the image is fed to an ensemble of classifiers for identifying its type. The classifiers are a combination of different pretrained convolutional neural network (CNN) models, which we retrained to fit our problem. To achieve our goal, we created our own dataset with defect images captured from aircraft during inspections in TUI's maintenance hangar. The images were preprocessed and used to train different pretrained CNNs through transfer learning. We performed an initial training of 40 different CNN architectures to choose the ones that best fitted our dataset, and then chose the best four for fine tuning and further testing. For the first step, defect recognition, the DenseNet201 CNN architecture performed best, with an overall accuracy of 81.82%. For the second step, defect classification, an ensemble of different CNN models was used. The results show that, even with a very small dataset, we can reach an accuracy of around 82% in defect recognition, and even 100% for the classification of the categories of missing or damaged exterior paint and primer, and dents.

**Keywords:** defect recognition; aircraft inspection; deep learning; CNN; UAV; defect classification; AI

#### **1. Introduction**

Air transport is one of the most significant ways of moving people across the globe. In 2019, the number of air passengers carried worldwide was around 4.2 billion, an overall increase of 92% compared with 2009 [1]. During COVID-19, most travelling was almost brought to a halt, with numbers decreasing significantly. In 2020, the total number of passengers dropped to around one billion (1034 million) [2]. As a result, the need to reduce costs across the industry has become pressing. Around 10–15% of the operational costs of an airline relate to maintenance, repair, and overhaul (MRO) activities [3]. Currently, aircraft maintenance heavily involves visual tasks carried out by humans [3]. This is very time consuming and costly, and it introduces the possibility of human error. It is understood that automating these visual tasks could solve this problem [4–6]. For this reason, the use of climbing robots or UAVs to perform these tasks has been attempted. Climbing robots usually use magnetic forces, suction cups, or vortices to climb

**Citation:** Avdelidis, N.P.; Tsourdos, A.; Lafiosca, P.; Plaster, R.; Plaster, A.; Droznika, M. Defects Recognition Algorithm Development from Visual UAV Inspections. *Sensors* **2022**, *22*, 4682. https://doi.org/10.3390/ s22134682

Academic Editors: Yashar Javadi and Carmelo Mineo

Received: 28 May 2022 Accepted: 20 June 2022 Published: 21 June 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

onto the aircraft structure [7–9]. However, robotic platforms for inspection face difficulties in achieving good adherence and mobility due to their lack of flexibility [7,10,11]. On the other hand, UAVs have been proposed for the inspection [12–15] of buildings, wind turbines, power transmission lines and aircraft. UAVs could minimize inspection time and cost, as they can quickly inspect large areas of the aircraft, and data can be transmitted to a ground base in real time for processing. The key challenge for all the above automated techniques is developing defect detection algorithms able to perform with accuracy and repeatability. Several attempts have been made, and most can be divided into the following two categories: those that use more traditional image processing techniques [5,16–18] and those that use machine learning [19–24]. In the first category, image features such as convexity or signal intensity [5] are used. In [18], the authors proposed a method using histogram comparisons or structural similarity. In addition, in [16,17], the authors proposed the use of neural networks trained on feature vectors extracted from the contourlet transform. These techniques achieve very good accuracy on test data but fail to generalize effectively and need continuous tuning. On the other hand, algorithms using convolutional neural networks (CNNs) have shown good results in defect detection [19–21,25]. In [19,20], CNNs are used as feature extractors, and then either a single shot multibox detector (SSD) or a support vector machine (SVM) is used for the classification. The use of Faster R-CNN has also been proposed for classification and localization [22]. In addition, the use of UAVs together with deep learning algorithms has been proposed for the tagging and localization of concrete cracks [26,27].

The main challenge of machine learning algorithms is the requirement for a large amount of data. For CNNs in particular, the amount of data required can be on the scale of thousands of images, especially if a model is not already pretrained. The existence of large datasets of concrete structures has allowed CNNs to show excellent results in defect detection in that domain. In aircraft structures, on the other hand, the results are promising but still not very accurate [18], or they deal only with the problem of defect recognition [20]. In this paper, we propose a two-step classification process with an ensemble of machine learning classifiers for both defect recognition and classification. In this two-step process, we use pretrained CNNs to both recognize and classify a series of defects in aircraft metallic and composite structures: in the first step, we perform defect recognition, and in the second step, defect classification. By combining the results of different classifiers, we can more effectively address the issue of small datasets and produce results with an accuracy reaching 82% in the defect recognition step.

#### **2. Dataset**

One of the challenges in this study was the creation of datasets for training and testing the classifiers. As most datasets of defects on aircraft are not publicly available, we needed to create our own. The datasets were created with the help and permission of TUI© [28]. The images were taken during the scheduled maintenance of aircraft in TUI's base maintenance hangar in Luton, UK. The imaging sensor used was a SONY RX0 II© rugged mini camera. This model can be carried by a drone and is able to take images from any angle. All the technical specifications of the camera, such as sensor size, sensor type and focal length, are widely available; the effective resolution is 15 megapixels, with a maximum image size of 4800 × 3200 pixels. Images for the datasets were captured and the following seven types of defects were investigated:


In Figure 1, images of the defects are presented. Most of the obtained images contained several defects, together with other elements such as screws, etc. In order to create the two different datasets, further processing was needed to extract only the objects that we were interested in from each of the images.

**Figure 1.** Images of different types of defects in aircraft structures. (**a**) Missing paint, (**b**) dents, (**c**) lightning strike damage, (**d**) lightning strike fastener repair, (**e**) blend/rework repair (material removed and then re-protected with exterior paint); (**f**) double patch repair.

The objects of interest were cropped through a semi-automated procedure to create the datasets for training. A Python script was developed so that the user could select and crop the area containing the object of interest. Each cropped image was saved as a new image file, with a file name indicating the category of the defect. This made it possible to extract multiple images of interest, with and without defects, from a single image. The cropped images were converted to grayscale because we did not want the classifiers to associate color with defects during training: defects are not color related, and aircraft are painted in different colors depending on the company. Images of the datasets can be observed in Figure 2.
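The grayscale conversion step can be sketched as follows; the BT.601 luminance weights are an assumption, since the text does not specify which conversion formula the script used.

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an (H, W, 3) RGB crop to grayscale with ITU-R BT.601
    luminance weights, so classifiers cannot associate color with defects."""
    weights = np.array([0.299, 0.587, 0.114])
    return (rgb.astype(float) @ weights).astype(rgb.dtype)

# a pure-red patch collapses to a uniform grey level: 255 * 0.299 ~= 76
crop = np.zeros((32, 32, 3), dtype=np.uint8)
crop[..., 0] = 255
gray = to_grayscale(crop)
```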

**Figure 2.** Sample images from the two datasets created for training the classifiers. (**a**) An image of a dent, (**b**) a lighting strike fastener repair; (**c**,**d**) are images with objects that are not defects.

Following the above procedure, two datasets were created: one containing images from each category of defect described above, and one containing images with and without defects. In the no-defect category, the second dataset has images of screws, gaps, small plates and other objects that the classifier will need to distinguish from defects. Figure 2 shows images from the two datasets with and without defects.

The defect/no-defect dataset, which we will refer to as the binary dataset for simplicity, contains 1059 images: 576 of defects and 483 of non-defects. The other dataset, referred to as the defect dataset, contains 576 images of the 7 types of defects. Both datasets were relatively small, but gathering images was very challenging under the prevailing circumstances (COVID-19 restrictions, flight reductions, etc.). To mitigate this, we carried out a custom split of the images, with 88% for training, 9% for validation and the rest for testing, for both datasets. This was done to allow the classifiers to learn as much as possible from the data. The splitting can be observed in Table 1 for the binary dataset and in Table 2 for the defect dataset.


**Table 1.** Dataset split for training, validating and testing the defect/non defect classifier.

**Table 2.** Dataset split for training, validating and testing the defect classifier.


#### **3. Defect Classification Algorithms**

As previously mentioned, one of the challenges of classification problems in aerospace applications is the small amount of available data. In this paper, we address this by proposing a two-step classification approach with a combination of different classifiers. In the first step, a classifier decides whether the image contains a defect; if it does, a different classifier then identifies the defect type in the second step. The classifiers are a combination of CNNs pretrained on ImageNet [29], which we retrained with the use of transfer learning [30]. In the first step, a DenseNet201 model is used and, in the second, an ensemble of different CNNs, as can be observed in Figure 3.
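The two-step decision logic can be sketched as follows; the callable interface for the classifiers is hypothetical, standing in for the trained CNN models.

```python
def inspect(image, binary_clf, defect_clf):
    """Two-step pipeline: the binary classifier decides whether the crop
    contains a defect; only then is the defect-type classifier invoked.
    Classifiers are passed as callables (hypothetical interface)."""
    if not binary_clf(image):
        return "no defect"
    return defect_clf(image)       # one of the seven defect categories

# stub classifiers for illustration
result = inspect("img", lambda im: True, lambda im: "dent")
```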

**Figure 3.** Block diagram of the two-step process for defect recognition and classification.

Transfer learning refers to a technique of retraining a CNN that has already been trained on a very large dataset, such as ImageNet [29]. Even though the dataset on which the CNN was initially trained is unrelated to the problem under research, [30] has shown that the benefits of this technique are substantial. There are two main approaches to implementing transfer learning. In the first, only the convolutional layers of the trained network are used, as feature extractors [31], and the features are then fed to a different classifier, such as a support vector machine [31]. In the second approach, which is used in this paper, the head of the neural network (the fully connected layers) is replaced; the number of outputs of the new fully connected layers matches the number of categories of our classifier. The new network is initially trained with all the weights of the convolutional layers frozen (non-trainable). Then, to fine-tune the model, a number of layers are unfrozen and training is resumed. The basic rule for unfreezing layers is: the less data, the fewer layers to unfreeze. In addition, because the initial (bottom) layers of a CNN extract generic low-level features that apply to any type of image, we unfreeze (for training) the layers closer to the top of the network. Another point that needs attention during both training rounds is not to update the weights of the batch normalization layers. These layers contain two non-trainable weights tracking the mean and variance of the inputs, which usually get updated during training; if these layers are unfrozen during fine tuning, the updates applied will destroy what the model has learned.
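The freeze-and-unfreeze workflow described above can be sketched schematically. This is plain Python modelling the rules, not actual Keras code: the `Layer` class and the 10% unfreeze fraction used in the example are illustrative stand-ins for the real framework objects.

```python
class Layer:
    """Minimal stand-in for a network layer (schematic, not Keras)."""
    def __init__(self, name, is_batch_norm=False):
        self.name, self.is_batch_norm = name, is_batch_norm
        self.trainable = True

def prepare_for_fine_tuning(layers, unfreeze_frac):
    """Freeze the pretrained base, then unfreeze the top fraction of layers,
    always keeping batch-normalization layers frozen."""
    for layer in layers:
        layer.trainable = False
    n_unfreeze = int(len(layers) * unfreeze_frac)
    for layer in layers[len(layers) - n_unfreeze:]:
        if not layer.is_batch_norm:    # BN statistics must not be updated
            layer.trainable = True
    return layers

# 20-layer toy base with a batch-norm layer near the top; unfreeze the
# last 10% of layers, as in the fine-tuning setup described in the text
base = [Layer(f"conv{i}") for i in range(18)]
base += [Layer("bn18", is_batch_norm=True), Layer("conv19")]
prepare_for_fine_tuning(base, unfreeze_frac=0.10)
```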

The models were implemented using TensorFlow [32], as this is a well-established deep learning library, widely used for both commercial applications and research. Because TensorFlow contains around forty pretrained networks, we needed to identify those that best fit our datasets. To achieve this, we trained each network for five epochs with the convolutional layers frozen. To continue with fine tuning, we chose the best four pretrained networks for each classifier. For the binary classifier, the models that performed best were MobileNet, DenseNet201, ResNet152V2 and InceptionResNetV2. For the defect classifier, the four models with the best results were EfficientNetB1, EfficientNetB5, EfficientNetB4 and DenseNet169.

To improve the performance of the chosen models, we fine-tuned them for another ten epochs. For fine-tuning, we unfroze the last 10% of the layers of each model and reduced the learning rate by a factor of ten compared to the initial one. In addition, learning-rate reduction and early stopping were used; both techniques are included in the TensorFlow libraries. With learning-rate reduction, the learning rate of the optimizer is reduced if the validation loss has not improved for a certain number of epochs. Similarly, with early stopping, as the name suggests, training stops if the monitored metric (in this case, validation loss) has not improved for a certain number of epochs, and the model with the best weights is saved. Both techniques were used to prevent overfitting.
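The behaviour of the two techniques can be illustrated with a schematic training loop. Real TensorFlow provides them as separate `ReduceLROnPlateau` and `EarlyStopping` callbacks with more options; the combined loop, patience and reduction factor below are illustrative.

```python
def train_with_callbacks(val_losses, patience=2, factor=0.1, lr=1e-4):
    """Schematic plateau handling: reduce the learning rate when the
    validation loss stalls for `patience` epochs, and stop training when
    it keeps stalling after the reduction."""
    best, wait = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait == patience:       # first plateau: reduce learning rate
                lr *= factor
            elif wait > patience:      # continued plateau: stop early
                return epoch, lr
    return len(val_losses) - 1, lr

# validation loss improves, stalls, then worsens: the LR drops, then
# training stops
stop_epoch, final_lr = train_with_callbacks([0.9, 0.8, 0.8, 0.85, 0.9])
```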

In addition to the CNNs, a random forest was trained. The initial idea was to use both the CNN and the random forest in the first step, but the overall benefit of this was low. To train the random forest, we extracted Hu moments, color histogram and Haralick texture features. The overall accuracy of the random forest classifier was 76%.

#### **4. Results**

As discussed in the previous section, the initial training of five epochs was carried out for each of the pretrained models available in TensorFlow. The results of the four best networks for defect recognition can be observed in Table 3.

**Table 3.** Performance of the 4 best out of 40 pretrained networks for the binary classifier after 5 epochs.


The results of the best four networks for the defect classification can be observed in Table 4.



As expected, due to the small number of images and the lack of fine tuning, especially for the defect classifier, the accuracy on both the validation and testing images was relatively low. At this stage, no further analysis was carried out, and no extra metrics, such as confusion matrices or classification reports, were produced, as the purpose was only to identify the best CNNs for each of the datasets.

After this initial training, each of the networks was further trained, as discussed above, for another ten epochs. The results of the training can be observed in Table 5.

**Table 5.** Performance of the 4 best pretrained networks for the binary classifier after fine tuning for a total of 15 epochs.


The same procedure was followed for the other set of classifiers for the defect classification. The results can be observed in Table 6.


**Table 6.** Performance of the 4 best pretrained networks for the defect classifier after fine tuning for a total of 15 epochs.

To better understand the behavior of the CNNs during training, the validation loss was taken into account. Together with the validation accuracy, this metric can indicate when a CNN starts overfitting: this typically occurs when the validation loss stops improving while the validation accuracy still does. This is also the main reason why we used the learning-rate reduction and early stopping techniques.

To decide which of the above eight CNNs to use in the proposed system, further metrics were produced. For each of the models, a classification report and a confusion matrix were produced to measure performance on the test data. A classification report reports the values of precision, recall and F1 score [33]. Precision quantifies the proportion of correct positive predictions: it is defined as the ratio of true positives to the sum of true positives and false positives [33], and shows how precise the model is. It is most useful when the cost of a false positive is high, which in our case it is not: misclassifying a non-defect produces extra work for the inspector but is not critical. Recall is the ratio of true positives to the sum of true positives and false negatives [33]; in simple terms, it shows how many of the actual positives are correctly predicted. It is the metric to use when there is a high cost of false negatives, which in our case means misclassifying a defect as a non-defect. The F1 score is calculated as the product of precision and recall, divided by their sum and multiplied by 2 [33]; it can be interpreted as the harmonic mean of precision and recall (not their arithmetic average). It is a very valuable metric, especially when errors caused by both false positives and false negatives are undesirable.
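These definitions can be turned into a short function. The counts in the usage example are those reported for DenseNet201 in the results (18 of 22 defect images and 9 of 11 no-defect images classified correctly), treating "defect" as the positive class.

```python
def binary_metrics(tp, fp, fn, tn):
    """Precision, recall, F1 and accuracy for the positive class of a
    binary confusion matrix."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)   # harmonic mean
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy

# DenseNet201 counts from the reported confusion matrix:
# 18 true positives, 4 false negatives, 9 true negatives, 2 false positives
p, r, f1, acc = binary_metrics(tp=18, fp=2, fn=4, tn=9)
# acc = 27/33, matching the stated overall accuracy of 81.82%
```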

Taking into consideration all the above, we created a classification report with the above metrics for each of the models.

In Table 7, the combined classification reports can be observed for all four models for defect recognition and in Table 8, the combined confusion matrices.

From the above tables, we can observe that DenseNet201 performs very well with high precision. The confusion matrix shows that the model correctly predicted eighteen out of the twenty-two images containing a defect and nine out of the eleven images in the no-defect category.

Comparing InceptionResNetV2 and DenseNet201, we can observe that the former has a better precision than DenseNet201 for the defect category, but its recall value is much lower. This is also reflected in the confusion matrix, where InceptionResNetV2 has more false negatives. In addition, the F1 score of DenseNet201 is higher in both categories. Because misclassifying a defect is critical in our application, we can state that DenseNet201 performs better.

From the above results, it can be observed that DenseNet201 has the best overall accuracy at 81.82%, as well as the best precision and recall values for the defect class. In addition, it has the fewest false negatives and the best F1 score for both classes. Another test we performed was to combine the classifiers in an ensemble to investigate whether any improvements in the metrics were possible. The ensemble of classifiers did not give better results than DenseNet201 alone.


**Table 7.** Combined classification reports for defect recognition classifiers.

**Table 8.** Combined confusion matrices for defect recognition classifiers.


The same procedure was followed for the defect classification models and the results of the metrics and confusion matrices can be observed in Tables 9–16.


**Table 9.** Classification report of DenseNet169 for defect classification.

**Table 10.** Confusion matrix of DenseNet169.


**Table 11.** Classification report of EfficientNetB1 for defect classification.


**Table 12.** Confusion matrix of EfficientNetB1.



**Table 13.** Classification report of EfficientNetB4 for defect classification.

**Table 14.** Confusion matrix of EfficientNetB4.


**Table 15.** Classification report of EfficientNetB5 for defect classification.


**Table 16.** Confusion matrix of EfficientNetB5.


From the above matrices, the performance of the models for defect classification is relatively low. However, this is due to the small number of images in the dataset and because the dataset was unbalanced. To improve performance and make the predictions more consistent, we used an ensemble model. We combined all four models into a new model in which the input image is fed into all four models. The predictions of each model are passed to a layer added at the end of the model. This final layer averages the predictions of the four models and returns an array with the averaged values. This technique, especially in our case where the performance of the models is similar, provides a more consistent outcome across all the different classes. The results for the ensemble model can be observed in Tables 17 and 18.
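The averaging step can be sketched as follows; the per-model probability vectors are illustrative placeholders rather than real model outputs, and in Keras the same effect is obtained by routing the four model outputs through an `Average` layer:

```python
import numpy as np

def ensemble_predict(per_model_probs):
    """Average the softmax outputs of several models.

    per_model_probs: list of arrays, one per model, each of shape
    (n_samples, n_classes). Returns the averaged probabilities.
    """
    return np.mean(np.stack(per_model_probs, axis=0), axis=0)

# Illustrative outputs of four classifiers for one image over three
# defect classes (placeholder values, not real data).
p = [np.array([[0.6, 0.3, 0.1]]),
     np.array([[0.5, 0.4, 0.1]]),
     np.array([[0.7, 0.2, 0.1]]),
     np.array([[0.2, 0.7, 0.1]])]
avg = ensemble_predict(p)
print(avg, avg.argmax(axis=1))  # class 0 wins on average
```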


**Table 17.** Classification report of the ensemble model for defect classification.

**Table 18.** Confusion matrix of the Ensemble.


Although the ensemble model may perform worse than individual models in some categories, its overall performance is better. Unlike the individual models, it produces correct predictions for every category, and its overall accuracy is above the average accuracy of the four models.

Finally, we tested the whole pipeline of our algorithm. We first fed the test images to the defect recognition model and then, if an image contained a defect, we passed it to the defect classifier. As the defect recognition model we chose DenseNet201, and for defect classification, the ensemble model. As we used the same test dataset, the results of the defect recognition model are the same as in Tables 7 and 8, and those of the ensemble are similar to Tables 17 and 18. However, after filtering through the first step, we achieved 100% accuracy for the categories of missing or damaged exterior paint and primer, and dents.
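The two-step pipeline can be sketched as below; the two callables stand in for the trained DenseNet201 and ensemble models, and the class list and decision threshold are assumptions made for illustration:

```python
# Illustrative class list; only the first two names appear in our dataset.
DEFECT_CLASSES = ["paint_primer_damage", "dent", "other"]

def inspect(image, recognizer, classifier, threshold=0.5):
    """Two-step inspection: recognize whether a defect exists,
    then classify its type only when one is found."""
    p_defect = recognizer(image)      # step 1: probability of a defect
    if p_defect < threshold:
        return "no defect"
    probs = classifier(image)         # step 2: per-class probabilities
    best = max(range(len(probs)), key=probs.__getitem__)
    return DEFECT_CLASSES[best]

# Stub models standing in for DenseNet201 and the ensemble.
recognizer = lambda img: 0.9
classifier = lambda img: [0.1, 0.8, 0.1]
print(inspect("img.jpg", recognizer, classifier))  # -> dent
```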

Although the results are promising, the overall accuracy of the defect classifier is low. As previously mentioned, this is mainly due to the small number of images and because the dataset is very unbalanced. Taking into consideration the accuracy for the defect recognition classifier together with the number of images, we believe that by having around five hundred images for each defect category, we will be able to improve significantly not only the performance of the defect classifier but also of the overall process.

#### **5. Conclusions**

In this paper, we have presented the development of a two-step process for defect recognition and classification of aircraft structures. A dataset was created from real aircraft defects photographed in TUI's maintenance hangar. On the one hand, the scarcity of defects on aircraft made the creation of the dataset very challenging; on the other, the recognition of defects is crucial for the safety of the passengers and crew. To overcome this, we proposed a two-step method: first we recognize the defect, and then we classify it. This method has the advantage of using two different classifiers, one for defect recognition and one for defect classification. By splitting the process into these two steps, we improved the accuracy. First, the defect recognition model can be trained with more data, making it more accurate, so the most significant part of the task, finding the defect, is performed with higher accuracy. Second, the dedicated classifier can learn the differences between the defect types more effectively, as it does not have to learn any of the non-defect images.

The results of the first step had an accuracy of **81.82%**, which is quite high considering the small training dataset. In the second step, we achieved 100% accuracy for the defects of missing or damaged exterior paint and primer, and dents.

Although the results are promising, future work will focus on enlarging the defect dataset, especially by adding more images to the very small categories to rebalance it. In addition, the process will be combined with a UAV inspection for real-time recognition and classification.

**Author Contributions:** Conceptualization, N.P.A. and A.T.; Methodology, All Authors; Software, N.P.A.; Validation, M.D.; Resources, All Authors; Writing—Original Draft Preparation, N.P.A.; Writing—Review and Editing, N.P.A., A.T. and M.D.; Visualization, N.P.A.; Supervision, N.P.A. and A.T. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported and funded by the British Engineering and Physics Sciences Research Council (EPSRC IAA project).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

