*Article* **Towards Autonomous Bridge Inspection: Sensor Mounting Using Aerial Manipulators**

**Antun Ivanovic 1,\*, Lovro Markovic 1, Marko Car 1, Ivan Duvnjak 2 and Matko Orsag 1**


**Featured Application: The main idea of this paper was to deploy a team of unmanned aerial vehicles (UAVs) to attach a sensor to a bridge using a two-component adhesive in order to perform an inspection. Constant pressure must be applied for several minutes to form a bond between the two adhesive components. Therefore, one UAV sprays the colored component of the adhesive while the aerial manipulator transports the sensor, detects the contact point and attaches the sensor to it. A trajectory planning algorithm was developed around the dynamic model of the UAV and the manipulator attached to it, ensuring that the end-effector is parallel to the wall normal. Finally, the aerial manipulator achieves and maintains contact with a predefined force through an adaptive impedance control approach.**

**Abstract:** Periodic bridge inspections are required every several years to determine the state of a bridge. Most commonly, the inspection is performed using specialized trucks allowing human inspectors to review the conditions underneath the bridge, which requires a road closure. The aim of this paper was to use aerial manipulators to mount sensors on the bridge to collect the necessary data, thus eliminating the need for the road closure. To do so, a two-step approach is proposed: an unmanned aerial vehicle (UAV) equipped with a pressurized canister sprays the first glue component onto the target area; afterward, the aerial manipulator detects the precise location of the sprayed area, and mounts the required sensor coated with the second glue component. The visual detection is based on a Red-Green-Blue-Depth (RGB-D) sensor and provides the target position and orientation. A trajectory is then planned based on the detected contact point, and it is executed through adaptive impedance control, which is capable of achieving and maintaining a desired force reference. Such an approach allows for the two glue components to form a solid bond. The described pipeline is validated in a simulation environment, while the visual detection is tested in an experimental environment.

**Keywords:** aerial robotics; inspection and maintenance; aerial manipulation; multirotor control

#### **1. Introduction**

The world of unmanned aerial vehicles (UAVs) has been rapidly growing in recent years. As their design and control are perfected, these aerial vehicles have become more and more available. Nowadays, off-the-shelf ready-to-fly UAVs can be found and bought in shops, which makes them available to virtually anybody. This, in turn, has sparked a great deal of public interest in UAVs since their potential can be found in applications such as agriculture, various inspections (bridges, buildings, wind turbines), geodetic terrain mapping, the film industry, and even for hobby enthusiasts to fly and record videos from a first-person perspective. The vast majority of commercially available UAVs are equipped with a camera, while more specialized vehicles for terrain mapping or crop spraying offer a more diverse sensor suite.

**Citation:** Ivanovic, A.; Markovic, L.; Car, M.; Duvnjak, I.; Orsag, M. Towards Autonomous Bridge Inspection: Sensor Mounting Using Aerial Manipulators. *Appl. Sci.* **2021**, *11*, 8279. https://doi.org/ 10.3390/app11188279

Academic Editor: Alessandro Gasparetto

Received: 23 July 2021 Accepted: 1 September 2021 Published: 7 September 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

All of the aforementioned systems primarily observe and gather data about the environment, while having little to no ability to interact with and change the environment. One way to augment these vehicles for physical interaction is to attach a lightweight manipulator to their body, which is the main interest of the aerial manipulation field. Although such vehicles are more complex for both modeling and control, their benefit lies in performing versatile tasks that require interaction with the environment.

In general, there are three types of bridge inspections: periodic, special and damage inspections. Periodic bridge inspections differ from country to country according to national standards, and are usually performed at least once every two to three years. Special inspections are typically used to monitor the condition of deficient elements at specific locations based on predefined requirements. Damage inspections are usually performed after events that have occurred due to environmental impacts or human actions. The aim of a bridge inspection is to evaluate and assess structural safety and reliability. Current techniques are based on traditional visual inspection in combination with nondestructive testing (NDT) methods. Traditional visual inspection is performed by experienced, trained engineers using specialized trucks equipped with cranes and a basket that allow inspectors to review the conditions underneath the bridge. During the inspection, the engineers are equipped with various NDT [1] tools to detect construction faults and defects such as corrosion, cracks, voids, weakening connections, and concrete delamination. Some of these NDT methods require mounting small sensors to collect data, such as accelerometers, strain gauges, tilt meters and various transducers for acoustic or pressure measurements. Afterwards, the bridge is excited with vibrations, sound waves, tapping, etc., and the mounted sensors record responses to these specific excitations. Furthermore, there are usually requirements for performing measurements during the bridge inspection, such as the short- and long-term monitoring of vibrations, strains, displacements, etc. Overall, these inspections offer valuable information about the current bridge conditions, but there are a number of disadvantages. The use of trucks during inspections requires total or temporary road closures, which at the same time require safety measures to keep traffic flowing as freely as possible.
In addition, inspectors often encounter challenges in reaching all portions or elements in narrow areas, such as tight spaces between girders, beams and vaults. These challenges significantly increase the time and overall cost of the inspection. An aerial robot, with the potential to reach these challenging locations on the bridge, could significantly reduce the time and cost of these inspections and improve worker safety. Moreover, we note that the aforementioned sensors are relatively lightweight, which makes them suitable for transportation and mounting with an aerial robot.

#### *1.1. Concept*

We envision a team of robots working together to attach sensors to bridges and similar grade separation infrastructure. In theory, such a task could be accomplished with a single aerial robot, at the cost of a complex mechanical design. The proposed team shown in Figure 1 consists of two drones. One drone applies the adhesive material, and the other attaches sensors. We envision a two-stage process using two-component adhesives which form a solid bond from two separate reactive components: the "resin" and the "hardener". The first UAV applies the resin by spraying it onto the surface, while the second one attaches the sensor with the hardener already applied before the flight.

It is important to follow the prescribed ratio of the resin and the hardener to achieve the desired physical properties of the adhesive. Only when mixed together do the two components form the adhesive. The reaction typically begins immediately after the two components are mixed, and the bond strength depends both on maintaining the contact and on the viscosity of the mixed adhesive during the process. Manufacturers can control the cure rate to achieve various working times (worklife) until final bond strength is achieved, ranging from minutes to weeks. Resin bases are usually more viscous than their respective hardener and are generally applied by brush, roller, applicator or spray. In this work, we propose attaching a canister of pressurized resin to the UAV and spraying it through a nozzle onto the infrastructure surface. In this scenario, the spray needs to be softer and less turbulent to reduce the amount of material lost due to bouncing, and it must be colored for the detection in the second stage. Spraying with drones is not a novel concept [2,3], so without loss of generality, we will omit the details of this design and instead focus on detecting, navigating to, and sustaining contact with the sprayed surface.

In typical applications, the assemblies are kept in contact until sufficient bond strength is achieved. When fully cured, two-component adhesives are typically tough and rigid with good temperature and chemical resistance. We rely on the robotic arm attached to the second aerial vehicle to apply a controlled contact force between the sensor and the surface. Maintaining this fixed assembly contact through the impedance control system enables us to achieve a successful curing process and create a permanent bond between the sensor and the infrastructure. After the first UAV sprays the resin onto the surface, the second aerial robot locates the sprayed area and presses the sensor against it. Before takeoff, the surface of the sensor is brushed with the hardener. Once contact is made, it is maintained for the prescribed curing time, after which the aerial robot departs and leaves the sensor attached to the surface.

**Figure 1.** Two aerial robots working together to attach sensors to different parts of a bridge and similar grade separation infrastructure. The one on the left is used to spray the resin onto the surface, while the aerial robot on the right maintains contact to the surface with the sensor attached to its end-effector.

#### *1.2. Contributions*

This paper focuses on developing a method for mounting sensors on a bridge wall using an aerial manipulator. The first contribution is augmenting the model-based motion planning with the adaptive impedance controller. The motion planning method accounts for the underactuated nature of the multirotor UAV and corrects the end-effector configuration for an appropriate approach. This method also relies on a dexterity analysis which keeps the manipulator configuration within its optimal region, ensuring that the manipulator is never fully extended or contracted while mounting a sensor. The second contribution is the visual blob detection which locates and tracks the appropriate sensor mounting point. The blob detection has been experimentally verified in an indoor environment, yielding reliable and robust tracking of the mount location, as well as of the blob plane orientation. Finally, the third contribution is the simulation analysis of the system's performance, conducted for straight and inclined wall approaches. The simulation concentrates on testing the motion planning together with the impedance controller, performing a repeatability analysis and ensuring that the desired contact force is achieved.

#### **2. Related Work**

In the world of aerial inspections, a number of UAV-based solutions are being proposed by researchers. In [4], a technical survey for bridge inspections is given. Researchers in [5] present the project AERIAL COgnitive Integrated Multi-task Robotic System with Extended Operation Range and Safety (AERIAL-CORE), which focuses on power line inspection and maintenance, as well as installing bird diverters and line spacers. Most of these approaches rely on new technologies to make inspections faster and cheaper. Nowadays, UAVs use high-resolution cameras for visual inspections and employ point cloud methods based on digital photogrammetry [6], Light Detection And Ranging (LiDAR)-based methods [7], digital image correlation [8], etc. There are also reports on visual compensation during aerial grasping [9], aerial grasping in strong winds [10], and the development of a fully actuated aerial manipulator for performing inspections underneath a bridge [11]. According to the experimental testing of contact-based bridge inspections, there is a need to develop a solution for mounting application sensors (such as accelerometers, strain gauges and tilt meters) on a bridge using a UAV. It is expected that a sophisticated system with the possibility of automatic sensor mounting will increase the frequency of measurements without interrupting traffic, ensure the safety of inspectors, and reduce inspection time and overall costs.

As mentioned earlier, the second UAV needs to be aware of the position of the sprayed adhesive, which is applied in a blob-like pattern. For this purpose, a Red-Green-Blue-Depth (RGB-D) camera is used due to its favorable dimensions and weight. It provides image and depth information about the environment, which proves useful for object localization and UAV navigation. Such cameras were commonly found on UAVs and unmanned ground vehicles (UGVs) present at the recent MBZIRC 2020 competition. In [12,13], RGB-D information is used for color-based brick detection and localization for the wall-building challenge using UAVs and UGVs, respectively, while in [14] the authors use a Convolutional Neural Network (CNN)-based UAV detection and tracking method for the intruder UAV interception challenge. Furthermore, visual sensors proved useful in [15], where the authors performed a contact-based inspection of a flat surface with an aerial manipulator. The surface position and orientation were obtained by applying random sample consensus (RANSAC) on the RGB-D information. A thorough survey of 2D object detection methods from UAVs is given in [16]. In this paper, a modular framework for object detection is presented, in which a simple contour-based blob detector is implemented. The goal is to use RGB-D information to enable an autonomous inspection workflow. The blob position is obtained by segmenting the depth data at the points where the object is detected in the image, while RANSAC [17] is used to determine its orientation.
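As an illustration, the plane-orientation step can be sketched as a minimal RANSAC plane fit over the depth points segmented around the detected blob. The function below is a self-contained numpy sketch; the name, tolerance, and iteration count are illustrative choices, not the paper's implementation:

```python
import numpy as np

def ransac_plane(points, n_iters=200, tol=0.01, rng=None):
    """Fit a plane n . p = d to 3-D points with RANSAC.
    Returns (unit normal n, offset d, boolean inlier mask)."""
    rng = np.random.default_rng(0) if rng is None else rng
    best_inliers = np.zeros(len(points), dtype=bool)
    best_model = (np.array([0.0, 0.0, 1.0]), 0.0)
    for _ in range(n_iters):
        # Sample three points and form a candidate plane from them.
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-9:
            continue  # degenerate (near-collinear) sample
        n = n / norm
        d = n @ p0
        # Points within `tol` of the candidate plane are inliers.
        inliers = np.abs(points @ n - d) < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (n, d)
    return best_model[0], best_model[1], best_inliers
```

Applied to the segmented depth points, the returned unit normal gives the blob plane orientation, which defines the approach direction for the manipulator.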

After the successful detection of a blob-like pattern, it is necessary to attach the inspection sensor. The first phase of the sensor attachment is achieving contact, and the second is maintaining that contact to allow the two adhesive components to form a bond. Generally, the contact can be achieved with or without force measurements. In [18], contact with a wall is established and maintained. Researchers in [19] performed wall contact and aerial writing experiments. The work presented in [20] modeled and exploited the ceiling effect to perform an inspection underneath a bridge. The common denominator of the former approaches is maintaining the contact without any force feedback. Although mounting a force sensor on a UAV increases both mechanical and control complexity, an immediate benefit is the ability to maintain a precise contact force regardless of the environment. In [21], the researchers used a force/torque sensor to achieve compliant control while pulling a rope and a semi-flexible bar. A fully actuated UAV with a manipulator was employed in [22] to compare force feedback control with and without the force/torque sensor. Researchers in [23] used a single degree of freedom manipulator with a force sensor mounted at the end-effector to press an emergency switch.

Relying on the blob-like pattern detection and the impedance control, a trajectory for achieving contact is required to steer the aerial manipulator towards the contact point. While mounting the sensor, it is essential that the approach and contact are perpendicular to the wall plane. This can be considered as a task constraint imposed on the planner, which the aerial manipulator has to satisfy. Researchers in [24] propose a task-constrained planner for a redundant robotic manipulator that enables them to do everyday tasks such as opening drawers or picking up objects. In [25], a task-constrained planner was developed for underactuated manipulators. Since multirotor UAVs are typically underactuated systems, it is necessary to address dynamics and kinematics while planning the end-effector trajectory. Aerial manipulator 6D end-effector trajectory tracking based on the differential flatness principle was presented in [26]. The underactuated nature of multirotor UAVs can cause unexpected deviations in the end-effector configuration. Researchers in [27] address this particular problem by including the dynamic model of the system in the planning procedure. In our previous work [28], a trajectory planning method based on the full dynamic model of an aerial manipulator was developed. In this paper, we further augment this method to plan for the desired force required by the impedance controller.

#### **3. Mathematical Model**

In this section, the mathematical model of the aerial manipulator is presented. The coordinate systems convention is depicted in Figure 2. Furthermore, an analysis of the manipulator's dexterity and reach is presented.

#### *3.1. Kinematics*

The inertial frame is defined as $L_W$, while the body-fixed frame $L_B$ is attached to the center of gravity of the UAV. The position of the UAV in the world frame is given by $\mathbf{p}_W^B = [x\; y\; z]^T \in \mathbb{R}^3$, and the attitude vector is $\mathbf{\Theta} = [\varphi\; \theta\; \psi]^T$. Combining the position and attitude vectors defines the generalized coordinates of the UAV as $\mathbf{q}_B = [(\mathbf{p}_W^B)^T\; \mathbf{\Theta}^T]^T \in \mathbb{R}^6$. Written in matrix form, $T_W^B$ contains both the position and orientation of the UAV, obtained through on-board sensor fusion or through an external positioning system (e.g., GPS). The notation $T_a^b \in \mathbb{R}^{4 \times 4}$ is used to denote a homogeneous transformation matrix between frames $a$ and $b$.

A rigid attachment between the body of the UAV and the base of the manipulator $L_0$ is considered, denoted by the transformation matrix $T_B^0$. The manipulator used in this work is an $M = 3$ degree-of-freedom (DoF) serial chain with the end-effector attached to the last joint. The DH parameters of the arm are given in Table 1. Using this notation, one can write the transformation matrix $T_0^{ee}$ between the manipulator base and its end-effector as a function of the joint variables $q_1$, $q_2$ and $q_3$. For brevity, the expression for the entire matrix $T_0^{ee}$ is left out, and only the end-effector position and its approach vector are written, using the well-known abbreviations $\cos(q_1 + q_2) := C_{12}$ and $\sin(q_1 + q_2) := S_{12}$:

$$\mathbf{p}_0^{ee} = \begin{bmatrix} a_1(C_1 + C_{12}) + d_3 C_{123} \\ a_1(S_1 + S_{12}) + d_3 S_{123} \\ 0 \end{bmatrix}, \quad \mathbf{z}_0^{ee} = \begin{bmatrix} C_{123} \\ S_{123} \\ 0 \end{bmatrix} \tag{1}$$
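Equation (1) is straightforward to evaluate numerically. The sketch below computes $\mathbf{p}_0^{ee}$ and $\mathbf{z}_0^{ee}$ for given joint values; the link lengths `a1` and `d3` are placeholder values, not the paper's hardware:

```python
import numpy as np

def arm_fk(q1, q2, q3, a1=0.15, d3=0.1):
    """End-effector position p_0^ee and approach vector z_0^ee from
    Equation (1). Link lengths a1, d3 are illustrative defaults [m]."""
    c1, s1 = np.cos(q1), np.sin(q1)
    c12, s12 = np.cos(q1 + q2), np.sin(q1 + q2)
    c123, s123 = np.cos(q1 + q2 + q3), np.sin(q1 + q2 + q3)
    p = np.array([a1 * (c1 + c12) + d3 * c123,
                  a1 * (s1 + s12) + d3 * s123,
                  0.0])
    z = np.array([c123, s123, 0.0])
    return p, z
```

For example, `arm_fk(0, 0, 0)` places the fully extended arm along the base $x$-axis, with the approach vector $[1\; 0\; 0]^T$.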

**Table 1.** DH parameters of the 3-DoF manipulator attached to the UAV. A virtual joint $q_4^*$ is added to fully comply with the DH convention. Link sizes $a_1$ and $d_3$ are omitted for clarity.


**Figure 2.** Coordinate systems of the world, UAV and the 3-DoF manipulator.

Putting it all together, the full kinematic chain of the aerial manipulator can be constructed as

$$T_W^{ee} = T_W^B \cdot T_B^0 \cdot T_0^{ee}, \tag{2}$$

combining the fixed transformation $T_B^0$ with $T_W^B$ and $T_0^{ee}$, which depend on the UAV and manipulator motion, respectively. Since there is an obvious coupling between the motion of the body and the manipulator arm, a parameter $\beta \in [0, 1]$ is introduced to distribute the end-effector motion commands between the UAV global position control and the manipulator joint position control. To this end, the following distribution relationship is used:

$$
\begin{aligned}
\Delta \mathbf{P}_{\text{UAV}} &= \beta \cdot \Delta \mathbf{P} \\
\Delta \mathbf{P}_{\text{arm}} &= (1 - \beta) \cdot \Delta \mathbf{P},
\end{aligned}
\tag{3}
$$

where $\Delta \mathbf{P}$ is used to denote the desired aerial manipulator displacement, expressed as the following combination of body and arm motion:

$$
\begin{split}
\Delta \mathbf{P} &= \Delta \mathbf{P}_{\text{UAV}} + \Delta \mathbf{P}_{\text{arm}} \\ &= \beta \cdot \Delta \mathbf{P} + (1 - \beta) \cdot \Delta \mathbf{P}.
\end{split}
\tag{4}
$$

The manipulator displacement is denoted by Δ**P***arm* and the UAV displacement by Δ**P**UAV, where both Δ**P***arm* and Δ**P**UAV are expressed in the coordinate system *L*0. With *β* = 1, the UAV motion is used to control the position of the end-effector. When *β* = 0, the situation is reversed and the manipulator motion is used to move the end-effector. For every other *β*, the end-effector motion is obtained in part by the UAV body and the manipulator arm motion.
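The distribution of Equations (3) and (4) amounts to a simple scaling of the commanded displacement, sketched below (the function name is illustrative):

```python
import numpy as np

def distribute_motion(dP, beta):
    """Split a desired end-effector displacement dP (expressed in L_0)
    between the UAV body and the manipulator arm, per Equations (3)-(4)."""
    assert 0.0 <= beta <= 1.0
    dP = np.asarray(dP, dtype=float)
    dP_uav = beta * dP          # portion realized by UAV body motion
    dP_arm = (1.0 - beta) * dP  # portion realized by joint motion
    return dP_uav, dP_arm
```

By construction the two portions always sum back to $\Delta \mathbf{P}$, so the end-effector displacement is preserved for any choice of $\beta$.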

There are obvious advantages in combining the motion of the UAV and the manipulator arm. The UAV can move in 3D space beyond the reach of the arm; however, its motion is less precise and not dynamically decoupled from the arm. The kinematics of the arm enable the end-effector to attain the desired approach vector $\mathbf{z}_0^{ee} = [\cos(\delta)\; \sin(\delta)\; 0]^T$, which, under the hovering assumption, becomes equal to the global approach vector $\mathbf{z}_W^{ee}$ pointing towards the contact point on the infrastructure. A straightforward mathematical manipulation of Equation (1) allows for writing the constraint equation:

$$q\_3 = \delta - q\_1 - q\_2 \tag{5}$$

which ensures that the manipulator points in the right direction, where *δ* is the desired manipulator inclination in the body *x*–*z* plane.

To find the optimal manipulator pose during contact, the dexterity $\mathcal{D}$ and the reach $\mathcal{R}$ of the pose are taken into account, while keeping the joints as far as possible from their physical limits $\mathcal{L}$. Since the motion of the arm is constrained through its approach axis condition, a reduced Jacobian matrix $\mathbf{J} = \left[\frac{\partial \mathbf{p}_0^{ee}}{\partial q_1}, \frac{\partial \mathbf{p}_0^{ee}}{\partial q_2}\right]$ is used to derive the pose dexterity index $\mathcal{D} = \det(\mathbf{J}^T \cdot \mathbf{J})$ and determine how far the current pose is from the null space of the manipulator [29]. The reach of the pose $\mathcal{R} = (\mathbf{p}_0^{ee})^T \cdot \mathbf{p}_0^{ee}$ is also taken into account, since the goal is to keep the end-effector and the contact point as far away from the UAV body as possible. Finally, the following equation is defined:

$$\mathcal{L} = \frac{(q\_1^2 - Q\_{1\max}{}^2)(q\_2^2 - Q\_{2\max}{}^2)}{Q\_{1\max}{}^2 Q\_{2\max}{}^2},\tag{6}$$

to measure how far the given configuration (i.e., $q_1$, $q_2$) is from the joint limits $Q_{1max}$, $Q_{2max}$. Normalizing $\mathcal{D}$, $\mathcal{R}$ and $\mathcal{L}$ enables combining the three conditions into a single manifold $\mathcal{M} = \mathcal{D} \cdot \mathcal{R} \cdot \mathcal{L}$ and finding the optimal configuration $\mathbf{q}_\mathcal{M}^* = [q_1^*\; q_2^*\; q_3^*]^T$ for the desired approach angle $\delta$. The described method is depicted in Figure 3 for the specific case of the approach angle $\delta = 0^\circ$, but can be extended to any value of the approach angle through Equation (5).
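A minimal grid-search sketch of this optimization is given below. It assumes $\mathcal{D} = \det(\mathbf{J}^T\mathbf{J})$ as the scalar dexterity index, eliminates $q_3$ through Equation (5), and uses placeholder link lengths and joint limits, so the numbers are illustrative rather than the paper's:

```python
import numpy as np

A1, D3 = 0.15, 0.1      # placeholder link lengths [m]
Q1MAX = Q2MAX = 2.0     # placeholder joint limits [rad]

def manifold_optimum(delta=0.0, n=101):
    """Grid search of M = D * R * L over (q1, q2), with q3 = delta - q1 - q2.
    Returns the optimal configuration (q1*, q2*, q3*)."""
    q1 = np.linspace(-Q1MAX, Q1MAX, n)
    q2 = np.linspace(-Q2MAX, Q2MAX, n)
    Q1, Q2 = np.meshgrid(q1, q2, indexing="ij")
    s1, c1 = np.sin(Q1), np.cos(Q1)
    s12, c12 = np.sin(Q1 + Q2), np.cos(Q1 + Q2)
    # End-effector position under the approach constraint (Equations (1), (5)).
    px = A1 * (c1 + c12) + D3 * np.cos(delta)
    py = A1 * (s1 + s12) + D3 * np.sin(delta)
    # Reduced Jacobian columns dp/dq1, dp/dq2 (planar, z-row is zero).
    j1x, j1y = -A1 * (s1 + s12), A1 * (c1 + c12)
    j2x, j2y = -A1 * s12, A1 * c12
    D = (j1x * j2y - j1y * j2x) ** 2   # det(J^T J) = (2-D cross product)^2
    R = px ** 2 + py ** 2              # reach
    L = ((Q1 ** 2 - Q1MAX ** 2) * (Q2 ** 2 - Q2MAX ** 2)) / (Q1MAX ** 2 * Q2MAX ** 2)
    M = (D / D.max()) * (R / R.max()) * (L / L.max())
    i, j = np.unravel_index(np.argmax(M), M.shape)
    return q1[i], q2[j], delta - q1[i] - q2[j]
```

Because $\mathcal{D}$ vanishes for a straight arm and $\mathcal{L}$ vanishes at the joint limits, the maximizer is always an interior, non-singular configuration, matching the trade-off shown in Figure 3.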

As a side note, the manipulator attachment on the top of the UAV body was chosen to be able to reach surfaces underneath the bridge. Although this shifts the center of gravity upwards, the stability of the system is not compromised since the manipulator is constructed of lightweight materials.

**Figure 3.** The visual decomposition of dexterity $\mathcal{D}$, reach $\mathcal{R}$, limit $\mathcal{L}$ and the overall combined surface. This analysis is performed for $\delta = 0^\circ$: (**a**) the dexterity $\mathcal{D}$ surface shows the measure of how far the manipulator is from the null space. Values around zero are closer to the null space; (**b**) the reach $\mathcal{R}$ surface shows how far the end-effector can move in a certain configuration. This value tends towards zero as the arm approaches a folded configuration; (**c**) the limit $\mathcal{L}$ depicts how far a certain configuration is from the physical limits of the manipulator joints; and (**d**) the combined manifold $\mathcal{M}$ of the formerly described surfaces. Higher values offer a better trade-off between dexterity, reach and limit, defining the optimal manipulator configuration $\mathbf{q}_\mathcal{M}^*$.

#### *3.2. Dynamics*

The most complicated task of the aerial manipulator is attaching the sensor to a wall and maintaining the required force reference while the two-component adhesive hardens. To successfully perform such a task, the coupled UAV-manipulator system dynamics have to be addressed for the precise end-effector configuration planning.

Considering the UAV dynamics only, the derivative of the generalized coordinates can be defined as $\dot{\mathbf{q}}_B = [(\dot{\mathbf{p}}_W^B)^T\; (\omega_W^B)^T]^T \in \mathbb{R}^6$. Here, $\dot{\mathbf{p}}_W^B$ is the linear velocity of the body of the UAV in the world frame and $\omega_W^B$ represents the angular velocity of the UAV in the world frame. The UAV's propulsion system consists of $n_p$ propellers rigidly attached to the body. Each propeller produces a force and a torque along the $z_B$ axis. The vector of the propeller rotational velocities is simply defined as

$$
\mathbf{\Omega}_{\text{UAV}} = \begin{bmatrix} \Omega_1 & \dots & \Omega_{n_p} \end{bmatrix}^T \in \mathbb{R}^{n_p}. \tag{7}
$$

Force and torque produced by each propeller are non-linear functions depending on the rotational velocity **Ω**UAV. Rather than using the rotational velocities as control inputs, they can be mapped to a more convenient space. Namely, the mapped control input space can be written as

$$\mathbf{u}_{\text{UAV}} = \mathbf{K} \cdot \operatorname{diag}(\mathbf{\Omega}_{\text{UAV}}) \cdot \mathbf{\Omega}_{\text{UAV}}, \tag{8}$$

where $\mathbf{K} \in \mathbb{R}^{4 \times n_p}$ is the mapping matrix and $\mathbf{u}_{\text{UAV}} = [u_1\; u_2\; u_3\; u_4]^T$, where $u_4$ represents the net thrust and $u_1$, $u_2$ and $u_3$ are the moments around the body frame axes.
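For a concrete instance of Equation (8), consider a "+"-configured quadrotor with $n_p = 4$. The mapping matrix below uses placeholder geometry and thrust/drag coefficients (not the paper's vehicle); rows follow the ordering $[u_1\; u_2\; u_3\; u_4]^T$:

```python
import numpy as np

# Placeholder quadrotor parameters: thrust coeff, drag coeff, arm length.
kf, km, l = 8.5e-6, 1.4e-7, 0.25

# Columns correspond to the four rotors with alternating spin directions.
K = np.array([
    [0.0,    -l * kf,  0.0,    l * kf],   # u1: roll moment
    [l * kf,  0.0,    -l * kf, 0.0   ],   # u2: pitch moment
    [km,     -km,      km,    -km    ],   # u3: yaw moment
    [kf,      kf,      kf,     kf    ],   # u4: net thrust
])

def control_inputs(omega):
    """u_UAV = K . diag(Omega) . Omega, i.e. K applied to squared rotor speeds."""
    omega = np.asarray(omega, dtype=float)
    return K @ (omega * omega)
```

With equal rotor speeds the moment rows cancel and only the net thrust $u_4 = k_f \sum_i \Omega_i^2$ remains, which is the hover condition.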

As stated earlier, the manipulator consists of three rotational DoFs. Therefore, the joint positions of the manipulator are defined as $\mathbf{q}_M = [q_1\; q_2\; q_3]^T$. The rotational velocity of each joint is the time derivative of the joint positions, $\dot{\mathbf{q}}_M = d\mathbf{q}_M/dt$. The joint torques are considered the control inputs of the manipulator, $\mathbf{u}_M = [\tau_1\; \tau_2\; \tau_3]^T$.

The resulting generalized coordinates of the aerial manipulator can be written as $\mathbf{q} = [\mathbf{q}_B^T\; \mathbf{q}_M^T]^T \in \mathbb{R}^9$, and the velocities can be obtained in the same manner as $\dot{\mathbf{q}} = [\dot{\mathbf{q}}_B^T\; \dot{\mathbf{q}}_M^T]^T \in \mathbb{R}^9$. The resulting control inputs of the system can be expressed as $\mathbf{u} = [\mathbf{u}_{\text{UAV}}^T\; \mathbf{u}_M^T]^T \in \mathbb{R}^7$. Finally, the full system dynamics can be written as

$$\mathbf{M}(\mathbf{q})\ddot{\mathbf{q}} + \mathbf{c}(\mathbf{q}, \dot{\mathbf{q}}) + \mathbf{g}(\mathbf{q}) = \mathbf{u},\tag{9}$$

where $\mathbf{M}(\mathbf{q}) \in \mathbb{R}^{9 \times 9}$ is the inertia matrix, $\mathbf{c}(\mathbf{q}, \dot{\mathbf{q}}) \in \mathbb{R}^9$ is the vector of centrifugal and Coriolis forces, and $\mathbf{g}(\mathbf{q}) \in \mathbb{R}^9$ is the gravitational term.

#### **4. Control System**

The overall control of the aerial manipulator consists of several nested control loops. The complete controller overview, with motion planning and blob detection blocks, is depicted in Figure 4.

#### *4.1. Aerial Manipulator Control*

At the innermost level, the UAV is controlled through cascaded attitude and rate controllers. The input to these controllers is the desired orientation and, based on the measured state, the output is the vector of rotor angular velocities. The second level of control, which uses the inner attitude control loop, consists of two additional cascades: the position and the velocity control. These controllers receive a referent position and a velocity feed-forward value to generate the desired vehicle orientation and thrust. The manipulator joints are controlled through standard proportional-integral-derivative (PID) controllers; however, in a real-world setting, servo motors with integrated control are typically used.

As mentioned earlier, it is important to track the desired force after contact with a wall is achieved. To accomplish this, an adaptive impedance controller is employed to generate an appropriate setpoint for the position controller. This controller receives a trajectory supplied by the mission planner, which steers the aerial manipulator towards the sensor mounting target on the bridge.

**Figure 4.** The overall functional schematic of the system. The aerial manipulator control subsystem is responsible for controlling the position and attitude of the UAV, as well as the manipulator joints. On top of this controller, the adaptive impedance control is employed in order to track the desired force. Motion planning generates an appropriate trajectory based on the target point supplied by the blob detection algorithm.

#### *4.2. Adaptive Impedance Control*

The objective of the adaptive impedance controller is to ensure a stable physical interaction between the aerial manipulator and the environment [30]. As mentioned earlier, the standard UAV control scheme is based on position and attitude controllers. When interacting with the environment, the desired contact force must be considered. The position controlled system can be extended to follow the desired force by introducing an impedance filter. The design of such a filter is explained here for a single DoF.

The behavior of the system is defined by the target impedance as

$$e(t) = m(\ddot{x}_c(t) - \ddot{x}_r(t)) + b(\dot{x}_c(t) - \dot{x}_r(t)) + k(x_c(t) - x_r(t)),\tag{10}$$

where $m$, $b$ and $k$ are constants, $x_r(t)$ is the referent position, provided to the impedance filter as an input, and $x_c(t)$ is the output of the impedance filter representing the position command. The filter is designed as a linear second-order system with a dynamic relationship between the position and the contact force tracking error $e(t)$, so that it mimics a mass-spring-damper system. The contact force tracking error is defined as follows:

$$e(t) = f_r(t) - f(t),\tag{11}$$

where $f_r(t)$ is the other filter input defining the referent force, and $f(t)$ is the measured (exerted) contact force. If the environment is modeled as a first-order elastic system (an equivalent spring) with unknown stiffness $k_e$, the measured force can be approximated as

$$f(t) = k_e(x(t) - x_e(t)),\tag{12}$$

where $x(t)$ is the position of the manipulator and $x_e(t)$ is the position of the environment in an unexcited state. Substituting Equation (12) into Equation (11), the position of the aerial manipulator can be expressed as follows:

$$x(t) = \frac{f_r(t) - e(t)}{k_e} + x_e(t). \tag{13}$$

Assuming that the commanded position can be achieved by the aerial manipulator, i.e., *x* = *xc*, substituting Equation (13) into Equation (10) yields the steady-state description of the system:

$$e(t) = \frac{k\,k_e}{k + k_e} \left( \frac{f_r(t)}{k_e} + x_e(t) - x_r(t) \right). \tag{14}$$

For a contact force error of zero in the steady state, the following must hold:

$$x_r(t) = \frac{f_r(t)}{k_e} + x_e(t). \tag{15}$$

In other words, the position setpoint has to be designed in such a way that it compensates for the displacement of the environment due to the exerted contact force. To ensure this, a value of the unknown environment stiffness *ke* is needed. Furthermore, *ke* plays a fundamental role in the stability of the impedance filter Equation (10), which ultimately affects the stability of the aerial manipulator while in contact with the environment. A stable contact between the aerial manipulator and the environment can be ensured using the Hurwitz stability criterion, by designing the system with *b*/*m* > 0 and (*k* + *ke*)/*m* > 0. However, since *ke* is unknown, an adaptation law for the position setpoint that guarantees the contact stability while compensating for this hidden, unknown parameter is proposed.
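As a minimal illustration of Equations (10)–(15), the following single-DoF sketch simulates the impedance filter in contact with a spring environment and checks that the setpoint from Equation (15) drives the force error to zero. All numeric values are ours, and the environment stiffness is assumed known here (the adaptation law below removes exactly this assumption):

```python
# Single-DoF impedance filter (Eq. 10) in contact with a spring
# environment (Eq. 12). All numeric values are illustrative only.
m, b, k = 1.0, 8.0, 4.0   # target impedance parameters
k_e = 100.0               # environment stiffness (assumed known here)
x_e = 0.0                 # environment rest position
f_r = 5.0                 # referent contact force

# Position setpoint compensating the environment displacement (Eq. 15)
x_r = f_r / k_e + x_e

Ts = 0.001
x_c, xd_c = 0.0, 0.0      # filter state: command position and velocity
for _ in range(5000):
    f = k_e * max(x_c - x_e, 0.0)   # unilateral contact force
    e = f_r - f                     # force tracking error (Eq. 11)
    # Eq. (10) solved for the command acceleration, with a constant
    # reference (its velocity and acceleration are zero)
    xdd_c = (e - b * xd_c - k * (x_c - x_r)) / m
    xd_c += xdd_c * Ts              # explicit Euler integration
    x_c += xd_c * Ts

print(f"steady-state force: {k_e * (x_c - x_e):.3f} N (reference {f_r} N)")
```

With *b*/*m* > 0 and (*k* + *ke*)/*m* > 0, the closed loop is a damped second-order system, so the simulated force settles at the reference after an initial oscillation.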

The adaptation law is derived starting from Equation (15). An adaptive parameter *κ*(*t*) is introduced so that:

$$x_r(t) = \kappa(t)f_r(t) + x_e(t). \tag{16}$$

It can be shown using the Lyapunov stability analysis that the following adaptation dynamics equation for *κ*(*t*) will yield a stable system response:

$$k\dot{\kappa}(t) + b\ddot{\kappa}(t) + m\dddot{\kappa}(t) = -\gamma\sigma(t) + \gamma_d \dot{\sigma}(t). \tag{17}$$

Here, $\sigma(t) = p_1 e(t) + p_2 \dot{e}(t)$ is a weighted contact force error with positive parameters $p_1$ and $p_2$, while $\gamma$ and $\gamma_d$ are positive adaptation gains. We refer the interested reader to the proof in Appendix A.

#### **5. Motion Planning**

As discussed in Section 1.1, the main concept of this paper was to use a team of two UAVs, each applying one component of the adhesive. To apply the "resin" component, the UAV has to plan a collision-free trajectory and position itself in front of the target area to start spraying. This is fundamentally different from mounting a sensor coated with "hardener". In the latter case, apart from planning a collision-free trajectory, the manipulator-endowed UAV has to apply pressure for a certain amount of time for the two components to mix.

From the perspective of motion planning, the planner needs to be augmented to include a manipulator with three degrees of freedom, contact force and the weighing parameter *β*. To successfully maintain the pressure, the planner relies on the impedance controller described in Section 4.2. Furthermore, one of the requirements when mounting the sensor on the wall is for the sensor to be perpendicular to the wall. Therefore, it is necessary to take the underactuated nature of the multirotor UAVs into account during the motion planning. Namely, the errors in the planned end-effector configuration are mainly induced by the roll and pitch angles of the UAV while executing the planned motion. In our previous work [28], we developed a model-based motion planner for aerial manipulators that is capable of correcting the aforementioned end-effector deviations. In this paper, the idea from [28] was extended to consider the impedance control when obtaining the full state of the aerial manipulator.

#### *5.1. Waypoint Configuration*

When dealing with an aerial manipulator, exerting a contact force inevitably yields a high-dimensional waypoint configuration. We define a single waypoint as a set of UAV and joint poses, together with the force reference and motion distribution factor *β*:

$$\mathbf{w} = \begin{bmatrix} \mathbf{q}\_{\text{B}}^T & \mathbf{q}\_{\text{M}}^T & \mathbf{f}\_r^T & \boldsymbol{\beta} \end{bmatrix}^T \in \mathbb{R}^{13},\tag{18}$$

where $\mathbf{q}_B \in \mathbb{R}^6$ and $\mathbf{q}_M \in \mathbb{R}^3$ are the generalized coordinates of the UAV and the manipulator defined in Section 3.1. The force reference vector $\mathbf{f}_r = \begin{bmatrix} f_x & f_y & f_z \end{bmatrix}^T \in \mathbb{R}^3$ and the weighing scalar parameter *β* are required by the impedance controller. Furthermore, the impedance controller assumes a step change of these values. Ideally, the change should occur at the moment of contact since no force can be exerted without contact. Therefore, these values are only changed at the final waypoint.
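A waypoint per Equation (18) can be assembled as a plain 13-vector. The helper below is hypothetical, and the joint and force numbers are made up for illustration:

```python
import numpy as np

def make_waypoint(q_B, q_M, f_r, beta):
    """Stack a waypoint per Eq. (18): UAV pose q_B (6), manipulator
    joints q_M (3), force reference f_r (3), weighing factor beta (1)."""
    w = np.concatenate([q_B, q_M, f_r, [beta]])
    assert w.shape == (13,)
    return w

# Final waypoint: contact pose with a 5 N push along x and beta = 0.5;
# earlier waypoints would keep f_r = 0, so the step change happens at contact.
w_final = make_waypoint(np.zeros(6), [0.0, 1.2, -0.6], [5.0, 0.0, 0.0], 0.5)
```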

Apart from the desired force and the parameter *β*, the final waypoint must contain the UAV position and orientation, as well as the manipulator joint configuration. Specifying these values relies on the blob detection algorithm presented in Section 6. Namely, the algorithm outputs the position and orientation of the detected blob in the world frame. Following the manipulator dexterity and reach analysis described in Section 3.1, the optimal manipulator configuration $\mathbf{q}_M^*$ is obtained based on the provided plane normal. The optimal manipulator configuration is then used as the desired configuration for the final waypoint. This way, during operation, the manipulator never reaches a fully extended or contracted pose, which allows the impedance controller to command both the arm and the UAV to achieve and maintain the desired force.

#### *5.2. Trajectory Planning*

There are three phases in the trajectory planning procedure. First, an initial trajectory is planned based on the provided waypoints. Second, the initial trajectory is sent to a simulated model in order to obtain the full state of the aerial manipulator during the trajectory execution. Third, the end-effector configuration is corrected based on the full state of the vehicle, and the final trajectory is sent to the target aerial manipulator.

#### Initial Trajectory

To execute a smooth motion towards the desired waypoint, we use the time-optimal path parameterization by reachability analysis (TOPP-RA) trajectory planner [31]. The TOPP-RA algorithm searches for the time-optimal trajectory and is based on a "bang-bang" principle on the generalized torque of each DoF. The planner accepts input waypoints of an arbitrary dimension and outputs a smooth trajectory. Each DoF has to be provided with dynamical constraints in terms of velocity and acceleration, which are respected during the trajectory generation process.

As mentioned, the input to the TOPP-RA planner is a path given as a set of *n* ≥ 2 waypoints:

$$\mathcal{P} = \left\{ \mathbf{w}\_i \mid \mathbf{w}\_i \in \mathbb{R}^{13}, i \in (0, 1, \dots, n) \right\}. \tag{19}$$

Based on the dynamical constraints, the output of the TOPP-RA planner is a sampled trajectory:

$$\mathcal{R}_s = \left\{ \mathbf{t}(kT_s) \mid \mathbf{t}(kT_s) \in \mathbb{R}^{3 \times 13},\ k \in (0, \dots, n_t) \right\},\tag{20}$$

where $\mathbf{t} = \begin{bmatrix} \mathbf{w}^T \\ \dot{\mathbf{w}}^T \\ \ddot{\mathbf{w}}^T \end{bmatrix} \in \mathbb{R}^{3\times 13}$ is a single sampled trajectory point consisting of position, velocity and acceleration; $T_s$ is the sampling time; and $n_t$ is the number of points in the sampled trajectory. Note that each trajectory point contains both roll and pitch angles. Although these angles can be planned through the TOPP-RA algorithm, they are omitted at this point because of the underactuated nature of the multirotor UAV. Nevertheless, they are used later in the paper when the model corrections are applied.

The impedance controller expects a step change in the force and weighing parameter *β* referent values. To satisfy this requirement, large velocity and acceleration constraints are imposed for these DoFs. However, because the other DoFs have constraints below their physical limits, the jointly optimized force and *β* trajectories end up with a slower, dynamically smooth profile. These profiles also exhibit overshoots and undershoots, which are unacceptable since *β* must stay within hard bounds. To tackle this problem, a simple piecewise constant velocity interpolation is applied to the force and *β* references instead. This way, a large velocity constraint produces a step change, which is a suitable input to the impedance controller. A visual example of the difference between the TOPP-RA and piecewise constant velocity interpolation is depicted in Figure 5.

**Figure 5.** Visual comparison between TOPP-RA and the piecewise constant velocity interpolation. Waypoints for both trajectories are kept the same, and around *t* = 2.5 s, the middle waypoint (yellow cross) is reached. Although the dynamical constraints are the same, TOPP-RA takes other degrees of freedom into account and produces a trajectory with overshoot, which is not suitable for the parameter *β*.
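The piecewise constant velocity interpolation used for the force and *β* references can be sketched as follows. This is a minimal illustration; the function name and sampling values are ours:

```python
import numpy as np

def constant_velocity_profile(waypoints, v_max, Ts):
    """Piecewise constant-velocity interpolation between scalar waypoints.

    Each segment is a linear ramp bounded by v_max; unlike a jointly
    optimized TOPP-RA profile, it can never overshoot a waypoint, and a
    very large v_max collapses the ramp into the step change the
    impedance controller expects.
    """
    samples = [float(waypoints[0])]
    for w0, w1 in zip(waypoints, waypoints[1:]):
        # number of samples so that each step stays below v_max * Ts
        n = max(1, int(np.ceil(abs(w1 - w0) / (v_max * Ts))))
        samples.extend(np.linspace(w0, w1, n + 1)[1:])
    return np.asarray(samples)

# beta held at 0 and stepped to 0.5 at the final (contact) waypoint
beta_ref = constant_velocity_profile([0.0, 0.0, 0.5], v_max=1000.0, Ts=0.01)
```

With a large velocity bound the resulting reference is monotone and reaches the waypoint exactly, with no overshoot.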

#### *5.3. Model-Based Corrections*

The initial trajectory from Equation (20) is planned without any consideration of the underactuated nature of the multirotor UAV. To obtain the unknowns, namely the roll and pitch angles, the initial trajectory can be executed in a simulation environment. The chosen simulation environment is, in our case, *Gazebo*, because it is realistic and supports the robot operating system (ROS), which is the backbone of our implementation. The simulated aerial manipulator is based on the mathematical model described in Section 3. The standard cascade PID controllers are employed for low-level attitude and high-level position control. The impedance controller is built on top of the position controller and provides a position reference based on the input trajectory. More details about the simulation environment are provided in Section 7.

The first step is executing the initial trajectory in the aforementioned simulation environment. While executing, the roll and pitch angles are recorded as they are needed for obtaining the full state of the UAV. Rearranging Equation (2) and plugging the unknown roll and pitch angles in the full state of the UAV, the transform of the end-effector in the manipulator base frame can be obtained:

$$T_{L_0}^{ee} = (T_{B}^{L_0})^{-1} \cdot (T_{W}^{B})^{-1} \cdot T_{W}^{ee}.\tag{21}$$

Using the inverse kinematics of the manipulator, the joint values $\mathbf{q}_M$ for the desired end-effector configuration are obtained. This way, the null space of the aerial manipulator is used for the end-effector correction. Note that due to the configuration of the manipulator, an exact solution of the inverse kinematics will not always exist. In such a case, the closest approximate solution is used instead.

The final trajectory is constructed by replacing the initial **q**<sup>M</sup> with the corrected values. This trajectory is afterwards sent to the target aerial manipulator.
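The transform chain of Equation (21) amounts to composing homogeneous matrices. A small sketch with identity rotations and made-up offsets (the frame-naming convention below is our assumption):

```python
import numpy as np

def hom(R, p):
    """Homogeneous transform from rotation matrix R and translation p."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = p
    return T

# Illustrative poses: UAV body in world, manipulator base in body,
# desired end-effector in world (identity rotations for readability)
T_W_B = hom(np.eye(3), [1.0, 0.0, 2.0])
T_B_L0 = hom(np.eye(3), [0.0, 0.0, 0.2])
T_W_ee = hom(np.eye(3), [1.5, 0.0, 2.2])

# Eq. (21): end-effector expressed in the manipulator base frame L0,
# which is the input to the manipulator inverse kinematics
T_L0_ee = np.linalg.inv(T_B_L0) @ np.linalg.inv(T_W_B) @ T_W_ee
```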

The careful reader should note that the developed three DoF manipulator operates on the *x* and *z* position in the body frame, as well as the pitch angle. This allows the impedance controller to maintain the orientation perpendicular to the wall, while compensating for the UAV body motion in the *x* and *z* axes. However, the system will experience disturbances and control errors which will act on the roll and pitch angle, and the lateral movement along the body *y* axis. We can address these issues either with mechanical dampers or by adding additional degrees of freedom to the manipulator, which will be explored in future work.

#### **6. Blob Detection**

This section presents the methods we propose to detect the hardener blob position and orientation. A modular object detection framework, as shown in Figure 6, is designed to ensure a reliable blob pose detection. Since the detection is to be done on board the UAVs, RGB-D cameras are selected. Therefore, the inputs to the framework are images and organized point clouds obtained from the visual sensor. The remainder of this section introduces the individual components of the framework and adds implementation details where necessary.

**Figure 6.** The pipeline for the modular object detection framework. Inputs to the system are an arbitrary number of sensor messages (image, depth, point cloud, etc.) along with sensor–world frame transformations. Output is the detected blob pose in the world frame. The synchronizer and detector are modular components, while the linear Kalman filter, world transformation and pose tracker stay invariant.

The sensor message synchronizer is responsible for the time-based synchronization of the given sensor message streams. In the case of blob detection, a module that synchronizes images and organized point clouds from an RGB-D camera is derived. This is necessary since the algorithm detects the blob in both the 2D image space and the 3D point cloud, which are not necessarily sampled simultaneously. The underlying implementation uses ROS libraries to synchronize messages with an approximate time policy.

An object detector attempts to find a set of object poses using synchronized sensor data. The module used in this paper detects blob poses and is implemented in the following way. First, all the blob positions and radii are found in the image frame using the standard blob detection functionality found in the OpenCV libraries. Second, the depth information corresponding to the detected blobs is isolated from the organized point cloud. Finally, blob positions are calculated as centroids of the corresponding depth positions, while the orientation is obtained through the random sample consensus (RANSAC) algorithm from the Point Cloud Library (PCL).
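The position and orientation steps can be sketched as follows. Note this is a simplified stand-in: a least-squares (SVD) plane fit replaces the PCL RANSAC step used in the paper, which additionally rejects depth outliers:

```python
import numpy as np

def blob_pose(points):
    """Blob position as the centroid of its depth points; orientation as
    the best-fit plane normal (right singular vector belonging to the
    smallest singular value). RANSAC would fit the plane robustly."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    return centroid, vt[-1]

# Synthetic depth points of a flat blob lying in the z = 1.0 plane
rng = np.random.default_rng(0)
xy = rng.uniform(-0.05, 0.05, size=(200, 2))
pts = np.column_stack([xy, np.full(200, 1.0)])
centroid, normal = blob_pose(pts)
```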

The remaining framework components are independent from synchronizer and detector modules. The pose tracker is used to track the obtained object's poses through multiple frames based on the closest Euclidean distance criterion. This component solves the issue of multiple objects being visible, as it always outputs the pose of the currently tracked object. Moreover, it increases the robustness of the system since it remembers the object poses for a certain number of frames, which allows some leniency with the detector.

The goal of the world transformation component is to transform the tracked pose from the sensor to the world frame using the estimated odometry from an external source that any UAV should have access to. Additionally, since the blob poses are to be sent as references to the trajectory planner, it is important to correctly compute the blob orientation. Since the blob is a flat surface, there are two equally correct possible orientations that can be detected. Therefore, the blob orientation is chosen as follows:

$$R_{\rm blob} = \begin{cases} R_{\rm blob} & \text{if } \mathbf{r}_{1B} \cdot \mathbf{r}_{1\rm blob} \ge 0 \\ R_{\rm blob} \cdot R_{180} & \text{otherwise} \end{cases},\tag{22}$$

where $\mathbf{r}_{1B}$ is the heading component of the UAV rotation matrix expressed in world coordinates, $R_B = \begin{bmatrix} \mathbf{r}_{1B} & \mathbf{r}_{2B} & \mathbf{r}_{3B} \end{bmatrix}$; $\mathbf{r}_{1\rm blob}$ is the heading component of the blob rotation matrix expressed in world coordinates, $R_{\rm blob} = \begin{bmatrix} \mathbf{r}_{1\rm blob} & \mathbf{r}_{2\rm blob} & \mathbf{r}_{3\rm blob} \end{bmatrix}$; and $R_{180} = \mathrm{diag}(-1, -1, 1)$.
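The disambiguation in Equation (22) reduces to a sign check. The helper name is ours, and we assume the rotation matrix columns are ordered heading-first with the blob normal along the local *z* axis:

```python
import numpy as np

def disambiguate_blob_orientation(R_blob, R_B):
    """Eq. (22): keep the detected orientation if its heading column
    agrees with the UAV heading; otherwise rotate 180 degrees about the
    blob normal (assumed to be the local z axis)."""
    R_180 = np.diag([-1.0, -1.0, 1.0])
    if R_B[:, 0] @ R_blob[:, 0] >= 0:
        return R_blob
    return R_blob @ R_180

# A blob detected facing away from the UAV heading gets flipped
R_fixed = disambiguate_blob_orientation(np.diag([-1.0, -1.0, 1.0]), np.eye(3))
```

Right-multiplying by $R_{180}$ negates the first two columns, so the result is still a proper rotation whose heading now agrees with the UAV's.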

Finally, a linear Kalman filter with a constant velocity model is used to further increase the robustness of the system and provide smoother blob position estimates. The constant velocity model for each axis is given as follows:

$$\mathbf{x}_{k+1} = F_k \mathbf{x}_k + \mathbf{w}_k, \qquad F_k = \begin{bmatrix} 1 & T_s \\ 0 & 1 \end{bmatrix},\tag{23}$$

where $T_s$ is the discretization step, $\mathbf{x}_k \in \mathbb{R}^2$ is the state vector containing the position and velocity along the corresponding axis, and $\mathbf{w}_k \in \mathbb{R}^2$ is the process noise. The observation model along a single axis is given as follows:

$$z_k = H_k \mathbf{x}_k + v_k, \qquad H_k = \begin{bmatrix} 1 & 0 \end{bmatrix},\tag{24}$$

where $z_k \in \mathbb{R}$ is the position observation along the corresponding axis and $v_k \in \mathbb{R}$ is the measurement noise.

If the detector is unable to provide measurements and the pose tracker removes the pose from the tracking set, the linear Kalman filter is still able to provide blob position estimates.
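A minimal per-axis implementation of Equations (23) and (24), including the prediction-only behavior during detector dropouts, might look like this. The noise intensities and frame rate are illustrative:

```python
import numpy as np

def kf_cv_axis(measurements, Ts=1 / 30, q=1e-3, r=1e-2):
    """Linear Kalman filter with a constant-velocity model (Eqs. 23-24)
    for one blob axis. A None entry marks a frame without a detection,
    in which case only the prediction step runs."""
    F = np.array([[1.0, Ts], [0.0, 1.0]])
    H = np.array([[1.0, 0.0]])
    Q = q * np.eye(2)
    R = np.array([[r]])
    x = np.zeros((2, 1))   # position and velocity estimate
    P = np.eye(2)          # estimate covariance
    positions = []
    for z in measurements:
        x = F @ x                      # predict
        P = F @ P @ F.T + Q
        if z is not None:              # update only when a blob is seen
            S = H @ P @ H.T + R
            K = P @ H.T @ np.linalg.inv(S)
            x = x + K @ (np.array([[z]]) - H @ x)
            P = (np.eye(2) - K @ H) @ P
        positions.append(float(x[0, 0]))
    return positions

# Stationary blob with a three-frame detection dropout in the middle
est = kf_cv_axis([1.0] * 10 + [None] * 3 + [1.0] * 10)
```

During the dropout, the filter keeps extrapolating with the last velocity estimate, which is exactly the fallback behavior described above.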

Experimental validation of the described methods is performed in an indoor Optitrack environment with an Intel Realsense D435 RGB-D camera. To ensure ground truth is available for detection validation, reflective markers are attached to both the camera and the blob. In order to determine the transformation between the camera optical frame and the reflective markers attached to the camera, an optimization-based calibration approach is used as described in [32].

Results are shown in Figures 7 and 8. The experiments were performed with the UAV in constant motion while looking in the general direction of the painted blob. Figure 7 shows the relative difference between the ground truth UAV motion in the world frame and the UAV motion as observed from the detected blob frame. Figure 8 presents the comparison of the ground truth and detected blob positions expressed in the world frame. It is important to note that camera calibration errors can manifest themselves as static offsets between the detected and ground truth blob positions in Figure 8. However, in this case, the visual detection provided reliable blob tracking results, which is a direct consequence of careful camera calibration.

**Figure 7.** Comparison of the normalized motion of the UAV body frame as observed from the Optitrack world frame and from the detected blob frame, labeled $\mathbf{p}_B^W[k] - \mathbf{p}_B^W[0]$ and $\mathbf{p}_B^{\rm blob}[k] - \mathbf{p}_B^{\rm blob}[0]$, respectively.

**Figure 8.** This figure shows the comparison of the ground truth blob position and the blob position estimate expressed in the Optitrack world frame. The root mean squared error across all three axes is 4.6 cm.

#### **7. Simulation**

The environment used for simulating the UAV and manipulator dynamics, as well as the contact with the environment, is the widely accepted Gazebo simulator. It is realistic and highly modular, with a large community and support for the robot operating system (ROS), which is also the primary implementation environment for the impedance control, motion planning and blob detection. Through ROS, Gazebo offers a large variety of plugins realistically simulating various sensors and actuators. All simulations were conducted on the Linux Ubuntu 18.04 operating system with the ROS Melodic middleware installed.

The UAV is modeled as a single rigid body with *np* propellers mounted at the end of each arm. As propulsion units, these propellers generate thrust along the *z* axis of the UAV body. To simulate the propeller dynamics, the rotors\_simulator package is used. It contains a plugin that models thrust based on the user-provided propeller parameters [33]. Furthermore, to obtain the UAV attitude and position, IMU and odometry plugins are mounted on the vehicle. The manipulator was mounted on the body of the UAV and consists of three joints connected with links. A rod type tool is mounted as the end-effector, with a force-torque sensor required by the impedance controller. Furthermore, a monocular camera with an infrared projector is also mounted for the blob detection.

#### *7.1. End-Effector Motion Distribution Analysis*

Given some end-effector configuration, the inverse kinematics is responsible for finding the UAV position and yaw angle, as well as the manipulator joint values that satisfy the desired configuration. The parameter *β* from Equation (3) defines a ratio of how much the manipulator joints and UAV position and orientation contribute to achieving the desired end-effector configuration, as described in Section 3.1. Recalling the values, *β* = 1 only moves the UAV in the direction of the desired end-effector configuration; and *β* = 0 uses the inverse kinematics of the manipulator to achieve the desired configuration.
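The role of *β* can be illustrated with a toy splitting rule. This is our simplification: the paper resolves the actual motion distribution through the inverse kinematics of Equation (3), whereas here the displacement is shared linearly:

```python
import numpy as np

def distribute_motion(delta_ee, beta):
    """Toy split of a desired end-effector displacement: beta = 1 moves
    only the UAV body, beta = 0 moves only the manipulator, and
    intermediate values share the motion between the two."""
    return beta * delta_ee, (1.0 - beta) * delta_ee

delta = np.array([0.10, 0.00, 0.05])       # desired end-effector shift [m]
d_uav, d_arm = distribute_motion(delta, beta=0.5)
```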

To determine the influence of *β* on the overall system, an analysis was conducted with different *β* values. The desired end-effector configuration was chosen to be in contact with a plane perpendicular to the ground, which requires the force reference along the *x* axis. The waypoints for the trajectory planner were kept the same across all trials, and only *β* was changed. The results of this analysis are depicted in Figure 9. As can be observed, all trials produced very similar results, with the force oscillating upon contact and eventually reaching the desired reference, providing no obvious conclusion regarding how to select the optimal *β*. However, following the dexterity analysis from Section 3.1, relying solely on the manipulator motion might drive the manipulator close to its joint limits. On the other hand, the motion of the UAV induces disturbances in the end-effector pose control, which the manipulator then has to compensate. Taking all of the aforementioned into account, the value *β* = 0.5 is chosen so that both the manipulator and the UAV are simultaneously used to maintain a steady contact force.

**Figure 9.** Force response comparison for different values of the parameter *β*. The analysis was conducted on a plane perpendicular to the ground where the force reference along the *x* axis is required.

#### *7.2. Bridge Sensor Mounting*

Since the concept of this paper was to mount inspection sensors on a bridge, the simulation trials were designed accordingly. After spraying the first component, it is necessary to achieve and maintain a stable contact while the second adhesive component on the sensor cures. Since the manipulator is attached above the propellers, the workspace of the manipulator limits contact to surfaces above the UAV or on a plane perpendicular to the ground.

Naturally, the first set of simulation trials was conducted by holding the desired force on a plane perpendicular to the ground. In this case, the contact force acts only along the *x* axis, and the response is depicted in Figure 10. The time delay between the planned and executed contact is present due to the impedance filter, which slows down the dynamics of the referent trajectory. After the initial contact, there are some oscillations and an overshoot, which diminish over time until the desired force reference is achieved.

**Figure 10.** Force response in the case of a contact plane perpendicular to the ground, *δ* = 0◦.

The second set of simulation trials included an inclined contact plane. This requires the UAV to approach from below the plane and achieve contact perpendicular to it. Since the plane is inclined at *δ* = 68◦, the planned force referent values have components along both the *x* and *z* axes, as shown in Figure 11. Similarly to the previous example, the force response has some oscillations around the instant of contact, but it eventually settles at the desired force reference.

**Figure 11.** Force response in the case of a contact plane inclined at *δ* = 68◦.

The simulation tests for *δ* = 0◦ and *δ* = 68◦ were performed *n* = 10 times for each case, as depicted in Figure 12. The left portion of the figure shows the dot product between the blob normal $\mathbf{r}_t$ and the end-effector orientation vector $\mathbf{r}_{ee}$. If $\mathbf{r}_t \cdot \mathbf{r}_{ee} = 1$, the two vectors are parallel, which indicates a successful approach. For both angles, the dot product is very close to 1 and the orientation error is negligible. On the right, the distance between the center of the target and the contact point is shown. In both cases, the error distance is less than 0.1 m, which ensures relatively high precision of the sensor mounting, well within the margins for bridge inspection. The accompanying video of the simulation tests can be found on our YouTube channel [34].

**Figure 12.** (**Left**): box and whiskers plot of the dot product between the blob plane normal vector **r***t* and the end-effector orientation vector **r***ee*; and (**Right**): box and whiskers plot of the distance from the target point after contact.

#### **8. Conclusions**

This paper presents a step towards autonomous bridge inspection by investigating the possibility of mounting various inspection sensors using an aerial manipulator. Currently, inspectors use specialized trucks with cranes and baskets in order to access the area underneath the bridge. This inevitably leads to road closure, which poses an inconvenience for both inspectors and traffic. To alleviate this problem, the aforementioned aerial manipulators can be used to access difficult-to-reach areas of the bridge. As mounting sensors requires forming a bond between the wall and the sensor, we envision using a two-component adhesive with a short cure time. Since the aerial manipulator has to achieve and maintain contact at the sensor mount point, short cure times are desirable because of the limited flight time of these platforms. Nevertheless, current flight times of outdoor multirotors reach up to 30 min, which provides enough time for the two adhesive components to form the bond.

Although preliminary, the results of this paper seem promising. The visual detection was extensively tested and reliably tracks the blob position. The adaptive impedance controller is capable of maintaining the required force. Even though there are some oscillations and settling times in the force response, in practical use, it does not make much difference since the curing time of the adhesive is at least several minutes. The trajectory planner was augmented to plan in the force space which allows for setting the force reference step change before the contact. The simulation results show the high repeatability of the overall system which gives us the confidence to perform experiments in a real-world environment.

Our first step in future work will be to perform experiments in a controlled laboratory environment. The outdoor environment poses a different set of challenges, including a lower-accuracy positioning system and unpredictable disturbances, e.g., wind gusts. Since these factors will inevitably reflect on the overall end-effector accuracy, we are looking into augmenting the manipulator to compensate for lateral movements, as well as the roll and yaw angles. To further increase the system's accuracy, the developed visual tracker will be used to provide feedback around the tracked blob on the bridge wall in real-world experiments.

**Author Contributions:** All authors contributed equally to this work. Conceptualization, M.O. and I.D.; methodology, A.I., L.M. and M.C.; software, A.I., L.M. and M.C.; investigation, A.I.; writing original draft preparation, A.I.; writing—review and editing, L.M., I.D. and M.O.; supervision, I.D. and M.O.; project administration, I.D. and M.O.; funding acquisition I.D. and M.O. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by the European Commission Horizon 2020 Programme through the project under G.A. number 820434, named Energy-Aware BIM Cloud Platform in a Cost-Effective Building Renovation Context—ENCORE. Furthermore, this research was part of the scientific project Autonomous System for Assessment and Prediction of Infrastructure Integrity (ASAP), financed by the European Union through the European Regional Development Fund—The Competitiveness and Cohesion Operational Programme (KK.01.1.1.04.0041).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

Here, we show a detailed proof that the adaptation law in Equation (17) is stable. Assuming that the referent force *fr*(*t*) is constant, *fr*(*t*) = *Fr*, the time derivatives of Equation (16) are:

$$\begin{aligned} \dot{x}_r(t) &= \dot{\kappa}(t)F_r + \dot{x}_e(t), \\ \ddot{x}_r(t) &= \ddot{\kappa}(t)F_r + \ddot{x}_e(t), \end{aligned} \tag{A1}$$

while the derivatives of Equation (13) yield:

$$\begin{aligned} \dot{x}(t) &= \frac{-\dot{e}(t)}{k_e} + \dot{x}_e(t), \\ \ddot{x}(t) &= \frac{-\ddot{e}(t)}{k_e} + \ddot{x}_e(t). \end{aligned} \tag{A2}$$

By substituting Equations (A1) and (A2) into Equation (10), the dynamics of the contact force error can be obtained as

$$m\ddot{e}(t) + b\dot{e}(t) + (k + k_e)e(t) = g(t),\tag{A3}$$

where:

$$g(t) = F_r\left[k(1 - k_e \kappa(t)) - k_e(b\dot{\kappa}(t) + m\ddot{\kappa}(t))\right],\tag{A4}$$

and *x* = *xc*. The adaptation law thus determines the dynamics of the adaptation parameter *κ*(*t*) and, in turn, the dynamics of the contact force error. Formally, the adaptation law should enforce *g*(*t*) → *g*∗ such that *e*(*t*) → 0 and *κ*(*t*) → 1/*ke*.

For the Lyapunov candidate:

$$V(t) = \frac{1}{2} \left[ p_1 e^2(t) + p_2 \dot{e}^2(t) \right] + \frac{1}{2\gamma}\left[g(t) - g^*\right]^2,\tag{A5}$$

with $p_1$, $p_2$ and $\gamma$ as positive parameters, the condition $\dot{V}(t) \le 0$ yields:

$$\frac{2}{\gamma}g(t)\dot{g}(t) + 2g(t)\left[p_1 e(t) + p_2\dot{e}(t)\right] - \frac{2}{\gamma}\dot{g}(t)g^* \le 0. \tag{A6}$$

After reordering, we obtain:

$$\dot{g}(t)\left[g(t) - g^*\right] \le -\gamma \sigma(t) g(t),\tag{A7}$$

where $\sigma(t) = p_1 e(t) + p_2 \dot{e}(t)$. By choosing:

$$\dot{g}(t) = -\gamma \sigma(t) + \gamma_d \dot{\sigma}(t),\tag{A8}$$

where *γd* is a positive constant, the Lyapunov condition in Equation (A7) becomes:

$$g(t) \le g^*,\tag{A9}$$

i.e., for the adaptation law to be stable, *g*(*t*) should be bounded. Since *xr*, *x*˙*<sup>r</sup>* and *x*¨*<sup>r</sup>* are bounded, so are *e*, *e*˙ and *e*¨. Therefore, *g*(*t*) is also bounded, i.e., the condition in Equation (A9) is satisfied. The adaptation law is finally obtained by taking the derivative of Equation (A3) and substituting *g*˙(*t*) with Equation (A8), which yields Equation (17). The parameters *γ* and *γ<sup>d</sup>* dictate the adaptation dynamics. Based on the measured contact force error, the adaptation law in Equation (17) estimates the adaptation parameter *κ* (the reciprocal value of the environment stiffness), which is then used in Equation (16) for calculating the referent position *xr*.

#### **References**


### *Article* **Aerial Robotic Solution for Detailed Inspection of Viaducts**

**Rafael Caballero 1,\*, Jesús Parra 1, Miguel Ángel Trujillo 1,\*, Francisco J. Pérez-Grau 1, Antidio Viguria <sup>1</sup> and Aníbal Ollero <sup>2</sup>**

<sup>1</sup> Advanced Center for Aerospace Technologies (FADA-CATEC), 41309 Sevilla, Spain;

jparra@catec.aero (J.P.); fjperez@catec.aero (F.J.P.-G.); aviguria@catec.aero (A.V.)

<sup>2</sup> GRVC Robotics Labs, University of Seville, 41092 Sevilla, Spain; aollero@us.es

**\*** Correspondence: rcaballero@catec.aero (R.C.); matrujillo@catec.aero (M.Á.T.)

**Abstract:** The inspection of public infrastructure, such as viaducts and bridges, is crucial for their proper maintenance given the heavy use of many of them. Current inspection techniques are very costly and manual, requiring highly qualified personnel and involving many risks. This article presents a novel solution for the detailed inspection of viaducts using aerial robotic platforms. The system provides a highly automated visual inspection platform that does not rely on GPS and could even fly underneath the infrastructure. Unlike commercially available solutions, our system automatically references the inspection to a global coordinate system usable throughout the lifespan of the infrastructure. In addition, the system includes another aerial platform with a robotic arm to make contact inspections of detected defects, thus providing information that cannot be obtained only with images. Both aerial robotic platforms feature flexibility in the choice of camera or contact measurement sensors as the situation requires. The system was validated by performing inspection flights on real viaducts.

**Keywords:** inspection; maintenance; UAV; aerial robotics; aerial robotic manipulation; viaduct; LIDAR; photogrammetry; contact

#### **1. Introduction**

The inspection of viaducts and bridges is a very time-consuming and resource-intensive activity. It requires heavy involvement from highly qualified and specifically trained personnel. Additionally, these inspections pose health and safety risks, mainly derived from working at height and the difficulty of the operation. Current inspection methodologies involve the use of climbing operators who, by means of ropes, hang from the structure and perform the measurements required by the inspectors to evaluate its current state (see Figure 1a). These works present many potential accident risks due to the difficulty and technical level required to access certain complicated areas at height, the possible physical fatigue of the workers, human errors in the safety of the operation, or even problems with the use of specific measuring tools.

An alternative method is the use of heavy machinery, like cherry pickers, truck-mounted lifts, and cranes (see Figure 1b). This machinery requires additional specialized personnel to operate it and perform inspections, and it does not eliminate the problem of exposing people to work at height. The surfaces to be inspected are usually located at a high altitude, beneath which several types of obstacles may be found, such as vehicle or train traffic, water flows, or rough terrain. This means that the machinery has to be used from the top surface of the structure, interrupting its service and increasing the operational costs of the inspection.

Currently, the highly qualified staff required for inspections is normally composed of civil engineers working for engineering firms specialized in structures. Every small deformation, crack, or defect can be the cause of a potentially bigger critical problem, so they must be identified as soon as possible. An example that requires high accuracy is the measurement of crack width with an error smaller than 0.1 mm. Moreover, crack depth can only be measured using contact sensors. However, visual inspection is the most extended way of assessing the preliminary status of the viaduct before deciding if specialized equipment is needed. This means that, in order to find the smallest defects, the inspector must be very close to them. Furthermore, human subjectivity or a lack of experience could lead to an undervaluation of the severity of a defect.

**Citation:** Caballero, R.; Parra, J.; Trujillo, M.Á.; Pérez-Grau, F.J.; Viguria, A.; Ollero, A. Aerial Robotic Solution for Detailed Inspection of Viaducts. *Appl. Sci.* **2021**, *11*, 8404. https://doi.org/10.3390/app11188404

Academic Editor: Yosoon Choi

Received: 6 August 2021; Accepted: 7 September 2021; Published: 10 September 2021

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).


**Figure 1.** Examples of current inspection methodologies. (**a**) Current method for the viaduct inspection with rope access, from Ayres Associates, Inc. [1]. (**b**) Current method for the viaduct inspection using specialized machinery, from Forsgren Associates, Inc. [2]. (**c**) Proposed new methodology using the flight platform AERO-CAM in Álora, Málaga (Spain) performing an inspection under a viaduct deck.

This whole process presents many inefficiencies that can be minimized using aerial robotic technologies combined with computer vision algorithms (e.g., artificial intelligence) and other computerized technologies supporting the post-processing of the acquired data. This article proposes an aerial robotic solution for gathering all the data needed to analyze the status of a viaduct. Our proposed solution drastically improves the safety of the inspections, as it involves neither work at height by any human being nor the use of heavy machinery. It reduces inspection times and costs by reducing the number of specialized people required to perform the inspection and by avoiding the need to interrupt the use of the structure. Moreover, it improves the quality of the data obtained, since it is the inspectors themselves who indicate the points to be inspected by the aerial robotic platforms, which can always be quickly sent back to obtain more information if necessary. In addition, the system is flexible in its use of the different sensors needed: either cameras or sensors that require physical contact with the structure.

#### *1.1. Aerial Robots for Inspection*

The use of autonomous unmanned aerial vehicles (UAVs) to capture images for subsequent infrastructure analysis is currently on the rise. After an exhaustive search of projects developed in the research world, relevant applications have been found that use autonomous UAVs for image capture: inspection of railway tracks [3], where autonomous flights follow the tracks and capture images for subsequent analysis; mining inspection, where 3D maps are obtained to evaluate the earthworks carried out [4]; wind turbine inspection for the evaluation of deformations or damage [5]; and inspection of civil infrastructure, where captured images are analyzed with neural networks to identify possible cracks or landslides [6]. Finally, in [7], bridge inspection is studied in a similar way as proposed in this article: images are captured autonomously using a UAV platform and then analyzed with photogrammetry software. Its main limitation is that Global Navigation Satellite System (GNSS) positioning is used to automate the flights, whereas in the system proposed in this article the aircraft that captures images does not navigate with GNSS but with onboard sensors such as LIDAR. The physical characteristics of viaducts cause the GNSS signal to be partially or totally degraded when flying near or under them. These degradation problems with global positioning signals are discussed in [8]; the most critical ones affecting this article are signal masking and multipath. Degradation of this signal leads to localization problems in which the UAV can drift or even make sudden changes in its position estimate, seriously compromising the safety and integrity of the operation. In addition, GNSS poses repeatability problems for inspection operations, since the number and position of available satellites vary over time.

There are currently some commercial systems for infrastructure inspection such as the one offered by Skydio [9], which makes use of several onboard cameras to navigate and perform the inspection autonomously. Unlike Skydio's system, our system is able to reference the inspection to a global coordinate system that can be used from the construction of the viaduct until the end of its life. This feature makes our methodology better suited to the current workflow of inspectors, who already use total stations to check the displacements of structures against global references defined during construction. Additionally, thanks to the global coordinate system, the solution offers the possibility for more than one aircraft to navigate and perform an inspection while maintaining the same references. By using a generic gimbal, our system is much more flexible than Skydio's system in the choice and configuration of the required camera sensor, as it is not limited to the built-in camera. In addition, our system uses an aerial robot that performs inspection by making physical contact with the structure. This provides information that cannot be acquired by pictures exclusively, such as precise measurements of crack depth and width, material hardness, concrete humidity, etc.

In relation to contact inspection with an aerial robot, there are different lines of research, with a number of projects focusing on maintaining stability during physical contact [10,11]. In [12], the predecessor of the aerial manipulator used in this work is presented; it was patented [13] and awarded the EU Radar Innovation Award 2017 [14]. In [15], an aerial vehicle that operates overhead using a rigid arm, and that is even capable of remaining in contact with the surface [16], is presented. In [17], an aerial vehicle that operates at the bottom as well as at the front is presented. Aerial manipulators capable of operating in either direction are presented in [18]. Stable contact operations have been achieved using a pusher trirotor in [19] and a quadrotor in [20,21]. In [22], a long rigid tool exerting force against a surface is applied.

#### *1.2. Aerial Robots Localization*

Different sensors can be used to achieve effective positioning of aerial robots. Total stations can be used to localize a UAV in motion with respect to a reference system, as in [23]. This presents several problems, such as the high cost of this tool, as well as the need to fly continuously in its line of sight and to maintain an uninterrupted wireless communication link with the aerial robot. In addition, total stations only provide position information, not orientation.

On the other hand, the localization problem can also be solved through monocular or stereo visual cameras, as in [24,25]. The problem with visual cameras is that they are totally dependent on external light conditions. This is especially problematic when navigating near infrastructure because of the shadows and light changes it can cause. Other localization systems use a LIDAR as the main sensor [26]. These sensors do not depend on external light conditions, as they use lasers to measure distances. While their application in autonomous driving is on the rise, their use in aerial robots is still limited because they require more available payload and a more powerful onboard computer than cameras. In [27], a comparison of several algorithms applied to aerial robots is made.

#### *1.3. Article Introduction*

This article proposes a novel solution for the detailed inspection of viaducts using aerial robotics. This solution is an alternative to current inspection methodologies, improving safety, costs, time, and data quality. Given the amount of details involved in this solution, this article first provides a general overview and then focuses on the technical and experimental aspects of the visual inspection.

The rest of the article is organized as follows. Section 2 describes the proposed viaduct inspection system and the two aircraft involved. Section 3 presents a localization solution that provides one of the aircraft with autonomous capabilities to perform a visual inspection of a viaduct. Section 4 outlines the localization and inspection experiments and presents the results used to assess the performance of the proposed solution. Finally, the conclusions and future work are summarized in Section 5.

#### **2. System Description**

The proposed viaduct inspection system offers a comprehensive solution to check and evaluate the condition of these infrastructures through its integrated tools. A workflow has been created that speeds up these inspections, reduces their cost, and increases their safety. All the tasks are carried out with aerial robotic platforms whose characteristics are chosen according to the task to be performed. These tasks may involve taking general and/or detailed photographs, or making physical contact with the structure to take measurements with sensors as required.

The workflow is shown in Figure 2 and is as follows. Given a viaduct of interest on which an inspection should be performed, the proposed workflow begins with the creation of a mission. During this phase, it is required to acquire a 3D map of the structure in which the inspector can select the areas and points of interest. This 3D map is not only useful for the creation and subsequent visualization of the mission, but is also required for the global localization of the aircraft. It is obtained with the help of a robotic total station that establishes an arbitrary coordinate system and performs scans to obtain the 3D points around it. It is important to capture these data from different points of view to obtain a complete point cloud of the viaduct. To facilitate the subsequent use of this map, the reference system should be aligned with the *ENU* axes (**x** = East, **y** = North, **z** = Up). If possible, it is desirable to obtain an approximate GNSS coordinate of the origin of the point cloud to locate it globally; otherwise, this can be done manually. The total station is thus needed only once. Once the map has been created, it can be reused in all subsequent inspections, provided that the viaduct has not suffered significant changes.

With the points and areas of interest selected on the 3D map, this information is sent to the aerial platform, which translates it into its local coordinate system and creates the route of waypoints and actions necessary to carry out the inspection autonomously, capturing overlapping pictures. This mission constitutes a first general visual inspection of the structure to locate any possible defect. This check is performed by taking high-resolution general pictures of the structure in an automated way using the visual inspection platform described in Section 2.1.

**Figure 2.** General viaduct inspection workflow.

After the first general pictures are taken, they are analyzed to check the condition of the viaduct, paying special attention to those areas where a defect is suspected. This analysis can be performed manually by an inspector or automatically by applying an automatic image defect detection algorithm, like [28,29]. After the analysis, a decision should be made to determine whether more detailed information on the defects found or suspect areas is required. If so, more detailed visual or contact information may be obtained using the visual or contact inspection platforms, respectively. In case more visual information is required, the mission previously created by the inspector can be reused with another camera configuration that better collects the required information; for example, a lens with a longer focal length can be mounted on the camera to obtain better details of the specific area. However, if the previous mission does not meet the requirements of the new visual inspection, the inspector can create a new one with the 3D map and select the previously found defects.

When the missions are finished and the visual information obtained is sufficient, our proposed solution also considers the use of a specific platform for contact inspections. This platform is described in Section 2.2 and has a robotic arm with an end effector on which a sensor can be installed. If the visual inspection finds defects that require deeper analysis with specific sensors needing physical contact with the structure, this platform is sent to those defects and captures data. When the contact inspection data is analyzed, a decision is made as to whether further visual or contact information is required or whether the inspection is terminated.

In short, the proposed workflow is an iterative process in which one can always return to a suspect area to obtain more detailed information. All the inspections are carried out by aerial robots specifically designed for each purpose. The viaduct inspection system comprises two platforms that work sequentially as described previously. The following sections describe these UAVs, showing their configuration and capabilities.

#### *2.1. Visual Inspection*

The visual inspection UAV (see Figure 3) is known as AERO-CAM. This aircraft is specialized in taking very high-quality images of a structure and is equipped with a stabilized camera that takes the images of the desired areas.

**Figure 3.** AERO-CAM visual inspection UAV.

The AERO-CAM platform is built from a DJI Matrice 600 Pro on which the necessary components to operate have been installed. Since the DJI is a commercial platform, the system is easily replicable. The standard configuration of the UAV has been preserved, with both the autopilot and the rotors and blades being those recommended by DJI. The autopilot includes a GPS/GNSS receiver, a 9-axis IMU, a magnetometer, and a barometric altimeter. In addition, it carries a Lightware laser altimeter for precision landing. Regarding the camera system, the UAV is equipped with a Gremsy T3V2 gimbal [30] mounted in the slot available above the platform and carrying a Sony Alpha 7 camera [31]. This mounting location gives the gimbal-camera set a better available field of view; it can even take pictures pointing completely upwards, as opposed to mounting it on the bottom of the UAV like most commercial camera drone systems. This is especially useful when performing inspections under a viaduct, as the UAV is able to take pictures of the bottom part of the deck. Depending on the space available for the flight and the amount of detail to be obtained in each image, the camera can be equipped with different lenses. The camera is managed by a Raspberry Pi Model 3B+ running software developed with Sony's Camera Remote SDK [32].

To provide the platform with autonomous capabilities, the UAV mounts an Ouster OS0-128 LIDAR sensor [33] under the avionics with a custom anti-vibration structure. All previously mentioned sensors are connected, together with the autopilot, to the onboard computer, an Intel NUC i7. Finally, an Ubiquiti Rocket M5 is used for ground communications and to connect the Raspberry Pi and the Intel NUC via Wi-Fi.

Both onboard computers run Ubuntu 18.04 and ROS Melodic and have their clocks synchronized for greater accuracy in capturing images with metadata. The software of the platform is programmed as nodes that communicate with each other. Figure 4 shows the scheme of processes that operate in the system.

With all this equipment, the AERO-CAM is able to perform completely autonomous visual inspections even in GNSS-denied environments. It is capable of taking off and landing on its own, as well as carrying out the mission created from the 3D map of the viaduct. These missions are composed of many waypoints that have an image associated with them. Each time the UAV reaches a waypoint, it moves the gimbal and captures the corresponding image autonomously.
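A mission of this kind (a list of waypoints, each with an associated gimbal orientation and image capture) could be represented minimally as sketched below. The structure and field names are illustrative assumptions, not the actual AERO-CAM mission format:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ImageAction:
    """Gimbal orientation and camera trigger associated with a waypoint."""
    gimbal_pitch_deg: float   # 90 = pointing straight up, e.g., at a deck bottom
    gimbal_yaw_deg: float
    capture: bool = True

@dataclass
class Waypoint:
    """A target pose in the global map frame {G}, plus an optional image action."""
    x: float
    y: float
    z: float
    heading_deg: float
    action: Optional[ImageAction] = None

@dataclass
class Mission:
    name: str
    waypoints: List[Waypoint]

    def images_to_capture(self) -> int:
        """Count the waypoints at which a picture will be taken."""
        return sum(1 for wp in self.waypoints
                   if wp.action is not None and wp.action.capture)

# Hypothetical two-waypoint pass under a deck: look straight up at each point.
mission = Mission("deck_pass_1", [
    Waypoint(10.0, 5.0, 20.0, 90.0, ImageAction(90.0, 0.0)),
    Waypoint(12.0, 5.0, 20.0, 90.0, ImageAction(90.0, 0.0)),
])
```

At each waypoint the vehicle would orient the gimbal according to the associated `ImageAction` before triggering the camera, mirroring the behavior described above.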

**Figure 4.** AERO-CAM general software architecture.

#### *2.2. Contact Inspection*

The aerial contact inspection robot is named AeroX [12] (see Figure 5). It is a specialized aircraft capable of contacting static surfaces. This UAV is composed of two different platforms: the aerial platform and the Robotic Mobile Contact Platform (RMCP), which is in charge of the Ultrasonic Testing (UT) inspection for measuring crack depth. The RMCP is attached at the end of the contact device of the aerial platform.

**Figure 5.** AeroX robot for contact inspection.

AeroX is a novel aerial robotic manipulator that performs physical contact inspection with unprecedented capabilities. It is composed of a robotic vehicle, a six degree-of-freedom (DoF) robotic arm, and a robotic end-effector equipped with wheels and inspection sensors. AeroX has a semi-autonomous operation, which provides interesting advantages in contact inspection. In the free-flight mode, the pilot guides the robot until its end-effector makes contact with the surface to be inspected. During contact, AeroX is in its fully-autonomous GNSS-free contact-flight mode, in which the robot keeps its relative position with respect to the surface contact point using only its internal sensors. During autonomous flight, the inspector can move the end-effector on the surface, with uninterrupted contact, to accurately select the points to be inspected with sensors that must be in contact with, or very close to, the surface.

The AeroX controller is able to efficiently compensate for perturbations thanks to its design, which transmits the surface contact forces and perturbations to the robot's center of mass and allows small movements of the aerial part of the robot in every DoF to absorb other perturbations such as wind. AeroX adopts a four-coaxial-rotor configuration and a simple and efficient design, which provides high stability, maneuverability, and robustness to rotor failure. It can perform contact inspection on surfaces at any orientation, including vertical, inclined, horizontal top, or horizontal bottom, and its operation can be easily integrated into current maintenance operations in many industries.

Although AeroX is part of the proposed solution for viaduct inspection, the technical and experimental development of this article focuses on the AERO-CAM localization algorithms. For more information about AeroX, please refer to [12].

#### **3. Localization Solution**

The proposed solution for the visual inspection of the viaduct requires the creation of a previous 3D map using a total station. This map will be a point cloud that identifies the reference coordinate origin for the entire inspection system. To create this map, operators should ensure that the *ENU* coordinate system is followed. This map can be reused in future inspections of the viaduct.

The UAV system has its own localization and navigation algorithm that provides the transform {*T<sub>LD</sub>*}, whose origin is the take-off point {*L*}. Since this location may vary, the complete system requires a second localization system that establishes the 3D transformation, {*T<sub>GL</sub>*}, between the initial UAV pose and the global reference system, {*G*}, expressed in *ENU* coordinates at the origin of the map created by the total station. These transforms can be visualized in Figure 6.

**Figure 6.** Full transformation system.

Therefore, the AERO-CAM platform has two parallel localization processes to perform the automatic inspection of the viaducts. The following subsections explain the details of both processes.

#### *3.1. Global Localization System*

The function of the global localization system is to find the transform, {*T<sub>GL</sub>*}, which establishes the connection between the global reference system of the viaduct 3D map and the UAV localization system, as expressed in Equation (1). Finding this transform is crucial, as it allows the aerial robot to safely navigate to the areas of interest selected by the inspector without having to maintain the same take-off position between flights. This eliminates the dependency on the total station after the initial 3D map has been created or acquired. In addition, since the viaduct may be in an inaccessible area, the take-off position may not be replicable between flights. This can occur even on inspections on different days, where changing terrain or weather conditions make it impossible to replicate the take-off position accurately.

$$T_{GL} = T_{GD} \, T_{LD}^{-1} \tag{1}$$

$$T_{GD} = T_{GL} \, T_{LD} \tag{2}$$

This global localization system is designed to calculate the transform at the start of each mission, just before the aerial robot takes off. Therefore, the transform, {*T<sub>GL</sub>*}, is fixed and only varies during the flight if another transform with better accuracy has been obtained. During flight, the system continues to calculate the transform between the UAV's current position and the base map, {*T<sub>GD</sub>*}, as expressed in Equation (2), so that if the accuracy of the transform improves, it gets updated. This last case can also be visualized in Figure 6. This in-flight update is only applied with the confirmation of the inspector on the ground, who personally checks whether the mean square error calculated by the global positioning algorithm has improved.

The global localization can be executed on a ground computer asynchronously, since this calculation need not be instantaneous. In this case, the onboard computer sends the data to the ground computer, which performs the calculations and sends the results back to the aerial robot. This update has no direct impact on the ongoing flight of the UAV, as its relative localization and control are not affected by the change of {*T<sub>GL</sub>*}. However, the mission waypoints, which are referenced to {*G*}, are updated in the onboard computer, and the UAV therefore changes its target points to more accurate ones.
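Equations (1) and (2) compose 4×4 homogeneous transforms. As a minimal numerical sketch (the pose values below are hypothetical, not taken from the experiments), the fixed transform between the global and local frames can be computed and checked with NumPy:

```python
import numpy as np

def make_transform(yaw_rad, translation):
    """Build a 4x4 homogeneous transform from a yaw angle and a translation."""
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    T = np.eye(4)
    T[:3, :3] = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    T[:3, 3] = translation
    return T

# Hypothetical example poses:
# T_LD: drone pose in its local (take-off) frame, from onboard localization.
# T_GD: drone pose in the global map frame, from the global localization algorithm.
T_LD = make_transform(0.3, [5.0, 2.0, 10.0])
T_GD = make_transform(1.1, [120.0, -40.0, 35.0])

# Equation (1): the global-to-local transform, fixed after take-off.
T_GL = T_GD @ np.linalg.inv(T_LD)

# Equation (2) as a consistency check: T_GD must be recovered from T_GL * T_LD.
assert np.allclose(T_GL @ T_LD, T_GD)
```

With {*T<sub>GL</sub>*} fixed, every new onboard pose {*T<sub>LD</sub>*} can be mapped into the global frame via Equation (2), which is how the mission waypoints referenced to {*G*} remain consistent across flights.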

To find the correspondence between the 3D map generated by the total station and the data from the onboard sensors, we apply an algorithm that makes use of the geometric characteristics of the point clouds. First, the point clouds are preprocessed to filter out sparse data noise by applying a filter that removes outliers if the number of neighbors within a given radius (e.g., 0.1 m) is smaller than a given number, typically 15. Second, the algorithm performs a distributed downsampling by applying a voxel grid filter and removes the ground points. The ground points are eliminated by creating a parametrizable grid of squares, each filled with the z-value of the lowest point within that square; for each square, all points with z-values between this minimum and a given threshold above it (1.5 m) are removed. The algorithm then calculates the FPFH (Fast Point Feature Histogram) descriptors [34] of the remaining downsampled points. These features encode the geometric properties of the k-nearest neighbors of a given point using a multidimensional histogram of the curvature around that point. Among their advantages, these features are invariant to position and robust to a certain level of noise. After this feature extraction process, the Random Sample Consensus (RANSAC) algorithm is applied to find a first approximation between both inputs. The result is then corrected according to the problem-specific assumptions outlined below and refined via the Iterative Closest Point (ICP) algorithm. These correction and refinement steps are applied twice to further adjust the result; refining more than twice has not shown a substantial improvement of the result, only an increase in computational load and time. Depending on whether the initial guess is reliable and whether, at the instant of processing, the UAV is close to the structure, the RANSAC stage can be replaced directly by the ICP algorithm to obtain better results. To identify these refinement stages, they are named ICP1, ICP2, and ICP3, with ICP1 being the one that can be exchanged for RANSAC, as explained before. Figure 7 shows a block diagram of the main steps of the algorithm.
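The grid-based ground-removal step can be sketched in pure NumPy as below. The 1.5 m height threshold is the value quoted in the text, while the 1 m cell size is an assumed parametrization:

```python
import numpy as np

def remove_ground(points, cell=1.0, height_threshold=1.5):
    """Drop points lying within `height_threshold` metres above the lowest
    point of their x-y grid cell, approximating the ground-removal step.

    points: (N, 3) array of x, y, z coordinates in metres.
    """
    # Assign every point to a square cell in the x-y plane.
    cells = np.floor(points[:, :2] / cell).astype(np.int64)
    # Group points by cell: `inverse` maps each point to its cell index.
    _, inverse = np.unique(cells, axis=0, return_inverse=True)
    keep = np.ones(len(points), dtype=bool)
    for idx in np.unique(inverse):
        mask = inverse == idx
        z_min = points[mask, 2].min()
        # Keep only the points clearly above the cell's lowest point.
        keep[mask] = points[mask, 2] > z_min + height_threshold
    return points[keep]
```

For example, given a flat ground plane at z = 0 with a vertical pillar rising above it, the filter removes the ground points while retaining the pillar, which is the structure the FPFH/RANSAC/ICP stages then register against the map.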

**Figure 7.** Global localization algorithm.

To improve this process, some assumptions are made that simplify the problem and work in all possible scenarios:


The introduction of the above assumptions mainly corrects the orientation before starting the alignment process, thus reducing the problem to almost pure translation. In addition, even minimal GNSS coverage at the take-off point provides an initial guess that makes the problem converge more accurately and faster. In case no GNSS coverage is available, the approximate coordinate of the take-off point with respect to the total station 3D map can be entered manually.

In addition to the above assumptions, in case of significant changes between reality and the reference map obtained with the total station due to catastrophes or severe structural failures, the discordant areas of the reference map should be removed. Alternatively, a map of the new state of the structure can be created with the same reference origin as the previous one.

#### *3.2. Relative Localization System*

The purpose of the relative localization system is to find the transform {*T<sub>LD</sub>*}, which describes the motion of the aerial robot from its take-off point. This take-off point will be located near the viaduct, on a flat surface parallel to the horizon, so that the UAV can take off safely. This localization is performed using only current readings from the onboard sensors and does not require any prior data. It should be as accurate as possible and minimize drift over time, to avoid a significant divergence between the UAV's perceived and actual poses.

The relative localization system makes use of the LIDAR and a 9-axis IMU to calculate the UAV's pose at each instant. The algorithm operates at high frequency in real time, updating the pose at the same frequency as the IMU, which in the case of AERO-CAM is 400 Hz. The LIDAR is set to an operating frequency of 10 Hz. This algorithm is executed entirely onboard the aerial robot in the equipped Intel NUC. Despite running in real time, this algorithm has the highest processing load among the programs executed. It is of vital importance to the system, as it provides localization feedback to the UAV control algorithm so that it can ensure a stable flight while navigating autonomously to the desired target points. The localization algorithm is based on LIO-SAM [35], and its general architecture is adapted to AERO-CAM as shown in Figure 8.

This architecture establishes a tightly coupled fusion between the LIDAR and the IMU, building a factor graph in which the measurements made by the sensors are integrated to build and optimize the map, as shown in Figure 9. The factor graph is optimized using incremental smoothing and mapping with a Bayes tree (iSAM2) [36]. The IMU pre-integration is based on [37]. Since the double integration of IMU measurements leads to large drift, the architecture proposes short-term integration instead, correcting the IMU bias thanks to lower-frequency localization in the built map using the information of the LIDAR point cloud. In order to process everything in real time, the algorithm discards LIDAR readings that are not sufficiently displaced (typically 1 m and 0.2 radians) with respect to the previous accepted reading, known as a LIDAR keyframe. In this way, a lot of redundant information that would otherwise increase the computational load is discarded. Between LIDAR keyframes, the IMU readings are integrated, resulting in a graph node that represents the localization state at that instant. Unlike the original algorithm, the adaptation for AERO-CAM does not introduce GPS/GNSS factors, since the signal quality is totally impaired during the inspection flight due to the structure itself. Another difference with the original algorithm is that the loop closure option is disabled to avoid possible jumps in the odometry. The main reason is that this odometry is used to close the control loop, so peaks and spikes must be avoided as much as possible: smooth flight near the viaduct during the inspection is safety critical. However, this particularization of the algorithm can lead to larger drifts in the calculated odometry. To overcome this problem, inspection flights are assumed to have a controlled duration with a planned route close to the viaduct, thus providing a rich point cloud that helps to minimize drift.
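The keyframe gating described above (discarding scans displaced less than about 1 m or 0.2 rad from the previous keyframe) can be sketched as follows. Representing orientation by a single yaw angle is a simplification for illustration; the actual algorithm compares full 3D poses:

```python
import numpy as np

# Displacement thresholds quoted in the text for accepting a new LIDAR keyframe.
MIN_TRANSLATION = 1.0   # metres
MIN_ROTATION = 0.2      # radians

def is_new_keyframe(prev_pose, pose,
                    min_translation=MIN_TRANSLATION, min_rotation=MIN_ROTATION):
    """Return True if `pose` is displaced enough from the last keyframe.

    Poses are (x, y, z, yaw) tuples; yaw stands in for the full 3D rotation
    in this simplified sketch.
    """
    dp = np.linalg.norm(np.asarray(pose[:3]) - np.asarray(prev_pose[:3]))
    # Smallest signed angular difference, wrapped into [-pi, pi].
    dyaw = (pose[3] - prev_pose[3] + np.pi) % (2 * np.pi) - np.pi
    return dp >= min_translation or abs(dyaw) >= min_rotation
```

Scans rejected by this test contribute no new graph node; only the IMU readings are integrated between accepted keyframes, which keeps the factor graph small enough for real-time optimization.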

**Figure 8.** LIO-SAM adapted architecture.

**Figure 9.** LIO-SAM adapted factor graph.

As already mentioned, the result of all this processing is the relative localization of the aerial robot at a high frequency (400 Hz), which feeds the control algorithm of the AERO-CAM. This publication does not intend to go into the details of the original LIO-SAM implementation; for more details, please refer to [35].

#### **4. Experimental Results**

The experimentation phase of this article was carried out with the AERO-CAM platform, performing the various experiments described next. Real flights around and under two viaducts were performed to evaluate the localization solution. On the one hand, there is the railway viaduct *Arroyo del Espinazo* in Álora, Málaga (Spain). This viaduct was inaugurated in 2006 and is currently in use. It has a length of 1.2 km, with a maximum pillar height of 93 m and a width of 14 m. The pillars are equidistantly distributed and have a hollow square cross-section. On the other hand, there is the road viaduct *Puente de las Navas* in Algodonales, Cádiz (Spain). This viaduct was built in the 1980s and is still active, with the A-384 road passing over it; therefore, it withstands daily traffic. It is approximately 350 m long and consists of cylindrical pillars supporting, in pairs, three longitudinal beams on which the deck rests. Both viaducts are in a good state of conservation, presenting only small aesthetic defects in the concrete at the time of the inspections, without danger. The experimental inspections are focused on predictive maintenance: it will be possible to return in the future, carry out the same inspection, and compare the evolution. The flights in Álora were pilot-assisted in order to perform realistic routes, while those in Algodonales were fully autonomous. A preliminary map of the viaducts was created using a Leica Nova MS50 total station.

The trajectory followed in these experiments consists of a take-off close to the viaduct and an inspection flight over different areas, which may include changes in altitude. Figure 10 shows some of these trajectories. To obtain the ground truth of the trajectory followed by the platform, a prism was installed on it, and its position with respect to the origin of the viaduct map was tracked with the Leica total station. Note that the total station only provides position data, at 20 Hz; it cannot estimate orientation.

**Figure 10.** Examples of trajectories followed from take-off to landing. The maneuvers were performed under a deck of the Algodonales and Álora viaducts. The left graphs show the X-Y view, the center ones the X-Z view, and the right ones the X-Y-Z view. (**a**) Álora sequence 5, (**b**) Álora sequence 7, (**c**) Algodonales sequence 3, (**d**) Algodonales sequence 4.

#### *4.1. Global Localization*

The experiments to test the global localization algorithm consisted of extracting LIDAR readings from the aerial platform at different time instants and feeding them to the algorithm. These instants include moments before take-off as well as during flight and landing. To illustrate this process, Figure 11 shows two alignment examples at take-off.

The figure shows, on the one hand, the initial alignment of the total station 3D map and the LIDAR reading, taking into account the assumptions introduced in Section 3.1, and, on the other hand, both point clouds after alignment with the algorithm results.

During the experimentation, the global localization algorithms were executed on a laptop with a 4-core Intel Core i7-8564U CPU and 8 GB of RAM. The execution and convergence time of the algorithm varied between 12 and 25 s for each resulting transform. This duration is not a problem, since the first iteration is performed before take-off; subsequent iterations can be performed during the flight, updating the transform at convenience, as explained in Section 3.1.

The performance of the global localization algorithm is evaluated by studying the mean error between matches after each ICP step. All tested cases converge to a valid solution. Tables 1 and 2 show the metrics obtained for Algodonales and Álora, respectively. As explained in Section 3, the ICP1 step is not always executed, so it is denoted in the tables as "\*" when there are no data. The so-called proportional correspondence metric (Prop. Corr.) is the number of correspondences at that ICP stage divided by the size of the LIDAR point cloud at that instant. The MSE metric is the mean squared error of the correspondences after applying the transform obtained at that stage.
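For illustration, the two metrics above reduce to a short computation over the matched point pairs of an ICP stage; the function and variable names below are assumptions, not the names used in the actual implementation.

```python
import numpy as np

def icp_stage_metrics(src_matched, dst_matched, cloud_size):
    """Compute the two per-stage ICP metrics reported in the tables.

    Prop. Corr. = number of correspondences / size of the LIDAR scan.
    MSE         = mean squared distance between matched pairs after
                  applying the transform obtained at that stage
                  (src_matched is assumed already transformed).
    """
    src = np.asarray(src_matched, dtype=float)
    dst = np.asarray(dst_matched, dtype=float)
    prop_corr = len(src) / cloud_size
    mse = float(np.mean(np.sum((src - dst) ** 2, axis=1)))
    return prop_corr, mse
```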

**Figure 11.** Global localization examples at take-off. Red points are the LIDAR reading. (**a**) Algodonales, (**b**) Álora.

**Table 1.** Algodonales global localization metrics.



**Table 2.** Álora global localization metrics.


Additionally, during these instants, the position given by the total station—which shares the reference system of the 3D map—was obtained. This position serves as ground truth to check the output of the global localization algorithm, since the output should correspond with the total station reading. Again, since the total station does not provide orientation, only the translational part is considered. Tables 3 and 4 show the obtained results.

**Table 3.** Comparative table between the ground truth obtained by the total station and the output of the global location algorithm for the Algodonales viaduct.


**Table 4.** Comparative table between the ground truth obtained by the total station and the output of the global location algorithm for the Álora viaduct.


The results are considered valid, since the algorithm converges correctly in the proposed realistic cases. An advantage of the global localization system is that, if run before take-off, the operator can visually validate the obtained result and proceed with the inspection if there is no problem. Tables 3 and 4 show that the final 3D error lies between 0.2 and 0.64 m for the tested cases, with the z-axis (vertical) the most affected. These errors are considered good given the uncertainty of the point clouds and of the algorithm itself: while the total station error is on the order of millimeters (always proportional to the distance), the LIDAR points have an error of ±1.5–5 cm (both figures according to the manufacturers), which may influence the result.
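For reference, the per-axis and 3D errors reported in Tables 3 and 4 reduce to a simple Euclidean comparison between the total station position and the algorithm output; a minimal sketch, with illustrative names:

```python
import math

def global_localization_error(gt_xyz, est_xyz):
    """Absolute per-axis errors and 3D Euclidean error between the
    total-station ground truth and the global localization output.
    Only translation is compared, since the total station provides
    no orientation.
    """
    ex, ey, ez = (abs(g - e) for g, e in zip(gt_xyz, est_xyz))
    return ex, ey, ez, math.sqrt(ex ** 2 + ey ** 2 + ez ** 2)
```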

#### *4.2. Relative Localization*

In the experiments to test the relative localization, the position estimated by the algorithm was compared with the ground truth from the total station. Since the latter can only provide positions without orientation, only the translational part is compared. The two trajectories were first associated in time, and a scale-free alignment was then performed with the Umeyama algorithm [38]; the EVO framework [39] was used to facilitate this task. The metrics used are the APE (*Absolute Position Error*), to evaluate global consistency, and the RPE (*Relative Position Error*), to evaluate local consistency. For the RPE, an increment of 0.5 m was selected for the calculations. For both metrics, statistics such as the maximum, mean, standard deviation, and RMSE (*Root Mean Square Error*) were obtained. Tables 5 and 6 show the results.
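The scale-free Umeyama alignment followed by the APE RMSE can be sketched as below. This is a minimal numpy re-implementation for illustration only (the EVO framework performs these steps internally); trajectories are assumed to be already time-associated (3, N) arrays.

```python
import numpy as np

def umeyama_align(est, gt):
    """Least-squares rigid alignment (Umeyama, with scale fixed to 1)
    of the estimated trajectory to the ground truth.
    est, gt: (3, N) arrays of time-associated positions."""
    mu_e = est.mean(axis=1, keepdims=True)
    mu_g = gt.mean(axis=1, keepdims=True)
    cov = (gt - mu_g) @ (est - mu_e).T / est.shape[1]
    U, _, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0  # correct a reflection into a proper rotation
    R = U @ S @ Vt
    t = mu_g - R @ mu_e
    return R, t

def ape_rmse(est, gt):
    """APE RMSE of the aligned trajectory against the ground truth."""
    R, t = umeyama_align(est, gt)
    residuals = gt - (R @ est + t)
    return float(np.sqrt(np.mean(np.sum(residuals ** 2, axis=0))))
```

The RPE would instead compare relative motions between pose pairs separated by the chosen increment (0.5 m here), which makes it insensitive to accumulated drift.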


**Table 5.** Algodonales dataset description and localization errors.

**Table 6.** Álora dataset description and localization errors.


#### *4.3. Inspection Result*

The results obtained after performing a mission show how the planned trajectory was tracked while images of the viaduct were acquired. Each acquired image stores metadata containing the exact pose and instant at which it was taken, referred to the 3D map of the viaduct. In this way, it is always possible to review the inspection performed and to know the exact location in the 3D map to which each image belongs. Figure 12 illustrates one of the experiments performed. Specifically, it corresponds to Algodonales sequence 2, where a flight was performed under the viaduct deck, along the external part of the viaduct. Each vertical arrow indicates the camera pose for each acquired image, and the green line represents the trajectory of the AERO-CAM.
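The per-image metadata described above could be represented, for illustration, by a structure such as the following; the field names are assumptions, not the actual on-board format.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class GeoTaggedImage:
    """Metadata stored with every inspection picture: the camera pose
    expressed in the viaduct 3D-map frame and the acquisition instant."""
    path: str                               # image file
    stamp: float                            # acquisition time [s]
    position_m: Tuple[float, float, float]  # (x, y, z) in the map frame
    orientation_q: Tuple[float, float, float, float]  # quaternion (x, y, z, w)
```

Storing the pose in the map frame is what makes a later re-inspection comparable: the same location can be revisited and re-imaged from the same viewpoint.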

Figure 12b,c shows examples of the visual information obtained with the AERO-CAM. Both pictures show different areas of the underside of the viaduct deck, which is difficult to access. Figure 12b focuses on one of the cross beams, while Figure 12c shows the outer side of the deck. As explained above, the quality and level of detail of these pictures depend on the choice and configuration of the camera, as well as on the distance to the structure configured in the mission. In this case, an 85 mm lens was used at a distance of about 2 m from the structure, resulting in pictures with a resolution of 9504 × 6336 pixels and a very high level of detail per pixel.
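As a rough check of the stated detail density, the ground sampling distance (GSD) for an 85 mm lens at 2 m can be estimated with a thin-lens approximation; note that the 36 mm sensor width assumed below is not stated in the text and is only an illustrative full-frame value.

```python
def ground_sampling_distance(distance_m, focal_mm, sensor_width_mm, image_width_px):
    """Approximate object-space size of one pixel (GSD) in millimetres,
    using the thin-lens magnification distance / focal length."""
    pixel_pitch_mm = sensor_width_mm / image_width_px
    return pixel_pitch_mm * (distance_m * 1000.0) / focal_mm

# Assumed full-frame (36 mm wide) sensor, 9504 px across, 85 mm lens at 2 m
gsd = ground_sampling_distance(2.0, 85.0, 36.0, 9504)
```

Under these assumptions, the GSD is below a tenth of a millimetre per pixel, consistent with the very high detail density described.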

The distribution of the pictures along the structure is also crucial for possible later analysis, such as reconstructing the structure through photogrammetry; all pictures have enough overlap between them to make this possible. The overlap is not only beneficial for photogrammetry but also allows the same point to be analyzed in different pictures, adding redundancy and information to the system.

**Figure 12.** Algodonales sequence 2 inspection result. (**a**) shows the 3D model with the followed route (green) and the location of the acquired pictures (blue planes). (**b**,**c**) show two examples of the acquired pictures.

#### **5. Conclusions**

This work demonstrates that it is possible to perform the inspection of a viaduct with aerial robots as an alternative to current methodologies, saving time and cost while improving safety and the quality of the data obtained. The design and development of the AERO-CAM and AeroX inspection platforms were successful, since together they cover the sensing needs of a viaduct inspection: searching for possible defects and analyzing existing ones.

In addition, the provision of autonomous capabilities to the platforms, especially to the AERO-CAM, greatly facilitates the work and provides more flexibility than conventional methods. These capabilities also reduce the number of specialized people needed to operate these platforms, thus improving their safety and speed of use.

Thanks to the choice of a LIDAR-type sensor for the autonomous capabilities of the AERO-CAM, the platform can operate under variable lighting conditions, whether due to weather or to the shadows and lighting changes that the viaducts themselves may cause. In addition, the system does not rely on a total station for the flight of the robotic platforms. Likewise, the camera installed in its gimbal can be configured to provide the required level of detail, or even be replaced by another one, without having to redesign the platform.

As for future developments, although the localization presented in this article provides sufficiently accurate results at an adequate computation speed, the AERO-CAM cannot take off autonomously from areas where its LIDAR does not see the structure, as the global localization would fail. This means that, until the system is able to locate itself, the pilot has to fly the UAV to the viaduct. Therefore, other strategies can be developed to complement the localization system and overcome these edge cases.

Another line of work is the inclusion of a detect-and-avoid system to provide the AERO-CAM with more advanced capabilities when executing missions. Currently, the system relies on the mission designed by the inspector being free of obstacles, with waypoints keeping a safe distance from the viaduct; however, it could use the LIDAR readings to detect potential hazards on the route and re-plan it in real time.

**Author Contributions:** Conceptualization, R.C., M.Á.T., F.J.P.-G., A.V. and A.O.; Investigation, R.C., J.P., M.Á.T. and F.J.P.-G.; Methodology, R.C.; Software, R.C. and J.P.; Validation, R.C. and J.P.; Project administration, M.Á.T., F.J.P.-G., A.V. and A.O.; writing—original draft preparation, R.C.; writing—review and editing, R.C., J.P., M.Á.T., F.J.P.-G., A.V. and A.O. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was funded by the RESIST (H2020-2019-769066) and PILOTING (H2020-2020- 871542) projects funded by the European Commission.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author. The data are not publicly available yet due to ongoing private work.

**Acknowledgments:** Authors would like to thank David Tejero for his support in the development of the global localization algorithm, Jorge Mariscal for his review of this article and the interesting discussions, and the GRVC from the University of Seville for collaborating with their total station. In addition, the authors would like to thank Ferrovial for allowing and providing access to infrastructure for the experimentation phase.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **References**

