### *1.1. Related Work*

**Citation:** Delbene, A.; Baglietto, M.; Simetti, E. Visual Servoed Autonomous Landing of an UAV on a Catamaran in a Marine Environment. *Sensors* **2022**, *22*, 3544. https://doi.org/10.3390/s22093544

Academic Editors: Reza Ghabcheloo and Antonio M. Pascoal

Received: 31 March 2022; Accepted: 4 May 2022; Published: 6 May 2022

Performing an autonomous landing on a platform is a complex task that requires several steps. In outdoor scenarios, global navigation satellite system (GNSS) receivers usually provide the positions of the quadrotor and the catamaran. However, GNSS data alone are not sufficient to perform an autonomous landing: even when properly filtered [11], their accuracy and precision are not adequate for such a complex and precise maneuver. Additionally, the catamaran is subject to unpredictable oscillatory dynamics caused by sea and weather conditions, to which the quadrotor must be able to react. For these reasons, alternative approaches have to be considered; a vision system, for instance, increases the reliability and performance of the landing procedure. The GNSS data are used by the quadrotor to move close to the position of the catamaran; from there, a vision system gives the quadrotor the relative pose of the platform on the catamaran. Computer vision algorithms are extremely useful for closing the loop when the landing platform is in an uncertain position or moving [12], since they directly estimate the horizontal and vertical tracking errors with respect to the target point, instead of providing its coordinates in an absolute frame.

An interesting and efficient solution was proposed in [13], where an extended Kalman filter combines data coming from different sensors (inertial navigation, GNSS receiver, and visual sensor) to build a navigation system and perform a landing procedure. In [14], a solution composed of several LEDs and an "H" sign placed on the landing platform was proposed: the LEDs allow the unmanned aerial vehicle (UAV) to recognize the platform from high altitudes using an infrared camera, and the "H" sign helps the estimation of the center of the platform when the quadrotor is closer. In [15], helipads composed of different geometric shapes (a cross, a circle, and a square) were proposed to test the designed vision-based autonomous landing algorithm.
Experimental results were obtained outside the marine environment, with a mobile ground robot carrying the landing platform. Another solution for estimating the relative pose between the quadrotor and the landing platform was proposed in [16], where a specific marker composed of a series of concentric circles allows detection of the platform from close range.

A similar methodology is the one presented in [17], where a landing platform composed of several AprilTags [18,19] with different dimensions is introduced: the larger tags permit the detection from higher altitudes, and the smaller ones from lower altitudes. This allows the quadrotor to constantly track the landing platform while decreasing its altitude during the landing procedure. The choice of using AprilTags is mainly related to their versatility and robustness [20].

### *1.2. Contributions*

The innovation of this paper with respect to the state of the art is mainly the development of a set of software packages able to perform an autonomous landing procedure in a sea environment, where the catamaran is subject to wave-induced oscillations. The landing procedure is tackled by implementing a set of strategies: a preliminary positioning of the drone, platform searching, horizontal tracking to keep the drone aligned, and vertical compensation with respect to the landing platform. The behavior of the quadrotor during the whole landing procedure is handled by a finite state machine: a set of states and conditions that describe the actions the quadrotor has to perform, depending on the data coming from different sensors. An improved landing platform, composed of more tags than the previous solution [9], has been designed. Simulations of autonomous landing were performed in an environment composed of several tools, such as Gazebo (more info at: http://gazebosim.org/, last access: 30 March 2022), ROS2 (more info at: https://docs.ros.org/en/foxy/index.html, last access: 30 March 2022), and PX4 (more info at: https://px4.io/, last access: 30 March 2022), where the pose of the catamaran was replayed from data recorded in sea tests involving only the catamaran itself, so that the motion of the landing platform was realistic.

The proposed software architecture allows both the validation of the considered methodology via software-in-the-loop simulations and the integration of most of those components in the real hardware, as a preparation for tests in a real environment. To further validate the reliability and robustness of the onboard vision system, an onboard video captured from a manual flight landing of the quadrotor on the catamaran has been processed offline using the adopted vision system.

Therefore, with respect to the previous works [9,10], the contributions of the present manuscript are:


This paper is organized as follows: In Section 2, an overview of the whole system is proposed. In Section 3, the methodology of the proposed landing procedure is detailed, and in Section 4 the developed software/firmware architecture is presented. In Section 5, the main results obtained in flight emulation tests are shown. Finally, some conclusions are given in Section 6.

### **2. System Overview**

The considered experimental system is composed mainly of two different agents, each one with unique characteristics and features.

#### *2.1. Catamaran*

The ULISSE autonomous surface vehicle (ASV), developed by the interuniversity research center for Integrated Systems for Marine Environment (ISME, University of Genova node), is a 3 m long and 1.8 m wide catamaran, constructed in fiberglass (see Figure 1). It was designed as a modular vehicle for various applications. When used for marine geotechnical surveys [21] or acting as an intelligent buoy for underwater vehicles, it carries a deck with an underwater mast with acoustic sensors. When used as a means to extend the action range of aerial drones, the catamaran is equipped with a dedicated landing platform (as in Figure 1). Each hull of the catamaran has a compartment hosting batteries (around 3.2 kWh of energy each), the hardware on which the control software runs, based on the *ROS2* middleware, and a wide range of sensors (GNSS receiver, gyroscopes, accelerometers, and a compass sensor) to collect ego-motion measurements. The catamaran is provided with a roll-bar where the GNSS antenna is located, along with a 5 GHz antenna for Wi-Fi communication. The catamaran is propelled by two Torqeedo Cruise 2R electric thrusters, with an electrical power of 2 kW each, which offer high maneuverability of the vessel even at low speeds, making it very agile in cluttered areas.

**Figure 1.** The ULISSE catamaran, equipped with the landing platform and the splash-proof quadrotor, deployed in one of the tests at sea.

#### *2.2. Quadrotor*

The chosen quadrotor model is the SwellPro Splash Drone 3 (more info at: https://swellpro.com/, last access: 30 March 2022), a drone that provides an external waterproof structure specifically designed for marine applications, along with various internal hardware components that allow performing manual flights. For the realization of the proposed strategy, these components were replaced to robotize the vehicle. In particular, a Raspberry Pi Model B+ (more info at: https://www.raspberrypi.com/, last access: 30 March 2022) was embedded, along with a Raspicam v2, to allow onboard computations, together with a Pixracer (more info at: https://docs.px4.io/master/en/flight_controller/pixracer.html, last access: 30 March 2022) autopilot system containing several embedded sensors, such as an accelerometer, a magnetometer, a gyroscope, and a barometer. The autopilot receives the setpoints computed by the algorithm running on the Raspberry Pi and translates them into pulse width modulation (PWM) signals for the individual motors of the quadrotor. The quadrotor is also endowed with a GNSS receiver. An ultrasonic sensor was included, as it is essential during the landing procedure, and so was a payload release mechanism actuated by a servo motor, to enable the quadrotor to carry out delivery tasks in a marine environment.

### **3. Methodology**

The proposed landing solution is composed of different modules. Each one is detailed hereafter.

#### *3.1. Perception and Pose Estimation*

The relative pose of the quadrotor with respect to the landing platform is estimated by an onboard vision system that processes the video stream coming from the Raspicam. The platform is equipped with a set of visually distinguishable tags, each one different from the others and characterized by a unique ID. The adopted vision system, named AprilTag, is an open-source, robust, and well-documented tool [18,19] that computes the 3D position and orientation of the considered tags with respect to the camera [20]. The use of a single tag does not guarantee its identification during the whole landing procedure; hence, the landing platform was equipped with 13 unique AprilTags, following a configuration similar to [17].

The tags, as shown in Figure 2, were placed in such a way as to guarantee visibility from different distances and robustness in the landing phase. The AprilTag markers on the outer edges are large, and thus easily recognizable at higher altitudes. The smaller internal ones play a crucial role in the final instants of the landing maneuver when the quadrotor is closer to the platform, improving safety and reliability.

**Figure 2.** The set *Sl* of AprilTags of different sizes as printed on the landing platform.


The list of the detectable tags is represented by the set $S_l = \{1, \dots, 13\}$. At each iteration of the vision system, the IDs of the detected tags are stored in a subset $S_d \subseteq S_l$. For each detected tag ID $i \in S_d$, the vision system computes a transformation matrix ${}^{c}_{i}T$ that describes the pose of the identified tag $i$ with respect to the camera frame $c$. In order to compute the pose of the platform center with respect to the camera for each tag $i \in S_l$, a set of transformation matrices ${}^{i}_{p}T$, $i \in S_l$, describing the pose of the platform center with respect to each tag, is calibrated and computed offline. Thus, a post-multiplication gives the needed transformation matrix:

$${}^{c}_{p}T_{i} = {}^{c}_{i}T \; {}^{i}_{p}T \,. \tag{1}$$

Theoretically, each detected tag provides equally valid information. However, to improve the quality of the estimate, these measurements are merged, weighted by the areas $a_i$, $i \in S_d$, of the detected tags in the image. Thus, the weighted transformation matrix ${}^{c}_{p}T$ between the center of the platform and the camera on the quadrotor is obtained by:

$${}^{c}_{p}T = \frac{1}{w} \sum_{i \in S_d} {}^{c}_{p}T_{i} \, a_{i} \,, \tag{2}$$

where *w* is the normalization term defined as:

$$w = \sum\_{i \in S\_d} a\_i \,. \tag{3}$$
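As a sketch, Equations (1)–(3) can be implemented as a weighted element-wise sum of homogeneous transforms. Function and variable names below are illustrative; note that an element-wise average of rotation matrices is not generally a rotation, so a production implementation might re-orthonormalize the rotation block of the result.

```python
import numpy as np

def fuse_tag_poses(cT_i_list, iT_p_list, areas):
    """Weighted estimate of the platform-center pose in the camera frame.

    cT_i_list : per-tag camera-from-tag 4x4 transforms (left factor of Eq. (1))
    iT_p_list : per-tag tag-from-platform-center 4x4 transforms (offline calibration)
    areas     : image area of each detected tag, used as weight (Eqs. (2)-(3))
    """
    w = sum(areas)                       # normalization term, Eq. (3)
    cT_p = np.zeros((4, 4))
    for cT_i, iT_p, a in zip(cT_i_list, iT_p_list, areas):
        cT_p += (cT_i @ iT_p) * a        # Eq. (1), then weighted sum of Eq. (2)
    return cT_p / w
```

With a single detected tag, the function reduces exactly to the post-multiplication of Equation (1).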

The reference error is then transformed in the inertial frame, taking into account the quadrotor's attitude (see Figure 3), and sent to the guidance controller, which generates the desired commands for the autopilot.

Still, depending on sea conditions, the vertical velocity of the landing platform can vary considerably, and the vertical error estimated by the vision system alone is not a reliable measurement in the final instants of the landing phase. To increase robustness, an ultrasonic sensor pointing downward was installed on the quadrotor, providing distance information over a limited range. More precisely, the camera provides information at 20 fps (frames per second), whereas the ultrasonic sensor provides information at 30 Hz and, at distances below 0.75 m, yields more precise and reliable data. The distance data coming from the ultrasonic sensor are used to estimate the platform's vertical velocity via a basic Kalman filter [9]. This information is merged with the estimated vertical error, as shown in Section 3.3.
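A minimal sketch of such a filter is a constant-velocity Kalman filter run over the 30 Hz ultrasonic range stream; the noise parameters `q` and `r` below are illustrative assumptions, not the paper's tuning.

```python
import numpy as np

class VerticalKF:
    """Constant-velocity Kalman filter over ultrasonic range measurements.

    State x = [relative distance, relative vertical velocity]^T.
    Sketch of the 'basic Kalman filter' mentioned in the text;
    dt matches the sensor's 30 Hz rate, q/r are illustrative.
    """
    def __init__(self, dt=1.0 / 30.0, q=0.5, r=0.02):
        self.F = np.array([[1.0, dt], [0.0, 1.0]])     # constant-velocity model
        self.H = np.array([[1.0, 0.0]])                # only distance is measured
        self.Q = q * np.array([[dt**3 / 3, dt**2 / 2],
                               [dt**2 / 2, dt]])       # process noise
        self.R = np.array([[r]])                       # measurement noise
        self.x = np.zeros((2, 1))
        self.P = np.eye(2)

    def step(self, z):
        # predict
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # update with the range measurement z
        y = np.array([[z]]) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(2) - K @ self.H) @ self.P
        return float(self.x[0, 0]), float(self.x[1, 0])  # distance, velocity
```

Feeding a steadily decreasing range makes the velocity state converge to the platform's relative vertical velocity, which is the quantity used in Section 3.3.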

**Figure 3.** A representation of the different transformation matrices involved in the relative pose estimation between the UAV and the landing platform's center.

#### *3.2. Horizontal Platform Tracking*

One of the tasks the quadrotor has to perform during the landing procedure is the horizontal tracking of the platform, reducing the estimated horizontal error provided by the camera. By defining the horizontal positions of the quadrotor and the platform as $p_{q,xy}$ and $p_{l,xy}$, respectively, the horizontal position error is $e_{p,xy} = p_{l,xy} - p_{q,xy}$. A measurement of this error is obtained from the onboard vision system. Thus, a PI regulator is designed to produce the position setpoints $p^{*}_{q,xy}$:

$$p_{q,xy}^{*} = p_{q,xy} + K_P \, e_{p,xy} + K_I \int e_{p,xy} \, dt \,, \tag{4}$$

where $K_P$ and $K_I$ are the proportional and integral gains of the controller, respectively.
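The PI law (4) maps directly to a discrete-time update; the gains and time step below are illustrative values, not the paper's tuning.

```python
import numpy as np

def pi_position_setpoint(p_q_xy, e_p_xy, err_integral, dt, Kp=0.8, Ki=0.05):
    """Discrete-time version of the PI regulator in Eq. (4).

    err_integral accumulates the error over time (the integral term);
    Kp, Ki and dt are illustrative assumptions.
    """
    err_integral = err_integral + e_p_xy * dt          # discrete integral
    p_star = p_q_xy + Kp * e_p_xy + Ki * err_integral  # Eq. (4)
    return p_star, err_integral
```

With zero horizontal error the setpoint coincides with the current position, i.e., the quadrotor holds station over the platform.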

#### *3.3. Vertical Platform Compensation*

Once the quadrotor is within a certain distance of the landing platform, it needs to compensate for the heave motion of the landing pad induced by the waves. In this delicate phase, the altitude setpoints are generated to keep the relative velocity between the quadrotor and the catamaran at a specific value $v^{des}_{r,z}$. More specifically, a vertical target absolute velocity can be defined as:

$$v_{q,z}^{des} = v_{r,z}^{des} + v_{l,z} \,, \tag{5}$$

where $v_{l,z}$ is obtained by:

$$v_{l,z} = v_{q,z} - v_{r,z} \,, \tag{6}$$

and $v_{r,z}$ is the estimate of the relative vertical velocity obtained by the aforementioned Kalman filter. The vertical velocity error is defined as:

$$e_{v,z} = v_{q,z}^{des} - v_{q,z} \,. \tag{7}$$

Thus, the desired vertical velocity is obtained by correcting the current one with the error (7) scaled by a gain $K_1$:

$$v_{q,z}^{*} = v_{q,z} + K_1 \, e_{v,z} \,. \tag{8}$$

Finally, the altitude setpoints are generated as:

$$p_{q,z}^{*} = p_{q,z} + K_2 \, v_{q,z}^{*} \,, \tag{9}$$

where $p_{q,z}$ is the current altitude of the quadrotor and $K_2$ is a scaling gain. Vision system data are not used at this stage, since the ultrasonic sensor provides information at a higher frequency.
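Chained together, Equations (5)–(9) reduce to a few lines; the gains and the desired relative velocity below are illustrative assumptions.

```python
def vertical_altitude_setpoint(p_qz, v_qz, v_rz, v_rz_des=-0.2, K1=0.8, K2=0.1):
    """Altitude setpoint from Eqs. (5)-(9).

    v_rz is the relative vertical velocity estimate from the Kalman filter;
    v_rz_des, K1 and K2 are illustrative values, not the paper's tuning.
    """
    v_lz = v_qz - v_rz               # Eq. (6): platform vertical velocity
    v_qz_des = v_rz_des + v_lz       # Eq. (5): target absolute velocity
    e_vz = v_qz_des - v_qz           # Eq. (7): vertical velocity error
    v_qz_star = v_qz + K1 * e_vz     # Eq. (8): desired vertical velocity
    return p_qz + K2 * v_qz_star     # Eq. (9): altitude setpoint
```

For a stationary platform ($v_{r,z} = v_{q,z}$), the setpoint simply commands a descent at the desired relative velocity scaled by the gains.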

#### *3.4. Finite State Machine*

The landing phase is described by a series of connected states whose transitions are handled by a finite state machine. The behavior of the quadrotor is described by eight states: initialization, searching, tracking, hovering, descending, ascending, compensation, and landing; Figure 4 shows how the states are linked. The transitions among them are triggered by Boolean conditions.

**Figure 4.** The diagram of the proposed finite state machine for the autonomous landing.

Initially, the quadrotor performs the rendezvous with the catamaran. The latter sends a stream of its GNSS position to the former, which flies to reach it. The catamaran's GNSS position is exploited only in this initial phase, as it can be imprecise and would not guarantee a robust and reliable landing, especially in cases of signal loss. When the quadrotor reaches the area described by the received GNSS coordinates, the finite state machine starts; its states are detailed in the following subsections.
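A skeleton of such a machine can be sketched as follows. Only the transitions explicitly described in the text (initialization ↔ searching, initialization → tracking) are encoded; the remaining arcs of Figure 4 are omitted, and the predicate names are illustrative.

```python
from enum import Enum, auto

class State(Enum):
    INITIALIZATION = auto()
    SEARCHING = auto()
    TRACKING = auto()
    HOVERING = auto()
    DESCENDING = auto()
    ASCENDING = auto()
    COMPENSATION = auto()
    LANDING = auto()

def next_state(state, detected, positioned):
    """Transition function covering only the arcs described in the text."""
    if state is State.INITIALIZATION:
        if not detected:
            return State.SEARCHING          # platform not seen: search for it
        # once detected and correctly positioned, start tracking
        return State.TRACKING if positioned else State.INITIALIZATION
    if state is State.SEARCHING:
        # once the platform is detected, go back to initialization
        return State.INITIALIZATION if detected else State.SEARCHING
    return state                            # remaining transitions omitted
```

Each control iteration evaluates the Boolean conditions from the sensor data and feeds them to `next_state`.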

#### 3.4.1. Initialization

This is the entry point of the procedure. In this state, the quadrotor reaches the starting altitude and starts to look for the landing platform. If the landing pad is not detected, the finite state machine changes the state to *searching*. Otherwise, the quadrotor places itself in a specific position and orientation with respect to the catamaran. The basic idea is to prevent landing from a position where the quadrotor could hit the roll-bar located on the stern side of the catamaran (see Figure 1).

For this purpose, as shown in Figure 5, the quadrotor is placed in front of the landing platform, at a certain distance from its center. In detail, the desired positions $p^{*}_{q,x}$ and $p^{*}_{q,y}$ are computed directly by:

$$p_{q,x}^{*} = p_{q,x} + e_{p,x} + R \sin(\psi_l) \,, \tag{10}$$

$$p_{q,y}^{*} = p_{q,y} + e_{p,y} + R \cos(\psi_l) \,, \tag{11}$$

where $e_{p,x}$ and $e_{p,y}$ are the estimated horizontal error components (see Section 3.2), $R$ is the desired fixed distance the quadrotor has to keep from the platform, and $\psi_l$ is the catamaran's yaw angle. To avoid further complicating the quadrotor's movements, its yaw is kept constant for the whole landing phase.
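As a sketch of Equations (10)–(11) (applying the sign convention of Eq. (11) to both axes; the value of `R` is an illustrative assumption):

```python
import math

def initialization_setpoint(p_qx, p_qy, e_px, e_py, psi_l, R=2.0):
    """Desired position in front of the platform, Eqs. (10)-(11).

    e_px, e_py : estimated horizontal error components (Section 3.2)
    psi_l      : catamaran yaw angle [rad]; R is illustrative.
    """
    p_x_star = p_qx + e_px + R * math.sin(psi_l)   # Eq. (10)
    p_y_star = p_qy + e_py + R * math.cos(psi_l)   # Eq. (11)
    return p_x_star, p_y_star
```

For $\psi_l = 0$ the offset is purely along $y$, i.e., the quadrotor stands off the platform along the catamaran's heading.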

**Figure 5.** The generation of the initial relative position $p^{*}_{q}$ during the initialization phase. The quadrotor needs to place itself in front of the catamaran, and it does so by moving around the platform and positioning itself at a certain distance from it.

#### 3.4.2. Searching

If the quadrotor has no visual information about the position of the landing platform, it enters a state in which it searches for it. To this end, the quadrotor reaches a predefined altitude and flies in circles of increasing radius. In particular, taking as the center the quadrotor's position at the start of the searching phase, $(p_{q,x}(t_0), p_{q,y}(t_0))$, the desired position of the quadrotor is defined by:

$$p_{q,x}^{*}(t) = p_{q,x}(t_0) + R \cos\left(\frac{v_s}{R}(t - t_0)\right) \,, \tag{12}$$

$$p_{q,y}^{*}(t) = p_{q,y}(t_0) + R \sin\left(\frac{v_s}{R}(t - t_0)\right) \,, \tag{13}$$

where $R$ is the desired radius of the current circle and $v_s$ is the desired linear velocity to be tracked during this phase. The search continues along the current circle as long as the following condition holds:

$$\left(t - t\_0\right) < \frac{2\pi R}{v\_s} \,. \tag{14}$$

When this condition is no longer true, the parameters are updated: $R$ is increased by 0.5 m, so that the quadrotor inspects a new area while overlapping part of the previous one, and $t_0$ is set to the current value of $t$ ($t_0 = t$). The condition then becomes true again, and at the next iteration the quadrotor starts a new circle with an increased radius. The structure of the right-hand side of (14) ensures that the quadrotor starts each new circle at the exact instant it finishes the previous one.

Once the quadrotor detects the platform, the landing procedure resumes, and the quadrotor goes back to the initialization state. This process guarantees the success of the search even if only the approximate position of the platform is known. If the quadrotor loses sight of the platform while already landing, the searching state also takes into account the last computed vision error, so that the quadrotor centers itself on the last known position of the catamaran before restarting the search.
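The expanding-circle search of Equations (12)–(14), together with the radius update rule, can be sketched as follows (the 0.5 m increment comes from the text; variable names are illustrative):

```python
import math

def search_setpoint(t, t0, p0, R, v_s):
    """Circular search setpoint, Eqs. (12)-(13), with the radius growth rule.

    p0 = (x, y) at the start of the searching phase. When a full circle is
    completed (condition (14) no longer holds), R grows by 0.5 m and t0
    restarts. Returns the setpoint and the updated (R, t0).
    """
    if (t - t0) >= 2.0 * math.pi * R / v_s:   # circle completed: update params
        R += 0.5
        t0 = t
    ang = (v_s / R) * (t - t0)
    x_star = p0[0] + R * math.cos(ang)        # Eq. (12)
    y_star = p0[1] + R * math.sin(ang)        # Eq. (13)
    return x_star, y_star, R, t0
```

Calling this function at each control iteration traces concentric circles whose radii grow by 0.5 m per revolution, as described above.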

#### 3.4.3. Tracking

When the quadrotor is correctly positioned with respect to the catamaran, the tracking state is triggered, which handles the reduction of the vertical and horizontal errors between the two agents. The descent of the quadrotor toward the catamaran follows a slanted path; in particular, at the time instant $t_h$ at which this state starts, a slope is chosen between the current altitude and an altitude $z_{max}$ (ideally, the maximum distance from the platform that still allows the horizontal tracking of the smaller tags). The $z$ reference is computed using:

$$z(t_h) = m(t_h) \, e_{p,x}(t_h) \,, \tag{15}$$

where

$$m(t\_h) = -\frac{p\_{q,z}(t\_h) - z\_{\max}}{R\sin(\psi\_l(t\_h))}\,. \tag{16}$$
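Equations (15)–(16) combine into a single reference computation; a sketch (where `R` is the fixed stand-off distance of the initialization phase, with an illustrative value):

```python
import math

def descent_reference(p_qz_th, z_max, psi_l_th, e_px_th, R=2.0):
    """Slanted-descent z reference, Eqs. (15)-(16).

    p_qz_th  : quadrotor altitude at the start of the tracking state
    z_max    : altitude that still allows tracking of the smaller tags
    psi_l_th : catamaran yaw at t_h; e_px_th : horizontal error at t_h.
    """
    m = -(p_qz_th - z_max) / (R * math.sin(psi_l_th))   # slope, Eq. (16)
    return m * e_px_th                                   # z reference, Eq. (15)
```

As the horizontal error $e_{p,x}$ shrinks, the $z$ reference moves the quadrotor down the chosen slope toward $z_{max}$.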
