*3.2. Detection and Localization of the Mobile Platform*

In this section, we describe the proposed method to *detect* and *localize* the landing platform relative to the UAV's coordinate frame. To this end, several standard computer vision techniques were used. It should be noted that this task is not the central topic of this work, but rather a prerequisite for the remaining steps that make up the autonomous landing of a UAV on a moving target. Thus, it was assumed that the landing platform had an easy-to-detect pattern (color-wise) or a marker on top of it. In either case, OpenCV [20] functions are readily available to accomplish shape- and color-based or more accurate marker-based detection.

The detection algorithm is summarized in Algorithm 1. First, the input frame is converted to the Hue, Saturation, Value (HSV) color model; second, a color mask is applied, keeping all pixels within a certain HSV range, in this case the range corresponding to red; third, a Gaussian filter is applied to blur the image; finally, the Canny edge detection and Hough line transform algorithms are employed, followed by polygon fitting and computation of the centroid's coordinates in the image plane. An example result of this approach using the downward-looking camera of a real aerial vehicle (AR Drone 2.0) is shown in Figure 3.

```
Algorithm 1 Detection algorithm.
Input: video_feed
Output: centroid
 1: while true do
 2:     frame ← next frame of video_feed
 3:     hsv ← convert frame to the HSV color model
 4:     mask ← apply HSV color mask (red range) to hsv
 5:     blurred ← apply Gaussian filter to mask
 6:     edges ← Canny edge detection on blurred
 7:     lines ← Hough line transform on edges
 8:     polygon ← fit polygon to lines
 9:     centroid ← centroid of polygon in the image plane
10: end while
```

**Figure 3.** Detection of the landing platform and extraction of its centroid using the imagery from the downward-looking camera of an AR Drone 2.0.

To convert the previously calculated coordinates of the centroid in the image plane Ω into a three-space vector in the camera frame, the inverse of the well-known pinhole camera model can be used. This process is depicted in Figure 4.

**Figure 4.** Pinhole camera geometry. *C* is the camera center and *p* the principal point [21].

Assuming radial distortion has been removed in a pre-processing step, Equations (1) and (2), i.e., the pinhole camera model equations, can be used to define the projection of a three-space vector in the camera frame into the image plane:

$$
\lambda \begin{pmatrix} u' \\ v' \\ 1 \end{pmatrix} = \begin{pmatrix} f & 0 & c_x \\ 0 & f & c_y \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \end{pmatrix} \tag{1}
$$

$$\begin{aligned} u &= u'/\lambda\\ v &= v'/\lambda \end{aligned} \tag{2}$$

where (*u*, *v*)*<sup>T</sup>* is the projection of the point on the image plane expressed in pixels; *f* is the focal length in pixels; (*c<sub>x</sub>*, *c<sub>y</sub>*)*<sup>T</sup>* are the coordinates of the principal point of the camera; and (*X*, *Y*, *Z*) are the three-space coordinates of the landing platform's centroid in the UAV's camera frame. Note that in the above equation we have assumed square and non-skewed pixels.
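As a concrete illustration of Equations (1) and (2), the following sketch builds the intrinsic matrix and projects a camera-frame point to pixel coordinates; the function name and the numeric intrinsics in the example are assumptions for illustration only.

```python
import numpy as np

def project_to_image(point_cam, f, cx, cy):
    """Pinhole projection (Eqs. (1)-(2)): maps a 3-space point
    (X, Y, Z) in the camera frame to pixel coordinates (u, v).
    Assumes square, non-skewed pixels (single focal length f)
    and that radial distortion has already been removed."""
    K = np.array([[f,   0.0, cx],
                  [0.0, f,   cy],
                  [0.0, 0.0, 1.0]])
    # (u', v', lambda)^T = K (X, Y, Z)^T, Eq. (1)
    uvw = K @ np.asarray(point_cam, dtype=float)
    # u = u'/lambda, v = v'/lambda, Eq. (2)
    return uvw[0] / uvw[2], uvw[1] / uvw[2]
```

For instance, with *f* = 100 px and principal point (320, 240), the point (1, 2, 4) in the camera frame projects to pixel (345, 290).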

To obtain the desired 3D coordinates in the camera frame, we need to invert the pinhole model. Additionally, if we know *Z* beforehand (in our case, *Z* corresponds to the vertical distance from the center of the UAV's camera frame to the center of the landing platform), we can easily compute *X* and *Y*:

$$\begin{pmatrix} X \\ Y \end{pmatrix} = Z\begin{pmatrix} \frac{u - c_x}{f} \\ \frac{v - c_y}{f} \end{pmatrix} \tag{3}$$

and thus obtain the 3D position of the landing platform's centroid with respect to the UAV's camera frame.
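The inverse step of Equation (3) is a one-liner per axis; the sketch below (function name assumed for illustration) recovers the platform centroid in the camera frame given the known depth *Z*.

```python
def backproject(u, v, Z, f, cx, cy):
    """Inverse pinhole model (Eq. (3)): recovers (X, Y, Z) in the
    camera frame from pixel coordinates (u, v) and a known depth Z
    (here, the vertical camera-to-platform distance). Assumes square
    pixels, i.e., a single focal length f."""
    X = Z * (u - cx) / f
    Y = Z * (v - cy) / f
    return X, Y, Z
```

Note that back-projecting the pixel produced by the forward model recovers the original camera-frame point, which is a useful sanity check for the calibration values.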

We discard measurements taken when the Inertial Measurement Unit (IMU) indicates an inclination greater than a given threshold. Therefore, in theory, the relative positions of the landing platform with respect to the UAV computed by the detection-localization algorithm are always obtained without a major tilt, and are therefore reasonably reliable. Nonetheless, a minor level of noise is always present in the output of this algorithm. Leveraging a Kalman filter, as we will explain in Section 3.3.2, will prove to be an effective way to deal with such measurement noise.
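The attitude gate can be sketched as a simple predicate on the IMU's roll and pitch; the 5-degree threshold below is an illustrative value, not the threshold tuned in this work.

```python
import math

def accept_measurement(roll_rad, pitch_rad, max_tilt_deg=5.0):
    """IMU-based gating: a detection is kept only when both roll and
    pitch are below a tilt threshold, so the flat-camera assumption
    behind Eq. (3) approximately holds. The 5-degree default is an
    illustrative value, not the paper's tuned threshold."""
    max_tilt = math.radians(max_tilt_deg)
    return abs(roll_rad) < max_tilt and abs(pitch_rad) < max_tilt
```

Measurements that pass this gate are then handed to the Kalman filter, which smooths the residual noise.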
