1. Introduction
Unmanned aerial vehicles (UAVs) are widely used in various fields due to their small size, flexible operation, and stable flight [1]. One of the most important tasks for a UAV is autonomous landing without human intervention. This is crucial for the practical application of UAVs in different fields, such as shipboard landing [2], UAV carrier landing [3], package delivery [4], the return of spacecraft [5], etc. Recently, vision-aided guidance, navigation, and control technology for UAVs has attracted much attention with the help of computer vision technology [6,7,8]. It shows great potential for guidance and navigation owing to its low cost, anti-interference capability, and autonomous operation.
In vision-based landing guidance, landing marker detection and the landing guidance strategy are the key technologies. Landing markers provide visual guidance and positioning information during the landing of a UAV. The commonly used markers include: (1) well-shaped markers such as “H”, “T”, circles, rectangles, etc. [9]; (2) QR code-based landing markers such as QR, ArUco, AprilTag, etc. [10]; and (3) composite landing markers combined from multiple geometric shapes or constructed from nested QR codes [11]. Composite landing markers can adapt to changes in the UAV’s field of view and are currently among the most widely used landing markers. In [12], a multi-level marker that facilitates the detection of landing markers by UAVs at high altitudes is proposed. In [13], a marker consisting of a circle and two vertical line segments inside it is designed to estimate the altitude of the UAV. In [14], a nested QR code is designed for the precise landing of the UAV. An optimal landing marker design should balance recognition speed and the range of visibility.
The guidance strategy also plays a critical role in the UAV landing process, and many studies have been conducted on autonomous landing guidance strategies. In [15], image-based visual servoing with feature shape compensation is proposed for UAV landing on ships. In [16], a method for estimating the rotation and translation between two camera views from at least five matching corners is proposed. In [17], a vision-based tracking and landing algorithm on a moving platform is proposed for a multi-rotor UAV. In [18], a visual landing method for low-cost quad-UAVs on unknown mobile platforms is designed. In [19], a robust controller is proposed for two separate stages of the UAV landing. In [20], a point clustering relative position estimation algorithm is proposed. In [21], a carrier landing system with a fixed-time controller is presented. In [22], a model predictive controller is developed for non-horizontal landing. In [23], a trajectory planning method is proposed for UAVs with low maneuverability to land on charging platforms. In [24], an elastic visibility-aware planning and flexible adjustment method is designed for UAV landing on sloping platforms. However, the above research has certain constraints. For example, the ship-landing solutions are suitable for large UAVs but are hard to apply to small UAVs with limited computation resources. In addition, most existing low-cost landing solutions for small UAVs suffer from inaccurate visual estimation.
In this paper, an autonomous landing guidance strategy is designed for a small quad-UAV based on a nested landing marker and fused altitude. A newly designed multiscale landing marker is captured by the onboard camera and processed by a modified YOLOv4. The lateral and vertical positions are estimated based on the vision information and the fused altitude. Then, the landing commands are generated by the image-based visual servoing method. The main contributions of this study are as follows:
(1) A landing marker detection method is designed to detect the double-layered ArUco marker, which is obtained by combining two special ArUco codes in a certain pattern. In the detection, the normalized Wasserstein distance (NWD) is used instead of the intersection over union (IoU) to calculate the similarity between the bounding boxes of adjacent small targets predicted by YOLOv4. By this means, the detection accuracy of the landing marker is improved in complex flight environments, such as when multiple small obstacles occur at the same time or the marker is partially covered by obstacles.
(2) The altitude is a key state during the whole landing process. The relative altitude between the UAV and the marker calculated by the visual Perspective-n-Point (PnP) method often contains errors. Hence, an altitude correction method is proposed by fusing the image-based altitude and the inertial measurement unit-based altitude. By such means, the estimation accuracy of the altitude is improved, leading to a more precise landing result.
(3) A re-guidance strategy together with the image-based visual servoing method is designed, considering the movement of the marker or interruptions during the landing. Through the landing marker recapture method, adaptive landing guidance is achieved for the quad-UAV. The performance of the proposed strategy is verified by multiple simulations and flight experiments in different scenarios.
The rest of this paper is outlined as follows. Section 2 describes the framework of the proposed landing guidance strategy. Section 3 presents the marker design and detection algorithm. In Section 4, the position estimation and adjustment are described. The overall realization of the proposed landing strategy is presented in Section 5. Section 6 analyzes the test results. Section 7 concludes the paper.
2. The Framework of Proposed Landing Guidance
The proposed UAV landing guidance system is shown in Figure 1, consisting of a quad-UAV, the landing marker, and the ground station. The quad-UAV is equipped with a camera, an inertial measurement unit (IMU), and an onboard processor. The monocular camera captures video of the landing marker. The IMU measures the acceleration and angular velocity of the UAV. The onboard processor communicates with the ground station via WiFi. In the onboard processor, the data from the onboard sensors are processed and then sent to the ground station.
The landing marker is attached to a fixed or moving target. A nested marker is designed to accommodate the varying viewing height of the UAV. By integrating ArUco codes of multiple scales into the marker, visibility and positional accuracy are enhanced under different viewing angles.
On the ground station, the visual information received from the UAV is processed by the modified YOLOv4 algorithm. The landing marker is detected, and its center coordinate is estimated. Based on the visual image and other sensor information, the PnP method is used to estimate the relative position between the UAV and the landing marker. Afterwards, the position information is sent to the landing guidance module.
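As an illustration of this step, the following minimal Python sketch estimates the relative position of the marker in the camera frame with OpenCV's solvePnP. The marker side length, camera intrinsics, and corner ordering shown here are placeholder assumptions and must be replaced by the calibrated values of the actual platform.

```python
import numpy as np
import cv2

# Hypothetical values: the marker side length and camera intrinsics must be
# calibrated for the actual UAV camera before use.
MARKER_SIZE = 0.6  # m, outer marker size used in the simulation section
camera_matrix = np.array([[920.0,   0.0, 480.0],
                          [  0.0, 920.0, 360.0],
                          [  0.0,   0.0,   1.0]])
dist_coeffs = np.zeros(5)

def relative_position(corners_px):
    """Estimate the marker position in the camera frame from its four detected
    corner pixels (assumed order: top-left, top-right, bottom-right, bottom-left)
    using the PnP method."""
    half = MARKER_SIZE / 2.0
    object_pts = np.array([[-half,  half, 0.0],
                           [ half,  half, 0.0],
                           [ half, -half, 0.0],
                           [-half, -half, 0.0]], dtype=np.float32)
    image_pts = np.asarray(corners_px, dtype=np.float32)
    ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts,
                                  camera_matrix, dist_coeffs)
    if not ok:
        return None
    # tvec is the marker center expressed in the camera frame; its z component
    # is the image-based relative altitude that is later fused with the IMU.
    return tvec.ravel()
```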
Furthermore, an autonomous landing guidance strategy is designed to generate guidance commands for the UAV. The commands are then transmitted to the UAV via WiFi to guide the UAV towards the marker during its approach.
6. Simulation and Experiment
In this section, simulations and experiments are conducted to verify the performance of the proposed strategy. The marker detection algorithm is first demonstrated using multiple images of the marker captured from different viewing angles. Then, the proposed altitude correction method is analyzed. After that, simulation tests on the AMOVLab platform are conducted. Finally, flight experiments are conducted using the DJI Tello UAV. The code and video of the proposed algorithm can be viewed in [30,31].
6.1. Landing Marker Detection Results Analysis
A dataset is constructed by collecting images of the marker from different UAV fields of view. The flight heights are 5 m, 10 m, and 15 m. Different scenarios are considered, including similar backgrounds, strong lighting, similar small targets, and natural grassland. The photographed objects include the landing marker, a book, and a person, where the landing marker is the main target and the book and person are interferences. The dataset is then annotated with labeling software, and the three types of objects are labeled as ArUco, book, and person, respectively. The labels follow the Pascal VOC format, resulting in 5818 training samples and 443 test samples for algorithm validation. After annotation is completed, each label file is saved in XML format. The prepared dataset is used to train the NWD-YOLOv4 model and to verify its detection performance.
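For readers unfamiliar with the Pascal VOC format, the following sketch shows how one annotation file of such a dataset can be read. The field names follow the VOC convention, while the class names ArUco, book, and person match the labels described above; the function name is illustrative.

```python
import xml.etree.ElementTree as ET

def load_voc_annotation(xml_path):
    """Read one Pascal VOC XML label file and return (class, box) pairs,
    where a box is (xmin, ymin, xmax, ymax) in pixels."""
    root = ET.parse(xml_path).getroot()
    objects = []
    for obj in root.findall("object"):
        name = obj.findtext("name")  # "ArUco", "book" or "person"
        box = obj.find("bndbox")
        coords = tuple(int(float(box.findtext(tag)))
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((name, coords))
    return objects
```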
Figure 9 presents the comparison results obtained by the traditional ArUco detection algorithm, the YOLOv4 algorithm, and the NWD-YOLOv4 algorithm. In Figure 9a,b, the marker is partially obscured. The results illustrate that the traditional ArUco extraction algorithm [32] fails to detect it, as shown in Figure 9a, even in an ideal environment with a sufficiently large marker, whereas the YOLOv4-based algorithm detects the obscured ArUco with a high accuracy of 87.22%, as shown in Figure 9b. In Figure 9c,d, the image is captured under the same field of view of the UAV. Additional small targets, including a book and a person, are added, which cause interference near the landing marker. The detection result for the marker by the YOLOv4 algorithm is 76.85%, as depicted in Figure 9c. When multiple adjacent small targets appear within the field of view of the UAV, false detection easily occurs. This is because the IoU used in traditional YOLOv4 is very sensitive to small target scales, and it is difficult to measure the similarity between two boxes when they barely intersect. By employing the improved NWD-YOLOv4 detection algorithm, multiple small targets can be distinguished with a higher detection accuracy of 88.36%, as depicted in Figure 9d. In Figure 9e, the images are captured under different fields of view of the UAV. These scenarios contain similar backgrounds, strong lighting, similar small targets, and natural grassland. Using the NWD-YOLOv4 detection algorithm, the markers are detected with accuracies of 87.93%, 96.68%, 98.76%, and 97.24%, respectively, and are clearly distinguished without false detection in the presence of similar small target interferences.
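To make the substitution concrete, the following sketch computes the NWD between two axis-aligned boxes by modelling each box as a 2-D Gaussian, following the normalized Wasserstein distance formulation. The constant C is dataset-dependent, and the value shown here is purely illustrative.

```python
import math

def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between two boxes given as (cx, cy, w, h).
    Each box is modelled as a 2-D Gaussian with mean (cx, cy) and covariance
    diag((w/2)^2, (h/2)^2); c is a dataset-dependent normalization constant."""
    w2_sq = ((box_a[0] - box_b[0]) ** 2 +
             (box_a[1] - box_b[1]) ** 2 +
             ((box_a[2] - box_b[2]) / 2.0) ** 2 +
             ((box_a[3] - box_b[3]) / 2.0) ** 2)
    # Unlike IoU, this similarity stays informative even when two small boxes
    # barely intersect, which is why it suits adjacent small targets.
    return math.exp(-math.sqrt(w2_sq) / c)
```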
To further verify the effectiveness of the proposed detection method, comparison experiments are conducted. Three models, including YOLOv4, NWD-YOLOv4, and YOLOv8, are trained on the same small-object dataset. The object detection experiments are then performed in the same scenario. The results of the quantitative analysis are shown in Table 1.
The performance of YOLO series networks is usually evaluated by the detection accuracy (such as mAP@0.5 and mAP@0.5:0.95) and the model complexity (such as GFLOPs and params), as shown in Table 1. AP refers to average precision, an indicator calculated from the precision and recall of the network, and mAP is the mean of the AP values. mAP@0.5 is the mAP value when the IoU threshold used in evaluation is 0.5, and mAP@0.5:0.95 is the average of the mAP values when the evaluation threshold takes the values 0.5, 0.55, …, 0.95. GFLOPs refers to the number of floating-point operations, and "params" is the total number of parameters that need to be trained in the network model.
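As a reminder of how the second metric is obtained, the following minimal sketch averages the mAP over the ten evaluation IoU thresholds; ap_at_threshold is a hypothetical callable standing in for the evaluator.

```python
import numpy as np

def map_50_95(ap_at_threshold):
    """mAP@0.5:0.95 is the mean of the mAP values computed at IoU thresholds
    0.50, 0.55, ..., 0.95.  `ap_at_threshold` is assumed to return the mAP
    for a single threshold."""
    thresholds = np.arange(0.50, 0.96, 0.05)  # ten thresholds
    return float(np.mean([ap_at_threshold(t) for t in thresholds]))
```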
As can be seen from Table 1, for the NWD-YOLOv4 method, the values of mAP@0.5 and mAP@0.5:0.95 are 93.70% and 44.72%, respectively, which are 0.73% and 12.36% higher than those of YOLOv4, while the GFLOPs and params of YOLOv4 and NWD-YOLOv4 are the same, namely 6.957 G and 6.057 M. Hence, it can be concluded that, compared with YOLOv4, the detection accuracy of NWD-YOLOv4 is improved without increasing the amount of computation.
The mAP@0.5 of the YOLOv8 network is 95.1%, which is 1.4% higher than that of NWD-YOLOv4. However, the mAP@0.5:0.95 of YOLOv8 is only 24.6%, which is much lower than that of NWD-YOLOv4 (44.72%). This means that, when the evaluation IoU threshold is increased, the mAP of YOLOv8 decreases faster, which is not conducive to the accurate screening of prediction boxes. Meanwhile, the GFLOPs and params of YOLOv8 are also much larger than those of NWD-YOLOv4. Hence, compared with YOLOv8, NWD-YOLOv4 is much faster and achieves better accuracy at mAP@0.5:0.95. In summary, the proposed NWD-YOLOv4 has better performance in small target detection tasks.
6.2. Altitude Correction Results Analysis
In this subsection, the altitude correction strategy proposed in Section 4.1 is tested. This experiment uses a motion capture system composed of 12 cameras (model: OptiTrack Prime 13), as shown in Figure 10. The Tello UAV and the landing marker, each fitted with four reflective balls, are placed in the motion capture workspace. The host computer software Motive 2.3.1 is used to build the rigid bodies of the Tello and the landing marker and to monitor their motion in real time in the workspace.
Figure 11a,b present the acceleration data of the Tello UAV. The blue curves are the unprocessed data, and the red curves represent the filtered data. In Figure 11a, the acceleration signal is obtained from the onboard accelerometer. It can be observed that the unprocessed data contain significant noise due to environmental interference, while the filtered data are smooth, with part of the noise removed by the Kalman filter. Figure 11b shows the acceleration data obtained by differentiating twice the altitude estimated from the image information. The filtered acceleration data are then used in the complementary fusion to obtain a more accurate altitude by the strategy proposed in Section 4.2.
In Figure 11c, the comparison of the altitude data is presented. The black curve represents the altitude estimated from the visual image. The red curve represents the fused altitude obtained using the strategy proposed in Section 4.2. The blue curve is obtained from the motion capture system and is mainly used for comparison with the fused altitude. The green curve represents the error between the fused and real altitudes, and it can be observed that this error gradually tends to zero over time.
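A minimal sketch of such a complementary fusion is given below, assuming the image-based altitude is reliable at low frequency and the Kalman-filtered vertical acceleration drives the high-frequency prediction. The gain and sample period are tuning placeholders, not the values used in this work.

```python
class AltitudeFusion:
    """Complementary-filter sketch: the IMU path (integrated, filtered vertical
    acceleration) provides fast dynamics, while the image-based altitude from
    PnP corrects slow drift."""

    def __init__(self, k=0.98, dt=0.02):
        self.k = k      # weight of the IMU (high-frequency) path, illustrative
        self.dt = dt    # sample period in seconds, illustrative
        self.alt = 0.0  # fused altitude estimate
        self.vz = 0.0   # vertical velocity estimate

    def update(self, acc_z, alt_image):
        # Propagate the altitude with the Kalman-filtered vertical acceleration.
        self.vz += acc_z * self.dt
        alt_imu = self.alt + self.vz * self.dt
        # Blend the fast IMU prediction with the slow image-based measurement.
        self.alt = self.k * alt_imu + (1.0 - self.k) * alt_image
        return self.alt
```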
6.3. Simulation Test
In this section, a simulation test is conducted to verify the proposed landing guidance strategy on a simulation platform named Prometheus designed by AMOVLab [33,34]. Both a static landing target and a moving landing target are considered. The simulation platform is built on Ubuntu 16.04 Linux, with ROS Kinetic and the Gazebo simulator installed. In the simulation, the takeoff point is regarded as the origin of the earth-fixed coordinate system, and the landing marker size is 0.6 m × 0.6 m.
6.3.1. Simulation Result of Landing on Static Target
The landing marker is statically placed at (1.0, 1.0, 1.0) m in the earth-fixed coordinate system. The UAV takes off in OFFBOARD mode and hovers at an altitude of 5 m. GPS is first used to guide the UAV to the area near the landing zone. The vision-based landing strategy proposed in this paper then guides the UAV once the marker is captured by the onboard camera.
Figure 12 presents the vision-based landing process. The landing marker is continuously partially obstructed. It can be observed that the landing marker can still be detected, as shown in the top right corner of Figure 12a. Based on the detected visual information, the UAV is gradually guided to approach the marker, as shown in Figure 12b,c. The corresponding 3D landing trajectory of the UAV is presented in Figure 12e.
Figure 13 shows the three-axis positions and velocities of the UAV during landing on the static target. In Figure 13c, as shown by the blue curve, the UAV starts landing from an altitude of 5 m. With the help of the proposed strategy, the UAV continuously lowers its altitude to fly closer to the marker, denoted by the black curve. The total landing process lasts 11 s. When the UAV’s altitude decreases to 0.2 m above the marker, only the inner marker is seen by the camera, as shown in Figure 12c. The altitude and vertical velocity show a sudden change at around 10.5 s, as illustrated in Figure 13c–f. Meanwhile, the lateral deviation from the landing marker gradually decreases, as shown in Figure 13a,b. Due to the offset between the center points of the inner and outer markers, there is a slight change in horizontal velocity at around 10.5 s, as shown in Figure 13d,e. After 11 s, the UAV locks its motors and finally lands on the marker. At this point, the position of the UAV is (1.0078, 1.0051, 0) m.
To further verify the accuracy and practicality of the proposed landing strategy, 10 simulation cases are conducted. Figure 14a shows the final landing positions of the UAV when starting from different altitudes, ranging from 2.5 m to 3 m, relative to the landing marker, as well as the error distance between each final landing point and the center of the marker. As shown in Figure 14b, the average distance deviation between the final position of the quad-UAV and the center of the marker is 0.031 m. The simulation results demonstrate that the proposed strategy achieves high accuracy for landing on static targets in the simulation platform.
6.3.2. Simulation Result of Landing on Moving Target
In this subsection, the landing marker is placed on a moving vehicle, as shown in Figure 15. The UAV takes off to (0, 0, 5.0) m in the earth-fixed coordinate system and remains in hover. The initial position of the moving vehicle is (1.0, 2.5) m. The vehicle moves uniformly in a straight line along the X-axis at a constant speed of 0.5 m/s, and its Y-axis velocity remains 0 m/s. The carriage height of the vehicle is 1.1 m, so the initial position of the landing marker is (1.0, 2.5, 1.1) m. When the UAV’s onboard camera can stably capture the landing marker, the UAV begins to land.
Figure 15a–d show the entire process of the UAV landing on the moving marker, and Figure 15e presents the trajectories of the quad-UAV and the moving marker.
Figure 16 shows the position and velocity of the quad-UAV when landing on the moving target. Once the marker is captured by the camera, the UAV starts accelerating along the X-axis in the same direction as the moving vehicle, as illustrated in Figure 16a, to ensure that the marker is tracked and always remains in the UAV’s field of view. Based on the proposed landing strategy, the altitude of the UAV descends gradually and the lateral position is adjusted, as shown in Figure 16a–c. It can be observed that there is a rapid change in horizontal velocity at around 8.5 s, as shown in Figure 16e. This is because, when the distance between the UAV and the marker reaches the view switching distance, the camera can only detect the inner marker, whose center is offset from that of the outer marker, causing the speed change. It also results in a change in vertical velocity, as shown in Figure 16f.
The total landing time of the UAV is about 9.5 s. Finally, the UAV lands at (2.1398, 2.5211, 1.1) m, while the terminal position of the marker is (2.0, 2.5, 1.1) m, as shown by the black curves in Figure 16a,b. The deviations between them on the X-axis and Y-axis are 0.1398 m and 0.0211 m, respectively. The above results show that the proposed method performs well in simulation environments when the landing marker is attached to a vehicle moving at 0.5 m/s. The moving velocity of the target is low in this simulation; if it were faster, the marker image could easily be lost from the UAV’s field of view. In that case, the recapture strategy discussed in Section 5 is initiated, which is demonstrated in the next subsection.
6.3.3. Simulation Result in the Case That Marker Is Lost
In this subsection, a simulation is conducted to verify the proposed recapture strategy in the case that the landing marker leaves the UAV’s field of view during the landing process. Figure 17 gives the 3D trajectory and altitude history of the UAV.
The landing marker is initially placed at (1.0, 1.0, 0) m in the earth-fixed coordinate system, as denoted by the red square in Figure 17a. When the UAV descends to an altitude of 3.2784 m, as labeled by the purple circle in Figure 17a, the landing marker is suddenly moved manually to (2.0, 2.0, 0) m, as indicated by the red star in Figure 17a. The marker is then out of the UAV’s field of view.
From Figure 17b, it can be observed that the UAV begins to descend from an altitude of 5 m. After about 6 s of landing, the marker image is lost. The UAV then attempts to search for the marker by changing its lateral position but does not find it. At about 46 s, it flies back up to the initial altitude of 5 m to search for the landing marker again. After the marker is re-detected by the camera, the landing guidance strategy works again and directs the UAV to approach the landing marker. The final position of the UAV is (2.0274, 2.0224, 0) m, and the deviations on the X-axis and Y-axis are 0.0274 m and 0.0224 m, respectively, indicating high landing precision.
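The recapture behaviour observed above can be summarized by the following state-machine sketch. The phase names, timeout condition, and search altitude are illustrative abstractions of the strategy in Section 5, not the exact implementation.

```python
from enum import Enum, auto

class Phase(Enum):
    DESCEND = auto()         # marker visible, normal visual landing guidance
    LATERAL_SEARCH = auto()  # marker lost, search by adjusting lateral position
    CLIMB_SEARCH = auto()    # climb back to the initial altitude and search again

def next_phase(phase, marker_visible, search_timed_out, at_search_altitude):
    """Sketch of the re-guidance logic: while the marker is visible the UAV keeps
    descending; when it is lost, the UAV first searches laterally, and if the
    search times out it climbs back to the initial altitude to recapture it."""
    if marker_visible:
        return Phase.DESCEND
    if phase is Phase.DESCEND:
        return Phase.LATERAL_SEARCH
    if phase is Phase.LATERAL_SEARCH and search_timed_out:
        return Phase.CLIMB_SEARCH
    if phase is Phase.CLIMB_SEARCH and at_search_altitude:
        return Phase.LATERAL_SEARCH
    return phase
```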
6.4. Experiment Test
6.4.1. Experiment Platform Description
The flight experiments are conducted in this section. The UAV used in the experiments is the DJI Tello, with a size of 17 cm × 17 cm × 4.1 cm. It is equipped with WiFi communication, an onboard forward-facing camera capable of capturing 720p/30 FPS video, an IMU, and two modules for estimating altitude, namely a barometer and a time-of-flight (ToF) module, as shown in Figure 18. The onboard camera capability of the small Tello UAV is limited, as it provides only a forward-looking view. To capture the downward-looking view where the landing marker is located, an additional tilted mirror is installed in front of the camera; a detailed description can be found in our previous work [34]. The captured video is then transmitted to the ground station.
The ground station is used to perform the marker detection and the calculation of the landing guidance commands, as shown in Figure 18. The environment configuration of the ground station is given in Table 2. The size of the marker used in the experiment is 0.205 m × 0.205 m.
6.4.2. Experiment Result of Landing on Static Target
In the experiment, the center point of the marker is chosen as the origin of the earth-fixed coordinate system. The speed limit is set to 0.07 m/s in the X and Y directions and 0.15 m/s in the vertical direction. A PID controller is used to adjust the speed.
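A hedged sketch of such an axis-wise PID speed adjustment with the stated saturation limits is given below; the gains are placeholders and not the values tuned for the Tello UAV.

```python
import numpy as np

# Speed limits taken from the experiment setup above; the PID gains below are
# illustrative placeholders only.
V_XY_MAX, V_Z_MAX = 0.07, 0.15
KP, KI, KD = 0.6, 0.0, 0.05

class AxisPID:
    def __init__(self):
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, err, dt):
        self.integral += err * dt
        deriv = (err - self.prev_err) / dt
        self.prev_err = err
        return KP * err + KI * self.integral + KD * deriv

def velocity_command(pos_err_xyz, pids, dt=0.05):
    """Convert the relative position error (marker minus UAV) on each axis into
    a saturated velocity command respecting the experiment speed limits."""
    raw = [pid.step(err, dt) for pid, err in zip(pids, pos_err_xyz)]
    limits = (V_XY_MAX, V_XY_MAX, V_Z_MAX)
    return [float(np.clip(v, -lim, lim)) for v, lim in zip(raw, limits)]
```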
Figure 19a–d show the static landing experiment process of the Tello UAV, and Figure 19e shows its landing trajectory.
Figure 20 shows the position and velocity on each axis during the static landing process of the Tello UAV. The takeoff position of the Tello UAV is (0.6, −0.3, 0) m. At the beginning, the UAV rises to an altitude of 2 m. When part or all of the landing marker enters the camera’s view, the UAV begins to approach the marker and slowly lowers its altitude based on the landing position adjustment criteria designed in Section 4.2. When the UAV descends to an altitude of 0.8 m, the onboard camera can only observe the inner ArUco, which is used for close-ground guidance. It should be noted that, when the altitude of the Tello UAV above the ground decreases to 0.2 m at 38 s, the video stream transmission is terminated because the Tello UAV is locked for safety. Finally, the UAV lands at the current adjusted position under the effect of gravity. The final landing position in the earth-fixed coordinate system is (0.0090, −0.0041, 0) m. The entire process, from the appearance of the landing marker in the UAV’s field of view to the final landing, takes about 39 s.
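The altitude-dependent switching described above can be summarized by the following sketch. The thresholds repeat those reported for this experiment, and the function name is hypothetical.

```python
def select_marker_level(fused_altitude, switch_alt=0.8, lock_alt=0.2):
    """Choose the guidance source by altitude: above ~0.8 m the outer ArUco is
    used, below it only the inner ArUco is visible and guides the close-ground
    phase, and below ~0.2 m the motors are locked and the UAV settles."""
    if fused_altitude <= lock_alt:
        return "lock"
    if fused_altitude <= switch_alt:
        return "inner"
    return "outer"
```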
Furthermore, 10 experiments are performed to further verify the proposed strategy. In all 10 cases, the Tello UAV flies to an altitude of 2 m. The deviations between the final landing point of the UAV and the center point of the marker are recorded and plotted in Figure 21a. The mean deviation is 0.043 m, as given in Figure 21b, showing good landing precision.
6.4.3. Experiment Result of Landing on Moving Target
In this subsection, landing experiments on a moving target are carried out. A rope is used to drag the marker, simulating a slowly and uniformly moving target. Figure 22 shows the landing process and the UAV trajectory.
Figure 23 shows the position and velocity in each direction during the landing process of the quad-UAV. The marker remains unchanged in both the X and Z directions. In the Y direction, the marker is dragged by a rope in a steady, slow motion with an average speed of 0.015 m/s and a maximum speed of 0.18 m/s. In the experiment, the moving velocity of the target is set lower than in the simulation, mainly because of the hardware limitations of the Tello UAV platform. The downward view of the UAV is obtained by reflecting the forward view of the onboard camera through an additional mirror, so the field of view is limited by the size of the mirror, which affects the capture range of the marker during the landing process.
The initial position of the marker is (0, 0.15) m. The UAV takes off to (0.6, 0.8, 2) m in the earth-fixed coordinate system and hovers to search for the landing marker. When the marker appears in the onboard camera’s field of view, the UAV begins to continuously track the moving marker, keeping the marker image in view. The deviations between the UAV and the marker in the X and Y directions are continuously adjusted by the method designed in Section 4.2, while the flight altitude slowly decreases. When the UAV descends to an altitude of 0.8 m, the onboard camera can only observe the inner ArUco, which is used for close-ground guidance. When the altitude reaches 0.2 m, the UAV motor turns off and the UAV lands directly on the ground. The whole process lasts about 29 s. The final deviations between the Tello UAV and the marker in the X and Y directions are 0.0187 m and 0.0019 m, respectively. The experimental results meet the requirements of dynamic landing with satisfactory precision.