1. Introduction
In the last decade, quadrotor usage for both civilian and military applications has significantly increased in applications such as construction, transportation, surveillance, industry, marine science, mapping, military, emergency response, and law enforcement. In construction and industry, quadrotors are used to examine the condition of structures, machinery, or infrastructure located in remote areas or at high altitudes. When surveying areas that are difficult to access on wheels or foot, quadrotors easily perform that kind of task [
1,
2]. In transportation, quadrotors are used for data collection, patrol, and recently for deliveries [
3,
4]. Quadrotors are employed in marine science studies for marine animal research [
5], marine species identification, and photogrammetric studies [
6]. For mapping applications, quadrotors scan an area and take pictures of it to build a virtual 3D model [
7], later used for mobile 3D mapping and image recognition. Quadrotors are also employed for grasping and retrieving objects while hovering, as seen in [
8,
9]. In emergency situations such as rescue missions, quadrotors may be used when foot or vehicular access is dangerous or difficult, for example, buildings after natural disasters or terror attacks [
10].
To accomplish the challenging tasks assigned to them, quadrotors require an accurate navigation system. Using a global navigation satellite systems (GNSS) receiver combined with an inertial navigation system (INS) is one of the commonly used approaches for quadrotor navigation. It provides position and velocity information with an accuracy suitable for various applications [
11,
12]. However, indoors and some outdoor locations (like urban canyons), GNSS readings might not be available and the navigation solution relies only on the INS solution in a situation known as pure inertial navigation. There, due to noise and errors in the inertial sensors, the navigation solution drifts in time.
As GNSS measurements are not available indoors, quadrotor navigation is usually based on the fusion between the INS and a camera.
In [
13,
14,
15] simultaneous localization and mapping (SLAM) approaches are used as an alternative to GNSS indoors. In [
16], a quadrotor indoor navigation algorithm based on position-based visual servoing was suggested, while indoor quadrotor navigation based on visual markers for quadrotor position computation was used for library inventory and book localization was suggested in [
17]. Both visual positioning systems and SLAM rely on a camera as a main sensor. However, in some situations, such as poor visibility, the camera is not usable. Therefore, as GNSS suffers from blockage and unavailability, and cameras from lighting conditions and distortion, the quadrotor navigation solution is expected to rely only on the inertial sensors in several portions of its trajectory. As a consequence, the navigation solution drifts with time [
18].
Recently, to cope with situations of pure inertial navigation, the quadrotor dead-reckoning (QDR) framework was proposed [
19]. Motivated by the pedestrian dead reckoning (PDR) approach using the smartphone inertial sensors [
20,
21,
22], QDR requires the quadrotor to be flown in a periodic motion trajectory instead of a straight line trajectory. In this manner, similar to step-length detection and estimation in PDR, the peak-to-peak change in distance of the quadrotor is estimated.
Figure 1 illustrates the straight line trajectory used in pure inertial navigation and also the periodic motion trajectory applied in the QDR approach.
QDR manages to significantly improve the pure inertial navigation solution; however, it has several shortcomings. QDR requires a gain calibration before it can be used and this gain is limited to the periodic motion applied in the process. In addition, QDR provides only a change in distance estimation between two peaks of the period motion along the trajectory, and so the ability to track the actual trajectory is limited to the time between those peaks, which could be several seconds. Finally, QDR provides only the change in distance of the quadrotor and does not provide its altitude; thus, it is limited to a two-dimensional position solution.
In other related navigation domains, machine learning (ML) and deep learning (DL) algorithms are used to improve the overall navigation performance. In [
23,
24,
25], a deep learning approach is used for robot indoor navigation. Human activity recognition [
26,
27,
28] and smartphone location recognition (SLR) [
29] algorithms based on ML/DL were shown to improve the accuracy of PDR by using it as a prior [
30,
31]. SLR was also shown to improve the performance adaptive attitude and heading reference system (AHRS) [
32]. ML approaches were also used to improve the accuracy and time to converge of the coarse alignment process [
33]. In autonomous underwater vehicles, an end-to-end DL approach was suggested to regress missing Doppler velocity log measurements [
34]. In addition, DL approaches greatly improved traditional PDR approaches [
35,
36,
37,
38,
39].
In this paper, we propose QuadNet, a hybrid DL-framework for quadrotor dead reckoning enabling three-dimensional position estimation at any user-defined time rate using only inertial sensor readings.
QuadNet, being a hybrid framework, uses regression neural networks to provide the quadrotor’s change in distance and altitude and model-based equations to determine the heading. Thus, QuadNet requires only the inertial sensor readings to provide the three-dimensional position vector. Two different network architectures for the regression task are suggested and evaluated.
The main contribution of the paper is an accurate pure quadrotor inertial navigation solution. Compared to the model-based QDR solution [
19], the proposed approach:
- 1.
Provides altitude information enabling the determination of the quadrotor three-dimensional position vector;
- 2.
The regression rate can be set at any desired time interval for the position vector estimation. For example, in [
19], it was a peak-to-peak estimation occurring approximately every seven seconds, and now it was reduced to less than a second.
- 3.
As a data-driven approach, there is no need for any calibration prior operation as required in the model-based QDR approach.
Experimental results with DJI’s Matrice 300 quadrotor are provided to show the benefits of using the proposed approach over the QDR approach in situations of pure inertial navigation.
The rest of the paper is organized as follows:
Section 2 presents the INS equations of motion and the QDR approach.
Section 3 presents the proposed QuadNet framework, and
Section 4 elaborates on the data collection process and prepossessing.
Section 5 describes the experiment results, and
Section 6 provides the conclusions.
4. Data Collection and Preprocessing
To evaluate QuadNet’s performance relative to the QDR approach, the same dataset as in [
19] was used. This dataset was recently published in the autonomous platforms inertial dataset [
43] and is available at
https://github.com/ansfl/Navigation-Data-Project/, accessed on 22 December 2020.
DJI’s Matrice 300 RTK quadrotor, as shown in
Figure 6, was used in the experiments. The experiments were performed in an outdoor environment for GNSS measurements availability. A Pixhawk Cube flight controller was mounted on the quadrotor landing gear to record its three IMU sets during flight. Note that out of the three IMUs, only two sets are mechanically vibration-isolated; therefore, the third IMU readings are not used in this work.
In addition, an RTK GNSS receiver was installed next the Pixhawk Cube to obtain the GT of the trajectory. The sampling rate of Pixhawk Cube IMU is 25 Hz, and the sampling rate of the RTK GPS is 10 Hz.
The quadrotor was flown manually by an experienced and licensed pilot. The travelled distance of the trajectory we used in the experiment was about 100 m, and the altitude was set to 20 m. The quadrotor was flown in a straight line and in a periodic motion back and forth to the same starting point. Due to disturbances caused by manual flight and strong winds, the altitude of the quadrotor was not kept constant during the straight flight, and the periodic motion was not smooth.
4.1. Dataset
A total of min of recording data consisting of GNSS and IMU measurements was collected during 15 trajectories.
Those include 14 trajectories with periodic motion and 1 straight line trajectory. The former are used for comparison to the classical pure inertial navigation solution.
Focusing on the periodic motion trajectories, 12 trajectories with a total flight time of min were used for training and 2 trajectories with a total flight time of 53 s were used for testing. For both train and test datasets, the accelerometer and gyroscope readings were used as input. As two sets of IMU are used, the train dataset contains 67,800 samples over 6 inertial sensors, i.e., a total of 406,800 reading samples.
Training a neural network requires a large amount of data, which means a large amount of recordings is needed. The number of recorded samples we have is considered small; therefore, data augmentation was applied to achieve more examples using the same recorded trajectories.
The number of total samples we have in our dataset is calculated as:
The number of samples inserted as input to the neural network is of our choosing and is called the window size. The number of examples we can create for our dataset depends on the number of samples, window size, and the stride we choose, as described in:
where stride is the number of sample shifts over the input matrix. For example, when the stride is equal to one, then we move the window one sample at a time.
Since our sampling rate is 25 Hz, by choosing a window size of 25 samples and a stride of 25, we obtain a number of examples that is equal to the recorded time of the trajectory. However, by choosing a stride value lower than 25 samples, we create a bigger number of examples while simultaneously creating an overlap between the data that is described as:
For example, if we look at a range of [1–50], by using a stride of 25, we obtain 2 examples in the ranges of [1–25] and [26–50]. However, by using a stride of five, we obtain six examples in the ranges of [1–25], [5–30], [10–35], and up until [25–50].
For our analysis, we choose a stride of 5; hence, we have an overlap of 20 samples, and by that we obtain around 4200 examples, i.e., the size of our training dataset is . For the test dataset, a stride of 25 samples is chosen to avoid overlap. In the test dataset, the time duration of the first trajectory is 23 s, and the second trajectory duration is 30 s, resulting in dimensions of and .
For the training process, a batch size of 32, a constant learning rate of , and the ADAM optimizer were used. The QuadNet networks were trained on the Google Colab GPU.
5. Analysis and Results
5.1. Performance Measure
The root mean square error (RMSE) metric and the distance error at the end of the trajectory were chosen as the performance measure, where the RMSE was used for both the distance and height.
The
RMSE is defined by:
where
N is the number of samples,
is the GT distance/height observed at time
i, and
is the estimated position/height observed at time
i.
5.2. Baseline Architecture Evaluation: Distance
Equations (
3)–(
5) were used to calculate the pure inertial navigation solution for the straight line trajectory. The test dataset containing the periodic motion trajectories was used to evaluate the QDR and QuadNet approaches. Each of those trajectories was recorded in a back-to-forth manner; that is, flying from a start point to the end point and back to the start point as shown in
Figure 7. Note that the periodic motion is not repeatable between peaks due to pilot flight errors and winds that were present during the flights. The quadrotor height was obtained using RTK-GPS measurements.
Prior to using the QDR approach, the approach gain is calculated. To that end, the train dataset was employed to obtain a QDR gain of
using (
9). This gain is used by applying the QDR position calculations (
7)–(
8).
Using the same train dataset (containing the two IMUs readings), both QuadNet architectures (
Section 3.2) were trained. The evaluation of QDR and QuadNet was made on the two trajectories in the test dataset, where for each trajectory two different IMUs were examined.
For all three approaches—INS, QDR, and QuadNet—the heading angle is obtained from (
5). The same procedure was repeated for each of the two IMUs.
Note that the INS approach and QuadNet frameworks are used to estimate both the change in distance and height of the quadrotor, while the QDR approach is used for distance estimation only.
Table 1 presents the results obtained on the test dataset using the three approaches: INS, QDR, and QuadNet, for distance estimation using the first IMU.
Both QuadNet architectures outperformed the INS and QDR approaches for the position at the end of the trajectory and RMSE. INS results for the straight line trajectory showed an error of times the traveled distance, with an RMSE of m. In the QDR approach, the error at the end of the trajectory was less than with an RMSE less than 5 m. In QuadNet2, the error was less than and RMSE of m, reducing more than half of the QDR’s RMSE. QuadNet1 achieved the best results as its error was less than and RMSE less than m for both trajectories.
The QuadNet frameworks also maintain better tracking of the actual trajectory, as the QDR approach updates the distance only between two peaks, while in QuadNet, the distance update occurs every 1 s (25 IMU samples). To better illustrate that,
Figure 8 and
Figure 9 present the distance error throughout the two test trajectories.
The pure inertial INS trajectories rapidly diverge. The QDR approach provides an estimation of the change in distance approximately every 7 s.
The same procedure was repeated for IMU
, and the results are given in
Table 2.
Once again, both QuadNet architectures show better results. INS results for the straight line trajectory showed an error of times the traveled distance, with an RMSE of m. In the QDR approach, the error at the end of the trajectory was less than with an RMSE less than 10 m. In QuadNet2, the error was less than , and the RMSE was less than 4 m. The QuadNet1 error was less than and the RMSE was less than 3 m for both trajectories.
5.3. Baseline Architecture Evaluation: Height
The QDR approach follows PDR guidelines and therefore allows only the estimation of the change in distance. Thus, the QDR approach is not considered in this section.
Following the same procedure as in the previous section,
Table 3 shows the height RMSE of the traditional INS approach and both QuadNet architectures using IMU
readings.
It is clearly seen that QuadNet approaches outperformed the traditional INS. QuadNet2 showed an RMSE of less than m, while the INS’s RMSE was more than m. As in the distance estimation, QuadNet1 achieves the superior results with an RMSE less than 1 m, improving QuadNet2 by in the first trajectory and by in the second trajectory.
The same procedure was repeated for IMU
, and the results are given in
Table 4.
The same behavior is seen in the second trajectory. Both QuadNet architectures greatly improved the INS performance, and QuadNet1 obtained the best result. This time, the amount of improvement relative to QuadNet2 was and for trajectories 1 and 2, respectively.
Figure 10 and
Figure 11 show the INS and both deep learning frameworks calculated height error throughout the two trajectories.
As observed in the figures, the INS solution quickly diverges while both QuadNet architectures achieve a bounded error. Along the entire trajectories, a height error of less than 4 m was obtained for QuadNet2 and less than m for QuadNet1. That is, QuadNet2 achieves an error that is more than twice the error of QuadNet1. In addition, it is easily seen that the trajectory tracking ability, as well as the overall accuracy of the proposed approach, is much better compared to the INS in situations of pure inertial navigation.
5.4. Influence of Window Size
In both distance and height evaluations, QuadNet1 obtained the best performance using the baseline network parameters defined in
Section 4.1. To seek better performance, the influence of window size was examined using the QuadNet1 architecture. The baseline window size is 1s, corresponding to 25 samples as the IMUs operate at 25 Hz. For our analysis, window sizes of 15 and 50 samples are also examined.
Table 5 shows the distance and height RMSE for the three window sizes using QuadNet1 architecture and IMU
measurements.
The 25 sample window size showed the best results for both distance and height RMSE while the 15-sample window size obtained the worst performance. This is attributed to the fact that a 15-sample window size is too small for the network to learn the data given the quadrotor dynamics. Moreover, by using a 50-sample window size, the distance and height RMSE of trajectory
worsen by
. Yet, regardless of the window size, the INS errors (distance RMSE of 143 m, as shown in
Table 1, and height RMSE of 20.7 m, as shown in
Table 5) greatly improve. Thus, the window size hyper-parameter should be tuned as a function of the IMU measurement rate and the quadrotor dynamics to minimize the RMSE.
5.5. Influence of Input Size
Until now, the input of the QuadNet networks has a dimension of six and includes the three gyroscope and three accelerometer measurements. Here, we examine two more possibilities for the input dimension:
- 1.
Six (baseline): Three gyroscope and three accelerometer measurements; i.e., the specific force and angular velocity vectors.
- 2.
Three: Three accelerometer measurements; that is, only the specific force vector.
- 3.
Two: The specific force and angular velocity magnitudes.
Table 6 shows the distance and height RMSE for the QuadNet1 architecture using IMU
measurements.
In general, it is expected that by inserting wider data to the network it will achieve better performance, particularly because both accelerometers and gyroscopes are used to determine the position of the quadrotor using INS equations. Therefore, it is not surprising that when the input includes both accelerometer and gyroscopes measurements, QuadNet obtains the best performance.
Nevertheless, all input types greatly improve the pure inertial INS solution (143 m, as shown in
Table 1). As a result, with a cost of reduced performance compared to the baseline input, only accelerometers can be used to greatly reduce the INS solution. This could be very beneficial when strict weight or power constraints are met, such as in micro-drones.
6. Conclusions
QuadNet, a hybrid framework for quadrotor dead reckoning using only inertial sensors was proposed to estimate the quadrotor’s position vector. To that end, the quadrotor is required to be flown in a periodic motion trajectory. QuadNet was destined to improve the QDR approach and improve the navigation accuracy in situations of pure inertial navigation. Specifically, QDR depends on a precalibrated gain, provides only peak-to-peak distance estimation, and does not estimate the quadrotor altitude. The proposed QuadNet framework does not require any gain and provides the change in distance and altitude of the quadrotor in any required time duration.
Two QuadNet architectures—QuadNet1 and QuadNet2—were examined for the regression task. The first comprises CNN layers, while the second is a combination of CNN layers and LSTM. A comprehensive comparison between the two frameworks, the QDR approach, and traditional INS on the same dataset was conducted. The dataset was collected using DJI Matrice 300 RTK and includes two sets of inertial sensors and RTK GNSS GT solution.
Among all approaches, the QuadNet1 architecture (CNN only) achieved superior results with an average error less than of the traveled distance, a distance RMSE of m, and an altitude RMSE less than 1 m. Compared to QuadNet2, QuadNet1’s performance is better by times in the distance RMSE and times in altitude RMSE.
We also showed that using only accelerometer readings as the input to the NN greatly improves the standalone INS solution. As a consequence, for micro-drones or when limitations on the power/cost/computational load exist, QuadNet can be applied using only accelerometers.
As for QDR, QuadNet requires to be flown in a periodic motion trajectory, and so a trade-off is considered between reduced energy consumption in straight line trajectories to a more accurate navigation solution in periodic motion trajectories. Considering that short time periods are most common situations for pure inertial navigation, in most scenarios the added energy consumption can be ignored.