2.2. Layout of Experimental Environment and Acquisition of Data
The aerial images of the test field, located in the west of the campus, were collected on the morning of December 13, 2018; the weather was clear and the wind speed was low on that day. The layout of the experimental environment is shown in Figure 2, in which each aluminum block measures 20 cm × 20 cm × 5 cm and the umbrella-shaped surface of each navigation landmark is 3 mm thick with a radius of 6 cm.
After the experimental environment was set up, the longitude and latitude coordinates of the center points of landmarks 0–26 were measured with the C94-M8P module of the RTK satellite positioning system. The measuring time at each point was 10 s, with five readings obtained per second; finally, the 50 measured values at each point were averaged.
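As an illustration, the averaging step can be expressed in a few lines of Python; this is a minimal sketch, assuming the 50 RTK fixes for one landmark are available as (latitude, longitude) pairs (the array contents below are hypothetical placeholders, not measured data):

```python
import numpy as np

# Hypothetical example: 50 RTK fixes (10 s at 5 Hz) for one landmark,
# each a (latitude, longitude) pair in decimal degrees.
fixes = np.array([
    [30.123456, 114.654321],   # placeholder values, not measured data
    # ... 49 more readings
])

# The landmark coordinate is taken as the arithmetic mean of the fixes.
lat, lon = fixes.mean(axis=0)
print(f"landmark center: {lat:.8f}, {lon:.8f}")
```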
After the latitude and longitude coordinates were measured, RGB images of the experimental plot were taken by the UAV. The photographing interval of the UAV was set to 1 s, and 104 aerial photographs were acquired in the whole process. Image mosaic technology is needed to synthesize a complete image for the subsequent analysis; in this research, the mosaicking was performed with the software Agisoft PhotoScan 1.2.6. The RGB image obtained by stitching the series of original photographs was stored in TIF format with a size of 448 MB and a ground resolution of about 1 cm, and it can be read directly by the ArcGIS software for subsequent operations.
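The stitched GeoTIFF can also be inspected programmatically; the sketch below uses the rasterio library instead of ArcGIS (an illustrative assumption, with a hypothetical file name):

```python
import rasterio

# Hypothetical path to the stitched orthomosaic exported by PhotoScan.
with rasterio.open("mosaic.tif") as src:
    rgb = src.read()                    # bands x rows x cols array
    print("size:", src.width, "x", src.height)
    print("pixel size:", src.res)       # should be roughly 0.01 m
    print("coordinate system:", src.crs)
```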
2.3. Affine Transformation Algorithm
Assume that $XOY$ is a Cartesian coordinate system, that $xo'y$ is a physical coordinate system, that the intersection angle between the two coordinate systems is $\alpha$, and that the transverse and longitudinal translation distances between the origins $o'$ and $O$ are $\Delta x$ and $\Delta y$, respectively. The scales of the physical coordinate system (i.e., the scales of the pictures taken in this study) are $m_x$ and $m_y$. According to the principles of graphics, the coordinate transformation formulas are as follows:

$$X = \Delta x + m_x x\cos\alpha - m_y y\sin\alpha = a_0 + a_1 x + a_2 y \tag{1}$$

$$Y = \Delta y + m_x x\sin\alpha + m_y y\cos\alpha = b_0 + b_1 x + b_2 y \tag{2}$$

In the formulas, $a_0 = \Delta x$, $a_1 = m_x\cos\alpha$, $a_2 = -m_y\sin\alpha$, $b_0 = \Delta y$, $b_1 = m_x\sin\alpha$, and $b_2 = m_y\cos\alpha$.
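For example, Equations (1) and (2) can be applied directly once the parameters are known; the values below are hypothetical and chosen only to show the computation:

```python
import numpy as np

# Hypothetical parameters: rotation angle, scales, and translation.
alpha = np.deg2rad(15.0)        # intersection angle between the two systems
mx, my = 0.01, 0.01             # scales of the physical coordinate system
dx, dy = 500.0, 300.0           # translation of o' relative to O

# Coefficients as defined under Equations (1) and (2).
a0, a1, a2 = dx, mx * np.cos(alpha), -my * np.sin(alpha)
b0, b1, b2 = dy, mx * np.sin(alpha),  my * np.cos(alpha)

# Transform a pixel coordinate (x, y) into the XOY system.
x, y = 1024.0, 768.0
X = a0 + a1 * x + a2 * y
Y = b0 + b1 * x + b2 * y
print(X, Y)
```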
Assume that $v_x$ and $v_y$ denote the differences between the coordinates of the control points and the values computed with the transformation formulas, respectively:

$$v_{x,i} = a_0 + a_1 x_i + a_2 y_i - X_i \tag{3}$$

$$v_{y,i} = b_0 + b_1 x_i + b_2 y_i - Y_i \tag{4}$$

According to the principle of the least squares method, two sets of equations can be obtained by minimizing the sums of the squares of $v_x$ and $v_y$ over the $n$ control points, as follows:

$$\begin{bmatrix} n & \sum x_i & \sum y_i \\ \sum x_i & \sum x_i^2 & \sum x_i y_i \\ \sum y_i & \sum x_i y_i & \sum y_i^2 \end{bmatrix}\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} \sum X_i \\ \sum x_i X_i \\ \sum y_i X_i \end{bmatrix} \tag{5}$$

$$\begin{bmatrix} n & \sum x_i & \sum y_i \\ \sum x_i & \sum x_i^2 & \sum x_i y_i \\ \sum y_i & \sum x_i y_i & \sum y_i^2 \end{bmatrix}\begin{bmatrix} b_0 \\ b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} \sum Y_i \\ \sum x_i Y_i \\ \sum y_i Y_i \end{bmatrix} \tag{6}$$
The coefficients can be obtained by solving Equations (5) and (6), so the transformation of the whole image can be determined.
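A minimal sketch of this fit, assuming $n$ control points with pixel coordinates $(x_i, y_i)$ and measured coordinates $(X_i, Y_i)$ (the arrays below are hypothetical placeholders); minimizing the squared residuals of Equations (3) and (4) with ordinary least squares is equivalent to solving the normal Equations (5) and (6):

```python
import numpy as np

# Hypothetical control points: pixel coordinates and measured coordinates.
xy = np.array([[10., 20.], [200., 40.], [60., 300.], [250., 280.]])
XY = np.array([[500.3, 301.1], [502.1, 303.0], [500.9, 305.8], [502.8, 305.5]])

# Design matrix [1, x, y]; np.linalg.lstsq minimizes the squared residuals
# of Equations (3) and (4), i.e., it solves Equations (5) and (6).
A = np.column_stack([np.ones(len(xy)), xy])
(a0, a1, a2), *_ = np.linalg.lstsq(A, XY[:, 0], rcond=None)
(b0, b1, b2), *_ = np.linalg.lstsq(A, XY[:, 1], rcond=None)

print(a0, a1, a2, b0, b1, b2)
```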
2.4. Template Matching
(1) Standard correlation coefficient matching
The template matching algorithm can be realized through the function "matchTemplate" in OpenCV. According to the different matching measures, there are six commonly used methods: square difference matching, standard square difference matching, correlation matching, standard correlation matching, correlation coefficient matching, and standard correlation coefficient matching. From the simplest (square difference matching) to the most complex (standard correlation coefficient matching), more accurate matching results can be obtained at the cost of longer computation time. In order to obtain higher detection accuracy (according to the official OpenCV documentation), the standard correlation coefficient matching method was used in this research; a minimal usage sketch is given below.
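The sketch below shows standard correlation coefficient matching with OpenCV's cv2.TM_CCOEFF_NORMED; the file names are hypothetical:

```python
import cv2

# Hypothetical inputs: the stitched UAV image and a landmark template.
image = cv2.imread("uav_mosaic.png", cv2.IMREAD_GRAYSCALE)
template = cv2.imread("landmark_template.png", cv2.IMREAD_GRAYSCALE)

# Standard correlation coefficient matching; result values lie in [-1, 1].
result = cv2.matchTemplate(image, template, cv2.TM_CCOEFF_NORMED)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)

# max_loc is the top-left corner of the best-matching region.
h, w = template.shape
print("best match:", max_loc, "score:", max_val)
cv2.rectangle(image, max_loc, (max_loc[0] + w, max_loc[1] + h), 255, 2)
```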
The correlation coefficient is used to measure the similarity between two vectors. Assuming that the target template is a 5 × 5 image, it can be regarded as a 25-dimensional vector in which each dimension is the gray value of one pixel. Comparing this vector with each sub-region of the image and finding the sub-region with the largest standard correlation coefficient constitutes standard correlation coefficient matching, as shown in Equation (7):

$$\rho(u,v) = \frac{\sum_{x,y}\bigl[f(x,y) - \bar f\bigr]\bigl[t(x-u, y-v) - \bar t\bigr]}{\sqrt{\sum_{x,y}\bigl[f(x,y) - \bar f\bigr]^2 \sum_{x,y}\bigl[t(x-u, y-v) - \bar t\bigr]^2}} \tag{7}$$

where $f(x,y)$ is the image gray function, $(u,v)$ are the center pixel coordinates of the target window, $t(x,y)$ is the template gray function, and $(x,y)$ are the center pixel coordinates of the search window. With $\bar f = \frac{1}{n}\sum_{x,y} f(x,y)$ and $\bar t = \frac{1}{n}\sum_{x,y} t(x-u, y-v)$, where $n$ is the number of pixels in the window, we can get Equation (8):

$$\rho(u,v) = \frac{\sum_{x,y} f\,t - \frac{1}{n}\sum_{x,y} f \sum_{x,y} t}{\sqrt{\Bigl[\sum_{x,y} f^2 - \frac{1}{n}\bigl(\sum_{x,y} f\bigr)^2\Bigr]\Bigl[\sum_{x,y} t^2 - \frac{1}{n}\bigl(\sum_{x,y} t\bigr)^2\Bigr]}} \tag{8}$$

where $f$ and $t$ abbreviate $f(x,y)$ and $t(x-u, y-v)$, respectively.
Using the template as the search window, it is slid over the original image with a fixed step size (usually 1 pixel), and the correlation coefficient is computed at each position. The closer the result is to 1, the higher the similarity between the region and the template.
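To make Equation (8) concrete, the following sketch evaluates the standard correlation coefficient for a single window position (the inputs are synthetic; this mirrors what cv2.matchTemplate computes at each step of the sliding window):

```python
import numpy as np

def corr_coeff(window: np.ndarray, tpl: np.ndarray) -> float:
    """Standard correlation coefficient of Equation (8) for one window."""
    f = window.astype(np.float64).ravel()
    t = tpl.astype(np.float64).ravel()
    n = f.size
    num = (f * t).sum() - f.sum() * t.sum() / n
    den = np.sqrt((np.square(f).sum() - f.sum() ** 2 / n) *
                  (np.square(t).sum() - t.sum() ** 2 / n))
    return num / den

# Synthetic 5 x 5 template and a same-size window from the image.
rng = np.random.default_rng(0)
tpl = rng.integers(0, 256, (5, 5))
window = tpl + rng.integers(-5, 6, (5, 5))   # similar region: rho close to 1
print(corr_coeff(window, tpl))
```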
(2) Scale-Invariant Feature Transform (SIFT) descriptor matching
SIFT [18] features are invariant to rotation, scale, brightness, and so on. The method is very stable for feature extraction and is mainly composed of the following four steps (a brief usage sketch follows step d):
a. Extremum detection in Difference of Gaussian (DOG) scale space: first, the DOG scale space is constructed, in which Gaussian blurring with different parameters is used to represent the different scales in SIFT. The constructed scale space is used to detect the feature points that exist at different scales.
b. Deletion of unstable extremum points: two main types are removed, namely low-contrast extremum points and unstable edge response points.
c. Determination of the main direction of each feature point: the gradient magnitude of each pixel is calculated in the neighborhood centered on the feature point with a radius of 3 × 1.5σ, and the gradient magnitudes are then accumulated in a histogram. The horizontal axis of the histogram is the gradient direction, and the vertical axis is the accumulated gradient magnitude corresponding to each direction. The direction corresponding to the highest peak of the histogram is taken as the main direction of the feature point.
d. Generation of feature point descriptors: first, the coordinate axes are rotated to the main direction of the feature point; then the gradient magnitudes and directions of the pixels in the 16 × 16 window centered on the feature point are divided into 16 blocks of 4 × 4 pixels, and for each block a histogram over eight directions is computed from its pixels. In total, a 128-dimensional feature vector (16 × 8) is formed.
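A minimal sketch of extracting SIFT key points and their 128-dimensional descriptors with OpenCV (steps a–d are performed internally by the library; this assumes OpenCV ≥ 4.4, where SIFT is in the main module, and a hypothetical image file):

```python
import cv2

# Hypothetical input image.
img = cv2.imread("obstacle.png", cv2.IMREAD_GRAYSCALE)

# SIFT detector/descriptor; steps a-d run inside detectAndCompute.
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

print(len(keypoints), "key points")
print(descriptors.shape)   # (number of key points, 128)
```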
After obtaining the key points of the two images, the feature points can be matched by calculating the distances between their descriptors. The number and average distance of the top 10 matched key points were used in the matching method.
The resulting score represents the degree of matching between the obstacle image and the UAV image; with it, the best-matched area of the UAV image, i.e., the area containing an object similar to the obstacle image, can be located (a sketch of this scoring is given below).
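The following sketch illustrates this scoring, assuming SIFT descriptors from the obstacle image and a candidate UAV image region; the brute-force L2 matcher and the exact score definition are illustrative assumptions, and the file names are hypothetical:

```python
import cv2
import numpy as np

# Hypothetical inputs: the obstacle image and a candidate UAV image region.
img1 = cv2.imread("obstacle.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("uav_region.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Brute-force matching with Euclidean distance between descriptors.
matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

# Score from the top 10 matches: more matches and a smaller average
# distance indicate a better-matched region (illustrative definition).
top = matches[:10]
avg_dist = np.mean([m.distance for m in top])
print(len(matches), "matches; top-10 average distance:", avg_dist)
```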