1. Introduction
Autonomous robots can take over repetitive, high-risk, or heavy-duty tasks from humans, improving living standards and the efficiency of social production, and they play an important role in promoting social development. The navigation system is one of the core components that ensures a robot can complete its tasks smoothly and reliably [1]. At present, the most commonly used navigation system is inertial navigation, which offers strong autonomy and anti-interference ability and provides complete navigation parameters; however, inertial navigation is expensive, and its error diverges rapidly over time [2]. The output frequency of satellite navigation is low, usually 10 Hz, whereas a system with a control frequency of 100–200 Hz ideally requires an output frequency of 20–40 Hz; otherwise, the control performance of the system is degraded. Moreover, if the satellite antenna is blocked, the signal is interrupted [3]. In addition, simultaneous localization and mapping (SLAM) based on visual odometry (VO) [4,5] or visual–inertial odometry (VIO) [6,7] is an important way to achieve autonomous navigation of unmanned platforms, but common SLAM methods suffer from a heavy computational burden and low robustness in complex, large-scale scenes [8]. Therefore, exploring an autonomous and reliable navigation method suitable for unmanned platforms in large scenes has become an urgent problem.
Researchers have found that many creatures in nature have superior navigation skills. Although these organisms possess neither chips with powerful computing capability nor high-precision navigation sensors, they complete migrations of hundreds or even thousands of kilometers through a variety of complex environments [9,10]. For example, monarch butterflies in North America migrate more than 4000 km each year from Canada to Mexico [11], and the hawksbill sea turtle (Eretmochelys imbricata) migrates more than 2000 km from its feeding grounds to its breeding grounds [12]. These navigational abilities have led researchers to study the mechanisms of biological navigation.
Mann and colleagues found that the navigation method of homing pigeons differs from traditional navigation methods [13] and has two characteristics: it is a node-based navigation method with strong topology, and absolute orientation information is essential for the pigeon. Henrik Mouritsen reviewed the orientation and navigation mechanisms of long-distance migratory animals and proposed that their long-distance navigation consists of three stages, namely the long-distance phase, the narrowing-in or homing phase, and the pinpointing-the-goal phase [14]. Throughout these three stages, the absolute heading is always the most important information, and navigation is completed by comprehensively using various perceptual cues. If the navigation mechanism of migratory animals were applied to the autonomous navigation of unmanned platforms, their performance could be improved. Neuroscientists have carried out extensive research on the navigation mechanisms of many organisms, including mammals such as rats and non-mammals such as fish [15]. Among them, the navigation mechanism of rats is similar to that of most mammals, including humans; therefore, we can draw on rat-brain navigation models.
O’Keefe and the Mosers, winners of the 2014 Nobel Prize in Physiology or Medicine, discovered navigation-related cells in the rodent brain, including place cells [16], head direction cells [17], grid cells [18], speed cells [19], and boundary cells [20]. Grid cells show strong topological properties in their firing activity [21], and it has been shown that, similar to pigeons, rodents also use topological maps during navigation.
Based on the characteristics of spatial navigation cells, researchers have proposed different navigation cell models and methods for constructing spatial topological maps. Arleo proposed a navigation model based on head direction cells and place cells, which realizes topological map construction, localization, and loop-closure detection for robots in small environments [22]. Gaussier imitated the mechanism of place cells in rats by linking visual information about the environment with location information to represent the spatial environment, finally creating a topological map of the environment by establishing connectivity between the place cells [23]. Ramirez proposed a place cell navigation model based on the neural mechanism of the hippocampus during rat navigation and applied it to the control of mobile robots, but it is only suitable for environments with fixed structures [24]. Inspired by place cells, head direction cells, and grid cells, Erdem used an oscillatory interference model to create grid cells and proposed a bionic navigation model based on a forward linear predictive trajectory detection method [25]. Tejera proposed a biomimetic localization model based on grid cells and place cells, which provides long-term environmental localization through place cells [26]. Based on the firing characteristics of rat navigation cells, Cong and colleagues proposed a mathematical model based on episodic memory to model the environment [27]. In recent years, researchers have also proposed navigation models based on different neural network architectures [28,29,30]. Schneider proposed a biologically inspired cognitive architecture that uses local navigation maps as its main data element: each local map is matched with the closest maps, from which the navigation map is assembled, and biological feedback pathways support exploration and the generation of cause-and-effect behavior [31]. Milford proposed the two-dimensional bionic navigation model RatSLAM [32], which has many advantages: it requires little computation and storage, can build large-scale environmental maps, and the whole system is lightweight and low-cost. Yu proposed NeuroSLAM to build three-dimensional topological maps of the environment [33]. This model greatly improves on RatSLAM and, to a certain extent, extends the rat-brain navigation model toward a migratory-animal navigation model, which is better suited to the navigation of unmanned platforms.
However, unlike the navigation mechanisms of migratory animals, NeuroSLAM does not use the most important cue: absolute heading information. The main reason rats do not need an absolute heading is that their foraging trips from and back to the nest cover only a few kilometers, whereas the journeys of migratory animals cover hundreds or even tens of thousands of kilometers [12]; rats can therefore complete their navigation tasks using relative heading. If the absolute heading is introduced into brain-inspired SLAM, the robustness of the system in large-scale environments and the topological consistency of the resulting map can both be improved. However, this requires an absolute heading source that is autonomous, reliable, and strongly resistant to interference.
Other creatures in nature provide a solution for stably obtaining absolute heading information. Researchers found that desert ants have a unique compound eye structure that enables them to perceive the polarization distribution of the sky and thereby obtain absolute heading information [34]. Based on the compound eye structure of desert ants, researchers have developed various types of polarized light sensors [35,36,37,38]. In recent years, a variety of orientation, attitude, and positioning methods based on polarized sky-light sensors have been proposed [39,40,41,42]. In terms of practical application, Lambrinos designed a polarization-sensitive unit and applied it to the navigation control of ground robots [35], and the experiments demonstrated the feasibility of bionic polarized sky-light navigation. Following the polarization-sensitive mechanism of insects, Chu analyzed a polarization-sensitive angle-measurement model, designed a six-channel photoelectric polarized sky-light sensor, and applied it to the navigation control of ground mobile robots [36]. Zhi proposed an attitude measurement method based on an inertial/GPS/polarized sky-light sensor combination and conducted a flight experiment [39]. Dupeyroux designed a polarized sky-light sensor operating in the ultraviolet band and applied it to the autonomous navigation of a hexapod robot [40], and in 2022, He proposed that insect-inspired artificial intelligence is an important direction for the development of small autonomous robots [43].
This paper proposes a brain-inspired navigation model based on absolute heading for the autonomous navigation of unmanned platforms in large scenes. Inspired by the three-stage navigation of migratory animals, the proposed model combines the desert ant's strategy for acquiring absolute heading with a brain-inspired SLAM system, yielding a method closer to the navigation mechanism of migratory animals. The proposed model retains the three components of NeuroSLAM, but each component incorporates absolute heading information. First, a brain-inspired grid cell network model and a head direction cell network model with absolute heading are constructed based on continuous attractor networks, so that the position and heading of the model are decoupled. Then, an absolute heading-based environmental visual template is constructed using the scanline intensity profile, and the path-integration error is corrected using the visual template. Finally, a topological cognitive node is constructed from the grid cells, the head direction cells, the environmental visual template, the absolute heading, and the position; numerous such nodes form the absolute heading-based topological map. The experimental results show that, compared with NeuroSLAM, the proposed method has higher visual template recognition accuracy and faster recognition speed, and the constructed topological map has higher mapping accuracy and topological consistency.
The rest of this paper is organized as follows. Section 2 describes the principle of bionic polarized sky-light navigation. Section 3 proposes the brain-inspired navigation model based on absolute heading. Section 4 verifies the performance of the proposed navigation model through outdoor experiments. Finally, Section 5 concludes the paper.
2. Principle of Bionic Polarized Sky-light Navigation
The natural light emitted by the Sun toward the Earth is unpolarized, but as it passes through the Earth's atmosphere, it is scattered and absorbed by air molecules and aerosol particles, producing a polarization phenomenon; the entire sky thus exhibits a regular and stable skylight polarization distribution [44]. In clear, cloudless weather, the skylight polarization distribution can be described by Rayleigh scattering theory [44], as shown in Figure 1. Similar to the gravitational and geomagnetic fields, the skylight polarization distribution is a global field and can therefore be used for navigation.
In the skylight polarization distribution, the direction of maximum polarization corresponding to a point in the sky is called the E-vector, as shown in Figure 2. According to Rayleigh scattering theory, the E-vector at an observed point is perpendicular to the plane formed by the Sun, the observer, and the observed point; the E-vector can therefore be converted into an absolute heading angle.
A polarized sky-light sensor developed on the basis of the insect compound eye structure can measure the E-vector at such points. In this paper, a lens-type single-point polarized sky-light sensor independently developed by our group is adopted, as shown in Figure 3; its structural design and principle can be found in [37]. The sensor achieves a dynamic outdoor accuracy of 0.5° under a clear sky.
In this paper, the attitude angles are defined as roll ($\gamma$), pitch ($\theta$), and yaw ($\psi$); the output of the polarized sky-light sensor, i.e., the measured polarization azimuth, is defined as $\varphi_p$; the navigation frame (n-frame) is defined as north-east-down, and the body frame (b-frame) as front-right-down. The yaw angle is the absolute heading angle, which is crucial for carrier navigation. The absolute heading can then be calculated as follows.
As shown in Figure 2, the E-vector in the b-frame can be represented by
$$\mathbf{e}^b = \left[\cos\varphi_p,\ -\sin\varphi_p,\ 0\right]^T,$$
where the sign of the second component accounts for the sensor measuring the angle about its upward-pointing optical axis while the b-frame z-axis points down.
The E-vector in the b-frame can be converted to the n-frame by the direction cosine matrix $C_b^n$:
$$\mathbf{e}^n = C_b^n\,\mathbf{e}^b.$$
Expanding the direction cosine matrix in terms of the attitude angles gives
$$C_b^n = \begin{bmatrix} \cos\theta\cos\psi & \sin\gamma\sin\theta\cos\psi-\cos\gamma\sin\psi & \cos\gamma\sin\theta\cos\psi+\sin\gamma\sin\psi \\ \cos\theta\sin\psi & \sin\gamma\sin\theta\sin\psi+\cos\gamma\cos\psi & \cos\gamma\sin\theta\sin\psi-\sin\gamma\cos\psi \\ -\sin\theta & \sin\gamma\cos\theta & \cos\gamma\cos\theta \end{bmatrix}.$$
The solar vector in the n-frame can be expressed as
$$\mathbf{s}^n = \left[\cos h_s\cos A_s,\ \cos h_s\sin A_s,\ -\sin h_s\right]^T,$$
where $h_s$ and $A_s$ represent the solar altitude and solar azimuth, respectively. Then, according to Rayleigh scattering theory, the E-vector is perpendicular to the solar vector:
$$\left(\mathbf{s}^n\right)^T\mathbf{e}^n = 0.$$
Combining the above formulas, the absolute heading $\psi$ can be deduced from the measured polarization azimuth, the horizontal attitude angles, and the solar position. During the experiments, the polarized sky-light sensor faces the zenith. If the horizontal attitude angles of the carrier are small, the observed point can be taken as the zenith, and the absolute heading simplifies to
$$\psi = A_s + \varphi_p \pm 90^{\circ} \pmod{360^{\circ}}.$$
An actual example illustrates the process. Suppose the current roll and pitch of the carrier are both 0°, the local solar azimuth is 120°, and the output of the polarized sky-light sensor is 50°. Substituting these values into Equations (6) and (7) yields a current absolute heading of 260° or 80°; substituting them into Equation (9) also yields 260° or 80°. This ambiguity in the absolute heading can be resolved using the heading value of the previous moment.
If the carrier tilts significantly, the polarized sky-light sensor can no longer be assumed to face the zenith, and the absolute heading must be obtained by taking the attitude angles into account.
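To make the zenith-facing case concrete, the following sketch (in Python; the function names and the previous-heading value are illustrative, not from the paper) computes the two candidate headings $\psi = A_s + \varphi_p \pm 90^{\circ}$ and resolves the ambiguity with the heading of the previous moment, reproducing the 260°/80° example above.

```python
import numpy as np

def candidate_headings(solar_azimuth_deg, sensor_output_deg):
    """Two candidate absolute headings for a zenith-facing sensor (simplified case)."""
    base = solar_azimuth_deg + sensor_output_deg
    return np.mod(base + 90.0, 360.0), np.mod(base - 90.0, 360.0)

def resolve_heading(solar_azimuth_deg, sensor_output_deg, previous_heading_deg):
    """Pick the candidate closest to the heading of the previous moment."""
    cands = candidate_headings(solar_azimuth_deg, sensor_output_deg)
    diffs = [abs((c - previous_heading_deg + 180.0) % 360.0 - 180.0) for c in cands]
    return cands[int(np.argmin(diffs))]

# Worked example from the text: solar azimuth 120 deg, sensor output 50 deg
print(candidate_headings(120.0, 50.0))      # (260.0, 80.0)
print(resolve_heading(120.0, 50.0, 255.0))  # 260.0, assuming a previous heading near 255 deg
```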
3. Brain-Inspired Navigation Model Based on Absolute Heading
The overall framework of the brain-inspired navigation model based on absolute heading is shown in Figure 4. It consists of three main parts: construction of the grid cell and head direction cell models, construction of the environmental visual template, and construction of the environmental topological map. The system uses two types of sensors, both autonomous and lightweight: the polarized sky-light sensor obtains the absolute heading, and the binocular camera obtains the environmental visual images. First, the system builds a grid cell network model and a head direction cell network model with absolute heading based on continuous attractor networks and uses them to perform path integration from the estimated self-motion of the carrier. Then, an absolute heading-based environmental visual template is constructed using the scanline intensity profile, and the path-integration error is corrected using the visual template. Finally, a brain-inspired topological map based on absolute heading is constructed from the grid cell and head direction cell models, the environmental visual template, the absolute heading, and the position of the carrier.
The continuous attractor network (CAN) is a type of neural network consisting of an array of units with fixed-weight excitatory and inhibitory connections. Unlike most neural networks, it operates by updating the activity of each unit rather than by changing the values of the weighted connections. As the carrier moves through an environment, the activation value of each unit in the CAN varies between 0 and 1 [32].
The pseudocode of the model is shown in Table 1. The three components of the system are described in detail below.
3.1. Grid Cell and Head Direction Cell Model Based on a Continuous Attractor Network
The grid cells and head direction cells form the path integrator of the navigation model and are an important part of the whole system. They perform path integration by integrating motion information and environmental cues to complete the rat's navigation task. Referring to NeuroSLAM [33], we constructed a three-dimensional grid cell network based on a continuous attractor network to represent the three-dimensional position $(x, y, z)$ of the carrier. We used the absolute heading obtained by the polarized sky-light sensor to construct a two-dimensional head direction cell network to represent the current heading of the carrier. The construction of the head direction cells is described below.
The head direction cells were constructed using a two-dimensional continuous attractor network, denoted $H_{\psi,h}$, where $\psi$ is the absolute heading angle and $h$ is the current vertical height of the carrier; the head direction cells thus represent the absolute heading of the carrier at different heights. The update of head direction cell activation consists of two parts: the dynamics of the attractor and the integration of the heading angle and the height. In addition, loop-closure detection also updates the activation.
The attractor network dynamics include local excitatory connections from activated cells to surrounding neurons and global inhibitory connections to all cells. These two types of connection cause the head direction cell network to converge gradually to a steady state, producing a main cluster of neurons with high activation, called an activity packet, whose center represents the absolute heading estimated by the head direction cells. First, the local excitation weight matrix $\varepsilon_{m,n}$ is constructed as
$$\varepsilon_{m,n} = \exp\!\left(-\frac{m^2}{2\sigma_\psi^2}\right)\exp\!\left(-\frac{n^2}{2\sigma_h^2}\right),$$
where $\sigma_\psi^2$ and $\sigma_h^2$ are the variances of the two-dimensional Gaussian distribution, and $m$ and $n$ are the distribution coefficients, obtained by
$$m = \min\left(\left|\psi_i-\psi_j\right|,\ n_\psi-\left|\psi_i-\psi_j\right|\right),\qquad n = \min\left(\left|h_i-h_j\right|,\ n_h-\left|h_i-h_j\right|\right),$$
where $n_\psi$ and $n_h$ are the dimensions of the two axes of the head direction cell model. The local excitatory connections then produce the activation change
$$\Delta H_{\psi,h} = \sum_{i=0}^{n_\psi-1}\sum_{j=0}^{n_h-1} H_{i,j}\,\varepsilon_{\psi-i,\,h-j}.$$
To limit the continuous increase in cell activation, global inhibitory connections are established among all cells. With $\phi$ denoting the global inhibition constant, the final change in cell activation produced by the internal attractor dynamics is
$$\Delta H_{\psi,h} = \sum_{i=0}^{n_\psi-1}\sum_{j=0}^{n_h-1} H_{i,j}\,\varepsilon_{\psi-i,\,h-j} - \phi.$$
To ensure that the activation of the head direction cells is non-negative,
$$H_{\psi,h} = \max\left(H_{\psi,h} + \Delta H_{\psi,h},\ 0\right).$$
Finally, to keep the total cell activation stable, the activations are normalized:
$$H_{\psi,h} = \frac{H_{\psi,h}}{\displaystyle\sum_{i=0}^{n_\psi-1}\sum_{j=0}^{n_h-1} H_{i,j}}.$$
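The following sketch illustrates the attractor dynamics described above on a small head direction cell array, assuming a wrap-around two-dimensional Gaussian excitation weight matrix, a constant global inhibition, clamping to non-negative values, and normalization; the array size, parameter values, and function names are illustrative assumptions.

```python
import numpy as np

def excitation_weights(n_psi, n_h, sigma_psi, sigma_h):
    """Wrap-around 2D Gaussian excitatory weight matrix (assumed form)."""
    dm = np.minimum(np.arange(n_psi), n_psi - np.arange(n_psi))  # circular distance, heading axis
    dn = np.minimum(np.arange(n_h), n_h - np.arange(n_h))        # circular distance, height axis
    return np.exp(-dm[:, None] ** 2 / (2 * sigma_psi ** 2)) * \
           np.exp(-dn[None, :] ** 2 / (2 * sigma_h ** 2))

def attractor_update(H, weights, inhibition=0.002):
    """One step of excitation, global inhibition, clamping, and normalization."""
    excited = np.zeros_like(H)
    n_psi, n_h = H.shape
    for i in range(n_psi):
        for j in range(n_h):
            if H[i, j] > 0:
                # Each active cell spreads activity to its neighbours (peak rolled to (i, j)).
                excited += H[i, j] * np.roll(np.roll(weights, i, axis=0), j, axis=1)
    H = H + excited - inhibition   # global inhibition
    H = np.maximum(H, 0.0)         # keep activations non-negative
    return H / H.sum()             # normalize total activation

# Example: a 36 x 6 head direction cell array with one active packet
H = np.zeros((36, 6))
H[10, 2] = 1.0
for _ in range(5):
    H = attractor_update(H, excitation_weights(36, 6, 2.0, 1.5))
print(np.unravel_index(np.argmax(H), H.shape))   # packet stays centred near (10, 2)
```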
Changes in the heading angle and the height also alter the head direction cell activation, causing the activity packet to move across the two-dimensional continuous attractor network. The polarized sky-light sensor is used to obtain the absolute heading $\psi$, and the resulting change in head direction cell activity is
$$H_{\psi,h}^{t+1} = \sum_{i=0}^{1}\sum_{j=0}^{1}\alpha_{i,j}\,H_{\psi+\delta_{\psi 0}+i,\ h+\delta_{h0}+j}^{t},$$
where $\delta_{\psi 0}$ and $\delta_{h0}$ represent the integer parts of the changes in the heading angle and the height, and $\alpha_{i,j}$ represents the residual parameter, which is calculated as
$$\alpha_{i,j} = f\!\left(\delta_{\psi f},\,i\right)\,f\!\left(\delta_{hf},\,j\right),\qquad \delta_h = k_h\,v_h,$$
where $v_h$ represents the velocity in the height direction, $k_h$ represents the velocity coefficient in the height direction, and $\delta_{\psi f}$ and $\delta_{hf}$ represent the fractional parts of the changes in the heading angle and the height. The function $f$ can be obtained by
$$f(\delta,\,0) = 1-\delta,\qquad f(\delta,\,1) = \delta.$$
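The sketch below illustrates the assumed shift of the activity packet by the integer parts of the heading and height changes, with the fractional parts distributed bilinearly over neighboring cells (a NeuroSLAM-style residual split); the cell-unit conversion and all names are assumptions for illustration.

```python
import numpy as np

def path_integrate(H, d_psi_cells, d_h_cells):
    """Shift the activity packet by a (possibly fractional) number of cells.

    d_psi_cells: heading change in cell units (from the polarized sky-light sensor);
    d_h_cells:   k_h * v_h expressed in cell units.
    """
    i0, j0 = int(np.floor(d_psi_cells)), int(np.floor(d_h_cells))   # integer parts
    fi, fj = d_psi_cells - i0, d_h_cells - j0                        # fractional parts
    H_new = np.zeros_like(H)
    # Bilinear split of the shift between the four neighbouring integer offsets.
    for di, wi in ((0, 1.0 - fi), (1, fi)):
        for dj, wj in ((0, 1.0 - fj), (1, fj)):
            H_new += wi * wj * np.roll(np.roll(H, i0 + di, axis=0), j0 + dj, axis=1)
    return H_new

# Example: shift a packet by 2.3 cells in heading and 0.5 cells in height
H = np.zeros((36, 6))
H[10, 2] = 1.0
H = path_integrate(H, 2.3, 0.5)
print(np.unravel_index(np.argmax(H), H.shape))   # packet centre moves toward (12, 2)-(13, 3)
```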
3.2. Vision Template Based on Absolute Heading
During spatial exploration, rats process environmental visual information through the visual nervous system to form visual templates corresponding to geographic locations, thereby memorizing their exploration path. With the help of the visual templates, a rat can perceive its current position and judge whether it has reached a previously experienced scene.
In this paper, a scanline intensity profile method based on absolute heading is proposed to construct the visual template. The binocular camera outputs the environmental visual images, and the scanline intensity profile method is used to process the visual information, with reference to RatSLAM [32]; the absolute heading obtained by the polarized sky-light sensor adds a constraint to each visual template. The specific construction process of the visual template is as follows.
First, the visual image is processed using patch normalization, which has been shown to improve the robustness of image recognition under illumination changes and is also used in RatSLAM. The intensity of a single pixel after patch normalization is calculated by
$$I'_{x,y} = \frac{I_{x,y}-\mu_{x,y}}{\sigma_{x,y}},$$
where $\mu_{x,y}$ and $\sigma_{x,y}$ represent the mean and standard deviation of the $n$ pixels surrounding pixel $(x, y)$, respectively.
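A short sketch of patch normalization as described above, assuming a square neighborhood of side `patch` around each pixel; the names and parameter values are illustrative.

```python
import numpy as np

def patch_normalize(image, patch=5, eps=1e-6):
    """Normalize each pixel by the mean and standard deviation of its neighbourhood."""
    img = image.astype(np.float64)
    h, w = img.shape
    r = patch // 2
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            win = img[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1]
            out[y, x] = (img[y, x] - win.mean()) / (win.std() + eps)
    return out

# Example on a synthetic gradient image with an illumination offset
image = np.tile(np.linspace(0, 255, 64), (48, 1)) + 40.0
print(patch_normalize(image, patch=7).shape)   # (48, 64)
```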
The normalized image is then converted into a one-dimensional vector using the scanline intensity profile method. This method represents two-dimensional image features with a one-dimensional vector, which significantly improves the computation speed of the whole navigation system and reduces its storage requirements. Using the image scanline intensity profile vector $I$ and the absolute heading $\psi$, a single visual template is defined as
$$V_i = \left\{I_i,\ \psi_i\right\}.$$
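A sketch of how a visual template could be assembled from the scanline intensity profile and the absolute heading; the column-wise averaging and the `VisualTemplate` container are assumptions used for illustration.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class VisualTemplate:
    profile: np.ndarray   # one-dimensional scanline intensity profile
    heading_deg: float    # absolute heading from the polarized sky-light sensor

def scanline_profile(normalized_image):
    """Collapse a (patch-normalized) image into a 1D column-intensity profile."""
    return normalized_image.mean(axis=0)

def make_template(normalized_image, heading_deg):
    return VisualTemplate(scanline_profile(normalized_image), heading_deg)

# Example
img = np.random.rand(48, 64)
tpl = make_template(img, heading_deg=80.0)
print(tpl.profile.shape, tpl.heading_deg)   # (64,) 80.0
```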
The visual template has two functions: one is to compare a new visual template with the visual template library for template matching, and the other is to calculate the forward speed of the carrier. The two functions are implemented as follows.
Consider two visual templates $V_i$ and $V_j$ with corresponding scanline intensity profiles $I_i$ and $I_j$; the average intensity difference between the two templates is
$$d\!\left(s, I_i, I_j\right) = \frac{1}{w-|s|}\sum_{k=0}^{w-|s|-1}\left|I_i[k+s]-I_j[k]\right|,$$
where $s$ represents the offset of the profiles, that is, the pixel offset between the two images, and $w$ represents the pixel width of the image. The smaller the average intensity difference, the more visually similar the two images are and the more likely they represent the same scene. The offset $s$ is a variable. The traditional approach is to evaluate the difference over a fixed range of offsets and take the minimum of these evaluations as the average intensity difference; this is prone to mismatches between similar but different scenes and requires many evaluations. Our method instead uses the absolute headings of the visual templates to calculate the offset $s$. Let the absolute headings of the two visual templates be $\psi_i$ and $\psi_j$, respectively; the angle deviation is then
$$\Delta\psi = \psi_i - \psi_j.$$
If the camera captures two images from the same location with the azimuth shifted by $\Delta\psi$, the translation distance $t$ between the centers of the two images can be approximated as
$$t \approx Z\tan\left(\Delta\psi\right),$$
where $Z$ is the depth of the image center. According to the camera imaging model [45], the corresponding coordinate component $x$ in the camera imaging frame is
$$x = f\,\frac{t}{Z},$$
where $f$ is the focal length. Combining Equations (25) and (26), we obtain
$$x = f\tan\left(\Delta\psi\right).$$
The camera imaging plane coordinates $(x, y)$ can be converted to pixel plane coordinates $(u, v)$ by one scaling and one translation [45], which can be expressed as
$$u = m\,x + u_0,\qquad v = n\,y + v_0,$$
where $m$ and $n$ are the scaling coefficients and $u_0$ and $v_0$ are the translations, all of which are constants related to the camera. According to Equations (27) and (28), the pixel offset can be obtained as
$$\Delta u = m\,f\tan\left(\Delta\psi\right).$$
Therefore, according to the absolute headings of the visual templates, the theoretical profile offset $s_0$ is obtained by
$$s_0 = k_c\tan\left(\Delta\psi\right),$$
where $k_c = m f$ is a constant related only to the camera; it can be calculated from the camera parameters and adjusted according to practical experience. Since errors arise during the experiments, the offset is searched within a small window $s \in \left[s_0-\delta_s,\ s_0+\delta_s\right]$, and the minimum value is taken as the average intensity difference:
$$D_{ij} = \min_{s\,\in\,\left[s_0-\delta_s,\ s_0+\delta_s\right]} d\!\left(s, I_i, I_j\right).$$
When performing visual template matching, a threshold $a$ is set: when $D_{ij} < a$, the two visual templates are considered to be successfully matched; when $D_{ij} \geq a$, the two visual templates are considered to represent different scenes. If the current scene is a new scene that has not been experienced before, a new visual template is created.
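The sketch below illustrates the heading-constrained comparison: a theoretical offset is predicted from the heading difference using a camera constant, a small window around it is searched, and the minimum average intensity difference is compared against a threshold. The values of `k_c`, the window half-width, and the threshold are placeholders, not the parameters used in the paper.

```python
import numpy as np

def avg_intensity_diff(profile_a, profile_b, s):
    """Average absolute difference between two 1D profiles at pixel offset s."""
    w = len(profile_a)
    if s >= 0:
        a, b = profile_a[s:], profile_b[:w - s]
    else:
        a, b = profile_a[:w + s], profile_b[-s:]
    return np.abs(a - b).mean()

def match_templates(profile_new, psi_new, profile_old, psi_old,
                    k_c=2.0, window=4, threshold=0.1):
    """Heading-constrained template matching (illustrative constants)."""
    d_psi = (psi_new - psi_old + 180.0) % 360.0 - 180.0      # wrapped heading difference
    s0 = int(round(k_c * np.tan(np.radians(d_psi))))         # theoretical pixel offset
    w = len(profile_new)
    offsets = [s for s in range(s0 - window, s0 + window + 1) if abs(s) < w]
    diffs = [avg_intensity_diff(profile_new, profile_old, s) for s in offsets]
    best = int(np.argmin(diffs))
    return diffs[best] < threshold, offsets[best], diffs[best]

# Example: the same scene seen again with a small heading change
base = np.random.rand(64)
print(match_templates(np.roll(base, 1), 85.0, base, 80.0))   # expected: match at offset 1
```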
The second function of the visual template is to calculate the moving speed of the carrier. First, the profile offset between the two visual templates is obtained according to Equation (31); then the unit average intensity difference $\bar{D}$, i.e., the average intensity difference per pixel at that offset, is calculated. The forward speed of the carrier is then
$$v = \min\left(k_v\,\bar{D},\ v_{\max}\right),$$
where $k_v$ is a constant related to the forward speed and $v_{\max}$ limits the maximum speed of the carrier. The carrier speed calculated from the visual template is not highly accurate, but the method establishes a connection with the actual scene, ensuring the reliability of navigation while significantly reducing the amount of computation. This reflects the advantage of the biological topological navigation mode, which does not pursue high positioning accuracy.
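A minimal sketch of the speed estimate described above, assuming the unit average intensity difference has already been computed; the gain and speed limit are placeholder values.

```python
def forward_speed(unit_avg_diff, k_v=0.2, v_max=1.5):
    """Forward speed from the unit average intensity difference (capped at v_max)."""
    return min(k_v * unit_avg_diff, v_max)

# A larger appearance change between frames implies faster motion, capped at the limit.
print(forward_speed(2.0), forward_speed(20.0))   # 0.4 1.5
```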
Similarly, the visual template can also be used to calculate the moving speed of the carrier in the height direction. First, columns corresponding to the profile offset are cropped from each of the two visual templates to remove the influence of the heading rotation; the remaining parts are converted into row-wise scanline intensity profiles $I_i^h$ and $I_j^h$. The unit average intensity difference is then
$$\bar{D}_h = \min_{|s|\,\leq\,s_{\max}}\ \frac{1}{h_w-|s|}\sum_{k=0}^{h_w-|s|-1}\left|I_i^h[k+s]-I_j^h[k]\right|,$$
where $h_w$ represents the pixel height of the image and the bound $s_{\max}$ ensures that the two images have enough overlap. The moving speed in the height direction is then
$$v_h = \min\left(k_{vh}\,\bar{D}_h,\ v_{h\max}\right),$$
where $k_{vh}$ denotes a constant related to the velocity in the height direction and $v_{h\max}$ denotes the maximum velocity in the height direction.
3.3. Environmental Topological Map Based on Absolute Heading
While moving through space, rats combine their motion trajectory with their perception of the surrounding environment and finally form, in the brain, an environmental cognitive map composed of a series of connected cognitive nodes that guides navigation. The cognitive map is a topological map. This section details the construction of the topological map, as shown in Figure 5.
The topological map consists of two basic elements: cognitive nodes $c_i$ and the topological edges $l_{ij}$ that encode the topological relationships between the nodes. The cognitive nodes constructed in this paper include the visual template $V_i$, the grid cells $G_i$, the head direction cells $H_i$, and the absolute pose $p_i$ of the current scene. A single cognitive node is defined as
$$c_i = \left\{V_i,\ G_i,\ H_i,\ p_i\right\},$$
where $p_i = \left[x_i,\ y_i,\ z_i,\ \psi_i\right]$ is a four-dimensional vector composed of the position and the absolute heading. The topological edges are defined based on the absolute heading:
$$l_{ij} = \left\{\Delta x_{ij},\ \Delta y_{ij},\ \Delta z_{ij},\ \psi_{ij}\right\},$$
where $\Delta x_{ij}$, $\Delta y_{ij}$, and $\Delta z_{ij}$ represent the position change of the carrier from node $i$ to node $j$, whereas $\psi_{ij}$ represents the absolute heading from the center of node $i$ to node $j$, rather than the change in heading angle. Topological connections between cognitive nodes can then be established through the topological edges, expressed with the aid of a constant-valued row vector that maps an edge onto the pose of the connected node.
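For clarity, the two map elements could be represented by simple containers such as the following; the field names mirror the definitions above, while the container layout itself is an illustrative assumption.

```python
import numpy as np
from dataclasses import dataclass, field

@dataclass
class CognitiveNode:
    template_id: int                 # index of the visual template V_i
    grid_state: np.ndarray           # snapshot of the grid cell activity G_i
    hd_state: np.ndarray             # snapshot of the head direction cell activity H_i
    pose: np.ndarray                 # p_i = [x, y, z, psi] with absolute heading
    links: list = field(default_factory=list)   # outgoing topological edges

@dataclass
class TopologicalEdge:
    target: int                      # index of node j
    d_pos: np.ndarray                # [dx_ij, dy_ij, dz_ij]
    psi_ij: float                    # absolute heading from node i to node j

# Example: connect two nodes with an edge whose heading is absolute, not relative
n0 = CognitiveNode(0, np.zeros((4, 4, 4)), np.zeros((36, 6)),
                   np.array([0.0, 0.0, 0.0, 80.0]))
n0.links.append(TopologicalEdge(target=1, d_pos=np.array([2.0, 0.5, 0.0]), psi_ij=80.0))
print(len(n0.links))
```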
When the carrier moves from a cognitive node into a new scene that it has not experienced before, a pending cognitive node is generated; the system compares the matching degree of this node with the existing nodes in the topological map and then decides whether a new node must be created. The matching degree $M_i$ designed in this paper is
$$M_i = \mu_V\left|V-V_i\right| + \mu_G\left|G-G_i\right| + \mu_H\left|H-H_i\right| + \mu_p\,S_\psi,$$
where $\mu_V$, $\mu_G$, $\mu_H$, and $\mu_p$ are the weighting coefficients of the parts used to match the cognitive nodes, and $S_\psi$ is the matching difference parameter calculated from the absolute heading. Let $\Delta\psi$ be the absolute heading difference between the two cognitive nodes and $\psi_{th}$ a threshold: if $\Delta\psi < \psi_{th}$, the heading matching item succeeds and $S_\psi = 0$; if $\Delta\psi \geq \psi_{th}$ and $\left|\Delta\psi - 180^{\circ}\right| < \psi_{th}$, the current node is regarded as a reverse traversal, its absolute heading is first rotated by 180°, and the heading matching item also succeeds with $S_\psi = 0$; and if $\Delta\psi \geq \psi_{th}$ and $\left|\Delta\psi - 180^{\circ}\right| \geq \psi_{th}$, the heading matching item fails and $S_\psi$ takes a large penalty value. Here, $\psi_{th}$ is set to 60°: when the absolute heading difference between two cognitive nodes is less than 60°, they are considered successfully matched in the absolute heading matching item; otherwise, the matching fails. Considering that the carrier may traverse the same scene in both the forward and reverse directions, if the two cognitive nodes are oriented oppositely, the absolute heading of the current cognitive node is rotated by 180° before the heading difference is calculated.
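A sketch of the node-matching logic described above, under the assumption that the matching degree is a weighted sum of per-component differences and that the heading term is a pass/fail penalty with a 60° threshold and 180° handling for reverse traversal; the weights and penalty value are placeholders.

```python
def heading_mismatch(psi_cur, psi_node, reverse=False, threshold_deg=60.0, penalty=1.0):
    """Heading matching term: 0 if within the threshold, a penalty otherwise."""
    if reverse:                                   # same scene traversed in reverse
        psi_cur = (psi_cur + 180.0) % 360.0
    d = abs((psi_cur - psi_node + 180.0) % 360.0 - 180.0)
    return 0.0 if d < threshold_deg else penalty

def matching_degree(v_diff, g_diff, h_diff, psi_cur, psi_node, reverse=False,
                    mu_v=0.5, mu_g=0.2, mu_h=0.2, mu_p=0.1):
    """Weighted matching degree between a pending node and an existing node."""
    return (mu_v * v_diff + mu_g * g_diff + mu_h * h_diff
            + mu_p * heading_mismatch(psi_cur, psi_node, reverse))

# Example: small visual/grid/heading differences give a low (good) matching degree
print(matching_degree(0.05, 0.1, 0.1, psi_cur=82.0, psi_node=80.0))
```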
A node matching threshold $M_{th}$ is set. If $M_i \geq M_{th}$ for all existing nodes, the current cognitive node is added to the topological map and connected to it through a topological edge as defined above. If there are one or more nodes whose matching degree is less than the threshold $M_{th}$, the node with the lowest matching degree is considered to match the current scene; that is, the carrier is currently located in the scene represented by that cognitive node.
When a cognitive node is successfully matched, loop-closure detection is triggered, and the accumulated path-integration error can be corrected using the node information. In this paper, the map relaxation method used in RatSLAM is adopted to correct the topological map. This method corrects the cognitive nodes by applying a pose offset; since the absolute heading has no accumulated error, only the positions of the cognitive nodes need to be modified. The position offset of cognitive node $i$ is calculated as
$$\Delta p_i = \alpha\left[\sum_{j=1}^{N_f}\left(p_j - p_i - \Delta p_{ij}\right) + \sum_{k=1}^{N_t}\left(p_k + \Delta p_{ki} - p_i\right)\right],$$
where $\alpha$ is the correction coefficient, $N_f$ represents the number of connections from cognitive node $i$ to other nodes, and $N_t$ represents the number of connections from other cognitive nodes to node $i$.
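A sketch of the map relaxation step as reconstructed above: each node's position is nudged toward the positions implied by its outgoing and incoming edges, while the absolute heading component is left untouched; the correction coefficient, iteration count, and data layout are illustrative assumptions.

```python
import numpy as np

def relax_map(positions, edges, alpha=0.5, iterations=10):
    """Simple map-relaxation passes over node positions (headings untouched).

    positions: (N, 3) array of node positions [x, y, z]
    edges: list of (i, j, d_pos) with d_pos the stored position change from i to j
    """
    p = positions.copy()
    for _ in range(iterations):
        dp = np.zeros_like(p)
        for i, j, d_pos in edges:
            dp[i] += p[j] - p[i] - d_pos     # outgoing link: pull i toward p_j - d_pos
            dp[j] += p[i] + d_pos - p[j]     # incoming link: pull j toward p_i + d_pos
        p += alpha * dp
    return p

# Example: accumulated drift of 0.5 m in y at node 2 is redistributed by relaxation
positions = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [4.0, 0.5, 0.0]])
edges = [(0, 1, np.array([2.0, 0.0, 0.0])),
         (1, 2, np.array([2.0, 0.0, 0.0])),
         (2, 0, np.array([-4.0, 0.0, 0.0]))]   # loop-closure edge
print(relax_map(positions, edges).round(2))
```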