1. Introduction
With the continuous development of aerospace technology, on-orbit maintenance of faulty spacecraft, rescue of malfunctioning satellites, repair and refueling, debris removal, asteroid exploration, and similar tasks have become practical problems to be solved. The motion estimation of non-cooperative space objects is an important prerequisite for successfully completing such missions; that is, for achieving autonomous relative navigation with respect to non-cooperative space objects.
To improve navigation accuracy, autonomous navigation based on multi-source information fusion has been applied on specific spacecraft using microwave radars, laser radars, optical sensors, etc., as navigation sensors [1,2]. However, due to the complexity of fusion filtering algorithms and the constraints of on-board computing capability, it is difficult to achieve true multi-source information fusion on ordinary satellites. As a standard part of a spacecraft's configuration, optical sensors offer full-range applicability and rich information, making them ideal information sources for the autonomous navigation of non-cooperative space objects.
According to the application scenario, autonomous navigation based on optical sensors can be divided into four categories. The first is autonomous relative navigation based on binocular vision. Fourie [3] proposed a method of pose estimation and 3D reconstruction of non-cooperative space objects based on binocular vision; this method has been demonstrated using real-world data acquired inside the ISS [4]. Ge [5] proposed a method for motion and inertial parameter estimation of a non-cooperative object on orbit using stereo vision. The measurement accuracy and range of binocular vision depend heavily on the baseline length, which is limited by the size of the spacecraft itself. The method is only suitable for close-range measurement within a few meters and cannot be applied to remote navigation at ranges of a few hundred meters to tens of kilometers.
The second is autonomous relative navigation based on monocular vision and terrain matching. NASA's Mars rovers Curiosity [6] and Perseverance [7] adopted image matching based on a terrain model of the landing area to achieve obstacle avoidance and fixed-point landing. This kind of method requires a high-precision 3D model of a known object; it requires surround measurements of the object to obtain a large number of multi-view images, which are sent back to a ground station for processing. This kind of method has a long preparation period and is primarily applied to missions with a known object and boundary conditions. It is not suitable for fast relative navigation missions with unknown objects.
The third is autonomous relative navigation based on monocular vision and artificial markers. In the Japanese asteroid exploration missions, Hayabusa 1 [8] and Hayabusa 2 [9] dropped luminous target markers onto the surfaces of asteroids Itokawa and Ryugu, respectively, and used a monocular optical camera to achieve relative position measurement. Due to the need to release artificial markers, this method is difficult to apply to failed satellites, space debris, etc.
The fourth is autonomous relative navigation based on SLAM. Takeishi [10] proposed a motion estimation method based on a particle filter and nonlinear optimization and applied it to Hayabusa 2. SLAM technology is mainly used in the field of ground mobile robots, and the scene is required to be static [11]. In a relative navigation scenario, the object and the spacecraft are always in motion, so the traditional methods cannot be directly applied to a satellite. Fahmi [12,13,14,15,16] proposed aggregation operators on triangular cubic fuzzy numbers, which can be applied to decision-making problems and have application prospects in SLAM. However, current algorithms are primarily applied to asteroid exploration scenes; they are complex and computationally expensive, which makes it difficult to satisfy the constraints of on-board computing capability.
To solve the above problems, a motion estimation method of non-cooperative space objects for autonomous navigation based on monocular sequence images is presented. This paper is organized as follows: Section 2 presents the proposed method, Section 3 provides the simulation results, Section 4 discusses the performance of the proposed method, and Section 5 concludes the work.
2. Proposed Method
2.1. The Relationship between Object Motion and Camera Equivalent Motion
When the spacecraft is controlled to stare at a non-cooperative object, the object remains in the field of view of the spacecraft camera. Since the motion of the spacecraft itself is known, the problem of estimating the object motion can be simplified to obtaining the motion of the object from the sequence images of a fixed monocular camera.
This section provides the model for motion estimation of a non-cooperative object. First, we define the following reference frames (Figure 1) to aid understanding of the problem:
- (1) Camera-fixed frame ({Oc-XcYcZc}): a frame centred at the position of the camera that moves with the camera.
- (2) Object-fixed frame ({Ob-XbYbZb}): a frame centred at the centroid of the object that moves with the object.
A pinhole projection model of the camera is assumed, and a feature point Pb on the object is imaged at the point Pc on the focal plane; a single feature measurement is expressed as:

[X_c, Y_c, Z_c]^T = R [X_b, Y_b, Z_b]^T + T,  (1)

where [X_b, Y_b, Z_b]^T are the coordinates of Pb in the object-fixed frame ({Ob-XbYbZb}), [X_c, Y_c, Z_c]^T are the coordinates of Pc in the camera-fixed frame ({Oc-XcYcZc}), and R and T are the rotation matrix and translation vector from the object-fixed frame to the camera-fixed frame.
The pinhole projection model of the camera is expressed as:

u = (f/d_a)·(X_c/Z_c) + u_0,   v = (f/d_b)·(Y_c/Z_c) + v_0,  (2)

where u and v are the image pixel coordinates of feature point Pb in the two directions, u_0 and v_0 are the pixel coordinates of the image centre, and d_a, d_b and f are the cell sizes of the focal plane in the two directions and the focal length of the camera, respectively.
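To make Equations (1) and (2) concrete, the following is a minimal sketch of projecting an object-fixed feature point into pixel coordinates, assuming the reconstructed forms of the equations above; the function name and the example numbers (a 16 mm lens, 5.5 µm cells, a 2048 × 2048 detector with its principal point at the centre) are illustrative only.

```python
import numpy as np

def project_point(P_b, R, T, f, d_a, d_b, u0, v0):
    """Project an object-frame point into pixel coordinates.

    P_b  : (3,) coordinates of the feature point in {Ob-XbYbZb}
    R, T : rotation matrix and translation vector from the object-fixed
           frame to the camera-fixed frame (Equation (1))
    f    : focal length [m]; d_a, d_b : cell sizes [m]; u0, v0 : image centre [px]
    """
    X_c, Y_c, Z_c = R @ P_b + T          # Equation (1): object frame -> camera frame
    u = (f / d_a) * (X_c / Z_c) + u0     # Equation (2): pinhole projection
    v = (f / d_b) * (Y_c / Z_c) + v0
    return np.array([u, v])

# Example: a point on an object 10 m in front of the camera.
R = np.eye(3)
T = np.array([0.0, 0.0, 10.0])
print(project_point(np.array([0.1, -0.05, 0.0]), R, T,
                    f=16e-3, d_a=5.5e-6, d_b=5.5e-6, u0=1024.0, v0=1024.0))
```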
For a translating and rotating non-cooperative space object, we use a monocular camera fixed in an inertial reference frame to take pictures of the object. The position and attitude relative to the camera-fixed frame {Oc-XcYcZc} are represented by the translation vector T and the attitude rotation matrix R; then, the different T and R of the object motion relative to the camera can be obtained at different times. Similarly, the change in the object position and attitude between two moments can be represented by another translation vector and attitude rotation matrix (Figure 2).
At two different moments, i and i + 1, the coordinates of the feature point P on the non-cooperative object in the frame {Oc-XcYcZc} are (X_c,i, Y_c,i, Z_c,i) and (X_c,i+1, Y_c,i+1, Z_c,i+1). The attitude rotation matrices from {Ob-XbYbZb} to {Oc-XcYcZc} are Ri and Ri+1, and the translation vectors are Ti and Ti+1. The attitude rotation matrix and translation vector of the object changing from moment i to moment i + 1 are Gi and Li. Then, the coordinate transformation relationships can be obtained:

[X_c,i, Y_c,i, Z_c,i]^T = R_i [X_b, Y_b, Z_b]^T + T_i,  (3)

[X_c,i+1, Y_c,i+1, Z_c,i+1]^T = R_i+1 [X_b, Y_b, Z_b]^T + T_i+1 = G_i [X_c,i, Y_c,i, Z_c,i]^T + L_i,  (4)

where Ti, Ti+1 and Li are described in the camera-fixed frame {Oc-XcYcZc}.
Then, combining Equations (3) and (4), the following expressions can be obtained:
Equation (7) can be written as follows:
where it can be denoted that:
The above observation process is essentially the relative movement of the camera and the object. It can be understood in two equivalent ways. The first interpretation is the real physical state; that is, the camera does not move and the object moves. The second, equivalent interpretation takes the object as the reference: imagine a person standing on the moving object and observing a fixed camera. This is equivalent to the object not moving and the camera moving, as shown in Figure 3. Therefore, Ri and Ti can equivalently be interpreted as the attitude rotation matrix and translation vector of the camera-fixed frame {Oc-XcYcZc} from moment i to moment i + 1.
Furthermore, combining Equations (5), (6) and (9), the following equations can be obtained:
Equation (10) provides the relationship between object motion and camera equivalent motion. Note that the equivalent camera translation consists of two parts: one part contains the real translation of the object, and the other part represents the equivalent translation of the camera-fixed frame caused by the object rotation.
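One consistent reading of Equations (3)–(10) is that eliminating the object-frame coordinates from the two transforms in Equations (3) and (4) yields the inter-frame object motion Gi = Ri+1 Ri^T and Li = Ti+1 − Gi Ti, which is also the equivalent camera motion between the two views. The closed forms in the sketch below are an assumption based on that reading, not the paper's displayed equations; the code simply checks them numerically on arbitrarily chosen poses.

```python
import numpy as np
from scipy.spatial.transform import Rotation

# Hypothetical poses of the object-fixed frame in the camera-fixed frame
# at moments i and i+1 (arbitrary values, for illustration only).
R_i  = Rotation.from_euler("xyz", [10, -5, 20], degrees=True).as_matrix()
T_i  = np.array([0.5, -0.2, 12.0])
R_ip = Rotation.from_euler("xyz", [12, -4, 23], degrees=True).as_matrix()
T_ip = np.array([0.6, -0.1, 11.8])

# Candidate inter-frame object motion, obtained by eliminating the
# object-frame coordinates P_b from Equations (3) and (4) as read above.
G_i = R_ip @ R_i.T
L_i = T_ip - G_i @ T_i

# Check: any object-fixed point must satisfy P_{c,i+1} = G_i P_{c,i} + L_i.
P_b   = np.array([0.3, 0.1, -0.2])
P_ci  = R_i  @ P_b + T_i
P_cip = R_ip @ P_b + T_ip
assert np.allclose(P_cip, G_i @ P_ci + L_i)
```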
2.2. Camera Translation and Rotation from Sequence Images
A sequence of images is constructed by observing the object with a monocular camera, and the same feature point in two adjacent image frames conforms to the epipolar constraint:

[X_c,i+1, Y_c,i+1, Z_c,i+1] · [T]_× · R · [X_c,i, Y_c,i, Z_c,i]^T = 0,

where R and T here denote the equivalent rotation matrix and translation vector of the camera between the two adjacent moments, and [T]_× is the skew-symmetric matrix of T.

Bringing Equation (2) into the above equation, the following can be obtained:

p_i+1^T · K^{-T} · [T]_× · R · K^{-1} · p_i = 0,

where K is the camera internal parameter matrix, and p_i = [u_i, v_i, 1]^T is the pixel coordinate at moment i.
We denote E = [T]_× R and F = λ K^{-T} E K^{-1}, which are the essential matrix and the fundamental matrix of stereo vision, respectively, and the eight-point method is often used to obtain their values, where λ is a constant scale factor. After obtaining the essential matrix and the fundamental matrix by feature matching, R and T can be obtained by performing singular value decomposition (SVD) on E.
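A minimal sketch of this step using OpenCV's standard essential-matrix routines (cv2.findEssentialMat and cv2.recoverPose) rather than a hand-written eight-point solver; this is one common implementation choice, not necessarily the one used by the authors.

```python
import numpy as np
import cv2

def relative_camera_motion(pts_i, pts_ip1, K):
    """Recover the equivalent camera rotation and translation direction
    between two adjacent frames from >= 8 matched feature points.

    pts_i, pts_ip1 : (N, 2) float arrays of pixel coordinates of the same
                     features at moments i and i+1; K : 3x3 intrinsic matrix.
    Note: the returned translation is a unit vector -- the scale is lost,
    which is exactly the ambiguity discussed in Section 2.4.
    """
    E, inliers = cv2.findEssentialMat(pts_i, pts_ip1, cameraMatrix=K,
                                      method=cv2.RANSAC, prob=0.999, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts_i, pts_ip1, cameraMatrix=K, mask=inliers)
    return R, t.ravel()
```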
Assume that the attitude rotation matrix is R0 and the translation vector is T0 at the initial moment. Then, according to Equation (7), the attitude rotation matrix and translation vector describing the position and attitude change of {Oc-XcYcZc} from the initial moment to moment i are obtained by accumulating the frame-to-frame motions:

where ∏ is the continuous multiplication notation, and Σ is the continuous addition notation.
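Because the displayed accumulation formula is not reproduced here, the following sketch chains the pair-wise motions from the initial moment to moment i using the standard pose-composition convention; whether the paper sums the raw per-frame translations or the rotated ones cannot be recovered from the text, so this form is an assumption.

```python
import numpy as np

def accumulate_motion(delta_Rs, delta_Ts):
    """Chain per-frame relative motions (moment k -> k+1) into the motion
    from the initial moment to moment i.

    delta_Rs, delta_Ts : lists of 3x3 rotation matrices and 3-vectors
    obtained pair-by-pair from the essential-matrix decomposition.
    Assumes R_0 = I and T_0 = 0 at the initial moment.
    """
    R_acc = np.eye(3)
    T_acc = np.zeros(3)
    for dR, dT in zip(delta_Rs, delta_Ts):
        # Compose: x_{k+1} = dR x_k + dT  =>  x_i = R_acc x_0 + T_acc.
        T_acc = dR @ T_acc + dT
        R_acc = dR @ R_acc
    return R_acc, T_acc
```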
2.3. The Model for Estimating the Object Rotation State
At the initial moment of observation, an object-fixed frame parallel to the camera-fixed frame {Oc-XcYcZc} is established on the object; that is, the initial attitude rotation matrix is the identity matrix. Then, the accumulated rotation matrix obtained in Section 2.2 is equal to Ri, and the transpose of Ri is the attitude rotation matrix of the object, which represents the attitude rotation of the object. Therefore, it can be concluded that the rotation of the object from the observation start time to any time can be determined.
Differentiating the attitude rotation matrix Ri with respect to time, the object angular velocity ω_b relative to the object-fixed frame can be obtained:

[ω_b]_× = R_i^T (dR_i/dt),

where (dR_i/dt) is the time derivative of R_i, and [ω_b]_× is the skew-symmetric matrix of ω_b = [ω_x, ω_y, ω_z]^T.
Furthermore, through the projection transformation of the object angular velocity in the object-fixed frame by the attitude rotation matrix Ri, the object angular velocity relative to the camera-fixed frame can be obtained:

ω_c = R_i ω_b.
Similarly, the attitude quaternion of the object q = [q_0, q_1, q_2, q_3]^T can be obtained from the attitude rotation matrix through the standard matrix-to-quaternion conversion. Then, the rotation axis direction e and rotation angle θ are:

θ = 2 arccos(q_0),   e = [q_1, q_2, q_3]^T / sin(θ/2).
Thus far, the model for estimating the rotation state of the object has been obtained: the accumulated attitude rotation matrix, the angular velocity, the attitude quaternion, and the rotation axis and angle given above.
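As a hedged illustration of the rotation-state model, the sketch below extracts the angular velocity (by finite differences of successive accumulated rotation matrices), the attitude quaternion, and the rotation axis and angle using SciPy's rotation utilities; the frame conventions (object-to-camera rotation, scalar-first quaternion) are assumptions made for the sketch.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def rotation_state(R_prev, R_curr, dt):
    """Extract the rotation state of the object from two successive
    accumulated rotation matrices sampled dt seconds apart.

    Returns the body-frame and camera-frame angular velocities, the
    attitude quaternion (scalar first), and the rotation angle and axis.
    """
    # Relative rotation over dt, expressed in the object-fixed frame
    # (finite-difference version of the derivative relation above).
    dR = R_prev.T @ R_curr
    omega_b = Rotation.from_matrix(dR).as_rotvec() / dt   # rad/s, object frame
    omega_c = R_curr @ omega_b                             # camera frame

    # Attitude quaternion; SciPy returns [x, y, z, w], reorder to scalar-first.
    x, y, z, w = Rotation.from_matrix(R_curr).as_quat()
    q = np.array([w, x, y, z])

    theta = 2.0 * np.arccos(np.clip(q[0], -1.0, 1.0))      # rotation angle
    axis = q[1:] / np.sin(theta / 2.0) if theta > 1e-9 else np.array([0.0, 0.0, 1.0])
    return omega_b, omega_c, q, theta, axis
```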
2.4. Discussion on Estimating the Object Translation State
In comparison to estimating the rotation of an object, the translation problem is more complicated. Equation (10) shows that the equivalent camera translation consists of two parts, one of which contains the real translation of the object. To clarify the composition of the translation vector, we discuss the equivalent translation process of the camera in detail, as shown in Figure 4.
From the decomposition of the essential matrix, only the direction of the equivalent camera translation between two adjacent moments can be obtained; the distance scale is missing. The triangle constituted by Ti, Ti+1 and this relative translation cannot be determined from only one side direction. To determine what information is needed to fix the triangle, the following cases are considered step by step:
- (1) If the size and direction of Ti and Ti+1 are known, the relative translation can be determined.
- (2) If the size and direction of Ti and the relative translation are known, the size and direction of Ti+1 can be determined.
- (3) If only the directions of Ti and Ti+1 are known, the ratio of Ti, Ti+1 and the relative translation can be determined; that is, they can be unified to the same scale. Then, the object translation can be unified to this scale, and its direction can be determined (see the sketch below).
- (4) On the basis of (3), if the size of Ti is also known, the object translation can be determined. This means that if the position and direction at the initial moment and the direction at any moment of the object centroid are known, the translation of the object can be solved.
- (5) If only the directions of Ti and Ti+1 are known and they are not parallel, the object must have translation. It should be pointed out that this is a sufficient but not necessary condition.
The above assumptions can guide us in obtaining the relevant data from other sensors, or in configuring the sensors, when applying the proposed method.
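As a numerical illustration of case (3), the sketch below recovers the length ratios of a closed vector triangle from its three side directions alone. Whether the paper's triangle closes exactly as Ti (possibly rotated), Ti+1 and the relative translation cannot be recovered from the text, so the closure relation a + b = c used here is an assumption; the function name is illustrative.

```python
import numpy as np

def side_length_ratios(dir_a, dir_b, dir_c):
    """Given only the directions of the three sides of a closed vector
    triangle a + b = c, recover the length ratios (|a|, |b|) / |c|.

    This illustrates case (3): with the directions of the three sides
    known, they can be unified to a common (but still unknown) scale.
    """
    dir_a, dir_b, dir_c = (np.asarray(d, float) / np.linalg.norm(d)
                           for d in (dir_a, dir_b, dir_c))
    # Solve |a| * dir_a + |b| * dir_b = 1 * dir_c in the least-squares sense.
    A = np.column_stack([dir_a, dir_b])
    ratios, *_ = np.linalg.lstsq(A, dir_c, rcond=None)
    return ratios   # [ |a|/|c| , |b|/|c| ]

# Example: a = [3,0,0], b = [0,4,0], c = a + b = [3,4,0]  ->  ratios 0.6 and 0.8.
print(side_length_ratios([1, 0, 0], [0, 1, 0], [3, 4, 0]))
```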
3. Results
The performance of the method is verified by mathematical simulation. A freely nutating satellite in orbit is assumed, which represents a typical non-cooperative space object. Before a robotic spacecraft operates on it in orbit, the motion state of the object must first be obtained.
The mass characteristics and initial motion state of the satellite described in the object-fixed frame are as follows:
The centroid position of the satellite described in the camera-fixed frame is as follows:
The internal parameter matrix of the camera is as follows:
The above parameters are representative of general satellites; they are simply a comparable set of assumed values, and their specific choice does not affect the results. The number of feature points extracted from the object needs to be greater than eight. The camera's internal parameter matrix is derived from an ideal camera with a focal length of 16 mm, a single-cell size on the image plane of 5.5 microns, and a total of 2048 by 2048 pixels.
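For reference, the internal parameter matrix implied by the stated camera (16 mm focal length, 5.5 µm cells, 2048 × 2048 pixels, principal point assumed at the image centre) can be computed as follows; the exact matrix used in the simulation is not reproduced in the text.

```python
import numpy as np

# Intrinsics implied by the stated camera: f = 16 mm, 5.5 um cells, 2048 x 2048 px.
f, cell, n_pix = 16e-3, 5.5e-6, 2048
fx = fy = f / cell            # approx. 2909.1 pixels
cx = cy = n_pix / 2.0         # 1024 pixels (principal point at image centre)
K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])
print(K)
```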
Through the object feature recognition and matching algorithm, more than eight feature points on the object are extracted and stably tracked, as shown in Figure 5. The spatial motion trajectory of the object feature points is shown in Figure 6. The tracked object image feature points and their change trajectories are shown in Figure 7.
The attitude quaternion and angular velocity of the object obtained by the proposed method are shown in Figure 8 and Figure 9, where the red dotted curve is the result estimated by the proposed method, and the blue solid curve is the true state of the object in the simulation.
4. Discussion
The simulation demonstrates that the proposed method is feasible; that is, the attitude motion of the target can be obtained from a sequence of images of the object using only a monocular camera, and the conditions required for estimating the translation of the object have been analyzed. The consumption of hardware resources and computation in the motion estimation process is low. However, the simulation only verifies the validity of the motion estimation; the image processing algorithm, although not the focus of this paper, is very important and is a prerequisite of the method. Motion estimation can only be achieved if accurate matches of more than eight feature points are obtained. Overall, the rotation speed and observation distance of the object, as well as the image processing algorithm, affect the extraction accuracy of the feature points, which is another important issue that requires further research.
Various approaches are available, and the current study differs from them. At present, some methods of motion estimation for non-cooperative space objects need to be implemented with binocular cameras. However, on a satellite, the baseline length of a binocular camera is greatly limited. Other methods need to use microwave radar or lidar, which complicates the equipment installation layout and consumes satellite resources. Moreover, many methods use complex fusion filtering algorithms. All of these reduce the effectiveness of onboard autonomous navigation. In contrast to other methods, which require binocular cameras or radar measurements, this method only uses a monocular camera and has low requirements for the configuration of the measurement sensors. Therefore, low computational complexity and high processing speed are advantages of this method. It also has some disadvantages, such as being unable to estimate the true scale of the translation and requiring high image processing accuracy. Nevertheless, it is a novel method for the motion estimation of non-cooperative space targets.
5. Conclusions
In this paper, a novel motion estimation method of non-cooperative space objects for autonomous navigation based on monocular sequence images is proposed. This method is not limited by the baseline length, as stereo vision algorithms are, and has no constraints on the size and range of the object; it can be applied to asteroid exploration missions. The method does not require prior information such as a three-dimensional model of the object or artificial markers; it only uses the sequence images of a monocular camera to estimate the motion of the object. Because the motion estimate is obtained from image matching alone, the stability and convergence-time problems of filtering algorithms are avoided. Theoretically, only two adjacent frames from the initial moment are needed to obtain a motion estimate of the object. The method has low calculation cost, runs stably, and realizes the simultaneous modelling and estimation of the space moving object. Moreover, the algorithm only uses basic matrix and vector calculations, which is convenient for code transplantation and embedded implementation, and can well meet the requirements of real-time on-board computation.
Author Contributions
Conceptualization, D.G.; Methodology, D.W.; Software, W.Z.; Investigation, W.E.; Data curation, R.D. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Natural Science Foundation of China (U20B2055); the National Key Research and Development Program of China (2019YFA0706500).
Institutional Review Board Statement
This work did not involve humans or animals.
Informed Consent Statement
Informed consent was obtained from all subjects involved in the study.
Conflicts of Interest
The authors declare that they have no known competing financial interest or personal relationships that could have appeared to influence the work reported in this paper.
References
- Xu, C.; Wang, D.Y.; Huang, X.Y. Autonomous navigation based on sequential images for planetary landing in unknown environments. J. Guid. Control Dyn. 2017, 40, 2587–2602. [Google Scholar] [CrossRef]
- Li, M.D.; Huang, X.Y.; Xu, C.; Guo, M.; Hu, J.; Hao, C.; Wang, D. Velocimeter-Aided Attitude Estimation for Mars Autonomous Landing: Observability Analysis and Filter Algorithms. IEEE Trans. Aerosp. Electron. Syst. 2022, 58, 451–463. [Google Scholar] [CrossRef]
- Fourie, D.; Tweddle, B.E.; Ulrich, S.; Saenz-Otero, A. Flight results of vision-based navigation for autonomous spacecraft inspection of unknown objects. J. Spacecr. Rocket. 2014, 51, 2016–2026. [Google Scholar] [CrossRef]
- Tweddle, B.E.; Setterfield, T.P.; Saenz-otero, A.; Miller, D.W. An Open Research Facility for Vision-Based Navigation Onboard the International Space Station. J. Field Robot. 2016, 33, 157–186. [Google Scholar] [CrossRef]
- Ge, D.M.; Wang, D.Y.; Zou, Y.J.; Shi, J. Motion and inertial parameter estimation of non-cooperative target on orbit using stereo vision. Adv. Space Res. 2020, 66, 1475–1484. [Google Scholar] [CrossRef]
- Martin, M.S.; Mendeck, G.F.; Brugarolas, P.B.; Singh, G.; Serricchio, F.; Lee, S.W.; Wong, E.C.; Essmiller, J.C. In-flight experience of the Mars Science Laboratory Guidance, Navigation, and Control system for Entry, Descent, and Landing. CEAS Space J. 2015, 7, 119–142. [Google Scholar] [CrossRef]
- Brugarolas, P.B. Guidance, Navigation and Control for the Entry, Descent, and Landing of the Mars 2020 Mission. In Proceedings of the 40th Annual Guidance and Control Conference, Breckenridge, CO, USA, 2–8 February 2017. [Google Scholar]
- Uo, M.; Shirakawa, K.; Hashimoto, T.; Kubota, T.; Kawaguchi, J. Hayabusa’s touching-down to Itokawa—Autonomous guidance and navigation. In Proceedings of the AAS/AIAA Space Flight Mechanics Meeting, Tampa, FL, USA, 22–26 January 2006. [Google Scholar]
- Ono, G.; Terui, F.; Ogawa, N.; Kikuchi, S.; Mimasu, Y.; Yoshikawa, K.; Ikeda, H.; Takei, Y.; Yasuda, S.; Matsushima, K.; et al. GNC strategies and flight results of Hayabusa2 first touchdown operation. Acta Astronaut. 2020, 174, 131–147. [Google Scholar] [CrossRef]
- Takeishi, N.; Yairi, T. Visual monocular localization, mapping, and motion estimation of a rotating small celestial body. J. Robot. Mechatron. 2017, 29, 856–863. [Google Scholar] [CrossRef]
- Saputra, M.U.; Markham, A.; Trigoni, N. Visual SLAM and Structure from Motion in Dynamic Environments: A Survey. ACM Comput. Surv. 2018, 51, 37. [Google Scholar] [CrossRef]
- Fahmi, A.; Abdullah, S.; Amin, F.; Siddiqui, N.; Ali, A. Aggregation operators on triangular cubic fuzzy numbers and its application to multi-criteria decision making problems. J. Intell. Fuzzy Syst. 2017, 33, 3323–3337. [Google Scholar] [CrossRef]
- Amin, F.; Fahmi, A.; Abdullah, S.; Ali, A.; Ahmad, R.; Ghani, F. Triangular cubic linguistic hesitant fuzzy aggregation operators and their application in group decision making. J. Intell. Fuzzy Syst. 2018, 34, 2401–2416. [Google Scholar] [CrossRef]
- Fahmi, A.; Abdullah, S.; Amin, F.; Ali, A. Cubic fuzzy Einstein aggregation operators and its application to decision-making. Int. J. Syst. Sci. 2018, 49, 2385–2397. [Google Scholar] [CrossRef]
- Fahmi, A.; Abdullah, S.; Amin, F.; Ali, A.; Khan, W.A. Some geometric operators with triangular cubic linguistic hesitant fuzzy number and their application in group decision-making. J. Intell. Fuzzy Syst. 2018, 35, 2485–2499. [Google Scholar] [CrossRef]
- Fahmi, A.; Abdullah, S.; Amin, F.; Khan, M.S.A. Trapezoidal cubic fuzzy number Einstein hybrid weighted averaging operators and its application to decision making. Soft Comput. 2019, 23, 5753–5783. [Google Scholar] [CrossRef]