Article

PE-SLAM: A Modified Simultaneous Localization and Mapping System Based on Particle Swarm Optimization and Epipolar Constraints

School of Mechanical and Electrical Engineering, Lanzhou University of Technology, Lanzhou 730050, China
*
Authors to whom correspondence should be addressed.
Appl. Sci. 2024, 14(16), 7097; https://doi.org/10.3390/app14167097
Submission received: 20 June 2024 / Revised: 4 August 2024 / Accepted: 7 August 2024 / Published: 13 August 2024
(This article belongs to the Special Issue Autonomous Vehicles and Robotics)

Featured Application

Simultaneous Localization and Mapping of autonomous cleaning and inspection robots operating in the photovoltaic power station scene.

Abstract

Due to typical unstructured factors in photovoltaic power station environments, such as high feature similarity, weak textures, and simple structures, the motion model of the ORB-SLAM2 algorithm performs poorly and its tracking accuracy declines. To address this issue, we propose PE-SLAM, which improves the motion model of ORB-SLAM2 by combining the particle swarm optimization (PSO) algorithm with the epipolar constraint to eliminate mismatches. First, a new mutation strategy is proposed that perturbs the pbest (personal best value) during the late convergence stage of the PSO algorithm, thereby preventing the PSO algorithm from falling into local optima. Then, the improved PSO algorithm is used to solve the fundamental matrix between two images based on the feature matching relationships obtained from the motion model. Finally, the epipolar constraint is applied using the computed fundamental matrix to eliminate the incorrect matches produced by the motion model, thereby enhancing the tracking accuracy and robustness of the ORB-SLAM2 algorithm in unstructured photovoltaic power station scenes. In feature matching experiments, the ORB+PE-match algorithm achieved average accuracy improvements of 19.5%, 14.0%, and 6.0% over the ORB, ORB+HAMMING, and ORB+RANSAC algorithms in unstructured environments, respectively, with better recall rates. In trajectory experiments on the TUM dataset, PE-SLAM reduced the average absolute trajectory error by 29.1% and the average relative pose error by 27.0% compared to ORB-SLAM2. In the photovoltaic power station mapping experiment, the constructed dense point cloud map is complete and shows little overlap, indicating that PE-SLAM largely overcomes the unstructured factors of the photovoltaic power station scene and is suitable for applications in this scene.

1. Introduction

Solar energy, as one of the main new energy sources, has become a backbone of the new energy industry [1]. Large-scale photovoltaic power plants are typically built in geographically complex areas such as valleys and deserts, and using mobile robots to carry out tasks such as cleaning and inspection in these plants offers clear advantages [2,3,4]. A mobile robot's perception of its own state and of the external environment is a prerequisite for autonomous operation in the unstructured environment of photovoltaic power plants [5,6].
In applications such as autonomous driving, robot positioning, and map building, ORB-SLAM2 [7], as a visual SLAM (simultaneous localization and mapping) method based on image features, has become quite mature in scenes with rich textures and complex structures [8]. However, in special environments like photovoltaic power plants, there are unstructured factors: high feature similarity, weak scene textures, and simple structures. These unstructured factors impose strong interference on the ORB feature matching algorithm in the ORB-SLAM2 algorithm, leading to a decrease in tracking accuracy.
The images of the photovoltaic power plant scene are composed mainly of photovoltaic modules. As shown in Figure 1, the image of a photovoltaic module can be seen as a combination of multiple small images with similar texture and structural features. The appearance, shape, and color of photovoltaic modules are similar, lacking unique markings or sufficient distinguishing characteristics. This similarity leads to poorly differentiated feature descriptors, which produce a large number of mismatches during the feature matching stage and, in turn, incorrect pose estimation. Inaccurate pose estimation degrades the tracking accuracy and mapping quality of the SLAM system and may even cause tracking to fail midway. Figure 1 also illustrates that the textures and structures of photovoltaic panels are extremely simple, with the extracted feature points concentrated at the connections between panels. The limited number of feature points reduces the tracking stability and backend optimization capability of the SLAM system, diminishing its robustness. Analysis of a large number of feature matching experiments shows that mismatches between features in photovoltaic power plant images are inconsistent, i.e., they do not obey a common constraint, whereas correct matches generally share the same constraint.
ORB-SLAM2 uses the ORB [9] algorithm for image feature extraction and matching. In this context, this paper focuses on recent research on improving the ORB algorithm. Reference [10] proposed an improved ORB algorithm based on dynamic thresholds and enhanced quadtree methods to enhance the uniformity of feature point extraction. Reference [11] introduced an improved ORB algorithm combining adaptive histogram equalization techniques to improve feature point quality and matching efficiency under low-light or overexposed conditions. Reference [12] combined an improved ORB algorithm with the Lucas–Kanade algorithm to propose a new image feature matching algorithm, which enhances feature point uniformity and matching accuracy. Reference [13] proposed an improved ORB algorithm based on adaptive thresholds and local grayscale differences, improving the effectiveness of feature description. References [14,15] segmented and organized high-dimensional feature data by establishing kd trees and k-means trees, enabling faster nearest neighbor searches and accelerating feature matching speed. Reference [16] incorporated grid motion statistics into feature matching to eliminate incorrect matches, effectively improving feature matching accuracy. Reference [17] used the RANSAC algorithm to remove mismatches in coarse matching point sets, achieving high real-time performance and effectively enhancing feature matching accuracy. Reference [18] utilized the PROSAC algorithm to remove mismatches by sampling from an ever-expanding best set, reducing computational load compared to Reference [17], but the randomness of sampling decreased the algorithm's robustness. Reference [19] employed the particle swarm algorithm with feature point similarity as the fitness function for feature point matching, providing effective matching accuracy and robustness, albeit with poorer real-time performance. Reference [20] addressed mismatch removal in remote sensing images using the particle swarm algorithm combined with affine transformations, showing good real-time performance and matching accuracy but limited applicability across various scenes. Reference [21] introduces a hardware accelerator for ORB-SLAM that optimizes rBRIEF descriptor generation using a genetic algorithm, reducing energy consumption by factors of 14,597 and 9609 compared with CPU and GPU implementations, respectively. Reference [22] enhances ORB feature extraction in complex lighting by combining image enhancement with a truncated adaptive threshold, improving feature point detection and robustness. Reference [23] introduces AAM-ORB, which integrates an affine attention module into the ORB feature matching pipeline to improve feature matching accuracy and efficiency under scene changes, using grid-based motion statistics to enhance speed. Reference [24] introduces a vision-based spatial target recognition method for robotic arms, combining improved ORB feature extraction with an enhanced GMS-MLESAC algorithm to achieve precise, fast, and stable recognition of spatial targets for complex task execution. Reference [25] improves ORB feature matching by introducing a signature-based method that significantly reduces the number of features needing comparison, thereby speeding up the matching process while maintaining high precision.
The methods mentioned above enhance the accuracy or speed of the ORB algorithm. However, due to the challenges posed by unstructured factors in the photovoltaic power plant environment for feature matching and the real-time requirements of visual SLAM systems, some of these proposed improvements are not suitable for visual SLAM algorithms in unstructured environments. This paper proposes a method for removing ORB image feature mismatches based on an improved particle swarm algorithm combined with epipolar constraints (PE-match), focusing on finding the majority of correct matches with consistent constraint conditions among the coarse matching point pairs. Compared to matching methods based on random sampling and clustering, the particle swarm algorithm has better global search capability and, compared to genetic algorithms, has lower algorithm complexity. Epipolar constraints offer better robustness against changes in perspective compared to affine transformations, and the analytical solution provided by epipolar constraints ensures the real-time performance of the algorithm. PE-match addresses the sensitivity of epipolar constraints to mismatches by leveraging the optimization capability of the improved Particle Swarm Optimization Algorithm, effectively expanding the application environment of epipolar constraints. PE-SLAM employs PE-match to improve the motion model of ORB-SLAM2, enhancing the tracking accuracy of the motion model in unstructured environments of photovoltaic power plants. The improved motion model, as the main tracking module within ORB-SLAM2, further increases the localization and mapping accuracy of ORB-SLAM2 in photovoltaic power plant scenes.

2. PE-SLAM System Framework

PE-SLAM is an improvement of the ORB-SLAM2 framework, as shown in Figure 2, consisting of the following five modules: “TRACKING”, “LOCAL MAPPING”, “LOOP CLOSING”, “DENSE POINT CLOUD MAPPING”, and “FULL BA”. “TRACKING” focuses on real-time computation of the camera’s motion trajectory by matching features in adjacent image frames. “LOCAL MAPPING” optimizes local poses and constructs local maps in the local scene. “LOOP CLOSING” corrects trajectory drift caused by error accumulation by determining if the camera has returned to a previously visited location. “DENSE POINT CLOUD MAPPING” is used to build dense point cloud maps. “FULL BA” further enhances the system’s accuracy and stability through optimization algorithms.
PE-SLAM is based on the classic ORB-SLAM2 framework and combines the particle swarm and epipolar constraint feature matching algorithm (PE-match) to remove mismatches from the feature matches produced by the motion model. In addition, after keyframes are generated in the visual odometry, a keyframe-based dense point cloud map is built to compensate for ORB-SLAM2's limitation of generating only sparse maps.

3. Improved Motion Model Based on PE-Match

The “TRACKING” module of ORB-SLAM2 can be divided into two main stages during the tracking process: pose estimation and pose optimization. The pose estimation stage employs three strategies: tracking based on reference keyframes, predictive tracking based on motion models, and relocalization. In the pose optimization stage, by establishing the matching relationship between map points of the current frame and its adjacent keyframes, more feature point matches are obtained to optimize and adjust the previous pose estimation.
The primary task of tracking in the pose estimation stage is handled by the motion model. As shown in Figure 3, its basic idea is to assume the camera is in uniform motion over a short period, using the pose and velocity estimated from the previous frame to predict the approximate positions of matching feature points between the current frame and the previous frame. A search area is then set around this position for feature descriptor matching, which improves matching speed while ensuring good accuracy. The specific process is as follows:
Step 1: Set the success criterion for feature matching: minimum descriptor distance < 0.9 × second-smallest distance.
Step 2: Initialize the pose R, t of the current frame using the current motion velocity and the pose of the previous frame.
Step 3: Using the pose R and t of the current frame, project the 3D points of the previous frame into the current coordinate system, search for candidate feature points within a radius r around each projected 2D point, traverse the candidates, compute the descriptor distances, and record the minimum matching distance.
Step 4: Record the feature orientations of the matched points for direction verification; in the direction histogram, discard the clusters of point pairs whose direction bins contain few entries and retain the three most populated direction clusters.
Step 5: If fewer than 20 matching point pairs are found, expand the search radius to 2r and search again.
Step 6: Optimize the pose of the current frame with g2o graph optimization using the matched point pairs.
Step 7: Determine whether tracking is successful based on the number of successful matches.
The motion model in ORB-SLAM2 only utilizes orientation histograms to remove outliers from feature matching results, which leads to fast processing but mediocre removal of incorrect matches. Addressing this issue, this paper proposes a method that, after filtering with orientation histograms in Step 4, employs a particle swarm combined with epipolar constraints to further eliminate incorrect matches. This secondary filtering step can remove the majority of incorrect matches, thereby enhancing the accuracy of the SLAM system.
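As an illustration of this two-stage filtering, the following Python sketch combines the orientation-histogram check of Step 4 with the PE-match pass described in Section 4. It assumes OpenCV-style keypoints exposing .pt and .angle fields, and pe_match_filter stands in for the PE-match procedure; all names here are illustrative placeholders, not ORB-SLAM2 API calls.

```python
# Hedged sketch of the two-stage outlier filtering described above.
import numpy as np

def track_with_motion_model(prev_kps, curr_kps, coarse_matches, pe_match_filter):
    """coarse_matches: list of (idx_prev, idx_curr) produced by the projection search."""
    # Stage 1: orientation-histogram check (keep the three most populated bins).
    hist = {}
    for i_prev, i_curr in coarse_matches:
        rot = (prev_kps[i_prev].angle - curr_kps[i_curr].angle) % 360.0
        hist.setdefault(int(rot // 12), []).append((i_prev, i_curr))  # 30 bins of 12 degrees
    top_bins = sorted(hist.values(), key=len, reverse=True)[:3]
    stage1 = [pair for b in top_bins for pair in b]

    # Stage 2: PE-match removes the remaining mismatches via the epipolar
    # constraint estimated with the improved PSO (see Section 4).
    p1 = np.float32([prev_kps[i].pt for i, _ in stage1])
    p2 = np.float32([curr_kps[j].pt for _, j in stage1])
    inlier_mask = pe_match_filter(p1, p2)  # boolean array, one flag per pair
    return [pair for pair, keep in zip(stage1, inlier_mask) if keep]
```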

4. Feature Matching Algorithm Based on Particle Swarm Optimization and Epipolar Constraint (PE-Match)

The traditional ORB feature matching algorithm has the advantage of fast computation speed. However, in photovoltaic power plant images, there are non-structural factors such as high feature similarity and simple structures, leading to a high occurrence of incorrect matches with the ORB algorithm. As mentioned earlier, feature matching in photovoltaic power plant scenes exhibits global feature similarity, with incorrect matches lacking consistency and not possessing the same constraint properties, while correct matches generally exhibit the same constraint properties. Therefore, PE-match is designed to address the characteristics of incorrect matches in photovoltaic power plant scenes. It utilizes the particle swarm algorithm’s information exchange and parallel search capabilities to search for constraints that match the majority of point pairs in the coarse matching point set and ultimately eliminates outliers based on these constraints.
The epipolar constraint leverages rotation and translation information in images and is minimally affected by large disparities between images. Its analytical solution guarantees real-time performance for algorithms and is suitable for image processing scenarios involving mobile robots. Therefore, this paper adopts the epipolar constraint as the required constraint for the algorithm.
The particle swarm optimization algorithm [26] has unique advantages in dealing with multi-objective optimization problems, but traditional particle swarm algorithms suffer from the drawback of getting trapped in local optima. This paper improves the mutation operations of pbest (personal best value) to enhance the global search capability of the particle swarm algorithm in the later stages of convergence.

4.1. Epipolar Constraint Principle

Epipolar constraints [27] reduce the possible locations of a point's match in the other image from a two-dimensional region to a one-dimensional line, providing a constraint for weeding out false matches. The epipolar constraint is illustrated in Figure 4: $I_1$ and $I_2$ are the image planes of the camera at two moments, and $O_1$ and $O_2$ are the camera centers corresponding to the two image planes. According to the pinhole camera model, $p_1$ and $p_2$ are the 2D pixel points onto which the spatial point P is mapped in the two image planes. The spatial point P corresponding to pixel $p_1$ lies on the ray $O_1 p_1$, and the projection of this ray onto the image plane $I_2$ is the epipolar line $L_2$; hence the projection of P onto $I_2$ must lie on $L_2$, and the point $p_2$ matching $p_1$ must lie on this epipolar line.
For a point $p_1$ in the image plane $I_1$ in Figure 4, the corresponding epipolar line $L_2$ in the image plane $I_2$ can be expressed as follows:
$$\begin{cases} L_2: \; ax + by + c = 0 \\ [a,\, b,\, c]^T = F p_1 \end{cases} \tag{1}$$
In the formula, F is the fundamental matrix.
The distance from the pixel to be verified to $L_2$ is calculated using the point-to-line distance; if the distance is within a certain threshold, the pixel is identified as the correct match for $p_1$. As shown in Figure 5, if it is not within the threshold, the matched point pair is rejected.
The fundamental matrix F is a mathematical representation of the correspondence between pairs of matching points and contains the camera's intrinsic and extrinsic parameters. Given the fundamental matrix and the pixel coordinates of an image feature point, we can calculate the epipolar line corresponding to its matching point in the other image.
Let the spatial position of point P be $P = [X, Y, Z]^T$. From the pinhole camera model, the pixel positions of $p_1$ and $p_2$ are given by Equation (2) as follows:
$$\begin{cases} s_1 p_1 = K P \\ s_2 p_2 = K (R P + t) \end{cases} \tag{2}$$
In the formula, K is the camera intrinsic matrix, R and t are the rotation and translation of the camera motion, and $s_1$ and $s_2$ are scale constants.
Equation (3) is derived from Equation (2) and written as follows:
$$p_2^T F p_1 = 0 \tag{3}$$
In the formula, F is the fundamental matrix, $F = K^{-T} t^{\wedge} R K^{-1}$, where $t^{\wedge}$ is the skew-symmetric matrix of the translation t; F therefore contains the translation and rotation information of the camera.
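For concreteness, a minimal numpy sketch of composing F from known calibration and motion is given below; K, R, and t are illustrative inputs, not values used in the paper.

```python
# Compose the fundamental matrix F = K^{-T} t^ R K^{-1} from K, R, t.
import numpy as np

def skew(t):
    """Skew-symmetric matrix t^ such that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def fundamental_from_motion(K, R, t):
    K_inv = np.linalg.inv(K)
    return K_inv.T @ skew(t) @ R @ K_inv
```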
Consider a pair of matching points whose homogeneous coordinates are $p_1 = (x_1, y_1, 1)^T$ and $p_2 = (x_2, y_2, 1)^T$. Equation (4) is obtained from Equation (3) and these coordinates:
$$\begin{bmatrix} x_2 & y_2 & 1 \end{bmatrix} \begin{bmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix} = 0 \tag{4}$$
Given m sets of matches $p_1^i = (x_1^i, y_1^i, 1)^T$ and $p_2^i = (x_2^i, y_2^i, 1)^T$, $i = 1, \dots, m$, Equation (5) can be obtained, where $f = (f_{11}, f_{12}, \dots, f_{33})^T$:
$$\begin{bmatrix} x_2^1 x_1^1 & x_2^1 y_1^1 & x_2^1 & y_2^1 x_1^1 & y_2^1 y_1^1 & y_2^1 & x_1^1 & y_1^1 & 1 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ x_2^m x_1^m & x_2^m y_1^m & x_2^m & y_2^m x_1^m & y_2^m y_1^m & y_2^m & x_1^m & y_1^m & 1 \end{bmatrix} f = 0 \tag{5}$$
From Equation (4), the fundamental matrix F has nine unknowns. Since F is defined only up to a scale factor, it has eight degrees of freedom, so eight matching point pairs provide eight linear equations from which F can be determined. This method is sensitive to incorrect matches and image noise [28], and solving for the fundamental matrix can therefore be viewed as a multi-objective optimization problem over the set of matching point pairs.
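The following numpy sketch shows the linear eight-point solution of Equation (5) (without the usual coordinate normalization) and the point-to-epipolar-line distance later formalized in Equation (8). An equivalent fundamental matrix could be obtained with OpenCV's cv2.findFundamentalMat; the code is illustrative only.

```python
# Eight-point estimate of F and the epipolar distance test.
import numpy as np

def eight_point_fundamental(p1, p2):
    """p1, p2: (m, 2) arrays of matched pixel coordinates, m >= 8."""
    x1, y1 = p1[:, 0], p1[:, 1]
    x2, y2 = p2[:, 0], p2[:, 1]
    # Each row is one instance of Equation (4), stacked as in Equation (5).
    A = np.stack([x2 * x1, x2 * y1, x2, y2 * x1, y2 * y1, y2, x1, y1,
                  np.ones_like(x1)], axis=1)
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)              # null-space vector, defined up to scale
    # Enforce rank 2 by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    return U @ np.diag(S) @ Vt

def epipolar_distance(F, p1, p2):
    """Distance from each p2 to the epipolar line F @ p1 (cf. Equation (8))."""
    p1_h = np.hstack([p1, np.ones((len(p1), 1))])
    a, b, c = F @ p1_h.T                  # each column holds [a, b, c] of one line
    return np.abs(a * p2[:, 0] + b * p2[:, 1] + c) / np.sqrt(a**2 + b**2 + 1e-12)
```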

4.2. Improved Mutation Strategy

One of the common issues with particle swarm optimization is the tendency to prematurely converge. As the number of iterations increases, the population diversity decreases, leading the particle swarm to easily get stuck in local optima. In contrast, the mutation strategy of the differential evolution algorithm [29] can introduce disturbances to particles during the search process, thus possessing strong global search capabilities. Therefore, integrating the mutation strategy can enhance the global convergence ability of the particle swarm optimization algorithm.
The traditional mutation strategy is to weight the difference between two random individuals in the population and add it to the third random individual to obtain the mutation individual, as shown in the following Equation (6):
$$\begin{cases} h_i = q_{r1} + f \\ f = W \left( q_{r2} - q_{r3} \right) \end{cases} \tag{6}$$
where $q_{r1}$, $q_{r2}$, and $q_{r3}$ are three particle individuals randomly selected from the population, with $r_1 \neq r_2 \neq r_3$; $h_i$ is the i-th mutant individual; W is the scaling factor; and f is the perturbation. From Equation (6), in the early stage of the search the random initial distribution of the particle swarm produces large discrepancies between particles, so the traditional mutation method applies a large perturbation to the particles. As the search proceeds, the particle swarm gradually converges, the differences between particles become smaller, and the perturbation shrinks accordingly.
The particle swarm optimization algorithm has the advantage of fast convergence in the early stage, but its drawback is that it easily gets stuck in local optima in the later stage. Directly applying traditional mutation strategies to the individual best values of the particle swarm can result in the individual best values being minimally disturbed in the late stage of convergence, making it difficult for the particle swarm to escape from local optima.
To address the aforementioned issue, this paper proposes a new mutation strategy, the core idea of which is to reduce disturbances to particles when the differences between particles are significant in the early convergence stage of the particle swarm, thereby accelerating the early convergence process. When the differences between particles are small in the later stages of convergence, the degree of disturbance to particles is increased to help the particle swarm escape from local optima. The new mutation strategy is shown in Equation (7), written as follows:
$$\begin{cases} h_i = q_{r1} + f(D) \\[4pt] f(D) = \dfrac{Dimension}{1 + \exp\!\left(\frac{1}{10 \times D}\right)} - \dfrac{Dimension}{2} \\[6pt] D = \dfrac{\left\| q_{r2} - q_{r3} \right\|}{\max\left( q_{r2} - q_{r3} \right) \cdot n} \end{cases} \tag{7}$$
In the formula, $Dimension$ is the search range of the particles, D is the difference degree of the particles, and n is the number of differences. Assuming a search range of $Dimension = 100$, the resulting perturbation f(D) is shown in Figure 6: the perturbation f(D) applied to the particles decreases as the degree of difference D between particles increases.
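Assuming the form of Equation (7) reconstructed above (the grouping of the exponential term and the normalization of D are our reading of the text), a rough numpy sketch of the mutation is given below; treat the constants as illustrative.

```python
# Mutation of Equation (7): small perturbation for large particle differences,
# a perturbation approaching -Dimension/2 for small differences.
import numpy as np

def perturbation(D, dimension):
    """f(D): magnitude shrinks as the difference degree D grows."""
    D = max(float(D), 1e-6)                 # guard the D -> 0 limit
    z = min(1.0 / (10.0 * D), 500.0)        # keep exp() in a safe range
    return dimension / (1.0 + np.exp(z)) - dimension / 2.0

def mutate(q_r1, q_r2, q_r3, dimension, n=3):
    """Mutant individual h_i = q_r1 + f(D) built from three random particles."""
    diff = np.abs(q_r2 - q_r3)
    D = diff.sum() / (diff.max() * n + 1e-12)   # normalized difference degree (assumed)
    return q_r1 + perturbation(D, dimension)
```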

4.3. Specific Steps for PE-Match

The previous sections have elaborated on the epipolar constraint principle and the improved mutation strategy. The specific steps of the feature matching method based on improved particle swarm and epipolar constraint (PE-match) are as follows:
Step 1: Obtain a set of coarse matching point pairs through the traditional ORB algorithm.
Step 2: Initialize the particle population. Firstly, set the population number; after that, each particle reads randomly from the coarse matching without repeating eight pairs of matching point pairs of pixel coordinates and set the particle velocity to zero.
Step 3: Design a fitness function based on the observation that, in a photovoltaic station scene, correct matches share the same constraint while incorrect matches do not, and calculate the fitness of each particle. The number of matching point pairs satisfying the epipolar constraint is used as the basis for evaluating the quality of the particle's fundamental matrix, i.e., for the particle's fitness assessment. The specific calculation steps are as follows:
(a)
The fundamental matrix F is calculated from the coordinates of the eight pairs of pixels in the particle.
(b)
The epipolar line is calculated from the fundamental matrix F and the pixel coordinates of the first point in the coarse matched point pair, and the distance d from this epipolar line is calculated using the second point in the matched point pair as in Equation (8). If the distance is less than a certain threshold, the matched pair is considered a correct match.
$$d = \frac{\left| ax + by + c \right|}{\sqrt{a^2 + b^2}} \tag{8}$$
(c)
Using the set of F and coarse matching point pairs, we complete step (b) with j random pairs and count the number of correct matches as the fitness value. Since the fitness calculation takes up a lot of time in the particle swarm algorithm, setting the number of coarse matching point pairs participating in fitness calculation as j ensures that the algorithm does not increase in time consumption as the number of coarse matching point pairs increases. Usually, j is set to 50 or 100.
Step 4: Record and update $Gbest_t$ and $Pbest_i$. $Gbest_t$ stores the position of the particle with the highest fitness found by the entire population, while $Pbest_i$ stores the position with the highest fitness found by the individual particle itself.
Step 5: Update the velocity and position of the particles. During the optimization process, the velocity and position of each particle are updated according to the search experience of the whole swarm and of the individual particle, as shown in Figure 7.
The velocity and position update rules are as shown in Equation (9).
$$\begin{cases} V_i(t+1) = \omega V_i(t) + c_1 r_4 \left( Pbest_i - X_i(t) \right) + c_2 r_5 \left( Gbest_t - X_i(t) \right) \\ X_i(t+1) = X_i(t) + V_i(t+1) \end{cases} \tag{9}$$
In the formula, $V_i(t)$ and $X_i(t)$ are the velocity and position of the i-th particle at iteration t; $r_4$ and $r_5$ are two random parameters uniformly distributed in [0, 1]; $c_1$ and $c_2$ are the learning weights for the individual particle extremum and the population extremum; and $\omega$ is the inertia weight.
Step 6: Perturb $Pbest_i$. In this paper, $Pbest_i$ is perturbed using the improved mutation and crossover operations to prevent the algorithm from getting stuck in local optima. The mutation operation is given by Equation (10).
$$\begin{cases} h_i = Pbest_i + \dfrac{Dimension}{1 + \exp\!\left(\frac{1}{10 \times D}\right)} - \dfrac{Dimension}{2} \\[6pt] D = \dfrac{\left\| Pbest_{r2} - Pbest_{r3} \right\|}{\max\left( Pbest_{r2} - Pbest_{r3} \right) \cdot n} \end{cases} \tag{10}$$
In the formula, $h_i$ is the mutated individual, i is the particle index, D is the degree of particle difference, $Dimension$ is the particle search range, n is the number of differences, and $r_2$ and $r_3$ are random indices with $r_2 \neq r_3$.
The crossover operation, expressed as Equation (11), is performed on the $Pbest_i$ of the particle population, where L is the crossover probability.
$$Pbest_i = \begin{cases} h_i, & rand \le L \\ Pbest_i, & rand > L \end{cases} \tag{11}$$
Step 7: Check whether $Gbest_t$ has changed significantly over multiple iterations; if it has not changed significantly, or if the maximum number of iterations is reached, output the fundamental matrix with the highest fitness. If the stopping condition is not met, return to Step 3 and continue iterating.
Step 8: The false matches in the coarse matching are eliminated based on the fundamental matrix output from Step 7 according to the principle of epipolar constraints.
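To make the procedure concrete, the following Python sketch strings Steps 1–8 together, reusing the eight_point_fundamental() and epipolar_distance() helpers sketched in Section 4.1 and the mutate() helper sketched in Section 4.2. Parameter defaults mirror Section 5.1.2 (population 20, ω = c1 = c2 = 0.5, L = 50%, j = 100, It_max = 8, inlier threshold 3); the maximum iteration count, the choice of Dimension, and all function names are assumptions for illustration, not the authors' implementation.

```python
# Hedged end-to-end sketch of PE-match: PSO over 8-pair particles with the
# improved pbest mutation, returning an epipolar-constraint inlier mask.
import numpy as np

def pe_match(p1, p2, pop=20, w=0.5, c1=0.5, c2=0.5, L=0.5, j=100,
             it_max=8, max_iter=100, thr=3.0, rng=np.random.default_rng(0)):
    """p1, p2: (m, 2) arrays of coarse matches; returns a boolean inlier mask."""
    m = len(p1)
    # Step 2: each particle stores the pixel coordinates of 8 random match pairs.
    idx = np.stack([rng.choice(m, 8, replace=False) for _ in range(pop)])
    X = np.concatenate([p1[idx].reshape(pop, -1), p2[idx].reshape(pop, -1)], axis=1)
    V = np.zeros_like(X)

    def fitness(x):
        # Step 3: count how many of j random coarse matches obey the epipolar constraint.
        F = eight_point_fundamental(x[:16].reshape(8, 2), x[16:].reshape(8, 2))
        sample = rng.choice(m, size=min(j, m), replace=False)
        return int(np.sum(epipolar_distance(F, p1[sample], p2[sample]) < thr))

    pbest = X.copy()
    pbest_fit = np.array([fitness(x) for x in X])
    gbest = pbest[np.argmax(pbest_fit)].copy()
    best_fit, stall = pbest_fit.max(), 0
    for _ in range(max_iter):
        # Step 5: velocity and position update (Equation (9)).
        r4, r5 = rng.random((2,) + X.shape)
        V = w * V + c1 * r4 * (pbest - X) + c2 * r5 * (gbest - X)
        X = X + V
        # Step 6: mutation and crossover perturbation of pbest (Equations (10)-(11)).
        for i in range(pop):
            r2, r3 = rng.choice(pop, 2, replace=False)
            if rng.random() <= L:
                # Dimension = 100 as in Figure 6 (an assumed search range).
                pbest[i] = mutate(pbest[i], pbest[r2], pbest[r3], dimension=100.0)
                pbest_fit[i] = fitness(pbest[i])
        # Step 4: refresh pbest and gbest with the new positions.
        fit = np.array([fitness(x) for x in X])
        better = fit > pbest_fit
        pbest[better], pbest_fit[better] = X[better], fit[better]
        gbest = pbest[np.argmax(pbest_fit)].copy()
        # Step 7: stop if gbest has not improved for it_max iterations.
        stall = stall + 1 if pbest_fit.max() <= best_fit else 0
        best_fit = max(best_fit, pbest_fit.max())
        if stall >= it_max:
            break
    # Step 8: remove mismatches with the best fundamental matrix.
    F = eight_point_fundamental(gbest[:16].reshape(8, 2), gbest[16:].reshape(8, 2))
    return epipolar_distance(F, p1, p2) < thr
```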

5. Results

5.1. Feature Matching Experiment

5.1.1. Description of the Data Set

For the feature matching experiment, the OxfordVGG [30] dataset was used, from which two sets of scene images with high feature similarity, simple structure, and other non-structural factors that are also present in photovoltaic power plants were selected, as shown in Figure 8. Scene 1 features a gradually blurry tree, while Scene 2 consists of a wall with changing perspectives, each set containing four images. The dataset provides an essential matrix for validating the matching results.

5.1.2. Experimental Environment Configuration and Parameter Setting

The computer hardware specifications used in this experiment are as follows: AMD Ryzen 7 5800H processor (USA), 16 GB of memory (USA), a 64-bit Windows 11 operating system, and Visual Studio 2017 as the programming software. The third-party library OpenCV 4.1.0 was used for ORB feature extraction and description.
To validate the advantage of the algorithm proposed in this paper (ORB+PE-match), its matching performance is compared with the traditional ORB algorithm, the ORB+HAMMING algorithm [31], and the ORB+RANSAC algorithm [32] in the actual environment of a photovoltaic power plant. The traditional ORB algorithm provides the same set of coarse matching point pairs for the other three algorithms. The ORB+HAMMING algorithm uses twice the minimum Hamming distance as the threshold to eliminate mismatches. ORB+RANSAC employs the random sample consensus algorithm to compute the fundamental matrix, with an inlier distance threshold set to three; after obtaining the fundamental matrix, mismatches are eliminated using the epipolar constraint with the same threshold of three. The inlier distance threshold is the pixel distance from a matching point to the epipolar line. Because images are often noisy, setting the inlier distance threshold too small can filter out correct matches, while setting it too large may leave incorrect matches. Based on multiple experiments and commonly used values from past research, the experiment adopts an approximately optimal value of three as the inlier distance threshold.
Random sampling is thus contrasted with the particle swarm optimization-based sampling proposed in this paper. To ensure a fair performance comparison, the inlier distance threshold of PE-match is also set to three. Based on multiple experimental results, near-optimal parameter settings for the particle swarm are determined: the population size is set to 20, which influences the search capability of the particle swarm algorithm, although an excessively large population reduces the algorithm's real-time performance. The number of differences n is set to three, which influences the mutation strategy's sensitivity to differences among particles. The crossover probability L is set to 50%, determining the influence of the mutation strategy on $Pbest$. The learning weights for the individual and global extrema, $c_1$ and $c_2$, are both set to 0.5, and the inertia weight $\omega$ is also set to 0.5. The number of matching point pairs j involved in the fitness calculation is set to 100; this value allows the particle swarm algorithm to maintain real-time performance as the number of feature points increases, although setting it too low diminishes the optimization capability. The iteration ends and the optimal fundamental matrix is output if there is no significant change in $Gbest$ over 8 ($It_{\max}$) iterations, a measure included to keep the algorithm's runtime stable. Taking the 1–4 matching experiment of Scene 1 from the OxfordVGG dataset as an example, Figure 9 illustrates the influence of the key parameters on the algorithm's accuracy and processing time. For different matching scenarios, minor adjustments are made to these parameters to adapt to varying conditions. Because the outcome of the particle swarm optimization depends on the number of iterations and the initial distribution of particles, the experimental data reported for this algorithm are averages over multiple trials.

5.1.3. Experimental Results and Analysis

Figure 10 displays the fitness changes with the number of iterations t in the 1–3 multiple matching experiments of Scene 1 in the OxfordVGG dataset using the traditional particle swarm algorithm and the improved particle swarm algorithm after lifting the constraint on the number of coarse matches j involved in the computation. It can be observed that the traditional particle swarm algorithm is prone to falling into local optima, while the improved particle swarm algorithm inherits the fast convergence advantage of the traditional particle swarm algorithm in the early iterations and shows better capability to escape local optima in the later iterations.
The accuracy of each algorithm is shown in Figure 11, where accuracy refers to the ratio of correctly matched pairs to all matched pairs in the matching results of each algorithm. Image feature matching is performed using the dataset images, with the need to match three image pairs in each scenario, namely 1-2, 1-3, and 1-4, with images becoming more dissimilar as their numbers increase. The matching accuracy of the proposed algorithm is on average 19.5% higher than that of the ORB algorithm, 14.0% higher than that of the ORB+HAMMING algorithm, and 6.0% higher than that of the ORB+RANSAC algorithm.
Figure 12 presents the recall rate results of the three algorithms in the six matching experiments for the two scenarios. The recall rate is defined as the ratio of the number of correctly matched point pairs retained after mismatch removal to the number of correctly matched point pairs in the coarse matches (traditional ORB) before mismatch removal. From the graph, it can be observed that the ORB+HAMMING algorithm eliminates a large number of correctly matched point pairs from the coarse matches, resulting in sparse matching results. In contrast, both our algorithm and the ORB+RANSAC algorithm retain correctly matched point pairs from the coarse matches well, with our algorithm showing a significantly higher recall rate than ORB+RANSAC. These results suggest that our algorithm can effectively preserve correctly matched point pairs while eliminating mismatches, thus improving the quality of matching.
The time consumption of each algorithm is shown in Table 1, which records the average time consumption results of multiple experiments in four scenarios for each algorithm. The ORB algorithm demonstrates excellent speed, with our algorithm taking an average of 2.39 times longer than the ORB algorithm.

5.2. SLAM Trajectory Experiment

5.2.1. Description of the Data Set

To validate the performance metrics of the improved algorithm, experiments were conducted using the widely used and authoritative TUM dataset [33] in the field of visual SLAM research. This dataset is provided by the Technical University of Munich in Germany and consists of a large number of data sequences, including RGB-D data captured by Microsoft Kinect sensors (USA) and ground truth pose data obtained from motion capture systems.
In line with the unstructured characteristics studied in photovoltaic power plant scenes, namely high feature similarity, weak texture, and simple structure, the algorithm was validated using the sequences named structure_notexture_near, nostructure_texture_far, and structure_texture_far. The structure_notexture scenario refers to scenes with 3D contours but no texture features; the nostructure_texture scenario refers to scenes with texture features but no 3D contours; and the structure_texture scenario refers to scenes with both 3D contours and texture features. These three sequences allow the performance of PE-SLAM and ORB-SLAM2 to be evaluated in unstructured scenes. The three sequences are shown in Figure 13.

5.2.2. Experimental Environment Configuration and Parameter Setting

The hardware configuration for the experiment is as follows: AMD Ryzen 7 5800H processor (USA), NVIDIA GeForce 3060 6 GB GPU, 16 GB RAM, and a 64-bit Ubuntu 18.04 operating system. The experimental parameters are set according to the TUM1.yaml file of ORB-SLAM2. The PE-match parameters are mostly consistent with the settings in Section 5.1.2, with the number of coarse matching point pairs involved in the fitness calculation (j) adjusted to 50. The iteration stops and the optimal fundamental matrix is output when there is no significant change in $Gbest_t$ within 6 iterations. These parameter adjustments are made to reduce the time consumed by PE-match in the PE-SLAM system and free up time for backend optimization; that is, PE-SLAM aims to improve the accuracy of the motion model while minimizing any negative impact on the effectiveness of the backend optimization.
The TUM dataset provides real camera motion trajectory information, which is used to evaluate the trajectory information obtained by algorithms against the ground truth trajectory. The evaluation metrics include absolute trajectory error (ATE) and relative pose error (RPE). ATE represents the direct difference between estimated and ground truth poses, reflecting the trajectory consistency at a global level. RPE measures the local accuracy of the trajectory within fixed time intervals, corresponding to trajectory drift.
The root mean square error (RMSE) is used to report the ATE, with $\mathrm{trans}(F_i)$ denoting the translational part, written as follows:
$$RMSE(F_{1:n}) := \left( \frac{1}{n} \sum_{i=1}^{n} \left\| \mathrm{trans}(F_i) \right\|^2 \right)^{\frac{1}{2}}$$
$$F_i := Q_i^{-1} S P_i$$
In the formula, $Q_i$ represents the estimated pose of the i-th frame, $P_i$ represents the ground truth pose of the i-th frame, $F_i$ represents the absolute trajectory error of the i-th frame, and S is the scale conversion matrix.
Given a fixed frame interval $\Delta$ and a total of n image frames, $m = n - \Delta$ relative pose errors can be calculated, with the RMSE defined as follows:
$$RMSE(E_{1:n}, \Delta) := \left( \frac{1}{m} \sum_{i=1}^{m} \left\| \mathrm{trans}(E_i) \right\|^2 \right)^{\frac{1}{2}}$$
$$E_i := \left( Q_i^{-1} Q_{i+\Delta} \right)^{-1} \left( P_i^{-1} P_{i+\Delta} \right)$$
In the formula, $Q_i$ represents the estimated pose of the i-th frame, $P_i$ represents the ground truth pose of the i-th frame, and $E_i$ represents the relative pose error of the i-th frame.
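As an illustration, the following numpy sketch evaluates the two metrics for trajectories given as aligned 4 × 4 homogeneous pose matrices; the scale/alignment transform S is assumed to have been applied already, and the function names are ours.

```python
# RMSE of ATE and RPE following the definitions above.
import numpy as np

def ate_rmse(est, gt):
    """est, gt: lists of 4x4 poses; RMSE of the translational part of F_i = Q_i^-1 P_i."""
    err = [np.linalg.inv(q) @ p for q, p in zip(est, gt)]
    trans = np.array([e[:3, 3] for e in err])
    return np.sqrt(np.mean(np.sum(trans**2, axis=1)))

def rpe_rmse(est, gt, delta=1):
    """RMSE of the translational part of E_i over a fixed frame interval delta."""
    errs = []
    for i in range(len(est) - delta):
        dq = np.linalg.inv(est[i]) @ est[i + delta]   # Q_i^-1 Q_{i+delta}
        dp = np.linalg.inv(gt[i]) @ gt[i + delta]     # P_i^-1 P_{i+delta}
        errs.append((np.linalg.inv(dq) @ dp)[:3, 3])
    trans = np.array(errs)
    return np.sqrt(np.mean(np.sum(trans**2, axis=1)))
```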
Because the various modules of ORB-SLAM2 and PE-SLAM, such as feature detection, tracking, backend optimization, and mapping, run as independent threads, and the tracking thread is the most time-consuming among them, the speed of the tracking thread determines the processing speed of the system. In addition, since backend optimization takes place between the completion of tracking and the arrival of the next frame, its effectiveness depends not only on the tracking thread's accuracy but also on the tracking time. Moreover, PE-SLAM modifies the motion model of ORB-SLAM2. Considering these factors, the tracking time is used to comprehensively assess the real-time performance of the algorithms.

5.2.3. Experimental Results and Analysis

Experimental verification of PE-SLAM and ORB-SLAM2 is conducted, with absolute trajectory errors shown in Table 2 and relative reduction rates shown in Table 3.
It can be seen that the motion model improvement based on PE-match has brought about an enhancement in system accuracy. In the case of nostructure_texture, where only texture is present without contours, PE-SLAM achieves a 44.0% reduction in error, indicating that the improved algorithm provides better system accuracy in situations where textures are similar and lack contour structures.
Table 4 shows the relative pose errors, while Table 5 presents the relative reduction rate of pose errors. It can be observed that the reduction effect of relative pose errors is consistent with that of absolute trajectory errors. In ORB-SLAM2, the strategy for backend optimization involves local trajectory optimization after tracking completion and before the arrival of the next frame. The tracking time needs to be minimized to ensure the effectiveness of backend trajectory optimization. The fact that the absolute trajectory errors are smaller than the relative pose errors in PE-SLAM indicates that the improved algorithm’s tracking time meets the real-time requirements for system backend optimization. The average tracking time for each sequence is detailed in Table 6.
As shown in Figure 14, Figure 15 and Figure 16, six trajectory error plots are generated based on the ground truth provided by the TUM dataset and the camera trajectory (CameraTrajectory) obtained from the SLAM algorithms. In the trajectory error plots, the actual trajectory is represented by a black line, while the estimated trajectory is depicted by a blue line. Short red lines indicate the distance between the predicted position and the actual position of the camera at the same time point; their length visually reflects the degree of error.
As shown in Figure 17, Figure 18 and Figure 19, six dense point cloud maps are generated based on the three image sequences mentioned above: "structure_notexture", "nostructure_texture", and "structure_texture". In these maps, ORB-SLAM2 denotes the dense point cloud map built after integrating a dense mapping thread into the original algorithm, and PE-SLAM denotes the dense point cloud map built from the improved tracking thread. The mapping results for each sequence are shown in the following figures.

5.3. Simulation Experiments of Photovoltaic Scenarios

5.3.1. Establishment of the Simulation Environment Based on Gazebo

Gazebo uses simple text formats (such as URDF, SDF, etc.) to describe and create simulation models, which include attributes such as geometric shapes, mass, inertia, kinematics, and dynamics. As shown in Figure 20, a robot model is built on Gazebo 9, with the blue block on top representing the camera appearance model. The Kinect camera simulation code provided by Gazebo is loaded onto the camera model, thereby realizing the robot’s perception part. The robot is driven using a two-wheel drive, and the drive wheels are controlled using the differential drive model provided by Gazebo.
The simulation of the photovoltaic scene is shown in Figure 21. It consists of a wall, photovoltaic modules, land, and shrub vegetation. To simulate the unstructured factors of a real photovoltaic power station, such as high feature similarity, weak textures, and simple structures, the wall is made of gray bricks with high feature similarity. The ground is made of the most common land found in photovoltaic power stations. A small number of green plants are represented by small, common shrubs and trees. These elements collectively create a localized simulation scene of a photovoltaic power station. The image sequence of this scene is named Si.

5.3.2. Simulation Experiment Results and Analysis

The evaluation metrics for the simulation trajectory experiment are the same as those for the dataset experiment. By conducting simulation experiments on the aforementioned Si, PE-SLAM and ORB-SLAM2 were validated. The absolute trajectory error validation values are shown in Table 7, and the relative reduction rates are shown in Table 8.
As shown in Table 7 and Table 8, the trajectory errors of both ORB-SLAM2 and PE-SLAM in the Gazebo photovoltaic power station simulation scene are at the centimeter level, indicating that both algorithms achieve good localization accuracy in this scenario. Additionally, PE-SLAM demonstrates better system accuracy, reducing the absolute trajectory error by 11.5% compared to ORB-SLAM2.
Table 9 and Table 10 present the results of the relative pose error in the photovoltaic scene simulation. The fact that the relative trajectory error is larger than the absolute trajectory error indicates that the back-end optimization of both algorithms was not affected by the tracking thread’s processing time, as both performed local and global trajectory optimization. The average tracking time for each sequence is shown in Table 11.
As shown in Figure 22, the trajectory error plots are drawn based on the ground truth of the robot's actual trajectory provided by the Gazebo simulation system and the camera trajectory (CameraTrajectory) derived from the algorithms. In the trajectory error plots, the ground truth trajectory is represented by black lines, while the estimated trajectory is shown with blue lines. Short red lines indicate the distance between the predicted and actual positions of the camera at the same time point, and their length visually reflects the degree of error.
As shown in Figure 23, two dense point cloud maps are generated from the Si image sequence mentioned earlier. ORB-SLAM2 denotes the dense point cloud map created by adding a dense mapping thread to the original algorithm, while PE-SLAM denotes the dense point cloud map built using the poses obtained from the improved tracking thread. The mapping results for each sequence are shown in the following figures.
Because the photovoltaic power station simulation scene has relatively uniform texture and structure, which accentuates the unstructured factors, the mapping performance can be expected to be better in a real photovoltaic scene.

5.4. Experiments on Building Maps in Photovoltaic Scenarios

5.4.1. Experimental Scenario Composition

The mapping experiments used two image sequences captured by an RGB-D camera. One of the scenes is indoors, where controllable lighting conditions, such as changes in brightness and backlighting, are possible, as shown in Figure 24. Additionally, compared to outdoor scenes, indoor environments typically generate less noise with RGB-D cameras, leading to more accurate depth estimations. The outdoor scene, shown in Figure 25, uses a real photovoltaic power station scene. The figure illustrates that in a real photovoltaic power station, there are complex textures and structures on buildings and the ground, and the photovoltaic panels have stains with varying textures. These factors reduce the scene’s unstructured characteristics. The indoor sequence is named In, and the outdoor sequence is named Real.

5.4.2. Mapping Results and Analyses

As shown in Figure 26 and Figure 27, four dense point cloud maps are generated based on the two image sequences mentioned above, "In" and "Real". In these maps, ORB-SLAM2 denotes the dense point cloud map built after integrating a dense mapping thread into the original algorithm, and PE-SLAM denotes the dense point cloud map built from the improved tracking thread. The mapping results for each sequence are shown in the following figures.
The map established by PE-SLAM reduces the lateral and angular point cloud overlaps aligned with the camera motion direction compared to ORB-SLAM2. This reflects that the camera poses obtained by the improved algorithm are more accurate, allowing for better tracking performance in this scene. The established map shows a small amount of longitudinal overlaps inconsistent with the camera motion direction, with outdoor scenes exhibiting significantly more longitudinal overlaps than indoor scenes. This is mainly due to deviations in the camera’s depth estimation, which become more pronounced under strong outdoor lighting conditions. As shown in Figure 19, the mapping area of PE-SLAM is larger than that of ORB-SLAM2, which is due to tracking failures midway with ORB-SLAM2, indicating that PE-SLAM has a better robustness.

6. Discussion

In the photovoltaic power station scenario, the feature similarity makes the ORB algorithm prone to mismatches during the feature matching stage. Extensive experiments have shown that these mismatches are random and lack consistent constraints, while correct matches share the same constraint conditions. Based on the characteristics of mismatches generated by ORB in the photovoltaic station scenario, an improved particle swarm optimization (PSO) algorithm is used to identify the constraint conditions that comply with the majority of matched point pairs. This helps in determining the correct matching constraints and subsequently eliminating mismatches. The epipolar constraint, which includes both intrinsic and extrinsic camera parameters, is suitable for both near and far points and has an analytical solution that meets the real-time requirements of SLAM algorithms. Therefore, this paper chooses the epipolar constraint as the constraint condition. Since the PSO algorithm is prone to local optima, a new mutation strategy is designed to perturb the Pbest of the particle swarm, enhancing the global search capability of the algorithm in the later stages. This improves the PSO algorithm’s ability to address the sensitivity of epipolar constraints to mismatches. Based on the above theory, the paper designs PE-match to improve the motion model of ORB-SLAM2, thereby reducing the number of mismatches in photovoltaic power station scenarios and improving SLAM’s localization accuracy and mapping performance.
Table 12 presents the characteristics and main experimental results of PE-SLAM in the six experimental scenarios mentioned earlier. The TUM dataset is more authoritative, and based on the experimental results, PE-SLAM has achieved better localization accuracy and mapping performance compared to ORB-SLAM2. In the structure_notexture and structure_texture scenarios, the ATE (RMSE) is at the centimeter level, while in the nostructure_texture scenario, the ATE (RMSE) is at the decimeter level, with the most significant improvement in this scenario, achieving an ATE (RMSE) reduction rate of 44.0%. In the simulation experiments, the ATE (RMSE) is only 0.023, indicating that PE-SLAM is suitable for the photovoltaic power station simulation scenarios in terms of localization accuracy. However, the mapping performance is moderate, which we believe is due to the high degree of texture similarity in the simulation scenarios. In real photovoltaic power station scenarios, the presence of background buildings, ground, and stained photovoltaic panels weakens the unstructured scene factors. Theoretically, this should result in better mapping performance, which is confirmed by the mapping experiments in the In scenario. The mapping performance in the Real scenario is noticeably lower than in the In scenario, which is due to the significant influence of outdoor lighting on the RGB-D camera in outdoor environments. Overall, PE-SLAM has surpassed ORB-SLAM2 in both localization accuracy and mapping performance in the unstructured environments of photovoltaic power stations.

7. Conclusions

This study proposes a PE-SLAM algorithm based on ORB-SLAM2, which combines an improved particle swarm optimization algorithm with epipolar constraints to improve the accuracy and robustness of the motion model in a photovoltaic power station environment. Experimental results show that in unstructured scenes, ORB+PE-match has an average improvement in matching accuracy of 19.5%, 14.0%, and 6.0% compared to the traditional ORB algorithm, ORB+HAMMING algorithm, and ORB+RANSAC algorithm, with better recall rates. In trajectory experiments using the TUM dataset, PE-SLAM reduces the average absolute trajectory error and relative pose error by 29.1% and 27.0%, respectively, compared to the ORB-SLAM2 algorithm. Mapping experiment results demonstrate the successful application of PE-SLAM in a photovoltaic power station scenario, where the dense point cloud map accurately and completely reflects the scene, effectively overcoming challenges posed by unstructured factors. Therefore, the proposed PE-SLAM algorithm offers an effective solution to address the challenges that ORB-SLAM faces in a photovoltaic power station environment.
PE-SLAM can be applied to cleaning or inspection robots in photovoltaic power station scenarios. Without external information support (such as GPS), it provides the robot with self-localization and 3D point cloud information about the environment using only the robot’s own vision sensors. The mapping experiments in photovoltaic power station scenarios show that camera noise has a certain impact on the mapping results. For practical application of the algorithm, we believe that using RGB-D cameras with better anti-glare capabilities can improve the performance of PE-SLAM. From the parameter selection experiments, it can be seen that some parameters significantly affect PE-SLAM’s performance. Therefore, optimal parameter selection is necessary for different scenarios. Future research could consider incorporating adaptive parameter algorithms to make PE-SLAM simpler and more effective in various unstructured environments. Additionally, PE-SLAM currently cannot modify the already-generated 3D point cloud based on back-end optimization results. If future research can address this, we believe the mapping performance will improve.

Author Contributions

Conceptualization, C.L. and Z.S.; methodology, C.L. and Z.S.; software, Z.S.; validation, C.L., Z.S., J.W. and K.Y.; formal analysis, Z.S. and K.Y.; investigation, W.N. and Z.S.; resources, W.N.; data curation, Z.S.; writing—original draft preparation, C.L. and Z.S.; writing—review and editing, C.L.; visualization, Z.S.; supervision, C.L.; project administration, C.L.; funding acquisition, C.L. and W.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 52265065.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Østergaard, P.A.; Duic, N.; Noorollahi, Y.; Mikulcic, H.; Kalogirou, S. Sustainable development using renewable energy technology. Renew. Energy 2020, 146, 2430–2437. [Google Scholar] [CrossRef]
  2. Al-Housani, M.; Bicer, Y.; Koç, M. Experimental investigations on PV cleaning of large-scale solar power plants in desert climates: Comparison of cleaning techniques for drone retrofitting. Energy Convers. Manag. 2019, 185, 800–815. [Google Scholar] [CrossRef]
  3. Olorunfemi, B.O.; Ogbolumani, O.A.; Nwulu, N. Solar panels dirt monitoring and cleaning for performance improvement: A systematic review on smart systems. Sustainability 2022, 14, 10920. [Google Scholar] [CrossRef]
  4. Høiaas, I.; Grujic, K.; Imenes, A.G.; Burud, I.; Olsen, E.; Belbachir, N. Inspection and condition monitoring of large-scale photovoltaic power plants: A review of imaging technologies. Renew. Sustain. Energy Rev. 2022, 161, 112353. [Google Scholar] [CrossRef]
  5. Bresson, G.; Alsayed, Z.; Yu, L.; Glaser, S. Simultaneous localization and mapping: A survey of current trends in autonomous driving. IEEE Trans. Intell. Veh. 2017, 2, 194–220. [Google Scholar] [CrossRef]
  6. Gupta, A.; Fernando, X. Simultaneous localization and mapping (slam) and data fusion in unmanned aerial vehicles: Recent advances and challenges. Drones 2022, 6, 85. [Google Scholar] [CrossRef]
  7. Mur-Artal, R.; Tardós, J.D. Orb-slam2: An open-source slam system for monocular, stereo, and rgb-d cameras. IEEE Trans. Robot. 2017, 33, 1255–1262. [Google Scholar] [CrossRef]
  8. Cadena, C.; Carlone, L.; Carrillo, H.; Latif, Y.; Scaramuzza, D.; Neira, J.; Reid, I.; Leonard, J.J. Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age. IEEE Trans. Robot. 2016, 32, 1309–1332. [Google Scholar] [CrossRef]
  9. Rublee, E.; Rabaud, V.; Konolige, K.; Bradski, G. ORB: An efficient alternative to SIFT or SURF. In Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2564–2571. [Google Scholar]
  10. Ma, C.; Hu, X.; Xiao, J.; Zhang, G. Homogenized ORB algorithm using dynamic threshold and improved Quadtree. Math. Probl. Eng. 2021, 2021, 6693627. [Google Scholar] [CrossRef]
  11. Xie, Y.; Wang, Q.; Chang, Y.; Zhang, X. Fast Target Recognition Based on Improved ORB Feature. Appl. Sci. 2022, 12, 786. [Google Scholar] [CrossRef]
  12. Chen, Q.; Yao, L.; Xu, L.; Yang, Y.; Xu, T.; Yang, Y.; Liu, Y. Horticultural Image Feature Matching Algorithm Based on Improved ORB and LK Optical Flow. Remote Sens. 2022, 14, 4465. [Google Scholar] [CrossRef]
  13. Chu, G.; Peng, Y.; Luo, X. ALGD-ORB: An improved image feature extraction algorithm with adaptive threshold and local gray difference. PLoS ONE 2023, 18, e0293111. [Google Scholar] [CrossRef] [PubMed]
  14. Muja, M.; Lowe, D.G. Fast approximate nearest neighbors with automatic algorithm configuration. In Proceedings of the Fourth International Conference on Computer Vision Theory and Applications, Lisboa, Portugal, 5–8 February 2009; Volume 1. [Google Scholar]
  15. Muja, M.; Lowe, D.G. Scalable nearest neighbor algorithms for high dimensional data. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 36, 2227–2240. [Google Scholar] [CrossRef] [PubMed]
  16. Bian, J.W.; Lin, W.Y.; Matsushita, Y.; Yeung, S.-K.; Nguyen, T.-D.; Cheng, M.-M. Gms: Grid-based motion statistics for fast, ultra-robust feature correspondence. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4181–4190. [Google Scholar]
  17. Vinay, A.; Rao, A.S.; Shekhar, V.S.; Kumar, A.C.; Balasubramanya Murthy, K.N.; Natarajan, S. Feature extraction using ORB-RANSAC for face recognition. Procedia Comput. Sci. 2015, 70, 174–184.
  18. Li, X.; Liu, Y.; Li, D.; Yang, L.; Yang, X.; Wang, Y. Spherical Image Stitching Based on ORB and PROSAC Algorithm. In Proceedings of the 3rd International Conference on Intelligent Information Processing, New York, NY, USA, 19–20 May 2018; pp. 160–165.
  19. Pang, Y.; Li, A. An improved ORB feature point image matching method based on PSO. In Proceedings of the Tenth International Conference on Graphics and Image Processing (ICGIP 2018), Chengdu, China, 12–14 December 2018; Volume 11069, pp. 224–232.
  20. Wu, Y.; Miao, Q.; Ma, W.; Gong, M.; Wang, S. PSOSAC: Particle swarm optimization sample consensus algorithm for remote sensing image registration. IEEE Geosci. Remote Sens. Lett. 2017, 15, 242–246.
  21. Taranco, R.; Arnau, J.M.; González, A. LOCATOR: Low-power ORB accelerator for autonomous cars. J. Parallel Distrib. Comput. 2023, 174, 32–45.
  22. Dai, Y.; Wu, J. An Improved ORB Feature Extraction Algorithm Based on Enhanced Image and Truncated Adaptive Threshold. IEEE Access 2023, 11, 32073–32081.
  23. Song, S.; Ai, L.; Tang, P.; Mao, Z.; Gu, Y.; Chai, Y. AAM-ORB: Affine attention module on ORB for conditioned feature matching. Signal Image Video Process. 2023, 17, 2351–2358.
  24. Gao, Z.; Lv, M.; Zhang, J. Research on target recognition method of robotic arm based on improved ORB. In Proceedings of the 2024 36th Chinese Control and Decision Conference (CCDC), Xi’an, China, 25–27 May 2024; pp. 5596–5600.
  25. Huang, B.C.; Zhang, Y.J. A High-Efficiency FPGA-Based ORB Feature Matching System. J. Circuits Syst. Comput. 2024, 33, 2450028.
  26. Marini, F.; Walczak, B. Particle swarm optimization (PSO). A tutorial. Chemom. Intell. Lab. Syst. 2015, 149, 153–165.
  27. Chai, X.; Zhou, F.; Chen, X. Epipolar constraint of single-camera mirror binocular stereo vision systems. Opt. Eng. 2017, 56, 084103.
  28. Luong, Q.T.; Faugeras, O.D. The fundamental matrix: Theory, algorithms, and stability analysis. Int. J. Comput. Vis. 1996, 17, 43–75.
  29. Qin, A.K.; Huang, V.L.; Suganthan, P.N. Differential evolution algorithm with strategy adaptation for global numerical optimization. IEEE Trans. Evol. Comput. 2008, 13, 398–417.
  30. Mikolajczyk, K.; Schmid, C. A performance evaluation of local descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 1615–1630.
  31. Fanqing, M.; Fucheng, Y. A tracking algorithm based on ORB. In Proceedings of the 2013 International Conference on Mechatronic Sciences, Electric Engineering and Computer (MEC), Shenyang, China, 20–22 December 2013; pp. 1187–1190.
  32. Zhang, H.; Zheng, G.; Fu, H. Research on image feature point matching based on ORB and RANSAC algorithm. J. Phys. Conf. Ser. 2020, 1651, 012187.
  33. Li, Y.; Brasch, N.; Wang, Y.; Navab, N.; Tombari, F. Structure-slam: Low-drift monocular slam in indoor environments. IEEE Robot. Autom. Lett. 2020, 5, 6583–6590.
Figure 1. Photovoltaic power plant scene images.
Figure 2. PE-SLAM system framework diagram.
Figure 3. Schematic diagram of motion model.
Figure 4. Schematic diagram of epipolar constraint.
Figure 5. Schematic diagram of matching point pairs for epipolar constraint verification.
Figure 6. Improved mutation strategy for particle perturbation.
Figure 7. Particle update schematic.
Figure 8. Optical images of OxfordVGG dataset.
Figure 9. Influence of important parameters on accuracy and time consumption.
Figure 10. Line chart of fitness change.
Figure 11. Matching accuracy based on the dataset.
Figure 12. The recall rate of matching results based on the dataset.
Figure 13. Unstructured scenarios for datasets.
Figure 14. Trajectory error plot of the structure_notexture sequence.
Figure 15. Trajectory error plot of the nostructure_texture sequence.
Figure 16. Trajectory error plot of the structure_texture sequence.
Figure 17. The 3D map generated from the structure_notexture sequence.
Figure 18. The 3D map generated from the nostructure_texture sequence.
Figure 19. The 3D map generated from the structure_texture sequence.
Figure 20. Robot model based on Gazebo.
Figure 21. Photovoltaic power station simulation scene based on Gazebo.
Figure 22. Trajectory error plot of the Si sequence.
Figure 23. The 3D map generated from the Si sequence.
Figure 24. Partial images of indoor scenes.
Figure 25. Partial images of outdoor scenes.
Figure 26. The 3D map generated from the In sequence.
Figure 27. The 3D map generated from the Real sequence.
Table 1. The time taken for each algorithm to run (unit: ms).
Scene      ORB    ORB+HAMMING    ORB+RANSAC    ORB+PE-Match
Scene 1    62     62             63            151
Scene 2    55     55             56            132
Average    59     59             60            142
Table 2. Absolute trajectory errors (ATE, m) of PE-SLAM and ORB-SLAM2.
Experimental Sequences     ORB-SLAM2                       PE-SLAM
                           RMSE     MEAN     S.D.           RMSE     MEAN     S.D.
structure_notexture        0.030    0.026    0.014          0.026    0.022    0.013
nostructure_texture        0.125    0.101    0.074          0.070    0.063    0.030
structure_texture          0.020    0.019    0.007          0.014    0.013    0.005
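For readers who wish to reproduce the RMSE, MEAN, and S.D. columns of Table 2, these are the standard absolute-trajectory-error statistics over per-frame translational errors. The snippet below is a minimal sketch, not the exact evaluation script used in the experiments; it assumes the estimated and ground-truth trajectories have already been time-associated and aligned (e.g., with the TUM RGB-D benchmark tools), and the function name and array shapes are illustrative.

```python
import numpy as np

def ate_statistics(gt_xyz: np.ndarray, est_xyz: np.ndarray):
    """ATE statistics from matched (N, 3) translation arrays (illustrative sketch)."""
    errors = np.linalg.norm(gt_xyz - est_xyz, axis=1)  # per-frame translational error (m)
    rmse = float(np.sqrt(np.mean(errors ** 2)))        # RMSE column
    mean = float(np.mean(errors))                      # MEAN column
    sd = float(np.std(errors))                         # S.D. column
    return rmse, mean, sd
```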
Table 3. Relative reduction in absolute trajectory error (%) of PE-SLAM compared to ORB-SLAM2.
Experimental Sequences     RMSE      MEAN      S.D.
structure_notexture        13.3%     15.4%     7.1%
nostructure_texture        44.0%     37.6%     59.5%
structure_texture          30.0%     31.6%     28.6%
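The percentages in Table 3 (and likewise in Tables 5, 8, and 10) follow directly from the corresponding error tables as a relative reduction, (ORB-SLAM2 − PE-SLAM) / ORB-SLAM2 × 100%. A short check against the RMSE column of Table 2, using only values quoted in this paper:

```python
def relative_reduction(orb_slam2: float, pe_slam: float) -> float:
    """Relative reduction (%) of PE-SLAM error with respect to ORB-SLAM2 error."""
    return (orb_slam2 - pe_slam) / orb_slam2 * 100.0

print(round(relative_reduction(0.030, 0.026), 1))  # structure_notexture RMSE -> 13.3
print(round(relative_reduction(0.125, 0.070), 1))  # nostructure_texture RMSE -> 44.0
print(round(relative_reduction(0.020, 0.014), 1))  # structure_texture RMSE  -> 30.0
```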
Table 4. Relative pose errors (RPE, m) of PE-SLAM and ORB-SLAM2.
Experimental Sequences     ORB-SLAM2                       PE-SLAM
                           RMSE     MEAN     S.D.           RMSE     MEAN     S.D.
structure_notexture        0.046    0.039    0.024          0.043    0.036    0.023
nostructure_texture        0.255    0.164    0.195          0.113    0.092    0.066
structure_texture          0.032    0.028    0.015          0.026    0.022    0.012
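The values in Table 4 are translational relative pose errors, i.e., the residual between estimated and ground-truth motion over short trajectory segments. The following is a minimal sketch under assumed settings (matched lists of 4 × 4 homogeneous poses and a frame step of 1); it illustrates the metric rather than restating the exact evaluation configuration used here.

```python
import numpy as np

def rpe_statistics(gt_poses, est_poses, delta: int = 1):
    """Translational RPE statistics from matched lists of 4x4 poses (illustrative sketch)."""
    errors = []
    for i in range(len(gt_poses) - delta):
        gt_rel = np.linalg.inv(gt_poses[i]) @ gt_poses[i + delta]     # ground-truth relative motion
        est_rel = np.linalg.inv(est_poses[i]) @ est_poses[i + delta]  # estimated relative motion
        err = np.linalg.inv(gt_rel) @ est_rel                         # residual transform
        errors.append(np.linalg.norm(err[:3, 3]))                     # translational drift (m)
    errors = np.asarray(errors)
    return float(np.sqrt(np.mean(errors ** 2))), float(errors.mean()), float(errors.std())
```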
Table 5. Relative reduction in relative pose error (%) of PE-SLAM compared to ORB-SLAM2.
Experimental Sequences     RMSE      MEAN      S.D.
structure_notexture        6.5%      7.7%      4.2%
nostructure_texture        55.7%     44.0%     66.2%
structure_texture          18.8%     21.4%     20.0%
Table 6. Average tracking time (ms) of PE-SLAM and ORB-SLAM2.
Experimental Sequences     ORB-SLAM2    PE-SLAM
structure_notexture        10           35
nostructure_texture        21           68
structure_texture          25           70
Table 7. Absolute trajectory errors (ATE, m) of PE-SLAM and ORB-SLAM2.
Experimental Sequences     ORB-SLAM2                       PE-SLAM
                           RMSE     MEAN     S.D.           RMSE     MEAN     S.D.
Si                         0.026    0.024    0.009          0.023    0.021    0.009
Table 8. Relative reduction in absolute trajectory error (%) of PE-SLAM compared to ORB-SLAM2.
Experimental Sequences     RMSE      MEAN      S.D.
Si                         11.5%     12.5%     0%
Table 9. Relative pose errors (RPE, m) of PE-SLAM and ORB-SLAM2.
Experimental Sequences     ORB-SLAM2                       PE-SLAM
                           RMSE     MEAN     S.D.           RMSE     MEAN     S.D.
Si                         0.095    0.066    0.067          0.090    0.064    0.063
Table 10. Relative reduction in relative pose error (%) of PE-SLAM compared to ORB-SLAM2.
Experimental Sequences     RMSE      MEAN      S.D.
Si                         5.2%      3.0%      5.9%
Table 11. Average tracking time (ms) of PE-SLAM and ORB-SLAM2.
Experimental Sequences     ORB-SLAM2    PE-SLAM
Si                         23           49
Table 12. Summary of experimental results.
Dataset      Scene                  Degree of Feature Similarity    Degree of Weak Texture    Degree of Structural Simplicity    Degree of Camera Noise    ATE (RMSE, m) of PE-SLAM    ATE Reduction Rate    PE-SLAM Mapping Performance
TUM          structure_notexture    High                            High                      Moderate                           Low                       0.026                       13.3%                 High
TUM          nostructure_texture    Moderate                        Low                       High                               Low                       0.070                       44.0%                 Moderate
TUM          structure_texture      Low                             Low                       Moderate                           Low                       0.014                       30.0%                 High
Sim. Exp.    Si                     High                            High                      Moderate                           Low                       0.023                       11.5%                 Moderate
Exp.         In                     High                            Moderate                  Moderate                           Moderate                  -                           -                     High
Exp.         Real                   High                            Moderate                  Moderate                           High                      -                           -                     Moderate
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
