Article

Outlier Detection by Energy Minimization in Quantized Residual Preference Space for Geometric Model Fitting

1 CNNC Wuhan Nuclear Power Operation Technology Co., Ltd., Wuhan 430223, China
2 The State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Luo Jia Shan, Wuhan 430072, China
3 Institute of Robotics and Intelligent Systems (IRIS), Wuhan University of Science and Technology, Wuhan 430081, China
* Author to whom correspondence should be addressed.
Electronics 2024, 13(11), 2101; https://doi.org/10.3390/electronics13112101
Submission received: 28 April 2024 / Revised: 14 May 2024 / Accepted: 27 May 2024 / Published: 28 May 2024
(This article belongs to the Special Issue Computational Imaging and Its Application)

Abstract

Outliers significantly impact the accuracy of geometric model fitting. Previous approaches to handling outliers have involved threshold selection and scale estimation. However, many scale estimators assume that the inlier distribution follows a Gaussian model, which often does not accurately represent cases in geometric model fitting. Outliers, defined as points with large residuals to all true models, exhibit similar characteristics to high values in quantized residual preferences, causing outliers to cluster away from inliers in quantized residual preference space. In this paper, we leverage this consensus among outliers in quantized residual preference space by extending energy minimization to combine model error and spatial smoothness for outlier detection. The outlier detection process based on energy minimization follows an alternate sampling and labeling framework. Subsequently, a conventional energy minimization method is employed to optimize the inlier labels, again within an alternate sampling and labeling framework. Experimental results demonstrate that the energy minimization-based outlier detection method effectively identifies most outliers in the data. Additionally, the proposed energy minimization-based inlier segmentation accurately segments inliers into different models. Overall, the performance of the proposed method surpasses that of most state-of-the-art methods.

1. Introduction

Geometric model fitting, which refers to the accurate estimation of model parameters from noisy data, is a fundamental issue in computer vision. It includes tasks such as estimating homographies or fundamental matrices for plane detection and motion segmentation. However, the task is complicated by the presence of noise and outliers, especially when the data contain multiple geometric structures. In this study, we use the estimation of planar homographies to illustrate the issue, as presented in Figure 1 (“ladysymon” from AdelaideRMF [1]). From the matched points in two-view images (Image 1 and Image 2), homography matrices are estimated; here, two distinct “plane” structures correspond to two homographies. The number of model instances and the ratios of inliers (correctly matched points) and outliers (mismatched points) are unknown in advance. Through the proposed model fitting approach, the two “plane” structures associated with the homographies can be estimated, and the correctly matched points on each plane can be identified as inliers of the corresponding homography, disregarding the influence of outliers. It is important to note that the structures under consideration are not limited to homographies, as the fundamental matrix is also examined in this research.
In the past three decades, the classical random sample consensus (RANSAC) technique [2] and its related methods [3] have been extensively employed to address model fitting in the presence of outliers. While these approaches have proven effective for fitting a single model, their extensions to multimodel fitting, such as sequential RANSAC [4,5] and multi-RANSAC [6], remain limited. The limitation arises from the tendency to classify numerous data points as noise or outliers: a data point belonging to one specific model is treated as an outlier by all the other models, thus generating numerous pseudo-outliers [7]. Consequently, the problem of geometric multimodel fitting remains unresolved.
The challenge of multimodel fitting can be likened to a classic chicken-and-egg dilemma [8], where both the assignments of data to models and the model parameters are initially unknown. However, once a solution is obtained for one aspect of the problem, the solution for the other aspect can be readily derived. Multimodel fitting approaches typically begin with a sampling procedure to generate numerous hypotheses. Due to the presence of multiple underlying structures and significant outliers, the minimum sample set (MSS) utilized for hypothesis calculation often includes outliers or pseudo-outliers. Consequently, it becomes exceedingly difficult to differentiate the inliers associated with distinct models by directly utilizing hypothesis parameters or residuals.
To tackle this issue, a series of preference analysis-based methods [8,9,10,11,12,13,14,15,16,17,18,19,20] have been developed. These methods utilize hypothesis residuals to determine preference sets for data points in clustering. J-linkage [9,10], one of the earliest preference analysis-based methods, establishes a conceptual preference for points by binarizing residuals using an inlier threshold. It also introduces the Jaccard distance to assess the similarity between preference sets of two points for linkage clustering. Subsequently, the inliers from different models are allocated to distinct clusters. T-linkage [11,12], an enhanced iteration of J-linkage, enhances the binary preference function by incorporating relaxation and employs the soft Tanimoto distance to refine the conceptual preference from J-linkage for improved clustering outcomes. Robust preference analysis (RPA) [14] represents data points in a conceptual space akin to J-linkage, followed by robust principal component analysis (PCA) and symmetric non-negative matrix factorization (NMF) to break down the multimodel fitting issue into numerous single-model fitting problems. These are then addressed using M-estimator sample consensus (MSAC) [21]. However, both the conceptual preference in J-linkage and the soft conceptual preference in T-linkage necessitate an inlier threshold to mitigate the influence of outliers, thus rendering the methods less robust and somewhat cumbersome when handling outlier data points.
To avoid the problems brought by the inlier threshold, permutation preference uses the rank order of the sorted residuals instead of an inlier threshold and is widely used in multimodel fitting methods [8,16,17,18,19,20,22]. Similarly, kernel fitting (KF) [15] uses the permutation obtained by sorting the residuals of the hypotheses as the preference to build a Mercer kernel that elicits potential points belonging to a common structure, and, as a result, the outliers and noise can be removed. Permutation preference has also been used to represent the hypotheses for mode seeking [8,17]. Wong [18] made use of permutation preference for hypothesis sampling and inlier clustering. The simulated annealing-based random cluster model (SA-RCM) [19] integrates permutation preference with graph-cut optimization for hypothesis sampling, and the multimodel fitting task is solved efficiently in a simulated annealing framework. Lai [22] combined permutation preference analysis and information theory principles to build a discriminative sparse affinity matrix for clustering.
The preference analysis-based methods make great efforts to exploit the residuals but neglect the spatial distribution of the data points, i.e., the fact that inliers belonging to the same model are usually spatially close in the image. A series of energy-based fitting methods [19,23,24,25,26,27,28] have been proposed to optimize the fitting solution by accounting for the model error while encouraging spatial smoothness of the points. The energy-based fitting methods formulate geometric multimodel fitting as an optimal labeling problem, with a global energy function balancing the geometric errors and the regularity of the inlier clusters; the optimal labeling problem is solved by means of the α-expansion algorithm [29]. Similar graph-based methods [30,31] have been proposed to solve this problem. Also, the hypergraph [32] has been introduced to represent the relation between hypotheses and data points for multimodel fitting [33,34].
Energy-based fitting methods leverage the spatial relationships among data points to effectively identify inliers belonging to the same model, thus resulting in a promising segmentation outcome. However, outliers are typically randomly dispersed throughout the data points, thus making it challenging for these methods to address them solely through data error or spatial smoothness considerations. Consequently, additional measures such as inlier thresholding or scale estimation [35,36,37,38,39,40,41] are often required to mitigate the influence of outliers. Nevertheless, many scale estimation techniques rely on specific noise distribution models, thus commonly assuming a Gaussian distribution. In the context of geometric model fitting, the noise distribution becomes highly complex due to factors like sampling, feature extraction, and matching processes, thereby leading to the suboptimal performance of the scale estimators. As a result, the effectiveness of energy-based fitting methods for geometric model fitting is significantly constrained.
The key factor in enhancing the accuracy of fitting models often lies in addressing outliers, which are data points with residuals larger than the inliers when compared to all true models in the dataset. This leads to the formation of a consensus among the outliers. When a sufficient number of accurate hypotheses are generated during the sampling phase, the quantized residual preferences [42,43] of the outliers tend to exhibit higher values, thus causing them to be distinctly separated from the inliers in the quantized residual preference space. Previous outlier detection methods based on quantized residual preferences were limited to preference space, thus making the outcomes susceptible to variations in the sampling process. Suboptimal results may occur when the proportion of correct models obtained from the sampling process is insufficient. By considering the distribution of points in the quantized residual preference space, energy minimization techniques can effectively handle outliers without requiring scale estimation. This study extends the application of energy minimization to the quantized residual preference space by utilizing a neighborhood graph derived from the Delaunay triangulation of points in this space rather than the image coordinates of the points. The data cost associated with outlier labeling is adjusted based on pseudo-outliers to enhance outlier detection. To maximize outlier detection, the energy minimization process adopts a strategy of alternate sampling and labeling, thus involving the iterative optimization of labeling based on energy minimization and sampling of model hypotheses within the inlier clusters for subsequent energy minimization rounds. This iterative process enhances both the sampling and labeling procedures. Following the outlier detection phase, an inlier segmentation process is applied using a conventional energy minimization fitting method on the data points with outliers removed. This segmentation process constructs a neighborhood graph based on the Delaunay triangulation of the points’ image coordinates. The energy-based inlier segmentation process also follows an alternate sampling and labeling framework for improved accuracy.
The rest of this paper is organized as follows. In Section 2, we introduce the proposed method in detail. The experiments in geometric multimodel fitting, including two-view plane segmentation and two-view motion segmentation, are presented in Section 3. Finally, we draw our conclusions in Section 4.

2. Materials and Methods

The proposed method for geometric model fitting consists of two parts: outlier detection and inlier segmentation. As shown in Figure 2, outliers and inliers are always mixed in the image. We first conduct an outlier detection process in quantized residual preference space and remove the outliers; then, an inlier segmentation process is applied to assign the inliers to the different homography models. Both outlier detection and inlier segmentation are integrated with energy minimization but use different neighborhood graphs. Because outliers and inliers are mixed in the image but can be separated well in quantized residual preference space, the neighborhood graph is generated in quantized residual preference space during the outlier detection process; meanwhile, because inliers belonging to different models show a strongly aggregated distribution in the image, the neighborhood graph in the inlier segmentation process is obtained from the image coordinates of the points.

2.1. Outlier Detection

Like most preference-based fitting methods, the outlier detection process starts with a sampling process that generates many model hypotheses. In order to ensure a sufficient proportion of good hypotheses, we generate the model hypotheses using a region-based random sampling process. After hypothesis generation, the quantized residual preferences are calculated from the hypothesis residuals to represent the data points in quantized residual preference space. Finally, energy minimization is performed in quantized residual preference space to segment the whole dataset into several clusters, and the outlier cluster is selected as the outlier detection result. In order to decrease the instability caused by the sampling process and detect as many outliers as possible, the whole outlier detection process follows an alternate sampling and labeling framework, which alternately conducts sampling to generate hypotheses for the quantized residual preferences and carries out energy minimization to optimize the labels in quantized residual preference space; the next-round sampling process is then conducted within each inlier cluster. The whole workflow of the proposed outlier detection is presented in Figure 3.

2.1.1. Region-Based Random Sampling

Random sampling is a common hypothesis generation method for robust model fitting, but it is of extremely low efficiency, and the hitting rate (the probability of obtaining a true model) is also very low. Spatial information has been widely used in guided sampling to improve sampling efficiency [5,44,45,46], under the assumption that inliers are closer to each other than outliers in the spatial domain of the data. In order to make full use of the spatial information that inliers belonging to one model tend to be neighbors, we conduct random sampling within a region, i.e., region-based random sampling.
Region-based random sampling directly draws the MSS at random within a region (Figure 4). Firstly, the whole set of data points is divided into several subregions, and then random sampling is conducted, as in RANSAC, on each subregion to obtain a number of model hypotheses. The subregions are obtained by an even spatial division with a fixed region size: each time, the nearest neighbors of a seed point are gathered to form a region and removed, and the next region is formed from the remaining data points until fewer points than the region size remain.
Because inliers belonging to the same model tend to be neighbors in the image, the MSSs extracted from subregions are more likely to consist of inliers from the same model. As a result, more good hypotheses are generated. In practical experiments on the AdelaideRMF data, we found that region-based random sampling guarantees that more than 40% of the hypotheses are calculated from inliers of the same model, while plain random sampling obtains less than 15%.
The size of the subregions should be neither too small nor too large. If it is too small, the subregions are divided too finely, thus making it difficult to effectively utilize spatial consistency information. If it is too large, data from other models are introduced, thus decreasing the quality of the sampled models. Generally, the size is set to 3–10 times the MSS size, and the number of random samples drawn in each subregion is usually set to 20–200 to ensure that there are enough hypotheses.
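As an illustration of this procedure, the following Python sketch groups the points into subregions by greedy nearest-neighbor growth and then draws MSSs at random inside each subregion; the function names and the fit_model callback are illustrative assumptions rather than the authors' reference implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

def build_subregions(points, region_size):
    """Greedily group points into subregions of `region_size` nearest neighbors."""
    remaining = np.arange(len(points))
    regions = []
    while len(remaining) >= region_size:
        tree = cKDTree(points[remaining])
        seed = np.random.randint(len(remaining))
        _, nn = tree.query(points[remaining][seed], k=region_size)
        regions.append(remaining[nn])            # indices into the full point set
        remaining = np.delete(remaining, nn)     # remove this region, continue with the rest
    return regions

def region_based_sampling(points, fit_model, mss_size, region_size=20, samples_per_region=200):
    """Randomly sample minimal sample sets (MSSs) inside each subregion and fit hypotheses."""
    hypotheses = []
    for region in build_subregions(points, region_size):
        for _ in range(samples_per_region):
            mss = np.random.choice(region, mss_size, replace=False)
            hypotheses.append(fit_model(points[mss]))   # e.g., a homography or fundamental matrix
    return hypotheses
```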

2.1.2. Quantized Residual Preference

After the hypothesis generation, the hypothesis residuals are calculated to form the quantized residual preferences. The quantized residual preferences are obtained by quantizing the residuals and taking the quantized values as the preference values of the data points. The other key factor for the quantized residual preference space is the distance measurement.
Given the data point set $X = \{x_1, x_2, \ldots, x_N\}$ and the hypothesis set $H = \{h_1, h_2, \ldots, h_j, \ldots, h_M\}$ obtained after hypothesis generation, the residual matrix is $R = \{\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_j, \ldots, \mathbf{r}_M\}$, where $\mathbf{r}_j = [r_{1,j}, r_{2,j}, \ldots, r_{i,j}, \ldots, r_{N,j}]^T$ contains the residuals of hypothesis $h_j$ to all the data points in $X$, $N$ is the number of data points, and $M$ is the number of hypotheses. We quantize $R$ by Equation (1):
$$\check{q}_{i,j} = \left\lceil \frac{r_{i,j} - r_{\min}^{j}}{r_{\max}^{j} - r_{\min}^{j}} \cdot \theta \right\rceil, \qquad r_{\max}^{j} = \max\{r_{1,j}, r_{2,j}, \ldots, r_{N,j}\}, \qquad r_{\min}^{j} = \min\{r_{1,j}, r_{2,j}, \ldots, r_{N,j}\} \qquad (1)$$
where θ refers to the quantization level. When using the quantized residuals to represent the hypotheses or the data points, a valid quantization length λ is needed to decrease the complexity of the quantized residual preferences.
$$q_{i,j} = \begin{cases} \check{q}_{i,j}, & \check{q}_{i,j} \le \lambda \\ 0, & \check{q}_{i,j} > \lambda \end{cases} \qquad (2)$$
In this way, we can obtain the quantized residual matrix $Q = [\mathbf{q}_1^T, \mathbf{q}_2^T, \ldots, \mathbf{q}_N^T]$, where each row of $Q$ is the quantized residual preference of a data point. That is, the quantized residual preference of data point $x_i$ is the $i$th row of $Q$, i.e., $\mathbf{q}_i = [q_{i,1}, q_{i,2}, \ldots, q_{i,j}, \ldots, q_{i,M}]$. When comparing two quantized residual preferences $\mathbf{q}_i$ and $\mathbf{q}_j$, the distance measurement defined by Equation (3) is used.
$$W(\mathbf{q}_i, \mathbf{q}_j) = \begin{cases} 1 - \dfrac{\sum_{t=1}^{M} \varphi(q_{i,t}, q_{j,t})}{\max(\rho(\mathbf{q}_i), \rho(\mathbf{q}_j))}, & \text{if } \max(\rho(\mathbf{q}_i), \rho(\mathbf{q}_j)) \ne 0 \\ 1, & \text{else} \end{cases}$$
$$\varphi(q_{i,t}, q_{j,t}) = \begin{cases} 1, & \text{if } q_{i,t} = q_{j,t} \text{ and } q_{i,t} \ne 0 \\ 0, & \text{else} \end{cases} \qquad \rho(\mathbf{q}_i) = \sum_{t=1}^{M} \varphi(q_{i,t}, q_{i,t}) \qquad (3)$$
Because the residuals of outliers are larger for all the models in the data, after quantization these larger residuals are more likely to map to values close to $\lambda$ or to 0 (Equation (2)). In this way, the quantized residual preferences of the outliers tend to contain many values equal to 0 or $\lambda$, and when the proportion of good hypotheses is high enough, most of the values in the quantized residual preferences of the outliers will be close to $\lambda$ or 0, whereas the corresponding values for inliers will be quite small. When projecting the data points into quantized residual preference space with the distance measurement in Equation (3), most of the outliers present a concentrated distribution (Figure 5) far away from the inliers, thus making the outliers easy to separate from the inliers.
Figure 5 shows the two-view data “johnsona” from the AdelaideRMF dataset [1] for multihomography estimation. Figure 5a presents the two-view images and the feature matching points between the two images; the points labeled with red triangles are the mismatched points, which are regarded as outliers in multihomography estimation. Figure 5b presents the multidimensional scaling (MDS) plot of the quantized residual preferences of the points, where the points labeled in red are the outliers (mismatched points) corresponding to Figure 5a. It is clearly shown that the outliers are distributed close to each other and far away from the inliers.
The quantization level is used to enhance the consistency of residual values across different models. If the quantization level is set too large, it becomes more difficult to achieve consistency among data points belonging to the same model, because the resolution between data points becomes finer. On the other hand, if the quantization level is set too small, the distinguishability between data points from different models decreases. The quantization level is typically related to the complexity of the model, i.e., the minimum sample size for model calculation. In the experiments on two-view plane segmentation and two-view motion segmentation, the quantization levels were set to 20 and 200, respectively. The quantization length mainly controls the complexity of the quantized residual preference features. When distinguishing inliers and outliers, a value of 1 is usually sufficient for good results; when segmenting inliers, a value between 1 and 5 provides good discriminative ability.
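To make Equations (1)–(3) concrete, a minimal NumPy sketch of the quantized residual preference computation and the preference distance is given below; the vectorized form, the rounding used to keep preference values discrete, and the helper names are our own assumptions.

```python
import numpy as np

def quantized_residual_preference(R, theta, lam):
    """R: N x M residual matrix (N points, M hypotheses). Returns the N x M preference matrix Q."""
    r_min, r_max = R.min(axis=0), R.max(axis=0)
    q_check = np.ceil((R - r_min) / (r_max - r_min + 1e-12) * theta)   # Equation (1)
    return np.where(q_check <= lam, q_check, 0.0)                      # Equation (2)

def preference_distance(qi, qj):
    """Distance between two quantized residual preferences, Equation (3)."""
    shared = np.count_nonzero((qi == qj) & (qi != 0))   # sum of phi over t
    rho_i, rho_j = np.count_nonzero(qi), np.count_nonzero(qj)
    denom = max(rho_i, rho_j)
    return 1.0 if denom == 0 else 1.0 - shared / denom
```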

2.1.3. Energy Minimization-Based Outlier Detection

Energy minimization is widely used in geometric model fitting problems. Generally, the energy is formulated over the labeling $f = \{f_i\}_{i=1}^{N}$, combining data costs and smoothness costs [19,23]:
$$E(f) = \underbrace{\sum_{i=1}^{N} D(x_i, f_i)}_{\text{data cost}} + \alpha \cdot \underbrace{\sum_{\langle i,j \rangle \in N} V(f_i, f_j)}_{\text{smoothness cost}} \qquad (4)$$
for which the set of labels $f = \{f_i\}_{i=1}^{N}$ assigns each $x_i$ to one of the structures, and $f_i = 0$ denotes the outlier label. The data cost is constructed according to the residuals:
$$D(x_i, f_i) = \begin{cases} r(x_i, f_i), & \text{if } f_i \ne 0 \\ \sigma, & \text{if } f_i = 0 \end{cases} \qquad (5)$$
where $r(x_i, f_i)$ is the absolute residual of $x_i$ to structure $f_i$, and $\sigma$ refers to the penalty for labeling $x_i$ as an outlier. The smoothness cost is defined according to the Potts model:
$$V(f_i, f_j) = \begin{cases} 0, & \text{if } f_i = f_j \\ 1, & \text{if } f_i \ne f_j \end{cases} \qquad (6)$$
Equation (4) requires a neighborhood graph $G = (V, N)$, where the vertices are the data points, $V = X$, and the edges $N$ are constructed from the Delaunay triangulation on $X$ [19,23,24].
Energy (4) can be minimized by the α-expansion algorithm [29]. Most of the time, the α-expansion follows the expand-and-re-estimate framework, as in PEARL [23], to further decrease the energy until convergence.
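As a sketch of how Equations (4)–(6) fit together, the following code builds the Delaunay neighborhood graph and evaluates the labeling energy for a given label vector; the actual minimization would be delegated to an α-expansion implementation, and the array layout (label 0 as the outlier label, one data cost column per label) is an assumption made for illustration.

```python
import numpy as np
from scipy.spatial import Delaunay

def delaunay_edges(points):
    """Edge set N of the neighborhood graph from the Delaunay triangulation of the points."""
    tri = Delaunay(points)
    edges = set()
    for simplex in tri.simplices:
        for a in range(len(simplex)):
            for b in range(a + 1, len(simplex)):
                i, j = sorted((int(simplex[a]), int(simplex[b])))
                edges.add((i, j))
    return list(edges)

def labeling_energy(data_cost, labels, edges, alpha):
    """Equation (4): data term (Equation (5)) plus Potts smoothness term (Equation (6)).
    data_cost is an N x L array whose column 0 holds the outlier penalty sigma."""
    data_term = data_cost[np.arange(len(labels)), labels].sum()
    smooth_term = sum(1 for i, j in edges if labels[i] != labels[j])
    return data_term + alpha * smooth_term
```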
In practice, however, the outliers are randomly distributed over $X$, so the spatial information of the outliers in $X$ can hardly be exploited by the energy minimization method, which means that the outlier segmentation result depends entirely on the inlier threshold $\sigma$. Because the actual inlier threshold of each model varies, a single specified inlier threshold $\sigma$ can hardly satisfy all the models. As a result, the energy minimization method performs poorly when outliers are involved.
As presented in Figure 5, the outliers present a clear aggregated distribution in quantized residual preference space Q, which can be used for constructing the neighborhood graph in the energy minimization. Through the combination of the residuals and the aggregated distribution in Q by means of energy minimization, the outliers and inliers can be successfully separated.
Similarly, the neighborhood graph in $Q$ is constructed from the Delaunay triangulation on $Q$: the vertices are the quantized residual preferences of the data points, $V = Q$, and the edges $N$ are constructed from the Delaunay triangulation on $Q$ under the distance measurement of Equation (3). In this way, the distribution of the outliers in quantized residual preference space is introduced into the energy minimization function.
However, the data cost cannot simply be defined as the absolute residual to the structure model, since the accurate parameters of the structure are unknown and are usually estimated from the points with the same label. Only if all of those points are inliers of the same structure can the fitted parameters be regarded as close to the true parameters, which is quite difficult to guarantee. In practice, in order to exclude the impact of the outliers and pseudo-outliers as much as possible and obtain accurate structure parameters, the hypothesis $h_{f_i}$ used for calculating the residuals is obtained by randomly sampling a number of hypotheses $H_{f_i}$ from the points labeled with $f_i$, calculating the residuals of the points labeled with $f_i$, and choosing the hypothesis with the minimum mean residual as the model parameter $h_{f_i}$ for the data cost. The smaller the mean residual of a hypothesis is, the closer it will be to the true model. In this way, we can estimate parameters much closer to the true ones.
$$h_{f_i} = \arg\min_{h \in H_{f_i}} \frac{\sum_{x_i \in L(f_i)} r(x_i, h)^2}{\mathrm{card}(L(f_i))} \qquad (7)$$
where $L(k) = \{x_i \mid f_i = k\}$ refers to the data points labeled with $k$, i.e., the inlier set of the hypothesis labeled with $k$.
The data cost for the inlier labels is defined as the squared residual to the hypothesis $h_{f_i}$ generated from the data points labeled with $f_i$:
$$D(x_i, f_i) = r(x_i, h_{f_i})^2, \quad f_i \ne 0 \qquad (8)$$
where $r(x_i, h_{f_i})$ refers to the residual of $x_i$ with respect to the hypothesis $h_{f_i}$.
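A minimal sketch of this data cost construction, assuming a generic residual_fn(points, hypothesis) callback: for each inlier label, the sampled hypothesis with the minimum mean squared residual over the label's current members is kept (Equation (7)), and its squared residuals serve as that label's data cost (Equation (8)).

```python
import numpy as np

def select_label_hypothesis(points, member_idx, sampled_hypotheses, residual_fn):
    """Equation (7): choose the hypothesis minimizing the mean squared residual
    over the points currently carrying this label."""
    errors = [np.mean(residual_fn(points[member_idx], h) ** 2) for h in sampled_hypotheses]
    return sampled_hypotheses[int(np.argmin(errors))]

def inlier_data_cost(points, label_hypothesis, residual_fn):
    """Equation (8): squared residual of every point to the label's selected hypothesis."""
    return residual_fn(points, label_hypothesis) ** 2
```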
For data points that are considered outliers, it is not sufficient to view the residuals as simply representing the data cost of those outliers, as a single model derived from outliers does not provide meaningful information for the entire dataset. To address this, the statistical consistency of the inliers is evaluated by considering pseudo-outliers, which are data points that belong to one model but are outliers with respect to the other models. For example, points labeled as $f_i$ can be seen as outliers to the model labeled as $f_j$, where $f_i \ne f_j$. Therefore, we calculate the mean residuals of the inliers labeled with $f_i$ to the other structure hypotheses $\{h_{f_j}\}$ as the data cost between the outlier label and the inlier models. Furthermore, we enhance the statistical consistency by using the hypotheses $\{H_{f_j}\}$ sampled from the points labeled with $f_j$, $f_j \in \{1, 2, \ldots, n\}$.
$$D(x_i, f_i) = \frac{\sum_{h \in \{H_{f_j}\}} r(x_i, h)^2}{\mathrm{card}(\{H_{f_j}\})}, \quad f_i = 0,\ f_j \ne 0 \qquad (9)$$
When calculating the data cost of the outliers to the outlier label, we simply take the minimum residual of outliers over all the hypotheses as the data cost value to the outlier label.
Generally, the energy minimization process needs initial input labels, which contain an outlier label and inlier labels. In order to obtain the initial labels, a linkage clustering process based on the quantized residual preferences is conducted to divide the data points into several clusters, and the initial outlier cluster is then selected using an outlier index. Because hypotheses generated from the outliers tend to have larger residuals, while hypotheses generated from the inliers have smaller residuals, the outlier index $\xi_{f_i}$ can be defined using the mean residual of the hypotheses generated by MSSs from each cluster, which is calculated as in Equation (10).
$$\xi_{f_i} = \frac{\sum_{h \in H_{f_i}} \sum_{x_i \in L(f_i)} r(x_i, h)}{\mathrm{card}(H_{f_i}) \cdot \mathrm{card}(L(f_i))} \qquad (10)$$
Because the outlier index is the mean residual of the hypotheses generated by MSSs from each cluster, clusters with a larger outlier index are more likely to be the outlier cluster. The cluster with the maximum outlier index $\xi$ is therefore selected as the initial outlier cluster, points within this cluster are labeled 0, and the points within the other clusters are labeled with the corresponding cluster number.
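The outlier index and the selection of the initial outlier cluster can be sketched as follows; the cluster representation (member indices paired with the hypotheses sampled from those members) is an assumption made for illustration.

```python
import numpy as np

def outlier_index(points, member_idx, cluster_hypotheses, residual_fn):
    """Equation (10): mean residual of the cluster members over the hypotheses
    sampled from this cluster; outlier clusters yield large values."""
    return np.mean([residual_fn(points[member_idx], h) for h in cluster_hypotheses])

def initial_outlier_cluster(points, clusters, residual_fn):
    """Select the cluster with the maximum outlier index as the initial outlier cluster."""
    indexes = [outlier_index(points, members, hyps, residual_fn) for members, hyps in clusters]
    return int(np.argmax(indexes))
```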
After the initial labeling, energy minimization is conducted to obtain the optimized labels of the data points. The resulting labels often produce more than two clusters, which contain the outlier cluster and several inlier clusters. Note that these inlier clusters exhibit serious undersegmentation and cannot be considered as the inlier segmentation result. Similarly, the outlier indexes are calculated for each cluster to select the outlier cluster and identify the other inlier clusters. If the outlier cluster is unchanged compared to the initial outlier cluster or the previous detection result, it is returned as the final outlier detection result; otherwise, the inlier clusters are regarded as the subregions for random sampling in the next detection round, following the alternate sampling and labeling framework. The full workflow is shown in Figure 3, and the details of the algorithm are presented in Algorithm 1.
Algorithm 1 Energy Minimization-Based Outlier Detection
1: Divide $X$ into $m$ subregions $D = \{D_1, D_2, \ldots, D_m\}$ and initialize the outlier cluster $\acute{C}_o = \emptyset$;
2: Conduct random sampling on each subregion in $D$ and generate the hypotheses $H$;
3: Calculate the quantized residual preference matrix $Q$ and the distance matrix $W$ between the quantized residual preferences of every two points;
4: Conduct linkage clustering with the distance matrix $W$ and obtain the clusters $C = \{C_1, C_2, \ldots, C_n\}$;
5: Calculate the outlier index $\xi = [\xi_{f_1}, \xi_{f_2}, \ldots, \xi_{f_n}]$ for each cluster in $C$, select the cluster with the maximum outlier index as the outlier cluster $C_o = C_t$, $t = \arg\max_j \xi_{C_j}$, take the inlier cluster $C_I = X - C_o$, and initialize the outlier and inlier labels $f^0 = \{f_i \mid f_i = 0 \text{ if } x_i \in C_o;\ \text{else } f_i = 1\}$;
6: Calculate the data cost using the clusters $\hat{C} = C - C_o$, build the neighborhood graph $N$ in quantized residual preference space, and calculate the smoothness cost;
7: Conduct energy minimization with α-expansion to obtain the optimized labels $\hat{f}$, collect the label inlier sets $L$ by $L(k) = \{x_i \mid \hat{f}_i = k, \hat{f}_i \in \hat{f}\}$, and obtain the outlier and inlier labels $\tilde{f}^0 = \{\tilde{f}_i \mid \tilde{f}_i = 0 \text{ if } \hat{f}_i = 0;\ \text{else } \tilde{f}_i = 1\}$;
8: If $\tilde{f}^0 == f^0$, return the outlier points labeled with 0 in $\tilde{f}^0$; else set $f^0 = \tilde{f}^0$;
9: Randomly sample hypotheses in each label inlier set, except for the outlier label set, select the hypothesis with the minimum mean residual as the model, re-estimate the model parameters, and refine the labels by the minimum residual to the model parameters: $f_i = \arg\min_{\hat{f}_i} r(x_i, \hat{f}_i)$, $\hat{f}_i \ne 0$, $f = \{f_i\}_{i=1}^{n}$;
10: Update the data cost using the sampling hypotheses obtained in step 9;
11: Calculate the quantized residual preference space and update the neighborhood graph $N$ and the smoothness cost;
12: Return to step 7 and undertake the label optimization process again.

2.2. Inlier Segmentation

After the outlier detection, it is usually possible to find most of the outliers in the dataset, and there will be few outliers left after removing the detected outliers. A conventional energy minimization process [19,23] can then be used to segment the inliers, without considering the outliers, which makes the labeling optimization more convenient and accurate.
The energy minimization also starts with region-based random sampling to obtain a number of hypotheses, and the initial labels of the data points are obtained from the subregions of the data points in the image, such that points in the same subregion share one label. Given the dataset $\acute{X}_I = X - C_o$ after the outlier cluster $C_o$ has been removed, and given that $\acute{X}_I$ has been divided evenly into subregions $\acute{D} = \{\acute{D}_1, \acute{D}_2, \ldots, \acute{D}_{\acute{m}}\}$, the initial labels are $\acute{f} = \{\acute{f}_i \mid \acute{f}_i = k \text{ if } x_i \in \acute{D}_k\}$.
Because this energy minimization is conducted on inliers, the data cost is calculated following Equation (8), i.e., the hypothesis with the minimum mean residual obtained during the random sampling process is selected to calculate the data cost for each label. The neighborhood graph is constructed from the Delaunay triangulation of the data points $\acute{X}_I$. The smoothness cost uses the Potts model in Equation (6). The whole label optimization process also follows an alternating sampling and labeling process similar to the outlier detection process. The full workflow is shown in Figure 6, and the details of the inlier segmentation process are provided in Algorithm 2.
Algorithm 2 Inlier Segmentation
1: Divide $\acute{X}_I$ into subregions $\acute{D} = \{\acute{D}_1, \acute{D}_2, \ldots, \acute{D}_{\acute{m}}\}$;
2: Construct the neighborhood graph on $\acute{X}_I$ and initialize the point labels $\acute{f}$ according to $\acute{D}$;
3: Conduct random sampling on each subregion in $\acute{D}$ and calculate the data cost;
4: Conduct energy minimization to obtain the optimized labels $\dot{f}$ by α-expansion;
5: Collect the inlier sets $L$ by $L(k) = \{x_i \mid \dot{f}_i = k, \dot{f}_i \in \dot{f}\}$ according to the labels $\dot{f}$;
6: If $\dot{f} == \acute{f}$, end the algorithm and return the inlier point labels $\dot{f}$; else set $\acute{f} = \dot{f}$, $\acute{D} = L$, and go to step 3.
The alternating sampling and labeling framework converges easily, usually within a few iterations. The sampling is conducted within each cluster optimized by the energy minimization process, which in turn makes the data cost more accurate for the next round of labeling and improves the labeling result. Thus, the mutual improvement of the sampling and labeling processes guarantees rapid convergence during both the outlier detection (Figure 3) and the inlier segmentation (Algorithm 2).
Usually, the alternating sampling and labeling framework captures most of the outliers after the first iteration, and further iterations decrease the instability introduced by the sampling process. In most of the experiments, the results remain stable after five iterations.

3. Results

In this section, the experiments conducted on geometric multimodel fitting include two-view plane segmentation and two-view motion segmentation. Two-view plane segmentation is actually a multihomography estimation problem: a plane can be parametrized by a homography matrix calculated from the matched points on the plane in the two images. So, in order to achieve two-view plane segmentation, we need to fit multiple homographies to the matched points. Each time, at least four pairs of matched points are needed to calculate the homography matrix using the DLT method [47], and the Sampson distance is used for computing the residuals.
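For completeness, a minimal DLT homography estimation from four or more correspondences is sketched below; it omits the Hartley point normalization that a careful implementation would add, and the residual shown is the simpler symmetric transfer error rather than the Sampson distance used in our experiments.

```python
import numpy as np

def dlt_homography(src, dst):
    """Estimate H such that dst ~ H @ src from at least four correspondences (N x 2 arrays)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    H = Vt[-1].reshape(3, 3)          # right singular vector of the smallest singular value
    return H / H[2, 2]

def transfer_error(H, src, dst):
    """Symmetric transfer error (a simple stand-in for the Sampson distance)."""
    def project(M, pts):
        p = np.column_stack([pts, np.ones(len(pts))]) @ M.T
        return p[:, :2] / p[:, 2:3]
    fwd = np.sum((project(H, src) - dst) ** 2, axis=1)
    bwd = np.sum((project(np.linalg.inv(H), dst) - src) ** 2, axis=1)
    return fwd + bwd
```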
We first tested the proposed outlier detection method on “Merton College II” (https://www.robots.ox.ac.uk/~vgg/data/mview/ (accessed on 13 May 2024)) for two-view plane segmentation and compared it to kernel fitting [15]. We added 75 outlier corners to the data; the ground truth outliers and inliers are labeled with different shapes in Figure 7.
In the outlier detection experiment on “Merton College II”, the proposed method detected all 75 outliers with only three falsely detected outliers, which were actually inliers. The kernel fitting method found 66 outliers and had four falsely detected outliers. The corresponding detected outlier results are presented in Figure 8.
For two-view motion segmentation, the rigid-body motion $(R, t)$ between two views can be described by a fundamental matrix $F = [Kt]_{\times} K R K^{-1}$. So, two-view motion segmentation is actually a multifundamental matrix estimation problem. In the multifundamental matrix estimation, we used the normalized 8-point algorithm [48] to estimate the fundamental matrix, and we calculated the residuals using the Sampson distance.
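Likewise, a compact sketch of the normalized 8-point algorithm and the Sampson distance is given below; this is our own simplified illustration rather than the exact implementation used in the experiments.

```python
import numpy as np

def _normalize(pts):
    """Translate to the centroid and scale so that the mean distance is sqrt(2)."""
    c = pts.mean(axis=0)
    s = np.sqrt(2) / np.mean(np.linalg.norm(pts - c, axis=1))
    T = np.array([[s, 0, -s * c[0]], [0, s, -s * c[1]], [0, 0, 1]])
    return np.column_stack([pts, np.ones(len(pts))]) @ T.T, T

def eight_point_fundamental(x1, x2):
    """Normalized 8-point algorithm for x2^T F x1 = 0, with N >= 8 correspondences (N x 2 arrays)."""
    p1, T1 = _normalize(x1)
    p2, T2 = _normalize(x2)
    A = np.column_stack([p2[:, 0] * p1[:, 0], p2[:, 0] * p1[:, 1], p2[:, 0],
                         p2[:, 1] * p1[:, 0], p2[:, 1] * p1[:, 1], p2[:, 1],
                         p1[:, 0], p1[:, 1], np.ones(len(p1))])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0]) @ Vt        # enforce rank 2
    return T2.T @ F @ T1                         # undo the normalization

def sampson_distance(F, x1, x2):
    """First-order geometric error of x2^T F x1 = 0 for each correspondence."""
    p1 = np.column_stack([x1, np.ones(len(x1))])
    p2 = np.column_stack([x2, np.ones(len(x2))])
    Fx1 = p1 @ F.T           # rows: F @ x1
    Ftx2 = p2 @ F            # rows: F^T @ x2
    num = np.sum(p2 * Fx1, axis=1) ** 2
    den = Fx1[:, 0] ** 2 + Fx1[:, 1] ** 2 + Ftx2[:, 0] ** 2 + Ftx2[:, 1] ** 2
    return num / den
```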
The proposed outlier detection method for two-view motion segmentation was initially tested on “cars5” (Figure 9) from the Hopkins 155 motion dataset (http://www.vision.jhu.edu/data/hopkins155/ (accessed on 13 May 2024)), which is mainly used for testing motion segmentation algorithms that segment feature trajectories [49]. In this experiment, we selected two frames (the 1st frame and the 21st frame) from the “cars5” video sequence and the corresponding tracked features as the ground truth inliers, and then 100 outliers were added to test the proposed outlier detection algorithm.
In the outlier detection experiment on “cars5” for two-view motion segmentation, the proposed method successfully detected 90 outliers, with only 10 missed outliers and no false detections. The kernel fitting method found only 59 outliers and had 35 falsely detected outliers. The proposed method thus shows clear superiority over kernel fitting, detecting the most outliers with the fewest false detections. The corresponding detected outlier results are presented in Figure 10.
Furthermore, the proposed outlier detection method was fully tested on the AdelaideRMF dataset (https://github.com/trungtpham/RCMSA/tree/master/data/AdelaideRMF (accessed on 13 May 2024)) to show the performance of outlier detection and the corresponding overall inlier classification for both two-view plane segmentation and two-view motion segmentation. Comparisons of inlier segmentation with the state-of-the-art methods “propose, expand and re-estimate labels” (PEARL) [23], SA-RCM [19], J-linkage [9], T-linkage [11], Prog-X [28], and CLSA [50] were undertaken. The overall misclassification percentage (number of misclassified points divided by the number of points in the dataset) [51] was used to represent the model fitting performance.
The outlier detection results are shown in Figure 11 and compared with the results of kernel fitting [15] for both two-view plane segmentation and two-view motion segmentation. Figure 11a–f show the two-view plane segmentation data, and Figure 11g–l show the two-view motion segmentation data. We can see that both the kernel fitting method and the proposed method could find most of the outliers in the data. However, most of the time, the outliers detected by kernel fitting contained more inliers, and more outliers were left undetected than with the proposed method. The proposed method could usually identify almost all the outliers, with very few undetected outliers, and false detection was nonexistent, except for “dinobooks”. In the “dinobooks” data, a number of model inliers were misclassified as outliers, because the even division of the data points during the region-based random sampling process made the MSS contain some outliers, which in turn brought the inliers close to the outliers in quantized residual preference space, thus making them difficult to separate from the outliers. Adequate sampling of the inliers of each real model could further improve the performance of the proposed algorithm, which will be the focus of our next work. Table 1 shows the number of correctly detected outliers (“Correct”), undetected outliers (“Missing”), and falsely detected outliers (“False”) for kernel fitting and the proposed method. This quantitative comparison indicates that the proposed method can generally detect almost all the outliers in the data, with fewer falsely detected and undetected outliers than kernel fitting.
Table 2 shows the misclassification results of the two-view plane segmentation compared for PEARL [23], J-linkage [9], T-linkage [11], SA-RCM [19], Prog-X [28], and CLSA [50]. Please note that the misclassification values for CLSA are taken from [50], which reports the percentages to only two decimal places, so the misclassifications for “neem”, “oldclassicswing”, and “sene” are at the same level as those of the proposed method. It can be seen that the proposed method obtained the lowest level of misclassification on most of the datasets. The corresponding inlier segmentation results are presented in Figure 12, where most of the inliers on the different planes are segmented quite accurately. For the “johnsonb” data, there are seven planes, and the inliers of model 3 (points labeled in blue) occupy a large proportion, while the inliers of model 1 (red points) and model 6 (magenta points) occupy a smaller proportion, which resulted in uneven sampling. Few hypotheses were generated from the inliers of models 1 and 6, thus making these two models difficult to extract. However, the proposed method could separate models with both large and small inlier scales quite well, and it obtained a much lower misclassification level than some of the state-of-the-art methods. Our method performed better in the cases with more model instances, namely the “johnsona” and “johnsonb” datasets.
The misclassification results for two-view motion segmentation are presented in Table 3 and compared to the same six methods as in the two-view plane segmentation experiment. The proposed method obtained the lowest misclassification level on most of the datasets, except for the “breadtoycar” and “dinobooks” data, and even obtained zero misclassification on two of the datasets. The corresponding inlier segmentation results for six selected datasets are presented in Figure 13, from which we can see that most of the inliers of the different fundamental matrix models could be segmented quite accurately, except for the “dinobooks” data, where many inliers of model 1 (red points) were classified as outliers (Figure 11l), and the inlier segmentation result was poor. For the “dinobooks” data, the proportion of outliers (43%) is very high, and eight points need to be randomly selected each time to generate a fundamental matrix hypothesis. The MSS therefore has a high probability of containing outliers when random sampling is performed within subregions divided evenly by the Euclidean distance of the data points. This makes the proportion of good hypotheses very low and seriously impacts the performance of the quantized residual preferences, thus resulting in poor outlier detection. Since the outlier detection result directly affects the final inlier segmentation accuracy, improving the sampling method will help to solve the problem; therefore, we will consider introducing preference analysis into the sampling process in further study.
In the experiments, during the region-based random sampling, the size of the subregions was set to 20, which means that 20 nearest neighboring points make up a subregion, and 200 hypotheses were randomly sampled in each subregion. For outlier detection, the common residuals of good hypotheses need to be quantized to close values, so the quantization level θ should be small; meanwhile, the large residuals, which most likely belong to the outliers, need to be strongly separated from the inliers’ residuals, so the quantization length λ is usually set to one. Most of the time, the parameters θ and λ are closely related to the type of model. In two-view plane segmentation (multihomography estimation), a quantization level θ = 20 and quantization length λ = 1 give quite good results for all the data, while in two-view motion segmentation (multifundamental matrix estimation), a quantization level θ = 200 and quantization length λ = 1 are suitable.
Techniques such as PEARL, J-linkage, T-linkage, and Prog-X address outliers by setting thresholds for key parameters. These methods encounter difficulties in achieving high levels of accuracy in data fitting tasks that involve outliers. However, strategies that exploit the consistency of the outlier distribution to eliminate outliers and subsequently perform inlier segmentation, such as CLSA and the method proposed in this study, have shown promise for enhancing accuracy. In contrast to the CLSA approach, which depends on the distribution characteristics of the outliers in latent semantic space to eliminate outliers and segment inliers, our method integrates the joint distribution of the outliers in quantized residual preference space with the spatial distribution of the data points. This technique utilizes an optimization framework based on energy function minimization, thus leading to enhanced precision under specific circumstances.

4. Conclusions

In this paper, we have extended energy minimization to quantized residual preference space for outlier detection. Generally, when the sampled hypotheses contain a great number of good hypotheses close to the real models, the consensus of the outliers becomes evident in quantized residual preference space, i.e., the outliers gather away from the inliers in that space. To make good use of this outlier consensus, we constructed the neighborhood graph for the energy minimization in quantized residual preference space and adjusted the data costs for the outlier and inlier labels according to the statistical residuals from pseudo-outliers. Following the alternate sampling and labeling framework, the outliers can be well detected. After removing the outliers, a conventional energy minimization process for inlier segmentation is conducted based on the neighborhood graph constructed from the Delaunay triangulation of the data points. Both the energy minimization for outlier detection and that for inlier segmentation are integrated into an alternating sampling and labeling framework.
The experimental results show that the proposed energy minimization-based outlier detection method can successfully detect most of the outliers in the data. The proposed method can separate the inliers for different models, and it outperformed the state-of-the-art methods in geometric multimodel fitting.
The algorithm’s performance is influenced by random sampling, despite efforts to mitigate its effects through region-based random sampling and the alternate sampling and labeling framework. When dealing with a substantial volume of data or a high proportion of outliers, the sampling ratio of the true models decreases, thus leading to a corresponding decrease in the algorithm’s effectiveness. Given that our approach entails segmenting the original data points regionally, computing the distances between quantized residual vectors, and engaging in an iterative optimization process, an abundance of data points can result in prolonged processing times and suboptimal real-time performance.
Based on this, our future work will focus on improving the efficiency of model sampling and enhancing the outlier consistency calculation. By adopting a strategy similar to Prog-X, we will comprehensively consider the constraints of data volume and time, increase the sampling ratio of true models, and ensure the uniform sampling of each true model. We will enhance the existing distance calculation strategy by combining parallel data processing methods to improve the speed of the distance calculation. At the same time, we will explore sampling in the model parameter space and new data representation methods, hoping to find new ideas from the parameter space to improve the existing methods.

Author Contributions

Conceptualization, Y.Z. and B.L.; methodology, Y.Z.; software, X.Z.; validation, X.Z. and Y.Z.; formal analysis, S.W.; investigation, Y.Z.; resources, Y.Z. and B.Y.; data curation, Y.Z. and X.Z.; writing—original draft preparation, Y.Z.; writing—review and editing, B.L. and L.Z.; supervision, B.L. and L.Z.; project administration, B.Y. and B.L.; funding acquisition, B.Y. and L.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by China National Nuclear Power Co., Ltd. Concentrated R&D Project K231201 and CNNC Wuhan Nuclear Power Operation Technology Co., Ltd. Youth Innovation Fund Project B241214.

Data Availability Statement

The data involved in the experiments are all publicly available and properly referenced in the paper. The source code of the proposed method in the paper can be shared upon request.

Conflicts of Interest

Author Yun Zhang and Bin Yang were employed by the company CNNC Wuhan Nuclear Power Operation Technology Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Wong, H.S.; Chin, T.J.; Yu, J.; Suter, D. Dynamic and hierarchical multi-structure geometric model fitting. In Proceedings of the International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 1044–1051. [Google Scholar]
  2. Fischler, M.A.; Bolles, R.C. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 1981, 24, 381–395. [Google Scholar] [CrossRef]
  3. Choi, S.; Kim, T.; Yu, W. Performance evaluation of RANSAC family. In Proceedings of the British Machine Vision Conference, BMVC 2009, London, UK, 7–10 September 2009. [Google Scholar]
  4. Vincent, E.; Laganiére, R. Detecting planar homographies in an image pair. In Proceedings of the 2nd International Symposium on Image and Signal Processing and Analysis, 19–21 June 2001; pp. 182–187. [Google Scholar]
  5. Kanazawa, Y.; Kawakami, H. Detection of Planar Regions with Uncalibrated Stereo using Distributions of Feature Points. In Proceedings of the BMVC, Kingston, UK, 7–9 September 2004; pp. 1–10. [Google Scholar]
  6. Zuliani, M.; Kenney, C.S.; Manjunath, B. The multiransac algorithm and its application to detect planar homographies. In Proceedings of the IEEE International Conference on Image Processing 2005, Genova, Italy, 14 September 2005; Volume 3, p. III–153. [Google Scholar]
  7. Stewart, C.V. Bias in robust estimation caused by discontinuities and multiple structures. IEEE Trans. Pattern Anal. Mach. Intell. 1997, 19, 818–833. [Google Scholar] [CrossRef]
  8. Wong, H.S.; Chin, T.; Yu, J.; Suter, D. Mode seeking over permutations for rapid geometric model fitting. Pattern Recognit. 2013, 46, 257–271. [Google Scholar] [CrossRef]
  9. Toldo, R.; Fusiello, A. Robust Multiple Structures Estimation with J-Linkage. In Proceedings of the European Conference on Computer Vision, Marseille, France, 12–18 October 2008; pp. 537–547. [Google Scholar]
  10. Toldo, R.; Fusiello, A. Real-time incremental j-linkage for robust multiple structures estimation. In Proceedings of the International Symposium on 3D Data Processing, Visualization and Transmission (3DPVT), Paris, France, 17–20 May 2010; Volume 1, p. 6. [Google Scholar]
  11. Magri, L.; Fusiello, A. T-linkage: A continuous relaxation of j-linkage for multi-model fitting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 3954–3961. [Google Scholar]
  12. Magri, L.; Fusiello, A. Multiple Model Fitting as a Set Coverage Problem. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 3318–3326. [Google Scholar]
  13. Magri, L.; Fusiello, A. Multiple structure recovery via robust preference analysis. Image Vis. Comput. 2017, 67, 1–15. [Google Scholar] [CrossRef]
  14. Magri, L.; Fusiello, A. Robust Multiple Model Fitting with Preference Analysis and Low-rank Approximation. In Proceedings of the British Machine Vision Conference, Swansea, UK, 7–10 September 2015. [Google Scholar]
  15. Chin, T.J.; Wang, H.; Suter, D. Robust fitting of multiple structures: The statistical learning approach. In Proceedings of the IEEE International Conference on Computer Vision, Kyoto, Japan, 29 September–2 October 2009; pp. 413–420. [Google Scholar]
  16. Chin, T.J.; Wang, H.; Suter, D. The ordered residual kernel for robust motion subspace clustering. In Proceedings of the International Conference on Neural Information Processing Systems, Bangkok, Thailand, 1–5 December 2009; pp. 333–341. [Google Scholar]
  17. Xiao, G.; Wang, H.; Yan, Y.; Zhang, L. Mode seeking on graphs for geometric model fitting via preference analysis. Pattern Recognit. Lett. 2016, 83, 294–302. [Google Scholar] [CrossRef]
  18. Wong, H.S.; Chin, T.J.; Yu, J.; Suter, D. A simultaneous sample-and-filter strategy for robust multi-structure model fitting. Comput. Vis. Image Underst. 2013, 117, 1755–1769. [Google Scholar] [CrossRef]
  19. Pham, T.T.; Chin, T.J.; Yu, J.; Suter, D. The Random Cluster Model for robust geometric fitting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 710–717. [Google Scholar]
  20. Chin, T.J.; Yu, J.; Suter, D. Accelerated Hypothesis Generation for Multistructure Data via Preference Analysis. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 625. [Google Scholar] [CrossRef] [PubMed]
  21. Torr, P.H.S.; Zisserman, A. MLESAC: A New Robust Estimator with Application to Estimating Image Geometry. Comput. Vis. Image Underst. 2000, 78, 138–156. [Google Scholar] [CrossRef]
  22. Lai, T.; Wang, W.; Liu, Y.; Li, Z.; Lin, S. Robust model estimation by using preference analysis and information theory principles. Appl. Intell. 2023, 53, 22363–22373. [Google Scholar] [CrossRef]
  23. Isack, H.; Boykov, Y. Energy-Based Geometric Multi-model Fitting. Int. J. Comput. Vis. 2012, 97, 123–147. [Google Scholar] [CrossRef]
  24. Delong, A.; Osokin, A.; Isack, H.N.; Boykov, Y. Fast Approximate Energy Minimization with Label Costs. Int. J. Comput. Vis. 2012, 96, 1–27. [Google Scholar] [CrossRef]
  25. Pham, T.T.; Chin, T.J.; Schindler, K.; Suter, D. Interacting geometric priors for robust multimodel fitting. IEEE Trans. Image Process. 2014, 23, 4601–4610. [Google Scholar] [CrossRef] [PubMed]
  26. Isack, H.N.; Boykov, Y. Energy Based Multi-model Fitting & Matching for 3D Reconstruction. In Proceedings of the CVPR, Columbus, OH, USA, 23–28 June 2014; pp. 1–4. [Google Scholar]
  27. Barath, D.; Matas, J. Multi-class model fitting by energy minimization and mode-seeking. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 221–236. [Google Scholar]
  28. Barath, D.; Matas, J. Progressive-x: Efficient, anytime, multi-model fitting algorithm. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3780–3788. [Google Scholar]
  29. Boykov, Y.; Veksler, O.; Zabih, R. Fast approximate energy minimization via graph cuts. IEEE Trans. Pattern Anal. Mach. Intell. 2001, 23, 1222–1239. [Google Scholar] [CrossRef]
  30. Zhang, C.; Lu, X.; Hotta, K.; Yang, X. G2MF-WA: Geometric multi-model fitting with weakly annotated data. Comput. Vis. Media 2020, 6, 135–145. [Google Scholar] [CrossRef]
  31. Barath, D.; Matas, J. Graph-cut RANSAC: Local optimization on spatially coherent structures. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 4961–4974. [Google Scholar] [CrossRef] [PubMed]
  32. Purkait, P.; Chin, T.J.; Ackermann, H.; Suter, D. Clustering with Hypergraphs: The Case for Large Hyperedges. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; pp. 672–687. [Google Scholar]
  33. Wang, H.; Xiao, G.; Yan, Y.; Suter, D. Mode-Seeking on Hypergraphs for Robust Geometric Model Fitting. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 2902–2910. [Google Scholar]
  34. Xiao, G.; Wang, H.; Lai, T.; Suter, D. Hypergraph modelling for geometric model fitting. Pattern Recognit. 2016, 60, 748–760. [Google Scholar] [CrossRef]
  35. Lee, K.; Meer, P.; Park, R. Robust adaptive segmentation of range images. IEEE Trans. Pattern Anal. Mach. Intell. 1998, 20, 200–205. [Google Scholar]
  36. Babhadiashar, A.; Suter, D. Robust segmentation of visual data using ranked unbiased scale estimate. Robotica 1999, 17, 649–660. [Google Scholar] [CrossRef]
  37. Wang, H.; Suter, D. Robust adaptive-scale parametric model estimation for computer vision. IEEE Trans. Pattern Anal. Mach. Intell. 2004, 26, 1459–1474. [Google Scholar] [CrossRef]
  38. Fan, L. Robust Scale Estimation from Ensemble Inlier Sets for Random Sample Consensus Methods. In Proceedings of the European Conference on Computer Vision, Marseille, France, 12–18 October 2008; pp. 182–195. [Google Scholar]
  39. Toldo, R.; Fusiello, A. Automatic Estimation of the Inlier Threshold in Robust Multiple Structures Fitting. In Proceedings of the International Conference on Image Analysis and Processing, Las Vegas, NV, USA, 13–16 July 2009; pp. 123–131. [Google Scholar]
  40. Raguram, R.; Frahm, J.M. RECON: Scale-adaptive robust estimation via Residual Consensus. In Proceedings of the International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 1299–1306. [Google Scholar]
  41. Wang, H.; Chin, T.; Suter, D. Simultaneously Fitting and Segmenting Multiple-Structure Data with Outliers. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 1177–1192. [Google Scholar] [CrossRef]
  42. Zhao, Q.; Zhang, Y.; Qin, Q.; Luo, B. Quantized residual preference based linkage clustering for model selection and inlier segmentation in geometric multi-model fitting. Sensors 2020, 20, 3806. [Google Scholar] [CrossRef] [PubMed]
  43. Zhao, X.; Zhang, Y.; Xie, S.; Qin, Q.; Wu, S.; Luo, B. Outlier detection based on residual histogram preference for geometric multi-model fitting. Sensors 2020, 20, 3037. [Google Scholar] [CrossRef] [PubMed]
  44. Nasuto, D.; Craddock, J.B.R. Napsac: High noise, high dimensional robust estimation-it’s in the bag. In Proceedings of the British Machine Vision Conference 2002, Cardiff, UK, 2–5 September 2002; pp. 458–467. [Google Scholar]
  45. Ni, K.; Jin, H.; Dellaert, F. GroupSAC: Efficient consensus in the presence of groupings. In Proceedings of the 2009 IEEE 12th International Conference on Computer Vision, Kyoto, Japan, 29 September–2 October 2009; pp. 2193–2200. [Google Scholar]
  46. Sattler, T.; Leibe, B.; Kobbelt, L. SCRAMSAC: Improving RANSAC’s efficiency with a spatial consistency filter. In Proceedings of the 2009 IEEE 12th International Conference on Computer Vision, Kyoto, Japan, 29 September–2 October 2009; pp. 2090–2097. [Google Scholar]
  47. Hartley, R.; Zisserman, A. Multiple View Geometry in Computer Vision; Cambridge University Press: Cambridge, UK, 2003. [Google Scholar]
  48. Hartley, R.I. In defense of the eight-point algorithm. IEEE Trans. Pattern Anal. Mach. Intell. 1997, 19, 580–593. [Google Scholar] [CrossRef]
  49. Tron, R.; Vidal, R. A Benchmark for the Comparison of 3-D Motion Segmentation Algorithms. In Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’07), Minneapolis, MN, USA, 17–22 June 2007; pp. 1–8. [Google Scholar]
  50. Xiao, G.; Wang, H.; Ma, J.; Suter, D. Segmentation by continuous latent semantic analysis for multi-structure model fitting. Int. J. Comput. Vis. 2021, 129, 2034–2056. [Google Scholar] [CrossRef]
  51. Mittal, S.; Anand, S.; Meer, P. Generalized Projection-Based M-Estimator. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 2351–2364. [Google Scholar] [CrossRef]
Figure 1. An example of the problem addressed in this paper. The image shows a pair of two-view matching images, containing two planar structures (red region and green region). Blue, red, and green points represent matching points, where blue points indicate outliers. The green and red points respectively represent the inliers of Homography 1 and Homography 2.
Figure 2. The flowchart of our method.
Figure 3. The flowchart of outlier detection.
Figure 4. Schematic diagram of region-based random sampling.
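Figure 4 names the region-based random sampling step, but the caption alone does not spell out how a local minimal subset is drawn. The sketch below is one possible reading, assuming that a seed correspondence is chosen at random and the remaining members of the minimal subset are drawn from its spatial neighborhood (in the spirit of the NAPSAC/GroupSAC strategies cited above); the helper name `region_based_sample`, the neighborhood size, and the subset size are illustrative assumptions, not the exact sampling rule of this paper.

```python
# Minimal sketch of region-based random sampling (illustrative assumptions only).
import numpy as np
from scipy.spatial import cKDTree

def region_based_sample(points: np.ndarray, subset_size: int = 4,
                        neighborhood: int = 20, rng=None) -> np.ndarray:
    """Return indices of one minimal subset sampled from a local region."""
    rng = np.random.default_rng() if rng is None else rng
    tree = cKDTree(points)
    seed = rng.integers(len(points))
    # Indices of the seed's nearest neighbors (including the seed itself).
    _, nbrs = tree.query(points[seed], k=min(neighborhood, len(points)))
    # Draw the minimal subset from that neighborhood without replacement.
    return rng.choice(nbrs, size=subset_size, replace=False)

# Toy usage: 200 random 2-D points; four correspondences suffice to hypothesize a homography.
pts = np.random.default_rng(1).random((200, 2))
print(region_based_sample(pts, subset_size=4))
```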
Figure 5. Quantized residual preference MDS plot for the “johnsona” data. (a) Mismatched points as outliers for multi-homography estimation in “johnsona”. (b) Quantized residual preference MDS plot of the hypotheses.
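For readers who want to reproduce a plot in the spirit of Figure 5b, the sketch below embeds quantized residual preference vectors into two dimensions with classical MDS on a precomputed distance matrix. The quantization rule, the number of levels, and the Euclidean distance used here are illustrative assumptions rather than the exact settings of the method; real residuals of points to sampled hypotheses would replace the synthetic ones.

```python
# Minimal sketch of a 2-D MDS embedding of quantized residual preferences (assumed settings).
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import MDS

rng = np.random.default_rng(0)

# Toy data: residuals of N points to M model hypotheses (replace with real residuals).
N, M = 100, 50
residuals = np.abs(rng.normal(size=(N, M)))

# Quantize residuals into a small number of levels (assumed quantization rule).
levels = 8
edges = np.linspace(residuals.min(), residuals.max(), levels + 1)[1:-1]
preferences = np.digitize(residuals, edges)          # N x M integer preference vectors

# Pairwise distances between preference vectors (Euclidean here as an assumption).
D = squareform(pdist(preferences, metric="euclidean"))

# Classical MDS on the precomputed distances; points with similar preferences
# (e.g., outliers with uniformly high quantized residuals) cluster together.
embedding = MDS(n_components=2, dissimilarity="precomputed",
                random_state=0).fit_transform(D)
print(embedding.shape)  # (N, 2) coordinates for a 2-D scatter plot
```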
Figure 6. The flowchart of inlier segmentation.
Figure 7. Outliers in “Merton College II” for two-view plane segmentation. The image shows a pair of two-view matching images for plane segmentation. The solid black dots represent matched points between the two images, while the points marked with red circles indicate true outliers.
Figure 8. Detected outliers in “Merton College II” by kernel fitting and the proposed method. For convenience, only the results in Image 1 are displayed. As in Figure 7, black points represent all matched points, green points represent true outliers, points marked with blue rectangles indicate the detection results of Kernel Fitting, and points marked with red triangles denote the detection results of the proposed method.
Figure 9. Outliers in “cars5” for two-view motion segmentation. The image shows a pair of two-view matching images for motion segmentation. The solid yellow dots represent matched points between the two images, while the points marked with red circles indicate true outliers.
Figure 10. Detected outliers in “cars5” by kernel fitting and the proposed method. Green points represent true outliers, points marked with blue rectangles indicate the detection results of Kernel Fitting, and points marked with red triangles denote the detection results of the proposed method.
Figure 11. Outlier detection results for both plane segmentation and motion segmentation. (a) johnsona. (b) johnsonb. (c) ladysymon. (d) neem. (e) oldclassicswing. (f) sene. (g) biscuitbookbox. (h) breadcartoychips. (i) breadcubechips. (j) breadtoycar. (k) carchipscube. (l) dinobooks.
Figure 12. Inlier segmentation results for two-view plane segmentation. (a) johnsona. (b) johnsonb. (c) ladysymon. (d) neem. (e) oldclassicswing. (f) sene.
Figure 13. Inlier segmentation results for two-view motion segmentation. (a) biscuitbookbox. (b) breadcartoychips. (c) breadcubechips. (d) breadtoycar. (e) carchipscube. (f) dinobooks.
Table 1. Outlier detection results.
Data | Total Points | Total Outliers | Kernel Fitting (Correct / Missing / False) | Proposed (Correct / Missing / False)
two-view plane segmentation
johnsona | 373 | 78 | 70 / 8 / 0 | 75 / 3 / 0
johnsonb | 649 | 78 | 63 / 15 / 33 | 71 / 7 / 0
ladysymon | 273 | 77 | 70 / 7 / 0 | 76 / 1 / 0
neem | 241 | 88 | 88 / 0 / 4 | 88 / 0 / 0
oldclassicswing | 379 | 123 | 123 / 0 / 0 | 123 / 0 / 0
sene | 250 | 118 | 106 / 12 / 3 | 117 / 1 / 0
two-view motion segmentation
biscuitbookbox | 259 | 97 | 90 / 7 / 3 | 97 / 0 / 0
breadcartoychips | 237 | 82 | 76 / 6 / 0 | 81 / 1 / 0
breadcubechips | 230 | 81 | 69 / 12 / 4 | 80 / 1 / 0
breadtoycar | 166 | 56 | 43 / 13 / 1 | 53 / 3 / 0
carchipscube | 165 | 60 | 52 / 8 / 0 | 60 / 0 / 0
dinobooks | 360 | 155 | 128 / 27 / 25 | 151 / 4 / 41
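Reading Table 1, the Correct, Missing, and False counts can be interpreted as true positives, false negatives, and false positives of outlier detection (our interpretation of the column names), so detection recall and precision follow directly from the counts, as in the short helper below.

```python
# Helper for interpreting the Correct / Missing / False counts in Table 1
# (assumes they correspond to true positives, false negatives, and false positives).
def detection_scores(correct: int, missing: int, false: int) -> tuple:
    recall = correct / (correct + missing) if correct + missing else 0.0
    precision = correct / (correct + false) if correct + false else 0.0
    return recall, precision

# Example: the "johnsona" row (78 true outliers; proposed method: 75 correct, 3 missing, 0 false).
print(detection_scores(75, 3, 0))   # -> (0.9615..., 1.0)
```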
Table 2. Misclassification (%) for two-view plane segmentation. The numbers highlighted in bold represent the minimum or equivalent minimum results.
Methods | PEARL | J-Linkage | T-Linkage | SA-RCM | Prog-X | CLSA | Proposed
johnsona | 4.02 | 5.07 | 4.02 | 5.90 | 5.07 | 6.00 | 1.61
johnsonb | 18.18 | 18.33 | 18.33 | 17.95 | 6.12 | 20.0 | 3.39
ladysymon | 5.49 | 9.25 | 5.06 | 7.17 | 3.92 | 1.00 | 2.11
neem | 5.39 | 3.73 | 3.73 | 5.81 | 6.75 | 1.00 | 0.83
oldclassicswing | 1.58 | 0.27 | 0.26 | 2.11 | 0.52 | 0.00 | 0.26
sene | 0.80 | 0.84 | 0.40 | 0.80 | 0.40 | 0.00 | 0.4
Table 3. Misclassification (%) for two-view motion segmentation. The numbers highlighted in bold represent the minimum or equivalent minimum results.
Methods | PEARL | J-Linkage | T-Linkage | SA-RCM | Prog-X | CLSA | Proposed
biscuitbookbox | 4.25 | 1.55 | 1.54 | 7.04 | 3.11 | 1.00 | 0
breadcartoychips | 5.91 | 11.26 | 3.37 | 4.81 | 2.87 | 5.00 | 0.42
breadcubechips | 4.78 | 3.04 | 0.86 | 7.85 | 1.33 | 1.00 | 0.43
breadtoycar | 6.63 | 5.49 | 4.21 | 3.82 | 3.06 | 0.00 | 1.81
carchipscube | 11.82 | 4.27 | 1.81 | 11.75 | 13.90 | 3.00 | 0
dinobooks | 14.72 | 17.11 | 9.44 | 8.03 | 7.66 | 10.00 | 12.50
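The misclassification percentages in Tables 2 and 3 are, under a common convention for multi-model fitting that we assume here, the fraction of points whose estimated label disagrees with the ground truth after the best one-to-one matching between estimated and true structures (with outliers treated as one extra label). The sketch below illustrates that computation; it is an assumed definition for illustration, not necessarily the exact protocol behind the tables.

```python
# Sketch of a misclassification rate via optimal label matching (assumed metric definition).
import numpy as np
from scipy.optimize import linear_sum_assignment

def misclassification_rate(gt: np.ndarray, pred: np.ndarray) -> float:
    gt_labels, pred_labels = np.unique(gt), np.unique(pred)
    # Confusion matrix: rows = ground-truth labels, columns = predicted labels.
    conf = np.array([[np.sum((gt == g) & (pred == p)) for p in pred_labels]
                     for g in gt_labels])
    # Hungarian matching maximizes the number of correctly matched points.
    row, col = linear_sum_assignment(-conf)
    correct = conf[row, col].sum()
    return 100.0 * (1.0 - correct / len(gt))

# Toy usage with 0 as the outlier label and 1, 2 as structure labels.
gt   = np.array([1, 1, 1, 2, 2, 2, 0, 0])
pred = np.array([2, 2, 1, 1, 1, 1, 0, 0])
print(f"{misclassification_rate(gt, pred):.2f}%")  # 12.50%
```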