1. Introduction
Computed tomography (CT), derived from X-ray imaging technology, emerged in the 1970s. CT uses high-energy rays to penetrate an object, acquires projections from multiple angles, and reconstructs them with various algorithms to obtain tomographic information about the object [1]. CT offers high spatial resolution and easy visualization and plays a significant role in the medical field. From the late 1970s to the 1990s, with the rapid development of digital image processing technology and microcomputer systems, CT technology expanded from the medical field to industrial fields such as non-destructive testing, reverse engineering, and material organization analysis. It has also been widely used in agriculture, forestry, and geophysics [2], and it remains a popular research topic.
CT has developed especially rapidly in the field of industrial measurement owing to its non-contact and penetrating measurement characteristics, which avoid the contact deformation that occurs with three-dimensional coordinate measuring machines [3] and enable the measurement of the internal information of objects [4]. Spherical objects are an important type of object in CT measurement because they are often used to construct ball-plate standards to trace and calibrate industrial CT instruments, as well as in point cloud registration and other tasks. The sphere radius and sphere center coordinates are the key measurement parameters, and the CT measurement process for spherical objects is relatively intuitive. A spherical object is placed at the rotation center of the CT turntable to obtain projection data. After processing with a three-dimensional reconstruction algorithm and a gridding algorithm, the point cloud data of the spherical object are obtained. The point cloud data are then processed to obtain measurement parameters such as the sphere center coordinates and radius. Researchers have proposed a variety of fitting methods for point cloud targets, which can largely be divided into two categories, namely, the least squares (LS) and numerical optimization (NO) methods [5].
LS is a mathematical optimization modeling method that searches for the best function match for given data by minimizing the sum of squared errors to achieve optimal parameter estimation. In 2005, Reshetyuk [6] used the defining equation of a sphere in three-dimensional space combined with the LS method to construct a linear equation system and obtained fitting parameters by solving this equation system. Subsequently, Lu [7] applied the total LS (TLS) algorithm to spherical point cloud fitting. In addition to minimizing the sum of the perpendicular distances from the points to the surface of the sphere, the TLS algorithm also attempts to minimize the sum of the squared residuals of each point. Le [8] used the truncated LS method to solve the robustness problem of data fitting. The truncated LS method introduces a truncation function to handle outliers, thereby improving the robustness of fitting. In 2015, Bektas [9] used the orthogonal distance LS method to fit an ellipsoid, accounting for the actual distance from each sample point to the fitting surface, rather than simply summing the squared errors of all points. Subsequently, Liu [10] proposed an algorithm combining nonlinear LS (NLS) and AD-random sample consensus (RANSAC) to detect spherical targets and achieve rapid fitting of such targets in mobile terrestrial laser scanning point cloud data. Wu [11] combined RANSAC with the LS method and proposed PC-RANSAC, which first calculates the principal curvature of point cloud data to implement a principal curvature constraint on random sampling points, and then uses the LS method to estimate geometric parameters. Tao [12] proposed a constrained weighted LS algorithm to search for the best plane to achieve iterative closest point registration, diversifying the application of fitting methods in the field of point cloud registration. Fei [13] proposed a five-parameter constrained LS method for small-angle spherical surface fragments. In this method, even if the segment angle approaches 1°, a certain degree of accuracy is guaranteed. In recent research, Li [14] proposed an adaptive cylindrical fitting algorithm based on robust principal component analysis (RPCA). This algorithm was experimentally validated for its effectiveness and robustness across various datasets, demonstrating superior performance in handling noisy data. Zong [15] introduced a three-dimensional line fitting algorithm based on recursive weighted LS, aimed at significantly reducing the impact of boundary noise on the fitting results. Meanwhile, Zheng [16] developed a point cloud data processing method based on iterative residual fitting. This method first performs LS fitting on all points to calculate the residual values after the initial fitting, and then selects the points with residuals below a specified threshold to form a new point cloud. Subsequently, this new point cloud undergoes a second LS fitting, and the residual values from this second fitting are calculated. This process is repeated until the conditions for terminating the iteration are met, thereby achieving precise processing of the point cloud data. In short, the goal of the LS method and its variants is to minimize the error between model prediction values and observation values to obtain a more accurate geometric parameter solution.
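The linear LS formulation and the iterative residual refit described above can be illustrated with a minimal sketch. It is not the implementation used in any of the cited works: the function names, the residual definition (absolute radial distance to the fitted surface), and the stopping rule are assumptions chosen for illustration. The sphere equation (x-a)^2 + (y-b)^2 + (z-c)^2 = r^2 is rearranged into a system that is linear in a, b, c, and d = r^2 - a^2 - b^2 - c^2.

```python
import numpy as np

def fit_sphere_linear(points):
    """Algebraic LS sphere fit; returns (center, radius)."""
    pts = np.asarray(points, dtype=float)
    # Linearized sphere equation: 2ax + 2by + 2cz + d = x^2 + y^2 + z^2
    A = np.column_stack([2.0 * pts, np.ones(len(pts))])
    b = np.sum(pts**2, axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = sol[:3]
    radius = np.sqrt(sol[3] + center @ center)
    return center, radius

def fit_sphere_iterative(points, threshold, max_iter=10):
    """Refit repeatedly, keeping only points whose radial residual
    is below `threshold` (hypothetical stopping rule: no point removed)."""
    pts = np.asarray(points, dtype=float)
    center, radius = fit_sphere_linear(pts)
    for _ in range(max_iter):
        residuals = np.abs(np.linalg.norm(pts - center, axis=1) - radius)
        kept = pts[residuals < threshold]
        # Stop when too few points remain or nothing was rejected.
        if len(kept) < 4 or len(kept) == len(pts):
            break
        pts = kept
        center, radius = fit_sphere_linear(pts)
    return center, radius
```

A single pass of `fit_sphere_linear` is sensitive to outliers because every point contributes to the squared-error sum, which is the motivation for the residual-based refit.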
The NO approach aims to maximize or minimize a specific evaluation criterion by identifying optimal model parameters [17]. Many estimation algorithms can be used to solve the spherical target fitting (STF) problem. Based on whether there are feature constraints in the estimation algorithm, NO approaches can be divided into two categories: ordinary NO and feature-constrained NO [5]. Three-dimensional point cloud models typically contain a large amount of point information, and performing a three-dimensional Hough transform consumes significant computing resources and time. Additionally, owing to the sensitivity of the Hough transform to noise and its poor robustness [18,19], it is not widely used in three-dimensional point cloud fitting. In 2007, Schnabel [20] used RANSAC to detect the shapes of unorganized point clouds. Specifically, RANSAC was used to derive a geometric model by sampling within a set number of iterations and using a predefined threshold to judge whether points meet certain requirements to extract an optimal model [21,22]. Subsequently, Schnabel [23,24] used the MLESAC and M-estimator sample consensus (MSAC) algorithms to detect point clouds of various shapes. MSAC is a model based on M-estimation that aims to obtain the best fit in each iteration, whereas MLESAC is based on maximum likelihood estimation. Kang [25] used threshold-independent Bayesian sampling (BAYSAC) to implement plane detection. Specifically, they used point cloud downsampling to retain feature boundaries and BAYSAC to fit planes [26,27]. Shi [28] proposed a limited random search algorithm based on point cloud data and the geometric features of spherical targets. Using parameter estimation, this method realizes STF from a probabilistic statistical perspective. Additionally, Shi [29] proposed an adaptive grid search algorithm that fully leverages the geometric features of spherical targets and obtains the best fitting parameters through a limited number of iterative optimizations. Bin [30] proposed a nonlinear Gauss–Helmert (NGH) model to describe the mathematical model of point cloud formation and then proposed a novel point cloud fitting method based on the robust NGH model. Unlike previous approaches, this fitting method accounts for all random errors in various linear and nonlinear fitting problems, which can effectively reduce the influence of interference points. Shu [31] proposed an improved RANSAC algorithm that integrates Newton’s iterative method for precise cylindrical fitting of rebar and corrugated pipes. This method enhances the convergence speed and fitting accuracy by introducing adaptive thresholds and constraints on the minimum number of inliers. Zhang [32] developed a curvature consistency sphere detection (CCSD) algorithm for sphere recognition in light detection and ranging (LiDAR) point clouds. The CCSD algorithm employs RANSAC for sphere fitting during the final detection stage. It also introduces a mixed voting mechanism to calculate the observation error of candidate spheres based on curvature consistency, inlier support rate, and size deviation. This improves the accuracy of sphere fitting. Singh [33] utilized an enhanced progressive sample consensus (PROSAC) algorithm for plane fitting of point cloud data to assess terrain complexity. PROSAC is an improved version of RANSAC that enhances the efficiency of the algorithm by incorporating prior knowledge to guide the sampling process. Furthermore, Li [34] proposed a method for extracting pavement cracks based on three-dimensional laser point clouds. This method combines the MSAC algorithm with the K-nearest neighbor (KNN) algorithm. Initially, the MSAC algorithm fits the preprocessed point cloud to a plane. Then, based on the characteristic that crack points are primarily located below the pavement, it separates the crack point cloud from the pavement texture points. In summary, the main goal of the NO approach is to vary the sampling method or searching mode within a limited number of iterations to obtain an optimal solution.
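The basic RANSAC loop described above (random sampling within a set number of iterations, with a predefined threshold deciding which points support a candidate model) can be sketched for a sphere as follows. This is a generic illustration under stated assumptions, not the procedure of any cited work: the function names, the four-point minimal sample, the radial-distance inlier test, and the final LS refit on the best inlier set are all choices made here.

```python
import numpy as np

def sphere_from_points(pts):
    """Sphere through >= 4 points via the linearized sphere equation."""
    A = np.column_stack([2.0 * pts, np.ones(len(pts))])
    b = np.sum(pts**2, axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = sol[:3]
    r2 = sol[3] + center @ center
    radius = np.sqrt(r2) if r2 > 0 else np.nan
    return center, radius

def ransac_sphere(points, threshold, n_iter=200, seed=0):
    """Return (center, radius, inlier_mask) of the best-supported sphere."""
    pts = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    best_inliers = None
    for _ in range(n_iter):
        # Minimal sample: four points determine a sphere.
        sample = pts[rng.choice(len(pts), size=4, replace=False)]
        center, radius = sphere_from_points(sample)
        if not np.isfinite(radius):
            continue
        # A point supports the model if it lies close to the surface.
        residuals = np.abs(np.linalg.norm(pts - center, axis=1) - radius)
        inliers = residuals < threshold
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers is None or best_inliers.sum() < 4:
        raise ValueError("no valid sphere model found")
    # Refit with LS on the consensus set.
    center, radius = sphere_from_points(pts[best_inliers])
    return center, radius, best_inliers
```

Fixing `seed` makes a run repeatable; with a different seed the sampled subsets, and therefore the result, can change, which is the randomness that the variants above try to control.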
Additionally, Burkhard [35] proposed a method for fitting the projected centers of spheres in cone beam X-ray imaging. To model the edge of the sphere with sub-pixel resolution, the tangent circle of the sphere was decomposed three times. Compared with elliptical target fitting, this method reduces the number of unknown variables and has advantages for numerical calculation. Hong [36] combined the random forest model with machine learning, and this method demonstrated excellent performance in terms of fitting accuracy and generalization capability.
In summary, LS and its variants and NO algorithms are all effective ways to fit point clouds. However, these algorithms still have shortcomings in terms of accuracy and robustness. LS obtains fitting parameters by solving a system of linear equations. When abnormal points are present, the solution obtained deviates significantly from the actual value. Additionally, when an equation takes the x and y coordinates of point cloud data as independent variables, it only considers the error of the observation vector (i.e., vector z), which means that it ignores the error in the x and y coordinates in the coefficient matrix. The main shortcoming of the NO approach is that some algorithms sample within a limited number of iterations. If a more accurate model is to be obtained, an exact threshold and a larger number of iterations are required. RANSAC is a robust and commonly used algorithm; however, owing to the randomness of sampling, it is difficult to guarantee the accuracy of results.
To overcome the limitations described above, this paper proposes a spherical point cloud fitting algorithm (PK-RANSAC) based on projection filtering and K-means clustering. We selected several existing algorithms for comparison with PK-RANSAC. Experimental results demonstrate that the proposed algorithm combines the advantages of LS and NO algorithms, can improve both robustness and accuracy, and yields better fitting parameters.
4. Discussion
In this section, we compare five algorithms based on the three experiments described in the previous section: the LS method, RANSAC, improved RANSAC (PC-RANSAC), TLS, and the proposed PK-RANSAC. The LS algorithm minimizes total error by constructing a linear equation system to obtain fitting parameters. Although this approach is simple and easy to implement, it is easily affected by outliers and noise, resulting in poor fitting results. RANSAC is an efficient iterative method that can effectively fit model parameters under conditions with more noise and outliers. PC-RANSAC combines the advantages of LS and RANSAC to improve fitting accuracy further. TLS, which in this comparison refers to the truncated LS method, introduces a truncation function into LS and enhances robustness by effectively handling outliers.
The fitting results for the simulation data are presented in Figure 17. To show the performance differences between the algorithms clearly, the deviation values are plotted on the ordinate as logarithmic absolute values. This visualization amplifies differences in deviation, highlighting slight differences in algorithm performance. The larger the ordinate value, the closer the fitted result is to the set true value. PK-RANSAC provides significant advantages for fitting the two key parameters of the sphere center coordinates and sphere radius. Specifically, for both parameters, the deviation values of PK-RANSAC are significantly smaller than those of the other four methods, demonstrating that it has greater stability and accuracy when handling complex point cloud data. In contrast, although TLS and PC-RANSAC exhibit obvious improvements in terms of handling outliers and noise compared with LS and RANSAC, PK-RANSAC still provides superior fitting accuracy. This demonstrates that PK-RANSAC improves overall performance by combining the advantages of RANSAC and projection filtering, providing outstanding performance in complex environments. Overall, these results indicate that the application of PK-RANSAC to industrial CT measurement has great potential and can provide a more reliable solution for spherical point cloud fitting.
In the CT cylinder-ball reference object experiment, we considered the same five methods to conduct a comparative analysis of the parameter of sphere radius. Figure 18 presents the deviations of the five point cloud algorithms when fitting the radius compared with the true value. The ordinate represents the deviation between the fitting radius and the true value. The smaller the value, the closer the fitting result is to the true value, indicating better performance. There are inevitable errors in the reconstruction process of CT, and certain deviations are introduced in the conversion from reconstructed volume data to a mesh model. However, the mesh model used by all fitting algorithms is the same, so each method has the same inherent deviation between the fitted radius and true value. This deviation can be considered a constant, so the closer the fitting result is to the true value, the better the performance of the algorithm. PK-RANSAC provides the best radius fitting performance with the smallest deviation, demonstrating excellent fitting accuracy and robustness. In contrast, the LS method is easily affected by outliers, and the fitting results contain large deviations, demonstrating the disadvantage of insufficient robustness. Additionally, although RANSAC, PC-RANSAC, and TLS improved fitting accuracy to a certain extent, they still failed to surpass the superior performance of PK-RANSAC. These results demonstrate that PK-RANSAC can effectively reduce the deviation introduced by noise and outliers when processing complex CT cylinder-ball reference object data, yielding a fitting radius close to the true value. This performance improvement is of great significance for the precision of CT measurement, especially in industrial applications that require high-precision fitting.
Based on the results of our comparative analysis, it can be concluded that PK-RANSAC provides significant advantages for sphere radius fitting and is superior to the other four methods in terms of fitting accuracy and robustness, demonstrating that PK-RANSAC is not only advanced in theory but also exhibits good applicability and reliability in practical applications.
For the CT ball-plate standard experiment, we selected three algorithms, namely, RANSAC, PC-RANSAC, and PK-RANSAC, for comparative analysis to evaluate their performance in terms of fitting spherical center coordinates. Figure 19 presents the deviations of 30 spherical center distances in the second group of CT-measured data. The numerical data are summarized in Table 6. The horizontal axis represents the spherical center distances, while the vertical axis represents the deviation values. The three lines of different colors in the figure represent the fitting means of the three methods. Owing to its randomness and instability, RANSAC leads to large deviations between the fitting results and true values, indicating that it does not perform well when processing CT data. This instability of RANSAC may lead to a significant decrease in fitting accuracy on some complex point cloud data, negatively affecting the accuracy of geometric parameters. Although PC-RANSAC improves the robustness of fitting to a certain extent, it is still inferior to PK-RANSAC in terms of deviation control. PK-RANSAC exhibits excellent stability and accuracy for fitting the spherical center distances, and its fitting deviations are all within 5 μm, with an average deviation of only 1.91 μm. This demonstrates that PK-RANSAC not only has superior robustness for the processing of noise and outliers, but also has significant advantages in terms of the fitting accuracy of geometric parameters. These results further demonstrate the application potential of PK-RANSAC in industrial CT measurement and show that it can provide reliable technical support for high-precision measurement needs.
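The center-distance deviation used as the evaluation quantity above can be computed from fitted sphere centers and their reference positions as in the following sketch. The function name and the choice to evaluate every pair of spheres are assumptions made for illustration; the source does not state which pairs form the 30 distances, and the reference coordinates would come from the calibrated ball-plate standard.

```python
import numpy as np
from itertools import combinations

def center_distance_deviations(fitted_centers, reference_centers):
    """Deviation of each pairwise center distance from its reference value.

    Both inputs are (n, 3) arrays in the same length unit.
    Returns the list of index pairs and the array of signed deviations.
    """
    fitted = np.asarray(fitted_centers, dtype=float)
    reference = np.asarray(reference_centers, dtype=float)
    pairs = list(combinations(range(len(fitted)), 2))
    deviations = np.array([
        np.linalg.norm(fitted[i] - fitted[j])
        - np.linalg.norm(reference[i] - reference[j])
        for i, j in pairs
    ])
    return pairs, deviations
```

Summary statistics such as `np.abs(deviations).max()` and `np.abs(deviations).mean()` then give the largest and average deviation, after converting to micrometres if the centers are expressed in millimetres.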
From our discussion of the three experiments above, several important conclusions can be drawn. First, although the LS method provides a simple and direct fitting method in theory, its robustness is poor and it is easily disturbed by outliers, which significantly reduces the reliability of the fitting results. Because the LS method does not effectively handle noise and outliers, it exhibits obvious deficiencies in complex datasets. Second, although RANSAC improves noise resistance to a certain extent through iterative processing, its inherent randomness produces results with greater uncertainty, especially for large-scale point cloud data, rendering it difficult to ensure the repeatability of results. This lack of stability makes RANSAC inappropriate for certain practical applications, especially in scenarios with high precision requirements. In contrast, PK-RANSAC demonstrates excellent performance. Its innovation lies in the strategic integration and enhancement of two methods, which improves robustness and repeatability. The workflow of PK-RANSAC is divided into three steps, and this multi-stage processing approach not only enhances the robustness of the algorithm but also effectively reduces the uncertainty of calculations. Consequently, PK-RANSAC achieves a high degree of fitting accuracy and stability in the presence of noise and outliers, allowing it to provide more reliable and accurate results on complex industrial CT measurement tasks.
In summary, PK-RANSAC not only overcomes the limitations of the LS method and RANSAC in terms of processing complex point cloud data, but also ensures the high accuracy and stability of fitting results through a multi-step optimization process. This makes PK-RANSAC a superior algorithm with broad application prospects in point cloud fitting, which can provide strong technical support for high-precision measurement.