CHBS-Net: 3D Point Cloud Segmentation Network with Key Feature Guidance for Circular Hole Boundaries

Zhang, Jiawei; Wang, Xueqi; Li, Yanzheng; Liu, Yinhua

doi:10.3390/machines11110982


School of Mechanical Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Machines 2023, 11(11), 982; https://doi.org/10.3390/machines11110982

Submission received: 16 September 2023 / Revised: 20 October 2023 / Accepted: 20 October 2023 / Published: 24 October 2023

(This article belongs to the Section Advanced Manufacturing)


Abstract

In laser scanning inspection systems for sheet metal parts, the rapid and accurate inspection of high-precision holes is both crucial and difficult. The accuracy of the circular holes, especially the locating datum holes on the parts, plays an important role in the assembly quality. However, accurately segmenting the circular hole boundary points required for circular hole fitting from large-scale scanning point cloud data remains one of the most difficult tasks in improving inspection accuracy. To address this problem, a segmentation network called the circular hole boundary segmentation network (CHBS-Net) is proposed for boundary point cloud extraction. Firstly, an encoding–decoding–attention (EDA) fusion guidance mechanism is used to address the imbalance in data distribution due to the small proportion of boundary points in the overall point cloud. Secondly, a long short-term memory (LSTM) network parallel structure is used to capture the contour continuity and temporal relationships among boundary points. Finally, the interference of neighboring points and noise is reduced by extracting features in the multi-scale neighborhood. Experiments were performed using real cases from a sheet metal parts dataset to illustrate the procedures. The results showed that the proposed method achieves better performance than the benchmark state-of-the-art methods. The circular hole inspection accuracy is effectively improved by enhancing the segmentation accuracy of the scanning boundary points.

Keywords:

sheet metal part; point cloud segmentation; 3D measurement; deep learning

1. Introduction

The manufacturing quality of circular holes in automobile manufacturing seriously affects the assembly accuracy and directly affects the product quality [1]. To ensure the manufacturing quality of these holes, laser scanning inspection techniques are usually used. These processes have two basic stages: boundary extraction and parameter fitting. Among them, the boundary extraction technique provides the necessary data support for parameter fitting and quality assessment, which is the basis of the circular hole inspection technique. Therefore, improving the accuracy of the boundary extraction of circular holes will improve the accuracy of the parameter fitting of the holes.

In recent years, the rapid progression of 3D optical measurement technologies, including light detection and ranging (LiDAR), structured light, and depth cameras, has fostered the widespread application of 3D point cloud data across various domains, such as autonomous navigation, augmented reality applications, and the field of robotics [2,3,4]. Within the prevalent methods for point cloud data processes, segmentation algorithms are extensively employed for feature analysis and regional division [5]. These algorithms serve as indispensable tools for extracting multifarious feature information from point cloud data sets, thereby facilitating the differentiation of data points into discrete regions distinguished by unique attributes. Consequently, circular hole boundary extraction based on point cloud segmentation has emerged as a burgeoning trend in recent times.

Deep learning is a key method in the study of artificial intelligence and is well suited to modeling intricate relationships between input and output data. By choosing several kernels or fine-tuning parameters through end-to-end optimization, deep learning merges feature learning and model construction into a single model. These models have a deep neural network topology with numerous hidden layers, which amounts to a multilevel nonlinear operation. To find complicated intrinsic structures, the model transforms the features at each layer from raw inputs into more abstract representations [6]. This approach handles complex nonlinear features in large-scale data without the manual design of feature representations, relies less on domain expertise, and is faster. As a result, it is frequently used in intelligent manufacturing for tasks such as fault diagnostics [7,8] and defect detection [9,10], where it offers state-of-the-art tools for manufacturing process analysis. Over the past few years, deep learning approaches have been used increasingly in point cloud segmentation, gradually replacing more conventional methods [11] in segmentation tasks for various industrial parts. Deep learning methods for segmenting point clouds often use transformer-based techniques [12,13,14], attention networks [15], graph neural networks [16,17], and convolutional neural networks [18,19,20,21]. These models frequently display strong generalization ability, which supports reliable performance across different segmentation tasks and scenarios.

As depicted in Figure 1, this paper aims to enhance the accuracy of circular hole fitting by improving the circular hole boundary point segmentation method. Three key challenges of this segmentation task are addressed: (1) the limited proportion of circular hole boundary points within the overall point cloud, which leads to imbalance in the training categories; (2) the need to capture contour continuity and temporal relationships among circular hole boundary points; and (3) the need to mitigate interference caused by nearby and noisy data points during the segmentation of boundary points. Therefore, a novel segmentation network is proposed, and the primary contributions of this study include:

A novel segmentation network is proposed, which aims to achieve accurate segmentation of circular hole boundary points and improve the accuracy of fitting parameters.
An encoding–decoding–attention fusion mechanism is designed. This mechanism utilizes key features in the boundary region of the circular hole to guide the point cloud segmentation.
An LSTM parallel structure is introduced for modeling contour continuity and temporal relationships between boundary points.
Feature extraction in neighborhoods of different scales is performed by considering point cloud features in neighborhoods of different ranges. This reduces the interference of neighboring points at boundary points on the segmentation results.

The rest of the paper is organized as follows. Section 2 reviews the existing point cloud segmentation methods. In Section 3, the proposed network model structure is presented. Section 4 shows the experimental results. Section 5 draws the conclusions of this paper.

2. Related Work

2.1. Convolutional Neural Network-Based Point Cloud Segmentation Algorithm

Presently, there are two types of convolutional neural network (CNN)-based methods for the segmentation of point cloud data. The first transforms the point cloud into images or voxels and then applies 2D or 3D convolution operations. The second redefines a kind of point convolution and uses the point cloud directly as its input. Among methods of the first type, Choy et al. [22] introduced an application of generalized 3D sparse convolution on a sparse tensor. This convolution improved the robustness and reduced the computational cost compared with 2D convolution and hybrid 2D–3D convolution. Nevertheless, transforming point clouds into images or voxels fails to capitalize fully on their spatial features and introduces additional data processing expenses and structural noise. Therefore, many studies have focused on developing efficient point convolutional networks. For instance, Komarichev et al. [23] introduced a novel ring convolutional neural network structure. The method ordered the nearest neighbor points by changing the ring structure and orientation, and the ordered nearest neighbor points were then used in standard point convolution. Additionally, Xu et al. [24] proposed a position adaptive convolution to construct a convolution kernel by using a dynamic convolution weight matrix. The coefficients of this matrix were obtained by adaptively learning the relative positional relationships between points, executed through a fractional network.

2.2. Graph Neural Network-Based Point Cloud Segmentation Algorithm

In the field of point cloud semantic segmentation, researchers have employed graph neural networks to construct specialized graph structures for point clouds [25,26]. This method facilitates the exploration of neighboring information for each point through graph convolution, thereby enhancing the utilization of spatial attributes inherent to the point cloud and improving segmentation precision. Li et al. [27] solved the gradient vanishing problem of network superposition by applying residual connectivity, dense connectivity, and dilated convolution simultaneously to graph convolutional networks. Furthermore, Lei et al. [28] developed a spherical kernel function based on a fuzzy mechanism, which was subsequently integrated into a depth-separable graph convolutional network. This method effectively addressed the problem of inaccurate segmentation arising from discontinuous spatial boundaries of point clouds.

2.3. Attention-Based Point Cloud Segmentation Algorithm

The attention mechanism proficiently allocates weights to individual points, mirroring the degree of interaction between a point and its adjacent entities. This strategy facilitates a heightened influence from significant neighbors, thereby enhancing the delineation of spatial relationships. Yang et al. [29] introduced a grouped attention model to capture point-to-point relationships, and a new Gumbel subset sampling method was proposed to solve the problem of conventional sampling methods being too sensitive to outliers. Zhao et al. [30] proposed an attention-based score refinement module to post-process the segmentation results generated by the network. This module combined the scores of neighboring points with the learned attention weights to refine the initial segmentation results. As a result, it can be seamlessly integrated into existing deep neural networks to enhance the segmentation performance. Hu et al. [31] proposed RandLA-Net, which used a local feature aggregation module with attention pooling in each feature extraction layer to learn complex local features.

2.4. Transformer-Based Point Cloud Segmentation Algorithm

The transformer model employs the multi-head self-attention mechanism to establish the focus. Through the self-attention mechanism, the transformer can adaptively learn the relationship between each point and other points, regardless of its position in the input. Each self-attention head allows the model to selectively concentrate on different aspects, thereby capturing different local and global information and integrating them into the final representation. This self-attention-based design ensures that the output of the model is consistent regardless of the order of the unordered input point cloud, thus improving the robustness and accuracy of the model. Gao et al. [32] incorporated the transformer within the encoding stage to acquire the local features from the point cloud. They replaced the general pooling with a self-attention-weighted transformation pooling module to mitigate the potential loss of crucial local features. Thyagharajan et al. [33] divided points into segments using a graph-cutting algorithm. Subsequently, the approach utilized the transformer fragment fusion network to combine the contextual information from the various segments. This process was further refined by coupling the attention matrix with the adjacency matrix, thereby constraining the information exchange between segmented features.

2.5. Practical Considerations

During the real-world measurement procedures, the 3D point cloud data associated with circular holes often exhibit incompleteness. This phenomenon can be attributed to factors such as the roughness of circular hole surfaces, material reflection, self-obscuration, or measurement range limitations. Consequently, the development of a high-precision boundary segmentation model tailored for circular holes remains an open challenge. In this study, we introduce a novel deep learning neural network model architecture that combines the encoding–decoding–attention fusion guidance mechanism in parallel with an LSTM network. In addition, we extract features by dividing the neighborhood of points into local and global scales. The results of our experiments demonstrate that the model can accurately segment circular hole boundaries.

3. Methodology

In this section, we introduce CHBS-Net. The section is organized as follows: the EDA is presented in Section 3.1; the LSTM parallel structure is presented in Section 3.2; multi-scale neighborhood segmentation is discussed in Section 3.3; the overall structure of the model and the loss function are presented in Section 3.4; and the section concludes with a description of the experimental settings in Section 3.5.

3.1. Encoding–Decoding–Attention Fusion Guidance Mechanism

In the circular hole boundary point cloud segmentation task, the boundary points account for a small proportion of the overall point cloud, which leads to the problem of category imbalance during the training process. To solve this problem, EDA is proposed to incorporate both channel attention and spatial attention mechanisms. The aim of this method is to obtain the key boundary points and the corresponding salient features. Subsequently, these features of the key points are leveraged to guide the boundary point feature extraction, thereby mitigating the effects of the category imbalance.

The network takes points denoted P as its input. The initial stage involves the encoder $E(\cdot)$ within the EDA framework. This encoder processes the input point cloud data I and begins by discerning the circular hole boundary region, resulting in the extraction of encoding features $f_{e}$. Subsequently, the spatial attention mechanism module $S(\cdot)$ and the point cloud channel attention mechanism module $C(\cdot)$ operate on $f_{e}$ to yield the spatial attention results $f_{s}$ and channel attention results $f_{c}$, respectively. These results are then fused through a weighted fusion process to yield the circular hole boundary feature $f_{k}$.

Simultaneously, the decoder generates decoding features $f_{d}$, which, in turn, contribute to the computation of the weight W. This computed W is subsequently employed to perform a weighted fusion of $f_{s}$ and $f_{c}$. Finally, $f_{k}$, derived from this fusion, serves as guidance for generating the circular hole boundary segmentation. The structural components and computational procedures encompass the following:

Encoder: The encoder comprises multiple layers responsible for extracting the encoding features $f_{e}$ of the circular hole boundary region from the input I, as expressed in Equation (1).

$f_{e} = E (I; θ_{e}),$

(1)

where $θ_{e}$ is the learnable parameter of the encoder. The encoder consists of a series of convolutional layers, batch normalization layers, LeakyReLU activations, and random downsampling layers. The encoding features from each layer of the encoder are shared, and these features serve as inputs not only to the next layer but also to the decoder. The encoder is responsible for roughly delineating the circular hole boundary region from I.
Spatial-channel attention module: This module consists of both spatial and channel statistical feature attention mechanisms, emphasizing different aspects of point cloud data importance. The spatial attention mechanism focuses on the significance of various locations in the point cloud data, while the channel attention mechanism accentuates the relevance of distinct channels within the point cloud data, as specified in Equations (2) and (3).

$f_{s} = S (f_{e}; θ_{s}),$

(2)

$f_{c} = C (f_{e}; θ_{c}),$

(3)

where $θ_{s}$ represents the learnable parameter of the spatial statistical feature attention mechanism, and $θ_{c}$ represents the learnable parameter of the channel statistical feature attention mechanism.
Decoder: The decoder is comprised of convolutional layers, batch normalization layers, LeakyReLU activations, and upsampling layers. Initially, $f_{e}$ undergoes upsampling, and the result is fused with features from the corresponding encoder layers. This fusion aims to recover essential details for the segmentation task, eventually yielding high-level semantic decoding features $f_{d}$ , as indicated in Equation (4).

$f_{d} = D (f_{e}; θ_{d}),$

(4)

where $θ_{d}$ is the learnable parameter of the decoder.
Guidance information generation process: First, average pooling $\mathrm{Avg}(\cdot)$ is applied to $f_{d}$ along the channel dimension. Subsequently, a multi-layer perceptron $M(\cdot)$ is applied to linearly transform the pooling result. Finally, the linear transformation result undergoes nonlinear activation using the tanh function $\tanh(\cdot)$ to generate the weight W, which is then used to fuse $f_{s}$ and $f_{c}$ into $f_{k}$. This fusion allows the training process to optimize feature selection and spatial relationship modeling simultaneously. It allocates more attention to key boundary points that contain important semantic information, thus alleviating the class imbalance problem. The weight and the fused feature are computed as follows:

$W = \tanh(M(\mathrm{Avg}(f_{d}))),$

(5)

$f_{k} = f_{s} \odot W_{s} + f_{c} \odot W_{c},$

(6)

where $W_{s}$ is the weight in W applied to $f_{s}$, $W_{c}$ is the weight in W applied to $f_{c}$, and $\odot$ denotes the Hadamard product of matrices.
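The guidance computation in Equations (5) and (6) can be illustrated with a small pure-Python sketch. The paper does not specify the shape of M or how W is split into $W_{s}$ and $W_{c}$, so this sketch assumes a hypothetical one-layer M that maps the pooled scalar of each point to a pair of weights, which are then broadcast over the channels. The function name and parameters are illustrative only.

```python
import math

def guidance_fusion(f_s, f_c, f_d, mlp_weights, mlp_bias):
    """Fuse spatial and channel attention features for each point.

    f_s, f_c, f_d: lists of per-point feature vectors (lists of floats).
    mlp_weights, mlp_bias: two-element lists for a hypothetical one-layer M
    that maps the pooled scalar to the pair (W_s, W_c). This is an assumption.
    """
    f_k = []
    for s_vec, c_vec, d_vec in zip(f_s, f_c, f_d):
        pooled = sum(d_vec) / len(d_vec)  # Avg over the channel dimension
        # W = tanh(M(Avg(f_d))), here split into W_s and W_c
        w_s, w_c = (math.tanh(w * pooled + b)
                    for w, b in zip(mlp_weights, mlp_bias))
        # f_k = f_s * W_s + f_c * W_c, element-wise over channels
        f_k.append([w_s * s + w_c * c for s, c in zip(s_vec, c_vec)])
    return f_k
```

Because tanh is bounded in (-1, 1), the fused feature in this sketch is always a bounded mix of the two attention outputs.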

3.2. LSTM Parallel Structure

The point cloud is characterized by a disordered distribution [34], but there is some contour continuity between the boundary points of the circular holes. As shown in Figure 2, in order to capture the continuity relationship between boundary points, the LSTM network is used to construct the time-order relationship between point clouds, which helps the network model to understand the continuity and order of the boundary points better. Firstly, the input vector I is divided into N distinct regions. Then, the LSTM $L(\cdot)$ is employed to perform temporal encoding on each region. Each point $p_{i} = (x_{i}, y_{i}, z_{i})$, $i = 1, 2, \dots, n$, $p_{i} \in P$, is added to the designated region based on its coordinates. The size of each region is controlled by the hyperparameter $\tau$. Assuming that the maximum and minimum coordinates in the input points are represented by $(x_{\max}, y_{\max}, z_{\max})$ and $(x_{\min}, y_{\min}, z_{\min})$, respectively, the number of regions N can be calculated using the following equation:

$N = \left\lceil \frac{x_{\max} - x_{\min}}{\tau} \right\rceil \times \left\lceil \frac{y_{\max} - y_{\min}}{\tau} \right\rceil \times \left\lceil \frac{z_{\max} - z_{\min}}{\tau} \right\rceil,$

(7)

Each region is assigned a unique identifier. The identifier $\chi$ represents the region allocated to each point as follows:

$\chi = \left( \left\lfloor \frac{x_{i} - x_{\min}}{\tau} \right\rfloor, \left\lfloor \frac{y_{i} - y_{\min}}{\tau} \right\rfloor, \left\lfloor \frac{z_{i} - z_{\min}}{\tau} \right\rfloor \right),$

(8)
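As a rough illustration of Equations (7) and (8), the following pure-Python sketch counts the regions and assigns each point a region identifier. Two details are assumptions added here and are not stated in the paper: a guard that keeps at least one region along an axis with zero extent, and a clamp that places points lying exactly on the upper boundary into the last region. The function name is hypothetical.

```python
import math
from collections import defaultdict

def partition_into_regions(points, tau):
    """Group 3D points into cubic regions of edge length tau."""
    mins = [min(p[a] for p in points) for a in range(3)]
    maxs = [max(p[a] for p in points) for a in range(3)]
    # Eq. (7): regions per axis; max(1, ...) is an added guard for flat axes
    counts = [max(1, math.ceil((maxs[a] - mins[a]) / tau)) for a in range(3)]
    regions = defaultdict(list)
    for idx, p in enumerate(points):
        # Eq. (8): integer region identifier per axis.
        # The min(...) clamp is an assumption for points on the upper boundary.
        chi = tuple(
            min(math.floor((p[a] - mins[a]) / tau), counts[a] - 1)
            for a in range(3)
        )
        regions[chi].append(idx)
    n_regions = counts[0] * counts[1] * counts[2]
    return n_regions, dict(regions)

# Usage: four points on a plane, region size tau = 1.0
pts = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (1.5, 0.2, 0.0), (2.0, 1.0, 0.0)]
n_regions, regions = partition_into_regions(pts, 1.0)
```

Each list of point indices in `regions` would then form one sequence passed to the LSTM; how the points are ordered within a region is not specified in the text.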

After dividing the points into different regions, each region is fed into the LSTM for temporal coding. Then, the features with contour continuity relationship $f_{l}$ are output. Finally, under the guidance of $f_{k}$, features $f_{t}$ are generated from the concatenation of $f_{l}$ and $f_{d}$ as follows:

$f_{t} = \mathrm{Cat}(L(I), f_{d}) \odot M(\mathrm{Max}(f_{k})),$

(9)

where $\mathrm{Cat}(\cdot)$ denotes the concatenation of two feature vectors in the channel dimension, and $\mathrm{Max}(\cdot)$ returns the maximum value along the channel dimension.

3.3. Multi-Scale Neighborhood Partitioning

The correlation between points cannot be captured by relying on single-point features. Additionally, single-point features are often more susceptible to noise. Inspired by dynamic graph CNN (DGCNN) [16], the EdgeConv is adopted to extract features. The accuracy of segmentation is improved by considering the contextual information between multiple points. Nonetheless, employing single-scale neighborhoods alone may lead to the misclassification of many neighboring points as circular hole boundary points. To solve this problem, as shown in Figure 3, we introduce a global neighborhood containing more points. By extracting features in different neighborhoods, the interference of neighboring points is reduced, thus improving the segmentation accuracy.

Specifically, the k-nearest neighbors algorithm $K(\cdot)$ is used to create both a local neighborhood $G_{l}$ and a global neighborhood $G_{g}$ for each point $p_{i}$ individually. Note that $G_{l}$ and $G_{g}$ contain different numbers of points. Subsequently, the features within each neighborhood are updated through the EdgeConv processing. Lastly, the features processed by EdgeConv are concatenated in the channel dimension as follows:

$I = \mathrm{Cat}(e(K(P, \kappa_{l})), e(K(P, \kappa_{g}))),$

(10)

where $\kappa_{l}$ is the number of points in the local neighborhood, $\kappa_{g}$ denotes the number of points in the global neighborhood, and $e(\cdot)$ represents the EdgeConv process.
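The structure of Equation (10) can be sketched in pure Python as follows. This is a simplified stand-in, not the network's EdgeConv: the learned transformation is omitted, and each point's feature is its own coordinates concatenated with the channel-wise maximum of the offsets to its neighbors. The brute-force neighbor search, the exclusion of the query point from its own neighborhood, and all function names are assumptions made for illustration.

```python
import math

def knn_indices(points, i, k):
    """Indices of the k nearest neighbors of points[i] (brute force, excluding i)."""
    order = sorted(
        (j for j in range(len(points)) if j != i),
        key=lambda j: math.dist(points[i], points[j]),
    )
    return order[:k]

def edge_feature(points, i, neighbors):
    """Parameter-free stand-in for EdgeConv: x_i concatenated with the
    channel-wise max over neighbors of (x_j - x_i)."""
    center = points[i]
    diffs = [[pj - pi for pj, pi in zip(points[j], center)] for j in neighbors]
    pooled = [max(col) for col in zip(*diffs)]
    return list(center) + pooled

def multi_scale_features(points, k_local, k_global):
    """Concatenate local-scale and global-scale edge features per point."""
    feats = []
    for i in range(len(points)):
        local = edge_feature(points, i, knn_indices(points, i, k_local))
        global_ = edge_feature(points, i, knn_indices(points, i, k_global))
        feats.append(local + global_)  # Cat(...) in the channel dimension
    return feats

# Usage with a tiny cloud; the experiments in the paper use 16 and 128 neighbors
pts = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 5)]
feats = multi_scale_features(pts, k_local=1, k_global=3)
```

With the small local neighborhood only the closest point contributes, while the larger neighborhood also sees the distant point, which is the contrast the two scales are meant to provide.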

3.4. Circular Hole Boundary Segmentation-Net

In summary, the structure of CHBS-Net is shown in Figure 4. It uses EDA as the backbone and parallelizes LSTM networks.

Initially, a local neighborhood $G_{l}$ and a global neighborhood $G_{g}$ are created for each point in P. Next, the concatenation results I of $G_{l}$ and $G_{g}$ in the channel dimension are fed into the EDA and LSTM to obtain the boundary point critical feature $f_{k}$ and contour continuity feature $f_{l}$, respectively. Lastly, the fusion of $f_{d}$ and $f_{l}$, using $f_{k}$ for segmentation guidance, is performed to generate $f_{t}$.

Through multi-scale neighborhood partitioning, EDA, and LSTM processing, CHBS-Net leverages the interconnections among points, extracts crucial features, and achieves accurate circular hole boundary segmentation. This combined approach provides better data support for fitting circular hole boundaries.

The weighted cross-entropy loss is utilized as the optimization objective, which balances the importance of the different classes during training, as follows:

$L_{c} = \sum_{i = 1}^{n} \left[ \omega_{0} \times (1 - p_{label}) \times \ln(1 - g_{i}) + \omega_{1} \times p_{label} \times \ln(g_{i}) \right],$

(11)

where $\omega_{0}$ and $\omega_{1}$ are the category weights determined by the sample sizes of the classes, $p_{label}$ is the point cloud classification label, $g_{i} \in P$ is the predicted probability of each point, and n is the number of points.
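A minimal pure-Python sketch of a weighted cross-entropy of the form in Equation (11) is given below. Two points are assumptions rather than details from the paper: the sketch negates the sum, following the usual convention so that the loss is non-negative and can be minimized, and it uses inverse-frequency class weights as one plausible way of deriving $\omega_{0}$ and $\omega_{1}$ from the sample sizes. Function names are hypothetical.

```python
import math

def class_weights(labels):
    """Hypothetical inverse-frequency weights (omega_0, omega_1) for binary labels.
    Assumes both classes are present in `labels`."""
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    return n / (2 * n_neg), n / (2 * n_pos)

def weighted_cross_entropy(probs, labels, w0, w1, eps=1e-12):
    """Negated weighted log-likelihood summed over all points."""
    total = 0.0
    for g, y in zip(probs, labels):
        g = min(max(g, eps), 1.0 - eps)  # keep the logarithms finite
        total += w0 * (1 - y) * math.log(1 - g) + w1 * y * math.log(g)
    return -total  # sign convention is an assumption, see the note above

# Usage: one boundary point among four, uninformative predictions
labels = [0, 0, 0, 1]
w0, w1 = class_weights(labels)
loss = weighted_cross_entropy([0.5, 0.5, 0.5, 0.5], labels, w0, w1)
```

With these weights the single boundary point contributes as much to the loss as the three non-boundary points together, which is the balancing effect the weighted loss is intended to provide.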

3.5. Experiment Settings

In this study, a self-created feature boundary dataset of sheet metal parts is used to assess how well the proposed method improves the segmentation performance. The dataset comprises 42 parts, each with numerous circular hole features that need to be inspected. Figure 5 displays only a portion of the dataset. Several well-known point cloud segmentation networks, including PointNet [18], PointNet++ [19], DGCNN [16], SpiderCNN [35], and AGCN [15], were used as comparison approaches. The test set was used to validate the segmentation effectiveness of the trained segmentation models. For evaluation metrics, we adopted intersection over union ($IoU$) and mean intersection over union ($mIoU$). $IoU$ refers to the intersection of the class prediction and ground truth divided by their union [36]. $mIoU$ is the mean of the per-class $IoU$ over all semantic classes [37], which is used to indicate the segmentation accuracy of the algorithm.
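The two metrics can be computed from per-point labels as in the following sketch, which assumes binary labels (1 for boundary points, 0 for all other points) and skips classes that appear in neither the prediction nor the ground truth. The function names are illustrative.

```python
def iou_per_class(pred, truth, cls):
    """Intersection over union of one class from per-point labels."""
    inter = sum(1 for p, t in zip(pred, truth) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, truth) if p == cls or t == cls)
    return inter / union if union else float("nan")

def mean_iou(pred, truth, classes=(0, 1)):
    """Mean of the per-class IoU, ignoring classes absent from both inputs."""
    scores = [iou_per_class(pred, truth, c) for c in classes]
    valid = [s for s in scores if s == s]  # NaN is the only value not equal to itself
    return sum(valid) / len(valid)

# Usage: six points, two of them misclassified
pred = [1, 1, 0, 0, 0, 1]
truth = [1, 0, 0, 0, 1, 1]
boundary_iou = iou_per_class(pred, truth, 1)
miou = mean_iou(pred, truth)
```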

In order to verify the effectiveness of the proposed segmentation method for improving the fitting accuracy of circular holes, this paper performs an example validation. A 3D laser point profile sensor (Gocator 2450) and a robot (UR5) are used for point cloud data capture of sheet metal parts. The relevant parameters involved in the capturing process are given in Table 1. The capture scene is shown in Figure 6.

The case validation procedure is as follows: first, the various segmentation models are applied to the captured part point cloud data for circular hole boundary segmentation. Second, a circle is fitted to each set of segmented boundary points using the least squares approach. The accuracy of the circular hole fitting obtained with the various segmentation techniques is then compared. For the evaluation metrics, this study used the average centroid deviation $AD(c)$, average radius deviation $AD(r)$, and radius deviation variance $MSE(r)$, with the formulas shown below:

$\begin{cases} AD(c) = \frac{1}{K} \sum_{i = 1}^{K} \| c_{i} - \hat{c}_{i} \|_{2}, \\ AD(r) = \frac{1}{K} \sum_{i = 1}^{K} | r_{i} - \hat{r}_{i} |, \\ MSE(r) = \frac{1}{K} \sum_{i = 1}^{K} (r_{i} - \hat{r}_{i})^{2} \end{cases}$

(12)

where $c_{i}$ is the fitted circle center, $\hat{c}_{i}$ is the exact circle center, $r_{i}$ is the estimated circle radius, $\hat{r}_{i}$ is the exact radius, and K is the number of circular holes in the input point cloud.
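The fitting and evaluation steps can be sketched in pure Python as below. The paper states only that a least squares approach is used, so the algebraic (Kåsa) circle fit shown here is one common choice and not necessarily the authors' variant. The sketch also assumes that the boundary points have already been projected onto the plane of the hole, so each point is a 2D pair. All function names are hypothetical.

```python
import math

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):  # back substitution
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x

def fit_circle(points_2d):
    """Algebraic least squares fit of x^2 + y^2 + D*x + E*y + F = 0."""
    rows = [(x, y, 1.0) for x, y in points_2d]
    rhs = [-(x * x + y * y) for x, y in points_2d]
    # Normal equations (A^T A) u = A^T b for u = (D, E, F)
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * v for r, v in zip(rows, rhs)) for i in range(3)]
    d, e, f = solve3(ata, atb)
    cx, cy = -d / 2.0, -e / 2.0
    return (cx, cy), math.sqrt(cx * cx + cy * cy - f)

def fitting_errors(fitted, exact):
    """fitted, exact: lists of ((cx, cy), r), one entry per hole.
    Returns AD(c), AD(r), and MSE(r) as defined in Eq. (12)."""
    k = len(fitted)
    ad_c = sum(math.dist(c, c_hat) for (c, _), (c_hat, _) in zip(fitted, exact)) / k
    ad_r = sum(abs(r - r_hat) for (_, r), (_, r_hat) in zip(fitted, exact)) / k
    mse_r = sum((r - r_hat) ** 2 for (_, r), (_, r_hat) in zip(fitted, exact)) / k
    return ad_c, ad_r, mse_r

# Usage: noise-free points on a circle with center (1, 2) and radius 3
ring = [(1 + 3 * math.cos(t * math.pi / 4), 2 + 3 * math.sin(t * math.pi / 4))
        for t in range(8)]
center, radius = fit_circle(ring)
```

Points wrongly labeled as boundary points pull such a fit away from the true circle, which is why the segmentation quality directly affects the deviations in Equation (12).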

4. Experimental Results Analysis

4.1. Implementation Details

The experiments were conducted on a computational workstation equipped with a 12th Gen Intel(R) Core(TM) i7-12700K (Intel, Santa Clara, CA, USA) and Nvidia 3080 GPU (Nvidia, Santa Clara, CA, USA). The deep learning framework PyTorch 1.9.0 and CUDA 11.0 were chosen to code the framework. The network parameters were as follows: the batch size was eight, the Adam optimizer was used, the number of training epochs was 100, the learning rate was 0.001, the number of points contained in the local neighborhood was $\kappa_{l} = 16$, and the number of points contained in the global neighborhood was $\kappa_{g} = 128$.

4.2. Experimental Results on the Sheet Metal Parts Dataset

In this paper, CHBS-Net is compared with classical methods in the field of point cloud segmentation. All of the compared methods are reproduced from the source code provided by the authors. To be fair, we used the average results after multiple training sessions as a basis for comparison. It should be noted that none of the methods uses a method-specific pre-processing or post-processing procedure. PointNet [18] is one of the most representative methods in the field of point cloud processing and has been widely adopted by the research community. Also included in the comparison is PointNet++ [19], which retains the properties of PointNet [18] and mitigates the problem of local feature loss. Including these two methods helps readers better understand the performance of our method relative to widely used benchmarks. As can be seen from Table 2, CHBS-Net improves on PointNet [18] and PointNet++ [19] by 10.4 mIoU and 7.1 mIoU, respectively, which demonstrates the applicability of CHBS-Net to the task of segmenting circular hole boundary point clouds. DGCNN [16] adapts to continuity variations in local regions of the point cloud by introducing dynamic graph convolution and adaptive edge weight learning. In contrast, CHBS-Net improves the point cloud segmentation accuracy by introducing LSTM to establish the continuity relationship both within local regions and between regions. As can be seen from the table, our method improves on DGCNN [16] by 3.5 mIoU. AGCN [15] is a state-of-the-art method; it introduces an adaptive attention mechanism to emphasize the key boundary feature extraction in the point cloud. The EDA in CHBS-Net also introduces the attention mechanism, aiming to guide the boundary point feature extraction by separating the key features of the key boundary points, and the results show that our method improves on AGCN [15] by 1.6 mIoU. SpiderCNN [35] introduces a local information aggregation module, which can selectively aggregate the local information around each point. The multi-scale neighborhood partitioning, on the other hand, considers not only the local information of a single point but also the global information. The results show that CHBS-Net improves on SpiderCNN [35] by 7.6 mIoU after combining the local and global information.

The qualitative results of the comparison are presented in Figure 7. The proposed model successfully identifies complete circular hole boundary points without misclassifying other non-circular hole boundary points. In contrast, other methods tend to provide coarse identifications of boundary points, which may include points very close to the circular hole boundary.

The least squares circular hole fitting algorithm was applied to real point cloud data segmented with the different segmentation methods, and the circular hole fitting errors were then calculated. Table 3 lists the circular hole fitting errors obtained with the different segmentation algorithms. Overall, our proposed algorithm obtains the smallest error on the real point cloud data. The quantitative results show that our proposed segmentation method is effective for improving the circular hole fitting accuracy.

4.3. Ablation Experiment

4.3.1. Effects of LSTM Parallel Structure

To verify the effectiveness of the LSTM parallel structure, we set up a baseline and a variant of our model (denoted “Net-S” and “Net-M”). The differences between the two learning frameworks are illustrated in Figure 8. Net-S has exactly the same structure as CHBS-Net, and Net-M is the result of removing the LSTM from CHBS-Net. We carried out the validation on the sheet metal parts dataset. The results are shown in Table 4. Our analysis shows that the LSTM parallel structure effectively improves the segmentation accuracy of the network. Compared with Net-M, Net-S improves by 1.8%. Furthermore, Net-S achieves the minimum error in all three circular hole fitting metrics. This indicates that constructing the temporal relationship between the boundary points using the LSTM promotes the accurate segmentation of the circular hole boundary points and reduces the circular hole fitting error.

4.3.2. Effects of Multi-Scale Neighborhood Partitioning

The effectiveness of multi-scale neighborhood partitioning was evaluated. We performed a quantitative experiment on the sheet metal parts dataset to illustrate more clearly the role of neighborhoods of different scales in boundary point segmentation. Table 5 examines the respective roles of the local and global neighborhoods. In case.1, no multi-scale neighborhood partitioning is carried out, and the circular hole fitting results are clearly affected by nearby interference points. The parameters employed in case.3 are the same as those in CHBS-Net, and case.3 improves the mIoU over case.1 by 3.3%. Furthermore, case.3 attains the lowest error in all three circular hole fitting metrics. By extracting features in neighborhoods of two distinct sizes, the interference of surrounding points is reduced, the boundary point segmentation accuracy is enhanced, and the circular hole fitting error is significantly decreased. A comparison between the experimental results of case.2 and case.4 shows that, although larger-scale neighborhoods can produce a higher mIoU, the final circular hole fitting accuracy is not adequate. This is because larger-scale neighborhoods introduce additional interference points, which significantly affect the circular hole fitting.

5. Conclusions

Motivated by the practical engineering need to accurately identify the boundaries of circular holes in sheet metal parts, this paper proposes a novel segmentation model, CHBS-Net. The advantages of the method can be summarized in three aspects: (i) An encoding–decoding–attention fusion guidance mechanism guides the segmentation of the whole point cloud by extracting the key point features in the boundary region, making full use of these key features to improve the segmentation accuracy. (ii) An LSTM parallel structure captures contour continuity and temporal relationships between boundary points. (iii) Feature extraction in neighborhoods at different scales mitigates the interference caused by points neighboring the boundary.

The effectiveness of the proposed method is confirmed on the self-built feature boundary dataset of sheet metal parts used in the experiments. The results show that the proposed method outperforms the state-of-the-art point cloud segmentation method AGCN and the classic point cloud segmentation method PointNet by 1.6 and 10.4 mIoU points, respectively. Experimental results on real point cloud scanning data of sheet metal parts show that the proposed method effectively improves the circular hole fitting accuracy, which benefits quality assurance in the manufacturing process.

6. Limitation Discussion and Future Work

The proposed CHBS-Net shows good results on real data. Nevertheless, there is still room for improvement. First, the current network model has many parameters and a slow inference speed. In future studies, we will simplify the model to reduce the number of parameters and increase the inference speed. Second, this study focused on circular holes in sheet metal parts with diameters of 10–100 mm and complete shapes; other objects were not considered. In the future, we will expand the dataset to include larger and smaller holes and a variety of other features to improve the generality and applicability of the model. We also plan to develop a unified deep learning network model for 3D surface inspection in automotive manufacturing to meet industry needs.

Author Contributions

Conceptualization, Y.L. (Yinhua Liu); methodology, J.Z.; software, Y.L. (Yanzheng Li); writing—original draft preparation, J.Z. and X.W.; writing—review and editing, J.Z., X.W., Y.L. (Yanzheng Li), and Y.L. (Yinhua Liu); supervision, Y.L. (Yinhua Liu); funding acquisition, Y.L. (Yinhua Liu). All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (51875362), the Natural Science Foundation of Shanghai (21ZR1444500), and the Shanghai Pujiang Program (22PJD048).

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Gao, C.; Yu, H.; Gu, B.; Lin, Y. Modeling and analysis of assembly variation with non-uniform stiffness condensation for large thin-walled structures. Thin-Walled Struct. 2023, 191, 111042.
2. Li, Y.; Ma, L.; Zhong, Z.; Liu, F.; Chapman, M.A.; Cao, D.; Li, J. Deep Learning for LiDAR Point Clouds in Autonomous Driving: A Review. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 3412–3432.
3. Park, Y.; Lepetit, V.; Woo, W. Multiple 3D Object tracking for augmented reality. In Proceedings of the 2008 7th IEEE/ACM International Symposium on Mixed and Augmented Reality, Cambridge, UK, 15–18 September 2008; pp. 117–120.
4. Wang, Z.; Xu, Y.; He, Q.; Fang, Z.; Fu, J. Grasping pose estimation for SCARA robot based on deep learning of point cloud. Int. J. Adv. Manuf. Technol. 2020, 108, 1217–1231.
5. Zhang, Z.; Malashkhia, L.; Zhang, Y.; Shevtshenko, E.; Wang, Y. Design of Gaussian process based model predictive control for seam tracking in a laser welding digital twin environment. J. Manuf. Process. 2022, 80, 816–828.
6. Wang, J.; Ma, Y.; Zhang, L.; Gao, R.X.; Wu, D. Deep learning for smart manufacturing: Methods and applications. J. Manuf. Syst. 2018, 48, 144–156.
7. Yin, J.; Zhao, W. Fault diagnosis network design for vehicle on-board equipments of high-speed railway: A deep learning approach. Eng. Appl. Artif. Intell. 2016, 56, 250–259.
8. Li, C.; Sanchez, R.V.; Zurita, G.; Cerrada, M.; Cabrera, D.; Vásquez, R.E. Gearbox fault diagnosis based on deep random forest fusion of acoustic and vibratory signals. Mech. Syst. Signal Process. 2016, 76–77, 283–293.
9. Cheng, X.; Yu, J. RetinaNet with Difference Channel Attention and Adaptively Spatial Feature Fusion for Steel Surface Defect Detection. IEEE Trans. Instrum. Meas. 2021, 70, 2503911.
10. He, Y.; Song, K.; Meng, Q.; Yan, Y. An End-to-End Steel Surface Defect Detection Approach via Fusing Multiple Hierarchical Features. IEEE Trans. Instrum. Meas. 2020, 69, 1493–1504.
11. Xie, Y.; Tian, J.; Zhu, X.X. Linking Points with Labels in 3D: A Review of Point Cloud Semantic Segmentation. IEEE Geosci. Remote Sens. Mag. 2020, 8, 38–59.
12. Zhao, H.; Jiang, L.; Jia, J.; Torr, P.; Koltun, V. Point Transformer. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 11–17 October 2021; pp. 16239–16248.
13. Zhang, C.; Wan, H.; Shen, X.; Wu, Z. PVT: Point-Voxel Transformer for Point Cloud Learning. arXiv 2022, arXiv:2108.06076.
14. Guo, M.H.; Cai, J.X.; Liu, Z.N.; Mu, T.J.; Martin, R.R.; Hu, S.M. PCT: Point cloud transformer. Comput. Vis. Media 2021, 7, 187–199.
15. Xie, Z.; Chen, J.; Peng, B. Point clouds learning with attention-based graph convolution networks. Neurocomputing 2020, 402, 245–255.
16. Wang, Y.; Sun, Y.; Liu, Z.; Sarma, S.E.; Bronstein, M.M.; Solomon, J.M. Dynamic Graph CNN for Learning on Point Clouds. arXiv 2019, arXiv:1801.07829.
17. Liang, Z.; Yang, M.; Deng, L.; Wang, C.; Wang, B. Hierarchical Depthwise Graph Convolutional Neural Network for 3D Semantic Segmentation of Point Clouds. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 8152–8158.
18. Charles, R.Q.; Su, H.; Kaichun, M.; Guibas, L.J. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 77–85.
19. Qi, C.R.; Yi, L.; Su, H.; Guibas, L.J. PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17, Red Hook, NY, USA, 4–9 December 2017; pp. 5105–5114.
20. Wu, B.; Zhou, X.; Zhao, S.; Yue, X.; Keutzer, K. SqueezeSegV2: Improved Model Structure and Unsupervised Domain Adaptation for Road-Object Segmentation from a LiDAR Point Cloud. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 4376–4382.
21. Li, Y.; Wang, Y.; Liu, Y. Three-Dimensional Point Cloud Segmentation Based on Context Feature for Sheet Metal Part Boundary Recognition. IEEE Trans. Instrum. Meas. 2023, 72, 2513710.
22. Choy, C.; Gwak, J.; Savarese, S. 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 3070–3079.
23. Komarichev, A.; Zhong, Z.; Hua, J. A-CNN: Annularly Convolutional Neural Networks on Point Clouds. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 7413–7422.
24. Xu, M.; Ding, R.; Zhao, H.; Qi, X. PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 3172–3181.
25. Lei, H.; Akhtar, N.; Mian, A. Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 3664–3680.
26. Xu, Q.; Sun, X.; Wu, C.Y.; Wang, P.; Neumann, U. Grid-GCN for Fast and Scalable Point Cloud Learning. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 5660–5669.
27. Li, G.; Müller, M.; Qian, G.; Delgadillo, I.C.; Abualshour, A.; Thabet, A.; Ghanem, B. DeepGCNs: Making GCNs Go as Deep as CNNs. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 6923–6939.
28. Lei, H.; Akhtar, N.; Mian, A. SegGCN: Efficient 3D Point Cloud Segmentation with Fuzzy Spherical Kernel. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 11608–11617.
29. Yang, J.; Zhang, Q.; Ni, B.; Li, L.; Liu, J.; Zhou, M.; Tian, Q. Modeling Point Clouds with Self-Attention and Gumbel Subset Sampling. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 3318–3327.
30. Zhao, C.; Zhou, W.; Lu, L.; Zhao, Q. Pooling Scores of Neighboring Points for Improved 3D Point Cloud Segmentation. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 1475–1479.
31. Hu, Q.; Yang, B.; Xie, L.; Rosa, S.; Guo, Y.; Wang, Z.; Trigoni, N.; Markham, A. RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 11105–11114.
32. Gao, Y.; Liu, X.; Li, J.; Fang, Z.; Jiang, X.; Huq, K.M.S. LFT-Net: Local Feature Transformer Network for Point Clouds Analysis. IEEE Trans. Intell. Transp. Syst. 2023, 24, 2158–2168.
33. Thyagharajan, A.; Ummenhofer, B.; Laddha, P.; Omer, O.J.; Subramoney, S. Segment-Fusion: Hierarchical Context Fusion for Robust 3D Semantic Segmentation. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 19–23 June 2022; pp. 1226–1235.
34. Guo, Y.; Wang, H.; Hu, Q.; Liu, H.; Liu, L.; Bennamoun, M. Deep Learning for 3D Point Clouds: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 4338–4364.
35. Xu, Y.; Fan, T.; Xu, M.; Zeng, L.; Qiao, Y. SpiderCNN: Deep Learning on Point Sets with Parameterized Convolutional Filters. In Proceedings of the Computer Vision—ECCV 2018, Munich, Germany, 8–14 September 2018; Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y., Eds.; Springer: Cham, Switzerland, 2018; pp. 90–105.
36. Zhang, Y.; Zhou, Z.; David, P.; Yue, X.; Xi, Z.; Gong, B.; Foroosh, H. PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 9598–9607.
37. Wen, S.; Wang, T.; Tao, S. Hybrid CNN-LSTM Architecture for LiDAR Point Clouds Semantic Segmentation. IEEE Robot. Autom. Lett. 2022, 7, 5811–5818.

Figure 1. Workflow diagram. A new segmentation network for boundary points of circular holes is proposed. By segmenting more accurate boundary points of circular holes, more accurate circular hole fitting results are obtained.

Figure 2. Workflow of LSTM to establish temporal relationships between point clouds.

Figure 3. Diagram of the process of multi-scale neighborhood delineation.

Figure 4. Structure of CHBS-Net. The input point cloud shape is $(b, 3, n)$, where $b$ stands for the batch size, 3 stands for the three $(x, y, z)$ channels, and $n$ stands for the number of points.

Figure 5. Samples from the presented feature boundary dataset of sheet metal parts. Note: the circular holes in the figure appear elliptical because of the viewing angle.

Figure 6. Real part point cloud acquisition process.

Figure 7. Segmentation results of the feature boundary dataset of sheet metal parts.

Figure 8. Differences between Net-S and Net-M. (a) The structure of Net-S is identical to that of CHBS-Net. (b) There is no LSTM parallel structure in Net-M. $f_{k}$ and $f_{d}$ are generated by the encoder and decoder in EDA, respectively, and $f_{d}$ is guided by $f_{k}$ to obtain the final segmentation result.

Table 1. Relevant parameters of the case validation process.

Average Length of Parts (mm)	Average Width of Parts (mm)	Diameter of Circular Hole (mm)	FOV of the Camera (mm)	Maximum Number of Sampling Points	X-Direction Resolution (mm)	Scanning Speed (mm/s)
450	200	10–100	145–425	1800	0.100–0.255	80

Table 2. mIoU of different models tested on the dataset of sheet metal parts.

Method	mIoU
PointNet [18]	76.7
PointNet++ [19]	80.0
DGCNN [16]	83.6
SpiderCNN [35]	79.5
AGCN [15]	85.5
Ours	87.1
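
For reference, the mIoU values in Table 2 can be related to per-class intersection-over-union as sketched below. The sketch assumes a two-class labeling (0 for non-boundary points, 1 for boundary points) and averaging over classes; the exact averaging used in the experiments is not restated in this section, and `iou_per_class` and `mean_iou` are illustrative names.

```python
def iou_per_class(pred, target, num_classes=2):
    """Intersection-over-union for each class label in range(num_classes)."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        # A class absent from both prediction and target has no defined IoU.
        ious.append(inter / union if union else float("nan"))
    return ious


def mean_iou(pred, target, num_classes=2):
    """Mean of the defined per-class IoU values (NaN entries are skipped)."""
    defined = [v for v in iou_per_class(pred, target, num_classes) if v == v]
    return sum(defined) / len(defined)


# Imbalanced toy example: 50 boundary points among 1000 points in total.
target = [1] * 50 + [0] * 950
all_background = [0] * 1000

accuracy = sum(p == t for p, t in zip(all_background, target)) / len(target)
print("accuracy:", accuracy)
print("per-class IoU:", iou_per_class(all_background, target))
print("mIoU:", mean_iou(all_background, target))
```

A predictor that labels every point as non-boundary reaches 95% point accuracy on this toy example but only 0.475 mIoU, because the boundary class contributes an IoU of zero. This is why mIoU is the more informative score when boundary points form a small fraction of the cloud.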

Table 3. AD(c), AD(r) and MSE(r) of different models tested on the feature boundary dataset of sheet metal parts.

Method	AD(c)	AD(r)	MSE(r)
PointNet [18]	0.668	0.271	0.076
PointNet++ [19]	0.509	0.239	0.093
DGCNN [16]	0.467	0.297	0.090
SpiderCNN [35]	0.439	0.309	0.104
AGCN [15]	0.469	0.278	0.080
Ours	0.381	0.181	0.034
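
To make the link between segmented boundary points and the fitting errors in Table 3 concrete, the following sketch fits a circle to 2D points with an algebraic least-squares (Kåsa) fit and then aggregates errors over several holes. It is an illustration under assumptions: the points are taken to be already projected onto the hole plane, the fitting method used in the experiments may differ, and AD(c), AD(r), and MSE(r) are read here as the mean center deviation, the mean absolute radius deviation, and the mean squared radius error. `fit_circle` and `fitting_errors` are hypothetical helper names.

```python
import math


def det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def fit_circle(points):
    """Algebraic least-squares fit of x^2 + y^2 + D*x + E*y + F = 0.

    Returns ((cx, cy), r). Needs at least three non-collinear points.
    """
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sz = sum(x * x + y * y for x, y in points)
    sxz = sum(x * (x * x + y * y) for x, y in points)
    syz = sum(y * (x * x + y * y) for x, y in points)

    a = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    b = [-sxz, -syz, -sz]
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("points are collinear or too few for a circle fit")

    # Cramer's rule on the 3 x 3 normal equations.
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        solution.append(det3(m) / d)
    D, E, F = solution
    cx, cy = -D / 2.0, -E / 2.0
    return (cx, cy), math.sqrt(cx * cx + cy * cy - F)


def fitting_errors(fits, references):
    """Assumed metric definitions, averaged over holes: (AD(c), AD(r), MSE(r))."""
    n = len(fits)
    ad_c = sum(math.dist(fc, rc) for (fc, _), (rc, _) in zip(fits, references)) / n
    ad_r = sum(abs(fr - rr) for (_, fr), (_, rr) in zip(fits, references)) / n
    mse_r = sum((fr - rr) ** 2 for (_, fr), (_, rr) in zip(fits, references)) / n
    return ad_c, ad_r, mse_r


# Noise-free points on a circle with center (3, -2) and radius 5.
boundary = [
    (3 + 5 * math.cos(2 * math.pi * k / 12), -2 + 5 * math.sin(2 * math.pi * k / 12))
    for k in range(12)
]
center, radius = fit_circle(boundary)
print(center, radius)
print(fitting_errors([(center, radius)], [((3.0, -2.0), 5.0)]))
```

With noise-free points the fit recovers the circle up to rounding. Points wrongly segmented as boundary points pull the least-squares solution away from the true circle, which is one way segmentation errors can turn into fitting errors.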

Table 4. Ablation experiments with LSTM parallel structure.

Method	mIoU	AD(c)	AD(r)	MSE(r)
Net-S	87.1	0.381	0.181	0.034
Net-M	85.3	0.494	0.298	0.103

Table 5. Ablation experiments for multi-scale neighborhood partitioning.

Index	Method	mIoU	AD(c)	AD(r)	MSE(r)
case.1	-	83.8	0.469	0.298	0.103
case.2	$\kappa_{l} = 16$, $\kappa_{g} = 0$	83.2	0.482	0.283	0.096
case.3	$\kappa_{l} = 16$, $\kappa_{g} = 128$	87.1	0.381	0.181	0.034
case.4	$\kappa_{l} = 0$, $\kappa_{g} = 128$	85.1	0.431	0.325	0.111

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).