Article

Deep Learning Approach for Deduction of 3D Non-Rigid Transformation Based on Multi-Control Point Perception Data

1 Key Laboratory of Optoelectronic Measurement and Control and Optical Information Transmission Technology of the Ministry of Education, School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China
2 Zhongshan Research Institute, Changchun University of Science and Technology, Zhongshan 528400, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(23), 12602; https://doi.org/10.3390/app132312602
Submission received: 11 October 2023 / Revised: 13 November 2023 / Accepted: 18 November 2023 / Published: 23 November 2023
(This article belongs to the Topic Complex Systems and Artificial Intelligence)

Featured Application

The proposed deep learning method deduces 3D non-rigid transformations from multi-control point perception data: the gated recurrent units in CNN-GRU-SA extract the control point features before and after the transformation, and a self-attention mechanism accelerates multi-feature fusion. The proposed method has potential applications in the real-time deduction of entity deformation.

Abstract

In complex measurement systems, scanning the shape data of solid models is time consuming, and real-time solutions are required. Therefore, we developed a 3D non-rigid transformation deduction model based on multi-control point perception data. We combined a convolutional neural network (CNN), gated recurrent unit (GRU), and self-attention mechanism (SA) to develop the CNN-GRU-SA deduction network, which can deduce 3D non-rigid transformations from multiple control points. We compared the proposed network with several other networks; the experimental results indicate that the maximum improvements in loss and root-mean-squared error (RMSE) on the training set were 39% and 49%, respectively, and the corresponding values for the testing set were 48% and 29%. Moreover, the average deviation of the deduction results and the average deduction time were 0.55 mm and 0.021 s, respectively. Hence, the proposed deep learning approach provides an effective means to simulate and deduce the 3D non-rigid transformation processes of entities in the measurement system space, thus highlighting its practical significance in optimizing entity deformation.

1. Introduction

In three-dimensional (3D) space, transformations can be classified as rigid body or non-rigid transformations. A rigid body transformation is a special case of a similarity transformation that connects point sets in different coordinate systems through rotation and translation [1]. A non-rigid transformation allows translation, rotation, scaling, and more complex deformation operations on an object, through which the position, orientation, size, and shape of the object in 3D space can be changed. Compared with rigid transformations, non-rigid transformations usually require more complex mathematical models and algorithms; however, advances in modern computing and graphics processing technology enable non-rigid transformations to be performed in real time. This has significant implications for areas such as virtual reality, augmented reality, and interactive applications.
Three-dimensional deformation is a central research topic in the field of flexible transformation. Roh et al. [2] monitored the physical deformation of large structures by extending one-dimensional (1D) two-level piezoresistive behavior to 2D/3D surface self-sensing. Nie et al. [3] determined the material parameters of Yeoh's third-order model based on a uniaxial test and established a mathematical model of the input pressure and deformation of a water hydraulic soft actuator. Hong et al. [4] developed a small-scale soil deformation measurement system based on a fiber Bragg grating for underground displacement monitoring. Jo et al. [5] proposed a method to estimate the deformation and strain of an entire structure by measuring discrete 3D coordinate data of the target structure obtained from LiDAR. Zhang et al. [6] implemented digital image correlation technology to accurately measure the full-field strain generated in various quasi-static tensile tests, including cyclic tensile, uniaxial tensile, and stress relaxation tests, thereby accurately measuring the characteristics of polydimethylsiloxane, which are directly related to the performance of wearable strain sensors. Note that research on deformation monitoring based on finite element method (FEM) approaches typically involves a complex metamodeling process that is time consuming and labor intensive.
With the advent of Industry 4.0, the need for intelligent 3D deformation detection and digital twins is increasing. Mesh geometry can adapt to various domains as an input to artificial neural networks [7]. Therefore, some scholars have realized 3D non-rigid transformation deduction by introducing deep learning frameworks. For instance, Zou et al. [8] designed a learning method based on a deep neural network to correct constant curvature kinematics using accurate visual estimation results. Dey et al. [9] used artificial neural network (ANN) technology with fiber Bragg gratings to simultaneously measure temperature and strain. Yang et al. [10] reported an AI-based method, implementing a generative adversarial neural network based on game theory, to bridge the gap between the material microstructure (design space) and physical properties. Yang et al. [11] predicted the strain and stress tensors for a given input composite geometry based on deep learning methods. Yang et al. [12] proposed a new DIC method based on deep learning, Deep DIC, in which two convolutional neural networks (CNNs), DisplacementNet and StrainNet, collaborate to predict displacement and strain end to end. Note that 3D non-rigid transformation deduction based on deep learning can learn robustness from data and still perform well in the presence of limited noise. Furthermore, once the deep learning model is trained, 3D non-rigid transformation results can be deduced from new input data with low latency.
The contributions of this paper are as follows:
  • We propose a 3D non-rigid transformation derivation model based on multi-control point perception data to predict the shapes of entities after transformation. Unlike traditional 3D non-rigid transformation algorithms, this method does not rely on unstructured or high-resolution grids and does not involve a large number of computational operations and complex metamodeling processes. The deduction time of this model is nearly 200 times faster than traditional FEM methods.
  • This method infers the changed shape of an object based on its initial shape, the control point data in their initial state, and changes in the control point data. Using non-contact measurements to obtain control point coordinates instead of the original force magnitude and direction reduces the complexity of the model. Our method has the advantages of low computational complexity, low grid dependency, and high real-time performance.
  • We propose a CNN gate recurrent unit self-attention mechanism (CNN-GRU-SA)-based multi-input single-output derivation network. The network first uses the CNN feature extraction module to extract the initial shape. Subsequently, the network introduces the GRU to obtain the features of the initial and changed control points. The extracted features are then fused according to the self-attention mechanism. The output features are extracted through the convolutional layers.
  • We measure and compare the deduction time of the FEM method and the deep learning-based method to verify the real-time performance of the proposed method. We also compare the network with several other neural network architectures to demonstrate its applicability and optimality.
This paper is organized as follows. In Section 2, we introduce the specific process of establishing the measurement field, initialization, and real-time update of the derivation model based on multi-control point perception data. In Section 3, we introduce the three stages of the CNN-GRU-SA derivation network, namely input feature extraction, self-attention mechanism fusion, and feature output. In Section 4, we present the training process and derivation results and analyze and discuss the results. In Section 5, we conclude the study and discuss the scope for future research.

2. Model

Real-time performance is paramount for a 3D non-rigid transformation deduction model based on multi-control point perception data, which requires continuous monitoring of the entity's control points and real-time updates of the non-rigid transformation results. The essence of this approach is to avoid a secondary scan of the solid model's shape data; to achieve this, the method must represent the shape data of the solid model through control points. The deduction model is divided into three key stages, as shown in Figure 1: establishment of the measurement field, initialization, and real-time update.
(a)
Establishing the measurement field
During the establishment of the measurement field, the 3DM vision system sets the world coordinate system and simultaneously shares the system with the binocular vision measurement system.
(b)
Initialization
In the initialization stage, the binocular vision measurement system scans the overall shape S_0 of the solid model before the change as the first input of the deduction network. Simultaneously, the 3DM vision system measures the 3D coordinates C_0 of all control points as the second input of the deduction network.
(c)
Live update
In the live update stage, the 3DM vision system measures the 3D coordinates C_t of all control points at time t as the third input of the deduction network. The 3D non-rigid transformation result S_t of the changed solid model is then determined from S_0, C_0, and C_t. When a 3D non-rigid transformation occurs, the 3DM vision system only needs to re-measure the 3D coordinates of the control points; the trained deduction network then provides the 3D non-rigid transformation result of the changed physical model, achieving a real-time update (sketched below).
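The live-update loop can be summarized in a short sketch. This is a minimal illustration in Python, assuming a trained deduction network with the three-input interface described above; scan_initial_shape and measure_control_points are hypothetical stand-ins for the binocular vision and 3DM vision systems, whose actual interfaces are not specified here.

```python
import torch

def run_realtime_deduction(deduction_net, scan_initial_shape, measure_control_points):
    # (a) Measurement field: both vision systems are assumed to share one
    #     world coordinate system, established beforehand.
    # (b) Initialization: scan the unchanged shape S0 once and measure the
    #     initial control-point coordinates C0.
    s0 = scan_initial_shape()          # tensor of shape [n_points, 3]
    c0 = measure_control_points()      # tensor of shape [n_ctrl, 3]

    deduction_net.eval()
    with torch.no_grad():
        while True:
            # (c) Live update: re-measure only the control points at time t
            #     and deduce the deformed shape St without re-scanning.
            ct = measure_control_points()                       # [n_ctrl, 3]
            st = deduction_net(s0.unsqueeze(0),                 # [1, n_points, 3]
                               c0.unsqueeze(0),                 # [1, n_ctrl, 3]
                               ct.unsqueeze(0)).squeeze(0)      # [n_points, 3]
            yield st
```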

3. Method

We propose a CNN-GRU-SA deduction network, as shown in Figure 2. The architecture combines the GRU and SA modules in a multi-input, single-output network. Two main methods exist for data fusion in a multi-input network architecture: feature-level fusion and decision-level fusion. Feature-level fusion extracts features from the different input data and completes the subsequent learning task by fusing the data at the feature level. In decision-level fusion, deep learning models are first trained on different input features, and the outputs of the multiple models are then fused [13]. The proposed CNN-GRU-SA deduction network adopts a decision-level fusion method, and the training process can be divided into three stages: input feature extraction, attention mechanism fusion, and feature output.
(a)
Input feature extraction stage
The pre-change control point features (i.e., the 3D coordinates C_0) and post-change control point features (i.e., the 3D coordinates C_t) input to the network architecture are dimension-expanded by the GRU module. The reset gate of the GRU controls how strongly the input feature information is ignored, whereas the update gate helps preserve the input information. Using the GRU, the long-term dependence of the control point features can be captured better and faster, and the convergence of training can be accelerated. The feature operations are as follows:
C_0′ = G(C_0),
C_t′ = G(C_t),
where G is the GRU feature extraction module, and C_0′ and C_t′ are the dimension-expanded features of C_0 and C_t, respectively.
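A minimal PyTorch sketch of this control-point branch is given below: a GRU whose hidden size expands each sequence of 3D coordinates from 3 to 256 features, matching the shapes in Table 1. The single-layer configuration and any hyperparameters beyond those stated are assumptions.

```python
import torch
import torch.nn as nn

class ControlPointGRU(nn.Module):
    """Expands control-point coordinates [B, 20, 3] to features [B, 20, 256]."""

    def __init__(self, in_dim: int = 3, feat_dim: int = 256):
        super().__init__()
        # A single-layer GRU is assumed; only the module type and the
        # 3 -> 256 feature expansion performed by G are specified.
        self.gru = nn.GRU(input_size=in_dim, hidden_size=feat_dim, batch_first=True)

    def forward(self, c: torch.Tensor) -> torch.Tensor:
        feats, _ = self.gru(c)   # C' = G(C), shape [B, 20, 256]
        return feats

# Example: the same kind of module is applied to C_0 and C_t separately.
g = ControlPointGRU()
c0 = torch.randn(8, 20, 3)
c0_feat = g(c0)                  # [8, 20, 256]
```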
Simultaneously, the pre-change point cloud feature (i.e., the overall shape S_0) input to the network architecture is dimension-expanded through a feature extraction module comprising four sets of CNN, batch normalization (BN), and ReLU activation functions. First, the feature dimensions are expanded using the CNN to enhance the robustness of the data. Second, to prevent excessive deviations and offsets, BN is used to standardize the features and unify the data distribution. Finally, the ReLU activation function is used to enhance the ability of the features to change nonlinearly. The feature operations are as follows:
S_1 = R(B(F(S_0))),
S_2 = R(B(F(S_1))),
S_3 = R(B(F(S_2))),
S_0′ = R(B(F(S_3))),
where F is a 1D convolution operation with a kernel size and stride of one, B is the batch normalization operation, and R is the ReLU activation function. S_1, S_2, S_3, and S_0′ are the dimension-expanded features.
At this stage, the feature dimension of each input is expanded from 3 to 256, while the size of the first dimension remains unchanged.
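A minimal PyTorch sketch of this cloud feature module is shown below, with four Conv1d-BN-ReLU groups of kernel size and stride one expanding 3 to 256 channels. The intermediate channel widths are assumptions, as only the kernel size, stride, and overall 3-to-256 expansion are specified.

```python
import torch
import torch.nn as nn

class CloudFeatureModule(nn.Module):
    """Four Conv1d-BN-ReLU groups expanding point features from 3 to 256 channels.
    Intermediate widths (32, 64, 128) are assumed, not taken from the paper."""

    def __init__(self, widths=(3, 32, 64, 128, 256)):
        super().__init__()
        layers = []
        for c_in, c_out in zip(widths[:-1], widths[1:]):
            layers += [nn.Conv1d(c_in, c_out, kernel_size=1, stride=1),
                       nn.BatchNorm1d(c_out),
                       nn.ReLU(inplace=True)]
        self.net = nn.Sequential(*layers)

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        # s: [B, N, 3] point cloud; Conv1d expects channels first, [B, C, N]
        return self.net(s.transpose(1, 2)).transpose(1, 2)   # [B, N, 256]

s0 = torch.randn(8, 1208, 3)
s0_feat = CloudFeatureModule()(s0)   # [8, 1208, 256]
```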
(b)
Attention mechanism fusion stage
First, the three new features C_0′, C_t′, and S_0′ obtained during the input feature extraction stage are concatenated along the first dimension to form a new feature. The feature operation is as follows:
S_con = Con(C_0′, C_t′, S_0′),
where Con is the concatenation function and S_con is the new feature obtained after concatenation.
Subsequently, the new feature is sent to the SA module to obtain the attention weight. The feature operation is as follows:
S_SA = SA(S_con),
where SA is the self-attention mechanism module and S_SA is the obtained attention weight.
Finally, the concatenated feature S_con and the attention weight S_SA are added. The feature operation is as follows:
S_t′ = S_con + S_SA,
where S_t′ represents the new feature under the attention mechanism.
Based on the position information of the features, each feature is processed through the SA mechanism; that is, features at different positions are assigned different weights. This allows the model to distinguish the importance of features at different locations and thus process the fused features more effectively.
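The fusion stage can be sketched as follows. nn.MultiheadAttention is used here as a stand-in for the paper's SA module, whose internal structure (including the pooled [1, 256] attention weight shape listed in Table 1) is not detailed; in this sketch the attention output keeps the full concatenated shape before the residual addition.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Concatenates the three feature sets and adds self-attention weights."""

    def __init__(self, feat_dim: int = 256, num_heads: int = 4):
        super().__init__()
        # Stand-in self-attention; the number of heads is an assumption.
        self.attn = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)

    def forward(self, c0_feat, ct_feat, s0_feat):
        # S_con = Con(C_0', C_t', S_0'): concatenate along the point dimension
        s_con = torch.cat([c0_feat, ct_feat, s0_feat], dim=1)   # [B, 1248, 256]
        # S_SA = SA(S_con): position-dependent attention weights
        s_sa, _ = self.attn(s_con, s_con, s_con)
        # S_t' = S_con + S_SA: residual combination of the fused features
        return s_con + s_sa

fusion = AttentionFusion()
out = fusion(torch.randn(2, 20, 256), torch.randn(2, 20, 256), torch.randn(2, 1208, 256))
print(out.shape)   # torch.Size([2, 1248, 256])
```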
(c)
Output feature stage
The new feature obtained in the attention mechanism fusion stage is input to a fully connected (FC) layer for a feature shrinkage operation, making the first dimension equal to that of the output; its function is to select information relevant to the output features from the fused features. A further feature shrinkage operation is then performed by a feature extraction module composed of three sets of CNN, BN, and ReLU activation functions, making the second dimension equal to that of the output feature. The function of the output feature module is to reduce the dimensionality of the data from 256 to 3.
The CNN-GRU-SA deduction network proposed in this study expands the input features through the CNN and GRU modules, decreasing the probability of feature loss during training. In addition, the attention weights of the fused features are extracted through the SA module to accelerate convergence during training. The final output is obtained by shrinking and extracting the data through the FC and output feature modules. The main purpose of this feature processing scheme is to introduce more nonlinear transformations and increase the representation ability of the network, while reducing the feature dimension to lower the computational complexity and prevent overfitting. The combination of dimensionality expansion and compression enables the neural network to model complex data effectively within a relatively small hidden space. These operations produce a flexible and efficient network for learning feature representations, improving the generalization ability of the model and reducing the consumption of computing resources when processing large-scale data.
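For orientation, the following self-contained sketch assembles the three stages into one network with the input and output shapes of Table 1. The intermediate channel widths, the number of attention heads, and the omission of an activation after the final convolution are assumptions; the sketch illustrates the structure rather than reproducing the authors' exact implementation.

```python
import torch
import torch.nn as nn

def conv_bn_relu(c_in, c_out):
    # One Conv1d-BN-ReLU group with kernel size 1 and stride 1
    return [nn.Conv1d(c_in, c_out, kernel_size=1, stride=1),
            nn.BatchNorm1d(c_out), nn.ReLU(inplace=True)]

class CNNGRUSA(nn.Module):
    """Sketch of the CNN-GRU-SA deduction network (shapes follow Table 1)."""

    def __init__(self, n_points=1208, n_ctrl=20, feat_dim=256):
        super().__init__()
        # Cloud feature module: four Conv1d-BN-ReLU groups, 3 -> 256
        self.cloud = nn.Sequential(*conv_bn_relu(3, 32), *conv_bn_relu(32, 64),
                                   *conv_bn_relu(64, 128), *conv_bn_relu(128, feat_dim))
        # Control-point feature modules: GRUs expanding 3 -> 256
        self.gru0 = nn.GRU(3, feat_dim, batch_first=True)
        self.gru_t = nn.GRU(3, feat_dim, batch_first=True)
        # Self-attention fusion over the concatenated features (stand-in SA)
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        # FC shrinks the point dimension: 1248 -> 1208
        self.fc = nn.Linear(n_points + 2 * n_ctrl, n_points)
        # Output feature module: Conv1d groups shrinking 256 -> 3
        # (final activation omitted so coordinates may be negative)
        self.out = nn.Sequential(*conv_bn_relu(feat_dim, 64), *conv_bn_relu(64, 16),
                                 nn.Conv1d(16, 3, kernel_size=1))

    def forward(self, s0, c0, ct):
        s0_f = self.cloud(s0.transpose(1, 2)).transpose(1, 2)   # [B, 1208, 256]
        c0_f, _ = self.gru0(c0)                                  # [B, 20, 256]
        ct_f, _ = self.gru_t(ct)                                 # [B, 20, 256]
        s_con = torch.cat([c0_f, ct_f, s0_f], dim=1)             # [B, 1248, 256]
        s_sa, _ = self.attn(s_con, s_con, s_con)
        fused = s_con + s_sa                                     # fused feature
        shrunk = self.fc(fused.transpose(1, 2))                  # [B, 256, 1208]
        return self.out(shrunk).transpose(1, 2)                  # [B, 1208, 3]

if __name__ == "__main__":
    net = CNNGRUSA()
    st = net(torch.randn(2, 1208, 3), torch.randn(2, 20, 3), torch.randn(2, 20, 3))
    print(st.shape)   # torch.Size([2, 1208, 3])
```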

4. Experiment

4.1. Dataset

In this study, the experimental object was a wing model made of carbon fiber. First, we imported the STL file of the model and set the fixed support and deformation areas. Twenty points were then selected as control points; the selection principle is to minimize their number while providing maximum coverage of the deformation area. Next, an external force between 0 and 20 N was applied along the positive or negative z-axis direction at the control points. Finally, 10^5 sets of point cloud coordinates before and after deformation were obtained using simulation analysis software. Because the control points were selected from the point cloud, the indices of the points before and after deformation remain unchanged, and the changed control point coordinates can be obtained directly from the changed point cloud. The deformation setup is shown in Figure 3, where the green area on the left is the fixed support area, the beige area is the deformation area, and the red spherical points are the control points.
The dataset used for network training contained 10^5 samples, of which the first 80% formed the training set and the last 20% formed the test set. Each sample contained the following data (a minimal Dataset sketch follows the list):
  • Input 1: xyz coordinates of 20 control points before change;
  • Input 2: xyz coordinates of 20 changed control points;
  • Input 3: xyz coordinates of the point cloud before a change, where the point cloud contains 1208 points;
  • Output: xyz coordinates of the point cloud after a change, where the point cloud contains 1208 points.
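The sample layout above can be packaged in a PyTorch Dataset, as in this minimal sketch; the pre-loaded arrays and their names are assumptions.

```python
import torch
from torch.utils.data import Dataset, Subset

class DeformationDataset(Dataset):
    """Minimal sketch of the sample layout listed above. The arrays
    c0_all, ct_all, s0_all, st_all are assumed to be pre-loaded with
    shapes [N, 20, 3], [N, 20, 3], [N, 1208, 3], and [N, 1208, 3]."""

    def __init__(self, c0_all, ct_all, s0_all, st_all):
        self.c0, self.ct, self.s0, self.st = (
            torch.as_tensor(a, dtype=torch.float32)
            for a in (c0_all, ct_all, s0_all, st_all)
        )

    def __len__(self):
        return self.c0.shape[0]

    def __getitem__(self, i):
        # Inputs 1-3 and the target (changed) point cloud for one sample
        return (self.c0[i], self.ct[i], self.s0[i]), self.st[i]

# Chronological split: first 80% of samples for training, last 20% for testing.
# ds = DeformationDataset(c0_all, ct_all, s0_all, st_all)
# cut = int(0.8 * len(ds))
# train_ds, test_ds = Subset(ds, range(cut)), Subset(ds, range(cut, len(ds)))
```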

4.2. Parameter Settings

The experimental configuration and network parameters were as follows (a training-loop sketch using these settings follows the list):
  • Hardware equipment: NVIDIA GeForce RTX 3080 GPU;
  • Operating system: Windows 10 Professional Workstation Edition;
  • Software tools: Python 3.10 programming language, PyTorch 2.0 deep learning framework, and other third-party libraries;
  • Batch size: 8;
  • Initial learning rate: 0.001;
  • Learning rate decay: multiply by 0.7 every 10 epochs;
  • Optimization algorithm: Adam optimization algorithm [14];
  • Evaluation indicators: L1 Loss (c = 1) [15] and RMSE;
  • Network module input and output feature parameters, as shown in Table 1.
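The settings above can be wired into a standard PyTorch training loop, as sketched below. CNNGRUSA and DeformationDataset refer to the sketches given earlier in this article, and nn.L1Loss is used as a stand-in for the robust L1 loss (c = 1) of [15]; the logging details are illustrative only.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train(model, dataset, epochs=400,
          device="cuda" if torch.cuda.is_available() else "cpu"):
    """Training-loop sketch: batch size 8, Adam with initial learning rate
    0.001, decay by 0.7 every 10 epochs, L1 loss, RMSE tracked as a metric."""
    loader = DataLoader(dataset, batch_size=8, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.7)
    loss_fn = nn.L1Loss()              # stand-in for the robust L1 loss (c = 1) of [15]
    model.to(device)
    for epoch in range(epochs):
        model.train()
        for (c0, ct, s0), st in loader:
            c0, ct, s0, st = (t.to(device) for t in (c0, ct, s0, st))
            pred = model(s0, c0, ct)   # deduced point cloud [B, 1208, 3]
            loss = loss_fn(pred, st)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        rmse = torch.sqrt(torch.mean((pred - st) ** 2))   # RMSE of the last batch
        scheduler.step()               # learning-rate decay: x0.7 every 10 epochs
        print(f"epoch {epoch + 1}: loss {loss.item():.2e}, rmse {rmse.item():.2e}")
```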
In addition to the training time, the dataset size and the number of iterations also affect network training. The results shown in Figure 4 were obtained through comparative experiments.
As shown in Figure 4a, the loss value decreases rapidly during the first 200 iterations. Between 200 and 400 iterations, the loss values tend to converge. Between 400 and 1000 iterations, the fluctuation in the loss value is less than or equal to 5 × 10^−7, and the disturbance range is less than 1% of the loss value. This experiment indicates that the training results of the network are not significantly optimized when the number of iterations exceeds 400. Therefore, the number of iterations was set to 400 for network training in this study.
In addition, we first conducted network training on datasets of different sizes, deduced the 3D non-rigid transformations of all data through the obtained network models, and finally determined the deviation (i.e., the Euclidean distance) between the deduced and actual results. As shown in Figure 4b, the experimental results show that when the dataset size reached 10^5, the deviation statistics stabilized and became more compactly distributed. Moreover, at a dataset size of 10^5, Q3 (upper quartile), Q2 (median), Q1 (lower quartile), and the mean attain their smallest values, and the upper and lower limits approach their minimum values. The detailed deviation statistics are listed in Table 2.
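The deviation statistics of Table 2 can be computed from per-point Euclidean distances, as in the following sketch. The "upper limit" and "lower limit" are taken here as the maximum and minimum deviations, which is an assumption and may differ from the box-plot whisker convention used in Figure 4b.

```python
import numpy as np

def deviation_stats(pred: np.ndarray, truth: np.ndarray) -> dict:
    """Per-point Euclidean deviation between the deduced and ground-truth
    point clouds (both [N, 3], same point indexing), summarized with the
    statistics reported in Table 2."""
    dev = np.linalg.norm(pred - truth, axis=1)        # Euclidean distance per point
    q1, q2, q3 = np.percentile(dev, [25, 50, 75])
    return {"upper_limit": float(dev.max()), "Q3": float(q3), "Q2": float(q2),
            "Q1": float(q1), "lower_limit": float(dev.min()), "mean": float(dev.mean())}
```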

4.3. Results and Discussion

First, to verify the real-time performance of the 3D non-rigid transformation deduction model based on multi-control point perception data, we deduced all samples in the dataset and compared the time taken by the FEM method and the proposed model. The manual secondary scanning time was considered to be at least 20 min. According to the statistical results presented in Figure 5 and Table 3, the FEM method required an average of 4.155 s, whereas the proposed model required an average deduction time of only 0.021 s, nearly 200 times faster. This real-time performance is also vastly superior to that of manual secondary scanning.
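A rough sketch of how a per-sample deduction time can be measured is given below; it assumes a CUDA device and a model with the three-input interface used above, and is not necessarily the exact benchmarking procedure behind Table 3.

```python
import time
import torch

def average_deduction_time(model, samples, device="cuda"):
    """samples: iterable of (s0, c0, ct) tensors for single entities.
    CUDA synchronization is included so asynchronous GPU work is counted."""
    model.eval().to(device)
    times = []
    with torch.no_grad():
        for s0, c0, ct in samples:
            s0, c0, ct = (t.unsqueeze(0).to(device) for t in (s0, c0, ct))
            if str(device).startswith("cuda"):
                torch.cuda.synchronize()
            start = time.perf_counter()
            model(s0, c0, ct)
            if str(device).startswith("cuda"):
                torch.cuda.synchronize()
            times.append(time.perf_counter() - start)
    return sum(times) / len(times)
```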
Second, to verify the feasibility and optimality of the CNN-GRU-SA network architecture, we compared it with CNN-GRU [16], CNN-GRU-CBAM [17], and CNN-LSTM-SA [18] through comparative experiments. Among them, CNN-GRU-CBAM incorporates the Convolutional Block Attention Module (CBAM) [19], and CNN-LSTM-SA incorporates Long Short-Term Memory (LSTM) [20]. The training and testing comparison results are shown in Figure 6.
The results indicate that the CNN-GRU-SA proposed in this paper has the following advantages:
  • In Figure 6a,b, compared with the other three network architectures, the loss and RMSE of CNN-GRU-SA converge faster on the training set (80% of the data).
  • In Figure 6c,d, on the test set (20% of the data), the fluctuations in the loss and RMSE of the proposed network are also smaller than those of the comparison networks.
  • To further verify the performance of CNN-GRU-SA, we provide the final results of network training after all iterations were completed. The results listed in Table 4 show that, relative to the other networks, the proposed CNN-GRU-SA architecture reduces the training set loss by up to 39%, the training set RMSE by up to 49%, the testing set loss by up to 48%, and the testing set RMSE by up to 29%.
Finally, to show the network deduction results more intuitively, we selected a set of data from the dataset, performed deduction using the trained network models, and generated deviation maps between the deduction results and the label data. Figure 7a–c shows the control points before and after the change and the appearance of the entity before the change. Figure 7d shows the changed appearance, i.e., the label data that the network architectures must obtain through deduction. Figure 7e–h depicts the deviation maps between the results of the four network architectures and the label data. These four figures indicate that outliers with deviations of approximately 0.4 mm remain in the deduction results of CNN-GRU and CNN-GRU-CBAM. The deduction results of CNN-LSTM-SA and CNN-GRU-SA outperform those of the other two networks. The proposed CNN-GRU-SA performs better at the edge of the tip than CNN-LSTM-SA, and our method also yields better deviation results in areas far from the tip. In addition, we conducted a statistical analysis of the deviations, as shown in Table 5. In terms of the minimum deviation, the difference between CNN-GRU-SA and the best value among all networks is less than 10^−3 mm; in terms of the maximum, mean, and standard deviation, CNN-GRU-SA is the best. Compared with the other three networks, our network deduces the changed appearance more accurately.

4.4. Verification Example

To verify the effectiveness of the proposed model, we constructed the experimental environment presented in Figure 8. The 3DM vision system obtains the three-dimensional coordinates of the control points in real time via encoded points, while the binocular vision measurement system obtains the shape data of the solid model.
The control point data we collected at time t are shown in Table 6.
Based on the coordinates of the control points, we obtained the deduction results presented in Figure 9 using the proposed model. The gray point cloud represents the appearance data obtained by the binocular vision system, and the red point cloud represents the deformation results deduced by the proposed model. Significant deviations are observed in three regions (A1, B1, and C1) because the corresponding regions A2, B2, and C2 are encoding point locations that contain no point cloud information; therefore, only the closest points could be used to calculate deviations. Additionally, owing to the poor quality of the point cloud at the edge of the scanning results, the deviation of the deduction results in region D could only be calculated by finding nearby relevant points. Nevertheless, most of the results are consistent with the actual situation. According to our statistical analysis, the average deviation was 0.57544 mm.
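A minimal sketch of the nearest-point deviation computation described above is shown below, using a k-d tree from SciPy; this is one reasonable implementation of the comparison, not necessarily the exact procedure used.

```python
import numpy as np
from scipy.spatial import cKDTree

def nearest_point_deviation(deduced: np.ndarray, scanned: np.ndarray) -> np.ndarray:
    """Distance from each deduced point to its nearest neighbour in the
    scanned cloud. Missing regions (encoding points, noisy edges) make an
    index-wise comparison impossible, so the closest scanned point is used."""
    tree = cKDTree(scanned)
    dists, _ = tree.query(deduced, k=1)
    return dists

# Example: mean deviation between the deduced (red) and scanned (gray) clouds
# mean_dev = nearest_point_deviation(red_cloud, gray_cloud).mean()
```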

5. Conclusions

In this study, we developed a 3D non-rigid transformation deduction model based on multi-control point perception data. The proposed model provides real-time performance and can rapidly deduce the non-rigid transformations of entities from the positions of multiple control points. The average deduction time was 0.021 s, nearly 200 times faster than the FEM method. Compared to traditional FEM methods, our method does not require the magnitude, direction, and position of the applied force as inputs; it requires only the three-dimensional coordinates of the control points before and after the change. We proposed the CNN-GRU-SA network architecture, which extracts control point features through a GRU module, adaptively adjusts the feature weights using an SA module, and produces the final output features. Compared to the three other network architectures implemented for comparison, the proposed network exhibited faster loss convergence and a less volatile RMSE decline curve. On the training set, the loss and RMSE of the proposed network decreased by up to 39% and 49%, respectively, compared to the other three networks. Additionally, the changed shape deduced by the proposed network was the closest to the actual non-rigid transformation result of the target object, with an average deviation of only 0.055 mm. The proposed deep learning method based on multi-control point perception data can efficiently and rapidly deduce the shape of an entity after a change, improving measurement efficiency to a certain extent. In the current model, all control point data must be available. However, loss of control point data is inevitable owing to physical occlusion and the limited measurement range of the equipment. Therefore, in the future, we will attempt to deduce the 3D non-rigid transformations of entities from incomplete sensor data, which would have greater practical significance.

Author Contributions

Conceptualization, D.Y.; methodology, D.Y. and S.C.; software, D.Y.; validation, D.Y.; formal analysis, D.Y.; investigation, D.Y.; resources, D.Y.; data curation, D.Y.; writing—original draft preparation, D.Y.; writing—review and editing, D.Y.; visualization, D.Y.; supervision, Y.L., L.L., X.L., and L.G.; project administration, D.Y. and X.L.; funding acquisition, X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Key Research and Development Project of the Jilin Province Science and Technology Development Program (No. 20200401019GX) and the Zhongshan Social Public Welfare Science and Technology Research Project (No. 2022B2013).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Owing to the large dataset, we only uploaded the entire code to GitHub at https://github.com/YDM-Cloud/CNN-GRU-SA on 20 July 2023.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have influenced the work reported in this study.

References

  1. Ligas, M.; Prochniewicz, D. Procrustes based closed-form solution to the point-wise weighted rigid-body transformation in asymmetric and symmetric cases. J. Spat. Sci. 2021, 66, 445–457. [Google Scholar] [CrossRef]
  2. Roh, H.D.; Park, Y.-B. Carbon fiber grid sensor for structural deformation using piezoresistive behavior of carbon fiber. Sens. Actuators A Phys. 2022, 341, 113348. [Google Scholar] [CrossRef]
  3. Nie, S.; Huo, L.; Ji, H.; Lan, Y.; Wu, Z. Bending deformation characteristics of high-pressure soft actuator driven by water-hydraulics for underwater manipulator. Sens. Actuators A Phys. 2022, 344, 113736. [Google Scholar] [CrossRef]
  4. Hong, C.; Zhang, Y.; Yang, Y.; Yuan, Y. A FBG based displacement transducer for small soil deformation measurement. Sens. Actuators A Phys. 2019, 286, 35–42. [Google Scholar] [CrossRef]
  5. Jo, H.C.; Kim, J.; Lee, K.; Sohn, H.-G.; Lim, Y.M. Non-contact strain measurement for laterally loaded steel plate using LiDAR point cloud displacement data. Sens. Actuators A Phys. 2018, 283, 362–374. [Google Scholar] [CrossRef]
  6. Zhang, S.; Ge, C.; Liu, R. Mechanical characterization of the stress-strain behavior of the polydimethylsiloxane (PDMS) substate of wearable strain sensors under uniaxial loading conditions. Sens. Actuators A Phys. 2022, 341, 113580. [Google Scholar] [CrossRef]
  7. Deng, B.; Yao, Y.; Dyke, R.M.; Zhang, J. A Survey of Non-Rigid 3D Registration. Comput. Graph. Forum 2022, 41, 559–589. [Google Scholar] [CrossRef]
  8. Zou, S.; Lyu, Y.; Qi, J.; Ma, G.; Guo, Y. A deep neural network approach for accurate 3D shape estimation of soft manipulator with vision correction. Sens. Actuators A Phys. 2022, 344, 113692. [Google Scholar] [CrossRef]
  9. Dey, K.; Vangety, N.; Roy, S. Machine learning approach for simultaneous measurement of strain and temperature using FBG sensor. Sens. Actuators A Phys. 2022, 333, 113254. [Google Scholar] [CrossRef]
  10. Yang, Z.; Yu, C.-H.; Buehler, M.J. Deep learning model to predict complex stress and strain fields in hierarchical composites. Sci. Adv. 2021, 7, eabd7416. [Google Scholar] [CrossRef]
  11. Yang, Z.; Yu, C.-H.; Guo, K.; Buehler, M.J. End-to-end deep learning method to predict complete strain and stress tensors for complex hierarchical composite microstructures. J. Mech. Phys. Solids 2021, 154, 104506. [Google Scholar] [CrossRef]
  12. Yang, R.; Li, Y.; Zeng, D.; Guo, P. Deep DIC: Deep learning-based digital image correlation for end-to-end displacement and strain measurement. J. Mater. Process. Technol. 2022, 302, 117474. [Google Scholar] [CrossRef]
  13. Huang, S.-C.; Pareek, A.; Seyyedi, S.; Banerjee, I.; Lungren, M.P. Fusion of medical imaging and electronic health records using deep learning: A systematic review and implementation guidelines. npj Digit. Med. 2020, 3, 136. [Google Scholar] [CrossRef]
  14. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar] [CrossRef]
  15. Barron, J.T. A general and adaptive robust loss function. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 4326–4334. [Google Scholar] [CrossRef]
  16. Dua, N.; Singh, S.N.; Semwal, V.B. Multi-input CNN-GRU based human activity recognition using wearable sensors. Computing 2021, 103, 1461–1478. [Google Scholar] [CrossRef]
  17. Cao, B.; Li, C.; Song, Y.; Qin, Y.; Chen, C. Network intrusion detection model based on CNN and GRU. Appl. Sci. 2022, 12, 4184. [Google Scholar] [CrossRef]
  18. Pezzelle, S.; Fernández, R. Is the red square big? MALeViC: Modeling adjectives leveraging visual contexts. arXiv 2019, arXiv:1908.10285. [Google Scholar]
  19. Liang, Y.; Lin, Y.; Lu, Q. Forecasting gold price using a novel hybrid model with ICEEMDAN and LSTM-CNN-CBAM. Expert Syst. Appl. 2022, 206, 117847. [Google Scholar] [CrossRef]
  20. Yu, Y.; Si, X.; Hu, C.; Zhang, J. A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput. 2019, 31, 1235–1270. [Google Scholar] [CrossRef]
Figure 1. 3D non-rigid transformation deduction model based on multi-control point perception data.
Figure 2. CNN-GRU-SA network architecture.
Figure 3. Schematic diagram of the deformation method.
Figure 4. (a) Decline of loss during the iteration process; (b) statistical results of bias after deduction of all training data under different datasets.
Figure 5. (a) Statistical results of FEM deduction time; (b) statistical results of neural network deduction time.
Figure 6. (a) Loss drop comparison of the training set; (b) statistical results of the RMSE of the training set; (c) loss drop comparison of the test set; (d) statistical results of the RMSE of the test set.
Figure 7. (a) Control point before change; (b) surface before change; (c) control point after change; (d) surface after change; (e) deduction results of CNN-LSTM-SA; (f) deduction results of CNN-GRU; (g) deduction results of CNN-GRU-CBAM; (h) deduction results of CNN-GRU-SA.
Figure 8. Real experimental environment.
Figure 9. Actual deduction results. (A–C) Deviation caused by encoding points; (D) missing scanning data in the edge area.
Table 1. Network module input and output feature parameters.

No. | Module Name | Layer Name | Input Shape | Output Shape
1 | Cloud Feature Module | Conv-BN-ReLU (4 groups) | [1208, 3] | [1208, 256]
2 | Control Point Feature Module | GRU | [20, 3] | [20, 256]
3 | Changed Control Point Feature Module | GRU | [20, 3] | [20, 256]
4 | Connect Feature Module | Concat | [20, 256]; [1208, 256]; [20, 256] | [1248, 256]
5 | Self-Attention Module | SA | [1248, 256] | [1, 256]
6 | Full Connection Output Module | FC | [1248, 256] | [1208, 256]
7 | Output Feature Module | Conv-BN-ReLU (3 groups) | [1208, 256] | [1208, 3]
Table 2. Derivation bias under different dataset sizes.

Data Size (Piece) | Upper Limit (mm) | Q3 (mm) | Q2 (mm) | Q1 (mm) | Lower Limit (mm) | Mean (mm)
1 × 10^4 | 0.430 | 0.210 | 0.110 | 0.060 | 0.003 | 0.165
2 × 10^4 | 0.397 | 0.195 | 0.095 | 0.061 | 0.008 | 0.143
3 × 10^4 | 0.409 | 0.190 | 0.079 | 0.045 | 0.006 | 0.132
4 × 10^4 | 0.329 | 0.158 | 0.065 | 0.042 | 0.003 | 0.112
5 × 10^4 | 0.355 | 0.168 | 0.075 | 0.042 | 0.002 | 0.116
6 × 10^4 | 0.193 | 0.099 | 0.055 | 0.034 | 0.002 | 0.074
7 × 10^4 | 0.197 | 0.103 | 0.059 | 0.040 | 0.006 | 0.083
8 × 10^4 | 0.175 | 0.092 | 0.055 | 0.037 | 0.004 | 0.071
9 × 10^4 | 0.130 | 0.072 | 0.049 | 0.032 | 0.003 | 0.059
1 × 10^5 | 0.131 | 0.071 | 0.043 | 0.029 | 0.004 | 0.055
Table 3. Deduction time comparison results.

Method | Max (s) | Min (s) | Mean (s) | Std (s)
FEM | 14.883 | 2.807 | 4.155 | 0.714
Ours | 0.628 | 0.013 | 0.021 | 0.004
Table 4. Network comparison results.

Network | Train Loss | Test Loss | Train RMSE (mm) | Test RMSE (mm)
CNN-GRU | 1.03 × 10^−4 | 5.60 × 10^−8 | 1.99 × 10^−4 | 2.31 × 10^−4
CNN-GRU-CBAM | 1.07 × 10^−4 | 5.56 × 10^−8 | 2.02 × 10^−4 | 2.30 × 10^−4
CNN-LSTM-SA | 9.32 × 10^−5 | 5.14 × 10^−8 | 1.80 × 10^−4 | 2.24 × 10^−4
CNN-GRU-SA | 6.31 × 10^−5 | 2.94 × 10^−8 | 1.02 × 10^−4 | 1.64 × 10^−4
Table 5. Deviation statistics results.

Network | Max (mm) | Min (mm) | Mean (mm) | Std (mm)
CNN-GRU | 4.90 × 10^−1 | 6.42 × 10^−3 | 6.22 × 10^−2 | 4.99 × 10^−2
CNN-GRU-CBAM | 3.80 × 10^−1 | 3.56 × 10^−3 | 5.99 × 10^−2 | 4.75 × 10^−2
CNN-LSTM-SA | 2.52 × 10^−1 | 4.80 × 10^−3 | 6.60 × 10^−2 | 4.39 × 10^−2
CNN-GRU-SA | 2.33 × 10^−1 | 3.79 × 10^−3 | 5.70 × 10^−2 | 3.89 × 10^−2
Table 6. Control point coordinates.

Encoding Point Number | x at t0 (mm) | y at t0 (mm) | z at t0 (mm) | x at t (mm) | y at t (mm) | z at t (mm)
31 | 176.18 | 224.35 | −0.05 | 176.18 | 224.40 | −0.12
36 | 176.23 | 145.36 | −0.03 | 176.18 | 145.34 | −0.14
34 | 176.20 | 64.40 | 0.03 | 176.21 | 64.41 | −0.12
42 | 296.23 | 185.43 | −0.01 | 296.18 | 185.36 | −0.84
16 | 296.22 | 65.37 | −0.03 | 296.20 | 65.40 | −0.84
48 | 376.27 | 226.37 | 0.04 | 376.18 | 226.42 | −1.61
47 | 376.23 | 106.35 | 0.04 | 376.27 | 106.40 | −1.65
41 | 377.19 | 26.42 | −0.01 | 377.23 | 26.40 | −1.61
58 | 456.18 | 165.34 | 0.00 | 456.24 | 165.38 | −2.70
28 | 455.27 | 65.41 | 0.04 | 455.24 | 65.45 | −2.64
43 | 537.22 | 42.35 | 0.01 | 537.22 | 42.39 | −3.99
45 | 535.25 | 145.34 | −0.03 | 535.27 | 145.38 | −3.92
64 | 616.25 | 184.39 | −0.05 | 616.29 | 184.36 | −5.43
63 | 616.25 | 65.41 | −0.03 | 616.25 | 65.38 | −5.42
61 | 696.17 | 104.39 | 0.03 | 696.27 | 104.44 | −7.21
62 | 777.20 | 145.37 | 0.01 | 777.26 | 145.39 | −9.13
65 | 777.18 | 65.39 | 0.01 | 777.25 | 65.36 | −9.14
66 | 855.25 | 26.43 | −0.04 | 855.21 | 26.35 | −11.27
281 | 936.27 | 106.35 | 0.00 | 936.23 | 106.37 | −13.60
68 | 1056.19 | 65.36 | 0.00 | 1056.28 | 65.38 | −17.42