Peer-Review Record

Deep Learning Approach for Deduction of 3D Non-Rigid Transformation Based on Multi-Control Point Perception Data

Appl. Sci. 2023, 13(23), 12602; https://doi.org/10.3390/app132312602
by Dongming Yan 1, Lijuan Li 1,2,*, Yue Liu 1, Xuezhu Lin 1,2, Lili Guo 1,2 and Shihan Chao 1
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Reviewer 4:
Submission received: 11 October 2023 / Revised: 13 November 2023 / Accepted: 18 November 2023 / Published: 23 November 2023
(This article belongs to the Topic Complex Systems and Artificial Intelligence)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The paper is interesting, but the methodology could be addressed more clearly, in particular the logic involved in the various AI models and how they connect together. In addition, the limitations of this study and directions for future research should be addressed in the Conclusion section.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This paper presents a 3D flexible transformation model based on multi-control point perception data. It predicts the shapes of entities after transformation and performs well while meeting real-time requirements. However, a critical issue lies in the lack of explicit differentiation between the proposed model and existing ones, as well as a detailed comparison and explanation of its superiority. The paper requires revision to improve its quality. The following are comments and suggestions:

- It would be beneficial to explicitly present the contributions in bullet-point format as the second-to-last paragraph of Section 1.

- Additional network and model details are needed. A comparison with existing networks, highlighting the unique features and advantages of the authors’ model, would be beneficial.

- Figure 4(a) should be enlarged for better clarity.

- The captions for Figure 6 and Table 3 need correction.

- In Figure 6, consider making the legend for the proposed network “CNN_GRU_SA” bold and increasing the font size for improved comprehension.

- In line 286, it is unclear whether “other network” refers to the mean values of all other networks. Clarification is needed.

- References for all other networks used in tables, figures, and text should be added.

- Section 4.3 requires more detailed information on how the proposed network outperforms existing ones. Consider adding a table with quantitative results from Figure 7.

Comments on the Quality of English Language

This version requires thorough proofreading before acceptance, as it contains multiple typos and grammatical errors. For instance, 'feedback' should be a single word.

 

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

I thank the authors for writing up the results of this interesting research and the editor for giving me the opportunity to review it.

The authors address the problem of quickly computing the shape of a deformed body from the original shape and the information about movement of several control points. They propose a particular architecture of a neural network for the purpose, and demonstrate its efficiency and accuracy on an example.

In principle, I admire the study. The methods are precisely described, the results are clear and appropriately discussed. I appreciate that the computer code used is made available. There are, however, some omissions that should be addressed before the article is published.


My main worry is whether this method serves a practical purpose: is it possible to use it in practice, and what is the proposed workflow of its application? The manuscript does not answer these questions, but it should.

The use of the word "experimental" (already in the Abstract) is quite deceiving, and so are the measurement tools illustrated in Figure 1; the presented study is fully computational. No experiment or measurement was performed. The reader only realizes this after reading the whole paper and encountering no experiments or measurements. This is not acceptable; please make it clear (in the title, Abstract, Introduction and Conclusions) that this is a completely computational modelling study.

The presented CNN-GRU-SA network imitates the predictions of the finite-element method (FEM) model. It is therefore a surrogate model. Surrogate modelling makes sense if the surrogate model has an important advantage over the original model. We learn that the new method computes the deformed shape in 0.021 s – how long does Ansys FEM need to perform the same computation?

I see that the original point cloud is an input to the CNN-GRU-SA network, in addition to the original and changed control points. However, I presume the CNN-GRU-SA network only works for the wing, the object it was trained on (the manuscript trains it on the wing and applies it only to the wing). Am I right, or would it also be able to model a different object (a tree, a sofa, a bicycle frame) if a different cloud and control points were provided as the input without re-training? If it does generalise like that, does it only work for carbon fiber objects (as it was trained on a carbon fiber wing) or for other materials too?

If the model has to be trained for every object and if a FEM model has to be built for the purpose, please specify in the manuscript that a FEM model is a step in the suggested workflow. If so, one does not really avoid the "very complex finite-element metamodeling process that is time-consuming and labor-intensive" by using the proposed method.

How long does the training on 100,000 sets of point clouds take?

Modelling a physical object using the CNN-GRU-SA network without making a FEM model first seems challenging as well. One needs 100,000 sets of point clouds at different states of deformation for training. Measuring these clouds seems challenging, particularly because the points of the clouds are supposed to move with the material of the object from one deformation state to another (it is not enough to independently measure the shape of the surface at each deformed state). Is one supposed to go directly from a physical object to the CNN-GRU-SA network without making a FEM model? If so, would you comment on it in the article?

In lines 191–192, you describe how more nonlinear transformations help the CNN-GRU-SA network fit the original (the FEM model). How much nonlinearity is there? The FEM model solves the partial differential equations describing the deformation of the body; the amount of nonlinearity needed in the CNN-GRU-SA network should therefore depend on the nonlinear deformation characteristics of the modelled material. Is the carbon fiber strongly nonlinear? Should the network architecture change depending on the material?

It seems to be assumed that the deformation is a result of forces applied in the control points (line 205). Is this a limitation of the method? Can the flexible transformation be computed if the forces are applied outside of the points whose positions are measured?


Below, I list some minor issues that I believe will only require slightly modifying the text.

The abstract is too "abstract" in my opinion. Already the first sentence, "In complex measurement systems, the real-time deduction of three-dimensional (3D) flexible transformations usually requires a fast response and real-time feedback, implying that the speed of data analysis and processing must be sufficiently fast," is intimidating and not really informative. Please make the abstract easier to understand and explain more clearly what the study is about. A "flexible transformation" is a very abstract term and it is hard to see how to model it; explaining that you are interested in quickly computing the deformed shape of the surface from movement of a few points on the surface of the body (in addition to how you do it and how successful you are, which is already explained) would really help.

Relatedly, when I think of a "model", I'm most interested in the model inputs and outputs. It seems that the inputs to your model are the displacements of several control points as a result of the modelled deformation, while the output is the shape of the deformed body, presented as a point cloud of around 1000 points – unless I'm wrong. Please specify this clearly; it will help the reader a lot.

The Introduction is missing a clear explanation of what the novel contribution of the article is. What do you do that has not been done before, and what is the advantage of doing it this way? For example, while you do mention that "research on deformation monitoring based on traditional methods typically involves a very complex finite-element metamodeling process that is time-consuming and labor-intensive", you don't clearly state that your method does not suffer from these drawbacks.

The overview of the literature is unfortunately not sufficient. You skip all the literature on finite-element modelling as an established method of modelling flexible transformations, in spite of your work relying on this method! On the other hand, you introduce several specialist terms such as "constant-curvature kinematics" and the acronym DIC that is never defined, even though the reader does not have to be familiar with them in order to benefit from your article. Please limit the scope of the literature review to what is relevant for your research, be complete within this scope (omitting nothing important), and specify what the scope is. Currently, it looks like you are reviewing the complete literature on three-dimensional deformation, but you skip a lot (including finite elements which actually matter for this work).

The "loss", which is mentioned already in the Abstract, is only defined as the L1 loss in line 234 (I appreciate the literature reference!). It would be much better if it were clear which loss it is from the beginning. Also, the formula in reference [14] implies that the loss function has an additional parameter c; please list the value of the parameter you use.
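For context on why the parameter matters: a plain L1 loss has no free parameter, while smoothed variants do. The sketch below is illustrative only; the Charbonnier-type form with parameter c is an assumption, since the exact formula of reference [14] is not reproduced in this review.

```python
import numpy as np

def l1_loss(pred, target):
    """Plain L1 loss (mean absolute error): no free parameter."""
    return np.mean(np.abs(pred - target))

def smooth_l1_loss(pred, target, c=1e-3):
    """Hypothetical Charbonnier-type smoothing of L1 with parameter c.
    The formula sqrt(d^2 + c^2) is an assumption, not necessarily the
    exact form used in the paper's reference [14]."""
    d = pred - target
    return np.mean(np.sqrt(d * d + c * c))

pred = np.array([1.0, 2.0, 3.0])
target = np.array([1.1, 1.9, 3.2])
print(l1_loss(pred, target))         # mean absolute deviation of the toy data
print(smooth_l1_loss(pred, target))  # slightly larger, differentiable at zero
```

The practical point behind the reviewer's request is visible here: two readers implementing the "L1 loss" with different values of c would obtain different loss curves, so the value used should be stated.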

"Deviation" is discussed in lines 249–256, Figure 4 and Table 2. It is quite clear what it is – the difference between the model output and the true value – even though it is never formally defined, which would be even better. In addition, it has a unit, probably millimetre. Please specify it. Also, the "derivation" in line 257 should presumably be "deviation".

Captions in lines 266 and 267 are missing (there are placeholders instead of content).

Figure 7 is very expressive, I like it. It is missing the explanation of what quantity is shown though (presumably deviation / deformation in millimetres). Do I read the panel (d) correctly that the maximum deformation is around 53 millimetres? This is actually a very important number for the interpretation of the results: a deviation of 0.5 mm should be judged relative to the total deformation. Please specify the maximum / average / typical / RMS deformation in the text, I don't think you do it in the current version.
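The relative-error arithmetic asked for above is simple; using the two values quoted in this review (a 0.5 mm deviation and the roughly 53 mm maximum deformation read off panel (d)) as assumed inputs:

```python
# Values read off the review itself, not from the underlying paper.
deviation_mm = 0.5
max_deformation_mm = 53.0

relative_error = deviation_mm / max_deformation_mm
print(f"{relative_error:.1%}")  # about 0.9% of the maximum deformation
```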

What is the "manual secondary scanning time" mentioned in line 261?

I am looking forward to reading the revised version which will answer my biggest questions regarding the presented work.

Comments on the Quality of English Language

The original work presented in the article should be explained in present tense, as is the custom in research articles, not in past tense which is meant to describe work published before. A verb, or a larger part of a sentence, is missing in lines 205–206.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

The authors presented the results of applying a 3D transformation model of a non-rigid / flexible body, built via a combined neural network of a convolutional neural network, a gate recurrent unit, and a self-attention mechanism. Before accepting the paper, I have a few comments to improve its quality:

1. In the title and throughout the text, the word "flexible" suggests "in various ways". The transformation developed by the authors should rather be named a "deformable body" or "non-rigid body" transformation, as opposed to a "rigid body" transformation. What is the origin / reference of the term “flexible transformation”?

2. In the introduction the authors should pay more attention to an increasing amount of effort that has been placed into developing learning-based techniques that improve or enhance existing techniques to develop robust shape descriptors. For instance as an input to artificial neural networks, mesh geometry has been adapted to a variety of domains. A survey of non-rigid 3D registration has been presented in https://arxiv.org/pdf/2203.07858.pdf

3. Table 1: Explain the parameters of the input and output shape. Are they the number of points?

4. Figure 4b and Table 2: The units of bias should be provided.

5. Figure 5: Add the title.

6. Table 3: Add the title.

7. Table 4: Add the units of RMSE.

8. Line 285: There is a 29% decrease, not an increase (this also applies to the test loss and train RMSE).

9. Check the comments in the attached file.

Comments for author File: Comments.pdf

Comments on the Quality of English Language

Check the proper use / meaning of English adjectives.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have addressed all the concerns.

Comments on the Quality of English Language

Minor editing is required.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

I thank the authors for revising the manuscript. The revision has mostly improved it, my minor remarks were successfully addressed. However, my main concerns regarding the article have unfortunately not been considered. Let me reiterate them.

The standard workflow of computing a non-rigid transformation of an object is:
- build a finite-element model of the object;
- compute the shape of the object under an arbitrary external loading.

The workflow proposed in the article, as I understand it, is:
- build a finite-element model of the object;
- compute the shape of the object under 8x10⁴ different choices of external loading (training set);
- train a neural network to infer the shape of the object under external loading (point cloud) from the coordinates of a score of control points before and after the change;
- compute the shape of the object under external loading (test example);
- extract the coordinates of the control points in the test example and provide them to the neural network;
- obtain the shape of the object under the chosen external loading from the neural network.

How is the proposed workflow an improvement over the traditional one?

It seems clear to me that the method is meant to be used on physical objects, not on finite-element-modelled ones. However, this is not demonstrated: the "experiments" described in the article happen fully inside a computer. Worse, it is not obvious how one could apply it to physical objects. Would they still have to be modelled with FEM, with the only measurement input to the model being the measured coordinates of the control points? Would one instead physically experiment on the object and measure the deformed cloud points? In the former case, what is the benefit? In the latter one, how does one know which point of the non-transformed cloud corresponds to each point of the transformed one?

In addition,
- It should be more clearly specified that this is a computational study. After reading about the "experimental results" in the Abstract, I was disappointed when I reached the end of the article and there was no physical measurement performed.
- The revision does not mention the finite-element method by name at all. It should, as FEM is a crucial part of the work performed. The literature on it should thus be reviewed, and the neural network's numerical performance should be benchmarked against FEM as the alternative.

To conclude, I still do not see the selling point of the work in the revised version. I cannot imagine how the contributions of the paper (that are now listed in the introduction) could be used in practice: as presented, the method DOES rely on "traditional 3D non-rigid transformation algorithms", I cannot find "the deduction time of the traditional method" in the article (I believe "the traditional method" is FEM), and proposing the CNN-GRU-SA architecture is of little value if the method as a whole has no advantage over the traditional one.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 3

Reviewer 3 Report

Comments and Suggestions for Authors

Thank you for improving the manuscript once more.

I think I FINALLY get the main point of the article (in my opinion). The proposed machine learning method results in a model that is evaluated two orders of magnitude faster than FEM. Furthermore, it takes the displacements of a handful of control points as inputs, while FEM requires the forces on the body (inverse modelling would be possible, but it may lengthen the evaluation time, possibly by orders of magnitude). FEM is thus unsuitable for quickly computing the non-rigid transformation of the body, particularly with these inputs, and the proposed method is a reasonable solution to the problem.

The issue is that the important contribution is quite well hidden in the manuscript. Your response to my comment: "Compared to traditional FEM methods, our proposed method does not require the magnitude, direction, and position of the force as input, only the three-dimensional coordinates of the control points before and after the change" really helped me understand the point, so I think that this fact should be equally clearly stated in a prominent place in the manuscript as well.  

Currently, you specify that "Our method has the advantages of low computational complexity, low grid dependency, and high real-time performance" (lines 101–103). This is accurate and important, but the similarly important fact that your method uses different inputs than FEM (coordinates instead of force) is not mentioned anywhere. Furthermore, in the same point you inaccurately claim in lines 97–98 that "this method does not rely on unstructured or high-resolution grids" (it does, it uses FEM) and that it "does not involve a large number of computational operations and complex metamodeling processes" (it does – you construct a FEM model, simulate it hundreds of thousands of times, and train your network). Then you give equal weight to the fact that you propose to use CNN-GRU-SA, even though this "detail" is much less important – of course you pick a good network architecture amongst a few ideas; there is nothing special about that. The next "contribution", that you "measure and compare the deduction time", is also much less important; it is more an obvious part of your method than a contribution. Therefore, I suggest you reconsider what is really important (in my opinion, a neural network that is 1. faster than FEM and 2. uses inputs that are easier to measure) and expose it sufficiently prominently, without hiding it behind too much technical detail.

That being said, the technical details are still of interest, and it is great that you provide them in Table 1, Figure 2, and the related text. Looking at Table 1, it took me a little while to understand the meaning, though. It is not immediately obvious that modules 1–3 work in parallel (having one NN layer each), 4 connects the outputs, and 5–7 work in series (does 6 have more than one layer, though?). If you can make it clearer, please do.

In line 403, you provide the code but the text of the URL is messed up. Please fix it.

In line 217, you imply that you need a lot of nonlinearity in the network, but it cannot be directly seen from the provided data. Could you implement the same architecture of the model with linear regression instead of CNN-GRU-SA? It is much easier to do, so if the loss and RMSE are comparable, it should be recommended. Judging by the differences in performance of different neural networks, we may suspect that nonlinearity indeed does matter and that linear regression would not work well, but showing it would be even better.

Do you happen to know of other examples of neural networks / machine learning being used to substitute FEM with a model with more convenient inputs? If you do, cite them, please (unless you already do and I failed to notice it). I believe your study is in any case original and important enough that it is worth being published (when presented clearly) but more background and information on related works may make it even more helpful to the reader.

Thank you again for improving the manuscript.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 4

Reviewer 3 Report

Comments and Suggestions for Authors

I would like to thank the authors for polishing up the manuscript.

While I have no objections to publishing the article as is, I do not fully agree with you regarding (non)linearity. If a transformation is non-rigid, that does not mean it is nonlinear: as long as you stay within the linear regime of the material and the angles are small, the change of coordinates of each point will be proportional to the forces applied, and the changes of coordinates of the grid points will be proportional to the changes of coordinates of the control points. I encourage you to try it. Use linear regression between the inputs and outputs. You can even implement it as a shallow linear neural network if that is most convenient for you. You should see that computing the [1028,3] changed cloud point coordinates linearly from the [20,3] changed control point coordinates (you may not even need the other inputs), through multiplication by a fitted 60x3084 matrix, performs much better than a 6-parameter rigid transformation from the original to the changed cloud point coordinates. If you follow my suggestion and try it out, feel free to add a report on it to the article.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
