Feature Sparse Choosing VIT Model for Efficient Concrete Crack Segmentation in Portable Crack Measuring Devices
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This paper presents a lightweight concrete crack segmentation model based on "Feature Sparse Choosing VIT". Concrete crack measurement is crucial for building structures. The authors use "Feature Sparse Choosing VIT" to reduce the computational complexity of the VIT and decrease the number of channels for crack features. The article is well organized, and the theme meets the requirements of the journal.
Comments:
1. Line 34: the authors mentioned “traditional manual measurements of cracks are inefficient and costly”. Please explain the specific methods of traditional measurement. This is important for demonstrating the advantages of the methods in the article.
2. The cracks in the database cases provided by the authors are very wide and obvious. Such obvious cracks seem to no longer require machine recognition. By contrast, in engineered cementitious composite (ECC) or textile-reinforced ECC, the cracks are very fine and dense (micron level). Have the characteristics of such cracks been taken into account in the database?
3. Conclusion: The main purpose of the conclusion section is to summarize the research results. It can point out the innovative points and contributions of the research, as well as its limitations and shortcomings. The conclusion section can also propose prospects and suggestions for future research. Therefore, repeating the introduction in the conclusion seems inappropriate. I suggest that the authors list the conclusions one by one.
Author Response
Thank you for your advice. I have revised my paper accordingly.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
For concrete crack measurement, the authors propose a lightweight concrete crack segmentation model based on the Feature Sparse Choosing VIT (LTNet). If the authors address the use of the first person and the English language issues in the manuscript, the reviewer will consider the manuscript acceptable.
Comments on the Quality of English Language
Minor editing of English language required
Author Response
Thanks for your advice. I have revised it.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
This article proposes a new segmentation algorithm for concrete cracks. The algorithm is new and its performance is better than that of conventional ones in the same field. Please consider the following comments.
1. Typographical errors
line 156: "Feather" --> "Feature"
line 291: "designe" --> "design"
2. On Figure 2
Which part of Figure 1 does Figure 2 correspond to? The interfaces of FSVIT shown in Figure 2 are "Embedded Patches" and "Multi-Head Attention". The reviewer cannot find the part corresponding to Figure 2 in Figure 1. Please add some description of this.
3. On Figure 2
Please explain the small gray and white squares at the linear projection. The reviewer assumes that F5 is the set extracted from F3, but does not understand the other squares.
4. line 129
Please add some description of "up-sampling." GAP is adopted for pooling. If the up-sampling is standard and simple, it would be good to add some references.
5. line 216
In Equation (4), the function f( ) is not explained. In line 217, (x,y) is defined as training input and training label, but the correspondence between (x,y) and the training label is not clear. Please add some description of this.
6. lines 232 to 234
The reviewer does not understand the processes of shuffle(x) and concatenate(x). In particular, what does "concatenate(x)" of channels mean? Adding an explanation to the manuscript may not be necessary, but please let the reviewer know the roles of shuffle(x) and concatenate(x).
7. line 349
The meaning of the sentence fragment "has specially design an edge feature extractor" is not clear.
8. line 380
The reviewer thinks the description "excessive use of the FSVIT would lead to redundant parameter" needs some concrete explanation. If possible, please describe which sets of parameters are redundant.
9. Undetected or false-positive images
The reviewer would like to see at least one undetected image and one false-positive image, as well as an example image that another method fails to detect but the proposed method detects successfully. If possible, consider adding some description with appropriate images.
Author Response
Thanks for your advice. I have revised all of them.
Author Response File: Author Response.pdf