Article
Peer-Review Record

Multi-Scale and Multi-Factor ViT Attention Model for Classification and Detection of Pest and Disease in Agriculture

Appl. Sci. 2024, 14(13), 5797; https://doi.org/10.3390/app14135797 (registering DOI)
by Mingyao Xie and Ning Ye *
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 9 May 2024 / Revised: 1 July 2024 / Accepted: 2 July 2024 / Published: 2 July 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The manuscript is a novel and significant contribution to the field and will be of high interest to readers, given the recent popularity of transformer models. The methodology is thorough, the data augmentation techniques are robust, and the results are clearly communicated. To further strengthen the paper, consider the following minor additions:

1. Provide more detailed descriptions of the data augmentation parameters and the hyperparameter tuning process used during training.

2. Include a more detailed discussion on practical implications and potential real-world applications of the findings to increase the impact and relevance.

These improvements will strengthen an already excellent paper.

Author Response

Comments 1: Provide more detailed descriptions of the data augmentation parameters and the hyperparameter tuning process used during training.

Response 1: Thanks for the advice.

First, the paper now gives supplementary explanations of the data augmentation parameters in two places. Section 2.2 elaborates on the parameters used for the three augmentation factors. For the environmental factor, the parameters include the size of the random crop region, set at 20%; the image inpainting method is DeepFill, whose parameters can be found in the literature [27-28]. For the growth factor, the parameters are the mixup fusion weights, for which explanations have been added. For the filming factor, the parameters are those of the random transformations, and their ranges are now provided. Section 4.2 discusses the dataset parameters when the data augmentation model participates in training: the dataset is expanded ninefold, using the augmentation methods corresponding to the three factors in equal proportions of 1:1:1.

Second, regarding the selection of hyperparameters during training, Section 3.2 now provides additional detail, including the optimizer, learning rate, batch size, number of epochs, and knowledge distillation weights.
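As an illustration only, and not code from the manuscript, the augmentation settings named in Response 1 could be sketched as follows. The function names are hypothetical. The sketch assumes that "20%" refers to each side of the crop region, that mixup is a convex blend with a single weight, and that the ninefold expansion means nine augmented copies per image, three per factor; the response itself does not confirm any of these readings.

```python
import random


def mixup(image_a, image_b, lam):
    """Blend two equally sized images (nested lists of floats) with weight lam."""
    return [
        [lam * pa + (1.0 - lam) * pb for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


def random_region(height, width, fraction=0.2, rng=random):
    """Pick a random rectangle whose sides are `fraction` of the image sides.

    Assumption: the 20% in the response is read here as a side-length ratio.
    """
    region_h = max(1, round(height * fraction))
    region_w = max(1, round(width * fraction))
    top = rng.randint(0, height - region_h)
    left = rng.randint(0, width - region_w)
    return top, left, region_h, region_w


def augmentation_plan(n_images, copies_per_image=9,
                      factors=("environmental", "growth", "filming")):
    """List (image_index, factor) pairs with the factors in equal proportions."""
    if copies_per_image % len(factors) != 0:
        raise ValueError("copies_per_image must be divisible by the number of factors")
    per_factor = copies_per_image // len(factors)
    return [
        (index, factor)
        for index in range(n_images)
        for factor in factors
        for _ in range(per_factor)
    ]
```

Under these assumptions, `augmentation_plan(4)` yields 36 entries, 12 per factor, which matches the 1:1:1 proportion described above.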

 

Comments 2: Include a more detailed discussion on practical implications and potential real-world applications of the findings to increase the impact and relevance.

Response 2: Thanks for the advice. We have added a detailed discussion of practical implications and potential real-world applications to the paper. In Section 4, we elaborate on the application impact and value of the three sub-models of SFA-ViT in the classification and detection of agricultural pests and diseases, with each subsection concluding with its own discussion. In the conclusion section, we further elaborate on the overall value of SFA-ViT for research in agricultural pest and disease detection and its applications in related fields.

Reviewer 2 Report

Comments and Suggestions for Authors

Review of manuscript: applsci-3027869  

Multi-scale and Multi-factor ViT Attention Model for Classification and Detection of Pest and Disease in Agriculture

I am convinced that the topic undertaken by the authors falls within the publication area of the Applied Sciences journal.

The paper has a clear layout, and the title corresponds to the content.

I have no objections to the research method used and the results presentation.

The References list contains the latest data from the last few years.

 

However, I have some comments and objections:

1) I suggest putting the explanation of the abbreviation ViT already in the Abstract before the abbreviation SFA-ViT

2) The abbreviation SFA-ViT has been explained twice (L13, L78)

3) Figures: 1-7, please provide source

4) Please improve the visibility of the objects placed in Figure 3 (scale: 1, 2, 3)

5) L382-387: In my opinion, this text is more appropriately placed in the Abstract or Methods chapter, rather than in Conclusions

6) In the Conclusions chapter, I would have expected more information on the added value of the Authors' research for practice.

 

 

Author Response

Comments 1: I suggest putting the explanation of the abbreviation ViT already in the Abstract before the abbreviation SFA-ViT.

Response 1: Thanks for the advice. We have made modifications in L12-14.

 

Comments 2: The abbreviation SFA-ViT has been explained twice (L13, L78).

Response 2: Thanks for the advice. We have made modifications in L77 (corresponding to the original L78).

 

Comments 3: Figures: 1-7, please provide source.

Response 3: Thanks for the advice. We have provided the sources of Figures 1-7 in the submission system.

 

Comments 4: Please improve the visibility of the objects placed in Figure 3 (scale: 1, 2, 3).

Response 4: Thanks for the advice. We have made modifications to Figure 3.

 

Comments 5: L382-387: In my opinion, this text is more appropriately placed in the Abstract or Methods chapter, rather than in Conclusions.

Response 5: Thanks for the advice. We fully agree with your viewpoint and have made modifications to the conclusion section.

 

Comments 6: In the Conclusions chapter, I would have expected more information on the added value of the Authors' research for practice.

Response 6: Thanks for the advice. Your viewpoint is very helpful to us. We further demonstrate the added value of this model in the conclusion section.
