Article
Peer-Review Record

SA-ConvNeXt: A Hybrid Approach for Flower Image Classification Using Selective Attention Mechanism

Mathematics 2024, 12(14), 2151; https://doi.org/10.3390/math12142151
by Henghui Mo and Linjing Wei *
Reviewer 2: Anonymous
Submission received: 17 June 2024 / Revised: 4 July 2024 / Accepted: 8 July 2024 / Published: 9 July 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The abstract is too long.

Sections 2 and 3 are redundant.

Some figures (Figures 4 to 9) are of low quality. If they were taken from another source, use a high-quality copy and cite the reference.

The same applies to Figures 12 to 14, which are of very low quality. Figure 14 is not informative.

More comparisons in terms of F-measure and run time should be included.

The hyperparameter settings and the number of parameters should be included.

 

Author Response

Revision Statement:

Dear Experts,

First and foremost, I would like to express my deepest gratitude for your invaluable comments and suggestions during the review process. Your expert guidance has been instrumental in highlighting the shortcomings of my work and guiding improvements to meet higher research standards.

In response to your observations regarding "The abstract is too long," "Sections 2 and 3 are redundant," "some figures (Figures 4 to 9) are of low quality," "The same for Figs 12 to 14, with Figure 14 not being informative," "More comparisons in terms of f measure and run time should be included," and "The hyperparameter setting and number of parameters should be included," I have made comprehensive revisions:

  1. Abstract Revision: Following expert advice, I have streamlined the abstract by approximately 100 words, enhancing its conciseness.
  2. Clarity in Sections 2 and 3: I have clarified the content of Sections 2 and 3 and significantly reduced their length by merging Section 3 with subsequent sections detailing improvements.
  3. Image Quality: Per your suggestions, I have regenerated and uploaded high-quality versions of Figures 4 to 9, and Figures 12 to 14, with special attention to enlarging Figure 14 to improve its informativeness.
  4. F1 Score and Run Time: I have added F1 score comparisons to the tables and included run time assessments in Tables 2 and 3 as requested.
  5. Hyperparameters Detailing: I have comprehensively detailed the hyperparameter settings, including strategies like early stopping, to enhance the reproducibility of the research.
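
For readers unfamiliar with the early stopping strategy mentioned in item 5, the following is a minimal sketch of the general idea, not the authors' implementation. The class name `EarlyStopping`, the helper `first_stop_epoch`, and the `patience` and `min_delta` parameters are hypothetical; this record does not state the settings actually used in the paper.

```python
from typing import Optional, Sequence


class EarlyStopping:
    """Stop training once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0) -> None:
        self.patience = patience      # hypothetical setting, not from the paper
        self.min_delta = min_delta    # minimum decrease that counts as improvement
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def first_stop_epoch(losses: Sequence[float], patience: int = 3) -> Optional[int]:
    """Return the 0-based epoch at which early stopping triggers, or None."""
    stopper = EarlyStopping(patience=patience)
    for epoch, loss in enumerate(losses):
        if stopper.step(loss):
            return epoch
    return None
```

In a real training loop, `step` would be called once per epoch with the validation loss, and the best checkpoint would typically be restored when it returns `True`.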

These revisions aim to comprehensively address the issues raised during the review process, ensuring that the manuscript not only meets but exceeds publication standards.

Thank you again for your valuable time and professional insights. I look forward to your further guidance and hope that my revisions meet your expectations. I am confident that with your help, my work can make a significant contribution to the field.

Yours sincerely,

Review Comments:

  1. For "The abstract is too long": As advised, I have revised and condensed the abstract by about 100 words.
  2. For "Sections 2 and 3 are redundant": As advised, I have clarified Sections 2 and 3 and greatly reduced their length by merging Section 3 with the subsequent sections on improvements.
  3. For "some figures (Figures 4 to 9) are of low quality": As advised, I have regenerated these figures in high definition and uploaded them.
  4. For "The same for Figs 12 to 14. Very low quality. Figure 14 is not informative": As advised, I have regenerated these figures in high definition and uploaded them, and I have enlarged Figure 14 for better clarity.
  5. For "More comparisons in terms of f measure and run time should be included": As advised, I have added F1 score comparisons to the tables and included run time assessments in Tables 2 and 3.
  6. For "The hyperparameter setting and number of parameters should be included": As advised, I have detailed the hyperparameters in the paper, including strategies such as early stopping.
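
As background for the F-measure comparison requested by the reviewer, the sketch below shows how a per-class F1 score and a macro-averaged F1 score are commonly computed. It is an illustration under stated assumptions rather than the evaluation code used in the paper: the function names `f1_score` and `macro_f1` are hypothetical, and the record does not say which averaging scheme the authors reported.

```python
from typing import Hashable, Sequence


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 for one class: the harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over every label seen in either sequence."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        scores.append(f1_score(tp, fp, fn))
    return sum(scores) / len(scores)
```

Macro averaging weights every class equally, which matters for a dataset with many flower categories of uneven size; a weighted or micro average would give different numbers.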

Reviewer 2 Report

Comments and Suggestions for Authors

The paper discusses a new approach to flower image classification based on the Fusion of Selective Attention Network (SANet) and ConvNeXt, named SA-ConvNeXt. 

The idea is innovative, but the paper is poorly presented, and there are numerous remarks that the authors should address to improve the paper.

 

- The paper should have four major sections: Introduction, Material and Methods, Results and Discussion, and Conclusion. Authors can create subsections as needed.

 

- The introduction is very poor and needs significant improvement. As this is a comparative study, the introduction should cover all architectures, not just deep learning. The current 35 references are insufficient; the authors should include more references to provide a comprehensive comparison.

 

- All figures in the paper are unclear and need to be replaced with clearer versions. The text inside the figures is illegible, and the overall quality is very poor.

 

- Table 2 (comparison with other models) is too limited. Flower image classification is a vast field with numerous classification architectures. The authors should compare their approach with at least 10 other works.

 

- The abstract is too lengthy and should be reduced.

 

- The keywords should reflect the research area and not appear random. For example, ADAM and loss are not suitable; choose more relevant keywords related to image classification.

 

- The introduction section should conclude by providing an overview of the subsequent sections in the paper.

 

- Since 2012, research teams have continuously... (Line 80) !!! Are you sure about this date? Please provide a reference.

 

- Please provide the full names and abbreviations of each architecture (VGGNet, GoogLeNet, InceptionV2, InceptionV3, ResNet, and Inception-ResNet-V2...).

 

- Many papers discuss attention mechanisms and transformers for image classification. The authors should review and include a discussion on these in the introduction section.

 

- Line 125 (Image Preprocessing section): The authors discuss only image processing using deep learning. Please change the title and also the title of Figure 1, as this process is specific to neural networks and not the fixed approach for image processing. Other scenarios exist.

 

- Why is 224 used in Figure 1? If it is just an example, mention it or remove the number from the figure. Not all images have a size of 224.

 

- There is no need to discuss these steps of Image Preprocessing in detail. If necessary, include them in the introduction without figures.

 

- Section 3 should be titled Materials and Methods.

 

- This section should start by providing a general overview of the approach, including a flowchart explaining all steps to help readers understand the idea.

 

- Sections 3 and 4 should be combined.

 

- Up to line 199, there is no indication of the novel idea behind the paper. The review phase includes a lot of information and some mathematical equations without any clear explanation. For example, normalization equation! why present min-max normalization? Similarly, image processing scenarios! why present this specific scenario?

 

- How was the model tuning performed?

 

- The section "Model Training and Result Analysis" should start with Data.

 

- The authors should be careful with their section numbering (e.g., 5-2 Data Construction, and also 5-3 Data Construction).

 

- How was the data constructed? Did you create a new dataset?

 

- There is a difference between Migration and Transfer! Please pay attention.

 

- A general algorithm explaining all steps of your approach should be added.

 

- The conclusion should be a single paragraph, not in bullet points.

 

Recommendations: The authors should have a major section called Introduction, which includes a comprehensive review of image classification techniques (classical, machine learning, deep learning, transformer, etc.), a thorough comparison, and the best-obtained scores with their limitations. Afterward, they should directly explain their approach in the Materials and Methods section, with clear and easily understandable steps, present the results, and discuss why this approach is original and recommended. Finally, conclude the paper succinctly.

Author Response

Revision Statement:

Dear Experts,

First and foremost, I would like to express my deepest gratitude for the invaluable feedback and suggestions you provided during the review process. Your expert guidance is crucial for enhancing my paper and has significantly helped me identify and address areas needing improvement to meet higher research standards.

In response to your comments:

  • "The paper should have four major sections: Introduction, Material and Methods, Results and Discussion, and Conclusion. Authors can create subsections as needed."
    Following your advice, I have restructured the paper into four main sections accordingly.

  • "The introduction is very poor and needs significant improvement. As this is a comparative study, the introduction should cover all architectures, not just deep learning. The current 35 references are insufficient; the authors should include more references to provide a comprehensive comparison."
    I have expanded the introduction to cover all architectures and increased the number of references from 35 to 40, including a detailed comparison involving Transformers.

  • "All figures in the paper are unclear and need to be replaced with clearer versions. The text inside the figures is illegible, and the overall quality is very poor."
    All figures have been regenerated in high resolution and uploaded for clarity.

  • "Table 2 (comparison with other models) is too limited. Flower image classification is a vast field with numerous classification architectures. The authors should compare their approach with at least 10 other works."
    I have deepened the comparison of models and introduced more comparative metrics as per your suggestion.

  • "The abstract is too lengthy and should be reduced."
    The abstract has been condensed and refined to highlight the main points, with approximately 100 words removed.

  • "The keywords should reflect the research area and not appear random. For example, ADAM and loss are not suitable; choose more relevant keywords related to image classification."
    The keywords have been revised to better reflect the research focus, removing 'ADAM' and 'loss' and adding 'CNN'.

  • "The introduction section should conclude by providing an overview of the subsequent sections in the paper."
    An overview of the subsequent sections has now been included in the introduction.

  • "Since 2012, research teams have continuously... (Line 80) !!! Are you sure about this date? Please provide a reference."
    The date has been verified and corrected to reflect a more authoritative presentation.

  • "Please provide the full names and abbreviations of each architecture (VGGNet, GoogLeNet, InceptionV2, InceptionV3, ResNet, and Inception-ResNet-V2...)."
    Full names and abbreviations for each mentioned architecture have been provided.

  • "Many papers discuss attention mechanisms and transformers for image classification. The authors should review and include a discussion on these in the introduction section."
    Discussions on attention mechanisms and transformers have been added to the introduction.

  • "Line 125 (Image Preprocessing section): The authors discuss only image processing using deep learning. Please change the title and also the title of Figure 1, as this process is specific to neural networks and not the fixed approach for image processing. Other scenarios exist."
    The title has been changed to "2. Image Preprocessing in SA-ConvNeXt" to specify its relevance to neural networks.

  • "Why is 224 used in Figure 1? If it is just an example, mention it or remove the number from the figure. Not all images have a size of 224."
    The specific mention of '224' has been replaced with 'new_h' and 'new_w' to avoid specificity.

  • "There is no need to discuss these steps of Image Preprocessing in detail. If necessary, include them in the introduction without figures."
    Details on image preprocessing have been simplified and included in the introduction without figures.

  • "Section 3 should be titled Materials and Methods."
    The title of Section 3 has been changed accordingly.

  • "This section should start by providing a general overview of the approach, including a flowchart explaining all steps to help readers understand the idea."
    A detailed flowchart and a general overview of the approach have been added to help clarify the methodology.

  • "Sections 3 and 4 should be combined."
    Sections 3 and 4 have been combined as per your suggestion.

  • "Up to line 199, there is no indication of the novel idea behind the paper. The review phase includes a lot of information and some mathematical equations without any clear explanation. For example, normalization equation! why present min-max normalization? Similarly, image processing scenarios! why present this specific scenario?"
    A detailed explanation and rationale for using min-max normalization and the specific image processing scenarios have been added.

  • "How was the model tuning performed?"
    A detailed description of the model tuning process and hyperparameter settings has been included.

  • "The section 'Model Training and Result Analysis' should start with Data."
    The structure of this section has been revised to start with data.

  • "The authors should be careful with their section numbering (e.g., 5-2 Data Construction, and also 5-3 Data Construction)."
    Section numbering has been carefully reviewed and corrected.

  • "How was the data constructed? Did you create a new dataset?"
    Details on the data construction have been clarified, noting that the dataset is derived from the Newton University's Flower102 dataset.

  • "There is a difference between Migration and Transfer! Please pay attention."
    Terminology regarding 'migration' and 'transfer' has been corrected to reflect accurate usage.

  • "A general algorithm explaining all steps of your approach should be added."
    A general algorithm has been included to explain all steps in a more accessible manner.

  • "The conclusion should be a single paragraph, not in bullet points."
    The conclusion has been rewritten into a single, cohesive paragraph.
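
For reference, the min-max normalization the reviewer asks about rescales values linearly so that the smallest maps to 0 and the largest maps to 1, that is, x' = (x - min) / (max - min). The following is a minimal sketch of that formula applied to a flat list of pixel intensities; the function name `min_max_normalize` is hypothetical, and the handling of a constant input is an assumption, since the record does not describe how the paper treats that case.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Rescale values linearly into [0, 1] using the observed min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant input maps to all zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Example: 8-bit pixel intensities mapped into the unit interval.
pixels = [0, 64, 128, 255]
normalized = min_max_normalize(pixels)
```

Bringing all inputs into a common range like this is a standard preprocessing step before feeding images to a neural network.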

These revisions aim to address all the concerns raised during the review process. I look forward to your further guidance and approval of these revisions, and I am confident that they substantially enhance the quality of the work. Thank you once again for your time and insights.

Yours sincerely,

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

All issues are answered. The article is acceptable.

Author Response

Thank you!!!!

Reviewer 2 Report

Comments and Suggestions for Authors

Dear authors,

I can't make decisions based on the new version only. Please prepare a highlighted version that contains the highlights of each change. Also, answer my questions or provide me with the exact line number of the paragraph where each change is located.

Author Response

Dear Experts,

Firstly, I would like to express my deepest appreciation for the valuable feedback and suggestions provided during the review process. Your expert guidance is invaluable in enhancing the quality of my paper, helping me identify shortcomings in my research and guiding improvements to meet higher academic standards.

  1. In response to "The paper should have four major sections: Introduction, Material and Methods, Results and Discussion, and Conclusion. Authors can create subsections as needed." Following your advice, I have restructured the paper into these four main sections.

  2. Regarding "The introduction is very poor and needs significant improvement. As this is a comparative study, the introduction should cover all architectures, not just deep learning. The current 35 references are insufficient; the authors should include more references to provide a comprehensive comparison." Heeding your suggestions, I have increased the number of references to 40 and expanded the introduction to include a detailed comparison of transformers, enriching the content. Line 91.

  3. Concerning "All figures in the paper are unclear and need to be replaced with clearer versions. The text inside the figures is illegible, and the overall quality is very poor." Per your recommendations, I have regenerated and uploaded all figures in high resolution.

  4. For "Table 2 (comparison with other models) is too limited. Flower image classification is a vast field with numerous classification architectures. The authors should compare their approach with at least 10 other works." Following your advice, I have deepened the comparison of models and introduced more comparative metrics.

  5. Regarding "The abstract is too lengthy and should be reduced." Following your guidance, I have condensed the abstract, removing approximately 100 words to highlight the main points more succinctly.

  6. About "The keywords should reflect the research area and not appear random. For example, ADAM and loss are not suitable; choose more relevant keywords related to image classification." In response to your feedback, I have revised the keywords, removing 'ADAM' and 'loss', and adding 'CNN' to better reflect the research focus.

  7. Regarding "The introduction section should conclude by providing an overview of the subsequent sections in the paper." Following your suggestion, I have incorporated an overview of the subsequent sections into the introduction.

  8. In response to "Since 2012, research teams have continuously... (Line 80) !!! Are you sure about this date? Please provide a reference." Upon review, I have updated the text to reflect advancements in artificial intelligence due to enhancements in neural network architectures more accurately, removing the specific mention of 2012.

  9. Concerning "Since 2012, research teams have continuously... (Line 80) !!! Are you sure about this date? Please provide a reference." Following your feedback, I have verified and adjusted the content, removing the specific mention of 2012. Line 76.

  10. Regarding "Please provide the full names and abbreviations of each architecture (VGGNet, GoogLeNet, InceptionV2, InceptionV3, ResNet, and Inception-ResNet-V2...)." Following your suggestions, I have provided full names and abbreviations for each architecture: "VGGNet (Visual Geometry Group Network), GoogLeNet, InceptionV2 (Inception Version 2), InceptionV3 (Inception Version 3), ResNet (Residual Network), and Inception-ResNet-V2 [12-17]." Line 80.

  11. In response to "Many papers discuss attention mechanisms and transformers for image classification. The authors should review and include a discussion on these in the introduction section." Following your advice, I have added discussions on transformers relevant to image classification. Line 93.

  12. Regarding "Line 125 (Image Preprocessing section): The authors discuss only image processing using deep learning. Please change the title and also the title of Figure 1, as this process is specific to neural networks and not the fixed approach for image processing. Other scenarios exist." In response to your feedback, I have renamed the section to "2. Image Preprocessing in SA-ConvNeXt" to reflect its specific applicability to neural networks. Line 132.

  13. About "Why is 224 used in Figure 1? If it is just an example, mention it or remove the number from the figure. Not all images have a size of 224." Following your feedback, I have modified the mention of 224 in the figures to clarify it as an example and replaced it with 'new_h' and 'new_w' as appropriate. Line 140.

  14. For "There is no need to discuss these steps of Image Preprocessing in detail. If necessary, include them in the introduction without figures." Heeding your advice, I have streamlined and omitted unnecessary details from the Image Preprocessing section. Line 133.

  15. Regarding "Section 3 should be titled Materials and Methods." Following your suggestion, I have renamed the third section of the paper. Line 177.

  16. In response to "This section should start by providing a general overview of the approach, including a flowchart explaining all steps to help readers understand the idea." Following your advice, I have included a detailed flowchart and restructured the section to provide a clearer overview of the approach. Line 429.

  17. Concerning "Sections 3 and 4 should be combined." Per your advice, I have merged these sections to enhance the flow and coherence of the paper.

  18. About "Up to line 199, there is no indication of the novel idea behind the paper. The review phase includes a lot of information and some mathematical equations without any clear explanation. For example, normalization equation! why present min-max normalization? Similarly, image processing scenarios! why present this specific scenario?" Following your feedback, I have clarified the mathematical formulations and provided detailed explanations for the choices made, such as min-max normalization. Line 165.

  19. Regarding "How was the model tuning performed?" In response to your query, I have elaborated on the hyperparameter tuning process in greater detail. Line 429.

  20. For "The section 'Model Training and Result Analysis' should start with Data." Following your guidance, I have restructured this section to start with data discussion.

  21. About "The authors should be careful with their section numbering (e.g., 5-2 Data Construction, and also 5-3 Data Construction)." Following your suggestion, I have carefully revised and corrected the section numbering. Line 419.

  22. In response to "How was the data constructed? Did you create a new dataset?" Following your inquiry, I have provided further details about the dataset, clarifying that it originates from the Newton University Flower Dataset 102.

  23. Regarding "There is a difference between Migration and Transfer! Please pay attention." Following your correction, I have revised the terminology to accurately reflect 'model transfer' instead of migration.

  24. About "A general algorithm explaining all steps of your approach should be added." In response to your suggestion, I have included a more accessible description of the methodology throughout the paper.

  25. Regarding "The conclusion should be a single paragraph, not in bullet points." Following your advice, I have reformulated the conclusion into a cohesive paragraph and added future prospects. Line 655.

Round 3

Reviewer 2 Report

Comments and Suggestions for Authors

I thank the authors for considering my remarks, and I recommend publishing the paper.
