Article
Peer-Review Record

An Automatic Detection and Statistical Method for Underwater Fish Based on Foreground Region Convolution Network (FR-CNN)

J. Mar. Sci. Eng. 2024, 12(8), 1343; https://doi.org/10.3390/jmse12081343
by Shenghong Li 1,2, Peiliang Li 1,2,*, Shuangyan He 1,2, Zhiyan Kuai 1,2, Yanzhen Gu 1, Haoyang Liu 1, Tao Liu 1 and Yuan Lin 1
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 29 June 2024 / Revised: 3 August 2024 / Accepted: 5 August 2024 / Published: 7 August 2024
(This article belongs to the Section Ocean Engineering)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Please see attached file.

Comments for author File: Comments.pdf

Comments on the Quality of English Language

Reconsider after major revision. Thanks.

Author Response

Comments 1: First, the manuscript must follow the IMRAD structure (Introduction, Methodology, Results and Discussion, Conclusions). Precisely, the current Section 2 (Related Work) should be included in the Introduction. The current Sections 3 and 4 (Experiments) should be included in the Methodology. The authors should have a clear section presenting their Results and Discussion.

Response 1: We appreciate your constructive feedback and guidance on structuring our manuscript according to the IMRAD format.

In response to your suggestions, we have made the following revisions:

Introduction: We have incorporated the previous Section 2 (Related Work) into the Introduction. This revised Introduction now offers a comprehensive background and sets the context for our study more effectively. The specific changes are detailed on lines 118-173.

Methodology: We have merged the former Sections 3 and 4 (Experiments) into a unified Methodology section. This section, which now begins on line 230, provides an extensive overview of the FR-CNN framework and the experimental procedures.

Results and Discussion: We have introduced distinct sections for Results and Discussion. The Results now begin on line 507, followed by the Discussion section starting on line 756. These additions provide a clearer and more structured presentation of our findings and their implications.

We sincerely thank you for your insightful comments, which have significantly enhanced the clarity and organization of our manuscript.

Comments 2: How do you validate your approach? Please explicitly and thoroughly detail the criteria used for the validation of your method, from the images to the quantification step.

Response 2: Thank you for your suggestion. In our study, we validated our method using the following approach, as detailed in Section 3, Results and Discussion, line 507:

Data preparation: We extracted 8,000 representative frames from the underwater video of Dongji Island recorded in November 2020. These frames were meticulously annotated and then divided into 5,600 images for training and 2,400 images for validation.

Training process: We utilized ResNet-50 as the backbone for the feature extraction network, integrated with the FPN structure. Transfer learning was applied by initializing with Faster R-CNN weights pre-trained on the COCO dataset. To enhance model robustness, we incorporated various data augmentation techniques.

Verification method: Model performance was assessed on the validation set using multi-class mean average precision at an IoU threshold of 0.5 (mAP@0.5) as the primary metric. Additionally, we tested the model on images collected at different time points to evaluate its performance under varying environmental conditions.

Comparison of results: We compared FR-CNN with YOLOv5 and Faster R-CNN+FPN. The results (shown in Table 3 on line 574 and Table 4 on line 576) demonstrated that FR-CNN achieved the highest accuracy and maintained strong temporal stability. Illustrative sketches of the transfer-learning setup and the IoU matching behind mAP@0.5 are given below.
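To make the training setup described above more concrete, the following minimal sketch shows one way to initialize a Faster R-CNN detector with a ResNet-50 + FPN backbone from COCO-pretrained weights and replace its classification head for a fish dataset. It is an illustrative stand-in written against torchvision (assuming version 0.13 or later), not the authors' FR-CNN code, and the class count is a placeholder.

    # Minimal transfer-learning sketch (illustrative; not the authors' FR-CNN code).
    # Assumes torchvision >= 0.13; num_classes is a placeholder (fish species + background).
    import torchvision
    from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

    num_classes = 4  # hypothetical: 3 fish species + background

    # Faster R-CNN with a ResNet-50 + FPN backbone, pre-trained on COCO.
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

    # Replace the COCO classification head with one sized for the new dataset.
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

    # Fine-tuning (optimizer, augmented data loaders, training loop) proceeds as usual.

Likewise, the mAP@0.5 criterion rests on matching each predicted box to a ground-truth box at an IoU threshold of 0.5. The snippet below is a simplified sketch of that matching step, not the evaluation code used in the paper; per-class AP is then the area under the precision-recall curve accumulated from such matches over all validation images.

    # Simplified sketch of IoU-based matching at the 0.5 threshold behind mAP@0.5.
    # Boxes are (x1, y1, x2, y2); illustrative only.
    def iou(box_a, box_b):
        ax1, ay1, ax2, ay2 = box_a
        bx1, by1, bx2, by2 = box_b
        ix1, iy1 = max(ax1, bx1), max(ay1, by1)
        ix2, iy2 = min(ax2, bx2), min(ay2, by2)
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
        return inter / union if union > 0 else 0.0

    def match_detections(pred_boxes, gt_boxes, thr=0.5):
        """Greedy matching; pred_boxes must be sorted by descending confidence."""
        matched, tp, fp = set(), 0, 0
        for p in pred_boxes:
            best_iou, best_j = 0.0, -1
            for j, g in enumerate(gt_boxes):
                if j not in matched and iou(p, g) > best_iou:
                    best_iou, best_j = iou(p, g), j
            if best_iou >= thr:
                matched.add(best_j)
                tp += 1
            else:
                fp += 1
        fn = len(gt_boxes) - len(matched)
        return tp, fp, fn  # accumulated per class and per image to build the PR curve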

We hope this explanation clarifies our validation process. Should you have any further questions or need additional information, please do not hesitate to contact us.

Comments 3: What was the main purpose of the manuscript: just presenting the approach of the Foreground Region Convolution Network, or dealing with underwater organisms? Did the authors intend this approach (Foreground Region Convolution Network) to be their main research methodology for dealing with underwater organisms? If yes, why not other approaches? Please be precise and compare with other approaches.

Response 3: Thank you for your suggestion. The primary aim of the manuscript is to address both the methodological development of the Foreground Region Convolution Network (FR-CNN) and its application to underwater organism detection. Specifically, the paper:

1. Presents the FR-CNN Approach:

We introduce FR-CNN, a two-stage object detection network. This approach integrates both unsupervised and supervised learning techniques, thereby enhancing the accuracy of the detection process. The FR-CNN method effectively addresses issues such as inaccuracies in dataset annotations and enhances precision in fine screening.

2. Applies the Approach to the Detection of Underwater Organisms:

The FR-CNN methodology has been adapted for use in underwater environments through the incorporation of a multi-scale regression Gaussian background model. This adaptation has been developed with the specific intention of addressing the distinctive challenges presented by underwater imagery, including dynamic fluctuations in illumination and variations in color, with the objective of enhancing the detection and analysis of underwater organisms.

In summary, the manuscript aims to advance the FR-CNN methodology and demonstrate its effectiveness in the context of underwater organism detection. This dual focus represents both a significant methodological contribution and an application-oriented study.

We appreciate the opportunity to clarify our choice of FR-CNN as the primary method for studying and processing underwater organisms. Our decision was based on several key considerations:

1. Highly Targeted Foreground Detection:

The FR-CNN has been developed with the specific purpose of extracting and identifying foreground areas within images. This capability is of paramount importance for the detection of underwater creatures in complex and dynamic environments. Although traditional background modeling techniques, such as Gaussian background modeling, are effective at handling background variations, they often prove inadequate for accurately addressing fine-grained foreground detection. The FR-CNN employs deep learning to accurately identify and categorize foreground objects, thereby enhancing the precision of detection.

2. Robustness and Adaptability:

In comparison to traditional methodologies, such as Gaussian background modeling, the FR-CNN exhibits enhanced robustness. The Gaussian background modeling technique is based on the long-term statistical properties of the background, which may not respond rapidly to sudden changes, such as the appearance of new underwater organisms or abrupt alterations in illumination. The deep learning framework of the FR-CNN enables effective distinction and segmentation of foreground organisms, even in complex and fluctuating backgrounds, thereby demonstrating enhanced adaptability.

3. Capacity for Real-Time Processing:

Although Gaussian background modeling is highly effective in real-time video processing, it has inherent limitations in terms of foreground accuracy and precision. The FR-CNN integrates advanced feature extraction with real-time processing capabilities, thereby facilitating more accurate detection and recognition of underwater organisms. This is particularly advantageous when dealing with significant variations in the appearance and behavior of target organisms.

4. A comparison with alternative methodologies is provided below:

(1) Background Modeling Methods:

Techniques such as Gaussian background modeling are well-suited to the detection of long-term background changes; however, they are less effective for the detection of dynamic and detailed foregrounds in complex underwater environments.

(2) Traditional Image Segmentation Methods:

Threshold-based segmentation methods frequently necessitate precise parameter tuning, which can prove challenging in the context of intricate backgrounds and foregrounds typical of underwater imagery.

(3) Other Deep Learning Methods:

Although target detection networks (e.g., YOLO, SSD) also demonstrate efficacy in foreground detection, the FR-CNN enhances detection accuracy and classification for underwater organisms by focusing specifically on foreground regions using its convolutional network architecture.

In conclusion, the FR-CNN has been selected for its precise detection capabilities, adaptability to complex environments, and effective real-time processing. It is our contention that this method markedly enhances the efficacy of underwater biological detection applications.

We would like to express our gratitude once again for your valuable feedback and consideration.

Comments 4: In the reviewer's opinion, it would be better to write the manuscript in the direction of showing the difficulties of dealing with underwater organisms, with the Foreground Region Convolution Network approach then presented as one of the solutions. In this case, please modify your title accordingly to reflect the main objectives of your research.

Response 4: Thank you for your valuable feedback regarding the direction of the manuscript. We appreciate your suggestion to focus on the challenges of dealing with underwater organisms and to present our Foreground Region Convolution Network (FR-CNN) as one of the solutions to these challenges.

We have revised the manuscript to better highlight the difficulties associated with underwater organism detection and have emphasized how our proposed FR-CNN addresses these challenges. Additionally, we have modified the title to reflect the main objectives of our research: "FR-CNN: An Automatic Detection and Statistics Method for Underwater Fish Based on Foreground Region Convolution Network."

We believe these changes will make the manuscript more focused and align it better with the main objectives of our research. Thank you for your insightful suggestions, which have helped improve the clarity and direction of our work.

Comments 5: The explanation of the principles for transferring the signals from images to the identification of species is not clear. Could you please provide a detailed description of the principles by which you transferred the signals from images to the identification of fish species and species counts?

Response 5: Thank you for your suggestion. We appreciate your feedback on the clarity of the principles for transferring signals from images to the identification of fish species and species counts. We have revised the manuscript to provide a more detailed description of these principles. The updated explanation can be found in lines 194-212.

Comments 6: Could you provide more details on how the environment is affected by light and depth?

Response 6: Thank you for your suggestion. In the revised version of the manuscript, specifically in lines 237-250, we have added detailed information about how light and depth affect environmental monitoring.

The observation equipment (shown in Figure 2 on line 252) includes acoustic Doppler current meters, multi-parameter water quality meters, underwater high-definition cameras, supplementary lights, water pumps, and other devices, which provide long-term, continuous, real-time monitoring of environmental parameters and the status of underwater biological resources.

Specifically, our integrated OTWHC-500 underwater camera is equipped with 30x optical zoom and a field of view of approximately 84°. It is pressure-rated for operation at depths of 1000 to 6000 meters and provides real-time 1080p video output. Supplementary lights are essential to maintaining the clarity and contrast of underwater videos. These lights help alleviate problems such as color distortion caused by the absorption and scattering of different wavelengths of light at various depths. They also address the challenge of discontinuous video brightness and clarity caused by inconsistent lighting conditions over the day-night cycle.

In addition, the water pump is used to regularly clean contaminants or attachments on the camera lens, ensuring that the visual data remains high-quality and the monitoring system operates effectively for extended periods.

We believe these enhancements provide a clearer understanding of how our devices are designed to address the challenges posed by underwater environments, particularly with regard to light and depth variations.

Comments 7: How can you treat the problem of darkness in the pictures?

Response 7: Thank you for your suggestion. In addressing the problem of darkness in the pictures, we have implemented several measures to ensure optimal image quality under varying light conditions.

First, our observation system includes supplementary lights specifically designed to enhance underwater visibility. These lights help mitigate the issue of darkness by providing consistent illumination, which is crucial in deep or murky waters where natural light is minimal or absent. The supplementary lights are strategically positioned to reduce shadows and improve overall image brightness and clarity.

Second, our OTWHC-500 underwater camera is equipped with advanced low-light imaging capabilities. The camera's high sensitivity to light allows it to capture clearer images in low-light conditions. Additionally, the 30x optical zoom and wide field of view help in adjusting the focus and exposure settings to optimize image capture based on the available light.

Third, we regularly use the water pump to clean the camera lens, ensuring that no contaminants or biofouling obstruct the light or degrade the image quality. This maintenance helps maintain the effectiveness of both the camera and the supplementary lighting.

These measures combined ensure that our system can effectively handle the problem of darkness in underwater pictures, providing high-quality visual data for accurate monitoring and analysis.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The authors are advised to revise Figure 1 such that the panel [a-f] is included below the image for clarity purposes.

 

The authors are advised to update this “ YOLOv1–YOLOv5 series[27-31]” since the latest one is YOLOv10: Real-Time End-to-End Object Detection.

 

The authors have referenced Figure 4 (“The classification and position regression network, indicated by orange dashed lines, performs binary classification (background or foreground)”) without indicating Figure 4. Furthermore, the authors are advised to be consistent with figure references; for example, Figure 3 is referenced as “Figure 3” in “Thamnaconus modestus, Microcanthus strigatus, Oplegnathus fasciatus as shown in Figure 3”, whereas Figure 4 is referenced as “Fig. 4” in “outlined by blue dashed lines in Fig. 4 illustrates”.


The authors are advised to increase the resolution of Figure 4, as it might be hard for users to read without zooming in.

 

References are required for statements and methods; many statements currently lack references. For example, the “Adaptive Multi-scale Gaussian Background (AMGB) model” has no reference, and the following passage needs citations: “In real underwater environments, especially with continuously sampled video data, the presence of foreign objects and changes in water turbidity can significantly alter the illumination, color, and other characteristics of images. This leads to a decline in the model's inferential capability over time.” Please revise the paper to incorporate such citations.

 

In Section 3.5, the authors mention, “This algorithm augments the foreground object bounding boxes obtained in Section 3.1 at different scales.” However, Section 3.1 is about observation equipment and datasets. Is this reference correct?

 

Since a few computations are performed on a CPU, the authors are advised to perform inference on the CPU for accurate inference speeds. It is not fair to compare YOLOv5 on a GPU with FR-CNN, which has GPU + CPU computations. Running both models on the CPU would provide correct inference speeds.

 

The authors are advised to format the tables correctly. In Table 1, the first row is left-aligned, whereas the others are center-aligned. The same goes for other tables.

 

Since the main contribution of the paper is - FR-CNN as an anchor-free two-stage object detection network and a multi-scale regression Gaussian background model as an unsupervised method for dynamic calibration of existing datasets, the authors are advised to provide ablation studies to support this. For example, see the results for various ResNets and how each version performs. Is there a specific reason ResNet-50 was utilized?

 

The authors are advised to provide details about the models, such as the convolutional filter and the classifier used with inputs from the BM module. Furthermore, researchers require complete training details to reproduce the work.

 

The authors are advised to compare with object detectors specifically developed for underwater object detection, such as YOLO-Fish and the YOLORG algorithm. In addition, the authors are advised to use various datasets, such as DeepFish and Fishnet Open Images, to check the proposed model's generalization capability.


Author Response

Comments 1: The authors are advised to revise Figure 1 such that the panel [a-f] is included below the image for clarity purposes.

Response 1: Thank you for your suggestion. We have revised Figure 1 accordingly, moving the panel [a-f] below the image for better clarity. This change is reflected on line 46 of the revised manuscript.

Comments 2: The authors are advised to update this “ YOLOv1–YOLOv5 series[27-31]” since the latest one is YOLOv10: Real-Time End-to-End Object Detection.

Response 2: Thank you for your suggestion. We have updated the manuscript to include the latest YOLOv10: Real-Time End-to-End Object Detection, replacing the previous series reference. This change is reflected on line 116 of the revised manuscript.

Comments 3: The authors have referenced Figure 4 (“The classification and position regression network, indicated by orange dashed lines, performs binary classification (background or foreground)”) without indicating Figure 4. Furthermore, the authors are advised to be consistent with figure references; for example, Figure 3 is referenced as “Figure 3” in “Thamnaconus modestus, Microcanthus strigatus, Oplegnathus fasciatus as shown in Figure 3”, whereas Figure 4 is referenced as “Fig. 4” in “outlined by blue dashed lines in Fig. 4 illustrates”.
The authors are advised to increase the resolution of Figure 4, as it might be hard for users to read without zooming in.

Response 3: Thank you for your suggestion. We have addressed the following points:

We have corrected the reference to Figure 4 to ensure it is clearly indicated.

We have increased the resolution of Figure 4 to improve readability without the need for zooming.

These changes are reflected on line 303 of the revised manuscript. We appreciate your suggestions to enhance the clarity and consistency of our work.

Comments 4: References are required for statements and methods; many statements currently lack references. For example, the “Adaptive Multi-scale Gaussian Background (AMGB) model” has no reference, and the following passage needs citations: “In real underwater environments, especially with continuously sampled video data, the presence of foreign objects and changes in water turbidity can significantly alter the illumination, color, and other characteristics of images. This leads to a decline in the model's inferential capability over time.” Please revise the paper to incorporate such citations.

Response 4: Thank you for your suggestion. We have addressed this concern by incorporating relevant references to support the statements regarding the Adaptive Multi-scale Gaussian Background (AMGB) model and the impact of real underwater environments on model performance. Specifically, we have updated the manuscript to include citations that substantiate the effects of foreign objects and changes in water turbidity on illumination, color, and inferential capability. These changes are reflected on lines 176-179 and 405 of the revised manuscript. We appreciate your valuable input in improving the quality of our work.

Comments 5: In Section 3.5, the authors mention, “This algorithm augments the foreground object bounding boxes obtained in Section 3.1 at different scales.” However, Section 3.1 is about observation equipment and datasets. Is this reference correct?

Response 5: Thank you for your suggestion. You are correct that Section 3.1 discusses observation equipment and datasets, and not the algorithm for augmenting foreground object bounding boxes. We have revised the reference in the manuscript to ensure it accurately directs readers to the appropriate section. The corrected reference can now be found on lines 460-461.

Comments 6: Since a few computations are performed on a CPU, the authors are advised to perform inference on the CPU for accurate inference speeds. It is not fair to compare YOLOv5 on a GPU with FR-CNN, which has GPU + CPU computations. Running both models on the CPU would provide correct inference speeds.

Response 6: Thank you for your suggestion. The question of the fairness of comparing YOLOv5 on a GPU with FR-CNN, which employs both GPU and CPU computations, is a pertinent one.

To provide clarity, it should be noted that Gaussian Background Modelling, which forms part of FR-CNN, is fundamentally a background subtraction technique based on statistical analysis. This method entails the construction of a background model through the implementation of a long-term statistical analysis of pixel colour values. The necessity for processing extensive historical pixel data, including the adjustment of Gaussian distribution parameters and the evaluation of probability density functions, presents significant challenges for the execution of Gaussian Background Modelling on GPUs. These challenges are due to the fact that the algorithm relies on high-frequency access to global memory, which can lead to bandwidth limitations and reduced computational efficiency. Furthermore, the inter-pixel dependencies and the multi-stage nature of the algorithm present additional challenges to the efficient parallelisation on GPUs.

Consequently, FR-CNN employs a hybrid CPU + GPU methodology for the purpose of inference. It is acknowledged that running both models on the CPU would facilitate a more accurate comparison of inference speeds. However, given the inherent limitations of Gaussian Background Modelling on GPUs, the current implementation is a reflection of the necessity for a combined CPU and GPU approach.
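To illustrate why this stage is CPU-bound, the sketch below runs a standard Gaussian-mixture background subtractor (OpenCV's MOG2, assuming OpenCV 4.x) over a video on the CPU. MOG2 is used here only as a stand-in for the adaptive multi-scale Gaussian background model in FR-CNN, not the authors' implementation, and the input file name is hypothetical.

    # Illustrative CPU-side Gaussian-mixture background subtraction (OpenCV MOG2).
    # A stand-in for the paper's adaptive multi-scale model, not the authors' code.
    import cv2

    cap = cv2.VideoCapture("underwater.mp4")  # hypothetical input video
    subtractor = cv2.createBackgroundSubtractorMOG2(history=500,
                                                    varThreshold=16,
                                                    detectShadows=False)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Per-pixel Gaussian mixtures are updated frame by frame on the CPU.
        fg_mask = subtractor.apply(frame)
        # Morphological opening suppresses isolated noise before contour extraction.
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
        fg_mask = cv2.morphologyEx(fg_mask, cv2.MORPH_OPEN, kernel)
        contours, _ = cv2.findContours(fg_mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        boxes = [cv2.boundingRect(c) for c in contours]  # coarse foreground proposals

    cap.release()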

Comments 7: The authors are advised to format the tables correctly. In Table 1, the first row is left-aligned, whereas the others are center-aligned. The same goes for other tables.

Response 7: Thank you for your suggestion. We have carefully reviewed and updated all tables in the manuscript to ensure consistent alignment. Specifically, we have center-aligned the text in all rows of the tables. The changes have been applied to Table 1 and other tables at lines 376, 554, 574, 576, 744, and 748.

Comments 8: Since the main contribution of the paper is - FR-CNN as an anchor-free two-stage object detection network and a multi-scale regression Gaussian background model as an unsupervised method for dynamic calibration of existing datasets, the authors are advised to provide ablation studies to support this. For example, see the results for various ResNets and how each version performs. Is there a specific reason ResNet-50 was utilized?

Response 8: Thank you for your detailed review of our paper and valuable suggestions. In response to your query, we offer the following commentary:

Regarding the ablation experiments:

A comparison of Faster R-CNN+FPN and our proposed FR-CNN is provided in Table 2. The results demonstrate that FR-CNN exhibits significant advantages in multi-scale regression Gaussian background modelling. As the experimental results have already demonstrated the improvement of our model following the introduction of the Gaussian background model, we have chosen not to conduct further ablation experiments. We posit that the existing comparison results provide substantial support for the primary contribution of our model.

The rationale behind the utilisation of ResNet-50 is as follows:

The ResNet-50 network serves as the backbone of our model. We selected it primarily for its strong deep feature extraction, the advantages of its residual structure, its well-established performance, its favourable balance between computational efficiency and parameter count, and its wide adoption and community support. Its 50-layer architecture effectively extracts multi-level features from low-level to high-level, which is crucial for object detection tasks. The residual blocks address the training difficulties of deep networks through shortcut connections, enhancing the network's expressive capacity and training stability. ResNet-50 performs robustly on a range of standard datasets, particularly image classification on ImageNet, which substantiates its feature-extraction ability. In comparison to deeper networks, it offers a good balance between computational cost and memory usage, enabling effective operation in resource-limited environments, and its extensive application and community resources provide strong support for implementing and optimising the model. Consequently, ResNet-50 is an appropriate choice for the FR-CNN backbone, ensuring good performance and efficiency in object detection tasks.
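As one possible setup for the backbone ablation the reviewer suggests, the sketch below builds the same Faster R-CNN-style detector on top of different ResNet depths using torchvision's FPN backbone helper (assuming torchvision 0.13 or later); each variant would then be trained and evaluated under an identical protocol. This is a hedged illustration of how such a study could be organised, not an experiment reported in the paper, and the class count is a placeholder.

    # Sketch of a backbone ablation: same detector, different ResNet depths.
    # Assumes torchvision >= 0.13; illustrative only, no results are implied.
    from torchvision.models.detection import FasterRCNN
    from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

    num_classes = 4  # placeholder: fish species + background

    def build_detector(backbone_name):
        # Wrap the chosen ResNet with a feature pyramid network (FPN).
        backbone = resnet_fpn_backbone(backbone_name=backbone_name,
                                       weights=None,  # or ImageNet weights
                                       trainable_layers=3)
        return FasterRCNN(backbone, num_classes=num_classes)

    for name in ["resnet18", "resnet34", "resnet50", "resnet101"]:
        model = build_detector(name)
        n_params = sum(p.numel() for p in model.parameters()) / 1e6
        print(f"{name}: {n_params:.1f}M parameters")  # train/evaluate each identically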

Comments 9: The authors are advised to provide details about the models, such as the convolutional filter and the classifier used with inputs from the BM module. Furthermore, researchers require complete training details to reproduce the work.

Response 9: Thank you for your valuable feedback regarding the need for more detailed information about the models and training procedures. We agree that providing comprehensive details is crucial for reproducibility.

In response to your suggestion, we have added specific information about the convolutional filters and the classifier used with inputs from the BM module. Additionally, we have included complete training details to facilitate the reproduction of our work. These updates can be found on lines 329-347 and 379 of the revised manuscript.

We appreciate your input, which has significantly improved the clarity and completeness of our manuscript.

Comments 10: The authors are advised to compare with object detectors specifically developed for underwater object detection, such as YOLO-Fish and the YOLORG algorithm. In addition, the authors are advised to use various datasets, such as DeepFish and Fishnet Open Images, to check the proposed model's generalization capability.

Response 10: Thank you for your valuable suggestion regarding the comparison with object detectors specifically developed for underwater scenarios, such as YOLO-Fish and YOLORG, as well as the use of various datasets like DeepFish and Fishnet Open Images.

Due to constraints in our current resources and time limitations, we were unable to incorporate these specific comparisons and additional datasets into this version of the manuscript. Additionally, we attempted to download the DeepFish and Fishnet Open Images datasets but were unable to access them. Furthermore, the YOLO-Fish and YOLORG algorithms do not have publicly available source codes, which prevented us from reproducing and comparing them with our model.

However, we recognize the importance of these comparisons and evaluations. In future work, we plan to address these aspects by seeking alternative ways to obtain the relevant datasets and implementing comparable algorithms if they become available. We believe that including these evaluations will provide a more comprehensive assessment of the model's performance.

We appreciate your insightful suggestions and hope to explore these avenues in subsequent research.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

1) Please modify the title such as: 

An automatic detection and statistical method for underwater fish based on Foreground Region Convolution Network (FR-CNN)

2) The manuscript must be re-edited according to the required format of the Journal.

3) Please check carefully for typos, the misuse of hyphens, and the spacing between words and citations; for example, line 148: “Mohamed et al. [43]employed”. There are many of these in the manuscript. Therefore the authors must read the manuscript thoroughly again and edit their new version entirely.

Comments on the Quality of English Language

Please also edit again the English to improve the quality of paper.

Author Response

Comments 1: Please modify the title such as:

 An automatic detection and statistical method for underwater fish based on Foreground Region Convolution Network (FR-CNN).

Response 1: Thank you for your constructive suggestion regarding the title of our manuscript. We have modified the title as you recommended. The new title is:

"An Automatic Detection and Statistical Method for Underwater Fish Based on Foreground Region Convolution Network (FR-CNN)"

We believe this change better reflects the content and focus of our work. We appreciate your insightful feedback and hope that the revised title meets your expectations.

Thank you for your time and effort in reviewing our manuscript.

Comments 2: The manuscript must be re-edited according to the required format of the Journal.

Response 2: Thank you for your valuable feedback. We have carefully revised the manuscript to ensure it adheres to the required format of the Journal. We have re-edited the document according to the guidelines provided, paying close attention to formatting details to meet the Journal's standards.

We appreciate your guidance in improving the quality of our submission and hope that the revised version aligns with the Journal's requirements.

Thank you for your time and effort in reviewing our manuscript.

Comments 3: Please check carefully for typos, the misuse of hyphens, and the spacing between words and citations; for example, line 148: “Mohamed et al. [43]employed”. There are many of these in the manuscript. Therefore the authors must read the manuscript thoroughly again and edit their new version entirely.

Response 3: Thank you for your thorough review and valuable comments. We have carefully revised the manuscript to address the issues you pointed out regarding typos, the misuse of hyphens, and the spacing between words and citations. Specifically, we have ensured that all instances, such as the one in line 148 ("Mohamed et al. [43]employed"), have been corrected.

We have thoroughly read through the entire manuscript and made the necessary edits to improve its overall quality. We appreciate your attention to detail and hope that the revised version meets your expectations.

Thank you for your time and effort in reviewing our manuscript.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have addressed majority of the comments provided by the reviewers

Author Response

Dear Reviewer,

We would like to express our sincere gratitude for your previous valuable comments and feedback on our manuscript. Your insights were instrumental in guiding our revisions and improving the quality of our work. We have carefully considered and addressed all your suggestions in the revised version.

We greatly appreciate your time and effort in reviewing our manuscript and contributing to its enhancement. Thank you for your support and for the opportunity to present our work.

Best regards,

Shenghong Li

Corresponding author:

Name: Peiliang Li

E-mail: [email protected]

Author Response File: Author Response.pdf
