Peer-Review Record

Fish Face Identification Based on Rotated Object Detection: Dataset and Exploration

by Danyang Li 1, Houcheng Su 1, Kailin Jiang 2, Dan Liu 1 and Xuliang Duan 1,*
Reviewer 1:
Reviewer 2:
Submission received: 18 July 2022 / Revised: 19 August 2022 / Accepted: 19 August 2022 / Published: 25 August 2022
(This article belongs to the Section Fishery Facilities, Equipment, and Information Technology)

Round 1

Reviewer 1 Report

In this paper, the authors describe a dataset (created by them) and a deep learning pipeline for fish identification.

The pipeline consists of fish data collection, preprocessing, fish detection, and fish face recognition. In the fish detection stage, instead of using standard rectangles as bounding boxes, they used rotated bounding boxes to perform fish detection. In the fish face identification stage, they proposed a Fish Face Recognition Network (FFRNet) with a Self-SE module. Compared with other backbones (e.g., MobileNet, ResNet50, EfficientNet, ViT), their FFRNet showed the best performance, with 92% accuracy on the fish data.

Comments from methods:
Some unclear points in fish detection and rotation detection:

• In Section 3.1.1 (Rotating frame representation), the context is a bit confusing: they use a long page to describe different rotation detection methods, but their own rotation method remains unclear. What kind of classification task does the rotation detector solve? (L363~) What are the train and test sets of this classification task? Do they have labels for the rotated bounding boxes? If so, why not use the labeled rotated bboxes to train a fish detector that predicts a polygon or rotated rectangle as the bounding box of each fish?
• In Section 4.1 (Object detection experiment), they present rotating target detection in Table 4. Is this rotating target detection method related to the classification task mentioned in Section 3.1.1?
• In Table 4, what is the mAP value for R-CenterNet and R-Yolov5s?
• What are the metrics ‘mIOU’ and ‘mAngle’? There is no explanation of these metrics.

There are a number of works on the detection of animals with more complex shapes (rotated bounding box detection is not novel):
- E.g.: https://openaccess.thecvf.com/content/WACV2021/papers/Pan_Ellipse_Detection_and_Localization_With_Applications_to_Knots_in_Sawn_WACV_2021_paper.pdf and references therein!
- In pose estimation, there are also body-plan-agnostic detection methods: https://www.nature.com/articles/s41592-022-01443-0 (this paper also does ReID on unmarked fish)
- Y. Jiang, X. Zhu, X. Wang, S. Yang, W. Li, H. Wang, P. Fu, and Z. Luo, “R2CNN: Rotational region CNN for orientation robust scene text detection,” arXiv:1706.09579, 2017.
- J. Ma, W. Shao, H. Ye, L. Wang, H. Wang, Y. Zheng, and X. Xue, “Arbitrary-oriented scene text detection via rotation proposals,” TMM, 2018.
- J. Ding, N. Xue, Y. Long, G.-S. Xia, and Q. Lu, “Learning RoI Transformer for Detecting Oriented Objects in Aerial Images,” https://arxiv.org/abs/1812.00155
- X. Yang, J. Yang, J. Yan, Y. Zhang, T. Zhang, Z. Guo, X. Sun, and K. Fu, “SCRDet: Towards More Robust Detection for Small, Cluttered and Rotated Objects,” https://arxiv.org/abs/1811.07126

Data information is unclear

• How much data do they have? In the abstract, they say, “A dataset of 7260 fish with identifying information was produced.” In the experiment (L658): “The following experiment took 2912 pics, with no difference between the data label format of each one and that of previous, from the rotated bounding box dataset.”
• How much data was used for training and testing in the fish identification task? It is also important to split this properly and explain the rationale.

Other comments
• In Section 3.2.2 (Self-SE module of FFRNet), they compare the Self-SE, SE, and base networks. Are there any quantitative results showing the effectiveness of the Self-SE module?
• It seems the data are from the same video and the same experimental environment, and the number of fish is not as large as in a real breeding environment. Currently, the face identification performance is good in this fixed experimental setting. How can it be further applied to more complex, real breeding scenarios?

Comments from context:

• Citations are missing:
• L95: panda face recognition, L101: MF Hansen et al., L110: D Crouse et al., L277: DBFace, CenterNet, etc.

The writing should be improved:

• There are some inconsistent definitions of the method:
• L215, 216: “phash”; L257: “pHash”
• L490: “self SE module”, “Self-SE modules”
• There are many capitalized words after commas and semicolons:
• L59, L61, L179, L202, L203, L215, L261, L276, etc….

 

• Tense is inconsistent, especially in Section 3 (Materials and Methods) and Section 4 (Results): sometimes the past tense is used, sometimes the present.

• The whole section 70-86 should be seriously condensed and adapted for readers in Fishes.

The English should be seriously improved, e.g.:

- Line 10: "We have explored fish face identification for the first time."
- Line 28: "In the future fishery industry, reducing Marine fishing to protect the ecological environment and increasing fishery breeding are the main trends in the future."
- Line 43: "In the intelligent detection of animal diseases and abnormal behaviors by computer vision method, identity recognition can accurately locate and early warning the detection."
- Line 57: "In fish, the fish are usually smaller, breeding density is high"
- …
- Line 117 …
- many more!

Author Response

Response to Reviewer 1 Comments

 

Point 1: In section 3.1.1. Rotating frame representation, the context is a bit confusing — they use a long page to describe different rotation detection methods, but their rotation method is a bit unclear. What kind of classification task for the rotation detector? (L363~) What’s the train and test set of the classification task? Do they have the labels of the rotated bounding boxes? If so, why not use the labeled rotated bbox to train a fish detector which predicts the polygon or rotated rectangle as the bounding box of each fish?

 

Response 1: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. Following the format of the YOLO-series models, the output tensor has a shape of (h, w, 255): each grid cell carries a 255-dimensional vector containing information about 3 boxes (each box contains x_box, y_box, w_box, and h_box), the confidence level, and the 80-dimensional category information. We then expand this dimensionality to 435 (255 + 180) so that angles can be classified. The classification task of the rotating target detection model mentioned in the manuscript refers specifically to classifying the angle, using a classification loss function (e.g., cross-entropy). The data labels are in the format (x, y, w, h, angle), and we use the labeled rotated boxes for training.
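For concreteness, a minimal sketch of the head layout described above follows; all names, the feature-map size, and the input channel count are illustrative assumptions, not the authors' code. The standard 255 YOLO channels are extended with 180 angle-bin channels, and the angle is supervised with cross-entropy:

```python
import torch
import torch.nn as nn

# Illustrative sketch (hypothetical names): 3 anchors x (x, y, w, h,
# confidence + 80 classes) = 255 channels, plus 180 channels that classify
# the box angle in 1-degree bins, giving the 435 = 255 + 180 described above.
NUM_ANCHORS, BOX_PLUS_CONF, NUM_CLASSES, ANGLE_BINS = 3, 5, 80, 180
base = NUM_ANCHORS * (BOX_PLUS_CONF + NUM_CLASSES)        # 255
head = nn.Conv2d(512, base + ANGLE_BINS, kernel_size=1)   # 435 output channels

feat = torch.randn(1, 512, 20, 20)        # assumed backbone feature map
out = head(feat)                          # shape (1, 435, 20, 20)
angle_logits = out[:, base:]              # (1, 180, 20, 20): one logit per degree

# Angle regression recast as classification: cross-entropy vs. the labeled angle.
target = torch.randint(0, ANGLE_BINS, (1, 20, 20))
loss = nn.CrossEntropyLoss()(angle_logits, target)
```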

 

Point 2: In section 4.1 Object detection experiment, they presented the rotating target detection in Table 4. Is this rotating target detection method related to the classification task mentioned in section 3.1.1?

 

Response 2: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. Yes, they are related: in Section 3.1.1 we describe in detail how to transform the regression problem of rotated boxes into a classification problem. A more detailed answer is given in the previous response.

 

Point 3: In Table 4, what is the mAP value in R-CenterNet and R-Yolov5s?

 

Response 3: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. Sorry, this was a mistake on our part and we have now added the mAP values for R-CenterNet and R-Yolov5s in Table 4.

 

Point 4: What are the metrics ‘mIOU’ and ‘mAngle’? There is no explanation for these metrics.

 

Response 4: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. In this manuscript, mIoU refers to the mean of the IoU between all predicted bounding boxes and their true bounding boxes, and mAngle refers to the mean deviation between the angles of the predicted bounding boxes and the angles of the true bounding boxes. We have added a note on this in the manuscript.
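One plausible reading of these definitions is sketched below, under the assumption that each detection has already been matched to a ground-truth box and that the per-pair IoUs (for rotated boxes, typically computed via polygon intersection) are supplied; the helper name and toy numbers are hypothetical:

```python
import numpy as np

def mean_metrics(ious, pred_angles, gt_angles):
    """mIoU: mean IoU over matched predicted/true box pairs.
    mAngle: mean absolute angle difference in degrees (wrapping at 180)."""
    diff = np.abs(np.asarray(pred_angles, float) - np.asarray(gt_angles, float))
    diff = np.minimum(diff, 180.0 - diff)   # 172 deg vs 3 deg differ by only 11
    return float(np.mean(ious)), float(np.mean(diff))

# Toy example with three matched detections:
print(mean_metrics([0.91, 0.85, 0.88], [10, 172, 45], [12, 3, 40]))
# -> (0.88, 6.0)
```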

 

Point 5: There are a number of works on detection of animals with more complex shapes (the rotated bounding box detection is not novel).

 

Response 5: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. Indeed, you are right, but we would like to emphasize that rotated-box object detection is an approach adopted in response to the low accuracy of identity recognition. The focus was not the adoption of a rotated-box object detection model; rather, the idea of reducing the background area drove us to adopt this approach.

 

Point 6: How much data do they have? In abstract, they said, “A dataset of 7260 fish with identifying information was produced.” In the experiment (L658): “The following experiment took 2912 pics, with no difference between the data label format of each one and that of previous, from the rotated bounding box dataset.”

 

Response 6: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. That was an error. Please allow us to clarify: the dataset is divided into two main categories, one used for object detection and another used for identification. The amount of data used for object detection is reported in Table 1: there are 1160 images labeled with rotated boxes and 1160 images labeled with standard boxes. Then 500 of these fish were sampled separately for identification to test which kind of box is effective, and the final result was high accuracy for identification based on rotated-box object detection. We therefore expanded the number to 2912, ending up with 3412 (500 + 2912) images for identification and 2320 (1160 + 1160) images for object detection. We have corrected the numbers in the original text.

 

Point 7: How much data was used for training and testing in the fish identification task? It is also important to split this properly, and explain the rationale.

 

Response 7: Your comments and suggestions on the manuscript are sincerely appreciated. In the fish identification task, we first sampled 500 images each from standard-box object detection and rotated-box object detection for identification; the results are reported in Table 6. Due to the high accuracy of rotated-box-based identification, we expanded the dataset to 2912 images; the results are reported in Table 7. We ended up with a dataset of 3412 (500 + 2912) identity images. We randomly divided the training and test sets in a ratio of 4:1, a division that is extremely common in the field of deep learning, so we do not think further explanation is necessary.
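A minimal sketch of such a 4:1 random split; the variable names and dummy labels are placeholders, not the authors' code:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Dummy stand-ins for the 3412 identity images and their labels.
image_ids = np.arange(3412)
labels = np.random.randint(0, 100, size=3412)   # number of identities assumed

train_x, test_x, train_y, test_y = train_test_split(
    image_ids, labels, test_size=0.2, random_state=42)
print(len(train_x), len(test_x))   # 2729 / 683, i.e., a 4:1 split
```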

 

Point 8: In section 3.2.2 Self-SE module of FFRNet, they compared the Self-SE, SE, base network. Are there any quantitative results to show the effectiveness of Self-SE module?

 

Response 8: Your comments and suggestions on the manuscript are sincerely appreciated. We report the results of Self-SE (FFRNet) in Table 8.

 

Point 9: It seems the data are from the same video and same experimental environment, and the fish number is not as complex as in the real breeding environment. Currently, the face identification performance is good enough in the same experimental setting. How can it be further applied to more complex and real breeding scenarios?

 

Response 9: Your comments and suggestions on the manuscript are sincerely appreciated. You are right, but please note that our work is an exploratory experiment. The aim is to explore whether face recognition can be transferred to fish face recognition. This means that there is no real application of fish face recognition to actual farming yet and we have added a discussion section to the manuscript. In the discussion section we discuss in detail the robustness of the algorithm in real-life scenarios. In the future, we will continue to explore the application of fish object detection and identification in real-life scenarios. This will be an important part of our future work.

 

Point 10: Citations are missing:

L95: panda face recognition, L101: MF Hansen et al., L110: D Crouse et al., L277: DBFace, CenterNet, etc.

 

Response 10: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. We have added several citations.

 

Point 11: There are some inconsistent definitions of the method:

L215,216: “phash”, L257: “pHash”

L490: “self SE module”, “Self-SE modules”

 

Response 11: Thank you sincerely for your comments and suggestions on the manuscript. We have changed "phash" to "pHash" in lines 215 and 216, and we have unified "self SE module" and "Self-SE modules" in line 490 as "Self-SE module".

 

Point 12: There are many capital words after commas, semicolons:

L59, L61, L179, L202, L203, L215, L261, L276, etc….

 

Response 12: Thank you sincerely for your comments and suggestions on the manuscript. We apologize for these errors in the writing of the manuscript. We have corrected them all.

 

Point 13: Tense inconsistent, especially in section 3. Material and methods and section 4. Results. Sometimes using the past tense, sometimes using the present tense.

 

Response 13: We sincerely thank you for your comments and suggestions on the manuscript, and we have revised the tenses throughout.

 

Point 14: The whole section 70-86 should be seriously condensed and adapted for readers in Fishes.

 

Response 14: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. We have revised widely in the sections you suggested.

 

Point 15: The English should be seriously improved, e.g.:

- Line 10: "We have explored fish face identification for the first time."

 

Response 15: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. We have revised widely.

 

Point 16: The English should be seriously improved, e.g.:

- Line 28: "In the future fishery industry, reducing Marine fishing to protect the ecological environment and increasing fishery breeding are the main trends in the future."

 

Response 16: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. We have revised widely.

 

Point 17: The English should be seriously improved, e.g.:

- Line 43: "In the intelligent detection of animal diseases and abnormal behaviors by computer vision method, identity recognition can accurately locate and early warning the detection."

 

Response 17: Thank you sincerely for your comments and suggestions on the manuscript. We have revised widely.

 

Point 18: The English should be seriously improved, e.g.:

- Line 57: "In fish, the fish are usually smaller, breeding density is high"

 

Response 18: Thank you sincerely for your comments and suggestions on the manuscript. We have revised widely.

 

Point 19: The English should be seriously improved, e.g.:

- Line 117 …

 

Response 19: Thank you sincerely for your comments and suggestions on the manuscript. We have revised widely.

Reviewer 2 Report

Li and colleagues have proposed a new automatic methodology for face recognition in fish farming. The aim of this study is interesting and potentially provides a tool to help farms identify animals. However, I have a major concern that the authors should consider before the study might be considered for publication.

Currently, there are several systems that allow you to automatically monitor the status of fish in groups and thus provide support to farmers. The methodology by Li and colleagues appears to be limited to their specific setup. The fish are kept in small, transparent tanks at low numbers, conditions that are hardly present on farms. Indeed, fish are generally monitored from above because they are maintained in large opaque tanks. This could worsen the performance of the system proposed by Li and colleagues, and I would have expected it to be taken into consideration to verify the robustness of their procedure. Therefore, I would like the authors to discuss the limits on the applicability of their system on extensive farms. Despite this, the procedure is very promising, and I would suggest applying it to recognize individuals and monitor their behavior for cognitive research and animal welfare purposes.

Additional comments

Lines 25-27: Authors should avoid repeating “Marine ecological”. I would suggest removing “as fish is an important part of Marine River ecological environment” because not strictly necessary for the meanings of the sentence.

Line 27: Add “Thus” after the full stop.

Lines 28-30: I would suggest avoiding the passive form. For example: “In the future, fishery industry should target that of reducing Marine fishing to protect the ecological environment and increasing …”

Lines 30-32: the sentence is redundant. “People are gradually increasing aquaculture to meet the demand for aquatic products and, consequently, reduce the damage to the natural environment”.

Lines 32-34:  Authors should report some references.

Lines 59 and 61: the capital letter is not necessary.

Line 72: change to “Howard A” according to the chosen style (see also line 96).

Lines 75-77 and following lines: The authors reported only the principal advantages of each computer vision NN. I would suggest discussing in depth the shared features and the differing aspects of the proposed NNs. One of the main objectives of this work is to compare classification performance across different NN architectures, and a deeper discussion of the similarities and differences in their architectures would help readers better understand the choices made by the authors.

Lines 94-95: redundant use of “animals” term.

Lines 94-112: authors reported several examples of automatic face recognition methodology applied in animal farming. However, no reason to apply such techniques has been described. Besides reporting many applicative examples, I would suggest motivating the use of automatic techniques in animal farming.

Lines 115-117: “At present” is repeated.

Throughout the manuscript, I read several different names for identifying the studied species (lines 121, 134, and so on). Check and use the same style (and report the Latin name for the species).

Lines 123-125: If I understand correctly, the Fish Face Recognition Network (FFRNet) is the classification module that permits the identification of fish, using the FaceNet architecture to collect information from images. If my interpretation is correct, the authors should better describe their classification module.

Figure 3: Change the description of each phase as reported in the main text.

 

Data processing: I did not understand how the authors collect images from the video recordings. Please specify the number of frames per second and describe the “cutting” process for each acquired frame in more detail.

Line 217: should be AHash. Moreover, I consider it necessary to better clarify and motivate the preference for using one algorithm rather than the others.

Line 271: What are the “other reasons”? Please motivate your choice to remove images from your sample.

Line 359: [0:180] or [0-180]?

Author Response

Response to Reviewer 2 Comments

 

Point 1: Currently, there are several systems that allow you to automatically monitor the status of fish in groups and thus provide support to farmers. The methodology by Li and colleagues appears to be limited to their specific setup. The fish are kept in small, transparent tanks at low numbers, conditions that are hardly present on farms. Indeed, fish are generally monitored from above because they are maintained in large opaque tanks. This could worsen the performance of the system proposed by Li and colleagues, and I would have expected it to be taken into consideration to verify the robustness of their procedure. Therefore, I would like the authors to discuss the limits on the applicability of their system on extensive farms. Despite this, the procedure is very promising, and I would suggest applying it to recognize individuals and monitor their behavior for cognitive research and animal welfare purposes.

 

Response 1: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. You are right, but please note that our work is an exploratory trial aimed at exploring whether face recognition can be transferred to fish face recognition. This means that there is currently no real application of fish face recognition in actual farming. We have added a new Section 5 to the manuscript, in which we discuss in detail the robustness of the algorithm in real-life scenarios. In the future, we will continue to explore the application of fish object detection and identification in real-world environments; this will be an important part of our future work. We will also explore the role of our programs in individual identification and individual behavior detection, contributing to fish cognition research and animal welfare.

 

Point 2: Lines 25-27: Authors should avoid repeating “Marine ecological”. I would suggest removing “as fish is an important part of Marine River ecological environment” because not strictly necessary for the meanings of the sentence.

 

Response 2: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. We have removed "as fish is an important part of Marine River ecological environment".

 

Point 3: Line 27. Add “Thus” after point interruption.

 

Response 3: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. We have added "Thus" after the full stop.

 

Point 4: Lines 28-30: I would suggest avoiding the passive form. For example: “In the future, fishery industry should target that of reducing Marine fishing to protect the ecological environment and increasing. “.

 

Response 4: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. We have changed "In the future fishery industry, reducing Marine fishing to protect the ecological environment and increasing fishery breeding are the main trends in the future." to "In the future, fishery industry should target that of reducing Marine fishing to protect the ecological environment and increasing.".

 

Point 5: Lines 30-32: la frase è ridondante. “People are gradually increasing aquaculture to meet the demand for aquatic products and, consequently, reduce the damage to the natural environment”.

 

Response 5: Thank you from the bottom of our hearts for your comments and suggestions on the manuscript. We have removed "People are gradually increasing aquaculture to meet the demand for aquatic products and, consequently, reduce the damage to the natural environment.".

 

Point 6: Lines 32-34: Authors should report some references.

 

Response 6: Thank you from the bottom of our hearts for your comments and suggestions on the manuscript. We have added references.

 

Point 7: Lines 59 and 61: the capital letter is not necessary.

 

Response 7: Thank you from the bottom of our hearts for your comments and suggestions on the manuscript. We apologize for the errors made in the writing of the manuscript. We have made the corrections.

 

Point 8: Line 72: change to “Howard A” in according to the chosen style (see also line 96).

 

Response 8: Thank you from the bottom of our hearts for your comments and suggestions on the manuscript. We have changed "A Howard" to "Howard A".

 

Point 9: Lines 75-77 and following lines: Authors reported only the principal positive advantages for each computer vision NN. I would suggest deeply discussing the shared features and the different aspects of proposed NNs. One of the main objectives of this work is to compare classification performance by considering different NN architectures. A deeper discussion of the similarities and differences in their architecture would lead readers better understand the choices made by the authors.

 

Response 9: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. We report the main features and advantages of each neural network architecture, but do not discuss the common features and different aspects of these networks. The reason for this is that the advantages and disadvantages of different neural networks can be seen intuitively in experimental tasks. The similarities and differences of the architectures have been described along with the improvements made and the results achieved by each architecture. At the same time, we have added a new discussion section (5. Discussion). In this section we provide an in-depth discussion of the similarities and differences between the different network architectures.

 

Point 10: Lines 94-95: redundant use of “animals” term.

 

Response 10: Thank you very much for your comments and suggestions on the manuscript. We have revised the words.

 

Point 11: Lines 94-112: authors reported several examples of automatic face recognition methodology applied in animal farming. However, no reason to apply such techniques has been described. Besides reporting many applicative examples, I would suggest motivating the use of automatic techniques in animal farming.

 

Response 11: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. We have added a rationale for applying facial recognition to animals in lines 85 to 92.

 

Point 12: Lines 115-117: “At present repetition”.

 

Response 12: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. We have removed the duplicate "At present".

 

Point 13: Throughout the manuscript, I read several different names for identifying the studied species (lines 121, 134, and so on). Check and use the same style (and report the Latin name for the species).

 

Response 13: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. We have changed "the domestic fish" in line 121 to "golden crucian carp", and we have reported the Latin name for the golden crucian carp: Carassius auratus.

 

Point 14: Lines 123-125: If I well understand, Fish Face Recognition Network (FFRNet) is the classificational module that permits the identification of fish by using the FaceNet architecture to collect information from images. If it is correct my interpretation, authors should better describe their classificational module.

 

Response 14: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. Your understanding is correct. We have clarified the relationship between the Self-SE module and FFRNet.

 

Point 15: Figure 3: Change the description of each phase as reported in the main text.

 

Response 15: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. We have changed the descriptions of each stage in the main text so that they are consistent with the descriptions of each stage in the figure.

 

Point 16: Data processing. I did not understand how authors collect images from video recordings. Please, specify the number of frames in each second and describe more the “cutting” processes for each acquired frame.

 

Response 16: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. We are very sorry for the ambiguity in our description. What we wanted to convey is that each second of the video (30 FPS) contains 30 images, and we extracted all of the images using OpenCV tools; for example, 900 images can be extracted from a 30 s video. We have revised the description in the manuscript (lines 201-212).
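A minimal OpenCV sketch of this frame-extraction step; the file paths are hypothetical placeholders, not the authors' actual data:

```python
import os
import cv2

os.makedirs("frames", exist_ok=True)
cap = cv2.VideoCapture("fish_tank_clip.mp4")   # hypothetical input path
fps = cap.get(cv2.CAP_PROP_FPS)                # 30.0 for the videos described
count = 0
while True:
    ok, frame = cap.read()
    if not ok:                                 # end of video reached
        break
    cv2.imwrite(f"frames/frame_{count:06d}.jpg", frame)
    count += 1
cap.release()
print(f"extracted {count} frames at {fps:.0f} FPS")  # ~900 frames for a 30 s clip
```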

 

Point 17: Line 217: Should be AHash. Moreover, I consider necessary a better clarification and motivated the preference for using an algorithm rather than others.

 

Response 17: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. We have changed the name from "aHash" to "AHash". At the same time, we have added the reason why pHash is superior to AHash (lines 215-228).
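For readers unfamiliar with the two hashes, the sketch below shows their usual textbook definitions (a generic illustration, not the authors' implementation): AHash thresholds a small grayscale thumbnail by its mean, while pHash thresholds the low-frequency DCT coefficients by their median, which makes it more robust to global brightness and contrast changes, the usual reason for preferring it when de-duplicating video frames.

```python
import cv2
import numpy as np

def ahash(img, size=8):
    """Average hash: threshold an 8x8 grayscale thumbnail by its mean."""
    g = cv2.cvtColor(cv2.resize(img, (size, size)), cv2.COLOR_BGR2GRAY)
    return (g > g.mean()).flatten()

def phash(img, size=32, keep=8):
    """Perceptual hash: keep the top-left (low-frequency) DCT block and
    threshold it by its median."""
    g = cv2.cvtColor(cv2.resize(img, (size, size)), cv2.COLOR_BGR2GRAY)
    dct = cv2.dct(np.float32(g))[:keep, :keep]
    return (dct > np.median(dct)).flatten()

def hamming(h1, h2):
    """Small Hamming distance (e.g., < 5 of 64 bits) marks near-duplicates."""
    return int(np.count_nonzero(h1 != h2))
```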

 

Point 18: Lines 271: What are “other reasons”? Please, motivate your choice for removing images from your sample.

 

Response 18: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. "Other reasons" refers to the presence of residual shadows and aggregations of fish in the captured images, which make it impossible to label them. We have explained this in lines 284 and 285.

 

Point 19: Line 359: [0:180] or [0-180].

 

Response 19: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. We have amended "[0180]" in the original to "[0:180]".

 

Round 2

Reviewer 1 Report

I looked at the updated manuscript and think that my reviews have not been sufficiently addressed. The English still needs to be seriously improved (throughout the manuscript), and I had shared several relevant papers, which were not added. Only once it is readable can one assess the science.

Author Response

Response to Reviewer 1 Comments

 

Point 1: The English still needs to be seriously improved (throughout the manuscript).

 

Response 1: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. We have used MDPI's English editing service to improve the English in the article.

 

Point 2: I had shared several relevant papers, which were not added.

 

Response 2: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. Sorry, this was our mistake. We have cited the paper you have shared in this revision.

Author Response File: Author Response.docx

Reviewer 2 Report

The authors have adequately responded to all my comments and provided new information for a better understanding of the reason that led them to perform this study and the methodology adopted to solve their research question.

As far as I am concerned, the manuscript has been improved, and it might be considered for publication after the Editor's decision. I recommend an English check by a professional native-English reader (for example, in lines 28-29, remove "that" and change to "increase").

 

Author Response

Response to Reviewer 2 Comments

 

Point 1: I recommended an English check from a professional native-English reader (for example, in lines 28-29 remove "that" and change to "increase").

 

Response 1: Thank you from the bottom of my heart for your comments and suggestions on the manuscript. We have used MDPI's English editing service as a way to improve the English in the article.

Author Response File: Author Response.docx
