Article
Peer-Review Record

Development of a Seafloor Litter Database and Application of Image Preprocessing Techniques for UAV-Based Detection of Seafloor Objects

Electronics 2024, 13(17), 3524; https://doi.org/10.3390/electronics13173524
by Ivan Biliškov 1 and Vladan Papić 2,*
Reviewer 1:
Reviewer 2:
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 20 July 2024 / Revised: 27 August 2024 / Accepted: 3 September 2024 / Published: 5 September 2024
(This article belongs to the Special Issue Artificial Intelligence in Image Processing and Computer Vision)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The article explores innovative methodologies for detecting marine debris using unmanned aerial vehicles (UAVs), combining image preprocessing and deep learning techniques. The research addresses a crucial environmental issue with an innovative approach using UAV technology for marine litter detection. The development of a specialized database for this purpose is particularly commendable. In general, this study is well-structured. The reviewer has only some minor concerns:

1. Title clarity: In the study, the authors did not develop new image preprocessing methods; rather, they applied, evaluated, and compared the effectiveness of various established techniques. The current title, "Image preprocessing and deep learning for detection of seafloor objects from UAVs," might therefore lead readers to expect new image preprocessing techniques, given how it is framed. The reviewer suggests changing it to something like: "Development of a Novel Database and Application of Image Preprocessing Techniques for UAV-Based Detection of Seafloor Objects".

2. Literature review: While the comprehensive listing of databases provides valuable context, Section 2.1 would benefit greatly from a more detailed analysis of these resources. Please do not simply list everything without any explanation or analysis. The authors did this well in Section 2.2, with a separate closing paragraph that identifies the exact knowledge gap and research focus. The reviewer recommends adding such a summary for Section 2.1 as well.

3. The paper mentions the use of nine image preprocessing methods without clear justification for their selection or a detailed review of their relevance and effectiveness in the specific context of underwater imaging from UAVs. Since image preprocessing is an important part of this paper, it would be better to include a rationale for each preprocessing method selected, or to add a literature review providing evidence or preliminary results that demonstrate their efficacy in similar applications.

Comments on the Quality of English Language

The English quality in the paper is generally good, with clear articulation of complex scientific concepts and a structured presentation of the research content.

Author Response

Comments 1. Title clarity: In the study, the authors did not develop new image preprocessing methods; rather, they applied, evaluated, and compared the effectiveness of various established techniques. The current title, "Image preprocessing and deep learning for detection of seafloor objects from UAVs," might therefore lead readers to expect new image preprocessing techniques, given how it is framed. The reviewer suggests changing it to something like: "Development of a Novel Database and Application of Image Preprocessing Techniques for UAV-Based Detection of Seafloor Objects".

Response 1. Agree. We have accepted your suggested new title (with some modifications), which better reflects the contents of the article.

Comments 2. Literature review: While the comprehensive listing of databases provides valuable context, Section 2.1 would benefit greatly from a more detailed analysis of these resources. Please do not simply list everything without any explanation or analysis. The authors did this well in Section 2.2, with a separate closing paragraph that identifies the exact knowledge gap and research focus. The reviewer recommends adding such a summary for Section 2.1 as well.

Response 2. Agree. We have added the proposed summary section in a way that maintains the flow of the article.

Comments 3. The paper mentions the use of nine image preprocessing methods without clear justification for their selection or a detailed review of their relevance and effectiveness in the specific context of underwater imaging from UAVs. Since image preprocessing is an important part of this paper, it would be better to include a rationale for each preprocessing method selected, or to add a literature review providing evidence or preliminary results that demonstrate their efficacy in similar applications.

Response 3. We have added a section that clarifies why those methods were chosen and what the limitations of some other methods were. Relevant review literature was also added to the references.

Reviewer 2 Report

Comments and Suggestions for Authors

The research presents a novel approach to detecting marine debris using UAVs and advanced image processing techniques. The creation of a comprehensive database and the application of various preprocessing methods significantly improve the accuracy of underwater object detection.
The creation of a large, well-annotated database is a significant contribution to the field, providing a valuable resource for future research. However, the study is geographically limited to the Croatian coast, which may affect the generalizability of the findings to other regions with different environmental conditions.
The study employs a variety of preprocessing techniques, showcasing the impact of different methods on detection accuracy. The effectiveness of combined preprocessing methods indicates a need for further research to simplify and optimize preprocessing pipelines.
The use of YOLOv8 architecture and transfer learning enhances the model's performance, demonstrating the effectiveness of state-of-the-art deep learning methods. While the database includes 31 classes, some categories showed lower detection performance, suggesting a need for more balanced and diverse datasets.
An analysis of the proposed algorithms should be done on a data set collected from a single geographical point, from various altitudes, under different wind conditions, and with different wave forms. Such a data set would be more difficult to obtain, but it would better highlight the advantages and disadvantages of the proposed solution and would allow the development of preprocessing recommendations depending on the image capture conditions.

Author Response

Comments 1. The creation of a large, well-annotated database is a significant contribution to the field, providing a valuable resource for future research. However, the study is geographically limited to the Croatian coast, which may affect the generalizability of the findings to other regions with different environmental conditions.

Response 1. We tried to capture different types of seafloor, water transparency, waves, sun angles, and similar conditions in order to obtain a more diverse dataset, but we agree that it is localized to the Croatian coastline. Collecting images is a time-consuming process and we could not easily expand to other countries. On the other hand, we would be happy to accept contributions from other researchers to expand the database and enable further research.

Comments 2. The use of YOLOv8 architecture and transfer learning enhances the model's performance, demonstrating the effectiveness of state-of-the-art deep learning methods. While the database includes 31 classes, some categories showed lower detection performance, suggesting a need for more balanced and diverse datasets.

Response 2. All images are real-life snapshots and we do not have any generated or artificially augmented images, so we could not influence what litter we would find on the seafloor. Some techniques (collecting more images from the same spots under different conditions, oversampling, undersampling, a custom loss function) could be applied to deal with this problem, but this is left for other researchers and is planned for our future work. In this paper, we used the database as-is without focusing the research on that.

Comments 3. An analysis of the proposed algorithms should be done on a data set collected from a single geographical point, from various altitudes, for different wind conditions and different wave forms. A data set that is more difficult to obtain but which would better highlight the advantages and disadvantages of the proposed solution and would allow the development of preprocessing recommendations depending on the image capture conditions.

Response 3. The introduced database has metadata that allows filtering by height, azimuth, time of day, day of the year, wind speed, etc. Many different analyses could be done with this dataset (for example, comparing evaluation metrics and results by altitude, wind speed, or time of day). In this paper we presented only a basic analysis, but all the others could be done as well; we could not include them because of the article size limit (and the time available for the review response). However, the open dataset essentially allows anybody to do research based on the parameters that are useful or interesting to them.
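
As a purely illustrative aside, the kind of metadata-based filtering described above could look roughly like the following minimal Python sketch; the file name and column names (altitude_m, wind_speed_ms, capture_time) are hypothetical placeholders and may not match the actual database schema.

    import pandas as pd

    # Load the per-image metadata table (placeholder file name and columns).
    meta = pd.read_csv("seafloor_litter_metadata.csv", parse_dates=["capture_time"])

    # Select images captured below 30 m altitude, in calm wind, during the morning hours.
    subset = meta[
        (meta["altitude_m"] < 30)
        & (meta["wind_speed_ms"] < 3)
        & (meta["capture_time"].dt.hour.between(8, 12))
    ]

    print(f"Selected {len(subset)} of {len(meta)} images for this condition subset.")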

Reviewer 3 Report

Comments and Suggestions for Authors

General:

  1. How do the wind speed or the time at which a photo is taken affect the dataset quality? Does training models on pictures (or patches) taken in a specific time period (e.g. from 8 AM to 12 AM) produce different results than training on a subset of the dataset based on another time period?

  2. Imbalanced dataset - the dataset is heavily imbalanced, even when restricted to 9 classes. How can this be alleviated?

  3. I would like to ask the Authors why they split the process of creating the dataset into two subsections in subsequent chapters (3.1 and 4.1). Could it be rearranged as one chapter in order to keep the flow of the paper?

  4. I would suggest carving out a subsection for the metrics alone, presenting and discussing the metrics used in a separate subsection, so that the results can be discussed without the need to introduce them first.

 

Commentary:

  1. Lines 205-209 - please expand on this point; it is invaluable to point out that the work is developing a niche or filling a research gap.

  2. Line 311 - the computation time was mentioned only for this method; why?

  3. Lines 476-478 - a comment on why those parameter values were chosen would be appreciated.

  4. Lines 478-478 - why did the Authors settle on 15 models? What were the criteria for choosing which methods to combine? What limitations prevented the Authors from evaluating all the possibilities?

  5. Line 531 - Why was the value 0.5 chosen, or how was it obtained?

 

Editorial:

 

  • I’m not sure what the convention for numbers is, whether it should be 7212 or 7 212, but both are used in the manuscript - please pick one (e.g. line 490)

  • Please revise the usage of abbreviations in the manuscript, e.g. line 78 uses an abbreviation that is only explained in line 109 (JAMSTEC), and line 110 repeats the explanation of an abbreviation

  • Please revise the usage of quotation marks for proper names, e.g. line 120 - I believe that YouTube does not need to be in quotation marks. This might apply to other proper names as well (e.g. dataset names)

  • Lines 49-58 - please add some references to the claims in this paragraph

  • Line 115 - it might not be clear to whom the Authors refer by saying “the authors use a database” - whether this refers to the authors of the cited publication or to the Authors themselves.

  • Is Equation (2) a pasted image? It does not align with its number properly. Equations (4), (5), and (6) as well.

  • Line 402 - min y \in … needs to be fixed to improve readability

  • Lines 401-406 - need to be corrected, something is off with formatting

  • Line 433 - “Where t(x) …” - should there be \in?

  • Equation (10) - the equation uses “R.” but the explanation uses “R”?

  • Table 2. - I am afraid that sorting the table alphabetically by name instead of by the obtained results (I would suggest the F1-score) is a very unfortunate decision.

  • Lines 558-559 - Instead of “"None" model without transfer” I would suggest using the already introduced name “None no transfer”

  • Figure 16. - Please add values within confusion matrix (for each cell)

  • Lines 568-575 - please discuss the results using numbers and values instead of vague terms such as “high number” or “majority”

  • Chapter 5 Conclusions. - remove empty lines

Author Response

General:

Comments 1. How do the wind speed or the time at which a photo is taken affect the dataset quality? Does training models on pictures (or patches) taken in a specific time period (e.g. from 8 AM to 12 AM) produce different results than training on a subset of the dataset based on another time period?

Response 1. The introduced database has metadata that allows filtering by height, azimuth, time of day, day of the year, wind speed, etc. Many different analyses could be done with this dataset. We provided a basic analysis, but all the others could be done as well; we could not include them because of the article size and time limits. However, the open dataset essentially allows anybody to do research based on the parameters that are useful or interesting to them. This is also something we will probably address in future research.

Comments 2. Imbalanced dataset - the dataset is heavily imbalanced, even when restricted to 9 classes. How can this be alleviated?


Response 2. All images are real-life snapshots and we do not have any generated or artificially augmented images, so we could not influence what litter we would find on the seafloor. Some techniques (collecting more images from the same spots under different conditions, oversampling, undersampling, a custom loss function) could be applied to deal with this problem, but this is left for other researchers or future work. We used the database as-is without focusing the research on that, which is why we used the 9 classes with the highest counts.
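
As an illustration only, one of the mitigation techniques mentioned above (a class-weighted loss) could be sketched in Python as follows; the class counts are invented placeholders rather than the actual database statistics, and this is a sketch of the general idea, not the authors' implementation.

    import torch
    import torch.nn as nn

    # Illustrative per-class annotation counts (not the actual database statistics).
    class_counts = torch.tensor([5200., 1800., 950., 600., 410., 300., 220., 150., 90.])

    # Inverse-frequency class weights, normalized so that they average to 1.
    weights = class_counts.sum() / (len(class_counts) * class_counts)

    # A classification loss that penalizes errors on rare classes more heavily.
    criterion = nn.CrossEntropyLoss(weight=weights)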

Comments 3. I would like to ask the Authors why they split the process of creating the dataset into two subsections in subsequent chapters (3.1 and 4.1). Could it be rearranged as one chapter in order to keep the flow of the paper?

Response 3. The flow is envisioned to cover two topics in parallel: the creation of the dataset, and the research on preprocessing methods and their application to that dataset. Section 3.1 presents the methodology of the dataset creation framework; we have also explained how we got from the original dataset to the dataset used to train the models. Chapter 4 presents the final database and the object detection results obtained using that database (with or without preprocessing). We think (and hope) this style of presentation is easy to follow and understand, but we are open to more precise suggestions.

Comments 4. I would suggest carving out a subsection for the metrics alone, presenting and discussing the metrics used in a separate subsection, so that the results can be discussed without the need to introduce them first.

Response 4. Good point. The suggestion is accepted and Section 4.2 has now been split into two subsections.

Commentary:

Comments 1. Lines 205-209 - please expand on this point; it is invaluable to point out that the work is developing a niche or filling a research gap.

Response 1. Good observation, we have modified this part a bit.

Comments 2. Line 311 - the computation time was mentioned only for this method; why?

Response 2. Our error; it should have been deleted, since we did not have computation times for all the other methods, so we removed it completely.

Comments 3. Lines 476-478 - a comment on why those parameter values were chosen would be appreciated.

Response 3. Explained and added references.

Comments 4. Lines 478-478 - why did the Authors settle on 15 models? What were the criteria for choosing which methods to combine? What limitations prevented the Authors from evaluating all the possibilities?

Response 4. We have now expanded on our process of selecting models and added references to the relevant literature that guided the selection. Some of the methods (such as AI-based methods) could not be used, since we did not have a proper dataset to replicate them or it would have been too expensive.

Comments 5. Line 531 - Why was the value 0.5 chosen, or how was it obtained?

Response 5. Thanks, we have modified that part with a better explanation of why this value is used.

Editorial:

Comments 1. I’m not sure what the convention for numbers is, whether it should be 7212 or 7 212, but both are used in the manuscript - please pick one (e.g. line 490)

Response 1. Fixed

Comments 2. Please revise the usage of abbreviations in the manuscript, e.g. line 78 uses an abbreviation that is only explained in line 109 (JAMSTEC), and line 110 repeats the explanation of an abbreviation

Response 2. Fixed

Comments 3. Please revise the usage of quotation marks for proper names, e.g. line 120 - I believe that YouTube does not need to be in quotation marks. This might apply to other proper names as well (e.g. dataset names)

Response 3. Fixed

Comments 4. Lines 49-58 - please add some references to the claims in this paragraph

Response 4. Done; we added extra literature that supports the claims and referenced it in the article.

Comments 5. Line 115 - it might not be clear to whom the Authors refer by saying “the authors use a database” - whether this refers to the authors of the cited publication or to the Authors themselves.

Response 5. Yes, that was unfortunate wording; we have edited this part.

Comments 6. Is Equation (2) a pasted image? It does not align with its number properly. Equations (4), (5), and (6) as well.

Response 6. Images were used as placeholders, but it seems there was an error in our editing of those equations. We have fixed that and they are now aligned correctly.

Comments 7. Line 402 - min y \in … needs to be fixed to improve readability

Comments 8. Lines 401-406 - need to be corrected, something is off with formatting

Comments 9. Line 433 - “Where t(x) …” - should there be \in?

Comments 10. Equation (10) - the equation uses “R.” but the explanation uses “R”?

Comments 11. Table 2. - I am afraid that sorting the table alphabetically by name instead of by the obtained results (I would suggest the F1-score) is a very unfortunate decision.

Responses 7-11. Fixed.

Comments 12. Lines 558-559 - Instead of “"None" model without transfer” I would suggest using the already introduced name “None no transfer”

Response 12. Thanks, this is a good catch, it is fixed now.

Comments 13. Figure 16. - Please add values within confusion matrix (for each cell)

Response 13. Done.

Comments 14. Lines 568-575 - please discuss the results using numbers and values instead of vague terms such as “high number” or “majority”

Response 14. Thanks for the comment; we have changed this paragraph to be precise and to use exact values when presenting the results.

Comments 15. Chapter 5 Conclusions. - remove empty lines

Response 15. Removed

Reviewer 4 Report

Comments and Suggestions for Authors

1. What objects will the deep learning network detect? Objects on the water or under the water?

2. In Section 2.1, the title is about image databases. However, it discusses different deep learning models.

3. Two tables are needed to compare different image datasets and deep learning models.

4. In the sentence, "restoration involves using the properties of medium to reverse the effect," what is the effect? How can it be reversed?

5. Why is the restoration needed for deep learning? Do the deep learning models have the ability to classify the raw images?

6. In Figure 8, what does “half size” mean? Compression?

7. How does the drone collect the underwater images? It can only fly above the water.

8. Are there any improvements to the image preprocessing algorithms or deep learning networks in this paper?

 

Comments on the Quality of English Language

1. The captions of tables should be above the tables.

2. Equation 3 is not written in the right format.

 

Author Response

Comments 1. What objects will the deep learning network detect? Objects on the water or under the water?

Response 1. The CNN learns to detect features of the objects, and it is trained on objects that are under the water (on the seafloor). However, it is possible that this network would also give good results if the objects were floating (on the water). Underwater objects change color with depth, acquire a coating of vegetation, or are modified in appearance by animals making shelters around or in them. In the new revision, we have further clarified this.

Comments 2. In Section 2.1, the title is about image databases. However, it discusses different deep learning models.

Response 2. Section 2.1 covers image databases, but some of those papers apply techniques (deep learning or computer vision algorithms) to find objects in those images. That is why we have mentioned in what capacity each dataset was used and what was interesting about that application (or how that dataset was used). The most important part of Section 2.1 is the methodology of how the databases were acquired and what they were intended for.

Comments 3. Two tables are needed to compare different image datasets and deep learning models.

Response 3. We are not sure what you are referring to. Our work shows the development of one final dataset and how we got from drone images to the curated dataset used for training the models. Therefore, we used only that final dataset as the base dataset for the preprocessing methods; we applied the preprocessing methods to this dataset and trained a model on each resulting copy.

 

Comments 4. In the sentence, "restoration involves using the properties of medium to reverse the effect," what is the effect? How can it be reversed?

Response 4. Good point, it was unclear from that sentence. We have reworded that sentence and provided some explanation.

Comments 5. Why is the restoration needed for deep learning? Do the deep learning models have the ability to classify the raw images?

Response 5. The need for restoration is discussed in Section 2.2, which explores how restoration improves image quality and can therefore improve the results obtained when training on those images. Generally, if some features can be enhanced by, let us say, classical methods, this can improve the final detection results (thus unburdening the CNN). Our final results confirm this hypothesis.
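
As a concrete illustration of such a classical enhancement step, the following minimal Python/OpenCV sketch applies contrast-limited adaptive histogram equalization (CLAHE) to the lightness channel of an image; the file names are placeholders, and this is only an example of the general idea, not necessarily one of the nine methods evaluated in the paper.

    import cv2

    # Load a UAV image (placeholder path) and convert it to the CIELAB color space.
    bgr = cv2.imread("uav_frame.jpg")
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)

    # Apply CLAHE to the lightness channel only, boosting local contrast
    # without distorting the color channels.
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    l_eq = clahe.apply(l)

    # Merge back and convert to BGR for saving or feeding the detector.
    enhanced = cv2.cvtColor(cv2.merge((l_eq, a, b)), cv2.COLOR_LAB2BGR)
    cv2.imwrite("uav_frame_clahe.jpg", enhanced)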

 

Comments 6. In Figure 8, what does “half size” mean? Compression?

Response 6. Good observation; this might not be clear from the figure and the text above it, so we have addressed the issue to clarify what each step is called and how it is used to generate the next database. Thanks for pointing this out.

 

Comments 7. How does the drone collect the underwater images? It can only fly above the water.

Response 7. Thanks for the question; it seems some clarification is needed. In the first revision of the manuscript, lines 487 and 488 state “Given that a database of underwater objects photographed by drones (UAVs in the new version of the manuscript) does not exist or is not public, our own database was built and the following link provides information…”, where we explicitly state how the images were obtained. Also, Section 3.1 (lines 213-216) describes the guidelines and heights used for the UAVs to obtain those images. Some other clarifications in the text have also been made.

 

Comments 8. Are there any improvements to the image preprocessing algorithms or deep learning networks in this paper?

Response 8. This paper does not propose improvements to existing network architectures or algorithms; it is focused on the development and presentation of a novel database. In addition, we have applied some state-of-the-art underwater image enhancement algorithms to improve detection in UAV-acquired images, because a large part of the degradation in the appearance of underwater objects is, in fact, caused by the sea conditions. To the best of our knowledge, this approach has not been tested before. Future work includes modifying the network to obtain better results.

Round 2

Reviewer 4 Report

Comments and Suggestions for Authors

The manuscript has been improved according to the reviewers' comments. In this situation, I suggest accepting it for publication.
