Review
Peer-Review Record

Object Detection, Distributed Cloud Computing and Parallelization Techniques for Autonomous Driving Systems

Appl. Sci. 2021, 11(7), 2925; https://doi.org/10.3390/app11072925
by Edgar Cortés Gallardo Medina 1, Victor Miguel Velazquez Espitia 1, Daniela Chípuli Silva 1, Sebastián Fernández Ruiz de las Cuevas 1,2, Marco Palacios Hirata 1, Alfredo Zhu Chen 1, José Ángel González González 1, Rogelio Bustamante-Bello 1 and Carlos Francisco Moreno-García 3,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 30 January 2021 / Revised: 7 March 2021 / Accepted: 18 March 2021 / Published: 25 March 2021
(This article belongs to the Collection Advances in Automation and Robotics)

Round 1

Reviewer 1 Report

Although the article is a bit long compared to other papers, it is pleasant to read. Perhaps the following article should be considered for the extensive bibliography: The Effectiveness of Using a Pretrained Deep Learning Neural Networks for Object Classification in Underwater Video, which is closely related to the topic of the article.

Author Response

Although the article is a bit long compared to other papers, it is pleasant to read. Perhaps the following article should be considered for the extensive bibliography: The Effectiveness of Using a Pretrained Deep Learning Neural Networks for Object Classification in Underwater Video, which is closely related to the topic of the article.

Thank you very much for your comments; we agree that the suggested piece of work is closely related to this research. We have included this citation as part of our review of related methods in section 1: “Furthermore, the study of these techniques has been extended to other application domains beyond AVs on the highway. For instance, Szymak et al. [6] presented a comparative study where different deep learning architectures were tested to perform object classification in video image processing of underwater AVs. In this study, authors evaluated the performance of different architectures to classify objects (i.e. fish, other vehicles, divers & obstacles) and detect the corrosion of abandoned munitions. Interestingly, they were able to deduce that pre-trained algorithms have higher probabilities of success compared to tailor-made approaches. This rationale will be explained and supported throughout our work.”

Reviewer 2 Report

The paper is focused on autonomous driving perception systems and it pays close attention to the concepts of neural networks and parallelization. The paper presents an interesting review, though it is not particularly original and not very scientifically sound.

The abstract is not concise; it should state the aim of the work, and the results of the review could be indicated more precisely.

The article presents great weaknesses that should be solved before being published in an academic journal. Most references are informative in nature and not from scientific journals with an impact index in the automotive section. In addition, authors are advised to review the citation style, some of which do not comply with the MDPI format.

Some of the acronyms used are not defined or are defined late; for this reason the paper is hard to understand. Some examples are:

  • DS is defined in the list of acronyms at the end of the paper; it should be defined earlier.
  • CUDA C/C++ should be explained; not all readers will be experts.
  • Is YOLO an acronym? Please provide more details.
  • CRNN is defined late. I think it should be defined in 2.5.
  • LSTM is defined late. I think it should be defined in 2.6.

Equations are not according to MDPI format.

Although a section of applications found in the disclosing literature is made, results that are not found should be included.

Some figures have low quality and do not help to understand.

In the conclusion section, the authors could add whether the study may have some further developments or other future applications. In the current form of the article the authors only give an opinion that is not supported by the study carried out.

Some sentences could be better formulated to make the sense of the concepts explicit. Authors must use the impersonal form; in the current form of the article the personal form is used (e.g. we centralized our….).

I do not want to be negative but because of these limitations I have to reject the article. I hope that my decision does not discourage the authors and they continue working.

Author Response

The paper is focused on autonomous driving perception systems and it pays close attention to the concepts of neural networks and parallelization. The paper presents an interesting review, though it is not particularly original and not very scientifically sound. The abstract is not concise; it should state the aim of the work, and the results of the review could be indicated more precisely.

Indeed, the original abstract was not assertive; we have changed it to: “Autonomous vehicles are increasingly becoming a necessary trend towards building the smart cities of the future. Numerous proposals have been presented in recent years to tackle particular aspects of the working pipeline towards creating a functional end-to-end system, such as object detection, tracking, path planning, sentiment or intent detection, amongst others. Nevertheless, few efforts have been made to systematically compile all of these systems into a single proposal that also considers the real challenges that these systems will have on the road, such as real-time computation, hardware capabilities, etc. In this paper, we review the latest techniques towards creating our own end-to-end autonomous vehicle system, considering the state of the art methods on computer vision, distributed systems, and parallelization. Our findings show that while methods such as convolutional neural networks, recurrent neural networks and long short-term memory can effectively handle the initial detection and path planning tasks, more efforts are required in the implementation of cloud computing to reduce the computational time that these methods demand. In addition, we have mapped different strategies to handle the parallelization task, both within and between the networks.”

The article presents great weaknesses that should be solved before being published in an academic journal. Most references are informative in nature and not from scientific journals with an impact index in the automotive section. In addition, authors are advised to review the citation style, some of which do not comply with the MDPI format. 

We have reviewed the citation style of all sources to comply with the journal’s format. In addition, we have added more citations from scientific journals. Finally, all informative sources have been either removed or added as footnotes. 

Some of the acronyms used are not defined or are defined late; for this reason the paper is hard to understand. Some examples are:

  • DS is defined in the list of acronyms at the end of the paper; it should be defined earlier. All instances of DS have been addressed.
  • CUDA C/C++ should be explained; not all readers will be experts. CUDA has now been briefly explained in the section where it is mentioned.
  • Is YOLO an acronym? Please provide more details. This acronym is defined in section 2.2.5, has its own section for explanation, and is now defined in the acronym section as well.
  • CRNN is defined late. I think it should be defined in 2.5. It is now defined in this section.
  • LSTM is defined late. I think it should be defined in 2.6. It is now defined in this section.

Thank you very much for noting these inconsistencies. We have reviewed and corrected them. 

Equations are not according to MDPI format. 

We have reviewed the guideline of MDPI (https://www.mdpi.com/authors/layout#_bookmark37) to have all equations comply with the template provided. Moreover, the AE has kindly helped us reconvert the paper into the latest format. 

Although a section of applications found in the disclosing literature is made, results that are not found should be included. Some figures have low quality and do not help to understand. 

All figures have been changed and resized. 

In the conclusion section, the authors could add whether the study may have some further developments or other future applications. In the current form of the article the authors only give an opinion that is not supported by the study carried out. 

Thank you very much for this remark; we have added the following lines to indicate that this study may have some other applications: “Finally, it is worth to note that this study may have some further developments and applications for other areas which demand enhanced ADS, such as underwater ROVs [6] (as described in Section 1).”

Some sentences could be better formulated to make the sense of the concepts explicit. Authors must use the impersonal form; in the current form of the article the personal form is used (e.g. we centralized our….).

We have reviewed the entire manuscript to locate all sentences that were confusing or written in the personal form, and we have reworded them in the impersonal form as the reviewer suggested. Indeed, this gives more clarity to the manuscript.

Reviewer 3 Report

The manuscript provides an overview of techniques (mainly machine learning techniques) for image processing algorithms for AV applications and the real-time implementation of these algorithms. The paper in its current form is not publishable due to several severe issues:

    - The structure of the paper is confusing. For example, the sub-headings of section 2 are not consistent (how can 'Feature Extraction' and 'RNN' be categorised at the same level?)

    - There are lots of unnecessarily long, repeated and mixed-up sentences in paragraphs. For example, the explanation provided under 2.1 is long and still does not clearly explain what the authors mean by the heading 'Feature Extraction'. Most of the paragraphs suffer from the same issues.

    - While the authors claim in Section 1 that the main focus of the paper is on parallelization and real-time implementation, there is not enough information on these interesting topics, and the majority of the paper explains different machine learning algorithms with repetitive sentences. Also, the sections dedicated to implementation do not review any literature on implementation for AV applications in particular, but only cover generic information about the implementation of NNs on GPUs with NVIDIA technologies.

    - English wording needs a thorough revision. The paper is unnecessarily long with several vague sentences and phrases. For example, the heading of Section 3 is very vague (applications of what for AVs?). Or as another example, it is not clear how such a long explanation of the first three paragraphs under 3.1 is related to the 'Applications to AVs'. Or, how the headings of 3.2.1 and 3.2.2 are related to the heading of 3.2.

Author Response

The manuscript provides an overview of techniques (mainly machine learning techniques) for image processing algorithms for AV applications and the real-time implementation of these algorithms. The paper in its current form is not publishable due to several severe issues:

    - The structure of the paper is confusing. For example, the sub-headings of section 2 are not consistent (how can 'Feature Extraction' and 'RNN' be categorised at the same level?)

Indeed, there was an issue with the levels of the previous sub-sections. We have addressed this issue. Now “Feature extraction” is 2.1, “Object detection” is 2.2, and all NN methods within are 2.2.X. This in turn has converted RNN into 2.2.7. 

    - There are lots of unnecessarily long, repeated and mixed-up sentences in paragraphs. For example, the explanation provided under 2.1 is long and still does not clearly explain what the authors mean by the heading 'Feature Extraction'. Most of the paragraphs suffer from the same issues.

To clarify the aim of section 2.1, we have renamed it as Feature Extraction Methods used for Object Detection in AVs. Furthermore, we have removed the first sentence to simplify the presentation of the concepts. We have simplified the structure and language across the entire manuscript. The paragraph introducing section 2.1 was made more concise, the important topics and objectives are defined right away to avoid confusion. 

    - While the authors claim in Section 1 that the main focus of the paper is on parallelization and real-time implementation, there is not enough information on these interesting topics, and the majority of the paper explains different machine learning algorithms with repetitive sentences. Also, the sections dedicated to implementation do not review any literature on implementation for AV applications in particular, but only cover generic information about the implementation of NNs on GPUs with NVIDIA technologies.

We acknowledge that some sentences in the abstract and section 1 might have been misleading, as the main purpose of the project is to present a thorough study of the state of the art object detection techniques, and how cloud computing, distributed systems and parallelization might become a necessary addition to enhance those techniques in the near future (as the current literature in these topics is scarce). We have changed the abstract to reflect this point of view “Autonomous vehicles are increasingly becoming a necessary trend towards building the smart cities of the future. Numerous proposals have been presented in recent years to tackle particular aspects of the working pipeline towards creating a functional end-to-end system, such as object detection, tracking, path planning, sentiment or intent detection, amongst others. Nevertheless, few efforts have been made to systematically compile all of these systems into a single proposal that also considers the real challenges these systems will have on the road, such as real-time computation, hardware capabilities, etc. This paper reviews the latest techniques towards creating our own end-to-end autonomous vehicle system, considering the state of the art methods on object detection, and the possible incorporation of distributed systems and parallelization to deploy these methods. Our findings show that while techniques such as convolutional neural networks, recurrent neural networks and long short-term memory can effectively handle the initial detection and path planning tasks, more efforts are required to implement cloud computing to reduce the computational time that these methods demand. Also, we have mapped different strategies to handle the parallelization task, both within and between the networks.”. Moreover, we have added the following lines to section 1: “This paper provides a thorough state of the art investigation on object detection methods for the ADS pipeline. 
In addition, distributed cloud computing and parallelization techniques are studied to understand how these can be incorporated to deploy these methods. Unlike other reviews, this study is centralized mainly in autonomous driving perception systems, focusing on the neural network (NN) concepts for object detection, and how the concept of a Distributed System (DS) for parallelization can be included within”

    - English wording needs a thorough revision. The paper is unnecessarily long with several vague sentences and phrases. For example, the heading of Section 3 is very vague (applications of what for AVs?). Or as another example, it is not clear how such a long explanation of the first three paragraphs under 3.1 is related to the 'Applications to AVs'. Or, how the headings of 3.2.1 and 3.2.2 are related to the heading of 3.2. 

Thank you very much for this remark. We have changed the title of section 3 to “Methods to address the pipeline of end-to-end ADSs”, and we provide an introductory sentence to let the reader know that in this section “we will discuss the latest methods presented in literature to address the different stages that constitute an end-to-end ADS”. After this sentence, we explain the main subdivision of section 3 (four sections, one for each part of this pipeline). Therefore, we hope it is now clearer that section 3.1 is related to the topic, as it deals with image pre-processing and instance segmentation, which is the first step in this pipeline. Section 3.2 is called “Driver Assistance and Predicting Driver Patterns”, which is the second step of the pipeline. In this regard, we present methods such as BC (3.2.1), RL (3.2.2) and LSTM-based (3.2.3). We have thoroughly reviewed the manuscript using two professional English review services.

Round 2

Reviewer 2 Report

The paper has been improved.
