Peer-Review Record

Real-Time Optical Flow Estimation Method Based on Cross-Stage Network

Appl. Sci. 2023, 13(24), 13056; https://doi.org/10.3390/app132413056
by Min-Hong Park 1, Jae-Hoon Cho 2 and Yong-Tae Kim 1,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 10 October 2023 / Revised: 5 November 2023 / Accepted: 27 November 2023 / Published: 7 December 2023
(This article belongs to the Section Computing and Artificial Intelligence)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

 

The article deals with the always interesting topic of optical flow estimation and its optimization for real-time applications. The authors present an approach that combines several neural networks with the goal of making estimation faster, more accurate, and thus suitable for use in real-time environments. Despite the interest of the topic and the quality of the presentation, the authors need to improve several key aspects to support their claims:

- The CAFT solution is explained quite clearly in theory and its resolution is given, but the claim that it is suitable for a low-computation platform (and better than other comparable methods) is not supported in detail. Where are the measurements (test results) showing the time needed to process each frame? The comparison between the selected methods only shows advantages in terms of precision and detail of the optical flow field.
- Following up on the previous comment, what platform did you use to perform the experiment? Simply stating "PyTorch" is neither sufficient nor informative. You need to describe the experimental setup(s) used for the experimental proof. This is the only true measure of real-time suitability, i.e., the basis for a true comparison of processing requirements in real-time applications.
- In Table 1, the time(s) are irrelevant without an explanation of the test conditions and setup used.
- Similarly, for the results in Table 2, please state clearly what the given numbers mean and what exactly they measure. Simply stating "learning results for the data set" is not a good way to compare methods.
- In Table 2, a percentage comparison of the methods would be much more meaningful. Mere numbers without measurements are neither informative nor easy to digest.
- In the conclusion, you mention the percentage advantage of your approach, but this is not clearly supported by the experiment.
- In addition, the results must be given separately for each data set used in the experimental proof, as well as an overall result in the conclusion (at the end of the experimental chapter of the article).
- The conclusion chapter must be more extensive and detailed.
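The timing and percentage comparisons requested above could be reported along the following lines. This is only a minimal sketch in plain Python, not the authors' procedure: `benchmark_per_frame`, `relative_change_percent`, the `infer` callable, and the warm-up count are all hypothetical names and assumptions introduced for illustration.

```python
import statistics
import time
from typing import Callable, Sequence


def benchmark_per_frame(
    infer: Callable[[object], object],
    frames: Sequence[object],
    warmup: int = 5,
) -> dict:
    """Time `infer` on each frame and summarise the latency in milliseconds.

    `infer` is a hypothetical stand-in for one optical flow forward pass.
    """
    if not frames:
        raise ValueError("frames must not be empty")

    # Warm-up calls are not timed, so one-off costs (lazy initialisation,
    # cache fills) do not distort the per-frame figures.
    for frame in frames[:warmup]:
        infer(frame)

    latencies_ms = []
    for frame in frames:
        start = time.perf_counter()
        infer(frame)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    mean_ms = statistics.fmean(latencies_ms)
    return {
        "frames": len(latencies_ms),
        "mean_ms": mean_ms,
        "median_ms": statistics.median(latencies_ms),
        "max_ms": max(latencies_ms),
        "fps": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
    }


def relative_change_percent(baseline: float, proposed: float) -> float:
    """Percentage change of `proposed` relative to `baseline` (negative means lower)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (proposed - baseline) / baseline * 100.0
```

Such figures are only comparable when the hardware, input resolution, and batch size are reported alongside them. If inference runs on a GPU, the device would also need to be synchronised before each timer reading (in PyTorch, `torch.cuda.synchronize()`), since kernel launches are asynchronous.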

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The proposed model combines the CSPNet with the RAFT structure to reduce the number of parameters and, consequently, contributes to real-time optical flow estimation. Although the innovation is confined to the integration of existing techniques, an improvement over the state-of-the-art is demonstrated by the results.

 

However, significant improvements are needed in the way the paper is composed. The following aspects require attention:

 

1. The introduction can be improved by trimming detailed explanations of less critical aspects and focusing more on the motivation behind the selection of the components being combined. A more explicit rationale for choosing RAFT, CSPNet, etc., over other existing techniques should be provided (e.g., instead of the excessive emphasis on SLAM).

 

2. The introduction can be further refined by including supporting citations for statements. For instance, the following paragraph lacks supporting references: "Previously, estimating camera pose was challenging due to the low accuracy of optical flows based on rules. However, advancements in deep learning technology have facilitated the solution to the camera pose estimation problem. Recent developments in deep learning networks have shown promise in enhancing the prediction of optical flows".

 

3. When a claim is made in the introduction, it has to be supported and discussed in the results section. The following statement lacks support in the results and discussion: “First, by using a single high-resolution flow field, we solve the problem of low-resolution error recovery in detailed continuous calculations, missing small and fast-moving objects, and the large number of training times generally required for continuous calculations.”

 

4. In the related work section, figures representing existing methods lack concrete explanations to support the content (e.g., Figure 3). When figures are employed in the paper, it is essential to ensure that all components depicted in the figure are adequately explained to guide the reader's attention. Furthermore, while it is recommended to cover a broad spectrum of related works, over-presenting works that do not significantly contribute to your proposal can potentially confuse the reader. It is important to maintain a balanced approach and focus more on essential subjects.

 

4. Figure 7 and Equation 1 do not receive sufficient supporting explanation within the text. It is imperative that all variables in the equation and figures be introduced before their use.

 

5. The proposed model shares several similarities with RAFT. This relationship should be prominently highlighted in the paper, spanning the introduction, methodology, and results sections. Additionally, explicate the distinctions between your proposal and RAFT. As a suggestion, in the section discussing related works, instead of elaborating on the fact that RAFT uses a differentiable loss function, delve into the design choices underpinning RAFT.

 

6. Justify your design choices and the selection of metrics reported in the results. For instance, explain why 4 filters were chosen in the correlation layer, give the rationale behind the specific kernel sizes, and clarify whether these decisions were empirically tested or inherited from an existing model (with references).

 

7. The captions for tables representing the results need to be expanded, as the tables are challenging to comprehend without accompanying explanatory text. The text explanation itself should be revised for better fluency.

 

8. The results can benefit from additional experimentation. Consider including an ablation study to underscore the significance of each proposed component in driving improvements, drawing inspiration from the RAFT paper.

Comments on the Quality of English Language

The paper's language can be improved for smoother flow and correct grammar usage.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

Dear Authors,

I appreciate the effort put into providing the revised version. The article now flows more smoothly and is easier to comprehend.

Best regards
