Peer-Review Record

Improved DeepSORT Algorithm Based on Multi-Feature Fusion

Appl. Syst. Innov. 2022, 5(3), 55; https://doi.org/10.3390/asi5030055

by Haiying Liu, Yuncheng Pei, Qiancheng Bei and Lixia Deng

Reviewer 1: Anonymous

Reviewer 2: António Sousa

Reviewer 3: Anonymous

Submission received: 22 April 2022 / Revised: 23 May 2022 / Accepted: 24 May 2022 / Published: 13 June 2022

Round 1

Reviewer 1 Report

The article entitled “Improved DeepSORT Algorithm Based on Multi Feature Fusion” is well-written and, from my point of view, would be of interest for the readers of ASI. In spite of this and before its publication, there are some issues that should be improved by the authors:

Leave a space before the bracket of the bibliographical references.
Line 23: change ‘et al’ to ‘et al.’
In general, it would be of interest to reinforce the content of the materials and methods section, providing the reader with the relevant information about the methodologies employed. Please consider enlarging and changing in depth the referred section.
Line 26: before using R-CNN the first time, define it. The same is applicable to the rest of the acronyms.
Line 80: introduce a reference to Mahalanobis distances.
From my point of view Figure 1. Feature pyramid network structure is not clear enough. Please, consider introducing some labels that would clarify its content.

Author Response

Response to the Reviewer(s)' Comments:

To Whom It May Concern:

We are very thankful to the Editor-in-Chief, Associate Editor, and reviewers who have made critical comments and constructive suggestions concerning the work reported in the manuscript.

The manuscript has now been revised to take these comments and suggestions into account. More mathematical details and experimental results have been added to test the newly proposed algorithm and validate its effectiveness and performance. On behalf of my co-authors, I am now submitting the revised manuscript, together with our reply to the associate editor and reviewers as a supporting document.

Sincerely,

Haiying Liu

Response to Reviewer 1’s Comments:

1. Leave a space before the bracket of the bibliographical references.
The authors have gone through the whole paper and added spaces before brackets in all references.
2. Line 23: change ‘et al’ to ‘et al.’
We have changed ‘et al’ to ‘et al.’ in line 22 of the revised paper.
3. In general, it would be of interest to reinforce the content of the materials and methods section, providing the reader with the relevant information about the methodologies employed. Please consider enlarging and changing in depth the referred section.
In lines 50-54 of the article, we introduced some new references and analyzed their main principles, and then derived our proposed research content.
4. Line 24: before using R-CNN the first time, define it. The same is applicable to the rest of the acronyms.
In the new version, we have defined R-CNN where it is first used, and we have double-checked the whole paper and defined all the other acronyms.
5. Line 80: introduce a reference to Mahalanobis distances.
In the modified version, we added reference 14 to introduce the Mahalanobis distance.
6. From my point of view Figure 1. Feature pyramid network structure is not clear enough. Please, consider introducing some labels that would clarify its content.
In the new version, we have modified Figure 1 and added corresponding labels for better understanding.

Reviewer 2 Report

Brief summary

This paper proposes an improved version of DeepSORT algorithm when tracking multiple pedestrians. Its main contribution consists in improving the feature extraction network model.

Broad comments

The document is sometimes not easy to read and follow.

The English needs major review.

The document is well supported with references.

The subject of the paper is interesting and with a great potential of application.

One of the weaknesses of this study is that it does not test the proposed algorithm when there is occlusion which is one of the main issues of feature extraction, matching and ultimately pedestrian tracking.

Another weakness is the fact that the presented results are based on simple cases of single pedestrian well identified in the images and not more complex scenes with multiple pedestrians.

Specific comments

In line 2 of the abstract did you mean “driverless cars”?! Or “Self-driving vehicles”? Please correct.

The sentence in lines 60 and 61 might be confusing to the reader. Please rephrase.

In line 99 and 100 please correct “mismatches and mismatches are prone to occur.”

Figure 1 seems incomplete. It does not add any information to the text. Please redraw figure with some meaningful information associated.

In line 135 to 137 authors should support the claim about “training network error behavior” with references or example result of the proposed method.

The text from line 141 to 149 should be rewritten to better explain figure 2.

In line 153 the phrase “It has non-negative, triangular inequality and other characteristics.” It is very short and not meaningful. Should be rewritten. Other characteristics?!

In line 186 did you mean equation 7? Please correct.

Section 2 explanation of GIOU (from line 150 to 214) is too long. Authors should reduce the explanation since there are duplicated ideas in the images and text.

The title of section 3 is not in correct English and should be rephrased “The Improved the feature extraction network”.

The text between line 246 and 258 is too repetitive. Authors should rephrase the text and compact the explanation.

In line 272 authors should explain why the training sets are smaller than the test sets (usually training sets are much larger than test sets).

Table 1 should be reconstructed. The values should be given, or at least also given, as percentages (%) so the reader can have a better understanding of the proposed method's accuracy. The text should also explain to the reader the meaning of the acronyms (FN, FP, IDS, MOTA and MOTP).

The discussion of the results should be more detailed with a deeper analysis. Also the conclusions should be more detailed.

Author Response

Response to the Reviewer(s)' Comments:

To Whom It May Concern:

We are very thankful to the Editor-in-Chief, Associate Editor, and reviewers who have made critical comments and constructive suggestions concerning the work reported in the manuscript.

Sincerely,

Haiying Liu

Response to Reviewer 2’s Comments:

In line 2 of the abstract did you mean “driverless cars”?! Or “Self-driving vehicles”? Please correct.

Following the reviewer’s advice, we have used the professional expression “driverless car” in line

The sentence in lines 60 and 61 might be confusing to the reader. Please rephrase.

We have rephrased the sentence as: “If the target trajectory and detection match, the cascade matching method will be used.”

In line 99 and 100 please correct “mismatches and mismatches are prone to occur.”

In the new version (lines 107-108), the sentence "mismatches and mismatches are prone to occur." has been changed to "the false matching or mismatch phenomenon was prone to occur."

Figure 1 seems incomplete. It does not add any information to the text. Please redraw figure with some meaningful information associated.

In the new version, we have modified Figure 1 and added corresponding labels for better understanding.

In line 135 to 137 authors should support the claim about “training network error behavior” with references or example result of the proposed method.

In the revised version, we have restated this claim more appropriately: “Continuing to increase the network layer will not achieve good performance for improving the network’s accuracy”, and reference [15] supports it.

The text from line 141 to 149 should be rewritten to better explain figure 2.

We rewrote the description of Figure 2 with some detailed information in lines 150-156 of the new version.

In line 153 the phrase “It has non-negative, triangular inequality and other characteristics.” It is very short and not meaningful. Should be rewritten. Other characteristics?!

In lines 161-162 of the newly revised article, we deleted "It has non-negative, triangular inequality and other characteristics." and added "The IOU value is always greater than 0, and its value will not change with the scaling of the prediction frame and detection frame." to describe the IOU features.
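
As a minimal sketch of the scale-invariance property described in this response (not the authors' code), the following Python computes IOU for two axis-aligned boxes and checks that scaling both boxes by the same factor leaves the value unchanged. The (x1, y1, x2, y2) box format, the helper names `iou` and `scale_box`, and the sample boxes are assumptions made for illustration. Note that IOU equals 0 for boxes that do not overlap, so strictly speaking the value is non-negative rather than always greater than 0.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def scale_box(box, factor):
    """Scale every coordinate of a box by the same factor (hypothetical helper)."""
    return tuple(factor * v for v in box)


# Hypothetical prediction and detection boxes, for illustration only.
pred = (0.0, 0.0, 2.0, 2.0)
det = (1.0, 1.0, 3.0, 3.0)

original = iou(pred, det)
rescaled = iou(scale_box(pred, 10.0), scale_box(det, 10.0))
disjoint = iou((0.0, 0.0, 1.0, 1.0), (2.0, 2.0, 3.0, 3.0))

print(original, rescaled, disjoint)
```

The overlap is clamped at zero so that disjoint boxes yield an IOU of 0 instead of a negative area.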

In line 186 did you mean equation 7? Please correct.

Thank you for the correction. The equation referred to here should be equation 8, and we have revised it in the new version.

Section 2 explanation of GIOU (from line 150 to 214) is too long. Authors should reduce the explanation since there are duplicated ideas in the images and text.

We have simplified this part and revised it in lines 159-208.

The title of section 3 is not in correct English and should be rephrased “The Improved the feature extraction network”.

We have revised the title of section 3 to "The Improved Feature Extraction Network".

The text between line 246 and 258 is too repetitive. Authors should rephrase the text and compact the explanation.

We have rephrased the text between lines 246 and 258 and compacted the explanation in the revised article.

In line 272 authors should explain why the training sets are smaller than the test sets (usually training sets are much larger than test sets).

In the Market-1501 dataset, which was released by Tsinghua University, there are 12,936 pictures in the training set and 19,732 pictures in the test set. However, we use the whole training set and only some of the pictures in the test set, as explained in lines 269-270 of the new version of the article.

Table 1 should be reconstructed. The values should be given, or at least also given, as percentages (%) so the reader can have a better understanding of the proposed method's accuracy. The text should also explain to the reader the meaning of the acronyms (FN, FP, IDS, MOTA and MOTP).

In the revised article, we have modified Table 1 with more explanation and explained the evaluation indices in lines 287-296.
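
The reviewer's request to report values as percentages can be illustrated with a minimal sketch. It assumes the commonly used CLEAR MOT definition, MOTA = 1 - (FN + FP + IDS) / GT, where GT is the total number of ground-truth objects over all frames; the paper's own definition in lines 287-296 is not reproduced here and may differ. The function name `mota` and the counts below are hypothetical and are not results from the paper.

```python
def mota(fn: int, fp: int, ids: int, gt: int) -> float:
    """Multiple object tracking accuracy under the assumed CLEAR MOT definition.

    fn  -- false negatives (missed targets), summed over all frames
    fp  -- false positives, summed over all frames
    ids -- identity switches, summed over all frames
    gt  -- ground-truth objects, summed over all frames
    """
    if gt <= 0:
        raise ValueError("gt must be a positive count of ground-truth objects")
    return 1.0 - (fn + fp + ids) / gt


# Hypothetical counts, chosen only to show the conversion to a percentage.
example = mota(fn=200, fp=100, ids=20, gt=1000)
print(f"MOTA = {example:.1%}")
```

Under this definition MOTA can be negative when the total number of errors exceeds the number of ground-truth objects, which is one reason raw counts and percentages are often reported side by side.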

The discussion of the results should be more detailed with a deeper analysis. Also, the conclusions should be more detailed.

In the revised paper, we rewrote the conclusions with deeper analysis according to the constructive comments of the reviewer.

Reviewer 3 Report

The paper is interesting and presents an improvement of the pedestrian multi-target tracking algorithm DeepSORT. For this, ResNet50 is used as the feature extraction network, combined with a Feature Pyramid Network (FPN). However, there are issues that must be solved.

The authors should improve the abstract by providing motivation for their work.
It would be good to discuss the background and proposed work in separate sections.
There are many terms that need to be expanded in order to fully understand the concepts such as JDE, ReLU, MOTA, MOTP, FP, etc.
Figure 1 should have tags so that the readers can understand the FPN structure easily.
Limitations of the approach are not discussed. It would be good to discuss future research directions and the limitations of the work.
Also, some editorial issues must be solved before acceptance, such as

“prediction. the phenomenon”
“figure 6(b).”
“process . Appearance”

There are some related works, which are not explained. The author should include the missing papers and compare them to the proposed work such as Shao et al. [1] integrated FPN with ResNet to extract and fuse features of images.

Shao, X., Wang, Q., Yang, W., Chen, Y., Xie, Y., Shen, Y., & Wang, Z. (2021). Multi-Scale Feature Pyramid Network: A Heavily Occluded Pedestrian Detection Network Based on ResNet. Sensors (Basel, Switzerland), 21(5), 1820. https://doi.org/10.3390/s21051820
Pereira, R.; Carvalho, G.; Garrote, L.; Nunes, U.J. Sort and Deep-SORT Based Multi-Object Tracking for Mobile Robotics: Evaluation with New Data Association Metrics. Appl. Sci. 2022, 12, 1319. https://doi.org/10.3390/app12031319

Author Response

Response to the Reviewer(s)' Comments:

To Whom It May Concern:

We are very thankful to the Editor-in-Chief, Associate Editor, and reviewers who have made critical comments and constructive suggestions concerning the work reported in the manuscript.

Sincerely,

Haiying Liu

Response to Reviewer 3’s Comments:

The authors should improve the abstract by providing motivation for their work.

In the revised paper, we modified the abstract and provided the motivation for our work.

It would be good to discuss the background and proposed work in separate sections.

In the new version, we have separated the background and the proposed work.

The background analysis of the article has been added accordingly.

We expanded and rewrote the introduction, adding the two recommended articles.

There are many terms that need to be expanded in order to fully understand the concepts such as JDE, ReLU, MOTA, MOTP, FP, etc.

JDE is introduced in line 39 of the revised article, the full form of ReLU is given in line 234, and FN, FP, IDS, MOTA and MOTP are introduced in lines 287-296.

Figure 1 should have tags so that the readers can understand the FPN structure easily.

In the new version, we have modified Figure 1 and added corresponding labels for better understanding.

Limitations of the approach are not discussed. It would be good to discuss future research directions and the limitations of the work.

In the modified article, we introduced the improvement of the proposed algorithm in lines 106-116, and also discussed the future research directions and limitations of the algorithm in the final conclusion.

Also, some editorial issues must be solved before acceptance, such as

“prediction. the phenomenon”
“figure 6(b).”
“process . Appearance”

In the revised article, we checked the whole paper and revised these kinds of editorial issues.

There are some related works, which are not explained. The author should include the missing papers and compare them to the proposed work such as Shao et al. [1] integrated FPN with ResNet to extract and fuse features of images.

Shao, X., Wang, Q., Yang, W., Chen, Y., Xie, Y., Shen, Y., & Wang, Z. (2021). Multi-Scale Feature Pyramid Network: A Heavily Occluded Pedestrian Detection Network Based on ResNet. Sensors (Basel, Switzerland), 21(5), 1820. https://doi.org/10.3390/s21051820
Pereira, R.; Carvalho, G.; Garrote, L.; Nunes, U.J. Sort and Deep-SORT Based Multi-Object Tracking for Mobile Robotics: Evaluation with New Data Association Metrics. Appl. Sci. 2022, 12, 1319. https://doi.org/10.3390/app12031319

Thank you sincerely for reviewing our paper. We have read the recommended papers in detail and learned from them. We have added these two articles as references and analyzed them in detail in the introduction.

Round 2

Reviewer 2 Report

The document is now ready for publication.

Author Response

All of the authors sincerely appreciate your professional suggestions and constructive advice on this paper.

Have a good day.

Haiying Liu
