Peer-Review Record

Improvement in Error Recognition of Real-Time Football Images by an Object-Augmented AI Model for Similar Objects

Electronics 2022, 11(23), 3876; https://doi.org/10.3390/electronics11233876
by Junsu Han 1, Kiho Kang 1 and Jongwon Kim 2,*
Reviewer 2: Anonymous
Submission received: 23 September 2022 / Revised: 21 November 2022 / Accepted: 22 November 2022 / Published: 23 November 2022

Round 1

Reviewer 1 Report (New Reviewer)

The paper is interesting. However, there are three main issues:

1) The language must be improved (undefined acronyms, grammar misuse, repeated sentences, concordance errors and so on).

2) Page 7, lines 195-199. Is the described procedure required for each individual game? That is, does the color range need to be specified each time?

3) You state that you have four decision classes (A, B, C, and D) corresponding to players of both teams, referees, and overlapped objects. That is, in order to assess performance, you need to use a confusion matrix (which is the standard name for "classification table layout") for 4 classes, which is a 4x4 matrix, instead of the 2x2 matrix you use. And the formulations of the measures (sensitivity, specificity, and so on) change for a non-binary matrix. That said, I'm very sorry to tell you that all of your performance assessments must be redone, as well as the results, discussion, and conclusion sections. Please use adequate measures and correct the results. 

Author Response

Dear Reviewer

We wish to submit our revised research paper for publication in Electronics under the changed title "Improvement in Error Recognition of Real-Time Football Images by an Object-Augmented AI Model for Similar Objects". We provide point-by-point responses to the reviewer's comments on the previous version, and we have highlighted all changes in the manuscript so they are easy to see.

Please see the attached manuscript (electronics-1927193).

 

Point 1: The language must be improved (undefined acronyms, grammar misuse, repeated sentences, concordance errors and so on).

Response 1: Thank you for your valuable suggestions. If the manuscript (electronics-1927193) is accepted, we will complete the correction of the English sentences using MDPI's English editing service.

Point 2: Page 7, lines 195-199. Is the described procedure required for each individual game? That is, does the color range need to be specified each time?

Response 2: Thank you for your valuable suggestions. The HSV color range needs to be specified only once for a given soccer game. We described this on Page 7, lines 203-205.
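For illustration, a minimal sketch of fixing an HSV range once and reusing it for every frame, in the style typically used with OpenCV (the bounds below are hypothetical placeholders, not the values from the manuscript):

```python
import cv2
import numpy as np

# Hypothetical HSV bounds for one team's jersey color; in the manuscript
# the actual range is specified once per match, not per frame.
LOWER_HSV = np.array([100, 120, 70])
UPPER_HSV = np.array([130, 255, 255])

def jersey_mask(frame_bgr):
    """Return a binary mask of pixels inside the fixed HSV range."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    return cv2.inRange(hsv, LOWER_HSV, UPPER_HSV)
```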

Point 3: You state that you have four decision classes (A, B, C, and D) corresponding to players of both teams, referees, and overlapped objects. That is, in order to assess performance, you need to use a confusion matrix (which is the standard name for "classification table layout") for 4 classes, which is a 4x4 matrix, instead of the 2x2 matrix you use. And the formulations of the measures (sensitivity, specificity, and so on) change for a non-binary matrix. That said, I'm very sorry to tell you that all of your performance assessments must be redone, as well as the results, discussion, and conclusion sections. Please use adequate measures and correct the results.

Response 3: Thank you for your valuable suggestions. We used a multi-class confusion matrix to evaluate the performance of the AI models and modified the results accordingly.

We described the contents on Pages 10-13, lines 288-347.

  3.1. Results of object recognition

  3.2. Performance of the AI models
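As a minimal sketch of the multi-class evaluation described above, one common route is scikit-learn's confusion matrix and per-class report; the labels below are hypothetical placeholders, not the study's data:

```python
from sklearn.metrics import confusion_matrix, classification_report

CLASSES = ["A", "B", "C", "D"]  # players of both teams, referees, overlapped objects

# Hypothetical ground-truth and predicted labels for illustration only.
y_true = ["A", "A", "B", "C", "D", "B", "C", "D"]
y_pred = ["A", "B", "B", "C", "D", "B", "A", "D"]

# 4x4 confusion matrix, replacing the original 2x2 binary layout.
print(confusion_matrix(y_true, y_pred, labels=CLASSES))

# Per-class precision/recall/F1 plus macro and weighted averages.
print(classification_report(y_true, y_pred, labels=CLASSES))
```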

 

Thank you for your consideration. 
We look forward to hearing from you.

Best regards,
Author

Author Response File: Author Response.pdf

Reviewer 2 Report (New Reviewer)

In this manuscript, the authors attempt to address the overlapping that occurs among players in real-time soccer images. Overlapping is an important issue in object detection and has received considerable attention. This study focuses on a specific problem: the accurate detection of players in a crowd. It is an interesting work, especially for soccer fans. However, the description and structure of this manuscript are poor. The main contribution of this work is unclear, and the novelty is limited.

Comments:

The title of this manuscript needs to be changed. It is unclear and unlikely to attract the attention of readers.

The motivation of this work is unclear. The scientific problems of this study should be emphasized.

This study analysed real-time images. The time cost of the models should be reported.

The description of results should be reformulated. “3.1 Results of False Positive error reduction” or “3.2. Results of False Negative error reduction” is unacceptable as a subsection title.

The weakness of the proposed methods should be given in the discussion.

Author Response

Dear Reviewer

We wish to submit our revised research paper for publication in Electronics under the changed title "Improvement in Error Recognition of Real-Time Football Images by an Object-Augmented AI Model for Similar Objects". We provide point-by-point responses to the reviewer's comments on the previous version, and we have highlighted all changes in the manuscript so they are easy to see.

Please see the attached manuscript (electronics-1927193).

Point 1: The title of this manuscript needs to be changed. It is unclear and difficult to attract the attention of readers.

Response 1: Thank you for your valuable comments. We changed the title of the paper to "Improvement in Error Recognition of Real-Time Football Images by an Object-Augmented AI Model for Similar Objects". We also sharpened its focus on the improvement of object recognition errors so that the title better attracts readers' attention.

Point 2: The motivation of this work is unclear. The scientific problems of this study should be emphasized.

Response 2: Thank you for your valuable comments. We stated the motivation more clearly and emphasized the scientific problems of this study more than before.

We described the contents on Pages 3-4, lines 101-119.

Point 3: This study analysed real-time images. The time cost of the models should be reported.

Response 3: Thank you for your valuable comments. We supplemented the contents and added Figure 13 on Page 9, lines 241-244.
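As a minimal sketch of how such a per-frame time cost can be measured (model_infer and frames are hypothetical placeholders, not the manuscript's code):

```python
import time

def time_per_frame(model_infer, frames):
    """Average wall-clock inference time per frame, in seconds."""
    start = time.perf_counter()
    for frame in frames:
        model_infer(frame)
    return (time.perf_counter() - start) / len(frames)
```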

Point 4: The description of results should be reformulated. “3.1 Results of False Positive error reduction” or “3.2. Results of False Negative error reduction” is unacceptable as a subsection title.

Response 4: Thank you for your valuable comments. We reformulated the evaluation results in section 3.

We described the contents on Pages 10-13, lines 292-347.

  3.1. Results of object recognition

  3.2. Performance of the AI models

Point 5: The weakness of the proposed methods should be given in the discussion.

Response 5: Thank you for your valuable comments. We described the contents on Pages 12-13, lines 330-345.

  3.2. Performance of the AI models

 

Thank you for your consideration. 
We look forward to hearing from you.

Best regards,
Author

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report (New Reviewer)

I note that you tried to use the k-class confusion matrix and the corresponding performance assessment. However, the formulations of the measures (sensitivity, specificity, and so on) are still incorrect. That said, I'm very sorry to tell you that all of your performance assessments must be redone again, as well as the results, discussion, and conclusion sections. I suggest using the formulations in the paper "Sokolova, M., & Lapalme, G. (2009). A systematic analysis of performance measures for classification tasks. Information Processing & Management, 45(4), 427-437." It's a little bit old, but it has pretty clear formulations for k classes; you can find them in Table 3. As your data is imbalanced, I suggest using macro-averaging to avoid bias.
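For reference, the macro-averaged precision and recall from Table 3 of Sokolova & Lapalme (2009) take the standard form below, with TP_i, FP_i, and FN_i the per-class counts over k classes; the cited paper gives the full set of measures:

```latex
\mathrm{Precision}_{M} = \frac{1}{k}\sum_{i=1}^{k}\frac{TP_i}{TP_i + FP_i},
\qquad
\mathrm{Recall}_{M} = \frac{1}{k}\sum_{i=1}^{k}\frac{TP_i}{TP_i + FN_i}
```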

Author Response

Dear Reviewer 1

We wish to submit our revised research paper for publication in Electronics, titled "Improvement in Error Recognition of Real-Time Football Images by an Object-Augmented AI Model for Similar Objects". We provide point-by-point responses (Round 2) to the reviewer's comments on the previous version, and we have highlighted all changes in the manuscript so they are easy to see.

Reviewer's comments:

Point 1: English language and style are fine/minor spell check is required.

Response 1: Thank you for your valuable suggestions. If the manuscript (electronics-1927193) is accepted, we will complete the language editing using MDPI's English editing service.

 

Point 2: I note that you tried to use the k-class confusion matrix and the corresponding performance assessment. However, the formulations of the measures (sensitivity, specificity, and so on) are still incorrect. That said, I'm very sorry to tell you that all of your performance assessments must be redone again, as well as the results, discussion, and conclusion sections. I suggest using the formulations in the paper "Sokolova, M., & Lapalme, G. (2009). A systematic analysis of performance measures for classification tasks. Information Processing & Management, 45(4), 427-437." It's a little bit old, but it has pretty clear formulations for k classes; you can find them in Table 3. As your data is imbalanced, I suggest using macro-averaging to avoid bias.

Response 2: Thank you for your valuable suggestions and advice.

We believe that the reviewer's proposal concerns a generalized measurement methodology for evaluating recognition performance according to the class-classification method and the class types that constitute a recognition model.

The focus of this research is to evaluate how differently recognition errors occur, under the same conditions, for three AI recognition models that were each trained on object classes of similar shape under the defined classification scheme. We do not evaluate the generalized recognition accuracy of the classes themselves.

Since the method advised by the reviewer evaluates the performance of the recognition models we want to evaluate, it is certainly a valuable suggestion to evaluate performance using the generalized per-class definitions. However, we were not sure that the errors arising from generalizing the classification process for each class would apply equally to the three different recognition models. Therefore, the macro-average formulations, which average the errors over all recognized object images, are not suitable for comparing the number of recognition errors across the three AI recognition models. Furthermore, we do not think these formulations are reasonable in an environment where the objects to be recognized change frequently, such as a real-time football game. In this study, we have not included classification methods according to recognition algorithm type, because the three AI models, Yolo3-416, Yolo3-HSV, and Yolo3-Augment, share the same recognition algorithm but have different procedures and structures for object recognition, so the characteristics of the errors that occur are what matter.

Since the manuscript may have caused a misunderstanding regarding model performance evaluation according to the classification method, we explained on Page 9, lines 252-261, that there are various performance measurement methods depending on the classification method, and we added reference [23].
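To make the distinction concrete, the comparison in this study is closer to tallying raw recognition errors per model under identical conditions than to averaging per-class rates; a minimal sketch (the model names match the manuscript, but the counts are hypothetical placeholders):

```python
# Hypothetical per-model error tallies for illustration only; the actual
# counts are reported in Section 3 of the manuscript.
errors = {
    "Yolo3-416":     {"false_positive": 42, "false_negative": 31},
    "Yolo3-HSV":     {"false_positive": 28, "false_negative": 24},
    "Yolo3-Augment": {"false_positive": 15, "false_negative": 12},
}

# Compare total recognition errors per model directly, rather than
# macro-averaging per-class rates as in Sokolova & Lapalme (2009).
for model, counts in errors.items():
    total = counts["false_positive"] + counts["false_negative"]
    print(f"{model}: {total} total recognition errors")
```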

Thank you for your consideration.

We look forward to hearing from you.

 

Best regards,

Author

Author Response File: Author Response.pdf

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

The overall quality of the paper has been improved in some respects, but I am really sorry to say that it is still not sufficient for publication in a journal. The reasons are as follows:

  1. The paper still contains many typos. Just some examples:

 

l.12 we analyze and evaluation -> evaluate

l.15 propose AI method -> propose an AI method

l.25 analogous objects recognition -> object recognition

l.18 is secured to -> ensures

….

 

  2. The structure of the paper has been improved, but it still gives no easy access to the central contribution, and no related work is discussed. The main contribution is given in Section 2.2.2. If I understand correctly, it simply increases the number of detectable classes. This does not become clear when reading the abstract. The problem of dealing with occlusion and grouping is not new in the detection and tracking literature, and there are different ways to address it; e.g., there already exists work that annotates partial-occlusion cases in pedestrian detection and uses them to analyze detection performance. In summary, there has to be a related-work section that highlights the novelty of the proposed approach. In the same sense, it may also be worth discussing approaches related to out-of-distribution detection.
  3. In Section 2.2.2, the usage of the term augmentation is misleading, since it is used to describe increasing the class number.
  4. The proposed approach is extremely simplistic and not well evaluated. The central idea is simply to use an additional class that summarizes failure cases. Actually, this is an interesting approach worth following. However, it appears that the error improvement is related to an unfair evaluation, since models trained for 3 classes are directly compared with a model with 4 classes. Therefore there is no real improvement, since your evaluation simply alters the FP/TP counts.
  5. Large parts of the paper still present material from other publications or textbook knowledge.

Overall the approach is interesting and worth following, but it needs a significantly stronger elaboration.

Reviewer 2 Report

The authors describe a method for the detection and classification of soccer players.

The paper is difficult to understand due to the very poor English. However, from what I understood, the authors compared three methods that solve different problems, i.e., detecting and classifying with different class sets. Two of the three compared methods classify the players from both teams and the referee. The third method additionally classifies pairs of overlapped players. Such a comparison makes no sense to me. The proposed method should be compared with other methods solving the same problem, using the same dataset and the same validation protocol. Moreover, after reading the paper it is not clear to me what the novelties of this work are. Therefore, my opinion is to reject the paper.

The detailed list of issues is as follows:

  1. The title of the paper is confusing. What is "the error improvement effect"? In this work only soccer videos are analyzed. Therefore, it should be "objects recognition in soccer videos" instead of "objects recognition in sports video".
  2. Section one is chaotic. The beginning of an introduction should be a motivation for the proposed method (why do we need automatic object recognition in sport video). Then the authors should provide a review of the related work (it may also be in a separate section). Finally, the proposed approach should be briefly characterized with a list of contributions, and advantages of the proposed method over the existing algorithms should be specified.
  3. What are "Taylor and Nitschke's data augmentation experiments"? Table 1 with results is provided but it is not described. What is the difference between the first and the second column? They have identical headers but different results.
  4. "and the maximum minimum value of the specified range is defined as a threshold value" – what is the maximum minimum value?
  5. What bitwise operation was used in Fig. 3c)?
  6. Fig. 14 is badly cropped (captions on the right side of the image are cut off).
  7. Why is the blue curve called "not person" instead of "True negative"?
  8. There is no information about the dataset used in the evaluation. How many videos of different soccer games were used? How many frames were selected? What is the size of training and testing data?
  9. Section 5 is titled "Conclusions"; however, no conclusions are drawn there, only information about future work.
  10. The English is very poor. There are few sentences in the entire paper that are grammatically correct. Many sentences are difficult to understand. In my opinion, the paper should undergo extensive English revisions. Some parts, e.g., Subsection 3.3, should be completely rewritten. Some grammatically incorrect sentences, mainly from the abstract and Section 1, are listed below:
  • "In this paper, we analyze and evaluation the recognition" - it should be "evaluate".
  • "And it extends the recognition range" - I suggest writing "It also extends"
  • "so improves errors occurring in player detection" - "So improves"? Maybe it should be "and improves".
  • "Further, we believe that the AI model in this paper" - it should be "presented in this paper".
  • "In general, recognizing an object from get the image sensor"
  • "for automatically recognizing objects" - I suggest writing "for automatic object recognition"
  • "In the object recognition process of AI algorithm, selects an arbitrary" – I think it should be "algorithm, it selects".
  • "with the pre-trained objects, after that determine target object."
  • "When the recognition process is completed, it is secured the values of the location (x, y)," – I think it should be "it secures the values".
  • "and labeled with 4 classes (Class A, B, C, D)" – it should be (A, B, C, D). There is no need to repeat the word "class".