Article
Peer-Review Record

Research and Optimization of a Lightweight Refined Mask-Wearing Detection Algorithm Based on an Attention Mechanism

Electronics 2023, 12(8), 1911; https://doi.org/10.3390/electronics12081911
by Xiangbo Shi 1, Yala Tong 1,*, Fei Mei 1 and Zhongjian Wu 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 11 March 2023 / Revised: 14 April 2023 / Accepted: 17 April 2023 / Published: 18 April 2023
(This article belongs to the Section Artificial Intelligence)

Round 1

Reviewer 1 Report

This is a great all-around paper describing the state-of-the-art face detection algorithm.

The topic addressed in the research is both original and relevant to the field. The research proposes a refined method for mask-wearing detection using a lightweight network based on attention mechanisms, which addresses specific gaps in the field: incomplete classification, missed detection of small targets, and insufficient feature extraction capabilities. The approach aims to improve the detection speed and accuracy of the YOLOv4-tiny network, which is a popular deep learning model for object detection.

The research builds on the existing literature on mask-wearing detection using deep learning techniques, but proposes a novel approach that includes several key innovations, such as adding a category of "incorrect_mask" to the dataset, introducing the CBAM attention module, and using the Focal Loss function and improved mosaic data enhancement strategy. These innovations aim to remove the limitations of the existing methods and improve the performance of the lightweight mask-wearing detection network.
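For readers unfamiliar with the Focal Loss function mentioned above, the following is a minimal sketch of its standard binary form, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The function name `focal_loss` and the defaults `alpha=0.25` and `gamma=2.0` are assumptions taken from the general literature on Focal Loss, not from the manuscript under review, which may use a different variant or different settings.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss for a single prediction (illustrative sketch).

    p: predicted probability of the positive class, in [0, 1]
    y: ground-truth label, 0 or 1
    alpha, gamma: assumed defaults, not values from the reviewed paper
    """
    # p_t is the probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    # alpha_t weights positive and negative examples differently.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0)
    # (1 - p_t)^gamma down-weights examples that are already easy.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is one way to see what the extra term contributes for hard examples such as small targets.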

Overall, the research provides valuable insights into developing more effective and efficient deep learning models for mask-wearing detection, which is particularly relevant in the current global context of the COVID-19 pandemic.

The conclusions presented in the study are consistent with the evidence and arguments presented and effectively address the main question posed. The authors provide a comprehensive analysis of the experimental results, compare the proposed approach to existing methods, and demonstrate the approach's advantages in terms of accuracy, speed, and efficiency.

Furthermore, the references cited in the research seem appropriate and relevant to the topic. The authors draw on a range of sources related to deep learning object detection techniques, as well as previous research on mask-wearing detection using machine learning and computer vision techniques. The references are used to support the arguments and evidence presented in the research, and the authors provide appropriate citations throughout the paper. Overall, the research demonstrates a strong mastery of the relevant literature and effectively integrates this knowledge into the development of the proposed approach to mask wear detection.

One mistake:

Line 189: the figure reference is unresolved ("Fig [?]").

 

Comments for author File: Comments.pdf

Author Response

Dear reviewer,

 

Thank you very much for your approval of our paper and for your careful reading. Based on your suggestion, we have corrected "Fig" to "Figure 8".

 

Yours sincerely,

Xiangbo Shi

11 April 2023

Reviewer 2 Report

The paper appears to be poorly written and lacks clarity. The paper claims to address the issues of incomplete classification, miss detection of small targets, and insufficient feature extraction capabilities of lightweight networks in detecting complex faces. However, the paper fails to provide a clear problem statement or research objective.

The paper proposes a lightweight refined method for mask wear detection based on attention mechanism. However, the paper does not explain the attention mechanism in detail or provide a clear description of the proposed methodology. Moreover, the paper lacks proper experimental design and testing methodology. The paper only provides the results of a three-object classification experiment, which is not sufficient to evaluate the proposed methodology's effectiveness. The paper claims that the algorithm achieved a 6.97% improvement in accuracy on face targets with a high mean average precision (mAP) of 93.05%, but the paper does not provide any details on the performance of the algorithm on real-world scenarios. Additionally, the paper lacks proper literature review and comparison with existing methods. The authors claim that the proposed methodology is superior to the YOLOv4-tiny network, but the paper does not provide a clear comparison with existing methods, nor does it provide any analysis of the limitations of the proposed methodology. The authors are encouraged to check the following relevant papers: doi:10.1016/j.patcog.2021.108398, doi:10.1109/ACCESS.2022.3182055, doi:10.3390/s22030896, doi:10.1007/s10044-021-00975-z.

In conclusion, the paper is poorly written, lacks clarity, and lacks proper experimental design and methodology. The paper fails to provide a clear problem statement, research objective, and methodology. Moreover, the paper lacks proper analysis of the proposed methodology's limitations and effectiveness. Therefore, the paper needs a major revision before it could be considered for publication.

Author Response

Dear reviewer,

 

Thank you very much for your comments and professional advice. These opinions help to improve the academic rigor of our article. We have revised the manuscript based on your suggestions and requests. In addition, the manuscript was reviewed and edited by the language services of MDPI. We hope that our work is improved as a result. The details are as follows.

 

(1) However, the paper fails to provide a clear problem statement or research objective.

 

Response: We greatly appreciate your feedback, which we believe will significantly enhance our manuscript. Based on your comments, we have restructured the introduction to distinguish between two key areas: "In the mask dataset" (line 33) and "In the mask-wearing detection algorithm" (line 40). Furthermore, we have highlighted the current issues (line 64) and stated our two objectives: for the dataset, building real-scene data of masks that are not worn correctly; for the algorithm, improving the detection and feature extraction capabilities of our lightweight network for small targets.

 

(2) However, the paper does not explain the attention mechanism in detail or provide a clear description of the proposed methodology.

 

Response: We greatly appreciate your feedback, which is relevant and valuable. Based on your comments, we have revised our manuscript by adding a detailed explanation in "3.2 Introducing the attention mechanism" (line 168) regarding the purpose of implementing the attention mechanism. This explanation includes addressing issues such as the lightweight network's weak feature extraction ability and the large amount of background noise generated by adding a layer of large-scale predictive features.  Our proposed solution is to implement the attention mechanism in the enhanced feature extraction network of YOLOv4-tiny to address these challenges.

 

(3) Moreover, the paper lacks proper experimental design and testing methodology.

 

Response: We believe that you have brought up a valid point where we fell short in our manuscript. Following your suggestion, we have included a detailed description of the experimental testing methods in "4.3 Experimental protocols" (line 229). This covers five key aspects: "P-R curve," "precision, recall, and AP values," "detection speed," "actual scene," and "ablation experiments," with individual explanations of why each aspect is important for assessing the effectiveness of the algorithm in the mask-wearing detection scenario.
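As an illustration of how the P-R curve and AP value relate, the sketch below computes precision and recall over detections ranked by confidence and then takes the area under the interpolated P-R curve. It assumes the all-point interpolation used in common object-detection benchmarks; the manuscript's exact protocol (IoU threshold, interpolation scheme) is not stated in this record, and the function name `average_precision` and its arguments are hypothetical.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the interpolated precision-recall curve for one class.

    scores: confidence of each detection
    is_true_positive: whether each detection matched a ground-truth box
    num_ground_truth: number of ground-truth boxes of this class
    """
    if num_ground_truth == 0 or not scores:
        return 0.0

    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    tp = fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))

    # Interpolate: precision at a recall level is the best precision
    # achieved at that recall or any higher recall.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Sum rectangle areas between consecutive recall points.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

The mAP figure quoted in the reviews would then be the mean of this value over the detection classes, under the same assumptions.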

 

(4) The paper does not provide any details on the performance of the algorithm on real-world scenarios.

 

Response: Your suggestion has been extremely helpful to us, and as per your recommendation, we have compared the performance of the YOLOv4-tiny algorithm and Ours in real-world scenarios in "4.4.4. Comparison of actual scene effects" (line 302), as shown in Figure 9. We have provided a detailed explanation of the performance of these two different algorithms in the three images presented. The first two images show instances where the YOLOv4-tiny algorithm missed small targets, while in the last image, it caused false detection. On the other hand, our proposed algorithm, Ours, achieved excellent detection results in all three scenarios.

 

(5) The paper lacks proper literature review and comparison with existing methods.

 

Response: We are glad you mentioned this point, which we overlooked in our manuscript. Based on your comments, we have reproduced the algorithm proposed in reference [18], trained and evaluated it on our dataset, and compared it with ours in terms of "P-R curve," "AP value," and "detection speed" (4.4.1, 4.4.2, 4.4.3). Ours outperformed the algorithm of reference [18] in both detection speed and accuracy.

 

(6) The authors claim that the proposed methodology is superior to the YOLOv4-tiny network, but the paper does not provide a clear comparison with existing methods, nor does it provide any analysis of the limitations of the proposed methodology.

 

Response: We are glad that you mentioned this point. Based on your comments, we have added the existing method from reference [18] and compared it with ours. Also, in "5. Conclusion" (line 340), we explain the improved results of our algorithm and add its limitations, which are twofold. First, the dataset still needs to be larger and better balanced. Second, our lightweight mask detection model has not yet been applied to mobile edge devices or real public places (line 358).

 

Thank you very much for your attention and time. Look forward to hearing from you.

 

Yours sincerely,

Xiangbo Shi

11 April 2023

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

The paper was improved somewhat, but several issues remain.

The comparison with other studies is still weak with only a few works used in comparison. The authors should discuss MASK-Yolo, INSTA-Yolo, SEG-Yolo, YoloV7, among others.

Explain why your method performed worse (Tables 2, 3) than other models. What can be done to improve the result?

Author Response

Dear reviewer,

 

We sincerely thank you for your careful reading and for your comments and professional advice. These opinions help to improve the academic rigor of our article. We have revised the manuscript based on your suggestions and requests. We hope that our work is improved as a result. The details are as follows.

 

(1) The comparison with other studies is still weak with only a few works used in comparison. The authors should discuss MASK-Yolo, INSTA-Yolo, SEG-Yolo, YoloV7, among others.

 

Response: Thank you very much for your feedback. We believe that your suggestions can significantly enhance our manuscript. Following your comments, we have added comparisons with other studies, including MASK-Yolo, YoloV7, and Yolox ([21], [22], [24]). We have also reviewed the relevant research results in the field of mask detection algorithms, summarized their achievements, and pointed out some shortcomings. We have read the relevant work on INSTA-Yolo and SEG-Yolo, but unfortunately we have not found applications of them in the field of mask detection. We hope you can give us further feedback to improve our paper. Thank you again.

 

(2) Explain why your method performed worse (Tables 2, 3) than other models. What can be done to improve the result?

 

Response: We appreciate your valuable feedback, and we have made modifications to the manuscript based on your suggestions. As you pointed out, there is still a gap between our algorithm and the YOLOv4 algorithm, particularly in detecting face targets. This is mainly because YOLOv4 is a more complex network with more layers, which gives it better detection performance on complex data but a less obvious improvement on simple data (line 288). However, such a complex network structure sacrifices detection speed (line 314). We have also identified areas for improvement in our future work, including expanding the dataset and improving the network structure to further enhance detection accuracy (line 379).

 

 

Thank you very much for your attention and time. Look forward to hearing from you.

 

Yours sincerely,

Xiangbo Shi

14 April 2023

Author Response File: Author Response.pdf
