Peer-Review Record

Real-Time Early Safety Warning for Personnel Intrusion Behavior on Construction Sites Using a CNN Model

Buildings 2023, 13(9), 2206; https://doi.org/10.3390/buildings13092206
by Jinyu Zhao 1,*, Yinghui Xu 1, Weina Zhu 1, Mei Liu 1 and Jing Zhao 2
Submission received: 24 July 2023 / Revised: 14 August 2023 / Accepted: 24 August 2023 / Published: 30 August 2023

Round 1

Reviewer 1 Report

The manuscript is concerned with the safety management in construction sites, and specifically contributes to the early warning of intrusion behavior using CNN.

As the authors state in Lines 155-157, few early warning studies use machine vision techniques; however, they did not address these studies in the literature review or in the introduction. It is highly recommended to review previous research on using machine vision for early warning of intrusion behavior.

Lines 29-32: “more accurate ability to identify the intrusion behavior of construction site personnel”: more accurate than what? Please clarify.

Table 1 and the corresponding paragraph in Lines 436 – 439: the authors may wish to formulate the results in terms of true positive, true negative, false positive and false negative. It would be more informative if the authors display the confusion matrix, calculate the accuracy as well as the precision and recall, and provide a discussion on these findings.

Line 462 and the calculations are not clear.

This manuscript proposes an interesting research study by providing an integrated methodology for safety early warning of intrusion behavior in construction hazard areas, which includes multiple successive procedures and models. One of these procedures is to identify the attributes of the personnel and check whether those attributes conform to the hazard area. The authors may wish to shed some light on this process.

In Line 263 the authors stated that the model will produce a regular reminder when personnel enter a hazard area that conforms to their own attribute information. However, the authors did not show any findings in this regard. Therefore, the authors may wish to present the results of the developed model when personnel enter a hazard area that conforms to their attributes.

The authors did not check whether the CNN model is underfitting or overfitting.

Was the data used in developing the proposed models divided into three sets: training, validation and testing? If so, why did the authors not present the results of the training, validation and testing phases?

 

The manuscript is well written; however, there exist some grammatical mistakes and typos, such as those in:

Line 52: “…then issues an alert in time when signs of accidents appear.”

Lines: 234-235, 237-239, 352-353, and 420-421.

Author Response

Response to Reviewer 1 Comments

 

Point 1: As the authors state in Lines 155-157, few early warning studies use machine vision techniques; however, they did not address these studies in the literature review or in the introduction. It is highly recommended to review previous research on using machine vision for early warning of intrusion behavior.

 

Response 1: As suggested, the manuscript now collates research on using machine vision for early warning of intrusion behavior in the "Literature Review" section. See lines 125-143 for details.

 

Point 2: Line 29-32: “more accurate ability to identify the intrusion behavior of construction site personnel” accurate than what? Please clarify.

 

Response 2: This part was intended to show that the method's recognition accuracy met expectations, and the manuscript has been revised accordingly. See lines 29-32 for details.

 

Point 3: Table 1 and the corresponding paragraph in Lines 436 – 439: the authors may wish to formulate the results in terms of true positive, true negative, false positive and false negative. It would be more informative if the authors display the confusion matrix, calculate the accuracy as well as the precision and recall, and provide a discussion on these findings.

 

Response 3: The manuscript has been reworked in response to suggestions, as detailed in section 4.1.
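
For readers following this exchange, the quantities the reviewer asks for can be derived directly from the four confusion-matrix counts. The following is a minimal sketch only, not the authors' evaluation code; the function name `confusion_metrics` and the counts in the usage note are hypothetical and are not taken from the manuscript.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. A zero denominator yields 0.0 instead of raising.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative counts only (hypothetical, not results from the paper).
metrics = confusion_metrics(tp=8, fp=2, fn=2, tn=8)
```

With these illustrative counts, precision and recall are both 8 / 10, so accuracy and F1 also come out at 0.8.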

 

Point 4: Line 462 and the calculations are not clear.

 

Response 4: There was a problem with the presentation in the manuscript, which has been revised. See lines 478-485 for details.

 

Point 5: This manuscript proposes an interesting research study by providing an integrated methodology for safety early warning of intrusion behavior in construction hazard areas, which includes multiple successive procedures and models. One of these procedures is to identify the attributes of the personnel and check whether those attributes conform to the hazard area. The authors may wish to shed some light on this process.

 

Response 5: The "Construction Site Personnel Management System", built on BIM-RFID technology, collects personnel attribute information on the construction site and detects it in real time. The manuscript has been modified as suggested; see section 3.3.1 for details.

 

Point 6: In Line 263 the authors stated that the model will produce a regular reminder when personnel enter a hazard area that conforms to their own attribute information. However, the authors did not show any findings in this regard. Therefore, the authors may wish to present the results of the developed model when personnel enter a hazard area that conforms to their attributes.

 

Response 6: "Regular reminder" in this article means an ordinary reminder, and the manuscript has been changed accordingly. Ordinary reminders and emergency reminders are different levels of early warning that the program issues based on the identified attribute information of the intruder. Both the determination of this information and the warnings are realized through the "Construction Site Personnel Management System" mentioned in the article. See section 3.3.1 for details.

 

Point 7: The authors did not check whether the CNN model is underfitting or overfitting.

 

Response 7: The manuscript has been revised and explained, as detailed in section 4.1.

 

Point 8: Was the data used in developing the proposed models divided into three sets: training, validation and testing? If so, why did the authors not present the results of the training, validation and testing phases?

 

Response 8: Since the classification model has already been identified in this paper, it is only necessary to divide the data into training set and test set, as described in section 4.1.
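
To make the two options in this exchange concrete, the sketch below shows a seeded shuffle-and-split that yields either the three sets the reviewer mentions or, with a validation fraction of zero, the two sets the authors describe. It is an illustration under stated assumptions: the function name `split_dataset` and the default fractions are hypothetical and do not come from the manuscript.

```python
import random


def split_dataset(items, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle items reproducibly and split into train, validation and test.

    The fractions are illustrative defaults, not values from the paper.
    Setting val_frac=0.0 gives a plain train/test split.
    """
    if train_frac < 0 or val_frac < 0 or train_frac + val_frac > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # local RNG keeps the split repeatable
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # everything left over
    return train, val, test


# Three-way split, then a two-way split of the same (hypothetical) image ids.
train, val, test = split_dataset(range(100))
train2, val2, test2 = split_dataset(range(100), train_frac=0.8, val_frac=0.0)
```

Every item lands in exactly one subset, and reusing the same seed reproduces the same partition.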

 

Point 9: Comments on the Quality of English Language

The manuscript is well written; however, there exist some grammatical mistakes and typos, such as those in:

Line 52: “…then issues an alert in time when signs of accidents appear.”

Lines: 234-235, 237-239, 352-353, and 420-421.

 

Response 9: The manuscript has been checked for errors and corrected; see lines 47, 234-240, 358-360, and 431-433 for details.


Reviewer 2 Report

The direction of research chosen by the authors is relevant and interesting, but the article leaves many questions.

1. I suggest that the Authors describe in more detail and give a clear definition of the concept of "Personnel Intrusion Behavior".

2. Lines 163. "Convolutional neural network (CNN) was proposed in 1962 by Hubel .." please give the reference.

3. Lines 324. What does it mean, "... developed with the fifth author's participation"?

4. Lines 329. "... based on the BIM platform ..." What is the relationship between BIM and proposed CNN model? What is BIM platform?

5. Equation (1) is not clear. Please explain.

 

The modeling process is poorly described.

6. What problem did the CNN model solve (classification, object identification, one of the segmentation types)?

7. Which type of CNN (U-net, RCNN, Yolo, etc.) was used in this study?

8. What was the architecture of the model?

9. Which activation function, loss function, optimization algorithm, quality measure were used?

 

The results are poorly described and do not convincingly show that the model actually works.

10. Figure 7 is not clear.

11. In my opinion, 50 images are not enough to teach the model to identify objects.

12. How was the data set divided into training, test and validation?

13. What were the performance measures of the model (loss and accuracy)?

There are several questions related to the real-time operation of the model declared by the authors.

14. The CNN model is trained to recognize/identify objects in images. As the authors previously stated, video cameras have been installed at the construction site for monitoring. How does a video image turn into a photo, and the photo, in turn, enters the model for object identification?

15. What is the data processing time of the model, and how long does it take the worker to get into the danger zone?

16. Authors need to describe the entire process in detail.

Author Response

Response to Reviewer 2 Comments

 

Point 1: I suggest that the Authors describe in more detail and give a clear definition of the concept of "Personnel Intrusion Behavior".

 

Response 1: The manuscript has been revised in accordance with the recommendations, as detailed in the “Introduction” section.

 

Point 2: Lines 163. "Convolutional neural network (CNN) was proposed in 1962 by Hubel .." please give the reference.

 

Response 2: Reference "Hubel, D.H., Wiesel, T.N. Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. J. Physiol. 1962, 160, 106-154. DOI: 10.1113/jphysiol.1962.sp006837" has been provided in the manuscript. See line 171 for details.

 

Point 3: Lines 324. What does it mean, "... developed with the fifth author's participation"?

 

Response 3: The fifth author of this paper, Jing Zhao, participated in the research and development of the "Construction Site Personnel Management System", which is mainly based on BIM-RFID technology, and is the platform for the collection of construction personnel information and the realization of the voice warning function in this paper.

 

Point 4: Lines 329. "... based on the BIM platform ..." What is the relationship between BIM and proposed CNN model? What is BIM platform?

 

Response 4: See section 3.3.1 for details.

The BIM platform is a sub-system of the "Construction Site Personnel Management System". The main work of this sub-system is to divide the construction site into dynamic safety zones based on the built BIM 3D model and the field layout model, and update the safety zones according to the construction progress based on the advantages of the BIM technology, so as to realize the whole process of personnel safety management. Meanwhile, the data center of the personnel management system is also established based on the BIM platform, and the real-time data information of each type of personnel at the construction site is intelligently classified and stored.

 

Point 5: Equation (1) is not clear. Please explain.

 

Response 5: Equation (1) is an expression for the person's coordinates in the person attribute information. See lines 336-347 for details.

 

Point 6: What problem did the CNN model solve (classification, object identification, one of the segmentation types)?

 

Response 6: The paper mainly uses the strengths of CNN in target detection, together with its self-learning ability. Following the principle of the algorithm, it carries out image recognition, feature-mapping analysis, and classification, and provides key data for the subsequent construction of the early warning model.

 

Point 7: Which type of CNN (U-net, RCNN, Yolo, etc.) was used in this study?

 

Response 7: This paper mainly uses CNN to extract image features and selects the "Image Pyramid" as the multi-scale representation of the image to achieve target detection. The main algorithm applied is YOLO.

 

Point 8: What was the architecture of the model?

 

Response 8: See figures 5 and 6 for details.

 

Point 9: Which activation function, loss function, optimization algorithm, quality measure were used?

 

Response 9: This paper uses the Tanh function as the activation function and cross-entropy as the loss function, and applies the confusion matrix to evaluate the model. See equation 5 and section 4.1 for details.
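
As a reminder of what these two functions compute, here is a minimal standard-library sketch. It is not the authors' TensorFlow code; the softmax output layer is an assumption added for illustration (the record does not say how class probabilities are produced), and the names `tanh_layer`, `softmax` and `cross_entropy` are hypothetical.

```python
import math


def tanh_layer(values):
    """Apply the Tanh activation element-wise; outputs lie in (-1, 1)."""
    return [math.tanh(v) for v in values]


def softmax(logits):
    """Turn raw scores into probabilities (assumed output layer)."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, true_index, eps=1e-12):
    """Cross-entropy loss for one sample: -log of the true class probability."""
    return -math.log(max(probs[true_index], eps))


# Two equal scores give a 50/50 prediction, so the loss is log(2).
hidden = tanh_layer([0.0, 0.0])
loss = cross_entropy(softmax(hidden), true_index=0)
```

The loss falls toward zero as the probability assigned to the true class approaches one, which is why it suits classification training.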

 

Point 10: Figure 7 is not clear.

Response 10: The explanation of Figure 7 has been modified; see lines 478-491 for details.

 

Point 11: In my opinion, 50 images are not enough to teach the model to identify objects.

 

Response 11: The results of training on large sample datasets have been presented in section 4.1 as suggested.

 

Point 12: How was the data set divided into training, test and validation?

 

Response 12: Typically, when a classification model has been identified, only the training and test sets are needed. See section 4.1 for details.

 

Point 13: What were the performance measures of the model (loss and accuracy)?

 

Response 13: The performance measures are precision, recall, and F1 score, derived from the confusion matrix, together with the loss. See section 4.1 for details.

 

Point 14: The CNN model is trained to recognize/identify objects in images. As the authors previously stated, video cameras have been installed at the construction site for monitoring. How does a video image turn into a photo, and the photo, in turn, enters the model for object identification?

 

Response 14: The video is processed in Python and cut into pictures at a fixed frame interval. The captured images are then transmitted to the image processing terminal through the "Construction Site Personnel Management System" mentioned in this paper. The captured images, which contain the daily activities of construction site personnel, are labeled using the Labelme software and then used for training with the TensorFlow learning framework.
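
The frame-cutting step described in this response can be sketched as follows. This is a hedged illustration, not the authors' script: the generator `sample_frames` and the `step` parameter are hypothetical names, and the record does not say which video library decodes the stream (OpenCV's `VideoCapture` would be one common choice, which is an assumption here). Any iterable of frames can be passed in.

```python
def sample_frames(frames, step):
    """Yield (index, frame) for every step-th frame of a decoded video stream.

    frames: any iterable of decoded frames.
    step:   keep one frame out of every `step` (hypothetical parameter).
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    for index, frame in enumerate(frames):
        if index % step == 0:
            yield index, frame


# Stand-in for a decoded stream: 10 placeholder frames, one kept in every 4.
kept = list(sample_frames(["frame%d" % i for i in range(10)], step=4))
```

In this example the kept frames are those at indices 0, 4 and 8; each kept frame would then be saved as an image and passed on for labeling.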

 

Point 15: What is the data processing time of the model, and how long does it take the worker to get into the danger zone?

 

Response 15: See the processing results in section 4.1 for details.

 

Point 16: Authors need to describe the entire process in detail.

 

Response 16: Changes have been made in response to suggestions, as detailed in the newly submitted manuscript.


Reviewer 3 Report

This paper proposed a warning for personnel intrusion behavior on construction sites using the CNN model, which is a good idea. The study showed that the method proposed in this paper is feasible, but whether it can be adapted to real-world environments remains to be refined. In addition, the authors need to refine the definition of hazard areas, which is critical in this paper. Some specific comments are as follows:

1. The author must explain why this paper focuses on early warning for personnel intrusion behavior.

2. The critical research issue should be listed in the Introduction.

3. The Literature Review needs to be streamlined, and some references that are not very relevant to the topic of this paper can be ignored.

4. Please explain the basis for the division of hazard area. The actual construction sites are more complicated.

5. Authors need to refine the definition of hazard areas, which is critical to the warning proposed in this paper. Defining hazard areas is very complex, and an early warning approach that relies only on hazard areas is difficult to adapt to actual conditions.

The quality of English is OK.

Author Response

Response to Reviewer 3 Comments

 

Point 1: The author must explain why this paper focuses on early warning for personnel intrusion behavior.

 

Response 1: Clarification has been provided in the introduction section as suggested; see lines 37-90 for details.

 

Point 2: The critical research issue should be listed in the Introduction.

 

Response 2: The introduction section has been reworked in line with the recommendations.

 

Point 3: The Literature Review needs to be streamlined, and some references that are not very relevant to the topic of this paper can be ignored.

 

Response 3: The Literature Review has been reorganized in line with the recommendations.

 

Point 4: Please explain the basis for the division of hazard area. The actual construction sites are more complicated.

 

Response 4: See section 3.2.1 for details.

 

Point 5: Authors need to refine the definition of hazard areas, which is critical to the warning proposed in this paper. Defining hazard areas is very complex, and an early warning approach that relies only on hazard areas is difficult to adapt to actual conditions.

 

Response 5: See section 3.2.1 for details. This article offers an initial idea that deeper follow-up studies in this direction can build on.


Round 2

Reviewer 1 Report

The comments have been addressed; however, the technical writing of the manuscript still requires improvement.

The technical writing of the manuscript still needs to be improved. For instance, Lines 448 - 455 have run-on sentences and grammatical mistakes.

Reviewer 2 Report

I recommend accepting the article in its present form.
