1. Introduction
About 33% of cases involving days away from work and physical disabilities in the construction industry are due to work-related musculoskeletal disorders (WMSDs) [1]. WMSDs cause immense losses to injured workers, their employers, and society, as workers’ compensation is partially shared by society [2]. Postures that are awkward, prolonged, or repetitive are generally considered a major contributor to increased MSD risk [3]. Residential roofers typically perform repetitive tasks for extended periods on sloped surfaces ranging from 10° to 26° (sometimes as steep as 45°). As a result, this population has the second-highest incidence rate of WMSDs among all construction trades [4].
Roofers often experience a considerable amount of knee joint rotation because of prolonged and repetitive awkward kneeling during shingle installation on sloped rooftops. Awkward knee joint rotation includes knee flexion, abduction–adduction, and internal–external rotation. Deep flexed kneeling postures generate significant net quadriceps moments at the knee and increase the stress on the patellar tendon [5]. Increased knee adduction and abduction create additional stress and force on the inner part of the knee, specifically the medial compartment, leading to an elevated risk of osteoarthritis development [6]. Similarly, when the tibia rotates internally or externally in relation to the femur, it exerts stress on the ligaments of the knee joint, particularly the posterior cruciate ligament (PCL) and medial collateral ligament (MCL) [7]. On sloped rooftops, residential roofers’ knees encounter higher abduction and internal rotation during shingle installation [8]. Previous research conducted by the authors revealed that roofers face an elevated risk of knee musculoskeletal disorders (MSDs) during various phases of shingle installation on sloped surfaces [9]. The findings indicated that certain phases carry a higher risk of awkward knee joint rotation than others. Specifically, the placement and nailing of shingles were identified as the two phases with the greatest amount of awkward knee joint rotation. Consequently, these phases are considered the riskiest in shingle installation and have the potential to contribute to the development of knee MSDs [9]. Without training and/or wearable protective devices, residential roofers frequently experience persistent harm to their knees due to incorrect operations during these phases. To avoid potential injuries or disorders, it is crucial to ensure that roofers adopt the correct postures across the different phases of shingle installation.
The current practice primarily relies on ergonomists to observe and assess undesirable postures at roofing jobsites. Such a procedure is manual and tedious, and the results can be subjective and error-prone. It is therefore desirable to create techniques that automate this observation and assessment process through advanced sensing and data analytics [10]. Jobsite safety performance can then be improved by warning roofers of potential hazards in a timely manner [11]. It is envisioned that a roofer will be monitored through a sensor-based surveillance system while installing shingles on a slanted rooftop, and that the sensors will collect the data needed to inform the system in real time about the roofer’s posture and position and the duration for which each posture has been held. To enable such a system, it is essential to identify the different phases of shingle installation automatically so that the identification result can be used to move the surveillance system towards automation.
3. Problem Statement and Research Objective
Roofing postures can be recorded and quantified to assess the possibility of developing knee MSDs. However, the assessment procedure may face technical challenges owing to differences in how individuals perform shingle installation tasks. To alleviate this, it is attractive to leverage postural data to determine the working phase in which an individual is currently engaged, given the evidence that different phases of shingle installation are associated with different risk levels. Risk assessment can then be performed by observing the amount of time a roofer spends in a particular phase. With the rapid advancement of sensors and monitoring technologies, it is envisioned that collecting roofers’ postural data automatically and inexpensively will become viable. This will make it possible to obtain knee joint rotation angles from roofers’ postural movements. However, the existing literature does not provide evidence that knee joint rotation angles have been used successfully to identify the specific activities associated with the phases of shingle installation. Whether machine learning methods can perform such identification is also unknown.
In this research, the authors examine the feasibility of using machine learning to automatically identify the various phases involved in the residential roofing task of shingle installation by harnessing the combination of knee joint rotations and roof setting information. The authors previously presented preliminary findings at a conference to demonstrate the potential of this subject matter [38]. However, the prior work did not examine the impacts of different features on learning or provide in-depth analyses of the learning performance. This paper presents the comprehensive design, implementation, analysis, results, and discussion of the present research.
5. Results
The mean cross-validation accuracies obtained from the three approaches for each number of input features in a subset are depicted in Figure 5. Note that, for each subset size listed in Table 2, only the result of the feature subset that provided the highest mean cross-validation accuracy is presented. For all three classifier types, the highest mean cross-validation accuracies were obtained when all four features were used; therefore, the classifiers trained on all four features were selected for further analysis.
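The subset search described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the feature label SLOPE, the synthetic data, and all function names are hypothetical, and a leave-one-out 1-nearest-neighbour score stands in for the mean cross-validation accuracy of the KNN, DT, and RF classifiers used in the study.

```python
import random
from itertools import combinations

# FL, AB_AD and IN_EX follow the paper's abbreviations; "SLOPE" is an assumed label.
FEATURES = ["FL", "AB_AD", "IN_EX", "SLOPE"]


def loo_accuracy_1nn(rows, labels, cols):
    """Leave-one-out accuracy of a 1-nearest-neighbour rule using only columns `cols`."""
    correct = 0
    for i, (xi, yi) in enumerate(zip(rows, labels)):
        nearest_label, nearest_dist = None, float("inf")
        for j, (xj, yj) in enumerate(zip(rows, labels)):
            if i == j:
                continue  # hold out the sample being scored
            dist = sum((xi[c] - xj[c]) ** 2 for c in cols)
            if dist < nearest_dist:
                nearest_label, nearest_dist = yj, dist
        correct += nearest_label == yi
    return correct / len(rows)


def best_subset_per_size(rows, labels, score=loo_accuracy_1nn):
    """For each subset size, keep only the feature subset with the highest score."""
    best = {}
    for size in range(1, len(FEATURES) + 1):
        scored = [(score(rows, labels, cols), cols)
                  for cols in combinations(range(len(FEATURES)), size)]
        top_score, top_cols = max(scored)
        best[size] = ([FEATURES[c] for c in top_cols], top_score)
    return best


def make_toy_data(n=60, seed=0):
    """Synthetic stand-in data (assumption): three made-up phases, four noisy features."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        phase = rng.randrange(3)
        rows.append([10.0 * phase + rng.gauss(0.0, 3.0) for _ in FEATURES])
        labels.append(phase)
    return rows, labels
```

Calling `best_subset_per_size(*make_toy_data())` returns a dictionary keyed by subset size (1 to 4), each entry holding the best-scoring feature names and their score. With four features there are 15 non-empty subsets in total, so an exhaustive search of this kind stays cheap.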
Table 4 displays the average cross-validation accuracy along with the corresponding standard deviation for each classifier. Additionally, the table presents the lowest cross-validation accuracy achieved by each classifier.
Mean cross-validation accuracies for the different classifiers are also summarized in Table 5. The mean cross-validation accuracy peaked at 92.16% when both knee kinematics (FL + AB_AD + IN_EX) and the residential roof slope were used as KNN inputs; however, it dropped to 89.68% when only kinematic data were used to train the KNN classifier. A similar pattern was observed for the DT and RF classifiers. These results indicate that adding roof setting information to the knee kinematics variables improved the classification performance of the classifiers.
Table 6 shows the time taken to train and test different classifiers. All classifiers were very quick at both training and testing instances of data.
Table 7 presents the overall performance of the different classifiers in terms of their overall accuracy, F1 score, precision score, recall score, and Kappa index.
In this study, the KNN classifier demonstrated the best performance among all the classifiers. In Table 4, it achieved the highest mean cross-validation accuracy of 92.16%. In Table 7, the KNN classifier outperformed the other classifiers in all metrics. It had the highest overall classification accuracy of 0.9262, as well as the highest scores for F1, precision, recall, and the Kappa index, with values of 0.9260, 0.9220, 0.9262, and 0.9020, respectively. The F1 score, which is close to 1, reflects both precision and recall and therefore indicates the KNN classifier’s ability to identify the phases correctly more reliably than classification accuracy alone. A high precision score suggests a low false positive rate, while a high recall score indicates a low false negative rate in the predictions. The high Kappa index value suggests excellent agreement between the actual and predicted labels of the test data, as values ranging from 0.81 to 1.00 indicate almost perfect agreement.
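To make the relationship between overall accuracy and the Kappa index concrete, the following minimal sketch computes both from a confusion matrix. It assumes rows hold the actual classes and columns the predicted classes, and it uses the standard Cohen's kappa definition; the function name and the example matrix are hypothetical and are not taken from the study's tables.

```python
def accuracy_and_kappa(cm):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    Assumed layout: cm[i][j] is the number of samples of actual class i
    that were predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(cm[i][i] for i in range(n)) / total
    row_totals = [sum(cm[i]) for i in range(n)]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1.0:
        return observed, 0.0  # kappa is undefined here; 0.0 is a choice of this sketch
    return observed, (observed - expected) / (1.0 - expected)
```

For a made-up two-class matrix `[[40, 10], [5, 45]]`, the function gives an accuracy of 0.85 and a kappa of 0.70: half of the agreement would be expected by chance, so kappa is noticeably lower than the raw accuracy.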
In terms of overall accuracy, it was also observed that the RF classifier achieved a classification accuracy of 91.12%, which is comparable to that achieved by KNN. This classifier also performed well in phase classification in terms of its precision, recall, F1 score, and Kappa index.
To gain insights into the accuracy of classifying each phase, a detailed analysis of the classification results for each phase was conducted. This involved examining the confusion matrices generated by the KNN and RF classifiers, which are presented in Table 8 and Table 9, respectively. These matrices also report precision and recall values, from which per-class classification accuracies can be determined. The elements on the diagonal of a confusion matrix indicate the number of instances where the predicted class matches the actual class. Conversely, the off-diagonal elements indicate the classifier’s incorrect predictions. Higher values along the diagonal therefore indicate better performance, as the classifier made more correct predictions. For instance, in Table 8, a Precision1 value of 0.943 for class P1 signifies that 94.3% of the observations classified as P1 were accurate. Similarly, a Recall1 value of 0.930 signifies that 93% of the observations belonging to class P1 were accurately identified as P1.
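The per-class precision and recall values described above follow directly from the rows and columns of a confusion matrix. The sketch below shows the calculation under the assumption that rows are actual classes and columns are predicted classes; the function name and the three-class example are hypothetical and do not reproduce Table 8 or Table 9.

```python
def per_class_scores(cm, class_names):
    """Per-class precision, recall and F1 from a square confusion matrix.

    Assumed layout: cm[i][j] counts samples of actual class i predicted as class j.
    """
    scores = {}
    for k, name in enumerate(class_names):
        true_pos = cm[k][k]
        predicted_as_k = sum(row[k] for row in cm)  # column total
        actually_k = sum(cm[k])                     # row total
        precision = true_pos / predicted_as_k if predicted_as_k else 0.0
        recall = true_pos / actually_k if actually_k else 0.0
        denom = precision + recall
        f1 = 2.0 * precision * recall / denom if denom else 0.0
        scores[name] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Made-up three-class example, for illustration only.
example = per_class_scores([[8, 1, 1], [2, 7, 1], [0, 2, 8]], ["P1", "P2", "P3"])
```

In this example, 8 of the 10 samples predicted as P1 are correct (precision 0.8) and 8 of the 10 actual P1 samples are recovered (recall 0.8). As a side observation, the overall recall reported for KNN equals its overall accuracy (0.9262), which is what support-weighted averaging of per-class recall always produces; the source does not state which averaging scheme was used.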
Table 10 presents the performance of the different classifiers in classifying the seven phases in terms of F1 scores, and Figure 6 illustrates the same per-phase F1 scores. Table 10 and Figure 6 suggest that the KNN classifier also outperformed the other two classifiers in the per-class prediction accuracy assessment.
6. Discussion
This study investigated the use of machine learning to classify roofers’ activities in the shingle installation process. Knee kinematics data and residential roof setting information were used to classify seven different phases of the shingle installation operation using machine learning techniques. Given the high prevalence of knee MSDs among roofers, as well as the limited understanding of postural differences among the phases of sloped shingle installation, this study examined whether machine learning can differentiate the seven phases of shingle installation using knee joint rotation (kinematics) and roof slope information. This study also investigated whether the combination of knee kinematics and roof setting information as machine learning model inputs is more effective than knee kinematics alone for classifying the shingle installation phases.
Three classifiers (i.e., decision tree, random forest, and k-nearest neighbors) were tested in this study. The results reported in Table 7 showed that the highest testing accuracy of 92.62% was obtained by the KNN classifier. The RF classifier achieved a testing accuracy of 91.12%, which was comparable to that of the KNN classifier. For the KNN classifier, the number of neighbors k plays an important role in classification performance, i.e., k is the key tuning parameter. In this study, k values from 1 to 40 were examined to identify the optimal k value for all training sample sets. Although the collected data were post-processed to attenuate noise and remove outliers, the robustness of the KNN classifier to noisy data and outliers is still in question [51]. One advantage of RF over KNN is the ease of parameter tuning during training. Tuning the number of trees in the forest often leads to good accuracy. Oshiro et al. [49] suggested that a good balance between accuracy, processing time, and memory usage can be obtained with between 64 and 128 trees in a forest. Using more trees than required may be unnecessary, but this does not harm the model except by increasing the computation time [48].
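The search over k values from 1 to 40 can be illustrated with a small, self-contained sketch. It is not the authors' code: it uses a plain-Python KNN with squared Euclidean distance, a single hold-out split rather than cross-validation, and synthetic data, and every function name here is hypothetical.

```python
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training samples closest to `query`."""
    by_distance = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def holdout_accuracy(train_x, train_y, val_x, val_y, k):
    """Share of validation samples whose predicted label matches the true label."""
    hits = sum(knn_predict(train_x, train_y, q, k) == y
               for q, y in zip(val_x, val_y))
    return hits / len(val_x)


def select_k(train_x, train_y, val_x, val_y, k_values=range(1, 41)):
    """Score every candidate k and return the best one (smallest k wins ties)."""
    scores = {k: holdout_accuracy(train_x, train_y, val_x, val_y, k)
              for k in k_values}
    best_k = max(scores, key=lambda k: (scores[k], -k))
    return best_k, scores


def make_toy_split(n_train=90, n_val=30, seed=0):
    """Synthetic stand-in data (assumption): three made-up phases, four noisy features."""
    rng = random.Random(seed)

    def sample(n):
        xs, ys = [], []
        for _ in range(n):
            phase = rng.randrange(3)
            xs.append([10.0 * phase + rng.gauss(0.0, 4.0) for _ in range(4)])
            ys.append(phase)
        return xs, ys

    return sample(n_train) + sample(n_val)
```

A call such as `select_k(*make_toy_split())` returns the selected k together with the accuracy obtained for each candidate, which makes it easy to see how flat or peaked the accuracy curve is across k. Preferring the smallest k among ties is a design choice of this sketch, not something stated in the source.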
From Table 5, it can be observed that the complete set of all four features provided the best phase classification results (KNN 92.62%, RF 91.12%). These findings indicate that knee kinematics and roof slope information complement each other when detecting the risky phases of a shingle installation process. Moreover, residential roof slopes can significantly affect the knee joint rotation angles during sloped shingle installation [8]. As a result, the better phase classification performance can possibly be attributed to the more useful information provided by the combination of knee kinematics variables and the roof slope compared to the knee kinematics variables alone.
The per-class classification results (F1 scores) presented in Table 10 and Figure 6 suggested that the relatively riskier phases (P2 and P5) could be identified more accurately than the moderate- and least-risk phases when knee kinematics variables and roof setting information were used as inputs to the training algorithms. This is evident from the higher F1 scores (~95%) of these two phases compared to the other phases. F1 scores were considered because they measure the balance between precision and recall. Moreover, the F1 score is a useful measure for imbalanced datasets with a non-uniform distribution of class labels, as is the case in this study. The authors’ previous study found that, among the seven phases, the placing and nailing shingle phases (P2 and P5) required more repetition of extreme and awkward knee movements than the other phases. More specifically, roofers experienced extreme flexion, abduction, adduction, and internal and external rotations in their knee joints while placing shingles (P2), and hence P2 could be deemed the riskiest phase in terms of awkward knee rotations during sloped shingle installation. In terms of flexion, abduction, and external rotation, the next riskiest phase was nailing shingles (P5), during which the participants experienced extreme adduction as well. Moreover, the durations of these two phases were relatively longer than those of the other phases. Therefore, each classifier was able to distinguish these two riskiest phases from the others well. However, further investigation is needed to substantiate this finding. From the confusion matrices presented in Table 8 and Table 9, the recall and precision scores show that the greatest confusion for the KNN and RF classifiers occurred in P4, P6, and P3. This can be attributed to the small variations in the flexion, abduction–adduction, and internal–external rotation angles of the knee joint, which were the least extreme in these phases, as suggested in the authors’ previous study [9].
While previous research has explored the recognition of awkward postures, there is limited knowledge regarding their identification within a work context. Roofers, who frequently experience awkward postures and repetitive movements while installing shingles, are particularly vulnerable to MSDs, with knee MSDs being the most prevalent. Accurately identifying these factors during roofing operations can help minimize roofers’ exposure and reduce knee MSD risks. This study illustrated the ability of machine learning to identify high-risk phases of shingle installation by leveraging knee joint rotation angles and information regarding the specific residential roof settings in which roofers carry out their tasks. To the best of the authors’ knowledge, this paper is the first to exploit machine learning to classify task-specific risky phases using knee kinematics data and roof setting information.
The importance of proactive safety measures over reactive ones is widely acknowledged. The present study, which focuses on automating the classification of work phases in roofing operations, has significant implications for researchers, practitioners in the occupational safety and health field, and the construction industry. The integration of machine learning with non-invasive biomechanical devices or inertial measurement units (IMUs) that can capture knee rotational kinematics during dynamic movements presents significant potential. This combination could lead to the development of an automated activity monitoring and risk identification system as an intervention for roofers. Such a system would enable the continuous monitoring and evaluation of roofers’ postures throughout the entire process of sloped shingle installation. Furthermore, it shows promise in identifying instances where a worker spends an excessive duration in a specific phase, particularly in high-risk phases involving awkward rotations and repetitive motions, such as shingle placement and nailing. Currently, there is no standardized procedure for shingle installation in residential roofing. This information could be beneficial to designing a well-documented and standardized residential roofing procedure that can help reduce MSDs among roofers. Automated machine-learning-based classification may also facilitate the process of MSD-risk-related data collection. Additionally, the productivity of workers may also be analyzed through the automated identification of the phases and, thereby, the determination of their working durations at each identified phase.
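As a rough illustration of how classified phases could feed the duration-based monitoring described above, the sketch below accumulates the time spent in each predicted phase and flags phases whose total exceeds a limit. Everything in it is an assumption made for illustration: the function names, the fixed sampling period, and especially the limit values, since the source specifies no exposure thresholds.

```python
from itertools import groupby


def phase_durations(predicted_phases, sample_period_s):
    """Total time and longest continuous bout per phase.

    `predicted_phases` is a time-ordered sequence of phase labels, one per
    sensor sample, assumed to be spaced `sample_period_s` seconds apart.
    """
    totals, longest = {}, {}
    for phase, run in groupby(predicted_phases):
        bout = sum(1 for _ in run) * sample_period_s
        totals[phase] = totals.get(phase, 0.0) + bout
        longest[phase] = max(longest.get(phase, 0.0), bout)
    return totals, longest


def flag_overexposure(totals, limits_s):
    """Phases whose accumulated time exceeds a (hypothetical) limit in seconds."""
    return sorted(p for p, t in totals.items()
                  if p in limits_s and t > limits_s[p])


# Made-up label stream and limits, for illustration only.
stream = ["P1", "P2", "P2", "P2", "P5", "P5", "P2"]
totals, longest = phase_durations(stream, sample_period_s=0.5)
flagged = flag_overexposure(totals, {"P2": 1.5, "P5": 2.0})
```

Here P2 accumulates 2.0 s in total, with a longest continuous bout of 1.5 s, so it is the only phase flagged against the made-up limits. Tracking the longest bout separately from the total is a design choice of this sketch, intended to distinguish prolonged from repetitive exposure.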
This study has several limitations. First, ground truth data were collected from an experimental study performed in a controlled laboratory setting, not from a real construction site. Second, only knee kinematics and roof slope information were used as features to analyze the phase classification performance of the classifiers. Activations of the knee postural muscles (i.e., electromyography signals from different muscles) were not considered. Awkward postures can make a muscle less efficient in producing the force required to accomplish a task, which results in higher muscle activation and muscle overloading. Hence, knee postural muscle activation data might yield more useful discriminative features, which could further improve the phase classification performance of the classifiers. Third, only three feature-based models (i.e., KNN, DT, and RF) were tested in the current study. A comparison with a deep learning model, such as a deep neural network, might provide more insights into the underlying classification mechanisms, but it was outside the scope of the current investigation. Fourth, this study employed individuals without professional roofing experience as participants. Kinematically speaking, distinctions may exist between professional and non-professional roofers. Nevertheless, all subjects in this research were physically active and possessed relevant experience in activities such as home remodeling. It is hypothesized that their biomechanical responses during the experiments closely resemble those of professional roofers. However, further scientific investigation is required to substantiate this assumption [8]. Lastly, the dataset included only roofing phase data. Real-life scenarios will include non-roofing phases as well, such as standing, resting, and walking. In such cases, the data related to those non-roofing phases should be removed before applying the classification method.
7. Conclusions and Future Extension
Construction roofers are exposed to awkward knee joint rotations in different phases of shingle installation roofing tasks, which contributes to the development of knee MSDs among them. Some of the phases involve extreme knee joint rotation and hence impose the greatest risk of knee MSDs. This study suggested that machine learning can automatically detect and classify the phases of a typical sloped shingle installation task with a high accuracy based on the roofer’s knee joint rotation angles and the information of the roof slope at which they operated. Seven different phases, namely reaching for shingles, placing shingles, grasping the nail gun, moving to the first nailing position, nailing shingles, replacing the nail gun, and returning to an upright position, were performed in a simulated shingle installation task to examine the feasibility of the automated detection and classification of the phases. The classification performances of three types of supervised machine learning classifiers (i.e., DT, KNN, and RF) were compared in order to select the best classifier. Cross-validation and overall prediction accuracy results showed that the KNN classifier obtained the best results with 92.16% and 92.62%, respectively. The recall and precision performance of correctly classifying the phases were above 92% for this classifier. The results of the current study show that machine learning can be used to accurately recognize the different phases of a shingle installation task. The findings of this study highlight the feasibility and potential of the application of a machine-learning-based automated phase classification method for assessing the MSD risk, productivity, and efficiency of residential roofers.
In future studies, the focus will be on investigating the effects of knee postural muscle activation on the classification of shingle installation phases. The aim is to ascertain whether muscle activation can offer more informative features for a deeper comprehension of postural disparities among phases and the associated risk of knee MSDs in roofing. Additionally, more insights into the underlying mechanisms of classification can be obtained through an exploration of deep neural network learning models. It is important to note that personal protective equipment, such as fall protection harnesses and lifelines, was not utilized in the simulated roofing tasks conducted for this study. Future research is needed to examine how the implementation of these safety measures affects the performance of roofers during their tasks. Furthermore, the models developed in this study will be extended to real-life settings, incorporating the use of wearable sensors, to assess the feasibility of automated recognition with the involvement of professional roofers. The extension of the models to other body parts of roofers that are susceptible to MSDs, such as the lower back, may also be explored.