The results show that most participants observed multiple AOIs in sequence before making a judgment, and they achieved a high response accuracy rate.
5.1.1. Potential Electrical Contact Hazards
As shown in Table 4, participants mainly adopted two strategies when observing potential electrical contact hazards: 32.6% of participants followed the observation pattern “charged body (A)–wire (B)–connection between charged body and electric wire (hereinafter referred to as the intersection)”, and 48.8% followed the observation pattern “wire (B)–energy-releasing source (A)–intersection”.
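As an illustration of how such pattern shares could be tallied from eye-tracking records, the following minimal sketch reduces each participant's fixation sequence to the order in which AOIs were first visited and counts the resulting patterns. The AOI labels (`A`, `B`, and `X` for the intersection), the helper names, and the sample sequences are hypothetical and are not the study's data or analysis procedure.

```python
from collections import Counter


def first_visit_order(aoi_sequence):
    """Return the AOIs in the order they were first fixated, ignoring revisits."""
    seen = []
    for aoi in aoi_sequence:
        if aoi not in seen:
            seen.append(aoi)
    return tuple(seen)


def pattern_shares(sequences):
    """Map each first-visit order to its percentage share of participants."""
    counts = Counter(first_visit_order(seq) for seq in sequences)
    total = len(sequences)
    return {pattern: 100.0 * n / total for pattern, n in counts.items()}


# Hypothetical fixation sequences, one list per participant (assumed data).
sequences = [
    ["A", "A", "B", "X", "B"],  # charged body, then wire, then intersection
    ["B", "A", "X"],            # wire, then charged body, then intersection
    ["B", "B", "A", "A", "X"],
    ["A", "X"],                 # skips the wire entirely
]

shares = pattern_shares(sequences)
for pattern, share in sorted(shares.items()):
    print(" - ".join(pattern), f"{share:.1f}%")
```

Using the first-visit order rather than the raw fixation stream is one possible design choice; it treats revisits to an AOI as part of the same strategy, which may or may not match how the patterns in Table 4 were coded.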
It is worth noting that in both visual paths, the areas of charged objects (A) and wires (B) are necessary conditions for participants to successfully recognize hazards. However, the heat maps and post-experiment interviews showed that participants were more concerned about the intersection of (A) and (B). On construction sites, the most common electrical fires come from the connection of energized wires to equipment [55,56], which is the intersection in the Hazard #2 scene. Interestingly, the interviews indicate that both groups of construction workers, despite adopting different observation sequences, expressed confidence in their strategy and considered focusing on the intersection to be efficient in this scenario. Although this common tendency provides specific guidance for interpreting how construction workers recognize potential electrical contact hazards, it seems to be inconsistent with the priority areas of concern (e.g., charged bodies, ground wire) generalized in previous studies [17,57]. The reason is probably that the participants in this study were all experienced construction workers, whereas the previous studies recruited substitute groups, such as civil engineering students, who had less work experience [17,58].
Therefore, the current results show that, when recognizing unknown scenes with potential electrical contact hazards, giving priority to the intersection as the focus of attention among the components, rather than to the energy-releasing source, is more consistent with the cognitive patterns of safety-experienced construction workers. This supplements and revises the visual path generalized in previous visual research. Computer vision technology should pay more attention to the definition and feature extraction of intersection areas to enhance hazard recognition ability and efficiency in potential electrical contact hazard scenarios, as the intersection may reflect the interaction between the two common components, the energy-releasing source and the wires. The RBC and Gestalt models suggest that humans can identify hazards based on the spatial relationships of different elements and the extent to which they “deviate from the prototype” [18]. Corresponding computer vision aids for hazard monitoring, such as detecting the integrity of wire insulation and detecting abnormal heat generation, could be further developed and refined in conjunction with thermal imaging technology [59].
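One simple way to define an intersection area for feature extraction is as the overlap between the detected regions of the two components. The sketch below assumes that an upstream detector (not specified in this study) has already produced axis-aligned bounding boxes in pixel coordinates for the energy-releasing source and the wire. The `Box` type, the `margin` parameter, and the sample coordinates are hypothetical.

```python
from typing import NamedTuple, Optional


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (assumed convention)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def intersection_region(a: Box, b: Box, margin: float = 0.0) -> Optional[Box]:
    """Return the overlap of two boxes, grown by `margin` on every side.

    Returns None when the boxes do not overlap even after the margin is added.
    """
    x_min = max(a.x_min, b.x_min) - margin
    y_min = max(a.y_min, b.y_min) - margin
    x_max = min(a.x_max, b.x_max) + margin
    y_max = min(a.y_max, b.y_max) + margin
    if x_min >= x_max or y_min >= y_max:
        return None
    return Box(x_min, y_min, x_max, y_max)


# Hypothetical detections for one image (assumed values).
equipment = Box(100, 100, 300, 260)  # energy-releasing source (A)
wire = Box(280, 180, 500, 200)       # wire (B)

roi = intersection_region(equipment, wire, margin=10)
print(roi)
```

The resulting region could then be cropped and passed to a dedicated classifier or a thermal-image check. The margin is included because a connection point is often slightly larger than the strict overlap of the two boxes; its value would need to be tuned for real images.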
5.1.2. PPE-Related Hazard
Based on the data analysis and the post-experiment interviews, participants used two strategies to observe and recognize Hazards #2 and #3: (1) “Construction worker–PPE–construction work environment”. The corresponding visual search paths are shown in Solution #2-1 in Table 7 and Solution #3-1 in Table 10. (2) “Construction operating environment (association of potential hazards)–construction workers–PPE”. The corresponding visual search paths are shown in Solution #2-2 in Table 7 and Solution #3-2 in Table 10.
Safety training provides workers with the knowledge base and long-term memory of safety regulations that guide them to the most likely interpretation of a given scenario and generate perceptual goals. Construction workers who search for hazards in accordance with this paradigm follow what psychology calls the “top-down” model. The human-object-environment strategy described above is also consistent with the requirement in OSHA [60] that safety training prioritize whether workers working at heights are wearing safety devices. When the background (objects, features or groups in the scene) is consistent with the perceptual target, the hazard can be recognized [18]. In this study, the correct answer rate of construction workers using this paradigm exceeded 80% in both PPE-related hazard scenes.
Even though the purpose of safety training is to reinforce long-term knowledge and enhance the effect of “top-down” inspection, the heavy construction workload means that workers’ general perception often starts from trivial signs and follows “heuristic” reasoning [61,62] to determine whether there are hazards, which is the “bottom-up” dominant cognitive model. This conforms to what psychology describes as the spontaneous interchange of “bistable perception” [63], which unconsciously shifts attention to prominent visual features of potential importance [64,65], that is, hazards closely related to workers [66]. This observation mode can be described as “scene-associative”. Although this bottom-up cognitive model may lay the foundation for understanding people’s selection of AOIs, it is not sufficient to complete the visual search process, especially on dynamic, highly complex construction sites, where potential hazards may come from many different and unexpected areas. Completing the cognitive process therefore also requires associations with the current scene. Human perceptual mechanisms discard redundant information [67] and use existing information such as common sense, knowledge and experience, combined with expectations of the current situation [68], to associate and discover missing information [18]. For example, under Hazard #2 and Hazard #3, when participants quickly focus on the workers while scanning the scene in a hazard search task, they generate expectations about the workers’ safety, that is, that the workers should be protected by PPE. When further observation reveals that PPE is absent, they judge that the workers are in a hazardous state, and the cognitive process of hazard recognition is completed. This model can be summarized as “norm-guided”.
Interestingly, the results of the current study show that both strategies correspond to a fairly high correct rate of hazard recognition, with neither clearly superior. However, it can be concluded that the high accuracy rate corresponding to the normative interpretation indicates that the hazard recognition strategies taught in safety training are effective, which supports the positive effect of safety training on safety management. Safety training enables individuals to recall the requirements of safety regulations in construction scenarios and to consciously perform the corresponding safety inspection and CHR. It is worth noting that although the “scene-associative” visual cognitive strategy appears to be less efficient than the “norm-guided” approach in targeting hidden areas, both exhibit high levels of correctness, suggesting that the “scene-associative” strategy may have implications for safety training. Specifically, the scene-associative strategy is more reflective of worker experience, and although there may be significant individual variability, the overall consistent strategy helps to identify inappropriate instructional content in traditional construction safety training design that may run contrary to subjective human experience and habits [69]. Quality safety training content should include hands-on activities for visual exploration of the environment, or at least activities that provoke reflection on the possible sources of hazards, rather than merely conveying guidance on visual exploration behavior that meets normative requirements. Visual research can record reasonable and effective hazard recognition experience with common features through the cognitive strategies reflected in the visual sequence, which provides a rich source of content for safety training design and enhances its positive contribution to safety management [70,71].
In addition, these two strategies are also instructive for the development of computer vision in hazard recognition. For example, the way humans interpret safety specifications and apply the top-down visual cognitive strategy can serve as a reference for learning how to accurately map the inherently complex content of safety specifications onto specific scenarios [30], thus enhancing the computer’s ability to learn and understand safety regulations. The large amount of data recorded in vision research, after screening to improve data quality, can be used to train computer vision algorithms [31] to predict possible future hazardous states of the current scene, which may be significant for safety management. This is because what can be seen and recorded is a world composed of objects and surfaces, whereas humans can perceive and make associations about the meaning of the observed visual attributes. The human cognitive system uses heuristic problem-solving shortcuts, rather than exhaustive algorithms, to make inferences about received information and generate perceptions of the world [40,41]. In the current application paradigm of computer vision, searching for the presence of a feature is faster than searching for its absence [18], which is also why a static scene with potential hazards may be more difficult for computer vision to recognize than a dynamic scene. However, with the help of appropriate cues, things objectively absent in a scene can be interpreted as present but hidden, thus allowing human associative abilities to be simulated. Vision research may provide such cues and a learning basis for computer vision algorithms, such as the visual behavior corresponding to the visual cognitive strategies summarized in this study, which could substantially improve the hazard recognition ability of computer vision.
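The idea of treating an absent item as an expected but missing one can be sketched as a simple expectation check: a scene context generates a set of expected PPE items, and anything expected but not detected is flagged. This is only an illustrative sketch. The context names, item labels, and the `EXPECTED_PPE` table are hypothetical placeholders, not requirements taken from OSHA or from this study, and the detections are assumed to come from some upstream computer vision model.

```python
# Hypothetical mapping from scene context to expected PPE (assumed values,
# not taken from any regulation).
EXPECTED_PPE = {
    "work_at_height": {"helmet", "harness"},
    "ground_work": {"helmet"},
}


def missing_ppe(context, detected_items):
    """Return the PPE items expected for this context but not detected."""
    expected = EXPECTED_PPE.get(context, set())
    return expected - set(detected_items)


def is_hazardous(context, detected_items):
    """Flag the worker as hazardous when any expected PPE item is missing."""
    return bool(missing_ppe(context, detected_items))


# Hypothetical detections for a worker on a scaffold (assumed data).
detected = ["helmet"]
print(missing_ppe("work_at_height", detected))
print(is_hazardous("work_at_height", detected))
```

Framing the task this way turns a search for absence into a comparison between an expectation and a set of detected features. Where the expectations come from (regulations, recorded gaze behavior, or learned associations) is left open here.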