1. Introduction
As automation advances, ship operators must track more information and manage a growing number of systems, which increases their workload [1]. Higher workload affects operators’ work efficiency, physiology, and psychological health [2], and it has become a common concern across industries [3]. Previous research has shown that workload affects human performance [4]; an appropriate workload level can reduce human error and improve system security [5].
Workload includes not only the cognitive demands of a task but also fatigue and stress. Measurement methods currently fall into three categories: performance measurement, subjective rating, and physiological measurement [6]. Subjective methods are easy to implement and low in cost. They mainly include the circumplex model of affect [7], the Positive Affect and Negative Affect Schedule [8], the Activation–Deactivation Adjective Check List [9], NASA-TLX [10], the Subjective Workload Assessment Technique [11], the Overall Workload Scale [12], and Modified Cooper–Harper Ratings [13]. However, subjective results are susceptible to bias, response sets, and protest attitudes [14].
Current research indicates that workload is related to physiological response [15,16]. Compared with subjective evaluation methods, physiological measurement methods are more affected by environmental factors, and the experimental equipment attached to the participant’s body can affect the operator’s mental state. However, physiological measurements require only small samples to give accurate results. Among physiological methods, eye responses have been studied extensively. Eye responses are useful for assessing the design quality of an interface and the operator’s workload [17]. Pupil diameter is generally combined with scan path, fixation time, and error rate to evaluate workload [18,19]. Blinking is considered a sign of fatigue [20]. When operators need to process more information, blink frequency decreases [21,22]. Another view holds that blink frequency is affected by both visual demand and workload, whose effects oppose each other [23]. When workload increases, blink duration also decreases [24]. In addition, fixation time correlates with workload [25,26], and using fixation time to evaluate workload has shown higher reliability [27].
Yan et al. (2019) proposed an artificial neural network model to predict the operator’s workload [28]. Although that model and the model proposed in this study both address operator workload, the two studies differ in three respects. First, the previous research examined the relationship between operators’ workload and eye responses while operating a marine engine interface, with the ultimate objective of building an artificial neural network model to predict workload. This study instead constructs a discriminant model, using typical discriminant methods, to evaluate the operators’ workload level. Second, the artificial neural network model was trained with the back-propagation algorithm, whereas the discriminant model in this study is based on the Bayes discriminant approach. Third, the artificial neural network model was built by integrating eye response data, whereas the discriminant model is built on both eye response and human performance (error rate). A single measurement method has limitations when used to assess workload level; integrating multiple methods can offset the shortcomings of any one method and improve the accuracy of the evaluation. Therefore, this study used the typical discriminant method to establish a discriminant model based on multiple indicators, which discriminates the workload level of ship interface operators.
2. Materials and Methods
2.1. Participants
Operators must not only master the operation of marine engines but also have extensive practical experience. However, such operators are not always available. Therefore, this study tested participants who had received task-specific training. Twenty-two students aged 21 to 25 (22.8 ± 1.4 years) were invited to participate. They were novices from a different domain. All participants had a good engineering education background and were familiar with computer operations; they were in good health, with normal vision and normal color vision. All participants performed three tasks. Two-thirds of the sample data were used to construct the model, and one-third was used to verify its reliability. All participants were required to get adequate sleep the day before the experiment and were asked to stay in a quiet environment before it began. Each participant completed an operational training task before the experiment.
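The two-thirds/one-third division of the sample data can be illustrated with a short sketch. The text does not say how samples were assigned to each subset, so the random shuffle, the fixed seed, and the function name `split_samples` are assumptions made only for illustration; the sample list simply enumerates 22 participants by 3 tasks.

```python
import random

def split_samples(samples, train_fraction=2 / 3, seed=0):
    """Shuffle a copy of the samples and split it into construction and validation sets.

    Assumption: a simple random split; the study does not state its assignment rule.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# One record per participant and task (hypothetical identifiers, no measured data).
samples = [(participant, task)
           for participant in range(1, 23)
           for task in ("low", "general", "high")]

train, validation = split_samples(samples)
print(len(train), len(validation))
```

A split by participant instead of by sample would keep all three tasks of one person in the same subset; which variant matches the study is not stated.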
2.2. Equipment
The Neptune simulator MC90 was developed by Kongsberg Maritime Ship Systems Software (Version: MC90-V, Norge, Norway) based on real data, and it closely follows real operating procedures. In the MC90 software, the propulsion machinery is based on one MAN B&W 5L90MC low-speed, 5-cylinder, 2-stroke, turbocharged, reversible diesel engine. The main engine is coupled to a propeller shaft with either a fixed-pitch or a controllable-pitch propeller. The software is an effective training tool for marine engineering professionals, and this study used it for the simulated operations.
Eye response was recorded with an iView X head-mounted eye tracking device (SensoMotoric Instruments, Teltow, Germany). The recording rate was 50 Hz, the pupil/corneal reflex tracking resolution was <0.1°, and the gaze position accuracy was <0.5° to 1.0°. BeGaze software was used to analyze and process the experimental data.
2.3. Experimental Task
This research analyzed how the indicators change under different workload levels and then analyzed the correlation between workload and each indicator. Three tasks were used to control the workload level. The operating procedures of the three tasks are shown in Tables 1–3 [29]. The first task is the emergency generator operating procedure, which mainly provides power to the ship when the ship’s power supply system has a problem (Table 1). The second task is the shaft generator (SG) operating procedure. The SG can supply the ship’s network with electrical energy when it is running above 200 rpm: between 200 and 400 rpm the load is limited to half, and above 400 rpm maximum power is available (Table 2). The third task is the operating procedure for starting the diesel generator (Table 3). According to the number of steps in each operation, the first, second, and third tasks represent low, general, and high workload, respectively. The human–computer interface required for each task procedure is shown in Figure 1.
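The shaft generator speed thresholds mentioned above can be written as a small rule. This is only a sketch: the function name is hypothetical, and the handling of exactly 200 rpm and exactly 400 rpm is an assumption, because the text does not say which side of each boundary those values fall on.

```python
def shaft_generator_load_fraction(rpm):
    """Return the fraction of maximum load the shaft generator may supply.

    Assumption: 200 rpm counts as 'not above 200' and 400 rpm as 'not above 400'.
    """
    if rpm <= 200:
        return 0.0   # not yet able to supply the ship's network
    if rpm <= 400:
        return 0.5   # load limited to half
    return 1.0       # maximum power available

for speed in (150, 300, 450):
    print(speed, shaft_generator_load_fraction(speed))
```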
2.4. Data Collection and Analysis
2.4.1. Performance Data
Performance indicators evaluate how effectively a task is completed. Performance evaluation parameters mainly include the number of errors, error rate, and operation time [30]. Based on current research on performance evaluation methods, this study selected error rate and operation time as the performance indicators.
2.4.2. Subjective Data
Subjective evaluation collects the scores that participants give on rating scales. Compared with objective evaluation methods, subjective methods need larger samples, and their results are easily influenced by the evaluators’ personal preferences and moods. However, because of their low cost and adaptability, subjective methods are still widely applied. This study used the NASA-TLX score to evaluate the operators’ workload. NASA-TLX evaluates workload along six dimensions: mental demand, physical demand, temporal demand, own performance, effort, and frustration. Each dimension is rated on a scale from 0 to 100. The NASA-TLX score was calculated by Equation (1).
$$\text{NASA-TLX} = \sum_{i=1}^{6} W_i X_i \qquad (1)$$
where
NASA-TLX is the NASA task load index;
$X_i$ is the rating score of the ith dimension;
$W_i$ is the relative weight of the ith dimension.
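The weighted score defined by the ratings and relative weights above can be sketched in a few lines. The sketch assumes that the relative weights are normalized to sum to one before being applied; the ratings and the pairwise-comparison tallies shown are hypothetical values, not data from this study.

```python
DIMENSIONS = ("mental demand", "physical demand", "temporal demand",
              "own performance", "effort", "frustration")

def nasa_tlx(ratings, weights):
    """Weighted NASA-TLX score: sum of relative weight times rating per dimension.

    Assumption: raw weights are normalized so that they sum to one.
    """
    total_weight = sum(weights[d] for d in DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[d] / total_weight * ratings[d] for d in DIMENSIONS)

# Hypothetical ratings on the 0 to 100 scale and hypothetical weight tallies.
ratings = {"mental demand": 80, "physical demand": 20, "temporal demand": 60,
           "own performance": 40, "effort": 70, "frustration": 30}
weights = {"mental demand": 5, "physical demand": 0, "temporal demand": 3,
           "own performance": 2, "effort": 4, "frustration": 1}

score = nasa_tlx(ratings, weights)
print(round(score, 2))
```

With equal weights the function reduces to the plain mean of the six ratings, which is a quick way to check an implementation.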
2.4.3. Eye Response Data
In the BeGaze software (Version 2.3, Teltow, Germany), the simulated display interface was selected as the area of interest (AOI). All recorded data falling outside the simulated interface were excluded from the analysis. During the experiment, eye responses on the simulated interface were recorded as coordinate values. This study selected four eye response indicators: pupil diameter, blink rate, saccade rate, and fixation rate. Saccade rate is negatively correlated with fixation rate. When interface information is easy to obtain, the saccade rate increases and the fixation rate decreases. Conversely, when the presentation of interface information does not match the operator’s cognition, the fixation rate increases and the saccade rate decreases. Saccade rate and fixation rate were therefore used to evaluate the operators’ workload. Pupil diameter is the average of the vertical and horizontal diameters of the left pupil. Blink time is the time between the eye closing and reopening, averaged over multiple measurements in the experiment. The blink rate is the number of blinks per second. The saccade rate is the number of rapid eye movements per second. The fixation rate is the number of times per second that the gaze dwells on an area.
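The indicator definitions above reduce to simple arithmetic on event counts and recording time. The following sketch assumes that blink, saccade, and fixation events inside the AOI have already been counted by the analysis software; the function names, the sample values, and the 120 s duration are hypothetical and serve only to show the calculation.

```python
from statistics import mean

def pupil_diameter(samples):
    """Average the horizontal and vertical left-pupil diameters over all samples."""
    return mean((horizontal + vertical) / 2 for horizontal, vertical in samples)

def rate_per_second(event_count, duration_s):
    """Number of events (blinks, saccades, or fixations) per second of recording."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return event_count / duration_s

# Hypothetical values: (horizontal, vertical) diameters and event counts in 120 s.
pupil_samples = [(3.8, 4.0), (4.0, 4.2), (3.9, 4.1)]
duration_s = 120.0
indicators = {
    "pupil_diameter": pupil_diameter(pupil_samples),
    "blink_rate": rate_per_second(24, duration_s),
    "saccade_rate": rate_per_second(300, duration_s),
    "fixation_rate": rate_per_second(270, duration_s),
}
print(indicators)
```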
4. Discussion and Conclusions
Workload has a great influence on information acquisition and feedback time, which may lead to human errors. This study constructed a discriminant model to discriminate the operator’s workload level. Three task procedures were used to represent three workload levels.
According to the number of steps in each operation, the three tasks were classified as low, general, and high workload. The t-test on NASA-TLX showed that the score differed significantly between the three tasks, which indicates that workload differed significantly across tasks. The mental demand dimension had the highest score across the three tasks, indicating that the operation process required a high degree of concentration. The high effort score indicated that users had to expend more effort to complete the operation task.
Operation time differed significantly between the three tasks, and error rate differed significantly between low and high workload. A higher workload requires more cognitive resources, but cognitive resources are limited, which may increase the error rate. At the same time, more operation steps also require the operator to spend more time completing a task.
The t-test showed that pupil diameter and saccade rate differed significantly between the three workload levels. Fixation rate also differed significantly between low and general workload and between low and high workload. More task steps lead to a higher workload. As workload increases, the amount of information the eyes need to obtain increases, which leads to pupil dilation. In addition, the required information is harder to find under a high workload, which decreases the saccade rate and increases the fixation rate. Blink rate is affected by both workload and visual demand, whose effects oppose each other. As a result, workload did not have a significant effect on blink rate.
According to the above analysis, these indicators reflect the workload level. The correlation analysis showed that the NASA-TLX score correlates with error rate, operation time, and the eye response indicators. These indicators were therefore used to construct the discriminant model, which is based on NASA-TLX score, error rate, operation time, pupil diameter, blink rate, saccade rate, and fixation rate. The validation results indicated that the discriminant model can effectively distinguish the operators’ workload level.
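To make the Bayes discriminant idea concrete, the following sketch assigns a sample to the workload level with the highest posterior score under a Gaussian model. It is not the model constructed in this study: it uses only two indicators, assumes a pooled variance per indicator and uncorrelated indicators (a diagonal covariance simplification), and runs on synthetic values chosen purely for illustration. All function and variable names are hypothetical.

```python
import math
from collections import defaultdict

def fit_bayes_discriminant(samples, labels):
    """Estimate class priors, class means, and pooled per-feature variances."""
    groups = defaultdict(list)
    for x, y in zip(samples, labels):
        groups[y].append(x)
    n, d = len(samples), len(samples[0])
    priors, means = {}, {}
    pooled_ss = [0.0] * d
    for y, rows in groups.items():
        priors[y] = len(rows) / n
        means[y] = [sum(r[j] for r in rows) / len(rows) for j in range(d)]
        for r in rows:
            for j in range(d):
                pooled_ss[j] += (r[j] - means[y][j]) ** 2
    # Pooled within-class variance (assumes it is nonzero for every feature).
    variances = [ss / (n - len(groups)) for ss in pooled_ss]
    return priors, means, variances

def classify(x, model):
    """Return the class with the largest log prior plus Gaussian log-likelihood."""
    priors, means, variances = model
    def score(y):
        # Terms shared by all classes are dropped; they do not change the argmax.
        return math.log(priors[y]) - 0.5 * sum(
            (xj - mj) ** 2 / v for xj, mj, v in zip(x, means[y], variances))
    return max(priors, key=score)

# Synthetic (pupil diameter, fixation rate) pairs, not measured data.
train_x = [[3.0, 1.0], [3.2, 1.2], [3.1, 0.9],
           [3.6, 1.6], [3.8, 1.8], [3.7, 1.5],
           [4.3, 2.2], [4.5, 2.4], [4.4, 2.1]]
train_y = ["low"] * 3 + ["general"] * 3 + ["high"] * 3

model = fit_bayes_discriminant(train_x, train_y)
print(classify([3.15, 1.1], model))
```

A full treatment would use the complete pooled covariance matrix across all seven indicators, so that correlations such as the one between saccade rate and fixation rate are taken into account.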
However, this study has some limitations. First, the sample size is small. Second, the experiment was performed on a simulation platform, which differs from the real operating environment. Therefore, the reliability of the model needs further verification in practical application.