*3.2. Intelligence Module*

First, communication between the active sensors is enabled by ThingsBoard, an open-source IoT platform for data collection, processing, visualization and device management. It is free for both personal and commercial use and can be deployed anywhere.

It enables device connectivity through industry-standard IoT protocols (MQTT, CoAP and HTTP) and supports both cloud and on-premise deployments. ThingsBoard combines scalability, fault tolerance and performance, ensuring that the users' data are never lost.

ESP32 is a series of low-power, low-consumption system-on-a-chip microcontrollers with integrated Wi-Fi and dual-mode Bluetooth, as mentioned in the previous section. This device is responsible for transmitting the information to the ThingsBoard platform for subsequent processing by the intelligent model (see Figure 4) in order to interact with the helmet.


**Figure 4.** Setting up the ThingsBoard platform to operate according to the information received from ESP32, IoT module added to ThingsBoard and Multi-sensorial configuration.

Simple steps are required to link the devices to the platform:
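As one illustration of this linking step, a device provisioned with an access token can push its telemetry to ThingsBoard over the platform's HTTP Device API. The sketch below only builds the endpoint URL and JSON body; the host, token, and sensor keys are placeholders, not values from this deployment:

```python
import json

# ThingsBoard accepts device telemetry over HTTP at
#   POST http(s)://<host>/api/v1/<ACCESS_TOKEN>/telemetry
# Host and token below are placeholders, not real credentials.
THINGSBOARD_HOST = "demo.thingsboard.io"
ACCESS_TOKEN = "YOUR_DEVICE_ACCESS_TOKEN"

def telemetry_url(host: str, token: str) -> str:
    """Build the HTTP telemetry endpoint for a given device access token."""
    return f"https://{host}/api/v1/{token}/telemetry"

def telemetry_payload(readings: dict) -> str:
    """Serialize sensor readings to the flat JSON body ThingsBoard expects."""
    return json.dumps(readings)

# Hypothetical readings from the helmet's sensors:
body = telemetry_payload({"temperature": 27.4, "airQuality": 112, "gas": 0.8})
url = telemetry_url(THINGSBOARD_HOST, ACCESS_TOKEN)
# An actual send could POST `body` to `url` with Content-Type application/json.
```

The same telemetry can equally be published over MQTT or CoAP, which the platform also supports.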


The data obtained through ThingsBoard are later processed by an intelligent model that confirms or denies the existence of a real emergency. For this reason, configuring the platform correctly is very important.

An association must be created between the different sensor values and the corresponding response. Once these associations exist, any value can be modified, whether for thresholds to be tested empirically or in the alarms. Alarms are configured in the device settings so that the corresponding notifications appear on the panel, and a rule chain must be added.

The attributes placed on the server and on the device's threshold panel must then be selected. The attribute names on the server must match those on the panel so that, when the data are configured dynamically, they are recognized correctly and appear on the diagram generated by the platform (Figure 5).

Subsequently, the script block verifies whether the information coming from the device exceeds the established threshold value. If the check is positive, an alarm is raised and the information to be displayed is defined.
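The threshold check performed in the script block can be sketched as follows. This is an illustrative Python version of the logic (in ThingsBoard itself such filters are written inside the rule chain); the threshold values and sensor keys are assumptions, not the paper's configuration:

```python
# Illustrative thresholds per sensor key (not the paper's actual values).
THRESHOLDS = {"temperature": 45.0, "airQuality": 150.0, "gas": 1.0}

def check_thresholds(msg: dict, thresholds: dict = THRESHOLDS) -> list:
    """Return one alarm record per reading that exceeds its threshold."""
    alarms = []
    for key, limit in thresholds.items():
        value = msg.get(key)
        if value is not None and value > limit:
            # This dict mirrors the "information to be displayed" for the panel.
            alarms.append({"type": f"High {key}", "value": value, "limit": limit})
    return alarms

# A message whose temperature exceeds its limit raises exactly one alarm:
alarms = check_thresholds({"temperature": 51.2, "airQuality": 90.0})
```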


**Figure 5.** Alarm configuration on ThingsBoard, Block alarm creation method and Connecting alarms with sensors.

Moreover, the root rule chain, which is in charge of obtaining and processing the information coming from the devices, has been modified. An originator-type section has been added, where the devices that transmit the information are identified. Likewise, additional rule chains have been generated to implement the customized code blocks in the panels. Finally, the information can be visualized on the data panel.

When this procedure is not necessary, the notifications generated by the different devices can be viewed directly. To this end, enter the Devices section, select a device for which an alarm has been configured, and go to the Alarms tab (see Figure 6), where the notifications generated by that device are displayed.

Once the alarms have been configured on the platform, validation is carried out through the AI model described below [60].


**Figure 6.** Final configuration of ThingsBoard platform to be validated through an intelligent algorithm.

### **4. Platform Evaluation**

This section compares the different algorithms used in the state of the art to solve problems similar or related to the one addressed here [61–64]. These models have been accepted for real-world problems because of their results on datasets with class-imbalance and saturation issues. The comparison is performed with the same amount of data and on an objective, quantitative basis. Furthermore, the present proposal is described in detail.

### *4.1. Data Model*

In this study, samples of data from a real environment were obtained, in which a subject was subjected to various scenarios in simulated environments, considering the different risks that could arise. The five analyzed parameters are shown in Table 7. The acquired dataset consists of a total of 11,755 samples, where five descriptive variables are proposed with respect to the target of the study.



This research tackles a multi-class problem, so there is a set of labels with different meanings. When the microcontroller was programmed, the different parameter values that could trigger an alarm signal were investigated; for example, if the air quality falls below the threshold (measured by the Air Quality Index, AQI), this situation can be associated with the values for other parameters measured by neighboring sensors. The 12 labels proposed in this work are described below; they are based on research into the most common problems in industrial areas, from which the types of sensors included in the helmet were derived [65,66]:


Once the information has been understood, it is cleaned. As proposed by [67], the data were cleaned to address common problems such as missing values and erroneous extremes, the latter handled with the clamp transformation, see Equation (2).

$$a\_i = \begin{cases} lower & \text{if } a\_i < lower \\ upper & \text{if } a\_i > upper \\ a\_i & \text{otherwise} \end{cases} \tag{2}$$

where *ai* represents the *i*-th sample of the data set, and *lower* and *upper* are the lower and upper thresholds, respectively.

The upper and lower thresholds can be calculated from the data. A common way of calculating thresholds for the clamp transformation is to establish:

$$lower = Q\_1 - 1.5 \times IQR, \qquad upper = Q\_3 + 1.5 \times IQR$$
where *Q*1 is the first quartile, *Q*3 is the third quartile and *IQR* = *Q*3 − *Q*1 is the interquartile range. Any value outside these thresholds becomes the threshold value. This research takes into account that the variation in a data set may differ on either side of a central trend. Each sample with missing data was eliminated so as not to bias the model. However, the search for outliers was only used to find erroneous data generated by the electronic acquisition system, since outliers usually provide a large source of information for the analysis of a dataset.
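The IQR thresholds and the clamp transformation of Equation (2) can be sketched in a few lines of Python. This is a minimal illustration on made-up sensor values (the quartile method and sample data are assumptions, not the paper's):

```python
import statistics

def iqr_thresholds(data):
    """Common clamp limits: lower = Q1 - 1.5*IQR, upper = Q3 + 1.5*IQR."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def clamp(data, lower, upper):
    """Equation (2): replace values outside [lower, upper] with the threshold."""
    return [lower if a < lower else upper if a > upper else a for a in data]

# Hypothetical temperature readings; 85.0 mimics an acquisition glitch.
samples = [20.1, 21.3, 19.8, 20.7, 21.0, 20.4, 85.0]
lo, hi = iqr_thresholds(samples)
cleaned = clamp(samples, lo, hi)  # the glitch is clamped to the upper limit
```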

### *4.2. Intelligent Models Evaluation*

This comparison describes each of the models used in the current project: a Support Vector Machine, a Naïve Bayes classifier, a Static Neural Network and a Convolutional Neural Network. Each model used the dataset of 11,755 samples, of which 80% were used for modeling and 20% for evaluation; in other words, 9404 samples for training and 2351 for evaluation. The following confusion matrices report the results of the validation, and a figure is included for each model to present the information clearly. It is worth mentioning that all models were trained and validated with the same 80-20 data division. The dataset also exhibits class imbalance, and because of this some models behaved unfavorably in cross-validation. To handle the imbalance it would be possible to apply techniques such as oversampling or undersampling, but altering the quality of the data was not desired; instead, the best-performing model is chosen and evaluated with 10-fold validation.
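The 80-20 division described above can be reproduced with a simple shuffled index split. This is a generic sketch (the seed and shuffling scheme are assumptions; the paper does not state how the split was randomized):

```python
import random

def split_indices(n_samples: int, train_frac: float = 0.8, seed: int = 42):
    """Shuffle sample indices and split them into training/evaluation sets."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    cut = int(n_samples * train_frac)
    return idx[:cut], idx[cut:]

# 11,755 samples -> 9404 for training and 2351 for evaluation, as in the text.
train_idx, test_idx = split_indices(11_755)
```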

### **Support Vector Machine**

SVMs belong to the category of linear classifiers, since they introduce linear separators, better known as hyperplanes, typically constructed in a space transformed from the original input space.

The most common implementation of multi-class SVM classification is probably the one-against-all method. The *m*-th SVM is trained with all the examples of the *m*-th class as positive labels and all the other examples as negative labels. Therefore, given the data (*x*1, *y*1), ...,(*xl*, *yl*), where *xi* ∈ *Rn*, *i* = 1, ..., *l* and *yi* ∈ {1, ..., *k*} is the class of *xi*, the *m*-th SVM solves the problem in Equation (3) [68]. This involves finding a hyperplane so that points of the same class are on the same side of the hyperplane, that is, finding *b* and *w* such that:

$$y\_i(w'x\_i + b) > 0, \; i = 1, \dots, N \tag{3}$$

Equation (4) looks for a hyper plane to ensure that the data are linearly separable.

$$\min\_{1 \le i \le N} y\_i(w'x\_i + b) \ge 1 \tag{4}$$

where *w* ∈ *Rd*, *b* ∈ R and the training data *xi* are mapped to a higher-dimensional space. Thus, it is possible to search among the various existing hyperplanes for the one whose distance to the nearest point is maximum, in other words, the optimum hyperplane [68], see Equation (5).

$$\min\_{w,b} \frac{1}{2} w^T w \quad \text{subject to } y\_i(w^T x\_i + b) \ge 1, \; \forall i \tag{5}$$

Given the above, we look for the hyperplane with the maximum margin between the samples of different classes in a higher dimension. As mentioned, the one-against-all formulation was used since this is a multi-class problem, and the kernel was linear. The modeling was performed and the confusion matrix was obtained with 20% of the data held out for evaluation. The accuracy of each class in comparison to the others can be observed in Table 8. The SVM was the model with the worst performance of the four evaluated, with an overall accuracy of 68.51%.
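The one-against-all decision rule can be made concrete with a small sketch: each class *m* owns a linear model (*wm*, *bm*), and a sample is assigned to the class whose decision function is largest. The weights below are illustrative stand-ins, not trained parameters:

```python
def decision(w, b, x):
    """Linear decision function w'x + b for one binary SVM."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def one_vs_all_predict(models, x):
    """models: list of (w, b) pairs, one per class; return the argmax class."""
    scores = [decision(w, b, x) for w, b in models]
    return scores.index(max(scores))

# Three hypothetical per-class linear models in a 2-D feature space.
models = [([1.0, 0.0], -0.5), ([0.0, 1.0], -0.5), ([-1.0, -1.0], 1.0)]
label = one_vs_all_predict(models, [2.0, 0.1])  # class 0 scores highest here
```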

**Table 8.** Confusion matrix SVM.


### **Naive Bayes Classifier**

A Gaussian NB classifier is proposed that is capable of predicting, from different descriptive characteristics, when an accident has occurred in a work environment; it is based on Bayes' theorem. Given the class variable *y* and the dependent feature vector *x*1 through *xn*, Bayes' theorem establishes the relationship in Equation (6) [69,70].

$$P(y \mid \mathbf{x\_1}, \dots, \mathbf{x\_n}) = \frac{P(y)P(\mathbf{x\_1}, \dots, \mathbf{x\_n} \mid y)}{P(\mathbf{x\_1}, \dots, \mathbf{x\_n})} \tag{6}$$

where, using the naive conditional independence assumption for all *i*, the relationship can be simplified as shown in Equation (7).

$$P(y \mid \mathbf{x}\_1, \dots, \mathbf{x}\_n) = \frac{P(y) \prod\_{i=1}^n P(\mathbf{x}\_i \mid y)}{P(\mathbf{x}\_1, \dots, \mathbf{x}\_n)} \tag{7}$$

where *<sup>P</sup>*(*<sup>x</sup>*1, ... , *xn*) is constant based on the input; the classification rule presented in Equation (8) can also be used.

$$P(y \mid \mathbf{x}\_1, \dots, \mathbf{x}\_n) \propto P(y) \prod\_{i=1}^n P(\mathbf{x}\_i \mid y)$$

$$\Downarrow$$

$$\hat{y} = \arg\max\_y P(y) \prod\_{i=1}^n P(\mathbf{x}\_i \mid y) \tag{8}$$

The decoupling of the class-conditional feature distributions means that each distribution can be independently estimated as a one-dimensional distribution, which in turn helps reduce the problems associated with high dimensionality. For a Gaussian NB classifier, the likelihood of the features is assumed to be Gaussian, see Equation (9).

$$P(\mathbf{x}\_i \mid y) = \frac{1}{\sqrt{2\pi\sigma\_y^2}} \exp\left(-\frac{(\mathbf{x}\_i - \mu\_y)^2}{2\sigma\_y^2}\right) \tag{9}$$

In other words, in order to use the NB classifier to group the different work circumstances that put the worker at risk, it is assumed that the presence or absence of a particular characteristic is unrelated to the presence or absence of any other characteristic, given the class variable. The confusion matrix of the NB is shown in Table 9, where the average accuracy was 78.26%.
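Equations (8) and (9) translate directly into code: fit per-class priors, means and variances, then pick the class maximizing the log-posterior. This is a minimal from-scratch sketch on toy two-feature data, not the paper's trained model:

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate per-class prior, feature means and variances (Equation (9))."""
    groups = defaultdict(list)
    for xi, yi in zip(X, y):
        groups[yi].append(xi)
    params, n = {}, len(y)
    for cls, rows in groups.items():
        cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in cols]
        varis = [sum((v - m) ** 2 for v in c) / len(c) + 1e-9  # variance floor
                 for c, m in zip(cols, means)]
        params[cls] = (len(rows) / n, means, varis)
    return params

def predict(params, x):
    """argmax over classes of log P(y) + sum_i log P(x_i | y), Equation (8)."""
    def log_post(prior, means, varis):
        s = math.log(prior)
        for xi, m, v in zip(x, means, varis):
            s += -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
        return s
    return max(params, key=lambda c: log_post(*params[c]))

# Toy data: two well-separated classes in a 2-D feature space.
X = [[1.0, 1.1], [0.9, 1.0], [5.0, 5.2], [5.1, 4.9]]
y = [0, 0, 1, 1]
params = fit_gaussian_nb(X, y)
```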


**Table 9.** Confusion matrix NB.

### **Static Neural Network**

Neural networks are simple models of the functioning of the nervous system. The basic processing units are the neurons, which are usually organized in layers. A neural network normally consists of three parts [71]: an input layer, which receives the data; one or more hidden layers, which process them; and an output layer, which delivers the result.
The units are typically connected with varying connection strengths (or weights). Input data are presented to the first layer, and values are propagated from each neuron to the neurons in the next layer; at the end, a result is emitted from the output layer. All the weights assigned to each layer are initially random, although a series of methods can be employed to optimize this phase, and at first the responses produced by the network will be far from the targets. The network learns through training [71]: data for which the result is known are repeatedly presented to the network, and the responses it provides are compared with the known results.

The use of a static NN is proposed in this research, trained with the classic Adam optimizer, and its performance has been compared with that of a CNN. The architecture of the NN is shown in Figure 7. It is a three-layer static model: the first layer contains five neurons corresponding to each of the five values obtained from the multisensory case; the hidden layer has 32 neurons with the ReLU activation function; and the output layer has 12 neurons representing the situations a worker may find themselves in, ranging from safe to risky. The last layer has a SoftMax activation function because this is a multi-class problem. The learning rate was 0.05 and the model was trained for 500 epochs. The proposed structure was reached by trial and error since, as is well known, establishing a neural network is more an art than a science; modifying the number of neurons in the hidden layer gave better results than adding further layers to the network. However, the CNN showed better results than the rest of the models with a predetermined structure (12 neurons in the hidden layer).

**Figure 7.** Proposed architecture, static neural network.
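The forward pass of the 5-32-12 architecture in Figure 7 can be sketched in plain Python. The weights below are random stand-ins for the trained parameters, so the output probabilities are illustrative only:

```python
import math
import random

random.seed(0)

def dense(x, rows, cols):
    """Fully connected layer with random stand-in weights and zero biases."""
    w = [[random.gauss(0, 0.1) for _ in range(cols)] for _ in range(rows)]
    return [sum(x[i] * w[i][j] for i in range(rows)) for j in range(cols)]

def relu(v):
    return [max(0.0, z) for z in v]

def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    e = [math.exp(z - m) for z in v]
    s = sum(e)
    return [z / s for z in e]

x = [0.2, 0.5, 0.1, 0.9, 0.3]           # five sensor inputs
hidden = relu(dense(x, 5, 32))           # hidden layer: 32 neurons, ReLU
probs = softmax(dense(hidden, 32, 12))   # output layer: 12 classes, SoftMax
```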

The results of the static NN are given in Table 10. Its performance was not very different from the NB classifier: an average accuracy of 78.56% was obtained.


**Table 10.** Confusion matrix NN.

### *4.3. Convolutional Neural Network*

A Convolutional Neural Network (CNN) was the selected model. It is a deep learning algorithm mainly used to work with images, in which it is possible to take an input image (instead of a single vector as in static NNs), assign importance (learnable weights and biases) to various aspects/objects of the image, and differentiate one from another [72]. The advantage of CNNs is their ability to learn these filters/characteristics. Given the above, we propose the use of a CNN to classify the data coming from the multisensorial helmet.

The proposed CNN's operation is illustrated in Figure 8. A CNN works by taking groups of neighboring pixels from the input image and operating on them mathematically with a small matrix called a kernel. Here, however, the image is replaced with the input vector of size 5, which is reshaped to obtain a vector of 5 × 1. The kernel proposed in the current CNN is therefore of size 1 and moves in 1 × 1 steps; with that size it covers all the input neurons and generates a new output matrix, which becomes the new layer of hidden neurons.

A CNN can capture the spatial and temporal dependencies in an image by applying relevant filters; the same applies to a data set that has been re-organized. The proposed architecture consists of an input layer for the transformed vector of size 5 × 1 × 1, two hidden convolutional layers for two-dimensional data (Conv2D) with ReLU activation functions and a total of 64 and 32 neurons respectively, and finally a layer with 12 output neurons with a SoftMax activation function for multi-class classification.
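The reshape and 1 × 1 convolution described above can be sketched in plain Python: the five-element sensor vector becomes a 5 × 1 × 1 volume, and each 1 × 1 kernel maps the channels at every position to a new channel. The random weights are stand-ins, not the trained filters:

```python
import random

random.seed(0)

def reshape_5x1x1(x):
    """Reshape a flat 5-vector to a (height=5, width=1, channels=1) volume."""
    return [[[v]] for v in x]

def conv1x1(volume, n_filters):
    """1x1 convolution: a per-position linear map over the channel axis."""
    in_ch = len(volume[0][0])
    kernels = [[random.gauss(0, 0.1) for _ in range(in_ch)]
               for _ in range(n_filters)]
    out = []
    for row in volume:
        out_row = []
        for pix in row:
            out_row.append([sum(k[c] * pix[c] for c in range(in_ch))
                            for k in kernels])
        out.append(out_row)
    return out

v = reshape_5x1x1([0.2, 0.5, 0.1, 0.9, 0.3])
feat = conv1x1(v, 64)      # first hidden layer: 5 x 1 x 64
feat2 = conv1x1(feat, 32)  # second hidden layer: 5 x 1 x 32
```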

**Figure 8.** Deep convolutional neural network operation.

The classic Adam optimizer was again used and the model was trained for 500 epochs; the parameters were the same for the static NN and the CNN so as to have an objective basis for their evaluation. The following are the results of the AI models used for implementation in the multisensorial helmet. Table 11 shows the evaluation for the CNN, where an overall accuracy of 92.05% was achieved.
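The confusion matrices and overall accuracies reported throughout this section follow the standard construction: rows are true labels, columns are predictions, and overall accuracy is the trace divided by the sample count. A minimal sketch on toy labels (not the paper's predictions):

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Build a confusion matrix: rows = true labels, columns = predictions."""
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

def overall_accuracy(m):
    """Trace of the matrix divided by the total number of samples."""
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return correct / total

# Toy example with 3 classes: 4 of the 6 predictions are correct.
m = confusion_matrix([0, 0, 1, 2, 2, 2], [0, 1, 1, 2, 2, 0], 3)
acc = overall_accuracy(m)  # 4/6
```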


