#### *2.3. Feature Engineering*

Feature engineering is the process of extracting, from the raw information, attributes that improve the machine learning model, using data mining techniques [28]. The features must have an appropriate configuration. Therefore, these features are created from the previously acquired data, and their inputs are transformed and prepared for the machine learning (ML) model. Synthetic features that do not originally exist in the dataset are also created from the available information, allowing the model to perform better.

The features have to comply with a series of properties in order to be suitable for use in the model. Each attribute must be related to the objective of the application, i.e., it must be relevant to the output that our model is looking for. In addition, it must be certain that the value of that feature will be known at the time of prediction; otherwise, the model will not work correctly. The value of the features must be numerical and must represent a magnitude. This is because, for example, a neural network is no more than a machine that performs arithmetic, trigonometric, and algebraic operations on the input variables [29]. Therefore, features that are not numerical but are required will have to be converted or encoded before being introduced into the model. Finally, it is also necessary to have enough examples of the feature values to train the model correctly. The above rules are used to extract the appropriate features for the application and to generate a dataset that will be used to train and to validate the desired model. The transformations to be performed on the available signals in order to extract the attributes depend on the final objective of the application. Therefore, once the problem to be faced has been defined, the signals must be analyzed and those that will be useful for the model must be selected. Subsequently, the necessary modifications will be made.
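As an illustration of the encoding step mentioned above, the following sketch shows one common way to convert a non-numerical attribute into numerical columns (one-hot encoding). The feature values used here are purely illustrative and do not come from the platform's dataset:

```python
# Minimal sketch of one-hot encoding a categorical feature before
# feeding it to a model. The example values are illustrative only.

def one_hot_encode(values):
    """Map each categorical value to a binary indicator vector."""
    categories = sorted(set(values))             # stable column order
    index = {c: i for i, c in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1                        # mark the matching category
        encoded.append(row)
    return categories, encoded

# Example: a categorical "workstation" feature becomes numeric columns.
categories, encoded = one_hot_encode(["A", "B", "A", "A"])
print(categories)   # ['A', 'B']
print(encoded[1])   # [0, 1]
```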

The final aim of this stage is to generate a suitable dataset to train the model. As mentioned above, this stage depends entirely on the results obtained from the model. Therefore, it is directly related to the outcome of model training and could be placed in parallel with the next stage. In most cases, it will be necessary to modify the dataset with new information or to change the existing features.

#### *2.4. Training and Generation of a Custom Model*

Using the feature dataset generated from the information collected by the data acquisition system, a model customized for the user and the desired application is trained. For this purpose, appropriate machine learning algorithms are chosen for the specific application, i.e., the process is particularized according to the information that will be predicted in the future. The training and evaluation of the model are performed cyclically until optimal and sufficiently accurate behavior is found. The dataset is modified using the information obtained during training to improve the model accuracy. Finally, the model is validated with a part of the dataset reserved for testing. This allows checking the predictive behavior of the trained model, analyzing possible lines of improvement in the process, and verifying the accuracy for new inputs to the platform.

Once the training, validation, and testing processes of the model are completed, the trained model can be used to predict information from new inputs to the platform taken with the acquisition system. The model can be hosted on a client that runs continuously and can provide real-time feedback to the user.

The first step to be performed before starting the training is the split of the dataset into three parts. A diagram of this split is shown in Figure 6. The first fragment is the training data (training set). The training set is the data used by the model to learn how to process the information. From these data, the model adjusts the parameters of the classifier or of the algorithm used in the model. Using the training set, different machine learning algorithms can be evaluated to generate the model, and the results obtained can be compared to find the most appropriate one for the application. The training set contains most of the data from the main dataset. The validation set is used to estimate how well the model has been trained. This can be done between periodic training cycles and when the training process has been completed. It is also used to estimate model properties, such as the error in a classifier or the precision and recall in binary models. Cross-validation is the most commonly used method for this task [30]. Finally, there is the test set. It is used only to evaluate the performance of the model after the training process has been completed. It can be considered a mock production use of the model.

**Figure 6.** Workflow of dataset split and cross validation process.
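The three-way split described above can be sketched as follows. The split fractions and the dataset size are illustrative assumptions, not the ones used in the experiment:

```python
import random

def split_dataset(rows, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle and split rows into training, validation and test sets.
    The test set is the remainder and is used only once, after training."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)        # reproducible shuffle
    n = len(rows)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]
    return train, val, test

train, val, test = split_dataset(range(1000))
print(len(train), len(val), len(test))   # 700 150 150
```

In a cross-validation setting, the training and validation portions would additionally be re-partitioned into k folds, with each fold serving once as the validation set.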

#### *2.5. Creation of a People Concentration Model in the Workstation*

The system presented in this paper can be used for multiple pathologies and applications. The acquisition system and the database make it easy to adapt or integrate new sensors. The signal processing, the feature analysis, and the machine learning algorithms used to generate the model are also defined according to the objective. To show the workflow of the platform, an experiment to validate the system is performed. The test consists of generating a personalized model of the concentration of a person at two different workstations within the same research laboratory. One of the places is located at the entrance of the laboratory, next to the door; any person who needs to enter or leave the laboratory has to pass by this position, which we will call Position A. The second position (Position B) is located in an isolated room within the same laboratory.

The final goal of the model is to predict at which times the person is focused on the tasks being performed and at which times the concentration is reduced, for example, due to an interruption or stimuli around the workstation. This will provide information on which of the positions is the most suitable to work at. Although this application is not directly related to assisting people with neurological disorders, it illustrates the versatility that the system can offer as well as the working methodology of the platform. In addition, a similar model could be proposed for a person with a neurological disorder, making it possible to analyze how the user adapts to a specific work environment. The two possible situations are labeled as *"Focused"* and *"Distracted"*, and the output of the predictive model will be one of these two options.

First, it is necessary to define what will be considered *"Focused"* and *"Distracted"*. For this example, a first supervised stage has been proposed in which data are collected from the person and the environment at both workstations while performing a task that allows the person's concentration to be quantified. This task consists of reading an entertainment book. The reading of technical or scientific documents, which a priori could be more closely related to the person's actual work, has been discarded because the reading pace may not be constant due to the complexity of certain parts of the document. During this supervised period, the number of lines per minute read by the user is counted. In this way, a range is established in which the person is considered focused on the task; below that range, the person is considered distracted. The supervised task is necessary to establish the user's concentration level in a quantitative way. The model needs this information to learn in which situations the user is focused on the task.
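The labeling rule above can be sketched as a simple threshold check. The numeric bound of the focused range below is a hypothetical value chosen for illustration; the paper does not report the actual range:

```python
# Illustrative labeling rule for the supervised stage: the reading pace
# (lines per minute) is compared against a focused range. The threshold
# value is an assumption, not taken from the experiment.

FOCUSED_MIN_LPM = 20.0   # hypothetical lower bound of the focused range

def label_sample(lines_per_minute):
    """Return the supervised label for a one-minute reading window."""
    return "Focused" if lines_per_minute >= FOCUSED_MIN_LPM else "Distracted"

print(label_sample(25.0))  # Focused
print(label_sample(12.0))  # Distracted
```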

Once the supervised stage is completed, the necessary information is available to generate the new model, always based on the previously established definition of concentration. The second part of the experiment corresponds to the unsupervised collection of new data at both workstations, where the user performs the usual tasks of his job. The new information collected is used to estimate the person's concentration levels at both workstations using the model generated from the supervised data.

For the supervised stage, a data acquisition period of 180 min has been established for each of the positions (A and B), i.e., a total of 6 h. To make the supervised experiment more user-friendly, this time has been divided into three one-hour periods, taken on consecutive days. Subsequently, for the unsupervised stage, a total of 8 h of data have been collected: 4 h for each position, captured in 2-h periods on consecutive days.

The next step is to analyze the available signals and decide which data can provide relevant information to the model. As the experiment is performed in a controlled environment, the weather conditions remain constant over time at both workstations. Therefore, ambient temperature, relative humidity, and atmospheric pressure do not provide information that helps the model predict the concentration level, and they are discarded. Similarly, the ambient luminosity can be discarded, as its values also remain constant over time and are of the same magnitude at both Position A and Position B. Finally, the magnetometer signal of the MA sensor is also discarded because it is not necessary to know the user's position. Thus, the heart rate and body temperature signals of the user, the gyroscope and accelerometer signals of the MA sensor, the ambient sound, and the signals of the video device (optical flow and people detection) remain.

Once the signals to be used have been decided, preprocessing is performed before generating the dataset. The different transformations applied to each of them are described below.

• **Sound:** The audio spectrum is available divided into frequency bands. For the desired model, the information isolated by frequency is not needed. Thus, a feature containing the accumulated energy value across all frequency bands is generated, using Equation (1):

$$Sound_{total} = 10 \log \left( \sum_{i=1}^{n} 10^{\frac{Sound_i}{10}} \right) \tag{1}$$
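Equation (1) can be implemented directly: each band level in decibels is converted back to a linear power, the powers are summed, and the sum is converted back to decibels (a base-10 logarithm is assumed, as is usual for sound levels):

```python
import math

def total_sound_level(band_levels_db):
    """Accumulate per-band sound levels (dB) into a single total level,
    following Equation (1): convert each band to linear power, sum the
    powers, and convert the sum back to decibels."""
    linear_sum = sum(10 ** (level / 10.0) for level in band_levels_db)
    return 10.0 * math.log10(linear_sum)

# Sanity check: two equal bands add about 3 dB (a doubling of power).
print(round(total_sound_level([60.0, 60.0]), 2))  # 63.01
```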


Finally, the signals of all the generated features are normalized, excluding the people count. A fragment of these signals can be seen in Figure 1 in Appendix A. Once the features have been established and the different transformations carried out, the dataset is generated. For this purpose, a time vector is created in which the time values of all the features are chronologically ordered, each together with an identifier indicating the feature to which it belongs. In addition, an index is assigned to each feature, indicating the last entry added to the dataset. The time vector is then traversed, adding to the dataset the values of the different features at the position indicated by each of their indexes. Each merge generates an entry in the dataset. Once the entire time vector has been traversed, the dataset is complete with all the generated information. This process is shown schematically in Figure 7.

**Figure 7.** Diagram of the dataset generation process.
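A minimal sketch of this merging step follows: feature streams sampled at different times are combined into dataset rows by walking a single chronologically ordered time vector while remembering, for each feature, its most recent value. The stream names and sample values are illustrative only:

```python
# Sketch of the dataset generation process of Figure 7. Each feature
# stream is a list of (timestamp, value) pairs sorted by time.

def merge_streams(streams):
    """streams: dict mapping feature name -> list of (timestamp, value).
    Returns one dataset row per event in the global time vector."""
    # Build the global time vector: (timestamp, feature, value) triples.
    events = sorted(
        (t, name, v) for name, samples in streams.items() for t, v in samples
    )
    last = {name: None for name in streams}   # most recent value per feature
    rows = []
    for t, name, v in events:
        last[name] = v                        # advance this feature's index
        rows.append({"time": t, **last})      # snapshot of current values
    return rows

rows = merge_streams({
    "heart_rate": [(0, 72), (2, 75)],
    "sound": [(1, 55.0)],
})
print(rows[1])  # {'time': 1, 'heart_rate': 72, 'sound': 55.0}
```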

Using the described method, both datasets (supervised set and unsupervised set) are generated for our model with the data collected in the experiment. Table 1 shows the distribution of both. Thus, joining the records from both positions (A and B), we have a dataset for training and validation of the model with 17,135 entries, 34.08% of which are labeled as *"Distracted"*.



For training and validation of the model, the dataset has been split 80–20%. Among the different algorithms checked, a sequential neural network model has been selected due to its performance with the available dataset. The neural network architecture has also been adjusted, starting from the simplest, to obtain adequate performance without increasing the inference time excessively. The selected neural network consists of the input layer, two hidden layers with 120 and 80 nodes with *"relu"* activation, and the output layer with 2 nodes and *"softmax"* activation. A schematic of the model structure is shown in Figure 8. The parameters used for training the model are as follows.

> { *Optimizer* = *Adam*, *Loss function* = *Sparse Categorical Crossentropy*, *Batch size* = 10, *Number of epochs* = 500 }

**Figure 8.** Architecture of the model.
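The described architecture and training parameters could be expressed in Keras roughly as follows. This is a sketch under assumptions: the input size of 7 features is taken from the discussion of the example, TensorFlow/Keras is assumed as the framework (the paper does not name one), and `X_train` / `y_train` are placeholders for the generated dataset:

```python
from tensorflow import keras

# Sequential network as described: two hidden "relu" layers of 120 and
# 80 nodes, and a 2-node "softmax" output (Focused / Distracted).
model = keras.Sequential([
    keras.Input(shape=(7,)),                      # one input per feature
    keras.layers.Dense(120, activation="relu"),
    keras.layers.Dense(80, activation="relu"),
    keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Training with the listed parameters (placeholders for the dataset):
# model.fit(X_train, y_train, batch_size=10, epochs=500)
```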

#### **3. Results**

It should not be forgotten that the final aim of the experiment is to generate a model of the concentration of a person at two different workstations within the same laboratory. This section presents the results obtained from both the training of the model and the experiment itself.

The results obtained from the model training are summarized in Table 2. The Receiver Operating Characteristic (ROC) curve and its respective Area Under the Curve (AUC) are included in Figure 9. The performance of the model is very good: it is able to recognize which dataset inputs have been defined as *"Focused"* and *"Distracted"* with high accuracy.

**Table 2.** Report of the model performance.


**Figure 9.** Receiver Operating Characteristic (ROC) curve of the model training result.
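For reference, an ROC curve and its AUC, as reported in Figure 9, can be computed from model scores as sketched below. The labels and scores are toy values chosen for illustration, not the experiment's data:

```python
from sklearn.metrics import roc_curve, roc_auc_score

y_true = [0, 0, 1, 1]               # toy ground-truth labels
y_score = [0.1, 0.4, 0.35, 0.8]     # model probability for the positive class

fpr, tpr, thresholds = roc_curve(y_true, y_score)   # points of the ROC curve
auc = roc_auc_score(y_true, y_score)                # area under that curve
print(round(auc, 2))  # 0.75
```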

Once the model has been generated and its performance evaluated, the unsupervised dataset is used to identify each input as a *"Focused"* or *"Distracted"* situation using the model. The results obtained are shown in Figure 10, where a comparative graph between both positions (A and B) can be seen.

**Figure 10.** Obtained distribution of the predictions of the unsupervised data.

#### **4. Discussion**

A model with successful results has been trained using only seven features generated from all the signals integrated in the platform. As can be seen in Table 2, the error in the model evaluation is very low. The AUC is very close to 1 (Figure 9), which means that the results obtained in the evaluation are almost perfect.

Analyzing the predictions generated by the model from the unsupervised set (Figure 10), the total concentration time in Position B is higher. There is a notable difference between the time the user remains focused on his tasks in Position A (51.34%) and Position B, where more than 90% of the time has been identified as *"Focused"*. In addition, the concentration level changed about 50% more often in Position A (39 changes) than in Position B (25), which implies that the concentration periods between distractions are shorter. Therefore, as common sense would suggest, Position B seems to be the better workstation.

Regarding the performance of the initial stages of the platform, the data acquisition system has made it possible to obtain good-quality signals from all devices, and the storage protocol has worked correctly. The analysis of the signals and the processing carried out to extract the features have been customized for the application presented as an example of use. However, the platform presented in this work has demonstrated high versatility when it comes to generating new information that can help people with neurological disorders.

#### **5. Conclusions**

This work aims to present an intelligent platform that can provide useful information about people with neurological pathologies as an assistive technology. The novelty of this work is the acquisition of physiological and environmental signals for the generation of predictive models using machine learning algorithms. Throughout the paper, the different stages that make up the system have been described. Finally, an example of the use of the platform is presented, which allows a more detailed description of each of the steps taken until the desired model is obtained. The application proposed in the experiment is not directly related to assisting people with neurological disorders; however, it has made it possible to describe the work at each stage of the platform. In addition, the presented example could be transformed into a real case with a user with a neurological disorder. For example, the platform could be used to measure the user's level of adaptation to a particular job.

This paper attempts to show the versatility offered by the presented system. The modular design allows different sensors to be integrated or adapted. The analysis and processing of the signals will be defined according to the objective for which the platform is to be used. Likewise, the machine learning algorithms used will depend on the model to be obtained. However, to generate a real predictive model that helps to manage problems arising from a neurological disorder, it is necessary to collect a much larger amount of information than that used for the example case. In addition, the feature engineering that must be developed to obtain the appropriate features for the model is also very complex. Therefore, the system has been presented in a generic way, together with a simple example to show the workflow.

Currently, the system is in use for the development of personalized models of people with ASD, one for each user. Each model should predict behavioral changes in the user due to environmental stimuli, caused by a sensory processing problem derived from their pathology. Data collection is being carried out in a clinical setting. Another possible use could be to obtain information about the stress level of a person with reduced mobility; the information provided by the platform could be used to modify the user's posture, alert the caregiver, etc. Other applications could include an intelligent feedback system during rehabilitation activity sessions, or use in Sleep Disorders Units.

**Author Contributions:** Conceptualization, J.M.V.-S. and J.M.S.-N.; methodology, J.M.V.-S. and J.M.S.-N.; software, J.M.V.-S.; validation, J.M.S.-N., E.A.-N., and J.M.V.-S.; formal analysis, J.M.V.-S., V.E., and E.A.-N.; investigation, J.M.V.-S.; resources, J.M.V.-S. and E.A.-N.; data curation, J.M.V.-S.; writing—original draft preparation, J.M.V.-S.; writing—review and editing, J.M.S.-N., E.A.-N., V.E., and J.M.V.-S.; visualization, J.M.V.-S.; supervision, J.M.S.-N., E.A.-N., and J.M.V.-S.; project administration, J.M.S.-N. and J.M.V.-S.; funding acquisition, J.M.S.-N. and E.A.-N. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was partially funded by Spanish Research State Agency and European Regional Development Fund through "Race" Project (PID2019-111023RB-C32). The work of J.M.V.-S. is supported by the Conselleria d'Educació, Investigació, Cultura i Esport (GVA) through FD-GENT/2018/015 project.

**Institutional Review Board Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

**Appendix A**

**Figure 1.** Fragment of the selected signals as features during the experimentation.

#### **References**


MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Applied Sciences* Editorial Office E-mail: applsci@mdpi.com www.mdpi.com/journal/applsci


ISBN 978-3-0365-3678-1