#### *6.1. Overview of Results*

We begin by providing a high-level view of our system's performance over the range of different parameter values. The first metric we use for our evaluation is the *F*<sup>1</sup> score [51], which takes into account both precision and recall and is robust to class imbalance. It is defined as:

$$F_1 = 2 \times \frac{precision \times recall}{precision + recall} \tag{1}$$

where *F*<sup>1</sup> ∈ [0, 1], *precision* = *tp*/(*tp* + *fp*), *recall* = *tp*/(*tp* + *fn*), *tp* = true positives, *fp* = false positives and *fn* = false negatives. A value of *F*<sup>1</sup> close to one indicates the best classification performance.

Figures 7 and 8 present the *F*<sup>1</sup> score performance of our system. We must note that we have calculated the weighted average of the *F*<sup>1</sup> score over all activities, weighted by the number of true instances for each class, for different window sizes, feature types and classification models.
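As an illustration, the support-weighted average of Equation (1) can be computed directly from the true and predicted labels. The sketch below is a minimal pure-Python illustration; the function name and its list-based inputs are our own, not part of the evaluated system:

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (Equation (1)).

    Each class's F1 is weighted by the number of true instances of
    that class, as described for Figures 7 and 8.
    """
    classes = sorted(set(y_true))
    support = Counter(y_true)          # true instances per class
    total = 0.0
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        total += support[c] * f1       # weight by class support
    return total / len(y_true)
```

The same quantity is what scikit-learn reports as `f1_score(..., average='weighted')`.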

More specifically, in Figure 7a, we illustrate our system's performance when using the first feature type without beacon feature fusion. We can observe that LR performs considerably worse than the other three classifiers: KNN, RF and SVM are all able to achieve a maximum *F*<sup>1</sup> score of 0.8 for a window size of 4 s, while LR achieves an *F*<sup>1</sup> score of 0.7 for the same window size. Increasing the window size improves the classification performance; however, exceeding a size of 4 s does not yield further improvement. The same performance pattern can be seen in Figure 7b, where there is a significant performance gap between LR and the rest of the classifiers. Both figures indicate that using a higher dimensional feature space for the smart watch data (Feature Type 2) improves the performance of all classifiers.
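For reference, the Feature Type 2 statistics (mean, standard deviation, mean crossing rate, maximum and minimum, as listed in the Figure 7 and 8 captions) could be computed per sensor axis as in the sketch below; the exact mean-crossing-rate convention is an assumption on our part:

```python
import statistics

def feature_type_2(window):
    """Wearable Feature Type 2 for one axis of a sensor window:
    mean, standard deviation, mean crossing rate, maximum, minimum.
    The crossing-rate normalisation used here is an assumed convention.
    """
    mean = statistics.fmean(window)
    # count sign changes of the mean-centred signal between
    # consecutive samples
    crossings = sum(
        1 for a, b in zip(window, window[1:])
        if (a - mean) * (b - mean) < 0
    )
    return [
        mean,
        statistics.pstdev(window),
        crossings / (len(window) - 1),   # rate over sample transitions
        max(window),
        min(window),
    ]
```

Feature Type 1 would keep only the first two entries of this vector.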

Figure 8a,b presents our system's performance when using BLE beacon data in conjunction with smart watch data. It is evident that there is a significant enhancement in the system's performance, as illustrated by the improved *F*<sup>1</sup> scores for all classifiers. We should note that when using location-enhancement, all classifiers except LR are able to achieve *F*<sup>1</sup> scores above 0.9, even for the smallest window size of 1 s. As a small window size improves our system's response time (less time required to recognise the performed activity), this result highlights the benefit of using beacon feature fusion in our activity recognition system. We can also observe a performance pattern similar to the case where no beacon data are used, both with respect to window size and to the gap between LR and the rest of the classifiers. However, we should note that there is now a clearer distinction among the classifiers in terms of performance. SVM achieves the best *F*<sup>1</sup> score for all experimental configurations, followed by KNN and RF, respectively.
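The beacon feature fusion step can be sketched as a concatenation of wearable features with per-beacon RSSI features averaged over the window. The helper names, the dictionary-based RSSI representation and the out-of-range floor value below are illustrative assumptions, not the paper's exact implementation:

```python
def beacon_features(rssi_window, beacon_ids, floor=-100.0):
    """Average RSSI per beacon over a window of scan samples.

    rssi_window: list of {beacon_id: rssi_dBm} dicts, one per scan.
    Beacons not heard during the window fall back to a floor value
    (an assumed stand-in for 'out of range').
    """
    feats = []
    for b in beacon_ids:
        readings = [scan[b] for scan in rssi_window if b in scan]
        feats.append(sum(readings) / len(readings) if readings else floor)
    return feats

def fuse(watch_features, rssi_window, beacon_ids):
    """Beacon feature fusion: concatenate wearable and beacon features."""
    return list(watch_features) + beacon_features(rssi_window, beacon_ids)
```

The fused vector is then fed to the classifiers exactly like the watch-only feature vector, just with a higher dimensionality.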

**Figure 7.** Activity recognition system performance without beacon data: weighted average of *F*<sup>1</sup> *score* over all activities, for different window sizes, feature types and classification models. (**a**) Wearable Feature Type 1 (mean, standard deviation); (**b**) Wearable Feature Type 2 (mean, standard deviation, mean crossing rate, maximum and minimum).

**Figure 8.** Activity recognition system performance with beacon data: weighted average of *F*<sup>1</sup> *score* over all activities, for different window sizes, feature types and classification models. (**a**) Wearable Feature Type 1 (mean, standard deviation); (**b**) Wearable Feature Type 2 (mean, standard deviation, mean crossing rate, maximum and minimum).

An overview of activity-specific classification performance across all experimental configurations is illustrated in Figure 9. We can again confirm that the performance pattern observed in Figures 7 and 8 is present: LR has a consistently worse performance compared to the other classifiers; increasing the smart watch feature dimensionality improves classification performance; and beacon feature fusion significantly enhances classification performance for all activities and classifiers.



**Figure 9.** *F*<sup>1</sup> Scores for all classifiers and activities, for a window size of 3 s. (**a**) KNN; (**b**) LR; (**c**) RF; (**d**) SVM.

#### *6.2. Evaluation of Individual Activities*

The observations of Section 6.1 have informed the choice of system parameters that are investigated here, where we elaborate on our system's performance for individual activities. Based on these observations, window sizes larger than 4 s do not significantly benefit the system's performance. Furthermore, there is a clear performance gain when using the second feature type for the smart watch data. Thus, we analyse individual activity classification for window sizes up to 4 s when using the second smart watch feature type, with and without beacon feature fusion. We will refer to the activities with the codes assigned in Table 2.

To better illustrate our system's performance, we present our results for each classifier using a normalised confusion matrix. A row of the matrix represents the instances in an actual class, while a column represents the instances in a predicted class. The diagonal elements represent the number of instances where the predicted label is equal to the true label; off-diagonal elements represent instances that are misclassified. Furthermore, we have normalised the confusion matrices by the number of elements in each actual class. In the case of class imbalance, this approach better illustrates which classes are being misclassified. We have also colour-coded the matrices by assigning black to 1.0 (100%) and white to 0.0 (0%). Finally, we should emphasise that the evaluation results have been calculated using only the test set data (20% of the original dataset), which the classifiers have never seen before, in order to provide a more reliable estimation of their out-of-sample error.
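The row normalisation described above can be sketched as follows (a minimal NumPy illustration assuming integer class labels):

```python
import numpy as np

def normalised_confusion(y_true, y_pred, n_classes):
    """Row-normalised confusion matrix.

    Rows are actual classes, columns are predicted classes; each
    non-empty row is divided by its class count, so a diagonal entry
    of 1.0 means every instance of that class was classified correctly.
    """
    cm = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    row_sums = cm.sum(axis=1, keepdims=True)
    # guard empty rows against division by zero
    return cm / np.where(row_sums == 0, 1, row_sums)
```

Plotting such a matrix with a greyscale colour map (black = 1.0, white = 0.0) reproduces the visual convention used in Figures 10-17.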

As shown in Section 6.1, the LR classifier results in the lowest classification performance in all experimental configurations. Figures 10 and 11 confirm this observation for individual activities. More specifically, for activities A1, A2, A3 and A6, the LR classifier without beacon information is able to achieve a classification accuracy that increases with window size and manages to reach 80%. Adding beacon information does not significantly change the performance of the classifier for activities A1, A2 and A6. However, A3 benefits significantly and reaches 100% accuracy for a window value of 4 s. Looking at Table 2, we can see that A3 takes place in Sector 1, while A1, A2 and A6 do not. This is beneficial for the classification and allows the LR classifier to better distinguish between the activities. We must note that, although A6 can also take place in Sector 1, the actual micro-location within this sector is different (scanning and installing take place in subtly different locations along the bench). This is sufficiently different for the classifier to improve its performance. Looking at activities A4, A5, A7 and A8, we observe that LR gives poor classification performance without beacon data. For example, A4 is misclassified as A5, with more than 50% of examples classified incorrectly. More specifically, we can note that the activities of patching and relocating both involve translational hand movement while grasping an object (a cable or a piece of equipment). This can be confirmed by Figure 6d,e.


**Figure 10.** Normalised confusion matrices for logistic regression, with Wearable Feature Type 2, without beacon data. (**a**) C = 10, w = 1 s; (**b**) C = 100, w = 2 s; (**c**) C = 10, w = 3 s; (**d**) C = 10, w = 4 s.

To further explain this, we must note that each of the complex activities that we aim to classify can be decomposed into a set of simpler activities with varying time durations. For example, patching the routers requires grasping a network cable, moving it towards the respective socket and pushing the cable until it is securely connected to the socket. Similarly, changing the printer cartridges requires pulling the cartridge out of the printer slot and then pushing the new cartridge into the printer slot. During the training phase, this activity structure is taken into account in a straightforward manner, simply by applying the same label to all windowed data collected for one activity. This is done automatically by our data gathering application while the participant performs an activity. In the classification phase, the performance of our system depends on the similarity between the complex activities, which can be expressed in terms of the similarity among the simple activities of which two complex activities are composed.
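The labelling step described above, in which every window cut from one recording inherits that recording's activity label, can be sketched as follows (the function name and data layout are our own illustration):

```python
def label_windows(samples, activity_label, window, step=None):
    """Split a stream of sensor samples into fixed-size windows and
    attach the same activity label to each window, mirroring the
    automatic labelling done by the data gathering application.
    """
    step = step or window          # non-overlapping windows by default
    return [
        (samples[i:i + window], activity_label)
        for i in range(0, len(samples) - window + 1, step)
    ]
```

Because every window of a recording carries one label, windows that happen to cover only a shared simple sub-activity (e.g., the grasping common to patching and relocating) look alike to the classifier, which is the source of confusion discussed here.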

We can also observe that activity A8 is misclassified as A5. These activities are again similar in nature: the hand movements involve inserting an object (a cable or a cartridge) into a slot (an Ethernet port or a printer cartridge bay), as can be observed in Figure 6e,h. Adding beacon information drastically improves results for A4 and A8. Looking at Table 2, we can see that the locations of these activities are distinct compared to the rest of the activities, and beacon information helps the classifier discriminate the relevant data points. For example, A8 is no longer misclassified as A5. Although A5 and A8 can both be performed in Sector 2, the locations of the printers within this sector are, as one would expect, distinct from the locations where patching takes place.

Activities A5 and A7 also benefit from beacon data, but to a lesser degree. For example, A5 is still misclassified as A2 for more than 10% of the data. This is due to the fact that both activities take place in the same sector. Although this does not mean that their locations are exactly the same (in which case there would be no benefit from additional location information), they are not sufficiently different to result in greater classification improvement. We must also highlight the fact that increasing the window size improves the performance of the LR classifier by a small degree. As a small window size results in a more responsive activity recognition system (less waiting time to construct a data point), it is evident that LR suffers in that respect, since for a window size of 1 s the results are poor.


**Figure 11.** Normalised confusion matrices for logistic regression, with Wearable Feature Type 2, with beacon data. (**a**) C = 10, w = 1 s; (**b**) C = 10, w = 2 s; (**c**) C = 10, w = 3 s; (**d**) C = 100, w = 4 s.

Figures 12 and 13 illustrate the performance of the KNN classifier. We can observe that activities A1, A2, A3 and A6 are classified with over 90% accuracy without beacons, an improvement in performance compared to the LR classifier. Adding beacon data further improves performance, as expected. KNN benefits significantly from increasing the window size. This is shown for activities A4, A5 and A7, where for a 1-s window the accuracies are below 60%. However, when the window size increases, they all reach an accuracy close to 75%, without beacon data. Adding beacon data further improves the performance of these activities. We must again note that, although these activities are performed in common sectors, the micro-locations inside each sector are different. For example, activity A7 (assembling the robot) and activity A5 (patching the router) take place on different parts of the workbenches inside the sectors. The KNN classifier can take advantage of this information to improve performance, something that the LR classifier could not achieve to the same degree. Finally, the benefit of location information is clearly shown in the case of activity A8. Without location information, the best accuracy obtained is 62%. With location information, it reaches 99% for the same window size (3 s).

As seen in Figures 14 and 15, the RF classifier exhibits a performance similar to KNN in terms of being able to take advantage of micro-location information. We can again confirm that adding beacon information drastically improves the classifier's performance. More specifically, the classification accuracy for activities A4 and A8 (for their optimal window sizes) increases from 76% and 67%, respectively, without beacon data to 98% for both activities when beacon features are fused with smart watch features. However, the RF classifier cannot fully take advantage of the feature fusion in the case of activities A5 and A7: it cannot achieve accuracy above 93% and 91%, respectively, even when we use beacon information, and it only manages this for the maximum window size of 4 s.


**Figure 12.** Normalised confusion matrices for KNN, with Wearable Feature Type 2, without beacon data. (**a**) *n* = 9, w = 1 s; (**b**) *n* = 10, w = 2 s; (**c**) *n* = 8, w = 3 s; (**d**) *n* = 5, w = 4 s.



**Figure 13.** Normalised confusion matrices for KNN, with Wearable Feature Type 2, with beacon data. (**a**) *n* = 3, w = 1 s; (**b**) *n* = 3, w = 2 s; (**c**) *n* = 3, w = 3 s; (**d**) *n* = 3, w = 4 s.

Figures 16 and 17 illustrate that SVM has the optimal recognition performance and also benefits the most from beacon information. More specifically, classification accuracy for activities A1, A2, A3 and A6 is above 85% without beacon data. This is further improved, as expected, when beacon information is used, and reaches a classification accuracy of over 95%. Correctly classifying activities A4 and A5 proves more challenging since, as we explained above, both activities involve similar translational hand movement. However, SVM is the only classifier that reaches above 60% accuracy without beacon data for window sizes greater than 1 s. Adding beacon data increases the classification performance to near perfect accuracy levels. Furthermore, although SVM has a classification accuracy similar to that of the other classifiers for activity A7 without beacon information, it outperforms them with beacon information and reaches 97% accuracy. Looking more closely at the confusion matrices, we observe that activity A7 proves one of the most challenging activities to classify accurately for the other classifiers, even with beacon data. Although activity A7 takes place in the same set of sectors as activities A2 and A5, the micro-locations inside each sector are different (i.e., location along a workbench). SVM can use this micro-location information, revealed by the beacon data, better than other classifiers, and this results in higher classification accuracy. The same behaviour is observed for activity A8: when using information solely from smart watches, classification accuracy does not reach a level above 65%. Adding beacon information results in perfect accuracy for most window sizes.





**Figure 14.** Normalised confusion matrices for random forest, with Wearable Feature Type 2, without beacon data. (**a**) *n* = 49, w = 1 s; (**b**) *n* = 50, w = 2 s; (**c**) *n* = 43, w = 3 s; (**d**) *n* = 45, w = 4 s.

*Sensors* **2017**, *17*, x 21 of 26


**Figure 15.** Normalised confusion matrices for random forest, with Wearable Feature Type 2, with beacon data. (**a**) *n* = 48, w = 1 s; (**b**) *n* = 46, w = 2 s; (**c**) *n* = 42, w = 3 s; (**d**) *n* = 41, w = 4 s.



As a general note, we should highlight the fact that LR is a linear classifier, while KNN, RF and SVM are non-linear classifiers. When adding beacon data and increasing the dimensionality of our feature space, the data become non-linearly separable, and LR is not able to take advantage of the additional information. This results in worse classification performance compared to the other classifiers. We can also confirm this by inspecting Figures 7a and 8a, where the gap in the average *F*<sup>1</sup> score between LR and the other classifiers increases from 0.1 (without beacon data) to 0.15 (with beacon data).
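The limitation of a linear decision boundary can be illustrated with the classic XOR configuration: no single hyperplane separates the two classes, while a distance-based classifier such as KNN handles them trivially. This toy sketch is our own analogy, not the paper's actual feature space:

```python
def nearest_neighbour(train, query):
    """1-NN prediction: return the label of the training point closest
    to the query (squared Euclidean distance)."""
    closest = min(
        train,
        key=lambda pt: sum((a - b) ** 2 for a, b in zip(pt[0], query)),
    )
    return closest[1]

# XOR-style labels: the class depends on the two coordinates jointly,
# so no linear function w.x + b can separate the classes, yet a
# nearest-neighbour rule classifies every point correctly.
xor_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
```

A logistic regression fitted to `xor_data` would be stuck near 50% accuracy for exactly the reason described above, while 1-NN recovers the labels.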





**Figure 16.** Normalised confusion matrices for SVM, with Wearable Feature Type 2, without beacon data. (**a**) C = 10, *γ* = 0.1, w = 1 s; (**b**) C = 10, *γ* = 0.1, w = 2 s; (**c**) C = 10, *γ* = 0.1, w = 3 s; (**d**) C = 10, *γ* = 0.1, w = 4 s.



**Figure 17.** Normalised confusion matrices for SVM, with Wearable Feature Type 2, with beacon data. (**a**) C = 100, *γ* = 0.1, w = 1 s; (**b**) C = 10, *γ* = 0.1, w = 2 s; (**c**) C = 10, *γ* = 0.1, w = 3 s; (**d**) C = 10, *γ* = 0.1, w = 4 s.

**7. Conclusions**

Our activity recognition system is composed of off-the-shelf smart watches and BLE beacons. A mobile phone is responsible for gathering smart watch and beacon data and transmitting them to a server, where the processing and classification take place. Our approach uses location information revealed by the beacon data to enhance the classification accuracy of the machine learning algorithms we employ. Our experimental results have shown that there is a clear improvement in the performance of our system when beacon data are used. However, the extent to which the location information is advantageous depends on the type of classifier. LR cannot take full advantage of location information, while KNN and RF benefit more from the fusion of beacon data. SVM exhibits the highest performance gain when using beacon data. Furthermore, we observe that the more unique the location of an activity is with respect to the others, the higher the benefit in activity recognition performance. However, we must highlight that even subtle differences in activity locations are sufficient for a significant improvement in classification accuracy (e.g., working on different parts of a workbench inside the same sector). Finally, location information can make the system more adaptive, as it allows for smaller window sizes, which results in less time required to collect and classify data.

In future work, we will further investigate human activities that can take place in an indoor setting, such as building emergency management [52,53]. This could prove beneficial for an emergency operation, as it could improve situational awareness with respect to the activities of building occupants in the instances before or after an incident took place. Finally, we will investigate a wider range of machine learning algorithms and consider the use of neural networks and deep learning for further improving our system's performance.
