**2. Background**

Before discussing the proposed method, we collected sensor data for 10 daily activities, including eating activities, from 25 subjects (detailed specifications are provided in Section 5) equipped with a wrist-wearable device and a sensor-equipped smartphone (see Section 4.1). We then analyzed the data to ascertain the complexity of eating activities and to establish the requirements an eating activity recognizer must meet to be useful in the real world.

Table 1 shows the correlation score of each attribute with respect to the class (darker color indicates a higher value). Because we collected a wide variety of eating activities, such as eating chicken with a fork, eating a sandwich by hand, the eating activities of a baby, and so on, each attribute by itself showed a very low correlation score. Despite the popular adoption and relatively high performance of accelerometers, the scores of the 'h\_acc' attributes ('h' for hand, 'acc' for accelerometer) are considerably low, even lower than those of the environmental attributes ('lux' for illuminance, 'temp' for temperature, 'hum' for humidity), except for 'h\_acc\_y', which measures the back-and-forth motion of the hand while eating. The scores of the smartphone 'acc' attributes are considerably high compared to the other attributes, but they are still fairly low, and they are inflated by a constraint of the collection: the data were not gathered with the users' own phones, so the subjects rarely used the phone during collection. Considering that many people operate their smartphone while eating, it is reasonable to expect those scores to drop to the level of the 'h\_acc' attributes in practice. Table 2 shows the correlation matrix of the attributes (darker color indicates a higher value), which likewise shows very low values, except between 'h\_acc\_x' and 'h\_acc\_y' and among the 'acc' attributes. Figure 1 shows a more specific example: the three-axis accelerometer values of the hand during four different eating activities. Even at a glance, the patterns differ considerably: 'h\_acc\_y' of the child is comparatively low because the food is positioned higher relative to the child; the variance of all values is low when eating outside, because the user grabbed a sandwich and did not move his hand frequently; 'h\_acc\_x' is much higher than in the other cases when eating chicken with a fork, because the user tore the meat to the left and right; and so on.
In addition to the sensor located on the wrist, the values of the smartphone sensors can be even more unpredictable and variable, as the smartphone can be anywhere while eating: in a pocket, on the table, in the hand, and so on. These observations imply that a recognizer requires (i) manual modeling of the activity, instead of using raw sensor values or features extracted automatically by a learning classifier, and (ii) probabilistic reasoning that can infer the various kinds of contexts that occur probabilistically. Beyond precise recognition itself, (iii) the power and memory consumption of the sensors and (iv) the obtrusiveness to the user must also be considered for practical usage [5], since a recognizer should collect and recognize continuously without recharging, and excessive battery consumption would restrict the use of the devices for their original purpose.
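The attribute-to-class correlations in Table 1 can be illustrated with a small sketch. The data below are invented for illustration (the names `h_acc_y` and `lux` follow the table's abbreviations); it only shows why a motion attribute can score high against the class while an environmental attribute scores near zero:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical toy data: class label (1 = eating, 0 = other) and two attributes.
labels  = [1, 1, 1, 1, 0, 0, 0, 0]
h_acc_y = [0.9, 0.8, 0.7, 0.9, 0.1, 0.2, 0.1, 0.0]  # back-and-forth hand motion
lux     = [300, 120, 500, 80, 310, 130, 480, 90]    # illuminance, class-independent

print(round(abs(pearson(h_acc_y, labels)), 3))  # high: motion tracks the class
print(round(abs(pearson(lux, labels)), 3))      # near zero: lighting does not
```

The point mirrors the table: an attribute correlates with the class only when its variation is driven by the activity, which most single attributes here are not.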


**Table 1.** Correlation scores of each attribute.

*Sensors* **2017**, *17*, 2877

¹ Correlation coefficient, information gain, information gain ratio, symmetric uncertainty; ² h = hand, acc = accelerometer, lux = illuminometer, temp = temperature, hum = humidity.
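For reference, the entropy-based scores named in the footnote can be computed as follows. This is a minimal stdlib sketch of information gain and symmetric uncertainty on hypothetical discretised data, not the evaluation code used in the paper:

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (bits) of a discrete sequence."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def info_gain(attr, cls):
    """H(class) - H(class | attribute) for discrete sequences."""
    n = len(cls)
    cond = 0.0
    for v in set(attr):
        subset = [c for a, c in zip(attr, cls) if a == v]
        cond += len(subset) / n * entropy(subset)
    return entropy(cls) - cond

def symmetric_uncertainty(attr, cls):
    """2 * IG / (H(attr) + H(class)); normalises information gain to [0, 1]."""
    ha, hc = entropy(attr), entropy(cls)
    return 0.0 if ha + hc == 0 else 2 * info_gain(attr, cls) / (ha + hc)

# Hypothetical example: a discretised accelerometer band that perfectly
# predicts the class, so both scores reach their maximum.
cls  = ['eat', 'eat', 'eat', 'other', 'other', 'other']
attr = ['high', 'high', 'high', 'low', 'low', 'low']
print(info_gain(attr, cls))             # 1.0 bit = H(cls)
print(symmetric_uncertainty(attr, cls)) # 1.0
```

Real attributes in Table 1 sit far below these maxima, which is exactly the complexity the section describes.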


**Table 2.** Correlation matrix of attributes.

To fulfill those requirements, the proposed method (i) uses only five types of low-power sensors attached to the smartphone and the wrist-wearable device (Figure 2); (ii) is built on a context model of an eating activity that can represent the composition of complex eating activities, based on theoretical background and domain knowledge; and (iii) uses a Bayesian network (BN) for probabilistic reasoning, with a tree-structured and modular design that increases scalability and reduces the cost of inference and management. Our contributions are as follows: (i) we obtain and describe the complexity of real activities and the limitations of typical learning algorithms using real, complex data; (ii) we recognize the activity using only low-power, easily accessible sensors; (iii) we propose a formal descriptive model based on theoretical background and show its usefulness; and (iv) we provide extensive experiments and analyses using a large amount of data from 25 different volunteers, 10 activities, and various features.
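To sketch how probabilistic reasoning combines uncertain contexts, the toy model below uses the simplest possible tree shape: a root class with observations that are conditionally independent given the class. The node names, probabilities, and this independence assumption are illustrative simplifications, not the paper's actual BN structure or conditional probability tables:

```python
def posterior(prior, cpts, evidence):
    """P(class | evidence) for a tree whose leaves are conditionally
    independent given the root class (the simplest tree-structured BN)."""
    scores = {}
    for cls, p in prior.items():
        for node, value in evidence.items():
            p *= cpts[node][cls][value]  # multiply in P(observation | class)
        scores[cls] = p
    z = sum(scores.values())             # normalise over the classes
    return {cls: p / z for cls, p in scores.items()}

prior = {'eating': 0.2, 'other': 0.8}
cpts = {  # P(observation | class); illustrative numbers only
    'hand_motion': {'eating': {'cyclic': 0.7, 'still': 0.3},
                    'other':  {'cyclic': 0.1, 'still': 0.9}},
    'location':    {'eating': {'table': 0.6, 'moving': 0.4},
                    'other':  {'table': 0.2, 'moving': 0.8}},
}
post = posterior(prior, cpts, {'hand_motion': 'cyclic', 'location': 'table'})
print(post)  # two weak cues jointly raise P(eating) well above its prior
```

Each observation node can be maintained as an independent module, which is the intuition behind the modular design: adding or retraining one context does not disturb the rest of the tree.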

**Figure 1.** A time-series variation of acceleration sensor data in various activities.

**Figure 2.** Smartphone and wrist-wearable device for data collection.
