### 2.2. Machine Learning Paradigm for Physical Activity Classification

#### 2.2.1. Dataset

The study uses a publicly available dataset [28] to detect activities of daily living (ADLs). The data were collected with the triaxial (3D) gyroscope and triaxial (3D) accelerometer of a Galaxy S II smartphone, which was mounted at waist level during data collection. Thirty subjects, aged 19 to 48 years, participated in the data collection experiment and performed a variety of ADLs: lying, sitting, standing, walking upstairs, walking downstairs and normal walking. Data collection took place in a laboratory environment, and the participants were instructed to perform the ADLs as naturally as possible. The sampling frequency of both the accelerometer and the gyroscope was set to 50 Hz. The ground truth of the performed ADLs was maintained through visual observation.

#### 2.2.2. Feature Computation

Raw 3D accelerometer and gyroscope signals obtained from the smartphone underwent several preprocessing and feature extraction steps to derive features that are later fed to machine learning algorithms to profile and classify ADLs.

The preprocessing steps are (i) low-pass filtering using a 3rd-order Butterworth filter with a 20 Hz cutoff and (ii) median filtering. The acceleration signal is then divided into body acceleration and gravitational acceleration components: a cutoff frequency of 0.3 Hz separates the gravitational component (<0.3 Hz) from the body acceleration component (>0.3 Hz). Furthermore, jerk signals are derived from the acceleration and gyroscope signals by taking their time derivatives, and the cadence is also derived from these signals. To analyze the frequency content, the fast Fourier transform (FFT) is computed to detect the trends and variations that occur in the frequency domain when different ADLs are performed. These derivations result in a total of 17 signals, including the original 3D gyroscope and 3D accelerometer signals. Further details about the feature extraction process can be found in [28].
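The preprocessing chain above can be sketched in a few lines. The sketch below is illustrative, not the study's actual implementation: the filter parameters (50 Hz sampling, 3rd-order Butterworth at 20 Hz, 0.3 Hz gravity cutoff) come from the text, while the function names and the median-filter kernel size are assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt, medfilt

FS = 50.0  # sampling frequency of the accelerometer/gyroscope (Hz)

def lowpass(signal, cutoff, order=3):
    """Zero-phase Butterworth low-pass filter."""
    b, a = butter(order, cutoff / (FS / 2), btype="low")
    return filtfilt(b, a, signal)

def preprocess(acc_axis):
    """Denoise one accelerometer axis, then split it into gravitational
    and body components and derive the jerk signal."""
    # (i) low-pass at 20 Hz, (ii) median filter (kernel size assumed)
    denoised = medfilt(lowpass(acc_axis, cutoff=20.0), kernel_size=3)
    gravity = lowpass(denoised, cutoff=0.3)   # gravitational part (<0.3 Hz)
    body = denoised - gravity                 # body acceleration (>0.3 Hz)
    jerk = np.gradient(body, 1.0 / FS)        # time derivative of body signal
    return gravity, body, jerk
```

By construction, the gravitational and body components sum back to the denoised signal, which makes the 0.3 Hz split easy to sanity-check.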

The original signals (3D accelerometer and 3D gyroscope) and the aforementioned derived signals are further processed with a windowing method to extract more features. The window length is set to 2.56 s (128 samples) with 50% overlap (64 samples). Several statistical, time-domain and frequency-domain features are computed from each 128-sample window of these signals, as follows: mean, standard deviation, median, signal magnitude area, maximum value, minimum value, angle between two signals, frequency-domain band energy, skewness, kurtosis, average frequency component, maximum frequency component, correlation coefficient between signals, autoregressive correlation coefficients, interquartile range, sum of squares (energy) and band energy [28].
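The windowing step can be sketched as follows, assuming the parameters quoted above (128-sample windows at 50 Hz, i.e., 2.56 s, with 64-sample overlap). Only a small subset of the listed statistics is computed here for illustration; the full feature set is described in [28].

```python
import numpy as np

WINDOW = 128  # 2.56 s at 50 Hz
STEP = 64     # 50% overlap

def windows(signal):
    """Yield successive half-overlapping 128-sample windows."""
    for start in range(0, len(signal) - WINDOW + 1, STEP):
        yield signal[start:start + WINDOW]

def features(win):
    """A few of the per-window statistics listed in the text."""
    return {
        "mean": np.mean(win),
        "std": np.std(win),
        "median": np.median(win),
        "max": np.max(win),
        "min": np.min(win),
        "iqr": np.percentile(win, 75) - np.percentile(win, 25),
        "energy": np.sum(win ** 2) / len(win),  # mean sum of squares
    }

# One feature row per window, e.g. for a 512-sample signal: 7 rows
feature_table = [features(w) for w in windows(np.random.randn(512))]
```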

#### 2.2.3. Experiments

Several experiments were conducted using different splits of the training and testing dataset to train the machine learning classifiers and to observe their performance on balanced and imbalanced datasets. Class imbalance is a critical issue in machine learning; it occurs when one or a few classes are underrepresented (have fewer samples) relative to the other classes. This often introduces bias during the training stage, and the performance on the underrepresented class or classes is strongly affected by the imbalanced distribution.

Therefore, in this study, we investigated the impact of imbalanced classes on the performance of the different machine learning classifiers by conducting a series of experiments. In addition, we examined which machine learning algorithms are relatively less sensitive to class imbalance and thus perform better than others under it.

The class distribution used in the first experiment (E1) is presented in Table 1. This is the original class distribution obtained after the actual data collection, where each instance (sample) represents one time window obtained for that particular class. Column 1 gives the activity type (walk, sit, stand, etc.), column 2 the total number of data instances obtained originally, column 3 the proportion of each activity class with respect to the total dataset, column 4 the train split (the number of time windows used to train the machine learning models) and the last column the test split (the number of instances used to test the performance of the machine learning algorithms). This original, balanced distribution of the ADLs in the train/test split constitutes experiment 1 (E1).


**Table 1.** Class distribution of different ADLs in experiment 1 (E1).

After designing experiment 1, six further experiments (E2 to E7) were conducted by inducing class imbalance in each class to observe the performance of the machine learning classifiers when classifying imbalanced ADLs. Table 2 presents these six additional experiments alongside E1 (see Table 1).


**Table 2.** Class distribution of different ADLs in training samples during experiments 1–7 (E1, E2, E3, E4, E5, E6 and E7).

It is important to note that the training samples of the underrepresented classes differ across these experiments (E1 to E7), while the test samples remain the same as in the original distribution presented in Table 1. The reason is that class imbalance is induced only in the training samples, where it may affect the trained machine learning model; keeping the test set fixed ensures that it has no influence on training and that the results of all experiments are directly comparable.
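The setup above — subsampling one class in the training set while leaving the test set untouched — can be sketched as follows. The class choice and keep fraction here are illustrative placeholders, not the study's actual E2–E7 settings, and the function name is an assumption.

```python
import numpy as np

def induce_imbalance(X_train, y_train, target_class, keep_fraction, seed=0):
    """Randomly keep only `keep_fraction` of the training samples of
    `target_class`; all other classes (and the test set) are untouched."""
    rng = np.random.default_rng(seed)
    idx = np.arange(len(y_train))
    target_idx = idx[y_train == target_class]
    # indices of the target class that survive the subsampling
    keep = rng.choice(target_idx,
                      size=int(len(target_idx) * keep_fraction),
                      replace=False)
    mask = np.isin(idx, np.concatenate([idx[y_train != target_class], keep]))
    return X_train[mask], y_train[mask]
```

Repeating this with a different target class (or keep fraction) per experiment yields a family of training sets analogous to E2–E7, all evaluated against the same fixed test split.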
