### *3.2. Data Preprocessing*

The *SisFall* dataset required minimal preprocessing. We first equalized the duration of the records by trimming each one equally at the start and the end (*top and tail*), reducing its length to 10 s. We chose 10 s to remove any outliers induced by the fall experiment whilst preserving the fall within each record. To emulate different sensor sampling rates, we reduced the number of samples in each record. For example, to obtain a sampling rate of 100 Hz, we removed 50% of the samples along the record.
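The trimming and sample-reduction steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the native rate of 200 Hz is inferred from the 2000 samples per 10 s record mentioned in Section 3.3, the samples are assumed to be removed at regular intervals (plain decimation), and the function names `center_crop` and `downsample` are hypothetical.

```python
import numpy as np

NATIVE_RATE_HZ = 200  # assumption: 2000 samples per 10 s record


def center_crop(record, rate_hz=NATIVE_RATE_HZ, duration_s=10):
    """Trim equal amounts from the start and end so that duration_s remains."""
    target = int(rate_hz * duration_s)
    if len(record) < target:
        raise ValueError("record is shorter than the target duration")
    start = (len(record) - target) // 2
    return record[start:start + target]


def downsample(record, target_rate_hz, rate_hz=NATIVE_RATE_HZ):
    """Keep every k-th sample (assumed regular decimation, no filtering)."""
    if rate_hz % target_rate_hz != 0:
        raise ValueError("target rate must divide the native rate evenly")
    step = rate_hz // target_rate_hz
    return record[::step]


# Synthetic 15 s record with 6 channels (3 axes x 2 sensors).
raw = np.random.default_rng(0).normal(size=(15 * NATIVE_RATE_HZ, 6))
cropped = center_crop(raw)            # 10 s at 200 Hz
at_100hz = downsample(cropped, 100)   # every second sample is kept
```

With these assumptions, a 100 Hz record keeps half of the samples of the 10 s record, which matches the 50% reduction described above.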

For the two walking and two jogging activities, which have only one trial each (Table 3), we extracted five 10 s windows from each record so that every activity has the same number of trials. We selected the five windows, with no overlap, along each record as follows:
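As an illustration only, the sketch below spaces five non-overlapping windows evenly along a record. The even spacing, the function name `split_windows`, and the 100 s synthetic record are assumptions introduced here rather than details taken from the text.

```python
import numpy as np


def split_windows(record, n_windows=5, window_len=2000):
    """Return n_windows non-overlapping windows spread evenly along record."""
    n = len(record)
    if n < n_windows * window_len:
        raise ValueError("record too short for the requested windows")
    # Evenly spaced start indices; spare samples become gaps between windows.
    starts = np.linspace(0, n - window_len, n_windows).astype(int)
    return [record[s:s + window_len] for s in starts]


# Hypothetical 100 s record at 200 Hz, using sample indices as values.
long_record = np.arange(100 * 200)
windows = split_windows(long_record)  # five windows of 2000 samples (10 s)
```

Each returned window can then be treated as one trial of the activity.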


The additional data preprocessing required for research question 3 is described separately, in Section 3.6.

### *3.3. Feature Extraction*

We then extracted features from the preprocessed data to obtain information that better characterizes each activity. A common practice when working with time series is to extract time- and frequency-domain features [11,27,28]. In addition to the per-axis features, we calculated the *magnitude* of the acceleration and rotation measurements to improve the robustness of the fall detection (e.g., in case of fall-like activities involving fast movements). We extracted the following time-domain features: *variance*, *standard deviation*, *mean*, *median*, *maximum*, *minimum*, *delta*, *25th centile* and *75th centile*. Additionally, using a Fast Fourier Transform, we extracted two frequency-domain features: the *power spectral density* and the *power spectral entropy*.
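For reference, the magnitude and the power spectral entropy are commonly defined as shown below. These are the standard textbook forms and are given under assumptions: the symbols ($a_x$, $X_k$, $P_k$, $p_k$, $H$) and the periodogram estimate of the power spectral density are introduced here for illustration and may differ in detail from the exact formulae used.

$$
\lVert a \rVert = \sqrt{a_x^{2} + a_y^{2} + a_z^{2}}
$$

$$
P_k = \frac{1}{N}\,\lvert X_k \rvert^{2}, \qquad
p_k = \frac{P_k}{\sum_{j} P_j}, \qquad
H = -\sum_{k} p_k \log_2 p_k
$$

Here $a_x$, $a_y$ and $a_z$ are the three axis readings of one sensor at a given sample, $X_k$ is the $k$-th coefficient of the Fast Fourier Transform of an $N$-sample signal, and $H$ is low when the power is concentrated in a few frequencies and high when it is spread across the spectrum.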

The feature extraction process is as follows. First, the feature formulae are applied to each record. In our case, each record has a length of 10 s, with the number of samples varying from 10 to 2000 depending on the sampling rate. We then selected a sensor axis and used all of its samples to extract the desired feature class.

This process was repeated for each of the other sensor axes (3 axes, 2 sensors). We appended each calculated value to a vector that characterizes the record (Table 4). We also applied this process to each sensor's *magnitude*, resulting in a vector of 88 features per record (11 feature classes × 8 channels, i.e., 6 axes and 2 magnitudes). The resulting vector characterizes the activity in each record. The classification algorithm looks for patterns in these features in order to correctly classify each activity. For example, a fall would most likely have a large delta on its vertical sensing axis, since a fall is usually characterized by a high vertical acceleration.
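The assembly of the 88-element vector can be sketched as below. This is a hedged illustration, not the implementation used in the study. Several choices are assumptions: *delta* is taken as maximum minus minimum, the power spectral density is summarised by its mean over frequency bins, the column order of the record is accelerometer x, y, z followed by gyroscope x, y, z, and the features are stored channel by channel. All function names are hypothetical.

```python
import numpy as np


def spectral_features(x):
    """Return (mean spectral power, spectral entropy) from an FFT periodogram."""
    power = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    total = power.sum()
    if total == 0:
        return 0.0, 0.0
    p = power / total
    p = p[p > 0]  # skip empty bins so that log2 is defined
    return power.mean(), -np.sum(p * np.log2(p))


def channel_features(x):
    """Compute the 11 feature classes for one channel."""
    psd, entropy = spectral_features(x)
    return [
        np.var(x), np.std(x), np.mean(x), np.median(x),
        np.max(x), np.min(x),
        np.max(x) - np.min(x),  # "delta": assumed to be max minus min
        np.percentile(x, 25), np.percentile(x, 75),
        psd, entropy,
    ]


def feature_vector(record):
    """record: (n_samples, 6) array; columns assumed acc x,y,z then gyro x,y,z."""
    acc_mag = np.linalg.norm(record[:, 0:3], axis=1)
    gyro_mag = np.linalg.norm(record[:, 3:6], axis=1)
    channels = [record[:, i] for i in range(6)] + [acc_mag, gyro_mag]
    return np.array([f for ch in channels for f in channel_features(ch)])


# Synthetic 10 s record at the highest rate (2000 samples, 6 axes).
record = np.random.default_rng(0).normal(size=(2000, 6))
features = feature_vector(record)  # 11 feature classes x 8 channels
```

The same function applies unchanged to the shorter records obtained at lower sampling rates, since every feature is computed over all available samples of a channel.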

Finally, we normalized the extracted features to bring them to a common scale. This prevents features with small values from being neglected, which can happen depending on the algorithm employed. In this work, we used the common *min-max* normalization, which scales the values to between 0 and 1 inclusive.
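A minimal sketch of min-max normalization is given below, assuming that each feature (column) is rescaled independently across records using x' = (x - min) / (max - min). The function name `min_max_normalize` and the handling of constant features (mapped to 0) are assumptions made for illustration.

```python
import numpy as np


def min_max_normalize(features):
    """Rescale each column of a (n_records, n_features) array to [0, 1]."""
    lo = features.min(axis=0)
    span = features.max(axis=0) - lo
    span = np.where(span == 0, 1.0, span)  # constant columns map to 0
    return (features - lo) / span


# Three hypothetical records with two features on very different scales.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])
X_scaled = min_max_normalize(X)
```

After rescaling, both columns span the same range even though their original values differed by an order of magnitude.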


**Table 4.** List of extracted time and frequency domain features.
