2.3.1. Training Dataset

Each of the 24 files in the training dataset contains a single skier repeatedly performing one technique of either the classical or the skating style on the flat course. Every file includes turning points at the end of each lap. The skier performs no technique while turning, so these segments are removed from each skier's data by manually identifying the corresponding frames in the Xsens human model videos and in the videos recorded with a digital camera. The resulting data files contain continuous repetitions of the same technique.

A fourth-order low-pass Butterworth filter with a cutoff frequency of 0.007 Hz is applied to smooth the raw data, after which the *z*-axis angular velocity of the gyroscope on the left leg is used to locate peaks. The interval between two consecutive peaks in the filtered data represents one cycle, and its length varies. Each input to the CNN-LSTM model must have the same dimensions. The peak locations found in the filtered data are therefore used to resample the raw data (after removal of the turning points) to a fixed cycle length of 333 time steps, which is the mean number of time steps per cycle across all the techniques in the training data. Resampling uses an anti-aliasing finite impulse response low-pass filter and resamples each cycle at (333/n) × 240 Hz, where n is the number of time steps in the cycle before resampling and 240 Hz is the original sample rate. Resampling is performed on the raw data rather than the filtered data because a neural network works best with raw data, from which it automatically extracts the features required for the classification task. The resampled cycles are arranged into tensors, which makes them suitable for passing to a CNN layer and later to an LSTM layer as a multivariate time series. We thus obtain a total of 24 tensors.
The number of cycles of each technique of each XC-skiing style performed by each skier on the flat course is shown in Table 4.
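
The filtering, peak detection, and per-cycle resampling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function name `segment_and_resample`, the zero-phase (forward-backward) filtering, the `find_peaks` settings, and the synthetic input are all hypothetical. The cutoff frequency is left as a parameter, and the demo value of 2 Hz is chosen only to suit the synthetic signal; it is not the setting reported in the text.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, find_peaks, resample_poly

FS = 240          # original sample rate in Hz (from the text)
CYCLE_LEN = 333   # fixed cycle length in time steps (from the text)


def segment_and_resample(raw, peak_channel, cutoff_hz, fs=FS, cycle_len=CYCLE_LEN):
    """Split `raw` (time x channels) into cycles and resample each to `cycle_len`."""
    # Fourth-order low-pass Butterworth filter. Forward-backward (zero-phase)
    # filtering is an assumption of this sketch; it avoids shifting the peaks.
    sos = butter(4, cutoff_hz, btype="low", fs=fs, output="sos")
    smooth = sosfiltfilt(sos, raw[:, peak_channel])

    # Peaks of the smoothed channel mark the cycle boundaries.
    # The `distance` and `prominence` values are assumptions, not from the text.
    peaks, _ = find_peaks(smooth, distance=fs // 2, prominence=0.5 * smooth.std())

    cycles = []
    for start, stop in zip(peaks[:-1], peaks[1:]):
        cycle = raw[start:stop]   # slice the raw data, not the filtered data
        n = cycle.shape[0]        # time steps in this cycle before resampling
        # Polyphase resampling with an anti-aliasing FIR low-pass filter.
        # The ratio cycle_len / n gives exactly cycle_len output samples,
        # i.e. an effective rate of (cycle_len / n) * fs.
        cycles.append(resample_poly(cycle, cycle_len, n, axis=0))

    if not cycles:
        return np.empty((0, cycle_len, raw.shape[1]))
    return np.stack(cycles)       # shape: (num_cycles, cycle_len, channels)


# Synthetic stand-in for one file: 20 s of a 0.75 Hz oscillation on 51 channels.
rng = np.random.default_rng(0)
t = np.arange(0, 20, 1 / FS)
raw = np.sin(2 * np.pi * 0.75 * t)[:, None] + 0.05 * rng.standard_normal((t.size, 51))

tensor = segment_and_resample(raw, peak_channel=0, cutoff_hz=2.0)
print(tensor.shape[1:])  # (333, 51)
```

As a worked instance of the resampling rate, a cycle with n = 400 time steps would be resampled at (333/400) × 240 Hz = 199.8 Hz, while a cycle with n = 300 time steps would be resampled at 266.4 Hz.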


**Table 4.** Number of cycles of the classical (DS, P-Off, KDP, DP) and skating (V2, V2A, V1, FS) techniques performed by each skier in the training dataset.

The dimensions of each matrix in a tensor are 333 × 51 (17 gyroscopes with 3 axes each, hence 51 columns).
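
To make the 333 × 51 layout concrete, the sketch below flattens per-sensor, per-axis gyroscope data into 51 columns. The column ordering (the three axes of one sensor kept adjacent), the helper name `to_model_input`, and the count of 40 cycles are assumptions made for illustration only; the text does not specify them.

```python
import numpy as np

NUM_SENSORS = 17   # gyroscopes (from the text)
NUM_AXES = 3       # axes per gyroscope (from the text)
CYCLE_LEN = 333    # time steps per resampled cycle (from the text)


def to_model_input(cycles):
    """Flatten (cycles, time, sensor, axis) into (cycles, time, 51)."""
    num_cycles = cycles.shape[0]
    # Column index for sensor s and axis a becomes s * NUM_AXES + a.
    return cycles.reshape(num_cycles, CYCLE_LEN, NUM_SENSORS * NUM_AXES)


# Hypothetical file with 40 cycles of one technique (random placeholder values).
cycles = np.random.default_rng(0).standard_normal(
    (40, CYCLE_LEN, NUM_SENSORS, NUM_AXES)
)
tensor = to_model_input(cycles)
print(tensor.shape)  # (40, 333, 51)
```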
