**1. Introduction**

Human motion capture during gait provides a way to understand the principles of the natural mode of locomotion of the human being. Technological advances have changed its practice and improved its accuracy along history [1,2]. Recent developments in microelectromechanical systems (MEMS) have caused a renewed interest in the use of inertial measurement units (IMUs) to perform three-dimensional (3D) human movement reconstruction [3–10]. However, getting orientation from IMUs presents accuracy and consistency issues [11–16], especially in the presence of environmental ferromagnetic disturbances or when measuring fast complex movements over long periods of time [17]. This is why, although the performance of inertial sensors has improved in the last decade, optical motion capture remains as the preferred method to perform precise biomechanical studies. In fact, as pointed out in [18,19], IMU-based methods for motion capture and reconstruction are usually validated against optical methods, which remain as the golden standard reference. The problem with optical motion capture systems is that it is very difficult to ensure that all markers are visible to the cameras all the time and, moreover, other reflective objects present in the capture zone can be incorrectly identified as markers. In general, obtaining the skeletal motion involves some manual post-processing of the captured data, so the technique is not straightforward [20,21]. This problem can be overcome by using an extended Kalman filter (EKF) [22], as will be described later in this paper.

The typically high-frequency noise harmonics present in the recorded marker trajectories are hardly perceptible at displacement level. However, after numerical differentiation,

**Citation:** Cuadrado, J.; Michaud, F.; Lugrís, U.; Pérez Soto, M. Using Accelerometer Data to Tune the Parameters of an Extended Kalman Filter for Optical Motion Capture: Preliminary Application to Gait Analysis. *Sensors* **2021**, *21*, 427. https://doi.org/10.3390/s21020427

Received: 30 October 2020 Accepted: 4 January 2021 Published: 9 January 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

the amplitude of each harmonic increases with its harmonic number; for velocities, it increases linearly, and, for accelerations, the increase is proportional to the square of the harmonic number [2]. For this reason, filtering is required when trying to obtain velocities and accelerations by numerically differentiating position data. The problem here lies on the choice of the cutoff frequency of the filter, since it is difficult to achieve a value that filters out most of the noise, without also removing relevant motion information [23–27]. From what has been said, it is deduced that different filtering quality is demanded for the animation of virtual characters, where only configuration-level information is needed, and for biomechanical analysis including dynamics, where velocity- and acceleration-level information is also required.

Numerous filtering approaches can be found in the literature [2,23,28,29]. While the Butterworth filter has generally been preferred, because impulsive-type inputs are rarely seen in human movement data [28,30,31], recent studies have applied the Kalman filter [32] (most commonly used in the literature for inertial sensors), thus improving the accuracy of estimated joint kinematics and computed orientation data [22,33,34]. However, beyond the choice of a filtering algorithm, the main problem remains in the tuning of its parameters. In almost all the filtering studies for gait analysis, the smoothing level is based on the author's decision of how much noise is acceptable. A common criterion is to establish an error threshold at position level, and then setting the cutoff frequency accordingly [22,23]. Regardless of the method used, there is no way of assessing the accuracy of the obtained accelerations by relying on position data alone. In order to provide an objective filter-tuning procedure, this work proposes to compare the filtered accelerations from the optical capture with their experimental measurement from the inertial sensors in the case of gait. In [24], a similar procedure was applied in the case of jumping, but it was not successful due to the overshoot provided by the accelerometers in their horizontal measurements.

IMUs are capable of estimating their own orientation within an Earth-fixed frame by using sensor fusion algorithms, such as Madgwick's algorithm [35] or the EKF [36]. These algorithms provide an estimate of the orientation by combining the information from the triaxial accelerometers, gyroscopes and magnetometers present in the IMU. Because IMUs show limitations to give an accurate orientation [13] (closely related to sensor calibration, magnetometer sensitivity, and presence of accelerations other than gravity) and, moreover, their Earth-based global frame will in general differ from that of the optical motion capture system, a preliminary test was performed with nine IMUs. This test was carried out in order to assess the instrumental errors associated, select the most accurate units, and determine the global frame offset corresponding to each one [11]. Second, the gait analysis of a healthy subject was conducted. The motion was recorded by both the optical and the inertial techniques, using the seven most accurate IMUs among the nine previously tested. The human motion was then reconstructed by using both the classic Vaughan's method [37], which does not impose the kinematic constraints and is similar to those proposed in [19] and the EKF introduced in this paper, which allows automatic marker labeling, is robust to short marker occlusions and imposes kinematic constraints, even in real time, so the local accelerations measured by the IMUs could be used to tune the filters applied to the optical motion capture data.

The remaining of the paper is organized as follows. Section 2 describes the two experiments carried out, sensor test and gait analysis. It points out the errors that may be incurred by inertial sensors, and proposes a way to minimize their influence. Then, it explains the two motion reconstruction methods applied and compared in this work, with a detailed description of the EKF, and shows the procedure to obtain the accelerations of the IMU attachment points from the optical system-based analysis, so that they can be compared with the accelerations measured by the IMUs. Section 3 presents the results of both the preliminary test and the gait analysis, showing the errors of the inertial sensors in orientations and accelerations, and the effect of the filter parameters adopted for the motion reconstructions methods in the accuracy of the accelerations obtained from the optical system recordings. Finally, Section 4 discusses the results and points out the limitations of the study, while Section 5 draws the conclusions of the work.

#### **2. Materials and Methods**

### *2.1. Preliminary Test*

IMUs provide the measured accelerations expressed in their local reference frames, while the optical motion capture system provides marker positions within its own fixed reference frame. Therefore, in order to establish a comparison between accelerations coming from both techniques, it is necessary to express them in the same reference frame, and this implies obtaining the transformation matrix between each IMU and the fixed frame used by the optical motion capture system.

As mentioned above, an IMU can use sensor fusion algorithms to estimate its orientation within an Earth-fixed frame. This Earth-fixed frame is usually defined as NED (North-East-Down) or NWU (North-West-Up), and will be probably rotated with respect to the reference frame used by the motion capture lab. Therefore, the first step is to determine the offset between both reference frames, for which two methods can be applied: (i) the first option is to carry out a preliminary IMU calibration process, as the spot check proposed in [11]; (ii) the second alternative is to attach three markers to each IMU, so the local frames can be obtained directly from the optical motion capture system [38]. Since the second method requires a large number of markers, thus making the motion capture process more involved and error-prone, the calibration approach has been chosen in this work.

### 2.1.1. Experimental Data Collection

Nine IMUs (STT-IWS, STT Systems, San Sebastián, Spain) sampling at 100 Hz were fixed on a flat rigid wooden plate (with no ferromagnetic disturbances), equally spaced and accurately aligned to each other. Four reflective markers were also attached to four of the sensors, as illustrated in Figure 1.

**Figure 1.** Calibration setup composed by nine IMUs and four markers on a rigid plate.

The optical motion capture system was formed by 18 infrared cameras (OptiTrack FLEX 3, Natural Point, Corvallis, OR, USA), also sampling at 100 Hz. Starting with the plate on the floor, where it was kept for 5 s, it was manually moved around for 30 s and, finally, put again in the original place during 5 s. Data from both the IMU set and the optical system were recorded and the plate orientation during the motion was obtained

from: (i) each IMU, based on gravity, magnetic North and gyroscope integration within the commercial software iSen provided by the manufacturer; (ii) the optical system, by rigid-body motion reconstruction based on the trajectories of the reflective markers 1, 2 and 3.

#### 2.1.2. Sensor Orientation and Geomagnetic Frame of Reference

Figure 2 shows the three reference frames involved in the problem. The first reference frame is the global reference frame of the motion capture lab, obtained after calibration of the optical system, and it is noted with subscript O (after optical). This reference frame is fixed and common for all IMUs. The second reference frame is the global, Earth-fixed reference frame of each inertial sensor, and it is noted with subscript E (after Earth-fixed), and superscript *i* denoting the IMU number. Although this frame should be the same for all the IMUs, their inherent errors in determining gravity and magnetic North directions lead to discrepancies among sensors. The third reference frame is the local reference frame of each inertial sensor, and it is noted with subscript I (after inertial). In the calibration setup, the local reference frame is the same for all the IMUs, and it coincides with the local reference frame of the wooden plate. Note that the axes of this reference frame have a bar on them, meaning that they are moving axes, rigidly attached to the wooden plate.

**Figure 2.** The three reference frames involved in the calibration: fixed global reference frame of the optical system (subscript O); Earth-fixed global reference frame of each IMU (in grey, subscript E and superscript i); moving local reference frame of all the IMUs and the wooden plate (subscript I).

If **R**OI is the variable rotation matrix that transforms the components of a vector expressed in the reference frame I into the components of the same vector expressed in the reference frame O, **R***<sup>i</sup>* OE is the constant rotation matrix that does the same between frames E and O for the inertial sensor i, and **R***<sup>i</sup>* EI is the variable rotation matrix that makes the same between frames I and E for that sensor, the following relation can be stated at any instant of the plate motion:

$$\mathbf{R}\_{\rm OI} = \mathbf{R}\_{\rm OE}^{\rm t} \mathbf{R}\_{\rm EI}^{\rm t} \tag{1}$$

At any time point, the trajectories of the markers measured by the optical system provide **R**OI, while the sensor fusion algorithm from the ith IMU provides **R***<sup>i</sup>* EI. Therefore **R***i* OE can be derived from Equation (1) as,

$$\mathbf{R}\_{\rm OE}^{i} = \mathbf{R}\_{\rm Col} \mathbf{R}\_{\rm EI}^{i\,\mathrm{T}} \tag{2}$$

Each **R***<sup>i</sup>* OE matrix must be constant, since it represents a rotation between two fixed frames. However, if it is calculated for all the time points of the recorded motion, the obtained values will not be completely constant, due to sensor and estimation errors. In order to find a unique matrix for each IMU, an average rotation matrix is calculated and taken as its effective **R***<sup>i</sup>* OE. Since rotation matrices are orthogonal, care must be taken when averaging them, so that orthogonality is preserved. The method followed here consisted of extracting the roll, pitch and yaw angles from each rotation matrix at every time point, averaging them, and using these values to build back the corresponding effective rotation matrix. This calibration procedure to get the **R***<sup>i</sup>* OE matrices yields different results for different days due to magnetic changes, so it should be ideally performed right before using the IMUs.

Once the IMUs were properly calibrated, the orientation error provided by each of them along the motion of the wooden plate was obtained. For each time point, the trajectories of the markers measured by the optical system provided matrix **R**OI, while data from ith IMU provided **R***<sup>i</sup>* EI. Since the constant matrix **<sup>R</sup>***<sup>i</sup>* OE had been obtained for each IMU in the calibration process described before, it can be written from Equation (1),

$$\mathbf{R}\_{\rm OI}^{i} = \mathbf{R}\_{\rm OE}^{i} \mathbf{R}\_{\rm EI}^{i} \tag{3}$$

where **R***<sup>i</sup>* OI is the rotation matrix between frames I and O provided by the ith inertial sensor. Ideal IMUs would provide the same matrix **R***<sup>i</sup>* OI for all the sensors, and it would be coincident with matrix **R**OI provided by the optical system. However, due to errors in the IMUs, such matrices differ, and the orientation error committed by each IMU at each time point can then be obtained by calculating the roll, pitch and yaw angles of **R***<sup>i</sup>* OI, and comparing them with the roll, pitch and yaw angles of **R**OI, taken as reference. This was done for the nine IMUs, the results being shown in Section 3.

Once the method to obtain the orientation error of each inertial sensor has been described, the objective of comparing the accelerations provided by the optical and the inertial systems is addressed. The optical system provides the trajectories of the markers, based on which the position history of any point of the body can be obtained through a motion reconstruction method. Then, double differentiation of the position history yields velocity and acceleration histories of the point considered. Positions, velocities and accelerations are expressed in the global reference frame of the motion capture lab (previously denoted by O). On the other hand, IMU accelerometers measure a combination of the gravitational and translational accelerations. As reported by Woodman [21], it is necessary to have very accurate rotation sensors in inertial navigation systems, because knowing the precise orientation of the body allows to properly subtract the gravitational acceleration from the measurement, in order to find the translational acceleration. Each IMU provides its acceleration expressed in its local reference frame (previously denoted by I). Therefore, to compare accelerations obtained through the optical and inertial techniques, it is necessary to express them in the same reference frame and to take into account the gravity constant, which is present in the inertial case.

To highlight all the mentioned issues, the acceleration of point 4 in Figure 1 was obtained in three different ways. First, since point 4 had a marker on it, the marker trajectory was filtered by means of a 8 Hz forward-backward 2nd order Butterworth filter, then it was differentiated twice with respect to time, and the gravity constant (9.81 m/s2) was added to the vertical component of the resulting acceleration; the presence of the marker attached at the point made it unnecessary the use of any motion reconstruction method, thus eliminating a source of error for the optical system. Second, the acceleration provided by the IMU at point 4 was expressed in frame O by multiplying it by matrix **R**5 OI (orientation provided by the IMU #5, attached to that point). Third, the acceleration provided by the IMU at point 4 was expressed in frame O by multiplying it by matrix **R**OI (orientation provided by the optical system after the mentioned filtering of the marker trajectory). The resulting accelerations and their comparison are shown in Section 3.

#### *2.2. Gait Analysis*

#### 2.2.1. Experimental Data Collection

A healthy adult male, 24 years old, 70 kg, and 175 cm, performed a complete gait cycle. Both 36 reflective markers in all his body segments for optical motion capture (same equipment as that described in Section 2.1) and 7 IMUs (the best seven among the nine tested in the preliminary test) at pelvis, thighs, shanks and feet for inertial motion capture were attached to the subject's body, as can be seen in Figure 3. One additional marker was attached to each IMU so as to determine its local position within the corresponding segment during a static pose recording.

**Figure 3.** Markers and IMUs (red numbers) attached to the subject's body for gait analysis.

#### 2.2.2. Skeletal Model and Kinematics

The human body is modeled as a three-dimensional multibody system formed by rigid bodies, as shown in Figure 4. The model consists of 18 anatomical segments [39]: two hind feet, two forefeet, two shanks, two thighs, pelvis, torso, neck, head, two arms, two forearms, and two hands. The segments are linked by ideal spherical joints (black dots in Figure 4b), thus defining a model with 57 degrees of freedom (DOF). The axes of the global reference frame are defined as follows: *x*-axis in the antero-posterior direction, *y*-axis in the medio-lateral direction, and *z*-axis in the vertical direction.

**Figure 4.** 3D human model: (**a**) graphical output; (**b**) multibody model showing the segments: joints (black dots), and marker locations (white dots).
